MathGroup Archive 2005

Re: Unicode Support

  • To: mathgroup at
  • Subject: [mg55524] Re: [mg55503] Unicode Support
  • From: John Fultz <jfultz at>
  • Date: Sun, 27 Mar 2005 06:45:09 -0500 (EST)
  • Reply-to: jfultz at
  • Sender: owner-wri-mathgroup at

 On Sat, 26 Mar 2005 02:39:43 -0500 (EST), Zhu Chongkai wrote:
> Hi all,
> The Mathematica Book says that Mathematica supports Unicode characters,
> and MathLink indicates that a Unicode character in Mathematica is 16 bits
> wide. But the latest Unicode Standard uses 32 bits to encode a
> character. It seems to me that Mathematica's Unicode support is
> outdated, based on an old version of the Unicode Standard that
> contains fewer than 65536 characters. Will the next version of Mathematica
> use a 32-bit encoding? Or am I wrong?
> Cheers,
> Zhu Chongkai

Saying that Mathematica uses 16-bit Unicode characters is equivalent to 
saying that Mathematica uses UTF-16.  UTF-16 can represent any Unicode 
character, and has been able to do so since at least Unicode 2.0 (and quite 
possibly earlier).  It does so by using a reserved block of 16-bit values 
to represent characters outside plane 0 as a pair of values (known as a 
surrogate pair; see section 5.4 of the Unicode standard for more info).  
So, there is no need to change from a 16-bit encoding in order to support 
characters outside of the plane 0 range.
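
As a rough illustration of the surrogate-pair arithmetic, here is a short 
Python sketch.  It is not Mathematica or MathLink code, and the function 
names are made up for this example; it only shows how a code point above 
U+FFFF maps onto two 16-bit values and back.

```python
def to_surrogate_pair(code_point):
    """Split a code point outside plane 0 into (high, low) 16-bit values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("expected a code point in U+10000..U+10FFFF")
    offset = code_point - 0x10000        # 20-bit offset past plane 0
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, to_surrogate_pair(0x1D11E) gives (0xD834, 0xDD1E), which are 
the same two 16-bit values that Python's own "utf-16-be" codec produces 
for that character.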

MathLink supports this now.  It's still just a stream of 16-bit values. 
Mathematica can also represent these characters as surrogate pairs, but 
doesn't yet treat each pair as a single character for the purposes of 
string manipulation and text drawing operations.  That's something we'll 
add to a future release.
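
To make the difference between counting 16-bit values and counting whole 
characters concrete, here is a small Python sketch.  The helper names are 
hypothetical, and it says nothing about how Mathematica is implemented 
internally; it only shows why a surrogate pair needs special handling in 
string operations.

```python
def utf16_code_units(text):
    """Return the 16-bit values that encode `text` in UTF-16."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def count_characters(units):
    """Count characters, treating each valid surrogate pair as one character."""
    count = 0
    i = 0
    while i < len(units):
        is_high = 0xD800 <= units[i] <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        i += 2 if (is_high and has_low) else 1
        count += 1
    return count


units = utf16_code_units("a\U0001D11Eb")  # 'a', a character outside plane 0, 'b'
```

Here len(units) is 4, while count_characters(units) is 3.  Code that works 
on the raw 16-bit values sees four items where the reader sees three 
characters.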


John Fultz
jfultz at
User Interface Group
Wolfram Research, Inc.
