Re: Re: Unicode Support
- To: mathgroup at smc.vnet.net
- Subject: [mg55551] Re: [mg55525] Re: Unicode Support
- From: John Fultz <jfultz at wolfram.com>
- Date: Tue, 29 Mar 2005 03:42:33 -0500 (EST)
- Reply-to: jfultz at wolfram.com
- Sender: owner-wri-mathgroup at wolfram.com
On Mon, 28 Mar 2005 02:42:03 -0500 (EST), Zhu Chongkai wrote:
> ======= At 2005-03-27, 18:42:43 John Fultz wrote: =======
>
>> UTF-8 is a supported character encoding by both front end and kernel
>> (i.e. they can import and export files as UTF-8). I believe the support
>> has been there since 5.0. MathLink only supports UTF-16 for now.
>>
>> UTF-32 is not supported at all in current versions. To be honest, nobody
>> has asked for it before. While UTF-32 is a clean way of representing
>> characters in all of the Unicode planes, I think the vast majority of
>> programs out there in the real world are using either UTF-8 or UTF-16.
>>
>> Sincerely,
>> John Fultz
>>
> Thank you for the clarification. But I still think that both UTF-8 and
> UTF-32 support are important, especially for MathLink. For example, I know
> one program that uses both UTF-8 and UTF-32. How can I link it with MathLink?

I imagine that we'll do something with UTF-8 in MathLink down the road. As
for UTF-32...well, it's been asked for once...by you...and you yourself
admit that you could use UTF-8 instead. So, thus far, it's just not a big
priority. Of course, in an evolving world, priorities may change.

Concerning your problem...if the program also uses UTF-16 (I would find it
difficult to believe a program would support UTF-8 and UTF-32 with no
support for UTF-16), then just use that. Otherwise, writing the code to
convert plane 0-only characters (it's unlikely you'll see anything outside
of plane 0) between UTF-32 and UTF-16 is about as trivial a programming
exercise as you can get. In plane 0 each character is a single UTF-16 code
unit with the same value as its code point, so it's just converting an array
of 32-bit unsigned ints to an array of 16-bit unsigned shorts and vice versa.
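
A minimal sketch of that plane 0-only conversion, written in Python rather than C to keep it short. The function names are hypothetical and are not part of MathLink or any other library; the inputs and outputs are plain lists of integers standing in for the 32-bit and 16-bit arrays. Anything outside plane 0 is rejected rather than converted, which is an assumption of this sketch.

```python
def utf32_to_utf16_bmp(code_points):
    """Narrow UTF-32 code points to UTF-16 code units (plane 0 only).

    Hypothetical helper for illustration. Raises ValueError for any
    code point that cannot be stored in a single 16-bit code unit.
    """
    units = []
    for cp in code_points:
        if cp > 0xFFFF:
            # Outside plane 0: would need a surrogate pair in UTF-16.
            raise ValueError("code point U+%X is outside plane 0" % cp)
        if 0xD800 <= cp <= 0xDFFF:
            # The surrogate range is reserved and is not a valid character.
            raise ValueError("U+%X is a surrogate, not a character" % cp)
        units.append(cp)
    return units


def utf16_to_utf32_bmp(units):
    """Widen UTF-16 code units to UTF-32 code points (plane 0 only).

    Hypothetical helper for illustration. A surrogate code unit means
    the text contains a character outside plane 0, so it is rejected.
    """
    code_points = []
    for u in units:
        if not 0 <= u <= 0xFFFF:
            raise ValueError("%r is not a 16-bit code unit" % (u,))
        if 0xD800 <= u <= 0xDFFF:
            raise ValueError("surrogate 0x%X: text is not plane 0-only" % u)
        code_points.append(u)
    return code_points
```

In C the same loops would copy between an array of 32-bit unsigned ints and an array of 16-bit unsigned shorts, with the same two range checks.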

Writing a UTF-8 <-> UTF-16 converter is a little harder, but all of the
information needed to do it is in section 3.9 of the Unicode spec found
online at the unicode.org site. Or you can just google for somebody who
has code to do it. For example,
http://www-306.ibm.com/software/globalization/icu/index.jsp
http://www.gnu.org/software/libiconv/
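
For anyone who wants to see the shape of a hand-written UTF-8 <-> UTF-16 converter, here is a rough sketch in Python. The function names are hypothetical, the UTF-16 side is a plain list of 16-bit integers, and the validity checks (overlong forms, surrogates, truncated sequences) are my reading of the well-formedness rules rather than a tested, production-quality implementation.

```python
def utf8_to_utf16(data):
    """Decode UTF-8 bytes into a list of UTF-16 code units (hypothetical helper)."""
    units = []
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        # The lead byte gives the sequence length and the smallest code
        # point that length may legally encode (rejects overlong forms).
        if b0 < 0x80:
            cp, length, min_cp = b0, 1, 0
        elif 0xC2 <= b0 <= 0xDF:
            cp, length, min_cp = b0 & 0x1F, 2, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            cp, length, min_cp = b0 & 0x0F, 3, 0x800
        elif 0xF0 <= b0 <= 0xF4:
            cp, length, min_cp = b0 & 0x07, 4, 0x10000
        else:
            raise ValueError("invalid lead byte 0x%02X" % b0)
        if i + length > n:
            raise ValueError("truncated sequence at offset %d" % i)
        for b in data[i + 1:i + length]:
            if (b & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte 0x%02X" % b)
            cp = (cp << 6) | (b & 0x3F)
        if cp < min_cp or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("ill-formed sequence at offset %d" % i)
        if cp >= 0x10000:
            # Outside plane 0: split into a high and a low surrogate.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
        else:
            units.append(cp)
        i += length
    return units


def utf16_to_utf8(units):
    """Encode a list of UTF-16 code units as UTF-8 bytes (hypothetical helper)."""
    out = bytearray()
    i, n = 0, len(units)
    while i < n:
        u = units[i]
        i += 1
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i >= n or not (0xDC00 <= units[i] <= 0xDFFF):
                raise ValueError("unpaired high surrogate 0x%X" % u)
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        elif 0xDC00 <= u <= 0xDFFF:
            raise ValueError("unpaired low surrogate 0x%X" % u)
        elif 0 <= u <= 0xFFFF:
            cp = u
        else:
            raise ValueError("%r is not a 16-bit code unit" % (u,))
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            out += bytes([0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
    return bytes(out)
```

In real Python code the built-in codecs (`bytes.decode("utf-8")` and `str.encode("utf-16-le")`) already do this; the sketch only shows what a converter written by hand in C would have to handle.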
Sincerely,
John Fultz
jfultz at wolfram.com
User Interface Group
Wolfram Research, Inc.