a (serious) question about character codes in Mathematica

*To*: mathgroup at smc.vnet.net
*Subject*: [mg71447] a (serious) question about character codes in Mathematica
*From*: "Chris Chiasson" <chris at chiasson.name>
*Date*: Sun, 19 Nov 2006 01:10:03 -0500 (EST)

Usually, my questions about character codes are more complaints than anything else. But this time I just want info. I promise :-)

    FromCharacterCode[16^^52,"Mathematica7"]//FullForm

gives "\[DoubleStruckCapitalR]" (52 in hex is 82 in decimal, so this is the 83rd character in the Mathematica7 font, since numbering starts from zero).

As shown below, Mathematica seems to map characters to non-standard Unicode points. In Unicode, the double-struck capital R is 0x211d in hex and 8477 in decimal [1]. But Mathematica assigns this character to a different code point (the second argument is optional here):

    ToCharacterCode["\[DoubleStruckCapitalR]","Unicode"]
    {63413}

The value 63413 in decimal (0xf7b5 in hex) does not map to any (regular) glyph. In fact, the number is in the private use area of Unicode.

I know Mathematica has been around for a long time. Perhaps this code point was in use for DoubleStruckCapitalR before that glyph had a code point in regular Unicode. I am sure there is a perfectly good explanation.

I am interested in ways to acquire the common Unicode points for the glyphs in Mathematica7, along with the stretchy characters for parentheses and lists from other Mathematica fonts. This can be done by feeding the raw character into one of the lower-level conversion routines, like

    System`Convert`XMLDump`determineEntityExportFunction[{"\[DoubleStruckCapitalR]"},"US-ASCII"]["\[DoubleStruckCapitalR]"]

(I think this call is right, but I can't be sure because it's undocumented. Also, it must be enabled by exporting something first.) However, it requires a lot of post-processing to apply it to all the characters in a particular font, because sometimes the result is a plain character instead of an HTML entity string, and sometimes the result is a semicolon...

What is the best way to get the "real" Unicode points for the exotic fonts in Mathematica?

[1] http://www.unicode.org/charts/PDF/U2100.pdf

--
http://chris.chiasson.name/
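
The numbers quoted in the post can be cross-checked outside Mathematica. The following is only a minimal sketch, using Python's standard `unicodedata` module as an independent Unicode reference; the constant names and the helper `in_bmp_private_use` are made up for illustration, and nothing here reproduces Mathematica's own mapping.

```python
import unicodedata

# Values quoted in the post (the names are illustrative, not from Mathematica).
FONT_SLOT = 0x52        # position of the glyph in the Mathematica7 font
STANDARD_R = 0x211D     # standard Unicode code point cited from chart U2100
MATHEMATICA_R = 63413   # code reported by ToCharacterCode in the post

# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF


def in_bmp_private_use(code_point: int) -> bool:
    """Return True if code_point lies in the BMP Private Use Area."""
    return PUA_START <= code_point <= PUA_END


print(FONT_SLOT)                                  # 82
print(STANDARD_R)                                 # 8477
print(hex(MATHEMATICA_R))                         # 0xf7b5
print(unicodedata.name(chr(STANDARD_R)))          # DOUBLE-STRUCK CAPITAL R
print(unicodedata.category(chr(MATHEMATICA_R)))   # Co (private use)
print(in_bmp_private_use(MATHEMATICA_R))          # True
```

This confirms the hex/decimal conversions and that 0xf7b5 is a private-use code point, but it does not answer the actual question of how to obtain the standard code point for each glyph in a Mathematica font.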