Re: escape-sequences in strings
- To: mathgroup at smc.vnet.net
- Subject: [mg16004] Re: escape-sequences in strings
- From: tgayley at mcs.net (Todd Gayley)
- Date: Fri, 19 Feb 1999 03:27:15 -0500
- Organization: MCSNet Services
- References: <7a2atd$1sb@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
On 12 Feb 1999 17:45:01 -0500, Timo Felbinger <timof at uranos.quantum.physik.uni-potsdam.de> wrote:

>
>Hello,
>
>I am currently working on a front end (in C) for Mathematica, which
>passes user-defined input via mathlink to the Mathematica kernel for
>evaluation.
>
>It works fine for most input, but if there is a double-quoted string in
>the input which contains escape sequences, funny things happen:
>
>  '"hello\\world"' will return 'hello\\world'
>
>  '"hello\nworld"' will return 'hello world'
>
>  '"hello\world"' will return nothing and seems to freeze the kernel
>
>  '"hello\"world"' will return a syntax error
>
>  '"hello\"world\""' will return '2', a linefeed, and ' hello world'
>
>In these examples, everything enclosed by single quotes is to be read
>literally (ie all double quotes and backslashes were sent as is).
>
>The input was transferred via MLPutString as an argument of
>EnterTextPacket, and the results are the payload of a RETURNTEXTPKT,
>received via MLGetString.
>
>This behaviour seems to contradict what the mathematica book says about
>escape sequences inside double-quoted strings. What am I missing here?
>Is the front end responsible for the expansion of backslash escape
>sequences, and if so, in what way?
>
>BTW, where can I find the "MathLink Reference Manual" which is cited in
>Todd Gayley's mathlink tutorial? The "DeveloperGuide.nb" on my
>Mathematica CD isn't really helpful.
>
>Timo

Timo,

The problem is that you are using MLPutString. The behavior of
MLPutString changed between the 2.2 and 3.0 versions of MathLink.

MLPutString and MLGetString have always sent and received Mathematica
strings in a special encoded form. This encoding was necessary to
represent the 16-bit character set that Mathematica uses in an 8-bit
character array. To write a correct MathLink program, you have always
needed to encode a string before passing it to MLPutString, and decode
a string you get from MLGetString. Did anyone ever do this (including me)?
Was it ever documented (including in my MathLink Tutorial)? Heck, no!
I believe the only place where this fact was mentioned, and the
necessary encoding/decoding code presented, was the 2.2 MathLink.h
header file. The countless MLGet/PutString calls that programmers have
written over the years worked only because the encoding was a no-op for
virtually every character in the 0-255 range, and certainly for every
character that anyone would be likely to use in a C string. I considered
the encoding issue a small and complicated detail, along with many
others I left out of the MathLink Tutorial (which was written in '94).
Bad idea.

The encoding was changed in 3.0 so that a much smaller set of 8-bit
characters is unaffected by it. Now the encoding step is necessary to
use MLGet/PutString. In other words, virtually every MathLink program
written in the 2.2 days that calls MLGet/PutString, or any program
written based on 2.2-era documentation, will be broken if compiled and
linked against the 3.0 MathLink library. This issue is now well
documented in the DeveloperGuide.nb notebook in the MathLink Developer
Kit materials that are part of the Mathematica distribution.

Enough of the boring history lesson. The simple answer is "Don't use
MLGet/PutString anymore". If you really want to handle the full 16-bit
character set in your C program, then use MLGetUnicodeString and
MLPutUnicodeString. No translation necessary.

More likely, though, you want good old 8-bit strings in your C program.
If so, use MLGetByteString and MLPutByteString. These are the string
functions that most programmers will use. If you get a string from
Mathematica with 16-bit characters in it, then MLGetByteString will not
be able to represent it properly, but if you want to deal with 16-bit
characters you probably want the Unicode functions mentioned above.
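
To make the last point concrete, here is a minimal sketch in C of why an
8-bit string cannot carry every 16-bit character. The helper name
narrow_to_bytes and its parameters are hypothetical and exist only for
illustration; this is not MathLink code and makes no claim about how the
library is implemented. It simply substitutes a placeholder byte for any
character code above 255.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MathLink): copy n 16-bit character
 * codes from src into the byte array dst. Codes above 255 do not fit in
 * one byte, so they are replaced by the caller-supplied `missing` byte.
 * Returns the number of characters that had to be replaced. */
size_t narrow_to_bytes(const unsigned short *src, size_t n,
                       unsigned char *dst, unsigned char missing)
{
    size_t replaced = 0;
    size_t i;

    /* Precondition: both buffers must be valid unless there is no data. */
    assert(n == 0 || (src != NULL && dst != NULL));

    for (i = 0; i < n; i++) {
        if (src[i] > 0xFF) {
            dst[i] = missing;               /* information is lost here */
            replaced++;
        } else {
            dst[i] = (unsigned char)src[i]; /* fits in a single byte */
        }
    }
    return replaced;
}
```

Called on the three codes 'a', 0x03B1 (Greek alpha) and 0xE9 with '?' as
the placeholder, the helper keeps the first and third and replaces the
second. A program that cannot tolerate that loss is one that should be
reading 16-bit strings instead.
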

Finally, the MathLink Reference Guide referred to in the MathLink
Tutorial is the old 2.2-era manual. It is available in PostScript form
on MathSource. It has been superseded by MathLink documentation in the
Mathematica book, but I still find it a useful reference. Except the
string stuff....

Todd Gayley
LinkObjects