MathGroup Archive 1999

Re: escape-sequences in strings

  • To: mathgroup at smc.vnet.net
  • Subject: [mg16004] Re: escape-sequences in strings
  • From: tgayley at mcs.net (Todd Gayley)
  • Date: Fri, 19 Feb 1999 03:27:15 -0500
  • Organization: MCSNet Services
  • References: <7a2atd$1sb@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

On 12 Feb 1999 17:45:01 -0500, Timo Felbinger
<timof at uranos.quantum.physik.uni-potsdam.de> wrote:

>
>Hello,
>
>I am currently working on a front end (in C) for Mathematica, which
>passes user-defined input via mathlink to the Mathematica kernel for
>evaluation.
>
>It works fine for most input, but if there is a double-quoted string in
>the input which contains escape sequences, funny things happen:
>
> '"hello\\world"'   will return  'hello\\world'
>
> '"hello\nworld"'   will return  'hello world'
>
> '"hello\world"'    will return nothing and seems to freeze the kernel
>
> '"hello\"world"'   will return a syntax error
>
> '"hello\"world\""' will return '2', a linefeed, and '  hello world'
>
>In these examples, everything enclosed by single quotes is to be read
>literally (ie all double quotes and backslashes were sent as is).
>
>The input was transferred via MLPutString as an argument of
>EnterTextPacket, and the results are the payload of a RETURNTEXTPKT,
>received via MLGetString.
>
>This behaviour seems to contradict what the Mathematica book says about
>escape sequences inside double-quoted strings. What am I missing here?
>Is the front end responsible for the expansion of backslash escape
>sequences, and if so, in what way?
>
>BTW, where can I find the "MathLink Reference Manual" which is cited in
>Todd Gayley's mathlink tutorial? The "DeveloperGuide.nb" on my
>Mathematica CD isn't really helpful.
>
>Timo

Timo,

The problem is that you are using MLPutString. The behavior of
MLPutString changed between the 2.2 and 3.0 versions of MathLink.

MLPutString and MLGetString have always sent and received Mathematica
strings in a special encoded form. This encoding was necessary to
represent the 16-bit character set that Mathematica uses in an 8-bit
character array. To write a correct MathLink program, you have always
needed to encode a string before passing it to MLPutString, and decode
a string you get from MLGetString.

Did anyone ever do this (including me)? Was it ever documented
(including in my MathLink Tutorial)? Heck, no! I believe the only
place where this fact was mentioned and the necessary code for
encoding/decoding presented was in the 2.2 MathLink.h header file. The
reason the countless MLGet/PutString calls that programmers have
written over the years worked anyway is that the encoding was a no-op
for virtually every character in the 0-255 range, and certainly for
every character that anyone would be likely to use in a C string.

I considered the encoding issue a small and complicated detail, along
with many others I left out of the MathLink Tutorial (which was
written in '94). Bad idea. The encoding was changed in 3.0 so that a
much smaller set of 8-bit characters is left unaffected by it. Now the
encoding step is necessary to use MLGet/PutString.

In other words, virtually every MathLink program that calls
MLGet/PutString written in the 2.2 days, or any program written based
on 2.2-era documentation, will be broken if compiled and linked
against the 3.0 MathLink library.
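
As a rough illustration of what an encoding step before MLPutString
might involve, here is a minimal sketch in C. It assumes only one rule
of the 3.0 convention, namely that a literal backslash has to be sent
as two backslashes; the real encoding covers more than this (see
DeveloperGuide.nb). The helper name double_backslashes is hypothetical
and is not part of MathLink.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a MathLink function.
 * Returns a newly allocated copy of src in which every backslash is
 * doubled. Assumption: doubling backslashes is only one part of the
 * encoding MLPutString expects. The caller must free() the result.
 * Returns NULL if memory cannot be allocated. */
char *double_backslashes(const char *src)
{
    size_t len, extra = 0, i, j = 0;
    char *out;

    assert(src != NULL);
    len = strlen(src);

    /* First pass: count the backslashes so the buffer can be sized. */
    for (i = 0; i < len; i++) {
        if (src[i] == '\\')
            extra++;
    }

    out = malloc(len + extra + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, emitting an extra backslash before each one. */
    for (i = 0; i < len; i++) {
        if (src[i] == '\\')
            out[j++] = '\\';
        out[j++] = src[i];
    }
    out[j] = '\0';
    return out;
}
```

A front end would run user input through something like this before
handing it to MLPutString, and apply the reverse transformation to
whatever MLGetString returns.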

This issue is now well documented in the DeveloperGuide.nb notebook in
the MathLink Developer Kit materials that are part of the Mathematica
distribution.

Enough of the boring history lesson. The simple answer is "Don't use
MLGet/PutString anymore". If you really want to handle the full 16-bit
character set in your C program, then use MLGetUnicodeString and
MLPutUnicodeString. No translation is necessary. More likely, though,
you want good old 8-bit strings in your C program. If so, use
MLGetByteString and MLPutByteString. These are the string functions
that most programmers will use. If you get a string from Mathematica
with 16-bit characters in it, then MLGetByteString will not be able to
represent it properly, but then if you want to deal with 16-bit
characters you probably want the Unicode functions mentioned above.
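
To see why an 8-bit string cannot carry every 16-bit character, here
is a small self-contained sketch in C. It does not call MathLink; it
only mimics the narrowing that any 8-bit string interface has to do.
The function name narrow_to_bytes and the choice of placeholder
character are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a MathLink function.
 * Copies n 16-bit character codes from src into the 8-bit buffer dst.
 * Any code above 255 cannot be represented in one byte and is replaced
 * by `missing`. Returns the number of characters that were replaced.
 * dst must have room for at least n bytes; no terminator is written. */
size_t narrow_to_bytes(const unsigned short *src, size_t n,
                       unsigned char *dst, unsigned char missing)
{
    size_t i, replaced = 0;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < n; i++) {
        if (src[i] > 0xFF) {
            dst[i] = missing;   /* information is lost here */
            replaced++;
        } else {
            dst[i] = (unsigned char)src[i];
        }
    }
    return replaced;
}
```

A nonzero return value means the 8-bit copy is lossy, which is the
case where the Unicode functions are the better fit.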

Finally, the MathLink Reference Guide referred to in the MathLink
Tutorial is the old 2.2-era manual. It is available in PostScript form
on MathSource. It has been superseded by MathLink documentation in the
Mathematica book, but I still find it a useful reference. Except the
string stuff....


Todd  Gayley
LinkObjects


