MathGroup Archive: September 2011 [00418]

Re: passing Indeterminate and Infinity to C via MathLink
To: mathgroup at smc.vnet.net
Subject: [mg121566] Re: passing Indeterminate and Infinity to C via MathLink
From: Roman <rschmied at gmail.com>
Date: Tue, 20 Sep 2011 06:08:46 -0400 (EDT)
Delivered-to: l-mathgroup@mail-archive0.wolfram.com
References: <j4vamb$dah$1@smc.vnet.net>
Dear John,
thanks a lot for your detailed explanations. I believe that your
suggestion of using "magic" floats for representing Indeterminate and
Infinity is the most viable possibility; in this way I can send packed
arrays to C without a hitch.
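A minimal sketch of the C side of this idea, assuming three sentinel values
picked only for illustration (MAGIC_NAN, MAGIC_POS_INF, MAGIC_NEG_INF,
decode_magic and sum_finite are hypothetical names, not part of MathLink).
The Mathematica side would have to substitute exactly these finite values
before sending, and they must never occur in the real data. Decoding happens
per element on read, so a buffer obtained from MLGetReal64Array() can be left
unmodified:

```c
#include <math.h>    /* NAN, INFINITY, isnan, isinf */
#include <stddef.h>  /* size_t */

/* Hypothetical sentinels (assumption): finite doubles that never occur in
   the real data and that the Mathematica side substitutes before sending. */
#define MAGIC_NAN     (-1.0e300)
#define MAGIC_POS_INF ( 1.0e300)
#define MAGIC_NEG_INF (-2.0e300)

/* Map one received value to its intended IEEE meaning.
   Ordinary values pass through unchanged. */
static double decode_magic(double x)
{
    if (x == MAGIC_NAN)     return NAN;
    if (x == MAGIC_POS_INF) return INFINITY;
    if (x == MAGIC_NEG_INF) return -INFINITY;
    return x;
}

/* Example consumer: sum the finite entries of a read-only buffer and
   count the special ones, decoding each element as it is read. */
static double sum_finite(const double *data, size_t n, size_t *n_special)
{
    double sum = 0.0;
    *n_special = 0;
    for (size_t i = 0; i < n; ++i) {
        double v = decode_magic(data[i]);
        if (isnan(v) || isinf(v))
            ++*n_special;
        else
            sum += v;
    }
    return sum;
}
```

The exact comparison with == relies on both sides producing the same double
for the same decimal literal, which should be checked for whichever sentinel
values are actually chosen.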
The inefficiencies that you mention in the MathLink protocol, even
using packed arrays, are quite noticeable though. Much experimentation
and comparison to the use of Compile[] is required to find the fastest
way of getting things done.
Best regards,
Roman

On Sep 16, 1:07 pm, John Fultz <jfu... at wolfram.com> wrote:
> The fact that you're copying a matrix piece by piece using MLGetNext() +
> MLGet<type> is a truly tiny inefficiency.  Seriously!  I mean, it obviously is
> an inefficiency, but I seriously doubt it's the biggest inefficiency you're
> running into.  I don't know exactly what your problem is...based upon your
> concerns, I'll assume you're working with large data sets at some level...but in
> general for this sort of problem, there are bigger fish to fry.  Let me outline
> several different points below that you'll want to understand about dealing with
> large data sets.
>
> First big point, are the data being represented as a packed array in
> Mathematica?  I can tell you right now that if you have Indeterminate or
> Infinity in your data set, the answer is no!!  For example:
>
> In[1]:= ByteCount[N@Range[100000]]
>
> Out[1]= 800168
>
> In[2]:= ByteCount[Append[N@Range[100000], 1.]]
>
> Out[2]= 800176
>
> In[3]:= ByteCount[Append[N@Range[100000], 1./0.]]
>
> Power::infy: Infinite expression 1/0. encountered. >>
>
> Out[3]= 3200088
>
> Let me be very clear.  Mathematica certainly understands IEEE math internally,
> but it represents the edge cases not as doubles, but as symbols.  You report
> below that Mathematica correctly reads IEEE NaNs off of MathLink.  You're
> correct, but this is a one-way conversion which immediately loses the game if
> you were trying to maintain an extremely compact form.
>
> Second big point...could you have done things more efficiently in Mathematica,
> and avoided the massive inefficiency of pushing data sets through a shared
> memory protocol between processes?  I can't speak to this at all since I don't
> know what you're doing, but you should understand that transmission of data
> between processes is never cheap, and a bit less so in a highly structured
> communications protocol like MathLink.  But, more importantly, this compounds on
> top of my first point.  Packed arrays will be transferred over MathLink in a
> much more compact form than a list of mixed floats and symbols.  Expression
> lists will take much longer to push over MathLink.  This communication time will
> dwarf any costs to the particular method by which you retrieve the data from the
> link.
>
> Third big point...as I said above, the transmission is going to be more costly
> than the individual read calls (which don't affect the mode of
> transmission...only how the data are being spoon-fed to you).  Packed arrays
> just represent that many fewer bytes to transmit, and that turns out to be a
> pretty big deal.  You could, in Mathematica, do a last minute conversion to a
> packed array by substituting Indeterminates and Infinities with some sort of
> magic value that Mathematica will leave alone as a float, but that your C
> program recognizes.  There might be a better way to do this...I am not the
> world's expert on the kernel's internal representations, but I know quite a lot
> about MathLink, and this is the best idea I have.  Of course, synthesizing the
> packed array has its own cost...whether the benefit is outweighed by the cost is
> something I would probably determine experimentally.
>
> Fourth big point...the Manual type does *not* preclude efficient reading of
> packed arrays when that's what has actually been written.  If MLGetReal64Array()
> succeeds, then you've won!  In all likelihood (although exceptions are
> technically possible), you've gotten the highest speed transmission rate, and
> you get to have only one copy of the data in memory, so long as you're willing
> to treat that copy as immutable.  If MLGetReal64Array() fails, then you can
> construct the array using the more piecemeal approach I suggested in my previous
> email.
>
> Sincerely,
>
> John Fultz
> jfu... at wolfram.com
> User Interface Group
> Wolfram Research, Inc.
>
>
> On Thu, 15 Sep 2011 04:40:56 -0400 (EDT), Roman wrote:
> > Thanks John. The entire reason why I am communicating with external C
> > procedures is in order to speed up computation; if much time is spent
> > in the communication interface then this defeats the point. In
> > particular, when passing large arrays of real numbers (containing NaN
> > and/or inf) then receiving the array element by element via
> > MLGetNext() seems a very inefficient thing to do.
>
> > I appreciate how faithful Mathematica is in the transmission process,
> > but when I pass "NaN" from C to Mathematica via MLPutReal64() then
> > Mathematica does in fact receive the "Indeterminate" symbol, not an
> > IEEE "NaN". So the conversion capability is built into Mathematica in
> > one direction but not in the other. What I was hoping for is a trick
> > which allows me to use such an automatism in the Mathematica->C
> > direction via MLGetReal64Array(). For instance, is there a way to
> > convert a matrix in Mathematica into a pure numerical representation
> > (where every element must be an IEEE number) which could then be
> > forwarded immediately (no conversions) to MLGetReal64Array()?
>
> > Cheers!
> > Roman
>
> > On Sep 12, 10:23 am, John Fultz <jfu... at wolfram.com> wrote:
> >> On Sat, 10 Sep 2011 07:29:23 -0400 (EDT), Roman wrote:
> >>> Hello all,
> >>> I am setting up a C function which accepts real numbers from MathLink.
> >>> The behavior I would like to achieve is that whenever the number is
> >>> "Infinity" then the C function receives "inf" (which is a valid
> >>> double-precision-format number); and whenever the number is
> >>> "Indeterminate" then the C function receives "nan" (which is also a
> >>> valid double-precision-format number).
> >>> Unfortunately MathLink (Mathematica 7.0 for Mac OS X x86 (64-bit))
> >>> crashes whenever I am trying to pass either Infinity or Indeterminate
> >>> to a MathLink function expecting a double-precision number.
> >>> Would you know how to solve this without going into If[] statements on
> >>> the Mathematica side of MathLink?
> >>> Thanks!
> >>> Roman
>
> >> Mathematica represents Indeterminate and Infinity as symbols in its
> >> expression tree, and MathLink is always very faithful about transmitting
> >> the expression tree precisely.  Note that it's not very difficult to
> >> deal with this in your C code, though.  You can just declare the
> >> function as having a Manual MathLink type and then, in the C function
> >> determine using MLGetNext() whether the next thing is a symbol or a
> >> real.  If it's a symbol, then you can just synthesize the IEEE version
> >> of the indeterminate value in your C program.
>
> >> About two thirds of the way down this help page:
>
> >> tutorial/HandlingListsArraysAndOtherExpressions
>
> >> there's an example that illustrates how to use Manual as an argument
> >> type
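
The symbol-or-real check described in the quoted message could be sketched as
follows. This shows only the conversion step and makes no MathLink calls
itself; symbol_to_double and directed_infinity_to_double are hypothetical
helper names. The assumption is that the name string comes from MLGetSymbol()
after MLGetNext() has reported a symbol, and that Infinity, whose full form
is DirectedInfinity[1], may arrive as a function with an integer argument
rather than as a bare symbol:

```c
#include <math.h>    /* NAN, INFINITY */
#include <string.h>  /* strcmp */

/* Convert a symbol name to an IEEE double.
   Returns 1 and stores the value on success, 0 for an unknown symbol.
   "Infinity" is accepted in case the sender transmits the bare name. */
static int symbol_to_double(const char *name, double *out)
{
    if (strcmp(name, "Indeterminate") == 0) { *out = NAN;      return 1; }
    if (strcmp(name, "Infinity") == 0)      { *out = INFINITY; return 1; }
    return 0;
}

/* Convert the argument of DirectedInfinity[direction] to a double.
   Only the real directions +1 and -1 are handled; anything else
   (for example a complex direction) is reported as unsupported. */
static int directed_infinity_to_double(int direction, double *out)
{
    if (direction == 1)  { *out = INFINITY;  return 1; }
    if (direction == -1) { *out = -INFINITY; return 1; }
    return 0;
}
```

A return value of 0 lets the caller decide how to report an expression it
cannot represent as a double, instead of silently passing on a wrong number.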