Re: passing Indeterminate and Infinity to C via MathLink

*To*: mathgroup at smc.vnet.net
*Subject*: [mg121566] Re: passing Indeterminate and Infinity to C via MathLink
*From*: Roman <rschmied at gmail.com>
*Date*: Tue, 20 Sep 2011 06:08:46 -0400 (EDT)
*Delivered-to*: l-mathgroup@mail-archive0.wolfram.com
*References*: <j4vamb$dah$1@smc.vnet.net>

Dear John,

thanks a lot for your detailed explanations. I believe that your suggestion of using "magic" floats for representing Indeterminate and Infinity is the most viable possibility; in this way I can send packed arrays to C without a hitch. The inefficiencies that you mention in the MathLink protocol, even using packed arrays, are quite noticeable though. Much experimentation and comparison to the use of Compile[] is required to find the fastest way of getting things done.

Best regards,
Roman

On Sep 16, 1:07 pm, John Fultz <jfu... at wolfram.com> wrote:
> The fact that you're copying a matrix piece by piece using MLGetNext() +
> MLGet<type> is a truly tiny inefficiency. Seriously! I mean, it obviously is
> an inefficiency, but I seriously doubt it's the biggest inefficiency you're
> running into. I don't know exactly what your problem is...based upon your
> concerns, I'll assume you're working with large data sets at some level...but in
> general for this sort of problem, there are bigger fish to fry. Let me outline
> several different points below that you'll want to understand about dealing with
> large data sets.
>
> First big point, are the data being represented as a packed array in
> Mathematica? I can tell you right now that if you have Indeterminate or
> Infinity in your data set, the answer is no!! For example:
>
> In[1]:= ByteCount[N@Range[100000]]
>
> Out[1]= 800168
>
> In[2]:= ByteCount[Append[N@Range[100000], 1.]]
>
> Out[2]= 800176
>
> In[3]:= ByteCount[Append[N@Range[100000], 1./0.]]
>
> Power::infy: Infinite expression 1/0. encountered. >>
>
> Out[3]= 3200088
>
> Let me be very clear. Mathematica certainly understands IEEE math internally,
> but it represents the edge cases not as doubles, but as symbols. You report
> below that Mathematica correctly reads IEEE NaNs off of MathLink. You're
> correct, but this is a one-way conversion which immediately loses the game if
> you were trying to maintain an extremely compact form.
>
> Second big point...could you have done things more efficiently in Mathematica,
> and avoided the massive inefficiency of pushing data sets through a shared
> memory protocol between processes? I can't speak to this at all since I don't
> know what you're doing, but you should understand that transmission of data
> between processes is never cheap, and a bit less so in a highly structured
> communications protocol like MathLink. But, more importantly, this compounds on
> top of my first point. Packed arrays will be transferred over MathLink in a
> much more compact form than a list of mixed floats and symbols. Expression
> lists will take much longer to push over MathLink. This communication time will
> dwarf any costs to the particular method by which you retrieve the data from the
> link.
>
> Third big point...as I said above, the transmission is going to be more costly
> than the individual read calls (which don't affect the mode of
> transmission...only how the data are being spoon-fed to you). Packed arrays
> just represent that many fewer bytes to transmit, and that turns out to be a
> pretty big deal. You could, in Mathematica, do a last minute conversion to a
> packed array by substituting Indeterminates and Infinities with some sort of
> magic value that Mathematica will leave alone as a float, but that your C
> program recognizes. There might be a better way to do this...I am not the
> world's expert on the kernel's internal representations, but I know quite a lot
> about MathLink, and this is the best idea I have. Of course, synthesizing the
> packed array has its own cost...whether the benefit is outweighed by the cost is
> something I would probably determine experimentally.
>
> Fourth big point...the Manual type does *not* preclude efficient reading of
> packed arrays when that's what has actually been written. If MLGetReal64Array()
> succeeds, then you've won!
> In all likelihood (although exceptions are
> technically possible), you've gotten the highest speed transmission rate, and
> you get to have only one copy of the data in memory, so long as you're willing
> to treat that copy as immutable. If MLGetReal64Array() fails, then you can
> construct the array using the more piecemeal approach I suggested in my previous
> email.
>
> Sincerely,
>
> John Fultz
> jfu... at wolfram.com
> User Interface Group
> Wolfram Research, Inc.
>
> On Thu, 15 Sep 2011 04:40:56 -0400 (EDT), Roman wrote:
> > Thanks John. The entire reason why I am communicating with external C
> > procedures is in order to speed up computation; if much time is spent
> > in the communication interface then this defeats the point. In
> > particular, when passing large arrays of real numbers (containing NaN
> > and/or inf) then receiving the array element by element via
> > MLGetNext() seems a very inefficient thing to do.
>
> > I appreciate how faithful Mathematica is in the transmission process,
> > but when I pass "NaN" from C to Mathematica via MLPutReal64() then
> > Mathematica does in fact receive the "Indeterminate" symbol, not an
> > IEEE "NaN". So the conversion capability is built into Mathematica in
> > one direction but not in the other. What I was hoping for is a trick
> > which allows me to use such an automatism in the Mathematica->C
> > direction via MLGetReal64Array(). For instance, is there a way to
> > convert a matrix in Mathematica into a pure numerical representation
> > (where every element must be an IEEE number) which could then be
> > forwarded immediately (no conversions) to MLGetReal64Array()?
>
> > Cheers!
> > Roman
>
> > On Sep 12, 10:23 am, John Fultz <jfu... at wolfram.com> wrote:
> >> On Sat, 10 Sep 2011 07:29:23 -0400 (EDT), Roman wrote:
> >>> Hello all,
> >>> I am setting up a C function which accepts real numbers from MathLink.
> >>> The behavior I would like to achieve is that whenever the number is
> >>> "Infinity" then the C function receives "inf" (which is a valid
> >>> double-precision-format number); and whenever the number is "Indeterminate"
> >>> then the C function receives "nan" (which is also a valid
> >>> double-precision-format number).
> >>> Unfortunately MathLink (Mathematica 7.0 for Mac OS X x86 (64-bit))
> >>> crashes whenever I am trying to pass either Infinity or Indeterminate
> >>> to a MathLink function expecting a double-precision number.
> >>> Would you know how to solve this without going into If[] statements on
> >>> the Mathematica side of MathLink?
> >>> Thanks!
> >>> Roman
>
> >> Mathematica represents Indeterminate and Infinity as symbols in its
> >> expression tree, and MathLink is always very faithful about transmitting
> >> the expression tree precisely. Note that it's not very difficult to
> >> deal with this in your C code, though. You can just declare the
> >> function as having a Manual MathLink type and then, in the C function,
> >> determine using MLGetNext() whether the next thing is a symbol or a
> >> real. If it's a symbol, then you can just synthesize the IEEE version
> >> of the indeterminate value in your C program.
>
> >> About two thirds of the way down this help page:
>
> >> tutorial/HandlingListsArraysAndOtherExpressions
>
> >> there's an example that illustrates how to use Manual as an argument
> >> type
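
For illustration, the C side of the "magic float" idea discussed in this thread might look roughly like the sketch below. It is not code from either poster. The three sentinel values and the function name decode_sentinels are hypothetical, chosen only for this example. It assumes the Mathematica side has replaced Infinity, -Infinity and Indeterminate with exactly these machine numbers before sending, that the values arrive bit-for-bit unchanged, and that the real data never contain them. The input would typically be the buffer obtained from MLGetReal64Array(); the sketch only reads it and writes the decoded values elsewhere, in line with the advice to treat that buffer as immutable.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical sentinel values, chosen for illustration only.
   The Mathematica side would have to substitute exactly these numbers
   for Infinity, -Infinity and Indeterminate before sending the array,
   and the real data must never contain them. */
#define SENTINEL_POS_INF  DBL_MAX
#define SENTINEL_NEG_INF  (-DBL_MAX)
#define SENTINEL_NAN      (-1.0e300)

/* Copy n doubles from src to dst, turning each sentinel back into the
   IEEE special value it stands for. src is only read, so a buffer
   handed out by MathLink can stay untouched. Returns the number of
   sentinels that were replaced. */
size_t decode_sentinels(const double *src, double *dst, size_t n)
{
    size_t replaced = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        double x = src[i];
        if (x == SENTINEL_POS_INF)      { dst[i] = INFINITY;  ++replaced; }
        else if (x == SENTINEL_NEG_INF) { dst[i] = -INFINITY; ++replaced; }
        else if (x == SENTINEL_NAN)     { dst[i] = NAN;       ++replaced; }
        else                            { dst[i] = x; }
    }
    return replaced;
}
```

Copying into dst costs a second buffer, which gives up part of the "only one copy in memory" benefit mentioned above. Decoding in place would avoid that, at the price of writing into memory that MathLink owns. Which variant is faster overall is, as with the rest of this thread, something to measure.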