Re: passing Indeterminate and Infinity to C via MathLink
- To: mathgroup at smc.vnet.net
- Subject: [mg121484] Re: passing Indeterminate and Infinity to C via MathLink
- From: John Fultz <jfultz at wolfram.com>
- Date: Fri, 16 Sep 2011 07:07:29 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- Reply-to: jfultz at wolfram.com
The fact that you're copying a matrix piece by piece using MLGetNext() + MLGet<type> is a truly tiny inefficiency. Seriously! I mean, it obviously is an inefficiency, but I seriously doubt it's the biggest inefficiency you're running into.

I don't know exactly what your problem is...based upon your concerns, I'll assume you're working with large data sets at some level...but in general for this sort of problem, there are bigger fish to fry. Let me outline several different points below that you'll want to understand about dealing with large data sets.

First big point...are the data being represented as a packed array in Mathematica? I can tell you right now that if you have Indeterminate or Infinity in your data set, the answer is no!! For example:

    In[1]:= ByteCount[N@Range[100000]]
    Out[1]= 800168

    In[2]:= ByteCount[Append[N@Range[100000], 1.]]
    Out[2]= 800176

    In[3]:= ByteCount[Append[N@Range[100000], 1./0.]]
    Power::infy: Infinite expression 1/0. encountered. >>
    Out[3]= 3200088

A single non-finite element unpacks the entire array, roughly quadrupling its size here.

Let me be very clear. Mathematica certainly understands IEEE math internally, but it represents the edge cases not as doubles, but as symbols. You report below that Mathematica correctly reads IEEE NaNs off of MathLink. You're correct, but this is a one-way conversion which immediately loses the game if you were trying to maintain an extremely compact form.

Second big point...could you have done things more efficiently in Mathematica, and avoided the massive inefficiency of pushing data sets through a shared memory protocol between processes? I can't speak to this at all since I don't know what you're doing, but you should understand that transmission of data between processes is never cheap, and is a bit less cheap still in a highly structured communications protocol like MathLink. But, more importantly, this compounds on top of my first point. Packed arrays will be transferred over MathLink in a much more compact form than a list of mixed floats and symbols.
Expression lists will take much longer to push over MathLink. This communication time will dwarf any cost of the particular method by which you retrieve the data from the link.

Third big point...as I said above, the transmission is going to be more costly than the individual read calls (which don't affect the mode of transmission...only how the data are being spoon-fed to you). Packed arrays just represent that many fewer bytes to transmit, and that turns out to be a pretty big deal. You could, in Mathematica, do a last-minute conversion to a packed array by substituting Indeterminates and Infinities with some sort of magic value that Mathematica will leave alone as a float, but that your C program recognizes. There might be a better way to do this...I am not the world's expert on the kernel's internal representations, but I know quite a lot about MathLink, and this is the best idea I have. Of course, synthesizing the packed array has its own cost...whether the benefit is outweighed by the cost is something I would probably determine experimentally.

Fourth big point...the Manual type does *not* preclude efficient reading of packed arrays when that's what has actually been written. If MLGetReal64Array() succeeds, then you've won! In all likelihood (although exceptions are technically possible), you've gotten the highest speed transmission rate, and you get to have only one copy of the data in memory, so long as you're willing to treat that copy as immutable. If MLGetReal64Array() fails, then you can construct the array using the more piecemeal approach I suggested in my previous email.

Sincerely,

John Fultz
jfultz at wolfram.com
User Interface Group
Wolfram Research, Inc.

On Thu, 15 Sep 2011 04:40:56 -0400 (EDT), Roman wrote:
> Thanks John. The entire reason why I am communicating with external C
> procedures is in order to speed up computation; if much time is spent
> in the communication interface then this defeats the point.
> In particular, when passing large arrays of real numbers (containing NaN
> and/or inf) then receiving the array element by element via
> MLGetNext() seems a very inefficient thing to do.
>
> I appreciate how faithful Mathematica is in the transmission process,
> but when I pass "NaN" from C to Mathematica via MLPutReal64() then
> Mathematica does in fact receive the "Indeterminate" symbol, not an
> IEEE "NaN". So the conversion capability is built into Mathematica in
> one direction but not in the other. What I was hoping for is a trick
> which allows me to use such an automatism in the Mathematica->C
> direction via MLGetReal64Array(). For instance, is there a way to
> convert a matrix in Mathematica into a pure numerical representation
> (where every element must be an IEEE number) which could then be
> forwarded immediately (no conversions) to MLGetReal64Array()?
>
> Cheers!
> Roman
>
>
> On Sep 12, 10:23 am, John Fultz <jfu... at wolfram.com> wrote:
>> On Sat, 10 Sep 2011 07:29:23 -0400 (EDT), Roman wrote:
>>> Hello all,
>>> I am setting up a C function which accepts real numbers from MathLink.
>>> The behavior I would like to achieve is that whenever the number is
>>> "Infinity" then the C function receives "inf" (which is a valid
>>> double-precision-format number); and whenever the number is
>>> "Indeterminate" then the C function receives "nan" (which is also a
>>> valid double-precision-format number).
>>> Unfortunately MathLink (Mathematica 7.0 for Mac OS X x86 (64-bit))
>>> crashes whenever I am trying to pass either Infinity or Indeterminate
>>> to a MathLink function expecting a double-precision number.
>>> Would you know how to solve this without going into If[] statements on
>>> the Mathematica side of MathLink?
>>> Thanks!
>>> Roman
>>>
>> Mathematica represents Indeterminate and Infinity as symbols in its
>> expression tree, and MathLink is always very faithful about transmitting
>> the expression tree precisely.
>> Note that it's not very difficult to
>> deal with this in your C code, though. You can just declare the
>> function as having a Manual MathLink type and then, in the C function,
>> determine using MLGetNext() whether the next thing is a symbol or a
>> real. If it's a symbol, then you can just synthesize the IEEE version
>> of the indeterminate value in your C program.
>>
>> About two thirds of the way down this help page:
>>
>> tutorial/HandlingListsArraysAndOtherExpressions
>>
>> there's an example that illustrates how to use Manual as an argument
>> type
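
[Illustrative sketch, not part of the original messages.] The magic-value idea from the third point of the reply above might look roughly like this on the C side. The sentinel constants and the function names are hypothetical, chosen only for this example; they are not MathLink API. The sketch assumes the Mathematica side has already replaced Indeterminate, Infinity and -Infinity with N[2^1001], N[2^1000] and -N[2^1000] before sending, and that those values never occur as genuine data. It decodes into a separate buffer so that a source array that must be treated as immutable is never written to.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical sentinels (assumptions for this sketch, not MathLink constants).
 * Powers of two are exactly representable as doubles, so an exact == test
 * is meaningful as long as the sender used the same values. */
#define SENTINEL_POS_INF   0x1p+1000    /*  2^1000 stands for  Infinity     */
#define SENTINEL_NEG_INF (-0x1p+1000)   /* -2^1000 stands for -Infinity     */
#define SENTINEL_NAN       0x1p+1001    /*  2^1001 stands for Indeterminate */

/* Return nonzero if x is one of the three sentinel values. */
int is_sentinel(double x)
{
    return x == SENTINEL_POS_INF || x == SENTINEL_NEG_INF || x == SENTINEL_NAN;
}

/* Map one received value back to the IEEE value it stands for. */
double decode_sentinel(double x)
{
    if (x == SENTINEL_POS_INF) return INFINITY;
    if (x == SENTINEL_NEG_INF) return -INFINITY;
    if (x == SENTINEL_NAN)     return NAN;
    return x;  /* ordinary data passes through unchanged */
}

/* Decode n values from src into dst; return how many were sentinels.
 * src is only read, so it may point at a buffer that must stay unmodified. */
size_t decode_sentinels(const double *src, double *dst, size_t n)
{
    size_t replaced = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (is_sentinel(src[i]))
            ++replaced;
        dst[i] = decode_sentinel(src[i]);
    }
    return replaced;
}
```

If a second copy of the data is too expensive, calling decode_sentinel() on each element at the point of use avoids the extra buffer, at the cost of up to three comparisons per read. Whether either variant beats the piecemeal MLGetNext() approach is something to measure rather than assume.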