Re: Mathematica not IEEE-754 compliant?
- To: mathgroup at smc.vnet.net
- Subject: [mg131935] Re: Mathematica not IEEE-754 compliant?
- From: Alain Cochard <Alain.Cochard at unistra.fr>
- Date: Sat, 2 Nov 2013 02:26:43 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- Delivered-to: l-mathgroup@wolfram.com
- Delivered-to: mathgroup-outx@smc.vnet.net
- Delivered-to: mathgroup-newsendx@smc.vnet.net
- References: <l4n97h$b25$1@smc.vnet.net> <l4sjo2$pkv$1@smc.vnet.net>
Thank you very much for your answer! danl at wolfram.com writes on Thu 31 Oct 2013 03:45: > I'm not convinced that one or the other is incorrect. It could be > the case, for example, that 80 bit registers at certain steps are > required to get one result whereas 64 bit registers, or storing > from 80 to 64 bit doubles in memory, will give the other. Whether > or how 80 bit registers get used will depend on settings at compile > time for the various programs being used. Unfortunately, I am not familiar with 80 bit registers. My --perhaps naive-- understanding comes from the official document "IEEE Standard for Binary Floating-Point Arithmetic", in which I read that All conforming implementations of this standard shall provide operations to add, subtract, multiply, divide, extract the square root, find the remainder, round to integer in floating-point format, convert between different floating-point formats, convert between floating-point and integer formats, convert binary <---> decimal, and compare. [...] Except for binary <---> decimal conversion, each of the operations shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then coerced this intermediate result to fit in the destination's format (see Sections 4 and 7). Section 7 is dealing with exceptions, which should not be a concern here. Also, as far as I understand, the issue I raised is not related to binary <---> decimal conversion. Section 4.1 is "Round to Nearest" and says: An implementation of this standard shall provide round to nearest as the default rounding mode. In this mode the representable value nearest to the infinitely precise result shall be delivered; So I do not understand how, on a system conforming to the standard, the result of a single multiplication might or might not give the closest machine number. > One thing I can point out is that the Mathematica result is > arguably the better of the two. Once you create 'num' and 'den' as > machine numbers, if you now take their bit sequences, make exact > rationals, and divide to high precision, then create the bit > sequence for the corresponding approximate value, you will recover > the 'bin' set of bits, not that of 'presumablyClosestInfPrec'. I > show this below. > > In[39]:= Take[ > RealDigits[ > N[FromDigits[RealDigits[num, 2], 2]/ > FromDigits[RealDigits[den, 2], 2], 20], 2][[1]], 53] === bin[[1]] > > Out[39]= True > > Notice that I used N[...,20] to make sure to correctly reconstruct > that (exact) quotient to more than 53 correct bits. Here too, I am probably missing something. If I understand correctly, you are saying that Mathematica's result is the better of the two because of internal consistency. But it seems to me that when you take the first 53 digits, you are simply "floor'ing" the number. As, in my exemple, the ordering is: Mma_result < exact_result < closest_number it is no surprise to me that you recover 'bin', but I do not see why it shows consistency. Here is another example with closest_number < exact_result < Mma_result numInfPrec=2^^11000001101111100011010100110001011111101100101111110*2^-53; denInfPrec=2^^11100010111100000100110011100111001001000101100111011*2^-53; num=1.*numInfPrec; den=1.*denInfPrec; res=num/den; bin=RealDigits[res,2] This gives: {{1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0}, 0} presumablyClosestInfPrec= 2^^11011010100011011010100001111010100001001001100110011*2^-53 (* This is the machine number immediately before the previous one *) Check that it is indeed the machine number closest to the exact result: N[ { numInfPrec/denInfPrec - presumablyClosestInfPrec , FromDigits[bin,2] - presumablyClosestInfPrec}, 3]//InputForm This gives {3.998894021443906958645`3.*^-17, 1.1102230246251565404236`3.*^-16} Your piece of code: Take[ RealDigits[ N[FromDigits[RealDigits[num, 2], 2]/ FromDigits[RealDigits[den, 2], 2], 20], 2][[1]], 53] === bin[[1]] gives "False", while Take[ RealDigits[ N[FromDigits[RealDigits[num, 2], 2]/ FromDigits[RealDigits[den, 2], 2], 20], 2][[1]], 53] === RealDigits[presumablyClosestInfPrec,2,53] [[1]] gives "True". Hopefully you can improve my understanding. With best wishes, Alain -- EOST (�cole et Observatoire des Sciences de la Terre) IPG (Institut de Physique du Globe) | alain.cochard at unistra.fr 5 rue René Descartes [bureau 106] | Phone: +33 (0)3 68 85 50 44 F-67084 Strasbourg Cedex, France | Fax: +33 (0)3 68 85 01 25