MathGroup Archive: December 2008 [00048]

[Date Index] [Thread Index] [Author Index]

Re: Re: Re: Floating-Point Computing

To: mathgroup at smc.vnet.net
Subject: [mg94008] Re: [mg93998] Re: [mg93927] Re: Floating-Point Computing
From: Daniel Lichtblau <danl at wolfram.com>
Date: Tue, 2 Dec 2008 00:39:44 -0500 (EST)
References: <ggofnk$rsh$1@smc.vnet.net> <200812011202.HAA05932@smc.vnet.net>

danl at wolfram.com wrote:
>> Antonio wrote:
>>> Dear List,
>>>
>>> I came across this Sun article on floating point:
>>> http://developers.sun.com/solaris/articles/fp_errors.html
>>>
>>> which calculates the following code:
>>>
>>> main()
>>> {
>>>  register i;
>>>  double f=0.0;
>>>  double sqrt();
>>>
>>>  for(i=0;i<=1000000000;i++)
>>>  {
>>>     f+=sqrt( (double) i);
>>>  }
>>>  printf("Finished %20.14f\n",f);
>>> }
>>>
>>> whose correct answer using Euler-Maclaurin formula is:
>>> f=21081851083600.37596259382529338
>>>
>>> Tried to implement in mathematica, just for fun, so that later I could
>>> play with precision, but the kernel has no more memory and shuts down.
>>>
>>> N[Total[Sqrt[Range[1000000000]]]]
>>>
>> If all those numbers are stored as 32-bit integers in a packed array,
>> they would take up about 4*10^9/1024^3 ~= 3.7 GB space.  That is already
>> too much for a 32-bit machine, and more than most present-day personal
>> computers can handle.  Mapping Sqrt to the list unpacks it, eating up
>> even more memory.  Creating a symbolic sum with Total and only
>> numericising at the end doesn't help either.
>>
>> The only reasonable approach here is to write code that is equivalent to
>> the C version, i.e. doesn't pre-compute the list of numbers to be
>> summed.  But that will run too slowly in Mathematica.  So Mathematica is
>> not really the right tool for summing 10^9 numbers.
>>
>> Two possible versions:
>>
>> fun1[n_Integer] := Module[{f = 0.},
>>    Do[f += N@Sqrt[i], {i, n}];
>>    f
>>    ]
>>
>> fun2 = Compile[{{n, _Integer}}, Module[{f = 0.},
>>     Do[f += N@Sqrt[i], {i, n}];
>>     f
>>     ]
>>    ]
>>
>> fun2[10^9] can finish in a reasonable time, but it only works with
>> machine precision numbers (because of Compile).
>>
>> If you sum fewer numbers, use Total@Sqrt@N@Range[n] instead of
>> N@Total@Sqrt@Range[n].
> 
> These two approaches can be hybridized to give quite good speed. The one
> caveat is that, when totalling a list, the semantics are not defined as to
> order of summing. Though in actual practice I believe Total sums
> first-to-last, which is in accord with the semantics of the original
> question.
> 
> Anyway, here is code that sums square roots to n, using blocks of size m.
> 
> cfun = Compile[{{n, _Integer}, {m, _Integer}},
>    Module[{tot = 0., flr = Floor[n/m]},
>     Do[
>      tot += Total[Sqrt[j*m + Range[m]]];
>      , {j, 0, flr - 1}];
>     tot += Total[Sqrt[Range[flr*m + 1, n]]];
>     tot]];
> 
> It's about four times faster than what I had for summing one at a time
> (essentially the same as fun2 above).
> 
> In[129]:= Timing[cfun[10^7, 10^5]]
> Out[129]= {0.841, 2.10819\[Times]10^10}
> 
> In[131]:= Timing[cfun[10^8, 10^7]]
> Out[131]= {10.005, 6.66667\[Times]10^11}
> 
> In[130]:= Timing[cfun[10^9, 10^7]]
> Out[130]= {104.44, 2.10819\[Times]10^13}
> 
> My general experience with Compile suggest that this speed is perhaps
> within a factor of 2-4 of the C code above. The test below also suggest
> this, since it spends most of its time in C code (though it may be a
> factor of 2 slower than need be, since it makes two passes over the list).
> 
> In[135]:= Timing[Total[Sqrt[Range[1., 10^7]]]]
> 
> Out[135]= {0.431, 2.10819\[Times]10^10}
> 
> Daniel Lichtblau
> Wolfram Research

After giving this some thought and looking at cfun[[4]], I decided this 
CompiledFunction was maybe not so efficient. So I rewrote without 
Compile and it got faster. I think the reason is that Compile is doing a 
bit of extra work in the implicit loops (Range or coercing it's result 
to be reals, maybe). Anyway, here is faster code.

fun[n_Integer, m_Integer] :=
  Module[{tot = 0., flr = Floor[n/m], k = 0.},
   Do[
    tot += Total[Sqrt[k + Range[m]]];
    k += m;
    , {j, 0, flr - 1}];
   tot += Total[Sqrt[Range[k + 1., n]]];
   tot]

In[12]:= InputForm[Timing[fun[10^9, 10^7]]]
Out[12]//InputForm= {26.009045999999998, 2.1081851083600375*^13}

This is on my machine at work, which is substantially faster than the 
laptop I had used for the timings above. It is nearly twice as fast as 
cfun on my work machine (this seems to be heavily system dependent; it 
is only 25% or so faster on my laptop).

Daniel Lichtblau
Wolfram Research

References:
- Re: Re: Floating-Point Computing
  - From: danl@wolfram.com

Prev by Date: Re: Calculating with Units

Next by Date: Manipulate slider moves from top to side

Previous by thread: Re: Re: Floating-Point Computing

Next by thread: Re: Floating-Point Computing