Re: Re: Division still cost more than multiplication?
- To: mathgroup at smc.vnet.net
- Subject: [mg24422] Re: [mg24390] Re: Division still cost more than multiplication?
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Tue, 18 Jul 2000 00:58:20 -0400 (EDT)
- References: <8k0tc4$q2o@smc.vnet.net> <8k3o2q$418@smc.vnet.net> <200007130313.XAA14017@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Richard Fateman wrote: > > I don't know why you are particularly optimizing the time for a > numeric calculation in Mathematica, since if you were concerned > about the time, you should be using something like Fortran. > > In high-end computer systems, the time to do a multiply or > a divide is essentially irrelevant, since the time to get the > data from memory and back to memory is much greater. Some systems > may several arithmetic units and are quite happy to do several > operations (adds, multiplies, divides, square roots) at the same > time. > > The observation regarding Mathematica is undoubtedly unrelated to the > time taken to do a floating-point division. It is probably related to > associated > processing, maybe having to do with simplification, or error checking > (checking for division by zero, perhaps?) > RJF > > AES wrote: > > > > In article <8k0tc4$q2o at smc.vnet.net>, AES <siegman at stanford.edu> wrote: > > > > > Old-time FORTRAN programmers (like me) were taught (at least in early > > > days) that division cost a lot more machine cycles than multiplication. > > > So, if you had an expression like y = x/c that was going to be called > > > many times inside a loop, where x and y were variables and c a constant, > > > you'd code this as: > > > > [snip] > > > > > Does this still make any sense in Mathematica? Or is it a primitive > > > relic of long-gone days? > > +++++++++++++++++++++++++++++++++++++++++++++ > > > > Anticipating replies to my query that I believe are coming from P. J. > > Hinton and Daniel Lichtbau ((thanks to both), here is the same test as > > P. J. Hintons, performed on a 1998-model 400 MHz Mac PowerBook G3, > > giving very similar results: > > > > In[7]:= {num, den} = {Random[], Random[]} > > > > Out[7]= {0.0991313, 0.0559147} > > > > In[8]:= recip = 1.0 / den > > > > Out[8]= 17.8844 > > > > In[9]:= Timing[Do[num/den, {2000000}]] > > > > Out[9]= {28.8 Second, Null} > > > > In[10]:= {num, den} = {Random[], Random[]} > > > > Out[10]= {0.132828, 0.28853} > > > > In[11]:= recip = 1.0 / den > > > > Out[11]= 3.46585 > > > > In[12]:= Timing[Do[num * recip, {2000000}]] > > > > Out[12]= {14.6833 Second, Null} As some have already noted, tests indicate that it is generally better to precompute the reciprocal. I replicate some experiments below and also endeavor to explain some of the finer points. Unfortunately I am not fully prepared to explain in full detail all the timings shown, except to guess that some types of loop overhead are better optimized in the Mathematica evaluator than others. This alone is of interest, however, because the first method shown indicates how one might come closer to the speed of compiled Fortran or C code for certain types of iterative numeric computations. First I'll point out that speed depends to some extent on the nature of the operands. If one is dividing bignums, and the divisor never changes, then it is always beneficial to precompute the reciprocal and multiply. This is because software implementation of division is typically between three and two times slower than multiplication, depending on the number of digits in the operands (this is independent of relative processor speeds of machine arithmetic multiplication vs division). For machine numbers alot depends on how you structure the Mathematica code. As Richard Fateman points out, one might (perhaps unintentionally) spend most of ones time in loop and other evaluator overhead. For examples I'll illustrate with a table of distinct numerators. The first timing indicates that my machine/platform is somewhat slower that used in the note quoted above. In[25]:= Timing[Do[num*recip, {2000000}]] Out[25]= {22.28 Second, Null} In[1]:= nums = Table[Random[], {1000000}]; In[2]:= den = Random[]; In[4]:= Timing[quots = nums/den;] Out[4]= {0.85 Second, Null} This is alot faster than using a loop because, internally, the Mathematica evaluator sees it is doing a division operation on a packed array (in version 4, that is) and it is processed by fast C code that in effect bypasses the evaluator (provided no overflow occurs at intermediate steps). Now we repeat but in a way that forces us to evaluate an iterator and explicitly fetch entries from the packed array. This is alot slower. In[6]:= Timing[quots2 = Table[nums[[i]]/den, {i,Length[nums]}];] Out[6]= {8.78 Second, Null} In[7]:= quots == quots2 Out[7]= True We can regain some speed by avoiding explicit use of an iterator: In[20]:= Timing[quots3 = Map[#/den&, nums];] Out[20]= {3.7 Second, Null} The first tow methods do not improve by using a reciprocal. In[10]:= recip = 1/den; In[11]:= Timing[quots3 = nums*recip;] Out[11]= {0.84 Second, Null} In[12]:= Timing[quots4 = Table[nums[[i]]*recip, {i,Length[nums]}];] Out[12]= {8.38 Second, Null} The third finds this advantageous. In[21]:= Timing[quots4 = Map[#*recip&, nums];] Out[21]= {2.12 Second, Null} The nums/den example most likely inverts den once since Mathematica parses a/b as a*Power[b,-1]. This would clearly explain why the first method does not improve using an explicit reciprocal. Offhand I do not know why the second example does not seem to care; most likely it evaluates its first argument in the Table-handling code and each step through the iteration thereafter sees nums[[i]*(evaluated reciprocal of den). Clearly the third method is computing 1/den every time, and that would explain why it gets a bit faster when we use an explicit reciprocal. Daniel Lichtblau Wolfram Research
- References:
- Re: Division still cost more than multiplication?
- From: Richard Fateman <fateman@cs.berkeley.edu>
- Re: Division still cost more than multiplication?