MathGroup Archive: November 1996 [00004]

[Date Index] [Thread Index] [Author Index]

Re: Speed of dot product in Mathematica

To: mathgroup at smc.vnet.net
Subject: [mg5247] Re: Speed of dot product in Mathematica
From: Robert Knapp <rknapp>
Date: Fri, 15 Nov 1996 03:33:58 -0500
Organization: Wolfram Research
Sender: owner-wri-mathgroup at wolfram.com

Carlos A. Felippa wrote:
> 
> The speed of dot products in Mathematica 2.2 depends
> significantly on implementation.  For example, timing
> 
>  n=10000; a=Table[1.,{n}]; b=a;
>  Print[Timing[a.b]];
>  Print[Timing[s=Sum[a[[i]]*b[[i]],{i,1,n}]]]
>  Print[Timing[s=0;Do[s+=a[[i]]*b[[i]],{i,1,n}]]];
>  Print[Timing[s=0;For[i=0,i<=n,i++,s+=a[[i]]*b[[i]] ]]];
> 
> on a Mac 8500 gives
> 
>  {0.0666667 Second, 10000.}
>  {1.48333 Second, 10000.}
>  {2.38333 Second, Null}
>  {3.95 Second, Null}
> 
> Is there a way to speed up the Sum form, for example using Compile, so that
> it achieves a performance similar to that of the built-in dot operator?

In Mathematica 2.2, there is no good way to use Compile on the Sum
version.  However, in Mathematica 3.0, it is very easy to compile and
the results are quite good.  Timings below are on a Pentium Pro 200.

In[44]:=
n=10000; a=Table[1.,{n}]; b=a;

In[45]:=
Timing[a.b]

Out[45]=
{0.04 Second,10000.}

In[46]:=
Timing[Sum[a[[i]]*b[[i]],{i,1,Length[a]}]]

Out[46]=
{0.34 Second,10000.}

In[47]:=
cf1 = Compile[{{x, _Real, 1},{y, _Real, 1}},x.y];

In[48]:=
cf2 = Compile[{{x, _Real, 1},{y, _Real, 1}},
    Sum[x[[i]]*y[[i]],{i,1,Length[x]}]];

In[49]:=
Timing[cf1[a,b]]

Out[49]=
{0.01 Second,10000.}

In[50]:=
Timing[cf2[a,b]]

Out[50]=
{0.06 Second,10000.}

So you can improve the Sum version so that the built in Dot is only 1.5
times faster.  However at the same time, the Compiled Dot is even faster
yet!

> 
> This is important in matrix routines where the dot product involves only
> portions of rows or columns, or where the stride is not unity.
> 
The improvement I showed above should work for these also.

> BTW, several of the LinearAlgebra Package functions (e.g. lufactor)
> use the For-loop implementation.  As shown above, that has the worst
> performance, being 60 times slower than the built-in operator.

There is a new built in command called LUDecomposition in version 3.0,
which is much more efficient thatn this.

Rob Knapp
WRI

Prev by Date: Re: Re: Break a List into Individual Elements?

Next by Date: Re: SeedRandom oddity/flaw/bug/imperfection

Previous by thread: Re: Inverse of Error Function

Next by thread: Speed of dot product in Mathematica