Re: Manually nested Tables faster than builtin nesting?
- To: mathgroup at smc.vnet.net
- Subject: [mg113449] Re: Manually nested Tables faster than builtin nesting?
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Fri, 29 Oct 2010 06:28:01 -0400 (EDT)
Bill Rowe wrote:
> On 10/27/10 at 5:13 AM, sschmitt at physi.uni-heidelberg.de (Sebastian
> Schmitt) wrote:
>
>> I find that manually nested Tables are faster (factor 2 in my
>> example) compared to the builtin nesting. I probably do a stupid
>> thinko. Could you please have a look?
>
>> In[52]:= imax = jmax = 5000;
>
>> In[53]:= (tabletable = Table[Table[i*j, {i, imax}], {j, jmax}];) //
>> Timing // First
>
>> Out[53]= 1.84
>
>> In[54]:= (table = Table[i*j, {i, imax}, {j, jmax}];) // Timing //
>> First
>
>> Out[54]= 14.55
>
> I get similar timing results. My speculation is as follows:
>
> When you use the built-in nesting, you explicitly do 25 million
> multiplies. No possibility of vector processing.
>
> But when you use explicit nesting, you are multiplying a packed
> array by a scalar. My guess is Mathematica employs some vector
> processing here and gains a speed up over individual multiplies.
>
> If my speculation is right, then there should be some operations
> which will not gain such a dramatic speed-up.
This is heading in the right direction. Actually the entire issue
involves use, or not, of packed arrays. No vectorization of
multiplication is done here. If it were, there would be an additional
speed gain.
imax = jmax = 2000;
First the basic nested version.
In[2]:= First[Timing[tnest1 = Table[Table[i*j, {i, imax}], {j, jmax}];]]
Out[2]= 0.239963
Now we vectorize and it gets noticeably faster.
In[3]:= First[Timing[tnest2 = Table[j *
Table[i, {i, imax}], {j, jmax}];]]
Out[3]= 0.099985
In[4]:= tnest1===tnest2
Out[4]= True
Now without explicit nesting.
In[5]:= First[Timing[tnonest1 = Table[j*i, {i, imax}, {j, jmax}];]]
Out[5]= 1.46478
In[6]:= tnonest1===tnest2
Out[6]= True
Why so much slower? It has to do with packing. Neither is packed, but
the first ones have packed rows.
<<Developer`
In[10]:= And @@ Map[PackedArrayQ,tnest2]
Out[10]= True
In[11]:= And @@ Map[PackedArrayQ,tnest1]
Out[11]= True
In[12]:= And @@ Map[PackedArrayQ,tnonest1]
Out[12]= False
One might well wonder why the packing is not done in this case. It has
to do with evaluation semantics. Table sees the second iterator upper
bound as a variable (because it is Held). Indeed, it could be changed in
process of handling the first iterator. Hence Table cannot set up a
packed array.
We can force preevaluation of that iterator bound as below. This
recovers the expected speed.
In[13]:= First[Timing[tnonest2 = With[{jmax=jmax},
Table[j*i, {i, imax}, {j, jmax}]];]]
Out[13]= 0.138979
We can even pack the entire thing, simply by forcing numerical
evaluation of both inner and outer iterator bounds.
In[15]:= First[Timing[tnonest3 = With[{imax=imax,jmax=jmax},
Table[j*i, {i, imax}, {j, jmax}]];]]
Out[15]= 0.13898
In[16]:= PackedArrayQ[tnonest3]
Out[16]= True
In[17]:= tnonest3===tnonest2===tnonest1
Out[17]= True
One will notice the construction of tnest2 was slightly faster. This is
presumably an indication of the speed gain from vectorization of the
multiplications.
Daniel Lichtblau
Wolfram Research