Re: Manually nested Tables faster than builtin nesting?

*To*: mathgroup at smc.vnet.net*Subject*: [mg113449] Re: Manually nested Tables faster than builtin nesting?*From*: Daniel Lichtblau <danl at wolfram.com>*Date*: Fri, 29 Oct 2010 06:28:01 -0400 (EDT)

Bill Rowe wrote: > On 10/27/10 at 5:13 AM, sschmitt at physi.uni-heidelberg.de (Sebastian > Schmitt) wrote: > >> I find that manually nested Tables are faster (factor 2 in my >> example) compared to the builtin nesting. I probably do a stupid >> thinko. Could you please have a look? > >> In[52]:= imax = jmax = 5000; > >> In[53]:= (tabletable = Table[Table[i*j, {i, imax}], {j, jmax}];) // >> Timing // First > >> Out[53]= 1.84 > >> In[54]:= (table = Table[i*j, {i, imax}, {j, jmax}];) // Timing // >> First > >> Out[54]= 14.55 > > I get similar timing results. My speculation is as follows: > > When you use the built-in nesting, you explicitly do 25 million > multiplies. No possibility of vector processing. > > But when you use explicit nesting, you are multiplying a packed > array by a scalar. My guess is Mathematica employs some vector > processing here and gains a speed up over individual multiplies. > > If my speculation is right, then there should be some operations > which will not gain such a dramatic speed-up. This is heading in the right direction. Actually the entire issue involves use, or not, of packed arrays. No vectorization of multiplication is done here. If it were, there would be an additional speed gain. imax = jmax = 2000; First the basic nested version. In[2]:= First[Timing[tnest1 = Table[Table[i*j, {i, imax}], {j, jmax}];]] Out[2]= 0.239963 Now we vectorize and it gets noticeably faster. In[3]:= First[Timing[tnest2 = Table[j * Table[i, {i, imax}], {j, jmax}];]] Out[3]= 0.099985 In[4]:= tnest1===tnest2 Out[4]= True Now without explicit nesting. In[5]:= First[Timing[tnonest1 = Table[j*i, {i, imax}, {j, jmax}];]] Out[5]= 1.46478 In[6]:= tnonest1===tnest2 Out[6]= True Why so much slower? It has to do with packing. Neither is packed, but the first ones have packed rows. <<Developer` In[10]:= And @@ Map[PackedArrayQ,tnest2] Out[10]= True In[11]:= And @@ Map[PackedArrayQ,tnest1] Out[11]= True In[12]:= And @@ Map[PackedArrayQ,tnonest1] Out[12]= False One might well wonder why the packing is not done in this case. It has to do with evaluation semantics. Table sees the second iterator upper bound as a variable (because it is Held). Indeed, it could be changed in process of handling the first iterator. Hence Table cannot set up a packed array. We can force preevaluation of that iterator bound as below. This recovers the expected speed. In[13]:= First[Timing[tnonest2 = With[{jmax=jmax}, Table[j*i, {i, imax}, {j, jmax}]];]] Out[13]= 0.138979 We can even pack the entire thing, simply by forcing numerical evaluation of both inner and outer iterator bounds. In[15]:= First[Timing[tnonest3 = With[{imax=imax,jmax=jmax}, Table[j*i, {i, imax}, {j, jmax}]];]] Out[15]= 0.13898 In[16]:= PackedArrayQ[tnonest3] Out[16]= True In[17]:= tnonest3===tnonest2===tnonest1 Out[17]= True One will notice the construction of tnest2 was slightly faster. This is presumably an indication of the speed gain from vectorization of the multiplications. Daniel Lichtblau Wolfram Research