Re: More memory-efficient inner product for large last
- To: mathgroup at smc.vnet.net
- Subject: [mg106901] Re: [mg106870] More memory-efficient inner product for large last
- From: Leonid Shifrin <lshifr at gmail.com>
- Date: Tue, 26 Jan 2010 06:37:10 -0500 (EST)
- References: <201001251009.FAA09421@smc.vnet.net>
Hi Vince, I suggest that you use lazy matrix multiplication, which can be implemented for example as follows: Clear[dotLazy]; dotLazy[first_?ArrayQ, second_?ArrayQ] := With[{fdims = Most@Dimensions@first, sdims = Most@Dimensions@second}, Module[{a, b, plus}, With[{firstSymbolic = Array[a, fdims], secondSymbolic = Array[b, sdims]}, plus[x_] := x; plus[x__] /; Length[{x}] =!= 2 := Fold[plus, First@{x}, Rest@{x}]; plus[x_, y_] /; Head[x] =!= Plus := Total[{x, y} /. {a[i_, j_] :> first[[i, j]], a[i_] :> first[[i]], b[i_] :> second[[i]], b[i_, j_] :> second[[i, j]]}]; firstSymbolic.secondSymbolic /. Plus -> plus]]]; Here is a function to test the memory consumption: ClearAll[measureMemoryConsumption]; SetAttributes[measureMemoryConsumption, HoldAll]; measureMemoryConsumption[code_, f_] := With[{current = MemoryInUse[], mmu = MaxMemoryUsed[]}, {f[code], MemoryInUse[] - current, MaxMemoryUsed[] - mmu}]; It returns a list of 3 elements, the first being some function <f> applied to your code <code>, the second is a relative increase of unclaimed memory, the third is a peak increase of memory used in a computation. First let us check that my lazy dot function gives the same as yours dot, for small lists: In[3]:= l1 = RandomReal[1., {3, 5}]; l2 = RandomReal[1., {3, 3, 5}]; In[5]:= dotLazy[l1, l1] === dot[l1, l1, 1] Out[5]= True In[6]:= dotLazy[l2, l1] === dot[l2, l1, 2] Out[6]= True In[7]:= dotLazy[l2, l2] === dot[l2, l2, 2, 2] Out[7]= True Now a more serious test: In[9]:= l1 = RandomReal[1., {3, 100000}]; l2 = RandomReal[1., {3, 3, 100000}]; In[11]:= measureMemoryConsumption[dotLazy[l1, l1], Dimensions] Out[11]= {{100000}, 800384, 5536} In[12]:= measureMemoryConsumption[dot[l1, l1, 1], Dimensions] Out[12]= {{100000}, 248, 0} In[13]:= measureMemoryConsumption[dotLazy[l2, l1], Dimensions] Out[13]= {{3, 100000}, 2400696, 1603800} In[14]:= measureMemoryConsumption[dot[l2, l1, 2], Dimensions] Out[14]= {{3, 100000}, 248, 40796320} In[15]:= measureMemoryConsumption[dotLazy[l2, l2], Dimensions] Out[15]= {{3, 3, 100000}, 7201640, 0} In[16]:= measureMemoryConsumption[dot[l2, l2, 2, 2], Dimensions] Out[16]= {{3, 3, 100000}, 256, 70808608} As can be seen, my version is less memory-efficient for list-to-list dot product, but vastly more efficient for other operations. I did not test on such huge lists as your original ones since I don't have so much memory at my disposal at the moment (running Eclipse and SQLDeveloper), but I would expect similar effect. Hope this helps. Regards, Leonid On Mon, Jan 25, 2010 at 1:09 PM, Vince Virgilio <blueschi at gmail.com> wrote: > Hello, > > I need to compute vector-vector, matrix-vector, and matrix-matrix > inner products, for vectors and matrices whose elements are not > scalars, but very large lists (~ 1.2M element each). I need Dot[] to > ignore the last tensor index, but it has no parameter for this, like > Outer's last "n_i " arguments. So I implemented my own. Unfortunately, > the matrix-vector and matrix-matrix products consume excessive amounts > of memory. The matrix-vector product peak memory footprint is ~ 800MB, > for ~ 110MB total input, and the matrix-matrix product peaks at ~ 1.8 > GB for ~ 180MB input. Apparently, memory overhead is ~ 8-10X. > > Here is a trace of system memory use (working set and its peak) for > the above Mathematica evaluations. My system is Windows 7, 2.5 GHz > Intel Core 2 Duo, 4GB RAM (Lenove R61 laptop). > > http://tinyurl.com/yfgwp26 (PDF ~ 180KB) > > Please find below my implementation of "dot", which ignores sublists > below level 1 or 2 (depends on product type). Can it be made more > efficient? > > Thank you, > > Vince Virgilio > > > dot[l1_, l2_, 1] := l1*l2 // Total; > > dot[l1_, l2_, 2, n2_: 1] := > ReleaseHold@Dot[Map[Hold, l1, {2}], Map[Hold, l2, {n2}]]; > > In[3]:= l1 = RandomReal[1., {3, 1200000}]; > l2 = RandomReal[1., {3, 3, 1200000}]; > > In[5]:= ByteCount@l1 > > Out[5]= 28800128 > > In[6]:= ByteCount@l2 > > Out[6]= 86400132 > > (* Vector-Vector inner product *) > > In[7]:= l3 = dot[l1, l1, 1]; > Dimensions@l3 > > Out[8]= {1200000} > > (* Matrix-Vector inner product *) > > l4 = dot[l2, l1, 2]; (* Memory use peaks @ ~ 800 MB *) > > In[8]:= Dimensions@l4 > > Out[8]= {3, 1200000} > > (* Matrix-Matrix inner product *) > > l5 = dot[l2, l2, 2, 2]; (* Memory use peaks @ ~ 1.8 GB! *) > > In[8]:= Dimensions@l5 > > Out[8]= {3, 3, 1200000} > >
- References:
- More memory-efficient inner product for large last dimension?
- From: Vince Virgilio <blueschi@gmail.com>
- More memory-efficient inner product for large last dimension?