MathGroup Archive: November 2005 [00480]

Re: Performance Improvement - Need help

To: mathgroup at smc.vnet.net
Subject: [mg62285] Re: [mg62256] Performance Improvement - Need help
From: "Carl K. Woll" <carlw at wolfram.com>
Date: Sat, 19 Nov 2005 05:54:08 -0500 (EST)
References: <200511172203.RAA16316@smc.vnet.net>
Sender: owner-wri-mathgroup at wolfram.com

Lee Newman wrote:
> Dear Group,
> 
> I am working with a computational model that has a main loop which
> executes about 10^6 to 10^7 times over the course of a simulation --
> taking about 30 hrs. The bottleneck function (below) includes an outer
> product and some matrix algebra. I have optimized it to the best of my
> knowledge, but would desperately like to know if any further optimization
> might be possible (including calling external functions in C or another
> language). Any suggestions would be greatly appreciated.
> 
> FUNCTION ---------------------------------------------------------
> 
> UpdateSynapses = Compile[{{matrix, _Real, 2}, {vector1, _Real, 1},
> {vector2,_Real, 1}, {thresh1, _Real}, {thresh2, _Real}, {C1, _Real},
>   {C2, _Real}, {maxval, _Real}},
> 
> Module[{coactivation},
> 
> coactivation = Outer[Times,
> FloorZero[vector2-thresh2], FloorZero[vector1- thresh1]];
> 
> C2* maxval*coactivation  + (1 - C2* coactivation - C1)*matrix
(The "*" just before "matrix" in the line above should be "." (Dot), not
"*" (Times), I think. -- CW)

> 
>   ]  (* end module *)
> 
> , {{FloorZero[__], _Real, 1}} ];
> 
> Notes:
> (1) vector1 is 1x100;  vector2 is 1x1500; matrix is 100x100; matrix2 is
> 100x1500;  all vectors/matrices are comprised of reals (range 0 to 1)
> and are packed.
> (2) FloorZero=Compile[{{list, _Real, 1}}, UnitStep[list] * list].
> Eliminating this function does not significantly affect performance.
> (3) run time ~ 30hrs for 10^7 iterations  (Pentium 4, 2.8GHz, 1GB RAM)
> 
> Regards,
> Lee Newman

Lee,

Some comments.

1. Use Clip[vector-thresh,{0,10}] instead of FloorZero. It's a bit 
faster, and a bit clearer to me at least. The upper limit 10 is just a 
bound your data never reach, since all values lie between 0 and 1.
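
As a quick sanity check of point 1 (only a sketch; the sample vector x and threshold t are made up for illustration and are not from Lee's model), the two forms can be compared directly:

```mathematica
(* hypothetical sample data, not from the original post *)
x = {0.1, 0.5, 0.9, 0.3};
t = 0.4;

floorZero = UnitStep[x - t] (x - t);   (* what FloorZero computes *)
clipped   = Clip[x - t, {0, 10}];      (* the suggested replacement *)

(* compare with a small tolerance rather than exact equality *)
Max[Abs[floorZero - clipped]] < 10^-12
```

The comparison only holds while every element of x - t stays below the upper clip bound, so the bound would need raising for data outside the 0 to 1 range.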

2. Your coactivation matrix is an outer product of two vectors, so it can 
be thought of as the dot product of a column vector c and a row vector r. 
In this light, the product coactivation.matrix can be computed as

c . (r . matrix)

instead of

(c . r) . matrix

Now, the dot product of a vector with a matrix is usually much faster 
than the dot product of a matrix with a matrix, so this ought to provide 
some speed gain.
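
The regrouping in point 2 can be checked on small random data. This is a sketch only: the names c, r and mat and the sizes 6 and 4 are assumptions chosen for illustration, and the column-times-row product is written with Outer.

```mathematica
(* hypothetical small sizes for illustration *)
c   = Table[Random[], {6}];        (* plays the role of the column vector *)
r   = Table[Random[], {4}];        (* plays the role of the row vector *)
mat = Table[Random[], {4}, {4}];

slow = Outer[Times, c, r] . mat;   (* (c . r) . matrix: matrix-matrix product *)
fast = Outer[Times, c, r . mat];   (* c . (r . matrix): vector-matrix product first *)

(* the two groupings should agree up to rounding error *)
Max[Abs[slow - fast]] < 10^-12
```

For a 1500x100 outer product and a 100x100 matrix, the first form costs a full matrix-matrix multiply, while the second needs only one vector-matrix multiply plus the outer product.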

3. The only thing left to worry about is the 1-C1 part of the matrix 
product (1-C1-C2 coactivation).matrix. Since coactivation is a 1500x100 
matrix, 1-C1 here really stands for a 1500x100 matrix whose entries are 
all 1-C1. It turns out that the (1-C1).matrix part is just 1500 copies of 
the row vector (1-C1) Total[matrix], where Total[matrix] gives the column 
sums of matrix.
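
A small check of point 3, again only as a sketch: k is a made-up stand-in for 1-C1, and the sizes 6 and 4 replace 1500 and 100.

```mathematica
(* hypothetical small sizes; k stands in for 1 - C1 *)
mat  = Table[Random[], {4}, {4}];
k    = 0.7;
ones = Table[k, {6}, {4}];           (* 6x4 matrix with every entry equal to k *)

lhs = ones . mat;                    (* the full matrix product *)
rhs = Table[k Total[mat], {6}];      (* 6 copies of the scaled column sums *)

(* both sides should agree up to rounding error *)
Max[Abs[lhs - rhs]] < 10^-12
```

Each row of the product sums k times every row of mat, which is why the scaled column sums appear in every row of the result.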

4. We end up with the outer product of a 1500 element column vector with 
a 100 element row vector, and then to each row we add the same 100 
element row vector. It turns out that instead of Outer, it's a bit 
faster to use Map.
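
The equivalence behind point 4 can be sketched as follows. The names col, row and add are hypothetical, and the sizes 6 and 4 stand in for 1500 and 100.

```mathematica
(* hypothetical small sizes for illustration *)
col = Table[Random[], {6}];   (* the column vector *)
row = Table[Random[], {4}];   (* the row vector in the outer product *)
add = Table[Random[], {4}];   (* the row vector added to every row *)

viaOuter = Outer[Times, col, row] + Table[add, {6}];
viaMap   = (row # + add &) /@ col;   (* one result row per element of col *)

(* both constructions should agree up to rounding error *)
Max[Abs[viaOuter - viaMap]] < 10^-12
```

The Map form builds each result row in a single pass, so it never needs the intermediate matrix of repeated rows.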

Putting the above ideas together, I came up with the following 
uncompiled function:

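update[m_, v1_, v2_, t1_, t2_, c1_, c2_, max_] :=
   Module[{f1, f2, i1, i2},
     f1 = c2 Clip[v1 - t1, {0, 10}];  (* C2 times thresholded vector1 *)
     f2 = Clip[v2 - t2, {0, 10}];     (* thresholded vector2 *)
     i1 = max f1 - f1.m;              (* row vector scaled by each element of f2 *)
     i2 = (1 - c1) Total[m];          (* row vector added to every row *)
     (i1# + i2 &) /@ f2]              (* one result row per element of f2 *)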

Here is some test data:

SeedRandom[1];
m = Table[Random[], {100}, {100}];
v1 = Developer`ToPackedArray@Table[Random[], {100}];
v2 = Table[Random[], {1500}];
{t1, t2, c1, c2, max} = Table[Random[], {5}];

Let's make sure the matrices and vectors are packed:

In[9]:=
Developer`PackedArrayQ/@{m,v1,v2}

Out[9]=
{True, True, True}

Now, comparing update with UpdateSynapses:

In[10]:=
Do[r1=update[m,v1,v2,t1,t2,c1,c2,max],{100}]//Timing
Do[r2=UpdateSynapses[m,v1,v2,t1,t2,c1,c2,max],{100}]//Timing
r1==r2

Out[10]=
{1.516 Second, Null}

Out[11]=
{5.078 Second, Null}

Out[12]=
True

At least on my slow machine, update is more than 3 times faster 
(5.078/1.516 is about 3.3). If you see the same speedup, your 30 hour 
run should take roughly 9 hours.

Carl Woll

PS. The version of UpdateSynapses I used is:

UpdateSynapses=Compile[{
         {matrix,_Real,2},
         {vector1,_Real,1},
         {vector2,_Real,1},
         {thresh1,_Real},
         {thresh2,_Real},
         {C1,_Real},
         {C2,_Real},
         {maxval,_Real}
         },
       Module[{coactivation},
         coactivation=Outer[
             Times,
             FloorZero[vector2-thresh2],
             FloorZero[vector1-thresh1]
             ];
         C2*maxval*coactivation+(1-C2*coactivation-C1).matrix],
       {{FloorZero[__],_Real,1}}];

In[2]:=
FloorZero=Compile[{{list,_Real,1}},UnitStep[list]*list];

References:
- Performance Improvement - Need help
  - From: Lee Newman <leenewm@umich.edu>
