MathGroup Archive 2009

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Help with Speeding up a For loop

  • To: mathgroup at smc.vnet.net
  • Subject: [mg98920] Re: [mg98881] Help with Speeding up a For loop
  • From: Leonid Shifrin <lshifr at gmail.com>
  • Date: Wed, 22 Apr 2009 05:08:38 -0400 (EDT)
  • References: <200904202313.TAA09729@smc.vnet.net>

Hi Adam,

I have a number of comments on your code, which will hopefully clarify the
main sources of confusion and inefficiency.

Here is the version of your code which I modified somewhat.
The <initialize> function is introduced for convenience, to be able to
perform these operations many times, with different number of points. Same
with <runSimpleLoop>. I also  changed the names Ideal -> IdealLoop,
Resolution -> ResolutionLoop.

In[1]  =
initialize[num_Integer?Positive] :=
  Module[{},
   Eb1 = 0;
   k = 0;
   n = num;
   E0 = 2470;
   m = 0.2;
   DeltaE = 50;
   Sigma = 5;
   maxE = E0 - m;
   minE = E0 - DeltaE;
   (*Setup all constants,lists and ranges*)
   Eb1 = RandomReal[{minE, maxE},
     n];(*create a list of'n' random Eb1 values*)
   k = -RandomReal[TriangularDistribution[{-2470, 0}, -0.1], n];
   ];


In[2] =
runSimpleLoop[] :=
 Block[{Eb1 = Eb1, k = k, y3},
  ResolutionLoop = {};
  IdealLoop = {};
  For[i = 1, i <= n, i++,
   If[k[[i]] <
     Re[(E0 - Eb1[[i]])^2*
       Sqrt[1 - m^2/(E0 - Eb1[[i]])^2]],(*check if the {k,
    Eb1} value is under the curve*)
    AppendTo[IdealLoop, Eb1[[i]]];(*Keep events under curve in'Ideal'*)
    y3 = Eb1[[i]];(*cast element to a number*)
    Eb1[[i]] =
     First@RandomReal[NormalDistribution[y3, Sigma],
       1];(*choose a random number from a normal distribution about \
that point*)
    AppendTo[ResolutionLoop,
     Eb1[[i]]];]] ](*Keep that event in'Resolution'*)


Comments so far:

These reflect changes in the code above compared to your original version:

1.  You place in your lists Resolution and Ideal 0 to start with. This is
not correct, since this point does not belong there.  Start with {} instead.
2. This is not C, array index starts from 1, not 0
3. Try to avoid side effects and stay away from global variables. They make
life harder.   Block[{Eb1 = Eb1},    loop ] makes Eb1 dynamically local
while produces the same end result. (Generally, use Module, not Block - I
just wanted to keep the code as much close to yours as possible.
4. RandomReal[..., 1] produces a number wrapped in a list. You don'
  t need that - take First of that list.

These two are not dealt with so far:

5. Redundancy of y3.
6. No need to change Eb1 in place.


Now, the reason your code is slow: the main culprit of your implementation
is AppendTo.    Avoid it in loops at all costs -  leads to quadratic in the
size of the list complexity. Use Reap -  Sow if you insist on using loops at
all. Below is the code that uses Reap - Sow:

In[3] = runLoopWithReapSow[] :=
   Block[{Eb1 = Eb1, k = k, y3},
   Reap[For[i = 1, i <= n, i++,
      y3 = Eb1[[i]];
      If[k[[i]] < Re[(E0 - y3)^2*Sqrt[1 - m^2/(E0 - y3)^2]],
       Sow[y3, "ideal"];
       Sow[First@RandomReal[NormalDistribution[y3, Sigma], 1],
        "resolution"];
       ]], _, List][[2]]];

This loop produces the result - the lists of interest combined together,
with the string "tags": "ideal" and "resolution". The names of the labels
(tags) are completely arbitrary, I could even use just numbers 0 and 1 for
them. Let us look at a simple example:

In[4] =
initialize[20];
runSimpleLoop[];

In[5] = IdealLoop

Out[5] = {2431.1, 2442.78, 2445.85, 2447.74, 2429.49, 2448.45, 2434.79, \
2425.95};

In[6] = ResolutionLoop

Out[6] = {2436.17, 2443.8, 2432.54, 2433.07, 2420.16, 2439.93, 2418.47, \
2423.48, 2436.35}

In[7] = runLoopWithReapSow[]

Out[7] = {{"ideal", {2431.1, 2442.78, 2445.85, 2447.74, 2429.49, 2448.45,
   2434.79, 2425.95}}, {"resolution", {2433.86, 2443.81, 2446.62,
   2450.4, 2426.98, 2442.91, 2436.05, 2436.48}}}

We only can compare the <ideal> part - it is of course the same. Now, the
timing test:

In[8] =

initialize[10000];
Print[runSimpleLoop[]; // Timing];
Print[newResult = runLoopWithReapSow[]; // Timing];
{IdealLoopNew, ResolutionLoopNew} = Part[newResult, All, 2];

 {8.893,Null}
{0.892,Null}

Already here we get a 10-times speed-up. The results for Ideal are, of
course, the same:

In[9] = IdealLoopNew == IdealLoop

Out[9]= True

Finally, let me show you a more idiomatic way to approach your problem,
where we will vectorize it and get rid of loop at all:

In[10] =

(Ideal =
     Pick[Eb1, (Re[(E0 - Eb1)^2*Sqrt[1 - m^2/(E0 - Eb1)^2]] - k) /.
              _?Positive :>  True];

  Resolution = Map[RandomReal[NormalDistribution[#, Sigma], 1] &,
    Ideal];) // Timing

{0.21, Null}

In[11] = IdealLoopNew == Ideal

Out[11] = True

We got another 4-fold speedup, AND we don't need explicit loops, with all
the garbage associated with them. Basically, this code first selects all
points in Eb1 that are below the curve, and then computes the Resolution
vector on these points only.

Perhaps, wise use of Compile could further speed it up somewhat (2-4 times,
my guess).

Anyway, this code gets you 10000 points in 0.2 seconds (on my 5 years old
PIV 1.8 Ghz single core), should be probably good enough for you. If you
need around 10 000 000 or more, watch out for the memory usage - you may be
better off doing some binning as the program runs, by splitting this large
number into sycles of smaller lengths.

As to the programming, try to avoid loops in Mathematica, and also global
variables - even those you use in initialization can be made local to only
the function(s) that need them. Generally, for anything longer than a couple
of lines, try to write functions whose bodies only contain system symbols
and formal parameters.

Regards,
Leonid

On Mon, Apr 20, 2009 at 4:13 PM, Adam Dally <adally at wisc.edu> wrote:

> I am using an Intel MacBook with OS X 10.5.6.
>
> I am trying to create 2 lists: "Ideal" and "Resolution".  This is basically
> a "Monte Carlo Integration" technique. Ideal should simulate the curve.
> Resolution should simulate the curve convoluted with a normal distribution.
> I want to do this for n=10 000 000 or more, but it takes far too long right
> now. I can do n=100 000 in about 1 minute, but 1 000 000 takes more than an
> hour. I haven't waited long enough for 10 000 000 to finish (it has been 5
> days).
>
> Thank you,
> Adam Dally
>
> Here is the code:
>
> ClearAll[E0, Eb1, m, DeltaE, Sigma, k, n, y3, Ideal, Resolution, i,
> normalizer, maxE, minE]
> Eb1 = 0; k = 0; n = 10000; E0 = 2470; m = 0.2; DeltaE = 50; Sigma = 5; maxE
> = E0 - m; minE = E0 - DeltaE; Resolution = {Eb1}; Ideal = {Eb1};  (*Setup
> all constants, lists and ranges*)
>
> Eb1 = RandomReal[{minE, maxE}, n];  (*create a list of 'n' random Eb1
> values*)
> k = -RandomReal[TriangularDistribution[{-2470, 0}, -0.1], n]; (*create a
> list of 'n' random k values; triangle distribution gives more successful
> results*)
>
> For[i = 0, i < n, i++,
>
>  If[k[[i]] < Re[(E0 - Eb1[[i]])^2*Sqrt[1 - m^2/(E0 - Eb1[[i]])^2]], (*check
> if the {k,Eb1} value is under the curve*)
>     AppendTo[Ideal, Eb1[[i]]];  (*Keep events under curve in 'Ideal'*)
>     y3 = Eb1[[i]]; (*cast element to a number*)
>     Eb1[[i]] = RandomReal[NormalDistribution[y3, Sigma], 1]; (*choose a
> random number from a normal distribution about that point*)
>     AppendTo[Resolution, Eb1[[i]]]; ]] (*Keep that event in 'Resolution'*)
>
>
>



  • Prev by Date: Re: Help with a possible bug
  • Next by Date: Re: Looking for more Mathematica online user groups/forums
  • Previous by thread: Re: Help with Speeding up a For loop
  • Next by thread: Re: Help with Speeding up a For loop