MathGroup Archive 2010

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Speed Up of Calculations on Large Lists

  • To: mathgroup at smc.vnet.net
  • Subject: [mg108939] Re: Speed Up of Calculations on Large Lists
  • From: sheaven <karg.stefan at googlemail.com>
  • Date: Wed, 7 Apr 2010 07:26:47 -0400 (EDT)
  • References: <hpcjh0$mva$1@smc.vnet.net>

Thanks everyone for the input!

Ray is absolutely correct (Compile does not speed up the calculation
based on my function movAverageOwn2FC). I used Compile with a function
similar to variation#2 from Zach with Partition where Compile resulted
in a speed up of c. 50%. Therefore, I thought that Compile is always
faster.
Obviously, this is a mistake :-)

I did some testing and here is the outcome:

1. Moving average functions


1.1 Built-in Mathematica function

In[1]:= maMathematica =
 Compile[{{inputData, _Real,
    1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
  Module[{data, size, i},
   size = Length[inputData];
   Transpose[
    Join[{inputData},
     PadRight[MovingAverage[inputData, #], size] & /@
      Table[x, {x, start, end, incr}]]]
   ]
  ]


1.2 Function based on Span

In[2]:= maSpanF[dataInput_, days_, length_] :=
 N[Mean[dataInput[[1 + # ;; days + #]]]] & /@
  Range[0, length - days, 1]

In[3]:= maSpan[dataInput_, start_, end_, incr_] := Module[{length},
  length = Length[dataInput];
  Transpose[
   Join[{dataInput},
    PadRight[maSpanF[dataInput, #, length], length] & /@
     Range[start, end, incr]]]
  ]


1.3 Function based on Convolve

In[4]:= maConvolveF[data_, windowLen_] :=
 Module[{ker = Table[1, {windowLen}]/windowLen},
  ListConvolve[ker, data]]

In[5]:= maConvolve =
 Compile[{{dataInput, _Real,
    1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
  Module[{length},
   length = Length[dataInput];
   Transpose[
    Join[{dataInput},
     PadRight[maConvolveF[dataInput, #], length] & /@
      Range[start, end, incr]]]
   ]
  ]


1.4 Function based on Drop

In[6]:= maDropF =
  Function[{vData, days}, With[{vAcc = Prepend[Accumulate@vData, 0.]},
    Developer`ToPackedArray[(Drop[vAcc, days] - Drop[vAcc, -days])/
      days, Real]]];

In[7]:= maDrop[dataInput_, start_, end_, incr_] := Module[{length},
  length = Length[dataInput];
  Transpose[
   Join[{dataInput},
    PadRight[maDropF[dataInput, #], length] & /@
     Range[start, end, incr]]]
  ]


2. Dataset

data = 100 + Accumulate[RandomReal[{-1, 1}, {10000}]];


3. Test for equality

In[15]:= maMathematica[data, 30, 250, 10] ==
 maConvolve[data, 30, 250, 10] == maSpan[data, 30, 250, 10] ==
 maDrop[data, 30, 250, 10]

Out[15]= False

This is very strange to me. Equality is only given based on Precision
of 8:

In[16]:= SetPrecision[maMathematica[data, 30, 250, 10], 8] ==
 SetPrecision[maConvolve[data, 30, 250, 10], 8] ==
 SetPrecision[maSpan[data, 30, 250, 10], 8] ==
 SetPrecision[maDrop[data, 30, 250, 10], 8]

Out[16]= True

Any ideas why this is happening? Numbers of the different functions
start to differ at the 10th decimal. My understanding was that
Mathematica was only testing equality up to the precision it knows to
be correct!?


4. Performance

In[17]:= AbsoluteTiming[
 Table[maMathematica[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[17]= {0.8230823, Null}

In[18]:= AbsoluteTiming[
 Table[maSpan[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[18]= {6.0676067, Null}

In[21]:= AbsoluteTiming[
 Table[maDrop[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[21]= {0.4480448, Null}

In[22]:= AbsoluteTiming[
 Table[maConvolve[data, 30, 250, 10], {n, 1, 10, 1}];]

Out[22]= {0.7780778, Null}

In essence:
Span is by far the slowest (12 times slower than Drop)
Built-in function is slower than functions based on Convolve and Drop
(2 times slower than Drop)
Convolve is slightly faster than the built-in function but is slower
than Drop (2 times slower than Drop)
Drop is by far the fastest function

>From a performance standpoint, it seems necessary to think a lot about
different algorithms for doing the same thing=85
Is there any good reference you can recommend giving some best
practices for dealing with large lists?

For those being interest I also prepared functions for the coefficient
of variation (CV) based on the variance of the sample (many thanks to
Raffy for his proposal). The impact of the algorithm on the
performance of the function is more or less the same compared to the
moving average.

cvDropF =
  Function[{vData, days},
   With[{v1 = Prepend[Accumulate[vData], 0.],
     v2 = Prepend[Accumulate[vData^2], 0.]},
    Developer`ToPackedArray[
     Sqrt[(Drop[v2, days] - Drop[v2, -days])/(days -
           1) - ((Drop[v1, days] - Drop[v1, -days])/
            days)^2*(days/(days - 1))]/(Sqrt[days]*
        maDropF[vData, days]), Real]]];

cvDrop[dataInput_, start_, end_, incr_] := Module[{length},
  length = Length[dataInput];
  Transpose[
   Join[{dataInput},
    PadRight[cvDropF[dataInput, #], length] & /@
     Range[start, end, incr]]]
  ]

cvSpanF[dataInput_, days_] := Module[{},
  Sqrt[Variance[dataInput[[1 + # ;; days + #]]] & /@
     Range[0, Length[dataInput] - days, 1]]/(Sqrt[days]*
     maSpanF[dataInput, days, Length[dataInput]])
  ]

cvSpan[dataInput_, start_, end_, incr_] := Module[{length},
  length = Length[dataInput];
  Transpose[
   Join[{dataInput},
    PadRight[cvSpanF[dataInput, #], length] & /@
     Range[start, end, incr]]]
  ]

cvConvolveF[data_, days_] :=
 Module[{ker1 = Table[1, {days}]/(days - 1),
   ker2 = Table[1, {days}]/(days), conv1, conv2},
  conv1 = ListConvolve[ker1, data^2];
  conv2 = (ListConvolve[ker2, data]^2)*(days/(days - 1));
  Sqrt[conv1 - conv2]/(Sqrt[days]*maConvolveF[data, days])
  ]

cvConvolve =
 Compile[{{dataInput, _Real,
    1}, {start, _Integer}, {end, _Integer}, {incr, _Integer}},
  Module[{length},
   length = Length[dataInput];
   Transpose[
    Join[{dataInput},
     PadRight[cvConvolveF[dataInput, #], length] & /@
      Range[start, end, incr]]]
   ]
  ]

@Raffy:
I think there is an error in your moving average function. It should
read like this (only for a single period):
mv =
  Function[{vData, days}, With[
    {v1 = Prepend[Accumulate[vData], 0.],
     v2 = Prepend[Accumulate[vData^2], 0.]},
    Developer`ToPackedArray[
     Sqrt[(Drop[v2, days] - Drop[v2, -days])/(days -
           1) - ((Drop[v1, days] - Drop[v1, -days])/
            days)^2*(days/(days - 1))]
, Real]]];




  • Prev by Date: Re: Mathematica Programming
  • Next by Date: Re: Pattern to match a list of non-negative integers
  • Previous by thread: Re: Speed Up of Calculations on Large Lists
  • Next by thread: Re: Speed Up of Calculations on Large Lists