MathGroup Archive 2010

Re: Speed Up of Calculations on Large Lists

  • To: mathgroup at smc.vnet.net
  • Subject: [mg108857] Re: Speed Up of Calculations on Large Lists
  • From: Ray Koopman <koopman at sfu.ca>
  • Date: Sun, 4 Apr 2010 07:45:33 -0400 (EDT)
  • References: <hp1ua5$65k$1@smc.vnet.net>

Your compiled movAverageC takes 25% more time than the uncompiled

movAv[data_, start_, end_, incr_] := Transpose@PadRight@Join[{data},
      Table[MovingAverage[data, r], {r, start, end, incr}]]

under your test conditions.
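
For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes the sample data and the parameters (30 to 250 in steps of 10, repeated 20 times) from the quoted post below. The name result is only illustrative, and timings will differ by machine and version.

```mathematica
(* Sample data as in the quoted post: a random walk starting near 100 *)
data = 100 + Accumulate[RandomReal[{-1, 1}, 10000]];

(* Uncompiled version from above *)
movAv[data_, start_, end_, incr_] := Transpose@PadRight@Join[{data},
      Table[MovingAverage[data, r], {r, start, end, incr}]]

(* One column for the raw data plus one per window length (30, 40, ..., 250) *)
result = movAv[data, 30, 250, 10];
Dimensions[result]   (* {10000, 24} *)

(* Time 20 repetitions, as in the quoted test *)
First@AbsoluteTiming[Do[movAv[data, 30, 250, 10], {20}]]
```

Running the same AbsoluteTiming line with movAverageC in place of movAv gives the other half of the comparison.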

On Apr 1, 3:59 am, sheaven <shea... at gmx.de> wrote:
> Hello everyone!
>
> I am new to Mathematica and am trying to get an understanding of its
> power. I plan to use Mathematica mainly for financial data analysis
> (large lists...).
>
> Currently, I am trying to optimize calculation time for calculations
> based on some sample data. I started with a moving average of
> share prices, because Mathematica already has a built in moving
> average function for benchmarking.
>
> I know that the built-in functions are always more efficient than any
> user built function. Unfortunately, I have to create functions not
> built in (e.g. something like "moving variance") in the future.
>
> I have tried numerous ways to calc the moving average as efficiently
> as possible. So far, I found that a function based on Span (or
> List[[x;;y]]) is most efficient. Below are my test results.
> Unfortunately, my UDF is still more than 5x slower than the built in
> function.
>
> Do you have any ideas to further speed up the function? I am already
> using Compile and Parallelize.
>
> This is what I got so far:
>
> 1. Functions for moving average:
>
> 1.1. Moving average based on built in function:
>
> (*Function calcs moving average based on built in function for
> specified number of days, e.g. 30 days to 250 days in steps of 10*)
> movAverageC = Compile[{{inputData, _Real, 1}, {start, _Integer}, {end,
> _Integer}, {incr, _Integer}}, Module[{data, size, i},
>    size = Length[inputData];
>    Transpose[Join[{inputData}, PadRight[MovingAverage[inputData, #],
> size] & /@ Table[x, {x, start, end, incr}]]]
>    ]
>   ]
>
> 1.2. User defined function based on Span:
> (*UDF for moving average based on Span*)
> movAverageOwn2FC = Compile[{{dataInput, _Real, 1}, {days, _Integer},
> {length, _Integer}},
>   N[Mean[dataInput[[1 + # ;; days + #]]]] & /@ Range[0, length - days,
> 1]
> ]
>
> (*Function calcs moving average based on UDF "movAverageOwn2FC" for
> specified number of days, e.g. 30 days to 250 days in steps of 10*)
> movAverageOwn2C = Compile[{{dataInput, _Real, 1}, {start, _Integer},
> {end, _Integer}, {incr, _Integer}}, Module[{length},
>    length = Length[dataInput];
>    Transpose[Join[{dataInput}, PadRight[movAverageOwn2FC[dataInput, #,
> length], length] & /@ Range[start, end, incr]]]
>    ]
>   ]
>
> 2. Create sample data:
> data = 100 + # & /@ Accumulate[RandomReal[{-1, 1}, {10000}]];
>
> 3. Test if functions yield same results:
> Test1 = movAverageC[data, 30, 250, 10]; (*Moving average for 30 days
> to 250 days in steps of 10*)
>
> Test2 = movAverageOwn2C[data, 30, 250, 10]; (*Moving average for 30
> days to 250 days in steps of 10*)
>
> Test1 == Test2
> Out = True
>
> 4. Performance testing (single core):
> AbsoluteTiming[Table[movAverageC[data, 30, 250, 10], {n, 1, 20, 1}];]
> (*Repeat function 20x for testing purposes*)
> Out = {1.3030000, Null}
>
> AbsoluteTiming[Table[movAverageOwn2C[data, 30, 250, 10], {n, 1, 20,
> 1}];] (*Repeat function 20x for testing purposes*)
> Out = {11.4260000, Null}
>
> => Result UDF 9x slower
>
> 5. Performance testing (multi core):
> LaunchKernels[]
>
> Out = {KernelObject[1, "local"], KernelObject[2, "local"]}
>
> DistributeDefinitions[data, movAverageOwn2C, movAverageOwn2FC,
> movAverageC]
>
> AbsoluteTiming[Parallelize[Table[movAverageC[data, 30, 250, 10], {n,
> 1, 20, 1}]];]
> Out = {1.3200000, Null}
>
> AbsoluteTiming[Parallelize[Table[movAverageOwn2C[data, 30, 250, 10],
> {n, 1, 20, 1}]];]
> Out = {6.7170000, Null}
>
> => Result UDF 5x slower
> It is very strange that the built-in function does not get faster
> with Parallelize.
>
> I would very much appreciate any input on how to decrease calculation
> time based on the user defined function.
>
> Many thanks
> Stefan
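
For the "moving variance" mentioned in the quoted post, one possible approach is to build it from the built-in MovingAverage rather than looping over spans. The sketch below is only an illustration: movVar and movVarRef are hypothetical names, not anything from the original post, and the formula relies on the identity variance = mean of squares minus square of mean, rescaled by r/(r - 1) to match the unbiased sample variance that Variance returns.

```mathematica
(* Hypothetical helper: sample variance over each window of r consecutive points,
   computed from two calls to the built-in MovingAverage *)
movVar[data_, r_] :=
  (r/(r - 1)) (MovingAverage[data^2, r] - MovingAverage[data, r]^2)

(* Hypothetical reference version: apply Variance to every window directly *)
movVarRef[data_, r_] := Variance /@ Partition[data, r, 1]

(* Sample data as in the quoted post *)
data = 100 + Accumulate[RandomReal[{-1, 1}, 10000]];

(* Largest absolute difference between the two versions; expected to be
   at the level of floating-point rounding *)
Max[Abs[movVar[data, 30] - movVarRef[data, 30]]]

(* Compare timings of the two versions *)
First@AbsoluteTiming[movVar[data, 30];]
First@AbsoluteTiming[movVarRef[data, 30];]
```

One caveat: subtracting two nearly equal quantities can lose precision when the mean is large compared with the spread inside a window. Subtracting Mean[data] from data before calling movVar leaves the variance unchanged and reduces that risk.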

