Re: Speed Up of Calculations on Large Lists
- To: mathgroup at smc.vnet.net
- Subject: [mg108872] Re: Speed Up of Calculations on Large Lists
- From: Raffy <adraffy at gmail.com>
- Date: Mon, 5 Apr 2010 08:00:57 -0400 (EDT)
- References: <hp1ua5$65k$1@smc.vnet.net> <hp9u4i$mcp$1@smc.vnet.net>
On Apr 4, 4:45 am, Ray Koopman <koop... at sfu.ca> wrote:
> Your compiled movAverageC takes 25% more time than the uncompiled
>
> movAv[data_, start_, end_, incr_] := Transpose@PadRight@Join[{data},
> Table[MovingAverage[data, r], {r, start, end, incr}]]
>
> under your test conditions.
>
> On Apr 1, 3:59 am, sheaven <shea... at gmx.de> wrote:
>
> > Hello everyone!
> >
> > I am new to Mathematica and am trying to get an understanding of its
> > power. I plan to use Mathematica mainly for financial data analysis
> > (large lists...).
> >
> > Currently, I am trying to optimize calculation time for calculations
> > based on some sample data. I started with a moving average of
> > share prices, because Mathematica already has a built-in moving
> > average function for benchmarking.
> >
> > I know that the built-in functions are always more efficient than any
> > user-built function. Unfortunately, I will have to create functions
> > that are not built in (e.g. something like "moving variance") in the
> > future.
> >
> > I have tried numerous ways to calculate the moving average as
> > efficiently as possible. So far, I found that a function based on
> > Span (or List[[x;;y]]) is most efficient. Below are my test results.
> > Unfortunately, my UDF is still more than 5x slower than the built-in
> > function.
> >
> > Do you have any ideas to further speed up the function? I am already
> > using Compile and Parallelize.
> >
> > This is what I got so far:
> >
> > 1. Functions for moving average:
> >
> > 1.1. Moving average based on built-in function:
> >
> > (*Function calcs moving average based on built in function for
> > specified number of days, e.g. 30 days to 250 days in steps of 10*)
> > movAverageC = Compile[{{inputData, _Real, 1}, {start, _Integer}, {end,
> > _Integer}, {incr, _Integer}}, Module[{data, size, i},
> > size = Length[inputData];
> > Transpose[Join[{inputData}, PadRight[MovingAverage[inputData, #],
> > size] & /@ Table[x, {x, start, end, incr}]]]
> > ]
> > ]
> >
> > 1.2. User-defined function based on Span:
> >
> > (*UDF for moving average based on Span*)
> > movAverageOwn2FC = Compile[{{dataInput, _Real, 1}, {days, _Integer},
> > {length, _Integer}},
> > N[Mean[dataInput[[1 + # ;; days + #]]]] & /@ Range[0, length - days,
> > 1]
> > ]
> >
> > (*Function calcs moving average based on UDF "movAverageOwn2FC" for
> > specified number of days, e.g. 30 days to 250 days in steps of 10*)
> > movAverageOwn2C = Compile[{{dataInput, _Real, 1}, {start, _Integer},
> > {end, _Integer}, {incr, _Integer}}, Module[{length},
> > length = Length[dataInput];
> > Transpose[Join[{dataInput}, PadRight[movAverageOwn2FC[dataInput, #,
> > length], length] & /@ Range[start, end, incr]]]
> > ]
> > ]
> >
> > 2. Create sample data:
> >
> > data = 100 + # & /@ Accumulate[RandomReal[{-1, 1}, {10000}]];
> >
> > 3. Test if functions yield same results:
> >
> > Test1 = movAverageC[data, 30, 250, 10]; (*Moving average for 30 days
> > to 250 days in steps of 10*)
> >
> > Test2 = movAverageOwn2C[data, 30, 250, 10]; (*Moving average for 30
> > days to 250 days in steps of 10*)
> >
> > Test1 == Test2
> > Out = True
> >
> > 4. Performance testing (single core):
> >
> > AbsoluteTiming[Table[movAverageC[data, 30, 250, 10], {n, 1, 20, 1}];]
> > (*Repeat function 20x for testing purposes*)
> > Out = {1.3030000, Null}
> >
> > AbsoluteTiming[Table[movAverageOwn2C[data, 30, 250, 10], {n, 1, 20,
> > 1}];] (*Repeat function 20x for testing purposes*)
> > Out = {11.4260000, Null}
> >
> > => Result UDF 9x slower
> >
> > 5. Performance testing (multi core):
> >
> > LaunchKernels[]
> >
> > Out = {KernelObject[1, "local"], KernelObject[2, "local"]}
> >
> > DistributeDefinitions[data, movAverageOwn2C, movAverageOwn2FC,
> > movAverageC]
> >
> > AbsoluteTiming[Parallelize[Table[movAverageC[data, 30, 250, 10], {n,
> > 1, 20, 1}]];]
> > Out = {1.3200000, Null}
> >
> > AbsoluteTiming[Parallelize[Table[movAverageOwn2C[data, 30, 250, 10],
> > {n, 1, 20, 1}]];]
> > Out = {6.7170000, Null}
> >
> > => Result UDF 5x slower
> > Very strange that the built-in function does not get faster with
> > Parallelize
> >
> > I would very much appreciate any input on how to decrease calculation
> > time based on the user-defined function.
> >
> > Many thanks
> > Stefan

ma = Function[{vData, vRange},
   With[
    {vAcc = Prepend[Accumulate@Developer`ToPackedArray[vData, Real], 0.]},
    Transpose@
     Developer`ToPackedArray[
      Prepend[
       Table[
        PadRight[(Drop[vAcc, n] - Drop[vAcc, -n])/n, Length[vData], 0.],
        {n, vRange}],
       vData],
      Real]
    ]];

ma[data, Range[30, 250, 10]]

This is a 4-5x speed up over movAverageC.

mv = Function[{vData, vRange},
   With[
    {v1 = Prepend[Accumulate[vData], 0.],
     v2 = Prepend[Accumulate[vData^2], 0.]},
    Transpose@
     Developer`ToPackedArray[
      Prepend[
       Table[
        PadRight[
         (Drop[v2, n] - Drop[v2, -n])/n - ((Drop[v1, n] - Drop[v1, -n])/n)^2,
         Length[vData], 0.],
        {n, vRange}],
       vData],
      Real]
    ]];

This would be a fast moving variance.
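
Both functions rest on the same idea: with prefix sums that start at 0., the sum over any window of length n is the difference of two prefix sums n positions apart, so each window costs one subtraction instead of a fresh Mean. A minimal sketch of a sanity check on toy data follows. The names small, acc, acc2, winMean and popVar are hypothetical and not from the thread. It also suggests that mv as written divides by n (population variance), whereas the built-in Variance divides by n - 1.

```mathematica
(* Hypothetical toy data, not from the thread *)
small = {1., 2., 3., 4., 5.};
n = 2;

(* Prefix sums with a leading 0., as in ma and mv *)
acc  = Prepend[Accumulate[small], 0.];
acc2 = Prepend[Accumulate[small^2], 0.];

(* Window mean: difference of prefix sums n apart, divided by n *)
winMean = (Drop[acc, n] - Drop[acc, -n])/n;
winMean == MovingAverage[small, n]   (* should give True *)

(* Window variance as computed in mv: mean of squares minus squared mean *)
popVar = (Drop[acc2, n] - Drop[acc2, -n])/n - winMean^2;

(* Variance uses the n - 1 divisor, so rescale before comparing *)
Max[Abs[popVar*n/(n - 1) - (Variance /@ Partition[small, n, 1])]] < 10.^-12
```

If results compatible with Variance are needed, multiplying each column of mv's output by n/(n - 1) should be enough. One caveat: mean of squares minus squared mean can lose precision when the variance is small relative to the mean, so it may be worth spot-checking a few windows against Variance on real price data.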