Re: help to make code run faster (mathematica v8.01)
- To: mathgroup at smc.vnet.net
- Subject: [mg121331] Re: help to make code run faster (mathematica v8.01)
- From: Patrick Scheibe <pscheibe at trm.uni-leipzig.de>
- Date: Sun, 11 Sep 2011 07:28:02 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <201109101128.HAA02485@smc.vnet.net>
Hi,
- You should inspect your CompiledFunction with CompilePrint from the
CompiledFunctionTools package. You would have seen that your call to
epanKern is not inlined and instead creates a callback to the Mathematica
kernel. This slows the whole computation down and is not wanted here.
- You could have used the Listable attribute and parallelization within
your compiled function. With these, your val1 function is applied
automatically to each element of a list if you pass a list as a parameter.
With this in mind, you come up with something like
cfuncPat = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
{dj, _Real}, {band, _Real}}, With[{u = (dk - dj)/band},
rk*rj*If[Abs[u] < 1, (3/4)*(1 - u^2), 0]],
CompilationOptions -> {"InlineExternalDefinitions" -> True},
RuntimeAttributes -> Listable, Parallelization -> True];
compared to your version
epanKern = Compile[{u}, If[Abs[u] < 1, (3/4)*(1 - u^2), 0]];
cfuncKris = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
{dj, _Real}, {band, _Real}},
rk*rj*epanKern[(dk - dj)/band]];
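As a rough sketch of the CompilePrint check mentioned above (assuming
cfuncPat and cfuncKris are defined as shown): a callback to the kernel
typically appears as a MainEvaluate instruction in the printed listing,
so searching the listing for that string is one way to spot it. I would
expect the first test to give True and the second False, but treat that
as an assumption to verify on your installation.

```mathematica
Needs["CompiledFunctionTools`"]

(* CompilePrint returns the instruction listing as a string *)
CompilePrint[cfuncKris]

(* True suggests the compiled code never calls back into the kernel *)
StringFreeQ[CompilePrint[cfuncPat], "MainEvaluate"]
StringFreeQ[CompilePrint[cfuncKris], "MainEvaluate"]
```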
Now, using one million random values to test the speed:
vals = Array[RandomReal[1, 1000000] &, {5}];
valsTransposed = Transpose[vals];
In[36]:= Total[cfuncPat @@ vals] // AbsoluteTiming
Out[36]= {0.098913, 93552.8}
and yours
In[37]:= Total[cfuncKris @@@ valsTransposed] // AbsoluteTiming
Out[37]= {1.440017, 93552.8}
which takes about 15 times longer. The big difference comes from the
fact that my function receives the complete lists of values, is called
only once, and the parallelization is done at the compiled-code level. In
your case, your function really is called one million times.
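To make the two calling styles concrete, here is a small sketch on a
toy input (the name small and the size 10 are only illustrative, not
part of the benchmark). It checks that one listable call returns the
same values as calling the function once per tuple of arguments:

```mathematica
small = RandomReal[1, {5, 10}];          (* 5 argument lists with 10 values each *)

listed = cfuncPat @@ small;              (* one call, threads over the lists *)
mapped = cfuncPat @@@ Transpose[small];  (* ten separate scalar calls *)

(* should give True: both styles compute the same numbers *)
Max[Abs[listed - mapped]] < 10^-12
```

Only the number of calls into the compiled function differs, which is
where the time goes for one million elements.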
Now you could rewrite your tStat function to use (at least partially)
the Listable attribute by replacing the inner sum with a vector call to
the compiled function:
pattStat[data_, band_, residuals_, leg_] :=
Module[{k, j, res = 0, var = 0},
(* k stops at leg - 1: for k == leg the range k + 1 ;; leg is empty *)
res = ParallelSum[
Total@cfuncPat[residuals[[k]], residuals[[k + 1 ;; leg]],
data[[k]], data[[k + 1 ;; leg]], band], {k, 1, leg - 1}];
2 res];
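The replacement of the inner sum can be checked for a single k with a
sketch like the following. The data, the bandwidth 0.3, and the names
n, d, r, b, k0 are made up for illustration; cfuncPat is the listable
function defined earlier. A scalar argument such as r[[k0]] is reused
for every element of the list arguments.

```mathematica
SeedRandom[1];
n = 20; d = RandomReal[1, n]; r = RandomReal[{-1, 1}, n]; b = 0.3;
k0 = 5;

(* inner sum as in tStat: one scalar call per j *)
inner = Sum[cfuncPat[r[[k0]], r[[j]], d[[k0]], d[[j]], b], {j, k0 + 1, n}];

(* same sum as one vector call followed by Total *)
vector = Total[cfuncPat[r[[k0]], r[[k0 + 1 ;; n]], d[[k0]], d[[k0 + 1 ;; n]], b]];

(* should give True up to floating-point rounding *)
Abs[inner - vector] < 10^-10
```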
With 10000 values for your initial parameters, your tStat function needs
about 38.5 seconds here; mine finishes in 3.4 seconds.
Hope this gives you some insight.
Cheers
Patrick
On Sat, 2011-09-10 at 07:28 -0400, kristoph wrote:
> Hi
>
> I'm running out of options in order to make my code run faster. I do
> appreciate any help. I programmed a function named tStat that
> basically sums over two compiled functions. Although, I run the
> function parallel it is still rather slow. Thanks in advance for help.
> Here is what I mean:
>
> (*the following 5 lines is just random input data to test the
> function*)
>
> resp=RandomReal[10,250];
> reg=RandomReal[1,250];
> des=DesignMatrix[Table[{reg[[i]],resp[[i]]},{i,1,Length[reg]}],x,x];
> fit=LinearModelFit[{des,resp}];
> h=1.06 StandardDeviation[reg] Length[reg]^(-1/5);
>
> (*the two compiled functions which are inputs for the function tStat*)
>
> epanKern=Compile[{u},
> If[Abs[u]<1,3/4 (1-u^2),0]
> ];
>
> val1=Compile[{{rk,_Real},{rj,_Real},{dk,_Real},{dj,_Real},
> {band,_Real}},
> rk rj epanKern[(dk-dj)/band]];
>
> (*the following function is rather slow*)
>
> tStat[data_,band_,residuals_,leg_]:=Module[{k,j,res=0,var=0},
> res=ParallelSum[val1[residuals[[k]],residuals[[j]],data[[k]],data[[j]],band],
> {k,1,leg},{j,k+1,leg}];
> 2 res
> ];
>
> (*executing the function*)
> tStat[reg,h,fit["FitResiduals"],Length[reg]]//AbsoluteTiming
>
- References:
- help to make code run faster (mathematica v8.01)
- From: kristoph <kristophs.post@web.de>