MathGroup Archive 2011

Re: help to make code run faster (mathematica v8.01)

  • To: mathgroup at
  • Subject: [mg121331] Re: help to make code run faster (mathematica v8.01)
  • From: Patrick Scheibe <pscheibe at>
  • Date: Sun, 11 Sep 2011 07:28:02 -0400 (EDT)


- You should inspect your CompiledFunction with CompilePrint from the
CompiledFunctionTools package. You would have seen that your call to
epanKern is not inlined and instead creates a callback to the Mathematica
kernel. This slows everything down and is not wanted in this case.

- You could have used the Listable attribute and parallelization within
your compiled function. With these, your val1 function is applied
automatically to each element of a list if you pass a list as a parameter.
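
To make the first point concrete, here is a minimal sketch of how the check could look. The name val1Inlined is hypothetical and only used for this illustration, and I am assuming that a kernel callback shows up as a MainEvaluate line in the listing, which is what I would expect to see.

```mathematica
Needs["CompiledFunctionTools`"]

(* the two compiled functions from the original post *)
epanKern = Compile[{u}, If[Abs[u] < 1, (3/4)*(1 - u^2), 0]];
val1 = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
    {dj, _Real}, {band, _Real}},
   rk rj epanKern[(dk - dj)/band]];

(* hypothetical variant: same body, but external definitions may be inlined *)
val1Inlined = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
    {dj, _Real}, {band, _Real}},
   rk rj epanKern[(dk - dj)/band],
   CompilationOptions -> {"InlineExternalDefinitions" -> True}];

(* CompilePrint gives a readable listing of the compiled instructions *)
listing = CompilePrint[val1];
listingInlined = CompilePrint[val1Inlined];

(* assumption: lines containing MainEvaluate indicate a call back to the kernel *)
StringFreeQ[listing, "MainEvaluate"]
StringFreeQ[listingInlined, "MainEvaluate"]
```

If the first test gives False and the second True, the call to epanKern has been moved out of the main kernel and into compiled code.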

With this in mind, you come up with something like

cfuncPat = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
    {dj, _Real}, {band, _Real}},
   (* Epanechnikov kernel written out inline, so no kernel callback is needed *)
   With[{u = (dk - dj)/band},
    rk*rj*If[Abs[u] < 1, (3/4)*(1 - u^2), 0]],
   CompilationOptions -> {"InlineExternalDefinitions" -> True},
   RuntimeAttributes -> Listable, Parallelization -> True];

compared to your version

epanKern = Compile[{u}, If[Abs[u] < 1, (3/4)*(1 - u^2), 0]]; 
cfuncKris = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real}, 
         {dj, _Real}, {band, _Real}}, 
       rk*rj*epanKern[(dk - dj)/band]]; 

Now test the speed with one million random values per argument:

vals = Array[RandomReal[1, 1000000] &, {5}];
valsTransposed = Transpose[vals];

In[36]:= Total[cfuncPat @@ vals] // AbsoluteTiming

Out[36]= {0.098913, 93552.8}

and yours

In[37]:= Total[cfuncKris @@@ valsTransposed] // AbsoluteTiming

Out[37]= {1.440017, 93552.8}

which takes about 15 times longer. The big difference comes from
the fact that my function receives the complete lists of values, is called
only once, and is parallelized at the compiled-code level. In
your case, the function really is called one million times.
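
The difference between one listable call and many single calls can be seen on a toy example. This is only a sketch; the function sq and the list xs are made up for the illustration and are not part of your code.

```mathematica
(* hypothetical toy function: squares a real number, threads over lists *)
sq = Compile[{{x, _Real}}, x^2, RuntimeAttributes -> {Listable}];

xs = {1., 2., 3.};

(* one call: the threading over xs happens inside the compiled code *)
oneCall = sq[xs];

(* three separate calls from the kernel, one per element *)
manyCalls = sq /@ xs;

oneCall == manyCalls
```

Both forms give the same list. The only difference is where the loop runs, and for a million elements that is what the timings above measure.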

Now you could rewrite your tStat function to use (at least partially)
the Listable attribute by replacing the inner sum with a single vectorized
call to your compiled function:

pattStat[data_, band_, residuals_, leg_] := 
  Module[{k, res},
   (* the inner sum over j becomes one listable call on the slice k+1 ;; leg *)
   res = ParallelSum[
     Total@cfuncPat[residuals[[k]], residuals[[k + 1 ;; leg]], 
       data[[k]], data[[k + 1 ;; leg]], band], {k, 1, leg}];
   2 res];
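
If you want to convince yourself that the vectorized inner sum gives the same number as the explicit double sum, a small check along these lines should do. It is a sketch under some assumptions: the test data, the seed, and the names n, direct and vectorized are made up, it uses Sum instead of ParallelSum to stay simple, and the outer index stops at n - 1 because the term for k = n is an empty sum anyway.

```mathematica
cfuncPat = Compile[{{rk, _Real}, {rj, _Real}, {dk, _Real},
    {dj, _Real}, {band, _Real}},
   With[{u = (dk - dj)/band},
    rk*rj*If[Abs[u] < 1, (3/4)*(1 - u^2), 0]],
   RuntimeAttributes -> {Listable}, Parallelization -> True];

(* hypothetical small test data *)
SeedRandom[1];
n = 50;
data = RandomReal[1, n];
residuals = RandomReal[{-1, 1}, n];
band = 0.3;

(* explicit double sum over all pairs k < j, no compiled code involved *)
direct = 2 Sum[
    residuals[[k]] residuals[[j]]*
     With[{u = (data[[k]] - data[[j]])/band},
      If[Abs[u] < 1, (3/4)*(1 - u^2), 0]],
    {k, 1, n}, {j, k + 1, n}];

(* inner sum replaced by one listable call per k *)
vectorized = 2 Sum[
    Total@cfuncPat[residuals[[k]], residuals[[k + 1 ;; n]],
      data[[k]], data[[k + 1 ;; n]], band],
    {k, 1, n - 1}];

Abs[direct - vectorized] < 10^-9
```

The two results should agree up to floating-point rounding, since only the order of the additions differs.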

With 10000 values for your initial parameters, your tStat function needs
about 38.5 seconds here. Mine finishes in 3.4 seconds.

Hope this gives you some insight.


On Sat, 2011-09-10 at 07:28 -0400, kristoph wrote:
> Hi
> I'm running out of options in order to make my code run faster. I do
> appreciate any help. I programmed a function named tStat that
> basically sums over two compiled functions. Although, I run the
> function parallel it is still rather slow. Thanks in advance for help.
> Here is what I mean:
> (*the following 5 lines is just random input data to test the
> function*)
> resp=RandomReal[10,250];
> reg=RandomReal[1,250];
> des=DesignMatrix[Table[{reg[[i]],resp[[i]]},{i,1,Length[reg]}],x,x];
> fit=LinearModelFit[{des,resp}];
> h=1.06 StandardDeviation[reg] Length[reg]^(-1/5);
> (*the two compiled functions which are inputs for the function tStat*)
> epanKern=Compile[{u},
> If[Abs[u]<1,3/4 (1-u^2),0]
> ];
> val1=Compile[{{rk,_Real},{rj,_Real},{dk,_Real},{dj,_Real},
> {band,_Real}},
> rk rj epanKern[(dk-dj)/band]];
> (*the following function is rather slow*)
> tStat[data_,band_,residuals_,leg_]:=Module[{k,j,res=0,var=0},
> res=ParallelSum[val1[residuals[[k]],residuals[[j]],data[[k]],data[[j]],band],
> {k,1,leg},{j,k+1,leg}];
> 2 res
> ];
> (*executing the function*)
> tStat[reg,h,fit["FitResiduals"],Length[reg]]//AbsoluteTiming
