Re: Compilation options question
- To: mathgroup at smc.vnet.net
- Subject: [mg115832] Re: Compilation options question
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Sat, 22 Jan 2011 03:21:27 -0500 (EST)
Ramiro wrote:

> Hello,
>
> I have successfully compiled the main parts of my program and it runs
> much faster than before. I do have a question. Consider the following
> 4 examples, all compiled programs but with different options. Example0
> is just plain Compile; example1 has what I thought would be optimal:
> parallelization, CompilationTarget -> "C", and declaration of the
> return types for the external calls, but it takes twice as long!!!
> Why is that?
>
> If I remove the declaration of the return types it goes a bit faster
> than plain Compile even though it has parallelization and
> CompilationTarget -> "C", and if I remove CompilationTarget -> "C"
> it's about the same.
>
> Any insight? Thanks so much to everyone for their help with Compile;
> my MCMC simulations will be much faster now.
>
> Ramiro
>
> p.s. My main program basically multiplies the function in question
> (exampleN) many, many times; that's why I multiply over the same call.
>
> In[170]:=
> example0 =
>   Compile[{{n, _Real, 1}, {a, _Real}, {b, _Real}, {t, _Real, 1}},
>    With[{tn = Total[n]},
>     b^a*Exp[LogGamma[tn + a] - (Total[LogGamma[n + 1]] + LogGamma[a]) +
>        Total[n*Log[t]] - (tn + a)*Log[Total[t] + b]]]];
> Times @@ Table[
>    example0[{1, 1, 1, 1}, 1, 1, {3, 3, 3, 3}], {i, 10000}] // AbsoluteTiming
>
> Out[171]= {0.210959, 6.2372127891421*10^-22811}
>
> In[172]:=
> example1 =
>   Compile[{{n, _Real, 1}, {a, _Real}, {b, _Real}, {t, _Real, 1}},
>    With[{tn = Total[n]},
>     b^a*Exp[LogGamma[tn + a] - (Total[LogGamma[n + 1]] + LogGamma[a]) +
>        Total[n*Log[t]] - (tn + a)*Log[Total[t] + b]]],
>    {{LogGamma[_], _Real}, {Total[_], _Real}},
>    Parallelization -> True, CompilationTarget -> "C"];
> Times @@ Table[
>    example1[{1, 1, 1, 1}, 1, 1, {3, 3, 3, 3}], {i, 10000}] // AbsoluteTiming
>
> Out[173]= {0.414509, 6.2372127890803*10^-22811}
>
> In[174]:=
> example2 =
>   Compile[{{n, _Real, 1}, {a, _Real}, {b, _Real}, {t, _Real, 1}},
>    With[{tn = Total[n]},
>     b^a*Exp[LogGamma[tn + a] - (Total[LogGamma[n + 1]] + LogGamma[a]) +
>        Total[n*Log[t]] - (tn + a)*Log[Total[t] + b]]],
>    Parallelization -> True, CompilationTarget -> "C"];
> Times @@ Table[
>    example2[{1, 1, 1, 1}, 1, 1, {3, 3, 3, 3}], {i, 10000}] // AbsoluteTiming
>
> Out[175]= {0.188601, 6.2372127890803*10^-22811}
>
> In[176]:=
> calculatePoissonsGamma =
>   Compile[{{n, _Real, 1}, {a, _Real}, {b, _Real}, {t, _Real, 1}},
>    With[{tn = Total[n]},
>     b^a*Exp[LogGamma[tn + a] - (Total[LogGamma[n + 1]] + LogGamma[a]) +
>        Total[n*Log[t]] - (tn + a)*Log[Total[t] + b]]],
>    Parallelization -> True];
> Times @@ Table[
>    calculatePoissonsGamma[{1, 1, 1, 1}, 1, 1, {3, 3, 3, 3}], {i, 10000}] // AbsoluteTiming
>
> Out[177]= {0.219753, 6.2372127891421*10^-22811}

You are calling the Mathematica evaluator for the LogGamma evaluations. I would think this call+return is slower when coming from a DLL than when coming from the virtual machine (the byte-code interpreter used by vanilla Compile). I believe this need for evaluator calls for LogGamma (and Gamma) will be addressed in a future release.
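One way to see those evaluator callbacks is to disassemble the compiled function. A minimal sketch, assuming example0 has been defined as in the quoted code above (CompiledFunctionTools` is the standard package that ships with Mathematica):

Needs["CompiledFunctionTools`"]

(* Print the byte code of the plain-Compile version; any instruction that
   reads MainEvaluate marks a callback into the ordinary evaluator, which
   is where the LogGamma calls end up. *)
CompilePrint[example0]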
As for the parallelization not helping, I suspect the issue there is communication speed. If you can do 10^4 computations in roughly 0.22 seconds without parallelization, that gives an idea of how fast the processor intercommunication must be for parallelizing to be useful (significantly less than 2*10^(-5) seconds, that is). While I am no expert, it strikes me that communication speed might not be quite that fast. Hence the worsened rather than improved overall speed. I should emphasize that this is just a guess on my part.

Daniel Lichtblau
Wolfram Research
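A minimal sketch of that back-of-the-envelope figure, using the non-parallel timing of about 0.22 seconds for 10^4 calls quoted above:

In[1]:= perCall = 0.22/10^4  (* rough cost of a single compiled call *)

Out[1]= 0.000022

(* For Parallelization -> True to pay off, the per-call thread dispatch and
   communication overhead would need to stay well below this 2.2*10^-5 s. *)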