Re: ParallelTable slows down computation
- To: mathgroup at smc.vnet.net
- Subject: [mg105712] Re: [mg105692] ParallelTable slows down computation
- From: Patrick Scheibe <pscheibe at trm.uni-leipzig.de>
- Date: Wed, 16 Dec 2009 06:18:59 -0500 (EST)
- References: <200912151233.HAA15506@smc.vnet.net>
Hi,
Here (Ubuntu 64-bit, 4 cores, Mathematica 7.0.1) the timing is 53 seconds
for the serial evaluation and 22 seconds for the parallel computation.
If I minimize the data-transfer overhead that arises when the kernels
return their results, the speed-up becomes more visible.
Note the changed step size and the semicolon:
ClearSystemCache[];
AbsoluteTiming[
 Table[
   Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}];,
   {nn, 0, 15}, {mm, 0, 15, 1/2}];
 ]
needs 145.314899 seconds
AbsoluteTiming[
 ParallelTable[
   Table[
     Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}];,
     {nn, 0, 15}],
   {mm, 0, 15, 1/2}];
 ]
needs 52.036152 seconds. Each timing was taken in a fresh Mathematica session.
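As a quick check (not part of the timings above, and not in the original
exchange), you can let ParallelTable return $KernelID instead of the
integral to see how the mm values are distributed over the subkernels:

ParallelTable[$KernelID, {mm, 0, 15, 1/2}, Method -> "CoarsestGrained"]
(* subkernels are launched automatically on the first parallel call;
   each entry tells which kernel evaluated that mm value *)

With "CoarsestGrained" each kernel should get one contiguous block of mm
values, so the results come back in a few large batches rather than one
transfer per entry.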
Cheers
Patrick
On Tue, 2009-12-15 at 07:33 -0500, K wrote:
> Hi,
>
> I was trying to evaluate definite integrals of different product
> combinations of trigonometric functions like so:
>
> ClearSystemCache[];
> AbsoluteTiming[
>  Table[
>    Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
>    {nn, 0, 15}, {mm, 0, 15}];]
>
> I included ClearSystemCache[] to get comparable results for successive
> runs. Output of the actual matrix result is suppressed. On my dual-core
> AMD, I got this result from Mathematica 7.0.1 (Linux x86 64-bit)
> for the above command:
>
> {65.240614, Null}
>
> Now I thought this computation could be almost perfectly parallelized
> by having, e.g., nn = 0, ..., 7 evaluated by one kernel and
> nn = 8, ..., 15 by the other, so I typed:
>
> ParallelEvaluate[ClearSystemCache[]];
> AbsoluteTiming[
>  ParallelTable[
>    Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
>    {nn, 0, 15}, {mm, 0, 15}, Method -> "CoarsestGrained"];]
>
> The result, however, was disappointing:
>
> {76.993888, Null}
>
> By the way, Kernels[] returns:
>
> {KernelObject[1,local],KernelObject[2,local]}
>
> So it seems that the parallel command should indeed have been evaluated
> by two kernels. With Method -> "CoarsestGrained", I hoped to obtain the
> data splitting I mentioned above. If I do the splitting and combining
> myself, it gets even slightly worse:
>
> ParallelEvaluate[ClearSystemCache[]];
> AbsoluteTiming[
>  job1 = ParallelSubmit[
>    Table[Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph],
>      {ph, Pi/2, Pi}], {nn, 0, 7}, {mm, 0, 15}]];
>  job2 = ParallelSubmit[
>    Table[Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph],
>      {ph, Pi/2, Pi}], {nn, 8, 15}, {mm, 0, 15}]];
>  {res1, res2} = WaitAll[{job1, job2}];
>  Flatten[{{res1}, {res2}}, 2];]
>
> The result is:
>
> {78.669442,Null}
>
> I can't believe that the splitting and combining overhead on a single
> machine (no network involved here) can eat up all the gain from
> distributing the actual workload to two kernels. Does anyone have an
> idea what is going wrong here?
> Thanks,
> K.
>
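To put a rough number on the transfer overhead discussed above, one could
look at the size of the result matrix that the subkernels have to send
back to the main kernel (ByteCount is not part of the original posts,
just a quick check):

tab = Table[
   Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
   {nn, 0, 15}, {mm, 0, 15}];
ByteCount[tab] (* bytes occupied by the full symbolic result matrix *)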
- References:
- ParallelTable slows down computation
- From: K <kgspga@googlemail.com>