Re: ParallelTable slows down computation
- To: mathgroup at smc.vnet.net
- Subject: [mg105712] Re: [mg105692] ParallelTable slows down computation
- From: Patrick Scheibe <pscheibe at trm.uni-leipzig.de>
- Date: Wed, 16 Dec 2009 06:18:59 -0500 (EST)
- References: <200912151233.HAA15506@smc.vnet.net>
Hi,

Here (Ubuntu 64-bit, 4 cores, Mathematica 7.0.1) the timing is 53 sec for
the serial evaluation and 22 sec for the parallel computation. If I try to
minimize the data-transfer overhead that arises when the kernels return
their results, the speed-up becomes more visible. Note the changed
stepsize and the semicolon:

    ClearSystemCache[];
    AbsoluteTiming[
     Table[Integrate[
        Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}];,
      {nn, 0, 15}, {mm, 0, 15, 1/2}];]

needs 145.314899 seconds.

    AbsoluteTiming[
     ParallelTable[
       Table[Integrate[
          Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}];,
        {nn, 0, 15}], {mm, 0, 15, 1/2}];]

needs 52.036152 seconds. Each timing was taken in a fresh Mathematica
session.

Cheers
Patrick

On Tue, 2009-12-15 at 07:33 -0500, K wrote:
> Hi,
>
> I was trying to evaluate definite integrals of different product
> combinations of trigonometric functions like so:
>
> ClearSystemCache[];
> AbsoluteTiming[
>  Table[Integrate[
>     Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
>   {nn, 0, 15}, {mm, 0, 15}];]
>
> I included ClearSystemCache[] to get comparable results for successive
> runs. Output of the actual matrix result is suppressed. On my dual-core
> AMD, I got this result from Mathematica 7.0.1 (Linux x86 64-bit) for the
> above command:
>
> {65.240614, Null}
>
> Now I thought that this computation could be almost perfectly
> parallelized by having, e.g., nn = 0, ..., 7 evaluated by one kernel and
> nn = 8, ..., 15 by the other, and typed:
>
> ParallelEvaluate[ClearSystemCache[]];
> AbsoluteTiming[
>  ParallelTable[
>   Integrate[
>     Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
>   {nn, 0, 15}, {mm, 0, 15}, Method -> "CoarsestGrained"];]
>
> The result, however, was disappointing:
>
> {76.993888, Null}
>
> By the way, Kernels[] returns:
>
> {KernelObject[1, local], KernelObject[2, local]}
>
> so it seems that the parallel command should in fact have been evaluated
> by two kernels. With Method -> "CoarsestGrained", I hoped to obtain the
> data splitting I mentioned above. If I do the splitting and combining
> myself, it gets even a bit worse:
>
> ParallelEvaluate[ClearSystemCache[]];
> AbsoluteTiming[
>  job1 = ParallelSubmit[Table[Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph],
>     {ph, Pi/2, Pi}], {nn, 0, 7}, {mm, 0, 15}]];
>  job2 = ParallelSubmit[Table[Integrate[Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph],
>     {ph, Pi/2, Pi}], {nn, 8, 15}, {mm, 0, 15}]];
>  {res1, res2} = WaitAll[{job1, job2}];
>  Flatten[{{res1}, {res2}}, 2];]
>
> The result is:
>
> {78.669442, Null}
>
> I can't believe that the splitting and combining overhead on a single
> machine (no network involved here) can eat up all the gain from
> distributing the actual workload to two kernels. Does anyone have an
> idea what is going wrong here?
>
> Thanks,
> K.
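For readers who want the actual integral values rather than just the
timing of the kernels, a variant along the same lines (a sketch, not code
from the original post; kernel count and timings will differ per machine)
is to parallelize over a single index only, so each subkernel ships back
whole rows in one batch, and to check with ByteCount how much data has to
travel back to the master kernel:

    (* Sketch only, not from the original post: keep the results, but
       parallelize over nn alone so each subkernel returns complete rows. *)
    LaunchKernels[];
    ParallelEvaluate[ClearSystemCache[]];

    AbsoluteTiming[
     mat = ParallelTable[
        Table[Integrate[
          Sin[ph]*1/(2 Pi)*Sin[nn*ph]*Cos[mm*ph], {ph, Pi/2, Pi}],
         {mm, 0, 15}],
        {nn, 0, 15}, Method -> "CoarsestGrained"];
     ]

    Dimensions[mat]  (* {16, 16} matrix of symbolic integrals *)
    ByteCount[mat]   (* rough size of the data the subkernels return *)

The ByteCount of the symbolic result matrix gives a rough idea of why
suppressing the returned data, as in the timing test above, makes the
parallel speed-up more visible.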
- References:
- ParallelTable slows down computation
- From: K <kgspga@googlemail.com>