MathGroup Archive: January 2011 [00414]


Re: Parallelize & Functions That Remember Values They Have Found

To: mathgroup at smc.vnet.net
Subject: [mg115516] Re: Parallelize & Functions That Remember Values They Have Found
From: thomas <thomas.muench at gmail.com>
Date: Thu, 13 Jan 2011 03:27:22 -0500 (EST)
Reply-to: comp.soft-sys.math.mathematica at googlegroups.com

Dear Guido,

I have faced a similar problem recently. As a way around this, I collected the definitions known to the remote kernels in the following way:

f[n_] := f[n] = Prime[n]
DistributeDefinitions[f];
ParallelTable[f[n], {n, 500000}];(*now every f[n] is known on one of the remote kernels*)
DownValues[f]=Flatten[ParallelEvaluate[DownValues[f]]];(*now all f's are known centrally*)
result = Table[f[n], {n, 500000}];

Collecting the data this way can take quite some time and eat up the advantage you gain from parallelization, so it is only worth doing if your real code gains enough speed from parallel evaluation. The best thing is to experiment with that!
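
One way to run that experiment is to time each stage on its own. The following is only a sketch, assuming the same f and the same range as above; the names tFill, tCollect and tLookup are made up for illustration, and the numbers you get will depend on your machine and on how many kernels are running.

```mathematica
(* Sketch only: tFill, tCollect and tLookup are illustrative names *)
ClearAll[f];  (* start from a fresh definition on the main kernel *)
f[n_] := f[n] = Prime[n]
DistributeDefinitions[f];

(* Stage 1: fill the caches on the remote kernels *)
tFill = First[AbsoluteTiming[ParallelTable[f[n], {n, 500000}];]];

(* Stage 2: copy the cached definitions back to the main kernel *)
tCollect = First[AbsoluteTiming[
    DownValues[f] = Flatten[ParallelEvaluate[DownValues[f]]];]];

(* Stage 3: look the values up on the main kernel *)
tLookup = First[AbsoluteTiming[result = Table[f[n], {n, 500000}];]];

{tFill, tCollect, tLookup}
```

If tCollect comes out comparable to or larger than what the parallel run saves you, the collection step is probably not worth it for that particular function.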

Even though it works, it seems quite cumbersome to me. I feel that there should be a better way.

thomas

On Wednesday, January 12, 2011 10:08:44 AM UTC+1, Guido Walter Pettinari wrote:
> Dear group,
> 
> I am starting to discover the magic behind Parallelize and
> ParallelTable, but I still have got many problems.  The latest one
> occurred when I tried to parallelize a function that is supposed to
> store its values, i.e. those defined as f[x_] := f[x] = .....
> 
> You can reproduce my problem by running the following snippet twice:
> 
> f[n_] := f[n] = Prime[n]
> DistributeDefinitions[f];
> result = ParallelTable[f[n], {n, 500000}] // AbsoluteTiming;
> elapsed = result[[1]]
> 
> On my machine, the first execution takes 2 seconds.  Since I defined f
> as f[x_] := f[x] = ..., I expect the second execution to take much less than
> that, but it actually takes around 1.8s.  The third one takes
> something less than that (say 1.4s), and so on.  After many
> executions, the execution time stabilizes to 0.6 seconds.
> 
> Incidentally, 0.6 seconds is the time that a normal Table takes (on
> the second execution) to run the same code:
> 
> Exit[]
> f[n_] := f[n] = Prime[n]
> result = Table[f[n], {n, 500000}] // AbsoluteTiming;
> elapsed = result[[1]]
> 
> It looks like my 4 kernels are storing the downvalues of f[x]
> separately, so that each of them stores only a (random) quarter of the
> f-values every time the code is run.  When all of them have all of the
> 500.000 f-values, which happens after many executions, the execution
> time finally reaches 0.6s.
> 
> Is there a way to make all the f-values stored by the 4 kernels
> available?  Maybe a function that "collapses" all the information
> gathered by the kernels into the main kernel, i.e. a
> DeDistributeDefinitions function?  Or maybe a way to access the memory
> of all 4 kernels?  I tried to SetSharedFunction on f[x], but it just
> made the calculation extremely long.
> 
> I will be grateful for any suggestion.
> 
> Thank you for your attention,
> 
> Guido W. Pettinari
