MathGroup Archive: December 2009 [00584]

Re: Parallel speedup/slowdown

To: mathgroup at smc.vnet.net
Subject: [mg106021] Re: Parallel speedup/slowdown
From: Vince Virgilio <blueschi at gmail.com>
Date: Wed, 30 Dec 2009 04:16:12 -0500 (EST)
References: <hhc781$2ms$1@smc.vnet.net>

On Dec 29, 1:22 am, Thomas Münch <thomas.mue... at gmail.com> wrote:
> I found the solution: I need to define data, header, and the other
> variables as shared variables (SetSharedFunction[data, header,...]). But
> this leads to a terrible slow-down. The very same code now becomes much
> slower, slower even than the non-parallel version.

SNIP

> MapIndexed:
> Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]
>
> Is there an easier, less overhead-burdened way of making the loaded and
> stored (meta-)data available to the master kernel for data analysis? And
> possibly also to the sub-kernels for parallelized analysis?


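I don't know if it will help, but if you can live without
downvalues, try replacing the shared variables with return-by-value:
each subkernel hands its loaded values back as the result of the
parallel call instead of writing them to shared symbols.
Something like this untested code: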

subResults =
  Parallelize@
    MapIndexed[
      Function[
        Load[#1, #2[[1]]];
        Module[{in = Through@{data, header, meta2, meta3, meta4, meta5}@#2[[1]]},
          {in, subprocess@in}]],
      listOfFilenames];

masterResults = process /@ subResults[[All, 1]];

where meta2 through meta5 stand for your other metadata, and
subprocess and process are functions you define.
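
If you later want indexed definitions such as data[i] back on the
master kernel, one option is to rebuild them from the returned values.
A minimal, untested sketch, assuming a hypothetical loadOne that returns
its values instead of setting downvalues (loadOne, files, and the file
names are placeholders of mine, not part of your code):

```mathematica
(* Hypothetical stand-in for Load: returns {data, header} as values *)
loadOne[file_String, i_Integer] :=
  {"contents of " <> file, "header " <> ToString[i]};

(* Hypothetical file names *)
files = {"a.dat", "b.dat", "c.dat"};

(* Make the definitions known to the subkernels *)
DistributeDefinitions[loadOne, files];

(* Each subkernel returns its values; nothing is shared *)
subResults =
  ParallelTable[loadOne[files[[i]], i], {i, Length[files]}];

(* Rebuild the indexed definitions on the master kernel only *)
Do[{data[i], header[i]} = subResults[[i]], {i, Length[subResults]}];
```

The results cross the kernel boundary once, as the return value of
ParallelTable, rather than on every access as with shared symbols.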


Also if you're not already using the WDX data format, you might find
it complements this nicely. It saves values, not symbols. See
ref/format/WDX.
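
A rough sketch of a WDX round trip, assuming you just want to store
a list of values and read it back (the variable names and the file
location under $TemporaryDirectory are only for illustration):

```mathematica
(* Example values to store; any expression should work *)
values = {{1, 2, 3}, "header text"};

(* Illustrative file location *)
file = FileNameJoin[{$TemporaryDirectory, "example.wdx"}];

(* Write the values, then read them back as an expression *)
Export[file, values, "WDX"];
restored = Import[file, "WDX"];

(* Compare the restored expression with the original *)
restored === values
```

Because the file holds a plain expression rather than symbol
definitions, whatever Import returns can be assigned to any name you
like.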


Vince Virgilio
