Re: Parallel speedup/slowdown
- To: mathgroup at smc.vnet.net
- Subject: [mg106021] Re: Parallel speedup/slowdown
- From: Vince Virgilio <blueschi at gmail.com>
- Date: Wed, 30 Dec 2009 04:16:12 -0500 (EST)
- References: <hhc781$2ms$1@smc.vnet.net>
On Dec 29, 1:22 am, Thomas Münch <thomas.mue... at gmail.com> wrote:
> I found the solution: I need to define data, header, and the other
> variables as shared variables (SetSharedFunction[data, header,...]). But
> this leads to a terrible slow-down. The very same code now becomes much
> slower, slower even than the non-parallel version.

SNIP

> MapIndexed:
> Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]
>
> Is there an easier, less overhead-burdened way of making the loaded and
> stored (meta-)data available to the master kernel for data analysis? And
> possibly also to the sub-kernels for parallelized analysis?

I don't know if it will help, but if you could live without
downvalues, try replacing shared variables with return-by-value.
Something like this untested code:

    subResults =
      Parallelize@
        MapIndexed[
          Function[
            Load[#1, #2[[1]]];
            Module[{in = Through@{data, header, meta2, meta3, meta4, meta5}@#2[[1]]},
              {in, subprocess@in}]
          ],
          listOfFilenames
        ];

    masterResults = process /@ subResults[[All, 1]];

where metaN contains your other metadata, and you define subprocess
and process.

Also, if you're not already using the WDX data format, you might find it
complements this nicely. It saves values, not symbols. See ref/format/WDX.

Vince Virgilio
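
For anyone who wants something that can be pasted and run, here is a minimal sketch of the return-by-value idea together with a WDX round trip. It is only an illustration under assumptions: loadFile, the file names, and the fake data it returns are hypothetical stand-ins, not the original poster's Load function or files.

```wolfram
(* Hypothetical stand-in for the real loader: it returns the data and the
   header by value instead of assigning them to shared symbols. *)
loadFile[name_String, i_Integer] := {Range[i], "header for " <> name};

(* Placeholder file names, paired with their positions in the list *)
files = {"a.dat", "b.dat", "c.dat"};
pairs = Transpose[{files, Range[Length[files]]}];

(* Make the loader's definition available on the subkernels *)
DistributeDefinitions[loadFile];

(* Each evaluation returns {index, data, header}; nothing is shared *)
results = ParallelMap[Prepend[loadFile @@ #, Last[#]] &, pairs];

(* If downvalues are still wanted, rebuild them on the master kernel only *)
Scan[(data[#[[1]]] = #[[2]]; header[#[[1]]] = #[[3]]) &, results];

(* WDX stores values, so the result list can be saved and read back *)
path = FileNameJoin[{$TemporaryDirectory, "results.wdx"}];
Export[path, results, "WDX"];
reloaded = Import[path, "WDX"];
```

The point of the design is that the subkernels send each result back once, as an ordinary return value, so there should be no per-assignment traffic to the master kernel of the kind that shared symbols cause. Whether that removes the slow-down in the original code would have to be measured.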