Parallel speedup/slowdown
- To: mathgroup at smc.vnet.net
- Subject: [mg105985] Parallel speedup/slowdown
- From: Thomas Münch <thomas.muench at gmail.com>
- Date: Tue, 29 Dec 2009 01:19:29 -0500 (EST)
Dear Mathgroup,

I have been working on parallelizing some of my code to gain speed, but after initial success it turned out that things are not so easy.

I load binary data for analysis with Mathematica. Loading an individual file is quite brisk, but I usually load many files (up to hundreds) at once, or rather sequentially, with the code executing independently for each file. That seemed like a great task for parallel computing, and indeed I achieved an increase in speed that scales roughly linearly with the number of kernels. Then I discovered that my data is loaded by the parallel kernels but is inaccessible to the master kernel. I found a solution for that, but now my parallel version of loading is more than twice as slow as the sequential version.

Here is some background on how I load and store my data. The data for each file is loaded by a function Load["filename", index], which stores the data and some of the associated metadata as downvalues of global variables. For example, the data itself (a real-valued array) is stored as a downvalue of the symbol data, as data[index]=<data-array>; the metadata is stored in header[index]=<header-string>; and there are 4 more such variables for 4 more types of metadata. As "index" I use integers, a different one for each file. Note that the data and metadata are not returned by the function Load; rather, the assignments happen "on the way" while the function is executing. The actual return value of the function is Null.

The function Load (and the subfunctions it calls) are accessible to the parallel kernels, and they all do their thing. But when a parallel kernel assigns a downvalue to data (for example data[1]=<array1>), that assignment appears to be "private" to that sub-kernel. I found the solution: I need to declare data, header, and the other variables as shared (SetSharedFunction[data, header, ...]). But this leads to a terrible slowdown. The very same code now runs much slower, slower even than the non-parallel version.

I can imagine the reason: I guess there is a tremendous overhead associated with making sure that all assignments to these variables are synchronized between all sub-kernels and the master kernel. And I guess that this is generally important and a very good idea. In my case, however, all these parallel calls are completely independent of each other, and such care is unnecessary: each function call uses its unique "index", so parallel kernels can never come into conflict with each other by accessing the same downvalue of, say, data. This is ensured by the way I call the function, using MapIndexed:

    Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]

Is there an easier, less overhead-burdened way of making the loaded and stored (meta-)data available to the master kernel for data analysis? And possibly also to the sub-kernels for parallelized analysis?

Thank you for your help,
thomas
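For concreteness, the setup described above amounts to roughly the following sketch. The body of Load is not shown in the post, so the BinaryReadList call with "Real64", the placeholder header assignment, and the DistributeDefinitions step are assumptions; only two of the six global symbols are shown.

    (* Hypothetical sketch of the setup described in the post. *)
    Load[filename_String, index_Integer] :=
      Module[{raw},
        raw = BinaryReadList[filename, "Real64"];  (* assumed binary format *)
        data[index] = raw;         (* payload stored as a downvalue *)
        header[index] = filename;  (* placeholder for the real header string *)
      ]  (* trailing semicolons make the return value Null, as in the post *)

    DistributeDefinitions[Load];      (* one way to make Load visible to the sub-kernels *)
    SetSharedFunction[data, header];  (* makes sub-kernel assignments visible to the
                                         master kernel, but synchronizes every write *)

    Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]

Without the SetSharedFunction line, each sub-kernel keeps its own private downvalues of data and header, reproducing the original problem; with it, every data[index] = ... assignment becomes a callback to the master kernel, which matches the slowdown described.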
- Follow-Ups:
- Re: Parallel speedup/slowdown
- From: Zach Bjornson <bjornson@mit.edu>
- Re: Parallel speedup/slowdown