MathGroup Archive 2009


Re: Parallel speedup/slowdown

  • To: mathgroup at smc.vnet.net
  • Subject: [mg105994] Re: Parallel speedup/slowdown
  • From: thomas <thomas.muench at gmail.com>
  • Date: Wed, 30 Dec 2009 04:11:01 -0500 (EST)
  • References: <hhc781$2ms$1@smc.vnet.net>

On Dec 29, 7:22 am, Thomas Münch <thomas.mue... at gmail.com> wrote:
> Dear Mathgroup,
>
> I have been working on parallelizing some of my code to get increased
> speed, but after initial success, it turned out that things are not that
> easy...
>
> I am loading binary data for analysis with Mathematica. Loading of each
> individual file is quite brisk, but I usually load many files (up to
> hundreds) at once (or rather sequentially, but execution of the code is
> independent for each file). So I thought this is a great task for
> parallel computing. And indeed, I achieved an increase of speed that
> scales roughly linearly with the number of kernels. Until I figured out
> that my data is loaded by the parallel kernels, but inaccessible to the
> master kernel. I found a solution for that, but now my parallel version
> of loading is more than twice as slow as the sequential version.
>
> Here is some background on the way I load and store my data:
>
> The data for each file is loaded by a function Load["filename",index],
> which stores the data and some of the associated metadata as downvalues
> in global variables. For example, the data itself (a real-valued array)
> is stored as a downvalue of the variable "data" as
> data[index]=<data-array>, the metadata is stored in
> header[index]=<header-string>, and there are 4 more such variables for 4
> more types of metadata. As "index" I use integers, and for each file I
> use a different integer. Note that the data and metadata are not
> "Returned[]" by the function Load, rather the assignments happen "on the
> way" while the function is executing. The actual return value of the
> function is Null.
>
> The function Load[...] (and subfunctions that it calls) are accessible
> to the parallel kernels, and they all do their thing. But when a
> parallel kernel assigns a downvalue to data (for example
> data[1]=<array1>) this assignment appears to be "private" to that
> sub-kernel.
>
> I found the solution: I need to define data, header, and the other
> variables as shared variables (SetSharedFunction[data, header,...]). But
> this leads to a terrible slow-down. The very same code now becomes much
> slower, slower even than the non-parallel version.
>
> I can imagine the reason for this, I guess there is a tremendous
> overhead associated with making sure that all assignments to these
> variables are synchronized between all sub kernels and the master
> kernel. And I guess that this is generally important and a very good
> idea. However, in my case, all these parallel calls are completely
> independent of each other, and it is not necessary to be careful - each
> function call uses its unique "index", so that parallel kernels can
> never come into conflict with each other by accessing the same downvalue
> of, say, "data". This is ensured by the way I call the function, using
> MapIndexed:
> Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]
>
> Is there an easier, less overhead-burdened way of making the loaded and
> stored (meta-)data available to the master kernel for data analysis? And
> possibly also to the sub-kernels for parallelized analysis?
>
> Thank you for your help,
> thomas

This is in reply to my own post... I have found a solution. I have
noticed that the parallel kernels do know about the assigned
downvalues. So it is merely a question of transferring these
downvalues to the master kernel.

The following lines do this:

DownValues[data] = Flatten[ParallelEvaluate[DownValues[data]]];
DownValues[header] = Flatten[ParallelEvaluate[DownValues[header]]];
etc.

which reads the downvalues stored in the sub-kernels and assigns them
to the master kernel.
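The two lines above can be generalized into a small helper that works for any symbol; this is just a sketch of the same idea (the name collectDownValues is my own, not a built-in):

```mathematica
(* Copy a symbol's downvalues from all subkernels to the master kernel.
   HoldAll keeps the symbol itself from evaluating on the way in.
   This assumes, as in my case, that each index is assigned on exactly
   one subkernel, so simply joining the rule lists cannot clash. *)
SetAttributes[collectDownValues, HoldAll];
collectDownValues[s_Symbol] :=
  (DownValues[s] = Flatten[ParallelEvaluate[DownValues[s]]]);

(* e.g. after the Parallelize step: *)
collectDownValues /@ {data, header, filename, timer, protocol, sampleRate};
```

Since data, header, etc. carry no OwnValues, mapping over the plain list is safe here; the symbols evaluate to themselves.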

It appears to me that this can be a generally valid way of rescuing
side-effects, contrary to what it says in the documentation. In
ParallelTools/tutorial/ParallelEvaluation it says under the header
Side Effects:

"Unless you use shared variables, the parallel evaluations performed
are completely independent and cannot influence each other.
Furthermore, any side effects, such as assignments to variables, that
happen as part of evaluations will be lost. The only effect of a
parallel evaluation is that its result is returned at the end."

Using shared variables is not a good idea in my case, as I explained
in the original post. Here are the timings on a Quad-core Windows XP
32-bit machine for loading close to 300 files (total: 42MB), in a
fresh Mathematica Session (Mathematica 7):
Sequential (non-parallel): 9.5 sec
Parallel with the trick oulined here: 5.8 sec (1.3 sec of that are
needed at the end for shuffling the data to the master kernel)
Parallel with shared variables: 20.5 sec
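As the documentation passage suggests, the fully "official" alternative is to avoid side effects altogether and let the subkernels return the data, doing all assignments on the master kernel. A sketch of that pattern, where loadRaw is a hypothetical variant of my Load that returns {data-array, header-string} instead of making assignments:

```mathematica
(* Subkernels return {index, {data, header}} tuples; the master kernel
   then makes the downvalue assignments itself, so nothing is lost. *)
results = Parallelize@
   MapIndexed[{#2[[1]], loadRaw[#1]} &, listOfFilenames];
Scan[(data[#[[1]]] = #[[2, 1]];
      header[#[[1]]] = #[[2, 2]]) &, results];
```

The trade-off is that all loaded arrays travel back through Parallelize's result, whereas the DownValues transfer above achieves the same end without restructuring Load.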

Here is the complete code, for those who are interested. It contains a
call to my function "Load[filename,index]", which, as side effects,
assigns down-values to the variables data[index], header[index], etc.

SetAttributes[parallelLoad, HoldRest];
parallelLoad[filter_: "*.phys", dir_: Directory[]] :=
 Module[{fn, filt},
  filt = Switch[filter, All, "*", Automatic, "*.phys", _, filter];
  SetDirectory[dir];
  fn = FileNames[filt];
  ParallelEvaluate[SetDirectory[dir]];
  Print["The following files are loaded from " <> dir <> ":"];
  Print@Grid[MapIndexed[{#2[[1]], #1} &, fn],
    Alignment -> {Left, Baseline}];
  PrintTemporary[
   Row[{"Loading... ",
     ProgressIndicator[Dynamic[Clock[]], Indeterminate]}]];
  Parallelize@MapIndexed[Load[#1, #2[[1]]] &, fn];
  PrintTemporary[
   Row[{"Collecting data... ",
     ProgressIndicator[Dynamic[Clock[]], Indeterminate]}]];
  DownValues[data] = Flatten[ParallelEvaluate[DownValues[data]]];
  DownValues[header] = Flatten[ParallelEvaluate[DownValues[header]]];
  DownValues[filename] =
   Flatten[ParallelEvaluate[DownValues[filename]]];
  DownValues[timer] = Flatten[ParallelEvaluate[DownValues[timer]]];
  DownValues[protocol] =
   Flatten[ParallelEvaluate[DownValues[protocol]]];
  DownValues[sampleRate] =
   Flatten[ParallelEvaluate[DownValues[sampleRate]]];
  Print["Done."];
  ]
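A hypothetical usage session (the directory path here is an assumption, and data/header assume Load has made its assignments):

```mathematica
LaunchKernels[];                  (* start the parallel subkernels *)
parallelLoad["*.phys", "C:\\data\\recordings"];
data[1]    (* array from the first listed file, now on the master kernel *)
header[1]  (* its header string *)
```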

