Parallel speedup/slowdown
- To: mathgroup at smc.vnet.net
- Subject: [mg105985] Parallel speedup/slowdown
- From: Thomas Münch <thomas.muench at gmail.com>
- Date: Tue, 29 Dec 2009 01:19:29 -0500 (EST)
Dear Mathgroup,

I have been working on parallelizing some of my code to gain speed, but after initial success it turned out that things are not so easy.

I load binary data for analysis with Mathematica. Loading an individual file is quite brisk, but I usually load many files (up to hundreds) at once, or rather sequentially, with the code executing independently for each file. That seemed like a great task for parallel computing, and indeed I achieved an increase in speed that scales roughly linearly with the number of kernels. Then I discovered that my data is loaded by the parallel kernels but is inaccessible to the master kernel. I found a solution for that, but now my parallel version of loading is more than twice as slow as the sequential version.

Here is some background on how I load and store my data. The data for each file is loaded by a function Load["filename", index], which stores the data and some of the associated metadata as downvalues of global variables. For example, the data itself (a real-valued array) is stored as a downvalue of the symbol data, as data[index]=<data-array>; the metadata is stored in header[index]=<header-string>; and there are 4 more such variables for 4 more types of metadata. As "index" I use integers, a different one for each file. Note that the data and metadata are not returned by the function Load; rather, the assignments happen "on the way" while the function is executing. The actual return value of the function is Null.

The function Load (and the subfunctions it calls) are accessible to the parallel kernels, and they all do their thing. But when a parallel kernel assigns a downvalue to data (for example data[1]=<array1>), that assignment appears to be "private" to that sub-kernel. I found the solution: I need to declare data, header, and the other variables as shared (SetSharedFunction[data, header, ...]). But this leads to a terrible slowdown. The very same code now runs much slower, slower even than the non-parallel version.

I can imagine the reason: I guess there is a tremendous overhead associated with making sure that all assignments to these variables are synchronized between all sub-kernels and the master kernel. And I guess that this is generally important and a very good idea. In my case, however, all these parallel calls are completely independent of each other, and such care is unnecessary: each function call uses its unique "index", so parallel kernels can never come into conflict with each other by accessing the same downvalue of, say, data. This is ensured by the way I call the function, using MapIndexed:

    Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]

Is there an easier, less overhead-burdened way of making the loaded and stored (meta-)data available to the master kernel for data analysis? And possibly also to the sub-kernels for parallelized analysis?

Thank you for your help,
thomas
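For concreteness, the setup described above amounts to roughly the following sketch. The body of Load is not shown in the post, so the BinaryReadList call with "Real64", the placeholder header assignment, and the DistributeDefinitions step are assumptions; only two of the six global symbols are shown.

    (* Hypothetical sketch of the setup described in the post. *)
    Load[filename_String, index_Integer] :=
      Module[{raw},
        raw = BinaryReadList[filename, "Real64"];  (* assumed binary format *)
        data[index] = raw;         (* payload stored as a downvalue *)
        header[index] = filename;  (* placeholder for the real header string *)
      ]  (* trailing semicolons make the return value Null, as in the post *)

    DistributeDefinitions[Load];      (* one way to make Load visible to the sub-kernels *)
    SetSharedFunction[data, header];  (* makes sub-kernel assignments visible to the
                                         master kernel, but synchronizes every write *)

    Parallelize@MapIndexed[Load[#1, #2[[1]]] &, listOfFilenames]

Without the SetSharedFunction line, each sub-kernel keeps its own private downvalues of data and header, reproducing the original problem; with it, every data[index] = ... assignment becomes a callback to the master kernel, which matches the slowdown described.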
- Follow-Ups:
- Re: Parallel speedup/slowdown
- From: Zach Bjornson <bjornson@mit.edu>
- Re: Parallel speedup/slowdown