Re: Parallel speedup/slowdown
- To: mathgroup at smc.vnet.net
- Subject: [mg105991] Re: [mg105985] Parallel speedup/slowdown
- From: Zach Bjornson <bjornson at mit.edu>
- Date: Wed, 30 Dec 2009 04:10:27 -0500 (EST)
- References: <200912290619.BAA02670@smc.vnet.net>
Hi Thomas,

(It sounds like your implementation functioned properly, but I think you
meant SetSharedVariable rather than SetSharedFunction for your metadata
arrays.)

One way to avoid having to call the master kernel on each iteration is to
parcel out a range of files to each slave kernel and have each one create
its own precursor metadata arrays. When they are all completed, concatenate
those precursor arrays with Join to create your master list. It sounds like
you already have some way of assigning a unique index to each file, and you
have your special Load function, but you could make use of $KernelID to
keep nonunique indexes from arising.

However... Leaving aside your special Load function and the Map call you
provided, when I've loaded files in parallel before I've always done
something like this:

    ParallelEvaluate[SetDirectory["directory with your input files"]];
    data = ParallelTable[Import[x], {x, FileNames[]}];

which is quite convenient.

Hope that helps.

Cheers,
Zach

On 12/29/2009 1:19 AM, Thomas Münch wrote:
> Dear Mathgroup,
>
> I have been working on parallelizing some of my code to get increased
> speed, but after initial success, it turned out that things are not
> that easy...
>
> I am loading binary data for analysis with Mathematica. Loading of each
> individual file is quite brisk, but I usually load many files (up to
> hundreds) at once (or rather sequentially, but execution of the code is
> independent for each file). So I thought this is a great task for
> parallel computing. And indeed, I achieved an increase in speed that
> scales roughly linearly with the number of kernels. Until I figured out
> that my data is loaded by the parallel kernels, but inaccessible to the
> master kernel. I found a solution for that, but now my parallel version
> of loading is more than twice as slow as the sequential version.
>
> Here is some background on the way I load and store my data:
>
> The data for each file is loaded by a function Load["filename",index],
> which stores the data and some of the associated metadata as downvalues
> in global variables. For example, the data itself (a real-valued array)
> is stored as a downvalue of the variable "data" as
> data[index]=<data-array>, the metadata is stored in
> header[index]=<header-string>, and there are 4 more such variables for
> 4 more types of metadata. As "index" I use integers, and for each file
> I use a different integer. Note that the data and metadata are not
> "Returned[]" by the function Load; rather, the assignments happen "on
> the way" while the function is executing. The actual return value of
> the function is Null.
>
> The function Load[...] (and the subfunctions that it calls) are
> accessible to the parallel kernels, and they all do their thing. But
> when a parallel kernel assigns a downvalue to data (for example
> data[1]=<array1>), this assignment appears to be "private" to that
> sub-kernel.
>
> I found the solution: I need to define data, header, and the other
> variables as shared variables (SetSharedFunction[data, header,...]).
> But this leads to a terrible slow-down. The very same code now becomes
> much slower, slower even than the non-parallel version.
>
> I can imagine the reason for this: I guess there is a tremendous
> overhead associated with making sure that all assignments to these
> variables are synchronized between all sub-kernels and the master
> kernel. And I guess that this is generally important and a very good
> idea. However, in my case, all these parallel calls are completely
> independent of each other, and it is not necessary to be careful: each
> function call uses its unique "index", so that parallel kernels can
> never come into conflict with each other by accessing the same
> downvalue of, say, "data". This is ensured by the way I call the
> function, using MapIndexed:
>
> Parallelize@MapIndexed[Load[#1, #2[[1]]]&, listOfFilenames]
>
> Is there an easier, less overhead-burdened way of making the loaded and
> stored (meta-)data available to the master kernel for data analysis?
> And possibly also to the sub-kernels for parallelized analysis?
>
> Thank you for your help,
> thomas
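
The first suggestion in the reply (let each subkernel build its own results, then combine them on the master) might look roughly like the sketch below. It is only an illustration under stated assumptions: loadOne is a hypothetical stand-in for the poster's Load that returns its results instead of assigning globals, the files are throwaway "Real64" binaries written to a temporary directory so that the sketch runs on its own, and the file base name merely plays the role of a header. None of these names or formats come from the original post.

```mathematica
(* Write three small throwaway binary files so the sketch runs as-is. *)
dir = CreateDirectory[];  (* fresh temporary directory *)
Do[
  With[{s = OpenWrite[FileNameJoin[{dir, "f" <> ToString[i] <> ".dat"}],
       BinaryFormat -> True]},
    BinaryWrite[s, N[Range[i]], "Real64"];
    Close[s]],
  {i, 3}];

(* Hypothetical replacement for Load: RETURNS {data, header}
   rather than assigning downvalues of global symbols. *)
loadOne[file_String] := {BinaryReadList[file, "Real64"], FileBaseName[file]};

files = FileNames["*.dat", dir];  (* full paths, sorted *)

(* Make the definition known to the subkernels. *)
DistributeDefinitions[loadOne];

(* Each subkernel loads its share of the files and sends the results
   back; no shared variables or functions are involved. *)
results = ParallelMap[loadOne, files];

(* Assign the downvalues once, on the master kernel. ParallelMap
   returns results in input order, so the position is the index. *)
Do[
  data[i] = results[[i, 1]];
  header[i] = results[[i, 2]],
  {i, Length[results]}];
```

Because the results come back in the order of the input list, the list position can serve as the unique index and the subkernels never need to coordinate with each other. After this, data and header exist only on the master kernel; a later parallel analysis would have to send the relevant pieces to the subkernels explicitly.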
- References:
- Parallel speedup/slowdown
- From: Thomas Münch <thomas.muench@gmail.com>