MathGroup Archive 2011

Re: ParallelDo and C-compiled routines

  • To: mathgroup at smc.vnet.net
  • Subject: [mg121805] Re: ParallelDo and C-compiled routines
  • From: Patrick Scheibe <pscheibe at trm.uni-leipzig.de>
  • Date: Mon, 3 Oct 2011 04:22:05 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <j5ug1a$7r5$1@smc.vnet.net> <201109290605.CAA22485@smc.vnet.net>

The communication overhead here is enormous. I assumed that someone who
asks a question about ParallelDo is at least somewhat concerned about
*speed*:

In[19]:= First@
 AbsoluteTiming@
  Table[ReadList["! tmp/square.exe " <> ToString[i], Number], {i, 10000}]

Out[19]= 48.957941

Compare this with the equivalent library function:

In[18]:= First@AbsoluteTiming@Table[cfun[i], {i, 10000}]

Out[18]= 0.005934
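
The definition of cfun is not shown in this mail; what follows is only a
minimal sketch of what it presumably was, namely a C-compiled "square"
(the name cfun and this exact definition are assumptions, not taken from
the original session):

(* assumed definition: a C-compiled "square" used for the timing above *)
cfun = Compile[{{x, _Real}}, x*x, CompilationTarget -> "C"];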

I admit that a function like "square" is a bit too short to stand in for
a "longRoutine", but why use this kind of communication when the faster
solution is just as easily available?
Maybe the "old school" way is, for some purposes, simply outdated.

Cheers
Patrick

On Sun, 2011-10-02 at 02:36 -0400, Gabriel Landi wrote:
> You can always try the 'old school' way. Use Mathematica as a command prompt:
> 
> ParallelDo[result[i] = ReadList["! ./c_code.exe", Number], {i, number}]
> 
> Works perfectly.
> 
> 
> 
> 
> On Sep 30, 2011, at 5:03 AM, Patrick Scheibe wrote:
> 
> > On Thu, 2011-09-29 at 02:05 -0400, DmitryG wrote:
> >> On Sep 28, 2:49 am, DmitryG <einsch... at gmail.com> wrote:
> >>> Hi All,
> >>>
> >>> I am going to run several instances of a long calculation on different
> >>> cores of my computer and then average the results. The program looks
> >>> like this:
> >>>
> >>> SetSharedVariable[Res];
> >>> ParallelDo[
> >>> Res[[iKer]] = LongRoutine;
> >>> , {iKer, 1, NKer}]
> >>>
> >>> LongRoutine is compiled. When compiled to C, it is two times faster
> >>> than when compiled in Mathematica. With a plain Do loop this speed
> >>> difference is visible. However, with ParallelDo I get the speed of
> >>> the Mathematica-compiled routine regardless of the CompilationTarget
> >>> of LongRoutine, even if I set NKer=1.
> >>>
> >>> What does this mean? Are routines compiled to C incapable of parallel
> >>> computing? Or is there a magic option to make them work? I tried
> >>> Parallelization->True but it made no difference, and it seems this
> >>> option is for applying the routine to lists.
> >>>
> >>> Here is an example:
> >>> ************************************************************
> >>> NKer = 1;
> >>>
> >>> (* Subroutine compiled in Mathematica *)
> >>> m = Compile[ {{x, _Real}, {n, _Integer}},
> >>>        Module[ {sum, inc}, sum = 1.0; inc = 1.0;
> >>>    Do[inc = inc*x/i; sum = sum + inc, {i, n}]; sum]];
> >>>
> >>> (* Subroutine compiled in C *)
> >>> c = Compile[ {{x, _Real}, {n, _Integer}},
> >>>        Module[ {sum, inc}, sum = 1.0; inc = 1.0;
> >>>    Do[inc = inc*x/i; sum = sum + inc, {i, n}]; sum],
> >>>   CompilationTarget -> "C"];
> >>>
> >>> (* There is a difference between Mathematica and C *)
> >>> Do[
> >>> Print[AbsoluteTiming[m[1.5, 10000000]][[1]]];
> >>> Print[AbsoluteTiming[c[1.5, 10000000]][[1]]];
> >>> , {iKer, 1, NKer}]
> >>> Print[];
> >>>
> >>> (* With ParallelDo there is no difference *)
> >>> ParallelDo[
> >>> Print[AbsoluteTiming[m[1.5, 10000000]][[1]]];
> >>> Print[AbsoluteTiming[c[1.5, 10000000]][[1]]];
> >>> , {iKer, 1, NKer}]
> >>> **************************************************************
> >>>
> >>> Any help?
> >>>
> >>> Best,
> >>>
> >>> Dmitry
> >>
> >> My theory is the following: the C compiler creates an executable that
> >> is saved somewhere on the hard drive and then run by the Mathematica
> >> kernel. Windows may not allow different applications (such as the
> >> different Mathematica kernels of a parallel computation) to access a
> >> file at the same time.
> >>
> >> If this is true, the solution would be to create copies of this
> >> executable on the hard drive, so that each kernel could run its own
> >> copy.
> >>
> >> Dmitry
> >>
> >
> > No, not exactly. The compiler creates a library, which is a DLL in your
> > (Microsoft Windows) case, a shared object on Linux, or a dylib on
> > MacOSX.
> >
> > When you compile a function with CompilationTarget -> "C", such a
> > library is created, and the library function inside this dll|so|dylib
> > is called whenever you invoke the compiled function in your Mathematica
> > session.
> >
> > On my Linux box these generated C libraries are stored in my
> > $UserBaseDirectory under
> >
> > $UserBaseDirectory/ApplicationData/CCompilerDriver/BuildFolder
> >
> > and every MathKernel with which you compile a function gets its own
> > subdirectory there. This means that if my currently running MathKernel
> > has a process id of, say, 2088, I get a subdirectory
> >
> > warp-2088
> >
> > under the above-mentioned folder; "warp" is the name of my machine.
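> >
> > A minimal sketch of how you can have a look at this location yourself
> > (buildDir is just a variable name used for this example):
> >
> > buildDir = FileNameJoin[{$UserBaseDirectory, "ApplicationData",
> >    "CCompilerDriver", "BuildFolder"}];
> > FileNames["*", buildDir]  (* the per-kernel build directories *)
> > $MachineName <> "-" <> ToString[$ProcessID]  (* your own subdirectory *)
> >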
> > This information is available in your "CompiledFunction" object too.
> > Look at
> >
> > c // InputForm
> >
> > of your function, and note how Oleksandr shows in his mail how to
> > access this information in order to load the compiled function
> > separately on each kernel.
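> >
> > A rough sketch of that idea (this is not Oleksandr's exact code; it
> > assumes that a C-compiled CompiledFunction contains a LibraryFunction
> > object holding the library path, function name, argument types and
> > return type, and the variable names here are made up for this example):
> >
> > (* pull the LibraryFunction data out of the C-compiled function c *)
> > {libPath, libName, libArgs, libRet} =
> >   First@Cases[c, LibraryFunction[p_, n_, a_, r_] :> {p, n, a, r},
> >     Infinity];
> > DistributeDefinitions[libPath, libName, libArgs, libRet];
> > (* load the same library once on every parallel kernel *)
> > ParallelEvaluate[
> >  cKernel = LibraryFunctionLoad[libPath, libName, libArgs, libRet]];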
> >
> > Besides Oleksandr's explanation, which describes the behavior you saw
> > in detail, I just want to add that you don't have to recompile a
> > function every time you restart the kernel. You can use LibraryGenerate
> > to create a library which is permanently available (it seems that the
> > libraries created with Compile[..., CompilationTarget -> "C"] are
> > deleted when the kernel quits). So with your MVM CompiledFunction you
> > could create your lib with:
> >
> > << CCodeGenerator`
> >
> > m = Compile[{{x, _Real}, {n, _Integer}},
> >   Module[{sum, inc}, sum = 1.0; inc = 1.0;
> >    Do[inc = inc*x/i; sum = sum + inc, {i, n}]; sum]];
> > LibraryGenerate[m, "longRoutine"]
> >
> >
> > loadLib[] :=
> >  LibraryFunctionLoad["longRoutine",
> >   "longRoutine", {{Real, 0, "Constant"}, {Integer, 0, "Constant"}},
> >   Real] ;
> >
> > brandNewC = loadLib[];
> > NKer = 1;
> > ParallelDo[
> > brandNewC = loadLib[];
> > Print[AbsoluteTiming[brandNewC[1.5, 10000000]]],
> > {iKer, 1, NKer}
> > ]
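> >
> > (Depending on your parallel setup the subkernels may not pick up
> > loadLib automatically; if they don't, distributing the definition
> > before the ParallelDo should be enough:)
> >
> > DistributeDefinitions[loadLib];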
> >
> >
> > Cheers
> > Patrick
> >
> >
> 
> 




