Re: Fully vectorized system of ODE's - any advantage of C?
- To: mathgroup at smc.vnet.net
- Subject: [mg121846] Re: Fully vectorized system of ODE's - any advantage of C?
- From: DmitryG <einschlag at gmail.com>
- Date: Wed, 5 Oct 2011 03:58:46 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <j6ea1b$lf2$1@smc.vnet.net>
On Oct 4, 2:44 am, DmitryG <einsch... at gmail.com> wrote: > Dear All, > > I was working on solving systems of a big number of ODE's with > vectorization and compilation in C and I have received a great support > in this forum. The speed crucially depends on the details of > programming that not always can be detected by an average user (such > as myself). > > I have realized that my vectorized codes were not yet completely > vectorized because calls have been made to components of tensors. Then > I've got an idea of a code that is completely vectorized, for a model > of classical spins that is one of the subjects of my work. Here is the > problem that, I believe, can be of interest to many Mathematica > users. > > Dynamics of a chain of classical spins (described by three-component > vectors of fixed length) precessing in the effective field created by > neighbors > > ds[i]/dt==s[i]\[Cross]H[i]; H[i]=s[i-1]+s[i+= 1]; > i=1,2,..., NSpins > > Both the vector product and interaction with neighbors in H[i] can be > very efficiently described by the RotateRight command. Putting all > three components of all NSpins spins into the double list > > s={ {s[x,1],s[x,2],..s[x,NSpins]}, {s[y,1],s[y,2],..s[y,NSpins]}, > {s[z,1],sz,2],..s[z,NSpins]} }, > > one can construct the code > > ************************************************************ > (* Runge-Kutta-4 routine *) > ClearAll[makeCompRK4, makeCompRK5] > makeCompRK4[f_] := > Compile[{{x0, _Real, 2}, {t0}, {tMax}, {n, _Integer}}, > Module[{h, K1, K2, K3, K4, SolList, x = x0, t}, h = (tMax - t0)/n= ; > SolList = Table[x0, {n + 1}]; > Do[t = t0 + k h; > K1 = h f[t, x]; > K2 = h f[t + (1/2) h, x + (1/2) K1]; > K3 = h f[t + (1/2) h, x + (1/2) K2]; > K4 = h f[t + h, x + K3]; > x = x + (1/6) K1 + (1/3) K2 + (1/3) K3 + (1/6) K4; > SolList[[k + 1]] = x, {k, 1, n}]; > SolList](*,CompilationTarget->"C"*), > CompilationOptions -> {"InlineCompiledFunctions" -> True}, > "RuntimeOptions" -> "Speed"] > > (* Defining equations *) > > RHS = Function[{t, s}, > RotateRight[ > s, {1, 0}] (RotateRight[s, {2, 1}] + RotateRight[s, {2, -1}])= - > RotateRight[ > s, {2, 0}] (RotateRight[s, {1, 1}] + RotateRight[s, {1, -1}])= ]; > > cRHS = Compile[{t, {s, _Real, 2}}, > RotateRight[ > s, {1, 0}] (RotateRight[s, {2, 1}] + RotateRight[s, {2, -1}])= - > RotateRight[ > s, {2, 0}] (RotateRight[s, {1, 1}] + RotateRight[s, {1, -1}])= ]; > > (*Compilation*) > tt0 = AbsoluteTime[]; > Timing[RKComp = makeCompRK4[RHS];] > AbsoluteTime[] - tt0 > > (* Initial condition *) > x0 = { Join[{1}, Table[0, {i, 2, NSpins}], {0}], > Join[{0}, Table[0, {i, 2, NSpins}], {0}], > Join[{0}, > Table[1, {i, 2, > NSpins}], {0}] }; (* Padding with a zero spin at the NSpins= +1 \ > position *) > > (* Parameters *) > NSpins = 120; t0 = 0; tMax = 150; n = 1000; > > (* Solving *) > tt0 = AbsoluteTime[]; > Sol = RKComp[x0, t0, tMax, n]; > AbsoluteTime[] - tt0 > > Print["Compilation: ", Developer`PackedArrayQ@Sol] > > (* Plotting *) > tList = Table[1. t0 + (tMax - t0) k/n, {k, 0, n}]; > si\[Alpha]List[\[Alpha]_, i_] := > Table[Sol[[k]][[\[Alpha], i]], {k, 0, n}]; > s1List = Transpose[{tList, si\[Alpha]List[3, 1]}]; > s2List = Transpose[{tList, si\[Alpha]List[3, 40]}]; > s3List = Transpose[{tList, si\[Alpha]List[3, 80]}]; > s4List = Transpose[{tList, si\[Alpha]List[3, NSpins]}]; > ListPlot[{s1List, s2List, s3List, s4List}, > PlotStyle -> {Blue, Green, Red, Orange}, PlotRange -> All] > > *************************************************************************= ** ********** > > The code is very short and runs faster than all my previous codes for > the same problem. There are some questions, however: > > 1) There is no difference in speed between uncompiled RHS and compiled > cRHS. > > 2) On my laptop (Windows 7, 64 bit), compilation in Mathematica > results in 0.070 execution time, whereas compilation in C (Microsoft > Visual C++) is longer, 0.075. In all previous not fully vectorized > versions of the code compiling in C gave a speed advantage by a factor > 2-3. Is it a perfect code that cannot be improved by C or, for some > reason, compilation in C does not work and the system returns to the > Mathematica compiler? > > Oliver mentioned earlier that CopyTensor in the loop in the compiled > code leads to slowdown. Here I do have CopyTensor in the loop. Could > it be moved outside?? > > I will be very greatful for all comments, as usual. > > Dmitry Here is a two-step formulation of equations in the program above that includes definition of the effective field (Heff is the same as H): ******************************************************** (* Defining equations *) Heff = Function[{t, s}, RotateRight[s, {0, 1}] + RotateRight[s, {0, -1}]]; RHS = Function[{t, s}, RotateRight[s, {1, 0}] RotateRight[Heff[t, s], {2, 0}] - RotateRight[s, {2, 0}] RotateRight[Heff[t, s], {1, 0}]]; cHeff = Compile[{t, {s, _Real, 2}}, RotateRight[s, {0, 1}] + RotateRight[s, {0, -1}]]; cRHS = Compile[{t, {s, _Real, 2}}, RotateRight[s, {1, 0}] RotateRight[cHeff[t, s], {2, 0}] - RotateRight[s, {2, 0}] RotateRight[cHeff[t, s], {1, 0}], CompilationOptions -> {"InlineCompiledFunctions" -> True}]; ******************************************************************** This works somewhat slower than the one-step formulation of equations, presumably because there are more RotateRight commands. However, this is more appropriate since Heff can contain many other terms. Here, the compiled version is faster than uncompiled one, but there is no effect of compilation in C, as above. One thing that I do not understand is why the compiled code of cRHS contains MainEvaluate[ Hold[cHeff], in spite of CompilationOptions -> {"InlineCompiledFunctions" -> True. Best, Dmitry