Re: Do I need MathLink to run finite-difference fast enough for
- To: mathgroup at smc.vnet.net
- Subject: [mg115938] Re: Do I need MathLink to run finite-difference fast enough for
- From: Rob Knapp <bobjimknapp at gmail.com>
- Date: Wed, 26 Jan 2011 05:07:03 -0500 (EST)
- References: <ihe474$qbg$1@smc.vnet.net>
On Jan 22, 1:24 am, James <icor... at hotmail.com> wrote: > Hello Mike, > > This is a Brusselator model of chemical reaction diffussion that is employing the Euler numerical method to solve the PDE system: > > u_t=D_u lap(u)+a-(b+1)u+u^2v > > v_t=D_v lap(v)+bu-u^2v > > For suitable values of the parameters, the characteristic Turing patterns of dots and stripes should emerge. > > Using Oliver's Compiler suggestions above under V7, I can run the simulator on a 64X64 grid 10000 times on my machine in about 5.5 seconds. However, I feel that's still too slow to create a reasonable interactive Demonstration of the Brusselator where the user will be changing the paramters. Ideally, I'd like to get it down to about 2 seconds. I'm not sure how fast it would run in V8. Here is the complete code using Oliver's suggestions with some additional improvements for stripes that runs in 5.5 seconds on my machine. Can anyone suggest of a way to get it down to two seconds? With Mathematica version 8, the way this can be done the fastest is with LibraryLink, not MathLink. for an example along these lines (for the 1D Brusselator) see http://reference.wolfram.com/mathematica/LibraryLink/tutorial/Numerical.html#1050043638 I adapted the C code from that to handle the 2D Brusselator problem as done below, but with one noteable exception, namely that I set it up to use the standard Euler's method. The code below by using the update of u to compute v does a modified Eulers method that has a different region of stability than the standard Euler's method (e.g. it can handle imaginary eigenvalues) In this case, the equation is parabolic so that is not an issue. I've put the C code at the very bottom of this post. Here is what I did with it in Mathematica (Version 8) In[1]:= Needs["CCompilerDriver`"] In[2]:= csource = FileNameJoin[{$TemporaryDirectory, "brusselator.c"}] Out[2]= "C:\\Users\\rknapp\\AppData\\Local\\Temp\\brusselator.c" In[3]:= lib = CreateLibrary[{csource}, "brusselator"] Out[3]= "C:\\Users\\rknapp\\AppData\\Roaming\\Mathematica\\\ SystemFiles\\LibraryResources\\Windows-x86-64\\brusselator.dll" In[4]:= setp = LibraryFunctionLoad[lib, "brusselator_2d_set_parameters", {Real, Real, Real, Real}, "Void"]; rhs = LibraryFunctionLoad[lib, "brusselator_2d_rhs", {{Real, 2}}, {Real, 2}]; In[6]:= n = 64; a = 4.5; b = 7.5; k = b + 1; du = 2; dv = 16; dt = 0.01; totaliter = 10000; SeedRandom[1234]; u0 = a + 0.3*RandomReal[{-0.5, 0.5}, {n, n}]; v0 = b/a + 0.3*RandomReal[{-0.5, 0.5}, {n, n}]; In[17]:= cf = Compile[{{uIn, _Real, 2}, {vIn, _Real, 2}, {aIn, _Real}, {bIn, _Real}, {duIn, _Real}, {dvIn, _Real}, \ {dtIn, _Real}, {iterationsIn, _Integer}}, Block[{u = uIn, v = vIn, lap, dt = dtIn, k = bIn + 1, kern = N[{{0, 1, 0}, {1, -4, 1}, {0, 1, 0}}], du = duIn, dv = dvIn, utem = uIn}, Do[lap = RotateLeft[u, {1, 0}] + RotateLeft[u, {0, 1}] + RotateRight[u, {1, 0}] + RotateRight[u, {0, 1}] - 4*u; utem = u + dt*(du*lap + a - u*(k - v*u)); lap = RotateLeft[v, {1, 0}] + RotateLeft[v, {0, 1}] + RotateRight[v, {1, 0}] + RotateRight[v, {0, 1}] - 4*v; v = v + dt*(dv*lap + u*(b - v*u)); u = utem, {iterationsIn}]; {u, v}]]; In[18]:= lf = Compile[{{u, _Real, 2}, {v, _Real, 2}, {a, _Real}, {b, _Real}, {duIn, _Real}, {dv, _Real}, {dt, \ _Real}, {iterations, _Integer}}, setp[a, b, du, dv]; Partition[Nest[# + dt*rhs[#] &, Join[u, v], iterations], Length[u]], CompilationOptions -> {"InlineExternalDefinitions" -> True}]; In[19]:= Timing[rlf = lf[u0, v0, a, b, du, dv, dt, totaliter];] Out[19]= {1.17, Null} In[20]:= Timing[rcf = cf[u0, v0, a, b, du, dv, dt, totaliter];] Out[20]= {7.098, Null} In[21]:= Max[Abs[rlf - rcf]] Out[21]= 7.89147*10^-13 So using the LibraryFunction is much faster to get the same result. Note my definition for cf does the standard Euler method to check consistency. Also note that the C code is designed to take a concantenated array that has both u and v. You can get extra speed (like another factor of 2) if you set up the C code to implement Euler's method, but I prefer sacrificing a bit for flexibility. For example, you can just use this function in NDSolve as follows. In[22]:= Timing[ sol = First[ NDSolve[{uv'[t] == rhs[uv[t]], uv[0] == Join[u0, v0]}, {uv}, {t= , 100, 100}, DependentVariables -> {uv}, Method -> {"FixedStep", "StepSize" -> dt, Method -> "ExplicitEuler"}, MaxSteps -> Infinity]];] Out[22]= {3.385, Null} This will give the same solution as it is the same method -- the returned InterpolatingFunctions will be array valued. Note that the {t, 100, 100} has NDSolve only save the solution at t = 100, otherwise with such a large array memory could become an issue. But NDSolve allows you to use other methods as well -- since this problem is purely parabolic, a stabilized Runge Kutta method that uses stages to extend the region of stability along the negative real axis is a good candidate: In[23]:= Timing[ sol = First[ NDSolve[{uv'[t] == rhs[uv[t]], uv[0] == Join[u0, v0]}, {uv}, {t= , 100, 100}, DependentVariables -> {uv}, Method -> "StabilizedRungeKutta", MaxSteps -> Infinity, AccuracyGoal -> 2, PrecisionGoal -> 2]];] Out[23]= {1.046, Null} It is worth noting that you can use NDSolve directly to solve this problem -- the most difficult part here is managing the random initial conditions which I do by constructing periodic InterpolatingFunctions: L = n; uper = Join[u0, u0[[All, {1}]], 2]; uper = Join[uper, uper[[{1}]]]; uinit = ListInterpolation[uper, {{0, L}, {0, L}}, PeriodicInterpolation -> True]; vper = Join[v0, v0[[All, {1}]], 2]; vper = Join[vper, vper[[{1}]]]; vinit = ListInterpolation[vper, {{0, L}, {0, L}}, PeriodicInterpolation -> True]; Clear[u, v]; PDE = {D[u[t, x, y], t] == du*(D[u[t, x, y], {x, 2}] + D[u[t, x, y], {y, 2}]) + a - (b + 1)*u[t, x, y] + u[t, x, y]^2 v[t, x, y], D[v[t, x, y], t] == dv*(D[v[t, x, y], {x, 2}] + D[v[t, x, y], {y, 2}]) + b*u[t, x, y] - u[t, x, y]^2 v[t, x, y]}; BC = {u[t, 0, y] == u[t, L, y], u[t, x, 0] == u[t, x, L], v[t, 0, y] == v[t, L, y], v[t, x, 0] == v[t, x, L], u[0, x, y] == uinit[x, y], v[0, x, y] == vinit[x, y]}; then, In[36]:= AbsoluteTiming[ sol = First[ NDSolve[Join[PDE, BC], {u, v}, {t, 100, 100}, {x, 0, L}, {y, 0, L}, Method -> {"MethodOfLines", "SpatialDiscretization" -> {"TensorProductGrid", "MaxPoints" -> {n, n}, "MinPoints" -> {n, n}, "DifferenceOrder" -> 2}}]];] Out[37]= {24.1700000, Null} I have eliminated messages about error estimates not being satisfied because with a random initial condition like this they are irrelevant. So this is slower -- not surprising since it is done automatically. However, if starting conditions were smooth, you could use a higher order difference method - for periodic boundary conditions, the FFT based pseudospectral approximation is really effective In[37]:= m = 16; AbsoluteTiming[ sol = First[ NDSolve[Join[PDE, BC], {u, v}, {t, 100, 100}, {x, 0, L}, {y, 0, L}, Method -> {"MethodOfLines", "SpatialDiscretization" -> {"TensorProductGrid", "MaxPoints" -> {m, m}, "MinPoints" -> {m, m}, "DifferenceOrder" -> "Pseudospectral"}}, MaxSteps -> Infinity]];] Out[37]= {1.5560000, Null} --- this does not give the same solution. -------------------------------------- Below is the C code I used ---------------------------- /* Include required header */ #include "WolframLibrary.h" #include "math.h" #include "stdlib.h" /* Return the version of Library Link */ DLLEXPORT mint WolframLibrary_getVersion() { return WolframLibraryVersion; } /* Initialize Library */ DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData) { return LIBRARY_NO_ERROR; } /* Uninitialize Library */ DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData) { return; } static mreal a = 4.5; static mreal b = 7.5; static mreal du = 2; static mreal dv = 16; static mreal dt = 0.01; DLLEXPORT int brusselator_2d_set_parameters(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) { if (Argc != 4) return LIBRARY_FUNCTION_ERROR; a = MArgument_getReal(Args[0]); b = MArgument_getReal(Args[1]); du = MArgument_getReal(Args[2]); dv = MArgument_getReal(Args[3]); return 0; } DLLEXPORT int brusselator_2d_rhs(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) { int err; mint i, j, n, nm1, nn, nr; mint *dims; mreal k = b + 1.; mreal *u0, *v0, *ur, *vr, *u, *v, *up, *vp, *un, *vn; MTensor Tres, Targ; Targ = MArgument_getMTensor(Args[0]); if (libData->MTensor_getType(Targ) != MType_Real) return LIBRARY_TYPE_ERROR; if (libData->MTensor_getRank(Targ) != 2) return LIBRARY_TYPE_ERROR; dims = libData->MTensor_getDimensions(Targ); nr = dims[0]; n = dims[1]; if (2*n != nr) return LIBRARY_DIMENSION_ERROR; nn = n*n; u0 = libData->MTensor_getRealData(Targ); v0 = u0 + nn; err = libData->MTensor_new(MType_Real, 2, dims, &Tres); if (err) return err; ur = libData->MTensor_getRealData(Tres); vr = ur + nn; un = u0; u = u0 + (nn - n); vn = v0; v = v0 + (nn - n); nm1 = n - 1; for (i = 0; i <= nm1 ; i++) { mreal ulap, vlap, uv; up = u; vp = v; u = un; v = vn; if (i < nm1) { un = u + n; vn = v + n; } else { un = u0; vn = v0; } ulap = u[nm1] + u[1] + up[0] + un[0] - 4.*u[0]; vlap = v[nm1] + v[1] + vp[0] + vn[0] - 4.*v[0]; uv = u[0]*v[0]; ur[0] = du*ulap + a - u[0]*(k - uv); vr[0] = dv*vlap + u[0]*(b - uv); for (j = 1; j < nm1; j++) { ulap = u[j - 1] + u[j + 1] + up[j] + un[j] - 4.*u[j]; vlap = v[j - 1] + v[j + 1] + vp[j] + vn[j] - 4.*v[j]; uv = u[j]*v[j]; ur[j] = du*ulap + a - u[j]*(k - uv); vr[j] = dv*vlap + u[j]*(b - uv); } ulap = u[nm1 - 1] + u[0] + up[nm1] + un[nm1] - 4.*u[nm1]; vlap = v[nm1 - 1] + v[0] + vp[nm1] + vn[nm1] - 4.*v[nm1]; uv = u[nm1]*v[nm1]; ur[nm1] = du*ulap + a - u[nm1]*(k - uv); vr[nm1] = dv*vlap + u[nm1]*(b - uv); ur += n; vr += n; } MArgument_setMTensor(Res, Tres); return 0; }