MathGroup Archive: January 2011 [00825]

[Date Index] [Thread Index] [Author Index]
Re: Do I need MathLink to run finite-difference fast enough for
To: mathgroup at smc.vnet.net
Subject: [mg115938] Re: Do I need MathLink to run finite-difference fast enough for
From: Rob Knapp <bobjimknapp at gmail.com>
Date: Wed, 26 Jan 2011 05:07:03 -0500 (EST)
References: <ihe474$qbg$1@smc.vnet.net>
On Jan 22, 1:24 am, James <icor... at hotmail.com> wrote:
> Hello Mike,
>
> This is a Brusselator model of chemical reaction diffussion that is employing the Euler numerical method to solve the PDE system:
>
> u_t=D_u lap(u)+a-(b+1)u+u^2v
>
> v_t=D_v lap(v)+bu-u^2v
>
> For suitable values of the parameters, the characteristic Turing patterns of dots and stripes should emerge.
>
> Using Oliver's Compiler suggestions above under V7, I can run the simulator on a 64X64 grid 10000 times on my machine in about 5.5 seconds.  However, I feel that's still too slow to create a reasonable interactive Demonstration of the Brusselator where the user will be changing the paramters. Ideally, I'd like to get it down to about 2 seconds.  I'm not sure how fast it would run in V8.  Here is the complete code using Oliver's suggestions with some additional improvements for stripes that runs in 5.5 seconds on my machine.  Can anyone suggest of a way to get it down to two seconds?


With Mathematica version 8, the way this can be done the fastest is
with LibraryLink, not MathLink.

for an example along these lines (for the 1D Brusselator) see
http://reference.wolfram.com/mathematica/LibraryLink/tutorial/Numerical.html#1050043638

I adapted the C code from that to handle the 2D Brusselator problem as
done below, but with one
noteable exception, namely that I set it up to use the standard
Euler's method.  The code below
by using the update of u to compute v does a modified Eulers method
that has a different
region of stability than the standard Euler's method (e.g. it can
handle imaginary eigenvalues)
In this case, the equation is parabolic so that is not an issue.

I've put the C code at the very bottom of this post.

Here is what I did with it in Mathematica (Version 8)

In[1]:= Needs["CCompilerDriver`"]

In[2]:= csource = FileNameJoin[{$TemporaryDirectory, "brusselator.c"}]

Out[2]= "C:\\Users\\rknapp\\AppData\\Local\\Temp\\brusselator.c"

In[3]:= lib = CreateLibrary[{csource}, "brusselator"]

Out[3]= "C:\\Users\\rknapp\\AppData\\Roaming\\Mathematica\\\
SystemFiles\\LibraryResources\\Windows-x86-64\\brusselator.dll"

In[4]:= setp =
  LibraryFunctionLoad[lib,
   "brusselator_2d_set_parameters", {Real, Real, Real, Real}, "Void"];
rhs = LibraryFunctionLoad[lib,
   "brusselator_2d_rhs", {{Real, 2}}, {Real, 2}];

In[6]:= n = 64;
a = 4.5;
b = 7.5;
k = b + 1;
du = 2;
dv = 16;
dt = 0.01;
totaliter = 10000;
SeedRandom[1234];
u0 = a + 0.3*RandomReal[{-0.5, 0.5}, {n, n}];
v0 = b/a + 0.3*RandomReal[{-0.5, 0.5}, {n, n}];

In[17]:= cf =
  Compile[{{uIn, _Real, 2}, {vIn, _Real,
     2}, {aIn, _Real}, {bIn, _Real}, {duIn, _Real}, {dvIn, _Real}, \
{dtIn, _Real}, {iterationsIn, _Integer}},
   Block[{u = uIn, v = vIn, lap, dt = dtIn, k = bIn + 1,
     kern = N[{{0, 1, 0}, {1, -4, 1}, {0, 1, 0}}], du = duIn,
     dv = dvIn, utem = uIn},
    Do[lap =
      RotateLeft[u, {1, 0}] + RotateLeft[u, {0, 1}] +
       RotateRight[u, {1, 0}] + RotateRight[u, {0, 1}] - 4*u;
     utem = u + dt*(du*lap + a - u*(k - v*u));
     lap =
      RotateLeft[v, {1, 0}] + RotateLeft[v, {0, 1}] +
       RotateRight[v, {1, 0}] + RotateRight[v, {0, 1}] - 4*v;
     v = v + dt*(dv*lap + u*(b - v*u)); u = utem, {iterationsIn}]; {u,
      v}]];

In[18]:= lf =
  Compile[{{u, _Real, 2}, {v, _Real,
     2}, {a, _Real}, {b, _Real}, {duIn, _Real}, {dv, _Real}, {dt, \
_Real}, {iterations, _Integer}},
   setp[a, b, du, dv];
   Partition[Nest[# + dt*rhs[#] &, Join[u, v], iterations],
    Length[u]],
   CompilationOptions -> {"InlineExternalDefinitions" -> True}];

In[19]:= Timing[rlf = lf[u0, v0, a, b, du, dv, dt, totaliter];]

Out[19]= {1.17, Null}

In[20]:= Timing[rcf = cf[u0, v0, a, b, du, dv, dt, totaliter];]

Out[20]= {7.098, Null}

In[21]:= Max[Abs[rlf - rcf]]

Out[21]= 7.89147*10^-13

So using the LibraryFunction is much faster to get the same result.
Note my definition for cf does the standard Euler method to check
consistency.
Also note that the C code is designed to take a concantenated array
that
has both u and v.


You can get extra speed (like another factor of 2) if you set up the
C code to implement Euler's method, but I prefer sacrificing a bit
for flexibility.  For example, you can just use this function in
NDSolve as follows.

In[22]:= Timing[
 sol = First[
    NDSolve[{uv'[t] == rhs[uv[t]], uv[0] == Join[u0, v0]}, {uv}, {t=
,
      100, 100}, DependentVariables -> {uv},
     Method -> {"FixedStep", "StepSize" -> dt,
       Method -> "ExplicitEuler"}, MaxSteps -> Infinity]];]

Out[22]= {3.385, Null}

This will give the same solution as it is the same method -- the
returned InterpolatingFunctions will be array valued.
Note that the {t, 100, 100} has NDSolve only save the solution
at t = 100, otherwise with such a large array memory could
become an issue.

But NDSolve allows you to use other methods as well -- since this
problem is purely parabolic, a stabilized Runge Kutta method that
uses stages to extend the region of stability along the negative real
axis
is a good candidate:

In[23]:= Timing[
 sol = First[
    NDSolve[{uv'[t] == rhs[uv[t]], uv[0] == Join[u0, v0]}, {uv}, {t=
,
      100, 100}, DependentVariables -> {uv},
     Method -> "StabilizedRungeKutta", MaxSteps -> Infinity,
     AccuracyGoal -> 2, PrecisionGoal -> 2]];]

Out[23]= {1.046, Null}


It is worth noting that you can use NDSolve directly to solve this
problem --
the most difficult part here is managing the random initial conditions
which
I do by constructing periodic InterpolatingFunctions:

L = n;

uper = Join[u0, u0[[All, {1}]], 2];
uper = Join[uper, uper[[{1}]]]; uinit =
 ListInterpolation[uper, {{0, L}, {0, L}},
  PeriodicInterpolation -> True];
vper = Join[v0, v0[[All, {1}]], 2];
vper = Join[vper, vper[[{1}]]]; vinit =
 ListInterpolation[vper, {{0, L}, {0, L}},
  PeriodicInterpolation -> True];

Clear[u, v];
PDE = {D[u[t, x, y], t] ==
   du*(D[u[t, x, y], {x, 2}] + D[u[t, x, y], {y, 2}]) +
    a - (b + 1)*u[t, x, y] + u[t, x, y]^2 v[t, x, y],
  D[v[t, x, y], t] ==
   dv*(D[v[t, x, y], {x, 2}] + D[v[t, x, y], {y, 2}]) + b*u[t, x, y] -
     u[t, x, y]^2 v[t, x, y]};
BC = {u[t, 0, y] == u[t, L, y], u[t, x, 0] == u[t, x, L],
   v[t, 0, y] == v[t, L, y], v[t, x, 0] == v[t, x, L],
   u[0, x, y] == uinit[x, y], v[0, x, y] == vinit[x, y]};

then,

In[36]:= AbsoluteTiming[
 sol = First[
    NDSolve[Join[PDE, BC], {u, v}, {t, 100, 100}, {x, 0, L}, {y, 0,
      L}, Method -> {"MethodOfLines",
       "SpatialDiscretization" -> {"TensorProductGrid",
         "MaxPoints" -> {n, n}, "MinPoints" -> {n, n},
         "DifferenceOrder" -> 2}}]];]

Out[37]= {24.1700000, Null}

I have eliminated messages about error estimates not being satisfied
because
with a random initial condition like this they are irrelevant.

So this is slower -- not surprising since it is done automatically.

However, if starting conditions were smooth, you could use a higher
order
difference method - for periodic boundary conditions, the FFT based
pseudospectral
approximation is really effective

In[37]:= m = 16; AbsoluteTiming[
 sol = First[
    NDSolve[Join[PDE, BC], {u, v}, {t, 100, 100}, {x, 0, L}, {y, 0,
      L},
     Method -> {"MethodOfLines",
       "SpatialDiscretization" -> {"TensorProductGrid",
         "MaxPoints" -> {m, m}, "MinPoints" -> {m, m},
         "DifferenceOrder" -> "Pseudospectral"}},
     MaxSteps -> Infinity]];]

Out[37]= {1.5560000, Null}

 ---  this does not give the same solution.


-------------------------------------- Below is the C code I used
----------------------------

/* Include required header */
#include "WolframLibrary.h"
#include "math.h"
#include "stdlib.h"

/* Return the version of Library Link */
DLLEXPORT mint WolframLibrary_getVersion() {
	return WolframLibraryVersion;
}

/* Initialize Library */
DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData) {
	return LIBRARY_NO_ERROR;
}

/* Uninitialize Library */
DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData)
{
	return;
}

static mreal a = 4.5;
static mreal b = 7.5;
static mreal du = 2;
static mreal dv = 16;
static mreal dt = 0.01;

DLLEXPORT int brusselator_2d_set_parameters(WolframLibraryData
libData, mint Argc, MArgument *Args, MArgument Res)
{
	if (Argc != 4) return LIBRARY_FUNCTION_ERROR;
	a = MArgument_getReal(Args[0]);
	b = MArgument_getReal(Args[1]);
	du = MArgument_getReal(Args[2]);
	dv = MArgument_getReal(Args[3]);
	return 0;
}

DLLEXPORT int brusselator_2d_rhs(WolframLibraryData libData, mint
Argc, MArgument *Args, MArgument Res)
{
	int err;
    mint i, j, n, nm1, nn, nr;
	mint *dims;
	mreal k = b + 1.;
    mreal *u0, *v0, *ur, *vr, *u, *v, *up, *vp, *un, *vn;
    MTensor Tres, Targ;
    Targ = MArgument_getMTensor(Args[0]);
    if (libData->MTensor_getType(Targ) != MType_Real) return
LIBRARY_TYPE_ERROR;
    if (libData->MTensor_getRank(Targ) != 2) return
LIBRARY_TYPE_ERROR;
	dims = libData->MTensor_getDimensions(Targ);
	nr = dims[0];
	n = dims[1];
	if (2*n != nr) return LIBRARY_DIMENSION_ERROR;

	nn = n*n;
    u0 = libData->MTensor_getRealData(Targ);
    v0 = u0 + nn;

    err = libData->MTensor_new(MType_Real, 2, dims, &Tres);
	if (err) return err;

    ur = libData->MTensor_getRealData(Tres);
    vr = ur + nn;

	un = u0;
	u = u0 + (nn - n);
	vn = v0;
	v = v0 + (nn - n);

	nm1 = n - 1;
	for (i = 0; i <= nm1 ; i++) {
		mreal ulap, vlap, uv;
		up = u;
		vp = v;
		u = un;
		v = vn;
		if (i < nm1) {
			un = u + n;
			vn = v + n;
		} else {
			un = u0;
			vn = v0;
		}

		ulap = u[nm1] + u[1] + up[0] + un[0] - 4.*u[0];
		vlap = v[nm1] + v[1] + vp[0] + vn[0] - 4.*v[0];
		uv = u[0]*v[0];

		ur[0] = du*ulap + a - u[0]*(k - uv);
		vr[0] = dv*vlap + u[0]*(b - uv);

		for (j = 1; j < nm1; j++) {
			ulap = u[j - 1] + u[j + 1] + up[j] + un[j] - 4.*u[j];
			vlap = v[j - 1] + v[j + 1] + vp[j] + vn[j] - 4.*v[j];
			uv = u[j]*v[j];

			ur[j] = du*ulap + a - u[j]*(k - uv);
			vr[j] = dv*vlap + u[j]*(b - uv);
		}

		ulap = u[nm1 - 1] + u[0] + up[nm1] + un[nm1] - 4.*u[nm1];
		vlap = v[nm1 - 1] + v[0] + vp[nm1] + vn[nm1] - 4.*v[nm1];
		uv = u[nm1]*v[nm1];

		ur[nm1] = du*ulap + a - u[nm1]*(k - uv);
		vr[nm1] = dv*vlap + u[nm1]*(b - uv);

		ur += n;
		vr += n;
	}

    MArgument_setMTensor(Res, Tres);
    return 0;
}
Prev by Date: Re: Mathematica 20x slower than Java at arithmetic/special
Next by Date: Re: Do I need MathLink to run finite-difference fast enough for
Previous by thread: Re: Vector problem
Next by thread: Re: Do I need MathLink to run finite-difference fast enough for