Re: Do I need MathLink to run finite-difference fast enough for Manipulate?
- To: mathgroup at smc.vnet.net
- Subject: [mg115798] Re: Do I need MathLink to run finite-difference fast enough for Manipulate?
- From: Oliver Ruebenkoenig <ruebenko at wolfram.com>
- Date: Fri, 21 Jan 2011 04:30:31 -0500 (EST)
Hello James,
On Wed, 19 Jan 2011, James wrote:
> Hello guys,
>
> I would like to create a Manipulate for a finite-difference on a system
> of PDEs using a 100x100 array that is processed about 10,000 times. Is
> it possible to code this in native Mathematica, for example using
> Compile, fast enough to be reasonably used by Manipulate (under 3 or 4
> seconds)? The C++ code is below and with the Compiler set to optimize
> speed, it runs in approximately 1.5 seconds. However, if I attempt to
> just code it in Mathematica using arrays, it takes much too long to
> execute.
>
> Can someone help me decide if the only reasonable option is to use
> MathLink or can I even use MathLink in a Manipulate?
>
> Thanks Guys. Here's the C++ code. The variables u, v, and lap
> are all 100x100 arrays.
>
I think you have several options:
1) Consider using NDSolve for this. This has several advantages. Better
control of the time step size - dt = fixNum is somewhat crude. Also,
NDSolve can terminated once convergence has been reached, you can control
accuracy.
2) Seen that you have a C++ code you could use link technology to just
link that code into M-. My guess is that this is quite efficient.
3) You can code this in Mathematica. You could translate the for loops in,
say, Do[] loops and put that into the compiler.
Let me show you another approach.
Prelude:
-------
n = 5;
data = Array[m, {n, n}];
data
{{m[1, 1], m[1, 2], m[1, 3], m[1, 4], m[1, 5]}, {m[2, 1], m[2, 2],
m[2, 3], m[2, 4], m[2, 5]}, {m[3, 1], m[3, 2], m[3, 3], m[3, 4],
m[3, 5]}, {m[4, 1], m[4, 2], m[4, 3], m[4, 4], m[4, 5]}, {m[5, 1],
m[5, 2], m[5, 3], m[5, 4], m[5, 5]}}
commands like
RotateLeft[data, 1]
{{m[2, 1], m[2, 2], m[2, 3], m[2, 4], m[2, 5]}, {m[3, 1], m[3, 2],
m[3, 3], m[3, 4], m[3, 5]}, {m[4, 1], m[4, 2], m[4, 3], m[4, 4],
m[4, 5]}, {m[5, 1], m[5, 2], m[5, 3], m[5, 4], m[5, 5]}, {m[1, 1],
m[1, 2], m[1, 3], m[1, 4], m[1, 5]}}
shit columns and rows.
Your stencil could then be
res = -4*data + RotateLeft[data, 1] + RotateRight[data, 1] +
RotateLeft[data, {0, 1}] + RotateRight[data, {0, 1}];
Appetizer: - ListCorrelate
----------
n = 100;
u0 = RandomReal[1, {n, n}];
v0 = RandomReal[1, {n, n}];
dt = 10^-6.;
a = 0.1; b = 0.1;
du = 1/(n - 1);
dv = 1/(n - 1);
kern = {{0, 1, 0}, {1, -4, 1}, {0, 1, 0}};
u = u0;
v = v0;
AbsoluteTiming[
Do[
lap = RotateLeft[ListCorrelate[kern, u, {-1, -1}], {1, 1}];
u = u + dt*(du*lap + (a - (b + 1)*u + v*u*u));
lap = RotateLeft[ListCorrelate[kern, v, {-1, -1}], {1, 1}];
v = v + dt*(dv*lap + (b*u - v*u*u));
, {10000}
]]
Please check that this is indeed what you coded. This runs in about 30
sec. on my laptop.
Main Course: - Compile
-----------
cf = Compile[{{uIn, _Real, 2}, {vIn, _Real, 2}},
Block[{u = uIn, v = vIn, lap, dt = 10^-6.,
a = 0.1, b = 0.1,
du, dv, n,
kern = {{0, 1, 0}, {1, -4, 1}, {0, 1, 0}}
},
n = Length[uIn];
du = 1/(n - 1);
dv = 1/(n - 1);
Do[
lap = RotateLeft[ListCorrelate[kern, u, {-1, -1}], {1, 1}];
u = u + dt*(du*lap + (a - (b + 1)*u + v*u*u));
lap = RotateLeft[ListCorrelate[kern, u, {-1, -1}], {1, 1}];
v = v + dt*(dv*lap + (b*u - v*u*u));
, {10000}
]
], CompilationTarget -> "C", RuntimeOptions -> "Speed"
];
AbsoluteTiming[
cf[u0, v0];
]
gets it down to ca. 15 secs.
Desert - Optimization
------
This
Needs["CompiledFunctionTools`"]
and
Clear[u, v]
CompilePrint[cf]
You will find that there are two calls to main eval, coming from the call
to ListCorrelate - Let's get rid of them.
cf2 = Compile[{{uIn, _Real, 2}, {vIn, _Real, 2}},
Block[{u = uIn, v = vIn, lap, dt = 10^-6.,
a = 0.1, b = 0.1,
du, dv, n,
kern = {{0, 1, 0}, {1, -4, 1}, {0, 1, 0}}
},
n = Length[uIn];
du = 1/(n - 1);
dv = 1/(n - 1);
Do[
lap = -4*u + RotateLeft[u, 1] + RotateRight[u, 1] +
RotateLeft[u, {0, 1}] + RotateRight[u, {0, 1}];
u = u + dt*(du*lap + (a - (b + 1)*u + v*u*u));
lap = -4*v + RotateLeft[v, 1] + RotateRight[v, 1] +
RotateLeft[v, {0, 1}] + RotateRight[v, {0, 1}];
v = v + dt*(dv*lap + (b*u - v*u*u));
, {10000}
]
], CompilationTarget -> "C", RuntimeOptions -> "Speed"
];
CompilePrint[cf2]
looks good.
AbsoluteTiming[
cf2[u0, v0];
]
gets it down to about 6 sec. I do not know what hardware you have so we
can not directly compare the timings.
I hope this help,
Oliver
ah, to get the result back you need to add
]; (* end Do loop *)
{u, v}
],
> startt=clock();
>
> for(p=1;p<=10000;p++){
>
>
> lap[0][0]=(u[0][1]-4*u[0][0]+u[0][M-1]+u[1][0]+u[N-1][0]);
>
> for(j=1;j<=M-2;j++)
> {
> lap[0][j]=(u[0][j+1]-4*u[0][j]+u[0][j-1]+u[1][j]+u[N-1][j]);
> }
>
> lap[0][M-1]=(u[0][0]-4*u[0][M-1]+u[0][M-2]+u[1][M-1]+u[N-1][M-1]);
>
> for(i=1;i<=N-2;i++)
> {
> lap[i][0]=(u[i][1]-4*u[i][0]+u[i][M-1]+u[i+1][0]+u[i-1][0]);
> }
>
> for(i=1;i<=N-2;i++)
> {
>
> lap[i][M-1]=(u[i][0]-4*u[i][M-1]+u[i][M-2]+u[i+1][M-1]+u[i-1][M-1]);
> }
>
> lap[N-1][0]=(u[N-1][1]-4*u[N-1][0]+u[N-1][M-1]+u[0][0]+u[N-2][0]);
>
>
> lap[N-1][M-1]=(u[N-1][0]-4*u[N-1][M-1]+u[N-1][M-2]+u[0][M-1]+u[N-2][M-1]);
>
> for(j=1;j<=M-2;j++)
> {
>
> lap[N-1][j]=(u[N-1][j+1]-4*u[N-1][j]+u[N-1][j-1]+u[0][j]+u[N-2][j]);
> }
>
> for(i=1;i<=N-2;i++)
> for(j=1;j<=M-2;j++)
> {
> lap[i][j]=(u[i][j+1]-4*u[i][j]+u[i][j-1]+u[i+1][j]+u[i-1][j]);
> }
>
> for(i=0;i<M;i++)
> for(j=0;j<N;j++){
>
> u[i][j] =
> =u[i][j]+dt*(D_u*lap[i][j]+(a-(b+1)*u[i][j]+v[i][j]*u[i][j]*u[i][j]));
>
> }
>
>
> lap[0][0]=(v[0][1]-4*v[0][0]+v[0][M-1]+v[1][0]+v[N-1][0]);
>
> for(j=1;j<=M-2;j++)
> {
> lap[0][j]=(v[0][j+1]-4*v[0][j]+v[0][j-1]+v[1][j]+v[N-1][j]);
> }
>
> lap[0][M-1]=(v[0][0]-4*v[0][M-1]+v[0][M-2]+v[1][M-1]+v[N-1][M-1]);
>
> for(i=1;i<=N-2;i++)
> {
> lap[i][0]=(v[i][1]-4*v[i][0]+v[i][M-1]+v[i+1][0]+v[i-1][0]);
> }
>
> for(i=1;i<=N-2;i++)
> {
>
> lap[i][M-1]=(v[i][0]-4*v[i][M-1]+v[i][M-2]+v[i+1][M-1]+v[i-1][M-1]);
> }
>
> lap[N-1][0]=(v[N-1][1]-4*v[N-1][0]+v[N-1][M-1]+v[0][0]+v[N-2][0]);
>
>
> lap[N-1][M-1]=(v[N-1][0]-4*v[N-1][M-1]+v[N-1][M-2]+v[0][M-1]+v[N-2][M-1]);
>
> for(j=1;j<=M-2;j++)
> {
>
> lap[N-1][j]=(v[N-1][j+1]-4*v[N-1][j]+v[N-1][j-1]+v[0][j]+v[N-2][j]);
> }
>
> for(i=2;i<=N-2;i++)
> for(j=2;j<=M-2;j++)
> {
> lap[i][j]=(v[i][j+1]-4*v[i][j]+v[i][j-1]+v[i+1][j]+v[i-1][j]);
> }
>
> for(i=0;i<M;i++)
> for(j=0;j<N;j++){
> v[i][j] =
> =v[i][j]+dt*(D_v*lap[i][j]+(b*u[i][j]-v[i][j]*u[i][j]*u[i][j]));
> }
>
> endt=clock();
>
> fprintf(stderr,"%d\n",endt-startt);
>
>