Re: Compiling Runge-kutta
- To: mathgroup at smc.vnet.net
- Subject: [mg125661] Re: Compiling Runge-kutta
- From: João Paulo Pereira <joaopereira9 at gmail.com>
- Date: Mon, 26 Mar 2012 01:48:02 -0500 (EST)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <201203200721.CAA11925@smc.vnet.net>
Dear Mathgroup
I have a question regarding a problem I'm facing. Any help would be much
appreciated. You may find the full description below
Thanks in advance
Joao Pereira
---------- Forwarded message ----------
From: Oliver Ruebenkoenig <ruebenko at wolfram.com>
Date: 2012/3/25
Subject: [mg125661] Re: Compiling Runge-kutta
To: Jo=E3o Paulo Pereira <joaopereira9 at gmail.com>
Hi Joao,
perhaps it were good if posted this message to MathGroup as well, it does
not look like you did that. I will not have time to look into this
immediately and other readers of MG might find it useful too.
A first thought: Have a look at your complied function with CompilePrint
and see if you can spot something that gives a hint on why the compiled
function might not give the expected performance (e.g. call to MainEvaluate
in a loop)
Oliver
On Sat, 24 Mar 2012, Jo=E3o Paulo Pereira wrote:
Hi Oliver, you're right of course.
> This is the part of my code relevant to this issue:
>
> (*These are global variables used throughout the program*)
>
> (*GLOBAL VARIABLES*)
>
> gbvm[t_, i_] = (2.2946301434601255*^-8 +
> 0.000016958949537576912* Exp[0.06497640748638328*i]);
> gbv\[Sigma][t_] = 0.8 ;
> gbv\[Alpha][t_] = 7.0;
> gbv\[Epsilon][t_] = 0.8;
> gbvr[t_] = 0.03 ;
> gbv\[Rho][t_, i_] = 0.019 ;
> gbva[t_] = 0.65 ;
> gbv\[Phi][t_] = 0.3;
> gbv\[Delta][t_] = 0.04 ;
> gbvs[t_] = 15.0 ;
> gbvref[t_] = 65.0 ;
> gbv\[Gamma][t_] = 0.1;
> gbvz0[t_] = 0.5;
> gbvzl[t_] = 0.49346 ;
> gbvzp[t_] = 0.10794 ;
> gbv\[Beta][t_] = 0.511459;
> gbvdisc\[Rho][v_, t0_, t_] :=
> Exp[NIntegrate[
> gbv\[Rho][t0, xt - v] + gbvm[t0, xt - v], {xt, t0, t}]];
> gbvdiscr[v_, t0_, t_] :=
> Exp[Integrate[gbvr[t0] + gbvm[t0, xt - v], {xt, t0, t}]];
> gbvdiscrtr[v_, t0_, tr_, t_] :=
> Exp[Integrate[gbvr[t0] + gbvm[t0, xt - v], {xt, tr, t}]];
>
> (* ------------------------------**-------------------*)
>
> (*CONTROL VARIBLES THAT FEED INTO THE ODE SYSTEM*)
>
> controlos1[t_, x_] := Module[{xeq1, dxeq1, xsl1, s},
> xeq1[s_] =
> gbv\[Alpha][
> t0]/(gbvdisc\[Rho][v, t0, t]*
> Power[s, gbv\[Epsilon][t0]]) - (x[[4]]*gbva[t0]*e[[1]]*
> gbv\[Phi][t0])/(Power[1.0 - s, 1.0 - gbv\[Phi][t0]]) -
> x[[6]] gbv\[Gamma][t0] e[[2]];
> dxeq1[s_] = -gbv\[Epsilon][t0] gbv\[Alpha][
> t0]/(gbvdisc\[Rho][v, t0, t]*
> Power[s, gbv\[Epsilon][t0] + 1.0]) - (x[[4]]*gbva[t0]*e[[1]]*
> gbv\[Phi][t0]*(1.0 - gbv\[Phi][t0]))/
> Power[1.0 - s, 2.0 - gbv\[Phi][t0]];
> xsl1 = newton[xeq1, dxeq1, 0.3, 10^(-10), 20, {0.0, 1.0}];
> {Power[x[[6]]*gbvdisc\[Rho][**v, t0, t], -1.0/gbv\[Sigma][t0]], xsl1,
> 1.0 - xsl1}
> ];
>
> controlos2[t_, x_] := Module[{xd, xc, xeq2, dxeq2, s, xsl2},
> xd = (x[[5]] + (1.0 - gbvzl[t0]) x[[6]])*e[[2]]*x[[1]];
> xc = {
> Power[x[[6]]*gbvdisc\[Rho][v, t0, t], -1.0/gbv\[Sigma][t0]],
> Power[gbv\[Alpha][t0]/(xd***gbvdisc\[Rho][v, t0, t]),
> 1.0/gbv\[Epsilon][t0]],
> Power[x[[4]]*gbva[t0]*e[[1]]***gbv\[Phi][t0]/xd,
> 1.0/(1.0 - gbv\[Phi][t0])]
> };
> If[x[[4]] <= 0.0, {xc[[1]], 1.0, 0.0},
> If[xc[[2]] + xc[[3]] > 1.0,
> xeq2[s_] =
> gbv\[Alpha][
> t0]/(gbvdisc\[Rho][v, t0, t]*
> Power[s, gbv\[Epsilon][t0]]) - (x[[4]]*gbva[t0]*e[[1]]*
> gbv\[Phi][t0])/Power[1.0 - s, 1.0 - gbv\[Phi][t0]];
> dxeq2[
> s_] = -gbv\[Epsilon][t0] gbv\[Alpha][
> t0]/(gbvdisc\[Rho][v, t0, t]*
> Power[s, gbv\[Epsilon][t0] + 1.0]) - (x[[4]]*gbva[t0]*
> e[[1]]*gbv\[Phi][t0]*(1.0 - gbv\[Phi][t0]))/
> Power[1.0 - s, 2.0 - gbv\[Phi][t0]];
> xsl2 = newton[xeq2, dxeq2, 0.3, 10^(-10), 20, {0.0, 1.0}];
> {xc[[1]], xsl2, 1.0 - xsl2}, xc]
> ]
> ];
>
> (*THE SYSTEM OF ODE TO BE SOLVED *)
> (*The same system is divided into 2 stages with diferent functional forms=
*)
>
> prehamiltonianfield1[t_, x_] := {
> gbva[t0]*e[[1]]*Power[**controlos1[t, x][[3]], gbv\[Phi][t0]] -
> gbv\[Delta][t0]*x[[1]],
> 0.0,
> (gbvr[t0] + gbvm[t0, t - v])*x[[3]] - controlos1[t, x][[1]] -
> gbvz0[t0] + gbv\[Gamma][t0]*controlos1[t, x][[3]]*e[[2]],
> gbv\[Delta][t0]*x[[4]],
> 0.0,
> -(gbvr[t0] + gbvm[t0, t - v])*x[[6]]
> };
>
> prehamiltonianfield2[t_, x_] := {
> gbva[t0]*e[[1]]*Power[**controlos2[t, x][[3]], gbv\[Phi][t0]] -
> gbv\[Delta][t0]*x[[1]],
> x[[1]]*e[[2]]*(1.0 - controlos2[t, x][[2]] - controlos2[t, x][[3]]),
> (gbvr[t0] + gbvm[t0, t - v])*x[[3]] - controlos2[t, x][[1]] -
> gbvz0[t0] + (1.0 - gbvzl[t0])*x[[1]]*
> e[[2]]*(1.0 - controlos2[t, x][[2]] - controlos2[t, x][[3]]),
> gbv\[Delta][t0]*x[[4]] - (x[[5]] + (1.0 - gbvzl[t0]) x[[6]])*
> e[[2]]*(1.0 - controlos2[t, x][[2]] - controlos2[t, x][[3]]),
> 0.0,
> -(gbvr[t0] + gbvm[t0, t - v])*x[[6]]
> };
>
> (* ------------------------------**-----------*)
>
> (*ALGORITHMS THAT WILL BE USED*)
>
> (*NEWTON R1 ADAPTED TO SEARCH WITHIN A DOMAIN*)
>
> newton[f_, df_, p0_, \[CapitalDelta]_, max_, dom_] :=
> Module[{k, dz, z0, z1},
> Clear[k, dz, z0, z1];
> k = 0;
> dz = 1.0;
> z0 = N[p0];
> While[And[k < max, N[\[CapitalDelta]] < Abs[dz]],
> dz = -(Apply[f, {z0}]/Apply[df, {z0}]);
> z1 = z0 + dz;
> If[z1 <= dom[[1]], z1 = z0 + 0.95*(dom[[1]] - z0),
> If[z1 >= dom[[2]], z1 = z0 + 0.95*(dom[[2]] - z0), z1 = z0 + d=
z]];
> k = k + 1;
> z0 = z1;]; N[z1]];
>
> (*EXPLICIT RK-4*)
>
> rk4[f_, t0_, tf_, x0_, n_] := Module[{h, step}, h = (tf - t0)/n;
> step[{t_, x_}] := Module[{k1, k2, k3, k4}, k1 = h*f[t, x];
> k2 = h*f[t + 0.5 h, x + 0.5 k1];
> k3 = h*f[t + 0.5 h, x + 0.5 k2];
> k4 = h*f[t + h, x + k3];
> {t + h, x + 1/6*(k1 + 2 k2 + 2 k3 + k4)}];
> NestList[step, {t0, x0}, n]];
>
> (*EXPLICIT RK-4 WITH COMPILE*)
> makeCompRK[fIn_] := With[{f = fIn},
> Compile[{{x0, _Real, 1}, {t0, _Real}, {tMax, _Real}, {n, _Integer}},
> Module[{k, h, K1, K2, K3, K4, SolList, x = x0},
> h = (tMax - t0)/n;
> SolList = Table[{0., 0.}, {n + 1}];
> SolList[[1]] = x0;
> Do[t = t0 + k h;
> K1 = h f[t, x];
> K2 = h f[t + (0.5) h, x + (0.5) K1];
> K3 = h f[t + (0.5) h, x + (0.5) K2];
> K4 = h f[t + h, x + K3];
> x = x + (1/6) K1 + (1/3) K2 + (1/3) K3 + (1/6) K4;
> SolList[[k]] = x;, {k, 2, n + 1}];
> SolList]]];
>
> (* ------------------------------**--------*)
>
> (*I USE THIS ROUTIN TO CALL THE RK4, BECAUSE I NEED TO INPUT THE DATA \
> IN A SPECIAL WAY*)
>
> consProg[f_, vp_, t0p_, ep_, x0s_, var_, n_] := Module[{ts, x0, tf},
> Clear[v, e, ts, tf];
> v = vp; ts = t0p; e = ep;
> x0 = Join[x0s, var];
> tf = x0[[7]];
> rk4[f, ts, tf, x0[[1 ;; 6]], n]
> ];
>
> (*I USE THIS ROUTINE TO COMPUTE 2 STAGES OF THE LIFE-CYCLE*)
>
> consumerTR[f1_, f2_, vp_, t0p_, ep_, x0s_, var_, n1_, n2_] :=
> Module[{x0, tf, var1, x0s2, var2, phase1, phase2},
> Clear[v, t0, e, tf];
> v = vp; t0 = t0p; e = ep;
> tf = var[[4]];
> var1 = Append[var[[1 ;; 3]], gbvs[t0]];
> If[tf < gbvs[t0], Abort[],
> If[t0 < gbvs[t0],
> phase1 = consProg[f1, v, t0, e, x0s, var1, n1];
> x0s2 = phase1[[n1 + 1]][[2, 1 ;; 3]];
> var2 = Append[phase1[[n1 + 1]][[2, 4 ;; 6]], tf];
> phase2 =
> consProg[f2, v, phase1[[n1 + 1]][[1]], e, x0s2, var2, n2];
> Join[phase1, Drop[phase2, 1]],
> consProg[f2, v, t0, x0s, var, n2],
> Print["non evaluated inner If"]], Print["non evaluated outer If"]]
> ];
>
> (*NOW THE COMPILED VERSIONS of the last two routines)
>
> consProgCompile[f_, vp_, t0p_, ep_, x0s_, var_, n_] :=
> Module[{ts, x0, tf},
> Clear[v, e, ts, tf];
> v = vp; ts = t0p; e = ep;
> x0 = Join[x0s, var];
> tf = x0[[7]];
> rk4[f, ts, tf, x0[[1 ;; 6]], n]
> ];
>
> consumerTRcompile[f1_, f2_, vp_, t0p_, ep_, x0s_, var_, n1_, n2_] :=
> Module[{x0, tf, var1, x0s2, var2, phase1, phase2},
> Clear[v, t0, e, tf];
> v = vp; t0 = t0p; e = ep;
> tf = var[[4]];
> var1 = Append[var[[1 ;; 3]], gbvs[t0]];
> If[tf < gbvs[t0], Abort[],
> If[t0 < gbvs[t0],
> phase1 = consProgCompile[f1, v, t0, e, x0s, var1, n1];
> x0s2 = phase1[[n1 + 1]][[2, 1 ;; 3]];
> var2 = Append[phase1[[n1 + 1]][[2, 4 ;; 6]], tf];
> phase2 =
> consProgCompile[f2, v, phase1[[n1 + 1]][[1]], e, x0s2, var2, n2];
> Join[phase1, Drop[phase2, 1]],
> consProgCompile[f2, v, t0, x0s, var, n2],
> Print["non evaluated inner If"]], Print["non evaluated outer If"]]
> ];
>
> (*----------------------------***)
> (*RESULTS*)
>
> (*for this simple example...*)
> f = Function[{t, x}, {x[[2]], t*1 - x[[1]] + 4/(x[[1]]^3)}];
> RK4Comp = makeCompRK[f];
>
> (*both forms of the algorithm work, although the compiled is much \
> faster*)
>
> RK4Comp[{1.0, 1.0}, 0.0, 100.0, 50000]; // AbsoluteTiming
> ={2.7456000, Null}
>
> rk4[f, 0.0, 100.0, {1.0, 1.0}, 50000]; // AbsoluteTiming
> = {5.8900000, Null}
>
> but for the problem I am working on we get
>
> consumerTR[**prehamiltonianfield1, prehamiltonianfield2, 0.0,
> 0.0, {1.8236, 0.95}, {0., 0., 0.}, {2.3, 0.12, 3.0, 65.}, 15,
> 50]; // AbsoluteTiming
> ={9.9698000, Null}
> consumerTRcompile[**prehamiltonianfield1, prehamiltonianfield2, 0.0,
> 0.0, {1.8236, 0.95}, {0., 0., 0.}, {2.3, 0.12, 3.0, 65.}, 15,
> 50]; // AbsoluteTiming
> ={9.8652000, Null}
>
> The execution time is similar which I guess is because there is some
> problem
> with the compiled version and it reverts to the uncomplied version.
>
> You've probably noticed that I am feeding the initial conditions in a wei=
rd
> way, instead of having a list of 6 variables, I have one with three and
> another with 4, this last one has the remaining 3 variables and the
> terminal
> time. I am doing this because the value of the variables in the second li=
st
> (including the terminal time) are unknown and have to be such that they
> solve a set of restrictions at terminal time. To do this I apply a newton
> algorithm to find the zeros of that boundary condition which gives me the=
se
> initial conditions and the true terminal time. This implies that the
> runge-kutta will be called many times wihile the newton algorithm is
> running. So speed is really an importante issue here, also because I am
> running this problem for a population and it will have to be run for all
> generations of that population (different initial v - generation) and eve=
ry
> single generation will repeat this procedure for a given moment in
> time....if one adds the fact that I will be following this population
> through time, it becomes clear that the runge kutta algorithm will be use=
d
> thousands of times
>
> Since the right hand side of the ODE system contains switchs and implicit
> functions that call a newton algorithm
> is it still feasible to compile the rk algorithm?
>
> Thanks again in advance
>
> My best regards
>
> Joao Pereira
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Mar 23, 2012 at 6:32 AM, Oliver Ruebenkoenig <ruebenko at wolfram.co=
m
> >
> wrote:
>
> Joao,
>
> it's hard to say what the issue is without seeing the code. I
> suspect you
> get more answers if you actually post your code. Please make
> sure it is
> complete and easy to copy and past.
>
> Oliver
>
>
>
> On Tue, 20 Mar 2012, Joao wrote:
>
> > Hi there, I've been going through this thread (Vector Runge-Kutta
> ODE solver with compilation?) and I would like to know if anyone could
> give me an opinion on the problem I have.
> >
> > Basically I am using the same runge-kutta 4 taken from Jason
> Cantarella's page than Dmitry was, but I am using the NestList
> version.
> >
> > I really need to speed up this algorithm. When I did one of the
> examples you provided on this thread, the compiled version solved it
> at a duration of 7% of the non compiled version which was pretty cool.
> >
> > My problem was that when I tried to use it on the work I'm doing the
> compiled and non compiled versions took the same time. My guess is
> that the compiled version wasn't able to compute and it returned to
> the non-compiled version.
> >
> > At this moment I don't know if I'm doing something wrong or if the
> nature of the RHS of the functions I'm using prevents the compiled
> algorithm to work in a proper way.
> >
> > I can send the code if necessary.
> >
> > Nevertheless, I may advance that my RHS is a system of 6 ODE
> (hamiltonian), these ODE call other functions (controls). These last
> functions (controls) are coded in a module box. Inside that box are
> some implicit functions in R1 (hence a newton algorithm is called) and
> there is also a switch - if a given condition is above a threshold the
> control box assumes one form, bellow that threshold another form is
> assumed. After all these calculations inside the control box it
> returns simply a vector of order 3.
> >
> > I would appreciate any help. At the moment I do not know if this is
> feasible with Compile or not. Maybe I have to put all this different
> boxes inside a compile?
> >
> > Thanks in advance
> >
> > Joao Pereira
> >
> >
>
>
>
>
- References:
- Compiling Runge-kutta
- From: Joao <joaopereira9@gmail.com>
- Compiling Runge-kutta