[Date Index]
[Thread Index]
[Author Index]
RE: Increase in efficiency with Module
*To*: mathgroup at smc.vnet.net
*Subject*: [mg40060] RE: [mg40050] Increase in efficiency with Module
*From*: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
*Date*: Tue, 18 Mar 2003 02:21:27 -0500 (EST)
*Sender*: owner-wri-mathgroup at wolfram.com
>-----Original Message-----
>From: Aaron E. Hirsh [mailto:aehirsh at stanford.edu]
To: mathgroup at smc.vnet.net
>Sent: Monday, March 17, 2003 9:35 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg40060] [mg40050] Increase in efficiency with Module
>
>
>I would be grateful if someone could explain the difference in
>efficiency between the following two simple programs. ry is a list of
>length n. For n = 1000, the program using Module takes 1 second on my
>laptop, whereas the program without Module takes 75 seconds.
>
>ftauc = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
>
> Module[{i, j, a},
> i = 1;
> a = 0;
> Do[ j = i + 1;
> Do[
> If[ry[[i]] < ry[[j]], a++,
> If[ry[[i]] > ry[[j]], a--]];
> j++, {n - i}];
> i++, {n - 1}]; a
> ]]
>
>ftauc2 = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
> i = 1;
> a = 0;
> Do[j = i + 1;
> Do[
> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
> j++, {n - i}];
> i++, {n - 1}]; a
> ]
>
>thank you,
>--
>
>Aaron E. Hirsh
>Center for Computational Genetics and Biological Modeling
>Department of Biological Sciences
>Stanford University
>tel. (650) 723-4952
>fax. (650) 725-0180
>
Aaron,
if you inspect the compiled code ftauc[[4]] and ftauc2[[4]] you see
considerable differences which are obviously caused by access to the global
variables i, j and a (in ftauc). I cannot go through theses statements line
by line, merely refer to
http://library.wolfram.com/database/Conferences/4682/
Athough, this appears to be out of date now, a key sentence therein still
holds: "The main obstacle to achieving a speedup ... is when the compiled
code has to externally evaluate a function". From the compiled code for
ftauc2, I suppose, that access and updates of the global variables i,j,a is
done by calling an (external) function.
Further: "In structures such as Module or Block, local variables can be
declared. The compiled code will store value of such a variable in a
register, which can then be updated whenever the variable gets a new value.
Such use local variables!
But let me report some further observations:
First my timings were:
In[3]:= ry = Table[Random[], {333}];
In[4]:= ftauc2[ry, Length[ry]] // Timing
Out[4]= {12.618 Second, -846}
In[5]:= ftauc[ry, Length[ry]] // Timing
Out[5]= {0.16 Second, -846}
Performance of the uncompiled code:
In[6]:=
ftaucU[ry_, n_] := Module[{i, j, a}, i = 1;
a = 0;
Do[j = i + 1;
Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
j++, {n - i}];
i++, {n - 1}]; a]
In[7]:=
ftauc2U[ry_, n_] := (i = 1;
a = 0;
Do[j = i + 1;
Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
j++, {n - i}];
i++, {n - 1}]; a)
In[8]:= ftauc2U[ry, Length[ry]] // Timing
Out[8]= {3.895 Second, -846}
In[9]:= ftaucU[ry, Length[ry]] // Timing
Out[9]= {3.846 Second, -846}
Such the uncompiled code for ftauc2 is faster than the compiled one! And
ftaucU (ftauc uncompiled) is a little bit faster than ftauc2U. We might
expect this.
The Do-loops may be improved a bit:
In[19]:=
ftaucUS[ry_] := Module[{a = 0, n = Length[ry]},
Do[Do[If[ry[[i]] < ry[[j]], a++,
If[ry[[i]] > ry[[j]], a--]], {j, i + 1, n}],
{i, 1, n - 1}]; a]
In[20]:= ftaucUS[ry] // Timing
Out[20]= {3.345 Second, -846}
Still better code from using Sum:
In[33]:=
ftaucUSU[ry_] := Module[{n = Length[ry]},
Sum[Sum[If[ry[[i]] < ry[[j]], 1,
If[ry[[i]] > ry[[j]], -1]], {j, i + 1, n}],
{i, 1, n - 1}]]
In[34]:= ftaucUSU[ry] // Timing
Out[34]= {2.543 Second, -846}
Functional code is better though:
In[23]:=
-Plus @@ Flatten[
MapIndexed[Take[#1, First[#2]] &, Outer[Order, ry, ry]]] // Timing
Out[23]= {0.601 Second, -846}
In[24]:=
Plus @@ Flatten[
MapThread[
Thread[Unevaluated[Order[#1, #2]]] &, {ry,
NestList[Rest, ry, Length[ry] - 1]}]] // Timing
Out[24]= {0.33 Second, -846}
If we compile the Loops with Order (which was better when not compiled) the
compiled code is slower (due to the function call to Order):
In[25]:=
ftaucUSSc1 = Compile[{{ry, _Real, 1}},
Module[{a = 0, n = Length[ry]},
Do[Do[a += Order[ry[[i]], ry[[j]]],
{j, i + 1, n}], {i, 1, n - 1}]; a],
{{Order[_, _], _Integer}}]
In[26]:= Timing[ftaucUSSc1[ry]]
Out[26]= {0.461 Second, -846}
In[30]:=
ftaucSC = Compile[{{ry, _Real, 1}},
Module[{a = 0, n = Length[ry]},
Do[Do[If[ry[[i]] < ry[[j]], a++,
If[ry[[i]] > ry[[j]], a--]], {j, i + 1, n}],
{i, 1, n - 1}]; a]]
In[31]:= Timing[ftaucSC[ry]]
Out[31]= {0.13 Second, -846}
Compared to your ftauc
In[32]:= Timing[ftauc[ry, Length[ry]]]
Out[32]= {0.16 Second, -846}
corresponding to that small advantage of the uncompiled code.
--
Hartmut Wolf
Prev by Date:
**Re: Re: discretization and plotting pde system**
Next by Date:
**Re: Q: lists**
Previous by thread:
**Re: Increase in efficiency with Module**
Next by thread:
**Re: RE: Increase in efficiency with Module**
| |