MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Increase in efficiency with Module

  • To: mathgroup at smc.vnet.net
  • Subject: [mg40117] Re: Increase in efficiency with Module
  • From: "Carl K. Woll" <carlw at u.washington.edu>
  • Date: Fri, 21 Mar 2003 02:36:58 -0500 (EST)
  • Organization: University of Washington
  • References: <8EB1C3634596D6118952006008F711CD5BCD2F@debis.com> <b5bjlf$5rk$1@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

Hi,

Since this thread seems to have transformed itself into a speed contest, I
thought I would throw in my 2 cents. Here is my attempt:

h[a_]:=(s+=Tr[Sign[a-First[a]]];Rest[a])

carl[t_]:=Block[{s=0},
 Nest[h,t,Length[t]];
 s]

Comparing to ftauc3PP (the fastest so far according to Bobby) we have:

test=Table[Random[],{1000}];

In[105]:=
ftauc3PP[test]//Timing
carl[test]//Timing
Out[105]=
{0.265 Second, 9362}
Out[106]=
{0.094 Second, 9362}

So, my uncompiled code is almost 3 times faster than ftauc3PP.

Carl Woll
Physics Dept
U of Washington

"Dr Bob" <drbob at bigfoot.com> wrote in message
news:b5bjlf$5rk$1 at smc.vnet.net...
> On lists of size 1000, the three fastest are now ftauc3, ftauc3P (5%
> faster), and ftauc3PP (another 8% faster).
>
> Bobby
>
> On Wed, 19 Mar 2003 09:11:54 +0100, Wolf, Hartmut <Hartmut.Wolf@t-
> systems.com> wrote:
>
> > Bobby,
> >
> > yes, I've missed a minor point, which I did not correct, as this had not
> > been the focus of the discussion:
> >
> > In[1]:= ry = Table[Random[], {1000}];
> > In[3]:=
> > ftaucSCP = Compile[{{ry, _Real, 1}}, Module[{a = 0, n = Length[ry]},
> > Do[Do[If[ry[[i]] < ry[[j]], ++a, If[ry[[i]] > ry[[j]], --a]], {j, i + 1,
> > n}], {i, 1, n - 1}]; a]];
> > In cases like this use PreDecrement and PreIncrement, instead of
> > Decrement
> > or Increment. This gives another performance boost of a percent or a
> > fraction. I'm not very experienced with the Mathematica Compiler (which,
> > BTW, generates special code to be interpreted, not C code), however my
> > observation is that code which is nearest to plain C-code compiles
> > fastest.
> > And the rule I put up above is an old rule for C programming. Nowadays,
> > however, a modern C-compiler recognizes the situation and optimizes the
> > generated machine code accordingly. And also, the extra machine cycle (a
> > register move) (if it were) needed for a++, compared to ++a gets
> > unnoticed
> > with a modern CPU (their problems are now quite different). But this
does
> > not apply to mathematica compiled code, as it is interpreted, sic!
> >
> >
> > In[4]:= Do[ftauc3[ry], {10}] // Timing
> > Out[4]= {12.168 Second, Null}
> > In[5]:= Do[ftaucSCP[ry],{10}]//Timing
> > Out[5]= {11.616 Second, Null}
> >
> >
> > Applying this to your code:
> >
> > In[20]:=
> > ftauc3P = Compile[{{ry, _Real, 1}}, Block[{i = 0, j, a = 0, n =
> > Length@ry}, Do[j = ++i;
> > Do[a += Sign[ry[[++j]] - ry[[i]]], {n - i}],
> > {n - 1}];
> > a]];
> >
> > In[21]:= Do[ftauc3P[ry], {10}] // Timing
> > Out[21]= {11.837 Second, Null}
> >
> >
> > The warning as not to use external functions does not apply to your
code,
> > as
> > Sign is one of those functions which are compiled inline.
> >
> > Other improvement (Do loops) of your code:
> >
> > In[30]:=
> > ftauc3PP = Compile[{{ry, _Real, 1}},
> > Block[{a = 0, n = Length[ry]},
> > Do[
> > Do[a += Sign[ry[[j]] - ry[[i]]], {j, i + 1, n}],
> > {i, 1, n - 1}];
> > a]];
> >
> >
> > In[31]:= Do[ftauc3PP[ry],{10}]//Timing
> > Out[31]= {10.686 Second, Null}
> >
> > That seems to be it!
> >
> >
> >
> > Hartmut
> >
> >
> >
> >> -----Original Message-----
> >> From: Dr Bob [mailto:drbob at bigfoot.com]
To: mathgroup at smc.vnet.net
> >> Sent: Tuesday, March 18, 2003 9:55 PM
> >> To: Wolf, Hartmut; mathgroup at smc.vnet.net; aehirsh at stanford.edu
> >> Subject: [mg40117] Re:  Increase in efficiency with Module
> >>
> >>
> >> I timed the two original codes, my slightly modified version of that
> >> (treat), Hartmut's two functional codes, and his fastest code, ftaucSC.
> >> There's a near tie between treat, ftaucSC, and the original ftauc (in
> >> that order).  I didn't use Real random numbers, since ties seem to be a
> >> significant factor in the original code, but the winning functions are
> >> the same with Reals anyway.
> >>
> >> Some might find the following a useful guide to timing functions.  I
> >> randomize the order in which functions are timed for each random
> >> dataset, output means and standard errors sorted from fastest to
> >> slowest, and show how to eliminate slow functions before going on to
> >> larger data sets and greater numbers of trials.
> >>
> >> <<Statistics`DescriptiveStatistics`
> >> allNames={"ftauc2","functional","functional2","ftauc","ftaucSC"
> >> ,"treat"};
> >>
randomTrial[functions_List,n_Integer]:=Module[{data=Table[Random[Integer,50
> > ],{\
> >> n}],order,results},
> >> order=Ordering[Random[]&/@functions];
> >> results=First@Timing[#@data]&/@functions[[order]];
> >> results[[Ordering@order]]/Second
> >> ]
> >> randomTrials[names:{__String},size_Integer,trials_Integer]:=
> >> TableForm[
> >> Transpose@Module[{functions=ToExpression/@names,times,ranking},times=\
> >> Transpose@Table[randomTrial[functions,size],{trials}];
> >> means=Mean/@times;
> >> ranking=Ordering@means;
> >> times=times[[ranking]];
> >> {names[[ranking]],Mean/@times,StandardErrorOfSampleMean/@times}
> >> ],TableHeadings\[Rule]{None,{"Function","Mean","Std Error"}}]
> >>
> > [...]
> >
> >> treat=Compile[{{ry,_Real,1}},Block[{i=0,j,a=0,n=Length@ry},
> >> Do[i++;j=i+1;
> >> Do[a+=Sign[ry[[j]]-ry[[i]]];j++,{n-i}],{n-1}];a]];
> > [...]
> >
> >>
> >> randomTrials[fastest, 1000, 100]
> >>
> >> TableForm[
> > {{"Function", "Mean", "Std Error"}, {"treat", 0.27246999999999955,
> > 0.0009006681021902752},
> > {"ftaucSC", 0.29871000000000036, 0.0009034496179205452}, {"ftauc",
> > 0.3101900000000001, 0.0011866849675129872}},
> >> TableHeadings -> {None, {"Function", "Mean", "Std Error"}}]
> >>
> >> Bobby
> >>
> >> On Tue, 18 Mar 2003 02:21:27 -0500 (EST), Wolf, Hartmut
<Hartmut.Wolf@t-
> >> systems.com> wrote:
> >>
> >>>
> >>>> -----Original Message-----
> >>>> From: Aaron E. Hirsh [mailto:aehirsh at stanford.edu]
To: mathgroup at smc.vnet.net
> >>> To: mathgroup at smc.vnet.net
> >>>> Sent: Monday, March 17, 2003 9:35 AM
> >>>> To: mathgroup at smc.vnet.net
> >>>> Subject: [mg40117]  Increase in efficiency with Module
> >>>>
> >>>>
> >>>> I would be grateful if someone could explain the difference in
> >>>> efficiency between the following two simple programs. ry is a list of
> >>>> length n. For n = 1000, the program using Module takes 1 second on my
> >>>> laptop, whereas the program without Module takes 75 seconds.
> >>>>
> >>>> ftauc = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
> >>>> Module[{i, j, a},     i = 1;
> >>>> a = 0;
> >>>> Do[ j = i + 1;        Do[
> >>>> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
> >>>> j++, {n - i}];
> >>>> i++, {n - 1}]; a
> >>>> ]]
> >>>>
> >>>> ftauc2 = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
> >>>> i = 1;
> >>>> a = 0;
> >>>> Do[j = i + 1;
> >>>> Do[
> >>>> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
> >>>> j++, {n - i}];
> >>>> i++, {n - 1}]; a
> >>>> ]
> >>>>
> >>>> thank you,
> >>>> --
> >>>>
> >>>> Aaron E. Hirsh
> >>>> Center for Computational Genetics and Biological Modeling
> >>>> Department of Biological Sciences
> >>>> Stanford University
> >>>> tel. (650) 723-4952
> >>>> fax. (650) 725-0180
> >>>>
> >>>
> >>> Aaron,
> >>>
> >>> if you inspect the compiled code ftauc[[4]] and ftauc2[[4]] you see
> >>> considerable differences which are obviously caused by access to the
> > global
> >>> variables i, j and a (in ftauc). I cannot go through theses statements
> >>> line by line, merely refer to
> >>> http://library.wolfram.com/database/Conferences/4682/
> >>>
> >>> Although, this appears to be out of date now, a key sentence therein
> > still
> >>> holds: "The main obstacle to achieving a speedup ... is when the
> >>> compiled
> >>> code has to externally evaluate a function". From the compiled code
for
> >>> ftauc2, I suppose, that access and updates of the global variables
> >>> i,j,a is done by calling an (external) function.
> >>>
> >>> Further: "In structures such as Module or Block, local variables can
be
> >>> declared. The compiled code will store value of such a variable in a
> >>> register, which can then be updated whenever the variable gets a new
> >>> value.
> >>>
> >>>
> >>> Such use local variables!
> >>>
> >>>
> > [...]
> >>> --
> >>> Hartmut Wolf
> >>>
> > [...]
> >> -- majort at cox-internet.com
> >> Bobby R. Treat
> >>
> >
> >
>
>
>
> --
> majort at cox-internet.com
> Bobby R. Treat
>
>




  • Prev by Date: Re: Increase in efficiency with Module
  • Next by Date: Re: a challenge/problem.
  • Previous by thread: Re: Increase in efficiency with Module
  • Next by thread: Comparison of Mathematica on Various Computers