Re: Increase in efficiency with Module
- To: mathgroup at smc.vnet.net
- Subject: [mg40117] Re: Increase in efficiency with Module
- From: "Carl K. Woll" <carlw at u.washington.edu>
- Date: Fri, 21 Mar 2003 02:36:58 -0500 (EST)
- Organization: University of Washington
- References: <8EB1C3634596D6118952006008F711CD5BCD2F@debis.com> <b5bjlf$5rk$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Hi,

Since this thread seems to have transformed itself into a speed contest, I thought I would throw in my 2 cents. Here is my attempt:

h[a_] := (s += Tr[Sign[a - First[a]]]; Rest[a])

carl[t_] := Block[{s = 0}, Nest[h, t, Length[t]]; s]

Comparing to ftauc3PP (the fastest so far according to Bobby) we have:

test = Table[Random[], {1000}];

In[105]:= ftauc3PP[test] // Timing
          carl[test] // Timing

Out[105]= {0.265 Second, 9362}
Out[106]= {0.094 Second, 9362}

So, my uncompiled code is almost 3 times faster than ftauc3PP.

Carl Woll
Physics Dept
U of Washington
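A quick way to see what the nested h accumulates: each application adds the signed comparisons of the current first element against everything after it, so s ends up equal to the plain double sum of Sign[ry[[j]] - ry[[i]]] over ordered pairs i < j, which is the same quantity the compiled loops count. A minimal check; the helper name pairSignSum below is illustrative only, not code from the thread:

(* Illustrative helper: the same count computed as a direct double sum
   over ordered pairs i < j. *)
pairSignSum[ry_List] :=
  Sum[Sign[ry[[j]] - ry[[i]]], {i, 1, Length[ry] - 1}, {j, i + 1, Length[ry]}]

(* Both approaches should agree on a small sample: *)
small = Table[Random[], {100}];
carl[small] == pairSignSum[small]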
"Dr Bob" <drbob at bigfoot.com> wrote in message news:b5bjlf$5rk$1 at smc.vnet.net...
> On lists of size 1000, the three fastest are now ftauc3, ftauc3P (5%
> faster), and ftauc3PP (another 8% faster).
>
> Bobby
>
> On Wed, 19 Mar 2003 09:11:54 +0100, Wolf, Hartmut
> <Hartmut.Wolf@t-systems.com> wrote:
>
> > Bobby,
> >
> > Yes, I missed a minor point, which I did not correct, as it had not
> > been the focus of the discussion:
> >
> > In[1]:= ry = Table[Random[], {1000}];
> >
> > In[3]:= ftaucSCP = Compile[{{ry, _Real, 1}},
> >     Module[{a = 0, n = Length[ry]},
> >       Do[Do[If[ry[[i]] < ry[[j]], ++a, If[ry[[i]] > ry[[j]], --a]],
> >           {j, i + 1, n}], {i, 1, n - 1}];
> >       a]];
> >
> > In cases like this, use PreDecrement and PreIncrement instead of
> > Decrement or Increment. This gives another performance boost of a
> > percent or a fraction thereof. I am not very experienced with the
> > Mathematica compiler (which, BTW, generates special code to be
> > interpreted, not C code), but my observation is that code closest to
> > plain C compiles fastest. The rule above is an old rule for C
> > programming. Nowadays a modern C compiler recognizes the situation and
> > optimizes the generated machine code accordingly, and the extra machine
> > cycle (a register move), if any, needed for a++ compared to ++a goes
> > unnoticed on a modern CPU (their problems are now quite different). But
> > this does not apply to Mathematica compiled code, as it is interpreted!
> >
> > In[4]:= Do[ftauc3[ry], {10}] // Timing
> > Out[4]= {12.168 Second, Null}
> >
> > In[5]:= Do[ftaucSCP[ry], {10}] // Timing
> > Out[5]= {11.616 Second, Null}
> >
> > Applying this to your code:
> >
> > In[20]:= ftauc3P = Compile[{{ry, _Real, 1}},
> >     Block[{i = 0, j, a = 0, n = Length@ry},
> >       Do[j = ++i;
> >         Do[a += Sign[ry[[++j]] - ry[[i]]], {n - i}],
> >         {n - 1}];
> >       a]];
> >
> > In[21]:= Do[ftauc3P[ry], {10}] // Timing
> > Out[21]= {11.837 Second, Null}
> >
> > The warning not to use external functions does not apply to your code,
> > as Sign is one of the functions that are compiled inline.
> >
> > Another improvement (Do loops) to your code:
> >
> > In[30]:= ftauc3PP = Compile[{{ry, _Real, 1}},
> >     Block[{a = 0, n = Length[ry]},
> >       Do[
> >         Do[a += Sign[ry[[j]] - ry[[i]]], {j, i + 1, n}],
> >         {i, 1, n - 1}];
> >       a]];
> >
> > In[31]:= Do[ftauc3PP[ry], {10}] // Timing
> > Out[31]= {10.686 Second, Null}
> >
> > That seems to be it!
> >
> > Hartmut
> >
> >> -----Original Message-----
> >> From: Dr Bob [mailto:drbob at bigfoot.com]
> >> Sent: Tuesday, March 18, 2003 9:55 PM
> >> To: Wolf, Hartmut; mathgroup at smc.vnet.net; aehirsh at stanford.edu
> >> Subject: [mg40117] Re: Increase in efficiency with Module
> >>
> >> I timed the two original codes, my slightly modified version of that
> >> (treat), Hartmut's two functional codes, and his fastest code, ftaucSC.
> >> There's a near tie between treat, ftaucSC, and the original ftauc (in
> >> that order). I didn't use Real random numbers, since ties seem to be a
> >> significant factor in the original code, but the winning functions are
> >> the same with Reals anyway.
> >>
> >> Some might find the following a useful guide to timing functions. I
> >> randomize the order in which the functions are timed for each random
> >> dataset, output means and standard errors sorted from fastest to
> >> slowest, and show how to eliminate slow functions before going on to
> >> larger data sets and greater numbers of trials.
> >>
> >> << Statistics`DescriptiveStatistics`
> >>
> >> allNames = {"ftauc2", "functional", "functional2", "ftauc", "ftaucSC", "treat"};
> >>
> >> randomTrial[functions_List, n_Integer] :=
> >>   Module[{data = Table[Random[Integer, 50], {n}], order, results},
> >>     order = Ordering[Random[] & /@ functions];
> >>     results = First@Timing[#@data] & /@ functions[[order]];
> >>     results[[Ordering@order]]/Second]
> >>
> >> randomTrials[names : {__String}, size_Integer, trials_Integer] :=
> >>   TableForm[
> >>     Transpose@Module[{functions = ToExpression /@ names, times, means, ranking},
> >>       times = Transpose@Table[randomTrial[functions, size], {trials}];
> >>       means = Mean /@ times;
> >>       ranking = Ordering@means;
> >>       times = times[[ranking]];
> >>       {names[[ranking]], Mean /@ times, StandardErrorOfSampleMean /@ times}],
> >>     TableHeadings -> {None, {"Function", "Mean", "Std Error"}}]
> >>
> > [...]
> >
> >> treat = Compile[{{ry, _Real, 1}},
> >>   Block[{i = 0, j, a = 0, n = Length@ry},
> >>     Do[i++; j = i + 1;
> >>       Do[a += Sign[ry[[j]] - ry[[i]]]; j++, {n - i}], {n - 1}];
> >>     a]];
> > [...]
> >
> >> randomTrials[fastest, 1000, 100]
> >>
> >> TableForm[
> >>   {{"Function", "Mean", "Std Error"},
> >>    {"treat", 0.27246999999999955, 0.0009006681021902752},
> >>    {"ftaucSC", 0.29871000000000036, 0.0009034496179205452},
> >>    {"ftauc", 0.3101900000000001, 0.0011866849675129872}},
> >>   TableHeadings -> {None, {"Function", "Mean", "Std Error"}}]
> >>
> >> Bobby
> >>
> >> On Tue, 18 Mar 2003 02:21:27 -0500 (EST), Wolf, Hartmut
> >> <Hartmut.Wolf@t-systems.com> wrote:
> >>
> >>>
> >>>> -----Original Message-----
> >>>> From: Aaron E. Hirsh [mailto:aehirsh at stanford.edu]
> >>>> Sent: Monday, March 17, 2003 9:35 AM
> >>>> To: mathgroup at smc.vnet.net
> >>>> Subject: [mg40117] Increase in efficiency with Module
> >>>>
> >>>> I would be grateful if someone could explain the difference in
> >>>> efficiency between the following two simple programs. ry is a list of
> >>>> length n. For n = 1000, the program using Module takes 1 second on my
> >>>> laptop, whereas the program without Module takes 75 seconds.
> >>>>
> >>>> ftauc = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
> >>>>     Module[{i, j, a},
> >>>>       i = 1;
> >>>>       a = 0;
> >>>>       Do[j = i + 1;
> >>>>         Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
> >>>>           j++, {n - i}];
> >>>>         i++, {n - 1}];
> >>>>       a]]
> >>>>
> >>>> ftauc2 = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
> >>>>     i = 1;
> >>>>     a = 0;
> >>>>     Do[j = i + 1;
> >>>>       Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
> >>>>         j++, {n - i}];
> >>>>       i++, {n - 1}];
> >>>>     a]
> >>>>
> >>>> thank you,
> >>>> --
> >>>>
> >>>> Aaron E. Hirsh
> >>>> Center for Computational Genetics and Biological Modeling
> >>>> Department of Biological Sciences
> >>>> Stanford University
> >>>> tel. (650) 723-4952
> >>>> fax. (650) 725-0180
> >>>>
> >>>
> >>> Aaron,
> >>>
> >>> If you inspect the compiled code, ftauc[[4]] and ftauc2[[4]], you see
> >>> considerable differences, which are obviously caused by access to the
> >>> global variables i, j, and a (in ftauc2). I cannot go through these
> >>> statements line by line; I merely refer to
> >>> http://library.wolfram.com/database/Conferences/4682/
> >>>
> >>> Although this appears to be out of date now, a key sentence therein
> >>> still holds: "The main obstacle to achieving a speedup ... is when the
> >>> compiled code has to externally evaluate a function". From the
> >>> compiled code for ftauc2, I suppose that access to and updates of the
> >>> global variables i, j, and a are done by calling an (external)
> >>> function.
> >>>
> >>> Further: "In structures such as Module or Block, local variables can
> >>> be declared. The compiled code will store the value of such a variable
> >>> in a register, which can then be updated whenever the variable gets a
> >>> new value."
> >>>
> >>> So use local variables!
> >>>
> > [...]
> >>> --
> >>> Hartmut Wolf
> >>>
> > [...]
> >> --
> >> majort at cox-internet.com
> >> Bobby R. Treat
> >
>
> --
> majort at cox-internet.com
> Bobby R. Treat
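As a footnote to Hartmut's point about inspecting the compiled code: part 4 of a CompiledFunction holds its instruction list, and in later Mathematica versions the CompiledFunctionTools` context provides a readable listing of it. A minimal sketch, assuming ftauc and ftauc2 have been defined as in Aaron's post:

(* Raw instruction lists: ftauc2's version falls back to the main evaluator
   for the global i, j, a (as Hartmut suggests), while ftauc keeps its
   Module locals in registers. *)
ftauc[[4]]
ftauc2[[4]]

(* In later Mathematica versions, CompilePrint shows a readable listing: *)
Needs["CompiledFunctionTools`"]
CompilePrint[ftauc]
CompilePrint[ftauc2]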