Re: RE: Increase in efficiency with Module

*To*: mathgroup at smc.vnet.net*Subject*: [mg40086] Re: [mg40060] RE: [mg40050] Increase in efficiency with Module*From*: Dr Bob <drbob at bigfoot.com>*Date*: Wed, 19 Mar 2003 03:23:11 -0500 (EST)*References*: <200303180721.CAA12239@smc.vnet.net>*Reply-to*: drbob at bigfoot.com*Sender*: owner-wri-mathgroup at wolfram.com

I timed the two original codes, my slightly modified version of that (treat), Hartmut's two functional codes, and his fastest code, ftaucSC. There's a near tie between treat, ftaucSC, and the original ftauc (in that order). I didn't use Real random numbers, since ties seem to be a significant factor in the original code, but the winning functions are the same with Reals anyway. Some might find the following a useful guide to timing functions. I randomize the order in which functions are timed for each random dataset, output means and standard errors sorted from fastest to slowest, and show how to eliminate slow functions before going on to larger data sets and greater numbers of trials. <<Statistics`DescriptiveStatistics` allNames={"ftauc2","functional","functional2","ftauc","ftaucSC","treat"}; randomTrial[functions_List,n_Integer]:=Module[{data=Table[Random[Integer,50],{\ n}],order,results}, order=Ordering[Random[]&/@functions]; results=First@Timing[#@data]&/@functions[[order]]; results[[Ordering@order]]/Second ] randomTrials[names:{__String},size_Integer,trials_Integer]:= TableForm[ Transpose@Module[{functions=ToExpression/@names,times,ranking},times=\ Transpose@Table[randomTrial[functions,size],{trials}]; means=Mean/@times; ranking=Ordering@means; times=times[[ranking]]; {names[[ranking]],Mean/@times,StandardErrorOfSampleMean/@times} ],TableHeadings\[Rule]{None,{"Function","Mean","Std Error"}}] ftauc=Compile[{{ry,_Real,1}},Module[{i,j,a,n=Length@ry},i=1; a=0; Do[j=i+1; Do[If[ry[[i]]<ry[[j]],a++,If[ry[[i]]>ry[[j]],a--]]; j++,{n-i}]; i++,{n-1}];a]]; ftauc2=Compile[{{ry,_Real,1}},i=1; a=0;n=Length@ry; Do[j=i+1; Do[If[ry[[i]]<ry[[j]],a++,If[ry[[i]]>ry[[j]],a--]]; j++,{n-i}]; i++,{n-1}];a]; treat=Compile[{{ry,_Real,1}},Block[{i=0,j,a=0,n=Length@ry},Do[i++;j=i+1; Do[a+=Sign[ry[[j]]-ry[[i]]];j++,{n-i}],{n-1}];a]]; functional=Function[ ry,- Plus@@Flatten[MapIndexed[Take[#1,First[#2]]&,Outer[Order,ry,ry]]]]; functional2=Function[ry,Plus@@Flatten[MapThread[Thread[ Unevaluated[Order[#1,#2]]]&,{ry, NestList[Rest,ry,Length[ry]-1]}]]]; ftaucSC=Compile[{{ry,_Real,1}},Module[{a=0,n=Length[ry]},Do[Do[If[ry[[ i]]<ry[[j]],a++,If[ry[[i]]>ry[[j]],a--]],{j,i+1,n}],{i,1,n-1}];a]]; randomTrials[allNames,200,50] fastest=Drop[%[[All,1]],-1]; TableForm[{{"Function", "Mean", "Std Error"}, {"ftaucSC", 0.009999999999998295, 0.0010728561917437113}, {"treat", 0.011540000000001669, 0.0010744566733272203}, {"ftauc", 0.013420000000000981, 0.0007762231953372879}, {"functional2", 0.02447999999999979, 0.0011220389603625314}, {"functional", 0.04149999999999977, 0.001154317538458073}, {"ftauc2", 0.6374999999999994, 0.0026886382259807004}}, TableHeadings -> {None, {"Function", "Mean", "Std Error"}}] randomTrials[fastest, 500, 50] fastest = Drop[%[[All, 1]], -2]; TableForm[{{"Function", "Mean", "Std Error"}, {"treat", 0.06682000000000073, 0.0011152706269909904}, {"ftaucSC", 0.07189999999999941, 0.001092179995630839}, {"ftauc", 0.07916000000000281, 0.000679351431401268}, {"functional2", 0.1477999999999986, 0.001280943529797478}, {"functional", 0.2583799999999985, 0.0012713129963520206}}, TableHeadings -> {None, {"Function", "Mean", "Std Error"}}] randomTrials[fastest, 1000, 100] TableForm[{{"Function", "Mean", "Std Error"}, {"treat", 0.27246999999999955, 0.0009006681021902752}, {"ftaucSC", 0.29871000000000036, 0.0009034496179205452}, {"ftauc", 0.3101900000000001, 0.0011866849675129872}}, TableHeadings -> {None, {"Function", "Mean", "Std Error"}}] Bobby On Tue, 18 Mar 2003 02:21:27 -0500 (EST), Wolf, Hartmut <Hartmut.Wolf@t- systems.com> wrote: > >> -----Original Message----- >> From: Aaron E. Hirsh [mailto:aehirsh at stanford.edu] To: mathgroup at smc.vnet.net > To: mathgroup at smc.vnet.net >> Sent: Monday, March 17, 2003 9:35 AM >> To: mathgroup at smc.vnet.net >> Subject: [mg40086] [mg40060] [mg40050] Increase in efficiency with Module >> >> >> I would be grateful if someone could explain the difference in >> efficiency between the following two simple programs. ry is a list of >> length n. For n = 1000, the program using Module takes 1 second on my >> laptop, whereas the program without Module takes 75 seconds. >> >> ftauc = Compile[{{ry, _Real, 1}, {n, _Real, 0}}, >> Module[{i, j, a}, i = 1; >> a = 0; >> Do[ j = i + 1; Do[ >> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]]; >> j++, {n - i}]; >> i++, {n - 1}]; a >> ]] >> >> ftauc2 = Compile[{{ry, _Real, 1}, {n, _Real, 0}}, >> i = 1; >> a = 0; >> Do[j = i + 1; >> Do[ >> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]]; >> j++, {n - i}]; >> i++, {n - 1}]; a >> ] >> >> thank you, >> -- >> >> Aaron E. Hirsh >> Center for Computational Genetics and Biological Modeling >> Department of Biological Sciences >> Stanford University >> tel. (650) 723-4952 >> fax. (650) 725-0180 >> > > Aaron, > > if you inspect the compiled code ftauc[[4]] and ftauc2[[4]] you see > considerable differences which are obviously caused by access to the > global > variables i, j and a (in ftauc). I cannot go through theses statements > line > by line, merely refer to > http://library.wolfram.com/database/Conferences/4682/ > > Athough, this appears to be out of date now, a key sentence therein still > holds: "The main obstacle to achieving a speedup ... is when the compiled > code has to externally evaluate a function". From the compiled code for > ftauc2, I suppose, that access and updates of the global variables i,j,a > is > done by calling an (external) function. > > Further: "In structures such as Module or Block, local variables can be > declared. The compiled code will store value of such a variable in a > register, which can then be updated whenever the variable gets a new > value. > > > Such use local variables! > > > But let me report some further observations: > > First my timings were: > > In[3]:= ry = Table[Random[], {333}]; > > In[4]:= ftauc2[ry, Length[ry]] // Timing > Out[4]= {12.618 Second, -846} > > In[5]:= ftauc[ry, Length[ry]] // Timing > Out[5]= {0.16 Second, -846} > > > > Performance of the uncompiled code: > > In[6]:= > ftaucU[ry_, n_] := Module[{i, j, a}, i = 1; > a = 0; > Do[j = i + 1; > Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]]; > j++, {n - i}]; > i++, {n - 1}]; a] > > In[7]:= > ftauc2U[ry_, n_] := (i = 1; > a = 0; > Do[j = i + 1; > Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]]; > j++, {n - i}]; > i++, {n - 1}]; a) > > In[8]:= ftauc2U[ry, Length[ry]] // Timing > Out[8]= {3.895 Second, -846} > > In[9]:= ftaucU[ry, Length[ry]] // Timing > Out[9]= {3.846 Second, -846} > > Such the uncompiled code for ftauc2 is faster than the compiled one! And > ftaucU (ftauc uncompiled) is a little bit faster than ftauc2U. We might > expect this. > > > > The Do-loops may be improved a bit: > > In[19]:= > ftaucUS[ry_] := Module[{a = 0, n = Length[ry]}, Do[Do[If[ry[[i]] < > ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]], {j, i + 1, n}], {i, 1, n - > 1}]; a] > > In[20]:= ftaucUS[ry] // Timing > Out[20]= {3.345 Second, -846} > > > Still better code from using Sum: > > In[33]:= > ftaucUSU[ry_] := Module[{n = Length[ry]}, Sum[Sum[If[ry[[i]] < ry[[j]], > 1, If[ry[[i]] > ry[[j]], -1]], {j, i + 1, n}], {i, 1, n - 1}]] > > In[34]:= ftaucUSU[ry] // Timing > Out[34]= {2.543 Second, -846} > > > Functional code is better though: > > In[23]:= -Plus @@ Flatten[ > MapIndexed[Take[#1, First[#2]] &, Outer[Order, ry, ry]]] // Timing > > Out[23]= {0.601 Second, -846} > > > > In[24]:= Plus @@ Flatten[ > MapThread[ > Thread[Unevaluated[Order[#1, #2]]] &, {ry, NestList[Rest, ry, Length[ry] - > > > 1]}]] // Timing > > Out[24]= {0.33 Second, -846} > > > > If we compile the Loops with Order (which was better when not compiled) > the > compiled code is slower (due to the function call to Order): > > In[25]:= > ftaucUSSc1 = Compile[{{ry, _Real, 1}}, Module[{a = 0, n = Length[ry]}, > Do[Do[a += Order[ry[[i]], ry[[j]]], {j, i + 1, n}], {i, 1, n - 1}]; a], > {{Order[_, _], _Integer}}] > > In[26]:= Timing[ftaucUSSc1[ry]] > Out[26]= {0.461 Second, -846} > > > In[30]:= > ftaucSC = Compile[{{ry, _Real, 1}}, Module[{a = 0, n = Length[ry]}, > Do[Do[If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]], {j, i + 1, > n}], {i, 1, n - 1}]; a]] > > In[31]:= Timing[ftaucSC[ry]] > Out[31]= {0.13 Second, -846} > > > Compared to your ftauc > > In[32]:= Timing[ftauc[ry, Length[ry]]] > Out[32]= {0.16 Second, -846} > > corresponding to that small advantage of the uncompiled code. > > > -- > Hartmut Wolf > > > -- majort at cox-internet.com Bobby R. Treat

**References**:**RE: Increase in efficiency with Module***From:*"Wolf, Hartmut" <Hartmut.Wolf@t-systems.com>