[Date Index]
[Thread Index]
[Author Index]
Re: RE: Increase in efficiency with Module
*To*: mathgroup at smc.vnet.net
*Subject*: [mg40096] Re: [mg40060] RE: [mg40050] Increase in efficiency with Module
*From*: Dr Bob <drbob at bigfoot.com>
*Date*: Thu, 20 Mar 2003 03:33:07 -0500 (EST)
*References*: <8EB1C3634596D6118952006008F711CD5BCD2F@debis.com>
*Reply-to*: drbob at bigfoot.com
*Sender*: owner-wri-mathgroup at wolfram.com
On lists of size 1000, the three fastest are now ftauc3, ftauc3P (5%
faster), and ftauc3PP (another 8% faster).
Bobby
On Wed, 19 Mar 2003 09:11:54 +0100, Wolf, Hartmut <Hartmut.Wolf@t-
systems.com> wrote:
> Bobby,
>
> yes, I've missed a minor point, which I did not correct, as this had not
> been the focus of the discussion:
>
> In[1]:= ry = Table[Random[], {1000}];
> In[3]:=
> ftaucSCP = Compile[{{ry, _Real, 1}}, Module[{a = 0, n = Length[ry]},
> Do[Do[If[ry[[i]] < ry[[j]], ++a, If[ry[[i]] > ry[[j]], --a]], {j, i + 1,
> n}], {i, 1, n - 1}]; a]];
> In cases like this use PreDecrement and PreIncrement, instead of
> Decrement
> or Increment. This gives another performance boost of a percent or a
> fraction. I'm not very experienced with the Mathematica Compiler (which,
> BTW, generates special code to be interpreted, not C code), however my
> observation is that code which is nearest to plain C-code compiles
> fastest.
> And the rule I put up above is an old rule for C programming. Nowadays,
> however, a modern C-compiler recognizes the situation and optimizes the
> generated machine code accordingly. And also, the extra machine cycle (a
> register move) (if it were) needed for a++, compared to ++a gets
> unnoticed
> with a modern CPU (their problems are now quite different). But this does
> not apply to mathematica compiled code, as it is interpreted, sic!
>
>
> In[4]:= Do[ftauc3[ry], {10}] // Timing
> Out[4]= {12.168 Second, Null}
> In[5]:= Do[ftaucSCP[ry],{10}]//Timing
> Out[5]= {11.616 Second, Null}
>
>
> Applying this to your code:
>
> In[20]:=
> ftauc3P = Compile[{{ry, _Real, 1}}, Block[{i = 0, j, a = 0, n =
> Length@ry}, Do[j = ++i;
> Do[a += Sign[ry[[++j]] - ry[[i]]], {n - i}],
> {n - 1}];
> a]];
>
> In[21]:= Do[ftauc3P[ry], {10}] // Timing
> Out[21]= {11.837 Second, Null}
>
>
> The warning as not to use external functions does not apply to your code,
> as
> Sign is one of those functions which are compiled inline.
>
> Other improvement (Do loops) of your code:
>
> In[30]:=
> ftauc3PP = Compile[{{ry, _Real, 1}},
> Block[{a = 0, n = Length[ry]},
> Do[
> Do[a += Sign[ry[[j]] - ry[[i]]], {j, i + 1, n}],
> {i, 1, n - 1}];
> a]];
>
>
> In[31]:= Do[ftauc3PP[ry],{10}]//Timing
> Out[31]= {10.686 Second, Null}
>
> That seems to be it!
>
>
>
> Hartmut
>
>
>
>> -----Original Message-----
>> From: Dr Bob [mailto:drbob at bigfoot.com]
To: mathgroup at smc.vnet.net
>> Sent: Tuesday, March 18, 2003 9:55 PM
>> To: Wolf, Hartmut; mathgroup at smc.vnet.net; aehirsh at stanford.edu
>> Subject: [mg40096] Re: [mg40060] RE: [mg40050] Increase in efficiency with Module
>>
>>
>> I timed the two original codes, my slightly modified version of that
>> (treat), Hartmut's two functional codes, and his fastest code, ftaucSC.
>> There's a near tie between treat, ftaucSC, and the original ftauc (in
>> that order). I didn't use Real random numbers, since ties seem to be a
>> significant factor in the original code, but the winning functions are
>> the same with Reals anyway.
>>
>> Some might find the following a useful guide to timing functions. I
>> randomize the order in which functions are timed for each random
>> dataset, output means and standard errors sorted from fastest to
>> slowest, and show how to eliminate slow functions before going on to
>> larger data sets and greater numbers of trials.
>>
>> <<Statistics`DescriptiveStatistics`
>> allNames={"ftauc2","functional","functional2","ftauc","ftaucSC"
>> ,"treat"};
>> randomTrial[functions_List,n_Integer]:=Module[{data=Table[Random[Integer,50
> ],{\
>> n}],order,results},
>> order=Ordering[Random[]&/@functions];
>> results=First@Timing[#@data]&/@functions[[order]];
>> results[[Ordering@order]]/Second
>> ]
>> randomTrials[names:{__String},size_Integer,trials_Integer]:=
>> TableForm[
>> Transpose@Module[{functions=ToExpression/@names,times,ranking},times=\
>> Transpose@Table[randomTrial[functions,size],{trials}];
>> means=Mean/@times;
>> ranking=Ordering@means;
>> times=times[[ranking]];
>> {names[[ranking]],Mean/@times,StandardErrorOfSampleMean/@times}
>> ],TableHeadings\[Rule]{None,{"Function","Mean","Std Error"}}]
>>
> [...]
>
>> treat=Compile[{{ry,_Real,1}},Block[{i=0,j,a=0,n=Length@ry},
>> Do[i++;j=i+1;
>> Do[a+=Sign[ry[[j]]-ry[[i]]];j++,{n-i}],{n-1}];a]];
> [...]
>
>>
>> randomTrials[fastest, 1000, 100]
>>
>> TableForm[
> {{"Function", "Mean", "Std Error"}, {"treat", 0.27246999999999955,
> 0.0009006681021902752},
> {"ftaucSC", 0.29871000000000036, 0.0009034496179205452}, {"ftauc",
> 0.3101900000000001, 0.0011866849675129872}},
>> TableHeadings -> {None, {"Function", "Mean", "Std Error"}}]
>>
>> Bobby
>>
>> On Tue, 18 Mar 2003 02:21:27 -0500 (EST), Wolf, Hartmut <Hartmut.Wolf@t-
>> systems.com> wrote:
>>
>>>
>>>> -----Original Message-----
>>>> From: Aaron E. Hirsh [mailto:aehirsh at stanford.edu]
To: mathgroup at smc.vnet.net
>>> To: mathgroup at smc.vnet.net
>>>> Sent: Monday, March 17, 2003 9:35 AM
>>>> To: mathgroup at smc.vnet.net
>>>> Subject: [mg40096] [mg40060] [mg40050] Increase in efficiency with Module
>>>>
>>>>
>>>> I would be grateful if someone could explain the difference in
>>>> efficiency between the following two simple programs. ry is a list of
>>>> length n. For n = 1000, the program using Module takes 1 second on my
>>>> laptop, whereas the program without Module takes 75 seconds.
>>>>
>>>> ftauc = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
>>>> Module[{i, j, a}, i = 1;
>>>> a = 0;
>>>> Do[ j = i + 1; Do[
>>>> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
>>>> j++, {n - i}];
>>>> i++, {n - 1}]; a
>>>> ]]
>>>>
>>>> ftauc2 = Compile[{{ry, _Real, 1}, {n, _Real, 0}},
>>>> i = 1;
>>>> a = 0;
>>>> Do[j = i + 1;
>>>> Do[
>>>> If[ry[[i]] < ry[[j]], a++, If[ry[[i]] > ry[[j]], a--]];
>>>> j++, {n - i}];
>>>> i++, {n - 1}]; a
>>>> ]
>>>>
>>>> thank you,
>>>> --
>>>>
>>>> Aaron E. Hirsh
>>>> Center for Computational Genetics and Biological Modeling
>>>> Department of Biological Sciences
>>>> Stanford University
>>>> tel. (650) 723-4952
>>>> fax. (650) 725-0180
>>>>
>>>
>>> Aaron,
>>>
>>> if you inspect the compiled code ftauc[[4]] and ftauc2[[4]] you see
>>> considerable differences which are obviously caused by access to the
> global
>>> variables i, j and a (in ftauc). I cannot go through theses statements
>>> line by line, merely refer to
>>> http://library.wolfram.com/database/Conferences/4682/
>>>
>>> Although, this appears to be out of date now, a key sentence therein
> still
>>> holds: "The main obstacle to achieving a speedup ... is when the
>>> compiled
>>> code has to externally evaluate a function". From the compiled code for
>>> ftauc2, I suppose, that access and updates of the global variables
>>> i,j,a is done by calling an (external) function.
>>>
>>> Further: "In structures such as Module or Block, local variables can be
>>> declared. The compiled code will store value of such a variable in a
>>> register, which can then be updated whenever the variable gets a new
>>> value.
>>>
>>>
>>> Such use local variables!
>>>
>>>
> [...]
>>> --
>>> Hartmut Wolf
>>>
> [...]
>> -- majort at cox-internet.com
>> Bobby R. Treat
>>
>
>
--
majort at cox-internet.com
Bobby R. Treat
Prev by Date:
**RE: RE: Increase in efficiency with Module**
Next by Date:
**Re: combinatorics problem**
Previous by thread:
**RE: RE: Increase in efficiency with Module**
Next by thread:
**Re: Increase in efficiency with Module**
| |