MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

RE: speed-up of a function

  • To: mathgroup at smc.vnet.net
  • Subject: [mg45117] RE: [mg45072] speed-up of a function
  • From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
  • Date: Wed, 17 Dec 2003 07:54:33 -0500 (EST)
  • Sender: owner-wri-mathgroup at wolfram.com

>-----Original Message-----
>From: Kastens at Hamburg.BAW.DE [mailto:Kastens at Hamburg.BAW.DE]
To: mathgroup at smc.vnet.net
>Sent: Tuesday, December 16, 2003 12:21 PM
>To: mathgroup at smc.vnet.net
>Subject: [mg45117] [mg45072] speed-up of a function
>
>
>Hi!
>
>I have something like this: (two lists(sublistTNW1 and 
>sublistTNW2)), containing both time (Integer) and a value (Real)).
>The following function pick-up the next dataset(time,value) 
>out of sublistTNW2 after the given time (sublistTNW1[[i,1]]).
>
>NextEvent[t_, list_] := Select[list, (#[[1]] >= t) &, 1]
>For[i = 1, i < 500, 
>{
>      NextEvent[sublistTNW1[[i, 1]], sublistTNW2]
>}; i++]
>
>Calling the function in the for-loop very often (f.ex 500 
>times or more) is indeed very slowly. 
>
>Trying to compile the function
>testfunc = 
>  Compile[{{t, _Integer}, {list, _Real, 2}}, Select[list, 
>(#[[1]] >= t) &, 1]]
>
>takes no effect in speed-up. Perhabs I've used the 
>compile-Function not correctly?
>
>How can I use lists with different data types in compile? 
>{list, _Real, 2} (s.a.) is not exactly right, because the list 
>is in the format {_Integer,_Real}.
>
>Thanks for any suggestions,
>marko
>


Marko,

I make a model of your computation, hope it applys. 

I don't thing your computation make stoo much sense unless sublistTNW2 ist
sorted. I also suppose sublistTNW1 is sorted. If not, try to reformulate
your problem.

Now the model:
 
In[1]:= n0 = 100000;
In[2]:=
sublistTNW2 = Sort[Table[{Random[Integer, {1, n0}], Random[]}, {n0}]];
In[3]:=
sublistTNW1 = Sort[Table[Random[Integer, {1, n0}], {500}]];

In[4]:= NextEvent[t_, list_] := Select[list, (#[[1]] >= t) &, 1]


In[6]:= r = {}; For[i = 1, i <= 15, 
  AppendTo[r, NextEvent[sublistTNW1[[i]], sublistTNW2]]; i++]; r = 
  Flatten[r, 1]
Out[6]=
{{210, 0.0739654}, {290, 0.571214}, {294, 0.192693}, {304, 0.718815}, {701, 
    0.202688}, {1216, 0.140055}, {1276, 0.335399}, {1454, 0.593851}, {1472, 
    0.6723}, {1604, 0.271997}, {1905, 0.962603}, {1922, 0.74373}, {2009, 
    0.217664}, {2125, 0.584308}, {2584, 0.476056}}

In[7]:= Length[r]
Out[7]= 15

This is just to check for correctness for an alternative algorithm.


Now we test your's:

In[8]:= For[i = 1, i <= 50, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++]
// Timing
Out[8]= {4.587 Second, Null}

In[9]:= For[i = 1, i <= 100, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++]
// Timing
Out[9]= {20.5 Second, Null}

In[10]:= For[i = 1, i <= 150, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++]
// Timing
Out[10]= {47.608 Second, Null}

You see: you algorithm is O[n^2], which is prohibitive in this case.


The reason is clear: for (sorted sublists) we start always at the beginning,
such we get longer and longer scans.



This simple trick continues scanning at the last hit:
 
In[11]:= 
i = 1; Select[sublistTNW2, 
  If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 15]
Out[11]=
{{210, 0.0739654}, {290, 0.571214}, {294, 0.192693}, {304, 0.718815}, {701, 
    0.202688}, {1216, 0.140055}, {1276, 0.335399}, {1454, 0.593851}, {1472, 
    0.6723}, {1604, 0.271997}, {1905, 0.962603}, {1922, 0.74373}, {2009, 
    0.217664}, {2125, 0.584308}, {2584, 0.476056}}

In[12]:= % == r
Out[12]= True

Same result as for the reference algorithm.

Now test:
 
In[17]:=
i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False]
&,
     50]; // Timing
Out[17]= {0.38 Second, Null}

In[18]:=
i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False]
&,
     100]; // Timing
Out[18]= {0.811 Second, Null}

In[19]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i;
True, False] &,
     150]; // Timing
Out[19]= {1.142 Second, Null}

In[20]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i;
True, False] &,
     200]; // Timing
Out[20]= {1.512 Second, Null}


So the algorithm is linear (and scans sublistTNW2 only once, instead of
repeating at beginning of sublistTNW2 over and over).


Whether this algorithms is really appropriate depends on more properties of
your problem, you didn't report. Just try it!


Other strategies were to to binay search (sorted) sublistTNW2, or to splice
in sublistTNW1 as markers, Sort and Split. Perhaps others...


--
Hartmut Wolf


  • Prev by Date: Re: Compile
  • Next by Date: RE: Re: Compile
  • Previous by thread: speed-up of a function
  • Next by thread: RE: RE: speed-up of a function