RE: speed-up of a function
- To: mathgroup at smc.vnet.net
- Subject: [mg45117] RE: [mg45072] speed-up of a function
- From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
- Date: Wed, 17 Dec 2003 07:54:33 -0500 (EST)
- Sender: owner-wri-mathgroup at wolfram.com
>-----Original Message----- >From: Kastens at Hamburg.BAW.DE [mailto:Kastens at Hamburg.BAW.DE] To: mathgroup at smc.vnet.net >Sent: Tuesday, December 16, 2003 12:21 PM >To: mathgroup at smc.vnet.net >Subject: [mg45117] [mg45072] speed-up of a function > > >Hi! > >I have something like this: (two lists(sublistTNW1 and >sublistTNW2)), containing both time (Integer) and a value (Real)). >The following function pick-up the next dataset(time,value) >out of sublistTNW2 after the given time (sublistTNW1[[i,1]]). > >NextEvent[t_, list_] := Select[list, (#[[1]] >= t) &, 1] >For[i = 1, i < 500, >{ > NextEvent[sublistTNW1[[i, 1]], sublistTNW2] >}; i++] > >Calling the function in the for-loop very often (f.ex 500 >times or more) is indeed very slowly. > >Trying to compile the function >testfunc = > Compile[{{t, _Integer}, {list, _Real, 2}}, Select[list, >(#[[1]] >= t) &, 1]] > >takes no effect in speed-up. Perhabs I've used the >compile-Function not correctly? > >How can I use lists with different data types in compile? >{list, _Real, 2} (s.a.) is not exactly right, because the list >is in the format {_Integer,_Real}. > >Thanks for any suggestions, >marko > Marko, I make a model of your computation, hope it applys. I don't thing your computation make stoo much sense unless sublistTNW2 ist sorted. I also suppose sublistTNW1 is sorted. If not, try to reformulate your problem. Now the model: In[1]:= n0 = 100000; In[2]:= sublistTNW2 = Sort[Table[{Random[Integer, {1, n0}], Random[]}, {n0}]]; In[3]:= sublistTNW1 = Sort[Table[Random[Integer, {1, n0}], {500}]]; In[4]:= NextEvent[t_, list_] := Select[list, (#[[1]] >= t) &, 1] In[6]:= r = {}; For[i = 1, i <= 15, AppendTo[r, NextEvent[sublistTNW1[[i]], sublistTNW2]]; i++]; r = Flatten[r, 1] Out[6]= {{210, 0.0739654}, {290, 0.571214}, {294, 0.192693}, {304, 0.718815}, {701, 0.202688}, {1216, 0.140055}, {1276, 0.335399}, {1454, 0.593851}, {1472, 0.6723}, {1604, 0.271997}, {1905, 0.962603}, {1922, 0.74373}, {2009, 0.217664}, {2125, 0.584308}, {2584, 0.476056}} In[7]:= Length[r] Out[7]= 15 This is just to check for correctness for an alternative algorithm. Now we test your's: In[8]:= For[i = 1, i <= 50, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++] // Timing Out[8]= {4.587 Second, Null} In[9]:= For[i = 1, i <= 100, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++] // Timing Out[9]= {20.5 Second, Null} In[10]:= For[i = 1, i <= 150, NextEvent[sublistTNW1[[i]], sublistTNW2]; i++] // Timing Out[10]= {47.608 Second, Null} You see: you algorithm is O[n^2], which is prohibitive in this case. The reason is clear: for (sorted sublists) we start always at the beginning, such we get longer and longer scans. This simple trick continues scanning at the last hit: In[11]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 15] Out[11]= {{210, 0.0739654}, {290, 0.571214}, {294, 0.192693}, {304, 0.718815}, {701, 0.202688}, {1216, 0.140055}, {1276, 0.335399}, {1454, 0.593851}, {1472, 0.6723}, {1604, 0.271997}, {1905, 0.962603}, {1922, 0.74373}, {2009, 0.217664}, {2125, 0.584308}, {2584, 0.476056}} In[12]:= % == r Out[12]= True Same result as for the reference algorithm. Now test: In[17]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 50]; // Timing Out[17]= {0.38 Second, Null} In[18]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 100]; // Timing Out[18]= {0.811 Second, Null} In[19]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 150]; // Timing Out[19]= {1.142 Second, Null} In[20]:= i = 1; Select[sublistTNW2, If[#[[1]] >= sublistTNW1[[i]], ++i; True, False] &, 200]; // Timing Out[20]= {1.512 Second, Null} So the algorithm is linear (and scans sublistTNW2 only once, instead of repeating at beginning of sublistTNW2 over and over). Whether this algorithms is really appropriate depends on more properties of your problem, you didn't report. Just try it! Other strategies were to to binay search (sorted) sublistTNW2, or to splice in sublistTNW1 as markers, Sort and Split. Perhaps others... -- Hartmut Wolf