Re: Speed challenge: Improve on integer frequencies from Count?
- To: mathgroup at smc.vnet.net
- Subject: [mg67777] Re: [mg67739] Speed challenge: Improve on integer frequencies from Count?
- From: Sseziwa Mukasa <mukasa at jeol.com>
- Date: Fri, 7 Jul 2006 07:12:55 -0400 (EDT)
- References: <200607061052.GAA28442@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
On Jul 6, 2006, at 6:52 AM, Gareth Russell wrote: > Hi, > > A challenge for the efficiency gurus: > > Given a list of integers whose possible values lie in the range 0 to > maxInt, with any given integer represented 0, 1 or multiple times, > e.g., what you would get from > > Table[Random[Integer, {0, maxInt}], {1000}] > > create a list of frequencies of values including ALL possible values > (i.e., not the basic output of Frequencies[], because that only > includes values present in the list. > > The fastest I have been able to come up with is the obvious use of > Count[]... > > In[256]:= > myFrequencies=Compile[{{myList,_Integer,1},{ > maxInt,_Integer}},Table[Count[myList,i],{i,0,maxInt}]] > > Out[256]= > CompiledFunction[{myList,maxInt},Table[Count[myList,i],{i, > 0,maxInt}],-\ > CompiledCode-] > > In[278]:= > testList=Table[Random[Integer,{0,500}],{1000}]; > > In[281]:= > Timing[Do[myFrequencies[testList,500];,{1000}]] > > Out[281]= > {2.3103 Second,Null} > > However, it seems to me that this should be able to be improved upon, > because it scans the whole list maxInt times. One should be able to > scan the list once, and increment a frequency list in the right > places, > perhaps with a For[] loop. But try as I might, I can't come up with a > version that is faster than the compiled Count[] version above. > > Can anyone do it? On my machine Frequencies is faster than myFrequencies. The missing values can be filled in with: {0,#}&/@Complement[Range[0,maxInt],Flatten[Last/@result]]] where result is the result of calling Frequencies. Combining the two I get: myFrequencies2[lst_,maxInt_]:=Block[{freqs=Frequencies[lst]},Join [freqs,{0,#}&/@Complement[Range[0,maxInt],Flatten[Last/@freqs]]]] which is faster than myFrequencies on my machine. Of course in this case the list is not sorted, adding a sort makes it much slower than myFrequencies. Alternatively, you could just keep the list of zero frequency items separate. Regards, Ssezi
- References:
- Speed challenge: Improve on integer frequencies from Count?
- From: Gareth Russell <russell@njit.edu>
- Speed challenge: Improve on integer frequencies from Count?