Re: Speed challenge: Improve on integer frequencies from Count?
- To: mathgroup at smc.vnet.net
- Subject: [mg67775] Re: [mg67739] Speed challenge: Improve on integer frequencies from Count?
- From: "Carl K. Woll" <carlw at wolfram.com>
- Date: Fri, 7 Jul 2006 07:12:49 -0400 (EDT)
- References: <200607061052.GAA28442@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Gareth Russell wrote: > Hi, > > A challenge for the efficiency gurus: > > Given a list of integers whose possible values lie in the range 0 to > maxInt, with any given integer represented 0, 1 or multiple times, > e.g., what you would get from > > Table[Random[Integer, {0, maxInt}], {1000}] > > create a list of frequencies of values including ALL possible values > (i.e., not the basic output of Frequencies[], because that only > includes values present in the list. > > The fastest I have been able to come up with is the obvious use of Count[]... > > In[256]:= > myFrequencies=Compile[{{myList,_Integer,1},{ > maxInt,_Integer}},Table[Count[myList,i],{i,0,maxInt}]] > > Out[256]= > CompiledFunction[{myList,maxInt},Table[Count[myList,i],{i,0,maxInt}],-\ > CompiledCode-] > > In[278]:= > testList=Table[Random[Integer,{0,500}],{1000}]; > > In[281]:= > Timing[Do[myFrequencies[testList,500];,{1000}]] > > Out[281]= > {2.3103 Second,Null} > > However, it seems to me that this should be able to be improved upon, > because it scans the whole list maxInt times. One should be able to > scan the list once, and increment a frequency list in the right places, > perhaps with a For[] loop. But try as I might, I can't come up with a > version that is faster than the compiled Count[] version above. > > Can anyone do it? > > Gareth > One possibility is to use a variation of the function countfunc which is used by Histogram in the package Graphics`Graphics`: countfunc0 = Compile[{{dat, _Integer, 1}, {bincount, _Integer}}, Module[{bins = Table[0, {bincount + 1}], i}, Do[bins[[dat[[i]] + 1]]++, {i, Length[dat]}]; bins ] ]; The countfunc function counts only positive integers, so I had to modify it slightly. For your test case: In[6]:= Do[r1=countfunc0[testList,500],{1000}]//Timing Do[r2=myFrequencies[testList,500],{1000}]//Timing r1===r2 Out[6]= {0.531 Second,Null} Out[7]= {2.5 Second,Null} Out[8]= True Carl Woll Wolfram Research
- References:
- Speed challenge: Improve on integer frequencies from Count?
- From: Gareth Russell <russell@njit.edu>
- Speed challenge: Improve on integer frequencies from Count?