Re: Speed challenge: Improve on integer frequencies from Count?
- To: mathgroup at smc.vnet.net
- Subject: [mg67772] Re: Speed challenge: Improve on integer frequencies from Count?
- From: Oliver Ruebenkoenig <ruebenko at uni-freiburg.de>
- Date: Fri, 7 Jul 2006 07:12:42 -0400 (EDT)
- References: <e8ir3t$s8o$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Dear Gareth, perhaps this helps: maxInt = 5; testList = Table[Random[Integer, {0, maxInt}], {10}] {2, 1, 4, 2, 3, 0, 3, 4, 3, 0} freq[data_] := SparseArray[ Rule[ #[[All, 1]] + 1, Length /@ # ], {maxInt + 1}] &[ Split[ Sort[ data ] ] ] r = freq[testList] SparseArray[<5>, {6}] Normal[r] {2, 1, 2, 3, 2, 0} maxInt = 1000; testList = Table[Random[Integer, {0, maxInt}], {1000000}]; Timing[ r = freq[ testList ]; ] {0.588037 Second, Null} I hope i have understood your post correctly. Oliver On Thu, 6 Jul 2006, Gareth Russell wrote: > Hi, > > A challenge for the efficiency gurus: > > Given a list of integers whose possible values lie in the range 0 to > maxInt, with any given integer represented 0, 1 or multiple times, > e.g., what you would get from > > Table[Random[Integer, {0, maxInt}], {1000}] > > create a list of frequencies of values including ALL possible values > (i.e., not the basic output of Frequencies[], because that only > includes values present in the list. > > The fastest I have been able to come up with is the obvious use of Count[]... > > In[256]:= > myFrequencies=Compile[{{myList,_Integer,1},{ > maxInt,_Integer}},Table[Count[myList,i],{i,0,maxInt}]] > > Out[256]= > CompiledFunction[{myList,maxInt},Table[Count[myList,i],{i,0,maxInt}],-\ > CompiledCode-] > > In[278]:= > testList=Table[Random[Integer,{0,500}],{1000}]; > > In[281]:= > Timing[Do[myFrequencies[testList,500];,{1000}]] > > Out[281]= > {2.3103 Second,Null} > > However, it seems to me that this should be able to be improved upon, > because it scans the whole list maxInt times. One should be able to > scan the list once, and increment a frequency list in the right places, > perhaps with a For[] loop. But try as I might, I can't come up with a > version that is faster than the compiled Count[] version above. > > Can anyone do it? > > Gareth > > -- > Gareth Russell > NJIT > > Oliver Ruebenkoenig, <ruebenko at uni-freiburg.de> Phone: ++49 +761 203 7388