Re: Speed challenge: Improve on integer frequencies from Count?
- To: mathgroup at smc.vnet.net
- Subject: [mg67775] Re: [mg67739] Speed challenge: Improve on integer frequencies from Count?
- From: "Carl K. Woll" <carlw at wolfram.com>
- Date: Fri, 7 Jul 2006 07:12:49 -0400 (EDT)
- References: <200607061052.GAA28442@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Gareth Russell wrote:
> Hi,
>
> A challenge for the efficiency gurus:
>
> Given a list of integers whose possible values lie in the range 0 to
> maxInt, with any given integer represented 0, 1 or multiple times,
> e.g., what you would get from
>
> Table[Random[Integer, {0, maxInt}], {1000}]
>
> create a list of frequencies of values including ALL possible values
> (i.e., not the basic output of Frequencies[], because that only
> includes values present in the list.
>
> The fastest I have been able to come up with is the obvious use of Count[]...
>
> In[256]:=
> myFrequencies=Compile[{{myList,_Integer,1},{
> maxInt,_Integer}},Table[Count[myList,i],{i,0,maxInt}]]
>
> Out[256]=
> CompiledFunction[{myList,maxInt},Table[Count[myList,i],{i,0,maxInt}],-\
> CompiledCode-]
>
> In[278]:=
> testList=Table[Random[Integer,{0,500}],{1000}];
>
> In[281]:=
> Timing[Do[myFrequencies[testList,500];,{1000}]]
>
> Out[281]=
> {2.3103 Second,Null}
>
> However, it seems to me that this should be able to be improved upon,
> because it scans the whole list maxInt times. One should be able to
> scan the list once, and increment a frequency list in the right places,
> perhaps with a For[] loop. But try as I might, I can't come up with a
> version that is faster than the compiled Count[] version above.
>
> Can anyone do it?
>
> Gareth
>
One possibility is to use a variation of the function countfunc which is
used by Histogram in the package Graphics`Graphics`:
countfunc0 = Compile[{{dat, _Integer, 1}, {bincount, _Integer}},
Module[{bins = Table[0, {bincount + 1}], i},
Do[bins[[dat[[i]] + 1]]++, {i, Length[dat]}];
bins
]
];
The countfunc function counts only positive integers, so I had to modify
it slightly.
For your test case:
In[6]:=
Do[r1=countfunc0[testList,500],{1000}]//Timing
Do[r2=myFrequencies[testList,500],{1000}]//Timing
r1===r2
Out[6]=
{0.531 Second,Null}
Out[7]=
{2.5 Second,Null}
Out[8]=
True
Carl Woll
Wolfram Research
- References:
- Speed challenge: Improve on integer frequencies from Count?
- From: Gareth Russell <russell@njit.edu>
- Speed challenge: Improve on integer frequencies from Count?