MathGroup Archive 2010

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Efficient Histogram Algorithm?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg113053] Re: Efficient Histogram Algorithm?
  • From: Ray Koopman <koopman at sfu.ca>
  • Date: Tue, 12 Oct 2010 04:26:32 -0400 (EDT)
  • References: <i8uknn$of3$1@smc.vnet.net>

On Oct 11, 2:17 am, Julian Francis <julian.w.fran... at gmail.com> wrote:
> Dear all,
>
> Does anyone know of a way to implement a fast histogram algorithm in
> Mathematica?
>
> My first attempt (In44) was based around a table and using the count
> function. A bit of thought indicates that this might not be very
> efficient as it would seem to involve examining the list 255 times,
> when only once is necessary. The second attempt involved using the
> built in library function BinCounts, but surprisingly this was only
> about twice as fast as my initial naive version, which makes me wonder
> if it is as efficient as it could be. I can assume for my purposes
> that the list will always have integer values between 0 and 255, and
> that there are 256 bins.
>
> I wasn't able to think of a way to implement the histogram function in
> a functional programming style that was also efficient, although it is
> fairly easy to write in an imperative language. This may reflect the
> vast majority of my programming experience has been in imperative
> languages.
>
> I am using an old version of Mathematica (v4.1) (but on a modern PC);
> but if anyone has dramatically different times for the BinCounts
> function, then maybe it is time for me to upgrade.
>
> My motivation is a pattern recognition algorithm I am implementing,
> but it requires a large number of histograms to be created, hence the
> running time of an individual histogram is actually quite important.
>
> Thanks for any help,
> Julian.
>
> In[44]:=
> jhistogram[list_] := Table[Count[list, n], {n, 0, 255}]
>
> In[51]:=
> \!\(Timing[\(randomList = Table[Random[Integer, 255], {10\^6}];\)]\)
>
> Out[51]=
> {1.373 Second, Null}
>
> In[54]:=
> Timing[hist = jhistogram[randomList];]
>
> Out[54]=
> {2.34 Second, Null}
>
> In[56]:=
> << Statistics`
>
> In[57]:=
> Timing[hist = BinCounts[randomList, {0, 255}];]
>
> Out[57]=
> {0.936 Second, Null}

Here's a compiled routine that's twice as fast as BinCounts. Also,
note that you need to give BinCounts a non-intuitive set of bins.

In[1]:= << Statistics`

In[2]:= jhistogram[list_] := Table[Count[list,n],{n,0,255}]

In[3]:= freqs = Compile[{{data,_Integer,1}},
        Module[{bins = Table[0,{256}]},
        Scan[bins[[#]]++&,data+1]; bins]];

In[4]:= Timing@Length[randomList = Table[Random[Integer,255],{10^6}]]
        Timing@Length[hist1 = jhistogram[randomList]]
        Timing@Length[hist2 = BinCounts[randomList,{-1,255}]]
        Timing@Length[hist3 = freqs[randomList]]
        SameQ[hist1,hist2,hist3]

Out[4]= {1.26 Second,1000000}
Out[5]= {2.5 Second,256}
Out[6]= {0.9 Second,256}
Out[7]= {0.45 Second,256}
Out[8]= True


  • Prev by Date: Re: local variables - Module, For loop
  • Next by Date: Re: local variables - Module, For loop
  • Previous by thread: Re: Efficient Histogram Algorithm?
  • Next by thread: Re: Efficient Histogram Algorithm?