MathGroup Archive 2010

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Efficient Histogram Algorithm?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg113039] Re: Efficient Histogram Algorithm?
  • From: Bob Hanlon <hanlonr at cox.net>
  • Date: Tue, 12 Oct 2010 04:23:50 -0400 (EDT)

Tally is much faster but was introduced in Version 6.

jhistogram[list_] :=
 Table[Count[list, n], {n, 0, 255}]

jhistogram3[list_] :=
 Sort[Tally[list]][[All, 2]]

randomList = RandomInteger[{0, 255}, 10^6];

Timing[hist1 =
    jhistogram[randomList];][[1]]

0.360395

For BinCounts you need to use 256 as the upper bound

Timing[hist2 =
    BinCounts[randomList, {0, 256}];][[1]]

0.278348

Timing[hist3 =
    jhistogram3[randomList];][[1]]

0.003623

hist1 == hist2 == hist3

True

Total[hist1] == Total[hist2] ==
 Total[hist3] == 10^6

True


Bob Hanlon

---- Julian Francis <julian.w.francis at gmail.com> wrote: 

=============
Dear all,

Does anyone know of a way to implement a fast histogram algorithm in
Mathematica?

My first attempt (In44) was based around a table and using the count
function. A bit of thought indicates that this might not be very
efficient as it would seem to involve examining the list 255 times,
when only once is necessary. The second attempt involved using the
built in library function BinCounts, but surprisingly this was only
about twice as fast as my initial naive version, which makes me wonder
if it is as efficient as it could be. I can assume for my purposes
that the list will always have integer values between 0 and 255, and
that there are 256 bins.

I wasn't able to think of a way to implement the histogram function in
a functional programming style that was also efficient, although it is
fairly easy to write in an imperative language. This may reflect the
vast majority of my programming experience has been in imperative
languages.

I am using an old version of Mathematica (v4.1) (but on a modern PC);
but if anyone has dramatically different times for the BinCounts
function, then maybe it is time for me to upgrade.

My motivation is a pattern recognition algorithm I am implementing,
but it requires a large number of histograms to be created, hence the
running time of an individual histogram is actually quite important.

Thanks for any help,
Julian.

In[44]:=
jhistogram[list_] := Table[Count[list, n], {n, 0, 255}]

In[51]:=
\!\(Timing[\(randomList = Table[Random[Integer, 255], {10\^6}];\)]\)

Out[51]=
{1.373 Second, Null}

In[54]:=
Timing[hist = jhistogram[randomList];]

Out[54]=
{2.34 Second, Null}

In[56]:=
<< Statistics`

In[57]:=
Timing[hist = BinCounts[randomList, {0, 255}];]

Out[57]=
{0.936 Second, Null}





  • Prev by Date: Unsolvable ODE
  • Next by Date: Re: C-pointers from Mathematica
  • Previous by thread: Re: Efficient Histogram Algorithm?
  • Next by thread: Re: Efficient Histogram Algorithm?