MathGroup Archive 2006

[Date Index] [Thread Index] [Author Index]

Search the Archive

Speed challenge integer frequencies: and the winner is...

  • To: mathgroup at smc.vnet.net
  • Subject: [mg67802] Speed challenge integer frequencies: and the winner is...
  • From: Gareth Russell <russell at njit.edu>
  • Date: Sat, 8 Jul 2006 04:56:11 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

Thanks for everyone who responded to the speed challenge. You can see 
the submitted solutions in the responses to my original post. My code 
(for finding a set of integer frequencies from a list of non-negative 
integers, including zero frequency counts for integers not in the list 
up to maxInt) was as follows:

myFrequencies=Compile[{{myList,_Integer,1},{
    maxInt,_Integer}},Table[Count[myList,i],{i,0,maxInt}]]

Other solutions were interesting, including one that made use of 
SparseArray, and two that used Split[Sort[]]. All were at lesat twice 
as fast as my code, when maxInt was set to 500. But the fastest set 
were those that followed my suggestion of scanning the list once and 
incrementing values in a frequency array initialized with zeros. I had 
been unable to program this efficiently, and I got at least three 
similar versions that were efficent, clocking in a six times faster 
than myFrequencies. The winner by a hair, submitted by Ray Koopman, was

freq8 = Compile[{{myList, _Integer, 1}, {maxInt, _Integer}},
                  Module[{bins = Table[0, {maxInt + 1}]},
                         Scan[bins[[#]]++ &, myList + 1]; bins]]

But note that Ray admitted making a small modification to a submission 
by Carl Woll! The three variations on this theme varied in using Scan, 
Map and Table, which seem to account for the small differences in 
speed, with Scan being fractionally more efficient.

A confession: in the application I wanted this for, maxInt = 5. In that 
case, as a number of people pointed out, myFrequencies is almost an 
order of magnitude faster than any other code. I determined that 
equality between myFrequencies and freq8 was achieved somewhere around 
maxInt = 73. This is strange to me, as it would seem that (compiled) 
Count is scanning the list 74 separate times, and yet doing so in the 
same amount of time as the single compiled scan of freq8. Count must be 
super-efficient!

Gareth

-- 
Gareth Russell
NJIT


  • Prev by Date: Re: Speed challenge: Improve on integer frequencies from Count?
  • Next by Date: Re: Hamiltonian circuits
  • Previous by thread: Re: ReplacePart in an If[] construct?
  • Next by thread: frontend crashing