MathGroup Archive 2001

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Any better way for finding frequencies of list entries?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg27029] Re: [mg26923] Any better way for finding frequencies of list entries?
  • From: paradaxiom at my-deja.com
  • Date: Tue, 30 Jan 2001 23:22:44 -0500 (EST)
  • References: <956316$78@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

This problem can be solved with an algorithm of linear order. Given an
arbitrary list of random digits you only have to look at every digit
once and increase the frequency of this digit. Linear algorithms can
often be implemented with Map, because you it allows you to do
something to every element of a list. To speed things up a bit you can
also use Compile, which sort of "precomputes" a number of things to
prevent the computer from doing the same thing over and over again
unnecessarily. The example below takes as its arguments the
list "frequency" which is of length 10 and an arbitrary lengh
list "digitlist".

The list "frequency" is used to store the individual frequencies of the
digits in "digitlist". This is done by increasing the (#+1)th frequency
for each # (which is one of the digits of "digitlist"). At the end this
list is returned:

In[1]:=
countdigits = Compile[{{frequency, _Integer, 1}, {digitlist, _Integer,
1}}, (frequency[[#1 + 1]]++ & ) /@ digitlist; frequency]

Out[1]=
CompiledFunction[]

Now generate a big "digitlist" and initialize "frequency" to all zeros,
to try out the function. On my computer the evaluation took about 2.3
seconds. Not using Compile was much slower, about 10 times or so.

In[2]:=
digitlist = Table[Random[Integer, {0, 9}], {10^6}];
frequency = Table[0, {10}];
Timing[countdigits[frequency, digitlist]]

Out[4]=
{2.343*Second, {100048, 100095, 99132, 99876, 100388, 99847, 100028,
100638, 99985, 99963}}

Compare this to the function "CategoryCounts" defined by the
DataManipulation package:

In[5]:=
Needs["Statistics`DataManipulation`"];

In[6]:=
Timing[Transpose[{Range[0, 9], CategoryCounts[digitlist, Range[0, 9]]}]]

Out[6]=
{5.367999999999999*Second, {{0, 100048}, {1, 100095}, {2, 99132}, {3,
99876}, {4, 100388}, {5, 99847}, {6, 100028}, {7, 100638},
   {8, 99985}, {9, 99963}}}

This result is pretty good considering the fact that "CategoryCounts"
is a much general function. Good luck with your digit hunting! :)


//Marten


Sent via Deja.com
http://www.deja.com/


  • Prev by Date: Re: Queries: notebook, matrices
  • Next by Date: Re: 1. Input of screen coordinates; 2. Fast graphics
  • Previous by thread: Re: Re: Any better way for finding frequencies of list entries?
  • Next by thread: Multiply 2 Lists together in a certain way