Re: Any better way for finding frequencies of list entries?
- To: mathgroup at smc.vnet.net
- Subject: [mg27029] Re: [mg26923] Any better way for finding frequencies of list entries?
- From: paradaxiom at my-deja.com
- Date: Tue, 30 Jan 2001 23:22:44 -0500 (EST)
- References: <956316$78@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
This problem can be solved with an algorithm of linear order. Given an arbitrary list of random digits you only have to look at every digit once and increase the frequency of this digit. Linear algorithms can often be implemented with Map, because you it allows you to do something to every element of a list. To speed things up a bit you can also use Compile, which sort of "precomputes" a number of things to prevent the computer from doing the same thing over and over again unnecessarily. The example below takes as its arguments the list "frequency" which is of length 10 and an arbitrary lengh list "digitlist". The list "frequency" is used to store the individual frequencies of the digits in "digitlist". This is done by increasing the (#+1)th frequency for each # (which is one of the digits of "digitlist"). At the end this list is returned: In[1]:= countdigits = Compile[{{frequency, _Integer, 1}, {digitlist, _Integer, 1}}, (frequency[[#1 + 1]]++ & ) /@ digitlist; frequency] Out[1]= CompiledFunction[] Now generate a big "digitlist" and initialize "frequency" to all zeros, to try out the function. On my computer the evaluation took about 2.3 seconds. Not using Compile was much slower, about 10 times or so. In[2]:= digitlist = Table[Random[Integer, {0, 9}], {10^6}]; frequency = Table[0, {10}]; Timing[countdigits[frequency, digitlist]] Out[4]= {2.343*Second, {100048, 100095, 99132, 99876, 100388, 99847, 100028, 100638, 99985, 99963}} Compare this to the function "CategoryCounts" defined by the DataManipulation package: In[5]:= Needs["Statistics`DataManipulation`"]; In[6]:= Timing[Transpose[{Range[0, 9], CategoryCounts[digitlist, Range[0, 9]]}]] Out[6]= {5.367999999999999*Second, {{0, 100048}, {1, 100095}, {2, 99132}, {3, 99876}, {4, 100388}, {5, 99847}, {6, 100028}, {7, 100638}, {8, 99985}, {9, 99963}}} This result is pretty good considering the fact that "CategoryCounts" is a much general function. Good luck with your digit hunting! :) //Marten Sent via Deja.com http://www.deja.com/