Re: Counting number of numbers in a large list between two valus
- To: mathgroup at smc.vnet.net
- Subject: [mg114548] Re: Counting number of numbers in a large list between two valus
- From: truxton spangler <truxtonspangler29 at gmail.com>
- Date: Tue, 7 Dec 2010 06:47:36 -0500 (EST)
- References: <idigg5$iq2$1@smc.vnet.net>
On Dec 6, 10:13 pm, "Carl K. Woll" <ca... at wolfram.com> wrote:
> On 12/5/2010 8:57 PM, Lyle wrote:
>
> > Dear Listers,
> >
> > I have a large (5-20million) one dimensional list of real numbers and
> > I want to count the number of entries in the list that lie between 2
> > specific values (x1, x2). I need to run the function for a number of
> > different ranges.
> >
> > ie. number of list entries (l), where x1<= l<= x2
> >
> > I've tried:
> >
> > tallydata[{x1_, x2_}] := Count[data, x_ /; x1<= x<= x2]
> >
> > that takes about 3-4 seconds
> >
> > and
> >
> > tallydata[{x1_, x2_}] := Length[Select[data, x1<= #<= x2&]]
> >
> > which takes a little bit longer.
> >
> > The best I've managed is (this last one might be off by 1 or 2 but
> > this doesn't really matter to me):
> >
> > sorteddata = Sort[data];
> > nf = Nearest[sorteddata];
> > tallyrange[{x1_, x2_}] :=
> > First[Position[sorteddata, First[nf[x2]]]] -
> > First[Position[sorteddata, First[nf[x1]]]]
> >
> > which takes between 1 and 2 seconds but I was hoping there might be a
> > faster way to do this?
> >
> > Any help would be great!
> >
> > Thanks,
> > Lyle Gordon
> >
> > Northwestern University
>
> Here's one possibility:
>
> data = RandomReal[1, 10^7];
>
> In[206]:= Total@Unitize@Clip[data, {.2, .3}, {0, 0}] // Timing
>
> Out[206]= {0.14, 1000695}
>
> Carl Woll
> Wolfram Research

Why is it that a function called Count is not the fastest way to count, a function called BinCounts is not the fastest way to count bins, a function called Sort is rarely the fastest way to sort, a function called Select is rarely the fastest way to select, and so on?

When "named" built-in functions are rarely the best programming option, this seems like a flaw in the design philosophy of the software, and it is certainly an added barrier to Mathematica becoming an intuitive program to learn.

While I am on this topic, the date and time handling functions in Mathematica, even in version 8, are so slow as to be useless for real work.
This is another example of named functions not being the best programming option. Ditto the time value of money functions. It is very easy to get a speedup of two orders of magnitude by writing your own code, which calls into question the point of developing these sorts of built-in functions in their current form.
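
For anyone who wants to check the gap on their own machine, here is a minimal sketch that times the pattern-based Count from the original question against the Clip/Unitize approach Carl Woll suggested. The variable names (tCount, nCount, tClip, nClip) and the smaller list size are my own choices for illustration, not something from the thread, and the timings will depend on hardware and version.

```mathematica
(* Illustrative sketch only: names and list size are hypothetical choices *)
SeedRandom[1];
data = RandomReal[1, 10^6];
{x1, x2} = {0.2, 0.3};

(* Pattern-based count, as in the original question *)
{tCount, nCount} = AbsoluteTiming[Count[data, x_ /; x1 <= x <= x2]];

(* Vectorized count: Clip sends values outside [x1, x2] to 0,
   Unitize turns every nonzero entry into 1, Total sums the ones *)
{tClip, nClip} =
  AbsoluteTiming[Total[Unitize[Clip[data, {x1, x2}, {0, 0}]]]];

(* The two counts should agree; compare the two timings *)
{nCount == nClip, tCount, tClip}
```

One caveat with the Clip/Unitize form: an entry that is exactly 0 is never counted, even when 0 lies inside the range, because Unitize maps it to 0. For random real data that should not matter, but it would for data containing exact zeros.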