MathGroup Archive 2010

Re: Counting number of numbers in a large list between two values

  • To: mathgroup at smc.vnet.net
  • Subject: [mg114548] Re: Counting number of numbers in a large list between two values
  • From: truxton spangler <truxtonspangler29 at gmail.com>
  • Date: Tue, 7 Dec 2010 06:47:36 -0500 (EST)
  • References: <idigg5$iq2$1@smc.vnet.net>

On Dec 6, 10:13 pm, "Carl K. Woll" <ca... at wolfram.com> wrote:
> On 12/5/2010 8:57 PM, Lyle wrote:
>
> > Dear Listers,
>
> > I have a large (5-20 million) one-dimensional list of real numbers and
> > I want to count the number of entries in the list that lie between 2
> > specific values (x1, x2). I need to run the function for a number of
> > different ranges.
>
> > i.e. the number of list entries (l), where x1 <= l <= x2
>
> > I've tried:
>
> > tallydata[{x1_, x2_}] := Count[data, x_ /; x1 <= x <= x2]
>
> > that takes about 3-4 seconds
>
> > and
>
> > tallydata[{x1_, x2_}] := Length[Select[data, x1 <= # <= x2 &]]
>
> > which takes a little bit longer.
>
> > The best I've managed is (this last one might be off by 1 or 2 but
> > this doesn't really matter to me):
>
> > sorteddata = Sort[data];
> > nf = Nearest[sorteddata];
> > tallyrange[{x1_, x2_}] :=
> >   First[Position[sorteddata, First[nf[x2]]]] -
> >    First[Position[sorteddata, First[nf[x1]]]]
>
> > which takes between 1 and 2 seconds but I was hoping there might be a
> > faster way to do this?
>
> > Any help would be great!
>
> > Thanks,
> > Lyle Gordon
>
> > Northwestern University
>
> Here's one possibility:
>
> data = RandomReal[1, 10^7];
>
> In[206]:= Total@Unitize@Clip[data, {.2, .3}, {0, 0}] // Timing
>
> Out[206]= {0.14, 1000695}
>
> Carl Woll
> Wolfram Research
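
For anyone reading along, the one-liner above seems to work like this:
Clip with a third argument replaces everything outside the interval with
the given values (here 0), Unitize turns every remaining nonzero entry
into 1, and Total adds up the ones. A minimal sketch on a small list
follows; countBetween and small are hypothetical names used only for
illustration and do not appear elsewhere in the thread.

```mathematica
(* countBetween is a hypothetical helper name, not part of the thread *)
(* It counts the entries e of list with x1 <= e <= x2, endpoints included *)
countBetween[list_, {x1_, x2_}] :=
  Total[Unitize[Clip[list, {x1, x2}, {0, 0}]]]

(* small is a made-up test list *)
small = {0.05, 0.2, 0.25, 0.3, 0.7};

(* Step 1: entries outside [0.2, 0.3] are replaced by 0 *)
clipped = Clip[small, {0.2, 0.3}, {0, 0}];

(* Step 2: every nonzero entry becomes 1, zeros stay 0 *)
flags = Unitize[clipped];

(* Step 3: summing the flags gives the count *)
Total[flags]                      (* 3 *)
countBetween[small, {0.2, 0.3}]   (* 3 *)
```

One caveat with this approach: an entry that is exactly 0 is never
counted, even when the interval contains 0, because it cannot be told
apart from the zeros written in by Clip. For ranges that exclude 0, as
in the example above, this does not matter.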


Why is it that a function called Count is not the fastest way to
count, a function called BinCounts is not the fastest way to count
bins, a function called Sort is rarely the fastest way to sort, a
function called Select is rarely the fastest way to select, and so
on? When the "named" built-in functions are rarely the best programming
option, this seems like a flaw in the design philosophy of the software,
and it is certainly an added barrier to Mathematica becoming an
intuitive program to learn.

While I am on this topic, the date and time handling functions in
Mathematica, even in version 8, are so slow as to be useless for real
work. This is another example of named functions not being the best
programming option. Ditto the time value of money functions. It is very
easy to get two orders of magnitude of speed-up by writing your own
code, which calls into question the point of developing these sorts of
built-in functions in the form in which they currently exist.

