MathGroup Archive 2010

Re: Counting number of numbers in a large list between two values

  • To: mathgroup at smc.vnet.net
  • Subject: [mg114547] Re: Counting number of numbers in a large list between two valus
  • From: Ray Koopman <koopman at sfu.ca>
  • Date: Tue, 7 Dec 2010 06:47:25 -0500 (EST)
  • References: <idhjda$9f6$1@smc.vnet.net>

On Dec 5, 6:56 pm, Lyle <lgor... at gmail.com> wrote:
> Dear Listers,
>
> I have a large (5-20 million) one-dimensional list of real numbers and
> I want to count the number of entries in the list that lie between two
> specific values (x1, x2). I need to run the function for a number of
> different ranges.
>
> i.e., the number of list entries l for which x1 <= l <= x2
>
> I've tried:
>
> tallydata[{x1_, x2_}] := Count[data, x_ /; x1 <= x <= x2]
>
> that takes about 3-4 seconds
>
> and
>
> tallydata[{x1_, x2_}] := Length[Select[data, x1 <= # <= x2 &]]
>
> which takes a little bit longer.
>
> The best I've managed is (this last one might be off by 1 or 2 but
> this doesn't really matter to me):
>
> sorteddata = Sort[data];
> nf = Nearest[sorteddata];
> tallyrange[{x1_, x2_}] :=
>  First[Position[sorteddata, First[nf[x2]]]] -
>   First[Position[sorteddata, First[nf[x1]]]]
>
> which takes between 1 and 2 seconds but I was hoping there might be a
> faster way to do this?
>
> Any help would be great!
>
> Thanks,
> Lyle Gordon
>
> Northwestern University

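Here are some oldies, plus improvements on them. Two caveats:
the two fastest routines that use UnitStep (the ones that subtract)
count min <= x < max rather than min <= x <= max, because UnitStep[0]
is 1, so they will give wrong answers if the data contain any values
that equal max. The routines that use Clip replace out-of-range values
with 0 and then count the nonzeros, so they will give wrong answers if
min <= 0 <= max and the data contain any zeros.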
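
To make the two caveats concrete, here is a small sketch on hand-made lists. The names small and zeros and their values are made up for illustration and are not part of the timings that follow; the first routine in each pair is the product form, which handles the closed interval correctly.

```mathematica
(* Illustrative data only: 0.3 sits exactly on the upper limit. *)
small = {0., 0.2, 0.25, 0.3, 0.5};

(* Product form: counts 0.2 <= x <= 0.3. *)
Total@UnitStep[(small - 0.2)*(0.3 - small)]                 (* 3 *)

(* Difference form: counts 0.2 <= x < 0.3, so 0.3 is missed. *)
Total@UnitStep[small - 0.2] - Total@UnitStep[small - 0.3]   (* 2 *)

(* Illustrative data only: a zero that lies inside the range -1..1. *)
zeros = {-2., 0., 0.5, 3.};

(* Product form: counts both 0. and 0.5. *)
Total@UnitStep[(zeros + 1)*(1 - zeros)]                     (* 2 *)

(* Clip form: the in-range 0. looks like a clipped value and is not counted. *)
Total@Unitize@Clip[zeros, {-1, 1}, {0, 0}]                  (* 1 *)
```

If neither situation can occur in your data, the faster routines are safe to use.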

data = RandomReal[1.,1*^7]; min = .2; max = .3;

Total@UnitStep[(data-min)*(max-data)] //Timing
{2.64,999208}

UnitStep[data-min].UnitStep[max-data] //Timing
{2.3,999208}

Total[ UnitStep[data-min]-UnitStep[data-max] ] //Timing
{2.23,999208}

Total@UnitStep[data-min] - Total@UnitStep[data-max] //Timing
{1.73,999208}
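
If values equal to max must be counted, one possible variant of the two-Total form is sketched below. It is not one of the routines timed in this post, and countClosed is a hypothetical name. It assumes min <= max, so every element satisfies x >= min or x <= max, and the count of elements satisfying both is the sum of the two counts minus the length of the list. I have not timed it; it should cost about the same as the version above, since it does the same two UnitStep passes.

```mathematica
(* Hypothetical helper, not from the original post; assumes min <= max. *)
(* count(x >= min) + count(x <= max) - n  ==  count(min <= x <= max)   *)
countClosed[data_, {min_, max_}] :=
 Total@UnitStep[data - min] + Total@UnitStep[max - data] - Length[data]

(* Illustrative data: both endpoints 0.2 and 0.3 are counted. *)
countClosed[{0., 0.2, 0.25, 0.3, 0.5}, {0.2, 0.3}]   (* 3 *)
```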

Total@Unitize@Clip[data,{min,max},{0,0}] //Timing
{0.91,999208}

SparseArray@Clip[data,{min,max},{0,0}] /.
 SparseArray[_,_,_,d_] :> d[[2,1,-1]] //Timing
{0.77,999208}
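
The last routine reads the nonzero count out of the internal representation of the SparseArray: for a vector, d[[2,1,-1]] is the last entry of the row-pointer list, which equals the number of stored nonzero elements. A sketch of the same idea without the pattern on internals follows. It assumes the "NonzeroValues" property is available in your version, countNonzeroInRange is a hypothetical name, and I have not timed it against the routines above. The caveat about zeros inside the range applies here too.

```mathematica
(* Hypothetical helper, not from the original post.                    *)
(* Assumes SparseArray supports the "NonzeroValues" property.          *)
countNonzeroInRange[data_, {min_, max_}] :=
 Length[SparseArray[Clip[data, {min, max}, {0, 0}]]["NonzeroValues"]]

(* Illustrative data: 0.2, 0.25 and 0.3 survive the Clip. *)
countNonzeroInRange[{0., 0.2, 0.25, 0.3, 0.5}, {0.2, 0.3}]   (* 3 *)
```

Wrapping the routine in a function of {min, max} like this also suits the original request to run the count over several different ranges.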

