Re: Counting number of numbers in a large list between two values
- To: mathgroup at smc.vnet.net
- Subject: [mg114525] Re: Counting number of numbers in a large list between two values
- From: Albert Retey <awnl at gmx-topmail.de>
- Date: Mon, 6 Dec 2010 06:15:16 -0500 (EST)
- References: <idhjda$9f6$1@smc.vnet.net>
Hi,
> I have a large (5-20million) one dimensional list of real numbers and
> I want to count the number of entries in the list that lie between 2
> specific values (x1, x2). I need to run the function for a number of
> different ranges.
>
> i.e. number of list entries (l), where x1 <= l <= x2
>
> I've tried:
>
> tallydata[{x1_, x2_}] := Count[data, x_ /; x1 <= x <= x2]
>
> that takes about 3-4 seconds
>
> and
>
> tallydata[{x1_, x2_}] := Length[Select[data, x1 <= # <= x2 &]]
>
> which takes a little bit longer.
>
> The best I've managed is (this last one might be off by 1 or 2 but
> this doesn't really matter to me):
>
> sorteddata = Sort[data];
> nf = Nearest[sorteddata];
> tallyrange[{x1_, x2_}] :=
>   First[Position[sorteddata, First[nf[x2]]]] -
>    First[Position[sorteddata, First[nf[x1]]]]
>
> which takes between 1 and 2 seconds but I was hoping there might be a
> faster way to do this?
I think this is a perfect use case for Compile:
ctallydata = Compile[{{data, _Real, 1}, {x1, _Real}, {x2, _Real}},
  Block[{r = 0},
   Do[If[x1 <= data[[i]] <= x2, r++], {i, Length[data]}];
   r]
  ]
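A minimal usage sketch, assuming ctallydata has been defined as above.
The names sample and tallysample and the test ranges are made up for
illustration; the wrapper just restores the {x1, x2} calling convention
of your tallydata so it can be mapped over several ranges:

```mathematica
(* hypothetical test data; the real list has 5-20 million entries *)
sample = RandomReal[{0, 1}, 10^6];

(* wrapper with the same calling convention as the original tallydata *)
tallysample[{x1_, x2_}] := ctallydata[sample, x1, x2]

(* count the entries for several ranges in one go *)
tallysample /@ {{0.1, 0.2}, {0.25, 0.75}, {0.9, 1.}}

(* cross-check one range against the uncompiled Count version;
   both use inclusive bounds, so this should give True *)
tallysample[{0.25, 0.75}] == Count[sample, x_ /; 0.25 <= x <= 0.75]
```

The compiled function takes the list as an explicit argument, so the
data is passed in on every call instead of being looked up as a global
symbol inside the compiled code.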
In version 8 you can get an additional speedup by setting
CompilationTarget -> "C" (which requires a C compiler to be installed)
and RuntimeOptions -> "Speed". With some more effort you might gain
further speed by parallelizing or by using the GPU, but I doubt that
would be worth the effort...
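As a sketch of what the version 8 variant could look like (the name
ctallydataC and the test list are made up here, and I am assuming a C
compiler that Mathematica can find is installed):

```mathematica
(* same loop as before, compiled to C with runtime checks relaxed *)
ctallydataC = Compile[{{data, _Real, 1}, {x1, _Real}, {x2, _Real}},
   Block[{r = 0},
    Do[If[x1 <= data[[i]] <= x2, r++], {i, Length[data]}];
    r],
   CompilationTarget -> "C", RuntimeOptions -> "Speed"];

(* hypothetical test list of 10 million reals *)
sample = RandomReal[{0, 1}, 10^7];

(* returns {seconds, count}; timings will depend on the machine *)
AbsoluteTiming[ctallydataC[sample, 0.25, 0.75]]
```

Note that RuntimeOptions -> "Speed" turns off checks such as integer
overflow detection, which should not matter for a simple counter over a
list of this size.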
hth,
albert