Re: Empirical CDF and InterpolatingFunction
*To*: mathgroup at smc.vnet.net
*Subject*: [mg36579] Re: [mg36555] Empirical CDF and InterpolatingFunction
*From*: Tomas Garza <tgarza01 at prodigy.net.mx>
*Date*: Fri, 13 Sep 2002 01:13:42 -0400 (EDT)
*References*: <200209111727.NAA07812@smc.vnet.net>
*Sender*: owner-wri-mathgroup at wolfram.com
Well, I don't know how fast it is, but it is fairly simple. Suppose you
have a series of values s for which you wish to obtain the empirical
distribution function (edf).
In[1]:=
s = Table[Random[Integer, {0, 3}], {10}]
Out[1]=
{2,2,1,2,0,0,1,1,1,3}
If no specification is made about their position on the x-axis, we assume
that they correspond to the integers from 1 to 10. What we have then is the
collection of pairs
In[2]:=
porig = Transpose[{Range[10], s}]
Out[2]=
{{1, 2}, {2, 2}, {3, 1}, {4, 2}, {5, 0}, {6, 0}, {7, 1},
{8, 1}, {9, 1}, {10, 3}}
The edf gives, for each x, the proportion of points in s that are less than
or equal to x, for all x. We obtain these proportions through the cumulative
sums
In[3]:=
N[CumulativeSums[s]/Plus @@ s]
Out[3]=
{0.153846, 0.307692, 0.384615, 0.538462, 0.538462, 0.538462,
0.615385, 0.692308, 0.769231, 1.}
so that for each of the pairs (x, y) below, y gives the proportion of points
in s that are less than or equal to x:
In[4]:=
cumporig = Transpose[{Range[10],
N[CumulativeSums[s]/Plus @@ s]}]
Out[4]=
{{1, 0.15384615384615385}, {2, 0.3076923076923077},
{3, 0.38461538461538464}, {4, 0.5384615384615384},
{5, 0.5384615384615384}, {6, 0.5384615384615384},
{7, 0.6153846153846154}, {8, 0.6923076923076923},
{9, 0.7692307692307693}, {10, 1.}}
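
The cumulative-sum step above can also be sketched without CumulativeSums. This is only a sketch under one assumption: a Mathematica release that provides the built-in functions Accumulate and Total. The sample s is copied from Out[1], and the name cum is introduced here for illustration only.

```mathematica
(* Assumption: a release that provides the built-ins Accumulate and Total *)
s = {2, 2, 1, 2, 0, 0, 1, 1, 1, 3};     (* the sample shown in Out[1] *)

(* Running sums of s, scaled by the grand total so the last entry is 1 *)
cum = Accumulate[s]/Total[s];

(* Pair each position 1..10 with its cumulative value, as in In[4] *)
cumporig = Transpose[{Range[Length[s]], N[cum]}]
```

With the sample above this should give the same pairs as Out[4].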
Now shift the x values one unit to the left, by dropping the last value and
prepending 0 to them:
In[5]:=
ps = Transpose[{Prepend[Drop[Range[1, 10], -1], 0],
CumulativeSums[s]/Plus @@ s}]
Out[5]=
{{0, 2/13}, {1, 4/13}, {2, 5/13}, {3, 7/13}, {4, 7/13},
{5, 7/13}, {6, 8/13}, {7, 9/13}, {8, 10/13}, {9, 1}}
Then use Interpolation on this shifted set of points:
In[6]:=
ips = Interpolation[ps, InterpolationOrder -> 0]
Out[6]=
InterpolatingFunction[{{0,9}},<>]
ips[x-1] is the edf you are looking for, as you may check by plotting it in
the same graph as the ListPlot of cumporig above.
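
The check described above might look like the following. It is a sketch, not a definitive recipe: it rebuilds the data with Accumulate and Total (an assumption about the release in use; they stand in for CumulativeSums and Plus @@ s), and the point size and PlotRange setting are arbitrary choices made here.

```mathematica
(* Assumption: a release with Accumulate and Total, and in which Show can
   combine the results of Plot and ListPlot directly *)
s = {2, 2, 1, 2, 0, 0, 1, 1, 1, 3};                 (* sample from Out[1] *)
cum = Accumulate[s]/Total[s];                       (* scaled running sums *)
cumporig = Transpose[{Range[10], N[cum]}];          (* points of In[4] *)
ps = Transpose[{Range[0, 9], cum}];                 (* shifted points of In[5] *)
ips = Interpolation[ps, InterpolationOrder -> 0];   (* step function of In[6] *)

(* Overlay the step function ips[x - 1] and the original points *)
Show[
  Plot[ips[x - 1], {x, 1, 10}],
  ListPlot[cumporig, PlotStyle -> PointSize[0.02]],
  PlotRange -> All
]
```

Comparing the value of the step function at each integer x, and just to either side of it, against the plotted points shows which way the steps are continuous.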
Tomas Garza
Mexico City
----- Original Message -----
From: "Mark Fisher" <mark at markfisher.net>
To: mathgroup at smc.vnet.net
Subject: [mg36579] [mg36555] Empirical CDF and InterpolatingFunction
> I'm trying to write a fast empirical cumulative distribution function
> (CDF). Empirical CDFs are step functions that can be expressed in
> terms of a Which statement. For example, given the list of
> observations {1, 2, 3},
>
> f = Which[# < 1, 0, # < 2, 1/3, # < 3, 2/3, True, 1]&
>
> is the empirical CDF. Note that f /@ {1, 2, 3} returns {1/3, 2/3, 1}
> and f is continuous from the right.
>
> When the number of observations is large, the Which statement
> evaluates fairly slowly (even if it has been Compiled). Since
> InterpolatingFunction evaluates so much faster in general, I've tried
> to use Interpolation with InterpolationOrder -> 0. The problem is that
> the resulting InterpolatingFunction doesn't behave the way (I think)
> it ought to. For example, let
>
> g = Interpolation[{{1, 1/3}, {2, 2/3}, {3, 1}}, InterpolationOrder -> 0]
>
> Then, g /@ {1, 2, 3} returns {2/3, 2/3, 1} instead of {1/3, 2/3, 1}.
> In addition, g is continuous from the left rather than from the right.
>
> Obviously I am not aware of the considerations that went into
> determining the behavior of InterpolationFunction when
> InterpolationOrder -> 0.
>
> So I have two questions:
>
> (1) Does anyone have any opinions about how InterpolatingFunction
> ought to behave with InterpolationOrder -> 0?
>
> (2) Does anyone have a faster way to evaluate an empirical CDF than a
> compiled Which function?
>
> By the way, here's my current version:
>
> CompileEmpiricalCDF[list_?(VectorQ[#, NumericQ] &)] :=
> Block[{x}, Compile[{{x, _Real}}, Evaluate[
> Which @@ Flatten[
> Append[
> Transpose[{
> Thread[x < Sort[list]],
> Range[0, 1 - 1/#, 1/#] & @ Length[list]
> }],
> {True, 1}]]
> ]]]
>
> --Mark
>
>
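
As a reference point for testing any faster implementation, the definition used in the question (the proportion of observations less than or equal to x) can be evaluated directly. This is a minimal sketch. The helper name edfDirect is hypothetical and does not appear in the thread, and because it scans the whole list on every call it is meant only for checking results, not as an answer to the speed question.

```mathematica
(* Hypothetical helper, not from the thread: count the observations <= x
   and divide by the sample size *)
edfDirect[data_List, x_] := Count[data, y_ /; y <= x]/Length[data]

(* The example from the question: observations {1, 2, 3} *)
edfDirect[{1, 2, 3}, #] & /@ {1, 2, 3}     (* {1/3, 2/3, 1} *)

(* Between observations the value stays at the lower step (right-continuous) *)
edfDirect[{1, 2, 3}, 1.5]                  (* 1/3 *)
```

These values agree with the Which-based f in the question, so a candidate fast version can be compared against edfDirect on a handful of test points.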