       Re: Empirical CDF and InterpolatingFunction

• To: mathgroup at smc.vnet.net
• Subject: [mg36585] Re: [mg36555] Empirical CDF and InterpolatingFunction
• From: "Johannes Ludsteck" <johannes.ludsteck at wiwi.uni-regensburg.de>
• Date: Fri, 13 Sep 2002 01:13:52 -0400 (EDT)
• Organization: Universitaet Regensburg
• Sender: owner-wri-mathgroup at wolfram.com

```
Dear Mark,
I suggest trying the following code before you put
more effort into speeding up your functions. It is
very short and seems to be fast in my first test on
a random array of 100000 integers.

(* Cumulative relative frequencies at the sorted distinct values,
   with a leading 0. *)
cdf[li_List] :=
  FoldList[#1 + Length[#2] &, 0.0,
    Split[Sort[li]]]/Length[li]

t=Table[Random[Integer,{1,1000}],{100000}];

Timing[cdf[t];]
{0.44 Second,Null}

The speed could probably be increased further with a
compiled version, but that would require additional
programming effort: you would have to write a function
that counts runs of equal values 'by hand' while
scanning through the sorted data list.

Best regards,
Johannes

On 11 Sep 2002, at 13:27, Mark Fisher wrote:

> I'm trying to write a fast empirical cumulative distribution function
> (CDF). Empirical CDFs are step functions that can be expressed in
> terms of a Which statement. For example, given the list of
> observations {1, 2, 3},
>
> f = Which[# < 1, 0, # < 2, 1/3, # < 3, 2/3, True, 1]&
>
> is the empirical CDF. Note that f /@ {1, 2, 3} returns {1/3, 2/3, 1}
> and f is continuous from the right.
>
> When the number of observations is large, the Which statement
> evaluates fairly slowly (even if it has been Compiled). Since
> InterpolatingFunction evaluates so much faster in general, I've tried
> to use Interpolation with InterpolationOrder -> 0. The problem is that
> the resulting InterpolatingFunction doesn't behave the way (I think)
> it ought to. For example, let
>
> g = Interpolation[{{1, 1/3}, {2, 2/3}, {3, 1}},
>       InterpolationOrder -> 0]
>
> Then, g /@ {1, 2, 3} returns {2/3, 2/3, 1} instead of {1/3, 2/3, 1}.
> In addition, g is continuous from the left rather than from the right.
>
> Obviously I am not aware of the considerations that went into
> determining the behavior of InterpolatingFunction when
> InterpolationOrder -> 0.
>
> So I have two questions:
>
> (1) Does anyone have any opinions about how InterpolatingFunction
> ought to behave with InterpolationOrder -> 0?
>
> (2) Does anyone have a faster way to evaluate an empirical CDF than a
> compiled Which function?
>
> By the way, here's my current version:
>
> CompileEmpiricalCDF[list_?(VectorQ[#, NumericQ] &)] :=
>   Block[{x}, Compile[{{x, _Real}}, Evaluate[
>     Which @@ Flatten[
>       Append[
>           Transpose[{
>             Thread[x < Sort[list]],
>             Range[0, 1 - 1/#, 1/#] & @ Length[list]
>               }],
>         {True, 1}]]
>   ]]]
>
> --Mark
>

<><><><><><><><><><><><>
Johannes Ludsteck
Economics Department
University of Regensburg
Universitaetsstrasse 31
93053 Regensburg
Phone +49/0941/943-2741

```
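
To make the `cdf` one-liner in the reply concrete, here is a small worked sketch on a four-element list. The names `data`, `runs`, `values`, `steps`, and `ecdfAt` are illustrative and do not appear in the original post. Exact arithmetic is used instead of the reply's `0.0` seed so the fractions are easy to read. The sketch only shows what `Sort`, `Split`, and `FoldList` contribute; it is not a tuned implementation.

```mathematica
(* Illustrative names only; not part of the original post. *)
data = {3, 1, 2, 2};

(* Sort, then group runs of equal observations: {{1}, {2, 2}, {3}} *)
runs = Split[Sort[data]];

(* Distinct values at which the empirical CDF jumps: {1, 2, 3} *)
values = First /@ runs;

(* Running count of observations divided by the sample size,
   dropping the leading 0 that FoldList emits: {1/4, 3/4, 1} *)
steps = Rest[FoldList[#1 + Length[#2] &, 0, runs]]/Length[data];

(* Direct O(n) definition of the empirical CDF at one point x,
   used here only as a cross-check *)
ecdfAt[x_] := Count[data, y_ /; y <= x]/Length[data];

ecdfAt /@ values === steps   (* True *)
```

Note that `cdf` returns only the step heights (with a leading 0), not the points at which the steps occur. Evaluating the CDF at an arbitrary `x` still requires pairing the heights with `values`, for example by a search over the sorted distinct values.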
