MathGroup Archive 2002

Re: Empirical CDF and InterpolatingFunction

  • To: mathgroup at smc.vnet.net
  • Subject: [mg36585] Re: [mg36555] Empirical CDF and InterpolatingFunction
  • From: "Johannes Ludsteck" <johannes.ludsteck at wiwi.uni-regensburg.de>
  • Date: Fri, 13 Sep 2002 01:13:52 -0400 (EDT)
  • Organization: Universitaet Regensburg
  • Sender: owner-wri-mathgroup at wolfram.com

Dear Mark,
I suggest trying the following code before you put
further effort into speeding up your functions. My
code is very short and was fast in my first test
with a random list of 100000 integers.

cdf[li_List] :=
  FoldList[#1 + Length[#2] &, 0.0,
    Split[Sort[li]]]/Length[li]

t=Table[Random[Integer,{1,1000}],{100000}];

Timing[cdf[t];]
{0.44 Second,Null}
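
A small case that can be checked by hand may make the output of cdf clearer. This is only a sketch: the sample list and the names steps and pairs are made up for illustration. The result starts with 0 and has one more element than there are distinct values, so dropping the first element and pairing the rest with Union of the data gives the (value, CDF) pairs.

```mathematica
(* same definition as above, repeated so the example is self-contained *)
cdf[li_List] :=
  FoldList[#1 + Length[#2] &, 0.0, Split[Sort[li]]]/Length[li]

(* illustrative sample with one repeated value *)
steps = cdf[{1, 2, 3, 3}]
(* {0., 0.25, 0.5, 1.} *)

(* pair each distinct value with the CDF level reached at that value *)
pairs = Transpose[{Union[{1, 2, 3, 3}], Rest[steps]}]
(* {{1, 0.25}, {2, 0.5}, {3, 1.}} *)
```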

The speed could probably be increased further by writing
a compiled version, but this would require additional
programming effort: you would have to write a function that
counts the number of equal data values 'by hand' while
scanning through the data list.
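
As a rough illustration of that idea, here is a sketch of what such a compiled function might look like. It is not a tested or timed implementation. The name cdfSorted is hypothetical, and the function assumes its argument is already a sorted list of machine reals (for example Sort[N[data]]); it returns the same kind of list as cdf above.

```mathematica
(* Sketch only: expects an already sorted list of reals, e.g. Sort[N[data]] *)
cdfSorted = Compile[{{sorted, _Real, 1}},
   Module[{n = Length[sorted],
     out = Table[0., {Length[sorted] + 1}], k = 1},
    Do[
     (* a run of equal values ends at position i when the next value differs *)
     If[sorted[[i]] != sorted[[i + 1]],
      k++; out[[k]] = N[i]/n],
     {i, n - 1}];
    (* the last run always ends at position n, where the CDF reaches 1 *)
    If[n > 0, k++; out[[k]] = 1.];
    (* keep only the entries that were actually filled in *)
    Take[out, k]]];

cdfSorted[Sort[N[{1, 2, 3, 3}]]]
(* {0., 0.25, 0.5, 1.} *)
```

Sorting is left outside the compiled function here, so any comparison of timings against cdf should include the cost of Sort.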

Best regards,
	Johannes

On 11 Sep 2002, at 13:27, Mark Fisher wrote:

> I'm trying to write a fast empirical cumulative distribution function
> (CDF). Empirical CDFs are step functions that can be expressed in
> terms of a Which statement. For example, given the list of
> observations {1, 2, 3},
> 
> f = Which[# < 1, 0, # < 2, 1/3, # < 3, 2/3, True, 1]&
> 
> is the empirical CDF. Note that f /@ {1, 2, 3} returns {1/3, 2/3, 1}
> and f is continuous from the right.
> 
> When the number of observations is large, the Which statement
> evaluates fairly slowly (even if it has been Compiled). Since
> InterpolatingFunction evaluates so much faster in general, I've tried
> to use Interpolation with InterpolationOrder -> 0. The problem is that
> the resulting InterpolatingFunction doesn't behave the way (I think)
> it ought to. For example, let
> 
> g = Interpolation[{{1, 1/3}, {2, 2/3}, {3, 1}}, InterpolationOrder -> 0]
> 
> Then, g /@ {1, 2, 3} returns {2/3, 2/3, 1} instead of {1/3, 2/3, 1}.
> In addition, g is continuous from the left rather than from the right.
> 
> Obviously I am not aware of the considerations that went into
> determining the behavior of InterpolatingFunction when
> InterpolationOrder -> 0.
> 
> So I have two questions: 
> 
> (1) Does anyone have any opinions about how InterpolatingFunction
> ought to behave with InterpolationOrder -> 0?
> 
> (2) Does anyone have a faster way to evaluate an empirical CDF than a
> compiled Which function?
> 
> By the way, here's my current version:
> 
> CompileEmpiricalCDF[list_?(VectorQ[#, NumericQ] &)] :=
>   Block[{x}, Compile[{{x, _Real}}, Evaluate[
>     Which @@ Flatten[
>       Append[
>           Transpose[{
>             Thread[x < Sort[list]],
>             Range[0, 1 - 1/#, 1/#] & @ Length[list]
>               }],
>         {True, 1}]]
>   ]]]
> 
> --Mark
> 



<><><><><><><><><><><><>
Johannes Ludsteck
Economics Department
University of Regensburg
Universitaetsstrasse 31
93053 Regensburg
Phone +49/0941/943-2741

