MathGroup Archive 2005

Re: Empirical CDF

  • To: mathgroup at smc.vnet.net
  • Subject: [mg58926] Re: Empirical CDF
  • From: Mark Fisher <mark at markfisher.net>
  • Date: Sun, 24 Jul 2005 01:21:48 -0400 (EDT)
  • References: <dbt3oh$skn$1@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

In the meantime, Piecewise has arrived, which Mathematica *does* know how to 
integrate. Here is an example of how to use it to compute an empirical 
cumulative distribution function given a list of observations.

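(* Build the empirical CDF of `list` as a pure function of x. *)
MakeEDFPiecewise[list_] :=
  With[{d = 1/Length[list]},
    Block[{x},  (* shield x from any global value while building the body *)
      Function @@ {x,
        Piecewise[
          (* values 1, 1 - d, ..., d paired with x >= (data sorted descending) *)
          Transpose[{Range[1, d, -d], Thread[x >= Reverse[Sort[list]]]}]
        ]}
    ]
  ]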

rand = Table[Random[], {100}];
fun = MakeEDFPiecewise[rand];
Plot[fun[x], {x, -.25, 1.25}, PlotPoints -> 300];

fun2 = Block[{x}, Function @@ {x, Integrate[fun[x], x]}];
Plot[fun2[x], {x, -.25, 1.25}, PlotPoints -> 300, PlotRange -> All]
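
As a quick sanity check on the step heights and on tied observations, one 
can evaluate the function on a small hand-made sample. This is only a sketch: 
the names sample and edf are made up for illustration, and the sample 
deliberately contains a repeated value (0.5) to show that the jump there is 
2/4 rather than 1/4.

sample = {0.2, 0.5, 0.5, 0.9};
edf = MakeEDFPiecewise[sample];

(* evaluate below, at, and between the observations *)
edf /@ {0.1, 0.2, 0.5, 0.7, 0.9, 1.0}
(* {0, 1/4, 3/4, 3/4, 1, 1} *)

(* the same value obtained by direct counting, for comparison *)
Count[sample, v_ /; v <= 0.7]/Length[sample]
(* 3/4 *)

Because Piecewise uses the first condition that holds, the duplicate 
condition x >= 0.5 is never reached, and the value at a tie is the larger 
of the two candidate heights, as it should be for a right-continuous ECDF.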

For faster execution one can replace Function with Compile in the 
definition of MakeEDFPiecewise. (I use an Option to control the choice.) 
However, there are three costs: it takes longer to compute the function 
itself, the resulting object is larger, and Mathematica complains when one 
tries to integrate it analytically (and returns an uncompiled result).

--Mark

P.S. Question to Mathematica developers: Why does fun2'[x] sometimes (but not 
always) produce the error message Reduce::ratnz?

David Kahle wrote:

> MathGroup -
> 
> swidrygiello posted on Fri, 13 Sep 2002 a question on how to create the
> empirical cumulative distribution function (ECDF) with Mathematica.  Since
> then, a few others (Mark Fisher, for example) have posted
> responses utilizing the Interpolation[] command.  However, similar results
> can be achieved using the simple code :
> 
> For[i=1,i<(Length[rand]+1),preF[x_,i_]:=UnitStep[x-rand[[i]]],i++]
> F[x_]:=Sum[preF[x,i],{i,Length[rand]}]/Length[rand]
> 
> Where 'rand' is the vector containing the random observations.  The
> resulting function F is right continuous, limits to 0 and 1 as we would
> like it to, and has the correct step sizes.  Even better, Mathematica is
> more comfortable manipulating UnitStep functions than it is piecewise
> functions defined with the Which[] command.  For example, if we would like
> to integrate the ECDF to test for certain orderings (Stochastic dominance,
> etc.), Mathematica understands integrating the F as defined above, but if
> we use the Which[] command it resorts to numerical techniques which
> typically fail for various reasons.  Hope it helps.
> 
> David Kahle
> david.kahle at richmond.edu
> 
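
The UnitStep construction quoted above can also be written without the For 
loop, since UnitStep threads over lists. The following is a minimal sketch, 
not code from either post: the name ecdfUnitStep is hypothetical, and the 
data are assumed to be a flat list of real numbers (exact rationals here, so 
that the integral comes out exactly).

(* ECDF as the average of unit steps located at the observations *)
ecdfUnitStep[data_List][x_] := Total[UnitStep[x - data]]/Length[data]

sample = {1/5, 1/2, 1/2, 9/10};
ecdfUnitStep[sample] /@ {1/10, 1/2, 9/10}
(* {0, 3/4, 1} *)

(* for data lying in [0, 1], the integral of the ECDF over [0, 1] *)
(* should equal 1 - Mean[data]                                    *)
Integrate[ecdfUnitStep[sample][t], {t, 0, 1}] == 1 - Mean[sample]

The last line is meant as a check on the integration rather than as a 
stochastic dominance test; comparing the integrated ECDFs of two samples 
would follow the same pattern.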

