Re: Empirical CDF

- *To*: mathgroup at smc.vnet.net
- *Subject*: [mg58926] Re: Empirical CDF
- *From*: Mark Fisher <mark at markfisher.net>
- *Date*: Sun, 24 Jul 2005 01:21:48 -0400 (EDT)
- *References*: <dbt3oh$skn$1@smc.vnet.net>
- *Sender*: owner-wri-mathgroup at wolfram.com

In the meantime, Piecewise has arrived, which Mathematica *does* know how to integrate. Here is an example of how to use it to compute an empirical cumulative distribution function given a list of observations.

    (* Build the ECDF as a pure function of x. With n observations, d = 1/n. *)
    (* Values run 1, 1 - d, ..., d and are paired with x >= (data sorted descending). *)
    MakeEDFPiecewise[list_] :=
      With[{d = 1/Length[list]},
        Block[{x},
          Function @@ {x,
            Piecewise[
              Transpose[{Range[1, d, -d], Thread[x >= Reverse[Sort[list]]]}]
            ]}
        ]]

    rand = Table[Random[], {100}];
    fun = MakeEDFPiecewise[rand];
    Plot[fun[x], {x, -.25, 1.25}, PlotPoints -> 300];

    (* Integrate the ECDF symbolically and wrap the result as a function. *)
    fun2 = Block[{x}, Function @@ {x, Integrate[fun[x], x]}];
    Plot[fun2[x], {x, -.25, 1.25}, PlotPoints -> 300, PlotRange -> All]

For faster execution one can replace Function with Compile in the definition of MakeEDFPiecewise. (I use an Option to control the choice.) However, there are three costs: it takes longer to compute the function itself, the resulting object is larger, and Mathematica complains when one tries to integrate it analytically (and returns an uncompiled result).

--Mark

P.S. Question to Mathematica developers: Why does fun2'[x] sometimes (but not always) produce the error message Reduce::ratnz?

David Kahle wrote:
> MathGroup -
>
> swidrygiello posted on Fri, 13 Sep 2002 a question on how to create the
> empirical cumulative distribution function (ECDF) with Mathematica. Since
> then, a few others (Mark Fisher, , for example) have posted
> responses utilizing the Interpolation[] command. However, similar results
> can be achieved using the simple code:
>
> For[i=1,i<(Length[rand]+1),preF[x_,i_]:=UnitStep[x-rand[[i]]],i++]
> F[x_]:=Sum[preF[x,i],{i,Length[rand]}]/Length[rand]
>
> Where 'rand' is the vector containing the random observations. The
> resulting function F is right continuous, limits to 0 and 1 as we would
> like it to, and has the correct step sizes. Even better, Mathematica is
> more comfortable manipulating UnitStep functions than it is piecewise
> functions defined with the Which[] command. For example, if we would like
> to integrate the ECDF to test for certain orderings (stochastic dominance,
> etc.), Mathematica understands integrating the F as defined above, but if
> we use the Which[] command it resorts to numerical techniques which
> typically fail for various reasons. Hope it helps.
>
> David Kahle
> david.kahle at richmond.edu
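
A closed-form cross-check for the integrated ECDF discussed in this thread: each observation x_i contributes UnitStep[x - x_i]/n to the ECDF, so integrating from -Infinity to t gives Sum_i Max[t - x_i, 0]/n. The sketch below is a minimal illustration of that formula and is not code from either post. The names MakeEDFIntegral, obs and edfInt are hypothetical, and the result may differ from an Integrate-based antiderivative such as fun2 by an additive constant.

```mathematica
(* Hypothetical helper, not from the original posts. *)
(* Integral of the ECDF from -Infinity to t: (1/n) Sum_i Max[t - x_i, 0]. *)
MakeEDFIntegral[list_] :=
  With[{data = list, n = Length[list]},
    Function[t, Total[Max[t - #, 0] & /@ data]/n]]

(* Small example with three observations. *)
obs = {0.2, 0.5, 0.9};
edfInt = MakeEDFIntegral[obs];
edfInt[1.]  (* computes (0.8 + 0.5 + 0.1)/3 *)
edfInt[0.]  (* no observation lies below 0, so every term in the sum is 0 *)
```

Because this version avoids symbolic integration entirely, it can serve as a numerical sanity check on a symbolic result; comparing two such integrated ECDFs pointwise is one way to approach the ordering tests mentioned in the quoted message.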