MathGroup Archive: September 2002 [00335]


RE: FW: Re: empirical CDF

To: mathgroup at smc.vnet.net
Subject: [mg36717] RE: [mg36664] FW: [mg36643] Re: [mg36619] empirical CDF
From: "DrBob" <drbob at bigfoot.com>
Date: Fri, 20 Sep 2002 04:16:35 -0400 (EDT)
Reply-to: <drbob at bigfoot.com>
Sender: owner-wri-mathgroup at wolfram.com

True, but in this case, excitement means being misled far too often.

I've never favored histograms much, except when there are lots of bars
and little apparent noise.  Otherwise, unless you're very lucky, a
histogram doesn't even locate the mode well.

Bobby Treat

-----Original Message-----
From: Blimbaum Jerry DLPC [mailto:BlimbaumJE at ncsc.navy.mil] 
To: mathgroup at smc.vnet.net
Subject: [mg36717] RE: [mg36664] FW: [mg36643] Re: [mg36619] empirical CDF

"Unlikely accidents", I think, are sometimes the most exciting part of
science...jerry

-----Original Message-----
From: DrBob [mailto:drbob at bigfoot.com]
To: mathgroup at smc.vnet.net
Subject: [mg36717] RE: [mg36664] FW: [mg36643] Re: [mg36619] empirical CDF

I'd say the applet demonstrates that histograms are useless unless we
choose a pretty small bin-width.  Choosing a bin-width of significant
size that doesn't yield a misleading histogram is an unlikely accident.

Bobby

-----Original Message-----
From: Blimbaum Jerry DLPC [mailto:BlimbaumJE at ncsc.navy.mil] 
To: mathgroup at smc.vnet.net
Subject: [mg36717] [mg36664] FW: [mg36643] Re: [mg36619] empirical CDF

There is a very nice Java applet at
http://statman.stat.sc.edu/~west/javahtml/classes/ , in which you can
replace the data in the applet with your own. It gives you a real-time
histogram plot and lets you alter the bin width and see how this affects
the histogram....it wasn't until I saw this that I understood the
significance of choosing the bin width......jerry blimbaum

-----Original Message-----
From: Bill Rowe [mailto:listuser at earthlink.net]
To: mathgroup at smc.vnet.net
Subject: [mg36717] [mg36664] [mg36643] Re: [mg36619] empirical CDF

On 9/13/02 at 11:33 PM, swidrygiello at wp.pl (Swidrygiello) wrote:

>Does anybody know how to calculate in Mathematica: 
>a)empirical CDF,
>b)empirical PDF, 
>c)normal QQ-plot; 
>d)QQ-plot two different random samples?!

Yes, but there are a number of issues, particularly with an empirical
PDF. A very nice package that does all of the above and more is
mathStatica. See http://www.mathstatica.com for details.

Obviously, it is less expensive to write your own functions.

Just recently, in message [mg36613], Mark Fisher posted code that
addresses the empirical CDF. However, in this code you may want to
replace 1/n with 1/(n+1) or (j-0.5)/n, depending on your application.
Note that these changes have no significant effect for large data sets.
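
To make the plotting-position choices concrete, here is a minimal sketch
in Python rather than Mathematica. It is not Mark Fisher's code; the
function name ecdf_points and the rule labels are made up for this
illustration. It sorts the data and attaches a probability to the j-th
smallest value under each convention.

```python
def ecdf_points(data, rule="j/n"):
    """Return (x, p) pairs: the sorted values with their plotting positions.

    The name and the rule labels are illustrative assumptions only.
    """
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")
    # j runs from 1 to n over the sorted sample.
    rules = {
        "j/n": lambda j: j / n,                # plain empirical CDF
        "j/(n+1)": lambda j: j / (n + 1),      # never reaches 1 at the maximum
        "(j-0.5)/n": lambda j: (j - 0.5) / n,  # midpoint of each step
    }
    if rule not in rules:
        raise ValueError("unknown rule: " + rule)
    position = rules[rule]
    return [(x, position(j)) for j, x in enumerate(xs, start=1)]


if __name__ == "__main__":
    sample = [4, 1, 3, 2]
    for rule in ("j/n", "j/(n+1)", "(j-0.5)/n"):
        print(rule, ecdf_points(sample, rule))
```

For any rank j, the three conventions differ by less than 1/n, which is
why the choice stops mattering for large data sets.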

The key issue with an empirical PDF is deciding the bin width. A simple
approach would be to use the functions in Statistics`DataManipulation`
and Graphics`Graphics`. Look at the functions Histogram, Frequencies and
BinListCounts. More sophisticated approaches involve kernel methods.
These methods will generate smoother estimates for the PDF. Again, the
key is bandwidth. There is no a priori choice for bin width or
bandwidth. Bad choices will obscure significant features in the data
set.
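
As a rough illustration of the two approaches, the following Python
sketch (again not Mathematica, and not the package functions named
above) builds a density-scaled histogram for a given bin width and a
Gaussian kernel estimate for a given bandwidth. The function names, the
origin parameter and the choice of a Gaussian kernel are assumptions
made for this example.

```python
import math


def histogram_density(data, width, origin=0.0):
    """Map each non-empty bin's left edge to count / (n * width)."""
    n = len(data)
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)  # index of the bin holding x
        counts[k] = counts.get(k, 0) + 1
    return {origin + k * width: c / (n * width)
            for k, c in sorted(counts.items())}


def gaussian_kde_estimate(data, h):
    """Return f(x) = (1/(n*h)) * sum of standard normal pdf((x - xi)/h)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return f


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.9, 1.1, 1.5, 2.7]
    # Same data, two bin widths: compare how the shape changes.
    for width in (1.0, 0.5):
        print(width, histogram_density(sample, width))
    # Same data, two bandwidths, evaluated at one point.
    for h in (0.2, 1.0):
        print(h, gaussian_kde_estimate(sample, h)(1.0))
```

Running it with a few different widths and bandwidths on your own data
shows how much the apparent shape depends on that single parameter.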
