       Re: Re: Histogram normalization

• To: mathgroup at smc.vnet.net
• Subject: [mg40052] Re: [mg40012] Re: [mg40005] Histogram normalization
• From: Dr Bob <drbob at bigfoot.com>
• Date: Mon, 17 Mar 2003 03:35:28 -0500 (EST)
• References: <200303160720.CAA02839@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

```
OK, but that nonparametric density gives a poor fit to the data, even with
1000 samples.

Using MathStatica (see http://www.mathstatica.com/software/ ), the
following usually does a good job:

<< mathStatica.m
<< Graphics`Graphics`
f = 1/(E^(x^2/2)*Sqrt[2*Pi]); domain[f] = {x, -Infinity, Infinity};

Attributes[empiricalAnalysis] = {HoldFirst};
Options[empiricalAnalysis] = {bandwidthFactor -> 1, binFactor -> 0.15,
minBins -> 25, maxBins -> 50};
empiricalAnalysis[f_, data_, options___Rule] :=
Module[{c, p, empiricalData, empiricalPDF, factor, categories},
(* clamp the bin count to the range minBins..maxBins *)
{factor, categories} = {bandwidthFactor, Max[minBins, Min[maxBins,
Round[binFactor*Length@data]]]} /. {options} /. Options@empiricalAnalysis;
Off[CompiledFunction::"cfn"];
c = Bandwidth[data, f, Method -> SheatherJones];
On[CompiledFunction::"cfn"];
p = Block[{$DisplayFunction = Identity}, NPKDEPlot[data, f, c*factor]];
empiricalData = First@Cases[p, Line[a_] -> a, Infinity];
empiricalPDF = Interpolation[empiricalData];
DisplayTogether[Histogram[data, HistogramScale -> 1,
HistogramCategories -> categories], p,
Plot[empiricalPDF[x], {x, Min@data, Max@data}]];
empiricalPDF
]

<< "Statistics`ContinuousDistributions`"
Y = RandomArray[StudentTDistribution[4], 500];
empiricalAnalysis[f, Y];

That plots the empirical PDF along with the scaled histogram on the same
plot, and returns the empirical PDF as an InterpolatingFunction.
Decrease bandwidthFactor (for a less smooth density estimate) or increase
binFactor (for more histogram bins, up to maxBins) to capture or display
more "features" of the data.
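
(* A sketch only, assuming the empiricalAnalysis definition and the sample Y
from above. Halving the bandwidth gives a less smooth density estimate, and
raising maxBins to 75 lets all binFactor*Length[Y] = 75 bins be shown.
The option values here are illustrative, not recommendations. *)
empiricalAnalysis[f, Y, bandwidthFactor -> 0.5, maxBins -> 75];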

Sometimes there's a numerical error in the Bandwidth function (which causes
some MathStatica function or another to be evaluated in uncompiled form).
Hence the Off and On statements, which suppress the resulting
CompiledFunction::cfn warning.

If anyone wants to see example plots without buying MathStatica, contact me
and I'll send you a notebook.

Bobby

On Sun, 16 Mar 2003 02:20:18 -0500 (EST), Kyriakos Chourdakis
<tuxedomoon at yahoo.com> wrote:

> The ``smoothed histogram'' you are looking for is
> replicated by the nonparametric density. I think you
> might find the NonParametricDensity function below
> useful. It is a very simplified version  without
> control over the kernels etc. but it should do the
> trick.
>
> The code defines the nonparametric density, creates a
> 500 point sample from a t_4 distribution, and plots
> the ``smooth histogram'' of the sample together with
> the t_4.
>
> (***************************************************)
> (* Copy into .nb                                   *)
> Quit[]; << "Statistics`ContinuousDistributions`"
>
> NonParametricDensity[x_] := Module[{sx, g, gg, T, h},
> sx = StandardDeviation[x]; T = Length[x]; h = (sx*1.06)/T^0.2;
> g = Function[{u}, (1*Plus @@ (Exp[-((u - #1)^2/(2*h^2))] & ) /@ x)/
> (T*h*Sqrt[2*Pi])];
> FunctionInterpolation[g[u], {u, Min[x] - 4*h, Max[x] + 4*h}]];
>
> dist = StudentTDistribution[4]; Y = RandomArray[dist, 500]; NPf =
> NonParametricDensity[Y];
>
> Plot[{PDF[dist, x], NPf[x]}, {x, -5, 5}, Frame ->
> True, Axes -> False, PlotStyle -> {Thickness[0.], Thickness[0.01]}];
> (***************************************************)
>
> Kyriakos
>
> Kyriakos Chourdakis
> http://www.theponytail.net/

--
majort at cox-internet.com
Bobby R. Treat

```
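
For reference, the estimator in the quoted NonParametricDensity code can be written out as a formula. This is only a transcription of that code, with s_x standing for StandardDeviation[x], T for Length[x], and x_i for the sample points; the hat notation is added here for illustration and does not appear in the post.

```latex
\hat{f}(u) = \frac{1}{T\,h\,\sqrt{2\pi}} \sum_{i=1}^{T}
             \exp\!\left(-\frac{(u - x_i)^2}{2h^2}\right),
\qquad h = \frac{1.06\, s_x}{T^{0.2}}
```

The code then interpolates this function over the interval from Min[x] - 4h to Max[x] + 4h. The empiricalAnalysis code in the reply instead takes its bandwidth from Bandwidth with Method -> SheatherJones, scaled by bandwidthFactor.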
