Re: Re: Histogram normalization
- To: mathgroup at smc.vnet.net
- Subject: [mg40052] Re: [mg40012] Re: [mg40005] Histogram normalization
- From: Dr Bob <drbob at bigfoot.com>
- Date: Mon, 17 Mar 2003 03:35:28 -0500 (EST)
- References: <200303160720.CAA02839@smc.vnet.net>
- Reply-to: drbob at bigfoot.com
- Sender: owner-wri-mathgroup at wolfram.com
OK, but that nonparametric density gives a poor fit to the data, even with 1000 samples.

Using MathStatica (see http://www.mathstatica.com/software/ ), the following usually does a good job:

<< mathStatica.m
<< Graphics`Graphics`

f = 1/(E^(x^2/2)*Sqrt[2*Pi]);
domain[f] = {x, -Infinity, Infinity};

Attributes[empiricalAnalysis] = {HoldFirst};
Options[empiricalAnalysis] = {bandwidthFactor -> 1, binFactor -> 0.15,
   minBins -> 25, maxBins -> 50};

empiricalAnalysis[f_, data_, options___Rule] :=
  Module[{c, p, empiricalData, empiricalPDF, factor, categories},
   (* clamp the number of bins to the range [minBins, maxBins] *)
   {factor, categories} = {bandwidthFactor,
       Min[maxBins, Max[minBins, Round[binFactor*Length@data]]]} /.
      {options} /. Options@empiricalAnalysis;
   (* Bandwidth sometimes falls back to uncompiled evaluation; silence that message *)
   Off[CompiledFunction::"cfn"];
   c = Bandwidth[data, f, Method -> SheatherJones];
   On[CompiledFunction::"cfn"];
   (* build the kernel density plot without displaying it *)
   p = Block[{$DisplayFunction = Identity}, NPKDEPlot[data, f, c*factor]];
   (* extract the plotted curve and interpolate it *)
   empiricalData = First@Cases[p, Line[a_] -> a, Infinity];
   empiricalPDF = Interpolation[empiricalData];
   DisplayTogether[
    Histogram[data, HistogramScale -> 1, HistogramCategories -> categories],
    p, Plot[empiricalPDF[x], {x, Min@data, Max@data}]];
   empiricalPDF
   ]

<< "Statistics`ContinuousDistributions`"
Y = RandomArray[StudentTDistribution[4], 500];
empiricalAnalysis[f, Y];

That plots the empirical PDF along with the scaled histogram on the same plot, and returns the empirical PDF as an InterpolatingFunction.

bandwidthFactor can be decreased to capture more "features" of the data; the bin options (binFactor, minBins, maxBins) control how many of them the histogram displays.

Sometimes there's a numerical error in the Bandwidth function (which causes some MathStatica function or another to be evaluated in uncompiled form). Hence the Off and On statements.

If anyone wants to see example plots without buying MathStatica, contact me and I'll send you a notebook.

Bobby

On Sun, 16 Mar 2003 02:20:18 -0500 (EST), Kyriakos Chourdakis <tuxedomoon at yahoo.com> wrote:

> The ``smoothed histogram'' you are looking for is
> replicated by the nonparametric density. I think you
> might find the NonParametricDensity function below
> useful. It is a very simplified version without
> control over the kernels etc. but it should do the
> trick.
>
> The code defines the nonparametric density, creates a
> 500 point sample from a t_4 distribution, and plots
> the ``smooth histogram'' of the sample together with
> the t_4.
>
> (***************************************************)
> (* Copy into .nb *)
> Quit[];
> << "Statistics`ContinuousDistributions`"
>
> NonParametricDensity[x_] := Module[{sx, g, gg, T, h},
>    sx = StandardDeviation[x];
>    T = Length[x];
>    h = (sx*1.06)/T^0.2;
>    g = Function[{u},
>      (1*Plus @@ (Exp[-((u - #1)^2/(2*h^2))] & ) /@ x)/(T*h*Sqrt[2*Pi])];
>    FunctionInterpolation[g[u], {u, Min[x] - 4*h, Max[x] + 4*h}]];
>
> dist = StudentTDistribution[4];
> Y = RandomArray[dist, 500];
> NPf = NonParametricDensity[Y];
>
> Plot[{PDF[dist, x], NPf[x]}, {x, -5, 5}, Frame -> True,
>   Axes -> False, PlotStyle -> {Thickness[0.], Thickness[0.01]}];
> (***************************************************)
>
> Kyriakos
>
> Kyriakos Chourdakis
> http://www.theponytail.net/

--
majort at cox-internet.com
Bobby R. Treat
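
For reference, the quoted NonParametricDensity is a Gaussian kernel density estimate with the rule-of-thumb bandwidth h = 1.06 s n^(-1/5), where s is the sample standard deviation and n the sample size. A minimal sketch of that same estimator follows, using only built-in functions and skipping the FunctionInterpolation step. The name gaussianKDE and the sample values are hypothetical and do not come from either message.

```mathematica
(* Hypothetical helper, not from the thread: Gaussian kernel density
   estimate with rule-of-thumb bandwidth h = 1.06 s n^(-1/5). *)
gaussianKDE[data_List] := Module[{n, s, h},
  n = Length[data];
  (* sample standard deviation, computed by hand so no package is needed *)
  s = Sqrt[(Plus @@ ((data - (Plus @@ data)/n)^2))/(n - 1)];
  h = 1.06 s/n^0.2;
  (* average of Gaussian bumps of width h centred on each data point *)
  Function[u, (Plus @@ Exp[-(u - data)^2/(2 h^2)])/(n h Sqrt[2 Pi])]]

(* Made-up sample, for illustration only *)
sample = {-1.2, 0.3, 0.8, 1.5, 2.1};
fhat = gaussianKDE[sample];

(* A density should integrate to about 1 *)
NIntegrate[fhat[u], {u, -Infinity, Infinity}]
```

Because the estimate is an average of normalized Gaussians, its area is 1. That is the same normalization HistogramScale -> 1 applies to the histogram, which is why the two can be drawn on one plot.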
- References:
- Re: Histogram normalization
- From: Kyriakos Chourdakis <tuxedomoon@yahoo.com>
- Re: Histogram normalization