MathGroup Archive: July 2003 [00225]

Re: WeibullDistribution

To: mathgroup at smc.vnet.net
Subject: [mg42547] Re: WeibullDistribution
From: "Robert Nowak" <robert.nowak at ims.co.at>
Date: Mon, 14 Jul 2003 05:42:17 -0400 (EDT)
References: <beolj4$t25$1@smc.vnet.net>
Sender: owner-wri-mathgroup at wolfram.com

Hi Bill,

Could you please outline exactly how to do the fit, based on the array
called data?

(I have tried, but I think I am missing something.)



<<"Statistics`ContinuousDistributions`"

<< "Statistics`NonlinearFit`"

data = RandomArray[WeibullDistribution[5, 2], {1000}];

LH = PowerExpand[Log[PowerExpand[-Log[1 - CDF[WeibullDistribution[a, b], x]]]]]

(-a)*Log[b] + a*Log[x]

NonlinearFit[Log[data], LH, x, {a, b}]

0.4429931859259464 + 0.024051659740139397*Log[x]

(* in contrast to *)

-5*Log[2.] + 5.*Log[x]

-3.4657359027997265 + 5.*Log[x]

Regards, Robert



"Bill Rowe" <listuser at earthlink.net> wrote in message
news:beolj4$t25$1 at smc.vnet.net...
> On 7/11/03 at 2:57 AM, robert.nowak at ims.co.at (Robert Nowak) wrote:
>
> > "Usually" you don't have randomly noised PDF(x) function values at
> > positions x. "Usually" you only have random values which are expected
> > to obey a distribution with a specific PDF.
>
> Exactly right.
>
> > In the "usual" case you therefore can't fit your data against the PDF.
>
> Not exactly true, but there are problems with fitting against the PDF. That
> is why I suggested fitting the empirical cumulative hazard function for the
> data to the expected cumulative hazard function for the Weibull
> distribution.
>
> For any distribution, the cumulative hazard function is -Log[1-CDF]. For
> a set of random data points, a reasonable estimator of the empirical CDF
> at data point x_j is (j - 0.5)/n, where j is the rank of x_j in the sorted
> data and n is the number of data samples. Coded in Mathematica, the
> empirical cumulative hazard would be
>
> H = Transpose[{Sort@x, -Log[1 - (Range[Length@x] - .5)/Length@x]}]
>
> where x is the vector of data values.
>
> Now for a Weibull distribution the cumulative hazard function is H =
> (x/b)^a. Taking the logarithm of both sides yields
>
> Log[H] = a Log[x] - a Log[b]
>
> where a and b are the desired parameters of the Weibull distribution. So,
> setting H equal to the known empirical cumulative hazard function and doing
> a linear regression of Log[H] vs Log[x] gives the desired parameters: the
> slope is a and the intercept is -a Log[b], i.e.,
>
> f = Fit[Log@H, {1,t},t];
> a = f[[2,1]]
> b = Exp[-f[[1]]/f[[2,1]]]
>
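
The hazard-plot fit described above can be sketched as follows. This is only a minimal sketch under stated assumptions: the names xs, ecdf, haz, pts, aHat and bHat are illustrative, and the sample is drawn by inverse-transform sampling with Random[] (so that no package has to be loaded) rather than with RandomArray.

```mathematica
(* Illustrative sample from a Weibull with shape a = 5 and scale b = 2.
   Assumption: inverse-transform sampling, x = b (-Log[u])^(1/a), u uniform on (0, 1). *)
data = Table[2 (-Log[Random[]])^(1/5), {1000}];

n = Length[data];
xs = Sort[data];

(* Empirical CDF at the j-th sorted point: (j - 0.5)/n *)
ecdf = (Range[n] - 0.5)/n;

(* Empirical cumulative hazard: -Log[1 - CDF] *)
haz = -Log[1 - ecdf];

(* Linear regression of Log[H] on Log[x], using Log[H] = a Log[x] - a Log[b] *)
pts = Transpose[{Log[xs], Log[haz]}];
f = Fit[pts, {1, t}, t];
{c0, c1} = CoefficientList[f, t];

aHat = c1            (* slope estimates the shape a *)
bHat = Exp[-c0/c1]   (* intercept is -a Log[b], so b = Exp[-c0/c1] *)
```

For a sample of this size the two estimates should land near the true values a = 5 and b = 2; the exact numbers vary from run to run.
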
> However, since the PDF is the derivative of the CDF, it is possible to
> estimate the PDF from the data set and fit it to the theoretical PDF. The
> key problem with this approach is that it accentuates errors in the data,
> and generally the interval between subsequent data points is too large to
> get an accurate estimate of the derivative. Since the empirical CDF is
> effectively a sum, uncertainty in each data point tends to be suppressed.
>
> > I think you have to do some of Bob Hanlon's or similar calculations.
>
> What Bob Hanlon suggested was computing the desired parameters from the
> first two moments of the data set. This is a point estimate of the
> parameters and is a valid approach. However, it isn't the most robust
> approach. A point estimate based on two chosen quantiles is more robust
> and, for the Weibull distribution, has a closed-form solution.
>
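
The closed form mentioned above follows from the same relation, Log[H] = a Log[x] - a Log[b], applied at two quantiles. A minimal sketch, assuming the quartiles p = 1/4 and p = 3/4 as the chosen quantiles (the message does not say which to use); the helper names weibullFromQuantiles and q are hypothetical, and the sample is again drawn by inverse-transform sampling.

```mathematica
(* Solve (x1/b)^a = -Log[1 - p1] and (x2/b)^a = -Log[1 - p2] for {a, b}.
   Hypothetical helper; x1, x2 are the quantiles at probabilities p1, p2. *)
weibullFromQuantiles[{p1_, x1_}, {p2_, x2_}] :=
  Module[{h1 = -Log[1 - p1], h2 = -Log[1 - p2], a},
    a = (Log[h2] - Log[h1])/(Log[x2] - Log[x1]);
    {a, x1/h1^(1/a)}]

(* Illustrative sample with shape a = 5 and scale b = 2 (inverse-transform sampling) *)
data = Table[2 (-Log[Random[]])^(1/5), {1000}];
xs = Sort[data];
n = Length[xs];

(* Sample quantile at p, taken as the Ceiling[p n]-th sorted value (one simple convention) *)
q[p_] := xs[[Ceiling[p n]]]

{aHat, bHat} = weibullFromQuantiles[{1/4, q[1/4]}, {3/4, q[3/4]}]
```

Feeding the helper the exact quantiles of a Weibull distribution returns its parameters exactly; with sample quantiles the result is only an estimate, and other quantile pairs may behave better or worse depending on the data.
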
