Re: WeibullDistribution
- To: mathgroup at smc.vnet.net
- Subject: [mg42532] Re: WeibullDistribution
- From: bobhanlon at aol.com (Bob Hanlon)
- Date: Sat, 12 Jul 2003 20:53:12 -0400 (EDT)
- References: <beolj4$t25$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
I did not suggest computing the desired parameters from the first two moments of the data set. I used maximum likelihood estimation. Since the likelihood function or log-likelihood function is a long expression for large data sets, it is desirable to start the numerical process as close to the final answer as possible. The first two moments were used to obtain the initial estimate.

Bob Hanlon

In article <beolj4$t25$1 at smc.vnet.net>, Bill Rowe <listuser at earthlink.net> wrote:

<< Subject: Re: WeibullDistribution
From: Bill Rowe <listuser at earthlink.net>
To: mathgroup at smc.vnet.net
Date: Sat, 12 Jul 2003 09:49:24 +0000 (UTC)

On 7/11/03 at 2:57 AM, robert.nowak at ims.co.at (Robert Nowak) wrote:

> "usually" you don't have randomly noised PDF(x) function values at
> positions x. "usually" you only have random values which are expected
> to obey a distribution with a specific PDF.

Exactly right.

> in the "usual" case you therefore can't fit your data against the PDF.

Not exactly true, but there are problems with fitting against the PDF. That is why I suggested fitting the empirical cumulative hazard function for the data to the expected cumulative hazard function for the Weibull distribution.

For any distribution, the cumulative hazard function is -Log[1-CDF]. For a set of random data points, a reasonable estimator of the empirical CDF at data point x_j is given by (j - 0.5)/n, where j is the rank of x_j in the sorted data and n is the number of data samples. Or, coded into Mathematica, this would be

    H = Transpose[{Sort@x, -Log[1 - (Range[Length@x] - .5)/Length@x]}]

where x is the vector of data values.

Now, for a Weibull distribution the cumulative hazard function is

    H = (x/b)^a

Taking the logarithm of both sides yields

    Log[H] = a Log[x] - a Log[b]

where a and b are the desired parameters of the Weibull distribution.

So, setting H equal to the known empirical cumulative hazard function and doing a linear regression analysis of Log[H] vs Log[x] will give you the desired parameters: the slope of the fitted line is a and the intercept is -a Log[b], i.e.,

    f = Fit[Log@H, {1, t}, t];
    a = f[[2,1]]
    b = Exp[-f[[1]]/f[[2,1]]]

However, since the PDF is the derivative of the CDF, it is possible to estimate the PDF from the data set and fit it to the theoretical PDF. The key problem with this approach is that it accentuates errors in the data, and generally the interval between subsequent data points is too large to get an accurate estimate of the derivative. Since the empirical CDF is effectively a running sum, uncertainty in each data point tends to be suppressed.

> i think you have to do some of bob hanlons or similar calculations.

What Bob Hanlon suggested was computing the desired parameters from the first two moments in the data set. This is a point estimate of the parameters and is a valid approach. However, it isn't the most robust approach. A point estimate based on two chosen quantiles is more robust and, for the Weibull distribution, has a closed-form solution. >>
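
The closed-form two-quantile estimate mentioned at the end of the quoted message can be sketched as follows. This is only an illustration under stated assumptions, not code from either poster: the function name weibullFromQuantiles, the default probabilities 0.25 and 0.75, and the simulated sample are all hypothetical, and RandomVariate assumes a Mathematica version that provides it (8 or later). Since the Weibull CDF is 1 - Exp[-(x/b)^a], the p-quantile x_p satisfies Log[-Log[1 - p]] = a Log[x_p] - a Log[b], so two sample quantiles give two equations that can be solved directly for a and b.

```mathematica
(* Two-quantile point estimate of the Weibull shape a and scale b. *)
(* p1 and p2 are illustrative defaults, not values from the thread. *)
weibullFromQuantiles[data_List, p1_: 0.25, p2_: 0.75] :=
 Module[{x1, x2, h1, h2, a, b},
  (* sample quantiles at the two chosen probabilities *)
  {x1, x2} = Quantile[data, {p1, p2}];
  (* cumulative hazard -Log[1 - p] at each probability *)
  h1 = -Log[1 - p1];
  h2 = -Log[1 - p2];
  (* slope of Log[H] against Log[x] through the two points *)
  a = (Log[h2] - Log[h1])/(Log[x2] - Log[x1]);
  (* solve h1 == (x1/b)^a for b *)
  b = x1/h1^(1/a);
  {a, b}]

(* Hypothetical check on simulated data with known a = 2, b = 3 *)
sample = RandomVariate[WeibullDistribution[2, 3], 1000];
weibullFromQuantiles[sample]
```

With a sample of this size the returned pair should land reasonably close to {2, 3}. The result could also serve as a starting point for a maximum likelihood search, in the same role the moment-based initial estimate plays above.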