       Re: FindFit, Weibull

• To: mathgroup at smc.vnet.net
• Subject: [mg92891] Re: FindFit, Weibull
• From: Bill Rowe <readnews at sbcglobal.net>
• Date: Fri, 17 Oct 2008 05:23:06 -0400 (EDT)

```On 10/16/08 at 5:04 AM, petervansummeren at gmail.com (P_ter) wrote:

>I found some data which seemed to fit a WeibullDistribution. The
>data are: pr= {16, 34, 53, 75, 93, 120}; After adding the median
>rank to the data: praxis={{16, 0.109101}, {34, 0.26445}, {53,
>0.421407}, {75, 0.578593}, {93, 0.73555}, {120, 0.890899}}; I
>checked with LeastSquares the parameters: \[Alpha]=1.4301;\[Beta] =
>76.318. So far ok. But the following did not work:
>fit2 = {a, b} /. FindFit[praxis, CDF[WeibullDistribution[a, b], x],
>{a ,b}, {x}]

There are several things that can be done. First, with complete
(uncensored) data it is probably better to use an approach other
than non-linear regression to estimate the Weibull parameters.
For example:

In:= meanLog = ExpectedValue[Log[x], WeibullDistribution[a,
b], x]

Out= Log[b] - EulerGamma/a

In:= logMedian =
Log@Quantile[WeibullDistribution[a, b], 1/2] // PowerExpand

Out= Log[b] + Log[Log[2]]/a

In:= pr = {16, 34, 53, 75, 93, 120};
NSolve[{meanLog == Mean[Log[pr // N]],
logMedian == Log[Median[pr // N]]}, {a, b}]

Out= {{a->1.20807,b->86.6841}}

Note that other statistics could be used in place of the median
of the original data to find the parameters, and they would lead
to somewhat different estimates.
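
(* A minimal sketch of one such alternative, not part of the original
   session: match the mean and the variance of Log[x] instead of the
   mean and the median. It relies on the standard result that, for a
   Weibull variate with shape a, Log[x] has variance Pi^2/(6 a^2).
   The names logs, aMom and bMom are introduced here for illustration
   only. *)
pr = {16, 34, 53, 75, 93, 120};
logs = Log[N[pr]];

(* Shape from the sample standard deviation of the logs *)
aMom = Pi/(Sqrt[6] StandardDeviation[logs]);

(* Scale from meanLog == Log[b] - EulerGamma/a, solved for b *)
bMom = Exp[Mean[logs] + EulerGamma/aMom];

{aMom, bMom}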

Another approach would be to maximize the loglikelihood
function. For example,

In:= logP = Log@PDF[WeibullDistribution[a, b], x] // PowerExpand;
ll = Total[logP /. x -> pr // N];
NMaximize[{ll, 0 < a && 0 < b}, {a, b}]

Out= {-29.5849,{a->1.93268,b->73.5261}}
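
(* A hedged cross-check, not part of the original session. For complete
   data the Weibull likelihood equations reduce to a single equation in
   the shape a, after which the scale b follows in closed form. The
   names aMLE and bMLE and the starting value 1.5 are assumptions made
   for this sketch. The result should agree with the NMaximize output
   above to within numerical tolerance. *)
pr = {16, 34, 53, 75, 93, 120} // N;

(* Score equation for the shape parameter *)
aMLE = a /. FindRoot[
    Total[pr^a Log[pr]]/Total[pr^a] - 1/a - Mean[Log[pr]] == 0,
    {a, 1.5}];

(* Scale parameter given the shape *)
bMLE = Mean[pr^aMLE]^(1/aMLE);

{aMLE, bMLE}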

But if you prefer the estimates you get from doing a regression
analysis, you can avoid the problem you are encountering by
placing constraints on a and b as well as providing better
starting values. That is,

In:= FindFit[praxis, {CDF[WeibullDistribution[a, b], x],
a > 0 && b > 0}, {{a, 1}, {b, 100}}, {x}]

Out= {a->1.52051,b->77.4766}

Note that it is important to supply reasonably good starting
values together with the constraints, since using the
constraints alone results in

In:= FindFit[praxis, {CDF[WeibullDistribution[a, b], x],
a > 0 && b > 0}, {a, b}, {x}]

Out= {a->0.999501,b->1.00018}

which are clearly not very good estimates.

Alternatively, since

In:= Assuming[a > 0 && b > 0 && x > 0,
Log[-Log[1 - CDF[WeibullDistribution[a, b], x]]] // FullSimplify]

Out= a*Log[x/b]

the problem can be linearized by taking logarithms, which avoids
the difficulties inherent in non-linear fitting. In particular,
a linear fit needs no initial starting values, so the problem of
finding good ones for the non-linear regression disappears.
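
(* A minimal sketch of the linearized fit described above, not part of
   the original session. It assumes praxis holds the {time, median rank}
   pairs quoted at the top of this message. The names lin, line, aHat
   and bHat are introduced here for illustration only. *)
praxis = {{16, 0.109101}, {34, 0.26445}, {53, 0.421407},
   {75, 0.578593}, {93, 0.73555}, {120, 0.890899}};

(* Map each {x, F} pair to {Log[x], Log[-Log[1 - F]]}, so that the
   model becomes the straight line a*u - a*Log[b] with u = Log[x] *)
lin = {Log[#[[1]]], Log[-Log[1 - #[[2]]]]} & /@ praxis;

(* Ordinary linear least squares; no starting values are required *)
line = Fit[lin, {1, u}, u];

(* The slope estimates a; the intercept equals -a*Log[b] *)
aHat = Coefficient[line, u];
bHat = Exp[-(line /. u -> 0)/aHat];

{aHat, bHat}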

>So, I used the parameters in the Weibull CDF and the times from pr
>to get: praxis1= {{16, 0.106157}, {34, 0.27559}, {53, 0.451308},
>{75, 0.623149}, {93, 0.73256}, {120, 0.848082}} (something went
>wrong with editing, so maybe I sent a by accident an unfinished
>message) I checked with FindFit and it worked. But then I gave extra
>some variations: praxis2 = {#, RandomReal[{0.95, 1.05}]
>CDF[WeibullDistribution[1.4, 76.318], #]} & /@ pr; fit4 = {a,b} /.
>FindFit[praxis2, CDF[WeibullDistribution[a,b], x], {a,b}, {x}] After
>a few times trying the two above lines FindFit did not work. It
>could be ok. But it seems to me that FindFit with a such a small
>relative variation in the data from 0.95 to 1.05 should still work.

It looks like your intent was to generate some random numbers
from a Weibull distribution to use as a test for FindFit here.
But you have set up a problem that will not be solved correctly
by a standard regression analysis. In the standard problem,
statistical error is an additive term. Here, you have made it a
multiplicative term. Doing this doesn't make it significantly
more difficult for FindFit to get a solution, but it does make
it much more likely that the solution will differ from what you
expect and be harder for you to interpret. Simply doing
RandomReal[WeibullDistribution[1.5,77.5],{6}] will give you an