Re: FindFit, Weibull
- To: mathgroup at smc.vnet.net
- Subject: [mg92891] Re: FindFit, Weibull
- From: Bill Rowe <readnews at sbcglobal.net>
- Date: Fri, 17 Oct 2008 05:23:06 -0400 (EDT)
On 10/16/08 at 5:04 AM, petervansummeren at gmail.com (P_ter) wrote: >I found some data which seemed to fit a WeibullDistribution. The >data are: pr= {16, 34, 53, 75, 93, 120}; After adding the median >rank to the data: praxis={{16, 0.109101}, {34, 0.26445}, {53, >0.421407}, {75, 0.578593}, {93, 0.73555}, {120, 0.890899}}; I >checked with LeastSquares the parameters: \[Alpha]=1.4301;\[Beta] = >76.318. So far ok. But the following did not work: >fit2 = {a, b} /. FindFit[praxis, CDF[WeibullDistribution[a, b], x], >{a ,b}, {x}] There are several things that can be done. First, with complete data it is probably better to use a different approach than non-linear regression to find estimates for the Weibull parameters. For example: In[1]:= meanLog = ExpectedValue[Log[x], WeibullDistribution[a, b], x] Out[1]= Log[b] - EulerGamma/a In[2]:= logMedian = Log@Quantile[WeibullDistribution[a, b], 1/2] // PowerExpand Out[2]= log(b)+log(log(2))/a In[3]:= pr = {16, 34, 53, 75, 93, 120}; NSolve[{meanLog == Mean[Log[pr // N]], logMedian == Log[Median[pr // N]]}, {a, b}] Out[3]= {{a->1.20807,b->86.6841}} Note, instead of using the median of the original data there are other statistics that could be used to find the parameters which would lead to somewhat different estimated values. Another approach would be to maximize the loglikelihood function. For example, In[5]:= logP = Log@PDF[WeibullDistribution[a, b], x] // PowerExpand; ll = Total[logP /. x -> pr // N]; NMaximize[{ll, 0 < a && 0 < b}, {a, b}] Out[7]= {-29.5849,{a->1.93268,b->73.5261}} But if you prefer the estimates you get from doing a regression analysis, you can avoid the problem you are encountering by placing constraints on a and b as well as providing better starting values. That is, In[8]:= FindFit[praxis, {CDF[WeibullDistribution[a, b], x], a > 0 && b > 0}, {{a, 1}, {b, 100}}, {x}] Out[8]= {a->1.52051,b->77.4766} Note, it is important to use reasonably good estimates with the constraints since using constraints only results in In[9]:= FindFit[praxis, {CDF[WeibullDistribution[a, b], x], a > 0 && b > 0}, {a, b}, {x}] Out[9]= {a->0.999501,b->1.00018} which are clearly not very good estimates. Alternatively, since In[10]:= Assuming[a > 0 && b > 0 && x > 0, Log[-Log[1 - CDF[WeibullDistribution[a, b], x]]] // FullSimplify] Out[10]= a*Log[x/b] The problem can be linearized by taking logarithms and avoiding the problems inherent in non-linear fitting. Doing this will avoid the problem of finding good initial starting points for the non-linear regression problem. >So, I used the parameters in the Weibull CDF and the times from pr >to get: praxis1= {{16, 0.106157}, {34, 0.27559}, {53, 0.451308}, >{75, 0.623149}, {93, 0.73256}, {120, 0.848082}} (something went >wrong with editing, so maybe I sent a by accident an unfinished >message) I checked with FindFit and it worked. But then I gave extra >some variations: praxis2 = {#, RandomReal[{0.95, 1.05}] >CDF[WeibullDistribution[1.4, 76.318], #]} & /@ pr; fit4 = {a,b} /. >FindFit[praxis2, CDF[WeibullDistribution[a,b], x], {a,b}, {x}] After >a few times trying the two above lines FindFit did not work. It >could be ok. But it seems to me that FindFit with a such a small >relative variation in the data from 0.95 to 1.05 should still work. It looks like your intent was to generate some random numbers from a Weibull Distribution to use as a test for FindFit here. But you have set up a problem that will not be correctly solved using a standard regression analysis. In the standard problem statistical error is an additive term. Here, you have made it a multiplicative term. Doing this doesn't make it significantly more difficult for FindFit to get a solution. But it will make it much more likely the solution is different from what you expect and harder for you to interpret. Simply doing RandomReal[WeibullDistribution[1.5,77.5],{6}] will give you an adequate sample for testing.