MathGroup Archive: October 2005 [00912]

Re: A Problem with the NonlinearFit?

To: mathgroup at smc.vnet.net
Subject: [mg61780] Re: A Problem with the NonlinearFit?
From: Bill Rowe <readnewsciv at earthlink.net>
Date: Sat, 29 Oct 2005 01:32:50 -0400 (EDT)
Sender: owner-wri-mathgroup at wolfram.com

On 10/28/05 at 3:25 AM, ligonap at web.de (Axel Ligon) wrote:

>concerning the NonlinearFit procedure I have a problem. I used the
>following inputs:

>A) *Raw datas {x, y}*
>dat1={{0,319.9},{1,196.7},{3,140.2},{7,99.0},{14,56.1},{31,32.8},
>{90,1.9}}
>FindFit[dat1,a*Exp[-b*x],{a,b},x]

>a ->280.376, b->0.175479

><<Statistics`NonlinearFit`
>f=NonlinearFit[dat1,a*Exp[-b*x],x,{a,b}]

>280.376 * Exp(-0.175479x)

>B) *Convert datas {x, ln[y]}*
>dat2={{0,5.7680},{1,5.2817},{3,4.9430},{7,4.5951},{14,4.0271},
>{31,3.4904},{90,0.6419}}
>FindFit[dat2,-b*x+a,{a,b},x]

>a->5.18756, b->0.0518199

><<Statistics`NonlinearFit`
>f=NonlinearFit[dat2,-b*x+a,x,{a,b}]

>5.18756 - 0.0518199 x

>The difference between A) and B) is only the conversion of y to
>ln[y] and the equation y = a * exp(-b*x) to ln[y] = ln[a] - b*x.
>From this it follows that a = exp[ln[a]]. But I get in A) 280.376 /
>-0.175479 and in B) 179.031 / -0.05182 !! Why do I get these
>error?? If I use Excel, I don't get these errors. Is it an error in
>the programming of Mathematica, or not???

The discrepancy between the values returned is neither an error nor a bug in Mathematica. If you plot the function you are fitting with the data being fitted it will be more apparent why the discrepancy exists.

First, looking at the original problem

In[12]:=
dat1 = {{0, 319.9}, {1, 196.7}, {3, 140.2}, 
    {7, 99.}, {14, 56.1}, {31, 32.8}, {90, 1.9}}; 
linFit = FindFit[dat1, a*Exp[(-b)*x], {a, b}, x];

and plotting the fitted model with the data

In[14]:=
Show[Block[
    {f = a*Exp[(-b)*x] /. linFit, 
     $DisplayFunction = Identity}, 
    {ListPlot[dat1], 
     Plot[f, {x, 0, 90}]}]];

Looking at the plot, you can see the curve fits the first couple of points fairly well but shows an increasingly poor fit as the independent variable increases.

Now do the same thing with the transformed data

In[30]:=
dat2 = ({First[#1], Log[Last[#1]]} & ) /@ dat1; 
lnFit = FindFit[dat2, (-b)*x + a, {a, b}, x]
Out[31]=
{a -> 5.187586659420735, b -> 0.0518206192107427}

In[32]:=
Show[Block[{f = a - b*x /. lnFit, 
     $DisplayFunction = Identity}, 
    {ListPlot[dat2], 
     Plot[f, {x, 0, 90}]}]];

You will see that the worst fit to the data is at the central points.

What is happening is as follows. The large dynamic range of the original data results in the data points with the larger response values dominating the fit. This is because, in absolute terms, a 1% difference for the largest response values is greater than even a 50% difference for the smallest response values.

When you transform the response by taking the logarithm, you change things so that a 1% error has the same effect regardless of the value of the response variable. Now, the small value responses have a greater impact on the fitted parameters than before.
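The effect of the logarithm can be made explicit with a first-order expansion. This is only a sketch: the symbols S, S_log and f are introduced here for illustration, and the approximation assumes the residuals are small relative to the fitted values, which is not really true for this data set.

```latex
% Least squares on the original data minimizes absolute errors:
S(a,b) = \sum_i \bigl( y_i - f(x_i) \bigr)^2 , \qquad f(x) = a\,e^{-b x}

% Least squares on the log-transformed data minimizes
S_{\log}(a,b) = \sum_i \bigl( \ln y_i - \ln f(x_i) \bigr)^2

% Since \ln y_i - \ln f(x_i) = \ln\!\bigl( 1 + \tfrac{y_i - f(x_i)}{f(x_i)} \bigr)
% and \ln(1+u) \approx u for small u:
S_{\log}(a,b) \approx \sum_i \left( \frac{y_i - f(x_i)}{f(x_i)} \right)^2
```

In other words, the fit to the transformed data approximately minimizes relative errors, so every point counts about equally regardless of its magnitude.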

Also, if you simply look at 

ListPlot[dat2];

you will see the data doesn't really lie on a straight line. That is, the data isn't well modeled by a Exp[-b x].

The difference in fit parameters will always exist when you compare fits done to the original data with fits done to the transformed data. But when the dynamic range of the dependent variable is smaller and/or the data actually fits a Exp[-b x], the difference in estimated parameters will be significantly smaller and likely unnoticed.
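One way to check that neither result is wrong is to evaluate both fits under both criteria. The following is a minimal sketch, not part of the original session: the names directFit, logFit, fDirect, fLog, ssq and ssqLog are made up for this illustration, and it assumes FindFit converges from its default starting values as it did in the question.

```mathematica
dat1 = {{0, 319.9}, {1, 196.7}, {3, 140.2},
    {7, 99.}, {14, 56.1}, {31, 32.8}, {90, 1.9}};
dat2 = ({First[#1], Log[Last[#1]]} & ) /@ dat1;

(* fit on the original scale and on the log scale *)
directFit = FindFit[dat1, a*Exp[(-b)*x], {a, b}, x];
logFit = FindFit[dat2, a - b*x, {a, b}, x];

(* both fitted models expressed on the original scale *)
fDirect[t_] = a*Exp[(-b)*t] /. directFit;
fLog[t_] = Exp[a - b*t] /. logFit;

(* sum of squared residuals on the original scale *)
ssq[f_] := Total[(dat1[[All, 2]] - f /@ dat1[[All, 1]])^2];

(* sum of squared residuals on the log scale *)
ssqLog[f_] := Total[(Log[dat1[[All, 2]]] - Log[f /@ dat1[[All, 1]]])^2];

{ssq[fDirect], ssq[fLog]}
{ssqLog[fDirect], ssqLog[fLog]}
```

Each fit should give the smaller value under the criterion it was computed with and the larger value under the other one, which is why the two parameter sets differ.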

Most likely, the reason you don't see the same issue with Excel is that Excel is not truly doing a non-linear fit to the data. Instead, it is probably transforming the data by taking the logarithm and transforming the fitted parameters back.
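If Excel does work this way (an assumption, as noted above), the back-transformation would look as follows. The symbol a' for the fitted intercept is introduced here for illustration; the numbers are the lnFit values from the output above, rounded.

```latex
\ln y = \ln a - b\,x = a' - b\,x
\quad\Longrightarrow\quad
a = e^{a'}

a' \approx 5.1876,\quad b \approx 0.05182
\quad\Longrightarrow\quad
a = e^{5.1876} \approx 179.0
```

These are the "B)" values quoted in the question, so a tool that fits the straight line and transforms back would report them rather than the result of the direct nonlinear fit.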

In summary, the issue is not a bug; it is an artifact of the transformation you are using combined with the lack of fit of the data to the model you are trying to fit.

At a bare minimum, you should always plot the data along with the fitted curve. Plots of data are generally very informative.

Finally, if you really need accurate answers for any but the most trivial problems, don't use Excel. 
--
To reply via email subtract one hundred and four
