MathGroup Archive: October 2008 [00661]


Re: Nonlinear Regression Oddities and Questions

To: mathgroup at smc.vnet.net
Subject: [mg93181] Re: Nonlinear Regression Oddities and Questions
From: Bill Rowe <readnews at sbcglobal.net>
Date: Thu, 30 Oct 2008 02:02:12 -0500 (EST)

On 10/28/08 at 4:54 AM, dh at metrohm.ch (dh) wrote:

>I can only guess what is going on.

>First, did you check that both confidence levels are identical?

>Further, it could be that Mathematica calculates the confidence
>intervals using an error propagation technique.

No, this is not how confidence limits are computed. Confidence
limits have essentially nothing to do with error propagation.
Confidence limits are established based on either assumptions or
knowledge of the underlying statistical distribution which
produces variation in data. The standard assumption is that the
error term follows a normal distribution with mean 0 and unknown
standard deviation. An estimate
of the standard deviation is made from the fit residuals.
Confidence limits are computed based on this estimate, the
assumption of a normal distribution, the number of data points
and the number of model parameters.
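To make the ingredients concrete, here is a minimal sketch in Python
(not Mathematica, and not what Mathematica's regression routines do
internally). It assumes a straight-line model rather than a nonlinear
one so the formulas stay short, and the data values are made up purely
for illustration. The function names t_quantile and line_fit_with_ci
are my own. The residuals give the estimate of the standard deviation,
and the Student t quantile brings in the normality assumption, the
number of data points and the number of parameters.

```python
import math

def t_pdf(t, dof):
    # Student t density, written with lgamma to avoid overflow
    c = math.exp(math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2))
    c /= math.sqrt(dof * math.pi)
    return c * (1 + t * t / dof) ** (-(dof + 1) / 2)

def t_cdf(t, dof, steps=2000):
    # Simpson's rule on [0, |t|]; symmetry supplies the other half
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, dof):
    # Bisection on the CDF; this sketch only handles p >= 0.5
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def line_fit_with_ci(x, y, level=0.95):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    dof = n - 2  # data points minus model parameters
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / dof)  # error s.d. estimated from residuals
    se_slope = sigma / math.sqrt(sxx)
    se_intercept = sigma * math.sqrt(1 / n + xbar ** 2 / sxx)
    tcrit = t_quantile(0.5 + level / 2, dof)
    # each entry is (estimate, lower limit, upper limit)
    return {
        "intercept": (intercept, intercept - tcrit * se_intercept,
                      intercept + tcrit * se_intercept),
        "slope": (slope, slope - tcrit * se_slope, slope + tcrit * se_slope),
    }

# Made-up illustrative data: y = 1 + 2 x plus small fixed perturbations
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.5, -0.1, 0.2, -0.3, 0.0, -0.1]
y = [1 + 2 * xi + e for xi, e in zip(x, noise)]
result = line_fit_with_ci(x, y)
print(result)
```

For a nonlinear model the same recipe is normally applied to a
linearization of the model around the fitted parameters, so the
resulting limits are only approximate.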

The original poster stated he was comparing the output of
Mathematica's regression routines to another program using a
bootstrap analysis. A bootstrap analysis generates a
distribution for each of the model parameters by repeating the
regression computation for new sets of data. These new data sets
are typically generated by re-sampling the original
data set (non-parametric bootstrap). Confidence limits for each
of the parameters are computed as the observed quantiles of the
distributions for each of the model parameters.
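The procedure just described can be sketched as follows. This is
Python, not the other program the original poster used, whose details
I don't know. It assumes a straight-line model, case resampling (whole
(x, y) pairs drawn with replacement) and simple percentile limits. The
names fit_line, quantile and bootstrap_ci are hypothetical and the data
values are made up for illustration.

```python
import random

def fit_line(x, y):
    # ordinary least squares for y = intercept + slope * x
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def quantile(sorted_vals, q):
    # linear interpolation between neighbouring order statistics
    pos = q * (len(sorted_vals) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac

def bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(x)
    intercepts, slopes = [], []
    while len(slopes) < n_boot:
        # re-sample the original data set with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope is undefined
        a, b = fit_line(xs, ys)
        intercepts.append(a)
        slopes.append(b)
    intercepts.sort()
    slopes.sort()
    alpha = (1 - level) / 2
    # confidence limits are the observed quantiles of each distribution
    return {
        "intercept": (quantile(intercepts, alpha),
                      quantile(intercepts, 1 - alpha)),
        "slope": (quantile(slopes, alpha), quantile(slopes, 1 - alpha)),
    }

# Made-up illustrative data: y = 1 + 2 x plus small fixed perturbations
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.5, -0.1, 0.2, -0.3, 0.0, -0.1]
y = [1 + 2 * xi + e for xi, e in zip(x, noise)]
limits = bootstrap_ci(x, y)
print(limits)
```

Note that nothing in this computation assumes a normal distribution
for the errors. The limits come entirely from the re-sampled fits,
which is also why they can be unstable when there are few data points.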

In general, the two methods should give reasonably similar
estimates for the nominal values for each of the model
parameters. But the confidence limits computed using both
methods will often not be as close, particularly for small data sets.

Neither method for estimating confidence limits is inherently
more "correct" than the other. I can generate data for a given model using
different distributions. Which method gives results more
consistent with the parameters I use to generate data will
depend on several factors including the amount of data
generated, details of the model and the choice of distribution
used to generate the data.
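
A rough sketch of that kind of experiment, in Python: generate data
from a known straight line, compute percentile bootstrap limits for
the slope, and count how often the limits contain the slope used to
generate the data. Everything here is an assumption chosen for
illustration (the model, the sample size, the two error distributions,
and the names fit_slope, percentile_interval and coverage). Only the
bootstrap is shown, to keep it short. The same loop could be repeated
with normal-theory limits for comparison.

```python
import random

def fit_slope(x, y):
    # least-squares slope; None if all x values in the sample coincide
    if len(set(x)) < 2:
        return None
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

def percentile_interval(x, y, rng, n_boot=300, level=0.95):
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        s = fit_slope([x[i] for i in idx], [y[i] for i in idx])
        if s is not None:
            slopes.append(s)
    slopes.sort()
    k = int(round((1 - level) / 2 * (n_boot - 1)))
    return slopes[k], slopes[n_boot - 1 - k]

def coverage(noise, true_slope=2.0, true_intercept=1.0,
             n=15, trials=200, seed=1):
    # fraction of simulated data sets whose limits contain true_slope
    rng = random.Random(seed)
    x = [float(i) for i in range(n)]
    hits = 0
    for _ in range(trials):
        y = [true_intercept + true_slope * xi + noise(rng) for xi in x]
        lo, hi = percentile_interval(x, y, rng)
        hits += lo <= true_slope <= hi
    return hits / trials

def normal_noise(rng):
    return rng.gauss(0.0, 1.0)

def skewed_noise(rng):
    # shifted exponential: mean 0, variance 1, right-skewed
    return rng.expovariate(1.0) - 1.0

if __name__ == "__main__":
    print("normal errors:", coverage(normal_noise))
    print("skewed errors:", coverage(skewed_noise))
```

Changing n, the model, or the noise function and re-running is the
quickest way to see how much the answer depends on those choices.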
