Re: Nonlinear Regression Oddities and Questions
- To: mathgroup at smc.vnet.net
- Subject: [mg93181] Re: Nonlinear Regression Oddities and Questions
- From: Bill Rowe <readnews at sbcglobal.net>
- Date: Thu, 30 Oct 2008 02:02:12 -0500 (EST)
On 10/28/08 at 4:54 AM, dh at metrohm.ch (dh) wrote:

>I can only guess what is going on.
>First, did you check that both confidence levels are identical?
>Further,it could be that Mathematica calculates the confidence
>intervalls using error propagation technique.

No, this is not how confidence limits are computed. Confidence limits have essentially nothing to do with error propagation. Confidence limits are established based on either assumptions about, or knowledge of, the underlying statistical distribution that produces the variation in the data. The standard assumption is that the error term follows a normal distribution with mean 0 and unknown standard deviation. An estimate of the standard deviation is made from the fit residuals. Confidence limits are then computed from this estimate, the assumption of a normal distribution, the number of data points and the number of model parameters.

The original poster stated he was comparing the output of Mathematica's regression routines to another program using a bootstrap analysis. A bootstrap analysis generates a distribution for each of the model parameters by repeating the regression computation on new sets of data. These new data sets are typically generated by re-sampling the original data set (non-parametric bootstrap). Confidence limits for each parameter are then taken as the observed quantiles of that parameter's distribution.

In general, the two methods should give reasonably similar estimates for the nominal values of the model parameters. But the confidence limits computed by the two methods will often not be as close, particularly for small data sets.

Neither method for estimating confidence limits is inherently more "correct" than the other. I can generate data for a given model using different distributions. Which method then gives results more consistent with the parameters I used to generate the data will depend on several factors, including the amount of data generated, details of the model and the choice of distribution used to generate the data.
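To make the comparison concrete, here is a minimal sketch of both approaches. It is written in Python rather than Mathematica, uses only the standard library, and is not a description of what Mathematica or the other program does internally. Everything in it is an assumption made for illustration: the straight-line model, the synthetic data (20 points, normal errors), the helper names fit_line, residual_ci and bootstrap_ci, and the hard-coded Student t critical value 2.101 (95% two-sided, 18 degrees of freedom).

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def residual_ci(xs, ys, t_crit):
    """Slope interval from the residual-based estimate of the error variance.

    Uses the number of data points (n) and model parameters (p = 2) and
    assumes normally distributed errors; t_crit must match n - p.
    """
    n, p = len(xs), 2
    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - p)                      # estimated error variance
    mx = statistics.fmean(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    se_b = (s2 / sxx) ** 0.5                # standard error of the slope
    return b - t_crit * se_b, b + t_crit * se_b

def bootstrap_ci(xs, ys, level=0.95, n_boot=2000, seed=0):
    """Slope interval from a non-parametric (case-resampling) bootstrap."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        bx = [xs[i] for i in idx]
        if min(bx) == max(bx):
            continue                        # degenerate resample, slope undefined
        by = [ys[i] for i in idx]
        slopes.append(fit_line(bx, by)[1])
    slopes.sort()
    lo = slopes[round((1 - level) / 2 * n_boot)]      # lower observed quantile
    hi = slopes[round((1 + level) / 2 * n_boot) - 1]  # upper observed quantile
    return lo, hi

# Synthetic data: y = 1 + 2 x + normal noise (assumed purely for illustration).
data_rng = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 1.0) for x in xs]

T_CRIT_95_DF18 = 2.101  # Student t, 95% two-sided, 18 degrees of freedom
slope = fit_line(xs, ys)[1]
t_ci = residual_ci(xs, ys, T_CRIT_95_DF18)
b_ci = bootstrap_ci(xs, ys)

print("fitted slope:      ", slope)
print("residual-based CI: ", t_ci)
print("bootstrap CI:      ", b_ci)
```

Changing the noise distribution in the data-generation step, or reducing the number of points, is a simple way to see how far the two intervals can drift apart.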