Re: Re: standard errors and confidence intervals in NonlinearRegress
- To: mathgroup at smc.vnet.net
- Subject: [mg67368] Re: [mg67331] Re: standard errors and confidence intervals in NonlinearRegress
- From: Darren Glosemeyer <darreng at wolfram.com>
- Date: Tue, 20 Jun 2006 02:15:05 -0400 (EDT)
- References: <e70f49$rg4$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
At 05:13 AM 6/18/2006 -0400, Jason Quinn wrote:

>Darren Glosemeyer wrote:
> > The standard errors and confidence intervals in nonlinear regression are
> > based on asymptotic normality.
>
> While the topic is up, I have a similar question about the
>standard error (SE), covariance matrix, and correlation matrix reported
>by Regress. Do you know how they are calculated? I've been using
>Bevington and Robinson's "Data Reduction and Error Analysis for the
>Physical Sciences" to calculate them, and I cannot reproduce any of the
>values given by Mathematica (with or without weighting). For instance,
>Bevington says that the errors on the fitting parameters for linear
>regression are the square roots of the entries in the inverse of the
>following matrix (the curvature matrix):
>
>A_jk = Sum[w_i * f_j(x_i) * f_k(x_i)].
>
>Here w_i is the weight of the i-th datum, f_k is the k-th basis
>function, and the independent variables for the i-th measurement are
>collectively called x_i. The "errors" generated using this formula do
>not agree with the SE values reported by Regress (ignoring me doing
>something totally stupid, of course). Similarly with what he calls the
>covariance matrix. I've tried working in factors of Sqrt(N), etc.,
>thinking it is some parent vs. sample problem, but to no avail. I just
>don't know what is being reported by Regress, and the documentation
>doesn't specify in detail.
>
>Thanks for any insight anybody can give,
>Jason Quinn

The difference you observe is due to the distinction between weights based on assigned measurement errors and the weights used in weighted regression.

In unweighted regression, responses (measurements without assigned errors) are assumed to follow a specified model and to have normally distributed errors with some common variance sigma^2 that is estimated in the fitting. In weighted regression, the responses are also assumed to have normally distributed errors, but the individual responses are not assumed to have the same variance. The assumed variance for response i is sigma^2/w_i, where again sigma^2 is an unknown quantity to be estimated from the fitted model and w_i is the weight of the i-th datum. This weighting allows responses with different amounts of variability to be incorporated, lessening the effect of responses with assumed high variability and increasing the effect of responses with assumed low variability on the parameter estimates.

Often in the physical sciences, measurement errors serve two purposes: they act both as weights for the model fitting and as assumed known standard errors. The parameter estimates will be the same in the weighted regression and in the model fitting with assumed known standard errors, because the weights are the same. The estimation of the standard errors of the parameter values, however, is different. In the weighted regression model there is still a constant variance sigma^2 that needs to be estimated. Under the additional assumption that the weights come from measured errors, and hence already contain the information about the variation in the data, this additional variance term is not needed.

Assuming this is the difference you are noting, dividing the CovarianceMatrix by the EstimatedVariance and dividing the SEs by the square root of the EstimatedVariance should give the results you are looking for.
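In practice this is just a rescaling of the Regress output. As a rough sketch (here data, errs, and the basis {1, x} are placeholders; the weights are taken to be 1/errs^2 for assigned measurement errors errs):

    << Statistics`

    (* placeholders: data is the list of {x, y} pairs and errs is the list
       of assigned measurement errors; each response is weighted by
       1/errs^2 *)
    {estvar, covmat} =
      {EstimatedVariance, CovarianceMatrix} /.
        Regress[data, {1, x}, x, Weights -> 1/errs^2,
          RegressionReport -> {EstimatedVariance, CovarianceMatrix}] /.
        MatrixForm -> Identity;

    (* divide out the estimated variance to get the covariance matrix and
       parameter errors under the assumption of known measurement errors *)
    covKnownErrs = covmat/estvar;
    seKnownErrs = Sqrt[Table[covKnownErrs[[i, i]], {i, Length[covKnownErrs]}]]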
As a short example, consider the following using data and a model from the documentation.

In[1]:= <<Statistics`

In[2]:= data = {{0.055, 90}, {0.091, 97}, {0.138, 107}, {0.167, 124},
          {0.182, 142}, {0.211, 150}, {0.232, 172}, {0.248, 189},
          {0.284, 209}, {0.351, 253}};

Here are the estimated variance and the covariance matrix returned by Regress.

In[3]:= {estvar, covmat} = ({EstimatedVariance, CovarianceMatrix} /.
          Regress[data, {1, x^2}, x,
            RegressionReport -> {EstimatedVariance, CovarianceMatrix}] /.
          MatrixForm -> Identity)

Out[3]= {64.9129, {{17.7381, -247.146}, {-247.146, 5430.96}}}

In this case the weights are all 1, so the inverse of the A_jk matrix is obtained as follows.

In[4]:= xmat = DesignMatrix[data, {1, x^2}, x]

Out[4]= {{1, 0.003025}, {1, 0.008281}, {1, 0.019044}, {1, 0.027889},
         {1, 0.033124}, {1, 0.044521}, {1, 0.053824}, {1, 0.061504},
         {1, 0.080656}, {1, 0.123201}}

In[5]:= Inverse[Transpose[xmat].xmat]

Out[5]= {{0.273261, -3.80735}, {-3.80735, 83.6654}}

If we multiply this by the estimated variance, the result is the covariance matrix given above.

In[6]:= estvar*%

Out[6]= {{17.7381, -247.146}, {-247.146, 5430.96}}

As an example with weights other than 1, consider Weights -> Range[Length[data]].

In[7]:= {estvar2, covmat2} = ({EstimatedVariance, CovarianceMatrix} /.
          Regress[data, {1, x^2}, x,
            RegressionReport -> {EstimatedVariance, CovarianceMatrix},
            Weights -> Range[Length[data]]] /.
          MatrixForm -> Identity)

Out[7]= {410.877, {{31.7592, -387.466}, {-387.466, 6181.06}}}

Here the weights are incorporated in the A_jk matrix.

In[8]:= Inverse[Transpose[xmat].(Range[Length[data]]*xmat)]

Out[8]= {{0.0772961, -0.943021}, {-0.943021, 15.0436}}

Multiplying by the estimated variance again gives the covariance matrix.

In[9]:= estvar2*%

Out[9]= {{31.7592, -387.466}, {-387.466, 6181.06}}
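As a check on the division described above (a quick sketch reusing estvar2 and covmat2 from this session), dividing the weighted covariance matrix by the estimated variance recovers the inverse curvature matrix from Out[8], and the square roots of its diagonal entries are the parameter errors Bevington's formulas would give:

    covKnown2 = covmat2/estvar2
    (* {{0.0772961, -0.943021}, {-0.943021, 15.0436}}, matching Out[8] *)

    Sqrt[Table[covKnown2[[i, i]], {i, Length[covKnown2]}]]
    (* the measurement-error-based parameter errors, roughly {0.278, 3.88} *)

Darren Glosemeyer
Wolfram Research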