Re: Re: standard errors and confidence intervals in NonlinearRegress
- To: mathgroup at smc.vnet.net
- Subject: [mg67368] Re: [mg67331] Re: standard errors and confidence intervals in NonlinearRegress
- From: Darren Glosemeyer <darreng at wolfram.com>
- Date: Tue, 20 Jun 2006 02:15:05 -0400 (EDT)
- References: <e70f49$rg4$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
At 05:13 AM 6/18/2006 -0400, Jason Quinn wrote:
>Darren Glosemeyer wrote:
> > The standard errors and confidence intervals in nonlinear regression are
> > based on asymptotic normality.
>
> While the topic is up, I have a similar question about the
>standard error (SE), covariance matrix, and correlation matrix reported
>by Regress. Do you know how they are calculated? I've been using
>Bevington and Robinson's "Data Reduction and Error Analysis for the
>Physical Sciences" to calculate them and I cannot reproduce any of the
>values given by Mathematica (with or without weighting). For instance,
>Bevington says that the errors on the fitting parameters for linear
>regression are the square roots of the entries in the inverse of the
>following matrix (the curvature matrix):
>
>A_jk = Sum_i[ w_i * f_j(x_i) * f_k(x_i) ].
>
>Here w_i is the weight of the i-th datum, f_k is the k-th basis
>function, and the i-th measurement of the independent variables are
>collectively called x_i. The "errors" generated using this formula do
>not agree with the SE values reported by Regress (ignoring me doing
>something totally stupid, of course). Similarly with what he calls the
>covariance matrix. I've tried working in factors of Sqrt(N), etc.,
>thinking it was some parent-versus-sample problem, but to no avail. I just don't
>know what is being reported by Regress and the documentation doesn't
>specify in detail.
>
>Thanks for any insight anybody can give,
>Jason Quinn
The difference you observe is the difference between weighted regression and
model fitting with weights derived from assumed known measurement errors.
In unweighted regression, responses (measurements without assigned errors)
are assumed to follow a specified model and have normally distributed
errors with some common variance sigma^2 that is estimated from the
fit. In weighted regression, the responses are assumed to have
normally distributed errors, but the individual responses are not assumed
to have the same variance. The assumed variance for response i is
sigma^2/w_i, where again sigma^2 is an unknown quantity to be estimated
from the fitted model and w_i is the weight of the ith datum. This
weighting makes it possible to incorporate responses with different amounts
of variability: responses with assumed high variability have less effect on
the parameter estimates, and responses with assumed low variability have
more.
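In matrix terms (a sketch of the standard formula, not necessarily the
internal code path Regress uses): with design matrix X, weights w_i on the
diagonal of W, and response vector y, the weighted least-squares estimates
solve (X^T W X) b = X^T W y. For example, with hypothetical toy values:

    xm = {{1, 0.}, {1, 1.}, {1, 2.}, {1, 3.}};  (* toy design matrix, basis {1, x} *)
    yv = {0.1, 1.1, 1.8, 3.2};                  (* toy responses *)
    w = {1., 2., 4., 1.};                       (* toy weights w_i *)
    (* w*xm scales row i of xm by w_i, so these products are X^T W X and X^T W y *)
    b = Inverse[Transpose[xm].(w*xm)].(Transpose[xm].(w*yv))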
Often in the physical sciences, measurement errors serve two purposes: as
weights for the model fitting and as assumed known standard errors. The
parameter estimates will be the same in the weighted
regression and in the model fitting with assumed known standard errors
because the weights are the same. The estimation of the standard errors
for parameter values, however, is different.
In the weighted regression model there is still a common variance sigma^2
that needs to be estimated. Under the additional assumption that the
weights come from measurement errors and hence already contain the
information about the variation in the data, sigma^2 is effectively 1 and
the additional variance estimate is not needed.
Assuming this is the difference you are noting, dividing the
CovarianceMatrix by the EstimatedVariance and dividing the SEs by the
square root of the EstimatedVariance should give the results you are
looking for.
As a short example, consider the following using data and a model from the
documentation.
In[1]:= <<Statistics`
In[2]:= data = {{0.055, 90}, {0.091, 97}, {0.138, 107}, {0.167, 124},
          {0.182, 142}, {0.211, 150}, {0.232, 172}, {0.248, 189},
          {0.284, 209}, {0.351, 253}};
Here are the estimated variance and the covariance matrix returned by Regress.
In[3]:= {estvar, covmat} = ({EstimatedVariance, CovarianceMatrix} /.
          Regress[data, {1, x^2}, x,
            RegressionReport -> {EstimatedVariance, CovarianceMatrix}] /.
          MatrixForm -> Identity)
Out[3]= {64.9129, {{17.7381, -247.146}, {-247.146, 5430.96}}}
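(As a quick check on where this number comes from, here is a sketch using
the hypothetical names xm, yv, b and rv: the EstimatedVariance is the
residual sum of squares from the fit divided by the residual degrees of
freedom, here 10 - 2 = 8.)

    xm = DesignMatrix[data, {1, x^2}, x];             (* design matrix, as in In[4] below *)
    yv = Transpose[data][[2]];                        (* the responses *)
    b = Inverse[Transpose[xm].xm].(Transpose[xm].yv); (* least-squares estimates *)
    rv = yv - xm.b;                                   (* residuals *)
    rv.rv/(Length[data] - 2)                          (* should reproduce 64.9129 *)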
In this case the weights are all 1, so the inverse of A_jk is obtained as
follows.
In[4]:= xmat = DesignMatrix[data, {1, x^2}, x]
Out[4]= {{1, 0.003025}, {1, 0.008281}, {1, 0.019044}, {1, 0.027889},
         {1, 0.033124}, {1, 0.044521}, {1, 0.053824}, {1, 0.061504},
         {1, 0.080656}, {1, 0.123201}}
In[5]:= Inverse[Transpose[xmat].xmat]
Out[5]= {{0.273261, -3.80735}, {-3.80735, 83.6654}}
If we multiply this by the estimated variance, the result is the covariance
matrix given above.
In[6]:= estvar*%
Out[6]= {{17.7381, -247.146}, {-247.146, 5430.96}}
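The standard errors Regress reports are the square roots of the diagonal of
this covariance matrix, so the prescription above amounts to the following
(a sketch; the second line should give the square roots of the diagonal of
Out[5]):

    se = Sqrt[{covmat[[1, 1]], covmat[[2, 2]]}]  (* SEs as reported by Regress *)
    se/Sqrt[estvar]                              (* measurement-error-based errors *)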
As an example with weights other than 1, consider Weights->Range[Length[data]].
In[7]:= {estvar2, covmat2} = ({EstimatedVariance, CovarianceMatrix} /.
          Regress[data, {1, x^2}, x,
            RegressionReport -> {EstimatedVariance, CovarianceMatrix},
            Weights -> Range[Length[data]]] /.
          MatrixForm -> Identity)
Out[7]= {410.877, {{31.7592, -387.466}, {-387.466, 6181.06}}}
Here the weights are incorporated in the A_jk matrix.
In[8]:= Inverse[Transpose[xmat].(Range[Length[data]]*xmat)]
Out[8]= {{0.0772961, -0.943021}, {-0.943021, 15.0436}}
Multiplying by the estimated variance again gives the covariance matrix.
In[9]:= estvar2*%
Out[9]= {{31.7592, -387.466}, {-387.466, 6181.06}}
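The same adjustment applies here (a sketch): dividing covmat2 by estvar2
should recover the measurement-error-based covariance matrix in Out[8], and
the square roots of its diagonal give the corresponding parameter errors.

    covmat2/estvar2                                    (* should match Out[8] *)
    Sqrt[{covmat2[[1, 1]], covmat2[[2, 2]]}/estvar2]   (* Bevington-style errors *)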
Darren Glosemeyer
Wolfram Research