[Date Index]
[Thread Index]
[Author Index]
Re: linear regression with errors in both variables
*To*: mathgroup at smc.vnet.net
*Subject*: [mg96436] Re: linear regression with errors in both variables
*From*: Ray Koopman <koopman at sfu.ca>
*Date*: Fri, 13 Feb 2009 03:46:19 -0500 (EST)
*References*: <gmrmga$a1k$1@smc.vnet.net> <gn11v2$8fl$1@smc.vnet.net>
On Feb 12, 3:42 am, Ray Koopman <koop... at sfu.ca> wrote:
> On Feb 10, 2:56 am, Joerg <scha... at biologie.hu-berlin.de> wrote:
>
>> Hi,
>>
>> I want to test the hypothesis that my data
>> follows a known simple linear relationship,
>> y = a + bx. However, I have (known) measurements
>> errors in both the y and the x values.
>>
>> I suppose just a simple linear regression
>> does not do here.
>>
>> Any suggestions how do test this correctly?
>>
>> Thanks,
>>
>> joerg
>
> You don't say whether the errors are the same for all data points.
> If they are, and sx & sy are the (known) standard deviations of the
> measurement errors in x & y, then
>
> ClearAll[a,b]; Minimize[Evaluate[Expand[
> #.#&[a + b*x - y]]/(b^2 sx^2 + sy^2)], {a,b}]
>
> will give what you want. You can save a little time if you eliminate
> 'a' by centering the data. And if Mathematica honors the parentheses,
> centering will also reduce potential roundoff error problems.
>
> xbar = Mean@x; ybar = Mean@y; ClearAll[a,b];
> {#[[1]], {a -> ybar - #[[2,1,2]]*xbar, #[[2,1]]}}& @
> Minimize[Evaluate[Expand[
> #.#&[b(x-xbar)-(y-ybar)]]/(b^2 sx^2 + sy^2)], b]
>
> If the errors differ from point to point then sx & sy will be lists.
>
> ClearAll[a,b];
> Minimize[Tr[ (a + b*x - y)^2 / (b^2 sx^2 + sy^2) ], {a,b}]
>
> will work, but it's slow. Pre-evaluating doesn't help.
> The following is faster:
>
> worals[x_,y_,sx_,sy_] := Block[
> {a,b,f,z, u = 1/sx, v = 1/sy, w = (sy/sx)^2},
> {a,b} = (y*v).PseudoInverse@{v,x*v}; f = #.#&[(a+b*x-y)v];
> While[f > (z = (x*w + (y-a)b)/(b^2 + w);
> {a,b} = (y*v).PseudoInverse@{v,z*v};
> f = #.#&@Join[(z-x)u,(a+b*z-y)v])]; {f,{a,b}}]
>
> ('worals' is an acronym for Weighted Orthogonal Regression by
> Alternating Least Squares.)
If the true relation is linear, and if the errors are independent
normal with zero means and standard deviations as given in sx & sy,
then the minimized function value returned by each of the four
code fragments that I gave will have a Chi-Square distribution
with n-2 degrees of freedom. The corresponding p-value is
GammaRegularized[(n-2)/2,f/2].
Prev by Date:
**Re: LabeledListPlot**
Next by Date:
**Re: testing if a point is inside a polygon**
Previous by thread:
**Re: linear regression with errors in both variables**
Next by thread:
**Re: linear regression with errors in both variables**
| |