Re: linear regression with errors in both variables
- To: mathgroup at smc.vnet.net
- Subject: [mg96421] Re: linear regression with errors in both variables
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Fri, 13 Feb 2009 03:43:34 -0500 (EST)
- References: <gmrmga$a1k$1@smc.vnet.net>
On Feb 10, 4:56 am, Joerg <scha... at biologie.hu-berlin.de> wrote:
> Hi,
>
> I want to test the hypothesis that my data
> follows a known simple linear relationship,
> y = a + bx. However, I have (known) measurement
> errors in both the y and the x values.
>
> I suppose just a simple linear regression
> will not do here.
>
> Any suggestions on how to test this correctly?
>
> Thanks,
>
> joerg

Assuming the errors in both coordinates are comparable, you can do a total least squares fit using a singular value decomposition (weighted variants also exist; I am not sure which would suit your needs). I'll walk through a step-by-step example of the basic case. It is set up so that the slope is around -2.7 and the y intercept around 3.3; we will recover approximations to these values.

m = 20;
xvals = Range[m] + RandomReal[{-1, 1}, {m}];
yvals = 3.3 - 2.7*xvals + RandomReal[{-1, 1}, {m}];

We compute the means, which we subtract so that the data is centered at the origin.

means = Mean /@ {xvals, yvals};
shiftedvals = Transpose[{xvals, yvals} - means];

Now compute the SVD of this shifted m x 2 data matrix.

{u, w, v} = SingularValueDecomposition[shiftedvals];

The vector perpendicular to the best fitting line is the right singular vector belonging to the smallest singular value, that is, the last column of the right-side orthogonal matrix. We extract it as the product of that matrix with a unit vector in the last (here, the y) direction.

e[j_, n_] := Module[{vec = Table[0, {n}]}, vec[[j]] = 1; vec]
perp = v.e[2, 2];

Now form the linear polynomial that defines our approximating line.

In[104]:= linearpoly = ({x, y} - means).perp;
Expand[linearpoly/Coefficient[linearpoly, y]]

Out[105]= -3.18513 + 2.69773 x + 1. y

Setting this polynomial to zero and solving for y gives y = 3.18513 - 2.69773 x, close to the slope and intercept used to generate the data.

Daniel Lichtblau
Wolfram Research
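The steps above can also be wrapped into a single reusable function. Here is a minimal sketch (the name totalLeastSquaresLine is my own, not a built-in); it assumes the fitted line is not vertical and returns the pair {intercept, slope}.

totalLeastSquaresLine[xv_List, yv_List] :=
 Module[{mx, my, shifted, v, p},
  {mx, my} = Mean /@ {xv, yv};
  (* center the data at the origin *)
  shifted = Transpose[{xv - mx, yv - my}];
  (* right singular vectors of the m x 2 centered matrix *)
  v = SingularValueDecomposition[shifted][[3]];
  (* normal to the fitted line: the column for the smallest singular value *)
  p = v[[All, -1]];
  (* solve p.({x, y} - {mx, my}) == 0 for y; requires p[[2]] != 0 *)
  {my + p[[1]]/p[[2]]*mx, -p[[1]]/p[[2]]}]

Applied to the same xvals and yvals as above, totalLeastSquaresLine[xvals, yvals] reproduces the coefficients of the expanded polynomial, here {3.18513, -2.69773} as {intercept, slope}.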