MathGroup Archive: February 2009 [00346]


Re: linear regression with errors in both variables

To: mathgroup at smc.vnet.net
Subject: [mg96299] Re: [mg96281] linear regression with errors in both variables
From: DrMajorBob <btreat1 at austin.rr.com>
Date: Wed, 11 Feb 2009 05:19:39 -0500 (EST)
References: <200902101057.FAA10340@smc.vnet.net>
Reply-to: drmajorbob at longhorns.com

Here are three solvers, applied to the following data.

(* 600 samples: noisy x, noisy y, and z = 0.2 x + 0.3 y + noise *)
data = Flatten[
    Table[{x + 0.5 RandomReal[], y + 0.7 RandomReal[],
      .2 x + .3 y + .4 RandomReal[]}, {x, 1, 20}, {y, 1, 30}], 1];

xy = data[[All, {1, 2}]];
z = data[[All, -1]];
LeastSquares[xy, z]

{0.200423, 0.302076}

FindFit[data, a x + b y, {a, b}, {x, y}]

{a -> 0.200423, b -> 0.302076}

lmf = LinearModelFit[data, {x, y}, {x, y}];
Normal[lmf]

0.038069 + 0.198848 x + 0.30105 y

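All three perform the same ordinary least-squares fit; the results differ
only according to whether you ask for a constant term or not
(LinearModelFit includes one by default; the LeastSquares and FindFit
calls above do not).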
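
A quick way to check this, as a sketch reusing the data defined above
(the parameter name c is introduced here only for illustration;
IncludeConstantBasis is a documented LinearModelFit option): dropping the
constant from LinearModelFit, or adding one to FindFit, should make the
fits agree to within numerical precision.

(* no constant term: should match LeastSquares and the first FindFit *)
Normal[LinearModelFit[data, {x, y}, {x, y},
   IncludeConstantBasis -> False]]

(* constant term c added: should match the LinearModelFit result *)
FindFit[data, c + a x + b y, {a, b, c}, {x, y}]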

What if we explicitly KNOW, however, that x, y, and z have specific error
standard deviations, say 0.2, 0.3, and 0.4 (values chosen for
illustration; the uniform noise in the data above has somewhat smaller
spreads)? Surely unequal variances make a difference?

In that case we could scale each variable to make all error variances the  
same, then solve the transformed problem:

normData = #/{.2, .3, .4} & /@ data;
normFit = FindFit[normData, a x + b y, {a, b}, {x, y}]

{a -> 0.100211, b -> 0.226557}

And then we'd correct a and b to UNDO the transformation we made:

.4/{.2, .3} {a, b} /. normFit

{0.200423, 0.302076}

But as you see, this changes nothing.
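
That is expected: ordinary least squares only minimizes the residuals in
z, so rescaling the columns merely rescales the coefficients. Here is a
minimal sketch that checks this, assuming xy and z as defined above (sx,
sy, sz and scaled are hypothetical names, and any positive scale factors
would do):

{sx, sy, sz} = {0.2, 0.3, 0.4};

(* fit after dividing each column by its scale factor *)
scaled = LeastSquares[xy.DiagonalMatrix[1/{sx, sy}], z/sz];

(* undo the scaling and compare with the unscaled fit; *)
(* the difference should vanish up to rounding *)
Chop[sz/{sx, sy} scaled - LeastSquares[xy, z], 10^-8]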

Bobby

On Tue, 10 Feb 2009 04:57:01 -0600, Joerg <schaber at biologie.hu-berlin.de>  
wrote:

> Hi,
>
> I want to test the hypothesis that my data
> follows a known simple linear relationship,
> y = a + bx. However, I have (known) measurements
> errors in both the y and the x values.
>
> I suppose just a simple linear regression
> does not do here.
>
> Any suggestions how to test this correctly?
>
> Thanks,
>
> joerg
>

-- 
DrMajorBob at longhorns.com
