Re: Estimating slope from noisy data
- To: mathgroup at smc.vnet.net
- Subject: [mg89619] Re: Estimating slope from noisy data
- From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
- Date: Sun, 15 Jun 2008 06:13:07 -0400 (EDT)
- Organization: The Open University, Milton Keynes, UK
- References: <g3034r$mbl$1@smc.vnet.net>
andreas.kohlmajer at gmx.de wrote:

> I have difficulty estimating the correct slope from noisy data.
> This is the code to generate the noisy data:
>
> Needs["LinearRegression`"];
> slope = 1.0;
> sigma = 0.5;
> xrange = 1.0;
>
> SeedRandom[123]; (* initialize random generator *)
> rnd = {#, #*slope + RandomReal[NormalDistribution[0, sigma]]} &;
>
> (* generate 2000 data points *)
> data = Table[
>    rnd[RandomReal[NormalDistribution[0, xrange/3.0]]], {2000}];
>
> subset = Take[data, 8];
> ListPlot[subset, PlotRange -> {{-3, 3}, {-3, 3}},
>   PlotStyle -> PointSize[.025]]
> fit = Regress[subset, x, x, IncludeConstant -> False,
>    RegressionReport -> {SummaryReport, ParameterCITable}]
>
> The correct slope is exactly 1. As the data is quite noisy, the CI of
> the slope is very big. The estimated slope is far too big (1.947). If I
> use more data points, the estimation gets better; I could also use a
> wider x-range to get a better estimate for the slope. However, I'm
> quite limited in the x-range, so using a wider x-range is no option
> for me.
>
> I could check the RSquared for significance (If[Abs[r*Sqrt[n - 2]/
> Sqrt[1 - r^2]] >=
> Quantile[StudentTDistribution[n - 2], 1 - 0.05], r, 0] (*
> significance of 95% *)). In this case, it is significant.
>
> Is there any other way to get a good estimate for the slope, without
> using too many data points?
>
> (Keywords: fit, regression, slope, noisy, rsquared, limited data)

You could try using *perpendicular offsets* rather than vertical
offsets. Among many others, see [1, 2, 3] for instance. (Note that [2]
also includes some Mathematica code.)

Also, you could weight your data points, either by giving a list of
explicit numbers or by a weighting function. (See the option *Weights*
for *Regress* as illustrated below.)

[1] Weisstein, Eric W. "Least Squares Fitting--Perpendicular Offsets."
From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/LeastSquaresFittingPerpendicularOffsets.html

[2] Sardelis, D. and Valahas, T. "Least Squares Fitting-Perpendicular
Offsets." http://library.wolfram.com/infocenter/MathSource/5292/

[3] József Varga and Zsolt Szabo, "Modified Regression Model for the
Logan Plot," _Journal of Cerebral Blood Flow & Metabolism_ (2002) 22,
240–244. Available at
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid12949

Needs["LinearRegression`"];
slope = 1.0;
sigma = 0.5;
xrange = 1.0;

SeedRandom[123]; (* initialize random generator *)
rnd = {#, #*slope + RandomReal[NormalDistribution[0, sigma]]} &;

(* generate 2000 data points *)
data = Table[
   rnd[RandomReal[NormalDistribution[0, xrange/3.0]]], {2000}];

subset = Take[data, 8];

fit = Regress[subset, x, x, Weights -> (1/#^2 &),
   IncludeConstant -> False,
   RegressionReport -> {BestFit, SummaryReport, ParameterCITable}]

Show[ListPlot[subset, PlotRange -> {{-3, 3}, {-3, 3}},
   PlotStyle -> PointSize[.025], AspectRatio -> 1,
   Epilog -> {Red, Thick, Line[{{-3, -3}, {3, 3}}]}],
  Plot[BestFit /. fit, {x, -3, 3}, PlotStyle -> Thick]]

{BestFit -> 1.55639 x,

 ParameterCITable ->     Estimate   SE         CI
                     x   1.55639    0.360525   {0.703884, 2.4089},

 ParameterTable ->       Estimate   SE         TStat     PValue
                     x   1.55639    0.360525   4.31701   0.00349295,

 RSquared -> 0.726953, AdjustedRSquared -> 0.687946,
 EstimatedVariance -> 0.312054,

 ANOVATable ->           DF   SumOfSq   MeanSq     FRatio    PValue
               Model     1    5.81562   5.81562    18.6366   0.00349295
               Error     7    2.18438   0.312054
               U Total   8    8.                                        }

HTH,
-- Jean-Marc
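For the perpendicular-offset suggestion, here is a minimal sketch for a line through the origin, y = b x. It assumes `subset` as defined in the code above. The names xs, ys, sxx, syy, sxy, bPerp and bNum are introduced only for this illustration, and this is not the code from [2]. Perpendicular offsets treat the errors in x and y as equally large, so whether this gives a better slope than vertical offsets depends on how the noise actually enters the data.

```mathematica
(* Sketch only: perpendicular-offset (total least squares) slope for a
   line through the origin, y = b x. Assumes `subset` is already defined. *)
{xs, ys} = Transpose[subset];
sxx = xs.xs; syy = ys.ys; sxy = xs.ys;

(* The objective Total[(ys - b xs)^2]/(1 + b^2) is stationary where
   sxy b^2 + (sxx - syy) b - sxy == 0. The root below has the same sign
   as sxy and is the minimizer. Requires sxy != 0. *)
bPerp = (syy - sxx + Sqrt[(syy - sxx)^2 + 4 sxy^2])/(2 sxy)

(* Cross-check by direct numerical minimization of the same objective *)
FindMinimum[Total[(ys - bNum xs)^2]/(1 + bNum^2), {bNum, 1}]
```

The closed form and the FindMinimum result should agree to numerical precision. If they do not, the starting value for bNum is the first thing to check.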