Re: LinearRegression
- To: mathgroup at smc.vnet.net
- Subject: [mg60043] Re: LinearRegression
- From: Peter Pein <petsie at dordos.net>
- Date: Tue, 30 Aug 2005 04:43:01 -0400 (EDT)
- References: <dehjfi$c4p$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Tom De Vries wrote:
> Hello! I am trying to construct a list of "flawed" data points that would
> fit a linear model but have a specific Correlation Coefficient (r)
>
> My feeble attempts can produce a nice set of somewhat messed up data for a
> scatter plot...
>
> For example
>
> m = 0.5; b = 2.7;
>
> fudge := Random[Real, {-1, 1}];
>
> data = Table[{i, fudge + (m i + b)}, {i, 1, 15, 1}];
>
> ListPlot[data, PlotStyle -> {Hue[0.78]}];
>
>
> If I analyze the data....
>
> << Statistics`MultiDescriptiveStatistics`
>
> {xlist, ylist} = Transpose[data];
>
> Correlation[xlist, ylist]
>
> I can get (r)
>
> I've never studied Statistics, so I apologize if this is a really obvious
> question. Can I do the reverse? Is it possible to produce a set of data
> that would have a given r ?
>
> The application of this is to produce sets of data as examples and questions
> for simple linear regression in a high school math class.
>
> Thank you for any help you can provide on this.
>
> Sincerely,
>
> Tom De Vries
>

Hi Tom,

if you want an exact result for r, start with randomly shifted values:

In[1]:=
    (* Correlation comes from Statistics`MultiDescriptiveStatistics`, *)
    (* loaded as in the quoted message *)
    Off[General::spell1];
    SeedRandom[1];
    n = 15; m = 0.5; b = 2.7;
    fudge := 2*Random[] - 1;                 (* uniform noise in (-1, 1) *)
    xvec = Range[n] + Table[fudge/2, {n}];   (* jittered x values *)
    yvec = y /@ Range[n];                    (* symbolic unknowns y[1], ..., y[n] *)
    y0 = m*xvec + b + Table[fudge, {n}];     (* noisy starting values *)
    rsquared = Together[Correlation[xvec, yvec]^2];
    (* squaring speeds up the calculations below *)
    Correlation[xvec, y0]

Out[7]=
    0.9604020383011801

Well, that's not very close. We want (say) r = 0.98765 (note that the
target is squared!):

In[8]:=
    target = 0.98765^2;

The solution should not differ too much from the starting values, so I let
Mathematica minimize the squared distance (yvec - y0).(yvec - y0) subject to
the constraint rsquared == target:

In[9]:=
    ysol = yvec /. Last[NMinimize[{(#1 . #1 & )[yvec - y0],
        rsquared == target}, yvec]]

Out[9]=
    {3.4189904161075724, 4.308217962745214, 4.027884962099622,
     4.973223521194384, 5.977666767464987, 6.30055842370482,
     6.123658489301134, 7.248069404869771, 6.826104457810174,
     7.537551447099687, 8.30583322738566, 9.413350135809361,
     8.921393700729098, 9.415907295717503, 10.698605612874013}

Now let's have a look at the correlation:

In[10]:=
    Correlation[xvec, ysol]

Out[10]=
    0.98765

Yep!

In[11]:=
    (* red: adjusted points; pale: starting points; *)
    (* blue segments join each starting point to its adjusted point *)
    DisplayTogether[
      ListPlot[Transpose[{xvec, ysol}],
        PlotStyle -> {Red, AbsolutePointSize[5]}],
      ListPlot[Transpose[{xvec, y0}],
        PlotStyle -> {Hue[0.5, 0.5, 0.7], AbsolutePointSize[4]}],
      Plot[b + m*x, {x, Min[xvec], Max[xvec]}],
      Graphics[{Blue, Line /@ Transpose[Transpose /@
        {{xvec, y0}, {xvec, ysol}}]}]
    ];

--
Peter Pein, Berlin
GnuPG Key ID: 0xA34C5A82
http://people.freenet.de/Peter_Berlin/
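
A possible cross-check on the NMinimize step in In[9]: the constraint only fixes the angle between the centered x and y vectors, so the nearest point with correlation r can be written down directly by splitting the centered y0 into a part along the centered xvec and a part orthogonal to it. The following is only a sketch and is not part of the original post. The names mean, xc, u, yc, a, w, wn, s and yclosed are made up here. It assumes xvec and y0 from the session above, that y0 is not exactly linear in xvec (so wn > 0), and that s comes out positive.

```mathematica
(* Sketch of a closed-form alternative to NMinimize (assumption, not the post's method). *)
(* Requires xvec and y0 from the session above. *)
r = 0.98765;                           (* target correlation, not squared here *)
mean[v_] := (Plus @@ v)/Length[v];     (* helper defined here to avoid package dependence *)
xc = xvec - mean[xvec];                (* centered x values *)
u = xc/Sqrt[xc . xc];                  (* unit vector along centered x *)
yc = y0 - mean[y0];                    (* centered starting values *)
a = yc . u;                            (* component of yc along u *)
w = yc - a*u;                          (* component of yc orthogonal to u *)
wn = Sqrt[w . w];                      (* its length; must be nonzero *)
s = r*a + Sqrt[1 - r^2]*wn;            (* length of the projected vector; must be positive *)
yclosed = mean[y0] + s*(r*u + Sqrt[1 - r^2]*w/wn);
Correlation[xvec, yclosed]             (* equals r up to rounding when s > 0 *)
```

If NMinimize found the global minimum, yclosed and ysol should agree to within its numerical tolerance; Max[Abs[yclosed - ysol]] is a quick way to compare them.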