Re: Fitting data on a vertical line

*To*: mathgroup at christensen.cybernetics.net
*Subject*: [mg1569] Re: Fitting data on a vertical line
*From*: harrison at helios.physics.utoronto.ca (David Harrison)
*Date*: Sat, 1 Jul 1995 01:48:02 -0400
*Organization*: University of Toronto - Dept. of Physics

In article <3sg3qt$8ji at news0.cybernetics.net>, David Withoff <withoff at wri.com> wrote:
>In article <3sd9dm$n5n at news0.cybernetics.net> phpull at unix1.sncc.lsu.edu
>(Joe Wade Pulley) writes:
>>
>>In[1]:=
>>ls={{2.1,3},{2.1,4},{2.1,5},{2.1,6},{2.1,7}}
>>
>>In[2]:=
>>ft=Fit[ls,{1,x},x]
>>
>>Out[2]=
>>0.924214 + 1.94085 x
>
>Actually, although this result isn't what you expected, it is
>mathematically correct. ...
>This is the best possible fit, so the result from Fit is correct.
>
>There is still the user-interface question of whether the Fit
>function should generate a message such as "Warning: the input is
>unusual and the result probably won't be what you want."

David is, as usual, correct: the least-squares technique doesn't always do what we expect. Although "everybody knows" this, we usually ignore it.

One habit that often keeps us from being misled when a fitter does something inappropriate is to *always* plot the data and the results of the fit.

One of my favorite examples involves some made-up data by Anscombe (American Statistician 27 (Feb. 1973), pg. 17). Fitting this data is also discussed in Shaw & Tigg if you have a copy handy (I don't have my copy with me at the moment, so I can't supply a page number).
Here is the data:

AnscombeData = {
  {{10., 8.04}, {8., 6.95}, {13., 7.58}, {9., 8.81}, {11., 8.33},
   {14., 9.96}, {6., 7.24}, {4., 4.26}, {12., 10.84}, {7., 4.82},
   {5., 5.68}},
  {{10., 9.14}, {8., 8.14}, {13., 8.74}, {9., 8.77}, {11., 9.26},
   {14., 8.1}, {6., 6.13}, {4., 3.1}, {12., 9.13}, {7., 7.26},
   {5., 4.74}},
  {{10., 7.46}, {8., 6.77}, {13., 12.74}, {9., 7.11}, {11., 7.81},
   {14., 8.84}, {6., 6.08}, {4., 5.39}, {12., 8.15}, {7., 6.42},
   {5., 5.73}},
  {{8., 6.58}, {8., 5.76}, {8., 7.71}, {8., 8.84}, {8., 8.47},
   {8., 7.04}, {8., 5.25}, {19., 12.5}, {8., 5.56}, {8., 7.91},
   {8., 6.89}}};

If you fit AnscombeData[[1]], AnscombeData[[2]], AnscombeData[[3]] and AnscombeData[[4]] each to a straight line, you will get virtually the same result for all four. You can go a bit further than just the output returned by Fit and discover that all four fits have essentially identical sums of squares and covariance matrices.

Plotting the data with ListPlot will show at a glance that three of the four fits are totally ridiculous.

The question of what to do when the standard least-squares algorithm does something stupid is probably beyond the scope of this newsgroup. However, in the case that began this discussion, one thing to do is:

In[2]:= Fit[Reverse /@ ls, {1,x}, x]

                        -16
Out[2]= 2.1 - 3.33067 10    x

Reverse /@ ls swaps the two coordinates of each point, so the x in this output stands for the original y values. The fit says that the original x is 2.1 for every y, with a slope that is zero to within roundoff.

--
David Harrison                                | "The senses do not lie, only
Dept. of Physics, Univ. of Toronto            |  they do not tell the truth."
Inet: harrison at faraday.physics.utoronto.ca |                     -- Mach
Tel: 416-978-2977  Fax: 416-978-5848          |
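The claim that the four Anscombe fits agree while the plots do not can be checked with a short sketch along the following lines. It assumes AnscombeData is defined exactly as in the post. The names fits and plots, the plot range 0 to 20, and the point size are illustrative choices, not part of the original message. GraphicsGrid is used for the layout and is not available in older versions of Mathematica.

```mathematica
(* Assumes AnscombeData is defined as in the post, and that x has no value. *)

(* Fit each of the four data sets to a straight line. *)
(* "fits" is an illustrative name, not from the original post. *)
fits = Map[Fit[#, {1, x}, x] &, AnscombeData]

(* Overlay each data set with its own fitted line. *)
(* The range {x, 0, 20} is an assumption chosen to cover all four sets. *)
plots = MapThread[
   Show[
     ListPlot[#1, PlotStyle -> PointSize[0.02]],
     Plot[#2, {x, 0, 20}],
     PlotRange -> All
   ] &,
   {AnscombeData, fits}];

(* Arrange the four plots in a 2 x 2 grid for comparison. *)
GraphicsGrid[Partition[plots, 2]]
```

The four expressions in fits should each come out close to 3 + 0.5 x, while the grid of plots shows how different the underlying data sets are.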