Re: Fitting data on a vertical line

• To: mathgroup at christensen.cybernetics.net
• Subject: [mg1569] Re: Fitting data on a vertical line
• From: harrison at helios.physics.utoronto.ca (David Harrison)
• Date: Sat, 1 Jul 1995 01:48:02 -0400
• Organization: University of Toronto - Dept. of Physics

```
In article <3sg3qt$8ji at news0.cybernetics.net>,
David Withoff <withoff at wri.com> wrote:
>In article <3sd9dm$n5n at news0.cybernetics.net> phpull at unix1.sncc.lsu.edu
>>
>>In[1]:=
>>ls={{2.1,3},{2.1,4},{2.1,5},{2.1,6},{2.1,7}}
>>
>>In[2]:=
>>ft=Fit[ls,{1,x},x]
>>
>>Out[2]=
>>0.924214 + 1.94085 x
>
>Actually, although this result isn't what you expected, it is
>mathematically correct. ...
>This is the best possible fit, so the result from Fit is correct.
>
>There is still the user-interface question of whether the Fit
>function should generate a message such as "Warning: the input is
>unusual and the result probably won't be what you want."

David is, as usual, correct: the least-squares technique doesn't always
do what we expect.

Although "everybody knows" this, we usually ignore it.  One thing that
can often keep us from being misled when a fitter does something
inappropriate is to *always* plot the data and the results of the fit.

One of my favorite examples involves some made-up data by Anscombe
(American Statistician 27 (Feb. 1973), p. 17).  Fitting this data
is also discussed in Shaw & Tigg, if you have a copy handy (I don't have
mine with me at the moment, so I can't supply a page number).

Here is the data:

AnscombeData = {
  {{10., 8.04}, {8., 6.95}, {13., 7.58}, {9., 8.81}, {11., 8.33},
   {14., 9.96}, {6., 7.24}, {4., 4.26}, {12., 10.84}, {7., 4.82},
   {5., 5.68}},
  {{10., 9.14}, {8., 8.14}, {13., 8.74}, {9., 8.77}, {11., 9.26},
   {14., 8.1}, {6., 6.13}, {4., 3.1}, {12., 9.13}, {7., 7.26},
   {5., 4.74}},
  {{10., 7.46}, {8., 6.77}, {13., 12.74}, {9., 7.11}, {11., 7.81},
   {14., 8.84}, {6., 6.08}, {4., 5.39}, {12., 8.15}, {7., 6.42},
   {5., 5.73}},
  {{8., 6.58}, {8., 5.76}, {8., 7.71}, {8., 8.84}, {8., 8.47},
   {8., 7.04}, {8., 5.25}, {19., 12.5}, {8., 5.56}, {8., 7.91},
   {8., 6.89}}};

If you fit each of AnscombeData[[1]], AnscombeData[[2]], AnscombeData[[3]]
and AnscombeData[[4]] to a straight line, you will get virtually the
same result for all 4.  You can go a bit further than just the output
returned by Fit and discover that all four fits have essentially identical
sums of squared residuals and covariance matrices.  ListPlotting the data
will show at a glance that three of the four fits are totally ridiculous.

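The question of what to do when the standard least-squares algorithm does
something stupid is probably beyond the scope of this newsgroup.  However,
in the case that began this discussion, one thing to do is to swap the
roles of the two coordinates and fit x as a function of y (so the x in
the output below stands for the original y):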

In[2]:=  Fit[Reverse /@ ls, {1,x}, x]

                         -16
Out[2]=  2.1 - 3.33067 10    x

--
David Harrison                             | "The senses do not lie, only
Dept. of Physics, Univ. of Toronto         |  they do not tell the truth."
Inet: harrison at faraday.physics.utoronto.ca |              -- Mach
Tel: 416-978-2977  Fax: 416-978-5848       |

```
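
The two checks described in the post can be sketched roughly as follows. This is only an illustration, not the poster's own session: it assumes `AnscombeData` has already been entered as given above, assumes a reasonably recent Mathematica (for `Total` and `GraphicsGrid`), and the names `fits`, `residuals` and `plots` are made up for the example.

```mathematica
(* Assumes AnscombeData is defined exactly as in the post above. *)

(* Straight-line least-squares fit for each of the four data sets.
   All four should come out close to 3 + 0.5 x. *)
fits = Table[Fit[AnscombeData[[i]], {1, x}, x], {i, 4}]

(* Sum of squared residuals for each fit; the four values should be
   nearly equal even though the data sets look nothing alike. *)
residuals = Table[
  Total[(#[[2]] - (fits[[i]] /. x -> #[[1]]))^2 & /@ AnscombeData[[i]]],
  {i, 4}]

(* Overlay each data set with its fitted line, on common axes. *)
plots = Table[
  Show[
    ListPlot[AnscombeData[[i]], PlotRange -> {{0, 20}, {0, 14}}],
    Plot[Evaluate[fits[[i]]], {x, 0, 20}]],
  {i, 4}];

GraphicsGrid[Partition[plots, 2]]

(* The vertical-line data from the start of the thread. *)
ls = {{2.1, 3}, {2.1, 4}, {2.1, 5}, {2.1, 6}, {2.1, 7}};

(* Swap the coordinates and fit x as a function of y.  The fitting
   variable is named y here only to make the role swap explicit. *)
Fit[Reverse /@ ls, {1, y}, y]
```

The numbers alone make the four Anscombe fits look interchangeable; the plots are what show that the straight line is a sensible model for only one of the sets. In the last step the coefficient of `y` should be zero to within rounding error, which amounts to the line x = 2.1.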
