MathGroup Archive 2010

Re: Mathematica calculates RSquared wrongly?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg112719] Re: Mathematica calculates RSquared wrongly?
  • From: Bill Rowe <readnews at sbcglobal.net>
  • Date: Tue, 28 Sep 2010 06:04:33 -0400 (EDT)

On 9/27/10 at 5:47 AM, lawrenceteo at yahoo.com (Lawrence Teo) wrote:

>sbbBN = {{-0.582258428`, 0.49531889`}, {-2.475512593`,
>0.751434565`}, {-1.508540016`, 0.571212292`}, {2.004747546`,
>0.187621117`}, {1.139972167`, 0.297735572`}, {-0.724053077`,
>0.457858443`}, {-0.830992757`, 0.313642502`}, {-3.830561204`,
>0.81639874`}, {-2.357296433`, 0.804397821`}, {0.986610836`,
>0.221932888`}, {-0.513640368`, 0.704999208`}, {-1.508540016`,
>0.798426867`}};

>nlm = NonlinearModelFit[sbbBN, a*x^2 + b*x + c, {a, b, c}, x]
>nlm["RSquared"]

>The RSquared by Mathematica is 0.963173 Meanwhile, Excel and manual
>hand calculation show that R^2 should be equal to 0.7622.

>Is Mathematica wrong?

Whenever Mathematica and Excel disagree, it is almost certain
that the problem lies with Excel. Simply put, the current
versions of Excel should not be relied upon for any serious
statistical analysis. A Google search on Excel turns up several
sites saying essentially the same thing.

But this case seems to be the exception. There is a more subtle
issue in play.

The problem you are solving is not a non-linear problem. Linear
versus non-linear in model fitting refers to the way the unknown
parameters enter the model, not to the functions of x used in
the model.
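
One way to check this distinction, sketched under the assumption that
a, b, c and x have no assigned values in the session: a model is
linear in its parameters exactly when all second derivatives with
respect to those parameters vanish. The helper name paramHessian is
made up for this illustration and is not a built-in.

```mathematica
(* Hessian of a model with respect to its parameters (hypothetical helper) *)
paramHessian[model_, params_List] := D[model, {params, 2}]

(* Quadratic in x but linear in a, b, c: every entry is zero *)
paramHessian[a*x^2 + b*x + c, {a, b, c}]

(* Linear in a but non-linear in b: some entries are nonzero *)
paramHessian[a*Exp[b*x], {a, b}]
```

The exponential model is included only as a contrast; it does not
come from the original question.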

Consider:

In[20]:= m = LinearModelFit[sbbBN, {1, x, x^2}, x];

In[21]:= m@"RSquared"

Out[21]= 0.762242

Which is the result returned by Excel. So, in this case it is
clear Excel is solving the linear regression problem and
computing RSquared for that problem correctly. In general, you
never want to use NonlinearModelFit for a linear problem that
can be handled by LinearModelFit.

Note, R is the *linear* correlation coefficient. To compute
something equivalent to R for a non-linear problem you have to
generalize the definition of R in some manner. I don't know how
this is being done in NonlinearModelFit. It is this detail that
is needed to determine whether the result returned for RSquared
by NonlinearModelFit is incorrect or not.
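
A possible explanation, offered as an assumption rather than a
statement about the internals of NonlinearModelFit: one common
generalization replaces the mean-corrected total sum of squares with
the uncorrected one, because a general model need not contain a
constant term. The sketch below computes both versions by hand from
the residuals of the quadratic fit. The variable names are
illustrative only.

```mathematica
sbbBN = {{-0.582258428, 0.49531889}, {-2.475512593, 0.751434565},
   {-1.508540016, 0.571212292}, {2.004747546, 0.187621117},
   {1.139972167, 0.297735572}, {-0.724053077, 0.457858443},
   {-0.830992757, 0.313642502}, {-3.830561204, 0.81639874},
   {-2.357296433, 0.804397821}, {0.986610836, 0.221932888},
   {-0.513640368, 0.704999208}, {-1.508540016, 0.798426867}};
y = sbbBN[[All, 2]];

lm = LinearModelFit[sbbBN, {1, x, x^2}, x];
nlm = NonlinearModelFit[sbbBN, a*x^2 + b*x + c, {a, b, c}, x];

(* Residual sum of squares of the least-squares quadratic *)
ssRes = Total[lm["FitResiduals"]^2];

(* Mean-corrected total sum of squares: the usual linear-regression R^2 *)
r2Corrected = 1 - ssRes/Total[(y - Mean[y])^2]

(* Uncorrected total sum of squares: no constant term is assumed *)
r2Uncorrected = 1 - ssRes/Total[y^2]

(* Values reported by the two fitting functions, for comparison *)
{lm["RSquared"], nlm["RSquared"]}
```

For this data the two hand-computed values should come out near
0.7622 and 0.9632, the two figures quoted in this thread. If so, both
programs report a valid quantity and differ only in which total sum
of squares they divide by.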

One final comment. Using powers of x as your set of basis
functions is OK for powers up to 2 and possibly OK for powers
up to 3. But it is definitely not a good idea for any higher
powers of x. The problem is that the powers of x do not form an
orthogonal basis set. Perhaps even more important, the matrices
used to solve the linear regression problem become increasingly
ill-conditioned as the powers of x increase. If you need to fit
a high-degree polynomial to your data, you should use Chebyshev
polynomials as the basis functions rather than powers of x.
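
The conditioning claim can be checked with a small sketch like the
one below. It builds the design matrix for a monomial basis and for a
Chebyshev basis on the x values from the question, then compares
their 2-norm condition numbers. The degree of 6, the rescaling of x
onto [-1, 1], and the helper name cond are choices made for this
illustration, not part of the original problem.

```mathematica
xs = {-0.582258428, -2.475512593, -1.508540016, 2.004747546,
   1.139972167, -0.724053077, -0.830992757, -3.830561204,
   -2.357296433, 0.986610836, -0.513640368, -1.508540016};

(* Map the x range onto [-1, 1], the natural interval for ChebyshevT *)
ts = (2 xs - (Max[xs] + Min[xs]))/(Max[xs] - Min[xs]);

deg = 6;  (* illustrative degree, not taken from the post *)

(* Design matrices: one row per data point, one column per basis function *)
monomialMatrix = N[Table[xi^k, {xi, xs}, {k, 0, deg}]];
chebyshevMatrix = N[Table[ChebyshevT[k, ti], {ti, ts}, {k, 0, deg}]];

(* 2-norm condition number: largest over smallest singular value *)
cond[mat_] := With[{sv = SingularValueList[mat, Tolerance -> 0]},
  Max[sv]/Min[sv]]

{cond[monomialMatrix], cond[chebyshevMatrix]}
```

The first number should be much larger than the second. Both bases
span the same polynomials, so in exact arithmetic the fitted curve is
identical; the difference lies in how much rounding error the
solution picks up.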


