MathGroup Archive 2007

Re: Re: Optimal variable selection in multiple linear regression ?

  • To: mathgroup at
  • Subject: [mg74968] Re: [mg74948] Re: Optimal variable selection in multiple linear regression ?
  • From: "Richard Palmer" <rhpalmer at>
  • Date: Fri, 13 Apr 2007 02:00:48 -0400 (EDT)
  • References: <ev4vla$8pn$> <>

Do you have any theory guiding you here?  If you do, the theory will
tell you which variables to include in the model.  For example, theory
says that sales should be affected by real price, prices of competitive
products, and income.

If you have no theory, you are data mining.  I expect that if you have
more than 100 IVs you don't have enough observations to estimate any
sort of model using all the IVs anyway (I believe Mathematica needs
all the data to be in memory to estimate a regression model).  Try a
variable reduction technique (see a textbook) to get the number down
to something manageable.
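
As a rough illustration of what a variable reduction step can look like,
here is a minimal sketch in Python (not Mathematica, and not a method taken
from any particular textbook). It assumes a simple correlation screen: rank
the IVs by the absolute value of their correlation with the response, and
skip any IV that is nearly collinear with one already kept. The names
pearson, screen_variables, max_keep and collinear_cutoff, the 0.95 cutoff,
and the toy data are all made up for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 if either sequence is constant (correlation undefined).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_variables(columns, y, max_keep, collinear_cutoff=0.95):
    """Greedy correlation screen (hypothetical helper for illustration).

    columns: dict mapping IV name -> list of observations
    y:       list of response observations
    Ranks IVs by |corr(IV, y)|, skips any IV whose |corr| with an
    already-kept IV reaches collinear_cutoff, and stops at max_keep.
    """
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], y)),
                    reverse=True)
    kept = []
    for name in ranked:
        if len(kept) >= max_keep:
            break
        if all(abs(pearson(columns[name], columns[k])) < collinear_cutoff
               for k in kept):
            kept.append(name)
    return kept


# Toy data, invented for this example: y = 3*x1 + x3, and x2 is just 2*x1.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "x3": [1.0, -1.0, 2.0, -2.0, 3.0, -3.0],
}
y = [3 * a + b for a, b in zip(columns["x1"], columns["x3"])]

kept = screen_variables(columns, y, max_keep=2)
print(kept)
```

A screen like this looks at one IV at a time, so it can miss variables that
only matter jointly; treat the result as a shorter starting list for Regress,
not as a final model.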


On 4/12/07, fkampas <fkampas at> wrote:
> Looking at the P values that Regress gives you would be a start to finding
> out which are significant.
> "Alex" <axel.kowald at> wrote in message
> news:ev4vla$8pn$1 at
> > Hello everybody,
> >
> > I'm trying to use Mathematica to do a multiple linear regression. I'm using
> > Regress[] and in principle everything works. However, because I have a
> > large number of independent variables (>100), I would like to select
> > only the most significant for the regression model.
> >
> > Is there already an algorithm implemented in Mathematica for selecting
> > the best variables, or how else would I do it?
> >
> > Many thanks,
> >
> > Alex
> >
> >

Richard Palmer
