Re: Re: Optimal variable selection in multiple linear regression ?
- To: mathgroup at smc.vnet.net
- Subject: [mg74968] Re: [mg74948] Re: Optimal variable selection in multiple linear regression ?
- From: "Richard Palmer" <rhpalmer at gmail.com>
- Date: Fri, 13 Apr 2007 02:00:48 -0400 (EDT)
- References: <ev4vla$8pn$1@smc.vnet.net> <200704120853.EAA25046@smc.vnet.net>
Do you have any theory guiding you here? If you do, the theory will tell you which variables to include in the model. For example, theory says that sales should be affected by real price, the prices of competing products, and income. If you have no theory, you are data mining.

I expect that if you have more than 100 IVs, you don't have enough observations to estimate any sort of model using all the IVs anyway (I believe Mathematica estimates regression models by requiring that all the data be in memory). Try a variable reduction technique (see a textbook) to get the number down to something manageable.

Regards,

On 4/12/07, fkampas <fkampas at verizon.net> wrote:
> Looking at the P values that Regress gives you would be a start to finding
> out which are significant.
>
> "Alex" <axel.kowald at rub.de> wrote in message
> news:ev4vla$8pn$1 at smc.vnet.net...
> > Hello everybody,
> >
> > I try to use Mathematica to do a multiple linear regression. I'm using
> > Regress[] and in principle everything works. However, because I have a
> > large number of independent variables (>100), I would like to select
> > only the most significant for the regression model.
> >
> > Is there already an algorithm implemented in Mathematica for selecting
> > the best variables, or how else would I do it?
> >
> > Many thanks,
> >
> > Alex

--
Richard Palmer
Cell 508 982-7266
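The two practical points in the reply (too few observations for 100+ IVs, and reducing the variable count first) can be illustrated with a small sketch. It is written in plain Python rather than Mathematica so that it runs without any packages, and it is only one possible reading: the reply does not name a specific reduction technique, so the correlation filter here is an assumption, and the names `residual_dof`, `pearson`, `drop_redundant` and the sample data are all hypothetical.

```python
from math import sqrt


def residual_dof(n_obs, n_ivs):
    """Residual degrees of freedom for least squares with an intercept.

    A value <= 0 means there are too few observations to estimate
    a model that uses all the independent variables.
    """
    return n_obs - n_ivs - 1


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def drop_redundant(columns, threshold=0.95):
    """Keep a column only if its |correlation| with every column
    kept so far is below the threshold (an assumed, simple filter)."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data: price_cents is an exact rescaling of price.
columns = {
    "price": [1.0, 2.0, 3.0, 4.0, 5.0],
    "price_cents": [100.0, 200.0, 300.0, 400.0, 500.0],
    "income": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_redundant(columns)
print(kept)
print(residual_dof(n_obs=80, n_ivs=100))
```

A filter like this only removes near-duplicate variables; it says nothing about which of the remaining variables belong in the model, which is where the theory mentioned above comes in.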
- References:
- Re: Optimal variable selection in multiple linear regression ?
- From: "fkampas" <fkampas@verizon.net>