       Re: Multiple regression best subset

• To: mathgroup at smc.vnet.net
• Subject: [mg55672] Re: Multiple regression best subset
• From: "Ray Koopman" <koopman at sfu.ca>
• Date: Sat, 2 Apr 2005 01:28:02 -0500 (EST)
• References: <d2dobj$ljj$1@smc.vnet.net><200503310625.BAA15246@smc.vnet.net> <d2j8ov$lc$1@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

```
Ian Roberts wrote:
> I check every possible subset. I presume XLMiner does the same as
> it describes the option as Exhaustive Search and does offer other
> options which are faster but not guaranteed to find the "best".
> I get the same results as they do but it's much much slower.

Here's some code that checks all the subsets. For each subset, it
saves 1 - AdjustedR^2 and an integer whose binary representation
identifies the predictors. There are undoubtedly more efficient ways
to do this, but it's unlikely they will be much simpler.

p = (* number of predictors *);
n1 = (* sample size - 1 *);
rxx = (* p x p matrix of correlations among the predictors *);
rxy = (* p-vector of correlations of the predictors with the dependent variable *);

u = Sort@Table[
  i = Flatten@Position[IntegerDigits[j,2,p],1]; (* indices of the predictors in subset j *)
  {(n1/(n1-Length@i))*(1. - rxy[[i]].LinearSolve[rxx[[i,i]],rxy[[i]]]), j},
  {j,2^p-1}];

To see the results for only the subsets with k predictors, set k and
look at

v = Select[u, Tr@IntegerDigits[#[[2]],2] == k &];

```
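
For reference, here is a sketch of the quantity that the code above stores for each subset, written out as a formula. The symbols S (the set of predictor indices in a subset), k (its size), and n (the sample size, so that n1 = n - 1) are notation introduced here for illustration; they do not appear in the original post.

```latex
% R^2 for the subset S, computed from correlations only
R_S^2 = r_{xy,S}^{\top} \, R_{xx,S}^{-1} \, r_{xy,S}

% Value stored as the first element of each entry of u
1 - \bar{R}_S^2 = \frac{n-1}{n-1-k}\,\bigl(1 - R_S^2\bigr), \qquad k = |S|
```

Here r_{xy,S} corresponds to rxy[[i]] and R_{xx,S} to rxx[[i,i]]. Because smaller values of 1 - AdjustedR^2 are better, Sort puts the best subsets first. The second element of each entry encodes the subset: IntegerDigits[j,2,p] pads to p binary digits, and a 1 in position m means predictor m is included. For example, with p = 3, j = 5 gives {1,0,1}, i.e. predictors 1 and 3.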
