Re: Multiple regression best subset
- To: mathgroup at smc.vnet.net
- Subject: [mg55672] Re: Multiple regression best subset
- From: "Ray Koopman" <koopman at sfu.ca>
- Date: Sat, 2 Apr 2005 01:28:02 -0500 (EST)
- References: <d2dobj$ljj$1@smc.vnet.net><200503310625.BAA15246@smc.vnet.net> <d2j8ov$lc$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Ian Roberts wrote:
> I check every possible subset. I presume XLMiner does the same as
> it describes the option as Exhaustive Search and does offer other
> options which are faster but not guaranteed to find the "best".
> I get the same results as they do but it's much much slower.

Here's some code that checks all the subsets. For each subset, it
saves 1 - AdjustedR^2 and an integer whose binary representation
identifies the predictors. There are undoubtedly more efficient ways
to do this, but it's unlikely they will be much simpler.

p = (* number of predictors *);
n1 = (* sample size - 1 *);
rxx = (* p x p matrix of correlations among the predictors *);
rxy = (* p-vector of correlations of the predictors with the d.v. *);

u = Sort@Table[i = Flatten@Position[IntegerDigits[j,2,p],1];
  {(n1/(n1-Length@i))*(1. - rxy[[i]].LinearSolve[rxx[[i,i]],rxy[[i]]]),
   j}, {j,2^p-1}];

To see the results for only the subsets with k predictors, look at

v = Select[u, Tr@IntegerDigits[#[[2]],2] == k &];
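
For anyone who wants to try this, here is a rough sketch of how the inputs might be built from raw data and how the integer in each entry of u can be turned back into a list of predictors. The data are made up, the names x, y, n and subset are only for illustration, and Correlation and RandomReal as used here assume Mathematica 6 or later. Note that IntegerDigits pads on the left, so predictor 1 corresponds to the highest-order bit: with p = 4, j = 10 is 1010 in binary and stands for predictors {1, 3}.

```mathematica
(* Made-up data for illustration: n cases, p predictors. *)
SeedRandom[1];
n = 50; p = 4;
x = RandomReal[{-1, 1}, {n, p}];
y = x[[All, 1]] - 2 x[[All, 3]] + RandomReal[{-1, 1}, n];

n1 = n - 1;
rxx = Correlation[x];                           (* p x p correlation matrix *)
rxy = Map[Correlation[#, y] &, Transpose[x]];   (* p-vector of correlations with y *)

(* Same computation as in the post: {1 - AdjustedR^2, subset code}, sorted. *)
u = Sort@Table[i = Flatten@Position[IntegerDigits[j, 2, p], 1];
    {(n1/(n1 - Length@i))*(1. - rxy[[i]].LinearSolve[rxx[[i, i]], rxy[[i]]]), j},
    {j, 2^p - 1}];

(* Hypothetical helper: predictor indices encoded by the integer j. *)
subset[j_] := Flatten@Position[IntegerDigits[j, 2, p], 1]

(* Best subset overall, as {AdjustedR^2, predictor indices}. *)
{1 - u[[1, 1]], subset[u[[1, 2]]]}

(* The five best subsets, as {1 - AdjustedR^2, predictor indices}. *)
{#[[1]], subset[#[[2]]]} & /@ Take[u, 5]
```

Because u is sorted on 1 - AdjustedR^2, the first entry is the subset with the largest adjusted R^2, and the same subset helper can be mapped over v to read off the best k-predictor subsets.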