Re: Multiple regressions
- To: mathgroup at smc.vnet.net
- Subject: [mg74275] Re: Multiple regressions
- From: Bill Rowe <readnewsciv at sbcglobal.net>
- Date: Fri, 16 Mar 2007 03:18:54 -0500 (EST)
On 3/15/07 at 4:59 AM, camartin at snet.net (Clifford Martin) wrote:

>I have a code that does multiple regressions on a set of data.
>Depending on the size and shape of the data I first do a regression,
>look at residuals, subtract the worst data point and do the regression
>again. I've written this with a set of If statements so if the size
>of the data becomes too small the next regression won't be done. This
>works fine for a single data set but if I put it into a loop through
>multiple data sets then I run into trouble. For example if the first
>data set goes through 5 regressions and the next data set only has
>enough data to go through 2 it takes the data from the first data set
>for the last 3 which I don't want. How do I clear the data after each
>pass through a data set. I've tried Clear[function1, function2,...etc.].
>I've tried UnSet[values,values]. So the question is once I've
>gone through the loop how do I clear the values so the next data set
>through it is not carried over.

I cannot offer a specific suggestion as to how to fix your code since you did not post it. But I will offer a few comments on what you appear to be doing.

I assume the reason for iteratively deleting points from your data set based on the residuals is to improve the estimates of the regression parameters when outliers are assumed to exist. If so, this kind of iterative approach really isn't statistically valid. The problem is that an iterative approach is likely to delete points that cannot be justified as outliers.

There are a variety of test statistics that can be computed to identify possible outliers. Several of these have been implemented in the package Statistics`LinearRegression`. Look at the documentation for this package. Quite likely, what you need is already there. If you are not familiar with the diagnostics being computed, you should refer to any good text on regression.
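As a rough illustration of the kind of diagnostics meant here, the sketch below computes leverage, studentized residuals and Cook's distance for a straight-line fit using the standard textbook formulas. It is written in plain Python only so that it is self-contained; it is not the code of Statistics`LinearRegression`, and the function name regression_diagnostics and the sample data are made up for the example.

```python
import math

def regression_diagnostics(xs, ys):
    """Fit y = a + b*x by least squares and return per-point outlier diagnostics.

    Hypothetical helper for illustration; standard textbook formulas only.
    """
    n = len(xs)
    p = 2  # number of fitted parameters: intercept and slope
    if n < p + 2:
        raise ValueError("need at least p + 2 points for studentized residuals")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate

    rows = []
    for x, e in zip(xs, residuals):
        h = 1.0 / n + (x - x_mean) ** 2 / sxx      # leverage (hat diagonal)
        r = e / math.sqrt(s2 * (1.0 - h))          # internally studentized residual
        # Externally studentized residual: variance re-estimated without this point.
        t = r * math.sqrt((n - p - 1) / (n - p - r * r))
        cook = r * r * h / (p * (1.0 - h))         # Cook's distance
        rows.append({"residual": e, "leverage": h,
                     "studentized": t, "cook_d": cook})
    return rows

# Made-up data: roughly y = 2x + 1, with the point at x = 7 far off the line.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 25.0, 17.1, 18.9]

diagnostics = regression_diagnostics(xs, ys)
suspect = max(range(len(xs)), key=lambda i: abs(diagnostics[i]["studentized"]))
print("most extreme point: index", suspect)
for i, row in enumerate(diagnostics):
    print(i, round(row["leverage"], 3), round(row["studentized"], 2),
          round(row["cook_d"], 3))
```

The externally studentized residual of each point can be compared against a t distribution with n - p - 1 degrees of freedom, which gives a stated criterion for calling a point an outlier instead of simply dropping whichever residual happens to be largest after each refit.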
-- To reply via email subtract one hundred and four