Re: Regression
- To: mathgroup at smc.vnet.net
- Subject: [mg73881] Re: Regression
- From: Bill Rowe <readnewsciv at sbcglobal.net>
- Date: Fri, 2 Mar 2007 06:42:19 -0500 (EST)
On 3/1/07 at 6:24 AM, anurag_uor at yahoo.com (Anurag) wrote:

>Hi, I am doing regression using mathematica. The code is simple. I
>define a polynomial. Read the data points from a file. I make sure
>that there are 27 data points in the file and all the 27 data points
>are unique.

>polynomial = Flatten[Table[x^i y^j z^k, {i, 0, 2}, {j, 0, 2}, {k, 0,
>2}]];
>dataf = ReadList["C:\\Debug\\ForRegression.txt", Number, RecordLists
>-> True];
>datafromfile = Transpose[dataf];
>RegResult = Regress[dataf, polynomial, {x, y, z}]

>I am getting error "Regress::mindata: The number of parameters to be
>estimated is greater than or equal to the number of data
>points. Subsequent results may be misleading.
>DesignedRegress::rank: Warning: the rank of the design matrix is 25,
>less than full rank 27. Only 25 of the 27 basis functions are
>needed to provide this fit. Try using a different model or greater
>precision."

This is telling you that you need more data points. Your model has 3^3 = 27 coefficients, exactly as many as you have data points. Regression is not the tool to use when you have just enough data to uniquely determine the coefficients of a polynomial. In that case you would use Solve. Regression is the tool to use when you have an overdetermined system, i.e., more data points than needed to uniquely specify the coefficients. The rank warning adds that, for your data, only 25 of the 27 columns of the design matrix are linearly independent, so these points do not pin down all 27 coefficients either.

In addition to the above, fitting a polynomial with the set of basis functions you have used is a rather poor choice. This choice of basis functions leads to numerically unstable problems. If you must find a best-fit polynomial to a data set, then the package NumericalMath`PolynomialFit` will give you much better results.

In general, any time you fit a set of basis functions where there is a strong correlation between the basis functions, you will encounter problems and get questionable results. An example of functions exhibiting the type of correlation that causes problems is the pair x^2 and x^4.
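To see the correlation problem concretely, here is a minimal sketch. It assumes a current Mathematica version in which Correlation, MatrixRank and SingularValueList are built in, and it uses made-up sample points (xs) rather than your data. It evaluates x^2 and x^4 at the same points, then builds the design matrix for a plain monomial basis and looks at its rank and condition number.

```mathematica
(* Hypothetical sample points in (0, 1]; these are NOT the original data. *)
(* Zero is left out on purpose, since 0.^0 evaluates to Indeterminate.   *)
xs = N[Range[1, 20]/20];

(* Sample correlation between the basis functions x^2 and x^4 *)
Correlation[xs^2, xs^4]

(* Design matrix for the monomial basis 1, x, x^2, ..., x^8: *)
(* one row per sample point, one column per basis function   *)
design = Table[x^k, {x, xs}, {k, 0, 8}];

(* Numerical rank of the design matrix *)
MatrixRank[design]

(* Condition number = largest singular value / smallest singular value. *)
(* Tolerance -> 0 keeps even the very small singular values.            *)
sv = SingularValueList[design, Tolerance -> 0];
First[sv]/Last[sv]
```

A correlation close to 1 and a large condition number both indicate that the columns of the design matrix are nearly linearly dependent, which is the situation the DesignedRegress::rank warning is describing. Running the same checks on the design matrix built from your own 27 points should show where the two redundant basis functions come from.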
For a more complete discussion of these issues, refer to any good statistics text on linear regression.

--
To reply via email subtract one hundred and four