Re: statistical comparison of parameters from two applications of NonlinearModelFit

*To*: mathgroup at smc.vnet.net*Subject*: [mg116154] Re: statistical comparison of parameters from two applications of NonlinearModelFit*From*: Ray Koopman <koopman at sfu.ca>*Date*: Thu, 3 Feb 2011 05:33:05 -0500 (EST)

On Wed 02 Feb 2011 11:59:49 -0600, DrMajorBob <btreat1 at austin.rr.com> wrote: > For nonlinear models, it seems rash beyond words, to me, to assume > "normality, homoscedasticity, and independence". Hence, I doubt most > of the statistics returned mean much of anything. > > The same is true in linear modeling, but you have a better chance of > testing the assumptions. > > Bobby Assumptions are like horseshoes and hand grenades: close counts. None of the usual assumptions is exactly true. What really matters is that they not be so far off that the analysis leads to wrong substantive conclusions. Knowledge of what variables can and can not be ignored, what kinds of approximations work and what kinds don't, and which assumptions are safe and which are not, are all part of the research lore in every field. However, while I'm at it, let me add another assumption to the list. The standard errors, CIs, etc in these nonlinear analyses are all "asymptotic", which means that they are necessarily valid only for large samples. "Large" is not defined but is known to vary from one situation to another, so we have the assumption that the sample is large enough for asymptotic theory to apply. In the case at hand, the sample is probably too small for asymptotic theory to apply. All that doesn't make the results meaningless, but it does mean that you are more likely to get a correct interpretation of the results from a person who knows the particular substantive area and has seen similar analyses of similar data than you are from a person who lacks such experience. > On Wed, 02 Feb 2011 05:08:31 -0600, Ray Koopman <koopman at sfu.ca> wrote: >> On Feb 1, 3:54 am, dantimatter <goo... at dantimatter.com> wrote: >>> Hello Everyone, >>> >>> First, I must apologize if the following questions are silly; >>> I am neither a statistician nor a particularly good Mathematica >>> user (although I do occasionally pretend to be one). And now, >>> on the question: >>> >>> I've got some data that I try to fit with NonlinearModelFit: >>> >>> time1 = {0, 100, 200, 300, 400, 500, 600, 700, 800}; >>> data1 = {1, 0.92, 0.79, 0.81, 0.81, 0.78, 0.72, 0.75, 0.68}; >>> >>> nlm1= NonlinearModelFit[Transpose[{time1, data1}], >>> a1*Exp[-b1*x] + c1, {{a1, .5}, {b1, 0.001}, {c1, 0.5}}, x, >>> ConfidenceLevel -> .95] >>> >>> with results: >>> >>> In[11] := nlm1["BestFitParameters"] >>> nlm1["ParameterConfidenceIntervals"] >>> nlm1["ParameterErrors"] >>> >>> Out[11] = {a1 -> 0.295575, b1 -> 0.00334141, c1 -> 0.696761} >>> Out[12] = {{0.179807, 0.411344}, {-0.000284765, 0.00696758}, >>> {0.582507, 0.811014}} >>> Out[13] = {0.0473121, 0.00148194, 0.0466929} >>> >>> I then do the same thing on a second data set: >>> >>> time2 = {0, 100, 200, 300, 400, 500}; >>> data2 = {0.8, 0.73, 0.65, 0.61, 0.58, 0.57}; >>> >>> nlm2= NonlinearModelFit[Transpose[{time2, data2}], >>> a2*Exp[-b2*x] + c2, {{a2, .5}, {b2, 0.001}, {c2, 0.5}}, x, >>> ConfidenceLevel -> .95] >>> >>> In[15]:= nlm2["BestFitParameters"] >>> nlm2["ParameterConfidenceIntervals"] >>> nlm2["ParameterErrors"] >>> >>> Out[15] = {a2 -> 0.287834, b2 -> 0.00362383, c2 -> 0.516796} >>> Out[16] = {{0.204254, 0.371413}, {0.00121124, 0.00603641}, >>> {0.427572, 0.60602}} >>> Out[17] = {0.0262627, 0.000758091, 0.0280362} >>> >>> Based on our understanding of the real system, we actually expect >>> b2 <= b1, but this not what the fitting tells us (of course, >>> there's a bit of error in the extracted parameters). >>> >>> So what I'd really like to know is: >>> >>> (1) how can I make sense of these "ParameterConfidenceIntervals" >>> and "ParameterErrors" (how are they determined, and what is meant >>> by them, beyond their brief description in >>> http://reference.wolfram.com/mathematica/tutorial/StatisticalModelAnalysis.html >>> )? >>> >>> and >>> >>> (2) is there a way to establish some kind of upper limit on the >>> detectability of the difference between b2 and b1? >>> >>> Any thoughts you might have would be most appreciated. >>> >>> Many thanks! >> >> The parameter estimates are the values of the parameters (a, b, c) >> that minimize the sum of squared differences between the observed >> data and the values predicted from the model. >> >> Each ParameterError is an estimate of the standard error of >> the corresponding parameter estimate. >> >> Each ParameterConficenceInterval is an interval that you are >> (100*ConfidenceLevel)% certain contains the true value of the >> corresponding parameter *if the model is complete (all the >> variables that matter are included), the form of the model function >> is correct, and the distributional assumptions about the errors >> (normality, homoscedasticity, independence) are correct*. >> >> Each confidence level is constructed as >> estimated_parameter +/- ( estimated_standard_error * >> Quantile[StudentTDistribution[degrees_of_freedom], >> (1 + ConfidenceLevel)/2] ), >> where degrees of freedom = >> #_of_data_points - #_of_estimated_parameters. >> >> Each estimated standard error is the square root of the product >> of two factors -- one that is common to all the estimates, and >> one that is specific to each estimate. >> >> The common factor is the mean square error of approximation: >> the sum of squares that the parameter estimates minimize, >> divided by the degrees of freedom. >> >> The specific factors are more complicated, so I'll say only that: >> 1. They do not directly reflect the size of the data-approximation >> errors; >> 2. They are strongly tied to the values of the independent variable >> (your 'time'); >> 3. Ceteris paribus, they are inversely proportional to the number >> of data points. >> >> Now we're ready to talk about detecting a difference between b1 and >> b2. The formal test of the hypothesis that there is no difference >> is a student t: (b1 - b2)/Sqrt[ASE[b1]^2 + ASE[b2]^2] = -.170 . >> The degrees of freedom are data dependent, but in your case the t >> is so small that it will never be significant, no matter what the >> degrees of freedom are. >> >> To achieve significance, the t would have to be at least 12 times >> its current value, which means you would need 144 times as many >> sample points as you currently have. (Note: I've cut a lot of >> corners here. A proper analysis would be much messier but would >> lead to the same conclusion -- you need a whole lot more data, >> on the order of 200 times as much as you have shown us.) >> >> Finally, here's some code that should reproduce some of the >> NonlinearModelFit output: >> >> r1 = FindFit[Transpose@{time1,data1}, a1*Exp[-b1*x]+c1, >> {{a1,.5},{b1,.001},{c1,.5}}, x] >> f1 = a1*Exp[-b1*time1]+c1; >> e1 = f1 - data1; >> df1 = Length@data1 - 3; >> var1 = (e1.e1/.r1)/df1 (* error variance *) >> cov1 = var1*Inverse[Transpose at #.#]&@Outer[D[##]&,f1,{a1,b1,c1}]/.r1; >> ase1 = Sqrt@Diagonal@cov1 (* ASEs *) >> cov1/Outer[Times,ase1,ase1] (* correlations *) >> ({a1,b1,c1}/.r1) + Outer[Times,ase1, (* CIs *) >> {-1,1}]*Quantile[StudentTDistribution[df1],.975] >> > > -- > DrMajorBob at yahoo.com