Re: MultinormalDistribution Question
- To: mathgroup at smc.vnet.net
- Subject: [mg120316] Re: MultinormalDistribution Question
- From: Ray Koopman <koopman at sfu.ca>
- Date: Tue, 19 Jul 2011 06:53:44 -0400 (EDT)
- References: <201107100901.FAA24634@smc.vnet.net> <ivel76$89d$1@smc.vnet.net> <ivuc25$grq$1@smc.vnet.net>
On Jul 17, 3:03 am, Steve <s... at epix.net> wrote:
> [...]
> What I really need to do is perform this analysis on test data for
> which I have only a few data points, hence the Student T distribution
> would be more appropriate than the Normal distribution. Secondly,
> values for the "independent" and "dependent" variables have no
> physical meaning below zero. So this implies that I need truncated
> distributions. I'm hoping that the solution Andrzej provided can be
> generalized for these added complications.
> Here are my 9 {F,t} data points where "F" is considered "independent"
> and t considered "dependent".
>
> {{1.01041, 0.3152}, {10.455, 0.3386}, {17.9032, 0.2534}, {24.9581,
> 0.5412}, {26.4688, 0.3251}, {27.4651, 0.4428}, {30.1682,
> 0.3402}, {36.6174, 0.2106}, {45.6129, 0.2154}}
>
> Would someone be so kind as to plop this data into their notebook to
> confirm a solution or two for me ? My results are below which are
> based on truncating the Student T distribution, 8 degrees of freedom
> and a calculated rho of -0.2327.
>
> [...]

I have several comments.

First, the correlation of t with F is so small that it is hard to justify treating it as nonzero. The unbiased estimate of the conditional variance of t|F is bigger than the unbiased estimate of the marginal variance of t. (This happens whenever the F-statistic for testing the significance of the correlation is < 1.) In bivariate normal correlation, and in linear regression with homoscedastic normal error, df = n-2, not n-1.

Regression models require only the conditional distribution of the dependent variable given the independent variable. The independent variable need not be random. The fact that t can not be negative means that its conditional distributions can not be normal.

Is ordinary least squares fitting justified? Yes, but only if conditional normality is abandoned. One solution is to treat the conditional distributions as Gamma[a,b] variables, where a is the shape constant and b is the scale constant. Take a = m[F]^2/v and b = v/m[F]. Then the mean of each conditional distribution will be m[F], the variance of each conditional distribution will be v, and the Gauss-Markov theorem justifies ordinary least-squares fitting.

Regardless of whether the conditional distributions are assumed to be heteroscedastic truncated normal or homoscedastic gamma, the sampling distribution of the estimates of the regression coefficients and the conditional variance will not be the same as in the usual homoscedastic normal case, and the usual Student-t distributions can not be used to estimate quantiles of the conditional distributions.

This may be a situation where one of John Tukey's antihubrisines applies: "The data may not contain the answer. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."
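
For anyone who wants to check the first point, here is a rough Mathematica sketch using the posted data (the variable names are mine, just for illustration):

    data = {{1.01041, 0.3152}, {10.455, 0.3386}, {17.9032, 0.2534},
      {24.9581, 0.5412}, {26.4688, 0.3251}, {27.4651, 0.4428},
      {30.1682, 0.3402}, {36.6174, 0.2106}, {45.6129, 0.2154}};
    {f, t} = Transpose[data];
    n = Length[data];
    r = Correlation[f, t]                          (* roughly -0.23 *)
    lm = LinearModelFit[data, x, x];               (* OLS fit of t on F *)
    margVar = Variance[t]                          (* unbiased, divisor n-1 *)
    condVar = Total[lm["FitResiduals"]^2]/(n - 2)  (* unbiased, divisor n-2 *)
    fStat = r^2 (n - 2)/(1 - r^2)                  (* < 1, so condVar > margVar *)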
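
And a sketch of the gamma idea, continuing with lm and condVar from the snippet above (again only an illustration, not a full analysis; F = 20 is an arbitrary example value):

    v = condVar;                       (* homoscedastic conditional variance *)
    m[x0_] := lm[x0]                   (* fitted conditional mean at F = x0 *)
    cond[x0_] := GammaDistribution[m[x0]^2/v, v/m[x0]]  (* shape a, scale b *)
    {Mean[cond[20.]], m[20.]}          (* agree: the conditional mean is m[F] *)
    {Variance[cond[20.]], v}           (* agree: the conditional variance is v *)
    Quantile[cond[20.], {.025, .975}]  (* e.g. a central 95% interval at F = 20 *)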