MathGroup Archive: July 2011 [00425]

Re: MultinormalDistribution Question

To: mathgroup at smc.vnet.net
Subject: [mg120367] Re: MultinormalDistribution Question
From: Ray Koopman <koopman at sfu.ca>
Date: Wed, 20 Jul 2011 06:32:45 -0400 (EDT)
References: <201107100901.FAA24634@smc.vnet.net> <ivel76$89d$1@smc.vnet.net> <ivuc25$grq$1@smc.vnet.net>

On Jul 17, 3:03 am, Steve <s... at epix.net> wrote:
> [...]
> What I really need to do is perform this analysis on test data for
> which I have only a few data points, hence the Student T distribution
> would be more appropriate than the Normal distribution. Secondly,
> values for the "independent" and "dependent" variables have no
> physical meaning below zero. So this implies that I need truncated
> distributions. I'm hoping that the solution Andrzej provided can be
> generalized for these added complications.
> Here are my 9 {F,t} data points where "F" is considered "independent"
> and t considered "dependent".

> {{1.01041, 0.3152}, {10.455, 0.3386}, {17.9032, 0.2534}, {24.9581,
>    0.5412}, {26.4688, 0.3251}, {27.4651, 0.4428}, {30.1682,
>    0.3402}, {36.6174, 0.2106}, {45.6129, 0.2154}}

> Would someone be so kind as to plop this data into their notebook to
> confirm a solution or two for me? My results are below which are
> based on truncating the Student T distribution, 8 degrees of freedom
> and a calculated rho of -0.2327.

> [...]

Another approach is to regress u = Log[t] linearly on f. This
keeps the conditional distributions of t strictly positive,
but it makes the regression of t on f nonlinear.
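
Written out, the model behind this suggestion is roughly the following. The error term e is notation assumed here, and a and b stand for the intercept and slope, matching the names used in the code further down.

```latex
u = \log t = a + b\,f + e
\quad\Longrightarrow\quad
t = \exp(a + b\,f)\,\exp(e) > 0
```

Since t is the exponential of the linear predictor, it is positive for every f, while its dependence on f is a curve rather than a straight line.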

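For your data, switching to u increases the magnitude of the
correlation (from your -.2327 to about -.3228), but the
conditional s.d. is still bigger than the marginal s.d.,
because with n = 9 the factor (n-1)/(n-2) more than offsets
1-r^2. So there is still room for questioning the whole exercise.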

For what it's worth, here are the numbers I got:

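(* setup, assumed from the quoted data: f, t as lists, u = Log[t] *)
data = {{1.01041, 0.3152}, {10.455, 0.3386}, {17.9032, 0.2534},
   {24.9581, 0.5412}, {26.4688, 0.3251}, {27.4651, 0.4428},
   {30.1682, 0.3402}, {36.6174, 0.2106}, {45.6129, 0.2154}};
{f, t} = Transpose[data]; u = Log[t]; n = Length[data];

{mf, sf} = {Mean@f, StandardDeviation@f}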

{24.5177, 13.3704}

{mu, su} = {Mean@u, StandardDeviation@u}

{-1.1482, .311145}

r = Correlation[f,u]

-.322776

b = r*su/sf  (* slope *)

-.00751142

a = mu - b*mf  (* intercept *)

-.964034

se = su*Sqrt[(1-r^2)(n-1)/(n-2)]  (* conditional s.d. *)

.314825
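
As a cross-check on the hand-computed intercept, slope and conditional s.d., the same least-squares fit can be obtained from the built-in LinearModelFit. This is only a sketch: the names pts and lm are introduced here for illustration, and the block restates the quoted data so that it runs on its own.

```mathematica
(* {f, Log[t]} pairs built from the nine quoted data points *)
pts = {#1, Log[#2]} & @@@ {{1.01041, 0.3152}, {10.455, 0.3386},
    {17.9032, 0.2534}, {24.9581, 0.5412}, {26.4688, 0.3251},
    {27.4651, 0.4428}, {30.1682, 0.3402}, {36.6174, 0.2106},
    {45.6129, 0.2154}};

(* ordinary least-squares line u = a + b*x *)
lm = LinearModelFit[pts, x, x];

lm["BestFitParameters"]        (* {intercept, slope}; compare with {a, b} *)
Sqrt[lm["EstimatedVariance"]]  (* residual s.d. on n-2 d.f.; compare with se *)
```

The two routes should agree because r*su/sf is the least-squares slope and su^2 (1-r^2)(n-1) is the residual sum of squares, so dividing by n-2 and taking the square root gives the same se.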

Table[{x, CDF[StudentTDistribution[n-2],
      (a+b*x-Log[.5])/se]}, {x,0,50,5}]
(* f, Pr[(t|f) > .5] *)

{{ 0, .209020},
 { 5, .179928},
 {10, .154056},
 {15, .131283},
 {20, .111422},
 {25, .0942435},
 {30, .0794909},
 {35, .0669001},
 {40, .0562108},
 {45, .0471758},
 {50, .0395667}}
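
The argument passed to CDF in the table can be read as follows. This is a sketch of the reasoning, assuming (as the code appears to) that (u - a - b x)/se is treated as Student t with n-2 degrees of freedom; F_{n-2} denotes that distribution's CDF and s_e stands for se.

```latex
\Pr(t > 0.5 \mid f = x)
  = \Pr(u > \log 0.5 \mid f = x)
  = 1 - F_{n-2}\!\left(\frac{\log 0.5 - (a + b\,x)}{s_e}\right)
  = F_{n-2}\!\left(\frac{a + b\,x - \log 0.5}{s_e}\right)
```

The last step uses the symmetry of the Student t distribution about zero, which is why the code needs no explicit 1 - CDF[...].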
