MathGroup Archive: November 2010 [00453]

[Date Index] [Thread Index] [Author Index]
Re: Serious problem with CDF[MultinormalDistribution...
To: mathgroup at smc.vnet.net
Subject: [mg113949] Re: Serious problem with CDF[MultinormalDistribution...
From: Darren Glosemeyer <darreng at wolfram.com>
Date: Fri, 19 Nov 2010 05:08:11 -0500 (EST)
On 11/18/2010 6:05 AM, Myeong Ae Kang wrote:
> With version 6.0 and 7.0.1, the difference is quite large. With other input
> values, the results are much worse even than this.
>
> In[1326]:= Needs["MultivariateStatistics`"]
>
> (1.0-CDF[MultinormalDistribution[{0,0,0},Rationalize[{{1,0.559887,0.289988},
> {0.559887,1,0.880157},{0.289988,0.880157,1}}]],SetPrecision
> [{4.5,4.5,6.9958},50]])
>
> (1.0-CDF[MultinormalDistribution[{0,0},{{1,0.559887},{0.559887,1}}],{4.5,4.5}])
>
> Out[1327]= 5.00832*10^-6
> Out[1328]= 6.75155*10^-6
>
> And, similarly to this, with version 6.0, I did the calculation of CDF with 6
> RV's in different orders of RV's. I just changed the orders of the sigma matrix
> and range vector with zero mean. But, the results(which are suppossed to equal)
> are quite different like this following:
>
> In[1432]:= ClearAll;
> Print["R,A,W ======================="]
> TT={{1,0.576363,0.300013,0.888477,0.106961,0.215919},
> {0.576363,1,0.888477,0.300013,0.215919,0.106961},{0.300013,0.888477,1,-
> 9.72249*10^-6,0.466809,-0.191847},{0.888477,0.300013,-9.72249*10^-6,1,-
> 0.191847,0.466809},{0.106961,0.215919,0.466809,-0.191847,1,-0.862483},
> {0.215919,0.106961,-0.191847,0.466809,-0.862483,1}};
> TTbnd={7.14659,7.14659,7.83599,7.83599,5.55663,5.55663};
>
> TTp={{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0},{0,1,0,0,0,0},
> {1,0,0,0,0,0}}.TT.{{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0},
> {0,1,0,0,0,0},{1,0,0,0,0,0}}
> Tbnd=TTbnd.Transpose[{{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0},
> {0,1,0,0,0,0},{1,0,0,0,0,0}}]
> 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp],Tbnd]
>
> Print["A,R,W ======================="]
> TTp2={{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1},{1,0,0,0,0,0},
> {0,1,0,0,0,0}}.TT.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0},
> {0,0,0,0,0,1},{1,0,0,0,0,0},{0,1,0,0,0,0}}]
> Tbnd2=TTbnd.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1},
> {1,0,0,0,0,0},{0,1,0,0,0,0}}]
> 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp2],Tbnd2]
>
> Print["A,W,R ======================="]
> TTp3={{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0},{0,1,0,0,0,0},{0,0,0,0,1,0},
> {0,0,0,0,0,1}}.TT.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0},
> {0,1,0,0,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1}}]
> Tbnd3=TTbnd.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0},{0,1,0,0,0,0},
> {0,0,0,0,1,0},{0,0,0,0,0,1}}]
> 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp3],Tbnd3]
>
>
> During evaluation of In[1432]:= R,A,W =======================
> Out[1436]= {{1.,-0.862483,0.466809,-0.191847,0.106961,0.215919},{-0.862483,1.,-
> 0.191847,0.466809,0.215919,0.106961},{0.466809,-0.191847,1.,-9.72249*10^-
> 6,0.300013,0.888477},{-0.191847,0.466809,-9.72249*10^-6,1.,0.888477,0.300013},
> {0.106961,0.215919,0.300013,0.888477,1.,0.576363},
> {0.215919,0.106961,0.888477,0.300013,0.576363,1.}}
> Out[1437]= {5.55663,5.55663,7.83599,7.83599,7.14659,7.14659}
> Out[1438]= 1.37516*10^-8
> During evaluation of In[1432]:= A,R,W =======================
> Out[1440]= {{1.,-9.72249*10^-6,0.466809,-0.191847,0.300013,0.888477},{-
> 9.72249*10^-6,1.,-0.191847,0.466809,0.888477,0.300013},{0.466809,-0.191847,1.,-
> 0.862483,0.106961,0.215919},{-0.191847,0.466809,-0.862483,1.,0.215919,0.106961},
> {0.300013,0.888477,0.106961,0.215919,1.,0.576363},
> {0.888477,0.300013,0.215919,0.106961,0.576363,1.}}
> Out[1441]= {7.83599,7.83599,5.55663,5.55663,7.14659,7.14659}
> Out[1442]= 6.50208*10^-9
> During evaluation of In[1432]:= A,W,R =======================
> Out[1444]= {{1.,-9.72249*10^-6,0.300013,0.888477,0.466809,-0.191847},{-
> 9.72249*10^-6,1.,0.888477,0.300013,-0.191847,0.466809},
> {0.300013,0.888477,1.,0.576363,0.106961,0.215919},
> {0.888477,0.300013,0.576363,1.,0.215919,0.106961},{0.466809,-
> 0.191847,0.106961,0.215919,1.,-0.862483},{-0.191847,0.466809,0.215919,0.106961,-
> 0.862483,1.}}
> Out[1445]= {7.83599,7.83599,7.14659,7.14659,5.55663,5.55663}
> Out[1446]= 1.17684*10^-14
> ---------------
>
> Now, I have two questions.
> 1) Does mathematica version 8.0 give three similar results of the second part?
> I don't have version 8.0. Before upgrading it (if possible), I wonder that.
>
> 2) Is there any other way to get the proper results with version 6.0 or 7.01?
>
> Thank you.
>
>
> Regards,
> Myeong Ae Kang
>
> =================================
>
> Darren Glosemeyer<darreng at wolfram.com>  wrote:
>
>> On 11/17/2010 4:29 AM, Myeong Ae Kang wrote:
>>> Hello.
>>>
>>> I have a problem, serious and urgent.
>>>
>>> I got wrong results of CDF of MultinormalDistribution.
>>>
>>> I'm calculating (1.0 -CDF[MultinormalDistribution[{0, 0,0}, {{1, 0.559887,
>>> 0.289988}, {0.559887, 1,0.880157}, {0.289988, 0.880157, 1}}],
>>> {4.5,4.5,6.9958}]).
>>> The results is smaller than (1.0 -CDF[MultinormalDistribution[{0,0}, {{1,
>>> 0.559887}, {0.559887, 1}}], {4.5,4.5}]). It should be larger.
>>>
>>> Actually, I have to calculate more than 6 to 12 random variables (12 x 12
>> sigma
>>> matrix.).
>>>
>>> Mathematica 6.0 and 7.0.1 gave me the same wrong result.
>>>
>>> I compared this with the result of another system.
>>> That system gave the expected value.
>>>
>>> To fix this problem, what should I do?
>>>
>>>
>>> Regards,
>>> Myeong Ae Kang
>>>
>>>
>>>
>>
>> This is addressed in version 8. Here are the results:
>>
>>
>>
>> In[1]:= threeD = (1.0 -
>>              CDF[MultinormalDistribution[{0, 0,
>>                 0}, {{1, 0.559887, 0.289988}, {0.559887, 1,
>>                  0.880157}, {0.289988, 0.880157, 1}}], {4.5, 4.5, 6.9958}])
>>
>>                     -6
>> Out[1]= 6.75155 10
>>
>> In[2]:= twoD = (1.0 -
>>              CDF[MultinormalDistribution[{0,
>>                 0}, {{1, 0.559887}, {0.559887, 1}}], {4.5, 4.5}])
>>
>>                     -6
>> Out[2]= 6.75155 10
>>
>> In[3]:= threeD>  twoD
>>
>> Out[3]= True
>>
>>
>> The difference is at the level of machine precision numerical error:
>>
>>
>> In[4]:= threeD - twoD
>>
>>                     -16
>> Out[4]= 1.11022 10
>>
>>
>>
>> Note that if we artificially increase the precision as in the example
>> below, we get a difference at the same order of magnitude, so the
>> precision in the difference above is as good as could be expected (and
>> in fact probably better than should be expected) at machine precision.
>>
>>
>> In[5]:= CDF[MultinormalDistribution[{0, 0, 0},
>>             Rationalize[{{1, 0.559887, 0.289988}, {0.559887, 1,
>>                0.880157}, {0.289988, 0.880157, 1}}, 0]],
>>            SetPrecision[{4.5, 4.5, 6.9958}, 50]]
>>
>> Out[5]= 0.999993248446123977600817893853727681422304494350
>>
>> In[6]:= CDF[MultinormalDistribution[{0, 0},
>>             Rationalize[{{1, 0.559887}, {0.559887, 1}}, 0]],
>>            SetPrecision[{4.5, 4.5}, 50]]
>>
>> Out[6]= 0.999993248446124136306526964076086153533229885505
>>
>> In[7]:= % - %%
>>
>>                                               -16
>> Out[7]= 1.5870570907022235847211092539116 10
>>
>>
>> Darren Glosemeyer
>> Wolfram Research
>>

The results for these 6-dimensional cases are roughly the same in 
version 8 as they are in version 7. There are, however, ways to get more 
precise results in version 8 which will be shown below. We will look 
into how these can be incorporated directly into CDF for the future.

There are some important issues to keep in mind with the computation of 
CDFs so far out in the tails of higher dimenional normal. For 2 and 3 
dimensional normals, special case code is now used in version 8 which 
gets more precise results than in version 7 in your examples. In higher 
dimensions the methodology requires an (n-1)-dimensional numeric 
integral. The results obtained by CDF in the 6-dimensional examples sent 
are good to about 7 or 8 digits, however they are nearly 1 and so 
subtracting them from 1 cancels most, if not all, of the significant 
digits.

The following approach to computing the values you are after was 
provided by my colleague Sasha Pavlyk and uses the new SurvivalFunction 
and MarginalDistribution functions from version 8 to effectively remove 
pieces of probability outside the CDF region. The idea is to shave off 
portions of the space by using lower dimensional marginal survival 
functions. These would include marginals of dimension 1 through 5, but 
the contributions from the higher dimensional margins will be much less 
than those from the lower dimensional marginal this far out in the tails 
of the distributions, so some higher dimensional marginals can be ignored.


Here are the covariance matrices and end points sent:

In[1]:= TT = {{1, 0.576363, 0.300013, 0.888477, 0.106961,
             0.215919}, {0.576363, 1, 0.888477, 0.300013, 0.215919,
             0.106961}, {0.300013, 0.888477, 1, -9.72249*10-6,
             0.466809, -0.191847}, {0.888477, 0.300013, -9.72249*10-6,
             1, -0.191847, 0.466809}, {0.106961, 0.215919, 0.466809, 
-0.191847,
              1, -0.862483}, {0.215919, 0.106961, -0.191847,
             0.466809, -0.862483, 1}};

In[2]:= TTbnd = {7.14659, 7.14659, 7.83599, 7.83599, 5.55663, 5.55663};

In[3]:= TTp = {{0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 1, 0, 
0}, {0,
               0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 0, 0, 0, 0, 
0}}.TT.{{0,
              0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 1, 0, 0}, 
{0, 0, 1,
               0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 0, 0, 0, 0, 0}};

In[4]:= Tbnd = TTbnd.Transpose[{{0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, 
{0, 0,
                0, 1, 0, 0}, {0, 0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 
0, 0,
               0, 0, 0}}];

In[5]:= TTp2 = {{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 0, 0, 1,
              0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0,
              0}}.TT.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, 
{0, 0,
               0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, {0, 
1, 0,
               0, 0, 0}}];

In[6]:= Tbnd2 = TTbnd.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, 
{0,
               0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, 
{0, 1,
               0, 0, 0, 0}}];

In[7]:= TTp3 = {{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {1, 0, 0, 0, 0,
              0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0,
              1}}.TT.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, 
{1, 0,
               0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 
0, 0,
               0, 0, 1}}];

In[8]:= Tbnd3 = TTbnd.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, 
{1,
               0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, 
{0, 0,
               0, 0, 0, 1}}];



GDiff is a generalized differencing for higher dimensions:



In[9]:= GDiff[f_, {x_, a_, b_}, specs___] :=
          GDiff[(f /. x -> a) - (f /. x -> b), specs]

In[10]:= GDiff[f_] := f



This defines the computation of 1-CDF incorporating marginal survival 
functions:


In[11]:= Clear[OneMinusCDF];

In[12]:= OneMinusCDF[dist_, xvec_, cutoffdim_: 3] :=
           Module[{res, len = Length[xvec], x, sf},
            res = 1 -
              GDiff[sf[Array[x, len]],
               Apply[Sequence,
                Thread[{Array[x, len], ConstantArray[-Infinity, len], 
xvec}]]];
            res = res /. sf[{(-Infinity) ..}] :> 1;
            (* remove survival functions in more than 3 dimensions as 
small *)
            res = res /. {sf[arg_] /; len - Count[arg, -Infinity] > 
cutoffdim :>
                 0};
            res /. {sf[arg_] :>
               SurvivalFunction[
                MarginalDistribution[dist,
                 Cases[Transpose[{Range[len], arg}], {n_, 
Except[-Infinity]} :>
                   n]], With[{y = DeleteCases[arg, -Infinity]},
                 If[Length[y] == 1, First[y], y]]]}
            ]




Using marginals up to dimension 3, we get consistent results for the 
three cases:


In[13]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], 
Tbnd, 3] // Timing

                            -8
Out[13]= {0.375, 2.75042 10  }

In[14]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp2], 
Tbnd2, 3] // Timing

                           -8
Out[14]= {0.36, 2.75042 10  }

In[15]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp3], 
Tbnd3, 3] // Timing

                            -8
Out[15]= {0.344, 2.75042 10  }


We are actually far enough out in the tails of the distribution that 
using only 1D marginals gives the same result up to the number of digits 
shown.


In[16]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], 
Tbnd, 1] // Timing

                            -8
Out[16]= {0.015, 2.75042 10  }


There is some difference out beyond the digits shown:


In[17]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], 
Tbnd, 3] - OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp],
            Tbnd, 1]

                    -15
Out[17]= -2.2018 10



Darren Glosemeyer
Wolfram Research
Prev by Date: Re: Mathematica 8
Next by Date: Re: Why does "Normal" not create a list of line graphics
Previous by thread: Re: Serious problem with CDF[MultinormalDistribution...
Next by thread: Does activating 8 disable 7?