Re: Serious problem with CDF[MultinormalDistribution...

*To*: mathgroup at smc.vnet.net*Subject*: [mg113949] Re: Serious problem with CDF[MultinormalDistribution...*From*: Darren Glosemeyer <darreng at wolfram.com>*Date*: Fri, 19 Nov 2010 05:08:11 -0500 (EST)

On 11/18/2010 6:05 AM, Myeong Ae Kang wrote: > With version 6.0 and 7.0.1, the difference is quite large. With other input > values, the results are much worse even than this. > > In[1326]:= Needs["MultivariateStatistics`"] > > (1.0-CDF[MultinormalDistribution[{0,0,0},Rationalize[{{1,0.559887,0.289988}, > {0.559887,1,0.880157},{0.289988,0.880157,1}}]],SetPrecision > [{4.5,4.5,6.9958},50]]) > > (1.0-CDF[MultinormalDistribution[{0,0},{{1,0.559887},{0.559887,1}}],{4.5,4.5}]) > > Out[1327]= 5.00832*10^-6 > Out[1328]= 6.75155*10^-6 > > And, similarly to this, with version 6.0, I did the calculation of CDF with 6 > RV's in different orders of RV's. I just changed the orders of the sigma matrix > and range vector with zero mean. But, the results(which are suppossed to equal) > are quite different like this following: > > In[1432]:= ClearAll; > Print["R,A,W ======================="] > TT={{1,0.576363,0.300013,0.888477,0.106961,0.215919}, > {0.576363,1,0.888477,0.300013,0.215919,0.106961},{0.300013,0.888477,1,- > 9.72249*10^-6,0.466809,-0.191847},{0.888477,0.300013,-9.72249*10^-6,1,- > 0.191847,0.466809},{0.106961,0.215919,0.466809,-0.191847,1,-0.862483}, > {0.215919,0.106961,-0.191847,0.466809,-0.862483,1}}; > TTbnd={7.14659,7.14659,7.83599,7.83599,5.55663,5.55663}; > > TTp={{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0},{0,1,0,0,0,0}, > {1,0,0,0,0,0}}.TT.{{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0}, > {0,1,0,0,0,0},{1,0,0,0,0,0}} > Tbnd=TTbnd.Transpose[{{0,0,0,0,0,1},{0,0,0,0,1,0},{0,0,0,1,0,0},{0,0,1,0,0,0}, > {0,1,0,0,0,0},{1,0,0,0,0,0}}] > 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp],Tbnd] > > Print["A,R,W ======================="] > TTp2={{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1},{1,0,0,0,0,0}, > {0,1,0,0,0,0}}.TT.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0}, > {0,0,0,0,0,1},{1,0,0,0,0,0},{0,1,0,0,0,0}}] > Tbnd2=TTbnd.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1}, > {1,0,0,0,0,0},{0,1,0,0,0,0}}] > 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp2],Tbnd2] > > Print["A,W,R ======================="] > TTp3={{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0},{0,1,0,0,0,0},{0,0,0,0,1,0}, > {0,0,0,0,0,1}}.TT.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0}, > {0,1,0,0,0,0},{0,0,0,0,1,0},{0,0,0,0,0,1}}] > Tbnd3=TTbnd.Transpose[{{0,0,1,0,0,0},{0,0,0,1,0,0},{1,0,0,0,0,0},{0,1,0,0,0,0}, > {0,0,0,0,1,0},{0,0,0,0,0,1}}] > 1.0-CDF[MultinormalDistribution[{0,0,0,0,0,0},TTp3],Tbnd3] > > > During evaluation of In[1432]:= R,A,W ======================= > Out[1436]= {{1.,-0.862483,0.466809,-0.191847,0.106961,0.215919},{-0.862483,1.,- > 0.191847,0.466809,0.215919,0.106961},{0.466809,-0.191847,1.,-9.72249*10^- > 6,0.300013,0.888477},{-0.191847,0.466809,-9.72249*10^-6,1.,0.888477,0.300013}, > {0.106961,0.215919,0.300013,0.888477,1.,0.576363}, > {0.215919,0.106961,0.888477,0.300013,0.576363,1.}} > Out[1437]= {5.55663,5.55663,7.83599,7.83599,7.14659,7.14659} > Out[1438]= 1.37516*10^-8 > During evaluation of In[1432]:= A,R,W ======================= > Out[1440]= {{1.,-9.72249*10^-6,0.466809,-0.191847,0.300013,0.888477},{- > 9.72249*10^-6,1.,-0.191847,0.466809,0.888477,0.300013},{0.466809,-0.191847,1.,- > 0.862483,0.106961,0.215919},{-0.191847,0.466809,-0.862483,1.,0.215919,0.106961}, > {0.300013,0.888477,0.106961,0.215919,1.,0.576363}, > {0.888477,0.300013,0.215919,0.106961,0.576363,1.}} > Out[1441]= {7.83599,7.83599,5.55663,5.55663,7.14659,7.14659} > Out[1442]= 6.50208*10^-9 > During evaluation of In[1432]:= A,W,R ======================= > Out[1444]= {{1.,-9.72249*10^-6,0.300013,0.888477,0.466809,-0.191847},{- > 9.72249*10^-6,1.,0.888477,0.300013,-0.191847,0.466809}, > {0.300013,0.888477,1.,0.576363,0.106961,0.215919}, > {0.888477,0.300013,0.576363,1.,0.215919,0.106961},{0.466809,- > 0.191847,0.106961,0.215919,1.,-0.862483},{-0.191847,0.466809,0.215919,0.106961,- > 0.862483,1.}} > Out[1445]= {7.83599,7.83599,7.14659,7.14659,5.55663,5.55663} > Out[1446]= 1.17684*10^-14 > --------------- > > Now, I have two questions. > 1) Does mathematica version 8.0 give three similar results of the second part? > I don't have version 8.0. Before upgrading it (if possible), I wonder that. > > 2) Is there any other way to get the proper results with version 6.0 or 7.01? > > Thank you. > > > Regards, > Myeong Ae Kang > > ================================= > > Darren Glosemeyer<darreng at wolfram.com> wrote: > >> On 11/17/2010 4:29 AM, Myeong Ae Kang wrote: >>> Hello. >>> >>> I have a problem, serious and urgent. >>> >>> I got wrong results of CDF of MultinormalDistribution. >>> >>> I'm calculating (1.0 -CDF[MultinormalDistribution[{0, 0,0}, {{1, 0.559887, >>> 0.289988}, {0.559887, 1,0.880157}, {0.289988, 0.880157, 1}}], >>> {4.5,4.5,6.9958}]). >>> The results is smaller than (1.0 -CDF[MultinormalDistribution[{0,0}, {{1, >>> 0.559887}, {0.559887, 1}}], {4.5,4.5}]). It should be larger. >>> >>> Actually, I have to calculate more than 6 to 12 random variables (12 x 12 >> sigma >>> matrix.). >>> >>> Mathematica 6.0 and 7.0.1 gave me the same wrong result. >>> >>> I compared this with the result of another system. >>> That system gave the expected value. >>> >>> To fix this problem, what should I do? >>> >>> >>> Regards, >>> Myeong Ae Kang >>> >>> >>> >> >> This is addressed in version 8. Here are the results: >> >> >> >> In[1]:= threeD = (1.0 - >> CDF[MultinormalDistribution[{0, 0, >> 0}, {{1, 0.559887, 0.289988}, {0.559887, 1, >> 0.880157}, {0.289988, 0.880157, 1}}], {4.5, 4.5, 6.9958}]) >> >> -6 >> Out[1]= 6.75155 10 >> >> In[2]:= twoD = (1.0 - >> CDF[MultinormalDistribution[{0, >> 0}, {{1, 0.559887}, {0.559887, 1}}], {4.5, 4.5}]) >> >> -6 >> Out[2]= 6.75155 10 >> >> In[3]:= threeD> twoD >> >> Out[3]= True >> >> >> The difference is at the level of machine precision numerical error: >> >> >> In[4]:= threeD - twoD >> >> -16 >> Out[4]= 1.11022 10 >> >> >> >> Note that if we artificially increase the precision as in the example >> below, we get a difference at the same order of magnitude, so the >> precision in the difference above is as good as could be expected (and >> in fact probably better than should be expected) at machine precision. >> >> >> In[5]:= CDF[MultinormalDistribution[{0, 0, 0}, >> Rationalize[{{1, 0.559887, 0.289988}, {0.559887, 1, >> 0.880157}, {0.289988, 0.880157, 1}}, 0]], >> SetPrecision[{4.5, 4.5, 6.9958}, 50]] >> >> Out[5]= 0.999993248446123977600817893853727681422304494350 >> >> In[6]:= CDF[MultinormalDistribution[{0, 0}, >> Rationalize[{{1, 0.559887}, {0.559887, 1}}, 0]], >> SetPrecision[{4.5, 4.5}, 50]] >> >> Out[6]= 0.999993248446124136306526964076086153533229885505 >> >> In[7]:= % - %% >> >> -16 >> Out[7]= 1.5870570907022235847211092539116 10 >> >> >> Darren Glosemeyer >> Wolfram Research >> The results for these 6-dimensional cases are roughly the same in version 8 as they are in version 7. There are, however, ways to get more precise results in version 8 which will be shown below. We will look into how these can be incorporated directly into CDF for the future. There are some important issues to keep in mind with the computation of CDFs so far out in the tails of higher dimenional normal. For 2 and 3 dimensional normals, special case code is now used in version 8 which gets more precise results than in version 7 in your examples. In higher dimensions the methodology requires an (n-1)-dimensional numeric integral. The results obtained by CDF in the 6-dimensional examples sent are good to about 7 or 8 digits, however they are nearly 1 and so subtracting them from 1 cancels most, if not all, of the significant digits. The following approach to computing the values you are after was provided by my colleague Sasha Pavlyk and uses the new SurvivalFunction and MarginalDistribution functions from version 8 to effectively remove pieces of probability outside the CDF region. The idea is to shave off portions of the space by using lower dimensional marginal survival functions. These would include marginals of dimension 1 through 5, but the contributions from the higher dimensional margins will be much less than those from the lower dimensional marginal this far out in the tails of the distributions, so some higher dimensional marginals can be ignored. Here are the covariance matrices and end points sent: In[1]:= TT = {{1, 0.576363, 0.300013, 0.888477, 0.106961, 0.215919}, {0.576363, 1, 0.888477, 0.300013, 0.215919, 0.106961}, {0.300013, 0.888477, 1, -9.72249*10-6, 0.466809, -0.191847}, {0.888477, 0.300013, -9.72249*10-6, 1, -0.191847, 0.466809}, {0.106961, 0.215919, 0.466809, -0.191847, 1, -0.862483}, {0.215919, 0.106961, -0.191847, 0.466809, -0.862483, 1}}; In[2]:= TTbnd = {7.14659, 7.14659, 7.83599, 7.83599, 5.55663, 5.55663}; In[3]:= TTp = {{0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 0, 0, 0, 0, 0}}.TT.{{0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 0, 0, 0, 0, 0}}; In[4]:= Tbnd = TTbnd.Transpose[{{0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {1, 0, 0, 0, 0, 0}}]; In[5]:= TTp2 = {{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}}.TT.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}}]; In[6]:= Tbnd2 = TTbnd.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}}]; In[7]:= TTp3 = {{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}}.TT.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}}]; In[8]:= Tbnd3 = TTbnd.Transpose[{{0, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 0}, {0, 0, 0, 0, 0, 1}}]; GDiff is a generalized differencing for higher dimensions: In[9]:= GDiff[f_, {x_, a_, b_}, specs___] := GDiff[(f /. x -> a) - (f /. x -> b), specs] In[10]:= GDiff[f_] := f This defines the computation of 1-CDF incorporating marginal survival functions: In[11]:= Clear[OneMinusCDF]; In[12]:= OneMinusCDF[dist_, xvec_, cutoffdim_: 3] := Module[{res, len = Length[xvec], x, sf}, res = 1 - GDiff[sf[Array[x, len]], Apply[Sequence, Thread[{Array[x, len], ConstantArray[-Infinity, len], xvec}]]]; res = res /. sf[{(-Infinity) ..}] :> 1; (* remove survival functions in more than 3 dimensions as small *) res = res /. {sf[arg_] /; len - Count[arg, -Infinity] > cutoffdim :> 0}; res /. {sf[arg_] :> SurvivalFunction[ MarginalDistribution[dist, Cases[Transpose[{Range[len], arg}], {n_, Except[-Infinity]} :> n]], With[{y = DeleteCases[arg, -Infinity]}, If[Length[y] == 1, First[y], y]]]} ] Using marginals up to dimension 3, we get consistent results for the three cases: In[13]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], Tbnd, 3] // Timing -8 Out[13]= {0.375, 2.75042 10 } In[14]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp2], Tbnd2, 3] // Timing -8 Out[14]= {0.36, 2.75042 10 } In[15]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp3], Tbnd3, 3] // Timing -8 Out[15]= {0.344, 2.75042 10 } We are actually far enough out in the tails of the distribution that using only 1D marginals gives the same result up to the number of digits shown. In[16]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], Tbnd, 1] // Timing -8 Out[16]= {0.015, 2.75042 10 } There is some difference out beyond the digits shown: In[17]:= OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], Tbnd, 3] - OneMinusCDF[MultinormalDistribution[{0, 0, 0, 0, 0, 0}, TTp], Tbnd, 1] -15 Out[17]= -2.2018 10 Darren Glosemeyer Wolfram Research