MathGroup Archive 2007


Re: T Copula Calibration

  • To: mathgroup at smc.vnet.net
  • Subject: [mg78855] Re: T Copula Calibration
  • From: dwstrickler at tx.rr.com
  • Date: Wed, 11 Jul 2007 06:11:56 -0400 (EDT)
  • References: <200707030944.FAA18997@smc.vnet.net>

On Jul 8, 5:18 am, Carl Woll <c... at wolfram.com> wrote:
> dwstrick... at tx.rr.com wrote:
> >On Jul 4, 4:32 am, Carl Woll <c... at wolfram.com> wrote:
>
> >>dwstrick... at tx.rr.com wrote:
>
> >>>All,
>
> >>>In the interest of full disclosure and genuine humility, I don't code
> >>>in Mathematica so much as I wrestle it into submission from time to
> >>>time. Accordingly, I've come up against an optimization problem in
> >>>which Mathematica flatly refuses to see things my way. In a nutshell,
> >>>I have an n x 3 table of interdependent log-delta price data (to the
> >>>extent it matters, I have a corresponding table of alpha-Stable
> >>>standardized log-delta price data), and I'm trying to construct a
> >>>simulation algorithm that will effectuate a Student T copula
> >>>relationship between the three variables. (FWIW, I've managed to
> >>>create Clayton and Gumbel copula RNGs for similar data; they work
> >>>fine).
>
> >>>The problem is the Student T copula density function, and in
> >>>particular, the Student T DOF parameter, v. I've tried NMinimize,
> >>>FindMinimum, and FindRoot, and I simply cannot get Mathematica to return a
> >>>numerical value for v. In fact, the problem is broader than that - I
> >>>can't get Mathematica to optimize ANY function that contains a Sum or
> >>>Product
> >>>term - but I digress.
>
> >>>In terms of the density function, here's an example of what I've
> >>>tried:
>
> >>>tCopulaPDF[y__,v_,R_]:=Module[{dims,ret},dims=Length[Transpose[y]];ret=((Gamma
> >>>[(v+dims)/2]*(Gamma[v/2])^dims-1)/(Gamma[(v+1)/
> >>>2]^dims)*(Det[R])^1/2)*NProduct[(1+
> >>>(y[[i]]^2/v))^v+1/2,{i,1,dims}]*(1+y.R^-1*y/v)^-v+dims/2;Return[ret]];
>
> >>                                                          ^^^^^^
> >>It's unnecessary to use Return here. Module automatically returns the
> >>last expression.
>
> >>Without creating data to test your code, I suspect that the problem is
> >>that NMinimize attempts to evaluate tCopulaPDF with a symbolic v. If so,
> >>the simple workaround is to use
>
> >>tCopulaPDF[y_, v_?NumberQ, R_] := ...
>
> >>If the above doesn't solve your problem, make sure that
>
> >>tCopulaPDF[data, 1, corrmat]
>
> >>evaluates to a number.
>
> >>Another possible issue is the use of R^-1. Did you mean to use
> >>Inverse[R] here instead?
>
> >>Carl Woll
> >>Wolfram Research
>
> >>>where y is the n x 3 table, v is the DOF parameter, and R is the
> >>>correlation matrix for y. Gamma[] is Mathematica's built-in Euler Gamma
> >>>function.
>
> >>>Then I try to optimize with something like:
>
> >>>NMinimize[{tCopulaPDF[data,v,corrmat],v>0},v]
>
> >>>No luck. Mathematica gives me the standard NAN warning, and then
> >>>returns the
> >>>function unevaluated.
>
> >>>Thinking the source of the problem might be the NProduct term, I also
> >>>tried expressing the density function a different way:
>
> >>>tCopulaPDF[y__,v_,R_]:=Module[{dims,ret,cols,a,b,c},dims=Length[Transpose[y]];cols=y/.
> >>>{a_,b_,c_}:>Apply[Times,(1+a^2/v)^v+1/2]+Apply[Times,(b^2/v)^v
> >>>+1/2]+Apply[Times,(c^2/v)^v
> >>>+1/2];ret=((Gamma[(v+dims)/2]*(Gamma[v/2])^dims-1)/(Gamma[(v+1)/
> >>>2]^dims)*(Det[R])^1/2)
> >>>*cols*(1+y.R^-1*y/v)^-v+dims/2;Return[ret]];
>
> >>>Same result. Any suggestions? I apologize in advance if the solution
> >>>is blindingly obvious to everyone but me.
>
> >Carl,
>
> >Thanks for your suggestions.  I tried the NumberQ and Inverse[R]
> >fixes, but these still didn't allow me to optimize.  In accordance
> >with your suggestion, I did however make sure that the tCopulaPDF
> >function itself was working.  When I input:
>
> >dims = Length[Transpose[xdata]]; tCopulaPDF =
> >   (((Gamma[(v + dims)/2]*Gamma[v/2]^
> >        (dims - 1))/(Gamma[(v + 1)/2]^dims*
> >       Det[R]^(1/2)))*Product[(1 + y^2/v)^
> >       ((v + 1)/2), {i, 1, dims}])/
> >    (1 + y . R^(-1)*(y/v))^((v + dims)/2);
> >  test = tCopulaPDF /. {y -> xdata, v -> 2,
> >     R -> xcorrmat}
>
> >I get a clean, n * 3 matrix of real numbers, albeit with values that
> >are 3-5 times too large.  If I input:
>
> Neither FindMinimum nor NMinimize support minimizing a matrix. As I said
> before, your function needs to be an expression that evaluates to a
> *number* and not to a matrix when v is given a value. This is why you
> are getting the not a number error message. The reason test is a matrix
> is because the subexpression
>
> (1 + y . R^(-1) * (y/v))
>
> is a matrix. You need to change this subexpression into something that
> evaluates to a number.
>
> Carl Woll
> Wolfram Research
>
> >test2 = tCopulaPDF /. {y -> xdata, R -> xcorrmat}
>
> >the result simplifies nicely around the unspecified v parameter.
> >Trial-and-error confirms that increasing the DOF parameter (v)
> >decreases the values that are returned, but there's still the matter
> >of calibration.  When I try:
>
> >FindMinimum[test, {v, 5}], or
> >NMinimize[{test, v > 5}, v]
>
> >I get the familiar error message:
>
> >NMinimize::nnum: The function value {<<1>>} is not a number at {v} =
> >{5.063707273968251`}.  More . . .
>
> >To the extent it helps, here's a 6 * 3 sample from the standardized
> >data set:
>
> >xtrunc = {{0.19067, 0.269248, 0.248315}, {0.857068,
> >     0.371992, 0.943073}, {0.714966, 0.908612, 0.889339}, {0.255969,
> >     0.791845, 0.790311}, {0.693816, 0.988873, 0.560541}, {0.247043,
> >     0.384309, 0.196687}};
>
> >And here's the correlation matrix:
>
> >xcorrmat = {{1, .608018033570726, .759864730700654}, {.
> >608018033570726, 1, .5916085946121}, {.759864730700654, .
> >5916085946121, 1}};
>
> >Any ideas?  BTW, thanks again for your help.

Thanks, Carl.  I finally got this thing to work.  For the benefit of
any reader who may be struggling with calibrating a multivariate
Student T copula, here's how I did it:

1.  Begin with an n-dimensional time series of log-delta values.
2.  Convert the values in each dimension to probabilities using the
CDF of the distribution that best fits the underlying values.  (The
beauty of modeling dependence with copulae is that they support any
marginal distribution the data will support; because of the
flexibility and long-tail characteristics of the alpha-Stable
distribution, I typically use John Nolan's excellent alpha-Stable
Mathematica add-in for this step.)  A short sketch of this step
follows.
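
Here is a minimal sketch of step 2.  The names are assumptions for
illustration: "logdelta" stands for the raw n-dimensional log-delta
matrix (rows = observations, columns = dimensions), and a standard
normal marginal stands in for the fitted marginals purely to show the
mechanics; substitute the CDF of whatever distribution actually fits
each column (e.g. the fitted alpha-Stable):

Needs["Statistics`ContinuousDistributions`"];

(* udata: CDF-transform of each column; the Gaussian here is only a
   stand-in for your fitted marginal distributions *)
udata = Transpose[
    Table[CDF[NormalDistribution[0, 1], #] & /@ logdelta[[All, i]],
      {i, Length[First[logdelta]]}]];
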
3.  Calculate Kendall's tau for each bivariate margin (the Mathematica
function is KendallRankCorrelation[ ]).  (N.B.: If your data set is
lengthy, computing each bivariate Kendall's tau may take a minute or
longer.)  When you finish, you should have an n-by-n matrix of tau
values with ones on the diagonal.
4.  Given the relationship between Kendall's tau and elliptical
copulae (Gaussian as well as T; see Demarta, S. & McNeil, A., "The t
Copula and Related Copulas"), convert each tau in the Kendall's tau
matrix to a correlation via rho = Sin[(Pi/2)*tau].  Assuming your
Kendall's tau matrix is called "ktau," you can enter
Map[Sin[Pi/2*#]&, ktau].  Call this new matrix R.  A short sketch of
steps 3 and 4 follows.
5.  Next, specify the multivariate T copula density function.  In
accordance with Carl's guidance, I modified the return portion so
that the function evaluates to a single number rather than a matrix:
it now returns the Trace of the rectangular matrix generated by the T
copula density.  We then take the log of that function:

g[y_, v_, R_] := Module[{dims, mat},
    dims = Length[Transpose[y]];
    (* note: R^(-1) is the elementwise reciprocal of R; per Carl's
       earlier comment, Inverse[R] would give the matrix inverse *)
    mat = (((Gamma[(v + dims)/2]*Gamma[v/2]^(dims - 1))/
           (Gamma[(v + 1)/2]^dims*Det[R]^(1/2)))*
          Product[(1 + y^2/v)^((v + 1)/2), {i, 1, dims}])/
        (1 + y . R^(-1)*(y/v))^((v + dims)/2);
    Tr[mat]];

g2[y_, v_, R_] := Log[g[y, v, R]];
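
Per Carl's advice above, it's worth confirming that the objective
evaluates to a single number (not a matrix) for a numeric v before
handing it to NMaximize; for instance, with the sample data xtrunc and
correlation matrix xcorrmat quoted earlier in this thread:

g2[xtrunc, 2., xcorrmat]   (* should return a real number *)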

6.  Now we're ready to calibrate.  Taking the rectangular matrix input
y from step 2 and the square matrix input R from step 4, the
NMaximize[ ] call below returns the maximized value of Log[Tr[mat]]
together with a replacement rule for the DOF parameter, v.  We then
apply that rule to isolate the numerical value of v:

tDOF[y_, v_, R_] := Module[{int},
    int = NMaximize[{g2[y, v, R], v > 0}, v];
    v /. int[[2]]];
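
For example (using the assumed names from the sketches above; v must
be passed in as an unassigned symbol so that NMaximize can treat it as
the optimization variable):

Clear[v];
vhat = tDOF[udata, v, R]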

7.  There are a number of ways to test the calibrated DOF value; one
is simulation.  This is the multivariate T copula RNG that I use:

Needs["Statistics`ContinuousDistributions`"];

tCopulaSim[corrmat_, v_, len_] :=
  Module[{A, dims, srn, srnT, x2rn, x2rnT, y, x, u},
    A = CholeskyDecomposition[corrmat]; dims = Length[A];
    srn := Random[NormalDistribution[0, 1]];      (* standard normal draw *)
    srnT = Table[srn, {i, len}, {j, dims}];
    x2rn := Random[ChiSquareDistribution[v]];     (* chi-square draw *)
    x2rnT = Table[x2rn, {i, len}, {j, dims}];
    y = srnT . A;                                 (* impose the correlation *)
    x = y*Sqrt[v/x2rnT];
    u = CDF[StudentTDistribution[v], x];          (* map to uniform margins *)
    u];

where corrmat is the original correlation matrix for your
n-dimensional log-delta time series, v is the calibrated DOF parameter
from step 6, and len is the number of rows of quasi-random values you
wish to create.
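
For example, to generate 1000 triples of copula-distributed uniforms
with the calibrated DOF (vhat and the sample xcorrmat are the assumed
names used above):

usim = tCopulaSim[xcorrmat, vhat, 1000];
Dimensions[usim]   (* {1000, 3} *)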

After running a simulation with the calibrated DOF value and the T
copula RNG above, I ran (among other tests) the Mann-Whitney test on
the first column of the (CDF-transformed) actual time series data
versus the first column of the T copula-simulated values.  This
yielded a test statistic of 1.824*10^6 and a two-sided P-value of
0.576258, a good outcome, since the two arrays should reflect
considerable similarity.


