MathGroup Archive 1998

[Date Index] [Thread Index] [Author Index]

Search the Archive

Bivariate Normal Distributions -- can they be estimated in my lifetime?

Dear MathGroup,
I have given up on several weeks of work trying to do maximum likelihood
estimation of a bivariate probit model (ie two binary decisions that
are correlated). The likelihood function itself is easy to construct,
and in principle should be easy to maximise (using FindMinimum[] on its
negative) because the PDF of the bivariate normal distribution is

However, it appears impossible to do this on a desktop computer. I am
fairly certain my code is correct, but it refuses to produce an answer
on even a toy problem (20 observations and 2 independent variables and
therefore 7 parameters to be estimated) no matter how long I give it. 
Since I want to use this procedure on a 53,000 observation dataset with
about 20-30 independent variables, it seems unlikely that I will ever
get this to work. 

I have encountered several problems
1.  CDF[MultinormalDistribution[{0,0},{{1,rho},{rho,1}}],{x,y}] 
generates very small imaginary parts when rho is negative. The
definition of the likelihood function for this model guarantees at
least some of its elements will have negative covariances.  Sure, I can
wrap the whole thing in Chop[], but frankly I find it peculiar that a
density function should do this.

2. The CDF of the multinormal takes too darn long to evaluate. Ok, I am
using a PII 266 not a Unix box or a PowerMac G3, but at work you have
to take what you're given.  Nonetheless, if individual realisations of
the CDF take 2-5 seconds to evaluate, the likelihood function must be
taking multiple minutes to evaluate even once, and would probably take
hours or days with the full data set.  I don't have that kind of time
(this job finishes next Tuesday. I have been struggling with this
problem for about two months). 

3. The CDF of the multinormal takes even longer than usual to evaluate
if rho is negative.  I tried to make up for this by substituting the
actual numerical double-integral of the PDF in those cases (which for
some reason evaluates faster those cases, but slower than the CDF for
positive rho). But this wasn't enough of a speed up and I still don't
get an answer.

Before anyone responds with suggestions to solve the first-order
conditions instead of maximise the likelihood, yes, the first-order
conditions are analytic (see WH Greene "Econometric Analysis" page 661)
but they take even longer to evaluate because they involve PDFs divided
by CDFs and other nonsense.

Chris Farr has previously suggested using a discrete approximation of
the CDF, but I am concerned that the approximation error would get too
large, making my results unreliable.

I notice that Wolfram has recently put a "Mathematica and Statistics"
part on its web site (the pallette is cool, by the way!).  But in my
experience, much as I love this package, it's just not ready for
prime-time professional econometrics / statistics.  Gauss and Stata can
both do this -- why can't Mathematica?  Perhaps it's time to roll some
of the statistical packages into the kernel or otherwise speed them up?
I accept that a general package like Mathematica is going to be slower
than a specialised package like Stata or Shazam -- but the margin needs
to be smaller if Mathematica is to be usable in this context.

This post is probably more in the "rant" category than the usual pleas
for assistance.  But if there is anyone out there with any ideas on how
I should proceed from here, I'd be grateful!

________________________________________________________ Luci Ellis     
ph: 61-2-9551-8786

Manager, Research                 fax:61-2-9551-8024  Payments System
Efficiency        em: ellisl at Payments Policy Department     
GPO Box 3947 Reserve Bank of Australia         Sydney 2001 Australia

  • Prev by Date: Re: Single character in Italics. Ervin Doyle;
  • Next by Date: Re: How to transpose vector?
  • Previous by thread: Generating C++BLAS wrapers with Mathematica
  • Next by thread: Re: Bivariate Normal Distributions -- can they be estimated in my lifetime?