MathGroup Archive 1995

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Random[BinomialDistribution[..]] wrong ?

  • To: mathgroup at christensen.cybernetics.net
  • Subject: [mg1089] Re: Random[BinomialDistribution[..]] wrong ?
  • From: withoff (David Withoff)
  • Date: Sun, 14 May 1995 20:51:30 -0400
  • Organization: Wolfram Research, Inc.

In article <3opku3$7h4 at news0.cybernetics.net> rubin at msu.edu (Paul A. Rubin) writes:
>In article <3olugl$ql4 at news0.cybernetics.net>,
>   bzbkowal at ZIB-Berlin.DE (Axel Kowald) wrote:
>->I'm using Mathematica 2.2 on a Macintosh and I got a problem generating 
>binomially
>->distributed random numbers.
>->
>->I have:
>->
>->Needs["Statistics`DiscreteDistributions`"]
>->SeedRandom[17]
>->Table[Random[BinomialDistribution[10^6,0.01]],{5}]
>->Out-> {998486, 998486, 998486, 998486, 998486}
>->
>->Not only that the random numbers are all the same, most importantly they 
>are in a 
>->completely wrong range. One should get something around 10000 !
>->The problem seems to be 10^6, if I use something smaller, like 10000, 
>everything works
>->fine. Furthermore the problem seems to be in the Random[] function since
>->PDF[BinomialDistribution[10^6,0.01],n] works okay (but takes a long 
>time).
>->
>->Any comments, hints, solutions ?
>
>I pasted your input statements into version 2.2.2 for Windows and got 
>something different (and a lot more reasonable):
>
>  {9996, 9954, 9854, 10107, 9837}
>
>If another Mac user can reproduce the problem, then I would suspect it to 
>be a bug in the Mac kernel.
>
>Paul

This is an example of a previously unknown (but not terribly surprising)
Macintosh specific problem with machine-arithmetic evaluation of the
BetaRegularized function.  Random numbers with a binomial distribution
are computed by computing binomial quantiles of uniformly distributed
random numbers.  Binomial quantiles are computed by interval-halving
with the cumulative density function, and the cumulative density function
is evaluated using BetaRegularized.

(For more details, see the code in the Statistics`DiscreteDistributions`
package.  The problem can be traced to BetaRegularized with ordinary
debugging methods.)

For example, here are some expected (incorrect) results on a
non-power-pc Macintosh:

In[7]:= CDF[BinomialDistribution[10^6, .01], 9000]

                       17383
Out[7]= 5.5245656595 10

In[8]:= BetaRegularized[.99, 991000, 9001]

                       17383
Out[8]= 5.5245656595 10

The universal suggestion in case of difficulty with machine-arithmetic
evaluation of special functions is to use variable-precision arithmetic.
The idiosyncracies of machine arithmetic make evaluation of special
functions very difficult, and as a practical matter it may be impossible
to get reliable results for all of the special functions of Mathematica
using machine arithmetic alone.  This is one of the main reasons that
variable-precision arithemtic is included in Mathematica.

You can get good results, even on a non-power-pc Macintosh, for the
above example using variable-precision arithmetic.

In[9]:= N[BetaRegularized[99/100, 991000, 9001], 30]

                      -25
Out[9]= 8.350415917 10

To roughly ensure that the accuracy of the cumulative density
function is beyond the granularity of the 1000000 steps in
CDF[BinomialDistribution[10^6,0.01], k], it is necessary for the
accuracy of the CDF calculation to be at least 10 digits.  (This
isn't quite right, but it is close enough for the present discussion.)

Here is a short example involving the use of a rule for BetaRegularized
that causes the precision to be increased until it gets a result
with an accuracy of at least 10 digits.

In[1]:= Needs["Statistics`DiscreteDistributions`"]

In[2]:= Unprotect[BetaRegularized]

Out[2]= {BetaRegularized}

In[3]:= BetaRegularized[z_?MachineNumberQ, a_Integer, b_Integer] :=
            Module[{prec, result},
                prec = 20;
                Print[{z, a, b}, " : trying Precision = ", prec];
                While[prec <= $MaxPrecision,
                    result = BetaRegularized[SetPrecision[z, prec], a, b];
                    If[Accuracy[result] >= 10, Return[result]];
                    prec += 40;
                    Print["  Accuracy[result] = ", Accuracy[result],
                               " : trying Precision = ", prec]
                ];
                $Failed
            ]

In[4]:= Protect[BetaRegularized]

Out[4]= {BetaRegularized}

In[5]:= Random[BinomialDistribution[10^6,0.01]]
{0.99, 500000, 500001} : trying Precision = 20
{0.99, 750000, 250001} : trying Precision = 20
{0.99, 875000, 125001} : trying Precision = 20
{0.99, 937500, 62501} : trying Precision = 20
{0.99, 968750, 31251} : trying Precision = 20
{0.99, 984375, 15626} : trying Precision = 20
{0.99, 992188, 7813} : trying Precision = 20
{0.99, 988282, 11719} : trying Precision = 20
{0.99, 990235, 9766} : trying Precision = 20
  Accuracy[result] = 6 : trying Precision = 60
  Accuracy[result] = 9 : trying Precision = 100
{0.99, 989259, 10742} : trying Precision = 20
{0.99, 989747, 10254} : trying Precision = 20
  Accuracy[result] = 7 : trying Precision = 60
{0.99, 989991, 10010} : trying Precision = 20
  Accuracy[result] = 0 : trying Precision = 60
  Accuracy[result] = 4 : trying Precision = 100
  Accuracy[result] = 7 : trying Precision = 140
{0.99, 989869, 10132} : trying Precision = 20
  Accuracy[result] = 5 : trying Precision = 60
  Accuracy[result] = 9 : trying Precision = 100
{0.99, 989930, 10071} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 6 : trying Precision = 100
{0.99, 989961, 10040} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 5 : trying Precision = 100
  Accuracy[result] = 9 : trying Precision = 140
{0.99, 989976, 10025} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 4 : trying Precision = 100
  Accuracy[result] = 7 : trying Precision = 140
{0.99, 989969, 10032} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 5 : trying Precision = 100
  Accuracy[result] = 9 : trying Precision = 140
{0.99, 989965, 10036} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 5 : trying Precision = 100
  Accuracy[result] = 9 : trying Precision = 140
{0.99, 989963, 10038} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 5 : trying Precision = 100
  Accuracy[result] = 9 : trying Precision = 140
{0.99, 989962, 10039} : trying Precision = 20
  Accuracy[result] = 2 : trying Precision = 60
  Accuracy[result] = 5 : trying Precision = 100
  Accuracy[result] = 9 : trying Precision = 140

Out[5]= 10039

As you can see, calculation of even a single random number for
BinomialDistribution[1000000, .01] can be something of an ordeal.

This analysis is crude, and is intended only to be suggestive
of what is going on, rather than a complete solution.  There
are many optimizations that could be added.  For example, the
cumulative density function of the binomial distribution uses the
general BetaRegularized function, which is not optimized for the
arguments that arise in this calculation.

A stickier problem lurking the the background is that when the
resolution of the arithmetic is comparible to the distance between
steps in the cumulative density function, then the arithmetic
will distort the distribution of random numbers.  For
BinomialDistribution[n, p], these steps are very close together,
especially in the tails of the distribution.

The most common way of dealing with this problem that I have seen
is to approximate CDF[BinomialDistribution[n, p], k] for large
values of n with something that is easier to calculate.  Those
large-n formulas that you see in elementary statistics books are
useful not only because they make it easier to find things in common
tables, but because they are easier to evaluate numerically.  So
my real recommendation, rather than crank up the precision for
BetaRegularized, is to find a large-n approximation for quantiles
of the binomial distribution.

Dave Withoff
Research and Development
Wolfram Research


  • Prev by Date: Re: Function with attributes Flat and OneIdentity
  • Next by Date: Re: path
  • Previous by thread: Re: Random[BinomialDistribution[..]] wrong ?
  • Next by thread: In[1]:= Question