MathGroup Archive: October 2004 [00788]

[Date Index] [Thread Index] [Author Index]

Re: bimodal ditribution form counting signs of Pi digits differences

To: mathgroup at smc.vnet.net
Subject: [mg51747] Re: bimodal ditribution form counting signs of Pi digits differences
From: Bill Rowe <readnewsciv at earthlink.net>
Date: Sun, 31 Oct 2004 01:17:20 -0500 (EST)
Sender: owner-wri-mathgroup at wolfram.com

On 10/30/04 at 3:49 AM, tftn at earthlink.net (Roger Bagula) wrote:

>Null hypothesis: the digits of Pi are random. To check make up a
>random set of base 10 digits using Mathematica using:
>Random[Integer,{0,9}]
>I picked a seed off the top of my head. The result is a revolting
>development. A trimodal distribution of noise pushed way positive.
>The Pi digits behave more like an "ideal" probablity than the
>Mathematica random! Anybody see what I did wrong? Dr. Bob you
>always have an opinion, ha, ha...

>(* random digit array base 10*)
>SeedRandom[123]
>a=Table[Random[Integer,{0,9}],{n,1,2000}];
>b=Table[Sum[Sign[a[[m+1]]-a[[m]]],{m,1,n}],{n,1,Dimensions[a][[1]]-1}];

Your expression for b is rather convoluted to me but after looking at it, I see you are computing the cumulative sum of the Sign of two pairs of discrete uniform deviates. 

An expression that computes the same result which should run much faster is

Rest@FoldList[Plus, 0, Sign@ListConvolve[{1,-1},a]]

>ListPlot[b,PlotJoined->True]
>(* distribution of the noise that results*)
>Max[b]
>Min[b]
>c=Table[Count[b,m],{m,Min[b]-1,Max[b]+1}]

a faster expression for c is

Flatten@{0,Length/@Split[Sort@b], 0}

>ListPlot[c,PlotJoined->True]

But other than suggesting ways to speed up your code, I don't know what you are doing "wrong" since I cannot determine what it is you expect nor what you think this code as to do with testing the null hypothesis you posted.

What distribution are you expecting for b?

When you take the sign of pairs of randomly chosen digits from 0 to 9, you clearly map the random numbers Mathematica generates to {-1,0,1}. I would expect the frequency of -1 to be roughly the same as 1 and higher than the frequency of 0. Given this, I would think the distribution of cumulative sums to be bimodal. That is I would expect runs of -1 to occur roughly as often as runs of 1 in b and runs of 0 to be relatively rare. Note, I've not taken the time to rigorously derive the distribution for b. This is simply my rough guess as to what should be expected

Using your seed but increasing the number of samples significantly, i.e.,

SeedRandom[123]; 
a = Table[Random[Integer,  {0, 9}], {n, 200000}]; 
ListPlot[b = Rest[FoldList[Plus, 0, 
              Sign[ListConvolve[{1, -1}, a]]]],
         PlotJoined -> True, Frame -> True, Axes -> None]; 
d = ({First[#1], Length[#1]}&)/@Split[Sort[b]]; 
ListPlot[d, PlotJoined ->True, Frame ->True, Axes -> None]; 

My d here is {x,y} pairs where x is a an element of b and y is the number of times x occurs in b, i.e., ListPlot[d] generates a histogram of b. My d is essentially a rescaled version of your c except I've not force the begining and end values to 0 as you do for your c.

Looking at this last plot, it does appear the distribution of b is bimodal giving me some confidence my rough guess above is correct.
--
To reply via email subtract one hundred and four

Prev by Date: Re: Regression with missing values

Next by Date: Re: Mathematica's (and others) ancient widget toolkit ... why?

Previous by thread: Re: bimodal ditribution form counting signs of Pi digits differences

Next by thread: An Integration bug?