MathGroup Archive: November 2011 [00055]

Re: Problems with DistributionFitTest

To: mathgroup at smc.vnet.net
Subject: [mg122616] Re: Problems with DistributionFitTest
From: Andy Ross <andyr at wolfram.com>
Date: Thu, 3 Nov 2011 03:47:16 -0500 (EST)
Delivered-to: l-mathgroup@mail-archive0.wolfram.com
References: <201111010502.AAA14754@smc.vnet.net> <201111021123.GAA03608@smc.vnet.net> <4EB26A05.813B.006A.0@newcastle.edu.au>

I agree with your clarification completely.  However, in this case, 
Felipe was clearly sampling from the null distribution and was asking 
why the p-value was unexpectedly small, not how to interpret p-values in 
general.

Here is an empirical result that demonstrates what I was claiming.

Set up a sample.

In[25]:= data1 =
   BlockRandom[SeedRandom[94];
    RandomVariate[NormalDistribution[], 10000]];

We get a p-value near 0.06 ("TestData" returns the test statistic followed by the p-value).

In[26]:= DistributionFitTest[data1, Automatic, "TestData"]

Out[26]= {0.121449, 0.0568874}

What proportion of test statistics are more extreme?

In[27]:= Count[
   Table[DistributionFitTest[
     RandomVariate[NormalDistribution[], 10000], Automatic,
     "TestStatistic"], {1000}],
   x_ /; x >=
     DistributionFitTest[data1, Automatic, "TestStatistic"]]/1000.

Out[27]= 0.06

I will also point out that I claimed that the p-value follows a
UniformDistribution[{0,1}].  This too is only true in the context of the
problem Felipe posed, where the null hypothesis actually holds.  In
general, if a p-value also had a standard uniform distribution under the
alternative, the test's power would equal its size and the test would be
useless.  Under a general alternative we would hope for a right-skewed
distribution, with the p-values concentrated near 0.
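
As a rough sketch of both claims (not part of the session above; the sample size of 200, the 500 replications, and the StudentTDistribution[3] alternative are arbitrary choices made only for illustration), one can simulate p-values under the null and under an alternative and compare how often each falls at or below 0.05:

```mathematica
(* Simulate p-values from DistributionFitTest. The values of n and reps
   and the StudentTDistribution[3] alternative are illustrative assumptions. *)
n = 200; reps = 500;

(* p-values when the data really are normal, i.e. the null is true *)
pNull = Table[
   DistributionFitTest[RandomVariate[NormalDistribution[], n]], {reps}];

(* p-values when the data come from a heavier-tailed alternative *)
pAlt = Table[
   DistributionFitTest[RandomVariate[StudentTDistribution[3], n]], {reps}];

(* Rejection rates at the 5% level: the first estimates the size,
   the second estimates the power against this particular alternative *)
N[{Count[pNull, p_ /; p <= 0.05], Count[pAlt, p_ /; p <= 0.05]}/reps]

(* Overlay the two p-value distributions in bins of width 0.05 *)
Histogram[{pNull, pAlt}, {0, 1, 0.05}, "Probability"]
```

If the reasoning holds, the first rate should land somewhere near 0.05 with a roughly flat histogram, while the second should be noticeably larger, with its histogram piled up near 0. The exact numbers will vary from run to run.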

Andy Ross
Wolfram Research

On 11/2/2011 6:16 PM, Barrie Stokes wrote:
> Hi Felipe
>
> Can I beg to make a small clarification to Andy's response?
>
> The whole idea of p values and rejection of the Null Hypothesis continues to be one in which people get tangled up in logical and linguistic knots.
>
> An observed p value does *not* allow one to make a *general* claim like "about 3% of the time you can expect to get a test statistic like the one you obtained or one even more extreme".
>
> Given the context of this p value, its value being 0.0312946, i.e., less than 0.05, allows a frequentist-classical statistician to say that, *on this occasion*, this observed p value enables me to reject the Null Hypothesis (which is that the data are Gaussian) at the 5% significance level, or some such equivalent phrase.
>
> The important thing here is that, *by construction*, p values are equal to or less than 0.05 precisely 5% of the time *when the Null Hypothesis holds, i.e., is in fact true*, or "under the Null Hypothesis", as it's usually phrased.
>
> When one rejects the Null Hypothesis (having obtained a p value <= 0.05), one is in fact betting that, in so doing, one will be wrong only 1 time in 20.
>
> If anyone doesn't like this explication, please note that I am a Bayesian, so for me to explain a p value is like George Bush explaining the meaning of the French word 'entrepreneur'.  :-)
>
> (Apparently GB once claimed that the trouble with the French is that they don't have a word for 'entrepreneur'. Actually, they do.)
>
> You may find the following code (built on your original code) helpful - run it as many times as your patience allows.
>
> numTests = 1000;
> resultsList = {};
> Do[
>   (data = RandomVariate[NormalDistribution[], 10000];
>    AppendTo[ resultsList, DistributionFitTest[data] ];
>    ), {numTests}
>   ]
> resultsList // Short
> Length[ Select[ resultsList, (s \[Function] s<= 0.05) ] ]/numTests //  N
>
> Cheers
>
> Barrie
>
>
>
>>>> On 02/11/2011 at 10:23 pm, in message<201111021123.GAA03608 at smc.vnet.net>,
> Andy Ross<andyr at wolfram.com>  wrote:
>> This is exactly what you might expect.  The p-value from a hypothesis
>> test is itself a random variable. Under the null hypothesis the p-value
>> should follow a UniformDistribution[{0,1}].
>>
>> In your case, the null hypothesis is that the data have been drawn from
>> a normal distribution. What that p-value is really saying is that about
>> 3% of the time you can expect to get a test statistic like the one you
>> obtained or one even more extreme.
>>
>> Andy Ross
>> Wolfram Research
>>
>>
>> On 11/1/2011 12:02 AM, fd wrote:
>>> Dear Group
>>>
>>> I'm not a specialist in statistics, but I spoke to one who found this
>>> behaviour dubious.
>>>
>>> Before using DistributionFitTest I was doing some tests with the
>>> normal distribution, like this
>>>
>>> data = RandomVariate[NormalDistribution[], 10000];
>>>
>>> DistributionFitTest[data]
>>>
>>> 0.0312946
>>>
>>> According to the documentation "A small p-value suggests that it is
>>> unlikely that the data came from dist", and that the test assumes the
>>> data is normally distributed
>>>
>>> I found this result for the p-value to be really low, if I re-run the
>>> code I often get what I would expect (a number greater than 0.5) but
>>> it is not at all rare to obtain p values smaller than 0.05 and even
>>> smaller. Through multiple re-runs I notice it fluctuates by orders of
>>> magnitude.
>>>
>>> The statistician I consulted with found this weird since the data was
>>> drawn from a normal distribution and the sample size is big,
>>> especially because the Pearson X2 test also fluctuates like this:
>>>
>>> H=DistributionFitTest[data, Automatic, "HypothesisTestData"];
>>>
>>> H["TestDataTable", All]
>>>
>>> Is this a real issue?
>>>
>>> Any thoughts?
>>>
>>> Best regards
>>> Felipe
>>>
>>>
>>>
>>>
