Re: Problems with DistributionFitTest
- To: mathgroup at smc.vnet.net
- Subject: [mg122569] Re: Problems with DistributionFitTest
- From: Ray Koopman <koopman at sfu.ca>
- Date: Tue, 1 Nov 2011 06:58:11 -0500 (EST)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <j8nuqm$egv$1@smc.vnet.net>
On Oct 31, 10:07 pm, fd <fdi... at gmail.com> wrote:
> Dear Group
>
> I'm not a specialist in statistics, but I spoke to one who found this
> behaviour dubious.
>
> Before using DistributionFitTest I was doing some tests with the
> normal distribution, like this
>
> data = RandomVariate[NormalDistribution[], 10000];
>
> DistributionFitTest[data]
>
> 0.0312946
>
> According to the documentation "A small p-value suggests that it is
> unlikely that the data came from dist", and that the test assumes the
> data is normally distributed
>
> I found this result for the p-value to be really low, if I re-run the
> code I often get what I would expect (a number greater than 0.5) but
> it is not at all rare to obtain p values smaller than 0.05 and even
> smaller. Through multiple re-runs I notice it fluctuates by orders of
> magnitude.
>
> The statistician I consulted with found this weird since the data was
> drawn from a normal distribution and the sample size is big,
> especially because the Pearson X2 test also fluctuates like this:
>
> H=DistributionFitTest[data, Automatic, "HypothesisTestData"];
>
> H["TestDataTable", All]
>
> Is this a real issue?
>
> Any thoughts
>
> Best regards
> Felipe

If the data were generated by the distribution for which you are
testing, then no matter what the sample size is, the p-value is a
sample from a Uniform[0,1] distribution, so 5% of the time it
should be < .05, etc.
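
One way to check this empirically is to repeat the experiment many times and look at the whole collection of p-values rather than a single one. The following is only a rough sketch: the repetition count, the sample size, the seed, and the variable names (nReps, sampleSize, pvals) are illustrative choices, not values from the original post. It uses the same DistributionFitTest[data] call as above.

```mathematica
(* Illustrative settings (assumptions, not taken from the thread) *)
SeedRandom[1];
nReps = 1000;       (* number of repeated experiments *)
sampleSize = 1000;  (* size of each simulated normal sample *)

(* One p-value per freshly drawn standard normal sample *)
pvals = Table[
   DistributionFitTest[RandomVariate[NormalDistribution[], sampleSize]],
   {nReps}];

(* Fraction of p-values below 0.05; this should come out near 0.05 *)
N[Count[pvals, p_ /; p < 0.05]/nReps]

(* The histogram should look roughly flat on [0, 1] *)
Histogram[pvals, {0, 1, 0.05}]

(* Optional: test the p-values themselves against Uniform[0,1] *)
DistributionFitTest[pvals, UniformDistribution[{0, 1}]]
```

If the p-values are approximately uniform, about one run in twenty falls below .05 purely by chance, which would account for the occasional small values seen on re-running.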