Re: The D'Agostino Pearson k^2 test implemented in mathematica / variance of difference sign test
- To: mathgroup at smc.vnet.net
- Subject: [mg65727] Re: The D'Agostino Pearson k^2 test implemented in mathematica / variance of difference sign test
- From: Maxim <m.r at inbox.ru>
- Date: Sun, 16 Apr 2006 03:48:55 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
[This post has been delayed due to email problems - moderator]

Suppose that we have n independent identically distributed random
variables {u[1], ..., u[n]} and P[u[i] == u[j]] == 0 for i != j. We form
another sequence

  {xi[1] = Boole[u[1] > u[2]], ..., xi[n - 1] = Boole[u[n - 1] > u[n]]}

and we're looking for the variance of the sum of xi[i]:

  D[N[n]] == Variance[Sum[xi[i], {i, n - 1}]]
          == Variance[Sum[xi[i], {i, n - 2}] + xi[n - 1]]
          == Variance[Sum[xi[i], {i, n - 2}]] + Variance[xi[n - 1]]
               + 2*Covariance[Sum[xi[i], {i, n - 2}], xi[n - 1]]
          == D[N[n - 1]] + 1/4
               + 2*Sum[Covariance[xi[i], xi[n - 1]], {i, n - 2}]

For any pair of adjacent elements we have

  Covariance[xi[1], xi[2]]
    == P[xi[1] == 1 && xi[2] == 1] - P[xi[1] == 1]*P[xi[2] == 1]
    == P[u[1] > u[2] > u[3]] - P[u[1] > u[2]]*P[u[2] > u[3]]
    == 1/6 - 1/4
    == -1/12

because all permutations of {u[1], ..., u[n]} are equally probable. For
any non-adjacent elements Covariance[xi[i], xi[j]] == 0. Therefore,

  D[N[n]] == D[N[n - 1]] + 1/4 + 2*(-1/12),    D[N[2]] == 1/4

and D[N[n]] == (n + 1)/12 if n >= 2.

Here is a check for n = 6:

In[1]:= n = 6;
Lvalfreq = {First@ #, Length@ #}& /@ Split@ Sort@
  (Count[Sign[Most@ # - Rest@ #], 1]& /@ Permutations@ Range@ n)
{Lval, Lp} = {Lvalfreq[[All, 1]], Lvalfreq[[All, 2]]/n!};
mu = Lval.Lp
sigma = ((Lval - mu)^2).Lp

Out[2]= {{0, 1}, {1, 57}, {2, 302}, {3, 302}, {4, 57}, {5, 1}}

Out[4]= 5/2

Out[5]= 7/12

And a numerical test:

In[6]:= Lcnt = Array[
  Count[Sign[Most@ # - Rest@ #]&@ Array[Random[]&, n], 1]&,
  10^5];
{Mean@ Lcnt, Variance@ Lcnt} - {mu, sigma} // N

Out[7]= {0.00262, 0.0033856695}

Maxim Rytin
m.r at inbox.ru

On Sat, 11 Mar 2006 11:47:33 +0000 (UTC), Darren Glosemeyer
<darreng at wolfram.com> wrote:

>
> For the variance quoted on the TimeSeries page, I initially thought
> the same thing you did.
> Assuming the signs are independent and there are equal probabilities
> of getting positive and negative signs (and 0 probability of getting
> a 0 difference), the statistic would follow
> BinomialDistribution[n-1, 1/2], which would have a variance of
> (n-1)/4. Simulations give a variance that appears to be (n+1)/12
> (which would still indicate a typo in the TimeSeries documentation).
> I haven't figured out why this should be the variance yet. My best
> guess is that the assumption of independence is not valid given the
> differencing and as a result the distribution is something other than
> BinomialDistribution[n-1, 1/2].
>
>
> Darren Glosemeyer
> Wolfram Research
>
>
> At 05:15 AM 3/10/2006 -0500, john.hawkin at gmail.com wrote:
>> Hello,
>>
>> I have two questions.
>>
>> 1. Are there any resources of .nb files available on the internet
>> where I might find an implementation of the D'Agostino Pearson k^2
>> test for normal variates?
>>
>> 2. In the mathematica time series package (an add-on), the
>> "difference-sign" test of residuals is mentioned (url:
>> http://documents.wolfram.com/applications/timeseries/UsersGuidetoTimeSeries/1.6.2.html).
>> It says that the variance of this test is (n+1) / 2. However, it
>> would seem to me that a simple calculation gives a variance of
>> (n-1)/4. It goes as follows:
>>
>> If the series is differenced once, then the number of positive and
>> negative values in the difference should be approximately equal.
>> If Xi denotes the sign of each value in the differenced series, then
>>
>>   Mean(Xi) = 0.5(1) + 0.5(0) = 0.5
>>   Var(Xi)  = Expectation( (Xi - Mean(Xi))^2 )
>>            = Expectation( Xi^2 - Xi + 0.25 )
>>            = 0.5 - 0.5 + 0.25
>>            = 0.25
>>
>> And assuming independence of each sign from the others, the total
>> variance should be the sum of the individual variances, up to n-1
>> for n data points (since there are only n-1 changes in sign), thus
>>
>>   Variance = (n-1) / 4
>>
>> There is an equivalent problem in Lemon's "Stochastic Physics" about
>> coin flips, for which the answer is listed, without proof, as
>> (n-1)/8. Because of these three conflicting results I am wondering
>> if I have made an error in my calculation, and if anyone can find
>> one please let me know.
>>
>> Thank you very much,
>>
>> -John Hawkin
>
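
The independence assumption questioned in this thread can be examined directly. The derivation at the top relies on two covariance values: -1/12 for adjacent indicators xi[i], xi[i + 1] and 0 for non-adjacent ones. What follows is a minimal sketch that tabulates the full covariance matrix by enumerating all permutations, in the same spirit as the n = 6 check in the post. It assumes a small n (5 here, so that 5! = 120 permutations are enumerated), and the names perms, xi, popCov and cov are illustrative only, not taken from the original messages.

```mathematica
(* Illustrative names (perms, xi, popCov, cov) are not from the original post. *)
n = 5;
perms = Permutations[Range[n]];

(* Row k holds the indicators xi[1], ..., xi[n - 1] for the k-th permutation:
   entry i is 1 when element i exceeds element i + 1, and 0 otherwise.
   Entries of a permutation are distinct, so Sign is never 0 here. *)
xi = Map[(1 + Sign[Most[#] - Rest[#]])/2 &, perms];

(* Population covariance E[a b] - E[a] E[b] over equally likely permutations.
   The built-in Covariance uses the sample (length - 1) normalisation,
   so it is deliberately not used. *)
popCov[a_, b_] := Mean[a*b] - Mean[a]*Mean[b];

(* Covariance of every pair of indicators. *)
cov = Table[popCov[xi[[All, i]], xi[[All, j]]], {i, n - 1}, {j, n - 1}];
cov // MatrixForm

(* The sum of all entries is the variance of the total count;
   it is shown next to (n + 1)/12 for comparison. *)
{Total[cov, 2], (n + 1)/12}
```

If the derivation holds, the matrix should show 1/4 on the diagonal, -1/12 on the two neighbouring diagonals and 0 elsewhere. The negative off-diagonal entries are the terms that the (n-1)/4 calculation leaves out when it treats the signs as independent.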