MathGroup Archive: July 2010 [00635]

Re: Kolmogorov-Smirnov 2-sample test

To: mathgroup at smc.vnet.net
Subject: [mg111284] Re: Kolmogorov-Smirnov 2-sample test
From: Ray Koopman <koopman at sfu.ca>
Date: Mon, 26 Jul 2010 06:37:47 -0400 (EDT)

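ks2a[y1_,y2_] := Block[{n1 = Length@y1, n2 = Length@y2,
  pool = Sort@Join[y1,y2], x,n,u},
 (* all pooled values identical: no cut points, so return statistic 0 and p = 1 *)
 If[Equal@@pool, {0,1.}, {
  (* x = n1*n2*D: largest scaled gap between the samples' counts of values >= each cut point *)
  x = Max@Abs[n2*Tr@UnitStep[y1-#]-n1*Tr@UnitStep[y2-#]&/@Rest@Union@pool],
  (* u[[j+1]] counts orderings of i values from y1 and j from y2 whose gap stays below x *)
  n = n1+n2; u = Table[0,{n2+1}]; Do[ Which[
   i+j == 0, u[[j+1]] = 1,
   (* the gap is tested only where the pooled value changes, never inside a run of ties *)
   i+j < n && pool[[i+j]] < pool[[i+j+1]] && Abs[n2*i-n1*j] >= x,
             u[[j+1]] = 0,
   i == 0, u[[j+1]] = u[[j]],
   j > 0, u[[j+1]] += u[[j]]], {i,0,n1},{j,0,n2}];
  (* exact p-value: fraction of the Multinomial[n1,n2] orderings that reach x *)
  N[1 - Last@u/Multinomial[n1,n2]]}] ]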
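
Reading the definition above, the first element returned is the integer-scaled statistic n1*n2*D, not the conventional D (the largest gap between the two empirical distribution functions). If D itself is wanted, a thin wrapper along these lines might do. The name ksD is hypothetical and not part of the posted code, and the sketch assumes both samples are non-empty.

```mathematica
(* ksD is a hypothetical helper: it rescales the first element of ks2a's
   result from n1*n2*D to D and leaves the p-value untouched *)
ksD[y1_, y2_] := MapAt[#/(Length[y1]*Length[y2]) &, ks2a[y1, y2], 1]

ksD[{1, 2, 3}, {4, 5, 6, 7}]
(* first element is 1, since the two samples do not overlap *)
```

Keeping the scaled integer inside ks2a lets the path-counting comparison Abs[n2*i-n1*j] >= x stay in exact integer arithmetic, so rescaling only at the end seems the safer choice.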

ks2a[{1,1,1},{1,1,1,1}]

{0,1.}
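
As a sketch of why the Equal@@pool guard matters, the pieces of the statistic line can be evaluated by hand for this input. With only one distinct pooled value there are no cut points left after Rest@Union, and Max of an empty list is -Infinity, which matches the first element of the result reported in the quoted message below.

```mathematica
pool = Sort@Join[{1, 1, 1}, {1, 1, 1, 1}];

Rest@Union@pool   (* {} : a single distinct value leaves no cut points *)
Max@Abs[{}]       (* -Infinity : Max over an empty list *)
Equal @@ pool     (* True : the condition that selects the {0, 1.} branch *)
```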

----- Aaron Bramson <aaronbramson at gmail.com> wrote:
> Hello everybody and thank you,
> 
> This has been very helpful, and now the two-sided K-S test for Mathematica
> is online for everybody to enjoy.
> 
> I have implemented the new code from Andy and from Ray on my data set and
> the code from Ray works out better for me...though I don't have the skill to
> decipher what that "ugly" code is doing, I've verified several results so
> I'm using those exact p-values.  I'm going to build a table of the p-values
> from these tests (which is made into a plot over time with the test being
> performed on the individual-trial data streams of two cohorts at each time
> step).
> 
> I have one last question, or maybe it's a request. In Ray's code, if I put
> in two data sets wherein all the points are at the same value (e.g. all
> zero) the result is not a K-stat of 0, and a p-value of 1, but rather
> {-\[Infinity], 0.}.  That doesn't seem like the right answer (and in any
> case not the answer that I expect or can use) so this input combination
> doesn't work with how the technique calculates the stats.  So I'd like to
> request a small change to the code Ray provided so that if the inputs are
> all identical the output is {0,1} instead of {-\[Infinity], 0.}.  I could do
> this post-facto with a replacement rule, but it would probably be better and
> faster to do this in the original calculation.  But with THAT code I don't
> know where to make the appropriate changes.
> 
> Again, thanks everybody for your help.
> 
> Best,
> Aaron
> 
> p.s. I may end up using the Kuiper test and I might therefore have a similar
> question about implementing that in Mathematica very soon.
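
For comparison, the post-facto replacement rule mentioned in the quoted message could be sketched as follows. The list named results and the numbers in it are made up for illustration. They are not output from any real data set.

```mathematica
(* hypothetical table of {statistic, p-value} pairs, one per time step *)
results = {{12, 0.0571}, {-Infinity, 0.}, {5, 0.8}};

(* rewrite any pair whose statistic is -Infinity as {0, 1.} *)
results /. {-Infinity, _} -> {0, 1.}
(* {{12, 0.0571}, {0, 1.}, {5, 0.8}} *)
```

With the guard built into ks2a above, this clean-up step should no longer be needed.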
