MathGroup Archive 2010

Re: Kolmogorov-Smirnov 2-sample test

  • To: mathgroup at smc.vnet.net
  • Subject: [mg111088] Re: Kolmogorov-Smirnov 2-sample test
  • From: Bill Rowe <readnews at sbcglobal.net>
  • Date: Tue, 20 Jul 2010 03:41:51 -0400 (EDT)

On 7/19/10 at 2:11 AM, aaronbramson at gmail.com (Aaron Bramson) wrote:

>I would like to perform a 2-sample k-s test.  I've seen some posts
>on the archive about the one-sample goodness-of-fit version of the
>Kolmogorov-Smirnov test, but I'm interested in the 2-sample version.

>Here's a description of the method:
>http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov.E2.80.93Smirnov_test

>I think all the necessary components are available; e.g.
>Accumulate[BinCounts[_list_]] to get the ecdf of both datasets, abs,
>max of a list, etc.  But the data management is a bit above my
>current skill level.  Also, since all other software packages seem
>to include this test capability, I would be really surprised if
>there wasn't a package somewhere that included it by now, but I've
>searched a lot and can't find it.  Can anybody help me locate
>this?

>Alternatively, would anybody like to work with me to build this in
>case it can't be found?

It is simple to create a function that will do what is needed.
For example,

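ksTwoSampleTest[xdata_, ydata_] :=
  Module[{nx, ny, pts, ecdf},
   {nx, ny} = Length /@ {xdata, ydata};
   pts = Union[xdata, ydata];
   (* ecdf[data, t]: fraction of the data values <= t *)
   ecdf[data_, t_] := Count[data, u_ /; u <= t]/Length[data];
   Sqrt[nx ny/(nx + ny)] Max@
     Table[Abs[ecdf[xdata, t] - ecdf[ydata, t]], {t, pts}]]

This returns the largest absolute difference between the two
empirical CDFs, evaluated at every observed data value and
scaled by Sqrt[nx ny/(nx + ny)]. The empirical CDFs have to be
compared directly here: a difference of quantiles is measured
in the units of the data, not as a probability, so it is not
the K-S statistic.

Note that while this is a simple implementation, it is not
optimal for large data sets. Each ecdf evaluation scans a whole
sample with Count, so the complexity is of order n^2. That
shouldn't be a problem for modest data sets but will certainly
be an issue for large data sets. Sorting the data first would
avoid most of the repeated work.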
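
One way to avoid the order n^2 cost is to sort the pooled data
once and accumulate the jumps of the two empirical CDFs. What
follows is only a minimal sketch, not a tested package function.
The name ksTwoSampleSorted is made up for this example, and the
code assumes both arguments are non-empty lists of real numbers.
It returns the largest absolute difference between the two
empirical CDFs, scaled by Sqrt[nx ny/(nx + ny)].

```mathematica
ksTwoSampleSorted[xdata_, ydata_] :=
  Module[{nx, ny, pooled, steps},
   {nx, ny} = Length /@ {xdata, ydata};
   (* pair each value with the jump it adds to Fx - Fy *)
   pooled = Join[
     Transpose[{N[xdata], ConstantArray[1/nx, nx]}],
     Transpose[{N[ydata], ConstantArray[-1/ny, ny]}]];
   (* sort once, then add up the jumps of tied values *)
   steps = Total[#[[All, 2]]] & /@
     GatherBy[SortBy[pooled, First], First];
   (* running total = Fx - Fy after each distinct value *)
   Sqrt[nx ny/(nx + ny)] Max[Abs[Accumulate[steps]]]]

(* small hand-checkable example *)
ksTwoSampleSorted[{1, 2, 3, 4}, {3, 4, 5, 6}] // N
```

In the example the largest gap between the empirical CDFs is
1/2 (at 2, say, where Fx = 1/2 and Fy = 0) and the scale factor
is Sqrt[2], so the result should be about 0.707. Grouping tied
values before accumulating matters: if the jumps of equal values
were applied one at a time, the running total could pass through
differences that neither empirical CDF actually takes.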