Re: Kolmogorov-Smirnov 2-sample test
*To*: mathgroup at smc.vnet.net
*Subject*: [mg111157] Re: Kolmogorov-Smirnov 2-sample test
*From*: Andy Ross <andyr at wolfram.com>
*Date*: Thu, 22 Jul 2010 05:42:40 -0400 (EDT)
Bill Rowe wrote:
> On 7/20/10 at 3:41 AM, darreng at wolfram.com (Darren Glosemeyer) wrote:
>
>> Here is some code written by Andy Ross at Wolfram for the two
>> sample Kolmogorov-Smirnov test. KolmogorovSmirnov2Sample computes
>> the test statistic, and KSBootstrapPValue provides a bootstrap
>> estimate of the p-value given the two data sets, the number of
>> simulations for the estimate and the test statistic.
>
>> In[1]:= empiricalCDF[data_, x_] :=
>>   Length[Select[data, # <= x &]]/Length[data]
>
>> In[2]:= KolmogorovSmirnov2Sample[data1_, data2_] :=
>> Block[{sd1 = Sort[data1], sd2 = Sort[data2], e1, e2,
>> udat = Union[Flatten[{data1, data2}]], n1 = Length[data1],
>> n2 = Length[data2], T},
>> e1 = empiricalCDF[sd1, #] & /@ udat;
>> e2 = empiricalCDF[sd2, #] & /@ udat;
>> T = Max[Abs[e1 - e2]];
>> (1/Sqrt[n1]) (Sqrt[(n1*n2)/(n1 + n2)]) T
>> ]
>
> After looking at your code above I realized I posted a very bad
> solution to this problem. But, it looks to me like there is a
> problem with this code. The returned result
>
> (1/Sqrt[n1]) (Sqrt[(n1*n2)/(n1 + n2)]) T
>
> seems to have an extra factor in it, specifically 1/Sqrt[n1].
> Since n1 is the number of samples in the first data set,
> including this factor means you will get a different result by
> interchanging the order of the arguments to the function when
> the number of samples in each data set is different. Since the
> KS statistic is based on the maximum difference between the
> empirical CDFs, the order in which the data sets are used in the
> function should not matter.
>
You are absolutely correct. The factor should be removed. I believe it
is a remnant of an incomplete copy and paste.
-Andy
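
For reference, here is a minimal sketch of the function with the stray 1/Sqrt[n1] factor removed. It assumes the rest of the posted code is kept as is; the only other change is dropping the unused sorted copies sd1 and sd2, since empiricalCDF does not need sorted input. Treat it as an illustration of the fix discussed above, not as the final version that shipped anywhere.

```mathematica
(* Empirical CDF of data evaluated at x, unchanged from the original post *)
empiricalCDF[data_, x_] := Length[Select[data, # <= x &]]/Length[data]

(* Two-sample KS statistic: the maximum absolute difference between the
   two empirical CDFs, scaled by Sqrt[n1 n2/(n1 + n2)].
   The extra 1/Sqrt[n1] factor from the original post is dropped. *)
KolmogorovSmirnov2Sample[data1_, data2_] :=
 Block[{e1, e2, udat = Union[Flatten[{data1, data2}]],
   n1 = Length[data1], n2 = Length[data2], T},
  (* evaluate both empirical CDFs at every distinct pooled data point *)
  e1 = empiricalCDF[data1, #] & /@ udat;
  e2 = empiricalCDF[data2, #] & /@ udat;
  T = Max[Abs[e1 - e2]];
  Sqrt[(n1*n2)/(n1 + n2)] T
  ]

(* Symmetry check with samples of different sizes (made-up data) *)
KolmogorovSmirnov2Sample[{1, 2, 3}, {2, 3, 4, 5}] ==
 KolmogorovSmirnov2Sample[{2, 3, 4, 5}, {1, 2, 3}]
```

With the factor removed, the remaining scale Sqrt[(n1*n2)/(n1 + n2)] is symmetric in n1 and n2, so the last expression should evaluate to True, which is the property Bill pointed out was broken.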