Non-parametric statistical tests

This page compares the Mann-Whitney-Wilcoxon test with two other non-parametric statistical tests: the Kolmogorov-Smirnov test and Pearson's Chi-square test.

We introduce each test separately in Sections 1 to 3, elaborating in more detail on the chosen test (Mann-Whitney-Wilcoxon, Section 1).

We give a qualitative comparison in Section 4 and demonstrate that the test of our choice is sensitive only to a difference in the two distributions' medians. Unlike the other two tests, it has no sensitivity to differences in shape, which is undesired for our application.

1. Mann-Whitney-Wilcoxon test (MWW)

The MWW test was first presented by Wilcoxon in 1945 [1] and put on a more solid mathematical basis by Mann and Whitney two years later [2]. The test assesses whether one of two random variables is stochastically larger than the other, i.e. whether their medians differ.

Let X1 and X2 be two sets of samples drawn from unknown distribution functions. The test of whether the two underlying random variables are identical proceeds in three steps:

  1. The elements of the two sets X1 and X2 are concatenated. If X1 and X2 have cardinalities n1 and n2, respectively, the joint set has cardinality n1 + n2.
  2. The elements in the joint set are sorted in increasing order. The smallest (first) element has rank 1, the largest (last) element has rank n1 + n2.
  3. The ranksum is the sum of the ranks from all those elements that came from the set X1. Wilcoxon denoted this statistic with T.
For example, let X1 = {17.5, -2} and X2 = {23, -11.7, 3.1, 0.9, 42}. Then the three steps are as follows:
  1. X1 ∪ X2: {17.5, -2, 23, -11.7, 3.1, 0.9, 42}
  2. sort: {-11.7, -2, 0.9, 3.1, 17.5, 23, 42}
  3. ranksum: T = 2 + 5 = 7
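
To make the three steps concrete, here is a minimal Python sketch of the rank-sum computation on the example sets above (plain Python, no library; ties are ignored for brevity):

```python
# A minimal sketch of the three steps in plain Python, using the
# example sets from above (ties are ignored for brevity).

def ranksum(x1, x2):
    """Wilcoxon's T: sum of the ranks of x1's elements in the
    sorted concatenation of x1 and x2."""
    joint = sorted(x1 + x2)                     # steps 1 and 2
    return sum(joint.index(v) + 1 for v in x1)  # step 3: 1-based ranks

X1 = [17.5, -2]
X2 = [23, -11.7, 3.1, 0.9, 42]
print(ranksum(X1, X2))  # -> 7, as in the worked example
```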

The expected mean and variance of the statistic T are [2]:

μT = n1*(n1 + n2 + 1) / 2

σT² = n1*n2*(n1 + n2 + 1) / 12
The expected mean and variance can be used to normalize the statistic, yielding the standard z value:
z = (T - μT) / σT
The z value is positive (negative) if the median of the first distribution is larger (smaller) than that of the second distribution. If the medians are equal, the z value is zero. This can be seen in the graphs of the third column in Section 4.
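
Putting the pieces together, a short sketch of the z value computation (again without tie correction); scipy.stats.ranksums computes the same statistic and can serve as a cross-check:

```python
import math

# A sketch of the normalized statistic, assuming the ranksum T from
# above (no tie correction). scipy.stats.ranksums computes the same z.

def mww_z(x1, x2):
    n1, n2 = len(x1), len(x2)
    joint = sorted(x1 + x2)
    T = sum(joint.index(v) + 1 for v in x1)          # rank sum of x1
    mu = n1 * (n1 + n2 + 1) / 2                      # expected mean of T
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # std deviation of T
    return (T - mu) / sigma

X1 = [17.5, -2]
X2 = [23, -11.7, 3.1, 0.9, 42]
print(mww_z(X1, X2))  # -> about -0.39 (tiny samples give noisy values)
```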

2. Kolmogorov-Smirnov test (KS)

The two-sample KS test assesses whether two probability distributions differ [3,4]. It is sensitive to both the location and the shape of the distributions.

Given two samples X1 and X2 with empirical cumulative distribution functions F1(x) and F2(x), respectively, the test statistic is computed as:

Dn1, n2 = supx|F1(x) - F2(x)|

which is the maximum difference between the two cumulative distribution functions over all x. n1 and n2 are the cardinalities of X1 and X2, respectively.

The statistic Dn1, n2 can be normalized using precomputed tables [4].
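
As a sketch, the statistic can be computed directly from the two empirical CDFs; since both are step functions, the supremum is attained at one of the sample points (ties again handled naively):

```python
# A minimal sketch of the two-sample KS statistic as defined above,
# using plain Python ECDFs (ties handled naively).

def ks_statistic(x1, x2):
    """Largest vertical gap between the two empirical CDFs."""
    n1, n2 = len(x1), len(x2)
    ecdf1 = lambda x: sum(v <= x for v in x1) / n1
    ecdf2 = lambda x: sum(v <= x for v in x2) / n2
    # Both ECDFs are step functions, so the supremum is attained
    # at one of the sample points.
    return max(abs(ecdf1(x) - ecdf2(x)) for x in x1 + x2)

X1 = [17.5, -2]
X2 = [23, -11.7, 3.1, 0.9, 42]
print(ks_statistic(X1, X2))  # -> 0.4 (scipy.stats.ks_2samp agrees)
```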

3. Pearson's Chi-square test

Pearson's Chi-square test assesses whether an observed distribution follows an expected distribution [5].

Let Oi and Ei be the observed and expected frequencies in bin i, respectively. Then the test statistic is:

X² = Σi=1..n (Oi - Ei)² / Ei

where X² asymptotically follows a χ²-distribution.
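
A minimal sketch for already-binned counts; the bin values below are made-up illustration numbers, not taken from the comparison in Section 4:

```python
# A minimal sketch of Pearson's statistic for already-binned counts;
# the numbers below are made-up illustration values.

def chi_square(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

O = [18, 22, 30, 30]  # observed counts per bin (hypothetical)
E = [25, 25, 25, 25]  # expected counts under a rectangular distribution
print(chi_square(O, E))  # -> 4.32 (scipy.stats.chisquare(O, E) agrees)
```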

4. Qualitative Comparison

The following table qualitatively shows, for different input distributions (columns 1 and 2), the behaviour of the three presented tests (columns 3-5). The first distribution is always rectangular. The second distribution differs from it in median (first row) or only in shape (second and third rows).

If the test statistic is zero, the respective graph is marked with a dashed frame. One sees that the MWW test is unequal to zero only in the first case, where the medians differ. The KS test measures the difference in shape in the second row. However, it barely measures the difference in shape in the third row, since the cumulative distribution functions are very similar and the test statistic is close to zero. The Chi-square test also measures the difference in the third row, since it sums up the squared differences over every single bin.

The semantic gray-level enhancement and color transfer are based on tone-mapping curves that adaptively decrease or increase pixel values in different channels, independent of their distribution. As we do not consider the shape of the distribution an extra feature, we use the MWW test, which we have found to be more robust and better suited to our application.
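
The described behaviour can be reproduced numerically. The following sketch draws two samples with equal medians but different shapes (sample sizes and seed are arbitrary choices): the MWW z value stays near zero while the KS statistic does not.

```python
# A sketch reproducing the qualitative behaviour with scipy: two samples
# with equal medians but different shapes (sizes and seed are arbitrary).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
flat = rng.uniform(-1, 1, 1000)    # rectangular, median 0
peaked = rng.normal(0, 0.2, 1000)  # same median, different shape

print(stats.ranksums(flat, peaked)[0])  # MWW z: close to zero
print(stats.ks_2samp(flat, peaked)[0])  # KS D: clearly non-zero
```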

[Table: for each input pair, the columns show the 1st input distribution, the 2nd input distribution, and the resulting Mann-Whitney-Wilcoxon, Kolmogorov-Smirnov, and Pearson's Chi-square test graphs; each panel also plots the expected rectangular distribution for reference.]

References:

[1] Frank Wilcoxon, Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol. 1, nr. 6, pp. 80-83, 1945

[2] H. B. Mann and D. R. Whitney, On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, The Annals of Mathematical Statistics, vol. 18, nr. 1, pp. 50-60, 1947

[3] A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giornale dell'Istituto Italiano degli Attuari, vol. 4, pp. 83-91, 1933

[4] N. Smirnov, Table for Estimating the Goodness of Fit of Empirical Distributions, The Annals of Mathematical Statistics, vol. 19, nr. 2, pp. 279-281, 1948

[5] R. L. Plackett, Karl Pearson and the Chi-Squared Test, International Statistical Review, vol. 51, nr. 1, pp. 59-72, 1983