cwave.eu5.org
Also see: http://www.angelfire.com/dragon/letstry
cwave04 at yahoo dot com

Last updated on Fri May 21 11:52:18 IST 2010.

Two sample inference

Mann Whitney U-test

We have the same setup as before. We want to test
H0: F = G vs H1: F ≠ G
or
H0: F = G vs H1: F < G
or
H0: F = G vs H1: F > G
Define
Dij = I{Xi < Yj}, for all i,j.
Then the Mann-Whitney U-test uses the test statistic
U = ∑_{i,j} Dij,
where the sum runs over all pairs (i, j).
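
The definition of U can be sketched in a few lines of Python. This is only an illustration: the function name mann_whitney_u and the data are hypothetical and not part of these notes.

```python
def mann_whitney_u(xs, ys):
    """Return U = number of pairs (i, j) with xs[i] < ys[j]."""
    return sum(1 for x in xs for y in ys if x < y)

# Hypothetical data, chosen only for illustration (m = 3, n = 4).
xs = [1.2, 3.4, 2.2]
ys = [2.8, 4.1, 0.9, 3.9]

u = mann_whitney_u(xs, ys)
print(u)  # U always lies between 0 and m*n
```

The double loop mirrors the double sum over i and j directly; for large samples a rank-based computation would be faster, but that is not needed here.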

Exercise 9.1: Consider the three alternative hypotheses listed above. When should we reject H0 in favour of each of them, for large or small U?

Let
θ = E(Dij)
θ1 = E(DijDik)
θ2 = E(DijDkj)

Exercise 9.2: Compute θ under H0.

Exercise 9.3: Show that if H0 is true then
θ1 = θ2 = 1/3.

Exercise 9.4: Compute E(U) in general, as well as under H0.

Exercise 9.5: Show that
cov(Dij, Dkl) = 0 if i ≠ k and j ≠ l
θ − θ² if i = k and j = l
θ1 − θ² if i = k and j ≠ l
θ2 − θ² if i ≠ k and j = l

Exercise 9.6: Use the last exercise to show that
Var(U) = mn(θ − θ²) + mn(m−1)(θ2 − θ²) + mn(n−1)(θ1 − θ²),
in general. Check that this reduces to
mn(m+n+1)/12
under H0.

Exercise 9.7: Argue that U is distribution-free under H0 by showing that
P(U = u) = rmn(u) / C(m+n, m)
under H0, where C(m+n, m) is the binomial coefficient "m+n choose m" and
rmn(u) = number of arrangements of m X's and n Y's such that the number of times an X precedes a Y is u.

Exercise 9.8: Show that
rmn(u) = rmn(mn-u)
Hence conclude that under H0, U is distributed symmetrically around mn/2.
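
The counts rmn(u) can be checked numerically for small m and n by brute-force enumeration. The sketch below is not a proof of the exercises; it only lists every placement of the m X's among the m+n positions and tallies the number of (X before Y) pairs. The function name null_counts and the choice m = 3, n = 4 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def null_counts(m, n):
    """Return [r(0), ..., r(mn)], where r(u) counts the arrangements of
    m X's and n Y's in which an X precedes a Y exactly u times."""
    counts = [0] * (m * n + 1)
    for x_positions in combinations(range(m + n), m):
        x_set = set(x_positions)
        xs_seen = 0  # number of X's met so far, scanning left to right
        u = 0
        for pos in range(m + n):
            if pos in x_set:
                xs_seen += 1
            else:
                u += xs_seen  # this Y is preceded by xs_seen X's
        counts[u] += 1
    return counts

m, n = 3, 4  # hypothetical small sample sizes
r = null_counts(m, n)
total = comb(m + n, m)
probs = [Fraction(c, total) for c in r]
mean = sum(u * p for u, p in enumerate(probs))
var = sum((u - mean) ** 2 * p for u, p in enumerate(probs))

print(sum(r) == total)   # the counts exhaust all arrangements
print(r == r[::-1])      # symmetry: r(u) = r(mn - u)
print(mean, var)         # compare with mn/2 and mn(m+n+1)/12
```

Exact fractions are used so that the mean and variance can be compared with mn/2 and mn(m+n+1)/12 without rounding error.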

Theorem: For large m, n,
(U-E(U))/sqrt(Var(U)) ~ AN(0,1).

Proof: Shall do later using a theorem for U-statistics.
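
As an illustration of how the theorem is used in practice, the sketch below standardises U with its null mean mn/2 and null variance mn(m+n+1)/12 and reads off a two-sided p-value from the standard normal distribution. The function names are hypothetical, no continuity correction is applied, and the approximation is rough for small m and n.

```python
from math import sqrt
from statistics import NormalDist

def u_z_score(u, m, n):
    """Standardise U using its mean and variance under H0."""
    mean = m * n / 2
    var = m * n * (m + n + 1) / 12
    return (u - mean) / sqrt(var)

def two_sided_p_value(u, m, n):
    """Approximate two-sided p-value from the normal approximation."""
    z = u_z_score(u, m, n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical values: U = 8 observed with m = 3 and n = 4.
print(u_z_score(8, 3, 4))
print(two_sided_p_value(8, 3, 4))
```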

An estimation problem

We can use the above idea to get a confidence interval for a two sample location problem. We have the same set up as above. But now we also assume that G is a location shift of F, i.e.,
G(x) = F(x- θ) for all x,
where θ is unknown.

Exercise 9.9: Which of the following two statements is true under this model?
Y1-θ, ..., Yn-θ are iid F
or
Y1+θ, ..., Yn+θ are iid F ?

Define
Zij = Yj − Xi, for all i,j,
and
U(θ) = ∑_{i,j} I{Zij > θ}

Exercise 9.10: Show that U(θ) has distribution free of θ. In particular, show that its distribution is same as the null distribution of the Mann-Whitney U statistic under H0.

Suppose that we are given α, and we want to get a (1−α)-level CI for θ. Since U(θ) has a completely known distribution, we can find an integer c such that
P(c ≤ U(θ) ≤ mn−c) = 1−α.
Due to the discrete nature of U(θ), the equality may not hold exactly. Next, order the mn differences, Zij's, as
W1 < ... < Wmn
The inequalities are strict with probability 1, because of the continuity assumption on F.

Exercise 9.11: Show that
{c ≤ U(θ) ≤ mn−c} iff {W_c < θ < W_{mn−c+1}}
Hence argue that (W_c, W_{mn−c+1}) is a (1−α)-level CI for θ.
See the hint for ONE.7.
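
The construction of the interval can be sketched as follows. The code assumes that the differences are Zij = Yj − Xi, takes the largest c for which the coverage is at least 1−α (so the interval is conservative, because of discreteness), and computes the null counts by a recursion on the last symbol of an arrangement. All function names and the data are hypothetical.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def r(m, n, u):
    """Number of arrangements of m X's and n Y's with exactly u (X before Y) pairs."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # The last symbol is an X (adds no pairs) or a Y (preceded by all m X's).
    return r(m - 1, n, u) + r(m, n - 1, u - m)

def critical_c(m, n, alpha):
    """Largest c >= 1 with P(c <= U <= mn - c) >= 1 - alpha under H0."""
    total = comb(m + n, m)
    c, tail = 0, 0  # tail = number of arrangements with U <= c - 1
    while 2 * (tail + r(m, n, c)) <= alpha * total:
        tail += r(m, n, c)
        c += 1
    if c == 0:
        raise ValueError("sample sizes too small for this confidence level")
    return c

def location_shift_ci(xs, ys, alpha):
    """Return (W_c, W_{mn-c+1}) for the ordered differences y - x."""
    m, n = len(xs), len(ys)
    c = critical_c(m, n, alpha)
    w = sorted(y - x for x in xs for y in ys)  # W_1 < ... < W_mn
    return w[c - 1], w[m * n - c]              # shift from 1-based to 0-based

# Hypothetical data, chosen only for illustration.
xs = [1.2, 3.4, 2.2]
ys = [2.8, 4.1, 0.9, 3.9]
lo, hi = location_shift_ci(xs, ys, alpha=0.2)
print(lo, hi)
```

By the symmetry of the null distribution, the coverage equals 1 − 2 P(U ≤ c−1), which is why only the lower tail is accumulated in critical_c.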

Hodges-Lehmann approach

Recall that we had described the HL approach for the one-sample problem as a method to obtain point estimates from a test. We can apply this approach to the Mann-Whitney U-test to get a two-sample HL estimate for θ as follows.

Notice that U(θ) is distributed symmetrically around mn/2. The HL approach suggests choosing the estimate θ̂ in such a way that U(θ̂) is as close to mn/2 as possible.

Exercise 9.12: Show that
the HL estimate θ̂ = median of the Zij's.
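
A minimal numerical sketch of this estimate, assuming Zij = Yj − Xi and that U(θ) counts the Zij exceeding θ. The function names and the data are hypothetical.

```python
from statistics import median

def hodges_lehmann_shift(xs, ys):
    """Median of all mn pairwise differences y - x."""
    return median(y - x for x in xs for y in ys)

def u_of_theta(xs, ys, theta):
    """U(theta) = number of pairs (i, j) with y_j - x_i > theta."""
    return sum(1 for x in xs for y in ys if y - x > theta)

# Hypothetical data, chosen only for illustration (m = 3, n = 4).
xs = [1.2, 3.4, 2.2]
ys = [2.8, 4.1, 0.9, 3.9]

theta_hat = hodges_lehmann_shift(xs, ys)
print(theta_hat)
print(u_of_theta(xs, ys, theta_hat), len(xs) * len(ys) / 2)  # compare with mn/2
```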


© Arnab Chakraborty (2010)