Exercise 9.7:
Argue that U is distribution-free under H0 by showing that
P(U = u) = r_{mn}(u) / C(m+n, m)
under H0, where C(m+n, m) is the binomial coefficient and
r_{mn}(u) = number of arrangements of m X's and n Y's
such that the number of times an X precedes a Y is u.
Exercise 9.8:
Show that
r_{mn}(u) = r_{mn}(mn-u).
Hence conclude that under H0, U is distributed symmetrically
around mn/2.
Proof:
Shall do later using a theorem for U-statistics.
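For small m and n, the counts r_{mn}(u) in Exercises 9.7 and 9.8 can be tabulated by brute force, which gives a numerical check (not a proof) of both claims. The sketch below enumerates every placement of the m X's among the m+n positions; the function name arrangement_counts and the choice m = 3, n = 4 are illustrative assumptions, not part of the notes.

```python
from itertools import combinations
from math import comb


def arrangement_counts(m, n):
    """Return a list r where r[u] is the number of arrangements of
    m X's and n Y's in which an X precedes a Y exactly u times."""
    r = [0] * (m * n + 1)
    # Each arrangement is determined by the positions taken by the X's.
    for x_positions in combinations(range(m + n), m):
        x_set = set(x_positions)
        u = 0
        xs_seen = 0
        for p in range(m + n):
            if p in x_set:
                xs_seen += 1
            else:
                # This Y is preceded by xs_seen X's.
                u += xs_seen
        r[u] += 1
    return r


m, n = 3, 4  # illustrative sample sizes (assumption)
r = arrangement_counts(m, n)
total = comb(m + n, m)

# Null probabilities P(U = u) = r[u] / C(m+n, m).
probs = [count / total for count in r]

print("counts:", r)
print("sum of counts equals C(m+n, m):", sum(r) == total)
print("symmetric about mn/2:", r == r[::-1])
```

Enumeration costs C(m+n, m) steps, so this is only practical for small samples.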
An estimation problem
We can use the above idea to get a confidence interval for a two-sample
location problem. We have the same setup as above, but now we also assume
that G is a location shift of F, i.e.,
G(x) = F(x - θ) for all x,
where θ is unknown.
Exercise 9.9:
Which of the following two statements is true under this model?
Y1-θ, ..., Yn-θ are iid F
or
Y1+θ, ..., Yn+θ are iid F?
Define
U(θ) = ∑_i ∑_j I{Zij > 0}
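As a concrete illustration, here is a minimal sketch of how U(θ) could be computed. It assumes (the definition of Zij is not restated in this section) that U(θ) counts the pairs (i, j) with Yj - Xi > θ, i.e. the number of times an X precedes a shifted Y. The function name u_stat and the two small samples are hypothetical.

```python
def u_stat(x, y, theta):
    """Count pairs (i, j) with y[j] - x[i] > theta.

    This is the assumed form of U(theta): the Mann-Whitney count
    computed after shifting every Y by theta.
    """
    return sum(1 for xi in x for yj in y if yj - xi > theta)


x = [1.1, 2.3, 0.7]        # hypothetical X sample
y = [2.9, 3.8, 1.6, 4.2]   # hypothetical Y sample

# U(theta) is a non-increasing step function of theta, falling from mn to 0.
for theta in (-5.0, 0.0, 1.0, 2.0, 5.0):
    print(theta, u_stat(x, y, theta))
```

Under this reading, U(θ) only changes value when θ crosses one of the mn differences Yj - Xi.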
Exercise 9.10:
Show that U(θ) has distribution free of θ. In particular,
show that its distribution is same as the null distribution of the
Mann-Whitney U statistic under H0.
Suppose that we are given α, and we want to get a
(1-α)-level CI for θ. Since U(θ) has a completely
known distribution, we can find an integer c such that
P(c ≤ U(θ) ≤ mn-c) = 1-α.
Due to the discrete nature of U(θ), the equality may not
hold exactly; in that case take the largest c for which the probability
is at least 1-α. Next, order the mn differences, Zij's, as
W_1 < ... < W_{mn}.
The inequalities are strict with probability 1, because of the
continuity assumption on F.
Exercise 9.11:
Show that
{c ≤ U(θ) ≤ mn-c} iff {W_c < θ < W_{mn-c+1}}.
Hence argue that (W_c, W_{mn-c+1}) is a
(1-α)-level CI for θ.
See the hint for ONE.7.
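The construction above can be sketched in code. This is a minimal sketch under stated assumptions, not a definitive implementation: the differences are taken to be Yj - Xi, the null counts are built from a recursion on the last element of the arrangement, and c is taken as the largest integer whose coverage is at least 1-α, so the interval is conservative. The names null_counts, critical_c and mw_confidence_interval, and the sample data, are hypothetical.

```python
from functools import lru_cache
from math import comb, inf


@lru_cache(maxsize=None)
def null_counts(m, n):
    """Tuple r with r[u] = number of arrangements of m X's and n Y's
    in which an X precedes a Y exactly u times."""
    if m == 0 or n == 0:
        return (1,)
    out = [0] * (m * n + 1)
    # Last element is an X: it precedes no Y, so u is unchanged.
    for u, cnt in enumerate(null_counts(m - 1, n)):
        out[u] += cnt
    # Last element is a Y: it is preceded by all m X's, adding m to u.
    for u, cnt in enumerate(null_counts(m, n - 1)):
        out[u + m] += cnt
    return tuple(out)


def critical_c(m, n, alpha):
    """Largest c with P(c <= U <= mn - c) >= 1 - alpha under H0
    (returns 0 if even c = 1 gives too little coverage)."""
    counts = null_counts(m, n)
    total = comb(m + n, m)
    c, tail = 0, 0  # tail = number of arrangements with U <= c - 1
    # By symmetry, coverage for c + 1 is 1 - 2 * P(U <= c).
    while 2 * (c + 1) <= m * n and 2 * (tail + counts[c]) <= alpha * total:
        tail += counts[c]
        c += 1
    return c


def mw_confidence_interval(x, y, alpha):
    """Interval (W_c, W_{mn-c+1}) from the ordered differences y[j] - x[i]."""
    m, n = len(x), len(y)
    c = critical_c(m, n, alpha)
    if c == 0:
        return (-inf, inf)  # samples too small for a finite interval
    w = sorted(yj - xi for xi in x for yj in y)
    return (w[c - 1], w[m * n - c])  # 0-based indices of W_c and W_{mn-c+1}


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical X sample
y = [3.5, 4.5, 5.5, 6.5, 7.5]   # hypothetical Y sample
print(critical_c(len(x), len(y), 0.05))
print(mw_confidence_interval(x, y, 0.05))
```

The recursion avoids enumerating all C(m+n, m) arrangements, so the critical value can be found for moderately large samples; sorting the mn differences is then the dominant cost.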
Hodges-Lehmann approach
Recall that we had described the HL approach for the one-sample problem as a
method to obtain point estimates from a test. We can apply this approach to
the Mann-Whitney U-test to get a two-sample HL estimate for θ as
follows.
Notice that U(θ) is distributed symmetrically around mn/2. The HL
approach suggests choosing an estimate θ̂ in a way so that U(θ̂)
is as close
to mn/2 as possible.
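One natural way to carry this out, assuming U(θ) counts the pairs with Yj - Xi > θ, is to take the median of the mn pairwise differences: about half of the differences then exceed the estimate, so U at that value is as close to mn/2 as the data allow. The sketch below illustrates this; the function names and the samples are hypothetical.

```python
from statistics import median


def hl_two_sample(x, y):
    """Median of all pairwise differences y[j] - x[i]."""
    return median([yj - xi for xi in x for yj in y])


def u_stat(x, y, theta):
    """Number of pairs (i, j) with y[j] - x[i] > theta (assumed form of U)."""
    return sum(1 for xi in x for yj in y if yj - xi > theta)


x = [1.1, 2.3, 0.7]        # hypothetical X sample
y = [2.9, 3.8, 1.6, 4.2]   # hypothetical Y sample

theta_hat = hl_two_sample(x, y)
mn = len(x) * len(y)

# Compare U at the estimate with the centre of symmetry mn/2.
print("estimate:", theta_hat)
print("U at estimate:", u_stat(x, y, theta_hat), "target:", mn / 2)
```

When mn is odd, the median is itself one of the differences and U at the estimate is (mn-1)/2, the nearest attainable value below mn/2.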