EXAMPLE: We are testing performance of 4 different fertilisers using a 1-way layout. We have 20 more-or-less similar plots. We have randomly partitioned them into 4 groups consisting of 5 plots each. The plots have been numbered as $(i,j)$ with $i=1,...,4$ and $j=1,...,5.$, where $i$ denotes the fertiliser used. The model is $$ y_{ij} = \mu_i + \epsilon_{ij}, $$ where $\epsilon_{ij}\sim N(0,\sigma^2).$
The first question that we ask is here is: do the fertilisers make any difference to the yield? This corresponds to the hypothesis:$H_0: \mu_1=\cdots =\mu_4.$This is basically asking if there is at least one pair $i_1\neq i_2$ such that $\mu_{i_1}\neq \mu_{i_2}.$ But the ultimate question for the farmer will be to identify all such pairs. So now we are trying to test all the following hypotheses:
$H_{0i_1i_2}: \mu_{i_1}=\mu_{i_2}$ for $i_1\neq i_2\in\{1,2,3,4\}.$So we have to perform ${4\choose2} = 6$ tests.
This is one example of multiple comparisons. In the general set
up we have $p$ null hypotheses, $H_{0i},$ for $i=1,...,p.$
We want to design a statistical procedure that will come up with
decisions to accept or reject these $p$ null hypotheses.
We shall be given a number $\alpha\in(0,1).$ We must
guarantee that our statistical procedure is such that if all
the $H_{0i}$'s are true, then the chance of rejecting at
least one of them is $\leq \alpha.$ This probability is
called the experimentwise error rate, and is the analog
of $P($type I error$)$ in case of usual hypothesis testing.
Different procedures have been suggested in the literature. We
discuss a few of them here.
$H_0: $ all the $H_{0i}$'s are true.If this test accepts $H_0,$ then accept all the individual $H_{0i}$'s, and there is no need for the second stage. If this test rejects $H_0,$ then in the second stage we test each individual $H_{0i}$ separately at level $\alpha,$ and report the decisions. In the linear models context, each $H_{0i}$ is typically about equality of two coefficients. These pairwise comparisons are done using 2-sample $t$-test under the homoscedastic set up (same $\sigma^2$). This $\sigma^2$ is estimated using the entire data. It is easy to see that this procedures indeed keeps the experimentwise error rate $\leq \alpha.$ Consider the event that at least one $H_{0i}$ has been rejected. This means $H_0$ was rejected in the first stage. This itself has probability $\leq \alpha.$ This procedure occasionally produces counterintuitive result: it is possible that $H_0$ is rejected in the first stage, but then in the second stage all the $H_{0i}$'s get accepted!
EXAMPLE: Our model is $$ y_{ijk} = \mu+\alpha_i+\beta_j + \epsilon_{ijk}, $$ where $\epsilon_{ijk}\sim N(0,\sigma^2)$ for $i=1,...,I$ and $j=1,...,J$ and $k=1,...,K.$ We make the estimability assumptions $\sum \alpha_i = 0$ and $\sum \beta_j = 0.$
We are trying to test $H_{0j_1j_2}: \beta_{j_1}=\beta{j_2}$ for all pairs $j_1,j_2.$ To apply Tukey's procedure here we notice that $$ \b y_{.j.}\sim N\left(\mu+\beta_j, \tfrac{\sigma^2}{IK} \right). $$ Then $\b y_{.j.}$'s are all independent. Also $MSE$ is an estimator of $\sigma^2,$ which is independent of the $\b y_{.j.}$'s. Hence we can use Tukey's procedure.
EXAMPLE: Suppose that we are in the 1-way ANOVA model: $$ y_{ij} = \mu_i + \epsilon_{ij}, $$ for $i=1,...,k$ and $j=1,...,t.$
We are intereseted in the subspace $\mu_i$'s all same. If we call this $V,$ then $dim(V) = 1.$ Here is a typical component hypothesis: $\mu_1=\mu_2.$ To test this we use the test statistic $$ F = \frac{SS(\mu_1-\mu_2)/1}{MSE}, $$ where $SS(\mu_1-\mu_2)$ is obtained as $RSS_0-RSS.$ The $RSS_0$ is computed by maerging the first two classes. We reject the null hypothesis if $F>F(1-\alpha,1,$error df$).$
lm function. Instead it requires the model to be
fit using the aov function (which actually
uses lm behind the scene, but packages and prints
the output in a different form).
The qtukey function finds the quantiles for
the $Q$ distribution:
qtukey(0.90,4,3)Possible a CDF plot would help to visualise the distribution better:
x = seq(0,10,len=100) y = ptukey(x,4,3) plot(x,y,ty='l')The function
TukeyHSD performs the test (or rather
finds the confidence intervals):
summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks))
TukeyHSD(fm1, "tension", ordered = TRUE)
plot(TukeyHSD(fm1, "tension"))
The above example has been taken from the online R help.