
$ \newcommand{\cov}{\mathrm{cov}} \newcommand{\var}{\mathrm{var}} \newcommand{\corr}{\mathrm{corr}} \newcommand{\hv}[1]{\hat{\vec{#1}}} \newcommand{\v}[1]{\vec{#1}} $

ANOVA tables for linear mixed models

Here we shall take a look at testing using ANOVA tables in the case of linear mixed models.

1-way ANOVA

Suppose that some drug is being tested in $k$ hospitals, each of which tries the drug on $n$ patients. The resulting data set has $kn$ rows (one per patient) and two columns (hospital and response). A future user of the analysis report would not care about the specific $k$ hospitals participating in the study, so we consider the hospital effect as random. The model is: $$ y_{ij} = \mu + a_i + \epsilon_{ij}, $$ where $i=1,...,k$ and $j=1,...,n.$ We assume, as usual, that the $a_i$'s are iid $N(0,\sigma^2_a),$ the $\epsilon_{ij}$'s are iid $N(0,\sigma^2_e),$ and the $a_i$'s and $\epsilon_{ij}$'s are all independent. Here our parameters are $\mu,$ $\sigma^2_a$ and $\sigma^2_e.$
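To make the layout of the data concrete, here is a minimal sketch that simulates one data set from the model. It assumes normal random effects and errors; the values of $k,$ $n,$ $\mu,$ $\sigma_a$ and $\sigma_e$ are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 5, 4                   # hospitals, patients per hospital (illustrative)
mu = 10.0                     # overall mean (illustrative)
sigma_a, sigma_e = 2.0, 1.0   # standard deviations (illustrative)

a = rng.normal(0.0, sigma_a, size=k)          # one random effect per hospital
eps = rng.normal(0.0, sigma_e, size=(k, n))   # one error per patient
y = mu + a[:, None] + eps                     # y[i, j] = mu + a_i + eps_ij

# Long format: kn rows (one per patient), two columns (hospital, response).
hospital = np.repeat(np.arange(1, k + 1), n)
data = np.column_stack([hospital, y.ravel()])
print(data.shape)
print(data[:n])   # the n patients of the first hospital share the same a_1
```

All patients of the same hospital share one draw $a_i,$ which is what sets this apart from the fixed effects set-up.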

EXERCISE: Compute $\cov(y_{ij}, y_{rs})$ for $i,r\in\{1,...,k\}$ and $j,s\in\{1,...,n\}.$

EXERCISE: Use the last exercise to also compute $\corr(y_{ij}, y_{rs})$ for $i,r\in\{1,...,k\}$ and $j,s\in\{1,...,n\}.$

Comparing with the fixed effects model

  1. In the fixed effects model $V(y_{ij})$ was just the error variance. But now $V(y_{ij})$ has two components, one from the error, the other from the random effects. Hence, such a model is sometimes also called a variance components model.
  2. In the fixed effects model, all the $y_{ij}$'s were independent. But now $y_{ij}$'s belonging to the same hospital are correlated.
With these points in mind we take a look at (the first 3 columns of) the ANOVA table:
$$
\begin{array}{lll}
\mbox{Source} & \mbox{d.f.} & \mbox{SS}\\
\mbox{Hospital} & k-1 & n\sum \bar y_{i\bullet}^2-nk\bar y_{\bullet\bullet}^2\\
\mbox{Error} & k(n-1) & \sum y_{ij}^2-n\sum \bar y_{i\bullet}^2\\
\mbox{Total} & kn-1 & \sum y_{ij}^2-nk\bar y_{\bullet\bullet}^2
\end{array}
$$
The table is the same as for the fixed effects model. But let us understand how its interpretation changes in the random effects scenario:

EXERCISE: Show that $E(\bar y_{i\bullet}) = \mu $ and $\var(\bar y_{i\bullet}) = \sigma^2_a+\frac 1n\sigma^2_e. $ Hence show that the Hospital MS has expectation $n \sigma^2_a+\sigma^2_e.$
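Once you have worked out the exercise, you may like to check the expectation numerically. The following is a rough Monte Carlo sketch, assuming normal random effects and errors. The function name `anova_ms` and all the parameter values are made up for illustration. The sums of squares are computed exactly as in the ANOVA table.

```python
import numpy as np

def anova_ms(y):
    """Return (Hospital MS, Error MS) for a k-by-n array y[i, j]."""
    k, n = y.shape
    row_means = y.mean(axis=1)
    grand_mean = y.mean()
    hospital_ss = n * np.sum(row_means**2) - n * k * grand_mean**2
    error_ss = np.sum(y**2) - n * np.sum(row_means**2)
    return hospital_ss / (k - 1), error_ss / (k * (n - 1))

rng = np.random.default_rng(1)
k, n = 6, 5                             # illustrative sizes
mu, sigma2_a, sigma2_e = 3.0, 4.0, 1.0  # illustrative parameter values
reps = 10000

ms = np.empty((reps, 2))
for r in range(reps):
    a = rng.normal(0.0, np.sqrt(sigma2_a), size=k)
    eps = rng.normal(0.0, np.sqrt(sigma2_e), size=(k, n))
    ms[r] = anova_ms(mu + a[:, None] + eps)

avg_hospital_ms, avg_error_ms = ms.mean(axis=0)
print("average Hospital MS:", avg_hospital_ms, " target:", n * sigma2_a + sigma2_e)
print("average Error MS:   ", avg_error_ms, " target:", sigma2_e)
```

The averages only approximate the expectations, of course; increase `reps` to tighten the agreement.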

EXERCISE: Under the fixed effects model, the Hospital SS and the Error SS were both $\sigma^2 \chi^2 $ random variables (the former was non-central, while the latter was central). Work out the distributions of the two SS's in the random effects model.

In the fixed effects model the two SS's were independent. The same thing happens to be true even in the random effects model, though this may not be readily apparent. The following exercises lead to a proof.

EXERCISE: Let $X,Y$ and $Z$ be jointly distributed random variables such that

Then show that $X$ and $Y$ must be independent.

EXERCISE: Take

in the above exercise to conclude that the two SS's are independent.

EXERCISE: In the fixed effects model we could test $H_0:$ no hospital effect by using the null distribution of the $F$-ratio $$ \frac{\mbox{Hospital MS}}{\mbox{Error MS}}\sim F_{(k-1,\,k(n-1))} \mbox{ (central)}. $$ The corresponding $H_0$ for the mixed effects case is $H_0:\sigma^2_a = 0.$ Show that the same null distribution is still valid here.
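A simulation cannot replace the proof, but it gives a quick sanity check. The sketch below generates data under $H_0:\sigma^2_a=0$ (with normal errors, and with made-up values of $k,$ $n,$ $\mu$ and $\sigma^2_e$), computes the $F$-ratio each time, and compares a few of its quantiles with those of draws from the central $F$ distribution with $k-1$ and $k(n-1)$ degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(2)
k, n = 6, 5               # illustrative sizes
mu, sigma2_e = 3.0, 2.0   # illustrative parameter values
reps = 20000

# Under H0 every a_i is zero, so y_ij = mu + eps_ij.
y = mu + rng.normal(0.0, np.sqrt(sigma2_e), size=(reps, k, n))

row_means = y.mean(axis=2)           # shape (reps, k)
grand_means = y.mean(axis=(1, 2))    # shape (reps,)
hospital_ms = n * np.sum((row_means - grand_means[:, None])**2, axis=1) / (k - 1)
error_ms = np.sum((y - row_means[:, :, None])**2, axis=(1, 2)) / (k * (n - 1))
f_ratio = hospital_ms / error_ms

# Reference sample drawn directly from the central F distribution.
reference = rng.f(k - 1, k * (n - 1), size=reps)

probs = [0.5, 0.9, 0.95]
q_sim = np.quantile(f_ratio, probs)
q_ref = np.quantile(reference, probs)
print("simulated F-ratio quantiles:", q_sim)
print("reference F quantiles:      ", q_ref)
```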

EXERCISE:  We know that the Error MS is an unbiased estimator for $\sigma^2_e.$ Find an unbiased estimator of $\sigma^2_a.$

2-way ANOVA

Consider the following model $$ y_{ijk} = \mu + a_i + \beta_j + g_{ij}+\epsilon_{ijk}, $$ where $i=1,...,I,$ $j=1,...,J$ and $k=1,...,K.$ We make the usual assumptions:
  1. $\epsilon_{ijk}$'s are iid $N(0,\sigma^2_e),$ where $\sigma^2_e>0$ is unknown.
  2. $a_i$'s are iid $N(0,\sigma^2_a),$ where $\sigma^2_a\geq 0$ is unknown.
  3. $g_{ij}$'s are iid $N(0,\sigma^2_g),$ where $\sigma^2_g\geq 0$ is unknown.
  4. $a_i$'s, $g_{ij}$'s and $\epsilon_{ijk}$'s are all independent.

EXERCISE: Let the first two columns of the ANOVA table be like:

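$$
\begin{array}{ll}
\mbox{Source} & \mbox{d.f.}\\
\mbox{Row} & I-1\\
\mbox{Column} & J-1\\
\mbox{Interaction} & (I-1)(J-1)\\
\mbox{Error} & IJ(K-1)\\
\mbox{Total} & IJK-1
\end{array}
$$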
Show that
  1. $E($Row MS$)=\sigma^2_e+K \theta_g +KJ \theta_a$
  2. $E($Column MS$)=\sigma^2_e+K \theta_g +KI \theta_\beta$
  3. $E($Interaction MS$)=\sigma^2_e+K \theta_g$
  4. $E($Error MS$)=\sigma^2_e$
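Here $\theta_a = \sigma^2_a$ and $\theta_g = \sigma^2_g$ are the variances of the random effects, while $\theta_\beta = \frac{1}{J-1}\sum_j (\beta_j-\bar\beta)^2,$ with $\bar\beta = \frac 1J\sum_j \beta_j,$ plays the corresponding role for the fixed effects $\beta_j.$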
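The four expectations can be checked by simulation. The sketch below rests on several assumptions that you should keep in mind: the sums of squares are the usual balanced two-way ones, the $\beta_j$'s are fixed, and $\theta_a=\sigma^2_a,$ $\theta_g=\sigma^2_g,$ $\theta_\beta=\sum_j(\beta_j-\bar\beta)^2/(J-1).$ The function name `two_way_ms` and all the numerical values are made up for illustration.

```python
import numpy as np

def two_way_ms(y):
    """Mean squares (Row, Column, Interaction, Error) for an array y[i, j, k]."""
    I, J, K = y.shape
    cell = y.mean(axis=2)        # cell means, shape (I, J)
    row = y.mean(axis=(1, 2))    # row means, shape (I,)
    col = y.mean(axis=(0, 2))    # column means, shape (J,)
    grand = y.mean()
    row_ss = J * K * np.sum((row - grand)**2)
    col_ss = I * K * np.sum((col - grand)**2)
    int_ss = K * np.sum((cell - row[:, None] - col[None, :] + grand)**2)
    err_ss = np.sum((y - cell[:, :, None])**2)
    df = np.array([I - 1, J - 1, (I - 1) * (J - 1), I * J * (K - 1)])
    return np.array([row_ss, col_ss, int_ss, err_ss]) / df

rng = np.random.default_rng(3)
I, J, K = 4, 3, 2                   # illustrative sizes
mu = 1.0
beta = np.array([-1.0, 0.0, 2.0])   # fixed column effects (illustrative)
s2a, s2g, s2e = 2.0, 0.5, 1.0       # variance components (illustrative)
reps = 10000

ms = np.empty((reps, 4))
for r in range(reps):
    a = rng.normal(0.0, np.sqrt(s2a), size=I)
    g = rng.normal(0.0, np.sqrt(s2g), size=(I, J))
    eps = rng.normal(0.0, np.sqrt(s2e), size=(I, J, K))
    y = mu + a[:, None, None] + beta[None, :, None] + g[:, :, None] + eps
    ms[r] = two_way_ms(y)

theta_beta = np.sum((beta - beta.mean())**2) / (J - 1)
targets = np.array([s2e + K * s2g + K * J * s2a,
                    s2e + K * s2g + K * I * theta_beta,
                    s2e + K * s2g,
                    s2e])
averages = ms.mean(axis=0)
print("average MS (Row, Column, Interaction, Error):", averages)
print("claimed expectations:                        ", targets)
```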

BLUP

Henderson introduced the concept of a Best Linear Unbiased Predictor (BLUP) of a random effect coefficient. Statisticians working in animal breeding use this concept extensively. See this paper for some details. (This paper, by the way, is not included in the syllabus of this course!) Here are a few things that you should know about BLUPs.

First, the definition. Suppose that you have an LME model with a random effect $u$ (i.e., a random coefficient). There may be other random coefficients also. Let the data vector be $\v y.$ Then by a BLUP of $u$ we understand a function of the form $\v \ell'\v y$ such that $E(\v \ell'\v y - u) = 0,$ and subject to this condition $\var(\v \ell'\v y - u)$ is the minimum possible.

Henderson gave a computational formula for finding BLUPs for the following model: $$ \v y = X\v \beta + Z\v u + \v \epsilon, $$ where $\v u\sim (\v 0, \sigma^2 G)$ and independently $\v \epsilon \sim (\v 0, \sigma^2 R).$ Here $G, R$ are known pd matrices. Then if we solve $$ \left[\begin{array}{ccccccccccc} X' R ^{-1} X & X' R ^{-1} Z\\ Z' R ^{-1} X & (Z' R ^{-1} Z + G ^{-1}) \end{array}\right]\left[\begin{array}{ccccccccccc}\hv \beta\\ \hv u \end{array}\right] = \left[\begin{array}{ccccccccccc}X' R ^{-1} \v y\\ Z' R ^{-1} \v y \end{array}\right], $$ we get the BLUE $\hv \beta $ of $\v \beta,$ and the BLUP $\hv u$ of $\v u.$ Don't cram this stuff! If you want to see its derivation, then read the above paper.
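If you would like to see the system of equations in action, here is a minimal numerical sketch. The matrices $X, Z, G, R$ and the data vector are all made up for illustration. As a cross-check it also computes the generalised least squares estimate $(X'V^{-1}X)^{-1}X'V^{-1}\v y$ and the predictor $GZ'V^{-1}(\v y - X\hv\beta)$ with $V = ZGZ'+R,$ which are standard equivalent expressions that do not appear in the text above.

```python
import numpy as np

rng = np.random.default_rng(4)
N, p, q = 12, 2, 3   # observations, fixed coefficients, random coefficients

# Made-up design: intercept plus one covariate, and q groups of equal size.
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Z = np.kron(np.eye(q), np.ones((N // q, 1)))
G = 0.5 * np.eye(q)                             # known pd matrix (illustrative)
R = np.diag(rng.uniform(0.5, 2.0, size=N))      # known pd matrix (illustrative)
y = rng.normal(size=N)    # any data vector will do for checking the algebra

# Henderson's system of equations.
Ri = np.linalg.inv(R)
Gi = np.linalg.inv(G)
lhs = np.block([[X.T @ Ri @ X, X.T @ Ri @ Z],
                [Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi]])
rhs = np.concatenate([X.T @ Ri @ y, Z.T @ Ri @ y])
sol = np.linalg.solve(lhs, rhs)
beta_hat, u_hat = sol[:p], sol[p:]

# Cross-check via V = Z G Z' + R (sigma^2 cancels everywhere).
V = Z @ G @ Z.T + R
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
u_direct = G @ Z.T @ Vi @ (y - X @ beta_gls)

print(np.allclose(beta_hat, beta_gls), np.allclose(u_hat, u_direct))
```

Notice that Henderson's system only needs $R^{-1}$ and $G^{-1},$ and never the inverse of the $N\times N$ matrix $V.$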

BLUPs, as you may guess from the complicated system of equations, are quite different from the BLUEs of $\v u$ that you would get by assuming $\v u$ to be fixed.
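The difference is easy to see in the balanced 1-way model. The sketch below assumes that the variance components are known (their values, like $k,$ $n$ and $\mu,$ are made up), takes $R=I$ and $G=(\sigma^2_a/\sigma^2_e)I,$ and solves Henderson's equations. For comparison it uses the fixed effects estimates $\bar y_{i\bullet}-\bar y_{\bullet\bullet},$ which are what you get under the sum-to-zero constraint. In this special case the BLUPs turn out to be the fixed effects estimates shrunk towards zero by the factor $n\sigma^2_a/(n\sigma^2_a+\sigma^2_e).$

```python
import numpy as np

rng = np.random.default_rng(5)
k, n = 4, 3                               # illustrative sizes
mu, sigma2_a, sigma2_e = 5.0, 1.0, 2.0    # illustrative, treated as known

a = rng.normal(0.0, np.sqrt(sigma2_a), size=k)
eps = rng.normal(0.0, np.sqrt(sigma2_e), size=(k, n))
y = (mu + a[:, None] + eps).ravel()

X = np.ones((k * n, 1))                      # only the overall mean is fixed
Z = np.kron(np.eye(k), np.ones((n, 1)))      # hospital indicators
G_inv = (sigma2_e / sigma2_a) * np.eye(k)    # inverse of G = (sigma2_a / sigma2_e) I

# Henderson's equations with R = I.
lhs = np.block([[X.T @ X, X.T @ Z],
                [Z.T @ X, Z.T @ Z + G_inv]])
rhs = np.concatenate([X.T @ y, Z.T @ y])
sol = np.linalg.solve(lhs, rhs)
u_blup = sol[1:]

# Fixed effects estimates under the sum-to-zero constraint.
a_fixed = y.reshape(k, n).mean(axis=1) - y.mean()

shrink = n * sigma2_a / (n * sigma2_a + sigma2_e)
print("fixed effects estimates:", a_fixed)
print("BLUPs:                  ", u_blup)
print("shrinkage factor:       ", shrink)
```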

Exercises

  1. Consider a 2-way ANOVA model without interaction, where both the effects are random. Work out the expected values of the $MS$ values.
