[Home]

Table of contents


$\newcommand{\v}{\vec}$ $\newcommand{\nul}{\mathcal N}$ $\newcommand{\col}{\mathcal C}$ $\newcommand{\row}{\mathcal R}$ $\newcommand{\B}{\mathcal B}$ Eigenanalysis

Eigenvalues and eigenvectors

What we already (should) know...

We shall work with the field ${\mathbb C}.$ $A$ is an $n\times n$ matrix.
  1. $\v x\in{\mathbb C}^n$ is called an eigenvector corresponding to an eigenvalue $\lambda\in{\mathbb C}$ if $$ A\v x = \lambda \v x $$ and $\v x\neq \v 0.$
  2. The characteristic polynomial of $A$ is defined as $p_A(\lambda)=det(A-\lambda I).$
  3. $\lambda\in{\mathbb C}$ is an eigenvalue of $A$ if and only if $p_A(\lambda)=0.$
  4. If $\lambda\in{\mathbb C}$ is any eigenvalue of $A$ then the eigenvectors corresponding to it are the nonzero vectors in $\nul(A- \lambda I).$ The entire nullspace (including the zero vector) is called the eigenspace corresponding to $\lambda.$

Similarity

Two square matrices $A,B$ are called similar if $A = PBP ^{-1},$ for some nonsingular $P.$

Similarity is important because it is an equivalence. Two matrices are similar if and only if they both represent the same linear transformation (w.r.t. two different bases).

To understand this we need to understand the concept of change-of-basis. Suppose that we have two bases $\B_1=\{\v u_1,...,\v u_n\}$ and $\B_2=\{\v v_1,...,\v v_n\}$ for the same vector space. Then any vector has two representations (one w.r.t each of these bases). It will be useful to have a formula by which we can move from one representation to the other.

Let $\v \alpha = (\alpha_1,...,\alpha_n)$ be the representation w.r.t. $\B_1,$ and $\v \beta = (\beta_1,...,\beta_n)$ w.r.t. $\B_2.$

It should be clear that for each $\v \alpha $ we have unique $\v \beta.$ So this is a function. Also, this is one-to-one and onto. Also it must be a linear transformation. Hence there must exist a nonsingular matrix $P$ such that $\v\beta = P\v \alpha.$ This $P$ is called the change-of-basis matrix for moving from $\B_1$ to $\B_2.$ Clearly, if we want to return to $\B_1$ from $\B_2,$ the change-of-basis matrix will be $P ^{-1}.$

It is easy to see that the converse also holds: for any given basis $\B_1$ and any nonsingular matrix $P,$ there exists a basis $\B_2$ such that $P$ is the change-of-basis matrix for moving from $\B_1$ to $\B_2.$

Now suppose we have a linear transformation $f:V\rightarrow V.$ Then w.r.t. $\B_1$ it will have a matrix representation, $A,$ say. Also it will have a matrix representation $B$ w.r.t. $\B_2.$

If $P$ is the change-of-basis matrix for moving from $\B_1$ to $\B_2,$ then it is easy to express $B$ in terms of $A$ and $P$ in three steps:
  1. You start with a vector $\v v$ and its representation $\v \beta $ w.r.t. $\B_2.$ Convert it to the representation w.r.t. $\B_1$ by $\v \alpha = P ^{-1} \v \beta.$
  2. Now apply $A$ to express $f(\v v)$ as $A\v \alpha = AP ^{-1} \v \beta $ in terms of $\B_1.$
  3. Finally return to $\B_2$ as $P A P ^{-1} \v \beta.$
Thus we see that $B=P A P ^{-1}.$

Again, the converse also holds: Given any basis $\B_1$ and any pair of similar matrices $A,B$ there exists a basis $\B_2$ such that the linear transformation represented by $A$ w.r.t. $\B_1$ is the same as the linear transformation represented by $B$ w.r.t. $\B_2.$

A useful fact

TheoremIf $\v v_1,...,\v v_k$ are eigenvectors of $A$ corresponding to distinct eigenvalues, then $\{\v v_1,...,\v v_k\}$ must be linearly independent.
The proof is by induction on $k.$

Trivial for $k=1.$

Assume up to $k-1.$

To prove assume that $\alpha_1,...,\alpha_k\in{\mathbb C}$ are such that $$ \alpha_1\v v_1+\cdots +\alpha_k\v v_k = \v0. $$ Premultiplying by $A$ we have $$ \alpha_1 \lambda_1\v v_1+\cdots +\alpha_k \lambda_k\v v_k = \v0. $$ Eliminate $\v v_1$ from these two equations to get a linear combination of $\v v_2,...,\v v_k$ only that equals $\v 0:$ $$ \alpha_2 (\lambda_2-\lambda_1)\v v_2+\cdots +\alpha_k (\lambda_k-\lambda_1)\v v_k = \v0. $$ By induction hypothesis, the coefficients must all vanish. So $\alpha_2=\cdots=\alpha_k=0,$ since $\lambda_i$'s are all distinct.

Hence $\alpha_1$ must be $0$ as well, since $\v v_1\neq\v 0.$

More terminology

If $\lambda\in{\mathbb C}$ is an eigenvalue of $A,$ then its multiplicity as a zero of $p_A(\lambda)$ is called its algebraic multiplicity (am). Also, the dimension of $\nul(A- \lambda I)$ is called the geometric multiplicity (gm) of $\lambda.$ Note that am or gm both must lie between $1$ and $n.$ Also the sum of am's of all the eigenvalues must be $n.$

Theorem For any eigenvalue $\lambda\in{\mathbb C}$ we have $am(\lambda) \geq gm(\lambda).$
The proof uses a standard trick. Start with any eigenvalue $\lambda\in{\mathbb C}.$ Denote its gm by $g.$ Grab any $g$ linearly independent eigenvectors $\v v_1,...,\v v_g$ Extend it to a basis $\{\v v_1,...,\v v_n\}$ of ${\mathbb C}^n.$ Stack these as columns to get a nonsingular matrix $P.$

Note that $AP = PB,$ where $B$ is a block triangular matrix, with leading block $\lambda I_g.$

Clearly $am(\lambda,B)\geq g.$ Also $am(\lambda,B) = am(\lambda,A).$

Hence the result.

The following result can be proved using the same trick.
Theorem Any square matrix is similar to a triangular matrix.
The proof is by induction on $n,$ the size of the matrix.

Nothing to prove if $n=1.$

Assume up to $n-1.$

Shall prove for $n.$ Take any eigenvalue $\lambda\in{\mathbb C}.$ Pick any eigenvector $\v v$ corresponding to it. Extend it to a basis. Stack the basis vectors as columns to get a nonsingular matrix $P.$ Then $AP = B P,$ where $B$ has the form $$ B = \left[\begin{array}{ccccccccccc}\lambda & *\\ \v 0 & C \end{array}\right], $$ where $*$ denotes some row vector and $C$ is $(n-1)\times(n-1).$

Apply induction hypothesis on $C$ to write it as $$ C = Q T Q ^{-1}, $$ where $T$ is upper triangular. Thus $$ B = \left[\begin{array}{ccccccccccc}\lambda & *\\ \v 0 & QTQ ^{-1} \end{array}\right] = \left[\begin{array}{ccccccccccc}1 & \v 0'\\\v 0 & Q \end{array}\right]\left[\begin{array}{ccccccccccc}\lambda & *\\ \v 0 & QTQ ^{-1} \end{array}\right] \left[\begin{array}{ccccccccccc}1 & \v 0'\\\v 0 & Q ^{-1} \end{array}\right] =\left[\begin{array}{ccccccccccc}1 & \v 0'\\\v 0 & Q \end{array}\right]\left[\begin{array}{ccccccccccc}\lambda & *\\ \v 0 & T \end{array}\right] \left[\begin{array}{ccccccccccc}1 & \v 0'\\\v 0 & Q \end{array}\right] ^{-1}, $$ completing the proof.

This result is very helpful to prove properties about square matrices. If the property is invariant under similarity, then it is enough to the prove it for upper triangular matrices only. Here is one example.

If $am(\lambda,A)=gm(\lambda,A),$ then call $\lambda$ a regular eigenvalue of $A.$

Theorem The following statements are equivalent about a matrix $A_{n\times n}:$
  1. $A$ is a similar to a diagonal matrix.
  2. all its eigenvalues are regular.
  3. The sum of the gm's of all the eigenvalues equals $n.$
  4. There is a basis of ${\mathbb C}^n$ consisting of only eigenvetors of $A.$

Unitarily similar

While similarity is a useful concept, performing a similarity transformation is often numerically unstable if the change-of-basis matrix is nearly singular.
Nearly singular Consider the numbers and $10$ and $10^{-20}.$ Both are nonzero, and so a computer can compute their reciprocals. Since the computer can work with only finite precision, the results will be slightly erroneous. The amount of error will be higher for $10^{-20}$ because it is "nearly zero". You can think of this as a nearly singular $1\times 1$ matrix. For an $n\times n$ matrix, there are various ways to measure closeness to singularity (sometimes also called the conditioning of the matrix). A popular measure is the ratio of the largest absolute eigenvalue to the smallest absolute eigenvalue. Whatever the measure, the main idea is that for nearly singular matrix, the inverse computed using a computer may differ drastically from the actual inverse. A notorious example is the Hilbert matrix which $n\times n$ with $(i,j)$-th entry $\frac{1}{i+j-1}.$ It is actually a PD matrix. You can compute the $10\times 10$ Hilbert matrix in R as
H = outer(1:10,1:10, function(i,j) 1/(i+j-1))
Now compute $H ^{-1}.$
Hi = solve(H)
Since $H$ is symmetric, so must be $H ^{-1}.$ Well, let's check the symmetry of the computed inverse:
max( abs( Hi - t(Hi) ) )
The answer is 217154.2. Shocked?
So we often like to work with the case where $P$ is unitary.

$A,B$ are said to be unitarily similar if $A = PBP^*$ for some unitary matrix $P.$

The following result (called Schur decomposition) is the unitary analog of a result proved earlier. The proof is similar.
Theorem Any square matrix is unitarily similar to an upper triangular matrix.
Unitary similarity with triangular matrices is a property shared by all complex square matrices. Unitary similarity with diagonal matrices, however, is more usefull to have. There is a simple (though unexpected) characterization of such matrices. We need a little definition first.
Definition A square matrix (over ${\mathbb C}$) is called normal if $A^*A=A A^*.$
Theorem A matrix is unitarily similar to a diagonal matrix if and only if it is normal.

Proof: Thanks to Schur decomposition, it is enough to show the result for only upper triangular matrices. Trivial direct computation shows that an upper triangular matrix is normal if and only if it is diagonal.

The following diagram tells you how to proceed with the computation:
When we compute the $(1,1)$-th entry of $A A^*$ it is $\sum|a_{1j}|^2.$ But when we compute the $(1,1)$-th entry of $A^*A$ we get just $|a_{11}|^2.$ Equating these we immediately get $a_{1j}=0$ for $j\geq 2.$ [QED]

Many useful types of matrices are normal. For example, real symmetric and complex Hermitian.

Here a word is in order about the real symmetric case. Our main weapon in proving the last theorem was Schur decomposition, which is true for complex field, but not for real field. So what we get is that every real symmetric matrix can be written as $PDP^*,$ where $P$ is a (complex) unitary matrix, and $D$ is a (complex) diagonal matrix. It would be nice if $D$ and $P$ may also be guaranteed to have real entries. Indeed, this is true:
TheoremEvery real symmetric matrix is orthogonally similar to a real diagonal matrix.
The proof proceeds in two simple steps:

In the first step we show that all the eigenvalues of a real symmetric matrix must be real. Let $\lambda $ be an eigenvalue of a real symmetric $A.$ Let $\v v\in{\mathbb C}^n$ be any eigenvector for it. Just expand $\v v^* A\v v$ is two ways: $$ \v v^* A \v v = \v v^* \lambda \v v = \lambda \|\v v\|^2 $$ and $$v v^* A \v v = (A\v v)^* \v v = \bar \lambda \v v^*\v v = \bar \lambda \|\v v\|^2. $$ Since $\v v\neq\v0,$ tyhis proves that $\lambda\in{\mathbb R}.$

In the second part we show that if $\v u$ and $\v v$ are real eigenvectors corresponding to distinc eigenvalues, then $\v u'\v v=0.$ It follows easily by expanding $\v u'A\v v$ in two ways.

Since $A$ is normal, the gm's sum to $n.$

For each distinct eigenvalue, get an ONB of the eigenspace. Together they form ONB of ${\mathbb R}^n.$

Stack these as columns to get an orthogonal matrix $P.$ Clearly, $AP = PD,$ or $A = PDP',$ as required.

The decompostion $A = PDP'$ is called spectral decomposition of $A.$

Quadratic forms

Just as Schur decomposition allows us to replace any square matrix by an upper triangular matrix while working with properties invariant under similarity and congruence, similarly spectral decomposition allows us to think of real symmetric matrices as diagonal.

Think of a real diagonal matrix $D.$ It should be obvious that $D$ is So we immediately get the following result.
Theorem A real symmetric matrix is
Of course, you can state analogous result for ND, NPD and ID matrices.

Here is another generalisation of diagonal matrices. The following result is obvious.
Theorem Let $D$ be a real diagonal matrix $D = diag(d_1,...,d_n).$ Let $m =\min\{ d_i\}.$ and $M =\min\{ d_i\}.$ Then for al unit vectors $\v x$ $$m\leq \v x' D\v x \leq M.$$

Also $\v x' D\v x = m$ iff $x_i = 0$ whenever $d_i > m.$

Similarly, $\v x' D\v x = M$ iff $x_i = 0$ whenever $d_i < M.$

Proof: $\v x'D \v x = \sum d_i x_i^2 \in [m,M]\sum x_i^2=[m,M],$ since $\sum x_i^2=1.$

Also $\v x' D\v x - m = \sum (d_i-m) x_i^2.$ All the terms in the sum are nonnegative, and so the sum vanishes iff all the terms are individually zero. So for all $i$ we need $(d_i-m) x_i^2=0.$ This happens iff $x_i=0$ whenever $d_i > m.$

The condition for the maximum follows similarly. [QED]

This result easily generalises for real, symmetric matrices:
Theorem If $A$ is a real symmetric matrix, with minimum eigenvalue $m$ and maximum eigenvalue $M.$ Then for all unit vectors $\v x$ $$m\leq \v x' A\v x \leq M.$$ Also, the lower bound is attained iff $\v x$ is a unit eigenvector corresponding to $m.$ The upper bound is attained iff $\v x$ is a unit eigenvector corresponding to $M.$

Proof: Let $A = PDP'$ be a spectral decomposition. Then $$\{\v x' A \v x~:~\|\v x\|=1\} = \{\v y' D \v y~:~\|\v y\|=1\}.$$ This is follows immediately on putting $\v y = P'\v x.$

So, by the last result, the minimum and maximum is $\min\{\lambda_i\}$ and $\max\{\lambda_i\}.$

Also, the minimum is attained iff $y_i=0$ whenever $\lambda_i > m.$

Thus $\v x = P\v y$ is a linear combination of the columns of $P$ corresponding to $m.$ Since these columns form an ONB of the eigenspace for $m,$ the condition follows.

Similarly, for the other condition. [QED]

By a principal submatrix of a square matrix, we understand a (square) submatrix whose position in the original matrix is symmetric around the leading diagonal. The determinant of such a submatrix is called a principal minor. If the submatrix is positioned in the top left corner of the original matrix, then we call it a leading principal submatrix.
Theorem A real symmetric matrix is PD iff all its leading principal submatrices are PD.

Proof: The "if part" is trivial, since the entire matrix itself is a leading principal submatrix.

For the "only if part" let $A_k$ be the $k$-th leading principal submatrix of $A.$ Shall show that $A_k$ is PD, ie,

$\forall \v x(\neq\v0)\in{\mathbb R}^k~~\v x' A_k \v x > 0.$

Take any $\v x\neq\v 0$ in ${\mathbb R}^k.$

Define $\v y\in{\mathbb R}^n$ as $$ \v y = \left[\begin{array}{ccccccccccc}\v x\\\v 0_{n-k\times1} \end{array}\right]. $$ Now $$ \v x' A_k \v x =\v y' A\v y > 0, $$ since $\v y\neq\v 0$ and $A$ is PD. This completes the proof. [QED]

The following theorem is very important, as it gives a simple characterisation of PD and NND matrices.
Theorem

Proof: If $A = B' B$ for some real matrix $B$ then clearly for any $\v x$ we have $\v x' A\v x = \v x' B'B \v x = (B\v x)'(B\v x) = \|B\v x\|^2\geq 0.$ Hence NND.

For the converse let $A$ be NND. So in particular $A$ is real symmetric. Let a spectral decomposition be $A = PDP'.$ Then the diagonal entries of $D$ are the eigenvalues and hence we can take their square roots to get $D^{1/2}.$

So $A = PDP' = PD^{1/2}D^{1/2}P' = B'B,$ where $B = D^{1/2}P'.$

Proof of the PD part is similar. [QED]

Some simple facts

Here are a few facts that are simple to prove, and are often useful.
Theorem For any square matrix over ${\mathbb C}$ the trace equals the sum of eigenvalues (each eigenvalue is counted as many times as its am).

The determinant is the product of all the eigenvalues (each eigenvalue is counted as many times as its am).

Proof: The result is true for any upper triangular matrix (since the eigenvalues are the diagonal entries). So the result holds for all square matrices over ${\mathbb C}$ using Schur decomposition. [QED]

Theorem If $p(x)$ is any polynomial with coefficients in ${\mathbb C}$ then each eigenvalue of $p(A)$ is of the form $p(\lambda),$ where $\lambda $ is an eigenvalue of $A.$ Conversely, for each eigenvalue $\lambda $ of $A$ the number $p(\lambda)$ is an eigenvalue of $p(A).$

Proof: Use Schur decomposition. Notice that if $A$ is similar to an upper triangular matrix $T,$ then $p(A)$ must be similar to $p(T),$ which is again upper triangular. Also, $((p(T)))_{i i} = p(t_{i i}).$ [QED]

Theorem A square matrix $A$ is nonsingular iff all its eigenvalues are nonzero. In this case, the eigenvalues of $A ^{-1} $ are precisely the reciprocals of the eigenvalues of $A.$

Proof: The result is obvious for upper triangular matrices. Generalise to general square matrices (over $cc$) by Schur decomposition. [QED]

Theorem If $A,B$ are two matrices (not necessarily square) such that $AB$ and $BA$ are both defined, then the nonzero eigenvalues of $AB$ and $BA$ must be the same.

Proof: Let $\lambda\neq0$ be an eigenvalue of $AB$ with a corresponding eigenvector $\v v\neq\v0.$

Then $AB\v v = \lambda\v v,$ and so premultiplying by $B,$ we get $BAB\v v = \lambda B\v v.$

From this we want to claim that $\lambda $ is an eigenvalue of $BA$ corresponding to eigenvector $B\v v.$

To do this we need to check that $B\v v\neq\v 0,$ which is obvious because if $B\v v=\v 0,$ then $AB\v v=\v 0,$ and hence $\lambda \v v = \v 0,$ which is impossible since $\lambda\neq0$ and $\v v\neq\v 0.$ [QED]

The following theorem is a stronger version of teh above, which also shows that the $am$'s must match for the nonzero eigenvalues.
Theorem Let $A_{m\times n},B_{n\times m}$ be two complex matrices with $m \leq n.$ $f(\lambda)$ be the characteristic polynomial of $AB$ and $g(\lambda)$ be that of $BA.$

Then $$ g(\lambda) = (-\lambda)^{n-m} f(\lambda). $$

Proof: The proof is not as straight forward as that of the last theorem. Consider the two matrices $$ C = \left[\begin{array}{ccccccccccc}\lambda I & A \\ B & I \end{array}\right] \mbox{ and } D = \left[\begin{array}{ccccccccccc}-I & O \\B & - \lambda I \end{array}\right]. $$ Now compute $CD$ and $DC,$ and use the fact that $det(CD) = dec(C)\cdot\det(D) = det(DC).$ [QED]

This proof, by the way, is not at all intuitive. Don't feel disheartened if you do not readily see how $C$ and $D$ were chosen. They were chosen by back calculation and trial-and-error. The result is useful, not the technique used in its proof.

Singular Value Decomposition

It is a very useful generalisation of spectral decomposition of real symmetric matrices. It applies to all matrices (need not even be square) over ${\mathbb R}$ or ${\mathbb C}.$ We first state the result over ${\mathbb R}:$
Theorem Let $A_{m\times n}$ be any real matrix of rank $r.$ Then we factor $A$ as $$ A_{m\times n} = U_{m\times m}D_{m\times n}V'_{n\times n}, $$ where $U,V$ are orthogonal, and $D$ is a diagonal matrix where $d_{i i}>0$ for $i=1,...,r$ and all other entries of $D$ are zero.

Proof: First let's do a little rough work to guess how to construct $U,V$ and $D.$ Assuming that such a decomposition exists we have $$ A'A = VD^2 V'. $$ So clearly the columns of $V$ must be eigenvectors of $A'A$ and the diagonal entries of $D^2$ must be the eigenvalues. So we must take the positive square roots of the nonzero eigen values of $A'A$ as the nonzero diagonal entries of $D.$ Again, we have $AV = DU.$ This tells us how to choose the first $r$ columns of $U.$

That's all the rough work we shall need. Now we shall define $U,V,D$ following rough work, and directly prove that $A = UDV'.$

Take a spectral decomposition of the real, symmetric matrix $A'A:$ $$ A'A = V S V'. $$ We shall use this $V.$ Also we know that $r(A) = r(A'A)= r(S) = r.$ So exactly $r$ of the diagonal entries of $S$ must be nonzero (actually positive). We define $D$ as the $m\times n$ matrix all whose entries are 0, except $d_{i i} = \sqrt{ s_{i i}}.$

Next, we define $\v u_i = \frac{1}{d_i} A\v v_i$ for $i=1,...,r.$

These must be orthonormal. (Check this by directly computing $\v u_i'\v u_j$.)

Extend $\{\v u_1,...,\v u_r\}$ to an ONB of ${\mathbb R}^m:$ $$ \{\v u_1,...,\v u_m\}. $$ Stack these a columns to get an orthogonal matrix $U.$

So we have finished constructing all the necessary ingredients: $U,D$ and $V.$

Now it is trivial to check that $A = UDV'.$

This completes the proof. [QED]

If the field is ${\mathbb C}$ then the SVD looks like $A = UDV^*$, but the nonzero entries of $D$ are still positive real numbers. In the proof we work with $A^*A$ in place of $A'A.$

Cayley-Hamilton Theorem

A rather surprising theorem is the following:
Theorem If $A$ is any complex square matrix, and $p(x)$ is its characteristic polynomial, then $p(A)=O.$
An example will help to illustrate what we mean by $p(A).$ Suppose $$A= \left[\begin{array}{ccccccccccc}1&2\\3&4 \end{array}\right].$$ Then $p(x) = |A-x I| = (1-x)(4-x)-6 = x^2-5x-2.$

Then $p(A) = A^2-5A-2I = \cdots = O.$ Notice how the constant term $-2$ in $p(x)$ was converted to $-2I$ in $p(A).$

Proof: The proof relies on the standard fact that for any square matrix $B$ we have $$ B\cdot adj(B) = |B|\cdot I, $$ where $adj(B)$ is th adjugate of $B.$ It is the matrix of cofactors of $B.$ For example, the $(i,j)$-th entry of $adj(B)$ is the $(-1)^{i+j}$ times the determinant of the submatrix of $B$ obtained by deleting the $j$-th row and $i$-th column (note the switch between row and column).

We shall apply this result with $A-xI$ in place of $B$ to get $$ (A-xI)\cdot adj(A-xI) = p(x) I. $$ Let's understand $adj(A-xI)$ taking the example $A= \left[\begin{array}{ccccccccccc}1&2\\3&4 \end{array}\right].$

Here $A-xI = \left[\begin{array}{ccccccccccc}1-x&2\\3&4-x \end{array}\right],$ and so $adj(A-xI) = \left[\begin{array}{ccccccccccc}4-x & -2\\ -3 & 1-x \end{array}\right].$

Notice that each entry is a polynomial of degree at most $1.$ Convince yourself that for a general $n\times n$ matrix $A,$ each entry of $adj(A-xI)$ will be a polynomial of degree at mot $n-1.$

The next step is to rewrite $adj(A-xI)$ as a polynomial in $x$ with matrix coefficients. Again our example will help.

$$ adj(A-xI) = \left[\begin{array}{ccccccccccc}4-x & -2\\ -3 & 1-x \end{array}\right] = \left[\begin{array}{ccccccccccc}4 & -2\\ -3 &1 \end{array}\right] + \left[\begin{array}{ccccccccccc}-1 & 0\\ 0& -1 \end{array}\right]x. $$ For general $A_{n\times n}$ this will be an $n-1$-degree polynomial, which we shall write as $$ adj(A-xI) = C_0+C_1x+\cdots+C_{n-1} x^{n-1}. $$ Here each $C_i$ is an $n\times n$ matrix.

Now we are in a position to use our result: $$ (A-xI)\cdot adj(A-xI) = p(x) I, $$ which can be written as $$ (A-xI)(C_0+C_1x+\cdots+C_{n-1} x^{n-1}) = p(x)I. $$ If we write $p(x) = p_0+p_1x+\cdots+p_n x^n,$ then this becomes $$ AC_0 +(AC_1-C_0)x+(AC_2-C_1)x^2+\cdots+(AC_{n-1}-C_{n-2})x^{n-1} - C_{n-1}x^n = p_0I+p_1Ix+\cdots +p_n Ix^n. $$ Note that this holds for all $x,$ ie, we have an identity. So we can match coefficients. Don't worry about the coefficients being matrices, just think that you are matching scalar coefficients for a single identity for each entry.

Then you get $$ AC_0 = p_0I,~~ AC_1-C_0 = p_1I,~~ \ldots,~~AC_{n-1}-C_{n-2} = p_{n-1}I,~~-C_{n-1} = p_nI. $$ It immediately follows that $p(A) = p_0I+p_1A+p_2A^2+\cdots +p_n A^n = O.$ [QED]

Problems

  1. Prove that every idempotent matrix is similar to a diagonal matrix with diagonal entires $0$ or $1.$
  2. Prove that a matrix denotes an orthogonal projection (over ${\mathbb R}$ w.r.t. Euclidean inner product) iff it is of the form $PDP',$ where $P$ is an orthogonal matrix and $D$ is a diagonal matrix with diagonal entries $0$ and $1.$
  3. Show that if $A^2=A$ then $r(A)=trace(A).$ Is the converse true? Justify your answer.
  4. A diagonal matrix has eigenvalues equal to the diagonal entries. Gershghorin's theorem says that if a matrix is approximately diagonal (ie, off-diagonals may be nonzero, but have small magnitude), then the eigenvalues cannot be far from the diagonal entries. More precisely, prove the following.
    Let $A$ be a square matrix and $R_i = \sum_{j\neq i} |a_{ij}|.$ If $\lambda$ is an eigenvalue of $A$ then it must lie in the union of the discs $$ D_i = \{z\in{\mathbb C}~:~ |z-a_{ii}| \leq R_i\}. $$
  5. Show that if $A_{2\times2}$ has eigenvalues $\lambda_1\neq \lambda_2,$ then the columns of $A- \lambda_1 I$ must be eigenvectors (or $\v 0$) corresponding to $\lambda_2.$
  6. A permutation matrix is a matrix that has exactly one $1$ in each row and column (the rest being $0$). It is called a permutation matrix because pre(or post)multiplying by it causes the rows(or columns) of a matrix to be permuted. Can you find at least one eigenvalue and eigenvector for a permutation matrix?
  7. We know that the characteristic polynomial $det(A- \lambda I)$ of an $n\times n$ matrix $A$ is a polynomial with degree $n$ and coefficient of $\lambda^n$ equal to $(-1)^n.$ Is it true that any such polynomial must be the characteristic polynomial of some matrix? If we allow complex matrices, then it is easy to see that the answer is "Yes". Just find all the roots, and create a diagonal matrix with them. But if the polynomial has real coeeficients, then is it possible to have a real matrix with that polynomial as its characteristic polynomial? Explore this by choosing suitable numbers in place of the * below: $$ A = \left[\begin{array}{ccccccccccc} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & * \end{array}\right] $$ to make the characteristic polynomial $-\lambda^3,$ $3 \lambda^2 - \lambda^3$ and $2 \lambda^2 - \lambda^3.$

    Next make the characteristic polynomial $2 \lambda - \lambda^3.$ by choosing a suitable number in place of the * below: $$ A = \left[\begin{array}{ccccccccccc} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & * & 0 \end{array}\right] $$ Prove that all $n$-degree polynomials with leading coefficient $(-1)^n$ is the characteristic polynomial of some $n\times n$ matrix of the form $$ \left[\begin{array}{ccccccccccc} \v 0 & I_{n-1}\\ \v a' & b \end{array}\right]. $$
  8. Let $A_{n\times n}$ is a real, symmetric, nonsingular matrix with eigenvalues $\lambda_1 > \lambda_2 > \cdots > \lambda_n>0.$ Let $\v x_0\in{\mathbb R}^n$ be any unit vector. We define a sequence of vectors as follows. $$ \v x_{k+1} = unit(A\v x_k), $$ where $unit(\v p)$ denotes the unit vector along $\v p.$ What can you say about the limit?

    Hint:

    Let $\v v_i$ be a unit norm eigenvector corresponding to $\lambda_i.$ Then we know that $\{\v v_1,...,\v v_n\}$ is an ONB of ${\mathbb R}^n.$

    Observe that $\v x_k$ is a multiple of $A^k\v x_0$ and also has unit norm. So $$ \v x_k = \frac{A^k\v x_0}{\|A^k\v x_0\|}. $$ Let $\v x_0 = \sum\alpha_i \v v_i.$ Then $$ A^k\v x_0 = \sum \alpha_i \lambda_i^k \v v_i, $$ and hence $$ \|A^k\v x_0\|^2 = \sum \alpha_i^2 \lambda_i^{2k}. $$ Let $i_*$ be the minimum value of $i$ for which $\alpha_i\neq 0.$

    Now complete the solution.

  9. Let $A$ be a stochastic matrix (ie, whose row sums are all $1$, and entries are all $\geq0$). Show that $1$ must be an eigenvalue of $A.$ Also show that if $\lambda\in{\mathbb C}$ is any eigenvalue then $|\lambda|\leq 1.$
  10. Let $A_{5\times5}$ have $am(1,A)=5.$ What is the minimum possible value for $gm(1,A)?$ Construct one such $A$ with this minimum value.
  11. A matrix is such that all its eigenvalues have $am=1.$ Must it be semisimple?
  12. State true/false: A square matrix is nonsingular iff all its eigenvalues are nonzero.
  13. State true/false: The rank of a square matrix equals the number of nonzero eigenvalues (each counted according to its $am$).
  14. State true/false: The rank of a square matrix equals the number of nonzero eigenvalues (each counted according to its $gm$).
  15. Construct a matrix that is similar to a diagonal matrix but not unitarily similar.
  16. Prove: Two diagonal matrices are similar iff their diagonal entries are just permutations of each other.
  17. Prove: For any real square matrix $A,$ the matrices $A'A$ and $A A'$ are similar.
  18. True/False: If $A$ is unitarily diagonalisable and $B$ is similar to $A,$ then $B$ must also be unitarily diagonalisable.
  19. Show that the sum of NND (or PD) matrices is again NND (or PD).
  20. For any NND matrix $A$, show that there is an NND matrix $B$ such that $A = B^2.$
  21. If a matrix is NND, then show that all its principal submatrices (ie, submatrices symmetric around the leading diagonal) are also NND.
  22. Show that a PD matrix has positive determinant and trace.
  23. Show that a PD matrix has all leading principal minors $>0.$ The converse is also true. But we shall not prove it here.
  24. If $A,B$ are two complex matrices such that $AB-BA$ has rank $1,$ then show that $(AB-BA)^2=O.$
  25. Let $A$ and $B$ be $n\times n$ matrices where $|A|\neq0.$ Show that there are at most $n$ distinct values of $\beta $ such that $A+ \beta B$ is singular.
  26. Let $A$ and $C=C_1+C_2$ be matrices such that $AC_1=C_2A=O.$ Show that for every matrix $B$ the characteristic polynomials of $AB$ and $A(B+C)$ are equal.
  27. Consider the permutation matrix $A=((a_{ij})),$ where $a_{i,i+1}=1$ for $i=1,...,n-1$ and $a_{n1}=1.$ All the other $a_{ij}$'s are $0.$ Show that the eigenvalues of $A$ are the $n$-th roots of unity.
  28. A matrix $A$ is called nilpotent if $A^k=O$ for some $k\in{\mathbb N}.$ Show that a complex matrix is nilpotent iff all its eigenvalues are $0.$
  29. A matrix $A$ is such that all nonzero vectors are its eigenvectors. If $a_{11}=5,$ then find $A.$
  30. A complex matrix $A_{n\times n}$ is such that for all subspaces $S\leq {\mathbb C}^n$ we have $\forall \v x\in S~~A\v x\in S.$ Show that $A = \alpha I$ for some scalar $\alpha.$
  31. A complex matrix $A$ commutes with all idempotent matrices (ie, if $B$ is any idempotent matrix, then $AB=BA$). What can you say about $A?$
  32. If $\v x,\v y\in{\mathbb R}^n,$ then find the eigenvalues (along with am's and gm's) of $A_{n\times n}=\v x \v y'.$
  33. Let $A$ be a square matrix. A polynomial $f(x)$ is called an annihilating polynomial for $A$ if $f(A)=O.$ Thus, Cayley-Hamilton Theorem says that characteristic polynomial is always an annihilating polynomial. Also $p(x)$ is called a minimal polynomial for $A$ if it is annihilating and monic (ie, leading coefficient is $1$) and has the least possible degree among all annihilating polynomials.
    1. Show that a minimal polynomial always divides any annihilating polynomial. Hence conclude that minimal polynomial for a matrix must be unique.
    2. Show that similar matrices have the same minimal polynomial.
    3. If $\lambda\in{\mathbb C}$ is an eigenvalue of $A$ then show that $p(\lambda)=0.$
    4. If $p(\lambda)=0$ for some $\lambda\in{\mathbb C}$ then show that $\lambda $ must be an eigenvalue of $A.$
    5. $A$ is semisimple iff all the roots of its minimal polynomial are distinct.

      Hint:

      Showing "Semisimple $\Rightarrow$ distinct roots" is easy.

      For the converse, let the minimal polynomial be $$ m(\lambda) = (\lambda-\lambda_1)\cdots(\lambda-\lambda_k), $$ where all the $\lambda_i$'s are distinct. We need to show that ${\mathbb C}^n$ is the sum of all the eigenspaces, ie, $\nul(A-\lambda_i I).$

      Define $m_i(\lambda) = m(\lambda)/(\lambda-\lambda_i),$ ie, product of all $(\lambda-\lambda_j)$'s except for $j=i.$ Note that $$ p(\lambda) = \frac{m_1(\lambda)}{m_1(\lambda_1)} + \cdots + \frac{m_k(\lambda)}{m_1(\lambda_k)} $$ is actually identically equal to $1,$ (since it is $1$ for $\lambda=\lambda_1,...,\lambda_k$ and has degree $\leq k-1$).

      So $p(A) = I,$ and hence for all $\v x$ we have $$ \v x = \frac{m_1(A)\v x}{m_1(\lambda_1)} + \cdots + \frac{m_k(A)\v x}{m_1(\lambda_k)} $$ The proof will be complete once you show $m_i(A)\v x\in \nul(A-\lambda_i A).$

  34. If $A,B$ are square matrices of the same size, then show that $I+AB-BA$ cannot be nilpotent.
  35. Show that for any $n\times n$ matrix $A$ we can express $A^n$ as a linear combination of $I,A,...,A^{n-1}.$ Conclude that any $A^k$ ($k\in{\mathbb N}$) is in $span\{I,A,...,A^{n-1}\}.$
  36. If $\left[\begin{array}{ccccccccccc}A & B\\B' & D \end{array}\right]$ is NND where $A$ is square, then show that $\col(B)\leq \col(A)$ and $\row(B)\leq \row(D).$
  37. Show that the inverse of a PD matrix is PD.
  38. If $A = \left[\begin{array}{ccccccccccc}P & Q\\Q' & S \end{array}\right]$ is a PD matrix, then show that $P-QS ^{-1} Q'$ is PD too.
  39. Show that for a real symmetric matrix $A,$ which is not ND, the following quantities are all equal to the largest eigenvalue:
  40. Let $A$ be a real, symmetric matrix with eigenvalues $\lambda_1\geq \cdots\geq \lambda_n,$ and corresponding eigenvectors $\v v_1,...,\v v_n.$ Then for each $k=0,...,n-1$ the maximum possible value of $\v x'A\ \v x$ subject to is $\lambda_k,$ and this maximum is attained iff $\v x$ is an eigenvector corresponding to $\lambda_k.$ Note that for $k=0$ this reduces to a theorem we have already seen above.
  41. For a real square matrix $A$ define $\sigma(A)$ as the largest singular value of $A.$ Show the following:
    1. $\sigma(A)=0$ if and only if $A=O.$
    2. For any $\alpha\in{\mathbb R}$ we have $\sigma(\alpha A) = |\alpha| \cdot \sigma(A).$
    3. For any two real square matrices $A,B$ we have $\sigma(A+B)\leq \sigma(A)+\sigma(B).$
    Thus $\sigma(\cdot)$ is a norm on the space of square matrices. It is called the spectral norm.
  42. Show that for any real matrix $A_{m\times n}$ with singular values $\sigma_1,...,\sigma_r$ we have $\sum_{i,j} a_{ij}^2 = \sum_i \sigma_i^2.$ The square root of the LHS is called the Frobenius norm of $A.$
  43. If $A,B$ are square matrices of the same size, then we shall say $A\geq B$ to mean $A-B$ is NND. Prove the following.
    1. $A\geq B$ and $B\geq A$ $\Rightarrow$ $A=B$
    2. $A\geq B$ and $B\geq C\Rightarrow A\geq C.$
    3. If $A,B$ are both NND and $A\geq B$ then $|A|\geq |B|.$
    4. If $A,B$ are both PD, and $A\geq B$ and $|A|=|B|,$ then $A=B.$
  44. Let $A$ be PD. Then show that $A-\v b\v b'$ is PD iff $\v b'A ^{-1} \v b < 1.$
  45. If $B$ is an $m\times n$ matrix with $m < n.$ Show that there exists $C_{(n-m)\times n}$ such that $\left[\begin{array}{ccccccccccc}B\\C \end{array}\right]$ is PD iff the submatrix of $B$ formed by the first $m$ columns is PD.
  46. A square matrix $A$ is called strictly diagonally dominant if $\forall i~~|a_{i i}| > \sum_{j\neq i} |a_{ij}|.$ Show that if $A$ is such a symmetric matrix with positie diagonal entries, then $A$ must be PD. Is the converse true?
  47. If $A,B$ are both NND, then show that all the eigenvalues of $AB$ are nonnegative.
  48. Let $A$ be a real symmetric matrix. Show that the set $\{\v x~:~\v x' A\v x\leq 1\}$ is bounded iff $A$ is PD.
  49. Show that for all real symmetric matrix $A,$ there exists $\alpha\in{\mathbb R}$ such that $A+ \alpha I$ is PD.
  50. Show that every real symmetric matrix can be written as $A-B$ where $A,B$ are two suitable NND matrices.
  51. Find all values of $x\in{\mathbb R}$ such that the set $$ \{(x,1,...,1),~(1,x,1,...,1),...,(1,1,...,x)\} $$ is not a basis of ${\mathbb R}^n.$
  52. Show that for any real symmetric matrix $A\neq O$ we have $tr(A^2)> 0$ and $$ r(A) \geq \frac{(tr(A))^2}{tr(A^2)}. $$


HTML Comment Box is loading comments...