Section PEE Properties of Eigenvalues and Eigenvectors

From A First Course in Linear Algebra
Version 2.23
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/

The previous section introduced eigenvalues and eigenvectors, and concentrated on their existence and determination. This section will be more about theorems, and the various properties eigenvalues and eigenvectors enjoy. Like a good 4 × 100\text{ meter} relay, we will lead-off with one of our better theorems and save the very best for the anchor leg.

Theorem EDELI
Eigenvectors with Distinct Eigenvalues are Linearly Independent
Suppose that A is an n × n square matrix and S = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{p}\right \} is a set of eigenvectors with eigenvalues {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{p} such that {λ}_{i}\mathrel{≠}{λ}_{j} whenever i\mathrel{≠}j. Then S is a linearly independent set. □

Proof If p = 1, then the set S = \left \{{x}_{1}\right \} is linearly independent since eigenvectors are nonzero (Definition EEM), so assume for the remainder that p ≥ 2.

We will prove this result by contradiction (Technique CD). Suppose to the contrary that S is a linearly dependent set. Define {S}_{i} = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{i}\right \} and let k be an integer such that {S}_{k−1} = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k−1}\right \} is linearly independent and {S}_{k} = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k}\right \} is linearly dependent. We have to ask if there is even such an integer k? First, since eigenvectors are nonzero, the set \left \{{x}_{1}\right \} is linearly independent. Since we are assuming that S = {S}_{p} is linearly dependent, there must be an integer k, 2 ≤ k ≤ p, where the sets {S}_{i} transition from linear independence to linear dependence (and stay that way). In other words, {x}_{k} is the vector with the smallest index that is a linear combination of just vectors with smaller indices.

Since \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k}\right \} is linearly dependent there are scalars, {a}_{1},\kern 1.95872pt {a}_{2},\kern 1.95872pt {a}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {a}_{k}, some non-zero (Definition LI), so that

\eqalignno{ 0 = {a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{k}{x}_{k} & & }

\eqalignno{ 0& = \left (A − {λ}_{k}{I}_{n}\right )0 &&\text{@(a href="fcla-jsmath-2.23li37.html#theorem.ZVSM")Theorem ZVSM@(/a)} &&&& \cr & = \left (A − {λ}_{k}{I}_{n}\right )\left ({a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{k}{x}_{k}\right ) &&\text{@(a href="fcla-jsmath-2.23li39.html#definition.RLD")Definition RLD@(/a)} &&&& \cr & = \left (A − {λ}_{k}{I}_{n}\right ){a}_{1}{x}_{1} + \left (A − {λ}_{k}{I}_{n}\right ){a}_{2}{x}_{2} + \mathrel{⋯} + \left (A − {λ}_{k}{I}_{n}\right ){a}_{k}{x}_{k} &&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)}&&&& \cr & = {a}_{1}\left (A − {λ}_{k}{I}_{n}\right ){x}_{1} + {a}_{2}\left (A − {λ}_{k}{I}_{n}\right ){x}_{2} + \mathrel{⋯} + {a}_{k}\left (A − {λ}_{k}{I}_{n}\right ){x}_{k} &&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)}&&&& \cr & = {a}_{1}\left (A{x}_{1} − {λ}_{k}{I}_{n}{x}_{1}\right ) + {a}_{2}\left (A{x}_{2} − {λ}_{k}{I}_{n}{x}_{2}\right ) + \mathrel{⋯} + {a}_{k}\left (A{x}_{k} − {λ}_{k}{I}_{n}{x}_{k}\right )&&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)}&&&& \cr & = {a}_{1}\left (A{x}_{1} − {λ}_{k}{x}_{1}\right ) + {a}_{2}\left (A{x}_{2} − {λ}_{k}{x}_{2}\right ) + \mathrel{⋯} + {a}_{k}\left (A{x}_{k} − {λ}_{k}{x}_{k}\right ) &&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} &&&& \cr & = {a}_{1}\left ({λ}_{1}{x}_{1} − {λ}_{k}{x}_{1}\right ) + {a}_{2}\left ({λ}_{2}{x}_{2} − {λ}_{k}{x}_{2}\right ) + \mathrel{⋯} + {a}_{k}\left ({λ}_{k}{x}_{k} − {λ}_{k}{x}_{k}\right ) &&\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} &&&& \cr & = {a}_{1}\left ({λ}_{1} − {λ}_{k}\right ){x}_{1} + {a}_{2}\left ({λ}_{2} − {λ}_{k}\right ){x}_{2} + \mathrel{⋯} + {a}_{k}\left ({λ}_{k} − {λ}_{k}\right ){x}_{k} &&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)}&&&& \cr & = {a}_{1}\left ({λ}_{1} − {λ}_{k}\right ){x}_{1} + {a}_{2}\left ({λ}_{2} − {λ}_{k}\right ){x}_{2} + \mathrel{⋯} + {a}_{k}\left (0\right ){x}_{k} &&\text{@(a href="fcla-jsmath-2.23li69.html#property.AICN")Property AICN@(/a)} &&&& \cr & = {a}_{1}\left ({λ}_{1} − {λ}_{k}\right ){x}_{1} + {a}_{2}\left ({λ}_{2} − {λ}_{k}\right ){x}_{2} + \mathrel{⋯} + {a}_{k−1}\left ({λ}_{k−1} − {λ}_{k}\right ){x}_{k−1} + 0 &&\text{@(a href="fcla-jsmath-2.23li37.html#theorem.ZSSM")Theorem ZSSM@(/a)} &&&& \cr & = {a}_{1}\left ({λ}_{1} − {λ}_{k}\right ){x}_{1} + {a}_{2}\left ({λ}_{2} − {λ}_{k}\right ){x}_{2} + \mathrel{⋯} + {a}_{k−1}\left ({λ}_{k−1} − {λ}_{k}\right ){x}_{k−1} &&\text{@(a href="fcla-jsmath-2.23li37.html#property.Z")Property Z@(/a)} &&&& }

This is a relation of linear dependence on the linearly independent set \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k−1}\right \}, so the scalars must all be zero. That is, {a}_{i}\left ({λ}_{i} − {λ}_{k}\right ) = 0 for 1 ≤ i ≤ k − 1. However, we have the hypothesis that the eigenvalues are distinct, so {λ}_{i}\mathrel{≠}{λ}_{k} for 1 ≤ i ≤ k − 1. Thus {a}_{i} = 0 for 1 ≤ i ≤ k − 1.

This reduces the original relation of linear dependence on \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k}\right \} to the simpler equation {a}_{k}{x}_{k} = 0. By Theorem SMEZV we conclude that {a}_{k} = 0 or {x}_{k} = 0. Eigenvectors are never the zero vector (Definition EEM), so {a}_{k} = 0. So all of the scalars {a}_{i}, 1 ≤ i ≤ k are zero, contradicting their introduction as the scalars creating a nontrivial relation of linear dependence on the set \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{k}\right \}. With a contradiction in hand, we conclude that S must be linearly independent. ■

There is a simple connection between the eigenvalues of a matrix and whether or not the matrix is nonsingular.

\eqalignno{ \text{$A$ is singular} &\kern 3.26288pt \mathrel{⇔}\kern 3.26288pt \text{there exists $x\mathrel{≠}0$, $Ax = 0$} & &\text{@(a href="fcla-jsmath-2.23li20.html#definition.NSM")Definition NSM@(/a)} & & & & \cr &\kern 3.26288pt \mathrel{⇔}\kern 3.26288pt \text{there exists $x\mathrel{≠}0$, $Ax = 0x$} & &\text{@(a href="fcla-jsmath-2.23li37.html#theorem.ZSSM")Theorem ZSSM@(/a)} & & & & \cr &\kern 3.26288pt \mathrel{⇔}\kern 3.26288pt \text{$λ = 0$ is an eigenvalue of $A$} & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & }

With an equivalence about singular matrices we can update our list of equivalences about nonsingular matrices.

Proof The equivalence of the first and last statements is the contrapositive of Theorem SMZE, so we are able to improve on Theorem NME7. ■

\eqalignno{ \left (αA\right )x & = α\left (Ax\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} & & & & \cr & = α\left (λx\right ) & &\text{$x$ eigenvector of $A$} & & & & \cr & = \left (αλ\right )x & &\text{@(a href="fcla-jsmath-2.23li23.html#property.SMAC")Property SMAC@(/a)} & & & & }

Unfortunately, there are not parallel theorems about the sum or product of arbitrary matrices. But we can prove a similar result for powers of a matrix.

Theorem EOMP
Eigenvalues Of Matrix Powers
Suppose A is a square matrix, λ is an eigenvalue of A, and s ≥ 0 is an integer. Then {λ}^{s} is an eigenvalue of {A}^{s}. □

Proof Let x\mathrel{≠}0 be one eigenvector of A for λ. Suppose A has size n. Then we proceed by induction on s (Technique I). First, for s = 0,

\eqalignno{ {A}^{s}x & = {A}^{0}x & & & & \cr & = {I}_{n}x & & & & \cr & = x & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & \cr & = 1x & &\text{@(a href="fcla-jsmath-2.23li23.html#property.OC")Property OC@(/a)} & & & & \cr & = {λ}^{0}x & & & & \cr & = {λ}^{s}x & & & & \cr & & & & }

so {λ}^{s} is an eigenvalue of {A}^{s} in this special case. If we assume the theorem is true for s, then we find

\eqalignno{ {A}^{s+1}x & = {A}^{s}Ax & & & & \cr & = {A}^{s}\left (λx\right ) & &\text{$x$ eigenvector of $A$ for $λ$} & & & & \cr & = λ\left ({A}^{s}x\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} & & & & \cr & = λ\left ({λ}^{s}x\right ) & &\text{Induction hypothesis} & & & & \cr & = \left (λ{λ}^{s}\right )x & &\text{@(a href="fcla-jsmath-2.23li23.html#property.SMAC")Property SMAC@(/a)} & & & & \cr & = {λ}^{s+1}x & & & & }

So x\mathrel{≠}0 is an eigenvector of {A}^{s+1} for {λ}^{s+1}, and induction tells us the theorem is true for all s ≥ 0. ■

While we cannot prove that the sum of two arbitrary matrices behaves in any reasonable way with regard to eigenvalues, we can work with the sum of dissimilar powers of the same matrix. We have already seen two connections between eigenvalues and polynomials, in the proof of Theorem EMHE and the characteristic polynomial (Definition CP). Our next theorem strengthens this connection.

Theorem EPM
Eigenvalues of the Polynomial of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Let q(x) be a polynomial in the variable x. Then q(λ) is an eigenvalue of the matrix q(A). □

Proof Let x\mathrel{≠}0 be one eigenvector of A for λ, and write q(x) = {a}_{0} + {a}_{1}x + {a}_{2}{x}^{2} + \mathrel{⋯} + {a}_{ m}{x}^{m}. Then

\eqalignno{ q(A)x& = \left ({a}_{0}{A}^{0} + {a}_{ 1}{A}^{1} + {a}_{ 2}{A}^{2} + \mathrel{⋯} + {a}_{ m}{A}^{m}\right )x && && \cr & = ({a}_{0}{A}^{0})x + ({a}_{ 1}{A}^{1})x + ({a}_{ 2}{A}^{2})x + \mathrel{⋯} + ({a}_{ m}{A}^{m})x&&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)} &&&& \cr & = {a}_{0}({A}^{0}x) + {a}_{ 1}({A}^{1}x) + {a}_{ 2}({A}^{2}x) + \mathrel{⋯} + {a}_{ m}({A}^{m}x)&&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)}&&&& \cr & = {a}_{0}({λ}^{0}x) + {a}_{ 1}({λ}^{1}x) + {a}_{ 2}({λ}^{2}x) + \mathrel{⋯} + {a}_{ m}({λ}^{m}x) &&\text{@(a href="#theorem.EOMP")Theorem EOMP@(/a)} &&&& \cr & = ({a}_{0}{λ}^{0})x + ({a}_{ 1}{λ}^{1})x + ({a}_{ 2}{λ}^{2})x + \mathrel{⋯} + ({a}_{ m}{λ}^{m})x &&\text{@(a href="fcla-jsmath-2.23li23.html#property.SMAC")Property SMAC@(/a)} &&&& \cr & = \left ({a}_{0}{λ}^{0} + {a}_{ 1}{λ}^{1} + {a}_{ 2}{λ}^{2} + \mathrel{⋯} + {a}_{ m}{λ}^{m}\right )x &&\text{@(a href="fcla-jsmath-2.23li23.html#property.DSAC")Property DSAC@(/a)} &&&& \cr & = q(λ)x && && }

is shown to have the three eigenvalues λ = 3,\kern 1.95872pt 1,\kern 1.95872pt − 1. Suppose we wanted a 4 × 4 matrix that has the three eigenvalues λ = 4,\kern 1.95872pt 0,\kern 1.95872pt − 2. We can employ Theorem EPM by finding a polynomial that converts 3 to 4, 1 to 0, and − 1 to − 2. Such a polynomial is called an interpolating polynomial, and in this example we can use

We will not discuss how to concoct this polynomial, but a text on numerical analysis should provide the details or see Section CF. For now, simply verify that r(3) = 4, r(1) = 0 and r(−1) = −2.

\eqalignno{ r(C) & = {1\over 4}{C}^{2} + C −{5\over 4}{I}_{4} & & \cr & = {1\over 4}\left [\array{ 3&2&2&2\cr 2&3 &2 &2 \cr 2&2&3&2\cr 2&2 &2 &3 } \right ] + \left [\array{ 1&0&1&1\cr 0&1 &1 &1 \cr 1&1&1&0\cr 1&1 &0 &1 } \right ] −{5\over 4}\left [\array{ 1&0&0&0\cr 0&1 &0 &0 \cr 0&0&1&0\cr 0&0 &0 &1 } \right ] & & \cr & = {1\over 2}\left [\array{ 1&1&3&3\cr 1&1 &3 &3 \cr 3&3&1&1\cr 3&3 &1 &1 } \right ] & & }

Theorem EPM tells us that if r(x) transforms the eigenvalues in the desired manner, then r(C) will have the desired eigenvalues. You can check this by computing the eigenvalues of r(C) directly. Furthermore, notice that the multiplicities are the same, and the eigenspaces of C and r(C) are identical. ⊠

Inverses and transposes also behave predictably with regard to their eigenvalues.

Theorem EIM
Eigenvalues of the Inverse of a Matrix
Suppose A is a square nonsingular matrix and λ is an eigenvalue of A. Then {1\over λ} is an eigenvalue of the matrix {A}^{−1}. □

Proof Notice that since A is assumed nonsingular, {A}^{−1} exists by Theorem NI, but more importantly, {1\over λ} does not involve division by zero since Theorem SMZE prohibits this possibility.

\eqalignno{ {A}^{−1}x & = {A}^{−1}(1x) & &\text{@(a href="fcla-jsmath-2.23li23.html#property.OC")Property OC@(/a)} & & & & \cr & = {A}^{−1}({1\over λ}λx) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over λ}{A}^{−1}(λx) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} & & & & \cr & = {1\over λ}{A}^{−1}(Ax) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & \cr & = {1\over λ}({A}^{−1}A)x & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMA")Theorem MMA@(/a)} & & & & \cr & = {1\over λ}{I}_{n}x & &\text{@(a href="fcla-jsmath-2.23li32.html#definition.MI")Definition MI@(/a)} & & & & \cr & = {1\over λ}x & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & }

So x\mathrel{≠}0 is an eigenvector of {A}^{−1} for the eigenvalue {1\over λ}. ■

The theorems above have a similar style to them, a style you should consider using when confronted with a need to prove a theorem about eigenvalues and eigenvectors. So far we have been able to reserve the characteristic polynomial for strictly computational purposes. However, the next theorem, whose statement resembles the preceding theorems, has an easier proof if we employ the characteristic polynomial and results about determinants.

Theorem ETM
Eigenvalues of the Transpose of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Then λ is an eigenvalue of the matrix {A}^{t}. □

So A and {A}^{t} have the same characteristic polynomial, and by Theorem EMRCP, their eigenvalues are identical and have equal algebraic multiplicities. Notice that what we have proved here is a bit stronger than the stated conclusion in the theorem. ■

If a matrix has only real entries, then the computation of the characteristic polynomial (Definition CP) will result in a polynomial with coefficients that are real numbers. Complex numbers could result as roots of this polynomial, but they are roots of quadratic factors with real coefficients, and as such, come in conjugate pairs. The next theorem proves this, and a bit more, without mentioning the characteristic polynomial.

Theorem ERMCP
Eigenvalues of Real Matrices come in Conjugate Pairs
Suppose A is a square matrix with real entries and x is an eigenvector of A for the eigenvalue λ. Then \overline{x} is an eigenvector of A for the eigenvalue \overline{λ}. □

\eqalignno{ A\overline{x} & = \overline{A}\overline{x} & &\text{$A$ has real entries} & & & & \cr & = \overline{Ax} & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMCC")Theorem MMCC@(/a)} & & & & \cr & = \overline{λx} & &\text{$x$ eigenvector of $A$} & & & & \cr & = \overline{λ}\overline{x} & &\text{@(a href="fcla-jsmath-2.23li28.html#theorem.CRSM")Theorem CRSM@(/a)} & & & & }

This phenomenon is amply illustrated in Example CEMS6, where the four complex eigenvalues come in two pairs, and the two basis vectors of the eigenspaces are complex conjugates of each other. Theorem ERMCP can be a time-saver for computing eigenvalues and eigenvectors of real matrices with complex eigenvalues, since the conjugate eigenvalue and eigenspace can be inferred from the theorem rather than computed.

Subsection ME: Multiplicities of Eigenvalues

A polynomial of degree n will have exactly n roots. From this fact about polynomial equations we can say more about the algebraic multiplicities of eigenvalues.

Theorem DCP
Degree of the Characteristic Polynomial
Suppose that A is a square matrix of size n. Then the characteristic polynomial of A, {p}_{A}\left (x\right ), has degree n. □

Proof We will prove a more general result by induction (Technique I). Then the theorem will be true as a special case. We will carefully state this result as a proposition indexed by m, m ≥ 1.

P(m): Suppose that A is an m × m matrix whose entries are complex numbers or linear polynomials in the variable x of the form c − x, where c is a complex number. Suppose further that there are exactly k entries that contain x and that no row or column contains more than one such entry. Then, when k = m, \mathop{ det} \left (A\right ) is a polynomial in x of degree m, with leading coefficient ± 1, and when k < m, \mathop{ det} \left (A\right ) is a polynomial in x of degree k or less.

Base Case: Suppose A is a 1 × 1 matrix. Then its determinant is equal to the lone entry (Definition DM). When k = m = 1, the entry is of the form c − x, a polynomial in x of degree m = 1 with leading coefficient − 1. When k < m, then k = 0 and the entry is simply a complex number, a polynomial of degree 0 ≤ k. So P(1) is true.

Induction Step: Assume P(m) is true, and that A is an (m + 1) × (m + 1) matrix with k entries of the form c − x. There are two cases to consider.

Suppose k = m + 1. Then every row and every column will contain an entry of the form c − x. Suppose that for the first row, this entry is in column t. Compute the determinant of A by an expansion about this first row (Definition DM). The term associated with entry t of this row will be of the form

The submatrix A\left (1|t\right ) is an m × m matrix with k = m terms of the form c − x, no more than one per row or column. By the induction hypothesis, \mathop{ det} \left (A\left (1|t\right )\right ) will be a polynomial in x of degree m with coefficient ± 1. So this entire term is then a polynomial of degree m + 1 with leading coefficient ± 1.

The remaining terms (which constitute the sum that is the determinant of A) are products of complex numbers from the first row with cofactors built from submatrices that lack the first row of A and lack some column of A, other than column t. As such, these submatrices are m × m matrices with k = m − 1 < m entries of the form c − x, no more than one per row or column. Applying the induction hypothesis, we see that these terms are polynomials in x of degree m − 1 or less. Adding the single term from the entry in column t with all these others, we see that \mathop{ det} \left (A\right ) is a polynomial in x of degree m + 1 and leading coefficient ± 1.

The second case occurs when k < m + 1. Now there is a row of A that does not contain an entry of the form c − x. We consider the determinant of A by expanding about this row (Theorem DER), whose entries are all complex numbers. The cofactors employed are built from submatrices that are m × m matrices with either k or k − 1 entries of the form c − x, no more than one per row or column. In either case, k ≤ m, and we can apply the induction hypothesis to see that the determinants computed for the cofactors are all polynomials of degree k or less. Summing these contributions to the determinant of A yields a polynomial in x of degree k or less, as desired.

Definition CP tells us that the characteristic polynomial of an n × n matrix is the determinant of a matrix having exactly n entries of the form c − x, no more than one per row or column. As such we can apply P(n) to see that the characteristic polynomial has degree n. ■

Theorem NEM
Number of Eigenvalues of a Matrix
Suppose that A is a square matrix of size n with distinct eigenvalues {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{k}. Then

Proof By the definition of the algebraic multiplicity (Definition AME), we can factor the characteristic polynomial as

where c is a nonzero constant. (We could prove that c = {(−1)}^{n}, but we do not need that specificity right now. See Exercise PEE.T30) The left-hand side is a polynomial of degree n by Theorem DCP and the right-hand side is a polynomial of degree {\mathop{ \mathop{∑ }}\nolimits }_{i=1}^{k}{α}_{ A}\left ({λ}_{i}\right ). So the equality of the polynomials’ degrees gives the equality {\mathop{ \mathop{∑ }}\nolimits }_{i=1}^{k}{α}_{ A}\left ({λ}_{i}\right ) = n. ■

Proof Since λ is an eigenvalue of A, there is an eigenvector of A for λ, x. Then x ∈{ℰ}_{A}\left (λ\right ), so {γ}_{A}\left (λ\right ) ≥ 1, since we can extend \left \{x\right \} into a basis of {ℰ}_{A}\left (λ\right ) (Theorem ELIS).

To show that {γ}_{A}\left (λ\right ) ≤ {α}_{A}\left (λ\right ) is the most involved portion of this proof. To this end, let g = {γ}_{A}\left (λ\right ) and let {x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{g} be a basis for the eigenspace of λ, {ℰ}_{A}\left (λ\right ). Construct another n − g vectors, {y}_{1},\kern 1.95872pt {y}_{2},\kern 1.95872pt {y}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {y}_{n−g}, so that

is a basis of {ℂ}^{n}. This can be done by repeated applications of Theorem ELIS. Finally, define a matrix S by

where R is an n × (n − g) matrix whose columns are {y}_{1},\kern 1.95872pt {y}_{2},\kern 1.95872pt {y}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {y}_{n−g}. The columns of S are linearly independent by design, so S is nonsingular (Theorem NMLIC) and therefore invertible (Theorem NI). Then,

\eqalignno{ \left [{e}_{1}|{e}_{2}|{e}_{3}|\mathop{\mathop{…}}|{e}_{n}\right ] & = {I}_{n} & & \cr & = {S}^{−1}S & & \cr & = {S}^{−1}[{x}_{ 1}|{x}_{2}|{x}_{3}|\mathop{\mathop{…}}|{x}_{g}|R] & & \cr & = [{S}^{−1}{x}_{ 1}|{S}^{−1}{x}_{ 2}|{S}^{−1}{x}_{ 3}|\mathop{\mathop{…}}|{S}^{−1}{x}_{ g}|{S}^{−1}R] & & }

\eqalignno{ {p}_{A}\left (x\right ) & =\mathop{ det} \left (A − x{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.CP")Definition CP@(/a)} & & & & \cr & = 1\mathop{ det} \left (A − x{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.OCN")Property OCN@(/a)} & & & & \cr & =\mathop{ det} \left ({I}_{n}\right )\mathop{ det} \left (A − x{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li44.html#definition.DM")Definition DM@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}S\right )\mathop{ det} \left (A − x{I}_{ n}\right ) & &\text{@(a href="fcla-jsmath-2.23li32.html#definition.MI")Definition MI@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}\right )\mathop{ det} \left (S\right )\mathop{ det} \left (A − x{I}_{ n}\right ) & &\text{@(a href="fcla-jsmath-2.23li45.html#theorem.DRMM")Theorem DRMM@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}\right )\mathop{ det} \left (A − x{I}_{ n}\right )\mathop{ det} \left (S\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.CMCN")Property CMCN@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}\left (A − x{I}_{ n}\right )S\right ) & &\text{@(a href="fcla-jsmath-2.23li45.html#theorem.DRMM")Theorem DRMM@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}AS − {S}^{−1}x{I}_{ n}S\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}AS − x{S}^{−1}{I}_{ n}S\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}AS − x{S}^{−1}S\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & \cr & =\mathop{ det} \left ({S}^{−1}AS − x{I}_{ n}\right ) & &\text{@(a href="fcla-jsmath-2.23li32.html#definition.MI")Definition MI@(/a)} & & & & \cr & = {p}_{{S}^{−1}AS}\left (x\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.CP")Definition CP@(/a)} & & & & }

\eqalignno{ {S}^{−1}AS& = {S}^{−1}A[{x}_{ 1}|{x}_{2}|{x}_{3}|\mathop{\mathop{…}}|{x}_{g}|R] && && \cr & = {S}^{−1}[A{x}_{ 1}|A{x}_{2}|A{x}_{3}|\mathop{\mathop{…}}|A{x}_{g}|AR] &&\text{@(a href="fcla-jsmath-2.23li31.html#definition.MM")Definition MM@(/a)} &&&& \cr & = {S}^{−1}[λ{x}_{ 1}|λ{x}_{2}|λ{x}_{3}|\mathop{\mathop{…}}|λ{x}_{g}|AR] &&\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} &&&& \cr & = [{S}^{−1}λ{x}_{ 1}|{S}^{−1}λ{x}_{ 2}|{S}^{−1}λ{x}_{ 3}|\mathop{\mathop{…}}|{S}^{−1}λ{x}_{ g}|{S}^{−1}AR]&&\text{@(a href="fcla-jsmath-2.23li31.html#definition.MM")Definition MM@(/a)} &&&& \cr & = [λ{S}^{−1}{x}_{ 1}|λ{S}^{−1}{x}_{ 2}|λ{S}^{−1}{x}_{ 3}|\mathop{\mathop{…}}|λ{S}^{−1}{x}_{ g}|{S}^{−1}AR]&&\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} &&&& \cr & = [λ{e}_{1}|λ{e}_{2}|λ{e}_{3}|\mathop{\mathop{…}}|λ{e}_{g}|{S}^{−1}AR] &&\text{${S}^{−1}S = {I}_{ n}$, (($∗$) above)}&&&& }

Now imagine computing the characteristic polynomial of A by computing the characteristic polynomial of {S}^{−1}AS using the form just obtained. The first g columns of {S}^{−1}AS are all zero, save for a λ on the diagonal. So if we compute the determinant by expanding about the first column, successively, we will get successive factors of (λ − x). More precisely, let T be the square matrix of size n − g that is formed from the last n − g rows and last n − g columns of {S}^{−1}AR. Then

This says that (x − λ) is a factor of the characteristic polynomial at least g times, so the algebraic multiplicity of λ as an eigenvalue of A is greater than or equal to g (Definition AME). In other words,

Theorem NEM says that the sum of the algebraic multiplicities for all the eigenvalues of A is equal to n. Since the algebraic multiplicity is a positive quantity, no single algebraic multiplicity can exceed n without the sum of all of the algebraic multiplicities doing the same. ■

Proof Suppose that A has k distinct eigenvalues, {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{k}. Then

\eqalignno{ k & ={ \mathop{∑ }}_{i=1}^{k}1 & & & & \cr & ≤{\mathop{∑ }}_{i=1}^{k}{α}_{ A}\left ({λ}_{i}\right ) & &\text{@(a href="#theorem.ME")Theorem ME@(/a)} & & & & \cr & = n & &\text{@(a href="#theorem.NEM")Theorem NEM@(/a)} & & & & }

Subsection EHM: Eigenvalues of Hermitian Matrices

Recall that a matrix is Hermitian (or self-adjoint) if A = {A}^{∗} (Definition HM). In the case where A is a matrix whose entries are all real numbers, being Hermitian is identical to being symmetric (Definition SYM). Keep this in mind as you read the next two theorems. Their hypotheses could be changed to “suppose A is a real symmetric matrix.”

Proof Let x\mathrel{≠}0 be one eigenvector of A for the eigenvalue λ. Then by Theorem PIP we know \left \langle x,\kern 1.95872pt x\right \rangle \mathrel{≠}0. So

\eqalignno{ λ & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }λ\left \langle x,\kern 1.95872pt x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }\left \langle λx,\kern 1.95872pt x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li28.html#theorem.IPSM")Theorem IPSM@(/a)} & & & & \cr & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }\left \langle Ax,\kern 1.95872pt x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & \cr & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }\left \langle x,\kern 1.95872pt Ax\right \rangle & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.HMIP")Theorem HMIP@(/a)} & & & & \cr & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }\left \langle x,\kern 1.95872pt λx\right \rangle & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & \cr & = {1\over \left \langle x,\kern 1.95872pt x\right \rangle }\overline{λ}\left \langle x,\kern 1.95872pt x\right \rangle & &\text{@(a href="fcla-jsmath-2.23li28.html#theorem.IPSM")Theorem IPSM@(/a)} & & & & \cr & = \overline{λ} & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & }

If a complex number is equal to its conjugate, then it has a complex part equal to zero, and therefore is a real number. ■

Notice the appealing symmetry to the justifications given for the steps of this proof. In the center is the ability to pitch a Hermitian matrix from one side of the inner product to the other.

Look back and compare Example ESMS4 and Example CEMS6. In Example CEMS6 the matrix has only real entries, yet the characteristic polynomial has roots that are complex numbers, and so the matrix has complex eigenvalues. However, in Example ESMS4, the matrix has only real entries, but is also symmetric, and hence Hermitian. So by Theorem HMRE, we were guaranteed eigenvalues that are real numbers.

In many physical problems, a matrix of interest will be real and symmetric, or Hermitian. Then if the eigenvalues are to represent physical quantities of interest, Theorem HMRE guarantees that these values will not be complex numbers.

The eigenvectors of a Hermitian matrix also enjoy a pleasing property that we will exploit later.

Theorem HMOE
Hermitian Matrices have Orthogonal Eigenvectors
Suppose that A is a Hermitian matrix and x and y are two eigenvectors of A for different eigenvalues. Then x and y are orthogonal vectors. □

Proof Let x be an eigenvector of A for λ and let y be an eigenvector of A for a different eigenvalue ρ. So we have λ − ρ\mathrel{≠}0. Then

\eqalignno{ \left \langle x,\kern 1.95872pt y\right \rangle & = {1\over λ − ρ}\left (λ − ρ\right )\left \langle x,\kern 1.95872pt y\right \rangle & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over λ − ρ}\left (λ\left \langle x,\kern 1.95872pt y\right \rangle − ρ\left \langle x,\kern 1.95872pt y\right \rangle \right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over λ − ρ}\left (\left \langle λx,\kern 1.95872pt y\right \rangle −\left \langle x,\kern 1.95872pt \overline{ρ}y\right \rangle \right ) & &\text{@(a href="fcla-jsmath-2.23li28.html#theorem.IPSM")Theorem IPSM@(/a)} & & & & \cr & = {1\over λ − ρ}\left (\left \langle λx,\kern 1.95872pt y\right \rangle −\left \langle x,\kern 1.95872pt ρy\right \rangle \right ) & &\text{@(a href="#theorem.HMRE")Theorem HMRE@(/a)} & & & & \cr & = {1\over λ − ρ}\left (\left \langle Ax,\kern 1.95872pt y\right \rangle −\left \langle x,\kern 1.95872pt Ay\right \rangle \right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & \cr & = {1\over λ − ρ}\left (\left \langle Ax,\kern 1.95872pt y\right \rangle −\left \langle Ax,\kern 1.95872pt y\right \rangle \right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.HMIP")Theorem HMIP@(/a)} & & & & \cr & = {1\over λ − ρ}\left (0\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.AICN")Property AICN@(/a)} & & & & \cr & = 0 & & & & }

Notice again how the key step in this proof is the fundamental property of a Hermitian matrix (Theorem HMIP) — the ability to swap A across the two arguments of the inner product. We’ll build on these results and continue to see some more interesting properties in Section OD.

Subsection READ: Reading Questions

Subsection EXC: Exercises

T10 Suppose that A is a square matrix. Prove that the constant term of the characteristic polynomial of A is equal to the determinant of A.
Contributed by Robert Beezer Solution [1326]

T20 Suppose that A is a square matrix. Prove that a single vector may not be an eigenvector of A for two different eigenvalues.
Contributed by Robert Beezer Solution [1327]

T22 Suppose that U is a unitary matrix with eigenvalue λ. Prove that λ had modulus 1, i.e. \left \vert λ\right \vert = 1. This says that all of the eigenvalues of a unitary matrix lie on the unit circle of the complex plane.
Contributed by Robert Beezer

T30 Theorem DCP tells us that the characteristic polynomial of a square matrix of size n has degree n. By suitably augmenting the proof of Theorem DCP prove that the coefficient of {x}^{n} in the characteristic polynomial is {(−1)}^{n}.
Contributed by Robert Beezer

T50 Theorem EIM says that if λ is an eigenvalue of the nonsingular matrix A, then {1\over λ} is an eigenvalue of {A}^{−1}. Write an alternate proof of this theorem using the characteristic polynomial and without making reference to an eigenvector of A for λ.
Contributed by Robert Beezer Solution [1327]

Subsection SOL: Solutions

\eqalignno{ {a}_{0} & = {a}_{0} + {a}_{1}(0) + {a}_{2}{(0)}^{2} + \mathrel{⋯} + {a}_{ n}{(0)}^{n} & & & & \cr & = {p}_{A}\left (0\right ) & & & & \cr & =\mathop{ det} \left (A − 0{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.CP")Definition CP@(/a)} & & & & \cr & =\mathop{ det} \left (A\right ) & & & & }

T20 Contributed by Robert Beezer Statement [1324]
Suppose that the vector x\mathrel{≠}0 is an eigenvector of A for the two eigenvalues λ and ρ, where λ\mathrel{≠}ρ. Then λ − ρ\mathrel{≠}0, and we also have

\eqalignno{ 0 & = Ax − Ax & &\text{@(a href="fcla-jsmath-2.23li23.html#property.AIC")Property AIC@(/a)} & & & & \cr & = λx − ρx & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.EEM")Definition EEM@(/a)} & & & & \cr & = (λ − ρ)x & &\text{@(a href="fcla-jsmath-2.23li23.html#property.DSAC")Property DSAC@(/a)} & & & & }

\eqalignno{ {p}_{{A}^{−1}}\left ({1\over λ}\right ) & =\mathop{ det} \left ({A}^{−1} −{1\over λ}{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.CP")Definition CP@(/a)} & & & & \cr & = 1\mathop{ det} \left ({A}^{−1} −{1\over λ}{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.OCN")Property OCN@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λA\right )\mathop{ det} \left ({A}^{−1} −{1\over λ}{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (\left (−λA\right )\left ({A}^{−1} −{1\over λ}{I}_{n}\right )\right ) & &\text{@(a href="fcla-jsmath-2.23li45.html#theorem.DRMM")Theorem DRMM@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λA{A}^{−1} −\left (−λA\right ){1\over λ}{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMDAA")Theorem MMDAA@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λ{I}_{n} −\left (−λA\right ){1\over λ}{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li32.html#definition.MI")Definition MI@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λ{I}_{n} + λ{1\over λ}A{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMSMM")Theorem MMSMM@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λ{I}_{n} + 1A{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.MICN")Property MICN@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λ{I}_{n} + A{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li69.html#property.OCN")Property OCN@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (−λ{I}_{n} + A\right ) & &\text{@(a href="fcla-jsmath-2.23li31.html#theorem.MMIM")Theorem MMIM@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\mathop{ det} \left (A − λ{I}_{n}\right ) & &\text{@(a href="fcla-jsmath-2.23li30.html#property.ACM")Property ACM@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}{p}_{A}\left (λ\right ) & &\text{@(a href="fcla-jsmath-2.23li47.html#definition.CP")Definition CP@(/a)} & & & & \cr & = {1\over \mathop{det} \left (−λA\right )}\kern 1.95872pt 0 & &\text{@(a href="fcla-jsmath-2.23li47.html#theorem.EMRCP")Theorem EMRCP@(/a)} & & & & \cr & = 0 & &\text{@(a href="fcla-jsmath-2.23li69.html#property.ZCN")Property ZCN@(/a)} & & & & }

So {1\over λ} is a root of the characteristic polynomial of {A}^{−1} and so is an eigenvalue of {A}^{−1}. This proof is due to Sara Bucht.