Section 3.1 Generalized Eigenspaces
In this section we will define a new type of invariant subspace and explore its key properties. This generalization of eigenvalues and eigenspaces will allow us to move from diagonal matrix representations of diagonalizable matrices to nearly diagonal matrix representations of arbitrary matrices.
Subsection 3.1.1 Kernels of Powers of Linear Transformations
With the Jordan canonical form as our goal, we will become increasingly interested in kernels of powers of linear transformations, which will go a long way to helping us understand the structure of a linear transformation, or its matrix representation. We need the next theorem to help us understand generalized eigenspaces, though its specialization to nilpotent linear transformations will be the real workhorse.
Theorem 3.1.1. Kernels of Powers of Linear Transformations.
Suppose \(\ltdefn{T}{V}{V}\) is a linear transformation, where \(\dimension{V}=n\text{.}\) Then there is an integer \(m\text{,}\) \(0\leq m\leq n\text{,}\) such that
\begin{equation*}
\set{\zerovector}=\krn{T^0}\subsetneq\krn{T^1}\subsetneq\krn{T^2}\subsetneq\cdots\subsetneq\krn{T^m}=\krn{T^{m+1}}=\krn{T^{m+2}}=\cdots
\end{equation*}
Proof.
There are several items to verify in the conclusion as stated. First, we show that \(\krn{T^k}\subseteq\krn{T^{k+1}}\) for any \(k\text{.}\) Choose \(\vect{z}\in\krn{T^k}\text{.}\) Then
\begin{equation*}
\lteval{T^{k+1}}{\vect{z}}=\lteval{T}{\lteval{T^{k}}{\vect{z}}}=\lteval{T}{\zerovector}=\zerovector\text{,}
\end{equation*}
so \(\vect{z}\in\krn{T^{k+1}}\text{.}\)
Second, we demonstrate the existence of a power \(m\) where consecutive powers result in equal kernels. A by-product will be the condition that \(m\) can be chosen so that \(m\leq n\text{.}\) To the contrary, suppose that
\begin{equation*}
\krn{T^0}\subsetneq\krn{T^1}\subsetneq\krn{T^2}\subsetneq\cdots\subsetneq\krn{T^n}\subsetneq\krn{T^{n+1}}\text{.}
\end{equation*}
Since \(\krn{T^k}\subsetneq\krn{T^{k+1}}\text{,}\) Theorem PSSD implies that \(\dimension{\krn{T^{k+1}}}\geq\dimension{\krn{T^k}}+1\text{.}\) Repeated application of this observation yields
\begin{equation*}
\dimension{\krn{T^{n+1}}}\geq\dimension{\krn{T^{0}}}+(n+1)=n+1\text{.}
\end{equation*}
As \(\krn{T^{n+1}}\) is a subspace of \(V\text{,}\) a vector space of dimension \(n\text{,}\) this is a contradiction.
The contradiction yields the existence of an integer \(k\) such that \(\krn{T^k}=\krn{T^{k+1}}\text{,}\) so we can define \(m\) to be the smallest such integer with this property. From the argument above about dimensions resulting from a strictly increasing chain of subspaces, we conclude that \(m\leq n\text{.}\)
It remains to show that once two consecutive kernels are equal, then all of the remaining kernels are equal. More formally, if \(\krn{T^m}=\krn{T^{m+1}}\text{,}\) then \(\krn{T^m}=\krn{T^{m+j}}\) for all \(j\geq 1\text{.}\) The proof is by induction on \(j\text{.}\) The base case (\(j=1\)) is precisely our defining property for \(m\text{.}\)
For the induction step, our hypothesis is that \(\krn{T^m}=\krn{T^{m+j}}\text{.}\) We want to establish that \(\krn{T^m}=\krn{T^{m+j+1}}\text{.}\) At the outset of this proof we showed that \(\krn{T^m}\subseteq\krn{T^{m+j+1}}\text{.}\) So we need only show the subset inclusion in the opposite direction. To wit, choose \(\vect{z}\in\krn{T^{m+j+1}}\text{.}\) Then
\begin{equation*}
\lteval{T^{m+j}}{\lteval{T}{\vect{z}}}=\lteval{T^{m+j+1}}{\vect{z}}=\zerovector\text{,}
\end{equation*}
so \(\lteval{T}{\vect{z}}\in\krn{T^{m+j}}=\krn{T^m}\text{.}\) Thus
\begin{equation*}
\lteval{T^{m+1}}{\vect{z}}=\lteval{T^{m}}{\lteval{T}{\vect{z}}}=\zerovector\text{,}
\end{equation*}
so \(\vect{z}\in\krn{T^{m+1}}=\krn{T^m}\text{,}\) as desired.
Example 3.1.2. Kernels of Powers.
As an illustration of Theorem 3.1.1, consider the linear transformation \(\ltdefn{T}{\complex{10}}{\complex{10}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\text{.}\)
This linear transformation is engineered to illustrate the full generality of the theorem. The kernels of the powers (null spaces of the matrix powers) increase, with the nullity incrementing first by twos, then by ones, topping out at the maximum nullity of \(8\) at the \(m=6\) power, well below the maximum of \(n=10\) allowed by the theorem.
It is somewhat interesting to row-reduce the powers of \(A\text{,}\) since the null spaces of these powers are the kernels of the powers of \(T\text{.}\) These are best done with software, but here are two examples, first a mid-range power, then an extreme power.
Once we get to the sixth power, the kernels do not change, and so, because of the uniqueness of reduced row-echelon form, the row-reduced versions of the powers do not change either.
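If you want to experiment with these computations yourself, here is a minimal sketch using Python with the sympy library (our choice here; the text prescribes no particular software). The \(10\times 10\) matrix of this example is not reproduced above, so the matrix below is a small hypothetical stand-in, chosen only to exhibit the same increasing-then-stable pattern of nullities.

```python
# Sketch: nullities of powers of a matrix increase, then stabilize
# (Theorem 3.1.1).  A is a hypothetical 4x4 stand-in, not the 10x10
# matrix of Example 3.1.2.
from sympy import Matrix

A = Matrix([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])

n = A.rows
for k in range(n + 1):
    nullity = n - (A**k).rank()   # dim ker(T^k) = n - rank(A^k)
    print(k, nullity)
# Prints nullities 0, 2, 3, 4, 4: incrementing first by two, then by
# one, then constant from m = 3 onward, with m <= n = 4 as promised.
```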
Subsection 3.1.2 Generalized Eigenspaces
These are the two main definitions of this section.
Definition 3.1.3. Generalized Eigenvector.
Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation. Suppose further that \(\vect{x}\neq\zerovector\) is a vector such that \(\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\) for some \(k>0\text{.}\) Then \(\vect{x}\) is a generalized eigenvector of \(T\) with eigenvalue \(\lambda\text{.}\)
Definition 3.1.4. Generalized Eigenspace.
Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation. Define the generalized eigenspace of \(T\) for \(\lambda\) as
\begin{equation*}
\geneigenspace{T}{\lambda}=\setparts{\vect{x}}{\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\text{ for some }k>0}\text{.}
\end{equation*}
So the generalized eigenspace is composed of generalized eigenvectors, plus the zero vector. As the name implies, the generalized eigenspace is a subspace of \(V\text{.}\) But more topically, it is an invariant subspace of \(V\) relative to \(T\text{.}\)
Theorem 3.1.5. Generalized Eigenspace is an Invariant Subspace.
Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation. Then the generalized eigenspace \(\geneigenspace{T}{\lambda}\) is an invariant subspace of \(V\) relative to \(T\text{.}\)
Proof.
First we establish that \(\geneigenspace{T}{\lambda}\) is a subspace of \(V\text{.}\) Note that Theorem LTTZZ gives \(\lteval{T-\lambda I_V}{\zerovector}=\zerovector\text{,}\) so \(\zerovector\in\geneigenspace{T}{\lambda}\) (take \(k=1\) in Definition 3.1.4).
Suppose that \(\vect{x},\,\vect{y}\in\geneigenspace{T}{\lambda}\text{.}\) Then there are integers \(k,\,\ell\) such that \(\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\) and \(\lteval{\left(T-\lambda I_V\right)^\ell}{\vect{y}}=\zerovector\text{.}\) Set \(m=k+\ell\text{,}\) so that
\begin{align*}
\lteval{\left(T-\lambda I_V\right)^{m}}{\vect{x}+\vect{y}}
&=\lteval{\left(T-\lambda I_V\right)^{m}}{\vect{x}}+\lteval{\left(T-\lambda I_V\right)^{m}}{\vect{y}}\\
&=\lteval{\left(T-\lambda I_V\right)^{\ell}}{\lteval{\left(T-\lambda I_V\right)^{k}}{\vect{x}}}+\lteval{\left(T-\lambda I_V\right)^{k}}{\lteval{\left(T-\lambda I_V\right)^{\ell}}{\vect{y}}}\\
&=\lteval{\left(T-\lambda I_V\right)^{\ell}}{\zerovector}+\lteval{\left(T-\lambda I_V\right)^{k}}{\zerovector}\\
&=\zerovector+\zerovector=\zerovector\text{.}
\end{align*}
So \(\vect{x}+\vect{y}\in\geneigenspace{T}{\lambda}\text{.}\)
Suppose that \(\vect{x}\in\geneigenspace{T}{\lambda}\) and \(\alpha\in\complexes\text{.}\) Then there is an integer \(k\) such that \(\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\text{,}\) and
\begin{equation*}
\lteval{\left(T-\lambda I_V\right)^{k}}{\alpha\vect{x}}=\alpha\lteval{\left(T-\lambda I_V\right)^{k}}{\vect{x}}=\alpha\zerovector=\zerovector\text{.}
\end{equation*}
So \(\alpha\vect{x}\in\geneigenspace{T}{\lambda}\text{.}\) By Theorem TSS, \(\geneigenspace{T}{\lambda}\) is a subspace of \(V\text{.}\)
Now we show that \(\geneigenspace{T}{\lambda}\) is invariant relative to \(T\text{.}\) Suppose that \(\vect{x}\in\geneigenspace{T}{\lambda}\text{.}\) Then by Definition 3.1.4 there is an integer \(k\) such that \(\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\text{.}\) The following argument is due to Zoltan Toth. Since \(T\) commutes with powers of \(T-\lambda I_V\text{,}\)
\begin{equation*}
\lteval{\left(T-\lambda I_V\right)^k}{\lteval{T}{\vect{x}}}=\lteval{T}{\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}}=\lteval{T}{\zerovector}=\zerovector\text{.}
\end{equation*}
This qualifies \(\lteval{T}{\vect{x}}\) for membership in \(\geneigenspace{T}{\lambda}\text{,}\) so \(\geneigenspace{T}{\lambda}\) is invariant relative to \(T\text{.}\)
Before we compute some generalized eigenspaces, we state and prove one theorem that will make it much easier to create a generalized eigenspace, since it will allow us to use tools we already know well, and will remove some of the ambiguity of the clause “for some \(k\)” in the definition.
Theorem 3.1.6. Generalized Eigenspace as a Kernel.
Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation, \(\dimension{V}=n\text{,}\) and \(\lambda\) is an eigenvalue of \(T\text{.}\) Then \(\geneigenspace{T}{\lambda}=\krn{\left(T-\lambda I_V\right)^n}\text{.}\)
Proof.
To establish the set equality, first suppose that \(\vect{x}\in\geneigenspace{T}{\lambda}\text{.}\) Then there is an integer \(k\) such that \(\lteval{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\text{.}\) This is equivalent to the statement that \(\vect{x}\in\krn{\left(T-\lambda I_V\right)^k}\text{.}\) No matter what the value of \(k\) is, Theorem 3.1.1 gives
\begin{equation*}
\krn{\left(T-\lambda I_V\right)^k}\subseteq\krn{\left(T-\lambda I_V\right)^n}\text{.}
\end{equation*}
So \(\geneigenspace{T}{\lambda}\subseteq\krn{\left(T-\lambda I_V\right)^n}\text{.}\)
For the opposite inclusion, suppose \(\vect{y}\in\krn{\left(T-\lambda I_V\right)^n}\text{.}\) Then \(\lteval{\left(T-\lambda I_V\right)^n}{\vect{y}}=\zerovector\text{,}\) so \(\vect{y}\in\geneigenspace{T}{\lambda}\) and thus \(\krn{\left(T-\lambda I_V\right)^n}\subseteq\geneigenspace{T}{\lambda}\text{.}\) So we have the desired equality of sets.
Theorem 3.1.6 allows us to compute generalized eigenspaces as a single kernel (or null space of a matrix representation) without considering all possible powers \(k\text{.}\) We can simply consider the case where \(k=n\text{.}\) It is worth noting that the “regular” eigenspace is a subspace of the generalized eigenspace since
\begin{equation*}
\eigenspace{T}{\lambda}=\krn{T-\lambda I_V}\subseteq\krn{\left(T-\lambda I_V\right)^n}=\geneigenspace{T}{\lambda}\text{,}
\end{equation*}
where the subset inclusion is a consequence of Theorem 3.1.1.
Also, there is no such thing as a “generalized eigenvalue.” If \(\lambda\) is not an eigenvalue of \(T\text{,}\) then the kernel of \(T-\lambda I_V\) is trivial, and therefore subsequent powers of \(T-\lambda I_V\) also have trivial kernels (Theorem 3.1.1 gives \(m=0\)). So if we defined generalized eigenspaces for scalars that are not eigenvalues, they would always be trivial. Alright, we know enough now to compute some generalized eigenspaces. We will record some information about algebraic and geometric multiplicities of eigenvalues (Definition AME, Definition GME) as we go, since these observations will be of interest in light of some future theorems.
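Theorem 3.1.6 also makes generalized eigenspaces easy to compute with software. Here is a hedged sympy sketch (the matrix is a hypothetical \(3\times 3\) example, not one from the examples below): the generalized eigenspace is just the null space of \(\left(A-\lambda I\right)^n\text{.}\)

```python
# Sketch of Theorem 3.1.6: G(T, lam) = ker((T - lam*I)^n), computed
# as the null space of a matrix power.  A is a hypothetical example.
from sympy import Matrix, eye

A = Matrix([
    [2, 1, 0],
    [0, 2, 0],
    [0, 0, 5],
])
n = A.rows

for lam in A.eigenvals():   # keys are the distinct eigenvalues
    basis = ((A - lam * eye(n)) ** n).nullspace()
    print(lam, [list(v) for v in basis])
# For lam = 2 the generalized eigenspace is two-dimensional, strictly
# larger than the one-dimensional "regular" eigenspace ker(A - 2I).
```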
Example 3.1.7. Linear Transformation Restriction on Generalized Eigenspace.
In order to gain some experience with generalized eigenspaces, we construct one and then also construct a matrix representation for the restriction to this invariant subspace.
Consider the linear transformation \(\ltdefn{T}{\complex{5}}{\complex{5}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\text{,}\) where
One of the eigenvalues of \(A\) is \(\lambda=2\text{,}\) with geometric multiplicity \(\geomult{T}{2}=1\text{,}\) and algebraic multiplicity \(\algmult{T}{2}=3\text{.}\) We get the generalized eigenspace according to Theorem 3.1.6,
By Theorem 3.1.5, we know \(W\) is invariant relative to \(T\text{,}\) so we can employ Definition LTR to form the restriction, \(\ltdefn{\restrict{T}{W}}{W}{W}\text{.}\)
We will build a matrix representation of the restriction of \(T\) to \(W\text{,}\) \(\restrict{T}{W}\text{,}\) since we will do this frequently in subsequent examples. For a basis of \(W\) we will use \(C=\set{\vect{w}_1,\,\vect{w}_2,\,\vect{w}_3}\text{.}\) Notice that \(\dimension{W}=3\text{,}\) so our matrix representation will be a square matrix of size 3. Applying Definition MR, we compute
So the matrix representation of \(\restrict{T}{W}\) relative to \(C\) is
The question arises: how do we use a \(3\times 3\) matrix to compute with vectors from \(\complex{5}\text{?}\) To answer this question, consider the randomly chosen vector
First check that \(\vect{w}\in\geneigenspace{T}{2}\text{.}\) There are two ways to do this. First, verify that
\begin{equation*}
\lteval{\left(T-2I_{\complex{5}}\right)^{5}}{\vect{w}}=\zerovector\text{,}
\end{equation*}
meeting Definition 3.1.3 (with \(k=5\)). Or, express \(\vect{w}\) as a linear combination of the basis \(C\) for \(W\text{,}\) to wit, \(\vect{w}=4\vect{w}_1-2\vect{w}_2-\vect{w}_3\text{.}\)
Now compute \(\lteval{\restrict{T}{W}}{\vect{w}}\) directly
It was necessary to verify that \(\vect{w}\in\geneigenspace{T}{2}\text{.}\) If we trust our work so far, then this output we just computed will also be an element of \(W\text{,}\) but it would be wise to check this anyway (using either of the methods we used for \(\vect{w}\)). We'll wait.
Now we will repeat this sample computation, but instead using the matrix representation of \(\restrict{T}{W}\) relative to \(C\text{.}\)
This matches the previous computation. Notice how the “action” of \(\restrict{T}{W}\) is accomplished by a \(3\times 3\) matrix multiplying a column vector of size 3.
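The whole workflow of this example can be scripted in a few lines of sympy. This is a sketch under stated assumptions: the \(5\times 5\) matrix of this example is not repeated here, so `A` and the eigenvalue below are placeholders, and the coordinate-solving step uses sympy's `gauss_jordan_solve`.

```python
# Sketch of the Example 3.1.7 workflow with a hypothetical 3x3 matrix.
from sympy import Matrix, eye

A = Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 5]])
lam, n = 2, A.rows

# Columns of C are a basis of W = G(T, lam), via Theorem 3.1.6.
C = Matrix.hstack(*((A - lam * eye(n)) ** n).nullspace())

# T(w_i) = A*w_i stays in W (invariance), so A*C = C*M for a unique
# small square matrix M: the representation of the restriction.
M, _ = C.gauss_jordan_solve(A * C)

w = Matrix([6, -2, 0])                 # a vector known to lie in W
coords, _ = C.gauss_jordan_solve(w)    # coordinates of w relative to C
print(C * (M * coords) == A * w)       # True: the small matrix M does the work
```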
If you would like more practice with these sorts of computations, mimic the above using the other eigenvalue of \(T\text{,}\) which is \(\lambda=-2\text{.}\) The generalized eigenspace has dimension 2, so the matrix representation of the restriction to the generalized eigenspace will be a \(2\times 2\) matrix.
Our next two examples compute a complete set of generalized eigenspaces for a linear transformation.
Example 3.1.8. Generalized Eigenspaces, Dimension 4 Domain.
In Example 1.4.2 we presented two invariant subspaces of \(\complex{4}\text{.}\) There was some mystery about just how these were constructed, but we can now reveal that they are generalized eigenspaces. Example 1.4.2 featured \(\ltdefn{T}{\complex{4}}{\complex{4}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\) with \(A\) given by
A matrix representation of \(T\) relative to the standard basis (Definition SUV) will equal \(A\text{.}\) So we can analyze \(A\) with the techniques of Chapter E. Doing so, we find two eigenvalues, \(\lambda=1,\,-2\text{,}\) with multiplicities,
To apply Theorem 3.1.6 we subtract each eigenvalue from the diagonal entries of \(A\text{,}\) raise the result to the power \(\dimension{\complex{4}}=4\text{,}\) and compute a basis for the null space.
In Example 1.4.2 we concluded that these two invariant subspaces formed a direct sum decomposition of \(\complex{4}\text{,}\) only at that time, they were called \(W\) and \(X\text{.}\) Now we can write
\begin{equation*}
\complex{4}=\geneigenspace{T}{1}\oplus\geneigenspace{T}{-2}\text{.}
\end{equation*}
This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See the upcoming Theorem 3.1.10.)
Example 3.1.9. Generalized Eigenspaces, Dimension 6 Domain.
Define the linear transformation \(\ltdefn{S}{\complex{6}}{\complex{6}}\) by \(\lteval{S}{\vect{x}}=B\vect{x}\) where
Then \(B\) will be the matrix representation of \(S\) relative to the standard basis and we can use the techniques of Chapter E applied to \(B\) in order to find the eigenvalues of \(S\text{.}\)
To find the generalized eigenspaces of \(S\) we need to subtract an eigenvalue from the diagonal elements of \(B\text{,}\) raise the result to the power \(\dimension{\complex{6}}=6\) and compute the null space. Here are the results for the two eigenvalues of \(S\text{,}\)
If we take the union of the two bases for these two invariant subspaces we obtain the set
You can check that this set is linearly independent (right now we have no guarantee this will happen). Once this is verified, we have a basis for \(\complex{6}\text{.}\) This is enough for us to apply Theorem 1.2.3 and conclude that
\begin{equation*}
\complex{6}=\geneigenspace{S}{3}\oplus\geneigenspace{S}{-1}\text{.}
\end{equation*}
This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See Theorem 3.1.10.)
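The linear independence check mentioned above is easy to automate. Here is a hedged sketch (again with a hypothetical matrix standing in for the examples' matrices): gather bases of every generalized eigenspace and confirm that, taken together, they form a basis of the whole space.

```python
# Sketch: bases of the generalized eigenspaces, taken together, form
# a basis of the domain.  B is a hypothetical 4x4 stand-in.
from sympy import Matrix, eye

B = Matrix([
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, -2, 1],
    [0, 0, 0, -2],
])
n = B.rows

vectors = []
for lam in B.eigenvals():
    vectors += ((B - lam * eye(n)) ** n).nullspace()

print(Matrix.hstack(*vectors).rank() == n)   # True: linearly independent,
                                             # so a basis of the whole space
```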
Our principal interest in generalized eigenspaces is the following important theorem, which has been presaged by the two previous examples.
Theorem 3.1.10. Generalized Eigenspace Decomposition.
Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation with distinct eigenvalues \(\scalarlist{\lambda}{m}\text{.}\) Then
\begin{equation*}
V=\geneigenspace{T}{\lambda_1}\oplus\geneigenspace{T}{\lambda_2}\oplus\cdots\oplus\geneigenspace{T}{\lambda_m}\text{.}
\end{equation*}
Proof.
We will provide a complete proof soon. For now, we give an outline.
We know that a decomposition of the domain of a linear transformation into invariant subspaces will give a block diagonal matrix representation. But it cuts both ways. If there is a similarity transformation to a block diagonal matrix, then the columns of the nonsingular matrix used for similarity will be a basis that can be partitioned into bases of invariant subspaces that are a direct sum decomposition of the domain (Theorem SCB). So we outline a sequence of similarity transformations that converts any square matrix to the appropriate block diagonal form.
Begin with the eigenvalues of the matrix, ordered so that equal eigenvalues are adjacent.
Determine the upper triangular matrix with these eigenvalues on the diagonal and similar to the original matrix as guaranteed by Theorem UTMR.
Suppose that the entry in row \(i\) and column \(j\) in the “upper half” (so \(j\gt i\)) has the value \(a\text{.}\) Suppose further that the diagonal entries (eigenvalues) \(\lambda_i\) and \(\lambda_j\) are different.
Define \(S\) to be the identity matrix, with the addition of the entry \(\frac{a}{\lambda_j-\lambda_i}\) in row \(i\) and column \(j\text{.}\) Then a similarity transformation by \(S\) will place a zero in row \(i\) and column \(j\text{.}\) Here is where we begin to understand being careful about equal and different eigenvalues.
The similarity transformation of the previous step will change other entries of the matrix, but only in row \(i\) to the right of the entry of interest and column \(j\) above the entry of interest.
Begin in the bottom row, going only as far right as needed to get different eigenvalues, and “zero out” the rest of the row. Move up a row, work left to right, “zeroing out” as much of the row as possible. Continue moving up a row at a time, then move left to right in the row. The restriction to using different eigenvalues will cut a staircase pattern.
You should understand that the blocks left on the diagonal correspond to runs of equal eigenvalues on the diagonal. So each block has a size equal to the algebraic multiplicity of the eigenvalue.
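To make the zeroing step concrete, here is a sympy sketch on a hypothetical \(3\times 3\) upper triangular matrix; the helper `zero_entry` is our own name, not a library routine. It clears the two entries whose diagonal eigenvalues differ, leaving exactly the blocks the outline describes.

```python
# Sketch: one "zero out" similarity step from the proof outline.
from sympy import Matrix, eye

def zero_entry(U, i, j):
    """Return S^(-1)*U*S where S = I + (U[i,j]/(U[j,j]-U[i,i]))*E_ij.
    Requires distinct diagonal entries U[i,i] != U[j,j]."""
    S = eye(U.rows)
    S[i, j] = U[i, j] / (U[j, j] - U[i, i])
    return S.inv() * U * S

U = Matrix([[1, 5, 7], [0, 1, 4], [0, 0, 3]])   # eigenvalues 1, 1, 3
U = zero_entry(U, 1, 2)   # bottom row first: clear U[1,2]
U = zero_entry(U, 0, 2)   # move up a row: clear U[0,2]
print(U)
# The U[0,1] entry survives (equal eigenvalues 1 and 1), leaving a
# 2x2 block for eigenvalue 1 and a 1x1 block for eigenvalue 3.
```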
Now, given any linear transformation, we can find a decomposition of the domain into a collection of invariant subspaces. And, as we have seen, such a decomposition will provide a basis for the domain so that a matrix representation relative to this basis will have a block diagonal form. Besides a decomposition into invariant subspaces, this proof has a bonus for us.
Corollary 3.1.11. Dimension of Generalized Eigenspaces.
Suppose \(\ltdefn{T}{V}{V}\) is a linear transformation with eigenvalue \(\lambda\text{.}\) Then the dimension of the generalized eigenspace for \(\lambda\) is the algebraic multiplicity of \(\lambda\text{,}\) \(\dimension{\geneigenspace{T}{\lambda}}=\algmult{T}{\lambda}\text{.}\)
Proof.
Coming soon: either as a consequence of the proof above, or by counting dimensions, using an inequality on the geometric multiplicity.
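Even before we have the proof, the corollary is easy to test numerically. A quick sketch, on yet another hypothetical matrix:

```python
# Sketch: dim G(T, lam) equals the algebraic multiplicity of lam.
from sympy import Matrix, eye

A = Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 7]])
n = A.rows

for lam, alg_mult in A.eigenvals().items():   # {eigenvalue: multiplicity}
    gen_dim = len(((A - lam * eye(n)) ** n).nullspace())
    print(lam, gen_dim == alg_mult)           # True for each eigenvalue
```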
We illustrate the use of this decomposition in building a block diagonal matrix representation.
Example 3.1.12. Matrix Representation with Generalized Eigenspaces, Dimension 6 Domain.
In Example 3.1.9 we computed the generalized eigenspaces of the linear transformation \(\ltdefn{S}{\complex{6}}{\complex{6}}\) defined by \(\lteval{S}{\vect{x}}=B\vect{x}\) where
We also recognized that these generalized eigenspaces provide a vector space decomposition,
\begin{equation*}
\complex{6}=\geneigenspace{S}{3}\oplus\geneigenspace{S}{-1}\text{.}
\end{equation*}
From these generalized eigenspaces, we found the basis
of \(\complex{6}\text{,}\) where \(\set{\vect{v}_1,\,\vect{v}_2}\) is a basis of \(\geneigenspace{S}{3}\) and \(\set{\vect{v}_3,\,\vect{v}_4,\,\vect{v}_5,\,\vect{v}_6}\) is a basis of \(\geneigenspace{S}{-1}\text{.}\)
We can employ \(C\) in the construction of a matrix representation of \(S\) (Definition MR). Here are the computations,
These column vectors are the columns of the matrix representation, so we obtain
As before, the key feature of this representation is the \(2\times 2\) and \(4\times 4\) blocks on the diagonal. They arise from generalized eigenspaces and their sizes are equal to the algebraic multiplicities of the eigenvalues.
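Finally, the entire construction of this example, from generalized eigenspaces to a block diagonal representation, compresses into a few lines. The matrix below is a hypothetical stand-in (the \(6\times 6\) matrix \(B\) is not reproduced here), but the pattern is identical: concatenate the generalized eigenspace bases into a change-of-basis matrix and conjugate.

```python
# Sketch: a change of basis built from generalized eigenspace bases
# block-diagonalizes the matrix.  B is a hypothetical 4x4 stand-in
# with eigenvalues 3 and -1, echoing Example 3.1.12.
from sympy import Matrix, eye

B = Matrix([
    [3, 1, 0, 0],
    [0, 3, 1, 0],
    [0, 0, -1, 1],
    [0, 0, 0, -1],
])
n = B.rows

cols = []
for lam in [3, -1]:
    cols += ((B - lam * eye(n)) ** n).nullspace()
C = Matrix.hstack(*cols)

print(C.inv() * B * C)   # block diagonal: one block per eigenvalue,
                         # each sized by the algebraic multiplicity
```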