
{\sc\large This section is in draft form}\\ {\sc\large Nearly complete}

\medskip We have seen in Section NLT that nilpotent linear transformations are almost never diagonalizable (Theorem DNLT), yet have matrix representations that are very nearly diagonal (Theorem CFNLT). Our goal in this section, and the next (Section JCF), is to obtain a matrix representation of any linear transformation that is very nearly diagonal. A key step in reaching this goal is an understanding of invariant subspaces, and a particular type of invariant subspace that contains vectors known as “generalized eigenvectors.”

\subsect{IS}{Invariant Subspaces}

As is often the case, we start with a definition.
Definition IS Invariant Subspace

Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation and $W$ is a subspace of $V$. Suppose further that $\lt{T}{\vect{w}}\in W$ for every $\vect{w}\in W$. Then $W$ is an invariant subspace of $V$ relative to $T$.

$\square$

We do not have any special notation for an invariant subspace, so it is important to recognize that an invariant subspace is always relative to both a superspace ($V$) and a linear transformation ($T$); sometimes these will not be mentioned explicitly, but they will be clear from the context. Note also that the linear transformation involved must have equal domain and codomain, since the definition would not make much sense if our outputs were not of the same type as our inputs.
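By linearity, it is enough to verify the condition of Definition IS on a basis of $W$: if the output of each basis vector of $W$ lands back in $W$, then so does the output of every vector of $W$. This makes invariance easy to test computationally. Here is a minimal sketch in Python; the matrix and subspace are hypothetical, chosen only for illustration, and are not taken from this section's examples.

```python
import numpy as np

# Hypothetical 2x2 matrix A (standing in for T) and candidate
# subspace W = span{w}.  Neither comes from the text's examples.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
w = np.array([1.0, 0.0])

# W is invariant iff A @ w lies back in span{w}, i.e. appending
# A @ w to a basis of W does not increase the rank.
stacked = np.column_stack([w, A @ w])
is_invariant = np.linalg.matrix_rank(stacked) == np.linalg.matrix_rank(w.reshape(-1, 1))
print(is_invariant)  # True: A @ w = 2w stays in W
```

The same rank comparison works for a candidate subspace of any dimension: stack the basis of $W$ together with the images of the basis vectors and compare ranks.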

As usual, we begin with an example that demonstrates the existence of invariant subspaces. We will return later to understand how this example was constructed, but for now, just understand how we check the existence of the invariant subspaces.
Example TIS Two invariant subspaces
Example TIS is a bit mysterious at this stage. Do we know any other examples of invariant subspaces? Yes, as it turns out, we have already seen quite a few. We will give some specific examples now and, for more general situations, describe broad classes of invariant subspaces with theorems. First up are eigenspaces.
Theorem EIS Eigenspaces are Invariant Subspaces
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation with eigenvalue $\lambda$ and associated eigenspace $\eigenspace{T}{\lambda}$. Let $W$ be any subspace of $\eigenspace{T}{\lambda}$. Then $W$ is an invariant subspace of $V$ relative to $T$.

Proof

Theorem EIS is general enough to determine that an entire eigenspace is an invariant subspace, or that simply the span of a single eigenvector is an invariant subspace. It is not always the case that any subspace of an invariant subspace is again an invariant subspace, but eigenspaces do have this property. Here is an example of the theorem, which allows us to very quickly build several invariant subspaces.
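The mechanism behind Theorem EIS is simply that eigenspace vectors are scaled, so their images cannot leave any subspace spanned by them. A small computational sketch; the $3\times 3$ matrix below is hypothetical, with eigenvalue $2$ of geometric multiplicity $2$, and is not the matrix of Example EIS.

```python
import numpy as np

# Hypothetical matrix with eigenvalue 2 whose eigenspace is span{e1, e2}.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 5.0]])
eigenspace_basis = [np.array([1.0, 0.0, 0.0]),
                    np.array([0.0, 1.0, 0.0])]

# Any w in the eigenspace satisfies A @ w = 2 w, so the image of any
# subspace W of the eigenspace stays inside W, as Theorem EIS asserts.
for w in eigenspace_basis:
    assert np.allclose(A @ w, 2.0 * w)
print("every eigenspace vector is scaled by 2")
```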
Example EIS Eigenspaces as invariant subspaces
For every linear transformation there are some obvious, trivial invariant subspaces. Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation. Then simply because $T$ is a function (Definition LT), the subspace $V$ is an invariant subspace of $T$. In only a minor twist on this theme, the range of $T$, $\rng{T}$, is an invariant subspace of $T$ by Definition RLT. Finally, Theorem LTTZZ provides the justification for claiming that $\set{\zerovector}$ is an invariant subspace of $T$.

That the trivial subspace is always an invariant subspace is a special case of the next theorem. As an easy exercise before reading the next theorem, prove that the kernel of a linear transformation (Definition KLT), $\krn{T}$, is an invariant subspace. We'll wait.
Theorem KPIS Kernels of Powers are Invariant Subspaces
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation and $k$ is a nonnegative integer. Then $\krn{T^k}$ is an invariant subspace of $V$.

Proof

Two interesting special cases of Theorem KPIS occur when we choose $k=0$ and $k=1$. Rather than give an example of this theorem, we will refer you back to Example KPNLT, where we work with null spaces of the first four powers of a nilpotent matrix. By Theorem KPIS each of these null spaces is an invariant subspace of the associated linear transformation.
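The invariance in Theorem KPIS comes from the fact that powers of $T$ commute with $T$: if $\lt{T^k}{\vect{x}}=\zerovector$, then $\lt{T^k}{\lt{T}{\vect{x}}}=\lt{T}{\lt{T^k}{\vect{x}}}=\zerovector$. A quick computational check, using a hypothetical nilpotent matrix (not the one from Example KPNLT):

```python
from sympy import Matrix

# Hypothetical nilpotent matrix N (a single 3x3 Jordan block for 0).
N = Matrix([[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]])
k = 2
kernel_basis = (N**k).nullspace()   # exact basis of ker(N^2)

# Invariance: N maps each kernel basis vector back into ker(N^2),
# since N^k (N x) = N (N^k x) = N 0 = 0.
for x in kernel_basis:
    assert (N**k) * (N * x) == Matrix([0, 0, 0])
print(len(kernel_basis))  # ker(N^2) is 2-dimensional
```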

Here's one more example of invariant subspaces we have encountered previously.
Example ISJB Invariant subspaces and Jordan blocks
\subsect{GEE}{Generalized Eigenvectors and Eigenspaces}

We now define a new type of invariant subspace and explore its key properties. This generalization of eigenvalues and eigenspaces will allow us to move from diagonal matrix representations of diagonalizable matrices to nearly diagonal matrix representations of arbitrary matrices. Here are the definitions.
Definition GEV Generalized Eigenvector

Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation. Suppose further that for $\vect{x}\neq\zerovector$, $\lt{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector$ for some $k>0$. Then $\vect{x}$ is a generalized eigenvector of $T$ with eigenvalue $\lambda$.

$\square$

Definition GES Generalized Eigenspace

Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation. Define the generalized eigenspace of $T$ for $\lambda$ as
\begin{align*}
\geneigenspace{T}{\lambda}
&=\setparts{\vect{x}}{\lt{\left(T-\lambda I_V\right)^k}{\vect{x}}=\zerovector\text{\ for some\ }k\geq 0}
\end{align*}
We will write $\geneigenspace{T}{\lambda}$ for the generalized eigenspace.

$\square$

So the generalized eigenspace is composed of generalized eigenvectors, plus the zero vector. As the name implies, the generalized eigenspace is a subspace of $V$. But more topically, it is an invariant subspace of $V$ relative to $T$.
Theorem GESIS Generalized Eigenspace is an Invariant Subspace
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation. Then the generalized eigenspace $\geneigenspace{T}{\lambda}$ is an invariant subspace of $V$ relative to $T$.

Proof

Before we compute some generalized eigenspaces, we state and prove one theorem that will make it much easier to create a generalized eigenspace, since it will allow us to use tools we already know well and will remove some of the ambiguity of the clause “for some $k$” in the definition.
Theorem GEK Generalized Eigenspace as a Kernel
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation, $\dimension{V}=n$, and $\lambda$ is an eigenvalue of $T$. Then $\geneigenspace{T}{\lambda}=\krn{\left(T-\lambda I_V\right)^n}$.

Proof

Theorem GEK allows us to compute generalized eigenspaces as a single kernel (or null space of a matrix representation) with tools like Theorem KNSI and Theorem BNS. Also, we do not need to consider all possible powers $k$ and can simply consider the case where $k=n$. It is worth noting that the “regular” eigenspace is a subspace of the generalized eigenspace since
\begin{align*}
\eigenspace{T}{\lambda}
&=\krn{\left(T-\lambda I_V\right)^1}
\subseteq\krn{\left(T-\lambda I_V\right)^n}
=\geneigenspace{T}{\lambda}
\end{align*}
where the subset inclusion is a consequence of Theorem KPLT.

Also, there is no such thing as a “generalized eigenvalue.” If $\lambda$ is not an eigenvalue of $T$, then the kernel of $T-\lambda I_V$ is trivial and therefore subsequent powers of $T-\lambda I_V$ also have trivial kernels (Theorem KPLT). So the generalized eigenspace of a scalar that is not already an eigenvalue would be trivial.

Alright, we know enough now to compute some generalized eigenspaces. We will record some information about algebraic and geometric multiplicities of eigenvalues (Definition AME, Definition GME) as we go, since these observations will be of interest in light of some future theorems.
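Theorem GEK reduces the computation of a generalized eigenspace to a single null space. The following sketch carries this out for a hypothetical $4\times 4$ matrix (not one of the matrices of Example GE4 or Example GE6) and records the two multiplicities along the way.

```python
from sympy import Matrix, eye

# Hypothetical matrix with eigenvalue 2 (algebraic multiplicity 3,
# geometric multiplicity 2) and eigenvalue 3 (both multiplicities 1).
A = Matrix([[2, 1, 0, 0],
            [0, 2, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])
lam, n = 2, 4

# Theorem GEK: the generalized eigenspace is ker((A - lam*I)^n).
gen_basis = ((A - lam * eye(n))**n).nullspace()
eig_basis = (A - lam * eye(n)).nullspace()

print(len(eig_basis))   # 2: dimension of the "regular" eigenspace
print(len(gen_basis))   # 3: dimension of the generalized eigenspace
```

Note that the regular eigenspace came out as a proper subspace of the generalized eigenspace, consistent with the subset inclusion above.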
Example GE4 Generalized eigenspaces, dimension 4 domain
Example GE6 Generalized eigenspaces, dimension 6 domain
\subsect{RLT}{Restrictions of Linear Transformations}

Generalized eigenspaces will prove to be an important type of invariant subspace. A second reason for our interest in invariant subspaces is that they provide us with another method for creating new linear transformations from old ones.
Definition LTR Linear Transformation Restriction

Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation, and $U$ is an invariant subspace of $V$ relative to $T$. Define the restriction of $T$ to $U$ by
\begin{align*}
\ltdefn{\restrict{T}{U}}{U}{U}& &
\lt{\restrict{T}{U}}{\vect{u}}&=\lt{T}{\vect{u}}
\end{align*}
We will write $\restrict{T}{U}$ for the restriction.

$\square$

It might appear that this definition has not accomplished anything, as $\restrict{T}{U}$ would appear to take on exactly the same values as $T$. And this is true. However, $\restrict{T}{U}$ differs from $T$ in the choice of domain and codomain. We tend to give little attention to the domain and codomain of functions, while their defining rules get the spotlight. But the restriction of a linear transformation is all about the choice of domain and codomain. We are restricting the rule of the function to a smaller subspace. Notice the importance of only using this construction with an invariant subspace, since otherwise we cannot be assured that the outputs of the function are even contained in the codomain. Maybe this observation should be the key step in the proof of a theorem saying that $\restrict{T}{U}$ is also a linear transformation, but we won't bother.

Example LTRGE Linear transformation restriction on generalized eigenspace
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation and we can find a decomposition of $V$ as a direct sum, say $V=U_1\ds U_2\ds U_3\ds\cdots\ds U_m$ where each $U_i$ is an invariant subspace of $V$ relative to $T$. Then, for any $\vect{v}\in V$ there is a unique decomposition $\vect{v}=\vect{u}_1+\vect{u}_2+\vect{u}_3+\cdots+\vect{u}_m$ with $\vect{u}_i\in U_i$, $1\leq i\leq m$ and furthermore \begin{align*} \lt{T}{\vect{v}} &=\lt{T}{\vect{u}_1+\vect{u}_2+\vect{u}_3+\cdots+\vect{u}_m} &&\text{Definition DS}\\ &=\lt{T}{\vect{u}_1}+\lt{T}{\vect{u}_2}+\lt{T}{\vect{u}_3}+\cdots+\lt{T}{\vect{u}_m} &&\text{Theorem LTLC}\\ &=\lt{\restrict{T}{U_1}}{\vect{u}_1}+\lt{\restrict{T}{U_2}}{\vect{u}_2}+\lt{\restrict{T}{U_3}}{\vect{u}_3}+\cdots+\lt{\restrict{T}{U_m}}{\vect{u}_m} \end{align*} So in a very real sense, we obtain a decomposition of the linear transformation $T$ into the restrictions $\restrict{T}{U_i}$, $1\leq i\leq m$. If we wanted to be more careful, we could extend each restriction to a linear transformation defined on $V$ by setting the output of $\restrict{T}{U_i}$ to be the zero vector for inputs outside of $U_i$. Then $T$ would be exactly equal to the sum (Definition LTA) of these extended restrictions. However, the irony of extending our restrictions is more than we could handle right now.

Our real interest is in the matrix representation of a linear transformation when the domain decomposes as a direct sum of invariant subspaces. Consider forming a basis $B$ of $V$ as the union of bases $B_i$ from the individual $U_i$, i.e. $B=\cup_{i=1}^m\,B_i$. Now form the matrix representation of $T$ relative to $B$. The result will be block diagonal, where each block is the matrix representation of a restriction $\restrict{T}{U_i}$ relative to a basis $B_i$, $\matrixrep{\restrict{T}{U_i}}{B_i}{B_i}$. Though we did not have the definitions to describe it then, this is exactly what was going on in the latter portion of the proof of Theorem CFNLT. Two examples should help to clarify these ideas.

Example ISMR4 Invariant subspaces, matrix representation, dimension 4 domain
Example ISMR6 Invariant subspaces, matrix representation, dimension 6 domain
The paragraph prior to these last two examples is worth repeating. A basis derived from a direct sum decomposition into invariant subspaces will provide a matrix representation of a linear transformation with a block diagonal form.
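Here is that computation in miniature. The matrices below are hypothetical: the transformation is manufactured from a block diagonal matrix and an invertible matrix $S$ whose columns, grouped by invariant subspace, form a basis assembled from subspaces of dimensions 2, 1 and 1. Computing the representation relative to that basis, $S^{-1}AS$, returns the block diagonal form.

```python
from sympy import Matrix

# Block diagonal target: one 2x2 block and two 1x1 blocks, each the
# representation of a restriction to an invariant subspace.
blocks = Matrix([[2, 1, 0, 0],
                 [0, 2, 0, 0],
                 [0, 0, 5, 0],
                 [0, 0, 0, 7]])

# Hypothetical invertible S; its columns are the basis B, grouped by
# invariant subspace (first two columns, third column, fourth column).
S = Matrix([[1, 0, 1, 1],
            [1, 1, 0, 1],
            [0, 1, 1, 0],
            [0, 0, 0, 1]])

A = S * blocks * S.inv()   # the transformation in the standard basis

# Representation relative to B: the block diagonal structure reappears.
rep = S.inv() * A * S
assert rep == blocks
print("representation relative to B is block diagonal")
```

The construction is deliberately circular, of course; its only point is to exhibit the relationship between a basis grouped by invariant subspaces and a block diagonal representation.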

Diagonalizing a linear transformation is the most extreme example of decomposing a vector space into invariant subspaces. When a linear transformation is diagonalizable, then there is a basis composed of eigenvectors (Theorem DC). Each of these basis vectors can be used individually as the lone element of a spanning set for an invariant subspace (Theorem EIS). So the domain decomposes into a direct sum of one-dimensional invariant subspaces (Theorem DSFB). The corresponding matrix representation is then block diagonal with all the blocks of size 1, i.e. the matrix is diagonal. Section NLT, Section IS and Section JCF are all devoted to generalizing this extreme situation when there are not enough eigenvectors available to make such a complete decomposition and arrive at such an elegant matrix representation.

One last theorem will roll up much of this section and Section NLT into one nice, neat package.
Theorem RGEN Restriction to Generalized Eigenspace is Nilpotent
Suppose $\ltdefn{T}{V}{V}$ is a linear transformation with eigenvalue $\lambda$. Then the linear transformation $\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}$ is nilpotent.

Proof

The proof of Theorem RGEN indicates that the index of the nilpotent linear transformation is less than or equal to the dimension of $V$. In practice, it will be less than or equal to the dimension of the domain of the restriction, $\geneigenspace{T}{\lambda}$. In any event, the exact value of this index will be of some interest, so we define it now. Notice that this is a property of the eigenvalue $\lambda$, similar to the algebraic and geometric multiplicities (Definition AME, Definition GME).
Definition IE Index of an Eigenvalue

Suppose $\ltdefn{T}{V}{V}$ is a linear transformation with eigenvalue $\lambda$. Then the index of $\lambda$, $\indx{T}{\lambda}$, is the index of the nilpotent linear transformation $\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}$.

$\square$
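As a sketch of how the index might be computed in practice, consider again a hypothetical matrix whose generalized eigenspace for $\lambda=2$ is spanned by the first three standard basis vectors, so the restriction of $T-\lambda I_V$ is simply the upper-left $3\times 3$ block; the index is the first power of that block that vanishes.

```python
from sympy import Matrix, eye

# Hypothetical matrix; the generalized eigenspace for eigenvalue 2 is
# span{e1, e2, e3}, and the eigenvalue 3 accounts for the rest.
A = Matrix([[2, 1, 0, 0],
            [0, 2, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])

# Restriction of A - 2I to the generalized eigenspace, expressed in the
# standard basis vectors spanning it: the upper-left 3x3 block.
R = (A - 2 * eye(4))[:3, :3]

# Theorem RGEN guarantees R is nilpotent; its index is the index of the
# eigenvalue 2.
k = 1
while R**k != Matrix.zeros(3, 3):
    k += 1
print(k)  # index of eigenvalue 2 is 2
```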

Example GENR6 Generalized eigenspaces and nilpotent restrictions, dimension 6 domain
Invariant subspaces, and restrictions of linear transformations, are topics you will see again and again if you continue with further study of linear algebra. Our reason for discussing them now is to arrive at a nice matrix representation of the restriction of a linear transformation to one of its generalized eigenspaces. Here's the theorem.
Theorem MRRGE Matrix Representation of a Restriction to a Generalized Eigenspace
Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation with eigenvalue $\lambda$. Then there is a basis of the generalized eigenspace $\geneigenspace{T}{\lambda}$ such that the restriction $\ltdefn{\restrict{T}{\geneigenspace{T}{\lambda}}}{\geneigenspace{T}{\lambda}}{\geneigenspace{T}{\lambda}}$ has a matrix representation that is block diagonal where each block is a Jordan block of the form $\jordan{n}{\lambda}$.

Proof

Of course, Theorem CFNLT provides some extra information on the sizes of the Jordan blocks in a representation, and we could carry over this information to Theorem MRRGE, but we will save that for a subsequent application of this result.