Section IS  Invariant Subspaces

From A First Course in Linear Algebra
Version 2.30
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/

This section is in draft form
Nearly complete

We have seen in Section NLT that nilpotent linear transformations are almost never diagonalizable (Theorem DNLT), yet have matrix representations that are very nearly diagonal (Theorem CFNLT). Our goal in this section, and the next (Section JCF), is to obtain a matrix representation of any linear transformation that is very nearly diagonal. A key step in reaching this goal is an understanding of invariant subspaces, and a particular type of invariant subspace that contains vectors known as “generalized eigenvectors.”

Subsection IS: Invariant Subspaces

As is often the case, we start with a definition.

Definition IS
Invariant Subspace
Suppose that T : V → V is a linear transformation and W is a subspace of V . Suppose further that T\left (w\right ) ∈ W for every w ∈ W. Then W is an invariant subspace of V relative to T.

We do not have any special notation for an invariant subspace, so it is important to recognize that an invariant subspace is always relative to both a superspace (V ) and a linear transformation (T), which will sometimes not be mentioned, yet will be clear from the context. Note also that the linear transformation involved must have an equal domain and codomain — the definition would not make much sense if our outputs were not of the same type as our inputs.

As usual, we begin with an example that demonstrates the existence of invariant subspaces. We will return later to understand how this example was constructed, but for now, just understand how we check the existence of the invariant subspaces.

Example TIS
Two invariant subspaces
Consider the linear transformation T : {ℂ}^{4} → {ℂ}^{4} defined by T\left (x\right ) = Ax where A is given by

\eqalignno{ A & = \left [\array{ −8& 6 &−15& 9\cr −8 & 14 &−10 & 18 \cr 1 & 1 & 3 & 0\cr 3 &−8 & 2 &−11 } \right ] & & }

Define (with zero motivation),

\eqalignno{ {w}_{1} & = \left [\array{ −7\cr −2 \cr 3\cr 0 } \right ] &{w}_{2} & = \left [\array{ −1\cr −2 \cr 0\cr 1 } \right ] & & & & }

and set W = \left \langle \left \{{w}_{1},\kern 1.95872pt {w}_{2}\right \}\right \rangle . We verify that W is an invariant subspace of {ℂ}^{4} with respect to T. By the definition of W, any vector chosen from W can be written as a linear combination of {w}_{1} and {w}_{2}. Suppose that w ∈ W, and then check the details of the following verification,

\eqalignno{ T\left (w\right ) & = T\left ({a}_{1}{w}_{1} + {a}_{2}{w}_{2}\right ) & &\text{Definition SS} & & & & \cr & = {a}_{1}T\left ({w}_{1}\right ) + {a}_{2}T\left ({w}_{2}\right ) & &\text{Theorem LTLC} & & & & \cr & = {a}_{1}\left [\array{ −1\cr −2 \cr 0\cr 1 } \right ] + {a}_{2}\left [\array{ 5\cr −2 \cr −3\cr 2 } \right ] & & & & \cr & = {a}_{1}{w}_{2} + {a}_{2}\left ((−1){w}_{1} + 2{w}_{2}\right ) & & & & \cr & = (−{a}_{2}){w}_{1} + ({a}_{1} + 2{a}_{2}){w}_{2} & & & & \cr & ∈ W & &\text{Definition SS} & & & & }

So, by Definition IS, W is an invariant subspace of {ℂ}^{4} relative to T. In an entirely similar manner we construct another invariant subspace of T.

With zero motivation, define

\eqalignno{ {x}_{1} & = \left [\array{ −3\cr −1 \cr 1\cr 0 } \right ] &{x}_{2} & = \left [\array{ 0\cr −1 \cr 0\cr 1 } \right ] & & & & }

and set X = \left \langle \left \{{x}_{1},\kern 1.95872pt {x}_{2}\right \}\right \rangle . We verify that X is an invariant subspace of {ℂ}^{4} with respect to T. By the definition of X, any vector chosen from X can be written as a linear combination of {x}_{1} and {x}_{2}. Suppose that x ∈ X, and then check the details of the following verification,

\eqalignno{ T\left (x\right ) & = T\left ({b}_{1}{x}_{1} + {b}_{2}{x}_{2}\right ) & &\text{Definition SS} & & & & \cr & = {b}_{1}T\left ({x}_{1}\right ) + {b}_{2}T\left ({x}_{2}\right ) & &\text{Theorem LTLC} & & & & \cr & = {b}_{1}\left [\array{ 3\cr 0 \cr −1\cr 1 } \right ] + {b}_{2}\left [\array{ 3\cr 4 \cr −1\cr −3 } \right ] & & & & \cr & = {b}_{1}\left ((−1){x}_{1} + {x}_{2}\right ) + {b}_{2}\left ((−1){x}_{1} + (−3){x}_{2}\right ) & & & & \cr & = (−{b}_{1} − {b}_{2}){x}_{1} + ({b}_{1} − 3{b}_{2}){x}_{2} & & & & \cr & ∈ X & &\text{Definition SS} & & & & }

So, by Definition IS, X is an invariant subspace of {ℂ}^{4} relative to T.

There is a bit of magic in each of these verifications where the two outputs of T happen to equal linear combinations of the two inputs. But this is the essential nature of an invariant subspace. We’ll have a peek under the hood later, and it won’t look so magical after all.

As a hint of things to come, verify that B = \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {x}_{1},\kern 1.95872pt {x}_{2}\right \} is a basis of {ℂ}^{4}. Splitting this basis in half, Theorem DSFB tells us that {ℂ}^{4} = W ⊕ X. To see why a decomposition of a vector space into a direct sum of invariant subspaces might be interesting, construct the matrix representation of T relative to B, {M}_{B,B}^{T }. Hmmmmmm.
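For readers who like to check such computations with software, here is a brief sketch in Python (using NumPy; the variable names and the final similarity computation are our own packaging, not part of the text). It verifies both invariance claims and then peeks at {M}_{B,B}^{T }.

import numpy as np

A = np.array([[-8, 6, -15, 9],
              [-8, 14, -10, 18],
              [1, 1, 3, 0],
              [3, -8, 2, -11]])
w1 = np.array([-7, -2, 3, 0]); w2 = np.array([-1, -2, 0, 1])
x1 = np.array([-3, -1, 1, 0]); x2 = np.array([0, -1, 0, 1])

# T(v) lies in the span of a basis exactly when a least-squares fit
# of the basis vectors to T(v) leaves no residual.
def in_span(basis, v):
    c, *_ = np.linalg.lstsq(np.column_stack(basis), v, rcond=None)
    return np.allclose(np.column_stack(basis) @ c, v)

print(all(in_span([w1, w2], A @ w) for w in (w1, w2)))  # True: W is invariant
print(all(in_span([x1, x2], A @ x) for x in (x1, x2)))  # True: X is invariant

# The representation of T relative to B, computed as a similarity.
P = np.column_stack([w1, w2, x1, x2])
print(np.round(np.linalg.inv(P) @ A @ P).astype(int))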

Example TIS is a bit mysterious at this stage. Do we know any other examples of invariant subspaces? Yes, as it turns out, we have already seen quite a few. We’ll give some examples now, and in more general situations, describe broad classes of invariant subspaces with theorems. First up are eigenspaces.

Theorem EIS
Eigenspaces are Invariant Subspaces
Suppose that T : V → V is a linear transformation with eigenvalue λ and associated eigenspace {ℰ}_{T }\left (λ\right ). Let W be any subspace of {ℰ}_{T }\left (λ\right ). Then W is an invariant subspace of V relative to T.

Proof   Choose w ∈ W. Then

\eqalignno{ T\left (w\right ) & = λw & &\text{Definition EELT} & & & & \cr & ∈ W & &\text{Property SC} & & & & }

So by Definition IS, W is an invariant subspace of V relative to T.

Theorem EIS is general enough to determine that an entire eigenspace is an invariant subspace, or that simply the span of a single eigenvector is an invariant subspace. It is not always the case that any subspace of an invariant subspace is again an invariant subspace, but eigenspaces do have this property. Here is an example of the theorem, which also allows us to very quickly build several invariant subspaces.

Example EIS
Eigenspaces as invariant subspaces
Define the linear transformation S : {M}_{22} → {M}_{22} by

\eqalignno{ S\left (\left [\array{ a&b\cr c&d } \right ]\right ) & = \left [\array{ −2a + 19b − 33c + 21d&−3a + 16b − 24c + 15d\cr −2a + 9b − 13c + 9d & −a + 4b − 6c + 5d } \right ] & & }

Build a matrix representation of S relative to the standard basis (Definition MR, Example BM) and compute eigenvalues and eigenspaces of S with the computational techniques of Chapter E in concert with Theorem EER. Then

\eqalignno{ {ℰ}_{S}\left (1\right ) & = \left \langle \left \{\left [\array{ 4&3\cr 2&1 } \right ]\right \}\right \rangle &{ℰ}_{S}\left (2\right ) & = \left \langle \left \{\left [\array{ 6&3\cr 1&0 } \right ],\kern 1.95872pt \left [\array{ −9&−3\cr 0 & 1 } \right ]\right \}\right \rangle & & & & }

So by Theorem EIS, both {ℰ}_{S}\left (1\right ) and {ℰ}_{S}\left (2\right ) are invariant subspaces of {M}_{22} relative to S. However, Theorem EIS provides even more invariant subspaces. Since {ℰ}_{S}\left (1\right ) has dimension 1, it has no interesting subspaces, however {ℰ}_{S}\left (2\right ) has dimension 2 and has a plethora of subspaces. For example, set

\eqalignno{ u & = 2\left [\array{ 6&3\cr 1&0 } \right ] + 3\left [\array{ −9&−3\cr 0 & 1 } \right ] = \left [\array{ −15&−3\cr 2 & 3 } \right ] & & }

and define U = \left \langle \left \{u\right \}\right \rangle . Then since U is a subspace of {ℰ}_{S}\left (2\right ), Theorem EIS says that U is an invariant subspace of {M}_{22} (or we could check this claim directly based simply on the fact that u is an eigenvector of S).
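A quick computational check of this example (a sketch in Python with NumPy; vectorizing {M}_{22} in the order a, b, c, d is our own choice, not part of the text):

import numpy as np

# Matrix representation of S relative to the standard basis of M22.
S = np.array([[-2, 19, -33, 21],
              [-3, 16, -24, 15],
              [-2, 9, -13, 9],
              [-1, 4, -6, 5]])

v1 = np.array([4, 3, 2, 1])    # spans E_S(1)
v2 = np.array([6, 3, 1, 0])    # first basis vector of E_S(2)
v3 = np.array([-9, -3, 0, 1])  # second basis vector of E_S(2)
u = 2 * v2 + 3 * v3            # the vector u above, (-15, -3, 2, 3)

print(np.array_equal(S @ v1, v1))    # True: an eigenvector for lambda = 1
print(np.array_equal(S @ u, 2 * u))  # True: U = <{u}> is invariant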

For every linear transformation there are some obvious, trivial invariant subspaces. Suppose that T : V → V is a linear transformation. Then simply because T is a function (Definition LT), the subspace V is an invariant subspace of T. In only a minor twist on this theme, the range of T, ℛ\kern -1.95872pt \left (T\right ), is an invariant subspace of T by Definition RLT. Finally, Theorem LTTZZ provides the justification for claiming that \left \{0\right \} is an invariant subspace of T.

That the trivial subspace is always an invariant subspace is a special case of the next theorem. As an easy exercise before reading the next theorem, prove that the kernel of a linear transformation (Definition KLT), K\kern -1.95872pt \left (T\right ), is an invariant subspace. We’ll wait.

Theorem KPIS
Kernels of Powers are Invariant Subspaces
Suppose that T : V → V is a linear transformation and k is a non-negative integer. Then K\kern -1.95872pt \left ({T}^{k}\right ) is an invariant subspace of V relative to T.

Proof   Suppose that z ∈K\kern -1.95872pt \left ({T}^{k}\right ). Then

\eqalignno{ {T}^{k}\left (T\left (z\right )\right ) & = {T}^{k+1}\left (z\right ) & &\text{Definition LTC} & & & & \cr & = T\left ({T}^{k}\left (z\right )\right ) & &\text{Definition LTC} & & & & \cr & = T\left (0\right ) & &\text{Definition KLT} & & & & \cr & = 0 & &\text{Theorem LTTZZ} & & & & }

So by Definition KLT, we see that T\left (z\right ) ∈K\kern -1.95872pt \left ({T}^{k}\right ). Thus K\kern -1.95872pt \left ({T}^{k}\right ) is an invariant subspace of V relative to T (Definition IS).

Two interesting special cases of Theorem KPIS occur when we choose k = 0 and k = 1. Rather than give an example of this theorem, we will refer you back to Example KPNLT where we work with null spaces of the first four powers of a nilpotent matrix. By Theorem KPIS each of these null spaces is an invariant subspace of the associated linear transformation.
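Here is a small exact-arithmetic illustration of Theorem KPIS (a sketch in Python with SymPy; the choice T\left (x\right ) = \left (A − {I}_{4}\right )x, with A borrowed from Example TIS, is our own, made so that the kernels are nontrivial):

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
T = A - eye(4)  # lambda = 1 is an eigenvalue of A, so K(T^k) is nontrivial

K = (T**4).nullspace()   # a basis of K(T^4)
B = Matrix.hstack(*K)
for z in K:
    # T(z) lies in K(T^4) exactly when appending it does not raise the rank.
    print(Matrix.hstack(B, T * z).rank() == B.rank())   # True each time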

Here’s one more example of invariant subspaces we have encountered previously.

Example ISJB
Invariant subspaces and Jordan blocks
Refer back to Example CFNLT. We decomposed the vector space {ℂ}^{6} into a direct sum of the subspaces {Z}_{1},\kern 1.95872pt {Z}_{2},\kern 1.95872pt {Z}_{3},\kern 1.95872pt {Z}_{4}. The union of the basis vectors for these subspaces is a basis of {ℂ}^{6}, which we reordered prior to building a matrix representation of the linear transformation T. A principal reason for this reordering was to create invariant subspaces (though it was not obvious then).

Define

\eqalignno{ {X}_{1} & = \left \langle \left \{{z}_{1,1},\kern 1.95872pt {z}_{2,1},\kern 1.95872pt {z}_{3,1},\kern 1.95872pt {z}_{4,1}\right \}\right \rangle = \left \langle \left \{\left [\array{ 1\cr 1 \cr 0\cr 1 \cr 1\cr 1 } \right ],\kern 1.95872pt \left [\array{ 1\cr 0 \cr 3\cr 1 \cr 0\cr −1 } \right ],\kern 1.95872pt \left [\array{ −3\cr −3 \cr −3\cr −3 \cr −3\cr −2 } \right ],\kern 1.95872pt \left [\array{ 1\cr 0 \cr 0\cr 0 \cr 0\cr 0 } \right ]\right \}\right \rangle & & \cr {X}_{2} & = \left \langle \left \{{z}_{1,2},\kern 1.95872pt {z}_{2,2}\right \}\right \rangle = \left \langle \left \{\left [\array{ −2\cr −2 \cr −5\cr −2 \cr −1\cr 0 } \right ],\kern 1.95872pt \left [\array{ 2\cr 1 \cr 2\cr 2 \cr 2\cr 1 } \right ]\right \}\right \rangle & & }

Recall from the proof of Theorem CFNLT or the computations in Example CFNLT that the first vectors in the spanning sets of {X}_{1} and {X}_{2} are in the kernel of T, K\kern -1.95872pt \left (T\right ), and that each of the other vectors in these sets is the output of T when evaluated at the subsequent vector of the set. This was by design, and it is this feature of these basis vectors that leads to the nearly diagonal matrix representation with Jordan blocks. However, we also recognize now that this property of these basis vectors allows us to conclude easily that {X}_{1} and {X}_{2} are invariant subspaces of {ℂ}^{6} relative to T.

Furthermore, {ℂ}^{6} = {X}_{ 1} ⊕ {X}_{2} (Theorem DSFB). So the domain of T is the direct sum of invariant subspaces and the resulting matrix representation has a block diagonal form. Hmmmmm.

Subsection GEE: Generalized Eigenvectors and Eigenspaces

We now define a new type of invariant subspace and explore its key properties. This generalization of eigenvalues and eigenspaces will allow us to move from diagonal matrix representations of diagonalizable matrices to nearly diagonal matrix representations of arbitrary matrices. Here are the definitions.

Definition GEV
Generalized Eigenvector
Suppose that T : V → V is a linear transformation. Suppose further that x\mathrel{≠}0 and {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0 for some k > 0. Then x is a generalized eigenvector of T with eigenvalue λ.

Definition GES
Generalized Eigenspace
Suppose that T : V → V is a linear transformation. Define the generalized eigenspace of T for λ as

\eqalignno{ {G}_{T }\left (λ\right ) & = \left \{\left .x\right \vert {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0\text{ for some }k ≥ 0\right \} & & }

(This definition contains Notation GES.)

So the generalized eigenspace is composed of generalized eigenvectors, plus the zero vector. As the name implies, the generalized eigenspace is a subspace of V . But more topically, it is an invariant subspace of V relative to T.

Theorem GESIS
Generalized Eigenspace is an Invariant Subspace
Suppose that T : V → V is a linear transformation. Then the generalized eigenspace {G}_{T }\left (λ\right ) is an invariant subspace of V relative to T.

Proof   First we establish that {G}_{T }\left (λ\right ) is a subspace of V . Note that {\left (T − λ{I}_{V }\right )}^{1}\left (0\right ) = 0 by Theorem LTTZZ, so 0 ∈{G}_{T }\left (λ\right ).

Suppose that x,\kern 1.95872pt y ∈{G}_{T }\left (λ\right ). Then there are integers k,\kern 1.95872pt ℓ such that {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0 and {\left (T − λ{I}_{V }\right )}^{ℓ}\left (y\right ) = 0. Set m = k + ℓ,

\eqalignno{ {\left (T − λ{I}_{V }\right )}^{m}\left (x + y\right )& ={ \left (T − λ{I}_{ V }\right )}^{m}\left (x\right ) +{ \left (T − λ{I}_{ V }\right )}^{m}\left (y\right ) &&\text{Definition LT} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k+ℓ}\left (x\right ) +{ \left (T − λ{I}_{ V }\right )}^{k+ℓ}\left (y\right )&& && \cr & ={ \left (T − λ{I}_{V }\right )}^{ℓ}\left ({\left (T − λ{I}_{ V }\right )}^{k}\left (x\right )\right )+ && && \cr &\quad \quad {\left (T − λ{I}_{V }\right )}^{k}\left ({\left (T − λ{I}_{ V }\right )}^{ℓ}\left (y\right )\right ) &&\text{Definition LTC} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{ℓ}\left (0\right ) +{ \left (T − λ{I}_{ V }\right )}^{k}\left (0\right ) &&\text{Definition GES} &&&& \cr & = 0 + 0 &&\text{Theorem LTTZZ}&&&& \cr & = 0 &&\text{Property Z} &&&& }

So x + y ∈{G}_{T }\left (λ\right ).

Suppose that x ∈{G}_{T }\left (λ\right ) and α ∈ ℂ. Then there is an integer k such that {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0.

\eqalignno{ {\left (T − λ{I}_{V }\right )}^{k}\left (αx\right ) & = α{\left (T − λ{I}_{ V }\right )}^{k}\left (x\right ) & &\text{Definition LT} & & & & \cr & = α0 & &\text{Definition GES} & & & & \cr & = 0 & &\text{Theorem ZVSM} & & & & }

So αx ∈{G}_{T }\left (λ\right ). By Theorem TSS, {G}_{T }\left (λ\right ) is a subspace of V .

Now we show that {G}_{T }\left (λ\right ) is invariant relative to T. Suppose that x ∈{G}_{T }\left (λ\right ). Then by Definition GES there is an integer k such that {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0. The following argument is due to Zoltan Toth.

\eqalignno{ {\left (T − λ{I}_{V }\right )}^{k}\left (T\left (x\right )\right )& ={ \left (T − λ{I}_{ V }\right )}^{k}\left (T\left (x\right )\right ) − 0 &&\text{Property Z} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k}\left (T\left (x\right )\right ) − λ0 &&\text{Theorem ZVSM} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k}\left (T\left (x\right )\right ) − λ{\left (T − λ{I}_{ V }\right )}^{k}\left (x\right )&&\text{Definition GES} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k}\left (T\left (x\right )\right ) −{\left (T − λ{I}_{ V }\right )}^{k}\left (λx\right )&&\text{Definition LT} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k}\left (T\left (x\right ) − λx\right ) &&\text{Definition LT} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k}\left (\left (T − λ{I}_{ V }\right )\left (x\right )\right ) &&\text{Definition LTA} &&&& \cr & ={ \left (T − λ{I}_{V }\right )}^{k+1}\left (x\right ) &&\text{Definition LTC} &&&& \cr & = \left (T − λ{I}_{V }\right )\left ({\left (T − λ{I}_{V }\right )}^{k}\left (x\right )\right ) &&\text{Definition LTC} &&&& \cr & = \left (T − λ{I}_{V }\right )\left (0\right ) &&\text{Definition GES} &&&& \cr & = 0 &&\text{Theorem LTTZZ}&&&& }

This qualifies T\left (x\right ) for membership in {G}_{T }\left (λ\right ), so by Definition GES, {G}_{T }\left (λ\right ) is invariant relative to T.

Before we compute some generalized eigenspaces, we state and prove one theorem that will make it much easier to create a generalized eigenspace, since it will allow us to use tools we already know well, and will remove some of the ambiguity of the clause “for some k” in the definition.

Theorem GEK
Generalized Eigenspace as a Kernel
Suppose that T : V → V is a linear transformation, \mathop{ dim}\nolimits \left (V \right ) = n, and λ is an eigenvalue of T. Then {G}_{T }\left (λ\right ) = K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{n}\right ).

Proof   The conclusion of this theorem is a set equality, so we will apply Definition SE by establishing two set inclusions. First, suppose that x ∈{G}_{T }\left (λ\right ). Then there is an integer k such that {\left (T − λ{I}_{V }\right )}^{k}\left (x\right ) = 0. This is equivalent to the statement that x ∈K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{k}\right ). No matter what the value of k is, Theorem KPLT gives

\eqalignno{ x & ∈K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{k}\right ) ⊆K\kern -1.95872pt \left ({\left (T − λ{I}_{ V }\right )}^{n}\right ) & & }

So, {G}_{T }\left (λ\right ) ⊆K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{n}\right ). For the opposite inclusion, suppose y ∈K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{n}\right ). Then {\left (T − λ{I}_{V }\right )}^{n}\left (y\right ) = 0, so y ∈{G}_{T }\left (λ\right ) and thus K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{n}\right ) ⊆{G}_{ T }\left (λ\right ). By Definition SE we have the desired equality of sets.

Theorem GEK allows us to compute generalized eigenspaces as a single kernel (or null space of a matrix representation) with tools like Theorem KNSI and Theorem BNS. Also, we do not need to consider all possible powers k and can simply consider the case where k = n. It is worth noting that the “regular” eigenspace is a subspace of the generalized eigenspace since

\eqalignno{ {ℰ}_{T }\left (λ\right ) & = K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{1}\right ) ⊆K\kern -1.95872pt \left ({\left (T − λ{I}_{ V }\right )}^{n}\right ) = {G}_{ T }\left (λ\right ) & & }

where the subset inclusion is a consequence of Theorem KPLT. Also, there is no such thing as a “generalized eigenvalue.” If λ is not an eigenvalue of T, then the kernel of T − λ{I}_{V } is trivial and therefore subsequent powers of T − λ{I}_{V } also have trivial kernels (Theorem KPLT). So the generalized eigenspace of a scalar that is not already an eigenvalue would be trivial. Alright, we know enough now to compute some generalized eigenspaces. We will record some information about algebraic and geometric multiplicities of eigenvalues (Definition AME, Definition GME) as we go, since these observations will be of interest in light of some future theorems.

Example GE4
Generalized eigenspaces, dimension 4 domain
In Example TIS we presented two invariant subspaces of {ℂ}^{4}. There was some mystery about just how these were constructed, but we can now reveal that they are generalized eigenspaces. Example TIS featured T : {ℂ}^{4} → {ℂ}^{4} defined by T\left (x\right ) = Ax with A given by

\eqalignno{ A & = \left [\array{ −8& 6 &−15& 9\cr −8 & 14 &−10 & 18 \cr 1 & 1 & 3 & 0\cr 3 &−8 & 2 &−11 } \right ] & & }

A matrix representation of T relative to the standard basis (Definition SUV) will equal A. So we can analyze A with the techniques of Chapter E. Doing so, we find two eigenvalues, λ = 1,\kern 1.95872pt − 2, with multiplicities,

\eqalignno{ {α}_{T }\left (1\right ) & = 2 &{γ}_{T }\left (1\right ) & = 1 & & & & \cr {α}_{T }\left (−2\right ) & = 2 &{γ}_{T }\left (−2\right ) & = 1 & & & & \cr & & & & }

To apply Theorem GEK we subtract each eigenvalue from the diagonal entries of A, raise the result to the power \mathop{ dim}\nolimits \left ({ℂ}^{4}\right ) = 4, and compute a basis for the null space.

\eqalignno{ λ & = −2 &{\left (A − (−2){I}_{4}\right )}^{4} & = \left [\array{ 648 &−1215& 729 &−1215\cr −324 & 486 &−486 & 486 \cr −405& 729 &−486& 729\cr 297 & −486 & 405 & −486 } \right ]\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ 1&0&3&0\cr 0&1 &1 &1 \cr 0&0&0&0\cr 0&0 &0 &0 } \right ] & & & & \cr & &{G}_{T }\left (−2\right ) & = \left \langle \left \{\left [\array{ −3\cr −1 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ 0\cr −1 \cr 0\cr 1 } \right ]\right \}\right \rangle & & & & \cr λ & = 1 &{\left (A − (1){I}_{4}\right )}^{4} & = \left [\array{ 81 &−405& −81 &−729\cr −108 &−189 &−378 &−486 \cr −27 & 135 & 27 & 243\cr 135 & 54 & 351 & 243 } \right ]\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ 1&0&{7\over 3}&1 \cr 0&1&{2\over 3}&2\cr 0&0 &0 &0 \cr 0&0&0&0 } \right ] & & & & \cr & &{G}_{T }\left (1\right ) & = \left \langle \left \{\left [\array{ −7\cr −2 \cr 3\cr 0 } \right ],\kern 1.95872pt \left [\array{ −1\cr −2 \cr 0\cr 1 } \right ]\right \}\right \rangle & & & & }

In Example TIS we concluded that these two invariant subspaces formed a direct sum of {ℂ}^{4}, only at that time, they were called X and W. Now we can write

\eqalignno{ {ℂ}^{4} = {G}_{ T }\left (1\right ) ⊕{G}_{T }\left (−2\right ) & & }

This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)
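The whole example reduces to a few lines of exact arithmetic (a sketch in Python with SymPy; the basis vectors SymPy reports may be scaled differently than those above). The extra check with the scalar 5, which is not an eigenvalue of T, illustrates the earlier remark that there is no such thing as a “generalized eigenvalue.”

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])

G1 = ((A - eye(4))**4).nullspace()      # a basis of G_T(1)
G2 = ((A + 2*eye(4))**4).nullspace()    # a basis of G_T(-2)
print([v.T for v in G1], [v.T for v in G2])

print(((A - 5*eye(4))**4).nullspace())  # []: 5 is not an eigenvalue

# The union of the two bases has full rank, confirming the direct sum.
print(Matrix.hstack(*(G1 + G2)).rank() == 4)   # True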

Example GE6
Generalized eigenspaces, dimension 6 domain
Define the linear transformation S : {ℂ}^{6} → {ℂ}^{6} by S\left (x\right ) = Bx where

\eqalignno{ B & = \left [\array{ 2 & −4 &25&−54&90&−37\cr 2 & −3 & 4 &−16 &26 & −8 \cr 2 & −3 & 4 &−15&24& −7\cr 10 &−18 & 6 &−36 &51 & −2 \cr 8 &−14& 0 &−21&28& 4\cr 5 & −7 &−6 & −7 & 8 & 7 } \right ] & & }

Then B will be the matrix representation of S relative to the standard basis (Definition SUV) and we can use the techniques of Chapter E applied to B in order to find the eigenvalues of S.

\eqalignno{ {α}_{S}\left (3\right ) & = 2 &{γ}_{S}\left (3\right ) & = 1 & & & & \cr {α}_{S}\left (−1\right ) & = 4 &{γ}_{S}\left (−1\right ) & = 2 & & & & \cr & & & & }

To find the generalized eigenspaces of S we need to subtract an eigenvalue from the diagonal elements of B, raise the result to the power \mathop{ dim}\nolimits \left ({ℂ}^{6}\right ) = 6 and compute the null space. Here are the results for the two eigenvalues of S,

\eqalignno{ λ& = 3 &{\left (B − 3{I}_{6}\right )}^{6} & = \left [\array{ 64000&−152576&−59904&26112&−95744&133632\cr 15872 & −39936 &−11776 & 8704 &−29184 & 36352 \cr 12032& −30208 & −9984 & 6400 &−20736& 26368\cr −1536 & 11264 &−23040 &17920 &−17920 & −1536 \cr −9728& 27648 & −6656 & 9728 & −1536 &−17920\cr −7936 & 17920 & 5888 & 1792 & 4352 &−14080 } \right ] &&&& \cr & &\mathop{\longrightarrow}\limits_{}^{\text{RREF}} &\left [\array{ 1&0&0&0&−4&5\cr 0&1 &0 &0 &−1 &1 \cr 0&0&1&0&−1&1\cr 0&0 &0 &1 &−2 &1 \cr 0&0&0&0& 0 &0\cr 0&0 &0 &0 & 0 &0 } \right ] &&&& \cr & &{G}_{S}\left (3\right ) & = \left \langle \left \{\left [\array{ 4\cr 1 \cr 1\cr 2 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −1 \cr −1\cr −1 \cr 0\cr 1 } \right ]\right \}\right \rangle &&&& \cr λ& = −1&{\left (B − (−1){I}_{6}\right )}^{6}& = \left [\array{ 6144 &−16384&18432&−36864&57344&−18432\cr 4096 & −8192 & 4096 &−16384 &24576 & −4096 \cr 4096 & −8192 & 4096 &−16384&24576& −4096\cr 18432 &−32768 & 6144 &−61440 &90112 & −6144 \cr 14336&−24576& 2048 &−45056&65536& −2048\cr 10240 &−16384 &−2048 &−28672 &40960 & 2048 } \right ] &&&& \cr & &\mathop{\longrightarrow}\limits_{}^{\text{RREF}} &\left [\array{ 1&0&−5&2&−4&5\cr 0&1 &−3 &3 &−5 &3 \cr 0&0& 0 &0& 0 &0\cr 0&0 & 0 &0 & 0 &0 \cr 0&0& 0 &0& 0 &0\cr 0&0 & 0 &0 & 0 &0 } \right ] &&&& \cr & &{G}_{S}\left (−1\right ) & = \left \langle \left \{\left [\array{ 5\cr 3 \cr 1\cr 0 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ −2\cr −3 \cr 0\cr 1 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ 4\cr 5 \cr 0\cr 0 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −3 \cr 0\cr 0 \cr 0\cr 1 } \right ]\right \}\right \rangle &&&& }

If we take the union of the two bases for these two invariant subspaces we obtain the set

\eqalignno{ C & = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt {v}_{4},\kern 1.95872pt {v}_{5},\kern 1.95872pt {v}_{6}\right \} & & \cr & = \left \{\left [\array{ 4\cr 1 \cr 1\cr 2 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −1 \cr −1\cr −1 \cr 0\cr 1 } \right ],\kern 1.95872pt \left [\array{ 5\cr 3 \cr 1\cr 0 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ −2\cr −3 \cr 0\cr 1 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ 4\cr 5 \cr 0\cr 0 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −3 \cr 0\cr 0 \cr 0\cr 1 } \right ]\right \} & & }

You can check that this set is linearly independent (right now we have no guarantee this will happen). Once this is verified, we have a linearly independent set of size 6 inside a vector space of dimension 6, so by Theorem G, the set C is a basis for {ℂ}^{6}. This is enough to apply Theorem DSFB and conclude that

\eqalignno{ {ℂ}^{6} = {G}_{ S}\left (3\right ) ⊕{G}_{S}\left (−1\right ) & & }

This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)
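If you would rather not row-reduce by hand, the linear independence of C can be confirmed with a single determinant (a sketch in Python with SymPy; the matrix P below has {v}_{1} through {v}_{6} as its columns):

from sympy import Matrix

P = Matrix([[4, -5, 5, -2, 4, -5],
            [1, -1, 3, -3, 5, -3],
            [1, -1, 1, 0, 0, 0],
            [2, -1, 0, 1, 0, 0],
            [1, 0, 0, 0, 1, 0],
            [0, 1, 0, 0, 0, 1]])
print(P.det() != 0)   # True, so C is linearly independent, hence a basis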

Subsection RLT: Restrictions of Linear Transformations

Generalized eigenspaces will prove to be an important type of invariant subspace. A second reason for our interest in invariant subspaces is they provide us with another method for creating new linear transformations from old ones.

Definition LTR
Linear Transformation Restriction
Suppose that T : V → V is a linear transformation, and U is an invariant subspace of V relative to T. Define the restriction of T to U by

\eqalignno{ T{|}_{U}: U → U & &T{|}_{U}\left (u\right ) & = T\left (u\right ) & & & & }

(This definition contains Notation LTR.)

It might appear that this definition has not accomplished anything, as T{|}_{U} would appear to take on exactly the same values as T. And this is true. However, T{|}_{U} differs from T in the choice of domain and codomain. We tend to give little attention to the domain and codomain of functions, while their defining rules get the spotlight. But the restriction of a linear transformation is all about the choice of domain and codomain. We are restricting the rule of the function to a smaller subspace. Notice the importance of only using this construction with an invariant subspace, since otherwise we cannot be assured that the outputs of the function are even contained in the codomain. Maybe this observation should be the key step in the proof of a theorem saying that T{|}_{U} is also a linear transformation, but we won’t bother.

Example LTRGE
Linear transformation restriction on generalized eigenspace
In order to gain some experience with restrictions of linear transformations, we construct one and then also construct a matrix representation for the restriction. Furthermore, we will use a generalized eigenspace as the invariant subspace for the construction of the restriction.

Consider the linear transformation T : {ℂ}^{5} → {ℂ}^{5} defined by T\left (x\right ) = Ax, where

\eqalignno{ A & = \left [\array{ −22&−24&−24&−24&−46\cr 3 & 2 & 6 & 0 & 11 \cr −12&−16& −6 &−14&−17\cr 6 & 8 & 4 & 10 & 8 \cr 11 & 14 & 8 & 13 & 18 } \right ] & & }

One of the eigenvalues of A is λ = 2, with geometric multiplicity {γ}_{T }\left (2\right ) = 1, and algebraic multiplicity {α}_{T }\left (2\right ) = 3. We get the generalized eigenspace in the usual manner,

\eqalignno{ W & = {G}_{T }\left (2\right ) = K\kern -1.95872pt \left ({\left (T − 2{I}_{{ℂ}^{5}}\right )}^{5}\right ) = \left \langle \left \{\left [\array{ −2\cr 1 \cr 1\cr 0 \cr 0 } \right ],\kern 1.95872pt \left [\array{ 0\cr −1 \cr 0\cr 1 \cr 0 } \right ],\kern 1.95872pt \left [\array{ −4\cr 2 \cr 0\cr 0 \cr 1 } \right ]\right \}\right \rangle = \left \langle \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {w}_{3}\right \}\right \rangle & & }

By Theorem GESIS, we know W is invariant relative to T, so we can employ Definition LTR to form the restriction, T{|}_{W }: W → W.

To better understand exactly what a restriction is (and isn’t), we’ll form a matrix representation of T{|}_{W }. This will also be a skill we will use in subsequent examples. For a basis of W we will use C = \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {w}_{3}\right \}. Notice that \mathop{ dim}\nolimits \left (W\right ) = 3, so our matrix representation will be a square matrix of size 3. Applying Definition MR, we compute

\eqalignno{ {ρ}_{C}\left (T\left ({w}_{1}\right )\right ) & = {ρ}_{C}\left (A{w}_{1}\right ) = {ρ}_{C}\left (\left [\array{ −4\cr 2 \cr 2\cr 0 \cr 0 } \right ]\right ) = {ρ}_{C}\left (2\left [\array{ −2\cr 1 \cr 1\cr 0 \cr 0 } \right ] + 0\left [\array{ 0\cr −1 \cr 0\cr 1 \cr 0 } \right ] + 0\left [\array{ −4\cr 2 \cr 0\cr 0 \cr 1 } \right ]\right ) = \left [\array{ 2\cr 0 \cr 0 } \right ] & & \cr {ρ}_{C}\left (T\left ({w}_{2}\right )\right ) & = {ρ}_{C}\left (A{w}_{2}\right ) = {ρ}_{C}\left (\left [\array{ 0\cr −2 \cr 2\cr 2 \cr −1 } \right ]\right ) = {ρ}_{C}\left (2\left [\array{ −2\cr 1 \cr 1\cr 0 \cr 0 } \right ] + 2\left [\array{ 0\cr −1 \cr 0\cr 1 \cr 0 } \right ] + (−1)\left [\array{ −4\cr 2 \cr 0\cr 0 \cr 1 } \right ]\right ) = \left [\array{ 2\cr 2 \cr −1 } \right ] & & \cr {ρ}_{C}\left (T\left ({w}_{3}\right )\right ) & = {ρ}_{C}\left (A{w}_{3}\right ) = {ρ}_{C}\left (\left [\array{ −6\cr 3 \cr −1\cr 0 \cr 2 } \right ]\right ) = {ρ}_{C}\left ((−1)\left [\array{ −2\cr 1 \cr 1\cr 0 \cr 0 } \right ] + 0\left [\array{ 0\cr −1 \cr 0\cr 1 \cr 0 } \right ] + 2\left [\array{ −4\cr 2 \cr 0\cr 0 \cr 1 } \right ]\right ) = \left [\array{ −1\cr 0 \cr 2 } \right ] & & \cr & & }

So the matrix representation of T{|}_{W } relative to C is

\eqalignno{ {M}_{C,C}^{T{|}_{W} } & = \left [\array{ 2& 2 &−1\cr 0& 2 & 0 \cr 0&−1& 2 } \right ] & & }
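These coordinatizations can be automated (a sketch in Python with SymPy; solving for each coordinate vector by row-reducing an augmented matrix is our own packaging, not the text’s notation):

from sympy import Matrix

A = Matrix([[-22, -24, -24, -24, -46],
            [3, 2, 6, 0, 11],
            [-12, -16, -6, -14, -17],
            [6, 8, 4, 10, 8],
            [11, 14, 8, 13, 18]])
w1 = Matrix([-2, 1, 1, 0, 0])
w2 = Matrix([0, -1, 0, 1, 0])
w3 = Matrix([-4, 2, 0, 0, 1])
W = Matrix.hstack(w1, w2, w3)

cols = []
for w in (w1, w2, w3):
    # Solve W*c = A*w exactly by row-reducing the augmented matrix [W | A*w].
    rref, _ = Matrix.hstack(W, A * w).rref()
    cols.append(rref[:3, 3])   # the coordinate vector rho_C(T(w))
print(Matrix.hstack(*cols))    # the 3 x 3 representation above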

The question arises: how do we use a 3 × 3 matrix to compute with vectors from {ℂ}^{5}? To answer this question, consider the randomly chosen vector

\eqalignno{ w = \left [\array{ −4\cr 4 \cr 4\cr −2 \cr −1 } \right ] & & }

First check that w ∈{G}_{T }\left (2\right ). There are two ways to do this. First, verify that

\eqalignno{ {\left (T − 2{I}_{{ℂ}^{5}}\right )}^{5}\left (w\right ) & ={ \left (A − 2{I}_{ 5}\right )}^{5}w = 0 & & }

meeting Definition GES (with k = 5). Or, express w as a linear combination of the basis C for W, to wit, w = 4{w}_{1} − 2{w}_{2} − {w}_{3}. Now compute T{|}_{W }\left (w\right ) directly using Definition LTR,

\eqalignno{ T{|}_{W }\left (w\right ) & = T\left (w\right ) = Aw = \left [\array{ −10\cr 9 \cr 5\cr −4 \cr 0 } \right ] & & }

It was necessary to verify that w ∈{G}_{T }\left (2\right ), and if we trust our work so far, then this output will also be an element of W, but it would be wise to check this anyway (using either of the methods we used for w). We’ll wait.

Now we will repeat this sample computation, but instead using the matrix representation of T{|}_{W } relative to C.

\eqalignno{ T{|}_{W }\left (w\right ) & = {ρ}_{C}^{−1}\left ({M}_{ C,C}^{T{|}_{W} }{ρ}_{C}\left (w\right )\right ) & &\text{Theorem FTMR} & & & & \cr & = {ρ}_{C}^{−1}\left ({M}_{ C,C}^{T{|}_{W} }{ρ}_{C}\left (4{w}_{1} − 2{w}_{2} − {w}_{3}\right )\right ) & & & & \cr & = {ρ}_{C}^{−1}\left (\left [\array{ 2& 2 &−1\cr 0& 2 & 0 \cr 0&−1& 2 } \right ]\left [\array{ 4\cr −2 \cr −1 } \right ]\right ) & &\text{Definition VR} & & & & \cr & = {ρ}_{C}^{−1}\left (\left [\array{ 5\cr −4 \cr 0 } \right ]\right ) & &\text{Definition MVP} & & & & \cr & = 5{w}_{1} − 4{w}_{2} + 0{w}_{3} & &\text{Definition VR} & & & & \cr & = 5\left [\array{ −2\cr 1 \cr 1\cr 0 \cr 0 } \right ] + (−4)\left [\array{ 0\cr −1 \cr 0\cr 1 \cr 0 } \right ] + 0\left [\array{ −4\cr 2 \cr 0\cr 0 \cr 1 } \right ] & & & & \cr & = \left [\array{ −10\cr 9 \cr 5\cr −4 \cr 0 } \right ] & & & & }

which matches the previous computation. Notice how the “action” of T{|}_{W } is accomplished by a 3 × 3 matrix multiplying a column vector of size 3. If you would like more practice with these sorts of computations, mimic the above using the other eigenvalue of T, which is λ = −2. The generalized eigenspace has dimension 2, so the matrix representation of the restriction to the generalized eigenspace will be a 2 × 2 matrix.
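Here is the sample computation carried out by machine (a self-contained sketch in Python with SymPy, repeating a few definitions from the previous sketch):

from sympy import Matrix

A = Matrix([[-22, -24, -24, -24, -46],
            [3, 2, 6, 0, 11],
            [-12, -16, -6, -14, -17],
            [6, 8, 4, 10, 8],
            [11, 14, 8, 13, 18]])
w1 = Matrix([-2, 1, 1, 0, 0])
w2 = Matrix([0, -1, 0, 1, 0])
w3 = Matrix([-4, 2, 0, 0, 1])
W = Matrix.hstack(w1, w2, w3)
M = Matrix([[2, 2, -1], [0, 2, 0], [0, -1, 2]])   # representation of T|_W

w = 4*w1 - 2*w2 - w3             # the vector w of this example
coord = Matrix([4, -2, -1])      # rho_C(w)
print(W * (M * coord))           # rho_C^{-1}(M * rho_C(w)) = (-10, 9, 5, -4, 0)
print(W * (M * coord) == A * w)  # True, as Theorem FTMR promises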

Suppose that T : V → V is a linear transformation and we can find a decomposition of V as a direct sum, say V = {U}_{1} ⊕ {U}_{2} ⊕ {U}_{3} ⊕\mathrel{⋯} ⊕ {U}_{m} where each {U}_{i} is an invariant subspace of V relative to T. Then, for any v ∈ V there is a unique decomposition v = {u}_{1} + {u}_{2} + {u}_{3} + \mathrel{⋯} + {u}_{m} with {u}_{i} ∈ {U}_{i}, 1 ≤ i ≤ m and furthermore

\eqalignno{ T\left (v\right ) & = T\left ({u}_{1} + {u}_{2} + {u}_{3} + \mathrel{⋯} + {u}_{m}\right ) & &\text{Definition DS} & & & & \cr & = T\left ({u}_{1}\right ) + T\left ({u}_{2}\right ) + T\left ({u}_{3}\right ) + \mathrel{⋯} + T\left ({u}_{m}\right ) & &\text{Theorem LTLC} & & & & \cr & = T{|}_{{U}_{1}}\left ({u}_{1}\right ) + T{|}_{{U}_{2}}\left ({u}_{2}\right ) + T{|}_{{U}_{3}}\left ({u}_{3}\right ) + \mathrel{⋯} + T{|}_{{U}_{m}}\left ({u}_{m}\right ) & & & & }

So in a very real sense, we obtain a decomposition of the linear transformation T into the restrictions T{|}_{{U}_{i}}, 1 ≤ i ≤ m. If we wanted to be more careful, we could extend each restriction to a linear transformation defined on V by setting the output of T{|}_{{U}_{i}} to be the zero vector for inputs outside of {U}_{i}. Then T would be exactly equal to the sum (Definition LTA) of these extended restrictions. However, the irony of extending our restrictions is more than we could handle right now.

Our real interest is in the matrix representation of a linear transformation when the domain decomposes as a direct sum of invariant subspaces. Consider forming a basis B of V as the union of bases {B}_{i} from the individual {U}_{i}, i.e. B = {∪}_{i=1}^{m}\kern 1.95872pt {B}_{ i}. Now form the matrix representation of T relative to B. The result will be block diagonal, where each block is the matrix representation of a restriction T{|}_{{U}_{i}} relative to a basis {B}_{i}, {M}_{{B}_{i},{B}_{i}}^{T{|}_{{U}_{i}} }. Though we did not have the definitions to describe it then, this is exactly what was going on in the latter portion of the proof of Theorem CFNLT. Two examples should help to clarify these ideas.

Example ISMR4
Invariant subspaces, matrix representation, dimension 4 domain
Example TIS and Example GE4 describe a basis of {ℂ}^{4} which is derived from bases for two invariant subspaces (both generalized eigenspaces). In this example we will construct a matrix representation of the linear transformation T relative to this basis. Recycling the notation from Example TIS, we work with the basis,

\eqalignno{ B & = \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {x}_{1},\kern 1.95872pt {x}_{2}\right \} = \left \{\left [\array{ −7\cr −2 \cr 3\cr 0 } \right ],\kern 1.95872pt \left [\array{ −1\cr −2 \cr 0\cr 1 } \right ],\kern 1.95872pt \left [\array{ −3\cr −1 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ 0\cr −1 \cr 0\cr 1 } \right ]\right \} & & }

Now we compute the matrix representation of T relative to B, borrowing some computations from Example TIS,

\eqalignno{ {ρ}_{B}\left (T\left ({w}_{1}\right )\right ) & = {ρ}_{B}\left (\left [\array{ −1\cr −2 \cr 0\cr 1 } \right ]\right ) = {ρ}_{B}\left ((0){w}_{1} + (1){w}_{2}\right ) = \left [\array{ 0\cr 1 \cr 0\cr 0 } \right ] & & \cr {ρ}_{B}\left (T\left ({w}_{2}\right )\right ) & = {ρ}_{B}\left (\left [\array{ 5\cr −2 \cr −3\cr 2 } \right ]\right ) = {ρ}_{B}\left ((−1){w}_{1} + (2){w}_{2}\right ) = \left [\array{ −1\cr 2 \cr 0\cr 0 } \right ] & & \cr {ρ}_{B}\left (T\left ({x}_{1}\right )\right ) & = {ρ}_{B}\left (\left [\array{ 3\cr 0 \cr −1\cr 1 } \right ]\right ) = {ρ}_{B}\left ((−1){x}_{1} + (1){x}_{2}\right ) = \left [\array{ 0\cr 0 \cr −1\cr 1 } \right ] & & \cr {ρ}_{B}\left (T\left ({x}_{2}\right )\right ) & = {ρ}_{B}\left (\left [\array{ 3\cr 4 \cr −1\cr −3 } \right ]\right ) = {ρ}_{B}\left ((−1){x}_{1} + (−3){x}_{2}\right ) = \left [\array{ 0\cr 0 \cr −1\cr −3 } \right ] & & }

Applying Definition MR, we have

\eqalignno{ {M}_{B,B}^{T } & = \left [\array{ 0&−1& 0 & 0\cr 1& 2 & 0 & 0 \cr 0& 0 &−1&−1\cr 0& 0 & 1 &−3 } \right ] & & }

The interesting feature of this representation is the two 2 × 2 blocks on the diagonal that arise from the decomposition of {ℂ}^{4} into a direct sum (of generalized eigenspaces). Or maybe the interesting feature of this matrix is the two 2 × 2 submatrices in the “other” corners that are all zero. You decide.

Example ISMR6
Invariant subspaces, matrix representation, dimension 6 domain
In Example GE6 we computed the generalized eigenspaces of the linear transformation S : {ℂ}^{6} → {ℂ}^{6} defined by S\left (x\right ) = Bx where

\eqalignno{ B & = \left [\array{ 2 & −4 &25&−54&90&−37\cr 2 & −3 & 4 &−16 &26 & −8 \cr 2 & −3 & 4 &−15&24& −7\cr 10 &−18 & 6 &−36 &51 & −2 \cr 8 &−14& 0 &−21&28& 4\cr 5 & −7 &−6 & −7 & 8 & 7 } \right ] & & }

From this we found the basis

\eqalignno{ C & = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt {v}_{4},\kern 1.95872pt {v}_{5},\kern 1.95872pt {v}_{6}\right \} & & \cr & = \left \{\left [\array{ 4\cr 1 \cr 1\cr 2 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −1 \cr −1\cr −1 \cr 0\cr 1 } \right ],\kern 1.95872pt \left [\array{ 5\cr 3 \cr 1\cr 0 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ −2\cr −3 \cr 0\cr 1 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ 4\cr 5 \cr 0\cr 0 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −3 \cr 0\cr 0 \cr 0\cr 1 } \right ]\right \} & & }

of {ℂ}^{6} where \left \{{v}_{1},\kern 1.95872pt {v}_{2}\right \} is a basis of {G}_{S}\left (3\right ) and \left \{{v}_{3},\kern 1.95872pt {v}_{4},\kern 1.95872pt {v}_{5},\kern 1.95872pt {v}_{6}\right \} is a basis of {G}_{S}\left (−1\right ). We can employ C in the construction of a matrix representation of S (Definition MR). Here are the computations,

\eqalignno{ {ρ}_{C}\left (S\left ({v}_{1}\right )\right ) & = {ρ}_{C}\left (\left [\array{ 11\cr 3 \cr 3\cr 7 \cr 4\cr 1 } \right ]\right ) = {ρ}_{C}\left (4{v}_{1} + 1{v}_{2}\right ) = \left [\array{ 4\cr 1 \cr 0\cr 0 \cr 0\cr 0 } \right ] & & \cr {ρ}_{C}\left (S\left ({v}_{2}\right )\right ) & = {ρ}_{C}\left (\left [\array{ −14\cr −3 \cr −3\cr −4 \cr −1\cr 2 } \right ]\right ) = {ρ}_{C}\left ((−1){v}_{1} + 2{v}_{2}\right ) = \left [\array{ −1\cr 2 \cr 0\cr 0 \cr 0\cr 0 } \right ] & & \cr {ρ}_{C}\left (S\left ({v}_{3}\right )\right ) & = {ρ}_{C}\left (\left [\array{ 23\cr 5 \cr 5\cr 2 \cr −2\cr −2 } \right ]\right ) = {ρ}_{C}\left (5{v}_{3} + 2{v}_{4} + (−2){v}_{5} + (−2){v}_{6}\right ) = \left [\array{ 0\cr 0 \cr 5\cr 2 \cr −2\cr −2 } \right ] & & \cr {ρ}_{C}\left (S\left ({v}_{4}\right )\right ) & = {ρ}_{C}\left (\left [\array{ −46\cr −11 \cr −10\cr −2 \cr 5\cr 4 } \right ]\right ) = {ρ}_{C}\left ((−10){v}_{3} + (−2){v}_{4} + 5{v}_{5} + 4{v}_{6}\right ) = \left [\array{ 0\cr 0 \cr −10\cr −2 \cr 5\cr 4 } \right ] & & \cr {ρ}_{C}\left (S\left ({v}_{5}\right )\right ) & = {ρ}_{C}\left (\left [\array{ 78\cr 19 \cr 17\cr 1 \cr −10\cr −7 } \right ]\right ) = {ρ}_{C}\left (17{v}_{3} + 1{v}_{4} + (−10){v}_{5} + (−7){v}_{6}\right ) = \left [\array{ 0\cr 0 \cr 17\cr 1 \cr −10\cr −7 } \right ] & & \cr {ρ}_{C}\left (S\left ({v}_{6}\right )\right ) & = {ρ}_{C}\left (\left [\array{ −35\cr −9 \cr −8\cr 2 \cr 6\cr 3 } \right ]\right ) = {ρ}_{C}\left ((−8){v}_{3} + 2{v}_{4} + 6{v}_{5} + 3{v}_{6}\right ) = \left [\array{ 0\cr 0 \cr −8\cr 2 \cr 6\cr 3 } \right ] & & }

These column vectors are the columns of the matrix representation, so we obtain

\eqalignno{ {M}_{C,C}^{S} & = \left [\array{ 4&−1& 0 & 0 & 0 & 0\cr 1& 2 & 0 & 0 & 0 & 0 \cr 0& 0 & 5 &−10& 17 &−8\cr 0& 0 & 2 & −2 & 1 & 2 \cr 0& 0 &−2& 5 &−10& 6\cr 0& 0 &−2 & 4 & −7 & 3 } \right ] & & }

As before, the key feature of this representation is the 2 × 2 and 4 × 4 blocks on the diagonal. We will discover in the final theorem of this section (Theorem RGEN) that we already understand these blocks fairly well. For now, we recognize them as arising from generalized eigenspaces and suspect that their sizes are equal to the algebraic multiplicities of the eigenvalues.
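The entire representation is a single similarity transformation (a sketch in Python with SymPy; packaging the change of basis as the matrix P whose columns are the vectors of C is our own shorthand):

from sympy import Matrix

B = Matrix([[2, -4, 25, -54, 90, -37],
            [2, -3, 4, -16, 26, -8],
            [2, -3, 4, -15, 24, -7],
            [10, -18, 6, -36, 51, -2],
            [8, -14, 0, -21, 28, 4],
            [5, -7, -6, -7, 8, 7]])
P = Matrix([[4, -5, 5, -2, 4, -5],
            [1, -1, 3, -3, 5, -3],
            [1, -1, 1, 0, 0, 0],
            [2, -1, 0, 1, 0, 0],
            [1, 0, 0, 0, 1, 0],
            [0, 1, 0, 0, 0, 1]])
print(P.inv() * B * P)   # block diagonal: a 2 x 2 block and a 4 x 4 block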

The paragraph prior to these last two examples is worth repeating. A basis derived from a direct sum decomposition into invariant subspaces will provide a matrix representation of a linear transformation with a block diagonal form.

Diagonalizing a linear transformation is the most extreme example of decomposing a vector space into invariant subspaces. When a linear transformation is diagonalizable, then there is a basis composed of eigenvectors (Theorem DC). Each of these basis vectors can be used individually as the lone element of a spanning set for an invariant subspace (Theorem EIS). So the domain decomposes into a direct sum of one-dimensional invariant subspaces (Theorem DSFB). The corresponding matrix representation is then block diagonal with all the blocks of size 1, i.e. the matrix is diagonal. Section NLT, Section IS and Section JCF are all devoted to generalizing this extreme situation when there are not enough eigenvectors available to make such a complete decomposition and arrive at such an elegant matrix representation.

One last theorem will roll up much of this section and Section NLT into one nice, neat package.

Theorem RGEN
Restriction to Generalized Eigenspace is Nilpotent
Suppose T : V → V is a linear transformation with eigenvalue λ. Then the linear transformation T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )} is nilpotent.

Proof   Notice first that every subspace of V is invariant with respect to {I}_{V }, so {I}_{{G}_{T}\left (λ\right )} = {I}_{V }{|}_{{G}_{T}\left (λ\right )}. Let n =\mathop{ dim}\nolimits \left (V \right ) and choose v ∈{G}_{T }\left (λ\right ). Then

\eqalignno{ {\left (T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )}\right )}^{n}\left (v\right ) & ={ \left (T − λ{I}_{ V }\right )}^{n}\left (v\right ) & &\text{Definition LTR} & & & & \cr & = 0 & &\text{Theorem GEK} & & & & }

So by Definition NLT, T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )} is nilpotent.

The proof of Theorem RGEN indicates that the index of the nilpotent linear transformation is less than or equal to the dimension of V . In practice, it will be less than or equal to the dimension of the domain of the restriction, {G}_{T }\left (λ\right ). In any event, the exact value of this index will be of some interest, so we define it now. Notice that this is a property of the eigenvalue λ, similar to the algebraic and geometric multiplicities (Definition AME, Definition GME).

Definition IE
Index of an Eigenvalue
Suppose T : V → V is a linear transformation with eigenvalue λ. Then the index of λ, {ι}_{T }\left (λ\right ), is the index of the nilpotent linear transformation T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )}.

(This definition contains Notation IE.)

Example GENR6
Generalized eigenspaces and nilpotent restrictions, dimension 6 domain
In Example GE6 we computed the generalized eigenspaces of the linear transformation S : {ℂ}^{6} → {ℂ}^{6} defined by S\left (x\right ) = Bx where

\eqalignno{ B & = \left [\array{ 2 & −4 &25&−54&90&−37\cr 2 & −3 & 4 &−16 &26 & −8 \cr 2 & −3 & 4 &−15&24& −7\cr 10 &−18 & 6 &−36 &51 & −2 \cr 8 &−14& 0 &−21&28& 4\cr 5 & −7 &−6 & −7 & 8 & 7 } \right ] & & }

The generalized eigenspace, {G}_{S}\left (3\right ), has dimension 2, while {G}_{S}\left (−1\right ) has dimension 4. We’ll investigate each thoroughly in turn, with the intent being to illustrate Theorem RGEN. Many of our computations will repeat those done in Example ISMR6.

For U = {G}_{S}\left (3\right ) we compute a matrix representation of S{|}_{U} using the basis found in Example GE6,

\eqalignno{ B & = \left \{{u}_{1},\kern 1.95872pt {u}_{2}\right \} = \left \{\left [\array{ 4\cr 1 \cr 1\cr 2 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −1 \cr −1\cr −1 \cr 0\cr 1 } \right ]\right \} & & }

Since B has size 2, we obtain a 2 × 2 matrix representation (Definition MR) from

\eqalignno{ {ρ}_{B}\left (S{|}_{U}\left ({u}_{1}\right )\right ) & = {ρ}_{B}\left (\left [\array{ 11\cr 3 \cr 3\cr 7 \cr 4\cr 1 } \right ]\right ) = {ρ}_{B}\left (4{u}_{1} + {u}_{2}\right ) = \left [\array{ 4\cr 1 } \right ] & & \cr {ρ}_{B}\left (S{|}_{U}\left ({u}_{2}\right )\right ) & = {ρ}_{B}\left (\left [\array{ −14\cr −3 \cr −3\cr −4 \cr −1\cr 2 } \right ]\right ) = {ρ}_{B}\left ((−1){u}_{1} + 2{u}_{2}\right ) = \left [\array{ −1\cr 2 } \right ] & & }

Thus

\eqalignno{ M & = {M}_{B,B}^{S{|}_{U} } = \left [\array{ 4&−1\cr 1& 2 } \right ] & & }

Now we can illustrate Theorem RGEN with powers of the matrix representation (rather than the restriction itself),

\eqalignno{ M − 3{I}_{2} & = \left [\array{ 1&−1\cr 1&−1 } \right ] &{\left (M − 3{I}_{2}\right )}^{2} & = \left [\array{ 0&0 \cr 0&0 } \right ] & & & & }

So M − 3{I}_{2} is a nilpotent matrix of index 2 (meaning that S{|}_{U} − 3{I}_{U} is a nilpotent linear transformation of index 2) and according to Definition IE we say {ι}_{S}\left (3\right ) = 2.

For W = {G}_{S}\left (−1\right ) we compute a matrix representation of S{|}_{W } using the basis found in Example GE6,

\eqalignno{ C & = \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {w}_{3},\kern 1.95872pt {w}_{4}\right \} = \left \{\left [\array{ 5\cr 3 \cr 1\cr 0 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ −2\cr −3 \cr 0\cr 1 \cr 0\cr 0 } \right ],\kern 1.95872pt \left [\array{ 4\cr 5 \cr 0\cr 0 \cr 1\cr 0 } \right ],\kern 1.95872pt \left [\array{ −5\cr −3 \cr 0\cr 0 \cr 0\cr 1 } \right ]\right \} & & }

Since C has size 4, we obtain a 4 × 4 matrix representation (Definition MR) from

\eqalignno{ {ρ}_{C}\left (S{|}_{W }\left ({w}_{1}\right )\right ) & = {ρ}_{C}\left (\left [\array{ 23\cr 5 \cr 5\cr 2 \cr −2\cr −2 } \right ]\right ) = {ρ}_{C}\left (5{w}_{1} + 2{w}_{2} + (−2){w}_{3} + (−2){w}_{4}\right ) = \left [\array{ 5\cr 2 \cr −2\cr −2 } \right ] & & \cr {ρ}_{C}\left (S{|}_{W }\left ({w}_{2}\right )\right ) & = {ρ}_{C}\left (\left [\array{ −46\cr −11 \cr −10\cr −2 \cr 5\cr 4 } \right ]\right ) = {ρ}_{C}\left ((−10){w}_{1} + (−2){w}_{2} + 5{w}_{3} + 4{w}_{4}\right ) = \left [\array{ −10\cr −2 \cr 5\cr 4 } \right ] & & \cr {ρ}_{C}\left (S{|}_{W }\left ({w}_{3}\right )\right ) & = {ρ}_{C}\left (\left [\array{ 78\cr 19 \cr 17\cr 1 \cr −10\cr −7 } \right ]\right ) = {ρ}_{C}\left (17{w}_{1} + {w}_{2} + (−10){w}_{3} + (−7){w}_{4}\right ) = \left [\array{ 17\cr 1 \cr −10\cr −7 } \right ] & & \cr {ρ}_{C}\left (S{|}_{W }\left ({w}_{4}\right )\right ) & = {ρ}_{C}\left (\left [\array{ −35\cr −9 \cr −8\cr 2 \cr 6\cr 3 } \right ]\right ) = {ρ}_{C}\left ((−8){w}_{1} + 2{w}_{2} + 6{w}_{3} + 3{w}_{4}\right ) = \left [\array{ −8\cr 2 \cr 6\cr 3 } \right ] & & }

Thus

\eqalignno{ N & = {M}_{C,C}^{S{|}_{W} } = \left [\array{ 5 &−10& 17 &−8\cr 2 & −2 & 1 & 2 \cr −2& 5 &−10& 6\cr −2 & 4 & −7 & 3 } \right ] & & }

Now we can illustrate Theorem RGEN with powers of the matrix representation (rather than the restriction itself),

\eqalignno{ N − (−1){I}_{4} & = \left [\array{ 6 &−10&17&−8\cr 2 & −1 & 1 & 2 \cr −2& 5 &−9& 6\cr −2 & 4 &−7 & 4 } \right ] & & \cr {\left (N − (−1){I}_{4}\right )}^{2} & = \left [\array{ −2& 3 &−5& 2\cr 4 &−6 & 10 &−4 \cr 4 &−6&10&−4\cr 2 &−3 & 5 &−2 } \right ] & & \cr {\left (N − (−1){I}_{4}\right )}^{3} & = \left [\array{ 0&0&0&0\cr 0&0 &0 &0 \cr 0&0&0&0\cr 0&0 &0 &0 } \right ] & & }

So N − (−1){I}_{4} is a nilpotent matrix of index 3 (meaning that S{|}_{W } − (−1){I}_{W } is a nilpotent linear transformation of index 3) and according to Definition IE we say {ι}_{S}\left (−1\right ) = 3.
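The index computations can be automated (a sketch in Python with NumPy; the helper nilpotency_index is our own and assumes its input really is nilpotent, so the loop terminates):

import numpy as np

def nilpotency_index(N):
    # Return the least k with N^k = 0, assuming N is nilpotent.
    P, k = np.eye(N.shape[0]), 0
    while not np.allclose(P, 0):
        P, k = P @ N, k + 1
    return k

M = np.array([[4, -1], [1, 2]])
N = np.array([[5, -10, 17, -8],
              [2, -2, 1, 2],
              [-2, 5, -10, 6],
              [-2, 4, -7, 3]])
print(nilpotency_index(M - 3 * np.eye(2)))   # 2, so iota_S(3) = 2
print(nilpotency_index(N + 1 * np.eye(4)))   # 3, so iota_S(-1) = 3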

Notice that if we were to take the union of the two bases of the generalized eigenspaces, we would have a basis for {ℂ}^{6}. Then a matrix representation of S relative to this basis would be the same block diagonal matrix we found in Example ISMR6, only we now understand each of these blocks as being very close to being a nilpotent matrix.

Invariant subspaces, and restrictions of linear transformations, are topics you will see again and again if you continue with further study of linear algebra. Our reason for discussing them now is to arrive at a nice matrix representation of the restriction of a linear transformation to one of its generalized eigenspaces. Here’s the theorem.

Theorem MRRGE
Matrix Representation of a Restriction to a Generalized Eigenspace
Suppose that T : V → V is a linear transformation with eigenvalue λ. Then there is a basis of the generalized eigenspace {G}_{T }\left (λ\right ) such that the restriction T{|}_{{G}_{T}\left (λ\right )}: {G}_{T }\left (λ\right ) →{G}_{T }\left (λ\right ) has a matrix representation that is block diagonal where each block is a Jordan block of the form {J}_{n}\left (λ\right ).

Proof   Theorem RGEN tells us that T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )} is a nilpotent linear transformation. Theorem CFNLT tells us that a nilpotent linear transformation has a basis for its domain that yields a matrix representation that is block diagonal where the blocks are Jordan blocks of the form {J}_{n}\left (0\right ). Let B be a basis of {G}_{T }\left (λ\right ) that yields such a matrix representation for T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )}.

By Definition LTA, we can write

\eqalignno{ T{|}_{{G}_{T}\left (λ\right )} & = \left (T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )}\right ) + λ{I}_{{G}_{T}\left (λ\right )} & & }

The matrix representation of λ{I}_{{G}_{T}\left (λ\right )} relative to the basis B is then simply the diagonal matrix λ{I}_{m}, where m =\mathop{ dim}\nolimits \left ({G}_{T }\left (λ\right )\right ). By Theorem MRSLT we have the rather unwieldy expression,

\eqalignno{ {M}_{B,B}^{T{|}_{{G}_{T}\left (λ\right )} } & = {M}_{B,B}^{\left (T{|}_{{G}_{T}\left (λ\right )}−λ{I}_{{G}_{T}\left (λ\right )}\right )+λ{I}_{{G}_{T}\left (λ\right )} } & & \cr & = {M}_{B,B}^{T{|}_{{G}_{T}\left (λ\right )}−λ{I}_{{G}_{T}\left (λ\right )} } + {M}_{B,B}^{λ{I}_{{G}_{T}\left (λ\right )} } & & }

The first of these matrix representations has Jordan blocks with zero in every diagonal entry, while the second matrix representation has λ in every diagonal entry. The result of adding the two representations is to convert the Jordan blocks from the form {J}_{n}\left (0\right ) to the form {J}_{n}\left (λ\right ).

Of course, Theorem CFNLT provides some extra information on the sizes of the Jordan blocks in a representation and we could carry over this information to Theorem MRRGE, but will save that for a subsequent application of this result.
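As a preview of how this plays out, a computer algebra system will exhibit such a representation for the restriction block N of Example GENR6 (a sketch in Python using SymPy’s jordan_form, which is not the construction used in the proof above):

from sympy import Matrix

N = Matrix([[5, -10, 17, -8],
            [2, -2, 1, 2],
            [-2, 5, -10, 6],
            [-2, 4, -7, 3]])
Pmat, J = N.jordan_form()   # N = Pmat * J * Pmat**(-1)
print(J)   # two Jordan blocks with eigenvalue -1, of sizes 3 and 1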

Subsection EXC: Exercises

T10 Suppose that T : V → V is a linear transformation, and p(x) is a polynomial. Then define the new linear transformation p(T): V → V by interpreting the coefficients of the terms of the polynomial as scalar multiples of linear transformations (Definition LTSM), addition of terms as the sum of linear transformations (Definition LTA), and powers as repeated composition of linear transformations (Definition LTC). Prove that T ∘ p(T) = p(T) ∘ T.

Use this observation to give a shorter argument for the proof of the invariance of the generalized eigenspace in Theorem GESIS.  
Contributed by Robert Beezer