Section IS  Invariant Subspaces

From A First Course in Linear Algebra
Version 1.33
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/

Draft: This Section Complete, But Subject To Change

We have seen in Section NLT that nilpotent linear transformations are almost never diagonalizable (Theorem DNLT), yet have matrix representations that are very nearly diagonal (Theorem CFNLT). Our goal in this section, and the next (Section JCF), is to obtain a matrix representation of any linear transformation that is very nearly diagonal. A key step in reaching this goal is an understanding of invariant subspaces, and a particular type of invariant subspace that contains vectors known as “generalized eigenvectors.”

Subsection IS: Invariant Subspaces

As is often the case, we start with a definition.

Definition IS
Invariant Subspace
Suppose that $T\colon V\to V$ is a linear transformation and $W$ is a subspace of $V$. Suppose further that $T(\mathbf{w})\in W$ for every $\mathbf{w}\in W$. Then $W$ is an invariant subspace of $V$ relative to $T$.

We do not have any special notation for an invariant subspace, so it is important to recognize that an invariant subspace is always relative to both a superspace ($V$) and a linear transformation ($T$), which will sometimes not be mentioned, yet will be clear from the context. Note also that the linear transformation involved must have an equal domain and codomain; the definition would not make much sense if our outputs were not of the same type as our inputs.

As usual, we begin with an example that demonstrates the existence of invariant subspaces. We will return later to understand how this example was constructed, but for now, just understand how we check the existence of the invariant subspaces.

Example TIS
Two invariant subspaces
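Consider the linear transformation $T\colon\mathbb{C}^4\to\mathbb{C}^4$ defined by $T(\mathbf{x})=A\mathbf{x}$ where $A$ is given by

\[
A=\begin{bmatrix} -8 & 6 & -15 & 9\\ -8 & 14 & -10 & 18\\ 1 & 1 & 3 & 0\\ 3 & -8 & 2 & -11 \end{bmatrix}
\]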

Define (with zero motivation),

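\[
\mathbf{w}_1=\begin{bmatrix}-7\\-2\\3\\0\end{bmatrix}\qquad \mathbf{w}_2=\begin{bmatrix}-1\\-2\\0\\1\end{bmatrix}
\]

and set $W=\langle\{\mathbf{w}_1,\mathbf{w}_2\}\rangle$. We verify that $W$ is an invariant subspace of $\mathbb{C}^4$ with respect to $T$. By the definition of $W$, any vector chosen from $W$ can be written as a linear combination of $\mathbf{w}_1$ and $\mathbf{w}_2$. Suppose that $\mathbf{w}\in W$, and then check the details of the following verification,

\begin{align*}
T(\mathbf{w}) &= T(a_1\mathbf{w}_1+a_2\mathbf{w}_2) && \text{Definition SS}\\
&= a_1T(\mathbf{w}_1)+a_2T(\mathbf{w}_2) && \text{Theorem LTLC}\\
&= a_1\begin{bmatrix}-1\\-2\\0\\1\end{bmatrix}+a_2\begin{bmatrix}5\\-2\\-3\\2\end{bmatrix}\\
&= a_1\mathbf{w}_2+a_2\left((-1)\mathbf{w}_1+2\mathbf{w}_2\right)\\
&= (-a_2)\mathbf{w}_1+(a_1+2a_2)\mathbf{w}_2\\
&\in W && \text{Definition SS}
\end{align*}

So, by Definition IS, $W$ is an invariant subspace of $\mathbb{C}^4$ relative to $T$. In an entirely similar manner we construct another invariant subspace of $T$.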

With zero motivation, define

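\[
\mathbf{x}_1=\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}\qquad \mathbf{x}_2=\begin{bmatrix}0\\-1\\0\\1\end{bmatrix}
\]

and set $X=\langle\{\mathbf{x}_1,\mathbf{x}_2\}\rangle$. We verify that $X$ is an invariant subspace of $\mathbb{C}^4$ with respect to $T$. By the definition of $X$, any vector chosen from $X$ can be written as a linear combination of $\mathbf{x}_1$ and $\mathbf{x}_2$. Suppose that $\mathbf{x}\in X$, and then check the details of the following verification,

\begin{align*}
T(\mathbf{x}) &= T(b_1\mathbf{x}_1+b_2\mathbf{x}_2) && \text{Definition SS}\\
&= b_1T(\mathbf{x}_1)+b_2T(\mathbf{x}_2) && \text{Theorem LTLC}\\
&= b_1\begin{bmatrix}3\\0\\-1\\1\end{bmatrix}+b_2\begin{bmatrix}3\\4\\-1\\-3\end{bmatrix}\\
&= b_1\left((-1)\mathbf{x}_1+\mathbf{x}_2\right)+b_2\left((-1)\mathbf{x}_1+(-3)\mathbf{x}_2\right)\\
&= (-b_1-b_2)\mathbf{x}_1+(b_1-3b_2)\mathbf{x}_2\\
&\in X && \text{Definition SS}
\end{align*}

So, by Definition IS, $X$ is an invariant subspace of $\mathbb{C}^4$ relative to $T$.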

There is a bit of magic in each of these verifications where the two outputs of T happen to equal linear combinations of the two inputs. But this is the essential nature of an invariant subspace. We’ll have a peek under the hood later, and it won’t look so magical after all.
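The four key facts behind the two verifications in Example TIS can be checked mechanically. The following is a minimal sketch in plain Python; the helper names `apply` and `combo` are ours, introduced only for illustration, and the entries of $A$, $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{x}_1$, $\mathbf{x}_2$ are typed in with the signs assumed for this example.

```python
# Matrix and spanning vectors of Example TIS (entries as assumed in this sketch).
A = [[-8, 6, -15, 9],
     [-8, 14, -10, 18],
     [1, 1, 3, 0],
     [3, -8, 2, -11]]
w1, w2 = [-7, -2, 3, 0], [-1, -2, 0, 1]
x1, x2 = [-3, -1, 1, 0], [0, -1, 0, 1]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def combo(a, u, b, v):
    """Linear combination a*u + b*v."""
    return [a * p + b * q for p, q in zip(u, v)]

# Each image of a spanning vector is a combination of the same two spanning vectors.
checks = {
    "T(w1) = w2": apply(A, w1) == w2,
    "T(w2) = -w1 + 2 w2": apply(A, w2) == combo(-1, w1, 2, w2),
    "T(x1) = -x1 + x2": apply(A, x1) == combo(-1, x1, 1, x2),
    "T(x2) = -x1 - 3 x2": apply(A, x2) == combo(-1, x1, -3, x2),
}
for claim, ok in checks.items():
    print(claim, ok)
```

Because $T$ is linear, checking the images of the spanning vectors is enough: every other vector of $W$ or $X$ is a linear combination of them.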

As a hint of things to come, verify that $B=\{\mathbf{w}_1,\mathbf{w}_2,\mathbf{x}_1,\mathbf{x}_2\}$ is a basis of $\mathbb{C}^4$. Splitting this basis in half, Theorem DSFB tells us that $\mathbb{C}^4=W\oplus X$. To see why a decomposition of a vector space into a direct sum of invariant subspaces might be interesting, construct the matrix representation of $T$ relative to $B$, $M^{T}_{B,B}$. Hmmmmmm.

Example TIS is a bit mysterious at this stage. Do we know any other examples of invariant subspaces? Yes, as it turns out, we have already seen quite a few. We’ll give some examples now, and in more general situations, describe broad classes of invariant subspaces with theorems. First up is eigenspaces.

Theorem EIS
Eigenspaces are Invariant Subspaces
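Suppose that $T\colon V\to V$ is a linear transformation with eigenvalue $\lambda$ and associated eigenspace $\mathcal{E}_T(\lambda)$. Let $W$ be any subspace of $\mathcal{E}_T(\lambda)$. Then $W$ is an invariant subspace of $V$ relative to $T$.

Proof   Choose $\mathbf{w}\in W$. Then

\begin{align*}
T(\mathbf{w}) &= \lambda\mathbf{w} && \text{Definition EELT}\\
&\in W && \text{Property SC}
\end{align*}

So by Definition IS, $W$ is an invariant subspace of $V$ relative to $T$.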

Theorem EIS is general enough to determine that an entire eigenspace is an invariant subspace, or that simply the span of a single eigenvector is an invariant subspace. It is not always the case that any subspace of an invariant subspace is again an invariant subspace, but eigenspaces do have this property. Here is an example of the theorem, which also allows us to very quickly build several invariant subspaces.

Example EIS
Eigenspaces as invariant subspaces
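Define the linear transformation $S\colon M_{22}\to M_{22}$ by

\[
S\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=
\begin{bmatrix}
-2a+19b-33c+21d & -3a+16b-24c+15d\\
-2a+9b-13c+9d & -a+4b-6c+5d
\end{bmatrix}
\]

Build a matrix representation of $S$ relative to the standard basis (Definition MR, Example BM) and compute eigenvalues and eigenspaces of $S$ with the computational techniques of Chapter E in concert with Theorem EER. Then

\[
\mathcal{E}_S(1)=\left\langle\left\{\begin{bmatrix}4&3\\2&1\end{bmatrix}\right\}\right\rangle
\qquad
\mathcal{E}_S(2)=\left\langle\left\{\begin{bmatrix}6&3\\1&0\end{bmatrix},\ \begin{bmatrix}-9&-3\\0&1\end{bmatrix}\right\}\right\rangle
\]

So by Theorem EIS, both $\mathcal{E}_S(1)$ and $\mathcal{E}_S(2)$ are invariant subspaces of $M_{22}$ relative to $S$. However, Theorem EIS provides even more invariant subspaces. Since $\mathcal{E}_S(1)$ has dimension 1, it has no interesting subspaces; however, $\mathcal{E}_S(2)$ has dimension 2 and has a plethora of subspaces. For example, set

\[
\mathbf{u}=2\begin{bmatrix}6&3\\1&0\end{bmatrix}+3\begin{bmatrix}-9&-3\\0&1\end{bmatrix}=\begin{bmatrix}-15&-3\\2&3\end{bmatrix}
\]

and define $U=\langle\{\mathbf{u}\}\rangle$. Then since $U$ is a subspace of $\mathcal{E}_S(2)$, Theorem EIS says that $U$ is an invariant subspace of $M_{22}$ (or we could check this claim directly based simply on the fact that $\mathbf{u}$ is an eigenvector of $S$).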
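The direct check mentioned at the end of Example EIS takes only a few lines. This is a sketch in plain Python that stores a $2\times 2$ matrix as a nested list; the function `S` transcribes the defining rule with the signs assumed here, and `scale` and `add` are illustrative helpers of ours.

```python
def S(m):
    """The linear transformation of Example EIS on a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return [[-2*a + 19*b - 33*c + 21*d, -3*a + 16*b - 24*c + 15*d],
            [-2*a + 9*b - 13*c + 9*d, -a + 4*b - 6*c + 5*d]]

def scale(k, m):
    """Scalar multiple k*m."""
    return [[k * e for e in row] for row in m]

def add(m, n):
    """Entrywise sum m + n."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(m, n)]

e1 = [[4, 3], [2, 1]]                  # spans the eigenspace for eigenvalue 1
e2a, e2b = [[6, 3], [1, 0]], [[-9, -3], [0, 1]]   # span the eigenspace for eigenvalue 2

u = add(scale(2, e2a), scale(3, e2b))  # a vector inside the second eigenspace
print(u)
print(S(u) == scale(2, u))             # u is again an eigenvector for eigenvalue 2
```

Since $S(\mathbf{u})=2\mathbf{u}$, every scalar multiple of $\mathbf{u}$ is sent to a scalar multiple of $\mathbf{u}$, which is exactly the invariance of $U$.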

For every linear transformation there are some obvious, trivial invariant subspaces. Suppose that $T\colon V\to V$ is a linear transformation. Then simply because $T$ is a function (Definition LT), the subspace $V$ is an invariant subspace of $T$. In only a minor twist on this theme, the range of $T$, $\mathcal{R}(T)$, is an invariant subspace of $T$ by Definition RLT. Finally, Theorem LTTZZ provides the justification for claiming that $\{\mathbf{0}\}$ is an invariant subspace of $T$.

That the trivial subspace is always an invariant subspace is a special case of the next theorem. As an easy exercise before reading the next theorem, prove that the kernel of a linear transformation (Definition KLT), $\mathcal{K}(T)$, is an invariant subspace. We’ll wait.

Theorem KPIS
Kernels of Powers are Invariant Subspaces
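Suppose that $T\colon V\to V$ is a linear transformation and $k$ is a non-negative integer. Then $\mathcal{K}(T^k)$ is an invariant subspace of $V$.

Proof   Suppose that $\mathbf{z}\in\mathcal{K}(T^k)$. Then

\begin{align*}
T^k(T(\mathbf{z})) &= T^{k+1}(\mathbf{z}) && \text{Definition LTC}\\
&= T(T^k(\mathbf{z})) && \text{Definition LTC}\\
&= T(\mathbf{0}) && \text{Definition KLT}\\
&= \mathbf{0} && \text{Theorem LTTZZ}
\end{align*}

So by Definition KLT, we see that $T(\mathbf{z})\in\mathcal{K}(T^k)$. Thus $\mathcal{K}(T^k)$ is an invariant subspace of $V$ relative to $T$ (Definition IS).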

Two interesting special cases of Theorem KPIS occur when we choose $k=0$ and $k=1$. Rather than give an example of this theorem, we will refer you back to Example KPNLT where we work with null spaces of the first four powers of a nilpotent matrix. By Theorem KPIS each of these null spaces is an invariant subspace of the associated linear transformation.

Here’s one more example of invariant subspaces we have encountered previously.

Example ISJB
Invariant subspaces and Jordan blocks
Refer back to Example CFNLT. We decomposed the vector space $\mathbb{C}^6$ into a direct sum of the subspaces $Z_1,\,Z_2,\,Z_3,\,Z_4$. The union of the basis vectors for these subspaces is a basis of $\mathbb{C}^6$, which we reordered prior to building a matrix representation of the linear transformation $T$. A principal reason for this reordering was to create invariant subspaces (though it was not obvious then).

Define

\begin{align*}
X_1 &= \langle\{\mathbf{z}_{1,1},\,\mathbf{z}_{2,1},\,\mathbf{z}_{3,1},\,\mathbf{z}_{4,1}\}\rangle
= \left\langle\left\{
\begin{bmatrix}1\\1\\0\\1\\1\\1\end{bmatrix},\
\begin{bmatrix}1\\0\\3\\1\\0\\-1\end{bmatrix},\
\begin{bmatrix}-3\\-3\\-3\\-3\\-3\\-2\end{bmatrix},\
\begin{bmatrix}1\\0\\0\\0\\0\\0\end{bmatrix}
\right\}\right\rangle\\
X_2 &= \langle\{\mathbf{z}_{1,2},\,\mathbf{z}_{2,2}\}\rangle
= \left\langle\left\{
\begin{bmatrix}-2\\-2\\-5\\-2\\-1\\0\end{bmatrix},\
\begin{bmatrix}2\\1\\2\\2\\2\\1\end{bmatrix}
\right\}\right\rangle
\end{align*}

Recall from the proof of Theorem CFNLT or the computations in Example CFNLT that the first listed vectors of $X_1$ and $X_2$ are in the kernel of $T$, $\mathcal{K}(T)$, and each listed vector of $X_1$ and $X_2$ is the output of $T$ when evaluated with the subsequent vector of the set. This was by design, and it is this feature of these basis vectors that leads to the nearly diagonal matrix representation with Jordan blocks. However, we also recognize now that this property of these basis vectors allows us to conclude easily that $X_1$ and $X_2$ are invariant subspaces of $\mathbb{C}^6$ relative to $T$.

Furthermore, $\mathbb{C}^6=X_1\oplus X_2$ (Theorem DSFB). So the domain of $T$ is the direct sum of invariant subspaces and the resulting matrix representation has a block diagonal form. Hmmmmm.

Subsection GEE: Generalized Eigenvectors and Eigenspaces

We now define a new type of invariant subspace and explore its key properties. This generalization of eigenvectors and eigenspaces will allow us to move from diagonal matrix representations of diagonalizable matrices to nearly diagonal matrix representations of arbitrary matrices. Here are the definitions.

Definition GEV
Generalized Eigenvector
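Suppose that $T\colon V\to V$ is a linear transformation. Suppose further that for $\mathbf{x}\neq\mathbf{0}$, $(T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}$ for some $k>0$. Then $\mathbf{x}$ is a generalized eigenvector of $T$ with eigenvalue $\lambda$.

Definition GES
Generalized Eigenspace
Suppose that $T\colon V\to V$ is a linear transformation. Define the generalized eigenspace of $T$ for $\lambda$ as

\[
\mathcal{G}_T(\lambda)=\left\{\mathbf{x}\ \middle|\ (T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}\ \text{for some } k\geq 0\right\}
\]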

(This definition contains Notation GES.)

So the generalized eigenspace is composed of generalized eigenvectors, plus the zero vector. As the name implies, the generalized eigenspace is a subspace of $V$. But more topically, it is an invariant subspace of $V$ relative to $T$.

Theorem GESIS
Generalized Eigenspace is an Invariant Subspace
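Suppose that $T\colon V\to V$ is a linear transformation. Then the generalized eigenspace $\mathcal{G}_T(\lambda)$ is an invariant subspace of $V$ relative to $T$.

Proof   First we establish that $\mathcal{G}_T(\lambda)$ is a subspace of $V$. First, $(T-\lambda I_V)^1(\mathbf{0})=\mathbf{0}$ by Theorem LTTZZ, so $\mathbf{0}\in\mathcal{G}_T(\lambda)$.

Suppose that $\mathbf{x},\,\mathbf{y}\in\mathcal{G}_T(\lambda)$. Then there are integers $k,\,\ell$ such that $(T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}$ and $(T-\lambda I_V)^\ell(\mathbf{y})=\mathbf{0}$. Set $m=k+\ell$,

\begin{align*}
(T-\lambda I_V)^m(\mathbf{x}+\mathbf{y})
&= (T-\lambda I_V)^m(\mathbf{x})+(T-\lambda I_V)^m(\mathbf{y}) && \text{Definition LT}\\
&= (T-\lambda I_V)^{k+\ell}(\mathbf{x})+(T-\lambda I_V)^{k+\ell}(\mathbf{y})\\
&= (T-\lambda I_V)^{\ell}\left((T-\lambda I_V)^{k}(\mathbf{x})\right)+(T-\lambda I_V)^{k}\left((T-\lambda I_V)^{\ell}(\mathbf{y})\right) && \text{Definition LTC}\\
&= (T-\lambda I_V)^{\ell}(\mathbf{0})+(T-\lambda I_V)^{k}(\mathbf{0}) && \text{Definition GES}\\
&= \mathbf{0}+\mathbf{0} && \text{Theorem LTTZZ}\\
&= \mathbf{0} && \text{Property Z}
\end{align*}

So $\mathbf{x}+\mathbf{y}\in\mathcal{G}_T(\lambda)$.

Suppose that $\mathbf{x}\in\mathcal{G}_T(\lambda)$ and $\alpha\in\mathbb{C}$. Then there is an integer $k$ such that $(T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}$.

\begin{align*}
(T-\lambda I_V)^k(\alpha\mathbf{x}) &= \alpha(T-\lambda I_V)^k(\mathbf{x}) && \text{Definition LT}\\
&= \alpha\mathbf{0} && \text{Definition GES}\\
&= \mathbf{0} && \text{Theorem ZVSM}
\end{align*}

So $\alpha\mathbf{x}\in\mathcal{G}_T(\lambda)$. By Theorem TSS, $\mathcal{G}_T(\lambda)$ is a subspace of $V$.

Now we show that $\mathcal{G}_T(\lambda)$ is invariant relative to $T$. Suppose that $\mathbf{x}\in\mathcal{G}_T(\lambda)$. Then there is an integer $k$ such that $(T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}$. Recognize also that $(T-\lambda I_V)^k$ is a polynomial in $T$, and therefore commutes with $T$ (that is, $T\circ p(T)=p(T)\circ T$ for any polynomial $p(x)$). Now,

\begin{align*}
(T-\lambda I_V)^k(T(\mathbf{x})) &= T\left((T-\lambda I_V)^k(\mathbf{x})\right)\\
&= T(\mathbf{0}) && \text{Definition GES}\\
&= \mathbf{0} && \text{Theorem LTTZZ}
\end{align*}

This qualifies $T(\mathbf{x})$ for membership in $\mathcal{G}_T(\lambda)$ (Definition GES), so by Definition IS, $\mathcal{G}_T(\lambda)$ is invariant relative to $T$.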

Before we compute some generalized eigenspaces, we state and prove one theorem that will make it much easier to create a generalized eigenspace, since it will allow us to use tools we already know well, and will remove some of the ambiguity of the clause “for some k” in the definition.

Theorem GEK
Generalized Eigenspace as a Kernel
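Suppose that $T\colon V\to V$ is a linear transformation, $\dim(V)=n$, and $\lambda$ is an eigenvalue of $T$. Then $\mathcal{G}_T(\lambda)=\mathcal{K}\left((T-\lambda I_V)^n\right)$.

Proof   The conclusion of this theorem is a set equality, so we will apply Definition SE by establishing two set inclusions. First, suppose that $\mathbf{x}\in\mathcal{G}_T(\lambda)$. Then there is an integer $k$ such that $(T-\lambda I_V)^k(\mathbf{x})=\mathbf{0}$. This is equivalent to the statement that $\mathbf{x}\in\mathcal{K}\left((T-\lambda I_V)^k\right)$. No matter what the value of $k$ is, Theorem KPLT gives

\[
\mathbf{x}\in\mathcal{K}\left((T-\lambda I_V)^k\right)\subseteq\mathcal{K}\left((T-\lambda I_V)^n\right)
\]

So, $\mathcal{G}_T(\lambda)\subseteq\mathcal{K}\left((T-\lambda I_V)^n\right)$. For the opposite inclusion, suppose $\mathbf{y}\in\mathcal{K}\left((T-\lambda I_V)^n\right)$. Then $(T-\lambda I_V)^n(\mathbf{y})=\mathbf{0}$, so $\mathbf{y}\in\mathcal{G}_T(\lambda)$ and thus $\mathcal{K}\left((T-\lambda I_V)^n\right)\subseteq\mathcal{G}_T(\lambda)$. By Definition SE we have the desired equality of sets.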

Theorem GEK allows us to compute generalized eigenspaces as a single kernel (or null space of a matrix representation) with tools like Theorem KNSI and Theorem BNS. Also, we do not need to consider all possible powers $k$ and can simply consider the case where $k=n$. It is worth noting that the “regular” eigenspace is a subspace of the generalized eigenspace since

\[
\mathcal{E}_T(\lambda)=\mathcal{K}\left((T-\lambda I_V)^1\right)\subseteq\mathcal{K}\left((T-\lambda I_V)^n\right)=\mathcal{G}_T(\lambda)
\]

where the subset inclusion is a consequence of Theorem KPLT. Also, there is no such thing as a “generalized eigenvalue.” If $\lambda$ is not an eigenvalue of $T$, then the kernel of $T-\lambda I_V$ is trivial and therefore subsequent powers of $T-\lambda I_V$ also have trivial kernels (Theorem KPLT). So the generalized eigenspace of a scalar that is not already an eigenvalue would be trivial. Alright, we know enough now to compute some generalized eigenspaces. We will record some information about algebraic and geometric multiplicities of eigenvalues (Definition AME, Definition GME) as we go, since these observations will be of interest in light of some future theorems.

Example GE4
Generalized eigenspaces, dimension 4 domain
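In Example TIS we presented two invariant subspaces of $\mathbb{C}^4$. There was some mystery about just how these were constructed, but we can now reveal that they are generalized eigenspaces. Example TIS featured $T\colon\mathbb{C}^4\to\mathbb{C}^4$ defined by $T(\mathbf{x})=A\mathbf{x}$ with $A$ given by

\[
A=\begin{bmatrix} -8 & 6 & -15 & 9\\ -8 & 14 & -10 & 18\\ 1 & 1 & 3 & 0\\ 3 & -8 & 2 & -11 \end{bmatrix}
\]

A matrix representation of $T$ relative to the standard basis (Definition SUV) will equal $A$. So we can analyze $A$ with the techniques of Chapter E. Doing so, we find two eigenvalues, $\lambda=1,\,-2$, with multiplicities,

\[
\alpha_T(1)=2\qquad \gamma_T(1)=1\qquad \alpha_T(-2)=2\qquad \gamma_T(-2)=1
\]

To apply Theorem GEK we subtract each eigenvalue from the diagonal entries of $A$, raise the result to the power $\dim(\mathbb{C}^4)=4$, and compute a basis for the null space.

\begin{align*}
\lambda=-2:\quad (A-(-2)I_4)^4 &=
\begin{bmatrix} 648 & -1215 & 729 & -1215\\ -324 & 486 & -486 & 486\\ -405 & 729 & -486 & 729\\ 297 & -486 & 405 & -486 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1&0&3&0\\ 0&1&1&1\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix}\\
\mathcal{G}_T(-2) &= \left\langle\left\{\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix},\ \begin{bmatrix}0\\-1\\0\\1\end{bmatrix}\right\}\right\rangle\\[1ex]
\lambda=1:\quad (A-(1)I_4)^4 &=
\begin{bmatrix} 81 & -405 & -81 & -729\\ -108 & -189 & -378 & -486\\ -27 & 135 & 27 & 243\\ 135 & 54 & 351 & 243 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1&0&\tfrac{7}{3}&1\\ 0&1&\tfrac{2}{3}&2\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix}\\
\mathcal{G}_T(1) &= \left\langle\left\{\begin{bmatrix}-7\\-2\\3\\0\end{bmatrix},\ \begin{bmatrix}-1\\-2\\0\\1\end{bmatrix}\right\}\right\rangle
\end{align*}

In Example TIS we concluded that these two invariant subspaces formed a direct sum of $\mathbb{C}^4$, only at that time, they were called $X$ and $W$. Now we can write

\[
\mathbb{C}^4=\mathcal{G}_T(1)\oplus\mathcal{G}_T(-2)
\]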

This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)
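The recipe of Theorem GEK used in Example GE4 (shift the diagonal, raise to the power $n$, find the null space) can be sketched with exact rational arithmetic from the Python standard library. The function names `shifted_power` and `null_space_basis` are ours, chosen for illustration; the null space is read off from the reduced row-echelon form by setting one free variable to 1 at a time, and each basis vector is then scaled to clear denominators. The matrix entries carry the signs assumed for Example TIS.

```python
from fractions import Fraction
from math import lcm

def matmul(X, Y):
    """Matrix product X Y for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def shifted_power(A, lam, k):
    """Return (A - lam*I)^k."""
    n = len(A)
    M = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = matmul(P, M)
    return P

def null_space_basis(M):
    """Basis of the null space of M, one vector per free column of the RREF."""
    R = [[Fraction(e) for e in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column: a free variable
        R[r], R[p] = R[p], R[r]
        lead = R[r][c]
        R[r] = [e / lead for e in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    basis = []
    for f in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[f] = Fraction(1)
        for i, c in enumerate(pivots):
            v[c] = -R[i][f]
        d = lcm(*(e.denominator for e in v))   # clear denominators
        basis.append([int(e * d) for e in v])
    return basis

A = [[-8, 6, -15, 9], [-8, 14, -10, 18], [1, 1, 3, 0], [3, -8, 2, -11]]
n = len(A)
for lam in (-2, 1):
    print(lam, null_space_basis(shifted_power(A, lam, n)))
```

Working with `Fraction` rather than floating point keeps the row reduction exact, which matters here because the entries of the fourth power are already in the hundreds and a rank decision depends on recognizing exact zeros.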

Example GE6
Generalized eigenspaces, dimension 6 domain
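Define the linear transformation $S\colon\mathbb{C}^6\to\mathbb{C}^6$ by $S(\mathbf{x})=B\mathbf{x}$ where

\[
B=\begin{bmatrix}
2 & -4 & 25 & -54 & 90 & -37\\
2 & -3 & 4 & -16 & 26 & -8\\
2 & -3 & 4 & -15 & 24 & -7\\
10 & -18 & 6 & -36 & 51 & -2\\
8 & -14 & 0 & -21 & 28 & 4\\
5 & -7 & -6 & -7 & 8 & 7
\end{bmatrix}
\]

Then $B$ will be the matrix representation of $S$ relative to the standard basis (Definition SUV) and we can use the techniques of Chapter E applied to $B$ in order to find the eigenvalues of $S$.

\[
\alpha_S(3)=2\qquad \gamma_S(3)=1\qquad \alpha_S(-1)=4\qquad \gamma_S(-1)=2
\]

To find the generalized eigenspaces of $S$ we need to subtract an eigenvalue from the diagonal elements of $B$, raise the result to the power $\dim(\mathbb{C}^6)=6$ and compute the null space. Here are the results for the two eigenvalues of $S$,

\begin{align*}
\lambda=3:\quad (B-3I_6)^6 &=
\begin{bmatrix}
64000 & -152576 & -59904 & 26112 & -95744 & 133632\\
15872 & -39936 & -11776 & 8704 & -29184 & 36352\\
12032 & -30208 & -9984 & 6400 & -20736 & 26368\\
-1536 & 11264 & -23040 & 17920 & -17920 & -1536\\
-9728 & 27648 & -6656 & 9728 & -1536 & -17920\\
-7936 & 17920 & 5888 & 1792 & 4352 & -14080
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1&0&0&0&-4&5\\
0&1&0&0&-1&1\\
0&0&1&0&-1&1\\
0&0&0&1&-2&1\\
0&0&0&0&0&0\\
0&0&0&0&0&0
\end{bmatrix}\\
\mathcal{G}_S(3) &= \left\langle\left\{
\begin{bmatrix}4\\1\\1\\2\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-1\\-1\\-1\\0\\1\end{bmatrix}
\right\}\right\rangle\\[1ex]
\lambda=-1:\quad (B-(-1)I_6)^6 &=
\begin{bmatrix}
6144 & -16384 & 18432 & -36864 & 57344 & -18432\\
4096 & -8192 & 4096 & -16384 & 24576 & -4096\\
4096 & -8192 & 4096 & -16384 & 24576 & -4096\\
18432 & -32768 & 6144 & -61440 & 90112 & -6144\\
14336 & -24576 & 2048 & -45056 & 65536 & -2048\\
10240 & -16384 & -2048 & -28672 & 40960 & 2048
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1&0&-5&2&-4&5\\
0&1&-3&3&-5&3\\
0&0&0&0&0&0\\
0&0&0&0&0&0\\
0&0&0&0&0&0\\
0&0&0&0&0&0
\end{bmatrix}\\
\mathcal{G}_S(-1) &= \left\langle\left\{
\begin{bmatrix}5\\3\\1\\0\\0\\0\end{bmatrix},\
\begin{bmatrix}-2\\-3\\0\\1\\0\\0\end{bmatrix},\
\begin{bmatrix}4\\5\\0\\0\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-3\\0\\0\\0\\1\end{bmatrix}
\right\}\right\rangle
\end{align*}

If we take the union of the two bases for these two invariant subspaces we obtain the set

\[
C=\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3,\mathbf{v}_4,\mathbf{v}_5,\mathbf{v}_6\}
=\left\{
\begin{bmatrix}4\\1\\1\\2\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-1\\-1\\-1\\0\\1\end{bmatrix},\
\begin{bmatrix}5\\3\\1\\0\\0\\0\end{bmatrix},\
\begin{bmatrix}-2\\-3\\0\\1\\0\\0\end{bmatrix},\
\begin{bmatrix}4\\5\\0\\0\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-3\\0\\0\\0\\1\end{bmatrix}
\right\}
\]

You can check that this set is linearly independent (right now we have no guarantee this will happen). Once this is verified, we have a linearly independent set of size 6 inside a vector space of dimension 6, so by Theorem G, the set $C$ is a basis for $\mathbb{C}^6$. This is enough to apply Theorem DSFB and conclude that

\[
\mathbb{C}^6=\mathcal{G}_S(3)\oplus\mathcal{G}_S(-1)
\]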

This is no accident. Notice that the dimension of each of these invariant subspaces is equal to the algebraic multiplicity of the associated eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)

Subsection RLT: Restrictions of Linear Transformations

Generalized eigenspaces will prove to be an important type of invariant subspace. A second reason for our interest in invariant subspaces is they provide us with another method for creating new linear transformations from old ones.

Definition LTR
Linear Transformation Restriction
Suppose that $T\colon V\to V$ is a linear transformation, and $U$ is an invariant subspace of $V$ relative to $T$. Define the restriction of $T$ to $U$ by

\[
T|_U\colon U\to U\qquad T|_U(\mathbf{u})=T(\mathbf{u})
\]

(This definition contains Notation LTR.)

It might appear that this definition has not accomplished anything, as $T|_U$ would appear to take on exactly the same values as $T$. And this is true. However, $T|_U$ differs from $T$ in the choice of domain and codomain. We tend to give little attention to the domain and codomain of functions, while their defining rules get the spotlight. But the restriction of a linear transformation is all about the choice of domain and codomain. We are restricting the rule of the function to a smaller subspace. Notice the importance of only using this construction with an invariant subspace, since otherwise we cannot be assured that the outputs of the function are even contained in the codomain. Maybe this observation should be the key step in the proof of a theorem saying that $T|_U$ is also a linear transformation, but we won’t bother.

Example LTRGE
Linear transformation restriction on generalized eigenspace
In order to gain some experience with restrictions of linear transformations, we construct one and then also construct a matrix representation for the restriction. Furthermore, we will use a generalized eigenspace as the invariant subspace for the construction of the restriction.

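Consider the linear transformation $T\colon\mathbb{C}^5\to\mathbb{C}^5$ defined by $T(\mathbf{x})=A\mathbf{x}$, where

\[
A=\begin{bmatrix}
-22 & -24 & -24 & -24 & -46\\
3 & 2 & 6 & 0 & 11\\
-12 & -16 & -6 & -14 & -17\\
6 & 8 & 4 & 10 & 8\\
11 & 14 & 8 & 13 & 18
\end{bmatrix}
\]

One of the eigenvalues of $A$ is $\lambda=2$, with geometric multiplicity $\gamma_T(2)=1$, and algebraic multiplicity $\alpha_T(2)=3$. We get the generalized eigenspace in the usual manner,

\[
W=\mathcal{G}_T(2)=\mathcal{K}\left((T-2I_5)^5\right)
=\left\langle\left\{
\begin{bmatrix}-2\\1\\1\\0\\0\end{bmatrix},\
\begin{bmatrix}0\\-1\\0\\1\\0\end{bmatrix},\
\begin{bmatrix}-4\\2\\0\\0\\1\end{bmatrix}
\right\}\right\rangle
=\langle\{\mathbf{w}_1,\mathbf{w}_2,\mathbf{w}_3\}\rangle
\]

By Theorem GESIS, we know $W$ is invariant relative to $T$, so we can employ Definition LTR to form the restriction, $T|_W\colon W\to W$.

To better understand exactly what a restriction is (and isn’t), we’ll form a matrix representation of $T|_W$. This will also be a skill we will use in subsequent examples. For a basis of $W$ we will use $C=\{\mathbf{w}_1,\mathbf{w}_2,\mathbf{w}_3\}$. Notice that $\dim(W)=3$, so our matrix representation will be a square matrix of size 3. Applying Definition MR, we compute

\begin{align*}
\rho_C(T(\mathbf{w}_1)) &= \rho_C(A\mathbf{w}_1)
= \rho_C\left(\begin{bmatrix}-4\\2\\2\\0\\0\end{bmatrix}\right)
= \rho_C\left(2\mathbf{w}_1+0\mathbf{w}_2+0\mathbf{w}_3\right)
= \begin{bmatrix}2\\0\\0\end{bmatrix}\\
\rho_C(T(\mathbf{w}_2)) &= \rho_C(A\mathbf{w}_2)
= \rho_C\left(\begin{bmatrix}0\\-2\\2\\2\\-1\end{bmatrix}\right)
= \rho_C\left(2\mathbf{w}_1+2\mathbf{w}_2+(-1)\mathbf{w}_3\right)
= \begin{bmatrix}2\\2\\-1\end{bmatrix}\\
\rho_C(T(\mathbf{w}_3)) &= \rho_C(A\mathbf{w}_3)
= \rho_C\left(\begin{bmatrix}-6\\3\\-1\\0\\2\end{bmatrix}\right)
= \rho_C\left((-1)\mathbf{w}_1+0\mathbf{w}_2+2\mathbf{w}_3\right)
= \begin{bmatrix}-1\\0\\2\end{bmatrix}
\end{align*}

So the matrix representation of $T|_W$ relative to $C$ is

\[
M^{T|_W}_{C,C}=\begin{bmatrix} 2&2&-1\\ 0&2&0\\ 0&-1&2 \end{bmatrix}
\]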

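The question arises: how do we use a $3\times 3$ matrix to compute with vectors from $\mathbb{C}^5$? To answer this question, consider the randomly chosen vector

\[
\mathbf{w}=\begin{bmatrix}-4\\4\\4\\-2\\-1\end{bmatrix}
\]

First check that $\mathbf{w}\in\mathcal{G}_T(2)$. There are two ways to do this. First, verify that

\[
(T-2I_5)^5(\mathbf{w})=(A-2I_5)^5\mathbf{w}=\mathbf{0}
\]

meeting Definition GES (with $k=5$). Or, express $\mathbf{w}$ as a linear combination of the basis $C$ for $W$, to wit, $\mathbf{w}=4\mathbf{w}_1-2\mathbf{w}_2-\mathbf{w}_3$. Now compute $T|_W(\mathbf{w})$ directly using Definition LTR,

\[
T|_W(\mathbf{w})=T(\mathbf{w})=A\mathbf{w}=\begin{bmatrix}-10\\9\\5\\-4\\0\end{bmatrix}
\]

It was necessary to verify that $\mathbf{w}\in\mathcal{G}_T(2)$, and if we trust our work so far, then this output will also be an element of $W$, but it would be wise to check this anyway (using either of the methods we used for $\mathbf{w}$). We’ll wait.

Now we will repeat this sample computation, but instead using the matrix representation of $T|_W$ relative to $C$.

\begin{align*}
T|_W(\mathbf{w}) &= \rho_C^{-1}\left(M^{T|_W}_{C,C}\,\rho_C(\mathbf{w})\right) && \text{Theorem FTMR}\\
&= \rho_C^{-1}\left(M^{T|_W}_{C,C}\,\rho_C(4\mathbf{w}_1-2\mathbf{w}_2-\mathbf{w}_3)\right)\\
&= \rho_C^{-1}\left(\begin{bmatrix} 2&2&-1\\ 0&2&0\\ 0&-1&2 \end{bmatrix}\begin{bmatrix}4\\-2\\-1\end{bmatrix}\right) && \text{Definition VR}\\
&= \rho_C^{-1}\left(\begin{bmatrix}5\\-4\\0\end{bmatrix}\right) && \text{Definition MVP}\\
&= 5\mathbf{w}_1+(-4)\mathbf{w}_2+0\mathbf{w}_3 && \text{Definition VR}\\
&= 5\begin{bmatrix}-2\\1\\1\\0\\0\end{bmatrix}+(-4)\begin{bmatrix}0\\-1\\0\\1\\0\end{bmatrix}+0\begin{bmatrix}-4\\2\\0\\0\\1\end{bmatrix}\\
&= \begin{bmatrix}-10\\9\\5\\-4\\0\end{bmatrix}
\end{align*}

which matches the previous computation. Notice how the “action” of $T|_W$ is accomplished by a $3\times 3$ matrix multiplying a column vector of size 3. If you would like more practice with these sorts of computations, mimic the above using the other eigenvalue of $T$, which is $\lambda=-2$. The generalized eigenspace has dimension 2, so the matrix representation of the restriction to the generalized eigenspace will be a $2\times 2$ matrix.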
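The two routes to $T|_W(\mathbf{w})$ in Example LTRGE can be compared side by side. This is a small sketch in plain Python; `apply` and `from_coords` are illustrative helper names of ours, and the entries of $A$, the basis $C$ and the $3\times 3$ representation are typed in with the signs assumed for this example.

```python
A = [[-22, -24, -24, -24, -46],
     [3, 2, 6, 0, 11],
     [-12, -16, -6, -14, -17],
     [6, 8, 4, 10, 8],
     [11, 14, 8, 13, 18]]
C = [[-2, 1, 1, 0, 0], [0, -1, 0, 1, 0], [-4, 2, 0, 0, 1]]   # w1, w2, w3
M = [[2, 2, -1], [0, 2, 0], [0, -1, 2]]                      # representation of T|_W

def apply(mat, v):
    """Matrix-vector product."""
    return [sum(m * c for m, c in zip(row, v)) for row in mat]

def from_coords(coords, vectors):
    """Un-coordinatize: the linear combination sum(coords[i] * vectors[i])."""
    return [sum(c * vec[j] for c, vec in zip(coords, vectors))
            for j in range(len(vectors[0]))]

coords = [4, -2, -1]                  # w = 4 w1 - 2 w2 - w3
w = from_coords(coords, C)

direct = apply(A, w)                                  # 5x5 matrix acting in C^5
via_rep = from_coords(apply(M, coords), C)            # 3x3 matrix acting on coordinates
print(w, direct, via_rep)
```

The second route never touches the $5\times 5$ matrix: all of the action of the restriction is carried by the $3\times 3$ representation, with the basis $C$ used only to translate in and out of coordinates.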

Suppose that $T\colon V\to V$ is a linear transformation and we can find a decomposition of $V$ as a direct sum, say $V=U_1\oplus U_2\oplus U_3\oplus\cdots\oplus U_m$ where each $U_i$ is an invariant subspace of $V$ relative to $T$. Then, for any $\mathbf{v}\in V$ there is a unique decomposition $\mathbf{v}=\mathbf{u}_1+\mathbf{u}_2+\mathbf{u}_3+\cdots+\mathbf{u}_m$ with $\mathbf{u}_i\in U_i$, $1\leq i\leq m$, and furthermore

\begin{align*}
T(\mathbf{v}) &= T(\mathbf{u}_1+\mathbf{u}_2+\mathbf{u}_3+\cdots+\mathbf{u}_m) && \text{Definition DS}\\
&= T(\mathbf{u}_1)+T(\mathbf{u}_2)+T(\mathbf{u}_3)+\cdots+T(\mathbf{u}_m) && \text{Theorem LTLC}\\
&= T|_{U_1}(\mathbf{u}_1)+T|_{U_2}(\mathbf{u}_2)+T|_{U_3}(\mathbf{u}_3)+\cdots+T|_{U_m}(\mathbf{u}_m)
\end{align*}

So in a very real sense, we obtain a decomposition of the linear transformation $T$ into the restrictions $T|_{U_i}$, $1\leq i\leq m$. If we wanted to be more careful, we could extend each restriction to a linear transformation defined on $V$ by setting the output of $T|_{U_i}$ to be the zero vector for inputs outside of $U_i$. Then $T$ would be exactly equal to the sum (Definition LTA) of these extended restrictions. However, the irony of extending our restrictions is more than we could handle right now.

Our real interest is in the matrix representation of a linear transformation when the domain decomposes as a direct sum of invariant subspaces. Consider forming a basis $B$ of $V$ as the union of bases $B_i$ from the individual $U_i$, i.e. $B=\bigcup_{i=1}^{m}B_i$. Now form the matrix representation of $T$ relative to $B$. The result will be block diagonal, where each block is the matrix representation of a restriction $T|_{U_i}$ relative to a basis $B_i$, $M^{T|_{U_i}}_{B_i,B_i}$. Though we did not have the definitions to describe it then, this is exactly what was going on in the latter portion of the proof of Theorem CFNLT. Two examples should help to clarify these ideas.

Example ISMR4
Invariant subspaces, matrix representation, dimension 4 domain
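Example TIS and Example GE4 describe a basis of $\mathbb{C}^4$ which is derived from bases for two invariant subspaces (both generalized eigenspaces). In this example we will construct a matrix representation of the linear transformation $T$ relative to this basis. Recycling the notation from Example TIS, we work with the basis,

\[
B=\{\mathbf{w}_1,\mathbf{w}_2,\mathbf{x}_1,\mathbf{x}_2\}
=\left\{
\begin{bmatrix}-7\\-2\\3\\0\end{bmatrix},\
\begin{bmatrix}-1\\-2\\0\\1\end{bmatrix},\
\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix},\
\begin{bmatrix}0\\-1\\0\\1\end{bmatrix}
\right\}
\]

Now we compute the matrix representation of $T$ relative to $B$, borrowing some computations from Example TIS,

\begin{align*}
\rho_B(T(\mathbf{w}_1)) &= \rho_B\left(\begin{bmatrix}-1\\-2\\0\\1\end{bmatrix}\right) = \rho_B\left((0)\mathbf{w}_1+(1)\mathbf{w}_2\right) = \begin{bmatrix}0\\1\\0\\0\end{bmatrix}\\
\rho_B(T(\mathbf{w}_2)) &= \rho_B\left(\begin{bmatrix}5\\-2\\-3\\2\end{bmatrix}\right) = \rho_B\left((-1)\mathbf{w}_1+(2)\mathbf{w}_2\right) = \begin{bmatrix}-1\\2\\0\\0\end{bmatrix}\\
\rho_B(T(\mathbf{x}_1)) &= \rho_B\left(\begin{bmatrix}3\\0\\-1\\1\end{bmatrix}\right) = \rho_B\left((-1)\mathbf{x}_1+(1)\mathbf{x}_2\right) = \begin{bmatrix}0\\0\\-1\\1\end{bmatrix}\\
\rho_B(T(\mathbf{x}_2)) &= \rho_B\left(\begin{bmatrix}3\\4\\-1\\-3\end{bmatrix}\right) = \rho_B\left((-1)\mathbf{x}_1+(-3)\mathbf{x}_2\right) = \begin{bmatrix}0\\0\\-1\\-3\end{bmatrix}
\end{align*}

Applying Definition MR, we have

\[
M^{T}_{B,B}=\begin{bmatrix} 0&-1&0&0\\ 1&2&0&0\\ 0&0&-1&-1\\ 0&0&1&-3 \end{bmatrix}
\]

The interesting feature of this representation is the two $2\times 2$ blocks on the diagonal that arise from the decomposition of $\mathbb{C}^4$ into a direct sum (of generalized eigenspaces). Or maybe the interesting feature of this matrix is the two $2\times 2$ submatrices in the “other” corners that are all zero. You decide.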
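One way to double-check the block diagonal representation of Example ISMR4 without computing an inverse is to place the basis vectors in the columns of a matrix $P$ and test whether $AP=PM$; when $P$ is invertible this is equivalent to $M=P^{-1}AP$. The sketch below is plain Python, with $P$ and the helper `matmul` being our own illustrative names and the numeric entries carrying the signs assumed for this example.

```python
A = [[-8, 6, -15, 9], [-8, 14, -10, 18], [1, 1, 3, 0], [3, -8, 2, -11]]
basis = [[-7, -2, 3, 0], [-1, -2, 0, 1], [-3, -1, 1, 0], [0, -1, 0, 1]]  # w1, w2, x1, x2
M = [[0, -1, 0, 0],
     [1, 2, 0, 0],
     [0, 0, -1, -1],
     [0, 0, 1, -3]]

# P has the basis vectors as its columns.
P = [list(row) for row in zip(*basis)]

def matmul(X, Y):
    """Matrix product X Y for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

AP, PM = matmul(A, P), matmul(P, M)
print(AP == PM)

# The two off-diagonal 2x2 corners of M are zero.
corners_zero = all(M[i][j] == 0 and M[j][i] == 0 for i in (0, 1) for j in (2, 3))
print(corners_zero)
```

Column $j$ of $AP$ is $T$ applied to the $j$-th basis vector, while column $j$ of $PM$ is the linear combination of basis vectors prescribed by column $j$ of $M$, so the equality is Definition MR restated one column at a time.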

Example ISMR6
Invariant subspaces, matrix representation, dimension 6 domain
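In Example GE6 we computed the generalized eigenspaces of the linear transformation $S\colon\mathbb{C}^6\to\mathbb{C}^6$ defined by $S(\mathbf{x})=B\mathbf{x}$ where

\[
B=\begin{bmatrix}
2 & -4 & 25 & -54 & 90 & -37\\
2 & -3 & 4 & -16 & 26 & -8\\
2 & -3 & 4 & -15 & 24 & -7\\
10 & -18 & 6 & -36 & 51 & -2\\
8 & -14 & 0 & -21 & 28 & 4\\
5 & -7 & -6 & -7 & 8 & 7
\end{bmatrix}
\]

From this we found the basis

\[
C=\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3,\mathbf{v}_4,\mathbf{v}_5,\mathbf{v}_6\}
=\left\{
\begin{bmatrix}4\\1\\1\\2\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-1\\-1\\-1\\0\\1\end{bmatrix},\
\begin{bmatrix}5\\3\\1\\0\\0\\0\end{bmatrix},\
\begin{bmatrix}-2\\-3\\0\\1\\0\\0\end{bmatrix},\
\begin{bmatrix}4\\5\\0\\0\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-3\\0\\0\\0\\1\end{bmatrix}
\right\}
\]

of $\mathbb{C}^6$ where $\{\mathbf{v}_1,\mathbf{v}_2\}$ is a basis of $\mathcal{G}_S(3)$ and $\{\mathbf{v}_3,\mathbf{v}_4,\mathbf{v}_5,\mathbf{v}_6\}$ is a basis of $\mathcal{G}_S(-1)$. We can employ $C$ in the construction of a matrix representation of $S$ (Definition MR). Here are the computations,

\begin{align*}
\rho_C(S(\mathbf{v}_1)) &= \rho_C\left(\begin{bmatrix}11\\3\\3\\7\\4\\1\end{bmatrix}\right) = \rho_C\left(4\mathbf{v}_1+1\mathbf{v}_2\right) = \begin{bmatrix}4\\1\\0\\0\\0\\0\end{bmatrix}\\
\rho_C(S(\mathbf{v}_2)) &= \rho_C\left(\begin{bmatrix}-14\\-3\\-3\\-4\\-1\\2\end{bmatrix}\right) = \rho_C\left((-1)\mathbf{v}_1+2\mathbf{v}_2\right) = \begin{bmatrix}-1\\2\\0\\0\\0\\0\end{bmatrix}\\
\rho_C(S(\mathbf{v}_3)) &= \rho_C\left(\begin{bmatrix}23\\5\\5\\2\\-2\\-2\end{bmatrix}\right) = \rho_C\left(5\mathbf{v}_3+2\mathbf{v}_4+(-2)\mathbf{v}_5+(-2)\mathbf{v}_6\right) = \begin{bmatrix}0\\0\\5\\2\\-2\\-2\end{bmatrix}\\
\rho_C(S(\mathbf{v}_4)) &= \rho_C\left(\begin{bmatrix}-46\\-11\\-10\\-2\\5\\4\end{bmatrix}\right) = \rho_C\left((-10)\mathbf{v}_3+(-2)\mathbf{v}_4+5\mathbf{v}_5+4\mathbf{v}_6\right) = \begin{bmatrix}0\\0\\-10\\-2\\5\\4\end{bmatrix}\\
\rho_C(S(\mathbf{v}_5)) &= \rho_C\left(\begin{bmatrix}78\\19\\17\\1\\-10\\-7\end{bmatrix}\right) = \rho_C\left(17\mathbf{v}_3+1\mathbf{v}_4+(-10)\mathbf{v}_5+(-7)\mathbf{v}_6\right) = \begin{bmatrix}0\\0\\17\\1\\-10\\-7\end{bmatrix}\\
\rho_C(S(\mathbf{v}_6)) &= \rho_C\left(\begin{bmatrix}-35\\-9\\-8\\2\\6\\3\end{bmatrix}\right) = \rho_C\left((-8)\mathbf{v}_3+2\mathbf{v}_4+6\mathbf{v}_5+3\mathbf{v}_6\right) = \begin{bmatrix}0\\0\\-8\\2\\6\\3\end{bmatrix}
\end{align*}

These column vectors are the columns of the matrix representation, so we obtain

\[
M^{S}_{C,C}=\begin{bmatrix}
4&-1&0&0&0&0\\
1&2&0&0&0&0\\
0&0&5&-10&17&-8\\
0&0&2&-2&1&2\\
0&0&-2&5&-10&6\\
0&0&-2&4&-7&3
\end{bmatrix}
\]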

As before, the key feature of this representation is the 2 × 2 and 4 × 4 blocks on the diagonal. We will discover in the final theorem of this section (Theorem RGEN) that we already understand these blocks fairly well. For now, we recognize them as arising from generalized eigenspaces and suspect that their sizes are equal to the algebraic multiplicities of the eigenvalues.

The paragraph prior to these last two examples is worth repeating. A basis derived from a direct sum decomposition into invariant subspaces will provide a matrix representation of a linear transformation with a block diagonal form.

Diagonalizing a linear transformation is the most extreme example of decomposing a vector space into invariant subspaces. When a linear transformation is diagonalizable, then there is a basis composed of eigenvectors (Theorem DC). Each of these basis vectors can be used individually as the lone element of a spanning set for an invariant subspace (Theorem EIS). So the domain decomposes into a direct sum of one-dimensional invariant subspaces (Theorem DSFB). The corresponding matrix representation is then block diagonal with all the blocks of size 1, i.e. the matrix is diagonal. Section NLT, Section IS and Section JCF are all devoted to generalizing this extreme situation when there are not enough eigenvectors available to make such a complete decomposition and arrive at such an elegant matrix representation.

One last theorem will roll up much of this section and Section NLT into one nice, neat package.

Theorem RGEN
Restriction to Generalized Eigenspace is Nilpotent
Suppose $T\colon V\to V$ is a linear transformation with eigenvalue $\lambda$. Then the linear transformation $T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}$ is nilpotent.

Proof   Notice first that every subspace of $V$ is invariant with respect to $I_V$, so $I_{\mathcal{G}_T(\lambda)}=I_V|_{\mathcal{G}_T(\lambda)}$. Let $n=\dim(V)$ and choose $\mathbf{v}\in\mathcal{G}_T(\lambda)$. Then

\begin{align*}
\left(T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}\right)^n(\mathbf{v}) &= (T-\lambda I_V)^n(\mathbf{v}) && \text{Definition LTR}\\
&= \mathbf{0} && \text{Theorem GEK}
\end{align*}

So by Definition NLT, $T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}$ is nilpotent.

The proof of Theorem RGEN indicates that the index of the nilpotent linear transformation is less than or equal to the dimension of $V$. In practice, it will be less than or equal to the dimension of the domain of the linear transformation, $\mathcal{G}_T(\lambda)$. In any event, the exact value of this index will be of some interest, so we define it now. Notice that this is a property of the eigenvalue $\lambda$, similar to the algebraic and geometric multiplicities (Definition AME, Definition GME).

Definition IE
Index of an Eigenvalue
Suppose $T\colon V\to V$ is a linear transformation with eigenvalue $\lambda$. Then the index of $\lambda$, $\iota_T(\lambda)$, is the index of the nilpotent linear transformation $T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}$.

(This definition contains Notation IE.)

Example GENR6
Generalized eigenspaces and nilpotent restrictions, dimension 6 domain
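In Example GE6 we computed the generalized eigenspaces of the linear transformation $S\colon\mathbb{C}^6\to\mathbb{C}^6$ defined by $S(\mathbf{x})=B\mathbf{x}$ where

\[
B=\begin{bmatrix}
2 & -4 & 25 & -54 & 90 & -37\\
2 & -3 & 4 & -16 & 26 & -8\\
2 & -3 & 4 & -15 & 24 & -7\\
10 & -18 & 6 & -36 & 51 & -2\\
8 & -14 & 0 & -21 & 28 & 4\\
5 & -7 & -6 & -7 & 8 & 7
\end{bmatrix}
\]

The generalized eigenspace $\mathcal{G}_S(3)$ has dimension 2, while $\mathcal{G}_S(-1)$ has dimension 4. We’ll investigate each thoroughly in turn, with the intent being to illustrate Theorem RGEN. Many of our computations will be repeats of those done in Example ISMR6.

For $U=\mathcal{G}_S(3)$ we compute a matrix representation of $S|_U$ using the basis found in Example GE6,

\[
B=\{\mathbf{u}_1,\mathbf{u}_2\}
=\left\{
\begin{bmatrix}4\\1\\1\\2\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-1\\-1\\-1\\0\\1\end{bmatrix}
\right\}
\]

Since $B$ has size 2, we obtain a $2\times 2$ matrix representation (Definition MR) from

\begin{align*}
\rho_B(S|_U(\mathbf{u}_1)) &= \rho_B\left(\begin{bmatrix}11\\3\\3\\7\\4\\1\end{bmatrix}\right) = \rho_B\left(4\mathbf{u}_1+\mathbf{u}_2\right) = \begin{bmatrix}4\\1\end{bmatrix}\\
\rho_B(S|_U(\mathbf{u}_2)) &= \rho_B\left(\begin{bmatrix}-14\\-3\\-3\\-4\\-1\\2\end{bmatrix}\right) = \rho_B\left((-1)\mathbf{u}_1+2\mathbf{u}_2\right) = \begin{bmatrix}-1\\2\end{bmatrix}
\end{align*}

Thus

\[
M=M^{S|_U}_{B,B}=\begin{bmatrix}4&-1\\1&2\end{bmatrix}
\]

Now we can illustrate Theorem RGEN with powers of the matrix representation (rather than the restriction itself),

\[
M-3I_2=\begin{bmatrix}1&-1\\1&-1\end{bmatrix}\qquad
(M-3I_2)^2=\begin{bmatrix}0&0\\0&0\end{bmatrix}
\]

So $M-3I_2$ is a nilpotent matrix of index 2 (meaning that $S|_U-3I_U$ is a nilpotent linear transformation of index 2) and according to Definition IE we say $\iota_S(3)=2$.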

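For $W=\mathcal{G}_S(-1)$ we compute a matrix representation of $S|_W$ using the basis found in Example GE6,

\[
C=\{\mathbf{w}_1,\mathbf{w}_2,\mathbf{w}_3,\mathbf{w}_4\}
=\left\{
\begin{bmatrix}5\\3\\1\\0\\0\\0\end{bmatrix},\
\begin{bmatrix}-2\\-3\\0\\1\\0\\0\end{bmatrix},\
\begin{bmatrix}4\\5\\0\\0\\1\\0\end{bmatrix},\
\begin{bmatrix}-5\\-3\\0\\0\\0\\1\end{bmatrix}
\right\}
\]

Since $C$ has size 4, we obtain a $4\times 4$ matrix representation (Definition MR) from

\begin{align*}
\rho_C(S|_W(\mathbf{w}_1)) &= \rho_C\left(\begin{bmatrix}23\\5\\5\\2\\-2\\-2\end{bmatrix}\right) = \rho_C\left(5\mathbf{w}_1+2\mathbf{w}_2+(-2)\mathbf{w}_3+(-2)\mathbf{w}_4\right) = \begin{bmatrix}5\\2\\-2\\-2\end{bmatrix}\\
\rho_C(S|_W(\mathbf{w}_2)) &= \rho_C\left(\begin{bmatrix}-46\\-11\\-10\\-2\\5\\4\end{bmatrix}\right) = \rho_C\left((-10)\mathbf{w}_1+(-2)\mathbf{w}_2+5\mathbf{w}_3+4\mathbf{w}_4\right) = \begin{bmatrix}-10\\-2\\5\\4\end{bmatrix}\\
\rho_C(S|_W(\mathbf{w}_3)) &= \rho_C\left(\begin{bmatrix}78\\19\\17\\1\\-10\\-7\end{bmatrix}\right) = \rho_C\left(17\mathbf{w}_1+\mathbf{w}_2+(-10)\mathbf{w}_3+(-7)\mathbf{w}_4\right) = \begin{bmatrix}17\\1\\-10\\-7\end{bmatrix}\\
\rho_C(S|_W(\mathbf{w}_4)) &= \rho_C\left(\begin{bmatrix}-35\\-9\\-8\\2\\6\\3\end{bmatrix}\right) = \rho_C\left((-8)\mathbf{w}_1+2\mathbf{w}_2+6\mathbf{w}_3+3\mathbf{w}_4\right) = \begin{bmatrix}-8\\2\\6\\3\end{bmatrix}
\end{align*}

Thus

\[
N=M^{S|_W}_{C,C}=\begin{bmatrix} 5&-10&17&-8\\ 2&-2&1&2\\ -2&5&-10&6\\ -2&4&-7&3 \end{bmatrix}
\]

Now we can illustrate Theorem RGEN with powers of the matrix representation (rather than the restriction itself),

\begin{align*}
N-(-1)I_4 &= \begin{bmatrix} 6&-10&17&-8\\ 2&-1&1&2\\ -2&5&-9&6\\ -2&4&-7&4 \end{bmatrix}\\
(N-(-1)I_4)^2 &= \begin{bmatrix} -2&3&-5&2\\ 4&-6&10&-4\\ 4&-6&10&-4\\ 2&-3&5&-2 \end{bmatrix}\\
(N-(-1)I_4)^3 &= \begin{bmatrix} 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix}
\end{align*}

So $N-(-1)I_4$ is a nilpotent matrix of index 3 (meaning that $S|_W-(-1)I_W$ is a nilpotent linear transformation of index 3) and according to Definition IE we say $\iota_S(-1)=3$.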
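The two indices found in Example GENR6 can be confirmed by repeated multiplication. This is a sketch in plain Python; `nilpotency_index` is our own illustrative helper, which returns the smallest power that vanishes (or `None` if no power up to the size of the matrix does), and the matrix entries carry the signs assumed for this example.

```python
def matmul(X, Y):
    """Matrix product X Y for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def shift(M, lam):
    """Return M - lam*I."""
    return [[e - (lam if i == j else 0) for j, e in enumerate(row)]
            for i, row in enumerate(M)]

def nilpotency_index(M):
    """Smallest k >= 1 with M^k = 0, or None if M^n is nonzero for n = size of M."""
    P = M
    for k in range(1, len(M) + 1):
        if all(e == 0 for row in P for e in row):
            return k
        P = matmul(P, M)
    return None

M = [[4, -1], [1, 2]]                      # representation of S restricted to G_S(3)
N = [[5, -10, 17, -8],
     [2, -2, 1, 2],
     [-2, 5, -10, 6],
     [-2, 4, -7, 3]]                       # representation of S restricted to G_S(-1)

print(nilpotency_index(shift(M, 3)))       # index of the eigenvalue 3
print(nilpotency_index(shift(N, -1)))      # index of the eigenvalue -1
print(nilpotency_index(M))                 # M itself is not nilpotent
```

Stopping the search at the size of the matrix is justified by the remark following Theorem RGEN: the index of a nilpotent restriction never exceeds the dimension of its domain.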

Notice that if we were to take the union of the two bases of the generalized eigenspaces, we would have a basis for $\mathbb{C}^6$. Then a matrix representation of $S$ relative to this basis would be the same block diagonal matrix we found in Example ISMR6, only we now understand each of these blocks as being very close to being a nilpotent matrix.

Invariant subspaces, and restrictions of linear transformations, are topics you will see again and again if you continue with further study of linear algebra. Our reason for discussing them now is to arrive at a nice matrix representation of the restriction of a linear transformation to one of its generalized eigenspaces. Here’s the theorem.

Theorem MRRGE
Matrix Representation of a Restriction to a Generalized Eigenspace
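Suppose that $T\colon V\to V$ is a linear transformation with eigenvalue $\lambda$. Then there is a basis of the generalized eigenspace $\mathcal{G}_T(\lambda)$ such that the restriction $T|_{\mathcal{G}_T(\lambda)}\colon\mathcal{G}_T(\lambda)\to\mathcal{G}_T(\lambda)$ has a matrix representation that is block diagonal where each block is a Jordan block of the form $J_n(\lambda)$.

Proof   Theorem RGEN tells us that $T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}$ is a nilpotent linear transformation. Theorem CFNLT tells us that a nilpotent linear transformation has a basis for its domain that yields a matrix representation that is block diagonal where the blocks are Jordan blocks of the form $J_n(0)$. Let $B$ be a basis of $\mathcal{G}_T(\lambda)$ that yields such a matrix representation for $T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}$.

By Definition LTA, we can write

\[
T|_{\mathcal{G}_T(\lambda)}=\left(T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}\right)+\lambda I_{\mathcal{G}_T(\lambda)}
\]

The matrix representation of $\lambda I_{\mathcal{G}_T(\lambda)}$ relative to the basis $B$ is then simply the diagonal matrix $\lambda I_m$, where $m=\dim(\mathcal{G}_T(\lambda))$. By Theorem MRSLT we have the rather unwieldy expression,

\begin{align*}
M^{T|_{\mathcal{G}_T(\lambda)}}_{B,B}
&= M^{\left(T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}\right)+\lambda I_{\mathcal{G}_T(\lambda)}}_{B,B}\\
&= M^{T|_{\mathcal{G}_T(\lambda)}-\lambda I_{\mathcal{G}_T(\lambda)}}_{B,B}+M^{\lambda I_{\mathcal{G}_T(\lambda)}}_{B,B}
\end{align*}

The first of these matrix representations has Jordan blocks with zero in every diagonal entry, while the second matrix representation has $\lambda$ in every diagonal entry. The result of adding the two representations is to convert the Jordan blocks from the form $J_n(0)$ to the form $J_n(\lambda)$.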
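To make the final step of the proof concrete, here is the addition written out for a single block of size 3. The size is chosen only for illustration, and we assume the convention that a Jordan block carries ones on the superdiagonal.

```latex
\[
J_3(0)+\lambda I_3
=\begin{bmatrix} 0&1&0\\ 0&0&1\\ 0&0&0 \end{bmatrix}
+\begin{bmatrix} \lambda&0&0\\ 0&\lambda&0\\ 0&0&\lambda \end{bmatrix}
=\begin{bmatrix} \lambda&1&0\\ 0&\lambda&1\\ 0&0&\lambda \end{bmatrix}
=J_3(\lambda)
\]
```

The same entrywise addition happens inside every diagonal block at once, since $\lambda I_m$ is itself block diagonal with blocks $\lambda I_n$ of matching sizes.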

Of course, Theorem CFNLT provides some extra information on the sizes of the Jordan blocks in a representation and we could carry over this information to Theorem MRRGE, but will save that for a subsequent application of this result.