Section JCF  Jordan Canonical Form

From A First Course in Linear Algebra
Version 1.08
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/

This Section is a Draft
We have seen in Section IS that generalized eigenspaces are invariant subspaces that in every instance have led to a direct sum decomposition of the domain of the associated linear transformation. This allows us to create a block diagonal matrix representation (Example ISMR4, Example ISMR6). We also know from Theorem RGEN that the restriction of a linear transformation to a generalized eigenspace is almost a nilpotent linear transformation. Of course, we understand nilpotent linear transformations very well from Section NLT and we have carefully determined a nice matrix representation for them.

So here is the game plan for the final push. Prove that the domain of a linear transformation always decomposes into a direct sum of generalized eigenspaces. We have unravelled Theorem RGEN at Theorem MRRGE so that we can formulate the matrix representations of the restrictions on the generalized eigenspaces using our storehouse of results about nilpotent linear transformations. Arrive at a matrix representation of any linear transformation that is block diagonal with each block being a Jordan block.

Subsection GESD: Generalized Eigenspace Decomposition

In Theorem UTMR we were able to show that any linear transformation from V to V has an upper triangular matrix representation (Theorem UTM). We will now show that we can improve on the basis yielding this representation by massaging the basis so that the matrix representation is also block diagonal. The subspaces associated with each block will be generalized eigenspaces, so the most general result will be a decomposition of the domain of a linear transformation into a direct sum of generalized eigenspaces.

Theorem GESD
Generalized Eigenspace Decomposition
Suppose that T : V → V is a linear transformation with distinct eigenvalues λ_1, λ_2, λ_3, …, λ_m. Then

V = G_T(λ_1) ⊕ G_T(λ_2) ⊕ G_T(λ_3) ⊕ ⋯ ⊕ G_T(λ_m)

Proof   Suppose that dim(V) = n and the n (not necessarily distinct) eigenvalues of T are ρ_1, ρ_2, ρ_3, …, ρ_n. We begin with a basis of V that yields an upper triangular matrix representation, as guaranteed by Theorem UTMR, B = {x_1, x_2, x_3, …, x_n}. Since the matrix representation is upper triangular, and the eigenvalues of the linear transformation are the diagonal elements, we can choose this basis so that there are scalars a_{ij}, 1 ≤ j ≤ n, 1 ≤ i ≤ j − 1, such that

T(x_j) = Σ_{i=1}^{j−1} a_{ij} x_i + ρ_j x_j

We now define a new basis for V which is just a slight variation of the basis B. Choose any k and ℓ such that 1 ≤ k < ℓ ≤ n and ρ_k ≠ ρ_ℓ. Define the scalar α = a_{kℓ}/(ρ_ℓ − ρ_k). The new basis is C = {y_1, y_2, y_3, …, y_n} where

y_j = x_j,  j ≠ ℓ,  1 ≤ j ≤ n        y_ℓ = x_ℓ + α x_k

We now compute the values of the linear transformation T with inputs from C, noting carefully the changed scalars in the linear combinations of C describing the outputs. These changes will translate to minor changes in the matrix representation built using the basis C. There are three cases to consider, depending on which column of the matrix representation we are examining. First, assume j < ℓ. Then

T(y_j) = T(x_j) = Σ_{i=1}^{j−1} a_{ij} x_i + ρ_j x_j = Σ_{i=1}^{j−1} a_{ij} y_i + ρ_j y_j

That seems a bit pointless. The first ℓ − 1 columns of the matrix representations of T relative to B and C are identical. OK, if that was too easy, here’s the main act. Assume j = ℓ. Then

T(y_ℓ) = T(x_ℓ + αx_k)
  = T(x_ℓ) + αT(x_k)
  = Σ_{i=1}^{ℓ−1} a_{iℓ} x_i + ρ_ℓ x_ℓ + α(Σ_{i=1}^{k−1} a_{ik} x_i + ρ_k x_k)
  = Σ_{i=1}^{ℓ−1} a_{iℓ} x_i + ρ_ℓ x_ℓ + Σ_{i=1}^{k−1} αa_{ik} x_i + αρ_k x_k
  = Σ_{i=1}^{ℓ−1} a_{iℓ} x_i + Σ_{i=1}^{k−1} αa_{ik} x_i + αρ_k x_k + ρ_ℓ x_ℓ
  = Σ_{i=1, i≠k}^{ℓ−1} a_{iℓ} x_i + Σ_{i=1}^{k−1} αa_{ik} x_i + a_{kℓ} x_k + αρ_k x_k + ρ_ℓ x_ℓ
  = Σ_{i=1, i≠k}^{ℓ−1} a_{iℓ} x_i + Σ_{i=1}^{k−1} αa_{ik} x_i + a_{kℓ} x_k + αρ_k x_k − ρ_ℓ αx_k + ρ_ℓ αx_k + ρ_ℓ x_ℓ
  = Σ_{i=1, i≠k}^{ℓ−1} a_{iℓ} x_i + Σ_{i=1}^{k−1} αa_{ik} x_i + (a_{kℓ} + αρ_k − ρ_ℓ α) x_k + ρ_ℓ(αx_k + x_ℓ)
  = Σ_{i=1, i≠k}^{ℓ−1} a_{iℓ} x_i + Σ_{i=1}^{k−1} αa_{ik} x_i + (a_{kℓ} + α(ρ_k − ρ_ℓ)) x_k + ρ_ℓ(x_ℓ + αx_k)
  = Σ_{i=1, i≠k}^{ℓ−1} a_{iℓ} y_i + Σ_{i=1}^{k−1} αa_{ik} y_i + (a_{kℓ} + α(ρ_k − ρ_ℓ)) y_k + ρ_ℓ y_ℓ

So how different are the matrix representations relative to B and C in column ℓ? For i > k, the coefficient of y_i is a_{iℓ}, as in the representation relative to B. It is a different story for i ≤ k, where the coefficients of y_i may be very different. We are especially interested in the coefficient of y_k. In fact, this whole first part of the proof is about this particular entry of the matrix representation. The coefficient of y_k is

a_{kℓ} + α(ρ_k − ρ_ℓ) = a_{kℓ} + (a_{kℓ}/(ρ_ℓ − ρ_k))(ρ_k − ρ_ℓ) = a_{kℓ} + (−1)a_{kℓ} = 0

If the definition of α was a mystery, then no more. In the matrix representation of T relative to C, the entry in column ℓ, row k, is zero. Nice. The only price we pay is that other entries in column ℓ, specifically in rows 1 through k − 1, may also change in a way we can’t control.

One more case to consider. Assume j > ℓ. Then

T(y_j) = T(x_j)
  = Σ_{i=1}^{j−1} a_{ij} x_i + ρ_j x_j
  = Σ_{i=1, i≠k,ℓ}^{j−1} a_{ij} x_i + a_{ℓj} x_ℓ + a_{kj} x_k + ρ_j x_j
  = Σ_{i=1, i≠k,ℓ}^{j−1} a_{ij} x_i + a_{ℓj} x_ℓ + αa_{ℓj} x_k − αa_{ℓj} x_k + a_{kj} x_k + ρ_j x_j
  = Σ_{i=1, i≠k,ℓ}^{j−1} a_{ij} x_i + a_{ℓj}(x_ℓ + αx_k) + (a_{kj} − αa_{ℓj}) x_k + ρ_j x_j
  = Σ_{i=1, i≠k,ℓ}^{j−1} a_{ij} y_i + a_{ℓj} y_ℓ + (a_{kj} − αa_{ℓj}) y_k + ρ_j y_j

As before, we ask: how different are the matrix representations relative to B and C in column j? Only y_k has a coefficient different from the corresponding coefficient when the basis is B. So in the matrix representations, the only entries to change are in row k, for columns ℓ + 1 through n.

What have we accomplished? With a change of basis, we can place a zero in a desired entry (row k, column ℓ) of the matrix representation, leaving most of the other entries untouched. The only entries that can possibly change are above the new zero entry, or to its right. Suppose we repeat this procedure, starting by “zeroing out” the entry above the diagonal in the second column and first row. Then we move right to the third column, and zero out the element just above the diagonal in the second row. Next we zero out the element in the third column and first row. Then tackle the fourth column, working upwards from the diagonal, zeroing out elements as we go. Entries above, and to the right, will repeatedly change, but newly created zeros will never get wrecked, since they are below, or just to the left of, the entry we are working on. Similarly, the values on the diagonal do not change either. This entire argument can be retooled in the language of change-of-basis matrices and similarity transformations, and this is the approach taken by Noble in his Applied Linear Algebra. It is an interesting exercise to construct the change-of-basis matrix between the bases B and C and compute its inverse.
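The zeroing step just described can be checked numerically. In matrix language the change of basis is the similarity transformation P⁻¹TP with P = I + αE_{kℓ}. The sketch below (a hypothetical 3 × 3 example, not taken from the text) confirms that the chosen entry becomes zero while the diagonal survives.

```python
# Numerical sketch of the basis change in the proof: for an upper triangular
# matrix with rho_k != rho_l on the diagonal, the similarity P^{-1} T P with
# P = I + alpha*E_{kl} zeroes entry (k, l), changing only entries above it
# or to its right.  The matrix T here is a made-up example.
import numpy as np

T = np.array([[1.0, 5.0, 2.0],
              [0.0, 3.0, 7.0],
              [0.0, 0.0, 4.0]])
k, l = 0, 1                              # target entry (row k, column l), 0-indexed
alpha = T[k, l] / (T[l, l] - T[k, k])    # alpha = a_kl / (rho_l - rho_k)

P = np.eye(3)
P[k, l] = alpha                          # new basis vector y_l = x_l + alpha*x_k
T_new = np.linalg.inv(P) @ T @ P

assert abs(T_new[k, l]) < 1e-9           # the targeted entry is now zero
assert np.allclose(np.diag(T_new), np.diag(T))   # diagonal is untouched
print(np.round(T_new, 10))
```

Repeating this with successive (k, ℓ) pairs, sweeping column by column as described above, zeroes every entry whose two diagonal neighbors differ.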

Perhaps you have noticed that we have to be just a bit more careful than the previous paragraph suggests. The definition of α has a denominator that cannot be zero, which restricts our maneuvers: we can zero out the entry in row k and column ℓ only when ρ_k ≠ ρ_ℓ. So we do not necessarily arrive at a diagonal matrix. More carefully, we can write

T(y_j) = Σ_{i=1, ρ_i=ρ_j}^{j−1} b_{ij} y_i + ρ_j y_j

where the bij are our new coefficients after repeated changes, the yj are the new basis vectors, and the condition “i : ρi = ρj” means that we only have terms in the sum involving vectors whose final coefficients are identical diagonal values (the eigenvalues). Now reorder the basis vectors carefully. Group together vectors that have equal diagonal entries in the matrix representation, but within each group preserve the order of the precursor basis. This grouping will create a block diagonal structure for the matrix representation, while otherwise preserving the order of the basis will retain the upper triangular form of the representation. So we can arrive at a basis that yields a matrix representation that is upper triangular and block diagonal, with the diagonal entries of each block all equal to a common eigenvalue of the linear transformation.

More carefully, employing the distinct eigenvalues of T, λ_i, 1 ≤ i ≤ m, we can assert there is a set of basis vectors for V, u_{ij}, 1 ≤ i ≤ m, 1 ≤ j ≤ α_T(λ_i), such that

T(u_{ij}) = Σ_{k=1}^{j−1} b_{ijk} u_{ik} + λ_i u_{ij}

So the subspace U_i = ⟨{u_{ij} | 1 ≤ j ≤ α_T(λ_i)}⟩, 1 ≤ i ≤ m, is an invariant subspace of V relative to T, and the restriction T|_{U_i} has an upper triangular matrix representation relative to the basis {u_{ij} | 1 ≤ j ≤ α_T(λ_i)} where the diagonal entries are all equal to λ_i. Notice too that with this definition,

V = U_1 ⊕ U_2 ⊕ U_3 ⊕ ⋯ ⊕ U_m

Whew. This is a good place to take a break, grab a cup of coffee, or go for a short stroll, before we show that U_i is a subspace of the generalized eigenspace G_T(λ_i). This will follow if we can prove that each of the basis vectors for U_i is a generalized eigenvector of T for λ_i (Definition GEV). We need some power of T − λ_i I_V that takes u_{ij} to the zero vector. We prove by induction on j (Technique I) the claim that (T − λ_i I_V)^j (u_{ij}) = 0. For j = 1 we have,

(T − λ_i I_V)(u_{i1}) = T(u_{i1}) − λ_i I_V(u_{i1}) = T(u_{i1}) − λ_i u_{i1} = λ_i u_{i1} − λ_i u_{i1} = 0

For the induction step, assume that if k < j, then (T − λ_i I_V)^k takes u_{ik} to the zero vector. Then

(T − λ_i I_V)^j (u_{ij})
  = (T − λ_i I_V)^{j−1}((T − λ_i I_V)(u_{ij}))
  = (T − λ_i I_V)^{j−1}(T(u_{ij}) − λ_i I_V(u_{ij}))
  = (T − λ_i I_V)^{j−1}(T(u_{ij}) − λ_i u_{ij})
  = (T − λ_i I_V)^{j−1}(Σ_{k=1}^{j−1} b_{ijk} u_{ik} + λ_i u_{ij} − λ_i u_{ij})
  = (T − λ_i I_V)^{j−1}(Σ_{k=1}^{j−1} b_{ijk} u_{ik})
  = Σ_{k=1}^{j−1} b_{ijk} (T − λ_i I_V)^{j−1}(u_{ik})
  = Σ_{k=1}^{j−1} b_{ijk} (T − λ_i I_V)^{j−1−k}((T − λ_i I_V)^k(u_{ik}))
  = Σ_{k=1}^{j−1} b_{ijk} (T − λ_i I_V)^{j−1−k}(0)
  = Σ_{k=1}^{j−1} b_{ijk} 0
  = 0

This completes the induction step. Since every vector of the spanning set for U_i is an element of the subspace G_T(λ_i), Property AC and Property SC allow us to conclude that U_i ⊆ G_T(λ_i). Then by Definition S, U_i is a subspace of G_T(λ_i). Notice that this inductive proof could be interpreted to say that every element of U_i is a generalized eigenvector of T for λ_i, and the algebraic multiplicity of λ_i is a sufficiently high power to demonstrate this via the definition for each vector.
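The claim just proved can be seen concretely on a single Jordan block. A sketch, using the hypothetical block J_3(2): the j-th standard basis vector is a generalized eigenvector killed by the j-th power of T − 2I, and by no smaller power.

```python
# For the Jordan block J_3(2), e_j is a generalized eigenvector for lambda = 2:
# (J - 2I)^j e_j = 0, while (J - 2I)^(j-1) e_j != 0.
import numpy as np

J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
N = J - 2.0 * np.eye(3)                  # the nilpotent part

for j in range(1, 4):
    e = np.eye(3)[:, j - 1]              # standard basis vector e_j
    assert np.allclose(np.linalg.matrix_power(N, j) @ e, 0)
    if j > 1:
        assert not np.allclose(np.linalg.matrix_power(N, j - 1) @ e, 0)
print("each e_j is a generalized eigenvector for lambda = 2")
```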

We are now prepared for our final argument in this long proof. We wish to establish that the dimension of the subspace G_T(λ_i) is the algebraic multiplicity of λ_i. This will be enough to show that U_i and G_T(λ_i) are equal, and will finally provide the desired direct sum decomposition.

We will prove by induction (Technique I) the following claim. Suppose that T : V → V is a linear transformation and B is a basis for V that provides an upper triangular matrix representation of T. The number of times any eigenvalue λ occurs on the diagonal of the representation is greater than or equal to the dimension of the generalized eigenspace G_T(λ).

We will use the symbol m for the dimension of V so as to avoid confusion with our notation for the nullity. So dim(V) = m and our proof will proceed by induction on m. Use the notation #_T(λ) to count the number of times λ occurs on the diagonal of a matrix representation of T. We want to show that

#_T(λ) ≥ dim(G_T(λ)) = dim(K((T − λI_V)^m))   [Theorem GEK]  = n((T − λI_V)^m)   [Definition NOLT]

For the base case, suppose dim(V) = 1. Every matrix representation of T is an upper triangular matrix with the lone eigenvalue of T, λ, as the diagonal entry. So #_T(λ) = 1. The generalized eigenspace of λ is not trivial (since by Theorem GEK it equals the regular eigenspace), and is a subspace of V. With Theorem PSSD we see that dim(G_T(λ)) = 1.

Now for the induction step, assume the claim is true for any linear transformation defined on a vector space of dimension m − 1 or less. Suppose that B = {v_1, v_2, v_3, …, v_m} is a basis for V that yields an upper triangular matrix representation for T with diagonal entries λ_1, λ_2, λ_3, …, λ_m. Then U = ⟨{v_1, v_2, v_3, …, v_{m−1}}⟩ is a subspace of V that is invariant relative to T. The restriction T|_U : U → U is then a linear transformation defined on U, a vector space of dimension m − 1. A matrix representation of T|_U relative to the basis C = {v_1, v_2, v_3, …, v_{m−1}} will be an upper triangular matrix with diagonal entries λ_1, λ_2, λ_3, …, λ_{m−1}. We can therefore apply the induction hypothesis to T|_U and its representation relative to C.

Suppose that λ is any eigenvalue of T. Then suppose that v ∈ K((T − λI_V)^m). As an element of V, we can write v as a linear combination of the basis elements of B, or more compactly, there is a vector u ∈ U and a scalar α such that v = u + αv_m. Then,

α(λ_m − λ)^m v_m
  = α(T − λI_V)^m(v_m)   [Theorem EOMP]
  = 0 + α(T − λI_V)^m(v_m)   [Property Z]
  = −(T − λI_V)^m(u) + (T − λI_V)^m(u) + α(T − λI_V)^m(v_m)   [Property AI]
  = −(T − λI_V)^m(u) + (T − λI_V)^m(u + αv_m)   [Theorem LTLC]
  = −(T − λI_V)^m(u) + (T − λI_V)^m(v)
  = −(T − λI_V)^m(u) + 0   [Definition KLT]
  = −(T − λI_V)^m(u)   [Property Z]

The final expression in this string of equalities is an element of U since U is invariant relative to both T and I_V. The expression at the beginning is a scalar multiple of v_m, and as such cannot be a nonzero element of U without violating the linear independence of B. So

α(λ_m − λ)^m v_m = 0

The vector v_m is nonzero since B is linearly independent, so Theorem SMEZV tells us that α(λ_m − λ)^m = 0. From the properties of scalar multiplication, we are confronted with two possibilities: α = 0 or λ = λ_m.

Our first case is that λ ≠ λ_m. Notice then that λ occurs the same number of times along the diagonal in the representations of T|_U and T. Now α = 0 and v = u + 0v_m = u. Since v was chosen as an arbitrary element of K((T − λI_V)^m), Definition SSET says that K((T − λI_V)^m) ⊆ U. It is always the case that K((T|_U − λI_U)^m) ⊆ K((T − λI_V)^m). However, we can also see that in this case the opposite set inclusion is true as well. By Definition SE we have K((T|_U − λI_U)^m) = K((T − λI_V)^m). Then

#_T(λ) = #_{T|_U}(λ)
  ≥ dim(G_{T|_U}(λ))   [Induction Hypothesis]
  = dim(K((T|_U − λI_U)^{m−1}))   [Theorem GEK]
  = dim(K((T|_U − λI_U)^m))   [Theorem KPLT]
  = dim(K((T − λI_V)^m))
  = dim(G_T(λ))   [Theorem GEK]

The second case is that λ = λm. Notice then that λ occurs one more time along the diagonal in the representation of T compared to the representation of T|U. Then

(T|_U − λI_U)^m(u)
  = (T − λI_V)^m(u)
  = (T − λI_V)^m(u) + 0   [Property Z]
  = (T − λI_V)^m(u) + α(λ_m − λ)^m v_m   [Theorem ZSSM]
  = (T − λI_V)^m(u) + α(T − λI_V)^m(v_m)   [Theorem EOMP]
  = (T − λI_V)^m(u + αv_m)   [Theorem LTLC]
  = (T − λI_V)^m(v)
  = 0   [Definition KLT]

So u ∈ K((T|_U − λI_U)^m). The vector v is an arbitrary member of K((T − λI_V)^m), and it is equal to an element of K((T|_U − λI_U)^m) (namely u) plus a scalar multiple of the vector v_m. This observation yields

dim(K((T − λI_V)^m)) ≤ dim(K((T|_U − λI_U)^m)) + 1

Now count eigenvalues on the diagonal,

#_T(λ) = #_{T|_U}(λ) + 1
  ≥ dim(G_{T|_U}(λ)) + 1   [Induction Hypothesis]
  = dim(K((T|_U − λI_U)^{m−1})) + 1   [Theorem GEK]
  = dim(K((T|_U − λI_U)^m)) + 1   [Theorem KPLT]
  ≥ dim(K((T − λI_V)^m))
  = dim(G_T(λ))   [Theorem GEK]

This completes the induction step, and the claim. In Theorem UTMR we constructed an upper triangular matrix representation of T where each eigenvalue occurred α_T(λ) times on the diagonal. So

α_T(λ_i) = #_T(λ_i)   [Theorem UTMR]
  ≥ dim(G_T(λ_i))   [the claim above]
  ≥ dim(U_i)   [Theorem PSSD]
  = α_T(λ_i)

Thus, dim(G_T(λ_i)) = α_T(λ_i) and by Theorem EDYES, U_i = G_T(λ_i), and we can write

V = U_1 ⊕ U_2 ⊕ U_3 ⊕ ⋯ ⊕ U_m = G_T(λ_1) ⊕ G_T(λ_2) ⊕ G_T(λ_3) ⊕ ⋯ ⊕ G_T(λ_m)

Besides a nice decomposition into invariant subspaces, this proof has a bonus for us.

Theorem DGES
Dimension of Generalized Eigenspaces
Suppose T : V → V is a linear transformation with eigenvalue λ. Then the dimension of the generalized eigenspace for λ is the algebraic multiplicity of λ, dim(G_T(λ)) = α_T(λ).

Proof   At the very end of the proof of Theorem GESD we obtain the inequalities

α_T(λ_i) ≤ dim(G_T(λ_i)) ≤ α_T(λ_i)

which establishes the desired equality.
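Theorem DGES is easy to test in a computer algebra system. A sketch with a hypothetical 3 × 3 matrix whose eigenvalue 2 has algebraic multiplicity 2 but geometric multiplicity 1:

```python
# dim G_T(2) = dim K((A - 2I)^n) equals the algebraic multiplicity of 2,
# even though the ordinary eigenspace K(A - 2I) is strictly smaller.
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])          # made-up example: alpha(2) = 2, gamma(2) = 1
n = A.shape[0]

gen_eigenspace = ((A - 2 * sp.eye(n)) ** n).nullspace()
eigenspace = (A - 2 * sp.eye(n)).nullspace()

assert len(gen_eigenspace) == 2     # dim G_T(2) = alpha_T(2) = 2
assert len(eigenspace) == 1         # gamma_T(2) = 1
```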

Subsection JCF: Jordan Canonical Form

Now we are in a position to define what we (and others) regard as an especially nice matrix representation. The word “canonical” has at its root, the word “canon,” which has various meanings. One is the set of laws established by a church council. Another is a set of writings that are authentic, important or representative. Here we take it to mean the accepted, or best, representative among a variety of choices. Every linear transformation admits a variety of representations, and we will declare one as the best. Hopefully you will agree.

Definition JCF
Jordan Canonical Form
A square matrix is in Jordan canonical form if it meets the following requirements:

  1. The matrix is block diagonal.
  2. Each block is a Jordan block.
  3. If ρ < λ, then the block J_k(ρ) occupies rows with indices greater than the indices of the rows occupied by J_ℓ(λ).
  4. If ρ = λ and ℓ < k, then the block J_ℓ(λ) occupies rows with indices greater than the indices of the rows occupied by J_k(λ).
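A matrix meeting this definition can be assembled mechanically. The helpers below (the function names are ours, not the text’s) build Jordan blocks with ones on the superdiagonal and stack them block diagonally in the order the definition requires.

```python
import sympy as sp

def jordan_block(k, lam):
    """J_k(lam): a k x k matrix with lam on the diagonal, ones on the superdiagonal."""
    return sp.Matrix(k, k, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

def jordan_canonical(blocks):
    """blocks: (size, eigenvalue) pairs, listed with larger eigenvalues first
    and, for equal eigenvalues, larger blocks first (as in Definition JCF)."""
    return sp.diag(*[jordan_block(k, lam) for k, lam in blocks])

J = jordan_canonical([(2, 3), (2, 1), (1, 1)])   # J_2(3), then J_2(1), then J_1(1)
print(J)
```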

Theorem JCFLT
Jordan Canonical Form for a Linear Transformation
Suppose T : V → V is a linear transformation. Then there is a basis B for V such that the matrix representation of T relative to B has the following properties:

  1. The matrix representation is in Jordan canonical form.
  2. If J_k(λ) is one of the Jordan blocks, then λ is an eigenvalue of T.
  3. For a fixed value of λ, the largest block of the form J_k(λ) has size equal to the index of λ, ι_T(λ).
  4. For a fixed value of λ, the number of blocks of the form J_k(λ) is the geometric multiplicity of λ, γ_T(λ).
  5. For a fixed value of λ, the number of rows occupied by blocks of the form J_k(λ) is the algebraic multiplicity of λ, α_T(λ).

Proof   This theorem is really just the consequence of applying, in turn, Theorem GESD, Theorem MRRGE and Theorem CFNLT to T.

Theorem GESD gives us a decomposition of V into generalized eigenspaces, one for each distinct eigenvalue. Since these generalized eigenspaces are invariant relative to T, this provides a block diagonal matrix representation where each block is the matrix representation of the restriction of T to the generalized eigenspace.

Restricting T to a generalized eigenspace results in a “nearly nilpotent” linear transformation, as stated more precisely in Theorem RGEN. We unravel Theorem RGEN in the proof of Theorem MRRGE so that we can apply Theorem CFNLT about representations of nilpotent linear transformations.

We know the dimension of a generalized eigenspace is the algebraic multiplicity of the eigenvalue (Theorem DGES), so the blocks associated with the generalized eigenspaces are square with a size equal to the algebraic multiplicity. In refining the basis for this block, and producing Jordan blocks, the results of Theorem CFNLT apply. The total number of blocks will be the nullity of T|_{G_T(λ)} − λI_{G_T(λ)}, which is the geometric multiplicity of λ as an eigenvalue of T (Definition GME). The largest of the Jordan blocks will have size equal to the index of the nilpotent linear transformation T|_{G_T(λ)} − λI_{G_T(λ)}, which is exactly the definition of the index of the eigenvalue λ (Definition IE).

Before we do some examples of this result, notice how close Jordan canonical form is to a diagonal matrix. Or, equivalently, notice how close we have come to diagonalizing a matrix (Definition DZM). We have a matrix representation which has diagonal entries that are the eigenvalues of a matrix. Each occurs on the diagonal as many times as the algebraic multiplicity. However, when the geometric multiplicity is strictly less than the algebraic multiplicity, we have some entries in the representation just above the diagonal (the “superdiagonal”). Furthermore, we have some idea how often this happens if we know the geometric multiplicity and the index of the eigenvalue.

We now recognize just how simple a diagonalizable linear transformation really is. For each eigenvalue, the generalized eigenspace is just the regular eigenspace, and it decomposes into a direct sum of one-dimensional subspaces, each spanned by a different eigenvector chosen from a basis of eigenvectors for the eigenspace.

Some authors create matrix representations of nilpotent linear transformations where the Jordan block has the ones just below the diagonal (the “subdiagonal”). No matter, it is really the same, just different. We have also defined Jordan canonical form to place blocks for the larger eigenvalues earlier, and for blocks with the same eigenvalue, we place the bigger ones earlier. This is fairly standard, but there is no reason we couldn’t order the blocks differently. It’d be the same, just different. The reason for choosing some ordering is to be assured that there is just one canonical matrix representation for each linear transformation.
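The entire construction is automated in a computer algebra system. A sketch using SymPy’s jordan_form (the example matrix is hypothetical; note too that SymPy’s ordering of the blocks need not match Definition JCF):

```python
# jordan_form returns P and J with A = P*J*P^{-1} and J in Jordan form.
import sympy as sp

A = sp.Matrix([[5, 4, 2, 1],
               [0, 1, -1, -1],
               [-1, -1, 3, 0],
               [1, 1, -1, 2]])
P, J = A.jordan_form()

assert sp.simplify(P * J * P.inv() - A) == sp.zeros(4, 4)
print(J)   # block diagonal, one block per chain of generalized eigenvectors
```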

Example JCF10
Jordan canonical form, size 10
Suppose that T : ℂ^10 → ℂ^10 is the linear transformation defined by T(x) = Ax where

A = 6 9 75 5 12 22 14 8 21 3 5 31 2 7 12 9 1 12 8 9 8 6 0 14 25 13426 7 9 75 0 13 23 13 2 24 0 1 0 13 2 3 4 2 3 3 2 1 2 9 1 1 5 5 5 1 3 32 4 3 6 4 4 3 3 4 3 2 1 5 9 5 1 9 0 2 0 0 2 2 4 4 2 4 4 4 541 6 11 4 1 10

We’ll find a basis for 10 that will yield a matrix representation of T in Jordan canonical form. First we find the eigenvalues, and their multiplicities, with the techniques of Chapter E.

λ = 2:  α_T(2) = 2,  γ_T(2) = 2
λ = 0:  α_T(0) = 3,  γ_T(0) = 2
λ = −1:  α_T(−1) = 5,  γ_T(−1) = 2

For each eigenvalue, we can compute a generalized eigenspace. By Theorem GESD we know that ℂ^10 will decompose into a direct sum of these eigenspaces, and we can restrict T to each part of this decomposition. At this stage we know that the Jordan canonical form will be block diagonal with blocks of size 2, 3 and 5, since the dimensions of the generalized eigenspaces are equal to the algebraic multiplicities of the eigenvalues (Theorem DGES). The geometric multiplicities tell us how many Jordan blocks occupy each of the three larger blocks, but we will discuss this as we analyze each eigenvalue. We do not yet know the index of each eigenvalue (though we can easily infer it for λ = 2) and even if we did have this information, it only determines the size of the largest Jordan block (per eigenvalue). We will press ahead, considering each eigenvalue one at a time.

The eigenvalue λ = 2 has “full” geometric multiplicity, and is not an impediment to diagonalizing T. We will treat it in full generality anyway. First we compute the generalized eigenspace. Since Theorem GEK says that G_T(2) = K((T − 2I_{ℂ^10})^{10}), we can compute this generalized eigenspace as a null space derived from the matrix A,

A 2I10 10  RREF 1000000021 0100000011 00100000 1 2 0001000012 00001000 1 0 000001002 1 000000101 0 00000001 0 1 00000000 0 0 00000000 0 0 GT 2 = KA 2I10 10 = 2 1 1 1 1 2 1 0 1 0 , 1 1 2 2 0 1 0 1 0 1

The restriction of T to G_T(2), relative to the two basis vectors above, has a matrix representation that is a 2 × 2 diagonal matrix with the eigenvalue λ = 2 as the diagonal entries. So these two vectors will be the first two vectors in our basis for ℂ^10,

v1 = 2 1 1 1 1 2 1 0 1 0 v2 = 1 1 2 2 0 1 0 1 0 1

Notice that it was not strictly necessary to compute the 10th power of A − 2I_{10}. With α_T(2) = γ_T(2), the null space of the matrix A − 2I_{10} contains all of the generalized eigenvectors of T for the eigenvalue λ = 2. But there was no harm in computing the 10th power either. This discussion is equivalent to the observation that the linear transformation T|_{G_T(2)} − 2I_{G_T(2)} is nilpotent of index 1. In other words, ι_T(2) = 1.

The eigenvalue λ = 0 will not be quite as simple, since the geometric multiplicity is strictly less than the algebraic multiplicity. As before, we first compute the generalized eigenspace. Since Theorem GEK says that G_T(0) = K((T − 0I_{ℂ^10})^{10}), we can compute this generalized eigenspace as a null space derived from the matrix A,

A 0I10 10  RREF 100000 0 011 010000101 0 001000 0 0 1 2 000100 0 021 000010 0 0 1 0 000001101 2 000000 0 1 1 0 000000 0 0 0 0 000000 0 0 0 0 000000 0 0 0 0 GT 0 = KA 0I10 10 = 0 1 0 0 0 1 1 0 0 0 , 1 1 1 2 1 1 0 1 1 0 , 1 0 2 1 0 2 0 0 0 1 = F

So dim(G_T(0)) = 3 = α_T(0), as expected. We will use these three basis vectors for the generalized eigenspace to construct a matrix representation of T|_{G_T(0)}, where F is being defined implicitly as the basis of G_T(0). We construct this representation as usual, applying Definition MR,

ρF T|GT0 0 1 0 0 0 1 1 0 0 0 = ρF 1 0 2 1 0 2 0 0 0 1 = ρF (1) 1 0 2 1 0 2 0 0 0 1 = 0 0 1 ρF T|GT0 1 1 1 2 1 1 0 1 1 0 = ρF 1 0 2 1 0 2 0 0 0 1 = ρF (1) 1 0 2 1 0 2 0 0 0 1 = 0 0 1 ρF T|GT0 1 0 2 1 0 2 0 0 0 1 = ρF 0 0 0 0 0 0 0 0 0 0 = 0 0 0

So we have the matrix representation

M = MF,F T|GT0 = 0 00 0 00 110

By Theorem RGEN we can obtain a nilpotent matrix from this matrix representation by subtracting the eigenvalue from the diagonal elements, and then we can apply Theorem CFNLT to M − (0)I_3. First check that (M − (0)I_3)^2 = O, so we know that the index of M − (0)I_3 as a nilpotent matrix is 2, and that therefore λ = 0 is an eigenvalue of T with index 2, ι_T(0) = 2. To determine a basis of ℂ^3 that converts M − (0)I_3 to canonical form, we need the null spaces of the powers of M − (0)I_3. For convenience, set N = M − (0)I_3.

NN1 = 1 1 0 , 0 0 1 NN2 = 1 0 0 , 0 1 0 , 0 0 1 = 3

Then we choose a vector from N(N^2) that is not an element of N(N^1). Any vector with unequal first two entries will fit the bill, say

z2,1 = 1 0 0

where we are employing the notation of Theorem CFNLT. The next step is to multiply this vector by N to get part of the basis for N(N^1),

z1,1 = Nz2,1 = 0 00 0 00 110 1 0 0 = 0 0 1

We need a vector to pair with z_{1,1} that will make a basis for the two-dimensional subspace N(N^1). Examining the basis for N(N^1), we see that a vector with its first two entries equal will do the job.

z1,2 = 1 1 0

Reordering, we find the basis,

C = z1,1,z2,1,z1,2 = 0 0 1 , 1 0 0 , 1 1 0

From this basis, we can get a matrix representation of N (when viewed as a linear transformation) relative to the basis C for ℂ^3,

[ 0 1 0 ]
[ 0 0 0 ]  =  [ J_2(0)  O ]
[ 0 0 0 ]     [ O   J_1(0) ]

Now we add back the eigenvalue λ = 0 to the representation of N to obtain a representation for M. Of course, with an eigenvalue of zero, the change is not apparent, so we won’t display the same matrix again. This is the second block of the Jordan canonical form for T. However, the three vectors in C will not suffice as basis vectors for the domain of T: they have the wrong size! The vectors in C are vectors in the domain of a linear transformation defined by the matrix M. But M was a matrix representation of T|_{G_T(0)} relative to the basis F for G_T(0). We need to “uncoordinatize” each of the basis vectors in C to produce a linear combination of the vectors in F that will be an element of the generalized eigenspace G_T(0). These will be the next three vectors of our final answer, a basis for ℂ^10 that has a pleasing matrix representation.

v3 = ρF 1 0 0 1 = 0 0 1 0 0 0 1 1 0 0 0 + 0 1 1 1 2 1 1 0 1 1 0 + (1) 1 0 2 1 0 2 0 0 0 1 = 1 0 2 1 0 2 0 0 0 1 v4 = ρF 1 1 0 0 = 1 0 1 0 0 0 1 1 0 0 0 + 0 1 1 1 2 1 1 0 1 1 0 + 0 1 0 2 1 0 2 0 0 0 1 = 0 1 0 0 0 1 1 0 0 0 v5 = ρF 1 1 1 0 = 1 0 1 0 0 0 1 1 0 0 0 + 1 1 1 1 2 1 1 0 1 1 0 + 0 1 0 2 1 0 2 0 0 0 1 = 1 2 1 2 1 2 1 1 1 0

Five down, five to go. Basis vectors, that is. λ = −1 is the smallest eigenvalue, but it will require the most computation. First we compute the generalized eigenspace. Since Theorem GEK says that G_T(−1) = K((T − (−1)I_{ℂ^10})^{10}), we can compute this generalized eigenspace as a null space derived from the matrix A,

A (1)I1010  RREF 101010110 1 010010 0 10 0 000110 1 002 000001210 2 000000 0 01 0 000000 0 00 0 000000 0 00 0 000000 0 00 0 000000 0 00 0 000000 0 00 0 GT 1 = KA (1)I10 10 = 1 0 1 0 0 0 0 0 0 0 , 1 1 0 1 1 0 0 0 0 0 , 1 0 0 1 0 2 1 0 0 0 , 1 1 0 0 0 1 0 1 0 0 , 1 0 0 2 0 2 0 0 0 1 = F

So dim(G_T(−1)) = 5 = α_T(−1), as expected. We will use these five basis vectors for the generalized eigenspace to construct a matrix representation of T|_{G_T(−1)}, where F is being recycled and defined now implicitly as the basis of G_T(−1). We construct this representation as usual, applying Definition MR,

ρF T|GT1 1 0 1 0 0 0 0 0 0 0 = ρF 1 0 0 0 0 2 2 0 0 1 = ρF 0 1 0 1 0 0 0 0 0 0 0 + 0 1 1 0 1 1 0 0 0 0 0 + (2) 1 0 0 1 0 2 1 0 0 0 + 0 1 1 0 0 0 1 0 1 0 0 + (1) 1 0 0 2 0 2 0 0 0 1 = 0 0 2 0 1 ρF T|GT1 1 1 0 1 1 0 0 0 0 0 = ρF 7 1 5 3 1 2 4 0 0 3 = ρF (5) 1 0 1 0 0 0 0 0 0 0 + (1) 1 1 0 1 1 0 0 0 0 0 + 4 1 0 0 1 0 2 1 0 0 0 + 0 1 1 0 0 0 1 0 1 0 0 + 3 1 0 0 2 0 2 0 0 0 1 = 5 1 4 0 3 ρF T|GT1 1 0 0 1 0 2 1 0 0 0 = ρF 1 0 1 1 0 0 1 0 0 1 = ρF (1) 1 0 1 0 0 0 0 0 0 0 + 0 1 1 0 1 1 0 0 0 0 0 + 1 1 0 0 1 0 2 1 0 0 0 + 0 1 1 0 0 0 1 0 1 0 0 + 1 1 0 0 2 0 2 0 0 0 1 = 1 0 1 0 1 ρF T|GT1 1 1 0 0 0 1 0 1 0 0 = ρF 1 0 2 2 1 1 1 1 0 2 = ρF 2 1 0 1 0 0 0 0 0 0 0 + (1) 1 1 0 1 1 0 0 0 0 0 + (1) 1 0 0 1 0 2 1 0 0 0 + 1 1 1 0 0 0 1 0 1 0 0 + (2) 1 0 0 2 0 2 0 0 0 1 = 2 1 1 1 2 ρF T|GT1 1 0 0 2 0 2 0 0 0 1 = ρF 7 1 6 5 1 2 6 2 0 6 = ρF 6 1 0 1 0 0 0 0 0 0 0 + (1) 1 1 0 1 1 0 0 0 0 0 + (6) 1 0 0 1 0 2 1 0 0 0 + 2 1 1 0 0 0 1 0 1 0 0 + (6) 1 0 0 2 0 2 0 0 0 1 = 6 1 6 2 6

So we have the matrix representation of the restriction of T (again recycling and redefining the matrix M)

M = MF,F T|GT1 = 0 51 2 6 0 1 0 11 2 4 1 16 0 0 0 1 2 1 3 1 26

By Theorem RGEN we can obtain a nilpotent matrix from this matrix representation by subtracting the eigenvalue from the diagonal elements, and then we can apply Theorem CFNLT to M − (−1)I_5. First check that (M − (−1)I_5)^3 = O, so we know that the index of M − (−1)I_5 as a nilpotent matrix is 3, and that therefore λ = −1 is an eigenvalue of T with index 3, ι_T(−1) = 3. To determine a basis of ℂ^5 that converts M − (−1)I_5 to canonical form, we need the null spaces of the powers of M − (−1)I_5. Again, for convenience, set N = M − (−1)I_5.

NN1 = 1 0 1 0 0 , 3 1 0 2 2 NN2 = 3 1 0 0 0 , 1 0 1 0 0 , 0 0 0 1 0 , 3 0 0 0 1 NN3 = 1 0 0 0 0 , 0 1 0 0 0 , 0 0 1 0 0 , 0 0 0 1 0 , 0 0 0 0 1 = 5

Then we choose a vector from N(N^3) that is not an element of N(N^2). The four basis vectors for N(N^2) sum to a vector with all five entries equal to 1. We will mess with the first entry to create a vector not in N(N^2),

z3,1 = 0 1 1 1 1

where we are employing the notation of Theorem CFNLT. The next step is to multiply this vector by N to get a portion of the basis for N(N^2),

z2,1 = Nz3,1 = 1 51 2 6 0 0 0 11 2 4 2 16 0 0 0 2 2 1 3 1 25 0 1 1 1 1 = 2 2 1 4 3

We have a basis for the two-dimensional subspace N(N^1), and we can add to that the vector z_{2,1}, giving us three of the four basis vectors for N(N^2). These three vectors span the subspace we call Q_2. We need a fourth vector outside of Q_2 to complete a basis of the four-dimensional subspace N(N^2). Check that the vector

z2,2 = 3 1 3 1 1

is an element of N(N^2) that lies outside of the subspace Q_2. This vector was constructed by getting a nice basis for Q_2 and forming a linear combination of this basis that specifies three of the five entries of the result. Of the remaining two entries, one was changed to move the vector outside of Q_2, and this was followed by a change to the remaining entry to place the vector into N(N^2). The vector z_{2,2} is the lone basis vector for the subspace we call R_2.

The remaining two basis vectors are easy to come by. They are the result of applying N to each of the two most recently determined basis vectors,

z1,1 = Nz2,1 = 3 1 0 2 2 z1,2 = Nz2,2 = 3 2 3 4 4

Now we reorder these basis vectors, to arrive at the basis

C = z1,1,z2,1,z3,1,z1,2,z2,2 = 3 1 0 2 2 , 2 2 1 4 3 , 0 1 1 1 1 , 3 2 3 4 4 , 3 1 3 1 1

A matrix representation of N relative to C is

[ 0 1 0 0 0 ]
[ 0 0 1 0 0 ]
[ 0 0 0 0 0 ]  =  [ J_3(0)  O ]
[ 0 0 0 0 1 ]     [ O   J_2(0) ]
[ 0 0 0 0 0 ]

To obtain a matrix representation of M, we add back in the matrix (−1)I_5, placing the eigenvalue back along the diagonal and slightly modifying the Jordan blocks,

[ −1  1  0  0  0 ]
[  0 −1  1  0  0 ]
[  0  0 −1  0  0 ]  =  [ J_3(−1)  O ]
[  0  0  0 −1  1 ]     [ O   J_2(−1) ]
[  0  0  0  0 −1 ]

The basis C yields a pleasant matrix representation for the restriction of the linear transformation T to the generalized eigenspace G_T(−1). However, we must remember that these vectors in ℂ^5 are representations of vectors in ℂ^10 relative to the basis F. Each needs to be “un-coordinatized” before joining our final basis. Here we go,

v6 = ρF 1 3 1 0 2 2 = 3 1 0 1 0 0 0 0 0 0 0 + (1) 1 1 0 1 1 0 0 0 0 0 + 0 1 0 0 1 0 2 1 0 0 0 + 2 1 1 0 0 0 1 0 1 0 0 + (2) 1 0 0 2 0 2 0 0 0 1 = 2 1 3 3 1 2 0 2 0 2 v7 = ρF 1 2 2 1 4 3 = 2 1 0 1 0 0 0 0 0 0 0 + (2) 1 1 0 1 1 0 0 0 0 0 + (1) 1 0 0 1 0 2 1 0 0 0 + 4 1 1 0 0 0 1 0 1 0 0 + (3) 1 0 0 2 0 2 0 0 0 1 = 2 2 2 3 2 0 1 4 0 3 v8 = ρF 1 0 1 1 1 1 = 0 1 0 1 0 0 0 0 0 0 0 + 1 1 1 0 1 1 0 0 0 0 0 + 1 1 0 0 1 0 2 1 0 0 0 + 1 1 1 0 0 0 1 0 1 0 0 + 1 1 0 0 2 0 2 0 0 0 1 = 2 2 0 0 1 1 1 1 0 1 v9 = ρF 1 3 2 3 4 4 = 3 1 0 1 0 0 0 0 0 0 0 + (2) 1 1 0 1 1 0 0 0 0 0 + (3) 1 0 0 1 0 2 1 0 0 0 + 4 1 1 0 0 0 1 0 1 0 0 + (4) 1 0 0 2 0 2 0 0 0 1 = 4 2 3 3 2 2 3 4 0 4 v10 = ρF 1 3 1 3 1 1 = 3 1 0 1 0 0 0 0 0 0 0 + 1 1 1 0 1 1 0 0 0 0 0 + 3 1 0 0 1 0 2 1 0 0 0 + 1 1 1 0 0 0 1 0 1 0 0 + 1 1 0 0 2 0 2 0 0 0 1 = 3 2 3 2 1 3 3 1 0 1

To summarize, we list the entire basis B = {v_1, v_2, v_3, …, v_{10}},

v1 = 2 1 1 1 1 2 1 0 1 0 v2 = 1 1 2 2 0 1 0 1 0 1 v3 = 1 0 2 1 0 2 0 0 0 1 v4 = 0 1 0 0 0 1 1 0 0 0 v5 = 1 2 1 2 1 2 1 1 1 0 v6 = 2 1 3 3 1 2 0 2 0 2 v7 = 2 2 2 3 2 0 1 4 0 3 v8 = 2 2 0 0 1 1 1 1 0 1 v9 = 4 2 3 3 2 2 3 4 0 4 v10 = 3 2 3 2 1 3 3 1 0 1

The resulting matrix representation is

M_{B,B}(T) =
[ 2 0 0 0 0  0  0  0  0  0 ]
[ 0 2 0 0 0  0  0  0  0  0 ]
[ 0 0 0 1 0  0  0  0  0  0 ]
[ 0 0 0 0 0  0  0  0  0  0 ]
[ 0 0 0 0 0  0  0  0  0  0 ]
[ 0 0 0 0 0 −1  1  0  0  0 ]
[ 0 0 0 0 0  0 −1  1  0  0 ]
[ 0 0 0 0 0  0  0 −1  0  0 ]
[ 0 0 0 0 0  0  0  0 −1  1 ]
[ 0 0 0 0 0  0  0  0  0 −1 ]

If you are not inclined to check all of these computations, here are a few that should convince you of the amazing properties of the basis B. Compute the matrix-vector products Av_i, 1 ≤ i ≤ 10. In each case the result will be a vector of the form λv_i + δv_{i−1}, where λ is one of the eigenvalues (you should be able to predict ahead of time which one) and δ ∈ {0, 1}.

Alternatively, if we can write inputs to the linear transformation T as linear combinations of the vectors in B (which we can do uniquely since B is a basis, Theorem VRRB), then the “action” of T is reduced to a matrix-vector product with the exceedingly simple matrix that is the Jordan canonical form. Wow!
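The property claimed for B holds in general: if A = PJP⁻¹ with J in Jordan canonical form, then reading the columns p_i of P as the basis vectors gives Ap_i = λp_i + δp_{i−1}, with δ read off the superdiagonal of J. A sketch with a small hypothetical J (eigenvalues 2, 0, −1 as in the example, but in size 5) and a made-up invertible P:

```python
# Columns of P satisfy A p_i = lam*p_i + delta*p_{i-1}, with delta in {0, 1}.
import sympy as sp

J = sp.diag(sp.Matrix([[2]]),
            sp.Matrix([[0, 1], [0, 0]]),        # J_2(0)
            sp.Matrix([[-1, 1], [0, -1]]))      # J_2(-1)
P = sp.Matrix(5, 5, lambda i, j: 1 if i >= j else 0)   # unit lower triangular
A = P * J * P.inv()

for i in range(5):
    lam = J[i, i]
    delta = J[i - 1, i] if i > 0 else 0         # superdiagonal entry above (i, i)
    rhs = lam * P.col(i) + (delta * P.col(i - 1) if i > 0 else sp.zeros(5, 1))
    assert sp.simplify(A * P.col(i) - rhs) == sp.zeros(5, 1)
print("every column of P behaves like the basis B in the example")
```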

Subsection CHT: Cayley-Hamilton Theorem

Jordan was a French mathematician who was active in the late 1800s. Cayley and Hamilton were 19th-century contemporaries of Jordan from Britain. The theorem that bears their names is perhaps one of the most celebrated in basic linear algebra. While our result applies only to vector spaces and linear transformations with scalars from the set of complex numbers, ℂ, the result is equally true if we restrict our scalars to the real numbers, ℝ. It says that every matrix satisfies its own characteristic polynomial.

Theorem CHT
Cayley-Hamilton Theorem
Suppose A is a square matrix with characteristic polynomial p_A(x). Then p_A(A) = O.

Proof   Suppose B and C are similar matrices via the matrix S, B = S^{−1}CS, and q(x) is any polynomial. Then q(B) is similar to q(C) via S, q(B) = S^{−1} q(C) S. (See Example HPDM for hints on how to convince yourself of this.)
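This opening observation is easy to check computationally, since each power telescopes: B^k = (S⁻¹CS)(S⁻¹CS)⋯(S⁻¹CS) = S⁻¹C^k S. A sketch with made-up matrices (the helper evalm and all values here are ours, not the text’s):

```python
# If B = S^{-1} C S and q is any polynomial, then q(B) = S^{-1} q(C) S.
import sympy as sp

def evalm(coeffs, M):
    """Evaluate a polynomial at a square matrix M.
    coeffs are ascending: coeffs[k] multiplies M**k (constant term times I)."""
    n = M.shape[0]
    return sum((c * M**k for k, c in enumerate(coeffs)), sp.zeros(n, n))

q = [5, -2, 0, 1]                      # q(x) = x^3 - 2x + 5
C = sp.Matrix([[1, 2], [3, 4]])
S = sp.Matrix([[2, 1], [1, 1]])        # invertible: det(S) = 1
B = S.inv() * C * S

assert evalm(q, B) == S.inv() * evalm(q, C) * S
```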

By Theorem JCFLT and Theorem SCB we know A is similar to a matrix, J, in Jordan canonical form. Suppose λ1,λ2,λ3,,λm are the distinct eigenvalues of A (and are therefore the eigenvalues and diagonal entries of J). Then by Theorem EMRCP and Definition AME, we can factor the characteristic polynomial as

p_A(x) = (x − λ_1)^{α_A(λ_1)} (x − λ_2)^{α_A(λ_2)} (x − λ_3)^{α_A(λ_3)} ⋯ (x − λ_m)^{α_A(λ_m)}

On substituting the matrix J we have

p_A(J) = (J − λ_1 I)^{α_A(λ_1)} (J − λ_2 I)^{α_A(λ_2)} (J − λ_3 I)^{α_A(λ_3)} ⋯ (J − λ_m I)^{α_A(λ_m)}

The matrix J − λ_k I will be block diagonal, and the block arising from the generalized eigenspace for λ_k will have zeros along the diagonal. Suitably adjusted for matrices (rather than linear transformations), Theorem RGEN tells us this block is nilpotent. Since the size of this nilpotent block is equal to the algebraic multiplicity of λ_k, the power (J − λ_k I)^{α_A(λ_k)} will have a zero matrix (Theorem KPNLT) in the location of this block.

Repeating this argument for each of the m eigenvalues will place a zero block in some term of the product at every location on the diagonal. The entire product will then be zero blocks on the diagonal, and zero off the diagonal. In other words, it will be the zero matrix. Since A and J are similar, pA A = pA J = O.
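A direct check of the theorem on a small hypothetical matrix, substituting A into its own characteristic polynomial:

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 1],
               [1, 0, -1]])
x = sp.symbols('x')
p = A.charpoly(x)                        # characteristic polynomial of A

# Evaluate p at the matrix A; the constant term multiplies the identity.
coeffs = list(reversed(p.all_coeffs()))  # ascending: coeffs[k] multiplies A**k
pA = sum((c * A**k for k, c in enumerate(coeffs)), sp.zeros(3, 3))

assert pA == sp.zeros(3, 3)              # p_A(A) = O, as the theorem promises
```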