To print higher-resolution math symbols, click the Hi-Res Fonts for Printing button on the jsMath control panel.
No jsMath TeX fonts found -- using image fonts instead.
These may be slow and might not print well.
Use the jsMath control panel to get additional information.
We have seen in Section IS that generalized eigenspaces are invariant
subspaces that in every instance have led to a direct sum decomposition of the
domain of the associated linear transformation. This allows us to create a block
diagonal matrix representation (Example ISMR4, Example ISMR6). We also
know from Theorem RGEN that the restriction of a linear transformation to a
generalized eigenspace is almost a nilpotent linear transformation. Of
course, we understand nilpotent linear transformations very well from
Section NLT and we have carefully determined a nice matrix representation for
them.
So here is the game plan for the final push. Prove that the domain of a
linear transformation always decomposes into a direct sum of generalized
eigenspaces. We have unravelled Theorem RGEN at Theorem MRRGE so that
we can formulate the matrix representations of the restrictions on the
generalized eigenspaces using our storehouse of results about nilpotent
linear transformations. Arrive at a matrix representation of any linear
transformation that is block diagonal with each block being a Jordan
block.
In Theorem UTMR we were able to show that any linear transformation from
V to
V has
an upper triangular matrix representation (Definition UTM). We will now show
that we can improve on the basis yielding this representation by massaging the
basis so that the matrix representation is also block diagonal. The subspaces
associated with each block will be generalized eigenspaces, so the most general
result will be a decomposition of the domain of a linear transformation into a
direct sum of generalized eigenspaces.
Proof Suppose that dimV=n
and the n (not necessarily
distinct) eigenvalues of T
are ρ1ρ2ρ3…ρn. We begin
with a basis of V
that yields an upper triangular matrix representation, as guaranteed by Theorem UTMR,
B=x1x2x3…xn. Since
the matrix representation is upper triangular, and the eigenvalues of the linear
transformation are the diagonal elements we can choose this basis so that there are then
scalars aij,
1 ≤ j ≤ n,
1 ≤ i ≤ j − 1 such
that
We now define a new basis for V
which is just a slight variation in the basis
B. Choose
any k and
ℓ such
that 1 ≤ k < ℓ ≤ n and
{ρ}_{k}\mathrel{≠}{ρ}_{ℓ}. Define the
scalar α = {a}_{kl}∕\left ({ρ}_{ℓ} − {ρ}_{k}\right ). The
new basis is C = \left \{{y}_{1},\kern 1.95872pt {y}_{2},\kern 1.95872pt {y}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {y}_{n}\right \}
where
We now compute the values of the linear transformation
T with inputs
from C,
noting carefully the changed scalars in the linear combinations of
C
describing the outputs. These changes will translate to minor
changes in the matrix representation built using the basis
C.
There are three cases to consider, depending on which column
of the matrix representation we are examining. First, assume
j < ℓ.
Then
That seems a bit pointless. The first
ℓ − 1 columns of the matrix
representations of T
relative to B
and C
are identical. OK, if that was too easy, here’s the main act. Assume
j = ℓ.
Then
So how different are the matrix representations relative to
B and
C in
column ℓ?
For i > k, the
coefficient of {y}_{i} is
{a}_{ij}, as in the representation
relative to B. It is a
different story for i ≤ k,
where the coefficients of {y}_{i}
may be very different. We are especially interested in the coefficient of
{y}_{k}. In fact, this
whole first part of this proof is about this particular entry of the matrix representation. The
coefficient of {y}_{k}
is
If the definition of α
was a mystery, then no more. In the matrix representation of
T relative to
C, the entry
in column ℓ,
row k
is a zero. Nice. The only price we pay is that other entries in column
ℓ, specifically
rows 1
through k − 1,
may also change in a way we can’t control.
As before, we ask: how different are the matrix representations relative to
B and
C in
column j?
Only {y}_{k}
has a coefficient different from the corresponding coefficient when the basis is
B.
So in the matrix representations, the only entries to change are in row
k, for
columns ℓ + 1
through n.
What have we accomplished? With a change of basis, we can place a zero in a desired
entry (row k,
column ℓ)
of the matrix representation, leaving most of the entries untouched. The only
entries to possibly change are above the new zero entry, or to the right of the new
zero entry. Suppose we repeat this procedure, starting by “zeroing out” the entry
above the diagonal in the second column and first row. Then we move right to the
third column, and zero out the element just above the diagonal in the second row.
Next we zero out the element in the third column and first row. Then tackle the
fourth column, work upwards from the diagonal, zeroing out elements
as we go. Entries above, and to the right will repeatedly change, but
newly created zeros will never get wrecked, since they are below, or just
to the left of the entry we are working on. Similarly the values on the
diagonal do not change either. This entire argument can be retooled in the
language of change-of-basis matrices and similarity transformations, and
this is the approach taken by Noble in his Applied Linear Algebra. It is
interesting to concoct the change-of-basis matrix between the matrices
B and
C and
compute the inverse.
Perhaps you have noticed that we have to be just a bit more
careful than the previous paragraph suggests. The definition of
α has a
denominator that cannot be zero, which restricts our maneuvers to zeroing out entries in
row k and
column ℓ
only when {ρ}_{k}\mathrel{≠}{ρ}_{ℓ}.
So we do not necessarily arrive at a diagonal matrix. More carefully we can
write
where the {b}_{ij}
are our new coefficients after repeated changes, the
{y}_{j} are the new basis vectors,
and the condition “i : \kern 1.95872pt {ρ}_{i} = {ρ}_{j}”
means that we only have terms in the sum involving vectors whose final
coefficients are identical diagonal values (the eigenvalues). Now reorder the basis
vectors carefully. Group together vectors that have equal diagonal entries in the
matrix representation, but within each group preserve the order of the precursor
basis. This grouping will create a block diagonal structure for the matrix
representation, while otherwise preserving the order of the basis will retain the
upper triangular form of the representation. So we can arrive at a basis that yields
a matrix representation that is upper triangular and block diagonal, with the
diagonal entries of each block all equal to a common eigenvalue of the linear
transformation.
More carefully, employing the distinct eigenvalues of
T,
{λ}_{i},
1 ≤ i ≤ m, we can assert there is
a set of basis vectors for V ,
{u}_{ij},
1 ≤ i ≤ m,
1 ≤ j ≤ {α}_{T }\left ({λ}_{i}\right ), such
that
So the subspace {U}_{i} = \left \langle \left \{\left .{u}_{ij}\right \vert 1 ≤ j ≤ {α}_{T }\left ({λ}_{i}\right )\right \}\right \rangle ,
1 ≤ i ≤ m is an invariant
subspace of V
relative to T and
the restriction T{|}_{{U}_{i}}
has an upper triangular matrix representation relative to the basis
\left \{\left .{u}_{ij}\right \vert 1 ≤ j ≤ {α}_{T }\left ({λ}_{i}\right )\right \} where the diagonal
entries are all equal to {λ}_{i}.
Notice too that with this definition,
Whew. This is a good place to take a break, grab a cup of coffee,
use the toilet, or go for a short stroll, before we show that
{U}_{i} is a subspace of the
generalized eigenspace {G}_{T }\left ({λ}_{i}\right ).
This will follow if we can prove that each of the basis vectors for
{U}_{i} is a generalized
eigenvector of T for
{λ}_{i} (Definition GEV). We
need some power of T − {λ}_{i}{I}_{V }
that takes {u}_{ij}
to the zero vector. We prove by induction on
j (Technique I)
the claim that {\left (T − {λ}_{i}{I}_{V }\right )}^{j}\left ({u}_{
ij}\right ) = 0.
For j = 1
we have,
This completes the induction step. Since every vector of the spanning set for
{U}_{i} is an element of
the subspace {G}_{T }\left ({λ}_{i}\right ),
Property AC and Property SC allow us to conclude that
{U}_{i} ⊆{G}_{T }\left ({λ}_{i}\right ). Then by
Definition S, {U}_{i} is
a subspace of {G}_{T }\left ({λ}_{i}\right ).
Notice that this inductive proof could be interpreted to say that every element of
{U}_{i} is a generalized
eigenvector of T for
{λ}_{i}, and the algebraic
multiplicity of {λ}_{i}
is a sufficiently high power to demonstrate this via the definition for each
vector.
We are now prepared for our final argument in this long
proof. We wish to establish that the dimension of the subspace
{G}_{T }\left ({λ}_{i}\right ) is the algebraic
multiplicity of {λ}_{i}. This will
be enough to show that {U}_{i}
and {G}_{T }\left ({λ}_{i}\right )
are equal, and will finally provide the desired direct sum decomposition.
We will prove by induction (Technique I) the following claim. Suppose that
T : V → V is a linear
transformation and B
is a basis for V
that provides an upper triangular matrix representation of
T. The number of
times any eigenvalue λ
occurs on the diagonal of the representation is greater than
or equal to the dimension of the generalized eigenspace
{G}_{T }\left (λ\right ).
We will use the symbol m
for the dimension of V
so as to avoid confusion with our notation for the nullity. So
\mathop{ dim}\nolimits V = m and our proof will proceed
by induction on m. Use
the notation {#}_{T }(λ) to count
the number of times λ
occurs on the diagonal of a matrix representation of
T. We
want to show that
For the base case, \mathop{ dim}\nolimits V = 1. Every
matrix representation of T
is an upper triangular matrix with the lone eigenvalue of
T,
λ, as the single diagonal
entry. So {#}_{T }(λ) = 1. The
generalized eigenspace of λ
is not trivial (since by Theorem GEK it equals the regular
eigenspace), so it cannot be a subspace of dimension zero, and thus
\mathop{ dim}\nolimits \left ({G}_{T }\left (λ\right )\right ) = 1.
Now for the induction step, assume the claim is true for any
linear transformation defined on a vector space with dimension
m − 1 or less.
Suppose that B = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{m}\right \}
is a basis for V
that yields an upper triangular matrix representation for
T with diagonal
entries {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{m}. Then
U = \left \langle \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{m−1}\right \}\right \rangle is a subspace of
V that is invariant
relative to T. The
restriction T{|}_{U}: U → U
is then a linear transformation defined on
U, a vector space of
dimension m − 1. A matrix
representation of T{|}_{U}
relative to the basis C = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{m−1}\right \}
will be an upper triangular matrix with diagonal entries
{λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{m−1}.
We can therefore apply the induction hypothesis to
T{|}_{U} and its representation
relative to C.
Suppose that λ is
any eigenvalue of T.
Then suppose that v ∈K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{m}\right ).
As an element of V ,
we can write v
as a linear combination of the basis elements of
B, or more compactly,
there is a vector u ∈ U
and a scalar α
such that v = u + α{v}_{m}.
Then,
The final expression in this string of equalities is an element of
U since
U is invariant
relative to both T
and {I}_{V }.
The expression at the beginning is a scalar multiple of
{v}_{m},
and as such cannot be a nonzero element of
U without violating the
linear independence of B.
So
The vector {v}_{m} is
nonzero since B
is linearly independent, so Theorem SMEZV tells us that
α{\left ({λ}_{m} − λ\right )}^{m} = 0.
From the properties of scalar multiplication, we are confronted with two
possibilities.
Our first case is that λ\mathrel{≠}{λ}_{m}.
Notice then that λ
occurs the same number of times along the diagonal in the representations of
T{|}_{U} and
T. Now
α = 0 and
v = u + 0{v}_{m} = u. Since
v was chosen as an
arbitrary element of K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{m}\right ),
Definition SSET says that K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{m}\right ) ⊆ U.
It is always the case that K\kern -1.95872pt \left ({\left (T{|}_{U} − λ{I}_{U}\right )}^{m}\right ) ⊆K\kern -1.95872pt \left ({\left (T − λ{I}_{
V }\right )}^{m}\right ).
However, we can also see that in this case, the opposite
set inclusion is true as well. By Definition SE we have
K\kern -1.95872pt \left ({\left (T{|}_{U} − λ{I}_{U}\right )}^{m}\right ) = K\kern -1.95872pt \left ({\left (T − λ{I}_{
V }\right )}^{m}\right ).
Then
The second case is that λ = {λ}_{m}.
Notice then that λ
occurs one more time along the diagonal in the representation of
T compared to the
representation of T{|}_{U}.
Then
So u ∈K\kern -1.95872pt \left ({\left (T{|}_{U} − λ{I}_{U}\right )}^{m}\right ). The vector
v was chosen as an
arbitrary member of K\kern -1.95872pt \left ({\left (T − λ{I}_{V }\right )}^{m}\right ).
From the expression v = u + α{v}_{m}
we can now see v also
as an element of K\kern -1.95872pt \left ({\left (T{|}_{U} − λ{I}_{U}\right )}^{m}\right ) plus
a scalar multiple of {v}_{m}.
This observation yields
In Theorem UTMR we constructed an upper triangular matrix representation of
T where each
eigenvalue occurred {α}_{T }\left (λ\right )
times on the diagonal. So
Besides a nice decomposition into invariant subspaces, this proof has a bonus
for us.
Theorem DGES Dimension of Generalized Eigenspaces Suppose T : V → V is a linear
transformation with eigenvalue λ.
Then the dimension of the generalized eigenspace for
λ is the algebraic
multiplicity of λ,
\mathop{ dim}\nolimits \left ({G}_{T }\left ({λ}_{i}\right )\right ) = {α}_{T }\left ({λ}_{i}\right ).
□
Proof At the very end of the proof of Theorem GESD we obtain the
inequalities
Now we are in a position to define what we (and others) regard as an
especially nice matrix representation. The word “canonical” has at its root, the
word “canon,” which has various meanings. One is the set of laws established by a
church council. Another is a set of writings that are authentic, important or
representative. Here we take it to mean the accepted, or best, representative
among a variety of choices. Every linear transformation admits a variety of
representations, and we will declare one as the best. Hopefully you will
agree.
Definition JCF Jordan Canonical Form A square matrix is in Jordan canonical form if it meets the following
requirements:
The matrix is block diagonal.
Each block is a Jordan block.
If ρ < λ
then the block {J}_{k}\left (ρ\right )
occupies rows with indices greater than the indices of the rows occupied
by {J}_{ℓ}\left (λ\right ).
If ρ = λ
and ℓ < k,
then the block {J}_{ℓ}\left (λ\right )
occupies rows with indices greater than the indices of the rows occupied
by {J}_{k}\left (λ\right ).
△
Theorem JCFLT Jordan Canonical Form for a Linear Transformation Suppose T : V → V
is a linear transformation. Then there is a basis
B for
V such that the matrix
representation of T
with the following properties:
The matrix representation is in Jordan canonical form.
If {J}_{k}\left (λ\right )
is one of the Jordan blocks, then λ
is an eigenvalue of T.
For a fixed value of λ,
the largest block of the form {J}_{k}\left (λ\right )
has size equal to the index of λ,
{ι}_{T }\left (λ\right ).
For a fixed value of λ,
the number of blocks of the form {J}_{k}\left (λ\right )
is the geometric multiplicity of λ,
{γ}_{T }\left (λ\right ).
For a fixed value of λ,
the number of rows occupied by blocks of the form {J}_{k}\left (λ\right )
is the algebraic multiplicity of λ,
{α}_{T }\left (λ\right ).
Theorem GESD gives us a decomposition of
V
into generalized eigenspaces, one for each distinct eigenvalue.
Since these generalized eigenspaces ar invariant relative to
T, this provides
a block diagonal matrix representation where each block is the matrix representation of the
restriction of T
to the generalized eigenspace.
Restricting T
to a generalized eigenspace results in a “nearly nilpotent” linear transformation,
as stated more precisely in Theorem RGEN. We unravel Theorem RGEN in the
proof of Theorem MRRGE so that we can apply Theorem CFNLT about
representations of nilpotent linear transformations.
We know the dimension of a generalized eigenspace is the algebraic
multiplicity of the eigenvalue (Theorem DGES), so the blocks associated with the
generalized eigenspaces are square with a size equal to the algebraic multiplicity.
In refining the basis for this block, and producing Jordan blocks the results of
Theorem CFNLT apply. The total number of blocks will be the nullity of
T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )}, which is the geometric
multiplicity of λ as
an eigenvalue of T
(Definition GME). The largest of the Jordan blocks will have
size equal to the index of the nilpotent linear transformation
T{|}_{{G}_{T}\left (λ\right )} − λ{I}_{{G}_{T}\left (λ\right )},
which is exactly the definition of the index of the eigenvalue
λ (Definition IE).
■
Before we do some examples of this result, notice how close Jordan canonical
form is to a diagonal matrix. Or, equivalently, notice how close we have
come to diagonalizing a matrix (Definition DZM). We have a matrix
representation which has diagonal entries that are the eigenvalues of a matrix.
Each occurs on the diagonal as many times as the algebraic multiplicity.
However, when the geometric multiplicity is strictly less than the algebraic
multiplicity, we have some entries in the representation just above the
diagonal (the “superdiagonal”). Furthermore, we have some idea how often
this happens if we know the geometric multiplicity and the index of the
eigenvalue.
We now recognize just how simple a diagonalizable linear transformation really
is. For each eigenvalue, the generalized eigenspace is just the regular eigenspace,
and it decomposes into a direct sum of one-dimensional subspaces, each
spanned by a different eigenvector chosen from a basis of eigenvectors for the
eigenspace.
Some authors create matrix representations of nilpotent linear transformations
where the Jordan block has the ones just below the diagonal (the “subdiagonal”).
No matter, it is really the same, just different. We have also defined Jordan
canonical form to place blocks for the larger eigenvalues earlier, and for blocks
with the same eigenvalue, we place the bigger ones earlier. This is fairly standard,
but there is no reason we couldn’t order the blocks differently. It’d be
the same, just different. The reason for choosing some ordering is to be
assured that there is just one canonical matrix representation for each linear
transformation.
Example JCF10 Jordan canonical form, size 10 Suppose that T : {ℂ}^{10} → {ℂ}^{10} is the linear
transformation defined by T\left (x\right ) = Ax
where
We’ll find a basis for {ℂ}^{10}
that will yield a matrix representation of
T in
Jordan canonical form. First we find the eigenvalues, and their multiplicities, with
the techniques of Chapter E.
For each eigenvalue, we can compute a generalized eigenspace. By Theorem GESD we
know that {ℂ}^{10}
will decompose into a direct sum of these eigenspaces, and we can restrict
T
to each part of this decomposition. At this stage we know that the
Jordan canonical form will be block diagonal with blocks of size
2,
3 and
5, since
the dimensions of the generalized eigenspaces are equal to the algebraic
multiplicities of the eigenvalues (Theorem DGES). The geometric multiplicities
tell us how many Jordan blocks occupy each of the three larger blocks,
but we will discuss this as we analyze each eigenvalue. We do not yet
know the index of each eigenvalue (though we can easily infer it for
λ = 2) and
even if we did have this information, it only determines the size of the largest
Jordan block (per eigenvalue). We will press ahead, considering each eigenvalue
one at a time.
The eigenvalue λ = 2
has “full” geometric multiplicity, and is not an impediment to diagonalizing
T.
We will treat it in full generality anyway. First we compute
the generalized eigenspace. Since Theorem GEK says that
{G}_{T }\left (2\right ) = K\kern -1.95872pt \left ({\left (T − 2{I}_{{ℂ}^{10}}\right )}^{10}\right ) we
can compute this generalized eigenspace as a null space derived from the matrix
A,
The restriction of T
to {G}_{T }\left (2\right )
relative to the two basis vectors above has a matrix representation that is a
2 × 2 diagonal matrix
with the eigenvalue λ = 2
as the diagonal entries. So these two vectors will be the first two vectors in our basis
for {ℂ}^{10},
Notice that it was not strictly necessary to compute the 10-th power of
A − 2{I}_{10}. With
{α}_{T }\left (2\right ) = {γ}_{T }\left (2\right ) the null space
of the matrix A − 2{I}_{10}
contains all of the generalized eigenvectors of
T for the
eigenvalue λ = 2.
But there was no harm in computing the 10-th power either. This
discussion is equivalent to the observation that the linear transformation
T{|}_{{G}_{T}\left (2\right )}: {G}_{T }\left (2\right ) →{G}_{T }\left (2\right ) is nilpotent
of index 1. In
other words, {ι}_{T }\left (2\right ) = 1.
The eigenvalue λ = 0
will not be quite as simple, since the geometric multiplicity is
strictly less than the geometric multiplicity. As before, we first
compute the generalized eigenspace. Since Theorem GEK says that
{G}_{T }\left (0\right ) = K\kern -1.95872pt \left ({\left (T − 0{I}_{{ℂ}^{10}}\right )}^{10}\right ) we
can compute this generalized eigenspace as a null space derived from the matrix
A,
So \mathop{ dim}\nolimits \left ({G}_{T }\left (0\right )\right ) = 3 = {α}_{T }\left (0\right ),
as expected. We will use these three basis vectors for the
generalized eigenspace to construct a matrix representation of
T{|}_{{G}_{T}\left (0\right )}, where
F is being defined
implicitly as the basis of {G}_{T }\left (0\right ).
We construct this representation as usual, applying Definition MR,
By Theorem RGEN we can obtain a nilpotent matrix from this
matrix representation by subtracting the eigenvalue from the
diagonal elements, and then we can apply Theorem CFNLT to
M − (0){I}_{3}. First check that
{\left (M − (0){I}_{3}\right )}^{2} = O, so we know that the
index of M − (0){I}_{3} as a nilpotent
matrix, and that therefore λ = 0
is an eigenvalue of T
with index 2,
{ι}_{T }\left (0\right ) = 2. To determine
a basis of {ℂ}^{3}
that converts M − (0){I}_{3}
to canonical form, we need the null spaces of the powers of
M − (0){I}_{3}. For
convenience, set N = M − (0){I}_{3}.
Then we choose a vector from N\kern -1.95872pt \left ({N}^{2}\right )
that is not an element of N\kern -1.95872pt \left ({N}^{1}\right ).
Any vector with unequal first two entries will fit the bill, say
where we are employing the notation in Theorem CFNLT. The next step is to multiply this
vector by N to get
part of the basis for N\kern -1.95872pt \left ({N}^{1}\right ),
We need a vector to pair with {z}_{1,1}
that will make a basis for the two-dimensional subspace
N\kern -1.95872pt \left ({N}^{1}\right ). Examining
the basis for N\kern -1.95872pt \left ({N}^{1}\right )
we see that a vector with its first two entries equal will do the job.
Now we add back the eigenvalue λ = 0
to the representation of N to
obtain a representation for M.
Of course, with an eigenvalue of zero, the change is not apparent, so we won’t display
the same matrix again. This is the second block of the Jordan canonical form for
T. However, the
three vectors in C
will not suffice as basis vectors for the domain of
T — they have the wrong
size! The vectors in C
are vectors in the domain of a linear transformation defined by the matrix
M. But
M was a matrix
representation of T{|}_{{G}_{T}\left (0\right )} − 0{I}_{{G}_{T}\left (0\right )}
relative to the basis F
for {G}_{T }\left (0\right ).
We need to “uncoordinatize” each of the basis vectors in
C to produce a linear
combination of vectors in F
that will be an element of the generalized eigenspace
{G}_{T }\left (0\right ).
These will be the next three vectors of our final answer, a basis for
{ℂ}^{10} that
has a pleasing matrix representation.
Five down, five to go. Basis vectors, that is.
λ = −1 is
the smallest eigenvalue, but it will require the most computation. First
we compute the generalized eigenspace. Since Theorem GEK says that
{G}_{T }\left (−1\right ) = K\kern -1.95872pt \left ({\left (T − (−1){I}_{{ℂ}^{10}}\right )}^{10}\right ) we
can compute this generalized eigenspace as a null space derived from the matrix
A,
So \mathop{ dim}\nolimits \left ({G}_{T }\left (−1\right )\right ) = 5 = {α}_{T }\left (−1\right ),
as expected. We will use these five basis vectors for the
generalized eigenspace to construct a matrix representation of
T{|}_{{G}_{T}\left (−1\right )}, where
F
is being recycled and defined now implicitly as the basis of
{G}_{T }\left (−1\right ). We
construct this representation as usual, applying Definition MR,
By Theorem RGEN we can obtain a nilpotent matrix from this
matrix representation by subtracting the eigenvalue from the
diagonal elements, and then we can apply Theorem CFNLT to
M − (−1){I}_{5}. First check that
{\left (M − (−1){I}_{5}\right )}^{3} = O, so we know that the
index of M − (−1){I}_{5} as a nilpotent
matrix, and that therefore λ = −1
is an eigenvalue of T
with index 3,
{ι}_{T }\left (−1\right ) = 3. To determine
a basis of {ℂ}^{5}
that converts M − (−1){I}_{5}
to canonical form, we need the null spaces of the powers of
M − (−1){I}_{5}. Again, for
convenience, set N = M − (−1){I}_{5}.
Then we choose a vector from N\kern -1.95872pt \left ({N}^{3}\right )
that is not an element of N\kern -1.95872pt \left ({N}^{2}\right ). The
sum of the four basis vectors for N\kern -1.95872pt \left ({N}^{2}\right )
sum to a vector with all five entries equal to 1. We will mess with the first entry to create a
vector not in N\kern -1.95872pt \left ({N}^{2}\right ),
where we are employing the notation in Theorem CFNLT. The next step is to multiply this vector
by N to get a portion
of the basis for N\kern -1.95872pt \left ({N}^{2}\right ),
We have a basis for the two-dimensional subspace
N\kern -1.95872pt \left ({N}^{1}\right ) and we can add to that the
vector {z}_{2,1} and we have three
of four basis vectors for N\kern -1.95872pt \left ({N}^{2}\right ).
These three vectors span the subspace we call
{Q}_{2}. We need a fourth
vector outside of {Q}_{2}
to complete a basis of the four-dimensional subspace
N\kern -1.95872pt \left ({N}^{2}\right ).
Check that the vector
is an element of N\kern -1.95872pt \left ({N}^{2}\right ) that lies
outside of the subspace {Q}_{2}.
This vector was constructed by getting a nice basis for
{Q}_{2} and
forming a linear combination of this basis that specifies three of the five entries of the
result. Of the remaining two entries, one was changed to move the vector outside
of {Q}_{2}
and this was followed by a change to the remaining entry to place the vector into
N\kern -1.95872pt \left ({N}^{2}\right ). The
vector {z}_{2,2}
is the lone basis vector for the subspace we call
{R}_{2}.
The remaining two basis vectors are easy to come by. They are the result of
applying N
to each of the two most recently determined basis vectors,
To obtain a matrix representation of M,
we add back in the matrix (−1){I}_{5},
placing the eigenvalue back along the diagonal, and slightly modifying the Jordan
blocks,
The basis C
yields a pleasant matrix representation for the restriction of the linear transformation
T − (−1)I to the generalized
eigenspace {G}_{T }\left (−1\right ).
However, we must remember that these vectors in
{ℂ}^{5} are representations
of vectors in {ℂ}^{10}
relative to the basis F.
Each needs to be “un-coordinatized” before joining our final basis. Here we
go,
To summarize, we list the entire basis
B = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{10}\right \},
If you are not inclined to check all of these computations, here are
a few that should convince you of the amazing properties of the basis
B. Compute the
matrix-vector products A{v}_{i},
1 ≤ i ≤ 10.
In each case the result will be a vector of the form
λ{v}_{i} + δ{v}_{i−1}, where
λ is one
of the eigenvalues (you should be able to predict ahead of time which one) and
δ ∈\left \{0, 1\right \}.
Alternatively, if we can write inputs to the linear transformation
T as linear combinations
of the vectors in B (which
we can do uniquely since B
is a basis, Theorem VRRB), then the “action” of
T is reduced to
a matrix-vector product with the exceedingly simple matrix that is the Jordan canonical
form. Wow! ⊠
Subsection CHT: Cayley-Hamilton Theorem
Jordan was a French mathematician who was active in the late 1800’s. Cayley
and Hamilton were 19th-century contemporaries of Jordan from Britain. The
theorem that bears their names is perhaps one of the most celebrated in
basic linear algebra. While our result applies only to vector spaces and
linear transformations with scalars from the set of complex numbers,
ℂ,
the result is equally true if we restrict our scalars to the real numbers,
{ℝ}^{}. It
says that every matrix satisfies its own characteristic polynomial.
Theorem CHT Cayley-Hamilton Theorem Suppose A
is a square matrix with characteristic polynomial
{p}_{A}\left (x\right ). Then
{p}_{A}\left (A\right ) = O.
□
Proof Suppose B and
C are similar matrices
via the matrix S,
B = {S}^{−1}CS, and
q(x) is any
polynomial. Then q\left (B\right )
is similar to q\left (C\right )
via S,
q\left (B\right ) = {S}^{−1}q\left (C\right )S. (See
Example HPDM for hints on how to convince yourself of this.)
By Theorem JCFLT and Theorem SCB we know
A is similar to a matrix,
J, in Jordan canonical
form. Suppose {λ}_{1},\kern 1.95872pt {λ}_{2},\kern 1.95872pt {λ}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {λ}_{m} are the
distinct eigenvalues of A
(and are therefore the eigenvalues and diagonal entries of
J).
Then by Theorem EMRCP and Definition AME, we can factor the characteristic
polynomial as
The matrix J − {λ}_{k}I
will be block diagonal, and the block arising from the generalized eigenspace for
{λ}_{k} will
have zeros along the diagonal. Suitably adjusted for matrices (rather than linear
transformations), Theorem RGEN tells us this matrix is nilpotent. Since
the size of this nilpotent matrix is equal to the algebraic multiplicity of
{λ}_{k}, the
power {\left (J − {λ}_{k}I\right )}^{{α}_{A}\left ({λ}_{k}\right )}
will be a zero matrix (Theorem KPNLT) in the location of this block.
Repeating this argument for each of the
m
eigenvalues will place a zero block in some term of the product at every location
on the diagonal. The entire product will then be zero blocks on the diagonal, and
zero off the diagonal. In other words, it will be the zero matrix. Since
A and
J are
similar, {p}_{A}\left (A\right ) = {p}_{A}\left (J\right ) = O.
■