
{\sc\large This section is in draft form}\\ {\sc\large Nearly complete}

\medskip We have seen that some matrices are diagonalizable and some are not. Some authors refer to a non-diagonalizable matrix as \emph{defective}, but we will study such matrices carefully anyway. Examples of such matrices include Example EMMS4, Example HMEM5, and Example CEMS6. Each of these matrices has at least one eigenvalue with geometric multiplicity strictly less than its algebraic multiplicity, and therefore Theorem DMFE tells us these matrices are not diagonalizable.

Given a square matrix $A$, it is likely similar to many, many other matrices. Of all these possibilities, which is the best? “Best” is a subjective term, but we might agree that a diagonal matrix is certainly a very nice choice. Unfortunately, as we have seen, this will not always be possible. What form of a matrix is “next-best”? Our goal, which will take us several sections to reach, is to show that every matrix is similar to a matrix that is “nearly-diagonal” (Section JCF). More precisely, every matrix is similar to a matrix with its eigenvalues on the main diagonal, zeros and ones on the diagonal just above the main diagonal (the “superdiagonal”), and zeros everywhere else. In the language of equivalence relations (see Theorem SER), we are determining a systematic representative for each equivalence class. Such a representative for a set of similar matrices is called a canonical form.

We have just discussed the determination of a canonical form as a question about matrices. However, we know that every square matrix creates a natural linear transformation (Theorem MBLT) and every linear transformation with identical domain and codomain has a square matrix representation for each choice of a basis, with a change of basis creating a similarity transformation (Theorem SCB). So we will state, and prove, theorems using the language of linear transformations on abstract vector spaces, while most of our examples will work with square matrices. You can, and should, mentally translate between the two settings frequently and easily.

\subsect{NLT}{Nilpotent Linear Transformations} We will discover that nilpotent linear transformations are the essential obstacle in a non-diagonalizable linear transformation. So we will study them carefully first, both as objects of inherent mathematical interest and as the objects at the heart of the argument that leads to a pleasing canonical form for any linear transformation. Once we understand these linear transformations thoroughly, we will be able to easily analyze the structure of any linear transformation.

Definition NLT Nilpotent Linear Transformation

Suppose that $\ltdefn{T}{V}{V}$ is a linear transformation and there is an integer $p>0$ such that $\lt{T^p}{\vect{v}}=\zerovector$ for every $\vect{v}\in V$. Then $T$ is called \emph{nilpotent}, and the smallest $p$ for which this condition is met is called the \emph{index} of $T$.

$\square$

Of course, the linear transformation $T$ defined by $\lt{T}{\vect{v}}=\zerovector$ will qualify as nilpotent of index 1. But are there others? (In parallel with Definition NLT, we will say a square matrix is nilpotent of index $p$ when $p$ is the smallest positive integer such that the $p$th power of the matrix is the zero matrix.)
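Before the larger examples below, here is a minimal illustration (not one of the numbered examples in this book). The nonzero matrix
\begin{align*}
N&=\begin{bmatrix}0&1\\0&0\end{bmatrix}
&
N^2&=\begin{bmatrix}0&0\\0&0\end{bmatrix}
\end{align*}
has a zero square, so $N$ is nilpotent of index 2, even though $N$ itself is not the zero matrix.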
Example NM64 Nilpotent matrix, size 6, index 4
Another example.
Example NM62 Nilpotent matrix, size 6, index 2
On a first encounter with the definition of a nilpotent matrix, you might wonder if such a thing was possible at all. That a high power of a nonzero object could be zero is so very different from our experience with scalars that it seems very unnatural. Hopefully the two previous examples were somewhat surprising. But we have seen that matrix algebra does not always behave the way we expect (Example MMNC), and we also now recognize matrix products not just as arithmetic, but as function composition (Theorem MRCLT). We will now turn to some examples of nilpotent matrices which might be more transparent.
Definition JB Jordan Block

Given the scalar $\lambda\in\complexes$, the Jordan block $\jordan{n}{\lambda}$ is the $n\times n$ matrix defined by \begin{align*} \matrixentry{\jordan{n}{\lambda}}{ij} &= \begin{cases} \lambda & i=j\\ 1 & j=i+1\\ 0 & \text{otherwise} \end{cases} \end{align*} (This definition contains Notation JB.)

$\square$
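As a quick illustration of the definition (Example JB4 below displays a block of size 4, so for variety we use size 3 here),
\begin{align*}
\jordan{3}{\lambda}&=\begin{bmatrix}\lambda&1&0\\0&\lambda&1\\0&0&\lambda\end{bmatrix}
\end{align*}
with the scalar $\lambda$ down the main diagonal, ones on the superdiagonal, and zeros everywhere else.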

Example JB4 Jordan block, size 4
We will return to general Jordan blocks later, but in this section we are just interested in Jordan blocks where $\lambda=0$. Here's an example of why we are specializing in these matrices now.
Example NJB5 Nilpotent Jordan block, size 5
We can form combinations of Jordan blocks to build a variety of nilpotent matrices. Simply place Jordan blocks on the diagonal of a matrix with zeros everywhere else, to create a block diagonal matrix.
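Before the larger example below, here is a minimal sketch of this construction (not one of the numbered examples). Placing $\jordan{3}{0}$ and $\jordan{2}{0}$ on the diagonal produces the $5\times 5$ nilpotent matrix
\begin{align*}
\begin{bmatrix}
0&1&0&0&0\\
0&0&1&0&0\\
0&0&0&0&0\\
0&0&0&0&1\\
0&0&0&0&0
\end{bmatrix}
\end{align*}
whose index is 3 (the size of the largest block) and whose lone eigenvalue $\lambda=0$ has geometric multiplicity 2 (the number of blocks).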
Example NM83 Nilpotent matrix, size 8, index 3
It would appear that nilpotent matrices only have zero as an eigenvalue, so the algebraic multiplicity will be the maximum possible. However, by creating block diagonal matrices with Jordan blocks on the diagonal you should be able to attain any desired geometric multiplicity for this lone eigenvalue. Likewise, the size of the largest Jordan block employed will determine the index of the matrix. So nilpotent matrices with various combinations of index and geometric multiplicities are easy to manufacture. The predictable properties of block diagonal matrices in matrix products and eigenvector computations, along with the next theorem, make this possible. You might find Example NJB5 a useful companion to this proof.
Theorem NJB Nilpotent Jordan Blocks
The Jordan block $\jordan{n}{0}$ is nilpotent of index $n$.

Proof

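Informally, the idea behind Theorem NJB is that $\jordan{n}{0}$ acts as a “shift” on the standard unit vectors (written $\vect{e}_{i}$ here): it sends $\vect{e}_1$ to $\zerovector$ and $\vect{e}_i$ to $\vect{e}_{i-1}$ for $i\geq 2$. As a small sketch with $n=4$,
\begin{align*}
\jordan{4}{0}\vect{e}_4&=\vect{e}_3
&
\left(\jordan{4}{0}\right)^{2}\vect{e}_4&=\vect{e}_2
&
\left(\jordan{4}{0}\right)^{3}\vect{e}_4&=\vect{e}_1
&
\left(\jordan{4}{0}\right)^{4}\vect{e}_4&=\zerovector
\end{align*}
so the third power is still nonzero, while the fourth power sends every basis vector to the zero vector, giving index 4.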
\subsect{PNLT}{Properties of Nilpotent Linear Transformations} In this subsection we collect some basic properties of nilpotent linear transformations. After studying the examples in the previous section, some of these will be no surprise.
Theorem ENLT Eigenvalues of Nilpotent Linear Transformations
Suppose that $\ltdefn{T}{V}{V}$ is a nilpotent linear transformation and $\lambda$ is an eigenvalue of $T$. Then $\lambda=0$.

Proof

Paraphrasing, all of the eigenvalues of a nilpotent linear transformation are zero. So in particular, the characteristic polynomial of a nilpotent linear transformation, $T$, on a vector space of dimension $n$, is simply $\charpoly{T}{x}=x^n$.
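As a small check (using the $2\times 2$ matrix from the earlier illustration, not one of the numbered examples), for $N=\begin{bmatrix}0&1\\0&0\end{bmatrix}$ we have
\begin{align*}
\charpoly{N}{x}&=\det\left(N-xI_2\right)=\det\begin{bmatrix}-x&1\\0&-x\end{bmatrix}=x^{2}
\end{align*}
so $\lambda=0$ is the only eigenvalue, and it has the maximum possible algebraic multiplicity, 2.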

The next theorem is not critical for what follows, but it will explain our interest in nilpotent linear transformations. More specifically, it is the first step in backing up the assertion that nilpotent linear transformations are the essential obstacle in a non-diagonalizable linear transformation. Simply put, it says that a nilpotent linear transformation is not diagonalizable, unless it is trivially so.
Theorem DNLT Diagonalizable Nilpotent Linear Transformations
Suppose the linear transformation $\ltdefn{T}{V}{V}$ is nilpotent. Then $T$ is diagonalizable if and only if $T$ is the zero linear transformation.

Proof

So, apart from one trivial case (the zero linear transformation), no nilpotent linear transformation is diagonalizable. It remains to see what is so “essential” about this broad class of non-diagonalizable linear transformations. For this we now turn to a discussion of kernels of powers of nilpotent linear transformations, beginning with a result about general linear transformations that are not necessarily nilpotent.
Theorem KPLT Kernels of Powers of Linear Transformations
Suppose $\ltdefn{T}{V}{V}$ is a linear transformation, where $\dimension{V}=n$. Then there is an integer $m$, $0\leq m\leq n$, such that \begin{align*} \set{\zerovector} &=\krn{T^0} \subsetneq\krn{T^1} \subsetneq\krn{T^2} \subsetneq\cdots \subsetneq\krn{T^m} =\krn{T^{m+1}} =\krn{T^{m+2}} =\cdots \end{align*}

Proof

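To get a feel for the conclusion, consider a small illustration (a sketch, not one of the numbered examples): the linear transformation $\ltdefn{T}{\complexes^2}{\complexes^2}$ defined by $\lt{T}{\vect{x}}=A\vect{x}$ where $A=\begin{bmatrix}0&1\\0&1\end{bmatrix}$. A quick computation gives $A^2=A$, so
\begin{align*}
\set{\zerovector}&=\krn{T^0}
\subsetneq\krn{T^1}
=\krn{T^2}
=\krn{T^3}
=\cdots
\end{align*}
and the chain of kernels stabilizes already at $m=1$, at a one-dimensional subspace. Since $T$ is not nilpotent, the chain never grows to fill all of $\complexes^{2}$.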
We now specialize Theorem KPLT to the case of nilpotent linear transformations, which buys us just a bit more precision in the conclusion.
Theorem KPNLT Kernels of Powers of Nilpotent Linear Transformations
Suppose $\ltdefn{T}{V}{V}$ is a nilpotent linear transformation with index $p$ and $\dimension{V}=n$. Then $0\leq p\leq n$ and \begin{align*} \set{\zerovector} &=\krn{T^0} \subsetneq\krn{T^1} \subsetneq\krn{T^2} \subsetneq\cdots \subsetneq\krn{T^{p}} =\krn{T^{p+1}} =\cdots =V \end{align*}

Proof

The structure of the kernels of powers of nilpotent linear transformations will be crucial to what follows. But immediately we can see a practical benefit. Suppose we are confronted with the question of whether or not an $n\times n$ matrix $A$ is nilpotent. If we don't quickly find a low power that equals the zero matrix, when do we stop trying higher and higher powers? Theorem KPNLT gives us the answer: if we don't see a zero matrix by the time we finish computing $A^n$, then it is never going to happen. We'll now take a look at one example of Theorem KPNLT in action.
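Here is a small sketch of both the theorem and the stopping criterion (again, not one of the numbered examples). Take
\begin{align*}
A&=\begin{bmatrix}0&1&1\\0&0&1\\0&0&0\end{bmatrix}
&
A^2&=\begin{bmatrix}0&0&1\\0&0&0\\0&0&0\end{bmatrix}
&
A^3&=\begin{bmatrix}0&0&0\\0&0&0\\0&0&0\end{bmatrix}
\end{align*}
We reach the zero matrix at the third power, no later than $n=3$ as promised, so $A$ is nilpotent of index $p=3$. Moreover, for the linear transformation $\ltdefn{T}{\complexes^3}{\complexes^3}$ defined by $\lt{T}{\vect{x}}=A\vect{x}$, the kernels grow by one dimension at each step,
\begin{align*}
\set{\zerovector}&=\krn{T^0}
\subsetneq\krn{T^1}
\subsetneq\krn{T^2}
\subsetneq\krn{T^3}
=\complexes^{3}
\end{align*}
with dimensions $0$, $1$, $2$, $3$, exactly the behavior described by Theorem KPNLT.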
Example KPNLT Kernels of powers of a nilpotent linear transformation
\subsect{CFNLT}{Canonical Form for Nilpotent Linear Transformations} Our main purpose in this section is to find a basis so that a nilpotent linear transformation will have a pleasing, nearly-diagonal matrix representation. Of course, we will not have a definition for “pleasing,” nor for “nearly-diagonal.” But the short answer is that our preferred matrix representation will be built up from Jordan blocks, $\jordan{n}{0}$. Here's the theorem. You will find Example CFNLT helpful as you study this proof, since it uses the same notation, and is large enough to (barely) illustrate the full generality of the theorem.
Theorem CFNLT Canonical Form for Nilpotent Linear Transformations
Suppose that $\ltdefn{T}{V}{V}$ is a nilpotent linear transformation of index $p$. Then there is a basis for $V$ so that the matrix representation, $\matrixrep{T}{B}{B}$, is block diagonal with each block being a Jordan block, $\jordan{n}{0}$. The size of the largest block is the index $p$, and the total number of blocks is the nullity of $T$, $\nullity{T}$.

Proof

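To fix the shape of the conclusion in your mind, here is a sketch only, with sizes chosen purely for illustration. A nilpotent linear transformation of index $p=3$ on a vector space of dimension 6 with $\nullity{T}=3$ would, relative to the basis $B$ promised by the theorem, have the block diagonal matrix representation
\begin{align*}
\matrixrep{T}{B}{B}&=
\begin{bmatrix}
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&0&0&0\\
0&0&0&0&1&0\\
0&0&0&0&0&0\\
0&0&0&0&0&0
\end{bmatrix}
\end{align*}
built from the Jordan blocks $\jordan{3}{0}$, $\jordan{2}{0}$ and $\jordan{1}{0}$: the largest block has size 3 (the index) and there are 3 blocks in total (the nullity).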
The proof of Theorem CFNLT is constructive (Proof Technique C), so we can use it to create bases of nilpotent linear transformations with pleasing matrix representations. Recall that Theorem DNLT told us that nilpotent linear transformations are almost never diagonalizable, so this is progress. As we have hinted before, with a nice representation of nilpotent matrices, it will not be difficult to build up representations of other non-diagonalizable matrices. Here is the promised example which illustrates the previous theorem. It is a useful companion to your study of the proof of Theorem CFNLT.
Example CFNLT Canonical form for a nilpotent linear transformation
Notice that constructing interesting examples of matrix representations requires domains with dimensions bigger than just two or three. Going forward, we will see several more large examples.