Section FS Four Subsets
There are four natural subsets associated with a matrix. We have met three already: the null space, the column space and the row space. In this section we will introduce a fourth, the left null space. The objective of this section is to describe one procedure that will allow us to find linearly independent sets that span each of these four sets of column vectors. Along the way, we will make a connection with the inverse of a matrix, so Theorem FS will tie together nearly all of this chapter (and the entire course so far).
Subsection LNS Left Null Space
Definition LNS Left Null Space
Suppose $A$ is an $m\times n$ matrix. Then the left null space is defined as $\lns{A}=\nsp{\transpose{A}}\subseteq\complex{m}$.
The left null space will not feature prominently in the sequel, but we can explain its name and connect it to row operations. Suppose $\vect{y}\in\lns{A}$. Then by Definition LNS, $\transpose{A}\vect{y}=\zerovector$. We can then write \begin{align*} \transpose{\zerovector} &=\transpose{\left(\transpose{A}\vect{y}\right)} &&\knowl{./knowls/definition.LNS.knowl}{\text{Definition LNS}}\\ &=\transpose{\vect{y}}\transpose{\left(\transpose{A}\right)} &&\knowl{./knowls/theorem.MMT.knowl}{\text{Theorem MMT}}\\ &=\transpose{\vect{y}}A &&\knowl{./knowls/theorem.TT.knowl}{\text{Theorem TT}} \end{align*}
The product $\transpose{\vect{y}}A$ can be viewed as using the components of $\vect{y}$ as the scalars in a linear combination of the rows of $A$. And the result is a “row vector,” $\transpose{\zerovector}$, that is entirely zeros. When we apply a sequence of row operations to a matrix, each row of the resulting matrix is some linear combination of the rows of the original. These observations tell us that the vectors in the left null space record, as scalars, linear combinations of the rows of $A$ that produce a row of zeros, such as the rows of zeros that appear in the row-reduced version of the matrix. We will see this idea more explicitly in the course of proving Theorem FS.
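Here is a quick computational illustration, a minimal Sage sketch in the spirit of the Sage subsections below; the particular $3\times 4$ matrix is chosen purely for illustration. Sage's `left_kernel()` method returns exactly the left null space, and we can confirm that each basis vector, used as a row of scalars, combines the rows of $A$ into a row of zeros.

```
# an illustrative 3 x 4 matrix of rank 2, so the left null space is nontrivial
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])

LNS = A.left_kernel()                       # the left null space
print(LNS == A.transpose().right_kernel())  # agrees with Definition LNS
print(LNS.basis())

# each vector y in the left null space gives a row vector y^T with y^T A = 0
all((y * A).is_zero() for y in LNS.basis())
```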
Example LNS Left null space
Sage LNS Left Null Spaces
Subsection CCS Computing Column Spaces
We have three ways to build the column space of a matrix. First, we can use just the definition, Definition CSM, and express the column space as a span of the columns of the matrix. A second approach gives us the column space as the span of some of the columns of the matrix, and additionally, this set is linearly independent (Theorem BCS). Finally, we can transpose the matrix, row-reduce the transpose, kick out zero rows, and write the remaining rows as column vectors. Theorem CSRST and Theorem BRS tell us that the resulting vectors are linearly independent and their span is the column space of the original matrix.
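These three descriptions are easy to compare computationally. The following Sage sketch (with a small matrix chosen just for illustration) produces the same subspace three times, although the spanning sets themselves differ.

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])

# Method 1, Definition CSM: the span of all of the columns
cs1 = (QQ^3).span(A.columns())

# Method 2, Theorem BCS: the span of just the pivot columns (a linearly independent set)
cs2 = (QQ^3).span([A.column(j) for j in A.pivots()])

# Method 3, Theorem CSRST and Theorem BRS: row space of the row-reduced transpose
cs3 = A.transpose().rref().row_space()

cs1 == cs2 == cs3
```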
We will now demonstrate a fourth method by way of a rather complicated example. Study this example carefully, but realize that its main purpose is to motivate a theorem that simplifies much of the apparent complexity. So other than an instructive exercise or two, the procedure we are about to describe will not be a usual approach to computing a column space.
Example CSANS Column space as null space
This example motivates the remainder of this section, so it is worth careful study. You might attempt to mimic the second approach with the coefficient matrices of Archetype I and Archetype J. We will see shortly that the matrix $L$ contains more information about $A$ than just the column space.
Sage RRSM Row-Reducing a Symbolic Matrix
Subsection EEF Extended Echelon Form
The final matrix that we row-reduced in Example CSANS should look familiar in most respects to the procedure we used to compute the inverse of a nonsingular matrix, Theorem CINM. We will now generalize that procedure to matrices that are not necessarily nonsingular, or even square. First a definition.
Definition EEF Extended Echelon Form
Suppose $A$ is an $m\times n$ matrix. Extend $A$ on its right side with the addition of an $m\times m$ identity matrix to form an $m\times (n + m)$ matrix $M$. Use row operations to bring $M$ to reduced row-echelon form and call the result $N$. $N$ is the extended reduced row-echelon form of $A$, and we will standardize on names for five submatrices ($B$, $C$, $J$, $K$, $L$) of $N$.
Let $B$ denote the $m\times n$ matrix formed from the first $n$ columns of $N$ and let $J$ denote the $m\times m$ matrix formed from the last $m$ columns of $N$. Suppose that $B$ has $r$ nonzero rows. Further partition $N$ by letting $C$ denote the $r\times n$ matrix formed from all of the nonzero rows of $B$. Let $K$ be the $r\times m$ matrix formed from the first $r$ rows of $J$, while $L$ will be the $(m-r)\times m$ matrix formed from the bottom $m-r$ rows of $J$. Pictorially, \begin{equation*} M=[A\vert I_m] \rref N=[B\vert J] = \left[\begin{array}{c|c}C&K\\\hline0&L\end{array}\right] \end{equation*}
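Here is one way to carry out this construction in Sage, a hedged sketch using a small illustrative matrix: augment with an identity matrix, row-reduce, and slice out the five submatrices by size. (Sage also provides an `extended_echelon_form()` matrix method that produces $N$ in one step.)

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])
m, n = A.nrows(), A.ncols()

M = A.augment(identity_matrix(QQ, m), subdivide=True)   # M = [A | I_m]
N = M.rref()                                             # extended reduced row-echelon form
r = A.rank()                                             # number of nonzero rows of B

B = N[:, :n]      # first n columns of N
J = N[:, n:]      # last m columns of N
C = N[:r, :n]     # the r nonzero rows of B
K = N[:r, n:]     # first r rows of J
L = N[r:, n:]     # last m - r rows of J
N
```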
Example SEEF Submatrices of extended echelon form
Theorem PEEF Properties of Extended Echelon Form
Suppose that $A$ is an $m\times n$ matrix and that $N$ is its extended echelon form. Then
- $J$ is nonsingular.
- $B=JA$.
- If $\vect{x}\in\complex{n}$ and $\vect{y}\in\complex{m}$, then $A\vect{x}=\vect{y}$ if and only if $B\vect{x}=J\vect{y}$.
- $C$ is in reduced row-echelon form, has no zero rows and has $r$ pivot columns.
- $L$ is in reduced row-echelon form, has no zero rows and has $m-r$ pivot columns.
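The first two conclusions are easy to confirm computationally. Here is a hedged Sage sketch, repeating the illustrative matrix from above so the cell stands on its own.

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])
m, n = A.nrows(), A.ncols()
N = A.augment(identity_matrix(QQ, m)).rref()
B, J = N[:, :n], N[:, n:]

print(J.is_invertible())      # J is nonsingular
print(B == J * A)             # B = J A
print(B == A.rref())          # and B is simply the reduced row-echelon form of A
```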
Notice that in the case where $A$ is a nonsingular matrix we know that the reduced row-echelon form of $A$ is the identity matrix (Theorem NMRRI), so $B=I_n$. Then the second conclusion above says $JA=B=I_n$, so $J$ is the inverse of $A$. Thus this theorem generalizes Theorem CINM, though the result is a “left-inverse” of $A$ rather than a “right-inverse.”
The third conclusion of Theorem PEEF is the most telling. It says that $\vect{x}$ is a solution to the linear system $\linearsystem{A}{\vect{y}}$ if and only if $\vect{x}$ is a solution to the linear system $\linearsystem{B}{J\vect{y}}$. Or said differently, if we row-reduce the augmented matrix $\augmented{A}{\vect{y}}$ we will get the augmented matrix $\augmented{B}{J\vect{y}}$. The matrix $J$ tracks the cumulative effect of the row operations that convert $A$ to reduced row-echelon form, here effectively applying them to the vector of constants in a system of equations having $A$ as a coefficient matrix. When $A$ row-reduces to a matrix with zero rows, then $J\vect{y}$ should also have zero entries in the same rows if the system is to be consistent.
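We can watch the third conclusion in action with a short Sage sketch (same illustrative matrix, with a vector of constants manufactured so the system is consistent): row-reducing $\augmented{A}{\vect{y}}$ produces exactly $\augmented{B}{J\vect{y}}$, and a single vector $\vect{x}$ solves both systems.

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])
m, n = A.nrows(), A.ncols()
N = A.augment(identity_matrix(QQ, m)).rref()
B, J = N[:, :n], N[:, n:]

x = vector(QQ, [1, 0, 1, 2])
y = A * x                                        # constants chosen to make a consistent system

print(A.augment(y).rref() == B.augment(J * y))   # [A | y] row-reduces to [B | Jy]
print(A * x == y and B * x == J * y)             # the same x solves both systems
```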
Subsection FS Four Subsets
With all the preliminaries in place we can state our main result for this section. In essence this result will allow us to say that we can find linearly independent sets to use in span constructions for all four subsets (null space, column space, row space, left null space) by analyzing only the extended echelon form of the matrix, and specifically, just the two submatrices $C$ and $L$, which will be ripe for analysis since they are already in reduced row-echelon form (Theorem PEEF).
Theorem FS Four Subsets
Suppose $A$ is an $m\times n$ matrix with extended echelon form $N$, and suppose the reduced row-echelon form of $A$ has $r$ nonzero rows. Let $C$ be the submatrix of $N$ formed from the first $r$ rows and the first $n$ columns, and let $L$ be the submatrix of $N$ formed from the last $m$ columns and the last $m-r$ rows. Then
- The null space of $A$ is the null space of $C$, $\nsp{A}=\nsp{C}$.
- The row space of $A$ is the row space of $C$, $\rsp{A}=\rsp{C}$.
- The column space of $A$ is the null space of $L$, $\csp{A}=\nsp{L}$.
- The left null space of $A$ is the row space of $L$, $\lns{A}=\rsp{L}$.
The first two conclusions of this theorem are nearly trivial. But they set up a pattern of results for $C$ that is reflected in the latter two conclusions about $L$. In total, they tell us that we can compute all four subsets just by finding null spaces and row spaces. This theorem does not tell us exactly how to compute these subsets, but instead simply expresses them as null spaces and row spaces of matrices in reduced row-echelon form without any zero rows ($C$ and $L$). A linearly independent set that spans the null space of a matrix in reduced row-echelon form can be found easily with Theorem BNS. It is an even easier matter to find a linearly independent set that spans the row space of a matrix in reduced row-echelon form with Theorem BRS, especially when there are no zero rows present. So an application of Theorem FS is typically followed by two applications each of Theorem BNS and Theorem BRS.
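Here is the whole recipe in one hedged Sage sketch (again with a small matrix chosen just for illustration): build the extended echelon form, slice out $C$ and $L$, and then the four subsets are two null spaces and two row spaces, each checked against Sage's built-in methods.

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])
m, n = A.nrows(), A.ncols()
N = A.augment(identity_matrix(QQ, m)).rref()
r = A.rank()
C = N[:r, :n]     # r x n, reduced row-echelon form, no zero rows
L = N[r:, n:]     # (m - r) x m, reduced row-echelon form, no zero rows

checks = [
    A.right_kernel() == C.right_kernel(),   # null space of A = null space of C
    A.row_space()    == C.row_space(),      # row space of A = row space of C
    A.column_space() == L.right_kernel(),   # column space of A = null space of L
    A.left_kernel()  == L.row_space(),      # left null space of A = row space of L
]
all(checks)
```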
The situation when $r=m$ deserves comment, since now the matrix $L$ has no rows. What is $\csp{A}$ when we try to apply Theorem FS and encounter $\nsp{L}$? One interpretation of this situation is that $L$ is the coefficient matrix of a homogeneous system that has no equations. How hard is it to find a solution vector to this system? Some thought will convince you that any proposed vector will qualify as a solution, since it makes all of the equations true. So every possible vector is in the null space of $L$ and therefore $\csp{A}=\nsp{L}=\complex{m}$. OK, perhaps this sounds like some twisted argument from Alice in Wonderland. Let us try another argument that might solidly convince you of this logic.
If $r=m$, when we row-reduce the augmented matrix of $\linearsystem{A}{\vect{b}}$ the result will have no zero rows, and each of the $m$ rows will have its leading 1 located among the first $n$ columns, leaving no pivot for the final column, so by Theorem RCLS the system will be consistent. By Theorem CSCS, $\vect{b}\in\csp{A}$. Since $\vect{b}$ was arbitrary, every possible vector is in the column space of $A$, so we again have $\csp{A}=\complex{m}$. The situation when a matrix has $r=m$ is known by the term full rank, and in the case of a square matrix coincides with nonsingularity (see Exercise FS.M50).
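A quick Sage sketch of the $r=m$ case (the $2\times 3$ matrix below is an illustrative choice with full rank): the submatrix $L$ comes out with no rows at all, and the column space has dimension $m$, so it is all of $\complex{m}$.

```
A = matrix(QQ, 2, 3, [1, 0, 1,
                      0, 1, 1])                 # rank 2 = m, so r = m
m, n = A.nrows(), A.ncols()
N = A.augment(identity_matrix(QQ, m)).rref()
r = A.rank()

L = N[r:, n:]                                   # a 0 x 2 matrix: L has no rows
print(L.nrows(), L.ncols())

A.column_space().dimension() == m               # the column space is all of QQ^2
```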
The properties of the matrix $L$ described by this theorem can be explained informally as follows. A column vector $\vect{y}\in\complex{m}$ is in the column space of $A$ if and only if the linear system $\linearsystem{A}{\vect{y}}$ is consistent (Theorem CSCS). By Theorem RCLS, the reduced row-echelon form of the augmented matrix $\augmented{A}{\vect{y}}$ of a consistent system will have zeros in the bottom $m-r$ locations of the last column. By Theorem PEEF this final column is the vector $J\vect{y}$ and so should then have zeros in the final $m-r$ locations. But since $L$ comprises the final $m-r$ rows of $J$, this condition says exactly that $L\vect{y}=\zerovector$, that is, $\vect{y}\in\nsp{L}$.
Additionally, the rows of $J$ are the scalars in linear combinations of the rows of $A$ that create the rows of $B$. That is, the rows of $J$ record the net effect of the sequence of row operations that takes $A$ to its reduced row-echelon form, $B$. This can be seen in the equation $JA=B$ (Theorem PEEF). As such, the rows of $L$ are scalars for linear combinations of the rows of $A$ that yield zero rows. But such linear combinations are precisely the elements of the left null space. So any element of the row space of $L$ is also an element of the left null space of $A$.
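Both observations are easy to confirm with a short Sage sketch (the matrix and the two test vectors are illustrative choices): multiplying by $L$ detects membership in the column space, and each row of $L$ combines the rows of $A$ into a row of zeros.

```
A = matrix(QQ, 3, 4, [1, 2, 1,  1,
                      2, 4, 0,  6,
                      1, 2, 2, -1])
m, n = A.nrows(), A.ncols()
N = A.augment(identity_matrix(QQ, m)).rref()
r = A.rank()
L = N[r:, n:]

y1 = A * vector(QQ, [1, 0, 1, 2])               # certainly in the column space of A
y2 = vector(QQ, [1, 1, 1])                      # happens not to be in the column space of A
print((L * y1).is_zero(), (L * y2).is_zero())   # True, then False

# each row of L holds scalars combining the rows of A into a row of zeros
all((w * A).is_zero() for w in L.rows())
```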
We will now illustrate Theorem FS with a few examples.
Example FS1 Four subsets, no. 1
Example FS2 Four subsets, no. 2
The next example is just a bit different since the matrix has more rows than columns, and a trivial null space.
Example FSAG Four subsets, Archetype G
Sage EEF Extended Echelon Form
Example COV and Example CSROI each describes the column space of the coefficient matrix from Archetype I as the span of a set of $r=3$ linearly independent vectors. It is no accident that these two different sets both have the same size. If we (you?) were to calculate the column space of this matrix using the null space of the matrix $L$ from Theorem FS then we would again find a set of 3 linearly independent vectors that span the column space. More on this later.
So we have three different methods to obtain a description of the column space of a matrix as the span of a linearly independent set. Theorem BCS is sometimes useful since the vectors it specifies are equal to actual columns of the matrix. Theorem BRS and Theorem CSRST combine to create vectors with lots of zeros, and strategically placed 1's near the top of the vector. Theorem FS and the matrix $L$ from the extended echelon form gives us a third method, which tends to create vectors with lots of zeros, and strategically placed 1's near the bottom of the vector. If we do not care about linear independence we can also appeal to Definition CSM and simply express the column space as the span of all the columns of the matrix, giving us a fourth description.
With Theorem CSRST and Definition RSM, we can compute column spaces with theorems about row spaces, and we can compute row spaces with theorems about column spaces, but in each case we must transpose the matrix first. At this point you may be overwhelmed by all the possibilities for computing column and row spaces. Diagram CSRST is meant to help. For both the column space and row space, it suggests four techniques. One is to appeal to the definition, another yields a span of a linearly independent set, and a third uses Theorem FS. A fourth suggests transposing the matrix and the dashed line implies that then the companion set of techniques can be applied. This can lead to a bit of silliness, since if you were to follow the dashed lines twice you would transpose the matrix twice, and by Theorem TT would accomplish nothing productive.
Although we have many ways to describe a column space, notice that one tempting strategy will usually fail. It is not possible to simply row-reduce a matrix directly and then use the columns of the row-reduced matrix as a set whose span equals the column space. In other words, row operations do not preserve column spaces (however row operations do preserve row spaces, Theorem REMRS). See Exercise CRS.M21.
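A tiny Sage computation makes the warning concrete (the $2\times 2$ matrix is an illustrative choice): the matrix and its reduced row-echelon form have different column spaces, even though their row spaces agree.

```
A = matrix(QQ, 2, 2, [1, 1,
                      1, 1])
B = A.rref()                                    # the effect of a sequence of row operations

print(A.column_space() == B.column_space())     # False: the column space changed
print(A.row_space() == B.row_space())           # True: row spaces are preserved (Theorem REMRS)
```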