Section MR Matrix Representations
We have seen that linear transformations whose domain and codomain are vector spaces of columns vectors have a close relationship with matrices (Theorem MBLT, Theorem MLTCV). In this section, we will extend the relationship between matrices and linear transformations to the setting of linear transformations between abstract vector spaces.
Subsection MR Matrix Representations
This is a fundamental definition.
Definition MR. Matrix Representation.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(B=\set{\vectorlist{u}{n}}\) is a basis for \(U\) of size \(n\text{,}\) and \(C\) is a basis for \(V\) of size \(m\text{.}\) Then the matrix representation of \(T\) relative to \(B\) and \(C\) is the \(m\times n\) matrix,
Example OLTTR. One linear transformation, three representations.
Consider the linear transformation, \(\ltdefn{S}{P_3}{M_{22}}\text{,}\) given by
First, we build a representation relative to the bases,
We evaluate \(S\) with each element of the basis for the domain, \(B\text{,}\) and coordinatize the result relative to the vectors in the basis for the codomain, \(C\text{.}\) Notice here how we take elements of vector spaces and decompose them into linear combinations of basis elements as the key step in constructing coordinatizations of vectors. There is a system of equations involved almost every time, but we will omit these details since this should be a routine exercise at this stage. We have
Thus, employing Definition MR
Often we use “nice” bases to build matrix representations and the work involved is much easier. Suppose we take bases
The evaluation of \(S\) at the elements of \(D\) is easy and coordinatization relative to \(E\) can be done on sight,
So the matrix representation of \(S\) relative to \(D\) and \(E\) is
One more time, but now let us use bases
and evaluate \(S\) with the elements of \(F\text{,}\) then coordinatize the results relative to \(G\text{,}\)
So we arrive at an especially economical matrix representation,
We may choose to use whatever terms we want when we make a definition. Some are arbitrary, while others make sense, but only in light of subsequent theorems. Matrix representation is in the latter category. We begin with a linear transformation and produce a matrix. So what? Here is the theorem that justifies the term matrix representation.
Theorem FTMR. Fundamental Theorem of Matrix Representation.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(B\) is a basis for \(U\text{,}\) \(C\) is a basis for \(V\) and \(\matrixrep{T}{B}{C}\) is the matrix representation of \(T\) relative to \(B\) and \(C\text{.}\) Then, for any \(\vect{u}\in U\text{,}\)
or equivalently
Proof.
Let \(B=\set{\vectorlist{u}{n}}\) be the basis of \(U\text{.}\) Since \(\vect{u}\in U\text{,}\) there are scalars \(\scalarlist{a}{n}\) such that
Then,
The alternative conclusion is obtained as
This theorem says that we can apply \(T\) to \(\vect{u}\) and coordinatize the result relative to \(C\) in \(V\text{,}\) or we can first coordinatize \(\vect{u}\) relative to \(B\) in \(U\text{,}\) then multiply by the matrix representation. Either way, the result is the same. So the effect of a linear transformation can always be accomplished by a matrix-vector product (Definition MVP). That is important enough to say again. The effect of a linear transformation is a matrix-vector product.
The alternative conclusion of this result might be even more striking. It says that to effect a linear transformation (\(T\)) of a vector (\(\vect{u}\)), coordinatize the input (with \(\vectrepname{B}\)), do a matrix-vector product (with \(\matrixrep{T}{B}{C}\)), and un-coordinatize the result (with \(\vectrepinvname{C}\)). So, absent some bookkeeping about vector representations, a linear transformation is a matrix. To adjust the diagram, we “reverse” the arrow on the right, which means inverting the vector representation \(\vectrepname{C}\) on \(V\text{.}\) Now we can go directly across the top of the diagram, computing the linear transformation between the abstract vector spaces. Or, we can around the other three sides, using vector representation, a matrix-vector product, followed by un-coordinatization.
Here is an example to illustrate how the “action” of a linear transformation can be effected by matrix multiplication.
Example ALTMM. A linear transformation as matrix multiplication.
In Example OLTTR we found three representations of the linear transformation \(S\text{.}\) In this example, we will compute a single output of \(S\) in four different ways. First “normally,” then three times over using Theorem FTMR.
Choose \(p(x)=3-x+2x^2-5x^3\text{,}\) for no particular reason. Then the straightforward application of \(S\) to \(p(x)\) yields
Now use the representation of \(S\) relative to the bases \(B\) and \(C\) and Theorem FTMR. Note that we will employ the following linear combination in moving from the second line to the third,
We have
Again, but now with “nice” bases like \(D\) and \(E\text{,}\) and the computations are more transparent. We have
OK, last time, now with the bases \(F\) and \(G\text{.}\) The coordinatizations will take some work this time, but the matrix-vector product (Definition MVP) (which is the actual action of the linear transformation) will be especially easy, given the diagonal nature of the matrix representation, \(\matrixrep{S}{F}{G}\text{.}\) Here we go,
This example is not meant to necessarily illustrate that any one of these four computations is simpler than the others. Instead, it is meant to illustrate the many different ways we can arrive at the same result, with the last three all employing a matrix representation to effect the linear transformation.
We will use Theorem FTMR frequently in the next few sections. A typical application will feel like the linear transformation \(T\) “commutes” with a vector representation, \(\vectrepname{C}\text{,}\) and as it does the transformation morphs into a matrix, \(\matrixrep{T}{B}{C}\text{,}\) while the vector representation changes to a new basis, \(\vectrepname{B}\text{.}\) Or vice-versa.
Sage MR. Matrix Representations.
It is very easy to create matrix representations of linear transformations in Sage. Our examples in the text have used abstract vector spaces to make it a little easier to keep track of the domain and codomain. With Sage we will have to consistently use variants of QQ^n
, but we will use non-obvious bases to make it nontrivial and interesting. Here are the components of our first example, one linear function, two vector spaces, four bases.
We create alternate versions of the domain and codomain by providing them with bases other than the standard ones. Then we build the linear transformation and ask for its .matrix()
. We will use numerals to distinguish our two examples.
Now, we do it again, but with the bases D
and E
.
So once the pieces are in place, it is quite easy to obtain a matrix representation. You might experiment with examples from the text, by first mentally converting elements of abstract vector spaces into column vectors via vector representations using simple bases for the abstract spaces.
The linear transformations T1
and T2
are not different as functions, despite the fact that Sage treats them as unequal (since they have different matrix representations). We can check that the two functions behave identically, first with random testing. Repeated execution of the following compute cell should always produce True
.
A better approach would be to see if T1
and T2
agree on a basis for the domain, and to avoid any favoritism, we will use the basis of U
itself. Convince yourself that this is a proper application of Theorem LTDB to demonstrate equality.
Or we can just ask if they are equal functions (a method that is implemented using the previous idea about agreeing on a basis).
We can also demonstrate Theorem FTMR — we will use the representation for T1
. We need a couple of vector representation linear transformations.
Now we experiment with a “random” element of the domain. Study the third computation carefully, as it is the Sage version of Theorem FTMR. You might compute some of the pieces of this expression to build up the final expression, and you might duplicate the experiment with a different input and/or with the T2
version.
One more time, but we will replace the vector representation linear transformations with Sage conveniences. Recognize that the .linear_combination_of_basis()
method is the inverse of vector representation (it is “un-coordinatization”).
We have concentrated here on the first version of the conclusion of Theorem FTMR. You could experiment with the second version in a similar manner. Extra credit: what changes do you need to make in any of these experiments if you remove the side='right'
option?
Subsection NRFO New Representations from Old
In Subsection LT.NLTFO we built new linear transformations from other linear transformations. Sums, scalar multiples and compositions. These new linear transformations will have matrix representations as well. How do the new matrix representations relate to the old matrix representations? Here are the three theorems.
Theorem MRSLT. Matrix Representation of a Sum of Linear Transformations.
Suppose that \(\ltdefn{T}{U}{V}\) and \(\ltdefn{S}{U}{V}\) are linear transformations, \(B\) is a basis of \(U\) and \(C\) is a basis of \(V\text{.}\) Then
Proof.
Let \(\vect{x}\) be any vector in \(\complex{n}\text{.}\) Define \(\vect{u}\in U\) by \(\vect{u}=\vectrepinv{B}{\vect{x}}\text{,}\) so \(\vect{x}=\vectrep{B}{\vect{u}}\text{.}\) Then,
Since the matrices \(\matrixrep{T+S}{B}{C}\) and \(\matrixrep{T}{B}{C}+\matrixrep{S}{B}{C}\) have equal matrix-vector products for every vector in \(\complex{n}\text{,}\) by Theorem EMMVP they are equal matrices. (Now would be a good time to double-back and study the proof of Theorem EMMVP. You did promise to come back to this theorem sometime, didn't you?)
Theorem MRMLT. Matrix Representation of a Multiple of a Linear Transformation.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(\alpha\in\complexes\text{,}\) \(B\) is a basis of \(U\) and \(C\) is a basis of \(V\text{.}\) Then
Proof.
Let \(\vect{x}\) be any vector in \(\complex{n}\text{.}\) Define \(\vect{u}\in U\) by \(\vect{u}=\vectrepinv{B}{\vect{x}}\text{,}\) so \(\vect{x}=\vectrep{B}{\vect{u}}\text{.}\) Then,
Since the matrices \(\matrixrep{\alpha T}{B}{C}\) and \(\alpha\matrixrep{T}{B}{C}\) have equal matrix-vector products for every vector in \(\complex{n}\text{,}\) by Theorem EMMVP they are equal matrices.
The vector space of all linear transformations from \(U\) to \(V\) is now isomorphic to the vector space of all \(m\times n\) matrices.
Theorem MRCLT. Matrix Representation of a Composition of Linear Transformations.
Suppose that \(\ltdefn{T}{U}{V}\) and \(\ltdefn{S}{V}{W}\) are linear transformations, \(B\) is a basis of \(U\text{,}\) \(C\) is a basis of \(V\text{,}\) and \(D\) is a basis of \(W\text{.}\) Then
Proof.
Let \(\vect{x}\) be any vector in \(\complex{n}\text{.}\) Define \(\vect{u}\in U\) by \(\vect{u}=\vectrepinv{B}{\vect{x}}\text{,}\) so \(\vect{x}=\vectrep{B}{\vect{u}}\text{.}\) Then,
Since the matrices \(\matrixrep{\compose{S}{T}}{B}{D}\) and \(\matrixrep{S}{C}{D}\matrixrep{T}{B}{C}\) have equal matrix-vector products for every vector in \(\complex{n}\text{,}\) by Theorem EMMVP they are equal matrices.
This is the second great surprise of introductory linear algebra. Matrices are linear transformations (functions, really), and matrix multiplication is function composition! We can form the composition of two linear transformations, then form the matrix representation of the result. Or we can form the matrix representation of each linear transformation separately, then multiply the two representations together via Definition MM. In either case, we arrive at the same result.
Example MPMR. Matrix product of matrix representations.
Consider the two linear transformations,
and bases for \(\complex{2}\text{,}\) \(P_2\) and \(M_{22}\) (respectively),
Begin by computing the new linear transformation that is the composition of \(T\) and \(S\) (Definition LTC, Theorem CLTLT), \(\ltdefn{\left(\compose{S}{T}\right)}{\complex{2}}{M_{22}}\text{,}\)
Now compute the matrix representations (Definition MR) for each of these three linear transformations (\(T\text{,}\) \(S\text{,}\) \(\compose{S}{T}\)), relative to the appropriate bases. First for \(T\text{,}\)
So we have the matrix representation of \(T\text{,}\)
Now, a representation of \(S\text{,}\)
So we have the matrix representation of \(S\text{,}\)
Finally, a representation of \(\compose{S}{T}\text{,}\)
So we have the matrix representation of \(\compose{S}{T}\text{,}\)
Now, we are all set to verify the conclusion of Theorem MRCLT,
We have intentionally used nonstandard bases. If you were to choose “nice” bases for the three vector spaces, then the result of the theorem might be rather transparent. But this would still be a worthwhile exercise — give it a go.
A diagram, similar to ones we have seen earlier, might make the importance of this theorem clearer.
One of our goals in the first part of this book is to make the definition of matrix multiplication (Definition MVP, Definition MM) seem as natural as possible. However, many of us are brought up with an entry-by-entry description of matrix multiplication (Theorem EMP) as the definition of matrix multiplication, and then theorems about columns of matrices and linear combinations follow from that definition. With this unmotivated definition, the realization that matrix multiplication is function composition is quite remarkable. It is an interesting exercise to begin with the question, “What is the matrix representation of the composition of two linear transformations?” and then, without using any theorems about matrix multiplication, finally arrive at the entry-by-entry description of matrix multiplication. Try it yourself (Exercise MR.T80).
Sage SUTH3. Sage Under The Hood, Round 3.
We have seen that Sage is able to add two linear transformations, or multiply a single linear transformation by a scalar. Under the hood, this is accomplished by simply adding matrix representations, or multiplying a matrix by a scalar, according to Theorem MRSLT and Theorem MRMLT (respectively). Theorem MRCLT says linear transformation composition is matrix representation multiplication. Is it still a mystery why we use the symbol *
for linear transformation composition in Sage?
We could do several examples here, but you should now be able to construct them yourselves. We will do just one, linear transformation composition is matrix representation multiplication.
Extra credit: what changes do you need to make if you dropped the side='right'
option on these three matrix representations?
Subsection PMR Properties of Matrix Representations
It will not be a surprise to discover that the kernel and range of a linear transformation are closely related to the null space and column space of the transformation's matrix representation. Perhaps this idea has been bouncing around in your head already, even before seeing the definition of a matrix representation. However, with a formal definition of a matrix representation (Definition MR), and a fundamental theorem to go with it (Theorem FTMR) we can be formal about the relationship, using the idea of isomorphic vector spaces (Definition IVS). Here are the twin theorems.
Theorem KNSI. Kernel and Null Space Isomorphism.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(B\) is a basis for \(U\) of size \(n\text{,}\) and \(C\) is a basis for \(V\text{.}\) Then the kernel of \(T\) is isomorphic to the null space of \(\matrixrep{T}{B}{C}\text{,}\)
Proof.
To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The kernel of the linear transformation \(T\text{,}\) \(\krn{T}\text{,}\) is a subspace of \(U\text{,}\) while the null space of the matrix representation, \(\nsp{\matrixrep{T}{B}{C}}\) is a subspace of \(\complex{n}\text{.}\) The function \(\vectrepname{B}\) is defined as a function from \(U\) to \(\complex{n}\text{,}\) but we can just as well employ the definition of \(\vectrepname{B}\) as a function from \(\krn{T}\) to \(\nsp{\matrixrep{T}{B}{C}}\text{.}\)
We must first insure that if we choose an input for \(\vectrepname{B}\) from \(\krn{T}\) that then the output will be an element of \(\nsp{\matrixrep{T}{B}{C}}\text{.}\) So suppose that \(\vect{u}\in\krn{T}\text{.}\) Then
This says that \(\vectrep{B}{\vect{u}}\in\nsp{\matrixrep{T}{B}{C}}\text{,}\) as desired.
The restriction in the size of the domain and codomain \(\vectrepname{B}\) will not affect the fact that \(\vectrepname{B}\) is a linear transformation (Theorem VRLT), nor will it affect the fact that \(\vectrepname{B}\) is injective (Theorem VRI). Something must be done though to verify that \(\vectrepname{B}\) is surjective. To this end, appeal to the definition of surjective (Definition SLT), and suppose that we have an element of the codomain, \(\vect{x}\in\nsp{\matrixrep{T}{B}{C}}\subseteq\complex{n}\) and we wish to find an element of the domain with \(\vect{x}\) as its image. We now show that the desired element of the domain is \(\vect{u}=\vectrepinv{B}{\vect{x}}\text{.}\) First, verify that \(\vect{u}\in\krn{T}\text{,}\)
Second, verify that the proposed isomorphism, \(\vectrepname{B}\text{,}\) takes \(\vect{u}\) to \(\vect{x}\text{,}\)
\begin{align*} \vectrep{B}{\vect{u}}&=\vectrep{B}{\vectrepinv{B}{\vect{x}}}&&\text{Substitution}\\ &=\lteval{I_{\complex{n}}}{\vect{x}}&& \knowl{./knowl/definition-IVLT.html}{\text{Definition IVLT}}\\ &=\vect{x}&& \knowl{./knowl/definition-IDLT.html}{\text{Definition IDLT}}\text{.} \end{align*}With \(\vectrepname{B}\) demonstrated to be an injective and surjective linear transformation from \(\krn{T}\) to \(\nsp{\matrixrep{T}{B}{C}}\text{,}\) Theorem ILTIS tells us \(\vectrepname{B}\) is invertible, and so by Definition IVS, we say \(\krn{T}\) and \(\nsp{\matrixrep{T}{B}{C}}\) are isomorphic.
Example KVMR. Kernel via matrix representation.
Consider the kernel of the linear transformation, \(\ltdefn{T}{M_{22}}{P_2}\text{,}\) given by
We will begin with a matrix representation of \(T\) relative to the bases for \(M_{22}\) and \(P_2\) (respectively),
Then,
So the matrix representation of \(T\) (relative to \(B\) and \(C\)) is
We know from Theorem KNSI that the kernel of the linear transformation \(T\) is isomorphic to the null space of the matrix representation \(\matrixrep{T}{B}{C}\) and by studying the proof of Theorem KNSI we learn that \(\vectrepname{B}\) is an isomorphism between these null spaces. Rather than trying to compute the kernel of \(T\) using definitions and techniques from Chapter LT we will instead analyze the null space of \(\matrixrep{T}{B}{C}\) using techniques from way back in Chapter V. First row-reduce \(\matrixrep{T}{B}{C}\text{,}\)
So, by Theorem BNS, a basis for \(\nsp{\matrixrep{T}{B}{C}}\) is
We can now convert this basis of \(\nsp{\matrixrep{T}{B}{C}}\) into a basis of \(\krn{T}\) by applying \(\vectrepinvname{B}\) to each element of the basis,
So the set
is a basis for \(\krn{T}\text{.}\) Just for fun, you might evaluate \(T\) with each of these two basis vectors and verify that the output is the zero polynomial (Exercise MR.C10).
An entirely similar result applies to the range of a linear transformation and the column space of a matrix representation of the linear transformation.
Theorem RCSI. Range and Column Space Isomorphism.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(B\) is a basis for \(U\) of size \(n\text{,}\) and \(C\) is a basis for \(V\) of size \(m\text{.}\) Then the range of \(T\) is isomorphic to the column space of \(\matrixrep{T}{B}{C}\text{,}\)
Proof.
To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The range of the linear transformation \(T\text{,}\) \(\rng{T}\text{,}\) is a subspace of \(V\text{,}\) while the column space of the matrix representation, \(\csp{\matrixrep{T}{B}{C}}\) is a subspace of \(\complex{m}\text{.}\) The function \(\vectrepname{C}\) is defined as a function from \(V\) to \(\complex{m}\text{,}\) but we can just as well employ the definition of \(\vectrepname{C}\) as a function from \(\rng{T}\) to \(\csp{\matrixrep{T}{B}{C}}\text{.}\)
We must first insure that if we choose an input for \(\vectrepname{C}\) from \(\rng{T}\) that then the output will be an element of \(\csp{\matrixrep{T}{B}{C}}\text{.}\) So suppose that \(\vect{v}\in\rng{T}\text{.}\) Then there is a vector \(\vect{u}\in U\text{,}\) such that \(\lteval{T}{\vect{u}}=\vect{v}\text{.}\) Consider
This says that \(\vectrep{C}{\vect{v}}\in\csp{\matrixrep{T}{B}{C}}\text{,}\) as desired.
The restriction in the size of the domain and codomain will not affect the fact that \(\vectrepname{C}\) is a linear transformation (Theorem VRLT), nor will it affect the fact that \(\vectrepname{C}\) is injective (Theorem VRI). Something must be done though to verify that \(\vectrepname{C}\) is surjective. This all gets a bit confusing, since the domain of our isomorphism is the range of the linear transformation, so think about your objects as you go. To establish that \(\vectrepname{C}\) is surjective, appeal to the definition of a surjective linear transformation (Definition SLT), and suppose that we have an element of the codomain, \(\vect{y}\in\csp{\matrixrep{T}{B}{C}}\subseteq\complex{m}\) and we wish to find an element of the domain with \(\vect{y}\) as its image. Since \(\vect{y}\in\csp{\matrixrep{T}{B}{C}}\text{,}\) there exists a vector, \(\vect{x}\in\complex{n}\) with \(\matrixrep{T}{B}{C}\vect{x}=\vect{y}\text{.}\)
We now show that the desired element of the domain is \(\vect{v}=\vectrepinv{C}{\vect{y}}\text{.}\) First, verify that \(\vect{v}\in\rng{T}\) by applying \(T\) to \(\vect{u}=\vectrepinv{B}{\vect{x}}\text{,}\)
Second, verify that the proposed isomorphism, \(\vectrepname{C}\text{,}\) takes \(\vect{v}\) to \(\vect{y}\text{,}\)
\begin{align*} \vectrep{C}{\vect{v}}&=\vectrep{C}{\vectrepinv{C}{\vect{y}}}&&\text{Substitution}\\ &=\lteval{I_{\complex{m}}}{\vect{y}}&& \knowl{./knowl/definition-IVLT.html}{\text{Definition IVLT}}\\ &=\vect{y}&& \knowl{./knowl/definition-IDLT.html}{\text{Definition IDLT}}\text{.} \end{align*}With \(\vectrepname{C}\) demonstrated to be an injective and surjective linear transformation from \(\rng{T}\) to \(\csp{\matrixrep{T}{B}{C}}\text{,}\) Theorem ILTIS tells us \(\vectrepname{C}\) is invertible, and so by Definition IVS, we say \(\rng{T}\) and \(\csp{\matrixrep{T}{B}{C}}\) are isomorphic.
Example RVMR. Range via matrix representation.
In this example, we will recycle the linear transformation \(T\) and the bases \(B\) and \(C\) of Example KVMR but now we will compute the range of \(\ltdefn{T}{M_{22}}{P_2}\text{,}\) given by
With bases \(B\) and \(C\text{,}\)
we obtain the matrix representation
We know from Theorem RCSI that the range of the linear transformation \(T\) is isomorphic to the column space of the matrix representation \(\matrixrep{T}{B}{C}\) and by studying the proof of Theorem RCSI we learn that \(\vectrepname{C}\) is an isomorphism between these subspaces. Notice that since the range is a subspace of the codomain, we will employ \(\vectrepname{C}\) as the isomorphism, rather than \(\vectrepname{B}\text{,}\) which was the correct choice for an isomorphism between the null spaces of Example KVMR.
Rather than trying to compute the range of \(T\) using definitions and techniques from Chapter LT we will instead analyze the column space of \(\matrixrep{T}{B}{C}\) using techniques from way back in Chapter M. First row-reduce \(\transpose{\left(\matrixrep{T}{B}{C}\right)}\text{,}\)
Now employ Theorem CSRST and Theorem BRS (there are other methods we could choose here to compute the column space, such as Theorem BCS) to obtain the basis for \(\csp{\matrixrep{T}{B}{C}}\text{,}\)
We can now convert this basis of \(\csp{\matrixrep{T}{B}{C}}\) into a basis of \(\rng{T}\) by applying \(\vectrepinvname{C}\) to each element of the basis,
So the set
is a basis for \(\rng{T}\text{.}\)
Theorem KNSI and Theorem RCSI can be viewed as further formal evidence for the The Coordinatization Principle, though they are not direct consequences.
Figure KRI is meant to suggest Theorem KNSI and Theorem RCSI, in addition to their proofs (and so carry the same notation as the statements of these two theorems). The dashed lines indicate a subspace relationship, with the smaller vector space lower down in the diagram. The central square is highly reminiscent of Figure FTMR. Each of the four vector representations is an isomorphism, so the inverse linear transformation could be depicted with an arrow pointing in the other direction. The four vector spaces across the bottom are familiar from the earliest days of the course, while the four vector spaces across the top are completely abstract. The vector representations that are restrictions (far left and far right) are the functions shown to be invertible representations as the key technique in the proofs of Theorem KNSI and Theorem RCSI. So this diagram could be helpful as you study those two proofs.
Sage LTR. Linear Transformation Restrictions.
Theorem KNSI and Theorem RCSI have two of the most subtle proofs we have seen so far. The conclusion that two vector spaces are isomorphic is established by actually constructing an isomorphism between the vector spaces. To build the isomorphism, we begin with a familar object, a vector representation linear transformation, but the hard work is showing that we can “restrict” the domain and codomain of this function and still arrive at a legitimate (invertible) linear transformation. In an effort to make the proofs more concrete, we will walk through a nontrivial example for Theorem KNSI, and you might try to do the same for Theorem RCSI. (An understanding of this subsection is not needed for the remainder — its second purpose is to demonstrate some of the powerful tools Sage provides.)
Here are the pieces. We build a linear transformation with two different representations, one with respect to standard bases, the other with respect to less-obvious bases.
Now we compute the kernel of the linear transformation using the “plain” version, and the null space of a matrix representation coming from the “fancy” version.
So quite obviously, the kernel of the linear transformation is quite different looking from the null space of the matrix representation. Though it is no accident that they have the same dimension. Now we build the necessary vector representation, and use two Sage commands to “restrict” the function to a smaller domain (the kernel of the linear transformation) and a smaller codomain (the null space of the matrix representation relative to nonstandard bases).
The first success is that the restriction was even created. Sage would recognize if the original linear transformation ever carried an input from the restricted domain to an output that was not contained in the proposed codomain, and would have raised an error in that event. Phew! Guaranteeing this success was the first big step in the proof of Theorem KNSI. Notice that the matrix representation of the restriction is a \(3\times 3\) matrix, since the restriction runs between a domain and codomain that each have dimension 3. These two vector spaces (the domain and codomain of the restriction) have dimension 3 but still contain vectors with 5 entries in their un-coordinatized versions.
The next two steps of the proof show that the restriction is injective (easy in the proof) and surjective (hard in the proof). In Sage, here is the second success,
Verified as invertible, rho_restrict
qualifies as an isomorphism between the linear transformation kernel, K
, and the matrix representation null space, MK
. Only an example, but still very nice. Your turn — can you create a verfication of Theorem RCSI (for this example, or some other nontrivial example you might create yourself)?
Subsection IVLT Invertible Linear Transformations
We have seen, both in theorems and in examples, that questions about linear transformations are often equivalent to questions about matrices. It is the matrix representation of a linear transformation that makes this idea precise. Here is our final theorem that solidifies this connection.
Theorem IMR. Invertible Matrix Representations.
Suppose that \(\ltdefn{T}{U}{V}\) is a linear transformation, \(B\) is a basis for \(U\) and \(C\) is a basis for \(V\text{.}\) Then \(T\) is an invertible linear transformation if and only if the matrix representation of \(T\) relative to \(B\) and \(C\text{,}\) \(\matrixrep{T}{B}{C}\) is an invertible matrix. When \(T\) is invertible,
Proof.
(⇒)
Suppose \(T\) is invertible, so the inverse linear transformation \(\ltdefn{\ltinverse{T}}{V}{U}\) exists (Definition IVLT). Both linear transformations have matrix representations relative to the bases of \(U\) and \(V\text{,}\) namely \(\matrixrep{T}{B}{C}\) and \(\matrixrep{\ltinverse{T}}{C}{B}\) (Definition MR).
Then
And
These two equations show that \(\matrixrep{T}{B}{C}\) and \(\matrixrep{\ltinverse{T}}{C}{B}\) are inverse matrices (Definition MI) and establish that when \(T\) is invertible, then \(\matrixrep{\ltinverse{T}}{C}{B}=\inverse{\left(\matrixrep{T}{B}{C}\right)}\text{.}\)
(⇐)
Suppose now that \(\matrixrep{T}{B}{C}\) is an invertible matrix and hence nonsingular (Theorem NI). We compute the nullity of \(T\text{,}\)
So the kernel of \(T\) is trivial, and by Theorem KILT, \(T\) is injective.
We now compute the rank of \(T\text{,}\)
Since the dimension of the range of \(T\) equals the dimension of the codomain \(V\text{,}\) by Theorem EDYES, \(\rng{T}=V\text{.}\) Which says that \(T\) is surjective by Theorem RSLT.
Because \(T\) is both injective and surjective, by Theorem ILTIS, \(T\) is invertible.
By now, the connections between matrices and linear transformations should be starting to become more transparent, and you may have already recognized the invertibility of a matrix as being tantamount to the invertibility of the associated matrix representation. The next example shows how to apply this theorem to the problem of actually building a formula for the inverse of an invertible linear transformation.
Example ILTVR. Inverse of a linear transformation via a representation.
Consider the linear transformation
If we wish to quickly find a formula for the inverse of \(R\) (presuming it exists), then choosing “nice” bases will work best. So build a matrix representation of \(R\) relative to the bases \(B\) and \(C\text{,}\)
Then,
So a representation of \(R\) is
The matrix \(\matrixrep{R}{B}{C}\) is invertible (as you can check) so we know for sure that \(R\) is invertible by Theorem IMR. Furthermore,
We can use this representation of the inverse linear transformation, in concert with Theorem FTMR, to determine an explicit formula for the inverse itself,
You might look back at Example AIVLT, where we first witnessed the inverse of a linear transformation and recognize that the inverse (\(S\)) was built from using the method of Example ILTVR with a matrix representation of \(T\text{.}\)
Theorem IMILT. Invertible Matrices, Invertible Linear Transformation.
Suppose that \(A\) is a square matrix of size \(n\) and \(\ltdefn{T}{\complex{n}}{\complex{n}}\) is the linear transformation defined by \(\lteval{T}{\vect{x}}=A\vect{x}\text{.}\) Then \(A\) is an invertible matrix if and only if \(T\) is an invertible linear transformation.
Proof.
Choose bases \(B=C=\set{\vectorlist{e}{n}}\) consisting of the standard unit vectors as a basis of \(\complex{n}\) (Theorem SUVB) and build a matrix representation of \(T\) relative to \(B\) and \(C\text{.}\) Then
So then the matrix representation of \(T\text{,}\) relative to \(B\) and \(C\text{,}\) is simply \(\matrixrep{T}{B}{C}=A\text{.}\) With this observation, the proof becomes a specialization of Theorem IMR,
This theorem may seem gratuitous. Why state such a special case of Theorem IMR? Because it adds another condition to our NMEx series of theorems, and in some ways it is the most fundamental expression of what it means for a matrix to be nonsingular — the associated linear transformation is invertible. This is our final update.
Theorem NME9. Nonsingular Matrix Equivalences, Round 9.
Suppose that \(A\) is a square matrix of size \(n\text{.}\) The following are equivalent.
- \(A\) is nonsingular.
- \(A\) row-reduces to the identity matrix.
- The null space of \(A\) contains only the zero vector, \(\nsp{A}=\set{\zerovector}\text{.}\)
- The linear system \(\linearsystem{A}{\vect{b}}\) has a unique solution for every possible choice of \(\vect{b}\text{.}\)
- The columns of \(A\) are a linearly independent set.
- \(A\) is invertible.
- The column space of \(A\) is \(\complex{n}\text{,}\) \(\csp{A}=\complex{n}\text{.}\)
- The columns of \(A\) are a basis for \(\complex{n}\text{.}\)
- The rank of \(A\) is \(n\text{,}\) \(\rank{A}=n\text{.}\)
- The nullity of \(A\) is zero, \(\nullity{A}=0\text{.}\)
- The determinant of \(A\) is nonzero, \(\detname{A}\neq 0\text{.}\)
- \(\lambda=0\) is not an eigenvalue of \(A\text{.}\)
- The linear transformation \(\ltdefn{T}{\complex{n}}{\complex{n}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\) is invertible.
Proof.
By Theorem IMILT, the new addition to this list is equivalent to the statement that \(A\) is invertible, so we can expand Theorem NME8.
Sage NME9. Nonsingular Matrix Equivalences, Round 9.
Our final fact about nonsingular matrices expresses the correspondence between invertible matrices and invertible linear transformations. As a Sage demonstration, we will begin with an invertible linear transformation and examine two matrix representations. We will create the linear transformation with nonstandard bases and then compute its representation relative to standard bases.
We can convert T
to a new representation using standard bases for QQ^4
by computing images of the standard basis.
Understand that any matrix representation of T_symbolic
will have an invertible matrix representation, no matter which bases are used. If you look at the matrix representation of T_standard
and the definition of T_symbolic
the construction of this example will be transparent, especially if you know the random matrix constructor,
Reading Questions MR Reading Questions
1.
Why does Theorem FTMR deserve the moniker “fundamental”?
2.
Find the matrix representation, \(\matrixrep{T}{B}{C}\) of the linear transformation
relative to the bases
3.
What is the second “surprise,” and why is it surprising?
Exercises MR Exercises
C10.
Example KVMR concludes with a basis for the kernel of the linear transformation \(T\text{.}\) Compute the value of \(T\) for each of these two basis vectors. Did you get what you expected?
C20.
Compute the matrix representation of \(T\) relative to the bases \(B\) and \(C\text{,}\)
Apply Definition MR,
These four vectors are the columns of the matrix representation,
C21.
Find a matrix representation of the linear transformation \(T\) relative to the bases \(B\) and \(C\text{,}\)
Applying Definition MR,
So the resulting matrix representation is
C22.
Let \(S_{22}\) be the vector space of \(2\times 2\) symmetric matrices. Build the matrix representation of the linear transformation \(\ltdefn{T}{P_2}{S_{22}}\) relative to the bases \(B\) and \(C\) and then use this matrix representation to compute \(\lteval{T}{3+5x-2x^2}\text{,}\)
Input to \(T\) the vectors of the basis \(B\) and coordinatize the outputs relative to \(C\text{,}\)
Applying Definition MR we have the matrix representation
To compute \(\lteval{T}{3+5x-2x^2}\) employ Theorem FTMR,
You can, of course, check your answer by evaluating \(\lteval{T}{3+5x-2x^2}\) directly.
C25.
Use a matrix representation to determine if the linear transformation \(\ltdefn{T}{P_3}{M_{22}}\) is surjective.
Choose bases \(B\) and \(C\) for the matrix representation,
Input to \(T\) the vectors of the basis \(B\) and coordinatize the outputs relative to \(C\text{,}\)
Applying Definition MR we have the matrix representation
Properties of this matrix representation will translate to properties of the linear transformation. The matrix representation is nonsingular since it row-reduces to the identity matrix (Theorem NMRRI) and therefore has a column space equal to \(\complex{4}\) (Theorem CNMB). The column space of the matrix representation is isomorphic to the range of the linear transformation (Theorem RCSI). So the range of \(T\) has dimension 4, equal to the dimension of the codomain \(M_{22}\text{.}\) By Theorem ROSLT, \(T\) is surjective.
C30.
Find bases for the kernel and range of the linear transformation \(S\) below.
These subspaces will be easiest to construct by analyzing a matrix representation of \(S\text{.}\) Since we can use any matrix representation, we might as well use natural bases that allow us to construct the matrix representation quickly and easily,
then we can practically build the matrix representation on sight,
The first step is to find bases for the null space and column space of the matrix representation. Row-reducing the matrix representation we find,
So by Theorem BNS and Theorem BCS, we have
Now, the proofs of Theorem KNSI and Theorem RCSI tell us that we can apply \(\ltinverse{\vectrepname{B}}\) and \(\vectrepinvname{C}\) (respectively) to “un-coordinatize” and get bases for the kernel and range of the linear transformation \(S\) itself,
C40.
Let \(S_{22}\) be the set of \(2\times 2\) symmetric matrices. Verify that the linear transformation \(R\) is invertible and find \(\ltinverse{R}\text{.}\)
The analysis of \(R\) will be easiest if we analyze a matrix representation of \(R\text{.}\) Since we can use any matrix representation, we might as well use natural bases that allow us to construct the matrix representation quickly and easily,
then we can practically build the matrix representation on sight,
This matrix representation is invertible (it has a nonzero determinant of \(-1\text{,}\) Theorem SMZD, Theorem NI) so Theorem IMR tells us that the linear transformation \(R\) is also invertible. To find a formula for \(\ltinverse{R}\) we compute,
C41.
Prove that the linear transformation \(S\) is invertible. Then find a formula for the inverse linear transformation, \(\ltinverse{S}\text{,}\) by employing a matrix inverse.
First, build a matrix representation of \(S\) (Definition MR). We are free to choose whatever bases we wish, so we should choose ones that are easy to work with, such as
The resulting matrix representation is then
This matrix is invertible since it has a nonzero determinant, so by Theorem IMR the linear transformation \(S\) is invertible. We can use the matrix inverse and Theorem IMR to find a formula for the inverse linear transformation,
C42.
The linear transformation \(\ltdefn{R}{M_{12}}{M_{21}}\) is invertible. Use a matrix representation to determine a formula for the inverse linear transformation \(\ltdefn{\ltinverse{R}}{M_{21}}{M_{12}}\text{.}\)
Choose bases \(B\) and \(C\) for \(M_{12}\) and \(M_{21}\) (respectively),
The resulting matrix representation is
This matrix is invertible (its determinant is nonzero, Theorem SMZD), so by Theorem IMR, we can compute the matrix representation of \(\ltinverse{R}\) with a matrix inverse (Theorem TTMI),
To obtain a general formula for \(\ltinverse{R}\text{,}\) use Theorem FTMR,
C50.
Use a matrix representation to find a basis for the range of the linear transformation \(L\text{.}\)
As usual, build any matrix representation of \(L\text{,}\) most likely using “nice” bases, such as
Then the matrix representation (Definition MR) is,
Theorem RCSI tells us that we can compute the column space of the matrix representation, then use the isomorphism \(\vectrepinvname{C}\) to convert the column space of the matrix representation into the range of the linear transformation. So we first analyze the matrix representation,
With three nonzero rows in the reduced row-echelon form of the matrix, we know the column space has dimension 3. Since \(P_2\) has dimension 3 (Theorem DP), the range must be all of \(P_2\text{.}\) So any basis of \(P_2\) would suffice as a basis for the range. For instance, \(C\) itself would be a correct answer.
A more laborious approach would be to use Theorem BCS and choose the first three columns of the matrix representation as a basis for the range of the matrix representation. These could then be “un-coordinatized” with \(\vectrepinvname{C}\) to yield a (“not nice”) basis for \(P_2\text{.}\)
C51.
Use a matrix representation to find a basis for the kernel of the linear transformation \(L\text{.}\)
C52.
Find a basis for the kernel of the linear transformation \(\ltdefn{T}{P_2}{M_{22}}\text{.}\)
Choose bases \(B\) and \(C\) for the matrix representation,
Input to \(T\) the vectors of the basis \(B\) and coordinatize the outputs relative to \(C\text{,}\)
Applying Definition MR we have the matrix representation
The null space of the matrix representation is isomorphic (via \(\vectrepname{B}\)) to the kernel of the linear transformation (Theorem KNSI). So we compute the null space of the matrix representation by first row-reducing the matrix to,
Employing Theorem BNS we have
We only need to uncoordinatize this one basis vector to get a basis for \(\krn{T}\text{,}\)
M20.
The linear transformation \(D\) performs differentiation on polynomials. Use a matrix representation of \(D\) to find the rank and nullity of \(D\text{.}\)
Build a matrix representation (Definition MR) with the set
employed as a basis of both the domain and codomain. Then
and the resulting matrix representation is
This \((n+1)\times(n+1)\) matrix is very close to being in reduced row-echelon form. Multiply row \(i\) by \(\frac{1}{i}\text{,}\) for \(1\leq i\leq n\text{,}\) to convert it to reduced row-echelon form. From this we can see that matrix representation \(\matrixrep{D}{B}{B}\) has rank \(n\) and nullity \(1\text{.}\) Applying Theorem RCSI and Theorem KNSI tells us that the linear transformation \(D\) will have the same values for the rank and nullity, as well.
M60.
Suppose \(U\) and \(V\) are vector spaces and define a function \(\ltdefn{Z}{U}{V}\) by \(\lteval{Z}{\vect{u}}=\zerovector_{V}\) for every \(\vect{u}\in U\text{.}\) Then Exercise IVLT.M60 asks you to formulate the theorem: \(Z\) is invertible if and only if \(U=\set{\zerovector_U}\) and \(V=\set{\zerovector_V}\text{.}\) What would a matrix representation of \(Z\) look like in this case? How does Theorem IMR read in this case?
M80.
In light of Theorem KNSI and Theorem MRCLT, write a short comparison of Exercise MM.T40 with Exercise ILT.T15.
M81.
In light of Theorem RCSI and Theorem MRCLT, write a short comparison of Exercise CRS.T40 with Exercise SLT.T15.
M82.
In light of Theorem MRCLT and Theorem IMR, write a short comparison of Theorem SS and Theorem ICLT.
M83.
In light of Theorem MRCLT and Theorem IMR, write a short comparison of Theorem NPNF and Exercise IVLT.T40.
T20.
Construct a new solution to Exercise B.T50 along the following outline. From the \(n\times n\) matrix \(A\text{,}\) construct the linear transformation \(\ltdefn{T}{\complex{n}}{\complex{n}}\text{,}\) \(\lteval{T}{\vect{x}}=A\vect{x}\text{.}\) Use Theorem NI, Theorem IMILT and Theorem ILTIS to translate between the nonsingularity of \(A\) and the surjectivity/injectivity of \(T\text{.}\) Then apply Theorem ILTB and Theorem SLTB to connect these properties with bases.
Given the nonsingular \(n\times n\) matrix \(A\text{,}\) create the linear transformation \(\ltdefn{T}{\complex{n}}{\complex{n}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\text{.}\) Then
T40.
Theorem VSLT defines the vector space \(\vslt{U}{V}\) containing all linear transformations with domain \(U\) and codomain \(V\text{.}\) Suppose \(\dimension{U}=n\) and \(\dimension{V}=m\text{.}\) Prove that \(\vslt{U}{V}\) is isomorphic to \(M_{mn}\text{,}\) the vector space of all \(m\times n\) matrices (Example VSM). (Hint: we could have suggested this exercise in Chapter LT, but have postponed it to this section. Why?)
T41.
Theorem VSLT defines the vector space \(\vslt{U}{V}\) containing all linear transformations with domain \(U\) and codomain \(V\text{.}\) Determine a basis for \(\vslt{U}{V}\text{.}\) (Hint: study Exercise MR.T40 first.)
T60.
Create an entirely different proof of Theorem IMILT that relies on Definition IVLT to establish the invertibility of \(T\text{,}\) and that relies on Definition MI to establish the invertibility of \(A\text{.}\)
T80.
Suppose that \(\ltdefn{T}{U}{V}\) and \(\ltdefn{S}{V}{W}\) are linear transformations, and that \(B\text{,}\) \(C\) and \(D\) are bases for \(U\text{,}\) \(V\text{,}\) and \(W\text{.}\) Using only Definition MR define matrix representations for \(T\) and \(S\text{.}\) Using these two definitions, and Definition MR, derive a matrix representation for the composition \(\compose{S}{T}\) in terms of the entries of the matrices \(\matrixrep{T}{B}{C}\) and \(\matrixrep{S}{C}{D}\text{.}\) Explain how you would use this result to motivate a definition for matrix multiplication that is strikingly similar to Theorem EMP.
Suppose that \(B=\set{\vectorlist{u}{m}}\text{,}\) \(C=\set{\vectorlist{v}{n}}\) and \(D=\set{\vectorlist{w}{p}}\text{.}\) For convenience, set \(M=\matrixrep{T}{B}{C}\text{,}\) \(m_{ij}=\matrixentry{M}{ij}\text{,}\) \(1\leq i\leq n\text{,}\) \(1\leq j\leq m\text{,}\) and similarly, set \(N=\matrixrep{S}{C}{D}\text{,}\) \(n_{ij}=\matrixentry{N}{ij}\text{,}\) \(1\leq i\leq p\text{,}\) \(1\leq j\leq n\text{.}\) We want to learn about the matrix representation of \(\ltdefn{\compose{S}{T}}{U}{W}\) relative to \(B\) and \(D\text{.}\) (Zach Jarvis helped with this proof.)
We will examine a single (generic) entry of this representation.
This formula for the entry of a matrix should remind you of Theorem EMP. However, while the theorem presumed we knew how to multiply matrices, the solution before us never uses any understanding of matrix products. It uses the definitions of vector and matrix representations, properties of linear transformations and vector spaces. So if we began a course by first discussing vector spaces, and then linear transformations between vector spaces, we could carry matrix representations into a motivation for a definition of matrix multiplication that is grounded in function composition. That is worth saying again—a definition of matrix representations of linear transformations results in a matrix product being the representation of a composition of linear transformations.
This exercise is meant to explain why many authors take the formula in Theorem EMP as their definition of matrix multiplication, and why it is a natural choice when the proper motivation is in place. If we first defined matrix multiplication in the style of Theorem EMP, then the above argument, followed by a simple application of the definition of matrix equality (Definition ME), would yield Theorem MRCLT.