From A First Course in Linear Algebra
Version 2.00
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/
A basis of a vector space is one of the most useful concepts in linear
algebra. It often provides a concise, finite description of an infinite vector
space.
We now have all the tools in place to define a basis of a vector space.
Definition B
Basis
Suppose $V$ is a vector space. Then a subset $S\subseteq V$ is a basis of $V$ if it is linearly independent and spans $V$.
So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans $V$ ensures that $S$ has enough raw material to build $V$, while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.
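The two requirements in Definition B can be checked mechanically for column vectors. The following is a minimal sketch, not the text's own method: it assumes vectors with rational entries (a stand-in for $\mathbb{C}^m$), and the helper names `rank` and `is_basis` are hypothetical. A set of vectors is linearly independent when the rank equals the number of vectors, and it spans the whole space when the rank equals the dimension.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0]) if m else 0):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_basis(vectors, dim):
    """True when `vectors` is linearly independent and spans the space of dimension `dim`."""
    independent = rank(vectors) == len(vectors)
    spans = rank(vectors) == dim
    return independent and spans

print(is_basis([[1, 0, 0], [1, 1, 0], [1, 1, 1]], 3))  # independent and spanning
print(is_basis([[1, 0, 0], [0, 1, 0]], 3))             # independent, but does not span
print(is_basis([[1, 0], [0, 1], [1, 1]], 2))           # spans, but not independent
```

The second and third calls fail for different reasons, which matches the "enough raw material" and "no more than we need" halves of the definition.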
You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of $\mathbb{C}^m$. Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.
Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than $\mathbb{C}^m$.
Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a non-zero scalar and create a slightly different set that is still a basis. For “important” vector spaces, it will be convenient to have a collection of “nice” bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis, though there is nothing precise enough about this term to allow us to define it formally; it is a question of style. Here are some nice bases for important vector spaces.
Theorem SUVB
Standard Unit Vectors are a Basis
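The set of standard unit vectors for $\mathbb{C}^m$ (Definition SUV), $B=\{\mathbf{e}_1,\mathbf{e}_2,\dots,\mathbf{e}_m\}$, is a basis for the vector space $\mathbb{C}^m$.
Proof We must show that the set $B$ is both linearly independent and a spanning set for $\mathbb{C}^m$. First, the vectors in $B$ are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.
Suppose we grab an arbitrary vector from $\mathbb{C}^m$, say
$$\mathbf{v}=\begin{bmatrix}v_1\\v_2\\\vdots\\v_m\end{bmatrix}.$$
Can we write $\mathbf{v}$ as a linear combination of the vectors in $B$? Yes, and quite simply.
$$\mathbf{v}=v_1\mathbf{e}_1+v_2\mathbf{e}_2+\cdots+v_m\mathbf{e}_m$$
This shows that $\mathbb{C}^m\subseteq\langle B\rangle$, which is sufficient to show that $B$ is a spanning set for $\mathbb{C}^m$.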
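As a concrete instance of the spanning argument, here is the decomposition written out for one vector with three entries. The particular vector is chosen here only for illustration and does not come from the text.

```latex
\begin{bmatrix} 4 \\ -2 \\ 7 \end{bmatrix}
= 4\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ (-2)\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ 7\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= 4\mathbf{e}_1 + (-2)\mathbf{e}_2 + 7\mathbf{e}_3
```

The scalars are simply the entries of the vector, which is why the standard unit vectors are so convenient.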
Example BP
Bases for $P_n$
The vector space of polynomials with degree at most $n$, $P_n$, has the basis
$$B=\{1,\,x,\,x^2,\,\dots,\,x^n\}.$$
Another nice basis for $P_n$ is
Checking that each of these sets is a linearly independent spanning set is a good exercise.
Example BM
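A basis for the vector space of matrices
In the vector space $M_{mn}$ of matrices (Example VSM) define the matrices $B_{k\ell}$, $1\le k\le m$, $1\le\ell\le n$, by
$$[B_{k\ell}]_{ij}=\begin{cases}1 & \text{if } k=i,\ \ell=j\\ 0 & \text{otherwise}\end{cases}$$
So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all $mn$ of them,
$$B=\{B_{k\ell}\mid 1\le k\le m,\ 1\le\ell\le n\}$$
forms a basis for $M_{mn}$.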
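The spanning half of this example can be sketched in code. This is an illustrative sketch only: the helper names `unit_matrix` and `combine` are hypothetical, and the sample matrix is made up. It rebuilds a matrix as the linear combination whose scalars are the matrix's own entries, each multiplying the matrix with a lone one in the matching position.

```python
def unit_matrix(k, l, m, n):
    """m x n matrix with a lone 1 in row k, column l (1-indexed), zeros elsewhere."""
    return [[1 if (i == k and j == l) else 0 for j in range(1, n + 1)]
            for i in range(1, m + 1)]

def combine(A):
    """Rebuild A as the sum over all positions (k, l) of A[k][l] times the unit matrix."""
    m, n = len(A), len(A[0])
    total = [[0] * n for _ in range(m)]
    for k in range(1, m + 1):
        for l in range(1, n + 1):
            B = unit_matrix(k, l, m, n)
            for i in range(m):
                for j in range(n):
                    total[i][j] += A[k - 1][l - 1] * B[i][j]
    return total

A = [[2, -1, 5], [0, 3, 4]]   # an arbitrary 2 x 3 matrix, chosen for illustration
print(combine(A) == A)
```

Each entry of the sum receives a contribution from exactly one term, which is the same observation that makes the double sums collapse in a written proof.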
The bases described above will often be convenient ones to work with. However a basis doesn’t have to obviously look like a basis.
Example BSP4
A basis for a subspace of $P_4$
In Example SSP4 we showed that
is a spanning set for . We will now show that is also linearly independent in . Begin with a relation of linear dependence,
Equating coefficients (vector equality in ) gives the homogeneous system of five equations in four variables,
We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form
With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set is linearly independent (Definition LI). Finally, the set has earned the right to be called a basis for the subspace (Definition B).
Example BSM22
A basis for a subspace of $M_{22}$
In Example SSM22 we discovered that
is a spanning set for the subspace
of the vector space of all $2\times 2$ matrices, $M_{22}$. If we can also determine that the spanning set is linearly independent in the subspace (or in $M_{22}$), then it will qualify as a basis for the subspace. Let’s begin with a relation of linear dependence.
Using our definition of matrix equality (Definition ME) we equate corresponding entries and get a homogeneous system of four equations in two variables,
We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that both scalars must be zero, so the trivial solution is the only solution to this homogeneous system. This qualifies the set as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore the set is a basis for the subspace (Definition B).
Example BC
Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set
from the crazy
vector space,
(Example CVS), is linearly independent and is a spanning set for
. By Definition B
we see that is
a basis for .
We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word “basis” in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as a row space and we can then use Theorem BRS.
Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.
We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for $\mathbb{C}^m$ and its subspaces.
Suppose we have a subspace of $\mathbb{C}^m$ that is expressed as the span of a set of vectors, $S$, and $S$ is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.
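The technique just described (place the spanning vectors in the rows of a matrix, row-reduce, keep the nonzero rows) can be sketched as follows. This is a minimal sketch under stated assumptions: entries are rational so that the arithmetic is exact, the helper names `rref` and `span_basis` are hypothetical, and the four sample vectors are made up for illustration rather than taken from the examples in the text.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [a / pivot for a in m[r]]          # scale to a leading 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m

def span_basis(vectors):
    """Nonzero rows of the reduced row-echelon form: a basis for the span."""
    return [row for row in rref(vectors) if any(x != 0 for x in row)]

vectors = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [0, 2, 2]]
basis = span_basis(vectors)
print([[int(x) for x in row] for row in basis])
```

Four spanning vectors reduce to two basis vectors here, and no separate check of linear independence is needed because the row-reduction guarantees it.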
Example RSB
Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we
looked at the set
with an eye towards realizing as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.
Now we know that is a subspace and must have a basis. Consider the matrix, , whose rows are the vectors in the spanning set for ,
Then, by Definition RSM, the row space of will be , . Theorem BRS tells us that if we row-reduce , the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for , and hence a basis for . Let’s do it — row-reduces to
If we convert the two nonzero rows to column vectors then we have a basis,
and
For aesthetic reasons, we might wish to multiply each vector in by , which will not change the spanning or linear independence properties of as a basis. Then we can also write
Example IAS provides another example of this flavor, though now we can notice that is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.
Example RS
Reducing a span
In Example RSC5 we began with a set of
vectors
from ,
and defined . Our goal in that problem was to find a relation of linear dependence on the vectors in , solve the resulting equation for one of the vectors, and re-express as the span of a set of three vectors.
Here is another way to accomplish something similar. The row space of the matrix
is equal to . By Theorem BRS we can row-reduce this matrix, ignore any zero rows, and use the non-zero rows as column vectors that are a basis for the row space of . Row-reducing creates the matrix
So
is a basis for the span of the original four vectors. Our theorem tells us this is a basis, so there is no need to verify that the subspace spanned by three vectors (rather than four) is the identical subspace, and there is no need to verify that we have reached the limit in reducing the set, since the set of three vectors is guaranteed to be linearly independent.
A quick source of diverse bases for $\mathbb{C}^m$ is the set of columns of a nonsingular matrix.
Theorem CNMB
Columns of Nonsingular Matrix are a Basis
Suppose that $A$ is a square matrix of size $m$. Then the columns of $A$ are a basis of $\mathbb{C}^m$ if and only if $A$ is nonsingular.
Proof ($\Rightarrow$) Suppose that the columns of $A$ are a basis for $\mathbb{C}^m$. Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that $A$ is nonsingular.
($\Leftarrow$) Suppose that $A$ is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, $\mathcal{C}(A)=\mathbb{C}^m$. This is equivalent to saying that the columns of $A$ are a spanning set for the vector space $\mathbb{C}^m$. As a linearly independent spanning set, the columns of $A$ qualify as a basis for $\mathbb{C}^m$ (Definition B).
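One way to see the theorem at work is to ask for the scalars that express a given vector in terms of the columns of a square matrix. The sketch below is illustrative only: it assumes rational entries, the function name `solve_square` is hypothetical, and both sample matrices are made up. When the matrix is nonsingular the scalars exist and are unique; when it is singular the elimination runs out of pivots and the columns cannot be a basis.

```python
from fractions import Fraction

def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; return None when A is singular."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next((i for i in range(c, n) if m[i][c] != 0), None)
        if p is None:
            return None  # no pivot in this column, so A is singular
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [a / pivot for a in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [a - f * bb for a, bb in zip(m[i], m[c])]
    return [row[n] for row in m]

A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # nonsingular, so its columns form a basis
b = [2, 3, 5]
x = solve_square(A, b)
print([int(t) for t in x])               # scalars on the columns of A that produce b

singular = [[1, 2], [2, 4]]              # second column is twice the first
print(solve_square(singular, [1, 1]))
```

The entries of `x` are the scalars in the linear combination of the columns of `A` that equals `b`.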
Example CABAK
Columns as Basis, Archetype K
Archetype K is the
matrix
which is row-equivalent to the identity matrix . So by Theorem NMRRI, is nonsingular. Then Theorem CNMB says the set
is a (novel) basis of .
Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Technique LC.)
With a new equivalence for a nonsingular matrix, we can update our list of equivalences.
Theorem NME5
Nonsingular Matrix Equivalences, Round 5
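Suppose that $A$ is a square matrix of size $n$. The following are equivalent.
1. $A$ is nonsingular.
2. $A$ row-reduces to the identity matrix.
3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A)=\{\mathbf{0}\}$.
4. The linear system $\mathcal{LS}(A,\mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.
5. The columns of $A$ are a linearly independent set.
6. $A$ is invertible.
7. The column space of $A$ is $\mathbb{C}^n$, $\mathcal{C}(A)=\mathbb{C}^n$.
8. The columns of $A$ are a basis for $\mathbb{C}^n$.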
Proof With a new equivalence for a nonsingular matrix in Theorem CNMB we can expand Theorem NME4.
We learned about orthogonal sets of vectors in $\mathbb{C}^m$ back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of $\mathbb{C}^m$, then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.
Suppose that $W$ is a subspace of $\mathbb{C}^m$ with basis $B$. Then $B$ spans $W$ and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set $T$ such that $\langle T\rangle=\langle B\rangle=W$ and $T$ is orthogonal. In other words, $T$ is a basis for $W$, and is an orthogonal set. By scaling each vector of $T$ to norm 1, we can convert $T$ into an orthonormal set, without destroying the properties that make it a basis of $W$. In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process.
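The two-stage conversion described above (orthogonalize, then scale to norm 1) can be sketched for vectors with real entries. This is a simplified illustration, not Theorem GSP as stated in the text: it assumes real floating-point entries and a linearly independent input, and the function names and the sample basis are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of real vectors."""
    orthogonal = []
    for v in vectors:
        w = list(v)
        for u in orthogonal:
            # subtract the component of v in the direction of u
            coeff = dot(v, u) / dot(u, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        orthogonal.append(w)
    return orthogonal

def normalize(v):
    """Scale a nonzero vector to norm 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
orthonormal = [normalize(v) for v in gram_schmidt(basis)]

for i, u in enumerate(orthonormal):
    print([round(dot(u, v), 10) for v in orthonormal])
```

The printed rows should resemble the rows of an identity matrix, up to rounding, which is exactly the orthonormality condition.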
Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that $Q$ is a unitary matrix of size $n$. Then the $n$ columns of $Q$ form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since $Q$ is invertible (Theorem UMI), we know $Q$ is nonsingular (Theorem NI), and then the columns of $Q$ span $\mathbb{C}^n$ (Theorem CSNM). So the columns of a unitary matrix of size $n$ are an orthonormal basis for $\mathbb{C}^n$.
Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here’s the promised theorem.
Theorem COB
Coordinates and Orthonormal Bases
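Suppose that $B=\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_p\}$ is an orthonormal basis of the subspace $W$ of $\mathbb{C}^m$. For any $\mathbf{w}\in W$,
$$\mathbf{w}=\langle\mathbf{w},\mathbf{v}_1\rangle\mathbf{v}_1+\langle\mathbf{w},\mathbf{v}_2\rangle\mathbf{v}_2+\cdots+\langle\mathbf{w},\mathbf{v}_p\rangle\mathbf{v}_p$$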
Proof Because $B$ is a basis of $W$, Theorem VRRB tells us that we can write $\mathbf{w}$ uniquely as a linear combination of the vectors in $B$. So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations: just do an inner product of $\mathbf{w}$ with $\mathbf{v}_i$ to arrive at the coefficient of $\mathbf{v}_i$ in the linear combination.
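So begin the proof by writing $\mathbf{w}$ as a linear combination of the vectors in $B$, using unknown scalars,
$$\mathbf{w}=a_1\mathbf{v}_1+a_2\mathbf{v}_2+\cdots+a_p\mathbf{v}_p$$
and compute, for each $1\le i\le p$,
$$\langle\mathbf{w},\mathbf{v}_i\rangle=\left\langle\sum_{k=1}^{p}a_k\mathbf{v}_k,\ \mathbf{v}_i\right\rangle=\sum_{k=1}^{p}a_k\langle\mathbf{v}_k,\mathbf{v}_i\rangle=a_i\langle\mathbf{v}_i,\mathbf{v}_i\rangle=a_i$$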
So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem’s statement.
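The content of Theorem COB can be tried out numerically. The sketch below is illustrative and rests on labeled assumptions: the inner product is taken to be linear in its first argument (so the second argument is conjugated), the two-vector orthonormal basis is made up for the occasion, and the names `inner`, `coords` and `rebuilt` are hypothetical.

```python
import math

def inner(u, v):
    """Sum of u_i * conj(v_i); linear in the first argument (an assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

s = 1 / math.sqrt(2)
B = [[s, s * 1j], [s, -s * 1j]]      # an orthonormal basis with complex entries (illustrative)
w = [3 + 2j, 1 - 4j]                 # any vector will do

# Each coordinate is a single inner product; no linear system is solved.
coords = [inner(w, v) for v in B]

# Rebuild w from its coordinates to confirm the linear combination.
rebuilt = [sum(c * v[k] for c, v in zip(coords, B)) for k in range(len(w))]
print(all(abs(r - x) < 1e-12 for r, x in zip(rebuilt, w)))
```

With the opposite conjugation convention the arguments of `inner` would be swapped, but the computed scalars would be the same.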
Example CROB4
Coordinatization relative to an orthonormal basis, $\mathbb{C}^4$
The set
was proposed, and partially verified, as an orthogonal set in Example AOS. Let’s scale each vector to norm 1, so as to form an orthonormal set in $\mathbb{C}^4$. Then by Theorem OSLI the set will be linearly independent, and by Theorem NME5 the set will be a basis for $\mathbb{C}^4$. So, once scaled to norm 1, the adjusted set will be an orthonormal basis of $\mathbb{C}^4$. The norms are,
So an orthonormal basis is
Now, to illustrate Theorem COB, choose any vector from , say , and compute
Then Theorem COB guarantees that
as you might want to check (if you have unlimited patience).
A slightly less intimidating example follows, in three dimensions and with just real numbers.
Example CROB3
Coordinatization relative to an orthonormal basis, $\mathbb{C}^3$
The set
is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,
which is therefore an orthonormal basis of $\mathbb{C}^3$. With three vectors in $\mathbb{C}^3$, all with real number entries, the inner product (Definition IP) reduces to the usual “dot product” (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in this basis serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors $\vec{i}$, $\vec{j}$ and $\vec{k}$. We would like to decompose arbitrary vectors into “components” in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.
Suppose that we choose . Compute
then Theorem COB guarantees that
which you should be able to check easily, even if you do not have much patience.
Not only do the columns of a unitary matrix form an orthonormal basis, but there is a deeper connection between orthonormal bases and unitary matrices. Informally, the next theorem says that if we transform each vector of an orthonormal basis by multiplying it by a unitary matrix, then the resulting set will be another orthonormal basis. And more remarkably, any matrix with this property must be unitary! As an equivalence (Technique E) we could take this as our defining property of a unitary matrix, though it might not have the same utility as Definition UM.
Theorem UMCOB
Unitary Matrices Convert Orthonormal Bases
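Let $A$ be an $n\times n$ matrix and $B=\{\mathbf{x}_1,\mathbf{x}_2,\dots,\mathbf{x}_n\}$ be an orthonormal basis of $\mathbb{C}^n$. Define
$$C=\{A\mathbf{x}_1,A\mathbf{x}_2,\dots,A\mathbf{x}_n\}$$
Then $A$ is a unitary matrix if and only if $C$ is an orthonormal basis of $\mathbb{C}^n$.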
Proof Assume $A$ is a unitary matrix and establish several facts about $C$. First we check that $C$ is an orthonormal set (Definition ONS). By Theorem UMPIP, for $i\neq j$,
$$\langle A\mathbf{x}_i,A\mathbf{x}_j\rangle=\langle\mathbf{x}_i,\mathbf{x}_j\rangle=0$$
Similarly, Theorem UMPIP also gives, for $1\le i\le n$,
$$\|A\mathbf{x}_i\|=\|\mathbf{x}_i\|=1$$
As $C$ is an orthogonal set (Definition OSV), Theorem OSLI yields the linear independence of $C$. Having established that the column vectors in $C$ form a linearly independent set, a matrix whose columns are the vectors of $C$ is nonsingular (Theorem NMLIC), and hence these vectors form a basis of $\mathbb{C}^n$ by Theorem CNMB.
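Now assume that $C$ is an orthonormal set. Let $\mathbf{y}$ be an arbitrary vector from $\mathbb{C}^n$. Since $B$ spans $\mathbb{C}^n$, there are scalars, $a_1,a_2,\dots,a_n$, such that
$$\mathbf{y}=a_1\mathbf{x}_1+a_2\mathbf{x}_2+\cdots+a_n\mathbf{x}_n$$
Now, expanding with this linear combination and using the orthonormality of $C$, we arrive at
$$A^{*}A\mathbf{y}=I_n\mathbf{y}$$
Since the choice of $\mathbf{y}$ was arbitrary, Theorem EMMVP tells us that $A^{*}A=I_n$, so $A$ is unitary (Definition UM).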
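A small numerical check can make both directions of Theorem UMCOB plausible. The following sketch is only an illustration with real entries: the rotation matrix, the shear matrix, the starting orthonormal basis and all function names are assumptions chosen here, not objects from the text. A rotation is unitary and should carry the orthonormal basis to an orthonormal set, while a shear is nonsingular but not unitary and should fail to do so.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def inner(u, v):
    """Inner product with the second argument conjugated (an assumed convention)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def is_orthonormal(vectors, tol=1e-12):
    """True when every pair of vectors has inner product 1 (same vector) or 0 (different)."""
    return all(abs(inner(u, v) - (1 if i == j else 0)) < tol
               for i, u in enumerate(vectors)
               for j, v in enumerate(vectors))

t = math.pi / 6
U = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]      # rotation by 30 degrees: unitary
S = [[1, 1],
     [0, 1]]                           # shear: nonsingular but not unitary

s = 1 / math.sqrt(2)
B = [[s, s], [s, -s]]                  # an orthonormal basis (illustrative)

carried_by_U = [matvec(U, x) for x in B]
carried_by_S = [matvec(S, x) for x in B]
print(is_orthonormal(carried_by_U))
print(is_orthonormal(carried_by_S))
```

The shear still carries the basis to a basis, since it is nonsingular, but the images are no longer orthogonal, which is the distinction Exercise T50 draws out.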
C40 From Example RSB, form an arbitrary (and nontrivial)
linear combination of the four vectors in the original spanning set for
.
So the result of this computation is of course an element of
. As
such, this vector should be a linear combination of the basis vectors in
. Find
the (unique) scalars that provide this linear combination. Repeat with another
linear combination of the original four vectors.
Contributed by Robert Beezer (solution below)
C80 Prove that is a basis
for the crazy vector space
(Example CVS).
Contributed by Robert Beezer
M20 In Example BM provide the verifications (linear independence and spanning) to show that $B$ is a basis of $M_{mn}$.
Contributed by Robert Beezer (solution below)
T50 Theorem UMCOB says that unitary matrices are characterized as those matrices that “carry” orthonormal bases to orthonormal bases. This problem asks you to prove a similar result: nonsingular matrices are characterized as those matrices that “carry” bases to bases.
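More precisely, suppose that $A$ is a square matrix of size $n$ and $B=\{\mathbf{x}_1,\mathbf{x}_2,\dots,\mathbf{x}_n\}$ is a basis of $\mathbb{C}^n$. Prove that $A$ is nonsingular if and only if $C=\{A\mathbf{x}_1,A\mathbf{x}_2,\dots,A\mathbf{x}_n\}$ is a basis of $\mathbb{C}^n$. (See also Exercise PD.T33, Exercise MR.T20.)
Contributed by Robert Beezer (solution below)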
T51 Use the result of Exercise B.T50 to build a very concise proof of Theorem CNMB. (Hint: make a judicious choice for the basis $B$.)
Contributed by Robert Beezer (solution below)
M20 Solution. Contributed by Robert Beezer
We need to establish the linear independence and spanning properties of the set
$$B=\{B_{k\ell}\mid 1\le k\le m,\ 1\le\ell\le n\}$$
relative to the vector space $M_{mn}$.
This proof is more transparent if you write out individual matrices in the basis with lots of zeros and dots and a lone one. But we don’t have room for that here, so we will use summation notation. Think carefully about each step, especially when the double summations seem to “disappear.” Begin with a relation of linear dependence, using double subscripts on the scalars to align with the basis elements.
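$$\mathcal{O}=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}B_{k\ell}$$
Now consider the entry in row $i$ and column $j$ for these equal matrices,
$$0=[\mathcal{O}]_{ij}=\left[\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}B_{k\ell}\right]_{ij}=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}[B_{k\ell}]_{ij}=\alpha_{ij}[B_{ij}]_{ij}=\alpha_{ij}$$
Since $i$ and $j$ were arbitrary, we find that each scalar is zero and so $B$ is linearly independent (Definition LI).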
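To establish the spanning property of $B$ we need only show that an arbitrary matrix $A$ can be written as a linear combination of the elements of $B$. So suppose that $A$ is an arbitrary $m\times n$ matrix and consider the matrix $C$ defined as a linear combination of the elements of $B$ by
$$C=\sum_{k=1}^{m}\sum_{\ell=1}^{n}[A]_{k\ell}B_{k\ell}$$
Then,
$$[C]_{ij}=\sum_{k=1}^{m}\sum_{\ell=1}^{n}[A]_{k\ell}[B_{k\ell}]_{ij}=[A]_{ij}[B_{ij}]_{ij}=[A]_{ij}$$
So by Definition ME, $A=C$, and therefore $A\in\langle B\rangle$. By Definition B, the set $B$ is a basis of the vector space $M_{mn}$.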
C40 Solution. Contributed by Robert Beezer
An arbitrary linear combination is
(You probably used a different collection of scalars.) We want to write as a linear combination of
We could set this up as vector equation with variables as scalars in a linear combination of the vectors in , but since the first two slots of have such a nice pattern of zeros and ones, we can determine the necessary scalars easily and then double-check our answer with a computation in the third slot,
Notice how the uniqueness of these scalars arises. They are forced to be and .
T50 Solution. Contributed by Robert Beezer
Our first proof relies mostly on definitions of linear independence and spanning,
which is a good exercise. The second proof is shorter and turns on a technical
result from our work with matrix inverses, Theorem NPNT.
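Assume that $A$ is nonsingular and prove that $C$ is a basis of $\mathbb{C}^n$. First show that $C$ is linearly independent. Work on a relation of linear dependence on $C$,
$$\mathbf{0}=a_1A\mathbf{x}_1+a_2A\mathbf{x}_2+\cdots+a_nA\mathbf{x}_n=A\left(a_1\mathbf{x}_1+a_2\mathbf{x}_2+\cdots+a_n\mathbf{x}_n\right)$$
Since $A$ is nonsingular, Definition NM and Theorem SLEMM allow us to conclude that
$$a_1\mathbf{x}_1+a_2\mathbf{x}_2+\cdots+a_n\mathbf{x}_n=\mathbf{0}$$
But this is a relation of linear dependence of the linearly independent set $B$, so the scalars are trivial, $a_1=a_2=\cdots=a_n=0$. By Definition LI, the set $C$ is linearly independent.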
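Now prove that $C$ spans $\mathbb{C}^n$. Given an arbitrary vector $\mathbf{y}\in\mathbb{C}^n$, can it be expressed as a linear combination of the vectors in $C$? Since $A$ is a nonsingular matrix we can define the vector $\mathbf{w}$ to be the unique solution of the system $\mathcal{LS}(A,\mathbf{y})$ (Theorem NMUS). Since $\mathbf{w}\in\mathbb{C}^n$ we can write $\mathbf{w}$ as a linear combination of the vectors in the basis $B$. So there are scalars, $b_1,b_2,\dots,b_n$, such that
$$\mathbf{w}=b_1\mathbf{x}_1+b_2\mathbf{x}_2+\cdots+b_n\mathbf{x}_n$$
Then,
$$\mathbf{y}=A\mathbf{w}=A\left(b_1\mathbf{x}_1+b_2\mathbf{x}_2+\cdots+b_n\mathbf{x}_n\right)=b_1A\mathbf{x}_1+b_2A\mathbf{x}_2+\cdots+b_nA\mathbf{x}_n$$
So we can write an arbitrary vector of $\mathbb{C}^n$ as a linear combination of the elements of $C$. In other words, $C$ spans $\mathbb{C}^n$ (Definition TSVS). By Definition B, the set $C$ is a basis for $\mathbb{C}^n$.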
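Assume that $C$ is a basis and prove that $A$ is nonsingular. Let $\mathbf{x}$ be a solution to the homogeneous system $\mathcal{LS}(A,\mathbf{0})$. Since $B$ is a basis of $\mathbb{C}^n$ there are scalars, $a_1,a_2,\dots,a_n$, such that
$$\mathbf{x}=a_1\mathbf{x}_1+a_2\mathbf{x}_2+\cdots+a_n\mathbf{x}_n$$
Then
$$\mathbf{0}=A\mathbf{x}=A\left(a_1\mathbf{x}_1+a_2\mathbf{x}_2+\cdots+a_n\mathbf{x}_n\right)=a_1A\mathbf{x}_1+a_2A\mathbf{x}_2+\cdots+a_nA\mathbf{x}_n$$
This is a relation of linear dependence on the linearly independent set $C$, so the scalars must all be zero, $a_1=a_2=\cdots=a_n=0$. Thus,
$$\mathbf{x}=0\mathbf{x}_1+0\mathbf{x}_2+\cdots+0\mathbf{x}_n=\mathbf{0}$$
By Definition NM we see that $A$ is nonsingular.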
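Now for a second proof. Take the vectors for $B$ and use them as the columns of a matrix, $G$. By Theorem CNMB, because we have the hypothesis that $B$ is a basis of $\mathbb{C}^n$, $G$ is a nonsingular matrix. Notice that the columns of $AG$ are exactly the vectors in the set $C$, by Definition MM. Then $A$ is nonsingular if and only if $AG$ is nonsingular (Theorem NPNT), which in turn holds if and only if $C$ is a basis of $\mathbb{C}^n$ (Theorem CNMB).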
That was easy!
T51 Solution. Contributed by Robert Beezer
Choose $B$ to be the set of standard unit vectors, a particularly nice basis of $\mathbb{C}^n$ (Theorem SUVB). For a vector $\mathbf{e}_j$ (Definition SUV) from this basis, what is $A\mathbf{e}_j$?