Section B  Bases

From A First Course in Linear Algebra
Version 1.08
© 2004.
Licensed under the GNU Free Documentation License.

A basis of a vector space is one of the most useful concepts in linear algebra. It often provides a finite description of an infinite vector space.

Subsection B: Bases

We now have all the tools in place to define a basis of a vector space.

Definition B
Suppose $V$ is a vector space. Then a subset $S \subseteq V$ is a basis of $V$ if it is linearly independent and spans $V$.

So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans $V$ ensures that $S$ has enough raw material to build $V$, while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.

You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of $\mathbb{C}^m$. Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than $\mathbb{C}^m$. Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a nonzero scalar and create a slightly different set that is still a basis. For “important” vector spaces, it will be convenient to have a collection of “nice” bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis, though there is nothing precise enough about this term to allow us to define it formally; it is a question of style. Here are some nice bases for important vector spaces.

Theorem SUVB
Standard Unit Vectors are a Basis
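The set of standard unit vectors for $\mathbb{C}^m$ (Definition SUV), $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \ldots, \mathbf{e}_m\} = \{\mathbf{e}_i \mid 1 \le i \le m\}$, is a basis for the vector space $\mathbb{C}^m$.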

Proof   We must show that the set $B$ is both linearly independent and a spanning set for $\mathbb{C}^m$. First, the vectors in $B$ are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.

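Suppose we grab an arbitrary vector from $\mathbb{C}^m$, say

\[ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_m \end{bmatrix}. \]

Can we write $\mathbf{v}$ as a linear combination of the vectors in $B$? Yes, and quite simply.

\begin{align*}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_m \end{bmatrix}
&= v_1 \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
 + v_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
 + v_3 \begin{bmatrix} 0 \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}
 + \cdots
 + v_m \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \\
\mathbf{v} &= v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3 + \cdots + v_m\mathbf{e}_m
\end{align*}

This shows that $\mathbb{C}^m \subseteq \langle B \rangle$, which is sufficient to show that $B$ is a spanning set for $\mathbb{C}^m$.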

Example BP
Bases for Pn
The vector space of polynomials with degree at most $n$, $P_n$, has the basis

\[ B = \{1,\ x,\ x^2,\ x^3,\ \ldots,\ x^n\}. \]

Another nice basis for $P_n$ is

\[ C = \{1,\ 1 + x,\ 1 + x + x^2,\ 1 + x + x^2 + x^3,\ \ldots,\ 1 + x + x^2 + x^3 + \cdots + x^n\}. \]

Checking that each of $B$ and $C$ is a linearly independent spanning set is a good exercise.

Example BM
A basis for the vector space of matrices
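In the vector space $M_{mn}$ of matrices (Example VSM) define the matrices $B_{k\ell}$, $1 \le k \le m$, $1 \le \ell \le n$ by

\[ [B_{k\ell}]_{ij} = \begin{cases} 1 & \text{if } k = i,\ \ell = j \\ 0 & \text{otherwise} \end{cases} \]

So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all $mn$ of them,

\[ B = \{B_{k\ell} \mid 1 \le k \le m,\ 1 \le \ell \le n\} \]

forms a basis for $M_{mn}$.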

The bases described above will often be convenient ones to work with. However, a basis doesn’t have to obviously look like a basis.

Example BSP4
A basis for a subspace of P4
In Example SSP4 we showed that

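\[ S = \{x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16\} \]

is a spanning set for $W = \{p(x) \mid p \in P_4,\ p(2) = 0\}$. We will now show that $S$ is also linearly independent in $W$. Begin with a relation of linear dependence,

\begin{align*}
0 + 0x + 0x^2 + 0x^3 + 0x^4
&= \alpha_1(x - 2) + \alpha_2(x^2 - 4x + 4) + \alpha_3(x^3 - 6x^2 + 12x - 8) + \alpha_4(x^4 - 8x^3 + 24x^2 - 32x + 16) \\
&= \alpha_4 x^4 + (\alpha_3 - 8\alpha_4)x^3 + (\alpha_2 - 6\alpha_3 + 24\alpha_4)x^2 + (\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4)x + (-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4)
\end{align*}

Equating coefficients (vector equality in $P_4$) gives the homogeneous system of five equations in four variables,

\begin{align*}
\alpha_4 &= 0 \\
\alpha_3 - 8\alpha_4 &= 0 \\
\alpha_2 - 6\alpha_3 + 24\alpha_4 &= 0 \\
\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4 &= 0 \\
-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4 &= 0
\end{align*}

We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]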

With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set $S$ is linearly independent (Definition LI). Finally, $S$ has earned the right to be called a basis for $W$ (Definition B).
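The conclusion of Example BSP4 can also be checked mechanically. The following is a minimal Python sketch, not part of the original text: it stores each polynomial of $S$ as a list of coefficients (constant term first), confirms that each one vanishes at $x = 2$, and computes the rank of the coefficient vectors with exact rational arithmetic. The helper names `evaluate` and `rank` are our own, chosen only for illustration.

```python
from fractions import Fraction

# Coefficient lists [c0, c1, c2, c3, c4] for the polynomials in S (constant term first).
S = [
    [-2, 1, 0, 0, 0],       # x - 2
    [4, -4, 1, 0, 0],       # x^2 - 4x + 4
    [-8, 12, -6, 1, 0],     # x^3 - 6x^2 + 12x - 8
    [16, -32, 24, -8, 1],   # x^4 - 8x^3 + 24x^2 - 32x + 16
]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def rank(rows):
    """Rank of a matrix (list of rows) by forward elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        # Look for a nonzero entry in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

values_at_2 = [evaluate(p, 2) for p in S]   # membership in W requires p(2) = 0
rank_S = rank(S)                            # 4 vectors of rank 4 are linearly independent
print(values_at_2, rank_S)
```

A rank equal to the number of vectors means the only relation of linear dependence is the trivial one, which agrees with the row-reduction carried out in the example.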

Example BSM22
A basis for a subspace of M22
In Example SSM22 we discovered that

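\[ Q = \left\{ \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right\} \]

is a spanning set for the subspace

\[ Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;\middle|\; a + 3b - c - 5d = 0,\ -2a - 6b + 3c + 14d = 0 \right\} \]

of the vector space of all $2 \times 2$ matrices, $M_{22}$. If we can also determine that $Q$ is linearly independent in $Z$ (or in $M_{22}$), then it will qualify as a basis for $Z$. Let’s begin with a relation of linear dependence.

\[ \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \alpha_1 \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix} + \alpha_2 \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} = \begin{bmatrix} -3\alpha_1 + \alpha_2 & \alpha_1 \\ -4\alpha_2 & \alpha_2 \end{bmatrix} \]

Using our definition of matrix equality (Definition ME) we equate corresponding entries and get a homogeneous system of four equations in two variables,

\begin{align*}
-3\alpha_1 + \alpha_2 &= 0 \\
\alpha_1 &= 0 \\
-4\alpha_2 &= 0 \\
\alpha_2 &= 0
\end{align*}

We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that $\alpha_1 = 0$, $\alpha_2 = 0$ is the only solution to this homogeneous system. This qualifies the set $Q$ as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore $Q$ is a basis for $Z$ (Definition B).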

Example BC
Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set $R = \{(1, 0),\ (6, 3)\}$ from the crazy vector space, $C$ (Example CVS), is linearly independent and is a spanning set for $C$. By Definition B we see that $R$ is a basis for $C$.

We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word “basis” in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as a row space and we can then use Theorem BRS.

Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.

Subsection BSCV: Bases for Spans of Column Vectors

We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for $\mathbb{C}^m$ and its subspaces.

Suppose we have a subspace of $\mathbb{C}^m$ that is expressed as the span of a set of vectors, $S$, and $S$ is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.

Example RSB
Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we looked at the set

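\[ W = \left\langle \left\{ \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix},\ \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix},\ \begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix} \right\} \right\rangle \]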

with an eye towards realizing W as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write W as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.

Now we know that W is a subspace and must have a basis. Consider the matrix, C, whose rows are the vectors in the spanning set for W,

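\[ C = \begin{bmatrix} 2 & -3 & 1 \\ 1 & 4 & 1 \\ 7 & -5 & 4 \\ -7 & -6 & -5 \end{bmatrix} \]

Then, by Definition RSM, the row space of $C$ will be $W$, $\mathcal{R}(C) = W$. Theorem BRS tells us that if we row-reduce $C$, the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for $\mathcal{R}(C)$, and hence a basis for $W$. Let’s do it: $C$ row-reduces to

\[ \begin{bmatrix} 1 & 0 & \frac{7}{11} \\ 0 & 1 & \frac{1}{11} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

If we convert the two nonzero rows to column vectors then we have a basis,

\[ B = \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\} \]

and

\[ W = \left\langle \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\} \right\rangle \]

For aesthetic reasons, we might wish to multiply each vector in $B$ by 11, which will not change the spanning or linear independence properties of $B$ as a basis. Then we can also write

\[ W = \left\langle \left\{ \begin{bmatrix} 11 \\ 0 \\ 7 \end{bmatrix},\ \begin{bmatrix} 0 \\ 11 \\ 1 \end{bmatrix} \right\} \right\rangle \]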
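The computation in Example RSB is easy to reproduce. Below is a small Python sketch, added here as an illustration and not taken from the original text. It row-reduces the matrix whose rows are the four spanning vectors, using exact fractions, and keeps the nonzero rows as a basis. The function name `rref` is our own; it is a plain Gauss-Jordan elimination written for this example.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]      # make the leading entry 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

# Rows are the four vectors of the spanning set for W.
C = [[2, -3, 1], [1, 4, 1], [7, -5, 4], [-7, -6, -5]]

reduced = rref(C)
basis = [row for row in reduced if any(x != 0 for x in row)]   # nonzero rows (Theorem BRS)
scaled = [[int(11 * x) for x in row] for row in basis]         # clear the denominators
print(basis)
print(scaled)
```

Exact `Fraction` arithmetic is used instead of floating point so that the zero rows come out as exact zeros and the entries $\frac{7}{11}$ and $\frac{1}{11}$ are reproduced without rounding.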

Example IAS provides another example of this flavor, though now we can notice that X is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.

Example RS
Reducing a span
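In Example RSC5 we began with a set of $n = 4$ vectors from $\mathbb{C}^5$,

\[ R = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\} = \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 3 \\ 2 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 0 \\ -7 \\ 6 \\ -11 \\ -2 \end{bmatrix},\ \begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \\ 6 \end{bmatrix} \right\} \]

and defined $V = \langle R \rangle$. Our goal in that problem was to find a relation of linear dependence on the vectors in $R$, solve the resulting equation for one of the vectors, and re-express $V$ as the span of a set of three vectors.

Here is another way to accomplish something similar. The row space of the matrix

\[ A = \begin{bmatrix} 1 & 2 & -1 & 3 & 2 \\ 2 & 1 & 3 & 1 & 2 \\ 0 & -7 & 6 & -11 & -2 \\ 4 & 1 & 2 & 1 & 6 \end{bmatrix} \]

is equal to $\langle R \rangle$. By Theorem BRS we can row-reduce this matrix, ignore any zero rows, and use the nonzero rows as column vectors that are a basis for the row space of $A$. Row-reducing $A$ creates the matrix

\[ \begin{bmatrix} 1 & 0 & 0 & -\frac{1}{17} & \frac{30}{17} \\ 0 & 1 & 0 & \frac{25}{17} & -\frac{2}{17} \\ 0 & 0 & 1 & -\frac{2}{17} & -\frac{8}{17} \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]

So

\[ \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{1}{17} \\ \frac{30}{17} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{25}{17} \\ -\frac{2}{17} \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \\ -\frac{2}{17} \\ -\frac{8}{17} \end{bmatrix} \right\} \]

is a basis for $V$. Our theorem tells us this is a basis, so there is no need to verify that the subspace spanned by three vectors (rather than four) is the identical subspace, and there is no need to verify that we have reached the limit in reducing the set, since the set of three vectors is guaranteed to be linearly independent.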

Subsection BNM: Bases and Nonsingular Matrices

A quick source of diverse bases for $\mathbb{C}^m$ is the set of columns of a nonsingular matrix.

Theorem CNMB
Columns of Nonsingular Matrix are a Basis
Suppose that $A$ is a square matrix of size $m$. Then the columns of $A$ are a basis of $\mathbb{C}^m$ if and only if $A$ is nonsingular.

Proof   ($\Rightarrow$) Suppose that the columns of $A$ are a basis for $\mathbb{C}^m$. Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that $A$ is nonsingular.

($\Leftarrow$) Suppose that $A$ is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, $\mathcal{C}(A) = \mathbb{C}^m$. This is equivalent to saying that the columns of $A$ are a spanning set for the vector space $\mathbb{C}^m$. As a linearly independent spanning set, the columns of $A$ qualify as a basis for $\mathbb{C}^m$ (Definition B).

Example CABAK
Columns as Basis, Archetype K
Archetype K is the 5 × 5 matrix

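\[ K = \begin{bmatrix} 10 & 18 & 24 & 24 & -12 \\ 12 & -2 & -6 & 0 & -18 \\ -30 & -21 & -23 & -30 & 39 \\ 27 & 30 & 36 & 37 & -30 \\ 18 & 24 & 30 & 30 & -20 \end{bmatrix} \]

which is row-equivalent to the $5 \times 5$ identity matrix $I_5$. So by Theorem NMRRI, $K$ is nonsingular. Then Theorem CNMB says the set

\[ \left\{ \begin{bmatrix} 10 \\ 12 \\ -30 \\ 27 \\ 18 \end{bmatrix},\ \begin{bmatrix} 18 \\ -2 \\ -21 \\ 30 \\ 24 \end{bmatrix},\ \begin{bmatrix} 24 \\ -6 \\ -23 \\ 36 \\ 30 \end{bmatrix},\ \begin{bmatrix} 24 \\ 0 \\ -30 \\ 37 \\ 30 \end{bmatrix},\ \begin{bmatrix} -12 \\ -18 \\ 39 \\ -30 \\ -20 \end{bmatrix} \right\} \]

is a (novel) basis of $\mathbb{C}^5$.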
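The claim that $K$ row-reduces to the identity can be confirmed with a few lines of Python. This is a sketch under assumptions, not part of the original text: the entries of Archetype K are taken with the signs written in the code below, and `rref` is our own Gauss-Jordan helper using exact fractions.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

# Archetype K (signs as assumed in the lead-in above).
K = [
    [10, 18, 24, 24, -12],
    [12, -2, -6, 0, -18],
    [-30, -21, -23, -30, 39],
    [27, 30, 36, 37, -30],
    [18, 24, 30, 30, -20],
]

identity = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
reduced_K = rref(K)
is_nonsingular = reduced_K == identity   # Theorem NMRRI
print(is_nonsingular)
```

When the comparison succeeds, Theorem NMRRI gives nonsingularity and Theorem CNMB then says the five columns form a basis.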

Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Technique LC.)

With a new equivalence for a nonsingular matrix, we can update our list of equivalences.

Theorem NME5
Nonsingular Matrix Equivalences, Round 5
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

  1. $A$ is nonsingular.
  2. $A$ row-reduces to the identity matrix.
  3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\mathbf{0}\}$.
  4. The linear system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.
  5. The columns of $A$ are a linearly independent set.
  6. $A$ is invertible.
  7. The column space of $A$ is $\mathbb{C}^n$, $\mathcal{C}(A) = \mathbb{C}^n$.
  8. The columns of $A$ are a basis for $\mathbb{C}^n$.

Proof   With a new equivalence for a nonsingular matrix in Theorem CNMB we can expand Theorem NME4.

Subsection OBC: Orthonormal Bases and Coordinates

We learned about orthogonal sets of vectors in $\mathbb{C}^m$ back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of $\mathbb{C}^m$, then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.

Suppose that $W$ is a subspace of $\mathbb{C}^m$ with basis $B$. Then $B$ spans $W$ and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set $T$ such that $\langle T \rangle = \langle B \rangle = W$ and $T$ is orthogonal. In other words, $T$ is a basis for $W$, and is an orthogonal set. By scaling each vector of $T$ to norm 1, we can convert $T$ into an orthonormal set, without destroying the properties that make it a basis of $W$. In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process.

Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that $Q$ is a unitary matrix of size $n$. Then the $n$ columns of $Q$ form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since $Q$ is invertible (Theorem UMI), we know $Q$ is nonsingular (Theorem NI), and then the columns of $Q$ span $\mathbb{C}^n$ (Theorem CSNM). So the columns of a unitary matrix of size $n$ are an orthonormal basis for $\mathbb{C}^n$.

Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here’s the promised theorem.

Theorem COB
Coordinates and Orthonormal Bases
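Suppose that $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_p\}$ is an orthonormal basis of the subspace $W$ of $\mathbb{C}^m$. For any $\mathbf{w} \in W$,

\[ \mathbf{w} = \langle \mathbf{w}, \mathbf{v}_1 \rangle \mathbf{v}_1 + \langle \mathbf{w}, \mathbf{v}_2 \rangle \mathbf{v}_2 + \langle \mathbf{w}, \mathbf{v}_3 \rangle \mathbf{v}_3 + \cdots + \langle \mathbf{w}, \mathbf{v}_p \rangle \mathbf{v}_p \]

Proof   Because $B$ is a basis of $W$, Theorem VRRB tells us that we can write $\mathbf{w}$ uniquely as a linear combination of the vectors in $B$. So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations; just do an inner product of $\mathbf{w}$ with $\mathbf{v}_i$ to arrive at the coefficient of $\mathbf{v}_i$ in the linear combination.

So begin the proof by writing $\mathbf{w}$ as a linear combination of the vectors in $B$, using unknown scalars,

\[ \mathbf{w} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots + a_p\mathbf{v}_p \]

and compute,

\begin{align*}
\langle \mathbf{w}, \mathbf{v}_i \rangle
&= \left\langle \sum_{k=1}^{p} a_k\mathbf{v}_k,\ \mathbf{v}_i \right\rangle && \text{Theorem VRRB} \\
&= \sum_{k=1}^{p} \langle a_k\mathbf{v}_k, \mathbf{v}_i \rangle && \text{Theorem IPVA} \\
&= \sum_{k=1}^{p} a_k \langle \mathbf{v}_k, \mathbf{v}_i \rangle && \text{Theorem IPSM} \\
&= a_i \langle \mathbf{v}_i, \mathbf{v}_i \rangle + \sum_{\substack{k=1 \\ k \ne i}}^{p} a_k \langle \mathbf{v}_k, \mathbf{v}_i \rangle && \text{Property C} \\
&= a_i(1) + \sum_{\substack{k=1 \\ k \ne i}}^{p} a_k(0) && \text{Definition ONS} \\
&= a_i
\end{align*}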

So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem’s statement.

Example CROB4
Coordinatization relative to an orthonormal basis, 4
The set

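\[ \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4\} = \left\{ \begin{bmatrix} 1 + i \\ 1 \\ 1 - i \\ i \end{bmatrix},\ \begin{bmatrix} 1 + 5i \\ 6 + 5i \\ -7 - i \\ 1 - 6i \end{bmatrix},\ \begin{bmatrix} -7 + 34i \\ -8 - 23i \\ -10 + 22i \\ 30 + 13i \end{bmatrix},\ \begin{bmatrix} -2 - 4i \\ 6 + i \\ 4 + 3i \\ 6 - i \end{bmatrix} \right\} \]

was proposed, and partially verified, as an orthogonal set in Example AOS. Let’s scale each vector to norm 1, so as to form an orthonormal basis of $\mathbb{C}^4$. (Notice that by Theorem OSLI the set is linearly independent. Since we know the dimension of $\mathbb{C}^4$ is 4, Theorem G tells us the set is just the right size to be a basis of $\mathbb{C}^4$.) The norms of these vectors are,

\[ \|\mathbf{x}_1\| = \sqrt{6} \qquad \|\mathbf{x}_2\| = \sqrt{174} \qquad \|\mathbf{x}_3\| = \sqrt{3451} \qquad \|\mathbf{x}_4\| = \sqrt{119} \]

So an orthonormal basis is

\[ B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\} = \left\{ \frac{1}{\sqrt{6}} \begin{bmatrix} 1 + i \\ 1 \\ 1 - i \\ i \end{bmatrix},\ \frac{1}{\sqrt{174}} \begin{bmatrix} 1 + 5i \\ 6 + 5i \\ -7 - i \\ 1 - 6i \end{bmatrix},\ \frac{1}{\sqrt{3451}} \begin{bmatrix} -7 + 34i \\ -8 - 23i \\ -10 + 22i \\ 30 + 13i \end{bmatrix},\ \frac{1}{\sqrt{119}} \begin{bmatrix} -2 - 4i \\ 6 + i \\ 4 + 3i \\ 6 - i \end{bmatrix} \right\} \]

Now, choose any vector from $\mathbb{C}^4$, say $\mathbf{w} = \begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix}$, and compute

\[ \langle \mathbf{w}, \mathbf{v}_1 \rangle = \frac{-5i}{\sqrt{6}}, \quad \langle \mathbf{w}, \mathbf{v}_2 \rangle = \frac{-19 + 30i}{\sqrt{174}}, \quad \langle \mathbf{w}, \mathbf{v}_3 \rangle = \frac{120 - 211i}{\sqrt{3451}}, \quad \langle \mathbf{w}, \mathbf{v}_4 \rangle = \frac{6 + 12i}{\sqrt{119}} \]

then Theorem COB guarantees that

\begin{align*}
\begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix}
&= \frac{-5i}{\sqrt{6}} \left( \frac{1}{\sqrt{6}} \begin{bmatrix} 1 + i \\ 1 \\ 1 - i \\ i \end{bmatrix} \right)
 + \frac{-19 + 30i}{\sqrt{174}} \left( \frac{1}{\sqrt{174}} \begin{bmatrix} 1 + 5i \\ 6 + 5i \\ -7 - i \\ 1 - 6i \end{bmatrix} \right) \\
&\quad + \frac{120 - 211i}{\sqrt{3451}} \left( \frac{1}{\sqrt{3451}} \begin{bmatrix} -7 + 34i \\ -8 - 23i \\ -10 + 22i \\ 30 + 13i \end{bmatrix} \right)
 + \frac{6 + 12i}{\sqrt{119}} \left( \frac{1}{\sqrt{119}} \begin{bmatrix} -2 - 4i \\ 6 + i \\ 4 + 3i \\ 6 - i \end{bmatrix} \right)
\end{align*}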

as you might want to check (if you have unlimited patience).
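For readers without unlimited patience, the check can be delegated to a short Python sketch (our addition, not part of the original text). It assumes the inner product convention $\langle \mathbf{u}, \mathbf{v} \rangle = \sum_i u_i \overline{v_i}$, which is the convention that reproduces the four inner products listed in the example. The names `inner`, `coords` and `rebuilt` are hypothetical, and the arithmetic is floating point, so equality is only tested up to a small tolerance.

```python
import math

def inner(u, v):
    """Inner product <u, v> = sum of u_i * conj(v_i) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

# The orthogonal set from Example AOS.
X = [
    [1 + 1j, 1 + 0j, 1 - 1j, 0 + 1j],
    [1 + 5j, 6 + 5j, -7 - 1j, 1 - 6j],
    [-7 + 34j, -8 - 23j, -10 + 22j, 30 + 13j],
    [-2 - 4j, 6 + 1j, 4 + 3j, 6 - 1j],
]
w = [2 + 0j, -3 + 0j, 1 + 0j, 4 + 0j]

# Scale each x_k to norm 1 to obtain the orthonormal basis B = {v_1, ..., v_4}.
norms_squared = [inner(x, x).real for x in X]
V = [[entry / math.sqrt(n2) for entry in x] for x, n2 in zip(X, norms_squared)]

# Theorem COB: the coefficient of v_k is the inner product <w, v_k>.
coords = [inner(w, v) for v in V]

# Rebuild w from its coordinates and measure the largest componentwise error.
rebuilt = [sum(c * v[i] for c, v in zip(coords, V)) for i in range(4)]
max_error = max(abs(r - t) for r, t in zip(rebuilt, w))
print(norms_squared, max_error)
```

No linear system is solved anywhere in this computation; four inner products are all that Theorem COB requires.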

A slightly less intimidating example follows, in three dimensions and with just real numbers.

Example CROB3
Coordinatization relative to an orthonormal basis, 3
The set

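\[ \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\} = \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \right\} \]

is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,

\[ B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \left\{ \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\ \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\ \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right\} \]

which is therefore an orthonormal basis of $\mathbb{C}^3$. With three vectors in $\mathbb{C}^3$, all with real number entries, the inner product (Definition IP) reduces to the usual “dot product” (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in $B$ serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$. We would like to decompose arbitrary vectors into “components” in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.

Suppose that we choose $\mathbf{w} = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}$. Compute

\[ \langle \mathbf{w}, \mathbf{v}_1 \rangle = \frac{5}{\sqrt{6}} \qquad \langle \mathbf{w}, \mathbf{v}_2 \rangle = \frac{3}{\sqrt{2}} \qquad \langle \mathbf{w}, \mathbf{v}_3 \rangle = \frac{8}{\sqrt{3}} \]

then Theorem COB guarantees that

\[ \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix} = \frac{5}{\sqrt{6}} \left( \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \right) + \frac{3}{\sqrt{2}} \left( \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right) + \frac{8}{\sqrt{3}} \left( \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right) \]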

which you should be able to check easily, even if you do not have much patience.
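Both steps of Example CROB3 can be sketched in Python. The code below is an illustration we have added, not the text's own procedure: `gram_schmidt` is a plain classical Gram-Schmidt for real vectors, written with exact fractions, and it takes $\mathbf{w}$ to be the vector with entries 2, -1, 5. Square roots appear only at the end, when the coordinates relative to the normalized vectors are computed.

```python
from fractions import Fraction
import math

def dot(u, v):
    """Real dot product."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: returns an orthogonal set with the same span."""
    orthogonal = []
    for x in vectors:
        u = [Fraction(a) for a in x]
        for q in orthogonal:
            coeff = Fraction(dot(x, q)) / dot(q, q)   # projection coefficient of x on q
            u = [a - coeff * b for a, b in zip(u, q)]
        orthogonal.append(u)
    return orthogonal

X = [[1, 2, 1], [-1, 0, 1], [2, 1, 1]]
w = [2, -1, 5]

U = gram_schmidt(X)   # orthogonal, not yet normalized

# Coordinates of w relative to the orthonormal basis v_k = u_k / ||u_k||.
coords = [float(dot(w, u)) / math.sqrt(dot(u, u)) for u in U]

# Exact reconstruction: <w, v_k> v_k equals (w . u_k / u_k . u_k) u_k.
rebuilt = [sum(dot(w, u) / dot(u, u) * u[i] for u in U) for i in range(3)]
print(U, coords, rebuilt)
```

The third orthogonal vector comes out as a scalar multiple of the vector with entries 1, -1, 1, so normalizing it gives the same $\mathbf{v}_3$ as in the example.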

Not only do the columns of a unitary matrix form an orthonormal basis, we can employ a unitary matrix to convert one orthonormal basis into another. Here’s the theorem.

Theorem UMCOB
Unitary Matrices Convert Orthonormal Bases
Suppose that $U$ is a unitary matrix of size $n$ and $B = \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots, \mathbf{x}_n\}$ is an orthonormal basis of $\mathbb{C}^n$. Then the set $C = \{U\mathbf{x}_1, U\mathbf{x}_2, U\mathbf{x}_3, \ldots, U\mathbf{x}_n\}$ is an orthonormal basis of $\mathbb{C}^n$.

Proof   We need to establish several facts for $C$. First we check that $C$ is an orthonormal set. By Theorem UMPIP, for $i \ne j$,

\[ \langle U\mathbf{x}_i, U\mathbf{x}_j \rangle = \langle \mathbf{x}_i, \mathbf{x}_j \rangle = 0 \]

Similarly, Theorem UMPIP also gives, for $1 \le i \le n$,

\[ \|U\mathbf{x}_i\| = \|\mathbf{x}_i\| = 1 \]

As $C$ is an orthogonal set, Theorem OSLI yields the linear independence of $C$. Having established that the column vectors in $C$ form a linearly independent set, a matrix whose columns are the vectors of $C$ is nonsingular (Theorem NMLIC), and hence these vectors form a basis of $\mathbb{C}^n$ by Theorem CNMB.
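A small numerical illustration of Theorem UMCOB follows. It is a sketch of our own, with a hypothetical $2 \times 2$ unitary matrix `U` and a hypothetical orthonormal basis `B` of $\mathbb{C}^2$ chosen only for this example, and with the inner product taken as $\sum_i u_i \overline{v_i}$. The code forms the set of images $U\mathbf{x}_k$ and computes all pairwise inner products, which should be 1 on the diagonal and 0 elsewhere up to rounding.

```python
import math

def inner(u, v):
    """Inner product <u, v> = sum of u_i * conj(v_i) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def mat_vec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

s = 1 / math.sqrt(2)

# Hypothetical unitary matrix: its columns (1, i)/sqrt(2) and (i, 1)/sqrt(2) are orthonormal.
U = [[s, s * 1j],
     [s * 1j, s]]

# Hypothetical orthonormal basis of C^2.
B = [[s, s], [s, -s]]

# Images of the basis vectors under U.
C = [mat_vec(U, x) for x in B]

# Matrix of all pairwise inner products of vectors in C.
gram = [[inner(u, v) for v in C] for u in C]
print(gram)
```

Off-diagonal entries near zero reflect the first display in the proof, and diagonal entries near one reflect the second.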

A converse of this result is contained in Theorem CBOB.

Subsection READ: Reading Questions

  1. The matrix below is nonsingular. What can you now say about its columns?
    \[ A = \begin{bmatrix} -3 & 0 & 1 \\ 1 & 2 & 1 \\ 5 & 1 & 6 \end{bmatrix} \]
  2. Write the vector $\mathbf{w} = \begin{bmatrix} 6 \\ 6 \\ 15 \end{bmatrix}$ as a linear combination of the columns of the matrix $A$ above. How many ways are there to answer this question?
  3. Why is an orthonormal basis desirable?

Subsection EXC: Exercises

C40 From Example RSB, form an arbitrary (and nontrivial) linear combination of the four vectors in the original spanning set for W. So the result of this computation is of course an element of W. As such, this vector should be a linear combination of the basis vectors in B. Find the (unique) scalars that provide this linear combination. Repeat with another linear combination of the original four vectors.  
Contributed by Robert Beezer. Solution in Subsection SOL.

C80 Prove that $\{(1, 2),\ (2, 3)\}$ is a basis for the crazy vector space $C$ (Example CVS).  
Contributed by Robert Beezer

M20 In Example BM provide the verifications (linear independence and spanning) to show that $B$ is a basis of $M_{mn}$.  
Contributed by Robert Beezer. Solution in Subsection SOL.

Subsection SOL: Solutions

M20 Contributed by Robert Beezer
We need to establish the linear independence and spanning properties of the set

\[ B = \{B_{k\ell} \mid 1 \le k \le m,\ 1 \le \ell \le n\} \]

relative to the vector space $M_{mn}$.

This proof is more transparent if you write out individual matrices in the basis with lots of zeros and dots and a lone one. But we don’t have room for that here, so we will use summation notation. Think carefully about each step, especially when the double summations seem to “disappear.” Begin with a relation of linear dependence, using double subscripts on the scalars to align with the basis elements.

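\[ \mathcal{O} = \sum_{k=1}^{m} \sum_{\ell=1}^{n} \alpha_{k\ell} B_{k\ell} \]

Now consider the entry in row $i$ and column $j$ for these equal matrices,

\begin{align*}
0 &= [\mathcal{O}]_{ij} && \text{Definition ZM} \\
&= \left[ \sum_{k=1}^{m} \sum_{\ell=1}^{n} \alpha_{k\ell} B_{k\ell} \right]_{ij} && \text{Definition ME} \\
&= \sum_{k=1}^{m} \sum_{\ell=1}^{n} [\alpha_{k\ell} B_{k\ell}]_{ij} && \text{Definition MA} \\
&= \sum_{k=1}^{m} \sum_{\ell=1}^{n} \alpha_{k\ell} [B_{k\ell}]_{ij} && \text{Definition MSM} \\
&= \alpha_{ij} [B_{ij}]_{ij} && [B_{k\ell}]_{ij} = 0 \text{ when } (k, \ell) \ne (i, j) \\
&= \alpha_{ij}(1) && [B_{ij}]_{ij} = 1 \\
&= \alpha_{ij}
\end{align*}

Since $i$ and $j$ were arbitrary, we find that each scalar is zero and so $B$ is linearly independent (Definition LI).

To establish the spanning property of $B$ we need only show that an arbitrary matrix $A$ can be written as a linear combination of the elements of $B$. So suppose that $A$ is an arbitrary $m \times n$ matrix and consider the matrix $C$ defined as a linear combination of the elements of $B$ by

\[ C = \sum_{k=1}^{m} \sum_{\ell=1}^{n} [A]_{k\ell} B_{k\ell} \]

Then,

\begin{align*}
[C]_{ij} &= \left[ \sum_{k=1}^{m} \sum_{\ell=1}^{n} [A]_{k\ell} B_{k\ell} \right]_{ij} && \text{Definition ME} \\
&= \sum_{k=1}^{m} \sum_{\ell=1}^{n} \left[ [A]_{k\ell} B_{k\ell} \right]_{ij} && \text{Definition MA} \\
&= \sum_{k=1}^{m} \sum_{\ell=1}^{n} [A]_{k\ell} [B_{k\ell}]_{ij} && \text{Definition MSM} \\
&= [A]_{ij} [B_{ij}]_{ij} && [B_{k\ell}]_{ij} = 0 \text{ when } (k, \ell) \ne (i, j) \\
&= [A]_{ij}(1) && [B_{ij}]_{ij} = 1 \\
&= [A]_{ij}
\end{align*}

So by Definition ME, $A = C$, and therefore $A \in \langle B \rangle$. By Definition B, the set $B$ is a basis of the vector space $M_{mn}$.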

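C40 Contributed by Robert Beezer
An arbitrary linear combination is

\[ \mathbf{y} = 3 \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix} + (-2) \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} + 1 \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix} + (-2) \begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 15 \end{bmatrix} \]

(You probably used a different collection of scalars.) We want to write $\mathbf{y}$ as a linear combination of

\[ B = \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\} \]

We could set this up as a vector equation with variables as scalars in a linear combination of the vectors in $B$, but since the first two slots of $B$ have such a nice pattern of zeros and ones, we can determine the necessary scalars easily and then double-check our answer with a computation in the third slot,

\[ 25 \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix} + (-10) \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ (25)\frac{7}{11} + (-10)\frac{1}{11} \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 15 \end{bmatrix} = \mathbf{y} \]

Notice how the uniqueness of these scalars arises. They are forced to be $25$ and $-10$.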