Section B Bases

Definition B
Basis
Suppose V is a vector space. Then a subset S⊆V is a basis of V if it is linearly independent and spans V. △

So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans V ensures that S has enough raw material to build V, while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.

You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of {ℂ}^{m}. Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than {ℂ}^{m}.

Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a non-zero scalar and create a slightly different set that is still a basis. For “important” vector spaces, it will be convenient to have a collection of “nice” bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis, though there is nothing precise enough about this term to allow us to define it formally; it is a question of style. Here are some nice bases for important vector spaces.

Proof We must show that the set B is both linearly independent and a spanning set for {ℂ}^{m}. First, the vectors in B are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.

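Suppose v is an arbitrary vector of {ℂ}^{m}, with entries {v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{m}. Can we write v as a linear combination of the vectors in B? Yes, and quite simply.

\eqalignno{ v & = {v}_{1}{e}_{1} + {v}_{2}{e}_{2} + {v}_{3}{e}_{3} + \mathrel{⋯} + {v}_{m}{e}_{m} & & }

This shows that {ℂ}^{m} ⊆\left \langle B\right \rangle , which is sufficient to show that B is a spanning set for {ℂ}^{m}. ■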

Example BP
Bases for {P}_{n}
The vector space of polynomials with degree at most n, {P}_{n}, has the basis

Checking that each of B and C is a linearly independent spanning set is a good exercise. ⊠

So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all mn of them,

The bases described above will often be convenient ones to work with. However, a basis doesn’t have to obviously look like a basis.

is a spanning set for W = \left \{\left .p(x)\right \vert p ∈ {P}_{4},\ p(2) = 0\right \}. We will now show that S is also linearly independent in W. Begin with a relation of linear dependence,

\eqalignno{ 0 + 0x + 0{x}^{2} + 0{x}^{3} + 0{x}^{4}& = {α}_{ 1}\left (x − 2\right ) + {α}_{2}\left ({x}^{2} − 4x + 4\right ) && \cr &\quad + {α}_{3}\left ({x}^{3} − 6{x}^{2} + 12x − 8\right ) + {α}_{ 4}\left ({x}^{4} − 8{x}^{3} + 24{x}^{2} − 32x + 16\right )&& \cr & = {α}_{4}{x}^{4} + \left ({α}_{ 3} − 8{α}_{4}\right ){x}^{3} + \left ({α}_{ 2} − 6{α}_{3} + 24{α}_{4}\right ){x}^{2} && \cr &\quad + \left ({α}_{1} − 4{α}_{2} + 12{α}_{3} − 32{α}_{4}\right )x + \left (−2{α}_{1} + 4{α}_{2} − 8{α}_{3} + 16{α}_{4}\right )&& }

Equating coefficients (vector equality in {P}_{4}) gives the homogeneous system of five equations in four variables,

\eqalignno{ {α}_{4} & = 0 & & \cr {α}_{3} − 8{α}_{4} & = 0 & & \cr {α}_{2} − 6{α}_{3} + 24{α}_{4} & = 0 & & \cr {α}_{1} − 4{α}_{2} + 12{α}_{3} − 32{α}_{4} & = 0 & & \cr − 2{α}_{1} + 4{α}_{2} − 8{α}_{3} + 16{α}_{4} & = 0 & & \cr & & }

We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form

With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set S is linearly independent (Definition LI). Finally, S has earned the right to be called a basis for W (Definition B). ⊠
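The row reduction in this example can be checked mechanically. The following Python sketch is an illustration, not part of the original text: the helper name `rref` is hypothetical, the arithmetic is exact (rational numbers), and the rows of `coefficients` are read off from the five equations above with columns ordered {α}_{1}, {α}_{2}, {α}_{3}, {α}_{4}.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the row that will hold the next pivot
    for c in range(len(m[0])):
        # Look for a nonzero entry in column c, at or below row r.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]  # scale the pivot entry to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


# One row per equation, one column per scalar alpha_1, ..., alpha_4.
coefficients = [
    [0, 0, 0, 1],
    [0, 0, 1, -8],
    [0, 1, -6, 24],
    [1, -4, 12, -32],
    [-2, 4, -8, 16],
]

R, pivots = rref(coefficients)
for row in R:
    print([int(x) for x in row])
print("pivot columns:", pivots)
```

Every column is a pivot column, so the homogeneous system has only the trivial solution, which agrees with the conclusion that S is linearly independent.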

of the vector space of all 2 × 2 matrices, {M}_{22}. If we can also determine that Q is linearly independent in Z (or in {M}_{22}), then it will qualify as a basis for Z. Let’s begin with a relation of linear dependence.

\eqalignno{ \left [\array{ 0&0\cr 0&0 } \right ] & = {α}_{1}\left [\array{ −3&1\cr 0 &0 } \right ] + {α}_{2}\left [\array{ 1 &0\cr −4 &1 } \right ] & & \cr & = \left [\array{ −3{α}_{1} + {α}_{2}&{α}_{1} \cr −4{α}_{2} &{α}_{2} } \right ] & & }

Using our definition of matrix equality (Definition ME) we equate corresponding entries and get a homogeneous system of four equations in two variables,

\eqalignno{ − 3{α}_{1} + {α}_{2} & = 0 & & \cr {α}_{1} & = 0 & & \cr − 4{α}_{2} & = 0 & & \cr {α}_{2} & = 0 & & }

We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that {α}_{1} = 0, {α}_{2} = 0 is the only solution to this homogeneous system. This qualifies the set Q as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore Q is a basis for Z (Definition B). ⊠

Example BC
Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set R = \left \{(1,\kern 1.95872pt 0),\kern 1.95872pt (6,\kern 1.95872pt 3)\right \} from the crazy vector space, C (Example CVS), is linearly independent and is a spanning set for C. By Definition B we see that R is a basis for C. ⊠

We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word “basis” in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as a row space and we can then use Theorem BRS.

Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.

Subsection BSCV: Bases for Spans of Column Vectors

We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for {ℂ}^{m} and its subspaces.

Suppose we have a subspace of {ℂ}^{m} that is expressed as the span of a set of vectors, S, and S is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.
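The computational tool described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rref` and `row_space_basis` are hypothetical, the four spanning vectors are made-up data rather than vectors from any example in the text, and the entries are kept rational so the arithmetic is exact.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the row that will hold the next pivot
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m


def row_space_basis(spanning_vectors):
    """Place the vectors in the rows of a matrix, row-reduce, keep nonzero rows."""
    reduced = rref(spanning_vectors)
    return [row for row in reduced if any(x != 0 for x in row)]


# Hypothetical spanning set: the second vector is twice the first, and the
# fourth equals the first minus the third, so only two vectors are needed.
spanning_set = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [0, 2, 2]]

basis = row_space_basis(spanning_set)
print([[int(x) for x in v] for v in basis])  # [[1, 0, 1], [0, 1, 1]]
```

The vectors returned are usually not members of the original spanning set; they are a new, tidier basis for the same subspace.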

Example RSB
Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we looked at the set

with an eye towards realizing W as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write W as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.

Now we know that W is a subspace and must have a basis. Consider the matrix, C, whose rows are the vectors in the spanning set for W,

Then, by Definition RSM, the row space of C will be W, ℛ\kern -1.95872pt \left (C\right ) = W. Theorem BRS tells us that if we row-reduce C, the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for ℛ\kern -1.95872pt \left (C\right ), and hence a basis for W. Let’s do it — C row-reduces to

For aesthetic reasons, we might wish to multiply each vector in B by 11, which will not change the spanning or linear independence properties of B as a basis. Then we can also write

Example IAS provides another example of this flavor, though now we can notice that X is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.

and defined V = \left \langle R\right \rangle . Our goal in that problem was to find a relation of linear dependence on the vectors in R, solve the resulting equation for one of the vectors, and re-express V as the span of a set of three vectors.

Here is another way to accomplish something similar. The row space of the matrix

is equal to \left \langle R\right \rangle . By Theorem BRS we can row-reduce this matrix, ignore any zero rows, and use the non-zero rows as column vectors that are a basis for the row space of A. Row-reducing A creates the matrix

is a basis for V. Our theorem tells us this is a basis; there is no need to verify that the subspace spanned by three vectors (rather than four) is the identical subspace, and there is no need to verify that we have reached the limit in reducing the set, since the set of three vectors is guaranteed to be linearly independent. ⊠

Subsection BNM: Bases and Nonsingular Matrices

A quick source of diverse bases for {ℂ}^{m} is the set of columns of a nonsingular matrix.

Theorem CNMB
Columns of Nonsingular Matrix are a Basis
Suppose that A is a square matrix of size m. Then the columns of A are a basis of {ℂ}^{m} if and only if A is nonsingular. □

Proof ( ⇒) Suppose that the columns of A are a basis for {ℂ}^{m}. Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that A is nonsingular.

( ⇐) Suppose that A is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, C\kern -1.95872pt \left (A\right ) = {ℂ}^{m}. This is equivalent to saying that the columns of A are a spanning set for the vector space {ℂ}^{m}. As a linearly independent spanning set, the columns of A qualify as a basis for {ℂ}^{m} (Definition B). ■
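Theorem CNMB turns the question "are these m vectors a basis of {ℂ}^{m}?" into a row-reduction question. The sketch below is illustrative only: `pivot_count` and `columns_form_basis` are hypothetical names, the two matrices are made-up data, and the test relies on the fact that a square matrix row-reduces to the identity (Theorem NMRRI) exactly when elimination finds a pivot in every column.

```python
from fractions import Fraction


def pivot_count(rows):
    """Count the pivots found by forward elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def columns_form_basis(square_matrix):
    """True when the square matrix is nonsingular, so its columns are a basis."""
    return pivot_count(square_matrix) == len(square_matrix)


# Hypothetical examples: the second matrix has a second row equal to
# twice its first row, so it is singular.
nonsingular = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]

print(columns_form_basis(nonsingular))  # True
print(columns_form_basis(singular))     # False
```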

which is row-equivalent to the 5 × 5 identity matrix {I}_{5}. So by Theorem NMRRI, K is nonsingular. Then Theorem CNMB says the set

Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Technique LC.)

With a new equivalence for a nonsingular matrix, we can update our list of equivalences.

Proof With a new equivalence for a nonsingular matrix in Theorem CNMB we can expand Theorem NME4. ■

Subsection OBC: Orthonormal Bases and Coordinates

We learned about orthogonal sets of vectors in {ℂ}^{m} back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of {ℂ}^{m}, then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.

Suppose that W is a subspace of {ℂ}^{m} with basis B. Then B spans W and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set T such that \left \langle T\right \rangle = \left \langle B\right \rangle = W and T is orthogonal. In other words, T is a basis for W, and is an orthogonal set. By scaling each vector of T to norm 1, we can convert T into an orthonormal set, without destroying the properties that make it a basis of W. In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process.
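The conversion just described (orthogonalize, then scale to norm 1) can be sketched as follows. This is a minimal illustration, not the text's own procedure written out: the function names are hypothetical, the starting basis is made-up data, floating-point arithmetic is used, and the inner product is assumed to conjugate its first argument.

```python
import math


def inner(u, v):
    """Inner product on C^m; assumed here to conjugate the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def orthonormal_basis(basis):
    """Gram-Schmidt on a linearly independent list, then scale to norm 1."""
    orthogonal = []
    for v in basis:
        u = [complex(x) for x in v]
        for q in orthogonal:
            # Subtract the component of v in the direction of q.
            coefficient = inner(q, v) / inner(q, q)
            u = [a - coefficient * b for a, b in zip(u, q)]
        orthogonal.append(u)
    return [[x / math.sqrt(inner(u, u).real) for x in u] for u in orthogonal]


# Hypothetical basis of C^3 (linearly independent, but not orthogonal).
B = [[1, 1, 0], [1j, 0, 1], [0, 1, 1]]
T = orthonormal_basis(B)

for i, u in enumerate(T):
    for j, v in enumerate(T):
        print(i, j, round(abs(inner(u, v)), 10))
```

The printed magnitudes are 1 when i = j and 0 otherwise, up to rounding, which is what it means for T to be an orthonormal set.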

Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that Q is a unitary matrix of size n. Then the n columns of Q form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since Q is invertible (Theorem UMI), we know Q is nonsingular (Theorem NI), and then the columns of Q span {ℂ}^{n} (Theorem CSNM). So the columns of a unitary matrix of size n are an orthonormal basis for {ℂ}^{n}.

Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here’s the promised theorem.

Theorem COB
Coordinates and Orthonormal Bases
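Suppose that B = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{p}\right \} is an orthonormal basis of the subspace W of {ℂ}^{m}. For any w ∈ W,

\eqalignno{ w & = \left \langle {v}_{1},\kern 1.95872pt w\right \rangle {v}_{1} + \left \langle {v}_{2},\kern 1.95872pt w\right \rangle {v}_{2} + \left \langle {v}_{3},\kern 1.95872pt w\right \rangle {v}_{3} + \mathrel{⋯} + \left \langle {v}_{p},\kern 1.95872pt w\right \rangle {v}_{p} & & }

□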

Proof Because B is a basis of W, Theorem VRRB tells us that we can write w uniquely as a linear combination of the vectors in B. So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations — just do an inner product of w with {v}_{i} to arrive at the coefficient of {v}_{i} in the linear combination.

So begin the proof by writing w as a linear combination of the vectors in B, using unknown scalars,

\eqalignno{ \left \langle {v}_{i},\kern 1.95872pt w\right \rangle & = \left \langle {v}_{i},\kern 1.95872pt {\mathop{∑ }}_{k=1}^{p}{a}_{ k}{v}_{k}\right \rangle & &\text{Theorem VRRB} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{p}\left \langle {v}_{ i},\kern 1.95872pt {a}_{k}{v}_{k}\right \rangle & &\text{Theorem IPVA} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{p}{a}_{ k}\left \langle {v}_{i},\kern 1.95872pt {v}_{k}\right \rangle & &\text{Theorem IPSM} & & & & \cr & = {a}_{i}\left \langle {v}_{i},\kern 1.95872pt {v}_{i}\right \rangle +{ \mathop{∑ }}_{\begin{array}{c}k=1 \\ k\mathrel{≠}i \end{array}}^{p}{a}_{ k}\left \langle {v}_{i},\kern 1.95872pt {v}_{k}\right \rangle & &\text{Property C} & & & & \cr & = {a}_{i}(1) +{ \mathop{∑ }}_{\begin{array}{c}k=1 \\ k\mathrel{≠}i \end{array}}^{p}{a}_{ k}(0) & &\text{Definition ONS} & & & & \cr & = {a}_{i} & & & & }

So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem’s statement. ■

was proposed, and partially verified, as an orthogonal set in Example AOS. Let’s scale each vector to norm 1, so as to form an orthonormal set in {ℂ}^{4}. Then by Theorem OSLI the set will be linearly independent, and by Theorem NME5 the set will be a basis for {ℂ}^{4}. So, once scaled to norm 1, the adjusted set will be an orthonormal basis of {ℂ}^{4}. The norms are,

\eqalignno{ \left \Vert {x}_{1}\right \Vert = \sqrt{6} & &\left \Vert {x}_{2}\right \Vert = \sqrt{174} & &\left \Vert {x}_{3}\right \Vert = \sqrt{3451} & &\left \Vert {x}_{4}\right \Vert = \sqrt{119} & & & & & & & & }

\eqalignno{ B & = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt {v}_{4}\right \} & & \cr & = \left \{ {1\over \sqrt{6}}\left [\array{ 1 + i\cr 1 \cr 1 − i \cr i } \right ],\kern 1.95872pt {1\over \sqrt{174}}\left [\array{ 1 + 5i \cr 6 + 5i \cr −7 − i \cr 1 − 6i } \right ],\kern 1.95872pt {1\over \sqrt{3451}}\left [\array{ −7 + 34i \cr −8 − 23i \cr −10 + 22i \cr 30 + 13i } \right ],\kern 1.95872pt {1\over \sqrt{119}}\left [\array{ −2 − 4i \cr 6 + i \cr 4 + 3i \cr 6 − i } \right ]\right \} & & }

Now, to illustrate Theorem COB, choose any vector from {ℂ}^{4}, say w = \left [\array{ 2\cr −3 \cr 1\cr 4 } \right ], and compute

\eqalignno{ \left \langle {v}_{1},\kern 1.95872pt w\right \rangle = {−5i\over \sqrt{6}} ,&&\left \langle {v}_{2},\kern 1.95872pt w\right \rangle = {−19 + 30i\over \sqrt{174}} ,&&\left \langle {v}_{3},\kern 1.95872pt w\right \rangle = {120 − 211i\over \sqrt{3451}} ,&&\left \langle {v}_{4},\kern 1.95872pt w\right \rangle = {6 + 12i\over \sqrt{119}} &&&&&&&& }

\eqalignno{ \left [\array{ 2\cr −3 \cr 1\cr 4 } \right ] & = {−5i\over \sqrt{6}} \left ( {1\over \sqrt{6}}\left [\array{ 1 + i\cr 1 \cr 1 − i \cr i } \right ]\right ) + {−19 + 30i\over \sqrt{174}} \left ( {1\over \sqrt{174}}\left [\array{ 1 + 5i \cr 6 + 5i \cr −7 − i \cr 1 − 6i } \right ]\right ) & & \cr &\quad \quad + {120 − 211i\over \sqrt{3451}} \left ( {1\over \sqrt{3451}}\left [\array{ −7 + 34i \cr −8 − 23i \cr −10 + 22i \cr 30 + 13i } \right ]\right ) + {6 + 12i\over \sqrt{119}} \left ( {1\over \sqrt{119}}\left [\array{ −2 − 4i \cr 6 + i \cr 4 + 3i \cr 6 − i } \right ]\right ) & & }
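For readers without unlimited patience, the expansion above can be checked numerically. The sketch below is an illustration only: it uses floating-point complex arithmetic, the name `inner` is hypothetical, and the inner product is assumed to conjugate its first argument, which is the convention under which the proof of Theorem COB pulls the scalars {a}_{k} out of the second argument unchanged.

```python
import math


def inner(u, v):
    """Inner product on C^m; assumed here to conjugate the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


# The four orthogonal vectors of this example, before scaling to norm 1.
x = [
    [1 + 1j, 1, 1 - 1j, 1j],
    [1 + 5j, 6 + 5j, -7 - 1j, 1 - 6j],
    [-7 + 34j, -8 - 23j, -10 + 22j, 30 + 13j],
    [-2 - 4j, 6 + 1j, 4 + 3j, 6 - 1j],
]
w = [2, -3, 1, 4]

# Squared norms; these should be 6, 174, 3451 and 119.
squared_norms = [inner(v, v).real for v in x]

# Scale each vector to norm 1 to obtain the orthonormal basis B.
basis = [[entry / math.sqrt(n2) for entry in v] for v, n2 in zip(x, squared_norms)]

# Theorem COB: the coefficient of v_i is a single inner product.
coefficients = [inner(v, w) for v in basis]

# Rebuild w from the coefficients and the basis vectors.
rebuilt = [sum(c * v[k] for c, v in zip(coefficients, basis)) for k in range(4)]

print(squared_norms)
print(rebuilt)
```

Up to rounding error, `rebuilt` reproduces w, and no system of equations had to be solved to find the four scalars.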

A slightly less intimidating example follows, in three dimensions and with just real numbers.

is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,

which is therefore an orthonormal basis of {ℂ}^{3}. With three vectors in {ℂ}^{3}, all with real number entries, the inner product (Definition IP) reduces to the usual “dot product” (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in B serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors \vec{i},\vec{j} and \vec{k}. We would like to decompose arbitrary vectors into “components” in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.

\eqalignno{ \left \langle {v}_{1},\kern 1.95872pt w\right \rangle = {5\over \sqrt{6}} & &\left \langle {v}_{2},\kern 1.95872pt w\right \rangle = {3\over \sqrt{2}} & &\left \langle {v}_{3},\kern 1.95872pt w\right \rangle = {8\over \sqrt{3}} & & & & & & }

which you should be able to check easily, even if you do not have much patience. ⊠

Not only do the columns of a unitary matrix form an orthonormal basis, but there is a deeper connection between orthonormal bases and unitary matrices. Informally, the next theorem says that if we transform each vector of an orthonormal basis by multiplying it by a unitary matrix, then the resulting set will be another orthonormal basis. And more remarkably, any matrix with this property must be unitary! As an equivalence (Technique E) we could take this as our defining property of a unitary matrix, though it might not have the same utility as Definition UM.

Theorem UMCOB
Unitary Matrices Convert Orthonormal Bases
Let A be an n × n matrix and B = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}\right \} be an orthonormal basis of {ℂ}^{n}. Define

\eqalignno{ C & = \left \{A{x}_{1},\kern 1.95872pt A{x}_{2},\kern 1.95872pt A{x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt A{x}_{n}\right \} & & }

Then A is a unitary matrix if and only if C is an orthonormal basis of {ℂ}^{n}. □

Proof \left (⇒\right ) Assume A is a unitary matrix and establish several facts about C. First we check that C is an orthonormal set (Definition ONS). By Theorem UMPIP, for i\mathrel{≠}j,

\eqalignno{ \left \langle A{x}_{i},\kern 1.95872pt A{x}_{j}\right \rangle & = \left \langle {x}_{i},\kern 1.95872pt {x}_{j}\right \rangle = 0 & & }

\eqalignno{ \left \Vert A{x}_{i}\right \Vert = \left \Vert {x}_{i}\right \Vert = 1 & & }

As C is an orthogonal set (Definition OSV), Theorem OSLI yields the linear independence of C. Having established that the column vectors in C form a linearly independent set, a matrix whose columns are the vectors of C is nonsingular (Theorem NMLIC), and hence these vectors form a basis of {ℂ}^{n} by Theorem CNMB.

\left (⇐\right ) Now assume that C is an orthonormal set. Let y be an arbitrary vector from {ℂ}^{n}. Since B spans {ℂ}^{n}, there are scalars, {a}_{1},\kern 1.95872pt {a}_{2},\kern 1.95872pt {a}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {a}_{n}, such that

\eqalignno{ y & = {a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{n}{x}_{n} & & }

\eqalignno{ {A}^{∗}Ay & ={ \mathop{∑ }}_{i=1}^{n}\left \langle {x}_{ i},\kern 1.95872pt {A}^{∗}Ay\right \rangle {x}_{ i} & &\text{Theorem COB} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}\left \langle {x}_{ i},\kern 1.95872pt {A}^{∗}A{\mathop{∑ }}_{j=1}^{n}{a}_{ j}{x}_{j}\right \rangle {x}_{i} & &\text{Definition TSVS} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}\left \langle {x}_{ i},\kern 1.95872pt {\mathop{∑ }}_{j=1}^{n}{A}^{∗}A{a}_{ j}{x}_{j}\right \rangle {x}_{i} & &\text{Theorem MMDAA} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}\left \langle {x}_{ i},\kern 1.95872pt {\mathop{∑ }}_{j=1}^{n}{a}_{ j}{A}^{∗}A{x}_{ j}\right \rangle {x}_{i} & &\text{Theorem MMSMM} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{j=1}^{n}\left \langle {x}_{ i},\kern 1.95872pt {a}_{j}{A}^{∗}A{x}_{ j}\right \rangle {x}_{i} & &\text{Theorem IPVA} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{j=1}^{n}{a}_{ j}\left \langle {x}_{i},\kern 1.95872pt {A}^{∗}A{x}_{ j}\right \rangle {x}_{i} & &\text{Theorem IPSM} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{j=1}^{n}{a}_{ j}\left \langle A{x}_{i},\kern 1.95872pt A{x}_{j}\right \rangle {x}_{i} & &\text{Theorem AIP} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{\begin{array}{c}j=1 \\ j\mathrel{≠}i \end{array}}^{n}{a}_{ j}\left \langle A{x}_{i},\kern 1.95872pt A{x}_{j}\right \rangle {x}_{i} +{ \mathop{∑ }}_{ℓ=1}^{n}{a}_{ ℓ}\left \langle A{x}_{ℓ},\kern 1.95872pt A{x}_{ℓ}\right \rangle {x}_{ℓ} & &\text{Property C} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{\begin{array}{c}j=1 \\ j\mathrel{≠}i \end{array}}^{n}{a}_{ j}(0){x}_{i} +{ \mathop{∑ }}_{ℓ=1}^{n}{a}_{ ℓ}(1){x}_{ℓ} & &\text{Definition ONS} & & & & \cr & ={ \mathop{∑ }}_{i=1}^{n}{ \mathop{∑ }}_{\begin{array}{c}j=1 \\ j\mathrel{≠}i \end{array}}^{n}0 +{ \mathop{∑ }}_{ℓ=1}^{n}{a}_{ ℓ}{x}_{ℓ} & &\text{Theorem ZSSM} & & & & \cr & ={ \mathop{∑ }}_{ℓ=1}^{n}{a}_{ ℓ}{x}_{ℓ} & &\text{Property Z} & & & & \cr & = y & & & & \cr & = {I}_{n}y & &\text{Theorem MMIM} & & & & }

Since the choice of y was arbitrary, Theorem EMMVP tells us that {A}^{∗}A = {I}_{ n}, so A is unitary (Definition UM). ■
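A small numerical experiment can make Theorem UMCOB concrete. Everything in this sketch is an assumption made for illustration: the orthonormal basis `B` of {ℂ}^{2}, the unitary matrix `unitary`, the non-unitary matrix `shear`, and the helper names are all hypothetical, and the inner product is taken to conjugate its first argument.

```python
import math


def inner(u, v):
    """Inner product on C^n; assumed here to conjugate the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def mat_vec(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def is_orthonormal(vectors, tol=1e-12):
    """Check <u, v> = 1 when u = v and 0 otherwise, up to rounding."""
    return all(
        abs(inner(u, v) - (1 if i == j else 0)) < tol
        for i, u in enumerate(vectors)
        for j, v in enumerate(vectors)
    )


s = 1 / math.sqrt(2)
B = [[s, s], [s, -s]]                    # an orthonormal basis of C^2
unitary = [[s, s], [s * 1j, -s * 1j]]    # columns are orthonormal
shear = [[1, 1], [0, 1]]                 # nonsingular, but not unitary

C_unitary = [mat_vec(unitary, x) for x in B]
C_shear = [mat_vec(shear, x) for x in B]

print(is_orthonormal(B))          # True
print(is_orthonormal(C_unitary))  # True
print(is_orthonormal(C_shear))    # False
```

The shear matrix still carries the basis B to a basis, since it is nonsingular, but the image is no longer orthonormal; only unitary matrices preserve that extra structure.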

Subsection READ: Reading Questions

Subsection EXC: Exercises

\eqalignno{ S & = \left \{\left [\array{ 1\cr 3 \cr 2\cr 1 } \right ],\left [\array{ 1\cr 2 \cr 1\cr 1 } \right ],\left [\array{ 1\cr 1 \cr 0\cr 1 } \right ],\left [\array{ 1\cr 2 \cr 2\cr 1 } \right ],\left [\array{ 3\cr 4 \cr 1\cr 3 } \right ]\right \}. & & }

\eqalignno{ W & = \left \{\left .\left [\array{ a + b − 2c \cr a + b − 2c + d \cr −2a + 2b + 4c − d \cr b + d } \right ]\right \vert a,b,c,d ∈ ℂ\right \} & & }

C12 Find a basis for the vector space T of lower triangular 3 × 3 matrices; that is, matrices of the form \left [\array{ ∗&0&0\cr ∗&∗ &0\cr ∗&∗ &∗ } \right ] where an asterisk represents any complex number.
Contributed by Chris Black Solution [1044]

C13 Find a basis for the subspace Q of {P}_{2}, defined by Q = \left \{\left .p(x) = a + bx + c{x}^{2}\right \vert p(0) = 0\right \}.
Contributed by Chris Black Solution [1045]

C14 Find a basis for the subspace R of {P}_{2} defined by R = \left \{\left .p(x) = a + bx + c{x}^{2}\right \vert p'(0) = 0\right \}, where p' denotes the derivative.
Contributed by Chris Black Solution [1045]

C40 From Example RSB, form an arbitrary (and nontrivial) linear combination of the four vectors in the original spanning set for W. So the result of this computation is of course an element of W. As such, this vector should be a linear combination of the basis vectors in B. Find the (unique) scalars that provide this linear combination. Repeat with another linear combination of the original four vectors.
Contributed by Robert Beezer Solution [1048]

C80 Prove that \left \{(1,\kern 1.95872pt 2),\kern 1.95872pt (2,\kern 1.95872pt 3)\right \} is a basis for the crazy vector space C (Example CVS).
Contributed by Robert Beezer

M20 In Example BM provide the verifications (linear independence and spanning) to show that B is a basis of {M}_{mn}.
Contributed by Robert Beezer Solution [1046]

T50 Theorem UMCOB says that unitary matrices are characterized as those matrices that “carry” orthonormal bases to orthonormal bases. This problem asks you to prove a similar result: nonsingular matrices are characterized as those matrices that “carry” bases to bases.

More precisely, suppose that A is a square matrix of size n and B = \left \{{x}_{1},\kern 1.95872pt {x}_{2},\kern 1.95872pt {x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}_{n}\right \} is a basis of {ℂ}^{n}. Prove that A is nonsingular if and only if C = \left \{A{x}_{1},\kern 1.95872pt A{x}_{2},\kern 1.95872pt A{x}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt A{x}_{n}\right \} is a basis of {ℂ}^{n}. (See also Exercise PD.T33, Exercise MR.T20.)
Contributed by Robert Beezer Solution [1050]

Subsection SOL: Solutions

C10 Contributed by Chris Black Statement [1038]
Theorem BS says that if we take these 5 vectors, put them into a matrix as columns, and row-reduce to discover the pivot columns, then the corresponding vectors in S will be linearly independent and span \left \langle S\right \rangle , and thus will form a basis of \left \langle S\right \rangle .

\eqalignno{ \left [\array{ 1&1&1&1&3\cr 3&2 &1 &2 &4 \cr 2&1&0&2&1\cr 1&1 &1 &1 &3 } \right ] &\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ \text{1}&0&−1&0&−2\cr 0&\text{1 } & 2 &0 & 5 \cr 0&0& 0 &\text{1}& 0\cr 0&0 & 0 &0 & 0 } \right ] & & }

Thus, the independent vectors that span \left \langle S\right \rangle are the first, second and fourth of the set, so a basis of \left \langle S\right \rangle is

\eqalignno{ B & = \left \{\left [\array{ 1\cr 3 \cr 2\cr 1 } \right ],\left [\array{ 1\cr 2 \cr 1\cr 1 } \right ],\left [\array{ 1\cr 2 \cr 2\cr 1 } \right ]\right \} & & }
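The pivot-column computation in this solution can be reproduced with a short script. This is an illustrative sketch: the helper name `rref` is hypothetical, the arithmetic is exact, and column indices are 0-based, so pivot columns 0, 1 and 3 correspond to the first, second and fourth vectors of S.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the row that will hold the next pivot
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


# The five vectors of S, in order.
S = [
    [1, 3, 2, 1],
    [1, 2, 1, 1],
    [1, 1, 0, 1],
    [1, 2, 2, 1],
    [3, 4, 1, 3],
]

# Theorem BS works with the vectors as the columns of a matrix.
matrix = [list(row) for row in zip(*S)]
R, pivots = rref(matrix)

# Keep the original vectors that sit in the pivot columns.
basis = [S[c] for c in pivots]

print(pivots)  # [0, 1, 3]
print(basis)   # [[1, 3, 2, 1], [1, 2, 1, 1], [1, 2, 2, 1]]
```

Unlike the row-space approach, this method returns vectors taken from the original set S.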

\eqalignno{ \left [\array{ a + b − 2c \cr a + b − 2c + d \cr −2a + 2b + 4c − d \cr b + d } \right ] & = \left [\array{ a\cr a \cr −2a\cr 0 } \right ] + \left [\array{ b\cr b \cr 2b\cr b } \right ] + \left [\array{ −2c\cr −2c \cr 4c\cr 0 } \right ] + \left [\array{ 0\cr d \cr −d \cr d } \right ] & & \cr & = a\left [\array{ 1\cr 1 \cr −2\cr 0 } \right ] + b\left [\array{ 1\cr 1 \cr 2\cr 1 } \right ] + c\left [\array{ −2\cr −2 \cr 4\cr 0 } \right ] + d\left [\array{ 0\cr 1 \cr −1\cr 1 } \right ] & & }

\eqalignno{ W & = \left \langle \left \{\left [\array{ 1\cr 1 \cr −2\cr 0 } \right ],\left [\array{ 1\cr 1 \cr 2\cr 1 } \right ],\left [\array{ −2\cr −2 \cr 4\cr 0 } \right ],\left [\array{ 0\cr 1 \cr −1\cr 1 } \right ]\right \}\right \rangle & & }

These four vectors span W, but we also need to determine if they are linearly independent (turns out they are not). With an application of Theorem BS we can arrive at a basis employing three of these vectors,

\eqalignno{ \left [\array{ 1 &1&−2& 0\cr 1 &1 &−2 & 1 \cr −2&2& 4 &−1\cr 0 &1 & 0 & 1 } \right ] &\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ \text{1}&0&−2&0\cr 0&\text{1 } & 0 &0 \cr 0&0& 0 &\text{1}\cr 0&0 & 0 &0 } \right ] & & }

\eqalignno{ B & = \left \{\left [\array{ 1\cr 1 \cr −2\cr 0 } \right ],\left [\array{ 1\cr 1 \cr 2\cr 1 } \right ],\left [\array{ 0\cr 1 \cr −1\cr 1 } \right ]\right \} & & }

C12 Contributed by Chris Black Statement [1039]
Let A be an arbitrary element of the specified vector space T. Then there exist a, b, c, d, e and f so that A = \left [\array{ a&0&0\cr b&c &0 \cr d&e&f } \right ]. Then

\eqalignno{ A& = a\left [\array{ 1&0&0\cr 0&0 &0 \cr 0&0&0} \right ] + b\left [\array{ 0&0&0\cr 1&0 &0 \cr 0&0&0} \right ] + c\left [\array{ 0&0&0\cr 0&1 &0 \cr 0&0&0} \right ] + d\left [\array{ 0&0&0\cr 0&0 &0 \cr 1&0&0} \right ] + e\left [\array{ 0&0&0\cr 0&0 &0 \cr 0&1&0} \right ] + f\left [\array{ 0&0&0\cr 0&0 &0 \cr 0&0&1} \right ]&& }

\eqalignno{ B & = \left \{\left [\array{ 1&0&0\cr 0&0 &0 \cr 0&0&0} \right ],\left [\array{ 0&0&0\cr 1&0 &0 \cr 0&0&0} \right ],\left [\array{ 0&0&0\cr 0&1 &0 \cr 0&0&0} \right ],\left [\array{ 0&0&0\cr 0&0 &0 \cr 1&0&0} \right ],\left [\array{ 0&0&0\cr 0&0 &0 \cr 0&1&0} \right ],\left [\array{ 0&0&0\cr 0&0 &0 \cr 0&0&1} \right ]\right \} & & }

The six vectors in B span the vector space T, and we can check rather simply that they are also linearly independent. Thus, B is a basis of T.

C13 Contributed by Chris Black Statement [1039]
If p(0) = 0, then a + b(0) + c({0}^{2}) = 0, so a = 0. Thus, we can write Q = \left \{\left .p(x) = bx + c{x}^{2}\right \vert b,c ∈ ℂ\right \}. A linearly independent set that spans Q is B = \left \{x,{x}^{2}\right \}, and this set forms a basis of Q.

C14 Contributed by Chris Black Statement [1039]
The derivative of p(x) = a + bx + c{x}^{2} is {p}^{′}(x) = b + 2cx. Thus, if p ∈ R, then {p}^{′}(0) = b + 2c(0) = 0, so we must have b = 0. We see that we can rewrite R as R = \left \{\left .p(x) = a + c{x}^{2}\right \vert a,c ∈ ℂ\right \}. A linearly independent set that spans R is B = \left \{1,{x}^{2}\right \}, and B is a basis of R.

M20 Contributed by Robert Beezer Statement [1039]
We need to establish the linear independence and spanning properties of the set

This proof is more transparent if you write out individual matrices in the basis with lots of zeros and dots and a lone one. But we don’t have room for that here, so we will use summation notation. Think carefully about each step, especially when the double summations seem to “disappear.” Begin with a relation of linear dependence, using double subscripts on the scalars to align with the basis elements.

\eqalignno{ 0 & ={ \left [O\right ]}_{ij} & &\text{Definition ZM} & & & & \cr & ={ \left [{\mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{α}_{ kℓ}{B}_{kℓ}\right ]}_{ij} & &\text{Definition ME} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{\left [{α}_{ kℓ}{B}_{kℓ}\right ]}_{ij} & &\text{Definition MA} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{α}_{ kℓ}{\left [{B}_{kℓ}\right ]}_{ij} & &\text{Definition MSM} & & & & \cr & = {α}_{ij}{\left [{B}_{ij}\right ]}_{ij} & &\text{${\left [{B}_{kℓ}\right ]}_{ij} = 0$ when $(k,ℓ)\mathrel{≠}(i,j)$} & & & & \cr & = {α}_{ij}(1) & &\text{${\left [{B}_{ij}\right ]}_{ij} = 1$} & & & & \cr & = {α}_{ij} & & & & }

Since i and j were arbitrary, we find that each scalar is zero and so B is linearly independent (Definition LI).

To establish the spanning property of B we need only show that an arbitrary matrix A can be written as a linear combination of the elements of B. So suppose that A is an arbitrary m × n matrix and consider the matrix C defined as a linear combination of the elements of B by

\eqalignno{ {\left [C\right ]}_{ij} & ={ \left [{\mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{\left [A\right ]}_{ kℓ}{B}_{kℓ}\right ]}_{ij} & &\text{Definition ME} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{\left [{\left [A\right ]}_{ kℓ}{B}_{kℓ}\right ]}_{ij} & &\text{Definition MA} & & & & \cr & ={ \mathop{∑ }}_{k=1}^{m}{ \mathop{∑ }}_{ℓ=1}^{n}{\left [A\right ]}_{ kℓ}{\left [{B}_{kℓ}\right ]}_{ij} & &\text{Definition MSM} & & & & \cr & ={ \left [A\right ]}_{ij}{\left [{B}_{ij}\right ]}_{ij} & &\text{${\left [{B}_{kℓ}\right ]}_{ij} = 0$ when $(k,ℓ)\mathrel{≠}(i,j)$} & & & & \cr & ={ \left [A\right ]}_{ij}(1) & &\text{${\left [{B}_{ij}\right ]}_{ij} = 1$} & & & & \cr & ={ \left [A\right ]}_{ij} & & & & }

So by Definition ME, A = C, and therefore A ∈\left \langle B\right \rangle . By Definition B, the set B is a basis of the vector space {M}_{mn}.
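
The spanning argument can be checked on a small case. The following is a minimal sketch, assuming matrices are stored as lists of rows and that {B}_{kℓ} is the m × n matrix with a single 1 in row k, column ℓ; the function names `unit_matrix` and `rebuild` are illustrative assumptions. It forms the linear combination of the {B}_{kℓ} with scalars {\left [A\right ]}_{kℓ} and compares the result with A.

```python
def unit_matrix(m, n, k, l):
    """B_kl: the m x n matrix with a 1 in row k, column l (1-indexed), zeros elsewhere."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(1, n + 1)]
            for i in range(1, m + 1)]

def rebuild(A):
    """Form C = sum over k, l of [A]_kl * B_kl, entry by entry."""
    m, n = len(A), len(A[0])
    C = [[0] * n for _ in range(m)]
    for k in range(1, m + 1):
        for l in range(1, n + 1):
            B = unit_matrix(m, n, k, l)
            scalar = A[k - 1][l - 1]
            for i in range(m):
                for j in range(n):
                    # Only the (k, l) entry of B is nonzero, so every other
                    # term contributes 0: the sums "disappear".
                    C[i][j] += scalar * B[i][j]
    return C

A = [[2, -1, 0],
     [4, 7, 3]]
C = rebuild(A)   # equals A entry by entry
```

The inner loops mirror the double summation in the proof: for a fixed entry (i, j), the only nonzero contribution comes from the term with (k, ℓ) = (i, j).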

(You probably used a different collection of scalars.) We want to write y as a linear combination of

We could set this up as vector equation with variables as scalars in a linear combination of the vectors in B, but since the first two slots of B have such a nice pattern of zeros and ones, we can determine the necessary scalars easily and then double-check our answer with a computation in the third slot,

Notice how the uniqueness of these scalars arises. They are forced to be 25 and − 10.

T50 Contributed by Robert Beezer Statement [1040]
Our first proof relies mostly on definitions of linear independence and spanning, which is a good exercise. The second proof is shorter and turns on a technical result from our work with matrix inverses, Theorem NPNT.

\left (⇒\right ) Assume that A is nonsingular and prove that C is a basis of {ℂ}^{n}. First show that C is linearly independent. Work on a relation of linear dependence on C,

\eqalignno{ 0 & = {a}_{1}A{x}_{1} + {a}_{2}A{x}_{2} + {a}_{3}A{x}_{3} + \mathrel{⋯} + {a}_{n}A{x}_{n} & &\text{Definition RLD} & & & & \cr & = A{a}_{1}{x}_{1} + A{a}_{2}{x}_{2} + A{a}_{3}{x}_{3} + \mathrel{⋯} + A{a}_{n}{x}_{n} & &\text{Theorem MMSMM} & & & & \cr & = A\left ({a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{n}{x}_{n}\right ) & &\text{Theorem MMDAA} & & & & }

Since A is nonsingular, Definition NM and Theorem SLEMM allow us to conclude that

\eqalignno{ {a}_{1}{x}_{1} + {a}_{2}{x}_{2} + \mathrel{⋯} + {a}_{n}{x}_{n} & = 0 & & }

But this is a relation of linear dependence on the linearly independent set B, so the scalars are trivial, {a}_{1} = {a}_{2} = {a}_{3} = \mathrel{⋯} = {a}_{n} = 0. By Definition LI, the set C is linearly independent.

Now prove that C spans {ℂ}^{n}. Given an arbitrary vector y ∈ {ℂ}^{n}, can it be expressed as a linear combination of the vectors in C? Since A is a nonsingular matrix we can define the vector w to be the unique solution of the system ℒS\left (A, y\right ) (Theorem NMUS). Since w ∈ {ℂ}^{n} we can write w as a linear combination of the vectors in the basis B. So there are scalars, {b}_{1}, {b}_{2}, {b}_{3}, …, {b}_{n}, such that

\eqalignno{ w & = {b}_{1}{x}_{1} + {b}_{2}{x}_{2} + {b}_{3}{x}_{3} + \mathrel{⋯} + {b}_{n}{x}_{n} & & }

Then,

\eqalignno{ y & = Aw & &\text{Theorem SLEMM} & & & & \cr & = A\left ({b}_{1}{x}_{1} + {b}_{2}{x}_{2} + {b}_{3}{x}_{3} + \mathrel{⋯} + {b}_{n}{x}_{n}\right ) & &\text{Definition TSVS} & & & & \cr & = A{b}_{1}{x}_{1} + A{b}_{2}{x}_{2} + A{b}_{3}{x}_{3} + \mathrel{⋯} + A{b}_{n}{x}_{n} & &\text{Theorem MMDAA} & & & & \cr & = {b}_{1}A{x}_{1} + {b}_{2}A{x}_{2} + {b}_{3}A{x}_{3} + \mathrel{⋯} + {b}_{n}A{x}_{n} & &\text{Theorem MMSMM} & & & & }

So we can write an arbitrary vector of {ℂ}^{n} as a linear combination of the elements of C. In other words, C spans {ℂ}^{n} (Definition TSVS). By Definition B, the set C is a basis for {ℂ}^{n}.

\left (⇐\right ) Assume that C is a basis and prove that A is nonsingular. Let x be a solution to the homogeneous system ℒS\left (A, 0\right ). Since B is a basis of {ℂ}^{n} there are scalars, {a}_{1}, {a}_{2}, {a}_{3}, …, {a}_{n}, such that

\eqalignno{ x & = {a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{n}{x}_{n} & & }

Then,

\eqalignno{ 0 & = Ax & &\text{Theorem SLEMM} & & & & \cr & = A\left ({a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{n}{x}_{n}\right ) & &\text{Definition TSVS} & & & & \cr & = A{a}_{1}{x}_{1} + A{a}_{2}{x}_{2} + A{a}_{3}{x}_{3} + \mathrel{⋯} + A{a}_{n}{x}_{n} & &\text{Theorem MMDAA} & & & & \cr & = {a}_{1}A{x}_{1} + {a}_{2}A{x}_{2} + {a}_{3}A{x}_{3} + \mathrel{⋯} + {a}_{n}A{x}_{n} & &\text{Theorem MMSMM} & & & & }

This is a relation of linear dependence on the linearly independent set C, so the scalars must all be zero, {a}_{1} = {a}_{2} = {a}_{3} = \mathrel{⋯} = {a}_{n} = 0. Thus,

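\eqalignno{ x & = {a}_{1}{x}_{1} + {a}_{2}{x}_{2} + {a}_{3}{x}_{3} + \mathrel{⋯} + {a}_{n}{x}_{n} = 0{x}_{1} + 0{x}_{2} + 0{x}_{3} + \mathrel{⋯} + 0{x}_{n} = 0. & & }

So the only solution to the homogeneous system ℒS\left (A, 0\right ) is the trivial solution, and by Definition NM the matrix A is nonsingular.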

Now for a second proof. Take the vectors for B and use them as the columns of a matrix, G = \left [{x}_{1}|{x}_{2}|{x}_{3}|…|{x}_{n}\right ]. By Theorem CNMB, because we have the hypothesis that B is a basis of {ℂ}^{n}, G is a nonsingular matrix. Notice that the columns of AG are exactly the vectors in the set C, by Definition MM.

\eqalignno{ A\text{ nonsingular} & \mathrel{⇔} AG\text{ nonsingular} & &\text{Theorem NPNT} & & & & \cr & \mathrel{⇔} C\text{ basis for }{ℂ}^{n} & &\text{Theorem CNMB} & & & & }
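
The second proof can be illustrated on a 3 × 3 example. This is a minimal sketch under stated assumptions: nonsingularity is tested with a nonzero determinant (a swapped-in criterion, not the one used in the proof above), the determinant is computed by exact elimination over fractions, and the matrices `G`, `A_good` and `A_bad` as well as the helper names are made up for illustration.

```python
from fractions import Fraction

def det(M):
    """Determinant of a square matrix by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d  # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

def matmul(A, G):
    """Matrix product AG; column j of AG is A times column j of G."""
    return [[sum(A[i][k] * G[k][j] for k in range(len(G)))
             for j in range(len(G[0]))] for i in range(len(A))]

def columns_form_basis(M):
    """Columns of a square M are a basis of C^n iff M is nonsingular (Theorem CNMB).
    Assumption: nonsingularity is checked here as det(M) != 0."""
    return det(M) != 0

G = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]        # columns x1, x2, x3 form a basis B
A_good = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]   # nonsingular
A_bad = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]    # singular: row 2 is twice row 1

good_is_basis = columns_form_basis(matmul(A_good, G))  # C = {A x_i} for A_good
bad_is_basis = columns_form_basis(matmul(A_bad, G))    # C = {A x_i} for A_bad
```

With the nonsingular choice the columns of AG pass the test, and with the singular choice they fail, matching both directions of the equivalence for this one example.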

T51 Contributed by Robert Beezer Statement [1040]
Choose B to be the set of standard unit vectors, a particularly nice basis of {ℂ}^{n} (Theorem SUVB). For a vector {e}_{j} (Definition SUV) from this basis, what is A{e}_{j}?