Section B Bases

From A First Course in Linear Algebra
Version 2.00
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/

A basis of a vector space is one of the most useful concepts in linear algebra. It often provides a concise, finite description of an infinite vector space.

Subsection B: Bases

We now have all the tools in place to define a basis of a vector space.

Definition B
Basis
Suppose $V$ is a vector space. Then a subset $S \subseteq V$ is a basis of $V$ if it is linearly independent and spans $V$ . $△$

So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans $V$ ensures that $S$ has enough raw material to build $V$, while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.

You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of $ℂ^{m}$ . Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than $ℂ^{m}$ . Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a non-zero scalar and create a slightly different set that is still a basis. For “important” vector spaces, it will be convenient to have a collection of “nice” bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis though there is nothing precise enough about this term to allow us to define it formally — it is a question of style. Here are some nice bases for important vector spaces.

Theorem SUVB
Standard Unit Vectors are a Basis
The set of standard unit vectors for $ℂ^{m}$ (Definition SUV), $B = \{e_{1}, e_{2}, e_{3}, \dots, e_{m}\} = \{e_{i}| 1 \leq i \leq m\}$ is a basis for the vector space $ℂ^{m}$ . $□$

Proof We must show that the set $B$ is both linearly independent and a spanning set for $ℂ^{m}$ . First, the vectors in $B$ are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.

Suppose we grab an arbitrary vector from $ℂ^{m}$ , say

v = [\begin{matrix} v_{1} \\ v_{2} \\ v_{3} \\ ⋮ \\ v_{m} \end{matrix}] .

Can we write $v$ as a linear combination of the vectors in $B$ ? Yes, and quite simply.

\begin{array}{l} [\begin{matrix} v_{1} \\ v_{2} \\ v_{3} \\ ⋮ \\ v_{m} \end{matrix}] & = v_{1} [\begin{matrix} 1 \\ 0 \\ 0 \\ ⋮ \\ 0 \end{matrix}] + v_{2} [\begin{matrix} 0 \\ 1 \\ 0 \\ ⋮ \\ 0 \end{matrix}] + v_{3} [\begin{matrix} 0 \\ 0 \\ 1 \\ ⋮ \\ 0 \end{matrix}] + \dots + v_{m} [\begin{matrix} 0 \\ 0 \\ 0 \\ ⋮ \\ 1 \end{matrix}] \\ v & = v_{1} e_{1} + v_{2} e_{2} + v_{3} e_{3} + \dots + v_{m} e_{m} \end{array}

This shows that $ℂ^{m} \subseteq 〈B〉$, which is sufficient to show that $B$ is a spanning set for $ℂ^{m}$. $■$

Example BP
Bases for $P_{n}$
The vector space of polynomials with degree at most $n$ , $P_{n}$ , has the basis

B = \{1, x, x^{2}, x^{3}, \dots, x^{n}\} .

Another nice basis for $P_{n}$ is

C = \{1, 1 + x, 1 + x + x^{2}, 1 + x + x^{2} + x^{3}, \dots, 1 + x + x^{2} + x^{3} + \dots + x^{n}\} .

Checking that each of $B$ and $C$ is a linearly independent spanning set is a good exercise. $⊠$
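One way to see that $C$ spans $P_{n}$ is to convert coordinates explicitly. The following Python sketch stores a polynomial as its list of monomial coefficients $a_{0}, a_{1}, \dots, a_{n}$ (an assumed representation, not one used in the text). It relies on the telescoping formula $c_{k} = a_{k} - a_{k + 1}$ for $k < n$ and $c_{n} = a_{n}$, which is our own observation rather than a result quoted from the text. The function names and the sample polynomial are hypothetical.

```python
def coords_in_C(a):
    """Coordinates of a_0 + a_1 x + ... + a_n x^n relative to C.

    C = {1, 1 + x, ..., 1 + x + ... + x^n}; the k-th coordinate is
    a_k - a_(k+1) for k < n, and the last coordinate is a_n.
    """
    n = len(a) - 1
    return [a[k] - a[k + 1] for k in range(n)] + [a[n]]


def expand_from_C(c):
    """Monomial coefficients of sum_k c_k (1 + x + ... + x^k)."""
    n = len(c) - 1
    # x^j appears in every element of C whose index k is at least j.
    return [sum(c[j:]) for j in range(n + 1)]


p = [4, -1, 0, 7]      # hypothetical example: 4 - x + 7x^3 in P_3
c = coords_in_C(p)     # coordinates of p relative to C
print(c)
print(expand_from_C(c) == p)
```

Because `expand_from_C` undoes `coords_in_C`, every polynomial is reached by some linear combination of $C$. Expanding a list of coordinates to the zero polynomial forces every coordinate to be zero, which is the linear independence half of the exercise.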

Example BM
A basis for the vector space of matrices
In the vector space $M_{m n}$ of matrices (Example VSM) define the matrices $B_{k ℓ}$ , $1 \leq k \leq m$ , $1 \leq ℓ \leq n$ by

{[B_{k ℓ}]}_{i j} = \begin{cases} 1 & \text{if } k = i, ℓ = j \\ 0 & \text{otherwise} \end{cases}

So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all $m n$ of them,

B = \{B_{k ℓ}| 1 \leq k \leq m, 1 \leq ℓ \leq n\}

forms a basis for $M_{m n}$ . $⊠$

The bases described above will often be convenient ones to work with. However, a basis does not have to look obviously like a basis.

Example BSP4
A basis for a subspace of $P_{4}$
In Example SSP4 we showed that

S = \{x - 2, x^{2} - 4 x + 4, x^{3} - 6 x^{2} + 12 x - 8, x^{4} - 8 x^{3} + 24 x^{2} - 32 x + 16\}

is a spanning set for $W = \{p (x)| p \in P_{4}, p (2) = 0\}$ . We will now show that $S$ is also linearly independent in $W$ . Begin with a relation of linear dependence,

\begin{array}{l} 0 + 0 x + 0 x^{2} + 0 x^{3} + 0 x^{4} & = α_{1} (x - 2) + α_{2} (x^{2} - 4 x + 4) \\ + α_{3} (x^{3} - 6 x^{2} + 12 x - 8) + α_{4} (x^{4} - 8 x^{3} + 24 x^{2} - 32 x + 16) \\ = α_{4} x^{4} + (α_{3} - 8 α_{4}) x^{3} + (α_{2} - 6 α_{3} + 24 α_{4}) x^{2} \\ + (α_{1} - 4 α_{2} + 12 α_{3} - 32 α_{4}) x + (- 2 α_{1} + 4 α_{2} - 8 α_{3} + 16 α_{4}) \end{array}

Equating coefficients (vector equality in $P_{4}$ ) gives the homogeneous system of five equations in four variables,

\begin{array}{l} α_{4} & = 0 \\ α_{3} - 8 α_{4} & = 0 \\ α_{2} - 6 α_{3} + 24 α_{4} & = 0 \\ α_{1} - 4 α_{2} + 12 α_{3} - 32 α_{4} & = 0 \\ - 2 α_{1} + 4 α_{2} - 8 α_{3} + 16 α_{4} & = 0 \end{array}

We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form

[\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{matrix}]

With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set $S$ is linearly independent (Definition LI). Finally, $S$ has earned the right to be called a basis for $W$ (Definition B). $⊠$
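The row reduction in this example can be reproduced with exact rational arithmetic. The Python sketch below is one possible implementation, with a hypothetical helper `rref` written for illustration. The rows of `coefficients` are the five equations above, with columns ordered $α_{1}, α_{2}, α_{3}, α_{4}$.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column, at or below pivot_row.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]   # leading entry becomes 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


coefficients = [
    [0, 0, 0, 1],
    [0, 0, 1, -8],
    [0, 1, -6, 24],
    [1, -4, 12, -32],
    [-2, 4, -8, 16],
]
reduced = rref(coefficients)
for row in reduced:
    print([str(entry) for entry in row])
```

Four pivot columns for four variables means there are no free variables, so only the trivial solution exists.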

Example BSM22
A basis for a subspace of $M_{22}$
In Example SSM22 we discovered that

Q = \{[\begin{matrix} - 3 & 1 \\ 0 & 0 \end{matrix}], [\begin{matrix} 1 & 0 \\ - 4 & 1 \end{matrix}]\}

is a spanning set for the subspace

Z = \{[\begin{matrix} a & b \\ c & d \end{matrix}]| a + 3 b - c - 5 d = 0, - 2 a - 6 b + 3 c + 14 d = 0\}

of the vector space of all $2 \times 2$ matrices, $M_{22}$ . If we can also determine that $Q$ is linearly independent in $Z$ (or in $M_{22}$ ), then it will qualify as a basis for $Z$ . Let’s begin with a relation of linear dependence.

\begin{array}{l} [\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix}] & = α_{1} [\begin{matrix} - 3 & 1 \\ 0 & 0 \end{matrix}] + α_{2} [\begin{matrix} 1 & 0 \\ - 4 & 1 \end{matrix}] \\ = [\begin{matrix} - 3 α_{1} + α_{2} & α_{1} \\ - 4 α_{2} & α_{2} \end{matrix}] \end{array}

Using our definition of matrix equality (Definition ME) we equate corresponding entries and get a homogeneous system of four equations in two variables,

\begin{array}{l} - 3 α_{1} + α_{2} & = 0 \\ α_{1} & = 0 \\ - 4 α_{2} & = 0 \\ α_{2} & = 0 \end{array}

We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that $α_{1} = 0$ , $α_{2} = 0$ is the only solution to this homogeneous system. This qualifies the set $Q$ as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore $Q$ is a basis for $Z$ (Definition B). $⊠$

Example BC
Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set $R = \{(1, 0), (6, 3)\}$ from the crazy vector space, $C$ (Example CVS), is linearly independent and is a spanning set for $C$ . By Definition B we see that $R$ is a basis for $C$ . $⊠$

We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word “basis” in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as a row space and we can then use Theorem BRS.

Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.

Subsection BSCV: Bases for Spans of Column Vectors

We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for $ℂ^{m}$ and its subspaces.

Suppose we have a subspace of $ℂ^{m}$ that is expressed as the span of a set of vectors, $S$ , and $S$ is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.

Example RSB
Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we looked at the set

W = 〈\{[\begin{matrix} 2 \\ - 3 \\ 1 \end{matrix}], [\begin{matrix} 1 \\ 4 \\ 1 \end{matrix}], [\begin{matrix} 7 \\ - 5 \\ 4 \end{matrix}], [\begin{matrix} - 7 \\ - 6 \\ - 5 \end{matrix}]\}〉

with an eye towards realizing $W$ as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write $W$ as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.

Now we know that $W$ is a subspace and must have a basis. Consider the matrix, $C$ , whose rows are the vectors in the spanning set for $W$ ,

C = [\begin{matrix} 2 & - 3 & 1 \\ 1 & 4 & 1 \\ 7 & - 5 & 4 \\ - 7 & - 6 & - 5 \end{matrix}]

Then, by Definition RSM, the row space of $C$ will be $W$ , $ℛ (C) = W$ . Theorem BRS tells us that if we row-reduce $C$ , the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for $ℛ (C)$ , and hence a basis for $W$ . Let’s do it — $C$ row-reduces to

[\begin{matrix} 1 & 0 & \frac{7}{11} \\ 0 & 1 & \frac{1}{11} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]

If we convert the two nonzero rows to column vectors then we have a basis,

B = \{[\begin{matrix} 1 \\ 0 \\ \frac{7}{11} \end{matrix}], [\begin{matrix} 0 \\ 1 \\ \frac{1}{11} \end{matrix}]\}

and

W = 〈\{[\begin{matrix} 1 \\ 0 \\ \frac{7}{11} \end{matrix}], [\begin{matrix} 0 \\ 1 \\ \frac{1}{11} \end{matrix}]\}〉

For aesthetic reasons, we might wish to multiply each vector in $B$ by $11$ , which will not change the spanning or linear independence properties of $B$ as a basis. Then we can also write

W = 〈\{[\begin{matrix} 11 \\ 0 \\ 7 \end{matrix}], [\begin{matrix} 0 \\ 11 \\ 1 \end{matrix}]\}〉

⊠
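The computation in Example RSB can be sketched in Python with exact fractions. The helper `rref` is a hypothetical function written for this illustration, not something defined in the text. The rows of `C` are the four spanning vectors of $W$, and the nonzero rows of the reduced matrix are the basis vectors promised by Theorem BRS.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column, at or below pivot_row.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]   # leading entry becomes 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


C = [
    [2, -3, 1],
    [1, 4, 1],
    [7, -5, 4],
    [-7, -6, -5],
]
reduced = rref(C)
# Keep only the nonzero rows; each one is a basis vector of W.
basis = [row for row in reduced if any(entry != 0 for entry in row)]
# Clearing denominators does not change the span or the independence.
scaled = [[11 * entry for entry in row] for row in basis]
print([[str(entry) for entry in row] for row in basis])
print([[str(entry) for entry in row] for row in scaled])
```

Using `Fraction` rather than floating-point numbers keeps entries such as $\frac{7}{11}$ exact, so the output can be compared directly with the matrices displayed above.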

Example IAS provides another example of this flavor, though now we can notice that $X$ is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.

Example RS
Reducing a span
In Example RSC5 we began with a set of $n = 4$ vectors from $ℂ^{5}$ ,

R = \{v_{1}, v_{2}, v_{3}, v_{4}\} = \{[\begin{matrix} 1 \\ 2 \\ - 1 \\ 3 \\ 2 \end{matrix}], [\begin{matrix} 2 \\ 1 \\ 3 \\ 1 \\ 2 \end{matrix}], [\begin{matrix} 0 \\ - 7 \\ 6 \\ - 11 \\ - 2 \end{matrix}], [\begin{matrix} 4 \\ 1 \\ 2 \\ 1 \\ 6 \end{matrix}]\}

and defined $V = 〈R〉$ . Our goal in that problem was to find a relation of linear dependence on the vectors in $R$ , solve the resulting equation for one of the vectors, and re-express $V$ as the span of a set of three vectors.

Here is another way to accomplish something similar. The row space of the matrix

A = [\begin{matrix} 1 & 2 & - 1 & 3 & 2 \\ 2 & 1 & 3 & 1 & 2 \\ 0 & - 7 & 6 & - 11 & - 2 \\ 4 & 1 & 2 & 1 & 6 \end{matrix}]

is equal to $〈R〉$ . By Theorem BRS we can row-reduce this matrix, ignore any zero rows, and use the non-zero rows as column vectors that are a basis for the row space of $A$ . Row-reducing $A$ creates the matrix

[\begin{matrix} 1 & 0 & 0 & - \frac{1}{17} & \frac{30}{17} \\ 0 & 1 & 0 & \frac{25}{17} & - \frac{2}{17} \\ 0 & 0 & 1 & - \frac{2}{17} & - \frac{8}{17} \\ 0 & 0 & 0 & 0 & 0 \end{matrix}]

So, writing the nonzero rows as column vectors, the set

\{[\begin{matrix} 1 \\ 0 \\ 0 \\ - \frac{1}{17} \\ \frac{30}{17} \end{matrix}], [\begin{matrix} 0 \\ 1 \\ 0 \\ \frac{25}{17} \\ - \frac{2}{17} \end{matrix}], [\begin{matrix} 0 \\ 0 \\ 1 \\ - \frac{2}{17} \\ - \frac{8}{17} \end{matrix}]\}

is a basis for $V$. Our theorem tells us this is a basis; there is no need to verify that the subspace spanned by three vectors (rather than four) is the identical subspace, and there is no need to verify that we have reached the limit in reducing the set, since the set of three vectors is guaranteed to be linearly independent. $⊠$

Subsection BNM: Bases and Nonsingular Matrices

A quick source of diverse bases for $ℂ^{m}$ is the set of columns of a nonsingular matrix.

Theorem CNMB
Columns of Nonsingular Matrix are a Basis
Suppose that $A$ is a square matrix of size $m$ . Then the columns of $A$ are a basis of $ℂ^{m}$ if and only if $A$ is nonsingular. $□$

Proof ( $\Rightarrow$ ) Suppose that the columns of $A$ are a basis for $ℂ^{m}$ . Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that $A$ is nonsingular.

( $\Leftarrow$ ) Suppose that $A$ is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, $C (A) = ℂ^{m}$ . This is equivalent to saying that the columns of $A$ are a spanning set for the vector space $ℂ^{m}$ . As a linearly independent spanning set, the columns of $A$ qualify as a basis for $ℂ^{m}$ (Definition B). $■$

Example CABAK
Columns as Basis, Archetype K
Archetype K is the $5 \times 5$ matrix

K = [\begin{matrix} 10 & 18 & 24 & 24 & - 12 \\ 12 & - 2 & - 6 & 0 & - 18 \\ - 30 & - 21 & - 23 & - 30 & 39 \\ 27 & 30 & 36 & 37 & - 30 \\ 18 & 24 & 30 & 30 & - 20 \end{matrix}]

which is row-equivalent to the $5 \times 5$ identity matrix $I_{5}$ . So by Theorem NMRRI, $K$ is nonsingular. Then Theorem CNMB says the set

\{[\begin{matrix} 10 \\ 12 \\ - 30 \\ 27 \\ 18 \end{matrix}], [\begin{matrix} 18 \\ - 2 \\ - 21 \\ 30 \\ 24 \end{matrix}], [\begin{matrix} 24 \\ - 6 \\ - 23 \\ 36 \\ 30 \end{matrix}], [\begin{matrix} 24 \\ 0 \\ - 30 \\ 37 \\ 30 \end{matrix}], [\begin{matrix} - 12 \\ - 18 \\ 39 \\ - 30 \\ - 20 \end{matrix}]\}

is a (novel) basis of $ℂ^{5}$ . $⊠$
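The claim that $K$ is row-equivalent to $I_{5}$ can be checked by machine. The Python sketch below uses a hypothetical helper `rref` with exact rational arithmetic. It is only one way to confirm the row reduction, and it assumes the entries of $K$ are exactly those displayed above.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column, at or below pivot_row.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]   # leading entry becomes 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


K = [
    [10, 18, 24, 24, -12],
    [12, -2, -6, 0, -18],
    [-30, -21, -23, -30, 39],
    [27, 30, 36, 37, -30],
    [18, 24, 30, 30, -20],
]
identity = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
is_nonsingular = rref(K) == identity
print(is_nonsingular)
```

When `is_nonsingular` is true, Theorem NMRRI and Theorem CNMB together say that the five columns of `K` form a basis of $ℂ^{5}$.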

Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Technique LC.)

With a new equivalence for a nonsingular matrix, we can update our list of equivalences.

Theorem NME5
Nonsingular Matrix Equivalences, Round 5
Suppose that $A$ is a square matrix of size $n$ . The following are equivalent.

$A$ is nonsingular.
$A$ row-reduces to the identity matrix.
The null space of $A$ contains only the zero vector, $N (A) = \{0\}$ .
The linear system $ℒ S (A, b)$ has a unique solution for every possible choice of $b$ .
The columns of $A$ are a linearly independent set.
$A$ is invertible.
The column space of $A$ is $ℂ^{n}$ , $C (A) = ℂ^{n}$ .
The columns of $A$ are a basis for $ℂ^{n}$ .

□

Proof With a new equivalence for a nonsingular matrix in Theorem CNMB we can expand Theorem NME4. $■$

Subsection OBC: Orthonormal Bases and Coordinates

We learned about orthogonal sets of vectors in $ℂ^{m}$ back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of $ℂ^{m}$ , then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.

Suppose that $W$ is a subspace of $ℂ^{m}$ with basis $B$ . Then $B$ spans $W$ and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set $T$ such that $〈T〉 = 〈B〉 = W$ and $T$ is orthogonal. In other words, $T$ is a basis for $W$ , and is an orthogonal set. By scaling each vector of $T$ to norm 1, we can convert $T$ into an orthonormal set, without destroying the properties that make it a basis of $W$ . In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process.
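As a rough illustration of this conversion, here is a Python sketch of the Gram-Schmidt step for vectors with real entries. It uses exact fractions, so the normalization to norm 1 (which needs square roots) is left out. The function names are hypothetical, and the three input vectors are the set $\{x_{1}, x_{2}, x_{3}\}$ of Example CROB3 later in this section.

```python
from fractions import Fraction


def inner(u, v):
    """Inner product for real entries, so no conjugation is needed."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthogonal list spanning the same subspace.

    Assumes the input is linearly independent, so no zero vector is
    ever produced and the divisions below are safe.
    """
    orthogonal = []
    for v in vectors:
        u = [Fraction(x) for x in v]
        for q in orthogonal:
            # Subtract the component of v in the direction of q.
            coeff = Fraction(inner(v, q)) / inner(q, q)
            u = [a - coeff * b for a, b in zip(u, q)]
        orthogonal.append(u)
    return orthogonal


B = [(1, 2, 1), (-1, 0, 1), (2, 1, 1)]
T = gram_schmidt(B)
for u in T:
    print([str(entry) for entry in u])
```

The third vector produced is a scalar multiple of $(1, -1, 1)$, which matches the direction of $v_{3}$ in Example CROB3 once each vector is scaled to norm 1.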

Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that $Q$ is a unitary matrix of size $n$ . Then the $n$ columns of $Q$ form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since $Q$ is invertible (Theorem UMI), we know $Q$ is nonsingular (Theorem NI), and then the columns of $Q$ span $ℂ^{n}$ (Theorem CSNM). So the columns of a unitary matrix of size $n$ are an orthonormal basis for $ℂ^{n}$ .

Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here’s the promised theorem.

Theorem COB
Coordinates and Orthonormal Bases
Suppose that $B = \{v_{1}, v_{2}, v_{3}, \dots, v_{p}\}$ is an orthonormal basis of the subspace $W$ of $ℂ^{m}$ . For any $w \in W$ ,

w = 〈w, v_{1}〉 v_{1} + 〈w, v_{2}〉 v_{2} + 〈w, v_{3}〉 v_{3} + \dots + 〈w, v_{p}〉 v_{p}

□

Proof Because $B$ is a basis of $W$ , Theorem VRRB tells us that we can write $w$ uniquely as a linear combination of the vectors in $B$ . So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations — just do an inner product of $w$ with $v_{i}$ to arrive at the coefficient of $v_{i}$ in the linear combination.

So begin the proof by writing $w$ as a linear combination of the vectors in $B$ , using unknown scalars,

w = a_{1} v_{1} + a_{2} v_{2} + a_{3} v_{3} + \dots + a_{p} v_{p}

and compute,

\begin{array}{l} 〈w, v_{i}〉 & = 〈\sum_{k = 1}^{p} a_{k} v_{k}, v_{i}〉 & Theorem VRRB \\ = \sum_{k = 1}^{p} 〈a_{k} v_{k}, v_{i}〉 & Theorem IPVA \\ = \sum_{k = 1}^{p} a_{k} 〈v_{k}, v_{i}〉 & Theorem IPSM \\ = a_{i} 〈v_{i}, v_{i}〉 + \sum_{\begin{array}{c} k = 1 \\ k \neq i \end{array}}^{p} a_{k} 〈v_{k}, v_{i}〉 & Property C \\ = a_{i} (1) + \sum_{\begin{array}{c} k = 1 \\ k \neq i \end{array}}^{p} a_{k} (0) & Definition ONS \\ = a_{i} \end{array}

So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem’s statement. $■$

Example CROB4
Coordinatization relative to an orthonormal basis, $ℂ^{4}$
The set

\{x_{1}, x_{2}, x_{3}, x_{4}\} = \{[\begin{matrix} 1 + i \\ 1 \\ 1 - i \\ i \end{matrix}], [\begin{matrix} 1 + 5 i \\ 6 + 5 i \\ - 7 - i \\ 1 - 6 i \end{matrix}], [\begin{matrix} - 7 + 34 i \\ - 8 - 23 i \\ - 10 + 22 i \\ 30 + 13 i \end{matrix}], [\begin{matrix} - 2 - 4 i \\ 6 + i \\ 4 + 3 i \\ 6 - i \end{matrix}]\}

was proposed, and partially verified, as an orthogonal set in Example AOS. Let’s scale each vector to norm 1, so as to form an orthonormal set in $ℂ^{4}$. Then by Theorem OSLI the set will be linearly independent, and by Theorem NME5 the set will be a basis for $ℂ^{4}$. So, once scaled to norm 1, the adjusted set will be an orthonormal basis of $ℂ^{4}$. The norms are,

\begin{array}{l} ‖x_{1}‖ = \sqrt{6} & ‖x_{2}‖ = \sqrt{174} & ‖x_{3}‖ = \sqrt{3451} & ‖x_{4}‖ = \sqrt{119} \end{array}

So an orthonormal basis is

\begin{array}{l} B & = \{v_{1}, v_{2}, v_{3}, v_{4}\} \\ = \{\frac{1}{\sqrt{6}} [\begin{matrix} 1 + i \\ 1 \\ 1 - i \\ i \end{matrix}], \frac{1}{\sqrt{174}} [\begin{matrix} 1 + 5 i \\ 6 + 5 i \\ - 7 - i \\ 1 - 6 i \end{matrix}], \frac{1}{\sqrt{3451}} [\begin{matrix} - 7 + 34 i \\ - 8 - 23 i \\ - 10 + 22 i \\ 30 + 13 i \end{matrix}], \frac{1}{\sqrt{119}} [\begin{matrix} - 2 - 4 i \\ 6 + i \\ 4 + 3 i \\ 6 - i \end{matrix}]\} \end{array}

Now, to illustrate Theorem COB, choose any vector from $ℂ^{4}$ , say $w = [\begin{matrix} 2 \\ - 3 \\ 1 \\ 4 \end{matrix}]$ , and compute

\begin{array}{l} 〈w, v_{1}〉 = \frac{- 5 i}{\sqrt{6}}, & 〈w, v_{2}〉 = \frac{- 19 + 30 i}{\sqrt{174}}, & 〈w, v_{3}〉 = \frac{120 - 211 i}{\sqrt{3451}}, & 〈w, v_{4}〉 = \frac{6 + 12 i}{\sqrt{119}} \end{array}

Then Theorem COB guarantees that

\begin{array}{l} [\begin{matrix} 2 \\ - 3 \\ 1 \\ 4 \end{matrix}] & = \frac{- 5 i}{\sqrt{6}} (\frac{1}{\sqrt{6}} [\begin{matrix} 1 + i \\ 1 \\ 1 - i \\ i \end{matrix}]) + \frac{- 19 + 30 i}{\sqrt{174}} (\frac{1}{\sqrt{174}} [\begin{matrix} 1 + 5 i \\ 6 + 5 i \\ - 7 - i \\ 1 - 6 i \end{matrix}]) \\ + \frac{120 - 211 i}{\sqrt{3451}} (\frac{1}{\sqrt{3451}} [\begin{matrix} - 7 + 34 i \\ - 8 - 23 i \\ - 10 + 22 i \\ 30 + 13 i \end{matrix}]) + \frac{6 + 12 i}{\sqrt{119}} (\frac{1}{\sqrt{119}} [\begin{matrix} - 2 - 4 i \\ 6 + i \\ 4 + 3 i \\ 6 - i \end{matrix}]) \end{array}

as you might want to check (if you have unlimited patience). $⊠$
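For readers without unlimited patience, the check can be delegated to a few lines of Python. This is a sketch under one assumption: the inner product conjugates its second argument, which is the convention that reproduces the four values displayed above. The function name `inner` is hypothetical, and floating-point arithmetic means the comparison is only up to rounding.

```python
import math

x = [
    [1 + 1j, 1, 1 - 1j, 1j],
    [1 + 5j, 6 + 5j, -7 - 1j, 1 - 6j],
    [-7 + 34j, -8 - 23j, -10 + 22j, 30 + 13j],
    [-2 - 4j, 6 + 1j, 4 + 3j, 6 - 1j],
]
w = [2, -3, 1, 4]


def inner(u, v):
    """Inner product on C^n, conjugating the second argument (assumed convention)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


# Squared norms of x1, ..., x4.
norms_squared = [inner(v, v).real for v in x]
# Scale each vector to norm 1 to get the orthonormal basis B.
basis = [[entry / math.sqrt(ns) for entry in v] for v, ns in zip(x, norms_squared)]
# Theorem COB: the coordinates of w are the inner products <w, v_i>.
coords = [inner(w, v) for v in basis]
# Rebuild w entry by entry from the linear combination.
rebuilt = [sum(c * v[k] for c, v in zip(coords, basis)) for k in range(4)]
print(norms_squared)
print(rebuilt)
```

The entries of `rebuilt` should agree with those of `w` up to rounding error, with imaginary parts that are numerically zero.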

A slightly less intimidating example follows, in three dimensions and with just real numbers.

Example CROB3
Coordinatization relative to an orthonormal basis, $ℂ^{3}$
The set

\{x_{1}, x_{2}, x_{3}\} = \{[\begin{matrix} 1 \\ 2 \\ 1 \end{matrix}], [\begin{matrix} - 1 \\ 0 \\ 1 \end{matrix}], [\begin{matrix} 2 \\ 1 \\ 1 \end{matrix}]\}

is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,

B = \{v_{1}, v_{2}, v_{3}\} = \{\frac{1}{\sqrt{6}} [\begin{matrix} 1 \\ 2 \\ 1 \end{matrix}], \frac{1}{\sqrt{2}} [\begin{matrix} - 1 \\ 0 \\ 1 \end{matrix}], \frac{1}{\sqrt{3}} [\begin{matrix} 1 \\ - 1 \\ 1 \end{matrix}]\}

which is therefore an orthonormal basis of $ℂ^{3}$ . With three vectors in $ℂ^{3}$ , all with real number entries, the inner product (Definition IP) reduces to the usual “dot product” (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in $B$ serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors $\vec{i}, \vec{j}$ and $\vec{k}$ . We would like to decompose arbitrary vectors into “components” in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.

Suppose that we choose $w = [\begin{matrix} 2 \\ - 1 \\ 5 \end{matrix}]$ . Compute

\begin{array}{l} 〈w, v_{1}〉 = \frac{5}{\sqrt{6}} & 〈w, v_{2}〉 = \frac{3}{\sqrt{2}} & 〈w, v_{3}〉 = \frac{8}{\sqrt{3}} \end{array}

then Theorem COB guarantees that

[\begin{matrix} 2 \\ - 1 \\ 5 \end{matrix}] = \frac{5}{\sqrt{6}} (\frac{1}{\sqrt{6}} [\begin{matrix} 1 \\ 2 \\ 1 \end{matrix}]) + \frac{3}{\sqrt{2}} (\frac{1}{\sqrt{2}} [\begin{matrix} - 1 \\ 0 \\ 1 \end{matrix}]) + \frac{8}{\sqrt{3}} (\frac{1}{\sqrt{3}} [\begin{matrix} 1 \\ - 1 \\ 1 \end{matrix}])

which you should be able to check easily, even if you do not have much patience. $⊠$
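The same check can be done exactly, without square roots, by noticing that $〈w, v_{i}〉 v_{i} = \frac{〈w, x_{i}〉}{〈x_{i}, x_{i}〉} x_{i}$ whenever $v_{i}$ is $x_{i}$ scaled to norm 1. The Python sketch below uses this rewriting, which is our own convenience and not a step taken in the text. Here `x` holds the unscaled orthogonal directions $(1, 2, 1)$, $(-1, 0, 1)$, $(1, -1, 1)$ underlying $v_{1}, v_{2}, v_{3}$.

```python
from fractions import Fraction

x = [(1, 2, 1), (-1, 0, 1), (1, -1, 1)]   # orthogonal directions of v1, v2, v3
w = (2, -1, 5)


def dot(u, v):
    """Usual dot product for vectors with real entries."""
    return sum(a * b for a, b in zip(u, v))


# Weight on x_i equals <w, v_i> times 1/||x_i||, e.g. (5/sqrt 6)(1/sqrt 6) = 5/6.
weights = [Fraction(dot(w, xi), dot(xi, xi)) for xi in x]
rebuilt = [sum(c * xi[k] for c, xi in zip(weights, x)) for k in range(3)]
print([str(c) for c in weights])
print([str(entry) for entry in rebuilt])
```

The three weights are the products of the paired scalars in the displayed equation, so summing the weighted directions reproduces $w$ exactly.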

Not only do the columns of a unitary matrix form an orthonormal basis, but there is a deeper connection between orthonormal bases and unitary matrices. Informally, the next theorem says that if we transform each vector of an orthonormal basis by multiplying it by a unitary matrix, then the resulting set will be another orthonormal basis. And more remarkably, any matrix with this property must be unitary! As an equivalence (Technique E) we could take this as our defining property of a unitary matrix, though it might not have the same utility as Definition UM.

Theorem UMCOB
Unitary Matrices Convert Orthonormal Bases
Let $A$ be an $n \times n$ matrix and $B = \{x_{1}, x_{2}, x_{3}, \dots, x_{n}\}$ be an orthonormal basis of $ℂ^{n}$ . Define

\begin{array}{l} C & = \{A x_{1}, A x_{2}, A x_{3}, \dots, A x_{n}\} \end{array}

Then $A$ is a unitary matrix if and only if $C$ is an orthonormal basis of $ℂ^{n}$ . $□$

Proof $(\Rightarrow)$ Assume $A$ is a unitary matrix and establish several facts about $C$ . First we check that $C$ is an orthonormal set (Definition ONS). By Theorem UMPIP, for $i \neq j$ ,

\begin{array}{l} 〈A x_{i}, A x_{j}〉 & = 〈x_{i}, x_{j}〉 = 0 \end{array}

Similarly, Theorem UMPIP also gives, for $1 \leq i \leq n$ ,

\begin{array}{l} ‖A x_{i}‖ = ‖x_{i}‖ = 1 \end{array}

As $C$ is an orthogonal set (Definition OSV), Theorem OSLI yields the linear independence of $C$. Having established that the column vectors in $C$ form a linearly independent set, a matrix whose columns are the vectors of $C$ is nonsingular (Theorem NMLIC), and hence these vectors form a basis of $ℂ^{n}$ by Theorem CNMB.

$(\Leftarrow)$ Now assume that $C$ is an orthonormal set. Let $y$ be an arbitrary vector from $ℂ^{n}$ . Since $B$ spans $ℂ^{n}$ , there are scalars, $a_{1}, a_{2}, a_{3}, \dots, a_{n}$ , such that

\begin{array}{l} y & = a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{n} x_{n} \end{array}

Now

\begin{array}{l} A^{*} A y & = \sum_{i = 1}^{n} 〈A^{*} A y, x_{i}〉 x_{i} & Theorem COB \\ = \sum_{i = 1}^{n} 〈A^{*} A \sum_{j = 1}^{n} a_{j} x_{j}, x_{i}〉 x_{i} & Definition TSVS \\ = \sum_{i = 1}^{n} 〈\sum_{j = 1}^{n} A^{*} A a_{j} x_{j}, x_{i}〉 x_{i} & Theorem MMDAA \\ = \sum_{i = 1}^{n} 〈\sum_{j = 1}^{n} a_{j} A^{*} A x_{j}, x_{i}〉 x_{i} & Theorem MMSMM \\ = \sum_{i = 1}^{n} \sum_{j = 1}^{n} 〈a_{j} A^{*} A x_{j}, x_{i}〉 x_{i} & Theorem IPVA \\ = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{j} 〈A^{*} A x_{j}, x_{i}〉 x_{i} & Theorem IPSM \\ = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{j} 〈A x_{j}, {(A^{*})}^{*} x_{i}〉 x_{i} & Theorem AIP \\ = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{j} 〈A x_{j}, A x_{i}〉 x_{i} & Theorem AA \\ = \sum_{i = 1}^{n} \sum_{\begin{array}{c} j = 1 \\ j \neq i \end{array}}^{n} a_{j} 〈A x_{j}, A x_{i}〉 x_{i} + \sum_{ℓ = 1}^{n} a_{ℓ} 〈A x_{ℓ}, A x_{ℓ}〉 x_{ℓ} & Property C \\ = \sum_{i = 1}^{n} \sum_{\begin{array}{c} j = 1 \\ j \neq i \end{array}}^{n} a_{j} (0) x_{i} + \sum_{ℓ = 1}^{n} a_{ℓ} (1) x_{ℓ} & Definition ONS \\ = \sum_{i = 1}^{n} \sum_{\begin{array}{c} j = 1 \\ j \neq i \end{array}}^{n} 0 + \sum_{ℓ = 1}^{n} a_{ℓ} x_{ℓ} & Theorem ZSSM \\ = \sum_{ℓ = 1}^{n} a_{ℓ} x_{ℓ} & Property Z \\ = y \\ = I_{n} y & Theorem MMIM \end{array}

Since the choice of $y$ was arbitrary, Theorem EMMVP tells us that $A^{*} A = I_{n}$ , so $A$ is unitary (Definition UM). $■$
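A small numerical experiment can make Theorem UMCOB concrete. The Python sketch below works in $ℂ^{2}$ with three objects chosen by us for illustration, none of which appear in the text: an orthonormal basis `B`, a matrix `unitary` whose columns are orthonormal, and a matrix `shear` that is not unitary. The inner product is assumed to conjugate its second argument, and all comparisons are made up to a small floating-point tolerance.

```python
import math


def inner(u, v):
    """Inner product on C^n, conjugating the second argument (assumed convention)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


def mat_vec(M, x):
    """Matrix-vector product, with M stored as a list of rows."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]


def is_orthonormal(vectors, tol=1e-12):
    """True when every pair has inner product 0 and every vector has norm 1."""
    n = len(vectors)
    return all(
        abs(inner(vectors[i], vectors[j]) - (1 if i == j else 0)) < tol
        for i in range(n)
        for j in range(n)
    )


s = 1 / math.sqrt(2)
B = [[s, s], [s, -s]]                    # orthonormal basis of C^2
unitary = [[s, s * 1j], [s * 1j, s]]     # columns are orthonormal
shear = [[1, 1], [0, 1]]                 # nonsingular but not unitary

C_unitary = [mat_vec(unitary, x) for x in B]
C_shear = [mat_vec(shear, x) for x in B]
print(is_orthonormal(B), is_orthonormal(C_unitary), is_orthonormal(C_shear))
```

The shear still carries the basis `B` to a basis, since it is nonsingular, but the images are no longer orthonormal, in line with the "only if" half of the theorem.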

Subsection READ: Reading Questions

The matrix below is nonsingular. What can you now say about its columns?
$A = [\begin{matrix} - 3 & 0 & 1 \\ 1 & 2 & 1 \\ 5 & 1 & 6 \end{matrix}]$
Write the vector $w = [\begin{matrix} 6 \\ 6 \\ 15 \end{matrix}]$ as a linear combination of the columns of the matrix $A$ above. How many ways are there to answer this question?
Why is an orthonormal basis desirable?

Subsection EXC: Exercises

C40 From Example RSB, form an arbitrary (and nontrivial) linear combination of the four vectors in the original spanning set for $W$ . So the result of this computation is of course an element of $W$ . As such, this vector should be a linear combination of the basis vectors in $B$ . Find the (unique) scalars that provide this linear combination. Repeat with another linear combination of the original four vectors.
Contributed by Robert Beezer

C80 Prove that $\{(1, 2), (2, 3)\}$ is a basis for the crazy vector space $C$ (Example CVS).
Contributed by Robert Beezer

M20 In Example BM provide the verifications (linear independence and spanning) to show that $B$ is a basis of $M_{m n}$ .
Contributed by Robert Beezer

T50 Theorem UMCOB says that unitary matrices are characterized as those matrices that “carry” orthonormal bases to orthonormal bases. This problem asks you to prove a similar result: nonsingular matrices are characterized as those matrices that “carry” bases to bases.

More precisely, suppose that $A$ is a square matrix of size $n$ and $B = \{x_{1}, x_{2}, x_{3}, \dots, x_{n}\}$ is a basis of $ℂ^{n}$. Prove that $A$ is nonsingular if and only if $C = \{A x_{1}, A x_{2}, A x_{3}, \dots, A x_{n}\}$ is a basis of $ℂ^{n}$. (See also Exercise PD.T33, Exercise MR.T20.)
Contributed by Robert Beezer

T51 Use the result of Exercise B.T50 to build a very concise proof of Theorem CNMB. (Hint: make a judicious choice for the basis $B$ .)
Contributed by Robert Beezer

Subsection SOL: Solutions

M20 Contributed by Robert Beezer
We need to establish the linear independence and spanning properties of the set

B = \{B_{k ℓ}| 1 \leq k \leq m, 1 \leq ℓ \leq n\}

relative to the vector space $M_{m n}$ .

This proof is more transparent if you write out individual matrices in the basis with lots of zeros and dots and a lone one. But we don’t have room for that here, so we will use summation notation. Think carefully about each step, especially when the double summations seem to “disappear.” Begin with a relation of linear dependence, using double subscripts on the scalars to align with the basis elements.

O = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} α_{k ℓ} B_{k ℓ}

Now consider the entry in row $i$ and column $j$ for these equal matrices,

\begin{array}{l} 0 & = {[O]}_{i j} & Definition ZM \\ = {[\sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} α_{k ℓ} B_{k ℓ}]}_{i j} & Definition ME \\ = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} {[α_{k ℓ} B_{k ℓ}]}_{i j} & Definition MA \\ = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} α_{k ℓ} {[B_{k ℓ}]}_{i j} & Definition MSM \\ = α_{i j} {[B_{i j}]}_{i j} & {[B_{k ℓ}]}_{i j} = 0 when (k, ℓ) \neq (i, j) \\ = α_{i j} (1) & {[B_{i j}]}_{i j} = 1 \\ = α_{i j} \end{array}

Since $i$ and $j$ were arbitrary, we find that each scalar is zero and so $B$ is linearly independent (Definition LI).

To establish the spanning property of $B$ we need only show that an arbitrary matrix $A$ can be written as a linear combination of the elements of $B$ . So suppose that $A$ is an arbitrary $m \times n$ matrix and consider the matrix $C$ defined as a linear combination of the elements of $B$ by

C = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} {[A]}_{k ℓ} B_{k ℓ}

Then,

\begin{array}{l} {[C]}_{i j} & = {[\sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} {[A]}_{k ℓ} B_{k ℓ}]}_{i j} & Definition ME \\ = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} {[{[A]}_{k ℓ} B_{k ℓ}]}_{i j} & Definition MA \\ = \sum_{k = 1}^{m} \sum_{ℓ = 1}^{n} {[A]}_{k ℓ} {[B_{k ℓ}]}_{i j} & Definition MSM \\ = {[A]}_{i j} {[B_{i j}]}_{i j} & {[B_{k ℓ}]}_{i j} = 0 when (k, ℓ) \neq (i, j) \\ = {[A]}_{i j} (1) & {[B_{i j}]}_{i j} = 1 \\ = {[A]}_{i j} \end{array}

So by Definition ME, $A = C$ , and therefore $A \in 〈B〉$ . By Definition B, the set $B$ is a basis of the vector space $M_{m n}$ .
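The entry-by-entry argument can also be checked on a small case. The following is a minimal sketch in Python, not part of the original proof: the helper names `matrix_unit` and `combine` and the sample $2 \times 3$ matrix are assumptions made only for illustration. It builds each $B_{k ℓ}$ explicitly and confirms that the double sum with coefficients ${[A]}_{k ℓ}$ reproduces $A$.

```python
def matrix_unit(k, l, m, n):
    """Return the m x n matrix B_{kl}: a lone 1 in row k, column l (1-indexed)."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(1, n + 1)]
            for i in range(1, m + 1)]


def combine(coeffs, m, n):
    """Form the double sum of coeffs[k-1][l-1] * B_{kl}, one entry at a time."""
    total = [[0] * n for _ in range(m)]
    for k in range(1, m + 1):
        for l in range(1, n + 1):
            unit = matrix_unit(k, l, m, n)
            for i in range(m):
                for j in range(n):
                    total[i][j] += coeffs[k - 1][l - 1] * unit[i][j]
    return total


# Hypothetical 2 x 3 matrix, chosen only for illustration.
A = [[5, -1, 0],
     [2, 7, -4]]

# Spanning: using the entries of A as scalars rebuilds A.
C = combine(A, 2, 3)
print(C == A)
```

Because entry $(i, j)$ of the combination is always the scalar attached to $B_{i j}$, the same computation shows that the combination can equal the zero matrix only when every scalar is zero, which is the linear independence half of the proof.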

C40 Contributed by Robert Beezer Statement [977]
An arbitrary linear combination is

y = 3 [\begin{matrix} 2 \\ - 3 \\ 1 \end{matrix}] + (- 2) [\begin{matrix} 1 \\ 4 \\ 1 \end{matrix}] + 1 [\begin{matrix} 7 \\ - 5 \\ 4 \end{matrix}] + (- 2) [\begin{matrix} - 7 \\ - 6 \\ - 5 \end{matrix}] = [\begin{matrix} 25 \\ - 10 \\ 15 \end{matrix}]

(You probably used a different collection of scalars.) We want to write $y$ as a linear combination of

B = \{[\begin{matrix} 1 \\ 0 \\ \frac{7}{11} \end{matrix}], [\begin{matrix} 0 \\ 1 \\ \frac{1}{11} \end{matrix}]\}

We could set this up as a vector equation with variables as scalars in a linear combination of the vectors in $B$ , but since the first two slots of the vectors in $B$ have such a nice pattern of zeros and ones, we can determine the necessary scalars easily and then double-check our answer with a computation in the third slot,

25 [\begin{matrix} 1 \\ 0 \\ \frac{7}{11} \end{matrix}] + (- 10) [\begin{matrix} 0 \\ 1 \\ \frac{1}{11} \end{matrix}] = [\begin{matrix} 25 \\ - 10 \\ (25) \frac{7}{11} + (- 10) \frac{1}{11} \end{matrix}] = [\begin{matrix} 25 \\ - 10 \\ 15 \end{matrix}] = y

Notice how the uniqueness of these scalars arises. They are forced to be $25$ and $- 10$ .
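For readers who want to confirm the arithmetic, here is a short sketch using exact fractions from Python's standard library. The helper name `linear_combination` is an assumption of this sketch; the vectors and scalars are the ones used in the solution above.

```python
from fractions import Fraction


def linear_combination(scalars, vectors):
    """Return the sum of scalars[i] * vectors[i], computed slot by slot."""
    size = len(vectors[0])
    return [sum(a * v[i] for a, v in zip(scalars, vectors)) for i in range(size)]


# The four vectors and the scalars 3, -2, 1, -2 from the solution above.
columns = [[2, -3, 1], [1, 4, 1], [7, -5, 4], [-7, -6, -5]]
y = linear_combination([3, -2, 1, -2], columns)

# The two-vector basis B; exact fractions avoid rounding in the third slot.
basis = [[1, 0, Fraction(7, 11)], [0, 1, Fraction(1, 11)]]

# The first two slots force the scalars to be y[0] and y[1].
scalars = [y[0], y[1]]
rebuilt = linear_combination(scalars, basis)

print(y)
print(rebuilt == y)
```

If you chose different scalars for $y$, change the list passed to `linear_combination`; the scalars on the vectors of $B$ are then read off from the first two slots of the new $y$ in the same way.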

T50 Contributed by Robert Beezer Statement [977]
Our first proof relies mostly on definitions of linear independence and spanning, which is a good exercise. The second proof is shorter and turns on a technical result from our work with matrix inverses, Theorem NPNT.

$(\Rightarrow)$ Assume that $A$ is nonsingular and prove that $C$ is a basis of $ℂ^{n}$ . First show that $C$ is linearly independent. Work on a relation of linear dependence on $C$ ,

\begin{array}{rll} 0 & = a_{1} A x_{1} + a_{2} A x_{2} + a_{3} A x_{3} + \dots + a_{n} A x_{n} & \text{Definition RLD} \\ & = A a_{1} x_{1} + A a_{2} x_{2} + A a_{3} x_{3} + \dots + A a_{n} x_{n} & \text{Theorem MMSMM} \\ & = A (a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{n} x_{n}) & \text{Theorem MMDAA} \end{array}

Since $A$ is nonsingular, Definition NM and Theorem SLEMM allow us to conclude that

a_{1} x_{1} + a_{2} x_{2} + \dots + a_{n} x_{n} = 0

But this is a relation of linear dependence on the linearly independent set $B$ , so the scalars are trivial, $a_{1} = a_{2} = a_{3} = \dots = a_{n} = 0$ . By Definition LI, the set $C$ is linearly independent.

Now prove that $C$ spans $ℂ^{n}$ . Given an arbitrary vector $y \in ℂ^{n}$ , can it be expressed as a linear combination of the vectors in $C$ ? Since $A$ is a nonsingular matrix we can define the vector $w$ to be the unique solution of the system $\mathcal{LS}(A, y)$ (Theorem NMUS). Since $w \in ℂ^{n}$ we can write $w$ as a linear combination of the vectors in the basis $B$ . So there are scalars, $b_{1}, b_{2}, b_{3}, \dots, b_{n}$ , such that

w = b_{1} x_{1} + b_{2} x_{2} + b_{3} x_{3} + \dots + b_{n} x_{n}

Then,

\begin{array}{rll} y & = A w & \text{Theorem SLEMM} \\ & = A (b_{1} x_{1} + b_{2} x_{2} + b_{3} x_{3} + \dots + b_{n} x_{n}) & \text{Definition TSVS} \\ & = A b_{1} x_{1} + A b_{2} x_{2} + A b_{3} x_{3} + \dots + A b_{n} x_{n} & \text{Theorem MMDAA} \\ & = b_{1} A x_{1} + b_{2} A x_{2} + b_{3} A x_{3} + \dots + b_{n} A x_{n} & \text{Theorem MMSMM} \end{array}

So we can write an arbitrary vector of $ℂ^{n}$ as a linear combination of the elements of $C$ . In other words, $C$ spans $ℂ^{n}$ (Definition TSVS). By Definition B, the set $C$ is a basis for $ℂ^{n}$ .

$(\Leftarrow)$ Assume that $C$ is a basis and prove that $A$ is nonsingular. Let $x$ be a solution to the homogeneous system $\mathcal{LS}(A, 0)$ . Since $B$ is a basis of $ℂ^{n}$ there are scalars, $a_{1}, a_{2}, a_{3}, \dots, a_{n}$ , such that

x = a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{n} x_{n}

Then

\begin{array}{rll} 0 & = A x & \text{Theorem SLEMM} \\ & = A (a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{n} x_{n}) & \text{Definition TSVS} \\ & = A a_{1} x_{1} + A a_{2} x_{2} + A a_{3} x_{3} + \dots + A a_{n} x_{n} & \text{Theorem MMDAA} \\ & = a_{1} A x_{1} + a_{2} A x_{2} + a_{3} A x_{3} + \dots + a_{n} A x_{n} & \text{Theorem MMSMM} \end{array}

This is a relation of linear dependence on the linearly independent set $C$ , so the scalars must all be zero, $a_{1} = a_{2} = a_{3} = \dots = a_{n} = 0$ . Thus,

x = a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{n} x_{n} = 0 x_{1} + 0 x_{2} + 0 x_{3} + \dots + 0 x_{n} = 0 .

Since $x$ was an arbitrary solution, the homogeneous system has only the trivial solution, and by Definition NM we see that $A$ is nonsingular.

Now for a second proof. Take the vectors for $B$ and use them as the columns of a matrix, $G = [x_{1} | x_{2} | x_{3} | \dots | x_{n}]$ . By Theorem CNMB, because we have the hypothesis that $B$ is a basis of $ℂ^{n}$ , $G$ is a nonsingular matrix. Notice that the columns of $A G$ are exactly the vectors in the set $C$ , by Definition MM.

\begin{array}{rll} A \text{ nonsingular} & \Leftrightarrow A G \text{ nonsingular} & \text{Theorem NPNT} \\ & \Leftrightarrow C \text{ basis for } ℂ^{n} & \text{Theorem CNMB} \end{array}

That was easy!
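The equivalence in the second proof can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the matrices `G`, `A_good` and `A_bad` are hypothetical $2 \times 2$ examples, and nonsingularity is tested by row-reducing with exact fractions and counting pivot columns, which is a convenient stand-in and not the route taken by Theorem NPNT.

```python
from fractions import Fraction


def count_pivots(matrix):
    """Row-reduce with exact fractions and return the number of pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivot_row = 0
    for col in range(len(rows[0])):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col] / rows[pivot_row][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == len(rows):
            break
    return pivot_row


def is_nonsingular(square):
    """Treat a square matrix as nonsingular when every column is a pivot column."""
    return count_pivots(square) == len(square)


def multiply(A, G):
    """Matrix product AG; column j of the result is A times column j of G."""
    return [[sum(A[i][k] * G[k][j] for k in range(len(G))) for j in range(len(G[0]))]
            for i in range(len(A))]


# Columns of G play the role of the basis B (hypothetical example values).
G = [[1, 1],
     [0, 1]]

A_good = [[2, 1],
          [1, 1]]   # nonsingular
A_bad = [[1, 2],
         [2, 4]]    # singular: the second row is twice the first

for A in (A_good, A_bad):
    # The two answers agree: A is nonsingular exactly when AG is.
    print(is_nonsingular(A), is_nonsingular(multiply(A, G)))
```

In each case the columns of `multiply(A, G)` are the vectors of $C$, so the second value printed also reports whether $C$ is a basis of $ℂ^{2}$ for that choice of $A$.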

T51 Contributed by Robert Beezer Statement [977]
Choose $B$ to be the set of standard unit vectors, a particularly nice basis of $ℂ^{n}$ (Theorem SUVB). For a vector $e_{j}$ (Definition SUV) from this basis, what is $A e_{j}$ ?
