Section RREF Reduced Row-Echelon Form

After solving a few systems of equations, you will recognize that it does not matter so much what we call our variables, as opposed to what numbers act as their coefficients. A system in the variables $x_1,\,x_2,\,x_3$ would behave the same if we changed the names of the variables to $a,\,b,\,c$ and kept all the constants the same and in the same places. In this section, we will isolate the key bits of information about a system of equations into something called a matrix, and then use this matrix to systematically solve the equations. Along the way we will obtain one of our most important and useful computational tools.

Subsection MVNSE Matrix and Vector Notation for Systems of Equations

Definition M Matrix

An $m\times n$ matrix is a rectangular layout of numbers from $\complex{\null}$ having $m$ rows and $n$ columns. We will use upper-case Latin letters from the start of the alphabet ($A,\,B,\,C,\dotsc$) to denote matrices and squared-off brackets to delimit the layout. Many use large parentheses instead of brackets — the distinction is not important. Rows of a matrix will be referenced starting at the top and working down (i.e. row 1 is at the top) and columns will be referenced starting from the left (i.e. column 1 is at the left). For a matrix $A$, the notation $\matrixentry{A}{ij}$ will refer to the complex number in row $i$ and column $j$ of $A$.

Be careful with this notation for individual entries, since it is easy to think that $\matrixentry{A}{ij}$ refers to the whole matrix. It does not. It is just a number, but is a convenient way to talk about the individual entries simultaneously. This notation will get a heavy workout once we get to Chapter M.

Example AM A matrix
Sage M Matrices

When we do equation operations on system of equations, the names of the variables really are not very important. Use $x_1$, $x_2$, $x_3$, or $a$, $b$, $c$, or $x$, $y$, $z$, it really does not matter. In this subsection we will describe some notation that will make it easier to describe linear systems, solve the systems and describe the solution sets. Here is a list of definitions, laden with notation.

Definition CV Column Vector

A column vector of size $m$ is an ordered list of $m$ numbers, which is written in order vertically, starting at the top and proceeding to the bottom. At times, we will refer to a column vector as simply a vector. Column vectors will be written in bold, usually with lower case Latin letter from the end of the alphabet such as $\vect{u}$, $\vect{v}$, $\vect{w}$, $\vect{x}$, $\vect{y}$, $\vect{z}$. Some books like to write vectors with arrows, such as $\vec{u}$. Writing by hand, some like to put arrows on top of the symbol, or a tilde underneath the symbol, as in $\underset{\sim}{\textstyle u}$. To refer to the entry or component of vector $\vect{v}$ in location $i$ of the list, we write $\vectorentry{\vect{v}}{i}$.

Be careful with this notation. While the symbols $\vectorentry{\vect{v}}{i}$ might look somewhat substantial, as an object this represents just one entry of a vector, which is just a single complex number.

Definition ZCV Zero Column Vector

The zero vector of size $m$ is the column vector of size $m$ where each entry is the number zero, \begin{align*} \zerovector= \colvector{0\\0\\0\\\vdots\\0} \end{align*} or defined much more compactly, $\vectorentry{\zerovector}{i}=0$ for $1\leq i\leq m$.

Sage V Vectors
Definition CM Coefficient Matrix

For a system of linear equations, \begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\dots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\dots+a_{mn}x_n&=b_m \end{align*} the coefficient matrix is the $m\times n$ matrix \begin{equation*} A= \begin{bmatrix} a_{11}&a_{12}&a_{13}&\dots&a_{1n}\\ a_{21}&a_{22}&a_{23}&\dots&a_{2n}\\ a_{31}&a_{32}&a_{33}&\dots&a_{3n}\\ \vdots&\\ a_{m1}&a_{m2}&a_{m3}&\dots&a_{mn}\\ \end{bmatrix} \end{equation*}

Definition VOC Vector of Constants

For a system of linear equations, \begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\dots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\dots+a_{mn}x_n&=b_m \end{align*} the vector of constants is the column vector of size $m$ \begin{equation*} \vect{b}= \begin{bmatrix} b_1\\ b_2\\ b_3\\ \vdots\\ b_m\\ \end{bmatrix} \end{equation*}

Definition SOLV Solution Vector

For a system of linear equations, \begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\dots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\dots+a_{mn}x_n&=b_m \end{align*} the solution vector is the column vector of size $n$ \begin{equation*} \vect{x}= \begin{bmatrix} x_1\\ x_2\\ x_3\\ \vdots\\ x_n\\ \end{bmatrix} \end{equation*}

The solution vector may do double-duty on occasion. It might refer to a list of variable quantities at one point, and subsequently refer to values of those variables that actually form a particular solution to that system.

Definition MRLS Matrix Representation of a Linear System

If $A$ is the coefficient matrix of a system of linear equations and $\vect{b}$ is the vector of constants, then we will write $\linearsystem{A}{\vect{b}}$ as a shorthand expression for the system of linear equations, which we will refer to as the matrix representation of the linear system.

Example NSLE Notation for systems of linear equations
Definition AM Augmented Matrix

Suppose we have a system of $m$ equations in $n$ variables, with coefficient matrix $A$ and vector of constants $\vect{b}$. Then the augmented matrix of the system of equations is the $m\times(n+1)$ matrix whose first $n$ columns are the columns of $A$ and whose last column ($n+1$) is the column vector $\vect{b}$. This matrix will be written as $\augmented{A}{\vect{b}}$.

The augmented matrix represents all the important information in the system of equations, since the names of the variables have been ignored, and the only connection with the variables is the location of their coefficients in the matrix. It is important to realize that the augmented matrix is just that, a matrix, and not a system of equations. In particular, the augmented matrix does not have any “solutions,” though it will be useful for finding solutions to the system of equations that it is associated with. (Think about your objects, and review Proof Technique L.) However, notice that an augmented matrix always belongs to some system of equations, and vice versa, so it is tempting to try and blur the distinction between the two. Here is a quick example.

Example AMAA Augmented matrix for Archetype A
Sage AM Augmented Matrix

Subsection RO Row Operations

An augmented matrix for a system of equations will save us the tedium of continually writing down the names of the variables as we solve the system. It will also release us from any dependence on the actual names of the variables. We have seen how certain operations we can perform on equations (Definition EO) will preserve their solutions (Theorem EOPSS). The next two definitions and the following theorem carry over these ideas to augmented matrices.

Definition RO Row Operations

The following three operations will transform an $m\times n$ matrix into a different matrix of the same size, and each is known as a row operation.

  1. Swap the locations of two rows.
  2. Multiply each entry of a single row by a nonzero quantity.
  3. Multiply each entry of one row by some quantity, and add these values to the entries in the same columns of a second row. Leave the first row the same after this operation, but replace the second row by the new values.
We will use a symbolic shorthand to describe these row operations:
  1. $\rowopswap{i}{j}$: Swap the location of rows $i$ and $j$.
  2. $\rowopmult{\alpha}{i}$: Multiply row $i$ by the nonzero scalar $\alpha$.
  3. $\rowopadd{\alpha}{i}{j}$: Multiply row $i$ by the scalar $\alpha$ and add to row $j$.

Definition REM Row-Equivalent Matrices

Two matrices, $A$ and $B$, are row-equivalent if one can be obtained from the other by a sequence of row operations.

Example TREM Two row-equivalent matrices

Notice that each of the three row operations is reversible (Exercise RREF.T10), so we do not have to be careful about the distinction between “$A$ is row-equivalent to $B$” and “$B$ is row-equivalent to $A$.” (Exercise RREF.T11)

The preceding definitions are designed to make the following theorem possible. It says that row-equivalent matrices represent systems of linear equations that have identical solution sets.

Theorem REMES Row-Equivalent Matrices represent Equivalent Systems

Suppose that $A$ and $B$ are row-equivalent augmented matrices. Then the systems of linear equations that they represent are equivalent systems.

So at this point, our strategy is to begin with a system of equations, represent the system by an augmented matrix, perform row operations (which will preserve solutions for the system) to get a “simpler” augmented matrix, convert back to a “simpler” system of equations and then solve that system, knowing that its solutions are those of the original system. Here is a rehash of Example US as an exercise in using our new tools.

Example USR Three equations, one solution, reprised
Sage RO Row Operations

Subsection RREF Reduced Row-Echelon Form

The preceding example amply illustrates the definitions and theorems we have seen so far. But it still leaves two questions unanswered. Exactly what is this “simpler” form for a matrix, and just how do we get it? Here is the answer to the first question, a definition of reduced row-echelon form.

Definition RREF Reduced Row-Echelon Form

A matrix is in reduced row-echelon form if it meets all of the following conditions:

  1. If there is a row where every entry is zero, then this row lies below any other row that contains a nonzero entry.
  2. The leftmost nonzero entry of a row is equal to 1.
  3. The leftmost nonzero entry of a row is the only nonzero entry in its column.
  4. Consider any two different leftmost nonzero entries, one located in row $i$, column $j$ and the other located in row $s$, column $t$. If $s>i$, then $t>j$.
A row of only zero entries is called a zero row and the leftmost nonzero entry of a nonzero row is a leading 1. A column containing a leading 1 will be called a pivot column. The number of nonzero rows will be denoted by $r$, which is also equal to the number of leading 1's and the number of pivot columns.

The set of column indices for the pivot columns will be denoted by $D=\set{d_1,\,d_2,\,d_3,\,\ldots,\,d_r}$ where $d_1<d_2<d_3<\cdots<d_r$, while the columns that are not pivot columns will be denoted as $F=\set{f_1,\,f_2,\,f_3,\,\ldots,\,f_{n-r}}$ where $f_1<f_2<f_3<\cdots<f_{n-r}$.

The principal feature of reduced row-echelon form is the pattern of leading 1's guaranteed by conditions (2) and (4), reminiscent of a flight of geese, or steps in a staircase, or water cascading down a mountain stream.

There are a number of new terms and notation introduced in this definition, which should make you suspect that this is an important definition. Given all there is to digest here, we will mostly save the use of $D$ and $F$ until Section TSS. However, one important point to make here is that all of these terms and notation apply to a matrix. Sometimes we will employ these terms and sets for an augmented matrix, and other times it might be a coefficient matrix. So always give some thought to exactly which type of matrix you are analyzing.

Example RREF A matrix in reduced row-echelon form
Example NRREF A matrix not in reduced row-echelon form

Our next theorem has a “constructive” proof. Learn about the meaning of this term in Proof Technique C.

Theorem REMEF Row-Equivalent Matrix in Echelon Form

Suppose $A$ is a matrix. Then there is a matrix $B$ so that

  1. $A$ and $B$ are row-equivalent.
  2. $B$ is in reduced row-echelon form.

The procedure given in the proof of Theorem REMEF can be more precisely described using a pseudo-code version of a computer program. Single-letter variables, like m, n, i, j, r have the same meanings as above. := is assignment of the value on the right to the variable on the left, A[i,j] is the equivalent of the matrix entry $\matrixentry{A}{ij}$, while == is an equality test and <> is a “not equals” test.

input m, n and A
r := 0
for j := 1 to n
   i := r+1
   while i <= m and A[i,j] == 0
       i := i+1
   if i < m+1
       r := r+1
       swap rows i and r of A (row op 1)
       scale A[r,j] to a leading 1 (row op 2)
       for k := 1 to m, k <> r
           make A[k,j] zero (row op 3, employing row r)
output r and A

Notice that as a practical matter the “and” used in the conditional statement of the while statement should be of the “short-circuit” variety so that the array access that follows is not out-of-bounds.

So now we can put it all together. Begin with a system of linear equations (Definition SLE), and represent the system by its augmented matrix (Definition AM). Use row operations (Definition RO) to convert this matrix into reduced row-echelon form (Definition RREF), using the procedure outlined in the proof of Theorem REMEF. Theorem REMEF also tells us we can always accomplish this, and that the result is row-equivalent (Definition REM) to the original augmented matrix. Since the matrix in reduced-row echelon form has the same solution set, we can analyze the row-reduced version instead of the original matrix, viewing it as the augmented matrix of a different system of equations. The beauty of augmented matrices in reduced row-echelon form is that the solution sets to the systems they represent can be easily determined, as we will see in the next few examples and in the next section.

We will see through the course that almost every interesting property of a matrix can be discerned by looking at a row-equivalent matrix in reduced row-echelon form. For this reason it is important to know that the matrix $B$ is guaranteed to exist by Theorem REMEF is also unique.

Two proof techniques are applicable to the proof. First, head out and read two proof techniques: Proof Technique CD and Proof Technique U.

Theorem RREFU Reduced Row-Echelon Form is Unique

Suppose that $A$ is an $m\times n$ matrix and that $B$ and $C$ are $m\times n$ matrices that are row-equivalent to $A$ and in reduced row-echelon form. Then $B=C$.

We will now run through some examples of using these definitions and theorems to solve some systems of equations. From now on, when we have a matrix in reduced row-echelon form, we will mark the leading 1's with a small box. This will help you count, and identify, the pivot columns. In your work, you can box 'em, circle 'em or write 'em in a different color — just identify 'em somehow. This device will prove very useful later and is a very good habit to start developing right now.

Example SAB Solutions for Archetype B

Archetypes A and B are meant to contrast each other in many respects. So let us solve Archetype A now.

Example SAA Solutions for Archetype A
Example SAE Solutions for Archetype E

These three examples (Example SAB, Example SAA, Example SAE) illustrate the full range of possibilities for a system of linear equations — no solutions, one solution, or infinitely many solutions. In the next section we will examine these three scenarios more closely.

We (and everybody else) will often speak of “row-reducing” a matrix. This is an informal way of saying we begin with a matrix $A$ and then analyze the matrix $B$ that is row-equivalent to $A$ and in reduced row-echelon form. So the term row-reduce is used as a verb, but describes something a bit more complicated, since we do not really change $A$. Theorem REMEF tells us that this process will always be successful and Theorem RREFU tells us that $B$ will be unambiguous. Typically, an investigation of $A$ will proceed by analyzing $B$ and applying theorems whose hypotheses include the row-equivalence of $A$ and $B$, and usually the hypothesis that $B$ is in reduced row-echelon form.

Sage RREF Reduced Row-Echelon Form