
Section SSLE Solving Systems of Linear Equations

We will motivate our study of linear algebra by considering the problem of solving several linear equations simultaneously. The word solve tends to get abused somewhat, as in “solve this problem.” When talking about equations we understand a more precise meaning: find all of the values of some variable quantities that make an equation, or several equations, simultaneously true.

Subsection SLE Systems of Linear Equations

Our first example is of a type we will not pursue further. While it has two equations, the first is not linear. So this is a good example to come back to later, especially after you have seen Theorem PSSLS.

Suppose we desire the simultaneous solutions of the following system of two equations.

\begin{align*} x^2+y^2&=1\\ -x+\sqrt{3}y&=0 \end{align*}

You can easily check by substitution that \(x=\tfrac{\sqrt{3}}{2},\;y=\tfrac{1}{2}\) and \(x=-\tfrac{\sqrt{3}}{2},\;y=-\tfrac{1}{2}\) are both solutions. We need to also convince ourselves that these are the only solutions. To see this, plot each equation on the \(xy\)-plane, which means to plot \((x,\,y)\) pairs that make an individual equation true. In this case we get a circle centered at the origin with radius 1 and a straight line through the origin with slope \(\tfrac{1}{\sqrt{3}}\text{.}\) The intersections of these two curves are our desired simultaneous solutions, and so we believe from our plot that the two solutions we know already are indeed the only ones. We like to write solutions as sets, so in this case we write the set of solutions as

\begin{gather*} S=\set{\left(\tfrac{\sqrt{3}}{2},\,\tfrac{1}{2}\right),\,\left(-\tfrac{\sqrt{3}}{2},\,-\tfrac{1}{2}\right)} \end{gather*}
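The substitution check can also be done numerically. Here is a minimal Python sketch; the helper name `satisfies_both` is our own, and because \(\sqrt{3}\) is not exactly representable we compare with a small tolerance rather than testing exact equality.

```python
import math

def satisfies_both(x, y, tol=1e-12):
    """Check x^2 + y^2 = 1 and -x + sqrt(3)*y = 0, up to a small tolerance."""
    on_circle = math.isclose(x**2 + y**2, 1.0, abs_tol=tol)
    on_line = math.isclose(-x + math.sqrt(3) * y, 0.0, abs_tol=tol)
    return on_circle and on_line

# The two claimed solutions from the text.
candidates = [(math.sqrt(3) / 2, 0.5), (-math.sqrt(3) / 2, -0.5)]
checks = [satisfies_both(x, y) for x, y in candidates]
```

A check like this confirms that the two points are solutions, but it says nothing about whether there are others; that still requires the geometric argument.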

In order to discuss systems of linear equations carefully, we need a precise definition. And before we do that, we will introduce our periodic discussions about “Proof Techniques.” Linear algebra is an excellent setting for learning how to read, understand and formulate proofs. But this is a difficult step in your development as a mathematician, so we have included a series of short essays containing advice and explanations to help you along. These will be referenced in the text as needed, and are also collected as a list you can consult when you want to return to re-read them. (Which is strongly encouraged!)

With a definition next, now is the time for the first of our proof techniques. So study Proof Technique D. We'll be right here when you get back. See you in a bit.

Definition SLE. System of Linear Equations.

A system of linear equations is a collection of \(m\) equations in the variable quantities \(x_1,\,x_2,\,x_3,\ldots,x_n\) of the form,

\begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\dots+a_{3n}x_n&=b_3\\ &\vdots\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\dots+a_{mn}x_n&=b_m \end{align*}

where the values of \(a_{ij}\text{,}\) \(b_i\) and \(x_j\text{,}\) \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{,}\) are from the set of complex numbers, \(\complexes\text{.}\)

Do not let the mention of the complex numbers, \(\complexes\text{,}\) rattle you. We will stick with real numbers exclusively for many more sections, and it will sometimes seem like we only work with integers! However, we want to leave the possibility of complex numbers open, and there will be occasions in subsequent sections where they are necessary. You can review the basic properties of complex numbers in Section CNO, but these facts will not be critical until we reach Section O.

Now we make the notion of a solution to a linear system precise.

Definition SSLE. Solution of a System of Linear Equations.

A solution of a system of linear equations in \(n\) variables, \(\scalarlist{x}{n}\) (such as the system given in Definition SLE), is an ordered list of \(n\) complex numbers, \(\scalarlist{s}{n}\) such that if we substitute \(s_1\) for \(x_1\text{,}\) \(s_2\) for \(x_2\text{,}\) \(s_3\) for \(x_3\text{,}\) …, \(s_n\) for \(x_n\text{,}\) then for every equation of the system the left side will equal the right side, i.e. each equation is true simultaneously.

More typically, we will write a solution in a form like \(x_1=12\text{,}\) \(x_2=-7\text{,}\) \(x_3=2\) to mean that \(s_1=12\text{,}\) \(s_2=-7\text{,}\) \(s_3=2\) in the notation of Definition SSLE. To discuss all of the possible solutions to a system of linear equations, we now define the set of all solutions. (So Section SET is now applicable, and you may want to go and familiarize yourself with what is there.)

Definition SSSLE. Solution Set of a System of Linear Equations.

The solution set of a linear system of equations is the set which contains every solution to the system, and nothing more.

Be aware that a solution set can be infinite, or there can be no solutions, in which case we write the solution set as the empty set, \(\emptyset=\set{}\) (Definition ES). Here is an example to illustrate using the notation introduced in Definition SLE and the notion of a solution (Definition SSLE).

Given the system of linear equations

\begin{align*} x_1+2x_2 + x_4&= 7\\ x_1+x_2+x_3-x_4&=3\\ 3x_1+x_2+5x_3-7x_4&=1 \end{align*}

we have \(n=4\) variables and \(m=3\) equations. Also, employing the notation described in Definition SLE we find

\begin{align*} a_{11}&=1 & a_{12}&=2 & a_{13}&=0 & a_{14}&=1 & b_{1}&=7\\ a_{21}&=1 & a_{22}&=1 & a_{23}&=1 & a_{24}&=-1 & b_{2}&=3\\ a_{31}&=3 & a_{32}&=1 & a_{33}&=5 & a_{34}&=-7 & b_{3}&=1\text{.} \end{align*}

Additionally, convince yourself that \(x_{1}=-2\text{,}\) \(x_{2}=4\text{,}\) \(x_{3}=2\text{,}\) \(x_{4}=1\) is one solution (Definition SSLE), but it is not the only one! For example, another solution is \(x_{1}=-12\text{,}\) \(x_{2}=11\text{,}\) \(x_{3}=1\text{,}\) \(x_{4}=-3\text{,}\) and there are more to be found. So the solution set contains at least two elements.
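One way to "convince yourself" is to carry out the substitution of Definition SSLE mechanically. The following Python sketch does this for the coefficients listed above; the function name `is_solution` and the list-of-lists layout are our own choices for illustration.

```python
def is_solution(A, b, s):
    """Return True if substituting s_j for x_j makes every equation true."""
    return all(
        sum(a_ij * s_j for a_ij, s_j in zip(row, s)) == b_i
        for row, b_i in zip(A, b)
    )

# Coefficients a_ij and constants b_i of the system above.
A = [[1, 2, 0, 1],
     [1, 1, 1, -1],
     [3, 1, 5, -7]]
b = [7, 3, 1]

first = is_solution(A, b, [-2, 4, 2, 1])
second = is_solution(A, b, [-12, 11, 1, -3])
not_one = is_solution(A, b, [0, 0, 0, 0])
```

Both lists from the text pass the check, while a list chosen at random, such as all zeros, typically fails.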

We will often shorten the term system of linear equations to system of equations leaving the linear aspect implied. After all, this is a book about linear algebra.

Subsection PSS Possibilities for Solution Sets

The next example illustrates the possibilities for the solution set of a system of linear equations. We will not be too formal here, and the necessary theorems to back up our claims will come in subsequent sections. So read for feeling and come back later to revisit this example.

Consider the system of two equations with two variables,

\begin{align*} 2x_1+3x_2&=3\\ x_1-x_2&=4\text{.} \end{align*}

If we plot the solutions to each of these equations separately on the \(x_{1}x_{2}\)-plane, we get two lines, one with negative slope, the other with positive slope. They have exactly one point in common, \((x_1,\,x_2)=(3,\,-1)\text{,}\) which is the solution \(x_1=3\text{,}\) \(x_2=-1\text{.}\) From the geometry, we believe that this is the only solution to the system of equations, and so we say the solution is unique.

Now adjust the system with a different second equation

\begin{align*} 2x_1+3x_2&=3\\ 4x_1+6x_2&=6\text{.} \end{align*}

A plot of the solutions to these equations individually results in two lines, one on top of the other! There are infinitely many points \((x_1,\,x_2)\) that make both equations true. We will learn shortly how to describe this infinite solution set precisely (see Example SAA, Theorem VFSLS). Notice now how the second equation is just a multiple of the first.

One more minor adjustment provides a third system of linear equations,

\begin{align*} 2x_1+3x_2&=3\\ 4x_1+6x_2&=10\text{.} \end{align*}

A plot now reveals two lines with identical slopes, i.e. parallel lines. They have no points in common, and so the system has a solution set that is empty, \(S=\emptyset\text{.}\)

This example exhibits all of the typical behaviors of a system of equations. A subsequent theorem will tell us that every system of linear equations has a solution set that is empty, contains a single solution, or contains infinitely many solutions (Theorem PSSLS). Example STNE yielded exactly two solutions, but this does not contradict the forthcoming theorem. The equations in Example STNE are not linear because they do not match the form of Definition SLE, and so we cannot apply Theorem PSSLS in this case.
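For two equations in two variables, the three behaviors can be told apart by simple arithmetic on the coefficients. The sketch below is not a technique developed in this section: it uses a cross-multiplication test (the quantity `det` is the determinant of the coefficients, a notion that comes much later), and the function name `classify_two_lines` is hypothetical.

```python
def classify_two_lines(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x1 + b1*x2 = c1, a2*x1 + b2*x2 = c2.

    Assumes each equation really describes a line (a and b not both zero).
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "unique"          # different slopes: exactly one intersection
    # Same slope: the lines coincide exactly when the constants are
    # proportional in the same ratio as the coefficients.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite"
    return "empty"

first = classify_two_lines(2, 3, 3, 1, -1, 4)
second = classify_two_lines(2, 3, 3, 4, 6, 6)
third = classify_two_lines(2, 3, 3, 4, 6, 10)
```

Applied to the three systems of the example, this reports a unique solution, infinitely many solutions, and an empty solution set, in that order.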

Subsection ESEO Equivalent Systems and Equation Operations

With all this talk about finding solution sets for systems of linear equations, you might be ready to begin learning how to find these solution sets yourself. We begin with our first definition that takes a common word and gives it a very precise meaning in the context of systems of linear equations.

Definition ESYS. Equivalent Systems.

Two systems of linear equations are equivalent if their solution sets are equal.

Notice here that the two systems of equations could look very different (i.e. not be equal), but still have equal solution sets, and we would then call the systems equivalent. Two linear equations in two variables might be plotted as two lines that intersect in a single point. A different system, with three equations in two variables might have a plot that is three lines, all intersecting at a common point, with this common point identical to the intersection point for the first system. By our definition, we could then say these two very different looking systems of equations are equivalent, since they have identical solution sets. It is really like a weaker form of equality, where we allow the systems to be different in some respects, but we use the term equivalent to highlight the situation when their solution sets are equal.

With this definition, we can begin to describe our strategy for solving linear systems. Given a system of linear equations that looks difficult to solve, we would like to have an equivalent system that is easy to solve. Since the systems will have equal solution sets, we can solve the “easy” system and get the solution set to the “difficult” system. Here come the tools for making this strategy viable.

Definition EO. Equation Operations.

Given a system of linear equations, the following three operations will transform the system into a different one, and each operation is known as an equation operation.

  1. Swap the locations of two equations in the list of equations.
  2. Multiply each term of an equation by a nonzero quantity.
  3. Multiply each term of one equation by some quantity, and add these terms to a second equation, on both sides of the equality. Leave the first equation the same after this operation, but replace the second equation by the new one.
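To make the three operations concrete, here is a minimal Python sketch. We assume, purely for illustration, that each equation is stored as a list of its coefficients followed by its right-hand side; the function names `swap`, `scale` and `add_multiple` are ours, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def swap(system, i, j):
    """Operation 1: swap equations i and j (0-based indices)."""
    new = [row[:] for row in system]
    new[i], new[j] = new[j], new[i]
    return new

def scale(system, i, alpha):
    """Operation 2: multiply every term of equation i by a nonzero alpha."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    new = [row[:] for row in system]
    new[i] = [alpha * t for t in new[i]]
    return new

def add_multiple(system, i, j, alpha):
    """Operation 3: add alpha times equation i to equation j; i is unchanged."""
    if i == j:
        raise ValueError("the two equations must be different")
    new = [row[:] for row in system]
    new[j] = [alpha * s + t for s, t in zip(new[i], new[j])]
    return new

# The system 2x_1 + 3x_2 = 3, x_1 - x_2 = 4 as rows [a_i1, a_i2, b_i].
system = [[Fraction(2), Fraction(3), Fraction(3)],
          [Fraction(1), Fraction(-1), Fraction(4)]]
swapped = swap(system, 0, 1)
halved = scale(system, 0, Fraction(1, 2))
combined = add_multiple(system, 1, 0, -2)
```

Each function returns a new system and leaves its input alone, which mirrors the way the definition speaks of transforming one system into a different one.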

These descriptions might seem a bit vague, but the proof or the examples that follow should make it clear what is meant by each. We will shortly prove a key theorem about equation operations and solutions to linear systems of equations.

We are about to give a rather involved proof, so a discussion about just what a theorem really is would be timely. Stop and read Proof Technique T first.

In the theorem we are about to prove, the conclusion is that two systems are equivalent. By Definition ESYS this translates to requiring that solution sets be equal for the two systems. So we are being asked to show that two sets are equal. How do we do this? Well, there is a very standard technique, and we will use it repeatedly through the course. If you have not done so already, head to Section SET and familiarize yourself with sets, their operations, and especially the notion of set equality, Definition SE, and the nearby discussion about its use.

The following theorem has a rather long proof. This chapter contains a few very necessary theorems like this, with proofs that you can safely skip on a first reading. You might come back to them later, when you are more comfortable with reading and studying proofs.

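Theorem EOPSS asserts that if we apply one of the three equation operations of Definition EO to a system of linear equations (Definition SLE), then the original system and the transformed system are equivalent. To prove this, we take each equation operation in turn and show that the solution sets of the two systems are equal, using the definition of set equality (Definition SE).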

  1. It will not be our habit in proofs to resort to saying statements are “obvious,” but in this case, it should be. There is nothing about the order in which we write linear equations that affects their solutions, so the solution set will be equal if the systems only differ by a rearrangement of the order of the equations.
  2. Suppose \(\alpha\neq 0\) is a number. Let us choose to multiply the terms of equation \(i\) by \(\alpha\) to build the new system of equations,

    \begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\dots+a_{3n}x_n&=b_3\\ &\vdots\\ \alpha a_{i1}x_1+\alpha a_{i2}x_2+\alpha a_{i3}x_3+\dots+\alpha a_{in}x_n&=\alpha b_i\\ &\vdots\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\dots+a_{mn}x_n&=b_m \end{align*}

    Let \(S\) denote the solutions to the system in the statement of the theorem, and let \(T\) denote the solutions to the transformed system.

    1. Show \(S\subseteq T\text{.}\) Suppose \((x_1,\,x_2,\,\,x_3,\,\ldots,x_n)=(\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in S\) is a solution to the original system. Ignoring the \(i\)-th equation for a moment, we know it makes all the other equations of the transformed system true. We also know that
      \begin{align*} a_{i1}\beta_1+a_{i2}\beta_2+a_{i3}\beta_3+\dots+a_{in}\beta_n&=b_i\\ \alpha a_{i1}\beta_1+\alpha a_{i2}\beta_2+\alpha a_{i3}\beta_3+\dots+\alpha a_{in}\beta_n&=\alpha b_i \end{align*}
      This says that the \(i\)-th equation of the transformed system is also true, so we have established that \((\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in T\text{,}\) and therefore \(S\subseteq T\text{.}\)
    2. Now show \(T\subseteq S\text{.}\) Suppose \((x_1,\,x_2,\,\,x_3,\,\ldots,x_n)=(\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in T\) is a solution to the transformed system. Ignoring the \(i\)-th equation for a moment, we know it makes all the other equations of the original system true. We also know that
      \begin{align*} \alpha a_{i1}\beta_1+\alpha a_{i2}\beta_2+\alpha a_{i3}\beta_3+\dots+\alpha a_{in}\beta_n&=\alpha b_i\\ a_{i1}\beta_1+a_{i2}\beta_2+a_{i3}\beta_3+\dots+a_{in}\beta_n&=b_i \end{align*}
      This says that the \(i\)-th equation of the original system is also true, so we have established that \((\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in S\text{,}\) and therefore \(T\subseteq S\text{.}\) Locate the key point where we required that \(\alpha\neq 0\text{,}\) and consider what would happen if \(\alpha=0\text{.}\)
  3. Suppose \(\alpha\) is a number. Let us choose to multiply the terms of equation \(i\) by \(\alpha\) and add them to equation \(j\) in order to build the new system of equations,

    \begin{align*} a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+\dots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+\dots+a_{3n}x_n&=b_3\\ &\vdots\\ (\alpha a_{i1}+a_{j1})x_1+(\alpha a_{i2}+a_{j2})x_2+\dots+(\alpha a_{in}+a_{jn})x_n&=\alpha b_i+b_{j}\\ &\vdots\\ a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n&=b_m \end{align*}

    Let \(S\) denote the solutions to the system in the statement of the theorem, and let \(T\) denote the solutions to the transformed system.

    1. Show \(S\subseteq T\text{.}\) Suppose \((x_1,\,x_2,\,\,x_3,\,\ldots,x_n)=(\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in S\) is a solution to the original system. Ignoring the \(j\)-th equation for a moment, we know this solution makes all the other equations of the transformed system true. Using the fact that the solution makes the \(i\)-th and \(j\)-th equations of the original system true, we find
      \begin{align*} &(\alpha a_{i1}+a_{j1})\beta_1+(\alpha a_{i2}+a_{j2})\beta_2+\dots+(\alpha a_{in}+a_{jn})\beta_n\\ &\quad=(\alpha a_{i1}\beta_1+\alpha a_{i2}\beta_2+\dots+\alpha a_{in}\beta_n)+ (a_{j1}\beta_1+a_{j2}\beta_2+\dots+a_{jn}\beta_n)\\ &\quad=\alpha(a_{i1}\beta_1+a_{i2}\beta_2+\dots+a_{in}\beta_n)+ (a_{j1}\beta_1+a_{j2}\beta_2+\dots+a_{jn}\beta_n)\\ &\quad=\alpha b_i+b_j\text{.} \end{align*}
      This says that the \(j\)-th equation of the transformed system is also true, so we have established that \((\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in T\text{,}\) and therefore \(S\subseteq T\text{.}\)
    2. Now show \(T\subseteq S\text{.}\) Suppose \((x_1,\,x_2,\,\,x_3,\,\ldots,x_n)=(\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in T\) is a solution to the transformed system. Ignoring the \(j\)-th equation for a moment, we know it makes all the other equations of the original system true. We then find
      \begin{align*} &a_{j1}\beta_1+\dots+a_{jn}\beta_n\\ &\quad\quad=a_{j1}\beta_1+\dots+a_{jn}\beta_n +\alpha b_i -\alpha b_i\\ &\quad\quad=a_{j1}\beta_1+\dots+a_{jn}\beta_n +(\alpha a_{i1}\beta_1+\dots+\alpha a_{in}\beta_n) -\alpha b_i\\ &\quad\quad=a_{j1}\beta_1+\alpha a_{i1}\beta_1+\dots+a_{jn}\beta_n+\alpha a_{in}\beta_n-\alpha b_i\\ &\quad\quad=(\alpha a_{i1}+a_{j1})\beta_1+\dots+(\alpha a_{in}+a_{jn})\beta_n -\alpha b_i\\ &\quad\quad=\alpha b_i + b_j -\alpha b_i\\ &\quad\quad=b_j \end{align*}
      This says that the \(j\)-th equation of the original system is also true, so we have established that \((\beta_1,\,\beta_2,\,\,\beta_3,\,\ldots,\beta_n)\in S\text{,}\) and therefore \(T\subseteq S\text{.}\)

    Why did we not need to require that \(\alpha\neq 0\) for this equation operation? In other words, how does the third statement of the theorem read when \(\alpha=0\text{?}\) Does our proof require some extra care when \(\alpha=0\text{?}\) Compare your answers with the similar situation for the second equation operation. (See Exercise SSLE.T20.)

Theorem EOPSS is the necessary tool to complete our strategy for solving systems of equations. We will use equation operations to move from one system to another, all the while keeping the solution set the same. With the right sequence of operations, we will arrive at a simpler system to solve. The next two examples illustrate this idea, while saving some of the details for later.

We solve the following system by a sequence of equation operations.

\begin{align*} x_1+2x_2+2x_3&=4\\ x_1+3x_2+3x_3&=5\\ 2x_1+6x_2+5x_3&=6\\ \end{align*}

\(\alpha=-1\) times equation 1, add to equation 2.

\begin{align*} x_1+2x_2+2x_3&=4\\ 0x_1+1x_2+ 1x_3&=1\\ 2x_1+6x_2+5x_3&=6\\ \end{align*}

\(\alpha=-2\) times equation 1, add to equation 3.

\begin{align*} x_1+2x_2+2x_3&=4\\ 0x_1+1x_2+ 1x_3&=1\\ 0x_1+2x_2+1x_3&=-2\\ \end{align*}

\(\alpha=-2\) times equation 2, add to equation 3.

\begin{align*} x_1+2x_2+2x_3&=4\\ 0x_1+1x_2+ 1x_3&=1\\ 0x_1+0x_2-1x_3&=-4\\ \end{align*}

\(\alpha=-1\) times equation 3.

\begin{align*} x_1+2x_2+2x_3&=4\\ 0x_1+1x_2+ 1x_3&=1\\ 0x_1+0x_2+1x_3&=4\\ \end{align*}

Which can be written more clearly as

\begin{align*} x_1+2x_2+2x_3&=4\\ x_2+ x_3&=1\\ x_3&=4\text{.} \end{align*}

This is now a very easy system of equations to solve. The third equation requires that \(x_3=4\) to be true. Making this substitution into equation 2 we arrive at \(x_2=-3\text{,}\) and finally, substituting these values of \(x_2\) and \(x_3\) into the first equation, we find that \(x_1=2\text{.}\) Note too that this is the only solution to this final system of equations, since we were forced to choose these single values to make the equations true. Since we performed an equation operation on each system to obtain the next one in the list, the systems listed here are all equivalent to each other by Theorem EOPSS. Thus \((x_1,\,x_2,\,x_3)=(2,-3,4)\) is the unique solution to the original system of equations (and also to all of the intermediate systems of equations listed, as we transformed one into another).
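The sequence of operations in this example can be replayed by machine. The Python sketch below assumes each equation is stored as a list \([a_{i1},\,a_{i2},\,a_{i3},\,b_i]\) of exact fractions; the helper names are our own and equations are numbered from 1 to match the text.

```python
from fractions import Fraction

def add_multiple(system, alpha, i, j):
    """Add alpha times equation i to equation j (numbered from 1)."""
    new = [row[:] for row in system]
    new[j - 1] = [alpha * s + t for s, t in zip(new[i - 1], new[j - 1])]
    return new

def scale(system, alpha, i):
    """Multiply equation i (numbered from 1) by the nonzero number alpha."""
    new = [row[:] for row in system]
    new[i - 1] = [alpha * t for t in new[i - 1]]
    return new

# Rows are [a_i1, a_i2, a_i3, b_i] for the system of this example.
system = [[Fraction(v) for v in row] for row in
          [[1, 2, 2, 4], [1, 3, 3, 5], [2, 6, 5, 6]]]

system = add_multiple(system, -1, 1, 2)   # -1 times equation 1, add to equation 2
system = add_multiple(system, -2, 1, 3)   # -2 times equation 1, add to equation 3
system = add_multiple(system, -2, 2, 3)   # -2 times equation 2, add to equation 3
system = scale(system, -1, 3)             # -1 times equation 3

# Substitute upward through the final triangular system.
x3 = system[2][3] / system[2][2]
x2 = (system[1][3] - system[1][2] * x3) / system[1][1]
x1 = (system[0][3] - system[0][1] * x2 - system[0][2] * x3) / system[0][0]
solution = (x1, x2, x3)
```

The final value of `system` matches the last system displayed above, and `solution` agrees with \((2,\,-3,\,4)\).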

The following system of equations made an appearance earlier in this section (Example NSE), where we listed one of its solutions. Now, we will try to find all of the solutions to this system. Do not concern yourself too much about why we choose this particular sequence of equation operations, just believe that the work we do is all correct.

\begin{align*} x_1+2x_2 +0x_3+ x_4&= 7\\ x_1+x_2+x_3-x_4&=3\\ 3x_1+x_2+5x_3-7x_4&=1\\ \end{align*}

\(\alpha=-1\) times equation 1, add to equation 2.

\begin{align*} x_1+2x_2 +0x_3+ x_4&= 7\\ 0x_1-x_2+x_3-2x_4&=-4\\ 3x_1+x_2+5x_3-7x_4&=1\\ \end{align*}

\(\alpha=-3\) times equation 1, add to equation 3.

\begin{align*} x_1+2x_2 +0x_3+ x_4&= 7\\ 0x_1-x_2+x_3-2x_4&=-4\\ 0x_1-5x_2+5x_3-10x_4&=-20\\ \end{align*}

\(\alpha=-5\) times equation 2, add to equation 3.

\begin{align*} x_1+2x_2 +0x_3+ x_4&= 7\\ 0x_1-x_2+x_3-2x_4&=-4\\ 0x_1+0x_2+0x_3+0x_4&=0\\ \end{align*}

\(\alpha=-1\) times equation 2.

\begin{align*} x_1+2x_2 +0x_3+ x_4&= 7\\ 0x_1+x_2-x_3+2x_4&=4\\ 0x_1+0x_2+0x_3+0x_4&=0\\ \end{align*}

\(\alpha=-2\) times equation 2, add to equation 1.

\begin{align*} x_1+0x_2 +2x_3-3x_4&= -1\\ 0x_1+x_2-x_3+2x_4&=4\\ 0x_1+0x_2+0x_3+0x_4&=0\\ \end{align*}

Which can be written more clearly as

\begin{align*} x_1+2x_3 - 3x_4&= -1\\ x_2-x_3+2x_4&=4\\ 0&=0\text{.} \end{align*}

What does the equation \(0=0\) mean? We can choose any values for \(x_1\text{,}\) \(x_2\text{,}\) \(x_3\text{,}\) \(x_4\) and this equation will be true, so we only need to consider further the first two equations, since the third is true no matter what. We can analyze the second equation without consideration of the variable \(x_1\text{.}\) It would appear that there is considerable latitude in how we can choose \(x_2\text{,}\) \(x_3\text{,}\) \(x_4\) and make this equation true. Let us choose \(x_3\) and \(x_4\) to be anything we please, say \(x_3=a\) and \(x_4=b\text{.}\)

Now we can take these arbitrary values for \(x_3\) and \(x_4\text{,}\) substitute them in equation 1, to obtain

\begin{align*} x_1+2a - 3b&= -1\\ x_1&=-1-2a+3b\text{.} \end{align*}

Similarly, equation 2 becomes

\begin{align*} x_2-a+2b&=4\\ x_2&=4 +a-2b\text{.} \end{align*}

So our arbitrary choices of values for \(x_3\) and \(x_4\) (\(a\) and \(b\)) translate into specific values of \(x_1\) and \(x_2\text{.}\) The lone solution given in Example NSE was obtained by choosing \(a=2\) and \(b=1\text{.}\) Now we can easily and quickly find many more (infinitely more). Suppose we choose \(a=5\) and \(b=-2\text{,}\) then we compute

\begin{align*} x_1&=-1-2(5)+3(-2)=-17\\ x_2&=4+5-2(-2)=13 \end{align*}

and you can verify that \((x_1,\,x_2,\,x_3,\,x_4)=(-17,\,13,\,5,\,-2)\) makes all three equations true. The entire solution set is written as

\begin{equation*} S=\setparts{(-1-2a+3b,\,4 +a-2b,\,a,\,b)}{ a\in\complexes,\,b\in\complexes}\text{.} \end{equation*}

It would be instructive to finish off your study of this example by taking the general form of the solutions given in this set and substituting them into each of the three equations and verify that they are true in each case (Exercise SSLE.M40).
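A numerical spot-check can complement that symbolic verification. In this Python sketch the function names are ours, and checking a grid of integer values for \(a\) and \(b\) is only evidence, not a proof that every member of the set is a solution.

```python
def solution(a, b):
    """The solution determined by the free choices x_3 = a, x_4 = b."""
    return (-1 - 2 * a + 3 * b, 4 + a - 2 * b, a, b)

def satisfies_system(x1, x2, x3, x4):
    """Check the three original equations of the example."""
    return (x1 + 2 * x2 + x4 == 7
            and x1 + x2 + x3 - x4 == 3
            and 3 * x1 + x2 + 5 * x3 - 7 * x4 == 1)

# Try every pair (a, b) with integer entries between -5 and 5.
all_ok = all(satisfies_system(*solution(a, b))
             for a in range(-5, 6) for b in range(-5, 6))
```

Calling `solution(2, 1)` and `solution(5, -2)` reproduces the two specific solutions discussed in this example.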

In the next section we will describe how to use equation operations to systematically solve any system of linear equations. But first, read one of our more important pieces of advice about speaking and writing mathematics. See Proof Technique L.

Before attacking the exercises in this section, it will be helpful to read some advice on getting started on the construction of a proof. See Proof Technique GS.

Sage is a powerful system for studying and exploring many different areas of mathematics. In the next section, and the majority of the remaining sections, we will include short descriptions and examples using Sage. You can read a bit more about Sage in the Preface. If you are not already reading this in a capable format, you may want to investigate the online version of the book where the examples are “live” and editable. Most of your interaction with Sage will be by typing commands into a compute cell. We have placed a compute cell just below this paragraph. Online this is an instance of the Sage Cell Server at work. In a PDF or on paper or in an EPUB, it is likely to be quite boring, since it is empty.

If you can, place your cursor inside the cell and type 2+2 and then click on the evaluate link. Did a 4 appear below the cell? If so, you have successfully sent a command off for Sage to evaluate and you have received back the (correct) answer.

Here is another compute cell. Try evaluating the command factorial(300).

Hmmmmm. That is quite a big integer! And it possibly continues onto more than one line, or off the edge of your screen, since there are 615 digits in the result.
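If you do not have Sage at hand, the digit count can be confirmed with a few lines of plain Python. This is a sketch outside of Sage: there `factorial` must be imported from the standard `math` module, whereas a Sage cell needs no import.

```python
import math

value = math.factorial(300)       # an exact integer, not a floating-point estimate
digit_count = len(str(value))     # count the decimal digits
```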

Online you cannot save your results, nor make new cells. But you can reload a page if your editing makes a real mess of a cell.

Each cell knows about values of variables stored in other cells, and we assume you evaluate all the cells in a section, in the order presented. In truth, the variables are self-contained within each Sage discussion. Note that reloading a page destroys all the intermediate results and you will need to start over.

Each compute cell will show output due to only the very last command in the cell. Try to predict the following output before evaluating the cell.

The following compute cell will not print anything since the one command does not create output. But it will have an effect, as you can see when you execute the subsequent cell. Notice how this uses the value of b from above. Execute this compute cell once. Exactly once. Even if it appears to do nothing. If you execute the cell twice, your credit card may be charged twice.

Now execute this cell, which will produce some output.

So b came into existence as 6. Then a cell added 50. This assumes you only executed this cell once! In the last cell we create b+20 (but do not save it) and it is this value that is output.
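In a static copy of this text the cells themselves are not visible, so here is a hypothetical stand-in for the sequence just described, written as one block of plain Python. The exact contents of the original cells are an assumption; only the values 6, 50 and 20 come from the discussion above.

```python
# Hypothetical stand-ins for the three cells described above.
b = 6              # first cell: b comes into existence as 6
b = b + 50         # second cell: no output, but b changes (run exactly once)
result = b + 20    # third cell: b + 20 is computed and shown, b is not updated
```

After these lines `b` still holds 56, while the displayed value is 76. Running the middle line a second time would add another 50, which is why that cell should be executed exactly once.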

You can combine several commands on one line with a semi-colon. This is a great way to get multiple outputs from a compute cell. The syntax for building a matrix should be somewhat obvious when you see the output, but if not, it is not particularly important to understand now.

Some commands in Sage are functions; an example is factorial() above. Other commands are methods of an object and are like characteristics of objects; examples are .factor() and .derivative() as methods of a function.

There are other ways to compute with Sage, besides Sage cells embedded in web pages. A good option is SageMathCloud. You can experiment with a free account, and then you can save your work in Sage worksheets between sessions. There are several ways of additionally documenting your results, using Markdown, HTML, or LaTeX. SageMathCloud also supports Jupyter notebooks with a Sage kernel.

Much of our interaction with sets will be through Sage lists. These are not really sets: they allow duplicates, and order matters. But they are so close to sets, and so easy and powerful to use, that we will use them regularly. We will use a fun made-up list for practice; the quote marks mean the items are just text, with no special mathematical meaning. Execute these compute cells as we work through them.

So the square brackets define the boundaries of our list, commas separate items, and we can give the list a name. To work with just one element of the list, we use the name and a pair of brackets with an index. Notice that lists have indices that begin counting at zero. This will seem odd at first and will seem very natural later.

We can add a new creature to the zoo; it is joined up at the far right end.

We can remove a creature.

We can extract a sublist. Here we start with element 1 (the elephant) and go all the way up to, but not including, element 3 (the beetle). Again a bit odd, but it will feel natural later. For now, notice that we are extracting two elements of the list, exactly \(3-1=2\) elements.

Often we will want to see if two lists are equal. To do that we will need to sort a list first. A function creates a new, sorted list, leaving the original alone. So we need to save the new one with a new name.

Notice that if you run this last compute cell your zoo has changed and some commands above will not necessarily execute the same way. If you want to experiment, go all the way back to the first creation of the zoo and start executing cells again from there with a fresh zoo.
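For readers of a static copy, the list operations described above can be sketched in plain Python as follows. The particular animals and their order are an assumption, chosen only so that the indices match the discussion (the elephant at element 1 and the beetle at element 3 once the zoo has been edited).

```python
# A made-up zoo; the specific animals and their order are an assumption.
zoo = ['snake', 'parrot', 'elephant', 'baboon', 'beetle']
third = zoo[2]            # indices start at zero, so this is 'elephant'

zoo.append('ostrich')     # the new creature joins at the far right end
zoo.remove('parrot')      # removes the first item equal to 'parrot'
sub = zoo[1:3]            # elements 1 and 2, but not element 3

newzoo = sorted(zoo)      # a new, sorted list; zoo itself is left alone
before_sort = list(zoo)   # copy of zoo, taken to show sorted() did not change it
zoo.sort()                # sorts zoo in place, so zoo has now changed
```

The difference between the last few lines is the point of the warning above: `sorted(zoo)` builds a new list, while `zoo.sort()` rearranges the zoo itself.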

A construction called a “list comprehension” is particularly powerful, especially since it almost exactly mirrors notation we use to describe sets. Suppose we want to form the plural of the names of the creatures in our zoo. We build a new list, based on all of the elements of our old list.

Almost like it says: we add an “s” to each animal name, for each animal in the zoo, and place them in a new list. Perfect. (Except for getting the plural of “ostrich” wrong.)
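A list comprehension of this kind might look like the following sketch; the contents of `zoo` are again an assumption made for illustration.

```python
zoo = ['snake', 'elephant', 'baboon', 'beetle', 'ostrich']

# Read it as set-builder notation: animal + 's', for each animal in the zoo.
plurals = [animal + 's' for animal in zoo]
```

Compare the shape of the second line with set notation such as \(\setparts{2k}{k\in\set{1,\,2,\,3}}\): an expression first, then the source of the values it ranges over.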

One final type of list, with numbers this time. The range() function will create an object that can generate a list of integers. In its simplest form an invocation like range(12) does not seem to do much. But when we want to, we can make the object generate a list.

The next example will create a list of 12 integers, starting at zero and working up to, but not including, 12. Does this sound familiar?

Here are two other forms, that you should be able to understand by studying the examples. In the second one, we use a shorthand of sorts (the “starred” form) to produce a list.
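Forms of this kind can be sketched in plain Python as follows. The particular start, stop and step values are our own choices for illustration, not necessarily the ones used in the compute cells.

```python
lazy = range(12)                 # a range object; no list has been built yet
first_twelve = list(range(12))   # 0, 1, ..., 11: twelve integers, 12 excluded

evens = list(range(4, 14, 2))    # start, stop, step; the stop value is excluded
starred = [*range(20, 12, -1)]   # the "starred" shorthand also yields a list
```

As with sublists, the stopping value is never included, and a negative step counts downward.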

We have covered a lot here in this section, so come back later to pick up tidbits you might have missed. There are also many more features in SageMathCloud, including easy and powerful collaboration, that we have not covered.

Reading Questions SSLE Reading Questions


How many solutions does the system of equations \(3x + 2y = 4\text{,}\) \(6x + 4y = 8\) have? Explain your answer.


How many solutions does the system of equations \(3x + 2y = 4\text{,}\) \(6x + 4y = -2\) have? Explain your answer.


What do we mean when we say mathematics is a language?

Exercises SSLE Exercises


Find a solution to the system in Example IS where \(x_3=6\) and \(x_4=2\text{.}\) Find two other solutions to the system. Find a solution where \(x_1=-17\) and \(x_2=14\text{.}\) How many possible answers are there to each of these questions?


Each archetype (Appendix A) that is a system of equations begins by listing some specific solutions. Verify the specific solutions listed in the following archetypes by evaluating the system of equations with the solutions listed.

Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J


Find all solutions to the linear system.

\begin{align*} x + y &= 5\\ 2x - y &= 3 \end{align*}

Solving each equation for \(y\text{,}\) we have the equivalent system

\begin{align*} y &= 5 - x\\ y &= 2x - 3\text{.} \end{align*}

Setting these expressions for \(y\) equal, we have the equation \(5 - x = 2x - 3\text{,}\) which quickly leads to \(x = \frac{8}{3}\text{.}\) Substituting for \(x\) in the first equation, we have \(y = 5 - x = 5 - \frac{8}{3} = \frac{7}{3}\text{.}\) Thus, the solution is \(x = \frac{8}{3}\text{,}\) \(y = \frac{7}{3}\text{.}\)


Find all solutions to the linear system.

\begin{align*} 3x + 2y &= 1\\ x - y &= 2\\ 4x + 2y &= 2 \end{align*}

Find all solutions to the linear system.

\begin{align*} x + 2y &= 8\\ x - y &= 2\\ x + y &= 4 \end{align*}

Find all solutions to the linear system.

\begin{align*} x + y - z &= -1\\ x - y - z &= -1\\ z &= 2 \end{align*}

Find all solutions to the linear system.

\begin{align*} x + y - z &= -5\\ x - y - z &= -3\\ x + y - z &= 0 \end{align*}

A three-digit number has two properties. The tens-digit and the ones-digit add up to 5. If the number is written with the digits in the reverse order, and then subtracted from the original number, the result is \(792\text{.}\) Use a system of equations to find all of the three-digit numbers with these properties.


Let \(a\) be the hundreds digit, \(b\) the tens digit, and \(c\) the ones digit. Then the first condition says that \(b+c=5\text{.}\) The original number is \(100a+10b+c\text{,}\) while the reversed number is \(100c+10b+a\text{.}\) So the second condition is

\begin{equation*} 792=\left(100a+10b+c\right)-\left(100c+10b+a\right)=99a-99c\text{.} \end{equation*}

So we arrive at the system of equations

\begin{align*} b+c&=5\\ 99a-99c&=792\text{.} \end{align*}

Using equation operations, we arrive at the equivalent system

\begin{align*} a-c&=8\\ b+c&=5\text{.} \end{align*}

We can vary \(c\) and obtain infinitely many solutions. However, \(c\) must be a digit, restricting us to ten values (0 – 9). Furthermore, if \(c\gt 1\text{,}\) then the first equation forces \(a\gt 9\text{,}\) an impossibility. Setting \(c=0\) yields \(850\) as a solution, and setting \(c=1\) yields \(941\) as another solution.
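
Since there are only 900 three-digit numbers, the answer can also be confirmed by exhaustive search. This is a brute-force sketch in Python, not the system-of-equations method the exercise asks for; the function name `reversal_candidates` is made up for this illustration.

```python
def reversal_candidates():
    """Return the three-digit numbers whose tens and ones digits sum to 5
    and which exceed their own digit reversal by 792."""
    found = []
    for n in range(100, 1000):
        a, b, c = n // 100, (n // 10) % 10, n % 10  # hundreds, tens, ones
        reversed_n = 100 * c + 10 * b + a
        if b + c == 5 and n - reversed_n == 792:
            found.append(n)
    return found

print(reversal_candidates())  # [850, 941]
```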


Find all of the six-digit numbers in which the first digit is one less than the second, the third digit is half the second, the fourth digit is three times the third and the last two digits form a number that equals the sum of the fourth and fifth. The sum of all the digits is 24. (From The MENSA Puzzle Calendar for January 9, 2006.)


Let \(abcdef\) denote any such six-digit number and convert each requirement in the problem statement into an equation.

\begin{align*} a&=b-1\\ c&=\frac{1}{2}b\\ d&=3c\\ 10e+f&=d+e\\ 24&=a+b+c+d+e+f\text{.} \end{align*}

In a more standard form this becomes

\begin{align*} a-b&=-1\\ -b+2c&=0\\ -3c+d&=0\\ -d+9e+f&=0\\ a+b+c+d+e+f&=24\text{.} \end{align*}

Using equation operations (or the techniques of the upcoming Section RREF), this system can be converted to the equivalent system

\begin{align*} a+\frac{16}{75}f &= 5\\ b+\frac{16}{75}f &= 6\\ c+ \frac{8}{75}f &= 3\\ d+ \frac{8}{25}f &= 9\\ e+\frac{11}{75}f &= 1\text{.} \end{align*}

Choosing \(f=0\) yields the solution \(abcdef=563910\text{.}\) Every variable must be a single digit, and hence a whole number. None of the other choices for \(f\) (\(1,\,2,\,\ldots,\,9\)) makes \(\frac{16}{75}f\) a whole number, so for these choices \(a\) is not a digit and no further solutions arise.
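
The uniqueness claim can be double-checked by testing every six-digit number against the five original requirements. This is a brute-force sketch in Python, offered only as a check on the algebra; the function name `mensa_candidates` is invented here.

```python
def mensa_candidates():
    """Return all six-digit numbers abcdef meeting the puzzle's conditions."""
    found = []
    for n in range(100000, 1000000):
        a, b, c, d, e, f = (int(ch) for ch in str(n))
        if (a == b - 1                      # first digit is one less than the second
                and 2 * c == b              # third digit is half the second
                and d == 3 * c              # fourth digit is three times the third
                and 10 * e + f == d + e     # last two digits equal fourth plus fifth
                and a + b + c + d + e + f == 24):
            found.append(n)
    return found

print(mensa_candidates())  # [563910]
```

The condition `2 * c == b` is used in place of a division so that all the comparisons stay within the integers.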


Driving along, Terry notices that the last four digits on his car's odometer are palindromic. A mile later, the last five digits are palindromic. After driving another mile, the middle four digits are palindromic. One more mile, and all six are palindromic. What was the odometer reading when Terry first looked at it? Form a linear system of equations that expresses the requirements of this puzzle. (Car Talk Puzzler, National Public Radio, Week of January 21, 2008) (A car odometer displays six digits and a sequence is a palindrome if it reads the same left-to-right as right-to-left.)


198888 is one solution, and David Braithwaite found 199999 as another.
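
The exercise asks for a linear system, but the two readings quoted above can be confirmed independently by a direct search over all six-digit odometer readings. The sketch below is in Python and makes one assumption not stated in the puzzle: the odometer wraps from 999999 back to 000000. The helper names `reading`, `is_palindrome` and `odometer_candidates` are made up for this illustration.

```python
def reading(n):
    """Six-digit odometer display for mileage n (assumed to wrap at 999999)."""
    return f"{n % 10**6:06d}"

def is_palindrome(s):
    return s == s[::-1]

def odometer_candidates():
    """Return every starting reading that satisfies all four observations."""
    found = []
    for n in range(10**6):
        if (is_palindrome(reading(n)[2:])             # last four digits
                and is_palindrome(reading(n + 1)[1:])     # last five digits
                and is_palindrome(reading(n + 2)[1:5])    # middle four digits
                and is_palindrome(reading(n + 3))):       # all six digits
            found.append(n)
    return found

candidates = odometer_candidates()
print(198888 in candidates, 199999 in candidates)  # True True
```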


An article in The Economist (“Free Exchange”, December 6, 2014) quotes the following problem as an illustration that some of the “underlying assumptions of classical economics” about people's behavior are incorrect and “the mind plays tricks.” A bat and ball cost $1.10 between them. The bat costs $1 more than the ball. How much does each cost? Answer this quickly with no writing, then construct a system of linear equations and solve the problem carefully.


A quick answer is often $1.00 and 10 cents, rather than the correct answer, $1.05 and 5 cents.
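
One way to set up the careful solution, writing \(b\) for the price of the bat and \(c\) for the price of the ball, both in dollars (symbols chosen here for illustration), is the system

\begin{align*} b+c&=1.10\\ b-c&=1.00\text{.} \end{align*}

Adding the two equations gives \(2b=2.10\text{,}\) so \(b=1.05\text{,}\) and then the first equation gives \(c=1.10-1.05=0.05\text{.}\) The quick answer satisfies the first equation but not the second, since \(1.00-0.10=0.90\text{.}\)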


Each sentence below has at least two meanings. Identify the source of the double meaning, and rewrite the sentence (at least twice) to clearly convey each meaning.

  1. They are baking potatoes.
  2. He bought many ripe pears and apricots.
  3. She likes his sculpture.
  4. I decided on the bus.


  1. Does “baking” describe the potato or what is happening to the potato?

    • Those are potatoes that are used for baking.
    • The potatoes are being baked.
  2. Are the apricots ripe, or just the pears? Parentheses could indicate just what the adjective “ripe” is meant to modify. Were there many apricots as well, or just many pears?

    • He bought many ripe pears and many ripe apricots.
    • He bought apricots and many ripe pears.
  3. Is “sculpture” a single physical object, or the sculptor's style expressed over many pieces and many years?

    • She likes his sculpture of the girl.
    • She likes his sculptural style.
  4. Was a decision made while on the bus, or was the outcome of the decision to choose the bus? Would the sentence “I decided on the car,” have a similar double meaning?

    • I made my decision while on the bus.
    • I decided to ride the bus.

Discuss the difference in meaning of each of the following three almost identical sentences, which all have the same grammatical structure. (These are due to Keith Devlin.)

  1. She saw him in the park with a dog.
  2. She saw him in the park with a fountain.
  3. She saw him in the park with a telescope.

We know the dog belongs to the man, and the fountain belongs to the park. It is not clear if the telescope belongs to the man, the woman, or the park.


The following sentence, due to Noam Chomsky, has a correct grammatical structure, but is meaningless. Critique its faults. “Colorless green ideas sleep furiously.” (Chomsky, Noam. Syntactic Structures, The Hague/Paris: Mouton, 1957. p. 15.)


In adjacent pairs the words are contradictory or inappropriate. Something cannot be both green and colorless, ideas do not have color, ideas do not sleep, and it is hard to sleep furiously.


Read the following sentence and form a mental picture of the situation.

The baby cried and the mother picked it up.

What assumptions did you make about the situation?

  • Did you assume that the baby and mother are human?
  • Did you assume that the baby is the child of the mother?
  • Did you assume that the mother picked up the baby as an attempt to stop the crying?

Discuss the difference in meaning of the following two almost identical sentences, which have nearly identical grammatical structure. (This antanaclasis is often attributed to the comedian Groucho Marx, but has earlier roots.)

  • Time flies like an arrow.
  • Fruit flies like a banana.

This problem appears in a middle-school mathematics textbook: Together Dan and Diane have $20. Together Diane and Donna have $15. How much do the three of them have in total? (Transition Mathematics, Second Edition, Scott Foresman Addison Wesley, 1998. Problem 5–1.19.)


If \(x\text{,}\) \(y\) and \(z\) represent the money held by Dan, Diane and Donna, then \(y=15-z\) and \(x=20-y=20-(15-z)=5+z\text{.}\) We can let \(z\) take on any value from \(0\) to \(15\) without any of the three amounts being negative, since presumably middle-schoolers are too young to assume debt.

Then the total capital held by the three is \(x+y+z=(5+z)+(15-z)+z=20+z\text{.}\) So their combined holdings can range anywhere from $20 (Donna is broke) to $35 (Donna is flush). We will have more to say about this situation in Section TSS, and specifically Theorem CMVEI.
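
The range of totals can be tabulated directly. The Python sketch below assumes, purely for illustration, that Donna holds a whole number of dollars; the function name `holdings` is invented here.

```python
def holdings(z):
    """Amounts (Dan, Diane, Donna) when Donna holds z dollars."""
    y = 15 - z   # Diane and Donna together have 15
    x = 20 - y   # Dan and Diane together have 20
    return x, y, z

totals = []
for z in range(0, 16):
    x, y, _ = holdings(z)
    # Both stated conditions hold and nobody is in debt.
    assert x + y == 20 and y + z == 15 and min(x, y, z) >= 0
    totals.append(x + y + z)

print(min(totals), max(totals))  # 20 35
```

Every value of \(z\) produces a different total, which is why the problem as stated has no single answer.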


Solutions to the system in Example IS are given as

\begin{equation*} (x_1,\,x_2,\,x_3,\,x_4)=(-1-2a+3b,\,4+a-2b,\,a,\,b) \end{equation*}

Evaluate the three equations of the original system with these expressions in \(a\) and \(b\) and verify that each equation is true, no matter what values are chosen for \(a\) and \(b\text{.}\)


We have seen in this section that systems of linear equations have limited possibilities for solution sets, and we will shortly prove Theorem PSSLS that describes these possibilities exactly. This exercise will show that if we relax the requirement that our equations be linear, then the possibilities expand greatly. Consider a system of two equations in the two variables \(x\) and \(y\text{,}\) where the departure from linearity involves simply squaring the variables.

\begin{align*} x^2-y^2&=1\\ x^2+y^2&=4 \end{align*}

After solving this system of nonlinear equations, replace the second equation in turn by \(x^2+2x+y^2=3\text{,}\) \(x^2+y^2=1\text{,}\) \(x^2-4x+y^2=-3\text{,}\) \(-x^2+y^2=1\) and solve each resulting system of two equations in two variables. (This exercise includes suggestions from Don Kreher.)


The equation \(x^2-y^2=1\) has a solution set by itself that has the shape of a hyperbola when plotted. Four of the five different second equations have solution sets that are circles when plotted individually (the last is another hyperbola). The points where the hyperbola and a circle intersect are the solutions to the corresponding system of two equations. As the size and location of the circles vary, the number of intersections varies from four to one (in the order given). The last equation is a hyperbola that “opens” in the other direction, so it never meets the first hyperbola. Sketching the relevant equations would be instructive, as was discussed in Example STNE.

The exact solution sets are (according to the choice of the second equation),

\begin{align*} &x^2+y^2= 4,\\ &\set{ \left( \sqrt{\frac{5}{2}}, \sqrt{\frac{3}{2}}\right),\, \left(-\sqrt{\frac{5}{2}}, \sqrt{\frac{3}{2}}\right),\, \left( \sqrt{\frac{5}{2}},-\sqrt{\frac{3}{2}}\right),\, \left(-\sqrt{\frac{5}{2}},-\sqrt{\frac{3}{2}}\right) }\\ &x^2+2x+y^2=3,\quad\set{(1,0),\,(-2,\sqrt{3}),\,(-2,-\sqrt{3})}\\ &x^2+y^2=1,\quad\set{(1,0),\,(-1,0)}\\ &x^2-4x+y^2=-3,\quad\set{(1,0)}\\ &-x^2+y^2=1,\quad\set{} \end{align*}
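
Each listed point can be checked numerically by substituting it into both equations of its system. The Python sketch below does this with floating-point arithmetic and a small tolerance. It only confirms that the listed points are solutions; it does not show that no solutions are missing. The names `on_hyperbola`, `second_equations` and `all_points_check` are made up for this illustration.

```python
from math import isclose, sqrt

def on_hyperbola(x, y):
    """True when (x, y) satisfies x^2 - y^2 = 1, up to rounding."""
    return isclose(x * x - y * y, 1.0, abs_tol=1e-9)

# Each entry: (left-hand side of the second equation, right-hand side, listed solutions).
second_equations = [
    (lambda x, y: x * x + y * y, 4,
     [(sx * sqrt(5 / 2), sy * sqrt(3 / 2)) for sx in (1, -1) for sy in (1, -1)]),
    (lambda x, y: x * x + 2 * x + y * y, 3,
     [(1, 0), (-2, sqrt(3)), (-2, -sqrt(3))]),
    (lambda x, y: x * x + y * y, 1, [(1, 0), (-1, 0)]),
    (lambda x, y: x * x - 4 * x + y * y, -3, [(1, 0)]),
]

def all_points_check():
    """True when every listed point satisfies both equations of its system."""
    return all(
        on_hyperbola(x, y) and isclose(lhs(x, y), rhs, abs_tol=1e-9)
        for lhs, rhs, points in second_equations
        for x, y in points
    )

print(all_points_check())  # True
```

The fifth system is omitted from the check because its solution set is empty: adding \(x^2-y^2=1\) and \(-x^2+y^2=1\) gives \(0=2\text{.}\)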

Proof Technique D asks you to formulate a definition of what it means for a whole number to be odd. What is your definition? (Do not say “the opposite of even.”) Is \(6\) odd? Is \(11\) odd? Justify your answers by using your definition.


We can say that an integer is odd if when it is divided by \(2\) there is a remainder of 1. So \(6\) is not odd since \(6=3\times 2+0\text{,}\) while \(11\) is odd since \(11=5\times 2 + 1\text{.}\)
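
This definition translates directly into a test on the remainder. The following is a small Python sketch; the function name `is_odd` is chosen here for illustration.

```python
def is_odd(n):
    """An integer is odd if dividing it by 2 leaves a remainder of 1."""
    quotient, remainder = divmod(n, 2)  # n == 2 * quotient + remainder
    return remainder == 1

print(is_odd(6), is_odd(11))  # False True
```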


Explain why the second equation operation in Definition EO requires that the scalar be nonzero, while in the third equation operation this restriction on the scalar is not present.


Definition EO is engineered to make Theorem EOPSS true. If we were to allow a zero scalar to multiply an equation then that equation would be transformed to the equation \(0=0\text{,}\) which is true for any possible values of the variables. Any restrictions on the solution set imposed by the original equation would be lost.

However, in the third operation, we are allowed to choose a zero scalar, multiply an equation by this scalar and add the transformed equation to a second equation (leaving the first unchanged). The result? Nothing. The second equation is the same as it was before. So the theorem is true in this case, the two systems are equivalent. But in practice, this would be a silly thing to actually ever do! We still allow it though, in order to keep our theorem as general as possible.

Notice the location in the proof of Theorem EOPSS where the expression \(\frac{1}{\alpha}\) appears — this explains the prohibition on \(\alpha=0\) in the second equation operation.
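
A small concrete case may make the contrast clearer. The Python sketch below uses the illustrative system \(x+y=5\text{,}\) \(2x-y=3\) (chosen here, not taken from Definition EO) and a made-up helper `satisfies`. Multiplying the second equation by zero lets in a point that was not a solution before, while adding zero times the first equation to the second changes nothing.

```python
def satisfies(system, point):
    """system: list of (coefficients, constant) pairs; point: tuple of values."""
    return all(
        sum(c * v for c, v in zip(coeffs, point)) == const
        for coeffs, const in system
    )

original = [((1, 1), 5), ((2, -1), 3)]   # x + y = 5,  2x - y = 3

# Second operation with a zero scalar: the second equation becomes 0 = 0.
scaled = [original[0], ((0, 0), 0)]

# Third operation with a zero scalar: add 0 times equation 1 to equation 2.
(c1, k1), (c2, k2) = original
added = [original[0], (tuple(0 * a + b for a, b in zip(c1, c2)), 0 * k1 + k2)]

point = (0, 5)  # satisfies x + y = 5 but not 2x - y = 3
print(satisfies(original, point), satisfies(scaled, point))  # False True
print(added == original)  # True
```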