
{\sc\large This Section is a Draft, Subject to Changes}

\bigskip Our next decomposition will break a square matrix into a product of two matrices, one lower triangular and the other upper triangular. So we will write $A=LU$, and hence many refer to this as LU decomposition. We will see that this decomposition is very easy to compute and that it has a direct application to solving systems of equations. Since this section is about triangular matrices, you might want to review the definitions and a couple of basic theorems back in Subsection OD.TM.

\subsect{TD}{Triangular Decomposition}

With a slight condition on the nonsingularity of certain submatrices, we can split a matrix into a product of two triangular matrices.
Theorem TD Triangular Decomposition
Suppose $A$ is a square matrix of size $n$. Let $A_k$ be the $k\times k$ matrix formed from $A$ by taking the first $k$ rows and the first $k$ columns. Suppose that $A_k$ is nonsingular for all $1\leq k\leq n$. Then there is a lower triangular matrix $L$ with all of its diagonal entries equal to 1 and an upper triangular matrix $U$ such that $A=LU$. Furthermore, this decomposition is unique.
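For instance (a small illustration to fix ideas, chosen here for concreteness and separate from Example TD4 below), the matrix $A$ has $A_1=\begin{bmatrix}2\end{bmatrix}$ and $A_2=A$ both nonsingular, and it factors as promised:
\begin{align*}
A=\begin{bmatrix}2&1\\6&5\end{bmatrix}
=\begin{bmatrix}1&0\\3&1\end{bmatrix}
\begin{bmatrix}2&1\\0&2\end{bmatrix}
=LU
\end{align*}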

Proof

Studying the proofs of some previous theorems will perhaps give you an idea for an approach to computing a triangular decomposition. In the proof of Theorem CINM we augmented a nonsingular matrix with an identity matrix of the same size, and row-reduced until the original matrix became the identity matrix (as we knew in advance would happen, since we knew Theorem NMRRI). Theorem PEEF tells us about properties of extended echelon form, and in particular, that $B=JA$, where $A$ is the matrix that begins on the left, and $B$ is the reduced row-echelon form of $A$. The matrix $J$ is the result on the right side of the augmented matrix, which is the result of applying the same row operations to the identity matrix. We should recognize now that $J$ is just the product of the elementary matrices (Subsection DM.EM) that perform these row operations. Theorem ITMT used the extended echelon form to discern properties of the inverse of a triangular matrix. Theorem TD proves the existence of a triangular decomposition by applying specific row operations and tracking the corresponding elementary matrices. It is not a great leap to combine these observations into a computational procedure.

To find the triangular decomposition of $A$, augment $A$ with the identity matrix of the same size and call this new $n\times 2n$ matrix, $M$. Perform row operations on $M$ that convert the first $n$ columns to an upper triangular matrix. Do this using only row operations that add a scalar multiple of one row to another row with higher index (i.e.\ lower down). In this way, the last $n$ columns of $M$ will be converted into a lower triangular matrix with 1's on the diagonal (since $M$ has 1's in these locations initially). We could think of this process as doing about half of the work required to compute the inverse of $A$. Take the first $n$ columns of the row-equivalent version of $M$ and call this matrix $U$. Take the final $n$ columns of the row-equivalent version of $M$ and call this matrix $L^\prime$. Then by a proof employing elementary matrices, or a proof similar in spirit to the one used to prove Theorem PEEF, we arrive at a result similar to the second assertion of Theorem PEEF. Namely, $U=L^\prime A$. Multiplication on the left by the inverse of $L^\prime$ gives $A=\left(L^\prime\right)^{-1}U$, and by Theorem ITMT this inverse is again lower triangular with 1's on the diagonal, so we have the decomposition of $A$ (which we know to be unique). Ready? Let's try it.
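The procedure above translates readily into code. What follows is a minimal sketch in Python with NumPy (the function name is our choice for illustration, and we lean on \verb|numpy.linalg.inv| rather than a hand-rolled triangular inverse); it assumes, per the hypotheses of Theorem TD, that every leading submatrix of $A$ is nonsingular.

\begin{verbatim}
import numpy as np

def triangular_decomposition(A):
    # Row-reduce M = [A | I] to [U | L'] using only the operation
    # "add a multiple of one row to a row with higher index".
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])      # the n x 2n matrix M
    for j in range(n):
        for i in range(j + 1, n):
            M[i] -= (M[i, j] / M[j, j]) * M[j]
    U = M[:, :n]                       # upper triangular
    Lprime = M[:, n:]                  # lower triangular, 1's on diagonal
    L = np.linalg.inv(Lprime)          # U = L'A, so A = (L')^{-1}U = LU
    return L, U

A = np.array([[2., 1.], [6., 5.]])
L, U = triangular_decomposition(A)
print(np.allclose(A, L @ U))           # True
\end{verbatim}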
Example TD4 Triangular decomposition, size 4
\subsect{TDSSE}{Triangular Decomposition and Solving Systems of Equations}

In this subsection we explain why you might be interested in a triangular decomposition of a matrix. Many of the computational problems in linear algebra revolve around solving large systems of equations, or nearly equivalently, finding inverses of large matrices. Suppose we have a system of equations with coefficient matrix $A$ and vector of constants $\vect{b}$, and suppose further that $A$ has the triangular decomposition $A=LU$.

Let $\vect{y}$ be the solution to the linear system $\linearsystem{L}{\vect{b}}$, so that by Theorem SLEMM, we have $L\vect{y}=\vect{b}$. Notice that since $L$ is nonsingular, this solution is unique, and the form of $L$ makes it trivial to solve the system. The first entry of $\vect{y}$ is simply the first entry of $\vect{b}$, and each subsequent entry is determined by substituting the values already found, without ever dividing, since the diagonal entries of $L$ are all 1. Now, with $\vect{y}$ in hand, consider the linear system, $\linearsystem{U}{\vect{y}}$. Let $\vect{x}$ be the unique solution to this system, so by Theorem SLEMM we have $U\vect{x}=\vect{y}$. Notice that a system of equations with $U$ as a coefficient matrix is also straightforward to solve, though we will compute the bottom entries of $\vect{x}$ first, and we will need to divide by the diagonal entries of $U$. The upshot of all this is that $\vect{x}$ is a solution to $\linearsystem{A}{\vect{b}}$, as we now show,
\begin{align*}
A\vect{x}
&=LU\vect{x}
=L\left(U\vect{x}\right)
=L\vect{y}
=\vect{b}
\end{align*}
An application of Theorem SLEMM demonstrates that $\vect{x}$ is a solution to $\linearsystem{A}{\vect{b}}$.
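These two passes are commonly called forward and back substitution, and they are easy to express in code. Here is a minimal sketch (the function name is our invention), assuming $L$ has 1's on its diagonal and $U$ has nonzero diagonal entries:

\begin{verbatim}
import numpy as np

def lu_solve(L, U, b):
    # Solve Ax = b given the factors of A = LU.
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                  # forward: Ly = b, no division,
        y[i] = b[i] - L[i, :i] @ y[:i]  # since L has 1's on its diagonal
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):      # backward: Ux = y, dividing by
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]  # diagonal of U
    return x
\end{verbatim}

Continuing with the factors computed in the earlier sketch, \verb|lu_solve(L, U, np.array([3., 11.]))| returns $(1,1)$, which indeed solves $A\vect{x}=\vect{b}$ for that choice of $\vect{b}$.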
Example TDSSE Triangular decomposition solves a system of equations
\subsect{CTD}{Computing Triangular Decompositions}

It would be a simple matter to adjust the algorithm for converting a matrix to reduced row-echelon form and obtain an algorithm to compute the triangular decomposition of the matrix, along the lines of Example TD4 and the discussion preceding this example. However, it is possible to obtain relatively simple formulas for the entries of the decomposition, and if computed in the proper order, an implementation will be straightforward. We will state the result as a theorem and then give an example of its use.
Theorem TDEE Triangular Decomposition, Entry by Entry
Suppose that $A$ is a square matrix of size $n$ with a triangular decomposition $A=LU$, where $L$ is lower triangular with diagonal entries all equal to 1, and $U$ is upper triangular. Then
\begin{align*}
\matrixentry{U}{ij}&=
\matrixentry{A}{ij}-\sum_{k=1}^{i-1}\matrixentry{L}{ik}\matrixentry{U}{kj}
&&1\leq i\leq j\leq n\\
\matrixentry{L}{ij}&=
\frac{1}{\matrixentry{U}{jj}}
\left(\matrixentry{A}{ij}-\sum_{k=1}^{j-1}\matrixentry{L}{ik}\matrixentry{U}{kj}\right)
&&1\leq j<i\leq n
\end{align*}

Proof

At first glance, these formulas may look exceedingly complex. Upon closer examination, it looks even worse. We have expressions for entries of $U$ that depend on other entries of $U$ and also on entries of $L$. But then the formula for entries of $L$ depends on entries from $L$ and entries from $U$. Do these formulas have circular dependencies? Or perhaps equivalently, how do we get started? The key is to be organized about the computations and employ these two (similar) formulas in a specific order. First compute the first row of $L$, followed by the first column of $U$. Then the second row of $L$, followed by the second column of $U$. And so on. In this way, all of the values required for each new entry will have already been computed previously.
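In code, this ordering translates directly. The following is a minimal sketch in Python with NumPy (the function name is ours); pass $i$ fills row $i$ of $L$, then column $i$ of $U$, exactly as described above.

\begin{verbatim}
import numpy as np

def lu_entry_by_entry(A):
    # Theorem TDEE: row i of L, then column i of U, for each i.
    # A zero diagonal entry of U surfaces as a division by zero,
    # signaling that A has no triangular decomposition.
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for i in range(n):
        for j in range(i):       # row i of L (left of the diagonal)
            L[i, j] = (A[i, j] - L[i, :j] @ U[:j, j]) / U[j, j]
        for j in range(i + 1):   # column i of U (diagonal and above)
            U[j, i] = A[j, i] - L[j, :j] @ U[:j, i]
    return L, U
\end{verbatim}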

Of course, the formulas for entries of $L$ require division by diagonal entries of $U$. These entries might be zero, but in this case one of the submatrices $A_k$ is singular and $A$ does not have a triangular decomposition. So we need not check the hypotheses carefully and can launch into the arithmetic dictated by the formulas, confident that we will be reminded when a decomposition is not possible. Note that these formulas give us all of the values that we need for the decomposition, since we require that $L$ has 1's on the diagonal. If we replace the 1's on the diagonal of $L$ by zeros, and add the matrix $U$, we get an $n\times n$ matrix containing all the information we need to resurrect the triangular decomposition. This is mostly a notational convenience, but it is a frequent way of presenting the information. We'll employ it in the next example.
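As an aside, this packed form is easy to produce and to unpack; a minimal sketch, continuing with the NumPy conventions used above:

\begin{verbatim}
# Pack L (strictly below the diagonal) and U (on and above it)
# into a single matrix; the 1's on the diagonal of L are implicit.
compact = np.tril(L, k=-1) + U

# Recover the factors from the packed form.
L_restored = np.tril(compact, k=-1) + np.eye(compact.shape[0])
U_restored = np.triu(compact)
\end{verbatim}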
Example TDEE6 Triangular decomposition, entry by entry, size 6
The hypotheses of Theorem TD can be weakened slightly to include matrices where not every $A_k$ is nonsingular. This introduces a rearrangement of the rows and columns of $A$ to force as many as possible of the smaller submatrices to be nonsingular. Then permutation matrices also enter into the decomposition. We will not present the details here, but instead suggest consulting a more advanced text on matrix analysis.
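For the curious, many numerical libraries compute exactly this permuted variant. Here is a minimal sketch using SciPy (assuming SciPy is available; its \verb|scipy.linalg.lu| returns factors with $A=PLU$, where $P$ is a permutation matrix):

\begin{verbatim}
import numpy as np
from scipy.linalg import lu

# A_1 = [0] is singular, so Theorem TD does not apply directly,
# but a permuted decomposition A = PLU still exists.
A = np.array([[0., 1.], [2., 3.]])
P, L, U = lu(A)
print(np.allclose(A, P @ L @ U))   # True
\end{verbatim}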