The LU Factorization

To solve a linear system of the form $A\vec x = \vec b$ we could use row reduction or, in theory, calculate $A^{-1}$ and use it to determine $\vec x$ with the equation

$$\vec x = A^{-1} \vec b$$

But computing $A^{-1}$ means inverting an $n\times n$ matrix, which is costly for large $n$. It is more practical to solve $A\vec x = \vec b$ with row reduction (i.e., Gaussian elimination), and it turns out that there are even more efficient methods, especially when $n$ is large.

One such method relies on what is referred to as a matrix factorization. A matrix factorization, or matrix decomposition, is a factorization of a matrix into a product of matrices. Factorizations can be useful for solving $A \vec x = \vec b$ and for understanding the properties of a matrix.

In this article we factor a matrix into the product of a lower triangular matrix and an upper triangular matrix to construct what is known as the LU factorization, which gives a systematic and efficient way to solve linear systems.

Triangular Matrices

Before we introduce the LU factorization, we need to first define upper and lower triangular matrices.

Suppose that the entries of the $m\times n$ matrix $A$ are $a_{i,j}$. Then $A$ is upper triangular if $a_{i,j} = 0$ for $i > j$. Matrix $A$ is lower triangular if $a_{i,j} = 0$ for $i < j$.

As an example, all of the matrices below are in upper triangular form.

$$\begin{pmatrix} 1&5&0\\0&2&4 \end{pmatrix}, \quad \begin{pmatrix} 2&1\\0&1\\0&0\\0&0 \end{pmatrix} , \quad \begin{pmatrix} 0&0&0\\0&0&0 \end{pmatrix}$$

Notice how all of the entries below the main diagonal are zero, and the entries on and above the main diagonal can be anything. Likewise, examples of lower triangular matrices are below.

$$\begin{pmatrix} 1&0&0\\3&2&0 \end{pmatrix}, \quad \begin{pmatrix} 3&0&0&0\\1&2&0&0 \end{pmatrix}, \quad \begin{pmatrix} 1&0\\4&6\\1&5 \end{pmatrix}, \quad \begin{pmatrix} 0&0&0\\0&0&0 \end{pmatrix} $$

Again, note that our definition for an upper triangular matrix does not specify what the entries on or above the main diagonal need to be. Some or all of the entries above the main diagonal can, for example, be zero. Likewise the entries on and below the main diagonal of a lower triangular matrix do not have to have specific values.
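
In code, a quick way to test these conditions is to compare a matrix with its upper or lower triangular part. This is a minimal sketch assuming NumPy is available; the function names are chosen for illustration.

```python
import numpy as np

def is_upper_triangular(A):
    """True if every entry below the main diagonal is zero."""
    return np.allclose(A, np.triu(A))   # np.triu zeros out entries below the diagonal

def is_lower_triangular(A):
    """True if every entry above the main diagonal is zero."""
    return np.allclose(A, np.tril(A))   # np.tril zeros out entries above the diagonal

print(is_upper_triangular(np.array([[1, 5, 0],
                                    [0, 2, 4]])))   # True
```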

The LU Factorization

After stating a theorem that gives the LU decomposition, we will give an algorithm for constructing the LU factorization. We will then see how we can use the factorization to solve a linear system.

If $ A$ is an $ m \times n$ matrix that can be row reduced to echelon form without row exchanges, then $ A = L U $, where $ L $ is a lower triangular $ m \times m$ matrix with $ 1$’s on the diagonal, and $ U$ is an echelon form of $ A$.

Proof

To prove the theorem above we will first show that we can write $A = LU$ where $L$ is an invertible matrix, and $U$ is an echelon form of $A$.

Suppose that the $m\times n$ matrix $A$ can be reduced to an echelon form $U$ with $p$ elementary row operations, each of which adds a multiple of a row to another row below it. Each of these row operations can be performed by multiplying on the left by an elementary matrix, so the reduction can be written as

\begin{align}
E_pE_{p-1} \cdots E_3E_2E_1 A &= U \quad\quad (1)
\end{align}
If we let $L^{-1} = E_pE_{p-1} \cdots E_3E_2E_1$, then
\begin{align}
L^{-1}A &= U \quad\quad (2)
\end{align}

Note that $L^{-1} = E_pE_{p-1} \cdots E_3E_2E_1$ is invertible because it is a product of elementary matrices, each of which is invertible. Therefore $L^{-1}$ can be reduced to the identity with a sequence of row operations. Moreover, if we multiply both sides of Equation (2) on the left by $L$ we obtain:

\begin{align*}
LL^{-1} A &= LU \quad \Rightarrow \quad A = LU
\end{align*}

Therefore $A$ has the decomposition $A=LU$ where $U$ is an echelon form of $A$ and $L$ is an invertible $m\times m$ matrix. To show that $L$ is lower triangular, recall from equations (1) and (2) that

$$L^{-1} = E_pE_{p-1} \cdots E_3E_2E_1 $$

Each elementary matrix $E_i$ is lower triangular because the only row operation used to reduce $A$ to $U$ was adding a multiple of a row to a row below it; in fact, each $E_i$ also has $1$'s on its diagonal. It can also be shown that the product of two lower triangular matrices is lower triangular, and that the inverse of a lower triangular matrix is lower triangular, with $1$'s on the diagonal preserved in both cases. This implies that $L^{-1}$ and $L$ are both lower triangular with $1$'s on the diagonal.
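
For example, in the $3\times 3$ case the elementary matrix that performs $R_2 - 2R_1 \to R_2$, and its inverse, which performs $R_2 + 2R_1 \to R_2$, are both lower triangular with $1$'s on the diagonal:

$$E = \begin{pmatrix} 1&0&0\\-2&1&0\\0&0&1 \end{pmatrix}, \qquad E^{-1} = \begin{pmatrix} 1&0&0\\2&1&0\\0&0&1 \end{pmatrix}$$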

Constructing the LU Factorization

To construct the LU factorization of a matrix we must first apply a sequence of row operations to $A$ in order to reduce $A$ to $U$. Equation (1) gives us that

\begin{align}
E_pE_{p-1} \cdots E_3E_2E_1 A = L^{-1}A &= U, \quad \text{where } L^{-1} = E_pE_{p-1} \cdots E_3E_2E_1
\end{align}

But since $L^{-1}L = I$, the sequence of row operations that reduces $A$ to $U$ will also reduce $L$ to $I$. This gives us an algorithm for constructing the LU factorization of an $m \times n$ matrix $A$ that can be row reduced to echelon form without row exchanges (a short computational sketch follows the steps below).

  1. Reduce $ A$ to an echelon form $ U$ by a sequence of row replacement operations, if possible.
  2. Place entries in $ L$ such that the sequence of row operations that reduces $A$ to $U$ will reduce $ L$ to $ I$.
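
The following is a minimal sketch of these two steps in Python, assuming NumPy is available. The function name lu_no_pivot is chosen for illustration, and the sketch only handles matrices that can be reduced without row exchanges, as the theorem requires.

```python
import numpy as np

def lu_no_pivot(A):
    """Factor A = L U using only row replacement operations.

    L is m x m with 1's on the diagonal; U is an echelon form of A.
    No row exchanges are performed, so a zero pivot with nonzero
    entries below it is not handled by this sketch.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    L = np.eye(m)
    U = A.copy()
    for j in range(min(m, n)):           # eliminate below the pivot U[j, j]
        if U[j, j] == 0:
            continue                      # would need a row exchange; skipped here
        for i in range(j + 1, m):
            mult = U[i, j] / U[j, j]      # multiplier in R_i - mult*R_j -> R_i
            U[i, :] -= mult * U[j, :]     # row replacement applied to U
            L[i, j] = mult                # recording mult here makes the same
                                          # row operations reduce L to I
    return L, U
```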

Example: Constructing LU for a $\mathbf{3\times 2}$ Matrix

In this example we construct the LU factorization of the following matrix.

$$A=\begin{pmatrix} 1 & 3 \\ 2 & 10 \\ 0 & 12 \end{pmatrix}$$

Because $A$ is a $3\times 2$ matrix, the LU factorization has the form

\begin{align} A = L U = \begin{pmatrix} 1&0&0\\ \ast&1&0\\ \ast&\ast&1 \end{pmatrix}\begin{pmatrix} \ast&\ast \\ 0&\ast \\ 0&0 \end{pmatrix}  \end{align}

Each $\ast$ represents an entry that we need to compute the value of. To reduce $A$ to $U$ we apply a sequence of row replacement operations as shown below.

\begin{align}A = & \begin{pmatrix} 1 & 3 \\ 2 & 10 \\ 0 & 12 \end{pmatrix} \thicksim \begin{pmatrix} 1 & 3 \\ 0 & 4 \\ 0 & 12 \end{pmatrix} \thicksim \begin{pmatrix} 1 & 3 \\ 0 & 4 \\ 0 & 0 \end{pmatrix} = U \end{align}

Matrix $U$ is the echelon form of $A$ that we need for the LU factorization. We next construct $L$ so that the row operations that reduced $A$ to $U$ will reduce $L$ to $I$. Our row operations were

\begin{align*} R_2-2R_1\to R_2 \quad \text{and} \quad R_3 -3R_2 \to R_3 \end{align*}

With these two row operations, we see that $L$ must be the matrix

$$L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 3 & 1 \end{pmatrix}$$

Note that the row operations $R_2-2R_1\to R_2$ and $ R_3 -3R_2 \to R_3$ applied to $L$ will give us the identity. The LU factorization of $A$ is

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 0 & 4 \\ 0 & 0 \end{pmatrix}$$
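
As a quick check, the lu_no_pivot sketch given earlier reproduces this factorization:

```python
import numpy as np

A = np.array([[1.0,  3.0],
              [2.0, 10.0],
              [0.0, 12.0]])

L, U = lu_no_pivot(A)             # the sketch defined above
print(L)                          # [[1. 0. 0.], [2. 1. 0.], [0. 3. 1.]]
print(U)                          # [[1. 3.], [0. 4.], [0. 0.]]
print(np.allclose(L @ U, A))      # True
```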

Using the LU Decomposition to Solve a Linear System

Our motivation for introducing the LU factorization was to have an efficient method for solving linear systems. Given an $m\times n$ matrix $A$ and a vector $\vec b$, we wish to use the LU factorization of $A$ to solve $A \vec x = \vec b$ for $\vec x$. A procedure for doing so is below, followed by a short code sketch of the substitution steps.

To solve $A \vec x = \vec b$ for $\vec x$:

  1. Construct the LU decomposition of $A$ to obtain $L$ and $U$.
  2. Set $U\vec x = \vec y$ and forward solve for $\vec y$ in $L \vec y = \vec b$.
  3. Backward solve for $\vec x$ in $U \vec x = \vec y$.
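
Here is a minimal sketch of steps 2 and 3 in Python, assuming NumPy, with $L$ unit lower triangular ($m\times m$) and $U$ an $m\times n$ echelon form. The name solve_with_lu is chosen for illustration, and the sketch assumes the system is consistent and that $U$ has a pivot in every column, so the solution is unique.

```python
import numpy as np

def solve_with_lu(L, U, b):
    """Solve A x = b given A = L U (L unit lower triangular, U in echelon form)."""
    m = L.shape[0]
    n = U.shape[1]

    # Step 2: forward solve L y = b, from the top row down
    y = np.zeros(m)
    for i in range(m):
        y[i] = b[i] - L[i, :i] @ y[:i]          # L[i, i] == 1, so no division needed

    # Step 3: backward solve U x = y, from the bottom pivot up
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:n] @ x[i+1:n]) / U[i, i]
    return x
```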

Example of Solving a Linear System Given $A=LU$

Solve the linear system $A\vec x = \vec b$, given the LU decomposition of $A$.
\begin{equation*}
A = LU = \begin{pmatrix}  1&0&0&0\\1&1&0&0\\0&2&1&0\\0&0&3&1\end{pmatrix} \begin{pmatrix}1&4&1\\0&1&1\\0&0&2 \\0&0&0 \end{pmatrix} , \quad \vec b = \begin{pmatrix}2\\3\\2\\0 \end{pmatrix}
\end{equation*}

Solution

We first set $U \vec x = \vec y$ and solve $L\vec y = \vec b$. Reducing the augmented matrix $(L \, | \, \vec b)$ gives us:
\begin{align}
\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 2\\ 1 & 1 & 0 & 0 & 3\\ 0 & 2 & 1 & 0 & 2\\ 0 & 0 & 3 & 1 & 0 \end{array}\right) &\thicksim
\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 2\\ 0 & 1 & 0 & 0 & 1 \\ 0 & 2 & 1 & 0 & 2\\ 0 & 0 & 3 & 1 & 0 \end{array}\right) \thicksim
\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 2\\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \end{array}\right) \\ &\thicksim
\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 2\\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right)
\end{align}

Therefore, $\vec y$ is the vector:

$$\vec y = \begin{pmatrix} 2\\1\\0\\0 \end{pmatrix}$$

We now solve $U\vec x = \vec y$.

$$\left(\begin{array}{ccc|c} 1&4&1&2\\0&1&1&1\\0&0&2&0\\0&0&0&0 \end{array}\right) \thicksim \left(\begin{array}{ccc|c} 1&4&0&2\\0&1&0&1\\0&0&1&0\\0&0&0&0 \end{array}\right) \thicksim \left(\begin{array}{ccc|c} 1&0&0&-2\\0&1&0&1\\0&0&1&0\\0&0&0&0 \end{array}\right) $$

The solution to the linear system, $\vec x$, is the vector $$\vec x = \begin{pmatrix} -2\\1\\0 \end{pmatrix}$$
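
The forward and backward substitution sketch from earlier gives the same answer:

```python
import numpy as np

L = np.array([[1., 0., 0., 0.],
              [1., 1., 0., 0.],
              [0., 2., 1., 0.],
              [0., 0., 3., 1.]])
U = np.array([[1., 4., 1.],
              [0., 1., 1.],
              [0., 0., 2.],
              [0., 0., 0.]])
b = np.array([2., 3., 2., 0.])

print(solve_with_lu(L, U, b))     # [-2.  1.  0.]
```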

Summary

In our treatment of the LU factorization we constructed the LU decomposition using the following process.

  1. reduce $ A$ to an echelon form $ U$ by a sequence of row replacement operations, if possible
  2. place entries in $ L$ such that the same sequence of row operations reduces $ L$ to $ I$

Certainly there is much more to the LU factorization than what was presented here. In our introductory approach, the only row operation used to construct $L$ and $U$ was adding a multiple of a row to a row below it. Multiplying a row by a non-zero scalar is not needed, but more importantly, we cannot swap rows. More advanced linear algebra and numerical analysis courses address this significant limitation.
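
For reference, numerical libraries deal with this limitation by exchanging rows as they factor. For example, SciPy's scipy.linalg.lu returns a permutation matrix along with the triangular factors; a small sketch, assuming SciPy is installed, is below.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 1.],
              [2., 3.]])              # needs a row exchange, so no-pivot elimination fails

P, L, U = lu(A)                       # factors A as P @ L @ U, allowing row exchanges
print(np.allclose(P @ L @ U, A))      # True
```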

Exercise

See if you can construct the LU factorization of the following matrix.
$$A = \begin{pmatrix}2&1&0\\4&3&1\\ 0 & -1 & 2\end{pmatrix}$$
