3 LU and other Factorizations
“If I have seen further it is by standing on the shoulders of Giants.”
– Isaac Newton
In the last chapter, we saw two algorithms for solving a linear system: Gaussian and Gauss–Jordan elimination. You may be disappointed to learn that neither of them is used directly by commercial solvers. For a large, dense (non-sparse) matrix \mathbf{A}, the standard approach to solving \mathbf{A}\vec{x}=\vec{b} involves two steps:
1. Factorize \mathbf{A} into a lower triangular matrix \mathbf{L} and an upper triangular matrix \mathbf{U} such that \mathbf{A}=\mathbf{L}\mathbf{U}. This step is merely Gaussian elimination without augmentation.
2. Solve the following two triangular systems to get the solution \vec{x}: first \mathbf{L}\vec{y}=\vec{b} by forward substitution, and then \mathbf{U}\vec{x}=\vec{y} by back substitution.
Exercise 3.1 Explain why the above steps should solve the original system.
In many applications, the coefficient matrix \mathbf{A} remains unchanged, but the system is solved for various right-hand sides \vec{b}.
If the factorization \mathbf{L}\mathbf{U} is already given, solving takes \mathcal{O}(n^2) time using only forward and back substitution, as compared to the usual \mathcal{O}(n^3) time for Gaussian elimination.
The decomposition separates the Gaussian elimination step from actually solving the system for a particular \vec{b}.
In effect, the procedure reduces the solution of \mathbf{A}\vec{x} = \vec{b} from approximately \frac{1}{3}n^3 operations to approximately 2n^2 operations. For large systems (n in the hundreds or more) this can reduce the calculation time by more than 95%. Determining the \mathbf{L} and \mathbf{U} matrices still takes approximately \frac{1}{3}n^3 operations, but this has to be done only once for a given matrix \mathbf{A}; the factorization can then be reused to solve for \vec{x} with many right-hand-side vectors \vec{b}.
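As a quick illustration of this factor-once, solve-many workflow, here is a minimal sketch using SciPy's scipy.linalg.lu_factor and scipy.linalg.lu_solve (the random matrix and right-hand sides are made up for the demonstration):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

lu, piv = lu_factor(A)        # the O(n^3) factorization: done once per A

for _ in range(3):            # each solve reuses the factorization in O(n^2)
    b = rng.standard_normal(4)
    x = lu_solve((lu, piv), b)
    print(np.allclose(A @ x, b))   # True
```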
We have already seen that Gaussian elimination transforms \mathbf{A} into an upper triangular system by repeated row operations. The final upper triangular matrix (without augmentation) is the matrix \mathbf{U} in the \mathbf{L}\mathbf{U} decomposition of \mathbf{A}.
The diagonal of \mathbf{U} contains the pivots \mathbf{p_{i}}’s. \mathbf{U} = \begin{pmatrix}\mathbf{p_{1}}&u_{12}&\ldots&u_{1n}\\0&\mathbf{p_{2}}& &u_{2n}\\\vdots& &\ddots&\vdots\\0&\ldots&\ldots&\mathbf{p_{n}}\end{pmatrix}
The only task left is to find a lower triangular matrix \mathbf{L} such that the product \mathbf{L}\mathbf{U} equals \mathbf{A}. The matrix \mathbf{L} is a record of the row operations that took place during Gaussian elimination. We now discuss the construction of an \mathbf{L}. We emphasize “an” here because this is only one way to decompose \mathbf{A} into \mathbf{L} and \mathbf{U}; there are other methods and infinitely many valid pairs of \mathbf{L} and \mathbf{U} matrices, i.e., they are not unique.
3.1 A Simple \mathbf{LU} Factorization
This section considers the case when no rows are exchanged to find a pivot (for convenience, let us call such matrices regular). To find \mathbf{L} in this case, we simply perform the inverses of the same row operations on the identity matrix, in reverse order. For example, when reducing \mathbf{A}\rightarrow \mathbf{U}, if the last operation performed was the row operation R_i - l_{ij}R_j\rightarrow R_i, then to undo it and work back toward \mathbf{A} we would first perform the following row operation on the identity matrix (with the same dimensions):
R_i + l_{ij}R_j\rightarrow R_i
Example 3.1 We will look at a simple example and write out the row operations. \mathbf{A} = \begin{bmatrix}1&1&0\\2&1&-1\\3&-1&-1\end{bmatrix} \xrightarrow{E_2-2E_1} \begin{bmatrix}1&1&0\\0&-1&-1\\3&-1&-1\end{bmatrix} \xrightarrow{E_3-3E_1} \begin{bmatrix}1&1&0\\0&-1&-1\\0&-4&-1\end{bmatrix} \xrightarrow{E_3-4E_2} \begin{bmatrix}1&1&0\\0&-1&-1\\0&0&3\end{bmatrix} = \mathbf{U}
\mathbf{I} = \begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix} \xrightarrow{E_3+4E_2} \begin{bmatrix}1&0&0\\0&1&0\\0&4&1\end{bmatrix} \xrightarrow{E_3+3E_1} \begin{bmatrix}1&0&0\\0&1&0\\3&4&1\end{bmatrix} \xrightarrow{E_2+2E_1} \begin{bmatrix}1&0&0\\2&1&0\\3&4&1\end{bmatrix} = \mathbf{L}
Exercise 3.2 Check if \mathbf{A} equals \mathbf{L}\mathbf{U} in the above example, either manually or using numpy. Note that @ performs matrix multiplication in numpy.
Exercise 3.3 For fun, let’s factorize the following matrix. \mathbf{A} = \begin{bmatrix}3&1&1\\1&3&1\\1&1&3\end{bmatrix}.
3.1.1 Pseudocode
You might have already observed that there is a much simpler description of \mathbf{L} in terms of the multipliers l_{ij}’s used during Gaussian elimination.
The diagonal entries in \mathbf{L} are all 1’s.
The entries below the diagonal are simply the multipliers l_{ij}’s. \mathbf{L} = \begin{pmatrix}1&0&\ldots&0\\l_{21}&1& &0\\\vdots& &\ddots&\vdots\\l_{n1}&\ldots&\ldots&1\end{pmatrix}.
The pseudocode, SimpleLU (Algorithm 1), performs Gaussian elimination on \mathbf{A} while recording the multipliers l_{ij} in \mathbf{L}.
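Since the algorithm is just Gaussian elimination with this extra bookkeeping, a minimal Python / NumPy sketch of SimpleLU could look as follows (the function name and error handling are our own choices, not part of Algorithm 1):

```python
import numpy as np

def simple_lu(A):
    """LU factorization of a square, regular matrix: no row exchanges,
    multipliers l_ij recorded in L, eliminated matrix returned as U."""
    n = A.shape[0]
    U = A.astype(float)
    L = np.eye(n)
    for j in range(n):
        if U[j, j] == 0:
            raise ValueError("zero pivot encountered: matrix is not regular")
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]    # multiplier l_ij
            U[i, :] -= L[i, j] * U[j, :]   # row operation R_i - l_ij R_j -> R_i
    return L, U
```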
3.1.2 Implementation
Exercise 3.4 Implement SimpleLU (Algorithm 1) in Python / Numpy.
A matrix A (not necessarily square) is regular iff it can be factored as A=LU, where L is a lower triangular matrix with all diagonal entries 1 and U is an upper triangular matrix with all diagonal entries non-zero (these are the pivots of A). The non-zero off-diagonal entries in L (l_{ij} for i>j) are a record of the elementary row operations that transform A to U.
In fact, A can be further factored as LDV, where L is a lower triangular matrix with all diagonal entries 1, D is a diagonal matrix with all diagonal entries non-zero, and V is an upper triangular matrix with all diagonal entries 1.
Proof. We have already seen that for regular matrices there is a finite sequence of elementary matrices E_i such that E_k\cdots E_2E_1A is an upper triangular matrix, which is our U. Set L={E_1}^{-1}{E_2}^{-1}\cdots {E_k}^{-1} and note that it is lower triangular with all diagonal entries 1 (why?). Now check that LU=A.
How do we get the factorization LDV?
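One natural route: since the diagonal entries of U are precisely the non-zero pivots p_1, \ldots, p_n, we can factor them out of U,

D = \operatorname{diag}(p_1, \ldots, p_n), \qquad V = D^{-1}U.

Then D is diagonal with non-zero diagonal entries, V is upper triangular with all diagonal entries 1 (each row of U is divided by its own pivot), and A = LU = LDV.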
Exercise 3.5 For a given regular matrix A, is the above LU (and LDV) factorization unique?
Exercise 3.6 Prove that a symmetric matrix is regular iff it can be factored as A=LDL^T, where L is a lower triangular matrix with all diagonal entries 1 and D is a diagonal matrix with all diagonal entries non-zero.
3.2 A Simple \mathbf{PLU} (Permuted \mathbf{LU}) Factorization
Our simple \mathbf{L}\mathbf{U} factorization scheme halts as soon as a_{ii}=0 is reached for some i. In the last chapter, we saw a quick remedy: exchange the ith row with the first row below it having a non-zero value in the ith column. Unfortunately, the record of such row permutations cannot be stored in \mathbf{L}; it would no longer remain lower triangular!
For this reason, we keep track of the row exchanges in a permutation matrix \mathbf{P}, created by performing the same row exchanges on an identity matrix.
This gives a matrix with precisely one non-zero entry, namely a 1, in each row and in each column.
\mathbf{P} is a binary, square matrix with the same number of rows as \mathbf{A}: \mathbf{P} = \begin{bmatrix}p_{11}&p_{12}&\ldots&p_{1n}\\p_{21}&\ddots& &p_{2n}\\\vdots& &\ddots&\vdots\\p_{n1}&\ldots&\ldots&p_{nn}\end{bmatrix} such that p_{ij}=1 if and only if the accumulated exchanges of Gaussian elimination moved the original jth row of \mathbf{A} into the ith position (equivalently, the ith row of \mathbf{PA} is the jth row of \mathbf{A}).
Gaussian elimination with these row exchanges then produces the factorization of the permuted matrix \mathbf{PA}, i.e., \mathbf{PA}=\mathbf{LU}. We will need \mathbf{P} later, when we use the \mathbf{LU} factorization to solve the permuted system of equations, \mathbf{PA}\vec{x} = \mathbf{P}\vec{b}.
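As a tiny illustration (the matrix here is made up), performing a row exchange on the identity builds \mathbf{P}, and left-multiplying by \mathbf{P} applies the same exchange to \mathbf{A}:

```python
import numpy as np

# Hypothetical example: suppose elimination exchanged rows 0 and 1.
A = np.array([[0., 1., 2.],
              [3., 4., 5.],
              [6., 7., 8.]])

P = np.eye(3)
P[[0, 1]] = P[[1, 0]]   # perform the same exchange on the identity
print(P @ A)            # rows 0 and 1 of A are now exchanged
```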
3.2.1 Pseudocode
The pseudocode, SimplePLU (Algorithm 2), extends SimpleLU by exchanging rows whenever a zero pivot is encountered and recording each exchange in \mathbf{P}.
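In the same spirit as the SimpleLU sketch above, a minimal Python / NumPy sketch of SimplePLU might be (again, the naming and error handling are our own choices):

```python
import numpy as np

def simple_plu(A):
    """PLU factorization: exchange the current row with the first row
    below it having a non-zero entry in the pivot column, recording the
    exchanges in P, so that P @ A = L @ U."""
    n = A.shape[0]
    U = A.astype(float)
    L = np.eye(n)
    P = np.eye(n)
    for j in range(n):
        if U[j, j] == 0:
            nz = np.nonzero(U[j:, j])[0]     # rows below with a usable pivot
            if nz.size == 0:
                raise ValueError("no non-zero pivot: matrix is singular")
            k = j + nz[0]
            U[[j, k]] = U[[k, j]]            # exchange rows j and k ...
            P[[j, k]] = P[[k, j]]            # ... mirror the exchange in P ...
            L[[j, k], :j] = L[[k, j], :j]    # ... and in the finished part of L
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]      # multiplier l_ij
            U[i, :] -= L[i, j] * U[j, :]     # R_i - l_ij R_j -> R_i
    return P, L, U
```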
3.2.2 Implementation
Exercise 3.7 Implement SimplePLU (Algorithm 2) in Python / Numpy.
The SciPy function scipy.linalg.lu performs a \mathbf{PLU} decomposition. However, we can’t compare our implementation to SciPy’s in general: SciPy uses partial pivoting (it exchanges rows so that each pivot has the largest available magnitude, for numerical stability), which can result in a different (but still correct) decomposition.
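For example, as a small sanity check on the matrix from Example 3.1 (note SciPy’s convention: it returns \mathbf{P} such that \mathbf{A} = \mathbf{P}\mathbf{L}\mathbf{U}, i.e., its \mathbf{P} is the inverse, or transpose, of the \mathbf{P} in our \mathbf{PA}=\mathbf{LU} convention):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[1., 1., 0.],
              [2., 1., -1.],
              [3., -1., -1.]])

P, L, U = lu(A)   # partial pivoting: may differ from our hand computation
print(np.allclose(A, P @ L @ U))   # True
```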
A square matrix A is invertible iff there is a permutation matrix P such that PA admits an LU factorization, i.e., PA=LU, where L is a lower triangular matrix with all diagonal entries 1 and U is an upper triangular matrix with all diagonal entries non-zero (these are the pivots).
In fact, PA can be further factored as LDV, where L is a lower triangular matrix with all diagonal entries 1, D is a diagonal matrix with all diagonal entries non-zero, and V is an upper triangular matrix with all diagonal entries 1.
Proof. Adjust the proof of the regular case (see above) to write down a proof of the theorem.
Exercise 3.8 For a given invertible matrix A, is the above factorization PA=LU (and PA=LDV) unique?
3.3 Solving Systems using \mathbf{PLU} Factorization
Once we have \mathbf{P}, \mathbf{L}, and \mathbf{U}, we can solve for as many right-hand-side vectors \vec{b} as desired very quickly, using the following two-step process. First we let \vec{y}=\mathbf{U}\vec{x} and solve \mathbf{L}\vec{y}=\mathbf{P}\vec{b} for \vec{y} by forward substitution. Then we solve \mathbf{U}\vec{x}=\vec{y} for \vec{x} by back substitution; a sketch follows below.
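Here is a minimal sketch of this two-step solve, assuming \mathbf{P}, \mathbf{L}, \mathbf{U} follow our \mathbf{PA}=\mathbf{LU} convention (e.g., as returned by the SimplePLU sketch above):

```python
import numpy as np

def plu_solve(P, L, U, b):
    """Solve A x = b given P A = L U: forward substitution for y in
    L y = P b, then back substitution for x in U x = y."""
    n = L.shape[0]
    pb = P @ b
    y = np.zeros(n)
    for i in range(n):                       # forward substitution
        y[i] = pb[i] - L[i, :i] @ y[:i]      # L has 1's on the diagonal
    x = np.zeros(n)
    for i in reversed(range(n)):             # back substitution
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```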