2  Systems of Linear Equations

“It is not knowledge, but the act of learning, not the possession, but the act of getting there, which grants the greatest enjoyment.”
Carl Friedrich Gauss

In this chapter, we study systems of linear equations and algorithms to solve them. Solving linear equations is not only central to linear algebra but also to many applied fields of science. We discuss a classical method, known as Gaussian elimination, whose variants can be traced back to antiquity; the method was largely standardized in modern mathematical notation by Gauss around 1810.

Gaussian elimination is among the most widely studied and taught algorithms in linear algebra across the globe. Among its myriad applications, we will particularly focus on solving square linear systems and computing matrix inverses.

2.1 Introduction

A system of linear equations is a collection of n linear equations in m unknowns. For example, take the following system:

\begin{aligned} x_1 - x_2 + 2x_3 &= 5 \\ 2x_1 + x_2 - x_3 &= 2 \\ -x_1 + 4x_2 + x_3 &= 3 \end{aligned} \tag{2.1} Here, we consider only square systems, i.e., m=n.

The system has three equations in three unknowns: x_1, x_2, and x_3.

2.1.1 Solutions

A vector \begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix} is called a solution to the system if it solves all the equations in the system simultaneously.

Writing a solution as a column vector is only a convention.

For example, the vector \begin{bmatrix}5/3\\ 2/3\\ 2\end{bmatrix} is a solution to Equation 2.1.

Existence of Solutions

A system of equations may not always have a solution. Even when a solution exists, it may not be unique. A system may even have infinitely many solutions! A system is called singular if there is either no solution or there are infinitely many solutions.

  1. No Solution: The system has no solutions. \begin{aligned} x_1 - x_2 &= 5 \\ 2x_1 - 2x_2 &= 2 \\ \end{aligned}

  2. Non-unique Solutions: The system has infinitely many solutions. The vector \begin{bmatrix}c +5 \\ c\end{bmatrix} is a solution for any c\in\mathbb R. \begin{aligned} x_1 - x_2 &= 5 \\ 2x_1 - 2x_2 &= 10 \\ \end{aligned}

2.1.2 Matrix Notation

A generic system of linear equations in n unknowns can be written as follows:

\begin{aligned} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\ \vdots\quad\quad\quad\quad & \\ a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= b_n \\ \end{aligned} \tag{2.2}

Here, the unknowns form a column vector \vec{x}=\begin{bmatrix}x_1 \\ x_2\\\vdots\\x_n\end{bmatrix}. The right-hand side also forms a vector \vec{b}=\begin{bmatrix}b_1 \\ b_2\\\vdots\\ b_n\end{bmatrix}. The coefficients of the unknowns form the coefficient matrix \mathbf{A}=\begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \ldots & a_{nn}\end{bmatrix}.

Boldface is used to denote a matrix, and an arrow over a symbol denotes a vector.

Then, matrix multiplication justifies writing Equation 2.2 as the following matrix equation: \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \ldots & a_{nn}\end{bmatrix}_{n\times n} \begin{bmatrix}x_1 \\ x_2\\\vdots\\x_n\end{bmatrix}_{n\times1} =\begin{bmatrix}b_1 \\ b_2\\\vdots\\ b_n\end{bmatrix}_{n\times1}, or \mathbf{A}\vec{x}=\vec{b}.
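
To make the notation concrete, here is a short NumPy sketch (assuming NumPy is available) that writes Equation 2.1 in the form \mathbf{A}\vec{x}=\vec{b} and checks the solution quoted in Section 2.1.1:

```python
import numpy as np

# Coefficient matrix and right-hand side of Equation 2.1
A = np.array([[ 1., -1.,  2.],
              [ 2.,  1., -1.],
              [-1.,  4.,  1.]])
b = np.array([5., 2., 3.])

# The solution claimed in Section 2.1.1
x = np.array([5/3, 2/3, 2.])

print(A @ x)                    # reproduces b
print(np.allclose(A @ x, b))    # True
```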

Exercise 2.1 Write the following system in matrix notation. \begin{aligned} 2x_1 + x_2 + x_3 &= 5 \\ 4x_1 - 6x_2 &= -2 \\ -2x_1 + 7x_2 + 2x_3 &= 9. \end{aligned}

2.2 Gaussian Elimination

We first motivate Gaussian elimination through a simple example system, and then formalize the method using matrix notation and row operations.

Let us consider the following linear system. \begin{aligned} x_1 - x_2 + 2x_3 &= 5 \\ 2x_1 + x_2 - x_3 &= 2 \\ -x_1 + 4x_2 + x_3 &= 3. \end{aligned} \tag{2.3} In order to solve for the unknowns, we try to eliminate:

  1. x_1 from the second equation by subtracting from it 2 times the first equation to get: \begin{aligned} x_1 - x_2 + 2x_3 &= 5 \\ 3x_2 - 5x_3 &= -8 \\ -x_1 + 4x_2 + x_3 &= 3. \end{aligned}

  2. x_1 from the third equation by subtracting from it -1 times the first equation to get: \begin{aligned} x_1 - x_2 + 2x_3 &= 5 \\ 3x_2 - 5x_3 &= -8 \\ 3x_2 + 3x_3 &= 8. \end{aligned}

  3. x_2 from the third equation by subtracting from it 1 times the second equation to get: \begin{aligned} x_1 - x_2 + 2x_3 &= 5 \\ 3x_2 - 5x_3 &= -8 \\ 8x_3 &= 16. \end{aligned} \tag{2.4} Since we have been performing the same operations on the right-hand sides of the equations at each step, the final system Equation 2.4 is equivalent to the original system Equation 2.3 we started with. The original system was coupled, whereas the new equivalent system is (upper) triangular. Steps (1–3) above constitute the first half of Gaussian elimination, known as forward elimination. The second half of the story is called back substitution, where the values of x_3, x_2, and x_1 can easily be peeled off in steps (4–6) below.

  4. the third equation yields x_3=\tfrac{16}{8}=2.

  5. using the value of x_3 in the second equation, we get x_2=\tfrac{1}{3}[-8+5x_3]=\tfrac{1}{3}[-8+5\times2]=\tfrac{2}{3}.

  6. finally, using the values of x_2 and x_3 in the first equation, we get x_1=5+x_2-2x_3=5+\tfrac{2}{3}-2\times2=\tfrac{5}{3}.

You probably see the pattern already! In order to solve a system, we first perform forward elimination to turn the original system into a triangular one, and then use back substitution to solve for the unknowns.

We now make these steps more organized by putting the equations (coefficients and right-hand side) into a giant matrix, called the augmented matrix. Then, we repeatedly perform row operations on it.

2.2.1 The Augmented Matrix and Row Operations

Instead of manipulating the systems in each step, we could just extract the numeric coefficients from both sides of the original system Equation 2.3 to form the following augmented matrix:

\mathbf{A}_\text{aug}= \begin{bmatrix} \begin{array}{c:c} \begin{matrix} 1 & -1 & 2\\ 2 & 1 & -1\\ -1 & 4 & 1\\ \end{matrix} & \begin{matrix}5 \\ 2 \\ 3\end{matrix} \end{array} \end{bmatrix}_{3\times4}.

The last column is the right-hand side of the system.

More generally for a system \mathbf{A}\vec{x}=\vec{b}, the augmented matrix is the matrix \mathbf{A}_\text{aug}=[\mathbf{A}\ |\ \vec{b}]_{n\times(n+1)}.

The forward elimination process works by repeatedly choosing a pivot and performing row substitutions to make the entries of \mathbf{A}_\text{aug} below the pivots all zero.

For our example system Equation 2.3, the first pivot is 1. The following is the result of applying elimination steps (1) and (2) using the pivot.

\begin{bmatrix} \begin{array}{c:c} \begin{matrix} \pmb{1} & -1 &2\\ 2 & 1 & -1\\ -1 & 4 & 1\\ \end{matrix} & \begin{matrix}5\\ 2\\ 3\end{matrix} \\ \end{array} \end{bmatrix} \xrightarrow[R_3\mapsto R_3-(-1)R_1]{R_2\mapsto R_2-2R_1} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \pmb{1} & -1 & 2\\ 0 & 3 & -5\\ 0 & 3 & 3\\ \end{matrix} & \begin{matrix}5\\ -8\\ 8\end{matrix} \\ \end{array} \end{bmatrix}.

The pivots in each row have been boldfaced.

Next, the second pivot is 3. Using the pivot, we apply elimination step (3) now. \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \pmb{1} & -1 & 2\\ 0 & \pmb{3} & -5\\ 0 & 3 & 3\\ \end{matrix} & \begin{matrix}5\\ -8\\ 8\end{matrix} \\ \end{array} \end{bmatrix} \xrightarrow{R_3\mapsto R_3-R_2} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \pmb{1} & -1 & 2\\ 0 & \pmb{3} & -5\\ 0 & 0 & \pmb{8}\\ \end{matrix} & \begin{matrix}5\\ -8\\ 16\end{matrix} \\ \end{array} \end{bmatrix} . Now, we have reached a triangular system, and the last pivot is 8. So, the pivots are 1, 3, and 8. The pivots are, indeed, very important byproducts of the elimination process; as we will see later, their product equals the determinant of the coefficient matrix \mathbf{A}. The next step, back substitution, can now be easily performed to retrieve the solution.
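
Here is a small NumPy sketch (an illustration of the steps above, not a general routine) that carries out the same three row operations on the augmented matrix and reads off the pivots:

```python
import numpy as np

# Augmented matrix [A | b] for Equation 2.3
A_aug = np.array([[ 1., -1.,  2.,  5.],
                  [ 2.,  1., -1.,  2.],
                  [-1.,  4.,  1.,  3.]])

A_aug[1] = A_aug[1] - 2 * A_aug[0]   # step (1): R2 -> R2 - 2 R1
A_aug[2] = A_aug[2] + A_aug[0]       # step (2): R3 -> R3 - (-1) R1
A_aug[2] = A_aug[2] - A_aug[1]       # step (3): R3 -> R3 - R2

print(A_aug)                         # row echelon form
print(np.diag(A_aug[:, :3]))         # pivots: [1. 3. 8.]; their product is 24 = det(A)
```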

Row Echelon Form

The final form of the augmented matrix satisfies the following properties:

  1. All non-zero rows are above any rows of all zeros.
  2. The leading (non-zero) element of each non-zero row is called its pivot.
  3. Each pivot is in a column to the right of the pivot of the row above it.
  4. The entries below each pivot are all zero.

This particular form of a matrix is known as the row echelon form.

Exercise 2.2 Solve the following system of linear equations using matrix notation and row substitution. List the pivots. \begin{aligned} 2x_1 + x_2 + x_3 &= 5 \\ 4x_1 - 6x_2 &= -2 \\ -2x_1 + 7x_2 + 2x_3 &= 9. \end{aligned}

Finding Pivots is the Game

By definition, pivots are non-zero. The success of Gaussian elimination hinges on finding a full set of (n many) pivots. If the algorithm gets stuck due to the unavailability of a pivot in a row, we are allowed to exchange that row with any row below it. So, the allowed elementary row operations for forward elimination are:

  1. Substitute the ith row with itself minus a multiple of the jth (pivot) row
  2. Exchange the ith and jth rows, if needed

Exercise 2.3 (Exchange Rows If Needed) Solve the following system of linear equations using matrix notation and row substitution. List the pivots. \begin{aligned} x_1 + x_2 + x_3 &= 5 \\ 2x_1 + 2x_2 + 5x_3 &= -2 \\ 4x_1 + 6x_2 + 8x_3 &= 9. \end{aligned}

As with any algorithm, Gaussian elimination may not always succeed, i.e., it may fail to find a complete set of pivots. In such cases, the algorithm must stop prematurely without producing any solution.

Exercise 2.4 (No Pivot!) Solve the following system of linear equations using matrix notation and row substitution. List the pivots. \begin{aligned} x_1 + x_2 + x_3 &= 5 \\ 2x_1 + 2x_2 + 5x_3 &= -2 \\ 4x_1 + 4x_2 + 8x_3 &= 9. \end{aligned}

2.2.2 Algorithm

We now put together all that we have learned to write a pseudocode for the Gaussian elimination method.

\begin{algorithm} \caption{Gaussian Elimination} \begin{algorithmic} \Procedure{GaussSolve}{$A[1:n, 1:n], b[1:n, 1]$} \State{▷ Step I: Concatenate $A$ and $b$} \State{$A_\text{aug}=[A\ |\ b]$} \State{▷ Step II: Forward Elimination} \For{$i=1$ to $n$} \State{▷ Find pivot $p_i$} \For{$j=i$ to $n$} \If{$A_\text{aug}[j, i]\neq0$} \State{Exchange $R_i$ and $R_j$ of $A_\text{aug}$} \State{$\textbf{break}$} \EndIf \EndFor \State{$p_i = A_\text{aug}[i, i]$} \If{$p_i=0$}\Return{$\text{\color{red}The system is not solvable!}$}\EndIf \State{▷ Reduce the rows below} \For{$j=i+1$ to $n$} \State{$R_j\mapsto R_j-(A_\text{aug}[j,i]/A_\text{aug}[i,i])\times R_i$} \EndFor \EndFor \State{▷ Step III: Back Substitution} \For{$i=n$ down to $1$} \State{$x_i=\left(A_\text{aug}[i,n+1]-\sum_{j=i+1}^nA_\text{aug}[i,j]x_{j}\right)/A_\text{aug}[i,i]$} \EndFor \Return{solution $(x_1,x_2,\ldots,x_n)$ and pivots $(p_1,p_2,\ldots,p_n)$} \EndProcedure \end{algorithmic} \end{algorithm}

2.2.3 Computational Complexity

It is not hard to see that the time complexity of solving a system of n equations in n unknowns is \mathcal O(n^3) in the worst case. Indeed, for each i\in\{1,2,\ldots,n\}, the number of arithmetic operations is roughly i^2-i. Therefore, the total number of operations is T(n)=\sum_{i=1}^n(i^2-i)=\frac{n(n+1)(2n+1)}{6}-\frac{n(n+1)}{2}=\frac{n^3-n}{3}.
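
As a quick sanity check (an illustrative snippet, not part of the algorithm), the closed form can be compared against the direct sum for a few values of n:

```python
# Check that sum_{i=1}^{n} (i^2 - i) equals (n^3 - n)/3
for n in (3, 10, 100):
    direct = sum(i**2 - i for i in range(1, n + 1))
    closed = (n**3 - n) // 3          # exact: n^3 - n is divisible by 3
    print(n, direct, closed, direct == closed)
```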

2.2.4 Python Implementation
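
Below is one possible NumPy implementation of GaussSolve, following the pseudocode above. It is a minimal sketch (no partial pivoting yet; see Section 2.5), and the function name gauss_solve is my own choice.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with back substitution.

    Returns the solution vector and the list of pivots; raises ValueError
    if a complete set of pivots cannot be found.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[0]
    aug = np.hstack([A, b])                 # Step I: augmented matrix [A | b]
    pivots = []

    # Step II: forward elimination
    for i in range(n):
        # find a row (at or below i) with a non-zero entry in column i
        for j in range(i, n):
            if aug[j, i] != 0:
                aug[[i, j]] = aug[[j, i]]   # exchange rows i and j
                break
        if aug[i, i] == 0:
            raise ValueError("The system is not solvable by this method!")
        pivots.append(aug[i, i])
        # reduce the rows below the pivot
        for j in range(i + 1, n):
            aug[j] -= (aug[j, i] / aug[i, i]) * aug[i]

    # Step III: back substitution
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i, n] - aug[i, i + 1:n] @ x[i + 1:]) / aug[i, i]
    return x, pivots

# Example: the system in Equation 2.3
x, pivots = gauss_solve([[1, -1, 2], [2, 1, -1], [-1, 4, 1]], [5, 2, 3])
print(x)        # approximately [5/3, 2/3, 2]
print(pivots)   # [1.0, 3.0, 8.0]
```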

2.3 Gauss–Jordan Elimination

We now modify the Gaussian elimination method, adding more steps so that the entries above the pivots also become zero. This is called the Gauss–Jordan method. The form of the final matrix is known as the reduced row echelon form: the pivots are 1, and the entries above and below the pivots are all eliminated. In addition to row substitution and row exchange, dividing a row by a non-zero number is also an allowed elementary row operation.

The method follows the following steps to solve the system \mathbf{A}\vec{x}=\vec{b}:

  1. Augment A by concatenating b to get \mathbf{A}_\text{aug}=[\mathbf{A}\ |\ \vec{b}],
  2. Perform Gaussian forward elimination on \mathbf{A}_\text{aug},
  3. Perform elimination again to eliminate the entries above the pivots,
  4. divide rows by the corresponding pivots to make the principal diagonal entries 1.

The Gauss–Jordan method provides an alternative approach to solving a linear system. As we will see, the method can also be used to find the inverse of a square matrix \mathbf{A}, provided \mathbf{A} is invertible.

2.3.1 Solving Linear Systems

As discussed, the resulting system is diagonal. Consequently, the solution automatically falls out, without performing back substitution.

Gauss Elimination is Better!

Although the method saves back substitution, making the augmented system fully reduced comes at the cost of performing more arithmetic operations.

Example 2.1 Let us consider the following system: \begin{aligned} 2x_1 -x_2 &= 1 \\ -x_1 + 2x_2 - x_3 &= 2 \\ -x_2 + 2x_3 &= 1. \end{aligned} The augmented matrix is \mathbf{A}_\text{aug}= \begin{bmatrix} \begin{array}{c:c} \begin{matrix}2 & -1& 0\\-1 & 2 & -1\\0 & -1 & 2 \end{matrix} & \begin{matrix}1\\2\\1\end{matrix} \end{array} \end{bmatrix}. We first perform Gauss elimination: \begin{bmatrix} \begin{array}{c:c} \begin{matrix} 2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2\\ \end{matrix} & \begin{matrix}1\\2\\1\end{matrix} \end{array} \end{bmatrix} \xrightarrow{\quad\text{Gauss}\quad} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \bm{2} & -1 & 0\\ 0 & \bm{\frac{3}{2}} & -1\\ 0 & 0 & \bm{\frac{4}{3}}\\ \end{matrix} & \begin{matrix}1\\\frac{5}{2}\\\frac{8}{3}\end{matrix} \end{array} \end{bmatrix}. Now, we do another round of elimination for the entries above the pivots. \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \bm{2} & -1 & 0\\ 0 & \bm{\frac{3}{2}} & -1\\ 0 & 0 & \bm{\frac{4}{3}}\\ \end{matrix} & \begin{matrix}1\\\frac{5}{2}\\\frac{8}{3}\end{matrix} \end{array} \end{bmatrix} \xrightarrow{\quad\text{Jordan}\quad} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \bm{2} & 0 & 0\\ 0 & \bm{\frac{3}{2}} & 0\\ 0 & 0 & \bm{\frac{4}{3}}\\ \end{matrix} & \begin{matrix}4\\\frac{9}{2}\\\frac{8}{3}\end{matrix} \end{array} \end{bmatrix}. Finally, divide by the pivots to get unit along the diagonal. \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \bm{2} & 0 & 0\\ 0 & \bm{\frac{3}{2}} & 0\\ 0 & 0 & \bm{\frac{4}{3}}\\ \end{matrix} & \begin{matrix}4\\\frac{9}{2}\\\frac{8}{3}\end{matrix} \end{array} \end{bmatrix} \xrightarrow{\quad\text{Divide}\quad} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} \bm{1} & 0 & 0\\ 0 & \bm{1} & 0\\ 0 & 0 & \bm{1}\\ \end{matrix} & \begin{matrix}2\\3\\2\end{matrix} \end{array} \end{bmatrix}. So, the solution is: x_1=2, x_2=3, x_3=2.

2.3.2 Algorithm

Here is a pseudocode.

\begin{algorithm} \caption{Gauss–Jordan Elimination} \begin{algorithmic} \Procedure{GaussJordanSolve}{$A[0:n-1, 0:n-1], b[0:n-1]$} \State{▷ Step I: Concatenate $A$ and $b$} \State{$A=[A\ |\ b]$} \State{▷ Step II: Elimination} \For{$i=0$ to $n-1$} \State{▷ Find pivot $p$} \For{$j=i$ to $n-1$} \If{$A[j, i]\neq0$} \State{Exchange $R_i$ and $R_j$} \State{$\textbf{break}$} \EndIf \EndFor \State{$p = A[i, i]$} \If{$p=0$}\Return{$\text{\color{red}The system is not solvable!}$}\EndIf \State{▷ Divide by pivot} \State{$R_i\mapsto \frac{1}{p}R_i$} \State{▷ Reduce the rows below and above} \For{$j\neq i$} \State{$R_j\mapsto R_j-A[j,i]\times R_i$} \EndFor \EndFor \State{▷ Step III: solve for $x$} \For{$i=0$ to $n-1$} \State{$x_i=A[i,n]$} \EndFor \Return{$(x_0,x_1,\ldots,x_{n-1})$} \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.5 Discuss pros and cons of using back substitution over Gauss–Jordan elimination for solving a system.

2.3.3 Calculating the Inverse

An important application of Gauss–Jordan elimination is computing the inverse of a matrix.

Inverse using Gauss–Jordan

The method follows the following steps:

  1. Augment \mathbf{A} by concatenating the identity matrix \mathbf{I}_{n\times n} to get \mathbf{A}_\text{aug}=[\mathbf{A}\ |\ \mathbf{I}],
  2. Perform Gaussian forward elimination on \mathbf{A}_\text{aug},
  3. Perform elimination again to eliminate the entries above the pivots,
  4. divide rows by the corresponding pivots to make the principal diagonal entries 1,
  5. return the first part of \mathbf{A}_\text{aug} as \mathbf{A}^{-1}, since the resulting augmented matrix becomes \left[\mathbf{I}\ |\ \mathbf{A}^{-1}\right] at this point.

Example 2.2 Let A=\begin{bmatrix}2 & -1& 0\\-1 & 2 & -1\\0 & -1 & 2\end{bmatrix}. In order to find its inverse, we augment the matrix with the identity and perform Gauss elimination: \begin{bmatrix} 2 & -1 & 0 & 1 & 0 & 0\\ -1 & 2 & -1 & 0 & 1 & 0\\ 0 & -1 & 2 & 0 & 0 & 1\\ \end{bmatrix} \xrightarrow{\quad\text{Gauss}\quad} \begin{bmatrix} \bm{2} & -1 & 0 & 1 & 0 & 0\\ 0 & \bm{\frac{3}{2}} & -1 & \frac{1}{2} & 1 & 0\\ 0 & 0 & \bm{\frac{4}{3}} & \frac{1}{3} & \frac{2}{3} & 1\\ \end{bmatrix}. Now, we do another round of elimination for the entries above the pivots. \begin{bmatrix} \bm{2} & -1 & 0 & 1 & 0 & 0\\ 0 & \bm{\frac{3}{2}} & -1 & \frac{1}{2} & 1 & 0\\ 0 & 0 & \bm{\frac{4}{3}} & \frac{1}{3} & \frac{2}{3} & 1\\ \end{bmatrix} \xrightarrow{\quad\text{Jordan}\quad} \begin{bmatrix} \bm{2} & 0 & 0 & \frac{3}{2} & 1 & \frac{1}{2}\\ 0 & \bm{\frac{3}{2}} & 0 & \frac{3}{4} & \frac{3}{2} & \frac{3}{4}\\ 0 & 0 & \bm{\frac{4}{3}} & \frac{1}{3} & \frac{2}{3} & 1\\ \end{bmatrix}. Finally, divide by the pivots to get unit along the diagonal. \begin{bmatrix} \bm{2} & 0 & 0 & \frac{3}{2} & 1 & \frac{1}{2}\\ 0 & \bm{\frac{3}{2}} & 0 & \frac{3}{4} & \frac{3}{2} & \frac{3}{4}\\ 0 & 0 & \bm{\frac{4}{3}} & \frac{1}{3} & \frac{2}{3} & 1\\ \end{bmatrix} \xrightarrow{\quad\text{Divide}\quad} \begin{bmatrix} \bm{1} & 0 & 0 & \frac{3}{4} & \frac{1}{2} & \frac{1}{4}\\ 0 & \bm{1} & 0 & \frac{1}{2} & 1 & \frac{1}{2}\\ 0 & 0 & \bm{1} & \frac{1}{4} & \frac{1}{2} & \frac{3}{4}\\ \end{bmatrix} . So, A^{-1}=\begin{bmatrix} \frac{3}{4} & \frac{1}{2} & \frac{1}{4}\\ \frac{1}{2} & 1 & \frac{1}{2}\\ \frac{1}{4} & \frac{1}{2} & \frac{3}{4}\\ \end{bmatrix}.

Exercise 2.6 Justify why the above process must produce the inverse of A.

2.3.4 Implementation
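
Below is one possible NumPy sketch of the Gauss–Jordan procedure applied to computing an inverse, following the steps in Section 2.3.3; the function name gauss_jordan_inverse is my own choice.

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Compute the inverse of A by Gauss-Jordan elimination on [A | I].

    Raises ValueError if a pivot cannot be found, i.e., A is singular.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])       # augment with the identity

    for i in range(n):
        # find a usable pivot at or below row i, exchanging rows if needed
        for j in range(i, n):
            if aug[j, i] != 0:
                aug[[i, j]] = aug[[j, i]]
                break
        if aug[i, i] == 0:
            raise ValueError("Matrix is not invertible!")
        aug[i] /= aug[i, i]               # scale the row so the pivot becomes 1
        # eliminate the entries above and below the pivot
        for j in range(n):
            if j != i:
                aug[j] -= aug[j, i] * aug[i]

    return aug[:, n:]                     # the right block is A^{-1}

# Example 2.2 revisited
print(gauss_jordan_inverse([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
# [[0.75 0.5  0.25]
#  [0.5  1.   0.5 ]
#  [0.25 0.5  0.75]]
```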

2.4 Encoding by Elementary Matrices

We can encode a single elementary row operation by “Elementary Matrices”. An elementary matrix is a matrix obtained by performing a single elementary row operation on the identity matrix.

Exercise 2.7 Describe the elementary row operations encoded by pre-multiplying by the elementary matrices E_i E_1=\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2\\ \end{bmatrix}, E_2=\begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1\\ \end{bmatrix}, E_3=\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -3 & 0 & 1\\ \end{bmatrix}

Elementary Matrices Trace Out Elimination and Inversion
  1. Performing an elementary row operation on an m \times n matrix A can be written as EA, where E is the m \times m matrix created by performing the same elementary row operation on I_m
  2. An n \times n matrix A is invertible iff it is row equivalent to I_n: the sequence of row operations that reduces A to I_n also converts I_n to A^{-1}

Proof. If the n \times n matrix A is invertible, then A\vec{x}=\vec{b} has a solution for every \vec{b}, so elimination must produce a pivot position in every row; since A is a square matrix, all those pivot positions must lie on the diagonal, which implies that the reduced row echelon form is I_n. Conversely, suppose that A is row equivalent to I_n. As each step of the row reduction process is realised by left multiplication by an elementary matrix, there are elementary matrices E_1, E_2, \cdots, E_k such that E_k\cdots E_2E_1A=I_n; equivalently, A=(E_k\cdots E_2E_1)^{-1}. Therefore, E_k\cdots E_2E_1 I_n = ((E_k\cdots E_2E_1)^{-1})^{-1} = A^{-1}.

Notes: Elementary matrices are invertible. (Why?)
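
As a quick illustration (a NumPy sketch using the coefficient matrix of Equation 2.3), the elementary matrix encoding R_2 \mapsto R_2 - 2R_1 reproduces that row operation when it pre-multiplies a matrix, and its inverse simply undoes the operation:

```python
import numpy as np

# E encodes R2 -> R2 - 2*R1 (obtained by performing that operation on I_3)
E = np.array([[ 1., 0., 0.],
              [-2., 1., 0.],
              [ 0., 0., 1.]])

A = np.array([[ 1., -1.,  2.],
              [ 2.,  1., -1.],
              [-1.,  4.,  1.]])

print(E @ A)               # same as subtracting 2*(row 1) from row 2 of A
print(np.linalg.inv(E))    # the inverse adds 2*(row 1) back: R2 -> R2 + 2*R1
```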

Exercise 2.8 Let E_1=\begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix}, E_2=\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -1 & 0 & 1\\ \end{bmatrix}, E_3=\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & \frac{1}{2} & 1\\ \end{bmatrix}

and let A=\begin{bmatrix} 1 & 2 & 1\\ 2 & 6 & 1\\ 1 & 1 & 4\\ \end{bmatrix}. Describe the elementary row operation encoded by premultiplying by each E_i, and find U=E_3E_2E_1A. Then express A as {E_1}^{-1}{E_2}^{-1}{E_3}^{-1}U.

2.5 Stability of Gaussian Elimination

We know in theory that finding a full set of pivots, possibly after row exchanges, guarantees a unique solution, and vice versa. However, Gaussian elimination may not always be computationally stable in practice. This is because computers don’t have infinite precision when dealing with floating-point values.

Let’s say a computer is capable of storing only up to three decimal places; in other words, the machine epsilon is \epsilon_m=0.0001, the largest positive number that the computer treats as zero. Every arithmetic operation is susceptible to roundoff errors, and as more and more operations are performed, the errors accumulate. For example, the following addition would result in losing the trailing digits 1 and 3: .256 + .00113 \to .257
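
In Python, for instance, the same loss of trailing digits can be mimicked by rounding the result to three decimal places:

```python
# Rounding the sum to three decimal places drops the trailing digits 1 and 3
print(round(0.256 + 0.00113, 3))   # 0.257
```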

Such roundoff errors contribute to the final error in the solution. As it turns out, the picture does not look too bad if the coefficient matrix \mathbf A is well-conditioned. Whereas, an ill-conditioned \mathbf{A} may suffer from computational instability particularly as n becomes large. Let’s take the following two examples.

Example 2.3 (Ill-Conditioned) Consider the following system. \begin{aligned} x_1 &+ \phantom{1.0001}x_2 = 2 \\ x_1 &+ 1.0001x_2 = 2 \end{aligned} Note that the determinant of the coefficient matrix is very close to 0. After a moment’s reflection, we see that the solution is: x_1 = 2, x_2 = 0. Now, if we change the right-hand side only slightly, we get the following system: \begin{aligned} x_1 &+ \phantom{1.0001}x_2 = 2 \\ x_1 &+ 1.0001x_2 = 2.0001 \end{aligned} and a completely different solution emerges: x_1 = 1, x_2 = 1. A change in the fifth digit of \vec{b} was amplified to a change in the first digit of the solution (Strang 2006). Even the most robust numerical method will have trouble with this sensitivity to small perturbations.

Strang, Gilbert. 2006. Linear Algebra and Its Applications.
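
The sensitivity in Example 2.3 is easy to reproduce numerically; here is a brief sketch using np.linalg.solve (standard NumPy) on the two systems above:

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 1.0001]])

# Two right-hand sides differing only in the fifth digit
b1 = np.array([2., 2.])
b2 = np.array([2., 2.0001])

print(np.linalg.solve(A, b1))   # approximately [2. 0.]
print(np.linalg.solve(A, b2))   # approximately [1. 1.]
```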

Example 2.4 (Well-Conditioned) Consider the following system. \begin{aligned} .0001x_1 &+ x_2 = 1 \\ x_1 &+ x_2 = 2 \end{aligned} Note that the determinant of the coefficient matrix is not close to 0.

The traditional Gaussian elimination would result in the following reduction:

\begin{bmatrix} \begin{array}{c:c} \begin{matrix} .0001 & 1\\1&1 \end{matrix} & \begin{matrix}1\\2\end{matrix} \end{array} \end{bmatrix} \xrightarrow{\text{Gauss}} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} .0001 & 1\\ 0 & -9999 \end{matrix} & \begin{matrix}1\\-9998\end{matrix} \end{array} \end{bmatrix} Here, the chosen pivots are .0001 and -9999. By inspection, we see that the correct x_2 = .9999, and we can back substitute into the first equation .0001x_1+.9999 = 1 to get the correct final solution: x_1 = 1, x_2 = .9999.

On the other hand, roundoff produces x_2=1, which is correct to three decimal places (assuming \epsilon_m=0.0001). Then, back substitution: .0001x_1+1 = 1 leads to the wrong solution: x_1 = 0, x_2 = 1. The incorrect solution is brought on by the choice of a small pivot: 0.0001.

2.5.1 Partial Pivoting

The solution to the previous example is exchanging rows to avoid choosing small pivots. Now let’s do the same computation with the same roundoff error, but this time we exchange the rows before proceeding with Gaussian elimination. The modification yields a larger pivot at each step.

\begin{bmatrix} \begin{array}{c:c} \begin{matrix} 1 & 1\\.0001&1 \end{matrix} & \begin{matrix}2\\1\end{matrix} \end{array} \end{bmatrix} \xrightarrow{\text{Gauss}} \begin{bmatrix} \begin{array}{c:c} \begin{matrix} 1 & 1\\ 0 & .999 \end{matrix} & \begin{matrix}2\\.999\end{matrix} \end{array} \end{bmatrix} After back substitution, we get the computed solution: x_1 = 1, x_2 = 1. This time, the solution differs from the exact solution (x_1 = 1, x_2 = .9999) only by the roundoff error of the computer.

Exchanging rows to obtain the largest possible pivot is called partial pivoting.

Full Pivoting

Exchanging both rows and columns to obtain the largest possible pivot is called full pivoting. Full pivoting results in the most stable algorithm, but it requires more computation and keeping track of column permutations. For most matrices, partial pivoting is sufficient for stability.

Exercise 2.9 (Partial Pivoting) Apply Gaussian elimination with partial pivoting for the following system of linear equations. List the pivots. \begin{aligned} x_1 + 4x_2 + 2x_3 &= -2 \\ -2x_1 - 8x_2 + 3x_3 &= 32 \\ x_2 + x_3 &= 1. \end{aligned}

Exercise 2.10 (Algorithm with Partial Pivoting) Modify GaussSolve to incorporate partial pivoting.

2.5.2 Hilbert Matrix

Even with partial or even full pivoting, some matrices are notoriously ill-conditioned and, in some sense, very close to being singular. A well-known example is the Hilbert matrix, shown below.

H_n=\begin{bmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \cdots & \frac{1}{n}\\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \cdots & \frac{1}{n+1}\\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \cdots & \frac{1}{n+2}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ \frac{1}{n} & \frac{1}{n+1} & \frac{1}{n+2} & \cdots & \frac{1}{2n-1}\\ \end{bmatrix}

It can be shown that H_n is non-singular (invertible) and that all the entries of its inverse are integers. However, solving a linear system with coefficient matrix H_n is numerically quite challenging. Below is a snippet where you can take a Hilbert matrix for some sizable n, construct the vector b = H_n x_0 from a randomly generated vector x_0, and then solve the equation H_n x = b to compare the computed x with x_0.
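
Here is one such snippet (a NumPy sketch; the size n = 12 and the random seed are arbitrary choices):

```python
import numpy as np

n = 12
# Hilbert matrix: H[i, j] = 1 / (i + j + 1) with 0-based indices
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(n)      # the "true" solution
b = H @ x0                       # right-hand side constructed from x0

x = np.linalg.solve(H, b)        # solve H x = b numerically
print(np.max(np.abs(x - x0)))    # the error can be surprisingly large
```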