5  Bases, Dimension and Orthogonality

“I hear and I forget. I see and I remember. I do and I understand.”
Confucius

5.1 Basis and Dimension

A basis for a vector space V is an ordered set of vectors having two properties:

  1. The vectors are linearly independent (not too many vectors).
  2. They span the entire space V (not too few vectors).

Exercise 5.1 Find two bases for \mathbb{R}^2. Do you see anything notable about the number of vectors in these bases?

Exercise 5.2 Is the following set of vectors a basis for \mathbb{R}^3: \left(\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}, \begin{bmatrix}0\\1\\1\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}\right)? If not, can you find a remedy?

Exercise 5.3 Is the following set of vectors a basis for \mathbb{R}^3: \left(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix}\right)? If not, can you find a remedy?

Theorem 5.1 (Dimension) Any two bases for a vector space V contain the same number of vectors. This number is called the dimension of the vector space.

There is something to justify in this definition of dimension: why must any two bases of a vector space contain the same number of vectors?

Exercise 5.4 Prove that the dimension of \mathbb{R}^3 is 3.

More generally, we have the following.

Dimension

The dimension of the vector space \mathbb{R}^n is n.

Rank and Nullity revisited

The dimension of the column space \mathcal{C}(\mathbf A) of a matrix \mathbf{A} is called its rank. The dimension of the null space \mathcal{N}(\mathbf A) of a matrix \mathbf{A} is called its nullity.

Rank - Nullity Theorem

For an m \times n matrix \mathbf{A}, we have \textbf{rank}(\mathbf{A}) + \textbf{nullity}(\mathbf{A}) = n.
Rephrasing, \dim(\mathcal{C}(\mathbf{A})) + \dim(\mathcal{N}(\mathbf{A})) = n.

You should now notice that an n \times n matrix is invertible (non-singular) if and only if its rank is n, which holds if and only if its nullity is 0.
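As a quick numerical illustration (not an algorithm from the text, just a sketch using NumPy's SVD on an arbitrarily chosen example matrix), we can compute the rank and a basis of the null space and confirm that their dimensions add up to n:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])                     # arbitrary 2x3 example

# The singular value decomposition reveals both subspaces: the number of
# nonzero singular values is the rank, and the remaining rows of Vt span
# the null space N(A).
U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s.max()
rank = int((s > tol).sum())                      # dim C(A)
null_basis = Vt[rank:].T                         # columns span N(A)
nullity = null_basis.shape[1]                    # dim N(A)

print(rank, nullity, rank + nullity == A.shape[1])   # 1 2 True
```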

Exercise 5.5 Design an algorithm to compute the rank of a rectangular matrix.

5.2 Orthogonality

While all bases of a vector space contain the same number of vectors, some bases are more useful than others. This brings about the concept of an orthonormal basis.

Inner-product and Length

The inner product or dot product of two vectors \mathbf{x}=\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} and \mathbf{y}=\begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} in \mathbb{R}^n is defined by \langle \mathbf{x},\mathbf{y}\rangle=\sum_{i=1}^n x_iy_i=x_1y_1+\ldots+x_ny_n. The norm (or length) of a vector, denoted \|\mathbf{x}\|, is defined by \|\mathbf{x}\|=\sqrt{\langle \mathbf{x},\mathbf{x}\rangle}=\sqrt{x_1^2+x_2^2+\ldots+x_n^2}.

Note that \langle \mathbf{x}, \mathbf{y}\rangle can also be written as \mathbf{x}^T\mathbf{y}.
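As a small sketch (with arbitrary vectors, using NumPy), the inner product and the norm can be computed directly from these formulas:

```python
import numpy as np

x = np.array([1., 2., -3.])
y = np.array([4., 0., 1.])

inner = x @ y                # <x, y> = x^T y = sum of x_i * y_i
norm = np.sqrt(x @ x)        # ||x|| = sqrt(<x, x>)

print(inner)                         # 1.0
print(norm, np.linalg.norm(x))       # both equal sqrt(14) ≈ 3.7417
```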

Exercise 5.6 Find the length of the vector \begin{bmatrix}1\\2\\-3\end{bmatrix}.

Definition 5.1 Two vectors \mathbf{x} and \mathbf{y} from a vector space V are called orthogonal if their inner-product is zero, i.e., \langle \mathbf{x}, \mathbf{y}\rangle=0.

Exercise 5.7 If a vector is orthogonal to itself, what can you say about it?

Exercise 5.8 If a set of non-zero vectors is mutually orthogonal, are the vectors always linearly independent?

Orthonormal

A set of vectors is called orthonormal if the vectors are

  1. of unit length, and
  2. mutually orthogonal.

A matrix with orthonormal columns will be called orthogonal, and denoted by \mathbf{Q}.

Theorem: Calculating with an orthonormal basis

Let \mathbf{u}_1,\mathbf{u}_2, \cdots, \mathbf{u}_n be an orthonormal basis for an inner-product space \mathbf{V}. Then we can write any \mathbf{v} \in \mathbf{V} as \mathbf{v}= c_1\mathbf{u}_1+c_2\mathbf{u}_2+ \cdots + c_n\mathbf{u}_n, where c_i= \langle \mathbf{v}, \mathbf{u}_i \rangle, and \|\mathbf{v}\| = \sqrt{c_1^2 +c_2^2+ \cdots+ c_n^2 }.
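A minimal numerical check of this theorem (the orthonormal basis and the vector below are arbitrary choices):

```python
import numpy as np

# An orthonormal basis of R^2: the standard basis rotated by 45 degrees.
u1 = np.array([1., 1.]) / np.sqrt(2)
u2 = np.array([-1., 1.]) / np.sqrt(2)

v = np.array([3., 5.])                 # arbitrary vector to expand

c1, c2 = v @ u1, v @ u2                # c_i = <v, u_i>
reconstruction = c1 * u1 + c2 * u2     # c_1 u_1 + c_2 u_2

print(np.allclose(reconstruction, v))                    # True
print(np.isclose(np.hypot(c1, c2), np.linalg.norm(v)))   # ||v|| = sqrt(c_1^2 + c_2^2)
```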

Exercise 5.9 Show that the matrix \begin{bmatrix}\cos{\theta} & -\sin{\theta}\\\sin{\theta} & \cos{\theta}\end{bmatrix} is orthogonal.

Exercise 5.10 (Challenge problem) Show that a 2 \times 2 orthogonal matrix must be of one of two possible forms \begin{bmatrix}\cos{\theta} & -\sin{\theta}\\\sin{\theta} & \cos{\theta}\end{bmatrix} or \begin{bmatrix}\cos{\theta} & \sin{\theta}\\\sin{\theta} & -\cos{\theta}\end{bmatrix}, where 0 \le \theta \le 2 \pi.

Exercise 5.11 Can you give an example of an orthogonal 3\times2 matrix?

Exercise 5.12 Can you give an example of an orthogonal 2\times3 matrix?

Exercise 5.13 Show that if \mathbf{Q} (square or rectangular) has orthonormal columns, then \mathbf{Q}^T\mathbf{Q}=\mathbf{I}.

Square Orthogonal Matrix

If \mathbf{Q} is a square orthogonal matrix, then

  1. the rows of \mathbf{Q} are also orthonormal, and
  2. \mathbf{Q}^T=\mathbf{Q}^{-1}.
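These properties are easy to verify numerically; here is a sketch using the rotation matrix from Exercise 5.9 with an arbitrary angle:

```python
import numpy as np

theta = 0.3                                    # arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))         # columns are orthonormal
print(np.allclose(Q @ Q.T, np.eye(2)))         # rows are orthonormal too (Q is square)
print(np.allclose(Q.T, np.linalg.inv(Q)))      # Q^T = Q^{-1}
```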

5.3 The Gram-Schmidt process

The Gram-Schmidt process turns a given set of linearly independent vectors \mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n into an orthonormal set of vectors \mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n with the same span. As a result, an orthonormal basis can be obtained from a basis.

The input vectors are taken as the columns of a matrix \mathbf{A}=\begin{bmatrix}\begin{matrix}\Big|\\\mathbf{a}_1\\\Big|\end{matrix} & \begin{matrix}\Big|\\\mathbf{a}_2\\\Big|\end{matrix} &\cdots& \begin{matrix}\Big|\\\mathbf{a}_n\\\Big|\end{matrix}\end{bmatrix} to produce an orthogonal matrix \mathbf{Q}=\begin{bmatrix}\begin{matrix}\Big|\\\mathbf{q}_1\\\Big|\end{matrix} & \begin{matrix}\Big|\\\mathbf{q}_2\\\Big|\end{matrix} &\cdots& \begin{matrix}\Big|\\\mathbf{q}_n\\\Big|\end{matrix}\end{bmatrix} having the same column space as \mathbf{A}.

The procedure first chooses \mathbf{q}_1=\mathbf{a}_1/\|\mathbf{a}_1\|. In order to choose \mathbf{q}_2, it subtracts from \mathbf{a}_2 its component in the direction of \mathbf{q}_1, i.e., \mathbf{b}=\mathbf{a}_2-\langle\mathbf{q}_1, \mathbf{a}_2\rangle\mathbf{q}_1. It’s easy to check that \mathbf{b} and \mathbf{q}_1 are orthogonal. Finally, dividing by the length of \mathbf{b}, we obtain the unit-length vector \mathbf{q}_2=\mathbf{b}/\|\mathbf{b}\| so that

  1. \mathbf{q}_1, \mathbf{q}_2 are orthonormal, and
  2. \mathcal{S}(\mathbf{q}_1, \mathbf{q}_2)=\mathcal{S}(\mathbf{a}_1, \mathbf{a}_2).

This way, the process continues, following the pseudocode:
\begin{algorithm} \caption{Gram-Schmidt Process} \begin{algorithmic} \Procedure{GramSchmidt}{$A[1:m, 1:n]$} \State{$\mathbf{Q}=\mathbf{0}$} \State{$\mathbf{q}_1=\mathbf{a}_1/\|\mathbf{a}_1\|$} \State{$\mathbf{Q}[:,1]=\mathbf{q}_1$} \For{$j=2$ to $n$} \State{$\mathbf{b}=\mathbf{a}_j-\langle\mathbf{q}_1,\mathbf{a}_j\rangle\mathbf{q}_1-\langle\mathbf{q}_2,\mathbf{a}_j\rangle\mathbf{q}_2-\ldots-\langle\mathbf{q}_{j-1},\mathbf{a}_j\rangle\mathbf{q}_{j-1}$} \State{$\mathbf{q}_j=\mathbf{b}/\|\mathbf{b}\|$} \State{$\mathbf{Q}[:,j]=\mathbf{q}_j$} \EndFor \Return{$\mathbf{Q}$} \EndProcedure \end{algorithmic} \end{algorithm}
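The following is a direct NumPy translation of the pseudocode, offered as a sketch (this is the classical Gram-Schmidt variant, which can lose orthogonality in floating-point arithmetic; library QR routines use more stable methods):

```python
import numpy as np

def gram_schmidt(A):
    """Orthonormalize the columns of A (assumed linearly independent).

    Returns Q whose columns are orthonormal and span the same space
    as the columns of A.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        # Subtract from a_j its components along the q's already found.
        b = A[:, j] - Q[:, :j] @ (Q[:, :j].T @ A[:, j])
        Q[:, j] = b / np.linalg.norm(b)
    return Q

A = np.array([[1., 1.],
              [1., 0.],
              [0., 1.]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: orthonormal columns
```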

Exercise 5.14 Find an orthonormal basis for the subspace of \mathbb{R}^4 consisting of all vectors orthogonal to the vector \mathbf{a}=\begin{bmatrix} 1\\ 2\\-1 \\-3 \end{bmatrix}.

5.4 \mathbf{QR} Factorization

The Gram-Schmidt process quite naturally gives rise to an important factorization: \mathbf{A}=\mathbf{QR}, where \mathbf{Q} is the output orthogonal matrix and \mathbf{R} is the upper triangular matrix that connects the two bases given by the columns of \mathbf{A} and of \mathbf{Q}.

The Form of \mathbf{R}

\mathbf{R}=\begin{bmatrix} \langle\mathbf{q}_1,\mathbf{a}_1\rangle & \langle\mathbf{q}_1,\mathbf{a}_2\rangle & \ldots & \langle\mathbf{q}_1,\mathbf{a}_n\rangle \\ 0 & \langle\mathbf{q}_2,\mathbf{a}_2\rangle & \ldots & \langle\mathbf{q}_2,\mathbf{a}_n\rangle \\ \vdots &\vdots&\ddots&\vdots\\ 0 & 0 & \ldots & \langle\mathbf{q}_n,\mathbf{a}_n\rangle \\ \end{bmatrix}.

The Gram-Schmidt process encodes the proof that \mathbf{A}=\mathbf{QR}. Just like \mathbf{LU} factorization, the \mathbf{QR} factorization is also fundamental to Linear Algebra.
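As an independent check of this factorization, here is a sketch using np.linalg.qr, which computes the reduced QR via Householder reflections rather than Gram-Schmidt, so its \mathbf{Q} may differ from the Gram-Schmidt output by column signs:

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 0.],
              [0., 1.]])

Q, R = np.linalg.qr(A)                    # reduced QR: Q is 3x2, R is 2x2

print(np.allclose(Q.T @ Q, np.eye(2)))    # Q has orthonormal columns
print(np.allclose(np.triu(R), R))         # R is upper triangular
print(np.allclose(Q @ R, A))              # A = QR
print(np.allclose(Q.T @ A, R))            # entries of R are <q_i, a_j>
```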

Theorem 5.2 Every m\times n matrix \mathbf{A} with linearly independent columns can be factored as \mathbf{A}=\mathbf{QR}. The columns of \mathbf{Q} are orthonormal, and \mathbf{R} is upper triangular and invertible.