8 Positive Definite Matrices and Minimization of Quadratic Functions
“May not Music be described as the Mathematic of sense, Mathematic as Music of the reason? the soul of each the same!”
– James Joseph Sylvester
The mathematical term matrix was introduced by Sylvester. He is credited for several concepts and results in matrix theory, such as Sylvester’s Criterion for characterizing a positive definite matrix using determinants.
Given vectors \mathbf{x} and \mathbf{y} in \mathbb{R}^n, and considering a basis \{\mathbf{e}_i\}_{i=1}^n of \mathbb{R}^n, we can write the standard inner product as \langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i,j=1}^n k_{ij} x_iy_j=\mathbf{x}^T\mathbf{K}\mathbf{y}, where \mathbf{K} is the n \times n matrix of inner product of the basis vectors with entries k_{ij}= \langle \mathbf{e}_i, \mathbf{e}_j\rangle. We note that the properties of the inner product gives that the matrix \mathbf{K} is symmetric and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} \ge 0 with equality iff \mathbf{x}=0.
8.1 Positive Definite Matrices
A n \times n matrix \mathbf{K} is said to be positive definite if it is symmetric (\mathbf{K}^T=\mathbf{K}) and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} > 0 for all 0 \ne \mathbf{x} \in \mathbb{R}^n. The corresponding homogeneous quadratic polynomial q(\mathbf{x})=\mathbf{x}^T\mathbf{K}\mathbf{x} = \sum_{i,j=1}^n k_{ij} x_ix_j is said to be a positive definite quadratic form on \mathbb{R}^n.
In fact, it can be shown that every inner product on \mathbb{R}^n is given by \langle \mathbf{x}, \mathbf{y} \rangle =\mathbf{x}^T\mathbf{K}\mathbf{y}, where \mathbf{K} is a symmetric positive definite n \times n matrix.
A n \times n matrix \mathbf{K} is said to be positive semi-definite if it is symmetric (\mathbf{K}^T=\mathbf{K}) and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} \ge 0 for all \mathbf{x} \in \mathbb{R}^n.
Exercise 8.1
- Show that the symmetric matrix A=\begin{bmatrix}4 & -2 \\ -2 & 3 \end{bmatrix} and B=\begin{bmatrix}3 & 4 \\ 4 & 6 \end{bmatrix} are positive definite.
- Show that the symmetric matrix C=\begin{bmatrix}3 & 4 \\ 4 & 5 \end{bmatrix} is not positive definite.
Exercise 8.2 Explain why a positive definite matrix must be invertible (nonsingular).
Exercise 8.3 Consider a quadratic form q(\mathbf{x})=\mathbf{x}^T\mathbf{A}\mathbf{x} = \sum_{i,j=1}^n a_{ij} x_ix_j which is not necessarily symmetric. Show that q(\mathbf{x})=\mathbf{x}^T\mathbf{K}\mathbf{x}, where \mathbf{K} = \frac{1}{2}(\mathbf{A}+\mathbf{A}^T) is symmetric. (This exercise shows that a quadratic form can be always “symmetrized”)
8.1.1 Gram Matrices
Let V be an inner product space and let \mathbf{v}_1, \cdots, \mathbf{v}_n \in V. Then the Gram Matrix associated with \mathbf{v}_1, \cdots, \mathbf{v}_n \in V is the following n \times n matrix: \mathbf{K}=\begin{bmatrix} \langle\mathbf{v}_1,\mathbf{v}_1\rangle & \langle\mathbf{v}_1,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_1,\mathbf{v}_n\rangle \\ \langle\mathbf{v}_2,\mathbf{v}_1\rangle & \langle\mathbf{v}_2,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_2,\mathbf{v}_n\rangle \\ \vdots &\vdots&\ddots&\vdots\\ \langle\mathbf{v}_n,\mathbf{v}_1\rangle & \langle\mathbf{v}_n,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_n,\mathbf{v}_n\rangle \\ \end{bmatrix}.
- The Gram Matrix for \{\mathbf{v}_1, \cdots, \mathbf{v}_n\} \subset V is positive definite iff \mathbf{v}_1, \cdots, \mathbf{v}_n are linearly independent.
- Let \mathbf{A}=\begin{bmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_n \end{bmatrix} be a m \times n matrix, where \mathbf{v}_i \in \mathbb{R}^m. Then \mathbf{A^TA} is positive definite iff \mathbf{v}_1, \cdots, \mathbf{v}_n are linearly independent.
Exercise 8.4 Verify (both directly, and by using the above theorem) that the Gram Matrix for the vectors \mathbf{u}_1=\begin{bmatrix}1 & 2 & -1 \end{bmatrix}^T and \mathbf{u}_2=\begin{bmatrix}3 & 0 & 6 \end{bmatrix}^T of \mathbb{R}^3 is positive definite.
The following theorem gives useful ways to recognize positive definite matrices.
The following are equivalent for a symmetric matrix \mathbf{K}.
- \mathbf{K} is positive definite
- \mathbf{K} is regular (i.e., Gaussian elimination can transform it to row echelon form with only type-I row operations - no row swappings needed) and has all pivots positive.
- The corresponding quadratic form is positive definite
- There exists a matrix U with \text{rank} (\mathbf{U})= n such that \mathbf{K} =\mathbf{U}^T\mathbf{U}
- All leading determinants are positive (Sylvester’s Criterion)
- All eigenvalues of \mathbf{K} are positive
8.2 Minimization of Quadratic Functions
In this section we will see how that problem of finding a global minimum (if it exists) of a general quadratic function of n variables has a very satisfying solution by techniques of linear algebra.
We will write a (real) quadratic polynomial of n variables x_1, x_2, \cdots, x_n as p(\mathbf{x})=p(x_1, x_2, \cdots, x_n)= \sum_{i,j=1}^n k_{ij}x_ix_j - 2\sum_{i=1}^nf_ix_i+c, where \mathbf{x}=\begin{bmatrix}x_1 & x_2 & \cdots & x_n \end{bmatrix}^T \in \mathbb{R}^n and k_{ij}, f_i, c \in \mathbb{R}.
Note that p(\mathbf{x}) can be rewritten as p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c, where \mathbf{K}=(k_{ij}) is a symmetric n \times n matrix, \mathbf{f} \in \mathbb{R}^n and c \in \mathbb{R}.
We have the following global minimization result.
Let \mathbf{K} be a (symmetric) positive definite matrix. Then the quadratic function p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c has an unique minimizer, which is the solution of the linear system \mathbf{K}\mathbf{x}=\mathbf{f} (viz, \tilde{\mathbf{x}}=\mathbf{K}^{-1}\mathbf{f}). The (global) minimum value of p(\mathbf{x}) is given by p(\mathbf{x})= c- \tilde{\mathbf{x}}^T\mathbf{f}= c-\tilde{\mathbf{x}}^T\mathbf{K\tilde{\mathbf{x}}}.
If \mathbf{K} is only positive semidefinite, and if \mathbf{f} \in \mathcal{\mathbf{K}}, then every solution \tilde{\mathbf{x}} of the linear equation \mathbf{K}\mathbf{x}=\mathbf{f} is a global minimum of p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c, but the minimum is not unique.
In all other cases, p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c has no global minimum
Exercise 8.5 Show that the quadratic function of two variables p(x,y)=4x^2-4xy+3y^2-4x-2y+3 has a unique global minimum and find that minimum value.
Exercise 8.6 Show that the quadratic function of three variables p(x,y,z)=x^2+2xy+xz+2y^2+yz+2z^2+6y-7z+5 has a unique global minimum and find that minimum value.