8  Positive Definite Matrices and Minimization of Quadratic Functions

“May not Music be described as the Mathematic of sense, Mathematic as Music of the reason? the soul of each the same!”
James Joseph Sylvester

The mathematical term matrix was introduced by Sylvester. He is credited with several concepts and results in matrix theory, such as Sylvester’s Criterion for characterizing a positive definite matrix using determinants.

Given vectors \mathbf{x} and \mathbf{y} in \mathbb{R}^n, and considering a basis \{\mathbf{e}_i\}_{i=1}^n of \mathbb{R}^n, we can write an inner product as \langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i,j=1}^n k_{ij} x_iy_j=\mathbf{x}^T\mathbf{K}\mathbf{y}, where \mathbf{K} is the n \times n matrix of inner products of the basis vectors, with entries k_{ij}= \langle \mathbf{e}_i, \mathbf{e}_j\rangle. We note that the properties of the inner product give that the matrix \mathbf{K} is symmetric and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} \ge 0, with equality iff \mathbf{x}=\mathbf{0}.

8.1 Positive Definite Matrices

Positive Definite Matrices

An n \times n matrix \mathbf{K} is said to be positive definite if it is symmetric (\mathbf{K}^T=\mathbf{K}) and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} > 0 for all \mathbf{0} \ne \mathbf{x} \in \mathbb{R}^n. The corresponding homogeneous quadratic polynomial q(\mathbf{x})=\mathbf{x}^T\mathbf{K}\mathbf{x} = \sum_{i,j=1}^n k_{ij} x_ix_j is said to be a positive definite quadratic form on \mathbb{R}^n.

  1. In fact, it can be shown that every inner product on \mathbb{R}^n is given by \langle \mathbf{x}, \mathbf{y} \rangle =\mathbf{x}^T\mathbf{K}\mathbf{y}, where \mathbf{K} is a symmetric positive definite n \times n matrix.

  2. An n \times n matrix \mathbf{K} is said to be positive semi-definite if it is symmetric (\mathbf{K}^T=\mathbf{K}) and satisfies the condition \mathbf{x}^T\mathbf{K}\mathbf{x} \ge 0 for all \mathbf{x} \in \mathbb{R}^n.
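The definition can be explored numerically. The following sketch (using NumPy; the two matrices are illustrative choices, one positive definite and one indefinite) evaluates the quadratic form \mathbf{x}^T\mathbf{K}\mathbf{x} at many random nonzero vectors. Sampling cannot prove positive definiteness, but it quickly exposes a matrix that fails the condition.

```python
import numpy as np

# Illustrative matrices: K is positive definite, M is indefinite.
K = np.array([[4.0, -2.0], [-2.0, 3.0]])
M = np.array([[3.0, 4.0], [4.0, 5.0]])

def quadratic_form(K, x):
    """Evaluate q(x) = x^T K x."""
    return x @ K @ x

rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))   # random (almost surely nonzero) test vectors

print(all(quadratic_form(K, x) > 0 for x in samples))  # K passes every sample
print(all(quadratic_form(M, x) > 0 for x in samples))  # M fails for some samples
```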

Exercise 8.1  

  1. Show that the symmetric matrices A=\begin{bmatrix}4 & -2 \\ -2 & 3 \end{bmatrix} and B=\begin{bmatrix}3 & 4 \\ 4 & 6 \end{bmatrix} are positive definite.
  2. Show that the symmetric matrix C=\begin{bmatrix}3 & 4 \\ 4 & 5 \end{bmatrix} is not positive definite.

Exercise 8.2 Explain why a positive definite matrix must be invertible (nonsingular).

Exercise 8.3 Consider a quadratic form q(\mathbf{x})=\mathbf{x}^T\mathbf{A}\mathbf{x} = \sum_{i,j=1}^n a_{ij} x_ix_j, where \mathbf{A} is not necessarily symmetric. Show that q(\mathbf{x})=\mathbf{x}^T\mathbf{K}\mathbf{x}, where \mathbf{K} = \frac{1}{2}(\mathbf{A}+\mathbf{A}^T) is symmetric. (This exercise shows that a quadratic form can always be “symmetrized”.)

8.1.1 Gram Matrices

Gram Matrices

Let V be an inner product space and let \mathbf{v}_1, \cdots, \mathbf{v}_n \in V. Then the Gram Matrix associated with \mathbf{v}_1, \cdots, \mathbf{v}_n \in V is the following n \times n matrix: \mathbf{K}=\begin{bmatrix} \langle\mathbf{v}_1,\mathbf{v}_1\rangle & \langle\mathbf{v}_1,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_1,\mathbf{v}_n\rangle \\ \langle\mathbf{v}_2,\mathbf{v}_1\rangle & \langle\mathbf{v}_2,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_2,\mathbf{v}_n\rangle \\ \vdots &\vdots&\ddots&\vdots\\ \langle\mathbf{v}_n,\mathbf{v}_1\rangle & \langle\mathbf{v}_n,\mathbf{v}_2\rangle & \ldots & \langle\mathbf{v}_n,\mathbf{v}_n\rangle \\ \end{bmatrix}.
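For the standard dot product on \mathbb{R}^m, the Gram matrix is easy to compute by stacking the vectors as columns of a matrix \mathbf{A} and forming \mathbf{A}^T\mathbf{A}. A minimal sketch (the two vectors are illustrative choices):

```python
import numpy as np

# Illustrative vectors in R^3 under the standard dot product.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])

A = np.column_stack([v1, v2])   # columns are the vectors v_i
K = A.T @ A                     # entries k_ij = <v_i, v_j>

print(K)                        # the Gram matrix [[2, 1], [1, 2]]
```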

Theorem
  1. The Gram Matrix for \{\mathbf{v}_1, \cdots, \mathbf{v}_n\} \subset V is positive definite iff \mathbf{v}_1, \cdots, \mathbf{v}_n are linearly independent.
  2. Let \mathbf{A}=\begin{bmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_n \end{bmatrix} be an m \times n matrix, where \mathbf{v}_i \in \mathbb{R}^m. Then \mathbf{A^TA} is positive definite iff \mathbf{v}_1, \cdots, \mathbf{v}_n are linearly independent.
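Part 2 of the theorem can be illustrated numerically, here using the eigenvalue test from the characterization theorem below (both matrices are illustrative choices):

```python
import numpy as np

# Columns of A_indep are linearly independent; the second column of
# A_dep is twice the first, so its columns are dependent.
A_indep = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]])
A_dep = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])

for A in (A_indep, A_dep):
    K = A.T @ A                             # the Gram matrix of the columns
    smallest = np.linalg.eigvalsh(K).min()  # eigvalsh: eigenvalues of a symmetric matrix
    print(np.linalg.matrix_rank(A), smallest > 1e-12)
```

With independent columns the smallest eigenvalue of \mathbf{A}^T\mathbf{A} is strictly positive; with dependent columns it is zero (up to rounding), so \mathbf{A}^T\mathbf{A} is only positive semi-definite.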

Exercise 8.4 Verify (both directly, and by using the above theorem) that the Gram Matrix for the vectors \mathbf{u}_1=\begin{bmatrix}1 & 2 & -1 \end{bmatrix}^T and \mathbf{u}_2=\begin{bmatrix}3 & 0 & 6 \end{bmatrix}^T of \mathbb{R}^3 is positive definite.

The following theorem gives useful ways to recognize positive definite matrices.

Theorem: Characterizing Positive Definite Matrices

The following are equivalent for a symmetric matrix \mathbf{K}.

  1. \mathbf{K} is positive definite
  2. \mathbf{K} is regular (i.e., Gaussian elimination can transform it to row echelon form using only type-I row operations; no row swaps are needed) and all of its pivots are positive.
  3. The corresponding quadratic form is positive definite
  4. There exists an m \times n matrix \mathbf{U} with \text{rank} (\mathbf{U})= n such that \mathbf{K} =\mathbf{U}^T\mathbf{U}
  5. All leading principal minors of \mathbf{K} are positive (Sylvester’s Criterion)
  6. All eigenvalues of \mathbf{K} are positive
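Several of these equivalent conditions translate directly into numerical tests. A sketch (using NumPy on an illustrative matrix; np.linalg.cholesky produces a factorization \mathbf{K}=\mathbf{L}\mathbf{L}^T, which has the form of condition 4 with \mathbf{U}=\mathbf{L}^T, and raises an error exactly when no such factorization exists):

```python
import numpy as np

K = np.array([[4.0, -2.0], [-2.0, 3.0]])  # illustrative symmetric matrix

# Condition 6: all eigenvalues positive.
print(np.all(np.linalg.eigvalsh(K) > 0))

# Condition 5 (Sylvester's Criterion): all leading principal minors positive.
minors = [np.linalg.det(K[:i, :i]) for i in range(1, K.shape[0] + 1)]
print(all(m > 0 for m in minors))

# Condition 4 via the Cholesky factorization K = L L^T (take U = L^T).
try:
    np.linalg.cholesky(K)
    print(True)
except np.linalg.LinAlgError:
    print(False)
```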

8.2 Minimization of Quadratic Functions

In this section we will see that the problem of finding a global minimum (if it exists) of a general quadratic function of n variables has a very satisfying solution using techniques of linear algebra.

We will write a (real) quadratic polynomial of n variables x_1, x_2, \cdots, x_n as p(\mathbf{x})=p(x_1, x_2, \cdots, x_n)= \sum_{i,j=1}^n k_{ij}x_ix_j - 2\sum_{i=1}^nf_ix_i+c, where \mathbf{x}=\begin{bmatrix}x_1 & x_2 & \cdots & x_n \end{bmatrix}^T \in \mathbb{R}^n and k_{ij}, f_i, c \in \mathbb{R}.

Note that p(\mathbf{x}) can be rewritten as p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c, where \mathbf{K}=(k_{ij}) is a symmetric n \times n matrix, \mathbf{f} \in \mathbb{R}^n and c \in \mathbb{R}.
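The passage from the scalar form to the matrix form amounts to splitting each cross term x_ix_j evenly between k_{ij} and k_{ji} and reading off the linear coefficients as -2f_i. A sketch on a hypothetical example polynomial:

```python
import numpy as np

# Hypothetical example: p(x, y) = x^2 + 2xy + 3y^2 - 2x + 4y + 1,
# so K = [[1, 1], [1, 3]] (the 2xy term split evenly), -2f = (-2, 4), c = 1.
K = np.array([[1.0, 1.0], [1.0, 3.0]])
f = np.array([1.0, -2.0])
c = 1.0

def p_matrix(v):
    """Matrix form p(x) = x^T K x - 2 x^T f + c."""
    return v @ K @ v - 2 * (v @ f) + c

def p_scalar(x, y):
    """The same polynomial written out in scalar form."""
    return x**2 + 2*x*y + 3*y**2 - 2*x + 4*y + 1

v = np.array([0.7, -1.3])
print(np.isclose(p_matrix(v), p_scalar(*v)))   # the two forms agree
```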

We have the following global minimization result.

Theorem (Minimizing Quadratic Functions of n variables)
  1. Let \mathbf{K} be a (symmetric) positive definite matrix. Then the quadratic function p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c has a unique minimizer, which is the solution of the linear system \mathbf{K}\mathbf{x}=\mathbf{f} (viz., \tilde{\mathbf{x}}=\mathbf{K}^{-1}\mathbf{f}). The (global) minimum value of p(\mathbf{x}) is p(\tilde{\mathbf{x}})= c- \tilde{\mathbf{x}}^T\mathbf{f}= c-\tilde{\mathbf{x}}^T\mathbf{K}\tilde{\mathbf{x}}.

  2. If \mathbf{K} is only positive semidefinite, and if \mathbf{f} lies in the range (column space) of \mathbf{K}, then every solution \tilde{\mathbf{x}} of the linear system \mathbf{K}\mathbf{x}=\mathbf{f} is a global minimizer of p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c, but the minimizer is not unique.

  3. In all other cases, p(\mathbf{x})=\mathbf{x}^T\mathbf{K} \mathbf{x} -2\mathbf{x}^T\mathbf{f} +c has no global minimum.
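Part 1 of the theorem reduces minimization to solving one linear system. A sketch on illustrative data (\mathbf{K} below is positive definite since its leading principal minors, 1 and 2, are positive):

```python
import numpy as np

K = np.array([[1.0, 1.0], [1.0, 3.0]])  # positive definite: minors 1, 2 > 0
f = np.array([1.0, -2.0])
c = 1.0

x_min = np.linalg.solve(K, f)           # unique minimizer solves K x = f
p_min = c - x_min @ f                   # minimum value c - x~^T f

def p(v):
    return v @ K @ v - 2 * (v @ f) + c

print(x_min, p_min)                     # minimizer (2.5, -1.5), minimum value -4.5
# Sanity check: nearby points never do better than the minimum.
rng = np.random.default_rng(1)
print(all(p(x_min + d) >= p_min - 1e-9 for d in rng.normal(size=(200, 2))))
```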

Exercise 8.5 Show that the quadratic function of two variables p(x,y)=4x^2-4xy+3y^2-4x-2y+3 has a unique global minimum and find that minimum value.

Exercise 8.6 Show that the quadratic function of three variables p(x,y,z)=x^2+2xy+xz+2y^2+yz+2z^2+6y-7z+5 has a unique global minimum and find that minimum value.

8.3 Application: Energy Minimization in a Spring-Mass System