Exercise 2.10 Two Point Boundary Value Problem; Computation)
4.2 Positive Definite and Semidefinite Matrices
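Let $A \in \mathbb{C}^{n\times n}$ be given. The function $f : \mathbb{C}^n \to \mathbb{C}$ given by
$$f(x) = x^*Ax = \sum_{i=1}^n \sum_{j=1}^n a_{ij}\,\overline{x_i}\,x_j$$
is called a quadratic form. Note that $f$ is real valued if $A$ is Hermitian. Indeed, $\overline{f(x)} = \overline{x^*Ax} = (x^*Ax)^* = x^*A^*x = f(x)$.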
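The claim that a Hermitian matrix gives a real quadratic form can be checked numerically. The following is a minimal Python sketch; the function name `quadratic_form` and the two test matrices are assumptions chosen for this illustration and do not come from the text.

```python
def quadratic_form(A, x):
    """Return x^* A x for a square matrix A (list of rows) and a complex vector x."""
    n = len(x)
    return sum(A[i][j] * x[i].conjugate() * x[j]
               for i in range(n) for j in range(n))

# Hermitian example: A_herm equals its conjugate transpose.
A_herm = [[2, 1 - 1j],
          [1 + 1j, 3]]
# Non-Hermitian example for contrast (symmetric, but not conjugate symmetric).
A_other = [[2, 1j],
           [1j, 3]]
x = [1 + 2j, -1j]

f_herm = quadratic_form(A_herm, x)    # imaginary part vanishes
f_other = quadratic_form(A_other, x)  # imaginary part does not vanish
print(f_herm, f_other)
```

For the non-Hermitian matrix the value of the quadratic form has a nonzero imaginary part, which is why definiteness is only defined for Hermitian matrices.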
Definition 4.1 (Positive Definite Matrix) We say that a matrix $A \in \mathbb{C}^{n\times n}$ is
(i) positive definite if $A^* = A$ and $x^*Ax > 0$ for all nonzero $x \in \mathbb{C}^n$;
(ii) positive semidefinite if $A^* = A$ and $x^*Ax \ge 0$ for all $x \in \mathbb{C}^n$;
(iii) negative (semi)definite if $-A$ is positive (semi)definite.
We observe that
1. The zero-matrix is positive semidefinite, while the unit matrix is positive definite.
2. The matrix $A$ is positive definite if and only if it is positive semidefinite and $x^*Ax = 0 \Rightarrow x = 0$.
3. A positive definite matrix $A$ is nonsingular. For if $Ax = 0$ then $x^*Ax = 0$ and this implies that $x = 0$.
4. It follows from Lemma 4.6 that a nonsingular positive semidefinite matrix is positive definite.
5. If $A$ is real then it is enough to show definiteness for real vectors only. Indeed, if $A \in \mathbb{R}^{n\times n}$, $A^T = A$ and $x^TAx > 0$ for all nonzero $x \in \mathbb{R}^n$ then $z^*Az > 0$ for all nonzero $z \in \mathbb{C}^n$. For if $z = x + iy \ne 0$ with $x, y \in \mathbb{R}^n$ then
$$z^*Az = (x - iy)^TA(x + iy) = x^TAx - iy^TAx + ix^TAy - i^2y^TAy = x^TAx + y^TAy,$$
where the two middle terms cancel since $y^TAx = x^TAy$ for a symmetric $A$, and this is positive since at least one of the real vectors $x, y$ is nonzero.
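Example 4.3 (Gradient and Hessian) Symmetric positive definite matrices are important in nonlinear optimization. Consider (cf. (16.1)) the gradient $\nabla f$ and Hessian $Hf$ of a function $f : \Omega \subset \mathbb{R}^n \to \mathbb{R}$
$$\nabla f(x) = \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_n} \end{bmatrix} \in \mathbb{R}^n, \qquad Hf(x) = \begin{bmatrix} \dfrac{\partial^2 f(x)}{\partial x_1 \partial x_1} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_n \partial x_n} \end{bmatrix} \in \mathbb{R}^{n\times n}.$$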
We assume that $f$ has continuous first and second order partial derivatives on $\Omega$.
Under suitable conditions on the domain $\Omega$ it is shown in advanced calculus texts that if $\nabla f(x) = 0$ and $Hf(x)$ is positive definite then $x$ is a local minimum for $f$. This can be shown using the second-order Taylor expansion (16.2). Moreover, $x$ is a local maximum if $\nabla f(x) = 0$ and $Hf(x)$ is negative definite.
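As a small worked illustration of this criterion, consider the function $f(x) = x_1^2 - x_1x_2 + x_2^2$ on $\Omega = \mathbb{R}^2$. This particular $f$ is an assumption chosen here for concreteness and does not come from the text. We find

$$\nabla f(x) = \begin{bmatrix} 2x_1 - x_2 \\ -x_1 + 2x_2 \end{bmatrix}, \qquad Hf(x) = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}.$$

The gradient vanishes only at $x = 0$, and for any nonzero $y \in \mathbb{R}^2$

$$y^T Hf(0)\, y = 2y_1^2 - 2y_1y_2 + 2y_2^2 = y_1^2 + y_2^2 + (y_1 - y_2)^2 > 0,$$

so the Hessian is positive definite and $x = 0$ is a local minimum for $f$.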
Lemma 4.2 (The Matrix $A^*A$) The matrix $A^*A$ is positive semidefinite for any $m, n \in \mathbb{N}$ and $A \in \mathbb{C}^{m\times n}$. It is positive definite if and only if $A$ has linearly independent columns or equivalently rank $n$.
Proof Clearly $A^*A$ is Hermitian. Let $x \in \mathbb{C}^n$ and set $z := Ax$. By the definition (1.11) of the Euclidean norm we have $x^*A^*Ax = z^*z = \|z\|_2^2 = \|Ax\|_2^2 \ge 0$ with equality if and only if $Ax = 0$. It follows that $A^*A$ is positive semidefinite and positive definite if and only if $A$ has linearly independent columns. But this is equivalent to $A$ having rank $n$ (cf. Definition 1.6).
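The identity $x^*A^*Ax = \|Ax\|_2^2$ used in the proof can be illustrated for small real matrices. The sketch below is only an illustration in Python; the helper names and the two example matrices are assumptions made here, and only the real case is covered.

```python
def gram(A):
    """Return A^T A for a real matrix A given as a list of rows."""
    m, n = len(A), len(A[0])
    return [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]

def quadratic_form(G, x):
    """Return x^T G x for a real square matrix G and a real vector x."""
    n = len(x)
    return sum(G[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def norm_squared_of_product(A, x):
    """Return ||Ax||_2^2 for a real matrix A and a real vector x."""
    return sum(sum(a * xi for a, xi in zip(row, x)) ** 2 for row in A)

dependent = [[1, 2], [2, 4], [3, 6]]    # second column is twice the first
independent = [[1, 0], [0, 1], [1, 1]]  # columns are linearly independent
x = [2, -1]                             # nonzero vector with dependent * x = 0

q_dep = quadratic_form(gram(dependent), x)
q_ind = quadratic_form(gram(independent), x)
print(q_dep, norm_squared_of_product(dependent, x))
print(q_ind, norm_squared_of_product(independent, x))
```

For the matrix with dependent columns the quadratic form vanishes at a nonzero vector, so its Gram matrix is positive semidefinite but not positive definite.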
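Lemma 4.3 (T Is Positive Definite) The second derivative matrix $T = \operatorname{tridiag}(-1, 2, -1) \in \mathbb{R}^{n\times n}$ is positive definite.
Proof Clearly $T$ is symmetric. For any $x \in \mathbb{R}^n$
$$\begin{aligned}
x^TTx &= 2\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n-1} x_ix_{i+1} - \sum_{i=2}^{n} x_{i-1}x_i \\
&= \sum_{i=1}^{n-1} x_i^2 - 2\sum_{i=1}^{n-1} x_ix_{i+1} + \sum_{i=1}^{n-1} x_{i+1}^2 + x_1^2 + x_n^2 \\
&= x_1^2 + x_n^2 + \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2.
\end{aligned}$$
Thus $x^TTx \ge 0$ and if $x^TTx = 0$ then $x_1 = x_n = 0$ and $x_i = x_{i+1}$ for $i = 1, \dots, n-1$, which implies that $x = 0$. Hence $T$ is positive definite.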
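The sum of squares identity in this proof is easy to check numerically. The following Python sketch evaluates both sides for one integer vector; the function names and the test vector are assumptions chosen for illustration.

```python
def tridiag_quadratic_form(x):
    """Return x^T T x for T = tridiag(-1, 2, -1), accumulated row by row."""
    n = len(x)
    total = 0
    for i in range(n):
        total += 2 * x[i] * x[i]       # diagonal entry 2
        if i > 0:
            total -= x[i] * x[i - 1]   # subdiagonal entry -1
        if i < n - 1:
            total -= x[i] * x[i + 1]   # superdiagonal entry -1
    return total

def sum_of_squares_form(x):
    """Return x_1^2 + x_n^2 + sum of (x_{i+1} - x_i)^2 for i = 1, ..., n-1."""
    n = len(x)
    return x[0] ** 2 + x[-1] ** 2 + sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1))

x = [1, -2, 3, 0, 5]
lhs = tridiag_quadratic_form(x)
rhs = sum_of_squares_form(x)
print(lhs, rhs)
```

Both expressions agree, and the second form makes it visible that the value is zero only for the zero vector.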
4.2.1 The Cholesky Factorization
Recall that a principal submatrix $B = A(r, r) \in \mathbb{C}^{k\times k}$ of a matrix $A \in \mathbb{C}^{n\times n}$ has elements $b_{i,j} = a_{r_i, r_j}$ for $i, j = 1, \dots, k$, where $1 \le r_1 < \cdots < r_k \le n$. It is a leading principal submatrix, denoted $A_{[k]}$, if $r = [1, 2, \dots, k]^T$. We have
$$A(r, r) = X^*AX, \qquad X := [e_{r_1}, \dots, e_{r_k}] \in \mathbb{C}^{n\times k}. \tag{4.5}$$
Lemma 4.4 (Submatrices) Any principal submatrix of a positive (semi)definite matrix is positive (semi)definite.
Proof Let $X$ and $B := A(r, r)$ be given by (4.5). If $A$ is positive semidefinite then $B$ is positive semidefinite since
$$y^*By = y^*X^*AXy = x^*Ax \ge 0, \qquad y \in \mathbb{C}^k, \quad x := Xy. \tag{4.6}$$
Suppose $A$ is positive definite and $y^*By = 0$. By (4.6) we have $x = 0$ and since $X$ has linearly independent columns it follows that $y = 0$. We conclude that $B$ is positive definite.
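Relation (4.6) says that the quadratic form of a principal submatrix is the quadratic form of $A$ restricted to vectors that vanish outside the index set $r$. A minimal Python sketch for real matrices follows; the helper names, the matrix and the index set are assumptions made for this illustration.

```python
def principal_submatrix(A, r):
    """Return A(r, r) for 1-based, strictly increasing indices r."""
    return [[A[i - 1][j - 1] for j in r] for i in r]

def embed(y, r, n):
    """Return x = X y with X = [e_{r_1}, ..., e_{r_k}], i.e. y placed at positions r."""
    x = [0] * n
    for yi, ri in zip(y, r):
        x[ri - 1] = yi
    return x

def quadratic_form(A, x):
    """Return x^T A x for a real square matrix A and a real vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# The second derivative matrix tridiag(-1, 2, -1) of order 4.
A = [[2, -1, 0, 0],
     [-1, 2, -1, 0],
     [0, -1, 2, -1],
     [0, 0, -1, 2]]
r = [1, 3]
B = principal_submatrix(A, r)
y = [3, -1]
x = embed(y, r, len(A))
print(B, quadratic_form(B, y), quadratic_form(A, x))
```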
Theorem 4.2 (LDL* and LL*) The following are equivalent for a matrix $A \in \mathbb{C}^{n\times n}$.
1. $A$ is positive definite,
2. $A$ has an LDL* factorization with positive diagonal elements in $D$,
3. $A$ has a Cholesky factorization.
If the Cholesky factorization exists it is unique.
Proof Recall that $A^{-*} := (A^{-1})^* = (A^*)^{-1}$.
We show that 1 ⇒ 2 ⇒ 3 ⇒ 1.
1 ⇒ 2: Suppose $A$ is positive definite. By Lemma 4.4 the leading principal submatrices $A_{[k]} \in \mathbb{C}^{k\times k}$ are positive definite and therefore nonsingular for $k = 1, \dots, n-1$. Since $A$ is Hermitian it has by Theorem 4.1 a unique LDL* factorization $A = LDL^*$. To show that the $i$th diagonal element in $D$ is positive we note that $x_i := L^{-*}e_i$ is nonzero since $L^{-*}$ is nonsingular. But then $d_{ii} = e_i^*De_i = e_i^*L^{-1}AL^{-*}e_i = x_i^*Ax_i > 0$ since $A$ is positive definite.
2 ⇒ 3: Suppose $A$ has an LDL* factorization $A = LDL^*$ with positive diagonal elements $d_{ii}$ in $D$. Then $A = SS^*$, where $S := LD^{1/2}$ and $D^{1/2} := \operatorname{diag}(\sqrt{d_{11}}, \dots, \sqrt{d_{nn}})$, and this is a Cholesky factorization of $A$.
3 ⇒ 1: Suppose $A$ has a Cholesky factorization $A = LL^*$. Clearly $A^* = A$. Since $L$ has positive diagonal elements it is nonsingular, and $A = (L^*)^*L^*$ is positive definite by Lemma 4.2 applied to $L^*$.
For uniqueness suppose $LL^* = SS^*$ are two Cholesky factorizations of the positive definite matrix $A$. Since $A$ is nonsingular both $L$ and $S$ are nonsingular. Then $S^{-1}L = S^*L^{-*}$, where by Lemma 2.5 $S^{-1}L$ is lower triangular and $S^*L^{-*}$ is upper triangular, with diagonal elements $\ell_{ii}/s_{ii}$ and $s_{ii}/\ell_{ii}$, respectively. But then both matrices must be equal to the same diagonal matrix and $\ell_{ii}^2 = s_{ii}^2$. By positivity $\ell_{ii} = s_{ii}$ and we conclude that $S^{-1}L = I = S^*L^{-*}$, which means that $L = S$.
A Cholesky factorization can also be written in the equivalent form $A = R^*R$, where $R = L^*$ is upper triangular with positive diagonal elements.
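Example 4.4 (2×2) The matrix $A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ has an LDL* and a Cholesky factorization given by
$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\frac12 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & \frac32 \end{bmatrix} \begin{bmatrix} 1 & -\frac12 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \sqrt{2} & 0 \\ -1/\sqrt{2} & \sqrt{3/2} \end{bmatrix} \begin{bmatrix} \sqrt{2} & -1/\sqrt{2} \\ 0 & \sqrt{3/2} \end{bmatrix}.$$
There are many good algorithms for finding the Cholesky factorization of a matrix, see [3]. The following version for finding the factorization of a matrix $A$ with bandwidth $d \ge 1$ uses the LDL* factorization Algorithm 4.1. Only the upper part of $A$ is used. The algorithm uses the MATLAB command diag.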
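function L=bandcholesky(A,d)
% L=bandcholesky(A,d)
% Cholesky factor of a positive definite matrix A with bandwidth d.
[L,dg]=LDL(A,d);       % LDL* factorization (Algorithm 4.1); dg holds the diagonal of D
L=L*diag(sqrt(dg));    % scale column j of L by sqrt(dg(j))
end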
Listing 4.2 bandcholesky
As for the LDL* factorization, the leading term in an operation count for a band matrix is $O(d^2n)$. When $d$ is small this is a considerable saving compared to the count $\frac12 G_n = n^3/3$ for a full matrix.
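The listing relies on the banded routine LDL of Algorithm 4.1, which is not reproduced in this section. As a rough illustration of the same two steps (an LDL* factorization followed by scaling with $D^{1/2}$), here is a dense, non-banded sketch in Python for real symmetric matrices. The function names `ldl` and `cholesky` are assumptions made for this sketch, not the routine from the text, and no use is made of the band structure.

```python
import math

def ldl(A):
    """Dense LDL^T factorization of a real symmetric matrix A (list of rows).

    Only the upper triangle of A is read. Returns (L, d) with L unit lower
    triangular and d the diagonal of D.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0:
            raise ValueError("zero pivot encountered")
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[j][i] - s) / d[j]   # A[j][i] lies in the upper triangle
    return L, d

def cholesky(A):
    """Return the lower triangular factor L * D^(1/2), assuming all pivots are positive."""
    L, d = ldl(A)
    if any(dk <= 0 for dk in d):
        raise ValueError("matrix is not positive definite")
    n = len(A)
    return [[L[i][j] * math.sqrt(d[j]) for j in range(n)] for i in range(n)]

# The matrix of Example 4.4.
A = [[2, -1], [-1, 2]]
L, d = ldl(A)
C = cholesky(A)
print(L, d, C)
```

Applied to the matrix of Example 4.4 this reproduces the factors given there. A banded version would restrict the inner sums to the at most $d$ nonzero entries per row, which is where the $O(d^2n)$ count comes from.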
4.2.2 Positive Definite and Positive Semidefinite Criteria
Not all Hermitian matrices are positive definite, and sometimes we can tell just by glancing at the matrix that it cannot be positive definite. Here are some necessary conditions.
Theorem 4.3 (Necessary Conditions for Positive (Semi)Definiteness) If $A \in \mathbb{C}^{n\times n}$ is positive (semi)definite then for all $i, j$ with $i \ne j$
1. $a_{ii} > 0$, ($a_{ii} \ge 0$),
2. $|\mathrm{Re}(a_{ij})| < (a_{ii} + a_{jj})/2$, ($|\mathrm{Re}(a_{ij})| \le (a_{ii} + a_{jj})/2$),
3. $|a_{ij}| < \sqrt{a_{ii}a_{jj}}$, ($|a_{ij}| \le \sqrt{a_{ii}a_{jj}}$),
4. If $A$ is positive semidefinite and $a_{ii} = 0$ for some $i$ then $a_{ij} = a_{ji} = 0$ for $j = 1, \dots, n$.
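Proof Clearly $a_{ii} = e_i^TAe_i > (\ge)\ 0$ and Part 1 follows. If $\alpha, \beta \in \mathbb{C}$ and $\alpha e_i + \beta e_j \ne 0$ then
$$0 < (\le)\ (\alpha e_i + \beta e_j)^*A(\alpha e_i + \beta e_j) = |\alpha|^2a_{ii} + |\beta|^2a_{jj} + 2\,\mathrm{Re}(\overline{\alpha}\beta a_{ij}). \tag{4.7}$$
Taking $\alpha = 1$, $\beta = \pm 1$ we obtain $a_{ii} + a_{jj} \pm 2\,\mathrm{Re}\,a_{ij} > (\ge)\ 0$ and this implies Part 2.
We first show 3. when $A$ is positive definite. Taking $\alpha = -a_{ij}$, $\beta = a_{ii}$ in (4.7) we find
$$0 < |a_{ij}|^2a_{ii} + a_{ii}^2a_{jj} - 2|a_{ij}|^2a_{ii} = a_{ii}(a_{ii}a_{jj} - |a_{ij}|^2).$$
Since $a_{ii} > 0$ Part 3 follows in the positive definite case.
Suppose now $A$ is positive semidefinite. For $\varepsilon > 0$ we define $B := A + \varepsilon I$. The matrix $B$ is positive definite since it is Hermitian and $x^*Bx \ge \varepsilon\|x\|_2^2 > 0$ for any nonzero $x \in \mathbb{C}^n$. From what we have shown
$$|a_{ij}| = |b_{ij}| < \sqrt{b_{ii}b_{jj}} = \sqrt{(a_{ii} + \varepsilon)(a_{jj} + \varepsilon)}, \qquad i \ne j.$$
Since $\varepsilon > 0$ is arbitrary Part 3 follows in the semidefinite case. Since $A$ is Hermitian Part 3 implies Part 4.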
Example 4.5 (Not Positive Definite) Consider the matrices
$$A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \qquad A_2 = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix}, \qquad A_3 = \begin{bmatrix} -2 & 1 \\ 1 & 2 \end{bmatrix}.$$
Here $A_1$ and $A_3$ are not positive definite, since a diagonal element is not positive. $A_2$ is not positive definite since neither Part 2 nor Part 3 in Theorem 4.3 is satisfied.
The matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ satisfies all the necessary conditions in Theorem 4.3. But to decide if it is positive definite it is nice to have sufficient conditions as well.
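The first three necessary conditions of Theorem 4.3 are cheap to test by machine. The following Python sketch reports which of Parts 1 to 3 fail in their strict (positive definite) form; the function name `failed_parts` is an assumption made here, and the input is assumed to be Hermitian. Part 3 is tested in the squared form $|a_{ij}|^2 < a_{ii}a_{jj}$ to avoid a square root of a negative number.

```python
def failed_parts(A):
    """Return the sorted list of Parts 1-3 of Theorem 4.3 (strict form) that A violates.

    A is assumed to be a Hermitian matrix given as a list of rows.
    """
    n = len(A)
    failed = set()
    for i in range(n):
        aii = complex(A[i][i]).real
        if aii <= 0:
            failed.add(1)
        for j in range(n):
            if i == j:
                continue
            aij = complex(A[i][j])
            ajj = complex(A[j][j]).real
            if abs(aij.real) >= (aii + ajj) / 2:
                failed.add(2)
            if abs(aij) ** 2 >= aii * ajj:
                failed.add(3)
    return sorted(failed)

A1 = [[0, 1], [1, 1]]
A2 = [[1, 2], [2, 2]]
A3 = [[-2, 1], [1, 2]]
A4 = [[2, 1], [1, 2]]
for name, M in [("A1", A1), ("A2", A2), ("A3", A3), ("A4", A4)]:
    print(name, failed_parts(M))
```

An empty list only means that no necessary condition is violated. It does not prove positive definiteness.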
We start by considering eigenvalues of a positive (semi)definite matrix.
Lemma 4.5 (Positive Eigenvalues) A matrix is positive (semi)definite if and only if it is Hermitian and all its eigenvalues are positive (nonnegative).
Proof Suppose $A$ is positive (semi)definite. Then $A$ is Hermitian by definition, and if $Ax = \lambda x$ and $x$ is nonzero, then $x^*Ax = \lambda x^*x$. This implies that $\lambda > 0$ ($\ge 0$) since $A$ is positive (semi)definite and $x^*x = \|x\|_2^2 > 0$. Conversely, suppose $A \in \mathbb{C}^{n\times n}$ is Hermitian with positive (nonnegative) eigenvalues $\lambda_1, \dots, \lambda_n$. By Theorem 6.9 (the spectral theorem) there is a matrix $U \in \mathbb{C}^{n\times n}$ with $U^*U = UU^* = I$ such that $U^*AU = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Let $x \in \mathbb{C}^n$ and define $z := U^*x = [z_1, \dots, z_n]^T \in \mathbb{C}^n$. Then $x = UU^*x = Uz$ and by the spectral theorem
$$x^*Ax = z^*U^*AUz = z^*\operatorname{diag}(\lambda_1, \dots, \lambda_n)z = \sum_{j=1}^n \lambda_j|z_j|^2 \ge 0.$$
It follows that $A$ is positive semidefinite. Since $U^*$ is nonsingular we see that $z = U^*x$ is nonzero if $x$ is nonzero, and therefore $A$ is positive definite if all the eigenvalues are positive.
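For a real symmetric 2×2 matrix the eigenvalues are available in closed form from the characteristic polynomial, so Lemma 4.5 gives a direct test. The sketch below is a Python illustration under that assumption; the function name `symmetric_2x2_eigenvalues` is hypothetical and the formula applies only to matrices of the form $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$ with real entries.

```python
import math

def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

# The matrix [[2, 1], [1, 2]], which passes the necessary conditions of Theorem 4.3.
low, high = symmetric_2x2_eigenvalues(2, 1, 2)
# The matrix A2 = [[1, 2], [2, 2]] of Example 4.5.
low2, high2 = symmetric_2x2_eigenvalues(1, 2, 2)
print(low, high, low2, high2)
```

Both eigenvalues of the first matrix are positive, so it is positive definite. The second matrix has a negative eigenvalue, in agreement with Example 4.5.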
Lemma 4.6 (Positive Semidefinite and Nonsingular) A matrix is positive definite if and only if it is positive semidefinite and nonsingular.
Proof If $A$ is positive definite then it is positive semidefinite and if $Ax = 0$ then $x^*Ax = 0$ which implies that $x = 0$. Conversely, if $A$ is positive semidefinite then it is Hermitian with nonnegative eigenvalues (cf. Lemma 4.5). If it is nonsingular all eigenvalues are positive (cf. Theorem 1.11), and it follows from Lemma 4.5 that $A$ is positive definite.
The following necessary and sufficient conditions can be used to decide if a matrix is positive definite.
Theorem 4.4 (Positive Definite Characterization) The following statements are equivalent for a matrix $A \in \mathbb{C}^{n\times n}$.
1. $A$ is positive definite.
2. $A$ is Hermitian with only positive eigenvalues.
3. $A$ is Hermitian and all leading principal submatrices have a positive determinant.
4. $A = BB^*$ for a nonsingular $B \in \mathbb{C}^{n\times n}$.
Proof
1 ⇐⇒ 2: This follows from Lemma 4.5.
1 ⇒ 3: A positive definite matrix has positive eigenvalues, and since the determinant of a matrix equals the product of its eigenvalues (cf. Theorem 1.10) the determinant is positive. Every leading principal submatrix of a positive definite matrix is positive definite (cf. Lemma 4.4) and therefore has a positive determinant.
3 ⇒ 4: Since each leading principal submatrix has a positive determinant it is nonsingular, and Theorem 4.1 implies that $A$ has a unique LDL* factorization $A = LDL^*$. Since $L$ is unit lower triangular we have $\det(A_{[k]}) = d_{11} \cdots d_{kk}$ for $k = 1, \dots, n$, so all diagonal elements of $D$ are positive. By Theorem 4.2 $A$ then has a unique Cholesky factorization $A = BB^*$, where the Cholesky factor $B$ is nonsingular.
4 ⇒ 1: This follows from Lemma 4.2.
Example 4.6 (Positive Definite Characterization) Consider the symmetric matrix $A := \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$.
1. We have $x^TAx = 2x_1^2 + 2x_2^2 + (x_1 + x_2)^2 > 0$ for all nonzero $x$, showing that $A$ is positive definite.
2. The eigenvalues of $A$ are $\lambda_1 = 2$ and $\lambda_2 = 4$. They are positive, showing that $A$ is positive definite since it is symmetric.
3. We find $\det(A_{[1]}) = 3$ and $\det(A_{[2]}) = 8$, showing again that $A$ is positive definite since it is also symmetric.
4. Finally $A$ is positive definite since by Example 4.2 we have
$$A = BB^*, \qquad B = \begin{bmatrix} 1 & 0 \\ 1/3 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{8/3} \end{bmatrix}.$$
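Criterion 3 of Theorem 4.4 can be turned into a small test for real symmetric matrices. The following Python sketch computes the leading principal minors by cofactor expansion. The function names are assumptions made for this illustration, and the expansion is only sensible for very small matrices since its cost grows factorially with the order.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def leading_minors(A):
    """Return [det(A_[1]), ..., det(A_[n])]."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """Test a real matrix: symmetric with all leading principal minors positive."""
    n = len(A)
    symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
    return symmetric and all(m > 0 for m in leading_minors(A))

A = [[3, 1], [1, 3]]                        # the matrix of Example 4.6
T = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # tridiag(-1, 2, -1) of order 3
print(leading_minors(A), is_positive_definite(A))
print(leading_minors(T), is_positive_definite(T))
```

The symmetry check matters: positive leading minors alone do not imply positive definiteness for a matrix that is not Hermitian.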