6.3 The Schur Factorization and Normal Matrices
We turn now to unitary similarity transformations $S^{-1}AS$, where $S = U$ is unitary. Thus $S^{-1} = U^*$, and a unitary similarity transformation takes the form $U^*AU$.
6.3.2 Unitary and Orthogonal Matrices
Although not every matrix can be diagonalized, every matrix can be brought into triangular form by a unitary similarity transformation.
Theorem 6.5 (Schur Factorization) For each $A \in \mathbb{C}^{n\times n}$ there exists a unitary matrix $U \in \mathbb{C}^{n\times n}$ such that $R := U^*AU$ is upper triangular.
The matrices $U$ and $R$ in the Schur factorization are called Schur factors. We call $A = URU^*$ the Schur factorization of $A$.
Proof We use induction on $n$. For $n = 1$ the matrix $U$ is the $1\times 1$ identity matrix.
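Assume that the theorem is true for all $k\times k$ matrices, and suppose $A \in \mathbb{C}^{n\times n}$, where $n := k+1$. Let $(\lambda_1, \mathbf{v}_1)$ be an eigenpair for $A$ with $\|\mathbf{v}_1\|_2 = 1$. By Theorem 5.5 we can extend $\mathbf{v}_1$ to an orthonormal basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ for $\mathbb{C}^n$. The matrix $V := [\mathbf{v}_1, \ldots, \mathbf{v}_n] \in \mathbb{C}^{n\times n}$ is unitary, and

$$V^*AV\mathbf{e}_1 = V^*A\mathbf{v}_1 = \lambda_1 V^*\mathbf{v}_1 = \lambda_1\mathbf{e}_1.$$

It follows that

$$V^*AV = \begin{bmatrix} \lambda_1 & \mathbf{x}^* \\ \mathbf{0} & M \end{bmatrix}, \quad \text{for some } M \in \mathbb{C}^{k\times k} \text{ and } \mathbf{x} \in \mathbb{C}^k. \qquad (6.8)$$

By the induction hypothesis there is a unitary matrix $W_1 \in \mathbb{C}^{(n-1)\times(n-1)}$ such that $W_1^*MW_1$ is upper triangular. Define

$$W = \begin{bmatrix} 1 & \mathbf{0}^* \\ \mathbf{0} & W_1 \end{bmatrix} \quad \text{and} \quad U = VW.$$

Then $W$ and $U$ are unitary and

$$U^*AU = W^*(V^*AV)W = \begin{bmatrix} 1 & \mathbf{0}^* \\ \mathbf{0} & W_1^* \end{bmatrix} \begin{bmatrix} \lambda_1 & \mathbf{x}^* \\ \mathbf{0} & M \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0}^* \\ \mathbf{0} & W_1 \end{bmatrix} = \begin{bmatrix} \lambda_1 & \mathbf{x}^*W_1 \\ \mathbf{0} & W_1^*MW_1 \end{bmatrix}$$

is upper triangular.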
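The construction in the proof can be traced in code for the $2\times 2$ case, where the induction consists of a single step. The following is a minimal sketch in plain Python, not a general or numerically robust Schur algorithm; the names `schur_2x2`, `matmul` and `conj_transpose` and the tolerance `1e-14` are assumptions made for illustration. The eigenvalue is taken from the characteristic polynomial, a unit eigenvector is extended to an orthonormal basis of $\mathbb{C}^2$, and $R = U^*AU$ is formed.

```python
import cmath

def conj_transpose(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def schur_2x2(A):
    """Return (U, R) with U unitary and R = U* A U upper triangular (2x2 only)."""
    a, b = complex(A[0][0]), complex(A[0][1])
    c, d = complex(A[1][0]), complex(A[1][1])
    # One eigenvalue lam1 from the characteristic polynomial z^2 - tr*z + det.
    tr, det = a + d, a * d - b * c
    lam1 = (tr + cmath.sqrt(tr * tr - 4 * det)) / 2
    # Two candidate eigenvectors for lam1; keep the one with the larger norm.
    candidates = [[b, lam1 - a], [lam1 - d, c]]
    v = max(candidates, key=lambda w: abs(w[0]) ** 2 + abs(w[1]) ** 2)
    norm = (abs(v[0]) ** 2 + abs(v[1]) ** 2) ** 0.5
    if norm < 1e-14:  # assumed tolerance: A is a multiple of the identity
        v, norm = [1.0 + 0j, 0j], 1.0
    v1 = [v[0] / norm, v[1] / norm]
    # Extend v1 to an orthonormal basis {v1, v2} of C^2.
    v2 = [-v1[1].conjugate(), v1[0].conjugate()]
    U = [[v1[0], v2[0]], [v1[1], v2[1]]]
    R = matmul(conj_transpose(U), matmul(A, U))
    return U, R

# A real matrix with eigenvalues i and -i: the Schur factor U is complex.
A = [[0, 1], [-1, 0]]
U, R = schur_2x2(A)
```

For this $A$ the entry `R[1][0]` vanishes up to rounding, and the diagonal of `R` holds the two eigenvalues.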
If $A$ has complex eigenvalues then $U$ will be complex even if $A$ is real. The following is a real version of Theorem 6.5.
Theorem 6.6 (Schur Form, Real Eigenvalues) For each $A \in \mathbb{R}^{n\times n}$ with real eigenvalues there exists an orthogonal matrix $U \in \mathbb{R}^{n\times n}$ such that $U^TAU$ is upper triangular.
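Proof Consider the proof of Theorem 6.5. Since $A$ and $\lambda_1$ are real, the eigenvector $\mathbf{v}_1$ can be chosen real, and then the matrix $V$ can be chosen real with $V^TV = I$. The matrix $M$ in (6.8) is then real with real eigenvalues, so by the induction hypothesis $W_1$, and hence $W$, is real with $W^TW = I$. But then $U = VW$ is also real and $U^TU = I$.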
A real matrix with some complex eigenvalues can only be reduced to block triangular form by a real unitary (that is, orthogonal) similarity transformation. We consider this in Sect. 6.3.5.
Example 6.5 (Deflation Example) By applying the unitary transformation $V$ to the $n\times n$ matrix $A$, we obtain a matrix $M$ of order $n-1$. The eigenvalues of $M$ are those of $A$ with one occurrence of $\lambda_1$ removed. Thus we can find another eigenvalue of $A$ by working with the smaller matrix $M$. This is an example of a deflation technique, which is very useful in numerical work. The second derivative matrix

$$T := \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

has an eigenpair $(2, \mathbf{x}_1)$, where $\mathbf{x}_1 = [-1, 0, 1]^T$. Find the remaining eigenvalues using deflation.
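For this we extend $\mathbf{x}_1$ to a basis $\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$ for $\mathbb{R}^3$ by defining $\mathbf{x}_2 = [0, 1, 0]^T$, $\mathbf{x}_3 = [1, 0, 1]^T$. This is already an orthogonal basis, and normalizing we obtain the orthogonal matrix

$$V = \begin{bmatrix} -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}.$$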
We obtain (6.8) with $\lambda_1 = 2$ and

$$M = \begin{bmatrix} 2 & -\sqrt{2} \\ -\sqrt{2} & 2 \end{bmatrix}.$$

We can now find the remaining eigenvalues of $T$ from the $2\times 2$ matrix $M$.
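The deflation step of Example 6.5 can be checked numerically. The sketch below, in plain Python, forms $V^TTV$, reads off the trailing $2\times 2$ block $M$, and obtains its eigenvalues from the characteristic polynomial of a $2\times 2$ matrix. The helper names `matmul` and `transpose` are illustrative and not part of the text.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

s = 1.0 / math.sqrt(2.0)
T = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
V = [[-s, 0.0, s],
     [0.0, 1.0, 0.0],
     [s, 0.0, s]]

# B = V^T T V has the block form (6.8): first column is lambda_1 * e_1.
B = matmul(transpose(V), matmul(T, V))

# M is the trailing 2x2 block of B.
M = [row[1:] for row in B[1:]]

# Eigenvalues of the 2x2 matrix M from z^2 - trace*z + det = 0.
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
root = math.sqrt(trace * trace - 4.0 * det)
remaining = sorted([(trace - root) / 2.0, (trace + root) / 2.0])
```

Up to rounding, `remaining` holds $2 - \sqrt{2}$ and $2 + \sqrt{2}$, which together with $\lambda_1 = 2$ are the three eigenvalues of $T$.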
6.3.3 Normal Matrices
A matrix $A \in \mathbb{C}^{n\times n}$ is normal if $A^*A = AA^*$. In this section we show that a matrix has a set of $n$ orthonormal eigenvectors if and only if it is normal.
Examples of normal matrices are
1. $A^* = A$ (Hermitian),
2. $A^* = -A$ (skew-Hermitian),
3. $A^* = A^{-1}$ (unitary),
4. $A = \operatorname{diag}(d_1, \ldots, d_n)$ (diagonal).
Clearly the matrices in 1, 2 and 3 are normal. If $A$ is diagonal then

$$A^*A = \operatorname{diag}(\overline{d_1}d_1, \ldots, \overline{d_n}d_n) = \operatorname{diag}(|d_1|^2, \ldots, |d_n|^2) = AA^*,$$

and $A$ is normal. The second derivative matrix $T$ in (2.27) is symmetric and therefore normal. The eigenvalues of a normal matrix can be complex (cf. Exercise 6.21). However, in the Hermitian case the eigenvalues are real (cf. Lemma 2.3).
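The defining condition $A^*A = AA^*$ is easy to test on small examples. The following sketch in plain Python checks one matrix of each of the four types listed above, together with a non-normal triangular matrix. The function name `is_normal`, the sample matrices and the tolerance are illustrative assumptions.

```python
def conj_transpose(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_normal(A, tol=1e-12):
    """Return True if A*A and AA* agree entrywise up to tol."""
    Ah = conj_transpose(A)
    left, right = matmul(Ah, A), matmul(A, Ah)
    n = len(A)
    return all(abs(left[i][j] - right[i][j]) <= tol
               for i in range(n) for j in range(n))

hermitian = [[2, 1j], [-1j, 3]]      # A* = A
skew_hermitian = [[1j, 2], [-2, 0]]  # A* = -A
unitary = [[0, 1], [-1, 0]]          # A* = A^{-1}
diagonal = [[1 + 1j, 0], [0, 2]]     # complex diagonal
jordan_block = [[1, 1], [0, 1]]      # upper triangular, not diagonal

results = [is_normal(A) for A in
           (hermitian, skew_hermitian, unitary, diagonal, jordan_block)]
```

The first four entries of `results` are `True` and the last is `False`, in line with the fact proved below that an upper triangular normal matrix must be diagonal.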
The following theorem shows that $A$ has a set of $n$ orthonormal eigenvectors if and only if it is normal.
Theorem 6.7 (Spectral Theorem for Normal Matrices) A matrix $A \in \mathbb{C}^{n\times n}$ is normal if and only if there exists a unitary matrix $U \in \mathbb{C}^{n\times n}$ such that $U^*AU = D$ is diagonal. If $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ and $U = [\mathbf{u}_1, \ldots, \mathbf{u}_n]$ then $(\lambda_j, \mathbf{u}_j)$, $j = 1, \ldots, n$, are orthonormal eigenpairs for $A$.
Proof If $B = U^*AU$, with $B$ diagonal and $U^*U = I$, then $A = UBU^*$ and

$$AA^* = (UBU^*)(UB^*U^*) = UBB^*U^*, \qquad A^*A = (UB^*U^*)(UBU^*) = UB^*BU^*.$$

Now $BB^* = B^*B$ since $B$ is diagonal, and $A$ is normal.

Conversely, suppose $A^*A = AA^*$. By Theorem 6.5 we can find $U$ with $U^*U = I$ such that $B := U^*AU$ is upper triangular. Since $A$ is normal, $B$ is normal. Indeed,

$$BB^* = U^*AUU^*A^*U = U^*AA^*U = U^*A^*AU = B^*B.$$
The proof is complete if we can show that an upper triangular normal matrix $B$ must be diagonal. The diagonal elements $e_{ii}$ in $E := B^*B$ and $f_{ii}$ in $F := BB^*$ are given by

$$e_{ii} = \sum_{k=1}^{n} \overline{b_{ki}}\,b_{ki} = \sum_{k=1}^{i} |b_{ki}|^2 \quad \text{and} \quad f_{ii} = \sum_{k=1}^{n} b_{ik}\,\overline{b_{ik}} = \sum_{k=i}^{n} |b_{ik}|^2.$$

The result now follows by equating $e_{ii}$ and $f_{ii}$ for $i = 1, 2, \ldots, n$. In particular, for $i = 1$ we have $|b_{11}|^2 = |b_{11}|^2 + |b_{12}|^2 + \cdots + |b_{1n}|^2$, so $b_{1k} = 0$ for $k = 2, 3, \ldots, n$.
Suppose $B$ is diagonal in its first $i-1$ rows, so that $b_{jk} = 0$ for $j = 1, \ldots, i-1$, $k = j+1, \ldots, n$. Then

$$e_{ii} = \sum_{k=1}^{i} |b_{ki}|^2 = |b_{ii}|^2 = \sum_{k=i}^{n} |b_{ik}|^2 = f_{ii},$$

and it follows that $b_{ik} = 0$, $k = i+1, \ldots, n$. By induction on the rows we see that $B$ is diagonal. The last part of the theorem follows from Sect. 6.1.1.
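Example 6.6 The orthogonal diagonalization of

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

is $U^TAU = \operatorname{diag}(1, 3)$, where

$$U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$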
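The diagonalization in Example 6.6 can be verified by hand, one column at a time. Writing $\mathbf{u}_1, \mathbf{u}_2$ for the columns of $U$ (notation introduced here only for this check):

```latex
A\mathbf{u}_1 = \frac{1}{\sqrt{2}}
  \begin{bmatrix} 2 - 1 \\ -1 + 2 \end{bmatrix}
  = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
  = 1 \cdot \mathbf{u}_1,
\qquad
A\mathbf{u}_2 = \frac{1}{\sqrt{2}}
  \begin{bmatrix} 2 + 1 \\ -1 - 2 \end{bmatrix}
  = \frac{1}{\sqrt{2}} \begin{bmatrix} 3 \\ -3 \end{bmatrix}
  = 3 \cdot \mathbf{u}_2,
\qquad
\mathbf{u}_1^T \mathbf{u}_2 = \tfrac{1}{2}(1 - 1) = 0.
```

Hence $AU = U\operatorname{diag}(1, 3)$, and since the columns of $U$ are orthonormal, $U^TAU = \operatorname{diag}(1, 3)$.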
6.3.4 The Rayleigh Quotient
The Rayleigh quotient is a useful tool when studying eigenvalues.
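Definition 6.4 (Rayleigh Quotient) For $A \in \mathbb{C}^{n\times n}$ and a nonzero $\mathbf{x} \in \mathbb{C}^n$ the number

$$R(\mathbf{x}) = R_A(\mathbf{x}) := \frac{\mathbf{x}^*A\mathbf{x}}{\mathbf{x}^*\mathbf{x}}$$

is called a Rayleigh quotient.
If $(\lambda, \mathbf{x})$ is an eigenpair for $A$ then $R(\mathbf{x}) = \frac{\mathbf{x}^*A\mathbf{x}}{\mathbf{x}^*\mathbf{x}} = \lambda$.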
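The definition translates directly into code. The following is a minimal sketch in plain Python; the function name `rayleigh_quotient` and the test matrix are illustrative choices, not taken from the text. It evaluates $\mathbf{x}^*A\mathbf{x}/\mathbf{x}^*\mathbf{x}$ for an eigenvector and for a vector that is not an eigenvector.

```python
def rayleigh_quotient(A, x):
    """Return x*Ax / x*x for a square matrix A (list of rows) and nonzero x."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    numerator = sum(complex(xi).conjugate() * yi for xi, yi in zip(x, Ax))
    denominator = sum(abs(xi) ** 2 for xi in x)
    if denominator == 0:
        raise ValueError("x must be nonzero")
    return numerator / denominator

A = [[2, -1], [-1, 2]]

# [1, -1] is an eigenvector of A with eigenvalue 3, so R(x) equals 3.
r_eigenvector = rayleigh_quotient(A, [1, -1])

# [1, 0] is not an eigenvector; R(x) lies between the eigenvalues 1 and 3.
r_other = rayleigh_quotient(A, [1, 0])
```

The value is returned as a complex number because $\mathbf{x}^*A\mathbf{x}$ is complex in general; for Hermitian $A$ its imaginary part is zero.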
Equation (6.9) in the following theorem shows that the Rayleigh quotient of a normal matrix is a convex combination of its eigenvalues.
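Theorem 6.8 (Convex Combination of the Eigenvalues) Suppose $A \in \mathbb{C}^{n\times n}$ is normal with orthonormal eigenpairs $(\lambda_j, \mathbf{u}_j)$, for $j = 1, 2, \ldots, n$. Then the Rayleigh quotient is a convex combination of the eigenvalues of $A$:

$$R_A(\mathbf{x}) = \frac{\sum_{i=1}^{n} \lambda_i |c_i|^2}{\sum_{j=1}^{n} |c_j|^2}, \qquad \mathbf{x} \neq \mathbf{0}, \quad \mathbf{x} = \sum_{j=1}^{n} c_j\mathbf{u}_j. \qquad (6.9)$$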
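Proof By orthonormality of the eigenvectors,

$$\mathbf{x}^*\mathbf{x} = \sum_{i=1}^{n}\sum_{j=1}^{n} \overline{c_i}\,\mathbf{u}_i^*\,c_j\mathbf{u}_j = \sum_{j=1}^{n} |c_j|^2.$$

Similarly,

$$\mathbf{x}^*A\mathbf{x} = \sum_{i=1}^{n}\sum_{j=1}^{n} \overline{c_i}\,\mathbf{u}_i^*\,c_j\lambda_j\mathbf{u}_j = \sum_{i=1}^{n} \lambda_i |c_i|^2,$$

and (6.9) follows. The weights $|c_i|^2 / \sum_{j=1}^{n} |c_j|^2$ are nonnegative and sum to one, so (6.9) is a convex combination of the eigenvalues.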
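As a small worked instance of (6.9), take the symmetric matrix with rows $[2, -1]$ and $[-1, 2]$, whose orthonormal eigenpairs are $(1, \mathbf{u}_1)$ and $(3, \mathbf{u}_2)$ with $\mathbf{u}_1 = [1, 1]^T/\sqrt{2}$ and $\mathbf{u}_2 = [1, -1]^T/\sqrt{2}$. The choice $\mathbf{x} = \mathbf{e}_1$ is made here only for illustration.

```latex
\mathbf{x} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
  = \tfrac{1}{\sqrt{2}}\,\mathbf{u}_1 + \tfrac{1}{\sqrt{2}}\,\mathbf{u}_2,
\qquad c_1 = c_2 = \tfrac{1}{\sqrt{2}},
\qquad
R_A(\mathbf{x})
  = \frac{1 \cdot \tfrac{1}{2} + 3 \cdot \tfrac{1}{2}}{\tfrac{1}{2} + \tfrac{1}{2}}
  = 2.
```

This agrees with the direct evaluation $\mathbf{x}^*A\mathbf{x}/\mathbf{x}^*\mathbf{x} = a_{11} = 2$, and the value lies midway between the eigenvalues $1$ and $3$ because the two weights are equal.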
6.3.5 The Quasi-Triangular Form
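How far can we reduce a real matrix $A$ with some complex eigenvalues by a real unitary similarity transformation? To study this we note that the complex eigenvalues of a real matrix occur in conjugate pairs, $\lambda = \mu + i\nu$, $\overline{\lambda} = \mu - i\nu$, where $\mu, \nu$ are real. The real $2\times 2$ matrix

$$M = \begin{bmatrix} \mu & \nu \\ -\nu & \mu \end{bmatrix} \qquad (6.10)$$

has eigenvalues $\lambda = \mu + i\nu$ and $\overline{\lambda} = \mu - i\nu$.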
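The eigenvalues of the matrix in (6.10) follow from its characteristic polynomial. A short derivation, using $z$ as the polynomial variable (a symbol chosen here to avoid a clash with $\lambda$):

```latex
\det(M - zI)
  = \det \begin{bmatrix} \mu - z & \nu \\ -\nu & \mu - z \end{bmatrix}
  = (\mu - z)^2 + \nu^2 = 0
\quad\Longleftrightarrow\quad
z - \mu = \pm i\nu
\quad\Longleftrightarrow\quad
z = \mu \pm i\nu .
```

When $\nu \neq 0$ the two roots are distinct and non-real, so this real $2\times 2$ block cannot be brought to triangular form by a real similarity transformation.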