DEPARTMENT OF MATHEMATICS
————oOo————
PHAM HUY HIEU
HERMITIAN MATRIX AND THE SCHUR-HORN THEOREM
DRAFT GRADUATION THESIS
HANOI, 05/2019
DEPARTMENT OF MATHEMATICS
————oOo————
DRAFT GRADUATION THESIS
HERMITIAN MATRIX AND THE SCHUR-HORN THEOREM
Supervisor: Dr. NGUYEN CHU GIA VUONG
HANOI, 05/2019
Before presenting the main content of the thesis, I would like to express my gratitude to the mathematics teachers of Hanoi Pedagogical University 2, the teachers in the algebra group, and all the teachers involved in my training, who have dedicated themselves to conveying valuable knowledge and to creating favourable conditions for me to successfully complete the course and this thesis.
In particular, I would like to express my deep respect and gratitude to Dr. Nguyen Chu Gia Vuong, who directly supervised and guided me so that I could complete this thesis.
Due to limited time, capacity, and conditions, the thesis cannot avoid errors. Therefore, I look forward to receiving valuable comments from teachers and friends.
Student
Pham Huy Hieu
In mathematics, and especially in linear algebra, we usually pay attention to matrices with real coefficients and rarely to complex matrices. In fact, complex matrices are very important, and in particular there is a type of matrix with complex coefficients, the Hermitian matrix. The eigenvalues and the main diagonal entries of such matrices are of special relevance, and the Schur-Horn theorem tells us the relationship between them. It has inspired investigations and substantial generalizations in the setting of symplectic geometry.
Contents

1 PRELIMINARIES
1.1 Eigenvalues and eigenvectors
1.2 Permutation matrix
1.3 Hermitian matrix
1.4 Unitary matrix
1.5 Bistochastic matrix and Majorization
1.6 Convex hull
1.7 Birkhoff polytope
2 THE SCHUR-HORN THEOREM
2.1 Schur-Horn theorem
2.2 Proof of the Schur-Horn theorem
3 Application
3.1 The Pythagorean Theorem in Finite Dimension
3.2 The Schur-Horn Theorem in the Finite Dimensional Case
1 PRELIMINARIES

1.1 Eigenvalues and eigenvectors

Definition 1.1.1. In linear algebra, an eigenvector or characteristic vector of a linear transformation is a non-zero vector that changes by only a scalar factor when that linear transformation is applied to it. More formally, if T is a linear transformation from a vector space V over a field F into itself and v is a vector in V that is not the zero vector, then v is an eigenvector of T if T(v) is a scalar multiple of v. This condition can be written as the equation

T(v) = λv,

where λ is a scalar in the field F, known as the eigenvalue, characteristic value, or characteristic root associated with the eigenvector v.

If the vector space V is finite-dimensional, then the linear transformation T can be represented as a square matrix A and the vector v by a column vector, rendering the above mapping as a matrix multiplication on the left-hand side and a scaling of the column vector on the right-hand side in the equation

Av = λv.
There is a direct correspondence between n × n square matrices and linear transformations from an n-dimensional vector space to itself, given any basis of the vector space. For this reason, it is equivalent to define eigenvalues and eigenvectors using either the language of matrices or the language of linear transformations.

Geometrically, an eigenvector corresponding to a real non-zero eigenvalue points in a direction that is stretched by the transformation, and the eigenvalue is the factor by which it is stretched. If the eigenvalue is negative, the direction is reversed.
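As a small numerical illustration (not part of the thesis; it uses numpy, and the matrix below is my own choice), the following sketch computes eigenpairs of a matrix and checks the defining relation Av = λv:

import numpy as np

# A symmetric 2x2 matrix whose eigenvectors are easy to picture geometrically.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors

for k in range(len(eigvals)):
    v, lam = eigvecs[:, k], eigvals[k]
    # Check the defining equation A v = lambda v (up to floating-point error).
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.4f}, eigenvector = {v}")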
1.2 Permutation matrix

The m × m permutation matrix P_π = (p_{ij}) obtained by permuting the columns of the identity matrix I_m, that is, for each i, p_{ij} = 1 if j = π(i) and 0 otherwise, will be referred to as the column representation. Since the entries in row i are all 0 except that a 1 appears in column π(i), we may write

P_π = \begin{pmatrix} e_{π(1)} \\ e_{π(2)} \\ \vdots \\ e_{π(m)} \end{pmatrix},

where e_j, a standard basis vector, denotes a row vector of length m with 1 in the jth position and 0 in every other position.
For example, consider the permutation matrix P_π corresponding to a permutation π of {1, 2, 3, 4, 5}. Observe that the jth column of the I_5 identity matrix then appears as the π(j)th column of P_π.
The other representation, obtained by permuting the rows of the identity matrix I_m, that is, for each j, p_{ij} = 1 if i = π(j) and 0 otherwise, will be referred to as the row representation.
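The two representations are easy to realize in code; here is a brief sketch (illustrative only, with a permutation of my own choosing rather than the thesis's 5 × 5 example):

import numpy as np

def perm_matrix_col(pi):
    """Column representation: p[i, j] = 1 exactly when j = pi(i) (0-indexed)."""
    m = len(pi)
    P = np.zeros((m, m), dtype=int)
    for i in range(m):
        P[i, pi[i]] = 1
    return P

def perm_matrix_row(pi):
    """Row representation: p[i, j] = 1 exactly when i = pi(j) (0-indexed)."""
    return perm_matrix_col(pi).T

pi = [2, 4, 1, 0, 3]            # an arbitrary permutation of {0, ..., 4}
P = perm_matrix_col(pi)
for i in range(len(pi)):
    # Row i of P is the standard basis row vector with the 1 in position pi(i).
    assert P[i, pi[i]] == 1 and P[i].sum() == 1
print(P)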
1.3 Hermitian matrix
Definition 1.3.1. In mathematics, a Hermitian matrix (or self-adjoint matrix) is a complex square matrix that is equal to its own conjugate transpose; that is, the element in the ith row and jth column is equal to the complex conjugate of the element in the jth row and ith column, for all indices i and j:

A Hermitian ⟺ a_{ij} = \overline{a_{ji}},

or, in matrix form,

A Hermitian ⟺ A = \overline{A^T} = A^*.

Hermitian matrices can be understood as the complex extension of real symmetric matrices.
Proposition 1.3.1. The eigenvalues of a Hermitian matrix are real.

Proof. Let 0 ≠ v ∈ C^n be an eigenvector of the Hermitian matrix A with eigenvalue λ. Then

v^*Av = λ v^*v = λ ‖v‖^2.

Taking the conjugate transpose of the previous equation shows that

v^*A^*v = \overline{λ} ‖v‖^2.

Since A^* = A, it follows that λ ‖v‖^2 = \overline{λ} ‖v‖^2, and since ‖v‖ ≠ 0 we get λ = \overline{λ}, so λ is a real number. ∎
Proposition 1.3.2. Eigenvectors v_1, v_2 of a Hermitian matrix A corresponding to different eigenvalues λ_1, λ_2 are orthogonal (i.e. ⟨v_2, v_1⟩ = 0).

Proof. We have

⟨Av_2, v_1⟩ = λ_2 ⟨v_2, v_1⟩,

and also, since A^* = A and λ_1 is real by Proposition 1.3.1,

⟨Av_2, v_1⟩ = ⟨v_2, Av_1⟩ = \overline{λ_1} ⟨v_2, v_1⟩ = λ_1 ⟨v_2, v_1⟩.

Hence (λ_1 − λ_2) ⟨v_2, v_1⟩ = 0. But λ_1 ≠ λ_2, so ⟨v_2, v_1⟩ = 0, as claimed. ∎
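Both propositions can be checked numerically; the following sketch (illustrative, not part of the proofs) builds a random Hermitian matrix and confirms that its spectrum is real and its eigenvectors are mutually orthogonal:

import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random complex matrix symmetrized into a Hermitian one: A = (M + M*)/2.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M + M.conj().T) / 2
assert np.allclose(A, A.conj().T)            # A is Hermitian

# eigh is specialized to Hermitian matrices and returns real eigenvalues.
eigvals, eigvecs = np.linalg.eigh(A)
assert np.all(np.isreal(eigvals))            # Proposition 1.3.1: real spectrum

# Proposition 1.3.2: eigenvectors for distinct eigenvalues are orthogonal;
# here eigh even returns an orthonormal basis, so V*V = I.
assert np.allclose(eigvecs.conj().T @ eigvecs, np.eye(n))
print("eigenvalues:", np.round(eigvals, 4))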
1.4 Unitary matrix

Definition 1.4.1 (Unitary matrix). In mathematics, a complex square matrix U is unitary if its conjugate transpose U^* is also its inverse, that is, if

U^*U = UU^* = I_n,

where I_n is the identity matrix.

A unitary matrix U has, among others, the following properties:

1) U is diagonalizable; that is, U is unitarily similar to a diagonal matrix, as a consequence of the spectral theorem. Thus U has a decomposition of the form U = VDV^*, where V is unitary and D is diagonal and unitary.
2) |det(U)| = 1.
3) Its eigenspaces are orthogonal.
4) U can be written as U = e^{iH}, where e indicates the matrix exponential, i is the imaginary unit, and H is a Hermitian matrix.
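These properties are also easy to verify numerically. The sketch below (an illustration, assuming scipy is available for the matrix exponential) builds a unitary matrix as U = e^{iH} from a Hermitian H and checks U^*U = I, |det U| = 1 and the spectral decomposition:

import numpy as np
from scipy.linalg import expm        # matrix exponential

rng = np.random.default_rng(1)
n = 3

# A Hermitian H; then U = exp(iH) is unitary (property 4 above).
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2
U = expm(1j * H)

assert np.allclose(U.conj().T @ U, np.eye(n))      # U*U = I
assert np.isclose(abs(np.linalg.det(U)), 1.0)      # |det U| = 1

# Spectral decomposition U = V D V^{-1} with eigenvalues on the unit circle.
eigvals, V = np.linalg.eig(U)
assert np.allclose(np.abs(eigvals), 1.0)
assert np.allclose(U, V @ np.diag(eigvals) @ np.linalg.inv(V))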
1.5 Bistochastic matrix and Majorization

Definition 1.5.1 (Bistochastic matrix). In mathematics, we call an n × n matrix A = (a_{ij}) bistochastic if A has nonnegative real entries and, in addition, Σ_{i=1}^n a_{ij} = 1 for all 1 ≤ j ≤ n and Σ_{j=1}^n a_{ij} = 1 for all 1 ≤ i ≤ n.

For vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in R^n whose coordinates are arranged in non-increasing order, we say that x is majorized by y, and write x ≺ y, if

Σ_{i=1}^k x_i ≤ Σ_{i=1}^k y_i for all 1 ≤ k ≤ n, and Σ_{i=1}^n x_i = Σ_{i=1}^n y_i.

(For arbitrary vectors, the relation is defined by first rearranging the coordinates of both vectors in non-increasing order.)

Example 1.5.2. If x_i ∈ [0, 1] and Σ_{i=1}^n x_i = 1, then we have

(1/n, ..., 1/n) ≺ (x_1, ..., x_n) ≺ (1, 0, ..., 0).
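The majorization relation is straightforward to test directly from the definition; here is a small helper (a sketch, with names of my own choosing) that sorts both vectors in decreasing order, compares partial sums, and requires equal totals:

import numpy as np

def majorizes(y, x, tol=1e-12):
    """Return True if x ≺ y, i.e. y majorizes x."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]    # decreasing rearrangement
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    if not np.isclose(xs.sum(), ys.sum(), atol=tol):
        return False                                   # totals must agree
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

# Example 1.5.2: (1/n, ..., 1/n) ≺ (x_1, ..., x_n) ≺ (1, 0, ..., 0).
x = np.array([0.5, 0.2, 0.3])
n = len(x)
assert majorizes(x, np.full(n, 1.0 / n))     # the uniform vector is majorized by x
assert majorizes(np.eye(n)[0], x)            # (1, 0, ..., 0) majorizes x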
Theorem 1.5.1. A matrix A ∈ M_n(R) is bistochastic if and only if Ax ≺ x for all x ∈ R^n.
Proof. For the implication (⇐), assume that Ax ≺ x for every vector x ∈ R^n. Since majorization preserves the sum of the coordinates,

Σ_{i=1}^n (Ax)_i = Σ_{i=1}^n x_i for every x ∈ R^n.

From the definition of majorization, if we choose x to be e_j, where e_j is the vector e_j = (0, ..., 0, 1, 0, ..., 0) with the 1 in the jth position, 1 ≤ j ≤ n, then Ae_j is the jth column of A and Ae_j ≺ e_j. Comparing sums gives Σ_{i=1}^n a_{ij} = 1 for every j; moreover, the largest entry of this column is at most 1, and the sum of its n − 1 largest entries is at most 1 while its total is 1, so every entry of A lies in [0, 1]. Next, choosing x = (1, ..., 1), the vector Ax = (Σ_{j=1}^n a_{1j}, ..., Σ_{j=1}^n a_{nj}) of row sums is majorized by (1, ..., 1). Then Σ_{j=1}^n a_{ij} = 1 for all i (since max_i {Σ_{j=1}^n a_{ij}} ≤ 1 and, the total being n, min_i {Σ_{j=1}^n a_{ij}} ≥ 1). Hence A is bistochastic.

For the other direction (⇒), let A be bistochastic and let y = Ax. To prove y ≺ x, we first show that we can assume x and y have their entries in non-increasing order; indeed, there are permutation matrices P and Q such that Px and Qy are non-increasing, and

Qy = QAx = (QAP^{-1})(Px) = B(Px),

where B = QAP^{-1} is bistochastic, since the permutation matrices are bistochastic and the product of bistochastic matrices is bistochastic. As majorization is unaffected by permuting coordinates, we may therefore assume from the start that x and y are non-increasing and that y = Bx with B = (b_{ji}) bistochastic.
For any k ∈ {1, ..., n} we have

Σ_{j=1}^k y_j = Σ_{j=1}^k Σ_{i=1}^n b_{ji} x_i.

Let s_i = Σ_{j=1}^k b_{ji}. Then 0 ≤ s_i ≤ 1, Σ_{i=1}^n s_i = k, and

Σ_{j=1}^k y_j − Σ_{i=1}^k x_i = Σ_{i=1}^n s_i x_i − Σ_{i=1}^k x_i

= Σ_{i=1}^n s_i x_i − Σ_{i=1}^k x_i + (k − Σ_{i=1}^n s_i) x_k, since k = Σ_{i=1}^n s_i,

= Σ_{i=1}^k s_i x_i + Σ_{i=k+1}^n s_i x_i − Σ_{i=1}^k x_i + Σ_{i=1}^k x_k − Σ_{i=1}^k s_i x_k − Σ_{i=k+1}^n s_i x_k

= − Σ_{i=1}^k (1 − s_i) x_i + Σ_{i=k+1}^n (x_i − x_k) s_i + Σ_{i=1}^k (1 − s_i) x_k

= Σ_{i=1}^k (s_i − 1)(x_i − x_k) + Σ_{i=k+1}^n (x_i − x_k) s_i

≤ 0,

since s_i − 1 ≤ 0 and x_i − x_k ≥ 0 for i ≤ k, while s_i ≥ 0 and x_i − x_k ≤ 0 for i > k.
So Σ_{j=1}^k y_j ≤ Σ_{j=1}^k x_j for all k. When k = n,

Σ_{j=1}^n y_j = Σ_{j=1}^n Σ_{i=1}^n b_{ji} x_i = (Σ_{j=1}^n b_{j1}) x_1 + ... + (Σ_{j=1}^n b_{jn}) x_n = Σ_{i=1}^n x_i,

since each column of B sums to 1. Hence y ≺ x, completing the proof. ∎
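Theorem 1.5.1 is easy to sanity-check numerically. The sketch below (assuming the majorizes helper defined after Example 1.5.2 above) builds a bistochastic matrix as a convex combination of permutation matrices and verifies Ax ≺ x for random vectors x:

import numpy as np

rng = np.random.default_rng(2)
n = 4

# A bistochastic matrix: a random convex combination of permutation matrices.
perms = [rng.permutation(n) for _ in range(6)]
weights = rng.random(6)
weights /= weights.sum()
A = sum(w * np.eye(n)[p] for w, p in zip(weights, perms))

assert np.allclose(A.sum(axis=0), 1) and np.allclose(A.sum(axis=1), 1)

for _ in range(100):
    x = rng.normal(size=n)
    assert majorizes(x, A @ x)    # Ax ≺ x, as Theorem 1.5.1 predicts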
1.6 Convex hull

In a real vector space, a convex combination of a given set of points is a linear combination of those points in which all coefficients are non-negative and sum to 1.
More formally, given a finite number of points x_1, x_2, ..., x_n in a real vector space, a convex combination of these points is a point of the form

Σ_{i=1}^n α_i x_i = α_1 x_1 + α_2 x_2 + ... + α_n x_n,

where the real numbers α_i satisfy α_i ≥ 0 and α_1 + α_2 + ... + α_n = 1. The convex hull of a set of points, written conv(·), is the set of all convex combinations of points from the set.
1.7 Birkhoff polytope

Definition 1.7.1 (Birkhoff polytope)
· B_n := conv(P(g) : g ∈ S_n) is the Birkhoff polytope, where P(g) denotes the n × n permutation matrix of g ∈ S_n.

Definition 1.7.2 (Permutation polytopes)
· G ≤ S_n, a subgroup, is a permutation group.
· P(G) := conv(P(g) : g ∈ G) is its permutation polytope.
Example 1.7.1
1. G = S_n ⇒ P(G) = B_n, the Birkhoff polytope.
2. G = ⟨(1 2), (3 4), ..., ((2d − 1) 2d)⟩, the subgroup of S_n generated by these d disjoint transpositions.
Definition 1.7.3 (Permutation polytope generated by x)
Suppose that n is a positive integer and x is a column vector in R^n. Write O_x for the orbit of x under the action of Π_n (the group of n × n permutation matrices) on R^n, that is, the set of all points of the form σx for σ ∈ Π_n. We call the convex hull of O_x the permutation polytope generated by x, denoted by P_x.
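To make Definition 1.7.3 concrete, the sketch below (illustrative only; it relies on scipy's linear-programming routine, not on anything from the thesis) tests whether a vector y lies in the permutation polytope P_x by searching for convex-combination weights over the orbit O_x:

import itertools
import numpy as np
from scipy.optimize import linprog

def in_permutation_polytope(y, x):
    """Check whether y lies in P_x = conv{sigma(x) : sigma a permutation}."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    orbit = np.array(list(itertools.permutations(x)))       # the orbit O_x
    k = len(orbit)
    # Find alpha >= 0 with sum(alpha) = 1 and sum_j alpha_j * orbit[j] = y.
    A_eq = np.vstack([orbit.T, np.ones((1, k))])
    b_eq = np.concatenate([y, [1.0]])
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * k, method="highs")
    return bool(res.success)

x = [3.0, 2.0, 1.0]
print(in_permutation_polytope([2.0, 2.0, 2.0], x))   # True: the barycentre of O_x
print(in_permutation_polytope([3.5, 1.5, 1.0], x))   # False: lies outside P_x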
2 THE SCHUR-HORN THEOREM

2.1 Schur-Horn theorem
Theorem 2.1.1. Let d_1, ..., d_n and λ_1, ..., λ_n be real numbers. If there is an n × n Hermitian matrix with diagonal entries d_1, ..., d_n and eigenvalues λ_1, ..., λ_n, then the vector (d_1, ..., d_n) lies in the convex hull of the set of vectors whose coordinates are all possible permutations of (λ_1, ..., λ_n). Conversely, if the vector (d'_1, ..., d'_n) lies in the convex hull of the set of vectors whose coordinates are all possible permutations of (λ'_1, ..., λ'_n), then there exists an n × n Hermitian matrix with diagonal entries d'_1, ..., d'_n and eigenvalues λ'_1, ..., λ'_n.
2.2 Proof of the Schur-Horn theorem

Our proof of the Schur-Horn theorem goes as follows. In 2.2.1, we show that the vector of diagonal entries of a Hermitian matrix can be written as the product of a bistochastic matrix with the vector of eigenvalues; we then finish the proof of this implication of the Schur-Horn theorem by applying the Birkhoff-von Neumann theorem, which characterizes the set of bistochastic matrices as the convex hull of the set of permutation matrices. In 2.2.2, we turn our attention toward the other implication of the Schur-Horn theorem. To prove the remaining implication we provide an algebraic characterization of the elements of the convex hull of the set of vectors whose coordinates are all possible permutations of a given vector.
Proof 2.2.1 (All diagonals lie in the permutation polytope)
"Let d_1, ..., d_n and λ_1, ..., λ_n be real numbers. If there is an n × n Hermitian matrix with diagonal entries d_1, ..., d_n and eigenvalues λ_1, ..., λ_n, then the vector (d_1, ..., d_n) lies in the convex hull of the set of vectors whose coordinates are all possible permutations of (λ_1, ..., λ_n)."

Let H = (h_{ij}) be such a Hermitian matrix. By the spectral theorem we can write H = UΛU^*, where U = (u_{ij}) is unitary and Λ = diag(λ_1, ..., λ_n). Put d = (h_{11}, ..., h_{nn}), the diagonal of H, and λ = (λ_1, ..., λ_n). Comparing diagonal entries gives

d_i = h_{ii} = Σ_{j=1}^n |u_{ij}|^2 λ_j,   that is,   d = Bλ   with B = (|u_{ij}|^2).

Note: since U is a unitary matrix, we have

Σ_{i=1}^n |u_{ij}|^2 = 1 for j = 1, ..., n,   and   Σ_{j=1}^n |u_{ij}|^2 = 1 for i = 1, ..., n,   (2.1)

so B is a bistochastic matrix.
Trang 19Following Birkhoff-von Neumann:” Any n × n bistochastic matrix lies
in the convex hull of the group of permutation matrices Πn ” so B is aconvex combination of permutation matrices Then by the definition ofthe action of Πn on Rn , we can see that Bλ is a convex combination
of elements of the orbit of λ, so bλ lies in the permutation polytopegenerated by λ Since Bλ = d, so d lies in the permutation polytopegenerated by λ
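The identity d = Bλ with B = (|u_{ij}|^2) can be confirmed numerically; here is a minimal sketch for a random Hermitian matrix:

import numpy as np

rng = np.random.default_rng(3)
n = 5

# A random Hermitian H and its spectral decomposition H = U diag(lam) U*.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2
lam, U = np.linalg.eigh(H)          # columns of U are orthonormal eigenvectors

B = np.abs(U) ** 2                  # B_{ij} = |u_{ij}|^2
d = np.diag(H).real                 # diagonal entries of H

# B is bistochastic because U is unitary: rows and columns of |u_{ij}|^2 sum to 1.
assert np.allclose(B.sum(axis=0), 1) and np.allclose(B.sum(axis=1), 1)
# The diagonal of H is the image of the eigenvalue vector under B.
assert np.allclose(B @ lam, d)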
Proof 2.2.2 (All elements of the permutation polytope are diagonals)
We prove this by describing the geometry of the permutation polytope through a few elementary algebraic operations which are much more manageable to deal with than the pure geometry. In particular, we show that we can move from one of the vertices of the permutation polytope generated by (λ_1, ..., λ_n) to any other vector in the permutation polytope by a finite sequence of algebraic operations. We then show that each of the vectors we get along the way is the diagonal of some Hermitian matrix with eigenvalues λ_1, ..., λ_n.

We introduce the following terminology, which is important in this key description of the permutation polytope.
Definition 2.2.2.1. We say that (x_1, ..., x_n) ∈ R^n is weakly decreasing if x_1 ≥ x_2 ≥ ... ≥ x_n.
Lemma 2.2.2.2. Suppose that x = (x_1, ..., x_n) and y = (y_1, ..., y_n) are weakly decreasing vectors in R^n and that Σ_{i=1}^n x_i = Σ_{i=1}^n y_i. Then the following are equivalent:
a) The vector y is in the permutation polytope P_x.
b) There are vectors v_1, ..., v_n such that v_1 = x, v_n = y, and for each 1 ≤ m < n there is a transposition matrix τ_m ∈ Π_n and a real number t_m ∈ [0, 1] such that

v_{m+1} = t_m v_m + (1 − t_m) τ_m v_m,

and for m > 1 the first m coordinates of v_{m+1} agree with the first m coordinates of y.
The proof of Lemma 2.2.2.2 is more technical than one would hope,
so we leave it for the appendix, but describe the main ideas of the proof here.
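To make the operation appearing in (b) concrete, here is a small sketch of a single step v ↦ t v + (1 − t) τ v; the example vectors and the name t_transform are my own choices for illustration, not the thesis's notation:

import numpy as np

def t_transform(v, i, j, t):
    """Apply v -> t*v + (1 - t)*tau*v, where tau swaps coordinates i and j."""
    w = np.array(v, dtype=float)
    w[i] = t * v[i] + (1 - t) * v[j]
    w[j] = t * v[j] + (1 - t) * v[i]
    return w

x = np.array([4.0, 2.0, 0.0])        # weakly decreasing
y = np.array([3.0, 2.0, 1.0])        # weakly decreasing, y ≺ x, same total

# One T-transform mixing the first and last coordinates already reaches y:
v2 = t_transform(x, 0, 2, 0.75)      # 0.75*(4,2,0) + 0.25*(0,2,4) = (3,2,1)
assert np.allclose(v2, y)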
Lemma 2.2.2.3. Suppose that d = (d_1, ..., d_n) occurs as the diagonal of an n × n Hermitian matrix H with eigenvalues λ_1, ..., λ_n. Then for any real number t ∈ [0, 1] and any transposition matrix τ ∈ Π_n, there exists a Hermitian matrix with eigenvalues λ_1, ..., λ_n and diagonal td + (1 − t)τd.

Proof. In the case n = 1 this is trivial, so suppose that n > 1. Write H = (h_{ij}) and suppose that τ ∈ Π_n transposes the kth and lth coordinates. The idea behind this proof is to construct a unitary matrix U so that UHU^* has td + (1 − t)τd as its diagonal. This matrix will have the same eigenvalues as H because we are simply changing basis. We find this matrix U by reducing to the case n = 2.

When n > 2, by conjugating H by an appropriate permutation matrix P we can assume, without loss of generality, that k = 1 and l = 2. Thus we have reduced to the case of finding a Hermitian matrix with diagonal (td_1 + (1 − t)d_2, td_2 + (1 − t)d_1, d_3, d_4, ..., d_n).
Let U be a 2 × 2 unitary matrix and consider the n × n block-diagonal unitary matrix

V = \begin{pmatrix} U & 0 \\ 0 & I_{n−2} \end{pmatrix}.

Conjugating H by V changes only the upper-left 2 × 2 block of H, so the diagonal of VHV^* consists of the diagonal of that 2 × 2 block followed by h_{33} = d_3, ..., h_{nn} = d_n. In light of this, the problem of finding a Hermitian matrix with diagonal (td_1 + (1 − t)d_2, td_2 + (1 − t)d_1, d_3, ..., d_n) reduces to the case n = 2.
Thus we may then assume that n = 2, so that H is a 2 × 2 Hermitian matrix with diagonal (d_1, d_2). Define a suitable complex number ξ and, in terms of ξ and t, a 2 × 2 matrix U; it is clear by the definitions of the complex number ξ and the entries of U that U is unitary. The matrix A = UHU^* has the same eigenvalues as H and, moreover, the diagonal of A is (td_1 + (1 − t)d_2, td_2 + (1 − t)d_1), as desired. ∎
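For the n = 2 case, one concrete construction (a phase followed by a real rotation; this is a sketch of one possible choice, not necessarily the thesis's ξ and U) can be checked numerically as follows:

import numpy as np

def mix_diagonal(H, t):
    """For 2x2 Hermitian H with diagonal (d1, d2) and t in [0, 1], return a
    unitary U such that U H U* has diagonal (t d1 + (1-t) d2, t d2 + (1-t) d1)."""
    h = H[0, 1]
    # A phase that makes the off-diagonal entry of D H D* purely imaginary,
    # so that the real rotation afterwards mixes only the diagonal entries.
    D = np.diag([1.0, np.exp(1j * (np.angle(h) - np.pi / 2))])
    c, s = np.sqrt(t), np.sqrt(1.0 - t)
    R = np.array([[c, -s], [s, c]])
    return R @ D

rng = np.random.default_rng(4)
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
H = (M + M.conj().T) / 2                       # a random 2x2 Hermitian matrix
d1, d2 = np.diag(H).real
t = 0.3

U = mix_diagonal(H, t)
A = U @ H @ U.conj().T
assert np.allclose(U.conj().T @ U, np.eye(2))                              # unitary
assert np.allclose(np.diag(A).real, [t*d1 + (1-t)*d2, t*d2 + (1-t)*d1])    # new diagonal
assert np.allclose(np.linalg.eigvalsh(A), np.linalg.eigvalsh(H))           # same spectrum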
Now we continue with the proof of the Schur-Horn theorem.
Proposition 2.2.1. Suppose that d = (d_1, ..., d_n) and λ = (λ_1, ..., λ_n) are vectors in R^n. If d lies in the permutation polytope P_λ, then there exists an n × n Hermitian matrix with diagonal d and eigenvalues λ_1, ..., λ_n.

Proof. Suppose that d lies in the permutation polytope generated by the vector λ. As remarked earlier, if d or λ is not weakly decreasing, we may replace it by a weakly decreasing element of its orbit, so, without loss of generality, assume that both d and λ are weakly decreasing. Then by the equivalence of (2.2.2.2 a) and (2.2.2.2 b) given in Lemma 2.2.2.2, there exist vectors v_1, ..., v_n ∈ P_λ with v_1 = λ, v_n = d and, for each integer 1 ≤ m < n,

v_{m+1} = t_m v_m + (1 − t_m) τ_m v_m    (∗1)

for some t_m ∈ [0, 1] and some transposition matrix τ_m.
Let V_1 denote the diagonal matrix with diagonal v_1 = λ. Since V_1 is Hermitian and the vectors v_k satisfy the relation (∗1), by repeated application of Lemma 2.2.2.3 we see that there are Hermitian matrices V_2, ..., V_n with diagonals v_2, ..., v_n, respectively, each with eigenvalues λ_1, ..., λ_n. Since v_n = d, this shows that V_n is a Hermitian matrix with diagonal d and eigenvalues λ_1, ..., λ_n, which proves the result. ∎
It is worthwhile to note that if the vector d which we started with was not weakly decreasing, a simple reordering of the basis, i.e. conjugating V_n by a permutation matrix, gives us a matrix whose diagonal is this (non-weakly-decreasing) vector. Moreover, since the property of being Hermitian is invariant under conjugation by a unitary matrix, and permutation matrices are, in particular, unitary matrices, this new matrix is Hermitian too.
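This last remark is simple to verify numerically; a short sketch (illustrative only):

import numpy as np

rng = np.random.default_rng(5)
n = 4

M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2                   # Hermitian, with some diagonal d
pi = rng.permutation(n)
P = np.eye(n)[pi]                          # a permutation matrix (rows of I permuted)

K = P @ H @ P.conj().T                     # conjugation by a (unitary) permutation matrix
assert np.allclose(K, K.conj().T)                                  # still Hermitian
assert np.allclose(np.diag(K), np.diag(H)[pi])                     # diagonal reordered
assert np.allclose(np.linalg.eigvalsh(K), np.linalg.eigvalsh(H))   # same eigenvalues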
Appendix: The proof of a technical result

Lemma 2.2.2.2. Suppose that x = (x_1, ..., x_n) and y = (y_1, ..., y_n) are weakly decreasing vectors in R^n and that Σ_{i=1}^n x_i = Σ_{i=1}^n y_i. Then the following are equivalent:
a) The vector y is in the permutation polytope P_x.
b) There are vectors v_1, ..., v_n such that v_1 = x, v_n = y, and for each 1 ≤ m < n there is a transposition matrix τ_m ∈ Π_n and a real number t_m ∈ [0, 1] such that

v_{m+1} = t_m v_m + (1 − t_m) τ_m v_m,

and for m > 1 the first m coordinates of v_{m+1} agree with the first m coordinates of y.
Proof
We first prove that (b) implies (a), so suppose that (b) holds.