GRAM–SCHMIDT ORTHONORMAL TRANSFORMATION
The Gram–Schmidt orthonormalization procedure was introduced in Section 4.3 in order to introduce the orthonormal transformation $F$ applied to the matrix $T$. The Gram–Schmidt orthonormalization procedure described there is called the classical Gram–Schmidt (CGS) orthogonalization procedure. The CGS procedure was developed in detail for the case $s = 3$, $m' = 2$, and then these results were extrapolated to the general case of arbitrary $s$ and $m'$. In this section we shall develop the CGS procedure in greater detail.

The CGS procedure is sensitive to computer round-off errors. There is a modified version of the Gram–Schmidt procedure that is far less sensitive to computer round-off errors. This is referred to as the modified Gram–Schmidt (MGS) procedure. After developing the general CGS results, we shall develop the MGS procedure. One might ask why explain the CGS procedure at all. It is better to start with a description of the CGS procedure because, first, it is simpler to explain; second, it makes it easier to obtain a physical feel for the Gram–Schmidt orthogonalization; and third, it provides a physical feel for its relationship to the Householder and in turn Givens transformations. Hence, we shall again start with a description of the CGS procedure.
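The round-off sensitivity mentioned above can be illustrated numerically. The following is only a sketch under stated assumptions: the MGS variant shown is the common textbook form (each coefficient is computed from the partially reduced vector rather than from the original column), which this section has not yet defined, and the three nearly parallel test vectors and the helper names `cgs`, `mgs`, and `cosine` are hypothetical choices made for the demonstration.

```python
# Hypothetical illustration: classical vs. modified Gram-Schmidt applied to
# three nearly parallel vectors, in double-precision arithmetic.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return y + alpha * x."""
    return [yi + alpha * xi for xi, yi in zip(x, y)]

def cgs(cols):
    """Classical form: every coefficient uses the ORIGINAL column t_j."""
    qs = []
    for t in cols:
        q = list(t)
        for p in qs:
            r = dot(p, t) / dot(p, p)
            q = axpy(-r, p, q)
        qs.append(q)
    return qs

def mgs(cols):
    """Assumed modified form: coefficients use the partially reduced vector."""
    qs = []
    for t in cols:
        q = list(t)
        for p in qs:
            r = dot(p, q) / dot(p, p)
            q = axpy(-r, p, q)
        qs.append(q)
    return qs

def cosine(a, b):
    return dot(a, b) / (dot(a, a) * dot(b, b)) ** 0.5

eps = 1e-8  # small enough that 1 + eps**2 rounds to 1 in double precision
cols = [[1.0, eps, 0.0, 0.0], [1.0, 0.0, eps, 0.0], [1.0, 0.0, 0.0, eps]]

cos_cgs = cosine(*cgs(cols)[1:])  # cosine of angle between 2nd and 3rd vectors
cos_mgs = cosine(*mgs(cols)[1:])
print("CGS cosine:", cos_cgs)
print("MGS cosine:", cos_mgs)
```

For this particular input the second and third CGS output vectors end up far from orthogonal, while the MGS outputs remain orthogonal to working precision. The example is deliberately extreme; for well-conditioned columns the two variants agree closely.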
As described in Section 4.3, starting with the $m'+1$ vectors $t_1, t_2, \ldots, t_{m'+1}$, we transform these to $m'+1$ orthogonal vectors, which we designate as $q'_1, q'_2, \ldots, q'_{m'+1}$. Having this orthogonal set, the desired orthonormal set of Section 4.3 (see Figure 4.3-1) can be obtained by dividing $q'_i$ by its magnitude $\| q'_i \|$ to form $q_i$. We start by picking the first vector $q'_1$ equal to $t_1$,
Copyright © 1998 John Wiley & Sons, Inc. ISBNs: 0-471-18407-1 (Hardback); 0-471-22419-7 (Electronic)
that is,

$$ q'_1 = t_1 \tag{13.1-1} $$
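At this point the matrix

$$ T_0 = [\, t_1 \;\; t_2 \;\; \cdots \;\; t_{m'+1} \,] \tag{13.1-2} $$

can be thought of as being transformed to the matrix

$$ T_0^{(1)} = [\, q'_1 \;\; t_2 \;\; \cdots \;\; t_{m'+1} \,] \tag{13.1-3} $$

Next $q'_2$ is formed by making it equal to $t_2$ less its component along the direction of $q'_1$. From (4.2-41) it follows that the vector component of $t_2$ along $t_1$, or equivalently $q'_1$, is given by

$$ t_{2c} = \frac{(q_1^{\prime T} t_2)\, q'_1}{q_1^{\prime T} q'_1} \tag{13.1-4} $$

Let

$$ r'_{12} = \frac{q_1^{\prime T} t_2}{q_1^{\prime T} q'_1} \tag{13.1-5} $$

Then (13.1-4) becomes

$$ t_{2c} = r'_{12}\, q'_1 \tag{13.1-6} $$

In turn

$$ q'_2 = t_2 - t_{2c} \tag{13.1-7} $$

or

$$ q'_2 = t_2 - r'_{12}\, q'_1 \tag{13.1-8} $$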
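As a small numerical check of this first projection step, the sketch below forms the coefficient $r'_{12}$, the component $t_{2c}$, and the vector $q'_2$ for two three-element vectors. The numerical values of `t1` and `t2` and the variable names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical two-vector example of the first Gram-Schmidt projection step.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

t1 = [3.0, 4.0, 0.0]
t2 = [1.0, 2.0, 2.0]

q1p = t1                                  # q'_1 = t_1
r12 = dot(q1p, t2) / dot(q1p, q1p)        # coefficient r'_12
t2c = [r12 * x for x in q1p]              # component of t_2 along q'_1
q2p = [a - b for a, b in zip(t2, t2c)]    # q'_2 = t_2 - r'_12 q'_1

residual = dot(q1p, q2p)                  # should vanish up to round-off
print("r'_12 =", r12)
print("q'_2  =", q2p)
print("q'_1 . q'_2 =", residual)
```

Here $r'_{12} = 11/25 = 0.44$, and the inner product of $q'_1$ and $q'_2$ is zero to within round-off, as required.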
At this point $T_0$ can be thought of as being transformed to

$$ T_0^{(2)} = [\, q'_1 \;\; q'_2 \;\; t_3 \;\; \cdots \;\; t_{m'+1} \,] \tag{13.1-9} $$

Next $q'_3$ is formed by making it equal to $t_3$ less the sum of its two components along the directions of the first two orthogonal vectors $q'_1$ and $q'_2$. Using (4.2-41) again yields that the component of $t_3$ along $q'_1$ and $q'_2$ is given by

$$ t_{3c} = \frac{(q_1^{\prime T} t_3)\, q'_1}{q_1^{\prime T} q'_1} + \frac{(q_2^{\prime T} t_3)\, q'_2}{q_2^{\prime T} q'_2} \tag{13.1-10} $$
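In general let

$$ r'_{ij} = \frac{q_i^{\prime T} t_j}{q_i^{\prime T} q'_i} \tag{13.1-11} $$

for $j > i \ge 1$. Then (13.1-10) can be written as

$$ t_{3c} = r'_{13}\, q'_1 + r'_{23}\, q'_2 \tag{13.1-12} $$

In turn $q'_3$ becomes

$$ q'_3 = t_3 - t_{3c} \tag{13.1-13} $$

or

$$ q'_3 = t_3 - r'_{13}\, q'_1 - r'_{23}\, q'_2 \tag{13.1-14} $$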
At this point $T_0$ can be thought of as being transformed to

$$ T_0^{(3)} = [\, q'_1 \;\; q'_2 \;\; q'_3 \;\; t_4 \;\; \cdots \;\; t_{m'+1} \,] \tag{13.1-15} $$

The formation of $q'_4, q'_5, \ldots, q'_{m'+1}$ follows the same procedure as used above to form $q'_1$, $q'_2$, and $q'_3$. As a result it is easy to show that, for $j > 2$,

$$ q'_j = t_j - \sum_{i=1}^{j-1} r'_{ij}\, q'_i \tag{13.1-16} $$

and after the $(m'+1)$st step $T_0$ is given by

$$ T_0^{(m'+1)} = [\, q'_1 \;\; q'_2 \;\; \cdots \;\; q'_{m'+1} \,] = Q' \tag{13.1-17} $$

For simplicity, for now let us assume $m' = 2$. Doing this makes it easier to develop the results we seek. The form of the results obtained using $m' = 2$ applies to the general case of arbitrary $m'$, to which the results can be easily generalized. From (13.1-1), (13.1-8), and (13.1-14) it follows that

$$ t_1 = q'_1 \tag{13.1-18a} $$

$$ t_2 = r'_{12}\, q'_1 + q'_2 \tag{13.1-18b} $$

$$ t_3 = r'_{13}\, q'_1 + r'_{23}\, q'_2 + q'_3 \tag{13.1-18c} $$

We can rewrite the above equations in matrix form as

$$ [\, t_1 \;\; t_2 \;\; t_3 \,] = [\, q'_1 \;\; q'_2 \;\; q'_3 \,] \begin{bmatrix} 1 & r'_{12} & r'_{13} \\ 0 & 1 & r'_{23} \\ 0 & 0 & 1 \end{bmatrix} \tag{13.1-19} $$
where it should be emphasized that the matrix entries $t_i$ and $q'_i$ are themselves column matrices. In turn, the above can be rewritten as

$$ T_0 = Q' R' \tag{13.1-20} $$

where $T_0$ is given by (13.1-2), $Q'$ is given by (13.1-17), and for $m' = 2$

$$ R' = \begin{bmatrix} 1 & r'_{12} & r'_{13} \\ 0 & 1 & r'_{23} \\ 0 & 0 & 1 \end{bmatrix} \tag{13.1-21} $$
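The unnormalized CGS steps developed so far can be sketched in a few lines of code. The function name `cgs_unnormalized`, the list-of-columns data layout, and the $3 \times 3$ test matrix below are assumptions made for illustration; the sketch also assumes the input columns are linearly independent. It returns the orthogonal vectors $q'_j$ and the unit upper triangular coefficient matrix $R'$, and then rebuilds each $t_j$ from them to check the factorization $T_0 = Q'R'$.

```python
# Minimal sketch of the unnormalized classical Gram-Schmidt recursion.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cgs_unnormalized(cols):
    """Return (q_primes, r_prime); r_prime is unit upper triangular."""
    n = len(cols)
    q_primes = []
    r_prime = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j, t in enumerate(cols):
        q = list(t)
        for i, p in enumerate(q_primes):
            # coefficient r'_ij, computed from the ORIGINAL column t_j
            r_prime[i][j] = dot(p, t) / dot(p, p)
            # remove the component of t_j along q'_i
            q = [qk - r_prime[i][j] * pk for qk, pk in zip(q, p)]
        q_primes.append(q)
    return q_primes, r_prime

cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # t_1, t_2, t_3
q_primes, r_prime = cgs_unnormalized(cols)

# Rebuild t_j as the weighted sum of the q'_i using column j of R'.
n, s = len(cols), len(cols[0])
rebuilt = [[sum(r_prime[i][j] * q_primes[i][k] for i in range(n))
            for k in range(s)] for j in range(n)]
max_err = max(abs(rebuilt[j][k] - cols[j][k])
              for j in range(n) for k in range(s))
print("R' =", r_prime)
print("max reconstruction error =", max_err)
```

For these columns the coefficients are $r'_{12} = r'_{13} = 1/2$ and $r'_{23} = 1/3$, and the reconstruction error is at round-off level.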
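We can orthonormalize $Q'$ by dividing each orthogonal vector $q'_i$ by its magnitude $\| q'_i \|$ to form the unit vector $q_i$,

$$ q_i = \frac{q'_i}{\| q'_i \|} \tag{13.1-22} $$

and in turn obtain

$$ Q = [\, q_1 \;\; q_2 \;\; \cdots \;\; q_{m'+1} \,] \tag{13.1-23} $$

Let the magnitudes of the $q'_i$ be given by the diagonal matrix

$$ D_0 = \mathrm{Diag}\,[\, \| q'_1 \|,\; \| q'_2 \|,\; \ldots,\; \| q'_{m'+1} \| \,] \tag{13.1-24} $$

It is easy to see that $Q$ is obtained by postmultiplying $Q'$ by $D_0^{-1}$. Thus

$$ Q = Q' D_0^{-1} \tag{13.1-25} $$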
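Using (13.1-24) it follows that (13.1-20) can be written as

$$ T_0 = Q' D_0^{-1} D_0 R' \tag{13.1-26} $$

which on using (13.1-25) becomes

$$ T_0 = Q R \tag{13.1-27} $$

where

$$ R = D_0 R' \tag{13.1-27a} $$

Substituting (13.1-24) and (13.1-21) into (13.1-27a) yields

$$ R = \begin{bmatrix} \| q'_1 \| & r''_{12} & r''_{13} \\ 0 & \| q'_2 \| & r''_{23} \\ 0 & 0 & \| q'_3 \| \end{bmatrix} \tag{13.1-28} $$

where

$$ r''_{ij} = r'_{ij}\, \| q'_i \| \tag{13.1-28a} $$

Using (13.1-11) yields

$$ r''_{ij} = \frac{q_i^{\prime T} t_j}{q_i^{\prime T} q'_i}\, \| q'_i \| \tag{13.1-29} $$

which becomes

$$ r''_{ij} = q_i^T t_j \tag{13.1-30} $$

for $j > i \ge 1$.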
Multiplying both sides of (13.1-27) by $Q^T$ yields

$$ Q^T T_0 = R \tag{13.1-31} $$

where use was made of the fact that

$$ Q^T Q = I \tag{13.1-32} $$

which follows because the columns of $Q$ are orthonormal. In the above $Q$ is an $s \times (m'+1)$ matrix and $I$ is the $(m'+1) \times (m'+1)$ identity matrix. Because also the transformed matrix $R$ is upper triangular, see (13.1-28), we obtain the very important result that $Q^T$ is the desired orthonormal transformation matrix $F$ of (10.2-8) or equivalently of (12.2-6) for the matrix $T_0$, that is,

$$ F = Q^T \tag{13.1-33} $$
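The normalized form of the procedure can be checked numerically in the same spirit. The sketch below is a hypothetical illustration (the name `cgs_qr` and the $4 \times 3$ test matrix are assumptions, and the columns are assumed linearly independent). It builds the orthonormal vectors $q_i$ together with the upper triangular $R$, using the inner products $q_i^T t_j$ for the off-diagonal entries and the magnitudes $\| q'_j \|$ on the diagonal, and then verifies that the $q_i$ are orthonormal and that applying $Q^T$ to $T_0$ reproduces $R$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cgs_qr(cols):
    """Return (qs, R): orthonormal vectors q_i and upper triangular R."""
    n = len(cols)
    qs = []
    R = [[0.0] * n for _ in range(n)]
    for j, t in enumerate(cols):
        q = list(t)
        for i, u in enumerate(qs):
            R[i][j] = dot(u, t)                 # r''_ij = q_i^T t_j
            q = [qk - R[i][j] * uk for qk, uk in zip(q, u)]
        R[j][j] = math.sqrt(dot(q, q))          # magnitude of q'_j
        qs.append([qk / R[j][j] for qk in q])   # q_j = q'_j / ||q'_j||
    return qs, R

# Hypothetical example with s = 4 rows and m' + 1 = 3 columns.
cols = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
qs, R = cgs_qr(cols)
n = len(cols)

gram = [[dot(qs[i], qs[j]) for j in range(n)] for i in range(n)]    # Q^T Q
FT0 = [[dot(qs[i], cols[j]) for j in range(n)] for i in range(n)]   # Q^T T_0

print("Q^T Q   =", gram)
print("Q^T T_0 =", FT0)
print("R       =", R)
```

Up to round-off, `gram` is the identity and `FT0` coincides with `R`, including the zeros below the diagonal.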
Strictly speaking $Q^T$ should be a square matrix to obey all the properties of an orthonormal matrix $F$ given by (10.2-1) to (10.2-3). In fact $Q$ is an $s \times (m'+1)$ matrix, so that $Q^T$ is $(m'+1) \times s$. This problem can be readily remedied by augmenting $Q^T$ to include the unit vectors $q_{m'+2}$ to $q_s$ when $s > m'+1$, where these are orthonormal vectors in the $s$-dimensional hyperspace of the matrix $T_0$. These vectors are orthogonal to the $(m'+1)$-dimensional column space of $T_0$ spanned by the $m'+1$ vectors $q_1$ to $q_{m'+1}$. Thus to form $F$, $Q$ of (13.1-23) is augmented to become

$$ Q = [\, q_1 \;\; q_2 \;\; \cdots \;\; q_{m'+1} \;\; q_{m'+2} \;\; \cdots \;\; q_s \,] \tag{13.1-34} $$

The matrix $Q'$ is similarly augmented to include the vectors $q'_{m'+2}$ to $q'_s$. Also the matrix $D_0$ is augmented to include the corresponding diagonal terms. It shall be apparent shortly, if it is not already apparent, that the vectors $q_{m'+2}$ to $q_s$ actually do not have to be determined in applying the Gram–Schmidt orthonormalization procedures. Moreover, the matrices $Q$, $Q'$, and $D_0$ do not in fact have to be augmented.
For arbitrary $m'$ and $s > m'$, it is now easy to show that $R$ of (13.1-28) becomes (12.2-6) with $Y'_3 = 0$. Hence, in general, (13.1-31) becomes the desired upper triangular form given by (12.2-6) with $Y'_3 = 0$, that is,

$$ R = Q^T T_0 = \left[ \begin{array}{c|c} U & Y'_1 \\ \hline 0 & Y'_2 \\ \hline 0 & 0 \end{array} \right] \begin{array}{l} \}\; m' \\ \}\; 1 \\ \}\; s - m' - 1 \end{array} \tag{13.1-35} $$

where the two column blocks are $m'$ columns and 1 column wide, respectively.
For our $m' = 2$ example of (13.1-28) and Section 4.3 [see (4.3-12) and (4.3-24) to (4.3-29a)]

$$ R = \begin{bmatrix} \| q'_1 \| & r''_{12} & r''_{13} \\ 0 & \| q'_2 \| & r''_{23} \\ 0 & 0 & \| q'_3 \| \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} & y'_1 \\ 0 & u_{22} & y'_2 \\ 0 & 0 & y'_3 \end{bmatrix} $$

Because the bottom $s - m' - 1$ rows of $R$ in (13.1-35) are zero, these rows can be dropped from $R$ above. This would be achieved if we did not augment $Q$. However, even though in carrying out the Gram–Schmidt procedure we do not have to augment $Q$, $Q'$, and $D_0$, for pedagogic reasons and to be consistent with our presentations in Section 4.3 and Chapters 10 to 12, in the following we shall consider these matrices as augmented.
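The statement that the rows contributed by the augmenting vectors are zero can be checked with a short sketch. The text does not say how the augmenting vectors would be chosen; here, purely as an assumption for illustration, a trial vector lying outside the column space of $T_0$ is appended and orthonormalized against $q_1$ to $q_{m'+1}$. The function name `cgs_rows`, the test columns, and the trial vector are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cgs_rows(vectors):
    """Orthonormalize the given vectors in order (classical Gram-Schmidt)."""
    rows = []
    for t in vectors:
        q = list(t)
        for u in rows:
            r = dot(u, t)
            q = [qk - r * uk for qk, uk in zip(q, u)]
        norm = math.sqrt(dot(q, q))
        rows.append([qk / norm for qk in q])
    return rows

# s = 4, m' + 1 = 3: the three columns of T_0.
cols = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]

# Assumed trial vector outside the column space, used only to build q_4.
trial = [0.0, 0.0, 0.0, 1.0]

rows = cgs_rows(cols + [trial])
q4 = rows[3]

# Row 4 of the augmented Q^T T_0: inner products of q_4 with each column.
bottom_row = [dot(q4, t) for t in cols]
print("q_4 =", q4)
print("bottom row of augmented Q^T T_0 =", bottom_row)
```

The bottom row is zero to within round-off, which is why the augmenting vectors never need to be computed when only $R$ is wanted.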
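From (13.1-27a)

$$ R' = D_0^{-1} R $$

Thus, we have that, in general, for arbitrary $s > m'$,

$$ R' = \left[ \begin{array}{c|c} U' & Y''_1 \\ \hline 0 & 1 \\ \hline 0 & 0 \end{array} \right] \begin{array}{l} \}\; m' \\ \}\; 1 \\ \}\; s - m' - 1 \end{array} \tag{13.1-39} $$

where the two column blocks are $m'$ columns and 1 column wide, respectively,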
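with $U' = D^{-1} U$ and $Y''_1 = D^{-1} Y'_1$, and $D$ is the unaugmented $D_0$ of (13.1-24) without its $(m'+1)$st entry, that is,

$$ D = \mathrm{Diag}\,[\, \| q'_1 \|,\; \| q'_2 \|,\; \ldots,\; \| q'_{m'} \| \,] \tag{13.1-40} $$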
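As a worked check of the relation between $R'$ and $R$ for the $m' = 2$ example, the product can be written out explicitly. This is only a restatement of (13.1-27a), (13.1-28), and (13.1-28a), not a new result:

```latex
R' = D_0^{-1} R
   = \begin{bmatrix}
       \dfrac{1}{\| q'_1 \|} & 0 & 0 \\
       0 & \dfrac{1}{\| q'_2 \|} & 0 \\
       0 & 0 & \dfrac{1}{\| q'_3 \|}
     \end{bmatrix}
     \begin{bmatrix}
       \| q'_1 \| & r''_{12} & r''_{13} \\
       0 & \| q'_2 \| & r''_{23} \\
       0 & 0 & \| q'_3 \|
     \end{bmatrix}
   = \begin{bmatrix}
       1 & \dfrac{r''_{12}}{\| q'_1 \|} & \dfrac{r''_{13}}{\| q'_1 \|} \\
       0 & 1 & \dfrac{r''_{23}}{\| q'_2 \|} \\
       0 & 0 & 1
     \end{bmatrix}
   = \begin{bmatrix}
       1 & r'_{12} & r'_{13} \\
       0 & 1 & r'_{23} \\
       0 & 0 & 1
     \end{bmatrix}
```

The last step uses $r''_{ij} = r'_{ij} \| q'_i \|$. In the block notation of (13.1-39), the upper-left $2 \times 2$ block is $U'$ and the first two entries of the last column form $Y''_1$.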
On examining the above CGS procedure, we see that in the process of transforming the columns of $T_0$ into an orthonormal set $Q$, we simultaneously generate the desired upper triangular matrix given by (12.2-6) with $Y'_3 = 0$, that is, (13.1-35) or equivalently (4.3-60). For instance, for our $m' = 2$ example, (13.1-28) is obtained using the $r''_{ij}$ and $\| q'_i \|$ terms needed to orthonormalize $T_0$. It now remains to give a physical explanation for why this happens. How is the orthonormal matrix $Q$ related to the Householder transformations and in turn the simple Givens transformation? Finally, how are the elements of $R$ given by (13.1-35) obtained using the CGS procedure related to those of (12.2-6) obtained using the Householder or Givens transformation?
The answer is that the orthonormal transform matrix $F$ obtained using the CGS procedure is identical to those obtained using the Givens and Householder transformations. Thus all three methods will give rise to identical transformed augmented matrices $T'_0 = F T_0$. This follows from the uniqueness of the orthonormal set of transformation vectors $f_i$ of $F$ needed to put $T'_0$ in the upper triangular form. Putting $T'_0$ in upper triangular form causes the solution to be in the Gauss elimination form. This form is not unique in general but is unique if an orthonormal transformation $F$ is used, with one exception: some of the unit row vectors of $F$ may be chosen to have opposite directions, in which case the corresponding rows of the transformed matrix $T'_0$ will have opposite signs. (If the entries of $T'_0$ can be complex numbers, then the unit vectors of $F$ can differ by an arbitrary phase.) Also the identicalness of the $F$'s for the three transforms applies only to the first $m'+1$ rows. The remaining $s - m' - 1$ rows need only be orthonormal to one another and to the first $m'+1$ rows.
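The sign ambiguity just described can be seen in a small sketch. Reversing the direction of one unit row vector of $F$ leaves the transformed matrix upper triangular and simply negates the corresponding row. The helper names and the $3 \times 3$ test columns below are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cgs_rows(cols):
    """Rows of F = Q^T from classical Gram-Schmidt (independent columns assumed)."""
    rows = []
    for t in cols:
        q = list(t)
        for u in rows:
            r = dot(u, t)
            q = [qk - r * uk for qk, uk in zip(q, u)]
        norm = math.sqrt(dot(q, q))
        rows.append([qk / norm for qk in q])
    return rows

def transform(F, cols):
    """Entries of F T_0: element (i, j) is the inner product of f_i and t_j."""
    return [[dot(f, t) for t in cols] for f in F]

cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
F = cgs_rows(cols)

# Reverse the direction of the second unit row vector of F.
F_flipped = [F[0], [-x for x in F[1]], F[2]]

R = transform(F, cols)
R_flipped = transform(F_flipped, cols)
print("F T_0 (original)   =", R)
print("F T_0 (row 2 flip) =", R_flipped)
```

Both results are upper triangular to within round-off; they differ only in the sign of the second row.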
Let us now explain what is physically happening with the CGS orthogonalization procedure. To do this we will relate the CGS orthogonalization procedure to the Householder transformation. For the first Householder transformation $H_1$, the first row unit vector $h_1$ is chosen to line up in the direction of the vector $t_1$ of the first column of $T_0$; see the beginning of Chapter 12. This is exactly the way $q_1$ is chosen for the CGS procedure; see (13.1-1) and (4.3-1) and Figure 4.3-1. Next, for the second Householder transformation $H_2$ [see (12.2-1)], the second row unit vector $(h_2)_2$ is chosen to be orthonormal to $h_1$ of $H_1$ and in the plane formed by the vectors $t_1$ and $t_2$ or equivalently $h_1$ and $t_2$. Again, this is exactly how $q'_2$, and equivalently $q_2$, is picked in the CGS method; see (13.1-8) and the related discussion immediately before it. That $(h_2)_2$ is in the plane of $h_1$ and $t_2$ follows from the fact that the transformation $H_2$ leads to the second column of $H_2 H_1 T_0$ having only its top two elements nonzero. This means that $t_2$ has only components along $h_1$ and $(h_2)_2$, the unit vector directions for which the first two elements of the second column of $H_2 H_1 T_0$ give the coordinates. The unit vectors $h_1$ and $(h_2)_2$ become the unit vectors for the first two rows of $F$ formed from the Householder transformations as given by (12.2-5). This follows from the form of $H_i$ as given in (12.2-2). We see that $H_i$ for $i \ge 2$ has an identity matrix for its upper-left-hand corner matrix. Hence $h_1$ of $H_1$ and $(h_2)_2$ of $H_2$ are not affected in the product that forms $F$ from (12.2-5). As a result, the projections of $t_1, t_2, \ldots, t_s$ onto $h_1$ are not affected by the ensuing Householder transformations $H_2, \ldots, H_{m+1}$.

It still remains to verify that $h_1$ and $(h_2)_2$ are orthogonal. The unit vector $h_1$ is along the vector $t_1$ of $T_0$. The unit vector $(h_2)_2$ has to be orthogonal to $h_1$ because $t_1$ has no component along $(h_2)_2$, the 2,1 element of $H_2 H_1 T_0$ being zero as is the 2,1 element of $F T_0$ when $F$ is formed by the Householder transformations; see (12.2-1) and (12.2-6). By picking the first coordinate of the row vector $(h_2)_2$ to be zero, we forced this to happen. As a result of this zero choice for the first entry of the row matrix $(h_2)_2$, the first column of $H_1 T_0$ does not project onto $(h_2)_2$, the first column of $H_1 T_0$ only having a nonzero element in its first entry, the element for which $(h_2)_2$ is zero.
Next, for the Householder transform $H_3$ of (12.2-2), the third row unit vector $(h_3)_3$ is chosen to be orthonormal to $h_1$ and $(h_2)_2$ but in the space formed by $h_1$, $(h_2)_2$, and $t_3$ or equivalently $t_1$, $t_2$, and $t_3$. Again, this is exactly how $q'_3$ is picked with the CGS procedure; see (13.1-14). In this way we see that the unit row vectors of $F$ obtained with the Householder transformation are identical to the orthonormal column vectors of $Q$, and in turn the row vectors of $Q^T$, obtained with the CGS procedure.
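The claimed agreement between the Householder rows of $F$ and the CGS vectors $q_i$ can be checked numerically. The sketch below is not the construction of Chapter 12 itself: it uses the standard reflection $H = I - 2vv^T/(v^Tv)$ with $v = x - \|x\| e_1$, a sign choice assumed here so that the diagonal of the triangular result is positive and the rows can be compared to CGS without sign flips (production codes usually take the opposite sign to avoid cancellation). The function names and test columns are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cgs_rows(cols):
    """Orthonormal vectors q_i from classical Gram-Schmidt."""
    rows = []
    for t in cols:
        q = list(t)
        for u in rows:
            r = dot(u, t)
            q = [qk - r * uk for qk, uk in zip(q, u)]
        norm = math.sqrt(dot(q, q))
        rows.append([qk / norm for qk in q])
    return rows

def householder_F(cols):
    """Return (F, A) with F the product of reflections and A = F T_0."""
    s, n = len(cols[0]), len(cols)
    A = [[cols[j][i] for j in range(n)] for i in range(s)]       # s x n copy of T_0
    F = [[1.0 if i == k else 0.0 for k in range(s)] for i in range(s)]
    for k in range(min(n, s - 1)):
        x = [A[i][k] for i in range(k, s)]
        v = list(x)
        v[0] -= math.sqrt(dot(x, x))      # assumed sign: map x onto +||x|| e_1
        vv = dot(v, v)
        if vv == 0.0:
            continue                      # column already in the desired form
        for M in (A, F):                  # apply the reflection to A and to F
            for c in range(len(M[0])):
                proj = 2.0 * sum(v[i] * M[k + i][c] for i in range(len(v))) / vv
                for i in range(len(v)):
                    M[k + i][c] -= proj * v[i]
    return F, A

cols = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
F, A = householder_F(cols)
qs = cgs_rows(cols)

max_diff = max(abs(F[i][k] - qs[i][k]) for i in range(len(cols)) for k in range(4))
print("max |F row - q_i| over the first m'+1 rows =", max_diff)
```

For this example the first three rows of the Householder $F$ match $q_1$, $q_2$, and $q_3$ to within round-off, and the transformed matrix `A` is upper triangular with a zero bottom row.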
In Section 4.2 the Givens transformation was related to the Householder transformation; see (12.1-1) and the discussion just before and after it. In the above paragraphs we related the Householder transformation to the CGS procedure. We now can relate the Givens transformation directly to the CGS procedure. To do this we use the simple example of (4.2-1), which was the example used to introduce the CGS procedure in Section 4.3. For this case only three Givens transformations $G_1$, $G_2$, and $G_3$ are needed to form the upper triangular matrix $T$ as given by (4.3-29). As indicated in Chapter 12, each of these Givens transformations represents a change from the immediately preceding orthogonal coordinate system to a new orthogonal coordinate system, with the change being only in two of the coordinates of one of the unit vector directions making up these coordinate systems. Specifically, each new coordinate system is obtained by a single rotation in one of the planes of the $s$-dimensional orthogonal space making up the columns of the matrix $T$. This is illustrated in Figures 13.1-1 to 13.1-3.
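A sketch of three successive plane rotations applied to a $3 \times 2$ matrix is given below, with the resulting triangular entries compared to the quantities CGS would produce. The numerical matrix, the order in which the elements are zeroed, and the helper name `givens` are assumptions made for illustration; they are not taken from (4.2-1) or (4.3-29).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def givens(A, i, k, col):
    """Rotate rows i and k of A in place so that A[k][col] becomes zero."""
    a, b = A[i][col], A[k][col]
    r = math.hypot(a, b)
    if r == 0.0:
        return                      # nothing to rotate
    c, s = a / r, b / r
    for j in range(len(A[0])):
        ai, ak = A[i][j], A[k][j]
        A[i][j] = c * ai + s * ak
        A[k][j] = -s * ai + c * ak

# Hypothetical 3 x 2 matrix with columns t_1 = (3, 4, 0) and t_2 = (1, 2, 2).
A = [[3.0, 1.0], [4.0, 2.0], [0.0, 2.0]]
givens(A, 0, 1, 0)   # first rotation: zero the (2,1) element
givens(A, 0, 2, 0)   # second rotation: zero the (3,1) element
givens(A, 1, 2, 1)   # third rotation: zero the (3,2) element

# The same quantities from the Gram-Schmidt point of view.
t1, t2 = [3.0, 4.0, 0.0], [1.0, 2.0, 2.0]
norm_q1 = math.sqrt(dot(t1, t1))                    # ||q'_1||
q1 = [x / norm_q1 for x in t1]
r12 = dot(q1, t2)                                   # q_1^T t_2
q2p = [a - r12 * b for a, b in zip(t2, q1)]         # q'_2
norm_q2 = math.sqrt(dot(q2p, q2p))                  # ||q'_2||

print("after rotations:", A)
print("CGS values     :", norm_q1, r12, norm_q2)
```

The rotated matrix is upper triangular, and its nonzero entries agree with $\| q'_1 \|$, $q_1^T t_2$, and $\| q'_2 \|$ from the Gram–Schmidt construction to within round-off.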
We saw above that the CGS procedure forms during the orthogonalization process the upper triangular matrix $R'$ of (13.1-21) and (13.1-39), which is related to the upper triangular matrix $R$ [see (13.1-27a)], which becomes equal to $F T_0$ of (12.2-6) and (13.1-35) in general. As we shall see now, the CGS orthogonalization generation of $R'$ gives us a physical significance for $R'$ and $U'$ and in turn for $R$ and $U$.

Examination of (13.1-18a) to (13.1-18c) and (13.1-19) gives us a physical explanation for why the CGS orthogonalization procedure produces an upper triangular $R'$ and in turn an upper triangular $R$. [The development given in Section 4.3, which introduced the CGS procedure, also gave us a physical feel for why $R$ is upper triangular; see specifically (4.3-24) to (4.3-29).] Note that the orthogonal vectors $q'_1, \ldots, q'_i$ are chosen to form $t_i$; that is, $t_i$ is the weighted sum of $q'_1, \ldots, q'_i$ with the weight for $q'_i$ equaling 1; see (13.1-18a) to (13.1-18c). The $i$th column of $R'$ gives the coefficients of the weightings for the $q'_j$'s, $j = 1, 2, \ldots, m'+1$, for forming $t_i$ from the $q'_j$'s; see (13.1-19). Because $t_i$ is only formed by the weighted sum of $q'_1, \ldots, q'_i$, the coefficients of $q'_{i+1}, \ldots, q'_{m'+1}$ are zero, forcing the elements below the diagonal of the $i$th column to be zero, and in turn forcing $R'$ to be upper triangular. Furthermore, physically, the coefficients of the $i$th column of $R'$ give us the amplitude change that the orthogonal vectors $q'_1, \ldots, q'_i$ need to have to form $t_i$; see (13.1-18a) to (13.1-19). Worded another way, the $i,j$ element $r'_{ij}$ of $R'$ times $\| q'_i \|$ gives the component of $t_j$ along the direction $q_i$; see (13.1-28a) and (13.1-30). Thus we now have a physical feel for the entries of $R'$ and in turn $U'$ [see (13.1-39)]. To get a physical interpretation of
Figure 13.1-1 Givens transformation $G_1$ of $t_1$.
Figure 13.1-2 Givens transformation $G_2$ of $G_1 t_1$.
Figure 13.1-3 Givens transformation G of G G t