Numerical Linear Algebra and Matrix Factorizations
Texts in Computational Science and Engineering 22
More information about this series at http://www.springer.com/series/5151
Numerical Linear Algebra and Matrix Factorizations
Trang 5Tom Lyche
Blindern
University of Oslo
Oslo, Norway
ISSN 1611-0994 ISSN 2197-179X (electronic)
Texts in Computational Science and Engineering
ISBN 978-3-030-36467-0 ISBN 978-3-030-36468-7 (eBook)
https://doi.org/10.1007/978-3-030-36468-7
Mathematics Subject Classification (2010): 15-XX, 65-XX
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Foreword

It is a pleasure to write this foreword to the book "Numerical Linear Algebra and Matrix Factorizations" by Tom Lyche. I see this book project from three perspectives, corresponding to my three different roles: first, as a friend and close colleague of Tom for a number of years, secondly as the present department head, and, finally, as a researcher within the international linear algebra and matrix theory community. The book actually has a long history and started out as lecture notes that Tom wrote for a course in numerical linear algebra. For almost forty years this course has been an important and popular course for our students in mathematics, both in theoretical and more applied directions, as well as students in statistics, physics, mechanics and computer science. These notes have been revised multiple times during the years, and new topics have been added. I have had the pleasure to lecture the course myself, using Tom's lecture notes, and I believe that both the selection of topics and the combined approach of theory and algorithms is very appealing. This is also what our students point out when they have taken this course.
As we know, the area presented in this book plays a highly central role in many applications of mathematics and in scientific computing in general. Sometimes, in the international linear algebra and matrix theory community, one divides the area into numerical linear algebra, applied linear algebra and core (theoretical) linear algebra. This may serve some purpose, but often it is fruitful to have a more unified view on this, in order to see the interplay between theory, applications and algorithms. I think this view dominates this book, and that this makes the book interesting to a wide range of readers. Finally, I would like to thank Tom for his work with this book and the mentioned course, and for being a good colleague from whom I have learned a lot. I know that his international research community in spline theory also shares this view. Most importantly, I hope that you, the reader, will enjoy the book!
June 2019
Preface

This book, which has grown out of a one semester course at the University of Oslo, targets upper undergraduate and beginning graduate students in mathematics, statistics, computational physics and engineering who need a mathematical background in numerical linear algebra and related matrix factorizations.
Mastering the material in this book should enable a student to analyze computational problems and develop his or her own algorithms for solving problems of the following kind:
• System of linear equations. Given a (square) matrix A and a vector b, find a vector x such that Ax = b.
• Least squares. Given a (rectangular) matrix A and a vector b, find a vector x such that the sum of squares of the components of b − Ax is as small as possible.
• Eigenvalues and eigenvectors. Given a (square) matrix A, find a number λ and/or a nonzero vector x such that Ax = λx.
Such problems can be large and difficult to handle, so much can be gained by understanding and taking advantage of special structures. For this we need a good understanding of basic numerical linear algebra and matrix factorizations. Factoring a matrix into a product of simpler matrices is a crucial tool in numerical linear algebra, for it allows one to tackle large problems through solving a sequence of easier ones.
The main characteristics of this book are as follows:
1. It is self-contained, only assuming first year calculus, an introductory course in linear algebra, and some experience in solving mathematical problems on a computer. A special feature of this book is the detailed proofs of practically all results. Parts of the book can be studied independently, making it suitable for self study.
2. There are numerous exercises, which can be found at the end of each chapter. In a separate book we offer solutions to all problems. Solutions of many exam problems given for this course at the University of Oslo are included in this separate volume.
3. The book, consisting of an introductory first chapter and 15 more chapters, naturally disaggregates into six thematically related parts. The chapters are designed to be suitable for a one week per chapter, one semester course. Toward the goal of being self-contained, the first chapter contains a review of linear algebra, and is provided to the reader for convenient occasional reference.
4. Many of the chapters contain material beyond what might normally be covered in one week of lectures. A typical 15 week semester's curriculum could consist of the following curated material.
Chapter 10 gives an introduction to Kronecker products. We illustrate their use by giving simple proofs of properties of the matrix arising from a discretization of the two dimensional Poisson equation. Also, we study fast methods based on eigenvector expansions and the Fast Fourier Transform in Chap. 11. Some background from Chaps. 2, 3 and 4 may be needed for Chaps. 10 and 11. Iterative methods are studied in Chaps. 12 and 13. This includes the classical methods of Jacobi, Gauss-Seidel, Richardson and Successive Over-Relaxation (SOR), as well as a derivation and convergence analysis of the methods of steepest descent and conjugate gradients. The preconditioned conjugate gradient method is introduced and applied to the Poisson problem with variable coefficients.
In Chap. 14 we consider perturbation theory for eigenvalues, the power method and its variants, and use the Inertia Theorem to find a single eigenvalue of a symmetric matrix. Chapter 15 gives a brief informal introduction to one of the most celebrated algorithms of the twentieth century, the QR method for finding all eigenvalues and eigenvectors of a matrix.
5. In this book we give many detailed numerical algorithms for solving linear algebra problems. We have written these algorithms as functions in MATLAB. A list of these functions and the page number where they can be found is included after the table of contents. Moreover, their listings can be found online at http://folk.uio.no/tom/numlinalg/code. Complexity is discussed briefly in Sect. 3.3.2. As for programming issues, we often vectorize the algorithms, leading to shorter and more efficient programs. Stability is important both for the mathematical problems and for the numerical algorithms. Stability can be studied in terms of perturbation theory that leads to condition numbers, see Chaps. 8, 9 and 14. We
will often use phrases like "the algorithm is numerically stable" or "the algorithm is not numerically stable" without saying precisely what we mean by this. Loosely speaking, an algorithm is numerically stable if the solution, computed in floating point arithmetic, is the exact solution of a slightly perturbed problem. To determine upper bounds for these perturbations is the topic of backward error analysis. We refer to [7] and [17, 18] for an in-depth treatment.
A list of freely available software tools for solving linear algebra problems can be found at
www.netlib.org/utk/people/JackDongarra/la-sw.html
To supplement this volume the reader might consult Björck [2], Meyer [15] and Stewart [17, 18]. For matrix analysis the two volumes by Horn and Johnson [9, 10] contain considerable additional material.
Acknowledgments
I would like to thank my colleagues Elaine Cohen, Geir Dahl, Michael Floater, Knut Mørken, Richard Riesenfeld, Nils Henrik Risebro, Øyvind Ryan and Ragnar Winther for all the inspiring discussions we have had over the years. Earlier versions of this book were converted to LaTeX by Are Magnus Bruaset and Njål Foldnes, with help for the final version from Øyvind Ryan. I thank Christian Schulz, Georg Muntingh and Øyvind Ryan, who helped me with the exercise sessions, and we have, in a separate volume, provided solutions to practically all problems in this book. I also thank an anonymous referee for useful suggestions. Finally, I would like to give a special thanks to Larry Schumaker for his enduring friendship and encouragement over the years.
June 2019
Contents

1 A Short Review of Linear Algebra 1
1.1 Notation 1
1.2 Vector Spaces and Subspaces 5
1.2.1 Linear Independence and Bases 6
1.2.2 Subspaces 8
1.2.3 The Vector SpacesRnandCn 10
1.3 Linear Systems 11
1.3.1 Basic Properties 12
1.3.2 The Inverse Matrix 13
1.4 Determinants 15
1.5 Eigenvalues, Eigenvectors and Eigenpairs 18
1.6 Exercises Chap 1 20
1.6.1 Exercises Sect 1.1 20
1.6.2 Exercises Sect 1.3 21
1.6.3 Exercises Sect 1.4 22
Part I LU and QR Factorizations
2 Diagonally Dominant Tridiagonal Matrices; Three Examples 27
2.1 Cubic Spline Interpolation 27
2.1.1 Polynomial Interpolation 28
2.1.2 Piecewise Linear and Cubic Spline Interpolation 28
2.1.3 Give Me a Moment 31
2.1.4 LU Factorization of a Tridiagonal System 34
2.2 A Two Point Boundary Value Problem 37
2.2.1 Diagonal Dominance 38
2.3 An Eigenvalue Problem 40
2.3.1 The Buckling of a Beam 40
2.4 The Eigenpairs of the 1D Test Matrix 41
2.5 Block Multiplication and Triangular Matrices 43
2.5.1 Block Multiplication 43
2.5.2 Triangular Matrices 46
2.6 Exercises Chap 2 48
2.6.1 Exercises Sect 2.1 48
2.6.2 Exercises Sect 2.2 52
2.6.3 Exercises Sect 2.3 53
2.6.4 Exercises Sect 2.4 53
2.6.5 Exercises Sect 2.5 54
2.7 Review Questions 55
3 Gaussian Elimination and LU Factorizations 57
3.1 3 by 3 Example 57
3.2 Gauss and LU 59
3.3 Banded Triangular Systems 62
3.3.1 Algorithms for Triangular Systems 62
3.3.2 Counting Operations 64
3.4 The PLU Factorization 66
3.4.1 Pivoting 66
3.4.2 Permutation Matrices 66
3.4.3 Pivot Strategies 69
3.5 The LU and LDU Factorizations 70
3.5.1 Existence and Uniqueness 71
3.6 Block LU Factorization 74
3.7 Exercises Chap 3 75
3.7.1 Exercises Sect 3.3 75
3.7.2 Exercises Sect 3.4 76
3.7.3 Exercises Sect 3.5 78
3.7.4 Exercises Sect 3.6 81
3.8 Review Questions 81
4 LDL* Factorization and Positive Definite Matrices 83
4.1 The LDL* Factorization 83
4.2 Positive Definite and Semidefinite Matrices 85
4.2.1 The Cholesky Factorization 87
4.2.2 Positive Definite and Positive Semidefinite Criteria 89
4.3 Semi-Cholesky Factorization of a Banded Matrix 91
4.4 The Non-symmetric Real Case 95
4.5 Exercises Chap 4 96
4.5.1 Exercises Sect 4.2 96
4.6 Review Questions 97
5 Orthonormal and Unitary Transformations 99
5.1 Inner Products, Orthogonality and Unitary Matrices 99
5.1.1 Real and Complex Inner Products 100
5.1.2 Orthogonality 102
5.1.3 Sum of Subspaces and Orthogonal Projections 104
5.1.4 Unitary and Orthogonal Matrices 106
5.2 The Householder Transformation 107
5.3 Householder Triangulation 111
5.3.1 The Algorithm 111
5.3.2 The Number of Arithmetic Operations 113
5.3.3 Solving Linear Systems Using Unitary Transformations 113
5.4 The QR Decomposition and QR Factorization 114
5.4.1 Existence 114
5.5 QR and Gram-Schmidt 116
5.6 Givens Rotations 117
5.7 Exercises Chap 5 119
5.7.1 Exercises Sect 5.1 119
5.7.2 Exercises Sect 5.2 119
5.7.3 Exercises Sect 5.4 120
5.7.4 Exercises Sect 5.5 123
5.7.5 Exercises Sect 5.6 123
5.8 Review Questions 125
Part II Eigenpairs and Singular Values
6 Eigenpairs and Similarity Transformations 129
6.1 Defective and Nondefective Matrices 129
6.1.1 Similarity Transformations 131
6.1.2 Algebraic and Geometric Multiplicity of Eigenvalues 132
6.2 The Jordan Factorization 133
6.3 The Schur Factorization and Normal Matrices 135
6.3.1 The Schur Factorization 135
6.3.2 Unitary and Orthogonal Matrices 135
6.3.3 Normal Matrices 137
6.3.4 The Rayleigh Quotient 139
6.3.5 The Quasi-Triangular Form 139
6.3.6 Hermitian Matrices 140
6.4 Minmax Theorems 141
6.4.1 The Hoffman-Wielandt Theorem 143
6.5 Left Eigenvectors 143
6.5.1 Biorthogonality 144
6.6 Exercises Chap 6 145
6.6.1 Exercises Sect 6.1 145
6.6.2 Exercises Sect 6.2 147
6.6.3 Exercises Sect 6.3 149
6.6.4 Exercises Sect 6.4 150
6.7 Review Questions 150
7 The Singular Value Decomposition 153
7.1 The SVD Always Exists 154
7.1.1 The Matrices A∗A , AA∗ 154
7.2 Further Properties of SVD 156
7.2.1 The Singular Value Factorization 156
7.2.2 SVD and the Four Fundamental Subspaces 159
7.3 A Geometric Interpretation 159
7.4 Determining the Rank of a Matrix Numerically 161
7.4.1 The Frobenius Norm 161
7.4.2 Low Rank Approximation 162
7.5 Exercises Chap 7 163
7.5.1 Exercises Sect 7.1 163
7.5.2 Exercises Sect 7.2 164
7.5.3 Exercises Sect 7.4 167
7.6 Review Questions 168
Part III Matrix Norms and Least Squares
8 Matrix Norms and Perturbation Theory for Linear Systems 171
8.1 Vector Norms 171
8.2 Matrix Norms 174
8.2.1 Consistent and Subordinate Matrix Norms 174
8.2.2 Operator Norms 175
8.2.3 The Operator p-Norms 177
8.2.4 Unitary Invariant Matrix Norms 179
8.2.5 Absolute and Monotone Norms 180
8.3 The Condition Number with Respect to Inversion 180
8.3.1 Perturbation of the Right Hand Side in a Linear System 181
8.3.2 Perturbation of a Square Matrix 183
8.4 Proof That the p-Norms Are Norms 185
8.4.1 p-Norms and Inner Product Norms 188
8.5 Exercises Chap 8 190
8.5.1 Exercises Sect 8.1 190
8.5.2 Exercises Sect 8.2 191
8.5.3 Exercises Sect 8.3 194
8.5.4 Exercises Sect 8.4 197
8.6 Review Questions 198
9 Least Squares 199
9.1 Examples 200
9.1.1 Curve Fitting 202
9.2 Geometric Least Squares Theory 204
9.3 Numerical Solution 205
9.3.1 Normal Equations 205
9.3.2 QR Factorization 206
9.3.3 Singular Value Decomposition, Generalized
Inverses and Least Squares 207
9.4 Perturbation Theory for Least Squares 210
9.4.1 Perturbing the Right Hand Side 211
9.4.2 Perturbing the Matrix 212
9.5 Perturbation Theory for Singular Values 213
9.5.1 The Minmax Theorem for Singular Values and the Hoffman-Wielandt Theorem 213
9.6 Exercises Chap 9 216
9.6.1 Exercises Sect 9.1 216
9.6.2 Exercises Sect 9.2 217
9.6.3 Exercises Sect 9.3 218
9.6.4 Exercises Sect 9.4 221
9.6.5 Exercises Sect 9.5 221
9.7 Review Questions 222
Part IV Kronecker Products and Fourier Transforms
10 The Kronecker Product 225
10.1 The 2D Poisson Problem 225
10.1.1 The Test Matrices 228
10.2 The Kronecker Product 229
10.3 Properties of the 2D Test Matrices 232
10.4 Exercises Chap 10 234
10.4.1 Exercises Sects 10.1, 10.2 234
10.4.2 Exercises Sect 10.3 234
10.5 Review Questions 236
11 Fast Direct Solution of a Large Linear System 237
11.1 Algorithms for a Banded Positive Definite System 237
11.1.1 Cholesky Factorization 238
11.1.2 Block LU Factorization of a Block Tridiagonal Matrix 238
11.1.3 Other Methods 239
11.2 A Fast Poisson Solver Based on Diagonalization 239
11.3 A Fast Poisson Solver Based on the Discrete Sine and Fourier Transforms 242
11.3.1 The Discrete Sine Transform (DST) 242
11.3.2 The Discrete Fourier Transform (DFT) 242
11.3.3 The Fast Fourier Transform (FFT) 244
11.3.4 A Poisson Solver Based on the FFT 247
11.4 Exercises Chap 11 247
11.4.1 Exercises Sect 11.3 247
11.5 Review Questions 250
Part V Iterative Methods for Large Linear Systems
12 The Classical Iterative Methods 253
12.1 Classical Iterative Methods; Component Form 253
12.1.1 The Discrete Poisson System 255
12.2 Classical Iterative Methods; Matrix Form 257
12.2.1 Fixed-Point Form 258
12.2.2 The Splitting Matrices for the Classical Methods 258
12.3 Convergence 260
12.3.1 Richardson’s Method 261
12.3.2 Convergence of SOR 263
12.3.3 Convergence of the Classical Methods for the Discrete Poisson Matrix 264
12.3.4 Number of Iterations 266
12.3.5 Stopping the Iteration 267
12.4 Powers of a Matrix 268
12.4.1 The Spectral Radius 268
12.4.2 Neumann Series 270
12.5 The Optimal SOR Parameter ω 271
12.6 Exercises Chap 12 274
12.6.1 Exercises Sect 12.3 274
12.6.2 Exercises Sect 12.4 276
12.7 Review Questions 277
13 The Conjugate Gradient Method 279
13.1 Quadratic Minimization and Steepest Descent 280
13.2 The Conjugate Gradient Method 283
13.2.1 Derivation of the Method 283
13.2.2 The Conjugate Gradient Algorithm 285
13.2.3 Numerical Example 286
13.2.4 Implementation Issues 286
13.3 Convergence 288
13.3.1 The Main Theorem 288
13.3.2 The Number of Iterations for the Model Problems 289
13.3.3 Krylov Spaces and the Best Approximation Property 289
13.4 Proof of the Convergence Estimates 293
13.4.1 Chebyshev Polynomials 293
13.4.2 Convergence Proof for Steepest Descent 296
13.4.3 Monotonicity of the Error 298
13.5 Preconditioning 299
13.6 Preconditioning Example 302
13.6.1 A Variable Coefficient Problem 302
13.6.2 Applying Preconditioning 305
13.7 Exercises Chap 13 306
13.7.1 Exercises Sect 13.1 306
13.7.2 Exercises Sect 13.2 307
13.7.3 Exercises Sect 13.3 309
13.7.4 Exercises Sect 13.4 312
13.7.5 Exercises Sect 13.5 313
13.8 Review Questions 313
Part VI Eigenvalues and Eigenvectors
14 Numerical Eigenvalue Problems 317
14.1 Eigenpairs 317
14.2 Gershgorin’s Theorem 318
14.3 Perturbation of Eigenvalues 320
14.3.1 Nondefective Matrices 322
14.4 Unitary Similarity Transformation of a Matrix into Upper Hessenberg Form 324
14.4.1 Assembling Householder Transformations 326
14.5 Computing a Selected Eigenvalue of a Symmetric Matrix 326
14.5.1 The Inertia Theorem 328
14.5.2 Approximating λ m 329
14.6 Exercises Chap 14 330
14.6.1 Exercises Sect 14.1 330
14.6.2 Exercises Sect 14.2 331
14.6.3 Exercises Sect 14.3 331
14.6.4 Exercises Sect 14.4 332
14.6.5 Exercises Sect 14.5 332
14.7 Review Questions 334
15 The QR Algorithm 335
15.1 The Power Method and Its Variants 335
15.1.1 The Power Method 335
15.1.2 The Inverse Power Method 339
15.1.3 Rayleigh Quotient Iteration 340
15.2 The Basic QR Algorithm 342
15.2.1 Relation to the Power Method 343
15.2.2 Invariance of the Hessenberg Form 344
15.2.3 Deflation 345
15.3 The Shifted QR Algorithms 345
15.4 Exercises Chap 15 346
15.4.1 Exercises Sect 15.1 346
15.5 Review Questions 347
Part VII Appendix
16 Differentiation of Vector Functions 351
References 355
Index 357
List of Figures
Fig 1.1 The triangle T defined by the three points P_1, P_2 and P_3 23
Fig 2.1 The polynomial of degree 13 interpolating f(x) = arctan(10x) + π/2 on [−1, 1]. See text 29
Fig 2.2 The piecewise linear polynomial interpolating f(x) = arctan(10x) + π/2 at n = 14 uniform points on [−1, 1] 29
Fig 2.3 A cubic spline with one knot interpolating f(x) = x^4 on [0, 2] 31
Fig 2.4 A cubic B-spline 34
Fig 2.5 The cubic spline interpolating f(x) = arctan(10x) + π/2 at 14 equidistant sites on [−1, 1]. The exact function is also shown 51
Fig 3.1 Gaussian elimination 59
Fig 3.2 Lower triangular 5 × 5 band matrices: d = 1 (left) and d = 2 (right) 62
Fig 5.1 The construction of v_1 and v_2 in Gram-Schmidt. The constant c is given by c := ⟨s_2, v_1⟩/⟨v_1, v_1⟩ 103
Fig 5.2 The orthogonal projections of s + t into S and T 105
Fig 5.3 The Householder transformation in Example 5.1 108
Fig 5.4 A plane rotation 117
Fig 7.1 The ellipse y_1^2/9 + y_2^2 = 1 (left) and the rotated ellipse AS (right) 160
Fig 8.1 A convex function 185
Fig 9.1 A least squares fit to data 201
Fig 9.2 Graphical interpretation of the bounds in Theorem 9.8 211
Fig 10.1 Numbering of grid points 227
Fig 10.2 The 5-point stencil 227
Fig 10.3 Band structure of the 2D test matrix 228
Fig 11.1 Fill-in in the Cholesky factor of the Poisson matrix
(n= 100) 238
Fig 12.1 The functions α → |1 − αλ1| and α → |1 − αλ n| 262
Fig 12.2 ρ(G ω ) with ω ∈ [0, 2] for n = 100, (lower curve) and
n= 2500 (upper curve) 266
Fig 13.1 Level curves for Q(x, y) given by (13.4) Also shown
is a steepest descent iteration (left) and a conjugate
gradient iteration (right) to find the minimum of Q (cf
Examples 13.1,13.2) 281
Fig 13.2 The orthogonal projection of x − x0intoWk 291
Fig 13.3 This is an illustration of the proof of Theorem 13.6 for k = 3. f ≡ Q − Q∗ has a double zero at μ_1 and one zero between μ_2 and μ_3 295
Fig 14.1 The Gershgorin disk R i 319
Fig 15.1 Post multiplication in a QR step 344
List of Tables
Table 12.1 The number of iterations k_n to solve the discrete Poisson problem with n unknowns using the methods of Jacobi, Gauss-Seidel, and SOR (see text) with a tolerance 10^{−8} 256
Table 12.2 Spectral radius for G_J, G_1, G_{ω∗} and the smallest integer k_n such that ρ(G)^{k_n} ≤ 10^{−8} 266
Table 13.1 The number of iterations K for the averaging problem on a √n × √n grid for various n 287
Table 13.2 The number of iterations K for the Poisson problem on a √n × √n grid for various n 287
Table 13.3 The number of iterations K (no preconditioning) and K pre
(with preconditioning) for the problem (13.52) using the
discrete Poisson problem as a preconditioner 305
Table 15.1 Quadratic convergence of Rayleigh quotient iteration 341
List of Algorithms

2.1 trifactor 36
2.2 trisolve 36
2.3 splineint 50
2.4 findsubintervals 50
2.5 splineval 51
3.1 rforwardsolve 63
3.2 rbacksolve 63
3.3 cforwardsolve 64
3.4 L1U 73
3.5 cbacksolve 75
4.1 LDLs 85
4.2 bandcholesky 89
4.3 bandsemicholeskyL 94
5.1 housegen 110
5.2 housetriang 112
5.3 rothesstri 123
11.1 fastpoisson 241
11.2 fftrec 246
12.1 jdp 256
12.2 sordp 257
13.1 cg 286
13.2 cgtest 287
13.3 pcg 301
14.1 hesshousegen 325
14.2 accumulateQ 326
15.1 powerit 338
15.2 rayleighit 341
Chapter 1
A Short Review of Linear Algebra
In this introductory chapter we give a compact introduction to linear algebra with emphasis on R^n and C^n. For a more elementary introduction, see for example the book [13].

1.1 Notation

The following sets and notations will be used in this book.
1. The sets of natural numbers, integers, rational numbers, real numbers, and complex numbers are denoted by N, Z, Q, R, C, respectively.
2. We use the "colon equal" symbol v := e to indicate that the symbol v is defined by the expression e.
3. R^n is the set of n-tuples of real numbers, which we will represent as bold face column vectors. Thus x ∈ R^n means that x = [x_1, x_2, . . . , x_n]^T, where each x_i is a real number.
4. Addition and scalar multiplication are denoted and defined by x + y := [x_1 + y_1, . . . , x_n + y_n]^T and ax := [ax_1, . . . , ax_n]^T for x, y ∈ R^n and a ∈ R.
5. R^{m×n} is the set of matrices A with real elements. The integers m and n are the number of rows and columns in the tableau A = [a_{ij}], i = 1, . . . , m, j = 1, . . . , n. The element in the ith row and jth column of A will be denoted by a_{i,j}, a_{ij}, A(i, j) or (A)_{i,j}. We also use corresponding notations for the rows and columns of A, with the risk of some confusion. If m = 1 then A is a row vector, if n = 1 then A is a column vector, while if m = n then A is a square matrix. In this text we will denote matrices by boldface capital letters A, B, C, · · · and vectors most often by boldface lower case letters x, y, z, · · ·.
6. A complex number is a number written in the form x = a + ib, where a, b are real numbers and i, the imaginary unit, satisfies i^2 = −1. The set of all such numbers is denoted by C. The numbers a = Re x and b = Im x are the real and imaginary part of x. The number x̄ := a − ib is called the complex conjugate of x = a + ib, and |x| := √(x̄x) = √(a^2 + b^2) the absolute value or modulus of x. The complex exponential function can be defined by
e^x = e^{a+ib} := e^a (cos b + i sin b).
In particular,
e^{iπ/2} = i, e^{iπ} = −1, e^{2iπ} = 1.
7. For matrices and vectors with complex elements we use the notation A ∈ C^{m×n} and x ∈ C^n. We define complex row vectors using either the transpose x^T or the conjugate transpose operation x^∗ := x̄^T = [x̄_1, . . . , x̄_n]. If x ∈ R^n then x^∗ = x^T.
8. For x, y ∈ C^n and a ∈ C the operations of vector addition and scalar multiplication are defined by component operations as in the real case (cf. 4).
9. The arithmetic operations on rectangular matrices are
• matrix addition C := A + B if A, B, C are matrices of the same size, i.e., with the same number of rows and columns, and c_{ij} := a_{ij} + b_{ij} for all i, j.
• multiplication by a scalar C := αA, where c_{ij} := αa_{ij} for all i, j.
• matrix multiplication C := AB, C = A · B or C = A ∗ B, where A ∈ C^{m×p}, B ∈ C^{p×n}, C ∈ C^{m×n}, and c_{ij} := Σ_{k=1}^{p} a_{ik} b_{kj} for i = 1, . . . , m, j = 1, . . . , n.
• element-by-element matrix operations C := A × B, D := A/B, and E := A ∧ r, where all matrices are of the same size and c_{ij} := a_{ij} b_{ij}, d_{ij} := a_{ij}/b_{ij} and e_{ij} := a_{ij}^r for all i, j and suitable r. For the division A/B we assume that all elements of B are nonzero. The element-by-element product C = A × B is known as the Schur product and also the Hadamard product.
10. Let A ∈ R^{m×n} or A ∈ C^{m×n}. The transpose A^T and conjugate transpose A^∗ are n × m matrices with elements a^T_{ij} := a_{ji} and a^∗_{ij} := ā_{ji}, respectively. If B is an n × p matrix then (AB)^T = B^T A^T and (AB)^∗ = B^∗ A^∗. A matrix A ∈ C^{n×n} is symmetric if A^T = A and Hermitian if A^∗ = A.
11. The unit vectors in R^n and C^n are denoted by e_1 := [1, 0, . . . , 0]^T, e_2 := [0, 1, 0, . . . , 0]^T, . . . , e_n := [0, . . . , 0, 1]^T.
12. Some matrices with many zeros have names indicating their "shape". Suppose A ∈ R^{n×n} or A ∈ C^{n×n}. Then A is
• diagonal if a_{ij} = 0 for i ≠ j.
• upper triangular or right triangular if a_{ij} = 0 for i > j.
• lower triangular or left triangular if a_{ij} = 0 for i < j.
• upper Hessenberg if a_{ij} = 0 for i > j + 1.
• lower Hessenberg if a_{ij} = 0 for i < j − 1.
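For readers who want to experiment, the operations and shapes in items 9 to 12 have direct counterparts in MATLAB, the language used for the algorithms in this book. The following lines are only a small illustration added here with arbitrary example matrices; they are not one of the book's listed functions.

    A = [1 2; 3 4]; B = [0 1; 1 0];
    C = A*B;          % matrix product AB
    E = A.*B;         % element-by-element (Schur/Hadamard) product
    F = A./(B + 1);   % element-by-element division (all denominators nonzero)
    G = A.^2;         % element-by-element power
    At = A.';         % transpose A^T
    Ac = A';          % conjugate transpose A^*, equal to A.' for real A
    U = triu(A);      % upper triangular part of A
    L = tril(A);      % lower triangular part of A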
1.2 Vector Spaces and Subspaces

Many mathematical systems have analogous properties to vectors in R^2 or R^3.
Definition 1.1 (Real Vector Space) A real vector space is a nonempty set V, whose objects are called vectors, together with two operations + : V × V → V and · : R × V → V, called addition and scalar multiplication, satisfying the following axioms for all vectors u, v, w in V and scalars c, d in R.
(V1) The sum u + v is in V,
(V2) u + v = v + u,
(V3) u + (v + w) = (u + v) + w,
(V4) There is a zero vector 0 such that u + 0 = u,
(V5) For each u in V there is a vector −u in V such that u + (−u) = 0,
(S1) The scalar multiple c · u is in V,
(S2) c · (u + v) = c · u + c · v,
(S3) (c + d) · u = c · u + d · u,
(S4) c · (d · u) = (cd) · u,
(S5) 1 · u = u.
A complex vector space is defined in the same way, except that the scalars are taken from the set of all complex numbers C. In this book a vector space is either real or complex.
From the axioms it follows that
1. The zero vector is unique.
2. For each u ∈ V the negative −u of u is unique.
3. 0u = 0, c0 = 0, and −u = (−1)u.
Here are some examples.
1. The spaces R^n and C^n, where n ∈ N, are real and complex vector spaces, respectively.
2. Let D be a subset of R and d ∈ N. The set V of all functions f, g : D → R^d is a real vector space with
(f + g)(t) := f(t) + g(t), (cf)(t) := cf(t), t ∈ D, c ∈ R.
Two functions f, g in V are equal if f(t) = g(t) for all t ∈ D. The zero element is the zero function given by f(t) = 0 for all t ∈ D, and the negative of f is given by −f = (−1)f. In the following we will use boldface letters for functions only if d > 1.
3. For n ≥ 0 the space Π_n of polynomials of degree at most n consists of all polynomials p : R → R, p : R → C, or p : C → C of the form
p(t) := a_0 + a_1 t + a_2 t^2 + · · · + a_n t^n, (1.2)
where the coefficients a_0, . . . , a_n are real or complex numbers. p is called the zero polynomial if all coefficients are zero. All other polynomials are said to be nontrivial. The degree of a nontrivial polynomial p given by (1.2) is the smallest integer 0 ≤ k ≤ n such that p(t) = a_0 + · · · + a_k t^k with a_k ≠ 0. The degree of the zero polynomial is not defined. Π_n is a vector space if we define addition and scalar multiplication as for functions.
Definition 1.2 (Linear Combination) For n ≥ 1 let X := {x_1, . . . , x_n} be a set of vectors in a vector space V and let c_1, . . . , c_n be scalars.
1. The sum c_1 x_1 + · · · + c_n x_n is called a linear combination of x_1, . . . , x_n.
2. The linear combination is nontrivial if c_j x_j ≠ 0 for at least one j.
3. The set of all linear combinations of elements in X is denoted span(X).
4. A vector space is finite dimensional if it has a finite spanning set; i.e., there exists n ∈ N and {x_1, . . . , x_n} in V such that V = span({x_1, . . . , x_n}).

Example 1.1 (Linear Combinations)
1. Any x = [x_1, . . . , x_m]^T in C^m can be written as a linear combination of the unit vectors as x = x_1 e_1 + x_2 e_2 + · · · + x_m e_m. Thus, C^m = span({e_1, . . . , e_m}) and C^m is finite dimensional. Similarly R^m is finite dimensional.
2. Let Π = ∪_n Π_n be the space of all polynomials. Π is a vector space that is not finite dimensional. For suppose Π is finite dimensional. Then Π = span({p_1, . . . , p_m}) for some polynomials p_1, . . . , p_m. Let d be an integer such that the degree of p_j is less than d for j = 1, . . . , m. A polynomial of degree d cannot be written as a linear combination of p_1, . . . , p_m, a contradiction.
Definition 1.3 (Linear Independence) A set X = {x_1, . . . , x_n} of nonzero vectors in a vector space is linearly dependent if 0 can be written as a nontrivial linear combination of {x_1, . . . , x_n}. Otherwise X is linearly independent.
A set of vectors X = {x_1, . . . , x_n} is linearly independent if and only if
c_1 x_1 + · · · + c_n x_n = 0 ⇒ c_1 = · · · = c_n = 0. (1.3)
Suppose {x_1, . . . , x_n} is linearly independent. Then
1. If x ∈ span(X) then the scalars c_1, . . . , c_n in the representation x = c_1 x_1 + · · · + c_n x_n are unique.
2. Any nontrivial linear combination of x_1, . . . , x_n is nonzero.
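In R^m or C^m, linear independence of a given finite set of vectors can be tested numerically by collecting the vectors as columns of a matrix and computing its rank. A minimal MATLAB sketch with arbitrary example vectors, not one of the book's listed functions:

    x1 = [1; 0; 1]; x2 = [0; 1; 1]; x3 = [1; 1; 2];   % note that x3 = x1 + x2
    X = [x1 x2 x3];
    r = rank(X);      % r = 2 < 3, so the set {x1, x2, x3} is linearly dependent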
Lemma 1.1 (Linear Independence and Span) Suppose v_1, . . . , v_n span a vector space V and that w_1, . . . , w_k are linearly independent vectors in V. Then k ≤ n.
Proof Suppose k > n. Write w_1 as a linear combination of elements from the set X_0 := {v_1, . . . , v_n}, say w_1 = c_1 v_1 + · · · + c_n v_n. Since w_1 ≠ 0, not all the c's are equal to zero. Pick a nonzero c, say c_{i_1}. Then v_{i_1} can be expressed as a linear combination of w_1 and the remaining v's. So the set X_1 := {w_1, v_1, . . . , v_{i_1−1}, v_{i_1+1}, . . . , v_n} must also be a spanning set for V. We repeat this for w_2 and X_1. In the linear combination w_2 = d_{i_1} w_1 + Σ_{j≠i_1} d_j v_j, we must have d_{i_2} ≠ 0 for some i_2 with i_2 ≠ i_1. For otherwise w_2 = d_{i_1} w_1, contradicting the linear independence of the w's. So the set X_2, consisting of the v's with v_{i_1} replaced by w_1 and v_{i_2} replaced by w_2, is again a spanning set for V. Repeating this process n − 2 more times we obtain a spanning set X_n where v_1, . . . , v_n have been replaced by w_1, . . . , w_n. Since k > n we can then write w_k as a linear combination of w_1, . . . , w_n, contradicting the linear independence of the w's. We conclude that k ≤ n.
Definition 1.4 (Basis) A finite set of vectors {v_1, . . . , v_n} in a vector space V is a basis for V if it spans V and is linearly independent.

Theorem 1.1 Every finite spanning set in a vector space V contains a subset that forms a basis for V.
Proof If {v_1, . . . , v_n} is linearly dependent we can express one of the v's as a nontrivial linear combination of the remaining v's and drop that v from the spanning set. Continue this process until the remaining v's are linearly independent. They still span the vector space and therefore form a basis.

Corollary 1.1 (Existence of a Basis) A vector space is finite dimensional (cf. Definition 1.2) if and only if it has a basis.

Proof Let V = span{v_1, . . . , v_n} be a finite dimensional vector space. By Theorem 1.1, V has a basis. Conversely, if V = span{v_1, . . . , v_n} and {v_1, . . . , v_n} is a basis, then it is by definition a finite spanning set.
Theorem 1.2 (Dimension of a Vector Space) Every basis for a vector space V has the same number of elements. This number is called the dimension of the vector space and denoted dim V.

Proof Suppose X = {v_1, . . . , v_n} and Y = {w_1, . . . , w_k} are two bases for V. By Lemma 1.1 we have k ≤ n. Using the same Lemma with X and Y switched we obtain n ≤ k, and hence k = n.

The set of unit vectors {e_1, . . . , e_n} forms a basis for both R^n and C^n.
Theorem 1.3 (Enlarging Vectors to a Basis) Every linearly independent set of vectors {v_1, . . . , v_k} in a finite dimensional vector space V can be enlarged to a basis for V.

Proof If {v_1, . . . , v_k} does not span V we can enlarge the set by one vector v_{k+1} which cannot be expressed as a linear combination of {v_1, . . . , v_k}. The enlarged set is also linearly independent. Continue this process. Since the space is finite dimensional, it must stop after a finite number of steps.
Definition 1.5 (Subspace) A nonempty subset S of a real or complex vector space V is called a subspace of V if
(V1) The sum u + v is in S for any u, v ∈ S.
(S1) The scalar multiple cu is in S for any scalar c and any u ∈ S.
Using the operations in V, any subspace S of V is a vector space, i.e., all 10 axioms V1-V5 and S1-S5 are satisfied for S. In particular, S must contain the zero element in V. This follows since the operations of vector addition and scalar multiplication are inherited from V.
Example 1.2 (Examples of Subspaces)
1. {0}, where 0 is the zero vector, is a subspace, the trivial subspace. The dimension of the trivial subspace is defined to be zero. All other subspaces are nontrivial.
2. V is a subspace of itself.
3. span(X) is a subspace of V for any X = {x_1, . . . , x_n} ⊆ V. Indeed, it is easy to see that (V1) and (S1) hold.
4. The sum of two subspaces S and T of a vector space V is defined by
S + T := {s + t : s ∈ S and t ∈ T}. (1.4)
Clearly (V1) and (S1) hold and it is a subspace of V.
5. The intersection of two subspaces S and T of a vector space V is defined by
S ∩ T := {x : x ∈ S and x ∈ T}. (1.5)
It is a subspace of V.
6. The union of two subspaces S and T of a vector space V is defined by
S ∪ T := {x : x ∈ S or x ∈ T}. (1.6)
In general it is not a subspace of V.
7. A sum of two subspaces S and T of a vector space V is called a direct sum and denoted S ⊕ T if S ∩ T = {0}.
Theorem 1.4 (Dimension Formula for Sums of Subspaces) Let S and T be two finite dimensional subspaces of a vector space V. Then
dim(S + T) = dim(S) + dim(T) − dim(S ∩ T). (1.7)
In particular, for a direct sum
dim(S ⊕ T) = dim(S) + dim(T). (1.8)
Proof Let {u_1, . . . , u_p} be a basis for S ∩ T, where {u_1, . . . , u_p} = ∅, the empty set, in the case S ∩ T = {0}. We use Theorem 1.3 to extend {u_1, . . . , u_p} to a basis {u_1, . . . , u_p, s_1, . . . , s_q} for S and a basis {u_1, . . . , u_p, t_1, . . . , t_r} for T. Every x ∈ S + T can be written as a linear combination of
{u_1, . . . , u_p, s_1, . . . , s_q, t_1, . . . , t_r},
so these vectors span S + T. We show that they are linearly independent and hence a basis. Suppose u + s + t = 0, where u := Σ_{j=1}^{p} α_j u_j, s := Σ_{j=1}^{q} ρ_j s_j, and t := Σ_{j=1}^{r} σ_j t_j. Now s = −(u + t) belongs to both S and to T and hence s ∈ S ∩ T. Therefore s can be written as a linear combination of u_1, . . . , u_p, say s := Σ_{j=1}^{p} β_j u_j. But then 0 = Σ_{j=1}^{p} β_j u_j − Σ_{j=1}^{q} ρ_j s_j, and since
{u_1, . . . , u_p, s_1, . . . , s_q}
is linearly independent we must have β_1 = · · · = β_p = ρ_1 = · · · = ρ_q = 0, and hence s = 0. We then have u + t = 0, and by linear independence of {u_1, . . . , u_p, t_1, . . . , t_r} we obtain α_1 = · · · = α_p = σ_1 = · · · = σ_r = 0. We have shown that the vectors {u_1, . . . , u_p, s_1, . . . , s_q, t_1, . . . , t_r} constitute a basis for S + T. But then
dim(S + T) = p + q + r = (p + q) + (p + r) − p = dim(S) + dim(T) − dim(S ∩ T),
and (1.7) follows. Equation (1.7) implies (1.8) since dim{0} = 0.
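When S and T are column spaces of matrices S and T with columns in R^m, the dimensions in (1.7) are easy to check numerically, since dim(S) = rank(S), dim(T) = rank(T) and dim(S + T) = rank([S T]). A small MATLAB illustration with arbitrary example data:

    S = [1 0; 0 1; 0 0];                   % S spans the first two unit vectors in R^3
    T = [0 0; 1 0; 0 1];                   % T spans the last two unit vectors in R^3
    dimSum = rank([S T]);                  % dim(S + T), here 3
    dimCap = rank(S) + rank(T) - dimSum;   % dim(S ∩ T) from (1.7), here 1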
It is convenient to introduce a matrix transforming a basis in a subspace into a basis for the space itself.

Lemma 1.2 (Change of Basis Matrix) Suppose S is a subspace of a finite dimensional vector space V and let {s_1, . . . , s_n} be a basis for S and {v_1, . . . , v_m} a basis for V. Then each s_j can be expressed as a linear combination of v_1, . . . , v_m, say
s_j = Σ_{i=1}^{m} a_{ij} v_i for j = 1, . . . , n. (1.9)
If x ∈ S then x = Σ_{j=1}^{n} c_j s_j = Σ_{i=1}^{m} b_i v_i for some coefficients b := [b_1, . . . , b_m]^T, c := [c_1, . . . , c_n]^T. Moreover b = Ac, where A = [a_{ij}] ∈ C^{m×n} is given by (1.9). The matrix A has linearly independent columns.
Proof Equation (1.9) holds for some a_{ij} since s_j ∈ V and {v_1, . . . , v_m} spans V. Since {s_1, . . . , s_n} is a basis for S and {v_1, . . . , v_m} a basis for V, every x ∈ S can be written as x = Σ_{j=1}^{n} c_j s_j and as x = Σ_{i=1}^{m} b_i v_i for unique coefficients c and b. Inserting (1.9) in the first expansion gives x = Σ_{i=1}^{m} (Σ_{j=1}^{n} a_{ij} c_j) v_i, and comparing coefficients shows that b = Ac. Finally, if Ac = 0 then the vector x := Σ_{j=1}^{n} c_j s_j satisfies x = Σ_{i=1}^{m} b_i v_i with b = Ac = 0, and since b = 0 we have x = 0. But since {s_1, . . . , s_n} is linearly independent this forces c = 0, so the columns of A are linearly independent.

The matrix A in Lemma 1.2 is called a change of basis matrix.
When V = R^m or C^m we can think of n vectors in V, say x_1, . . . , x_n, as a set X := {x_1, . . . , x_n} or as the columns of an m × n matrix X = [x_1, . . . , x_n]. A linear combination can then be written as a matrix times a vector, Xc, where c = [c_1, . . . , c_n]^T is the vector of scalars. Thus
R(X) := {Xc : c ∈ R^n} = span(X).

Definition 1.6 (Column Space, Null Space, Inner Product and Norm) Associated with an m × n matrix X = [x_1, . . . , x_n], where x_j ∈ V, j = 1, . . . , n, are the following subspaces of V.
1. The subspace R(X) is called the column space of X. It is the smallest subspace containing X = {x_1, . . . , x_n}. The dimension of R(X) is called the rank of X. The matrix X has rank n if and only if it has linearly independent columns.
2. R(X^T) is called the row space of X. It is generated by the rows of X written as column vectors.
3. The subspace N(X) := {y ∈ R^n : Xy = 0} is called the null space or kernel space of X. The dimension of N(X) is called the nullity of X and denoted null(X).
Clearly N(X) is nontrivial if and only if X has linearly dependent columns. Inner products and norms are treated in more generality in Chaps. 5 and 8.
The following theorem is shown in any basic course in linear algebra. See Exercise 7.10 for a simple proof using the singular value decomposition.

Theorem 1.5 (Counting Dimensions of Fundamental Subspaces) Suppose X ∈ C^{m×n}. Then rank(X) = rank(X^T), and rank(X) + null(X) = n.
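For a concrete matrix the rank and the nullity can be computed in MATLAB with the built-in functions rank and null, which makes it easy to verify the count in Theorem 1.5 on examples. A minimal sketch with an arbitrary matrix:

    X = [1 2 3; 4 5 6];      % a 2 x 3 matrix
    r = rank(X);             % rank, here 2
    N = null(X);             % columns form an orthonormal basis for N(X)
    nullity = size(N, 2);    % nullity, here 1, so r + nullity = 3 = n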
1.3 Linear Systems

Consider a linear system Ax = b of m equations in n unknowns. Here, for all i, j, the coefficients a_{ij}, the unknowns x_j, and the components b_i of the right hand side are real or complex numbers. The system can be written as a vector equation
x_1 a_1 + x_2 a_2 + · · · + x_n a_n = b,
where a_1, . . . , a_n are the columns of A.
The system is homogeneous if b = 0, and it is said to be underdetermined, square, or overdetermined if m < n, m = n, or m > n, respectively.
A linear system has a unique solution, infinitely many solutions, or no solution. To discuss this we first consider the real case, and a homogeneous underdetermined system.

Lemma 1.3 (Underdetermined System) Suppose A ∈ R^{m×n} with m < n. Then there is a nonzero x ∈ R^n such that Ax = 0.

Proof Suppose A ∈ R^{m×n} with m < n. The n columns of A span a subspace of R^m. Since R^m has dimension m, the dimension of this subspace is at most m. By Lemma 1.1 the columns of A must be linearly dependent. It follows that there is a nonzero x ∈ R^n such that Ax = 0.

A square matrix is either nonsingular or singular.
Definition 1.7 (Real Nonsingular or Singular Matrix) A square matrix A ∈ R^{n×n} is said to be nonsingular if the only real solution of the homogeneous system Ax = 0 is x = 0. The matrix is singular if there is a nonzero x ∈ R^n such that Ax = 0.

Theorem 1.6 (Linear Systems; Existence and Uniqueness) Suppose A ∈ R^{n×n}. The linear system Ax = b has a unique solution x ∈ R^n for any b ∈ R^n if and only if the matrix A is nonsingular.
Proof Suppose A is nonsingular. We define B := [A, b] ∈ R^{n×(n+1)}. By Lemma 1.3 there is a nonzero z ∈ R^{n+1} such that Bz = 0. Writing z^T = [y^T, α] with y ∈ R^n and α ∈ R, this means Ay + αb = 0. We cannot have α = 0, for then Ay = 0 with y ≠ 0, contradicting that A is nonsingular. Hence x := −y/α solves Ax = b, and the solution is unique since Ax_1 = Ax_2 = b implies A(x_1 − x_2) = 0 and therefore x_1 = x_2.
Conversely, if Ax = b has a unique solution for every b ∈ R^n, then in particular the homogeneous system Ax = 0 has a unique solution, which must be x = 0. Thus A is nonsingular.
For the complex case we have
Lemma 1.4 (Complex Underdetermined System) Suppose A ∈ C^{m×n} with m < n. Then there is a nonzero x ∈ C^n such that Ax = 0.

Definition 1.8 (Complex Nonsingular Matrix) A square matrix A ∈ C^{n×n} is said to be nonsingular if the only complex solution of the homogeneous system Ax = 0 is x = 0. The matrix is singular if it is not nonsingular.

Theorem 1.7 (Complex Linear System; Existence and Uniqueness) Suppose A ∈ C^{n×n}. The linear system Ax = b has a unique solution x ∈ C^n for any b ∈ C^n if and only if the matrix A is nonsingular.
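For a concrete nonsingular A the unique solution in Theorems 1.6 and 1.7 can be computed numerically; in MATLAB the backslash operator solves the system, and a small residual confirms the computed solution. A minimal sketch with arbitrary example data:

    A = [2 1; 1 3]; b = [3; 5];
    x = A \ b;                % solve Ax = b
    res = norm(A*x - b);      % residual, close to zero here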
Suppose A ∈ C^{n×n} is a square matrix. A matrix B ∈ C^{n×n} is called a right inverse of A if AB = I. A matrix C ∈ C^{n×n} is said to be a left inverse of A if CA = I. We say that A is invertible if it has both a left- and a right inverse. If A has a right inverse B and a left inverse C then
C = CI = C(AB) = (CA)B = IB = B,
and this common inverse is called the inverse of A and denoted by A^{−1}. Thus the inverse satisfies A^{−1}A = AA^{−1} = I.
We want to characterize the class of invertible matrices and start with a lemma.
Theorem 1.8 (Product of Nonsingular Matrices) If A, B, C ∈ C^{n×n} with AB = C, then C is nonsingular if and only if both A and B are nonsingular. In particular, if either AB = I or BA = I then A is nonsingular and A^{−1} = B.

Proof Suppose both A and B are nonsingular and let Cx = 0. Then ABx = 0, and since A is nonsingular we see that Bx = 0. Since B is nonsingular we have x = 0. We conclude that C is nonsingular.
For the converse, suppose first that B is singular and let x ∈ C^n be a nonzero vector so that Bx = 0. But then Cx = (AB)x = A(Bx) = A0 = 0, so C is singular. Finally suppose B is nonsingular, but A is singular. Let x̃ be a nonzero vector such that Ax̃ = 0. By Theorem 1.7 there is a vector x such that Bx = x̃, and x is nonzero since x̃ is nonzero. But then Cx = (AB)x = A(Bx) = Ax̃ = 0 for a nonzero x, so C is singular.
Theorem 1.9 (When Is a Square Matrix Invertible?) A square matrix is invertible if and only if it is nonsingular.

Proof Suppose first A is a nonsingular matrix. By Theorem 1.7 each of the linear systems Ab_i = e_i has a unique solution b_i for i = 1, . . . , n. Let B := [b_1, . . . , b_n]. Then AB = [Ab_1, . . . , Ab_n] = [e_1, . . . , e_n] = I, so that A has a right inverse B. By Theorem 1.8, B is nonsingular since I is nonsingular and AB = I. Since B is nonsingular we can use what we have shown for A to conclude that B has a right inverse C, i.e. BC = I. But then AB = BC = I, so B has both a right inverse and a left inverse, which must be equal, so A = C. Since BC = I we have BA = I, so B is also a left inverse of A, and A is invertible.
Conversely, if A is invertible then it has a right inverse B. Since AB = I and I is nonsingular, we again use Theorem 1.8 to conclude that A is nonsingular.
To verify that some matrix B is an inverse of another matrix A it is enough to show that B is either a left inverse or a right inverse of A. This calculation also proves that A is nonsingular. We use this observation to give simple proofs of the following results.

Corollary 1.2 (Basic Properties of the Inverse Matrix) Suppose A, B ∈ C^{n×n} are nonsingular and c is a nonzero constant.

Proof
2. We note that (B^{−1}A^{−1})(AB) = B^{−1}(A^{−1}A)B = B^{−1}B = I. Thus AB is invertible with the indicated inverse since it has a left inverse.
3. Now I = I^T = (A^{−1}A)^T = A^T (A^{−1})^T, showing that (A^{−1})^T is a right inverse of A^T. The proof of part 4 is similar.
4. The matrix (1/c) A^{−1} is a one sided inverse of cA.
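The constructions above are easy to try out numerically. The sketch below builds the inverse column by column by solving Ab_i = e_i, as in the proof of Theorem 1.9, and checks the product rule (AB)^{−1} = B^{−1}A^{−1} from Corollary 1.2 (MATLAB, with arbitrary example matrices):

    A = [4 1; 2 3]; B = [1 2; 0 1]; n = 2;
    I = eye(n); Ainv = zeros(n);
    for i = 1:n
        Ainv(:, i) = A \ I(:, i);              % solve A b_i = e_i
    end
    err1 = norm(A*Ainv - I);                   % close to zero
    err2 = norm(inv(A*B) - inv(B)*inv(A));     % close to zero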
1.4 Determinants
For any A ∈ C^{n×n} the determinant of A is defined by the number
det(A) := Σ_σ sign(σ) a_{1σ(1)} a_{2σ(2)} · · · a_{nσ(n)},
where the sum is over all permutations σ of {1, 2, . . . , n}, and sign(σ) is +1 or −1 according to whether σ can be obtained from the identity permutation by an even or an odd number of interchanges. For n = 2 this gives
det(A) = a_{11} a_{22} − a_{12} a_{21}.
The first term on the right corresponds to the identity permutation given by σ(i) = i, i = 1, 2. The second term comes from the permutation σ = {2, 1}. For n = 3 there are six permutations of {1, 2, 3}. Then
det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} − a_{12}a_{21}a_{33} − a_{13}a_{22}a_{31} − a_{11}a_{23}a_{32}.
This follows since sign({1, 2, 3}) = sign({2, 3, 1}) = sign({3, 1, 2}) = 1, and, noting that interchanging two numbers in a permutation reverses its sign, we find
sign({2, 1, 3}) = sign({3, 2, 1}) = sign({1, 3, 2}) = −1.
To compute the value of a determinant from the definition can be a trying experience. It is often better to use elementary operations on rows or columns to reduce it to a simpler form. For example, if A is triangular then det(A) = a_{11}a_{22} · · · a_{nn}, the product of the diagonal elements. In particular, for the identity matrix det(I) = 1. The elementary operations using either rows or columns are
1. Interchanging two rows (columns): det(B) = − det(A),
2. Multiplying a row (column) by a scalar α: det(B) = α det(A),
3. Adding a constant multiple of one row (column) to another row (column): det(B) = det(A),
where B is the result of performing the indicated operation on A.
If only a few elements in a row or column are nonzero then a cofactor expansion can be used. These expansions take the form
det(A) = Σ_{j=1}^{n} (−1)^{i+j} a_{ij} det(A_{ij}) for i = 1, . . . , n, (row)
det(A) = Σ_{i=1}^{n} (−1)^{i+j} a_{ij} det(A_{ij}) for j = 1, . . . , n, (column). (1.14)
Here A_{i,j} denotes the submatrix of A obtained by deleting the ith row and jth column of A. For A ∈ C^{n×n} and 1 ≤ i, j ≤ n the determinant det(A_{ij}) is called the cofactor of a_{ij}.
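These rules are easy to check on examples. The following MATLAB lines verify that the determinant of a triangular matrix is the product of its diagonal elements and carry out a cofactor expansion along the first row (the matrices are arbitrary illustrations, not part of the book's listings):

    A = [2 1 0; 0 3 4; 0 0 5];
    d1 = det(A);               % equals prod(diag(A)) = 30 for this triangular A
    B = [1 2 3; 4 5 6; 7 8 10];
    d3 = 0;
    for j = 1:3
        Bij = B; Bij(1, :) = []; Bij(:, j) = [];     % delete row 1 and column j
        d3 = d3 + (-1)^(1 + j) * B(1, j) * det(Bij);
    end
    % d3 agrees with det(B) up to rounding errors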
Example 1.3 (Determinant Equation for a Straight Line) The equation for a straight line through two points (x_1, y_1) and (x_2, y_2) in the plane can be written as the equation
det([1 x y; 1 x_1 y_1; 1 x_2 y_2]) = 0.
Expanding the determinant by cofactors of the first row gives (x_1 y_2 − x_2 y_1) − x(y_2 − y_1) + y(x_2 − x_1) = 0, which for x_1 ≠ x_2 can be rearranged to
y = y_1 + ((y_2 − y_1)/(x_2 − x_1))(x − x_1),
which is the slope form of the equation of a straight line.
We will freely use, without proofs, the following properties of determinants. If A, B are square matrices of order n with real or complex elements, then
1. det(AB) = det(A) det(B).
2. det(A^T) = det(A), and det(A^∗) is the complex conjugate of det(A).
3. det(aA) = a^n det(A), for a ∈ C.
4. A is singular if and only if det(A) = 0.
5. If A = [C D; 0 E] is block triangular for some square matrices C, E, then det(A) = det(C) det(E).
6. Cramer's rule. Suppose A ∈ C^{n×n} is nonsingular and b ∈ C^n. Let x = [x_1, x_2, . . . , x_n]^T be the unique solution of Ax = b. Then
x_i = det(A_i)/det(A), i = 1, . . . , n,
where A_i is the matrix obtained from A by replacing its ith column by b.
7. Adjoint formula for the inverse: A^{−1} = adj(A)/det(A), where the matrix adj(A), with entries adj(A)_{ij} := (−1)^{i+j} det(A_{j,i}), is called the adjoint of A. Moreover, A_{j,i} denotes the submatrix of A obtained by deleting the jth row and ith column of A.
8. Cauchy-Binet formula: Let A ∈ C^{m×p}, B ∈ C^{p×n} and C = AB. Suppose 1 ≤ r ≤ min{m, n, p} and let i = {i_1, . . . , i_r} and j = {j_1, . . . , j_r} be integers with 1 ≤ i_1 < i_2 < · · · < i_r ≤ m and 1 ≤ j_1 < j_2 < · · · < j_r ≤ n. Then
det(C(i, j)) = Σ_k det(A(i, k)) det(B(k, j)),
where the sum is over all k = {k_1, . . . , k_r} with 1 ≤ k_1 < k_2 < · · · < k_r ≤ p, and where, for example, C(i, j) denotes the r × r submatrix of C with elements c_{i_s, j_t}, s, t = 1, . . . , r.
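Some of these properties, in particular Cramer's rule, are easy to illustrate numerically for small systems. A minimal MATLAB sketch with arbitrary example data (for larger systems Cramer's rule is far too expensive and Gaussian elimination, Chap. 3, is used instead):

    A = [2 1; 1 3]; b = [3; 5]; n = 2;
    x = zeros(n, 1);
    for i = 1:n
        Ai = A; Ai(:, i) = b;          % replace column i of A by b
        x(i) = det(Ai) / det(A);       % Cramer's rule
    end
    % x agrees with A \ b up to rounding errors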
1.5 Eigenvalues, Eigenvectors and Eigenpairs
Suppose A ∈ C^{n×n} is a square matrix, λ ∈ C and x ∈ C^n. We say that (λ, x) is an eigenpair for A if Ax = λx and x is nonzero. The scalar λ is called an eigenvalue and x is said to be an eigenvector.¹ The set of eigenvalues is called the spectrum of A and is denoted by σ(A). For example, σ(I) = {1, . . . , 1} = {1}.
Eigenvalues are the roots of the characteristic polynomial.

Lemma 1.5 (Characteristic Equation) For any A ∈ C^{n×n} we have λ ∈ σ(A) ⇐⇒ det(A − λI) = 0.
Proof Suppose (λ, x) is an eigenpair for A. The equation Ax = λx can be written (A − λI)x = 0. Since x is nonzero, the matrix A − λI must be singular with a zero determinant. Conversely, if det(A − λI) = 0 then A − λI is singular and (A − λI)x = 0 for some nonzero x ∈ C^n. Thus Ax = λx and (λ, x) is an eigenpair for A.
Expanding the determinant, we find
det(A − λI) = (a_{11} − λ)(a_{22} − λ) · · · (a_{nn} − λ) + r(λ), (1.16)
where each term in r(λ) has at most n − 2 factors containing λ. It follows that r is a polynomial of degree at most n − 2, det(A − λI) is a polynomial of exact degree n in λ, and the eigenvalues are the roots of this polynomial.
We observe that det(A − λI) = (−1)^n det(λI − A), so det(A − λI) = 0 if and only if det(λI − A) = 0.
¹ The word "eigen" is derived from German and means "own".
Definition 1.9 (Characteristic Polynomial of a Matrix) The function π_A : C → C given by π_A(λ) = det(A − λI) is called the characteristic polynomial of A. The equation det(A − λI) = 0 is called the characteristic equation of A.
By the fundamental theorem of algebra an n × n matrix has, counting multiplicities, precisely n eigenvalues λ_1, . . . , λ_n, some of which might be complex even if A is real. The complex eigenpairs of a real matrix occur in complex conjugate pairs. Indeed, taking the complex conjugate on both sides of the equation Ax = λx with A real shows that (λ̄, x̄) is also an eigenpair for A.
We can write the characteristic polynomial in the form
π_A(λ) = (−1)^n λ^n + c_{n−1} λ^{n−1} + · · · + c_1 λ + c_0,
where c_{n−1} = (−1)^{n−1} trace(A), with trace(A) := a_{11} + · · · + a_{nn}, and c_0 = π_A(0) = det(A). On the other hand, π_A(λ) = (λ_1 − λ)(λ_2 − λ) · · · (λ_n − λ), and comparing coefficients shows that trace(A) = λ_1 + · · · + λ_n and det(A) = λ_1 λ_2 · · · λ_n. For example, for a 2 × 2 matrix A with trace(A) = 4 and det(A) = 3 we obtain π_A(λ) = λ^2 − 4λ + 3.
Since A is singular ⇐⇒ Ax = 0 for some x ≠ 0 ⇐⇒ Ax = 0x for some x ≠ 0 ⇐⇒ zero is an eigenvalue of A, we obtain

Theorem 1.11 (Zero Eigenvalue) The matrix A ∈ C^{n×n} is singular if and only if zero is an eigenvalue.
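Eigenvalues of a concrete matrix can be computed numerically with the MATLAB function eig, and poly returns the coefficients of det(λI − A), which differs from π_A only by the factor (−1)^n. The sketch below also checks Theorem 1.11 on a singular example (the matrices are arbitrary illustrations):

    A = [2 1; 1 2];
    lam = eig(A);          % eigenvalues, here 1 and 3
    p = poly(A);           % [1 -4 3], i.e. lambda^2 - 4*lambda + 3
    t = trace(A);          % equals sum(lam)
    d = det(A);            % equals prod(lam)
    S = [1 2; 2 4];        % a singular matrix
    eig(S)                 % one eigenvalue is zero (Theorem 1.11)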
Since the determinant of a triangular matrix is equal to the product of the diagonal elements, the eigenvalues of a triangular matrix are found on the diagonal. In general it is not easy to find all eigenvalues of a matrix. However, sometimes the dimension of the problem can be reduced. Since the determinant of a block triangular matrix is equal to the product of the determinants of the diagonal blocks, we obtain

Theorem 1.12 (Eigenvalues of a Block Triangular Matrix) If A = [B D; 0 C] is block triangular then π_A = π_B · π_C.
1.6 Exercises Chap. 1

1.6.1 Exercises Sect. 1.1

Exercise 1.1 (Strassen Multiplication (Exam Exercise 2017-1)) (By arithmetic operations we mean additions, subtractions, multiplications and divisions.)
Let A and B be n × n real matrices.
a) With A, B ∈ R^{n×n}, how many arithmetic operations are required to form the product AB?
b) Consider the partitioned product
[W X; Y Z] = [A B; C D] [E F; G H],
where all the matrices A, . . . , Z are in R^{n×n}. How many operations does it take to compute W, X, Y and Z by the obvious algorithm?
c) An alternative method to compute W, X, Y and Z is to use Strassen's formulas:
d) Describe a recursive algorithm, based on Strassen's formulas, which given two matrices A and B of size m × m, with m = 2^k for some k ≥ 0, calculates the product AB.
e) Show that the operation count of the recursive algorithm is O(m^{log_2 7}). Note that log_2 7 ≈ 2.8 < 3, so this is less costly than straightforward matrix multiplication.
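A minimal MATLAB sketch of the kind of recursive function part d) asks for is given below. It is our own illustration using the standard Strassen products; the block names in the exercise's own formulas may be organized differently, and the code assumes, as stated, that m is a power of 2.

    function C = strassen(A, B)
    % Recursive Strassen multiplication of two m x m matrices, m = 2^k.
    % Save as strassen.m; no padding is done, so m must be a power of 2.
    m = size(A, 1);
    if m == 1
        C = A * B;                 % 1 x 1 base case: ordinary multiplication
        return
    end
    h = m / 2;
    A11 = A(1:h, 1:h); A12 = A(1:h, h+1:m); A21 = A(h+1:m, 1:h); A22 = A(h+1:m, h+1:m);
    B11 = B(1:h, 1:h); B12 = B(1:h, h+1:m); B21 = B(h+1:m, 1:h); B22 = B(h+1:m, h+1:m);
    P1 = strassen(A11 + A22, B11 + B22);   % seven recursive half-size products
    P2 = strassen(A21 + A22, B11);
    P3 = strassen(A11, B12 - B22);
    P4 = strassen(A22, B21 - B11);
    P5 = strassen(A11 + A12, B22);
    P6 = strassen(A21 - A11, B11 + B12);
    P7 = strassen(A12 - A22, B21 + B22);
    C = [P1 + P4 - P5 + P7,  P3 + P5;      % assemble the four blocks of AB
         P2 + P4,            P1 - P2 + P3 + P6];
    end

Each call performs seven half-size multiplications and a fixed number of matrix additions, which is what leads to the O(m^{log_2 7}) operation count in part e).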
...Since the determinant of a triangular matrix is equal to the product of the diagonalelements the eigenvalues of a triangular matrix are found on the diagonal In general
it is not easy to... eigenvalues of a matrix However, sometimes the dimension
of the problem can be reduced Since the determinant of a block triangular matrix isequal to the product of the determinants of the diagonal... determinant from the definition can be a tryingexperience It is often better to use elementary operations on rows or columns
to reduce it to a simpler form For example, if A is triangular