Matrix Analysis for Scientists & Engineers

Alan J. Laub
University of California, Davis, California

SIAM
Copyright © 2005 by the Society for Industrial and Applied Mathematics.
Mathematica is a registered trademark of Wolfram Research, Inc.
Mathcad is a registered trademark of Mathsoft Engineering & Education, Inc.
Library of Congress Cataloging-in-Publication Data
To my wife, Beverley
(who captivated me in the UBC math library
nearly forty years ago)
Contents

Preface xi

1 Introduction and Review 1
  1.1 Some Notation and Terminology 1
  1.2 Matrix Arithmetic 3
  1.3 Inner Products and Orthogonality 4
  1.4 Determinants 4

2 Vector Spaces 7
  2.1 Definitions and Examples 7
  2.2 Subspaces 9
  2.3 Linear Independence 10
  2.4 Sums and Intersections of Subspaces 13

3 Linear Transformations 17
  3.1 Definition and Examples 17
  3.2 Matrix Representation of Linear Transformations 18
  3.3 Composition of Transformations 19
  3.4 Structure of Linear Transformations 20
  3.5 Four Fundamental Subspaces 22

4 Introduction to the Moore-Penrose Pseudoinverse 29
  4.1 Definitions and Characterizations 29
  4.2 Examples 30
  4.3 Properties and Applications 31

5 Introduction to the Singular Value Decomposition 35
  5.1 The Fundamental Theorem 35
  5.2 Some Basic Properties 38
  5.3 Row and Column Compressions 40

6 Linear Equations 43
  6.1 Vector Linear Equations 43
  6.2 Matrix Linear Equations 44
  6.3 A More General Matrix Linear Equation 47
  6.4 Some Useful and Interesting Inverses 47

7 Projections, Inner Product Spaces, and Norms 51
  7.1 Projections 51
    7.1.1 The four fundamental orthogonal projections 52
  7.2 Inner Product Spaces 54
  7.3 Vector Norms 57
  7.4 Matrix Norms 59

8 Linear Least Squares Problems 65
  8.1 The Linear Least Squares Problem 65
  8.2 Geometric Solution 67
  8.3 Linear Regression and Other Linear Least Squares Problems 67
    8.3.1 Example: Linear regression 67
    8.3.2 Other least squares problems 69
  8.4 Least Squares and Singular Value Decomposition 70
  8.5 Least Squares and QR Factorization 71

9 Eigenvalues and Eigenvectors 75
  9.1 Fundamental Definitions and Properties 75
  9.2 Jordan Canonical Form 82
  9.3 Determination of the JCF 85
    9.3.1 Theoretical computation 86
    9.3.2 On the +1's in JCF blocks 88
  9.4 Geometric Aspects of the JCF 89
  9.5 The Matrix Sign Function 91

10 Canonical Forms 95
  10.1 Some Basic Canonical Forms 95
  10.2 Definite Matrices 99
  10.3 Equivalence Transformations and Congruence 102
    10.3.1 Block matrices and definiteness 104
  10.4 Rational Canonical Form 104

11 Linear Differential and Difference Equations 109
  11.1 Differential Equations 109
    11.1.1 Properties of the matrix exponential 109
    11.1.2 Homogeneous linear differential equations 112
    11.1.3 Inhomogeneous linear differential equations 112
    11.1.4 Linear matrix differential equations 113
    11.1.5 Modal decompositions 114
    11.1.6 Computation of the matrix exponential 114
  11.2 Difference Equations 118
    11.2.1 Homogeneous linear difference equations 118
    11.2.2 Inhomogeneous linear difference equations 118
    11.2.3 Computation of matrix powers 119
  11.3 Higher-Order Equations 120

12 Generalized Eigenvalue Problems 125
  12.1 The Generalized Eigenvalue/Eigenvector Problem 125
  12.2 Canonical Forms 127
  12.3 Application to the Computation of System Zeros 130
  12.4 Symmetric Generalized Eigenvalue Problems 131
  12.5 Simultaneous Diagonalization 133
    12.5.1 Simultaneous diagonalization via SVD 133
  12.6 Higher-Order Eigenvalue Problems 135
    12.6.1 Conversion to first-order form 135

13 Kronecker Products 139
  13.1 Definition and Examples 139
  13.2 Properties of the Kronecker Product 140
  13.3 Application to Sylvester and Lyapunov Equations 144

Bibliography 151

Index 153
Preface

This book is intended to be used as a text for beginning graduate-level (or even senior-level) students in engineering, the sciences, mathematics, computer science, or computational science who wish to be familiar with enough matrix analysis that they are prepared to use its tools and ideas comfortably in a variety of applications. By matrix analysis I mean linear algebra and matrix theory together with their intrinsic interaction with and application to linear dynamical systems (systems of linear differential or difference equations). The text can be used in a one-quarter or one-semester course to provide a compact overview of much of the important and useful mathematics that, in many cases, students meant to learn thoroughly as undergraduates, but somehow didn't quite manage to do. Certain topics that may have been treated cursorily in undergraduate courses are treated in more depth and more advanced material is introduced. I have tried throughout to emphasize only the more important and "useful" tools, methods, and mathematical structures. Instructors are encouraged to supplement the book with specific application examples from their own particular subject area.

The choice of topics covered in linear algebra and matrix theory is motivated both by applications and by computational utility and relevance. The concept of matrix factorization is emphasized throughout to provide a foundation for a later course in numerical linear algebra. Matrices are stressed more than abstract vector spaces, although Chapters 2 and 3 do cover some geometric (i.e., basis-free or subspace) aspects of many of the fundamental notions. The books by Meyer [18], Noble and Daniel [20], Ortega [21], and Strang [24] are excellent companion texts for this book. Upon completion of a course based on this text, the student is then well-equipped to pursue, either via formal courses or through self-study, follow-on topics on the computational side (at the level of [7], [11], [23], or [25], for example) or on the theoretical side (at the level of [12], [13], or [16], for example).

Prerequisites for using this text are quite modest: essentially just an understanding of calculus and definitely some previous exposure to matrices and linear algebra. Basic concepts such as determinants, singularity of matrices, eigenvalues and eigenvectors, and positive definite matrices should have been covered at least once, even though their recollection may occasionally be "hazy." However, requiring such material as prerequisite permits the early (but "out-of-order" by conventional standards) introduction of topics such as pseudoinverses and the singular value decomposition (SVD). These powerful and versatile tools can then be exploited to provide a unifying foundation upon which to base subsequent topics. Because tools such as the SVD are not generally amenable to "hand computation," this approach necessarily presupposes the availability of appropriate mathematical software on a digital computer. For this, I highly recommend MATLAB®, although other software such as Mathematica® or Mathcad® is also excellent. Since this text is not intended for a course in numerical linear algebra per se, the details of most of the numerical aspects of linear algebra are deferred to such a course.
The presentation of the material in this book is strongly influenced by computational issues for two principal reasons. First, "real-life" problems seldom yield to simple closed-form formulas or solutions. They must generally be solved computationally and it is important to know which types of algorithms can be relied upon and which cannot. Some of the key algorithms of numerical linear algebra, in particular, form the foundation upon which rests virtually all of modern scientific and engineering computation. A second motivation for a computational emphasis is that it provides many of the essential tools for what I call "qualitative mathematics." For example, in an elementary linear algebra course, a set of vectors is either linearly independent or it is not. This is an absolutely fundamental concept. But in most engineering or scientific contexts we want to know more than that. If a set of vectors is linearly independent, how "nearly dependent" are the vectors? If they are linearly dependent, are there "best" linearly independent subsets? These turn out to be much more difficult problems and frequently involve research-level questions when set in the context of the finite-precision, finite-range floating-point arithmetic environment of most modern computing platforms.
Some of the applications of matrix analysis mentioned briefly in this book derive from the modern state-space approach to dynamical systems. State-space methods are now standard in much of modern engineering where, for example, control systems with large numbers of interacting inputs, outputs, and states often give rise to models of very high order that must be analyzed, simulated, and evaluated. The "language" in which such models are conveniently described involves vectors and matrices. It is thus crucial to acquire a working knowledge of the vocabulary and grammar of this language. The tools of matrix analysis are also applied on a daily basis to problems in biology, chemistry, econometrics, physics, statistics, and a wide variety of other fields, and thus the text can serve a rather diverse audience. Mastery of the material in this text should enable the student to read and understand the modern language of matrices used throughout mathematics, science, and engineering.
While prerequisites for this text are modest, and while most material is developed from basic ideas in the book, the student does require a certain amount of what is conventionally referred to as "mathematical maturity." Proofs are given for many theorems. When they are not given explicitly, they are either obvious or easily found in the literature. This is ideal material from which to learn a bit about mathematical proofs and the mathematical maturity and insight gained thereby. It is my firm conviction that such maturity is neither encouraged nor nurtured by relegating the mathematical aspects of applications (for example, linear algebra for elementary state-space theory) to an appendix or introducing it "on-the-fly" when necessary. Rather, one must lay a firm foundation upon which subsequent applications and perspectives can be built in a logical, consistent, and coherent fashion.
I have taught this material for many years, many times at UCSB and twice at UC Davis, and the course has proven to be remarkably successful at enabling students from disparate backgrounds to acquire a quite acceptable level of mathematical maturity and rigor for subsequent graduate studies in a variety of disciplines. Indeed, many students who completed the course, especially the first few times it was offered, remarked afterward that if only they had had this course before they took linear systems, or signal processing, or estimation theory, etc., they would have been able to concentrate on the new ideas they wanted to learn, rather than having to spend time making up for deficiencies in their background in matrices and linear algebra. My fellow instructors, too, realized that by requiring this course as a prerequisite, they no longer had to provide as much time for "review" and could focus instead on the subject at hand. The concept seems to work.
— AJL, June 2004
Chapter 1
Introduction and Review
1.1 Some Notation and Terminology
We begin with a brief introduction to some standard notation and terminology to be used throughout the text. This is followed by a review of some basic notions in matrix analysis and linear algebra.
The following sets appear frequently throughout subsequent chapters:
1. R^n = the set of n-tuples of real numbers represented as column vectors. Thus, x ∈ R^n means
   x = [x_1, ..., x_n]^T,
   where x_i ∈ R for i ∈ n. Henceforth, the notation n denotes the set {1, ..., n}.
   Note: Vectors are always column vectors. A row vector is denoted by y^T, where y ∈ R^n and the superscript T is the transpose operation. That a vector is always a column vector rather than a row vector is entirely arbitrary, but this convention makes it easy to recognize immediately throughout the text that, e.g., x^T y is a scalar while x y^T is an n × n matrix.
2. C^n = the set of n-tuples of complex numbers represented as column vectors.
3. R^{m×n} = the set of real (or real-valued) m × n matrices.
4. R_r^{m×n} = the set of real m × n matrices of rank r. Thus, R_n^{n×n} denotes the set of real nonsingular n × n matrices.
5. C^{m×n} = the set of complex (or complex-valued) m × n matrices.
6. C_r^{m×n} = the set of complex m × n matrices of rank r.
Each of the above also has a "block" analogue obtained by replacing scalar components in the respective definitions by block submatrices. For example, if A ∈ R^{n×n}, B ∈ R^{n×m}, and C ∈ R^{m×m}, then the (n + m) × (n + m) matrix [A B; 0 C] is block upper triangular.
The transpose of a matrix A is denoted by A^T and is the matrix whose (i, j)th entry is the (j, i)th entry of A; that is, (A^T)_ij = a_ji. Note that if A ∈ R^{m×n}, then A^T ∈ R^{n×m}. If A ∈ C^{m×n}, then its Hermitian transpose (or conjugate transpose) is denoted by A^H (or sometimes A^*) and its (i, j)th entry is (A^H)_ij = conj(a_ji), where the bar (or conj) indicates complex conjugation; i.e., if z = α + jβ (j = i = √-1), then conj(z) = α - jβ. A matrix A is symmetric if A = A^T and Hermitian if A = A^H. We henceforth adopt the convention that, unless otherwise noted, an equation like A = A^T implies that A is real-valued while a statement like A = A^H implies that A is complex-valued.
Remark 1.1. While √-1 is most commonly denoted by i in mathematics texts, j is the more common notation in electrical engineering and system theory. There is some advantage to being conversant with both notations. The notation j is used throughout the text but reminders are placed at strategic locations.
Example 1.2.
1. A real matrix A satisfying A = A^T is symmetric (and Hermitian).
2. A complex matrix A satisfying A = A^T but A ≠ A^H is complex-valued symmetric but not Hermitian.
3. A complex matrix A satisfying A = A^H but A ≠ A^T is Hermitian (but not symmetric).
Transposes of block matrices can be defined in an obvious way. For example, it is easy to see that if A_ij are appropriately dimensioned subblocks, then
[A_11 A_12; A_21 A_22]^T = [A_11^T A_21^T; A_12^T A_22^T].
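The three cases above can be checked numerically. The sketch below uses NumPy and hypothetical matrices A1, A2, A3 introduced here only for illustration (they are not the matrices displayed in the original Example 1.2).

import numpy as np

A1 = np.array([[1.0, 2.0],
               [2.0, 3.0]])        # real and equal to its transpose: symmetric (and Hermitian)
A2 = np.array([[1.0, 1j],
               [1j, 2.0]])         # complex, equal to its transpose but not to its conjugate transpose
A3 = np.array([[1.0, 1j],
               [-1j, 2.0]])        # complex, equal to its conjugate transpose but not to its transpose

assert np.array_equal(A1, A1.T) and np.array_equal(A1, A1.conj().T)      # symmetric and Hermitian
assert np.array_equal(A2, A2.T) and not np.array_equal(A2, A2.conj().T)  # symmetric, not Hermitian
assert np.array_equal(A3, A3.conj().T) and not np.array_equal(A3, A3.T)  # Hermitian, not symmetric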
We now classify some of the more familiar "shaped" matrices. A matrix A ∈ R^{n×n} (or A ∈ C^{n×n}) is
• diagonal if a_ij = 0 for i ≠ j.
• upper triangular if a_ij = 0 for i > j.
• lower triangular if a_ij = 0 for i < j.
• tridiagonal if a_ij = 0 for |i - j| > 1.
• pentadiagonal if a_ij = 0 for |i - j| > 2.
• upper Hessenberg if a_ij = 0 for i - j > 1.
• lower Hessenberg if a_ij = 0 for j - i > 1.
1.2 Matrix Arithmetic
It is assumed that the reader is familiar with the fundamental notions of matrix addition, multiplication of a matrix by a scalar, and multiplication of matrices.
A special case of matrix multiplication occurs when the second matrix is a column vector x, i.e., the matrix-vector product Ax. A very important way to view this product is to interpret it as a weighted sum (linear combination) of the columns of A. That is, suppose A = [a_1, ..., a_n] ∈ R^{m×n} with a_i ∈ R^m and x = [x_1, ..., x_n]^T ∈ R^n. Then
Ax = x_1 a_1 + ··· + x_n a_n ∈ R^m.
The importance of this interpretation cannot be overemphasized. As a numerical example, take
A = [9 8 7; 6 5 4] and x = [3, 2, 1]^T.
Then we can quickly calculate dot products of the rows of A with the column x to find Ax = [50, 32]^T, but this matrix-vector product can also be computed via
3·[9, 6]^T + 2·[8, 5]^T + 1·[7, 4]^T = [50, 32]^T.
For large arrays of numbers, there can be important computer-architecture-related advantages to preferring the latter calculation method.
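The two viewpoints are easy to compare numerically. The following sketch uses NumPy purely for illustration (the text itself recommends MATLAB and similar tools) and computes Ax both by row-times-column dot products and as a linear combination of the columns.

import numpy as np

A = np.array([[9, 8, 7],
              [6, 5, 4]])
x = np.array([3, 2, 1])

# Row-oriented view: dot products of the rows of A with x.
row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])

# Column-oriented view: weighted sum (linear combination) of the columns of A.
col_view = sum(x[j] * A[:, j] for j in range(A.shape[1]))

assert np.array_equal(row_view, col_view)           # both give [50, 32]
assert np.array_equal(A @ x, np.array([50, 32]))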
For matrix multiplication, suppose A ∈ R^{m×n} and B = [b_1, ..., b_p] ∈ R^{n×p} with b_i ∈ R^n. Then the matrix product AB can be thought of as above, applied p times:
AB = A[b_1, ..., b_p] = [Ab_1, ..., Ab_p].
There is also an alternative, but equivalent, formulation of matrix multiplication that appears frequently in the text and is presented below as a theorem. Again, its importance cannot be overemphasized. It is deceptively simple and its full understanding is well rewarded.
Theorem 1.3. Let U = [u_1, ..., u_n] ∈ R^{m×n} with u_i ∈ R^m and V = [v_1, ..., v_n] ∈ R^{p×n} with v_i ∈ R^p. Then
U V^T = Σ_{i=1}^{n} u_i v_i^T ∈ R^{m×p}.
If matrices C and D are compatible for multiplication, recall that (CD)^T = D^T C^T (or (CD)^H = D^H C^H). This gives a dual to the matrix-vector result above. Namely, if C ∈ R^{m×n} has row vectors c_i^T ∈ R^{1×n}, and is premultiplied by a row vector y^T ∈ R^{1×m}, then the product can be written as a weighted linear sum of the rows of C as follows:
y^T C = y_1 c_1^T + ··· + y_m c_m^T ∈ R^{1×n}.
Theorem 1.3 can then also be generalized to its "row dual." The details are left to the reader.
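A small numerical sketch of Theorem 1.3, assuming the outer-product form reconstructed above; the random matrices U and V are illustrative only.

import numpy as np

rng = np.random.default_rng(0)
m, p, n = 4, 3, 5
U = rng.standard_normal((m, n))   # columns u_1, ..., u_n in R^m
V = rng.standard_normal((p, n))   # columns v_1, ..., v_n in R^p

# U V^T equals the sum of the rank-one outer products u_i v_i^T.
outer_sum = sum(np.outer(U[:, i], V[:, i]) for i in range(n))
assert np.allclose(U @ V.T, outer_sum)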
1.3 Inner Products and Orthogonality
For vectors x, y ∈ R^n, the Euclidean inner product (or inner product, for short) of x and y is given by
⟨x, y⟩ := x^T y = Σ_{i=1}^{n} x_i y_i.
Note that the inner product is a scalar.
If x, y ∈ C^n, we define their complex Euclidean inner product (or inner product, for short) by
⟨x, y⟩_c := x^H y = Σ_{i=1}^{n} conj(x_i) y_i.
Note that ⟨x, y⟩_c = conj(⟨y, x⟩_c), i.e., the order in which x and y appear in the complex inner product is important. The more conventional definition of the complex inner product is ⟨x, y⟩_c = y^H x = Σ_{i=1}^{n} x_i conj(y_i), but throughout the text we prefer the symmetry with the real case.
Example 1.4. Let x = [1, j]^T ∈ C^2 and let y = [y_1, y_2]^T ∈ C^2 be arbitrary. Then
⟨x, y⟩_c = x^H y = y_1 - j y_2, while ⟨y, x⟩_c = y^H x = conj(y_1) + j conj(y_2) = conj(y_1 - j y_2),
and we see that, indeed, ⟨x, y⟩_c = conj(⟨y, x⟩_c).
Note that x^T x = 0 if and only if x = 0 when x ∈ R^n, but that this is not true if x ∈ C^n. What is true in the complex case is that x^H x = 0 if and only if x = 0. To illustrate, consider the nonzero vector x above. Then x^T x = 0 but x^H x = 2.
Two nonzero vectors x, y ∈ R^n are said to be orthogonal if their inner product is zero, i.e., x^T y = 0. Nonzero complex vectors are orthogonal if x^H y = 0. If x and y are orthogonal and x^T x = 1 and y^T y = 1, then we say that x and y are orthonormal. A matrix A ∈ R^{n×n} is an orthogonal matrix if A^T A = A A^T = I, where I is the n × n identity matrix. The notation I_n is sometimes used to denote the identity matrix in R^{n×n} (or C^{n×n}). Similarly, a matrix A ∈ C^{n×n} is said to be unitary if A^H A = A A^H = I. Clearly an orthogonal or unitary matrix has orthonormal rows and orthonormal columns. There is no special name attached to a nonsquare matrix A ∈ R^{m×n} (or ∈ C^{m×n}) with orthonormal rows or columns.
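The following sketch illustrates the inner product conventions of this section with NumPy; the vector y is an arbitrary complex vector chosen here for illustration, and np.vdot conjugates its first argument, matching the convention ⟨x, y⟩_c = x^H y.

import numpy as np

x = np.array([1, 1j])
y = np.array([0.5, 2 - 1j])    # arbitrary complex vector, chosen for illustration

ip_xy = np.vdot(x, y)          # <x, y>_c = x^H y
ip_yx = np.vdot(y, x)          # <y, x>_c = y^H x
assert np.isclose(ip_xy, np.conj(ip_yx))   # <x, y>_c is the conjugate of <y, x>_c

assert np.isclose(x @ x, 0)           # x^T x = 0 even though x is nonzero
assert np.isclose(np.vdot(x, x), 2)   # x^H x = 2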
1.4 Determinants
It is assumed that the reader is familiar with the basic theory of determinants. For A ∈ R^{n×n} (or A ∈ C^{n×n}) we use the notation det A for the determinant of A. We list below some of the more useful properties of determinants. Note that this is not a minimal set, i.e., several properties are consequences of one or more of the others.
1. If A has a zero row or if any two rows of A are equal, then det A = 0.
2. If A has a zero column or if any two columns of A are equal, then det A = 0.
3. Interchanging two rows of A changes only the sign of the determinant.
4. Interchanging two columns of A changes only the sign of the determinant.
5. Multiplying a row of A by a scalar α results in a new matrix whose determinant is α det A.
6. Multiplying a column of A by a scalar α results in a new matrix whose determinant is α det A.
7. Multiplying a row of A by a scalar and then adding it to another row does not change the determinant.
8. Multiplying a column of A by a scalar and then adding it to another column does not change the determinant.
9. det A^T = det A (det A^H = conj(det A) if A ∈ C^{n×n}).
10. If A is diagonal, then det A = a_11 a_22 ··· a_nn, i.e., det A is the product of its diagonal elements.
11. If A is upper triangular, then det A = a_11 a_22 ··· a_nn.
12. If A is lower triangular, then det A = a_11 a_22 ··· a_nn.
13. If A is block diagonal (or block upper triangular or block lower triangular), with square diagonal blocks A_11, A_22, ..., A_nn (of possibly different sizes), then det A = det A_11 det A_22 ··· det A_nn.
14. If A, B ∈ R^{n×n}, then det(AB) = det A det B.
15. If A ∈ R^{n×n} is nonsingular, then det(A^{-1}) = 1/det A.
16. If A ∈ R^{n×n} and D ∈ R^{m×m}, then det [A B; C D] = det A det(D - C A^{-1} B).
Proof: This follows easily from the block LU factorization
[A B; C D] = [I 0; C A^{-1} I] [A B; 0 D - C A^{-1} B].
17. If A ∈ R^{n×n} and D ∈ R^{m×m}, then det [A B; C D] = det D det(A - B D^{-1} C).
Proof: This follows easily from the block UL factorization
[A B; C D] = [I B D^{-1}; 0 I] [A - B D^{-1} C 0; C D].
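Several of these properties are easy to sanity-check numerically. The sketch below verifies properties 14, 15, and 16 on randomly generated matrices (hypothetical data, not taken from the text).

import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3
A = rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m))

det = np.linalg.det
assert np.isclose(det(A @ A2), det(A) * det(A2))          # property 14
assert np.isclose(det(np.linalg.inv(A)), 1.0 / det(A))    # property 15

M = np.block([[A, B], [C, D]])                             # the block matrix [A B; C D]
schur = D - C @ np.linalg.inv(A) @ B                       # Schur complement of A
assert np.isclose(det(M), det(A) * det(schur))             # property 16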
Remark 1.5. The factorization of a matrix A into the product of a unit lower triangular matrix L (i.e., lower triangular with all 1's on the diagonal) and an upper triangular matrix U is called an LU factorization; see, for example, [24]. Another such factorization is UL, where U is unit upper triangular and L is lower triangular. The factorizations used above are block analogues of these.
Remark 1.6. The matrix D - C A^{-1} B is called the Schur complement of A in [A B; C D]. Similarly, A - B D^{-1} C is the Schur complement of D in [A B; C D].
EXERCISES
1. If A ∈ R^{n×n} and α is a scalar, what is det(αA)? What is det(-A)?
2. If A is orthogonal, what is det A? If A is unitary, what is det A?
3. Let x, y ∈ R^n. Show that det(I - x y^T) = 1 - y^T x.
4. Let U_1, U_2, ..., U_k ∈ R^{n×n} be orthogonal matrices. Show that the product U = U_1 U_2 ··· U_k is an orthogonal matrix.
5. Let A ∈ R^{n×n}. The trace of A, denoted Tr A, is defined as the sum of its diagonal elements, i.e., Tr A = Σ_{i=1}^{n} a_ii.
(a) Show that the trace is a linear function; i.e., if A, B ∈ R^{n×n} and α, β ∈ R, then Tr(αA + βB) = α Tr A + β Tr B.
(b) Show that Tr(AB) = Tr(BA), even though in general AB ≠ BA.
(c) Let S ∈ R^{n×n} be skew-symmetric, i.e., S^T = -S. Show that Tr S = 0. Then either prove the converse or provide a counterexample.
6. A matrix A ∈ R^{n×n} is said to be idempotent if A² = A.
(a) Show that the matrix A = (1/2) [2cos²θ  sin 2θ; sin 2θ  2sin²θ] is idempotent for all θ.
(b) Suppose A ∈ R^{n×n} is idempotent and A ≠ I. Show that A must be singular.
Chapter 2
Vector Spaces
2.1 Definitions and Examples
Definition 2.1. A field is a set F together with two operations +, · : F × F → F such that
(A1) α + (β + γ) = (α + β) + γ for all α, β, γ ∈ F.
(A2) there exists an element 0 ∈ F such that α + 0 = α for all α ∈ F.
(A3) for all α ∈ F, there exists an element (-α) ∈ F such that α + (-α) = 0.
(A4) α + β = β + α for all α, β ∈ F.
(M1) α · (β · γ) = (α · β) · γ for all α, β, γ ∈ F.
(M2) there exists an element 1 ∈ F such that α · 1 = α for all α ∈ F.
(M3) for all α ∈ F, α ≠ 0, there exists an element α^{-1} ∈ F such that α · α^{-1} = 1.
(M4) α · β = β · α for all α, β ∈ F.
(D) α · (β + γ) = α · β + α · γ for all α, β, γ ∈ F.
Axioms (A1)-(A3) state that (F, +) is a group, and an abelian group if (A4) also holds. Axioms (M1)-(M4) state that (F \ {0}, ·) is an abelian group.
Generally speaking, when no confusion can arise, the multiplication operator "·" is not written explicitly.
Example 2.2.
1. R with ordinary addition and multiplication is a field.
2. C with ordinary complex addition and multiplication is a field.
3. Ra[x] = the field of rational functions in the indeterminate x, i.e., ratios
   (a_0 + a_1 x + ··· + a_p x^p) / (b_0 + b_1 x + ··· + b_q x^q) with a_i, b_i ∈ R and p, q ∈ Z+,
   where Z+ = {0, 1, 2, ...}, is a field.
4. R_r^{m×n} = {m × n matrices of rank r with real coefficients} is clearly not a field since, for example, (M1) does not hold unless m = n. Moreover, R^{n×n} is not a field either since (M4) does not hold in general (although the other 8 axioms hold).
Definition 2.3. A vector space over a field F is a set V together with two operations + : V × V → V and · : F × V → V such that
(V1) (V, +) is an abelian group.
(V2) (α · β) · v = α · (β · v) for all α, β ∈ F and for all v ∈ V.
(V3) (α + β) · v = α · v + β · v for all α, β ∈ F and for all v ∈ V.
(V4) α · (v + w) = α · v + α · w for all α ∈ F and for all v, w ∈ V.
(V5) 1 · v = v for all v ∈ V (1 ∈ F).
A vector space is denoted by (V, F) or, when there is no possibility of confusion as to the underlying field, simply by V.
Remark 2.4. Note that + and · in Definition 2.3 are different from the + and · in Definition 2.1 in the sense of operating on different objects in different sets. In practice, this causes no confusion and the · operator is usually not even written explicitly.
Example 2.5.
1. (R^n, R) with addition defined componentwise by x + y = [x_1 + y_1, ..., x_n + y_n]^T and scalar multiplication defined by αx = [αx_1, ..., αx_n]^T is a vector space. Similar definitions hold for (C^n, C).
2. (R^{m×n}, R) is a vector space with addition defined by (A + B)_ij = a_ij + b_ij and scalar multiplication defined by (αA)_ij = α a_ij.
3. Let (V, F) be an arbitrary vector space and D be an arbitrary set. Let Φ(D, V) be the set of functions f mapping D to V. Then Φ(D, V) is a vector space with addition defined by (f + g)(d) = f(d) + g(d) and scalar multiplication defined by (αf)(d) = α f(d) for all d ∈ D.
2.2 Subspaces
Definition 2.6. Let (V, F) be a vector space and let W ⊆ V, W ≠ ∅. Then (W, F) is a subspace of (V, F) if and only if (W, F) is itself a vector space or, equivalently, if and only if (αw_1 + βw_2) ∈ W for all α, β ∈ F and for all w_1, w_2 ∈ W.
Remark 2.7. The latter characterization of a subspace is often the easiest way to check or prove that something is indeed a subspace (or vector space); i.e., verify that the set in question is closed under addition and scalar multiplication. Note, too, that since 0 ∈ F, this implies that the zero vector must be in any subspace.
Notation: When the underlying field is understood, we write W ⊆ V, and the symbol ⊆, when used with vector spaces, is henceforth understood to mean "is a subspace of." The less restrictive meaning "is a subset of" is specifically flagged as such.
Example 2.8.
2. Let W = {A ∈ R^{n×n} : A is orthogonal}. Then W is not a subspace of R^{n×n}.
3. Consider (V, F) = (R², R) and for each v ∈ R² of the form v = [v_1, v_2]^T identify v_1 with the x-coordinate in the plane and v_2 with the y-coordinate. For α, β ∈ R, define
   W_{α,β} = {v : v = [v_1, αv_1 + β]^T, v_1 ∈ R}.
   Then W_{α,β} is a subspace of V if and only if β = 0. As an interesting exercise, sketch W_{2,1}, W_{2,0}, W_{1/2,1}, and W_{1/2,0}. Note, too, that the vertical line through the origin (i.e., α = ∞) is also a subspace.
All lines through the origin are subspaces. Shifted subspaces W_{α,β} with β ≠ 0 are called linear varieties.
Henceforth, we drop the explicit dependence of a vector space on an underlying field. Thus, V usually denotes a vector space with the underlying field generally being R unless explicitly stated otherwise.
Definition 2.9. If R and S are vector spaces (or subspaces), then R = S if and only if R ⊆ S and S ⊆ R.
Note: To prove two vector spaces are equal, one usually proves the two inclusions separately: an arbitrary r ∈ R is shown to be an element of S and then an arbitrary s ∈ S is shown to be an element of R.
2.3 Linear Independence
Let X = {v_1, v_2, ...} be a nonempty collection of vectors v_i in some vector space V.
Definition 2.10. X is a linearly dependent set of vectors if and only if there exist k distinct elements v_1, ..., v_k ∈ X and scalars α_1, ..., α_k not all zero such that
α_1 v_1 + α_2 v_2 + ··· + α_k v_k = 0.
X is a linearly independent set of vectors if and only if for any collection of k distinct elements v_1, ..., v_k of X and for any scalars α_1, ..., α_k,
α_1 v_1 + α_2 v_2 + ··· + α_k v_k = 0 implies α_1 = α_2 = ··· = α_k = 0.
Example 2.11.
1. Let V = R³. Then, for instance, the natural basis vectors {e_1, e_2, e_3} form a linearly independent set. (Why?) However, any set of vectors {v_1, v_2, v_3} satisfying 2v_1 - v_2 + v_3 = 0 is a linearly dependent set.
2. Let A ∈ R^{n×n} and B ∈ R^{n×m}. Then consider the rows of e^{tA} B as vectors in C^m[t_0, t_1] (recall that e^{tA} denotes the matrix exponential, which is discussed in more detail in Chapter 11). Independence of these vectors turns out to be equivalent to a concept called controllability, to be studied further in what follows.
Let v_i ∈ R^n, i ∈ k, and consider the matrix V = [v_1, ..., v_k] ∈ R^{n×k}. The linear dependence of this set of vectors is equivalent to the existence of a nonzero vector a ∈ R^k such that Va = 0. An equivalent condition for linear dependence is that the k × k matrix V^T V is singular. If the set of vectors is independent, and there exists a ∈ R^k such that Va = 0, then a = 0. An equivalent condition for linear independence is that the matrix V^T V is nonsingular.
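A brief numerical sketch of this test, with illustrative vectors introduced here: the Gram matrix V^T V is nonsingular exactly when the columns of V are linearly independent.

import numpy as np

v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = 2 * v1 - v2                      # deliberately dependent on v1 and v2

V_indep = np.column_stack([v1, v2])
V_dep = np.column_stack([v1, v2, v3])

# Independent columns: V^T V is nonsingular.
assert abs(np.linalg.det(V_indep.T @ V_indep)) > 1e-12
# Dependent columns: V^T V is singular, and the rank drops.
assert abs(np.linalg.det(V_dep.T @ V_dep)) < 1e-12
assert np.linalg.matrix_rank(V_dep) == 2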
Definition 2.12. Let X = {v_1, v_2, ...} be a collection of vectors v_i ∈ V. Then the span of X is defined as
Sp(X) = Sp{v_1, v_2, ...} := {α_1 v_1 + ··· + α_k v_k : α_i ∈ F, k ∈ N},
where N = {1, 2, ...}.
Example 2.13. Let V = R^n and define
e_1 = [1, 0, ..., 0]^T, e_2 = [0, 1, 0, ..., 0]^T, ..., e_n = [0, ..., 0, 1]^T.
Then Sp{e_1, e_2, ..., e_n} = R^n.
Definition 2.14. A set of vectors X is a basis for V if and only if
1. X is a linearly independent set (of basis vectors), and
2. Sp(X) = V.
Example 2.15. {e_1, ..., e_n} is a basis for R^n (sometimes called the natural basis).
Now let b_1, ..., b_n be a basis (with a specific order associated with the basis vectors) for V. Then for all v ∈ V there exists a unique n-tuple {ξ_1, ..., ξ_n} such that
v = ξ_1 b_1 + ··· + ξ_n b_n.
Definition 2.16. The scalars {ξ_i} are called the components (or sometimes the coordinates) of v with respect to the basis {b_1, ..., b_n} and are unique. We say that the vector x = [ξ_1, ..., ξ_n]^T of components represents the vector v with respect to the basis B = {b_1, ..., b_n}.
Example 2.17. In R^n the components of v = [v_1, ..., v_n]^T with respect to the natural basis are simply ξ_i = v_i, since v = v_1 e_1 + ··· + v_n e_n. We can also determine the components of v with respect to another basis: writing the basis vectors as the columns of a matrix B, the vector x of components satisfies Bx = v.
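A short numerical sketch of computing components with respect to a basis; the basis {b_1, b_2} below is hypothetical and is not the basis used in the original Example 2.17. Stacking the basis vectors as columns of B, the components x of v solve Bx = v.

import numpy as np

b1 = np.array([1.0, 1.0])
b2 = np.array([1.0, -1.0])
B = np.column_stack([b1, b2])
v = np.array([3.0, 1.0])

x = np.linalg.solve(B, v)                    # components of v with respect to {b1, b2}
assert np.allclose(x[0] * b1 + x[1] * b2, v)
print(x)                                     # [2. 1.], since v = 2*b1 + 1*b2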
Theorem 2.18. The number of elements in a basis of a vector space is independent of the particular basis considered.
Definition 2.19. If a basis X for a vector space V (≠ 0) has n elements, V is said to be n-dimensional or have dimension n, and we write dim(V) = n or dim V = n. For consistency, and because the 0 vector is in any vector space, we define dim(0) = 0. A vector space V is finite-dimensional if there exists a basis X with n < +∞ elements; otherwise, V is infinite-dimensional.
Thus, Theorem 2.18 says that dim(V) = the number of elements in a basis.
Example 2.20.
• dim(R^n) = n.
• dim(R^{m×n}) = mn.
• dim{A ∈ R^{n×n} : A = A^T} = n(n + 1)/2. (To see why, determine n(n + 1)/2 symmetric basis matrices.)
• dim{A ∈ R^{n×n} : A is upper (lower) triangular} = n(n + 1)/2.
2.4 Sums and Intersections of Subspaces
Definition 2.21. Let (V, F) be a vector space and let R, S ⊆ V. The sum and intersection of R and S are defined respectively by:
1. R + S = {r + s : r ∈ R, s ∈ S}.
2. R ∩ S = {v : v ∈ R and v ∈ S}.
Theorem 2.22. The sum and the intersection of subspaces of V are themselves subspaces of V.
Remark 2.23. The union of two subspaces, R ∪ S, is not necessarily a subspace.
Definition 2.24. T = R ⊕ S is the direct sum of R and S if
1. R ∩ S = 0, and
2. R + S = T.
The subspaces R and S are said to be complements of each other in T.
Remark 2.25. The complement of R (or S) is not unique. For example, consider V = R² and let R be any line through the origin. Then any other distinct line through the origin is a complement of R. Among all the complements there is a unique one orthogonal to R. We discuss more about orthogonal complements elsewhere in the text.
Theorem 2.26. Suppose T = R ⊕ S. Then
1. every t ∈ T can be written uniquely in the form t = r + s with r ∈ R and s ∈ S.
2. dim(T) = dim(R) + dim(S).
Proof: To prove the first part, suppose an arbitrary vector t ∈ T can be written in two ways as t = r_1 + s_1 = r_2 + s_2, where r_1, r_2 ∈ R and s_1, s_2 ∈ S. Then r_1 - r_2 = s_2 - s_1. But r_1 - r_2 ∈ R and s_2 - s_1 ∈ S. Since R ∩ S = 0, we must have r_1 = r_2 and s_1 = s_2, from which uniqueness follows.
The statement of the second part is a special case of the next theorem. □
Theorem 2.27. For arbitrary subspaces R, S of a vector space V,
dim(R + S) = dim(R) + dim(S) - dim(R ∩ S).
Example 2.28. Let U be the subspace of upper triangular matrices in R^{n×n} and let L be the subspace of lower triangular matrices in R^{n×n}. Then it may be checked that U + L = R^{n×n} while U ∩ L is the set of diagonal matrices in R^{n×n}. Using the fact that dim{diagonal matrices} = n, together with Examples 2.20.2 and 2.20.5, one can easily verify the validity of the formula given in Theorem 2.27.
Example 2.29. Let (V, F) = (R^{n×n}, R), let R be the set of skew-symmetric matrices in R^{n×n}, and let S be the set of symmetric matrices in R^{n×n}. Then V = R ⊕ S.
Proof: This follows easily from the fact that any A ∈ R^{n×n} can be written in the form
A = (1/2)(A + A^T) + (1/2)(A - A^T).
The first matrix on the right-hand side above is in S while the second is in R. □
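The decomposition used in the proof of Example 2.29 can be checked numerically; the matrix below is randomly generated for illustration.

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

S = 0.5 * (A + A.T)      # symmetric part (in S)
R = 0.5 * (A - A.T)      # skew-symmetric part (in R)

assert np.allclose(S, S.T)
assert np.allclose(R, -R.T)
assert np.allclose(A, S + R)    # A splits into a symmetric plus a skew-symmetric matrix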
EXERCISES
1. Suppose {v_1, ..., v_k} is a linearly dependent set. Then show that one of the vectors must be a linear combination of the others.
2. Let x_1, x_2, ..., x_k ∈ R^n be nonzero mutually orthogonal vectors. Show that {x_1, ..., x_k} must be a linearly independent set.
3. Let v_1, ..., v_n be orthonormal vectors in R^n. Show that Av_1, ..., Av_n are also orthonormal if and only if A ∈ R^{n×n} is orthogonal.
4. Consider the vectors v_1 = [2, 1]^T and v_2 = [3, 1]^T. Prove that v_1 and v_2 form a basis for R². Find the components of the vector v = [4, 1]^T with respect to this basis.
5. Let P denote the set of polynomials of degree less than or equal to two of the form p_0 + p_1 x + p_2 x², where p_0, p_1, p_2 ∈ R. Show that P is a vector space over R. Show that the polynomials 1, x, and 2x² - 1 are a basis for P. Find the components of the polynomial 2 + 3x + 4x² with respect to this basis.
6. Prove Theorem 2.22 (for the case of two subspaces R and S only).
7. Let P^n denote the vector space of polynomials of degree less than or equal to n, and of the form p(x) = p_0 + p_1 x + ··· + p_n x^n, where the coefficients p_i are all real. Let PE denote the subspace of all even polynomials in P^n, i.e., those that satisfy the property p(-x) = p(x). Similarly, let PO denote the subspace of all odd polynomials, i.e., those satisfying p(-x) = -p(x). Show that P^n = PE ⊕ PO.
8. Repeat Example 2.28 using instead the two subspaces T of tridiagonal matrices and U of upper triangular matrices.
Chapter 3
Linear Transformations
3.1 Definition and Examples
We begin with the basic definition of a linear transformation (or linear map, linear function, or linear operator) between two vector spaces.
Definition 3.1. Let (V, F) and (W, F) be vector spaces. Then L : V → W is a linear transformation if and only if
L(αv_1 + βv_2) = αLv_1 + βLv_2 for all α, β ∈ F and for all v_1, v_2 ∈ V.
The vector space V is called the domain of the transformation L while W, the space into which it maps, is called the co-domain.
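As a concrete (hypothetical) illustration, any matrix A ∈ R^{m×n} defines a linear transformation v ↦ Av from R^n to R^m, and the defining property of Definition 3.1 can be spot-checked numerically:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 2))          # L(v) = A v maps R^2 into R^3
v1, v2 = rng.standard_normal(2), rng.standard_normal(2)
alpha, beta = 2.0, -0.5

lhs = A @ (alpha * v1 + beta * v2)
rhs = alpha * (A @ v1) + beta * (A @ v2)
assert np.allclose(lhs, rhs)             # L(alpha*v1 + beta*v2) = alpha*L(v1) + beta*L(v2)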
3.2 Matrix Representation of Linear Transformations
Linear transformations between vector spaces with specific bases can be represented conveniently in matrix form. Specifically, suppose L : (V, F) → (W, F) is linear and further suppose that {v_i, i ∈ n} and {w_j, j ∈ m} are bases for V and W, respectively. Then the ith column of A = Mat L (the matrix representation of L with respect to the given bases for V and W) is the representation of Lv_i with respect to {w_j, j ∈ m}. In other words,
A = [a_1, ..., a_n] ∈ R^{m×n}
represents L since
Lv_i = a_1i w_1 + ··· + a_mi w_m = W a_i,
where W = [w_1, ..., w_m] and
a_i = [a_1i, ..., a_mi]^T
is the ith column of A. Note that A = Mat L depends on the particular bases for V and W. This could be reflected by subscripts, say, in the notation, but this is usually not done.
The action of L on an arbitrary vector v ∈ V is uniquely determined (by linearity) by its action on a basis. Thus, if v = ξ_1 v_1 + ··· + ξ_n v_n = Vx (where v, and hence x, is arbitrary), then
Lv = ξ_1 Lv_1 + ··· + ξ_n Lv_n = ξ_1 W a_1 + ··· + ξ_n W a_n = W Ax.
Thus, LV = WA since x was arbitrary.
When V = R^n, W = R^m and {v_i, i ∈ n}, {w_j, j ∈ m} are the usual (natural) bases, the equation LV = WA becomes simply L = A. We thus commonly identify A as a linear transformation with its matrix representation. Thinking of A both as a matrix and as a linear transformation from R^n to R^m usually causes no confusion. Change of basis then corresponds naturally to appropriate matrix multiplication.
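The construction of A = Mat L can be sketched numerically. In the example below the map L and the bases for the domain and co-domain are hypothetical choices introduced only for illustration; the ith column of A holds the components of L v_i with respect to {w_1, w_2}.

import numpy as np

# L : R^2 -> R^2 given in the natural basis by the matrix L_nat (hypothetical example).
L_nat = np.array([[1.0, 2.0],
                  [0.0, 3.0]])

V = np.column_stack([np.array([1.0, 1.0]), np.array([1.0, -1.0])])   # basis {v_1, v_2} of the domain
W = np.column_stack([np.array([2.0, 0.0]), np.array([0.0, 1.0])])    # basis {w_1, w_2} of the co-domain

# ith column of A = components of L v_i with respect to {w_1, w_2}, i.e., solve W a_i = L v_i.
A = np.linalg.solve(W, L_nat @ V)

# Check the defining relation L V = W A.
assert np.allclose(L_nat @ V, W @ A)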
3.3 Composition of Transformations
Consider three vector spaces U, V, and W and transformations B from U to V and A from V to W. Then we can define a new transformation C as follows:
C : U → W,  Cu = A(Bu) for all u ∈ U.
This composition of transformations, C = AB, can be pictured as
U --B--> V --A--> W.
Note that in most texts the arrows above are reversed as follows:
W <--A-- V <--B-- U.
However, it might be useful to prefer the former since the transformations A and B appear in the same order in both the diagram and the equation. If dim U = p, dim V = n, and dim W = m, and if we associate matrices with the transformations in the usual way, then composition of transformations corresponds to standard matrix multiplication. That is, we have C = AB. The above is sometimes expressed componentwise by the formula
c_ij = Σ_{k=1}^{n} a_ik b_kj.
Two Special Cases:
Inner Product: Let x, y ∈ R^n. Then their inner product is the scalar
x^T y = Σ_{i=1}^{n} x_i y_i.
Outer Product: Let x ∈ R^m, y ∈ R^n. Then their outer product is the m × n matrix
x y^T = [x_i y_j].
Note that any rank-one matrix A ∈ R^{m×n} can be written in the form A = x y^T above (or x y^H if A ∈ C^{m×n}). A rank-one symmetric matrix can be written in the form x x^T (or x x^H).
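A small numerical sketch of these ideas with randomly generated (hypothetical) matrices: composing the transformations corresponds to the matrix product, and an outer product has rank one.

import numpy as np

rng = np.random.default_rng(4)
p, n, m = 3, 4, 2
B = rng.standard_normal((n, p))   # B : R^p -> R^n
A = rng.standard_normal((m, n))   # A : R^n -> R^m
u = rng.standard_normal(p)

C = A @ B                         # C = A B : R^p -> R^m
assert np.allclose(C @ u, A @ (B @ u))               # applying C is applying B, then A

x, y = rng.standard_normal(m), rng.standard_normal(n)
assert np.linalg.matrix_rank(np.outer(x, y)) == 1    # the outer product x y^T has rank one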
3.4 Structure of Linear Transformations
Let A : V → W be a linear transformation.
Definition 3.3. The range of A, denoted R(A), is the set {w ∈ W : w = Av for some v ∈ V}. Equivalently, R(A) = {Av : v ∈ V}. The range of A is also known as the image of A and denoted Im(A).
The nullspace of A, denoted N(A), is the set {v ∈ V : Av = 0}. The nullspace of A is also known as the kernel of A and denoted Ker(A).
Theorem 3.4. Let A : V → W be a linear transformation. Then
1. R(A) ⊆ W.
2. N(A) ⊆ V.
Note that N(A) and R(A) are, in general, subspaces of different spaces.
Theorem 3.5. Let A ∈ R^{m×n}. If A is written in terms of its columns as A = [a_1, ..., a_n], then
R(A) = Sp{a_1, ..., a_n}.
Proof: The proof of this theorem is easy, essentially following immediately from the definition. □
Remark 3.6. Note that in Theorem 3.5 and throughout the text, the same symbol (A) is used to denote both a linear transformation and its matrix representation with respect to the usual (natural) bases. See also the last paragraph of Section 3.2.
Definition 3.7. Let {v_1, ..., v_k} be a set of nonzero vectors v_i ∈ R^n. The set is said to be orthogonal if v_i^T v_j = 0 for i ≠ j and orthonormal if v_i^T v_j = δ_ij, where δ_ij is the Kronecker delta defined by
δ_ij = 1 if i = j, and δ_ij = 0 if i ≠ j.
Definition 3.9. Let S ⊆ R^n. Then the orthogonal complement of S is defined as the set
S⊥ = {v ∈ R^n : v^T s = 0 for all s ∈ S}.
Example 3.10. Let S = Sp{v_1, v_2} ⊆ R^n be the span of two orthogonal vectors v_1 and v_2. Then it can be shown that S⊥ consists of all solutions x of the pair of equations v_1^T x = 0 and v_2^T x = 0. Working from the definition, the computation involved is simply to find all nontrivial (i.e., nonzero) solutions of this system of equations. Note that there is nothing special about the two vectors in the basis defining S being orthogonal. Any set of vectors will do, including dependent spanning vectors (which would, of course, then give rise to redundant equations).
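One way to carry out such a computation is sketched below; the spanning vectors are hypothetical, introduced only for illustration. The rows of M are the spanning vectors of S, and an orthonormal basis for S⊥ is read off from the right singular vectors of M associated with its nullspace (the singular value decomposition is treated in Chapter 5).

import numpy as np

# Hypothetical spanning vectors for S (they need not be orthogonal or independent).
s1 = np.array([1.0, 2.0, 1.0])
s2 = np.array([0.0, 1.0, -1.0])
M = np.vstack([s1, s2])              # S-perp is the nullspace of M

_, sing_vals, Vt = np.linalg.svd(M)
rank = int(np.sum(sing_vals > 1e-12))
basis_perp = Vt[rank:].T             # columns form an orthonormal basis for S-perp

# Every basis vector of S-perp is orthogonal to the spanning vectors of S.
assert np.allclose(M @ basis_perp, 0)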
Theorem 3.11. Let S ⊆ R^n be a subspace. Then
1. S⊥ is a subspace of R^n.
2. R^n = S ⊕ S⊥.
3. (S⊥)⊥ = S.
Proof: We prove and discuss only item 2 here. The proofs of the other results are left as exercises. Let {v_1, ..., v_k} be an orthonormal basis for S and let x ∈ R^n be an arbitrary vector. Set
x_1 = Σ_{i=1}^{k} (v_i^T x) v_i,  x_2 = x - x_1.
Then x_1 ∈ S and, since
v_j^T x_2 = v_j^T x - v_j^T x_1 = v_j^T x - v_j^T x = 0 for each j,
we see that x_2 is orthogonal to v_1, ..., v_k and hence to any linear combination of these vectors. In other words, x_2 is orthogonal to any vector in S. We have thus shown that S + S⊥ = R^n. We also have that S ∩ S⊥ = 0 since the only vector s ∈ S orthogonal to everything in S (i.e., including itself) is 0.
It is also easy to see directly that, when we have such direct sum decompositions, we can write vectors in a unique way with respect to the corresponding subspaces. Suppose, for example, that x = x_1 + x_2 = x'_1 + x'_2, where x_1, x'_1 ∈ S and x_2, x'_2 ∈ S⊥. Then (x'_1 - x_1)^T (x'_2 - x_2) = 0 by definition of S⊥. But then (x'_1 - x_1)^T (x'_1 - x_1) = 0 since x'_2 - x_2 = -(x'_1 - x_1) (which follows by rearranging the equation x_1 + x_2 = x'_1 + x'_2). Thus, x_1 = x'_1 and x_2 = x'_2. □
Theorem 3.12. Let A : R^n → R^m. Then
1. N(A)⊥ = R(A^T). (Note: This holds only for finite-dimensional vector spaces.)
2. R(A)⊥ = N(A^T). (Note: This also holds for infinite-dimensional vector spaces.)
Proof: To prove the first part, take an arbitrary x ∈ N(A). Then Ax = 0 and this is equivalent to y^T Ax = 0 for all y. But y^T Ax = (A^T y)^T x. Thus, Ax = 0 if and only if x is orthogonal to all vectors of the form A^T y, i.e., x ∈ R(A^T)⊥. Since x was arbitrary, we have established that N(A)⊥ = R(A^T).
The proof of the second part is similar and is left as an exercise. □
Definition 3.13. Let A : R^n → R^m. Then {v ∈ R^n : Av = 0} is sometimes called the right nullspace of A. Similarly, {w ∈ R^m : w^T A = 0} is called the left nullspace of A. Clearly, the right nullspace is N(A) while the left nullspace is N(A^T).
Theorem 3.12 and part 2 of Theorem 3.11 can be combined to give two very fundamental and useful decompositions of vectors in the domain and co-domain of a linear transformation A. See also Theorem 2.26.
Theorem 3.14 (Decomposition Theorem). Let A : R^n → R^m. Then
1. every vector v in the domain space R^n can be written in a unique way as v = x + y, where x ∈ N(A) and y ∈ N(A)⊥ = R(A^T) (i.e., R^n = N(A) ⊕ R(A^T)).
2. every vector w in the co-domain space R^m can be written in a unique way as w = x + y, where x ∈ R(A) and y ∈ R(A)⊥ = N(A^T) (i.e., R^m = R(A) ⊕ N(A^T)).
This key theorem becomes very easy to remember by carefully studying and understanding Figure 3.1 in the next section.
3.5 Four Fundamental Subspaces
Consider a general matrix A ∈ R^{m×n}. When thought of as a linear transformation from R^n to R^m, many properties of A can be developed in terms of the four fundamental subspaces R(A), R(A)⊥, N(A), and N(A)⊥. Figure 3.1 makes many key properties seem almost obvious and we return to this figure frequently both in the context of linear transformations and in illustrating concepts such as controllability and observability.
Figure 3.1 Four fundamental subspaces.
Definition 3.15. Let V and W be vector spaces and let A : V → W be a linear transformation.
1. A is onto (also called epic or surjective) if R(A) = W.
2. A is one-to-one or 1-1 (also called monic or injective) if N(A) = 0. Two equivalent characterizations of A being 1-1 that are often easier to verify in practice are the following:
(a) Av_1 = Av_2 implies v_1 = v_2.
(b) Av = 0 implies v = 0.
Definition 3.16. Let A : R^n → R^m. Then rank(A) = dim R(A). This is sometimes called the column rank of A (maximum number of independent columns). The row rank of A is dim R(A^T) (maximum number of independent rows). The dual notion to rank is the nullity of A, sometimes denoted nullity(A) or corank(A), and is defined as dim N(A).
Theorem 3.17. Let A : R^n → R^m. Then dim R(A) = dim N(A)⊥. (Note: Since N(A)⊥ = R(A^T), this theorem is sometimes colloquially stated "row rank of A = column rank of A.")
Proof: Define a linear transformation T : N(A)⊥ → R(A) by
Tv = Av for all v ∈ N(A)⊥.
Clearly T is 1-1 (since N(T) = 0). To see that T is also onto, take any w ∈ R(A). Then by definition there is a vector x ∈ R^n such that Ax = w. Write x = x_1 + x_2, where x_1 ∈ N(A)⊥ and x_2 ∈ N(A). Then Ax_1 = w = Tx_1 since x_1 ∈ N(A)⊥. The last equality shows that T is onto. We thus have that dim R(A) = dim N(A)⊥ since it is easily shown that if {v_1, ..., v_r} is a basis for N(A)⊥, then {Tv_1, ..., Tv_r} is a basis for R(A). Finally, if we apply this and several previous results, the following string of equalities follows easily: "column rank of A" = rank(A) = dim R(A) = dim N(A)⊥ = dim R(A^T) = rank(A^T) = "row rank of A." □
The following corollary is immediate. Like the theorem, it is a statement about equality of dimensions; the subspaces themselves are not necessarily in the same vector space.
Corollary 3.18. Let A : R^n → R^m. Then dim N(A) + dim R(A) = n, where n is the dimension of the domain of A.
Proof: From Theorems 3.11 and 3.17 we see immediately that
n = dim N(A) + dim N(A)⊥ = dim N(A) + dim R(A). □
For completeness, we include here a few miscellaneous results about ranks of sums and products of matrices.
Theorem 3.19. Let A, B ∈ R^{n×n}. Then
1. 0 ≤ rank(A + B) ≤ rank(A) + rank(B).
2. rank(A) + rank(B) - n ≤ rank(AB) ≤ min{rank(A), rank(B)}.
3. nullity(B) ≤ nullity(AB) ≤ nullity(A) + nullity(B).
4. if B is nonsingular, then rank(AB) = rank(BA) = rank(A) and N(BA) = N(A).
Part 4 of Theorem 3.19 suggests looking at the general problem of the four fundamental subspaces of matrix products. The basic results are contained in the following easily proved theorem.
Theorem 3.20. Let A ∈ R^{m×n}, B ∈ R^{n×p}. Then
1. R(AB) ⊆ R(A).
2. N(AB) ⊇ N(B).
3. R((AB)^T) ⊆ R(B^T).
4. N((AB)^T) ⊇ N(A^T).
The next theorem is closely related to Theorem 3.20 and is also easily proved. It is extremely useful in the text that follows, especially when dealing with pseudoinverses and linear least squares problems.
Theorem 3.21. Let A ∈ R^{m×n}. Then
1. R(A) = R(A A^T).
2. R(A^T) = R(A^T A).
3. N(A) = N(A^T A).
4. N(A^T) = N(A A^T).
We now characterize 1-1 and onto transformations and provide characterizations in terms of rank and invertibility.
Theorem 3.22. Let A : R^n → R^m. Then
1. A is onto if and only if rank(A) = m (A has linearly independent rows or is said to have full row rank; equivalently, A A^T is nonsingular).
2. A is 1-1 if and only if rank(A) = n (A has linearly independent columns or is said to have full column rank; equivalently, A^T A is nonsingular).
Proof: Proof of part 1: If A is onto, dim R(A) = m = rank(A). Conversely, let y ∈ R^m be arbitrary. Let x = A^T (A A^T)^{-1} y ∈ R^n. Then y = Ax, i.e., y ∈ R(A), so A is onto.
Proof of part 2: If A is 1-1, then N(A) = 0, which implies that dim N(A)⊥ = n = dim R(A^T), and hence dim R(A) = n by Theorem 3.17. Conversely, suppose Ax_1 = Ax_2. Then A^T A x_1 = A^T A x_2, which implies x_1 = x_2 since A^T A is invertible. Thus, A is 1-1. □
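A numerical sketch of these characterizations on a randomly generated (hypothetical) matrix with full row rank:

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 5))     # a random 3 x 5 matrix has full row rank with probability one
m, n = A.shape

assert np.linalg.matrix_rank(A) == m                # rank(A) = m, so A is onto
assert abs(np.linalg.det(A @ A.T)) > 1e-12          # equivalently, A A^T is nonsingular

B = A.T                                             # 5 x 3, full column rank, hence 1-1
assert np.linalg.matrix_rank(B) == B.shape[1]
assert abs(np.linalg.det(B.T @ B)) > 1e-12          # equivalently, B^T B is nonsingular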
Definition 3.23. A : V → W is invertible (or bijective) if and only if it is 1-1 and onto. Note that if A is invertible, then dim V = dim W. Also, A : R^n → R^n is invertible or nonsingular if and only if rank(A) = n.
Note that in the special case when A ∈ R_n^{n×n} (i.e., A is n × n and nonsingular), the transformations A, A^T, and A^{-1} are all 1-1 and onto between the two spaces N(A)⊥ and R(A). The transformations A^T and A^{-1} have the same domain and range but are in general different maps unless A is orthogonal. Similar remarks apply to A and A^{-T}.