An Introduction to Wavelets Through Linear Algebra
Michael W Frazier
Springer
Mathematics majors at Michigan State University take a “Capstone” course near the end of their undergraduate careers. The content of this course varies with each offering. Its purpose is to bring together different topics from the undergraduate curriculum and introduce students to a developing area in mathematics. This text was originally written for a Capstone course.
Basic wavelet theory is a natural topic for such a course. By name, wavelets date back only to the 1980s. On the boundary between mathematics and engineering, wavelet theory shows students that mathematics research is still thriving, with important applications in areas such as image compression and the numerical solution of differential equations. The author believes that the essentials of wavelet theory are sufficiently elementary to be taught successfully to advanced undergraduates.
This text is intended for undergraduates, so only a basic background in linear algebra and analysis is assumed. We do not require familiarity with complex numbers and the roots of unity. These are introduced in the first two sections of chapter 1. In the remainder of chapter 1 we review linear algebra. Students should be familiar with the basic definitions in sections 1.3 and 1.4. From our viewpoint, linear transformations are the primary object of study; a matrix arises as a realization of a linear transformation. Many students may have been exposed to the material on change of basis in section 1.4, but may benefit from seeing it again. In section 1.5, we ask how to pick a basis to simplify the matrix representation of a given linear transformation as much as possible. We then focus on the simplest case, when the linear transformation is diagonalizable. In section 1.6, we discuss inner products and orthonormal bases. We end with a statement of the spectral theorem for matrices, whose proof is outlined in the exercises. This is beyond the experience of most undergraduates.
Chapter 1 is intended as reference material. Depending on background, many readers and instructors will be able to skip or quickly review much of this material. The treatment in chapter 1 is relatively thorough, however, to make the text as self-contained as possible, provide a logically ordered context for the subject matter, and motivate later developments.
The author believes that students should be introduced to Fourier analysis in the finite dimensional context, where everything can be explained in terms of linear algebra. The key ideas can be exhibited in this setting without the distraction of technicalities relating to convergence. We start by introducing the Discrete Fourier Transform (DFT) in section 2.1. The DFT of a vector consists of its components with respect to a certain orthogonal basis of complex exponentials. The key point, that all translation-invariant linear transformations are diagonalized by this basis, is proved in section 2.2. We turn to computational issues in section 2.3, where we see that the DFT can be computed rapidly via the Fast Fourier Transform (FFT).
It is not so well known that the basics of wavelet theory can also be introduced in the finite dimensional context. This is done in chapter 3. The material here is not entirely standard; it is an adaptation of wavelet theory to the finite dimensional setting. It has the advantage that it requires only linear algebra as background. In section 3.1, we search for orthonormal bases with both space and frequency localization, which can be computed rapidly. We are led to consider the even integer translates of two vectors, the mother and father wavelets in this context. The filter bank arrangement for the computation of wavelets arises naturally here. By iterating this filter bank structure, we arrive in section 3.2 at a multilevel wavelet basis. Examples and applications are discussed in section 3.3. Daubechies’s wavelets are presented in this context, and elementary compression examples are considered. A student familiar with MatLab, Maple, or Mathematica should be able to carry out similar examples if desired.
In section 4.1 we change to the infinite dimensional but discrete setting $\ell^2(\mathbb{Z})$, the square summable sequences on the integers. General properties of complete orthonormal sets in inner product spaces are discussed in section 4.2. This is the first point where analysis enters our picture in a serious way. Square integrable functions on the interval $[-\pi, \pi)$ and their Fourier series are developed in section 4.3. Here we have to cheat a little bit: we note that we are using the Lebesgue integral but we don’t define it, and we ask students to accept certain of its properties. We arrive again at the key principle that the Fourier system diagonalizes translation-invariant linear operators. The relevant version of the Fourier transform in this setting is the map taking a sequence in $\ell^2(\mathbb{Z})$ to a function in $L^2([-\pi, \pi))$ whose Fourier coefficients make up the original sequence. Its properties are presented in section 4.4. Given this preparation, the construction of first stage wavelets on the integers (section 4.5) and the iteration step yielding a multilevel basis (section 4.6) are carried out in close analogy to the methods in chapter 3. The computation of wavelets in the context of $\ell^2(\mathbb{Z})$ is discussed in section 4.7, which includes the construction of Daubechies’s wavelets on $\mathbb{Z}$. The generators $u$ and $v$ of a wavelet system for $\ell^2(\mathbb{Z})$ reappear in chapter 5 as the scaling sequence and its companion.
The usual version of wavelet theory on the real line is presented in chapter 5. The preliminaries regarding square integrable functions and the Fourier transform are discussed in sections 5.1 and 5.2. The facts regarding Fourier inversion in $L^2(\mathbb{R})$ are proved in detail, although many instructors may prefer to assume these results. The Fourier inversion formula is analogous to an orthonormal basis representation, using an integral rather than a sum. Again we see that the Fourier system diagonalizes translation-invariant operators. Mallat’s theorem that a multiresolution analysis yields an orthonormal wavelet basis is proved in section 5.3. The aforementioned relation between the scaling sequence and wavelets on $\ell^2(\mathbb{Z})$ allows us to make direct use of the results of chapter 4. The conditions under which wavelets on $\ell^2(\mathbb{Z})$ can be used to generate a multiresolution analysis, and hence wavelets on $\mathbb{R}$, are considered in section 5.4. In section 5.5, we construct Daubechies’s wavelets of compact support, and show how the wavelet transform is implemented using filter banks.
We briefly consider the application of these results to numerical differential equations in chapter 6. We begin in section 6.1 with a discussion of the condition number of a matrix. In section 6.2, we present a simple example of the numerical solution of a constant coefficient ordinary differential equation on $[0, 1]$ using finite differences. We see that although the resulting matrix is sparse, which is convenient, it has a condition number that grows quadratically with the size of the matrix. By comparison, in section 6.3, we see that for a wavelet-Galerkin discretization of a uniformly elliptic, possibly variable-coefficient, differential equation, the matrix of the associated linear system can be preconditioned to be sparse and to have bounded condition number. The boundedness of the condition number comes from a norm equivalence property of wavelets that we state without proof. The sparseness of the associated matrix comes from the localization of the wavelet system. A large proportion of the time, the orthogonality of wavelet basis members comes from their supports not overlapping (using wavelets of compact support, say). This is a much more robust property, for example with respect to multiplying by a variable coefficient function, than the delicate cancellation underlying the orthogonality of the Fourier system. Thus, although the wavelet system may not exactly diagonalize any natural operator, it nearly diagonalizes (in the sense of the matrix being sparse) a much larger class of operators than the Fourier basis.
Basic wavelet theory includes aspects of linear algebra, real and complex analysis, numerical analysis, and engineering. In this respect it mimics modern mathematics, which is becoming increasingly interdisciplinary.
This text is relatively elementary at the start, but the level of difficulty increases steadily. It can be used in different ways depending on the preparation level of the class. If a long time is required for chapter 1, then the more difficult proofs in the later chapters may have to be only briefly outlined. For a more advanced group, most or all of chapter 1 could be skipped, which would leave time for a relatively thorough treatment of the remainder. A shorter course for a more sophisticated audience could start in chapter 4 because the main material in chapters 4 and 5 is technically, although not conceptually, independent of the content of chapters 2 and 3. An individual with a solid background in Fourier analysis could learn the basics of wavelet theory from sections 4.5, 4.7, 5.3, 5.4, 5.5, and 6.3 with only occasional references to the remainder of the text.
This volume is intended as an introduction to wavelet theory that is as elementary as possible. It is not designed to be a thorough reference. We refer the reader interested in additional information to the Bibliography at the end of the text.
April 1999
This text owes a great deal to a number of my colleagues and students. The discrete presentation in Chapters 3 and 4 was developed in joint work (Frazier and Kumar, 1993) with Arun Kumar, in our early attempt to understand wavelets. This was further clarified in consulting work done with Jay Epperson at Daniel H. Wagner Associates in California. Many of the graphs in this text are similar to examples done by Douglas McCulloch during this consulting project. Additional insight was gained in subsequent work with Rodolfo Torres.
My colleagues at Michigan State University provided assistance with this text in various ways. Patti Lamm read a preliminary version in its entirety and made more than a hundred useful suggestions, including some that led to a complete overhaul of section 6.2. She also provided computer assistance with the figures in the Prologue. Sheldon Axler supplied technical assistance and made suggestions that improved the style and presentation throughout the manuscript. T.-Y. Li made a number of helpful suggestions, including providing me with Exercise 1.6.20. Byron Drachman helped with the index.
I have had the opportunity to test preliminary versions of this text in the classroom on several occasions. It was used at Michigan State University in a course for undergraduates in spring 1996 and in a beginning graduate course in summer 1996. The administration of the Mathematics Department, especially Jon Hall, Bill Sledd, and Wei-Eihn Kuan, went out of their way to provide these opportunities. The students in these classes made many suggestions and corrections, which have improved the text. Gihan Mandour, Jian-Yu Lin, Rudolf Blazek, and Richard Andrusiak made large numbers of corrections.
This text was also the basis for three short courses on wavelets. One of these was presented at the University of Puerto Rico at Mayagüez in the spring of 1997. I thank Nayda Santiago for helping arrange the visit, and Shawn Hunt, Domingo Rodríguez, and Ramón Vásquez for inviting me and for their warm hospitality. Another short course was given at the University of Missouri at Columbia in fall 1997. I thank Elias Saab and Nakhlé Asmar for making this possible. The third short course took place at the Instituto de Matemáticas de la UNAM in Cuernavaca, Mexico in summer 1998. I thank Professors Salvador Pérez-Esteva and Carlos Villegas Blas for their efforts in arranging this trip, and for their congeniality throughout. The text in preliminary form has also been used in courses given by Cristina Pereyra at the University of New Mexico and by Suzanne Tourville at Carnegie-Mellon University. Cristina, Suzanne, and their students provided valuable feedback and a number of corrections, as did Kees Onneweer.
My doctoral students Kunchuan Wang and Mike Nixon made many helpful suggestions and found a number of corrections in the manuscript. My other doctoral student, Shangqian Zhang, taught me the mathematics in Section 6.3. I also thank him and his son Simon Zhang for Figure 35.
The fingerprint examples in Figures 1–3 in the Prologue were provided by Chris Brislawn of the Los Alamos National Laboratories. I thank him for permission to reproduce these images. Figures 36e and f were prepared using a program (Summus 4U2C 3.0) provided to me by Björn Jawerth and Summus Technologies, Inc., for which I am grateful. Figures 36b, c, and d were created using the commercially available software WinJPEG v.2.84. The manuscript and some of the figures were prepared using LaTeX. The other figures were done using MatLab. Steve Plemmons, the computer manager in the mathematics department at Michigan State University, aided in many ways, particularly with regard to the images in Figure 36. I thank Ina Lindemann, my editor at Springer-Verlag, for her assistance, encouragement, and especially her patience.
I take this opportunity to thank the mathematicians whose aid was critical in helping me reach the point where it became possible for me to write this text. The patience and encouragement of my thesis advisor John Garnett was essential at the start. My early collaboration with Björn Jawerth played a decisive role in my career. My postdoctoral advisor Guido Weiss encouraged and helped me in many important ways over the years.
This text was revised and corrected during a sabbatical leave provided by Michigan State University. This leave was spent at the University of Missouri at Columbia. I thank the University of Missouri for their hospitality and for providing me with valuable resources and technical assistance.
At a time when academic tenure is under attack, it is worth commenting that this text and many others like it would not have been written without the tenure system.
Preface
1.2 Complex Series, Euler’s Formula, and the Roots of Unity
1.3 Vector Spaces and Bases
1.4 Linear Transformations, Matrices, and Change of Basis
1.5 Diagonalization of Linear Transformations and Matrices
1.6 Inner Products, Orthonormal Bases, and Unitary Matrices
2.1 Basic Properties of the Discrete Fourier Transform
2.2 Translation-Invariant Linear Transformations
2.3 The Fast Fourier Transform
3 Wavelets on $\mathbb{Z}_N$
3.1 Construction of Wavelets on $\mathbb{Z}_N$: The First Stage
3.2 Construction of Wavelets on $\mathbb{Z}_N$: The Iteration Step
3.3 Examples and Applications
4 Wavelets on $\mathbb{Z}$
4.1 $\ell^2(\mathbb{Z})$
4.2 Complete Orthonormal Sets in Hilbert Spaces
4.3 $L^2([-\pi, \pi))$ and Fourier Series
4.4 The Fourier Transform and Convolution on $\ell^2(\mathbb{Z})$
4.5 First-Stage Wavelets on $\mathbb{Z}$
4.6 The Iteration Step for Wavelets on $\mathbb{Z}$
4.7 Implementation and Examples
5 Wavelets on $\mathbb{R}$
5.1 $L^2(\mathbb{R})$ and Approximate Identities
5.2 The Fourier Transform on $\mathbb{R}$
5.3 Multiresolution Analysis and Wavelets
5.4 Construction of Multiresolution Analyses
5.5 Wavelets with Compact Support and Their Computation
6 Wavelets and Differential Equations
6.1 The Condition Number of a Matrix
6.2 Finite Difference Methods for Differential Equations
6.3 Wavelet-Galerkin Methods for Differential Equations
Prologue: Compression of the FBI Fingerprint Files
When your local police arrest somebody on a minor charge, they would like to check whether that person has an outstanding warrant, possibly in another state, for a more serious crime. To check, they can send his or her fingerprints to the FBI fingerprint archive in Washington, D.C. Unfortunately, the FBI cannot compare the received fingerprints with their records rapidly enough to make an identification before the suspect must be released. A criminal wanted on a serious charge will most likely have vacated the area by the time the FBI has provided the necessary identification.
Why does it take so long? The FBI fingerprint files are stored on fingerprint cards in filing cabinets in a warehouse that occupies about an acre of floor space. The logistics of the search procedure make it impossible to proceed sufficiently rapidly.
The solution to this seems obvious: the FBI fingerprint data should be computerized and searched electronically. After all, this is the computer age. Why hasn’t this been done long ago?
Data representing a fingerprint image can be stored on a computer in such a way that the image can be reconstructed with sufficient accuracy to allow positive identification. To do this, the fingerprint image is scanned and digitized. Each square inch of the fingerprint image is broken into a 500 by 500 grid of small boxes, called pixels. Each pixel is given a gray-scale value corresponding to its darkness, on a scale from 0 to 255. Because the integers from 0 to 255 can be represented in base 2 using eight places (that is, each integer between 0 and 255 corresponds to an 8-digit sequence of zeros and ones), it takes eight binary data bits to specify the darkness of one pixel. (One digit in base 2 represents a single data bit, which electronically corresponds to the difference between a switch being on or off.) A portion of a fingerprint scanned in this way is exhibited in Figure 1.
FIGURE 1 Original fingerprint image (Courtesy of Chris Brislawn, Los Alamos National Laboratory)
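The eight-bit claim can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `bits_needed` is introduced here and is not from the text.

```python
def bits_needed(max_value: int) -> int:
    """Number of binary digits needed to write max_value in base 2."""
    return max_value.bit_length()

# Every gray-scale level from 0 to 255 as an 8-digit string of zeros and ones.
bit_strings = [format(level, "08b") for level in range(256)]

darkest = bit_strings[255]   # the largest gray-scale value
lightest = bit_strings[0]    # the smallest gray-scale value
```

All 256 levels receive distinct 8-digit strings, so one byte per pixel is exactly enough.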
Consider the amount of data required for a single fingerprint card. Each rolled fingerprint is about 1.5 inches by 1.6 inches, with $500^2 = 250{,}000$ pixels per square inch, each requiring eight data bits (one data byte). So each fingerprint requires about 600,000 data bytes. A card includes all 10 rolled fingerprints, plus 2 unrolled thumb impressions and 2 impressions of all 5 fingers on a hand. The result is that each card requires about 10 megabytes of data (a megabyte is one million bytes). This is still manageable for modern computers, which frequently have several gigabytes of memory (a gigabyte is a billion, or $10^9$, bytes). Electronic transmission of the data on a card is feasible, although slow. So it is possible for the police to send the necessary data electronically to the FBI while the suspect is still in custody.
However, the FBI has about 200 million fingerprint cards in its archive. (Many are for deceased individuals, and there are some duplications; apparently the FBI is not good at throwing things away.) Hence digitizing the entire archive would require roughly $2 \times 10^{15}$ data bytes, or about 2,000 terabytes (a terabyte is $10^{12}$ bytes) of memory. This represents more data than current computers can store. Even if we restrict to cards corresponding to current criminal suspects, we are dealing with about 29 million cards (with some duplications due to aliases), or roughly $2 \times 10^{14}$ data bytes. Thus it would require about 60,000 3-gigabyte hard drives to store. This is too much, even for the FBI. Even if a database this large could be stored, it could not be rapidly searched. Yet it is not astronomically too large. If the amount of data could be cut by a factor of about 20, it could be stored on roughly 3,000 3-gigabyte hard drives. This is still a lot, but not an unimaginable amount for a government agency. Thus what is needed is a method to compress the data, that is, to represent the information using less data while retaining enough accuracy to allow positive identification.
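The storage estimates above are simple arithmetic, and a short Python sketch makes the orders of magnitude easy to reproduce. The variable names are invented for this illustration, and the inputs are the rounded figures quoted in the text (10 megabytes per card, roughly $2 \times 10^{14}$ bytes for current suspects), so the results are rough estimates only.

```python
PIXELS_PER_INCH = 500          # grid resolution per linear inch
BYTES_PER_PIXEL = 1            # eight bits of gray scale

# One rolled fingerprint: about 1.5 in by 1.6 in.
bytes_per_print = round(1.5 * 1.6 * PIXELS_PER_INCH**2 * BYTES_PER_PIXEL)

BYTES_PER_CARD = 10 * 10**6    # "about 10 megabytes", as stated in the text
ARCHIVE_CARDS = 200 * 10**6    # "about 200 million" cards

archive_bytes = ARCHIVE_CARDS * BYTES_PER_CARD
archive_terabytes = archive_bytes // 10**12

ACTIVE_BYTES = 2 * 10**14      # rounded figure quoted for current suspects
DRIVE_BYTES = 3 * 10**9        # one 3-gigabyte hard drive
COMPRESSION_FACTOR = 20

drives_uncompressed = ACTIVE_BYTES / DRIVE_BYTES
drives_compressed = ACTIVE_BYTES / COMPRESSION_FACTOR / DRIVE_BYTES
```

With these inputs the drive counts come out in the tens of thousands before compression and in the low thousands after it, matching the scale of the figures quoted above.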
Data compression is a major field in signal analysis, with a long history. The current industry standard for image compression was written by the Joint Photographic Experts Group, known as JPEG. Many, perhaps most, of the image files that are downloaded on the Internet are compressed with this standard, which is why they end in the suffix “jpg.” The FBI solicited proposals for compressing their fingerprint files a few years ago. Different groups proposing different methods responded to the FBI solicitation. The contract was awarded to a group at the Los Alamos National Laboratory, headed by Jonathan Bradley and Christopher Brislawn; the project leader was Tom Hopper from the FBI. They proposed compression using the recently developed theory of wavelets. An account of this project can be found in Brislawn (1995).
To see the reason the wavelet proposal was accepted instead of proposals based on JPEG, consider the images in Figures 2 and 3. Both contain compressions by a factor of about 13 of the fingerprint image in Figure 1. Figure 2 shows the compression using JPEG, and Figure 3 exhibits the wavelet compression. One feature of JPEG is that it first divides a large image into smaller boxes, and then compresses in these smaller boxes independently. This provides some advantages due to local homogeneities in the image, but the disadvantage is that the subimages may not align well at the edges of the smaller boxes. This causes the regular pattern of horizontal and vertical lines seen in Figure 2. These are called block artifacts, or block lines for short. These are not just a visual annoyance; they are also an impediment to machine recognition of fingerprints. Wavelet compression methods do not require dividing the image into smaller blocks because the desired localization properties are naturally built into the wavelet system. Hence the wavelet compression in Figure 3 does not show block lines. This is one of the main reasons that the FBI fingerprint compression contract was awarded to the wavelet group. We introduce both Fourier compression and wavelet compression in section 3.3 of this text.
The examples of fingerprint file compression in Figures 2 and 3 show that mathematics that has been developed recently (within the last 10 or 12 years) has important practical applications.
FIGURE 2 JPEG compression (Courtesy of Chris Brislawn, Los Alamos National Laboratory)
FIGURE 3 Wavelet compression (Courtesy of Chris Brislawn, Los Alamos National Laboratory)
1 Background: Complex Numbers and Linear Algebra
1.1 Real Numbers and Complex Numbers
We start by setting some notation. The natural numbers $\{1, 2, 3, 4, \ldots\}$ will be denoted by $\mathbb{N}$, and the integers $\{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ by $\mathbb{Z}$. We assume familiarity with the real numbers $\mathbb{R}$ and their properties, which we briefly summarize here. The basic algebraic properties of $\mathbb{R}$ follow from the fact that $\mathbb{R}$ is a field.
Definition 1.1 A field $(\mathbb{F}, +, \cdot)$ is a set $\mathbb{F}$ with operations $+$ (called addition) and $\cdot$ (called multiplication) satisfying the following properties:
A1 (Closure for addition) For all $x, y \in \mathbb{F}$, $x + y$ is defined and is an element of $\mathbb{F}$.
A2 (Commutativity for addition) $x + y = y + x$, for all $x, y \in \mathbb{F}$.
A3 (Associativity for addition) $x + (y + z) = (x + y) + z$, for all $x, y, z \in \mathbb{F}$.
A4 (Existence of additive identity) There exists an element in $\mathbb{F}$, denoted $0$, such that $x + 0 = x$ for all $x \in \mathbb{F}$.
A5 (Existence of additive inverse) For each $x \in \mathbb{F}$, there exists an element in $\mathbb{F}$, denoted $-x$, such that $x + (-x) = 0$.
M1 (Closure for multiplication) For all $x, y \in \mathbb{F}$, $x \cdot y$ is defined and is an element of $\mathbb{F}$.
M2 (Commutativity for multiplication) $x \cdot y = y \cdot x$, for all $x, y \in \mathbb{F}$.
M3 (Associativity for multiplication) $x \cdot (y \cdot z) = (x \cdot y) \cdot z$, for all $x, y, z \in \mathbb{F}$.
M4 (Existence of multiplicative identity) There exists an element in $\mathbb{F}$, denoted $1$, such that $1 \neq 0$ and $x \cdot 1 = x$ for all $x \in \mathbb{F}$.
M5 (Existence of multiplicative inverse) For each $x \in \mathbb{F}$ such that $x \neq 0$, there exists an element in $\mathbb{F}$, denoted $x^{-1}$ (or $1/x$), such that $x \cdot (x^{-1}) = 1$.
D (Distributive law) $x \cdot (y + z) = (x \cdot y) + (x \cdot z)$, for all $x, y, z \in \mathbb{F}$.
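As an informal illustration of the field properties, the following Python sketch spot-checks A1–A5, M1–M5, and D on a small sample of rational numbers, using exact arithmetic from the standard `fractions` module. The sample set and the helper name `check_field_axioms` are assumptions made for this example. Passing the check on finitely many values illustrates the properties; it does not prove them for all rationals.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rational numbers (an assumption of this sketch).
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

def check_field_axioms(elements):
    """Spot-check the field properties on the given elements."""
    zero, one = Fraction(0), Fraction(1)
    assert one != zero                                     # part of M4
    for x, y, z in product(elements, repeat=3):
        assert isinstance(x + y, Fraction)                 # A1
        assert isinstance(x * y, Fraction)                 # M1
        assert x + y == y + x                              # A2
        assert x + (y + z) == (x + y) + z                  # A3
        assert x * y == y * x                              # M2
        assert x * (y * z) == (x * y) * z                  # M3
        assert x * (y + z) == (x * y) + (x * z)            # D
    for x in elements:
        assert x + zero == x                               # A4
        assert x + (-x) == zero                            # A5
        assert x * one == x                                # M4
        if x != zero:
            assert x * (1 / x) == one                      # M5
    return True
```

Calling `check_field_axioms(samples)` returns `True` when every sampled instance of every property holds.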
We emphasize that in principle the operations $+$ and $\cdot$ in Definition 1.1 could be any operations satisfying the required properties. However, in our main examples $\mathbb{R}$ and $\mathbb{C}$, these are the usual addition and multiplication. In particular, with the usual meanings of $+$ and $\cdot$, $(\mathbb{R}, +, \cdot)$ forms a field. We usually omit $\cdot$ and write $xy$ in place of $x \cdot y$. All of the usual basic algebraic properties (such as $-(-x) = x$) of $\mathbb{R}$ follow from the field properties. This is shown in most introductory analysis texts. We assume all these familiar properties in this text.
An ordered field is a field $\mathbb{F}$ with a relation $<$ satisfying properties O1–O4. The first two properties state that $\mathbb{F}$ is an ordered set.
O1 (Comparison principle) If $x, y \in \mathbb{F}$, then one and only one of the following holds: $x < y$, $y < x$, $y = x$.
O2 (Transitivity) If $x, y, z \in \mathbb{F}$, with $x < y$ and $y < z$, then $x < z$.
The remaining two properties state that the operations $+$ and $\cdot$ defined on $\mathbb{F}$ are consistent with the ordering $<$:
O3 (Consistency of $+$ with $<$) If $x, y, z \in \mathbb{F}$ and $y < z$, then $x + y < x + z$.
O4 (Consistency of $\cdot$ with $<$) If $x, y \in \mathbb{F}$, with $0 < x$ and $0 < y$, then $0 < xy$.
We assume the fact that $\mathbb{R}$ with the usual relation $<$ forms an ordered field. All of the standard order properties of $\mathbb{R}$ (such as, if $0 < x$ then $-x < 0$) follow from O1–O4. We assume such basic facts as needed. We use the standard notations $x > y$, with the same meaning as $y < x$, and $x \leq y$ (equivalently $y \geq x$), meaning that either $x < y$ or $x = y$.
For $x \in \mathbb{R}$, we denote the absolute value, or magnitude, of $x$ by $|x|$, where $|x| = x$ if $x \geq 0$ and $|x| = -x$ if $x < 0$. Then $|x| \geq 0$ for all $x \in \mathbb{R}$, and $|x| = 0$ if and only if $x = 0$.
Lemma 1.2 (Triangle inequality) For all $x, y \in \mathbb{R}$,
$$|x + y| \leq |x| + |y|.$$
Proof Exercise 1.1.1.
We interpret $|x - y|$ as the distance between the points $x$ and $y$ in $\mathbb{R}$ (see Exercise 1.1.2). This leads to the notion of the convergence of a sequence.
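Definition 1.3 A sequence $\{x_n\}_{n=M}^{\infty}$ of real numbers converges to $x \in \mathbb{R}$ if, for all $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that $|x_n - x| < \epsilon$ for all $n > N$. A sequence $\{x_n\}_{n=M}^{\infty}$ converges if it converges to some $x \in \mathbb{R}$.
Definition 1.4 A sequence $\{x_n\}_{n=M}^{\infty}$ of real numbers is a Cauchy sequence if, for all $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that $|x_n - x_m| < \epsilon$ for all $n, m > N$.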
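To make the roles of $\epsilon$ and $N$ concrete, here is a small Python sketch for the hypothetical example sequence $x_n = 1 + 1/n$ (with $M = 1$), which converges to 1. The sequence, the function names, and the choice $N = \lceil 1/\epsilon \rceil$ are assumptions of this illustration. Exact rational arithmetic avoids rounding issues, and only finitely many indices beyond $N$ are checked, so this illustrates the definitions rather than proving anything.

```python
from fractions import Fraction
import math

LIMIT = Fraction(1)

def x(n: int) -> Fraction:
    """Hypothetical example sequence x_n = 1 + 1/n for n >= 1."""
    return 1 + Fraction(1, n)

def threshold(eps: Fraction) -> int:
    """Return an N with 1/N <= eps, so |x_n - 1| = 1/n < eps whenever n > N."""
    return math.ceil(1 / eps)

def close_to_limit(eps: Fraction, how_many: int = 200) -> bool:
    """Check |x_n - 1| < eps for the first how_many indices n > N."""
    N = threshold(eps)
    return all(abs(x(n) - LIMIT) < eps for n in range(N + 1, N + 1 + how_many))

def cauchy_check(eps: Fraction, how_many: int = 40) -> bool:
    """Check |x_n - x_m| < eps for pairs of indices n, m > N."""
    N = threshold(eps)
    indices = range(N + 1, N + 1 + how_many)
    return all(abs(x(n) - x(m)) < eps for n in indices for m in indices)
```

Note that `cauchy_check` never uses the value of the limit, which is the practical appeal of the Cauchy condition.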
The rational numbers $\mathbb{Q}$ form an ordered field. The property that distinguishes $\mathbb{R}$ is its completeness. In many texts, this is formulated as the least upper bound property, namely that every nonempty set of real numbers that is bounded above has a least upper bound. The least upper bound property implies the following result, which we assume.
Theorem 1.5 (Cauchy criterion) Every Cauchy sequence of real numbers converges.
The Cauchy criterion allows us to prove that a sequence converges without knowing the value of the limit. This is especially useful when we consider series. The converse of the Cauchy criterion (i.e., that any convergent sequence is Cauchy) is true also (Exercise 1.1.3).
We have seen that $\mathbb{R}$ (with the usual addition and multiplication) forms a complete ordered field. This characterizes $\mathbb{R}$: any other complete ordered field is essentially the same as $\mathbb{R}$ except for the choices of names or notation given to the elements and operations (more precisely, any other complete ordered field is “isomorphic” to $\mathbb{R}$). We will not prove this.
Most of the work in this text is done over the complex numbers $\mathbb{C}$. The complex numbers also form a complete field (but not an ordered field; see Exercise 1.1.4). One (somewhat mysterious) way to define $\mathbb{C}$ is to assume the existence of some sort of generalized number (not a real number) $i$ that satisfies $i^2 = -1$. Then $\mathbb{C}$ is defined as the set of all numbers of the form $z = x + iy$ where $x, y \in \mathbb{R}$. We then give $\mathbb{C}$ the usual addition and multiplication operations: for $x_1, x_2, y_1, y_2 \in \mathbb{R}$,
$$(x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2) \tag{1.1}$$
and
$$(x_1 + iy_1) \cdot (x_2 + iy_2) = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + x_2 y_1), \tag{1.2}$$
which is what you get if you formally multiply things out and use the relation $i^2 = -1$. (To be precise we should emphasize that we are defining the operations $+$ and $\cdot$ on $\mathbb{C}$ in the left side of equations (1.1) and (1.2), using the usual $+$, $-$, and $\cdot$ defined on $\mathbb{R}$ on the right side.)
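As before, we usually write $zw$ instead of $z \cdot w$. The only problem is that none of this makes sense if the hypothesized number $i$ does not exist.
To put the construction on a rigorous footing, let $\tilde{\mathbb{C}}$ be the set $\mathbb{R} \times \mathbb{R}$ of ordered pairs of real numbers, with operations defined by
$$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2) \tag{1.3}$$
and
$$(x_1, y_1) \cdot (x_2, y_2) = (x_1 x_2 - y_1 y_2, x_1 y_2 + x_2 y_1), \tag{1.4}$$
where the $+$, $-$, and $\cdot$ on the right side of equations (1.3) and (1.4) are the standard operations on $\mathbb{R}$. There is no question that these definitions make sense. Note that equation (1.3) is essentially (1.1) and equation (1.4) is essentially (1.2).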
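The ordered-pair construction is easy to experiment with. The sketch below implements componentwise addition (mirroring (1.1)) and the product rule (1.4) on plain Python tuples, and compares the results with Python's built-in `complex` type. The function names `add`, `mul`, and `as_complex` are chosen for this illustration only.

```python
def add(p, q):
    """Componentwise addition of ordered pairs, mirroring equation (1.1)."""
    (x1, y1), (x2, y2) = p, q
    return (x1 + x2, y1 + y2)

def mul(p, q):
    """Multiplication of ordered pairs as in equation (1.4)."""
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

def as_complex(p):
    """Convert the pair (x, y) to Python's built-in complex number x + iy."""
    return complex(p[0], p[1])

# The pair (0, 1) squares to (-1, 0), using only real arithmetic.
square_of_unit = mul((0, 1), (0, 1))
```

For example, `mul((2, 3), (-1, 4))` gives `(-14, 5)`, in agreement with `complex(2, 3) * complex(-1, 4)`.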
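Observe that
$$\mathbb{R} \times \{0\} = \{(x, 0) : x \in \mathbb{R}\}$$
is a copy of $\mathbb{R}$; that is, the map $(x, 0) \mapsto x$ is a one-to-one correspondence that identifies $\mathbb{R} \times \{0\}$ with $\mathbb{R}$. The equations
$$(x_1, 0) + (x_2, 0) = (x_1 + x_2, 0) \quad \text{and} \quad (x_1, 0) \cdot (x_2, 0) = (x_1 x_2, 0),$$
which follow from (1.3) and (1.4), show that this identification respects addition and multiplication. By (1.4) we also have
$$(0, 1) \cdot (0, 1) = (-1, 0).$$
Hence the equation $z^2 = (-1, 0)$ has a solution $(0, 1)$ in $\tilde{\mathbb{C}}$ (actually two solutions, the other being $(0, -1)$), even though it has no solution in $\mathbb{R} \times \{0\}$. Since we identify $(-1, 0) \in \tilde{\mathbb{C}}$ with $-1 \in \mathbb{R}$, this says that the equation $z^2 = -1$ has a solution in the larger set $\tilde{\mathbb{C}}$ even though it has none in $\mathbb{R}$. There is nothing inconsistent or even very surprising about this.
Now write $x$ for $(x, 0)$ and $i$ for $(0, 1)$, so that $(x, y) = (x, 0) + (0, 1) \cdot (y, 0) = x + iy$. In this notation, equations (1.3) and (1.4) give us (1.1) and (1.2), and we are back where we started, but without fear of inconsistency.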
It is important to go through this exercise in notation once. However, in practice nobody uses the notation $(x, y)$ for complex numbers, preferring to keep that for the vector space $\mathbb{R}^2$ (see section 1.3). We follow the standard terminology here: we call the set of complex numbers $\mathbb{C}$, forgetting about $\tilde{\mathbb{C}}$ forever, and we denote the elements of $\mathbb{C}$ in the usual way, namely
$$z = x + iy, \quad \text{where } x, y \in \mathbb{R}.$$
We call $x$ the real part of $z$, and $y$ the imaginary part (a particularly poor name, undoubtedly coming from a failure on somebody’s part to understand the construction we have just considered). We sometimes write
$$\operatorname{Re} z \quad \text{and} \quad \operatorname{Im} z$$
to denote the real and imaginary parts of $z$, respectively.
We regard points $z = x + iy$ as points in the plane, where one axis (the real axis) contains the points $x \in \mathbb{R}$, and the perpendicular axis (the imaginary axis) contains the points $iy$, for $y \in \mathbb{R}$. In this plane (the complex plane), the point $x + iy$ occupies the same position that $(x, y)$ occupies in $\mathbb{R}^2$. For $z = x + iy$ we define the complex conjugate $\bar{z} = x - iy$ and the magnitude $|z| = \sqrt{x^2 + y^2}$, so that
$$z\bar{z} = |z|^2 = x^2 + y^2.$$
These definitions yield the following properties.
Lemma 1.8 (Triangle inequality in $\mathbb{C}$) If $z, w \in \mathbb{C}$, then
$$|z + w| \leq |z| + |w|.$$
Proof Exercise 1.1.6.
Similarly to the case for $\mathbb{R}$ above, we think of $|z - w|$ as the distance in the complex plane between the points $z$ and $w$ (see Exercise 1.1.7). Note that if $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, then
$$|z_1 - z_2| = |x_1 - x_2 + i(y_1 - y_2)| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2},$$
which is the same as the usual distance in $\mathbb{R}^2$ between the points $(x_1, y_1)$ and $(x_2, y_2)$.
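A quick numerical check of this identity, using Python's built-in `complex` type and `math.hypot` from the standard library, is sketched below. The function names are introduced for this example only.

```python
import math

def distance_complex(z1: complex, z2: complex) -> float:
    """Distance |z1 - z2| in the complex plane."""
    return abs(z1 - z2)

def distance_plane(p, q) -> float:
    """Usual Euclidean distance between points p and q of the plane R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# The points 1 + 2i and 4 + 6i correspond to (1, 2) and (4, 6).
d_complex = distance_complex(1 + 2j, 4 + 6j)
d_plane = distance_plane((1, 2), (4, 6))
```

Both computations give the same value for this pair of points, since the horizontal and vertical separations are 3 and 4.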
We can now check that $(\mathbb{C}, +, \cdot)$ is a field (Exercise 1.1.8). The additive identity is $0 = 0 + i0$, and the multiplicative identity is $1 = 1 + i0$. The additive inverse of $z = x + iy$ is $-z = -x - iy$. To find the multiplicative inverse of a nonzero $z = x + iy$, we guess, and then verify directly from (1.2), that
$$\frac{x}{x^2 + y^2} + i\,\frac{-y}{x^2 + y^2}$$
is in fact the multiplicative inverse of $x + iy$ (assuming $x + iy \neq 0$). This determines $z^{-1}$ for nonzero $z \in \mathbb{C}$, and we define
$$\frac{z}{w} = z \cdot w^{-1} \quad \text{for } z, w \in \mathbb{C} \text{ with } w \neq 0.$$
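The claim that the displayed expression is the multiplicative inverse of $x + iy$ can be checked by applying the product rule (1.2) with $x_1 = x$, $y_1 = y$, $x_2 = x/(x^2 + y^2)$, and $y_2 = -y/(x^2 + y^2)$. The following worked computation is a sketch of that check, valid whenever $x^2 + y^2 \neq 0$:

```latex
\begin{aligned}
(x + iy)\left(\frac{x}{x^2+y^2} + i\,\frac{-y}{x^2+y^2}\right)
&= \left(\frac{x \cdot x}{x^2+y^2} - \frac{y \cdot (-y)}{x^2+y^2}\right)
 + i\left(\frac{x \cdot (-y)}{x^2+y^2} + \frac{x \cdot y}{x^2+y^2}\right) \\
&= \frac{x^2+y^2}{x^2+y^2} + i \cdot 0 \\
&= 1 .
\end{aligned}
```

The real part collapses to 1 and the imaginary part cancels, which is exactly what property M5 requires.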
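Lemma 1.9 Suppose $z, w \in \mathbb{C}$ with $w \neq 0$. Then
Definition 1.10 A sequence $\{z_n\}_{n=M}^{\infty}$ of complex numbers converges to $z$ if, for all $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that $|z_n - z| < \epsilon$ for all $n > N$. We say $\{z_n\}_{n=M}^{\infty}$ converges if it converges to some $z \in \mathbb{C}$.
Definition 1.11 A sequence $\{z_n\}_{n=M}^{\infty}$ of complex numbers is a Cauchy sequence if, for all $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that $|z_n - z_m| < \epsilon$ for all $n, m > N$.
This leads to the Cauchy criterion for the convergence of a sequence of complex numbers.
Theorem 1.12 (Cauchy criterion) A sequence of complex numbers converges if and only if it is a Cauchy sequence.
Proof Exercise 1.1.10.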
A sequence $\{x_n\}_{n=M}^{\infty}$ of real numbers can be regarded as a sequence of complex numbers. However, it is easy to see that the sequence converges in the real sense if and only if it converges in the complex sense, with the same limit (compare with Exercise 1.1.10). Hence there is no ambiguity in the definitions, and we write $\lim_{n \to \infty} x_n$ without specifying the field in which convergence takes place.
1.1.1 Prove Lemma 1.2.
1.1.2 Let $X$ be a set. A metric, or distance function, on $X$ is a map $d : X \times X \to \{t \in \mathbb{R} : t \geq 0\}$ satisfying the properties:
Me1 (Symmetry) $d(x, y) = d(y, x)$ for all $x, y \in X$;
Me2 (Nondegeneracy) $d(x, y) = 0$ if and only if $x = y$;
Me3 (Metric triangle inequality) $d(x, z) \leq d(x, y) + d(y, z)$ for all $x, y, z \in X$.
A metric space $(X, d)$ is a set $X$ with a metric $d$. For $x, y \in \mathbb{R}$, define $d(x, y) = |x - y|$. Prove that $d$ is a metric on $\mathbb{R}$.
1.1.3 Prove that a convergent sequence $\{x_n\}_{n=M}^{\infty}$ of real numbers is a Cauchy sequence.
1.1.4 LetF with the relation < be an ordered field.
i Supposex ∈ F and x 0 Prove that x2 > 0.
ii Prove that there is no ordering < on the field C thatmakesC an ordered field Hint: Suppose by contradictionthat< is such an ordering Use part i to obtain 0 <−1.Argue that this is a contradiction, keeping in mind that
< is not necessarily the usual ordering when restricted
toR
1.1.5 Prove Lemma 1.7
1.1.6 Prove Lemma 1.8. Suggestion: Do not write it out in terms of the real and imaginary parts. Instead, prove that
$$|z + w|^2 = (z + w)(\overline{z} + \overline{w}) = |z|^2 + 2\,\mathrm{Re}(z\overline{w}) + |w|^2$$
and use Lemma 1.7.
1.1.7 For $z, w \in \mathbb{C}$, define $d(z, w) = |z - w|$. Prove that $(\mathbb{C}, d)$ is a metric space (see Exercise 1.1.2 for the definition). Draw a picture in the complex plane to show why condition Me3 in Exercise 1.1.2 is called the triangle inequality.
1.1.8 Verify that $\mathbb{C}$ with the operations (1.1) and (1.2) is a field by checking properties A1–A5, M1–M5, and D.
1.1.9 Prove Lemma 1.9
1.1.10 Let $\{z_n\}_{n=M}^{\infty}$ be a sequence of complex numbers. For each $n$, let $z_n = x_n + iy_n$, where $x_n, y_n \in \mathbb{R}$.
i Prove that $\{z_n\}_{n=M}^{\infty}$ is a Cauchy sequence of complex numbers (Definition 1.11) if and only if $\{x_n\}_{n=M}^{\infty}$
iii Assuming Theorem 1.5, prove Theorem 1.12.
1.2 Complex Series, Euler’s Formula, and the Roots of Unity
We begin with series of complex numbers. Particular cases of interest are geometric series and the power series for $\sin z$, $\cos z$, and $e^z$. Using these we establish Euler’s formula $e^{i\theta} = \cos\theta + i\sin\theta$. This will lead to the polar representation of complex numbers and allow us to calculate $N$th roots of complex numbers, especially the $N$th “roots of unity,” the roots of the number 1. In chapter 2 we write Fourier expansions of vectors using the complex exponentials introduced here.
We begin with the definition of convergence of a series of complex numbers, which is formally the same as for a series of real numbers.
A series of complex numbers is an expression of the form
$$\sum_{n=M}^{\infty} z_n,$$
where each $z_n$ is a complex number and $M \in \mathbb{Z}$. For $k \geq M$, let
$$s_k = \sum_{n=M}^{k} z_n$$
be the $k$th partial sum of the series. If the complex sequence $\{s_k\}_{k=M}^{\infty}$ converges to some $s \in \mathbb{C}$ (Definition 1.10), we say the series $\sum_{n=M}^{\infty} z_n$ converges to $s$ or $\sum_{n=M}^{\infty} z_n = s$. If the sequence $\{s_k\}_{k=M}^{\infty}$ does not converge, we say the series diverges.
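As a small numerical illustration of partial sums, the following Python sketch computes $s_k$ for a finite stretch of a series. The sample terms $z_n = (i/2)^n$ and the helper name `partial_sums` are choices made here for illustration and do not come from the text.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the running sums s_k of a finite list of complex terms."""
    return list(accumulate(terms))

# Hypothetical example series: z_n = (i/2)^n for n = 0, 1, ..., 29.
terms = [(0.5j) ** n for n in range(30)]
sums = partial_sums(terms)

# The partial sums settle down near 1 / (1 - i/2) = 0.8 + 0.4i.
print(sums[0], sums[5], sums[-1])
```

Printing a few entries of `sums` shows the sequence of partial sums stabilizing, which is what convergence of the series means.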
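This definition together with the Cauchy criterion for convergence of a complex sequence (Theorem 1.12) implies that a series converges if and only if its partial sums form a Cauchy sequence. Since $s_m - s_k = \sum_{n=k+1}^{m} z_n$ for $m > k$, this can be restated as follows.
A series of complex numbers $\sum_{n=M}^{\infty} z_n$ converges if and only if for every $\epsilon > 0$, there exists an integer $N$ such that $\left|\sum_{n=k+1}^{m} z_n\right| < \epsilon$ for all $m > k > N$.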
Proof
Exercise 1.2.2
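(Comparison test) Let $\sum_{n=M}^{\infty} z_n$ be a complex series and $\sum_{n=M}^{\infty} a_n$ a series of nonnegative real numbers. Suppose that there exists an integer $N$ such that $|z_n| \leq a_n$ for all $n \geq N$, and that $\sum_{n=M}^{\infty} a_n$ converges. Then $\sum_{n=M}^{\infty} z_n$ converges.
“series” without specifying whether the terms are real or complex
A series $\sum_{n=M}^{\infty} z_n$ converges absolutely if $\sum_{n=M}^{\infty} |z_n|$ converges.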
The comparison test shows that an absolutely convergent series is convergent. If a series is convergent but not absolutely convergent, reindexing the terms can yield a series converging to a different value (Exercise 1.2.4). This cannot happen with an absolutely convergent series.
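The effect of reindexing can be seen numerically. The sketch below uses the alternating harmonic series, a standard example that is not necessarily the one in Exercise 1.2.4. In its usual order it converges to $\ln 2$. Taking two positive terms followed by one negative term gives a series with the same terms that converges to $\tfrac{3}{2}\ln 2$. The function names are hypothetical.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its usual order."""
    return sum((-1) ** (n + 1) / n for n in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Same terms, reordered in blocks: two positive terms, then one negative."""
    total = 0.0
    for k in range(1, n_blocks + 1):
        total += 1 / (4 * k - 3) + 1 / (4 * k - 1) - 1 / (2 * k)
    return total

usual = alternating_harmonic(200_000)   # close to ln 2
shuffled = rearranged(200_000)          # close to 1.5 * ln 2

print(usual, math.log(2))
print(shuffled, 1.5 * math.log(2))
```

Both sums use only terms of the form $\pm 1/n$, each exactly once in the limit, yet the limits differ because the series is not absolutely convergent.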
The Cauchy criterion and the comparison test enable us to determine that a series converges without determining its value. It is rare that a series can be exactly evaluated. Geometric series are one of the exceptions.
This is one of the few cases in which the partial sum $s_k = 1 + z + z^2 + \cdots + z^k$ can be evaluated in closed form. To do this, observe that
$$(1 - z)s_k = 1 + z + z^2 + \cdots + z^k - (z + z^2 + \cdots + z^k + z^{k+1}).$$
All terms on the right cancel out except the first and the last (this is called a telescoping sum), so $(1 - z)s_k = 1 - z^{k+1}$, that is,
$$s_k = \frac{1 - z^{k+1}}{1 - z} \quad \text{for } z \neq 1. \qquad (1.5)$$
When $z = 1$, the definition yields $s_k = k + 1$. Using relation (1.5), we obtain the following result.
The geometric series $\sum_{n=0}^{\infty} z^n$ converges to $1/(1 - z)$ if $|z| < 1$, and diverges if $|z| \geq 1$. The convergence for $|z| < 1$ is absolute.
Proof
Exercise 1.2.5
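The closed form for the geometric partial sum and the limit $1/(1 - z)$ are easy to check numerically. This is a minimal sketch. The sample point $z = 0.3 + 0.4i$ and both function names are assumptions made for the illustration.

```python
def geometric_partial_sum(z, k):
    """s_k = 1 + z + ... + z^k, summed term by term."""
    return sum(z ** n for n in range(k + 1))

def geometric_closed_form(z, k):
    """Closed form (1 - z^(k+1)) / (1 - z), valid only for z != 1."""
    return (1 - z ** (k + 1)) / (1 - z)

z = 0.3 + 0.4j   # hypothetical sample point with |z| = 0.5 < 1

direct = geometric_partial_sum(z, 10)
closed = geometric_closed_form(z, 10)
limit = 1 / (1 - z)

print(abs(direct - closed))                        # tiny rounding error
print(abs(geometric_partial_sum(z, 60) - limit))   # tiny: s_k approaches 1/(1 - z)
print(geometric_partial_sum(1, 7))                 # z = 1 gives k + 1 = 8
```

The case $z = 1$ is handled by the term-by-term sum only, since the closed form divides by $1 - z$.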
We remark that relation (1.5) is a useful formula that we apply for other purposes in chapter 2. We now consider power series.
Definition 1.20 Fix a point $z_0 \in \mathbb{C}$. A power series about $z_0$ is a series of the form
$$\sum_{n=0}^{\infty} a_n (z - z_0)^n,$$
where $a_n \in \mathbb{C}$ for each integer $n \geq 0$.
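A power series has a radius of convergence, which is determined by the coefficients $\{a_n\}$ by the formula in Exercise 1.2.7. A function $f$ defined on an open set $O \subseteq \mathbb{C}$ (a set having the property that any point in it has a ball of positive radius around it that is contained in the set) is said to be analytic if at every point $z \in O$, $f$ is represented by a power series about $z$ with a positive radius of convergence. We barely touch the rich subject of complex analysis, the study of analytic functions.
For real $x$, the sine, cosine, and exponential functions have the power series expansions
$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!}, \quad \cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}, \quad e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \qquad (1.6)$$
in the sense that these series converge absolutely to the stated function values at every $x \in \mathbb{R}$. Because these series converge absolutely, the series
$$\sum_{n=0}^{\infty} \frac{r^{2n+1}}{(2n+1)!}, \quad \sum_{n=0}^{\infty} \frac{r^{2n}}{(2n)!}, \quad \sum_{n=0}^{\infty} \frac{r^n}{n!}$$
converge for every real number $r$. By replacing $r$ with $|z|$, we see that the complex series
$$\sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}, \quad \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n}}{(2n)!}, \quad \sum_{n=0}^{\infty} \frac{z^n}{n!}$$
converge absolutely for all complex numbers $z$. This can also be seen by the ratio test (Exercise 1.2.6). In any case, the following definition makes sense.
Definition 1.21 For $z \in \mathbb{C}$, define
$$\sin z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}, \quad \cos z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n}}{(2n)!}, \quad e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \qquad (1.7)$$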
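To make the power series definitions in relation (1.7) concrete, the sketch below sums finitely many terms of the standard series for $e^z$, $\sin z$, and $\cos z$ at a complex point and compares the results with Python's `cmath` functions. The truncation lengths and the sample point are arbitrary choices for the illustration, and the function names are hypothetical.

```python
import cmath
from math import factorial

def exp_series(z, terms=40):
    """Truncated series sum of z^n / n!."""
    return sum(z ** n / factorial(n) for n in range(terms))

def sin_series(z, terms=20):
    """Truncated series sum of (-1)^n z^(2n+1) / (2n+1)!."""
    return sum((-1) ** n * z ** (2 * n + 1) / factorial(2 * n + 1)
               for n in range(terms))

def cos_series(z, terms=20):
    """Truncated series sum of (-1)^n z^(2n) / (2n)!."""
    return sum((-1) ** n * z ** (2 * n) / factorial(2 * n)
               for n in range(terms))

z = 1.5 - 2.0j   # hypothetical sample point

print(abs(exp_series(z) - cmath.exp(z)))
print(abs(sin_series(z) - cmath.sin(z)))
print(abs(cos_series(z) - cmath.cos(z)))
```

For a point of moderate size such as this one, a few dozen terms already agree with the library values to within rounding error. Larger $|z|$ would need more terms.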
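When $z$ is real, relations (1.6) and (1.7) agree. So Definition 1.21 extends the usual sine, cosine, and exponential functions to all of $\mathbb{C}$. Many of the key properties of the real-valued sine and cosine functions in relation (1.6) continue to hold in the complex case. For example, relation (1.7) implies that
$$\sin(-z) = -\sin z \quad \text{and} \quad \cos(-z) = \cos z \qquad (1.8)$$
for all $z \in \mathbb{C}$. Equation (1.7) leads to Euler’s formula, a very elegant identity: for all $z \in \mathbb{C}$,
$$e^{iz} = \cos z + i\sin z. \qquad (1.9)$$
To see this, substitute $iz$ into the series for $e^z$ and use $i^2 = -1$. Absolute convergence allows the terms to be regrouped:
$$e^{iz} = 1 + iz + \frac{i^2 z^2}{2!} + \frac{i^3 z^3}{3!} + \frac{i^4 z^4}{4!} + \frac{i^5 z^5}{5!} + \cdots$$
$$= \left(1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \cdots\right) + i\left(z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots\right)$$
$$= \cos z + i\sin z.$$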
This remarkable formula includes curious facts such as $e^{i\pi} = -1$. Applying Euler’s formula with $-z$ in place of $z$ and using equation (1.8) gives the alternate formula
$$e^{-iz} = \cos z - i\sin z. \qquad (1.10)$$
Adding and subtracting equations (1.9) and (1.10) gives
$$\cos z = \frac{e^{iz} + e^{-iz}}{2} \quad \text{and} \quad \sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$
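Although these formulae hold for all complex numbers $z$, our main interest in them is in the case when $z$ is real. For $\theta \in \mathbb{R}$, we have
$$e^{i\theta} = \cos\theta + i\sin\theta, \quad e^{-i\theta} = \cos\theta - i\sin\theta, \qquad (1.12)$$
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \quad \text{and} \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$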
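Euler's formula for real angles and the resulting expressions for cosine and sine can be spot-checked numerically. The sketch below does this for a handful of sample angles using only the standard library. The helper names and the choice of angles are assumptions for the illustration.

```python
import cmath
import math

def euler_gap(theta):
    """Absolute difference between e^{i*theta} and cos(theta) + i*sin(theta)."""
    lhs = cmath.exp(1j * theta)
    rhs = complex(math.cos(theta), math.sin(theta))
    return abs(lhs - rhs)

def cos_from_exponentials(theta):
    """(e^{i*theta} + e^{-i*theta}) / 2, returned as a complex number."""
    return (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2

def sin_from_exponentials(theta):
    """(e^{i*theta} - e^{-i*theta}) / (2i), returned as a complex number."""
    return (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j

sample_angles = [0.0, math.pi / 6, math.pi / 2, math.pi, 2.5, -4.0]
max_gap = max(euler_gap(t) for t in sample_angles)

print(max_gap)                           # rounding error only
print(abs(cmath.exp(1j * math.pi) + 1))  # e^{i*pi} = -1 up to rounding
```

The exponential expressions return complex values whose imaginary parts vanish up to rounding, matching the real values of $\cos\theta$ and $\sin\theta$.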
Our main interest in the next result is in the special case where $z$ and $w$ are purely imaginary, which yields
$$e^{i\theta} e^{i\varphi} = e^{i(\theta + \varphi)} \quad \text{for all } \theta, \varphi \in \mathbb{R}. \qquad (1.16)$$
This case can be proved using equation (1.12) and elementary trigonometry (Exercise 1.2.9(i)).
of $\sin\theta$ and $\cos\theta$ (in fact as polynomials in $\sin\theta$ and $\cos\theta$). But equation (1.17) gives a faster way.
Example 1.24
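Express $\sin 4\theta$ and $\cos 4\theta$ in terms of $\sin\theta$ and $\cos\theta$.
Solution
First,
$$e^{4i\theta} = \cos 4\theta + i\sin 4\theta.$$
But we also have
$$e^{4i\theta} = (e^{i\theta})^4 = (\cos\theta + i\sin\theta)^4.$$
By expanding this expression, using $i^2 = -1$, we have
$$(\cos\theta + i\sin\theta)^4 = \cos^4\theta + 4i\cos^3\theta\sin\theta - 6\cos^2\theta\sin^2\theta - 4i\cos\theta\sin^3\theta + \sin^4\theta.$$
By equating the real and imaginary parts of this last expression with $\cos 4\theta$ and $\sin 4\theta$, we have
$$\cos 4\theta = \cos^4\theta - 6\cos^2\theta\sin^2\theta + \sin^4\theta,$$
and
$$\sin 4\theta = 4\cos^3\theta\sin\theta - 4\cos\theta\sin^3\theta.$$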
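The two identities of Example 1.24 can be checked numerically over a range of angles. The sketch below is only a sanity check, not a proof. The function names and the grid of angles are choices made here.

```python
import math

def cos4_formula(theta):
    """Right-hand side of the cos(4*theta) identity from Example 1.24."""
    c, s = math.cos(theta), math.sin(theta)
    return c ** 4 - 6 * c ** 2 * s ** 2 + s ** 4

def sin4_formula(theta):
    """Right-hand side of the sin(4*theta) identity from Example 1.24."""
    c, s = math.cos(theta), math.sin(theta)
    return 4 * c ** 3 * s - 4 * c * s ** 3

angles = [0.1 * k for k in range(-30, 31)]
worst = max(
    max(abs(cos4_formula(t) - math.cos(4 * t)),
        abs(sin4_formula(t) - math.sin(4 * t)))
    for t in angles
)
print(worst)   # rounding error only

# The same identities read off from a complex fourth power.
t = 0.7
w = complex(math.cos(t), math.sin(t)) ** 4
print(abs(w.real - math.cos(4 * t)), abs(w.imag - math.sin(4 * t)))
```

The last two lines mirror the method of the example: raising $\cos\theta + i\sin\theta$ to the fourth power and reading off real and imaginary parts.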
Further, equation (1.12) leads to the polar representation of complex numbers. Suppose $z = x + iy \in \mathbb{C}$, with $x, y \in \mathbb{R}$ and $z \neq 0$. Let $r = |z| = \sqrt{x^2 + y^2}$. The point $(x/\sqrt{x^2 + y^2},\ y/\sqrt{x^2 + y^2})$ has distance 1 from the origin in $\mathbb{R}^2$ and so lies on the unit circle. Hence there exists an angle $\theta$ (in fact infinitely many of them) such that $\cos\theta = x/\sqrt{x^2 + y^2}$ and $\sin\theta = y/\sqrt{x^2 + y^2}$. Using equation (1.12),
$$z = re^{i\theta} = r\cos\theta + ir\sin\theta.$$
So $re^{i\theta}$ occupies the same point in $\mathbb{C}$ that the point with polar coordinates $(r, \theta)$ occupies in $\mathbb{R}^2$. We call $re^{i\theta}$ the polar representation of $z$. We call $\theta$ the argument of $z$, written $\theta = \arg z$. Thus $r$ is the distance in $\mathbb{C}$ from $z$ to the origin, and $\theta$ is the angle, in radians, between the positive $x$-axis and the ray from 0 to $z$. For $z = 0$, we define the polar representation to be $r = 0$ and we do not define $\arg 0$.
As in the case of polar coordinates in $\mathbb{R}^2$, the polar representation of a nonzero $z \in \mathbb{C}$ is not unique. By equation (1.12) and the fact that the sine and cosine functions have period $2\pi$, we have
$$re^{i\theta} = re^{i(\theta + 2k\pi)} \qquad (1.18)$$
for any integer $k$. So $\arg z$ is determined only up to adding $2k\pi$ for some $k \in \mathbb{Z}$. If we select $\arg z$ so that $-\pi < \arg z \leq \pi$, we call this the principal value of the argument.
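The polar representation and its non-uniqueness can be explored with Python's standard `cmath` module. In the sketch below, `to_polar` and `from_polar` are hypothetical helper names. `cmath.phase` returns an angle in $[-\pi, \pi]$, which matches the principal value for ordinary inputs (the endpoint $-\pi$ can only arise from a negative-zero imaginary part).

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with z = r*e^{i*theta}; theta from cmath.phase."""
    if z == 0:
        raise ValueError("arg 0 is not defined")
    return abs(z), cmath.phase(z)

def from_polar(r, theta):
    """Return the complex number r*e^{i*theta}."""
    return cmath.rect(r, theta)

z = -1 + 1j
r, theta = to_polar(z)   # r = sqrt(2), theta = 3*pi/4

# Adding integer multiples of 2*pi to theta gives the same point, as in (1.18).
shifted = [from_polar(r, theta + 2 * math.pi * k) for k in (-2, -1, 0, 1, 2)]
print(r, theta)
print(max(abs(w - z) for w in shifted))   # rounding error only
```

Raising an error for $z = 0$ follows the convention in the text that $\arg 0$ is left undefined.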
The polar representation gives a geometric interpretation to the multiplication of complex numbers. Suppose $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$. Then
$$z_1 z_2 = r_1 e^{i\theta_1} r_2 e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)},$$
by equation (1.16). Thus the modulus of the product is the product of the moduli (which we already knew by Lemma 1.7) and the argument of the product is the sum of the arguments. In other words, the effect of multiplying a complex number $z$ by $re^{i\theta}$ is to multiply the distance from $z$ to the origin by $r$ and to rotate $z$ by an angle of $\theta$ radians in the counterclockwise direction.
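A short numerical check of this scale-and-rotate interpretation follows. The moduli and angles are arbitrary sample values, chosen so that the sum of the arguments stays below $\pi$ and can be compared directly with the principal value returned by `cmath.polar`.

```python
import cmath
import math

# Hypothetical sample factors given in polar form.
r1, theta1 = 2.0, math.pi / 3
r2, theta2 = 1.5, math.pi / 4

z1 = cmath.rect(r1, theta1)
z2 = cmath.rect(r2, theta2)

# Modulus and argument of the ordinary complex product.
r, theta = cmath.polar(z1 * z2)

print(r, r1 * r2)               # moduli multiply
print(theta, theta1 + theta2)   # arguments add
```

When the sum of the arguments leaves the interval $(-\pi, \pi]$, the two angles agree only up to a multiple of $2\pi$.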
The polar representation makes the computation of positive integer powers of complex numbers easier, since, for $n \in \mathbb{N}$,
$$(re^{i\theta})^n = r^n (e^{i\theta})^n = r^n e^{in\theta},$$
$$\cdots \left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right) = -2^{21} + i\,2^{21}.$$
For comparison, imagine trying to do this problem directly.
The polar representation also allows us to find roots of complex numbers. First consider an example, which we will solve below after some discussion.
Example 1.26
Find all 5th roots of $2 + 2\sqrt{3}\,i$; that is, find all complex numbers $a + ib$ such that $(a + ib)^5 = 2 + 2\sqrt{3}\,i$.
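Before the discussion that leads to the solution, the roots can be approximated numerically. The sketch below is one possible approach rather than the text's derivation. It assumes that if $w = re^{i\theta}$, then each number $r^{1/n} e^{i(\theta + 2k\pi)/n}$ with $k = 0, 1, \ldots, n - 1$ is an $n$th root of $w$, which follows from relation (1.18) and the power formula above. The function name `nth_roots` is hypothetical.

```python
import cmath
import math

def nth_roots(w, n):
    """Return n complex numbers whose n-th power is w (w nonzero)."""
    r, theta = cmath.polar(w)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]

w = 2 + 2 * math.sqrt(3) * 1j   # modulus 4, argument pi/3
roots = nth_roots(w, 5)

for root in roots:
    # Each root raised to the 5th power should reproduce w up to rounding.
    print(root, abs(root ** 5 - w))
```

The five values lie on a circle of radius $4^{1/5}$ about the origin, equally spaced by an angle of $2\pi/5$.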