
MATHEMATICAL METHODS FOR OPTICAL PHYSICS AND ENGINEERING

The first textbook on mathematical methods focusing on techniques for optical science and engineering, this textbook is ideal for advanced undergraduates and graduate students in optical physics.

Containing detailed sections on the basic theory, the textbook places strong emphasis on connecting the abstract mathematical concepts to the optical systems to which they are applied. It covers many topics which usually only appear in more specialized books, such as Zernike polynomials, wavelet and fractional Fourier transforms, vector spherical harmonics, the z-transform, and the angular spectrum representation.

Most chapters end by showing how the techniques covered can be used to solve an optical problem. Essay problems in each chapter based on research publications, together with numerous exercises, help to further strengthen the connection between the theory and its application.

Gregory J. Gbur is an Associate Professor of Physics and Optical Science at the University of North Carolina at Charlotte, where he has taught a graduate course on mathematical methods for optics for the past five years and a course on advanced physical optics for two years.


Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo, Mexico City

Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org Information on this title: www.cambridge.org/9780521516105

© G. J. Gbur 2011

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2011

Printed in the United Kingdom at the University Press, Cambridge

A catalog record for this publication is available from the British Library

Library of Congress Cataloging in Publication data

or will remain, accurate or appropriate.


Dedicated to my wife Beth, and my parents

Contents

2.10 Focus: Maxwell’s equations in integral and differential form
3.5 Spherical coordinates
4.6 Diagonalization of matrices, eigenvectors, and eigenvalues
11.2 The Fourier transform and its inverse
12.8 Focus: the Radon transform and computed axial tomography (CAT)
13.8 Focus: z-transforms in the numerical solution of Maxwell’s equations
14.9 Singularities, complex analysis, and general Frobenius solutions
15.1 Introduction: propagation in a rectangular waveguide
16.12 Addition theorems, sum theorems, and product relations
17.1 Introduction: Laplace’s equation in spherical coordinates
17.8 Spherical harmonics
19.9 Focus: dyadic Green’s function for Maxwell’s equations
20.5 Calculus of variations with several dependent variables
20.6 Calculus of variations with several independent variables
20.7 Euler’s equation with auxiliary conditions: Lagrange multipliers

Preface

Why another textbook on Mathematical Methods for Scientists? Certainly there are quite a few good, indeed classic, texts on the subject. What can another text add that these others have not already done?

I began to ponder these questions, and my answers to them, over the past several years while teaching a graduate course on Mathematical Methods for Physics and Optical Science at the University of North Carolina at Charlotte. Although every student has his or her own difficulties in learning mathematical techniques, a few problems among the students have remained common and constant. The foremost among these is the “wall” between the mathematics the students learn in math class and the applications they study in other classes. The Fourier transform learned in math class is internally treated differently than the Fourier transform used in, say, Fraunhofer diffraction. The end result is that the student effectively learns the same topic twice, and is unable to use the intuition learned in a physics class to aid in mathematical understanding, or to use the techniques learned in math class to formulate and solve physical problems.

To try to correct this, I began to devote special lectures to the consequences of the math the students were studying. Lectures on complex analysis would be followed by discussions of the analytic properties of wavefields and the Kramers–Kronig relations. Lectures on infinite series could be highlighted by the discussion of the Fabry–Perot interferometer.

Students in my classes were uniformly dissatisfied with the standard textbooks. Part of this dissatisfaction arises from the broad topics from which examples are drawn: quantum physics, field theory, general relativity, optics, mechanics, and thermodynamics, to name a few. Even the most dedicated theoretical physics students do not have a great physical intuition about all these subfields, and consequently many of the examples are no more useful in their minds than problems in abstract mathematics.

Given that students in my class are studying optics, I have focused most of my attention on methods directly related to optical science. Here again the standard texts became a problem, as there is not a perfect overlap between important methods for general physics and important methods for optics. For example, group theory is not commonly used among most optics researchers, and Fourier transforms, essential to the optics researcher, are not used as much by the rest of the general physics community. Teaching to an optics crowd would require that the emphasis on material be refocused. It was in view of these various observations that I decided that a new mathematical methods book, with an emphasis on optics, would be useful.

Optics as both an industry and a field of study in its own right has grown dramatically over the past two decades. Optics programs at universities have grown in size and number in recent years. The University of Rochester and the University of Arizona are schools which have had degrees for some time, while the University of Central Florida and the University of North Carolina at Charlotte have started programs within recent years. With countless electrical engineering programs emphasizing studies in optics, it seems likely that more optics degrees will follow in the years to come. A textbook which serves such programs and optical researchers in general seems to have the potential to be a popular resource.

My goal, then, was to write a textbook on mathematical methods for physics and optical science, with an emphasis on those techniques relevant to optical scientists and engineers. The level of the book is intended for an advanced undergraduate or beginning graduate level class on math methods. One of my main objectives was to write a “leaner” book than many of the 1000+ page math books currently available, and do so by pushing much of the abstract mathematical subtlety into references. Instead, the emphasis is placed on making the connection between the mathematical techniques and the optics problems they are intimately related to. To make this connection, most chapters begin with a short introduction which illustrates the relevance of the mathematical technique being considered, and end with one or more applications, in which the technique is used to solve a problem. Physical examples within the chapters are drawn predominantly from optics, though examples from other fields will be used when appropriate.

A book of this type will address a number of mathematical techniques which are normally not compiled into a single volume. It is hoped that this book will therefore serve not only as a textbook but also potentially as a reference book for researchers in optics.

Another “wall” in students’ understanding is making the connection between the topics learned in class and research results in the literature. A number of exercises in each chapter are essay-style questions, in which a journal article must be read and its relevance to the mathematical method discussed. I have also endeavored to provide an appreciable number of exercises in each chapter, with some similar problems to facilitate teaching a class multiple times. Some more advanced chapters have fewer exercises, mainly because it is difficult to find exercises that are simultaneously solvable and enlightening.

Early chapters cover the basics that are essential for any student of the physical sciences, including vectors, curvilinear coordinate systems, differential equations, sequences and series, and matrices, and this part of the book might be used for any math methods for physics course. Later chapters concentrate on techniques significant to optics, including Fourier analysis, asymptotic methods, Green’s functions, and more general types of integral transform.

A book of this sort requires a lot of help, and I have sought plenty of insight from colleagues. I would like to thank Professor John Schotland, Professor Daniel James, Professor Tom Suleski, Dr. Choon How Gan, Mike Fairchild and Casey Rhodes for helpful suggestions during the course of writing. I give special thanks to Professor Taco Visser and Dr. Damon Diehl, each of whom read significant sections of the manuscript and provided corrections, and to Professor Emil Wolf, who gave me encouragement and inspiration during the writing process. I am grateful to Professor John Foley, who some time ago gave me access to his collection of math methods exercises, which were useful in developing my own problems. Professor Daniel S. Jones generously provided a photograph of X-ray diffraction, and Professor Visser provided a figure on the Poincaré sphere. I would also like to express my appreciation to the very helpful people at Cambridge University Press, including Simon Capelin, John Fowler, Megan Waddington, and Lindsay Barnes. Special thanks go to Frances Nex for her careful editing of the text.

I also have to thank a number of people for their help in keeping me sane during the writing process! Among them, let me thank my guitar instructor Toby Watson, my skating coach Tappie Dellinger, and my friends at Skydive Carolina, particularly my regular jump buddies Nancy Newman, Mickey Turner, John Solomon, Robyn Porter, Mike Reinhart, and Heiko Lotz! I would also like to give a “shout out” to Eric Smith and Mahy El-Kouedi for their friendship and support.

Finally, let me thank my wife Beth Szabo for her support, understanding and patience during this rather strenuous writing process.


potential of an insulated metal sphere are all examples of such scalar quantities.

Descriptions of physical phenomena are not always (indeed, rarely) that simple, however, and often we must use multiple, but related, numbers to offer a complete description of an effect. The next level of complexity is the introduction of vector quantities.

A vector may be described as a conceptual object having both magnitude and direction.

Graphically, vectors can be represented by an arrow:

The length of the arrow is the magnitude of the vector, and the direction of the arrow indicates the direction of the vector.

Examples of vectors in elementary physics include displacement, velocity, force, momentum, and angular momentum, though the concept can be extended to more complicated and abstract systems. Algebraically, we will usually represent vectors by boldface characters, i.e. $\mathbf{F}$ for force, $\mathbf{v}$ for velocity, and so on.

It is worth noting at this point that the word “vector” is used in mathematics with a somewhat broader meaning. In mathematics, a vector space is defined quite generally as a set of elements (called vectors) together with rules relating to their addition and to the scalar multiplication of vectors. In this sense, the set of real numbers forms a vector space, as does any ordered set of numbers, including matrices, to be discussed in Chapter 4, and complex numbers, to be discussed in Chapter 9. For most of this chapter we reserve the term “vector” for quantities which possess magnitude and direction in three-dimensional space, and are independent of the specific choice of coordinate system in a manner to be discussed in Section 1.2. We briefly describe vector spaces at the end of this chapter, in Section 1.5. The interested reader can also consult Ref. [Kre78, Sec. 2.1].

Figure 1.1 The parallelogram law of vector addition. Adding $\mathbf{B}$ to $\mathbf{A}$ (the addition above the $\mathbf{C}$-line) is equivalent to adding $\mathbf{A}$ to $\mathbf{B}$ (the addition below the $\mathbf{C}$-line).

Vector addition is commutative and associative. Commutativity refers to the observation that the addition of vectors is order independent, i.e.

$$\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}.$$

This can be depicted graphically by the parallelogram law of vector addition, illustrated in Fig. 1.1. A pair of vectors are added “tip-to-tail”; that is, the second vector is added to the first by putting its tail at the tip of the first vector. The resultant vector is found by drawing an arrow from the origin of the first vector to the tip of the second vector. Associativity refers to the observation that the addition of multiple vectors is independent of the way the vectors are grouped for addition, i.e.

$$(\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C}).$$

So far, we have introduced vectors as purely geometrical objects which are independent of any specific coordinate system. Intuitively, this is an obvious requirement: where I am standing in a room (my “position vector”) is independent of whether I choose to describe it by measuring it from the rear left corner of the room or the front right corner. In other words, the vector has a physical significance which does not change when I change my method of describing it.

By choosing a coordinate system, however, we may create a representation of the vector in terms of these coordinates. We start by considering a Cartesian coordinate system with coordinates $x$, $y$, $z$ which are all mutually perpendicular and form a right-handed coordinate system.¹ For a given Cartesian coordinate system, the vector $\mathbf{A}$, which starts at the origin and ends at the point with coordinates $(A_x, A_y, A_z)$, is completely described by the coordinates of the end point.

It is highly convenient to express a vector in terms of these components by use of unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, $\hat{\mathbf{z}}$, vectors of unit magnitude pointing in the directions of the positive coordinate axes,

$$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}}.$$

This equation indicates that a vector equals the vector sum of its components. In three dimensions, the position vector $\mathbf{r}$, which measures the distance from a chosen origin, is written as

$$\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}},$$

where $x$, $y$, and $z$ are the lengths along the different coordinate axes.

The sum of two vectors can be found by taking the sum of their individual components. This means that the sum of two vectors $\mathbf{A}$ and $\mathbf{B}$ can be written as

$$\mathbf{A} + \mathbf{B} = (A_x + B_x)\hat{\mathbf{x}} + (A_y + B_y)\hat{\mathbf{y}} + (A_z + B_z)\hat{\mathbf{z}}. \tag{1.9}$$

The magnitude (length) of a vector in terms of its components can be found by two successive applications of the Pythagorean theorem. The magnitude $A$ of the complete vector, also written as $|\mathbf{A}|$, is found to be

$$A = \sqrt{A_x^2 + A_y^2 + A_z^2}.$$
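The component rules above translate directly into code. The following is a minimal sketch in plain Python; the helper names `add` and `magnitude` and the sample vectors are illustrative choices, not notation from the text.

```python
import math

def add(a, b):
    """Component-wise sum of two 3-vectors given as (x, y, z) tuples."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def magnitude(a):
    """Length of a vector from its Cartesian components (Pythagorean theorem)."""
    return math.sqrt(sum(ai * ai for ai in a))

# Illustrative vectors (assumed values, chosen so the arithmetic is easy to follow)
A = (1.0, 2.0, 2.0)
B = (2.0, -2.0, 1.0)

C = add(A, B)
print("A + B =", C)
print("|A| =", magnitude(A))
print("|A + B| =", magnitude(C))
```

Because the sum is formed component by component, `add(A, B)` and `add(B, A)` agree, which is the commutativity property expressed in components.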

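Another way to represent the vector in a particular coordinate system is by its magnitude $A$ and the angles $\alpha$, $\beta$, $\gamma$ that it makes with the positive $x$, $y$, and $z$ axes, so that $A_x = A\cos\alpha$, $A_y = A\cos\beta$, and $A_z = A\cos\gamma$.

Figure 1.3 Illustration of the vector $\mathbf{A}$, its components $(A_x, A_y, A_z)$, and the angles $\alpha$, $\beta$, $\gamma$.

¹ A right-handed system is one in which, if $x$ is the outward-pointing index finger of the right hand, $y$ is the folded-in ring finger and $z$ is the thumb, pointing straight up.

These angles and their relationship to the vector and its components are illustrated in Fig. 1.3. The quantities $\cos\alpha$, $\cos\beta$, and $\cos\gamma$ are called direction cosines. It might seem that there is an inconsistency with this representation, since we now evidently need four numbers ($A$, $\alpha$, $\beta$, $\gamma$) to describe the vector, where three components sufficed before. This seeming contradiction is resolved by the observation that $\alpha$, $\beta$, and $\gamma$ are not independent quantities; they are related by the equation

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1.$$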
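As a quick numerical illustration of the constraint on the direction cosines, the sketch below computes them for an arbitrarily chosen vector and confirms that their squares sum to one. The function name `direction_cosines` and the sample vector are assumptions made for this example.

```python
import math

def direction_cosines(a):
    """Return (cos alpha, cos beta, cos gamma) for a nonzero 3-vector a."""
    mag = math.sqrt(sum(c * c for c in a))
    return tuple(c / mag for c in a)

# Illustrative vector (assumed values); its magnitude is 3
A = (1.0, 2.0, 2.0)

cos_alpha, cos_beta, cos_gamma = direction_cosines(A)
total = cos_alpha**2 + cos_beta**2 + cos_gamma**2
print("direction cosines:", cos_alpha, cos_beta, cos_gamma)
print("sum of squares:", total)
```

The sum of squares equals one for any nonzero vector, since it is just $(A_x^2 + A_y^2 + A_z^2)/A^2$.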

In the spherical coordinate system to be discussed in Chapter 3, we will see that we may completely specify the position vector by its magnitude $r$ and two angles $\theta$ and $\phi$.

It is to be noted that we usually see vectors in physics in two distinct classes:

1. Vectors associated with the property of a single, localized object, such as the velocity of a car, or the force of gravity acting on a moving projectile.

2. Vectors associated with the property of a nonlocalized “object” or system, such as the electric field of a light wave, or the velocity of a fluid. In such a case, the vector quantity is a function of position and possibly time, and we may do calculus with respect to the position and time variables. This vector quantity is usually referred to as a vector field.

Vector fields are extremely important quantities in physics and we will return to them often.

1.2 Coordinate system invariance

We have said that a vector is independent of any specific coordinate system; in other words, that a vector is independent of how we choose to characterize it. This seems like an obvious criterion, but there are physical quantities which have magnitude and direction but are not vectors; an example of this in optics is the set of principal indices of refraction of an anisotropic crystal. Thus, to define a vector properly, we need to formulate mathematically this concept of coordinate system invariance. Furthermore, it is not uncommon to require, in the solution of a physical problem, the transformation from one coordinate system to another. We therefore take some time to study the mathematics relating to the behavior of a vector under a change of coordinates.

The simplest coordinate transformation is a change of origin, leaving the orientation of the axes unchanged. The only vector that depends explicitly upon the origin is the position vector $\mathbf{r}$, which is a measure of the vector distance from the origin. If the origin of a new coordinate system, described by position vector $\mathbf{r}'$, is located at the position $\mathbf{r}_0$ from the old origin, the coordinates are related by the formula

$$\mathbf{r}' = \mathbf{r} - \mathbf{r}_0.$$

Most other basic vectors depend upon the displacement vector $\mathbf{R} = \mathbf{r}_2 - \mathbf{r}_1$, i.e. the change in position, and therefore are unaffected by a change in origin. Examples include the velocity, momentum, and force upon an object.

A less trivial example of a change of coordinate system is a change of the orientation of the coordinate axes, and its effect on a position vector $\mathbf{r}$. For simplicity, we first consider the two-dimensional case. The vector $\mathbf{r}$ may be written in one coordinate system as $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}}$, while in a second coordinate system this vector may be written as $\mathbf{r} = x'\hat{\mathbf{x}}' + y'\hat{\mathbf{y}}'$. The $(x, y)$ coordinate axes are rotated to a new location to become the $(x', y')$ axes, while leaving the vector $\mathbf{r}$ (in particular, the location of the tip of $\mathbf{r}$) fixed. The question we ask: what are the components of the vector $\mathbf{r}$ in the new coordinate system, which makes an angle $\phi$ with the old system? The relation between the two systems is illustrated in Fig. 1.4.

Figure 1.4 Illustration of the position vector $\mathbf{r}$ and its components in the two coordinate systems.

By straightforward trigonometry, one can readily find that the new coordinates of the vector, $(x', y')$, may be written in terms of the old coordinates as

$$x' = x\cos\phi + y\sin\phi,$$

$$y' = -x\sin\phi + y\cos\phi.$$

These equations are based on the assumption that the magnitude and direction of the vector are independent of the coordinate system, and this assumption should hold for anything we refer to as a vector. We therefore define a vector as a quantity whose components transform under rotations just as the position vector $\mathbf{r}$ does, i.e. a vector $\mathbf{A}$ with components $A_x$ and $A_y$ in the unprimed system should have components

$$A'_x = A_x\cos\phi + A_y\sin\phi,$$

$$A'_y = -A_x\sin\phi + A_y\cos\phi$$

in the primed system.

It is important to emphasize again that we are only rotating the coordinate axes, and that the vector $\mathbf{A}$ does not change: $(A_x, A_y)$ and $(A'_x, A'_y)$ are representations of the vector, different ways of describing the same physical property. Indeed, another way to define a vector is that it is a quantity with magnitude and direction that is independent of the coordinate system.

We can also interpret Eqs. (1.15) and (1.16) in an entirely different manner: if we were to physically rotate the vector $\mathbf{A}$ over an angle $-\phi$ about the origin of the coordinate system, the new direction of the vector in the same coordinate system would be given by $(A'_x, A'_y)$. A rotation of the coordinate system in the direction $\phi$ is mathematically equivalent to a rotation of the vector by an angle $-\phi$.
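The equivalence between rotating the axes by $\phi$ and rotating the vector by $-\phi$ can be checked numerically. The sketch below assumes the standard two-dimensional rotation formulas; the function names `rotate_axes` and `rotate_vector`, the sample vector, and the angle are illustrative choices.

```python
import math

def rotate_axes(v, phi):
    """Components of a fixed 2D vector v in axes rotated counterclockwise by phi."""
    x, y = v
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi))

def rotate_vector(v, angle):
    """Components, in fixed axes, of v after rotating it counterclockwise by angle."""
    x, y = v
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

# Illustrative vector and rotation angle (assumed values)
A = (3.0, 4.0)
phi = math.radians(30.0)

A_primed = rotate_axes(A, phi)       # same vector, new axes
A_active = rotate_vector(A, -phi)    # rotated vector, same axes

print("axes rotated by phi:   ", A_primed)
print("vector rotated by -phi:", A_active)
print("magnitude before and after:", math.hypot(*A), math.hypot(*A_primed))
```

The two results coincide, and the magnitude is unchanged, which is the invariance the definition of a vector relies on.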

To generalize the discussion to three (or more) dimensions, it helps to modify the notation somewhat. We write

$$x_1 = x, \quad x_2 = y, \qquad x'_1 = x', \quad x'_2 = y',$$

and define

$$a_{11} = \cos\phi, \quad a_{12} = \sin\phi = \cos(\pi/2 - \phi) = \cos(\phi - \pi/2), \quad a_{21} = -\sin\phi = -a_{12} = \cos(\phi + \pi/2), \quad a_{22} = \cos\phi.$$

These transformations may be written in a summation format,

$$x'_i = \sum_{j=1}^{2} a_{ij} x_j, \qquad i = 1, 2,$$

where $x'_i$ is the $i$th component of the vector in the primed frame, $x_j$ is the $j$th component of the vector in the unprimed frame, and the notation $\sum_{j=n}^{m}$ indicates summation over all terms with index $j$ ranging from $n$ to $m$. The quantity $a_{ij}$ can be seen from Eqs. (1.19) to be the direction cosine with respect to the $i$th primed coordinate and the $j$th unprimed coordinate.

What happens if we run the rotation in reverse? We can still use Eq. (1.21), but we replace $\phi$ by $-\phi$ and switch the primed and unprimed coordinates, i.e. the primed coordinates are now the start of the rotation and the unprimed coordinates are now the end of the rotation. With these changes, Eq. (1.21) becomes

$$x_i = \sum_{j=1}^{2} a_{ji} x'_j.$$

This formula can be derived by taking partial derivatives of the transformation equations for $\mathbf{r}$, namely Eq. (1.26). The quantity $a_{ij}$ has more components than a vector and will be seen to be a tensor, which can be represented in matrix form; we will discuss such beasts

and from this expression we may also write

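an orthogonality condition for the coefficients $a_{ij}$. We begin with the transformation of the vector $\mathbf{V}$ and its reverse,

$$V'_i = \sum_j a_{ij} V_j, \qquad V_k = \sum_i a_{ik} V'_i.$$

Substituting the first expression into the second gives

$$V_k = \sum_j \left[ \sum_i a_{ik} a_{ij} \right] V_j.$$

The left-hand side of this equation is the $k$th component of the vector in the unprimed frame. The right-hand side of the equation is a weighted sum of all components of the vector in the unprimed frame. By the use of Eqs. (1.27) and (1.29), we may write the quantity in square brackets as

$$\sum_i a_{ik} a_{ij} = \sum_i \frac{\partial x_k}{\partial x'_i} \frac{\partial x'_i}{\partial x_j} = \frac{\partial x_k}{\partial x_j},$$

where the final step is the result of application of the chain rule of calculus. Because the variables $x_j$, for $j = 1, \ldots, N$, are independent of one another, we readily find that

$$\sum_i a_{ik} a_{ij} = \delta_{jk},$$

where $\delta_{jk}$ is the Kronecker delta, equal to 1 if $j = k$ and 0 otherwise.

We will see a lot of the Kronecker delta in the future; remember it! It is another example of a tensor, like the rotation tensor $a_{ij}$.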
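The orthogonality of the rotation coefficients can be verified numerically for the two-dimensional rotation. The sketch below assumes the coefficients $a_{11} = a_{22} = \cos\phi$, $a_{12} = -a_{21} = \sin\phi$; the function names and the test angle are illustrative, and indices run from 0 in the code rather than from 1.

```python
import math

def rotation_coefficients(phi):
    """Coefficients a[i][j] for a 2D rotation of the axes by phi (0-based indices)."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

def kronecker_delta(j, k):
    """1 if the indices agree, 0 otherwise."""
    return 1 if j == k else 0

def orthogonality_residual(a):
    """Largest deviation of sum_i a[i][j] * a[i][k] from delta_jk."""
    n = len(a)
    return max(
        abs(sum(a[i][j] * a[i][k] for i in range(n)) - kronecker_delta(j, k))
        for j in range(n)
        for k in range(n)
    )

# Illustrative angle in radians (assumed value)
residual = orthogonality_residual(rotation_coefficients(0.7))
print("largest deviation from the Kronecker delta:", residual)
```

The residual is zero to rounding error for any angle, reflecting $\cos^2\phi + \sin^2\phi = 1$ on the diagonal and exact cancellation off the diagonal.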


1.3 Vector multiplication

In Section 1.1, we looked at the addition of vectors, which may be considered a generalization of the addition of ordinary numbers. One can envision that there exists a generalized form of multiplication for vectors, as well; with three components for each vector, however, there are a large number of possibilities for what we might call “vector multiplication”. Just as vectors themselves are invariant under a change of coordinates, any vector multiplication should also be invariant under a change of coordinates. It turns out that for three-dimensional vectors there exist four possibilities, three of which we discuss here.²

1.3.1 Multiplication by a scalar

The simplest form of multiplication involving vectors is the multiplication of a vector by a scalar. The effect of such a multiplication is the “scaling” of each component of the vector equally by the scalar $\alpha$, i.e.

$$\alpha\mathbf{V} = \alpha V_x\hat{\mathbf{x}} + \alpha V_y\hat{\mathbf{y}} + \alpha V_z\hat{\mathbf{z}}.$$

It is clear from the above that the act of multiplying by a scalar does not change the direction of the vector (apart from reversing it when $\alpha$ is negative), but only scales its length by the factor $|\alpha|$; we may formally write $|\alpha\mathbf{V}| = |\alpha||\mathbf{V}|$. The result of the multiplication is also a vector, as it is clear that this product is invariant under a rotation of the coordinate axes, which does not affect the vector length.

It is to be noted that we may also consider scalar multiplication “backwards”, i.e. that it represents the multiplication of a scalar by a vector, with the end result being a vector. This interpretation will be employed in Chapter 2 to help categorize the different types of vector differentiation.

1.3.2 Scalar or dot product

The scalar product (or “dot product”) between two vectors is represented by a dot and is defined as

$$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta, \tag{1.37}$$

where $A$ and $B$ are the magnitudes of vectors $\mathbf{A}$ and $\mathbf{B}$, respectively, and $\theta$ is the angle between the two vectors.

The rotational invariance of this quantity is almost obvious from its definition, for we know that the magnitudes of vectors and the angles between any two vectors are all unchanged under rotations. We will confirm this more rigorously in a moment.

In terms of components in a particular Cartesian coordinate system, the scalar product is given by³

$$\mathbf{A}\cdot\mathbf{B} = \sum_i A_i B_i = A_x B_x + A_y B_y + A_z B_z.$$

² A discussion of the fourth, the direct product, will be deferred until Section 5.6.

³ From now on, we no longer write the upper and lower ranges of the summations.

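We can use this representation of the scalar product to rigorously prove that it is invariant under rotations. We start with the scalar product in the primed coordinate system, and substitute into it the representation of the primed vectors in terms of the unprimed coordinates,

$$\mathbf{A}'\cdot\mathbf{B}' = \sum_i A'_i B'_i = \sum_i \sum_j \sum_k a_{ij} a_{ik} A_j B_k = \sum_j \sum_k \delta_{jk} A_j B_k = \sum_j A_j B_j = \mathbf{A}\cdot\mathbf{B},$$

where the orthogonality of the coefficients $a_{ij}$ has been used, and we have proven that the scalar product is invariant under rotations.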
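A numerical spot check of this invariance is sketched below for a rotation of the axes about the $z$ axis. The helper names `dot` and `rotate_axes_about_z`, the sample vectors, and the angle are assumptions made for the illustration; a general three-dimensional rotation would behave the same way.

```python
import math

def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def rotate_axes_about_z(v, phi):
    """Components of a fixed vector v in axes rotated by phi about the z axis."""
    x, y, z = v
    c, s = math.cos(phi), math.sin(phi)
    return (c * x + s * y, -s * x + c * y, z)

# Illustrative vectors and angle (assumed values)
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)
phi = 1.1

before = dot(A, B)
after = dot(rotate_axes_about_z(A, phi), rotate_axes_about_z(B, phi))
print("A . B in the unprimed frame:", before)
print("A . B in the primed frame:  ", after)
```

The two values agree to rounding error even though every individual $x$ and $y$ component has changed.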

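The dot product may be used to demonstrate another familiar geometrical formula, the law of cosines. Defining

$$\mathbf{C} = \mathbf{A} + \mathbf{B},$$

it follows from the parallelogram law of vector addition that $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ form the sides of a triangle. If we take the dot product of $\mathbf{C}$ with itself,

$$\mathbf{C}\cdot\mathbf{C} = (\mathbf{A} + \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = \mathbf{A}\cdot\mathbf{A} + \mathbf{B}\cdot\mathbf{B} + 2\mathbf{A}\cdot\mathbf{B}. \tag{1.42}$$

The dot product of a vector with itself is simply the squared magnitude of the vector, and the dot product of $\mathbf{A}$ and $\mathbf{B}$ is defined by Eq. (1.37). We thus arrive at

$$C^2 = A^2 + B^2 + 2AB\cos\theta,$$

the law of cosines. Here $\theta$ is the angle between $\mathbf{A}$ and $\mathbf{B}$ when they are placed tail to tail; the interior angle of the triangle opposite $\mathbf{C}$ is $\pi - \theta$, which recovers the familiar form of the law with a minus sign.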
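The law of cosines in this form can be checked by building two vectors with a known angle between them and comparing both sides. The magnitudes and the angle below are illustrative assumptions, as are the helper names.

```python
import math

def add(a, b):
    """Component-wise vector sum."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def magnitude(a):
    """Vector length from Cartesian components."""
    return math.sqrt(sum(ai * ai for ai in a))

# Assumed magnitudes and angle between A and B (tail to tail)
a_len, b_len = 2.0, 3.0
theta = math.radians(60.0)

A = (a_len, 0.0, 0.0)
B = (b_len * math.cos(theta), b_len * math.sin(theta), 0.0)
C = add(A, B)

lhs = magnitude(C) ** 2
rhs = a_len**2 + b_len**2 + 2.0 * a_len * b_len * math.cos(theta)
print("C^2 from components:    ", lhs)
print("A^2 + B^2 + 2AB cos(th):", rhs)
```

Both sides evaluate to 19 for these values, up to rounding.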

1.3.3 Vector or cross product

The third form of rotationally invariant product involving vectors is the vector product or “cross product”. Just as the scalar product is named such because the result of the product is a scalar, the result of the vector product is another vector. It is represented by a cross between vectors,

$$\mathbf{C} = \mathbf{A}\times\mathbf{B}.$$

In evident contrast with the scalar product, the magnitude of the vector product is defined as

$$|\mathbf{C}| = AB\sin\theta,$$

where $\theta$ is again the angle between the vectors $\mathbf{A}$ and $\mathbf{B}$. From this definition, it is to be noted that the magnitude of $\mathbf{C}$ is the area of the parallelogram formed by $\mathbf{A}$ and $\mathbf{B}$, and that $\mathbf{A}\times\mathbf{A} = 0$ for any vector $\mathbf{A}$. The parallelogram formed by $\mathbf{A}$ and $\mathbf{B}$ defines a plane, and we assign to the vector $\mathbf{C}$ a direction such that it is perpendicular to the plane of $\mathbf{A}$ and $\mathbf{B}$ in a manner such that $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ form a right-handed system. This is illustrated in Fig. 1.5.

Figure 1.5 Illustration of the cross product. Vector $\mathbf{A}$ points in the $+\hat{\mathbf{y}}$ direction, while $\mathbf{B}$ points in the $-\hat{\mathbf{x}}$ direction. Then $\mathbf{C} = \mathbf{A}\times\mathbf{B}$ points in the $+\hat{\mathbf{z}}$ direction, while $\mathbf{C} = \mathbf{B}\times\mathbf{A}$ points in the $-\hat{\mathbf{z}}$ direction.

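It immediately follows from this definition of direction that

$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A},$$

and that the unit vectors of a right-handed Cartesian system satisfy

$$\hat{\mathbf{x}}\times\hat{\mathbf{y}} = \hat{\mathbf{z}}, \qquad \hat{\mathbf{y}}\times\hat{\mathbf{z}} = \hat{\mathbf{x}}, \qquad \hat{\mathbf{z}}\times\hat{\mathbf{x}} = \hat{\mathbf{y}}.$$

From these results, we may determine the Cartesian components of $\mathbf{C}$ in terms of the components of $\mathbf{A}$ and $\mathbf{B}$,

$$C_x = A_y B_z - A_z B_y, \qquad C_y = A_z B_x - A_x B_z, \qquad C_z = A_x B_y - A_y B_x. \tag{1.49}$$

These rules can be put in a convenient determinant form,

$$\mathbf{C} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}.$$

We may also write the cross product in a summation form, though we need to introduce an additional tensor, the Levi-Civita tensor $\epsilon_{ijk}$,

$$C_i = \sum_j \sum_k \epsilon_{ijk} A_j B_k,$$

where

$$\epsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ are cyclic (counting upwards), i.e. } ijk = 123, 231, 312, \\ -1 & \text{if } ijk \text{ are anti-cyclic (counting downwards), i.e. } ijk = 321, 213, 132, \\ 0 & \text{otherwise.} \end{cases} \tag{1.52}$$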

The Levi-Civita tensor is noteworthy because it depends on three indices, $i$, $j$, $k$, whereas the previous tensors introduced, such as $\delta_{ij}$, depended only on two. The Levi-Civita tensor is a more complicated type of tensor than the Kronecker delta.

One other property of the cross product will be found to be consistently useful. It follows immediately from the definition of the direction of the cross product that

$$\mathbf{A}\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\mathbf{A}\times\mathbf{B}) = 0.$$

In other words, the cross product of two vectors is orthogonal (perpendicular) to each of the vectors individually. This can also be readily shown by using the component definitions of the dot and cross products.
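The component formula (1.49), the Levi-Civita summation, and the orthogonality property can be compared in a few lines. This is a sketch with illustrative names; note that the code uses 0-based indices (0, 1, 2) in place of the 1, 2, 3 of the text, and the closed-form expression in `levi_civita` is a convenience that is valid only for indices drawn from 0, 1, 2.

```python
def levi_civita(i, j, k):
    """+1 for cyclic (0,1,2), -1 for anti-cyclic, 0 if any index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def cross_components(a, b):
    """Cross product written out component by component, as in Eq. (1.49)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def cross_levi_civita(a, b):
    """Cross product as the double sum over epsilon_ijk * a_j * b_k."""
    return tuple(
        sum(levi_civita(i, j, k) * a[j] * b[k] for j in range(3) for k in range(3))
        for i in range(3)
    )

def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Illustrative integer vectors (assumed values)
A = (1, 2, 3)
B = (4, 5, 6)

C = cross_components(A, B)
print("A x B (components): ", C)
print("A x B (Levi-Civita):", cross_levi_civita(A, B))
print("A . C =", dot(A, C), "  B . C =", dot(B, C))
```

Both routes give the same vector, and its dot product with each factor vanishes.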

1.4 Useful products of vectors

Two types of multiple product of vectors appear quite regularly in physical problems; we briefly discuss each of them.

1.4.1 Triple scalar product

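The triple scalar product is given by

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}).$$

It can readily be shown that the triple scalar product may be written in terms of components as

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = A_x(B_y C_z - B_z C_y) + A_y(B_z C_x - B_x C_z) + A_z(B_x C_y - B_y C_x). \tag{1.55}$$

From this result, the following symmetry properties of the triple scalar product can be deduced:

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = \mathbf{C}\cdot(\mathbf{A}\times\mathbf{B}).$$

Figure 1.6 Illustrating the cyclic nature of the triple scalar product, $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{C}\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A})$. The vectors may all be “moved” to the left or right without changing the value of the product.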

Geometrically, one can show that the triple scalar product represents the (signed) volume of the parallelepiped formed by the vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$.
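The cyclic symmetry and the volume interpretation can both be illustrated with a short calculation. The helper names and sample vectors below are assumptions made for this sketch.

```python
def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product from Cartesian components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple_scalar(a, b, c):
    """A . (B x C)."""
    return dot(a, cross(b, c))

# Illustrative vectors (assumed values)
A = (1, 2, 3)
B = (4, 5, 6)
C = (7, 8, 10)

print("A.(BxC) =", triple_scalar(A, B, C))
print("B.(CxA) =", triple_scalar(B, C, A))
print("C.(AxB) =", triple_scalar(C, A, B))

# A rectangular box with edges 1, 2, 3 along the axes has volume 6
box_volume = triple_scalar((1, 0, 0), (0, 2, 0), (0, 0, 3))
print("box volume =", box_volume)
```

All three cyclic orderings return the same number, while swapping any two vectors flips the sign, which is why the volume is best read as the absolute value of the product.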

1.4.2 Triple vector product

The triple vector product is the familiar rule, usually known as the BAC-CAB rule,

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}).$$

The BAC-CAB rule, along with its vector calculus cousin (to be discussed in Section 2.6.3), are perhaps the most commonly useful vector formulas, and they should be burned into memory!

The triple vector product may be confirmed by a straightforward representation of the vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ in a Cartesian coordinate system.
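A component-level confirmation of the BAC-CAB rule for one set of vectors is sketched below. It is a spot check rather than a proof, and the helper names and sample vectors are illustrative.

```python
def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product from Cartesian components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def scale(s, a):
    """Multiply each component of a by the scalar s."""
    return tuple(s * ai for ai in a)

def subtract(a, b):
    """Component-wise difference a - b."""
    return tuple(ai - bi for ai, bi in zip(a, b))

# Illustrative vectors (assumed values)
A = (1, 2, 3)
B = (4, 5, 6)
C = (7, 8, 10)

lhs = cross(A, cross(B, C))                               # A x (B x C)
rhs = subtract(scale(dot(A, C), B), scale(dot(A, B), C))  # B(A.C) - C(A.B)
print("A x (B x C)      =", lhs)
print("B(A.C) - C(A.B)  =", rhs)
```

Integer components are used so that the two sides can be compared exactly.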

1.5 Linear vector spaces

In this chapter, we have so far looked at the properties of individual vectors. We may also, however, consider the set of all possible vectors of a certain class as a whole, in what is referred to as a linear vector space. In anticipation of future discussion, we take some time to formally define a linear vector space and some other related spaces.

Definition 1.1 (Linear vector space) A linear vector space $S$ is defined as a set of elements called vectors, which satisfy the following ten properties related to elements $|x\rangle$, $|y\rangle$ and $|z\rangle$:

1. $|x\rangle + |y\rangle \in S$ (completeness with respect to addition),

2. $(|x\rangle + |y\rangle) + |z\rangle = |x\rangle + (|y\rangle + |z\rangle)$ (associativity),

3. $|x\rangle + |0\rangle = |x\rangle$ (existence of zero),

4. $\exists\, |y\rangle$ such that $|x\rangle + |y\rangle = |0\rangle$ (existence of negative element),

5. $|x\rangle + |y\rangle = |y\rangle + |x\rangle$ (commutativity),

6. $a|x\rangle \in S$ (completeness with respect to scalar multiplication),

7. $a(b|x\rangle) = (ab)|x\rangle$ (associativity of scalar multiplication),

8. $(a + b)|x\rangle = a|x\rangle + b|x\rangle$ (first distribution rule),

9. $a(|x\rangle + |y\rangle) = a|x\rangle + a|y\rangle$ (second distribution rule),

10. $1|x\rangle = |x\rangle$ (existence of unit element).

To emphasize the general nature of this definition, we have adopted the so-called “bra-ket” notation for vectors, where $|x\rangle$ is an element of the vector space, and referred to as a “ket”. We will discuss the “bra” form of a vector, $\langle x|$, momentarily. The symbol $\in$ indicates that the given vector belongs to the given set, i.e. $|x\rangle \in S$ means that $|x\rangle$ is a member of the set $S$. The symbol $\exists$ is mathematical shorthand for “there exists”.

The first five items of the definition describe the behavior of vectors under addition; the last five items describe the behavior of vectors under scalar multiplication. Most of these items (namely, 2–5, 7–10) are clearly satisfied by the three-dimensional vectors discussed throughout this chapter. The first item is a statement that the object resulting from the addition of two vectors is itself a vector, and the sixth item is a statement that the object resulting from scalar multiplication of a vector is itself a vector. The first four items above classify the vector space as a mathematical group.⁵

This definition is broad enough that we may also refer to the complete set of numbers x on the real line −∞ < x < ∞ as forming a vector space, with the numbers x serving the role of “vectors”. Elements of a vector space may also consist of an array of numbers: the set of all real-valued matrices of a given size also satisfies the definition of a linear vector space. Elements of a vector space may have complex values: the set of all complex numbers z = x + iy in the plane satisfies the definition of a linear vector space. We will assume throughout the rest of this section that the vectors are in general complex-valued.
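The properties of Definition 1.1 can be spot-checked numerically. The following is a minimal sketch, assuming vectors are modeled as tuples of three complex numbers; the helper names (add, scale, close, check_axioms) are illustrative and not part of the text. Properties 1 and 6 hold by construction here, since both operations return tuples of the same length, so only the remaining eight are tested, and only for one random sample rather than in general.

```python
import random

def add(x, y):
    """Component-wise vector addition."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(a, x):
    """Multiplication of the vector x by the scalar a."""
    return tuple(a * xi for xi in x)

def close(x, y, tol=1e-9):
    """True if two vectors agree up to floating-point rounding."""
    return all(abs(xi - yi) <= tol for xi, yi in zip(x, y))

def check_axioms(x, y, z, a, b):
    """Test properties 2-5 and 7-10 of Definition 1.1 for one sample."""
    zero = tuple(0j for _ in x)
    return {
        "2 associativity": close(add(add(x, y), z), add(x, add(y, z))),
        "3 zero": close(add(x, zero), x),
        "4 negative": close(add(x, scale(-1, x)), zero),
        "5 commutativity": close(add(x, y), add(y, x)),
        "7 scalar associativity": close(scale(a, scale(b, x)), scale(a * b, x)),
        "8 first distribution": close(scale(a + b, x),
                                      add(scale(a, x), scale(b, x))),
        "9 second distribution": close(scale(a, add(x, y)),
                                       add(scale(a, x), scale(a, y))),
        "10 unit": close(scale(1, x), x),
    }

def random_vector(rng, n=3):
    """A random complex-valued vector with n components."""
    return tuple(complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
                 for _ in range(n))

rng = random.Random(0)  # fixed seed so the sample is reproducible
x, y, z = (random_vector(rng) for _ in range(3))
a, b = complex(0.5, -2.0), complex(-1.5, 0.25)
results = check_axioms(x, y, z, a, b)
failed = [name for name, ok in results.items() if not ok]
print("failed properties:", failed)
```

A check of this kind only illustrates the definition; it cannot prove that a set is a vector space.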

With such a broad definition, it is not clear what we have gained, other than listing properties of vectors that are obvious or seemingly trivial. When we begin to study more general classes of vectors, however, we will see that there are important properties which have to do with the set of vectors (the space) as a whole, and this general formalism will be useful. We briefly consider some of these concepts to prepare for future discussions.

Definition 1.2 (Linear independence) A set of vectors X = {|x_1⟩, |x_2⟩, ..., |x_N⟩} is said to be linearly independent if the only solution to the equation

a_1|x_1⟩ + a_2|x_2⟩ + ··· + a_N|x_N⟩ = |0⟩ (1.59)

is a_i = 0 for all i, where a_i are scalars.

Linear independence implies that no vector in the set X may be written as a linear combination of the others. A set X of vectors which is not linearly independent is linearly dependent.

Example 1.1 A trivial example of linearly independent vectors in three-dimensional space is the set of unit vectors x̂, ŷ, ẑ. The only way to construct a 0-vector from these is by setting their coefficients all equal to zero, i.e. x x̂ + y ŷ + z ẑ = 0 only for x = y = z = 0. A less trivial example is the trio of vectors in three dimensions listed below,

x_1 = x̂ + 2ŷ, x_2 = 2x̂ + 4ŷ + ẑ, x_3 = 5ẑ. (1.60)

If we add these vectors with coefficients a_1 = 1, a_2 = −1/2 and a_3 = 1/10, the result is 0; the vectors are therefore linearly dependent.

⁵ Group theory is a beautiful and general mathematical theory which is especially useful in atomic, nuclear and particle physics. See [Ham64, Cor97] for a detailed discussion.
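The arithmetic of Example 1.1 can be verified directly. The sketch below forms the linear combination with the stated coefficients using exact fractions. It also evaluates the 3×3 determinant of the vector components, relying on the standard result (not stated in the text) that three vectors in three dimensions are linearly independent exactly when this determinant is nonzero. The function names combine and det3 are illustrative.

```python
from fractions import Fraction

def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i], component by component."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(len(vectors[0])))

def det3(rows):
    """Determinant of a 3x3 array given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Components of the vectors of Eq. (1.60) along the unit vectors x, y, z.
x1 = (1, 2, 0)
x2 = (2, 4, 1)
x3 = (0, 0, 5)

# Coefficients a_1 = 1, a_2 = -1/2, a_3 = 1/10 from the example.
coeffs = (Fraction(1), Fraction(-1, 2), Fraction(1, 10))
result = combine(coeffs, (x1, x2, x3))
print("combination:", result)

# A vanishing determinant signals linear dependence.
dependent_det = det3((x1, x2, x3))
unit_det = det3(((1, 0, 0), (0, 1, 0), (0, 0, 1)))
print("determinant of (x1, x2, x3):", dependent_det)
print("determinant of the unit vectors:", unit_det)
```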



We see that all three-dimensional vectors may be written as some combination of the unit vectors x̂, ŷ, ẑ. The trio of unit vectors is then said to span the three-dimensional space. In general, a set of vectors is said to span a vector space if all vectors in the space may be written as a linear combination of that set. Returning to three-dimensional space, it can be seen quite clearly that one can find at most three linearly independent vectors at once, i.e. any collection of four or more vectors is necessarily linearly dependent. We may use this observation as the definition of the dimension of a vector space: the dimension of a vector space is the maximum number of linearly independent vectors which may be found. We will, however, find many important cases where the dimension of the vector space is infinite.

A linearly independent set of vectors which spans a vector space is referred to as a basis for that space. There are always many possible choices of basis for any particular vector space. For instance, we may use the unit vectors (x̂, ŷ, ẑ) as a basis in three dimensions, but we may also use (x̂ + ŷ, x̂ − ŷ, ẑ), or even (x̂, x̂ + ŷ, ẑ). It is to be noted that in the last case the basis vectors are not even perpendicular to one another. A basis need not consist of mutually perpendicular vectors; the important characteristic is linear independence.
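To make the three choices of basis concrete, the sketch below expands one arbitrarily chosen vector, v = 3x̂ + 2ŷ + 5ẑ, in each of them. The expansion coefficients are found with Cramer's rule, which is one convenient method for a 3×3 system and is an assumption of this sketch rather than a technique used in the text. The names det3, dot and expand are illustrative.

```python
from fractions import Fraction

def det3(rows):
    """Determinant of a 3x3 array given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def dot(u, v):
    """Ordinary dot product of two real vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def expand(v, basis):
    """Coefficients c with v = c[0]*b1 + c[1]*b2 + c[2]*b3 (Cramer's rule)."""
    d = det3(basis)
    if d == 0:
        raise ValueError("vectors are linearly dependent, so they are not a basis")
    coeffs = []
    for k in range(3):
        # Replace the k-th basis vector by v and take the ratio of determinants.
        replaced = tuple(v if j == k else basis[j] for j in range(3))
        coeffs.append(Fraction(det3(replaced), d))
    return tuple(coeffs)

# The three bases mentioned in the text, as integer components along x, y, z.
standard = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
rotated = ((1, 1, 0), (1, -1, 0), (0, 0, 1))
skewed = ((1, 0, 0), (1, 1, 0), (0, 0, 1))

v = (3, 2, 5)  # arbitrary test vector
for name, basis in (("standard", standard), ("rotated", rotated), ("skewed", skewed)):
    print(name, expand(v, basis), "b1.b2 =", dot(basis[0], basis[1]))
```

The nonzero dot product of the first two vectors of the last basis confirms that they are not perpendicular, yet the expansion still exists and is unique.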

For a finite-dimensional space, finding a basis is usually a straightforward process. When the space is infinite-dimensional, however, things become much less clear. An important problem in dealing with infinite-dimensional vector spaces is proving that a set of vectors forms a complete basis.

There are several other types of space which should be mentioned. The first of these is an inner product space, also known as a scalar product space. To define such a space, we must first define an inner product.

Definition 1.3 (Inner product) Suppose we have a linear vector space S. An inner product is a rule that associates with any pair of vectors |x⟩ and |y⟩ a complex number z, which is written as z = ⟨y|x⟩ and satisfies the following properties:

1. ⟨y|x⟩ = (⟨x|y⟩)*, where * refers to the complex conjugate,
2. if |w⟩ = a|x⟩ + b|y⟩, then ⟨v|w⟩ = a⟨v|x⟩ + b⟨v|y⟩,
3. ⟨x|x⟩ ≥ 0, with equality occurring only for |x⟩ = 0.
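As an illustration of Definition 1.3, the sketch below assumes the familiar inner product for complex-valued component vectors, ⟨y|x⟩ = Σ_i y_i* x_i. This specific form is an assumption made for the example, since the definition itself does not prescribe one. The three properties are then checked for a few arbitrarily chosen vectors and scalars; the names inner and combo are illustrative.

```python
def inner(y, x):
    """Assumed inner product <y|x> = sum_i conj(y_i) * x_i."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))

def combo(a, x, b, y):
    """The vector a|x> + b|y>."""
    return tuple(a * xi + b * yi for xi, yi in zip(x, y))

# Arbitrary complex-valued test vectors and scalars.
x = (1 + 2j, 3 - 1j)
y = (2 - 1j, 1j)
v = (0.5j, 1 + 1j)
a, b = 2 - 3j, -1 + 0.5j

# Property 1: <y|x> is the complex conjugate of <x|y>.
conjugate_symmetry = abs(inner(y, x) - inner(x, y).conjugate()) < 1e-12

# Property 2: linearity in the "ket" argument.
w = combo(a, x, b, y)
linearity = abs(inner(v, w) - (a * inner(v, x) + b * inner(v, y))) < 1e-12

# Property 3: <x|x> is real and non-negative, and zero for the zero vector.
norm_squared = inner(x, x)
zero_norm = inner((0j, 0j), (0j, 0j))

print(conjugate_symmetry, linearity, norm_squared, zero_norm)
```

Note that with this convention the inner product is linear in the ket and conjugate-linear in the bra, which follows from combining properties 1 and 2.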

The inner product is also known as the scalar product, as the scalar or dot product of three-dimensional vectors satisfies the conditions for an inner product. Other examples of inner products will appear in future chapters.

We may now define an inner product space in a straightforward manner.

Definition 1.4 (Inner product space) A linear vector space which has a scalar product associated with it is known as an inner product space.


The way we have written the scalar product suggests that we may associate with every “ket” |x⟩ a corresponding “bra” vector ⟨x|, and the inner product is the formation of a “bracket” from these types of vectors. In general, we cannot say much about the nature of the “bra” vector, other than the fact that every “ket” has an associated “bra”. For a three-dimensional, real-valued vector, the “bra” and “ket” vectors are the same.

We may define the orthogonality of vectors quite generally through their inner product.

Definition 1.5 (Orthogonality) A pair of vectors |x⟩, |y⟩ are orthogonal if their inner product vanishes, i.e. ⟨x|y⟩ = 0.

The inner product also provides a measure of the distance between two vectors,

distance(|x⟩, |y⟩) = √[(⟨x| − ⟨y|)(|x⟩ − |y⟩)] = √[⟨x|x⟩ + ⟨y|y⟩ − 2Re{⟨x|y⟩}]. (1.64)

It is possible to introduce a more general concept of length without reference to an inner product. This results in a distinct type of space known as a metric space.

Definition 1.6 (Metric space) A metric space is defined as a set of elements which has a real, non-negative number ρ(x, y), called the metric, associated with any pair of elements x, y. The metric must satisfy the conditions

1. ρ(x, y) = ρ(y, x),
2. ρ(x, y) = 0 only for x = y,
3. ρ(x, y) + ρ(y, z) ≥ ρ(x, z).

It is to be noted that a metric space is a different beast than a linear vector space, as a metric space is defined without any reference to the properties of addition and scalar multiplication. Therefore there can exist linear vector spaces which are not metric spaces, and vice versa. An inner product space, however, with a “distance” defined as in Eq. (1.64), is automatically a metric space. One can show with some effort that the “distance” satisfies the three properties of a metric space. An inner product space is automatically a metric space, but a metric space is not necessarily an inner product space.
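The claim that an inner product space is automatically a metric space can be explored numerically. The sketch below assumes the component inner product ⟨y|x⟩ = Σ_i y_i* x_i and takes the distance of Eq. (1.64) to be the square root of ⟨x − y|x − y⟩; both are assumptions of this example. It then tests the three metric conditions on random complex vectors. This is a numerical spot check, not the proof alluded to in the text.

```python
import math
import random

def inner(y, x):
    """Assumed inner product <y|x> = sum_i conj(y_i) * x_i."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))

def distance(x, y):
    """Distance taken as sqrt(<x - y|x - y>)."""
    diff = tuple(xi - yi for xi, yi in zip(x, y))
    return math.sqrt(inner(diff, diff).real)

def expanded_distance(x, y):
    """The same distance written as sqrt(<x|x> + <y|y> - 2 Re<x|y>)."""
    value = inner(x, x).real + inner(y, y).real - 2 * inner(x, y).real
    return math.sqrt(max(value, 0.0))  # guard against tiny negative rounding

def random_vector(rng, n=3):
    """A random complex-valued vector with n components."""
    return tuple(complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
                 for _ in range(n))

rng = random.Random(1)  # fixed seed for reproducibility
symmetric = triangle = consistent = True
for _ in range(1000):
    x, y, z = (random_vector(rng) for _ in range(3))
    # Condition 1: rho(x, y) = rho(y, x).
    symmetric &= abs(distance(x, y) - distance(y, x)) < 1e-12
    # Condition 3: rho(x, y) + rho(y, z) >= rho(x, z), up to rounding.
    triangle &= distance(x, y) + distance(y, z) >= distance(x, z) - 1e-12
    # The two forms of the distance agree.
    consistent &= abs(distance(x, y) - expanded_distance(x, y)) < 1e-9

# Condition 2: the distance from a vector to itself vanishes.
zero_for_equal = distance(x, x) == 0.0
print(symmetric, triangle, consistent, zero_for_equal)
```

The square root matters here: the squared quantity ⟨x − y|x − y⟩ on its own does not satisfy the triangle inequality in general.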


The third property of the metric listed above deserves a bit of attention. It is the generalization of the familiar triangle inequality of geometry: the sum of the lengths of any two sides of a triangle is greater than the length of the third side.

This section contains a lot of definition without much application! The purpose, however, is to demonstrate that the familiar three-dimensional vector space is part of a larger “family” of vector spaces, all of which have the same basic properties. This big picture shows us which properties of vectors are important, and what properties to look for when defining new vector spaces. We will get much more practice in the study and analysis of linear vector spaces in Chapter 4, on linear algebra.

1.6 Focus: periodic media and reciprocal lattice vectors

The properties of waves propagating in a periodic medium have been the basis of a number of important physical discoveries and applications. In the early 1900s, Max von Laue first demonstrated that X-rays could be diffracted by crystals [vL13], and soon afterwards William Henry Bragg and William Lawrence Bragg applied X-ray diffraction to the analysis of crystal structure [Bra14]. X-ray diffraction occurs because the wavelengths of X-rays are comparable to the spacing of atoms in a crystal; visible light, with much longer wavelength, is not sensitive to this atomic lattice and its propagation can be treated as though the medium were homogeneous. In recent decades, however, researchers have engineered materials with periodic structures on the order of the wavelength of visible light. These materials, known as photonic crystals, have optical properties analogous to the X-ray properties of ordinary crystals [JMW95].

The basic theory of crystals and other periodic media involves the application of a number of concepts of vector algebra from this chapter. We restrict ourselves for the moment to discussions of X-ray diffraction, though the mathematics also applies broadly to discussions of photonic crystals and even to electron propagation in crystals.

The basic structure used to model crystals is the Bravais lattice, which is defined as an infinite array of discrete units arranged such that the system looks exactly the same from any of the lattice points. Some examples of two-dimensional lattices are illustrated in Fig. 1.7. The units of the Bravais lattice may represent individual atoms, collections of atoms, or molecules; the units are identical throughout the lattice, however.

As the lattice is of infinite spatial extent, it looks the same under translations from one unit to another. For instance, if we jump vertically from one unit to another in the square lattice, its appearance is unchanged. We can characterize this mathematically in a second, equivalent, definition of a Bravais lattice, as the set of all points which have position vectors R of the form

R = m_1 a_1 + m_2 a_2 + m_3 a_3, (1.65)

where −∞ < m_i < ∞ are integers, with i = 1, 2, 3, and a_i are known as primitive vectors which characterize the lattice. Different sets of integers m_i characterize different points in the lattice. The primitive vectors are a linearly independent, but not necessarily orthogonal, set of vectors. Every point of the lattice can be reached by an appropriate choice of the integers m_i; in such a case, the primitive vectors are said to span the lattice.

Figure 1.7 A pair of two-dimensional Bravais lattices: a square array of units and a triangular array of units. The units are represented as black dots. Examples of primitive vectors a_1 and a_2 are shown.

One immediate consequence of our vector definition of the Bravais lattice is that we can define the volume of a unit of the lattice. The volume of the parallelepiped with all m_i = 1 is simply V = |a_1 · (a_2 × a_3)|.
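The lattice definition of Eq. (1.65) and the cell volume V = |a_1 · (a_2 × a_3)| translate directly into a short calculation. The sketch below uses a hypothetical set of non-orthogonal primitive vectors chosen only for illustration; they do not correspond to any lattice in the text. It generates the lattice points with each m_i in {−1, 0, 1} and evaluates the volume via the triple product.

```python
from itertools import product

def dot(u, v):
    """Dot product of two three-dimensional vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product of two three-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def lattice_point(m, primitive):
    """Position vector R = m1*a1 + m2*a2 + m3*a3 of Eq. (1.65)."""
    return tuple(sum(mi * a[k] for mi, a in zip(m, primitive))
                 for k in range(3))

def cell_volume(a1, a2, a3):
    """Volume |a1 . (a2 x a3)| of the parallelepiped spanned by a1, a2, a3."""
    return abs(dot(a1, cross(a2, a3)))

# Hypothetical primitive vectors: linearly independent but not orthogonal.
primitive = ((1, 0, 0), (1, 2, 0), (0, 0, 3))

# All lattice points with each integer m_i in {-1, 0, 1}.
points = [lattice_point(m, primitive) for m in product(range(-1, 2), repeat=3)]
volume = cell_volume(*primitive)

print("number of points:", len(points))
print("point with all m_i = 1:", lattice_point((1, 1, 1), primitive))
print("cell volume:", volume)
```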

It is important to note that the choice of primitive vectors is not unique. Figure 1.8 shows several different pairs of primitive vectors which span the lattice. A little thought will convince the reader that any pair of these primitive vectors can be constructed out of any other pair, thus demonstrating their equivalence.

Figure 1.9 A precession photograph of a zero-level of the X-ray diffraction pattern. The photo was made on Polaroid film with unfiltered copper radiation. (Courtesy of Professor Daniel S. Jones of UNC Charlotte.)
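The equivalence of different primitive-vector pairs can be phrased as a simple test: two pairs generate the same two-dimensional lattice when each vector of one pair is an integer combination of the other pair, and vice versa. The sketch below applies this test to a square lattice. The specific pairs are hypothetical examples and are not taken from Fig. 1.8, and the function names are illustrative. Integer components are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def det2(u, v):
    """Determinant of the 2x2 array with rows u and v."""
    return u[0] * v[1] - u[1] * v[0]

def coefficients(v, a1, a2):
    """Coefficients (c1, c2) with v = c1*a1 + c2*a2, by Cramer's rule."""
    d = det2(a1, a2)
    if d == 0:
        raise ValueError("a1 and a2 are linearly dependent")
    return (Fraction(det2(v, a2), d), Fraction(det2(a1, v), d))

def integer_combinations(vectors, basis):
    """True if every vector is an integer combination of the basis pair."""
    return all(c.denominator == 1
               for v in vectors
               for c in coefficients(v, *basis))

def same_lattice(pair_a, pair_b):
    """True if the two pairs of primitive vectors generate the same lattice."""
    return (integer_combinations(pair_b, pair_a)
            and integer_combinations(pair_a, pair_b))

square = ((1, 0), (0, 1))        # obvious primitive vectors of a square lattice
sheared = ((1, 0), (1, 1))       # hypothetical alternative pair
oblique = ((1, 1), (2, 1))       # another hypothetical alternative pair
doubled = ((2, 0), (0, 1))       # spans only every second column of points

for name, pair in (("sheared", sheared), ("oblique", oblique), ("doubled", doubled)):
    print(name, same_lattice(square, pair), "cell area:", abs(det2(*pair)))
```

Pairs that generate the same lattice have the same cell area, which is consistent with the volume formula given earlier; the doubled pair fails the test and has twice the area.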

We now turn to the diffraction of X-rays, and see how the tools of vector algebra aid in the understanding of this phenomenon. When a crystal is illuminated by a broadband (multifrequency) X-ray beam, the X-rays are scattered only in isolated directions, forming a pattern of spots at the detector; an example is shown in Fig. 1.9. There are two ways to interpret the experimental results, due to the Braggs and von Laue, and we consider each interpretation and then show them to be equivalent.

The Braggs’ approach involves the introduction of lattice planes in the crystal. Any three noncollinear points of a Bravais lattice define a planar surface which intersects an infinite number of lattice points. Because of the periodicity of the lattice, we may then construct an infinite family of parallel lattice planes for a given choice of noncollinear points. There are an infinite number of families of lattice planes, several of which are illustrated for a two-dimensional square lattice in Fig. 1.10.

Bragg assumed that these planes of atoms act essentially as planar surfaces from which X-rays reflect in a specular manner. Rays that are reflected from adjacent planes travel different distances, with the ray reflected from the lower plane traveling farther by a distance 2d cos θ, where θ is the angle of incidence⁶ and d is the distance between planes; this is illustrated in Fig. 1.11.

⁶ In crystallography, the angle of incidence is typically measured from the lattice plane rather than the normal to the plane, as is done in optics; we stick to the optics definition for consistency with future discussions.


Figure 1.10 Illustration of different families of lattice planes of a square lattice.

Figure 1.11 Derivation of the Bragg condition. (Figure labels: angle θ, plane spacing d, and two path segments of length d cos θ.)

If we assume that the illuminating rays are monochromatic plane waves, any difference Δ in path between the two rays introduces a phase difference kΔ between them, where k = 2π/λ is the wavenumber of the rays and λ is the wavelength. When the two reflected rays are in phase, i.e. their phase difference is a multiple of 2π, there is constructive interference between them and a strong reflected signal. The Bragg condition therefore states that X-ray diffraction peaks appear for angles of illumination such that

2d cos θ = Nλ,

where N is an integer.
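As a numerical illustration, the sketch below assumes the Bragg condition in the form 2d cos θ = Nλ, with θ measured from the normal to the lattice planes as in the optics convention adopted above. It lists the angles at which diffraction peaks would appear for a given plane spacing and wavelength. The numerical values are illustrative only (a wavelength of 0.154 nm is typical of copper radiation, and the spacing of 0.2 nm is hypothetical), and the function name bragg_angles is not from the text.

```python
import math

def bragg_angles(d, wavelength):
    """Angles theta (radians, from the plane normal) with 2 d cos(theta) = N * wavelength.

    Returns a dict mapping each order N >= 1 to its angle. The order N = 0
    (theta = 90 degrees, grazing incidence) is omitted as trivial.
    """
    angles = {}
    n = 1
    while n * wavelength <= 2 * d:
        # min() guards against a ratio marginally above 1 from rounding.
        angles[n] = math.acos(min(1.0, n * wavelength / (2 * d)))
        n += 1
    return angles

d = 0.2             # plane spacing in nm (hypothetical value)
wavelength = 0.154  # X-ray wavelength in nm (illustrative value)
k = 2 * math.pi / wavelength  # wavenumber

angles = bragg_angles(d, wavelength)
for n, theta in angles.items():
    path_difference = 2 * d * math.cos(theta)
    phase = k * path_difference  # should equal 2*pi*N for constructive interference
    print(f"N = {n}: theta = {math.degrees(theta):.2f} deg, "
          f"phase / 2pi = {phase / (2 * math.pi):.6f}")
```

Only a finite number of orders appear, since cos θ cannot exceed one; the number of allowed orders depends on both d and λ, which is the hidden dependence on crystal structure noted in the following paragraph.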

The Bragg condition is the easiest way to understand X-ray diffraction phenomena, but it has complexity hidden within it, namely in the plane spacing d and the angle to the plane θ, both of which depend upon the specific crystal structure. An arguably more elegant formulation was produced by von Laue, and this formulation will require some tools of vector algebra.
