PURE AND APPLIED PHYSICS
A Series of Monographs and Textbooks
Consulting Editors: H. S. W. Massey and Keith A. Brueckner
GROUP THEORY
AND ITS APPLICATION TO THE
QUANTUM MECHANICS OF ATOMIC SPECTRA
EUGENE P. WIGNER, Palmer Physical Laboratory, Princeton University
Princeton, New Jersey
TRANSLATED FROM THE GERMAN BY J. J. GRIFFIN
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM,
BY PHOTOSTAT, MICROFILM, RETRIEVAL SYSTEM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 59-10741
PRINTED IN THE UNITED STATES OF AMERICA
The purpose of this book is to describe the application of group theoretical methods to problems of quantum mechanics with specific reference to atomic spectra. The actual solution of quantum mechanical equations is, in general, so difficult that one obtains by direct calculations only crude approximations to the real solutions. It is gratifying, therefore, that a large part of the relevant results can be deduced by considering the fundamental symmetry operations.

When the original German version was first published, in 1931, there was a great reluctance among physicists toward accepting group theoretical arguments and the group theoretical point of view. It pleases the author that this reluctance has virtually vanished in the meantime and that, in fact, the younger generation does not understand the causes and the basis for this reluctance. Of the older generation it was probably M. von Laue who first recognized the significance of group theory as the natural tool with which to obtain a first orientation in problems of quantum mechanics. Von Laue's encouragement of both publisher and author contributed significantly to bringing this book into existence. I like to recall his question as to which results derived in the present volume I considered most important. My answer was that the explanation of Laporte's rule (the concept of parity) and the quantum theory of the vector addition model appeared to me most significant. Since that time, I have come to agree with his answer that the recognition that almost all rules of spectroscopy follow from the symmetry of the problem is the most remarkable result.
Three new chapters have been added in translation. The second half of Chapter 24 reports on the work of Racah and of his followers. Chapter 24 of the German edition now appears as Chapter 25. Chapter 26 deals with time inversion, a symmetry operation which had not yet been recognized at the time the German edition was written. The contents of the last part of this chapter, as well as that of Chapter 27, have not appeared before in print. While Chapter 27 appears at the end of the book for editorial reasons, the reader may be well advised to glance at it when studying, in Chapters 17 and 24, the relevant concepts. The other chapters represent the translation of Dr. J. J. Griffin, to whom the author is greatly indebted for his ready acceptance of several suggestions and his generally cooperative attitude. He also converted the left-handed coordinate system originally used to a right-handed system and added an Appendix on notations.
The character of the book—its explicitness and its restriction to one subject only, viz. the quantum mechanics of atomic spectra—has not been changed. Its principal results were contained in articles first published in the Zeitschrift für Physik in 1926 and early 1927. The initial stimulus for these articles was given by the investigations of Heisenberg and Dirac on the quantum theory of assemblies of identical particles. Weyl delivered lectures in Zürich on related subjects during the academic year 1927-1928. These were later expanded into his well-known book.

When it became known that the German edition was being translated, many additions were suggested. It is regrettable that most of these could not be followed without substantially changing the outlook and also the size of the volume. Author and translator nevertheless are grateful for these suggestions, which were very encouraging. The author also wishes to thank his colleagues for many stimulating discussions on the role of group theory in quantum mechanics as well as on more specific subjects. He wishes to record his deep indebtedness to Drs. Bargmann, Michel, Wightman, and, last but not least, J. von Neumann.

E. P. WIGNER
Princeton, New Jersey
February, 1959
This translation was initiated while the translator was a graduate student at Princeton University. It was motivated by the lack of a good English work on the subject of group theory from the physicist's point of view. Since that time, several books have been published in English which deal with group theory in quantum mechanics. Still, it is perhaps a reasonable hope that this translation will facilitate the introduction of English-speaking physicists to the use of group theory in modern physics.

The book is an interlacing of physics and mathematics. The first three chapters discuss the elements of linear vector theory. The second three deal more specifically with the rudiments of quantum mechanics itself. Chapters 7 through 16 are again mathematical, although much of the material covered should be familiar from an elementary course in quantum theory. Chapters 17 through 23 are specifically concerned with atomic spectra, as is Chapter 25. The remaining chapters are additions to the German text; they discuss topics which have been developed since the original publication of this book: the recoupling (Racah) coefficients, the time inversion operation, and the classical interpretations of the coefficients.

Various readers may wish to utilize the book differently. Those who are interested specifically in the mathematics of group theory might skim over the chapters dealing with quantum physics. Others might choose to de-emphasize the mathematics, touching Chapters 7, 9, 10, 13, and 14 lightly for background and devoting more attention to the subsequent chapters. Students of quantum mechanics and physicists who prefer familiar material interwoven with the less familiar will probably apply a more even distribution of emphasis.
The translator would like to express his gratitude to Professor E. P. Wigner for encouraging and guiding the task, to Drs. Robert Johnston and John McHale, who suggested various improvements in the text, and to Mrs. Marjorie Dresback, whose secretarial assistance was most valuable.

J. J. GRIFFIN
Los Alamos, New Mexico
February, 1959
LINEAR TRANSFORMATIONS
An aggregate of n numbers (v_1, v_2, v_3, ..., v_n) is called an n-dimensional vector, or a vector in n-dimensional space; the numbers themselves are the components of this vector. The coordinates of a point in n-dimensional space can also be interpreted as a vector which connects the origin of the coordinate system with the point considered. Vectors will be denoted by boldface German letters; their components will carry a roman index which specifies the coordinate axis. Thus v_k is a vector component (a number), and 𝔳 is a vector, a set of n numbers.

Two vectors are said to be equal if their corresponding components are equal. Thus

    𝔳 = 𝔴

is equivalent to the n equations

    v_1 = w_1,  v_2 = w_2,  ...,  v_n = w_n.

A vector is a null vector if all its components vanish. The product c𝔳 of a number c with a vector 𝔳 is a vector whose components are c times the components of 𝔳, or (c𝔳)_k = c v_k. Addition of vectors is defined by the rule that the components of the sum are equal to the sums of the corresponding components. Formally,

    (𝔳 + 𝔴)_k = v_k + w_k.
In mathematical problems it is often advantageous to introduce new variables in place of the original ones. In the simplest case the new variables x'_1, x'_2, ..., x'_n are linear functions of the old ones, x_1, x_2, ..., x_n. That is,

    x'_1 = α_11 x_1 + α_12 x_2 + ... + α_1n x_n
    x'_2 = α_21 x_1 + α_22 x_2 + ... + α_2n x_n          (1.3)
    ..............................................
    x'_n = α_n1 x_1 + α_n2 x_2 + ... + α_nn x_n.

The transformation is completely determined by the coefficients α_11, ..., α_nn, and the aggregate of these n² numbers arranged in a square array,

    ( α_11  α_12  ...  α_1n )
    ( α_21  α_22  ...  α_2n )          (1.4)
    ( ....................... )
    ( α_n1  α_n2  ...  α_nn ),

is called the matrix of the transformation. We shall write such a matrix more concisely as (α_ik), or simply α.
For Eq. (1.3) actually to represent an introduction of new variables, it is necessary not only that the x' be expressible in terms of the x, but also that the x can be expressed in terms of the x'. That is, if we view the x_i as unknowns in Eq. (1.3), a unique solution to these equations must exist giving the x in terms of the x'. The necessary and sufficient condition for this is that the determinant formed from the coefficients α_ik be nonzero:

    |α_ik| ≠ 0.

Transformations whose matrices have nonvanishing determinants are referred to as proper transformations, but an array of coefficients like (1.4) is always called a matrix, whether or not it induces a proper transformation. Boldface letters are used to represent matrices; matrix elements are indicated by affixing indices specifying the corresponding axes. Thus α is a matrix, an array of n² numbers; α_jk is a matrix element (a number).
Two matrices are equal if all their corresponding elements are equal.

Equation (1.3) can also be interpreted in another way: by considering the x'_i not as components of the original vector in a new coordinate system, but as the components of a new vector in the original coordinate system. We then say that the matrix α transforms the vector 𝔵 into the vector 𝔵', or that α applied to 𝔵 gives 𝔵',

    x'_i = Σ_k α_ik x_k,          (1.3a)

    𝔵' = α𝔵.          (1.3b)

This equation is completely equivalent to (1.3a).
An n-dimensional matrix is a linear operator on n-dimensional vectors. It is an operator because it transforms one vector into another vector; it is linear since for arbitrary numbers a and b, and arbitrary vectors 𝔵 and 𝔳, the relation

    α(a𝔵 + b𝔳) = aα𝔵 + bα𝔳          (1.6)

is true. To prove (1.6) one need only write out the left and right sides explicitly. The kth component of a𝔵 + b𝔳 is ax_k + bv_k, so that the ith component of the vector on the left is

    Σ_{k=1}^{n} α_ik (a x_k + b v_k).

But this is identical with the ith component of the vector on the right side of (1.6),

    a Σ_{k=1}^{n} α_ik x_k + b Σ_{k=1}^{n} α_ik v_k.

This establishes the linearity of matrix operators.
An n-dimensional matrix is the most general linear operator in n-dimensional vector space. That is, every linear operator in this space is equivalent to a matrix. To prove this, consider the arbitrary linear operator O which transforms the vector 𝔢_1 = (1, 0, 0, ..., 0) into the vector 𝔯.1, the vector 𝔢_2 = (0, 1, 0, ..., 0) into the vector 𝔯.2, and finally, the vector 𝔢_n = (0, 0, 0, ..., 1) into 𝔯.n, where the components of the vector 𝔯.k are r_1k, r_2k, ..., r_nk. Now the matrix (r_ik) transforms each of the vectors 𝔢_1, 𝔢_2, ..., 𝔢_n into the same vectors 𝔯.1, 𝔯.2, ..., 𝔯.n as does the operator O. Moreover, any n-dimensional vector 𝔞 = (a_1, a_2, ..., a_n) is a linear combination a_1𝔢_1 + a_2𝔢_2 + ... + a_n𝔢_n of the vectors 𝔢_1, 𝔢_2, ..., 𝔢_n. Thus, both O and (r_ik) (since they are linear) transform any arbitrary vector 𝔞 into the same vector a_1𝔯.1 + ... + a_n𝔯.n. The matrix (r_ik) is therefore equivalent to the operator O.
The most important property of linear transformations is that two of them, applied successively, can be combined into a single linear transformation. Suppose, for example, we introduce the variables x' in place of the original x via the linear transformation (1.3), and subsequently introduce variables x'' via a second linear transformation,

    x''_1 = β_11 x'_1 + β_12 x'_2 + ... + β_1n x'_n
    ..............................................          (1.7)
    x''_n = β_n1 x'_1 + β_n2 x'_2 + ... + β_nn x'_n.

Both processes can be combined into a single one, so that the x'' are introduced directly in place of the x by one linear transformation. Substituting (1.3) into (1.7), one finds

    x''_i = Σ_k β_ik x'_k = Σ_k β_ik Σ_l α_kl x_l = Σ_l γ_il x_l,          (1.8)

where

    γ_il = Σ_k β_ik α_kl.          (1.9)

This demonstrates that the combination of two linear transformations (1.7) and (1.3), with matrices (β_ik) and (α_ik), is a single linear transformation which has the matrix (γ_ik).

The matrix (γ_ik), defined in terms of the matrices (α_ik) and (β_ik) according to Eq. (1.9), is called the product of the matrices (β_ik) and (α_ik). Since (α_ik) transforms the vector 𝔵 into 𝔵' = α𝔵, and (β_ik) transforms the vector 𝔵' into 𝔵'' = β𝔵', the product matrix (γ_ik), by its definition, transforms 𝔵 directly into 𝔵'' = γ𝔵. This method of combining transformations is called "matrix multiplication," and exhibits a number of simple properties, which we now enumerate as theorems.

First of all we observe that the formal rule for matrix multiplication is the same as the rule for the multiplication of determinants.

1. The determinant of a product of two matrices is equal to the product of the determinants of the two factors.
In the multiplication of matrices, it is not necessarily true that

    αβ = βα.          (1.E.1)

This establishes a second property of matrix multiplication.

2. The product of two matrices depends in general upon the order of the factors.

In the very special situation when Eq. (1.E.1) is true, the matrices α and β are said to commute.
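Theorems 1 and 2 are easily illustrated numerically. The following numpy sketch is an illustration only, not part of the original text; the two matrices are arbitrary examples.

    # Determinant of a product = product of determinants; the product itself
    # generally depends on the order of the factors.
    import numpy as np

    alpha = np.array([[1.0, 2.0],
                      [3.0, 4.0]])
    beta  = np.array([[0.0, 1.0],
                      [1.0, 1.0]])

    # Theorem 1
    assert np.isclose(np.linalg.det(beta @ alpha),
                      np.linalg.det(beta) * np.linalg.det(alpha))

    # Theorem 2: for these two matrices the two orders differ
    print(beta @ alpha)
    print(alpha @ beta)
    print(np.allclose(beta @ alpha, alpha @ beta))   # False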
In contrast to the commutative law,

3. The associative law of multiplication is valid in matrix multiplication. That is,

    γ(βα) = (γβ)α.          (1.10)

Thus, it makes no difference whether one multiplies γ with the product of β and α, or the product of γ and β with α. To prove this, denote the i-kth element of the matrix on the left side of (1.10) by e_ik. Then

    e_ik = Σ_j γ_ij (βα)_jk = Σ_j Σ_l γ_ij β_jl α_lk,

while the i-kth element e'_ik of the matrix on the right side is

    e'_ik = Σ_l (γβ)_il α_lk = Σ_l Σ_j γ_ij β_jl α_lk.

Then e_ik = e'_ik, and (1.10) is established. One can therefore write simply γβα for both sides of (1.10).

The validity of the associative law is immediately obvious if the matrices are considered as linear operators. Let α transform the vector 𝔵 into 𝔵' = α𝔵, β the vector 𝔵' into 𝔵'' = β𝔵', and γ the vector 𝔵'' into 𝔵''' = γ𝔵''. Then the combination of two matrices into a single one by matrix multiplication signifies simply the combination of two operations. The product βα transforms 𝔵 directly into 𝔵'', and γβ transforms 𝔵' directly into 𝔵'''. Thus both (γβ)α and γ(βα) transform 𝔵 into 𝔵''', and the two operations are equivalent.
4. The unit matrix

    1 = ( 1  0  ...  0 )
        ( 0  1  ...  0 )
        ( .............. )
        ( 0  0  ...  1 )

plays the same role in matrix multiplication as the number 1 does in ordinary multiplication. For every matrix α,

    α · 1 = 1 · α = α.

That is, 1 commutes with all matrices, and its product with any matrix is just that matrix again. The elements of the unit matrix are denoted by the symbol δ_ik, so that

    δ_ik = 0  (i ≠ k);     δ_ii = 1.

The δ_ik defined in this way is called the Kronecker delta-symbol. The matrix (δ_ik) = 1 induces the identity transformation, which leaves the variables unchanged.
If for a given matrix α, there exists a matrix β such that

    βα = 1,          (1.13)

then β is called the inverse, or reciprocal, of the matrix α. Equation (1.13) states that a transformation via the matrix β exists which combines with α to give the identity transformation. If the determinant of α is not equal to zero (|α_ik| ≠ 0), then an inverse transformation always exists (as has been mentioned on page 2). To prove this we write out the n² equations (1.13) more explicitly:

    Σ_{j=1}^{n} β_ij α_jk = δ_ik          (i, k = 1, 2, ..., n).          (1.14)

Consider now the n equations in which i has one value, say 1. These are n linear equations for n unknowns β_11, β_12, ..., β_1n. They have, therefore, one and only one solution, provided the determinant |α_jk| does not vanish. The same holds for the other n − 1 systems of equations. This establishes the fifth property we wish to mention.

5. If the determinant |α_jk| ≠ 0, there exists one and only one matrix β such that βα = 1.
Moreover, the determinant |β_jk| is the reciprocal of |α_jk|, since, according to Theorem 1,

    |β_jk| · |α_jk| = |1| = 1.

From this it follows that α has no inverse if |α_ik| = 0, and that β, the inverse of α, must also have an inverse.

We now show that if (1.13) is true, then

    αβ = 1          (1.16)

is true as well. That is, if β is the inverse of α, then α is also the inverse of β. This can be seen most simply by multiplying (1.13) from the right with β,

    βαβ = β,          (1.17)

and this from the left with the inverse of β, which we call γ. Then γβαβ = γβ, and since, by hypothesis, γβ = 1, this is identical with (1.16). Conversely, (1.13) follows easily from (1.16). This proves Theorem 6 (the inverse of α commutes with α). It is clear that inverse matrices commute with one another.

Rule: The inverse of a product αβγδ is obtained by multiplying the inverses of the individual factors in reverse order: (αβγδ)⁻¹ = δ⁻¹γ⁻¹β⁻¹α⁻¹.
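A short numpy illustration of Theorems 5 and 6 and of the rule just stated (not part of the original text; the matrices are random examples):

    import numpy as np

    rng = np.random.default_rng(0)
    alpha = rng.normal(size=(4, 4))      # almost surely has a nonzero determinant
    beta = np.linalg.inv(alpha)
    one = np.eye(4)

    assert np.allclose(beta @ alpha, one)    # (1.13)
    assert np.allclose(alpha @ beta, one)    # (1.16): the inverse works from both sides

    # inverse of a product = product of the inverses in reverse order
    a, b, c, d = rng.normal(size=(4, 4, 4))
    lhs = np.linalg.inv(a @ b @ c @ d)
    rhs = (np.linalg.inv(d) @ np.linalg.inv(c)
           @ np.linalg.inv(b) @ np.linalg.inv(a))
    assert np.allclose(lhs, rhs)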
Another important matrix is

7. The null matrix, every element of which is zero:

    0 = ( 0  0  ...  0 )
        ( 0  0  ...  0 )
        ( .............. )
        ( 0  0  ...  0 ).

Obviously one has

    α + 0 = α;     α · 0 = 0 · α = 0          (1.18)

for any matrix α.

The null matrix plays an important role in another combination process for matrices, namely, addition. The sum γ of two matrices α and β is the matrix whose elements are

    γ_ik = α_ik + β_ik.          (1.19)

The n² equations (1.19) are equivalent to the equation

    γ = α + β     or     γ − α − β = 0.
Addition of matrices is clearly commutative,

    α + β = β + α.          (1.20)

Moreover, multiplication by sums is distributive,

    γ(α + β) = γα + γβ;     (α + β)γ = αγ + βγ.          (1.21)

Furthermore, the product of a matrix α and a number a is defined to be that matrix γ = aα each element of which is a times the corresponding element of α. The formulas

    (aα)β = a(αβ);     a(α + β) = aα + aβ;     (ab)α = a(bα)

then follow directly.

Since integral powers of a matrix α can easily be defined by successive multiplication,

    α² = α · α;     α³ = α · α · α;     ...          (1.22)

polynomials with positive and negative integral exponents can also be defined:

    ... + a₋₂α⁻² + a₋₁α⁻¹ + a₀1 + a₁α + a₂α² + ....          (1.23)

The coefficients a in the above expression are not matrices, but numbers. A function of α like (1.23) commutes with any other function of α (and, in particular, with α itself).

Still another important type of matrix which appears frequently is the diagonal matrix.

8. A diagonal matrix is a matrix the elements of which are all zero except for those on the main diagonal, D_ik = D_i δ_ik.

All diagonal matrices commute, and the product of two diagonal matrices is again diagonal. This can be seen directly from the definition of the product,

    (DD')_ik = Σ_l D_i δ_il D'_l δ_lk = D_i D'_i δ_ik = (D'D)_ik.          (1.26)

Conversely, if a matrix α commutes with a diagonal matrix D, the diagonal elements of which are all different, then α must itself be a diagonal matrix. Writing out the product αD = Dα,

    (αD)_ik = α_ik D_k = (Dα)_ik = D_i α_ik.          (1.27)

That is, α_ik(D_k − D_i) = 0; since D_k ≠ D_i for i ≠ k, every off-diagonal element α_ik must vanish.

The sum of the diagonal elements of a matrix is called its trace, Tr α = Σ_i α_ii. For the trace of a product one has

    Tr(αβ) = Σ_i Σ_k α_ik β_ki = Σ_k Σ_i β_ki α_ik = Tr(βα).

This establishes another property of matrices.

9. The trace of a product of two matrices does not depend on the order of the two factors.
This rule finds its most important application in connection with similarity transformations of matrices. A similarity transformation is one in which the transformed matrix α is multiplied by the transforming matrix σ from the right and by its reciprocal from the left; the matrix α is thus transformed into σ⁻¹ασ. A similarity transformation leaves the trace of the matrix unchanged, since the rule above states that σ⁻¹·ασ has the same trace as ασ·σ⁻¹ = α.

The importance of similarity transformations arises from the fact that

10. A matrix equation remains true if every matrix in it is subjected to the same similarity transformation.

For example, transformation of a product of matrices αβ = γ yields

    σ⁻¹ασ · σ⁻¹βσ = σ⁻¹αβσ = σ⁻¹γσ,

so that the transformed matrices satisfy the same equation as the original ones. Sums of matrices and products of matrices and numbers are also preserved under similarity transformation. Theorem 10 therefore applies to every matrix equation involving products of matrices and numbers or other matrices, integral (positive or negative) powers of matrices, and sums of matrices.
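The trace rule and Theorem 10 can be verified directly; the following numpy sketch (an illustration only, with randomly chosen matrices) transforms α, β, and γ = αβ by the same σ.

    import numpy as np

    rng = np.random.default_rng(1)
    alpha = rng.normal(size=(3, 3))
    beta = rng.normal(size=(3, 3))
    gamma = alpha @ beta

    sigma = rng.normal(size=(3, 3))      # assumed invertible
    sigma_inv = np.linalg.inv(sigma)
    transform = lambda m: sigma_inv @ m @ sigma

    assert np.isclose(np.trace(transform(alpha)), np.trace(alpha))            # trace is invariant
    assert np.allclose(transform(alpha) @ transform(beta), transform(gamma))  # Theorem 10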
These ten theorems for matrix manipulation were presented in the very first papers on quantum mechanics by Born and Jordan,¹ and are undoubtedly already familiar to many readers. They are reiterated here since a firm command of these basic rules is indispensable for what follows and for practically every quantum mechanical calculation. Besides, they must very often be used implicitly, or else even the simplest proofs become excessively tedious.²
LINEAR INDEPENDENCE OF VECTORS
The vectors 𝔳_1, 𝔳_2, ..., 𝔳_k are said to be linearly independent if no relationship of the form

    a_1 𝔳_1 + a_2 𝔳_2 + ... + a_k 𝔳_k = 0          (1.30)

exists except that in which every a_1, a_2, ..., a_k is zero. Thus no vector in a linearly independent set can be expressed as a linear combination of the other vectors in the set. In the case where one of the vectors, say 𝔳_1, is a null vector, the set can no longer be linearly independent, since the relationship

    1 · 𝔳_1 + 0 · 𝔳_2 + ... + 0 · 𝔳_k = 0

is surely satisfied, in violation of the condition for linear independence.

As an example of linear dependence, consider the four-dimensional vectors 𝔳_1 = (1, 2, −1, 3), 𝔳_2 = (0, −2, 1, −1), and 𝔳_3 = (2, 2, −1, 5). These are linearly dependent, since

    2𝔳_1 + 𝔳_2 − 𝔳_3 = 0.

On the other hand, 𝔳_1 and 𝔳_2 are linearly independent.
¹ M. Born and P. Jordan, Z. Physik 34, 858 (1925).
² For example, the associative law of multiplication (Theorem 3) is used implicitly three times in the deduction of the commutability of inverses (Theorem 6). (Try writing out all the parentheses!)
If k vectors 𝔳_1, 𝔳_2, ..., 𝔳_k are linearly dependent, then there can be found among them k' vectors (k' < k) which are linearly independent. Moreover, all k vectors can be expressed as linear combinations of these k' vectors.

In seeking k' vectors which are linearly independent we omit all null vectors, since, as we have already seen, a null vector can never be a member of a linearly independent set. We then go through the remaining vectors one after another, rejecting any one which can be expressed as a linear combination of those already retained. The k' vectors retained in this way are linearly independent, since if none of them can be expressed as a linear combination of the others, no relationship of the type (1.30) can exist among them. Moreover, each of the rejected vectors (and thus all of the k original vectors) can be expressed in terms of them, since this was the criterion for rejection.

The linear dependence or independence of k vectors 𝔳_1, 𝔳_2, ..., 𝔳_k is also a property of the vectors α𝔳_1, ..., α𝔳_k which result from them by a proper transformation α. That is,

    a_1 𝔳_1 + a_2 𝔳_2 + ... + a_k 𝔳_k = 0          (1.31)

implies

    a_1 α𝔳_1 + a_2 α𝔳_2 + ... + a_k α𝔳_k = 0,          (1.31a)

as can be seen by applying α to both sides of (1.31) and using the linearity property to obtain (1.31a). Conversely, (1.31a) implies (1.31), as can be seen by applying α⁻¹. It also follows that any specific linear relationship which exists among the 𝔳_i exists among the α𝔳_i, and conversely.
No more than n n-dimensional vectors can be linearly independent. To prove this, note that the relation implying linear dependence of n + 1 vectors 𝔳_1, 𝔳_2, ..., 𝔳_{n+1},

    a_1 𝔳_1 + a_2 𝔳_2 + ... + a_{n+1} 𝔳_{n+1} = 0,          (1.32)

is equivalent to the n component equations

    a_1 v_{i1} + a_2 v_{i2} + ... + a_{n+1} v_{i,n+1} = 0          (i = 1, 2, ..., n).

If the coefficients a_1, a_2, ..., a_n, a_{n+1} in these equations are viewed as unknowns, the fact that n linear homogeneous equations in n + 1 unknowns always have nontrivial solutions implies at once that the relationship (1.32) always exists. Thus, n + 1 n-dimensional vectors are always linearly dependent.

An immediate corollary to the above theorem is the statement that any n linearly independent n-dimensional vectors form a complete vector system; that is, an arbitrary n-dimensional vector 𝔴 can be expressed as a linear combination of them. Indeed, the theorem states that some relationship

    a_1 𝔳_1 + a_2 𝔳_2 + ... + a_n 𝔳_n + b𝔴 = 0

must exist among the n vectors and the arbitrary vector 𝔴. Moreover, if 𝔳_1, 𝔳_2, ..., 𝔳_n are linearly independent, the coefficient b cannot be zero. Thus any vector 𝔴 can be written as a linear combination of the 𝔳_i, so that these form a complete vector system.
A row or a column of an n-dimensional matrix can be looked upon as a vector. For example, the components of the vector 𝔞.k which forms the kth column are α_1k, α_2k, ..., α_nk, and those of the vector 𝔞_i. which forms the ith row are α_i1, α_i2, ..., α_in. A nontrivial linear relationship among the column vectors,

    a_1 𝔞.1 + a_2 𝔞.2 + ... + a_n 𝔞.n = 0,

is equivalent to a nontrivial solution a_1, ..., a_n of the n homogeneous equations Σ_k α_ik a_k = 0. The vanishing of the determinant |α_ik| is the necessary and sufficient condition that such a solution exist. Therefore, if this determinant does not vanish (|α_ik| ≠ 0), then the vectors 𝔞.1, ..., 𝔞.n are linearly independent and form a complete vector system. Conversely, if the vectors 𝔳_1, ..., 𝔳_n are linearly independent, the matrix which is formed by taking them as its columns must have a nonzero determinant. Of course, this whole discussion applies equally well to the row-vectors of a matrix.
GENERALIZATIONS

1. We now generalize the results of the previous chapter. The first generalization is entirely formal; the second one is of a more essential nature.

To denote the components of vectors and the elements of matrices, we have affixed the appropriate coordinate axes as indices. So far, the coordinate axes have been denoted by 1, 2, 3, ..., n. From now on we will name the coordinate axes after the elements of an arbitrary set. If G is a set of objects g, h, i, ..., then the vector 𝔳 in the space of the set G is the set of numbers v_g, v_h, v_i, .... Of course only vectors which are defined in the same space can be equated (or added, etc.), since only then do the components correspond to the same set.

A similar system will be used for matrices. Thus for a matrix α to be applied to a vector 𝔳 with components v_g, v_h, v_i, ..., the columns of α must be labeled by the elements of the same set G as that specifying the components of 𝔳. In the simplest case the rows are also named after the elements g, h, i, ... of this set, and α transforms a vector 𝔳 in the space of G into a vector α𝔳 in the same space. That is,

    (α𝔳)_j = Σ_{l∈G} α_jl v_l,          (2.1)

where j is an element of the set G, and l runs over all the elements of this set.
For example, the coordinate axes can be labeled by the three letters x, y, z. Then 𝔳, with components v_x = 1, v_y = 0, v_z = −2, is a vector, and a matrix α operating on such vectors has its rows and columns labeled by x, y, z as well; in this example α_xx = 1, α_xy = 2, α_xz = 3. Eq. (2.1) states that the x-component of 𝔳' = α𝔳 is given by

    v'_x = α_xx v_x + α_xy v_y + α_xz v_z = 1 · 1 + 2 · 0 + 3 · (−2) = −5.
The simple generalization above is purely formal; it involves merely another system of labeling the coordinate axes and the components of vectors and matrices. Two matrices which operate on vectors in the same space can be multiplied with one another, just like the matrices in the previous chapter.
2. A further generalization is that in which the rows and columns of matrices are labeled by elements of different sets, F and G. Then, in place of (2.1),

    (α𝔳)_j = Σ_{l∈G} α_jl v_l,          (2.2)

where j is an element of the set F, and l runs over all the elements of the set G. Such a matrix, whose rows and columns are labeled by different sets, is called a rectangular matrix, in contrast with the square matrices of the previous chapter; it transforms a vector 𝔳 in the space of G into a vector 𝔴 in the space of F. In general the set F need not contain the same number of elements as the set G. If it does contain the same number of elements, then the matrix has an equal number of rows and columns and is said to be "square in the broader sense."
Let the set G contain the symbols *, Δ, □, and the set F the numbers 1 and 2. Then

          *   Δ   □
    α = ( 5   7   3 )  1
        ( 0  −1  −2 )  2

is a rectangular matrix. (The labels of the rows and columns are again indicated.) It transforms a vector 𝔳 with components v_* = 1, v_Δ = 0, v_□ = −2 into the vector 𝔴 = α𝔳. The components w_1 and w_2 are then

    w_1 = α_1* v_* + α_1Δ v_Δ + α_1□ v_□ = 5 · 1 + 7 · 0 + 3 · (−2) = −1,
    w_2 = α_2* v_* + α_2Δ v_Δ + α_2□ v_□ = 0 · 1 + (−1) · 0 + (−2) · (−2) = 4.
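The bookkeeping with labeled rows and columns can be made concrete in code. The following Python sketch is an illustration only; the labels 'tri' and 'sq' stand in for Δ and □. It stores the rectangular matrix of the example as a dictionary keyed by row and column labels and applies Eq. (2.2).

    G = ['*', 'tri', 'sq']      # column labels (the set G)
    F = [1, 2]                  # row labels (the set F)

    alpha = {(1, '*'): 5, (1, 'tri'): 7,  (1, 'sq'): 3,
             (2, '*'): 0, (2, 'tri'): -1, (2, 'sq'): -2}

    v = {'*': 1, 'tri': 0, 'sq': -2}     # a vector in the space of G

    def apply(matrix, rows, cols, vec):
        # w_j = sum over l in cols of matrix[j, l] * vec[l]  -- cf. Eq. (2.2)
        return {j: sum(matrix[(j, l)] * vec[l] for l in cols) for j in rows}

    w = apply(alpha, F, G, v)   # a vector in the space of F
    print(w)                    # {1: -1, 2: 4}, as computed in the text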
Two rectangular matrices β and α can be multiplied only if the columns of the first factor and the rows of the second factor are labeled by the same set F; i.e., only if the rows of the second factor and the columns of the first "match." On the other hand, the rows of the first and the columns of the second factor can correspond to elements of completely different sets, E and G. The product γ = βα is then given by

    γ_jk = Σ_{l∈F} β_jl α_lk,

where j is an element of E, k an element of G, and l runs over all the elements of F. The rectangular matrix α transforms a vector in the space of G into one in the space of F; the matrix β then transforms this vector into one in the space of E. The matrix γ therefore transforms a vector in the space of G into one in the space of E.

Let G be the set *, Δ, □ again, let F contain the letters x and y, and E the numbers 1 and 2. A matrix β whose rows are labeled by 1, 2 and whose columns are labeled by x, y can then be multiplied with a matrix α whose rows are labeled by x, y and whose columns by *, Δ, □; the product γ = βα has its rows labeled by 1, 2 and its columns by *, Δ, □.
3. We now investigate how the ten theorems of matrix calculus deduced in Chapter 1 must be modified for rectangular matrices. We see immediately that they remain true for the generalized square matrix discussed at the beginning of this chapter, since the specific numerical nature of the indices has not been used anywhere in the first chapter.

Addition of two rectangular matrices—just as that of two vectors—presupposes that they are defined in the same coordinate system, that is, that the rows match the rows and the columns match the columns. In the equation

    α + β = γ

the labeling of the rows of the three matrices α, β, γ must be the same, as well as the labeling of their columns. On the other hand, for multiplication the columns of the first factor and the rows of the second factor must match; only then (and always then) can the product be constructed. The resulting product has the row labeling of the first, and the column labeling of the second factor.
THEOREM 1. We can speak of the determinant of a rectangular matrix if it has the same number of rows and columns, although these may be labeled differently. For matrices "square in the broader sense" the rule that the determinant of the product equals the product of the determinants is still valid.

THEOREMS 2 and 3. The associative law also holds for the multiplication of rectangular matrices,

    (αβ)γ = α(βγ).          (2.3)
Clearly all multiplication on the right side can actually be carried out provided it can be done on the left side, and conversely.
THEOREMS 4, 5, and 6. The matrix 1 will always be understood to be a square matrix with rows and columns labeled by the same set. Multiplication by it can always be omitted.

Matrices which are square in the broader sense have a reciprocal only if their determinant is nonvanishing. For rectangular matrices with a different number of rows and columns, the inverse is not defined at all. If α is a matrix which is square only in the broader sense, the equation

    βα = 1

implies that the columns of β match the rows of α. Furthermore, the rows of 1 must match the rows of β, and its columns must match the columns of α. Since 1 is square in the restricted sense, the columns of α must also match the rows of β.
The rows of the matrix β inverse to the matrix α are labeled by the same set as the columns of α, its columns by the same set as the rows of α. There exists, for any matrix α which is square in the broader sense and has a nonvanishing determinant, an inverse β such that

    βα = 1.          (2.4)

Moreover,

    αβ = 1.          (2.4a)

However, it should be noted that the rows and columns of 1 in (2.4) are labeled differently from those of 1 in (2.4a).
THEOREM 7. With respect to addition and the null matrix, the same rules hold for rectangular matrices as for square matrices. However, the definition of powers of α presupposes that the columns of α and the rows of α match, i.e., that α is square.

THEOREMS 8, 9, and 10. For rectangular matrices the concepts of diagonal matrix and trace are meaningless; also, the similarity transformation is undefined. Consider the equation

    σασ⁻¹ = β.

This implies that the labeling of the rows of β and σ are the same. But this is the same as the labeling of the columns of σ⁻¹, and thus of the columns of β. It follows that the matrix β is square in the restricted sense; likewise α, whose rows must match the columns of σ and whose columns must match the rows of σ⁻¹, must be square in the restricted sense.
On the other hand, σ itself can be square in the broad sense: its columns and rows can be labeled by different sets. Transformations of this kind, which merely change the labeling of rows and columns, are especially important. The so-called transformation theory of quantum mechanics is an example of such transformations.

The introduction of rectangular matrices is very advantageous in spite of the apparent complication which is involved, since substantial simplifications can be achieved with them. The outline above is designed not as a rigid scheme but rather to accustom the reader to thinking in terms of these entities. The use of such more complicated matrices will always be explained specifically unless the enumeration of rows and columns is so very clear by the form and definition of the elements that further explanation is scarcely desirable.
4. Quite frequently it occurs that the rows and columns are named not with just one number but with two or more numbers. For example, the rows and columns of a four-dimensional matrix can be labeled by pairs of numbers, as in (2.E.1): the first column is called the "1,1 column"; the second, the "1,2 column"; the third, the "2,1 column"; the fourth, the "2,2 column"; the rows are designated in the same way. The most important matrix of this kind is the direct product γ = α × β of two matrices α = (a_ik) and β = (b_jl); its elements are¹

    γ_{ij;kl} = (α × β)_{ij;kl} = a_ik b_jl,          (2.6)

the pair i, j labeling the row and the pair k, l the column.
If the number of rows in α is n_1 and the number of columns n_2, and the corresponding numbers for β are n'_1 and n'_2, then γ has exactly n_1 n'_1 rows and n_2 n'_2 columns. In particular, if α and β are both square matrices then α × β is also square.
¹ The factors α and α′ of the ordinary matrix product are merely written next to one another, αα′, while the direct product carries the sign ×. The matrix (2.E.1) is, for example, the direct product of the two two-dimensional matrices

    ( a_1c_1  a_1c_2 )          ( b_1d_1  b_1d_2 )
    ( a_2c_1  a_2c_2 )   and    ( b_2d_1  b_2d_2 ).
THEOREM 1. If αα′ = α″ and ββ′ = β″, and if α × β = γ and α′ × β′ = γ′, then γγ′ = α″ × β″:

    (α × β)(α′ × β′) = (αα′) × (ββ′).          (2.7)

That is, the matrix product of two direct products is the direct product of the two matrix products. To show this, consider the ij;kl element of the left side:

    Σ_{mn} (α × β)_{ij;mn} (α′ × β′)_{mn;kl} = Σ_{mn} a_im b_jn a′_mk b′_nl = (αα′)_ik (ββ′)_jl,

which is just the ij;kl element of (αα′) × (ββ′).

THEOREM 2. The direct product of two diagonal matrices is again a diagonal matrix; the direct product of two unit matrices is a unit matrix. This is easily seen directly from the definition of direct products.
In formal calculations with matrices it must be verified that the multiplication indicated is actually possible. In the first chapter, where we had square matrices with n rows and columns throughout, this was, of course, always the case. In general, however, it must be established that the columns of the first factor in a matrix multiplication match the rows of the second factor, i.e., that they both have the same names or labels. The direct product of two matrices can always be constructed by (2.6).

A generalized type of matrix with several indices is referred to by M. Born and P. Jordan as a "super-matrix." They interpret the matrix α = (a_{ij;kl}) as a matrix (A_ik) whose elements A_ik are themselves matrices; A_ik is that matrix in which the number a_{ij;kl} occurs in the jth row and the lth column:

    α = (A_ik),     where     (A_ik)_{jl} = a_{ij;kl}.          (2.10)

THEOREM 3. If α = (A_ik) and β = (B_kl), then αβ = γ = (C_il), where

    C_il = Σ_k A_ik B_kl.          (2.11)
The right-hand side of (2.11) consists of a sum of products of matrix multiplications. We have

    (C_il)_{jn} = (Σ_k A_ik B_kl)_{jn} = Σ_k Σ_m a_{ij;km} b_{km;ln}.

On the other hand, this is just the ij;ln element of the ordinary product αβ, so that the two prescriptions agree. In the simplest case we might have two square matrices, each written as a two-by-two super-matrix,

    α = ( A_11  A_12 )          β = ( B_11  B_12 )
        ( A_21  A_22 ),             ( B_21  B_22 ).

If the two partitions do not match, the expression C_11 = A_11 B_11 + A_12 B_21 is meaningless, since the number of columns of A_11, for example, then differs from the number of rows of B_11.
THE PRINCIPAL AXIS TRANSFORMATION

In the first chapter we established a very important property of similarity transformations: they leave the trace of a matrix unchanged;¹ the matrix α has the same trace as σ⁻¹ασ. Is the trace of a matrix the only invariant under similarity transformation? Clearly not, since, for example, the determinant |σ⁻¹ασ| is also equal to the determinant |α|. In order to obtain further invariants, we consider the determinantal equation of the nth order for λ,

    |α − λ1| = 0.

Clearly the determinant |σ⁻¹(α − λ1)σ| is also equal to zero; this can be written

    |σ⁻¹ασ − λ1| = 0.          (3.4)

Equation (3.4) shows that the n roots of the secular equation |β − λ1| = 0 for β = σ⁻¹ασ are identical² with the n roots of the secular equation |α − λ1| = 0. The roots of the secular equation, the so-called eigenvalues of the matrix, are invariant under similarity transformations. We shall see later that in general a matrix has no other invariants. Also, the trace is the sum, and the determinant is the product, of the eigenvalues, so that their invariance is included in the theorem stated above.

¹ The matrix which undergoes a similarity transformation must always be a square matrix. For this reason we again denote the rows and columns with the numbers 1, 2, ..., n.
(a — Ajl) is zero, so that the linear homogeneous equations
have a solution A linear homogeneous system o f equations like (3.5) can
be written for each of the η eigenvalues X k W e denote the solutions o f
this system, which are determined only up to a common constant factor, b y
The set of η numbers rl f c, r2 f c, ' * * , V n k is called an eigenvector t mJc of the matrix
a; the eigenvector t, k belongs to the eigenvalue X k Equation (3.5a) can
then be written
The matrix transforms an eigenvector into a vector which differs from the
eigenvector only b y a constant factor; this factor is the eigenvalue itself
The eigenvectors t v r 2 , · · · , t n can be combined into a matrix ρ in
such a way that t k is the &th column of this matrix
Pik =
(**k)i — *ik *
Then the left side of (3.5a) consists of the (ik) element of αρ The right side
also can be interpreted as the (ik) element of a matrix, the matrix ρΛ,
where Λ is a diagonal matrix with diagonal elements λ ν λ 2 , ' ' ' , λ η
A similarity transformation by a matrix whose columns are the η eigen
vectors transforms the original matrix into the diagonal form; the diagonal
elements are the eigenvalues of the matrix. T w o matrices which have the same
eigenvalues can always be transformed into one another since they can both
be transformed into the same matrix The eigenvalues are the only invariants
under a similarity transformation
This is true, of course, only if ρ has a reciprocal, that is, if the n vectors 𝔯.1, 𝔯.2, ..., 𝔯.n are linearly independent. This is generally the case, and is always true if the eigenvalues are all different. Nevertheless, there are exceptions, as is shown, for example, by the matrices

    ( 0  1 )          ( 1   i )
    ( 0  0 )   and    ( i  −1 ).

These cannot be brought into diagonal form by any kind of similarity transformation. The theory of elementary divisors deals with such matrices; however, we need not go into this, since we shall always have to deal with matrices which can be brought into the diagonal form (3.6a) (e.g., with unitary and/or Hermitian matrices).

The conditions for the commutability of two matrices can be reviewed very well from the viewpoint developed above. If two matrices can be brought into diagonal form by the same transformation, i.e., if they have the same eigenvectors, then they commute: the two diagonal matrices commute after the similarity transformation; therefore they must also commute in their original form.
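The diagonalization described above can be carried out numerically; the following numpy sketch (not part of the original text, using a random matrix as an example) forms ρ from the eigenvectors and verifies (3.6a) as well as the invariance statements about trace and determinant.

    import numpy as np

    rng = np.random.default_rng(3)
    alpha = rng.normal(size=(4, 4))

    lam, rho = np.linalg.eig(alpha)             # eigenvalues and eigenvectors (as columns)
    Lambda = np.linalg.inv(rho) @ alpha @ rho   # rho^(-1) alpha rho, cf. (3.6a)

    assert np.allclose(Lambda, np.diag(lam))
    assert np.isclose(np.trace(alpha), lam.sum())       # trace = sum of eigenvalues
    assert np.isclose(np.linalg.det(alpha), lam.prod()) # determinant = product of eigenvalues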
In the first chapter we defined the rational function of a matrix,

    f(α) = ... + a₋₃α⁻³ + a₋₂α⁻² + a₋₁α⁻¹ + a₀1 + a₁α + a₂α² + a₃α³ + ....

To bring f(α) into diagonal form it is sufficient to transform α to the diagonal form σ⁻¹ασ = Λ. Then

    σ⁻¹f(α)σ = σ⁻¹(... + a₋₂α⁻² + a₋₁α⁻¹ + a₀1 + a₁α + a₂α² + ...)σ
             = ... + a₋₂Λ⁻² + a₋₁Λ⁻¹ + a₀1 + a₁Λ + a₂Λ² + ... = f(Λ),

and this is itself a diagonal matrix. If λ_k is the kth diagonal element in Λ = (Λ_ik) = (δ_ik λ_k), then (λ_k)ᵖ is the kth diagonal element in Λᵖ and

    ... + a₋₂λ_k⁻² + a₋₁λ_k⁻¹ + a₀ + a₁λ_k + a₂λ_k² + ... = f(λ_k)

is the kth diagonal element in f(Λ).

A rational function f(α) of a matrix α can be brought into diagonal form by the same transformation which brings α into diagonal form. The diagonal elements, the eigenvalues of f(α), are the corresponding functions f(λ_1), f(λ_2), ..., f(λ_n) of the diagonal elements λ_1, λ_2, ..., λ_n of α.³ We assume that this law holds not only for rational functions but also for arbitrary functions F(α) of α and consider this as the definition of general matrix functions.

³ Note that the eigenvalues can differ arbitrarily.
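This definition of a general matrix function is easily realized numerically. The sketch below (an illustration only, with a random matrix) computes exp(α) by transforming to diagonal form, applying exp to the eigenvalues, and transforming back, and compares the result with a truncated power series.

    import numpy as np

    def matrix_function(alpha, f):
        lam, sigma = np.linalg.eig(alpha)
        return sigma @ np.diag(f(lam)) @ np.linalg.inv(sigma)

    rng = np.random.default_rng(4)
    alpha = rng.normal(size=(3, 3))

    exp_eig = matrix_function(alpha, np.exp)

    # power series sum_k alpha^k / k!  (truncated) for comparison
    exp_series = np.zeros_like(alpha)
    term = np.eye(3)
    for k in range(1, 30):
        exp_series = exp_series + term
        term = term @ alpha / k

    assert np.allclose(exp_eig, exp_series)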
Special Matrices

One can obtain from a square matrix α a new matrix α', in which the roles of rows and columns are interchanged: α'_ik = α_ki. The matrix α' so formed is called the transpose of α. For the transpose of a product one has

    (αβ)' = β'α',          (3.7a)

since (αβ)'_ik = (αβ)_ki = Σ_j α_kj β_ji = Σ_j β'_ij α'_jk = (β'α')_ik, which verifies (3.7a).

The matrix which is formed by replacing each of the n² elements with its complex conjugate is denoted by α*, the complex conjugate of α. If α = α*, all the elements are real.

By interchanging the rows and columns and taking the complex conjugate as well, one obtains from α the matrix α*' = α'*. This matrix is called the adjoint of α and is denoted by α†.

By assuming various relationships between a matrix α and its adjoint, transpose, and reciprocal, special kinds of matrices can be obtained. Since their names appear frequently in the literature we will mention them all; in what follows, we shall use only unitary, Hermitian, and real orthogonal matrices.

If α = α* (i.e., α_ik = α_ik*), the matrix is said to be real, and all n² elements α_ik are real. If α = −α* (α_ik = −α_ik*), then the matrix is purely imaginary.
If S = S' (S_ik = S_ki), the matrix is symmetric; if S = −S' (S_ik = −S_ki), it is skew- or anti-symmetric.

If H = H† (H_ik = H_ki*), the matrix is said to be Hermitian; if A = −A†, skew- or anti-Hermitian.

If α is real as well as symmetric, then α is Hermitian also, etc.

If O' = O⁻¹, then O is complex orthogonal. A matrix U for which U† = U⁻¹ is said to be a unitary matrix. If R† = R⁻¹ and R = R* (real), then R' = R*' = R† = R⁻¹; R is said to be real orthogonal, or simply orthogonal.
Unitary Matrices and the Scalar Product
Before discussing unitary matrices, we must introduce one more new concept. In the very first chapter we defined the sum of two vectors and a constant multiple of a vector. Another important elementary concept is the scalar product of two vectors. The scalar product of a vector 𝔞 with a vector 𝔟 is a number. We shall distinguish between the Hermitian scalar product

    (𝔞, 𝔟) = a_1*b_1 + a_2*b_2 + ... + a_n*b_n          (3.9)

and the simple scalar product

    ((𝔞, 𝔟)) = a_1b_1 + a_2b_2 + ... + a_nb_n.          (3.9a)

Unless we specify otherwise, we always refer to the Hermitian scalar product rather than the simple scalar product. If the vector components a_1, a_2, ..., a_n are real, both products are identical.

If (𝔞, 𝔟) = 0 = (𝔟, 𝔞), then 𝔞 and 𝔟 are said to be orthogonal to one another. If (𝔞, 𝔞) = 1, it is said that 𝔞 is a unit vector, or that it is normalized. The product (𝔞, 𝔞) is always real and positive, and vanishes only when all the components of 𝔞 vanish. This holds only for the Hermitian scalar product, in contrast to the simple scalar product; for example, suppose 𝔞 is the two-dimensional vector (1, i). Then ((𝔞, 𝔞)) = 0, but (𝔞, 𝔞) = 2. In fact (𝔞, 𝔞) = 0 implies that 𝔞 = 0; but this does not follow from ((𝔞, 𝔞)) = 0.

Simple Rules for Scalar Products:

1. Upon interchange of the vectors,

    (𝔞, 𝔟) = (𝔟, 𝔞)*,     ((𝔞, 𝔟)) = ((𝔟, 𝔞)).          (3.10)

2. For multiplication of a vector by a number c,

    (𝔞, c𝔟) = c(𝔞, 𝔟).          (3.10a)

On the other hand,

    (c𝔞, 𝔟) = c*(𝔞, 𝔟),     whereas     ((c𝔞, 𝔟)) = c((𝔞, 𝔟)).          (3.11)
(a, bb + cc) = b(a, b) + c(a, c) (3.12)
It is, however, "antilinear" in the first factor
(aa + bb, C) = α*(α, C) + b*(b, C) (3.12a)
4 Furthermore, the important rule
(α, ab) = (οΛι, b) or (ßa, b) = (a, ßf b) (3.13)
is valid for arbitrary vectors α and b, and every matrix a T o see this,
Instead of applying the matrix a to one factor of a scalar product, its adjoint
a* can be applied to the other factor
For the simple scalar product the same rule holds for the transposed
matrix; that is
((α, ab)) = ((a'a, b))
5. We now write the condition U† = U⁻¹ for the unitarity of a matrix somewhat more explicitly: U†U = 1 implies that

    Σ_l (U†)_il U_lk = Σ_l U_li* U_lk = δ_ik;     that is,     (U.i, U.k) = δ_ik.          (3.14)

If the n columns of a unitary matrix are looked upon as vectors, they comprise n orthogonal unit vectors. Similarly, from UU† = 1, it follows that

    Σ_l U_il U_kl* = δ_ik.

The n rows of a unitary matrix also form n unit vectors which are mutually orthogonal.
6. A unitary transformation leaves the Hermitian scalar product unchanged; in other words, for arbitrary vectors 𝔞 and 𝔟,

    (U𝔞, U𝔟) = (𝔞, U†U𝔟) = (𝔞, 𝔟).          (3.15)

Conversely, if (3.15) holds for a matrix U for every pair of arbitrary vectors 𝔞 and 𝔟, then U is unitary, since then Eq. (3.15) holds also for 𝔞 = 𝔢_i and 𝔟 = 𝔢_k (where (𝔢_k)_l = δ_kl). But in this special case (3.15) becomes

    (U𝔢_i, U𝔢_k) = (U.i, U.k) = δ_ik,

which is just the unitarity condition (3.14). The same rule applies to complex orthogonal matrices, with respect to the simple scalar product.

7. The product UV of two unitary matrices U and V is unitary, since

    (UV)† = V†U† = V⁻¹U⁻¹ = (UV)⁻¹.

The reciprocal U⁻¹ of a unitary matrix is also unitary,

    (U⁻¹)† = (U†)† = U = (U⁻¹)⁻¹.
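Rules 5 and 6 can be checked with numpy; in the sketch below (an illustration only) a unitary matrix is obtained from the QR decomposition of a random complex matrix, a construction not discussed in the text but convenient for the purpose.

    import numpy as np

    rng = np.random.default_rng(5)
    n = 4
    U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

    # U is unitary: U^dagger U = 1, i.e. its columns are orthonormal   (3.14)
    assert np.allclose(U.conj().T @ U, np.eye(n))

    a = rng.normal(size=n) + 1j * rng.normal(size=n)
    b = rng.normal(size=n) + 1j * rng.normal(size=n)

    hermitian_product = lambda x, y: np.vdot(x, y)   # sum of x_k^* y_k
    assert np.isclose(hermitian_product(U @ a, U @ b), hermitian_product(a, b))   # (3.15)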
The Principal Axis Transformation for Unitary and Hermitian Matrices
Every unitary matrix V and every Hermitian matrix H can be brought into diagonal form by a similarity transformation with a unitary matrix U. For such matrices, the exceptional case mentioned on page 22 cannot occur.

First of all, we point out that a unitary (or Hermitian) matrix remains unitary (or Hermitian) after a unitary transformation. Since it is a product of three unitary matrices, U⁻¹VU is again unitary; and (U⁻¹HU)† = U†H†(U⁻¹)† = U⁻¹HU is again Hermitian.

To bring V or H to the diagonal form, we determine an eigenvalue of V or H. Let this be λ_1; the corresponding eigenvector U.1 = (U_11, ..., U_n1) is determined only up to a constant factor. We choose the constant factor so that

    (U.1, U.1) = 1.

This is always possible since (U.1, U.1) can never vanish. We now construct a unitary matrix U of which the first column is U.1.⁴

With this unitary matrix we now transform V or H into U⁻¹VU or U⁻¹HU. For example, in U⁻¹VU we have for the first column

    (U⁻¹VU)_{ν1} = (U†VU)_{ν1} = Σ_{λμ} U_{λν}* V_{λμ} U_{μ1} = Σ_λ U_{λν}* λ_1 U_{λ1} = δ_{ν1} λ_1,

since U.1 is already an eigenvector of V. We see that λ_1 occurs in the first row of the first column, and all the other elements of the first column are zero. Obviously, this holds true not only for U⁻¹VU, but also for U⁻¹HU. Since

⁴ See the lemma at the end of the proof.
U⁻¹HU is Hermitian, the first row is also zero, except for the very first element; thus U⁻¹HU has the form

    ( λ_1   0  ...  0 )
    (  0              )
    (  .      H_1     )          (3.E.1)
    (  0              ).

But U⁻¹VU must have exactly the same form! Indeed, write X = U⁻¹VU; since X is a unitary matrix, its first column X.1 is a unit vector, and from this it follows that

    |X_11|² + |X_21|² + ... + |X_n1|² = |λ_1|² = 1.          (3.E.2)

The same argument applies to the first row X_1. of X. The sum of the squares is given by

    |X_11|² + |X_12|² + ... + |X_1n|² = |λ_1|² + |X_12|² + |X_13|² + ... + |X_1n|² = 1,

which implies that X_12, X_13, ..., X_1n all vanish.

Therefore, every unitary or Hermitian matrix can be transformed into the form (3.E.1) by a unitary matrix. The matrix (3.E.1) is not yet a diagonal matrix, as it cannot be, since we have used the existence of only one eigenvalue. It is however more like a diagonal matrix than the original matrix V, or H. It is natural to write (3.E.1) as a super-matrix

    ( λ_1   0  )          ( λ_1   0  )
    (  0   V_1 )    or    (  0   H_1 ),          (3.E.3)

where the matrix V_1 or H_1 has only n − 1 rows and columns. We can then transform (3.E.3) by another unitary matrix of the form

    ( 1    0  )
    ( 0   U_1 ),

where U_1 has only n − 1 rows and columns. Under this process (3.E.1) assumes the form

    ( λ_1        0        )
    (  0   U_1⁻¹V_1U_1 )

(and correspondingly for H). The procedure above can be applied again and U_1 can be chosen so that U_1⁻¹V_1U_1 or U_1⁻¹H_1U_1 again has the form (3.E.1), where V_2 or H_2 has only n − 2 rows and columns. Clearly, repetition of this procedure will bring V or H entirely into diagonal form, so that the theorem is proven.
This theorem is not valid for symmetric or complex orthogonal matrices, as the second example on page 22 shows (the second matrix is symmetric and complex orthogonal). However, it is valid for real symmetric or real orthogonal matrices, which are just special cases of Hermitian or unitary matrices.

Lemma. If (U.1, U.1) = 1, then a unitary matrix can be constructed (in many different ways) whose first column is U.1 = (U_11, U_21, ..., U_n1).
We first construct in general a matrix the first column of which is U.1 and which has a nonvanishing determinant. Let the second column of this matrix be 𝔳.2 = (v_12, v_22, ..., v_n2), the third 𝔳.3, etc.

The vectors U.1, 𝔳.2, 𝔳.3, ... are then linearly independent since the determinant does not vanish. Since we also wish them to be orthogonal, we use the Schmidt procedure to "orthogonalize" them. First substitute 𝔲.2 = a_21 U.1 + 𝔳.2 for 𝔳.2; this leaves the determinant unaltered. Then set

    (U.1, 𝔲.2) = 0 = a_21 (U.1, U.1) + (U.1, 𝔳.2) = a_21 + (U.1, 𝔳.2)

and determine a_21 from this. Next write 𝔲.3 in place of 𝔳.3, with 𝔲.3 = a_31 U.1 + a_32 𝔲.2 + 𝔳.3, and determine a_31 and a_32 so that

    0 = (U.1, 𝔲.3) = a_31 (U.1, U.1) + (U.1, 𝔳.3),
    0 = (𝔲.2, 𝔲.3) = a_32 (𝔲.2, 𝔲.2) + (𝔲.2, 𝔳.3).

Proceeding in this way, we finally write 𝔲.n in place of 𝔳.n, with 𝔲.n = a_n1 U.1 + a_n2 𝔲.2 + ... + a_{n,n−1} 𝔲.{n−1} + 𝔳.n, and determine a_n1, a_n2, a_n3, ..., a_{n,n−1} so that

    0 = (U.1, 𝔲.n) = a_n1 (U.1, U.1) + (U.1, 𝔳.n),
    0 = (𝔲.2, 𝔲.n) = a_n2 (𝔲.2, 𝔲.2) + (𝔲.2, 𝔳.n),
    ......................................................
    0 = (𝔲.{n−1}, 𝔲.n) = a_{n,n−1} (𝔲.{n−1}, 𝔲.{n−1}) + (𝔲.{n−1}, 𝔳.n).

In this way, with the help of the ½n(n − 1) numbers a, we succeed in substituting the vectors 𝔲 for the vectors 𝔳. The 𝔲 are orthogonal and non-null by virtue of the linear independence of the 𝔳. Assume, for example, that 𝔲.n = 0; this would imply that 𝔳.n is a linear combination of U.1, 𝔲.2, ..., 𝔲.{n−1}, and since these are themselves linear combinations of U.1, 𝔳.2, ..., 𝔳.{n−1}, one could write 𝔳.n in terms of these n − 1 vectors, in contradiction to their linear independence.

Finally, we normalize the 𝔲.2, 𝔲.3, ..., 𝔲.n, thereby constructing a unitary matrix whose first column is U.1.

This "Schmidt orthogonalization procedure" shows how to construct from any set of linearly independent vectors an orthogonal normalized set in which the kth unit vector is a linear combination of just the first k of the original vectors. If one starts with n n-dimensional vectors which form a complete set of vectors, one obtains a complete orthogonal system.
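The lemma and the Schmidt procedure translate directly into code. The following Python sketch is an illustration, not the text's own prescription in detail: it orthogonalizes the unit vectors against the given first column rather than starting from an arbitrary nonsingular matrix.

    import numpy as np

    def unitary_with_first_column(u1):
        n = len(u1)
        columns = [u1 / np.sqrt(np.vdot(u1, u1))]      # normalize the given column
        for v in np.eye(n, dtype=complex).T:           # candidate vectors e_1, ..., e_n
            # subtract the components along the columns already accepted
            w = v - sum(np.vdot(c, v) * c for c in columns)
            norm = np.sqrt(np.vdot(w, w).real)
            if norm > 1e-10:                           # reject vectors that became null
                columns.append(w / norm)
            if len(columns) == n:
                break
        return np.column_stack(columns)

    u1 = np.array([1.0, 1j, -1.0]) / np.sqrt(3.0)
    U = unitary_with_first_column(u1)
    assert np.allclose(U[:, 0], u1)
    assert np.allclose(U.conj().T @ U, np.eye(3))      # U is unitary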
If a unitary matrix V or a Hermitian matrix H is brought to the diagonal form this way, then the resulting matrix Λ_V or Λ_H is also unitary or Hermitian. It follows that

    λ λ* = 1,     or     λ_H = λ_H*.          (3.19)

The absolute value of each eigenvalue of a unitary matrix⁵ is 1; the eigenvalues of a Hermitian matrix are real. This follows directly from (3.19), which states that for the eigenvalues λ_n of the unitary matrix, λ_n λ_n* = 1; for those of a Hermitian matrix, λ_n = λ_n*. The eigenvectors of V, and of H, as columns of the unitary matrix U, can be assumed to be orthogonal.

⁵ As Equation (3.E.2) already shows.

Real Orthogonal and Symmetric Matrices

Finally, we investigate the implications of the requirement that V, or H, be complex orthogonal (or symmetric), as well as unitary (or Hermitian). In this case, both V and H are real.

From U⁻¹VU = Λ_V, we obtain the complex conjugate U*⁻¹V*U* = (U*)⁻¹VU* = Λ_V*. Since the eigenvalues, as roots of the secular equation, are independent of how the matrix is diagonalized (i.e., whether by U or U*), the diagonal form Λ_V can also be written as Λ_V*. Thus the numbers λ_1, λ_2, ..., λ_n are the same as the numbers λ_1*, λ_2*, ..., λ_n*. This implies that the complex eigenvalues of a real orthogonal matrix V occur in conjugate pairs. Moreover, since V'V = 1, they all have absolute value 1; the real eigenvalues are therefore ±1. In an odd-dimensional matrix at least one eigenvalue must be real.

If 𝔳 is an eigenvector for the eigenvalue λ, then 𝔳* is an eigenvector for the complex conjugate value λ*. To see this, write V𝔳 = λ𝔳; then V*𝔳* = λ*𝔳* = V𝔳*. Moreover, if λ* is different from λ, then (𝔳*, 𝔳) = 0 = ((𝔳, 𝔳)); the simple scalar product of an eigenvector with itself vanishes if the corresponding eigenvalue is not real (not ±1). Conversely, real eigenvectors (for which the simple scalar product does not vanish) correspond to the eigenvalues ±1. Also, let 𝔳 be the eigenvector for λ_1, let 𝔳* be that for λ_1*, and 𝔷 that for λ_2. Then if λ_1 ≠ λ_2, it follows that

    0 = (𝔳*, 𝔷) = ((𝔳, 𝔷)).

The simple scalar product of two eigenvectors of a real orthogonal matrix is always zero if the corresponding eigenvalues are not complex conjugates; when the eigenvalues are complex conjugates, the corresponding eigenvectors are themselves complex conjugates.

The determinant of a real orthogonal matrix is ±1: from V'V = 1 it follows that the determinant of V' multiplied with that of V must give 1. The determinant of V', however, is equal to that of V, so that both must be either +1 or −1.

If H is real, the Eq. (3.5) is real, since the λ_k are real. The eigenvectors of a real symmetric matrix can therefore also be assumed to be real. (Although they are determined only up to a constant factor, they can also be multiplied by a complex factor.) Thus, the unitary matrix U in U⁻¹HU = Λ_H can be assumed real, i.e., real orthogonal.
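The statements of this chapter about the eigenvalues of unitary, Hermitian, and real orthogonal matrices are verified numerically below (an illustration only; the matrices are random examples).

    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

    U, _ = np.linalg.qr(A)                         # a unitary matrix
    H = A + A.conj().T                             # a Hermitian matrix
    R, _ = np.linalg.qr(rng.normal(size=(n, n)))   # a real orthogonal matrix

    assert np.allclose(np.abs(np.linalg.eigvals(U)), 1.0)     # |lambda| = 1
    assert np.allclose(np.linalg.eigvals(H).imag, 0.0)        # eigenvalues real

    lam = np.linalg.eigvals(R)
    assert np.allclose(np.abs(lam), 1.0)
    assert np.allclose(np.sort_complex(lam), np.sort_complex(lam.conj()))   # conjugate pairs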
THE ELEMENTS OF QUANTUM MECHANICS

1. In the years before 1925 the development of the then new "Quantum Mechanics" was directed primarily toward the determination of the energy of stationary states, i.e., toward the calculation of the energy levels. The older "Separation Theory" of Epstein-Schwarzschild gave a prescription for the determination of the energy levels, or terms, only for systems whose classical mechanical motions had the very special property of being periodic, or at least quasi-periodic.
An idea of W. Heisenberg, which attempted a precise statement of the Bohr correspondence principle, corrected this deficiency. It was proposed independently by M. Born and P. Jordan, and by P. A. M. Dirac. Its essence is the requirement that only motions which later would be seen as quantum mechanically allowed motions should occur in the calculation. In carrying through this idea these authors were led to introduce matrices with infinite numbers of rows and columns as a formal representation of position and momentum coordinates, and formal calculations with "q-numbers" obeying the associative but not the commutative law.

Thus, for example, the equation for the energy H of the linear oscillator¹

    H = (1/2m) P² + (K/2) Q²          (4.1)

is obtained by formally substituting the matrices P and Q for the momentum and position coordinates p and q in the Hamiltonian formulation of the classical expression for the energy. It is required that H be a diagonal matrix. The diagonal terms H_nn then give the possible energy values, the stationary levels of the system. On the other hand, the absolute squares of the elements Q_nk of the matrix Q are proportional to the probability of a spontaneous transition from a state with energy H_nn to one with energy H_kk. They give, therefore, the intensity of the line with frequency

    ω = (H_nn − H_kk)/ħ.

All of this follows from the same considerations which suggest the introduction of matrices for P and Q.
In order to specify the problem completely, one had still to introduce a "commutation relation" between p and q. This was assumed to be

    pq − qp = (ħ/i) 1,          (4.2)

where ħ is Planck's constant divided by 2π.

¹ The m is the mass of the oscillating particle, and K the force constant; q and p are the position and momentum coordinates.
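The scheme can be made concrete for the oscillator by truncating the infinite matrices to a finite size. The following numpy sketch is an illustration only and is not taken from the text; the truncation size N and the units ħ = m = K = 1 are arbitrary choices, and the commutation relation and the spectrum of H hold exactly only away from the truncated corner.

    import numpy as np

    hbar, m, K = 1.0, 1.0, 1.0
    omega = np.sqrt(K / m)
    N = 40                                    # truncation of the infinite matrices

    n = np.arange(N)
    a = np.diag(np.sqrt(n[1:]), k=1)          # ladder ("lowering") matrix
    Q = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)
    P = 1j * np.sqrt(hbar * m * omega / 2) * (a.T - a)

    H = P @ P / (2 * m) + K * Q @ Q / 2

    # commutation relation PQ - QP = (hbar/i) 1, exact away from the last row/column
    comm = P @ Q - Q @ P
    assert np.allclose(comm[:N-1, :N-1], (hbar / 1j) * np.eye(N)[:N-1, :N-1])

    # H is diagonal; its diagonal elements are hbar*omega*(n + 1/2),
    # apart from the uppermost (truncated) level
    print(np.allclose(H - np.diag(np.diag(H)), 0.0))
    print(np.real(np.diag(H))[:5])            # approximately [0.5, 1.5, 2.5, 3.5, 4.5]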
Calculations with these quantities, although often fairly tedious, led very rapidly to beautiful and important results of a far-reaching nature. Thus, the "selection rules" for angular momentum and certain "sum rules" which determine the relative intensity of the Zeeman components of a line could be calculated in agreement with experiment, an achievement for which the Separation Theory was inadequate.

E. Schrödinger, by an approach which was independent of Heisenberg's point of view, arrived at results which were mathematically equivalent to those mentioned above. His method bears deep resemblance to the ideas of L. de Broglie. The discussion to follow is based on Schrödinger's approach.
Consider a many-dimensional space with as many coordinates as the system considered has position coordinates. Every arrangement of the positions of the particles of the system corresponds to a point in this multidimensional "configuration space." This point will move in the course of time, tracing out a curve by which the motion of the system can be completely described classically. There exists a fundamental correspondence between the classical motion of this point, the system point in configuration space, and the motion of a wave-packet, also considered in configuration space,² if only we assume that the index of refraction for these waves is [2m(E − V)]^{1/2}/E. Here E is the total energy of the system; V, the potential energy as a function of the configuration.

The correspondence consists in the fact that the smaller the ratio between the wavelengths in the wave-packet and the radius of curvature of the path in configuration space, the more accurately the wave-packet will follow that path. On the other hand, if the wave-packet contains wavelengths as large as the classical radius of curvature of the path in configuration space, then important differences between the two motions exist, due to interference among the waves.

² The development of the text follows Schrödinger's ideas more closely than is customary at present. (Remark of translator.)
Schrödinger assumes that the motion of the configuration point corresponds to the motion of the waves, and not to the classically calculated motion of the system point. The waves are described by a wave equation,

    (1/2m_1) ∂²ψ/∂x_1² + (1/2m_2) ∂²ψ/∂x_2² + ... + (1/2m_f) ∂²ψ/∂x_f² = ((E − V)/E²) ∂²ψ/∂t²,          (4.3)

where x_1, x_2, ..., x_f are the position coordinates of the particles in the system considered, m_1, m_2, ..., m_f the corresponding masses, and V(x_1, x_2, ..., x_f) is the potential energy in terms of the coordinates of the individual particles.

The total energy E of the system appears explicitly in (4.3). On the other hand the frequency, or the period of the waves, is still unspecified. Schrödinger assumes that the frequency of a wave which is associated with the motion of a system with total energy E is given by ħω = E. He therefore substitutes into (4.3)

    ψ = ψ_E exp(−i(E/ħ)t),          (4.4)
where ψ_E is independent of t. He thus obtains the eigenvalue equation

    −(ħ²/2m_1) ∂²ψ_E/∂x_1² − (ħ²/2m_2) ∂²ψ_E/∂x_2² − ... − (ħ²/2m_f) ∂²ψ_E/∂x_f² + V ψ_E = E ψ_E,          (4.5)

where ψ_E is a function of the particle position coordinates x_1, x_2, ..., x_f. It is necessary to require that ψ_E be square-integrable, i.e., the integral

    ∫···∫ |ψ_E(x_1, x_2, ..., x_f)|² dx_1 ··· dx_f

over all configuration space must be finite. In particular, ψ_E must vanish at infinity. The values of E for which the determination of such a function ψ_E is possible are called the "eigenvalues" of (4.5); they give the possible energy values of the system. The corresponding square-integrable solution of (4.5) is called the eigenfunction belonging to the eigenvalue E.

Equation (4.5) is also written in the form

    Hψ_E = Eψ_E,          (4.5a)

where H is a linear operator (the Hamiltonian, or energy operator),

    H = −(ħ²/2m_1) ∂²/∂x_1² − (ħ²/2m_2) ∂²/∂x_2² − ... − (ħ²/2m_f) ∂²/∂x_f² + V(x_1, x_2, ..., x_f).          (4.5b)

The last term means multiplication by V(x_1, x_2, ..., x_f). The operator H transforms one function of x_1, x_2, ..., x_f into another function. The function ψ of (4.4) fulfills the relationship

    Hψ = iħ ∂ψ/∂t.          (4.6)
The total energy of the system does not appear explicitly in (4.6), so that it applies generally to all motions, independent of the energy of the system; it is called the time-dependent Schrödinger equation.
The two Eqs. (4.5) (or (4.5a), (4.5b)) and (4.6) are the basic equations of quantum mechanics. The latter specifies the change of a configuration wave in the course of time—to which, as we will see, a far-reaching physical reality is attributed; (4.5) (or (4.5a), (4.5b)) is the equation for the frequency ω = E/ħ, the energy E, and the periodic time-dependence of the wave function ψ. Indeed, (4.5a) results from (4.6) and the assumption that ψ has the time dependence given in (4.4).

An important part will be played in the following by the scalar product of two functions φ and g of the configuration coordinates,

    (φ, g) = ∫···∫ φ(x_1, ..., x_f)* g(x_1, ..., x_f) dx_1 ··· dx_f.          (4.7)
Thus, if A 1 and A 2 are numerical constants,
(Ψ, (H9I + a 29 2 ) = A I(<P> 9ι) + Α 2 (Φ, G 2 ),
and
(<P>9) = (9>
Ψ)*-(Φ, Φ) is real and positive and vanishes only IF Φ = 0 IF Ψ)*-(Φ, Φ) = 1, then
Ψ is said to be normalized I f the integral
oo
is finite, then Φ can always be normalized b y multiplication b y a constant
^1/c in the case above, since ^— ,— J = i j T w o FUNCTIONS are orthogonal
if their scalar product is zero
The scalar product given in Eq. (4.7) is constructed by considering the functions φ(x_1 ··· x_f), g(x_1 ··· x_f) of x_1, x_2, ..., x_f as vectors, whose components are labeled by f continuous indices. The function vector φ(x_1 ··· x_f) is defined in an f-fold infinite-dimensional space. Each system of values of x_1 ··· x_f, i.e., each configuration, corresponds to one dimension. Then the scalar product of φ and g, in vector language, is

    (φ, g) = Σ_{x_1 ··· x_f} φ(x_1 ··· x_f)* g(x_1 ··· x_f),

for which the integral (4.7) was substituted.
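The scalar product (4.7) and the normalization procedure can be illustrated numerically by replacing the integral with a sum over a grid, exactly in the spirit of the vector picture just described; the functions and the grid below are arbitrary choices, not taken from the text.

    import numpy as np

    x = np.linspace(-10.0, 10.0, 4001)
    dx = x[1] - x[0]

    phi = np.exp(-x**2 / 2.0) * (1.0 + 0.0j)    # a (not yet normalized) function
    g = x * np.exp(-x**2 / 2.0)

    def scalar_product(f1, f2):
        # (f1, f2) = integral of f1* f2 over configuration space, cf. (4.7)
        return np.sum(np.conj(f1) * f2) * dx

    c = scalar_product(phi, phi).real           # finite, so phi can be normalized
    phi_normalized = phi / np.sqrt(c)
    print(np.isclose(scalar_product(phi_normalized, phi_normalized).real, 1.0))
    print(np.isclose(scalar_product(phi_normalized, g), 0.0))   # phi and g are orthogonal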