The calculus of variations involves problems in which the quantity to be minimized (or maximized) appears as a stationary integral, a functional, because a function $y(x, \alpha)$ needs to be determined from a class described by an infinitesimal parameter $\alpha$. As the simplest case, let
$$J = \int_{x_1}^{x_2} f(y, y_x, x)\, dx. \tag{17.1}$$
Here $J$ is the quantity that takes on a stationary value. Under the integral sign, $f$ is a known function of the indicated variables $y$, $y_x$, and $x$, where $y = y(x, \alpha)$ and $y_x(x, \alpha) \equiv \partial y(x, \alpha)/\partial x$, but the dependence of $y$ on $x$ (and $\alpha$) is not yet known; that is, $y(x)$ is unknown. This means that although the integral is from $x_1$ to $x_2$, the exact path of integration is not known (Fig. 17.1). We are to choose the path of integration through points $(x_1, y_1)$ and $(x_2, y_2)$ to minimize $J$. Strictly speaking, we determine stationary values of $J$: minima, maxima, or saddle points. In most cases of physical interest the stationary value will be a minimum.
This problem is considerably more difficult than the corresponding problem of a function $y(x)$ in differential calculus. Indeed, there may be no solution. In differential calculus the minimum is determined by comparing $y(x_0)$ with $y(x)$, where $x$ ranges over neighboring points. Here we assume the existence of an optimum path, that is, an acceptable path for which $J$ is stationary, and then compare $J$ for our (unknown) optimum path with that obtained from neighboring paths. In Fig. 17.1 two possible paths are shown. (There are an infinite number of possibilities.) The difference between these two for a given $x$ is called the variation of $y$, $\delta y$, and is conveniently described by introducing a new function, $\eta(x)$, to define the arbitrary deformation of the path and a scale factor, $\alpha$, to give the magnitude of the variation. The function $\eta(x)$ is arbitrary except for two restrictions. First,
$$\eta(x_1) = \eta(x_2) = 0, \tag{17.2}$$
FIGURE 17.1 A varied path.
which means that all varied paths must pass through the fixed endpoints. Second, as will be seen shortly, $\eta(x)$ must be differentiable; that is, we may not use
$$\eta(x) = \begin{cases} 1, & x = x_0, \\ 0, & x \neq x_0, \end{cases} \tag{17.3}$$
but we can choose $\eta(x)$ to have a form similar to the functions used to represent the Dirac delta function (Chapter 1) so that $\eta(x)$ differs from zero only over an infinitesimal region.$^{1}$ Then, with the path described by $\alpha$ and $\eta(x)$,
$$y(x, \alpha) = y(x, 0) + \alpha\,\eta(x) \tag{17.4}$$
and
$$\delta y = y(x, \alpha) - y(x, 0) = \alpha\,\eta(x). \tag{17.5}$$
Let us choose $y(x, \alpha = 0)$ as the unknown path that will minimize $J$. Then $y(x, \alpha)$ for nonzero $\alpha$ describes a neighboring path. In Eq. (17.1), $J$ is now a function$^{2}$ of our parameter $\alpha$:
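$$J(\alpha) = \int_{x_1}^{x_2} f\bigl[y(x, \alpha), y_x(x, \alpha), x\bigr]\, dx, \tag{17.6}$$
and our condition for an extreme value is that
$$\left[\frac{\partial J(\alpha)}{\partial \alpha}\right]_{\alpha=0} = 0, \tag{17.7}$$
analogous to the vanishing of the derivative $dy/dx$ in differential calculus.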
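As a numerical illustration of Eqs. (17.4) to (17.7), the following minimal sketch evaluates $J(\alpha)$ for the arc-length integrand $f = \sqrt{1 + y_x^2}$ on $[0, 1]$. The integrand, the deformation $\eta(x) = \sin \pi x$, and the two base paths (the straight line $y = x$ and the parabola $y = x^2$, both joining $(0, 0)$ to $(1, 1)$) are assumptions chosen for this example and do not come from the text. The derivative with respect to $\alpha$ is estimated by a central difference.

```python
import math

def J(y0_slope, alpha, n=2000):
    """Midpoint-rule arc length of y(x, alpha) = y0(x) + alpha*eta(x) on [0, 1].

    Only slopes enter because f = sqrt(1 + y_x**2); eta(x) = sin(pi*x)
    vanishes at both endpoints, as Eq. (17.2) requires.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        eta_x = math.pi * math.cos(math.pi * x)   # d(eta)/dx
        y_x = y0_slope(x) + alpha * eta_x         # slope of the varied path
        total += math.sqrt(1.0 + y_x * y_x) * h
    return total

def dJ_dalpha(y0_slope, d=1e-4):
    """Central-difference estimate of dJ/d(alpha) at alpha = 0, Eq. (17.7)."""
    return (J(y0_slope, d) - J(y0_slope, -d)) / (2 * d)

line = lambda x: 1.0          # slope of y = x
parabola = lambda x: 2.0 * x  # slope of y = x**2

slope_line = dJ_dalpha(line)
slope_parabola = dJ_dalpha(parabola)
print("dJ/dalpha at alpha=0, straight line:", slope_line)
print("dJ/dalpha at alpha=0, parabola:     ", slope_parabola)
```

For the straight line the estimated derivative vanishes to within numerical error, while for the parabola it is clearly nonzero, so within this one-parameter family only the straight line passes the test of Eq. (17.7).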
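Now, the $\alpha$-dependence of the integral is contained in $y(x, \alpha)$ and $y_x(x, \alpha) = (\partial/\partial x)\, y(x, \alpha)$. Therefore$^{3}$
$$\frac{\partial J(\alpha)}{\partial \alpha} = \int_{x_1}^{x_2} \left[\frac{\partial f}{\partial y}\frac{\partial y}{\partial \alpha} + \frac{\partial f}{\partial y_x}\frac{\partial y_x}{\partial \alpha}\right] dx. \tag{17.8}$$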
From Eq. (17.4),
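$$\frac{\partial y(x, \alpha)}{\partial \alpha} = \eta(x), \tag{17.9}$$
$$\frac{\partial y_x(x, \alpha)}{\partial \alpha} = \frac{d\eta(x)}{dx}, \tag{17.10}$$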
so Eq. (17.8) becomes
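$$\frac{\partial J(\alpha)}{\partial \alpha} = \int_{x_1}^{x_2} \left[\frac{\partial f}{\partial y}\,\eta(x) + \frac{\partial f}{\partial y_x}\frac{d\eta(x)}{dx}\right] dx. \tag{17.11}$$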
$^{1}$ Compare H. Jeffreys and B. S. Jeffreys, Methods of Mathematical Physics, 3rd ed., Cambridge, UK: Cambridge University Press (1966), Chapter 10, for a more complete discussion of this point.
$^{2}$ Technically, $J$ is a functional of $y$, $y_x$, but a function of $\alpha$ depending on the functions $y(x, \alpha)$ and $y_x(x, \alpha)$: $J[y(x, \alpha), y_x(x, \alpha)]$.
$^{3}$ Note that $y$ and $y_x$ are being treated as independent variables.
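Integrating the second term by parts to get $\eta(x)$ as a common and arbitrary nonvanishing factor, we obtain
$$\int_{x_1}^{x_2} \frac{d\eta(x)}{dx}\frac{\partial f}{\partial y_x}\, dx = \eta(x)\frac{\partial f}{\partial y_x}\bigg|_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta(x)\,\frac{d}{dx}\frac{\partial f}{\partial y_x}\, dx. \tag{17.12}$$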
The integrated part vanishes by Eq. (17.2), and Eq. (17.11) becomes
$$\int_{x_1}^{x_2} \left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right] \eta(x)\, dx = 0. \tag{17.13}$$
In this form $\alpha$ has been set equal to zero, corresponding to the solution path, and, in effect, is no longer part of the problem.
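The integration by parts leading from Eq. (17.11) to Eq. (17.13) holds for any smooth path, not only the stationary one; only the value zero is special to the solution. The sketch below checks this numerically under assumed choices that are not from the text: the integrand $f = y_x^2 + y^2$, the trial path $y = x^2$ on $[0, 1]$ (not a solution of the Euler equation), and the deformation $\eta(x) = x(1 - x)$. Both forms of the integral should agree, and for these particular polynomials the common value can be worked out by hand as $-17/30$.

```python
def integrate(g, a=0.0, b=1.0, n=1000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

# Assumed trial path and deformation (illustrative only).
y = lambda x: x * x
y_x = lambda x: 2.0 * x
y_xx = lambda x: 2.0
eta = lambda x: x * (1.0 - x)      # eta(0) = eta(1) = 0
eta_x = lambda x: 1.0 - 2.0 * x

# For f = y_x**2 + y**2:  df/dy = 2*y  and  df/dy_x = 2*y_x.
# Integrand of Eq. (17.11), before integrating by parts.
form_11 = integrate(lambda x: 2.0 * y(x) * eta(x) + 2.0 * y_x(x) * eta_x(x))
# Left-hand integral of Eq. (17.13), using d/dx(df/dy_x) = 2*y_xx.
form_13 = integrate(lambda x: (2.0 * y(x) - 2.0 * y_xx(x)) * eta(x))

print(form_11, form_13)
```

The two numbers coincide because the boundary term in Eq. (17.12) vanishes; they are nonzero because the trial path is not stationary.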
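Occasionally we will see Eq. (17.13) multiplied by $\delta\alpha$, which gives, upon using $\eta(x)\,\delta\alpha = \delta y$,
$$\int_{x_1}^{x_2} \left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right] \delta y\, dx = \delta\alpha \left[\frac{\partial J}{\partial \alpha}\right]_{\alpha=0} = \delta J = 0. \tag{17.14}$$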
Since $\eta(x)$ is arbitrary, we may choose it to have the same sign as the bracketed expression in Eq. (17.13) whenever the latter differs from zero. Hence the integrand is always nonnegative. Equation (17.13), our condition for the existence of a stationary value, can then be satisfied only if the bracketed term itself is zero almost everywhere. The condition for our stationary value is thus a differential equation,$^{4}$
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0, \tag{17.15}$$
known as the Euler equation, which can be expressed in various other forms. Sometimes solutions are missed when they are not twice differentiable, as required by Eq. (17.15). An example is Goldschmidt's discontinuous solution of Section 17.2. It is clear that Eq. (17.15) must be satisfied for $J$ to take on a stationary value, that is, for Eq. (17.14) to be satisfied.
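To make Eq. (17.15) concrete, the left-hand side can be evaluated numerically along any candidate path. The following is a rough sketch using nested central differences; the helper name `euler_residual`, the step size, and the sample integrand $f = y_x^2 + y^2$ (whose Euler equation reduces to $y'' = y$) are assumptions made for this illustration only.

```python
import math

def euler_residual(f, y, x, h=1e-3):
    """Finite-difference estimate of df/dy - d/dx(df/dy_x) along y at the point x.

    f(y, y_x, x) is the integrand; y(x) is the candidate path.
    """
    def y_x(t):
        return (y(t + h) - y(t - h)) / (2 * h)

    def f_y(t):    # partial derivative with respect to y, holding y_x and x fixed
        return (f(y(t) + h, y_x(t), t) - f(y(t) - h, y_x(t), t)) / (2 * h)

    def f_yx(t):   # partial derivative with respect to y_x, holding y and x fixed
        return (f(y(t), y_x(t) + h, t) - f(y(t), y_x(t) - h, t)) / (2 * h)

    # The total derivative d/dx acts on df/dy_x evaluated along the path.
    return f_y(x) - (f_yx(x + h) - f_yx(x - h)) / (2 * h)

def f(y, y_x, x):
    # Assumed sample integrand, not taken from the text.
    return y_x ** 2 + y ** 2

solution = lambda x: math.sinh(x) / math.sinh(1.0)  # solves y'' = y with y(0)=0, y(1)=1
trial = lambda x: x * x                             # same endpoints, not a solution

res_solution = euler_residual(f, solution, 0.5)
res_trial = euler_residual(f, trial, 0.5)
print("residual on solution:", res_solution)
print("residual on trial path:", res_trial)
```

The residual is zero (to discretization error) on the hyperbolic-sine path and equals $2y - 2y'' = -3.5$ at $x = 0.5$ on the parabola. Note that the partial derivatives treat $y$ and $y_x$ as independent, whereas the outer $d/dx$ follows the path.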
Equation (17.15) is necessary, but it is by no means sufficient.$^{5}$ Courant and Robbins (1996; see the Additional Readings) illustrate this very nicely by considering the distance over a sphere between points on the sphere, $A$ and $B$, Fig. 17.2. Path (1), a great circle, is found from Eq. (17.15). But path (2), the remainder of the great circle through points $A$ and $B$, also satisfies the Euler equation. Path (2) is a maximum, but only if we demand that it be a great circle and then only if we make less than one circuit; that is, path (2) + $n$ complete revolutions is also a solution. If the path is not required to be a great circle, any deviation from (2) will increase the length. This is hardly the property of a local maximum, and that is why it is important to check the properties of solutions of Eq. (17.15) to see if they satisfy the physical conditions of the given problem.
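$^{4}$ It is important to watch the meaning of $\partial/\partial x$ and $d/dx$ closely. For example, if $f = f[y(x), y_x, x]$,
$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial y_x}\frac{d^2 y}{dx^2}.$$
The first term on the right gives the explicit $x$-dependence. The second and third terms give the implicit $x$-dependence via $y$ and $y_x$.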
$^{5}$ For a discussion of sufficiency conditions and the development of the calculus of variations as a part of mathematics, see G. M. Ewing, Calculus of Variations with Applications, New York: Norton (1969). Sufficiency conditions are also covered by Sagan (in the Additional Readings at the end of this chapter).
FIGURE 17.2 Stationary paths over a sphere.
Example 17.1.1 OPTICAL PATH NEAR EVENT HORIZON OF A BLACK HOLE
Determine the optical path in an atmosphere where the velocity of light increases in proportion to the height, $v(y) = y/b$, with $b > 0$ some parameter describing the light speed. So $v = 0$ at $y = 0$, which simulates the conditions at the surface of a black hole, called its event horizon, where the gravitational force is so strong that the velocity of light goes to zero, thus even trapping light.
Because light takes the shortest time, the variational problem takes the form
$$t = \int_{t_1}^{t_2} dt = \int \frac{ds}{v} = b \int \frac{\sqrt{dx^2 + dy^2}}{y} = \text{minimum}.$$
Here $v = ds/dt = y/b$ is the velocity of light in this environment, the $y$ coordinate being the height. A look at the variational functional suggests choosing $y$ as the independent variable because $x$ does not appear in the integrand. We can bring $dy$ outside the radical and change the role of $x$ and $y$ in $J$ of Eq. (17.1) and the resulting Euler equation. With $x = x(y)$, $x' = dx/dy$, we obtain
$$b \int \frac{\sqrt{x'^2 + 1}}{y}\, dy = \text{minimum},$$
and the Euler equation becomes
$$\frac{\partial f}{\partial x} - \frac{d}{dy}\frac{\partial f}{\partial x'} = 0.$$
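Since $\partial f/\partial x = 0$, this can be integrated, giving
$$\frac{x'}{y\sqrt{x'^2 + 1}} = C_1 = \text{const.}, \quad \text{or} \quad x'^2 = C_1^2 y^2 \left(x'^2 + 1\right).$$
Separating $dx$ and $dy$ in this first-order ODE we find the integral
$$\int^{x} dx = \int^{y} \frac{C_1 y\, dy}{\sqrt{1 - C_1^2 y^2}},$$
FIGURE 17.3 Circular optical path in medium.
which yields
$$x + C_2 = -\frac{1}{C_1}\sqrt{1 - C_1^2 y^2}, \quad \text{or} \quad (x + C_2)^2 + y^2 = \frac{1}{C_1^2}.$$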
This is a circular light path with center on the $x$-axis along the event horizon. (See Fig. 17.3.) This example may be adapted to a mirage (Fata Morgana) in a desert with hot air near the ground and cooler air aloft (the index of refraction changes with height in cool versus hot air), thus changing the velocity law from $v = y/b \to v_0 - y/b$. In this case, the circular light path is no longer convex with center on the $x$-axis, but becomes concave.
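The result of this example can be checked numerically by comparing travel times $t = b \int ds/y$ along two paths with the same endpoints. The sketch below is only illustrative: the values $b = 1$, $C_1 = 1$ (unit circle centered at the origin), and the endpoints at polar angles $\pi/4$ and $3\pi/4$ are assumptions, and the helper `travel_time` approximates each path by a fine polyline.

```python
import math

b = 1.0  # assumed value of the light-speed parameter in v(y) = y/b

def travel_time(path, s0, s1, n=20000):
    """Estimate t = b * integral of ds / y along path(s) -> (x, y), s in [s0, s1].

    The path is replaced by n straight segments; each contributes its length
    divided by the speed y/b taken at the mean height of the segment.
    """
    ds = (s1 - s0) / n
    total = 0.0
    for i in range(n):
        xa, ya = path(s0 + i * ds)
        xb, yb = path(s0 + (i + 1) * ds)
        total += b * math.hypot(xb - xa, yb - ya) / (0.5 * (ya + yb))
    return total

# Circular arc (x + C2)**2 + y**2 = 1/C1**2 with C1 = 1, C2 = 0.
arc = lambda theta: (math.cos(theta), math.sin(theta))
# Straight chord between the same endpoints, at constant height sqrt(2)/2.
half = math.sqrt(2.0) / 2.0
chord = lambda s: (half - 2.0 * half * s, half)

t_arc = travel_time(arc, math.pi / 4, 3 * math.pi / 4)
t_chord = travel_time(chord, 0.0, 1.0)
print("arc:  ", t_arc)
print("chord:", t_chord)
```

The arc is geometrically longer than the chord, yet it takes less time because it rises into the region of higher light speed. For this geometry the arc time can also be obtained in closed form, $b \int d\theta/\sin\theta = 2b \ln(1 + \sqrt{2})$, which the numerical value reproduces.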
Alternate Forms of Euler Equations
One other form (Exercise 17.1.1), which is often useful, is
$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left[f - y_x \frac{\partial f}{\partial y_x}\right] = 0. \tag{17.16}$$
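In problems in which $f = f(y, y_x)$, that is, in which $x$ does not appear explicitly, Eq. (17.16) reduces to
$$\frac{d}{dx}\left[f - y_x \frac{\partial f}{\partial y_x}\right] = 0, \tag{17.17}$$
or
$$f - y_x \frac{\partial f}{\partial y_x} = \text{constant}. \tag{17.18}$$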
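A quick numerical check of the first integral, Eq. (17.18), is sketched below. The integrand $f = y\sqrt{1 + y_x^2}$ and the path $y = \cosh x$ are assumptions chosen for illustration; substituting into Eq. (17.15) confirms that this path satisfies the Euler equation, since $\partial f/\partial y = \cosh x$ and $\partial f/\partial y_x = \sinh x$. The partial derivative $\partial f/\partial y_x$ is taken by a central difference.

```python
import math

def f(y, y_x):
    # Assumed integrand with no explicit x-dependence.
    return y * math.sqrt(1.0 + y_x * y_x)

def first_integral(y, y_x, h=1e-6):
    """Evaluate f - y_x * df/dy_x, the conserved quantity of Eq. (17.18)."""
    df_dyx = (f(y, y_x + h) - f(y, y_x - h)) / (2 * h)
    return f(y, y_x) - y_x * df_dyx

xs = (-1.0, -0.3, 0.0, 0.5, 1.2)

# Along y = cosh(x), with y_x = sinh(x): a solution of the Euler equation.
on_solution = [first_integral(math.cosh(x), math.sinh(x)) for x in xs]
# Along y = 1 + x**2, with y_x = 2x: not a solution.
off_solution = [first_integral(1.0 + x * x, 2.0 * x) for x in xs]

print("cosh path: ", on_solution)
print("parabola:  ", off_solution)
```

Along the hyperbolic-cosine path the quantity stays at the same value (here 1) for every $x$, whereas along the parabola it drifts with $x$, as expected for a path that does not satisfy the Euler equation.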
Example 17.1.2 Missing Dependent Variables
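Consider the variational problem $\int f(\dot{\mathbf{r}})\, dt = \text{minimum}$. Here $\mathbf{r}$ is absent from the integrand. Therefore the Euler equations become
$$\frac{d}{dt}\frac{\partial f}{\partial \dot{x}} = 0, \quad \frac{d}{dt}\frac{\partial f}{\partial \dot{y}} = 0, \quad \frac{d}{dt}\frac{\partial f}{\partial \dot{z}} = 0,$$
with $\mathbf{r} = (x, y, z)$, so $f_{\dot{\mathbf{r}}} = \mathbf{c} = \text{const}$. Solving these three equations for the three unknowns $\dot{x}$, $\dot{y}$, $\dot{z}$ yields $\dot{\mathbf{r}} = \mathbf{c}_1 = \text{const}$. Integrating this constant velocity gives $\mathbf{r} = \mathbf{c}_1 t + \mathbf{c}_2$. The solutions are straight lines, despite the general nature of the function $f$.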
A physical example illustrating this case is the propagation of light in a crystal, where the velocity of light depends on the (crystal) directions but not on the location in the crystal, because a crystal is an anisotropic homogeneous medium. The variational problem
$$\int \frac{ds}{v} = \int \frac{\sqrt{\dot{\mathbf{r}}^2}}{v(\dot{\mathbf{r}})}\, dt = \text{minimum}$$
has the form of our example. Note that $t$ need not be the time, but it parameterizes the light path.
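As a hedged illustration of this example, the sketch below uses a two-dimensional model in which the speed depends only on direction: the integrand is taken to be $f(\dot{x}, \dot{y}) = \sqrt{(\dot{x}/v_x)^2 + (\dot{y}/v_y)^2}$ with assumed constants $v_x = 1$, $v_y = 2$. This particular form, the endpoints $(0, 0)$ and $(1, 1)$, and the sinusoidal deformation are all assumptions for the demonstration and are not taken from the text.

```python
import math

VX, VY = 1.0, 2.0  # assumed direction-dependent speeds along x and y

def f(xdot, ydot):
    # Integrand depends on the velocity components only, never on position.
    return math.sqrt((xdot / VX) ** 2 + (ydot / VY) ** 2)

def travel_time(alpha, n=4000):
    """Midpoint-rule integral of f over t in [0, 1] for the path
    r(t) = (t, t + alpha*sin(pi*t)), which joins (0, 0) to (1, 1) for any alpha."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        xdot = 1.0
        ydot = 1.0 + alpha * math.pi * math.cos(math.pi * t)
        total += f(xdot, ydot) * h
    return total

times = {alpha: travel_time(alpha) for alpha in (-0.3, -0.1, 0.0, 0.1, 0.3)}
for alpha, t in times.items():
    print(f"alpha = {alpha:+.1f}: {t:.6f}")
```

In this model the straight line ($\alpha = 0$) gives the smallest value among the sampled paths, even though the medium is anisotropic, in line with the conclusion that the extremals are straight lines whenever $\mathbf{r}$ itself is absent from the integrand.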
Exercises
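17.1.1 For $dy/dx \equiv y_x \neq 0$, show the equivalence of the two forms of Euler's equation:
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0 \quad \text{and} \quad \frac{\partial f}{\partial x} - \frac{d}{dx}\left[f - y_x \frac{\partial f}{\partial y_x}\right] = 0.$$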
17.1.2 Derive Euler's equation by expanding the integrand of
$$J(\alpha) = \int_{x_1}^{x_2} f\bigl[y(x, \alpha), y_x(x, \alpha), x\bigr]\, dx$$
in powers of $\alpha$, using a Taylor (Maclaurin) expansion with $y$ and $y_x$ as the two variables (Section 5.6).
Note. The stationary condition is $\partial J(\alpha)/\partial \alpha = 0$, evaluated at $\alpha = 0$. The terms quadratic in $\alpha$ may be useful in establishing the nature of the stationary solution (maximum, minimum, or saddle point).
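17.1.3 Find the Euler equation corresponding to Eq. (17.15) if $f = f(y_{xx}, y_x, y, x)$.
ANS.
$$\frac{d^2}{dx^2}\left(\frac{\partial f}{\partial y_{xx}}\right) - \frac{d}{dx}\left(\frac{\partial f}{\partial y_x}\right) + \frac{\partial f}{\partial y} = 0, \quad \eta(x_1) = \eta(x_2) = 0, \quad \eta_x(x_1) = \eta_x(x_2) = 0.$$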
17.1.4 The integrand $f(y, y_x, x)$ of Eq. (17.1) has the form
$$f(y, y_x, x) = f_1(x, y) + f_2(x, y)\, y_x.$$
(a) Show that the Euler equation leads to
$$\frac{\partial f_1}{\partial y} - \frac{\partial f_2}{\partial x} = 0.$$
(b) What does this imply for the dependence of the integral $J$ upon the choice of path?
17.1.5 Show that the condition that
$$J = \int f(x, y)\, dx$$
has a stationary value
(a) leads to $f(x, y)$ independent of $y$ and
(b) yields no information about any $x$-dependence.
We get no (continuous, differentiable) solution. To be a meaningful variational problem, dependence on $y_x$ or higher derivatives is essential.
Note. The situation will change when constraints are introduced (compare Exercise 17.7.7).