lem 8.4.30 suggested a way to prove Ptolemy's theorem using similar triangles. Now find an inversive proof. Hint: Invert about one of the vertices, sending the other three onto a line. The image distances are now more easily handled; then use the image-distance formula of 8.5.14(b) to relate them to the original side lengths.
8.5.50 (USAMO 2000) Let $A_1A_2A_3$ be a triangle and let $\omega_1$ be a circle in its plane passing through $A_1$ and $A_2$. Suppose there exist circles $\omega_2, \omega_3, \ldots, \omega_7$ such that for $k = 2, 3, \ldots, 7$, circle $\omega_k$ is externally tangent to $\omega_{k-1}$ and passes through $A_k$ and $A_{k+1}$, where $A_{n+3} = A_n$ for all $n \ge 1$. Prove that $\omega_7 = \omega_1$.
Chapter 9 Calculus
In this chapter, we take it for granted that you are familiar with basic calculus ideas like limits, continuity, differentiation, integration, and power series. On the other hand, we assume that you may have heard of, but not mastered:
• Formal "δ-ε" proofs
• Taylor series with "remainder"
• The mean value theorem
In contrast to, say, Chapter 7, this chapter is not a systematic, self-contained treatment.
Instead, we concentrate on just a few important ideas that enhance your understanding of how calculus works. Our goal is twofold: to uncover the practical meaning of some of the things that you have already studied, by developing useful reformulations of old ideas; and to enhance your intuitive understanding of calculus, by showing you some useful albeit non-rigorous "moving curtains." The meaning of this last phrase is best understood with an example.
9.1 The Fundamental Theorem of Calculus
To understand what a moving curtain is, we shall explore, in some detail, the most important idea of elementary calculus. This example also introduces a number of ideas that we will keep returning to throughout the chapter.
Example 9.1.1 What is the fundamental theorem of calculus (FTC), what does it mean, and why is it true?
Partial Solution: You have undoubtedly learned about the FTC. One formulation of it says that if f is a continuous function,¹ then
$$\int_a^b f(x)\,dx = F(b) - F(a), \tag{1}$$
where F is any antiderivative of f; i.e., F'(x) = f(x). This is a remarkable statement. The left-hand side of (1) can be interpreted as the area bounded by the graph of y = f(x), the x-axis, and the vertical lines x = a and x = b. The right-hand side has a completely different meaning, since it is related to f(x) by differentiation, the computation of the slope of the tangent line to the graph of a function. How in the world are areas and slopes related?

¹ In this chapter we will assume that the domain and range of all functions are subsets of the real numbers.
Stating it that way makes the FTC seem quite mysterious. Let us try to shed some light on it. On one level, the FTC is an amazing algorithmic statement, since in practice, antiderivatives are sometimes rather easy to compute. But that explains what it is, not why it is true. Understanding why it is true is a matter of choosing the proper interpretation of the entities in (1).
We start with the very useful "define a function" tool, which you have seen before (for example, 5.4.2). Let
$$g(t) := \int_a^t f(x)\,dx.$$
We chose the variable t on purpose, to make it easy to visualize g(t) as a function of time. As t increases from a, the function g(t) is computing the area of a "moving curtain," as seen below. Notice that g(a) = 0.
[Figure: the graph of y = f(x) above the x-axis, with the point a marked and the curtain region under the graph.]
Differentiation is not just about tangent lines: it has a dynamic interpretation as instantaneous rate of change. Thus g'(t) is equal to the rate of change of the area of the curtain at time t. With this in mind, look at the picture below: what does your intuition tell you the answer must be?
[Figure: the leading edge of the curtain advancing from t to t + Δt.]
The area grows fast when the leading edge of the curtain is tall, and it grows slowly when the leading edge is short. It makes intuitive sense that
$$g'(t) = f(t), \tag{2}$$
since in a small interval of time $\Delta t$, the curtain's area will grow by approximately $f(t)\,\Delta t$. Equation (2) immediately yields the FTC, because if we define F(t) := g(t) + C, where C is any constant, we have F'(t) = f(t) and
$$F(b) - F(a) = g(b) - g(a) = \int_a^b f(x)\,dx.$$
The crux move was to interpret the definite integral dynamically, and then observe the intuitive relationship between the speed that the area changes and the height of the curtain. This classic argument illustrates the critical importance of knowing many possible alternate interpretations of both differentiation and integration.
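The moving-curtain picture can be checked numerically. The following is a minimal sketch, not part of the original argument: it approximates the curtain area g(t) with the midpoint rule and compares the growth rate of that area with the height f(t) of the leading edge. The integrand `math.cos`, the values of `a` and `t`, and the helper names `curtain_area` and `curtain_growth_rate` are all assumptions chosen for illustration.

```python
import math

def curtain_area(f, a, t, steps=20_000):
    """Approximate g(t) = integral of f from a to t with the midpoint rule."""
    width = (t - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def curtain_growth_rate(f, a, t, dt=1e-3):
    """Difference quotient (g(t + dt) - g(t)) / dt, an estimate of g'(t)."""
    return (curtain_area(f, a, t + dt) - curtain_area(f, a, t)) / dt

f = math.cos            # hypothetical choice of integrand
a, t = 0.0, 1.2         # hypothetical left edge and current time
rate = curtain_growth_rate(f, a, t)
print(rate, f(t))       # the two numbers should nearly agree
```

Shrinking `dt` brings the two printed values closer together, which is the numerical shadow of equation (2). It is evidence, not a proof.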
You may argue that we have not proved the FTC rigorously, and indeed (2) deserves a more careful treatment. After all, the curtain does not grow by exactly $f(t)\,\Delta t$. The exact amount is equal to
$$\int_t^{t+\Delta t} f(x)\,dx,$$
which is equal to $f(t)\,\Delta t + E(t)$, where E(t) is the area of the "error," shown shaded below [note that E(t) is negative in this picture].
[Figure: the strip between t and t + Δt, with the error region E(t) shaded.]
Everything hinges on showing that
$$\lim_{\Delta t \to 0} \frac{E(t)}{\Delta t} = 0. \tag{3}$$
This requires an understanding of continuity; we will prove (3) in 9.2.7.
9.2 Convergence and Continuity
You already have an intuitive understanding of concepts like limits and continuity, but in order to tackle interesting problems, you must develop a rigorous wisdom. Luckily, almost everything stems from one fundamental idea: convergence of sequences. If you understand this, you can handle limits, and continuity, and differentiation, and integration. Convergence of sequences is the theoretical foundation of calculus.
Convergence
We say that the real-valued sequence $(a_n)$ converges to the limit L if
$$\lim_{n \to \infty} a_n = L.$$
What does this mean? That if we pick an arbitrary distance $\varepsilon > 0$, then eventually, and forever after, the $a_i$ will get within $\varepsilon$ of L. More specifically, for any $\varepsilon > 0$ (think of $\varepsilon$ as a really tiny number), there is an integer N (think of it as a really huge number, one that depends on $\varepsilon$) such that all of the numbers
$$a_N, a_{N+1}, a_{N+2}, \ldots$$
lie within $\varepsilon$ of L. In other words, for all $n \ge N$,
$$|a_n - L| < \varepsilon.$$
If the context is clear, we may use the abbreviation $a_n \to L$ for $\lim_{n \to \infty} a_n = L$.
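To see how N depends on $\varepsilon$, here is a small sketch under stated assumptions: the sequence $a_n = 1/n$ with limit 0 is a hypothetical example, and the helper `first_index_within` is an illustrative name, not anything from the text. The function inspects only finitely many terms, so it illustrates the definition rather than verifying it.

```python
def first_index_within(a, L, eps, n_max=10**5):
    """Smallest N such that |a(n) - L| < eps for every n with N <= n <= n_max.

    Only finitely many terms are inspected, so this illustrates the
    definition of convergence; it does not prove convergence.
    """
    N = n_max + 1
    for n in range(n_max, 0, -1):   # scan backwards from the tail
        if abs(a(n) - L) < eps:
            N = n
        else:
            break
    return N

a = lambda n: 1 / n                 # hypothetical example sequence, limit 0
for eps in (0.1, 0.01, 0.001):
    print(eps, first_index_within(a, 0.0, eps))
```

Each tenfold decrease in `eps` pushes the threshold N out by roughly a factor of ten, which is the sense in which N "depends on $\varepsilon$."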
In practice, there are several possible methods of showing that a given sequence converges to a limit.
1. Draw pictures whenever possible. Pictures rarely supply rigor, but often furnish the key ideas that make an argument both lucid and correct.
2. Somehow guess the limit L, and then show that the $a_i$ get arbitrarily close to L.
3. Show that the $a_i$ eventually get arbitrarily close to one another. More precisely, a sequence $(a_n)$ possesses the Cauchy property if for any (very tiny) $\varepsilon > 0$ there is a (huge) N such that
$$|a_m - a_n| < \varepsilon$$
for all $m, n \ge N$. If a sequence of real numbers has the Cauchy property, it converges.² The Cauchy property is often fairly easy to verify, but the disadvantage is that one doesn't get any information about the actual limiting value of the sequence.
4. Show that the sequence is bounded and monotonic. A sequence $(a_n)$ is bounded if there is a finite number B such that $|a_n| \le B$ for all n. The sequence is monotonic if it is either non-increasing or non-decreasing. For example, $(a_n)$ is monotonically non-increasing if $a_{n+1} \le a_n$ for all n.
Bounded monotonic sequences are good, because they always converge. To see this, argue by contradiction: if the sequence did not converge, it would not have the Cauchy property, etc. But please note: the limit of the sequence need not be the bound B ! Construct an example to make sure you understand this.
5. The Squeeze Principle. Show that the terms of the sequence are bounded above and below by the terms of two convergent sequences that converge to the same limit. For example, suppose that for all n, we have
$$0 < x_n < (0.9)^n.$$
This forces $x_n \to 0$. Conversely, if the terms of a sequence are greater in absolute value than the corresponding terms of a sequence that diverges (has infinite limit), then the sequence in question also diverges.

² See [36] for more information about this and other "foundational" issues regarding the real numbers.
6. Use Big-Oh and Little-oh Analysis. Most convergence investigations require estimates and comparisons. The big- and little-oh notations give us a systematic way to describe growth rates of functions as the variable tends towards infinity or zero.
We say that $f(x) = O(g(x))$ ("f is big-Oh g") if there exists a constant C such that $|f(x)| \le C|g(x)|$ for all sufficiently large x. We say that $f(x) = o(g(x))$ if $\lim_{x \to \infty} f(x)/g(x) = 0$. For example, $f(x) = O(x^3)$ means that, for large enough x, we can bound f(x) by a cubic. On the other hand, $f(x) = o(x^3)$ means that f(x) grows fundamentally slower than a cubic.
We can also use this notation to describe behavior near zero. If we say $f(x) = O(g(x))$ "as $x \to 0$," this means that f(x) is bounded by a constant multiple of g(x) for sufficiently small, but nonzero values of x. Likewise, we can define $f(x) = o(g(x))$ as $x \to 0$.
This notation is useful for two reasons. First, it allows us to focus on the parts of a function "that matter." For example, when x is small, it may be very helpful to know that $f(x) = x + O(\sqrt{x})$ as $x \to 0$, especially if we are comparing it, say, with another function that is $x + O(\sqrt[3]{x})$ as $x \to 0$. Second, we can do simple algebra with the "oh" functions. For example, if $f(x) = O(x^2)$, then $x f(x) = O(x^3)$, etc.
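The big-Oh notation near zero can be made concrete with a quick numerical experiment. The sketch below uses the standard fact that $\sin x = x + O(x^3)$ as $x \to 0$ (an example chosen here, not one taken from the text) and checks that the ratio $|\sin x - x| / |x|^3$ stays bounded at a few hypothetical sample points.

```python
import math

def big_oh_ratio(x):
    """|sin(x) - x| / |x|**3, which stays bounded as x -> 0."""
    return abs(math.sin(x) - x) / abs(x) ** 3

samples = [1.0, 0.5, 0.1, 0.01, 0.001]   # hypothetical sample points
ratios = [big_oh_ratio(x) for x in samples]
print(ratios)                             # all close to, and below, 1/6
```

A bounded ratio is exactly what "bounded by a constant multiple of g(x)" means; here the constant C = 1/6 works.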
The next few examples illustrate some of these ideas. In the first example, our goals are modest: just to find some decent bounds for an infinite sequence. However, the process is instructive.
Example 9.2.1 Let $a_n = (1 + 1/2)(1 + 1/4) \cdots (1 + 1/2^n)$. Find upper and lower bounds a, b such that $a \le \lim_{n \to \infty} a_n \le b$.
Solution: Define the product
$$S(x, n) = (1 + x)(1 + x^2) \cdots (1 + x^n),$$
where 0 < x < 1. What we are interested in is the limiting value of S(x, n) as $n \to \infty$, which we will denote by S(x).
By multiplying out but ignoring repeated terms, it is clear that
$$S(x) > 1 + x + x^2 + x^3 + \cdots = \frac{1}{1 - x}, \tag{4}$$
since all powers of x will appear in the product (with coefficients of at least 1).
To get an inequality going the other direction, we need a more subtle analysis. We claim that for any positive integer m, we have
$$\left(1 + \frac{1}{m}\right)^m < e.$$
To see why this is true, use the binomial theorem³ to get
$$\left(1 + \frac{1}{m}\right)^m = 1 + m\left(\frac{1}{m}\right) + \binom{m}{2}\frac{1}{m^2} + \binom{m}{3}\frac{1}{m^3} + \binom{m}{4}\frac{1}{m^4} + \cdots$$
$$= 1 + 1 + \frac{m(m-1)}{2!\,m^2} + \frac{m(m-1)(m-2)}{3!\,m^3} + \frac{m(m-1)(m-2)(m-3)}{4!\,m^4} + \cdots$$
$$< 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots = e.$$
Consequently, $1 + 1/m < e^{1/m}$. (The same inequality, $1 + y < e^y$, holds for every real $y > 0$, which covers the case $y = x^k$.) So we have
$$S(x, n) = (1 + x)(1 + x^2) \cdots (1 + x^n) < e^{x} e^{x^2} \cdots e^{x^n} = e^{x + x^2 + \cdots + x^n}.$$
Summing the geometric series, we conclude that
$$\frac{1}{1 - x} < S(x) < e^{x/(1 - x)},$$
for 0 < x < 1. Can these bounds be improved? ∎
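For the original sequence, x = 1/2, so the bounds read $2 < \lim a_n < e$. The following sketch (the helper name `partial_product` and the cutoff n = 60 are assumptions made for illustration) computes a partial product and compares it with both bounds.

```python
import math

def partial_product(x, n):
    """S(x, n) = (1 + x)(1 + x**2)...(1 + x**n)."""
    product = 1.0
    for k in range(1, n + 1):
        product *= 1.0 + x ** k
    return product

x = 0.5
lower = 1.0 / (1.0 - x)            # 1/(1 - x)
upper = math.exp(x / (1.0 - x))    # e^{x/(1 - x)}
a_60 = partial_product(x, 60)      # a_n for n = 60, very close to the limit
print(lower, a_60, upper)
```

The computed value sits comfortably between the two bounds, which suggests that both can indeed be tightened.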
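Example 9.2.2 Fix a > 1, and consider the sequence $(x_n)_{n \ge 0}$ defined by $x_0 = a$, and
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right), \quad n = 0, 1, 2, \ldots.$$
Does this sequence converge, and if so, to what?
Solution: Let us try an example where a = 5. Then we have
$$x_0 = 5, \qquad x_1 = \frac{1}{2}\left(5 + \frac{5}{5}\right) = 3, \qquad x_2 = \frac{1}{2}\left(3 + \frac{5}{3}\right) = \frac{7}{3}.$$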
Observe that the values (so far) are strictly decreasing. Will this always be the case?
Let us visualize the evolution of the sequence. If we draw the graphs of y = 5/x and y = x, we can construct a neat algorithm for producing the values of this sequence, for $x_{n+1}$ is the average of the two numbers $x_n$ and $5/x_n$. In the picture below, the y-coordinates of points B and A are respectively $x_0$ and $5/x_0$. Notice that the y-coordinate of the midpoint of the line segment AB is the average of these two numbers, which is equal to $x_1$.
[Figure: the graphs of y = x and y = 5/x.]

³ Or use the fact that $\lim_{m \to \infty} (1 + 1/m)^m = e$ along with an analysis of the derivative of the function $f(x) = (1 + 1/x)^x$ to show that this limit is attained from below.
Next, draw a horizontal line left from this midpoint until it intersects the graph of y = x (at C). The coordinates of C are $(x_1, x_1)$, and we can drop a vertical line from C until it meets the graph of y = 5/x (at D). By the same reasoning as before, $x_2$ is the y-coordinate of the midpoint of segment CD.
Continuing this process, we reach the point $E = (x_2, x_2)$, and it seems clear from this picture that if we keep going, we will converge to the intersection of the two graphs, which is the point $(\sqrt{5}, \sqrt{5})$.
Thus we conjecture that $\lim_{n \to \infty} x_n = \sqrt{5}$. However, the picture is not a rigorous proof, but an aid to reasoning. To show convergence with this picture, we would need to argue carefully why we will never "bounce away" from the convergence point.
While it is possible to rigorize this, let's change gears and analyze the general problem algebraically.
The picture suggests two things: that the sequence decreases monotonically, and that it decreases to $\sqrt{a}$. To prove monotonicity, we must show that $x_{n+1} \le x_n$. This is easy to do by computing the difference
$$x_n - x_{n+1} = x_n - \frac{x_n^2 + a}{2x_n} = \frac{2x_n^2 - x_n^2 - a}{2x_n} = \frac{x_n^2 - a}{2x_n},$$
which is non-negative as long as $x_n^2 \ge a$. And this last inequality is true; it is a simple consequence of the AM-GM inequality (see page 176):
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) \ge \sqrt{x_n \cdot \frac{a}{x_n}} = \sqrt{a},$$
so $x_{n+1} \ge \sqrt{a}$ no matter what positive value $x_n$ has.⁴ Since $x_0 = a > \sqrt{a}$, all terms of the sequence are greater than or equal to $\sqrt{a}$.

⁴ Instead of studying the difference $x_n - x_{n+1}$, it is just as easy to look at the ratio $x_{n+1}/x_n$. This is always less than or equal to 1 (using a little algebra and the fact that $x_n \ge \sqrt{a}$).
Since the sequence is monotonic and bounded, it must converge. Now let us show that it converges to $\sqrt{a}$. Since 0 is a much easier number to work with, let us define the sequence of "error" values $E_n$ by
$$E_n := x_n - \sqrt{a},$$
and show that $E_n \to 0$. Note that the $E_n$ are all non-negative. Now we look at the ratio of $E_{n+1}$ to $E_n$ to see how the error changes, hoping that it decreases dramatically. We have (aren't you glad you studied factoring in Section 5.2?)
$$E_{n+1} = x_{n+1} - \sqrt{a} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) - \sqrt{a} = \frac{x_n^2 + a - 2x_n\sqrt{a}}{2x_n} = \frac{(x_n - \sqrt{a})^2}{2x_n} = \frac{E_n^2}{2x_n}.$$
Thus
$$\frac{E_{n+1}}{E_n} = \frac{E_n}{2x_n} = \frac{x_n - \sqrt{a}}{2x_n} < \frac{x_n}{2x_n} = \frac{1}{2}.$$
Since this ratio is also positive, we are guaranteed that $\lim_{n \to \infty} E_n = 0$, using the squeeze principle on page 318.
We are done; we have shown that $x_n \to \sqrt{a}$. ∎
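The two facts just proved, monotone decrease and an error ratio below 1/2, are easy to watch in action. This is a minimal sketch for a = 5; the helper name `averaging_sequence` and the choice of four iterations are assumptions for illustration (beyond a few more steps the error drops to the level of floating-point noise, so the ratios stop being meaningful).

```python
import math

def averaging_sequence(a, terms):
    """Return [x_0, ..., x_terms] with x_0 = a and x_{n+1} = (x_n + a/x_n)/2."""
    xs = [float(a)]
    for _ in range(terms):
        x = xs[-1]
        xs.append(0.5 * (x + a / x))
    return xs

a = 5
xs = averaging_sequence(a, 4)
errors = [x - math.sqrt(a) for x in xs]          # E_n = x_n - sqrt(a)
for n in range(len(xs) - 1):
    print(n, xs[n], errors[n + 1] / errors[n])   # ratio E_{n+1} / E_n
```

The printed ratios are not merely below 1/2; they shrink rapidly, reflecting the identity $E_{n+1} = E_n^2 / (2x_n)$.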
The trickiest part in the example above was guessing that the limit was $\sqrt{a}$. What if we hadn't been lucky enough to have a nice picture? There is a simple but very productive tool that often works when a sequence is defined recursively. Let us apply it to the previous example. If $x_n \to L$, then for really large n, both $x_n$ and $x_{n+1}$ approach L. Thus, as n approaches infinity, the equation $x_{n+1} = (x_n + a/x_n)/2$ becomes
$$L = \frac{1}{2}\left(L + \frac{a}{L}\right),$$
and a tiny bit of algebra yields $L = \sqrt{a}$. This "solve for the limit" tool does not prove that the limit exists, but it does show us what the limit must equal if it exists.
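For completeness, the "tiny bit of algebra" can be written out as follows. This is one way to do it, assuming (as shown in Example 9.2.2) that every term satisfies $x_n \ge \sqrt{a} > 0$, so that a limit L, if it exists, is positive and the negative root can be discarded.

```latex
\begin{align*}
L = \frac{1}{2}\left(L + \frac{a}{L}\right)
  &\implies 2L^2 = L^2 + a   && \text{(multiply both sides by } 2L \ne 0\text{)} \\
  &\implies L^2 = a \\
  &\implies L = \sqrt{a}     && \text{(since } L > 0\text{)}.
\end{align*}
```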
Here is a tricky problem that has several solutions. We will present one that employs big-Oh estimates.
Example 9.2.3 (Putnam 1990) Is $\sqrt{2}$ the limit of a sequence of numbers of the form $\sqrt[3]{n} - \sqrt[3]{m}$ $(n, m = 0, 1, 2, \ldots)$?
Partial Solution: Your intuition should suggest that the answer is yes, because, for "large enough integers, cube roots get closer to each other, so we can approach any number." Let's sketch a solution that formalizes this idea.
1. First, show that $\sqrt[3]{n+1} - \sqrt[3]{n} = O(n^{-2/3})$.
2. All that matters about $O(n^{-2/3})$ is that it can be made arbitrarily small for large enough n. So now let's try to get differences of cube roots close to a particular number, say, $\pi$. If we wanted to get just moderately close, we could look at the sequence
$$\sqrt[3]{27}, \sqrt[3]{28}, \sqrt[3]{29}, \sqrt[3]{30}, \ldots$$
that begins at $3 < \pi$ and partitions the rest of the number line into intervals that are no wider than $\sqrt[3]{28} - 3$. If we wanted to get even closer, we could start at 3, as before, but represented by the difference $\sqrt[3]{10^3} - \sqrt[3]{7^3}$. So now we have the more finely spaced sequence
$$3, \ \sqrt[3]{1001} - 7, \ \sqrt[3]{1002} - 7, \ \ldots.$$
3. Make this rigorous, and general (not just for $\pi$), and you're done.
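The idea in the outline above can be tried numerically for the target $\sqrt{2}$. In this sketch, which is an illustration under stated assumptions and not the rigorous argument the problem asks for, we take $m = k^3$ and choose n as the integer part of $(\sqrt{2} + k)^3$; the helper name `cube_root_difference` and the sample values of k are hypothetical, and cube roots are computed in floating point.

```python
import math

def cube_root_difference(target, k):
    """Take m = k**3 and n = floor((target + k)**3), up to float rounding.

    Returns (n, m, value), where value approximates n**(1/3) - m**(1/3).
    """
    m = k ** 3
    n = math.floor((target + k) ** 3)
    value = n ** (1.0 / 3.0) - k      # cube root of m is exactly k
    return n, m, value

target = math.sqrt(2)
for k in (7, 100, 1000):              # hypothetical sample values of k
    n, m, value = cube_root_difference(target, k)
    print(k, n, m, abs(value - target))
```

The gap to $\sqrt{2}$ shrinks roughly like $k^{-2}$, consistent with the estimate $\sqrt[3]{n+1} - \sqrt[3]{n} = O(n^{-2/3})$ when n is about $k^3$.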
Continuity
Informally, a function is continuous if it is possible to draw its graph without lifting the pencil. Of the many equivalent formal definitions, the following is one of the easiest to use.
Let $f : D \to \mathbb{R}$ and let $a \in D$. We say that f is continuous at a if
$$\lim_{n \to \infty} f(x_n) = f(a)$$
for all sequences $(x_n)$ in D with limit a.
We call f continuous on the set D if f is continuous at all points in D.
Continuity is a condition that you probably take for granted. This is because virtually every function that you have encountered (certainly most that can be written with a simple formula) is continuous.⁵ For example, all elementary functions (finite combinations of polynomials, rational functions, trig and inverse trig functions, exponential and logarithmic functions, and radicals) are continuous at all points in their domains.
Consequently, we will concentrate on the many good properties that continuous functions possess. Here are two extremely useful ones.
Intermediate-Value Theorem (IVT) If f is continuous on the closed interval [a, b], then f assumes all values between f(a) and f(b). In other words, if y lies between f(a) and f(b), then there exists $x \in [a, b]$ such that f(x) = y.
Extreme-Value Theorem If f is continuous on the closed interval [a, b], then f attains minimum and maximum values on this interval. In other words, there exist $u, v \in [a, b]$ such that $f(u) \le f(x)$ and $f(v) \ge f(x)$ for all $x \in [a, b]$.
The extreme-value theorem seems almost without content, but examine the hypothesis carefully. If the domain is not a closed interval, then the conclusion can fail. For example, f(x) := 1/x is continuous on (0, 5), but achieves neither maximum nor minimum values on this interval.

⁵ Notable exceptions are the floor and ceiling functions $\lfloor x \rfloor$ and $\lceil x \rceil$.
On the other hand, the IVT, while "obvious" (see Problem 9.2.22 for hints about its proof), has many immediate applications. Here is one simple example. The crux move, defining a new function, is a typical tactic in problems of this kind.
Example 9.2.4 Let $f : [0, 1] \to [0, 1]$ be continuous. Prove that f has a fixed point; i.e., there exists $x \in [0, 1]$ such that f(x) = x.
Solution: Let g(x) := f(x) - x. Note that g is continuous (since it is the difference of two continuous functions), and that $g(0) = f(0) \ge 0$ and $g(1) = f(1) - 1 \le 0$. By the IVT, there exists $u \in [0, 1]$ such that g(u) = 0. But this implies that f(u) = u. ∎
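The IVT only asserts that a fixed point exists, but the same auxiliary function g can be used to locate one numerically. The sketch below uses bisection, a technique not discussed in the text, on the hypothetical example f = cos, which maps [0, 1] into itself; the helper name `fixed_point` and the tolerance are assumptions.

```python
import math

def fixed_point(f, lo=0.0, hi=1.0, tol=1e-12):
    """Bisect g(x) = f(x) - x on [lo, hi], assuming g(lo) >= 0 >= g(hi)."""
    g = lambda x: f(x) - x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0:
            lo = mid      # g(lo) >= 0 >= g(hi) still holds on [mid, hi]
        else:
            hi = mid      # g(lo) >= 0 >= g(hi) still holds on [lo, mid]
    return 0.5 * (lo + hi)

u = fixed_point(math.cos)   # cos maps [0, 1] into [0, 1]
print(u, math.cos(u))       # the two values should nearly agree
```

Each step halves an interval on whose endpoints g has opposite signs (or vanishes), so the IVT guarantees a zero of g inside it at every stage.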
Uniform Continuity
Continuous functions on a closed interval (i.e., the domain is a closed interval) possess another important property, that of uniform continuity. Informally, this means that the amount of "wiggle" in the graph is constrained in the same way throughout the domain. More precisely,
A function $f : A \to B$ is uniformly continuous on A if, for each $\varepsilon > 0$, there exists $\delta > 0$ such that if $x_1, x_2 \in A$ satisfy $|x_1 - x_2| < \delta$, then $|f(x_1) - f(x_2)| < \varepsilon$.
The important thing in this definition is that the value of $\delta$ depends only on $\varepsilon$ and not on the x-value. For each positive $\varepsilon$, there is a single $\delta$ that works everywhere on the domain. Because it is rather difficult to prove that all continuous functions on closed intervals are uniformly continuous, the concept of uniform continuity is not often introduced in elementary calculus classes. But it is such a useful idea that we will accept it, for now, on faith.⁶
Example 9.2.5 The function $f(x) = x^2$ is uniformly continuous on [-3, 3]. As long as $|x_1 - x_2| < \delta$, we are guaranteed that $|f(x_1) - f(x_2)| < 6\delta$. It is easy to see why: For any $x_1, x_2 \in [-3, 3]$, the largest possible value for $|x_1 + x_2|$ is 6, and then
$$|f(x_1) - f(x_2)| = |x_1 + x_2| \cdot |x_1 - x_2| \le 6|x_1 - x_2|.$$
Consequently, if we want to be sure that the function values are within $\varepsilon$, we need only require that the x-values be within $\varepsilon/6$.
Example 9.2.6 The function f(x) = 1/x defined on $(0, \infty)$ is not uniformly continuous. For x-values close to 0, the function changes too fast. Given an $\varepsilon$, no single $\delta$ will do, if the x-values are sufficiently close to 0. Note, however, that on any closed interval, f(x) is uniformly continuous. For example, verify that if we are restricting our attention to $x \in [2, 1000]$, then the "$\delta$ response" to the "$\varepsilon$ challenge" is $\delta = 4\varepsilon$.
In other words, if we are challenged to constrain the f-values to be within $\varepsilon$ of each other, we need only choose x-values within $4\varepsilon$ of one another.
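The $\delta$ responses in Examples 9.2.5 and 9.2.6 can be spot-checked by random sampling. The sketch below is only an experiment under stated assumptions (the helper name `worst_gap`, the value of `eps`, the number of trials, and the test interval [1e-6, 1] near zero are all hypothetical choices); sampling can expose a bad $\delta$ but cannot prove that a good one works.

```python
import random

def worst_gap(f, lo, hi, delta, trials=100_000, seed=0):
    """Largest |f(x1) - f(x2)| seen over random pairs in [lo, hi]
    whose x-values differ by at most delta."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x1 = rng.uniform(lo, hi)
        x2 = min(hi, max(lo, x1 + rng.uniform(-delta, delta)))
        worst = max(worst, abs(f(x1) - f(x2)))
    return worst

eps = 0.01
gap_square = worst_gap(lambda x: x * x, -3.0, 3.0, eps / 6)     # Example 9.2.5
gap_far = worst_gap(lambda x: 1 / x, 2.0, 1000.0, 4 * eps)      # Example 9.2.6
gap_near_zero = worst_gap(lambda x: 1 / x, 1e-6, 1.0, 4 * eps)  # same delta, x near 0
print(gap_square, gap_far, gap_near_zero)
```

The first two gaps stay below `eps`, while the third is far larger: the $\delta$ that works on [2, 1000] is useless once x-values near 0 are allowed.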
⁶ Consult any of the texts mentioned on page 357.