THE MEAN VALUE THEOREM AND CRITERIA FOR DIFFERENTIABILITY

Part of the document: Hubbard, J., Vector Calculus, Linear Algebra, and Differential Forms, 5th ed., 2015 (pages 162-169)

The segment [a, b] joining a to b is the image of the map

t ↦ (1 − t)a + tb,   for 0 ≤ t ≤ 1.

I turn with terror and horror from this lamentable scourge of continuous functions with no derivatives. - Charles Hermite, in a letter to Thomas Stieltjes, 1893

In this section we discuss two applications of the mean value theorem. The first extends that theorem to functions of several variables, and the second gives a criterion for determining when a function is differentiable.

The mean value theorem for functions of several variables

The derivative measures the difference of the values of functions at different points. For functions of one variable, the mean value theorem (Theorem 1.6.13) says that if f : [a, b] → ℝ is continuous, and f is differentiable on (a, b), then there exists c ∈ (a, b) such that

f(b) − f(a) = f′(c)(b − a).   1.9.1

The analogous statement in several variables is the following.

Theorem 1.9.1 (Mean value theorem for functions of several variables). Let U ⊂ ℝⁿ be open, let f : U → ℝ be differentiable, and let the segment [a, b] joining a to b be contained in U. Then there exists c₀ ∈ [a, b] such that

f(b) − f(a) = [Df(c₀)](b − a).   1.9.2

Why do we write inequality 1.9.3 with the sup, rather than

|f(b) − f(a)| ≤ |[Df(c)]| |b − a|,

which of course is also true? Using the sup means that we do not need to know the value of c in order to relate how fast f is changing to its derivative; we can run through all c ∈ [a, b] and choose the one where the derivative is greatest.

This will be useful in Section 2.8 when we discuss Lipschitz ratios.

FIGURE 1.9.1.

The graph of the function f defined in equation 1.9.7 is made up of straight lines through the origin, so if you leave the origin in any direction, the directional derivative in that direction certainly exists. Both axes are among the lines making up the graph, so the directional derivatives in those directions are 0. But clearly there is no tangent plane to the graph at the origin.

146 Chapter 1. Vectors, matrices, and derivatives

Corollary 1.9.2. If f is a function as defined in Theorem 1.9.1, then

|f(b) − f(a)| ≤ ( sup_{c ∈ [a, b]} |[Df(c)]| ) |b − a|.   1.9.3

Proof of Corollary 1.9.2. This follows immediately from Theorem 1.9.1 and Proposition 1.4.11. □
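A quick numeric sketch (not from the text) can make Theorem 1.9.1 and Corollary 1.9.2 concrete. The function f, the points a and b, and the scan over the segment below are all hypothetical choices for illustration.

```python
import math

def f(x, y):                       # hypothetical sample function, not from the text
    return x * x + 3 * y

def grad_f(x, y):                  # its Jacobian matrix [D1f, D2f] = [2x, 3]
    return [2 * x, 3.0]

a, b = [0.0, 0.0], [1.0, 2.0]
diff = [bi - ai for ai, bi in zip(a, b)]          # the vector b - a

def on_segment(t):                 # the segment [a, b] as the image of t -> (1 - t)a + tb
    return [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]

# Theorem 1.9.1: scan the segment for a point c0 with f(b) - f(a) = [Df(c0)](b - a)
target = f(*b) - f(*a)
err, t0 = min(
    (abs(sum(g * d for g, d in zip(grad_f(*on_segment(t)), diff)) - target), t)
    for t in [i / 1000 for i in range(1001)]
)
print(t0, err)                     # here c0 sits at t = 0.5, with error 0

# Corollary 1.9.2: |f(b) - f(a)| <= (sup over the segment of |[Df(c)]|) |b - a|
sup_norm = max(math.hypot(*grad_f(*on_segment(i / 1000))) for i in range(1001))
assert abs(target) <= sup_norm * math.hypot(*diff)
```

Because this f is quadratic, the mean value point happens to be the midpoint of the segment; for a general differentiable f only the existence of some c₀ is guaranteed.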

Proof of Theorem 1.9.1. As t varies from 0 to 1, the point (1 − t)a + tb moves from a to b. Consider the mapping g(t) = f((1 − t)a + tb). By the chain rule, g is differentiable, and by the one-variable mean value theorem, there exists t₀ ∈ (0, 1) such that

g(1) − g(0) = g′(t₀)(1 − 0) = g′(t₀).   1.9.4

Set c₀ = (1 − t₀)a + t₀b. By Proposition 1.7.14, we can express g′(t₀) in terms of the derivative of f:

g′(t₀) = lim_{s→0} ( g(t₀ + s) − g(t₀) ) / s

       = lim_{s→0} ( f(c₀ + s(b − a)) − f(c₀) ) / s = [Df(c₀)](b − a).   1.9.5

So equation 1.9.4 reads

f(b) − f(a) = [Df(c₀)](b − a). □   1.9.6

Differentiability and pathological functions

Most often the Jacobian matrix of a function is its derivative. But as we mentioned in Section 1.7, there are exceptions. It is possible for all partial derivatives of f to exist, and even all directional derivatives, and yet for f not to be differentiable! In such a case the Jacobian matrix exists but does not represent the derivative.

Example 1.9.3 (Nondifferentiable function with Jacobian matrix).

This happens even for the innocent-looking function

f(x, y) = x²y / (x² + y²)   1.9.7

shown in Figure 1.9.1. Actually, we should write this function as

f(x, y) = x²y / (x² + y²)   if (x, y) ≠ (0, 0),
f(x, y) = 0                 if (x, y) = (0, 0).   1.9.8

You have probably learned to be suspicious of functions that are defined by different formulas for different values of the variable. In this case, the value at (0, 0) is really natural, in the sense that as (x, y) approaches (0, 0), the function f approaches 0. This is not one of those functions whose value takes a sudden jump; indeed, f is continuous everywhere. Away from the origin, this is obvious by Corollary 1.5.31: away from the origin, f is a rational function whose denominator does not vanish. So we can compute both its partial derivatives at any point (x, y) ≠ (0, 0).

"Identically" means "at every point."

If we change the function of Example 1.9.3, replacing the x²y in the numerator of x²y/(x² + y²) by xy, then the resulting function, which we'll call g, will not be continuous at the origin. If x = y, then g = 1/2 no matter how close (x, y) gets to the origin.

That f is continuous at the origin requires a little checking, as follows.

If x² + y² = r², then |x| ≤ r and |y| ≤ r, so |x²y| ≤ r³. Therefore |f(x, y)| ≤ r³/r² = r, and

lim_{(x,y)→(0,0)} f(x, y) = 0.   1.9.9

So f is continuous at the origin. Moreover, f vanishes identically on both axes, so both partial derivatives of f vanish at the origin.

So far, f looks perfectly civilized: it is continuous, and both partial derivatives exist everywhere. But consider the derivative in the direction of the vector v = (1, 1):

lim_{t→0} ( f((0, 0) + t(1, 1)) − f(0, 0) ) / t = lim_{t→0} t³/(2t³) = 1/2.   1.9.10

This is not what we get when we compute the same directional derivative by multiplying the Jacobian matrix of f by the vector (1, 1), as on the right side of equation 1.7.38:

[D₁f(0, 0), D₂f(0, 0)] (1, 1) = [0, 0] (1, 1) = 0,   1.9.11

where the row matrix [D₁f(0, 0), D₂f(0, 0)] is the Jacobian matrix [Jf(0)].

Thus, by Proposition 1.7.14, f is not differentiable. △
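The mismatch between equations 1.9.10 and 1.9.11 can be reproduced directly; the following sketch evaluates the difference quotient of Example 1.9.3 along v = (1, 1).

```python
def f(x, y):                       # the function of Example 1.9.3
    return 0.0 if (x, y) == (0.0, 0.0) else x * x * y / (x * x + y * y)

# The limit in equation 1.9.10: along v = (1, 1), the difference quotient
# (f(tv) - f(0)) / t equals t^3 / (2 t^3) = 1/2 for every t != 0.
quotients = [(f(t, t) - f(0.0, 0.0)) / t for t in (0.1, 0.01, 0.001)]
assert all(abs(q - 0.5) < 1e-12 for q in quotients)

# Equation 1.9.11: the Jacobian matrix at the origin is [0, 0], since f
# vanishes on both axes, so [Jf(0)] v = 0 -- it does not predict 1/2.
jf0 = [0.0, 0.0]
assert jf0[0] * 1 + jf0[1] * 1 == 0.0
print("directional derivative is 1/2, Jacobian predicts 0: f is not differentiable")
```

The quotient is 1/2 at every t, not just in the limit, because f is homogeneous of degree 1 along rays through the origin.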

Things can get worse. The function we just discussed is continuous, but it is possible for all directional derivatives of a function to exist, and yet for the function not to be continuous, or even bounded in a neighborhood of 0. For instance, the function discussed in Example 1.5.24 is not continuous in a neighborhood of the origin; if we redefine it to be 0 at the origin, then all directional derivatives exist everywhere, but the function is still not continuous. Exercise 1.9.2 provides another example. Thus knowing that a function has partial derivatives or directional derivatives does not tell you either that the function is differentiable or even that it is continuous.

Even knowing that a function is differentiable tells you less than you might think. The function in Example 1.9.4 has a positive derivative at a point although it does not increase in any neighborhood of that point!

Example 1.9.4 (A differentiable yet pathological function). Consider the function f : ℝ → ℝ defined by

f(x) = x/2 + x² sin(1/x),   1.9.12

FIGURE 1.9.2.

Graph of the function f(x) = x/2 + 6x² sin(1/x).

The derivative of f does not have a limit at the origin, but the curve still has slope 1/2 there.


a variant of which is shown in Figure 1.9.2. To be precise, one should add f(0) = 0, since sin(1/x) is not defined there, but this is the only reasonable value, since

lim_{x→0} x² sin(1/x) = 0.   1.9.13

Moreover, we will see that the function f is differentiable at the origin.

This is one case where you must use the definition of the derivative as a limit; you cannot use the rules for computing derivatives blindly. In fact, let's try. We find

f′(x) = 1/2 + 2x sin(1/x) + x² cos(1/x) · (−1/x²) = 1/2 + 2x sin(1/x) − cos(1/x).   1.9.14

This formula is certainly correct for x ≠ 0, but f′(x) doesn't have a limit when x → 0. Indeed,

lim_{x→0} ( 1/2 + 2x sin(1/x) ) = 1/2   1.9.15

does exist, but cos(1/x) oscillates infinitely many times between −1 and 1.

So f' will oscillate from a value near -1/2 to a value near 3/2. This does not mean that f isn't differentiable at 0. We can compute the derivative at 0 using the definition of the derivative:

f′(0) = lim_{h→0} (1/h) ( (0 + h)/2 + (0 + h)² sin(1/(0 + h)) ) = lim_{h→0} (1/h) ( h/2 + h² sin(1/h) )

     = 1/2 + lim_{h→0} h sin(1/h) = 1/2,   1.9.16

since (Theorem 1.5.26, part 5) lim_{h→0} h sin(1/h) exists, and indeed vanishes.

Finally, we can see that although the derivative at 0 is positive, the function is not increasing in any neighborhood of 0, since in any interval arbitrarily close to 0 the derivative takes negative values; as we saw above, it oscillates from a value near −1/2 to a value near 3/2. △
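Both claims of Example 1.9.4 can be checked numerically; this sketch (the sample points are my own choices) confirms the difference quotient at 0 and exhibits negative values of f′ arbitrarily close to 0.

```python
import math

def f(x):                          # the function of Example 1.9.4, with f(0) = 0
    return 0.0 if x == 0 else x / 2 + x * x * math.sin(1 / x)

# Equation 1.9.16: the difference quotient at 0 is 1/2 + h sin(1/h) -> 1/2
for h in (1e-3, 1e-5, 1e-7):
    q = (f(h) - f(0)) / h
    assert abs(q - 0.5) <= h + 1e-15   # |h sin(1/h)| <= h

def fprime(x):                     # formula 1.9.14, valid only for x != 0
    return 0.5 + 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Yet f' < 0 arbitrarily close to 0: at x = 1/(2k pi) we have cos(1/x) = 1
# and sin(1/x) = 0, so f'(x) = 1/2 + 0 - 1 = -1/2.
for k in (10, 1000, 100000):
    assert fprime(1 / (2 * k * math.pi)) < 0
print("f'(0) = 1/2, but f increases on no neighborhood of 0")
```

A derivative that is positive at a single point, without continuity of f′, controls nothing nearby; the points 1/(2kπ) accumulate at 0 and carry slope near −1/2.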

This is very bad. Our whole point is that locally, the function should behave like its best linear approximation, and in this case it emphatically does not. We could easily make up examples in several variables where the same thing occurs: where the function is differentiable, so that the Jacobian matrix represents the derivative, but where that derivative doesn't tell you much. Of course we don't claim that derivatives are worthless. The problem in these pathological cases is that the partial derivatives of the function are not continuous.

Example 1.9.5 (Discontinuous derivatives). Let us go back to the function of Example 1.9.3:

f(x, y) = x²y / (x² + y²)   if (x, y) ≠ (0, 0),
f(x, y) = 0                 if (x, y) = (0, 0),   1.9.17

If you come across a function that is not continuously differentiable, you should be aware that none of the usual tools of calculus can be relied upon. Each such function is an outlaw, obeying none of the standard theorems.

which we saw is not differentiable although its Jacobian matrix exists.

Both partial derivatives are 0 at the origin. Away from the origin, that is, if (x, y) ≠ (0, 0), then

D₁f(x, y) = ( (x² + y²)(2xy) − x²y(2x) ) / (x² + y²)² = 2xy³ / (x² + y²)².   1.9.18

These partial derivatives are not continuous at the origin, as you will see if you approach the origin from any direction other than one of the axes.

For example, if you compute the first partial derivative at the point (t, t) of the diagonal, you find the limit

lim_{t→0} D₁f(t, t) = lim_{t→0} 2t⁴/(2t²)² = 1/2,   1.9.19

which is not the value of

D₁f(0, 0) = 0.   1.9.20
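Equation 1.9.19 is easy to verify with formula 1.9.18; the sketch below evaluates D₁f along the diagonal at a few scales of my own choosing.

```python
def D1f(x, y):                     # formula 1.9.18, valid away from the origin
    return 2 * x * y ** 3 / (x * x + y * y) ** 2

# Equation 1.9.19: along the diagonal, D1f(t, t) = 2t^4 / (2t^2)^2 = 1/2
# for every t != 0, while D1f vanishes at the origin itself (eq. 1.9.20).
diagonal_values = [D1f(t, t) for t in (0.1, 1e-4, 1e-8)]
assert all(abs(v - 0.5) < 1e-12 for v in diagonal_values)
print("D1f = 1/2 along the diagonal but 0 at the origin: D1f is discontinuous")
```

As with the directional derivative earlier, the value 1/2 holds at every diagonal point, so the limit along the diagonal cannot match the value 0 at the origin.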

In the case of the differentiable but pathological function of Example 1.9.4, the discontinuities are worse. This is a function of one variable, so (as discussed in Section 1.6) the only kind of discontinuities its derivative can have are wild oscillations. Indeed, f′(x) = 1/2 + 2x sin(1/x) − cos(1/x), and cos(1/x) oscillates infinitely many times between −1 and 1. △

The moral of the story: Only study continuously differentiable functions.

Definition 1.9.6 (Continuously differentiable function). A function is continuously differentiable on U ⊂ ℝⁿ if all its partial derivatives exist and are continuous on U. Such a function is known as a C¹ function.

This definition can be generalized; in Section 3.3 we will need functions that are "smoother" than C¹.

Definition 1.9.7 (Cᵖ function). A Cᵖ function on U ⊂ ℝⁿ is a function that is p times continuously differentiable: all of its partial derivatives up to order p exist and are continuous on U.

Theorem 1.9.8 guarantees that a "continuously differentiable" function is indeed differentiable: from the continuity of its partial derivatives, one can infer the existence of its derivative. Most often, the criterion of Theorem 1.9.8 is the tool used to determine whether a function is differentiable.

Theorem 1.9.8 (Criterion for differentiability). If U is an open subset of ℝⁿ, and f : U → ℝᵐ is a C¹ mapping, then f is differentiable on U, and its derivative is given by its Jacobian matrix.

In equation 1.9.21 we use the interval (a, a + h), rather than (a, b), so instead of the equation

f′(c) = ( f(b) − f(a) ) / (b − a)

of the mean value theorem we have the statement

f′(c) = ( f(a + h) − f(a) ) / h,

or

h f′(c) = f(a + h) − f(a).


A function that meets this criterion is not only differentiable; it is also guaranteed not to be pathological. By "not pathological" we mean that locally, its derivative is a reliable guide to its behavior.

Note that the last part of Theorem 1.9.8 - "and its derivative is given by its Jacobian matrix" - is obvious; if a function is differentiable, Theorem 1.7.10 tells us that its derivative is given by its Jacobian matrix. So the point is to prove that the function is differentiable.
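Before the proof, the criterion's content can be seen numerically on a sample C¹ function. The function, the base point, and the family of increments below are all hypothetical choices, not from the text; the point is that the remainder in equation 1.9.21, divided by |h|, shrinks with |h|.

```python
import math

# A C1 function (a hypothetical example, not from the text):
def f(x, y):
    return math.sin(x) * y + x * x

def jac(x, y):                     # its Jacobian matrix [D1f, D2f]
    return [math.cos(x) * y + 2 * x, math.sin(x)]

a = (0.3, -0.7)
J = jac(*a)

# Equation 1.9.21: |f(a + h) - f(a) - [Jf(a)]h| / |h| should tend to 0.
ratios = []
for k in range(1, 7):
    h = (10.0 ** (-k), -2 * 10.0 ** (-k))
    remainder = abs(f(a[0] + h[0], a[1] + h[1]) - f(*a) - (J[0] * h[0] + J[1] * h[1]))
    ratios.append(remainder / math.hypot(*h))
assert all(r2 < r1 for r1, r2 in zip(ratios, ratios[1:]))   # shrinking with |h|
assert ratios[-1] < 1e-4
print(ratios[-1])                  # a tiny number: f is differentiable at a
```

Contrast this with Example 1.9.3, where the analogous ratio along h = (t, t) stays near 1/2 no matter how small t is.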

Proof of Theorem 1.9.8. This is an application of Theorem 1.9.1, the mean value theorem. What we need to show is that

lim_{h→0} (1/|h|) | f(a + h) − f(a) − [Jf(a)]h | = 0.   1.9.21

First, note that by part 3 of Theorem 1.8.1, it is enough to prove it when m = 1 (i.e., for f : U → ℝ). Next write

f(a + h) − f(a) = f(a₁ + h₁, …, aₙ + hₙ) − f(a₁, …, aₙ)   1.9.22

in expanded form, subtracting and adding inner terms:

f(a + h) − f(a) =

  f(a₁ + h₁, a₂ + h₂, …, aₙ + hₙ) − f(a₁, a₂ + h₂, …, aₙ + hₙ)
+ f(a₁, a₂ + h₂, …, aₙ + hₙ) − f(a₁, a₂, a₃ + h₃, …, aₙ + hₙ)
+ ⋯
+ f(a₁, …, aₙ₋₁, aₙ + hₙ) − f(a₁, …, aₙ),

where each inner value is first subtracted and then added back.

By the mean value theorem, the ith term is

f(a₁, …, aᵢ₋₁, aᵢ + hᵢ, aᵢ₊₁ + hᵢ₊₁, …, aₙ + hₙ) − f(a₁, …, aᵢ₋₁, aᵢ, aᵢ₊₁ + hᵢ₊₁, …, aₙ + hₙ)

 = hᵢ Dᵢf(a₁, …, aᵢ₋₁, bᵢ, aᵢ₊₁ + hᵢ₊₁, …, aₙ + hₙ)   1.9.23

for some bᵢ ∈ [aᵢ, aᵢ + hᵢ]: there is some point bᵢ in the interval [aᵢ, aᵢ + hᵢ] such that the partial derivative Dᵢf at that point gives the average change of the function f over that interval, when all variables except the ith are kept constant.

The inequality in the second line of inequality 1.9.25 comes from the fact that |hᵢ|/|h| ≤ 1.

Since f has n variables, we need to find such a point for every i from 1 to n. We will call these points cᵢ:

cᵢ = (a₁, …, aᵢ₋₁, bᵢ, aᵢ₊₁ + hᵢ₊₁, …, aₙ + hₙ);

this gives

f(a + h) − f(a) = Σᵢ₌₁ⁿ hᵢ Dᵢf(cᵢ).

Thus we find that

f(a + h) − f(a) − Σᵢ₌₁ⁿ Dᵢf(a)hᵢ = Σᵢ₌₁ⁿ hᵢ ( Dᵢf(cᵢ) − Dᵢf(a) ).   1.9.24

So far we haven't used the hypothesis that the partial derivatives Dᵢf are continuous. Now we do. Since each Dᵢf is continuous, and since cᵢ tends to a as h → 0, we see that the theorem is true:

lim_{h→0} | f(a + h) − f(a) − [Jf(a)]h | / |h| ≤ lim_{h→0} Σᵢ₌₁ⁿ ( |hᵢ| / |h| ) |Dᵢf(cᵢ) − Dᵢf(a)|

  ≤ lim_{h→0} Σᵢ₌₁ⁿ |Dᵢf(cᵢ) − Dᵢf(a)| = 0,   □   1.9.25

where [Jf(a)]h = Σᵢ₌₁ⁿ Dᵢf(a)hᵢ.

Example 1.9.9. Here we work out the preceding computation when f is a scalar-valued function on ℝ²:

f(a₁ + h₁, a₂ + h₂) − f(a₁, a₂)

 = f(a₁ + h₁, a₂ + h₂) − f(a₁, a₂ + h₂) + f(a₁, a₂ + h₂) − f(a₁, a₂)   1.9.26

 = h₁ D₁f(b₁, a₂ + h₂) + h₂ D₂f(a₁, b₂) = h₁ D₁f(c₁) + h₂ D₂f(c₂). △
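The two-variable decomposition above can be checked on a concrete function. The sketch below uses the hypothetical f(x, y) = x²y (not from the text); since this f is quadratic in x, the mean value point c₁ is exactly the midpoint of the first segment, and since D₂f = x² does not depend on y, any point on the second segment serves as c₂.

```python
def f(x, y):                       # hypothetical example function, f = x^2 y
    return x * x * y

def D1f(x, y):                     # its partial derivatives
    return 2 * x * y

def D2f(x, y):
    return x * x

a1, a2, h1, h2 = 1.0, 2.0, 0.25, -0.5

# Equation 1.9.26: subtract and add f(a1, a2 + h2), then apply the
# one-variable mean value theorem to each of the two differences.
total = f(a1 + h1, a2 + h2) - f(a1, a2)
c1 = (a1 + h1 / 2, a2 + h2)        # quadratic in x: mean value point is the midpoint
c2 = (a1, a2 + h2 / 2)             # D2f ignores y: any point on the segment works
decomposed = h1 * D1f(*c1) + h2 * D2f(*c2)
assert abs(total - decomposed) < 1e-12
print(total, decomposed)
```

For a general C¹ function the points c₁ and c₂ exist but are not given by a formula; here the special structure of f makes them explicit.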

EXERCISES FOR SECTION 1.9

1.9.1 Show that the function f : ℝ² → ℝ given by

Exercise 1.9.2: Remember, sometimes you have to use the definition of the derivative, rather than the rules for computing de- rivatives.

The functions g and h for Exercise 1.9.2, parts b and c:

Exercise 1.9.3, part c: You may find the following fact useful:

|sin x| ≤ |x| for all x ∈ ℝ.

This follows from the mean value theorem:

|sin x| = | ∫₀ˣ cos t dt | ≤ | ∫₀ˣ 1 dt | = |x|.


is differentiable at every point of ℝ².

1.9.2 a. Show that for

f(x, y) = …   if (x, y) ≠ (0, 0),       f(x, y) = …   if (x, y) = (0, 0),

all directional derivatives exist, but that f is not differentiable at the origin.

*b. Show that the function g defined in the margin has directional derivatives at every point but is not continuous.

*c. Show that the function h defined in the margin has directional derivatives at every point but is not bounded in a neighborhood of 0.

1.9.3 Consider the function f : ℝ² → ℝ given by

f(x, y) = …   if (x, y) ≠ (0, 0),       f(x, y) = …   if (x, y) = (0, 0).

a. What does it mean to say that f is differentiable at (0, 0)?

b. Show that both partial derivatives D₁f(0, 0) and D₂f(0, 0) exist, and compute them.

c. Is f differentiable at (0, 0)?
