7.9. The Mean Value Theorem
Extreme points are a basic concept in applied mathematics, as the examples in 7.11 will demonstrate. So we start this chapter by introducing this notion.
The major part, however, is devoted to the mean value theorem and its implications. The theorem itself is easily depicted (look at the sketch in [7.9.4]), but its proof is far from trivial. The actual key is the extreme value theorem ([6.6.5]) for continuous functions.
The mean value theorem proves to be one of the most powerful devices in calculus.
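Definition: For an arbitrary point $a \in A$ a function $f: A \to \mathbb{R}$ is said to have a
- global maximum at a, if $f(x) \le f(a)$ for all $x \in A$,
- global minimum at a, if $f(x) \ge f(a)$ for all $x \in A$,
- local maximum at a, if there is a relative ε-neighbourhood $A \cap (a-\varepsilon, a+\varepsilon)$ of a such that $f(x) \le f(a)$ for all $x \in A \cap (a-\varepsilon, a+\varepsilon)$,
- local minimum at a, if there is a relative ε-neighbourhood $A \cap (a-\varepsilon, a+\varepsilon)$ of a such that $f(x) \ge f(a)$ for all $x \in A \cap (a-\varepsilon, a+\varepsilon)$.
[7.9.1]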
In any case we speak of a global or local extremum. Occasionally the terms absolute extremum and relative extremum, respectively, are used.
a is called an extreme point for the extremum (or the extreme value) $f(a)$.
To illustrate this new concept we look at the function given by
.
The sketch to the right obviously shows that f has
- a local minimum at −1 and at 2.5. The local minimum at −1 is even a global one.
- a local maximum at −2 and at 1.
- no global maximum as f is unbounded from above.
Note that the graph of the differentiable function f has a horizontal tangent at each of its interior extreme points, but not at the boundary point −2.
Consider:
- The ability to have an extremum is not bound to differentiability. The absolute value function for instance has a global minimum at 0, but fails to be differentiable there.
- Each global extreme point is a local one as well, because if an estimate holds for the whole of A it will certainly hold for every subset of the type $A \cap (a-\varepsilon, a+\varepsilon)$, thus for instance for $A \cap (a-1, a+1)$. The sketch above shows that the reverse does not hold.
- As we used the relations $\le$ and $\ge$ when stating [7.9.1] a constant function has a global maximum and a global minimum at each point simultaneously. This cannot happen however when strict extrema are involved.
It will be an important task to detect local extreme points. As noticed above, points with horizontal tangents will play a certain role in this quest, and in fact this observation leads to a first existence criterion for local extreme points.
Proposition (necessary criterion): Let $f: A \to \mathbb{R}$ be differentiable at $a \in A$. If a is an interior point of A the following implication holds:
If f has a local extremum at a its derivative vanishes at a: $f'(a) = 0$.
[7.9.2]
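Proof: Assume f has a local maximum at a. So there is a relative ε-neighbourhood $A \cap (a-\varepsilon, a+\varepsilon)$ such that
$$f(x) \le f(a) \quad \text{for all } x \in A \cap (a-\varepsilon, a+\varepsilon).$$
This allows us to determine the sign of the difference quotient function on that neighbourhood:
$$\frac{f(x)-f(a)}{x-a} \ge 0 \text{ for } x < a \quad\text{and}\quad \frac{f(x)-f(a)}{x-a} \le 0 \text{ for } x > a.$$
As a is an interior point, it can be approached from both sides. With [6.9.1] and [6.9.4] we thus have:
$$0 \le \lim_{x \uparrow a} \frac{f(x)-f(a)}{x-a} = f'(a) = \lim_{x \downarrow a} \frac{f(x)-f(a)}{x-a} \le 0,$$
which finally results in: $f'(a) = 0$. The case of a local minimum is treated analogously.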
Consider:
- The necessary criterion confirms the presumed behavior: Tangents attached to interior local extreme points are always horizontal.
- The reverse of [7.9.2] is not true (read: [7.9.2] is not sufficient). The function $X^3$ for instance has no local extremum at 0 irrespective of $(X^3)'(0) = 0$. Actually the necessary criterion only filters out points with horizontal tangents.
Looking for suitable sufficient criteria is thus worthwhile. [7.9.17] at the end of this part is a first example of such a criterion.
- The necessary criterion is not valid at boundary points: The restriction $X|_{[0,\infty)}$ of the identity function has a local (even global) minimum at 0, but its derivative number at 0 is 1.
- [7.9.2] is often read as "$f'(a) = 0$ is a necessary existence condition for interior local extreme points". Thus only the zeros of the first derivative come into question when searching for interior local extreme points.
Theorem (Rolle's theorem): Let f be continuous on the closed interval $[a,b]$ and differentiable on its interior $(a,b)$.
If $f(a) = f(b)$, then there is a $\xi \in (a,b)$ such that
$$f'(\xi) = 0$$
[7.9.3]
Proof: f has a global maximum and a global minimum due to the extreme value theorem [6.6.5]. (Note that f is continuous on a closed interval!) Thus there are two numbers $x_1, x_2 \in [a,b]$ such that
$$f(x_1) \le f(x) \le f(x_2) \quad \text{for all } x \in [a,b]. \qquad [0]$$
If one of these numbers is an interior point of $[a,b]$, it is certainly a zero of $f'$ as a result of the necessary criterion [7.9.2].
Otherwise we would know that $x_1, x_2 \in \{a, b\}$, thus $f(x_1) = f(x_2)$ as $f(a) = f(b)$ due to the premise. [0] now forces f to be a constant function, thus $f'$ vanishes everywhere on $(a,b)$.
Consider:
- Rolle's theorem is a pure existence theorem. It does not provide any information on uniqueness or on the precise location of $\xi$.
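- The continuity at a and at b is compulsory. As an example take the function $f: [0,1] \to \mathbb{R}$ defined by $f(x) = x$ for $x < 1$ and $f(1) = 0$. We have $f'(x) = 1 \neq 0$ for all $x \in (0,1)$, although $f(0) = f(1)$. But f is discontinuous at 1.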
- As $f(a) = f(b)$, the line segment joining $(a, f(a))$ and $(b, f(b))$ is horizontal. Thus Rolle's theorem is often stated in a more geometrical manner: There is an interior point with a tangent parallel to the line segment which joins the end points of the graph of f. The sketch to the right illustrates this for the function on the interval . Apart from the marked position there is obviously another option in this case for a horizontal tangent, namely at .
It is tempting to ask if this geometric property is still valid if the restriction $f(a) = f(b)$ is lifted. The function on depicted below suggests that the answer might be "yes". Luckily we are able to prove that "yes" is the actual answer.
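Theorem (mean value theorem): For each function f that is continuous on $[a,b]$ and differentiable on $(a,b)$ there is a $\xi \in (a,b)$ such that
$$f'(\xi) = \frac{f(b)-f(a)}{b-a}$$
[7.9.4]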
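Proof: We apply Rolle's theorem [7.9.3] to a modification of f. The function
$$g = f - \frac{f(b)-f(a)}{b-a}\,(X - a)$$
is certainly continuous on $[a,b]$ and differentiable on $(a,b)$. As
$$g(a) = f(a) = f(b) - (f(b) - f(a)) = g(b)$$
the special condition of Rolle's theorem is met, so that due to [7.9.3] there is a $\xi \in (a,b)$ such that
$$0 = g'(\xi) = f'(\xi) - \frac{f(b)-f(a)}{b-a}.$$
This however is the assertion.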
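The assertion of the mean value theorem can be illustrated numerically. The following is a minimal Python sketch under stated assumptions: the helper name `mean_value_point` and the example $f = X^3$ on $[0, 2]$ are chosen here for illustration only, and the bisection assumes that $f'$ minus the secant slope changes sign between a and b (the theorem itself needs no such assumption).

```python
import math

def mean_value_point(f, df, a, b, tol=1e-12):
    """Approximate one xi in (a, b) with df(xi) = (f(b) - f(a)) / (b - a).

    Hypothetical helper: uses bisection, so it assumes that df is continuous
    and that df - slope has different signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)   # slope of the secant through the end points

    def h(x):
        return df(x) - slope          # a zero of h is a mean value point

    lo, hi = a, b
    if h(lo) * h(hi) > 0:
        raise ValueError("df - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid                  # sign change in the left half
        else:
            lo = mid                  # sign change in the right half
    return (lo + hi) / 2

# Example: f = X^3 on [0, 2], secant slope (8 - 0) / 2 = 4
xi = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
print(xi, 2 / math.sqrt(3))
```

For this example the position can also be found by hand: $3\xi^2 = 4$ gives $\xi = 2/\sqrt{3}$, which lies in $(0, 2)$.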
Consider:
- On the one hand the mean value theorem is clearly an implication of Rolle's theorem due to the structure of its proof. We find on the other hand that Rolle's theorem comes as a special case of the mean value theorem, because if, in addition, $f(a) = f(b)$ we get $f'(\xi) = 0$. Both theorems are thus equivalent:
Rolle's theorem $\Leftrightarrow$ mean value theorem
With the mean value theorem one of the major results in calculus is now at our disposal. There are a lot of non-trivial applications supporting this rating. Our present notation [7.9.4] however seldom proves to be suitable for a direct application. We will thus benefit from another, equivalent version which is more tailored to application purposes. As
$$\frac{f(b)-f(a)}{b-a} = \frac{f(a)-f(b)}{a-b},$$
[7.9.4] is always valid, whether or not a is actually left of b. Thus we will subsequently use the phrase "$\xi$ lies in between a and b" as an abbreviation for "$\xi \in (a,b)$ if $a < b$, and $\xi \in (b,a)$ if $b < a$".
Solving [7.9.4] for $f(b)$ provides a new version of the mean value theorem:
Let I be an arbitrary interval and let $a, b$ be any two different points of I. (We understand "interval" as a common notation for open and closed intervals. In the open case we also allow the values $\infty$ for the right and $-\infty$ for the left boundary. Thus $\mathbb{R}$ and e.g. $(0, \infty)$ are regarded as intervals as well.) If f is differentiable on I, then there is a $\xi$ in between a and b such that
$$f(b) = f(a) + f'(\xi)\,(b-a)$$
[7.9.5]
Consider:
- The closed interval generated by a and b is a subset of I. Any function differentiable on I is differentiable as well, and thus also continuous, on that closed subinterval. The conditions of [7.9.4] are thus satisfied.
- [7.9.5] extends (on intervals) the basic representation theorem [7.5.1] as all the values $r(b)$ now prove to be derivative numbers of f.
- The mean value theorem is only valid for intervals. The Heaviside step function H for example is differentiable on $\mathbb{R} \setminus \{0\}$. But as $H'(x) = 0$ for all $x \neq 0$ there will be no $\xi$ such that
$$H(1) = H(-1) + H'(\xi)\,(1 - (-1)),$$
because that would imply: $1 = 0$.
Our first application goes back to a promise made in the context of [7.5.3/4]. Now we are able to show that regular functions on intervals are always injective.
The way the mean value theorem is used in the subsequent proof is characteristic: The equation [7.9.5] gives access to properties of f as soon as global features of its derivative are available. In this case for example we know that the derivative values are non-zero everywhere and thus certainly non-zero as well at the normally unknown point $\xi$ provided by the mean value theorem.
Proposition: If the continuous function $f: I \to \mathbb{R}$ is differentiable at each interior point of I we have:
$f'(x) \neq 0$ for all interior points x of I $\Rightarrow$ f is injective on I.
[7.9.6]
Proof: If x and y are any two different points of I there is a $\xi$ in between x and y according to [7.9.5] such that
$$f(x) - f(y) = f'(\xi)\,(x - y).$$
With $f'(\xi) \neq 0$ and $x - y \neq 0$ we now see: $f(x) \neq f(y)$.
A second example will classify the constant functions using only their derivative behaviour.
Proposition: For any function f differentiable on an interval I the following holds:
f is constant $\Leftrightarrow$ $f' = 0$
[7.9.7]
Proof: Whereas the direction "$\Rightarrow$" is trivial (see [7.3.6]) the reverse one "$\Leftarrow$" turns out to be the actual task. And again this is a characteristic context for the mean value theorem. If we choose a fixed point $a \in I$, [7.9.5] guarantees a $\xi$ in between a and x for each $x \in I$ different from a such that
$$f(x) = f(a) + f'(\xi)\,(x - a) = f(a).$$
Taking $c = f(a)$ now proves the assertion: $f = c$.
An appropriate statement for polynomials extends [7.9.7] considerably.
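Proposition: Let $n \in \mathbb{N}$ be an arbitrary natural number. For any $(n+1)$ times differentiable function f on an interval I we have:
f is a polynomial of degree $\le n$ $\Leftrightarrow$ $f^{(n+1)} = 0$
[7.9.8]
Proof: "$\Rightarrow$" is an immediate consequence of [7.8.14]. We prove "$\Leftarrow$" by induction. As the base step ($n = 0$) is already done by [7.9.7] it remains to prove the induction step. To that end let f be an $(n+2)$ times differentiable function such that
$$f^{(n+2)} = 0.$$
According to [7.9.7] the differentiable function $f^{(n+1)}$ is a constant one (the weird notation of the constant is due to [7.8.14]):
$$f^{(n+1)} = (n+1)!\,c.$$
Now we consider the $(n+1)$ times differentiable function $p = f - c\,X^{n+1}$. As obviously
$$p^{(n+1)} = f^{(n+1)} - (n+1)!\,c = 0$$
we know that p is a polynomial of degree $\le n$ due to the induction hypothesis. But that means: $f = p + c\,X^{n+1}$ is a polynomial of degree $\le n+1$.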
[7.9.8] allows us to spot the polynomials within the $C^\infty$-functions on an interval I simply by checking if the zero function is one of their derivatives. The derivatives calculated in [7.8.11-13] thus prove that exp, sin and cos are not polynomials.
When dealing with continuity we introduced one of its special forms in [6.5.6], the so-called Lipschitz continuity. From the mean value theorem we now get the information that a differentiable function with a bounded derivative will be Lipschitz continuous automatically.
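Task: If $f: I \to \mathbb{R}$ is differentiable and $|f'(x)| \le L$ for all interior points x of I, then all $x, y \in I$ satisfy:
$$|f(x) - f(y)| \le L\,|x - y|$$
[7.9.9]
Proof: Obviously we may assume that $x \neq y$. According to the mean value theorem there is a $\xi$ in between x and y such that $f(x) - f(y) = f'(\xi)\,(x - y)$. But this leads to
$$|f(x) - f(y)| = |f'(\xi)|\,|x - y| \le L\,|x - y|.$$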
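Consider:
- Every continuously differentiable function $f: [a,b] \to \mathbb{R}$ is Lipschitz continuous, because $f'$ now is a continuous function on a closed interval and thus bounded due to [6.6.4].
- As sin and cos (and thus their derivatives cos and −sin) have only values between −1 and 1, we get for all $x, y \in \mathbb{R}$:
$$|\sin x - \sin y| \le |x - y| \quad\text{and}\quad |\cos x - \cos y| \le |x - y|.$$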
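The Lipschitz estimate for sin and cos, read as $|\sin x - \sin y| \le |x - y|$ and $|\cos x - \cos y| \le |x - y|$, can be spot-checked numerically. The following Python sketch is only an illustration, not a proof: the helper name `difference_ratio`, the sample range $[-10, 10]$ and the sample size are assumptions made here.

```python
import math
import random

def difference_ratio(f, x, y):
    """Return |f(x) - f(y)| / |x - y| for two different points x and y."""
    return abs(f(x) - f(y)) / abs(x - y)

random.seed(0)  # fixed seed so that the run is reproducible
pairs = [(random.uniform(-10.0, 10.0), random.uniform(-10.0, 10.0))
         for _ in range(1000)]
pairs = [(x, y) for x, y in pairs if x != y]  # the ratio needs x != y

# Largest observed ratios; the estimate predicts that neither exceeds 1.
worst_sin = max(difference_ratio(math.sin, x, y) for x, y in pairs)
worst_cos = max(difference_ratio(math.cos, x, y) for x, y in pairs)
print(worst_sin, worst_cos)
```

Ratios close to 1 occur for pairs of nearby points at which the derivative has an absolute value close to 1, which shows that the constant 1 cannot be improved.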
We are now going to extend the mean value theorem. Two options are at our disposal: We could try to find a version for two functions, as we did successfully with the intermediate value theorem (see [6.6.3]), and we could check if the mean value theorem reveals additional features if repeatedly differentiable functions are involved.
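Proposition (second mean value theorem): For any two functions f, g that are continuous on $[a,b]$ and differentiable on $(a,b)$ there is a $\xi \in (a,b)$ such that
$$(f(b) - f(a))\,g'(\xi) = (g(b) - g(a))\,f'(\xi)$$
[7.9.10]
Proof: It is easily calculated that the function
$$h = (f(b) - f(a))\,g - (g(b) - g(a))\,f$$
satisfies $h(a) = f(b)\,g(a) - g(b)\,f(a) = h(b)$. According to Rolle's theorem [7.9.3] we thus find a $\xi \in (a,b)$ such that
$$0 = h'(\xi) = (f(b) - f(a))\,g'(\xi) - (g(b) - g(a))\,f'(\xi),$$
which in fact is the assertion.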
Consider:
- If we interchange a and b in [7.9.10], i.e. if we multiply the equation by −1, [7.9.10] is still valid. Thus the second mean value theorem does not depend on whether or not a is actually left of b either.
With the second mean value theorem we will get L'Hôpital's rule, a very efficient technique for calculating certain limits. We start with the following observation:
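Let $f, g: A \to \mathbb{R}$ be differentiable at a and assume $f(a) = g(a) = 0$. If $g'(a) \neq 0$ then $\frac{f}{g}$ is continuously continuable at a by the limit
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}.$$
[7.9.11]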
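Proof: At first we consider the representation (see [7.5.1] for details)
$$g = g(a) + (X - a)\,r = (X - a)\,r$$
to show that a is an accumulation point of the domain $\{x \in A \mid g(x) \neq 0\}$ of $\frac{f}{g}$. (According to [6.4.4] we would manage this by creating a sequence $(x_n)$ in this domain such that $x_n \to a$. As a is already an accumulation point of A (otherwise there would be no function differentiable at a) it is sufficient to that end to find a relative ε-neighbourhood $A \cap (a-\varepsilon, a+\varepsilon)$ satisfying $g(x) \neq 0$ for all of its points $x \neq a$.) As r is continuous at a the information $r(a) = g'(a) \neq 0$ yields a relative ε-neighbourhood such that $r(x) \neq 0$, and thus $g(x) = (x - a)\,r(x) \neq 0$, for all of its points $x \neq a$.
As $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a)$ and $\lim_{x \to a} \frac{g(x) - g(a)}{x - a} = g'(a) \neq 0$, [7.9.11] now follows immediately with [6.9.8] from the equation
$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{x - a} \cdot \left( \frac{g(x) - g(a)}{x - a} \right)^{-1}$$
which holds for all points $x \neq a$ of that neighbourhood.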
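A small numerical illustration of [7.9.11], read here as $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}$ for $f(a) = g(a) = 0$ and $g'(a) \neq 0$. The Python sketch below is an assumption-laden example: the helper name `quotient_limit` and the choice $f = \sin$, $g = X$, $a = 0$ are made here for illustration only.

```python
import math

def quotient_limit(df, dg, a):
    """Return f'(a) / g'(a), the limit of f/g at a predicted by [7.9.11].

    Hypothetical helper: df and dg are the derivatives of f and g; the caller
    is responsible for f(a) = g(a) = 0.
    """
    if dg(a) == 0:
        raise ValueError("[7.9.11] requires g'(a) != 0")
    return df(a) / dg(a)

# Example: f = sin, g = X at a = 0, so f' = cos and g' = 1.
predicted = quotient_limit(math.cos, lambda x: 1.0, 0.0)

# Values of f/g at points approaching a = 0 for comparison.
samples = [math.sin(x) / x for x in (1e-1, 1e-2, 1e-3)]
print(predicted, samples)
```

The sampled quotients approach the predicted value as the sample point moves towards 0.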
Proposition (L'Hôpital's rule): Let I be an interval, and such that for all . If is continuously continuable at a the following holds:
-
is continuously continuable at a by .
|
[7.9.12] |
-
is continuously continuable at a by .
|
[7.9.13] |
Proof: We go ahead with the sequence criterion [6.8.4] and take a sequence in such that . We may assume that for all n, because: As , [7.9.6] allows at most one zero for g left of a, i.e. within the interval and at most one zero within the right subinterval . According to the second mean value theorem [7.9.10] there is now an in between a and , that means , for each n such that
. [1]
If converges to a the same is true for (see the nesting theorem [5.5.8]!) which guarantees the convergence due to the premise. From [1] we thus get:
1. ►
.
2. ►
.
|
Consider:
-
Of course we are allowed to iterate the above rules: Take e.g. such that , and for all . If the limit exists the following ones exist as well and are all the same:
.
Example: Many problems could already be solved using only [7.9.11]. The logarithm function ln from the third example is introduced in [8.7.1].
|
|
|
[7.9.14] |
This justifies the phrase "If $x \to \infty$, then the logarithm function ln approaches $\infty$ slower than every positive power of X".
An analogous statement is true for the exponential function exp. "If $x \to \infty$, then exp approaches $\infty$ quicker than every positive power of X":
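$$\lim_{x \to \infty} \frac{x^\alpha}{\exp(x)} = 0 \quad \text{for all } \alpha > 0$$
[7.9.15]
Proof: L'Hôpital's rule is not necessary in this case. We just need the series representation of exp (see [5.9.18] for details):
$$\exp(x) = \sum_{i=0}^{\infty} \frac{x^i}{i!}.$$
Choosing an $n \in \mathbb{N}$ such that $n > \alpha$ will yield the assertion as the following estimate is true for all $x > 0$:
$$0 < \frac{x^\alpha}{\exp(x)} \le \frac{x^\alpha}{x^n / n!} = \frac{n!}{x^{n - \alpha}}.$$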
We now turn to the second promised extension of the mean value theorem. Among other things, this will result in a special method (Taylor polynomials) to study analytical functions.
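Theorem (Taylor's theorem): Take $n \in \mathbb{N}$. For any function f that is n times continuously differentiable on $[a,b]$ and $(n+1)$ times differentiable on $(a,b)$ there is a $\xi \in (a,b)$ such that Taylor's formula holds:
$$f(b) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!}\,(b-a)^i + \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(b-a)^{n+1}$$
[7.9.16]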
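Proof: It is Rolle's theorem again that will do the trick. As a start we find a real number c such that
$$f(b) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!}\,(b-a)^i + c\,(b-a)^{n+1} \qquad [2]$$
by simply solving the linear equation [2] for c. So it remains to find a $\xi \in (a,b)$ such that $c = \frac{f^{(n+1)}(\xi)}{(n+1)!}$. To that end we consider the function
$$g = \sum_{i=0}^{n} \frac{f^{(i)}}{i!}\,(b - X)^i + c\,(b - X)^{n+1}.$$
Due to the premise g is continuous on $[a,b]$ and differentiable on $(a,b)$. We use the product rule to calculate its derivative and observe that the resulting sum is a telescopic one collapsing to a single difference:
$$g' = \sum_{i=0}^{n} \frac{f^{(i+1)}}{i!}\,(b - X)^i - \sum_{i=1}^{n} \frac{f^{(i)}}{(i-1)!}\,(b - X)^{i-1} - (n+1)\,c\,(b - X)^n = \frac{f^{(n+1)}}{n!}\,(b - X)^n - (n+1)\,c\,(b - X)^n \qquad [3]$$
$g(b) = f(b)$ is obvious and according to [2] $g(a) = f(b)$ is valid as well. Thus Rolle's theorem [7.9.3] provides a $\xi \in (a,b)$ such that
$$0 = g'(\xi) = \left( \frac{f^{(n+1)}(\xi)}{n!} - (n+1)\,c \right) (b - \xi)^n.$$
As $b - \xi \neq 0$, we see that $c = \frac{f^{(n+1)}(\xi)}{(n+1)!}$.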
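Taylor's formula, read as $f(b) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!}(b-a)^i + \frac{f^{(n+1)}(\xi)}{(n+1)!}(b-a)^{n+1}$, can be made concrete for exp, where every derivative is exp again and the intermediate point can therefore be computed explicitly. The following Python sketch is an illustration under assumptions made here: the helper name `taylor_poly_exp` and the values $n = 4$, $a = 0$, $b = 1$ are not taken from the text.

```python
import math

def taylor_poly_exp(n, a, b):
    """Value at b of the n-th Taylor polynomial of exp with respect to a.

    All derivatives of exp at a equal exp(a).
    """
    return sum(math.exp(a) * (b - a) ** i / math.factorial(i)
               for i in range(n + 1))

n, a, b = 4, 0.0, 1.0

# Solve the linear equation f(b) = T_n(b) + c * (b - a)^(n + 1) for c.
c = (math.exp(b) - taylor_poly_exp(n, a, b)) / (b - a) ** (n + 1)

# For exp the condition c = exp(xi) / (n + 1)! can be solved for xi directly.
xi = math.log(c * math.factorial(n + 1))
print(c, xi)
```

The computed point lies between a and b, as the theorem predicts. For functions other than exp the intermediate point can usually not be written down in closed form.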
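Consider:
- If $n = 0$, [7.9.16] and [7.9.5] coincide. Thus Taylor's theorem is indeed an extension of the mean value theorem.
- And again we find that the actual order of a and b is of no relevance for the validity of Taylor's theorem. The proof however is a bit more tricky this time. For $b < a$ we need to introduce the linear function
$$\ell = a + b - X$$
and to apply Taylor's formula to the composite $f \circ \ell$ on $[b,a]$. Noting that $(f \circ \ell)^{(i)} = (-1)^i\, f^{(i)} \circ \ell$, especially $(f \circ \ell)(a) = f(b)$ and $(f \circ \ell)^{(i)}(b) = (-1)^i f^{(i)}(a)$, we get for a suitable $\xi \in (b,a)$
$$f(b) = \sum_{i=0}^{n} \frac{(-1)^i f^{(i)}(a)}{i!}\,(a-b)^i + \frac{(-1)^{n+1} f^{(n+1)}(\ell(\xi))}{(n+1)!}\,(a-b)^{n+1} = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!}\,(b-a)^i + \frac{f^{(n+1)}(\ell(\xi))}{(n+1)!}\,(b-a)^{n+1}.$$
Thus Taylor's formula holds for $b < a$ as well when we take $\ell(\xi)$ as the "new" $\xi$. So we may restate Taylor's theorem the following way:
For any $(n+1)$ times differentiable function f on an interval I and any two different points $a, b \in I$ there is a $\xi$ in between a and b that satisfies [7.9.16].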
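- If f is $(n+1)$ times differentiable on I and $a \in I$ we call the polynomial
$$T_n = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!}\,(X - a)^i$$
the n-th Taylor polynomial and the function $R_n$ defined by
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x - a)^{n+1}$$
the n-th remainder of f with respect to a. The function $R_n$ is well defined: There might be several $\xi$ for a fixed x satisfying Taylor's formula, but the value $\frac{f^{(n+1)}(\xi)}{(n+1)!}$ is the unique solution c of [2] belonging to x and a. $R_n$ is sometimes referred to as the Lagrange form of the remainder.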
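- If f is a $C^\infty$-function on I and $a \in I$ we call the power series $\sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!}\,(X - a)^i$ the Taylor series of f with respect to a. If the Taylor series is convergent with r as its radius of convergence and if f is its limit function, i.e.
$$f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!}\,(x - a)^i \quad \text{for all } x \in I \text{ with } |x - a| < r, \qquad [4]$$
the equation [4] is said to be the Taylor expansion of f at a. $C^\infty$-functions which allow a Taylor expansion for each $a \in I$ are thus analytical.
In chapter 8.10 we will access the Taylor formula in a different way. On this occasion we will learn how to test if a $C^\infty$-function is analytical.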
We return to the quest for local extreme points. With Taylor's formula we are now able to state a first sufficient existence criterion for local extreme points that in many cases successfully tells apart the candidates provided by the necessary criterion [7.9.2]. It is however confined to high quality functions on intervals only. Criteria for weaker functions will follow in the next part.
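Proposition (sufficient criterion for $C^{n+1}$-functions): For an $(n+1)$ times continuously differentiable function f on I and an interior point a of I with
$$f'(a) = \dots = f^{(n)}(a) = 0 \quad \text{and} \quad f^{(n+1)}(a) \neq 0$$
[7.9.17]
it depends on the parity of n + 1 whether or not f has an extremum at a:
- If n + 1 is odd, f has no local extremum at a.
- If n + 1 is even, f has a strict local extremum at a, namely a strict local minimum if $f^{(n+1)}(a) > 0$ and a strict local maximum if $f^{(n+1)}(a) < 0$.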
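Proof: Let's say $f^{(n+1)}(a) > 0$. As $f^{(n+1)}$ is continuous, there is an $\varepsilon > 0$ such that $f^{(n+1)}(x) > 0$ for all $x \in I \cap (a-\varepsilon, a+\varepsilon)$. As the derivatives up to order n vanish at a, [7.9.16] now provides a $\xi$ in between x and a for all those x different from a such that
$$f(x) = f(a) + \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x - a)^{n+1}.$$
As $\frac{f^{(n+1)}(\xi)}{(n+1)!} > 0$ for all these $\xi$, we may argue now as follows:
1. ► If n + 1 is odd the term $(x - a)^{n+1}$, and thence $f(x) - f(a)$ as well, is less than zero for all x left of a, and greater than zero for all x right of a. As a is an interior point of I both types of x actually occur in every neighbourhood of a. Thus f fails to have a local extremum at a.
2. ► Now, if n + 1 is even the term $(x - a)^{n+1}$ is greater than zero for all $x \neq a$ and so we have: $f(x) > f(a)$ for all $x \neq a$ in $I \cap (a-\varepsilon, a+\varepsilon)$, which proves f to have a strict local minimum at a.
The case $f^{(n+1)}(a) < 0$ is treated analogously.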
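For polynomials the criterion [7.9.17] can be applied mechanically, because all derivatives are available exactly. The following Python sketch is one possible illustration, not a general-purpose tool: the helper names, the representation of a polynomial by its integer coefficient list (lowest degree first) and the returned strings are assumptions made here.

```python
def poly_eval(coeffs, x):
    """Evaluate the polynomial with coefficients coeffs (lowest degree first) at x."""
    result = 0
    for c in reversed(coeffs):       # Horner's scheme
        result = result * x + c
    return result

def poly_derivative(coeffs):
    """Coefficient list of the derivative (empty list for a constant)."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def classify_point(coeffs, a):
    """Classify a by the first derivative of order >= 1 that does not vanish at a."""
    d = poly_derivative(coeffs)
    order = 1
    while d:
        value = poly_eval(d, a)
        if value != 0:
            if order % 2 == 1:       # odd order: no local extremum
                return "no local extremum"
            return "strict local minimum" if value > 0 else "strict local maximum"
        d = poly_derivative(d)
        order += 1
    return "constant polynomial"     # all derivatives vanish identically

# Example: f = 3 X^4 - 4 X^3 has the critical points 0 and 1.
f = [0, 0, 0, -4, 3]
print(classify_point(f, 0), "/", classify_point(f, 1))
```

With integer coefficients and integer points the test `value != 0` is exact. For floating point data a tolerance would be needed, which is beyond this sketch.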
Consider:
- If f is a $C^2$-function [7.9.17] turns into the "classical" criterion
$f'(a) = 0 \,\wedge\, f''(a) \neq 0 \;\Rightarrow\;$ f has a local extremum at a.
-
The proof of [7.9.17] shows that 2. is also valid for boundary points of I. A similar result for 1. however does not hold as is demonstrated by the restriction .
- The reverse of 2. is not true, the criterion is thus not necessary. The function $f: \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0 \\ 0, & x = 0 \end{cases}$$
is our counterexample in this case. In 9.12 we will prove that f is a $C^\infty$-function with all its derivatives vanishing at 0, which is a local minimum point for f.