3 Differentiation
Differentiation grew out of the problem of instantaneous velocity. Velocity can only easily be measured as an average over a time interval:$^{13}$ if an object travels $d$ meters in $t$ seconds, then its average velocity is $v_{av} = \frac{d}{t}\,\mathrm{ms}^{-1}$. An early 'definition' (dating to the 1300s) makes the instantaneous velocity equal to the constant velocity that would be observed if a body were to stop accelerating: while useless for the purposes of measurement, this is essentially Newton's first law regarding inertial motion (1687). We also see the concept of the tangent line beginning to appear. Indeed if one graphs
position against time, intuition tells us:
position against time, intuition tells us:
The graph of inertial (constant speed) motion is a straight line whose slope is the velocity.
The tangent line to a curve has slope equal to the instantaneous velocity.
The problem of finding, defining and computing instantaneous velocity thus morphed into the con-
sideration of tangent lines to curves. With the advent of analytic geometry in the early 1600s, math-
ematicians such as Fermat and Descartes pioneered versions of the familiar secant (‘cutting’) line
method for computing tangents.
[Figures: distance–time graphs. Left: instantaneous velocity equals the constant velocity corresponding to the tangent line. Right: secant lines approximate the tangent line as $t \to a$.]
The average velocity of the particle over the time interval $[a, t]$ is the slope of the secant line, namely
$$v_{av}(a, t) = \frac{d(t) - d(a)}{t - a}$$
Since the secant lines approximate the tangent line as $t$ approaches $a$, it seems reasonable that we should compute the instantaneous velocity in this manner:
$$v(a) = \lim_{t\to a} v_{av}(a, t) = \lim_{t\to a} \frac{d(t) - d(a)}{t - a}$$
This is, of course, the modern definition of the derivative.
$^{13}$Even a modern technique such as Doppler-shift compares measurements separated by the extremely small period of a light or sound wave. These are still therefore average velocities, albeit taken over very small time intervals.
3.28 Basic Properties of the Derivative
Definition 3.1. Let $f : U \to \mathbb{R}$ and $a \in U$ an interior point. We say that $f$ is differentiable at $a$ if the following limit exists (is finite!)
$$\lim_{x\to a} \frac{f(x) - f(a)}{x - a}$$
We call this limit the derivative of $f$ at $a$ and denote its value by $\frac{df}{dx}\big|_{x=a}$ or $f'(a)$.
If $f'(a)$ exists for all $a \in U$ then $f$ is differentiable (on $U$); the derivative becomes a function $f'(x) = \frac{df}{dx}$.
The two notations are partly attributable to the primary founders of calculus: Isaac Newton and Gottfried Leibniz. Each has its pros and cons and you should be comfortable with both.
One-sided derivatives Differentiability only makes sense at interior points of U since the defining
limit is two-sided. Left- and right-derivatives may be defined using one-sided limits; differentiability
is then equivalent to these being equal. All results in this section hold for one-sided derivatives with
suitable (sometimes tedious) modifications. It is common, though strictly incorrect, to say that f is
differentiable on [a, b) if it is differentiable on the interior (a, b) and right-differentiable at a. In these
notes we will strictly adhere to Definition 3.1: differentiable means two-sided.
Examples 3.2. Basic examples should be familiar from elementary calculus.
1. Let $f(x) = x^2 + 4x$. Then, for any $a \in \mathbb{R}$,
$$\lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \frac{x^2 + 4x - a^2 - 4a}{x - a} = \lim_{x\to a} \frac{(x - a)(x + a + 4)}{x - a} = \lim_{x\to a} (x + a + 4) = 2a + 4$$
Note how the definition of $\lim_{x\to a}$ allows us to cancel the $x - a$ terms from the numerator and denominator. We conclude that $f$ is differentiable (on $\mathbb{R}$) and that $f'(x) = 2x + 4$.
2. Let $g(x) = \frac{x+1}{2x-3}$. Then, for any $a \neq \frac{3}{2}$,
$$\lim_{x\to a} \frac{g(x) - g(a)}{x - a} = \lim_{x\to a} \frac{1}{x - a}\left(\frac{x+1}{2x-3} - \frac{a+1}{2a-3}\right) = \lim_{x\to a} \frac{5a - 5x}{(x - a)(2x - 3)(2a - 3)} = \lim_{x\to a} \frac{-5}{(2x - 3)(2a - 3)} = \frac{-5}{(2a - 3)^2}$$
$g$ is therefore differentiable on its domain $\mathbb{R} \setminus \{\frac{3}{2}\}$ with derivative $g'(x) = \frac{-5}{(2x-3)^2}$.
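These limit computations can be sanity-checked numerically; the following sketch (an illustration only, not part of the notes) evaluates the difference quotient of $f(x) = x^2 + 4x$ at $a = 3$ for shrinking step sizes:

```python
# Numerical illustration: the difference quotient of f(x) = x^2 + 4x at a
# approaches f'(a) = 2a + 4 as the step h shrinks to 0.
def diff_quotient(f, a, h):
    return (f(a + h) - f(a)) / h

f = lambda x: x**2 + 4*x
a = 3.0
for h in [1e-1, 1e-3, 1e-5]:
    print(h, diff_quotient(f, a, h))  # tends to 2*3 + 4 = 10
```

Algebraically the quotient here is exactly $10 + h$, so the convergence is visible immediately.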
The familiar expressions
$$f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h}, \qquad f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h}$$
are equivalent to the original definition (Exercise 5). While seemingly simpler, they sometimes lead to nastier calculations: see what happens if you try the previous example in this language. . .
We now turn to perhaps the most well-known result of elementary calculus.
Theorem 3.3 (Power Law). Let $r \in \mathbb{R}$. Then $f(x) = x^r$ is differentiable with $f'(x) = rx^{r-1}$.
The domains of $f$ and $f'$ depend messily on $r$, but the formula holds at least on the interval $(0, \infty)$. We leave a complete proof to the exercises and instead consider a few generalizable examples.
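Before the examples, here is a quick numerical spot-check of the power law (an illustration under floating-point arithmetic, not a proof and not part of the notes):

```python
# Checks d/dx x^n = n x^{n-1} at a few sample points using a symmetric
# difference quotient (illustrative only).
def num_deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for n in [2, 3, 5]:
    for a in [0.5, 1.0, 2.0]:
        exact = n * a**(n - 1)
        approx = num_deriv(lambda x: x**n, a)
        assert abs(approx - exact) < 1e-4
print("power law verified at sample points")
```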
Examples 3.4. 1. If $n \in \mathbb{N}$ and $a \in \mathbb{R}$, a simple factorization yields
$$\lim_{x\to a} \frac{x^n - a^n}{x - a} = \lim_{x\to a} \frac{(x - a)(x^{n-1} + ax^{n-2} + \cdots + a^{n-2}x + a^{n-1})}{x - a} \qquad (*)$$
$$= \lim_{x\to a} \left(x^{n-1} + ax^{n-2} + \cdots + a^{n-2}x + a^{n-1}\right) = na^{n-1}$$
We conclude that $\frac{d}{dx} x^n = nx^{n-1}$.
2. If $f(x) = x^{-1}$ and $a \neq 0$, then
$$\lim_{x\to a} \frac{x^{-1} - a^{-1}}{x - a} = \lim_{x\to a} \frac{a - x}{ax(x - a)} = \lim_{x\to a} \frac{-1}{ax} = \frac{-1}{a^2}$$
from which we conclude that $f'(x) = -x^{-2}$. A similar approach followed by the factorization $(*)$ proves the power law for all negative integer exponents:
$$\frac{x^{-n} - a^{-n}}{x - a} = \frac{a^n - x^n}{a^n x^n (x - a)} = \cdots$$
3. To differentiate $x^{1/n}$, substitute $x = y^n$ and observe case 1. For instance, if $g(x) = x^{1/3}$ and $a \neq 0$, then $y = x^{1/3}$ and $b = a^{1/3}$ yield
$$\lim_{x\to a} \frac{x^{1/3} - a^{1/3}}{x - a} = \lim_{y\to b} \frac{y - b}{y^3 - b^3} = \frac{1}{3b^2} = \frac{1}{3} a^{-2/3} \implies g'(x) = \frac{1}{3} x^{-2/3}$$
Note that $g$ is not differentiable at $x = 0$!
[Figure: graph of $y = x^{1/3}$ with its vertical tangent at the origin.]
We could similarly compute the derivative for all rational exponents, though it is much easier to wait
for the chain rule. The power law for irrational exponents is somewhat more ticklish.
Corollary 3.5 (Basic Transcendental Functions). Recalling our development of power series in Chapter 2, the power law (for positive integers!) is all we need to see that
$$\frac{d}{dx} \exp(x) = \exp(x), \qquad \frac{d}{dx} \sin x = \cos x, \qquad \frac{d}{dx} \cos x = -\sin x$$
It is also possible to develop these results independently of power series (see e.g. Exercise 12).
Failure of differentiability
It is instructive to consider how a function might fail to be differentiable. Firstly, a familiar fact shows
that functions are not differentiable at discontinuities.
Lemma 3.6. If f is differentiable at a then f is continuous at a.
Proof. Just take the limit (think carefully why this works!):
$$\lim_{x\to a} f(x) = \lim_{x\to a} \left[\frac{f(x) - f(a)}{x - a}(x - a) + f(a)\right] = f'(a) \cdot 0 + f(a) = f(a)$$
It remains to consider situations when a function is continuous but not differentiable.
Examples 3.7. The following exemplify all situations where a function is continuous on an interval
and differentiable everywhere except at a single interior point. As with isolated discontinuities, these
are classified by considering the three ways in which the derivative limit might not converge.
1. A vertical tangent line occurs when the limit is infinite. For instance, $g(x) = x^{1/3}$ at $x = 0$.
2. Corners occur when the one-sided limits are unequal (could be infinite). For instance, $f(x) = |x|$ is not differentiable at zero, with one-sided limits
$$\lim_{x\to 0^+} \frac{|x| - |0|}{x - 0} = \lim_{x\to 0^+} \frac{x}{x} = 1 \neq -1 = \lim_{x\to 0^-} \frac{-x}{x} = \lim_{x\to 0^-} \frac{|x| - |0|}{x - 0}$$
Indeed $f$ is differentiable everywhere except at zero, with
$$f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases}$$
A cusp describes the special case where the one-sided limits are $+\infty$ and $-\infty$.
3. A singularity is where left- and/or right-limits do not exist. The standard example is
$$f(x) = \begin{cases} x \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$
which is continuous on $\mathbb{R}$ and differentiable everywhere except at zero: the details are in Exercise 10.
[Figure: graph of $x \sin\frac{1}{x}$ oscillating near the origin.]
Singularities and vertical tangent lines can also prevent one-sided differentiability.
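The corner and the singularity above can both be observed numerically; the sketch below (illustrative only, not part of the notes) computes one-sided difference quotients at $0$ for $|x|$ and for $x\sin\frac{1}{x}$:

```python
# One-sided difference quotients at 0: a corner (|x|) gives two different
# one-sided limits; x*sin(1/x) gives a quotient sin(1/h) with no limit at all.
import math

def quotient(f, h):
    return (f(h) - f(0.0)) / h

abs_f = lambda x: abs(x)
osc_f = lambda x: x * math.sin(1.0 / x) if x != 0 else 0.0

# Corner: one-sided limits exist but disagree (+1 vs -1).
right = quotient(abs_f, 1e-8)
left = quotient(abs_f, -1e-8)
print(right, left)  # 1.0 -1.0

# Singularity: the quotient equals sin(1/h), which oscillates in [-1, 1].
samples = [quotient(osc_f, 1.0 / (k * 1000.0)) for k in range(1, 50)]
print(min(samples), max(samples))  # values spread across [-1, 1]; no limit
```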
More esoteric examples of non-differentiability are possible:
Utilizing series, we can create functions which are continuous on an interval but nowhere differ-
entiable! For an example, see Exercise 15.
It is also possible to construct a function which is differentiable (and thus continuous) at precisely one point; can you think of an example?
The Basic Rules of Differentiation
Theorem 3.8. Let $f, g$ be differentiable and $k, l$ be constants.
1. (Linearity) The function $kf + lg$ is differentiable with $(kf + lg)' = kf' + lg'$.
2. (Product rule) The function $fg$ is differentiable with $(fg)' = f'g + fg'$.
3. (Inverse functions) Suppose $f$ is bijective, $b = f(a)$ is an interior point of $\operatorname{dom} f^{-1}$, and $f'(a) \neq 0$. Then $f^{-1}$ is differentiable at $b$ and
$$\frac{d}{dy}\bigg|_{y=b} f^{-1}(y) = \frac{1}{f'(a)} = \frac{1}{f'\big(f^{-1}(b)\big)}$$
Proof. Parts 1 and 2 follow from the limit laws:
$$\lim_{x\to a} \frac{(kf + lg)(x) - (kf + lg)(a)}{x - a} = \lim_{x\to a} \left[k \frac{f(x) - f(a)}{x - a} + l \frac{g(x) - g(a)}{x - a}\right] = kf'(a) + lg'(a)$$
$$\lim_{x\to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} = \lim_{x\to a} \left[\frac{f(x) - f(a)}{x - a}\, g(x) + f(a)\, \frac{g(x) - g(a)}{x - a}\right] = f'(a)g(a) + f(a)g'(a)$$
Note where we used the continuity of $g$ in the second line ($\lim g(x) = g(a)$). Part 3 is an exercise.
The inverse function rule should be intuitive: since the graphs of $f$ and $f^{-1}$ are related by reflection in the diagonal $y = x$, gradients at corresponding points are reciprocals. The result feels even more natural in Leibniz's notation: $\frac{dx}{dy} = \frac{1}{dy/dx}$.
Examples 3.9. 1. Linearity permits the differentiation of any polynomial: e.g.,
$$\frac{d}{dx}\left(7x^2 + 13x^4\right) = 7 \frac{d}{dx} x^2 + 13 \frac{d}{dx} x^4 = 14x + 52x^3$$
2. The product rule extends the reach of differentiation to include simple combinations: e.g.,
$$\frac{d}{dx}\left(x^4 \sin x\right) = \left(\frac{d}{dx} x^4\right)\sin x + x^4 \frac{d}{dx} \sin x = 4x^3 \sin x + x^4 \cos x$$
3. Inverse trigonometric functions can now be differentiated: e.g.,
$$y = \sin^{-1} x \implies \frac{d}{dx} \sin^{-1} x = \frac{dy}{dx} = \left(\frac{dx}{dy}\right)^{-1} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$$
4. Define the natural logarithm to be the inverse of the (bijective!) exponential function $\exp(x)$:
$$y = \ln x \iff x = \exp y$$
It follows that
$$\frac{d}{dx} \ln x = \left(\frac{dx}{dy}\right)^{-1} = \frac{1}{\exp y} = \frac{1}{x}$$
The full details, and the justification that $\exp x = e^x$, are in Exercise 14.
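The two inverse-function computations above are easy to confirm numerically; the following sketch (illustrative only, not part of the notes) compares the closed forms against symmetric difference quotients:

```python
# Compares the inverse-function-rule answers for arcsin and ln against
# symmetric difference quotients at sample points (illustrative check).
import math

def num_deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-0.5, 0.0, 0.5]:
    assert abs(num_deriv(math.asin, x) - 1.0 / math.sqrt(1 - x**2)) < 1e-6
for x in [0.5, 1.0, 3.0]:
    assert abs(num_deriv(math.log, x) - 1.0 / x) < 1e-6
print("inverse function rule checked")
```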
Theorem 3.10 (Chain Rule). If $g$ is differentiable at $a$, and $f$ is differentiable at $g(a)$, then $f \circ g$ is differentiable at $a$ with derivative
$$(f \circ g)'(a) = f'\big(g(a)\big)\, g'(a)$$
In Leibniz's notation, $\frac{d(f \circ g)}{dx} = \frac{df}{dg}\frac{dg}{dx}$: this looks like a simple cancellation of the $dg$ terms. . .$^{14}$
Proof. Since $f$ and $g$ are differentiable, $a$ is interior to $\operatorname{dom}(g)$ and $g(a)$ is interior to $\operatorname{dom}(f)$. Since $g$ is continuous at $a$, there must exist some open interval $U \ni a$ for which $x \in U \implies g(x) \in \operatorname{dom}(f)$. Define $\gamma : \operatorname{dom}(f) \to \mathbb{R}$ via
$$\gamma(v) = \begin{cases} \dfrac{f(v) - f\big(g(a)\big)}{v - g(a)} & \text{if } v \neq g(a) \\ f'\big(g(a)\big) & \text{if } v = g(a) \end{cases} \qquad (*)$$
Since $f$ is differentiable at $g(a)$, we see that $\gamma$ is continuous there: indeed $\lim_{v\to g(a)} \gamma(v) = f'\big(g(a)\big)$. For any $x \in U \setminus \{a\}$, let $v = g(x)$ in $(*)$. Then
$$\frac{f\big(g(x)\big) - f\big(g(a)\big)}{x - a} = \gamma\big(g(x)\big)\, \frac{g(x) - g(a)}{x - a}$$
Take limits as $x \to a$ for the result.
Corollary 3.11 (Quotient Rule). Suppose $f$ and $g$ are differentiable. Then $\frac{f}{g}$ is differentiable whenever $g(x) \neq 0$. Moreover
$$\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$$
The proof is an exercise.
Examples 3.12. 1. By the quotient rule,
$$\frac{d}{dx} \tan x = \frac{d}{dx} \frac{\sin x}{\cos x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \sec^2 x$$
2. We can now differentiate highly involved combinations of elementary functions:
$$\frac{d}{dx}\left[\tan\big(e^{4x^2}\big) - \frac{7x}{\sin x}\right] = 8x e^{4x^2} \sec^2\big(e^{4x^2}\big) - \frac{7\sin x - 7x\cos x}{\sin^2 x}$$
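As a sanity check (not in the notes), the hand-computed derivative in example 2 can be compared against a difference quotient at a sample point; `F` and `dF` below transcribe the example's function and its claimed derivative:

```python
# Checks the claimed derivative of tan(e^{4x^2}) - 7x/sin(x) against a
# symmetric difference quotient at x = 0.3 (illustrative spot-check).
import math

def F(x):
    return math.tan(math.exp(4 * x**2)) - 7 * x / math.sin(x)

def dF(x):  # the derivative computed in the example
    u = math.exp(4 * x**2)
    return 8 * x * u / math.cos(u)**2 - (7 * math.sin(x) - 7 * x * math.cos(x)) / math.sin(x)**2

h, x = 1e-7, 0.3
approx = (F(x + h) - F(x - h)) / (2 * h)
assert abs(approx - dF(x)) / abs(dF(x)) < 1e-4
print("chain/quotient computation verified at x = 0.3")
```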
$^{14}$This is completely unjustified since $dg$ does not (for us) have independent meaning. The same problem appears in a famously flawed one-line 'proof' of the chain rule:
$$\lim_{x\to a} \frac{f\big(g(x)\big) - f\big(g(a)\big)}{x - a} \overset{?}{=} \lim_{x\to a} \frac{f\big(g(x)\big) - f\big(g(a)\big)}{g(x) - g(a)} \cdot \lim_{x\to a} \frac{g(x) - g(a)}{x - a}$$
The first limit doesn't make sense unless $g(x) \neq g(a)$ for all $x$ on some punctured neighborhood of $a$: in particular, $g(x)$ cannot be constant! The faulty argument may be repaired by replacing this difference quotient with $f'\big(g(a)\big)$ whenever $g(x) = g(a)$, before taking the limit. This is precisely what $\gamma\big(g(x)\big)$ does in the correct proof.
Exercises 3.28. Key concepts: Differentiability, Basic rules: linearity, power, product, chain, quotient
1. Use Definition 3.1 to calculate the derivatives.
(a) $f(x) = x^3$ at $x = 2$  (b) $g(x) = x + 2$ at $x = a$  (c) $f(x) = x^2 \cos x$ at $x = 0$  (d) $r(x) = \frac{3x+4}{2x-1}$ at $x = 1$
2. Differentiate the function $f(x) = \cos\big(e^{x^5} - 3x\big)$ using the chain and product rules.
3. (a) Prove the quotient rule (Corollary 3.11) by combining the chain and product rules.
(b) Prove the inverse derivative rule (Theorem 3.8, part 3).
(Hint: You can't simply differentiate $1 = \frac{dx}{dx} = \frac{d}{dx} f\big(f^{-1}(x)\big)$ using the chain rule; why not?)
4. (a) Find the derivatives of secant, cosecant and cotangent using the quotient rule.
(b) Why did we choose the positive square-root when computing $\frac{d}{dx} \sin^{-1} x$? What is the standard domain of arcsine, and what happens at $x = \pm 1$?
(c) Find the derivatives of the inverse trigonometric functions using the inverse function rule.
5. Using the definition of the derivative, and supposing that $f$ is differentiable at $a$, prove that
$$f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h\to 0} \frac{f(a + h) - f(a - h)}{2h}$$
6. Use induction to prove the power law $\frac{d}{dx} x^n = nx^{n-1}$ when $n \in \mathbb{N}$, using only the product rule and the fact that $\frac{d}{dx} x = 1$.
7. Prove that $f(x) = x|x|$ is differentiable everywhere and compute its derivative.
8. Show that $f(x) = x^{2/3}$ has a cusp (see Example 3.7.2) at $x = 0$.
9. Show that the following function is differentiable everywhere and compute its derivative:
$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$
Moreover, prove that the derivative $f'$ is discontinuous at $x = 0$.
10. Prove that the function in Example 3.7.3 is differentiable everywhere except at x = 0.
11. Suppose $f(x) = x^2$ whenever $x \in \mathbb{Q}$ and $f(x) = 0$ whenever $x \notin \mathbb{Q}$. At what values of $x$ is $f$ differentiable? Prove your assertion.
12. (a) Suppose $0 < h < \frac{\pi}{2}$. Use the picture to show that
$$0 < \frac{1 - \cos h}{h} < \sin\frac{h}{2} \qquad\text{and}\qquad \sin h < h < \tan h$$
Hence conclude that $\lim_{h\to 0} \frac{\sin h}{h} = 1$ and $\lim_{h\to 0} \frac{1 - \cos h}{h} = 0$.
(b) Use part (a) to prove that $\frac{d}{dx} \sin x = \cos x$.
[Figure: unit-circle diagram showing $\sin h$, $\cos h$, $\tan h$, $1$, and the arc $h$.]
13. (Hard) Use induction to prove the Leibniz rule (general product rule):
$$(fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}$$
Warning! The last two exercises are much longer and tougher: have a go if you appreciate a challenge.
14. The Exponential Function & the Power Law
The ratio test shows that the power series $\exp(x) := \sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges for all real $x$. Define $e := \exp(1)$. Certainly $e^x$ makes sense whenever $x \in \mathbb{Q}$. If $x$ is irrational, instead define
$$e^x := \sup\{e^q : q \in \mathbb{Q},\ q < x\}$$
The goal of this question is to prove that $\exp(x) = e^x$. As a nice bonus we recover Bernoulli's limit identity $e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n$ and obtain a complete proof of the power law!
(a) For all $x, y \in \mathbb{R}$, prove that $\exp(x + y) = \exp(x)\exp(y)$.
(Hint: use the binomial theorem and change the order of summation)
(b) Show that $\exp(x)$ is always positive, even when $x < 0$.
(c) Prove that $\exp : \mathbb{R} \to (0, \infty)$ is bijective.
(Hint: $x \geq 0 \implies \exp(x) \geq 1 + x$; take limits then apply part (a))
(d) Prove that $e^x = \exp(x)$. Do this in three stages:
• If $x \in \mathbb{N}$, use part (a). Now check for $x \in \mathbb{Z}^-$.
• If $x = \frac{m}{n} \in \mathbb{Q}$, first compute $\left(\exp\frac{m}{n}\right)^n$.
• If $x$ is irrational, consider a sequence of rational numbers $q_n < x$ with $e^{q_n} \to e^x$. . .
(e) Let $\ln : (0, \infty) \to \mathbb{R}$ be the inverse function of $\exp$. Prove the logarithm laws:
$$\ln(xy) = \ln x + \ln y \qquad\text{and}\qquad \ln x^r = r \ln x$$
(Just do this when $r \in \mathbb{N}$; in general, another argument like part (d) is required)
(f) We've already seen that $\frac{d}{dy} \ln y = \frac{1}{y}$. Use the fact that
$$\frac{d}{dy} \ln y = \lim_{h\to 0} \frac{\ln(y + h) - \ln y}{h}$$
to prove that $\exp(x) = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n$, thus recovering Bernoulli's definition of $e$.
(g) For any $r \in \mathbb{R}$, define $x^r := \exp(r \ln x)$. Hence obtain the power law for any exponent.
15. A Very Strange Function
Here is a classic example of a continuous but nowhere-differentiable function!
Let $f$ be the sawtooth function defined by $f(x) = |x|$ whenever $x \in [-1, 1]$ and extending periodically to $\mathbb{R}$ so that $f(x + 2) = f(x)$. Now define $g : \mathbb{R} \to \mathbb{R}$ via
$$g(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n f(4^n x)$$
[Figures: $f(x)$ and iterations to $n = 3$; $g(x)$ (really $n = 6$, but can you tell?!)]
(a) Prove that $g$ is well-defined and continuous on $\mathbb{R}$.
(b) Let $x \in \mathbb{R}$ and $m \in \mathbb{N}$ be fixed. Define $h_m = \pm\frac{1}{2} \cdot 4^{-m}$, where the sign is chosen so that no integers lie strictly between $4^m x$ and $4^m(x + h_m) = 4^m x \pm \frac{1}{2}$. For each $n \in \mathbb{N}_0$, define
$$k_n = \frac{f\big(4^n(x + h_m)\big) - f(4^n x)}{h_m}$$
Prove the following:
i. $|k_n| \leq 4^n$, with equality when $n = m$.
ii. $n > m \implies k_n = 0$.
(Hint: $|f(y) - f(z)| \leq |y - z|$: when is this an equality?)
(c) Use part (b) to prove that
$$\left|\frac{g(x + h_m) - g(x)}{h_m}\right| \geq \frac{1}{2}\left(3^m + 1\right)$$
Hence conclude that $g$ is nowhere differentiable.
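Exercise 15 can be explored experimentally; the sketch below (an illustration under floating-point arithmetic, not part of the notes) builds partial sums of the sawtooth series and watches the difference quotients blow up at a sample point:

```python
# Partial sums of the nowhere-differentiable sawtooth series: the
# difference quotients |g(x + h_m) - g(x)| / |h_m| grow roughly like 3^m.
import math

def saw(x):
    # 2-periodic sawtooth: f(x) = |x| on [-1, 1], f(x + 2) = f(x)
    x = (x + 1.0) % 2.0 - 1.0
    return abs(x)

def g(x, terms=16):
    return sum(0.75**n * saw(4**n * x) for n in range(terms))

def h_m(x, m):
    # sign chosen so no integer lies strictly between 4^m x and 4^m(x + h)
    y = 4.0**m * x
    return 0.5 * 4.0**(-m) if math.floor(y + 0.5) == math.floor(y) else -0.5 * 4.0**(-m)

x = 0.3
qs = []
for m in [2, 4, 6]:
    h = h_m(x, m)
    qs.append(abs((g(x + h) - g(x)) / h))
print(qs)  # increasing quotients: no derivative can exist at x
```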
3.29 The Mean Value Theorem
A key result in elementary calculus, this should be very familiar from your previous studies.
Theorem 3.13 (Mean Value Theorem/MVT). Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $\xi \in (a, b)$ such that $f'(\xi) = \frac{f(b) - f(a)}{b - a}$.
This follows easily from two lemmas.
Lemma 3.14. 1. (Critical Points) Suppose $g$ is bounded on $(a, b)$ and attains its maximum or minimum at $\xi \in (a, b)$. If $g$ is differentiable at $\xi$ then $g'(\xi) = 0$.
2. (Rolle's Theorem) Suppose $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $g(a) = g(b)$. Then there exists $\xi \in (a, b)$ such that $g'(\xi) = 0$.
The main result is obtained by subtracting a straight line and applying Rolle's theorem to
$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a)$$
and observing that $g(a) = f(a) = g(b)$ and $g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$.
[Figures: left, Critical Points/Rolle's Theorem for $g(x)$; right, Mean Value Theorem for $f(x)$; each marks $\xi \in (a, b)$.] In the pictures, the orange and green lines are parallel: the average slope over the interval $[a, b]$ equals the gradient/derivative $f'(\xi)$.
Proof of Lemma. 1. Suppose $\xi \in (a, b)$ is a maximum: that is, $g(x) \leq g(\xi)$ for all $x \neq \xi$. Then
$$\frac{g(x) - g(\xi)}{x - \xi} \begin{cases} \leq 0 & \text{whenever } x > \xi \\ \geq 0 & \text{whenever } x < \xi \end{cases}$$
Now take the one-sided limits: since $g$ is differentiable at $\xi$, we see that
$$0 \geq \lim_{x\to\xi^+} \frac{g(x) - g(\xi)}{x - \xi} = g'(\xi) = \lim_{x\to\xi^-} \frac{g(x) - g(\xi)}{x - \xi} \geq 0$$
Otherwise said, $g'(\xi) = 0$. The case when $\xi$ is a minimum is similar.
2. By the Extreme Value Theorem (1.11), $g$ is bounded and attains its bounds. If the extrema both occur at the endpoints $a, b$, then $g$ is constant: any $\xi \in (a, b)$ satisfies the result. Otherwise, at least one extreme occurs at some $\xi \in (a, b)$: part 1 says that $g'(\xi) = 0$.
Examples 3.15. 1. Let $f(x) = (x - 1)^2(4 - x) + x$ on $[a, b] = [1, 4]$: this is roughly the above picture illustrating the mean value theorem. Compute the average slope and the derivative,
$$\frac{f(b) - f(a)}{b - a} = 1, \qquad f'(x) = 2(x - 1)(4 - x) - (x - 1)^2 + 1 = -3x^2 + 12x - 8$$
and observe that
$$f'(\xi) = \frac{f(b) - f(a)}{b - a} \iff 3\xi^2 - 12\xi + 9 = 0 \iff \xi = 1 \text{ or } 3$$
Since only $3$ lies in the interval $(1, 4)$, this is the value $\xi$ satisfying the mean value theorem.
2. We find the maximum and minimum values of $g(x) = x^4 - 14x^2 + 24x$ on the interval $[0, 2]$. The function is differentiable, with
$$g'(x) = 4x^3 - 28x + 24 = 4(x - 2)(x - 1)(x + 3)$$
By the Lemma, the locations of the extrema are either the endpoints $x = 0, 2$ or locations with zero derivative ($x = 1$). Since
$$g(0) = 0, \qquad g(1) = 11, \qquad g(2) = 8$$
we conclude that $\max(g) = g(1) = 11$ and $\min(g) = g(0) = 0$.
[Figure: graph of $g(x)$ on $[0, 2]$.]
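Both worked examples are easy to confirm by direct evaluation; the sketch below (not part of the notes) checks the MVT point $\xi = 3$ and the extrema of $g$:

```python
# Verifies Examples 3.15: the MVT point for f on [1, 4], and the extrema
# of g(x) = x^4 - 14x^2 + 24x on [0, 2] among the candidate points.
def f(x):
    return (x - 1)**2 * (4 - x) + x

a, b = 1.0, 4.0
avg = (f(b) - f(a)) / (b - a)
fprime = lambda x: -3*x**2 + 12*x - 8
assert avg == 1.0 and fprime(3.0) == avg  # xi = 3 satisfies the MVT

g = lambda x: x**4 - 14*x**2 + 24*x
candidates = [0.0, 1.0, 2.0]  # endpoints plus the interior critical point
values = [g(x) for x in candidates]
assert max(values) == 11.0 and min(values) == 0.0
print("xi = 3; max g = 11 at x = 1; min g = 0 at x = 0")
```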
Consequences of the Mean Value Theorem Several simple corollaries relate to monotonicity.
Definition 3.16. Suppose $f : I \to \mathbb{R}$ is defined on an interval $I$. We say that $f$ is:
• Increasing (monotone-up) on $I$ if $x < y \implies f(x) \leq f(y)$
• Decreasing (monotone-down) on $I$ if $x < y \implies f(x) \geq f(y)$
We say strictly increasing/decreasing if the inequalities are strict.
Examples 3.17. 1. $f : x \mapsto x^2$ is strictly increasing on $[0, \infty)$ and strictly decreasing on $(-\infty, 0]$.
2. The floor function $f : x \mapsto \lfloor x \rfloor$ (the greatest integer less than or equal to $x$) is increasing, but not strictly, on $\mathbb{R}$.
[Figure: graph of the floor function.]
Corollary 3.18. Suppose $f$ is differentiable on an interval $I$. Then
1. $f' \geq 0$ on $I \iff f$ is increasing on $I$
2. $f' \leq 0$ on $I \iff f$ is decreasing on $I$
3. $f' = 0$ on $I \iff f$ is constant on $I$
Proof. (Part 1, $\Rightarrow$) Let $x < y$ where $x, y \in I$. By the mean value theorem, $\exists\, \xi \in (x, y)$ such that
$$\frac{f(y) - f(x)}{y - x} = f'(\xi), \quad\text{whence}\quad f'(\xi) \geq 0 \implies f(y) \geq f(x)$$
($\Leftarrow$) For the converse, use the definition of derivative: $f'(\xi) = \lim_{x\to\xi} \frac{f(x) - f(\xi)}{x - \xi}$. If $f$ is increasing, then
$$x > \xi \implies f(x) \geq f(\xi) \implies f'(\xi) \geq 0$$
Parts 2 and 3 are similar.
More care is required when relating $f' > 0$ to $f$ being strictly increasing (see Exercise 5). The corollary also yields a couple of (hopefully familiar) flashbacks to elementary calculus.
Corollary 3.19. Let $I$ be an open interval.
1. (Anti-derivatives on an interval) If $f'(x) = g'(x)$ on $I$, then $\exists\, c$ such that $g(x) = f(x) + c$ on $I$.
2. (First derivative test) Suppose $f$ is continuous on $I$ and differentiable except perhaps at $\xi$. If
$$\begin{cases} f'(x) < 0 & \text{whenever } x < \xi, \text{ and} \\ f'(x) > 0 & \text{whenever } x > \xi \end{cases}$$
then $f$ has its minimum value at $x = \xi$. The statement for a maximum is similar.
Examples 3.20. 1. Since $\frac{d}{dx} \sin(3x^2 + x) = (6x + 1)\cos(3x^2 + x)$ on (the interval) $\mathbb{R}$, all anti-derivatives of $f(x) = (6x + 1)\cos(3x^2 + x)$ are given by
$$\int f(x)\,dx = \int (6x + 1)\cos(3x^2 + x)\,dx = \sin(3x^2 + x) + c$$
As is typical in calculus, we use the indefinite integral notation $\int f(x)\,dx$ for anti-derivatives.
2. If $f(x) = x^{2/3} e^{x/3}$, then $f'(x) = \frac{1}{3} x^{-1/3}(2 + x) e^{x/3}$.
By Lemma 3.14, the only possible critical points are at $x = 0$ or $-2$. The sign of the derivative is also clear:
[Figure: graph of $f(x)$ on $[-3, 1]$.]
By the 1st derivative test, $f$ has a maximum at $x = -2$ and a minimum at $x = 0$.
We finish this section by tying together the mean and intermediate value theorems.
Theorem 3.21 (IVT for Derivatives). Suppose $f$ is differentiable on an interval $I$ containing $a < b$, and that $L$ lies between $f'(a)$ and $f'(b)$. Then $\exists\, \xi \in (a, b)$ such that $f'(\xi) = L$.
If $f'(x)$ is continuous, this is just the intermediate value theorem applied to $f'$; surprisingly, continuity of $f'$ is not required. A full proof is in Exercise 7.
Exercises 3.29. Key concepts: Mean Value Theorem, Rolle's Theorem, monotonicity, IVT for derivatives
1. Determine whether the conclusion of the mean value theorem holds for each function on the given interval. If so, find a suitable point $\xi$. If not, state which hypothesis fails.
(a) $x^2$ on $[1, 2]$  (b) $\sin x$ on $[0, \pi]$  (c) $|x|$ on $[-1, 2]$  (d) $1/x$ on $[-1, 1]$  (e) $1/x$ on $[1, 3]$
2. Suppose $f$ and $g$ are differentiable on an interval $I$ containing $a < b$ and that $f(a) = f(b) = 0$. By considering $h(x) = f(x)e^{g(x)}$, prove that $f'(\xi) + f(\xi)g'(\xi) = 0$ for some $\xi \in (a, b)$.
3. (a) Use the Mean Value Theorem to prove that $x < \tan x$ for all $x \in (0, \frac{\pi}{2})$.
(b) Prove that $\frac{x}{\sin x}$ is strictly increasing on $(0, \frac{\pi}{2})$.
(c) Prove that $x \leq \frac{\pi}{2} \sin x$ for all $x \in [0, \frac{\pi}{2}]$.
4. Suppose that $|f(x) - f(y)| \leq (x - y)^2$ for all $x, y \in \mathbb{R}$. Prove that $f$ is a constant function.
5. (a) Prove that $f' > 0$ on an interval $I \implies f$ is strictly increasing on $I$.
(b) Show that the converse of part (a) is false.
(c) Carefully prove the first derivative test (Corollary 3.19).
6. If $f$ is differentiable on an interval $I$ such that $f'(x) \neq 0$ for all $x \in I$, use the intermediate value theorem for derivatives to prove that $f$ is either strictly increasing or strictly decreasing.
7. (Intermediate value theorem for derivatives) Let $f, a, b$ and $L$ be as in Theorem 3.21, define $g : I \to \mathbb{R}$ by $g(x) = f(x) - Lx$, and let $\xi \in [a, b]$ be such that
$$g(\xi) = \min\big\{g(x) : x \in [a, b]\big\}$$
(a) Why can we be sure that $\xi$ exists? If $\xi \in (a, b)$, explain why $f'(\xi) = L$.
(b) Assume WLOG that $f'(a) < f'(b)$. Prove that $g'(a) < 0 < g'(b)$. By considering $\lim_{x\to a^+} \frac{g(x) - g(a)}{x - a}$, show that $\exists\, x > a$ for which $g(x) < g(a)$. Hence complete the proof.
8. Suppose $f'$ exists on $(a, b)$, and is continuous except for a discontinuity at $c \in (a, b)$.
(a) Suppose $\lim_{x\to c^+} f'(x) = L < f'(c)$. By taking $\epsilon = \frac{f'(c) - L}{2}$ in the definition of this limit and applying IVT for derivatives, obtain a contradiction.
Hence argue that $c$ cannot be a removable or a jump discontinuity.
(b) Similarly, show that $f'$ cannot have an infinite discontinuity by considering $\lim_{x\to c^+} f'(x) = \infty$.
(c) By parts (a) and (b), it remains to see that $f'$ can have an essential discontinuity. Recall (Exercise 3.28.9) that
$$f : \mathbb{R} \to \mathbb{R} : x \mapsto \begin{cases} x^2 \sin\frac{1}{x} & x \neq 0 \\ 0 & x = 0 \end{cases}$$
is differentiable on $\mathbb{R}$, but has discontinuous derivative at $x = 0$.
i. Use $x_n = \frac{1}{2n\pi}$ and $y_n = \frac{1}{(2n+1)\pi}$ to show that $f'$ has an essential discontinuity at $x = 0$.
ii. Prove that if $\lim s_n = 0$ and $\lim f'(s_n) = M$, then $M \in [-1, 1]$.
iii. Prove that for any $L \in [-1, 1]$, there is a sequence $(t_n)$ for which $\lim f'(t_n) = L$.
(Hint: Use IVT for derivatives)
3.30 L'Hôpital's Rule
We are often required to consider indeterminate forms: limits which do not yield easily to the standard limit laws. For instance, while it is tempting to write
$$\lim_{x\to 0} \frac{\sin 2x}{e^{3x} - 1} = \frac{\lim \sin 2x}{\lim (e^{3x} - 1)} = \frac{0}{0} \qquad (*)$$
this is an incorrect application of the limit laws since the resulting quotient has no meaning.
Definition 3.22. An indeterminate form is any limit where a naïve application of the limit laws results in a meaningless expression: the primary types are
$$\frac{0}{0}, \qquad \frac{\infty}{\infty}, \qquad \infty - \infty, \qquad 0 \cdot \infty, \qquad 0^0, \qquad \infty^0, \qquad\text{and}\qquad 1^\infty.$$
Examples 3.23. 1. $\lim_{x\to 7^+} (x - 7)^{\frac{1}{x-7}}$ is an indeterminate form of type $0^\infty$.
2. Our motivating example $(*)$ may correctly be evaluated using the definition of the derivative:
$$\lim_{x\to 0} \frac{\sin 2x}{e^{3x} - 1} = \lim_{x\to 0} \frac{\sin 2x - 0}{x - 0} \cdot \frac{x - 0}{e^{3x} - 1} = \frac{\frac{d}{dx}\big|_{x=0} \sin 2x}{\frac{d}{dx}\big|_{x=0} \left(e^{3x} - 1\right)} = \frac{2}{3}$$
By considering $\lim_{x\to 0} \frac{3a \sin 2x}{2(e^{3x} - 1)}$, we see that an indeterminate form of type $\frac{0}{0}$ can take any value $a$!
The approach generalizes, if non-rigorously: if $f, g$ are differentiable at $a$ and $f(a) = 0 = g(a)$, then
$$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} \cdot \frac{x - a}{g(x) - g(a)} = \frac{f'(a)}{g'(a)}$$
Our goal is to fully justify this result and extend to several situations:
• One-sided limits, including when $a = \pm\infty$.
• When $\lim f(x) = 0$ exists, but $f(a)$ does not ($g(x)$, $g(a)$ similarly).
• Indeterminate forms of type $\frac{\infty}{\infty}$ ($\lim f(x) = \infty$, etc.).
• When the RHS cannot be cleanly evaluated: for instance $g'(a) = 0$ or if the original limit is $\pm\infty$.
Here is the full result.
Theorem 3.24 (L'Hôpital's Rule). Let $a \in \mathbb{R} \cup \{\pm\infty\}$ and suppose functions $f$ and $g$ satisfy:
1. $\lim_{x\to a} \frac{f'(x)}{g'(x)} = L$ for some $L \in \mathbb{R} \cup \{\pm\infty\}$, and,
2. (a) $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0$, or (b) $\lim_{x\to a} g(x) = \infty$ (no condition on $f$!)
Then $\lim_{x\to a} \frac{f(x)}{g(x)} = L$. The same result holds for one-sided limits.
The full proof is a behemoth; we postpone it until after several examples. In part because of this, and because examples can often be evaluated more instructively using elementary methods (as in the above example), l'Hôpital's rule is often discouraged in elementary calculus.
Examples 3.25. 1. If $f(x) = e^{4x}$ and $g(x) = 21x - 17$, then $\lim_{x\to\infty} \frac{f(x)}{g(x)}$ has type $\frac{\infty}{\infty}$. By l'Hôpital's rule,
$$\lim_{x\to\infty} \frac{f'(x)}{g'(x)} = \lim_{x\to\infty} \frac{4e^{4x}}{21} = \infty \implies \lim_{x\to\infty} \frac{e^{4x}}{21x - 17} = \infty$$
2. For an example of type $\frac{0}{0}$, consider $f(x) = x^2 - 9$ and $g(x) = \ln(4 - x)$:
$$\lim_{x\to 3} \frac{f'(x)}{g'(x)} = \lim_{x\to 3} \frac{2x}{-1/(4 - x)} = \lim_{x\to 3} 2x(x - 4) = -6 \implies \lim_{x\to 3} \frac{x^2 - 9}{\ln(4 - x)} = -6$$
3. One can apply the rule repeatedly: for example
$$\lim_{x\to 0} \frac{e^{4x} - 1 - 4x}{x^2} = \lim_{x\to 0} \frac{4e^{4x} - 4}{2x} = \lim_{x\to 0} \frac{16e^{4x}}{2} = 8$$
This is a generally accepted abuse of protocol: one shouldn't really state the first limit until one knows the last limit exists! As long as everything works, you are fine. However. . .
4. It is crucially important that the limit $\lim \frac{f'}{g'}$ exists before applying l'Hôpital's rule! Consider $f(x) = x + \cos x$ and $g(x) = x$: certainly $\lim_{x\to\infty} \frac{f(x)}{g(x)}$ has type $\frac{\infty}{\infty}$, however
$$\lim_{x\to\infty} \frac{f'(x)}{g'(x)} = \lim_{x\to\infty} (1 - \sin x) \quad\text{does not exist!}$$
In this case the rule is unnecessary: appealing to the squeeze theorem,
$$\frac{f(x)}{g(x)} = 1 + \frac{\cos x}{x} \xrightarrow{x\to\infty} 1$$
5. For another reason why l'Hôpital's rule is often prohibited in Freshman calculus, consider
$$\lim_{x\to 0} \frac{\sin x}{x} = \lim_{x\to 0} \frac{\cos x}{1} = 1$$
This appears legitimate. However, recall (Exercise 3.28.12) that this limit is used to demonstrate $\frac{d}{dx} \sin x = \cos x$; to use this to calculate the limit on which it depends is circular logic!
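The legitimate limits in this section can be sampled numerically; the sketch below (floating-point illustration only, not part of the notes) evaluates three of them near their limit points:

```python
# Samples three limits from this section near their limit points;
# floating point only approximates the limiting values.
import math

# sin(2x)/(e^{3x} - 1) -> 2/3 as x -> 0
v1 = math.sin(2e-6) / (math.exp(3e-6) - 1)
assert abs(v1 - 2/3) < 1e-4

# (x^2 - 9)/ln(4 - x) -> -6 as x -> 3
x = 3 - 1e-6
v2 = (x**2 - 9) / math.log(4 - x)
assert abs(v2 + 6) < 1e-4

# (e^{4x} - 1 - 4x)/x^2 -> 8 as x -> 0
x = 1e-4
v3 = (math.exp(4*x) - 1 - 4*x) / x**2
assert abs(v3 - 8) < 1e-2
print(v1, v2, v3)
```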
The remaining indeterminate forms (Definition 3.22) may be modified so that l'Hôpital's rule applies.
Examples 3.26. 1. An indeterminate form of type $\infty - \infty$ may be transformed to one of type $\frac{0}{0}$ before applying the rule (twice):
$$\lim_{x\to 0^+} \left(\frac{1}{e^x - 1} - \frac{1}{x}\right) = \lim_{x\to 0^+} \frac{x + 1 - e^x}{x(e^x - 1)} \qquad \left(\text{type } \tfrac{0}{0}\right)$$
$$= \lim_{x\to 0^+} \frac{1 - e^x}{e^x - 1 + xe^x} \qquad \left(\text{still type } \tfrac{0}{0}\right)$$
$$= \lim_{x\to 0^+} \frac{-e^x}{2e^x + xe^x} = -\frac{1}{2}$$
2. For an indeterminate form of type $1^\infty$, we use the log laws & continuity of the exponential:
$$\lim_{x\to 0^+} (1 + \sin x)^{1/x} = \exp\left(\lim_{x\to 0^+} \frac{1}{x} \ln(1 + \sin x)\right) \qquad \left(\text{type } \tfrac{0}{0}\right)$$
$$= \exp\left(\lim_{x\to 0^+} \frac{\cos x}{1 + \sin x}\right) = e^1 = e$$
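Both transformed forms in Examples 3.26 can be sampled near $0^+$; the sketch below is a floating-point illustration (not part of the notes):

```python
# Samples the two transformed indeterminate forms from Examples 3.26
# near 0+ (numerical illustration, not a proof).
import math

x = 1e-5

# 1/(e^x - 1) - 1/x -> -1/2
v1 = 1/(math.exp(x) - 1) - 1/x
assert abs(v1 + 0.5) < 1e-3

# (1 + sin x)^{1/x} -> e
v2 = (1 + math.sin(x))**(1/x)
assert abs(v2 - math.e) < 1e-3
print(v1, v2)
```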
Proving l'Hôpital's Rule
The complete argument is very lengthy. It starts with an extension of the Mean Value Theorem.
Lemma 3.27 (Extended Mean Value Theorem). Fix $a < b$, suppose $f, g$ are continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $\xi \in (a, b)$ such that
$$\big(f(b) - f(a)\big)\, g'(\xi) = \big(g(b) - g(a)\big)\, f'(\xi)$$
Proof. Apply the standard mean value theorem (really Rolle's theorem) to
$$h(t) = \big(f(b) - f(a)\big)\, g(t) - \big(g(b) - g(a)\big)\, f(t)$$
which satisfies $h(a) = h(b)$.
Now for the main event. If you do nothing else, read the following proof of the simplest case. Everything else is a modification.
Proof (Case (a)/type $\frac{0}{0}$, with right limits). Suppose we have a form of type $\frac{0}{0} = \lim_{x\to a^+} \frac{f(x)}{g(x)}$, taking right-limits at a finite location $a$, and that the resulting limit $L$ is finite.
First observe that condition 1 forces the existence of an interval $(a, b)$ on which $f, g$ are differentiable and $g'(x) \neq 0$. Everything follows from the definition of the limit in condition 1, and Lemma 3.27:
$$\text{Given } \epsilon > 0,\ \exists\, \delta \in (0, b - a) \text{ such that } a < \xi < a + \delta \implies \left|\frac{f'(\xi)}{g'(\xi)} - L\right| < \frac{\epsilon}{2} \qquad (*)$$
$$a < y < x < a + \delta \implies \exists\, \xi \in (y, x) \text{ such that } \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi)}{g'(\xi)} \qquad (\dagger)$$
Since $g' \neq 0$, the usual mean value theorem says
$$\exists\, c \in (y, x) \text{ such that } g(x) - g(y) = g'(c)(x - y) \neq 0$$
whence we never divide by zero in $(\dagger)$. Combining $(*)$ and $(\dagger)$, observe that
$$a < x < a + \delta \implies \left|\frac{f(x)}{g(x)} - L\right| \overset{2(a)}{=} \lim_{y\to a^+} \left|\frac{f(x) - f(y)}{g(x) - g(y)} - L\right| \overset{(\dagger)}{=} \lim_{y\to a^+} \left|\frac{f'(\xi)}{g'(\xi)} - L\right| \overset{(*)}{\leq} \frac{\epsilon}{2} < \epsilon$$
Note that $a < y < \xi(x, y) < x$ is a function of $x, y$ here! Since $\epsilon > 0$ is arbitrary, this is the required result.
A complete proof for all indeterminate forms of type $\frac{0}{0}$ follows from some simple modifications.
• If $a = \infty$: Replace the blue part of $(*)$ as follows:
Given $\epsilon > 0$, $\exists\, m \geq b$ such that $\xi > m \implies \left|\frac{f'(\xi)}{g'(\xi)} - L\right| < \frac{\epsilon}{2}$
The rest of the proof goes through after replacing $a$ with $\infty$ and $a + \delta$ with $m$.
• If $L = \infty$: Replace the green parts of $(*)$ with 'Given $M > 0$' and '$\frac{f'(\xi)}{g'(\xi)} > 2M$'. Fixing the rest of the proof is again straightforward.
• If $L = -\infty$: Replace the green parts of $(*)$ with 'Given $M > 0$' and '$\frac{f'(\xi)}{g'(\xi)} < -2M$'.
• Left-limits: If $f, g$ are differentiable on $(c, a)$, then the blue part may be replaced with either:
($a$ finite) $\exists\, \delta \in (0, a - c)$ such that $a - \delta < \xi < a$
($a = -\infty$) $\exists\, m \leq c$ such that $\xi < m$
The blue and green parts of $(*)$ may be replaced independently.
Proof (Case (b), $\lim g(x) = \infty$). This requires a little more care.$^{15}$ Since $g' \neq 0$, and $\lim_{x\to a^+} g(x) = \infty$, Exercise 3.29.6 says that $g$ is strictly decreasing on $(a, b)$. By replacing $b$ by some $\tilde{b} \in (a, b)$, if necessary, we may assume that
$$a < y < x < b \implies 0 < g(x) < g(y) \qquad (\ddagger)$$
Assume $a$ and $L$ are finite and obtain $(*)$ and $(\dagger)$ as before. Let $x \in (a, a + \delta)$ be fixed and multiply $(\dagger)$ by $\frac{g(y) - g(x)}{g(y)}$ (this is positive by $(\ddagger)$): a little algebra and the triangle inequality tell us that
$$a < y < x \implies \frac{f(y)}{g(y)} = \frac{f'(\xi)}{g'(\xi)} + \frac{f(x)}{g(y)} - \frac{g(x)}{g(y)} \cdot \frac{f'(\xi)}{g'(\xi)}$$
$$\implies \left|\frac{f(y)}{g(y)} - L\right| \leq \left|\frac{f'(\xi)}{g'(\xi)} - L\right| + \frac{1}{g(y)}\left(|f(x)| + |g(x)|\left(|L| + \frac{\epsilon}{2}\right)\right)$$
Since $\lim_{y\to a^+} g(y) = \infty$ and $x$ is fixed, we see that there exists $\eta \leq x - a < \delta$ such that
$$y \in (a, a + \eta) \implies \frac{1}{g(y)}\left(|f(x)| + |g(x)|\left(|L| + \frac{\epsilon}{2}\right)\right) < \frac{\epsilon}{2}$$
Finally combine with $(*)$: given $\epsilon > 0$, $\exists\, \eta > 0$ such that $y \in (a, a + \eta) \implies \left|\frac{f(y)}{g(y)} - L\right| < \epsilon$.
The same modifications listed above complete the proof.
$^{15}$Forms of type $\frac{\infty}{\infty}$? Instead of assumption 2.(b), why not simply assume $\lim f = \lim g = \infty$ and write $\frac{f}{g} = \frac{1/g}{1/f}$ to obtain a form of type $\frac{0}{0}$? The problem is that the derivative of the 'new' denominator $\frac{d}{dx} \frac{1}{f} = \frac{-f'}{f^2}$ need not be non-zero on any interval $(a, b)$ and so condition 1 need not hold. We could modify this, but it would make for a weaker theorem. Example 3.25.4 illustrates the issue: $f'(x) = 1 + \sin x$ has zeros on any unbounded interval.
After the 2.(b) case is proved and we know that $\lim \frac{f}{g} = L$, it is then clear that $\lim f$ must also be infinite (unless $L = 0$, in which case $\lim f$ could be anything and need not exist). This situation therefore really does deal with forms of type $\frac{\infty}{\infty}$.
Exercises 3.30. Key concepts: Types of indeterminate forms, Formal statement of l'Hôpital's rule
1. Evaluate the limits, if they exist:
(a) $\lim_{x\to 0} \frac{x^3}{\sin x - x}$  (b) $\lim_{x\to \frac{\pi}{2}} \frac{\tan x}{\frac{2}{\pi - 2x}}$  (c) $\lim_{x\to 0} (\cos x)^{1/x^2}$  (d) $\lim_{x\to 0} (1 + 2x)^{1/x}$  (e) $\lim_{x\to\infty} \left(e^x + x\right)^{1/x}$
2. Suppose $f$ is differentiable on $(c, \infty)$ and that $\lim_{x\to\infty} [f(x) + f'(x)] = L$ is finite.
(a) Prove that $\lim_{x\to\infty} f(x) = L$ and that $\lim_{x\to\infty} f'(x) = 0$.
(Hint: write $f(x) = \frac{f(x)e^x}{e^x}$)
(b) Does anything change if $L$ exists and is infinite?
3. If $p_n(x)$ is a polynomial of degree $n$, use induction to prove that $\lim_{x\to\infty} p_n(x)e^{-x} = 0$.
4. Let f(x) = x + \sin x \cos x, g(x) = e^{\sin x} f(x), and h(x) = \frac{2 \cos x}{e^{\sin x} (f(x) + 2 \cos x)}.

   (a) Prove that \lim_{x\to\infty} f(x) = ∞ = \lim_{x\to\infty} g(x), but that \lim_{x\to\infty} \frac{f(x)}{g(x)} does not exist.
   (b) If \cos x ≠ 0 and x is large, show that \frac{f'(x)}{g'(x)} = h(x).
   (c) Prove that \lim_{x\to\infty} h(x) = 0. Explain why this does not contradict part (a)!
3.31 Taylor’s Theorem
A primary goal of power series is the approximation of functions. With this in mind, there are two
natural questions to ask of a function f:

1. Given c ∈ dom(f), is there a series \sum a_n (x - c)^n which equals f(x) on an interval containing c?
2. If we take the first n terms of such a series, how accurate is this polynomial approximation?
Example 3.28. Recall the geometric series

    f(x) = \frac{1}{1-x} = \sum_{n=0}^\infty x^n   whenever -1 < x < 1

The polynomial approximation

    p_n(x) = \sum_{k=0}^n x^k = 1 + x + \cdots + x^n = \frac{1 - x^{n+1}}{1-x}

has error

    R_n(x) = f(x) - p_n(x) = \frac{x^{n+1}}{1-x}

[Figure: graphs of y = 1/(1 - x) and the cubic approximation p_3(x) = 1 + x + x^2 + x^3]

If x is close to 0, this is likely very small; for instance if x ∈ [-1/2, 1/2], then

    |R_n(x)| ≤ \frac{1}{1 - \frac{1}{2}} \left( \frac{1}{2} \right)^{n+1} = 2^{-n}

However, when x is close to 1 the error is unbounded!
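The closed form for the remainder is easy to sanity-check numerically. A minimal Python sketch (the function names `p` and `remainder` are my own):

```python
# Compare the actual truncation error of the geometric series with the
# closed form R_n(x) = x^(n+1) / (1 - x).
def p(n, x):
    """Partial sum 1 + x + ... + x^n."""
    return sum(x**k for k in range(n + 1))

def remainder(n, x):
    """Closed form for the error f(x) - p_n(x)."""
    return x**(n + 1) / (1 - x)

x, n = 0.5, 10
f = 1 / (1 - x)
print(f - p(n, x), remainder(n, x))   # the two values agree

# Near x = 1 the error stays large even for big n:
print(remainder(50, 0.99))
```

Increasing n shrinks the error geometrically for |x| < 1, but the factor 1/(1 - x) blows up as x → 1, exactly as the example warns.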
The above behavior occurs in general: the truncated polynomials provide better approximations
nearer the center of the series. To see this, we first need to consider higher-order derivatives.
Definition 3.29. We write f'' for the second derivative of f, namely the derivative of its derivative

    f''(a) = \lim_{x\to a} \frac{f'(x) - f'(a)}{x - a}

The existence of f''(a) presupposes that f' exists on an (open) interval containing a. We can similarly
consider third, fourth, and higher-order derivatives. As a function, the nth derivative is written

    f^{(n)}(x) = \frac{d^n f}{dx^n}

By convention, the zeroth derivative is the function itself: f^{(0)}(x) = f(x). We say that f is n times
differentiable at a if f^{(n)}(a) exists, and infinitely differentiable (or smooth) if derivatives of all orders exist.
Example 3.30. f(x) = x^2 |x| is twice differentiable, with f''(x) = 6|x|. It is smooth everywhere
except at x = 0, where third (and higher-order) derivatives do not exist.
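A central second difference gives a quick numeric check of this claim; a sketch, with names of my own choosing:

```python
# Estimate f''(x) for f(x) = x^2 |x| and compare with the claimed 6|x|.
def f(x):
    return x**2 * abs(x)

def second_diff(f, x, h=1e-4):
    """Central second-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for x in [0.5, -1.2, 0.0]:
    print(x, second_diff(f, x), 6 * abs(x))   # the two columns agree closely
```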
Definition 3.31. Suppose f is n times differentiable at x = c. The nth Taylor polynomial p_n of f
centered at c is

    p_n(x) := \sum_{k=0}^n \frac{f^{(k)}(c)}{k!} (x-c)^k = f(c) + f'(c)(x-c) + \frac{f''(c)}{2} (x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!} (x-c)^n

The remainder R_n(x) is the error in the polynomial approximation

    R_n(x) = f(x) - p_n(x) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(c)}{k!} (x-c)^k

If f is infinitely differentiable at x = c, then its Taylor series centered at x = c is the power series

    T_c f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(c)}{n!} (x-c)^n

When c = 0 this is known as a Maclaurin series.^{16}
For simplicity we’ll mostly work with Maclaurin series, with the general situation hopefully being clear.
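In code, a Taylor polynomial is just a finite sum built from derivative values at the center; a small Python sketch (the helper `taylor_poly` is my own):

```python
from math import exp, factorial

def taylor_poly(derivs, c):
    """Given derivs = [f(c), f'(c), ..., f^(n)(c)], return p_n as a function."""
    def p(x):
        return sum(d / factorial(k) * (x - c)**k for k, d in enumerate(derivs))
    return p

# Every derivative of e^x at c = 0 equals 1, so p_5 uses derivs = [1]*6:
p5 = taylor_poly([1] * 6, 0)
print(p5(1.0), exp(1.0))   # ~2.71667 vs ~2.71828
```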
Examples 3.32. 1. If f(x) = e^{3x}, then f^{(n)}(x) = 3^n e^{3x}, from which the Maclaurin series is

    T_0 f(x) = \sum_{n=0}^\infty \frac{3^n}{n!} x^n

2. If g(x) = \sin 7x, then the sequence of derivatives is

    7 \cos 7x,  -7^2 \sin 7x,  -7^3 \cos 7x,  7^4 \sin 7x,  7^5 \cos 7x,  -7^6 \sin 7x,  \ldots

At x = 0, every even derivative is zero whereas the odd derivatives alternate in sign. The
Maclaurin series is easily seen to be

    T_0 g(x) = \sum_{n=0}^\infty \frac{(-1)^n 7^{2n+1}}{(2n+1)!} x^{2n+1}

3. If h(x) = \sqrt{x}, then h'(x) = \frac{1}{2} x^{-1/2}, h''(x) = -\frac{1}{2^2} x^{-3/2}, and h'''(x) = \frac{3}{2^3} x^{-5/2}, from which the
third Taylor polynomial centered at c = 1 is

    p_3(x) = h(1) + h'(1)(x-1) + \frac{h''(1)}{2} (x-1)^2 + \frac{h'''(1)}{6} (x-1)^3
           = 1 + \frac{1}{2}(x-1) - \frac{1}{8}(x-1)^2 + \frac{1}{16}(x-1)^3
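Evaluating this cubic near the center c = 1 shows how good the approximation is; a quick check in Python (names are mine):

```python
from math import sqrt

def p3(x):
    """Third Taylor polynomial of sqrt(x) centered at c = 1."""
    t = x - 1
    return 1 + t / 2 - t**2 / 8 + t**3 / 16

for x in [0.9, 1.0, 1.1, 1.5]:
    print(x, p3(x), sqrt(x))   # agreement degrades as x moves away from 1
```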
Rather than computing further examples, we first develop a little theory that makes verifying Taylor
series much easier.
16
Named for Englishman Brook Taylor (1685–1731) and Scotsman Colin Maclaurin (1698–1746). Taylor’s general method
expanded on examples discovered by James Gregory and Isaac Newton in the mid-to-late 1600s.
Differentiation of Taylor Polynomials and Series

Suppose P(x) = \sum a_j x^j is a power series with radius of convergence R > 0. As we saw previously
(Theorem 2.31), P(x) is differentiable term-by-term on (-R, R). Indeed,

    P'(x) = \sum_{j=1}^\infty a_j j x^{j-1} ⟹ P'(0) = a_1
    P''(x) = \sum_{j=2}^\infty a_j j(j-1) x^{j-2} ⟹ P''(0) = 2 a_2
    P'''(x) = \sum_{j=3}^\infty a_j j(j-1)(j-2) x^{j-3} ⟹ P'''(0) = 3! a_3
        ⋮
    P^{(k)}(x) = \sum_{j=k}^\infty a_j j(j-1)\cdots(j-k+1) x^{j-k} = \sum_{j=k}^\infty \frac{j! a_j}{(j-k)!} x^{j-k} ⟹ P^{(k)}(0) = k! a_k
Otherwise said, P is its own Maclaurin series! The same discussion holds for polynomials. Indeed if
P(x) = a_0 + a_1 x + \cdots + a_n x^n is a polynomial and f a function, then

    P^{(k)}(0) = f^{(k)}(0) ⟺ a_k = \frac{f^{(k)}(0)}{k!}

If this holds for all k ≤ n, then P = p_n is the nth Taylor polynomial of f! With a little modification,
we’ve proved the following:
Theorem 3.33. 1. If f(x) = \sum a_n (x-c)^n is a power series defined on a neighborhood of c, then
T_c f(x) = f(x): the function is its own Taylor series!

2. The nth Taylor polynomial of f centered at x = c is the unique polynomial p_n of degree ≤ n
whose value and first n derivatives agree with those of f at x = c: that is,

    ∀k ≤ n,  p_n^{(k)}(c) = f^{(k)}(c)

This answers our first motivating question: a function can equal at most one power series with a
given center. The second question requires a careful study of the remainder: we’ll do this shortly.
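The computation P^{(k)}(0) = k! a_k behind the theorem can be replayed mechanically on a coefficient list; a Python sketch (names are my own):

```python
from math import factorial

def deriv(coeffs):
    """Coefficients [a_0, a_1, ...] of P  ->  coefficients of P'."""
    return [j * a for j, a in enumerate(coeffs)][1:]

coeffs = [5.0, -2.0, 3.0, 7.0]        # P(x) = 5 - 2x + 3x^2 + 7x^3
for k in range(len(coeffs)):
    p = coeffs
    for _ in range(k):                # differentiate k times
        p = deriv(p)
    print(k, p[0], factorial(k) * coeffs[k])   # P^(k)(0) equals k! a_k
```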
Examples 3.34 (Common Maclaurin Series). These should be familiar from elementary calculus.
Each function equals the given series from our previous discussions of power series: by the Theorem,
each series is immediately the Maclaurin series of the given function.

    e^x = \sum_{n=0}^\infty \frac{x^n}{n!},  x ∈ ℝ
    \frac{1}{1-x} = \sum_{n=0}^\infty x^n,  x ∈ (-1, 1)
    \sin x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1},  x ∈ ℝ
    \ln(1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n,  x ∈ (-1, 1]
    \cos x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} x^{2n},  x ∈ ℝ
    \tan^{-1} x = \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} x^{2n+1},  x ∈ [-1, 1]
Examples 3.35 (Modifying Maclaurin Series). By substituting for x in a common series, we quickly
obtain new series.

1. Substitute x ↦ 7x in the Maclaurin series for \sin x to recover our earlier example

    \sin 7x = \sum_{n=0}^\infty \frac{(-1)^n 7^{2n+1}}{(2n+1)!} x^{2n+1},  x ∈ ℝ

Note how this requires almost no calculation: since the function equals a series, the Theorem
says we have the Maclaurin series for \sin 7x!

2. Substitute x ↦ x^2 in the Maclaurin series for e^x to obtain

    e^{x^2} = \exp(x^2) = \sum_{n=0}^\infty \frac{1}{n!} x^{2n},  x ∈ ℝ

This would be disgusting to verify directly, given the difficulty of repeatedly differentiating e^{x^2}.

3. We find the Taylor series for f(x) = \frac{1}{5-x} centered at x = 2:

    f(x) = \frac{1}{3 + 2 - x} = \frac{1}{3\left(1 - \frac{x-2}{3}\right)} = \frac{1}{3} \sum_{n=0}^\infty \left( \frac{x-2}{3} \right)^n

which is valid whenever -1 < \frac{x-2}{3} < 1 ⟺ -1 < x < 5.
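Partial sums of the re-centered series can be checked against 1/(5 - x) directly; a numeric sketch (the helper `partial` is my own):

```python
def partial(x, N):
    """First N terms of (1/3) * sum(((x-2)/3)^n), the re-centered series."""
    return sum(((x - 2) / 3)**n for n in range(N)) / 3

for x in [0.0, 2.0, 4.5]:
    print(x, partial(x, 100), 1 / (5 - x))   # agreement for x inside (-1, 5)
```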
4. Fix c ∈ ℝ and observe that, for all x ∈ ℝ,

    e^x = e^{c + x - c} = e^c e^{x-c} = \sum_{n=0}^\infty \frac{e^c}{n!} (x-c)^n

We conclude that the series is the Taylor series of e^x centered at x = c. Of course this is easily
verified using the definition, since \frac{d^n}{dx^n}\Big|_{x=c} e^x = e^c.

5. Combining the Theorem with the multiple-angle formula, we obtain the Taylor series for \sin x
centered at x = c:

    \sin x = \sin(c + x - c) = \sin c \cos(x-c) + \cos c \sin(x-c)
           = \sum_{n=0}^\infty \frac{(-1)^n \sin c}{(2n)!} (x-c)^{2n} + \sum_{n=0}^\infty \frac{(-1)^n \cos c}{(2n+1)!} (x-c)^{2n+1}
Definition 3.36. A function f is analytic on its domain if every c ∈ dom f has a neighborhood on
which f(x) equals its Taylor series centered at c.

All the examples we’ve thus far seen are analytic on their domains; indeed the last two of Examples 3.35
prove this for the exponential and sine functions. Every analytic function is automatically
smooth (infinitely differentiable); however, the converse is false (Exercise 10). Analyticity is of greater
importance in complex analysis where (amazingly!) it is equivalent to complex-differentiability.
Accuracy of Taylor Approximations

Our final goal is to estimate the accuracy of a Taylor polynomial as an approximation to its generating
function. Otherwise said, we want to estimate the size of the remainder R_n(x) = f(x) - p_n(x).

Theorem 3.37 (Taylor’s Theorem: Lagrange’s form). Suppose f is n + 1 times differentiable on an
open interval I containing c and let x ∈ I \ {c}. Then there exists ξ between c and x for which the
remainder centered at c satisfies

    R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-c)^{n+1}
Proof. For simplicity let c = 0. Fix x ≠ 0, define a constant M_x and a function g : I → ℝ by

    R_n(x) = \frac{M_x}{(n+1)!} x^{n+1}   and   g(t) = \frac{M_x}{(n+1)!} t^{n+1} + p_n(t) - f(t) = \frac{M_x}{(n+1)!} t^{n+1} - R_n(t)

Observe that, for all k ≤ n + 1,

    g^{(k)}(t) = \frac{M_x}{(n+1-k)!} t^{n+1-k} + p_n^{(k)}(t) - f^{(k)}(t)    (∗)
    ⟹ g^{(k)}(0) = p_n^{(k)}(0) - f^{(k)}(0) = 0 if k ≤ n

where we invoked Theorem 3.33.

Now apply Rolle’s Theorem repeatedly, noting that g(0) = 0 = g(x) by the definition of M_x (WLOG assume x > 0):

• ∃ξ_1 between 0 and x such that g'(ξ_1) = 0.
• ∃ξ_2 between 0 and ξ_1 such that g''(ξ_2) = 0, etc.

Iterate to obtain a sequence (ξ_k) such that

    0 < ξ_{n+1} < ξ_n < \cdots < ξ_1 < x   and   g^{(k)}(ξ_k) = 0

Take ξ = ξ_{n+1} and consider (∗): since deg p_n ≤ n, we see that

    0 = g^{(n+1)}(ξ) = M_x - f^{(n+1)}(ξ) ⟹ R_n(x) = f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} x^{n+1}
Corollary 3.38. Suppose f is smooth on an open interval I containing c and that the derivatives f^{(n)}
of all orders are uniformly bounded on I (by a constant independent of n). Then f equals its Taylor series (centered at c) on I.

Proof. For simplicity, let c = 0. Suppose |f^{(n+1)}(ξ)| ≤ K for all ξ ∈ I and all n. Choose any N > |x| and
observe that

    n > N ⟹ |R_n(x)| ≤ \frac{K |x|^{n+1}}{(n+1)!} = \frac{K |x|^{n+1}}{N!(N+1)\cdots(n+1)} ≤ \frac{K |x|^N}{N!} \left( \frac{|x|}{N} \right)^{n+1-N} \xrightarrow{n\to\infty} 0
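The key fact in the proof, that (n+1)! eventually crushes K|x|^{n+1} no matter how large |x| is, is easy to watch numerically (the values K and x below are illustrative):

```python
from math import factorial

K, x = 1.0, 10.0
for n in [5, 10, 20, 40]:
    # The Lagrange bound K|x|^(n+1)/(n+1)!: it first grows, then collapses.
    print(n, K * abs(x)**(n + 1) / factorial(n + 1))
```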
Examples 3.39. 1. The functions sine and cosine have derivatives bounded by 1 on ℝ, and thus both
functions equal their Maclaurin series on ℝ. This removes the need to have previously justified
these facts using the theory of differential equations.

2. The exponential function does not have bounded derivatives, however we can still apply Taylor’s
Theorem. For any fixed x, ∃ξ between 0 and x such that

    |R_n(x)| = \frac{e^{\xi}}{(n+1)!} |x|^{n+1} \xrightarrow{n\to\infty} 0

by the same argument as in the Corollary. Thus e^x equals its Maclaurin series on the real line (we
knew this already from Exercise 3.28.14).
3. Extending Example 3.32.3, we see that h(x) = \sqrt{x} has linear approximation (1st Taylor polynomial)
centered at c = 9

    p_1(x) = h(9) + h'(9)(x - 9) = 3 + \frac{1}{6}(x - 9)

This yields the simple approximation

    \sqrt{10} ≈ p_1(10) = 3 + \frac{1}{6} = \frac{19}{6}

Taylor’s Theorem can be used to estimate its accuracy (remember to shift the center to 9!):

    R_1(10) = \frac{h''(\xi)}{2!} (10 - 9)^2 = -\frac{1}{2^2 \cdot 2!} \xi^{-3/2} = -\frac{1}{8 \xi^{3/2}}   for some ξ ∈ (9, 10)

Certainly ξ^{-3/2} < 9^{-3/2} = \frac{1}{27}, whence

    -\frac{1}{216} < R_1(10) < 0 ⟹ \frac{19}{6} - \frac{1}{216} = \frac{683}{216} < \sqrt{10} < \frac{684}{216} = \frac{19}{6}

\frac{19}{6} is therefore an overestimate for \sqrt{10}, but is accurate to within \frac{1}{216} < 0.005.
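These bounds are easy to confirm numerically in Python:

```python
from math import sqrt

# Verify 683/216 < sqrt(10) < 19/6 and that the error is below 1/216.
approx = 19 / 6
print(approx - sqrt(10), 1 / 216)   # error is positive and under 1/216
assert 683 / 216 < sqrt(10) < 19 / 6
assert 0 < approx - sqrt(10) < 1 / 216
```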
Alternative Versions of Taylor’s Theorem

There are two further common expressions for the remainder in Taylor’s Theorem. These are typically
less easy to use than Lagrange’s form but can sometimes provide sharper estimates for the
remainder, particularly when x is far from the center of the series.

Corollary 3.40. Suppose f^{(n+1)} is continuous on an open interval I containing c, let x ∈ I \ {c}, and
let R_n(x) = f(x) - p_n(x) be the remainder for the Taylor polynomial centered at c. Then:

1. (Integral Remainder) R_n(x) = \int_c^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\,dt

2. (Cauchy’s Form) ∃ξ between c and x such that R_n(x) = \frac{(x-\xi)^n}{n!} (x-c) f^{(n+1)}(\xi)
Using these expressions it is possible to explicitly prove Newton’s binomial series formula.

Corollary 3.41. If α ∈ ℝ and |x| < 1, then

    (1 + x)^\alpha = 1 + \sum_{n=1}^\infty \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!} x^n
                   = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!} x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!} x^3 + \frac{\alpha(\alpha-1)(\alpha-2)(\alpha-3)}{4!} x^4 + \cdots

If α ∈ ℕ_0, this is the usual binomial theorem. Otherwise it is more interesting: for instance,

    \sqrt{1 + x} = (1 + x)^{1/2} = 1 + \frac{1}{2} x - \frac{1}{8} x^2 + \frac{1}{16} x^3 - \frac{5}{128} x^4 + \cdots

    \frac{1}{(1 + x)^3} = 1 - 3x + 6x^2 - 10x^3 + 15x^4 - \cdots

Of course this last could easily be obtained from \frac{1}{1+x} = \sum (-1)^n x^n by differentiating twice!
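The coefficients α(α-1)···(α-n+1)/n! satisfy a simple recursion, so the series is cheap to sum; a Python sketch (the function name is mine):

```python
def binom_series(alpha, x, N):
    """Partial sum of Newton's binomial series for (1+x)^alpha, |x| < 1."""
    total, term = 1.0, 1.0
    for n in range(1, N):
        term *= (alpha - n + 1) / n * x   # a_n x^n from a_{n-1} x^{n-1}
        total += term
    return total

print(binom_series(0.5, 0.4, 50), 1.4**0.5)   # sqrt(1.4) two ways
print(binom_series(3.0, 0.2, 50), 1.2**3)     # integer alpha: series terminates
```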
Exercises 3.31. Key concepts: Taylor Series/Polynomials, Lagrange’s form for Remainder

1. Compute the Maclaurin series for \cos x directly from the definition and use Taylor’s Theorem
to indicate why it converges to \cos x for all x ∈ ℝ.

2. Repeat the previous exercise for \sinh x = \frac{1}{2}(e^x - e^{-x}) and \cosh x = \frac{1}{2}(e^x + e^{-x}).
3. Find the Maclaurin series for the function \sin(3x^2). How do you know you are correct?

4. Find the Taylor series of f(x) = x^4 - 3x^2 + 2x - 5 at x = 2 and show that T_2 f(x) = f(x).
5. Find a rational approximation to \sqrt[3]{9} using the first Taylor polynomial for f(x) = \sqrt[3]{x}. Now use
Taylor’s Theorem to estimate its accuracy.

6. If c ≠ 1, use the fact that 1 - x = (1 - c)\left(1 - \frac{x-c}{1-c}\right) to obtain the Taylor series of \frac{1}{1-x} centered at
c. Hence conclude that \frac{1}{1-x} is analytic on its domain ℝ \ {1}.
7. We prove that the Maclaurin series \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n converges to \ln(1 + x) whenever 0 < x ≤ 1.

   (a) Explicitly compute \frac{d^{n+1}}{dx^{n+1}} \ln(1 + x).
   (b) Suppose 0 < x ≤ 1. Using Taylor’s Theorem, prove that \lim_{n\to\infty} R_n(x) = 0.
   (If -1 < x < 0, the argument is tougher, being similar to Exercise 11)

8. Why can’t we use Taylor’s Theorem to approximate the error in \frac{1}{1-x} = 1 + x + R_1(x) when
x ≥ 1? Try it when x = 2: what happens? What about when x = -2?
9. Prove Taylor’s Theorem with integral remainder when c = 0 by using the following as an
induction step: for each n ∈ ℕ, define

    A_n(x) = \int_0^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\,dt

and use integration by parts to prove that A_{n+1} = A_n - \frac{x^{n+1}}{(n+1)!} f^{(n+1)}(0).

(The Cauchy form follows from the intermediate value theorem for integrals, which we’ll see later)
10. Consider the function

    f(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}

   (a) Prove by induction that there exists a degree 2n polynomial q_n for which

       f^{(n)}(x) = q_n\left(\frac{1}{x}\right) e^{-1/x}   whenever x > 0

   (b) Prove that f is infinitely differentiable at x = 0 with f^{(n)}(0) = 0 (use Exercise 3.30.3).

The Maclaurin series of f is identically zero! Moreover, f is smooth (infinitely differentiable) on ℝ but
non-analytic at zero since it does not equal its Taylor series on any open interval containing zero.

A modification allows us to create bump functions, which find wide use in analysis. If a < b, define

    g_{a,b} : x ↦ f(x - a) f(b - x)

This is smooth on ℝ but non-zero only on the interval (a, b). A further modification involving two
such functions g_{a,b} creates a smooth function on ℝ which satisfies

    h_{a,b,\epsilon}(x) = \begin{cases} 0 & \text{if } x ≤ a - \epsilon \text{ or } x ≥ b + \epsilon \\ 1 & \text{if } a ≤ x ≤ b \end{cases}

This ‘switches on’ rapidly from 0 to 1 near a and switches off similarly near b. By letting ϵ be small,
we smoothly (but not uniformly) approximate the indicator function on [a, b].

[Figure: graph of h_{a,b,ϵ}: zero for x ≤ a - ϵ and x ≥ b + ϵ, equal to 1 on [a, b]]
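One standard way to build a smooth switch from f uses the ratio f(t)/(f(t) + f(1 - t)); the construction below is a sketch of that idea (names and details are my own, not the text's exact definition):

```python
from math import exp

def f(x):
    """The smooth-but-non-analytic function: e^{-1/x} for x > 0, else 0."""
    return exp(-1.0 / x) if x > 0 else 0.0

def step(t):
    """Smooth: 0 for t <= 0, 1 for t >= 1; the denominator is never zero."""
    return f(t) / (f(t) + f(1 - t))

def h(x, a, b, eps):
    """Smooth 'switch': 0 outside (a-eps, b+eps), 1 on [a, b]."""
    return step((x - (a - eps)) / eps) * step(((b + eps) - x) / eps)

print([round(h(x, 0, 1, 0.25), 3) for x in [-0.5, -0.1, 0.5, 1.1, 1.5]])
```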
11. (Hard) We prove the binomial series formula (Corollary 3.41).

Let f(x) = (1 + x)^\alpha and g(x) = 1 + \sum_{n=1}^\infty a_n x^n where a_n = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}. Our goal is to prove
that f = g on the interval (-1, 1).

   (a) Check that f^{(n)}(0) = n! a_n, so that g really is the Maclaurin series of f.

   (b) i. Prove that the radius of convergence of g is 1.
       ii. Prove that \lim_{n\to\infty} n a_n x^n = 0 whenever |x| < 1.
       iii. If |x| < 1 and ξ lies between 0 and x, prove that \left| \frac{x - \xi}{1 + \xi} \right| ≤ |x|.
       (Hint: write ξ = tx for some t ∈ (0, 1). . . )

   (c) Use Taylor’s Theorem with Cauchy remainder to prove that

       |R_n(x)| < (n+1) |a_{n+1}| |x|^{n+1} (1 + \xi)^{\alpha - 1}

       Hence conclude that g = f whenever |x| < 1.

   (d) Here is an alternative argument for the full result:
       i. Show that (n+1) a_{n+1} + n a_n = \alpha a_n.
       ii. Differentiate term-by-term to prove directly that g satisfies the differential equation
       (1 + x) g'(x) = \alpha g(x). Solve this to show that g = f whenever |x| < 1.