3 Differentiation
Differentiation grew out of the problem of instantaneous velocity. Velocity can only easily be measured as an average over a time interval:$^{13}$ if an object travels $d$ meters in $t$ seconds, then its average velocity is $v_{av} = \frac{d}{t}\,\mathrm{ms}^{-1}$. An early 'definition' (dating to the 1300s) makes the instantaneous velocity equal to the constant velocity that would be observed if a body were to stop accelerating: while useless for the purposes of measurement, this is essentially Newton's first law regarding inertial motion (1687). We also see the concept of the tangent line beginning to appear. Indeed if one graphs
position against time, intuition tells us:
position against time, intuition tells us:
The graph of inertial (constant speed) motion is a straight line whose slope is the velocity.
The tangent line to a curve has slope equal to the instantaneous velocity.
The problem of finding, defining and computing instantaneous velocity thus morphed into the con-
sideration of tangent lines to curves. With the advent of analytic geometry in the early 1600s, math-
ematicians such as Fermat and Descartes pioneered versions of the familiar secant (‘cutting’) line
method for computing tangents.
[Figures: distance–time graphs. Left: instantaneous velocity equals the constant velocity corresponding to the tangent line. Right: secant lines approximate the tangent line as $t \to a$.]
The average velocity of the particle over the time interval $[a, t]$ is the slope of the secant line, namely
$$v_{av}(a, t) = \frac{d(t) - d(a)}{t - a}$$
Since the secant lines approximate the tangent line as $t$ approaches $a$, it seems reasonable that we should compute the instantaneous velocity in this manner:
$$v(a) = \lim_{t\to a} v_{av}(a, t) = \lim_{t\to a} \frac{d(t) - d(a)}{t - a}$$
This is, of course, the modern definition of the derivative.
$^{13}$Even a modern technique such as Doppler-shift compares measurements separated by the extremely small period of a light or sound wave. These are still therefore average velocities, albeit taken over very small time intervals.
3.28 Basic Properties of the Derivative
Definition 3.1. Let $f : U \to \mathbb{R}$ and $a \in U$ an interior point. We say that $f$ is differentiable at $a$ if the following limit exists (is finite!)
$$\lim_{x\to a} \frac{f(x) - f(a)}{x - a}$$
We call this limit the derivative of $f$ at $a$ and denote its value by $\frac{df}{dx}\big|_{x=a}$ or $f'(a)$.
If $f'(a)$ exists for all $a \in U$ then $f$ is differentiable (on $U$); the derivative becomes a function $f'(x) = \frac{df}{dx}$.
The two notations are partly attributable to the primary founders of calculus: Isaac Newton and Gottfried Leibniz. Each has its pros and cons and you should be comfortable with both.
One-sided derivatives Differentiability only makes sense at interior points of U since the defining
limit is two-sided. Left- and right-derivatives may be defined using one-sided limits; differentiability
is then equivalent to these being equal. All results in this section hold for one-sided derivatives with
suitable (sometimes tedious) modifications. It is common, though strictly incorrect, to say that f is
differentiable on [a, b) if it is differentiable on the interior (a, b) and right-differentiable at a. In these
notes we will strictly adhere to Definition 3.1: differentiable means two-sided.
Examples 3.2. Basic examples should be familiar from elementary calculus.
1. Let $f(x) = x^2 + 4x$. Then, for any $a \in \mathbb{R}$,
$$\lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \frac{x^2 + 4x - a^2 - 4a}{x - a} = \lim_{x\to a} \frac{(x - a)(x + a + 4)}{x - a} = \lim_{x\to a} (x + a + 4) = 2a + 4$$
Note how the definition of $\lim_{x\to a}$ allows us to cancel the $x - a$ terms from the numerator and denominator. We conclude that $f$ is differentiable (on $\mathbb{R}$) and that $f'(x) = 2x + 4$.
2. Let $g(x) = \frac{x+1}{2x-3}$. Then, for any $a \neq \frac{3}{2}$,
$$\lim_{x\to a} \frac{g(x) - g(a)}{x - a} = \lim_{x\to a} \frac{1}{x - a}\left(\frac{x+1}{2x-3} - \frac{a+1}{2a-3}\right) = \lim_{x\to a} \frac{5a - 5x}{(x - a)(2x - 3)(2a - 3)} = \lim_{x\to a} \frac{-5}{(2x - 3)(2a - 3)} = \frac{-5}{(2a - 3)^2}$$
$g$ is therefore differentiable on its domain $\mathbb{R} \setminus \{\frac{3}{2}\}$ with derivative $g'(x) = \frac{-5}{(2x-3)^2}$.
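These limit computations can be sanity-checked numerically; the following sketch (an illustration only, not part of the notes) evaluates the difference quotient of $f(x) = x^2 + 4x$ at $a = 3$ for shrinking step sizes:

```python
# Numerical illustration: the difference quotient of f(x) = x^2 + 4x at a
# approaches f'(a) = 2a + 4 as the step h shrinks to 0.
def diff_quotient(f, a, h):
    return (f(a + h) - f(a)) / h

f = lambda x: x**2 + 4*x
a = 3.0
for h in [1e-1, 1e-3, 1e-5]:
    print(h, diff_quotient(f, a, h))  # tends to 2*3 + 4 = 10
```

Algebraically the quotient here is exactly $10 + h$, so the convergence is visible immediately.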
The familiar expressions
$$f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h}, \qquad f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h}$$
are equivalent to the original definition (Exercise 5). While seemingly simpler, they sometimes lead to nastier calculations: see what happens if you try the previous example in this language. . .
We now turn to perhaps the most well-known result of elementary calculus.
Theorem 3.3 (Power Law). Let $r \in \mathbb{R}$. Then $f(x) = x^r$ is differentiable with $f'(x) = rx^{r-1}$.
The domains of $f$ and $f'$ depend messily on $r$, but the formula holds at least on the interval $(0, \infty)$. We leave a complete proof to the exercises and instead consider a few generalizable examples.
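Before the examples, here is a quick numerical spot-check of the power law (an illustration under floating-point arithmetic, not a proof and not part of the notes):

```python
# Checks d/dx x^n = n x^{n-1} at a few sample points using a symmetric
# difference quotient (illustrative only).
def num_deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for n in [2, 3, 5]:
    for a in [0.5, 1.0, 2.0]:
        exact = n * a**(n - 1)
        approx = num_deriv(lambda x: x**n, a)
        assert abs(approx - exact) < 1e-4
print("power law verified at sample points")
```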
Examples 3.4. 1. If $n \in \mathbb{N}$ and $a \in \mathbb{R}$, a simple factorization yields
$$\lim_{x\to a} \frac{x^n - a^n}{x - a} = \lim_{x\to a} \frac{(x - a)(x^{n-1} + ax^{n-2} + \cdots + a^{n-2}x + a^{n-1})}{x - a} \qquad (*)$$
$$= \lim_{x\to a} \left(x^{n-1} + ax^{n-2} + \cdots + a^{n-2}x + a^{n-1}\right) = na^{n-1}$$
We conclude that $\frac{d}{dx} x^n = nx^{n-1}$.
2. If $f(x) = x^{-1}$ and $a \neq 0$, then
$$\lim_{x\to a} \frac{x^{-1} - a^{-1}}{x - a} = \lim_{x\to a} \frac{a - x}{ax(x - a)} = \lim_{x\to a} \frac{-1}{ax} = \frac{-1}{a^2}$$
from which we conclude that $f'(x) = -x^{-2}$. A similar approach followed by the factorization $(*)$ proves the power law for all negative integer exponents:
$$\frac{x^{-n} - a^{-n}}{x - a} = \frac{a^n - x^n}{a^n x^n (x - a)} = \cdots$$
3. To differentiate $x^{1/n}$, substitute $x = y^n$ and observe case 1. For instance, if $g(x) = x^{1/3}$ and $a \neq 0$, then $y = x^{1/3}$ and $b = a^{1/3}$ yield
$$\lim_{x\to a} \frac{x^{1/3} - a^{1/3}}{x - a} = \lim_{y\to b} \frac{y - b}{y^3 - b^3} = \frac{1}{3b^2} = \frac{1}{3} a^{-2/3} \implies g'(x) = \frac{1}{3} x^{-2/3}$$
Note that $g$ is not differentiable at $x = 0$!
[Figure: graph of $y = x^{1/3}$ with its vertical tangent at the origin.]
We could similarly compute the derivative for all rational exponents, though it is much easier to wait
for the chain rule. The power law for irrational exponents is somewhat more ticklish.
Corollary 3.5 (Basic Transcendental Functions). Recalling our development of power series in Chapter 2, the power law (for positive integers!) is all we need to see that
$$\frac{d}{dx} \exp(x) = \exp(x), \qquad \frac{d}{dx} \sin x = \cos x, \qquad \frac{d}{dx} \cos x = -\sin x$$
It is also possible to develop these results independently of power series (see e.g. Exercise 12).
Failure of differentiability
It is instructive to consider how a function might fail to be differentiable. Firstly, a familiar fact shows
that functions are not differentiable at discontinuities.
Lemma 3.6. If f is differentiable at a then f is continuous at a.
Proof. Just take the limit (think carefully why this works!):
$$\lim_{x\to a} f(x) = \lim_{x\to a} \left[\frac{f(x) - f(a)}{x - a}(x - a) + f(a)\right] = f'(a) \cdot 0 + f(a) = f(a)$$
It remains to consider situations when a function is continuous but not differentiable.
Examples 3.7. The following exemplify all situations where a function is continuous on an interval
and differentiable everywhere except at a single interior point. As with isolated discontinuities, these
are classified by considering the three ways in which the derivative limit might not converge.
1. A vertical tangent line occurs when the limit is infinite. For instance, $g(x) = x^{1/3}$ at $x = 0$.
2. Corners occur when the one-sided limits are unequal (could be infinite). For instance, $f(x) = |x|$ is not differentiable at zero, with one-sided limits
$$\lim_{x\to 0^+} \frac{|x| - |0|}{x - 0} = \lim_{x\to 0^+} \frac{x}{x} = 1 \neq -1 = \lim_{x\to 0^-} \frac{-x}{x} = \lim_{x\to 0^-} \frac{|x| - |0|}{x - 0}$$
Indeed $f$ is differentiable everywhere except at zero, with
$$f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases}$$
A cusp describes the special case where the one-sided limits are $+\infty$ and $-\infty$.
3. A singularity is where left- and/or right-limits do not exist. The standard example is
$$f(x) = \begin{cases} x \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$
which is continuous on $\mathbb{R}$ and differentiable everywhere except at zero: the details are in Exercise 10.
[Figure: graph of $x \sin\frac{1}{x}$ oscillating near the origin.]
Singularities and vertical tangent lines can also prevent one-sided differentiability.
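The corner and the singularity above can both be observed numerically; the sketch below (illustrative only, not part of the notes) computes one-sided difference quotients at $0$ for $|x|$ and for $x\sin\frac{1}{x}$:

```python
# One-sided difference quotients at 0: a corner (|x|) gives two different
# one-sided limits; x*sin(1/x) gives a quotient sin(1/h) with no limit at all.
import math

def quotient(f, h):
    return (f(h) - f(0.0)) / h

abs_f = lambda x: abs(x)
osc_f = lambda x: x * math.sin(1.0 / x) if x != 0 else 0.0

# Corner: one-sided limits exist but disagree (+1 vs -1).
right = quotient(abs_f, 1e-8)
left = quotient(abs_f, -1e-8)
print(right, left)  # 1.0 -1.0

# Singularity: the quotient equals sin(1/h), which oscillates in [-1, 1].
samples = [quotient(osc_f, 1.0 / (k * 1000.0)) for k in range(1, 50)]
print(min(samples), max(samples))  # values spread across [-1, 1]; no limit
```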
More esoteric examples of non-differentiability are possible:
Utilizing series, we can create functions which are continuous on an interval but nowhere differ-
entiable! For an example, see Exercise 15.
It is also possible to construct a function which is differentiable (and thus continuous) at precisely one point; can you think of an example?
The Basic Rules of Differentiation
Theorem 3.8. Let $f, g$ be differentiable and $k, l$ be constants.
1. (Linearity) The function $kf + lg$ is differentiable with $(kf + lg)' = kf' + lg'$.
2. (Product rule) The function $fg$ is differentiable with $(fg)' = f'g + fg'$.
3. (Inverse functions) Suppose $f$ is bijective, $b = f(a)$ is an interior point of $\operatorname{dom} f^{-1}$, and $f'(a) \neq 0$. Then $f^{-1}$ is differentiable at $b$ and
$$\frac{d}{dy}\bigg|_{y=b} f^{-1}(y) = \frac{1}{f'(a)} = \frac{1}{f'\big(f^{-1}(b)\big)}$$
Proof. Parts 1 and 2 follow from the limit laws:
$$\lim_{x\to a} \frac{(kf + lg)(x) - (kf + lg)(a)}{x - a} = \lim_{x\to a} \left[k \frac{f(x) - f(a)}{x - a} + l \frac{g(x) - g(a)}{x - a}\right] = kf'(a) + lg'(a)$$
$$\lim_{x\to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} = \lim_{x\to a} \left[\frac{f(x) - f(a)}{x - a}\, g(x) + f(a)\, \frac{g(x) - g(a)}{x - a}\right] = f'(a)g(a) + f(a)g'(a)$$
Note where we used the continuity of $g$ in the second line ($\lim g(x) = g(a)$). Part 3 is an exercise.
The inverse function rule should be intuitive: since the graphs of $f$ and $f^{-1}$ are related by reflection in the diagonal $y = x$, gradients at corresponding points are reciprocals. The result feels even more natural in Leibniz's notation: $\frac{dx}{dy} = \frac{1}{dy/dx}$.
Examples 3.9. 1. Linearity permits the differentiation of any polynomial: e.g.,
$$\frac{d}{dx}\left(7x^2 + 13x^4\right) = 7 \frac{d}{dx} x^2 + 13 \frac{d}{dx} x^4 = 14x + 52x^3$$
2. The product rule extends the reach of differentiation to include simple combinations: e.g.,
$$\frac{d}{dx}\left(x^4 \sin x\right) = \left(\frac{d}{dx} x^4\right)\sin x + x^4 \frac{d}{dx} \sin x = 4x^3 \sin x + x^4 \cos x$$
3. Inverse trigonometric functions can now be differentiated: e.g.,
$$y = \sin^{-1} x \implies \frac{d}{dx} \sin^{-1} x = \frac{dy}{dx} = \left(\frac{dx}{dy}\right)^{-1} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$$
4. Define the natural logarithm to be the inverse of the (bijective!) exponential function $\exp(x)$:
$$y = \ln x \iff x = \exp y$$
It follows that
$$\frac{d}{dx} \ln x = \left(\frac{dx}{dy}\right)^{-1} = \frac{1}{\exp y} = \frac{1}{x}$$
The full details, and the justification that $\exp x = e^x$, are in Exercise 14.
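The two inverse-function computations above are easy to confirm numerically; the following sketch (illustrative only, not part of the notes) compares the closed forms against symmetric difference quotients:

```python
# Compares the inverse-function-rule answers for arcsin and ln against
# symmetric difference quotients at sample points (illustrative check).
import math

def num_deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-0.5, 0.0, 0.5]:
    assert abs(num_deriv(math.asin, x) - 1.0 / math.sqrt(1 - x**2)) < 1e-6
for x in [0.5, 1.0, 3.0]:
    assert abs(num_deriv(math.log, x) - 1.0 / x) < 1e-6
print("inverse function rule checked")
```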
Theorem 3.10 (Chain Rule). If $g$ is differentiable at $a$, and $f$ is differentiable at $g(a)$, then $f \circ g$ is differentiable at $a$ with derivative
$$(f \circ g)'(a) = f'\big(g(a)\big)\, g'(a)$$
In Leibniz's notation, $\frac{d(f \circ g)}{dx} = \frac{df}{dg}\frac{dg}{dx}$: this looks like a simple cancellation of the $dg$ terms. . .$^{14}$
Proof. Since $f$ and $g$ are differentiable, $a$ is interior to $\operatorname{dom}(g)$ and $g(a)$ is interior to $\operatorname{dom}(f)$. Since $g$ is continuous at $a$, there must exist some open interval $U \ni a$ for which $x \in U \implies g(x) \in \operatorname{dom}(f)$. Define $\gamma : \operatorname{dom}(f) \to \mathbb{R}$ via
$$\gamma(v) = \begin{cases} \dfrac{f(v) - f\big(g(a)\big)}{v - g(a)} & \text{if } v \neq g(a) \\ f'\big(g(a)\big) & \text{if } v = g(a) \end{cases} \qquad (*)$$
Since $f$ is differentiable at $g(a)$, we see that $\gamma$ is continuous there: indeed $\lim_{v\to g(a)} \gamma(v) = f'\big(g(a)\big)$. For any $x \in U \setminus \{a\}$, let $v = g(x)$ in $(*)$. Then
$$\frac{f\big(g(x)\big) - f\big(g(a)\big)}{x - a} = \gamma\big(g(x)\big)\, \frac{g(x) - g(a)}{x - a}$$
Take limits as $x \to a$ for the result.
Corollary 3.11 (Quotient Rule). Suppose $f$ and $g$ are differentiable. Then $\frac{f}{g}$ is differentiable whenever $g(x) \neq 0$. Moreover
$$\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$$
The proof is an exercise.
Examples 3.12. 1. By the quotient rule,
$$\frac{d}{dx} \tan x = \frac{d}{dx} \frac{\sin x}{\cos x} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \sec^2 x$$
2. We can now differentiate highly involved combinations of elementary functions:
$$\frac{d}{dx}\left[\tan\big(e^{4x^2}\big) - \frac{7x}{\sin x}\right] = 8x e^{4x^2} \sec^2\big(e^{4x^2}\big) - \frac{7\sin x - 7x\cos x}{\sin^2 x}$$
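As a sanity check (not in the notes), the hand-computed derivative in example 2 can be compared against a difference quotient at a sample point; `F` and `dF` below transcribe the example's function and its claimed derivative:

```python
# Checks the claimed derivative of tan(e^{4x^2}) - 7x/sin(x) against a
# symmetric difference quotient at x = 0.3 (illustrative spot-check).
import math

def F(x):
    return math.tan(math.exp(4 * x**2)) - 7 * x / math.sin(x)

def dF(x):  # the derivative computed in the example
    u = math.exp(4 * x**2)
    return 8 * x * u / math.cos(u)**2 - (7 * math.sin(x) - 7 * x * math.cos(x)) / math.sin(x)**2

h, x = 1e-7, 0.3
approx = (F(x + h) - F(x - h)) / (2 * h)
assert abs(approx - dF(x)) / abs(dF(x)) < 1e-4
print("chain/quotient computation verified at x = 0.3")
```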
$^{14}$This is completely unjustified since $dg$ does not (for us) have independent meaning. The same problem appears in a famously flawed one-line 'proof' of the chain rule:
$$\lim_{x\to a} \frac{f\big(g(x)\big) - f\big(g(a)\big)}{x - a} \overset{?}{=} \lim_{x\to a} \frac{f\big(g(x)\big) - f\big(g(a)\big)}{g(x) - g(a)} \cdot \lim_{x\to a} \frac{g(x) - g(a)}{x - a}$$
The first limit doesn't make sense unless $g(x) \neq g(a)$ for all $x$ on some punctured neighborhood of $a$: in particular, $g(x)$ cannot be constant! The faulty argument may be repaired by replacing this difference quotient with $f'\big(g(a)\big)$ whenever $g(x) = g(a)$, before taking the limit. This is precisely what $\gamma\big(g(x)\big)$ does in the correct proof.
Exercises 3.28. Key concepts: Differentiability, Basic rules: linearity, power, product, chain, quotient
1. Use Definition 3.1 to calculate the derivatives.
(a) $f(x) = x^3$ at $x = 2$  (b) $g(x) = x + 2$ at $x = a$  (c) $f(x) = x^2 \cos x$ at $x = 0$  (d) $r(x) = \frac{3x+4}{2x-1}$ at $x = 1$
2. Differentiate the function $f(x) = \cos\big(e^{x^5} - 3x\big)$ using the chain and product rules.
3. (a) Prove the quotient rule (Corollary 3.11) by combining the chain and product rules.
(b) Prove the inverse derivative rule (Theorem 3.8, part 3).
(Hint: You can't simply differentiate $1 = \frac{dx}{dx} = \frac{d}{dx} f\big(f^{-1}(x)\big)$ using the chain rule; why not?)
4. (a) Find the derivatives of secant, cosecant and cotangent using the quotient rule.
(b) Why did we choose the positive square-root when computing $\frac{d}{dx} \sin^{-1} x$? What is the standard domain of arcsine, and what happens at $x = \pm 1$?
(c) Find the derivatives of the inverse trigonometric functions using the inverse function rule.
5. Using the definition of the derivative, and supposing that $f$ is differentiable at $a$, prove that
$$f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h\to 0} \frac{f(a + h) - f(a - h)}{2h}$$
6. Use induction to prove the power law $\frac{d}{dx} x^n = nx^{n-1}$ when $n \in \mathbb{N}$, using only the product rule and the fact that $\frac{d}{dx} x = 1$.
7. Prove that $f(x) = x|x|$ is differentiable everywhere and compute its derivative.
8. Show that $f(x) = x^{2/3}$ has a cusp (see Example 3.7.2) at $x = 0$.
9. Show that the following function is differentiable everywhere and compute its derivative:
$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$
Moreover, prove that the derivative $f'$ is discontinuous at $x = 0$.
10. Prove that the function in Example 3.7.3 is differentiable everywhere except at x = 0.
11. Suppose $f(x) = x^2$ whenever $x \in \mathbb{Q}$ and $f(x) = 0$ whenever $x \notin \mathbb{Q}$. At what values of $x$ is $f$ differentiable? Prove your assertion.
12. (a) Suppose $0 < h < \frac{\pi}{2}$. Use the picture to show that
$$0 < \frac{1 - \cos h}{h} < \sin\frac{h}{2} \qquad\text{and}\qquad \sin h < h < \tan h$$
Hence conclude that $\lim_{h\to 0} \frac{\sin h}{h} = 1$ and $\lim_{h\to 0} \frac{1 - \cos h}{h} = 0$.
(b) Use part (a) to prove that $\frac{d}{dx} \sin x = \cos x$.
[Figure: unit-circle diagram showing $\sin h$, $\cos h$, $\tan h$, $1$, and the arc $h$.]
13. (Hard) Use induction to prove the Leibniz rule (general product rule):
$$(fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}$$
Warning! The last two exercises are much longer and tougher: have a go if you appreciate a challenge.
14. The Exponential Function & the Power Law
The ratio test shows that the power series $\exp(x) := \sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges for all real $x$. Define $e := \exp(1)$. Certainly $e^x$ makes sense whenever $x \in \mathbb{Q}$. If $x$ is irrational, instead define
$$e^x := \sup\{e^q : q \in \mathbb{Q},\ q < x\}$$
The goal of this question is to prove that $\exp(x) = e^x$. As a nice bonus we recover Bernoulli's limit identity $e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n$ and obtain a complete proof of the power law!
(a) For all $x, y \in \mathbb{R}$, prove that $\exp(x + y) = \exp(x)\exp(y)$.
(Hint: use the binomial theorem and change the order of summation)
(b) Show that $\exp(x)$ is always positive, even when $x < 0$.
(c) Prove that $\exp : \mathbb{R} \to (0, \infty)$ is bijective.
(Hint: $x \geq 0 \implies \exp(x) \geq 1 + x$; take limits then apply part (a))
(d) Prove that $e^x = \exp(x)$. Do this in three stages:
• If $x \in \mathbb{N}$, use part (a). Now check for $x \in \mathbb{Z}^-$.
• If $x = \frac{m}{n} \in \mathbb{Q}$, first compute $\left(\exp\frac{m}{n}\right)^n$.
• If $x$ is irrational, consider a sequence of rational numbers $q_n < x$ with $e^{q_n} \to e^x$. . .
(e) Let $\ln : (0, \infty) \to \mathbb{R}$ be the inverse function of $\exp$. Prove the logarithm laws:
$$\ln(xy) = \ln x + \ln y \qquad\text{and}\qquad \ln x^r = r \ln x$$
(Just do this when $r \in \mathbb{N}$; in general, another argument like part (d) is required)
(f) We've already seen that $\frac{d}{dy} \ln y = \frac{1}{y}$. Use the fact that
$$\frac{d}{dy} \ln y = \lim_{h\to 0} \frac{\ln(y + h) - \ln y}{h}$$
to prove that $\exp(x) = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n$, thus recovering Bernoulli's definition of $e$.
(g) For any $r \in \mathbb{R}$, define $x^r := \exp(r \ln x)$. Hence obtain the power law for any exponent.
15. A Very Strange Function
Here is a classic example of a continuous but nowhere-differentiable function!
Let $f$ be the sawtooth function defined by $f(x) = |x|$ whenever $x \in [-1, 1]$ and extending periodically to $\mathbb{R}$ so that $f(x + 2) = f(x)$. Now define $g : \mathbb{R} \to \mathbb{R}$ via
$$g(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n f(4^n x)$$
[Figures: $f(x)$ and iterations to $n = 3$; $g(x)$ (really $n = 6$, but can you tell?!)]
(a) Prove that $g$ is well-defined and continuous on $\mathbb{R}$.
(b) Let $x \in \mathbb{R}$ and $m \in \mathbb{N}$ be fixed. Define $h_m = \pm\frac{1}{2} \cdot 4^{-m}$, where the sign is chosen so that no integers lie strictly between $4^m x$ and $4^m(x + h_m) = 4^m x \pm \frac{1}{2}$. For each $n \in \mathbb{N}_0$, define
$$k_n = \frac{f\big(4^n(x + h_m)\big) - f(4^n x)}{h_m}$$
Prove the following:
i. $|k_n| \leq 4^n$, with equality when $n = m$.
ii. $n > m \implies k_n = 0$.
(Hint: $|f(y) - f(z)| \leq |y - z|$: when is this an equality?)
(c) Use part (b) to prove that
$$\left|\frac{g(x + h_m) - g(x)}{h_m}\right| \geq \frac{1}{2}\left(3^m + 1\right)$$
Hence conclude that $g$ is nowhere differentiable.
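Exercise 15 can be explored experimentally; the sketch below (an illustration under floating-point arithmetic, not part of the notes) builds partial sums of the sawtooth series and watches the difference quotients blow up at a sample point:

```python
# Partial sums of the nowhere-differentiable sawtooth series: the
# difference quotients |g(x + h_m) - g(x)| / |h_m| grow roughly like 3^m.
import math

def saw(x):
    # 2-periodic sawtooth: f(x) = |x| on [-1, 1], f(x + 2) = f(x)
    x = (x + 1.0) % 2.0 - 1.0
    return abs(x)

def g(x, terms=16):
    return sum(0.75**n * saw(4**n * x) for n in range(terms))

def h_m(x, m):
    # sign chosen so no integer lies strictly between 4^m x and 4^m(x + h)
    y = 4.0**m * x
    return 0.5 * 4.0**(-m) if math.floor(y + 0.5) == math.floor(y) else -0.5 * 4.0**(-m)

x = 0.3
qs = []
for m in [2, 4, 6]:
    h = h_m(x, m)
    qs.append(abs((g(x + h) - g(x)) / h))
print(qs)  # increasing quotients: no derivative can exist at x
```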
3.29 The Mean Value Theorem
A key result in elementary calculus, this should be very familiar from your previous studies.
Theorem 3.13 (Mean Value Theorem/MVT). Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $\xi \in (a, b)$ such that $f'(\xi) = \frac{f(b) - f(a)}{b - a}$.
This follows easily from two lemmas.
Lemma 3.14. 1. (Critical Points) Suppose $g$ is bounded on $(a, b)$ and attains its maximum or minimum at $\xi \in (a, b)$. If $g$ is differentiable at $\xi$ then $g'(\xi) = 0$.
2. (Rolle's Theorem) Suppose $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $g(a) = g(b)$. Then there exists $\xi \in (a, b)$ such that $g'(\xi) = 0$.
The main result is obtained by subtracting a straight line and applying Rolle's theorem to
$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a)$$
and observing that $g(a) = f(a) = g(b)$ and $g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$.
[Figures: left, Critical Points/Rolle's Theorem for $g(x)$; right, Mean Value Theorem for $f(x)$; each marks $\xi \in (a, b)$.] In the pictures, the orange and green lines are parallel: the average slope over the interval $[a, b]$ equals the gradient/derivative $f'(\xi)$.
Proof of Lemma. 1. Suppose $\xi \in (a, b)$ is a maximum: that is, $g(x) \leq g(\xi)$ for all $x \neq \xi$. Then
$$\frac{g(x) - g(\xi)}{x - \xi} \begin{cases} \leq 0 & \text{whenever } x > \xi \\ \geq 0 & \text{whenever } x < \xi \end{cases}$$
Now take the one-sided limits: since $g$ is differentiable at $\xi$, we see that
$$0 \geq \lim_{x\to\xi^+} \frac{g(x) - g(\xi)}{x - \xi} = g'(\xi) = \lim_{x\to\xi^-} \frac{g(x) - g(\xi)}{x - \xi} \geq 0$$
Otherwise said, $g'(\xi) = 0$. The case when $\xi$ is a minimum is similar.
2. By the Extreme Value Theorem (1.11), $g$ is bounded and attains its bounds. If the extrema both occur at the endpoints $a, b$, then $g$ is constant: any $\xi \in (a, b)$ satisfies the result. Otherwise, at least one extreme occurs at some $\xi \in (a, b)$: part 1 says that $g'(\xi) = 0$.
Examples 3.15. 1. Let $f(x) = (x - 1)^2(4 - x) + x$ on $[a, b] = [1, 4]$: this is roughly the above picture illustrating the mean value theorem. Compute the average slope and the derivative,
$$\frac{f(b) - f(a)}{b - a} = 1, \qquad f'(x) = 2(x - 1)(4 - x) - (x - 1)^2 + 1 = -3x^2 + 12x - 8$$
and observe that
$$f'(\xi) = \frac{f(b) - f(a)}{b - a} \iff 3\xi^2 - 12\xi + 9 = 0 \iff \xi = 1 \text{ or } 3$$
Since only $3$ lies in the interval $(1, 4)$, this is the value $\xi$ satisfying the mean value theorem.
2. We find the maximum and minimum values of $g(x) = x^4 - 14x^2 + 24x$ on the interval $[0, 2]$. The function is differentiable, with
$$g'(x) = 4x^3 - 28x + 24 = 4(x - 2)(x - 1)(x + 3)$$
By the Lemma, the locations of the extrema are either the endpoints $x = 0, 2$ or locations with zero derivative ($x = 1$). Since
$$g(0) = 0, \qquad g(1) = 11, \qquad g(2) = 8$$
we conclude that $\max(g) = g(1) = 11$ and $\min(g) = g(0) = 0$.
[Figure: graph of $g(x)$ on $[0, 2]$.]
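Both worked examples are easy to confirm by direct evaluation; the sketch below (not part of the notes) checks the MVT point $\xi = 3$ and the extrema of $g$:

```python
# Verifies Examples 3.15: the MVT point for f on [1, 4], and the extrema
# of g(x) = x^4 - 14x^2 + 24x on [0, 2] among the candidate points.
def f(x):
    return (x - 1)**2 * (4 - x) + x

a, b = 1.0, 4.0
avg = (f(b) - f(a)) / (b - a)
fprime = lambda x: -3*x**2 + 12*x - 8
assert avg == 1.0 and fprime(3.0) == avg  # xi = 3 satisfies the MVT

g = lambda x: x**4 - 14*x**2 + 24*x
candidates = [0.0, 1.0, 2.0]  # endpoints plus the interior critical point
values = [g(x) for x in candidates]
assert max(values) == 11.0 and min(values) == 0.0
print("xi = 3; max g = 11 at x = 1; min g = 0 at x = 0")
```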
Consequences of the Mean Value Theorem Several simple corollaries relate to monotonicity.
Definition 3.16. Suppose $f : I \to \mathbb{R}$ is defined on an interval $I$. We say that $f$ is:
• Increasing (monotone-up) on $I$ if $x < y \implies f(x) \leq f(y)$
• Decreasing (monotone-down) on $I$ if $x < y \implies f(x) \geq f(y)$
We say strictly increasing/decreasing if the inequalities are strict.
Examples 3.17. 1. $f : x \mapsto x^2$ is strictly increasing on $[0, \infty)$ and strictly decreasing on $(-\infty, 0]$.
2. The floor function $f : x \mapsto \lfloor x \rfloor$ (the greatest integer less than or equal to $x$) is increasing, but not strictly, on $\mathbb{R}$.
[Figure: graph of the floor function.]
Corollary 3.18. Suppose $f$ is differentiable on an interval $I$. Then
1. $f' \geq 0$ on $I \iff f$ is increasing on $I$
2. $f' \leq 0$ on $I \iff f$ is decreasing on $I$
3. $f' = 0$ on $I \iff f$ is constant on $I$
Proof. (Part 1, $\Rightarrow$) Let $x < y$ where $x, y \in I$. By the mean value theorem, $\exists\, \xi \in (x, y)$ such that
$$\frac{f(y) - f(x)}{y - x} = f'(\xi), \quad\text{whence}\quad f'(\xi) \geq 0 \implies f(y) \geq f(x)$$
($\Leftarrow$) For the converse, use the definition of derivative: $f'(\xi) = \lim_{x\to\xi} \frac{f(x) - f(\xi)}{x - \xi}$. If $f$ is increasing, then
$$x > \xi \implies f(x) \geq f(\xi) \implies f'(\xi) \geq 0$$
Parts 2 and 3 are similar.
More care is required when relating $f' > 0$ to $f$ being strictly increasing (see Exercise 5). The corollary also yields a couple of (hopefully familiar) flashbacks to elementary calculus.
Corollary 3.19. Let $I$ be an open interval.
1. (Anti-derivatives on an interval) If $f'(x) = g'(x)$ on $I$, then $\exists\, c$ such that $g(x) = f(x) + c$ on $I$.
2. (First derivative test) Suppose $f$ is continuous on $I$ and differentiable except perhaps at $\xi$. If
$$\begin{cases} f'(x) < 0 & \text{whenever } x < \xi, \text{ and} \\ f'(x) > 0 & \text{whenever } x > \xi \end{cases}$$
then $f$ has its minimum value at $x = \xi$. The statement for a maximum is similar.
Examples 3.20. 1. Since $\frac{d}{dx} \sin(3x^2 + x) = (6x + 1)\cos(3x^2 + x)$ on (the interval) $\mathbb{R}$, all anti-derivatives of $f(x) = (6x + 1)\cos(3x^2 + x)$ are given by
$$\int f(x)\,dx = \int (6x + 1)\cos(3x^2 + x)\,dx = \sin(3x^2 + x) + c$$
As is typical in calculus, we use the indefinite integral notation $\int f(x)\,dx$ for anti-derivatives.
2. If $f(x) = x^{2/3} e^{x/3}$, then $f'(x) = \frac{1}{3} x^{-1/3}(2 + x) e^{x/3}$.
By Lemma 3.14, the only possible critical points are at $x = 0$ or $-2$. The sign of the derivative is also clear:
[Figure: graph of $f(x)$ on $[-3, 1]$.]
By the 1st derivative test, $f$ has a maximum at $x = -2$ and a minimum at $x = 0$.
We finish this section by tying together the mean and intermediate value theorems.
Theorem 3.21 (IVT for Derivatives). Suppose $f$ is differentiable on an interval $I$ containing $a < b$, and that $L$ lies between $f'(a)$ and $f'(b)$. Then $\exists\, \xi \in (a, b)$ such that $f'(\xi) = L$.
If $f'(x)$ is continuous, this is just the intermediate value theorem applied to $f'$; surprisingly, continuity of $f'$ is not required. A full proof is in Exercise 7.
Exercises 3.29. Key concepts: Mean Value Theorem, Rolle's Theorem, monotonicity, IVT for derivatives
1. Determine whether the conclusion of the mean value theorem holds for each function on the given interval. If so, find a suitable point $\xi$. If not, state which hypothesis fails.
(a) $x^2$ on $[1, 2]$  (b) $\sin x$ on $[0, \pi]$  (c) $|x|$ on $[-1, 2]$  (d) $1/x$ on $[-1, 1]$  (e) $1/x$ on $[1, 3]$
2. Suppose $f$ and $g$ are differentiable on an interval $I$ containing $a < b$ and that $f(a) = f(b) = 0$. By considering $h(x) = f(x)e^{g(x)}$, prove that $f'(\xi) + f(\xi)g'(\xi) = 0$ for some $\xi \in (a, b)$.
3. (a) Use the Mean Value Theorem to prove that $x < \tan x$ for all $x \in (0, \frac{\pi}{2})$.
(b) Prove that $\frac{x}{\sin x}$ is strictly increasing on $(0, \frac{\pi}{2})$.
(c) Prove that $x \leq \frac{\pi}{2} \sin x$ for all $x \in [0, \frac{\pi}{2}]$.
4. Suppose that $|f(x) - f(y)| \leq (x - y)^2$ for all $x, y \in \mathbb{R}$. Prove that $f$ is a constant function.
5. (a) Prove that $f' > 0$ on an interval $I \implies f$ is strictly increasing on $I$.
(b) Show that the converse of part (a) is false.
(c) Carefully prove the first derivative test (Corollary 3.19).
6. If $f$ is differentiable on an interval $I$ such that $f'(x) \neq 0$ for all $x \in I$, use the intermediate value theorem for derivatives to prove that $f$ is either strictly increasing or strictly decreasing.
7. (Intermediate value theorem for derivatives) Let $f, a, b$ and $L$ be as in Theorem 3.21, define $g : I \to \mathbb{R}$ by $g(x) = f(x) - Lx$, and let $\xi \in [a, b]$ be such that
$$g(\xi) = \min\big\{g(x) : x \in [a, b]\big\}$$
(a) Why can we be sure that $\xi$ exists? If $\xi \in (a, b)$, explain why $f'(\xi) = L$.
(b) Assume WLOG that $f'(a) < f'(b)$. Prove that $g'(a) < 0 < g'(b)$. By considering $\lim_{x\to a^+} \frac{g(x) - g(a)}{x - a}$, show that $\exists\, x > a$ for which $g(x) < g(a)$. Hence complete the proof.
8. Suppose $f'$ exists on $(a, b)$, and is continuous except for a discontinuity at $c \in (a, b)$.
(a) Suppose $\lim_{x\to c^+} f'(x) = L < f'(c)$. By taking $\epsilon = \frac{f'(c) - L}{2}$ in the definition of this limit and applying IVT for derivatives, obtain a contradiction.
Hence argue that $c$ cannot be a removable or a jump discontinuity.
(b) Similarly, show that $f'$ cannot have an infinite discontinuity by considering $\lim_{x\to c^+} f'(x) = \infty$.
(c) By parts (a) and (b), it remains to see that $f'$ can have an essential discontinuity. Recall (Exercise 3.28.9) that
$$f : \mathbb{R} \to \mathbb{R} : x \mapsto \begin{cases} x^2 \sin\frac{1}{x} & x \neq 0 \\ 0 & x = 0 \end{cases}$$
is differentiable on $\mathbb{R}$, but has discontinuous derivative at $x = 0$.
i. Use $x_n = \frac{1}{2n\pi}$ and $y_n = \frac{1}{(2n+1)\pi}$ to show that $f'$ has an essential discontinuity at $x = 0$.
ii. Prove that if $\lim s_n = 0$ and $\lim f'(s_n) = M$, then $M \in [-1, 1]$.
iii. Prove that for any $L \in [-1, 1]$, there is a sequence $(t_n)$ for which $\lim f'(t_n) = L$.
(Hint: Use IVT for derivatives)
3.30 L'Hôpital's Rule
We are often required to consider indeterminate forms: limits which do not yield easily to the standard limit laws. For instance, while it is tempting to write
$$\lim_{x\to 0} \frac{\sin 2x}{e^{3x} - 1} = \frac{\lim \sin 2x}{\lim (e^{3x} - 1)} = \frac{0}{0} \qquad (*)$$
this is an incorrect application of the limit laws since the resulting quotient has no meaning.
Definition 3.22. An indeterminate form is any limit where a naïve application of the limit laws results in a meaningless expression: the primary types are
$$\frac{0}{0}, \qquad \frac{\infty}{\infty}, \qquad \infty - \infty, \qquad 0 \cdot \infty, \qquad 0^0, \qquad \infty^0, \qquad\text{and}\qquad 1^\infty.$$
Examples 3.23. 1. $\lim_{x\to 7^+} (x - 7)^{\frac{1}{x-7}}$ is an indeterminate form of type $0^\infty$.
2. Our motivating example $(*)$ may correctly be evaluated using the definition of the derivative:
$$\lim_{x\to 0} \frac{\sin 2x}{e^{3x} - 1} = \lim_{x\to 0} \frac{\sin 2x - 0}{x - 0} \cdot \frac{x - 0}{e^{3x} - 1} = \frac{\frac{d}{dx}\big|_{x=0} \sin 2x}{\frac{d}{dx}\big|_{x=0} \left(e^{3x} - 1\right)} = \frac{2}{3}$$
By considering $\lim_{x\to 0} \frac{3a \sin 2x}{2(e^{3x} - 1)}$, we see that an indeterminate form of type $\frac{0}{0}$ can take any value $a$!
The approach generalizes, if non-rigorously: if $f, g$ are differentiable at $a$ and $f(a) = 0 = g(a)$, then
$$\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} \cdot \frac{x - a}{g(x) - g(a)} = \frac{f'(a)}{g'(a)}$$
Our goal is to fully justify this result and extend to several situations:
• One-sided limits, including when $a = \pm\infty$.
• When $\lim f(x) = 0$ exists, but $f(a)$ does not ($g(x)$, $g(a)$ similarly).
• Indeterminate forms of type $\frac{\infty}{\infty}$ ($\lim f(x) = \infty$, etc.).
• When the RHS cannot be cleanly evaluated: for instance $g'(a) = 0$ or if the original limit is $\pm\infty$.
Here is the full result.
Theorem 3.24 (L'Hôpital's Rule). Let $a \in \mathbb{R} \cup \{\pm\infty\}$ and suppose functions $f$ and $g$ satisfy:
1. $\lim_{x\to a} \frac{f'(x)}{g'(x)} = L$ for some $L \in \mathbb{R} \cup \{\pm\infty\}$, and,
2. (a) $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0$, or (b) $\lim_{x\to a} g(x) = \infty$ (no condition on $f$!)
Then $\lim_{x\to a} \frac{f(x)}{g(x)} = L$. The same result holds for one-sided limits.
The full proof is a behemoth; we postpone it until after several examples. In part because of this, and because examples can often be evaluated more instructively using elementary methods (as in the above example), l'Hôpital's rule is often discouraged in elementary calculus.
Examples 3.25. 1. If $f(x) = e^{4x}$ and $g(x) = 21x - 17$, then $\lim_{x\to\infty} \frac{f(x)}{g(x)}$ has type $\frac{\infty}{\infty}$. By l'Hôpital's rule,
$$\lim_{x\to\infty} \frac{f'(x)}{g'(x)} = \lim_{x\to\infty} \frac{4e^{4x}}{21} = \infty \implies \lim_{x\to\infty} \frac{e^{4x}}{21x - 17} = \infty$$
2. For an example of type $\frac{0}{0}$, consider $f(x) = x^2 - 9$ and $g(x) = \ln(4 - x)$:
$$\lim_{x\to 3} \frac{f'(x)}{g'(x)} = \lim_{x\to 3} \frac{2x}{-1/(4 - x)} = \lim_{x\to 3} 2x(x - 4) = -6 \implies \lim_{x\to 3} \frac{x^2 - 9}{\ln(4 - x)} = -6$$
3. One can apply the rule repeatedly: for example
$$\lim_{x\to 0} \frac{e^{4x} - 1 - 4x}{x^2} = \lim_{x\to 0} \frac{4e^{4x} - 4}{2x} = \lim_{x\to 0} \frac{16e^{4x}}{2} = 8$$
This is a generally accepted abuse of protocol: one shouldn't really state the first limit until one knows the last limit exists! As long as everything works, you are fine. However. . .
4. It is crucially important that the limit $\lim \frac{f'}{g'}$ exists before applying l'Hôpital's rule! Consider $f(x) = x + \cos x$ and $g(x) = x$: certainly $\lim_{x\to\infty} \frac{f(x)}{g(x)}$ has type $\frac{\infty}{\infty}$, however
$$\lim_{x\to\infty} \frac{f'(x)}{g'(x)} = \lim_{x\to\infty} (1 - \sin x) \quad\text{does not exist!}$$
In this case the rule is unnecessary: appealing to the squeeze theorem,
$$\frac{f(x)}{g(x)} = 1 + \frac{\cos x}{x} \xrightarrow{x\to\infty} 1$$
5. For another reason why l'Hôpital's rule is often prohibited in Freshman calculus, consider
$$\lim_{x\to 0} \frac{\sin x}{x} = \lim_{x\to 0} \frac{\cos x}{1} = 1$$
This appears legitimate. However, recall (Exercise 3.28.12) that this limit is used to demonstrate $\frac{d}{dx} \sin x = \cos x$; to use this to calculate the limit on which it depends is circular logic!
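The legitimate limits in this section can be sampled numerically; the sketch below (floating-point illustration only, not part of the notes) evaluates three of them near their limit points:

```python
# Samples three limits from this section near their limit points;
# floating point only approximates the limiting values.
import math

# sin(2x)/(e^{3x} - 1) -> 2/3 as x -> 0
v1 = math.sin(2e-6) / (math.exp(3e-6) - 1)
assert abs(v1 - 2/3) < 1e-4

# (x^2 - 9)/ln(4 - x) -> -6 as x -> 3
x = 3 - 1e-6
v2 = (x**2 - 9) / math.log(4 - x)
assert abs(v2 + 6) < 1e-4

# (e^{4x} - 1 - 4x)/x^2 -> 8 as x -> 0
x = 1e-4
v3 = (math.exp(4*x) - 1 - 4*x) / x**2
assert abs(v3 - 8) < 1e-2
print(v1, v2, v3)
```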
The remaining indeterminate forms (Definition 3.22) may be modified so that l'Hôpital's rule applies.
Examples 3.26. 1. An indeterminate form of type $\infty - \infty$ may be transformed to one of type $\frac{0}{0}$ before applying the rule (twice):
$$\lim_{x\to 0^+} \left(\frac{1}{e^x - 1} - \frac{1}{x}\right) = \lim_{x\to 0^+} \frac{x + 1 - e^x}{x(e^x - 1)} \qquad \left(\text{type } \tfrac{0}{0}\right)$$
$$= \lim_{x\to 0^+} \frac{1 - e^x}{e^x - 1 + xe^x} \qquad \left(\text{still type } \tfrac{0}{0}\right)$$
$$= \lim_{x\to 0^+} \frac{-e^x}{2e^x + xe^x} = -\frac{1}{2}$$
2. For an indeterminate form of type $1^\infty$, we use the log laws & continuity of the exponential:
$$\lim_{x\to 0^+} (1 + \sin x)^{1/x} = \exp\left(\lim_{x\to 0^+} \frac{1}{x} \ln(1 + \sin x)\right) \qquad \left(\text{type } \tfrac{0}{0}\right)$$
$$= \exp\left(\lim_{x\to 0^+} \frac{\cos x}{1 + \sin x}\right) = e^1 = e$$
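Both transformed forms in Examples 3.26 can be sampled near $0^+$; the sketch below is a floating-point illustration (not part of the notes):

```python
# Samples the two transformed indeterminate forms from Examples 3.26
# near 0+ (numerical illustration, not a proof).
import math

x = 1e-5

# 1/(e^x - 1) - 1/x -> -1/2
v1 = 1/(math.exp(x) - 1) - 1/x
assert abs(v1 + 0.5) < 1e-3

# (1 + sin x)^{1/x} -> e
v2 = (1 + math.sin(x))**(1/x)
assert abs(v2 - math.e) < 1e-3
print(v1, v2)
```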
Proving l'Hôpital's Rule
The complete argument is very lengthy. It starts with an extension of the Mean Value Theorem.
Lemma 3.27 (Extended Mean Value Theorem). Fix $a < b$, suppose $f, g$ are continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $\xi \in (a, b)$ such that
$$\big(f(b) - f(a)\big)\, g'(\xi) = \big(g(b) - g(a)\big)\, f'(\xi)$$
Proof. Apply the standard mean value theorem (really Rolle's theorem) to
$$h(t) = \big(f(b) - f(a)\big)\, g(t) - \big(g(b) - g(a)\big)\, f(t)$$
which satisfies $h(a) = h(b)$.
Now for the main event. If you do nothing else, read the following proof of the simplest case. Everything else is a modification.
Proof (Case (a)/type $\frac{0}{0}$, with right limits). Suppose we have a form of type $\frac{0}{0} = \lim_{x\to a^+} \frac{f(x)}{g(x)}$, taking right-limits at a finite location $a$, and that the resulting limit $L$ is finite.
First observe that condition 1 forces the existence of an interval $(a, b)$ on which $f, g$ are differentiable and $g'(x) \neq 0$. Everything follows from the definition of the limit in condition 1, and Lemma 3.27:
$$\text{Given } \epsilon > 0,\ \exists\, \delta \in (0, b - a) \text{ such that } a < \xi < a + \delta \implies \left|\frac{f'(\xi)}{g'(\xi)} - L\right| < \frac{\epsilon}{2} \qquad (*)$$
$$a < y < x < a + \delta \implies \exists\, \xi \in (y, x) \text{ such that } \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi)}{g'(\xi)} \qquad (\dagger)$$
Since $g' \neq 0$, the usual mean value theorem says
$$\exists\, c \in (y, x) \text{ such that } g(x) - g(y) = g'(c)(x - y) \neq 0$$
whence we never divide by zero in $(\dagger)$. Combining $(*)$ and $(\dagger)$, observe that
$$a < x < a + \delta \implies \left|\frac{f(x)}{g(x)} - L\right| \overset{2(a)}{=} \lim_{y\to a^+} \left|\frac{f(x) - f(y)}{g(x) - g(y)} - L\right| \overset{(\dagger)}{=} \lim_{y\to a^+} \left|\frac{f'(\xi)}{g'(\xi)} - L\right| \overset{(*)}{\leq} \frac{\epsilon}{2} < \epsilon$$
Note that $a < y < \xi(x, y) < x$ is a function of $x, y$ here! Since $\epsilon > 0$ is arbitrary, this is the required result.
A complete proof for all indeterminate forms of type $\frac{0}{0}$ follows from some simple modifications.
• If $a = \infty$: Replace the blue part of $(*)$ as follows:
Given $\epsilon > 0$, $\exists\, m \geq b$ such that $\xi > m \implies \left|\frac{f'(\xi)}{g'(\xi)} - L\right| < \frac{\epsilon}{2}$
The rest of the proof goes through after replacing $a$ with $\infty$ and $a + \delta$ with $m$.
• If $L = \infty$: Replace the green parts of $(*)$ with 'Given $M > 0$' and '$\frac{f'(\xi)}{g'(\xi)} > 2M$'. Fixing the rest of the proof is again straightforward.
• If $L = -\infty$: Replace the green parts of $(*)$ with 'Given $M > 0$' and '$\frac{f'(\xi)}{g'(\xi)} < -2M$'.
• Left-limits: If $f, g$ are differentiable on $(c, a)$, then the blue part may be replaced with either:
($a$ finite) $\exists\, \delta \in (0, a - c)$ such that $a - \delta < \xi < a$
($a = -\infty$) $\exists\, m \leq c$ such that $\xi < m$
The blue and green parts of $(*)$ may be replaced independently.
Proof (Case (b), $\lim g(x) = \infty$). This requires a little more care.$^{15}$ Since $g' \neq 0$, and $\lim_{x\to a^+} g(x) = \infty$, Exercise 3.29.6 says that $g$ is strictly decreasing on $(a, b)$. By replacing $b$ by some $\tilde{b} \in (a, b)$, if necessary, we may assume that
$$a < y < x < b \implies 0 < g(x) < g(y) \qquad (\ddagger)$$
Assume $a$ and $L$ are finite and obtain $(*)$ and $(\dagger)$ as before. Let $x \in (a, a + \delta)$ be fixed and multiply $(\dagger)$ by $\frac{g(y) - g(x)}{g(y)}$ (this is positive by $(\ddagger)$): a little algebra and the triangle inequality tell us that
$$a < y < x \implies \frac{f(y)}{g(y)} = \frac{f'(\xi)}{g'(\xi)} + \frac{f(x)}{g(y)} - \frac{g(x)}{g(y)} \cdot \frac{f'(\xi)}{g'(\xi)}$$
$$\implies \left|\frac{f(y)}{g(y)} - L\right| \leq \left|\frac{f'(\xi)}{g'(\xi)} - L\right| + \frac{1}{g(y)}\left(|f(x)| + |g(x)|\left(|L| + \frac{\epsilon}{2}\right)\right)$$
Since $\lim_{y\to a^+} g(y) = \infty$ and $x$ is fixed, we see that there exists $\eta \leq x - a < \delta$ such that
$$y \in (a, a + \eta) \implies \frac{1}{g(y)}\left(|f(x)| + |g(x)|\left(|L| + \frac{\epsilon}{2}\right)\right) < \frac{\epsilon}{2}$$
Finally combine with $(*)$: given $\epsilon > 0$, $\exists\, \eta > 0$ such that $y \in (a, a + \eta) \implies \left|\frac{f(y)}{g(y)} - L\right| < \epsilon$.
The same modifications listed above complete the proof.
$^{15}$Forms of type $\frac{\infty}{\infty}$? Instead of assumption 2.(b), why not simply assume $\lim f = \lim g = \infty$ and write $\frac{f}{g} = \frac{1/g}{1/f}$ to obtain a form of type $\frac{0}{0}$? The problem is that the derivative of the 'new' denominator $\frac{d}{dx} \frac{1}{f} = \frac{-f'}{f^2}$ need not be non-zero on any interval $(a, b)$ and so condition 1 need not hold. We could modify this, but it would make for a weaker theorem. Example 3.25.4 illustrates the issue: $f'(x) = 1 + \sin x$ has zeros on any unbounded interval.
After the 2.(b) case is proved and we know that $\lim \frac{f}{g} = L$, it is then clear that $\lim f$ must also be infinite (unless $L = 0$, in which case $\lim f$ could be anything and need not exist). This situation therefore really does deal with forms of type $\frac{\infty}{\infty}$.
Exercises 3.30. Key concepts: Types of indeterminate forms, Formal statement of l'Hôpital's rule
1. Evaluate the limits, if they exist:
(a) $\lim_{x\to 0} \frac{x^3}{\sin x - x}$  (b) $\lim_{x\to \frac{\pi}{2}} \frac{\tan x}{\frac{2}{\pi - 2x}}$  (c) $\lim_{x\to 0} (\cos x)^{1/x^2}$  (d) $\lim_{x\to 0} (1 + 2x)^{1/x}$  (e) $\lim_{x\to\infty} \left(e^x + x\right)^{1/x}$
2. Suppose $f$ is differentiable on $(c, \infty)$ and that $\lim_{x\to\infty} [f(x) + f'(x)] = L$ is finite.
(a) Prove that $\lim_{x\to\infty} f(x) = L$ and that $\lim_{x\to\infty} f'(x) = 0$.
(Hint: write $f(x) = \frac{f(x)e^x}{e^x}$)
(b) Does anything change if $L$ exists and is infinite?
3. If $p_n(x)$ is a polynomial of degree $n$, use induction to prove that $\lim_{x\to\infty} p_n(x)e^{-x} = 0$.
4. Let f(x) = x + \sin x \cos x, g(x) = e^{\sin x} f(x), and h(x) = \frac{2 \cos x}{e^{\sin x} (f(x) + 2 \cos x)}.

   (a) Prove that \lim_{x\to\infty} f(x) = ∞ = \lim_{x\to\infty} g(x), but that \lim_{x\to\infty} \frac{f(x)}{g(x)} does not exist.
   (b) If \cos x ≠ 0 and x is large, show that \frac{f'(x)}{g'(x)} = h(x).
   (c) Prove that \lim_{x\to\infty} h(x) = 0. Explain why this does not contradict part (a)!
3.31 Taylor’s Theorem
A primary goal of power series is the approximation of functions. With this in mind, there are two
natural questions to ask of a function f:

1. Given c ∈ dom(f), is there a series \sum a_n (x - c)^n which equals f(x) on an interval containing c?
2. If we take the first n terms of such a series, how accurate is this polynomial approximation?
Example 3.28. Recall the geometric series

    f(x) = \frac{1}{1-x} = \sum_{n=0}^\infty x^n   whenever -1 < x < 1

The polynomial approximation

    p_n(x) = \sum_{k=0}^n x^k = 1 + x + \cdots + x^n = \frac{1 - x^{n+1}}{1-x}

has error

    R_n(x) = f(x) - p_n(x) = \frac{x^{n+1}}{1-x}

[Figure: graphs of y = 1/(1 - x) and the cubic approximation p_3(x) = 1 + x + x^2 + x^3]

If x is close to 0, this is likely very small; for instance if x ∈ [-1/2, 1/2], then

    |R_n(x)| ≤ \frac{1}{1 - \frac{1}{2}} \left( \frac{1}{2} \right)^{n+1} = 2^{-n}

However, when x is close to 1 the error is unbounded!
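The closed form for the remainder is easy to sanity-check numerically. A minimal Python sketch (the function names `p` and `remainder` are my own):

```python
# Compare the actual truncation error of the geometric series with the
# closed form R_n(x) = x^(n+1) / (1 - x).
def p(n, x):
    """Partial sum 1 + x + ... + x^n."""
    return sum(x**k for k in range(n + 1))

def remainder(n, x):
    """Closed form for the error f(x) - p_n(x)."""
    return x**(n + 1) / (1 - x)

x, n = 0.5, 10
f = 1 / (1 - x)
print(f - p(n, x), remainder(n, x))   # the two values agree

# Near x = 1 the error stays large even for big n:
print(remainder(50, 0.99))
```

Increasing n shrinks the error geometrically for |x| < 1, but the factor 1/(1 - x) blows up as x → 1, exactly as the example warns.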
The above behavior occurs in general: the truncated polynomials provide better approximations
nearer the center of the series. To see this, we first need to consider higher-order derivatives.
Definition 3.29. We write f'' for the second derivative of f, namely the derivative of its derivative

    f''(a) = \lim_{x\to a} \frac{f'(x) - f'(a)}{x - a}

The existence of f''(a) presupposes that f' exists on an (open) interval containing a. We can similarly
consider third, fourth, and higher-order derivatives. As a function, the nth derivative is written

    f^{(n)}(x) = \frac{d^n f}{dx^n}

By convention, the zeroth derivative is the function itself: f^{(0)}(x) = f(x). We say that f is n times
differentiable at a if f^{(n)}(a) exists, and infinitely differentiable (or smooth) if derivatives of all orders exist.
Example 3.30. f(x) = x^2 |x| is twice differentiable, with f''(x) = 6|x|. It is smooth everywhere
except at x = 0, where third (and higher-order) derivatives do not exist.
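A central second difference gives a quick numeric check of this claim; a sketch, with names of my own choosing:

```python
# Estimate f''(x) for f(x) = x^2 |x| and compare with the claimed 6|x|.
def f(x):
    return x**2 * abs(x)

def second_diff(f, x, h=1e-4):
    """Central second-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for x in [0.5, -1.2, 0.0]:
    print(x, second_diff(f, x), 6 * abs(x))   # the two columns agree closely
```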
Definition 3.31. Suppose f is n times differentiable at x = c. The nth Taylor polynomial p_n of f
centered at c is

    p_n(x) := \sum_{k=0}^n \frac{f^{(k)}(c)}{k!} (x-c)^k = f(c) + f'(c)(x-c) + \frac{f''(c)}{2} (x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!} (x-c)^n

The remainder R_n(x) is the error in the polynomial approximation

    R_n(x) = f(x) - p_n(x) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(c)}{k!} (x-c)^k

If f is infinitely differentiable at x = c, then its Taylor series centered at x = c is the power series

    T_c f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(c)}{n!} (x-c)^n

When c = 0 this is known as a Maclaurin series.^{16}
For simplicity we’ll mostly work with Maclaurin series, with the general situation hopefully being clear.
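In code, a Taylor polynomial is just a finite sum built from derivative values at the center; a small Python sketch (the helper `taylor_poly` is my own):

```python
from math import exp, factorial

def taylor_poly(derivs, c):
    """Given derivs = [f(c), f'(c), ..., f^(n)(c)], return p_n as a function."""
    def p(x):
        return sum(d / factorial(k) * (x - c)**k for k, d in enumerate(derivs))
    return p

# Every derivative of e^x at c = 0 equals 1, so p_5 uses derivs = [1]*6:
p5 = taylor_poly([1] * 6, 0)
print(p5(1.0), exp(1.0))   # ~2.71667 vs ~2.71828
```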
Examples 3.32. 1. If f(x) = e^{3x}, then f^{(n)}(x) = 3^n e^{3x}, from which the Maclaurin series is

    T_0 f(x) = \sum_{n=0}^\infty \frac{3^n}{n!} x^n

2. If g(x) = \sin 7x, then the sequence of derivatives is

    7 \cos 7x,  -7^2 \sin 7x,  -7^3 \cos 7x,  7^4 \sin 7x,  7^5 \cos 7x,  -7^6 \sin 7x,  \ldots

At x = 0, every even derivative is zero whereas the odd derivatives alternate in sign. The
Maclaurin series is easily seen to be

    T_0 g(x) = \sum_{n=0}^\infty \frac{(-1)^n 7^{2n+1}}{(2n+1)!} x^{2n+1}

3. If h(x) = \sqrt{x}, then h'(x) = \frac{1}{2} x^{-1/2}, h''(x) = -\frac{1}{2^2} x^{-3/2}, and h'''(x) = \frac{3}{2^3} x^{-5/2}, from which the
third Taylor polynomial centered at c = 1 is

    p_3(x) = h(1) + h'(1)(x-1) + \frac{h''(1)}{2} (x-1)^2 + \frac{h'''(1)}{6} (x-1)^3
           = 1 + \frac{1}{2}(x-1) - \frac{1}{8}(x-1)^2 + \frac{1}{16}(x-1)^3
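Evaluating this cubic near the center c = 1 shows how good the approximation is; a quick check in Python (names are mine):

```python
from math import sqrt

def p3(x):
    """Third Taylor polynomial of sqrt(x) centered at c = 1."""
    t = x - 1
    return 1 + t / 2 - t**2 / 8 + t**3 / 16

for x in [0.9, 1.0, 1.1, 1.5]:
    print(x, p3(x), sqrt(x))   # agreement degrades as x moves away from 1
```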
Rather than computing further examples, we first develop a little theory that makes verifying Taylor
series much easier.
16
Named for Englishman Brook Taylor (1685–1731) and Scotsman Colin Maclaurin (1698–1746). Taylor’s general method
expanded on examples discovered by James Gregory and Isaac Newton in the mid-to-late 1600s.
Differentiation of Taylor Polynomials and Series

Suppose P(x) = \sum a_j x^j is a power series with radius of convergence R > 0. As we saw previously
(Theorem 2.31), P(x) is differentiable term-by-term on (-R, R). Indeed,

    P'(x) = \sum_{j=1}^\infty a_j j x^{j-1} ⟹ P'(0) = a_1
    P''(x) = \sum_{j=2}^\infty a_j j(j-1) x^{j-2} ⟹ P''(0) = 2 a_2
    P'''(x) = \sum_{j=3}^\infty a_j j(j-1)(j-2) x^{j-3} ⟹ P'''(0) = 3! a_3
        ⋮
    P^{(k)}(x) = \sum_{j=k}^\infty a_j j(j-1)\cdots(j-k+1) x^{j-k} = \sum_{j=k}^\infty \frac{j! a_j}{(j-k)!} x^{j-k} ⟹ P^{(k)}(0) = k! a_k
Otherwise said, P is its own Maclaurin series! The same discussion holds for polynomials. Indeed if
P(x) = a_0 + a_1 x + \cdots + a_n x^n is a polynomial and f a function, then

    P^{(k)}(0) = f^{(k)}(0) ⟺ a_k = \frac{f^{(k)}(0)}{k!}

If this holds for all k ≤ n, then P = p_n is the nth Taylor polynomial of f! With a little modification,
we’ve proved the following:
Theorem 3.33. 1. If f(x) = \sum a_n (x-c)^n is a power series defined on a neighborhood of c, then
T_c f(x) = f(x): the function is its own Taylor series!

2. The nth Taylor polynomial of f centered at x = c is the unique polynomial p_n of degree ≤ n
whose value and first n derivatives agree with those of f at x = c: that is,

    ∀k ≤ n,  p_n^{(k)}(c) = f^{(k)}(c)

This answers our first motivating question: a function can equal at most one power series with a
given center. The second question requires a careful study of the remainder: we’ll do this shortly.
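The computation P^{(k)}(0) = k! a_k behind the theorem can be replayed mechanically on a coefficient list; a Python sketch (names are my own):

```python
from math import factorial

def deriv(coeffs):
    """Coefficients [a_0, a_1, ...] of P  ->  coefficients of P'."""
    return [j * a for j, a in enumerate(coeffs)][1:]

coeffs = [5.0, -2.0, 3.0, 7.0]        # P(x) = 5 - 2x + 3x^2 + 7x^3
for k in range(len(coeffs)):
    p = coeffs
    for _ in range(k):                # differentiate k times
        p = deriv(p)
    print(k, p[0], factorial(k) * coeffs[k])   # P^(k)(0) equals k! a_k
```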
Examples 3.34 (Common Maclaurin Series). These should be familiar from elementary calculus.
Each function equals the given series from our previous discussions of power series: by the Theorem,
each series is immediately the Maclaurin series of the given function.

    e^x = \sum_{n=0}^\infty \frac{x^n}{n!},  x ∈ ℝ
    \frac{1}{1-x} = \sum_{n=0}^\infty x^n,  x ∈ (-1, 1)
    \sin x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1},  x ∈ ℝ
    \ln(1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n,  x ∈ (-1, 1]
    \cos x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} x^{2n},  x ∈ ℝ
    \tan^{-1} x = \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} x^{2n+1},  x ∈ [-1, 1]
Examples 3.35 (Modifying Maclaurin Series). By substituting for x in a common series, we quickly
obtain new series.

1. Substitute x ↦ 7x in the Maclaurin series for \sin x to recover our earlier example

    \sin 7x = \sum_{n=0}^\infty \frac{(-1)^n 7^{2n+1}}{(2n+1)!} x^{2n+1},  x ∈ ℝ

Note how this requires almost no calculation: since the function equals a series, the Theorem
says we have the Maclaurin series for \sin 7x!

2. Substitute x ↦ x^2 in the Maclaurin series for e^x to obtain

    e^{x^2} = \exp(x^2) = \sum_{n=0}^\infty \frac{1}{n!} x^{2n},  x ∈ ℝ

This would be disgusting to verify directly, given the difficulty of repeatedly differentiating e^{x^2}.

3. We find the Taylor series for f(x) = \frac{1}{5-x} centered at x = 2:

    f(x) = \frac{1}{3 + 2 - x} = \frac{1}{3\left(1 - \frac{x-2}{3}\right)} = \frac{1}{3} \sum_{n=0}^\infty \left( \frac{x-2}{3} \right)^n

which is valid whenever -1 < \frac{x-2}{3} < 1 ⟺ -1 < x < 5.
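Partial sums of the re-centered series can be checked against 1/(5 - x) directly; a numeric sketch (the helper `partial` is my own):

```python
def partial(x, N):
    """First N terms of (1/3) * sum(((x-2)/3)^n), the re-centered series."""
    return sum(((x - 2) / 3)**n for n in range(N)) / 3

for x in [0.0, 2.0, 4.5]:
    print(x, partial(x, 100), 1 / (5 - x))   # agreement for x inside (-1, 5)
```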
4. Fix c ∈ ℝ and observe that, for all x ∈ ℝ,

    e^x = e^{c + x - c} = e^c e^{x-c} = \sum_{n=0}^\infty \frac{e^c}{n!} (x-c)^n

We conclude that the series is the Taylor series of e^x centered at x = c. Of course this is easily
verified using the definition, since \frac{d^n}{dx^n}\Big|_{x=c} e^x = e^c.

5. Combining the Theorem with the multiple-angle formula, we obtain the Taylor series for \sin x
centered at x = c:

    \sin x = \sin(c + x - c) = \sin c \cos(x-c) + \cos c \sin(x-c)
           = \sum_{n=0}^\infty \frac{(-1)^n \sin c}{(2n)!} (x-c)^{2n} + \sum_{n=0}^\infty \frac{(-1)^n \cos c}{(2n+1)!} (x-c)^{2n+1}
Definition 3.36. A function f is analytic on its domain if every c ∈ dom f has a neighborhood on
which f(x) equals its Taylor series centered at c.

All the examples we’ve thus far seen are analytic on their domains; indeed the last two of Examples 3.35
prove this for the exponential and sine functions. Every analytic function is automatically
smooth (infinitely differentiable); however, the converse is false (Exercise 10). Analyticity is of greater
importance in complex analysis where (amazingly!) it is equivalent to complex-differentiability.
Accuracy of Taylor Approximations

Our final goal is to estimate the accuracy of a Taylor polynomial as an approximation to its generating
function. Otherwise said, we want to estimate the size of the remainder R_n(x) = f(x) - p_n(x).

Theorem 3.37 (Taylor’s Theorem: Lagrange’s form). Suppose f is n + 1 times differentiable on an
open interval I containing c and let x ∈ I \ {c}. Then there exists ξ between c and x for which the
remainder centered at c satisfies

    R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-c)^{n+1}
Proof. For simplicity let c = 0. Fix x ≠ 0, define a constant M_x and a function g : I → ℝ by

    R_n(x) = \frac{M_x}{(n+1)!} x^{n+1}   and   g(t) = \frac{M_x}{(n+1)!} t^{n+1} + p_n(t) - f(t) = \frac{M_x}{(n+1)!} t^{n+1} - R_n(t)

Observe that, for all k ≤ n + 1,

    g^{(k)}(t) = \frac{M_x}{(n+1-k)!} t^{n+1-k} + p_n^{(k)}(t) - f^{(k)}(t)    (∗)
    ⟹ g^{(k)}(0) = p_n^{(k)}(0) - f^{(k)}(0) = 0 if k ≤ n

where we invoked Theorem 3.33.

Now apply Rolle’s Theorem repeatedly, noting that g(0) = 0 = g(x) by the definition of M_x (WLOG assume x > 0):

• ∃ξ_1 between 0 and x such that g'(ξ_1) = 0.
• ∃ξ_2 between 0 and ξ_1 such that g''(ξ_2) = 0, etc.

Iterate to obtain a sequence (ξ_k) such that

    0 < ξ_{n+1} < ξ_n < \cdots < ξ_1 < x   and   g^{(k)}(ξ_k) = 0

Take ξ = ξ_{n+1} and consider (∗): since deg p_n ≤ n, we see that

    0 = g^{(n+1)}(ξ) = M_x - f^{(n+1)}(ξ) ⟹ R_n(x) = f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} x^{n+1}
Corollary 3.38. Suppose f is smooth on an open interval I containing c and that the derivatives f^{(n)}
of all orders are uniformly bounded on I (by a constant independent of n). Then f equals its Taylor series (centered at c) on I.

Proof. For simplicity, let c = 0. Suppose |f^{(n+1)}(ξ)| ≤ K for all ξ ∈ I and all n. Choose any N > |x| and
observe that

    n > N ⟹ |R_n(x)| ≤ \frac{K |x|^{n+1}}{(n+1)!} = \frac{K |x|^{n+1}}{N!(N+1)\cdots(n+1)} ≤ \frac{K |x|^N}{N!} \left( \frac{|x|}{N} \right)^{n+1-N} \xrightarrow{n\to\infty} 0
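The key fact in the proof, that (n+1)! eventually crushes K|x|^{n+1} no matter how large |x| is, is easy to watch numerically (the values K and x below are illustrative):

```python
from math import factorial

K, x = 1.0, 10.0
for n in [5, 10, 20, 40]:
    # The Lagrange bound K|x|^(n+1)/(n+1)!: it first grows, then collapses.
    print(n, K * abs(x)**(n + 1) / factorial(n + 1))
```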
Examples 3.39. 1. The functions sine and cosine have derivatives bounded by 1 on ℝ, and thus both
functions equal their Maclaurin series on ℝ. This removes the need to have previously justified
these facts using the theory of differential equations.

2. The exponential function does not have bounded derivatives, however we can still apply Taylor’s
Theorem. For any fixed x, ∃ξ between 0 and x such that

    |R_n(x)| = \frac{e^{\xi}}{(n+1)!} |x|^{n+1} \xrightarrow{n\to\infty} 0

by the same argument as in the Corollary. Thus e^x equals its Maclaurin series on the real line (we
knew this already from Exercise 3.28.14).
3. Extending Example 3.32.3, we see that h(x) = \sqrt{x} has linear approximation (1st Taylor polynomial)
centered at c = 9

    p_1(x) = h(9) + h'(9)(x - 9) = 3 + \frac{1}{6}(x - 9)

This yields the simple approximation

    \sqrt{10} ≈ p_1(10) = 3 + \frac{1}{6} = \frac{19}{6}

Taylor’s Theorem can be used to estimate its accuracy (remember to shift the center to 9!):

    R_1(10) = \frac{h''(\xi)}{2!} (10 - 9)^2 = -\frac{1}{2^2 \cdot 2!} \xi^{-3/2} = -\frac{1}{8 \xi^{3/2}}   for some ξ ∈ (9, 10)

Certainly ξ^{-3/2} < 9^{-3/2} = \frac{1}{27}, whence

    -\frac{1}{216} < R_1(10) < 0 ⟹ \frac{19}{6} - \frac{1}{216} = \frac{683}{216} < \sqrt{10} < \frac{684}{216} = \frac{19}{6}

\frac{19}{6} is therefore an overestimate for \sqrt{10}, but is accurate to within \frac{1}{216} < 0.005.
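These bounds are easy to confirm numerically in Python:

```python
from math import sqrt

# Verify 683/216 < sqrt(10) < 19/6 and that the error is below 1/216.
approx = 19 / 6
print(approx - sqrt(10), 1 / 216)   # error is positive and under 1/216
assert 683 / 216 < sqrt(10) < 19 / 6
assert 0 < approx - sqrt(10) < 1 / 216
```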
Alternative Versions of Taylor’s Theorem

There are two further common expressions for the remainder in Taylor’s Theorem. These are typically
less easy to use than Lagrange’s form but can sometimes provide sharper estimates for the
remainder, particularly when x is far from the center of the series.

Corollary 3.40. Suppose f^{(n+1)} is continuous on an open interval I containing c, let x ∈ I \ {c}, and
let R_n(x) = f(x) - p_n(x) be the remainder for the Taylor polynomial centered at c. Then:

1. (Integral Remainder) R_n(x) = \int_c^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\,dt

2. (Cauchy’s Form) ∃ξ between c and x such that R_n(x) = \frac{(x-\xi)^n}{n!} (x-c) f^{(n+1)}(\xi)
Using these expressions it is possible to explicitly prove Newton’s binomial series formula.

Corollary 3.41. If α ∈ ℝ and |x| < 1, then

    (1 + x)^\alpha = 1 + \sum_{n=1}^\infty \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!} x^n
                   = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!} x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!} x^3 + \frac{\alpha(\alpha-1)(\alpha-2)(\alpha-3)}{4!} x^4 + \cdots

If α ∈ ℕ_0, this is the usual binomial theorem. Otherwise it is more interesting: for instance,

    \sqrt{1 + x} = (1 + x)^{1/2} = 1 + \frac{1}{2} x - \frac{1}{8} x^2 + \frac{1}{16} x^3 - \frac{5}{128} x^4 + \cdots

    \frac{1}{(1 + x)^3} = 1 - 3x + 6x^2 - 10x^3 + 15x^4 - \cdots

Of course this last could easily be obtained from \frac{1}{1+x} = \sum (-1)^n x^n by differentiating twice!
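The coefficients α(α-1)···(α-n+1)/n! satisfy a simple recursion, so the series is cheap to sum; a Python sketch (the function name is mine):

```python
def binom_series(alpha, x, N):
    """Partial sum of Newton's binomial series for (1+x)^alpha, |x| < 1."""
    total, term = 1.0, 1.0
    for n in range(1, N):
        term *= (alpha - n + 1) / n * x   # a_n x^n from a_{n-1} x^{n-1}
        total += term
    return total

print(binom_series(0.5, 0.4, 50), 1.4**0.5)   # sqrt(1.4) two ways
print(binom_series(3.0, 0.2, 50), 1.2**3)     # integer alpha: series terminates
```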
Exercises 3.31. Key concepts: Taylor Series/Polynomials, Lagrange’s form for Remainder

1. Compute the Maclaurin series for \cos x directly from the definition and use Taylor’s Theorem
to indicate why it converges to \cos x for all x ∈ ℝ.

2. Repeat the previous exercise for \sinh x = \frac{1}{2}(e^x - e^{-x}) and \cosh x = \frac{1}{2}(e^x + e^{-x}).
3. Find the Maclaurin series for the function \sin(3x^2). How do you know you are correct?

4. Find the Taylor series of f(x) = x^4 - 3x^2 + 2x - 5 at x = 2 and show that T_2 f(x) = f(x).
5. Find a rational approximation to \sqrt[3]{9} using the first Taylor polynomial for f(x) = \sqrt[3]{x}. Now use
Taylor’s Theorem to estimate its accuracy.

6. If c ≠ 1, use the fact that 1 - x = (1 - c)\left(1 - \frac{x-c}{1-c}\right) to obtain the Taylor series of \frac{1}{1-x} centered at
c. Hence conclude that \frac{1}{1-x} is analytic on its domain ℝ \ {1}.
7. We prove that the Maclaurin series \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n converges to \ln(1 + x) whenever 0 < x ≤ 1.

   (a) Explicitly compute \frac{d^{n+1}}{dx^{n+1}} \ln(1 + x).
   (b) Suppose 0 < x ≤ 1. Using Taylor’s Theorem, prove that \lim_{n\to\infty} R_n(x) = 0.
   (If -1 < x < 0, the argument is tougher, being similar to Exercise 11)

8. Why can’t we use Taylor’s Theorem to approximate the error in \frac{1}{1-x} = 1 + x + R_1(x) when
x ≥ 1? Try it when x = 2: what happens? What about when x = -2?
9. Prove Taylor’s Theorem with integral remainder when c = 0 by using the following as an
induction step: for each n ∈ ℕ, define

    A_n(x) = \int_0^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\,dt

and use integration by parts to prove that A_{n+1} = A_n - \frac{x^{n+1}}{(n+1)!} f^{(n+1)}(0).

(The Cauchy form follows from the intermediate value theorem for integrals, which we’ll see later)
10. Consider the function

    f(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}

   (a) Prove by induction that there exists a degree 2n polynomial q_n for which

       f^{(n)}(x) = q_n\left(\frac{1}{x}\right) e^{-1/x}   whenever x > 0

   (b) Prove that f is infinitely differentiable at x = 0 with f^{(n)}(0) = 0 (use Exercise 3.30.3).

The Maclaurin series of f is identically zero! Moreover, f is smooth (infinitely differentiable) on ℝ but
non-analytic at zero since it does not equal its Taylor series on any open interval containing zero.

A modification allows us to create bump functions, which find wide use in analysis. If a < b, define

    g_{a,b} : x ↦ f(x - a) f(b - x)

This is smooth on ℝ but non-zero only on the interval (a, b). A further modification involving two
such functions g_{a,b} creates a smooth function on ℝ which satisfies

    h_{a,b,\epsilon}(x) = \begin{cases} 0 & \text{if } x ≤ a - \epsilon \text{ or } x ≥ b + \epsilon \\ 1 & \text{if } a ≤ x ≤ b \end{cases}

This ‘switches on’ rapidly from 0 to 1 near a and switches off similarly near b. By letting ϵ be small,
we smoothly (but not uniformly) approximate the indicator function on [a, b].

[Figure: graph of h_{a,b,ϵ}: zero for x ≤ a - ϵ and x ≥ b + ϵ, equal to 1 on [a, b]]
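One standard way to build a smooth switch from f uses the ratio f(t)/(f(t) + f(1 - t)); the construction below is a sketch of that idea (names and details are my own, not the text's exact definition):

```python
from math import exp

def f(x):
    """The smooth-but-non-analytic function: e^{-1/x} for x > 0, else 0."""
    return exp(-1.0 / x) if x > 0 else 0.0

def step(t):
    """Smooth: 0 for t <= 0, 1 for t >= 1; the denominator is never zero."""
    return f(t) / (f(t) + f(1 - t))

def h(x, a, b, eps):
    """Smooth 'switch': 0 outside (a-eps, b+eps), 1 on [a, b]."""
    return step((x - (a - eps)) / eps) * step(((b + eps) - x) / eps)

print([round(h(x, 0, 1, 0.25), 3) for x in [-0.5, -0.1, 0.5, 1.1, 1.5]])
```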
11. (Hard) We prove the binomial series formula (Corollary 3.41).

Let f(x) = (1 + x)^\alpha and g(x) = 1 + \sum_{n=1}^\infty a_n x^n where a_n = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}. Our goal is to prove
that f = g on the interval (-1, 1).

   (a) Check that f^{(n)}(0) = n! a_n, so that g really is the Maclaurin series of f.

   (b) i. Prove that the radius of convergence of g is 1.
       ii. Prove that \lim_{n\to\infty} n a_n x^n = 0 whenever |x| < 1.
       iii. If |x| < 1 and ξ lies between 0 and x, prove that \left| \frac{x - \xi}{1 + \xi} \right| ≤ |x|.
       (Hint: write ξ = tx for some t ∈ (0, 1). . . )

   (c) Use Taylor’s Theorem with Cauchy remainder to prove that

       |R_n(x)| < (n+1) |a_{n+1}| |x|^{n+1} (1 + \xi)^{\alpha - 1}

       Hence conclude that g = f whenever |x| < 1.

   (d) Here is an alternative argument for the full result:
       i. Show that (n+1) a_{n+1} + n a_n = \alpha a_n.
       ii. Differentiate term-by-term to prove directly that g satisfies the differential equation
       (1 + x) g'(x) = \alpha g(x). Solve this to show that g = f whenever |x| < 1.