4 Integration

The theory of inﬁnite series addresses how to sum inﬁnitely many ﬁnite quantities. Integration, by

contrast, is the business of summing inﬁnitely many inﬁnitesimal quantities. Attempts to do both have

been part of mathematics for well over 2000 years, and the philosophical objections are just as old.

The development and increased application of calculus from the late 1600s onward spurred mathe-

maticians to put the theory on a ﬁrmer footing, though from Newton and Leibniz it took another 150

years before Bernhard Riemann (1856) provided a thorough development of the integral.

4.32 The Riemann Integral

The basic idea behind Riemann integration is to approximate area using a sequence of rectangles

whose width tends to zero. The following discussion illustrates the essential idea, which should be

familiar from elementary calculus.

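Example 4.1. Suppose $f(x) = x^2$ is defined on $[0, 1]$. For each $n \in \mathbb{N}$, let $\Delta x = \frac{1}{n}$ and define $x_i = i\,\Delta x$. Above each subinterval $[x_{i-1}, x_i]$, raise a rectangle of height $f(x_i) = x_i^2$. The sum of the areas of these rectangles is the Riemann sum with right-endpoints

\[ R_n = \sum_{i=1}^n f(x_i)\,\Delta x = \frac{1}{n^3}\sum_{i=1}^n i^2 = \frac{1}{6n^3}\,n(n+1)(2n+1) = \frac{1}{3} + \frac{3n+1}{6n^2} \]

The Riemann sum with left-endpoints is defined similarly:

\[ L_n = \sum_{i=1}^n f(x_{i-1})\,\Delta x = \frac{1}{n^3}\sum_{i=1}^n (i-1)^2 = \frac{1}{3} - \frac{3n-1}{6n^2} \]

Since $f$ is increasing, the area $A$ under the curve plainly satisfies $L_n \le A \le R_n$. By the squeeze theorem, we conclude that $A = \frac{1}{3}$.

[Figure: right- and left-endpoint Riemann sums for $f(x) = x^2$ on $[0, 1]$ with $n = 16$, for which $R_n = 0.365234$ and $L_n = 0.302734$.]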

The example should feel convincing, though perhaps this is due to the simplicity of the function. To

apply this approach to more general functions, we need to be signiﬁcantly more rigorous.
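The computation in Example 4.1 can be checked numerically. The following is a minimal sketch, assuming $f(x) = x^2$ on $[0, 1]$ with $n$ equal subintervals; the function names `left_sum` and `right_sum` are illustrative and not part of the notes.

```python
def right_sum(n):
    """Right-endpoint Riemann sum for f(x) = x**2 on [0, 1], n equal subintervals."""
    dx = 1.0 / n
    return sum((i * dx) ** 2 for i in range(1, n + 1)) * dx


def left_sum(n):
    """Left-endpoint Riemann sum for f(x) = x**2 on [0, 1], n equal subintervals."""
    dx = 1.0 / n
    return sum(((i - 1) * dx) ** 2 for i in range(1, n + 1)) * dx


# Both sums squeeze towards 1/3 as n grows.
for n in (4, 16, 64, 256):
    print(n, left_sum(n), right_sum(n))
```

The printed pairs bracket $\frac{1}{3}$ and the gap between them shrinks like $\frac{1}{n}$, which is the squeeze argument in numerical form.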

Two of Zeno’s ancient paradoxes are relevant here: Achilles and the Tortoise concerns a convergent inﬁnite series,

while the Arrow Paradox toys with integration by questioning whether time can be viewed as a sum of instants. Perhaps

the most famous contemporary criticism comes from Bishop George Berkeley, who gave his name to the city and ﬁrst

UC campus: in 1734’s The Analyst, Berkeley savaged the foundations of calculus, describing the inﬁnitesimal increments

required in Newton’s theory of ﬂuxions (derivatives) as merely the “ghosts of departed quantities.”

Recall some basic identities: $\sum_{i=1}^n i = \frac{1}{2}n(n+1)$, $\sum_{i=1}^n i^2 = \frac{1}{6}n(n+1)(2n+1)$, $\sum_{i=1}^n i^3 = \frac{1}{4}n^2(n+1)^2$

Definition 4.2. A partition $P = \{x_0, \ldots, x_n\}$ of an interval $[a, b]$ is a finite sequence for which

\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b \]

Choosing a sample point $x_i^*$ in each subinterval $[x_{i-1}, x_i]$ results in a tagged partition.

The mesh of the partition is $\operatorname{mesh}(P) := \max \Delta x_i$, the width $\Delta x_i = x_i - x_{i-1}$ of the largest subinterval.

If $f : [a, b] \to \mathbb{R}$, the Riemann sum $\sum_{i=1}^n f(x_i^*)\,\Delta x_i$ evaluates the area of a family of $n$ rectangles, as pictured. The heights $f(x_i^*)$ and thus areas can be negative or zero.

[Figure: the rectangles of a Riemann sum for $f(x)$ over a tagged partition of $[a, b]$.]

In elementary calculus, one typically computes Riemann sums for equally-spaced partitions with left,

right or middle sample points. The ﬂexibility of tagged partitions makes applying Riemann’s deﬁni-

tion a challenge, so we instead consider two special families of rectangles.

Definition 4.3. Given a partition $P$ of $[a, b]$ and a bounded function $f$ on $[a, b]$, define

\[ M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \qquad U(f, P) = \sum_{i=1}^n M_i\,\Delta x_i \]

\[ m_i = \inf_{x \in [x_{i-1}, x_i]} f(x), \qquad L(f, P) = \sum_{i=1}^n m_i\,\Delta x_i \]

$U(f, P)$ and $L(f, P)$ are the upper and lower Darboux sums for $f$ with respect to $P$. The upper and lower Darboux integrals are

\[ U(f) = \inf U(f, P), \qquad L(f) = \sup L(f, P) \]

where the infimum/supremum are taken over all partitions. Necessarily both integrals are finite.

We say that $f$ is (Riemann) integrable on $[a, b]$ if $U(f) = L(f)$. We denote this value by $\int_a^b f$ or $\int_a^b f(x)\,dx$.

[Figure: an upper Darboux sum $U(f, P)$ and a lower Darboux sum $L(f, P)$ over $[a, b]$.]

If the interval is understood or irrelevant, one often simply says that $f$ is integrable and writes $\int f$.

Intuitively, L( f , P) is the sum of the areas of rectangles built on P which just ﬁt under the graph of f .

It is also the inﬁmum of all Riemann sums on P. If f is discontinuous, then L( f , P) need not itself be

a Riemann sum, as there might not exist suitable sample points!

Examples 4.4. 1. We revisit Example 4.1 in this language. Given a partition $Q = \{x_0, \ldots, x_n\}$ of $[0, 1]$ and sample points $x_i^* \in [x_{i-1}, x_i]$, we compute the Riemann sum for $f(x) = x^2$:

\[ \sum_{i=1}^n f(x_i^*)\,\Delta x_i = \sum_{i=1}^n (x_i^*)^2 (x_i - x_{i-1}) \]

Since $f$ is increasing, we have $x_{i-1}^2 \le (x_i^*)^2 \le x_i^2$ on each interval, whence

\[ L(f, Q) = \sum_{i=1}^n x_{i-1}^2 (x_i - x_{i-1}) \le \sum_{i=1}^n (x_i^*)^2 (x_i - x_{i-1}) \le \sum_{i=1}^n x_i^2 (x_i - x_{i-1}) = U(f, Q) \]

The Darboux sums are therefore the Riemann sums for left- and right-endpoints.

If we take $Q_n$ to be the partition with subintervals of equal width $\Delta x = \frac{1}{n}$, then

\[ U(f) = \inf_P U(f, P) \le U(f, Q_n) = \sum_{i=1}^n \left(\frac{i}{n}\right)^2 \Delta x = R_n \]

is the right Riemann sum discussed originally. Similarly $L(f) \ge L_n$. Since $L_n$ and $R_n$ both converge to $\frac{1}{3}$ as $n \to \infty$, the squeeze theorem forces

\[ L_n \le L(f) \le U(f) \le R_n \implies L(f) = U(f) = \frac{1}{3} \]

Otherwise said, $f$ is integrable on $[0, 1]$ with $\int_0^1 x^2\,dx = \frac{1}{3}$.

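2. Suppose $f(x) = kx + c$ on $[a, b]$, and that $k > 0$. Take the evenly spaced partition $P_n$ where $x_i = a + \frac{b-a}{n}\,i$. Since $f$ is increasing, the upper Darboux sum is again the Riemann sum with right-endpoints:

\[ U(f, P_n) = R_n = \sum_{i=1}^n f(x_i)\,\Delta x = \frac{b-a}{n} \sum_{i=1}^n \left[ \frac{k(b-a)}{n}\,i + ak + c \right] \]

\[ = \frac{b-a}{n} \left[ \frac{k(b-a)}{2n}\,n(n+1) + (ak + c)n \right] \xrightarrow{n \to \infty} \frac{k(b-a)^2}{2} + (b-a)(ak + c) = \frac{k}{2}(b^2 - a^2) + c(b-a) \]

[Figure: the upper Darboux sum $U(f, P_n)$ for the line $y = kx + c$, whose heights at the endpoints are $ak + c$ and $bk + c$.]

Similarly, the lower Darboux sum is the Riemann sum with left-endpoints:

\[ L(f, P_n) = L_n = \frac{b-a}{n} \left[ \frac{k(b-a)}{2n}\,n(n-1) + (ak + c)n \right] \xrightarrow{n \to \infty} \frac{k}{2}(b^2 - a^2) + c(b-a) \]

As above, $L_n \le L(f) \le U(f) \le R_n$ and the squeeze theorem prove that $f$ is integrable on $[a, b]$ with

\[ \int_a^b f = \frac{k}{2}(b^2 - a^2) + c(b-a). \]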

Now that we have some examples, a few remarks are in order.

Riemann versus Darboux Definition 4.3 is really that of the Darboux integral. Here is Riemann's definition: $f : [a, b] \to \mathbb{R}$ being integrable with integral $\int_a^b f$ means

\[ \forall \epsilon > 0,\ \exists \delta \text{ such that } (\forall P, x_i^*)\quad \operatorname{mesh}(P) < \delta \implies \left| \sum_{i=1}^n f(x_i^*)\,\Delta x_i - \int_a^b f \right| < \epsilon \]

This is significantly more difficult to work with, though it can be shown to be equivalent to the Darboux integral. We won't pursue Riemann's formulation further, except to observe that if a function is integrable and $\operatorname{mesh}(P_n) \to 0$, then $\int_a^b f = \lim_{n \to \infty} \sum_{i=1}^n f(x_i^*)\,\Delta x_i$: this allows us to approximate integrals using any sample points we choose, which is why right-endpoints ($x_i^* = x_i$) are so common in freshman calculus.

Monotone Functions Darboux sums are easy to compute for monotone functions. As in the examples, if $f$ is increasing, then each $M_i = f(x_i)$, from which $U(f, P)$ is the Riemann sum with right-endpoints. Similarly, $L(f, P)$ is the Riemann sum with left-endpoints.
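For an increasing function the remark above turns the Darboux sums into something directly computable on any partition, equally spaced or not. Here is a minimal sketch under that assumption; the helper name `darboux_sums_increasing` and the sample partition are illustrative only.

```python
def darboux_sums_increasing(f, partition):
    """Return (L(f, P), U(f, P)) for an INCREASING function f.

    `partition` is a sorted list [x_0, ..., x_n]. Because f is assumed
    increasing, the infimum on [x_{i-1}, x_i] is f(x_{i-1}) and the
    supremum is f(x_i).
    """
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        width = right - left
        lower += f(left) * width
        upper += f(right) * width
    return lower, upper


# An unevenly spaced partition of [0, 1], applied to f(x) = x^2.
partition = [0.0, 0.1, 0.5, 0.6, 1.0]
lower, upper = darboux_sums_increasing(lambda x: x * x, partition)
print(lower, upper)
```

The helper is only valid for increasing functions; for a general bounded function the supremum and infimum on each subinterval need not occur at endpoints.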

Area If $f$ is positive and continuous, the Riemann integral $\int_a^b f$ serves as a definition for the area under the curve $y = f(x)$. This should make intuitive sense:

1. In the second example where we have a straight line, we obtain the same value for the area by computing directly as the sum of a rectangle and a triangle!

2. For any partition $P$, the area under the curve should satisfy the inequalities

\[ L(f, P) \le \text{Area} \le U(f, P) \]

But these are precisely the same inequalities satisfied by the integral itself!

\[ L(f, P) \le L(f) = \int_a^b f = U(f) \le U(f, P) \]

In the examples we exhibited a sequence of partitions $(P_n)$ where $U(f, P_n)$ and $L(f, P_n)$ converged to the same limit. The remaining results in this section develop some basic properties of partitions and make this limiting process rigorous.

Deﬁnition 4.5. If P ⊆ Q are both partitions of [a, b], we call Q a reﬁnement of P.

To reﬁne a partition, we simply throw some more points in!

Lemma 4.6. Suppose f : [a, b] → R is bounded.

1. If Q is a reﬁnement of P (on [a, b]), then

L( f , P) ≤ L( f , Q) ≤ U( f , Q) ≤ U( f , P)

2. For any partitions P, Q of [a, b], we have L( f , P) ≤ U( f , Q).

3. L( f ) ≤ U( f )

We’ll see in Theorem 4.17 that every continuous function is integrable.

Proof. 1. We argue by induction. Suppose first that $Q = P \cup \{t\}$ contains exactly one additional point $t \in (x_{k-1}, x_k)$. Write

\[ m_1 = \inf\{ f(x) : x \in [x_{k-1}, t] \}, \qquad m_2 = \inf\{ f(x) : x \in [t, x_k] \} \]

\[ m = \inf\{ f(x) : x \in [x_{k-1}, x_k] \} = \min\{m_1, m_2\} \]

The Darboux sums $L(f, P)$ and $L(f, Q)$ are identical except for the terms involving $t$. This results in extra area:

[Figure: the extra area gained over $[x_{k-1}, x_k]$ when the point $t$ is added to the partition.]

\[ L(f, Q) - L(f, P) = m_1(t - x_{k-1}) + m_2(x_k - t) - m(x_k - x_{k-1}) \]

\[ = (m_1 - m)(t - x_{k-1}) + (m_2 - m)(x_k - t) \ge 0 \]

More generally, since a refinement $Q$ is obtained by adding finitely many new points, induction tells us that $P \subseteq Q \implies L(f, P) \le L(f, Q)$.

The argument for $U(f, Q) \le U(f, P)$ is similar, and the middle inequality is trivial.

2. If $P$ and $Q$ are partitions, then $P \cup Q$ is a refinement of both $P$ and $Q$. By part 1,

\[ L(f, P) \le L(f, P \cup Q) \le U(f, P \cup Q) \le U(f, Q) \qquad (\ast) \]

3. This is an exercise.
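Part 1 of the lemma can be observed numerically. The sketch below assumes an increasing function (here $f(x) = x^3$ on $[0, 2]$, chosen only for illustration) so that the Darboux sums are endpoint sums; the names `darboux_sums_increasing` and `refine` are hypothetical helpers, not notation from the notes.

```python
def darboux_sums_increasing(f, partition):
    """(L(f, P), U(f, P)) for an increasing f on a sorted partition."""
    lower = sum(f(l) * (r - l) for l, r in zip(partition, partition[1:]))
    upper = sum(f(r) * (r - l) for l, r in zip(partition, partition[1:]))
    return lower, upper


def refine(partition, new_points):
    """Throw some more points in: return the refinement P ∪ new_points."""
    return sorted(set(partition) | set(new_points))


def f(x):
    return x ** 3  # increasing on [0, 2]


P = [0.0, 0.5, 1.0, 2.0]
Q = refine(P, [0.25, 1.5])

lower_P, upper_P = darboux_sums_increasing(f, P)
lower_Q, upper_Q = darboux_sums_increasing(f, Q)

# Lemma 4.6, part 1: L(f, P) <= L(f, Q) <= U(f, Q) <= U(f, P)
print(lower_P, lower_Q, upper_Q, upper_P)
```

Adding points can only raise the lower sum and reduce the upper sum, so the four printed numbers appear in increasing order.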

Theorem 4.7. Suppose f : [a, b] → R is bounded.

1. (Cauchy criterion) f is integrable ⇐⇒ ∀ϵ > 0, ∃P such that U( f , P) − L( f , P) < ϵ.

2. $f$ is integrable $\iff \exists (P_n)_{n \in \mathbb{N}}$ such that $U(f, P_n) - L(f, P_n) \to 0$. In such a situation, both sequences $U(f, P_n)$ and $L(f, P_n)$ converge to $\int_a^b f$.

Part 1 is termed a ‘Cauchy’ criterion since it doesn’t mention the integral (limit).

Proof. We prove the Cauchy criterion, leaving part 2 as an exercise.

(⇒) Suppose $f$ is integrable and that $\epsilon > 0$ is given. Since $\inf U(f, Q) = \int_a^b f = \sup L(f, R)$, there exist partitions $Q, R$ such that

\[ U(f, Q) < \int_a^b f + \frac{\epsilon}{2} \quad\text{and}\quad L(f, R) > \int_a^b f - \frac{\epsilon}{2} \]

Let $P = Q \cup R$ and apply $(\ast)$: $L(f, R) \le L(f, P) \le U(f, P) \le U(f, Q)$. But then

\[ U(f, P) - L(f, P) \le U(f, Q) - L(f, R) = U(f, Q) - \int_a^b f + \int_a^b f - L(f, R) < \epsilon \]

(⇐) Assume the right hand side. For every partition, L( f , P) ≤ L( f ) ≤ U( f ) ≤ U( f , P). Thus

0 ≤ U( f ) − L( f ) ≤ U( f , P) − L( f , P) < ϵ

Since this holds for all ϵ > 0, we see that U( f ) = L( f ): that is, f is integrable.
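As a worked instance of the Cauchy criterion, consider again $f(x) = x^2$ on $[0, 1]$ with the equally spaced partitions $Q_n$ of Example 4.4.1. The following short derivation is a sketch that reuses the closed forms for $R_n$ and $L_n$ found in Example 4.1:

```latex
\begin{align*}
U(f, Q_n) - L(f, Q_n) &= R_n - L_n \\
  &= \left(\frac{1}{3} + \frac{3n+1}{6n^2}\right) - \left(\frac{1}{3} - \frac{3n-1}{6n^2}\right) \\
  &= \frac{6n}{6n^2} = \frac{1}{n} < \epsilon \quad \text{whenever } n > \frac{1}{\epsilon}.
\end{align*}
```

Any such $Q_n$ therefore serves as the partition $P$ required by part 1, without needing to know the value of the integral in advance.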

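Examples 4.8. 1. Consider $f(x) = \sqrt{x}$ on the interval $[0, b]$. We choose a sequence of partitions $(P_n)$ that evaluate nicely when fed to this function:

\[ P_n = \{x_0, \ldots, x_n\} \text{ where } x_i = \left(\frac{i}{n}\right)^2 b \implies \Delta x_i = x_i - x_{i-1} = \frac{b}{n^2}\left(i^2 - (i-1)^2\right) = \frac{(2i-1)b}{n^2} \]

Since $f$ is increasing on $[0, b]$, we see that

\[ U(f, P_n) = \sum_{i=1}^n f(x_i)\,\Delta x_i = \sum_{i=1}^n \frac{i}{n}\sqrt{b} \cdot \frac{(2i-1)b}{n^2} = \frac{b^{3/2}}{n^3} \sum_{i=1}^n (2i^2 - i) \]

\[ = \frac{b^{3/2}}{n^3} \left[ \frac{1}{3}n(n+1)(2n+1) - \frac{1}{2}n(n+1) \right] \xrightarrow{n \to \infty} \frac{2}{3}b^{3/2} \]

Similarly

\[ L(f, P_n) = \sum_{i=1}^n f(x_{i-1})\,\Delta x_i = \sum_{i=1}^n \frac{i-1}{n}\sqrt{b} \cdot \frac{(2i-1)b}{n^2} = \frac{b^{3/2}}{n^3} \sum_{i=1}^n (2i^2 - 3i + 1) \]

\[ = \frac{b^{3/2}}{n^3} \left[ \frac{1}{3}n(n+1)(2n+1) - \frac{3}{2}n(n+1) + n \right] \xrightarrow{n \to \infty} \frac{2}{3}b^{3/2} \]

Since the limits are equal, we conclude that $f$ is integrable and

\[ \int_0^b \sqrt{x}\,dx = \frac{2}{3}b^{3/2} \]

[Figure: the upper sum $U(f, P_n)$ and lower sum $L(f, P_n)$ for $\sqrt{x}$ on $[0, b]$.]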

2. Here is the classic example of a non-integrable function. Let $f : [a, b] \to \mathbb{R}$ be the indicator function of the irrational numbers,

\[ f(x) = \begin{cases} 1 & \text{if } x \notin \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{Q} \end{cases} \]

Suppose $P = \{x_0, \ldots, x_n\}$ is any partition of $[a, b]$. Since any interval of positive length contains both rational and irrational numbers, we see that

\[ \sup\{ f(x) : x \in [x_{i-1}, x_i] \} = 1 \implies U(f, P) = \sum_{i=1}^n (x_i - x_{i-1}) = b - a \implies U(f) = b - a \]

\[ \inf\{ f(x) : x \in [x_{i-1}, x_i] \} = 0 \implies L(f, P) = 0 \implies L(f) = 0 \]

Since the upper and lower Darboux integrals differ, $f$ is not (Riemann) integrable.
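The first of these examples, with its unevenly spaced partition $x_i = (i/n)^2 b$, is easy to check by machine. The following is a minimal sketch, assuming $f(x) = \sqrt{x}$ on $[0, b]$; the function name `sqrt_darboux_sums` is illustrative.

```python
import math


def sqrt_darboux_sums(b, n):
    """Lower and upper Darboux sums of sqrt on [0, b] for x_i = (i/n)^2 * b.

    sqrt is increasing, so the lower sum uses left endpoints and the
    upper sum uses right endpoints.
    """
    points = [b * (i / n) ** 2 for i in range(n + 1)]
    lower = sum(math.sqrt(points[i - 1]) * (points[i] - points[i - 1])
                for i in range(1, n + 1))
    upper = sum(math.sqrt(points[i]) * (points[i] - points[i - 1])
                for i in range(1, n + 1))
    return lower, upper


b = 4.0
for n in (10, 100, 1000):
    lower, upper = sqrt_darboux_sums(b, n)
    print(n, lower, upper, (2 / 3) * b ** 1.5)
```

Both columns approach $\frac{2}{3}b^{3/2}$, and their difference works out to $b^{3/2}/n$, in line with the closed forms in the example.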

As any freshman calculus student can attest, if you can find an anti-derivative, then the fundamental theorem of calculus (Section 4.34) makes evaluating integrals far easier. For instance, you are probably desperate to write

\[ \frac{d}{dx}\,\frac{2}{3}x^{3/2} = x^{1/2} \implies \int_0^b \sqrt{x}\,dx = \frac{2}{3}x^{3/2}\,\Big|_0^b = \frac{2}{3}b^{3/2} \]

rather than computing Riemann/Darboux sums as in the previous example! However, in most practical situations, no easy-to-compute anti-derivative exists; the best we can do is to approximate using Riemann sums for progressively finer partitions. Thankfully computers excel at such tedious work!
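As an illustration of that tedious work, here is a minimal sketch of two Riemann-sum estimators: one using equal subintervals with right endpoints, and one using a random tagged partition of prescribed mesh. The function names and the way the random partition is generated are assumptions made for this example only.

```python
import math
import random


def riemann_sum_right(f, a, b, n):
    """Riemann sum on n equal subintervals of [a, b] with right endpoints."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx


def riemann_sum_random(f, a, b, mesh, rng):
    """Riemann sum over a random tagged partition with mesh at most `mesh`.

    Each subinterval has width between mesh/2 and mesh (the last one may be
    shorter), and the sample point is chosen uniformly inside it.
    """
    total = 0.0
    left = a
    while left < b:
        right = min(b, left + rng.uniform(0.5 * mesh, mesh))
        sample = rng.uniform(left, right)
        total += f(sample) * (right - left)
        left = right
    return total


rng = random.Random(0)  # fixed seed so runs are repeatable
print(riemann_sum_right(math.sqrt, 0.0, 4.0, 100_000))            # near 16/3
print(riemann_sum_random(lambda x: x * x, 0.0, 1.0, 1e-3, rng))   # near 1/3
```

For an increasing integrand, any Riemann sum on a partition of mesh $\delta$ differs from the integral by at most $\delta\,(f(b) - f(a))$, so shrinking the mesh gives a guaranteed error bound in these two test cases.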

Exercises 4.32. Key concepts: Darboux sums/integrals, Partitions, sample points & reﬁnements,

Cauchy & sequential criteria for integrability

1. Use partitions to ﬁnd the upper and lower Darboux integrals on the interval [0, b] . Hence prove

that the function is integrable and compute its integral.

(a) f (x) = x

(b) g(x) =

√

2. Repeat question 1 for the following two functions. You cannot simply compute Riemann sums

for left and right endpoints and take limits: why not?

(a) h(x) = x(2 − x) on [0, 2]

(Hint: choose a partition with $2n$ subintervals such that $x_n = 1$ and observe that $h(2 - x) = h(x)$)

(b) On the interval [0, 3], let k(x) =

(

2x if x ≤ 1

5 − x if x > 1

(Hint: this time try a partition with 3n subintervals)

3. Let f (x) = x for rational x and f (x) = 0 for irrational x. Calculate the upper and lower

Darboux integrals for f on the interval [0, b]. Is f integrable on [0, b]?

4. Prove part 3 of Lemma 4.6: L( f ) ≤ U( f ).

5. Prove part 2 of Theorem 4.7.

$f$ is integrable $\iff \exists (P_n)_{n \in \mathbb{N}}$ such that $\lim_{n \to \infty} \left[ U(f, P_n) - L(f, P_n) \right] = 0$

Moreover, prove that both $U(f, P_n)$ and $L(f, P_n)$ converge to $\int_a^b f$.

6. (a) Reread Deﬁnition 4.3. What happens if we allow f : [a, b] → R to be unbounded?

(b) (Hard) Read “Riemann versus Darboux” on page 73. Explain why being Riemann integrable

also forces f to be bounded.

on P.

7. (If you like coding) Write a short program to estimate

f (x) dx using Riemann sums. This

can be very simple (equal partitions with right endpoints), or more complex (random partition

and sample points given a mesh). Apply your program to estimate

sin(x

−

√

) dx.

4.33 Properties of the Riemann Integral

The rough take-away of this long section is that everything you think is integrable probably is! Ex-

amples will be few, since we have not established many explicit values for integrals.

Theorem 4.9 (Linearity). If $f, g$ are integrable and $k, l$ are constant, then $kf + lg$ is integrable and

\[ \int (kf + lg) = k \int f + l \int g \]

Example 4.10. Thanks to examples in the previous section, we can now calculate, e.g.,

\[ \int_0^2 5x^3 - 3\sqrt{x}\,dx = 5 \cdot \frac{1}{4} \cdot 2^4 - 3 \cdot \frac{2}{3} \cdot 2^{3/2} = 20 - 4\sqrt{2} \]

Proof. Suppose $\epsilon > 0$ is given. By the Cauchy criterion (Theorem 4.7, part 1), there exist partitions $R, S$ such that

\[ U(f, R) - L(f, R) < \frac{\epsilon}{2} \quad\text{and}\quad U(g, S) - L(g, S) < \frac{\epsilon}{2} \]

If $P = R \cup S$, then both inequalities are satisfied by $P$ (Lemma 4.6). On each subinterval,

\[ \inf f(x) + \inf g(x) \le \inf\left( f(x) + g(x) \right) \quad\text{and}\quad \sup\left( f(x) + g(x) \right) \le \sup f(x) + \sup g(x) \]

since the individual suprema/infima could be 'evaluated' at different places. Thus

\[ L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P) \]

whence $U(f + g, P) - L(f + g, P) < \epsilon$ and $f + g$ is integrable. Moreover,

\[ \int (f + g) - \int f - \int g \le \left[ U(f, P) - \int f \right] + \left[ U(g, P) - \int g \right] < \epsilon \]

Using lower Darboux sums similarly obtains the other half of the inequality

\[ -\epsilon < \int (f + g) - \int f - \int g < \epsilon \]

Since this holds for all $\epsilon > 0$, we conclude that $\int (f + g) = \int f + \int g$.

That $kf$ is integrable with $\int kf = k \int f$ is an exercise. Put these together for the result.

Corollary 4.11 (Changing endvalues). Suppose $f$ is integrable on $[a, b]$ and $g : [a, b] \to \mathbb{R}$ satisfies $f(x) = g(x)$ on $(a, b)$. Then $g$ is also integrable on $[a, b]$ and $\int_a^b g = \int_a^b f$.

Definition 4.12 (Integration on an open interval). A bounded function $g : (a, b) \to \mathbb{R}$ is integrable if it has an integrable extension $f : [a, b] \to \mathbb{R}$ where $f(x) = g(x)$ on $(a, b)$. In such a case, we define $\int_a^b g := \int_a^b f$.

The Corollary (its proof is an exercise) shows that the choice of extension is irrelevant.

Theorem 4.13 (Basic integral comparisons). Suppose $f$ and $g$ are integrable on $[a, b]$. Then:

1. $f(x) \le g(x) \implies \int_a^b f \le \int_a^b g$

2. $m \le f(x) \le M \implies m(b - a) \le \int_a^b f \le M(b - a)$

3. $fg$ is integrable.

4. $|f|$ is integrable and $\left| \int_a^b f \right| \le \int_a^b |f|$

5. $\max(f, g)$ and $\min(f, g)$ are both integrable.

Part 3 is not integration by parts since it doesn't tell us how $\int fg$ relates to $\int f$ and $\int g$!

Proof. 1. Since $g - f$ is non-negative and integrable, $L(g - f, P) \ge 0$ for all partitions $P$. But then

\[ 0 \le \sup_P L(g - f, P) = L(g - f) = \int_a^b (g - f) = \int_a^b g - \int_a^b f \]

2. Apply part 1 twice.

3. This is an exercise.

4. The integrability is an exercise. For the comparison, apply part 1 to $-|f| \le f \le |f|$.

5. Use $\max(f, g) = \frac{1}{2}(f + g) + \frac{1}{2}|f - g|$, etc., together with the previous parts.

Theorem 4.14 (Domain splitting). Suppose $f : [a, b] \to \mathbb{R}$ and let $c \in (a, b)$. If $f$ is integrable on both $[a, c]$ and $[c, b]$, then it is integrable on $[a, b]$ and

\[ \int_a^b f = \int_a^c f + \int_c^b f \]

[Figure: the area under $f(x)$ on $[a, b]$ split at $c$.]

In light of this result, it is conventional to allow integral limits to be reversed: if $a < b$, then $\int_b^a f := -\int_a^b f$ is consistent with $\int_a^a f = 0$.

Proof. Let $\epsilon > 0$ be given; then there exist partitions $R, S$ of $[a, c]$, $[c, b]$ such that

\[ U(f, R) - L(f, R) < \frac{\epsilon}{2}, \qquad U(f, S) - L(f, S) < \frac{\epsilon}{2} \]

Choose $P = R \cup S$ to partition $[a, b]$, then

\[ U(f, P) - L(f, P) = U(f, R) + U(f, S) - L(f, R) - L(f, S) < \epsilon \]

Moreover

\[ \int_a^b f - \left( \int_a^c f + \int_c^b f \right) \le U(f, P) - L(f, R) - L(f, S) = U(f, P) - L(f, P) < \epsilon \]

Showing that this expression is greater than $-\epsilon$ is similar.

Example 4.15. If $f(x) = \sqrt{x}$ on $[0, 1]$ and $f(x) = 1$ on $[1, 2]$, then

\[ \int_0^2 f = \int_0^1 \sqrt{x}\,dx + \int_1^2 1\,dx = \frac{2}{3} + 1 = \frac{5}{3} \]
Monotonic & Continuous Functions We establish the integrability of two large classes of functions.

Definition 4.16. A function $f : [a, b] \to \mathbb{R}$ is:

Monotonic if it is either increasing ($x < y \implies f(x) \le f(y)$) or decreasing.

Piecewise monotonic if there is a partition $P = \{x_0, \ldots, x_n\}$ (finite!) of $[a, b]$ such that $f$ is monotonic on each open subinterval $(x_{k-1}, x_k)$.

Piecewise continuous if there is a partition such that $f$ is uniformly continuous on each $(x_{k-1}, x_k)$.

Theorem 4.17. If f is monotonic or continuous on [a, b], then it is integrable.

Examples 4.18. 1. Since sine is continuous, we can approximate via a sequence of Riemann sums

sin x dx =

lim

n→∞

∑

i=1

sin

πi

Evaluating this limit is another matter entirely, one best handled in the next section...

2. Similarly, e

√

is integrable and therefore may be approximated via Riemann sums:

√

dx =

lim

n→∞

∑

i=1

exp

= lim

n→∞

∑

j=1

2j −1

exp

Both sums use right endpoints: the ﬁrst has equal subintervals, while the second is analogous

to Example 4.8.1. These limits would typically be estimated using a computer.

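Proof. Since $[a, b]$ is closed and bounded, a continuous function $f$ is uniformly so. Let $\epsilon > 0$ be given: $\exists \delta > 0$ such that $\forall x, y \in [a, b]$,

\[ |x - y| < \delta \implies |f(x) - f(y)| < \frac{\epsilon}{b - a} \]

Let $P$ be a partition with $\operatorname{mesh} P < \delta$. Since $f$ attains its bounds on each $[x_{i-1}, x_i]$,

\[ \exists x_i^*, y_i^* \in [x_{i-1}, x_i] \text{ such that } M_i - m_i = f(x_i^*) - f(y_i^*) < \frac{\epsilon}{b - a} \]

from which

\[ U(f, P) - L(f, P) < \sum_{i=1}^n \frac{\epsilon}{b - a}(x_i - x_{i-1}) = \epsilon \]

The monotonicity argument is an exercise.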

The monotonicity argument is an exercise.

Combining the proof with Deﬁnition 4.12: every uniformly continuous f : (a, b) → R is integrable.

Corollary 4.19. Piecewise continuous and bounded piecewise monotonic functions are integrable.

Proof. If $f$ is piecewise continuous, then the restriction of $f$ to $(x_{k-1}, x_k)$ has a continuous extension $g_k : [x_{k-1}, x_k] \to \mathbb{R}$; this is integrable by Theorem 4.17. By Corollary 4.11, $f$ is integrable on $[x_{k-1}, x_k]$ with $\int_{x_{k-1}}^{x_k} f = \int_{x_{k-1}}^{x_k} g_k$. Theorem 4.14 ($n - 1$ times!) finishes things off:

\[ \int_a^b f = \sum_{k=1}^n \int_{x_{k-1}}^{x_k} g_k \]

The argument for piecewise monotonicity is similar.

Example 4.20. The ‘fractional part’ function f (x) = x − ⌊x⌋

is both piecewise continuous and piecewise monotone on any

bounded interval. It is therefore integrable on any such interval.

[Figure: graph of the fractional part function on $[0, 5]$.]

For a ﬁnal corollary, here is one more incarnation of the intermediate value theorem.

Corollary 4.21 (IVT for integrals). If $f$ is continuous on $[a, b]$, then $\exists \xi \in (a, b)$ for which

\[ f(\xi) = \frac{1}{b - a} \int_a^b f \]

Proof. Since $f$ is continuous, it is integrable on $[a, b]$. By the extreme value theorem it is also bounded and attains its bounds: $\exists p, q \in [a, b]$ such that

\[ f(p) = \inf_{x \in [a, b]} f(x), \qquad f(q) = \sup_{x \in [a, b]} f(x) \]

Applying Theorem 4.13, part 2, with $m = f(p)$ and $M = f(q)$, we see that

\[ (b - a) f(p) \le \int_a^b f \le (b - a) f(q) \]

[Figure: graph of $f$ on $[a, b]$ with the points $p$, $q$ and $\xi$ marked.]

Divide by $b - a$ and apply the usual intermediate value theorem for $f$ to see that the required $\xi$ exists between $p$ and $q$.

In the picture, when $f$ is positive and continuous, the grey area equals that under the curve; imagine levelling off the blue hill with a bulldozer. The quantity $\frac{1}{b - a} \int_a^b f$ is the average value of $f$ on $[a, b]$: to see why this interpretation is sensible, take a sequence of Riemann sums on equally-spaced partitions $P_n$ to see that

\[ \frac{1}{b - a} \int_a^b f = \lim_{n \to \infty} \frac{1}{b - a} \sum_{i=1}^n f(x_i^*)\,\Delta x = \lim_{n \to \infty} \frac{f(x_1^*) + \cdots + f(x_n^*)}{n} \]

is the limit of a sequence of averages of equally-spaced samples $f(x_i^*)$.

What can/cannot be integrated?

We now know a great many examples of integrable functions:

• Piecewise continuous & monotonic functions are integrable.

• Linear combinations, products, absolute values, maximums and minimums of (already) inte-

grable functions.

By contrast, we’ve only seen one non-integrable function (Example 4.8.2). After so many positive

integrability conditions, it is reasonable to ask precisely which functions are Riemann integrable.

Here is the answer, though it is quite tricky to understand.

Theorem 4.22 (Lebesgue). Suppose f : [a, b] → R is bounded. Then

f is Riemann integrable ⇐⇒ it is continuous except on a set of measure zero

Naïvely, the measure of a set is the sum of the lengths of its maximal subintervals, though unfortunately this doesn't make for a very useful definition. Any countable subset has measure zero, so Lebesgue's result is almost as if we can extend Corollary 4.19 to allow for infinite sums. For instance, Exercise 1.17.8 describes a function which is continuous only on the irrationals: it is thus Riemann integrable (indeed $\int_a^b f = 0$ for any $a < b$). There are also uncountable sets with measure zero such as Cantor's middle-third set $C$: the function

\[ f(x) = \begin{cases} 1 & \text{if } x \in C \\ 0 & \text{otherwise} \end{cases} \]

is continuous except on $C$ and therefore Riemann integrable; again $\int_0^1 f(x)\,dx = 0$.

Exercises 4.33. Key concepts: Linear combinations, products, etc., of integrable functions are integrable,

Continuous and monotone functions are integrable, Integrability on open intervals

1. Explain why

2π

sin

) dx ≤

2. If f is integrable on [a, b] prove that it is integrable on any interval [c, d] ⊆ [a, b].

3. We complete the proof of Theorem 4.9 (linearity of integration).

(a) Suppose k > 0, let A ⊆ R and deﬁne kA := {kx : x ∈ A}. Prove that sup kA = k sup A

and inf kA = k inf A.

(b) If k > 0 prove that k f is integrable on any interval and that

k f = k

f .

Formally, the length of an open interval $(a, b)$ is $b - a$ and a set $A \subseteq \mathbb{R}$ has measure zero if

\[ \forall \epsilon > 0,\ \exists \text{ open intervals } I_n \text{ such that } A \subseteq \bigcup_{n=1}^\infty I_n \text{ and } \sum_{n=1}^\infty \operatorname{length}(I_n) < \epsilon \]

More generally, the Lebesgue measure of a set (subject to a technical condition) is the infimum of the sum of the lengths of any countable collection of open covering intervals. Measure theory is properly a matter for graduate study. Surprisingly, there exist sets with positive measure that contain no subintervals, and even sets which are non-measurable!

4. Give an example of an integrable but discontinuous function on a closed bounded interval [a, b]

for which the conclusion of the Intermediate Value Theorem for Integrals is false.

5. Use Darboux sums to compute the value of the integral $\int_{1/2}^{15/2} x - \lfloor x \rfloor\,dx$ (Example 4.20).

6. We prove and extend Corollary 4.11. Suppose $f$ is integrable on $[a, b]$.

(a) If $g : [a, b] \to \mathbb{R}$ satisfies $f(x) = g(x)$ for all $x \in (a, b)$, prove that $g$ is integrable and $\int_a^b g = \int_a^b f$.

(Hint: consider $h = f - g$ and show that $\int_a^b h = 0$)

(b) Now suppose $g : [a, b] \to \mathbb{R}$ satisfies $f(x) = g(x)$ for all $x \in [a, b]$ except at finitely many points. Prove that $g$ is integrable and $\int_a^b g = \int_a^b f$.

7. Show that an increasing function on $[a, b]$ is integrable and thus complete Theorem 4.17.

(Hint: Choose a partition with $\operatorname{mesh} P < \frac{\epsilon}{f(b) - f(a)}$)

8. Suppose f and g are integrable on [a, b].

(a) Deﬁne h(x) =



f (x)



. We know:

• f is bounded: ∃K such that

f (x)

≤ K on [a, b] .

• Given ϵ > 0, ∃P such that U( f , P) − L( f , P) <

. For each subinterval [x

i−1

, x

], let

= sup f (x), m

= inf f (x), M

= sup h(x), m

= inf h(x)

Prove that M

−m

≤ 2(M

−m

)K. Hence conclude that h is integrable.

(b) Prove that $fg$ is integrable.

(Hint: $fg = \frac{1}{4}(f + g)^2 - \frac{1}{4}(f - g)^2$)

(c) Prove that $U(|f|, P) - L(|f|, P) \le U(f, P) - L(f, P)$ for any partition $P$. Hence conclude that $|f|$ is integrable.

(One can extend these arguments to show that if $j$ is continuous, then $j \circ f$ is integrable. Parts (a) and (c) correspond to $j(x) = x^2$ and $j(x) = |x|$.)

9. (Hard) Let

\[ f(x) = \begin{cases} x & \text{if } x \ne 0 \text{ and } \sin\frac{1}{x} > 0 \\ -x & \text{if } x \ne 0 \text{ and } \sin\frac{1}{x} < 0 \\ 0 & \text{if } x = 0 \end{cases} \]

(a) Show that $f$ is not piecewise continuous on $[0, 1]$.

(b) Show that $f$ is not piecewise monotonic on $[0, 1]$.

(c) Prove that $f$ is integrable on $[0, 1]$.

(Hint: given $\epsilon$, hunt for a suitable partition to make $U(f, P) - L(f, P) < \epsilon$ by considering $[0, x_1]$ differently to the other subintervals)

(d) Make a similar argument which proves that $g = \sin\frac{1}{x}$ is integrable on $(0, 1]$.

(Hint: Show that $g$ has an integrable extension on $[0, 1]$)

4.34 The Fundamental Theorem of Calculus

The key result linking integration and differentiation is usually presented in two parts. While there

are signiﬁcant subtleties, the rough statements are as follows (we follow the traditional numbering):

Part I Differentiation reverses integration: $\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$

Part II Integration reverses differentiation: $\int_a^b F'(x)\,dx = F(b) - F(a)$

These facts seemed intuitively obvious to early practitioners of calculus. Given a continuous positive function $f$:

• Let $F(x)$ denote the area under $y = f(x)$ between $0$ and $x$.

• A small increase $\Delta x$ results in the area increasing by $\Delta F$.

• $\Delta F \approx f(x)\,\Delta x$ is approximately the area of a rectangle, whence $\frac{\Delta F}{\Delta x} \approx f(x)$. This is part I.

• $F(b) - F(a) \approx \sum \Delta F_i \approx \sum f(x_i)\,\Delta x_i$. Since $F' = f$, this is part II.

[Figure: the thin strip of area $\Delta F$ of width $\Delta x$ under the graph of $f(x)$.]

When Leibniz introduced the symbols $\int$ and $d$ in the late 1600s, it was partly to reflect the fundamental theorem. If you're happy with non-rigorous notions of limit, rate of change, area, and (infinite) sums, the above is all you need!

Of course we are very much concerned with the details: What must we assume about f and F, and

how are these properties used in the proof?

Theorem 4.23 (FTC, part I). Suppose $f$ is integrable on $[a, b]$. For any $x \in [a, b]$, define

\[ F(x) := \int_a^x f(t)\,dt \]

Then:

1. $F$ is uniformly continuous on $[a, b]$;

2. If $f$ is continuous at $c \in [a, b]$, then $F$ is differentiable at $c$ with $F'(c) = f(c)$.

Compare this with the naïve version above where we assumed $f$ was continuous. We now require only the integrability of $f$, and its continuity at one point for the full result.

$\int$ is a stylized S for sum, while $d$ stands for difference. Given a sequence $F = (F_0, F_1, \ldots, F_n)$, construct a new sequence of differences

\[ dF = (F_1 - F_0,\ F_2 - F_1,\ \ldots,\ F_n - F_{n-1}) \]

which can then be summed:

\[ \int dF = (F_1 - F_0) + (F_2 - F_1) + \cdots + (F_n - F_{n-1}) = F_n - F_0 \qquad (\ast) \]

Viewing a function as an 'infinite sequence' of values spaced along an interval, $dF$ becomes a sequence of infinitesimals and $(\ast)$ is essentially the fundamental theorem: $\int dF = F(b) - F(a)$. It is the concept of function that is suspect here, not the essential relationship between sums and differences.

Strictly: if $c = a$, then $F$ is right-differentiable, etc.

Examples 4.24. Examples of this type appear in every elementary calculus course.

1. Since f (x) = sin

−7) is continuous on any bounded interval, we conclude that

sin

−7) dt = sin

−7)

If one follows Theorem 4.14 and its conventions, then this is valid for all x ∈ R.

2. The chain rule permits more complicated examples. For instance: f (t) = sin

√

t is continuous

on its domain [0, ∞) and y(x) = x

+ 3 has range [ 3, ∞) ⊆ dom( f ), whence

sin

√

t dt =

sin

√

t dt = 2x sin

+ 3

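3. For a final positive example, we consider when

\[ \frac{d}{dx} \int_{\sin x}^{e^x} \tan(t^2)\,dt = e^x \tan(e^{2x}) - \cos x \tan(\sin^2 x) \]

makes sense. To evaluate this, first choose any constant $a$ and write $\int_{\sin x}^{e^x} = \int_a^{e^x} - \int_a^{\sin x}$ before differentiating. This is valid provided $\sin x$, $e^x$ and $a$ all lie in the same subinterval of

\[ \operatorname{dom} \tan(t^2) = \mathbb{R} \setminus \left\{ \pm\sqrt{\tfrac{\pi}{2}},\ \pm\sqrt{\tfrac{3\pi}{2}},\ \pm\sqrt{\tfrac{5\pi}{2}},\ \ldots \right\} \]

Since $|\sin x| \le 1 < \sqrt{\frac{\pi}{2}}$, this requires

\[ e^x < \sqrt{\tfrac{\pi}{2}} \iff x < \tfrac{1}{2} \ln \tfrac{\pi}{2} \]

Choosing $a = 1$ would certainly suffice.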

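4. Now consider why the theorem requires continuity. The piecewise continuous function

\[ f : [0, 2] \to \mathbb{R} : x \mapsto \begin{cases} 2x & \text{if } x \le 1 \\ \frac{1}{2} & \text{if } x > 1 \end{cases} \]

has a jump discontinuity at $x = 1$. We can still compute

\[ F(x) = \int_0^x f(t)\,dt = \begin{cases} \int_0^x 2t\,dt = x^2 & \text{if } x \le 1 \\ \int_0^1 2t\,dt + \int_1^x \frac{1}{2}\,dt = \frac{1}{2}(x + 1) & \text{if } x > 1 \end{cases} \]

This is continuous, indeed uniformly so! However the discontinuity of $f$ results in $F$ having a corner and thus being non-differentiable at $x = 1$. Indeed $F'(x) = f(x)$ whenever $x \ne 1$: that is, at all values of $x$ where $f$ is continuous.

[Figure: graphs of $f(x)$ and $F(x)$ on $[0, 2]$.]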
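The corner in the last example can be seen by comparing one-sided difference quotients of $F$ at $x = 1$. This is a minimal sketch, assuming $f(x) = 2x$ for $x \le 1$ and $f(x) = \frac{1}{2}$ for $x > 1$, so that $F(x) = x^2$ or $\frac{1}{2}(x + 1)$ respectively; the name `difference_quotient` is illustrative.

```python
def F(x):
    """F(x) = integral of f from 0 to x, using the closed forms of the example."""
    return x * x if x <= 1 else 0.5 * (x + 1)


def difference_quotient(h):
    """(F(1 + h) - F(1)) / h for a nonzero step h of either sign."""
    return (F(1 + h) - F(1)) / h


# Left-hand quotients approach 2, right-hand quotients equal 1/2:
# the one-sided limits disagree, so F is not differentiable at x = 1.
for h in (1e-2, 1e-4, 1e-6):
    print(h, difference_quotient(-h), difference_quotient(h))
```

The two one-sided limits are exactly the one-sided limits of $f$ at the jump, which is what FTC I would predict on either side of $x = 1$.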

Proving FTC I Neither half of the theorem is particularly difficult once you write down what you know and what you need to prove. Here are the key ingredients:

1. Uniform continuity for $F$ means we must control the size of

\[ |F(y) - F(x)| = \left| \int_a^y f(t)\,dt - \int_a^x f(t)\,dt \right| = \left| \int_x^y f(t)\,dt \right| \le \int_x^y |f(t)|\,dt \]

But the boundedness of $f$ allows us to control this last integral.

2. $F'(c) = \lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$, which means controlling the size of

\[ \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \left| \frac{1}{x - c} \int_c^x f(t)\,dt - f(c) \right| \]

The trick here is to bring the constant $f(c)$ inside the integral as $\frac{1}{x - c} \int_c^x f(c)\,dt$ so that the above becomes $\left| \frac{1}{x - c} \int_c^x \left( f(t) - f(c) \right) dt \right|$. This may now be controlled via the continuity of $f$.

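Proof. 1. Since $f$ is integrable, it is bounded: $\exists M > 0$ such that $|f(x)| \le M$ for all $x$.

Let $\epsilon > 0$ be given and define $\delta = \frac{\epsilon}{M}$. Then, for any $x, y \in [a, b]$,

\[ 0 < y - x < \delta \implies |F(y) - F(x)| = \left| \int_x^y f(t)\,dt \right| \le \int_x^y |f(t)|\,dt \quad \text{(Theorem 4.13, part 4)} \]

\[ \le M(y - x) \quad \text{(Theorem 4.13, part 2)} \]

\[ < M\delta = \epsilon \]

We conclude that $F$ is uniformly continuous on $[a, b]$.

2. Let $\epsilon > 0$ be given. Since $f$ is continuous at $c$, $\exists \delta > 0$ such that, for all $t \in [a, b]$,

\[ |t - c| < \delta \implies |f(t) - f(c)| < \frac{\epsilon}{2} \]

Now for all $x \in [a, b]$ (except $c$),

\[ 0 < |x - c| < \delta \implies \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \frac{1}{|x - c|} \left| \int_c^x f(t) - f(c)\,dt \right| \quad \text{(Theorem 4.9)} \]

\[ \le \frac{1}{|x - c|} \left| \int_c^x |f(t) - f(c)|\,dt \right| \quad \text{(Theorem 4.13)} \]

\[ \le \frac{1}{|x - c|} \cdot \frac{\epsilon}{2} |x - c| < \epsilon \]

Clearly $\lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$. Otherwise said, $F$ is differentiable at $c$ with $F'(c) = f(c)$.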

The Fundamental Theorem, part II As with part I, the formulaic part of the result should be familiar,

though we are more interested in the assumptions and where they are needed.

Theorem 4.25 (FTC, part II). Suppose $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, and moreover that $g'$ is integrable on $(a, b)$ (recall Definition 4.12). Then,

\[ \int_a^b g' = g(b) - g(a) \]

Part II is often expressed in terms of anti-derivatives: $F$ being an anti-derivative of $f$ if $F' = f$. Combined with FTC, part I, we recover the familiar '$+c$' result and a simpler version of the fundamental theorem often seen in elementary calculus.

Corollary 4.26. Let $f$ be continuous on $[a, b]$.

• If $F$ is an anti-derivative of $f$, then $\int_a^b f = F(b) - F(a)$.

• Every anti-derivative of $f$ has the form $F(x) = \int_a^x f(t)\,dt + c$ for some constant $c$.

Examples 4.27. Again, basic examples should be familiar.

1. Plainly $g(x) = x^2 + 2x^{3/2}$ is continuous on $[1, 4]$ and differentiable on $(1, 4)$ with derivative $g'(x) = 2x + 3\sqrt{x}$; this last is continuous (and thus integrable) on $(1, 4)$. We conclude that

\[ \int_1^4 2x + 3\sqrt{x}\,dx = x^2 + 2x^{3/2}\,\Big|_1^4 = (16 + 16) - (1 + 2) = 29 \]

2. If $g(x) = \sin(3x^2)$, then $g'(x) = 6x \cos(3x^2)$. Certainly $g$ satisfies the hypotheses of the theorem on any bounded interval $[a, b]$. We conclude

\[ \int_a^b 6x \cos(3x^2)\,dx = \sin(3b^2) - \sin(3a^2) \]

Moreover, every anti-derivative of $f(x) = 6x \cos(3x^2)$ has the form $F(x) = \sin(3x^2) + c$.

3. Recall Example 4.24.4 where the discontinuity of f at x = 1 led to the non-differentiability of

F(x) =

f (t) dt. The function F therefore fails the hypotheses of FTC II on the interval [0, 2].

It almost, however, satisﬁes the conclusions of FTC II, though this is somewhat tautological

given the deﬁnition of F: except at x = 1, F is certainly an anti-derivative of f , and moreover

f (x) dx = F(2) − F(0).

In case you're worried that this makes the theorem trivial, note that other anti-derivatives $\tilde F$ of $f$ exist (except at $x = 1$) which fail to satisfy the conclusion. For instance

$$\tilde F(x) = \begin{cases} & \text{if } x < 1 \\ x & \text{if } x > 1 \end{cases} \implies \tilde F(2) - \tilde F(0) = 1 \neq \int_0^2 f(x)\,dx$$
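The first two examples can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: the helper `midpoint_sum` and the sample interval $[0, 1]$ for Example 2 are assumptions chosen purely for illustration. It compares a Riemann sum of $g'$ with $g(b) - g(a)$.

```python
import math

def midpoint_sum(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# Example 1: g(x) = x^2 + 2x^(3/2) on [1, 4], with g'(x) = 2x + 3*sqrt(x)
g1 = lambda x: x**2 + 2 * x**1.5
dg1 = lambda x: 2 * x + 3 * math.sqrt(x)

# Example 2: g(x) = sin(3x^2); the interval [0, 1] is an arbitrary sample choice
g2 = lambda x: math.sin(3 * x**2)
dg2 = lambda x: 6 * x * math.cos(3 * x**2)

for g, dg, a, b in [(g1, dg1, 1.0, 4.0), (g2, dg2, 0.0, 1.0)]:
    approx = midpoint_sum(dg, a, b)   # Riemann sum of the derivative
    exact = g(b) - g(a)               # value predicted by FTC II
    print(f"Riemann sum = {approx:.6f},  g(b) - g(a) = {exact:.6f}")
```

Agreement to several decimal places is evidence for, not a proof of, the identity; the theorem is what guarantees equality.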

Proving FTC II Exercise 10 offers a relatively easy proof when $g' = f$ is continuous. For the real McCoy, we can only rely on the integrability of $g'$: the trick is to use the mean value theorem to write $g(b) - g(a)$ as a Riemann sum over a suitable partition.

Proof. Suppose $\epsilon > 0$ is given. Since $g'$ is integrable, we may choose some partition $P = \{x_0, \ldots, x_n\}$ satisfying $U(g', P) - L(g', P) < \epsilon$. Since $g$ satisfies the mean value theorem on each subinterval,

$$\exists\, \xi_i \in (x_{i-1}, x_i) \text{ such that } g'(\xi_i) = \frac{g(x_i) - g(x_{i-1})}{x_i - x_{i-1}}$$

from which

$$g(b) - g(a) = \sum_{i=1}^n \big(g(x_i) - g(x_{i-1})\big) = \sum_{i=1}^n g'(\xi_i)(x_i - x_{i-1})$$

This is a Riemann sum for $g'$ associated to the partition $P$. Since the upper and lower Darboux sums are the supremum and infimum of these, we see that

$$L(g', P) \le g(b) - g(a) \le U(g', P)$$

However $\int_a^b g'$ satisfies the same inequality: $L(g', P) \le \int_a^b g' \le U(g', P)$. Both quantities therefore lie in an interval of length less than $\epsilon$, and so differ by less than $\epsilon$. Since this holds for all $\epsilon > 0$, we conclude that $\int_a^b g' = g(b) - g(a)$.
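The bracketing step in the proof can be observed numerically. This is a hedged sketch rather than anything from the notes: it reuses $g(x) = x^2 + 2x^{3/2}$ on $[1, 4]$ from Examples 4.27, and the function name `darboux_sums` and the uniform partitions are assumptions made for illustration.

```python
import math

def g(x):
    return x**2 + 2 * x**1.5

def dg(x):
    return 2 * x + 3 * math.sqrt(x)   # g', which is increasing on [1, 4]

def darboux_sums(n, a=1.0, b=4.0):
    """Lower sum, telescoped sum and upper sum for g' on a uniform partition.

    Because g' is increasing, its infimum and supremum on each subinterval
    are attained at the left and right endpoints respectively.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(dg(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    upper = sum(dg(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    # Telescoping sum of g(x_i) - g(x_{i-1}); equals g(b) - g(a) up to rounding
    telescoped = sum(g(xs[i]) - g(xs[i - 1]) for i in range(1, n + 1))
    return lower, telescoped, upper

for n in (10, 100, 1000):
    lo, mid, hi = darboux_sums(n)
    print(f"n={n:5d}  L={lo:.5f}  g(b)-g(a)={mid:.5f}  U={hi:.5f}  U-L={hi - lo:.5f}")
```

As the partition is refined, $U - L$ shrinks while $g(b) - g(a)$ stays trapped between the two sums, which is the mechanism the proof exploits.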

While we certainly used the integrability of $g'$ in the proof, it might seem strange that we assumed it

at all: shouldn’t every derivative be integrable? Perhaps surprisingly, the answer is no! If you want

a challenge, look up the Volterra function, which is differentiable everywhere but whose derivative is

non-integrable!

The Rules of Integration

If one wants to evaluate an integral, rather than merely show it exists, there are really only two options:

1. Evaluate Riemann sums and take limits. This is often difﬁcult if not impossible to do explicitly.

2. Use FTC II. The problem now becomes the ﬁnding of anti-derivatives, for which the core method

is essentially guess and differentiate. To obtain general rules, we can attempt to reverse the rules

of differentiation.

Integration by Parts Recall the product rule: the product $g = uv$ of two differentiable functions is differentiable with $g' = u'v + uv'$. Now apply Theorems 4.9, 4.13 and FTC II.

Corollary 4.28 (Integration by Parts). Suppose $u, v$ are continuous on $[a, b]$, differentiable on $(a, b)$, and that $u', v'$ are integrable on $(a, b)$. Then

$$\int_a^b u'(x)v(x)\,dx = u(b)v(b) - u(a)v(a) - \int_a^b u(x)v'(x)\,dx$$

This is signiﬁcantly less useful than the product rule since it merely transforms the integral of one

product into the integral of another.

Examples 4.29. With practice, there is no need to explicitly state u and v.

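1. Let $u(x) = x$ and $v'(x) = \cos x$. Then $u'(x) = 1$ and $v(x) = \sin x$. These certainly satisfy the hypotheses. We conclude

$$\int_0^{\pi/2} x\cos x\,dx = \Big[x\sin x\Big]_0^{\pi/2} - \int_0^{\pi/2} \sin x\,dx = \frac{\pi}{2}\sin\frac{\pi}{2} - 0 - \Big[-\cos x\Big]_0^{\pi/2} = \frac{\pi}{2} + \cos\frac{\pi}{2} - \cos 0 = \frac{\pi}{2} - 1$$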

2. Let $u(x) = \ln x$ and $v'(x) = 1$. Then $u'(x) = \frac{1}{x}$ and $v(x) = x$, whence

$$\int_e^{e^2} \ln x\,dx = \Big[x\ln x\Big]_e^{e^2} - \int_e^{e^2} dx = e^2\ln e^2 - e\ln e - \Big[x\Big]_e^{e^2} = 2e^2 - e - e^2 + e = e^2$$
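Both computations can be cross-checked numerically. The sketch below is an illustration only: `midpoint_sum` and `parts_rhs` are hypothetical helper names, and the midpoint rule stands in for the exact integral.

```python
import math

def midpoint_sum(f, a, b, n=50_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def parts_rhs(u, v, du, a, b):
    """u(b)v(b) - u(a)v(a) minus the integral of u'(x) v(x) over [a, b]."""
    boundary = u(b) * v(b) - u(a) * v(a)
    return boundary - midpoint_sum(lambda x: du(x) * v(x), a, b)

# Example 1: u = x, v' = cos x (so v = sin x) on [0, pi/2]
lhs1 = midpoint_sum(lambda x: x * math.cos(x), 0.0, math.pi / 2)
rhs1 = parts_rhs(lambda x: x, math.sin, lambda x: 1.0, 0.0, math.pi / 2)

# Example 2: u = ln x, v' = 1 (so v = x) on [e, e^2]
lhs2 = midpoint_sum(math.log, math.e, math.e**2)
rhs2 = parts_rhs(math.log, lambda x: x, lambda x: 1.0 / x, math.e, math.e**2)

print(lhs1, rhs1, math.pi / 2 - 1)   # all three should be close
print(lhs2, rhs2, math.e**2)         # all three should be close
```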

Change of Variables/Substitution We now turn our attention to the chain rule. If $g(x) = F\big(u(x)\big)$ where $F$ and $u$ are differentiable, then $g$ is differentiable with

$$g'(x) = \frac{dF}{du}\,\frac{du}{dx} = F'\big(u(x)\big)\,u'(x)$$

Now integrate both sides; the only issue is what assumptions are needed to invoke FTC II.

Theorem 4.30 (Substitution Rule). Suppose $u : [a, b] \to \mathbb{R}$ and $f : \operatorname{range}(u) \to \mathbb{R}$ are continuous. Suppose also that $u$ is differentiable on $(a, b)$ with integrable derivative $u'$. Then

$$\int_a^b f\big(u(x)\big)\,u'(x)\,dx = \int_{u(a)}^{u(b)} f(u)\,du$$

This is the famous ‘u-sub’/change-of-variables formula from elementary calculus.

Proof. We leave as an exercise the veriﬁcation that both integrals exist. By the intermediate and

extreme value theorems, range(u) is a closed bounded interval. Assume range(u) has positive length

for otherwise both integrals are trivially zero.

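Choose any $c \in \operatorname{range}(u)$ and define

$$F : \operatorname{range}(u) \to \mathbb{R} \text{ by } F(v) := \int_c^v f(t)\,dt$$

Since $f$ is continuous, FTC I says that $F$ is differentiable with $F'(u) = f(u)$. But now

$$\int_a^b f\big(u(x)\big)\,u'(x)\,dx = \int_a^b \frac{d}{dx}F\big(u(x)\big)\,dx \quad \text{(chain rule)}$$

$$= F\big(u(b)\big) - F\big(u(a)\big) \quad \text{(FTC II)}$$

$$= \int_{u(a)}^{u(b)} f(u)\,du$$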

Examples 4.31. Successfully applying the substitution rule can require signiﬁcant creativity.

1. To evaluate $\int_0^{\sqrt{\pi}} 2x\sin(x^2)\,dx$, we consider the substitution $u(x) = x^2$ defined on $[0, \sqrt{\pi}]$. Certainly $u$ is continuous; moreover its derivative $u'(x) = 2x$ is integrable on $(0, \sqrt{\pi})$. Finally $f(u) = \sin u$ is continuous on $\operatorname{range}(u) = [0, \pi]$. The hypotheses are satisfied, whence

$$\int_0^{\sqrt{\pi}} 2x\sin(x^2)\,dx = \int_0^{\sqrt{\pi}} f\big(u(x)\big)\,u'(x)\,dx = \int_{u(0)}^{u(\sqrt{\pi})} f(u)\,du = \int_0^{\pi} \sin u\,du = \Big[-\cos u\Big]_0^{\pi} = 2$$

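2. For the following integral, a simple factorization suggests the substitution $u(x) = x^2 - 2$. Plainly $u : [\sqrt{2}, \sqrt{3}] \to [0, 1]$ and $u'(x) = 2x$ is integrable. Moreover, $f(u) = \frac{1}{u^2 + 1}$ is continuous on $\operatorname{range}(u) = [0, 1]$. We conclude

$$\int_{\sqrt{2}}^{\sqrt{3}} \frac{2x}{x^4 - 4x^2 + 5}\,dx = \int_{\sqrt{2}}^{\sqrt{3}} \frac{2x}{(x^2 - 2)^2 + 1}\,dx = \int_0^1 \frac{1}{u^2 + 1}\,du = \Big[\arctan u\Big]_0^1 = \frac{\pi}{4}$$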

3. The hypotheses on $u$ really are all that's necessary. In particular, $u$ need not be left-/right-differentiable at the endpoints of $[a, b]$. For instance, with $f(u) = u^2$ and $u(x) = \sqrt{x}$ on $[0, 4]$, we easily verify

$$\int_0^4 \frac{1}{2}\sqrt{x}\,dx = \int_0^4 \frac{x}{2\sqrt{x}}\,dx = \int_0^4 f\big(u(x)\big)\,u'(x)\,dx = \int_0^2 f(u)\,du = \int_0^2 u^2\,du = \frac{8}{3}$$

4. Sloppy 'substitutions' might lead to utter nonsense. For instance, $u(x) = x^2$ suggests

$$\int_{-1}^{2} \frac{1}{x}\,dx = \int_{-1}^{2} \frac{1}{2x^2}\,2x\,dx = \int_1^4 \frac{1}{2u}\,du = \frac{1}{2}(\ln 4 - \ln 1) = \ln 2$$

This is total gibberish: the first integral does not exist since $\frac{1}{x}$ is undefined at $0 \in (-1, 2)$. Thankfully, the hypotheses of the substitution rule prevent this: $f(u) = \frac{1}{2u}$ is not continuous on $\operatorname{range}(u) = [0, 4]$.

While you are very unlikely to make precisely this mistake, the risk is real in more complicated

or abstract situations. . .
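When the hypotheses do hold, the two sides of the substitution rule can be compared numerically. This is a sketch under stated assumptions: `midpoint_sum` and `check_substitution` are hypothetical helpers, and only Examples 1 and 2 are tested since their hypotheses are satisfied.

```python
import math

def midpoint_sum(f, a, b, n=50_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def check_substitution(f, u, du, a, b):
    """Return approximations to both sides of the substitution rule."""
    lhs = midpoint_sum(lambda x: f(u(x)) * du(x), a, b)   # integral in x
    rhs = midpoint_sum(f, u(a), u(b))                     # integral in u
    return lhs, rhs

# Example 1: u(x) = x^2 on [0, sqrt(pi)], f(u) = sin u; both sides near 2
ex1 = check_substitution(math.sin, lambda x: x**2, lambda x: 2 * x,
                         0.0, math.sqrt(math.pi))

# Example 2: u(x) = x^2 - 2 on [sqrt 2, sqrt 3], f(u) = 1/(u^2 + 1); near pi/4
ex2 = check_substitution(lambda u: 1 / (u**2 + 1), lambda x: x**2 - 2,
                         lambda x: 2 * x, math.sqrt(2), math.sqrt(3))

print(ex1, ex2, math.pi / 4)
```

A numerical check of this kind says nothing about Example 4: there the failure lies in the hypotheses, which no finite sum can verify.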

Hence the old adage, “Differentiation is a science, whereas integration is an art.” To illustrate by example, consider

f (x) = tan(e

cos(3x

) + 4x

). The derivative is easily found using the product and chain rules:

d f

1 + (e

cos(3x

) + 4x

)



cos(3x

) −6xe

sin(3x

) + 12x



By contrast, if you want to find an explicit anti-derivative of f (x), the integration analogues (parts/substitution) are essentially useless. Similarly, the integral

tan(e

cos(3x

) + 4x

) dx

is likely impossible to evaluate explicitly and can only be approximated, say by using Riemann sums.

Exercises 4.34. Key concepts: Complete statements of FTC parts I & II, Integration by Parts/Substitution

1. Calculate the following limits:

(a) lim

x→0

dt (b) lim

h→0

3+h

2. Let $f(t) = \begin{cases} 0 & \text{if } t < 0 \\ t & \text{if } 0 \le t \le 1 \\ 4 & \text{if } t > 1 \end{cases}$

(a) Determine the function $F(x) = \int_0^x f(t)\,dt$ and sketch it. Where is $F$ continuous?

(b) Where is $F$ differentiable? Calculate $F'$ at the points of differentiability.

3. Let f be continuous on R.

(a) Define $F(x) = \int_{x-1}^{x+1} f(t)\,dt$. Carefully show that $F$ is differentiable on $\mathbb{R}$ and compute $F'$.

(b) Repeat for $G(x) = \int_0^{\sin x} f(t)\,dt$.

4. Recall Examples 4.24.4 and 4.27.3. Describe all anti-derivatives $F$ of $f$ on $[0, 1) \cup (1, 2]$. Which satisfy $\int_0^2 f(x)\,dx = F(2) - F(0)$?

5. Suppose $u, v$ satisfy the hypotheses of integration by parts. By FTC I, $\int_a^x u'(t)v(t)\,dt$ is an anti-derivative of $u'(x)v(x)$: what does integration by parts say is another?

6. Use a substitution to integrate

√

1 − x

7. Use integration by parts and the substitution rule to evaluate $\int_0^b \arcsin x\,dx$ for any $b < 1$.

8. Use integration by parts to evaluate $\int_0^b x\arctan x\,dx$ for any $b > 0$.

9. If $f$ and $u$ satisfy the hypotheses of the substitution rule, explain why both $(f \circ u)u'$ and $f$ are integrable on the required intervals.

10. We prove a simpler version of the fundamental theorem when $f : [a, b] \to \mathbb{R}$ is continuous.

Part I Define $F(x) = \int_a^x f(t)\,dt$. If $c, x \in [a, b]$ where $c \neq x$, prove that

$$m \le \frac{F(x) - F(c)}{x - c} \le M$$

where $m, M$ are the minimum and maximum values of $f(t)$ on the closed interval with endpoints $c, x$; why do $m, M$ exist? Now deduce that $F'(c) = f(c)$.

Part II Now suppose $F$ is any anti-derivative of $f$ on $[a, b]$. Use Part I and the mean value theorem to prove that $\int_a^b f(t)\,dt = F(b) - F(a)$.

4.36 Improper Integrals

The Riemann integral has several limitations. Even allowing for functions to be integrable on open intervals (Definition 4.12), the existence of $\int_a^b f(x)\,dx$ requires both:

• That (a, b) be a bounded interval.

• That f be bounded on (a, b).

Limits provide a natural way to extend the Riemann integral to unbounded intervals and functions.

Deﬁnition 4.32. Suppose f : [a, b) → R satisﬁes the following properties:

• f is integrable on every closed bounded subinterval [a, t] ⊆ [a, b).

• If b is ﬁnite, then f is unbounded at b (b can be ∞!)

The improper integral of $f$ on $[a, b)$ is

$$\int_a^b f(x)\,dx := \lim_{t \to b^-} \int_a^t f(x)\,dx$$

This is convergent or divergent as is the limit.

If an integral is improper at its lower limit ($f : (a, b] \to \mathbb{R}$, etc.), then $\int_a^b f(x)\,dx := \lim_{s \to a^+} \int_s^b f(x)\,dx$.

If an integral is improper at both ends, choose any $c \in (a, b)$ and define

$$\int_a^b f(x)\,dx = \lim_{s \to a^+} \int_s^c f(x)\,dx + \lim_{t \to b^-} \int_c^t f(x)\,dx$$

provided both one-sided improper integrals exist and the limit sum makes sense.

Theorem 4.14 says that the choice of c for a doubly-improper integral is irrelevant.

Many properties of the Riemann integral transfer naturally to improper integrals, though not everything. . . For example, part 1 of Theorem 4.13 extends:

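Theorem 4.33. If $0 \le f(x) \le g(x)$ on $[a, b)$, then $\int_a^b f \le \int_a^b g$ whenever the integrals exist (standard or improper). In particular:

• $\int_a^b f = \infty \implies \int_a^b g = \infty$

• $\int_a^b g$ convergent $\implies \int_a^b f$ converges to some value $\le \int_a^b g$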

We leave some of the detail to Exercise 7.

Examples 4.34. 1.

dx =

for any t > 0. Clearly

∞

dx = lim

t→∞

= ∞

More formally, the improper integral

∞

dx diverges to inﬁnity.

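2. With $f(x) = x^{-4/3}$ defined on $[1, \infty)$,

$$\int_1^\infty x^{-4/3}\,dx = \lim_{t \to \infty} \int_1^t x^{-4/3}\,dx = \lim_{t \to \infty} \Big[-3x^{-1/3}\Big]_1^t = \lim_{t \to \infty} 3 - 3t^{-1/3} = 3$$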

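3. Consider $f(x) = |x|\,e^{-x^2/2}$ on $(-\infty, \infty)$. On any bounded interval $[0, t]$,

$$\int_0^t f(x)\,dx = \int_0^t x e^{-x^2/2}\,dx = \Big[-e^{-x^2/2}\Big]_0^t = 1 - e^{-t^2/2} \xrightarrow[t \to \infty]{} 1$$

By symmetry, $\int_{-\infty}^{\infty} |x|\,e^{-x^2/2}\,dx = 1 + 1 = 2$. This example arises naturally in probability: multiplying by $\frac{1}{\sqrt{2\pi}}$ computes the expectation of $|X|$ when $X$ is a standard normally-distributed random variable

$$E(|X|) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,|x|\,e^{-x^2/2}\,dx = \frac{2}{\sqrt{2\pi}} = \sqrt{\frac{2}{\pi}}$$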

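4. Our knowledge of derivatives $\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$ (or the substitution rule) allows us to evaluate

$$\int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \int_0^t \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \sin^{-1} t = \frac{\pi}{2}$$

By symmetry, $\int_{-1}^{1} \frac{1}{\sqrt{1 - x^2}}\,dx = \pi$. By comparison, we obtain bounds on another improper integral: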

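$$\frac{1}{\sqrt{1 - x^4}} \le \frac{1}{\sqrt{1 - x^2}} \implies \int_{-1}^{1} \frac{1}{\sqrt{1 - x^4}}\,dx \le \int_{-1}^{1} \frac{1}{\sqrt{1 - x^2}}\,dx = \pi$$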

5. Improper integrals need not exist. For instance,

$$\lim_{t \to \infty} \int_0^t \sin x\,dx = \lim_{t \to \infty} 1 - \cos t$$

diverges by oscillation.
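The limiting behaviour in Examples 2 and 5 can be tabulated directly from the closed forms derived above. The following sketch is illustrative only; the function names and the sample values of $t$ are assumptions, and a midpoint Riemann sum is used merely to sanity-check one closed form on a modest interval.

```python
import math

def midpoint_sum(f, a, b, n=200_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def truncated_power(t):
    """Closed form of the integral of x^(-4/3) over [1, t], from Example 2."""
    return 3 - 3 * t ** (-1 / 3)

def truncated_sine(t):
    """Closed form of the integral of sin x over [0, t], from Example 5."""
    return 1 - math.cos(t)

# Sanity check of the closed form against a Riemann sum on [1, 8]
numeric = midpoint_sum(lambda x: x ** (-4 / 3), 1.0, 8.0)
print(numeric, truncated_power(8.0))      # both close to 1.5

# Example 2: the truncated integrals approach 3 as t grows
for t in (1e3, 1e6, 1e9, 1e12):
    print(f"t = {t:.0e}: {truncated_power(t):.6f}")

# Example 5: the truncated integrals keep oscillating between 0 and 2
for k in range(1, 5):
    t = k * math.pi
    print(f"t = {k}*pi: {truncated_sine(t):.6f}")
```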

Exercises 4.36. Key concepts: Formal deﬁnition and careful calculation of Improper Integrals

1. Use your answers from Section 4.34 to decide whether the improper integrals $\int_0^1 \arcsin x\,dx$ and $\int_0^\infty x\arctan x\,dx$ exist. If so, what are their values?

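2. Let $p$ be a positive constant. Prove:

$$\int_0^1 \frac{1}{x^p}\,dx = \begin{cases} \frac{1}{1 - p} & \text{if } p < 1 \\ \infty & \text{if } p \ge 1 \end{cases} \qquad \int_1^\infty \frac{1}{x^p}\,dx = \begin{cases} \frac{1}{p - 1} & \text{if } p > 1 \\ \infty & \text{if } p \le 1 \end{cases}$$

(The second of these justifies the convergence/divergence properties of p-series via the integral test)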

3. Suppose $f$ is integrable on $[a, b]$. Explain why $\int_a^b f(x)\,dx = \lim_{t \to b^-} \int_a^t f(x)\,dx$ is still true, even though the integral is not improper.

4. State a version of integration by parts modified for when $\int_a^b u'(x)v(x)\,dx$ is improper at $b$. Now

evaluate

∞

−4x

dx.

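5. What is wrong with the following calculation?

$$\int_{-\infty}^{\infty} x\,dx = \lim_{t \to \infty} \int_{-t}^{t} x\,dx = \lim_{t \to \infty} \frac{1}{2}\big(t^2 - (-t)^2\big) = \lim_{t \to \infty} 0 = 0$$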

6. Prove or disprove: if $\int_a^b f$ and $\int_a^b g$ are convergent improper integrals, so is $\int_a^b fg$.

7. Prove part of Theorem 4.33. Suppose $0 \le f(x) \le g(x)$ for all $x \in [a, b)$, and that $\int_a^b g$ is a convergent improper integral. Prove that $\int_a^b f$ converges and that $\int_a^b f \le \int_a^b g$.

Extensions of the Riemann Integral (just for fun)

In the 1890s, Thomas Stieltjes offered a generalization of the Riemann integral.

Definition 4.35. Let $f : [a, b] \to \mathbb{R}$ be bounded and $\alpha : [a, b] \to \mathbb{R}$ monotonically increasing. Given a partition $P = \{x_0, \ldots, x_n\}$ of $[a, b]$, define the sequence of differences

$$\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1})$$

The upper/lower Darboux–Stieltjes sums/integrals are defined analogously to the pure Riemann case:

$$U(f, P, \alpha) = \sum_{i=1}^n \sup_{[x_{i-1}, x_i]} f(x)\,\Delta\alpha_i \qquad L(f, P, \alpha) = \sum_{i=1}^n \inf_{[x_{i-1}, x_i]} f(x)\,\Delta\alpha_i$$

$$U(f, \alpha) = \inf_P U(f, P, \alpha) \qquad L(f, \alpha) = \sup_P L(f, P, \alpha)$$

If $U(f, \alpha) = L(f, \alpha)$, we say that $f$ is Riemann–Stieltjes integrable of class $\mathcal{R}(\alpha)$ and denote its value $\int_a^b f(x)\,d\alpha$.

The standard Riemann integral corresponds to α(x) = x. It is the ability to choose other functions α

that makes the Riemann–Stieltjes integral both powerful and applicable.

Standard Properties Most results in sections 4.32 and 4.33 hold with suitable modiﬁcations, as does

the discussion of improper integrals. For instance,

$$f \in \mathcal{R}(\alpha) \iff \forall \epsilon > 0,\ \exists P \text{ such that } U(f, P, \alpha) - L(f, P, \alpha) < \epsilon$$

The result regarding the piecewise continuity of f is a notable exception: depending on α, a

piecewise continuous f might not lie in R(α).

Weighted integrals If $\alpha$ is differentiable, we obtain a standard Riemann integral

$$\int_a^b f(x)\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx$$

weighted so that $f(x)$ contributes more when $\alpha$ is increasing rapidly.

Probability If $\alpha(a) = 0$ and $\alpha(b) = 1$, then $\alpha$ may be viewed as a probability distribution function and its derivative $\alpha'$ as the corresponding probability density function. For example:

1. The uniform distribution on $[a, b]$ has $\alpha(x) = \frac{1}{b - a}(x - a)$ so that

$$\int_a^b f(x)\,d\alpha = \frac{1}{b - a} \int_a^b f(x)\,dx$$

Since $\alpha'$ is constant, the integrals weigh all values of $x$ uniformly.

2. The standard normal distribution has $\alpha(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt$. The fact that $\alpha'(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ is maximal when $x = 0$ reflects the fact that a normally distributed variable is clustered near its mean.

In all cases, $\int_a^b f(x)\,d\alpha = E(f(X))$ computes an expectation (see, e.g., Example 4.34.3).
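A Riemann–Stieltjes sum is straightforward to evaluate numerically when $\alpha$ is differentiable. The sketch below is an illustration under stated assumptions: `stieltjes_sum` is a hypothetical helper, the interval $[2, 5]$ for the uniform case is an arbitrary choice, and $\alpha(x) = x^2$ on $[0, 1]$ is an invented distribution function (density $2x$) that does not appear in the notes.

```python
def stieltjes_sum(f, alpha, a, b, n=100_000):
    """Riemann-Stieltjes sum of f against alpha, sampling f at midpoints."""
    total = 0.0
    for i in range(n):
        left = a + (b - a) * i / n
        right = a + (b - a) * (i + 1) / n
        # f at the midpoint, weighted by the increment of alpha
        total += f(0.5 * (left + right)) * (alpha(right) - alpha(left))
    return total

f = lambda x: x

# Uniform distribution on [2, 5]: alpha(x) = (x - 2)/3, so E(X) = 3.5
uniform = stieltjes_sum(f, lambda x: (x - 2) / 3, 2.0, 5.0)

# Hypothetical alpha(x) = x^2 on [0, 1]: the integral of x * 2x is 2/3
weighted = stieltjes_sum(f, lambda x: x * x, 0.0, 1.0)

print(uniform, weighted)
```

In the second case values of $x$ near 1 carry more weight, since $\alpha$ increases fastest there.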

Stieltjes was Dutch; the pronunciation is roughly ‘steelchez.’

Non-differentiable or continuous α This provides major flexibility! For example, if $Q = \{s_0, \ldots, s_n\}$ partitions $[a, b]$, and $(c_k)_{k=1}^n$ is a positive sequence, then

$$\alpha(x) = \begin{cases} 0 & \text{if } x = a \\ \sum_{i=1}^{k} c_i & \text{if } x \in (s_{k-1}, s_k] \end{cases}$$

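defines an increasing step function, and, for continuous $f$, the Riemann–Stieltjes integral a weighted sum (here $\alpha$ jumps by $c_i$ immediately to the right of $s_{i-1}$)

$$\int_a^b f(x)\,d\alpha = \sum_{i=1}^n c_i\,f(s_{i-1})$$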

Taking an infinite increasing sequence $(s_k) \subseteq [a, b]$ results in an infinite series, which helps explain why so many results for series and integrals look similar!

This also touches on probability. For example, let $p \in [0, 1]$, $n \in \mathbb{N}$, and $s_k = k$ on the interval $[0, n]$. If $c_k = \binom{n}{k} p^k (1 - p)^{n-k}$, then

$$\int_0^n f(x)\,d\alpha = \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} f(k) = E(f(X))$$

is the expectation of $f(X)$ when $X \sim B(n, p)$ is binomially distributed.
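The binomial weighted sum is easy to evaluate directly. This is a minimal sketch; the function name `binomial_expectation` and the sample parameters $n = 10$, $p = 0.3$ are assumptions made for illustration.

```python
from math import comb

def binomial_expectation(f, n, p):
    """Sum of f(k) weighted by C(n, k) p^k (1 - p)^(n - k) for k = 0..n."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) * f(k)
               for k in range(n + 1))

n, p = 10, 0.3
total = binomial_expectation(lambda k: 1, n, p)        # total weight, close to 1
mean = binomial_expectation(lambda k: k, n, p)         # close to n*p = 3
second = binomial_expectation(lambda k: k * k, n, p)   # close to n*p*(1-p) + (n*p)^2
print(total, mean, second)
```

Taking $f(k) = 1$ confirms that the weights sum to one, which is what lets the step function $\alpha$ be read as a distribution function.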

Lebesgue Integration: Integrals and Convergence

Lebesgue’s extension essentially uses rectangles whose heights tend to zero: cutting up the area under

a curve using horizontal instead of vertical strips. One of its major purposes is to permit a more general

interchange of limits and integration in many cases of pointwise (non-uniform) convergence. To see

the problem, consider the sequence of piecewise continuous functions

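$$f_n : [0, 1] \to \mathbb{R} : x \mapsto \begin{cases} 1 & \text{if } x = \frac{p}{q} \in \mathbb{Q} \text{ with } q \le n \\ 0 & \text{otherwise} \end{cases}$$

Each $f_n$ is Riemann integrable with $\int_0^1 f_n(x)\,dx = 0$. However, the pointwise limit

$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \notin \mathbb{Q} \end{cases}$$

is not Riemann integrable (compare Example 4.8.2). In the Lebesgue theory, the limit $f$ turns out to be integrable with integral 0, so that

$$\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 \lim_{n \to \infty} f_n(x)\,dx$$

Recall (Theorem 2.19) that the interchange of limits and integrals would be automatic if the convergence $f_n \to f$ were uniform: of course the convergence isn't uniform here.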
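The structure of this sequence can be explored with exact rational arithmetic. The following sketch is illustrative only: it models rational inputs alone (as `Fraction` objects, automatically in lowest terms), and the names `f_n` and `support` are assumptions. It shows that $f_n$ is non-zero at only finitely many points, and that for every $n$ some rational is still missed, so $\sup |f_n - f| = 1$ and the convergence is not uniform.

```python
from fractions import Fraction

def f_n(x, n):
    """f_n(x) = 1 if the rational x has lowest-terms denominator <= n, else 0."""
    return 1 if x.denominator <= n else 0

def support(n):
    """The finitely many points of [0, 1] at which f_n equals 1."""
    return sorted({Fraction(p, q) for q in range(1, n + 1) for p in range(q + 1)})

print(len(support(5)), support(5))

for n in (5, 50, 500):
    missed = Fraction(1, n + 1)    # a rational where f = 1 but f_n is still 0
    print(n, f_n(missed, n), f_n(missed, n + 1))
```

Since each $f_n$ differs from the zero function at finitely many points, its Riemann integral is 0, consistent with the claim above.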

Like measure theory (recall Theorem 4.22), Lebesgue integration is a central topic in graduate analysis.