Total variation, absolute continuity, and the Borel σ-algebra of C(I)

Jordan Bell
March 10, 2015

1 Total variation

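Let $a<b$. A partition of $[a,b]$ is a sequence $t_0,t_1,\ldots,t_n$ such that

\[
a=t_0<t_1<\cdots<t_n=b.
\]

The total variation of a function $f:[a,b]\to\mathbb{R}$ is

\[
\mathrm{Var}_f[a,b]=\sup\left\{\sum_{i=1}^n |f(t_i)-f(t_{i-1})| : t_0,t_1,\ldots,t_n \text{ is a partition of } [a,b]\right\}.
\]

If $\mathrm{Var}_f[a,b]<\infty$ then we say that $f$ has bounded variation.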
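The definition can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `variation_sum`, `uniform_partition`, and `approx_total_variation` are hypothetical, and a single uniform partition only gives a lower bound for the supremum in general. For the sine function on $[0,2\pi]$ a fine uniform partition recovers the total variation, which is 4.

```python
import math

def variation_sum(f, partition):
    """Sum of |f(t_i) - f(t_{i-1})| over consecutive points of a partition."""
    return sum(abs(f(t1) - f(t0)) for t0, t1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """The partition a = t_0 < t_1 < ... < t_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def approx_total_variation(f, a, b, n=1000):
    """Lower bound for Var_f[a, b] obtained from one uniform partition."""
    return variation_sum(f, uniform_partition(a, b, n))

# A coarse partition can miss almost all of the variation,
# while a finer one comes close to the supremum.
coarse = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 2))
fine = approx_total_variation(math.sin, 0.0, 2 * math.pi)
print(coarse, fine)
```

The coarse partition $0,\pi,2\pi$ sees essentially no variation because the sine vanishes at all three points, which is why the definition takes a supremum over all partitions.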

Lemma 1.

If $a\le c<e<d\le b$, then

\[
\mathrm{Var}_f[c,d]=\mathrm{Var}_f[c,e]+\mathrm{Var}_f[e,d].
\]

The following theorem establishes properties of functions of bounded variation (Charalambos D. Aliprantis and Owen Burkinshaw, Principles of Real Analysis, third ed., p. 377, Theorem 39.10).

Theorem 2.

Suppose that $f:[a,b]\to\mathbb{R}$ is of bounded variation and define

\[
F(x)=\mathrm{Var}_f[a,x],\qquad x\in[a,b].
\]

Then:

  1. $|f(y)-f(x)|\le F(y)-F(x)$ for all $a\le x<y\le b$.

  2. $F$ is a nondecreasing function.

  3. $F-f$ and $F+f$ are nondecreasing functions.

  4. For $x_0\in[a,b]$, $f$ is continuous at $x_0$ if and only if $F$ is continuous at $x_0$.

Proof.

Let $a\le x<y\le b$. If $t_0,\ldots,t_n$ is a partition of $[a,x]$ then $t_0,\ldots,t_n,y$ is a partition of $[a,y]$, so

\[
\sum_{i=1}^n |f(t_i)-f(t_{i-1})|+|f(y)-f(x)|\le F(y).
\]

Since this is true for any partition $t_0,\ldots,t_n$ of $[a,x]$,

\[
F(x)+|f(y)-f(x)|\le F(y).
\]

This shows in particular that $F(x)\le F(y)$, and thus that $F$ is nondecreasing.

For $a\le x<y\le b$,

\[
f(y)-f(x)\le |f(y)-f(x)|\le F(y)-F(x),
\]

thus

\[
F(x)-f(x)\le F(y)-f(y),
\]

showing that $x\mapsto F(x)-f(x)$ is nondecreasing. Likewise,

\[
f(x)-f(y)\le |f(y)-f(x)|\le F(y)-F(x),
\]

thus

\[
f(x)+F(x)\le f(y)+F(y),
\]

showing that $x\mapsto F(x)+f(x)$ is nondecreasing.

Suppose that $F$ is continuous at $x_0$ and let $\epsilon>0$. There is some $\delta>0$ such that $|x-x_0|<\delta$ implies that $|F(x)-F(x_0)|<\epsilon$. If $|x-x_0|<\delta$, then

\[
|f(x)-f(x_0)|\le |F(x)-F(x_0)|<\epsilon,
\]

showing that $f$ is continuous at $x_0$.

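Suppose that $f$ is continuous at $x_0$ and let $\epsilon>0$. We first treat left-continuity, for which we take $x_0>a$. There is some $\delta>0$ such that $|x-x_0|<\delta$ implies that $|f(x)-f(x_0)|<\epsilon$, and such that $x_0-\delta>a$. Let $x_0-\delta<s<x_0$, and let $t_0,\ldots,t_n$ be a partition of $[s,b]$ such that

\[
\mathrm{Var}_f[s,b]<\sum_{i=1}^n |f(t_i)-f(t_{i-1})|+\epsilon
\]

and such that none of $t_0,\ldots,t_n$ is equal to $x_0$. Say that $t_k<x_0<t_{k+1}$. Then

\[
t_0,\ldots,t_k,x_0,t_{k+1},\ldots,t_n
\]

is a partition of $[s,b]$. For $t_k<x<x_0$ we have $|x-x_0|<\delta$ and therefore

\[
\begin{aligned}
\mathrm{Var}_f[s,x]+\mathrm{Var}_f[x,b] &= \mathrm{Var}_f[s,b]\\
&< \sum_{i=1}^n |f(t_i)-f(t_{i-1})|+\epsilon\\
&\le \sum_{i=1}^k |f(t_i)-f(t_{i-1})|+|f(x)-f(t_k)|+|f(x_0)-f(x)|\\
&\qquad +|f(t_{k+1})-f(x_0)|+\sum_{i=k+2}^n |f(t_i)-f(t_{i-1})|+\epsilon\\
&\le \mathrm{Var}_f[s,x]+|f(x)-f(x_0)|+\mathrm{Var}_f[x_0,b]+\epsilon\\
&< \mathrm{Var}_f[s,x]+\mathrm{Var}_f[x_0,b]+2\epsilon,
\end{aligned}
\]

giving

\[
\mathrm{Var}_f[x,b]-\mathrm{Var}_f[x_0,b]<2\epsilon.
\]

As $\mathrm{Var}_f[a,b]=\mathrm{Var}_f[a,x]+\mathrm{Var}_f[x,b]$ and also $\mathrm{Var}_f[a,b]=\mathrm{Var}_f[a,x_0]+\mathrm{Var}_f[x_0,b]$, we have $F(x)+\mathrm{Var}_f[x,b]=F(x_0)+\mathrm{Var}_f[x_0,b]$, and therefore

\[
F(x_0)-F(x)<2\epsilon.
\]

Thus, if $t_k<x<x_0$ then $|F(x_0)-F(x)|<2\epsilon$, showing that $F$ is left-continuous at $x_0$. It is straightforward to show in the same way that $F$ is right-continuous at $x_0$, and thus continuous at $x_0$. ∎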

If $f:[a,b]\to\mathbb{R}$ is of bounded variation, then Theorem 2 tells us that $F$ and $F+f$ are nondecreasing functions. A monotone function is differentiable almost everywhere (Charalambos D. Aliprantis and Owen Burkinshaw, Principles of Real Analysis, third ed., p. 375, Theorem 39.9), and it follows that $f=(F+f)-F$ is differentiable almost everywhere.
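The monotonicity statements of Theorem 2 have an exact discrete analogue, sketched below under stated assumptions. The helpers `cumulative_variation` and `is_nondecreasing` are hypothetical names, and the cumulative sums on a grid only approximate $F(x)=\mathrm{Var}_f[a,x]$ from below. On the grid, each increment of the cumulative sum is $|f(t_i)-f(t_{i-1})|$, which dominates $\pm(f(t_i)-f(t_{i-1}))$, so both $F+f$ and $F-f$ are nondecreasing along the grid even though $f$ itself is not.

```python
import math

def cumulative_variation(values):
    """Running sums of |v_i - v_{i-1}|, a grid stand-in for F(x) = Var_f[a, x]."""
    out = [0.0]
    for prev, cur in zip(values, values[1:]):
        out.append(out[-1] + abs(cur - prev))
    return out

def is_nondecreasing(seq, tol=1e-12):
    """True if the sequence never drops by more than a rounding tolerance."""
    return all(b - a >= -tol for a, b in zip(seq, seq[1:]))

n = 400
grid = [2 * math.pi * i / n for i in range(n + 1)]
f_vals = [math.sin(t) for t in grid]           # f = sin on [0, 2*pi]
F_vals = cumulative_variation(f_vals)          # approximates Var_f[0, x]
plus = [F + f for F, f in zip(F_vals, f_vals)]   # F + f on the grid
minus = [F - f for F, f in zip(F_vals, f_vals)]  # F - f on the grid

print(is_nondecreasing(f_vals), is_nondecreasing(plus), is_nondecreasing(minus))
```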

2 Absolute continuity

Let $a<b$ and let $I=[a,b]$. A function $f:I\to\mathbb{R}$ is said to be absolutely continuous if for any $\epsilon>0$ there is some $\delta>0$ such that for any $n$ and any collection of pairwise disjoint intervals $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ in $I$ satisfying

\[
\sum_{i=1}^n (\beta_i-\alpha_i)<\delta,
\]

we have

\[
\sum_{i=1}^n |f(\beta_i)-f(\alpha_i)|<\epsilon.
\]

It is immediate (take $n=1$) that if $f$ is absolutely continuous then $f$ is uniformly continuous.
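The converse fails. A standard illustration, which is not taken from the text above, is the Cantor function on $[0,1]$: it is continuous and nondecreasing, yet the $2^n$ intervals of the $n$-th stage of the Cantor set construction have total length $(2/3)^n$ while the function increases by a total of 1 across them. The sketch below checks this with exact rational arithmetic. The names `cantor` and `level_intervals` are hypothetical, and `cantor` is only guaranteed exact at points of the form $k/3^n$.

```python
from fractions import Fraction

def cantor(x, max_depth=200):
    """Cantor function on [0, 1]; exact for Fractions of the form k / 3**n."""
    result, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_depth):
        if x == 0:
            return result
        if x == 1:
            return result + 2 * scale
        if x < Fraction(1, 3):
            x = 3 * x                  # c(x) = c(3x) / 2 on [0, 1/3]
        elif x > Fraction(2, 3):
            result += scale            # c(x) = 1/2 + c(3x - 2) / 2 on [2/3, 1]
            x = 3 * x - 2
        else:
            return result + scale      # c is constant 1/2 on [1/3, 2/3]
        scale /= 2
    return result  # truncated value for inputs that never terminate

def level_intervals(n):
    """The 2**n intervals left after n steps of removing open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for left, right in intervals:
            third = (right - left) / 3
            nxt.append((left, left + third))
            nxt.append((right - third, right))
        intervals = nxt
    return intervals

n = 10
intervals = level_intervals(n)
total_length = sum(r - l for l, r in intervals)
total_increase = sum(cantor(r) - cantor(l) for l, r in intervals)
print(float(total_length), total_increase)
```

Since the total length can be made smaller than any $\delta$ by increasing $n$ while the sum of increments stays equal to 1, no $\delta$ works for $\epsilon=1$ in the definition.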

Lemma 3.

If $f:[a,b]\to\mathbb{R}$ is absolutely continuous then $f$ has bounded variation.

Proof.

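Because $f$ is absolutely continuous, there is some $\delta>0$ such that if $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ are pairwise disjoint and

\[
\sum_{i=1}^n (\beta_i-\alpha_i)<\delta,
\]

then

\[
\sum_{i=1}^n |f(\beta_i)-f(\alpha_i)|<1.
\]

Let $N$ be an integer that is $>\frac{b-a}{\delta}$ and let $a=x_0<\cdots<x_N=b$ be such that $x_i-x_{i-1}=\frac{b-a}{N}<\delta$ for each $i=1,\ldots,N$. Any partition of $[x_{i-1},x_i]$ determines pairwise disjoint intervals of total length less than $\delta$, so $\mathrm{Var}_f[x_{i-1},x_i]\le 1$. Then, by Lemma 1,

\[
\mathrm{Var}_f[a,b]=\sum_{i=1}^N \mathrm{Var}_f[x_{i-1},x_i]\le N,
\]

showing that $f$ has bounded variation. ∎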

Let $\lambda$ be Lebesgue measure on $\mathbb{R}$ and let $\mathfrak{M}$ be the collection of Lebesgue measurable subsets of $\mathbb{R}$.

The following theorem establishes connections between absolute continuity of a function and Lebesgue measure (Walter Rudin, Real and Complex Analysis, third ed., p. 146, Theorem 7.18). In the following theorem, we extend $f:[a,b]\to\mathbb{R}$ to $\mathbb{R}$ by defining $f(x)=f(b)$ for $x>b$ and $f(x)=f(a)$ for $x<a$. In particular, for any $x>b$, $f'(x)$ exists and is equal to $0$, and for any $x<a$, $f'(x)$ exists and is equal to $0$.

Theorem 4.

Suppose that $I=[a,b]$ and that $f:I\to\mathbb{R}$ is continuous and nondecreasing. Then the following statements are equivalent.

  1. $f$ is absolutely continuous.

  2. If $E\subset I$ and $\lambda(E)=0$ then $\lambda(f(E))=0$. (In words: $f$ has the Luzin property.)

  3. $f$ is differentiable $\lambda$-almost everywhere on $I$, $f'\in L^1(\lambda)$, and

    \[
    f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad a\le x\le b.
    \]

Proof.

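Assume that $f$ is absolutely continuous and let $E\subset I$ with $\lambda(E)=0$. Let $E_0=E\setminus\{a,b\}$; to prove that $\lambda(f(E))=0$ it suffices to prove that $\lambda(f(E_0))=0$. Let $\epsilon>0$. As $f$ is absolutely continuous, there is some $\delta>0$ such that for any $n$ and any collection of pairwise disjoint intervals $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ satisfying

\[
\sum_{i=1}^n (\beta_i-\alpha_i)<\delta,
\]

we have

\[
\sum_{i=1}^n |f(\beta_i)-f(\alpha_i)|<\epsilon.
\]

There is an open set $V$ such that $E_0\subset V\subset I$ and such that $\lambda(V)<\delta$. (Lebesgue measure is outer regular.) There are countably many pairwise disjoint intervals $(\alpha_i,\beta_i)$ such that $V=\bigcup_i (\alpha_i,\beta_i)$. Then

\[
\sum_i (\beta_i-\alpha_i)=\lambda(V)<\delta,
\]

so for any $n$,

\[
\sum_{i=1}^n (\beta_i-\alpha_i)<\delta,
\]

and because $f$ is absolutely continuous it follows that

\[
\sum_{i=1}^n |f(\beta_i)-f(\alpha_i)|<\epsilon.
\]

This is true for all $n$, so

\[
\sum_i |f(\beta_i)-f(\alpha_i)|\le\epsilon.
\]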

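Because $f$ is continuous and nondecreasing, $f((\alpha_i,\beta_i))$ is an interval contained in $[f(\alpha_i),f(\beta_i)]$ for each $i$. Therefore

\[
f(V)=f\Big(\bigcup_i (\alpha_i,\beta_i)\Big)=\bigcup_i f((\alpha_i,\beta_i))\subset\bigcup_i [f(\alpha_i),f(\beta_i)],
\]

which gives

\[
\lambda(f(V))\le\sum_i (f(\beta_i)-f(\alpha_i))=\sum_i |f(\beta_i)-f(\alpha_i)|\le\epsilon.
\]

Because $f(E_0)\subset f(V)$, the set $f(E_0)$ is contained in a measurable set of measure at most $\epsilon$. This is true for all $\epsilon>0$ (the set $V$ depends on $\epsilon$), so $f(E_0)$ has outer measure $0$. It follows that $f(E_0)\in\mathfrak{M}$ (Lebesgue measure is complete) and that $\lambda(f(E_0))=0$.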

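Assume that for all $E\subset I$ with $\lambda(E)=0$, $\lambda(f(E))=0$. Define $g:I\to\mathbb{R}$ by

\[
g(x)=x+f(x),\qquad x\in I.
\]

Because $f$ is continuous and nondecreasing, $g$ is continuous and strictly increasing. Thus if $(\alpha,\beta)\subset I$ then $g((\alpha,\beta))=(g(\alpha),g(\beta))$ and so

\[
\lambda(g((\alpha,\beta)))=g(\beta)-g(\alpha)=\beta+f(\beta)-(\alpha+f(\alpha))=\beta-\alpha+f(\beta)-f(\alpha),
\]

showing that if $J\subset I$ is an interval then $\lambda(g(J))=\lambda(J)+\lambda(f(J))$. Suppose that $E\subset I$ and $\lambda(E)=0$, and let $\epsilon>0$. There are countably many pairwise disjoint intervals $(\alpha_i,\beta_i)$ such that $E\subset\bigcup_i (\alpha_i,\beta_i)$ and $\sum_i (\beta_i-\alpha_i)<\epsilon$, and because $\lambda(f(E))=0$, there are countably many pairwise disjoint intervals $(\gamma_i,\delta_i)$ such that $f(E)\subset\bigcup_i (\gamma_i,\delta_i)$ and $\sum_i (\delta_i-\gamma_i)<\epsilon$. Let

\[
N=f^{-1}\Big(\bigcup_i (\gamma_i,\delta_i)\Big)\cap\bigcup_j (\alpha_j,\beta_j)=\bigcup_{i,j}\Big(f^{-1}((\gamma_i,\delta_i))\cap(\alpha_j,\beta_j)\Big)\in\mathfrak{M}.
\]

We check that

\[
\lambda(g(N))=\lambda(N)+\lambda(f(N)),
\]

and because

\[
\lambda(N)+\lambda(f(N))\le\sum_i (\beta_i-\alpha_i)+\sum_i (\delta_i-\gamma_i)<2\epsilon
\]

we have

\[
\lambda(g(N))<2\epsilon.
\]

Finally, $E\subset N$ so $g(E)\subset g(N)$. Therefore, for every $\epsilon>0$ there is some $N\in\mathfrak{M}$ with $g(E)\subset g(N)$ and $\lambda(g(N))<\epsilon$, from which it follows that $\lambda(g(E))=0$.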

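Suppose that $E\subset I$ belongs to $\mathfrak{M}$. Because $E\in\mathfrak{M}$, there are $E_0,E_1\in\mathfrak{M}$ such that $E=E_0\cup E_1$, $\lambda(E_0)=0$, and $E_1$ is a countable union of closed sets (namely, an $F_\sigma$-set). On the one hand, as $E_1\subset I$, $E_1$ is a countable union of compact sets, and because $g$ is continuous, $g(E_1)$ is a countable union of compact sets, and in particular belongs to $\mathfrak{M}$. On the other hand, because $\lambda(E_0)=0$, we have $\lambda(g(E_0))=0$ and so $g(E_0)\in\mathfrak{M}$. Therefore $g(E)=g(E_0)\cup g(E_1)\in\mathfrak{M}$. Define $\mu:\mathfrak{M}\to[0,\infty)$ by

\[
\mu(E)=\lambda(g(E\cap I)),\qquad E\in\mathfrak{M}.
\]

If $E_i$ are countably many pairwise disjoint elements of $\mathfrak{M}$, then, because $g$ is injective, $g(E_i\cap I)$ are pairwise disjoint elements of $\mathfrak{M}$, hence

\[
\begin{aligned}
\mu\Big(\bigcup_i E_i\Big) &= \lambda\Big(g\Big(\Big(\bigcup_i E_i\Big)\cap I\Big)\Big)\\
&= \lambda\Big(\bigcup_i g(E_i\cap I)\Big)\\
&= \sum_i \lambda(g(E_i\cap I))\\
&= \sum_i \mu(E_i),
\end{aligned}
\]

showing that $\mu$ is a measure. If $\lambda(E)=0$, then $\lambda(E\cap I)=0$ so $\lambda(g(E\cap I))=0$, i.e. $\mu(E)=0$. This shows that $\mu$ is absolutely continuous with respect to $\lambda$. Therefore by the Radon-Nikodym theorem (Walter Rudin, Real and Complex Analysis, third ed., p. 121, Theorem 6.10) there is a unique $h\in L^1(\lambda)$ such that

\[
\mu(E)=\int_E h\,d\lambda,\qquad E\in\mathfrak{M}.
\]

Moreover, $h(x)\ge 0$ for $\lambda$-almost all $x$.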

Suppose that $x\in I$ and let $E=[a,x]$. Then $g(E)=[g(a),g(x)]$, and

\[
\mu(E)=\int_E h(t)\,d\lambda(t)=\int_a^x h(t)\,d\lambda(t).
\]

On the other hand,

\[
\mu(E)=\lambda(g(E))=\lambda([g(a),g(x)])=g(x)-g(a)=x+f(x)-(a+f(a)).
\]

Hence

\[
f(x)-f(a)=\int_a^x h(t)\,d\lambda(t)-(x-a),
\]

i.e.,

\[
f(x)-f(a)=\int_a^x (h(t)-1)\,d\lambda(t).
\]

By the Lebesgue differentiation theorem (Walter Rudin, Real and Complex Analysis, third ed., p. 141, Theorem 7.11), $f'(x)=h(x)-1$ for $\lambda$-almost all $x$, and it follows that $f'\in L^1(\lambda)$ and

\[
f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad x\in I.
\]

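Assume that $f$ is differentiable $\lambda$-almost everywhere in $I$, $f'\in L^1(\lambda)$, and

\[
f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad x\in I.
\]

Let $\epsilon>0$. Because $f$ is nondecreasing, for $\lambda$-almost all $x\in I$, $f'(x)\ge 0$, and hence the measure $\mu$ defined by $d\mu=f'\,d\lambda$ is absolutely continuous with respect to $\lambda$. It follows (Walter Rudin, Real and Complex Analysis, third ed., p. 124, Theorem 6.11) that there is some $\delta>0$ such that for $E\in\mathfrak{M}$, $\lambda(E)<\delta$ implies that $\mu(E)<\epsilon$. Now let $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ be pairwise disjoint intervals in $I$ satisfying

\[
\sum_{i=1}^n (\beta_i-\alpha_i)<\delta.
\]

This gives us

\[
\mu\Big(\bigcup_{i=1}^n (\alpha_i,\beta_i)\Big)<\epsilon,
\]

and as

\[
\mu((\alpha_i,\beta_i))=\int_{\alpha_i}^{\beta_i} f'(t)\,d\lambda(t)=f(\beta_i)-f(\alpha_i),
\]

we get

\[
\sum_{i=1}^n |f(\beta_i)-f(\alpha_i)|=\sum_{i=1}^n (f(\beta_i)-f(\alpha_i))<\epsilon.
\]

This shows that $f$ is absolutely continuous, completing the proof. ∎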

The following lemma establishes properties of the total variation of absolutely continuous functions (Walter Rudin, Real and Complex Analysis, third ed., p. 147, Theorem 7.19).

Lemma 5.

Suppose that $I=[a,b]$ and that $f:I\to\mathbb{R}$ is absolutely continuous. Then the function $F:I\to\mathbb{R}$ defined by

\[
F(x)=\mathrm{Var}_f[a,x],\qquad x\in I
\]

is absolutely continuous.

Proof.

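Let $\epsilon>0$. Because $f$ is absolutely continuous, there is some $\delta>0$ such that if $(a_1,b_1),\ldots,(a_m,b_m)$ are disjoint intervals with $\sum_{k=1}^m (b_k-a_k)<\delta$, then

\[
\sum_{k=1}^m |f(b_k)-f(a_k)|<\epsilon.
\]

Suppose that $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ are disjoint intervals with $\sum_{i=1}^n (\beta_i-\alpha_i)<\delta$. If $\alpha_i=t_{i,0}<\cdots<t_{i,m_i}=\beta_i$ for $i=1,\ldots,n$, then $(t_{i,j-1},t_{i,j})$, $1\le i\le n$, $1\le j\le m_i$, are disjoint intervals whose total length is $<\delta$, hence

\[
\sum_{i=1}^n\sum_{j=1}^{m_i} |f(t_{i,j})-f(t_{i,j-1})|<\epsilon.
\]

Taking the supremum over all such partitions of each $[\alpha_i,\beta_i]$, it follows that

\[
\sum_{i=1}^n |F(\beta_i)-F(\alpha_i)|=\sum_{i=1}^n \mathrm{Var}_f[\alpha_i,\beta_i]\le\epsilon,
\]

which shows that $F$ is absolutely continuous. ∎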

We now prove the fundamental theorem of calculus for absolutely continuous functions (Walter Rudin, Real and Complex Analysis, third ed., p. 148, Theorem 7.20).

Theorem 6.

Suppose that $I=[a,b]$ and that $f:I\to\mathbb{R}$ is absolutely continuous. Then $f$ is differentiable at almost all $x$ in $I$, $f'\in L^1(\lambda)$, and

\[
f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad x\in I.
\]

Proof.

Define $F:I\to\mathbb{R}$ by

\[
F(x)=\mathrm{Var}_f[a,x],\qquad x\in I.
\]

By Lemma 3, $f$ has bounded variation, and then using Theorem 2, $F-f$ and $F+f$ are nondecreasing. Furthermore, by Lemma 5, $F$ is absolutely continuous, so $F-f$ and $F+f$ are absolutely continuous. Let

\[
f_1=\frac{F+f}{2},\qquad f_2=\frac{F-f}{2},
\]

which are thus nondecreasing and absolutely continuous. Applying Theorem 4, we get that $f_1,f_2$ are differentiable at almost all $x\in I$, $f_1',f_2'\in L^1(\lambda)$, and

\[
f_1(x)-f_1(a)=\int_a^x f_1'(t)\,d\lambda(t),\qquad a\le x\le b
\]

and

\[
f_2(x)-f_2(a)=\int_a^x f_2'(t)\,d\lambda(t),\qquad a\le x\le b.
\]

Because $f=f_1-f_2$, $f$ is differentiable at almost all $x\in I$, $f'=f_1'-f_2'\in L^1(\lambda)$, and

\[
f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad a\le x\le b,
\]

proving the claim. ∎
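As a sanity check of Theorem 6 and of the decomposition $f=f_1-f_2$ used in its proof, here is a small numerical sketch for the hypothetical example $f(x)=|x-\tfrac12|$ on $[0,1]$, which is absolutely continuous but not monotone. The example, the midpoint rule in `integrate`, and all function names are assumptions made for illustration. For this $f$ one has $\mathrm{Var}_f[0,x]=x$, since $f$ is monotone with slope of absolute value 1 on each side of $\tfrac12$.

```python
def f(x):
    return abs(x - 0.5)

def f_prime(t):
    """Derivative of f, defined everywhere except at t = 1/2 (a null set)."""
    return -1.0 if t < 0.5 else 1.0

def F(x):
    """Var_f[0, x] for this particular f: monotone pieces with slope -1 and +1."""
    return x

def f1(x):
    return (F(x) + f(x)) / 2

def f2(x):
    return (F(x) - f(x)) / 2

def integrate(g, a, x, n=10000):
    """Midpoint-rule approximation of the integral of g over [a, x]."""
    h = (x - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def is_nondecreasing(seq, tol=1e-12):
    return all(b - a >= -tol for a, b in zip(seq, seq[1:]))

grid = [i / 200 for i in range(201)]
for x in (0.25, 0.5, 0.8, 1.0):
    # f(x) - f(0) should agree with the integral of f' over [0, x].
    print(x, f(x) - f(0.0), integrate(f_prime, 0.0, x))
```

The two pieces $f_1$ and $f_2$ are each nondecreasing on the grid, and their difference recovers $f$.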

3 Borel sets

Let $I=[a,b]$. Denote by $C(I)$ the set of continuous functions $I\to\mathbb{R}$, which with the norm

\[
\|f\|_{C(I)}=\sup_{x\in I}|f(x)|,\qquad f\in C(I),
\]

is a Banach space. Denote by $AC(I)$ the set of absolutely continuous functions $I\to\mathbb{R}$. Let $\mathcal{B}_{C(I)}$ be the Borel σ-algebra of $C(I)$. We have $AC(I)\subset C(I)$, and in the following theorem we prove that $AC(I)$ is a Borel set in $C(I)$.

Theorem 7.

$AC(I)\in\mathcal{B}_{C(I)}$.

Proof.

If $X,Y$ are Polish spaces, $f:X\to Y$ is continuous, $A\in\mathcal{B}_X$, and $f|A$ is injective, then $f(A)\in\mathcal{B}_Y$ (Alexander Kechris, Classical Descriptive Set Theory, p. 89, Theorem 15.1). Let $X=\mathbb{R}\times L^1(I)$, which is a Banach space with the norm

\[
\|(A,g)\|_X=|A|+\int_a^b |g|\,d\lambda,\qquad (A,g)\in X.
\]

Furthermore, $\mathbb{R}$ and $L^1(I)$ are separable and thus so is $X$, so $X$ is indeed a Polish space. The Banach space $C(I)$ is separable and thus is a Polish space. Define $\Phi:X\to C(I)$ by

\[
\Phi(A,g)(x)=A+\int_a^x g(t)\,d\lambda(t),\qquad (A,g)\in X,\ x\in I.
\]

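For $(A_1,g_1),(A_2,g_2)\in X$,

\[
\begin{aligned}
\|\Phi(A_1,g_1)-\Phi(A_2,g_2)\|_{C(I)} &= \left\|(A_1-A_2)+\int_a^x (g_1(t)-g_2(t))\,d\lambda(t)\right\|_{C(I)}\\
&\le |A_1-A_2|+\sup_{x\in I}\left|\int_a^x (g_1(t)-g_2(t))\,d\lambda(t)\right|\\
&\le |A_1-A_2|+\int_a^b |g_1(t)-g_2(t)|\,d\lambda(t)\\
&= \|(A_1,g_1)-(A_2,g_2)\|_X,
\end{aligned}
\]

which shows that $\Phi:X\to C(I)$ is continuous.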
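The estimate above has a direct discrete counterpart, sketched here under stated assumptions: integrals are replaced by midpoint sums on a uniform grid, the function names `phi`, `l1_norm`, and `sup_distance` are hypothetical, and the sample data $(A_1,g_1)=(1,\cos)$, $(A_2,g_2)=(\tfrac12,t^2)$ on $[0,1]$ are chosen only for illustration. The sup-norm distance between the two images should not exceed the distance between the two pairs in $X$.

```python
import math

def phi(A, g, a, b, n=1000):
    """Grid values of x -> A + integral of g over [a, x], via midpoint sums."""
    h = (b - a) / n
    values = [A]
    for k in range(n):
        values.append(values[-1] + h * g(a + (k + 0.5) * h))
    return values

def l1_norm(g, a, b, n=1000):
    """Midpoint-sum approximation of the L^1 norm of g on [a, b]."""
    h = (b - a) / n
    return h * sum(abs(g(a + (k + 0.5) * h)) for k in range(n))

def sup_distance(u, v):
    """Largest pointwise gap between two lists of grid values."""
    return max(abs(s - t) for s, t in zip(u, v))

a, b = 0.0, 1.0
A1, g1 = 1.0, math.cos
A2, g2 = 0.5, lambda t: t * t

lhs = sup_distance(phi(A1, g1, a, b), phi(A2, g2, a, b))
rhs = abs(A1 - A2) + l1_norm(lambda t: g1(t) - g2(t), a, b)
print(lhs, rhs)
```

Using the same grid for both sides makes the discrete inequality hold exactly up to rounding, mirroring the triangle-inequality steps in the displayed computation.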

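Let $(A,g)\in X$ and $\epsilon>0$. Because $g\in L^1(I)$, there is some $\delta>0$ such that if $\lambda(E)<\delta$ then $\int_E |g|\,d\lambda<\epsilon$ (Walter Rudin, Real and Complex Analysis, third ed., p. 32, exercise 1.12). If $(\alpha_1,\beta_1),\ldots,(\alpha_n,\beta_n)$ are disjoint intervals whose total length is $<\delta$, then, with $E=\bigcup_{i=1}^n (\alpha_i,\beta_i)$,

\[
\begin{aligned}
\sum_{i=1}^n |\Phi(A,g)(\beta_i)-\Phi(A,g)(\alpha_i)| &= \sum_{i=1}^n \left|\int_{\alpha_i}^{\beta_i} g(t)\,d\lambda(t)\right|\\
&\le \sum_{i=1}^n \int_{\alpha_i}^{\beta_i} |g(t)|\,d\lambda(t)\\
&= \int_E |g|\,d\lambda\\
&< \epsilon,
\end{aligned}
\]

showing that $\Phi(A,g)$ is absolutely continuous. On the other hand, let $f\in AC(I)$. From Theorem 6, $f$ is differentiable at almost all $x\in I$, $f'\in L^1(I)$, and

\[
f(x)-f(a)=\int_a^x f'(t)\,d\lambda(t),\qquad x\in I.
\]

Then $(f(a),f')\in X$, and the above gives us, for all $x\in I$,

\[
\Phi(f(a),f')(x)=f(a)+\int_a^x f'(t)\,d\lambda(t)=f(x),
\]

thus $\Phi(f(a),f')=f$. Therefore

\[
\Phi(X)=AC(I).
\]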

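If $\Phi(A_1,g_1)=\Phi(A_2,g_2)$, then $\Phi(A_1,g_1)(a)=\Phi(A_2,g_2)(a)$ gives $A_1=A_2$. Using this, and defining $G:I\to\mathbb{R}$ by $G(x)=\int_a^x (g_1(t)-g_2(t))\,d\lambda(t)$, we have $G(x)=0$ for all $x\in I$. Then $G'(x)=0$ for all $x\in I$, and by the Lebesgue differentiation theorem (Walter Rudin, Real and Complex Analysis, third ed., p. 141, Theorem 7.11) we have $G'(x)=g_1(x)-g_2(x)$ for almost all $x\in I$. That is, $g_1(x)=g_2(x)$ for almost all $x\in I$, and thus in $L^1(I)$ we have $g_1=g_2$. Therefore $\Phi:X\to C(I)$ is injective.

Applying the theorem quoted at the start of the proof with $A=X$, we get $\Phi(X)\in\mathcal{B}_{C(I)}$, and since $\Phi(X)=AC(I)$ this proves the claim. ∎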