Semicontinuous functions and convexity

Jordan Bell
April 3, 2014

1 Lattices

If $(A,\leq)$ is a partially ordered set and $S$ is a subset of $A$, a supremum of $S$ is an upper bound of $S$ that is $\leq$ every upper bound of $S$, and an infimum of $S$ is a lower bound of $S$ that is $\geq$ every lower bound of $S$. Because a partial order is antisymmetric, if a supremum exists it is unique, and we denote it by $\bigvee S$, and if an infimum exists it is unique, and we denote it by $\bigwedge S$. If $(A,\leq)$ is a lattice, then one proves by induction that both the supremum and the infimum exist for every finite nonempty subset of $A$. Vacuously, every element of a partially ordered set is an upper bound for $\emptyset$ and a lower bound for $\emptyset$. Thus, if $\emptyset$ has a supremum $x$ then $x \leq y$ for all $y \in A$, and if $\emptyset$ has an infimum $x$ then $x \geq y$ for all $y \in A$. That is,

$$\bigvee \emptyset = \bigwedge A, \qquad \bigwedge \emptyset = \bigvee A.$$

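If $X$ is a set and $\overline{\mathbb{R}} = [-\infty,\infty]$, the set $\overline{\mathbb{R}}^X$ of functions $X \to \overline{\mathbb{R}}$ is partially ordered, where $f \leq g$ if $f(x) \leq g(x)$ for all $x \in X$. Moreover, $\overline{\mathbb{R}}^X$ is a lattice:

$$(f \vee g)(x) = \max\{f(x),g(x)\}, \qquad (f \wedge g)(x) = \min\{f(x),g(x)\}.$$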

2 Urysohn’s lemma

A Hausdorff topological space $(X,\tau)$ is said to be normal if for every pair of disjoint closed sets $E,F$ there are disjoint open sets $U,V$ with $E \subset U$ and $F \subset V$. Every metrizable space is a normal topological space, but there are normal topological spaces that are not metrizable. A useful fact about normal topological spaces is Urysohn’s lemma (Gert K. Pedersen, Analysis Now, revised printing, p. 24, Theorem 1.5.6): for each pair of disjoint nonempty closed sets $E,F$ there is a continuous function $f:X \to [0,1]$ such that $f(E)=0$ and $f(F)=1$.

A locally compact Hausdorff space need not be normal. For example, the real numbers with the rational sequence topology are Hausdorff and locally compact but not normal. We say that a topological space is σ-compact if it is the union of countably many compact subsets. The following lemma states that if a locally compact Hausdorff space is σ-compact then it is normal (Gert K. Pedersen, Analysis Now, revised printing, p. 39, Proposition 1.7.8).

Lemma 1.

If a locally compact Hausdorff space is σ-compact, then it is normal.

Urysohn’s metrization theorem states that if a topological space is normal and is second-countable (its topology has a countable basis) then it is metrizable. However, a metric space need not be second-countable: a metric space is second-countable precisely when it is separable, and hence the converse of Urysohn’s metrization theorem is false. The following lemma shows that a second-countable locally compact Hausdorff space is σ-compact, and hence metrizable by Urysohn’s metrization theorem.

Lemma 2.

If a locally compact Hausdorff space is second-countable, then it is σ-compact.

Proof.

Let $(X,\tau)$ be a second-countable locally compact Hausdorff space. Because it is second-countable, there is a countable subset $\mathscr{B}$ of $\tau$ that is a basis for $\tau$. If $x \in X$, then because $X$ is locally compact there is an open set $U$ containing $x$ which is itself contained in a compact set $K$, and then there is some $V \in \mathscr{B}$ such that $x \in V \subset U$. Because $X$ is Hausdorff, $K$ is closed, so the closure $\overline{V}$ of $V$ is contained in $K$ and hence $\overline{V}$ is compact. Defining $\mathscr{B}_0$ to be the set of those $V \in \mathscr{B}$ such that $\overline{V}$ is compact, it follows that $\mathscr{B}_0$ is a basis for $\tau$. The closures of the elements of $\mathscr{B}_0$ are countably many compact sets whose union is equal to $X$, showing that $X$ is σ-compact. ∎

3 Lower semicontinuous functions

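If $(X,\tau)$ is a topological space, then $f:X \to [-\infty,\infty]$ is said to be lower semicontinuous if $t \in \mathbb{R}$ implies that $f^{-1}(t,\infty] \in \tau$. We say that $f$ is finite if $-\infty < f(x) < \infty$ for all $x \in X$.

If $A \subset X$ and $t \in \mathbb{R}$, then

$$\chi_A^{-1}(t,\infty] = \begin{cases} X & t<0 \\ A & 0 \leq t < 1 \\ \emptyset & t \geq 1. \end{cases}$$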

We see that the characteristic function of a set is lower semicontinuous if and only if the set is open.
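As a quick illustration of this criterion, here is a worked instance on $X=\mathbb{R}$ with its usual topology. The choice of space and of the two intervals is ours, made only for illustration. It evaluates the inverse image at the single level $t=1/2$.

```latex
% Illustrative assumption: X = \mathbb{R} with the usual topology \tau.
\[
\chi_{(0,1)}^{-1}\bigl(\tfrac12,\infty\bigr] = (0,1) \in \tau,
\qquad
\chi_{[0,1]}^{-1}\bigl(\tfrac12,\infty\bigr] = [0,1] \notin \tau .
\]
% For every other real t the inverse image is \mathbb{R}, the set itself, or \emptyset,
% so \chi_{(0,1)} is lower semicontinuous while \chi_{[0,1]} is not.
```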

The following theorem characterizes lower semicontinuous functions in terms of nets (Gert K. Pedersen, Analysis Now, revised printing, p. 26, Proposition 1.5.11).

Theorem 3.

If $(X,\tau)$ is a topological space and $f:X \to [-\infty,\infty]$ is a function, then $f$ is lower semicontinuous if and only if $(x_\alpha)_{\alpha \in I}$ being a convergent net in $X$ implies that

$$f(\lim x_\alpha) \leq \liminf f(x_\alpha).$$

Proof.

Suppose that $f$ is lower semicontinuous and $x_\alpha \to x$. Say $t < f(x)$. Because $f$ is lower semicontinuous, $f^{-1}(t,\infty] \in \tau$. As $x \in f^{-1}(t,\infty]$ and $x_\alpha \to x$, there is some $\alpha_t$ such that $\alpha \geq \alpha_t$ implies $x_\alpha \in f^{-1}(t,\infty]$. That is, if $\alpha \geq \alpha_t$ then $f(x_\alpha) > t$. This implies that $\liminf f(x_\alpha) \geq t$. But this is true for all $t < f(x)$, hence $\liminf f(x_\alpha) \geq f(x)$.

Suppose that $x_\alpha \to x$ implies that $f(x) \leq \liminf f(x_\alpha)$. Let $t \in \mathbb{R}$ and let $F = f^{-1}[-\infty,t]$. If $x \in \overline{F}$, then there is a net $x_\alpha \in F$ with $x_\alpha \to x$. As $x_\alpha \to x$, by hypothesis $f(x) \leq \liminf f(x_\alpha)$. By definition of $F$ we have $f(x_\alpha) \leq t$ for each $\alpha$, and hence $f(x) \leq t$. This means that $x \in F$, showing that $F$ is closed and so the complement of $F$ is open. But the complement of $F$ is $f^{-1}(t,\infty]$, showing that $f$ is lower semicontinuous. ∎

Let $\mathrm{LSC}(X)$ be the set of all lower semicontinuous functions $X \to [-\infty,\infty]$. $\mathrm{LSC}(X)$ is a partially ordered set: $f \leq g$ means that $f(x) \leq g(x)$ for all $x \in X$. The following theorem shows that $\mathrm{LSC}(X)$ is a lattice that contains the supremum of each of its subsets (Gert K. Pedersen, Analysis Now, revised printing, p. 27, Proposition 1.5.12).

Theorem 4.

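If $(X,\tau)$ is a topological space, then $\mathrm{LSC}(X)$ is a lattice, and if $\mathscr{F} \subset \mathrm{LSC}(X)$ then $g:X \to [-\infty,\infty]$ defined by

$$g(x) = \sup_{f \in \mathscr{F}} f(x), \qquad x \in X,$$

belongs to $\mathrm{LSC}(X)$.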

Proof.

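If $\mathscr{F} = \emptyset$, then $g$ is the constant function $x \mapsto -\infty$, which is lower semicontinuous as for any $t \in \mathbb{R}$, the inverse image of $(t,\infty]$ is $\emptyset$, which belongs to $\tau$. Otherwise, let $t \in \mathbb{R}$. Saying $x \in g^{-1}(t,\infty]$ means that $g(x) > t$, and because $g(x)$ is the supremum of the values $f(x)$, $f \in \mathscr{F}$, there is some $f \in \mathscr{F}$ satisfying $f(x) > t$. Therefore, if $x \in g^{-1}(t,\infty]$ then $x \in \bigcup_{f \in \mathscr{F}} f^{-1}(t,\infty]$. On the other hand, if $x \in \bigcup_{f \in \mathscr{F}} f^{-1}(t,\infty]$, then there is some $f \in \mathscr{F}$ with $f(x) > t$, and hence $g(x) > t$. Therefore

$$g^{-1}(t,\infty] = \bigcup_{f \in \mathscr{F}} f^{-1}(t,\infty].$$

But each $f \in \mathscr{F}$ is lower semicontinuous so the right-hand side is a union of elements of $\tau$, and hence $g^{-1}(t,\infty] \in \tau$, showing that $g$ is lower semicontinuous. Therefore every subset of $\mathrm{LSC}(X)$ has a supremum.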

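Suppose that $f_1,f_2 \in \mathrm{LSC}(X)$, and define $f:X \to [-\infty,\infty]$ by

$$f(x) = \min\{f_1(x),f_2(x)\}, \qquad x \in X.$$

For $t \in \mathbb{R}$, it is apparent that

$$f^{-1}(t,\infty] = f_1^{-1}(t,\infty] \cap f_2^{-1}(t,\infty].$$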

As f1 and f2 are each lower semicontinuous, these two inverse images are each open sets, and so their intersection is an open set. Therefore f is lower semicontinuous, showing that LSC(X) is a lattice. ∎
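The lattice statement concerns the infimum of two (hence finitely many) functions only. A minimal sketch of why the pointwise infimum of infinitely many lower semicontinuous functions can fail to be lower semicontinuous, assuming $X=\mathbb{R}$ with the usual topology $\tau$ and using the illustrative family $f_n=\chi_{(-1/n,1/n)}$ (both choices are ours, not the text's):

```latex
% Illustrative assumption: X = \mathbb{R}; each f_n is the characteristic function of an open set.
\[
f_n = \chi_{(-1/n,\,1/n)} \in \mathrm{LSC}(\mathbb{R}), \qquad
\inf_{n \geq 1} f_n(x) = \chi_{\{0\}}(x),
\]
\[
\chi_{\{0\}}^{-1}\bigl(\tfrac12,\infty\bigr] = \{0\} \notin \tau .
\]
% The pointwise infimum is the characteristic function of a set that is not open,
% so it is not lower semicontinuous.
```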

One is sometimes interested in lower semicontinuous functions that do not take the value $-\infty$. As the following theorem shows, the sum of two lower semicontinuous functions that do not take the value $-\infty$ is also a lower semicontinuous function.

Theorem 5.

If $X$ is a topological space, if $f,g \in \mathrm{LSC}(X)$, and if $f,g > -\infty$, then $f+g \in \mathrm{LSC}(X)$, and if $r>0$ and $f \in \mathrm{LSC}(X)$ then $rf \in \mathrm{LSC}(X)$.

Proof.

If $(x_\alpha)_{\alpha \in I}$ is a net that converges to $x \in X$, then, by Theorem 3,

$$(f+g)(x) \leq \liminf f(x_\alpha) + \liminf g(x_\alpha) \leq \liminf \bigl(f(x_\alpha)+g(x_\alpha)\bigr) = \liminf (f+g)(x_\alpha),$$

and by Theorem 3 this tells us that $f+g$ is lower semicontinuous. As well,

$$(rf)(x) = rf(x) \leq r \liminf f(x_\alpha) = \liminf rf(x_\alpha) = \liminf (rf)(x_\alpha),$$

showing that $rf \in \mathrm{LSC}(X)$. ∎

The following theorem shows that if $f_n \in \mathrm{LSC}(X)$ are each finite and converge uniformly on $X$ to $f \in \mathbb{R}^X$, then $f \in \mathrm{LSC}(X)$ (Gert K. Pedersen, Analysis Now, revised printing, p. 27, Proposition 1.5.12).

Theorem 6.

If $(X,\tau)$ is a topological space, if $f_n \in \mathrm{LSC}(X)$ are finite, and if $f_n$ converge uniformly on $X$ to $f \in \mathbb{R}^X$, then $f \in \mathrm{LSC}(X)$.

Proof.

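If $\epsilon>0$, then there is some $N$ such that $n \geq N$ and $x \in X$ imply that $|f_n(x)-f(x)|<\epsilon$. Define

$$\delta_n = \sup\{|f_n(x)-f(x)| : x \in X\}.$$

Thus, for all $\epsilon>0$ there is some $N$ such that $n \geq N$ implies that $\delta_n \leq \epsilon$. If $(x_\alpha)_{\alpha \in I}$ is a convergent net in $X$, then, for all $n \geq N$,

$$\begin{aligned}
f(\lim x_\alpha) &\leq \delta_n + f_n(\lim x_\alpha) \\
&\leq \delta_n + \liminf f_n(x_\alpha) \\
&\leq 2\delta_n + \liminf f(x_\alpha) \\
&\leq 2\epsilon + \liminf f(x_\alpha).
\end{aligned}$$

This is true for all $\epsilon>0$, so we get

$$f(\lim x_\alpha) \leq \liminf f(x_\alpha),$$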

and therefore f is lower semicontinuous. ∎

The following theorem shows in particular that on a normal topological space $X$, any finite nonnegative lower semicontinuous function is the supremum of the set of all continuous functions $X \to \mathbb{R}$ that are dominated by it (Gert K. Pedersen, Analysis Now, revised printing, p. 27, Proposition 1.5.13). To say that continuous functions $X \to [0,1]$ separate points and closed sets means that if $x \in X$ and $F$ is a closed set that does not contain $x$, then there is a continuous function $g:X \to [0,1]$ such that $g(x)=1$ and $g(F)=0$. Urysohn’s lemma states that in a normal topological space continuous functions separate closed sets, so in particular they separate points and closed sets.

Theorem 7.

If $(X,\tau)$ is a topological space such that continuous functions $X \to [0,1]$ separate points and closed sets, and $f \geq 0$ is a finite lower semicontinuous function on $X$, then $f$ is the supremum of the set of continuous functions $g:X \to \mathbb{R}$ such that $g \leq f$.

Proof.

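Let $\mathscr{C}(f)$ be the set of all continuous functions $g:X \to \mathbb{R}$ with $g \leq f$. As $f \geq 0$ we get $0 \in \mathscr{C}(f)$, and so $\mathscr{C}(f)$ is nonempty. If $x \in X$ and $\epsilon>0$, let

$$F = f^{-1}(-\infty,f(x)-\epsilon].$$

$X \setminus F = f^{-1}(f(x)-\epsilon,\infty) \in \tau$, so $F$ is closed. It is apparent that $x \notin F$. Therefore there is a continuous function $h:X \to [0,1]$ such that $h(x)=1$ and $h(F)=0$. If $f(x)-\epsilon \leq 0$ then, as $h \geq 0$ and $f \geq 0$, we have $(f(x)-\epsilon)h \leq f$. If $f(x)-\epsilon>0$ and $y \in X$, then

$$(f(x)-\epsilon)h(y) = \begin{cases} 0 & y \in F \\ (f(x)-\epsilon)h(y) & y \notin F \end{cases} \leq \begin{cases} 0 & y \in F \\ f(x)-\epsilon & y \notin F \end{cases} \leq f(y).$$

Therefore $(f(x)-\epsilon)h \leq f$, and because the function $(f(x)-\epsilon)h$ is continuous we have $(f(x)-\epsilon)h \in \mathscr{C}(f)$. Because this is an element of $\mathscr{C}(f)$ we get

$$\Bigl(\bigvee \mathscr{C}(f)\Bigr)(x) \geq (f(x)-\epsilon)h(x) = f(x)-\epsilon.$$

As $\epsilon$ was arbitrary it follows that $\bigl(\bigvee \mathscr{C}(f)\bigr)(x) \geq f(x)$, and as $x$ was arbitrary we have $\bigvee \mathscr{C}(f) \geq f$. But $f$ is an upper bound for $\mathscr{C}(f)$, so $\bigvee \mathscr{C}(f) \leq f$. Therefore $f = \bigvee \mathscr{C}(f)$. ∎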

The following is a formulation of the extreme value theorem for lower semicontinuous functions on a compact topological space.

Theorem 8 (Extreme value theorem).

If X is a compact topological space and if f is a lower semicontinuous function on X, then

$$K = \Bigl\{x \in X : f(x) = \inf_{y \in X} f(y)\Bigr\}$$

is a nonempty closed subset of X.

Proof.

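Let $C = f(X) \subset [-\infty,\infty]$, and for $c \in C$ let $F_c = \{x \in X : f(x) \leq c\}$, which is nonempty because $c \in f(X)$. For real $c$, because $F_c = X \setminus f^{-1}(c,\infty]$ and $f$ is lower semicontinuous, $F_c$ is a closed set; also $F_\infty = X$ and $F_{-\infty} = \bigcap_{n \geq 1} \{x \in X : f(x) \leq -n\}$ are closed. Suppose that $c_1,\ldots,c_n \in C$. Taking $c = \min\{c_k : 1 \leq k \leq n\} \in C$, we have

$$\bigcap_{k=1}^n F_{c_k} = F_c.$$

This shows that the collection $\{F_c : c \in C\}$ has the finite intersection property (the intersection of finitely many members of it is nonempty). But a topological space is compact if and only if for every collection of closed subsets with the finite intersection property the intersection of all the members of the collection is nonempty (James Munkres, Topology, second ed., p. 169, Theorem 26.9). Applying this theorem, we get that

$$\bigcap_{c \in C} F_c \neq \emptyset,$$

and this intersection is closed because each member is closed.

Let $x \in \bigcap_{c \in C} F_c$. Then for all $c \in C$ we have $f(x) \leq c$, and because $C=f(X)$ this means that for all $y \in X$ we have $f(x) \leq f(y)$. Therefore $x \in K$. Let $x \in K$. Then for all $y \in X$ we have $f(x) \leq f(y)$, hence for all $c \in C$ we have $f(x) \leq c$, hence $x \in \bigcap_{c \in C} F_c$. Therefore $K = \bigcap_{c \in C} F_c$, which we have shown is nonempty and closed, proving the claim. ∎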
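To see what lower semicontinuity contributes to this theorem, here is a small worked comparison. It assumes $X=[0,1]$ with the subspace topology of $\mathbb{R}$, and the two functions $g$ and $h$ are hypothetical examples chosen for illustration; they do not appear in the text.

```latex
% Assumption for illustration: X = [0,1] with the subspace topology of \mathbb{R}.
\[
g(x) = \begin{cases} -1 & x = 0 \\ x & 0 < x \leq 1 \end{cases}
\qquad
h(x) = \begin{cases} 1 & x = 0 \\ x & 0 < x \leq 1 \end{cases}
\]
% g is lower semicontinuous: g^{-1}(t,\infty] is one of [0,1], (0,1], (t,1], \emptyset,
% each of which is open in [0,1]. Its set of minimizers is K = \{0\}, nonempty and closed.
% h is not lower semicontinuous: h^{-1}(1/2,\infty] = \{0\} \cup (1/2,1] is not open in [0,1].
% Here \inf h = 0 is not attained, so the set of minimizers of h is empty.
```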

4 Upper semicontinuous functions

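If $(X,\tau)$ is a topological space, then $f:X \to [-\infty,\infty]$ is said to be upper semicontinuous if $t \in \mathbb{R}$ implies that $f^{-1}[-\infty,t) \in \tau$. We denote by $\mathrm{USC}(X)$ the set of upper semicontinuous functions $X \to [-\infty,\infty]$. We say that $f$ is finite if $-\infty < f(x) < \infty$ for all $x \in X$.

If $A \subset X$ and $t \in \mathbb{R}$, then

$$\chi_A^{-1}[t,\infty] = \begin{cases} X & t \leq 0 \\ A & 0 < t \leq 1 \\ \emptyset & t > 1. \end{cases}$$

But $\chi_A^{-1}[t,\infty]$ is closed if and only if $\chi_A^{-1}[-\infty,t)$ is open, hence $\chi_A$ is upper semicontinuous if and only if $A$ is closed.

It is apparent that $f:X \to [-\infty,\infty]$ is upper semicontinuous if and only if $-f:X \to [-\infty,\infty]$ is lower semicontinuous. Because the set of all open intervals is a basis for the topology of $\mathbb{R}$, a function $X \to \mathbb{R}$ is continuous if and only if it is both lower semicontinuous and upper semicontinuous. That is,

$$C(X) = \mathbb{R}^X \cap \mathrm{LSC}(X) \cap \mathrm{USC}(X).$$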

5 Approximating integrable functions

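If $X$ is a Hausdorff space, if $\mathfrak{M}$ is a σ-algebra on $X$ that contains the Borel σ-algebra of $X$ (equivalently, if every open set belongs to $\mathfrak{M}$), and if $\mu$ is a measure on $\mathfrak{M}$, we say that $\mu$ is outer regular on $E \in \mathfrak{M}$ if

$$\mu(E) = \inf\{\mu(V) : E \subset V \text{ and } V \text{ is open}\},$$

and we say that $\mu$ is inner regular on $E$ if

$$\mu(E) = \sup\{\mu(K) : K \subset E \text{ and } K \text{ is compact}\}.$$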

We state the following to motivate the conditions in Theorem 9. If $X$ is a locally compact Hausdorff space and $\lambda$ is a positive linear functional on $C_c(X)$ ($f \geq 0$ implies that $\lambda f \geq 0$), then the Riesz-Markov theorem (Walter Rudin, Real and Complex Analysis, third ed., p. 40, Theorem 2.14) states that there is a σ-algebra $\mathfrak{M}$ on $X$ that contains the Borel σ-algebra of $X$ and there is a unique complete measure $\mu$ on $\mathfrak{M}$ that satisfies:

  1. If $f \in C_c(X)$ then $\lambda f = \int_X f \, d\mu$.

  2. If $K$ is compact then $\mu(K) < \infty$.

  3. $\mu$ is outer regular on all $E \in \mathfrak{M}$.

  4. $\mu$ is inner regular on all open sets and on all sets with finite measure.

The following theorem gives conditions under which we can bound an integrable function above and below by semicontinuous functions that can be chosen as close as we please in $L^1$ norm (Walter Rudin, Real and Complex Analysis, third ed., p. 56, Theorem 2.25).

Theorem 9 (Vitali-Carathéodory theorem).

Let $X$ be a locally compact Hausdorff space, let $\mathfrak{M}$ be a σ-algebra containing the Borel σ-algebra of $X$, and let $\mu$ be a complete measure on $\mathfrak{M}$ that satisfies $\mu(K) < \infty$ for compact $K$, that is outer regular on all measurable sets, and that is inner regular on open sets and on sets with finite measure. If $f \in L^1(\mu)$ is real valued and if $\epsilon>0$, then there is some upper semicontinuous function $u$ that is bounded above and some lower semicontinuous function $v$ that is bounded below such that $u \leq f \leq v$ and such that

$$\int_X (v-u) \, d\mu < \epsilon.$$

Proof.

Let $g \in L^1(\mu)$ be $\geq 0$ and let $\epsilon>0$. There is a nondecreasing sequence of measurable simple functions $s_n$ such that for all $x \in X$ we have $g(x) = \lim_{n \to \infty} s_n(x)$ (Walter Rudin, Real and Complex Analysis, third ed., p. 15, Theorem 1.17; a simple function is a finite linear combination of characteristic functions, either over $\mathbb{R}$ or $\mathbb{C}$). Writing $s_0=0$ and $t_n = s_n - s_{n-1}$, each $t_n$ is a measurable simple function and is $\geq 0$. Then, there are some $c_i>0$ and measurable sets $E_i$ such that for all $x \in X$ we have

$$g(x) = \lim_{n \to \infty} \sum_{i=1}^n t_i(x) = \sum_{i=1}^\infty t_i(x) = \sum_{i=1}^\infty c_i \chi_{E_i}(x).$$

Integrating this we get (Walter Rudin, Real and Complex Analysis, third ed., p. 22, Theorem 1.27)

$$\int_X g \, d\mu = \sum_{i=1}^\infty \int_X c_i \chi_{E_i} \, d\mu = \sum_{i=1}^\infty c_i \mu(E_i).$$

As $g \in L^1(\mu)$ the left-hand side is finite and so the right-hand side is too, hence there is some $N$ such that $\sum_{i=N+1}^\infty c_i \mu(E_i) < \frac{\epsilon}{2}$.

For each $i$, because $\mu$ is outer regular on $E_i$ there is an open set $V_i$ containing $E_i$ such that $c_i \mu(V_i \setminus E_i) < 2^{-i-2}\epsilon$. Each $E_i$ has finite measure so $\mu$ is inner regular on $E_i$, hence there is a compact set $K_i$ contained in $E_i$ such that $c_i \mu(E_i \setminus K_i) < 2^{-i-2}\epsilon$. Define

$$v = \sum_{i=1}^\infty c_i \chi_{V_i}, \qquad u = \sum_{i=1}^N c_i \chi_{K_i}.$$

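Each $V_i$ is open so the characteristic function $\chi_{V_i}$ is lower semicontinuous, and $c_i>0$ so each of the functions $\sum_{i=1}^n c_i \chi_{V_i}$ is a sum of finitely many lower semicontinuous functions and hence is lower semicontinuous. But $v$ is the supremum of the functions $\sum_{i=1}^n c_i \chi_{V_i}$, so $v$ is lower semicontinuous. As each $K_i$ is compact and $X$ is Hausdorff, each $K_i$ is closed, so the characteristic function $\chi_{K_i}$ is upper semicontinuous, and $c_i>0$ so $u$ is a sum of finitely many upper semicontinuous functions and hence is upper semicontinuous. $u$ is a finite sum so is bounded above, and $v$ is a sum of nonnegative terms so is bounded below by $0$.

Because $K_i \subset E_i \subset V_i$ we have $u \leq g \leq v$, and

$$\begin{aligned}
v-u &= \sum_{i=1}^\infty c_i \chi_{V_i} - \sum_{i=1}^N c_i \chi_{K_i} \\
&= \sum_{i=1}^N c_i (\chi_{V_i} - \chi_{K_i}) + \sum_{i=N+1}^\infty c_i \chi_{V_i} \\
&\leq \sum_{i=1}^\infty c_i (\chi_{V_i} - \chi_{K_i}) + \sum_{i=N+1}^\infty c_i \chi_{E_i};
\end{aligned}$$

the inequality is because $\chi_{V_i} - \chi_{K_i} + \chi_{E_i} \geq \chi_{V_i}$. Integrating,

$$\begin{aligned}
\int_X (v-u) \, d\mu &\leq \sum_{i=1}^\infty c_i \mu(V_i \setminus K_i) + \sum_{i=N+1}^\infty c_i \mu(E_i) \\
&= \sum_{i=1}^\infty \bigl(c_i \mu(V_i \setminus E_i) + c_i \mu(E_i \setminus K_i)\bigr) + \sum_{i=N+1}^\infty c_i \mu(E_i) \\
&< \sum_{i=1}^\infty \bigl(2^{-i-2}\epsilon + 2^{-i-2}\epsilon\bigr) + \frac{\epsilon}{2} \\
&= \epsilon.
\end{aligned}$$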
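The final equality rests on a geometric series. Spelled out as a short arithmetic check (nothing here goes beyond the bounds $2^{-i-2}\epsilon$ and $\frac{\epsilon}{2}$ already chosen in the proof):

```latex
% Geometric series with first term 2^{-2} and ratio 2^{-1}.
\[
\sum_{i=1}^\infty \bigl(2^{-i-2}\epsilon + 2^{-i-2}\epsilon\bigr)
= \epsilon \sum_{i=1}^\infty 2^{-i-1}
= \epsilon \cdot \frac{2^{-2}}{1-2^{-1}}
= \frac{\epsilon}{2},
\qquad
\frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .
\]
```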

Let $f = f^+ - f^-$, for $f^+,f^- \in L^1(\mu)$ with $f^+,f^- \geq 0$, and let $\epsilon>0$. From what we have established above, there is an upper semicontinuous function $u_1$ that is bounded above and a lower semicontinuous function $v_1$ that is bounded below satisfying $u_1 \leq f^+ \leq v_1$ and

$$\int_X (v_1-u_1) \, d\mu < \frac{\epsilon}{2},$$

and similarly $u_2,v_2$ with $u_2 \leq f^- \leq v_2$ and

$$\int_X (v_2-u_2) \, d\mu < \frac{\epsilon}{2}.$$

We have

$$u_1 - v_2 \leq f^+ - f^- = f = f^+ - f^- \leq v_1 - u_2.$$

That $v_2$ is lower semicontinuous and bounded below means that $-v_2$ is upper semicontinuous and bounded above, hence $u_1-v_2$ is upper semicontinuous and bounded above. That $u_2$ is upper semicontinuous and bounded above means that $-u_2$ is lower semicontinuous and bounded below, so $v_1-u_2$ is lower semicontinuous and bounded below. Taking $u=u_1-v_2$ and $v=v_1-u_2$, we have $u \leq f \leq v$, and

$$\int_X (v-u) \, d\mu = \int_X (v_1-u_2-u_1+v_2) \, d\mu = \int_X (v_1-u_1) \, d\mu + \int_X (v_2-u_2) \, d\mu < \epsilon.$$ ∎

6 Convex functions

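If $X$ is a set and $f:X \to [-\infty,\infty]$ is a function, its epigraph is the set

$$\operatorname{epi} f = \{(x,\alpha) \in X \times \mathbb{R} : \alpha \geq f(x)\}.$$

When $X$ is a vector space, we say that $f$ is convex if $\operatorname{epi} f$ is a convex subset of the vector space $X \times \mathbb{R}$. The effective domain of a convex function $f$ is the set

$$\operatorname{dom} f = \{x \in X : f(x) < \infty\}.$$

To say that $x \in \operatorname{dom} f$ is to say that there is some $\alpha \in \mathbb{R}$ such that $(x,\alpha) \in \operatorname{epi} f$, from which it follows that if $f:X \to [-\infty,\infty]$ is a convex function then $\operatorname{dom} f$ is a convex subset of $X$. A convex function $f$ is said to be proper if $\operatorname{dom} f \neq \emptyset$ and $f(x) > -\infty$ for all $x \in X$, i.e. if $f$ does not only take the value $\infty$ and never takes the value $-\infty$.

If $C$ is a nonempty convex subset of $X$ and $f:C \to \mathbb{R}$ is a function, we extend $f$ to $X$ by defining $f(x)=\infty$ for $x \in X \setminus C$. One checks that this extension is a convex function if and only if

$$f((1-t)x+ty) \leq (1-t)f(x)+tf(y), \qquad x,y \in C, \quad 0<t<1,$$

and we call $f:C \to \mathbb{R}$ convex if $f:X \to (-\infty,\infty]$ is convex. If this extension is convex, then it has effective domain $C$ and is proper.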

The following lemma is straightforward to prove (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 187, Lemma 5.41).

Lemma 10.

If $X$ is a vector space, if $C$ is a convex subset of $X$, if $f:C \to \mathbb{R}$ is convex, if $x,x+z,x-z \in C$, and if $0 \leq \delta \leq 1$, then

$$|f(x+\delta z)-f(x)| \leq \delta \max\{f(x+z)-f(x),f(x-z)-f(x)\}.$$
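The text leaves the proof of Lemma 10 to the reader. One possible derivation is sketched below; it is our reconstruction under the lemma's hypotheses, not necessarily the argument in the cited reference. It writes $x+\delta z$ and $x$ as convex combinations and applies the convexity inequality twice.

```latex
% Step 1: x + \delta z = (1-\delta) x + \delta (x+z) \in C, so convexity gives
\[
f(x+\delta z) \leq (1-\delta) f(x) + \delta f(x+z),
\quad\text{i.e.}\quad
f(x+\delta z) - f(x) \leq \delta \bigl( f(x+z) - f(x) \bigr).
\]
% Step 2: x = \frac{1}{1+\delta}(x+\delta z) + \frac{\delta}{1+\delta}(x-z), so convexity gives
\[
(1+\delta) f(x) \leq f(x+\delta z) + \delta f(x-z),
\quad\text{i.e.}\quad
f(x) - f(x+\delta z) \leq \delta \bigl( f(x-z) - f(x) \bigr).
\]
% Step 3: |f(x+\delta z) - f(x)| equals whichever of the two left-hand differences
% is nonnegative, and each right-hand side is at most
% \delta \max\{ f(x+z)-f(x), f(x-z)-f(x) \}.
```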

The following lemma asserts that a convex function that is bounded above on some neighborhood of an interior point of a convex subset of a topological vector space is continuous at that point (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 188, Theorem 5.42).

Lemma 11.

If $X$ is a topological vector space, if $C$ is a convex subset of $X$, if $f:C \to \mathbb{R}$ is convex, if $x$ is in the interior of $C$, and if $f$ is bounded above on some neighborhood of $x$, then $f$ is continuous at $x$.

Proof.

There is some neighborhood of $x$ contained in $C$ on which $f$ is bounded above. Thus, there is some open neighborhood $U$ of the origin such that $x+U \subset C$ and such that $f$ is bounded above on $x+U$. Being bounded above means that there is some $M$ such that $y \in x+U$ implies that $f(y) \leq M$. A set $V$ is said to be balanced if $|\alpha| \leq 1$ implies that $\alpha V \subset V$. Any open neighborhood of $0$ contains a balanced open neighborhood of $0$ (Walter Rudin, Functional Analysis, second ed., p. 12, Theorem 1.14), so let $V$ be a balanced open neighborhood of $0$ contained in $U$.

Let $\epsilon>0$ and take $0<\delta \leq 1$ small enough that $\delta(M-f(x))<\epsilon$. For $y \in x+\delta V$, there is some $z \in V$ with $y=x+\delta z$, and because $V$ is balanced we have $-z \in V$, so $x+z,x-z \in x+V$. Then we can apply Lemma 10 to get

$$|f(y)-f(x)| \leq \delta \max\{f(x+z)-f(x),f(x-z)-f(x)\} \leq \delta(M-f(x)) < \epsilon.$$

But x+δV is an open neighborhood of x (because scalar multiplication is continuous), so we have shown that if ϵ>0 then there is some open neighborhood of x such that y being in this neighborhood implies that |f(y)-f(x)|<ϵ. This means that f is continuous at x. ∎

The following theorem shows that properties that by themselves are weaker than continuity on a set are equivalent to it for a convex function on an open convex set (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 188, Theorem 5.43). We prove three of the five implications because two are immediate.

Theorem 12.

If $X$ is a topological vector space, if $C$ is an open convex subset of $X$, and if $f:C \to \mathbb{R}$ is convex, then the following are equivalent:

  1. $f$ is continuous on $C$.

  2. $f$ is upper semicontinuous on $C$.

  3. For each $x \in C$ there is some neighborhood of $x$ on which $f$ is bounded above.

  4. There is some $x \in C$ and some neighborhood of $x$ on which $f$ is bounded above.

  5. There is some $x \in C$ at which $f$ is continuous.

Proof.

Suppose that $f$ is upper semicontinuous on $C$, and say $x \in C$. Because $f$ is upper semicontinuous and $C$ is open, the set

$$U = \{y \in C : f(y) < f(x)+1\} = C \cap f^{-1}(-\infty,f(x)+1)$$

is open. $x \in U$ because $f(x)<f(x)+1$, so $U$ is a neighborhood of $x$, and $f$ is bounded above on $U$.

Suppose that xC and U is a neighborhood of x on which f is bounded above. Lemma 11 states that f is continuous at x, showing that f is continuous at some point in C.

Suppose that $f$ is continuous at some $x \in C$, and let $y$ be another point in $C$. The function $t \mapsto x+t(y-x)$ is continuous $\mathbb{R} \to X$, and because it sends $1$ to $y \in C$ and $C$ is open, there is some $t>1$ such that $x+t(y-x) \in C$. (That is, the line segment from $x$ to $y$ remains in $C$ for some length past $y$.) Set $z=x+t(y-x)$, i.e. $z=(1-t)x+ty$, i.e.

$$y = \left(1-\frac{1}{t}\right)x + \frac{1}{t}z,$$

or

$$y = \lambda x + (1-\lambda)z,$$

where $\lambda = 1-\frac{1}{t}$ satisfies $0<\lambda<1$. $f$ being continuous at $x$ means that for every $\epsilon>0$ there is some open neighborhood $V$ of $0$ such that $w \in x+V$ implies that $|f(w)-f(x)|<\epsilon$. Fix such a $V$ for $\epsilon=1$, and take $x+V \subset C$, which we can do because $C$ is open. In particular, there is some $M$ such that $f(w) \leq M$ for $w \in x+V$. If $v \in V$, then

y+λv=λx+(1-λ)z+λv=λ(x+v)+(1-λ)z.

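$x+v \in C$ and $z \in C$, and because $C$ is convex this tells us $y+\lambda v \in C$. Therefore $y+\lambda V \subset C$. Because $f$ is convex we have

$$f(y+\lambda v) = f(\lambda(x+v)+(1-\lambda)z) \leq \lambda f(x+v)+(1-\lambda)f(z) \leq \lambda M+(1-\lambda)f(z).$$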

This holds for every vV, so f is bounded above by λM+(1-λ)f(z) on y+λV. y+λV is an open neighborhood of y on which f is bounded above, so we can apply Lemma 11, which tells us that f is continuous at y. Since f is continuous at each point in C, it is continuous on C. ∎

7 Convex hulls

If $X$ is a topological vector space over $\mathbb{R}$, let $X^*$ denote the set of continuous linear maps $X \to \mathbb{R}$. $X^*$ is called the dual space of $X$ and is a vector space. We call the bilinear map $\langle \cdot,\cdot \rangle : X \times X^* \to \mathbb{R}$ defined by

$$\langle x,\lambda \rangle = \lambda x, \qquad x \in X, \quad \lambda \in X^*$$

the dual pairing of $X$ and $X^*$. If $X$ is locally convex, it follows from the Hahn-Banach separation theorem (Walter Rudin, Functional Analysis, second ed., p. 59, Theorem 3.4) that for distinct $x,y \in X$ there is some $\lambda \in X^*$ such that $\lambda x \neq \lambda y$. The weak topology on $X$ is the initial topology for the set of functions $X^*$, and we denote the vector space $X$ with the weak topology by $X_w$. $X_w$ is a locally convex space whose dual space is $X^*$ (Walter Rudin, Functional Analysis, second ed., p. 64, Theorem 3.10).

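If $X$ is a vector space and $E$ is a subset of $X$, the convex hull of $E$ is the set of all convex combinations of finitely many points in $E$ and is denoted by $\operatorname{co} E$. The convex hull $\operatorname{co} E$ is a convex set, and it is straightforward to prove that $\operatorname{co} E$ is equal to the intersection of all convex sets containing $E$. If $X$ is a topological vector space, the closed convex hull of $E$ is the closure of the convex hull $\operatorname{co} E$ and is denoted by $\overline{\operatorname{co}}\, E$. One proves that the closed convex hull $\overline{\operatorname{co}}\, E$ is equal to the intersection of all closed convex sets containing $E$.

A closed half-space in a locally convex space $X$ over $\mathbb{R}$ is a set of the form $\{x \in X : \langle x,\lambda \rangle \leq \beta\}$, for $\beta \in \mathbb{R}$ and $\lambda \in X^*$ with $\lambda \neq 0$. (If $X$ is merely a topological vector space then it may be the case that $X^*=\{0\}$, for example $L^p[0,1]$ for $0<p<1$.) If $\langle x,\lambda \rangle \leq \beta$, $\langle y,\lambda \rangle \leq \beta$, and $0 \leq t \leq 1$, then

$$\langle (1-t)x+ty,\lambda \rangle = (1-t)\langle x,\lambda \rangle + t\langle y,\lambda \rangle \leq (1-t)\beta+t\beta = \beta,$$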

so a closed half-space is convex.

Lemma 13.

If $X$ is a locally convex space over $\mathbb{R}$ and $E$ is a subset of $X$, then $\overline{\operatorname{co}}\, E$ is the intersection of all closed half-spaces containing $E$.

Proof.

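If $E=\emptyset$ then $\overline{\operatorname{co}}\, E=\emptyset$. But every closed half-space contains $\emptyset$ and the intersection of all of these is also $\emptyset$, so the claim is true in this case. If $\overline{\operatorname{co}}\, E=X$ then there are no closed half-spaces that contain $E$, and an intersection over an empty index set is equal to the universe, which is $X$ here, so the claim is true in this case also. Otherwise, $\overline{\operatorname{co}}\, E \neq \emptyset, X$, and let $a \notin \overline{\operatorname{co}}\, E$. Because $\{a\}$ is a compact convex set and $\overline{\operatorname{co}}\, E$ is a disjoint nonempty closed convex set, we can apply the Hahn-Banach separation theorem (Walter Rudin, Functional Analysis, second ed., p. 59, Theorem 3.4), which tells us that there is some $\lambda_a \in X^*$ and some $\gamma_a \in \mathbb{R}$ such that

$$\lambda_a x < \gamma_a < \lambda_a a, \qquad x \in \overline{\operatorname{co}}\, E.$$

$\overline{\operatorname{co}}\, E$ is contained in the closed half-space $\{x \in X : \langle x,\lambda_a \rangle \leq \gamma_a\}$ and $a$ is not. Hence

$$\overline{\operatorname{co}}\, E = \bigcap_{a \notin \overline{\operatorname{co}}\, E} \{x \in X : \langle x,\lambda_a \rangle \leq \gamma_a\}.$$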

This shows that co¯E is equal to an intersection of closed half-spaces containing E. Since a closed half-space is closed and convex and co¯E is the intersection of all closed convex sets containing E, it follows that co¯E is equal to the intersection of all closed half-spaces containing E. ∎
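A concrete instance of the lemma, as an illustrative sketch only: we assume $X=\mathbb{R}^2$ with the Euclidean norm $|\cdot|$, take $E$ to be the unit circle, and identify $X^*$ with $\mathbb{R}^2$ through the dot product. None of these choices comes from the text. The closed unit disc is then cut out by the half-spaces indexed by unit vectors.

```latex
% Assumptions for illustration: X = \mathbb{R}^2, E = \{x : |x| = 1\},
% and \langle x,\lambda\rangle = x_1\lambda_1 + x_2\lambda_2 for \lambda \in \mathbb{R}^2.
\[
\overline{\operatorname{co}}\,E = \{x \in \mathbb{R}^2 : |x| \leq 1\}
= \bigcap_{|\lambda| = 1} \{x \in \mathbb{R}^2 : \langle x,\lambda\rangle \leq 1\}.
\]
% If |x| \leq 1 then \langle x,\lambda\rangle \leq |x|\,|\lambda| \leq 1 by Cauchy-Schwarz.
% If |x| > 1 then \lambda = x/|x| gives \langle x,\lambda\rangle = |x| > 1,
% so x lies outside one of the half-spaces.
```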

If $X$ is a vector space and $f:X \to [-\infty,\infty]$ is a function, the convex hull of $f$ is the function $\operatorname{co} f:X \to [-\infty,\infty]$ defined by

$$\operatorname{co} f = \bigvee \{g \in [-\infty,\infty]^X : g \text{ is convex and } g \leq f\}.$$

We have $\operatorname{co} f \leq f$. The following lemma shows that the supremum of a set of convex functions is itself a convex function (because an intersection of convex sets is itself a convex set and for a function to be convex means that its epigraph is a convex set), and hence that the convex hull of a function is itself a convex function.

Lemma 14.

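If $X$ is a set and $\mathscr{F} \subset [-\infty,\infty]^X$, then $F = \bigvee \mathscr{F}$ satisfies

$$\operatorname{epi} F = \bigcap_{f \in \mathscr{F}} \operatorname{epi} f.$$

Proof.

If $\mathscr{F}=\emptyset$, then $\bigvee \mathscr{F}$ is the function $x \mapsto -\infty$, whose epigraph is $X \times \mathbb{R}$. The intersection over the empty set of subsets of $X \times \mathbb{R}$ is equal to the universe $X \times \mathbb{R}$, so the claim holds in this case. Otherwise, let $F = \bigvee \mathscr{F}$, which satisfies

$$F(x) = \sup\{f(x) : f \in \mathscr{F}\}, \qquad x \in X.$$

Suppose that $(x,\alpha) \in \operatorname{epi} F$. This means that $F(x) \leq \alpha$, so $f(x) \leq \alpha$ for all $f \in \mathscr{F}$, hence $(x,\alpha) \in \operatorname{epi} f$ for all $f \in \mathscr{F}$, and therefore

$$(x,\alpha) \in \bigcap_{f \in \mathscr{F}} \operatorname{epi} f.$$

Suppose that $x \in X$ and $\alpha \in \mathbb{R}$ and that $(x,\alpha) \in \operatorname{epi} f$ for all $f \in \mathscr{F}$. This means that $f(x) \leq \alpha$ for all $f \in \mathscr{F}$, hence $F(x) \leq \alpha$ and so $(x,\alpha) \in \operatorname{epi} F$. ∎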
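For a two-element family the lemma can be checked by hand. The sketch below assumes $X=\mathbb{R}$ and the illustrative family $\{f_1,f_2\}$ with $f_1(x)=x$ and $f_2(x)=-x$; these are our choices, not the text's.

```latex
% Illustrative assumption: X = \mathbb{R}, f_1(x) = x, f_2(x) = -x.
\[
F(x) = \max\{x,-x\} = |x|,
\]
\[
\operatorname{epi} F = \{(x,\alpha) : \alpha \geq |x|\}
= \{(x,\alpha) : \alpha \geq x\} \cap \{(x,\alpha) : \alpha \geq -x\}
= \operatorname{epi} f_1 \cap \operatorname{epi} f_2 .
\]
% Each \operatorname{epi} f_i is a half-plane, hence convex, so \operatorname{epi} F is convex
% and F = |\cdot| is a convex function.
```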

Lower semicontinuity of a function can also be expressed using the notion of epigraphs.

Lemma 15.

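If $X$ is a topological space and $f:X \to [-\infty,\infty]$ is a function, then $f$ is lower semicontinuous if and only if $\operatorname{epi} f$ is a closed subset of $X \times \mathbb{R}$.

Proof.

Suppose that $f$ is lower semicontinuous and let $(x_i,\alpha_i) \in \operatorname{epi} f$ be a net that converges to $(x,\alpha) \in X \times \mathbb{R}$. Then $x_i \to x$ and $\alpha_i \to \alpha$, and Theorem 3 gives us

$$f(x) \leq \liminf f(x_i) \leq \liminf \alpha_i = \lim \alpha_i = \alpha.$$

Hence $f(x) \leq \alpha$, which means that $(x,\alpha) \in \operatorname{epi} f$. Therefore $\operatorname{epi} f$ is closed.

Suppose that $\operatorname{epi} f$ is closed and let $t \in \mathbb{R}$. The set

$$\operatorname{epi} f \cap (X \times \{t\}) = \{(x,t) : x \in X, \ t \geq f(x)\}$$

is a closed subset of $X \times \mathbb{R}$. This implies that $f^{-1}[-\infty,t]$ is a closed subset of $X$, which is equivalent to $f^{-1}(t,\infty]$ being an open subset of $X$. This is true for all $t \in \mathbb{R}$, so $f$ is lower semicontinuous. ∎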

The following lemma shows that if a convex lower semicontinuous function takes the value $-\infty$ then it is nowhere finite. This means that if there is some point at which a convex lower semicontinuous function takes a finite value then it does not take the value $-\infty$; namely, if a convex lower semicontinuous function takes a finite value at some point then it is proper.

Lemma 16.

If $X$ is a topological vector space, if $f:X \to [-\infty,\infty]$ is convex and lower semicontinuous, and if there is some $x_0 \in X$ such that $f(x_0)=-\infty$, then $f(x) \in \{-\infty,\infty\}$ for all $x \in X$.

Proof.

Suppose by contradiction that there is some $x \in X$ such that $-\infty<f(x)<\infty$. Because $f$ is convex, for every $\lambda \in (0,1]$ we have

$$f((1-\lambda)x+\lambda x_0) \leq (1-\lambda)f(x)+\lambda f(x_0) = \text{finite} - \infty = -\infty,$$

hence $f((1-\lambda)x+\lambda x_0)=-\infty$ for all $\lambda \in (0,1]$. Because $f$ is lower semicontinuous,

$$f\Bigl(\lim_{\lambda \to 0}\bigl((1-\lambda)x+\lambda x_0\bigr)\Bigr) \leq \liminf_{\lambda \to 0} f((1-\lambda)x+\lambda x_0),$$

and hence

$$f(x) \leq -\infty,$$

a contradiction. Therefore there is no xX such that -<f(x)<. ∎
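Two small examples may help to see the role of each hypothesis. They are hypothetical functions on $X=\mathbb{R}$ chosen by us for illustration. The first satisfies the lemma's hypotheses; the second is convex but not lower semicontinuous, and it does mix the value $-\infty$ with a finite value.

```latex
% Illustrative assumption: X = \mathbb{R}.
\[
f(x) = \begin{cases} -\infty & x \leq 0 \\ \infty & x > 0 \end{cases}
\qquad
\operatorname{epi} f = (-\infty,0] \times \mathbb{R},
\]
% \operatorname{epi} f is closed and convex, so f is convex and lower semicontinuous,
% and indeed f(x) \in \{-\infty,\infty\} for every x.
\[
g(x) = \begin{cases} -\infty & x < 0 \\ 0 & x = 0 \\ \infty & x > 0 \end{cases}
\qquad
\operatorname{epi} g = \bigl((-\infty,0) \times \mathbb{R}\bigr) \cup \bigl(\{0\} \times [0,\infty)\bigr),
\]
% \operatorname{epi} g is convex but not closed: g is convex, takes the finite value 0,
% and is not lower semicontinuous, so the lemma does not apply to it.
```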

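If $X$ is a topological space and $f:X \to [-\infty,\infty]$ is a function, the lower semicontinuous hull of $f$ is the function $\operatorname{lsc} f:X \to [-\infty,\infty]$ defined by

$$\operatorname{lsc} f = \bigvee \{g \in \mathrm{LSC}(X) : g \leq f\}.$$

By Theorem 4, $\operatorname{lsc} f \in \mathrm{LSC}(X)$. It is apparent that a function is lower semicontinuous if and only if it is equal to its lower semicontinuous hull. The following lemma shows that the epigraph of the lower semicontinuous hull of a function is equal to the closure of its epigraph (Jean-Paul Penot, Calculus Without Derivatives, p. 18, Proposition 1.21).

Lemma 17.

If $X$ is a topological space and $f:X \to [-\infty,\infty]$ is a function, then

$$\operatorname{epi} \operatorname{lsc} f = \overline{\operatorname{epi} f}.$$

Proof.

Check that $\overline{\operatorname{epi} f} = \operatorname{epi} g$ for $g:X \to [-\infty,\infty]$ defined by

$$g(x) = \liminf_{y \to x} f(y), \qquad x \in X,$$

and that $\operatorname{lsc} f = g$. Here $\mathscr{N}(x)$ denotes the neighborhood filter at $x$, and $\liminf_{y \to x} f(y)$ is defined to be $\sup_{N \in \mathscr{N}(x)} \inf_{y \in N} f(y)$, the infimum being taken over all of $N$, including the point $x$ itself, so that $g \leq f$. ∎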
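A worked instance of Lemma 17, assuming $X=\mathbb{R}$ with the usual topology and the illustrative choice $f=\chi_{[0,1]}$ (which is not lower semicontinuous, since $[0,1]$ is not open). The example is ours and is computed directly from the closure of the epigraph.

```latex
% Illustrative assumption: X = \mathbb{R}, f = \chi_{[0,1]}.
\[
\operatorname{epi} \chi_{[0,1]}
= \bigl([0,1] \times [1,\infty)\bigr) \cup \bigl((\mathbb{R} \setminus [0,1]) \times [0,\infty)\bigr).
\]
% Taking the closure adds exactly the points (0,\alpha) and (1,\alpha) with 0 \leq \alpha < 1:
\[
\overline{\operatorname{epi} \chi_{[0,1]}}
= \bigl((0,1) \times [1,\infty)\bigr) \cup \bigl((\mathbb{R} \setminus (0,1)) \times [0,\infty)\bigr)
= \operatorname{epi} \chi_{(0,1)},
\]
% so \operatorname{lsc} \chi_{[0,1]} = \chi_{(0,1)}, the characteristic function of the interior.
```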

The notion of being a convex function applies to functions on a vector space, and the notion of being lower semicontinuous applies to functions on a topological space. The following theorem shows that the lower semicontinuous hull of a convex function on a topological vector space is convex (R. Tyrrell Rockafellar, Conjugate Duality and Optimization, p. 15, Theorem 4).

Theorem 18.

If $X$ is a topological vector space and $f:X \to [-\infty,\infty]$ is convex, then $\operatorname{lsc} f$ is convex.

Proof.

For $\operatorname{lsc} f$ to be a convex function means that $\operatorname{epi} \operatorname{lsc} f$ is a convex set. But Lemma 17 tells us that $\operatorname{epi} \operatorname{lsc} f = \overline{\operatorname{epi} f}$. As $f$ is convex, the epigraph $\operatorname{epi} f$ is convex, and the closure of a convex set is convex (Walter Rudin, Functional Analysis, second ed., p. 11, Theorem 1.13). Hence $\operatorname{epi} \operatorname{lsc} f$ is convex, and so $\operatorname{lsc} f$ is a convex function. ∎

8 Extreme points

If $X$ is a vector space and $C$ is a subset of $X$, a nonempty subset $S$ of $C$ is called an extreme set of $C$ if $x,y \in C$, $0<t<1$ and $(1-t)x+ty \in S$ together imply that $x,y \in S$. An extreme point of $C$ is an element $x$ of $C$ such that the singleton $\{x\}$ is an extreme set of $C$. The set of extreme points of $C$ is denoted by $\operatorname{ext} C$. If $C$ is a convex set and $S$ is an extreme set of $C$ that is itself convex, then $S$ is called a face of $C$.

Lemma 19.

If $X$ is a vector space, if $C$ is a convex subset of $X$, and if $a \in C$, then $a$ is an extreme point of $C$ if and only if $C \setminus \{a\}$ is a convex set.

Proof.

Suppose that $a$ is an extreme point of $C$ and let $x,y \in C \setminus \{a\}$ and $0<t<1$. Because $C$ is convex, $(1-t)x+ty \in C$. If $(1-t)x+ty=a$, then because $a$ is an extreme point of $C$ we would have $x=a$ and $y=a$, contradicting $x,y \in C \setminus \{a\}$. Therefore $(1-t)x+ty \in C \setminus \{a\}$.

Suppose that $C \setminus \{a\}$ is a convex set. Suppose that $x,y \in C$, $0<t<1$, and $(1-t)x+ty=a$. Assume by contradiction that $x \neq a$. If $y=a$ then we get $(1-t)x+ta=a$, or $(1-t)x=(1-t)a$, hence $x=a$, a contradiction. If $y \neq a$, then using that $C \setminus \{a\}$ is convex, we have $(1-t)x+ty \in C \setminus \{a\}$, contradicting that $(1-t)x+ty=a$. Therefore $x=a$. We similarly show that $y=a$. Therefore $a$ is an extreme point of $C$. ∎
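The criterion of Lemma 19 is easy to test on an interval. The following check assumes $X=\mathbb{R}$ and $C=[0,1]$, an example of our choosing.

```latex
% Illustrative assumption: X = \mathbb{R}, C = [0,1].
\[
C \setminus \{0\} = (0,1] \ \text{is convex}, \qquad
C \setminus \{\tfrac12\} = [0,\tfrac12) \cup (\tfrac12,1] \ \text{is not convex},
\]
% because \tfrac12 = \tfrac12 \cdot 0 + \tfrac12 \cdot 1 is a convex combination of two
% points of C \setminus \{\tfrac12\}. The same argument applies to every a with 0 < a < 1, so
\[
\operatorname{ext} C = \{0,1\}.
\]
```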

The following lemma is about the set of maximizers of a convex function, and does not involve a topology on the vector space (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 296, Lemma 7.64).

Lemma 20.

If $X$ is a vector space, if $C$ is a convex subset of $X$, and if $f:C \to \mathbb{R}$ is convex, then

$$F = \Bigl\{x \in C : f(x) = \sup_{y \in C} f(y)\Bigr\}$$

is either an extreme set of $C$ or is empty.

Proof.

Suppose that there is some $x_0 \in C$ at which $f$ is maximized, i.e. that $F$ is nonempty, and let $M=f(x_0)$. Suppose that $x,y \in C$, $0<t<1$, and $(1-t)x+ty \in F$. We have $f(x) \leq M$ and $f(y) \leq M$. If at least one of $x,y$ does not belong to $F$, then at least one of these inequalities is strict, and, as $(1-t)x+ty \in F$ and as $f$ is convex,

$$M = f((1-t)x+ty) \leq (1-t)f(x)+tf(y) < (1-t)M+tM = M,$$

a contradiction. Therefore x,yF, showing that F is an extreme set of C. ∎

Elements of an extreme set need not be extreme points, but the following lemma shows that in a locally convex space if an extreme set is compact then it contains an extreme point (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 296, Lemma 7.65).

Lemma 21.

If $X$ is a real locally convex space, if $C$ is a subset of $X$, and if $F$ is a compact extreme set of $C$, then

$$F \cap \operatorname{ext} C \neq \emptyset.$$

Proof.

Define $\mathscr{G} = \{G \subset F : G \text{ is a compact extreme set of } C\}$. $F \in \mathscr{G}$ so $\mathscr{G}$ is nonempty. $\mathscr{G}$ is a partially ordered set ordered by set inclusion. If $T$ is a chain in $\mathscr{G}$, then the intersection of finitely many elements of $T$ is equal to the minimum of these elements, which is an extreme set and hence is nonempty. Therefore the chain $T$ has the finite intersection property, and because the elements of $T$ are closed subsets of $F$ and $F$ is compact, the intersection of all the elements of $T$ is nonempty (James Munkres, Topology, second ed., p. 169, Theorem 26.9). One checks that this intersection belongs to $\mathscr{G}$ (one must verify that it is an extreme set of $C$) and is a lower bound for $T$. We have shown that every chain in $\mathscr{G}$ has a lower bound in $\mathscr{G}$, and applying Zorn’s lemma, there is a minimal element $G$ in $\mathscr{G}$ (if an element of $\mathscr{G}$ is contained in $G$ then it is equal to $G$).

Assume by contradiction that there are $a,b \in G$ with $a \neq b$. Then there is some $\lambda \in X^*$ such that $\lambda a > \lambda b$. $G$ is compact and $\lambda$ is continuous, so

$$G_0 = \Bigl\{c \in G : \lambda c = \sup_{y \in G} \lambda y\Bigr\}$$

is nonempty and closed. We shall show that $G_0$ is an extreme set of $G$. Suppose that $x,y \in G$, $0<t<1$, and $(1-t)x+ty \in G_0$. Let $M = \sup_{y \in G} \lambda y$. If at least one of $x,y$ does not belong to $G_0$, then

$$M = \lambda((1-t)x+ty) = (1-t)\lambda x+t\lambda y < (1-t)M+tM = M,$$

a contradiction. Hence $x,y \in G_0$, showing that $G_0$ is an extreme set of $G$. Check that $G_0$ being an extreme set of $G$ implies that $G_0$ is an extreme set of $C$. Then $G_0 \in \mathscr{G}$, but as $\lambda a > \lambda b$ we have $b \notin G_0$, so that $G_0$ is strictly contained in $G$, contradicting that $G$ is a minimal element of $\mathscr{G}$. Therefore $G$ has a single element, as $G$ being an extreme set means that it is nonempty. But $G$ is an extreme set of $C$, and since $G$ is a singleton this means that the single point it contains is an extreme point of $C$. ∎

The following theorem gives conditions under which a function on a set has a maximizer that is an extreme point of the set.2626 26 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 298, Theorem 7.69.

Theorem 22 (Bauer maximum principle).

If $X$ is a real locally convex space, if $C$ is a compact convex subset of $X$, and if $f : C \to \mathbb{R}$ is an upper semicontinuous convex function, then there is a maximizer of $f$ that belongs to $\operatorname{ext} C$.

Proof.

Because f is upper semicontinuous and C is compact, it follows from Theorem 8 that

\[ F = \{x \in C : f(x) = \sup_{y \in C} f(y)\} \]

is a nonempty closed subset of C. Since C is convex and f is a convex function, by Lemma 20, F is an extreme set of C. F is a closed subset of the compact set C, so F is compact. Hence F is a compact extreme set of C, and by Lemma 21 there is an extreme point in F, which was the claim. ∎
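
As a hedged illustration of Theorem 22, take $X = \mathbb{R}$ and $C = [-1, 1]$, so that $\operatorname{ext} C = \{-1, 1\}$. The space, the set, and the functions $f$ and $g$ below are chosen only for this example and do not come from the cited references. The first line shows a continuous convex function whose maximizers are exactly the extreme points. The second shows a concave function whose unique maximizer is not an extreme point, so the conclusion should not be expected without convexity.

```latex
\begin{align*}
f(x) &= x^2:  & \sup_{y \in C} f(y) &= 1, & \{x \in C : f(x) = 1\} &= \{-1, 1\} = \operatorname{ext} C, \\
g(x) &= -x^2: & \sup_{y \in C} g(y) &= 0, & \{x \in C : g(x) = 0\} &= \{0\}, \quad 0 \notin \operatorname{ext} C.
\end{align*}
```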

9 Duality

Lemma 23.

A convex function on a real locally convex space is lower semicontinuous if and only if it is weakly lower semicontinuous.

Proof.

Let $X$ be a locally convex space and let $X_w$ denote this vector space with the weak topology, with which it is a locally convex space. As $\mathbb{R}$ is a locally convex space, the product $X \times \mathbb{R}$ is a locally convex space, and one checks that $X \times \mathbb{R}$ with the weak topology is $X_w \times \mathbb{R}$. Thus, to say that a subset of $X \times \mathbb{R}$ is weakly closed is equivalent to saying that it is closed in $X_w \times \mathbb{R}$. Furthermore, the closure of a convex set in a locally convex space is equal to its weak closure. [Footnote 27: Walter Rudin, Functional Analysis, second ed., p. 66, Theorem 3.12.] In particular, a convex subset of a locally convex space is closed if and only if it is weakly closed. Therefore, a convex subset of $X \times \mathbb{R}$ is closed if and only if it is closed in $X_w \times \mathbb{R}$.

By Lemma 15, a function $f : X \to [-\infty, \infty]$ is lower semicontinuous if and only if $\operatorname{epi} f$ is a closed subset of $X \times \mathbb{R}$, and is weakly lower semicontinuous if and only if $\operatorname{epi} f$ is a closed subset of $X_w \times \mathbb{R}$. If $f$ is convex then $\operatorname{epi} f$ is a convex subset of $X \times \mathbb{R}$, so by the previous paragraph these two conditions are equivalent. ∎

A topological vector space is said to have the Heine-Borel property if every closed and bounded subset of it is compact. The following theorem gives conditions under which a function is minimized on a set that is not necessarily compact but which is convex, closed, and bounded.

Theorem 24.

If $X$ is a real locally convex space such that $X_w$ has the Heine-Borel property, if $f : X \to [-\infty, \infty]$ is a lower semicontinuous convex function, and if $C$ is a convex closed bounded subset of $X$, then

\[ K = \{x \in C : f(x) = \inf_{y \in C} f(y)\} \]

is a nonempty closed subset of X.

Proof.

Because $X$ is a locally convex space and $C$ is convex, the fact that $C$ is closed implies that it is weakly closed. [Footnote 28: Walter Rudin, Functional Analysis, second ed., p. 66, Theorem 3.12.] It is straightforward to prove that if a subset of a topological vector space is bounded then it is weakly bounded. [Footnote 29: The converse is true in a locally convex space. Walter Rudin, Functional Analysis, second ed., p. 70, Theorem 3.18.] Thus, $C$ is weakly closed and weakly bounded, and because $X_w$ has the Heine-Borel property we get that $C$ is weakly compact. In other words, $C_w$ is compact, where $C_w$ is the set $C$ with the subspace topology inherited from $X_w$. By Lemma 23, $f$ is weakly lower semicontinuous, i.e. $f : X_w \to [-\infty, \infty]$ is lower semicontinuous. Thus, the restriction of $f$ to $C_w$ is lower semicontinuous. We have established that $C_w$ is compact and that the restriction of $f$ to $C_w$ is lower semicontinuous, so we can apply Theorem 8 (the extreme value theorem) to obtain that $K$ is a nonempty closed subset of $C_w$. Finally, $K$ being a closed subset of $C_w$ implies that $K$ is a closed subset of $X$, because $C$ is weakly closed and every weakly closed subset of $X$ is closed in $X$. ∎

10 Convex conjugation

If $X$ is a locally convex space and $X^*$ is its dual space, the strong dual topology on $X^*$ is the seminorm topology induced by the seminorms $\lambda \mapsto \sup_{x \in E} |\lambda x|$, where $E$ ranges over the bounded subsets of $X$. Because these seminorms are a separating family, $X^*$ with the strong dual topology is a locally convex space. (If $X$ is a normed space then the strong dual topology on $X^*$ is equal to the operator norm topology on $X^*$. [Footnote 30: Kôsaku Yosida, Functional Analysis, sixth ed., p. 111, Theorem 1.]) A locally convex space is said to be reflexive if the strong dual of its strong dual is isomorphic as a locally convex space to the original space.

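If $X$ is a real locally convex space, the convex conjugate (also called the Fenchel transform) of a function $f : X \to [-\infty, \infty]$ is the function $f^* : X^* \to [-\infty, \infty]$ defined by

\[ f^*(\lambda) = \sup\{\langle x, \lambda \rangle - f(x) : x \in X\} = \sup\{\langle x, \lambda \rangle - f(x) : x \in \operatorname{dom} f\}. \]

The convex biconjugate of $f$ is the function $f^{**} : X \to [-\infty, \infty]$ defined by

\[ f^{**}(x) = \sup\{\langle x, \lambda \rangle - f^*(\lambda) : \lambda \in X^*\} = \sup\{\langle x, \lambda \rangle - f^*(\lambda) : \lambda \in \operatorname{dom} f^*\}. \]

The convex biconjugate of a function on a real reflexive locally convex space is the convex conjugate of its convex conjugate. From the definition of $f^*$ it is apparent that for all $x \in X$ and $\lambda \in X^*$,

\[ \langle x, \lambda \rangle \leq f(x) + f^*(\lambda), \qquad (1) \]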

called Young’s inequality.
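
A minimal worked example, assuming $X = \mathbb{R}$ (so that $X^* = \mathbb{R}$ and $\langle x, \lambda \rangle = x\lambda$) and the function $f(x) = \frac{1}{2}x^2$, both chosen here purely for illustration. The supremum defining $f^*(\lambda)$ is attained at $x = \lambda$, and Young’s inequality (1) reduces to a familiar elementary inequality.

```latex
\begin{align*}
f^*(\lambda) &= \sup_{x \in \mathbb{R}} \left( x\lambda - \tfrac{1}{2}x^2 \right)
  = \lambda \cdot \lambda - \tfrac{1}{2}\lambda^2
  = \tfrac{1}{2}\lambda^2, \\
x\lambda &\leq \tfrac{1}{2}x^2 + \tfrac{1}{2}\lambda^2
  \quad \Longleftrightarrow \quad
  0 \leq \tfrac{1}{2}(x - \lambda)^2 .
\end{align*}
```

Applying the same computation to $f^*$ returns $\frac{1}{2}x^2$, so in this example $f^{**} = f$.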

The following theorem establishes some properties of the convex conjugates and convex biconjugates of any function from a real locally convex space to $[-\infty, \infty]$. [Footnote 32: Viorel Barbu and Teodor Precupanu, Convexity and Optimization in Banach Spaces, fourth ed., p. 77, Proposition 2.19.]

Theorem 25.

If $X$ is a real locally convex space and $f : X \to [-\infty, \infty]$ is a function, then

  • $f^*$ is convex and weak-* lower semicontinuous,

  • $f^{**}$ is convex and weakly lower semicontinuous,

  • $f^{**} \leq f$.

If $f_1, f_2 : X \to [-\infty, \infty]$ are functions satisfying $f_1 \leq f_2$, then $f_1^* \geq f_2^*$.

Proof.

For each $x \in X$, it is apparent that the function $\lambda \mapsto \langle x, \lambda \rangle$ is convex and weak-* continuous, and a fortiori is weak-* lower semicontinuous. Whether $f(x)$ is finite or infinite, the function $\lambda \mapsto \langle x, \lambda \rangle - f(x)$ is convex and weak-* lower semicontinuous. By Lemma 14, the supremum of a collection of convex functions is a convex function, and by Theorem 4 the supremum of a collection of lower semicontinuous functions is a lower semicontinuous function. $f^*$ is the supremum of this set of functions, and therefore is convex and weak-* lower semicontinuous.

For each $\lambda \in X^*$, the function $x \mapsto \langle x, \lambda \rangle - f^*(\lambda)$ is convex and is weakly continuous, and a fortiori is weakly lower semicontinuous. As $f^{**}$ is the supremum of this set of functions, $f^{**}$ is convex and weakly lower semicontinuous.

For every $x \in X$ and $\lambda \in X^*$ we have by Young’s inequality (1),

\[ \langle x, \lambda \rangle - f^*(\lambda) \leq f(x), \]

and hence for every $x \in X$,

\[ f^{**}(x) = \sup_{\lambda \in X^*} (\langle x, \lambda \rangle - f^*(\lambda)) \leq f(x), \]

and this means that $f^{**} \leq f$.

For $\lambda \in X^*$, because $f_1 \leq f_2$,

\[ f_2^*(\lambda) = \sup_{x \in X} (\langle x, \lambda \rangle - f_2(x)) \leq \sup_{x \in X} (\langle x, \lambda \rangle - f_1(x)) = f_1^*(\lambda), \]

which means that $f_1^* \geq f_2^*$. ∎
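
To see that the inequality between the biconjugate and the function in Theorem 25 can be strict, here is a sketch under the assumption $X = \mathbb{R}$, with a non-convex function chosen only for illustration: $f(x) = 0$ for $x \in \{-1, 1\}$ and $f(x) = \infty$ otherwise.

```latex
\begin{align*}
f^*(\lambda) &= \sup_{x \in \{-1, 1\}} x\lambda = \max\{\lambda, -\lambda\} = |\lambda|, \\
f^{**}(x) &= \sup_{\lambda \in \mathbb{R}} \left( x\lambda - |\lambda| \right)
  = \begin{cases} 0 & \text{if } |x| \leq 1, \\ \infty & \text{if } |x| > 1, \end{cases} \\
f^{**}(0) &= 0 < \infty = f(0).
\end{align*}
```

In this example the biconjugate is convex and lower semicontinuous, and it is finite exactly on the convex hull $[-1, 1]$ of $\{-1, 1\}$.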

We remind ourselves that to say that a convex function is proper means that at some point it takes a value other than $\infty$, and that it nowhere takes the value $-\infty$. The following lemma shows that any lower semicontinuous proper convex function on a real locally convex space is bounded below by a continuous affine functional.

Lemma 26.

If $X$ is a real locally convex space and $f : X \to (-\infty, \infty]$ is a lower semicontinuous proper convex function, then there is some $\mu \in X^*$ and some $c \in \mathbb{R}$ such that $f \geq \mu + c$.

Proof.

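The fact that $f$ is a convex function tells us that $\operatorname{epi} f$ is a convex subset of $X \times \mathbb{R}$, and as $f$ is proper, $\operatorname{dom} f \neq \emptyset$ and every $x_0 \in \operatorname{dom} f$ satisfies $f(x_0) > -\infty$. The fact that $f$ is lower semicontinuous tells us that $\operatorname{epi} f$ is a closed subset of $X \times \mathbb{R}$. Let $x_0 \in \operatorname{dom} f$. We have $(x_0, f(x_0) - 1) \notin \operatorname{epi} f$. The singleton $\{(x_0, f(x_0) - 1)\}$ is a compact convex set and $\operatorname{epi} f$ is a disjoint closed convex set, so we can apply the Hahn-Banach separation theorem to obtain that there is some $\Lambda \in (X \times \mathbb{R})^*$ and some $\gamma \in \mathbb{R}$ satisfying

\[ \Lambda(x, \alpha) < \gamma < \Lambda(x_0, f(x_0) - 1), \qquad (x, \alpha) \in \operatorname{epi} f. \]

There is some $\lambda \in X^*$ and some $\beta \in \mathbb{R}^* = \mathbb{R}$ such that $\Lambda(x, \alpha) = \lambda x + \beta\alpha$ for all $(x, \alpha) \in X \times \mathbb{R}$. So we have

\[ \lambda x + \beta\alpha < \gamma < \lambda x_0 + \beta(f(x_0) - 1), \qquad (x, \alpha) \in \operatorname{epi} f. \]

And $(x_0, f(x_0)) \in \operatorname{epi} f$, so

\[ \lambda x_0 + \beta f(x_0) < \lambda x_0 + \beta(f(x_0) - 1), \]

hence $\beta < 0$. If $x \in \operatorname{dom} f$ then $(x, f(x)) \in \operatorname{epi} f$ and

\[ \lambda x + \beta f(x) < \lambda x_0 + \beta(f(x_0) - 1). \]

Rearranging, and as $\beta < 0$,

\[ f(x) > -\frac{1}{\beta}\lambda x + \frac{1}{\beta}\lambda x_0 + f(x_0) - 1, \qquad x \in \operatorname{dom} f. \]

If $x \notin \operatorname{dom} f$ then $f(x) = \infty$, for which the above inequality also holds. Thus $f \geq \mu + c$ with $\mu = -\frac{1}{\beta}\lambda \in X^*$ and $c = \frac{1}{\beta}\lambda x_0 + f(x_0) - 1 \in \mathbb{R}$. ∎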

We proved in Theorem 25 that the conjugate of any function is convex and weak-* lower semicontinuous, which a fortiori gives that it is lower semicontinuous. In the following lemma we show that a convex lower semicontinuous function is proper if and only if its convex conjugate is proper. [Footnote 33: Viorel Barbu and Teodor Precupanu, Convexity and Optimization in Banach Spaces, fourth ed., p. 78, Corollary 2.21.]

Lemma 27.

If $X$ is a locally convex space and $f : X \to [-\infty, \infty]$ is a lower semicontinuous convex function, then $f$ is proper if and only if $f^*$ is proper.

Proof.

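Suppose that $f$ is proper. By Lemma 26 there is some $\mu \in X^*$ and some $c \in \mathbb{R}$ such that $f(x) \geq \mu x + c$ for all $x \in X$. For any $\lambda \in X^*$ we have

\[ f^*(\lambda) = \sup_{x \in X} (\lambda x - f(x)) \leq \sup_{x \in X} (\lambda x - \mu x - c), \]

thus $f^*(\mu) \leq -c < \infty$, so $\operatorname{dom} f^* \neq \emptyset$. And there is some $x_0 \in X$ such that $f(x_0) < \infty$, giving $\sup_{x \in X} (\lambda x - f(x)) \geq \lambda x_0 - f(x_0) > -\infty$, showing that $f^*(\lambda) > -\infty$ for all $\lambda \in X^*$. Therefore $f^*$ is proper.

Suppose that $f^*$ is proper. If $f$ took only the value $\infty$ then $f^*$ would take only the value $-\infty$, and $f^*$ being proper means that it in fact never takes the value $-\infty$; hence $f$ takes a value other than $\infty$. Let $x \in X$. As $f^*$ is proper there is some $\lambda \in X^*$ for which $f^*(\lambda) < \infty$, and using Young’s inequality (1) we get

\[ f(x) \geq -f^*(\lambda) + \langle x, \lambda \rangle > -\infty. \]

Thus, for every $x \in X$ we have $f(x) > -\infty$, so we have verified that $f$ is proper. ∎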

The following theorem is called the Fenchel-Moreau theorem, and gives necessary and sufficient conditions for a function to equal its convex biconjugate. [Footnote 34: Viorel Barbu and Teodor Precupanu, Convexity and Optimization in Banach Spaces, fourth ed., p. 79, Theorem 2.22.]

Theorem 28 (Fenchel-Moreau theorem).

If $X$ is a real locally convex space and $f : X \to [-\infty, \infty]$ is a function, then $f = f^{**}$ if and only if one of the following three conditions holds:

  1. $f$ is a proper convex lower semicontinuous function

  2. $f$ is the constant function $\infty$

  3. $f$ is the constant function $-\infty$

Proof.

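Suppose that $f$ is a proper convex lower semicontinuous function. From Lemma 27, its convex conjugate $f^*$ is a proper convex function. As $f^*$ does not take only the value $\infty$ we get from the definition of $f^{**}$ that $f^{**}(x) > -\infty$ for every $x \in X$. Theorem 25 tells us that $f^{**} \leq f$, and suppose by contradiction that there is some $x_0 \in X$ for which $-\infty < f^{**}(x_0) < f(x_0)$. For this $x_0$ we have $(x_0, f^{**}(x_0)) \notin \operatorname{epi} f$, so we can apply the Hahn-Banach separation theorem to the sets $\{(x_0, f^{**}(x_0))\}$ and $\operatorname{epi} f$ to get that there is some $\Lambda \in (X \times \mathbb{R})^*$ and some $\gamma \in \mathbb{R}$ for which

\[ \Lambda(x, \alpha) < \gamma < \Lambda(x_0, f^{**}(x_0)), \qquad (x, \alpha) \in \operatorname{epi} f. \]

But $\Lambda(x, \alpha)$ can be written as $\lambda x + \beta\alpha$ for some $\lambda \in X^*$ and some $\beta \in \mathbb{R}^* = \mathbb{R}$, so

\[ \lambda x + \beta\alpha < \gamma < \lambda x_0 + \beta f^{**}(x_0), \qquad (x, \alpha) \in \operatorname{epi} f. \qquad (2) \]

If $\beta$ were $> 0$ then the left-hand side of (2) would be unbounded above, because for $x \in \operatorname{dom} f$ there are arbitrarily large $\alpha$ such that $(x, \alpha) \in \operatorname{epi} f$. But the left-hand side is upper bounded by the constant right-hand side, so $\beta \leq 0$. Either $f(x_0) < \infty$ or $f(x_0) = \infty$. In the first case, $(x_0, f(x_0)) \in \operatorname{epi} f$, and applying (2) gives $\beta f(x_0) < \beta f^{**}(x_0)$. The assumption that $f^{**}(x_0) < f(x_0)$ then implies that $\beta < 0$. In the case that $f(x_0) = \infty$, assume by contradiction that $\beta = 0$. Then (2) becomes

\[ \lambda x < \gamma < \lambda x_0, \qquad x \in \operatorname{dom} f. \qquad (3) \]

Let $\mu \in \operatorname{dom} f^*$; $f^*$ is proper so $\operatorname{dom} f^* \neq \emptyset$. For all $h > 0$ we have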

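\begin{align*}
f^*(\mu + h\lambda) &= \sup\{\langle x, \mu + h\lambda \rangle - f(x) : x \in X\} \\
&= \sup\{\langle x, \mu + h\lambda \rangle - f(x) : x \in \operatorname{dom} f\} \\
&\leq \sup\{\langle x, \mu \rangle - f(x) : x \in \operatorname{dom} f\} + h \sup\{\langle x, \lambda \rangle : x \in \operatorname{dom} f\} \\
&= f^*(\mu) + h \sup\{\langle x, \lambda \rangle : x \in \operatorname{dom} f\}.
\end{align*}

But using the definition of $f^{**}$ we have $f^{**}(x_0) \geq \langle x_0, \mu + h\lambda \rangle - f^*(\mu + h\lambda)$, so

\begin{align*}
f^{**}(x_0) &\geq \langle x_0, \mu \rangle + h\langle x_0, \lambda \rangle - f^*(\mu + h\lambda) \\
&\geq \langle x_0, \mu \rangle + h\langle x_0, \lambda \rangle - f^*(\mu) - h \sup\{\langle x, \lambda \rangle : x \in \operatorname{dom} f\} \\
&= \langle x_0, \mu \rangle - f^*(\mu) + h\left(\langle x_0, \lambda \rangle - \sup\{\langle x, \lambda \rangle : x \in \operatorname{dom} f\}\right).
\end{align*}

But (3) tells us that $\langle x_0, \lambda \rangle - \sup\{\langle x, \lambda \rangle : x \in \operatorname{dom} f\} > 0$, and since the above inequality holds for arbitrarily large $h$ we get $f^{**}(x_0) = \infty$, contradicting that $f^{**}(x_0) < f(x_0)$. Therefore, $\beta < 0$, and then we can divide (2) by $\beta$ to obtain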

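\[ \frac{1}{\beta}\lambda x + \alpha > \frac{\gamma}{\beta} > \frac{1}{\beta}\lambda x_0 + f^{**}(x_0), \qquad (x, \alpha) \in \operatorname{epi} f, \]

hence

\begin{align*}
\lambda\left(-\frac{x_0}{\beta}\right) - f^{**}(x_0) &> \sup\left\{-\frac{1}{\beta}\lambda x - \alpha : (x, \alpha) \in \operatorname{epi} f\right\} \\
&= \sup\left\{-\frac{1}{\beta}\lambda x - f(x) : x \in \operatorname{dom} f\right\} \\
&= f^*\left(-\frac{1}{\beta}\lambda\right).
\end{align*}

From the definition of $f^{**}$ we have

\[ f^{**}(x_0) \geq \left\langle x_0, -\frac{1}{\beta}\lambda \right\rangle - f^*\left(-\frac{1}{\beta}\lambda\right), \]

and this and the above give

\[ \lambda\left(-\frac{x_0}{\beta}\right) - f^*\left(-\frac{1}{\beta}\lambda\right) > \left\langle x_0, -\frac{1}{\beta}\lambda \right\rangle - f^*\left(-\frac{1}{\beta}\lambda\right), \]

but the two sides are equal, a contradiction. Therefore no such $x_0$ exists, and $f^{**} = f$.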

Suppose that $f$ is the constant function $\infty$. Then $f^*$ is the constant function $-\infty$, and this means that $f^{**}$ is the constant function $\infty$, giving $f = f^{**}$.

Suppose that $f$ is the constant function $-\infty$. This implies that $f^*$ is the constant function $\infty$, and so $f^{**}$ is the constant function $-\infty$, giving $f = f^{**}$.

Suppose that $f = f^{**}$. By Theorem 25, $f^{**}$ is convex and weakly lower semicontinuous, so a fortiori it is lower semicontinuous, hence $f$ is convex and lower semicontinuous. Suppose that $f$ is neither the constant function $\infty$ nor the constant function $-\infty$. If $f$ took the value $-\infty$ then $f^*$ would take only the value $\infty$, and then $f^{**}$ would take only the value $-\infty$, contradicting that $f = f^{**}$ is not the constant function $-\infty$. Hence $f$ nowhere takes the value $-\infty$ and is not the constant function $\infty$, i.e. $f$ is proper. Therefore, if $f = f^{**}$ then either $f$ is the constant function $\infty$, or $f$ is the constant function $-\infty$, or $f$ is a proper convex lower semicontinuous function. ∎

If $X$ is a topological space and $f : X \to [-\infty, \infty]$ is a function, the closure of $f$ is the function $\operatorname{cl} f : X \to [-\infty, \infty]$ that is defined to be $\operatorname{lsc} f$ if $(\operatorname{lsc} f)(x) > -\infty$ for all $x \in X$, and defined to be the constant function $-\infty$ if there is some $x \in X$ such that $(\operatorname{lsc} f)(x) = -\infty$. We say that a function is closed if it is equal to its closure, and thus to say that a function is closed is to say that it is lower semicontinuous, and either does not take the value $-\infty$ or only takes the value $-\infty$. One checks that $(\operatorname{cl} f)^* = f^*$, and combined with the Fenchel-Moreau theorem one can obtain the following.

Corollary 29.

If $X$ is a real locally convex space and $f : X \to [-\infty, \infty]$ is a convex function, then $f^{**} = \operatorname{cl} f$.
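
A hedged worked example of Corollary 29, assuming $X = \mathbb{R}$ and a convex function that fails to be lower semicontinuous, chosen only for illustration: $f(x) = 0$ for $0 < x < 1$, $f(0) = f(1) = 1$, and $f(x) = \infty$ otherwise. The lower semicontinuous envelope of $f$ lowers the two endpoint values to $0$ and never takes the value $-\infty$, so it equals the closure of $f$. The computation below checks that the biconjugate gives the same function.

```latex
\begin{align*}
(\operatorname{cl} f)(x) &= (\operatorname{lsc} f)(x)
  = \begin{cases} 0 & \text{if } 0 \leq x \leq 1, \\ \infty & \text{otherwise}, \end{cases} \\
f^*(\lambda) &= \max\Big\{ \sup_{0 < x < 1} x\lambda,\; 0 \cdot \lambda - 1,\; 1 \cdot \lambda - 1 \Big\}
  = \max\{0, \lambda\}, \\
f^{**}(x) &= \sup_{\lambda \in \mathbb{R}} \big( x\lambda - \max\{0, \lambda\} \big)
  = \begin{cases} 0 & \text{if } 0 \leq x \leq 1, \\ \infty & \text{otherwise}. \end{cases}
\end{align*}
```

For $0 \leq x \leq 1$ the expression $x\lambda - \max\{0, \lambda\}$ is at most $0$, with equality at $\lambda = 0$. For $x > 1$ it tends to $\infty$ as $\lambda \to \infty$, and for $x < 0$ it tends to $\infty$ as $\lambda \to -\infty$.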