Subdifferentials of convex functions

Jordan Bell
April 21, 2014

1 Introduction

Whenever we speak about a vector space in this note we mean a vector space over . If X is a topological vector space then we denote by X* the set of all continuous linear maps X. X* is called the dual space of X, and is itself a vector space.11 1 In this note, we are following the presentation of some results in Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., chapter 7. Three other sources for material on subdifferentials are: Jean-Paul Penot, Calulus Without Derivatives, chapter 3; Viorel Barbu and Teodor Precupanu, Convexity and Optimization in Banach Spaces, fourth ed., §2.2, pp. 82–125; and Jean-Pierre Aubin, Optima and Equilibria: An Introduction to Nonlinear Analysis, second ed., chapter 4, pp. 57–73.

2 Definition of subdifferential

If X is a topological vector space, f:X[-,] is a function, xX, and λX*, then we say that λ is a subgradient of f at x if22 2 +=, --=-, and - is nonsense; if a, then a-=- and a+=.

f(y)f(x)+λ(y-x),yX.

The subdifferential of f at x is the set of all subgradients of f at x and is denoted by f(x). Thus f is a function from X to the power set of X*, i.e. f:X2X*. If f(x), we say that f is subdifferentiable at x.

It is immediate that if there is some y such that f(y)=-, then

f(x)={X*f(x)=-f(x)>-,xX.

Thus, little is lost if we prove statements about subdifferentials of functions that do not take the value -.

Theorem 1.

If X is a topological vector space, f:X[-,] is a function and xX, then f(x) is a convex subset of X*.

Proof.

If λ1,λ2f(x) and 0t1, then of course (1-t)λ1+tλ2X*. For any yX we have

f(y) =(1-t)f(y)+tf(y)
(1-t)f(x)+(1-t)λ1(y-x)+tf(x)+tλ2(y-x)
=f(x)+((1-t)λ1+tλ2)(y-x),

showing that (1-t)λ1+tλ2f(x) and thus that f(x) is convex. ∎

To say that 0f(x) is equivalent to saying that f(y)f(x) for all yX and so f(x)=infyXf(y). This can be said in the following way.

Lemma 2.

If X is a topological vector space and f:X[-,] is a function, then x is a minimizer of f if and only if 0f(x).

3 Convex functions

If X is a set and f:X[-,] is a function, then the epigraph of f is the set

epif={(x,α)X×:αf(x)},

and the effective domain of f is the set

domf={xX:f(x)<}.

To say that xdomf is equivalent to saying that there is some α such that (x,α)epif. We say that f is finite if -<f(x)< for all xX.

If X is a vector space and f:X[-,] is a function, then we say that f is convex if epif is a convex subset of the vector space X×.

If X is a set and f:X[-,] is a function, we say that f is proper if it does not take only the value and never takes the value -. It is unusual to talk merely about proper functions rather than proper convex functions; we do so to make clear how convexity is used in the results we prove.

4 Weak-* topology

Let X be a topological vector space and for xX define ex:X* by exλ=λx. The weak-* topology on X* is the initial topology for the set of functions {ex:xX}, that is, the coarsest topology on X* such that for each xX, the function ex:X* is continuous.

Lemma 3.

If X is a topological vector space, τ1 is the weak-* topology on X*, and τ2 is the subspace topology on X* inherited from RX with the product topology, then τ1=τ2.

Proof.

Let λiX* converge in τ1 to λX*. For each xX, the function ex:X* is τ1 continuous, so exλiexλ, i.e. λixλx. But for fiX to converge to fX means that for each x, we have fi(x)f(x). Thus λi converges to λ in τ2. This shows that τ2τ1.

Let xX, and let λiX* converge in τ2 to λX*. We then have exλi=λixλx=exλ; since λi was an arbitrary net that converges in τ2, this shows that ex is τ2 continuous. Thus, we have shown that for each xX, the function ex is τ2 continuous. But τ1 is the coarsest topology for which ex is continuous for all xX, so we obtain τ1τ2. ∎

In other words, the weak-* topology on X* is the topology of pointwise convergence. We now prove that at each point in the effective domain of a proper function on a topological vector space, the subdifferential is a weak-* closed subset of the dual space.33 3 cf. Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 265, Theorem 7.13.

Theorem 4.

If X is a topological vector space, f:X(-,] is a proper function, and xdomf, then f(x) is a weak-* closed subset of X*.

Proof.

If λf(x), then for all yX we have

f(y)f(x)+λ(y-x),

so, for any vX, using y=v+x,

f(v+x)f(x)+λv,

or,

λvf(x+v)-f(x);

this makes sense because f(x) is finite. On the other hand, let λX*. If λvf(x+v)-f(x) for all vX, then λ(v-x)f(v)-f(x), i.e. f(v)f(x)+λ(v-x), and so λf(x). Therefore

f(x)=vX{λX*:λvf(x+v)-f(x)}. (1)

Defining ev:X* for vX by evλ=λv, for each vX we have

ev-1(-,f(x+v)-f(x)]={λX*:λvf(x+v)-f(x)}.

Because ev is continuous, this inverse image is a closed subset of X*. Therefore, each of the sets in the intersection (1) is a closed subset of X*, and so f(x) is a closed subset of X*. ∎

5 Support points

If X is set, A is a subset of X, and f:X[-,] is a function, we say that xX is a minimizer of f over A if

f(x)=infyAf(y),

and that x is a maximizer of f over A if

f(x)=supyAf(y).

If A is a nonempty subset of a topological vector space X and xA, we say that x is a support point of A if there is some nonzero λX* for which x is a minimizer or a maximizer of λ over A. Moreover, x is a minimizer of λ over A if and only if x is a maximizer of -λ over A. Thus, if we know that x is a support point of a set A, then we have at our disposal both that x is a minimizer of some nonzero element of X* over A and that x is a maximizer of some nonzero element of X* over A.

If x is a support point of A and A is not contained in the hyperplane {yX:λy=λx}, we say that A is properly supported at x. To say that A is not contained in the set {yX:λy=λx} is equivalent to saying that there is some yA such that λyλx.

In the following lemma, we show that the support points of a set A are contained in the boundary A of the set.

Lemma 5.

If X is a topological vector space, A is a subset of X, and x is a support point of A, then xA.

Proof.

Because x is a support point of A there is some nonzero λX* for which x is a maximizer of λ over A:

λx=supyAλy.

As λ is nonzero there is some yX with λy>λx. For any t>0,

(1-t)λx+tλy=λ((1-t)x+ty)=(1-t)λx+tλy>(1-t)λx+tλx=λx,

hence if t>0 then (1-t)λx+tyA. But (1-t)x+tyx as t0 and xA, showing that xA. ∎

The following lemma gives conditions under which a boundary point of a set is a proper support point of the set.44 4 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 259, Lemma 7.7.

Lemma 6.

If X is a topological vector space, C is a convex subset of X that has nonempty interior, and xCC, then C is properly supported at x.

Proof.

The Hahn-Banach separation theorem55 5 Gert K. Pedersen, Analysis Now, revised printing, p. 65, Theorem 2.4.7. tells us that if A and B are disjoint nonempty convex subsets of X and A is open then there is some λX* and some t such that

λa<tλb,aA,bB.

Check that the interior of a convex set in a topological vector space is convex, and hence that we can apply the Hahn-Banach separation theorem to {x} and C: as x belongs to the boundary of C it does not belong to the interior of C, so {x} and C are disjoint nonempty convex sets. Thus, there is some λX* and some t such that λy<tλx for all yC, from which it follows that λxλy for all yC, and λ0 because of the strict inequality for the interior. As xC, this means that x is a maximizer of λ over C, and as λ0 this means that x is a support point of C. But C is nonempty and if yC then λx<λy, hence x is a proper support point of C. ∎

6 Subdifferentials of convex functions

If f:X(-,] is a proper function then there is some yX for which f(y)<, and for f to have a subgradient λ at x demands that f(y)f(x)+λ(y-x), and hence that f(x)<. Therefore, if f is a proper function then the set of x at which f is subdifferentiable is a subset of domf.

We now prove conditions under which a function is subdifferentiable at a point, i.e., under which the subdifferential at that point is nonempty.66 6 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 265, Theorem 7.12.

Theorem 7.

If X is a topological vector space, f:X(-,] is a proper convex function, x is an interior point of domf, and f is continuous at x, then f has a subgradient at x.

Proof.

Because f is convex, the set domf is convex, and the interior of a convex set in a topological vector space is convex so (domf) is convex. f is proper so it does not take the value -, and on domf it does not take the value , hence f is finite on domf. But for a finite convex function on an open convex set in a topological vector space, being continuous at a point is equivalent to being continuous on the set, and is also equivalent to being bounded above on an open neighborhood of the point.77 7 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 188, Theorem 5.43. Therefore, f is continuous on (domf) and is bounded above on some open neighborhood V of x contained in (domf), say f(y)M for all yV. V×(M,) is an open subset of X×, and is contained in epif. This shows that epif has nonempty interior. Since f(x)<, if ϵ>0 then (x,f(x)-ϵ)epif, and since f(x)>- we have (x,f(x))epif, and therefore (x,f(x))epif(epif). We can now apply Lemma 6: epif is a convex subset of the topological vector space X× with nonempty interior and (x,f(x))epif(epif), so epif is properly supported at (x,f(x)). That is, Lemma 6 shows that there is some Λ(X×)* such that

Λ(x,f(x))=sup(y,α)epifΛ(y,α),

and there is some (y,α)epif for which Λ(x,f(x))>Λ(y,α). Now, there is some λX* and some β*= such that Λ(y,α)=λy+βα for all (y,α)X×. Thus, there is some nonzero λX* and some β such that

λx+βf(x)=sup(y,α)epifλy+βα.

If β>0 then the right-hand side would be while the left-hand side is constant and <, so β0. Suppose by contradiction that β=0. Then λxλy for all ydomf, and as λ0 this means that x is a support point of domf, and then by Lemma 5 we have that x(domf), contradicting x(domf). Hence β<0, so

λx+βf(x)λy+βf(y),ydomf,

i.e.,

f(y)f(x)-λβ(y-x),ydomf.

Furthermore, if ydomf then f(y)=, for which the above inequality is true. Therefore, f(y)f(x)-λβ(y-x) for all yX, showing that -λβ is a subgradient of f at x. ∎

7 Directional derivatives

Lemma 8.

If X is a vector space, f:X(-,] is a proper convex function, xdomf, vX, and 0<h<h, then

f(x+hv)-f(x)hf(x+hv)-f(x)h.
Proof.

We have

x+hv=hh(x+hv)+h-hhx,

and because f is convex this gives

f(x+hv)hhf(x+hv)+h-hhf(x),

i.e.

f(x+hv)-f(x)hh(f(x+hv)-f(x)).

Dividing by h,

f(x+hv)-f(x)hf(x+hv)-f(x)h.

If f:X(-,] is a proper convex function, xdomf, and vX, then the above lemma shows that

hf(x+hv)-f(x)h

is an increasing function (0,)(-,], and therefore that

limh0+f(x+hv)-f(x)h

exists; it belongs to [-,], and if there is at least one h>0 for which f(x+hv)< then the limit will be <. We define the one-sided directional derivative of f at x to be the function d+f(x):X[-,] defined by88 8 We are following the notation of Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 266.

d+f(x)v=limh0+f(x+hv)-f(x)h,vX.
Lemma 9.

If X is a topological vector space, f:X(-,] is a proper convex function, x(domf), f is continuous at x, and vX, then -<d+f(x)v<.

Proof.

Because x(domf), there is some h>0 for which x+hvdomf and hence for which f(x+hv)<. This implies that d+f(x)v<.

Let h>0. By Theorem 7, the subdifferential f(x) is nonempty, i.e. there is some λX* for which f(y)f(x)+λ(y-x) for all yX. Thus, for all vX we have, with y=x+hv,

f(x+hv)f(x)+λ(hv),

i.e.,

λvf(x+hv)-f(x)h.

Since this difference quotient is bounded below by λv, its limit as h0+ is >-, and therefore d+f(x)v>-. ∎