Positive definite functions, completely monotone functions, the Bernstein-Widder theorem, and Schoenberg’s theorem

Jordan Bell
June 26, 2015

1 Linear operators

For a complex Hilbert space H let (H) be the bounded linear operators HH. It is a fact that A(H) is self-adjoint if and only if Ah,h for all hH.11 1 John B. Conway, A Course in Functional Analysis, second ed., p. 33, Proposition 2.12. For a bounded self-adjoint operator A it is a fact that22 2 John B. Conway, A Course in Functional Analysis, second ed., p. 34, Proposition 2.13.

A=suph=1|Ah,h|.

A(H) is called positive if it is self-adjoint and

Ah,h0,hH;

because we have taken H to be a complex Hilbert space, for A to be positive it suffices that the inequality is satisfied.

For A,B(n), we define their Hadamard product A*B(H) by

(A*B)ei=j=1nAei,ejBei,ejej.

So,

(A*B)ei,ej=Aei,ejBei,ej.

The Schur product theorem states that if A,B(n) are positive then their Hadamard product A*B is positive.33 3 Ward Cheney and Will Light, A Course in Approximation Theory, p. 81, chapter 12.

2 Positive definite functions

Let X be a real or complex linear space, let f:X be a function, and for x1,,xnX, define Ff;x1,,xn(n) by

Ff;x1,,xnei=j=1nf(xi-xj)ej,

where {e1,,en} is the standard basis for n. Thus for u=i=1nuiein,

Ff;x1,,xnu,u =i=1nuij=1nf(xi-xj)ej,k=1nukek
=i=1nuij=1nf(xi-xj)ej,k=1nukek
=i=1nj=1nuiuj¯f(xi-xj).

We call f positive definite if for all x1,,xnX, Ff;x1,,xn is a positive operator, i.e. for un,

Ff;x1,,xnu,u0.

We call f strictly positive definite for all distinct x1,,xnX and nonzero un,

Ff;x1,,xnu,u>0.

3 Completely monotone functions

A function f:[0,) is called completely monotone if

  1. 1.

    fC[0,)

  2. 2.

    fC(0,)

  3. 3.

    (-1)kf(k)(x)0 for k0 and x(0,)

Because a completely monotone function is continuous, f(x) tends to f(0) as x0. Because a completely monotone function is nonincreasing and convex, f(x) has a limit, which we call f(), as x.

The Bernstein-Widder theorem states that a function f satisfying f(0)=1 is completely monotone if and only if it is the Laplace transform of a Borel probability measure on [0,).44 4 Peter D. Lax, Functional Analysis, p. 138, chapter 14, Theorem 3; http://djalil.chafai.net/blog/2013/03/23/the-bernstein-theorem-on-completely-monotone-functions/

Theorem 1 (Bernstein-Widder theorem).

A function f:[0,) satisfies f(0)=1 and is completely monotone if and only if there is a Borel probability measure μ on [0,) such that

f(x)=0e-xt𝑑μ(t),x[0,).
Proof.

If f is the Laplace transform of some probability measure μ on [0,), then using the dominated convergence theorem yields that f is continuous and by induction that fC(0,). For k0 and for x(0,),

f(k)(x)=0(-t)ke-xt𝑑μ(t),

as 0tke-xt𝑑μ(t)0 so (-1)kf(k)(x)0. Hence f is completely monotone, and f(0)=0𝑑μ(t)=1.

If f satisfies f(0)=1 and is completely monotone, then for each k0, the function (-1)kf(k):(0,) is nonnegative and is nonincreasing, so for k1 and t(0,), using that (-1)kf(k) is nondecreasing and that (-1)k-1f(k-1) is nonnegative,

(-1)kf(k)(t) 2tt/2t(-1)kf(k)(u)𝑑u
=2t(-1)k(f(k-1)(t)-f(k-1)(t/2))
2t(-1)k-1f(k-1)(t/2).

Doing induction, for any k1,

(-1)kf(k)(t) j=1k-1(2jt)f(t2k-1)
j=1k-1(2jt)2kt(f(t2k-1)-f(t2k))
=t-k2k(k-1)/2(f(t2k-1)-f(t2k)).

Because f(x)f(0) as x0,

f(t2k-1)-f(t2k)0,t0,

and because f(x)f() as x,

f(t2k-1)-f(t2k)0,t.

Hence for each k1,

|f(t)|=ok(t-k),t0, (1)

and

|f(t)|=ok(t-k),t. (2)

Furthermore, for any x(0,), f(k)(t)f(k)(x) as tx, so it is immediate that

(t-x)kf(k)(t)0,tx. (3)

For x0 and k1, integrating by parts, using (2) and (1) or (3) respectively as x=0 or x>0,

f(x)-f() =-xf(t)𝑑t
=-(t-x)f(t)|x+xf′′(t)(t-x)𝑑t
=xf′′(t)(t-x)𝑑t
=(t-x)22f′′(t)|x-xf′′′(t)(t-x)22𝑑t
=-xf′′′(t)(t-x)22𝑑t
=(-1)kxf(k)(t)(t-x)k-1(k-1)!𝑑t.

Hence for x0 and n0,

f(x)-f()=(-1)n+1n!xf(n+1)(t)(t-x)n𝑑t.

Define

ϕn(y)=(1-y/n)n1[0,n](y).

For n1, by change of variables,

f(x)-f() =(-1)n+1n!x/nf(n+1)(nu)(nu-x)nn𝑑u
=(-1)n+1(n-1)!x/nf(n+1)(nu)(nu)n(1-xnu)n𝑑u
=(-1)n+1(n-1)!0(nu)nϕn(x/u)f(n+1)(nu)𝑑u
=(-1)n+1(n-1)!0(n/t)nϕn(xt)f(n+1)(n/t)t-2𝑑t.

For t0, define

sn(t)=(-1)n+1(n-1)!1/t(nu)nf(n+1)(nu)𝑑u,

where sn(0)=0, and for t<0 let sn(t)=0. sn is continuous and because f is completely monotone, sn is nondecreasing, so there is a unique positive measure σn on such that55 5 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 393, Theorem 10.48.

σn((a,b])=sn(b)-sn(a),ab.

On the other hand, sn is absolutely continuous, so σn is absolutely continuous with respect to Lebesgue measure λ1, and for λ1-almost all t,66 6 H. L. Royden, Real Analysis, third ed., p. 303, Exercise 16.

dσndλ1(t)=sn(t).

Now for t>0, by the fundamental theorem of calculus and the chain rule,

sn(t)=(-1)n+1(n-1)!(n/t)nf(n+1)(n/t)t-2,

and therefore

f(x)-f() =0ϕn(xt)sn(t)𝑑λ1(t)
=0ϕn(xt)dσndλ1(t)𝑑λ1(t)
=0ϕn(xt)𝑑σn(t).

The total variation of σn is equal to the total variation of sn, and because sn is nondecreasing,

σn=0|sn(t)|𝑑t=0sn(t)𝑑t=sn()-sn(0)=sn(),

which is

σn=(-1)n+1(n-1)!0(nu)nf(n+1)(nu)𝑑u=f(0)-f(),

showing that {σn:n1} is bounded for the total variation norm. We claim that {σn:n1} is tight: for each ϵ>0 there is a compact subset Kϵ of such that σn(Kϵc)<ϵ for all n. Taking this for granted, Prokhorov’s theorem77 7 V. I. Bogachev, Measure Theory, volume II, p. 202, Theorem 8.6.2. states that there is a subsequence σkn of σn that converges narrowly to some positive measure σ on . Finally, the sequence tϕn(xt) tends in Cb([0,)) to te-xt, and it thus follows that88 8 cf. Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 511, Corollary 15.7.

0ϕn(xt)𝑑σn(t)0e-xt𝑑σ(t),

so

f(x)-f()=0e-xt𝑑σ(t).

Let

μ=σ+f()δ0,

with which

0e-xt𝑑μ(t)=0e-xt𝑑σ(t)+f(),

hence

f(x)=0e-xt𝑑μ(t).

Because f(0)=1, 0𝑑μ(t)=1, showing that μ is a probability measure. ∎

4 Fourier transforms

For a topological space X and a positive Borel measure μ on X, FX is called a support of μ if (i) F is closed, (ii) μ(Fc)=0, and (iii) if G is open and GF then μ(GF)>0. If F1 and F2 are supports of μ, it is straightforward that F1=F2. It is a fact that if X is second-countable then μ has a support, which we denote by suppμ.99 9 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 442, Theorem 12.14.

Lemma 2.

If μ is a Borel measure on a topological space X and μ has a support suppμ, if f:X[0,) is continuous and Xf𝑑μ=0 then f(x)=0 for all xsuppμ.

Proof.

Let F=suppμ and let E={xX:f(x)0}. E is an open subset of X. Suppose by contradiction that there is some xEF, i.e. that EF. Because f is continuous and f(x)>0, there is some open neighborhood G of x for which f(y)>f(x)/2 for yU. Then xGF, so GF and because F is the support of μ, μ(GF)>0 and a fortiori μ(G)>0. Then

0=Xf𝑑μGf(y)𝑑μ(y)Gf(x)2𝑑μ(y)=f(x)2μ(G)>0,

a contradiction. Therefore EF=, i.e. for all xF, f(x)=0. ∎

The following lemma asserts that a certain function is nonzero λd-almost everywhere, where λd is Lebesgue measure on d.1010 10 Ward Cheney and Will Light, A Course in Approximation Theory, p. 91, chapter 13, Lemma 6.

Lemma 3.

Let x1,,xn be distinct points in d, let un not be the zero vector, and define

g(y)=j=1nuje-2πixjy,yd.

For λd-almost all yd, g(y)0.

The following theorem gives conditions under which the Fourier transform of a Borel measure on d is strictly positive definite.1111 11 Ward Cheney and Will Light, A Course in Approximation Theory, p. 92, chapter 13, Theorem 3.

Theorem 4.

If μ is a finite Borel measure on d and λd(suppμ)>0, then μ^:d is strictly positive definite.

Proof.

For distinct x1,,xnd and for nonzero un,

j=1nk=1nujuk¯μ^(xj-xk) =j=1nk=1nujuk¯de-2πi(xj-xk)y𝑑μ(y)
=d(j=1nuje-2πixjy)(k=1nuke-2πixky)¯𝑑μ(y)
=d|j=1nuje-2πixjy|2𝑑μ(y)
=d|g(y)|2𝑑μ(y).

It is apparent that this is nonnegative. If it is equal to 0 then because g is continuous we obtain from Lemma 2 that |g(y)|2=0 for all ysuppμ, i.e. g(y)=0 for all ysuppμ. In other words,

suppμ{yd:g(y)=0}.

But by Lemma 3, λd({yd:g(y)=0})=0, so λd(suppμ)=0, contradicting the hypothesis λd(suppμ)>0. Therefore

d|g(y)|2𝑑μ(y)>0,

which shows that μ^ is strictly positive definite. ∎

5 Schoenberg’s theorem

Let (X,,) be a real inner product space. We call a function F:X radial when x=y implies that F(x)=F(y).

An identity that is worth memorizing is that for y,

e-πx2e-2πixy𝑑x=e-πy2.

Using this and Fubini’s theorem yields, yd,

de-π|x|2e-2πx,y=e-π|y|2.
Lemma 5.

For α>0 and yd,

d(πα)d/2exp(-π2α|x|2)e-2πix,y𝑑x=e-α|y|2.
Proof.

Define T:dd by

T(x)=παx,xd.

T(x)=παI(d) and JT(x)=detT(x)=(πα)d/2. Let ud and define f(x)=e-π|x|2e-2πix,u. By the change of variables formula,1212 12 Charalambos D. Aliprantis and Owen Burkinshaw, Principles of Real Analysis, third ed., p. 393, Theorem 40.7.

d(fT)|JT|𝑑λd=T(d)f𝑑λd,

and because T is self-adjoint this is

de-π|T(x)|2e-2πix,Tu(πα)d/2𝑑x=de-π|x|2e-2πix,u𝑑x,

and therefore

d(πα)d/2exp(-π2α|x|2)e-2πix,Tu𝑑x=e-π|u|2.

For u=T-1(y)=απy this is

d(πα)d/2exp(-π2α|x|2)e-2πix,y𝑑x=e-α|y|2,

proving the claim. ∎

We now prove that on a real inner product space, xe-αx2 is strictly positive definite whenever α>0.1313 13 Ward Cheney and Will Light, A Course in Approximation Theory, p. 104, chapter 15, Theorem 2.

Theorem 6.

Let (X,,) be a real inner product space. If α>0, then

xe-αx2,xX,

is radial and strictly positive definite.

Proof.

Let x1,,xn be distinct points in X. There is an n-dimensional linear subspace V of X that contains x1,,xn. By the Gram-Schmidt process, V has an orthonormal basis {v1,,vn}. Define T:Vn by Tvj=ej, where {e1,,en} is the standard basis for n, which is an orthogonal transformation, and define

f(u)=e-α|u|2,ud.

For un, u0,

j=1nk=1nujuk¯e-αxj-xk2 =j=1nk=1nujuk¯exp(-α|T(xj-xk)|2)
=j=1nk=1nujuk¯f(Txj-Txk).

Now, let μ be the Borel measure on d whose density with respect to λd is

y(πα)d/2exp(-π2α|y|2).

Because μ is absolutely continuous with respect to λd, λd(suppμ)>0, so Theorem 4 states that the Fourier transform μ^:d is strictly positive definite. Applying Lemma 5, the Fourier transform of μ is

μ^(u)=d(πα)d/2exp(-π2α|y|2)e-2πiy,u𝑑y=e-α|u|2=f(u),

so f is strictly positive definite. Because T is an orthogonal transformation it is in particular one-to-one, so Tx1,,Txn are distinct points in d. Thus the fact that f is strictly positive definite means that

j=1nk=1nujuk¯e-αxj-xk2=j=1nk=1nujuk¯f(Txj-Txk)>0,

which establishes that xe-αx2 is strictly positive definite. ∎

The following is Schoenberg’s theorem.1414 14 Ward Cheney and Will Light, A Course in Approximation Theory, p. 101, chapter 15, Theorem 1; René L. Schilling, Renming Song, and Zoran Vondraček, Bernstein Functions: Theory and Applications, p. 142, Theorem 12.14; William F. Donoghue Jr., Distributions and Fourier Transforms, p. 205, §41.

Theorem 7 (Schoenberg’s theorem).

Let (X,,) be a real inner product space. If f:[0,) is completely monotone, f(0)=1, and f is not constant, then

xf(x2),X[0,),

is radial and strictly positive definite.

Proof.

Because f is completely monotone, the Bernstein-Widder theorem (Theorem 1) tells us that there is a Borel probability measure μ on [0,) such that

f(t)=0e-st𝑑μ(s),t[0,),

that is, f is the Laplace transform of μ. Now, the Laplace transform of δ0 is t1, and because f is not constant, the Laplace transform of μ is not equal to the Laplace transform of δ0, which implies that μδ0.1515 15 Bert Fristedt and Lawrence Gray, A Modern Approach to Probability Theory, p. 218, §13.5, Theorem 6. Therefore μ((0,))>0.

Let x1,,xn be distinct points in X and let un, u0. Then, because j=1nk=1nujuk¯0,

j=1nk=1nujuk¯f(xj-xk2) =j=1nk=1nujuk¯0exp(-sxj-xk2)𝑑μ(s)
=0j=1nk=1nujuk¯exp(-sxj-xk2)dμ(s)
=j=1nk=1nujuk¯μ({0})
+01(0,)(s)j=1nk=1nujuk¯exp(-sxj-xk2)dμ(s)
01(0,)(s)j=1nk=1nujuk¯exp(-sxj-xk2)dμ(s)
=0g(s)𝑑μ(s).

Assume by contradiction that 0g(s)𝑑μ(s)=0. Because g0, this implies that μ({s[0,):g(s)>0})=0.1616 16 Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 411, Theorem 11.16. By Theorem 6, for each s>0,

j=1nk=1nujuk¯exp(-sxj-xk2)>0,

so g(s)>0 when s>0. Thus μ((0,))=0, a contradiction. Therefore,

j=1nk=1nujuk¯f(xj-xk2)=0g(s)𝑑μ(s)>0,

which shows that xf(x2) is strictly positive definite. ∎