Gaussian measures and Bochner’s theorem

Jordan Bell
April 30, 2015

1 Fourier transforms of measures

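Let $m_n$ be normalized Lebesgue measure on $\mathbb{R}^n$: $dm_n(x)=(2\pi)^{-n/2}\,dx$. If $\mu$ is a finite positive Borel measure on $\mathbb{R}^n$, the Fourier transform of $\mu$ is the function $\hat{\mu}:\mathbb{R}^n\to\mathbb{C}$ defined by

$$\hat{\mu}(\xi)=\int_{\mathbb{R}^n} e^{-i\xi\cdot x}\,d\mu(x),\qquad \xi\in\mathbb{R}^n.$$

One proves using the dominated convergence theorem that $\hat{\mu}$ is continuous. If $f\in L^1(\mathbb{R}^n)$, the Fourier transform of $f$ is the function $\hat{f}:\mathbb{R}^n\to\mathbb{C}$ defined by

$$\hat{f}(\xi)=\int_{\mathbb{R}^n} e^{-i\xi\cdot x}f(x)\,dm_n(x),\qquad \xi\in\mathbb{R}^n.$$

Likewise, using the dominated convergence theorem, $\hat{f}$ is continuous. One proves that if $f\in L^1(\mathbb{R}^n)$ and $\hat{f}\in L^1(\mathbb{R}^n)$ then, for almost all $x\in\mathbb{R}^n$,

$$f(x)=\int_{\mathbb{R}^n} e^{ix\cdot\xi}\hat{f}(\xi)\,dm_n(\xi).$$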

As

$$\hat{\mu}(0)=\int_{\mathbb{R}^n} d\mu(x)=\mu(\mathbb{R}^n),$$

$\mu$ is a probability measure if and only if $\hat{\mu}(0)=1$. (By a probability measure we mean a positive measure with mass 1.)

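If $\phi\in L^1(\mathbb{R}^n)$ is continuous and $\hat{\phi}\in L^1(\mathbb{R}^n)$, so that the inversion formula holds at every point, then, inverting the Fourier transform and using Fubini's theorem,

$$\begin{aligned}
\langle \phi,\mu\rangle &= \int_{\mathbb{R}^n}\phi(x)\,d\mu(x)\\
&= \int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}\hat{\phi}(\xi)e^{ix\cdot\xi}\,dm_n(\xi)\right)d\mu(x)\\
&= \int_{\mathbb{R}^n}\hat{\phi}(\xi)\int_{\mathbb{R}^n}e^{i\xi\cdot x}\,d\mu(x)\,dm_n(\xi)\\
&= \int_{\mathbb{R}^n}\hat{\phi}(\xi)\hat{\mu}(-\xi)\,dm_n(\xi)\\
&= \int_{\mathbb{R}^n}\hat{\phi}(-\xi)\hat{\mu}(\xi)\,dm_n(\xi).
\end{aligned}$$
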
Theorem 1.

If $\mu$ and $\nu$ are finite Borel measures on $\mathbb{R}^n$ and $\hat{\mu}=\hat{\nu}$, then $\mu=\nu$.

Proof.

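To prove that $\mu=\nu$ it suffices to prove that for any ball $B$ in $\mathbb{R}^n$ we have $\mu(B)=\nu(B)$. Let $\phi_n\in C_c^\infty(\mathbb{R}^n)$ satisfy $0\leq\phi_n\leq 1$ and $\phi_n\to\chi_B$ pointwise; each $\hat{\phi}_n$ is integrable because $\phi_n$ is smooth with compact support. On the one hand, by the dominated convergence theorem, $\langle\phi_n,\mu\rangle\to\mu(B)$ and $\langle\phi_n,\nu\rangle\to\nu(B)$ as $n\to\infty$. On the other hand, because $\hat{\mu}=\hat{\nu}$ we have

$$\langle\phi_n,\mu\rangle=\int_{\mathbb{R}^n}\hat{\phi}_n(-\xi)\hat{\mu}(\xi)\,dm_n(\xi)=\int_{\mathbb{R}^n}\hat{\phi}_n(-\xi)\hat{\nu}(\xi)\,dm_n(\xi)=\langle\phi_n,\nu\rangle.$$

Therefore $\mu(B)=\nu(B)$, and it follows that $\mu=\nu$. ∎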

2 Gaussian measures

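Let $\lambda_1,\ldots,\lambda_n>0$, and let $\Lambda:\mathbb{R}^n\to\mathbb{R}^n$ be the linear map defined by $\Lambda e_i=\lambda_i e_i$. Define

$$d\mu(x)=\sqrt{\det\Lambda}\,\exp\left(-\tfrac{1}{2}x\cdot\Lambda x\right)dm_n(x),$$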

called a Gaussian measure.

Theorem 2.

$$\hat{\mu}(\xi)=\exp\left(-\tfrac{1}{2}\xi\cdot\Lambda^{-1}\xi\right),\qquad \xi\in\mathbb{R}^n.$$

Proof.

We have

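$$\begin{aligned}
\hat{\mu}(\xi) &= \int_{\mathbb{R}^n}e^{-i\xi\cdot x}\sqrt{\det\Lambda}\,\exp\left(-\tfrac{1}{2}x\cdot\Lambda x\right)dm_n(x)\\
&= \int_{\mathbb{R}^n}e^{-i\xi_1x_1-\cdots-i\xi_nx_n}\sqrt{\lambda_1\cdots\lambda_n}\,\exp\left(-\tfrac{1}{2}\lambda_1x_1^2-\cdots-\tfrac{1}{2}\lambda_nx_n^2\right)dm_n(x)\\
&= \prod_{j=1}^n I_j,
\end{aligned}$$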

where

$$I_j=\int_{\mathbb{R}}e^{-i\xi_jx_j}\sqrt{\lambda_j}\,\exp\left(-\tfrac{1}{2}\lambda_jx_j^2\right)dm_1(x_j).$$

Using

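$$-i\xi_jx_j-\tfrac{1}{2}\lambda_jx_j^2=-\frac{\lambda_j}{2}\left(\left(x_j+\frac{i\xi_j}{\lambda_j}\right)^2+\frac{\xi_j^2}{\lambda_j^2}\right)=-\frac{\lambda_j}{2}\left(x_j+\frac{i\xi_j}{\lambda_j}\right)^2-\frac{\xi_j^2}{2\lambda_j},$$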

we get, doing contour integration,

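$$\begin{aligned}
I_j &= \int_{\mathbb{R}}\sqrt{\lambda_j}\,\exp\left(-\frac{\lambda_j}{2}\left(x_j+\frac{i\xi_j}{\lambda_j}\right)^2\right)\exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)dm_1(x_j)\\
&= \int_{\mathbb{R}}\sqrt{\lambda_j}\,\exp\left(-\frac{\lambda_jx_j^2}{2}\right)\exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)dm_1(x_j)\\
&= \int_{\mathbb{R}}\sqrt{\lambda_j}\,\exp\left(-y_j^2\right)\exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)\sqrt{\frac{2}{\lambda_j}}\,dm_1(y_j)\\
&= \exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)\sqrt{2}\int_{\mathbb{R}}\exp\left(-y_j^2\right)dm_1(y_j)\\
&= \exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)\frac{1}{\sqrt{\pi}}\int_{\mathbb{R}}\exp\left(-y_j^2\right)dy_j\\
&= \exp\left(-\frac{\xi_j^2}{2\lambda_j}\right).
\end{aligned}$$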

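Therefore, as $\Lambda^{-1}\xi=\sum_{j=1}^n\frac{\xi_j}{\lambda_j}e_j$ and $\xi\cdot\Lambda^{-1}\xi=\sum_{j=1}^n\frac{\xi_j^2}{\lambda_j}$,

$$\begin{aligned}
\hat{\mu}(\xi) &= \prod_{j=1}^n\exp\left(-\frac{\xi_j^2}{2\lambda_j}\right)\\
&= \exp\left(-\frac{1}{2}\sum_{j=1}^n\frac{\xi_j^2}{\lambda_j}\right)\\
&= \exp\left(-\tfrac{1}{2}\xi\cdot\Lambda^{-1}\xi\right).
\end{aligned}$$

∎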

From the above theorem we get

$$\hat{\mu}(0)=1,$$

and hence a Gaussian measure is a probability measure.
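As a numerical sanity check of Theorem 2 in one dimension, the following sketch approximates $\hat{\mu}(\xi)$ by a trapezoidal rule and compares it with $\exp(-\xi^2/(2\lambda))$. The function names, the truncation width, and the step count are choices made for this illustration and are not part of the text above.

```python
import cmath
import math


def gaussian_ft_1d(xi, lam, half_width=12.0, steps=4000):
    """Approximate the Fourier transform at xi of the 1D Gaussian measure
    d mu(x) = sqrt(lam) * exp(-lam * x^2 / 2) dm_1(x), dm_1(x) = dx / sqrt(2 pi),
    using the trapezoidal rule on a truncated interval (an assumption of this sketch)."""
    # Truncate the real line to [-L, L]; the Gaussian tail beyond L is negligible.
    L = half_width / math.sqrt(lam)
    dx = 2.0 * L / steps
    total = 0j
    for k in range(steps + 1):
        x = -L + k * dx
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        density = math.sqrt(lam) * math.exp(-0.5 * lam * x * x)
        total += weight * cmath.exp(-1j * xi * x) * density
    # Divide by sqrt(2 pi) because the integral is against normalized Lebesgue measure.
    return total * dx / math.sqrt(2.0 * math.pi)


def gaussian_ft_exact(xi, lam):
    """Closed form from Theorem 2 with n = 1: exp(-xi^2 / (2 lam))."""
    return math.exp(-xi * xi / (2.0 * lam))


if __name__ == "__main__":
    for xi, lam in [(0.0, 0.5), (1.3, 2.0), (-2.0, 1.0)]:
        print(xi, lam, gaussian_ft_1d(xi, lam), gaussian_ft_exact(xi, lam))
```

Taking $\xi=0$ in the same routine approximates the total mass of $\mu$, which should be close to 1.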

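For $h\in\mathbb{R}^n$, define $T_h:\mathbb{R}^n\to\mathbb{R}^n$ by $T_h(x)=x-h$. If $E$ is a Borel subset of $\mathbb{R}^n$, because $\chi_{T_{-h}(E)}=\chi_E\circ T_h$,

$$((T_h)_*\mu)(E)=\mu(T_h^{-1}(E))=\mu(T_{-h}(E))=\int_{\mathbb{R}^n}\chi_{T_{-h}(E)}\,d\mu=\int_{\mathbb{R}^n}\chi_E\circ T_h\,d\mu.$$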

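Then, because $T_h\circ T_{-h}=\mathrm{id}_{\mathbb{R}^n}$ and Lebesgue measure is translation invariant,

$$\begin{aligned}
\int_{\mathbb{R}^n}\chi_E\circ T_h\,d\mu &= \int_{\mathbb{R}^n}\chi_E\circ T_h(x)\sqrt{\det\Lambda}\,\exp\left(-\tfrac{1}{2}x\cdot\Lambda x\right)dm_n(x)\\
&= \int_{\mathbb{R}^n}\chi_E(x)\sqrt{\det\Lambda}\,\exp\left(-\tfrac{1}{2}(T_{-h}x)\cdot(\Lambda T_{-h}x)\right)d((T_h)_*m_n)(x)\\
&= \int_{\mathbb{R}^n}\chi_E(x)\sqrt{\det\Lambda}\,\exp\left(-\tfrac{1}{2}(T_{-h}x)\cdot(\Lambda T_{-h}x)\right)dm_n(x).
\end{aligned}$$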

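As $\Lambda$ is self-adjoint, $\Lambda x\cdot h=x\cdot\Lambda h$, and so

$$\begin{aligned}
(T_{-h}x)\cdot(\Lambda T_{-h}x) &= (x+h)\cdot(\Lambda(x+h))\\
&= (x+h)\cdot(\Lambda x+\Lambda h)\\
&= x\cdot\Lambda x+x\cdot\Lambda h+h\cdot\Lambda x+h\cdot\Lambda h\\
&= x\cdot\Lambda x+2x\cdot\Lambda h+h\cdot\Lambda h.
\end{aligned}$$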

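Therefore,

$$\begin{aligned}
((T_h)_*\mu)(E) &= \int_{\mathbb{R}^n}\chi_E(x)\exp\left(-\tfrac{1}{2}(2x\cdot\Lambda h+h\cdot\Lambda h)\right)d\mu(x)\\
&= \int_{\mathbb{R}^n}\chi_E(x)\exp\left(-x\cdot\Lambda h-\tfrac{1}{2}h\cdot\Lambda h\right)d\mu(x).
\end{aligned}$$

This shows that the Radon-Nikodym derivative of $(T_h)_*\mu$ with respect to $\mu$ is

$$\frac{d(T_h)_*\mu}{d\mu}(x)=\exp\left(-x\cdot\Lambda h-\tfrac{1}{2}h\cdot\Lambda h\right).$$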
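The formula for the Radon-Nikodym derivative can be checked pointwise: the density of $(T_h)_*\mu$ with respect to $m_n$ at $x$ is the density of $\mu$ at $x+h$, so the ratio of the two densities should equal $\exp(-x\cdot\Lambda h-\tfrac{1}{2}h\cdot\Lambda h)$. The sketch below does this for a diagonal $\Lambda$ stored as a list of its eigenvalues; the function names and sample values are illustrative assumptions.

```python
import math


def gaussian_density(x, lams):
    """Density of mu with respect to m_n: sqrt(det Lambda) * exp(-x.Lambda x / 2),
    where Lambda is diagonal with entries lams."""
    quad = sum(l * xi * xi for l, xi in zip(lams, x))
    return math.sqrt(math.prod(lams)) * math.exp(-0.5 * quad)


def pushforward_density(x, h, lams):
    """Density of (T_h)_* mu with T_h(x) = x - h: the density of mu evaluated at x + h."""
    shifted = [xi + hi for xi, hi in zip(x, h)]
    return gaussian_density(shifted, lams)


def rn_derivative(x, h, lams):
    """Claimed Radon-Nikodym derivative: exp(-x.Lambda h - h.Lambda h / 2)."""
    x_lam_h = sum(l * xi * hi for l, xi, hi in zip(lams, x, h))
    h_lam_h = sum(l * hi * hi for l, hi in zip(lams, h))
    return math.exp(-x_lam_h - 0.5 * h_lam_h)


if __name__ == "__main__":
    lams = [0.5, 2.0, 1.5]   # hypothetical eigenvalues of Lambda
    x = [0.3, -1.2, 0.7]     # hypothetical evaluation point
    h = [1.0, 0.25, -0.5]    # hypothetical translation vector
    ratio = pushforward_density(x, h, lams) / gaussian_density(x, lams)
    print(ratio, rn_derivative(x, h, lams))
```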

3 Positive-definite functions

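We say that a function $\phi:\mathbb{R}^n\to\mathbb{C}$ is positive-definite if $x_1,\ldots,x_r\in\mathbb{R}^n$ and $c_1,\ldots,c_r\in\mathbb{C}$ imply that

$$\sum_{i,j=1}^r c_i\overline{c_j}\,\phi(x_i-x_j)\geq 0;$$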

in particular, the left-hand side is real.

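Using $r=1$, $c_1=1$, we have for any $x_1\in\mathbb{R}^n$ that $\phi(x_1-x_1)\geq 0$, i.e. $\phi(0)\geq 0$. For $x\in\mathbb{R}^n$, using $r=2$, $x_1=x$, $x_2=0$ and choosing suitable $c_1,c_2$ gives

$$\phi(-x)=\overline{\phi(x)},$$

and using this with $c_2=1$ and an appropriate $c_1$ gives

$$|\phi(x)|\leq\phi(0).$$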
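The definition and the two consequences above can be tested numerically on a concrete function. The sketch below uses $\phi(x)=e^{ix}e^{-x^2/2}$ on $\mathbb{R}$, which is positive-definite because it is the product of a character and a Gaussian; this choice of $\phi$, the random points, and the helper names are assumptions of the illustration.

```python
import cmath
import math
import random


def phi(x):
    """Example positive-definite function on R: a character times a Gaussian."""
    return cmath.exp(1j * x) * math.exp(-0.5 * x * x)


def quadratic_form(func, points, coeffs):
    """Compute sum_{i,j} c_i * conj(c_j) * func(x_i - x_j)."""
    total = 0j
    for xi, ci in zip(points, coeffs):
        for xj, cj in zip(points, coeffs):
            total += ci * cj.conjugate() * func(xi - xj)
    return total


# Fixed seed so that the example is reproducible.
rng = random.Random(0)
points = [rng.uniform(-3.0, 3.0) for _ in range(6)]
coeffs = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(6)]
q = quadratic_form(phi, points, coeffs)

if __name__ == "__main__":
    # The imaginary part should vanish up to rounding, and the real part should be nonnegative.
    print(q)
```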

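For $f,g\in L^1(\mathbb{R}^n)$, the convolution of $f$ and $g$ is the function $f*g:\mathbb{R}^n\to\mathbb{C}$ defined by

$$(f*g)(x)=\int_{\mathbb{R}^n}f(y)g(x-y)\,dm_n(y),\qquad x\in\mathbb{R}^n,$$

and $\|f*g\|_{L^1}\leq\|f\|_{L^1}\|g\|_{L^1}$, a case of Young’s inequality. For $f:\mathbb{R}^n\to\mathbb{C}$, we denote by $\operatorname{supp}f$ the essential support of $f$; if $f$ is continuous, then $\operatorname{supp}f$ is the closure of the set $\{x\in\mathbb{R}^n:f(x)\neq 0\}$. A fact that we will use later (Gerald B. Folland, Real Analysis: Modern Techniques and their Applications, second ed., p. 240, Proposition 8.6) is

$$\operatorname{supp}(f*g)\subset\overline{\operatorname{supp}f+\operatorname{supp}g}.$$

We denote by $f^*$ the function defined by $f^*(x)=\overline{f(-x)}$.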
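The norm inequality, the support inclusion, and the involution $f\mapsto f^*$ have exact analogues for finitely supported sequences on $\mathbb{Z}$ with counting measure. The sketch below works in that discrete setting, which is an assumption made purely for illustration and is not the setting of $\mathbb{R}^n$ used in the text; sequences are stored as dictionaries from integer indices to values.

```python
def convolve(f, g):
    """Convolution of finitely supported sequences: (f*g)(k) = sum_j f(j) g(k - j)."""
    out = {}
    for i, fi in f.items():
        for j, gj in g.items():
            out[i + j] = out.get(i + j, 0) + fi * gj
    return out


def l1_norm(f):
    """Sum of absolute values, the discrete analogue of the L^1 norm."""
    return sum(abs(v) for v in f.values())


def support(f):
    """Set of indices where the sequence is nonzero."""
    return {k for k, v in f.items() if v != 0}


def involution(f):
    """Discrete analogue of f^*(x) = conj(f(-x))."""
    return {-k: complex(v).conjugate() for k, v in f.items()}


f = {0: 1.0, 1: -2.0, 3: 0.5}
g = {-1: 2.0, 2: 1j}
fg = convolve(f, g)

if __name__ == "__main__":
    print(l1_norm(fg), l1_norm(f) * l1_norm(g))
    print(sorted(support(fg)))
```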

$C_c(\mathbb{R}^n)$ is the set of all $f\in C(\mathbb{R}^n)$ for which $\operatorname{supp}f$ is a compact set. The set $C_c(\mathbb{R}^n)$ is dense in the Banach space $C_0(\mathbb{R}^n)$ and also in the Banach space $L^1(\mathbb{R}^n)$. $C_c(\mathbb{R}^n)$ is not a Banach space or even a Fréchet space, and thus does not have a robust structure itself, but it is used because it is often easier to prove a statement for $C_c(\mathbb{R}^n)$ first and then extend it, by density, to the larger spaces. The proof of the following theorem follows Folland (Gerald B. Folland, A Course in Abstract Harmonic Analysis, p. 85, Proposition 3.35).

Theorem 3.

If $\phi:\mathbb{R}^n\to\mathbb{C}$ is positive-definite and continuous and $f\in C_c(\mathbb{R}^n)$, then

$$\int_{\mathbb{R}^n}(f^**f)\phi\,dm_n\geq 0.$$

Proof.

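Write $K=\operatorname{supp}f$, and define $F:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{C}$ by

$$F(x,y)=f(x)\overline{f(y)}\,\phi(x-y).$$

$F$ is continuous, and $\operatorname{supp}F\subset K\times K$, hence $\operatorname{supp}F$ is compact. Thus $F\in C_c(\mathbb{R}^n\times\mathbb{R}^n)$; in particular $F$ is uniformly continuous on $K\times K$, and it follows that for each $\epsilon>0$ there is some $\delta>0$ such that if $|x-a|<\delta$ and $|y-b|<\delta$ then $|F(x,y)-F(a,b)|<\epsilon$. The collection $\{B_\delta(x):x\in K\}$ covers $K$ and hence there are finitely many distinct $x_i\in K$ such that the collection $\{B_\delta(x_i):i\}$ covers $K$. Then $\{B_\delta(x_i)\times B_\delta(x_j):i,j\}$ covers $K\times K$. Let $E_i\subset K$ be pairwise disjoint and measurable, satisfy $x_i\in E_i\subset B_\delta(x_i)$, and be chosen so that the collection $\{E_i:i\}$ covers $K$; then the collection $\{E_i\times E_j:i,j\}$ covers $K\times K$.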

Define

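$$R=\sum_{i,j}\iint_{E_i\times E_j}\left(F(x,y)-F(x_i,x_j)\right)dm_n(x)\,dm_n(y).$$

$R$ satisfies

$$\begin{aligned}
|R| &\leq \sum_{i,j}\iint_{E_i\times E_j}|F(x,y)-F(x_i,x_j)|\,dm_n(x)\,dm_n(y)\\
&\leq \sum_{i,j}\iint_{E_i\times E_j}\epsilon\,dm_n(x)\,dm_n(y)\\
&= \epsilon\sum_{i,j}m_n(E_i)m_n(E_j)\\
&= \epsilon\,m_n(K)^2.
\end{aligned}$$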

We obtain

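$$\begin{aligned}
\iint_{K\times K}F(x,y)\,dm_n(x)\,dm_n(y) &= \sum_{i,j}\iint_{E_i\times E_j}F(x,y)\,dm_n(x)\,dm_n(y)\\
&= \sum_{i,j}F(x_i,x_j)m_n(E_i)m_n(E_j)+R\\
&= \sum_{i,j}f(x_i)\overline{f(x_j)}\,\phi(x_i-x_j)m_n(E_i)m_n(E_j)+R.
\end{aligned}$$

Using $c_i=f(x_i)m_n(E_i)$, the fact that $\phi$ is positive-definite means that the sum is $\geq 0$. Therefore

$$\iint_{K\times K}F(x,y)\,dm_n(x)\,dm_n(y)\geq -|R|\geq -\epsilon\,m_n(K)^2.$$

This is true for all $\epsilon>0$, hence

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}f(x)\overline{f(y)}\,\phi(x-y)\,dm_n(x)\,dm_n(y)=\iint_{K\times K}F(x,y)\,dm_n(x)\,dm_n(y)\geq 0.$$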

But

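$$\begin{aligned}
\int_{\mathbb{R}^n}(f^**f)(x)\phi(x)\,dm_n(x) &= \int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}f^*(y)f(x-y)\,dm_n(y)\right)\phi(x)\,dm_n(x)\\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\overline{f(-y)}f(x-y)\phi(x)\,dm_n(x)\,dm_n(y)\\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\overline{f(-y)}f(x)\phi(x+y)\,dm_n(x)\,dm_n(y)\\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\overline{f(y)}f(x)\phi(x-y)\,dm_n(x)\,dm_n(y).
\end{aligned}$$

∎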

Corollary 4.

If $\phi:\mathbb{R}^n\to\mathbb{C}$ is positive-definite and continuous and $f\in L^1(\mathbb{R}^n)$, then

$$\int_{\mathbb{R}^n}(f^**f)\phi\,dm_n\geq 0.$$

Proof.

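Let $f_n\in C_c(\mathbb{R}^n)$ converge to $f$ in $L^1(\mathbb{R}^n)$ as $n\to\infty$; that there is such a sequence is given to us by the fact that $C_c(\mathbb{R}^n)$ is a dense subset of $L^1(\mathbb{R}^n)$. Using

$$\begin{aligned}
f_n^**f_n-f^**f &= f_n^**f_n-f_n^**f+f_n^**f-f^**f\\
&= f_n^**(f_n-f)+(f_n^*-f^*)*f\\
&= f_n^**(f_n-f)+(f_n-f)^**f,
\end{aligned}$$

and $\|g^*\|_{L^1}=\|g\|_{L^1}$, we get

$$\begin{aligned}
\|f_n^**f_n-f^**f\|_{L^1} &\leq \|f_n^**(f_n-f)\|_{L^1}+\|(f_n-f)^**f\|_{L^1}\\
&\leq \|f_n^*\|_{L^1}\|f_n-f\|_{L^1}+\|(f_n-f)^*\|_{L^1}\|f\|_{L^1}\\
&= \|f_n\|_{L^1}\|f_n-f\|_{L^1}+\|f_n-f\|_{L^1}\|f\|_{L^1},
\end{aligned}$$

which converges to $0$ because $\|f_n-f\|_{L^1}\to 0$ and the norms $\|f_n\|_{L^1}$ are bounded. Therefore, because $\phi$ is bounded,

$$\int_{\mathbb{R}^n}(f_n^**f_n)\phi\,dm_n\to\int_{\mathbb{R}^n}(f^**f)\phi\,dm_n.$$

As $\int_{\mathbb{R}^n}(f_n^**f_n)\phi\,dm_n\geq 0$ for each $n$ by Theorem 3, this implies that $\int_{\mathbb{R}^n}(f^**f)\phi\,dm_n\geq 0$. ∎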

It is straightforward to prove that the Fourier transform of a finite positive Borel measure is a positive-definite function; one ends up with the expression

$$\int_{\mathbb{R}^n}\left|\sum_{j=1}^r c_je^{-i\xi_j\cdot x}\right|^2d\mu(x),$$

which is finite and nonnegative because $\mu$ is finite and positive respectively. We have established already that the Fourier transform of a Borel probability measure $\mu$ on $\mathbb{R}^n$ is continuous and satisfies $\hat{\mu}(0)=1$. Bochner’s theorem is the statement that a function with these three properties (positive-definite, continuous, equal to $1$ at $0$) is indeed the Fourier transform of a Borel probability measure. Our proof of the following theorem follows Folland (Gerald B. Folland, A Course in Abstract Harmonic Analysis, p. 95, Theorem 4.18).
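For a measure with finitely many atoms, the identity behind the positive-definiteness of $\hat{\mu}$ can be verified directly: the quadratic form $\sum_{i,j}c_i\overline{c_j}\hat{\mu}(\xi_i-\xi_j)$ equals the integral of $|\sum_j c_je^{-i\xi_jx}|^2$ against $\mu$. The sketch below checks this in one dimension for a hypothetical three-atom probability measure; the atoms, weights, frequencies, and coefficients are made-up sample values.

```python
import cmath


def mu_hat(xi, atoms):
    """Fourier transform of mu = sum_k w_k * delta_{a_k}: sum_k w_k * exp(-i xi a_k)."""
    return sum(w * cmath.exp(-1j * xi * a) for a, w in atoms)


def quadratic_form(xis, cs, atoms):
    """sum_{i,j} c_i * conj(c_j) * mu_hat(xi_i - xi_j)."""
    return sum(ci * cj.conjugate() * mu_hat(xi - xj, atoms)
               for xi, ci in zip(xis, cs)
               for xj, cj in zip(xis, cs))


def integral_of_square(xis, cs, atoms):
    """Integral of |sum_j c_j exp(-i xi_j x)|^2 with respect to mu."""
    return sum(w * abs(sum(c * cmath.exp(-1j * xi * a) for xi, c in zip(xis, cs))) ** 2
               for a, w in atoms)


# Hypothetical data: (location, weight) pairs with weights summing to 1.
atoms = [(-1.0, 0.2), (0.5, 0.5), (2.0, 0.3)]
xis = [0.0, 0.7, -1.9, 3.1]
cs = [1 + 2j, -0.5j, 0.3 + 0j, -1 + 1j]

if __name__ == "__main__":
    print(quadratic_form(xis, cs, atoms), integral_of_square(xis, cs, atoms))
    print(mu_hat(0.0, atoms))  # total mass of mu
```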

Theorem 5 (Bochner).

If $\phi:\mathbb{R}^n\to\mathbb{C}$ is positive-definite, continuous, and satisfies $\phi(0)=1$, then there is some Borel probability measure $\mu$ on $\mathbb{R}^n$ such that $\phi=\hat{\mu}$.

Proof.

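Let $\{\psi_U\}$ be an approximate identity. That is, for each neighborhood $U$ of $0$, $\psi_U$ is a function such that $\operatorname{supp}\psi_U$ is compact and contained in $U$, $\psi_U\geq 0$, $\psi_U(-x)=\psi_U(x)$, and $\int_{\mathbb{R}^n}\psi_U\,dm_n=1$. For every $f\in L^1(\mathbb{R}^n)$, an approximate identity satisfies $\|f*\psi_U-f\|_{L^1}\to 0$ as $U\to\{0\}$ (Gerald B. Folland, A Course in Abstract Harmonic Analysis, p. 53, Proposition 2.42).

We have $\psi_U^*=\psi_{-U}$, so

$$\operatorname{supp}(\psi_U^**\psi_U)\subset\overline{\operatorname{supp}\psi_{-U}+\operatorname{supp}\psi_U}=\operatorname{supp}\psi_{-U}+\operatorname{supp}\psi_U\subset -U+U,$$

and as always, $\int_{\mathbb{R}^n}f*g\,dm_n=\int_{\mathbb{R}^n}f\,dm_n\int_{\mathbb{R}^n}g\,dm_n$. Therefore $\{\psi_U^**\psi_U\}$ is an approximate identity.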

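For $f,g\in L^1(\mathbb{R}^n)$, define

$$\langle f,g\rangle_\phi=\int_{\mathbb{R}^n}(g^**f)\phi\,dm_n.$$

One checks that this is a positive Hermitian form; positive means that $\langle f,f\rangle_\phi\geq 0$ for all $f\in L^1(\mathbb{R}^n)$, and this is given to us by Corollary 4. Using the Cauchy-Schwarz inequality (Jean Dieudonné, Foundations of Modern Analysis, 1969, p. 117, Theorem 6.2.1),

$$|\langle f,g\rangle_\phi|^2\leq\langle f,f\rangle_\phi\langle g,g\rangle_\phi.$$

We have laid out the tools that we will use. Let $f\in L^1(\mathbb{R}^n)$. $\psi_U*f\to f$ in $L^1$ as $U\to\{0\}$, and as $\phi$ is bounded this gives $\int_{\mathbb{R}^n}(\psi_U^**f)\phi\,dm_n\to\int_{\mathbb{R}^n}f\phi\,dm_n$ as $U\to\{0\}$. Because $\{\psi_U^**\psi_U\}$ is an approximate identity and $\phi$ is continuous, $\int_{\mathbb{R}^n}(\psi_U^**\psi_U)\phi\,dm_n\to\phi(0)$ as $U\to\{0\}$. That is, we have $\langle f,\psi_U\rangle_\phi\to\int_{\mathbb{R}^n}f\phi\,dm_n$ and $\langle\psi_U,\psi_U\rangle_\phi\to\phi(0)$ as $U\to\{0\}$, and as $\phi(0)=1$, the above statement of the Cauchy-Schwarz inequality produces

$$\left|\int_{\mathbb{R}^n}f\phi\,dm_n\right|^2\leq\int_{\mathbb{R}^n}(f^**f)\phi\,dm_n. \tag{1}$$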

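With $h=f^**f$, the inequality (1) reads

$$\left|\int_{\mathbb{R}^n}f\phi\,dm_n\right|^2\leq\int_{\mathbb{R}^n}h\phi\,dm_n.$$

Defining $h^{(1)}=h$, $h^{(2)}=h*h$, $h^{(3)}=h*h*h$, etc., applying (1) to $h$ gives, because $h^*=h$,

$$\left|\int_{\mathbb{R}^n}h\phi\,dm_n\right|^2\leq\int_{\mathbb{R}^n}h^{(2)}\phi\,dm_n.$$

Then applying (1) to $h^{(2)}$, which satisfies $(h^{(2)})^*=h^{(2)}$,

$$\left|\int_{\mathbb{R}^n}h^{(2)}\phi\,dm_n\right|^2\leq\int_{\mathbb{R}^n}h^{(4)}\phi\,dm_n.$$

Thus, for any $m\geq 0$ we have

$$\begin{aligned}
\left|\int_{\mathbb{R}^n}f\phi\,dm_n\right| &\leq \left|\int_{\mathbb{R}^n}h^{(2^m)}\phi\,dm_n\right|^{2^{-(m+1)}}\\
&\leq \left\|h^{(2^m)}\right\|_{L^1}^{2^{-(m+1)}}\\
&= \left(\left\|h^{(2^m)}\right\|_{L^1}^{2^{-m}}\right)^{1/2},
\end{aligned}$$

since $\|\phi\|_\infty=\phi(0)=1$.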

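With convolution as multiplication, $L^1(\mathbb{R}^n)$ is a commutative Banach algebra, and the Gelfand transform is an algebra homomorphism $L^1(\mathbb{R}^n)\to C_0(\mathbb{R}^n)$ that satisfies (Gerald B. Folland, A Course in Abstract Harmonic Analysis, p. 15, Theorem 1.30; this is the spectral radius formula)

$$\|\hat{g}\|_\infty=\lim_{k\to\infty}\left\|g^{(k)}\right\|_{L^1}^{1/k},\qquad g\in L^1(\mathbb{R}^n),$$

where $g^{(k)}$ denotes the $k$-fold convolution power of $g$; for $L^1(\mathbb{R}^n)$, the Gelfand transform is the Fourier transform. Write the Fourier transform as $\mathscr{F}:L^1(\mathbb{R}^n)\to C_0(\mathbb{R}^n)$. Stating that the Gelfand transform is a homomorphism means that $\mathscr{F}(g_1*g_2)=\mathscr{F}(g_1)\mathscr{F}(g_2)$, because multiplication in the Banach algebra $C_0(\mathbb{R}^n)$ is pointwise multiplication. Then, since a subsequence of a convergent sequence converges to the same limit,

$$\lim_{m\to\infty}\left(\left\|h^{(2^m)}\right\|_{L^1}^{2^{-m}}\right)^{1/2}=\left(\|\hat{h}\|_\infty\right)^{1/2}.$$

But

$$\hat{h}=\mathscr{F}(f^**f)=\mathscr{F}(f^*)\mathscr{F}(f)=\overline{\mathscr{F}(f)}\,\mathscr{F}(f)=|\mathscr{F}(f)|^2,$$

so

$$\left(\|\hat{h}\|_\infty\right)^{1/2}=\left(\left\||\hat{f}|^2\right\|_\infty\right)^{1/2}=\|\hat{f}\|_\infty.$$

Putting things together, we have that for any $f\in L^1(\mathbb{R}^n)$,

$$\left|\int_{\mathbb{R}^n}f\phi\,dm_n\right|\leq\|\hat{f}\|_\infty.$$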

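Therefore $\hat{f}\mapsto\int_{\mathbb{R}^n}f\phi\,dm_n$ is a bounded linear functional $\mathscr{F}(L^1(\mathbb{R}^n))\to\mathbb{C}$, of norm $\leq 1$. Using $\phi(0)=1$, one proves that this functional has norm $1$. (If we could apply this inequality to $\mathscr{F}(\delta)$ the two sides would be equal; thus to prove that the operator norm is $1$, one applies the inequality to a sequence of functions that converge weakly to $\delta$.) We take as known that $\mathscr{F}(L^1(\mathbb{R}^n))$ is dense in the Banach space $C_0(\mathbb{R}^n)$, so there is a bounded linear functional $\Phi:C_0(\mathbb{R}^n)\to\mathbb{C}$ whose restriction to $\mathscr{F}(L^1(\mathbb{R}^n))$ is equal to $\hat{f}\mapsto\int_{\mathbb{R}^n}f\phi\,dm_n$, and $\|\Phi\|=1$.

Using the Riesz-Markov theorem (Walter Rudin, Real and Complex Analysis, third ed., p. 130, Theorem 6.19), there is a regular complex Borel measure $\mu$ on $\mathbb{R}^n$ such that

$$\Phi(g)=\int_{\mathbb{R}^n}g\,d\mu,\qquad g\in C_0(\mathbb{R}^n),$$

and $\|\mu\|=\|\Phi\|$; here $\|\mu\|$ is the total variation norm of $\mu$, $\|\mu\|=|\mu|(\mathbb{R}^n)$. Then for $f\in L^1(\mathbb{R}^n)$ we have

$$\begin{aligned}
\int_{\mathbb{R}^n}f\phi\,dm_n &= \Phi(\hat{f})\\
&= \int_{\mathbb{R}^n}\hat{f}\,d\mu\\
&= \int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}e^{-i\xi\cdot x}f(x)\,dm_n(x)\right)d\mu(\xi)\\
&= \int_{\mathbb{R}^n}f(x)\left(\int_{\mathbb{R}^n}e^{-ix\cdot\xi}\,d\mu(\xi)\right)dm_n(x)\\
&= \int_{\mathbb{R}^n}f(x)\hat{\mu}(x)\,dm_n(x).
\end{aligned}$$

That this is true for all $f\in L^1(\mathbb{R}^n)$ implies that $\phi=\hat{\mu}$ almost everywhere, and hence everywhere because both functions are continuous. As $\mu(\mathbb{R}^n)=\hat{\mu}(0)=\phi(0)=1$ and $\|\mu\|=\|\Phi\|=1$, we have $\mu(\mathbb{R}^n)=\|\mu\|$, and this implies that $\mu$ is a positive measure, hence, as $\mu(\mathbb{R}^n)=1$, a probability measure. ∎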