# Positive definite functions, completely monotone functions, the Bernstein-Widder theorem, and Schoenberg’s theorem

Jordan Bell
June 26, 2015

## 1 Linear operators

For a complex Hilbert space $H$ let $\mathscr{L}(H)$ be the bounded linear operators $H\to H$. It is a fact that $A\in\mathscr{L}(H)$ is self-adjoint if and only if $\left\langle Ah,h\right\rangle\in\mathbb{R}$ for all $h\in H$ (John B. Conway, A Course in Functional Analysis, second ed., p. 33, Proposition 2.12). For a bounded self-adjoint operator $A$ it is a fact that (John B. Conway, A Course in Functional Analysis, second ed., p. 34, Proposition 2.13)

 $\left\|A\right\|=\sup_{\left\|h\right\|=1}|\left\langle Ah,h\right\rangle|.$

$A\in\mathscr{L}(H)$ is called positive if it is self-adjoint and

 $\left\langle Ah,h\right\rangle\geq 0,\qquad h\in H;$

because we have taken $H$ to be a complex Hilbert space, for $A$ to be positive it suffices that the inequality is satisfied.

For $A,B\in\mathscr{L}(\mathbb{C}^{n})$, with $\{e_{1},\ldots,e_{n}\}$ the standard basis for $\mathbb{C}^{n}$, we define their Hadamard product $A*B\in\mathscr{L}(\mathbb{C}^{n})$ by

 $(A*B)e_{i}=\sum_{j=1}^{n}\left\langle Ae_{i},e_{j}\right\rangle\left\langle Be_{i},e_{j}\right\rangle e_{j}.$

So,

 $\left\langle(A*B)e_{i},e_{j}\right\rangle=\left\langle Ae_{i},e_{j}\right\rangle\left\langle Be_{i},e_{j}\right\rangle.$

The Schur product theorem states that if $A,B\in\mathscr{L}(\mathbb{C}^{n})$ are positive then their Hadamard product $A*B$ is positive (Ward Cheney and Will Light, A Course in Approximation Theory, p. 81, chapter 12).
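The Schur product theorem can be probed numerically. The following is a minimal sketch, not a proof: it assumes real Gram matrices $MM^{T}$ as a convenient source of positive matrices, and the helper names `gram`, `hadamard`, and `quad_form` are hypothetical, introduced only for this illustration.

```python
import random

def gram(rows):
    """Return the Gram matrix M M^T of a list of real row vectors (always positive)."""
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]

def hadamard(A, B):
    """Entrywise (Hadamard) product of two square matrices of the same size."""
    n = len(A)
    return [[A[i][j] * B[i][j] for j in range(n)] for i in range(n)]

def quad_form(M, u):
    """Evaluate sum_{i,j} u_i * conj(u_j) * M[i][j] for a complex vector u."""
    n = len(M)
    return sum(u[i] * u[j].conjugate() * M[i][j] for i in range(n) for j in range(n))

random.seed(0)  # fixed seed so the run is reproducible
n = 4
A = gram([[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)])
B = gram([[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)])
C = hadamard(A, B)

# Smallest real part of the quadratic form of A*B over random complex test vectors.
values = []
for _ in range(200):
    u = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    values.append(quad_form(C, u).real)
min_value = min(values)
print(min_value >= -1e-12)
```

Sampling random vectors can only fail to find a counterexample; it does not establish positivity, which is what the theorem supplies.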

## 2 Positive definite functions

Let $X$ be a real or complex linear space, let $f:X\to\mathbb{C}$ be a function, and for $x_{1},\ldots,x_{n}\in X$, define $F_{f;x_{1},\ldots,x_{n}}\in\mathscr{L}(\mathbb{C}^{n})$ by

 $F_{f;x_{1},\ldots,x_{n}}e_{i}=\sum_{j=1}^{n}f(x_{i}-x_{j})e_{j},$

where $\{e_{1},\ldots,e_{n}\}$ is the standard basis for $\mathbb{C}^{n}$. Thus for $u=\sum_{i=1}^{n}u_{i}e_{i}\in\mathbb{C}^{n}$,

 $\begin{aligned}\left\langle F_{f;x_{1},\ldots,x_{n}}u,u\right\rangle&=\left\langle\sum_{i=1}^{n}u_{i}\sum_{j=1}^{n}f(x_{i}-x_{j})e_{j},\sum_{k=1}^{n}u_{k}e_{k}\right\rangle\\ &=\sum_{i=1}^{n}u_{i}\sum_{j=1}^{n}f(x_{i}-x_{j})\left\langle e_{j},\sum_{k=1}^{n}u_{k}e_{k}\right\rangle\\ &=\sum_{i=1}^{n}\sum_{j=1}^{n}u_{i}\overline{u_{j}}f(x_{i}-x_{j}).\end{aligned}$

We call $f$ positive definite if for all $x_{1},\ldots,x_{n}\in X$, $F_{f;x_{1},\ldots,x_{n}}$ is a positive operator, i.e. for $u\in\mathbb{C}^{n}$,

 $\left\langle F_{f;x_{1},\ldots,x_{n}}u,u\right\rangle\geq 0.$

We call $f$ strictly positive definite if for all distinct $x_{1},\ldots,x_{n}\in X$ and nonzero $u\in\mathbb{C}^{n}$,

 $\left\langle F_{f;x_{1},\ldots,x_{n}}u,u\right\rangle>0.$
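To make the quadratic form concrete, here is a small sketch that evaluates $\sum_{i,j}u_{i}\overline{u_{j}}f(x_{i}-x_{j})$ for the assumed example $f=\cos$ on $X=\mathbb{R}$ and compares it with the closed form obtained from $\cos(a-b)=\frac{1}{2}(e^{i(a-b)}+e^{-i(a-b)})$. The function name `form` and the sample points are hypothetical choices made for illustration.

```python
import cmath
import math

def form(f, xs, u):
    """Return sum_{i,j} u_i * conj(u_j) * f(x_i - x_j), i.e. <F u, u>."""
    n = len(xs)
    return sum(u[i] * u[j].conjugate() * f(xs[i] - xs[j])
               for i in range(n) for j in range(n))

xs = [0.0, 0.4, 1.3, 2.9]                  # arbitrary sample points on the real line
u = [1 + 2j, -0.5j, 3.0 + 0j, -1 + 1j]     # arbitrary nonzero complex coefficients

value = form(math.cos, xs, u)

# Closed form: (|sum u_i e^{i x_i}|^2 + |sum u_i e^{-i x_i}|^2) / 2, which is >= 0.
s_plus = sum(c * cmath.exp(1j * x) for c, x in zip(u, xs))
s_minus = sum(c * cmath.exp(-1j * x) for c, x in zip(u, xs))
closed_form = (abs(s_plus) ** 2 + abs(s_minus) ** 2) / 2
print(value.real, closed_form)
```

The closed form shows that $\cos$ is positive definite. It is not strictly positive definite, because the matrix $(\cos(x_{i}-x_{j}))_{i,j}$ has rank at most $2$, so for $n\geq 3$ some nonzero $u$ gives the value $0$.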

## 3 Completely monotone functions

A function $f:[0,\infty)\to\mathbb{R}$ is called completely monotone if

1. $f\in C[0,\infty)$
2. $f\in C^{\infty}(0,\infty)$
3. $(-1)^{k}f^{(k)}(x)\geq 0$ for $k\geq 0$ and $x\in(0,\infty)$

Because a completely monotone function is continuous, $f(x)$ tends to $f(0)$ as $x\downarrow 0$. Because a completely monotone function is nonincreasing and bounded below by $0$, $f(x)$ has a limit, which we call $f(\infty)$, as $x\uparrow\infty$.
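As a concrete instance of the definition (an illustrative example, not taken from the cited references), consider $f(x)=\frac{1}{1+x}$ on $[0,\infty)$. It is continuous on $[0,\infty)$ and smooth on $(0,\infty)$, and differentiating $k$ times gives

 $f^{(k)}(x)=(-1)^{k}\frac{k!}{(1+x)^{k+1}},\qquad(-1)^{k}f^{(k)}(x)=\frac{k!}{(1+x)^{k+1}}\geq 0,$

so $f$ is completely monotone, with $f(0)=1$ and $f(\infty)=0$.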

The Bernstein-Widder theorem states that a function $f$ satisfying $f(0)=1$ is completely monotone if and only if it is the Laplace transform of a Borel probability measure on $[0,\infty)$ (Peter D. Lax, Functional Analysis, p. 138, chapter 14, Theorem 3; http://djalil.chafai.net/blog/2013/03/23/the-bernstein-theorem-on-completely-monotone-functions/).

###### Theorem 1 (Bernstein-Widder theorem).

A function $f:[0,\infty)\to\mathbb{R}$ satisfies $f(0)=1$ and is completely monotone if and only if there is a Borel probability measure $\mu$ on $[0,\infty)$ such that

 $f(x)=\int_{0}^{\infty}e^{-xt}d\mu(t),\qquad x\in[0,\infty).$
###### Proof.

If $f$ is the Laplace transform of some probability measure $\mu$ on $\mathscr{B}_{[0,\infty)}$, then using the dominated convergence theorem yields that $f$ is continuous and by induction that $f\in C^{\infty}(0,\infty)$. For $k\geq 0$ and for $x\in(0,\infty)$,

 $f^{(k)}(x)=\int_{0}^{\infty}(-t)^{k}e^{-xt}d\mu(t),$

Because $\int_{0}^{\infty}t^{k}e^{-xt}d\mu(t)\geq 0$, it follows that $(-1)^{k}f^{(k)}(x)\geq 0$. Hence $f$ is completely monotone, and $f(0)=\int_{0}^{\infty}d\mu(t)=1$.

If $f$ satisfies $f(0)=1$ and is completely monotone, then for each $k\geq 0$, the function $(-1)^{k}f^{(k)}:(0,\infty)\to\mathbb{R}$ is nonnegative and nonincreasing. So for $k\geq 1$ and $t\in(0,\infty)$, using that $(-1)^{k}f^{(k)}$ is nonincreasing and that $(-1)^{k-1}f^{(k-1)}$ is nonnegative,

 $\begin{aligned}(-1)^{k}f^{(k)}(t)&\leq\frac{2}{t}\int_{t/2}^{t}(-1)^{k}f^{(k)}(u)du\\ &=\frac{2}{t}(-1)^{k}\left(f^{(k-1)}(t)-f^{(k-1)}(t/2)\right)\\ &\leq\frac{2}{t}(-1)^{k-1}f^{(k-1)}(t/2).\end{aligned}$

By induction, for any $k\geq 1$,

 $\begin{aligned}(-1)^{k}f^{(k)}(t)&\leq\prod_{j=1}^{k-1}\left(\frac{2^{j}}{t}\right)\cdot\left(-f^{\prime}\left(\frac{t}{2^{k-1}}\right)\right)\\ &\leq\prod_{j=1}^{k-1}\left(\frac{2^{j}}{t}\right)\cdot\frac{2^{k}}{t}\left(f\left(\frac{t}{2^{k}}\right)-f\left(\frac{t}{2^{k-1}}\right)\right)\\ &=t^{-k}2^{k(k+1)/2}\left(f\left(\frac{t}{2^{k}}\right)-f\left(\frac{t}{2^{k-1}}\right)\right).\end{aligned}$

Because $f(x)\to f(0)$ as $x\downarrow 0$,

 $f\left(\frac{t}{2^{k}}\right)-f\left(\frac{t}{2^{k-1}}\right)\to 0,\qquad t\downarrow 0,$

and because $f(x)\to f(\infty)$ as $x\uparrow\infty$,

 $f\left(\frac{t}{2^{k}}\right)-f\left(\frac{t}{2^{k-1}}\right)\to 0,\qquad t\uparrow\infty.$

Hence for each $k\geq 1$,

 $|f^{(k)}(t)|=o_{k}(t^{-k}),\qquad t\downarrow 0,$ (1)

and

 $|f^{(k)}(t)|=o_{k}(t^{-k}),\qquad t\uparrow\infty.$ (2)

Furthermore, for any $x\in(0,\infty)$, $f^{(k)}(t)\to f^{(k)}(x)$ as $t\to x$, so it is immediate that

 $(t-x)^{k}f^{(k)}(t)\to 0,\qquad t\to x.$ (3)

For $x\geq 0$ and $k\geq 1$, integrating by parts, using (2) and (1) or (3) respectively as $x=0$ or $x>0$,

 $\begin{aligned}f(x)-f(\infty)&=-\int_{x}^{\infty}f^{\prime}(t)dt\\ &=-(t-x)f^{\prime}(t)\Big|_{x}^{\infty}+\int_{x}^{\infty}f^{\prime\prime}(t)(t-x)dt\\ &=\int_{x}^{\infty}f^{\prime\prime}(t)(t-x)dt\\ &=\frac{(t-x)^{2}}{2}f^{\prime\prime}(t)\Big|_{x}^{\infty}-\int_{x}^{\infty}f^{\prime\prime\prime}(t)\frac{(t-x)^{2}}{2}dt\\ &=-\int_{x}^{\infty}f^{\prime\prime\prime}(t)\frac{(t-x)^{2}}{2}dt\\ &=(-1)^{k}\int_{x}^{\infty}f^{(k)}(t)\frac{(t-x)^{k-1}}{(k-1)!}dt.\end{aligned}$

Hence for $x\geq 0$ and $n\geq 0$,

 $f(x)-f(\infty)=\frac{(-1)^{n+1}}{n!}\int_{x}^{\infty}f^{(n+1)}(t)(t-x)^{n}dt.$
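As a sanity check of this formula (an illustrative example with the assumed choice $f(x)=e^{-x}$, for which $f(\infty)=0$ and $f^{(n+1)}(t)=(-1)^{n+1}e^{-t}$), substituting $s=t-x$ gives

 $\frac{(-1)^{n+1}}{n!}\int_{x}^{\infty}(-1)^{n+1}e^{-t}(t-x)^{n}dt=\frac{e^{-x}}{n!}\int_{0}^{\infty}e^{-s}s^{n}ds=e^{-x},$

which agrees with $f(x)-f(\infty)=e^{-x}$ for every $n\geq 0$.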

Define

 $\phi_{n}(y)=(1-y/n)^{n}1_{[0,n]}(y).$
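The functions $\phi_{n}$ matter in this proof because they approximate $y\mapsto e^{-y}$ uniformly on $[0,\infty)$. The sketch below estimates the sup-norm gap on a finite grid; the grid bounds and the helper names `phi` and `sup_gap` are assumptions made for illustration, and a grid maximum only approximates the true supremum.

```python
import math

def phi(n, y):
    """phi_n(y) = (1 - y/n)^n on [0, n], and 0 elsewhere."""
    return (1.0 - y / n) ** n if 0.0 <= y <= n else 0.0

def sup_gap(n, y_max=50.0, step=0.01):
    """Largest value of |phi_n(y) - exp(-y)| on a uniform grid in [0, y_max]."""
    count = int(round(y_max / step))
    return max(abs(phi(n, k * step) - math.exp(-k * step)) for k in range(count + 1))

# The gap should shrink as n grows (roughly like 1/n).
gaps = [sup_gap(n) for n in (10, 100, 1000)]
print(gaps)
```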

For $n\geq 1$, by change of variables,

 $\begin{aligned}f(x)-f(\infty)&=\frac{(-1)^{n+1}}{n!}\int_{x/n}^{\infty}f^{(n+1)}(nu)(nu-x)^{n}n\,du\\ &=\frac{(-1)^{n+1}}{(n-1)!}\int_{x/n}^{\infty}f^{(n+1)}(nu)(nu)^{n}\left(1-\frac{x}{nu}\right)^{n}du\\ &=\frac{(-1)^{n+1}}{(n-1)!}\int_{0}^{\infty}(nu)^{n}\phi_{n}(x/u)f^{(n+1)}(nu)du\\ &=\frac{(-1)^{n+1}}{(n-1)!}\int_{0}^{\infty}(n/t)^{n}\phi_{n}(xt)f^{(n+1)}(n/t)t^{-2}dt.\end{aligned}$

For $t>0$, define

 $s_{n}(t)=\frac{(-1)^{n+1}}{(n-1)!}\int_{1/t}^{\infty}(nu)^{n}f^{(n+1)}(nu)du,$

and for $t\leq 0$ let $s_{n}(t)=0$. The function $s_{n}$ is continuous, and because $f$ is completely monotone, $s_{n}$ is nondecreasing, so there is a unique positive measure $\sigma_{n}$ on $\mathscr{B}_{\mathbb{R}}$ such that (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 393, Theorem 10.48)

 $\sigma_{n}((a,b])=s_{n}(b)-s_{n}(a),\qquad a\leq b.$

On the other hand, $s_{n}$ is absolutely continuous, so $\sigma_{n}$ is absolutely continuous with respect to Lebesgue measure $\lambda_{1}$, and for $\lambda_{1}$-almost all $t\in\mathbb{R}$ (H. L. Royden, Real Analysis, third ed., p. 303, Exercise 16),

 $\frac{d\sigma_{n}}{d\lambda_{1}}(t)=s_{n}^{\prime}(t).$

Now for $t>0$, by the fundamental theorem of calculus and the chain rule,

 $s_{n}^{\prime}(t)=\frac{(-1)^{n+1}}{(n-1)!}(n/t)^{n}f^{(n+1)}(n/t)\cdot t^{-2},$

and therefore

 $\begin{aligned}f(x)-f(\infty)&=\int_{0}^{\infty}\phi_{n}(xt)s_{n}^{\prime}(t)d\lambda_{1}(t)\\ &=\int_{0}^{\infty}\phi_{n}(xt)\frac{d\sigma_{n}}{d\lambda_{1}}(t)d\lambda_{1}(t)\\ &=\int_{0}^{\infty}\phi_{n}(xt)d\sigma_{n}(t).\end{aligned}$

The total variation of $\sigma_{n}$ is equal to the total variation of $s_{n}$, and because $s_{n}$ is nondecreasing,

 $\left\|\sigma_{n}\right\|=\int_{0}^{\infty}|s_{n}^{\prime}(t)|dt=\int_{0}^{\infty}s_{n}^{\prime}(t)dt=s_{n}(\infty)-s_{n}(0)=s_{n}(\infty),$

which is

 $\left\|\sigma_{n}\right\|=\frac{(-1)^{n+1}}{(n-1)!}\int_{0}^{\infty}(nu)^{n}f^{(n+1)}(nu)du=f(0)-f(\infty),$

showing that $\{\sigma_{n}:n\geq 1\}$ is bounded for the total variation norm. We claim that $\{\sigma_{n}:n\geq 1\}$ is tight: for each $\epsilon>0$ there is a compact subset $K_{\epsilon}$ of $\mathbb{R}$ such that $\sigma_{n}(K_{\epsilon}^{c})<\epsilon$ for all $n$. Taking this for granted, Prokhorov’s theorem (V. I. Bogachev, Measure Theory, volume II, p. 202, Theorem 8.6.2) states that there is a subsequence $\sigma_{k_{n}}$ of $\sigma_{n}$ that converges narrowly to some positive measure $\sigma$ on $\mathscr{B}_{\mathbb{R}}$. Finally, the sequence $t\mapsto\phi_{n}(xt)$ tends in $C_{b}([0,\infty))$ to $t\mapsto e^{-xt}$, and it thus follows that (cf. Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 511, Corollary 15.7)

 $\int_{0}^{\infty}\phi_{n}(xt)d\sigma_{n}(t)\to\int_{0}^{\infty}e^{-xt}d\sigma(t),$

so

 $f(x)-f(\infty)=\int_{0}^{\infty}e^{-xt}d\sigma(t).$

Let

 $\mu=\sigma+f(\infty)\delta_{0},$

with which

 $\int_{0}^{\infty}e^{-xt}d\mu(t)=\int_{0}^{\infty}e^{-xt}d\sigma(t)+f(\infty),$

hence

 $f(x)=\int_{0}^{\infty}e^{-xt}d\mu(t).$

Because $f(0)=1$, $\int_{0}^{\infty}d\mu(t)=1$, showing that $\mu$ is a probability measure. ∎
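The representation in Theorem 1 can be checked numerically in a simple case. The sketch below assumes the example $f(x)=\frac{1}{1+x}$, whose representing measure has density $e^{-t}$ with respect to Lebesgue measure on $[0,\infty)$, and uses a plain midpoint rule on a truncated interval; the truncation point, step count, and function names are illustrative choices, not part of the theorem.

```python
import math

def laplace_transform(density, x, t_max=60.0, steps=60000):
    """Midpoint-rule approximation of int_0^t_max exp(-x t) density(t) dt."""
    h = t_max / steps
    return h * sum(math.exp(-x * (k + 0.5) * h) * density((k + 0.5) * h)
                   for k in range(steps))

def density(t):
    """Density e^{-t} of the exponential probability measure (illustrative choice)."""
    return math.exp(-t)

# Compare the numerical Laplace transform with f(x) = 1/(1+x) at a few points.
xs = [0.0, 0.5, 1.0, 3.0]
errors = [abs(laplace_transform(density, x) - 1.0 / (1.0 + x)) for x in xs]
print(max(errors))
```

At $x=0$ the comparison is the statement that the measure has total mass $1$.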

## 4 Fourier transforms

For a topological space $X$ and a positive Borel measure $\mu$ on $X$, $F\subset X$ is called a support of $\mu$ if (i) $F$ is closed, (ii) $\mu(F^{c})=0$, and (iii) if $G$ is open and $G\cap F\neq\emptyset$ then $\mu(G\cap F)>0$. If $F_{1}$ and $F_{2}$ are supports of $\mu$, it is straightforward that $F_{1}=F_{2}$. It is a fact that if $X$ is second-countable then $\mu$ has a support, which we denote by $\mathrm{supp}\,\mu$ (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 442, Theorem 12.14).

###### Lemma 2.

If $\mu$ is a Borel measure on a topological space $X$ and $\mu$ has a support $\mathrm{supp}\,\mu$, if $f:X\to[0,\infty)$ is continuous and $\int_{X}fd\mu=0$ then $f(x)=0$ for all $x\in\mathrm{supp}\,\mu$.

###### Proof.

Let $F=\mathrm{supp}\,\mu$ and let $E=\{x\in X:f(x)\neq 0\}$. $E$ is an open subset of $X$. Suppose by contradiction that there is some $x\in E\cap F$, i.e. that $E\cap F\neq\emptyset$. Because $f$ is continuous and $f(x)>0$, there is some open neighborhood $G$ of $x$ for which $f(y)>f(x)/2$ for $y\in G$. Then $x\in G\cap F$, so $G\cap F\neq\emptyset$ and because $F$ is the support of $\mu$, $\mu(G\cap F)>0$ and a fortiori $\mu(G)>0$. Then

 $0=\int_{X}fd\mu\geq\int_{G}f(y)d\mu(y)\geq\int_{G}\frac{f(x)}{2}d\mu(y)=\frac{f(x)}{2}\mu(G)>0,$

a contradiction. Therefore $E\cap F=\emptyset$, i.e. for all $x\in F$, $f(x)=0$. ∎

The following lemma asserts that a certain function is nonzero $\lambda_{d}$-almost everywhere, where $\lambda_{d}$ is Lebesgue measure on $\mathbb{R}^{d}$ (Ward Cheney and Will Light, A Course in Approximation Theory, p. 91, chapter 13, Lemma 6).

###### Lemma 3.

Let $x_{1},\ldots,x_{n}$ be distinct points in $\mathbb{R}^{d}$, let $u\in\mathbb{C}^{n}$ not be the zero vector, and define

 $g(y)=\sum_{j=1}^{n}u_{j}e^{-2\pi ix_{j}\cdot y},\qquad y\in\mathbb{R}^{d}.$

For $\lambda_{d}$-almost all $y\in\mathbb{R}^{d}$, $g(y)\neq 0$.

The following theorem gives conditions under which the Fourier transform of a Borel measure on $\mathbb{R}^{d}$ is strictly positive definite (Ward Cheney and Will Light, A Course in Approximation Theory, p. 92, chapter 13, Theorem 3).

###### Theorem 4.

If $\mu$ is a finite Borel measure on $\mathbb{R}^{d}$ and $\lambda_{d}(\mathrm{supp}\,\mu)>0$, then $\hat{\mu}:\mathbb{R}^{d}\to\mathbb{C}$ is strictly positive definite.

###### Proof.

For distinct $x_{1},\ldots,x_{n}\in\mathbb{R}^{d}$ and for nonzero $u\in\mathbb{C}^{n}$,

 $\begin{aligned}\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\hat{\mu}(x_{j}-x_{k})&=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\int_{\mathbb{R}^{d}}e^{-2\pi i(x_{j}-x_{k})\cdot y}d\mu(y)\\ &=\int_{\mathbb{R}^{d}}\left(\sum_{j=1}^{n}u_{j}e^{-2\pi ix_{j}\cdot y}\right)\overline{\left(\sum_{k=1}^{n}u_{k}e^{-2\pi ix_{k}\cdot y}\right)}d\mu(y)\\ &=\int_{\mathbb{R}^{d}}\left|\sum_{j=1}^{n}u_{j}e^{-2\pi ix_{j}\cdot y}\right|^{2}d\mu(y)\\ &=\int_{\mathbb{R}^{d}}|g(y)|^{2}d\mu(y).\end{aligned}$

It is apparent that this is nonnegative. If it is equal to $0$ then because $g$ is continuous we obtain from Lemma 2 that $|g(y)|^{2}=0$ for all $y\in\mathrm{supp}\,\mu$, i.e. $g(y)=0$ for all $y\in\mathrm{supp}\,\mu$. In other words,

 $\mathrm{supp}\,\mu\subset\{y\in\mathbb{R}^{d}:g(y)=0\}.$

But by Lemma 3, $\lambda_{d}(\{y\in\mathbb{R}^{d}:g(y)=0\})=0$, so $\lambda_{d}(\mathrm{supp}\,\mu)=0$, contradicting the hypothesis $\lambda_{d}(\mathrm{supp}\,\mu)>0$. Therefore

 $\int_{\mathbb{R}^{d}}|g(y)|^{2}d\mu(y)>0,$

which shows that $\hat{\mu}$ is strictly positive definite. ∎

## 5 Schoenberg’s theorem

Let $(X,\left\langle\cdot,\cdot\right\rangle)$ be a real inner product space. We call a function $F:X\to\mathbb{R}$ radial when $\left\|x\right\|=\left\|y\right\|$ implies that $F(x)=F(y)$.

An identity that is worth memorizing is that for $y\in\mathbb{R}$,

 $\int_{\mathbb{R}}e^{-\pi x^{2}}e^{-2\pi ixy}dx=e^{-\pi y^{2}}.$

Using this and Fubini’s theorem yields, for $y\in\mathbb{R}^{d}$,

 $\int_{\mathbb{R}^{d}}e^{-\pi|x|^{2}}e^{-2\pi i\left\langle x,y\right\rangle}dx=e^{-\pi|y|^{2}}.$
###### Lemma 5.

For $\alpha>0$ and $y\in\mathbb{R}^{d}$,

 $\int_{\mathbb{R}^{d}}\left(\frac{\pi}{\alpha}\right)^{d/2}\exp\left(-\frac{\pi^{2}}{\alpha}|x|^{2}\right)e^{-2\pi i\left\langle x,y\right\rangle}dx=e^{-\alpha|y|^{2}}.$
###### Proof.

Define $T:\mathbb{R}^{d}\to\mathbb{R}^{d}$ by

 $T(x)=\sqrt{\frac{\pi}{\alpha}}x,\qquad x\in\mathbb{R}^{d}.$

$T^{\prime}(x)=\sqrt{\frac{\pi}{\alpha}}I\in\mathscr{L}(\mathbb{R}^{d})$ and $J_{T}(x)=\det T^{\prime}(x)=\left(\frac{\pi}{\alpha}\right)^{d/2}$. Let $u\in\mathbb{R}^{d}$ and define $f(x)=e^{-\pi|x|^{2}}e^{-2\pi i\left\langle x,u\right\rangle}$. By the change of variables formula (Charalambos D. Aliprantis and Owen Burkinshaw, Principles of Real Analysis, third ed., p. 393, Theorem 40.7),

 $\int_{\mathbb{R}^{d}}(f\circ T)\cdot|J_{T}|d\lambda_{d}=\int_{T(\mathbb{R}^{d})}fd\lambda_{d},$

and because $T$ is self-adjoint this is

 $\int_{\mathbb{R}^{d}}e^{-\pi|T(x)|^{2}}e^{-2\pi i\left\langle x,Tu\right\rangle}\left(\frac{\pi}{\alpha}\right)^{d/2}dx=\int_{\mathbb{R}^{d}}e^{-\pi|x|^{2}}e^{-2\pi i\left\langle x,u\right\rangle}dx,$

and therefore

 $\int_{\mathbb{R}^{d}}\left(\frac{\pi}{\alpha}\right)^{d/2}\exp\left(-\frac{\pi^{2}}{\alpha}|x|^{2}\right)e^{-2\pi i\left\langle x,Tu\right\rangle}dx=e^{-\pi|u|^{2}}.$

For $u=T^{-1}(y)=\sqrt{\frac{\alpha}{\pi}}y$ this is

 $\int_{\mathbb{R}^{d}}\left(\frac{\pi}{\alpha}\right)^{d/2}\exp\left(-\frac{\pi^{2}}{\alpha}|x|^{2}\right)e^{-2\pi i\left\langle x,y\right\rangle}dx=e^{-\alpha|y|^{2}},$

proving the claim. ∎
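Lemma 5 can be spot-checked numerically for $d=1$. The following is a sketch under stated assumptions: it uses the trapezoid rule on a truncated interval (the bounds and step count are illustrative), and the function name `lemma5_lhs` is hypothetical. Taking $\alpha=\pi$ recovers the identity $\int_{\mathbb{R}}e^{-\pi x^{2}}e^{-2\pi ixy}dx=e^{-\pi y^{2}}$.

```python
import math

def lemma5_lhs(alpha, y, x_max=10.0, steps=20000):
    """Trapezoid-rule approximation of the left-hand side of Lemma 5 for d = 1.

    The Gaussian factor is even in x, so the imaginary (sine) part integrates
    to 0 and only the cosine part is summed.
    """
    h = 2.0 * x_max / steps
    total = 0.0
    for k in range(steps + 1):
        x = -x_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight * math.exp(-math.pi ** 2 * x * x / alpha)
                  * math.cos(2 * math.pi * x * y))
    return math.sqrt(math.pi / alpha) * h * total

# Pairs (alpha, y); the right-hand side of Lemma 5 is exp(-alpha * y^2).
cases = [(math.pi, 0.0), (math.pi, 1.0), (2.0, 0.7), (0.5, 1.5)]
errors = [abs(lemma5_lhs(a, y) - math.exp(-a * y * y)) for a, y in cases]
print(max(errors))
```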

We now prove that on a real inner product space, $x\mapsto e^{-\alpha\left\|x\right\|^{2}}$ is strictly positive definite whenever $\alpha>0$ (Ward Cheney and Will Light, A Course in Approximation Theory, p. 104, chapter 15, Theorem 2).

###### Theorem 6.

Let $(X,\left\langle\cdot,\cdot\right\rangle)$ be a real inner product space. If $\alpha>0$, then

 $x\mapsto e^{-\alpha\left\|x\right\|^{2}},\qquad x\in X,$

is radial and strictly positive definite.

###### Proof.

Let $x_{1},\ldots,x_{n}$ be distinct points in $X$. There is a finite-dimensional linear subspace $V$ of $X$, say of dimension $d$, that contains $x_{1},\ldots,x_{n}$. By the Gram-Schmidt process, $V$ has an orthonormal basis $\{v_{1},\ldots,v_{d}\}$. Define $T:V\to\mathbb{R}^{d}$ by $Tv_{j}=e_{j}$, where $\{e_{1},\ldots,e_{d}\}$ is the standard basis for $\mathbb{R}^{d}$; this $T$ is an orthogonal transformation. Define

 $f(u)=e^{-\alpha|u|^{2}},\qquad u\in\mathbb{R}^{d}.$

For $u\in\mathbb{C}^{n}$, $u\neq 0$,

 $\begin{aligned}\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}e^{-\alpha\left\|x_{j}-x_{k}\right\|^{2}}&=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-\alpha|T(x_{j}-x_{k})|^{2}\right)\\ &=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}f(Tx_{j}-Tx_{k}).\end{aligned}$

Now, let $\mu$ be the Borel measure on $\mathbb{R}^{d}$ whose density with respect to $\lambda_{d}$ is

 $y\mapsto\left(\frac{\pi}{\alpha}\right)^{d/2}\exp\left(-\frac{\pi^{2}}{\alpha}|y|^{2}\right).$

Because $\mu$ is nonzero and absolutely continuous with respect to $\lambda_{d}$, $\lambda_{d}(\mathrm{supp}\,\mu)>0$ (otherwise $\mu(\mathrm{supp}\,\mu)=0$, which would force $\mu=0$), so Theorem 4 states that the Fourier transform $\hat{\mu}:\mathbb{R}^{d}\to\mathbb{C}$ is strictly positive definite. Applying Lemma 5, the Fourier transform of $\mu$ is

 $\hat{\mu}(u)=\int_{\mathbb{R}^{d}}\left(\frac{\pi}{\alpha}\right)^{d/2}\exp\left(-\frac{\pi^{2}}{\alpha}|y|^{2}\right)e^{-2\pi i\left\langle y,u\right\rangle}dy=e^{-\alpha|u|^{2}}=f(u),$

so $f$ is strictly positive definite. Because $T$ is an orthogonal transformation it is in particular one-to-one, so $Tx_{1},\ldots,Tx_{n}$ are distinct points in $\mathbb{R}^{d}$. Thus the fact that $f$ is strictly positive definite means that

 $\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}e^{-\alpha\left\|x_{j}-x_{k}\right\|^{2}}=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}f(Tx_{j}-Tx_{k})>0,$

which establishes that $x\mapsto e^{-\alpha\left\|x\right\|^{2}}$ is strictly positive definite. ∎
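One way to see Theorem 6 at work numerically is to build the matrix $\left(e^{-\alpha\left\|x_{j}-x_{k}\right\|^{2}}\right)_{j,k}$ for a few distinct points and run a Cholesky factorization, which succeeds with positive pivots exactly when a real symmetric matrix is (numerically) strictly positive definite. This is a sketch only: the sample points in $\mathbb{R}^{2}$, the value $\alpha=1$, and the helper names are illustrative assumptions.

```python
import math

def gaussian_kernel_matrix(points, alpha):
    """Matrix with entries exp(-alpha * |x_j - x_k|^2) for points in R^d."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [[math.exp(-alpha * sq_dist(p, q)) for q in points] for p in points]

def cholesky_pivots(M):
    """Run Cholesky on a real symmetric matrix and return the squared pivots.

    All squared pivots are > 0 exactly when M is (numerically) strictly
    positive definite; a ValueError signals a non-positive pivot.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    pivots = []
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not strictly positive definite")
                pivots.append(s)
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return pivots

# Five distinct points in the plane: the corners of the unit square and its center.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
pivots = cholesky_pivots(gaussian_kernel_matrix(points, alpha=1.0))
print(min(pivots) > 0.0)
```

For points that are very close together, or for very small $\alpha$, the matrix becomes ill-conditioned and a floating-point factorization may fail even though the exact matrix is strictly positive definite.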

The following is Schoenberg’s theorem (Ward Cheney and Will Light, A Course in Approximation Theory, p. 101, chapter 15, Theorem 1; René L. Schilling, Renming Song, and Zoran Vondraček, Bernstein Functions: Theory and Applications, p. 142, Theorem 12.14; William F. Donoghue Jr., Distributions and Fourier Transforms, p. 205, §41).

###### Theorem 7 (Schoenberg’s theorem).

Let $(X,\left\langle\cdot,\cdot\right\rangle)$ be a real inner product space. If $f:[0,\infty)\to\mathbb{R}$ is completely monotone, $f(0)=1$, and $f$ is not constant, then

 $x\mapsto f(\left\|x\right\|^{2}),\qquad X\to[0,\infty),$

is radial and strictly positive definite.

###### Proof.

Because $f$ is completely monotone, the Bernstein-Widder theorem (Theorem 1) tells us that there is a Borel probability measure $\mu$ on $[0,\infty)$ such that

 $f(t)=\int_{0}^{\infty}e^{-st}d\mu(s),\qquad t\in[0,\infty),$

that is, $f$ is the Laplace transform of $\mu$. Now, the Laplace transform of $\delta_{0}$ is $t\mapsto 1$, and because $f$ is not constant, the Laplace transform of $\mu$ is not equal to the Laplace transform of $\delta_{0}$, which implies that $\mu\neq\delta_{0}$ (Bert Fristedt and Lawrence Gray, A Modern Approach to Probability Theory, p. 218, §13.5, Theorem 6). Therefore $\mu((0,\infty))>0$.

Let $x_{1},\ldots,x_{n}$ be distinct points in $X$ and let $u\in\mathbb{C}^{n}$, $u\neq 0$. Then, because $\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\geq 0$,

 $\begin{aligned}\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}f(\left\|x_{j}-x_{k}\right\|^{2})&=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\int_{0}^{\infty}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)d\mu(s)\\ &=\int_{0}^{\infty}\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)d\mu(s)\\ &=\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\mu(\{0\})\\ &\quad+\int_{0}^{\infty}1_{(0,\infty)}(s)\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)d\mu(s)\\ &\geq\int_{0}^{\infty}1_{(0,\infty)}(s)\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)d\mu(s)\\ &=\int_{0}^{\infty}g(s)d\mu(s),\end{aligned}$

where $g(s)=1_{(0,\infty)}(s)\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)$ for $s\in[0,\infty)$.

Assume by contradiction that $\int_{0}^{\infty}g(s)d\mu(s)=0$. Because $g\geq 0$, this implies that $\mu(\{s\in[0,\infty):g(s)>0\})=0$ (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 411, Theorem 11.16). By Theorem 6, for each $s>0$,

 $\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}\exp\left(-s\left\|x_{j}-x_{k}\right\|^{2}\right)>0,$

so $g(s)>0$ when $s>0$. Thus $\mu((0,\infty))=0$, a contradiction. Therefore,

 $\sum_{j=1}^{n}\sum_{k=1}^{n}u_{j}\overline{u_{k}}f(\left\|x_{j}-x_{k}\right\|^{2})\geq\int_{0}^{\infty}g(s)d\mu(s)>0,$

which shows that $x\mapsto f(\left\|x\right\|^{2})$ is strictly positive definite. ∎
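As an illustration of Schoenberg's theorem, take the assumed example $f(t)=\frac{1}{1+t}$, which is completely monotone, satisfies $f(0)=1$, and is not constant, so the theorem gives that $x\mapsto\frac{1}{1+\left\|x\right\|^{2}}$ is strictly positive definite. The sketch below evaluates the quadratic form for random complex coefficient vectors at a few fixed points in $\mathbb{R}^{3}$; the points, the seed, and the name `kernel_form` are hypothetical choices, and random sampling is evidence rather than proof.

```python
import random

def kernel_form(f, points, u):
    """Return sum_{j,k} u_j * conj(u_k) * f(|x_j - x_k|^2) for points in R^d."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    n = len(points)
    return sum(u[j] * u[k].conjugate() * f(sq_dist(points[j], points[k]))
               for j in range(n) for k in range(n))

def f(t):
    """f(t) = 1/(1+t): completely monotone, f(0) = 1, not constant."""
    return 1.0 / (1.0 + t)

random.seed(1)  # fixed seed so the run is reproducible
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]
forms = []
for _ in range(100):
    u = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in points]
    forms.append(kernel_form(f, points, u))

# The form is real (up to rounding) and should be strictly positive.
smallest = min(z.real for z in forms)
print(smallest > 0.0)
```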