Wiener measure and Donsker’s theorem

Jordan Bell

September 4, 2015

1 Relatively compact sets of Borel probability measures on C[0,1]

Let $E=C[0,1]$ , let $\mathscr{B}_{E}$ be the Borel $\sigma$ -algebra of $E$ , and let $\mathscr{P}_{E}$ be the collection of Borel probability measures on $E$ . We assign $\mathscr{P}_{E}$ the narrow topology, the coarsest topology on $\mathscr{P}_{E}$ such that for each $F\in C_{b}(E)$ the map $\mu\mapsto\int_{E}Fd\mu$ is continuous.

For $f\in E$ and $\delta>0$ we define

\omega_{f}(\delta)=\sup_{s,t\in[0,1],|s-t|\leq\delta}|f(s)-f(t)|.

For $f\in E$ , $\omega_{f}(\delta)\downarrow 0$ as $\delta\downarrow 0$ , and for $\delta>0$ , $f\mapsto\omega_{f}(\delta)$ is continuous. We shall use the following characterization of a relatively compact subset $A$ of $E$ , which is proved using the Arzelà-Ascoli theorem.
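The modulus of continuity can be approximated numerically. The following is a minimal sketch, assuming $f$ is sampled on a uniform grid of $[0,1]$; the function name `modulus_of_continuity` and the grid size are illustrative choices, not part of the text. Because only grid points are compared, the value is a lower bound for $\omega_{f}(\delta)$ .

```python
import numpy as np

def modulus_of_continuity(values, delta):
    """Discrete stand-in for omega_f(delta), for f sampled on a uniform grid of [0, 1].

    values[i] is f(i / (len(values) - 1)). Only grid points are compared, so the
    result is a lower bound for the true supremum.
    """
    values = np.asarray(values, dtype=float)
    m = len(values) - 1                              # number of grid intervals
    # Largest index gap whose time difference is at most delta (small guard for rounding).
    max_lag = min(int(np.floor(delta * m + 1e-12)), m)
    best = 0.0
    for lag in range(1, max_lag + 1):
        best = max(best, float(np.max(np.abs(values[lag:] - values[:-lag]))))
    return best

grid = np.linspace(0.0, 1.0, 1001)
identity = grid                  # f(t) = t, for which omega_f(delta) = delta
square_root = np.sqrt(grid)      # f(t) = sqrt(t), for which omega_f(delta) = sqrt(delta)

deltas = (0.4, 0.2, 0.1)
omega_identity = [modulus_of_continuity(identity, d) for d in deltas]
omega_sqrt = [modulus_of_continuity(square_root, d) for d in deltas]
```

For both sample functions the computed values decrease as $\delta$ decreases, in line with $\omega_{f}(\delta)\downarrow 0$ .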

Lemma 1.

Let $A$ be a subset of $E$ . $\overline{A}$ is compact if and only if

\sup_{f\in A}|f(0)|<\infty

and

\sup_{f\in A}\omega_{f}(\delta)\downarrow 0,\qquad\delta\downarrow 0.

We shall use Prokhorov’s theorem (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 47, Chapter II, Theorem 6.7): for $X$ a Polish space and for $\Gamma\subset\mathscr{P}_{X}$ , $\overline{\Gamma}$ is compact if and only if for each $\epsilon>0$ there is a compact subset $K_{\epsilon}$ of $X$ such that $\mu(K_{\epsilon})\geq 1-\epsilon$ for all $\mu\in\Gamma$ . Namely, a subset of $\mathscr{P}_{X}$ is relatively compact if and only if it is tight. We use Prokhorov’s theorem to prove a characterization of relatively compact subsets of $\mathscr{P}_{E}$ , which we then use to prove the characterization in Theorem 3 (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 213, Chapter VII, Lemma 2.2).

Lemma 2.

Let $\Gamma$ be a subset of $\mathscr{P}_{E}$ . $\overline{\Gamma}$ is compact if and only if for each $\epsilon>0$ there is some $M_{\epsilon}<\infty$ and a function $\delta\mapsto\omega_{\epsilon}(\delta)$ satisfying $\omega_{\epsilon}(\delta)\downarrow 0$ as $\delta\downarrow 0$ and such that for all $\mu\in\Gamma$ ,

\mu(A_{\epsilon})\geq 1-\frac{\epsilon}{2},\qquad\mu(B_{\epsilon})\geq 1-\frac{\epsilon}{2},

where

A_{\epsilon}=\{f\in E:|f(0)|\leq M_{\epsilon}\},\qquad B_{\epsilon}=\{f\in E:\textrm{$\omega_{f}(\delta)\leq\omega_{\epsilon}(\delta)$ for all $\delta>0$}\}.

Proof.

Suppose that $\Gamma$ satisfies the above conditions. Because $f\mapsto|f(0)|$ is continuous, $A_{\epsilon}$ is closed. For $\delta>0$ , suppose that $f_{n}$ is a sequence in $B_{\epsilon}$ tending to some $f\in E$ . Because $g\mapsto\omega_{g}(\delta)$ is continuous, $\omega_{f_{n}}(\delta)\to\omega_{f}(\delta)$ , and because $\omega_{f_{n}}(\delta)\leq\omega_{\epsilon}(\delta)$ for each $n$ , we get $\omega_{f}(\delta)\leq\omega_{\epsilon}(\delta)$ and hence $f\in B_{\epsilon}$ , showing that $B_{\epsilon}$ is closed. Therefore $K_{\epsilon}=A_{\epsilon}\cap B_{\epsilon}$ is closed, i.e. $K_{\epsilon}=\overline{K_{\epsilon}}$ . The set $K_{\epsilon}$ satisfies

\sup_{f\in K_{\epsilon}}|f(0)|\leq M_{\epsilon}

and

\limsup_{\delta\downarrow 0}\sup_{f\in K_{\epsilon}}\omega_{f}(\delta)\leq\limsup_{\delta\downarrow 0}\omega_{\epsilon}(\delta)=0,

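thus by Lemma 1, $K_{\epsilon}$ is compact. For $\mu\in\Gamma$ , because $E\setminus K_{\epsilon}=(E\setminus A_{\epsilon})\cup(E\setminus B_{\epsilon})$ ,

\mu(K_{\epsilon})\geq 1-\mu(E\setminus A_{\epsilon})-\mu(E\setminus B_{\epsilon})\geq 1-\epsilon,

and because $K_{\epsilon}$ is compact, this means that $\Gamma$ is tight, so by Prokhorov’s theorem, $\Gamma$ is relatively compact.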

Now suppose that $\Gamma$ is relatively compact and let $\epsilon>0$ . By Prokhorov’s theorem, there is a compact set $K_{\epsilon}$ in $E$ such that $\mu(K_{\epsilon})\geq 1-\frac{\epsilon}{2}$ for all $\mu\in\Gamma$ . Define

M_{\epsilon}=\sup_{f\in K_{\epsilon}}|f(0)|,\qquad\omega_{\epsilon}(\delta)=\sup_{f\in K_{\epsilon}}\omega_{f}(\delta),\qquad\delta>0.

Because $K_{\epsilon}$ is compact, by Lemma 1 we get that $M_{\epsilon}<\infty$ and $\omega_{\epsilon}(\delta)\downarrow 0$ as $\delta\downarrow 0$ . For $\mu\in\Gamma$ ,

\mu(A_{\epsilon})\geq\mu(K_{\epsilon})\geq 1-\frac{\epsilon}{2},\qquad\mu(B_{\epsilon})\geq\mu(K_{\epsilon})\geq 1-\frac{\epsilon}{2},

showing that $\Gamma$ satisfies the conditions of the lemma. ∎

We now prove the characterization of relatively compact subsets of $\mathscr{P}_{E}$ that we shall use in our proof of Donsker’s theorem (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 214, Chapter VII, Theorem 2.2).

Theorem 3 (Relatively compact sets in $\mathscr{P}_{E}$ ).

Let $\Gamma$ be a subset of $\mathscr{P}_{E}$ . $\overline{\Gamma}$ is compact if and only if the following conditions are satisfied:

1. For each $\epsilon>0$ there is some $M_{\epsilon}<\infty$ such that

$\mu(f:|f(0)|\leq M_{\epsilon})\geq 1-\frac{\epsilon}{2},\qquad\mu\in\Gamma.$

2. For each $\epsilon>0$ and $\delta>0$ there is some $\eta=\eta(\epsilon,\delta)>0$ such that

$\mu(f:\omega_{f}(\eta)\leq\delta)\geq 1-\frac{\epsilon}{2},\qquad\mu\in\Gamma.$

Proof.

Suppose that $\overline{\Gamma}$ is compact and let $\epsilon>0$ . By Lemma 2, there is some $M_{\epsilon}<\infty$ and a function $\eta\mapsto\omega_{\epsilon}(\eta)$ satisfying $\omega_{\epsilon}(\eta)\downarrow 0$ as $\eta\downarrow 0$ and

\mu(A_{\epsilon})\geq 1-\frac{\epsilon}{2},\qquad\mu(B_{\epsilon})\geq 1-\frac{\epsilon}{2},\qquad\mu\in\Gamma.

For $\delta>0$ , there is some $\eta=\eta(\epsilon,\delta)$ with $\omega_{\epsilon}(\eta)\leq\delta$ . Then for $\mu\in\Gamma$ ,

\mu(f:\omega_{f}(\eta)\leq\delta)\geq\mu(f:\omega_{f}(\eta)\leq\omega_{\epsilon}(\eta))\geq\mu(B_{\epsilon})\geq 1-\frac{\epsilon}{2}.

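Now suppose that the conditions of the theorem hold. For each $\epsilon>0$ and $n\geq 1$ , applying the second condition with $\epsilon/2^{n}$ in place of $\epsilon$ and $1/n$ in place of $\delta$ , there is some $\eta_{\epsilon,n}>0$ such that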

\mu(F_{\epsilon,n})\geq 1-\frac{\epsilon}{2^{n+1}},\qquad\mu\in\Gamma,

where

F_{\epsilon,n}=\left\{f:\omega_{f}(\eta_{\epsilon,n})\leq\frac{1}{n}\right\}.

Let

K_{\epsilon}=\{f:|f(0)|\leq M_{\epsilon}\}\cap\bigcap_{n=1}^{\infty}F_{\epsilon,n},

for which, by countable subadditivity applied to the complement,

\mu(E\setminus K_{\epsilon})\leq\mu(f:|f(0)|>M_{\epsilon})+\sum_{n=1}^{\infty}\mu(E\setminus F_{\epsilon,n})\leq\frac{\epsilon}{2}+\sum_{n=1}^{\infty}\frac{\epsilon}{2^{n+1}}=\epsilon,\qquad\mu\in\Gamma,

that is, $\mu(K_{\epsilon})\geq 1-\epsilon$ for all $\mu\in\Gamma$ .

If $f\in K_{\epsilon}$ , then for each $n\geq 1$ we have $f\in F_{\epsilon,n}$ , which means that $\omega_{f}(\eta_{\epsilon,n})\leq\frac{1}{n}$ , and therefore

\sup_{f\in K_{\epsilon}}\omega_{f}(\eta_{\epsilon,n})\leq\frac{1}{n}.

Because $\eta\mapsto\omega_{f}(\eta)$ is nondecreasing, for $n\geq 1$ and $0<\eta\leq\eta_{\epsilon,n}$ we have

\sup_{f\in K_{\epsilon}}\omega_{f}(\eta)\leq\frac{1}{n},

which shows $\sup_{f\in K_{\epsilon}}\omega_{f}(\eta)\downarrow 0$ as $\eta\downarrow 0$ . Then because

\sup_{f\in K_{\epsilon}}|f(0)|\leq M_{\epsilon},

applying Lemma 1 we get that $\overline{K_{\epsilon}}$ is compact. The map $f\mapsto\omega_{f}(\eta_{\epsilon,n})$ is continuous, so the set $F_{\epsilon,n}$ is closed, and therefore the set $K_{\epsilon}$ is closed. Because $K_{\epsilon}$ is compact and $\mu(K_{\epsilon})\geq 1-\epsilon$ for all $\mu\in\Gamma$ , it follows from Prokhorov’s theorem that $\Gamma$ is relatively compact. ∎

2 Wiener measure

For $t_{1},\ldots,t_{d}\in[0,1]$ , $t_{1}<\cdots<t_{d}$ , define $\pi_{t_{1},\ldots,t_{d}}:E\to\mathbb{R}^{d}$ by

\pi_{t_{1},\ldots,t_{d}}(f)=(f(t_{1}),\ldots,f(t_{d})),\qquad f\in E,

which is continuous. We state the following results, which we will use later (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 212, Chapter VII, Theorem 2.1).

Theorem 4 (The Borel $\sigma$ -algebra of $E$ ).

$\mathscr{B}_{E}$ is equal to the $\sigma$ -algebra generated by $\{\pi_{t}:t\in[0,1]\}$ .

Two elements $\mu$ and $\nu$ of $\mathscr{P}_{E}$ are equal if and only if for any $d$ and any $t_{1}<\cdots<t_{d}$ , the pushforward measures

\mu_{t_{1},\ldots,t_{d}}=(\pi_{t_{1},\ldots,t_{d}})_{*}\mu,\qquad\nu_{t_{1},\ldots,t_{d}}=(\pi_{t_{1},\ldots,t_{d}})_{*}\nu

are equal.

Let $(\xi_{t})_{t\in[0,1]}$ be a stochastic process with state space $\mathbb{R}$ and sample space $(\Omega,\mathscr{F},P)$ . For $t_{1}<\cdots<t_{d}$ , let $\xi_{t_{1},\ldots,t_{d}}=\xi_{t_{1}}\otimes\cdots\otimes\xi_{t_{d}}$ and let $P_{t_{1},\ldots,t_{d}}=(\xi_{t_{1},\ldots,t_{d}})_{*}P$ : for $B\in\mathscr{B}_{\mathbb{R}}^{d}$ ,

P_{t_{1},\ldots,t_{d}}(B)=((\xi_{t_{1},\ldots,t_{d}})_{*}P)(B)=P(\xi_{t_{1},\ldots,t_{d}}^{-1}(B))=P((\xi_{t_{1}},\ldots,\xi_{t_{d}})\in B).

$P_{t_{1},\ldots,t_{d}}$ is a Borel probability measure on $\mathbb{R}^{d}$ and is called a finite-dimensional distribution of the stochastic process.

The Kolmogorov continuity theorem (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 216, Chapter VII, Theorem 3.1) tells us that if there are $\alpha,\beta,K>0$ such that for all $s,t\in[0,1]$ ,

E|\xi_{t}-\xi_{s}|^{\alpha}\leq K|t-s|^{1+\beta},

then there is a unique $\mu\in\mathscr{P}_{E}$ such that for all $d$ and for all $t_{1}<\cdots<t_{d}$ ,

\mu_{t_{1},\ldots,t_{d}}=P_{t_{1},\ldots,t_{d}}.

We now define and prove the existence of Wiener measure (K. R. Parthasarathy, Probability Measures on Metric Spaces, p. 218, Chapter VII, Theorem 3.2).

Theorem 5 (Wiener measure).

There is a unique Borel probability measure $W$ on $E$ satisfying:

1. $W(f\in E:f(0)=0)=1$ .

2. For $0\leq t_{0}<t_{1}<\cdots<t_{d}\leq 1$ the random variables

$\pi_{t_{1}}-\pi_{t_{0}},\quad\pi_{t_{2}}-\pi_{t_{1}},\quad\pi_{t_{3}}-\pi_{t_{2}},\quad\ldots,\quad\pi_{t_{d}}-\pi_{t_{d-1}}$

are independent $(E,\mathscr{B}_{E},W)\to(\mathbb{R},\mathscr{B}_{\mathbb{R}})$ .

3. If $0\leq s<t\leq 1$ , the random variable $\pi_{t}-\pi_{s}:(E,\mathscr{B}_{E},W)\to(\mathbb{R},\mathscr{B}_{\mathbb{R}})$ is normal with mean $0$ and variance $t-s$ .

Proof.

There is a stochastic process $(\xi_{t})_{t\in[0,1]}$ with state space $\mathbb{R}$ and some sample space $(\Omega,\mathscr{F},P)$ , such that (i) $P(\xi_{0}=0)=1$ , (ii) $(\xi_{t})_{t\in[0,1]}$ has independent increments, and (iii) for $s<t$ , $\xi_{t}-\xi_{s}$ is a normal random variable with mean $0$ and variance $t-s$ . (Namely, Brownian motion with starting point $0$ .) Because $\xi_{t}-\xi_{s}$ has mean $0$ and variance $t-s$ , we calculate (cf. Isserlis’s theorem)

E|\xi_{t}-\xi_{s}|^{4}=3|t-s|^{2}.
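The fourth moment identity can be checked numerically. The following is a minimal sketch, assuming NumPy's Gauss-Hermite rule `hermegauss` for the weight $e^{-x^{2}/2}$ ; the helper name `gaussian_moment` and the sample times are illustrative choices, not part of the text.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_moment(power, variance, nodes=20):
    """E[Z**power] for Z ~ N(0, variance), by Gauss-Hermite quadrature.

    hermegauss returns nodes x_i and weights w_i for the weight exp(-x**2 / 2);
    the weights sum to sqrt(2 * pi), so dividing by that normalizes to N(0, 1).
    The rule is exact for polynomials of degree below 2 * nodes.
    """
    x, w = hermegauss(nodes)
    sigma = np.sqrt(variance)
    return float(np.sum(w * (sigma * x) ** power) / np.sqrt(2.0 * np.pi))

s, t = 0.25, 0.75
second = gaussian_moment(2, t - s)   # should match t - s
fourth = gaussian_moment(4, t - s)   # should match 3 * (t - s) ** 2
```

With these sample times the fourth moment agrees with $3|t-s|^{2}$ up to rounding, which is the hypothesis of the Kolmogorov continuity theorem with $\alpha=4$ and $\beta=1$ .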

Thus using the Kolmogorov continuity theorem with $\alpha=4$ , $\beta=1$ , $K=3$ , there is a unique $W\in\mathscr{P}_{E}$ such that for all $t_{1}<\cdots<t_{d}$ ,

W_{t_{1},\ldots,t_{d}}=P_{t_{1},\ldots,t_{d}},

i.e. for $B\in\mathscr{B}_{\mathbb{R}}^{d}$ ,

W(\pi_{t_{1}}\otimes\cdots\otimes\pi_{t_{d}}\in B)=P(\xi_{t_{1}}\otimes\cdots\otimes\xi_{t_{d}}\in B).

For $t_{1}<\cdots<t_{d}$ and $B\in\mathscr{B}_{\mathbb{R}}^{d}$ , with $T:\mathbb{R}^{d}\to\mathbb{R}^{d}$ defined by $T(x_{1},\ldots,x_{d})=(x_{1},x_{2}-x_{1},\ldots,x_{d}-x_{d-1})$ ,

\begin{split}&W(\pi_{t_{1}}\otimes(\pi_{t_{2}}-\pi_{t_{1}})\otimes\cdots\otimes(\pi_{t_{d}}-\pi_{t_{d-1}})\in B)\\
=&W(T\circ(\pi_{t_{1}}\otimes\pi_{t_{2}}\otimes\cdots\otimes\pi_{t_{d}})\in B)\\
=&W(\pi_{t_{1}}\otimes\pi_{t_{2}}\otimes\cdots\otimes\pi_{t_{d}}\in T^{-1}(B))\\
=&P(\xi_{t_{1}}\otimes\xi_{t_{2}}\otimes\cdots\otimes\xi_{t_{d}}\in T^{-1}(B))\\
=&P(T\circ(\xi_{t_{1}}\otimes\xi_{t_{2}}\otimes\cdots\otimes\xi_{t_{d}})\in B)\\
=&P(\xi_{t_{1}}\otimes(\xi_{t_{2}}-\xi_{t_{1}})\otimes\cdots\otimes(\xi_{t_{d}}-\xi_{t_{d-1}})\in B).\end{split}

Hence, because $\xi_{t_{1}},\xi_{t_{2}}-\xi_{t_{1}},\ldots,\xi_{t_{d}}-\xi_{t_{d-1}}$ are independent,

\begin{split}&(\pi_{t_{1}}\otimes(\pi_{t_{2}}-\pi_{t_{1}})\otimes\cdots\otimes(\pi_{t_{d}}-\pi_{t_{d-1}}))_{*}W\\
=&(\xi_{t_{1}}\otimes(\xi_{t_{2}}-\xi_{t_{1}})\otimes\cdots\otimes(\xi_{t_{d}}-\xi_{t_{d-1}}))_{*}P\\
=&(\xi_{t_{1}})_{*}P\otimes(\xi_{t_{2}}-\xi_{t_{1}})_{*}P\otimes\cdots\otimes(\xi_{t_{d}}-\xi_{t_{d-1}})_{*}P\\
=&(\pi_{t_{1}})_{*}W\otimes(\pi_{t_{2}}-\pi_{t_{1}})_{*}W\otimes\cdots\otimes(\pi_{t_{d}}-\pi_{t_{d-1}})_{*}W,\end{split}

which means that the random variables $\pi_{t_{1}},\pi_{t_{2}}-\pi_{t_{1}},\ldots,\pi_{t_{d}}-\pi_{t_{d-1}}$ are independent.

If $s<t$ and $B_{1},B_{2}\in\mathscr{B}_{\mathbb{R}}$ , then with $T:\mathbb{R}^{2}\to\mathbb{R}^{2}$ defined by $T(x,y)=(x,y-x)$ ,

\begin{split}W((\pi_{s},\pi_{t}-\pi_{s})\in B_{1}\times B_{2})&=W(T\circ(\pi_{s},\pi_{t})\in B_{1}\times B_{2})\\
&=W((\pi_{s},\pi_{t})\in T^{-1}(B_{1}\times B_{2}))\\
&=P((\xi_{s},\xi_{t})\in T^{-1}(B_{1}\times B_{2}))\\
&=P((\xi_{s},\xi_{t}-\xi_{s})\in B_{1}\times B_{2}),\end{split}

which implies that $(\pi_{t}-\pi_{s})_{*}W=(\xi_{t}-\xi_{s})_{*}P$ , and because $\xi_{t}-\xi_{s}$ is a normal random variable with mean $0$ and variance $t-s$ , so is $\pi_{t}-\pi_{s}$ .

Finally,

W(f:f(0)=0)=W(\pi_{0}=0)=P(\xi_{0}=0)=1.

∎

$(E,\mathscr{B}_{E},W)$ is a probability space, and the stochastic process $(\pi_{t})_{t\in[0,1]}$ is a Brownian motion.

3 Interpolation and continuous stochastic processes

Let $(\xi_{t})_{t\in[0,1]}$ be a continuous stochastic process with state space $\mathbb{R}$ and sample space $(\Omega,\mathscr{F},P)$ . To say that the stochastic process is continuous means that for each $\omega\in\Omega$ the map $t\mapsto\xi_{t}(\omega)$ is continuous $[0,1]\to\mathbb{R}$ . Define $\xi:\Omega\to E$ by

\xi(\omega)=(t\mapsto\xi_{t}(\omega)),\qquad\omega\in\Omega.

For $t\in[0,1]$ and $B$ a Borel set in $\mathbb{R}$ ,

\xi^{-1}\pi_{t}^{-1}B=\{\omega\in\Omega:\xi_{t}(\omega)\in B\}=\xi_{t}^{-1}B,

and because $\xi_{t}:(\Omega,\mathscr{F})\to(\mathbb{R},\mathscr{B}_{\mathbb{R}})$ is measurable this belongs to $\mathscr{F}$ . But by Theorem 4, $\mathscr{B}_{E}$ is generated by the collection $\{\pi_{t}^{-1}B:t\in[0,1],B\in\mathscr{B}_{\mathbb{R}}\}$ . Now, for $f:X\to Y$ and for a nonempty collection $\mathscr{C}$ of subsets of $Y$ (Charalambos D. Aliprantis and Kim C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, third ed., p. 140, Lemma 4.23),

\sigma(f^{-1}(\mathscr{C}))=f^{-1}(\sigma(\mathscr{C})).

Therefore $\xi^{-1}(\mathscr{B}_{E})\subset\mathscr{F}$ , which means that $\xi:(\Omega,\mathscr{F})\to(E,\mathscr{B}_{E})$ is measurable. This means that a continuous stochastic process with index set $[0,1]$ induces a random variable with state space $E$ . Then the pushforward measure of $P$ by $\xi$ is a Borel probability measure on $E$ . We shall end up constructing, from a sequence of continuous stochastic processes, a sequence of pushforward measures that converges in $\mathscr{P}_{E}$ to Wiener measure $W$ .

Let $(X_{n})_{n\geq 1}$ be a sequence of independent identically distributed random variables on a sample space $(\Omega,\mathscr{F},P)$ with $E(X_{n})=0$ and $V(X_{n})=1$ , and let $S_{0}=0$ and

S_{k}=\sum_{i=1}^{k}X_{i}.

Then $E(S_{k})=0$ and $V(S_{k})=k$ . For $t\geq 0$ let

Y_{t}=S_{[t]}+(t-[t])X_{[t]+1}.

Thus, for $k\geq 0$ and $k\leq t\leq k+1$ ,

\begin{split}Y_{t}&=S_{k}+(t-k)X_{k+1}\\
&=S_{k}+(t-k)(S_{k+1}-S_{k})\\
&=(1-t+k)S_{k}+(t-k)S_{k+1}.\end{split}

For each $\omega\in\Omega$ , the map $t\mapsto Y_{t}(\omega)$ is piecewise linear, equal to $S_{k}(\omega)$ when $t=k$ , and in particular it is continuous. For $n\geq 1$ , define

X_{t}^{(n)}=n^{-1/2}Y_{nt}=n^{-1/2}S_{[nt]}+n^{-1/2}(nt-[nt])X_{[nt]+1},\qquad t\in[0,1].\tag{1}

For $0\leq k\leq n$ ,

X_{k/n}^{(n)}=n^{-1/2}S_{k}.

For each $n\geq 1$ , $(X^{(n)}_{t})_{t\in[0,1]}$ is a continuous stochastic process on the sample space $(\Omega,\mathscr{F},P)$ , and we denote by $P_{n}\in\mathscr{P}_{E}$ the pushforward measure of $P$ by $X^{(n)}$ .
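The construction of $X^{(n)}$ from the steps $X_{1},\ldots,X_{n}$ can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name `interpolated_walk` and the fixed sample of steps are hypothetical choices, not part of the text. It uses the fact that $Y_{nt}$ is the piecewise linear interpolation of $k\mapsto S_{k}$ evaluated at $nt$ .

```python
import numpy as np

def interpolated_walk(steps, t):
    """Evaluate X_t^{(n)} = n^{-1/2} (S_[nt] + (nt - [nt]) X_{[nt]+1}) for t in [0, 1].

    steps holds X_1, ..., X_n, so n = len(steps); t may be a scalar or an array.
    """
    steps = np.asarray(steps, dtype=float)
    n = len(steps)
    partial_sums = np.concatenate(([0.0], np.cumsum(steps)))   # S_0, S_1, ..., S_n
    # Y_{nt}: linear interpolation of k -> S_k, evaluated at the rescaled time nt.
    y = np.interp(n * np.asarray(t, dtype=float), np.arange(n + 1), partial_sums)
    return y / np.sqrt(n)

steps = np.array([1.0, -1.0, -1.0, 1.0])               # a fixed sample of X_1, ..., X_4
at_grid = interpolated_walk(steps, np.arange(5) / 4)   # values at t = k/4, k = 0, ..., 4
midpoint = interpolated_walk(steps, 0.375)             # halfway between t = 1/4 and t = 1/2
```

At the grid points $t=k/n$ the sketch returns $n^{-1/2}S_{k}$ , and between grid points it interpolates linearly.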

4 Donsker’s theorem

Lemma 6.

If $Z_{n}$ and $U_{n}$ are random variables with state space $\mathbb{R}^{d}$ such that $Z_{n}\to Z$ in distribution and $U_{n}\to 0$ in distribution, then $Z_{n}+U_{n}\to Z$ in distribution.

If $Z_{n}$ are random variables with state space $\mathbb{R}$ that converge in distribution to some random variable $Z$ and $c_{n}$ are real numbers that converge to some real number $c$ , then $c_{n}Z_{n}\to cZ$ in distribution.

For $\sigma\geq 0$ , let $\nu_{\sigma^{2}}$ be the Gaussian measure on $\mathbb{R}$ with mean $0$ and variance $\sigma^{2}$ . The characteristic function of $\nu_{\sigma^{2}}$ is, for $\sigma>0$ ,

\widetilde{\nu}_{\sigma^{2}}(\xi)=\int_{\mathbb{R}}e^{i\xi x}d\nu_{\sigma^{2}}(x)=\int_{\mathbb{R}}e^{i\xi x}\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{x^{2}}{2\sigma^{2}}}dx=e^{-\frac{1}{2}\sigma^{2}\xi^{2}},

and $\widetilde{\nu}_{0}(\xi)=1$ . One checks that $c_{*}\nu_{1}=\nu_{c^{2}}$ for $c\geq 0$ .
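The formula for the characteristic function, together with the scaling identity $c_{*}\nu_{1}=\nu_{c^{2}}$ , can be checked numerically. The following is a minimal sketch, assuming NumPy's `hermegauss` quadrature rule; the helper name `gaussian_char_fn` and the sample values of $\xi$ and $\sigma^{2}$ are illustrative choices, not part of the text.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_char_fn(xi, variance, nodes=60):
    """Approximate the characteristic function of N(0, variance) at xi.

    N(0, variance) is the pushforward of N(0, 1) under multiplication by
    sigma = sqrt(variance), so the integral equals E[exp(i * xi * sigma * Z)]
    with Z ~ N(0, 1), which is evaluated by Gauss-Hermite quadrature.
    """
    x, w = hermegauss(nodes)
    sigma = np.sqrt(variance)
    return complex(np.sum(w * np.exp(1j * xi * sigma * x)) / np.sqrt(2.0 * np.pi))

xi, variance = 1.5, 0.64
numeric = gaussian_char_fn(xi, variance)
closed_form = np.exp(-0.5 * variance * xi ** 2)
degenerate = gaussian_char_fn(2.0, 0.0)   # variance 0: the point mass at 0
```

The quadrature value agrees with $e^{-\frac{1}{2}\sigma^{2}\xi^{2}}$ up to rounding, and the degenerate case returns $1$ , matching $\widetilde{\nu}_{0}(\xi)=1$ .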

In the following theorem and in what follows, $X^{(n)}$ is the piecewise linear stochastic process defined in (1). We prove that a sequence of finite-dimensional distributions converges to a Gaussian measure (Bert Fristedt and Lawrence Gray, A Modern Approach to Probability Theory, p. 368, §19.1, Lemma 1).

Theorem 7.

For $0\leq t_{0}<t_{1}<\cdots<t_{d}\leq 1$ , the random vectors

(X^{(n)}_{t_{1}}-X^{(n)}_{t_{0}},\ldots,X^{(n)}_{t_{d}}-X^{(n)}_{t_{d-1}}),\qquad(\Omega,\mathscr{F},P)\to(\mathbb{R}^{d},\mathscr{B}_{\mathbb{R}}^{d}),

converge in distribution to $\nu_{t_{1}-t_{0}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}}$ as $n\to\infty$ .

Proof.

For $0<j\leq d$ and $n\geq 1$ let

r_{j,n}=\frac{[nt_{j}]}{n},\qquad U_{j,n}=X^{(n)}_{t_{j}}-X^{(n)}_{r_{j,n}},

and for $0\leq j<d$ and $n\geq 1$ let

s_{j,n}=\frac{\lceil nt_{j}\rceil}{n},\qquad V_{j,n}=X^{(n)}_{s_{j,n}}-X^{(n)}_{t_{j}},

with which

\begin{split}(X^{(n)}_{t_{1}}-X^{(n)}_{t_{0}},\ldots,X^{(n)}_{t_{d}}-X^{(n)}_{t_{d-1}})&=(X^{(n)}_{r_{1,n}}-X^{(n)}_{s_{0,n}},\ldots,X^{(n)}_{r_{d,n}}-X^{(n)}_{s_{d-1,n}})\\
&\quad+(U_{1,n},\ldots,U_{d,n})+(V_{0,n},\ldots,V_{d-1,n}).\end{split}

Because $E(X^{(n)}_{t})=0$ ,

E(U_{j,n})=0,\qquad E(V_{j,n})=0.

Furthermore,

\begin{split}V(U_{j,n})&=V(X^{(n)}_{t_{j}}-X^{(n)}_{r_{j,n}})\\
&=n^{-1}V(S_{[nt_{j}]}+(nt_{j}-[nt_{j}])X_{[nt_{j}]+1}-S_{[nr_{j,n}]}-(nr_{j,n}-[nr_{j,n}])X_{[nr_{j,n}]+1})\\
&=n^{-1}V(S_{[nt_{j}]}+(nt_{j}-[nt_{j}])X_{[nt_{j}]+1}-S_{[nt_{j}]}-([nt_{j}]-[nt_{j}])X_{[nr_{j,n}]+1})\\
&=n^{-1}(nt_{j}-[nt_{j}])^{2}V(X_{[nt_{j}]+1})\\
&=n^{-1}(nt_{j}-[nt_{j}])^{2},\end{split}

and because $0\leq nt_{j}-[nt_{j}]<1$ this tends to $0$ as $n\to\infty$ . Likewise, $V(V_{j,n})\to 0$ as $n\to\infty$ .

For $1\leq j\leq d$ , because $nr_{j,n}=[nt_{j}]$ and $ns_{j-1,n}=\lceil nt_{j-1}\rceil$ are integers,

\begin{split}X^{(n)}_{r_{j,n}}-X^{(n)}_{s_{j-1,n}}&=n^{-1/2}S_{[nr_{j,n}]}+n^{-1/2}(nr_{j,n}-[nr_{j,n}])X_{[nr_{j,n}]+1}\\
&\quad-n^{-1/2}S_{[ns_{j-1,n}]}-n^{-1/2}(ns_{j-1,n}-[ns_{j-1,n}])X_{[ns_{j-1,n}]+1}\\
&=n^{-1/2}S_{[nt_{j}]}-n^{-1/2}S_{\lceil nt_{j-1}\rceil}\\
&=n^{-1/2}\frac{([nt_{j}]-\lceil nt_{j-1}\rceil)^{1/2}}{([nt_{j}]-\lceil nt_{j-1}\rceil)^{1/2}}\sum_{i=\lceil nt_{j-1}\rceil+1}^{[nt_{j}]}X_{i},\end{split}

where the sum has $[nt_{j}]-\lceil nt_{j-1}\rceil$ terms, a number that is positive for sufficiently large $n$ . By the central limit theorem,

([nt_{j}]-\lceil nt_{j-1}\rceil)^{-1/2}\sum_{i=\lceil nt_{j-1}\rceil+1}^{[nt_{j}]}X_{i}\to\nu_{1}

in distribution as $n\to\infty$ . But

n^{-1/2}([nt_{j}]-\lceil nt_{j-1}\rceil)^{1/2}\to(t_{j}-t_{j-1})^{1/2}

as $n\to\infty$ , and $(t_{j}-t_{j-1})^{1/2}_{*}\nu_{1}=\nu_{t_{j}-t_{j-1}}$ , so by Lemma 6,

X^{(n)}_{r_{j,n}}-X^{(n)}_{s_{j-1,n}}\to\nu_{t_{j}-t_{j-1}}

in distribution as $n\to\infty$ .

For sufficiently large $n$ , depending on $t_{0},\ldots,t_{d}$ ,

t_{0}\leq s_{0,n}<r_{1,n}\leq t_{1}\leq s_{1,n}<r_{2,n}\leq\cdots\leq t_{d-1}\leq s_{d-1,n}<r_{d,n}\leq t_{d}.

Because $E(U_{j,n})=0$ and $V(U_{j,n})\to 0$ , Chebyshev’s inequality gives $(U_{1,n},\ldots,U_{d,n})\to 0$ in probability, and likewise $(V_{0,n},\ldots,V_{d-1,n})\to 0$ in probability; hence these random vectors converge to $0$ in distribution as $n\to\infty$ . For such $n$ , the random variables $X^{(n)}_{r_{1,n}}-X^{(n)}_{s_{0,n}},\ldots,X^{(n)}_{r_{d,n}}-X^{(n)}_{s_{d-1,n}}$ are sums over disjoint blocks of the $X_{i}$ and hence are independent, and therefore their joint distribution is equal to the product of their distributions. Now, if $\mu_{n}=\mu_{n}^{1}\otimes\cdots\otimes\mu_{n}^{d}$ and $\mu_{n}^{j}\to\mu^{j}$ as $n\to\infty$ , $1\leq j\leq d$ , then for $\xi\in\mathbb{R}^{d}$ ,

\begin{split}\widetilde{\mu}_{n}(\xi)&=\widetilde{\mu}_{n}^{1}(\xi_{1})\cdots\widetilde{\mu}_{n}^{d}(\xi_{d})\\
&\to\widetilde{\mu}^{1}(\xi_{1})\cdots\widetilde{\mu}^{d}(\xi_{d})\\
&=(\mu^{1}\otimes\cdots\otimes\mu^{d})^{\widetilde{\;}}(\xi)\end{split}

as $n\to\infty$ , and therefore by Lévy’s continuity theorem, $\mu_{n}\to\mu^{1}\otimes\cdots\otimes\mu^{d}$ as $n\to\infty$ . This means that the joint distribution of $X^{(n)}_{r_{1,n}}-X^{(n)}_{s_{0,n}},\ldots,X^{(n)}_{r_{d,n}}-X^{(n)}_{s_{d-1,n}}$ converges to

\nu_{t_{1}-t_{0}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}}

as $n\to\infty$ . Because $(U_{1,n},\ldots,U_{d,n})\to 0$ in distribution as $n\to\infty$ and $(V_{0,n},\ldots,V_{d-1,n})\to 0$ in distribution as $n\to\infty$ , applying Lemma 6 we get that

(X^{(n)}_{t_{1}}-X^{(n)}_{t_{0}},\ldots,X^{(n)}_{t_{d}}-X^{(n)}_{t_{d-1}})\to\nu_{t_{1}-t_{0}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}}

in distribution as $n\to\infty$ , completing the proof. ∎

Let $t_{0}=0$ and let $0<t_{1}<\cdots<t_{d}\leq 1$ . As $X^{(n)}_{0}=0$ , Theorem 7 tells us that

(X^{(n)}_{t_{1}},X^{(n)}_{t_{2}}-X^{(n)}_{t_{1}},\ldots,X^{(n)}_{t_{d}}-X^{(n)}_{t_{d-1}})\to\nu_{t_{1}}\otimes\nu_{t_{2}-t_{1}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}}

in distribution as $n\to\infty$ . Define $g:\mathbb{R}^{d}\to\mathbb{R}^{d}$ by

g(x_{1},x_{2},\ldots,x_{d})=(x_{1},x_{1}+x_{2},\ldots,x_{1}+x_{2}+\cdots+x_{d}).

The function $g$ is continuous and satisfies

g\circ(X^{(n)}_{t_{1}}-X^{(n)}_{t_{0}},\ldots,X^{(n)}_{t_{d}}-X^{(n)}_{t_{d-1}})=(X^{(n)}_{t_{1}},X^{(n)}_{t_{2}},\ldots,X^{(n)}_{t_{d}}).

Then by the continuous mapping theorem,

(X^{(n)}_{t_{1}},X^{(n)}_{t_{2}},\ldots,X^{(n)}_{t_{d}})\to g_{*}(\nu_{t_{1}}\otimes\nu_{t_{2}-t_{1}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}})\tag{2}

in distribution as $n\to\infty$ (Allan Gut, Probability: A Graduate Course, second ed., p. 245, Chapter 5, Theorem 10.4).
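Since $g$ is linear, the limit measure in (2) is a centered Gaussian measure whose covariance can be computed by matrix multiplication. The following is a minimal sketch, assuming NumPy; the function name `pushforward_covariance` and the sample times are illustrative choices, not part of the text. It checks the standard fact that the resulting covariance is $\min(t_{i},t_{j})$ , the covariance of Brownian motion.

```python
import numpy as np

def pushforward_covariance(times):
    """Covariance matrix of g_*(nu_{t_1} x nu_{t_2 - t_1} x ... x nu_{t_d - t_{d-1}}).

    g(x) = G x with G the lower triangular matrix of ones, and the product
    measure has diagonal covariance D, so the pushforward has covariance G D G^T.
    """
    times = np.asarray(times, dtype=float)
    increments = np.diff(np.concatenate(([0.0], times)))   # t_1, t_2 - t_1, ...
    G = np.tril(np.ones((len(times), len(times))))
    return G @ np.diag(increments) @ G.T

times = [0.2, 0.5, 0.9]
cov = pushforward_covariance(times)
expected = np.minimum.outer(times, times)   # entries min(t_i, t_j)
```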

We prove a result that we use to prove the next lemma, which in turn is used in the proof of Donsker’s theorem (Ioannis Karatzas and Steven E. Shreve, Brownian Motion and Stochastic Calculus, second ed., p. 68, Lemma 4.18).

Lemma 8.

For $\epsilon>0$ ,

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{\delta}P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j}|>\epsilon n^{1/2}\right)=0.

Proof.

For each $\delta>0$ , by the central limit theorem,

([n\delta]+1)^{-1/2}S_{[n\delta]+1}\to Z

in distribution as $n\to\infty$ , where $Z_{*}P=\nu_{1}$ . Because $\frac{([n\delta]+1)^{1/2}}{(n\delta)^{1/2}}\to 1$ as $n\to\infty$ , by Lemma 6 we then get that

(n\delta)^{-1/2}S_{[n\delta]+1}\to Z

in distribution as $n\to\infty$ . Now let $\lambda>0$ . There is a sequence $\phi_{k}$ in $C_{b}(\mathbb{R})$ such that $\phi_{k}\downarrow 1_{(-\infty,-\lambda]\cup[\lambda,\infty)}=\chi_{\lambda}$ pointwise as $k\to\infty$ . For each $k$ , writing $X=S_{[n\delta]+1}$ , using the change of variables formula,

\begin{split}P(|X|\geq\lambda(n\delta)^{1/2})&=\int_{\Omega}\chi_{\lambda(n\delta)^{1/2}}(X(\omega))dP(\omega)\\
&=\int_{\Omega}\chi_{\lambda}((n\delta)^{-1/2}X(\omega))dP(\omega)\\
&\leq\int_{\Omega}\phi_{k}((n\delta)^{-1/2}X(\omega))dP(\omega)\\
&=E(\phi_{k}((n\delta)^{-1/2}X)).\end{split}

Therefore, because $\phi_{k}\in C_{b}(\mathbb{R})$ and $(n\delta)^{-1/2}S_{[n\delta]+1}\to Z$ in distribution,

\begin{split}\limsup_{n\to\infty}P(|S_{[n\delta]+1}|\geq\lambda(n\delta)^{1/2})&\leq\lim_{n\to\infty}E(\phi_{k}((n\delta)^{-1/2}S_{[n\delta]+1}))\\
&=E(\phi_{k}\circ Z).\end{split}

Because $\phi_{k}\downarrow\chi_{\lambda}$ pointwise as $k\to\infty$ , using the monotone convergence theorem and then using Chebyshev’s inequality,

E(\phi_{k}\circ Z)\to E(\chi_{\lambda}\circ Z)=P(|Z|\geq\lambda)\leq\lambda^{-3}E|Z|^{3}.

We have established that for each $\lambda>0$ ,

\limsup_{n\to\infty}P(|S_{[n\delta]+1}|\geq\lambda(n\delta)^{1/2})\leq\lambda^{-3}E|Z|^{3}.\tag{3}

Define

\tau=\min\{j\geq 1:|S_{j}|>n^{1/2}\epsilon\}.

For $0<\delta<\epsilon^{2}/2$ , it is a fact that

\begin{split}&P\left(\max_{0\leq j\leq[n\delta]+1}|S_{j}|>n^{1/2}\epsilon\right)\\
\leq&P(|S_{[n\delta]+1}|\geq n^{1/2}(\epsilon-(2\delta)^{1/2}))\\
&+\sum_{j=1}^{[n\delta]}P(|S_{[n\delta]+1}|<n^{1/2}(\epsilon-(2\delta)^{1/2})|\tau=j)P(\tau=j).\end{split}

If $\tau(\omega)=j$ and $|S_{[n\delta]+1}(\omega)|<n^{1/2}(\epsilon-(2\delta)^{1/2})$ then

|S_{j}(\omega)-S_{[n\delta]+1}(\omega)|\geq|S_{j}(\omega)|-|S_{[n\delta]+1}(\omega)|>n^{1/2}\epsilon-n^{1/2}(\epsilon-(2\delta)^{1/2})=(2n\delta)^{1/2}.

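But by Chebyshev’s inequality and the fact that the random variables $X_{1},X_{2},\ldots$ are independent with mean $0$ and variance $1$ , for $1\leq j\leq[n\delta]$ ,

P(|S_{j}-S_{[n\delta]+1}|>(2n\delta)^{1/2})\leq\frac{1}{2n\delta}E((S_{j}-S_{[n\delta]+1})^{2})=\frac{1}{2n\delta}([n\delta]+1-j)\leq\frac{1}{2}.

Because the event $\{\tau=j\}$ is determined by $X_{1},\ldots,X_{j}$ , it is independent of $S_{[n\delta]+1}-S_{j}$ , and hence

P(|S_{[n\delta]+1}|<n^{1/2}(\epsilon-(2\delta)^{1/2})|\tau=j)\leq P(|S_{j}-S_{[n\delta]+1}|>(2n\delta)^{1/2})\leq\frac{1}{2}.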

Therefore,

\begin{split}&P\left(\max_{0\leq j\leq[n\delta]+1}|S_{j}|>n^{1/2}\epsilon\right)\\
\leq&P(|S_{[n\delta]+1}|\geq n^{1/2}(\epsilon-(2\delta)^{1/2}))+\sum_{j=1}^{[n\delta]}\frac{1}{2}\cdot P(\tau=j)\\
=&P(|S_{[n\delta]+1}|\geq n^{1/2}(\epsilon-(2\delta)^{1/2}))+\frac{1}{2}P(\tau\leq[n\delta])\\
\leq&P(|S_{[n\delta]+1}|\geq n^{1/2}(\epsilon-(2\delta)^{1/2}))+\frac{1}{2}P\left(\max_{0\leq j\leq[n\delta]+1}|S_{j}|>n^{1/2}\epsilon\right),\end{split}

and rearranging gives

P\left(\max_{0\leq j\leq[n\delta]+1}|S_{j}|>n^{1/2}\epsilon\right)\leq 2P(|S_{[n\delta]+1}|\geq n^{1/2}(\epsilon-(2\delta)^{1/2})).

Now using (3) with $\lambda=(\epsilon-(2\delta)^{1/2})\delta^{-1/2}$ ,

\limsup_{n\to\infty}P(|S_{[n\delta]+1}|\geq(\epsilon-(2\delta)^{1/2})\delta^{-1/2}(n\delta)^{1/2})\leq(\epsilon-(2\delta)^{1/2})^{-3}\delta^{3/2}E|Z|^{3},

hence

\limsup_{n\to\infty}P\left(\max_{0\leq j\leq[n\delta]+1}|S_{j}|>n^{1/2}\epsilon\right)\leq 2(\epsilon-(2\delta)^{1/2})^{-3}\delta^{3/2}E|Z|^{3}.

Dividing both sides by $\delta$ and then taking $\delta\downarrow 0$ we obtain the claim. ∎
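The maximal inequality obtained in the proof of Lemma 8, which bounds the probability that the running maximum exceeds $n^{1/2}\epsilon$ by twice a tail probability of the endpoint, can be checked exactly in a special case. The following is a minimal sketch, assuming the steps $X_{i}$ take the values $\pm 1$ with probability $\frac{1}{2}$ each (which have mean $0$ and variance $1$ ); the function names and the values of $n$ , $\delta$ , $\epsilon$ are illustrative choices, not part of the text.

```python
import math

def max_exceedance_probability(num_steps, level):
    """P(max_{0 <= j <= num_steps} |S_j| > level) for the +/-1 simple random walk.

    alive maps each position to the probability of paths that are there now and
    have never had |S_j| > level; mass that crosses the level is accumulated.
    """
    alive = {0: 1.0}
    exceeded = 0.0
    for _ in range(num_steps):
        nxt = {}
        for pos, prob in alive.items():
            for step in (-1, 1):
                new = pos + step
                if abs(new) > level:
                    exceeded += 0.5 * prob
                else:
                    nxt[new] = nxt.get(new, 0.0) + 0.5 * prob
        alive = nxt
    return exceeded

def endpoint_tail_probability(num_steps, level):
    """P(|S_num_steps| >= level) for the +/-1 simple random walk, from the binomial law."""
    total = 0.0
    for heads in range(num_steps + 1):
        position = 2 * heads - num_steps
        if abs(position) >= level:
            total += math.comb(num_steps, heads) / 2 ** num_steps
    return total

n, delta, eps = 400, 0.05, 0.5          # delta < eps**2 / 2, as the proof requires
N = math.floor(n * delta) + 1           # [n delta] + 1
lhs = max_exceedance_probability(N, math.sqrt(n) * eps)
rhs = 2 * endpoint_tail_probability(N, math.sqrt(n) * (eps - math.sqrt(2 * delta)))
```

For these parameters `lhs` is no larger than `rhs` , as the inequality predicts; the bound is far from sharp here, and it is only the limit $\delta\downarrow 0$ after dividing by $\delta$ that matters for the lemma.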

We prove one more result that we use to prove Donsker’s theorem (Ioannis Karatzas and Steven E. Shreve, Brownian Motion and Stochastic Calculus, second ed., p. 69, Lemma 4.19).

Lemma 9.

For $T>0$ and $\epsilon>0$ ,

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}P\left(\max_{0\leq k\leq[nT]+1}\max_{1\leq j\leq[n\delta]+1}|S_{j+k}-S_{k}|>n^{1/2}\epsilon\right)=0.

Proof.

For $0<\delta\leq T$ , let $m=[T/\delta]+1$ , so $m\geq 2$ and $T/m<\delta\leq T/(m-1)$ . Then

\lim_{n\to\infty}\frac{[nT]+1}{[n\delta]+1}=\frac{T}{\delta}<m,

so there is some $n_{\delta}$ such that for all $n\geq n_{\delta}$ it is the case that $[nT]+1<([n\delta]+1)m$ . Suppose that $\omega\in\Omega$ is such that there are $1\leq j\leq[n\delta]+1$ and $0\leq k\leq[nT]+1$ satisfying

|S_{j+k}(\omega)-S_{k}(\omega)|>n^{1/2}\epsilon,

and then let $p=[k/([n\delta]+1)]$ , which satisfies $0\leq p\leq m-1$ and

([n\delta]+1)p\leq k<([n\delta]+1)(p+1).

Because $1\leq j\leq[n\delta]+1$ , either

([n\delta]+1)p<k+j\leq([n\delta]+1)(p+1)

or

([n\delta]+1)(p+1)<k+j<([n\delta]+1)(p+2).

In the first case, by the triangle inequality at least one of

|S_{k}(\omega)-S_{([n\delta]+1)p}(\omega)|>\frac{1}{2}n^{1/2}\epsilon

and

|S_{j+k}(\omega)-S_{([n\delta]+1)p}(\omega)|>\frac{1}{2}n^{1/2}\epsilon

holds. In the second case, at least one of

|S_{k}(\omega)-S_{([n\delta]+1)p}(\omega)|>\frac{1}{3}n^{1/2}\epsilon,

|S_{([n\delta]+1)p}(\omega)-S_{([n\delta]+1)(p+1)}(\omega)|>\frac{1}{3}n^{1/2}\epsilon,

and

|S_{([n\delta]+1)(p+1)}(\omega)-S_{j+k}(\omega)|>\frac{1}{3}n^{1/2}\epsilon

holds.

In each case one of the differences has the form $S_{i+([n\delta]+1)q}-S_{([n\delta]+1)q}$ with $1\leq i\leq[n\delta]+1$ and $q\in\{p,p+1\}$ , so $0\leq q\leq m$ . It follows that (in agreement with Karatzas and Shreve, who have $m+1$ blocks here)

\begin{split}&\left\{\max_{1\leq j\leq[n\delta]+1}\max_{0\leq k\leq[nT]+1}|S_{j+k}-S_{k}|>n^{1/2}\epsilon\right\}\\
\subset&\bigcup_{p=0}^{m}\left\{\max_{1\leq j\leq[n\delta]+1}|S_{j+([n\delta]+1)p}-S_{([n\delta]+1)p}|>\frac{1}{3}n^{1/2}\epsilon\right\}.\end{split}

For $0\leq p\leq m$ , because the $X_{i}$ are independent and identically distributed,

\begin{split}&P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j+([n\delta]+1)p}-S_{([n\delta]+1)p}|>\frac{1}{3}n^{1/2}\epsilon\right)\\
=&P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j}|>\frac{1}{3}n^{1/2}\epsilon\right),\end{split}

and therefore

\begin{split}&P\left\{\max_{1\leq j\leq[n\delta]+1}\max_{0\leq k\leq[nT]+1}|S_{j+k}-S_{k}|>n^{1/2}\epsilon\right\}\\
\leq&\sum_{p=0}^{m}P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j}|>\frac{1}{3}n^{1/2}\epsilon\right)\\
=&(m+1)P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j}|>\frac{1}{3}n^{1/2}\epsilon\right).\end{split}

Lemma 8 tells us

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{\delta}P\left(\max_{1\leq j\leq[n\delta]+1}|S_{j}|>\frac{1}{3}n^{1/2}\epsilon\right)=0,

and because $m+1\leq\frac{T}{\delta}+2=\frac{T+2\delta}{\delta}$ ,

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}P\left\{\max_{1\leq j\leq[n\delta]+1}\max_{0\leq k\leq[nT]+1}|S_{j+k}-S_{k}|>n^{1/2}\epsilon\right\}=0,

proving the claim. ∎

In the following, $P_{n}\in\mathscr{P}_{E}$ denotes the pushforward measure of $P$ by $X^{(n)}$ , for $X^{(n)}$ defined in (1). We now prove Donsker’s theorem (Ioannis Karatzas and Steven E. Shreve, Brownian Motion and Stochastic Calculus, second ed., p. 70, Theorem 4.20).

Theorem 10 (Donsker’s theorem).

$P_{n}\to W$ .

Proof.

We shall use Theorem 3 to prove that $\Gamma=\{P_{n}:n\geq 1\}$ is relatively compact in $\mathscr{P}_{E}$ . For $n\geq 1$ ,

P_{n}(f\in E:|f(0)|=0)=P(\omega\in\Omega:|X_{0}^{(n)}(\omega)|=0)=1,

thus the first condition of Theorem 3 is satisfied with $M_{\epsilon}=0$ . For the second condition of Theorem 3 to be satisfied it suffices that for each $\epsilon>0$ ,

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}P\left(\sup_{0\leq s,t\leq 1,|s-t|\leq\delta}|X^{(n)}_{s}-X^{(n)}_{t}|>\epsilon\right)=0.

Now,

P\left(\sup_{0\leq s,t\leq 1,|s-t|\leq\delta}|X^{(n)}_{s}-X^{(n)}_{t}|>\epsilon\right)=P\left(\sup_{0\leq s,t\leq n,|s-t|\leq n\delta}|Y_{s}-Y_{t}|>n^{1/2}\epsilon\right).

Also,

\begin{split}\sup_{0\leq s,t\leq n,|s-t|\leq n\delta}|Y_{s}-Y_{t}|&\leq\sup_{0\leq s,t\leq n,|s-t|\leq[n\delta]+1}|Y_{s}-Y_{t}|\\
&\leq\max_{1\leq j\leq[n\delta]+1}\max_{0\leq k\leq n+1}|S_{j+k}-S_{k}|,\end{split}

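so applying Lemma 9 with $T=1$ ,

\begin{split}&\lim_{\delta\downarrow 0}\limsup_{n\to\infty}P\left(\sup_{0\leq s,t\leq 1,|s-t|\leq\delta}|X^{(n)}_{s}-X^{(n)}_{t}|>\epsilon\right)\\
\leq&\lim_{\delta\downarrow 0}\limsup_{n\to\infty}P\left(\max_{1\leq j\leq[n\delta]+1}\max_{0\leq k\leq n+1}|S_{j+k}-S_{k}|>n^{1/2}\epsilon\right)\\
=&0,\end{split}

from which we get that the second condition of Theorem 3 is satisfied, so $\Gamma$ is relatively compact in $\mathscr{P}_{E}$ .

It remains to identify the limit. Let $Q\in\mathscr{P}_{E}$ be the limit of a convergent subsequence $P_{n_{k}}$ . For $0<t_{1}<\cdots<t_{d}\leq 1$ , the map $\pi_{t_{1},\ldots,t_{d}}$ is continuous, so $(\pi_{t_{1},\ldots,t_{d}})_{*}P_{n_{k}}\to(\pi_{t_{1},\ldots,t_{d}})_{*}Q$ . The measure $(\pi_{t_{1},\ldots,t_{d}})_{*}P_{n}$ is the distribution of $(X^{(n)}_{t_{1}},\ldots,X^{(n)}_{t_{d}})$ , so by (2) the limit is $g_{*}(\nu_{t_{1}}\otimes\nu_{t_{2}-t_{1}}\otimes\cdots\otimes\nu_{t_{d}-t_{d-1}})$ , which by Theorem 5 is equal to $W_{t_{1},\ldots,t_{d}}$ . Moreover $\{f\in E:f(0)=0\}$ is closed and has $P_{n}$ -measure $1$ for each $n$ , so it has $Q$ -measure $1$ , and hence the finite-dimensional distributions of $Q$ and $W$ also agree when $t_{1}=0$ . By Theorem 4, $Q=W$ . Thus every subsequence of $P_{n}$ has a further subsequence that converges to $W$ , and therefore $P_{n}\to W$ . ∎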
