The heat kernel on $\mathbb{R}^n$

Jordan Bell
March 28, 2014

1 Notation

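For $f \in L^1(\mathbb{R}^n)$, we define $\hat{f}:\mathbb{R}^n \to \mathbb{C}$ by

$$\hat{f}(\xi) = (\mathscr{F}f)(\xi) = \int_{\mathbb{R}^n} f(x) e^{-2\pi i \xi \cdot x}\,dx, \qquad \xi \in \mathbb{R}^n.$$

The statement of the Riemann-Lebesgue lemma is that $\hat{f} \in C_0(\mathbb{R}^n)$.

We denote by $\mathscr{S}_n$ the Fréchet space of Schwartz functions $\mathbb{R}^n \to \mathbb{C}$.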

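If $\alpha$ is a multi-index, we define

$$|\alpha| = \alpha_1 + \cdots + \alpha_n, \qquad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \qquad D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n},$$

and we write

$$\Delta = \partial_1^2 + \cdots + \partial_n^2.$$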

2 The heat equation

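Fix $n$, and for $t>0$, $x \in \mathbb{R}^n$, define

$$k_t(x) = k(t,x) = (4\pi t)^{-n/2} \exp\left(-\frac{|x|^2}{4t}\right).$$

We call $k$ the heat kernel. It is straightforward to check for any $t>0$ that $k_t \in \mathscr{S}_n$. The heat kernel satisfies

$$k_t(x) = t^{-n/2} k_1(t^{-1/2}x), \qquad t>0,\ x \in \mathbb{R}^n.$$

For $a>0$ and $f(x)=e^{-\pi a|x|^2}$, it is a fact that $\hat{f}(\xi)=a^{-n/2}e^{-\pi|\xi|^2/a}$. Using this with $a=\frac{1}{4\pi t}$, for any $t>0$ we get

$$\hat{k}_t(\xi) = e^{-4\pi^2 t|\xi|^2}, \qquad \xi \in \mathbb{R}^n.$$

Thus for any $t>0$,

$$\int_{\mathbb{R}^n} k_t(x)\,dx = \hat{k}_t(0) = 1.$$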

Then the heat kernel is an approximate identity: if $f \in L^p(\mathbb{R}^n)$, $1 \leq p < \infty$, then $\|f*k_t-f\|_p \to 0$ as $t \to 0$, and if $f$ is a function on $\mathbb{R}^n$ that is bounded and continuous, then for every $x \in \mathbb{R}^n$, $f*k_t(x) \to f(x)$ as $t \to 0$. (Note that $k_1$, and any $k_t$, belong merely to $\mathscr{S}_n$ and not to $\mathscr{D}(\mathbb{R}^n)$, which is demanded in the definition of an approximate identity in Rudin's Functional Analysis, second ed.) For each $t>0$, because $k_t \in \mathscr{S}_n$ we have $f*k_t \in C^\infty(\mathbb{R}^n)$, and $D^\alpha(f*k_t)=f*D^\alpha k_t$ for any multi-index $\alpha$ (Gerald B. Folland, Introduction to Partial Differential Equations, second ed., p. 11, Theorem 0.14).
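The normalization, the Fourier transform, and the approximate identity property can be sanity-checked numerically in dimension $n=1$. The sketch below assumes the normalization $k_t(x)=(4\pi t)^{-1/2}e^{-x^2/(4t)}$ and uses a plain trapezoid rule; the function names, the truncation width, and the choice $f=\cos$ are illustrative assumptions, not part of the text. For $f=\cos$, which is bounded and continuous but lies in no $L^p(\mathbb{R})$ with $p<\infty$, the formula $\hat{k}_t(\xi)=e^{-4\pi^2 t\xi^2}$ gives $f*k_t(x)=e^{-t}\cos x$, so the pointwise convergence as $t \to 0$ is visible explicitly.

```python
import math


def k(t, x):
    """One-dimensional heat kernel k_t(x) = (4 pi t)^(-1/2) exp(-x^2 / (4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def integrate_against_kernel(g, t, width=12.0, steps=4000):
    """Trapezoid approximation of the integral of g(y) k_t(y) over |y| <= width * sqrt(2t).

    k_t is a Gaussian density with standard deviation sqrt(2t), so for bounded g
    the discarded tails are negligible.
    """
    half = width * math.sqrt(2.0 * t)
    h = 2.0 * half / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(y) * k(t, y)
    return total * h


def mass(t):
    """Approximate integral of k_t over R (expected to be close to 1)."""
    return integrate_against_kernel(lambda y: 1.0, t)


def kernel_transform(t, xi):
    """Approximate Fourier transform of k_t at xi; k_t is even, so only the cosine part survives."""
    return integrate_against_kernel(lambda y: math.cos(2.0 * math.pi * xi * y), t)


def convolve(f, t, x):
    """Approximate (f * k_t)(x) = integral of f(x - y) k_t(y) dy."""
    return integrate_against_kernel(lambda y: f(x - y), t)


if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01):
        print(t, mass(t), kernel_transform(t, 0.3), convolve(math.cos, t, 0.7))
```

Running the script prints, for decreasing $t$, values of the mass that stay near $1$ and values of $\cos * k_t(0.7)$ that approach $\cos(0.7)$.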

The heat operator is $D_t-\Delta$ and the heat equation is $(D_t-\Delta)u=0$. It is straightforward to check that

$$(D_t-\Delta)k(t,x) = 0, \qquad t>0,\ x \in \mathbb{R}^n,$$

that is, the heat kernel is a solution of the heat equation.
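The claim that the heat kernel solves the heat equation can be spot-checked with central finite differences. This is a minimal sketch assuming the normalization $k(t,x)=(4\pi t)^{-n/2}e^{-|x|^2/(4t)}$; the helper names and the step size `h` are illustrative choices, and a small residual is numerical evidence only, not a proof.

```python
import math


def k(t, x):
    """Heat kernel on R^n at time t > 0; x is a sequence of n floats."""
    n = len(x)
    r2 = sum(c * c for c in x)
    return (4.0 * math.pi * t) ** (-n / 2.0) * math.exp(-r2 / (4.0 * t))


def heat_residual(t, x, h=1e-3):
    """Central-difference approximation of (D_t - Laplacian) k at the point (t, x)."""
    dt = (k(t + h, x) - k(t - h, x)) / (2.0 * h)
    laplacian = 0.0
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        laplacian += (k(t, xp) - 2.0 * k(t, x) + k(t, xm)) / (h * h)
    return dt - laplacian


if __name__ == "__main__":
    # The residual should be of the order of the discretization error, O(h^2).
    print(heat_residual(0.5, [0.3, -0.2]))
    print(heat_residual(1.0, [1.0, 2.0, -0.5]))
```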

To get some practice proving things about solutions of the heat equation, we work out the following theorem from Folland (Gerald B. Folland, Introduction to Partial Differential Equations, second ed., p. 144, Theorem 4.4). In Folland's proof it is not apparent how the hypotheses on $u$ and $D_x u$ are used, and we make this explicit.

Theorem 1.

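Suppose that $u:[0,\infty)\times\mathbb{R}^n \to \mathbb{C}$ is continuous, that $u$ is $C^2$ on $(0,\infty)\times\mathbb{R}^n$, that

$$(\partial_t-\Delta)u(t,x) = 0, \qquad t>0,\ x \in \mathbb{R}^n,$$

and that $u(0,x)=0$ for $x \in \mathbb{R}^n$. If for every $\epsilon>0$ there is some $C$ such that

$$|u(t,x)| \leq Ce^{\epsilon|x|^2}, \qquad |D_xu(t,x)| \leq Ce^{\epsilon|x|^2}, \qquad t>0,\ x \in \mathbb{R}^n,$$

then $u=0$.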


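Proof. If $f$ and $g$ are $C^2$ functions on some open set in $\mathbb{R}\times\mathbb{R}^n$, such as $(0,\infty)\times\mathbb{R}^n$, then

$$\begin{aligned}
g(\partial_tf-\Delta f)+f(\partial_tg+\Delta g) &= \partial_t(fg)-g\sum_{j=1}^n\partial_j^2f+f\sum_{j=1}^n\partial_j^2g\\
&= \partial_t(fg)+\sum_{j=1}^n\partial_j(f\partial_jg-g\partial_jf)\\
&= \operatorname{div}_{t,x}F,
\end{aligned}$$

where

$$F = (fg,\ f\partial_1g-g\partial_1f,\ \ldots,\ f\partial_ng-g\partial_nf).$$

Take $t_0>0$, $x_0 \in \mathbb{R}^n$, and let $f(t,x)=u(t,x)$ and $g(t,x)=k(t_0-t,x-x_0)$ for $0<t<t_0$, $x \in \mathbb{R}^n$. Let $0<a<b<t_0$ and $r>0$, and define

$$\Omega = \{(t,x) \in \mathbb{R}\times\mathbb{R}^n : a<t<b,\ |x|<r\}.$$

In $\Omega$ we check that $(\partial_t-\Delta)f=0$ and $(\partial_t+\Delta)g=0$, so by the divergence theorem,

$$\int_{\partial\Omega}F\cdot\nu = \int_\Omega \operatorname{div}_{t,x}F = \int_\Omega \big(g(\partial_tf-\Delta f)+f(\partial_tg+\Delta g)\big) = 0.$$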

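On the other hand, as

$$\partial\Omega = \{(b,x):|x|\leq r\}\cup\{(a,x):|x|\leq r\}\cup\{(t,x):a\leq t\leq b,\ |x|=r\},$$

we have

$$\begin{aligned}
\int_{\partial\Omega}F\cdot\nu &= \int_{|x|\leq r}F(b,x)\cdot(1,0,\ldots,0)\,dx+\int_{|x|\leq r}F(a,x)\cdot(-1,0,\ldots,0)\,dx+\int_a^b\int_{|x|=r}F(t,x)\cdot\left(0,\frac{x}{r}\right)d\sigma(x)\,dt\\
&= \int_{|x|\leq r}f(b,x)g(b,x)\,dx-\int_{|x|\leq r}f(a,x)g(a,x)\,dx+\int_a^b\int_{|x|=r}\sum_{j=1}^n(f\partial_jg-g\partial_jf)(t,x)\frac{x_j}{r}\,d\sigma(x)\,dt\\
&= \int_{|x|\leq r}u(b,x)k(t_0-b,x-x_0)\,dx-\int_{|x|\leq r}u(a,x)k(t_0-a,x-x_0)\,dx+\int_a^b\int_{|x|=r}\sum_{j=1}^n(u\partial_jg-g\partial_ju)(t,x)\frac{x_j}{r}\,d\sigma(x)\,dt,
\end{aligned}$$

where $\sigma$ is surface measure on $\{|x|=r\}=rS^{n-1}$. As $r \to \infty$, the first two terms tend to

$$\int_{\mathbb{R}^n}u(b,x)k(t_0-b,x-x_0)\,dx$$

and

$$-\int_{\mathbb{R}^n}u(a,x)k(t_0-a,x-x_0)\,dx$$

respectively. Let $\epsilon<\frac{1}{4(t_0-a)}$, and let $C$ be as given in the statement of the theorem. Using $\partial_jk(t,x)=-\frac{x_j}{2t}k(t,x)$, so that $|D_xg(t,x)|=\frac{|x-x_0|}{2(t_0-t)}g(t,x)$, for any $r>0$ the third term is bounded by

$$\int_a^b\int_{|x|=r}Ce^{\epsilon r^2}\left(\frac{|x-x_0|}{2(t_0-t)}+1\right)k(t_0-t,x-x_0)\,d\sigma(x)\,dt,$$

which, for $r \geq |x_0|$, is bounded by

$$C(4\pi(t_0-b))^{-n/2}\left(\frac{r+|x_0|}{2(t_0-b)}+1\right)e^{\epsilon r^2}\int_a^b\int_{|x|=r}\exp\left(-\frac{(r-|x_0|)^2}{4(t_0-a)}\right)d\sigma(x)\,dt,$$

and writing $\eta=\frac{1}{4(t_0-a)}-\epsilon$ and $\omega_n=\frac{2\pi^{n/2}}{\Gamma(n/2)}$, the surface area of the sphere of radius $1$ in $\mathbb{R}^n$, this is equal to

$$C(4\pi(t_0-b))^{-n/2}\left(\frac{r+|x_0|}{2(t_0-b)}+1\right)(b-a)\,\omega_nr^{n-1}\exp\left(-\eta r^2+\frac{2r|x_0|-|x_0|^2}{4(t_0-a)}\right),$$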

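which tends to $0$ as $r \to \infty$. Therefore,

$$\int_{\mathbb{R}^n}u(b,x)k(t_0-b,x-x_0)\,dx = \int_{\mathbb{R}^n}u(a,x)k(t_0-a,x-x_0)\,dx.$$

One checks that as $b \to t_0$, the left-hand side tends to $u(t_0,x_0)$, and that as $a \to 0$, the right-hand side tends to $\int_{\mathbb{R}^n}u(0,x)k(t_0,x-x_0)\,dx=0$, because $u(0,x)=0$ for all $x$. Therefore,

$$u(t_0,x_0) = 0.$$

This is true for any $t_0>0$, $x_0 \in \mathbb{R}^n$, and as $u:[0,\infty)\times\mathbb{R}^n \to \mathbb{C}$ is continuous, it follows that $u$ is identically $0$. ∎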

3 Fundamental solutions

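We extend $k$ to $\mathbb{R}\times\mathbb{R}^n$ as

$$k(t,x) = \begin{cases}(4\pi t)^{-n/2}\exp\left(-\frac{|x|^2}{4t}\right) & t>0,\\ 0 & t \leq 0.\end{cases}$$

This function is locally integrable in $\mathbb{R}\times\mathbb{R}^n$, so it makes sense to define $\Lambda_k \in \mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$ by

$$\Lambda_k\phi = \int_{\mathbb{R}}\int_{\mathbb{R}^n}\phi(t,x)k(t,x)\,dx\,dt, \qquad \phi \in \mathscr{D}(\mathbb{R}\times\mathbb{R}^n).$$

Suppose that $P$ is a polynomial in $n$ variables:

$$P(\xi) = \sum_\alpha c_\alpha\xi^\alpha.$$

We say that $E \in \mathscr{D}'(\mathbb{R}^n)$ is a fundamental solution of the differential operator

$$P(D) = \sum_\alpha c_\alpha D^\alpha$$

if $P(D)E=\delta$. If $E=\Lambda_f$ for some locally integrable $f$, $\Lambda_f\phi=\int_{\mathbb{R}^n}\phi(x)f(x)\,dx$, we also say that the function $f$ is a fundamental solution of the differential operator $P(D)$. We now prove that the heat kernel extended to $\mathbb{R}\times\mathbb{R}^n$ in the above way is a fundamental solution of the heat operator (Gerald B. Folland, Introduction to Partial Differential Equations, second ed., p. 146, Theorem 4.6).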

Theorem 2.

$\Lambda_k$ is a fundamental solution of $D_t-\Delta$.


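Proof. For $\epsilon>0$, define $K_\epsilon(t,x)=k(t,x)$ if $t>\epsilon$ and $K_\epsilon(t,x)=0$ otherwise. For any $\phi \in \mathscr{D}(\mathbb{R}\times\mathbb{R}^n)$,

$$\begin{aligned}
\left|\int_{\mathbb{R}}\int_{\mathbb{R}^n}(k(t,x)-K_\epsilon(t,x))\phi(t,x)\,dx\,dt\right| &= \left|\int_0^\epsilon\int_{\mathbb{R}^n}k(t,x)\phi(t,x)\,dx\,dt\right|\\
&\leq \|\phi\|_\infty\int_0^\epsilon\int_{\mathbb{R}^n}k_t(x)\,dx\,dt\\
&= \|\phi\|_\infty\int_0^\epsilon dt\\
&= \|\phi\|_\infty\epsilon.
\end{aligned}$$

This shows that $\Lambda_{K_\epsilon} \to \Lambda_k$ in $\mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$ as $\epsilon \to 0$, with the weak-* topology. It is a fact that for any multi-index $\alpha$, $E \mapsto D^\alpha E$ is continuous $\mathscr{D}'(\mathbb{R}\times\mathbb{R}^n) \to \mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$, and hence $(D_t-\Delta)\Lambda_{K_\epsilon} \to (D_t-\Delta)\Lambda_k$ in $\mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$. Therefore, to prove the theorem it suffices to prove that $(D_t-\Delta)\Lambda_{K_\epsilon} \to \delta$ (because $\mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$ with the weak-* topology is Hausdorff).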

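Let $\phi \in \mathscr{D}(\mathbb{R}\times\mathbb{R}^n)$. Each derivative moved onto the test function contributes a factor $-1$, so the distributional derivative $(D_t-\Delta)\Lambda_{K_\epsilon}$ pairs $K_\epsilon$ with $(-D_t-\Delta)\phi$. Doing integration by parts, and using that $(D_t-\Delta)k=0$ for $t>0$,

$$\begin{aligned}
(D_t-\Delta)\Lambda_{K_\epsilon}(\phi) &= \Lambda_{K_\epsilon}((-D_t-\Delta)\phi)\\
&= \int_{\mathbb{R}}\int_{\mathbb{R}^n}K_\epsilon(t,x)\big(-D_t\phi(t,x)-\Delta\phi(t,x)\big)\,dx\,dt\\
&= \int_\epsilon^\infty\int_{\mathbb{R}^n}\big(-k(t,x)D_t\phi(t,x)-k(t,x)\Delta\phi(t,x)\big)\,dx\,dt\\
&= \int_{\mathbb{R}^n}\left(k(\epsilon,x)\phi(\epsilon,x)+\int_\epsilon^\infty\phi(t,x)D_tk(t,x)\,dt\right)dx-\int_\epsilon^\infty\int_{\mathbb{R}^n}\phi(t,x)\Delta k(t,x)\,dx\,dt\\
&= \int_{\mathbb{R}^n}k(\epsilon,x)\phi(\epsilon,x)\,dx+\int_\epsilon^\infty\int_{\mathbb{R}^n}\phi(t,x)(D_t-\Delta)k(t,x)\,dx\,dt\\
&= \int_{\mathbb{R}^n}k(\epsilon,x)\phi(\epsilon,x)\,dx.
\end{aligned}$$

So, using $k_t(x)=k_t(-x)$ and writing $\phi_t(x)=\phi(t,x)$,

$$\begin{aligned}
(D_t-\Delta)\Lambda_{K_\epsilon}(\phi) &= \int_{\mathbb{R}^n}k_\epsilon(-x)\phi_\epsilon(x)\,dx\\
&= k_\epsilon*\phi_\epsilon(0)\\
&= k_\epsilon*\phi_0(0)+k_\epsilon*(\phi_\epsilon-\phi_0)(0).
\end{aligned}$$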

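Using the definition of convolution, the second term is bounded by

$$\|k_\epsilon\|_1\|\phi_\epsilon-\phi_0\|_\infty = \|\phi_\epsilon-\phi_0\|_\infty,$$

which tends to $0$ as $\epsilon \to 0$. Because $k$ is an approximate identity, $k_\epsilon*\phi_0(0) \to \phi_0(0)$ as $\epsilon \to 0$. That is,

$$(D_t-\Delta)\Lambda_{K_\epsilon}(\phi) \to \phi_0(0) = \phi(0,0) = \delta(\phi)$$

as $\epsilon \to 0$, showing that $(D_t-\Delta)\Lambda_{K_\epsilon} \to \delta$ in $\mathscr{D}'(\mathbb{R}\times\mathbb{R}^n)$ and completing the proof. ∎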
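The last step of the proof says that $\int_{\mathbb{R}^n}k(\epsilon,x)\phi(\epsilon,x)\,dx \to \phi(0,0)$ as $\epsilon \to 0$. The following sketch illustrates this in dimension $n=1$ for one particular test function. The bump function, the product form of `phi`, and the quadrature parameters are assumptions made for illustration only; the kernel is taken to be $k(t,x)=(4\pi t)^{-1/2}e^{-x^2/(4t)}$.

```python
import math


def k(t, x):
    """One-dimensional heat kernel k(t, x) = (4 pi t)^(-1/2) exp(-x^2 / (4t)), t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def bump(s):
    """Smooth function supported in [-1, 1]: exp(-1 / (1 - s^2)) inside, 0 outside."""
    d = 1.0 - s * s
    if d <= 0.0:
        return 0.0
    return math.exp(-1.0 / d)


def phi(t, x):
    """Illustrative compactly supported smooth test function on R x R."""
    return bump(t) * bump(x)


def pairing(eps, steps=20000):
    """Trapezoid approximation of the integral of k(eps, x) phi(eps, x) over [-1, 1].

    The interval [-1, 1] contains the support of x -> phi(eps, x).
    """
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        x = -1.0 + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * k(eps, x) * phi(eps, x)
    return total * h


if __name__ == "__main__":
    target = phi(0.0, 0.0)  # the value delta(phi)
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, pairing(eps), abs(pairing(eps) - target))
```

The printed errors shrink as $\epsilon$ decreases, which is what the convergence $(D_t-\Delta)\Lambda_{K_\epsilon}(\phi) \to \delta(\phi)$ predicts for this $\phi$.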

4 Functions of the Laplacian

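This section is my working through of material in Folland (Gerald B. Folland, Introduction to Partial Differential Equations, second ed., pp. 149–152, §4B). For $f \in \mathscr{S}_n$ and for any nonnegative integer $k$, doing integration by parts we get

$$\mathscr{F}((-\Delta)^kf)(\xi) = (4\pi^2|\xi|^2)^k(\mathscr{F}f)(\xi), \qquad \xi \in \mathbb{R}^n.$$

Suppose that $P$ is a polynomial in one variable: $P(x)=\sum_kc_kx^k$. Then, writing $P(-\Delta)=\sum_kc_k(-\Delta)^k$, we have

$$\begin{aligned}
\mathscr{F}(P(-\Delta)f)(\xi) &= \sum_kc_k\mathscr{F}((-\Delta)^kf)(\xi)\\
&= \sum_kc_k(4\pi^2|\xi|^2)^k(\mathscr{F}f)(\xi)\\
&= (\mathscr{F}f)(\xi)P(4\pi^2|\xi|^2).
\end{aligned}$$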
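The identity $\mathscr{F}(P(-\Delta)f)(\xi)=(\mathscr{F}f)(\xi)P(4\pi^2|\xi|^2)$ can be checked numerically for a concrete $f$. The sketch below works in dimension $n=1$ with $f(x)=e^{-\pi x^2}$, for which $\hat{f}(\xi)=e^{-\pi\xi^2}$, and a quadratic polynomial $P$. The derivatives of $f$ are written out by hand, and the names and quadrature parameters are illustrative assumptions.

```python
import math


def f(x):
    """Gaussian f(x) = exp(-pi x^2), whose Fourier transform is exp(-pi xi^2)."""
    return math.exp(-math.pi * x * x)


def minus_laplacian_f(x):
    """(-Delta) f = -f'' where f'' = (4 pi^2 x^2 - 2 pi) f."""
    return (2.0 * math.pi - 4.0 * math.pi ** 2 * x * x) * f(x)


def minus_laplacian_squared_f(x):
    """(-Delta)^2 f = f'''' = (12 pi^2 - 48 pi^3 x^2 + 16 pi^4 x^4) f."""
    return (12.0 * math.pi ** 2 - 48.0 * math.pi ** 3 * x ** 2
            + 16.0 * math.pi ** 4 * x ** 4) * f(x)


def apply_polynomial(coeffs, x):
    """Evaluate (P(-Delta) f)(x) for P(s) = c0 + c1 s + c2 s^2."""
    c0, c1, c2 = coeffs
    return c0 * f(x) + c1 * minus_laplacian_f(x) + c2 * minus_laplacian_squared_f(x)


def transform_even(g, xi, cutoff=8.0, steps=8000):
    """Trapezoid approximation of the Fourier transform of an even function g at xi."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        x = -cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(x) * math.cos(2.0 * math.pi * xi * x)
    return total * h


def symbol_side(coeffs, xi):
    """Right-hand side: hat f(xi) * P(4 pi^2 xi^2)."""
    c0, c1, c2 = coeffs
    s = 4.0 * math.pi ** 2 * xi * xi
    return math.exp(-math.pi * xi * xi) * (c0 + c1 * s + c2 * s * s)


if __name__ == "__main__":
    coeffs = (2.0, -1.0, 0.5)
    for xi in (0.0, 0.5, 1.2):
        lhs = transform_even(lambda x: apply_polynomial(coeffs, x), xi)
        print(xi, lhs, symbol_side(coeffs, xi))
```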

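We remind ourselves that tempered distributions are elements of $\mathscr{S}_n'$, i.e. continuous linear maps $\mathscr{S}_n \to \mathbb{C}$. The Fourier transform of a tempered distribution $\Lambda$ is defined by $\hat{\Lambda}f=(\mathscr{F}\Lambda)f=\Lambda\hat{f}$, $f \in \mathscr{S}_n$. It is a fact that the Fourier transform is an isomorphism of locally convex spaces $\mathscr{S}_n' \to \mathscr{S}_n'$ (Walter Rudin, Functional Analysis, second ed., p. 192, Theorem 7.15).

Suppose that $\psi:(0,\infty) \to \mathbb{C}$ is a function such that

$$\Lambda f = \int_{\mathbb{R}^n}\psi(4\pi^2|\xi|^2)f(\xi)\,d\xi, \qquad f \in \mathscr{S}_n,$$

is a tempered distribution. We define $\psi(-\Delta):\mathscr{S}_n \to \mathscr{S}_n'$ by

$$\psi(-\Delta)f = \mathscr{F}^{-1}(\hat{f}\Lambda), \qquad f \in \mathscr{S}_n.$$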

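Define $\check{f}(x)=f(-x)$; this is not the inverse Fourier transform of $f$, which we denote by $\mathscr{F}^{-1}f$. As well, write $\tau_xf(y)=f(y-x)$. For $u \in \mathscr{S}_n'$ and $\phi \in \mathscr{S}_n$, we define the convolution $u*\phi:\mathbb{R}^n \to \mathbb{C}$ by

$$(u*\phi)(x) = u(\tau_x\check{\phi}), \qquad x \in \mathbb{R}^n.$$

One proves that $u*\phi \in C^\infty(\mathbb{R}^n)$, that

$$D^\alpha(u*\phi) = (D^\alpha u)*\phi = u*(D^\alpha\phi)$$

for any multi-index $\alpha$, that $u*\phi$ is a tempered distribution, that $\mathscr{F}(u*\phi)=\hat{\phi}\hat{u}$, and that $\hat{u}*\hat{\phi}=\mathscr{F}(\phi u)$ (Walter Rudin, Functional Analysis, second ed., p. 195, Theorem 7.19).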

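We can also write $\psi(-\Delta)$ in the following way. There is a unique $\kappa_\psi \in \mathscr{S}_n'$ such that

$$\mathscr{F}\kappa_\psi = \Lambda.$$

For $f \in \mathscr{S}_n$, we have $\mathscr{F}(\kappa_\psi*f)=\hat{f}\hat{\kappa}_\psi=\hat{f}\Lambda$, but, using the definition of $\psi(-\Delta)$ we also have $\mathscr{F}(\psi(-\Delta)f)=\mathscr{F}\mathscr{F}^{-1}(\hat{f}\Lambda)=\hat{f}\Lambda$, so

$$\psi(-\Delta)f = \kappa_\psi*f, \qquad f \in \mathscr{S}_n.$$

Moreover, $\kappa_\psi*f \in C^\infty(\mathbb{R}^n)$; this shows that $\psi(-\Delta)f$ can be interpreted as a tempered distribution or as a function. We call $\kappa_\psi$ the convolution kernel of $\psi(-\Delta)$.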

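For a fixed $t>0$, define $\psi(s)=e^{-ts}$. Then $\Lambda:\mathscr{S}_n \to \mathbb{C}$ defined by

$$\Lambda f = \int_{\mathbb{R}^n}\psi(4\pi^2|\xi|^2)f(\xi)\,d\xi = \int_{\mathbb{R}^n}e^{-4\pi^2t|\xi|^2}f(\xi)\,d\xi = \int_{\mathbb{R}^n}\hat{k}_t(\xi)f(\xi)\,d\xi$$

is a tempered distribution. Using the Plancherel theorem, we have

$$\Lambda f = \int_{\mathbb{R}^n}\hat{k}_t(\xi)f(\xi)\,d\xi = \int_{\mathbb{R}^n}k_t(x)\hat{f}(x)\,dx, \qquad f \in \mathscr{S}_n.$$

With $\kappa_\psi \in \mathscr{S}_n'$ such that $\mathscr{F}\kappa_\psi=\Lambda$, we have

$$\kappa_\psi\hat{f} = (\mathscr{F}\kappa_\psi)f = \Lambda f = \int_{\mathbb{R}^n}k_t(x)\hat{f}(x)\,dx.$$

Because $f \mapsto \hat{f}$ is a bijection $\mathscr{S}_n \to \mathscr{S}_n$, this shows that for any $f \in \mathscr{S}_n$ we have

$$\kappa_\psi f = \int_{\mathbb{R}^n}k_t(x)f(x)\,dx.$$

That is, $\kappa_\psi$ is the function $k_t$, and writing $e^{t\Delta}=\psi(-\Delta)$,

$$e^{t\Delta}f = \kappa_\psi*f = k_t*f, \qquad t>0,\ f \in \mathscr{S}_n. \tag{1}$$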
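On the Fourier side, (1) says that $e^{t\Delta}$ multiplies $\hat{f}$ by $e^{-4\pi^2t|\xi|^2}$, so one expects $e^{s\Delta}e^{t\Delta}=e^{(s+t)\Delta}$, equivalently $k_s*k_t=k_{s+t}$. This semigroup identity is not stated in the text above; the sketch below checks it numerically in dimension $n=1$, assuming the normalization $k_t(x)=(4\pi t)^{-1/2}e^{-x^2/(4t)}$. The function names and quadrature parameters are illustrative.

```python
import math


def k(t, x):
    """One-dimensional heat kernel k_t(x) = (4 pi t)^(-1/2) exp(-x^2 / (4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def convolve_kernels(s, t, x, width=12.0, steps=4000):
    """Trapezoid approximation of (k_s * k_t)(x) = integral of k_s(x - y) k_t(y) dy.

    The integration variable y is restricted to |y| <= width * sqrt(2t), where
    sqrt(2t) is the standard deviation of k_t; k_s is bounded, so the tails are negligible.
    """
    half = width * math.sqrt(2.0 * t)
    h = 2.0 * half / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * k(s, x - y) * k(t, y)
    return total * h


if __name__ == "__main__":
    for s, t, x in ((0.3, 0.5, 0.8), (1.0, 0.2, -1.5), (0.05, 0.05, 0.1)):
        print(s, t, x, convolve_kernels(s, t, x), k(s + t, x))
```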

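Suppose that $\phi:(0,\infty) \to \mathbb{C}$ and $\omega:(0,\infty) \to (0,\infty)$ are functions and that

$$\psi(s) = \int_0^\infty\phi(\tau)e^{-s\omega(\tau)}\,d\tau, \qquad s>0.$$

Manipulating symbols suggests that it may be true that

$$\psi(-\Delta) = \int_0^\infty\phi(\tau)e^{\omega(\tau)\Delta}\,d\tau,$$

and then, for $f \in \mathscr{S}_n$,

$$\psi(-\Delta)f = \int_0^\infty\phi(\tau)e^{\omega(\tau)\Delta}f\,d\tau = \int_0^\infty\phi(\tau)\,k_{\omega(\tau)}*f\,d\tau,$$

and hence

$$\kappa_\psi(x) = \int_0^\infty\phi(\tau)k_{\omega(\tau)}(x)\,d\tau, \qquad x \in \mathbb{R}^n. \tag{2}$$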

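Take $\psi(s)=s^{-\beta}$ with $0<\operatorname{Re}\beta<\frac{n}{2}$. Because $\operatorname{Re}\beta<\frac{n}{2}$, one checks that

$$\Lambda f = \int_{\mathbb{R}^n}(4\pi^2|\xi|^2)^{-\beta}f(\xi)\,d\xi, \qquad f \in \mathscr{S}_n,$$

is a tempered distribution. As $\operatorname{Re}\beta>0$, we have

$$s^{-\beta} = \frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}e^{-s\tau}\,d\tau, \qquad s>0,$$

and writing $\phi(\tau)=\frac{\tau^{\beta-1}}{\Gamma(\beta)}$ and $\omega(\tau)=\tau$, we suspect from (2) that the convolution kernel of $(-\Delta)^{-\beta}$ is

$$\frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}k_\tau(x)\,d\tau,$$

which one calculates is equal to

$$\frac{\Gamma\left(\frac{n}{2}-\beta\right)}{\Gamma(\beta)4^\beta\pi^{n/2}|x|^{n-2\beta}}. \tag{3}$$

What we have written so far does not prove that this is the convolution kernel of $(-\Delta)^{-\beta}$ because it used (2), but it is straightforward to calculate that indeed the convolution kernel of $(-\Delta)^{-\beta}$ is (3). This calculation is explained in an exercise in Folland (Gerald B. Folland, Introduction to Partial Differential Equations, second ed., p. 154, Exercise 1).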
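The step "which one calculates is equal to (3)" can be checked numerically for real $\beta$. The sketch below evaluates $\frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}k_\tau(x)\,d\tau$ with the substitution $\tau=e^s$ and a trapezoid rule, and compares it with the closed form (3). It assumes the normalization $k_\tau(x)=(4\pi\tau)^{-n/2}e^{-|x|^2/(4\tau)}$ and real $0<\beta<\frac{n}{2}$; the function names and integration limits are illustrative choices. For $n=3$, $\beta=1$ the closed form reduces to $\frac{1}{4\pi|x|}$.

```python
import math


def riesz_kernel_integral(r, n, beta, s_min=-12.0, s_max=80.0, steps=20000):
    """(1/Gamma(beta)) * integral over tau > 0 of tau^(beta-1) k_tau(x), with |x| = r > 0.

    After tau = exp(s) the integrand is tau^beta * k_tau(x) in the variable s,
    which decays rapidly at both ends when 0 < beta < n/2.
    """
    h = (s_max - s_min) / steps
    total = 0.0
    for i in range(steps + 1):
        s = s_min + i * h
        tau = math.exp(s)
        weight = 0.5 if i in (0, steps) else 1.0
        kernel = (4.0 * math.pi * tau) ** (-n / 2.0) * math.exp(-r * r / (4.0 * tau))
        total += weight * tau ** beta * kernel
    return total * h / math.gamma(beta)


def riesz_kernel_closed(r, n, beta):
    """Closed form (3): Gamma(n/2 - beta) / (Gamma(beta) 4^beta pi^(n/2) r^(n - 2 beta))."""
    return math.gamma(n / 2.0 - beta) / (
        math.gamma(beta) * 4.0 ** beta * math.pi ** (n / 2.0) * r ** (n - 2.0 * beta)
    )


if __name__ == "__main__":
    for r, n, beta in ((1.0, 3, 1.0), (2.0, 4, 0.5), (0.5, 5, 1.5)):
        print(r, n, beta, riesz_kernel_integral(r, n, beta), riesz_kernel_closed(r, n, beta))
```

Agreement of the two columns supports the calculation of the integral only; as the text says, it does not by itself justify the symbolic step (2).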

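Taking $\alpha=2\beta$ and defining

$$R_\alpha(x) = \frac{\Gamma\left(\frac{n-\alpha}{2}\right)}{\Gamma\left(\frac{\alpha}{2}\right)2^\alpha\pi^{n/2}|x|^{n-\alpha}}, \qquad 0<\operatorname{Re}\alpha<n,$$

we call $R_\alpha$ the Riesz potential of order $\alpha$. Taking as granted that (3) is the convolution kernel of $(-\Delta)^{-\beta}$, we have

$$(-\Delta)^{-\alpha/2}f = R_\alpha*f, \qquad f \in \mathscr{S}_n.$$

If $n>2$, then $\alpha=2$ satisfies $0<\operatorname{Re}\alpha<n$, and we work out that

$$R_2(x) = \frac{\Gamma\left(\frac{n}{2}-1\right)}{4\pi^{n/2}|x|^{n-2}} = \frac{1}{(n-2)\omega_n|x|^{n-2}},$$

where $\omega_n=\frac{2\pi^{n/2}}{\Gamma(n/2)}$, and hence

$$(-\Delta)^{-1}f = R_2*f, \qquad f \in \mathscr{S}_n,$$

and applying $-\Delta$ we obtain

$$f = -\Delta(R_2*f) = (-\Delta R_2)*f, \qquad f \in \mathscr{S}_n,$$

hence $-\Delta R_2=\delta$. That is, $R_2$ is the fundamental solution for $-\Delta$.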
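Two consequences of this computation are easy to test numerically: the Gamma function identity $\frac{\Gamma(\frac{n}{2}-1)}{4\pi^{n/2}}=\frac{1}{(n-2)\omega_n}$, and the fact that $-\Delta R_2=\delta$ forces $R_2$ to be harmonic away from the origin. The sketch below checks both with central finite differences; the helper names, sample points, and step size are illustrative assumptions.

```python
import math


def unit_sphere_area(n):
    """omega_n = 2 pi^(n/2) / Gamma(n/2), the surface area of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def riesz_2(x):
    """R_2(x) = Gamma(n/2 - 1) / (4 pi^(n/2) |x|^(n-2)) for x in R^n, n > 2, x != 0."""
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    return math.gamma(n / 2.0 - 1.0) / (4.0 * math.pi ** (n / 2.0) * r ** (n - 2))


def newtonian(x):
    """The same kernel written as 1 / ((n - 2) omega_n |x|^(n-2))."""
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    return 1.0 / ((n - 2) * unit_sphere_area(n) * r ** (n - 2))


def laplacian(func, x, h=1e-3):
    """Central second-difference approximation of the Laplacian of func at x."""
    total = 0.0
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        total += (func(xp) - 2.0 * func(x) + func(xm)) / (h * h)
    return total


if __name__ == "__main__":
    for x in ([1.0, 2.0, 2.0], [0.5, -1.0, 0.3, 2.0], [1.0, 1.0, 1.0, 1.0, 1.0]):
        print(len(x), riesz_2(x), newtonian(x), laplacian(riesz_2, x))
```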

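Suppose that $\operatorname{Re}\beta>0$. Then, using the definition of $\Gamma(\beta)$ as an integral, with $\psi(s)=(1+s)^{-\beta}$, we have

$$\psi(s) = (1+s)^{-\beta} = \frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}e^{-\tau}e^{-s\tau}\,d\tau, \qquad s>0.$$

Manipulating symbols suggests that

$$(1-\Delta)^{-\beta} = \frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}e^{-\tau}e^{\tau\Delta}\,d\tau,$$

and using (1), assuming the above is true we would have for all $f \in \mathscr{S}_n$,

$$(1-\Delta)^{-\beta}f = \frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}e^{-\tau}\,k_\tau*f\,d\tau,$$

whose convolution kernel is

$$\frac{1}{\Gamma(\beta)}\int_0^\infty\tau^{\beta-1}e^{-\tau}k_\tau(x)\,d\tau.$$

We write $\alpha=2\beta$ and define, for $\operatorname{Re}\alpha>0$,

$$B_\alpha(x) = \frac{1}{\Gamma\left(\frac{\alpha}{2}\right)}\int_0^\infty\tau^{\frac{\alpha}{2}-1}e^{-\tau}k_\tau(x)\,d\tau = \frac{1}{(4\pi)^{n/2}\Gamma\left(\frac{\alpha}{2}\right)}\int_0^\infty\tau^{\frac{\alpha-n}{2}-1}e^{-\tau-\frac{|x|^2}{4\tau}}\,d\tau.$$

We call $B_\alpha$ the Bessel potential of order $\alpha$. It is straightforward to show, and shown in Folland, that $\|B_\alpha\|_1<\infty$, so $B_\alpha \in L^1(\mathbb{R}^n)$. Therefore we can take the Fourier transform of $B_\alpha$, and one calculates that it is

$$\hat{B}_\alpha(\xi) = (1+4\pi^2|\xi|^2)^{-\alpha/2}, \qquad \xi \in \mathbb{R}^n,$$

and then

$$(1-\Delta)^{-\alpha/2}f = B_\alpha*f, \qquad f \in \mathscr{S}_n.$$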
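For $\alpha=2$ the integral defining the Bessel potential can be compared with known closed forms: in dimension $n=1$ the inverse Fourier transform of $(1+4\pi^2\xi^2)^{-1}$ is $\frac{1}{2}e^{-|x|}$, and in dimension $n=3$ it is $\frac{e^{-|x|}}{4\pi|x|}$. These closed forms are standard facts that are not taken from the text above. The sketch below evaluates $B_\alpha(x)=\frac{1}{\Gamma(\alpha/2)}\int_0^\infty\tau^{\alpha/2-1}e^{-\tau}k_\tau(x)\,d\tau$ numerically, assuming $k_\tau(x)=(4\pi\tau)^{-n/2}e^{-|x|^2/(4\tau)}$, real $\alpha>0$, and $x \neq 0$; names and integration limits are illustrative.

```python
import math


def bessel_kernel_integral(r, n, alpha, s_min=-30.0, s_max=8.0, steps=20000):
    """B_alpha at |x| = r > 0 via the substitution tau = exp(s) and the trapezoid rule.

    In the variable s the integrand is tau^(alpha/2) * exp(-tau) * k_tau(x); the factor
    exp(-tau) cuts off large tau and exp(-r^2 / (4 tau)) cuts off small tau.
    """
    h = (s_max - s_min) / steps
    total = 0.0
    for i in range(steps + 1):
        s = s_min + i * h
        tau = math.exp(s)
        weight = 0.5 if i in (0, steps) else 1.0
        kernel = (4.0 * math.pi * tau) ** (-n / 2.0) * math.exp(-tau - r * r / (4.0 * tau))
        total += weight * tau ** (alpha / 2.0) * kernel
    return total * h / math.gamma(alpha / 2.0)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        one_dim = bessel_kernel_integral(r, 1, 2.0)
        three_dim = bessel_kernel_integral(r, 3, 2.0)
        print(r, one_dim, 0.5 * math.exp(-r), three_dim, math.exp(-r) / (4.0 * math.pi * r))
```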


5 Gaussian measure

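If $\mu$ is a measure on $\mathbb{R}^n$ and $f:\mathbb{R}^n \to \mathbb{C}$ is a function such that for every $x \in \mathbb{R}^n$ the integral $\int_{\mathbb{R}^n}f(x-y)\,d\mu(y)$ converges, we define the convolution $\mu*f:\mathbb{R}^n \to \mathbb{C}$ by

$$(\mu*f)(x) = \int_{\mathbb{R}^n}f(x-y)\,d\mu(y), \qquad x \in \mathbb{R}^n.$$

Let $\nu_t$ be the measure on $\mathbb{R}^n$ with density $k_t$. We call $\nu_t$ Gaussian measure. It satisfies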