Notes on the KAM theorem

Jordan Bell
April 6, 2015

1 Introduction

I hope eventually to expand these notes into a standalone presentation of KAM that gives a precise formulation of the theorem and detailed proofs of everything. Few presentations of KAM in the literature give a precise formulation of the theorem, and even those that do, such as [6] and [7], gloss over some details. Gallavotti [4] explains the history of quasi-periodic phenomena in celestial mechanics.

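Let $\mathbb{T}^n=\mathbb{R}^n/\mathbb{Z}^n$.

For $x,y\in\mathbb{R}^n$, let $\langle x,y\rangle=\sum_{j=1}^n x_jy_j$. Let $\|x\|_2=\sqrt{\sum_{j=1}^n x_j^2}$ and let $\|x\|_\infty=\max_{1\leq j\leq n}|x_j|$. For $x,y\in\mathbb{R}^n$, we have $|\langle x,y\rangle|\leq n\|x\|_\infty\|y\|_\infty$.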

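If $(M,\omega)$ is a symplectic manifold and $H\in C^\infty(M)$, then the Hamiltonian vector field with energy function $H$ is the vector field $X_H$ on $M$ uniquely determined by the condition $\omega_x(X_H(x),v)=(dH)(x)(v)$ for all points $x\in M$ and tangent vectors $v\in T_xM$.

We say that $(q_1,\ldots,q_n,p_1,\ldots,p_n)$ are canonical coordinates for $(M,\omega)$ if $\omega=\sum_{j=1}^n dq_j\wedge dp_j$. If $(q_1,\ldots,q_n,p_1,\ldots,p_n)$ are canonical coordinates for $(M,\omega)$ and $H\in C^\infty(M)$ then

$$X_H(x)=\big((\nabla_pH)(x),(-\nabla_qH)(x)\big)$$

for all $x\in M$, where

$$\nabla_qH=\left(\frac{\partial H}{\partial q_1},\ldots,\frac{\partial H}{\partial q_n}\right),\qquad\nabla_pH=\left(\frac{\partial H}{\partial p_1},\ldots,\frac{\partial H}{\partial p_n}\right).$$

Let $\phi$ be the flow of $X_H$ on $M$. Then

$$\frac{d(q_j(\phi_t(x)))}{dt}=\frac{\partial H}{\partial p_j}(\phi_t(x)),\qquad\frac{d(p_j(\phi_t(x)))}{dt}=-\frac{\partial H}{\partial q_j}(\phi_t(x)),$$

called Hamilton's equations.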
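As a concrete illustration of Hamilton's equations, here is a minimal numerical sketch. The energy function $H(q,p)=(q^2+p^2)/2$ with $n=1$ and the leapfrog integrator are assumptions made for this example only and do not come from the notes. The leapfrog scheme as written is appropriate only when $H$ is a sum of a function of $q$ and a function of $p$.

```python
import math

def hamiltonian(q, p):
    """Illustrative energy function H(q, p) = (q^2 + p^2) / 2 (an assumption)."""
    return 0.5 * (q * q + p * p)

def dH_dq(q, p):
    # Partial derivative of the illustrative H with respect to q.
    return q

def dH_dp(q, p):
    # Partial derivative of the illustrative H with respect to p.
    return p

def leapfrog(q, p, dt, steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in p
        q += dt * dH_dp(q, p)         # full step in q
        p -= 0.5 * dt * dH_dq(q, p)   # half step in p
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=1000)
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

For this choice of $H$ the exact solution with $q(0)=1$, $p(0)=0$ is $q(t)=\cos t$, $p(t)=-\sin t$, so the numerical values at $t=1$ can be compared against it, and the energy $H$ stays nearly constant along the computed trajectory.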

2 Action-angle coordinates

Let $(M,\omega)$ be a $2n$-dimensional symplectic manifold. Let $f_1,\ldots,f_n\in C^\infty(M)$. If $\{f_i,f_j\}=0$ for all $1\leq i,j\leq n$ (namely the functions are in involution) and if at each point in $M$ the differentials of the functions are linearly independent in the cotangent space at that point, then we say that the set of functions is completely integrable.

We define the momentum map $F:M\to\mathbb{R}^n$ by $F=f_1\times\cdots\times f_n$.

We say that $F$ is locally trivial at a value $y_0$ in its range if there is a neighborhood $U$ of $y_0$ such that for all $y\in U$ there is a smooth map $h_y:F^{-1}(U)\to F^{-1}(y_0)$ such that $F\times h_y$ is a diffeomorphism from $F^{-1}(U)$ to $U\times F^{-1}(y_0)$. The bifurcation set of $F$ is the set $\Sigma_F$ of $y_0\in\mathbb{R}^n$ at which $F$ fails to be locally trivial.

The following theorem is proved in [1, Theorem 5.2.21].

Theorem 1.

Let $U\subseteq\mathbb{R}^n$ be open. If $F|_{F^{-1}(U)}:F^{-1}(U)\to U$ is a proper map then each of the vector fields $X_{f_i}|_{F^{-1}(U)}$ is complete, $U\subseteq\mathbb{R}^n\setminus\Sigma_F$, and the fibers of the locally trivial fibration $F|_{F^{-1}(U)}$ are disjoint unions of manifolds each diffeomorphic with $\mathbb{T}^n$.

Let $\nu\in\mathbb{R}^n$, and define the linear flow $F$ on $\mathbb{R}^n$ by $F_t(v)=v+t\nu$. Let $\pi:\mathbb{R}^n\to\mathbb{T}^n$ be the projection map and let $\phi_t:\mathbb{T}^n\to\mathbb{T}^n$ be such that $\pi\circ F_t=\phi_t\circ\pi$; if $\pi(v_1)=\pi(v_2)$ then $\pi\circ F_t(v_1)=\pi\circ F_t(v_2)$, so such a map exists, and is clearly unique. A flow $\phi$ on $\mathbb{T}^n$ induced by a linear flow on $\mathbb{R}^n$ is called a quasi-periodic flow.

Say that $\nu\neq\mu$, and let $\phi$ be the flow induced by $\nu$ and $\psi$ be the flow induced by $\mu$. Then for some $i$, $\nu_i\neq\mu_i$ and for any $t$ such that $t(\nu_i-\mu_i)\notin\mathbb{Z}$, $\phi_t(\theta)\neq\psi_t(\theta)$ for any $\theta\in\mathbb{T}^n$. Hence $\phi\neq\psi$. Thus a quasi-periodic flow is induced by a unique vector $\nu\in\mathbb{R}^n$. We call $\nu$ the frequency vector of the flow $\phi$.

We say that $\nu\in\mathbb{R}^n$ is resonant if there is some $0\neq k\in\mathbb{Z}^n$ such that $\langle k,\nu\rangle=0$, and we say that it is nonresonant otherwise.

Let $\phi$ be the quasi-periodic flow on $\mathbb{T}^n$ with frequency vector $\nu\in\mathbb{R}^n$. It can be shown that each orbit of $\phi$ is dense in $\mathbb{T}^n$ if and only if $\nu$ is nonresonant. This is proved in [1, pp. 818–820]; that each orbit of $\phi$ is dense in $\mathbb{T}^n$ if $\nu$ is nonresonant is proved in [5, Theorem 444].
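The definitions of a quasi-periodic flow and of resonance can be explored numerically. The sketch below is only illustrative: the function names `flow` and `resonances`, the search bound `max_norm` and the tolerance `tol` are assumptions introduced here. A finite floating-point search can exhibit a resonance, but it can never prove that a frequency vector is nonresonant.

```python
import math
from itertools import product

def flow(theta, nu, t):
    """Quasi-periodic flow on R^n/Z^n: theta -> theta + t*nu, reduced mod 1."""
    return tuple((th + t * v) % 1.0 for th, v in zip(theta, nu))

def resonances(nu, max_norm, tol=1e-12):
    """Nonzero integer k with max|k_j| <= max_norm and |<k, nu>| <= tol."""
    found = []
    for k in product(range(-max_norm, max_norm + 1), repeat=len(nu)):
        if not any(k):
            continue  # skip k = 0
        if abs(sum(kj * vj for kj, vj in zip(k, nu))) <= tol:
            found.append(k)
    return found

resonant_hits = resonances((1.0, 2.0), max_norm=2)
irrational_hits = resonances((1.0, math.sqrt(2.0)), max_norm=5)
```

For $\nu=(1,2)$ the search finds $k=(2,-1)$ and its negative, while for $\nu=(1,\sqrt2)$ it finds nothing in the searched range, which is consistent with (but does not prove) nonresonance.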

Let $H=f_1$; we call this distinguished function the Hamiltonian, and we are concerned with the flow of the Hamiltonian vector field $X_H$.

The following theorem is proved in [1, Theorem 5.2.24].

Theorem 2.

Let $c$ be in the range of $F$, let $I_c^0$ denote a connected component of $F^{-1}(c)$, and let $\phi$ be the flow of $X_H$. Then there is a quasi-periodic flow $\psi$ on $\mathbb{T}^n$ and a diffeomorphism $g:\mathbb{T}^n\to I_c^0$ such that $g\circ\psi_t=\phi_t|_{I_c^0}\circ g$.

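Let $\mathbb{R}^{2n}=\{(q_1,\ldots,q_n,p_1,\ldots,p_n)\}$ and let $\omega=\sum_{j=1}^n dq_j\wedge dp_j$. Let $J=\begin{bmatrix}0&I\\-I&0\end{bmatrix}$, where $I$ is the $n\times n$ identity matrix. For $u,v\in\mathbb{R}^{2n}$ we have that $\omega(u,v)=\langle u,Jv\rangle$.

Let $B^n$ be an open ball in $\mathbb{R}^n$. Since $\omega$ is invariant under translations of $p$ by elements of $\mathbb{Z}^n$, it descends to a symplectic form on $B^n\times\mathbb{T}^n$. We define coordinates $I_j=q_j$ and $\theta_j=p_j+\mathbb{Z}$, $j=1,\ldots,n$. If $H\in C^\infty(B^n\times\mathbb{T}^n)$ does not depend on $\theta_1,\ldots,\theta_n$ then we say that it has action-angle coordinates in $B^n\times\mathbb{T}^n$.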

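If $H\in C^\infty(B^n\times\mathbb{T}^n)$ admits action-angle coordinates $(I,\theta)$ then for all $x\in B^n\times\mathbb{T}^n$ we have

$$\frac{d(I_j(\phi_t(x)))}{dt}=\frac{\partial H}{\partial\theta_j}(\phi_t(x))=0,$$

i.e. $I_j(\phi_t(x))=I_j(x)$ for all $t$, and as $H$ depends only on $I$ this gives

$$\frac{d(\theta_j(\phi_t(x)))}{dt}=-\frac{\partial H}{\partial I_j}(\phi_t(x))=-\frac{\partial H}{\partial I_j}(x)=\nu_j,$$

where $\nu=\nu(I(x))$. We integrate this equation from $0$ to $t$ and get

$$\theta_j(\phi_t(x))-\theta_j(x)=t\nu_j.$$

Thus for $x\in B^n\times\mathbb{T}^n$, given $I(x)$ the trajectory $\phi_t(x)$ of $x$ under the Hamiltonian flow of $H$ can be explicitly seen if we know $\nu(I(x))$. We say that a value of $I$ determines an invariant torus for the Hamiltonian flow of $H$.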
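The explicit solution in action-angle coordinates is easy to evaluate once the frequency map is known. The following sketch approximates $\nu(I)=-\nabla_IH(I)$ by central differences, using the sign convention of the equations above, and then moves the angles linearly. The names `frequency` and `action_angle_flow`, the step `h`, and the sample Hamiltonian `H_example` are hypothetical choices made for illustration.

```python
def frequency(H, I, h=1e-6):
    """Approximate nu(I) = -grad_I H(I) by central differences."""
    nu = []
    for j in range(len(I)):
        up = list(I)
        dn = list(I)
        up[j] += h
        dn[j] -= h
        nu.append(-(H(up) - H(dn)) / (2.0 * h))
    return nu

def action_angle_flow(H, I, theta, t):
    """Actions stay fixed; angles advance by t * nu(I), reduced mod 1."""
    nu = frequency(H, I)
    new_theta = [(th + t * v) % 1.0 for th, v in zip(theta, nu)]
    return list(I), new_theta

def H_example(I):
    # Hypothetical Hamiltonian depending only on the actions.
    return I[0] ** 2 + 3.0 * I[1]

nu_example = frequency(H_example, [0.5, 0.2])
I_t, theta_t = action_angle_flow(H_example, [0.5, 0.2], [0.0, 0.0], 0.25)
```

Here $\nu(I)=(-2I_1,-3)$, so at $I=(0.5,0.2)$ the frequency vector is $(-1,-3)$ and the point stays on the torus labelled by that value of $I$.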

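If $(M,\omega)$ is a symplectic manifold and $H\in C^\infty(M)$, we say that $H$ admits action-angle coordinates $(I,\theta)$ on an open set $U\subseteq M$ if there exists a symplectic diffeomorphism $\psi:U\to B^n\times\mathbb{T}^n$ such that $H\circ\psi^{-1}$ has action-angle coordinates $(I,\theta)$ in $B^n\times\mathbb{T}^n$. If $H$ admits action-angle coordinates, then one can check that the push-forward $\psi_*X_H$ is the Hamiltonian vector field $X_{H\circ\psi^{-1}}$, so that

$$\psi_*X_H=-\sum_{j=1}^n\frac{\partial(H\circ\psi^{-1})}{\partial I_j}\frac{\partial}{\partial\theta_j}.$$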

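Let $f_1,\ldots,f_n\in C^\infty(\mathbb{R}^{2n})$. If the set $\{f_1,\ldots,f_n\}$ is completely integrable, with $H=f_1$, then for any open set $U\subseteq\mathbb{R}^n\setminus\Sigma_F$ for which $F^{-1}(c)$ is diffeomorphic to $\mathbb{T}^n$ for all $c\in U$, Abraham and Marsden [1, pp. 398–400] find action-angle coordinates in $F^{-1}(U)$. Here $F=f_1\times\cdots\times f_n$ is the momentum map. This construction is also explained by Arnold [2, pp. 282–284].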

Suppose that $H\in C^\infty(B^n\times\mathbb{T}^n)$ has action-angle coordinates $(I,\theta)$, and assume that for all $I\in B^n$,

$$\det\left(\nabla_I^2H(I)\right)\neq0.$$

Then by the inverse function theorem, for every $I\in B^n$ there is a neighborhood $U$ of $I$ and a neighborhood $V$ of $\nu=-\nabla_IH(I)$ such that $-\nabla_IH:U\to V$ is a diffeomorphism. In $U\times\mathbb{T}^n$ we can use $\nu$ and $\theta$ as coordinates.

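For $\nu\in\mathbb{R}^n$, let $g_\nu=\{k\in\mathbb{Z}^n:\langle\nu,k\rangle=0\}$, and let $\operatorname{rank}(g_\nu)$ be the rank of the $\mathbb{Z}$-module $g_\nu$, i.e. the maximal number of elements of $g_\nu$ that are linearly independent over $\mathbb{Z}$. The proof of the following theorem follows [8, Proposition 2.1].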

Theorem 3.

Let $\nu\in\Omega$ and let $r=\operatorname{rank}(g_\nu)$. In the torus with frequency $\nu$, each trajectory is dense in some $(n-r)$-dimensional subtorus and the $n$-dimensional torus is foliated by these $(n-r)$-dimensional tori.

Proof.

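There exists a basis $k_1,\ldots,k_r$ of $g_\nu$ and vectors $k_1^*,\ldots,k_{n-r}^*\in\mathbb{Z}^n$ such that the $n\times n$ matrix $K_0$ with rows $k_1^*,\ldots,k_{n-r}^*,k_1,\ldots,k_r$ has determinant 1. (I should show why such a basis exists.) Let $K_0=\begin{bmatrix}K^*\\K\end{bmatrix}$. $K^*$ is an $(n-r)\times n$ matrix and $K$ is an $r\times n$ matrix.

Let $q=K_0\theta$. Since $\det(K_0)=1$, $K_0$ is invertible over $\mathbb{Z}$. The coordinate $\theta$ is only determined up to $\mathbb{Z}^n$, and if $q^1-q^2\in\mathbb{Z}^n$ then also $\theta^1-\theta^2\in\mathbb{Z}^n$. Thus $q=K_0\theta$ are coordinates on $\mathbb{T}^n$. The equation $\dot\theta=\nu$ can be written using the $q$ coordinates as $\dot q=K_0\nu$. Then

$$K_0\nu=\begin{bmatrix}K^*\\K\end{bmatrix}\nu=\begin{bmatrix}K^*\nu\\K\nu\end{bmatrix}=\begin{bmatrix}K^*\nu\\0\end{bmatrix}.$$

Let $\nu^*=K^*\nu$.

We see that $\{l\in\mathbb{Z}^n:l_1=\cdots=l_{n-r}=0\}\subseteq g_{K_0\nu}$; since they both have rank $r$, they are equal. It follows that $\nu^*\in\mathbb{R}^{n-r}$ is nonresonant. Hence any trajectory on the $n$-dimensional torus with frequency $\nu$ is dense in the $(n-r)$-dimensional torus $\{q\in\mathbb{T}^n:q_{n-r+1}=c_1,\ldots,q_n=c_r\}$, where the constants $c_1,\ldots,c_r$ are determined by the initial point. ∎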

3 Diophantine frequency vectors

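For $c>0$ and $\gamma\geq0$ we define

$$D_n(c,\gamma)=\left\{\nu\in\mathbb{R}^n:|\langle k,\nu\rangle|\geq\frac{1}{c\|k\|_\infty^\gamma}\ \text{for all}\ k\in\mathbb{Z}^n\setminus\{0\}\right\}.$$

We further define $D_n(\gamma)=\bigcup_{c>0}D_n(c,\gamma)$.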

Theorem 4.

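For any $\nu\in\mathbb{R}^n$ and for any positive integer $K$, there is some $0\neq k\in\mathbb{Z}^n$ with $\|k\|_\infty\leq2K$ such that

$$|\langle k,\nu\rangle|\leq\frac{n\|\nu\|_\infty}{(2K)^{n-1}}.$$

Proof.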

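Let $B_K=\{k\in\mathbb{Z}^n:\|k\|_\infty\leq K\}$. The set $B_K$ has $(2K+1)^n$ elements. For $k\in B_K$ we have

$$|\langle k,\nu\rangle|\leq n\|k\|_\infty\|\nu\|_\infty\leq nK\|\nu\|_\infty.$$

Let $A=nK\|\nu\|_\infty$, so that $\langle k,\nu\rangle\in[-A,A]$ for all $k\in B_K$.

Let $M=(2K+1)^n-1$. The $M$ intervals $\left[-A+\frac{2(j-1)A}{M},-A+\frac{2jA}{M}\right]$, $j=1,\ldots,M$, cover $[-A,A]$, and $B_K$ has $M+1$ elements, so there are distinct $k',k''\in B_K$ such that $\langle k',\nu\rangle$ and $\langle k'',\nu\rangle$ lie in the same interval. Hence

$$|\langle k',\nu\rangle-\langle k'',\nu\rangle|\leq\frac{2A}{M}=\frac{2nK\|\nu\|_\infty}{(2K+1)^n-1}.$$

Since $(2K+1)^n-1\geq(2K)^n$, we have $\frac{2K}{(2K+1)^n-1}\leq\frac{1}{(2K)^{n-1}}$. Therefore for $k=k'-k''\neq0$ we have

$$|\langle k,\nu\rangle|\leq\frac{n\|\nu\|_\infty}{(2K)^{n-1}}.$$

Finally, $\|k\|_\infty\leq\|k'\|_\infty+\|k''\|_\infty\leq2K$. ∎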
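The statement of Theorem 4 can be checked by brute force for small $n$ and $K$. The sketch below is not part of the proof; the function names and the sample frequency vector $(1,\sqrt2,\sqrt3)$ are assumptions chosen for illustration. It searches all nonzero integer vectors with $\|k\|_\infty\leq2K$ and compares the smallest value of $|\langle k,\nu\rangle|$ with the bound $n\|\nu\|_\infty/(2K)^{n-1}$.

```python
import math
from itertools import product

def best_small_divisor(nu, K):
    """Minimize |<k, nu>| over nonzero integer k with max|k_j| <= 2K."""
    best_k, best_val = None, math.inf
    for k in product(range(-2 * K, 2 * K + 1), repeat=len(nu)):
        if not any(k):
            continue  # the theorem concerns k != 0
        val = abs(sum(kj * vj for kj, vj in zip(k, nu)))
        if val < best_val:
            best_k, best_val = k, val
    return best_k, best_val

def theorem4_bound(nu, K):
    """Right-hand side n * ||nu||_inf / (2K)^(n-1) of Theorem 4."""
    n = len(nu)
    return n * max(abs(v) for v in nu) / (2 * K) ** (n - 1)

nu_sample = (1.0, math.sqrt(2.0), math.sqrt(3.0))
checks = [(best_small_divisor(nu_sample, K)[1], theorem4_bound(nu_sample, K))
          for K in (1, 2, 3)]
```

Each pair in `checks` holds the best value found and the guaranteed bound. The search cost grows like $(4K+1)^n$, so this is only practical for small cases.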

Corollary 5.

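If $\gamma < n-1$ then $D_n(\gamma)=\emptyset$.

Proof.

Let $c>0$. Suppose that there is some $\nu\in D_n(c,\gamma)$. Let $K$ be the least positive integer such that $(2K)^{n-1-\gamma}$ is greater than $2cn\|\nu\|_\infty$; since $n-1-\gamma>0$ such a $K$ exists.

By Theorem 4, there is some $0\neq k\in\mathbb{Z}^n$ with $\|k\|_\infty\leq2K$ and

$$|\langle k,\nu\rangle|\leq\frac{n\|\nu\|_\infty}{(2K)^{n-1}}.$$

Then

$$\begin{aligned}
|\langle k,\nu\rangle| &\leq \frac{n\|\nu\|_\infty(2K)^{-\gamma}}{(2K)^{n-1-\gamma}}\\
&< \frac{n\|\nu\|_\infty(2K)^{-\gamma}}{2cn\|\nu\|_\infty}\\
&= \frac{1}{2c(2K)^\gamma}\\
&\leq \frac{1}{2c\|k\|_\infty^\gamma}\\
&< \frac{1}{c\|k\|_\infty^\gamma},
\end{aligned}$$

contradicting that $\nu\in D_n(c,\gamma)$. Therefore for all $c>0$, $D_n(c,\gamma)=\emptyset$. ∎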

Treschev and Zubelevich give a construction for points in $D_n(c,n-1)$ for sufficiently large $c$ [8, Theorem 9.2]. Thus there is some $C(n)$ such that for all $c\geq C(n)$, $D_n(c,n-1)\neq\emptyset$. It is clear that for $\gamma\leq\gamma'$ we have the inclusion $D_n(c,\gamma)\subseteq D_n(c,\gamma')$. Hence this construction also shows that $D_n(c,\gamma)\neq\emptyset$ for all $\gamma\geq n-1$ and $c\geq C(n)$. However this construction does not show that $m(D_n(c,n-1))>0$ for $c\geq C(n)$. Indeed, one can show that $m(D_n(n-1))=0$, but also that the set $D_n(n-1)$ has Hausdorff dimension $n$ [7, p. 5].

Our proof of the following theorem expands on [8, Theorem 9.3]. Let $Q_n(L)=\left\{\nu\in\mathbb{R}^n:\|\nu\|_\infty\leq\frac{L}{2}\right\}$, the cube in $\mathbb{R}^n$ of edge length $L$. Let $m$ be $n$-dimensional Lebesgue measure. We will use the fact that the maximal $(n-1)$-dimensional area of the intersection of $Q_n(L)$ and a hyperplane is $\sqrt2L^{n-1}$ [3].

Theorem 6.

Let $L>0$. For $\gamma>n-1$ and $c>0$,

$$m(Q_n(L)\setminus D_n(c,\gamma))\leq\frac{4\sqrt2n(3L)^{n-1}}{c}\left(1+\frac{1}{\gamma-n+1}\right).$$

Proof.

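Let $Q_n=Q_n(L)$. For $k\neq0$ let $\Pi_k=\left\{\nu\in\mathbb{R}^n:|\langle\nu,k\rangle| < \frac{1}{c\|k\|_\infty^\gamma}\right\}$. Let $\nu\in Q_n\setminus D_n(c,\gamma)$. Then there is some $k\neq0$ such that $|\langle k,\nu\rangle| < \frac{1}{c\|k\|_\infty^\gamma}$, and so $\nu\in\Pi_k$. Thus

$$Q_n\setminus D_n(c,\gamma)\subseteq\bigcup_{k\neq0}(Q_n\cap\Pi_k),$$

so

$$m(Q_n\setminus D_n(c,\gamma))\leq\sum_{k\neq0}m(Q_n\cap\Pi_k).$$

Let $k\neq0$. $\Pi_k$ is the region bounded by the two hyperplanes $\pi_1=\left\{\nu\in\mathbb{R}^n:\langle\nu,k\rangle=\frac{1}{c\|k\|_\infty^\gamma}\right\}$ and $\pi_2=\left\{\nu\in\mathbb{R}^n:\langle\nu,k\rangle=-\frac{1}{c\|k\|_\infty^\gamma}\right\}$. Let $p_1=\frac{k}{c\|k\|_\infty^\gamma\|k\|_2^2}\in\pi_1$ and $p_2=-\frac{k}{c\|k\|_\infty^\gamma\|k\|_2^2}\in\pi_2$. For any two points $\nu_1,\nu_2\in\pi_1$ we can check that $\langle p_1-p_2,\nu_1-\nu_2\rangle=0$, and for any two points $\nu_1,\nu_2\in\pi_2$ we can check that $\langle p_1-p_2,\nu_1-\nu_2\rangle=0$. Thus the vector $p_1-p_2$ is orthogonal to each of the hyperplanes $\pi_1$ and $\pi_2$. It follows that the distance between the hyperplanes $\pi_1$ and $\pi_2$ is the distance between the points $p_1$ and $p_2$, which is $\frac{2\|k\|_2}{c\|k\|_\infty^\gamma\|k\|_2^2}$. Since $\|k\|_2\geq\|k\|_\infty$, this is at most $\frac{2}{c\|k\|_\infty^{\gamma+1}}$. Therefore

$$m(Q_n\cap\Pi_k)\leq\frac{2}{c\|k\|_\infty^{\gamma+1}}\cdot\sqrt2L^{n-1},$$

where we use the fact that the maximal $(n-1)$-dimensional area of the intersection of $Q_n=Q_n(L)$ and a hyperplane is $\sqrt2L^{n-1}$ [3].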

For each positive integer $l$, the boundary of the hypercube $\{x\in\mathbb{R}^n:\|x\|_\infty\leq l\}$ consists of $2n$ faces, on each of which there are $(2l+1)^{n-1}$ points with integer coordinates. Hence for each positive integer $l$, we have $\#\{k\in\mathbb{Z}^n:\|k\|_\infty=l\}\leq2n(2l+1)^{n-1}$.

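Therefore

$$\begin{aligned}
m(Q_n\setminus D_n(c,\gamma)) &\leq \sum_{k\neq0}m(Q_n\cap\Pi_k)\\
&\leq \sum_{k\neq0}\frac{2\sqrt2L^{n-1}}{c\|k\|_\infty^{\gamma+1}}\\
&= \sum_{l=1}^\infty\sum_{\|k\|_\infty=l}\frac{2\sqrt2L^{n-1}}{cl^{\gamma+1}}\\
&\leq \sum_{l=1}^\infty2n(2l+1)^{n-1}\frac{2\sqrt2L^{n-1}}{cl^{\gamma+1}}\\
&\leq \sum_{l=1}^\infty2n(3l)^{n-1}\frac{2\sqrt2L^{n-1}}{cl^{\gamma+1}}\\
&= \frac{4\sqrt2n(3L)^{n-1}}{c}\sum_{l=1}^\infty\frac{1}{l^{\gamma-n+2}}.
\end{aligned}$$

Since the terms in the sum are positive and decreasing, we can estimate the sum using an integral:

$$\sum_{l=1}^\infty\frac{1}{l^{\gamma-n+2}}\leq1+\int_1^\infty\frac{dx}{x^{\gamma-n+2}}=1+\frac{1}{\gamma-n+1},$$

finishing the proof. ∎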
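Two ingredients of this proof are easy to sanity-check numerically: the count of integer points with $\|k\|_\infty=l$, and the size of the final bound obtained at the end of the proof. The sketch below uses hypothetical function names and brute-force enumeration, so it is only meant for small $n$ and $l$.

```python
import math
from itertools import product

def shell_count(n, l):
    """Number of k in Z^n with max|k_j| == l, counted by brute force."""
    return sum(1 for k in product(range(-l, l + 1), repeat=n)
               if max(abs(kj) for kj in k) == l)

def shell_bound(n, l):
    """Upper bound 2n(2l+1)^(n-1) used in the proof."""
    return 2 * n * (2 * l + 1) ** (n - 1)

def theorem6_bound(n, L, c, gamma):
    """(4*sqrt(2)*n*(3L)^(n-1)/c) * (1 + 1/(gamma-n+1)); needs gamma > n-1."""
    if gamma <= n - 1:
        raise ValueError("the estimate requires gamma > n - 1")
    prefactor = 4.0 * math.sqrt(2.0) * n * (3.0 * L) ** (n - 1) / c
    return prefactor * (1.0 + 1.0 / (gamma - n + 1))

counts_ok = all(shell_count(n, l) <= shell_bound(n, l)
                for n in (1, 2, 3) for l in (1, 2, 3))
```

The exact count is $(2l+1)^n-(2l-1)^n$, which the brute-force count reproduces. The bound from `theorem6_bound` is proportional to $1/c$, so it only becomes smaller than the volume $L^n$ of the cube once $c$ is large enough.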

Corollary 7.

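If $\gamma>n-1$ then $m(\mathbb{R}^n\setminus D_n(\gamma))=0$.

Proof.

Let $L>0$. For every $c>0$, $m(Q_n(L)\setminus D_n(\gamma))\leq m(Q_n(L)\setminus D_n(c,\gamma))$. By Theorem 6, $m(Q_n(L)\setminus D_n(c,\gamma))\to0$ as $c\to\infty$. Hence $m(Q_n(L)\setminus D_n(\gamma))=0$. But then

$$m(\mathbb{R}^n\setminus D_n(\gamma))=\lim_{L\to\infty}m(Q_n(L)\setminus D_n(\gamma))=\lim_{L\to\infty}0=0.$$

∎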

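Fix $\gamma>n-1$. Let $\alpha=\frac1c$. Let $A_\alpha$ be an $\alpha$-neighborhood of the boundary of $\Omega$. We will make whatever assumption about $\Omega$ we need in order to get $m(A_\alpha)=O(\alpha)$.

Suppose that $L$ is sufficiently large so that $\Omega\subseteq Q_n(L)$. Then Theorem 6 gives us that $m(\Omega\setminus D_n(c,\gamma))=O(\alpha)$.

Let $\Omega_\alpha=D_n(c,\gamma)\cap(\Omega\setminus A_\alpha)$. Since $\Omega\setminus\Omega_\alpha=(\Omega\setminus D_n(c,\gamma))\cup(\Omega\cap A_\alpha)$, we have $m(\Omega\setminus\Omega_\alpha)=O(\alpha)$.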

4 Statement of KAM

If we have a Hamiltonian system which admits action-angle coordinates in Bn×𝕋n, then the trajectories of points in phase space are constrained to lie on invariant tori. Moreover, on these tori the dynamics of the system are quasi-periodic; a priori we don’t have a reason to expect that the dynamics should be so nice just because the trajectories lie on tori. But a generic Hamiltonian on the same phase space (I would like to make this notion precise) does not admit action-angle coordinates. The KAM theorem is a statement about the dynamics induced by making a sufficiently small change to a Hamiltonian. If we perturb a Hamiltonian which admits action-angle coordinates to one which probably does not, if the perturbation is sufficiently small, then most of the trajectories of points under the flow of the new Hamiltonian will also lie on tori. In some sense which I want to clarify, the invariant tori of the new Hamiltonian are close to the invariant tori of the Hamiltonian that admits action-angle coordinates. It is not clear to me how an invariant torus of the old Hamiltonian transforms into an invariant torus of the new Hamiltonian; in what sense does an invariant torus for the old Hamiltonian become an invariant torus for the new Hamiltonian?

In particular, a consequence of the KAM theorem is that if we make a small perturbation of a Hamiltonian system that admits action-angle coordinates then the trajectories of most points will not be dense on an energy hypersurface in phase space, since they are constrained to lie on $n$-dimensional tori. In other words, the new Hamiltonian system is not ergodic on an energy hypersurface: for $n\geq2$ the invariant tori have dimension $n$, which is lower than the dimension $2n-1$ of the hypersurface, and so they have $(2n-1)$-dimensional measure 0.

Let's explain the KAM theorem in another way. Suppose that we have a symplectic manifold $M$ and a Lagrangian foliation $\mathcal{F}_0$ whose leaves are tori, and suppose that the leaves of $\mathcal{F}_0$ are invariant tori for a Hamiltonian $H_0$. That is, the Hamiltonian vector field $X_{H_0}$ is tangent to all the leaves in $\mathcal{F}_0$. Now let $H=H_0+\epsilon H_1$. The leaves of the foliation $\mathcal{F}_0$ will not be invariant under the flow of $H$. We would like to obtain a symplectomorphism $\Phi:M\to M$ such that the Hamiltonian vector field $X_H$ is tangent to most leaves in the foliation $\mathcal{F}=\Phi(\mathcal{F}_0)$. Here we mean most in a measure theoretic sense that depends on the magnitude $\epsilon$ of the perturbation away from the Hamiltonian that admits action-angle coordinates.

How do we construct a diffeomorphism? Often the best way is to demand that it be the time 1 flow of a vector field, so $\Phi=\Phi_1$ for some flow $\Phi_t$, and to see if such a vector field exists. Suppose that $f$ is a function such that if $\Phi_t$ is the flow of $X_f$ then $\Phi_1=\Phi$.

5 Normal forms

Normal forms of vector fields, homological equation [9].

References

  • [1] R. Abraham and J. E. Marsden (2008) Foundations of mechanics. Second edition, AMS Chelsea Publishing, Providence, Rhode Island.
  • [2] V. I. Arnold (1989) Mathematical methods of classical mechanics. Second edition, Graduate Texts in Mathematics, Vol. 60, Springer.
  • [3] K. Ball (1986) Cube slicing in $\mathbb{R}^n$. Proc. Amer. Math. Soc. 97 (3), pp. 465–473. ISSN 0002-9939.
  • [4] G. Gallavotti (2001) Quasi periodic motions from Hipparchus to Kolmogorov. Atti della Accademia Nazionale dei Lincei. Classe di Scienze Fisiche, Matematiche e Naturali. Rendiconti Lincei. Matematica e Applicazioni 12 (2), pp. 125–152.
  • [5] G. H. Hardy and E. M. Wright (2008) An introduction to the theory of numbers. Sixth edition, Oxford University Press. ISBN 978-0-19-921986-5.
  • [6] J. Hubbard and Y. Ilyashenko (2004) A proof of Kolmogorov’s theorem. Discrete Contin. Dyn. Syst. 10 (1-2), pp. 367–385. ISSN 1078-0947.
  • [7] J. Pöschel (2001) A lecture on the classical KAM theorem. In Smooth Ergodic Theory and Its Applications, A. Katok, R. de la Llave, Y. Pesin, and H. Weiss (Eds.), Proceedings of Symposia in Pure Mathematics, Vol. 69, pp. 707–732.
  • [8] D. Treschev and O. Zubelevich (2010) Introduction to the perturbation theory of Hamiltonian systems. Springer Monographs in Mathematics, Springer.
  • [9] S. Wiggins (2003) Introduction to applied nonlinear dynamical systems and chaos. Second edition, Texts in Applied Mathematics, Vol. 2, Springer.