The Legendre transform
1 Convexity
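Write $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. We define $\infty + \infty = \infty$, $-\infty - \infty = -\infty$, and $\infty - \infty$ is nonsense. If $a \in \mathbb{R}$, we define $a + \infty = \infty$ and $a - \infty = -\infty$.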
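If $X$ is a vector space and $f: X \to \overline{\mathbb{R}}$ is a function, we define the epigraph of $f$ to be the set
$$\operatorname{epi} f = \{(x, \alpha) \in X \times \mathbb{R} : f(x) \le \alpha\},$$
and if $\operatorname{epi} f$ is a convex subset of the vector space $X \times \mathbb{R}$, we say that $f$ is convex. We define the effective domain of a convex function $f$ to be
$$\operatorname{dom} f = \{x \in X : f(x) < \infty\}.$$
We say that a convex function $f$ is proper if $\operatorname{dom} f \neq \emptyset$ and $f$ does not take the value $-\infty$. If $C$ is a convex subset of $X$ and $f: C \to \overline{\mathbb{R}}$ is a function, we extend $f$ to $X$ by defining $f(x) = \infty$ for $x \in X \setminus C$.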
Lemma 1.
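If $X$ is a vector space, $C$ is a convex subset of $X$, and $f: C \to \mathbb{R}$ satisfies
$$f((1-t)x_1 + t x_2) \le (1-t) f(x_1) + t f(x_2), \qquad x_1, x_2 \in C,\ 0 \le t \le 1,$$
then $f$ is convex.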
Proof.
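Let $(x_1, \alpha_1), (x_2, \alpha_2) \in \operatorname{epi} f$ and $0 \le t \le 1$. The fact that the pairs $(x_i, \alpha_i)$ belong to $\operatorname{epi} f$ means in particular that $f(x_i) < \infty$, and hence that $x_i \in C$, as otherwise we would have $f(x_i) = \infty$. But
$$f((1-t)x_1 + t x_2) \le (1-t) f(x_1) + t f(x_2),$$
and, as $f(x_i) \le \alpha_i$,
$$(1-t) f(x_1) + t f(x_2) \le (1-t)\alpha_1 + t\alpha_2,$$
showing that $((1-t)x_1 + t x_2, (1-t)\alpha_1 + t\alpha_2) \in \operatorname{epi} f$, and hence that $f$ is convex. ∎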
2 Definition of the Legendre transform
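Let $X$ be a locally convex space and let $X^*$ be the dual space of $X$, i.e. the set of all continuous linear maps $X \to \mathbb{R}$. With the weak-* topology, $X^*$ is itself a locally convex space and $(X^*)^* = X$, with the isomorphism of locally convex spaces $x \mapsto (\lambda \mapsto \lambda x)$. If $f: X \to \overline{\mathbb{R}}$ is a function, the Legendre transform or convex conjugate of $f$ is the function $f^*: X^* \to \overline{\mathbb{R}}$ defined by
$$f^*(\lambda) = \sup_{x \in X} (\lambda x - f(x)), \qquad \lambda \in X^*.$$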
Just as the Fourier transform of a function from a locally compact abelian group to $\mathbb{C}$ is itself a function from the Pontryagin dual of the group to $\mathbb{C}$, the Legendre transform of a function from a locally convex space to $\overline{\mathbb{R}}$ is itself a function from the dual space to $\overline{\mathbb{R}}$.
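The definition can be tried out numerically. The following is a minimal sketch, assuming $X = \mathbb{R}$ (so a continuous linear functional is multiplication by a real number $y$) and replacing the supremum by a maximum over a finite grid; the helper name `legendre_transform` and the grid are illustrative choices, not part of the text above. It uses the standard fact that $f(x) = x^2/2$ is its own conjugate.

```python
def legendre_transform(f, xs, ys):
    """Approximate f*(y) = sup_x (y*x - f(x)) by a maximum over the grid xs."""
    return [max(y * x - f(x) for x in xs) for y in ys]

# Grid on [-5, 5] with spacing 0.01 (an assumption of this sketch).
xs = [i / 100 for i in range(-500, 501)]
ys = [-2.0, -1.0, 0.0, 1.0, 2.0]

def f(x):
    """f(x) = x^2 / 2, whose conjugate is y^2 / 2."""
    return 0.5 * x * x

fstar = legendre_transform(f, xs, ys)

for y, value in zip(ys, fstar):
    print(f"f*({y:+.1f}) ~ {value:.6f}   closed form: {0.5 * y * y:.6f}")
```

The grid maximum can only underestimate the true supremum, so a finer or wider grid may be needed for functions whose maximizer lies far from the origin.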
Theorem 2.
If $X$ is a locally convex space and $f: X \to \overline{\mathbb{R}}$ is convex, then its Legendre transform $f^*: X^* \to \overline{\mathbb{R}}$ is convex.
Proof.
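Let $(\lambda_1, \alpha_1), (\lambda_2, \alpha_2) \in \operatorname{epi} f^*$, and let $0 \le t \le 1$. We have
$$\begin{aligned}
f^*((1-t)\lambda_1 + t\lambda_2) &= \sup_{x \in X} \big( (1-t)(\lambda_1 x - f(x)) + t(\lambda_2 x - f(x)) \big) \\
&\le (1-t) f^*(\lambda_1) + t f^*(\lambda_2) \\
&\le (1-t)\alpha_1 + t\alpha_2,
\end{aligned}$$
which means that $((1-t)\lambda_1 + t\lambda_2, (1-t)\alpha_1 + t\alpha_2) \in \operatorname{epi} f^*$, and hence that $f^*$ is convex. ∎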
3 Lower semicontinuity
If $X$ is a topological vector space and $f: X \to \overline{\mathbb{R}}$ is a function, we say that $f$ is lower semicontinuous if $\operatorname{epi} f$ is a closed subset of $X \times \mathbb{R}$.
Theorem 2 shows that the Legendre transform of a convex function is itself convex. The following lemma states that if a proper convex function is lower semicontinuous, then its Legendre transform is proper; one proves the lemma using the Hahn-Banach separation theorem (Viorel Barbu and Teodor Precupanu, Convexity and Optimization in Banach Spaces, fourth ed., p. 78, Corollary 2.21). We use this lemma in the proof of the theorem that comes after.
Lemma 3.
If $X$ is a locally convex space and $f: X \to \overline{\mathbb{R}}$ is a lower semicontinuous proper convex function, then its Legendre transform $f^*$ is proper.
Theorem 4.
If $X$ is a locally convex space and $f: X \to \overline{\mathbb{R}}$ is a lower semicontinuous proper convex function, then $f^{**} = f$.
Proof.
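For any $\lambda \in X^*$ we have $f^*(\lambda) = \sup_{x \in X} (\lambda x - f(x))$, and hence for any $x \in X$ we have $f^*(\lambda) \ge \lambda x - f(x)$. Thus, for any $x \in X$ and $\lambda \in X^*$ we have
$$\lambda x - f^*(\lambda) \le f(x).$$
Using this, for any $x \in X$ we have
$$f^{**}(x) = \sup_{\lambda \in X^*} (\lambda x - f^*(\lambda)) \le f(x).$$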
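Suppose by contradiction that there were some $x_0 \in X$ for which $f^{**}(x_0) < f(x_0)$. First, by Lemma 3 we have that $f^*$ is proper, so in particular $\operatorname{dom} f^* \neq \emptyset$, and this tells us that $f^{**}$ does not take the value $-\infty$. Hence $-\infty < f^{**}(x_0) < f(x_0)$, which tells us that $(x_0, f^{**}(x_0)) \notin \operatorname{epi} f$. Therefore, $\operatorname{epi} f$ and the singleton $\{(x_0, f^{**}(x_0))\}$ are disjoint closed convex sets ($\operatorname{epi} f$ is closed because $f$ is lower semicontinuous), and so we can apply the Hahn-Banach separation theorem: there is some $\Lambda \in (X \times \mathbb{R})^*$ and some $\gamma \in \mathbb{R}$ for which
$$\Lambda(x, \alpha) < \gamma < \Lambda(x_0, f^{**}(x_0)), \qquad (x, \alpha) \in \operatorname{epi} f.$$
As $(X \times \mathbb{R})^* = X^* \times \mathbb{R}$, there is some $\lambda \in X^*$ and some $\beta \in \mathbb{R}$ for which $\Lambda(x, \alpha) = \lambda x + \beta \alpha$, and so
$$\lambda x + \beta \alpha < \gamma < \lambda x_0 + \beta f^{**}(x_0), \qquad (x, \alpha) \in \operatorname{epi} f. \tag{1}$$
If $\beta > 0$ then we get a contradiction because for a fixed $x \in \operatorname{dom} f$ there are arbitrarily large positive $\alpha$ for which $(x, \alpha) \in \operatorname{epi} f$. Hence, $\beta \le 0$. Assume by contradiction that $\beta = 0$, and hence that
$$\lambda x < \gamma < \lambda x_0, \qquad x \in \operatorname{dom} f. \tag{2}$$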
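If $x_0 \in \operatorname{dom} f$ then we get $\lambda x_0 < \lambda x_0$, a contradiction. If $x_0 \notin \operatorname{dom} f$, we shall still obtain a contradiction. Let $\mu \in \operatorname{dom} f^*$. For any $t > 0$,
$$f^*(\mu + t\lambda) = \sup_{x \in \operatorname{dom} f} (\mu x - f(x) + t \lambda x) \le f^*(\mu) + t \sup_{x \in \operatorname{dom} f} \lambda x.$$
Therefore,
$$f^{**}(x_0) \ge (\mu + t\lambda) x_0 - f^*(\mu + t\lambda) \ge \mu x_0 - f^*(\mu) + t\Big(\lambda x_0 - \sup_{x \in \operatorname{dom} f} \lambda x\Big).$$
But by (2) we have $\lambda x_0 - \sup_{x \in \operatorname{dom} f} \lambda x > 0$, and therefore the right-hand side of
$$f^{**}(x_0) \ge \mu x_0 - f^*(\mu) + t\Big(\lambda x_0 - \sup_{x \in \operatorname{dom} f} \lambda x\Big)$$
can be an arbitrarily large positive number (as $t \to \infty$), contradicting that $f^{**}(x_0) < \infty$. Therefore, $\beta < 0$, and dividing (1) by $-\beta$ then yields
$$-\frac{1}{\beta} \lambda x - \alpha < -\frac{\gamma}{\beta} < -\frac{1}{\beta}\lambda x_0 - f^{**}(x_0), \qquad (x, \alpha) \in \operatorname{epi} f.$$
Hence, taking $\alpha = f(x)$ for $x \in \operatorname{dom} f$,
$$f^*\Big(-\frac{\lambda}{\beta}\Big) = \sup_{x \in \operatorname{dom} f} \Big(-\frac{1}{\beta}\lambda x - f(x)\Big) \le -\frac{\gamma}{\beta} < -\frac{1}{\beta} \lambda x_0 - f^{**}(x_0),$$
which contradicts
$$f^{**}(x_0) \ge -\frac{1}{\beta}\lambda x_0 - f^*\Big(-\frac{\lambda}{\beta}\Big).$$
Therefore, there is no $x_0 \in X$ for which $f^{**}(x_0) < f(x_0)$, i.e., for all $x \in X$ we have
$$f^{**}(x) = f(x).$$
∎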
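The identity $f^{**} = f$ can be checked numerically in one dimension. The sketch below assumes $X = \mathbb{R}$, approximates both suprema by maxima over finite grids, and uses the illustrative function $f(x) = |x| + x^2/2$, which is proper, convex and continuous; the function, the grids and the helper `conjugate` are choices made here, not taken from the text.

```python
def conjugate(values, grid, duals):
    """Discrete conjugate: for each d in duals, max over grid of d*g - value(g)."""
    return [max(d * g - v for g, v in zip(grid, values)) for d in duals]

def f(x):
    """A proper, convex, continuous function on R (illustrative choice)."""
    return abs(x) + 0.5 * x * x

xs = [i / 50 for i in range(-300, 301)]   # x-grid on [-6, 6], spacing 0.02
ys = [i / 50 for i in range(-200, 201)]   # y-grid on [-4, 4], spacing 0.02
test_points = [-2.0, -0.5, 0.0, 1.0, 2.0]

f_vals = [f(x) for x in xs]
fstar_vals = conjugate(f_vals, xs, ys)               # f* sampled on ys
fstarstar = conjugate(fstar_vals, ys, test_points)   # f** at the test points

errors = [abs(a - f(x)) for a, x in zip(fstarstar, test_points)]
print("max |f** - f| at test points:", max(errors))
```

The grids are chosen wide enough that the maximizers for these test points lie inside them; with a narrower grid the discrete biconjugate would undershoot $f$.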
4 Example in $\mathbb{R}^n$
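Let $f: \mathbb{R}^n \to \mathbb{R}$ be a function of the following form: let $A$ be an $n \times n$ symmetric positive definite matrix, and define $f$ by
$$f(x) = \frac{1}{2} \langle Ax, x \rangle, \qquad x \in \mathbb{R}^n.$$
Fix any $\lambda \in (\mathbb{R}^n)^*$, let $y \in \mathbb{R}^n$ be the vector with $\lambda x = \langle x, y \rangle$ for all $x$, and write $g(x) = \langle x, y \rangle - f(x)$, for which $f^*(\lambda) = \sup_{x} g(x)$. The Legendre transform of $f$ is $f^*: (\mathbb{R}^n)^* \to \overline{\mathbb{R}}$, defined by
$$f^*(\lambda) = \sup_{x \in \mathbb{R}^n} (\lambda x - f(x)).$$
Because $A$ is symmetric, for any $x \in \mathbb{R}^n$ we obtain
$$(\nabla g)(x) = y - Ax.$$
Hence, $(\nabla g)(x) = 0$ is equivalent to
$$x = A^{-1} y,$$
and with this,
$$g(A^{-1} y) = \langle A^{-1} y, y \rangle - \frac{1}{2} \langle y, A^{-1} y \rangle = \frac{1}{2} \langle A^{-1} y, y \rangle,$$
and therefore, since $g$ is concave and so attains its supremum at this critical point,
$$f^*(\lambda) = \frac{1}{2} \langle A^{-1} y, y \rangle.$$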
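For a quadratic form $f(x) = \frac{1}{2}\langle Ax, x\rangle$ with $A$ symmetric positive definite, the standard closed form of the conjugate is $\frac{1}{2}\langle A^{-1}y, y\rangle$, where $y$ represents the functional. The following sketch checks this in $\mathbb{R}^2$; the particular matrix, the vector and the random probing are assumptions made for illustration only.

```python
import random

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (illustrative)
y = [1.0, -2.0]                # vector representing the linear functional

def f(x):
    """f(x) = (1/2) <A x, x>."""
    ax0 = A[0][0] * x[0] + A[0][1] * x[1]
    ax1 = A[1][0] * x[0] + A[1][1] * x[1]
    return 0.5 * (ax0 * x[0] + ax1 * x[1])

def g(x):
    """g(x) = <x, y> - f(x), the quantity maximized in the conjugate."""
    return x[0] * y[0] + x[1] * y[1] - f(x)

# Solve A x = y by Cramer's rule; this x is the critical point of g.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x_star = [(A[1][1] * y[0] - A[0][1] * y[1]) / det,
          (A[0][0] * y[1] - A[1][0] * y[0]) / det]

closed_form = 0.5 * (x_star[0] * y[0] + x_star[1] * y[1])   # (1/2) <A^{-1} y, y>
value_at_critical_point = g(x_star)

# Random probes should never exceed the closed form, since it is the supremum.
random.seed(0)
probes = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(10_000)]
best_probe = max(g(p) for p in probes)

print(closed_form, value_at_critical_point, best_probe)
```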
5 Derivatives
Let $\Omega$ be a domain in $\mathbb{R}^n$ and let $f \in C^s(\Omega)$ for some $s \ge 2$. We define $\phi: \Omega \to \mathbb{R}^n$ by $\phi(x) = (\nabla f)(x)$; we have $\phi \in C^{s-1}(\Omega, \mathbb{R}^n)$. Following Giaquinta and Hildebrandt (Mariano Giaquinta and Stefan Hildebrandt, Calculus of Variations II, p. 6), we call $\phi$ a gradient mapping. The following theorem gives conditions under which $\phi$ is invertible (Giaquinta and Hildebrandt, Calculus of Variations II, p. 6, Lemma 1). (To be locally invertible means that for each point there is an open neighborhood such that the restriction of $\phi$ to that neighborhood is invertible.)
Theorem 5.
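If
$$\det (D\phi)(x) = \det (D^2 f)(x) \neq 0, \qquad x \in \Omega,$$
then $\phi$ is locally invertible on $\Omega$. If $\Omega$ is convex and for all $x \in \Omega$ the matrix $(D^2 f)(x)$ is positive definite, then $\phi: \Omega \to \phi(\Omega)$ is a $C^{s-1}$ diffeomorphism.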
Proof.
Because $\phi \in C^{s-1}(\Omega, \mathbb{R}^n)$ and $\det (D\phi)(x) \neq 0$ for all $x \in \Omega$, by the inverse function theorem (Jerrold E. Marsden and Michael J. Hoffman, Elementary Classical Analysis, second ed., p. 393, Theorem 7.1.1) we have that $\phi$ is a local diffeomorphism.
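Suppose that $\phi(x_1) = \phi(x_2)$ for some distinct $x_1, x_2 \in \Omega$. Put $h = x_2 - x_1 \neq 0$. Because $x_1, x_2 \in \Omega$ and $\Omega$ is convex, for any $0 \le t \le 1$ we have $x_1 + th \in \Omega$. Now define
$$A(t) = (D\phi)(x_1 + th) = (D^2 f)(x_1 + th), \qquad 0 \le t \le 1;$$
because $f \in C^s(\Omega)$ with $s \ge 2$, we have that $A$ is continuous. Because
$$\frac{d}{dt} \phi(x_1 + th) = A(t) h,$$
we have
$$\langle h, \phi(x_2) - \phi(x_1) \rangle = \Big\langle h, \int_0^1 A(t) h \, dt \Big\rangle = \int_0^1 \langle h, A(t) h \rangle \, dt.$$
For each $t \in [0,1]$ we have that $A(t)$ is a positive definite matrix, and because $h \neq 0$, this gives us that $\langle h, A(t) h \rangle > 0$. Moreover, $t \mapsto \langle h, A(t) h \rangle$ is continuous, so it follows that
$$\langle h, \phi(x_2) - \phi(x_1) \rangle = \int_0^1 \langle h, A(t) h \rangle \, dt > 0.$$
But $\phi(x_2) - \phi(x_1) = 0$, a contradiction. Therefore, $\phi$ is one-to-one. It is a fact that a local diffeomorphism that is one-to-one is a diffeomorphism onto its image, thus $\phi$ is a diffeomorphism. ∎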
Suppose that the gradient mapping $\phi: \Omega \to \phi(\Omega)$ is a diffeomorphism. We write $\psi = \phi^{-1}$, so $\psi: \phi(\Omega) \to \Omega$ is a diffeomorphism. The following theorem gives an explicit expression for the Legendre transform of certain functions (Mariano Giaquinta and Stefan Hildebrandt, Calculus of Variations II, p. 9).
Theorem 6.
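If $\Omega$ is a convex domain in $\mathbb{R}^n$, $f \in C^2(\Omega)$, and for all $x \in \Omega$ the matrix $(D^2 f)(x)$ is positive definite, then
$$f^*(\xi) = \langle \xi, \psi(\xi) \rangle - f(\psi(\xi)), \qquad \xi \in \phi(\Omega).$$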
Proof.
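Fix $\xi \in \phi(\Omega)$ and define $g: \Omega \to \mathbb{R}$ by
$$g(x) = \langle \xi, x \rangle - f(x).$$
We have $g \in C^2(\Omega)$, and we have $(\nabla g)(x) = \xi - \phi(x)$ and $(D^2 g)(x) = -(D^2 f)(x)$. Thus, for each $x \in \Omega$, the matrix $(D^2 g)(x)$ is negative definite. It follows that if there is a point $x_0 \in \Omega$ at which $(\nabla g)(x_0) = 0$, then for all other $x \in \Omega$ we have $g(x) < g(x_0)$. To have $(\nabla g)(x) = 0$ is equivalent to $\phi(x) = \xi$, and because $\phi: \Omega \to \phi(\Omega)$ is a bijection, there is indeed a unique $x_0 = \psi(\xi)$ for which $(\nabla g)(x_0) = 0$. Therefore,
$$f^*(\xi) = \sup_{x \in \Omega} g(x) = g(\psi(\xi)) = \langle \xi, \psi(\xi) \rangle - f(\psi(\xi)).$$
∎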
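The expression $f^*(\xi) = \langle \xi, \psi(\xi)\rangle - f(\psi(\xi))$, with $\psi$ the inverse of the gradient mapping, can be compared against a brute-force supremum. The sketch below assumes the one-dimensional example $f(x) = e^x$ on $\Omega = (-3, 3)$, for which $\phi(x) = e^x$ and $\psi(\xi) = \log \xi$; the example, the grid size and the function names are illustrative choices, not taken from the text.

```python
import math

f = math.exp      # f(x) = e^x, with f''(x) = e^x > 0 on Omega = (-3, 3)
psi = math.log    # inverse of the gradient mapping phi(x) = e^x

def fstar_formula(xi):
    """xi * psi(xi) - f(psi(xi)), valid for xi in phi(Omega) = (e^-3, e^3)."""
    x = psi(xi)
    return xi * x - f(x)

def fstar_bruteforce(xi, n=60_000):
    """Maximum of xi*x - f(x) over the midpoints of a uniform grid in (-3, 3)."""
    xs = (-3 + 6 * (k + 0.5) / n for k in range(n))
    return max(xi * x - f(x) for x in xs)

xis = [0.1, 0.5, 1.0, 2.0, 10.0]   # all lie in (e^-3, e^3)
gaps = [fstar_formula(xi) - fstar_bruteforce(xi) for xi in xis]

for xi, gap in zip(xis, gaps):
    print(f"xi = {xi:5.1f}   formula - grid maximum = {gap:.2e}")
```

The grid maximum never exceeds the true supremum, so the printed gaps are nonnegative up to rounding and shrink as the grid is refined.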
Using the above expression for the Legendre transform of a function with positive definite Hessian, we show in the following theorem that the Legendre transform of a $C^s$ function with positive definite Hessian is itself a $C^s$ function (Mariano Giaquinta and Stefan Hildebrandt, Calculus of Variations II, p. 7, Lemma 2).
Theorem 7.
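If $\Omega$ is a convex domain in $\mathbb{R}^n$, $f \in C^s(\Omega)$ with $s \ge 2$, and for all $x \in \Omega$ the matrix $(D^2 f)(x)$ is positive definite, then $f^* \in C^s(\phi(\Omega))$ and $\nabla f^* = \psi$.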
Proof.
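For all $\xi \in \phi(\Omega)$, because $f^*(\xi) = \langle \xi, \psi(\xi) \rangle - f(\psi(\xi))$ we have $(\nabla f^*)(\xi) = \psi(\xi) + (D\psi)(\xi)^T \xi - (D\psi)(\xi)^T (\nabla f)(\psi(\xi))$, i.e. $(\nabla f^*)(\xi) = \psi(\xi) + (D\psi)(\xi)^T (\xi - \phi(\psi(\xi)))$, so
$$(\nabla f^*)(\xi) = \psi(\xi).$$
Since $\psi \in C^{s-1}$, this shows that $f^* \in C^s$.
For all $x \in \Omega$, because $\psi(\phi(x)) = x$, we have $(D\psi)(\phi(x)) \, (D\phi)(x) = I$, i.e. $(D^2 f^*)(\phi(x)) \, (D^2 f)(x) = I$, so
$$(D^2 f^*)(\phi(x)) = \big((D^2 f)(x)\big)^{-1}.$$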
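As a one-dimensional illustration (an example chosen here, not taken from the source), take $\Omega = (0, \infty)$ and $f(x) = x^p/p$ with $p > 1$, and let $q$ be the conjugate exponent, $1/p + 1/q = 1$. Then $f''(x) = (p-1)x^{p-2} > 0$ on $\Omega$. With $\phi = f'$ the gradient mapping and $\psi = \phi^{-1}$, the formula $f^*(\xi) = \xi \psi(\xi) - f(\psi(\xi))$ and the derivatives of $f^*$ can be worked out by hand:

```latex
\begin{align*}
\phi(x) &= f'(x) = x^{p-1}, \qquad \phi(\Omega) = (0, \infty), \\
\psi(\xi) &= \xi^{1/(p-1)} = \xi^{q-1}, \\
f^*(\xi) &= \xi\,\psi(\xi) - f(\psi(\xi))
          = \xi^{q} - \frac{\xi^{(q-1)p}}{p}
          = \Big(1 - \frac{1}{p}\Big)\xi^{q}
          = \frac{\xi^{q}}{q}, \\
(f^*)'(\xi) &= \xi^{q-1} = \psi(\xi), \\
(f^*)''(\xi) &= (q-1)\,\xi^{q-2} = \frac{1}{f''(\psi(\xi))}.
\end{align*}
```

Here the identities $1/(p-1) = q-1$ and $(q-1)p = q$ follow from $1/p + 1/q = 1$. The derivative of $f^*$ is the inverse gradient mapping, and the second derivative of $f^*$ is the reciprocal of that of $f$ at the corresponding point.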
6 Example in $\mathbb{R}^2$
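Suppose that $\Omega$ is a convex domain in $\mathbb{R}^2$, that $f \in C^2(\Omega)$, and that for all $(x,y) \in \Omega$, the matrix $(D^2 f)(x,y)$ is positive definite. Write $(\xi, \eta) = \phi(x,y) = (f_x(x,y), f_y(x,y))$; $(x,y) = \psi(\xi, \eta)$ for all $(\xi, \eta) \in \phi(\Omega)$. Because $(D^2 f^*)(\xi, \eta) \, (D^2 f)(x,y) = I$ for all $(x,y) \in \Omega$, we have
$$\begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} = \det(D^2 f) \begin{pmatrix} f^*_{\eta\eta} & -f^*_{\xi\eta} \\ -f^*_{\xi\eta} & f^*_{\xi\xi} \end{pmatrix},$$
where the derivatives of $f$ are evaluated at $(x,y)$ and those of $f^*$ at $(\xi, \eta)$.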
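Giaquinta and Hildebrandt (Mariano Giaquinta and Stefan Hildebrandt, Calculus of Variations II, p. 14) give the following consequence of what we have just written out. If $f$ satisfies the above conditions and satisfies the equation
$$(1 + f_y^2) f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2) f_{yy} = 2H (1 + f_x^2 + f_y^2)^{3/2}$$
on $\Omega$, where $H$ is some constant, then
$$(1 + \eta^2) f_{xx}(x,y) - 2 \xi \eta f_{xy}(x,y) + (1 + \xi^2) f_{yy}(x,y) = 2H (1 + \xi^2 + \eta^2)^{3/2}$$
for all $(x,y) \in \Omega$. Therefore,
$$\det (D^2 f)(x,y) \Big( (1 + \xi^2) f^*_{\xi\xi}(\xi,\eta) + 2 \xi \eta f^*_{\xi\eta}(\xi,\eta) + (1 + \eta^2) f^*_{\eta\eta}(\xi,\eta) \Big) = 2H (1 + \xi^2 + \eta^2)^{3/2}$$
for all $(x,y) \in \Omega$, $(\xi, \eta) = \phi(x,y)$. In the case where $H = 0$, then, dividing by $\det (D^2 f)(x,y)$, which is $> 0$, we obtain
$$(1 + \xi^2) f^*_{\xi\xi} + 2 \xi \eta f^*_{\xi\eta} + (1 + \eta^2) f^*_{\eta\eta} = 0.$$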
In the case $H = 0$, the equation satisfied by $f$ is called the minimal surface equation, and we have thus found a partial differential equation satisfied by the Legendre transform $f^*$ of a solution $f$ of the minimal surface equation that satisfies the conditions we imposed at the start of the example. Writing the equation satisfied by $f^*$ in the form
$$a f^*_{\xi\xi} + 2b f^*_{\xi\eta} + c f^*_{\eta\eta} = 0,$$
we have $a = 1 + \xi^2$, $b = \xi\eta$, $c = 1 + \eta^2$, with which
$$ac - b^2 = 1 + \xi^2 + \eta^2 > 0,$$
which means that the partial differential equation satisfied by $f^*$ is elliptic.
7 Lagrangians and Hamiltonians
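Theorem 6 states that if $\Omega$ is a convex domain in $\mathbb{R}^n$, $f \in C^2(\Omega)$, and for all $x \in \Omega$ the matrix $(D^2 f)(x)$ is positive definite, then
$$f^*(\xi) = \langle \xi, \psi(\xi) \rangle - f(\psi(\xi)), \qquad \xi \in \phi(\Omega).$$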
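Suppose that $L = L(t, q, v)$ is a $C^2$ function such that for each $t$ and $q$, the function $v \mapsto L(t,q,v)$ satisfies the above conditions. Fix $t$ and $q$. With $f(v) = L(t,q,v)$ and $H(t,q,p) = f^*(p)$, $p = \phi(v) = L_v(t,q,v)$, we have
$$H(t,q,p) = \langle p, \psi(p) \rangle - L(t,q,\psi(p)),$$
or with $v = \psi(p)$,
$$H(t,q,p) = \langle p, v \rangle - L(t,q,v).$$
We have
$$H_t(t,q,p) = -L_t(t,q,v)$$
and
$$H_q(t,q,p) = -L_q(t,q,v)$$
and
$$H_p(t,q,p) = v.$$
For a path $q(t)$ to satisfy the Euler-Lagrange equation means that
$$\frac{d}{dt} L_v(t, q(t), \dot{q}(t)) = L_q(t, q(t), \dot{q}(t)).$$
With $p(t) = L_v(t, q(t), \dot{q}(t))$, this yields
$$\dot{p}(t) = L_q(t, q(t), \dot{q}(t)) = -H_q(t, q(t), p(t)),$$
and hence
$$\dot{q} = H_p, \qquad \dot{p} = -H_q.$$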
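The passage from a Lagrangian to a Hamiltonian can be illustrated with a harmonic oscillator. The sketch below assumes the one-dimensional, time-independent Lagrangian $L(q, v) = \frac{1}{2} m v^2 - \frac{1}{2} k q^2$, for which $p = L_v = mv$ and so $v = p/m$; the constants, function names and grid are illustrative choices, not from the text. It compares $\langle p, v\rangle - L$ with the familiar kinetic plus potential energy and with a brute-force supremum over velocities.

```python
m, k = 2.0, 3.0   # illustrative mass and spring constant (assumptions)

def lagrangian(q, v):
    """L(q, v) = (1/2) m v^2 - (1/2) k q^2, strictly convex in v since m > 0."""
    return 0.5 * m * v * v - 0.5 * k * q * q

def hamiltonian(q, p):
    """H(q, p) = p*v - L(q, v) with v = p/m, the inverse of p = L_v = m*v."""
    v = p / m
    return p * v - lagrangian(q, v)

def hamiltonian_bruteforce(q, p, n=20_001, vmax=10.0):
    """Maximum over a velocity grid of p*v - L(q, v), straight from the definition."""
    vs = (-vmax + 2 * vmax * i / (n - 1) for i in range(n))
    return max(p * v - lagrangian(q, v) for v in vs)

q, p = 0.7, 1.5
h_formula = hamiltonian(q, p)
h_expected = p * p / (2 * m) + 0.5 * k * q * q   # kinetic plus potential energy
h_grid = hamiltonian_bruteforce(q, p)

print(h_formula, h_expected, h_grid)
```

For this Lagrangian, Hamilton's equations read $\dot{q} = p/m$ and $\dot{p} = -kq$, which together are equivalent to the Euler-Lagrange equation $m\ddot{q} = -kq$.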
8 Physical units
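First, if a Lagrangian $L(t,q,v)$, $v = \dot{q}$, has units $[L]$, then the Hamiltonian $H(t,q,p)$, $p = L_v(t,q,v)$, has the same units $[H] = [L]$, and it follows that $\langle p, v \rangle$ has units $[L]$. Second, $[v] = [q]/[t]$, and so $[p][q]/[t] = [L]$. Therefore, $[p][q] = [L][t]$. If we take $[L]$ to be energy, then this implies that $[p][q]$ is energy times time, i.e. the units of action.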
9 More books
V. I. Arnold, Mathematical Methods of Classical Mechanics, second ed., p. 61, §14; Ralph Abraham and Jerrold E. Marsden, Foundations of Mechanics, second ed., p. 218, §3.6; Jerrold E. Marsden and Tudor S. Ratiu, Introduction to Mechanics and Symmetry, second ed., p. 183, §7.2; Jürgen Jost and Xianqing Li-Jost, Calculus of Variations, chapter 4; David Yang Gao, Duality Principles in Nonconvex Systems: Theory, Methods and Applications.
10 History
As best I can tell, the thing we call the Legendre transform is named after Legendre because of the following paper: Adrien-Marie Legendre, Mémoire sur l’intégration de quelques Équations aux différences partielles, Histoire de l’Académie royale des sciences (1787), 309–351. The following is a partial bibliography of works that refer to this paper of Legendre’s. I am not aware of any historical summary of the Legendre transform in the literature, and the following is presented as an aid to the preparation of one. To properly tell the story of the Legendre transform, one would be well served by carefully digging through sources and attentively reading Legendre’s original paper, and also by making oneself comfortable with how it appears in convex analysis, minimal surfaces, contact geometry, thermodynamics, etc. Such a comprehensive history would require meticulously scanning through Legendre’s monumental Traité on elliptic integrals in case relevant material is included there. The best biography of Legendre that exists is the one by Itard in the Dictionary of Scientific Biography, who mentions that something relevant to the Legendre transform appears in volume II of the 1826 Traité, concerning arc lengths. One should also scan through the work of Lagrange, including his 1788 Méchanique analitique, and the work of Euler on the calculus of variations.
Correspondance de Leonhard Euler avec A. C. Clairaut, J. d’Alembert et J. L. Lagrange, pp. 440–441, Note 6; S. S. Demidov, The study of partial differential equations of the first order in the 18th and 19th centuries, Arch. Hist. Exact Sci. 26 (1982), no. 4, 325–350; Erwin Kreyszig, On the Theory of Minimal Surfaces, The Problem of Plateau (Themistocles M. Rassias, ed.), 1992, 138–164, p. 145; Julian Lowell Coolidge, A History of Geometrical Methods, p. 377; Alfred Enneper, Bemerkungen über einige Flächen von constantem Krümmungsmaaß, Nachrichten von der Königl. Gesellschaft der Wissenschaften und der Georg-Augusts-Universität zu Göttingen (1876), 597–619, p. 614; Alfred Enneper, Ueber Flächen mit besonderen Meridiancurven, Abhandlungen der Königlichen Gesellschaft der Wissenschaften in Göttingen 29 (1882), 3–87, p. 6; Gaston Darboux, Leçons sur la théorie générale des surfaces, vol. 1, p. 271, §177; Édouard Goursat, Leçons sur l’intégration des équations aux dérivées partielles du second ordre, à deux variables indépendantes, tome 2, p. 32, chapter V, §113; René Taton, L’œuvre scientifique de Monge, p. 262; Karin Reich, Die Geschichte der Differentialgeometrie von Gauß bis Riemann (1828–1868), Arch. Hist. Exact Sci. 11 (1973), no. 4, 273–376, p. 315; Ivor Grattan-Guinness, Convolutions in French Mathematics, 1800–1840, vol. I, p. 152; Morris Kline, Mathematical Thought From Ancient to Modern Times, chapter 22; João Caramalho Domingues, Lacroix and the Calculus, p. 223; Ernst Hairer, Syvert Paul Nørsett and Gerhard Wanner, Solving Ordinary Differential Equations I: Nonstiff Problems, p. 32; Paul Mansion, Théorie des équations aux dérivées partielles du premier ordre, p. 76; A. R. Forsyth, A Treatise on Differential Equations, sixth ed., pp. 418, 476; Lagrange, Méchanique analitique (1788), tome 1, partie 2, §IV; Ernesto Pascal, Die Variationsrechnung, p. 125; Bernhard Riemann, Ueber die Fläche vom kleinsten Inhalt bei gegebener Begrenzung; Courant and Hilbert, vol. II; Cornelius Lanczos, The Variational Principles of Mechanics, fourth ed., §VI.1; Ed. Combescure, Remarques sur un Mémoire de Legendre, Comptes rendus hebdomadaires des séances de l’Académie des sciences 74 (1872), 798–802; Johannes C. C. Nitsche, Vorlesungen über Minimalflächen, p. 147; A. W. Conway and J. L. Synge (ed.), The Mathematical Papers of Sir William Rowan Hamilton, vol. I, (1931), p. 474.