The singular value decomposition of compact operators on Hilbert spaces

Jordan Bell
April 3, 2014

1 Preliminaries

The purpose of these notes is to present material about compact operators on Hilbert spaces that is special to Hilbert spaces, rather than what applies to all Banach spaces. We use statements about compact operators on Banach spaces without proof. For instance, any compact operator from one Banach space to another has separable image; the set of compact operators from one Banach space to another is a closed subspace of the set of all bounded linear operators; a Banach space is reflexive if and only if the closed unit ball is weakly compact (Kakutani’s theorem); etc. We do however state precisely each result that we are using for Banach spaces and show that its hypotheses are satisfied.

Let $\mathbb{N}$ be the set of positive integers. We say that a set is countable if it is bijective with a subset of $\mathbb{N}$. In this note I do not presume unless I say so that any set is countable or that any Hilbert space is separable. A neighborhood of a point in a topological space is a set that contains an open set that contains the point; one reason why it can be handy to speak about neighborhoods of a point rather than just open sets that contain the point is that the set of all neighborhoods of a point is a filter, whereas it is unlikely that the set of all open sets that contain a point is a filter. If $z \in \mathbb{C}$, we denote $z^* = \overline{z}$.

2 Bounded linear operators

An advantage of working with normed spaces rather than merely topological vector spaces is that continuous linear maps between normed spaces have a simple characterization. If $X$ and $Y$ are normed spaces and $T : X \to Y$ is linear, the operator norm of $T$ is

\[ \|T\| = \sup_{\|x\| \le 1} \|Tx\|. \]

If $\|T\| < \infty$, then we say that $T$ is bounded.
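As an illustration of this definition (not part of the original notes), one can compute the operator norm of a diagonal operator on $\ell^2$. The bounded sequence $a_n$ and the standard unit vectors $\delta_m$ are assumptions introduced only for this sketch.

```latex
% Diagonal operator on \ell^2: (Tx)_n = a_n x_n, with \sup_n |a_n| < \infty.
\[
\|Tx\|^2 = \sum_{n} |a_n|^2 |x_n|^2
         \le \Big( \sup_n |a_n| \Big)^2 \|x\|^2 ,
\qquad
\|T\delta_m\| = |a_m| .
\]
% The first relation gives \|T\| \le \sup_n |a_n|, the second gives \|T\| \ge |a_m| for every m, so
\[
\|T\| = \sup_n |a_n| .
\]
```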

Theorem 1.

If $X$ and $Y$ are normed spaces, a linear map $T : X \to Y$ is continuous if and only if it is bounded.

Proof.

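Suppose that $T$ is continuous. In particular $T$ is continuous at $0$, so there is some $\delta > 0$ such that if $\|x\| \le \delta$ then $\|Tx\| = \|Tx - T0\| \le 1$. If $x \ne 0$ then, as $T$ is linear,

\[ \|Tx\| = \frac{\|x\|}{\delta} \left\| T\left( \frac{\delta}{\|x\|} x \right) \right\| \le \frac{1}{\delta} \|x\|. \]

Thus $\|T\| \le \frac{1}{\delta} < \infty$, so $T$ is bounded.

Suppose that $T$ is bounded. If $\|T\| = 0$ then $T = 0$, which is continuous, so we may take $\|T\| > 0$. Let $x_0 \in X$, and let $\epsilon > 0$. If $\|x - x_0\| \le \frac{\epsilon}{\|T\|}$, then

\[ \|Tx - Tx_0\| = \|T(x - x_0)\| \le \|T\| \|x - x_0\| \le \|T\| \frac{\epsilon}{\|T\|} = \epsilon. \]

Hence $T$ is continuous at $x_0$, and so $T$ is continuous. ∎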

If $X$ and $Y$ are normed spaces, we denote by $\mathscr{B}(X,Y)$ the set of bounded linear maps $X \to Y$. It is straightforward to check that $\mathscr{B}(X,Y)$ is a normed space with the operator norm. One proves that if $Y$ is a Banach space, then $\mathscr{B}(X,Y)$ is a Banach space (Walter Rudin, Functional Analysis, second ed., p. 92, Theorem 4.1), and if $X$ is a Banach space one then checks that $\mathscr{B}(X) = \mathscr{B}(X,X)$ is a Banach algebra. If $X$ is a normed space, we define $X^* = \mathscr{B}(X,\mathbb{C})$, which is a Banach space, called the dual space of $X$.

If $X$ and $Y$ are normed spaces and $T : X \to Y$ is linear, we say that $T$ has finite rank if

\[ \operatorname{rank} T = \dim T(X) \]

is finite. If $X$ is infinite dimensional and $Y \ne \{0\}$, let $\mathscr{E}$ be a Hamel basis for $X$, let $\{e_n : n \in \mathbb{N}\}$ be a countably infinite subset of $\mathscr{E}$, and let $y \in Y$ be nonzero. If we define $T : X \to Y$ by $Te_n = n \|e_n\| y$ and $Te = 0$ if $e \in \mathscr{E} \setminus \{e_n : n \in \mathbb{N}\}$, then $T$ is a linear map with finite rank yet $T$ is unbounded. Thus a finite rank linear map is not necessarily bounded. We denote by $\mathscr{B}_{00}(X,Y)$ the set of bounded finite rank linear maps $X \to Y$, and check that $\mathscr{B}_{00}(X,Y)$ is a vector space. If $X$ is a Banach space, one checks that $\mathscr{B}_{00}(X)$ is an ideal of the algebra $\mathscr{B}(X)$ (if we either pre- or postcompose a linear map with a finite rank linear map, the image will be finite dimensional).

If $X$ and $Y$ are Banach spaces, we say that $T : X \to Y$ is compact if the image of any bounded set under $T$ is precompact (has compact closure). One checks that if a linear map is compact then it is bounded (unlike a finite rank linear map, which is not necessarily bounded). There are several ways to state that a linear map is compact that one proves are equivalent: $T$ is compact if and only if the image of the closed unit ball is precompact; $T$ is compact if and only if the image of the open unit ball is precompact; $T$ is compact if and only if the image under it of any bounded sequence has a convergent subsequence. In a complete metric space, the Heine-Borel theorem asserts that a set is precompact if and only if it is totally bounded (for any $\epsilon > 0$, the set can be covered by a finite number of balls of radius $\epsilon$). We denote by $\mathscr{B}_0(X,Y)$ the set of compact linear maps $X \to Y$, and it is straightforward to check that this is a vector space. One proves that if an operator is in the closure of the compact operators then the image of the closed unit ball under it is totally bounded, and from this it follows that $\mathscr{B}_0(X,Y)$ is a closed subspace of $\mathscr{B}(X,Y)$. $\mathscr{B}_0(X)$ is an ideal of the algebra $\mathscr{B}(X)$: if $K \in \mathscr{B}_0(X)$ and $T \in \mathscr{B}(X)$, one checks that $TK \in \mathscr{B}_0(X)$ and $KT \in \mathscr{B}_0(X)$.

Let $X$ and $Y$ be Banach spaces. Using the fact that a bounded set in a finite dimensional normed vector space is precompact, we can prove that a bounded finite rank operator is compact: $\mathscr{B}_{00}(X,Y) \subseteq \mathscr{B}_0(X,Y)$. Also, it doesn’t take long to prove that the image of a compact operator is separable: if $T \in \mathscr{B}_0(X,Y)$ then $T(X)$ has a countable dense subset. (We can prove this using the fact that a compact metric space is separable.)

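If $H$ is a Hilbert space and $S_i$, $i \in I$, are subsets of $H$, we define $\bigvee_{i \in I} S_i$ to be the closure of the span of $\bigcup_{i \in I} S_i$. We say that $\mathscr{E}$ is an orthonormal basis for $H$ if $\langle e, f \rangle = \delta_{e,f}$ for all $e, f \in \mathscr{E}$ and $H = \bigvee \mathscr{E}$.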

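A sesquilinear form on $H$ is a function $f : H \times H \to \mathbb{C}$ that is linear in its first argument and that satisfies $f(x,y) = f(y,x)^*$. If $f$ is a sesquilinear form on $H$, we say that $f$ is bounded if

\[ \sup\{ |f(x,y)| : \|x\|, \|y\| \le 1 \} < \infty. \]

The Riesz representation theorem (Walter Rudin, Functional Analysis, second ed., p. 310, Theorem 12.8) states that if $f$ is a bounded sesquilinear form on $H$, then there is a unique $B \in \mathscr{B}(H)$ such that

\[ f(x,y) = \langle x, By \rangle, \qquad x, y \in H, \]

and $\|B\| = \sup\{ |f(x,y)| : \|x\|, \|y\| \le 1 \}$. It follows from the Riesz representation theorem that if $A \in \mathscr{B}(H)$, then there is a unique $A^* \in \mathscr{B}(H)$ such that

\[ \langle Ax, y \rangle = \langle x, A^* y \rangle, \qquad x, y \in H, \]

and $\|A^*\| = \|A\|$. $\mathscr{B}(H)$ is a $C^*$-algebra: if $A, B \in \mathscr{B}(H)$ and $\lambda \in \mathbb{C}$ then $A^{**} = A$, $(A+B)^* = A^* + B^*$, $(AB)^* = B^* A^*$, $(\lambda A)^* = \lambda^* A^*$, and $\|A^* A\| = \|A\|^2$.

We say that $A \in \mathscr{B}(H)$ is normal if $A^* A = A A^*$, and self-adjoint if $A^* = A$. One proves using the parallelogram law that $A \in \mathscr{B}(H)$ is self-adjoint if and only if $\langle Ax, x \rangle \in \mathbb{R}$ for all $x \in H$. If $A \in \mathscr{B}(H)$ is self-adjoint, we say that $A$ is positive if $\langle Ax, x \rangle \ge 0$ for all $x \in H$.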
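To make the adjoint and the notion of normality concrete, here is a short worked example that is not in the original notes: the unilateral shift $S$ on $\ell^2$. The operator $S$ and the choice of $\ell^2$ are assumptions made only for illustration; the inner product is taken linear in its first argument, as above.

```latex
% Unilateral shift on \ell^2 and its adjoint.
\[
S(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, \ldots), \qquad
S^*(y_1, y_2, y_3, \ldots) = (y_2, y_3, y_4, \ldots),
\]
% because
\[
\langle Sx, y \rangle = \sum_{n=1}^\infty x_n \overline{y_{n+1}} = \langle x, S^* y \rangle .
\]
% S is an isometry but not normal:
\[
S^* S x = x, \qquad S S^* x = (0, x_2, x_3, \ldots), \qquad\text{so } S^* S \ne S S^* .
\]
```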

3 Spectrum in Banach spaces

If $X$ and $Y$ are Banach spaces and $T \in \mathscr{B}(X,Y)$ is a bijection, then its inverse function $T^{-1} : Y \to X$ is linear, since the inverse of a linear bijection is itself linear. Because $T$ is a surjective bounded linear map, by the open mapping theorem it is an open map: if $U$ is an open subset of $X$ then $T(U)$ is an open subset of $Y$, and it follows that $T^{-1} \in \mathscr{B}(Y,X)$. That is, if a bounded linear operator from one Banach space to another is bijective then its inverse function is also a bounded linear operator.

If $X$ is a Banach space and $T \in \mathscr{B}(X)$, the spectrum $\sigma(T)$ of $T$ is the set of those $\lambda \in \mathbb{C}$ such that the map $T - \lambda \mathrm{id}_X : X \to X$ is not a bijection. One proves that $\sigma(T)$ is nonempty (the proof uses Liouville’s theorem, which states that a bounded entire function is constant). One also proves that if $\lambda \in \sigma(T)$ then $|\lambda| \le \|T\|$. We define the spectral radius of $T$ to be

\[ r(T) = \sup_{\lambda \in \sigma(T)} |\lambda|, \]

and so $r(T) \le \|T\|$. Because $\mathscr{B}_0(X)$ is an ideal in the algebra $\mathscr{B}(X)$, if $T \in \mathscr{B}_0(X)$ is invertible then $\mathrm{id}_X$ is compact. One checks that if $\mathrm{id}_X$ is compact then $X$ is finite dimensional (a locally compact topological vector space is finite dimensional), and therefore, if $X$ is an infinite dimensional Banach space and $T \in \mathscr{B}_0(X)$, then $0 \in \sigma(T)$.

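The resolvent set of $T$ is $\rho(T) = \mathbb{C} \setminus \sigma(T)$. One proves that $\rho(T)$ is open (Gert K. Pedersen, Analysis Now, revised printing, p. 131, Theorem 4.1.13), from which it then follows that $\sigma(T)$, being closed and bounded, is a compact set. For $\lambda \in \rho(T)$, we define

\[ R(\lambda, T) = (T - \lambda \mathrm{id}_X)^{-1} \in \mathscr{B}(X), \]

called the resolvent of $T$.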

If $X$ is a Banach space and $T \in \mathscr{B}_0(X)$, then the point spectrum $\sigma_{\mathrm{point}}(T)$ of $T$ is the set of those $\lambda \in \mathbb{C}$ such that $T - \lambda \mathrm{id}_X$ is not injective. In other words, to say that $\lambda \in \sigma_{\mathrm{point}}(T)$ is to say that

\[ \dim \ker (T - \lambda \mathrm{id}_X) > 0. \]

If $\lambda \in \sigma_{\mathrm{point}}(T)$, we say that $\lambda$ is an eigenvalue of $T$, and call $\dim \ker (T - \lambda \mathrm{id}_X)$ its geometric multiplicity. (There is also a notion of algebraic multiplicity of an eigenvalue: the algebraic multiplicity of $\lambda$ is defined to be $\sup_{n \in \mathbb{N}} \dim \ker ((A - \lambda \mathrm{id}_H)^n)$. For self-adjoint operators this is equal to the geometric multiplicity of $\lambda$, while for an operator that is not self-adjoint the algebraic multiplicity of an eigenvalue may be greater than its geometric multiplicity. Nonzero elements of $\ker ((A - \lambda \mathrm{id}_H)^n)$ are called generalized eigenvectors or root vectors. See I. C. Gohberg and M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators in Hilbert Space.) It is a fact that each nonzero eigenvalue of $T$ has finite geometric multiplicity, and it is also a fact that if $T \in \mathscr{B}_0(X)$, then $\sigma_{\mathrm{point}}(T)$ is a bounded countable set and that if $\sigma_{\mathrm{point}}(T)$ has a limit point then that limit point is $0$. The Fredholm alternative tells us that

\[ \sigma(T) \subseteq \sigma_{\mathrm{point}}(T) \cup \{0\}. \]

If $X$ is infinite dimensional then $\sigma(T) = \sigma_{\mathrm{point}}(T) \cup \{0\}$, and $\sigma_{\mathrm{point}}(T)$ might or might not include $0$. If $T \in \mathscr{B}_{00}(X)$, check that $\sigma_{\mathrm{point}}(T)$ is a finite set.

If $H$ is a Hilbert space, using the fact that a bounded linear operator $T \in \mathscr{B}(H)$ is invertible if and only if both $TT^*$ and $T^*T$ are bounded below ($S$ is bounded below if there is some $c > 0$ such that $\|Sx\| \ge c \|x\|$ for all $x \in H$), one can prove that the spectrum of a bounded self-adjoint operator is a set of real numbers, and the spectrum of a bounded positive operator is a set of nonnegative real numbers.
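The statements about the spectrum of a compact operator can be seen in a standard example, sketched here as an illustration rather than taken from the notes. The operator $T$ on $\ell^2$ and the unit vectors $\delta_n$ are assumptions of this sketch.

```latex
% Diagonal compact operator on \ell^2:  T\delta_n = \tfrac{1}{n}\,\delta_n .
\[
(Tx)_n = \frac{x_n}{n}, \qquad
\sigma_{\mathrm{point}}(T) = \Big\{ \tfrac{1}{n} : n \in \mathbb{N} \Big\}, \qquad
\sigma(T) = \sigma_{\mathrm{point}}(T) \cup \{0\} .
\]
% T is injective, so 0 is not an eigenvalue. But T is not surjective:
% y = (1, 1/2, 1/3, \ldots) \in \ell^2 would need the preimage
% x = (1, 1, 1, \ldots), which is not in \ell^2. Hence 0 \in \sigma(T),
% and 0 is the only limit point of the eigenvalues 1/n.
```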

4 Numerical radius

If $H$ is a Hilbert space and $A \in \mathscr{B}(H)$, the numerical range of $A$ is the set $\{ \langle Ax, x \rangle : \|x\| = 1 \}$. (The Toeplitz-Hausdorff theorem states that the numerical range of any bounded linear operator is a convex set. See Paul R. Halmos, A Hilbert Space Problem Book, Problem 166.) The closure of the numerical range contains the spectrum of $A$. (Paul R. Halmos, A Hilbert Space Problem Book, Problem 169. If $A$ is normal, then the closure of its numerical range is the convex hull of the spectrum: Problem 171.) The numerical radius $w(A)$ of $A$ is the supremum of the absolute values of the elements of the numerical range of $A$: $w(A) = \sup_{\|x\| = 1} |\langle Ax, x \rangle|$. If $A \in \mathscr{B}(H)$ is self-adjoint, one can prove that (John B. Conway, A Course in Functional Analysis, second ed., p. 34, Proposition 2.13)

\[ w(A) = \|A\|. \]

The following theorem asserts that a compact self-adjoint operator $A$ has an eigenvalue whose absolute value is equal to the norm of the operator. Thus in particular, the spectral radius of a compact self-adjoint operator is equal to its numerical radius. Since a self-adjoint operator has real spectrum, to say that $|\lambda| = \|A\|$ is to say that either $\lambda = \|A\|$ or $\lambda = -\|A\|$. A compact operator on a Hilbert space can have empty point spectrum (e.g. the Volterra operator on $L^2([0,1])$) and a bounded self-adjoint operator can have empty point spectrum (e.g. the multiplication operator $T\phi(t) = t\phi(t)$ on $L^2([0,1])$), but this theorem shows that if an operator is compact and self-adjoint then its point spectrum is nonempty.

Theorem 2.

If $A \in \mathscr{B}(H)$ is compact and self-adjoint then at least one of $-\|A\|, \|A\|$ is an eigenvalue of $A$.

Proof.

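Because $A$ is self-adjoint, $\|A\| = w(A) = \sup_{\|x\| = 1} |\langle Ax, x \rangle|$. If $A = 0$ the claim is immediate, so suppose that $\|A\| > 0$. Also, as $A$ is self-adjoint, $\langle Ax, x \rangle$ is a real number, and thus either $\|A\| = \sup_{\|x\| = 1} \langle Ax, x \rangle$ or $\|A\| = -\inf_{\|x\| = 1} \langle Ax, x \rangle$. In the first case, due to $\|A\|$ being a supremum there is a sequence $x_n$, all with norm $1$, such that $\langle Ax_n, x_n \rangle \to \|A\|$. Using $A = A^*$,

\begin{align*}
\langle Ax_n - \|A\| x_n, Ax_n - \|A\| x_n \rangle
&= \langle Ax_n, Ax_n \rangle - \langle Ax_n, \|A\| x_n \rangle - \langle \|A\| x_n, Ax_n \rangle + \langle \|A\| x_n, \|A\| x_n \rangle \\
&= \|Ax_n\|^2 - 2\|A\| \langle Ax_n, x_n \rangle + \|A\|^2 \|x_n\|^2 \\
&\le \|A\|^2 \|x_n\|^2 - 2\|A\| \langle Ax_n, x_n \rangle + \|A\|^2 \|x_n\|^2 \\
&= 2\|A\|^2 \|x_n\|^2 - 2\|A\| \langle Ax_n, x_n \rangle \\
&\to 2\|A\|^2 - 2\|A\| \|A\| \\
&= 0,
\end{align*}

where the limit uses $\|x_n\| = 1$. Therefore, as $n \to \infty$, the sequence $Ax_n - \|A\| x_n$ tends to $0$. Using that $A$ is compact, there is a subsequence $x_{a(n)}$ such that $Ax_{a(n)}$ converges to some $y$. Then $\|A\| x_{a(n)} = Ax_{a(n)} - (Ax_{a(n)} - \|A\| x_{a(n)}) \to y$, so $x_{a(n)}$ converges to $x = y / \|A\|$, and $\|x\| = 1$ because each $x_n$ has norm $1$. As $A$ is continuous, $Ax = \lim_{n \to \infty} Ax_{a(n)} = y = \|A\| x$, so $Ax - \|A\| x = 0$, i.e. $x$ is an eigenvector for the eigenvalue $\|A\|$. If $\|A\| = -\inf_{\|x\| = 1} \langle Ax, x \rangle$ the argument goes the same. ∎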
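A small finite dimensional example, added here only as an illustration, shows that it can be $-\|A\|$ rather than $\|A\|$ that is the eigenvalue. The matrix below acting on $\mathbb{C}^2$ is an assumption of this sketch.

```latex
% A = diag(-2, 1) on \mathbb{C}^2 is self-adjoint (and compact, H being finite dimensional).
\[
A = \begin{pmatrix} -2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\langle Ax, x \rangle = -2|x_1|^2 + |x_2|^2 .
\]
% On the unit sphere:
\[
\sup_{\|x\|=1} \langle Ax, x \rangle = 1, \qquad
-\inf_{\|x\|=1} \langle Ax, x \rangle = 2 = \|A\| .
\]
% So -\|A\| = -2 is an eigenvalue (eigenvector (1,0)), while \|A\| = 2 is not.
```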

5 Polar decomposition

If $H$ is a Hilbert space and $P \in \mathscr{B}(H)$ is positive, there is a unique positive element of $\mathscr{B}(H)$, denoted $P^{1/2}$, satisfying $(P^{1/2})^2 = P$, which we call the positive square root of $P$ (Gert K. Pedersen, Analysis Now, revised printing, p. 92, Proposition 3.2.11). If $A \in \mathscr{B}(H)$ one checks that $A^* A$ is positive, and hence $A^* A$ has a positive square root, which we denote by $|A|$ and call the absolute value of $A$. One proves that $|A|$ is the unique positive operator in $\mathscr{B}(H)$ satisfying

\[ \|Ax\| = \| |A| x \|, \qquad x \in H. \]

An element $U$ of $\mathscr{B}(H)$ is said to be a partial isometry if there is a closed subspace $X$ of $H$ such that the restriction of $U$ to $X$ is an isometry $X \to U(X)$ and $\ker U = X^\perp$. One proves that $U^* U$ is the orthogonal projection of $H$ onto $X$. It can be proved that if $A \in \mathscr{B}(H)$ then there is a unique partial isometry $U$ satisfying both $\ker U = \ker A$ and $A = U|A|$ (Gert K. Pedersen, Analysis Now, revised printing, p. 96, Theorem 3.2.17). This is called the polar decomposition of $A$. The polar decomposition satisfies

\[ U^* U |A| = |A|, \qquad U^* A = |A|, \qquad U U^* A = A. \]
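The polar decomposition can be checked by hand on a small matrix. The following worked example is an illustration added to the notes; the matrix $A$ on $\mathbb{C}^2$ and the standard basis vectors $\delta_1, \delta_2$ are assumptions of the sketch.

```latex
% A maps \delta_1 to 0 and \delta_2 to 2\delta_1.
\[
A = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
A^* A = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
|A| = (A^* A)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}.
\]
% The partial isometry with \ker U = \ker A = \operatorname{span}\{\delta_1\}:
\[
U = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
U|A| = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = A, \qquad
U^* U = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% U^*U is the orthogonal projection onto \operatorname{span}\{\delta_2\} = (\ker A)^\perp,
% and U^*U|A| = |A|, U^*A = |A|, UU^*A = A can be verified directly.
```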

6 Spectral theorem

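If $e, f \in H$, we define $e \otimes f : H \to H$ by $e \otimes f(h) = \langle h, f \rangle e$. $e \otimes f$ is linear, and

\[ \| e \otimes f(h) \| = \| \langle h, f \rangle e \| = |\langle h, f \rangle| \|e\| \le \|h\| \|f\| \|e\|, \]

so $\| e \otimes f \| \le \|e\| \|f\|$. Depending on whether $f = 0$ the image of $e \otimes f$ is $\{0\}$ or the span of $e$, and in either case $e \otimes f \in \mathscr{B}_{00}(H)$. If either of $e$ or $f$ is $0$ then $e \otimes f$ has rank $0$, and otherwise $e \otimes f$ has rank $1$, in which case it is an orthogonal projection precisely when $f = e / \|e\|^2$ (for instance when $f = e$ is a unit vector).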

If $\mathscr{E}$ is an orthonormal set in a Hilbert space $H$, then $\mathscr{E}$ is an orthonormal basis for $H$ if and only if the unordered sum

\[ \sum_{e \in \mathscr{E}} e \otimes e \]

converges strongly to $\mathrm{id}_H$ (John B. Conway, A Course in Functional Analysis, second ed., p. 16, Theorem 4.13).

Let’s summarize what we have stated so far about the spectrum and point spectrum of a compact self-adjoint operator on a Hilbert space.

Theorem 3 (Spectrum of compact self-adjoint operators).

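If $H$ is a Hilbert space and $A \in \mathscr{B}_0(H)$ is self-adjoint, then:

  • $\sigma(A)$ is a nonempty compact subset of $\mathbb{R}$.

  • If $H$ is infinite dimensional, then $0 \in \sigma(A)$.

  • $\sigma(A) \subseteq \sigma_{\mathrm{point}}(A) \cup \{0\}$.

  • $\sigma_{\mathrm{point}}(A)$ is countable.

  • If $\lambda$ is a limit point of $\sigma_{\mathrm{point}}(A)$, then $\lambda = 0$.

  • At least one of $\|A\|, -\|A\|$ is an element of $\sigma_{\mathrm{point}}(A)$.

  • Each nonzero eigenvalue of $A$ has finite geometric multiplicity: If $\lambda \in \sigma_{\mathrm{point}}(A)$ and $\lambda \ne 0$, then $\dim \ker (A - \lambda \mathrm{id}_H) < \infty$.

  • If $A \in \mathscr{B}_{00}(H)$, then $\sigma_{\mathrm{point}}(A)$ is a finite set.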

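We say that $A \in \mathscr{B}(H)$ is diagonalizable if there is an orthonormal basis $\mathscr{E}$ for $H$ and a bounded set $\{ \lambda_e \in \mathbb{C} : e \in \mathscr{E} \}$ such that the unordered sum

\[ \sum_{e \in \mathscr{E}} \lambda_e \, e \otimes e \]

converges strongly to $A$.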

The following is the spectral theorem for normal compact operators (Gert K. Pedersen, Analysis Now, revised printing, p. 108, Theorem 3.3.8).

Theorem 4 (Spectral theorem).

If $A \in \mathscr{B}_0(H)$ is normal, then $A$ is diagonalizable.

The last assertion of Theorem 3 is that a bounded self-adjoint finite rank operator on a Hilbert space has finitely many elements in its point spectrum. Using the spectral theorem, we get that if $A \in \mathscr{B}_0(H)$ is self-adjoint and $\sigma_{\mathrm{point}}(A)$ is finite, then $A$ is finite rank. In the notation we introduce in the following definition, $\nu(A) < \infty$ precisely when $A$ has finite rank.

Definition 5.

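If $A \in \mathscr{B}_0(H)$ is self-adjoint, define $0 \le \nu(A) \le \infty$ to be the sum of the geometric multiplicities of the nonzero eigenvalues of $A$:

\[ \nu(A) = \sum_{\lambda \in \sigma_{\mathrm{point}}(A) \setminus \{0\}} \dim \ker (A - \lambda \mathrm{id}_H). \]

Define

\[ (\lambda_n(A) : n \in \mathbb{N}) \]

to be the sequence whose first term is the element of $\sigma_{\mathrm{point}}(A) \setminus \{0\}$ with largest absolute value repeated as many times as its geometric multiplicity. If $\lambda, -\lambda$ are both nonzero elements of $\sigma_{\mathrm{point}}(A)$, we put the positive one first. We repeat this for the remaining elements of $\sigma_{\mathrm{point}}(A) \setminus \{0\}$. If $\nu(A) < \infty$, we define $\lambda_n(A) = 0$ for $n > \nu(A)$.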

Using the spectral theorem and the notation in the above definition we get the following.

Theorem 6.

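If $A \in \mathscr{B}_0(H)$ is self-adjoint, then there is an orthonormal set $\{ e_n : n \in \mathbb{N} \}$ in $H$ such that

\[ \sum_{n \in \mathbb{N}} \lambda_n(A) \, e_n \otimes e_n \]

converges strongly to $A$.

If $A \in \mathscr{B}_0(H)$, then its absolute value $|A|$ is a positive compact operator, and $\lambda_n(|A|) \ge 0$ for all $n$.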

Definition 7.

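If $A \in \mathscr{B}_0(H)$ and $\lambda$ is an eigenvalue of $|A|$, we call $\lambda$ a singular value of $A$, and we define

\[ \sigma_n(A) = \lambda_n(|A|), \qquad n \in \mathbb{N}. \]

Because the absolute value of the absolute value of an operator is the absolute value of the operator, if $A \in \mathscr{B}_0(H)$ and $n \in \mathbb{N}$ then $\sigma_n(|A|) = \sigma_n(A)$. If $A \in \mathscr{B}_0(H)$ and $\lambda \ne 0$, one proves that $\lambda$ is an eigenvalue of $A A^*$ if and only if $\lambda$ is an eigenvalue of $A^* A$ and that they have the same geometric multiplicity. From this we get that $\sigma_n(A) = \sigma_n(A^*)$ for all $n \in \mathbb{N}$.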

7 Finite rank operators

Theorem 8 (Singular value decomposition).

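If $H$ is a Hilbert space and $A \in \mathscr{B}_{00}(H)$ has $\operatorname{rank} A = N$, then there is an orthonormal set $\{ e_n : 1 \le n \le N \}$ and an orthonormal set $\{ f_n : 1 \le n \le N \}$ such that

\[ A = \sum_{n=1}^N \sigma_n(A) \, e_n \otimes f_n, \qquad Ah = \sum_{n=1}^N \sigma_n(A) \langle h, f_n \rangle e_n. \]

Proof.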

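$|A|$ is a positive operator with $\operatorname{rank} |A| = N$, and according to Theorem 6, there is an orthonormal set $\{ f_n : n \in \mathbb{N} \}$ in $H$ such that

\[ |A| = \sum_{n \in \mathbb{N}} \lambda_n(|A|) \, f_n \otimes f_n = \sum_{n \in \mathbb{N}} \sigma_n(A) \, f_n \otimes f_n, \]

where $\sigma_n(A) = 0$ for $n > N$. Using the polar decomposition $A = U|A|$,

\[ A = U|A| = \sum_{n=1}^N \sigma_n(A) \, (U f_n) \otimes f_n. \]

Define $e_n = U f_n$. As $U^* U |A| = |A|$ and as $|A| \frac{f_m}{\sigma_m(A)} = f_m$ for $1 \le m \le N$,

\begin{align*}
\langle e_n, e_m \rangle &= \langle U f_n, U f_m \rangle \\
&= \langle f_n, U^* U f_m \rangle \\
&= \left\langle f_n, U^* U |A| \frac{f_m}{\sigma_m(A)} \right\rangle \\
&= \left\langle f_n, |A| \frac{f_m}{\sigma_m(A)} \right\rangle \\
&= \langle f_n, f_m \rangle \\
&= \delta_{n,m},
\end{align*}

showing that $\{ e_n : 1 \le n \le N \}$ is an orthonormal set. ∎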
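As a sanity check of Theorem 8, here is a rank one example worked by hand. It is an added illustration, not part of the notes; the matrix $A$ on $\mathbb{C}^2$ is an assumption of the sketch, and $e_1, f_1$ are computed following the recipe in the proof.

```latex
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
A^* A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
|A| = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.
\]
% |A| has eigenvalues \sqrt{2} and 0, so N = 1 and \sigma_1(A) = \sqrt{2}, with
\[
f_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
e_1 = U f_1 = \frac{A f_1}{\sigma_1(A)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]
% Check of the decomposition A = \sigma_1(A)\, e_1 \otimes f_1:
\[
\sigma_1(A) \langle h, f_1 \rangle e_1
= \sqrt{2} \cdot \frac{h_1 + h_2}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} h_1 + h_2 \\ 0 \end{pmatrix} = Ah .
\]
```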

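If $e, f, x, y \in H$, then

\[ \langle e \otimes f(x), y \rangle = \langle \langle x, f \rangle e, y \rangle = \langle x, f \rangle \langle e, y \rangle = \langle y, e \rangle^* \langle x, f \rangle = \langle x, \langle y, e \rangle f \rangle = \langle x, f \otimes e(y) \rangle, \]

so $(e \otimes f)^* = f \otimes e$.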
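In $\mathbb{C}^n$ this identity is the familiar rule for the conjugate transpose of an outer product. The following lines are an added illustration; identifying vectors with columns and writing $\langle x, y \rangle = y^* x$ is an assumption of the sketch (it matches an inner product that is linear in the first argument).

```latex
% For column vectors e, f \in \mathbb{C}^n with \langle x, y \rangle = y^* x:
\[
e \otimes f (h) = \langle h, f \rangle e = e \, (f^* h) = (e f^*) h,
\qquad\text{so}\qquad e \otimes f = e f^*, \quad (e f^*)^* = f e^* = f \otimes e .
\]
% Example with e = (1,0)^T and f = (0,1)^T:
\[
e \otimes f = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
(e \otimes f)^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = f \otimes e .
\]
```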

Theorem 9.

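If $A \in \mathscr{B}_{00}(H)$ then $A^* \in \mathscr{B}_{00}(H)$.

Proof.

Let $A$ have the singular value decomposition

\[ A = \sum_{n=1}^N \sigma_n(A) \, e_n \otimes f_n. \]

Taking the adjoint, and because $\sigma_n(A) \in \mathbb{R}$,

\[ A^* = \sum_{n=1}^N \sigma_n(A) \, f_n \otimes e_n. \]

$A^*$ is a sum of finite rank operators and is therefore itself a finite rank operator. ∎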

8 Compact operators

If $X$ and $Y$ are Banach spaces, $\mathscr{B}_{00}(X,Y) \subseteq \mathscr{B}_0(X,Y)$. But if $H$ is a Hilbert space we can say much more: $\mathscr{B}_{00}(H)$ is a dense subset of $\mathscr{B}_0(H)$. In other words, any compact operator on a Hilbert space can be approximated by a sequence of bounded finite rank operators (John B. Conway, A Course in Functional Analysis, second ed., p. 41, Theorem 4.4). Let $A \in \mathscr{B}_0(H)$ and let $A_n \in \mathscr{B}_{00}(H)$ with $A_n \to A$. By Theorem 9 the adjoint $A_n^*$ of each of these finite rank operators $A_n$ is itself a bounded finite rank operator, and

\[ \| A^* - A_n^* \| = \| (A - A_n)^* \| = \| A - A_n \| \to 0, \]

so $A_n^* \to A^*$. Because each bounded finite rank operator is compact and $\mathscr{B}_0(H)$ is closed, this establishes that $A^* \in \mathscr{B}_0(H)$. (In fact, it is true that the adjoint of a compact linear operator between Banach spaces is itself compact, but there we don’t have the tool of showing that the adjoint is the limit of the adjoints of finite rank operators.)

If $H$ is a Hilbert space, the weak topology is the topology on $H$ such that a net $x_\alpha$ converges to $x$ weakly if for all $h \in H$ the net $\langle x_\alpha, h \rangle$ converges to $\langle x, h \rangle$ in $\mathbb{C}$. Let $\mathfrak{B}$ be the closed unit ball in $H$, and let it be a topological space with the subspace topology inherited from $H$ with the weak topology. Thus, a net $x_\alpha \in \mathfrak{B}$ converges to $x \in \mathfrak{B}$ if and only if for all $h \in H$ the net $\langle x_\alpha, h \rangle$ converges to $\langle x, h \rangle$.

Theorem 10.

If $H$ is a Hilbert space, $A \in \mathscr{B}(H)$, and $\mathfrak{B}$ is the closed unit ball in $H$ with the subspace topology inherited from $H$ with the weak topology, then $A$ is compact if and only if $A|_{\mathfrak{B}} : \mathfrak{B} \to H$ is continuous.

Proof.

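Suppose that $A$ is compact and let $x_\alpha$ be a net in $\mathfrak{B}$ that converges weakly to some $x \in \mathfrak{B}$. If $\epsilon > 0$, then, as $\mathscr{B}_{00}(H)$ is dense in $\mathscr{B}_0(H)$, there is some $B \in \mathscr{B}_{00}(H)$ with $\| A - B \| < \frac{\epsilon}{3}$. Let $B$ have the singular value decomposition

\[ B = \sum_{n=1}^N \sigma_n(B) \, e_n \otimes f_n. \]

We have, using that the $e_n$ are orthonormal,

\begin{align*}
\| B x_\alpha - B x \|^2 &= \left\| \sum_{n=1}^N \sigma_n(B) \langle x_\alpha, f_n \rangle e_n - \sum_{n=1}^N \sigma_n(B) \langle x, f_n \rangle e_n \right\|^2 \\
&= \left\| \sum_{n=1}^N \sigma_n(B) \langle x_\alpha - x, f_n \rangle e_n \right\|^2 \\
&= \sum_{n=1}^N \sigma_n(B)^2 |\langle x_\alpha - x, f_n \rangle|^2.
\end{align*}

Each of the finitely many terms tends to $0$ because $x_\alpha \to x$ weakly, so eventually $\| B x_\alpha - B x \| < \frac{\epsilon}{3}$, and for such $\alpha$,

\begin{align*}
\| A x_\alpha - A x \| &\le \| A x_\alpha - B x_\alpha \| + \| B x_\alpha - B x \| + \| B x - A x \| \\
&\le \| A - B \| \| x_\alpha \| + \| B x_\alpha - B x \| + \| B - A \| \| x \| \\
&\le \| A - B \| + \| B x_\alpha - B x \| + \| B - A \| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} \\
&= \epsilon.
\end{align*}

We have shown that $A x_\alpha \to A x$ in the norm of $H$, and this shows that $A|_{\mathfrak{B}} : \mathfrak{B} \to H$ is continuous.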

Suppose that $A|_{\mathfrak{B}} : \mathfrak{B} \to H$ is continuous. Kakutani’s theorem states that a Banach space is reflexive if and only if the closed unit ball is weakly compact. A Hilbert space is reflexive, hence $\mathfrak{B}$, the closed unit ball with the weak topology, is a compact topological space (cf. Paul R. Halmos, A Hilbert Space Problem Book, Problem 17). Since $A|_{\mathfrak{B}} : \mathfrak{B} \to H$ is continuous and $\mathfrak{B}$ is compact, the image $A(\mathfrak{B})$ is compact (the image of a compact set under a continuous map is a compact set). We have shown that the image of the closed unit ball is a compact subset of $H$, and this shows that $A$ is compact; in fact, to have shown that $A$ is compact we merely needed to show that the image of the closed unit ball is precompact, and $H$ is a Hausdorff space so a compact set is precompact. ∎
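Theorem 10 can be tested on an orthonormal sequence. The computation below is an added illustration under the assumption that $H$ is infinite dimensional and $\{ e_n : n \in \mathbb{N} \}$ is an orthonormal sequence in $H$; the diagonal operator $D$ is a hypothetical example chosen for the sketch.

```latex
% An orthonormal sequence converges weakly to 0, by Bessel's inequality:
\[
\sum_{n=1}^\infty |\langle e_n, h \rangle|^2 \le \|h\|^2
\quad\Longrightarrow\quad
\langle e_n, h \rangle \to 0 \quad\text{for every } h \in H .
\]
% If A is compact, Theorem 10 then gives \|A e_n\| \to 0. For instance, for
% D = \sum_n \tfrac{1}{n}\, e_n \otimes e_n one has \|D e_n\| = \tfrac{1}{n} \to 0.
% On the other hand \|\mathrm{id}_H e_n\| = 1 for all n, so \mathrm{id}_H restricted
% to the unit ball is not weak-to-norm continuous, i.e. \mathrm{id}_H is not compact.
```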

A compact linear operator on an infinite dimensional Hilbert space $H$ is not invertible, lest $\mathrm{id}_H$ be compact. However, operators of the form $A - \lambda \mathrm{id}_H$ may indeed be invertible (Ward Cheney, Analysis for Applied Mathematics, p. 94, Theorem 2).

Theorem 11.

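If $A \in \mathscr{B}_0(H)$ is a normal operator with diagonalization

\[ A = \sum_{n=1}^\infty \lambda_n \, e_n \otimes e_n \]

and $0 \ne \lambda \notin \sigma_{\mathrm{point}}(A)$, then $A - \lambda \mathrm{id}_H$ is invertible and

\[ (A - \lambda \mathrm{id}_H)^{-1} = -\frac{1}{\lambda} + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \, e_n \otimes e_n, \]

where the series converges in the strong operator topology.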

Proof.

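As $\lambda_n \to 0$ we have $\alpha = \sup_n |\lambda_n| < \infty$, and as $\lambda \ne 0$ and $\lambda \ne \lambda_n$ for all $n$ we have $\beta = \inf_n |\lambda_n - \lambda| > 0$. Define

\[ T_N = -\frac{1}{\lambda} + \frac{1}{\lambda} \sum_{n=1}^N \frac{\lambda_n}{\lambda_n - \lambda} \, e_n \otimes e_n \in \mathscr{B}(H), \]

and if $N > M$, then, for any $h \in H$,

\begin{align*}
\| T_N h - T_M h \|^2 &= \frac{1}{|\lambda|^2} \left\| \sum_{n=M+1}^N \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \right\|^2 \\
&= \frac{1}{|\lambda|^2} \sum_{n=M+1}^N \frac{|\lambda_n|^2}{|\lambda_n - \lambda|^2} |\langle h, e_n \rangle|^2 \\
&\le \frac{1}{|\lambda|^2} \sum_{n=M+1}^N \frac{\alpha^2 |\langle h, e_n \rangle|^2}{\beta^2} \\
&= \frac{\alpha^2}{|\lambda|^2 \beta^2} \sum_{n=M+1}^N |\langle h, e_n \rangle|^2.
\end{align*}

By Bessel’s inequality, $\sum_{n=1}^\infty |\langle h, e_n \rangle|^2 \le \|h\|^2$, hence $\sum_{n=N}^\infty |\langle h, e_n \rangle|^2 \to 0$ as $N \to \infty$; the rate of this convergence depends on $h$, and this is why the claim is stated merely for the strong operator topology and not the norm topology. We have shown that $T_N h$ is a Cauchy sequence in $H$ and hence $T_N h$ converges. We define $Bh$ to be this limit. For $h \in H$,

\begin{align*}
\left\| \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \right\|^2 &= \frac{1}{|\lambda|^2} \sum_{n=1}^\infty \frac{|\lambda_n|^2}{|\lambda_n - \lambda|^2} |\langle h, e_n \rangle|^2 \\
&\le \frac{\alpha^2}{|\lambda|^2 \beta^2} \sum_{n=1}^\infty |\langle h, e_n \rangle|^2 \\
&\le \frac{\alpha^2}{|\lambda|^2 \beta^2} \|h\|^2,
\end{align*}

whence

\begin{align*}
\| Bh \| &\le \left\| -\frac{1}{\lambda} h \right\| + \left\| \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \right\| \\
&\le \frac{1}{|\lambda|} \|h\| + \frac{\alpha}{|\lambda| \beta} \|h\|,
\end{align*}

showing that $\|B\| \le \frac{1}{|\lambda|} + \frac{\alpha}{|\lambda| \beta}$. It is straightforward to check that $B$ is linear, thus $B \in \mathscr{B}(H)$. (If $H$ is infinite dimensional then $B$ is not compact, and in particular is not a norm limit of finite rank operators: we will show that $B$ is invertible, and if $B$ were compact this would tell us that $\mathrm{id}_H$ is compact, contradicting $H$ being infinite dimensional.)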

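For $h \in H$,

\begin{align*}
(A - \lambda \mathrm{id}_H) B h &= -\frac{1}{\lambda} (Ah - \lambda h) + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle (A e_n - \lambda e_n) \\
&= h - \frac{1}{\lambda} Ah + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle (\lambda_n e_n - \lambda e_n) \\
&= h - \frac{1}{\lambda} Ah + \frac{1}{\lambda} \sum_{n=1}^\infty \lambda_n \langle h, e_n \rangle e_n \\
&= h,
\end{align*}

where the final equality is because the series is the diagonalization of $A$. On the other hand,

\begin{align*}
B (A - \lambda \mathrm{id}_H) h &= -\frac{1}{\lambda} (A - \lambda \mathrm{id}_H) h + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle (A - \lambda \mathrm{id}_H) h, e_n \rangle e_n \\
&= h - \frac{1}{\lambda} Ah + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle Ah - \lambda h, e_n \rangle e_n \\
&= h - \frac{1}{\lambda} \sum_{n=1}^\infty \lambda_n \langle h, e_n \rangle e_n + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle Ah, e_n \rangle e_n - \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \\
&= h - \frac{1}{\lambda} \sum_{n=1}^\infty \lambda_n \langle h, e_n \rangle e_n + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \lambda_n \langle h, e_n \rangle e_n - \sum_{n=1}^\infty \frac{\lambda_n}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \\
&= h + \frac{1}{\lambda} \sum_{n=1}^\infty \frac{-\lambda_n (\lambda_n - \lambda) + \lambda_n^2 - \lambda_n \lambda}{\lambda_n - \lambda} \langle h, e_n \rangle e_n \\
&= h,
\end{align*}

showing that $B = (A - \lambda \mathrm{id}_H)^{-1}$. ∎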
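The formula for the inverse is easiest to remember by evaluating it on the eigenvectors. The short check below is an added illustration and uses only the notation of Theorem 11.

```latex
% On an eigenvector e_m of A (with A e_m = \lambda_m e_m):
\[
B e_m = -\frac{1}{\lambda} e_m + \frac{1}{\lambda} \frac{\lambda_m}{\lambda_m - \lambda} e_m
      = \frac{-(\lambda_m - \lambda) + \lambda_m}{\lambda (\lambda_m - \lambda)} e_m
      = \frac{1}{\lambda_m - \lambda} e_m ,
\]
% which inverts (A - \lambda\,\mathrm{id}_H) e_m = (\lambda_m - \lambda) e_m.
% On a vector h orthogonal to every e_n:
\[
(A - \lambda\,\mathrm{id}_H) h = -\lambda h, \qquad B h = -\frac{1}{\lambda} h .
\]
```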

We can start with a function and ask what kind of series it can be expanded into, or we can start with a series and ask what kind of function it defines. The following theorem does the latter. It shows that if $e_n$ and $f_n$ are each orthonormal sequences and $\lambda_n$ is a sequence of complex numbers whose limit is $0$, then the series

\[ \sum_{n=1}^\infty \lambda_n \, e_n \otimes f_n \]

converges and is an element of $\mathscr{B}_0(H)$.

Theorem 12.

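If $H$ is a Hilbert space, $\{ e_n : n \in \mathbb{N} \}$ is an orthonormal set, $\{ f_n : n \in \mathbb{N} \}$ is an orthonormal set, and $\lambda_n \in \mathbb{C}$ is a sequence tending to $0$, then the sequence

\[ A_N = \sum_{n=1}^N \lambda_n \, e_n \otimes f_n \in \mathscr{B}_{00}(H) \]

converges to an element of $\mathscr{B}_0(H)$.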

Proof.

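Let $\epsilon > 0$ and let $N_0$ be such that if $n \ge N_0$ then $|\lambda_n| < \epsilon$. If $N > M \ge N_0$ and $h \in H$, then, as the $e_n$ are orthonormal,

\begin{align*}
\| (A_N - A_M) h \|^2 &= \left\| \sum_{n=M+1}^N \lambda_n \, e_n \otimes f_n (h) \right\|^2 \\
&= \left\| \sum_{n=M+1}^N \lambda_n \langle h, f_n \rangle e_n \right\|^2 \\
&= \sum_{n=M+1}^N |\lambda_n|^2 |\langle h, f_n \rangle|^2 \\
&\le \epsilon^2 \sum_{n=M+1}^N |\langle h, f_n \rangle|^2.
\end{align*}

By Bessel’s inequality, $\sum_{n=M+1}^N |\langle h, f_n \rangle|^2 \le \|h\|^2$, and hence

\[ \| (A_N - A_M) h \| \le \epsilon \|h\|. \]

As this holds for all $h \in H$,

\[ \| A_N - A_M \| \le \epsilon, \]

showing that $A_N$ is a Cauchy sequence, which therefore converges in $\mathscr{B}(H)$. As each term in the sequence is finite rank and so compact, the limit is a compact operator. ∎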
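The same computation gives the exact size of the tail, which makes the rate of convergence explicit. This is an added illustration; $A$ denotes the limit of the $A_N$, and the choice $\lambda_n = \frac{1}{n}$ is an assumption made only for the example.

```latex
% For h \in H, by orthonormality of the e_n and Bessel's inequality for the f_n:
\[
\| (A - A_N) h \|^2 = \sum_{n > N} |\lambda_n|^2 |\langle h, f_n \rangle|^2
\le \Big( \sup_{n > N} |\lambda_n| \Big)^2 \|h\|^2 ,
\qquad
(A - A_N) f_m = \lambda_m e_m \quad (m > N),
\]
% hence
\[
\| A - A_N \| = \sup_{n > N} |\lambda_n| .
\]
% Example: \lambda_n = 1/n gives \|A - A_N\| = 1/(N+1).
```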

Continuing the analogy we used with the above theorem, now we start with a function and ask what kind of series it can be expanded into. This is called the singular value decomposition of a compact operator. Helemskii calls the series in the following theorem the Schmidt series of the operator (A. Ya. Helemskii, Lectures and Exercises on Functional Analysis, p. 215, Theorem 1). We have already presented the singular value decomposition for finite rank operators in Theorem 8.

Theorem 13 (Singular value decomposition).

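If $H$ is a Hilbert space and

\[ A \in \mathscr{B}_0(H) \setminus \mathscr{B}_{00}(H), \]

then there is an orthonormal set $\{ e_n : n \in \mathbb{N} \}$ and an orthonormal set $\{ f_n : n \in \mathbb{N} \}$ such that $A_N \to A$, where

\[ A_N = \sum_{n=1}^N \sigma_n(A) \, e_n \otimes f_n. \]

Proof.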

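As $|A|$ is self-adjoint and compact, by Theorem 6 there is an orthonormal set $\{ f_n : n \in \mathbb{N} \}$ such that

\[ |A| = \sum_{n=1}^\infty \lambda_n(|A|) \, f_n \otimes f_n = \sum_{n=1}^\infty \sigma_n(A) \, f_n \otimes f_n. \]

That is, with $|A|_N \in \mathscr{B}_{00}(H)$ defined by

\[ |A|_N = \sum_{n=1}^N \sigma_n(A) \, f_n \otimes f_n, \]

we have $|A|_N \to |A|$. (Theorem 6 gives strong convergence; as $\sigma_n(A) \to 0$, Theorem 12 shows that $|A|_N$ also converges in the operator norm, and the two limits agree.)

Let $A = U|A|$ be the polar decomposition of $A$, and define $e_n = U f_n$. As $U^* U |A| = |A|$ and as $\sigma_m(A) > 0$ (because $A$ is not finite rank), we have $|A| \frac{f_m}{\sigma_m(A)} = f_m$, and hence

\begin{align*}
\langle e_n, e_m \rangle &= \langle U f_n, U f_m \rangle \\
&= \langle f_n, U^* U f_m \rangle \\
&= \left\langle f_n, U^* U |A| \frac{f_m}{\sigma_m(A)} \right\rangle \\
&= \left\langle f_n, |A| \frac{f_m}{\sigma_m(A)} \right\rangle \\
&= \langle f_n, f_m \rangle \\
&= \delta_{n,m},
\end{align*}

showing that $\{ e_n : n \in \mathbb{N} \}$ is an orthonormal set. Define

\[ A_N = \sum_{n=1}^N \sigma_n(A) \, e_n \otimes f_n, \]

and we have $A_N = U |A|_N$. As $A = U|A|$ and $A_N = U|A|_N$, we get

\[ \| A - A_N \| = \| U|A| - U|A|_N \| \le \|U\| \, \| |A| - |A|_N \| = \| |A| - |A|_N \| \to 0, \]

showing that $A_N \to A$. ∎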

9 Courant min-max theorem

Theorem 14 (Courant min-max theorem).

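Let $H$ be an infinite dimensional Hilbert space and let $A \in \mathscr{B}_0(H)$ be a positive operator. If $k \in \mathbb{N}$ then

\[ \max_{\dim S = k} \; \min_{x \in S, \|x\| = 1} \langle Ax, x \rangle = \lambda_k(A) = \sigma_k(A) \]

and

\[ \min_{\dim S = k-1} \; \max_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle = \lambda_k(A) = \sigma_k(A). \]

Proof.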

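$A$ is compact and positive, so $|A| = A$ and $\lambda_n(A) = \sigma_n(A)$, and according to Theorem 6 there is an orthonormal set $\{ e_n : n \in \mathbb{N} \}$ such that

\[ A = \sum_{n=1}^\infty \sigma_n(A) \, e_n \otimes e_n. \]

For $k \in \mathbb{N}$, let $S_k = \bigvee_{n=k}^\infty \{ e_n \}$. $S_k^\perp = \bigvee_{n=1}^{k-1} \{ e_n \}$, so $S_k$ has codimension $k-1$. (The codimension of a closed subspace of a Hilbert space is the dimension of its orthogonal complement.) If $S$ is a $k$ dimensional subspace of $H$, then there is some $x \in S_k \cap S$ with $\|x\| = 1$. This is because if $V$ is a closed subspace with codimension $k-1$ of a Hilbert space and $W$ is a $k$ dimensional subspace of the Hilbert space, then their intersection is a subspace of nonzero dimension. As $x \in S_k$, there are $\alpha_n \in \mathbb{C}$, $n \ge k$, with

\[ x = \sum_{n=k}^\infty \alpha_n e_n, \qquad \|x\|^2 = \sum_{n=k}^\infty |\alpha_n|^2. \]

As the sequence $\sigma_n(A)$ is nonincreasing,

\begin{align*}
\langle Ax, x \rangle &= \left\langle \sum_{n=1}^\infty \sigma_n(A) \langle x, e_n \rangle e_n, x \right\rangle \\
&= \left\langle \sum_{n=1}^\infty \sigma_n(A) \left\langle \sum_{m=k}^\infty \alpha_m e_m, e_n \right\rangle e_n, \sum_{m=k}^\infty \alpha_m e_m \right\rangle \\
&= \left\langle \sum_{n=1}^\infty \sigma_n(A) \sum_{m=k}^\infty \alpha_m \delta_{m,n} e_n, \sum_{m=k}^\infty \alpha_m e_m \right\rangle \\
&= \left\langle \sum_{n=1}^\infty \sigma_n(A) \alpha_n \chi_k(n) e_n, \sum_{m=k}^\infty \alpha_m e_m \right\rangle \\
&= \left\langle \sum_{n=k}^\infty \sigma_n(A) \alpha_n e_n, \sum_{m=k}^\infty \alpha_m e_m \right\rangle \\
&= \sum_{n=k}^\infty \sigma_n(A) |\alpha_n|^2 \\
&\le \sigma_k(A) \sum_{n=k}^\infty |\alpha_n|^2 \\
&= \sigma_k(A),
\end{align*}

where we write

\[ \chi_k(n) = \begin{cases} 1 & n \ge k \\ 0 & n < k. \end{cases} \]

This shows that if $\dim S = k$ then

\[ \inf_{x \in S, \|x\| = 1} \langle Ax, x \rangle \le \sigma_k(A). \]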

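Let $M = \inf_{x \in S, \|x\| = 1} \langle Ax, x \rangle$, and let $x_n \in S$, $\|x_n\| = 1$, with $\langle A x_n, x_n \rangle \to M$. As $S$ is a finite dimensional Hilbert space, the unit sphere in it is compact, so there is a subsequence $x_{a(n)}$ that converges to some $z \in S$, $\|z\| = 1$. We have

\begin{align*}
| \langle Az, z \rangle - \langle A x_n, x_n \rangle | &\le | \langle Az, z \rangle - \langle A x_n, z \rangle | + | \langle A x_n, z \rangle - \langle A x_n, x_n \rangle | \\
&= | \langle A(z - x_n), z \rangle | + | \langle A x_n, z - x_n \rangle | \\
&\le \|A\| \| z - x_n \| \|z\| + \|A\| \|x_n\| \| z - x_n \| \\
&= 2 \|A\| \| z - x_n \|.
\end{align*}

As $x_{a(n)} \to z$, we get $\langle A x_{a(n)}, x_{a(n)} \rangle \to \langle Az, z \rangle$. As $\langle A x_n, x_n \rangle \to M$, we get

\[ \langle Az, z \rangle = M = \inf_{x \in S, \|x\| = 1} \langle Ax, x \rangle. \]

As $z \in S$ and $\|z\| = 1$, we have in fact

\[ \min_{x \in S, \|x\| = 1} \langle Ax, x \rangle = \inf_{x \in S, \|x\| = 1} \langle Ax, x \rangle \le \sigma_k(A). \]

This is true for any $k$ dimensional subspace of $H$, so

\[ \sup_{\dim S = k} \; \min_{x \in S, \|x\| = 1} \langle Ax, x \rangle \le \sigma_k(A). \]

If $S = S_{k+1}^\perp = \bigvee_{n=1}^k \{ e_n \}$ then $\dim S = k$, $e_k \in S$, $\|e_k\| = 1$, and

\[ \langle A e_k, e_k \rangle = \sigma_k(A) \langle e_k, e_k \rangle = \sigma_k(A), \]

while $\langle Ax, x \rangle = \sum_{n=1}^k \sigma_n(A) |\langle x, e_n \rangle|^2 \ge \sigma_k(A)$ for every unit vector $x \in S$, so in fact

\[ \max_{\dim S = k} \; \min_{x \in S, \|x\| = 1} \langle Ax, x \rangle = \sigma_k(A), \]

which is the first of the two formulas that we want to prove.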

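For $k \ge 1$, let $S_k = \bigvee_{n=1}^k \{ e_n \}$. If $S$ is a $k-1$ dimensional subspace of $H$, then $S^\perp$ is a closed subspace with codimension $k-1$, so the intersection of $S_k$ and $S^\perp$ has nonzero dimension, and so there is some $x \in S_k \cap S^\perp$ with $\|x\| = 1$. As $x \in S_k$ there are $\alpha_1, \ldots, \alpha_k \in \mathbb{C}$ with $x = \sum_{n=1}^k \alpha_n e_n$, giving

\begin{align*}
\langle Ax, x \rangle &= \left\langle \sum_{n=1}^\infty \sigma_n(A) \langle x, e_n \rangle e_n, x \right\rangle \\
&= \left\langle \sum_{n=1}^\infty \sigma_n(A) \left\langle \sum_{m=1}^k \alpha_m e_m, e_n \right\rangle e_n, \sum_{m=1}^k \alpha_m e_m \right\rangle \\
&= \left\langle \sum_{n=1}^\infty \sigma_n(A) \sum_{m=1}^k \alpha_m \delta_{m,n} e_n, \sum_{m=1}^k \alpha_m e_m \right\rangle \\
&= \left\langle \sum_{n=1}^k \sigma_n(A) \alpha_n e_n, \sum_{m=1}^k \alpha_m e_m \right\rangle \\
&= \sum_{n=1}^k \sigma_n(A) |\alpha_n|^2 \\
&\ge \sigma_k(A) \sum_{n=1}^k |\alpha_n|^2 \\
&= \sigma_k(A).
\end{align*}

This shows that

\[ \sup_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle \ge \sigma_k(A). \]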

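Define $M = \sup_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle$. Because $M$ is a supremum, there is a sequence $x_n$ on the unit sphere in $S^\perp$ such that $\langle A x_n, x_n \rangle \to M$. The unit sphere in $S^\perp$ is not norm compact, because $S^\perp$ is infinite dimensional, but a bounded sequence in a Hilbert space has a weakly convergent subsequence, so there is a subsequence $x_{a(n)}$ that converges weakly to some $z \in S^\perp$ with $\|z\| \le 1$; as $A$ is compact, Theorem 10 gives $A x_{a(n)} \to Az$ in norm. As

\[ | \langle Az, z \rangle - \langle A x_n, x_n \rangle | \le | \langle Az, z - x_n \rangle | + \| Az - A x_n \| \]

and the right-hand side tends to $0$ along the subsequence $x_{a(n)}$, we get

\[ \langle Az, z \rangle = M = \sup_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle. \]

If $M = 0$ then, $A$ being positive, $\langle Ax, x \rangle = 0$ for every unit vector $x \in S^\perp$, and the supremum is attained. If $M > 0$ then $z \ne 0$, and $\left\langle A \frac{z}{\|z\|}, \frac{z}{\|z\|} \right\rangle = \frac{M}{\|z\|^2} \ge M$ forces $\|z\| = 1$. In either case

\[ \max_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle = \sup_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle \ge \sigma_k(A). \]

As this is true for any $k-1$ dimensional subspace $S$,

\[ \inf_{\dim S = k-1} \; \max_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle \ge \sigma_k(A). \]

But for $S = S_{k-1} = \bigvee_{n=1}^{k-1} \{ e_n \}$ we have $e_k \in S^\perp$, $\|e_k\| = 1$, and

\[ \langle A e_k, e_k \rangle = \sigma_k(A) \langle e_k, e_k \rangle = \sigma_k(A), \]

while $\langle Ax, x \rangle = \sum_{n=k}^\infty \sigma_n(A) |\langle x, e_n \rangle|^2 \le \sigma_k(A)$ for every unit vector $x \in S^\perp$, which implies that

\[ \min_{\dim S = k-1} \; \max_{x \in S^\perp, \|x\| = 1} \langle Ax, x \rangle = \sigma_k(A). \]

∎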
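The first min-max formula can be checked by hand in a finite dimensional analogue. The example is an added illustration: the theorem above is stated for infinite dimensional $H$, and the diagonal matrix on $\mathbb{C}^3$ with standard basis $\delta_1, \delta_2, \delta_3$ is an assumption of this sketch.

```latex
% A = diag(3, 2, 1) on \mathbb{C}^3, k = 2, so \sigma_2(A) = 2.
\[
A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\langle Ax, x \rangle = 3|x_1|^2 + 2|x_2|^2 + |x_3|^2 .
\]
% For S = \operatorname{span}\{\delta_1, \delta_2\}:
\[
\min_{x \in S, \|x\| = 1} \big( 3|x_1|^2 + 2|x_2|^2 \big) = 2 .
\]
% Any 2-dimensional S meets \operatorname{span}\{\delta_2, \delta_3\} in a unit vector x, for which
\[
\langle Ax, x \rangle = 2|x_2|^2 + |x_3|^2 \le 2 ,
\]
% so no 2-dimensional subspace does better, and \max_{\dim S = 2} \min = 2 = \sigma_2(A).
```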

If $A \in \mathscr{B}(H)$ is compact, then the eigenvalues of $|A|$ are the singular values of $A$. Therefore the Courant min-max theorem, applied to the positive compact operator $|A|$, gives expressions for the singular values of a compact linear operator on a Hilbert space, whether or not the operator is itself self-adjoint.

Allahverdiev’s theorem (I. C. Gohberg and M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators in Hilbert Space, p. 28, Theorem 2.1; cf. J. R. Retherford, Hilbert Space: Compact Operators and the Trace Theorem, p. 75 and p. 106) gives an expression for the singular values of a compact operator that does not involve orthonormal sets, unlike Courant’s min-max theorem. Thus this formula makes sense for a compact operator from one Banach space to another.

Theorem 15 (Allahverdiev’s theorem).

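Let $H$ be a Hilbert space and let $\mathscr{F}_n(H)$ be the set of bounded finite rank operators of rank at most $n$. If $A \in \mathscr{B}_0(H)$ and $n \in \mathbb{N}$, then

\[ \sigma_n(A) = \inf_{T \in \mathscr{F}_{n-1}(H)} \| A - T \|. \]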
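For a diagonal matrix the infimum in Allahverdiev’s theorem is attained by truncation. The following finite dimensional example is an added illustration; the matrices $A$ and $T$ on $\mathbb{C}^3$ are assumptions of the sketch.

```latex
% A = diag(3, 2, 1), n = 2, and the rank one truncation T = diag(3, 0, 0).
\[
A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
T = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\| A - T \| = \left\| \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right\| = 2 = \sigma_2(A) .
\]
% Likewise T = 0 gives \|A\| = 3 = \sigma_1(A), and T = diag(3, 2, 0) gives \|A - T\| = 1 = \sigma_3(A).
```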

10 Schatten class operators

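If $1 \le p < \infty$ and $A \in \mathscr{B}_0(H)$, we define

\[ \| A \|_p = \left( \sum_{n \in \mathbb{N}} \sigma_n(A)^p \right)^{1/p}, \]

and define $\mathscr{B}_p(H)$ to be those $A \in \mathscr{B}_0(H)$ with $\| A \|_p < \infty$. In other words, an element of $\mathscr{B}_p(H)$ is a compact operator whose sequence of singular values is an element of $\ell^p$. We call an element of $\mathscr{B}_p(H)$ a Schatten class operator. We call elements of $\mathscr{B}_1(H)$ trace class operators and elements of $\mathscr{B}_2(H)$ Hilbert-Schmidt operators.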
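A standard example separates the trace class from the Hilbert-Schmidt class. It is added here as an illustration; the positive diagonal operator $D$ and the orthonormal sequence $e_n$ are assumptions of the sketch.

```latex
% D = \sum_n \tfrac{1}{n}\, e_n \otimes e_n is positive and compact with \sigma_n(D) = 1/n.
\[
\| D \|_2 = \left( \sum_{n=1}^\infty \frac{1}{n^2} \right)^{1/2} = \frac{\pi}{\sqrt{6}} < \infty ,
\qquad
\| D \|_1 = \sum_{n=1}^\infty \frac{1}{n} = \infty .
\]
% So D is a Hilbert-Schmidt operator that is not trace class;
% more generally D \in \mathscr{B}_p(H) exactly when p > 1.
```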

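If $A \in \mathscr{B}_0(H)$ is positive, then, according to Theorem 6, there is an orthonormal set $\{ e_n : n \in \mathbb{N} \}$ such that

\[ A = \sum_{n \in \mathbb{N}} \lambda_n(A) \, e_n \otimes e_n, \]

where the series converges in the strong operator topology. As the $e_n$ are orthonormal, we have

\[ A^p = \sum_{n \in \mathbb{N}} \lambda_n(A)^p \, e_n \otimes e_n, \]

which is itself a positive compact operator, and thus $\sigma_n(A^p) = \sigma_n(A)^p$ for $n \in \mathbb{N}$. Therefore, if $A$ is a positive compact operator, then $\| A \|_p = \| A^p \|_1^{1/p}$.

If $A \in \mathscr{B}_0(H)$ and $n \in \mathbb{N}$, then $\sigma_n(|A|) = \sigma_n(A)$ and $\sigma_n(A^*) = \sigma_n(A)$. Hence, if $1 \le p < \infty$ then

\[ \| |A| \|_p = \| A \|_p, \qquad \| A^* \|_p = \| A \|_p. \]

As $|A|$ is compact and self-adjoint, it has an eigenvalue with absolute value $\| |A| \| = \| A \|$, from which it follows that if $1 \le p < \infty$ then $\| A \| \le \| A \|_p$.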

Theorem 16.

If $A \in \mathscr{B}_0(H)$, $B \in \mathscr{B}(H)$, and $k \in \mathbb{N}$, then

\[ \sigma_k(BA) \le \| B \| \sigma_k(A). \]

Proof.

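For all $x \in H$,

\begin{align*}
\langle (BA)^* BA x, x \rangle &= \langle BAx, BAx \rangle \\
&= \| BAx \|^2 \\
&\le \| B \|^2 \| Ax \|^2 \\
&= \| B \|^2 \langle Ax, Ax \rangle \\
&= \| B \|^2 \langle A^* A x, x \rangle.
\end{align*}

Applying the Courant min-max theorem to the positive operators $(BA)^* BA$ and $A^* A$, if $k \in \mathbb{N}$ then

\begin{align*}
\sigma_k((BA)^* BA) &= \max_{\dim S = k} \; \min_{x \in S, \|x\| = 1} \langle (BA)^* BA x, x \rangle \\
&\le \| B \|^2 \max_{\dim S = k} \; \min_{x \in S, \|x\| = 1} \langle A^* A x, x \rangle \\
&= \| B \|^2 \sigma_k(A^* A).
\end{align*}

But

\[ \sigma_k((BA)^* BA) = \sigma_k(|BA|^2) = \sigma_k(|BA|)^2 = \sigma_k(BA)^2 \]

and

\[ \sigma_k(A^* A) = \sigma_k(|A|^2) = \sigma_k(|A|)^2 = \sigma_k(A)^2, \]

so taking the square root,

\[ \sigma_k(BA) \le \| B \| \sigma_k(A). \]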

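Using Theorem 16, if $1 \le p < \infty$ then

\[ \| BA \|_p = \left( \sum_{n \in \mathbb{N}} \sigma_n(BA)^p \right)^{1/p} \le \left( \sum_{n \in \mathbb{N}} \| B \|^p \sigma_n(A)^p \right)^{1/p} = \| B \| \| A \|_p. \]

The following theorem states that the Schatten class operators are Banach spaces (Gert K. Pedersen, Analysis Now, revised printing, p. 124, E 3.4.4).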

Theorem 17.

If $1 \le p < \infty$, then $\mathscr{B}_p(H)$ is a Banach space with the norm $\| \cdot \|_p$.

11 Weyl’s inequality

Weyl’s inequality relates the eigenvalues of a self-adjoint compact operator with its singular values (Peter D. Lax, Functional Analysis, p. 336, chapter 30, Lemma 7). We use the notation from Definition 5. For $N > \nu(A)$ the left hand side is equal to $0$ so the inequality is certainly true then.

Theorem 18 (Weyl’s inequality).

If $A \in \mathscr{B}_0(H)$ is self-adjoint and $N \le \nu(A)$, then

\[ \prod_{n=1}^N |\lambda_n(A)| \le \prod_{n=1}^N \sigma_n(A). \]

Proof.

Let

\[ E_N = \bigvee_{n=1}^N \ker (A - \lambda_n(A) \mathrm{id}_H), \]

which is finite dimensional. Check that $E_N$ is an invariant subspace of $A$, and let $A_N : E_N \to E_N$ be the restriction of $A$ to $E_N$. $A_N$ is a self-adjoint operator. As $E_N$ is spanned by eigenvectors for nonzero eigenvalues of $A$ it follows that $\ker A_N = \{0\}$, and as $E_N$ is finite dimensional, we get that $A_N$ is invertible. If $A_N$ has polar decomposition $A_N = U_N |A_N|$, then $U_N$ is invertible; if a partial isometry is invertible then it is unitary, so $U_N$ is unitary, and therefore the eigenvalues of $U_N$ all have absolute value $1$. As the determinant of a linear operator on a finite dimensional vector space is the product of its eigenvalues counting algebraic multiplicity,

\[ \det |A_N| = \frac{1}{|\det U_N|} |\det A_N| = |\det A_N| = \prod_{n=1}^N |\lambda_n(A)|. \tag{1} \]

Let $P_N$ be the orthogonal projection onto $E_N$. If $v \in E_N$, then $A P_N v = A_N v$, and if $v \in E_N^\perp$ then $A P_N v = A(0) = 0$. We get that

\[ |A P_N| v = \begin{cases} |A_N| v & v \in E_N \\ 0 & v \in E_N^\perp, \end{cases} \]

and it follows that if $1 \le n \le N$ then $\sigma_n(A_N) = \sigma_n(A P_N)$. Using Theorem 16 together with $\sigma_n(A P_N) = \sigma_n((A P_N)^*) = \sigma_n(P_N A)$ we get

\[ \sigma_n(A P_N) \le \| P_N \| \sigma_n(A) \le \sigma_n(A); \]

the second inequality is an equality unless $P_N = 0$. We have shown that if $1 \le n \le N$ then $\sigma_n(A_N) \le \sigma_n(A)$, and combining this with (1) gives us

\[ \prod_{n=1}^N |\lambda_n(A)| = \det |A_N| = \prod_{n=1}^N \sigma_n(A_N) \le \prod_{n=1}^N \sigma_n(A). \]

∎

Theorem 19.

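If $0 < p < \infty$, $A \in \mathscr{B}_0(H)$ is self-adjoint, and $N \in \mathbb{N}$, then

\[ \sum_{n=1}^N |\lambda_n(A)|^p \le \sum_{n=1}^N \sigma_n(A)^p. \]

Proof.

Schur’s majorization inequality (Peter D. Lax, Functional Analysis, p. 337, chapter 30, Lemma 8; cf. J. Michael Steele, The Cauchy-Schwarz Master Class, p. 201, Problem 13.4) states that if $a_1 \ge a_2 \ge \cdots$ and $b_1 \ge b_2 \ge \cdots$ are nonincreasing sequences of real numbers satisfying, for each $N$,

\[ \sum_{n=1}^N a_n \le \sum_{n=1}^N b_n, \]

and $\phi : \mathbb{R} \to \mathbb{R}$ is a convex function with $\lim_{x \to -\infty} \phi(x) = 0$, then for every $N$,

\[ \sum_{n=1}^N \phi(a_n) \le \sum_{n=1}^N \phi(b_n). \]

With the hypotheses of Theorem 18, for $1 \le n \le \nu(A)$, define $a_n = \log |\lambda_n(A)|$ and $b_n = \log \sigma_n(A)$ and let $\phi(x) = e^{px}$. By Theorem 18 these satisfy the conditions of Schur’s majorization inequality, which then gives us for $1 \le N \le \nu(A)$ that

\[ \sum_{n=1}^N |\lambda_n(A)|^p \le \sum_{n=1}^N \sigma_n(A)^p. \]

If $n > \nu(A)$ then $\lambda_n(A) = 0$. ∎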

12 Rayleigh quotients for self-adjoint operators

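If $A \in \mathscr{B}(H)$ is self-adjoint, we define the Rayleigh quotient of $A$ by

\[ f(x) = \frac{\langle Ax, x \rangle}{\langle x, x \rangle}, \qquad x \in H, \; x \ne 0, \qquad f : H \setminus \{0\} \to \mathbb{R}. \]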

Let $X$ and $Y$ be normed spaces, $U$ an open subset of $X$, and $f : U \to Y$ a function. If $x \in U$ and there is some $T \in \mathscr{B}(X,Y)$ such that

\[ \lim_{h \to 0} \frac{\| f(x+h) - f(x) - Th \|}{\| h \|} = 0, \tag{2} \]

then $f$ is said to be Fréchet differentiable at $x$, and $T$ is called the Fréchet derivative of $f$ at $x$ (Ward Cheney, Analysis for Applied Mathematics, p. 149); it does not take long to prove that if $T_1, T_2 \in \mathscr{B}(X,Y)$ both satisfy (2) then $T_1 = T_2$, and thus we may speak about the Fréchet derivative of $f$ at $x$. We denote the Fréchet derivative of $f$ at $x$ by $(Df)_x$. $Df$ is a map from the set of all points at which $f$ is Fréchet differentiable to $\mathscr{B}(X,Y)$.

To say that $x$ is a stationary point of $f$ is to say that $f$ is Fréchet differentiable at $x$ and that the Fréchet derivative of $f$ at $x$ is the zero map.

Theorem 20.

If $A \in \mathscr{B}(H)$ is self-adjoint, then each eigenvector of $A$ is a stationary point of the Rayleigh quotient of $A$.

Proof.

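If $\lambda$ is an eigenvalue of $A$ then, as $A$ is self-adjoint, $\lambda \in \mathbb{R}$. Let $v \ne 0$ satisfy $Av = \lambda v$. We have

\[ f(v) = \frac{\langle Av, v \rangle}{\langle v, v \rangle} = \frac{\lambda \langle v, v \rangle}{\langle v, v \rangle} = \lambda. \]

For $h \ne 0$ with $v + h \ne 0$, using that $A$ is self-adjoint and that $\lambda \in \mathbb{R}$,

\begin{align*}
\frac{| f(v+h) - f(v) - 0h |}{\| h \|} &= \frac{1}{\| h \|} \left| \frac{\langle A(v+h), v+h \rangle}{\langle v+h, v+h \rangle} - \lambda \right| \\
&= \frac{1}{\| h \| \| v+h \|^2} \left| \langle A(v+h), v+h \rangle - \lambda \langle v+h, v+h \rangle \right| \\
&= \frac{1}{\| h \| \| v+h \|^2} \big| \langle Av, v \rangle + \langle Av, h \rangle + \langle Ah, v \rangle + \langle Ah, h \rangle \\
&\qquad - \lambda \langle v, v \rangle - \lambda \langle v, h \rangle - \lambda \langle h, v \rangle - \lambda \langle h, h \rangle \big| \\
&= \frac{1}{\| h \| \| v+h \|^2} \big| \lambda \langle v, v \rangle + \lambda \langle v, h \rangle + \langle h, \lambda v \rangle + \langle Ah, h \rangle \\
&\qquad - \lambda \langle v, v \rangle - \lambda \langle v, h \rangle - \lambda \langle h, v \rangle - \lambda \langle h, h \rangle \big| \\
&= \frac{1}{\| h \| \| v+h \|^2} \left| \langle Ah, h \rangle - \lambda \langle h, h \rangle \right| \\
&= \frac{1}{\| h \| \| v+h \|^2} \left| \langle Ah - \lambda h, h \rangle \right|.
\end{align*}

Therefore

\begin{align*}
\frac{| f(v+h) - f(v) - 0h |}{\| h \|} &\le \frac{\| Ah - \lambda h \| \| h \|}{\| h \| \| v+h \|^2} \\
&= \frac{\| (A - \lambda \mathrm{id}_H) h \|}{\| v+h \|^2} \\
&\le \frac{\| A - \lambda \mathrm{id}_H \| \| h \|}{\| v+h \|^2}.
\end{align*}

As $h \to 0$ the right-hand side tends to $0$ (one of the terms tends to $0$, one doesn’t depend on $h$, and the denominator is bounded below in terms just of $v$ for sufficiently small $h$), showing that $0$ is the Fréchet derivative of $f$ at $v$. ∎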
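The second order behaviour of the Rayleigh quotient at an eigenvector is visible in a two dimensional example. It is added here as an illustration; the matrix $A$ on $\mathbb{C}^2$ and the eigenvector $v$ are assumptions of the sketch.

```latex
% A = diag(2, 1), eigenvector v = (1, 0) with eigenvalue \lambda = 2, perturbation h = (h_1, h_2).
\[
f(v + h) - f(v)
= \frac{2|1 + h_1|^2 + |h_2|^2}{|1 + h_1|^2 + |h_2|^2} - 2
= \frac{-|h_2|^2}{|1 + h_1|^2 + |h_2|^2} .
\]
% Hence
\[
\frac{| f(v+h) - f(v) |}{\| h \|} \le \frac{\| h \|}{\| v + h \|^2} \to 0 \quad\text{as } h \to 0 ,
\]
% in agreement with the bound in the proof: the change in f is quadratic in h,
% so the Fr\'echet derivative of f at v is the zero map.
```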