# The singular value decomposition of compact operators on Hilbert spaces

Jordan Bell
April 3, 2014

## 1 Preliminaries

The purpose of these notes is to present material about compact operators on Hilbert spaces that is special to Hilbert spaces, rather than what applies to all Banach spaces. We use statements about compact operators on Banach spaces without proof. For instance, any compact operator from one Banach space to another has separable image; the set of compact operators from one Banach space to another is a closed subspace of the set of all bounded linear operators; a Banach space is reflexive if and only if the closed unit ball is weakly compact (Kakutani’s theorem); etc. We do however state precisely each result that we are using for Banach spaces and show that its hypotheses are satisfied.

Let $\mathbb{N}$ be the set of positive integers. We say that a set is countable if it is bijective with a subset of $\mathbb{N}$. In this note I do not presume unless I say so that any set is countable or that any Hilbert space is separable. A neighborhood of a point in a topological space is a set that contains an open set that contains the point; one reason why it can be handy to speak about neighborhoods of a point rather than just open sets that contain the point is that the set of all neighborhoods of a point is a filter, whereas it is unlikely that the set of all open sets that contain a point is a filter. If $z\in\mathbb{C}$, we denote $z^{*}=\overline{z}$.

## 2 Bounded linear operators

An advantage of working with normed spaces rather than merely topological vector spaces is that continuous linear maps between normed spaces have a simple characterization. If $X$ and $Y$ are normed spaces and $T:X\to Y$ is linear, the operator norm of $T$ is

 $\left\|T\right\|=\sup_{\left\|x\right\|\leq 1}\left\|Tx\right\|.$

If $\left\|T\right\|<\infty$, then we say that $T$ is bounded.
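As a small numerical illustration of this definition, the following sketch estimates the operator norm of a linear map on $\mathbb{R}^{2}$ by sampling unit vectors. The function names and the sample matrix are hypothetical choices made for illustration, and the sketch uses the fact that for a linear map the supremum over $\left\|x\right\|\leq 1$ is attained on $\left\|x\right\|=1$.

```python
import math

def apply(A, x):
    """Apply a 2x2 real matrix A (a list of rows) to a vector x in R^2."""
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])

def operator_norm_estimate(A, samples=3600):
    """Estimate sup ||Ax|| over sampled unit vectors x = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        y = apply(A, (math.cos(t), math.sin(t)))
        best = max(best, math.hypot(y[0], y[1]))
    return best

# Illustrative matrix: diag(3, 1) stretches the first coordinate by 3.
A = [[3.0, 0.0], [0.0, 1.0]]
estimate = operator_norm_estimate(A)
```

Sampling only gives a lower bound for the supremum in general; it is adequate here because the map is continuous on the unit circle.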

###### Theorem 1.

If $X$ and $Y$ are normed spaces, a linear map $T:X\to Y$ is continuous if and only if it is bounded.

###### Proof.

Suppose that $T$ is continuous. In particular $T$ is continuous at $0$, so there is some $\delta>0$ such that if $\left\|x\right\|\leq\delta$ then $\left\|Tx\right\|=\left\|Tx-T0\right\|\leq 1$. If $x\neq 0$ then, as $T$ is linear,

 $\left\|Tx\right\|=\frac{1}{\delta}\left\|x\right\|\left\|T\left(\frac{\delta}{\left\|x\right\|}x\right)\right\|\leq\frac{1}{\delta}\left\|x\right\|.$

Thus $\left\|T\right\|\leq\frac{1}{\delta}<\infty$, so $T$ is bounded.

Suppose that $T$ is bounded. If $\left\|T\right\|=0$ then $T=0$, which is continuous, so suppose that $\left\|T\right\|>0$. Let $x_{0}\in X$, and let $\epsilon>0$. If $\left\|x-x_{0}\right\|\leq\frac{\epsilon}{\left\|T\right\|}$, then

 $\left\|Tx-Tx_{0}\right\|=\left\|T(x-x_{0})\right\|\leq\left\|T\right\|\left\|x-x_{0}\right\|\leq\left\|T\right\|\cdot\frac{\epsilon}{\left\|T\right\|}=\epsilon.$

Hence $T$ is continuous at $x_{0}$, and so $T$ is continuous. ∎

If $X$ and $Y$ are normed spaces, we denote by $\mathscr{B}(X,Y)$ the set of bounded linear maps $X\to Y$. It is straightforward to check that $\mathscr{B}(X,Y)$ is a normed space with the operator norm. One proves that if $Y$ is a Banach space, then $\mathscr{B}(X,Y)$ is a Banach space (Walter Rudin, Functional Analysis, second ed., p. 92, Theorem 4.1), and if $X$ is a Banach space one then checks that $\mathscr{B}(X)=\mathscr{B}(X,X)$ is a Banach algebra. If $X$ is a normed space, we define $X^{*}=\mathscr{B}(X,\mathbb{C})$, which is a Banach space, called the dual space of $X$.

If $X$ and $Y$ are normed spaces and $T:X\to Y$ is linear, we say that $T$ has finite rank if

 $\mathrm{rank\,}T=\dim T(X)$

is finite. If $X$ is infinite dimensional and $Y\neq\{0\}$, let $\mathscr{E}$ be a Hamel basis for $X$, let $\{e_{n}:n\in\mathbb{N}\}$ be a countably infinite subset of $\mathscr{E}$ with the $e_{n}$ distinct, and let $y\in Y$ be nonzero. If we define $T:X\to Y$ by $Te_{n}=n\left\|e_{n}\right\|y$ and $Te=0$ if $e\in\mathscr{E}\setminus\{e_{n}:n\in\mathbb{N}\}$, then $T$ is a linear map with finite rank, yet $T$ is unbounded because $\left\|Te_{n}\right\|/\left\|e_{n}\right\|=n\left\|y\right\|$ is unbounded in $n$. Thus a finite rank linear map is not necessarily bounded. We denote by $\mathscr{B}_{00}(X,Y)$ the set of bounded finite rank linear maps $X\to Y$, and check that $\mathscr{B}_{00}(X,Y)$ is a vector space. If $X$ is a Banach space, one checks that $\mathscr{B}_{00}(X)$ is an ideal of the algebra $\mathscr{B}(X)$ (if we either pre- or postcompose a linear map with a finite rank linear map, the image will be finite dimensional).

If $X$ and $Y$ are Banach spaces, we say that $T:X\to Y$ is compact if the image of any bounded set under $T$ is precompact (has compact closure). One checks that if a linear map is compact then it is bounded (unlike a finite rank linear map, which is not necessarily bounded). There are several ways to state that a linear map is compact that one proves are equivalent: $T$ is compact if and only if the image of the closed unit ball is precompact; $T$ is compact if and only if the image of the open unit ball is precompact; $T$ is compact if and only if the image under it of any bounded sequence has a convergent subsequence. In a complete metric space, the Heine-Borel theorem asserts that a set is precompact if and only if it is totally bounded (for any $\epsilon>0$, the set can be covered by a finite number of balls of radius $\epsilon$). We denote by $\mathscr{B}_{0}(X,Y)$ the set of compact linear maps $X\to Y$, and it is straightforward to check that this is a vector space. One proves that if an operator is in the closure of the compact operators then the image of the closed unit ball under it is totally bounded, and from this it follows that $\mathscr{B}_{0}(X,Y)$ is a closed subspace of $\mathscr{B}(X,Y)$. $\mathscr{B}_{0}(X)$ is an ideal of the algebra $\mathscr{B}(X)$: if $K\in\mathscr{B}_{0}(X)$ and $T\in\mathscr{B}(X)$, one checks that $TK\in\mathscr{B}_{0}(X)$ and $KT\in\mathscr{B}_{0}(X)$.

Let $X$ and $Y$ be Banach spaces. Using the fact that a bounded set in a finite dimensional normed vector space is precompact, we can prove that a bounded finite rank operator is compact: $\mathscr{B}_{00}(X,Y)\subseteq\mathscr{B}_{0}(X,Y)$. Also, it doesn’t take long to prove that the image of a compact operator is separable: if $T\in\mathscr{B}_{0}(X,Y)$ then $T(X)$ has a countable dense subset. (We can prove this using the fact that a compact metric space is separable.)

If $H$ is a Hilbert space and $S_{i},i\in I$ are subsets of $H$, we define $\bigvee_{i\in I}S_{i}$ to be the closure of the span of $\bigcup_{i\in I}S_{i}$. We say that a subset $\mathscr{E}$ of $H$ is an orthonormal basis for $H$ if $\left\langle e,f\right\rangle=\delta_{e,f}$ for all $e,f\in\mathscr{E}$ and $H=\bigvee\mathscr{E}$.

A sesquilinear form on $H$ is a function $f:H\times H\to\mathbb{C}$ that is linear in its first argument and conjugate-linear in its second argument. If $f$ is a sesquilinear form on $H$, we say that $f$ is bounded if

 $\sup\{|f(x,y)|:\left\|x\right\|,\left\|y\right\|\leq 1\}<\infty.$

The Riesz representation theorem (Walter Rudin, Functional Analysis, second ed., p. 310, Theorem 12.8) states that if $f$ is a bounded sesquilinear form on $H$, then there is a unique $B\in\mathscr{B}(H)$ such that

 $f(x,y)=\left\langle x,By\right\rangle,\qquad x,y\in H,$

and $\left\|B\right\|=\sup\{|f(x,y)|:\left\|x\right\|,\left\|y\right\|\leq 1\}$. It follows from the Riesz representation that if $A\in\mathscr{B}(H)$, then there is a unique $A^{*}\in\mathscr{B}(H)$ such that

 $\left\langle Ax,y\right\rangle=\left\langle x,A^{*}y\right\rangle,\qquad x,y\in H,$

and $\left\|A^{*}\right\|=\left\|A\right\|$. $\mathscr{B}(H)$ is a $C^{*}$-algebra: if $A,B\in\mathscr{B}(H)$ and $\lambda\in\mathbb{C}$ then $A^{**}=A$, $(A+B)^{*}=A^{*}+B^{*}$, $(AB)^{*}=B^{*}A^{*}$, $(\lambda A)^{*}=\lambda^{*}A^{*}$, and $\left\|A^{*}A\right\|=\left\|A\right\|^{2}$.
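In finite dimensions the adjoint is the conjugate transpose, and the defining identity $\left\langle Ax,y\right\rangle=\left\langle x,A^{*}y\right\rangle$ can be checked directly. The sketch below does this on $\mathbb{C}^{2}$ with an inner product that is linear in the first argument, matching the convention used here; the matrix, the vectors, and the helper names are arbitrary illustrative choices.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply a matrix A (a list of rows) to a vector x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Arbitrary sample data.
A = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 3j]

lhs = inner(apply(A, x), y)           # <Ax, y>
rhs = inner(x, apply(adjoint(A), y))  # <x, A*y>
```

The two values `lhs` and `rhs` agree up to rounding, and applying `adjoint` twice returns the original matrix, which is the identity $A^{**}=A$.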

We say that $A\in\mathscr{B}(H)$ is normal if $A^{*}A=AA^{*}$, and self-adjoint if $A^{*}=A$. One proves using the polarization identity that $A\in\mathscr{B}(H)$ is self-adjoint if and only if $\left\langle Ax,x\right\rangle\in\mathbb{R}$ for all $x\in H$. If $A\in\mathscr{B}(H)$ is self-adjoint, we say that $A$ is positive if $\left\langle Ax,x\right\rangle\geq 0$ for all $x\in H$.

## 3 Spectrum in Banach spaces

If $X$ and $Y$ are Banach spaces and $T\in\mathscr{B}(X,Y)$ is a bijection, then its inverse function $T^{-1}:Y\to X$ is linear, since the inverse of a linear bijection is itself linear. Because $T$ is a surjective bounded linear map, by the open mapping theorem it is an open map: if $U$ is an open subset of $X$ then $T(U)$ is an open subset of $Y$, and it follows that $T^{-1}\in\mathscr{B}(Y,X)$. That is, if a bounded linear operator from one Banach space to another is bijective then its inverse function is also a bounded linear operator.

If $X$ is a Banach space and $T\in\mathscr{B}(X)$, the spectrum $\sigma(T)$ of $T$ is the set of those $\lambda\in\mathbb{C}$ such that the map $T-\lambda\mathrm{id}_{X}:X\to X$ is not a bijection. One proves that $\sigma(T)$ is nonempty (the proof uses Liouville’s theorem, which states that a bounded entire function is constant). One also proves that if $\lambda\in\sigma(T)$ then $|\lambda|\leq\left\|T\right\|$. We define the spectral radius of $T$ to be

 $r(T)=\sup_{\lambda\in\sigma(T)}|\lambda|,$

and so $r(T)\leq\left\|T\right\|$. Because $\mathscr{B}_{0}(X)$ is an ideal in the algebra $\mathscr{B}(X)$, if $T\in\mathscr{B}_{0}(X)$ is invertible then $\mathrm{id}_{X}$ is compact. One checks that if $\mathrm{id}_{X}$ is compact then $X$ is finite dimensional (a locally compact topological vector space is finite dimensional), and therefore, if $X$ is an infinite dimensional Banach space and $T\in\mathscr{B}_{0}(X)$, then $0\in\sigma(T)$.

The resolvent set of $T$ is $\rho(T)=\mathbb{C}\setminus\sigma(T)$. One proves that $\rho(T)$ is open (Gert K. Pedersen, Analysis Now, revised printing, p. 131, Theorem 4.1.13), from which it then follows that $\sigma(T)$ is a compact set. For $\lambda\in\rho(T)$, we define

 $R(\lambda,T)=(T-\lambda\mathrm{id}_{X})^{-1}\in\mathscr{B}(X),$

called the resolvent of $T$.

If $X$ is a Banach space and $T\in\mathscr{B}_{0}(X)$, then the point spectrum $\sigma_{\mathrm{point}}(T)$ of $T$ is the set of those $\lambda\in\mathbb{C}$ such that $T-\lambda\mathrm{id}_{X}$ is not injective. In other words, to say that $\lambda\in\sigma_{\mathrm{point}}(T)$ is to say that

 $\dim\ker(T-\lambda\mathrm{id}_{X})>0.$

If $\lambda\in\sigma_{\mathrm{point}}(T)$, we say that $\lambda$ is an eigenvalue of $T$, and call $\dim\ker(T-\lambda\mathrm{id}_{X})$ its geometric multiplicity. (There is also a notion of algebraic multiplicity of an eigenvalue: for an operator $A$ on a Hilbert space $H$, the algebraic multiplicity of $\lambda$ is defined to be $\sup_{n\in\mathbb{N}}\dim\ker((A-\lambda\mathrm{id}_{H})^{n})$. For self-adjoint operators this is equal to the geometric multiplicity of $\lambda$, while for an operator that is not self-adjoint the algebraic multiplicity of an eigenvalue may be greater than its geometric multiplicity. Nonzero elements of $\ker((A-\lambda\mathrm{id}_{H})^{n})$ are called generalized eigenvectors or root vectors. See I. C. Gohberg and M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators in Hilbert Space.) It is a fact that each nonzero eigenvalue of $T$ has finite geometric multiplicity, and it is also a fact that if $T\in\mathscr{B}_{0}(X)$, then $\sigma_{\mathrm{point}}(T)$ is a bounded countable set and that if $\sigma_{\mathrm{point}}(T)$ has a limit point that limit point is $0$. The Fredholm alternative tells us that

 $\sigma(T)\subseteq\sigma_{\mathrm{point}}(T)\cup\{0\}.$

If $X$ is infinite dimensional then $\sigma(T)=\sigma_{\mathrm{point}}(T)\cup\{0\}$, and $\sigma_{\mathrm{point}}(T)$ might or might not include $0$. If $T\in\mathscr{B}_{00}(X)$, check that $\sigma_{\mathrm{point}}(T)$ is a finite set.
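A standard worked example, not taken from the text above, may help fix these statements. Take $X=\ell^{2}(\mathbb{N})$ with its standard orthonormal basis $(e_{n})$ and let $T$ be the diagonal operator that scales $e_{n}$ by $\frac{1}{n}$; it is compact because it is the norm limit of its finite rank truncations.

```latex
\[
T e_n = \tfrac{1}{n}\, e_n, \qquad
\sigma_{\mathrm{point}}(T) = \left\{ \tfrac{1}{n} : n \in \mathbb{N} \right\}, \qquad
\sigma(T) = \sigma_{\mathrm{point}}(T) \cup \{0\}.
\]
```

Here $0$ is the only limit point of the point spectrum and is not itself an eigenvalue, since $Tx=0$ forces every coordinate of $x$ to vanish. It still belongs to $\sigma(T)$ because $T$ is not surjective: the sequence $(\frac{1}{n})_{n}$ lies in $\ell^{2}$, but its only candidate preimage $(1,1,1,\ldots)$ does not.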

If $H$ is a Hilbert space, using the fact that a bounded linear operator $T\in\mathscr{B}(H)$ is invertible if and only if both $TT^{*}$ and $T^{*}T$ are bounded below ($S$ is bounded below if there is some $c>0$ such that $\left\|Sx\right\|\geq c\left\|x\right\|$ for all $x\in H$), one can prove that the spectrum of a bounded self-adjoint operator is a set of real numbers, and the spectrum of a bounded positive operator is a set of nonnegative real numbers.

If $H$ is a Hilbert space and $A\in\mathscr{B}(H)$, the numerical range of $A$ is the set $\{\left\langle Ax,x\right\rangle:\left\|x\right\|=1\}$. (The Toeplitz-Hausdorff theorem states that the numerical range of any bounded linear operator is a convex set; see Paul R. Halmos, A Hilbert Space Problem Book, Problem 166.) The closure of the numerical range contains the spectrum of $A$ (Paul R. Halmos, A Hilbert Space Problem Book, Problem 169; if $A$ is normal, then the closure of its numerical range is the convex hull of the spectrum: Problem 171). The numerical radius $w(A)$ of $A$ is the supremum of the absolute values of the elements of the numerical range of $A$: $w(A)=\sup_{\left\|x\right\|=1}|\left\langle Ax,x\right\rangle|$. If $A\in\mathscr{B}(H)$ is self-adjoint, one can prove that (John B. Conway, A Course in Functional Analysis, second ed., p. 34, Proposition 2.13)

 $w(A)=\left\|A\right\|.$

The following theorem asserts that a compact self-adjoint operator $A$ has an eigenvalue whose absolute value is equal to the norm of the operator. Thus in particular, the spectral radius of a compact self-adjoint operator is equal to its numerical radius. Since a self-adjoint operator has real spectrum, to say that $|\lambda|=\left\|A\right\|$ is to say that either $\lambda=\left\|A\right\|$ or $\lambda=-\left\|A\right\|$. A compact operator on a Hilbert space can have empty point spectrum (e.g. the Volterra operator on $L^{2}([0,1])$) and a bounded self-adjoint operator can have empty point spectrum (e.g. the multiplication operator $T\phi(t)=t\phi(t)$ on $L^{2}([0,1])$), but this theorem shows that if an operator is compact and self-adjoint then its point spectrum is nonempty.

###### Theorem 2.

If $A\in\mathscr{B}(H)$ is compact and self-adjoint then at least one of $-\left\|A\right\|,\left\|A\right\|$ is an eigenvalue of $A$.

###### Proof.

Because $A$ is self-adjoint, $\left\|A\right\|=w(A)=\sup_{\left\|x\right\|=1}|\left\langle Ax,x\right\rangle|$. If $A=0$ then every nonzero vector is an eigenvector for the eigenvalue $0=\left\|A\right\|$, so suppose that $A\neq 0$. As $A$ is self-adjoint, $\left\langle Ax,x\right\rangle$ is a real number, and thus either $\left\|A\right\|=\sup_{\left\|x\right\|=1}\left\langle Ax,x\right\rangle$ or $\left\|A\right\|=-\inf_{\left\|x\right\|=1}\left\langle Ax,x\right\rangle$. In the first case, due to $\left\|A\right\|$ being a supremum there is a sequence $x_{n}$, all with norm $1$, such that $\left\langle Ax_{n},x_{n}\right\rangle\to\left\|A\right\|$. Using $A=A^{*}$ and $\left\|x_{n}\right\|=1$,

 $\begin{aligned}
\left\langle Ax_{n}-\left\|A\right\|x_{n},Ax_{n}-\left\|A\right\|x_{n}\right\rangle
&=\left\langle Ax_{n},Ax_{n}\right\rangle-\left\langle Ax_{n},\left\|A\right\|x_{n}\right\rangle-\left\langle\left\|A\right\|x_{n},Ax_{n}\right\rangle+\left\langle\left\|A\right\|x_{n},\left\|A\right\|x_{n}\right\rangle\\
&=\left\|Ax_{n}\right\|^{2}-2\left\|A\right\|\left\langle Ax_{n},x_{n}\right\rangle+\left\|A\right\|^{2}\left\|x_{n}\right\|^{2}\\
&\leq\left\|A\right\|^{2}\left\|x_{n}\right\|^{2}-2\left\|A\right\|\left\langle Ax_{n},x_{n}\right\rangle+\left\|A\right\|^{2}\left\|x_{n}\right\|^{2}\\
&=2\left\|A\right\|^{2}-2\left\|A\right\|\left\langle Ax_{n},x_{n}\right\rangle\\
&\to 2\left\|A\right\|^{2}-2\left\|A\right\|\left\|A\right\|\\
&=0.
\end{aligned}$

Therefore, as $n\to\infty$, the sequence $Ax_{n}-\left\|A\right\|x_{n}$ tends to $0$. Using that $A$ is compact, there is a subsequence $x_{a(n)}$ such that $Ax_{a(n)}$ converges to some $y\in H$. Then $\left\|A\right\|x_{a(n)}=Ax_{a(n)}-(Ax_{a(n)}-\left\|A\right\|x_{a(n)})\to y$, so $x_{a(n)}$ converges to $x=\frac{y}{\left\|A\right\|}$, and $\left\|x\right\|=1$ because each $x_{a(n)}$ has norm $1$. As $A$ is continuous, $Ax=\lim Ax_{a(n)}=y=\left\|A\right\|x$, i.e. $x$ is an eigenvector for the eigenvalue $\left\|A\right\|$. If $\left\|A\right\|=-\inf_{\left\|x\right\|=1}\left\langle Ax,x\right\rangle$, the same argument applied to $-A$ shows that $-\left\|A\right\|$ is an eigenvalue of $A$. ∎
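The finite dimensional case of Theorem 2 can be checked numerically. The sketch below takes a real symmetric $2\times 2$ matrix (every operator on a finite dimensional space is compact), computes its eigenvalues in closed form, and compares the larger absolute value with a sampled estimate of the operator norm. The matrix and the function names are illustrative assumptions, not part of the text.

```python
import math

def sym_eigenvalues(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius, mean - radius

def norm_by_sampling(a, b, d, samples=20000):
    """Approximate ||A|| = sup ||Ax|| over unit vectors x = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples  # x and -x give the same value of ||Ax||
        c, s = math.cos(t), math.sin(t)
        best = max(best, math.hypot(a * c + b * s, b * c + d * s))
    return best

# A = [[1, 2], [2, -2]] has eigenvalues 2 and -3.
hi, lo = sym_eigenvalues(1.0, 2.0, -2.0)
norm = norm_by_sampling(1.0, 2.0, -2.0)
```

For this matrix $\left\|A\right\|=3$, and it is $-\left\|A\right\|$ rather than $\left\|A\right\|$ that is an eigenvalue, which is why the theorem only asserts that at least one of the two occurs.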

## 5 Polar decomposition

If $H$ is a Hilbert space and $P\in\mathscr{B}(H)$ is positive, there is a unique positive element of $\mathscr{B}(H)$, denoted $P^{1/2}$, satisfying $(P^{1/2})^{2}=P$, which we call the positive square root of $P$ (Gert K. Pedersen, Analysis Now, revised printing, p. 92, Proposition 3.2.11). If $A\in\mathscr{B}(H)$ one checks that $A^{*}A$ is positive, and hence $A^{*}A$ has a positive square root, which we denote by $|A|$ and call the absolute value of $A$. One proves that $|A|$ is the unique positive operator in $\mathscr{B}(H)$ satisfying

 $\left\|Ax\right\|=\left\||A|x\right\|,\qquad x\in H.$

An element $U$ of $\mathscr{B}(H)$ is said to be a partial isometry if there is a closed subspace $X$ of $H$ such that the restriction of $U$ to $X$ is an isometry $X\to U(X)$ and $\ker U=X^{\perp}$. One proves that $U^{*}U$ is the orthogonal projection of $H$ onto $X$. It can be proved that if $A\in\mathscr{B}(H)$ then there is a unique partial isometry $U$ satisfying both $\ker U=\ker A$ and $A=U|A|$ (Gert K. Pedersen, Analysis Now, revised printing, p. 96, Theorem 3.2.17). This is called the polar decomposition of $A$. The polar decomposition satisfies

 $U^{*}U|A|=|A|,\quad U^{*}A=|A|,\quad UU^{*}A=A.$
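For an invertible real $2\times 2$ matrix the polar decomposition can be computed by hand, which gives a quick sanity check of the identities above. The sketch below assumes $A$ is invertible (so that $|A|$ is invertible and $U=A|A|^{-1}$ is an isometry of the whole space) and uses a closed form for the square root of a positive definite $2\times 2$ matrix that follows from the Cayley-Hamilton theorem. All names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def sqrt_positive_2x2(P):
    """Positive square root R of a positive definite symmetric 2x2 matrix P.

    Cayley-Hamilton gives R^2 - (tr R) R + (det R) I = 0, so P + s I = t R
    with s = det R = sqrt(det P) and t = tr R = sqrt(tr P + 2 s).
    """
    s = math.sqrt(P[0][0] * P[1][1] - P[0][1] * P[1][0])
    t = math.sqrt(P[0][0] + P[1][1] + 2.0 * s)
    return [[(P[0][0] + s) / t, P[0][1] / t],
            [P[1][0] / t, (P[1][1] + s) / t]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

A = [[1.0, 2.0], [0.0, 1.0]]                        # invertible sample matrix
abs_A = sqrt_positive_2x2(matmul(transpose(A), A))  # |A| = (A^T A)^(1/2)
U = matmul(A, inverse_2x2(abs_A))                   # U = A |A|^(-1)
```

With these values `matmul(U, abs_A)` recovers `A` and `matmul(transpose(U), U)` is the identity up to rounding. For a non-invertible $A$ the operator $U$ is only a partial isometry and this shortcut does not apply.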

## 6 Spectral theorem

If $e,f\in H$, we define $e\otimes f:H\to H$ by $e\otimes f(h)=\left\langle h,f\right\rangle e$. $e\otimes f$ is linear, and

 $\left\|e\otimes f(h)\right\|=\left\|\left\langle h,f\right\rangle e\right\|=|\left\langle h,f\right\rangle|\left\|e\right\|\leq\left\|h\right\|\left\|f\right\|\left\|e\right\|,$

so $\left\|e\otimes f\right\|\leq\left\|e\right\|\left\|f\right\|$. If either of $e$ or $f$ is $0$ then $e\otimes f=0$ has rank $0$, and otherwise the image of $e\otimes f$ is the span of $e$ and $e\otimes f$ has rank $1$; in either case $e\otimes f\in\mathscr{B}_{00}(H)$. When $e$ and $f$ are nonzero, $e\otimes f$ is an orthogonal projection precisely when $f=\frac{e}{\left\|e\right\|^{2}}$, for instance when $f=e$ and $\left\|e\right\|=1$.

If $\mathscr{E}$ is an orthonormal set in a Hilbert space $H$, then $\mathscr{E}$ is an orthonormal basis for $H$ if and only if the unordered sum

 $\sum_{e\in\mathscr{E}}e\otimes e$

converges strongly to $\mathrm{id}_{H}$ (John B. Conway, A Course in Functional Analysis, second ed., p. 16, Theorem 4.13).

Let’s summarize what we have stated so far about the spectrum and point spectrum of a compact self-adjoint operator on a Hilbert space.

###### Theorem 3 (Spectrum of compact self-adjoint operators).

If $H$ is a Hilbert space and $A\in\mathscr{B}_{0}(H)$ is self-adjoint, then:

• $\sigma(A)$ is a nonempty compact subset of $\mathbb{R}$.

• If $H$ is infinite dimensional, then $0\in\sigma(A)$.

• $\sigma(A)\subseteq\sigma_{\mathrm{point}}(A)\cup\{0\}$.

• $\sigma_{\mathrm{point}}(A)$ is countable.

• If $\lambda\in\mathbb{R}$ is a limit point of $\sigma_{\mathrm{point}}(A)$, then $\lambda=0$.

• At least one of $\left\|A\right\|,-\left\|A\right\|$ is an element of $\sigma_{\mathrm{point}}(A)$.

• Each nonzero eigenvalue of $A$ has finite geometric multiplicity: If $\lambda\in\sigma_{\mathrm{point}}(A)$ and $\lambda\neq 0$, then $\dim\ker(A-\lambda\mathrm{id}_{H})<\infty$.

• If $A\in\mathscr{B}_{00}(H)$, then $\sigma_{\mathrm{point}}(A)$ is a finite set.

We say that $A\in\mathscr{B}(H)$ is diagonalizable if there is an orthonormal basis $\mathscr{E}$ for $H$ and a bounded set $\{\lambda_{e}\in\mathbb{C}:e\in\mathscr{E}\}$ such that the unordered sum

 $\sum_{e\in\mathscr{E}}\lambda_{e}e\otimes e$

converges strongly to $A$.

The following is the spectral theorem for normal compact operators (Gert K. Pedersen, Analysis Now, revised printing, p. 108, Theorem 3.3.8).

###### Theorem 4 (Spectral theorem).

If $A\in\mathscr{B}_{0}(H)$ is normal, then $A$ is diagonalizable.

The last assertion of Theorem 3 is that a bounded self-adjoint finite rank operator on a Hilbert space has finitely many elements in its point spectrum. Using the spectral theorem, we get that if $A\in\mathscr{B}_{0}(H)$ is self-adjoint and $\sigma_{\mathrm{point}}(A)$ is finite, then $A$ is finite rank. In the notation we introduce in the following definition, $\nu(A)<\infty$ precisely when $A$ has finite rank.

###### Definition 5.

If $A\in\mathscr{B}_{0}(H)$ is self-adjoint, define $0\leq\nu(A)\leq\infty$ to be the sum of the geometric multiplicities of the nonzero eigenvalues of $A$:

 $\nu(A)=\sum_{\lambda\in\sigma_{\mathrm{point}}(A)\setminus\{0\}}\dim\ker(A-\lambda\mathrm{id}_{H}).$

Define

 $(\lambda_{n}(A):n\in\mathbb{N})\in\mathbb{R}^{\mathbb{N}}$

to be the sequence whose first term is the element of $\sigma_{\mathrm{point}}(A)\setminus\{0\}$ with largest absolute value repeated as many times as its geometric multiplicity. If $\lambda,-\lambda$ are both nonzero elements of $\sigma_{\mathrm{point}}(A)$, we put the positive one first. We repeat this for the remaining elements of $\sigma_{\mathrm{point}}(A)\setminus\{0\}$. If $\nu(A)<\infty$, we define $\lambda_{n}(A)=0$ for $n>\nu(A)$.

Using the spectral theorem and the notation in the above definition we get the following.

###### Theorem 6.

If $A\in\mathscr{B}_{0}(H)$ is self-adjoint, then there is an orthonormal set $\{e_{n}:n\in\mathbb{N}\}$ in $H$ such that

 $\sum_{n\in\mathbb{N}}\lambda_{n}(A)e_{n}\otimes e_{n}$

converges strongly to $A$.

If $A\in\mathscr{B}_{0}(H)$, then its absolute value $|A|$ is a positive compact operator, and $\lambda_{n}(|A|)\geq 0$ for all $n\in\mathbb{N}$.

###### Definition 7.

If $A\in\mathscr{B}_{0}(H)$ and $\lambda$ is an eigenvalue of $|A|$, we call $\lambda$ a singular value of $A$, and we define

 $\sigma_{n}(A)=\lambda_{n}(|A|),\qquad n\in\mathbb{N}.$

Because the absolute value of the absolute value of an operator is the absolute value of the operator, if $A\in\mathscr{B}_{0}(H)$ and $n\in\mathbb{N}$ then $\sigma_{n}(|A|)=\sigma_{n}(A)$. If $A\in\mathscr{B}_{0}(H)$ and $\lambda\neq 0$, one proves that $\lambda$ is an eigenvalue of $AA^{*}$ if and only if $\lambda$ is an eigenvalue of $A^{*}A$ and that they have the same geometric multiplicity. From this we get that $\sigma_{n}(A)=\sigma_{n}(A^{*})$ for all $n\in\mathbb{N}$.

## 7 Finite rank operators

###### Theorem 8 (Singular value decomposition).

If $H$ is a Hilbert space and $A\in\mathscr{B}_{00}(H)$ has $\mathrm{rank\,}A=N$, then there is an orthonormal set $\{e_{n}:1\leq n\leq N\}$ and an orthonormal set $\{f_{n}:1\leq n\leq N\}$ such that

 $A=\sum_{n=1}^{N}\sigma_{n}(A)e_{n}\otimes f_{n},\qquad Ah=\sum_{n=1}^{N}\sigma_{n}(A)\left\langle h,f_{n}\right\rangle e_{n}.$

###### Proof.

$|A|$ is a positive operator with $\mathrm{rank\,}|A|=N$, and according to Theorem 6, there is an orthonormal set $\{f_{n}:n\in\mathbb{N}\}$ in $H$ such that

 $|A|=\sum_{n\in\mathbb{N}}\lambda_{n}(|A|)f_{n}\otimes f_{n}=\sum_{n\in\mathbb{N}}\sigma_{n}(A)f_{n}\otimes f_{n}.$

Using the polar decomposition $A=U|A|$,

 $A=U|A|=\sum_{n=1}^{N}\sigma_{n}(A)(Uf_{n})\otimes f_{n}.$

Define $e_{n}=Uf_{n}$. As $U^{*}U|A|=|A|$ and as $|A|\frac{f_{m}}{\sigma_{m}(A)}=f_{m}$,

 $\begin{aligned}
\left\langle e_{n},e_{m}\right\rangle
&=\left\langle Uf_{n},Uf_{m}\right\rangle\\
&=\left\langle f_{n},U^{*}Uf_{m}\right\rangle\\
&=\left\langle f_{n},U^{*}U|A|\frac{f_{m}}{\sigma_{m}(A)}\right\rangle\\
&=\left\langle f_{n},|A|\frac{f_{m}}{\sigma_{m}(A)}\right\rangle\\
&=\left\langle f_{n},f_{m}\right\rangle\\
&=\delta_{n,m},
\end{aligned}$

showing that $\{e_{n}:1\leq n\leq N\}$ is an orthonormal set. ∎
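The construction in this proof can be followed step by step for a real $2\times 2$ matrix. In the sketch below, which assumes $A$ is invertible so that both singular values are nonzero, the $f_{n}$ are orthonormal eigenvectors of $A^{T}A$ (equivalently of $|A|$), the $\sigma_{n}$ are the square roots of the corresponding eigenvalues, and $e_{n}=Uf_{n}$ is computed as $Af_{n}/\sigma_{n}$. The function names and the sample matrix are hypothetical.

```python
import math

def svd_2x2(A):
    """Return [(sigma_n, e_n, f_n)] for an invertible real 2x2 matrix A."""
    (a, b), (c, d) = A
    # A^T A = [[p, q], [q, r]]
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean, radius = (p + r) / 2.0, math.hypot((p - r) / 2.0, q)
    if abs(q) > 1e-12:
        # (q, mu - p) is an eigenvector of A^T A for the eigenvalue mu.
        pairs = [(mu, (q, mu - p)) for mu in (mean + radius, mean - radius)]
    else:
        # A^T A is already diagonal.
        pairs = sorted([(p, (1.0, 0.0)), (r, (0.0, 1.0))], reverse=True)
    triples = []
    for mu, v in pairs:
        length = math.hypot(v[0], v[1])
        f = (v[0] / length, v[1] / length)
        sigma = math.sqrt(mu)  # eigenvalue of |A|
        e = ((a * f[0] + b * f[1]) / sigma, (c * f[0] + d * f[1]) / sigma)
        triples.append((sigma, e, f))
    return triples

def reconstruct(triples):
    """Matrix of the sum of sigma_n * (e_n tensor f_n)."""
    return [[sum(s * e[i] * f[j] for s, e, f in triples) for j in range(2)]
            for i in range(2)]

A = [[3.0, 0.0], [4.0, 5.0]]
triples = svd_2x2(A)
rebuilt = reconstruct(triples)
```

For this matrix the singular values are $3\sqrt{5}$ and $\sqrt{5}$, and `rebuilt` agrees with `A` up to rounding.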

If $e,f,x,y\in H$, then

 $\left\langle e\otimes f(x),y\right\rangle=\left\langle\left\langle x,f\right\rangle e,y\right\rangle=\left\langle x,f\right\rangle\left\langle e,y\right\rangle=\left\langle y,e\right\rangle^{*}\left\langle x,f\right\rangle=\left\langle x,\left\langle y,e\right\rangle f\right\rangle=\left\langle x,f\otimes e(y)\right\rangle,$

so $(e\otimes f)^{*}=f\otimes e$.
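On $\mathbb{C}^{n}$ the operator $e\otimes f$ has matrix entries $e_{i}f_{j}^{*}$, so the identity $(e\otimes f)^{*}=f\otimes e$ amounts to a statement about conjugate transposes. The following sketch checks it for two arbitrary sample vectors; the helper names are hypothetical.

```python
def tensor(e, f):
    """Matrix of e (x) f on C^n, where (e (x) f)(h) = <h, f> e."""
    return [[ei * fj.conjugate() for fj in f] for ei in e]

def adjoint(M):
    """Conjugate transpose of a square matrix M."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

# Arbitrary sample vectors in C^2.
e = [1 + 1j, 2 - 1j]
f = [3j, 1 - 2j]

left = adjoint(tensor(e, f))   # (e (x) f)*
right = tensor(f, e)           # f (x) e
```

The matrices `left` and `right` coincide entry by entry.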

###### Theorem 9.

If $A\in\mathscr{B}_{00}(H)$ then $A^{*}\in\mathscr{B}_{00}(H)$.

###### Proof.

Let $A$ have the singular value decomposition

 $A=\sum_{n=1}^{N}\sigma_{n}(A)e_{n}\otimes f_{n}.$

Taking the adjoint, and because $\sigma_{n}\in\mathbb{R}$,

 $A^{*}=\sum_{n=1}^{N}\sigma_{n}(A)f_{n}\otimes e_{n}.$

$A^{*}$ is a sum of finite rank operators and is therefore itself a finite rank operator. ∎

## 8 Compact operators

If $X$ and $Y$ are Banach spaces, $\mathscr{B}_{00}(X,Y)\subseteq\mathscr{B}_{0}(X,Y)$. But if $H$ is a Hilbert space we can say much more: $\mathscr{B}_{00}(H)$ is a dense subset of $\mathscr{B}_{0}(H)$ (John B. Conway, A Course in Functional Analysis, second ed., p. 41, Theorem 4.4). In other words, for any $A\in\mathscr{B}_{0}(H)$ there is a sequence $A_{n}\in\mathscr{B}_{00}(H)$ with $\left\|A-A_{n}\right\|\to 0$. As the adjoint $A_{n}^{*}$ of each of these finite rank operators $A_{n}$ is itself a bounded finite rank operator (Theorem 9),

 $\left\|A^{*}-A_{n}^{*}\right\|=\left\|(A-A_{n})^{*}\right\|=\left\|A-A_{n}\right\|\to 0,$

so $A_{n}^{*}\to A^{*}$. Because each bounded finite rank operator is compact and $\mathscr{B}_{0}(H)$ is closed, this establishes that $A^{*}\in\mathscr{B}_{0}(H)$. (In fact, it is true that the adjoint of a compact linear operator between Banach spaces is itself compact, but there we don’t have the tool of showing that the adjoint is the limit of the adjoints of finite rank operators.)

If $H$ is a Hilbert space, the weak topology is the topology on $H$ such that a net $x_{\alpha}$ converges to $x$ weakly if for all $h\in H$ the net $\left\langle x_{\alpha},h\right\rangle$ converges to $\left\langle x,h\right\rangle$ in $\mathbb{C}$. Let $\mathfrak{B}$ be the closed unit ball in $H$, and let it be a topological space with the subspace topology inherited from $H$ with the weak topology. Thus, a net $x_{\alpha}\in\mathfrak{B}$ converges to $x\in\mathfrak{B}$ if and only if for all $h\in H$ the net $\left\langle x_{\alpha},h\right\rangle$ converges to $\left\langle x,h\right\rangle$.

###### Theorem 10.

If $H$ is a Hilbert space, $A\in\mathscr{B}(H)$, and $\mathfrak{B}$ is the closed unit ball in $H$ with the subspace topology inherited from $H$ with the weak topology, then $A$ is compact if and only if $A|\mathfrak{B}:\mathfrak{B}\to H$ is continuous.

###### Proof.

Suppose that $A$ is compact and let $x_{\alpha}$ be a net in $\mathfrak{B}$ that converges weakly to some $x\in\mathfrak{B}$. If $\epsilon>0$, then there is some $B\in\mathscr{B}_{00}(H)$ with $\left\|A-B\right\|<\frac{\epsilon}{3}$. Let $B$ have the singular value decomposition

 $B=\sum_{n=1}^{N}\sigma_{n}(B)e_{n}\otimes f_{n}.$

We have, using that the $e_{n}$ are orthonormal,

 $\begin{aligned}
\left\|Bx_{\alpha}-Bx\right\|^{2}
&=\left\|\sum_{n=1}^{N}\sigma_{n}(B)\left\langle x_{\alpha},f_{n}\right\rangle e_{n}-\sum_{n=1}^{N}\sigma_{n}(B)\left\langle x,f_{n}\right\rangle e_{n}\right\|^{2}\\
&=\left\|\sum_{n=1}^{N}\sigma_{n}(B)\left\langle x_{\alpha}-x,f_{n}\right\rangle e_{n}\right\|^{2}\\
&=\sum_{n=1}^{N}\sigma_{n}(B)^{2}|\left\langle x_{\alpha}-x,f_{n}\right\rangle|^{2}.
\end{aligned}$

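Because $x_{\alpha}\to x$ weakly, each of the finitely many terms on the right-hand side tends to $0$, so eventually $\left\|Bx_{\alpha}-Bx\right\|<\frac{\epsilon}{3}$, and for such $\alpha$,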

 $\begin{aligned}
\left\|Ax_{\alpha}-Ax\right\|
&\leq\left\|Ax_{\alpha}-Bx_{\alpha}\right\|+\left\|Bx_{\alpha}-Bx\right\|+\left\|Bx-Ax\right\|\\
&\leq\left\|A-B\right\|\left\|x_{\alpha}\right\|+\left\|Bx_{\alpha}-Bx\right\|+\left\|B-A\right\|\left\|x\right\|\\
&\leq\left\|A-B\right\|+\left\|Bx_{\alpha}-Bx\right\|+\left\|B-A\right\|\\
&<\frac{\epsilon}{3}+\frac{\epsilon}{3}+\frac{\epsilon}{3}\\
&=\epsilon.
\end{aligned}$

We have shown that $Ax_{\alpha}\to Ax$ in the norm of $H$, and this shows that $A|\mathfrak{B}:\mathfrak{B}\to H$ is continuous.

Suppose that $A|\mathfrak{B}:\mathfrak{B}\to H$ is continuous. Kakutani’s theorem states that a Banach space is reflexive if and only if the closed unit ball is weakly compact. A Hilbert space is reflexive, hence $\mathfrak{B}$, the closed unit ball with the weak topology, is a compact topological space (cf. Paul R. Halmos, A Hilbert Space Problem Book, Problem 17). Since $A|\mathfrak{B}:\mathfrak{B}\to H$ is continuous and $\mathfrak{B}$ is compact, the image $A(\mathfrak{B})$ is compact (the image of a compact set under a continuous map is a compact set). We have shown that the image of the closed unit ball is a compact subset of $H$, and this shows that $A$ is compact; in fact, to have shown that $A$ is compact we merely needed to show that the image of the closed unit ball is precompact, and $H$ is a Hausdorff space so a compact set is precompact. ∎
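A concrete instance may make Theorem 10 easier to picture; the choice of space and operator is an illustrative assumption, not taken from the text. Let $H=\ell^{2}(\mathbb{N})$ with standard orthonormal basis $(e_{n})$. By Bessel’s inequality $\left\langle e_{n},h\right\rangle\to 0$ for every $h\in H$, so $e_{n}\to 0$ weakly in $\mathfrak{B}$ even though $\left\|e_{n}\right\|=1$. Comparing $\mathrm{id}_{H}$ with the compact diagonal operator $A$ that scales $e_{n}$ by $\frac{1}{n}$:

```latex
\[
\left\| \mathrm{id}_{H}\, e_n - \mathrm{id}_{H}\, 0 \right\| = 1 \not\to 0,
\qquad
\left\| A e_n - A 0 \right\| = \tfrac{1}{n} \to 0 .
\]
```

So the restriction of $\mathrm{id}_{H}$ to $\mathfrak{B}$ is not continuous from the weak topology to the norm topology, in agreement with $\mathrm{id}_{H}$ not being compact on an infinite dimensional space, while $A$ sends this weakly convergent sequence to a norm convergent one.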

A compact linear operator on an infinite dimensional Hilbert space $H$ is not invertible, lest $\mathrm{id}_{H}$ be compact. However, operators of the form $A-\lambda\mathrm{id}_{H}$ may indeed be invertible (Ward Cheney, Analysis for Applied Mathematics, p. 94, Theorem 2).

###### Theorem 11.

If $A\in\mathscr{B}_{0}(H)$ is a normal operator with diagonalization

 $A=\sum_{n=1}^{\infty}\lambda_{n}e_{n}\otimes e_{n}$

and $0\neq\lambda\not\in\sigma_{\mathrm{point}}(A)$, then $A-\lambda\mathrm{id}_{H}$ is invertible and

 $(A-\lambda\mathrm{id}_{H})^{-1}=-\frac{1}{\lambda}\mathrm{id}_{H}+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}e_{n}\otimes e_{n},$

where the series converges in the strong operator topology.
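The formula in Theorem 11 can be tested in a finite dimensional model, where the sum is finite and no convergence question arises. The sketch below works on $\mathbb{R}^{3}$ with two orthonormal eigenvectors and a one dimensional kernel, builds $A$ and the candidate inverse from the formula, and leaves the two products to be compared with the identity. The eigenvalues, vectors, and helper names are all illustrative assumptions.

```python
import math

def outer(u):
    """Matrix of e (x) e for a real vector e."""
    return [[ui * uj for uj in u] for ui in u]

def combine(terms, n=3):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
r = 1.0 / math.sqrt(2.0)
e1, e2 = (r, r, 0.0), (r, -r, 0.0)   # orthonormal; (0, 0, 1) spans ker A
eigs = [(2.0, e1), (-1.0, e2)]        # the nonzero eigenvalues lambda_n
lam = 1.0                             # nonzero and not an eigenvalue of A

A = combine([(mu, outer(e)) for mu, e in eigs])
# B = -(1/lam) id + (1/lam) * sum of lambda_n / (lambda_n - lam) e_n (x) e_n
B = combine([(-1.0 / lam, identity)] +
            [(mu / (lam * (mu - lam)), outer(e)) for mu, e in eigs])
shifted = combine([(1.0, A), (-lam, identity)])   # A - lam * id

left_product = matmul(shifted, B)
right_product = matmul(B, shifted)
```

Both `left_product` and `right_product` equal the identity matrix up to rounding. On the kernel of $A$ the candidate inverse acts as $-\frac{1}{\lambda}$, which is the role of the first term in the formula.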

###### Proof.

As $\lambda_{n}\to 0$ we have $\alpha=\sup_{n}|\lambda_{n}|<\infty$, and as $\lambda\neq 0$ is not equal to any $\lambda_{n}$ and $\lambda_{n}\to 0$, we have $\beta=\inf_{n}|\lambda_{n}-\lambda|>0$. Define

 $T_{N}=-\frac{1}{\lambda}\mathrm{id}_{H}+\frac{1}{\lambda}\sum_{n=1}^{N}\frac{\lambda_{n}}{\lambda_{n}-\lambda}e_{n}\otimes e_{n}\in\mathscr{B}(H),$

and if $N>M$, then, for any $h\in H$,

 $\begin{aligned}
\left\|T_{N}h-T_{M}h\right\|^{2}
&=\frac{1}{|\lambda|^{2}}\left\|\sum_{n=M+1}^{N}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left\langle h,e_{n}\right\rangle e_{n}\right\|^{2}\\
&=\frac{1}{|\lambda|^{2}}\sum_{n=M+1}^{N}\frac{|\lambda_{n}|^{2}}{|\lambda_{n}-\lambda|^{2}}|\left\langle h,e_{n}\right\rangle|^{2}\\
&\leq\frac{1}{|\lambda|^{2}}\sum_{n=M+1}^{N}\frac{\alpha^{2}|\left\langle h,e_{n}\right\rangle|^{2}}{\beta^{2}}\\
&=\frac{\alpha^{2}}{|\lambda|^{2}\beta^{2}}\sum_{n=M+1}^{N}|\left\langle h,e_{n}\right\rangle|^{2}.
\end{aligned}$

By Bessel’s inequality, $\sum_{n=1}^{\infty}|\left\langle h,e_{n}\right\rangle|^{2}\leq\left\|h\right\|^{2}$, hence $\sum_{n=N}^{\infty}|\left\langle h,e_{n}\right\rangle|^{2}\to 0$ as $N\to\infty$; the rate of convergence in this estimate depends on $h$, so what it establishes is convergence in the strong operator topology. We have shown that $T_{N}h$ is a Cauchy sequence in $H$ and hence $T_{N}h$ converges. We define $Bh$ to be this limit. For $h\in H$,

 $\begin{aligned}
\left\|\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left\langle h,e_{n}\right\rangle e_{n}\right\|^{2}
&=\frac{1}{|\lambda|^{2}}\sum_{n=1}^{\infty}\frac{|\lambda_{n}|^{2}}{|\lambda_{n}-\lambda|^{2}}|\left\langle h,e_{n}\right\rangle|^{2}\\
&\leq\frac{\alpha^{2}}{|\lambda|^{2}\beta^{2}}\sum_{n=1}^{\infty}|\left\langle h,e_{n}\right\rangle|^{2}\\
&\leq\frac{\alpha^{2}}{|\lambda|^{2}\beta^{2}}\left\|h\right\|^{2},
\end{aligned}$

whence

 $\begin{aligned}
\left\|Bh\right\|
&\leq\left\|-\frac{1}{\lambda}h\right\|+\left\|\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left\langle h,e_{n}\right\rangle e_{n}\right\|\\
&\leq\frac{1}{|\lambda|}\left\|h\right\|+\frac{\alpha}{|\lambda|\beta}\left\|h\right\|,
\end{aligned}$

showing that $\left\|B\right\|\leq\frac{1}{|\lambda|}+\frac{\alpha}{|\lambda|\beta}$. It is straightforward to check that $B$ is linear, thus $B\in\mathscr{B}(H)$. (Thus $B$ is a strong limit of finite rank operators. But if $H$ is infinite dimensional then $B$ is in fact not the norm limit of the sequence: for if it were it would be compact, and we will show that $B$ is invertible, which would tell us that $\mathrm{id}_{H}$ is compact, contradicting $H$ being infinite dimensional.)

For $h\in H$,

 $\displaystyle(A-\lambda\mathrm{id}_{H})Bh$ $\displaystyle=$ $\displaystyle-\frac{1}{\lambda}(Ah-\lambda h)+\frac{1}{\lambda}\sum_{n=1}^{% \infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left\langle h,e_{n}\right% \rangle(Ae_{n}-\lambda e_{n})$ $\displaystyle=$ $\displaystyle h-\frac{1}{\lambda}Ah+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{% \lambda_{n}}{\lambda_{n}-\lambda}\left\langle h,e_{n}\right\rangle(\lambda_{n}% e_{n}-\lambda e_{n})$ $\displaystyle=$ $\displaystyle h-\frac{1}{\lambda}Ah+\frac{1}{\lambda}\sum_{n=1}^{\infty}% \lambda_{n}\left\langle h,e_{n}\right\rangle e_{n}$ $\displaystyle=$ $\displaystyle h,$

where the final equality is because the series is the diagonalization of $A$. On the other hand,

 $\displaystyle B(A-\lambda\mathrm{id}_{H})h$ $\displaystyle=$ $\displaystyle-\frac{1}{\lambda}(A-\lambda\mathrm{id}_{H})h+\frac{1}{\lambda}% \sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left\langle(A-% \lambda\mathrm{id}_{H})h,e_{n}\right\rangle e_{n}$ $\displaystyle=$ $\displaystyle h-\frac{1}{\lambda}Ah+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{% \lambda_{n}}{\lambda_{n}-\lambda}\left\langle Ah-\lambda h,e_{n}\right\rangle e% _{n}$ $\displaystyle=$ $\displaystyle h-\frac{1}{\lambda}\sum_{n=1}^{\infty}\lambda_{n}\left\langle h,% e_{n}\right\rangle e_{n}+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{\lambda_{n}% }{\lambda_{n}-\lambda}\left\langle Ah,e_{n}\right\rangle e_{n}$ $\displaystyle-\sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left% \langle h,e_{n}\right\rangle e_{n}$ $\displaystyle=$ $\displaystyle h-\frac{1}{\lambda}\sum_{n=1}^{\infty}\lambda_{n}\left\langle h,% e_{n}\right\rangle e_{n}+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{\lambda_{n}% }{\lambda_{n}-\lambda}\lambda_{n}\left\langle h,e_{n}\right\rangle e_{n}$ $\displaystyle-\sum_{n=1}^{\infty}\frac{\lambda_{n}}{\lambda_{n}-\lambda}\left% \langle h,e_{n}\right\rangle e_{n}$ $\displaystyle=$ $\displaystyle h+\frac{1}{\lambda}\sum_{n=1}^{\infty}\frac{-\lambda_{n}(\lambda% _{n}-\lambda)+\lambda_{n}^{2}-\lambda_{n}\lambda}{\lambda_{n}-\lambda}\left% \langle h,e_{n}\right\rangle e_{n}$ $\displaystyle=$ $\displaystyle h,$

showing that $B=(A-\lambda\mathrm{id}_{H})^{-1}$. ∎
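The two computations above can be checked numerically in a finite dimensional setting. The following is a minimal sketch, assuming $H=\mathbb{C}^{3}$, that the $e_{n}$ are the standard basis vectors, and that $A$ is diagonal with made-up eigenvalues; the function names `apply_B` and `apply_A_minus_lam` are illustrative and not part of the notes.

```python
# Assumptions: H = C^3, e_n = standard basis, A = diag(lams).
# The eigenvalues and lam below are made-up sample values.

def apply_B(h, lams, lam):
    """Bh = -(1/lam) h + (1/lam) sum_n lam_n/(lam_n - lam) <h, e_n> e_n."""
    return [-hn / lam + (ln / (ln - lam)) * hn / lam for hn, ln in zip(h, lams)]

def apply_A_minus_lam(h, lams, lam):
    """(A - lam id) h for the diagonal operator A = diag(lams)."""
    return [(ln - lam) * hn for hn, ln in zip(h, lams)]

lams = [2.0, 0.5, 0.25]      # eigenvalues of A
lam = 1.0 + 1.0j             # nonzero and not an eigenvalue of A
h = [1.0 + 2.0j, -3.0, 0.5j]

left = apply_A_minus_lam(apply_B(h, lams, lam), lams, lam)   # (A - lam id) B h
right = apply_B(apply_A_minus_lam(h, lams, lam), lams, lam)  # B (A - lam id) h

# Largest componentwise deviation of either composition from h.
err = max(abs(x - y) for x, y in zip(left + right, h + h))
print(err)
```

In this diagonal case the formula for $B$ reduces componentwise to division by $\lambda_{n}-\lambda$, which is why both compositions return $h$ up to rounding.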

We can start with a function and ask what kind of series it can be expanded into, or we can start with a series and ask what kind of function it defines. The following theorem does the latter. It shows that if $e_{n}$ and $f_{n}$ are each orthonormal sequences and $\lambda_{n}$ is a sequence of complex numbers whose limit is $0$, then the series

 $\sum_{n=1}^{\infty}\lambda_{n}e_{n}\otimes f_{n}$

converges and is an element of $\mathscr{B}_{0}(H)$.

###### Theorem 12.

If $H$ is a Hilbert space, $\{e_{n}:n\in\mathbb{N}\}$ is an orthonormal set, $\{f_{n}:n\in\mathbb{N}\}$ is an orthonormal set, and $\lambda_{n}\in\mathbb{C}$ is a sequence tending to $0$, then the sequence

 $A_{N}=\sum_{n=1}^{N}\lambda_{n}e_{n}\otimes f_{n}\in\mathscr{B}_{00}(H)$

converges to an element of $\mathscr{B}_{0}(H)$.

###### Proof.

Let $\epsilon>0$ and let $N_{0}$ be such that if $n\geq N_{0}$ then $|\lambda_{n}|<\epsilon$. If $N>M\geq N_{0}$ and $h\in H$, then, as the $e_{n}$ are orthonormal,

 $\displaystyle\left\|(A_{N}-A_{M})h\right\|^{2}$ $\displaystyle=$ $\displaystyle\left\|\sum_{n=M+1}^{N}\lambda_{n}e_{n}\otimes f_{n}(h)\right\|^{2}$ $\displaystyle=$ $\displaystyle\left\|\sum_{n=M+1}^{N}\lambda_{n}\left\langle h,f_{n}\right\rangle e_{n}\right\|^{2}$ $\displaystyle=$ $\displaystyle\sum_{n=M+1}^{N}\left\|\lambda_{n}\left\langle h,f_{n}\right\rangle e_{n}\right\|^{2}$ $\displaystyle=$ $\displaystyle\sum_{n=M+1}^{N}|\lambda_{n}|^{2}|\left\langle h,f_{n}\right\rangle|^{2}$ $\displaystyle\leq$ $\displaystyle\epsilon^{2}\sum_{n=M+1}^{N}|\left\langle h,f_{n}\right\rangle|^{2}.$

By Bessel’s inequality, $\sum_{n=M+1}^{N}|\left\langle h,f_{n}\right\rangle|^{2}\leq\left\|h\right\|^{2}$, and hence

 $\left\|(A_{N}-A_{M})h\right\|\leq\epsilon\left\|h\right\|.$

As this holds for all $h\in H$,

 $\left\|A_{N}-A_{M}\right\|\leq\epsilon,$

showing that $A_{N}$ is a Cauchy sequence, which therefore converges in $\mathscr{B}(H)$. As each term in the sequence is finite rank and so compact, the limit is a compact operator. ∎

Continuing the analogy we used with the above theorem, now we start with a function and ask what kind of series it can be expanded into. The resulting expansion is called the singular value decomposition of a compact operator. Helemskii calls the series in the following theorem the Schmidt series of the operator (A. Ya. Helemskii, Lectures and Exercises on Functional Analysis, p. 215, Theorem 1). We have already presented the singular value decomposition for finite rank operators in Theorem 8.

###### Theorem 13 (Singular value decomposition).

If $H$ is a Hilbert space and

 $A\in\mathscr{B}_{0}(H)\setminus\mathscr{B}_{00}(H),$

then there is an orthonormal set $\{e_{n}:n\in\mathbb{N}\}$ and an orthonormal set $\{f_{n}:n\in\mathbb{N}\}$ such that $A_{N}\to A$, where

 $A_{N}=\sum_{n=1}^{N}\sigma_{n}(A)e_{n}\otimes f_{n}.$
###### Proof.

As $|A|$ is self-adjoint and compact, by Theorem 6 there is an orthonormal set $\{f_{n}:n\in\mathbb{N}\}$ such that

 $|A|=\sum_{n=1}^{\infty}\lambda_{n}(|A|)f_{n}\otimes f_{n}=\sum_{n=1}^{\infty}% \sigma_{n}(A)f_{n}\otimes f_{n}.$

That is, with $|A|_{N}\in\mathscr{B}_{00}(H)$ defined by

 $|A|_{N}=\sum_{n=1}^{N}\sigma_{n}(A)f_{n}\otimes f_{n},$

we have $|A|_{N}\to|A|$.

Let $A=U|A|$ be the polar decomposition of $A$, and define $e_{n}=Uf_{n}$. As $U^{*}U|A|=|A|$ and as $\sigma_{m}(A)>0$ (because $A$ is not finite rank), we have $|A|\frac{f_{m}}{\sigma_{m}(A)}=f_{m}$, and hence

 $\displaystyle\left\langle e_{n},e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle Uf_{n},Uf_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle f_{n},U^{*}Uf_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle f_{n},U^{*}U|A|\frac{f_{m}}{\sigma_{m}(A)}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle f_{n},|A|\frac{f_{m}}{\sigma_{m}(A)}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle f_{n},f_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\delta_{n,m},$

showing that $\{e_{n}:n\in\mathbb{N}\}$ is an orthonormal set. Define

 $A_{N}=\sum_{n=1}^{N}\sigma_{n}(A)e_{n}\otimes f_{n},$

and we have $A_{N}=U|A|_{N}$. As $A=U|A|$ and $A_{N}=U|A|_{N}$, we get

 $\left\|A-A_{N}\right\|=\left\|U|A|-U|A|_{N}\right\|\leq\left\|U\right\|\left\|% |A|-|A|_{N}\right\|=\left\||A|-|A|_{N}\right\|\to 0,$

showing that $A_{N}\to A$. ∎
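The construction in the proof, namely diagonalize $|A|$ to get the $f_{n}$ and then set $e_{n}=Uf_{n}$, can be followed by hand for a small matrix. The sketch below is only an illustration under stated assumptions: $H=\mathbb{R}^{2}$, $A$ is a real invertible $2\times 2$ matrix whose $A^{T}A$ has a nonzero off-diagonal entry, and the helper names `svd_2x2` and `apply_series` are hypothetical. Since $|A|f_{n}=\sigma_{n}(A)f_{n}$, we compute $e_{n}=Uf_{n}$ as $Af_{n}/\sigma_{n}(A)$.

```python
import math

def svd_2x2(A):
    """Return [(sigma, e, f), ...] with A h = sum sigma * <h, f> * e.

    Assumes A is a real invertible 2x2 matrix whose A^T A has a nonzero
    off-diagonal entry, so the closed-form eigenvectors below are valid.
    """
    (p, q), (r, s) = A
    # |A|^2 = A^T A = [[a, b], [b, d]]
    a, b, d = p * p + r * r, p * q + r * s, q * q + s * s
    mid, rad = (a + d) / 2, math.hypot((a - d) / 2, b)
    terms = []
    for mu in (mid + rad, mid - rad):      # eigenvalues of A^T A, decreasing
        sigma = math.sqrt(mu)              # singular value of A
        fx, fy = b, mu - a                 # eigenvector of A^T A for mu
        n = math.hypot(fx, fy)
        f = (fx / n, fy / n)
        # e = U f where A = U|A|; as |A| f = sigma f, this is A f / sigma.
        e = ((p * f[0] + q * f[1]) / sigma, (r * f[0] + s * f[1]) / sigma)
        terms.append((sigma, e, f))
    return terms

def apply_series(terms, h):
    """Evaluate sum_n sigma_n <h, f_n> e_n."""
    out = [0.0, 0.0]
    for sigma, e, f in terms:
        c = sigma * (h[0] * f[0] + h[1] * f[1])
        out[0] += c * e[0]
        out[1] += c * e[1]
    return out

A = ((3.0, 0.0), (4.0, 5.0))   # made-up example matrix
terms = svd_2x2(A)
h = (1.0, -2.0)
Ah = [A[0][0] * h[0] + A[0][1] * h[1], A[1][0] * h[0] + A[1][1] * h[1]]
print([t[0] for t in terms], apply_series(terms, h), Ah)
```

For this matrix $A^{T}A$ has eigenvalues $45$ and $5$, so the singular values are $\sqrt{45}$ and $\sqrt{5}$, and the series applied to $h$ agrees with $Ah$ up to rounding.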

## 9 Courant min-max theorem

###### Theorem 14 (Courant min-max theorem).

Let $H$ be an infinite dimensional Hilbert space and let $A\in\mathscr{B}_{0}(H)$ be a positive operator. If $k\in\mathbb{N}$ then

 $\max_{\dim S=k}\min_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle=% \lambda_{k}(A)=\sigma_{k}(A)$

and

 $\min_{\dim S=k-1}\max_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x% \right\rangle=\lambda_{k}(A)=\sigma_{k}(A).$
###### Proof.

As $A$ is compact and positive, according to Theorem 6 there is an orthonormal set $\{e_{n}:n\in\mathbb{N}\}$ such that

 $A=\sum_{n=1}^{\infty}\sigma_{n}(A)e_{n}\otimes e_{n}.$

For $k\in\mathbb{N}$, let $S_{k}=\bigvee_{n=k}^{\infty}\{e_{n}\}$. Then $S_{k}^{\perp}=\bigvee_{n=1}^{k-1}\{e_{n}\}$, so $S_{k}$ has codimension $k-1$. (The codimension of a closed subspace of a Hilbert space is the dimension of its orthogonal complement.) If $S$ is a $k$ dimensional subspace of $H$, then there is some $x\in S_{k}\cap S$ with $\left\|x\right\|=1$. This is because if $V$ is a closed subspace with codimension $k-1$ of a Hilbert space and $W$ is a $k$ dimensional subspace of the Hilbert space, then their intersection is a subspace of nonzero dimension. As $x\in S_{k}$, there are $\alpha_{n}\in\mathbb{C}$, $n\geq k$, with

 $x=\sum_{n=k}^{\infty}\alpha_{n}e_{n},\qquad\left\|x\right\|^{2}=\sum_{n=k}^{% \infty}|\alpha_{n}|^{2}.$

As the sequence $\sigma_{n}(A)$ is nonincreasing,

 $\displaystyle\left\langle Ax,x\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\left\langle x,e_{n}% \right\rangle e_{n},x\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\left\langle\sum_{m=k% }^{\infty}\alpha_{m}e_{m},e_{n}\right\rangle e_{n},\sum_{m=k}^{\infty}\alpha_{% m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\sum_{m=k}^{\infty}% \alpha_{m}\delta_{m,n}e_{n},\sum_{m=k}^{\infty}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\alpha_{n}\chi_{\geq k% }(n)e_{n},\sum_{m=k}^{\infty}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=k}^{\infty}\sigma_{n}(A)\alpha_{n}e_{n},\sum_% {m=k}^{\infty}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\sum_{n=k}^{\infty}\sigma_{n}(A)|\alpha_{n}|^{2}$ $\displaystyle\leq$ $\displaystyle\sigma_{k}(A)\sum_{n=k}^{\infty}|\alpha_{n}|^{2}$ $\displaystyle=$ $\displaystyle\sigma_{k}(A),$

where we write

 $\chi_{\geq k}(n)=\begin{cases}1&n\geq k\\ 0&n<k.\end{cases}$

This shows that if $\dim S=k$ then

 $\inf_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle\leq\sigma_{k}(A).$

Let $M=\inf_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle$, and let $x_{n}\in S$, $\left\|x_{n}\right\|=1$, with $\left\langle Ax_{n},x_{n}\right\rangle\to M$. As $S$ is a finite dimensional Hilbert space, the unit sphere in it is compact, so there is a subsequence $x_{a(n)}$ that converges to some $z\in S$, $\left\|z\right\|=1$. We have

 $\displaystyle|\left\langle Az,z\right\rangle-\left\langle Ax_{n},x_{n}\right\rangle|$ $\displaystyle\leq$ $\displaystyle|\left\langle Az,z\right\rangle-\left\langle Ax_{n},z\right\rangle|+|\left\langle Ax_{n},z\right\rangle-\left\langle Ax_{n},x_{n}\right\rangle|$ $\displaystyle=$ $\displaystyle|\left\langle A(z-x_{n}),z\right\rangle|+|\left\langle Ax_{n},z-x_{n}\right\rangle|$ $\displaystyle\leq$ $\displaystyle\left\|A\right\|\left\|z-x_{n}\right\|\left\|z\right\|+\left\|A\right\|\left\|x_{n}\right\|\left\|z-x_{n}\right\|$ $\displaystyle=$ $\displaystyle 2\left\|A\right\|\left\|z-x_{n}\right\|.$

As $x_{a(n)}\to z$, we get $\left\langle Ax_{a(n)},x_{a(n)}\right\rangle\to\left\langle Az,z\right\rangle$. As $\left\langle Ax_{n},x_{n}\right\rangle\to M$, we get

 $\left\langle Az,z\right\rangle=M=\inf_{x\in S,\left\|x\right\|=1}\left\langle Ax% ,x\right\rangle.$

As $z\in S$ and $\left\|z\right\|=1$, we have in fact

 $\min_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle=\inf_{x\in S,% \left\|x\right\|=1}\left\langle Ax,x\right\rangle\leq\sigma_{k}(A).$

This is true for any $k$ dimensional subspace of $H$, so

 $\sup_{\dim S=k}\min_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle% \leq\sigma_{k}(A).$

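If $S=\bigvee_{n=1}^{k}\{e_{n}\}$ then $\dim S=k$, and if $x=\sum_{n=1}^{k}\alpha_{n}e_{n}\in S$ with $\left\|x\right\|=1$ then $\left\langle Ax,x\right\rangle=\sum_{n=1}^{k}\sigma_{n}(A)|\alpha_{n}|^{2}\geq\sigma_{k}(A)$, because the sequence $\sigma_{n}(A)$ is nonincreasing. Moreover $e_{k}\in S$, $\left\|e_{k}\right\|=1$, and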

 $\left\langle Ae_{k},e_{k}\right\rangle=\left\langle\sigma_{k}(A)e_{k},e_{k}% \right\rangle=\sigma_{k}(A),$

so in fact

 $\max_{\dim S=k}\min_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle=% \sigma_{k}(A),$

which is the first of the two formulas that we want to prove.

For $k\geq 1$, let $S_{k}=\bigvee_{n=1}^{k}\{e_{n}\}$. If $S$ is a $k-1$ dimensional subspace of $H$, then $S^{\perp}$ is a closed subspace with codimension $k-1$, so the intersection of $S_{k}$ and $S^{\perp}$ has nonzero dimension, and so there is some $x\in S_{k}\cap S^{\perp}$ with $\left\|x\right\|=1$. As $x\in S_{k}$ there are $\alpha_{1},\ldots,\alpha_{k}$ with $x=\sum_{n=1}^{k}\alpha_{n}e_{n}$, giving

 $\displaystyle\left\langle Ax,x\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\left\langle x,e_{n}% \right\rangle e_{n},x\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\left\langle\sum_{m=1% }^{k}\alpha_{m}e_{m},e_{n}\right\rangle e_{n},\sum_{m=1}^{k}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\sum_{m=1}^{k}\alpha_% {m}\delta_{m,n}e_{n},\sum_{m=1}^{k}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{\infty}\sigma_{n}(A)\alpha_{n}\chi_{\leq k% }(n)e_{n},\sum_{m=1}^{k}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle\sum_{n=1}^{k}\sigma_{n}(A)\alpha_{n}e_{n},\sum_{m=1}% ^{k}\alpha_{m}e_{m}\right\rangle$ $\displaystyle=$ $\displaystyle\sum_{n=1}^{k}\sigma_{n}(A)|\alpha_{n}|^{2}$ $\displaystyle\geq$ $\displaystyle\sigma_{k}(A)\sum_{n=1}^{k}|\alpha_{n}|^{2}$ $\displaystyle=$ $\displaystyle\sigma_{k}(A).$

This shows that

 $\sup_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x\right\rangle\geq% \sigma_{k}(A).$

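Define $M=\sup_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x\right\rangle$. If $M=0$ then, as $A$ is positive, $\left\langle Ax,x\right\rangle=0$ for every unit vector $x\in S^{\perp}$, and we take $z$ to be any such vector. Otherwise $M>0$. Because $M$ is a supremum, there is a sequence $x_{n}$ on the unit sphere in $S^{\perp}$ such that $\left\langle Ax_{n},x_{n}\right\rangle\to M$. The unit sphere in $S^{\perp}$ need not be compact, because $S^{\perp}$ is infinite dimensional, but the closed unit ball of the Hilbert space $S^{\perp}$ is weakly sequentially compact, so there is a subsequence $x_{a(n)}$ that converges weakly to some $z\in S^{\perp}$ with $\left\|z\right\|\leq 1$. As $A$ is compact, $Ax_{a(n)}\to Az$ in norm. As

 $|\left\langle Az,z\right\rangle-\left\langle Ax_{a(n)},x_{a(n)}\right\rangle|\leq|\left\langle Az,z-x_{a(n)}\right\rangle|+\left\|Az-Ax_{a(n)}\right\|,$

and both terms on the right-hand side tend to $0$, we have $\left\langle Az,z\right\rangle=M>0$, so $z\neq 0$. If $\left\|z\right\|<1$ then $z/\left\|z\right\|$ would be a unit vector in $S^{\perp}$ with $\left\langle A\frac{z}{\left\|z\right\|},\frac{z}{\left\|z\right\|}\right\rangle=\frac{M}{\left\|z\right\|^{2}}>M$, contradicting the definition of $M$; hence $\left\|z\right\|=1$. In either case we get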

 $\left\langle Az,z\right\rangle=M=\sup_{x\in S^{\perp},\left\|x\right\|=1}\left% \langle Ax,x\right\rangle,$

whence

 $\max_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x\right\rangle=\sup_{x% \in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x\right\rangle\geq\sigma_{k}(% A).$

As this is true for any $k-1$ dimensional subspace $S$,

 $\inf_{\dim S=k-1}\max_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x% \right\rangle\geq\sigma_{k}(A).$

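But for $S=S_{k-1}$, if $x\in S^{\perp}$ and $\left\|x\right\|=1$ then $\left\langle x,e_{n}\right\rangle=0$ for $n<k$, so, because the sequence $\sigma_{n}(A)$ is nonincreasing and by Bessel’s inequality, $\left\langle Ax,x\right\rangle=\sum_{n=k}^{\infty}\sigma_{n}(A)|\left\langle x,e_{n}\right\rangle|^{2}\leq\sigma_{k}(A)$. Moreover $e_{k}\in S^{\perp}$, $\left\|e_{k}\right\|=1$, and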

 $\left\langle Ae_{k},e_{k}\right\rangle=\left\langle\sigma_{k}(A)e_{k},e_{k}% \right\rangle=\sigma_{k}(A),$

which implies that

 $\min_{\dim S=k-1}\max_{x\in S^{\perp},\left\|x\right\|=1}\left\langle Ax,x% \right\rangle=\sigma_{k}(A).$
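The first formula can be tested numerically in finite dimensions, where the same statement holds for positive matrices. The following sketch assumes $H=\mathbb{R}^{3}$, $A=\mathrm{diag}(3,2,1)$ and $k=2$; it samples random two dimensional subspaces $S$, computes $\min_{x\in S,\left\|x\right\|=1}\left\langle Ax,x\right\rangle$ as the smaller eigenvalue of the $2\times 2$ compression of $A$ to $S$, and compares with $\sigma_{2}(A)=2$. The names `quad` and `min_on_plane` are illustrative.

```python
import math
import random

SIGMA = (3.0, 2.0, 1.0)   # A = diag(SIGMA) on R^3, so sigma_2(A) = 2

def quad(x, y):
    """<Ax, y> for the diagonal operator A."""
    return sum(s * xi * yi for s, xi, yi in zip(SIGMA, x, y))

def min_on_plane(u, v):
    """min of <Ax, x> over unit x in span{u, v}.

    Gram-Schmidt gives an orthonormal basis (u, w) of the plane; the minimum
    is the smaller eigenvalue of the 2x2 compression of A to the plane.
    """
    nu = math.sqrt(sum(t * t for t in u))
    u = [t / nu for t in u]
    c = sum(a * b for a, b in zip(u, v))
    w = [b - c * a for a, b in zip(u, v)]
    nw = math.sqrt(sum(t * t for t in w))
    w = [t / nw for t in w]
    a, b, d = quad(u, u), quad(u, w), quad(w, w)
    return (a + d) / 2 - math.hypot((a - d) / 2, b)

random.seed(0)
samples = [min_on_plane([random.gauss(0, 1) for _ in range(3)],
                        [random.gauss(0, 1) for _ in range(3)])
           for _ in range(1000)]
best = min_on_plane([1, 0, 0], [0, 1, 0])   # S = span{e_1, e_2}
print(max(samples), best)
```

No sampled subspace exceeds $\sigma_{2}(A)$, and the subspace spanned by $e_{1},e_{2}$ attains it, which is the pattern the proof establishes.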

If $A\in\mathscr{B}(H)$ is compact, then the eigenvalues of $|A|$ are equal to the singular values of $A$. Therefore the Courant min-max theorem, applied to the positive compact operator $|A|$, gives expressions for the singular values of a compact linear operator on a Hilbert space, whether or not the operator is itself self-adjoint.

Allahverdiev’s theorem (I. C. Gohberg and M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators in Hilbert Space, p. 28, Theorem 2.1; cf. J. R. Retherford, Hilbert Space: Compact Operators and the Trace Theorem, p. 75 and p. 106) gives an expression for the singular values of a compact operator that does not involve orthonormal sets, unlike Courant’s min-max theorem. Thus this formula makes sense for a compact operator from one Banach space to another.

###### Theorem 15 (Allahverdiev’s theorem).

Let $H$ be a Hilbert space and let $\mathscr{F}_{n}(H)$ be the set of bounded finite rank operators of rank $\leq n$. If $A\in\mathscr{B}_{0}(H)$ and $n\in\mathbb{N}$, then

 $\sigma_{n}(A)=\inf_{T\in\mathscr{F}_{n-1}}\left\|A-T\right\|.$
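For a $2\times 2$ matrix and $n=2$ the formula says that the distance from $A$ to the operators of rank $\leq 1$ is $\sigma_{2}(A)$. The sketch below is a numerical illustration, not a proof; it assumes $H=\mathbb{R}^{2}$ and a made-up matrix $A$ with $\sigma_{1}(A)=\sqrt{45}$ and $\sigma_{2}(A)=\sqrt{5}$, and the helper names `op_norm` and `distance_to_rank_one` are hypothetical. Every operator of rank $\leq 1$ on $\mathbb{R}^{2}$ has the form $u\otimes v$, so we sample $u$ and $v$ at random.

```python
import math
import random

def op_norm(M):
    """Operator norm of a real 2x2 matrix: sqrt of the top eigenvalue of M^T M."""
    (p, q), (r, s) = M
    a, b, d = p * p + r * r, p * q + r * s, q * q + s * s
    return math.sqrt((a + d) / 2 + math.hypot((a - d) / 2, b))

A = ((3.0, 0.0), (4.0, 5.0))   # sigma_1 = sqrt(45), sigma_2 = sqrt(5)
sigma2 = math.sqrt(5.0)

def distance_to_rank_one(u, v):
    """||A - T|| for the rank <= 1 operator T = u (x) v, i.e. T h = <h, v> u."""
    D = tuple(tuple(A[i][j] - u[i] * v[j] for j in range(2)) for i in range(2))
    return op_norm(D)

random.seed(1)
trials = [distance_to_rank_one([random.uniform(-5, 5) for _ in range(2)],
                               [random.uniform(-5, 5) for _ in range(2)])
          for _ in range(2000)]

# T = sigma_1 e_1 (x) f_1 with e_1 = (1, 3)/sqrt(10) and f_1 = (1, 1)/sqrt(2),
# which is the matrix [[1.5, 1.5], [4.5, 4.5]].
best = distance_to_rank_one([1.5, 4.5], [1.0, 1.0])
print(min(trials), best, sigma2)
```

The truncated singular value decomposition attains the infimum here, while none of the random rank one operators comes closer to $A$ than $\sigma_{2}(A)$.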

## 10 Schatten class operators

If $1\leq p<\infty$ and $A\in\mathscr{B}_{0}(H)$, we define

 $\left\|A\right\|_{p}=\left(\sum_{n\in\mathbb{N}}\sigma_{n}(A)^{p}\right)^{1/p},$

and define $\mathscr{B}_{p}(H)$ to be those $A\in\mathscr{B}_{0}(H)$ with $\left\|A\right\|_{p}<\infty$. In other words, an element of $\mathscr{B}_{p}(H)$ is a compact operator whose sequence of singular values is an element of $\ell^{p}$. We call an element of $\mathscr{B}_{p}(H)$ a Schatten class operator. We call elements of $\mathscr{B}_{1}(H)$ trace class operators and elements of $\mathscr{B}_{2}(H)$ Hilbert-Schmidt operators.

If $A\in\mathscr{B}_{0}(H)$ is positive, then, according to Theorem 6, there is an orthonormal set $\{e_{n}:n\in\mathbb{N}\}$ such that

 $A=\sum_{n\in\mathbb{N}}\lambda_{n}(A)e_{n}\otimes e_{n},$

where the series converges in the strong operator topology. As the $e_{n}$ are orthonormal, we have

 $A^{p}=\sum_{n\in\mathbb{N}}\lambda_{n}(A)^{p}e_{n}\otimes e_{n},$

which is itself a positive compact operator, and thus $\sigma_{n}(A^{p})=\sigma_{n}(A)^{p}$ for $n\in\mathbb{N}$. Therefore, if $A$ is a positive compact operator, then $\left\|A\right\|_{p}=\left\|A^{p}\right\|_{1}^{1/p}$.

If $A\in\mathscr{B}_{0}(H)$ and $n\in\mathbb{N}$, then $\sigma_{n}(|A|)=\sigma_{n}(A)$ and $\sigma_{n}(A^{*})=\sigma_{n}(A)$. Hence, if $1\leq p<\infty$ then

 $\left\||A|\right\|_{p}=\left\|A\right\|_{p},\qquad\left\|A^{*}\right\|_{p}=% \left\|A\right\|_{p}.$

As $|A|$ is compact and self-adjoint, it has an eigenvalue with absolute value $\left\|A\right\|$, from which it follows that if $1\leq p<\infty$ then $\left\|A\right\|\leq\left\|A\right\|_{p}$.
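As a small numerical illustration of these definitions, the sketch below computes Schatten norms from a finite list of singular values. It assumes the made-up matrix with rows $(3,0)$ and $(4,5)$, whose singular values are $\sqrt{45}$ and $\sqrt{5}$, and the function name `schatten_norm` is illustrative. It also compares $\left\|A\right\|_{2}$ with the square root of the sum of the squared matrix entries, a standard fact for matrices that is not proved in these notes.

```python
import math

def schatten_norm(singular_values, p):
    """(sum_n sigma_n^p)^(1/p) for a finite list of singular values."""
    return sum(s ** p for s in singular_values) ** (1.0 / p)

# Singular values of the matrix [[3, 0], [4, 5]]: the eigenvalues of
# A^T A = [[25, 20], [20, 25]] are 45 and 5.
sigma = [math.sqrt(45.0), math.sqrt(5.0)]

operator_norm = max(sigma)                  # ||A|| = sigma_1(A)
trace_norm = schatten_norm(sigma, 1)        # ||A||_1
hilbert_schmidt = schatten_norm(sigma, 2)   # ||A||_2
entrywise = math.sqrt(3 ** 2 + 0 ** 2 + 4 ** 2 + 5 ** 2)

print(operator_norm, trace_norm, hilbert_schmidt, entrywise)
```

In this example $\left\|A\right\|\leq\left\|A\right\|_{p}$ for each $p$ tried, in agreement with the inequality above.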

###### Theorem 16.

If $A\in\mathscr{B}_{0}(H)$, $B\in\mathscr{B}(H)$, and $k\in\mathbb{N}$, then

 $\sigma_{k}(BA)\leq\left\|B\right\|\sigma_{k}(A).$
###### Proof.

For all $x\in H$,

 $\displaystyle\left\langle(BA)^{*}BAx,x\right\rangle$ $\displaystyle=$ $\displaystyle\left\langle BAx,BAx\right\rangle$ $\displaystyle=$ $\displaystyle\left\|BAx\right\|^{2}$ $\displaystyle\leq$ $\displaystyle\left\|B\right\|^{2}\left\|Ax\right\|^{2}$ $\displaystyle=$ $\displaystyle\left\|B\right\|^{2}\left\langle Ax,Ax\right\rangle$ $\displaystyle=$ $\displaystyle\left\|B\right\|^{2}\left\langle A^{*}Ax,x\right\rangle.$

Applying the Courant min-max theorem to the positive operators $(BA)^{*}BA$ and $A^{*}A$, if $k\in\mathbb{N}$ then

 $\displaystyle\sigma_{k}((BA)^{*}BA)$ $\displaystyle=$ $\displaystyle\max_{\dim S=k}\min_{x\in S,\left\|x\right\|=1}\left\langle(BA)^{% *}BAx,x\right\rangle$ $\displaystyle\leq$ $\displaystyle\left\|B\right\|^{2}\max_{\dim S=k}\min_{x\in S,\left\|x\right\|=% 1}\left\langle A^{*}Ax,x\right\rangle$ $\displaystyle=$ $\displaystyle\left\|B\right\|^{2}\sigma_{k}(A^{*}A).$

But

 $\sigma_{k}((BA)^{*}BA)=\sigma_{k}(|BA|^{2})=\sigma_{k}(|BA|)^{2}=\sigma_{k}(BA% )^{2}$

and

 $\sigma_{k}(A^{*}A)=\sigma_{k}(|A|^{2})=\sigma_{k}(|A|)^{2}=\sigma_{k}(A)^{2},$

so taking the square root,

 $\sigma_{k}(BA)\leq\left\|B\right\|\sigma_{k}(A).$

Using Theorem 16, if $1\leq p<\infty$ then

 $\left\|BA\right\|_{p}=\left(\sum_{n\in\mathbb{N}}\sigma_{n}(BA)^{p}\right)^{1/p}\leq\left(\sum_{n\in\mathbb{N}}\left\|B\right\|^{p}\sigma_{n}(A)^{p}\right)^{1/p}=\left\|B\right\|\left\|A\right\|_{p}.$

The following theorem states that the Schatten classes are Banach spaces (Gert K. Pedersen, Analysis Now, revised printing, p. 124, E 3.4.4).

###### Theorem 17.

If $1\leq p<\infty$, then $\mathscr{B}_{p}(H)$ is a Banach space with the norm $\left\|\cdot\right\|_{p}$.

## 11 Weyl’s inequality

Weyl’s inequality relates the eigenvalues of a self-adjoint compact operator with its singular values (Peter D. Lax, Functional Analysis, p. 336, chapter 30, Lemma 7). We use the notation from Definition 5. For $N>\nu(A)$ the left hand side is equal to $0$ so the inequality is certainly true then.

###### Theorem 18 (Weyl’s inequality).

If $A\in\mathscr{B}_{0}(H)$ is self-adjoint and $N\leq\nu(A)$, then

 $\prod_{n=1}^{N}|\lambda_{n}(A)|\leq\prod_{n=1}^{N}\sigma_{n}(A).$
###### Proof.

Let

 $E_{N}=\bigvee_{n=1}^{N}\ker(A-\lambda_{n}(A)\mathrm{id}_{H}),$

which is finite dimensional. Check that $E_{N}$ is an invariant subspace of $A$, and let $A_{N}:E_{N}\to E_{N}$ be the restriction of $A$ to $E_{N}$. $A_{N}$ is a self-adjoint operator. As $E_{N}$ is spanned by eigenvectors for nonzero eigenvalues of $A$ it follows that $\ker A_{N}=\{0\}$, and as $E_{N}$ is finite dimensional, we get that $A_{N}$ is invertible. If $A_{N}$ has polar decomposition $A_{N}=U_{N}|A_{N}|$, then $U_{N}$ is invertible; if a partial isometry is invertible then it is unitary, so $U_{N}$ is unitary, and therefore the eigenvalues of $U_{N}$ all have absolute value $1$. As the determinant of a linear operator on a finite dimensional vector space is the product of its eigenvalues counting algebraic multiplicity,

 $\det|A_{N}|=\frac{1}{|\det U_{N}|}|\det A_{N}|=|\det A_{N}|=\prod_{n=1}^{N}|% \lambda_{n}(A)|.$ (1)

Let $P_{N}$ be the orthogonal projection onto $E_{N}$. If $v\in E_{N}$, then $AP_{N}v=A_{N}v$, and if $v\in E_{N}^{\perp}$ then $AP_{N}v=A(0)=0$. We get that

 $|AP_{N}|v=\begin{cases}|A_{N}|v&v\in E_{N}\\ 0&v\in E_{N}^{\perp},\end{cases}$

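and it follows that if $1\leq n\leq N$ then $\sigma_{n}(A_{N})=\sigma_{n}(AP_{N})$. As $A$ and $P_{N}$ are self-adjoint, $(AP_{N})^{*}=P_{N}A$, so using $\sigma_{n}(T^{*})=\sigma_{n}(T)$ and Theorem 16 we get

 $\sigma_{n}(AP_{N})=\sigma_{n}(P_{N}A)\leq\left\|P_{N}\right\|\sigma_{n}(A)\leq\sigma_{n}(A);$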

the second inequality is an equality unless $P_{N}=0$. We have shown that if $1\leq n\leq N$ then $\sigma_{n}(A_{N})\leq\sigma_{n}(A)$, and combining this with (1) gives us

 $\prod_{n=1}^{N}|\lambda_{n}(A)|=\det|A_{N}|=\prod_{n=1}^{N}\sigma_{n}(A_{N})% \leq\prod_{n=1}^{N}\sigma_{n}(A).$

###### Theorem 19.

If $0<p<\infty$, $A\in\mathscr{B}_{0}(H)$ is self-adjoint, and $N\in\mathbb{N}$, then

 $\sum_{n=1}^{N}|\lambda_{n}(A)|^{p}\leq\sum_{n=1}^{N}\sigma_{n}(A)^{p}.$
###### Proof.

Schur’s majorization inequality (Peter D. Lax, Functional Analysis, p. 337, chapter 30, Lemma 8; cf. J. Michael Steele, The Cauchy-Schwarz Master Class, p. 201, Problem 13.4) states that if $a_{1}\geq a_{2}\geq\cdots$ and $b_{1}\geq b_{2}\geq\cdots$ are nonincreasing sequences of real numbers satisfying, for each $N\in\mathbb{N}$,

 $\sum_{n=1}^{N}a_{n}\leq\sum_{n=1}^{N}b_{n},$

and $\phi:\mathbb{R}\to\mathbb{R}$ is a convex function with $\lim_{x\to-\infty}\phi(x)=0$, then for every $N\in\mathbb{N}$,

 $\sum_{n=1}^{N}\phi(a_{n})\leq\sum_{n=1}^{N}\phi(b_{n}).$

With the hypotheses of Theorem 18, for $1\leq n\leq\nu(A)$, define $a_{n}=\log|\lambda_{n}(A)|$ and $b_{n}=\log\sigma_{n}(A)$ and let $\phi(x)=e^{px}$. By Theorem 18 these satisfy the conditions of Schur’s majorization inequality, which then gives us for $1\leq N\leq\nu(A)$ that

 $\sum_{n=1}^{N}|\lambda_{n}(A)|^{p}\leq\sum_{n=1}^{N}\sigma_{n}(A)^{p}.$

If $n>\nu(A)$ then $\lambda_{n}(A)=0$. ∎
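The passage from products to sums of $p$th powers can be seen on concrete numbers. The sketch below uses made-up nonincreasing sequences in the roles of $|\lambda_{n}(A)|$ and $\sigma_{n}(A)$; they are not computed from an operator and are chosen only so that the product inequalities of Theorem 18 hold. It checks the hypothesis of Schur’s majorization inequality for $a_{n}$, $b_{n}$ and then its conclusion for $\phi(x)=e^{px}$. The names `partial_sums` and `conclusion` are illustrative.

```python
import math

def partial_sums(xs):
    """Running sums x_1, x_1 + x_2, ..."""
    out, total = [], 0.0
    for x in xs:
        total += x
        out.append(total)
    return out

# Made-up sample data: nonincreasing, with
# prod_{n<=N} lam_abs[n] <= prod_{n<=N} sigma[n] for each N.
lam_abs = [2.0, 2.0, 0.5]
sigma = [4.0, 1.0, 0.5]

a = [math.log(x) for x in lam_abs]   # a_n = log |lambda_n|
b = [math.log(x) for x in sigma]     # b_n = log sigma_n
hypothesis = all(sa <= sb + 1e-12
                 for sa, sb in zip(partial_sums(a), partial_sums(b)))

def conclusion(p):
    """Check sum_{n<=N} |lambda_n|^p <= sum_{n<=N} sigma_n^p for each N."""
    left = partial_sums([x ** p for x in lam_abs])
    right = partial_sums([x ** p for x in sigma])
    return all(l <= r + 1e-12 for l, r in zip(left, right))

print(hypothesis, [conclusion(p) for p in (0.5, 1.0, 2.0, 3.0)])
```

Taking logarithms turns the products in Theorem 18 into the partial sums required by the majorization inequality, and $\phi(a_{n})=|\lambda_{n}(A)|^{p}$, $\phi(b_{n})=\sigma_{n}(A)^{p}$ turn its conclusion into the statement of Theorem 19.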

## 12 Rayleigh quotients for self-adjoint operators

If $A\in\mathscr{B}(H)$ is self-adjoint, we define the Rayleigh quotient of $A$ by

 $f(x)=\frac{\left\langle Ax,x\right\rangle}{\left\langle x,x\right\rangle},% \quad x\in H,x\neq 0,\qquad f:H\setminus\{0\}\to\mathbb{R}.$

Let $X$ and $Y$ be normed spaces, $U$ an open subset of $X$, and $f:U\to Y$ a function. If $x\in U$ and there is some $T\in\mathscr{B}(X,Y)$ such that

 $\lim_{h\to 0}\frac{\left\|f(x+h)-f(x)-Th\right\|}{\left\|h\right\|}=0,$ (2)

then $f$ is said to be Fréchet differentiable at $x$, and $T$ is called the Fréchet derivative of $f$ at $x$ (Ward Cheney, Analysis for Applied Mathematics, p. 149); it does not take long to prove that if $T_{1},T_{2}\in\mathscr{B}(X,Y)$ both satisfy (2) then $T_{1}=T_{2}$. We denote the Fréchet derivative of $f$ at $x$ by $(Df)x$. $Df$ is a map from the set of all points at which $f$ is Fréchet differentiable to $\mathscr{B}(X,Y)$.

To say that $x$ is a stationary point of $f$ is to say that $f$ is Fréchet differentiable at $x$ and that the Fréchet derivative of $f$ at $x$ is the zero map.

###### Theorem 20.

If $A\in\mathscr{B}(H)$ is self-adjoint, then each eigenvector of $A$ is a stationary point of the Rayleigh quotient of $A$.

###### Proof.

If $\lambda$ is an eigenvalue of $A$ then, as $A$ is self-adjoint, $\lambda\in\mathbb{R}$. Let $v\neq 0$ satisfy $Av=\lambda v$. We have

 $f(v)=\frac{\left\langle Av,v\right\rangle}{\left\langle v,v\right\rangle}=% \frac{\left\langle\lambda v,v\right\rangle}{\left\langle v,v\right\rangle}=\lambda.$

For $h\neq 0$, using that $A$ is self-adjoint and that $\lambda\in\mathbb{R}$,

 $\displaystyle\frac{|f(v+h)-f(v)-0h|}{\left\|h\right\|}$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|}\cdot\left|\frac{\left\langle A(v+h),v+h\right\rangle}{\left\langle v+h,v+h\right\rangle}-\lambda\right|$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|\left\|v+h\right\|^{2}}\left|\left\langle A(v+h),v+h\right\rangle-\lambda\left\langle v+h,v+h\right\rangle\right|$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|\left\|v+h\right\|^{2}}\big{|}\left\langle Av,v\right\rangle+\left\langle Av,h\right\rangle+\left\langle Ah,v\right\rangle+\left\langle Ah,h\right\rangle$ $\displaystyle-\lambda\left\langle v,v\right\rangle-\lambda\left\langle v,h\right\rangle-\lambda\left\langle h,v\right\rangle-\lambda\left\langle h,h\right\rangle\big{|}$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|\left\|v+h\right\|^{2}}\big{|}\left\langle\lambda v,v\right\rangle+\left\langle\lambda v,h\right\rangle+\left\langle h,\lambda v\right\rangle+\left\langle Ah,h\right\rangle$ $\displaystyle-\lambda\left\langle v,v\right\rangle-\lambda\left\langle v,h\right\rangle-\lambda\left\langle h,v\right\rangle-\lambda\left\langle h,h\right\rangle\big{|}$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|\left\|v+h\right\|^{2}}\left|\left\langle Ah,h\right\rangle-\lambda\left\langle h,h\right\rangle\right|$ $\displaystyle=$ $\displaystyle\frac{1}{\left\|h\right\|\left\|v+h\right\|^{2}}|\left\langle Ah-\lambda h,h\right\rangle|.$

Therefore

 $\displaystyle\frac{|f(v+h)-f(v)-0h|}{\left\|h\right\|}$ $\displaystyle\leq$ $\displaystyle\frac{\left\|Ah-\lambda h\right\|\left\|h\right\|}{\left\|h\right\|\left\|v+h\right\|^{2}}$ $\displaystyle=$ $\displaystyle\frac{\left\|(A-\lambda\mathrm{id}_{H})h\right\|}{\left\|v+h\right\|^{2}}$ $\displaystyle\leq$ $\displaystyle\frac{\left\|A-\lambda\mathrm{id}_{H}\right\|\left\|h\right\|}{\left\|v+h\right\|^{2}}.$

As $h\to 0$ the right-hand side tends to $0$ (one of the terms tends to $0$, one doesn’t depend on $h$, and the denominator is bounded below in terms just of $v$ for sufficiently small $h$), showing that $0$ is the Fréchet derivative of $f$ at $v$. ∎
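The estimate in the proof can be observed numerically. The following sketch assumes $H=\mathbb{R}^{2}$ and a made-up symmetric matrix $A$ with eigenvalues $3$ and $1$; $v=(1,1)$ is an eigenvector for $\lambda=3$, and $\left\|A-\lambda\mathrm{id}_{H}\right\|=2$ for this particular $A$. It prints the difference quotient and the bound from the proof for increments $h$ of decreasing norm. The name `rayleigh` is illustrative.

```python
A = ((2.0, 1.0), (1.0, 2.0))   # symmetric; eigenvalues 3 and 1
v = (1.0, 1.0)                 # eigenvector of A for lam = 3
lam = 3.0
op_norm = 2.0                  # ||A - lam id|| for this A (eigenvalues 0 and -2)

def rayleigh(x):
    """f(x) = <Ax, x> / <x, x>."""
    ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
    return (ax[0] * x[0] + ax[1] * x[1]) / (x[0] * x[0] + x[1] * x[1])

rows = []
for t in (1e-1, 1e-2, 1e-3):
    h = (t * 0.6, -t * 0.8)                         # ||h|| = t
    x = (v[0] + h[0], v[1] + h[1])
    quotient = abs(rayleigh(x) - lam) / t           # |f(v+h) - f(v) - 0h| / ||h||
    bound = op_norm * t / (x[0] ** 2 + x[1] ** 2)   # ||A - lam id|| ||h|| / ||v+h||^2
    rows.append((t, quotient, bound))
    print(t, quotient, bound)
```

The quotient stays below the bound and both shrink proportionally to $\left\|h\right\|$, which is what it means for the zero map to be the Fréchet derivative of $f$ at $v$.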