Sums, series, and products in Diophantine approximation

Jordan Bell
Department of Mathematics, University of Toronto
Toronto, Ontario, Canada
(March 3, 2023)

There is not much that can be said for all x and for all n about the sum


However, for this and similar sums, series, and products, we can establish results for almost all x using the tools of continued fractions. We present in detail the appearance of these sums in the singular series for the circle method. One particular interest of the paper is the detailed proof of a striking result of Hardy and Littlewood, whose compact proof, which delicately uses analytic continuation, has not been written freshly anywhere since its original publication. This story includes various parts of late 19th century and early 20th century mathematics.

1 Introduction

In this paper we survey a class of estimates for sums, series, and products that involve Diophantine approximation. We both sort out a timeline of the literature on these questions and give careful proofs of many lesser known results. Rather than being an open-pit mine for the history of Diophantine approximation, this paper follows one vein as deep as it goes.

For x, we write x=minn|x-n|, the distance from x to a nearest integer. In this paper we give a comprehensive presentation of estimates for sums whose jth term involves jx and determine the abscissa of convergence and radius of convergence respectively of Dirichlet series and power series whose jth term involves jx. We also give a detailed proof of a result of Hardy and Littlewood that


for almost all x.

We either prove or state in detail and give references for all the material on continued fractions and measure theory that we use in this paper. Many of the results we prove in this paper do not have detailed proofs written in any books, and the proofs we give for results that do have proofs in books are often written significantly more meticulously here than anywhere else; in some cases the proofs in the literature are so sketchy that the proof we give is written from scratch, for example Lemma 9. Our presentation of Hardy and Littlewood’s estimate for k=1n|sinπkx| makes clear exactly what results in Diophantine approximation one needs for the proof.

In the next section we introduce the Bernoulli polynomials, the Euler-Maclaurin summation formula, and Euler’s constant, which we shall use in a few places. Because the calculations are similar to what we do in the rest of this paper and because we want to be comfortable using Bernoulli polynomials, we work things out from scratch rather than merely stating results as known. In the section after that we summarize various problems that involve sums of the type we are talking about in this paper.

2 The Bernoulli polynomials, the Euler-Maclaurin summation formula, and Euler’s constant

For k0, the Bernoulli polynomial Bk(x) is defined by

zexzez-1=k=0Bk(x)zkk!,|z|<2π. (1)

The Bernoulli numbers are Bk=Bk(0), the constant terms of the Bernoulli polynomials. For any x, using L’Hospital’s rule the left-hand side of (1) tends to 1 as z0, and the right-hand side tends to B0(x), hence B0(x)=1. Differentiating (1) with respect to x,


so B0(x)=0 and for k1 we have Bk(x)k!=Bk-1(x)(k-1)!, i.e.


Furthermore, integrating (1) with respect to x on [0,1] gives, since exz𝑑x=ez-1z,


hence 01B0(x)𝑑x=1 and for k1,


The first few Bernoulli polynomials are


The Bernoulli polynomials satisfy the following:

k=0Bk(x+1)zkk! =ze(x+1)zez-1


Bk(x+1)=kxk-1+Bk(x),k1,x. (2)

In particular, for k2, Bk(1)=Bk(0). The identity (2) yields Faulhaber’s formula, for k0 and positive integers a<b,

ambmk=Bk+1(b+1)-Bk+1(a)k+1. (3)

The following identity is the multiplication formula for the Bernoulli polynomials, found by Raabe [122, pp. 19–24, §13].

Lemma 1.

For k0, q1, and x,


Using (2) with x=qn+jq=n+jq,



m=qNq-1mk =n=1N-1j=0q-1(nq+j)k

Then by (3),


Let Fk,q(x)=Bk+1(qx)-qkj=0q-1Bk+1(x+j/q), which is a polynomial of degree k+1. The above means that for N1, Fk,q(N)=Fk,q(1), which implies that the polynomial Fk,q(x)-Fk,q(1) is identically 0 and, a fortiori, Fk,q(x) is identically 0:


Using Bk+1(x)=(k+1)Bk(x),


We remark that for prime p and for k0, one uses Lemma 1 to prove that there is a unique p-adic distribution μB,k on the p-adic integers p such that μB,k(a+pNp)=pN(k-1)Bk(a/pN) [80, p. 35, Chapter II, §4], called a Bernoulli distribution.

For x, let [x] be the greatest integer x, and let R(x)=x-[x], called the fractional part of x. Write 𝕋=/ and define the periodic Bernoulli functions Pk:𝕋 by


For k2, because Bk(1)=Bk(0), the function Pk is continuous. For f:𝕋 define its Fourier series f^: by


For k1, one calculates P^k(0)=0 and using integration by parts, P^k(n)=-1(2πin)k for n0. Thus for k1, the Fourier series of Pk is


For k2, n|P^k(n)|<, from which it follows that |n|NP^k(n)e2πint converges to Pk(t) uniformly for t𝕋. Furthermore, for t [102, p. 499, Theorem B.2],


Thus for example,


The Euler-Maclaurin summation formula is the following [102, p. 500, Theorem B.5]. If a<b are real numbers, K is a positive integer, and f is a CK function on an open set that contains [a,b], then

a<mbf(m) =abf(x)𝑑x+k=1K(-1)kk!(Pk(b)f(k-1)(b)-Pk(a)f(k-1)(a))

Applying the Euler-Maclaurin summation formula with a=1,b=n,K=2,f(x)=logx yields [102, p. 503, Eq. B.25]


Using e1+O(n-1)=1+O(n-1), taking the exponential of the above equation gives Stirling’s approximation,


Write an=-logn+1mn1m. Because xlog(1-x) is concave,


which means that the sequence an is nonincreasing. For f(x)=1x, because f is positive and nonincreasing,


hence an>0. Because the sequence an is positive and nonincreasing, there exists some nonnegative limit, γ, called Euler’s constant. Using the Euler-Maclaurin summation formula with a=1,b=n,K=1,f(x)=1x, as P1(x)=[x]-12,


which is


As 0R(x)x-2x-2, the function xR(x)x-2 is integrable on [1,); let C=1-1R(x)x-2. Since 0nR(x)x-2𝑑xnx-2𝑑x=n-1,


f. But -logn+1mn1mγ as n, from which it follows that C=γ and thus


3 Background

For x, let [x] be the greatest integer x and let R(x)=x-[x]. It will be handy to review some properties of x[x]. For n it is immediate that [x+n]=[x]. For x,y we have 0x-[x]+y-[y]<2, which means 0=[0][x-[y]+y-[y]]<[2]=2, and using [x+n]=[x] this is 0[x+y]-[x]-[y]<2, therefore


For m,n, n>1, and x,


For n a positive integer and for real x,


There is a unique ν, 1νn, such that [x+ν-1n]=[x] and [x+νn]=[x]+1, and therefore [R(x)+ν-1n]=0 and [R(x)+νn]=1, consequently R(x)+ν-1n<1 and R(x)+νn1, which means 1-νnR(x)<1-ν-1n, from which finally we get n[nx]-n[x]+ν<n+1 and so n=[nx]-n[x]+ν. But


and using ν=n-[nx]+n[x],


This identity is proved by Hermite [65, pp. 310–315, §V].

The Legendre symbol is defined in the following way. Let p be an odd prime, and let a be an integer that is not a multiple of p. If there is an integer b such that ab2(modp) then (ap)=1, and otherwise (ap)=-1. In other words, for an integer a that is relatively prime to p, if a is a square mod p then (ap)=1, and if a is not a square mod p then (ap)=-1. For example, one checks that there is no integer b such that b26(mod7), and hence (67)=-1, while 322(mod7), and so (27)=1.

If p and q are distinct odd primes, define integers uk, 1kp-12, by


namely, uk is the remainder of kq when divided by p. We have 1ukp-1. Let μ(q,p) be the number of k such that uk>p-12. It can be shown that [59, p. 74, Theorem 92]


this fact is called Gauss’s lemma. For example, for p=13 and q=3 we work out that


and hence μ(3,13)=2, so (-1)μ(3,13)=1; on the other hand, 423(mod13), hence (313)=1. With


it is known that [59, pp. 77-78, §6.13]


And it can be shown that [59, p. 76, Theorem 100]

S(q,p)+S(p,q)=p-12q-12. (4)


(pq)(qp) = (-1)μ(p,q)+μ(q,p)
= (-1)S(p,q)+S(q,p)
= (-1)p-12q-12.

This is Gauss’s third proof of the law of quadratic reciprocity in the numbering [7, p. 50, §20]. This proof was published in Gauss’s 1808 “Theorematis arithmetici demonstratio nova”, which is translated in [140, pp. 112–118]. Dirichlet [38, pp. 65–72, §§42–44] gives a presentation of the proof. Eisenstein’s streamlined version of Gauss’s third proof is presented with historical remarks in [93]. Lemmermeyer [95] gives a comprehensive history of the law of quadratic reciprocity, and in particular writes about Gauss’s third proof [95, pp. 9–10]. The formula (4) resembles the reciprocity formula for Dedekind sums [123, p. 4, Theorem 1].

Gauss obtains (4) from the following [140, p. 116, §5]: if x is irrational and n is a positive integer, then

k=1n[kx]+k=1[nx][kx]=n[nx], (5)

which he proves as follows. If [jx]<k[j+1x], then [kx]=j. Therefore

k=1n[kx] = j=1[nx]j([j+1x]-[jx])
= n[nx]-j=1[nx][jx].

Bachmann [6, pp. 654–658, §4] surveys later work on sums similar to (5); see also Dickson [35, Chapter X]. If m and n are relatively prime, then


and so



k=1n-1[kmn]=k=1n-1kmn-k=1n-1R(kmn)=(m-1)(n-1)2. (6)

There is also a simple lattice point counting argument [118, p. 113, No. 18] that gives (6).

In 1849, Dirichlet [37] shows that


where d(n) denotes the number of positive divisors of an integer n. (This equality is Dirichlet’s “hyperbola method”.) He then proves that


Hardy and Wright [59, pp. 264–265, Theorem 320] give a proof of this. Finding the best possible error term in the estimate for k=1nd(k) is “Dirichlet’s divisor problem”. Dirichlet cites the end of Section V Gauss’s Disquisitiones Arithmeticae as precedent for determining average magnitudes of arithmetic functions. (In Section V, Articles 302–304, of the Disquisitiones Arithmeticae, Gauss writes about averages of class numbers of binary quadratic forms, cf. [36, Chapter VI].)

Define (x) to be 0 if x+12; if x+12 then there is an integer mx for which |x-mx|<|x-n| for all integers nmx, and we define (x) to be x-mx. Riemann [125, p. 105, §6] defines


for any x, the series converges absolutely because |(-nx)|<12. Riemann states that if p and m are relatively prime and x=p2m, then




and hence that f is discontinuous at such points, and says that at all other points f is continuous; see Laugwitz [94, §2.1.1, pp. 183-184], Neuenschwander [108], and Pringsheim [119, p. 37] about Riemann’s work on pathological functions. The role of pathological functions in the development of set theory is explained by Dauben [32, Chapter 1, p. 19] and Ferreirós [49, Chapter V, §1, p. 152].

For any interval [a,b] and any σ>0, it is apparent from the above that there are only finitely many x[a,b] for which f(x-)-f(x+)>σ, and Riemann deduces from this that f is Riemann integrable on [a,b]; cf. Hawkins [63, p. 18] on the history of Riemann integration. Later in the same paper [125, p. 129, §13], Riemann states that the function


is not Riemann integrable in any interval.

In 1897, Cesàro [28] asks the following question (using the pseudonym, and anagram, “Rosace” [112, p. 331]). Let ϵ(x)=x-[x]-12. Is the series

n=1ϵ(nx)n (7)

convergent for all non-integer x? This is plausible because the expected value of ϵ(x) is 0. Landau [89] answers this question in 1901. Landau proves that if there is some g such that


where g is a nonnegative function such g(n)=o(n) and such that n=1g(n)n(n+1) converges, then (7) converges. And he proves that if x is rational then (7) diverges. We return to this series in §9.

Also in 1898, Franel [52] asks whether for irrational x and for ϵ>0 we have


Then in 1899, Franel [53] asks if we can do better than this: is the error term in fact O(1)? Cesàro and Franel each contributed many problems to L’Intermédiaire des mathématiciens, the periodical in which they posed their questions. Information about Franel is given in [82].

Lerch [96] answers Franel’s questions in 1904. If x is irrational and pq is a convergent of x (which we will define in §4), then using Theorem 2 (from §4) we can show that if 1kq then [kx]=[kpq]. Lerch uses this and (6) to show that if x is irrational and pq is a convergent of x then


Lerch states that if the continued fraction expansion of x has bounded partial quotients (defined in §4) then, for any positive integer n,


Lerch only gives a brief indication of the proof of this. This result is proved by Hardy and Littlewood in 1922 [57, p. 24, Theorem B3], and also in 1922 by Ostrowski [110, p. 81]. On the other hand, Lerch also constructs examples of x such that, for some positive integer c,


Nevertheless, in 1909 Sierpinski [135] proves that if x is irrational then


A bibliography of Lerch’s works is given in [139]. Lerch had written earlier papers on Gauss sums, Fourier series, theta functions, and the class number; many of his papers are in Czech, but some of them are in French, several of which were published in the Paris Comptes rendus. Several of Lerch’s papers are discussed in Cresse’s survey of the class number of binary quadratic forms [36, Chapter VI].

In 1899, a writer using the pseudonym “Quemquaeris” [121] (“quem quaeris” = “whom you seek”) asks if we can characterize ϕ(n) such that for all irrational θ the series


converges. In particular, the writer asks if ϕ(n)=1n! satisfies this. In the same year, de la Vallée-Poussin [34] answers this question. (There are also responses following de la Vallée-Poussin’s by Borel and Fabry.) For a given function ϕ(n), de la Vallée-Poussin shows that if we have an>1ϕ(qn-1) for all n, for an the nth partial quotient of θ and qn the denominator of the nth convergent of θ, then the series


will diverge. Hardy and Littlewood prove numerous results on similar series, e.g. for ϕ(n)=n-r for real r>1 and for certain classes of θ, in their papers on Diophantine approximation [61]. In 1931, Walfisz [152, p. 570, Hilfssatz 4] shows, following work of Behnke [13, p. 289, §16], that for almost all irrational x[0,1], if ϵ>0 then


where x=min(R(x),1-R(x)). Walfisz’s paper includes many results on related sums.

In 1916, Watson [153] finds the following asymptotic series for Sn=m=1n-1csc(mπn):


where γ is Euler’s constant and Bj are the Bernoulli numbers. Truncating the asymptotic series and rewriting gives


For example, computing S1000 directly we get S1000=4477.593932, and computing the right-hand side of the above formula without the error term we obtain 4477.594019 A cleaner derivation of the asymptotic series using the Euler-Maclaurin summation formula is given later by Williams [155].

Early surveys of Diophantine approximation are given by Bohr and Cramér [20, pp. 833–836, §39] and Koksma [81, pp. 102–110]. Hlawka and Binder [66] present the history of the initial years of the theory of uniform distribution. Narkiewicz [106, pp. 82–95, §2.5 and pp. 175–183 §3.5] gives additional historical references on Diophantine approximation. The papers of Hardy and Littlewood on Diophantine approximation are reprinted in [61]. Perron [113] and Brezinski [23] give historical references on continued fractions, and there is reliable material on the use of continued fractions by 17th century mathematicians in Whiteside [154]. Fowler [51] presents a prehistory of continued fractions in Greek mathematics.

4 Preliminaries on continued fractions



Let μ be Lebesgue measure on [0,1], and let Ω=[0,1].

For positive integers a1,,an, we define


For example, [1,1,1]=23.

Let be the set of positive integers. We call a a continued fraction, and we call an the nth partial quotient of a. If there is some K>0 such that anK for all n then we say that a has bounded partial quotients. We call [a1,,an] the nth convergent of a. For n1 let


with pn,qn positive integers that are relatively prime, and set


One can show by induction [44, p. 70, Lemma 3.1] that for n1 we have

(pnpn-1qnqn-1)=(0110)(a1110)(an110). (8)

We have


and from (8) we get for all n1 that




Since the an are positive integers we get by induction that for all n1,

pn2(n-1)/2,qn2(n-1)/2. (9)

In fact, setting F1=1,F2=1, Fn+1=Fn+Fn-1 for n2, with Fn the nth Fibonacci number, as an1 we check by induction that


Taking determinants of (8) gives us for all n1 that

pnqn-1-pn-1qn=(-1)n+1, (10)

and then by induction we have for all n1,


For any a, as n this sequence of sums converges and we denote its limit by v(a). We have for all n1,


Since the right hand side is an alternating series we obtain for n1,

|v(a)-pnqn|<1qnqn+1, (11)


|v(a)-pnqn|>1qnqn+1-1qn+1qn+2=an+2qnqn+21qn(qn+1+qn), (12)


p2q2<p4q4<<v(a)<<p3q3<p1q1. (13)

For xΩ, we say that pq, q>0, is a best approximation to x if qx=|qx-p| and qx>qx for 1q<q. The following theorem shows in particular that the convergents of a continued fraction a are best approximations to v(a) [127, p. 22, Chapter 2, §3, Theorem 1].

Theorem 2 (Best approximations).

Let a. For any n1,


If 1q<qn+1, then for any p,


By (11), |qnv(a)-pn|<1qn+1. But


So |qnv(a)-pn|<12, which means qnv(a)=|qnv(a)-pn|.

Write x=v(a) and let A=(pn+1pnqn+1qn). Applying (10),








In particular, μ,ν. Suppose by contradiction that ν=0. Then (q-μqn+1)x=p-μpn+1, and as x it must then be that q=μqn+1 and p=μpn+1. But q=μqn+1, 1q<qn+1, and μ are together a contradiction. Therefore ν0. Either μ=0 or μ0. For μ=0,


which is the claim. For μ0 we use the fact q=qn+1μ+qnν and 1q<qn+1. If μ,ν>0 then q<qn+1 is contradicted, and if μ,ν<0 then q1 is contradicted. Therefore μ and ν have different signs, say μ=(-1)N|μ| and ν=(-1)N+1|ν|. Furthermore, we get from (13) that



qx-p =μ(qn+1x-pn+1)+ν(qnx-pn)



which is the claim. ∎

The above theorem says, a fortiori, that the convergents of a are best approximations to v(a). It can also be proved that if pq, q>0, is a best approximation to v(a) then pq is a convergent of a [92, p. 9, Theorem 6]. Cassels [27, p. 2, Chapter I] works out the theory of continued fractions according to this point of view. Similarly, Milnor [101, p. 234, Appendix C] works out the theory of continued fractions in the language of rotations of the unit circle.

We define the Gauss transformation T:ΩΩ by T(x)=R(1x) for xΩ, and we define Φ:Ω by


One can check that if a then v(a)Ω [44, p. 73, Lemma 3.2]. (Namely, the value of a nonterminating continued fraction is irrational.) One can prove that v:Ω is injective [44, p. 75, Lemma 3.4], and for xΩ that [44, p. 78, Lemma 3.6]


Therefore Φ:Ω is a bijection. Moreover, Φ is a homeomorphism, when has discrete topology, has the product topology, and Ω has the subspace topology inherited from [44, p. 86, Exercise 3.2.2]. That and Ω are homeomorphic can also be proved without using continued fractions [2, p. 106, Theorem 3.68]. In descriptive set theory, the topological space 𝒩= is called the Baire space, and the Alexandrov-Urysohn theorem states that 𝒩 has the universal property that any nonempty Polish space that is zero-dimensional (there is a basis of clopen sets for the topology) and all of whose compact subsets have empty interior is homeomorphic to 𝒩 [75, p. 37, Theorem 7.7]. Some of Baire’s work on 𝒩 is described in [71, pp. 119–120] and [5, pp. 349, 372].

For I=[0,1] and for T:II, T(x)=R(1/x) for x>0 and T(0)=0. For k1 let Ik=(1k+1,1k), so if xIk then T(x)=x-1-k. Then for xI=[0,1],


For S={0}{k-1:k1}, IS=k1Ik, and for xIS,


and for xIk, k2<|T(x)|<(k+1)2. Differentiability and dynamical properties of the Gauss transformation are worked out by Cornfeld, Fomin and Sinai [30, pp. 165–177, Chapter 7, §4], as an instance of piecewise monotonic transformations.

For each n1 we define an:Ω by an(x)=(Φ(x))n. For example, e-2Ω, and it is known [92, p. 74, Theorem 2] that for k1,


The pattern for the continued fraction expansion of e seems first to have been worked out by Roger Cotes in 1714 [50], and was later proved by Euler using a method involving the Riccati equation [31].

For n1 and in, let


For xIn(i),


The following is an expression for the sets In(i) [69, p. 18, Theorem 1.2.2].

Theorem 3.

Let n1, i1n, and for pn=pn(x), qn=qn(x), xIn(i),

un(i)={pn+pn-1qn+qn-1n oddpnqnn even


vn(i)={pnqnn oddpn+pn-1qn+qn-1n even.



It follows from the above that for in, n1,


5 Diophantine conditions

For real τ,γ>0 let

𝒟(τ,γ) =q1,p{x[0,1]:|x-pq|γq-τ}

and let


We relate the sets 𝒟(τ) and continued fractions expansions [101, p. 241, Lemma C.6], cf. [158, p. 130, Proposition 2.4].

Lemma 4.

For τ>0 and xΩ, x𝒟(τ) if and only if there is some C=C(x)>0 such that qn+1(x)Cqn(x)τ-1 for all n1.


For xΩ, write qn=qn(x). By (12), qnx>1qn+1+qn, and by (11), qnx<1qn+1. Suppose x𝒟(τ), so there is some γ>0 such that x𝒟(τ,γ). Then


Suppose qn+1Cqnτ-1 for all n1. For q1, take qnq<qn+1. Using Theorem 2,


which means that x𝒟(τ,12C). ∎

For K a positive integer, let

K={xΩ:an(x)K for all n1},

so =K1K is the set of those xΩ with bounded partial quotients.

Lemma 5.

For xΩ, x𝒟(2) if and only if x.


Write an=an(x) and qn=qn(x). If x𝒟(2) then there is some γ>0 such that x𝒟(2,γ), hence for n1,


Now, qn+1=an+1qn+qn-1 for n1, so


which shows that x has bounded partial quotients.

If xK, let q1 and let qnq<qn+1. Using Theorem 2 and then (12),


As xK,


which means that x𝒟(2,1K+2). ∎

A complex number α is called an algebraic number of degree d, d0, if there is some polynomial f[x] with degree d such that f(α)=0 and if g[x] has degree <d and g(α)=0 then g=0. An algebraic number number of degree 2 is called a quadratic irrational. Let αΩ. It was proved by Euler [59, p. 144, Theorem 176] that if there is some p>0 and some L such that al+p(α)=al(α) for all lL then α is a quadratic irrational The converse of this was proved by Lagrange [59, p. 144, Theorem 177], namely that a quadratic irrational has eventually periodic partial quotients. For example, α=11-3Ω is a quadratic irrational, being a root of x2+6x-2, and one works out that a1(α)=3,a2(α)=6, and that al+2(α)=al(α) for l1. In particular, if αΩ is a quadratic irrational, then α has bounded partial quotients.

Liouville [59, p. 161, Theorem 191] proved that if xΩ is an algebraic number of degree d2, then x𝒟(d). The Thue-Siegel-Roth theorem [48, p. 55, Theorem 1.23] states that if xΩ is an algebraic number, then for any δ>0 there is some qδ1 such that for all qqδ,


See Schmidt [131, p. 195, Theorem 2B].

6 Sums of reciprocals

We are interested in getting bounds on the sum j=1m1jx. This is an appealing question because the terms 1jx are unbounded.

Rather than merely stating that k=11k=, we give more information by giving the estimate


where γ is Euler’s constant. Likewise, rather than merely stating that there are infinitely many primes, we state more information with [90, p. 102, §28]


for a certain constant B (namely “Merten’s constant”), or with [90, p. 226, §61]


Because |sin(πx)|=sin(πx)πx and


we have

1πj=1m1jxj=1m1|sinπjx|12j=1m1jx. (14)

Thus, getting bounds on j=1m1jx will give us bounds on j=1m1|sinπjx|.

Let ψ be a nondecreasing function defined on the positive integers such that ψ(h)>0 for h1 (for example, ψ(h)=log(2h)). Following Kuipers and Niederreiter [85, p. 121, Definition 3.3], we say that an irrational number x is of type <ψ if hx1hψ(h) for all integers h1. If ψ is a constant function, then we say that x is of constant type.

Lemma 6.

xΩ is of constant type if and only if it has bounded partial quotients.


Suppose that xΩ is of constant type. So there is some K>0 such that hx1hK for all integers h1. For n2 we have qn=anqn-1+qn-2, and hence, by (11),


showing that x has bounded partial quotients.

Suppose that xΩ has bounded partial quotients, say anK for all n1. Let h be a positive integer and take qnh<qn+1. Then first by Theorem 2 and then by (12),


and so


showing that x is of constant type. ∎

However, almost all x do not have bounded partial quotients [78, p. 60, Theorem 29]. Shallit [134] gives a survey on numbers with bounded partial quotients.

We state and prove a result of Khinchin’s [78, p. 69, Theorem 32] that we then use.

Lemma 7.

Let f be a positive function on the positive integers. If


then for almost all xΩ there are only finitely many q such that qx<f(q).


For each positive integer q, let Eq={tΩ:qt<f(q)}. If tEq, then there is some integer p with 0pq such that |t-pq|<f(q)q. It follows that




Let E=lim supqEq, i.e. E={tΩ:tEqfor infinitely many q}. Then by the Borel-Cantelli lemma [19, p. 59, Theorem 4.3] we have that μ(E)=0. Therefore, for almost all tΩ there are only finitely many q such that tEq, i.e., for almost all tΩ there are only finitely many q such that qt<f(q). ∎

The above lemma is proved in Benedetto and Czaja [15, p. 183, Theorem 4.3.3] using the fact that a function of bounded variation is differentiable almost everywhere. We outline the proof. Define F:[0,1]>0 by F(x)=f(q)q if x=aq, gcd(a,q)=1, 0aq, and F(x)=0 if xΩ. Writing 𝒜q={aq:0aq,gcd(a,q)=1}, for 0=t0<t1<<tN=1,


It follows that the total variation of F is 2q=1f(q) and hence F is a function of bounded variation. Because F has bounded variation, the set DF of x[0,1] at which F is differentiable is a Borel set with λ(DF)=1. Check that F(x)=0 for xDF, and using this, if anqnx with gcd(an,qn)=1 and 0anqn then for some N, if nN then |x-anq|F(qn)qn.

We use the above lemma to prove the following result.

Lemma 8.

Let ϵ>0. For almost all xΩ, there is some K>0 such that x is of type <K(logh)1+ϵ.



E={tΩ:ht<1h(logh)1+ϵfor infinitely many h}.

Since h=11h(logh)1+ϵ converges, we have by Lemma 7 that μ(E)=0. Let tΩE. Then ht1h(logh)1+ϵ for all sufficiently large h. It follows that there is some K such that t is of type <K(logh)1+ϵ. ∎

The following technical lemma is from Kuipers and Niederreiter [85, p. 130, Exercise 3.9]; cf. Lang [92, p. 39, Lemma].

Lemma 9.

Let xΩ be of type <ψ. If n0 and if 0h0<qn+1, then


Since pn and qn are relatively prime, the remainders of jpn, j=1,,qn, when divided by qn are all distinct. Then also, the remainders of jpn+h0pn, j=1,,qn, when divided by qn are all distinct. Let λj, j=1,,qn, be the remainder of jpn+h0pn when divided by qn. We have {λj:1jqn}={0,,qn-1}; let λj1=0, λj2=1, and λj3=qn-1.

Write x=pnqn+δnqnqn+1; by (11) we have |δn|<1. If j+h0<qn+1 then by Theorem 2 we have (j+h0)xqnx, and since x is of type <ψ we have


Let, for i=1,2,3,

Ai={1,if ji+h0<qn+1,0,if ji+h0qn+1.

If j+h0<qn+1 and jj1,j2,j3, then


It follows that

j+h0<qn+11jqn1(j+h0)x = A1(j1+h0)x+A2(j2+h0)x+A3(j3+h0)x
< 3qnψ(qn)+2k=1qn-11kqn.

But 1y<1R(y)+11-R(y) for y, so

k=1qn-11kqn < k=1qn-11R(kqn)+11-R(kqn)
= k=1qn-11kqn+11-kqn
= 2qnk=1qn-11k
< 3qnlogqn;

the last inequality is because, for all m1,


We now use Lemma 9 to obtain a bound on j=1m1jx in terms of the type of x. This is from Kuipers and Niederreiter [85, p. 131, Exercise 3.11]; cf. Lang [92, p. 39, Theorem 2].

Theorem 10.

If xΩ is of type <ψ, then for all m1 we have


We shall prove the claim by induction. Because x is of type <ψ, we have


so the claim is true for m=1. Take m>1 and assume that the claim is true for all 1m<m. We shall show that it is true for m.

Let qnm<qn+1. Either m<2qn or m2qn. In the first case, using Lemma 9 we have

j=1m1jx = j=1qn1jx+j=1m-qn1(j+qn)x
< 12qn(ψ(qn)+logqn)
< 12m(ψ(m)+logm).

In the second case, using the induction assumption (with m=m-qn) and Lemma 9 we have, because qn<m-qn,

j=1m1jx = j=1m-qn1jx+j=1qn1(j+m-qn)x
< 12(m-qn)(ψ(m-qn)+log(m-qn))+6qn(ψ(qn)+logqn)
< 12(m-qn)(ψ(m)+logm)+12qn(ψ(qn)+logqn)
= 12m(ψ(m)+logm)-12qn(ψ(m)-ψ(qn)+logm-logqn)

The claim is true in both cases, which completes the proof by induction. ∎

We can now establish for almost all xΩ a tractable upper bound on the sum j=1m1jx, and thus by (14) also on j=1m1|sinπjx|; cf. Lang [92, p. 44, Theorem 3].

Theorem 11.

Let ϵ>0. For almost all xΩ, we have


while if x has bounded partial quotients then


Let ϵ>0. By Lemma 8, for almost all xΩ there is some K such that x is of type <K(logh)1+ϵ. For such an x, it follows from Theorem 10 that for all m1,


If x has bounded partial quotients then by Lemma 6 it is of constant type, say ψ(m)=K for some K. It follows from Theorem 10 that for all m1,


For example, take x=-1+52, for which an(x)=1 for all n, and so in particular x has bounded partial quotients. In Figure 1 we plot 1mlogmj=1m1jx for m=5000,,40000. These computations suggest that there is some constant C for which j=1m1jx>Cmlogm for all m. In Theorem 13 we shall prove that for almost all xΩ there is such a C(x).

Figure 1: 1mlogmj=1m1jx for x=-1+52, for m=5000,,40000

However, the estimate in Theorem 11 is not true for all xΩ. Define a as follows. Let a1 be any element of . Then inductively, define an+1 to be any element of that is >qnn-1. Then for any n, using (11) and qn+1=an+1qn+qn-1>an+1qn we get


hence qnv(a)<1qnn, and then


Using ϵ=1, it is then straightforward to check that there is no constant C such that j=1m1jv(a)Cm(logm)2 for all m.

We will need the following lemma [19, p. 324, Lemma 3] to prove a theorem; cf. Khinchin [78, p. 63, Theorem 30].

Lemma 12.

If ϕ is a function defined on the positive integers such that ϕ(n)1 for all n and


then for almost all xΩ there are only finitely many n such that an(x)ϕ(n).


For a measurable set AΩ, define


in other words dγ(x)=1(1+x)log2dμ(x). Thus for a measurable set AΩ we have




We will use that γ is an invariant measure for the Gauss transformation T:ΩΩ [44, p. 77, Lemma 3.5], i.e., if AΩ is a measurable set then


Let An={xΩ:an(x)ϕ(n)}, n1. As


we have

An {xΩ:1Tn-1(x)>ϕ(n)}
= {xΩ:Tn-1(x)<1ϕ(n)}
= (Tn-1)-1([0,1ϕ(n))).


μ(An) μ((Tn-1)-1([0,1ϕ(n))))
= 2log2γ([0,1ϕ(n)))
= 2ϕ(n).

It follows that


and thus by the Borel-Cantelli lemma [19, p. 59, Theorem 4.3] we have

μ(lim supnAn)=0.

Let λ be Lebesgue measure on I=[0,1], let dγ(x)=1(1+x)log2dλ(x), and let T:II be the Gauss transformation, T(x)=x-1-[x]-1 for x>0 and T(0)=0, for which T*γ=γ [44, p. 77, Lemma 3.5]. Suppose that ν is a Borel probability measure on [0,1] such that the pushforward measure T*ν is absolutely continuous with respect to ν. For fL1(ν), define dνf=fdν, and define Pν:L1(ν)L1(ν) by


Thus, for gL(ν), using the change of variables formula,


in particular,


We call Pν:L1(ν)L1(ν) a Perron-Frobenius operator for T. It is a fact that if f0 then Pνf0 [69, p. 57, Proposition 2.1.1], namely Pν0. It can be proved that for fL1(γ), for almost all xI [69, p. 59, Proposition 2.1.2],


and for fL1(λ), for almost all xI [69, p. 60, Corollary 2.1.4],


and with g(x)=(x+1)f(x), for n1 it holds for almost all xI that (Pλnf)(x)=(Pγng)(x)x+1. Iosifescu and Kraaikamp [69, Chapter 2] give a detailed presentation of Perron-Frobenius operators for the Gauss map. We make the final remark that Pν1I=1I is equivalent with I1E𝑑ν=I1T-1(E)𝑑ν for all Borel sets E in I, i.e. ν(E)=ν(T-1(E)), which in turn means T*ν=ν, cf. Markov operators [83, Chapter 5, §5.1, pp. 177-186]. An object similar to Perron-Frobenius operators for the Gauss transformation is the zeta-function for the Gauss transformation, for which see Lagarias [87, p. 58, §3.3].

The following theorem gives a lower bound on the sum j=1m1jx, cf. [150, p. 4, Theorem 3.1].

Theorem 13.

For almost all xΩ there is some C>0 such that


For all xΩ, if n1 then qn2n-12, by (9). Take ϕ(n)=2n-22. The series n=11ϕ(n) converges, so by Lemma 12, for almost all xΩ there are only finitely many n such that anϕ(n). That is, for almost all xΩ there is some n0 such that if nn0 then


Hence, if nn0 then


It follows that for almost all xΩ there is some K such that

qn+1<Kqn2 (15)

for all n0.

For such an x, let m be a positive integer and let qnm<qn+1. For 1jm we have by (11),


Therefore for 1jm we have


Let L=[mqn], so Lqnm. Then,

j=1m1jx > j=1m11qn+jpnqn
= qnl=0L-1h=1qn11+qnhpnqn
= Lqnh=1qn11+qnhpnqn
= Lqnk=0qn-111+qnkqn
> Lqnlogqn.

But if y1 then [y]>y2, so L=[mqn]>m2qn. Hence by (15),


and thus there is some C>0 such that j=1m1jx>Cmlogm for all m1. ∎

The following is from Kuipers and Niederreiter [85, p. 131, Exercise 3.12].

Theorem 14.

If xΩ is of type <ψ, then for all m1 we have


Summation by parts is the following identity, which can be easily checked:


Let aj=1j, let sj=h=1j1hx, let b1=0, and let bj=sj-1 for j2. Doing summation by parts gives


As x is of type <ψ, we can use Theorem 10 to get sj<12j(ψ(j)+logj) for each j1. Therefore

j=1m1jjx < 1m+112m(ψ(m)+logm)+j=1m12(ψ(j)+logj)j+1
< 12(ψ(m)+logm)+12(logm)2+12j=1mψ(j)j+1
< 24(logm)2+12ψ(m)+12j=1mψ(j)j+1.

Erdős [45] proves that for almost all x,


Kruse [84] gives a comprehensive investigation of the sums j=1m1jsjxt, s,t0. The results depend on whether s and t are are less than, equal, or greater than 1, and on whether t<s. One of the theorems proved by Kruse is the following [84, p. 260, Theorem 7]. If t>1 and 0st, and if ϵ>0, then for almost all x we have


Haber and Osgood [56, p. 387, Theorem 1] prove that for real t1, A>1, M>0, r>0, there is some C=C(t,A,M,r)>0 such that for all xΩ satisfying qn+1(x)<Mqn(x)r, for all positive integers K,


We remind ourselves that according to Theorem 4, the elements of 𝒟(r+1) are those xΩ for which there is some c(x)>0 such that qn+1(x)C(x)qn(x)r for all n1.

For x+12, define {{x}}=12. If x+12, then there is an integer mx for which |x-mx|<|x-n| for all integers nmx, and we define {{x}}=x-mx. Sinai and Ulcigrai [138, p. 96, Proposition 2] prove that if α has bounded partial quotients, then there is some C(α) such that for all M,


7 Weyl’s inequality, Vinogradov’s estimate, Farey fractions, and the circle method



We first prove four estimates following Nathanson [107, pp. 104–110, Lemmas 4.8–4.11] that we will use in what follows; cf. Vinogradov [149, p. 26, Chapter I, Lemma 8b].

Lemma 15.

There is some C such that if α, (a,q)𝒜, and




For q=1, 1rq/21αr=0. For q2, let 1rq2. As gcd(a,q)=1 and r0(modq), ar0(modq). So for μr=[arq], there is some 1σrq-1 such that ar=μrq+σr. Then


Put srq=arq, so (i) sr=σr or (ii) sr=q-σr. In case (i), srq=arq-μr. In case (ii), srq=1-(arq-μr). In case (i) let ϵr=1,mr=μr, and in case (ii) let ϵr=-1,mr=μr+1. Thus, whether (i) or (ii) holds we have




for some real θ, |θ|1. For θr=2rqθ, which satisfies |θr||θ|1,



αr =arq+θr2q

Take 1r1,r2q2 and suppose that sr1=sr2. So


hence ar1ϵr1ϵr2ar2(modq). As gcd(a,q)=1, r1ϵr1ϵr2r2(modq). Because 1r1,r2q, if r1r2(modq) then r1=r2 and if r1-r2(modq) then r1=q2 and r2=q2, so in any case r1=r2. Therefore


Using the two things we have established,

1rq/21αr 1rq/21srq-12q

Lemma 16.

There is some C such that if α, (a,q)𝒜, and


then for any positive real V and nonnegative integer h,




which satisfies |θ|1, and for 1rq define


which satisfies -1δr<2. Then

α(hq+r) =(aq+θq2)(hq+r)

For mr=[ar+[θh]+δrq2],


Suppose that t[0,1-1q] and that tR(α(hq+r))t+1q. Then


This implies, as δr-1,


and, as δr<2,


so ar-qmrJt, writing


For 1r1,r2q, if ar1-qmr1=ar2-qmr2 then ar1ar2(modq), and gcd(a,q)=1 implies r1r2(modq); and 1r1,r2q so r1=r2. For t[0,1-1q], four integers belong to Jt, hence


has at most four elements. But





{1rq:α(hq+r)[t,t+1q]} {1rq:ar-qmrJt}

This shows that if t[0,1-1q] then


has at most eight elements. For 0k<q2, writing


the set {1rq:α(hq+r)Ik} has at most eight elements. Therefore

1rqmin(V,1α(hq+r)) =0k<q/21rq,α(hq+r)Ikmin(V,1α(hq+r))

Lemma 17.

There is some C such that if α, (a,q)𝒜,


U1 is a real number, and n is a positive integer, then


For 1kU there is some 0hk<Uq and 1rkq such that k=qhk+rk, and then

1kUmin(nk,1αk) 0h<U/q1rqmin(nqh+r,1α(hq+r))

the last inequality by Lemma 15. If q2<rq then 1r<2q=2(h+1)q for h=0, and if 1h<Uq and 1rq then hh+12 so hq+r>hq(h+1)q2 and hence 1hq+r<2(h+1)q, whence




Lemma 16 with V=n(h+1)q says



1kUmin(nk,1αk) qlogq+0h<U/q(n(h+1)q+qlogq)

If Uq then Uq+122qU, and if U>q then Uq+1U+12U2qU, hence


Lemma 18.

There is some C such that if α, (a,q)𝒜,


and U,V1 are real numbers, then


For 1kU there is some 0hk<Uq and 1rkq such that k=qhk+rk, and then, as in the proof of Lemma 17,

1kUmin(V,1αk) 0h<U/q1rqmin(V,1α(hq+r))

Using Lemma 16,

1kUmin(V,1αk) C1qlogq+2C20h<U/q(V+qlogq)

Weyl’s inequality [107, p. 114, Theorem 4.3] is the following. For k2 and ϵ>0, there is some C(k,ϵ) such that if α, f(x) is a real polynomial with highest degree term αxk, (a,q)𝒜, and


then, writing SN(f)=j=1Ne2πif(j) and K=2k-1,


Weyl’s inequality is proved using Lemma 18.

Montgomery [103, Chapter 3] gives a similar but more streamlined presentation of Weyl’s inequality. Chandrasekharan [29] gives a historical survey of exponential sums.

Vinogradov’s estimate [146, p. 26, Theorem 3.1] states that there is some C such that for n2, 1qn, gcd(a,q)=1, and |α-aq|q-2, then

|fn(α)|C(nq-1/2+n4/5+n1/2q1/2)(logn)4, (16)

where fn(α)=pn(logp)e2πiαp; cf. Nathanson [107, p. 220, Theorem 8.5] and Vinogradov [149, p. 131, Chapter IX, Theorem 1]. This is proved using Lemma 17.

Fix B>0 and let Pn=(logn)B. For 1aqPn and gcd(a,q)=1, let


called a major arc. One checks that there is some nB such that if nnB, then 𝔐n(q,a) and 𝔐n(q,a) are disjoint when (q,a)(q,a). Let


The Farey fractions of order N are


Cf. the Stern-Brocot tree [55, §4.5]. For early appearances of Farey fractions, see Dickson [35, pp. 155–158, Chapter V]. It is proved by Cauchy that if h/k and h/k are successive elements of 𝔉N, then kh-hk=1 [59, p. 23, Theorem 28]. Let


the Euler phi function, and write Φ(N)=1mNϕ(m). One sees that |𝔉N|=1+Φ(N), and it was proved by Mertens [59, p. 268] that


Let λ be Lebesgue measure on . For nnB, because the major arcs are pairwise disjoint,

λ(𝔐n) =1aqPn,gcd(a,q)=12Pnn-1

Let In=(Pnn-1,1+Pnn-1], which for n>2Pn2 contains 𝔐n. Let


called the minor arcs.

With fn(α)=pn(logp)e2πiαp and


we have


Using (16), it can be proved that for A>0 with B2A+10 [146, p. 29, Theorem 3.2],




called the singular series, it is proved, using the Siegel-Walfisz theorem on primes in arithmetic progressions [102, p. 381, Corollary 11.19], that for A>0 with B2A [146, p. 31, Theorem 3.3],




and it follows from this that there is some n0 such that if nn0 is odd then there are primes p1,p2,p3 such that n=p1+p2+p3.

For integers a,b with gcd(a,b)=1, the Ford circle C(a,b) is the circle in that touches the line Imz=0 at z=ab and has radius 12b2; in other words, C(a,b) is the circle in with center ab+i2b2 and radius 12b2. It is straightforward to prove that if C(a,b) and C(c,d) are Ford circles, then they are tangent if and only if (bc-ad)2=1, and otherwise they are disjoint [4, p. 100, Theorem 5.6]. It is also straightforward to prove [4, p. 101, Theorem 5.7] that if h1k1<hk<h2k2 are successive elements of 𝔉N, then C(h1,k1) and C(h,k) touch at


and C(h,k) and C(h2,k2) touch at


Bonahon [21, pp. 207 ff., Chapter 8] explains Ford circles in the language of hyperbolic geometry.

We remind ourselves that |𝔉N|=1+Φ(N)=1+m=1Nϕ(m) and let ρ0,N<<ρΦ(N),N be the elements of 𝔉N. In particular, ρ0,N=0 and ρΦ(N),N=1. For ρn,N=hn,Nkn,N with gcd(hh,N,kn,N)=1, write Cn,N=C(hn,N,kn,N).

Let A0,N be the clockwise arc of C0,N from i to the point at which C0,N and C1,N touch. For 0<n<Φ(N), let An,N be the clockwise arc of Cn,N from the point at which Cn-1,N and Cn,N touch to the point at which Cn,N and Cn+1,N touch. Finally, let AΦ(N),N be the clockwise arc of CΦ(N),N from the point at which CΦ(N)-1,N and CΦ(N),N touch to i+1. Let AN be the composition of the arcs

A0,N,,AΦ(N),N, (17)

which is a contour from i to i+1.

Write H={τ:Imτ>0}. The Dedekind eta function η:H is defined by


It is straightforward to check that η is analytic and that η(τ)0 for all τH [143, pp. 17–18, §1.44]. For (h,k)𝒜, let


called a Dedekind sum; P1 is the periodic Bernoulli function. Also, for (abcd)SL2() write


The functional equation for the Dedekind eta function [4, p. 52, Theorem 3.4] is


Let p(n) be the number of ways of writing n as a sum of positive integers where the order does not matter, called the partition function. For example, 4,3+1,2+2,2+1+1,1+1+1+1 are the partitions of 4, so p(4)=5. Denoting by D(0,1) the open disc with center 0 and radius 1, define F:D(0,1) by


that the product and the series are equal was found by Euler. F is analytic. On the one hand, p(n)=F(n)(0)n!, and on the other hand, by Cauchy’s integral formula [143, p. 82, Theorem 2.41], if C is a circle with center 0 and radius 0<R<1 then


Taking C to be the circle with center 0 and radius e-2π and doing the change of variable z=e2πiτ,

p(n) =12πiii+1F(e2πiτ)e2πi(n+1)τ2πie2πiτ𝑑τ

where we remind ourselves that AN is the contour (17) from i to i+1. Using this and the functional equation for the Dedekind eta function, Rademacher [4, p. 104, Theorem 5.10] proves that for n1,


for Ak(n)=0h<k,gcd(h,k)=1eπis(h,k)-2πinh/k.

We make a final remark about the Farey fractions. Writing the elements of 𝔉N as ρ0,N<<ρΦ(N),N, let ηn,N=ρn,N-nΦ(N) for 1nN. For example, Φ(5)=10 and




Landau [91], following work of Franel, proves that the Riemann hypothesis is true if and only if for every ϵ>0,


For example, for N=5, the left-hand side is 1130. See Narkiewicz [106, p. 40, §2.2.3].

8 Discrepancy and exponential sums

Discrepancy and Diophantine approximation are covered by Kuipers and Niederreiter [85, Chapters 1–2 ] and by Drmota and Tichy [41, §§1.1–1.4], especially [41, pp. 48–66, §1.4.1].

Let ω=(xn), n1, be a sequence of real numbers, and let E[0,1). For a positive integer N, let A(E;N;ω) be the number of xn, 1nN, such that R(xn)E. We say that the sequence ω is uniformly distributed modulo 1 if we have for all a and b with 0a<b1 that


It can be shown [85, p. 3, Corollary 1.1] that a sequence xn is uniformly distributed modulo 1 if and only if for every Riemann integrable function f:[0,1] we have

limN1Nn=1Nf(R(xn))=01f(t)𝑑t. (18)

Thus, if a sequence is uniformly distributed then the integral of any Riemann integrable function on [0,1] can be approximated by sampling according to this sequence. This approximation can be quantified using the notion of discrepancy.

It can be proved [85, p. 7, Theorem 2.1] that a sequence xn is uniformly distributed modulo 1 if and only if for all nonzero integers h we have


this is called Weyl’s criterion. But if xΩ, then

|n=1Ne2πihnx|=|1-e2πihNx1-e2πihx|2|1-e2πihx|=1|sinπhx|. (19)

We thus obtain the following theorem.

Theorem 19.

If xΩ then the sequence nx is uniformly distributed modulo 1.

The discrepancy of a sequence ω is defined, for N a positive integer, by


One proves that the sequence ω is uniformly distributed modulo 1 if and only if DN(ω)0 as N [85, p. 89, Theorem 1.1].

For f:[0,1], let V(f) denote the total variation of f. Koksma’s inequality [85, p. 143, Theorem 5.1] states that for any sequence ω=(xn), for any f:[0,1] of bounded variation, and for any positive integer N, we have

|1Nn=1Nf(R(xn))-01f(t)𝑑t|V(f)DN(ω). (20)

Following Kuipers and Niederreiter [85, p. 122, Lemma 3.2], we can bound the discrepancy of the sequence nx in terms of the sum on the left-hand side of Theorem 14

Lemma 20.

There is some C>0 such that for all xΩ, ω=(nx), and for all positive integers m we have


We shall use the following inequality, which lets us bound the discrepancy of a sequence in terms of exponential sums formed from the elements of the sequence. The Erdős-Turán theorem [85, p. 114, Eq. 2.42] states that there is some constant C>0 such that for any sequence ω=(xn) of real numbers, any positive integer N, and any positive integer m we have

DN(ω)C(1m+j=1m1j|1Nn=1Ne2πijxn|). (21)

Take xn=nx. For each j1, by (19) we have


But sint2πt for 0tπ2, so


Using this in (21) gives us


which is the claim. ∎

It follows from Theorem 14 and Lemma 20 (taking m=N) that if x is of type <K(logh)1+ϵ then

DN(ω)=O((logN)2+ϵN). (22)

Lemma 8 tells us that for almost all xΩ there is some K>0 such that x is of type <K(logh)1+ϵ, so for almost all xΩ, the bound (22) is true. It likewise follows that if x has bounded partial quotients then

DN(ω)=O((logN)2N). (23)

In fact, it can be proved that if xK then, for g=1+52 [85, p. 125, Theorem 3.4],


We use the above bounds in the proof of the following theorem.

Theorem 21.

Let ϵ>0. For almost all x we have


while if x has bounded partial quotients then


Let f(t)=t. Then V(f)=1 and 01f(t)𝑑t=14, so we get from Koksma’s inequality (20) that




The claims then follow respectively from (22) and (23). ∎

Like we mentioned at the beginning of §6, because |sin(πx)|=sin(πx)πx and |sin(πx)|=sin(πx)2ππx=2x, we have


Thus Theorem 21 also gives estimates for n=1N|sin(πnx)|.

We can investigate the sum n=1NR(nx) rather than n=1Nnx; see Lang [92, p. 37, Theorem 1], who proves that for almost all xΩ,


For xΩ, let qn=qn(x), the denominator of the nth convergent of the continued fraction expansion of x, and let an=an(x), the nth partial quotient of the continued fraction expansion of x. For m1, one can prove [18, p. 211, Proposition 1] that m can be written in one and only one way in the form

m=k=1zkqk-1=k=1tzkqk-1, (24)

where (i) 0z1a1-1, (ii) 0zkak for k2, (iii) for k1, if zk+1=ak+1 then zk=0, and (iv) zt0 and zk=0 for k>t. The expression (24) is called the Ostrowski expansion of m. We emphasize that this expansion depends on x. Berthé [18] surveys applications of this numeration system in combinatorics. For n0, define d2n=q2nx-p2n and d2n+1=p2n+1-q2n+1x. Brown and Shiue [24, p. 184, Theorem 1] prove that for xΩ,

k=1m(R(kx)-12)=k=1t(-1)kzk(12-dk-1(mk-1+12zkqk-1+12)), (25)

where m0=0 and if k1 then mk=j=1kzjqj-1. If k0, then by (11) we have 0<dk<1qk+1. For k1, using the fact that qkmk+1 (for the same reason that if the highest power of 2 appearing in a number’s binary expansion is 2k-1, then the number is 2k-1),

mk-1+12zkqk-1+12 = mk-12zkqk-1+12
= qk-12-12zkqk-1
< qk.

Using (25), this inequality, and the inequality 0<dk<1qk+1, we obtain

|k=1m(R(kx)-12)| k=1tzk|12-dk-1(mk-1+12zkqk-1+12)|
< 12k=1tzk
< 12k=1tak.

If the continued fraction expansion of x has bounded partial quotients, say akK for all k, we obtain from the above that


It can be proved [24, p. 185, Fact 2] that t<3logm. Thus, if akK for all k, then for m1,


This is Lerch’s claim stated in §3. For example, if x=-1+52Ω then ak(x)=1 for all k1. We compute that


on the other hand, we compute that


Brown and Shiue [24, p. 185, Fact 1] use (25) to obtain the result of Sierpinski stated in §3 that for all xΩ,


They also prove [24, p. 188, Theorem 4] that for A>0, there exists some dA>0 such that for xΩ, if there are infinitely many t such that k=1takAt (which happens in particular if x has bounded partial quotients), then there are infinitely many m such that


and there are infinitely many m such that


It can be shown [92, p. 44, Theorem 4] that if k is a positive integer and ϵ>0, then for almost all x we have


Lang attributes this result to Vinogradov. But it is not so easy to obtain a bound on this exponential sum for specific x. For k=2, one can prove [92, p. 45, Lemma] that for any xΩ,


cf. Steele [142, Problem 14.2]. By (14) this gives us


If x has bounded partial quotients, it follows from Theorem 11 that


Hardy and Littlewood [57, p. 28, Theorem B5] prove that if xΩ is an algebraic number, then there is some 0<α(x)<1 such that n=1NR(nx)=N2+O(Nα(x)). Pillai [115] gives a different proof of this.

Theorem 22 (Hardy and Littlewood, Pillai).

For τ>2, if x𝒟(τ) then for α=τ-2τ-1,


Pillai [114] proves other identities and inequalities for n=1NR(nx), some for all xΩ and some for all algebraic xΩ.

For ω=(xn), n1 and for E[0,1), we remind ourselves that A(E;M;ω) denotes the number of xn, 1nM, such that R(xn)E. Define for M1,


It is straightforward to prove that DM*DM2DM* [85, p. 91, Theorem 1.3]. For N1 write r=logN. Let ϵ0>0 and let N1/2τNexp(-rϵ0). Suppose that α and


For 0<β<1 denote by Hβ(N) the number of primes pN such that R(αp)β. Vinogradov [149, p. 177, Chapter XI, Theorem] proves that for ϵ>0,


Let ω=(pnα), n1, where pn is the nth prime. Using Vinogradov’s estimate, one proves that for α, DN*(ω)0 as N, which implies that the sequence (pnα) is uniformly distributed modulo 1. A clean proof of this is given by Pollicott [116, p. 200, Theorem 1], and this is also proved by Vaaler [145] using a Tauberian theorem. See also the early survey by Hua [68, pp. 98–99, §38].

Defining Sα(n)=k=1n(R(kα)-12), Beck [11, p. 14, Theorem 3.1] proves that there is some c>0 such that for every λ,


as N. Beck [12, p. 20, Theorem 1.2] further proves that if α is a quadratic irrational, there are C1=C1(α) and C2=C2(α)>0 such that for A,B, A<B,


Beck and Chen [10]

9 Dirichlet series

The result of de la Vallée-Poussin [34] stated in §3 implies that there is no s such that for all irrational x the Dirichlet series n=1n-s|sinnπx| converges. It follows from this fact that there is no s such that for all irrational x the Dirichlet series n=1n-snx converges.

Lerch [97] in 1904 gives some statements without proof about the series


He states that if ω is a real algebraic number that does not belong to , then for sufficiently large m this series converges, and states, for example, that with ω=1+52,


Writing cr(θ)=n=1cot(πnθ)n2r-1, Berndt [17, p. 135, Theorem 5.1] proves that if θ is a real algebraic number of degree d and 1<d<2r-1 (to say that d>1 is to say that θ is irrational), then cr(θ) converges.

For a Dirichlet series n=1ann-s, one can show [143, pp. 289–290, §9.11] that if the series is convergent at s=σ0+it0, then for σ>σ0 and any t the series is convergent at s=σ+it. It follows that there is some σ0[-,] such if σ<σ0 then the series diverges at s=σ+it, and if σ>σ0 then the series converges at s=σ+it. We call σ0 the abscissa of convergence of the Dirichlet series. If each an is a nonnegative real number, then the function


cannot be analytically continued to any domain that includes s=σ0 [67, p. 101, Proposition 18].

Let an be a sequence of complex numbers. It can be shown [143, pp. 292–293, §9.14] that if sn=a1++an and the sequence sn diverges, then the abscissa of convergence of the Dirichlet series n=1ann-s is given by

σ0=lim supnlog|sn|logn.

By Theorem 11 and Theorem 13 (taking, say, ϵ=1), for almost all xΩ there are C1,C2 such that for all positive integers n we have


Thus, if an=1nx, then


and hence


It follows that for almost all xΩ the abscissa of convergence of the Dirichlet series n=11nxn-s is σ0=1.

Likewise, by Theorem 21 (taking ϵ=1), we get for almost all xΩ that j=1njx=n4+O((logn)3). We can then check that limnlogsnlogn=1, and hence that the abscissa of convergence of the Dirichlet series n=1nxn-s is σ0=1.

A 1953 result of Mahler [48, pp. 107–108] implies that if α is an algebraic number of degree d, then, for m=[2025(d-1)2], the Dirichlet series


has abscissa of convergence σ0d(m+1)log(m+1), and the power series


has radius of convergence 1.

Rivoal [126] presents later work on similar Dirichlet series. See also Queffélec and Queffélec [120]. Lalín, Rodrigue and Rogers [88] prove results about Dirichlet series of the form n=1n-scos(nπz). Duke and Imamoḡlu [42] review Hardy and Littlewood’s work on estimating lattice points in triangles, and prove results about lattice points in cones.

For θΩ, write


where Pr is a periodic Bernoulli function. Spencer [141] proves that for any ϵ>0 and almost all θΩ,


Another result Spencer proves in this paper is that if qn(θ)=O(qn-1h), then Rr,θ(ζ,m)=O(m1-rh) for 1r<h. Schoißengeier [132] gives an explicit formula for k=0N-1P2(kα).

10 Power series

For a power series anzn with radius of convergence 0R, the Cauchy-Hadamard formula [124, p. 111, Chapter 4, §1] states

R=1lim supn|an|1/n=lim infn|an|-1/n. (26)

The radius of convergence R is equal to the supremum of those t0 for which |an|tn is a bounded sequence.

Lemma 23.

If xΩ, then the power series




have the same radius of convergence.


The radii of convergence of these power series are respectively

lim infnnx1/nandlim infn|sin(πnx)|1/n.

On the one hand,


Therefore, since limnπ1/n=1,

lim infn|sin(πnx)|1/nlim infn(πnx)1/n=lim infnnx1/n

On the other hand, since sint2πt for t0,


Therefore, using limn21/n=1, we have

lim infn|sin(πnx)|1/nlim infn(2nx)1/n=lim infnnx1/n,

showing that the two power series have the same radius of convergence. ∎

We show in the following theorem that for almost all x, the power series n=1znnx has radius of convergence 1.

Theorem 24.

For almost all xΩ, the power series

n=1znnx (27)

has radius of convergence 1.


For xΩ, let Rx be the radius of convergence of the power series (27). We have 0<nx<12, so 1nx>2. Therefore n=1N1nx as N, and so the power series (27) diverges at z=1. Therefore Rx1 for all xΩ.

We shall use Lemma 7 to get a lower bound on Rx that holds for almost all xΩ. Let A={xΩ:Rx<1}, let Am={xΩ:Rx<1-1m}, and let Bm be those xΩ such that nx1/n<1-1m infinitely often. If xAm, then Rx=lim infnnx1/n<1-1m, and this implies that there are infinitely many n such that nx1/n<1-1m, so xBm, i.e. AmBm. But let fm(n)=(1-1m)n. Then n=1fm(n) converges, since 1-1m<1, so, by Lemma 7, for almost all xΩ there are only finitely many n such that nx<fm(n). Thus μ(Bm)=0. Hence μ(Am)=0, and since A=m=2Am we get that


that is, Rx1 for almost all xΩ. In conclusion, Rx=1 for almost all xΩ. ∎

In fact, we can prove the above theorem using the bounds we obtained in Theorem 11. By Theorem 11, for almost all xΩ we have that j=1m1jx=O(m2). (Here we will merely need the fact that the sum is subexponential in m.) For such an x, take 0<r<1. Let an=rn, let sn=j=1n1jx, let b1=0, and let bn=sn-1 for n2. Using summation by parts, namely


we get




Since n=1Nrnnx is increasing in N (being a sum of positive terms), we obtain that the series n=1rnnx converges. Since this is true for all r with 0<r<1, it follows that Rx1.

For xΩ let Rx be the radius of convergence of the power series q=1zqqx. We proved in Theorem 27 that for almost all xΩ, Rx=1.

Theorem 25.

For xΩ, let Rx be the radius of convergence of the power series q=1zqqx, and let an=an(x) and qn=qn(x). Then

Rx=lim infnan+1-1/qn.

For any 0R1 there is some xΩ such that Rx=R.


From the Cauchy-Hadamard formula (26),

Rx=lim infqqx1/q.

Then Rxlim infnqnx1/qn. On the one hand, by (11), qnx<qn+1-1, and qn+1=an+1qn+qn-1>an+1qn hence qnx<an+1-1qn-1, and using that limnqn-1/qn=1,

Rxlim infnan+1-1/qnqn-1/qn=lim infnan+1-1/qn.

On the other hand, let q2 and take qnq<qn+1. Applying (12),


Then applying Theorem 2, and using that 0<qnx<1 and qqn,


As (4qn)-1/qn1 as n, this implies

Rx=lim infqqx1/qlim infnan+1-1/qn.

For 0<R<1, let R=e-r for r>0. Define a as follows. Define a1=1. Suppose for n1 that we have defined a1,,an and thus p1,,pn and q1,,qn. Define an+1=[erqn]. Then an+11/qner, so an+1-1/qne-r. Therefore for x=v(a), Rxe-r. Now, erqn>1 so an+1=[erqn]2-1erqn. Then an+11/qn2-1/qner, hence

Rx=lim infnan+1-1/qnlim infn21/qne-r=e-r.

We have therefore established that when x=v(a), Rx=e-r.

For R=0, define a by a1=1 and an+1=[enqn], which satisfies an+12-1enqn. For x=v(a),

Rx=lim infnan+1-1/qnlim infn21/qne-n=0.

For R=1, define a by an=1 for all n1. Namely, v(a)=-1+52Ω. For x=v(a) it is immediate that Rx1. ∎

Since R(nx)<1, of course the power series n=1R(nx)zn has radius of convergence 1. The following result, for which Pólya and Szegő [117, p. 280, Part II, No. 168] cite Hecke, shows in particular that the radius of convergence of this power series is 1 for xΩ and is thus equal to 1.

Theorem 26.

For xΩ, let


We have


Since xΩ, the sequence nx is uniformly distributed modulo 1. Therefore, with f(t)=te2πit we have by (18) that


We will use the following result [117, p. 21, Part I, No. 88]. If a sequence of complex numbers an satisfies limN1Nn=1Nan=s, then


Let an=R(nx)e2πinx, and we thus have


It follows from the above theorem that if xΩ then |z|=1 is a natural boundary of the function f defined on the open unit disc by f(z)=n=1R(nx)zn; cf. Segal [133, p. 255, Chapter 6], who writes about this power series, and who gives a thorough introduction to natural boundaries in the same chapter. Breur and Simon [22] prove a generalization of this result.

Hata [62, p. 173, Problem 12.6] mentions the appearance of the function f from the above theorem in the study of the Caianiello neuron equations.

11 Product

We will use the following lemma proved by Hardy and Littlewood [58, p. 89], whose brief proof we expand.

Lemma 27.

Let ψ:(0,) be positive and nondecreasing. If


then for almost all xΩ, there exists some H such that for all n1 and for all real hH, there are at most max{nψ(h)h,1} integers m{1,,n} that satisfy mx<1h.


By Lemma 7, for almost all xΩ there is some K such that if kK then

kx2kψ(k). (28)

Let H be large enough so that


also let ψ(H)1. Now suppose by contradiction that there is some n1 and some hH such that there are more than max{nψ(h)h,1} integers m{1,,n} that satisfy mx<1h. Then there are some 1m1<m2n satisfying m1x<1h and m2x<1h and such that


so μψ(h)<h. On the other hand,


so h<2μx. Thus μψ(h)<2μx, i.e.,


μ<h because μ<hψ(h) and ψ(h)ψ(H)1. Moreover, since μx<2h2H, we have μK. This contradicts (28). ∎

Hardy and Littlewood [58, p. 89, Theorem 4] prove the following theorem that gives us the conclusion (18) for certain functions that are not Riemann integrable on [0,1].

Theorem 28.

Let f:(0,1) be nonnegative, let f be nonincreasing on (0,12) and nondecreasing on (12,1), and let


Let ψ:(1,) be a positive and nondecreasing function such that




then for almost all xΩ,


For 0<δ<12, define


From Lemma 27, for almost all xΩ there is some C such that for all n and h there are at most Cnψ(h)h integers m{1,,n} satisfying mx<1h. Let





Sn2(δ) = 1nm=1nf(R(mx))-fδ(R(mx))
= 1nmx<δ1mnf(R(mx))
= 1nk=0δ2k+1mx<δ2k1mnf(R(mx))
= 1nk=0Tk,n(δ).

There are at most Cnψ(2kδ)2kδ integers m{1,,n} that satisfy mx<δ2k, thus, as ψ is nondecreasing, there are at most 2Cnδψ(2kδ)2k+12Cnδψ(2k+1δ)2k+1 terms in Tk,n(δ). For each term f(R(mx)) in Tk,n(δ), since δ2k+1mx we have, by assumption on f, either f(R(mx))f(δ2k+1) or f(R(mx))f(1-δ2k+1), and hence



Sn2(δ) 1nk=02Cnδψ(2k+1δ)2k+1(f(δ2k+1)+f(1-δ2k+1))
= 4Ck=0δ2k+2ψ(2k+1δ)f(δ2k+1)+4Ck=0δ2k+2ψ(2k+1δ)f(1-δ2k+1)

Let ϵ>0. Because 01f(t)(ψ(1t)+ψ(11-t))𝑑t<, there exists a δ1 such that if δδ1 then Sn2(δ)<ϵ.

On the other hand, since fδ is Riemann integrable on [0,1] and because, by (19), the sequence mx is uniformly distributed modulo 1, we obtain from (18) that


As 01f(t)𝑑t<, there exists a δ2 such that for δδ2 and for sufficiently large n,


Therefore, for sufficiently large n and for sufficiently small δ,


Thus for sufficiently large n,


By the Birkhoff ergodic theorem [44, p. 44, Theorem 2.30], if fL1[0,1] and xΩ, then for almost all α[0,1],


This equality holding for α=0 is the conclusion of Theorem 28.

Baxa [9] reviews further results that give conditions when a function f:[0,1]{+} that is not Riemann integrable on [0,1] nevertheless satisfies


for certain xΩ. Oskolkov [109, p. 170, Theorem 1] shows that if f:(0,1) satisfies limt0+f(t)=+ and limt1-f(t)=+, and also the improper Riemann integral of f on [0,1] exists, then, for xΩ,


if and only if


where qn(x) is the denominator of the nth convergent of the continued fraction expansion of x.

Driver, Lubinsky, Petruska and Sarnak [40]

Using Theorem 28 we can now prove the following theorem of Hardy and Littlewood [58, p. 88, Theorem 2].

Theorem 29.

For almost all xΩ,


Let f(t)=-logsinπt. Using cos(t-π2)=sint and sin2t=2sintcost, one can check that 01logsinπtdt=-log2. (The earliest evaluation of this integral of which we are aware is by Euler [46], who gives two derivations, the first using the Euler-Maclaurin summation formula, the power series expansion for log(1+z1-z), and the power series expansion of zcot(z), and the second using the Fourier series of log|sint|.) Thus, 01f(t)𝑑t=log2<. So f satisfies the conditions of Theorem 28.

Let ψ(t)=(logt)2. First, upper bounding the series by an integral,



01f(t)(ψ(1t)+ψ(11-t))𝑑t = 01f(t)(ψ(t)+ψ(1-t))𝑑t
= 012-2logsin(πt)
< .

Therefore by Theorem 28, for almost all xΩ,




Hardy and Littlewood give another proof [58, p. 86, Theorem 1] of the above theorem, which we now work out. This proof is complicated and we greatly expand on the abbreviated presentation of Hardy and Littlewood.

We remind ourselves that the Cauchy-Hadamard formula states that the radius of convergence R of a power series anzn satisfies

R=1lim supn|an|1/n=lim infn|an|-1/n.
Theorem 30.

Fix xΩ and write q0=e2πix. Let ρ and R respectively be the radii of convergence of the power series


Then R=ρ, and if |z|<ρ then F(z)=ef(z).

The functions f:D(0,ρ) and F:D(0,R) are analytic [143, p. 69, Theorem 2.16].

Using the Cauchy-Hadamard formula we have

ρ=lim infnn1/n|1-q0n|1/nlim infn(2n)1/n=1


R=lim infn(|1-q0||1-q02||1-q0n|)1/n2ρ. (29)
Lemma 31.

For |u|=1 and for 0r<1,


Because Reu1 we have 1-r1-rReu=Re(1-ru)|1-ru|. Hence |1-u|2|1-ru|, from which the claim follows. ∎

We assert the following as a common fact in complex analysis.

Lemma 32.

For |w|<1 define


If |w|<1, then eL(w)=(1-w)-1.

For |q|,|z|<1, we define


For |q|<1, define c0(q)=0 and for n1,


and thus for |z|<1,


Furthermore define γ0=0 and for n1,


and thus for |z|<ρ,


For |q|<1, define C0(q)=1 and for n1,


and thus for |z|<1,


Furthermore define Γ0=1 and for n1,


and thus for |z|<R,


We prove directly the following, which is an instance of the q-binomial formula [3, p. 17, Theorem 2.1].

Proposition 33.

If |q|,|z|<1 then


By Lemma 32,


Because eL(w)=(1-w)-1 for |w|<1,




On the one hand, g0(q)=G(0,q)=1. On the other hand,




and therefore for n1 we have gn(q)=(1-qn)-1gn-1(q). Thus by induction, for n1,




Proposition 34.

Rρ, and if |z|<ρ then ef(z)=F(z).


If ρ=0 then the claim is immediate. Otherwise, 0<ρ1. Let 0<t<ρ, 0<r<1, and define G(θ)=F(teiθ,rq0). On the one hand,


which implies that G^(n)=Cn(rq0)tn for n0 and G^(n)=0 for n<0. On the other hand, for n, using Proposition 33,

G^(n) =12π02πG(θ)e-inθ𝑑θ