Sums, series, and products in Diophantine approximation

Jordan Bell
[email protected]
Department of Mathematics, University of Toronto
Toronto, Ontario, Canada
(March 3, 2023)
Abstract

There is not much that can be said for all x and for all n about the sum

k=1n1|sinkπx|.

However, for this and similar sums, series, and products, we can establish results for almost all x using the tools of continued fractions. We present in detail the appearance of these sums in the singular series for the circle method. One particular interest of the paper is the detailed proof of a striking result of Hardy and Littlewood, whose compact proof, which delicately uses analytic continuation, has not been written freshly anywhere since its original publication. This story includes various parts of late 19th century and early 20th century mathematics.

1 Introduction

In this paper we survey a class of estimates for sums, series, and products that involve Diophantine approximation. We both sort out a timeline of the literature on these questions and give careful proofs of many lesser known results. Rather than being an open-pit mine for the history of Diophantine approximation, this paper follows one vein as deep as it goes.

For x, we write x=minn|x-n|, the distance from x to a nearest integer. In this paper we give a comprehensive presentation of estimates for sums whose jth term involves jx and determine the abscissa of convergence and radius of convergence respectively of Dirichlet series and power series whose jth term involves jx. We also give a detailed proof of a result of Hardy and Littlewood that

limn(k=1n|sinkπx|)1/n=12

for almost all x.

We either prove or state in detail and give references for all the material on continued fractions and measure theory that we use in this paper. Many of the results we prove in this paper do not have detailed proofs written in any books, and the proofs we give for results that do have proofs in books are often written significantly more meticulously here than anywhere else; in some cases the proofs in the literature are so sketchy that the proof we give is written from scratch, for example Lemma 9. Our presentation of Hardy and Littlewood’s estimate for k=1n|sinπkx| makes clear exactly what results in Diophantine approximation one needs for the proof.

In the next section we introduce the Bernoulli polynomials, the Euler-Maclaurin summation formula, and Euler’s constant, which we shall use in a few places. Because the calculations are similar to what we do in the rest of this paper and because we want to be comfortable using Bernoulli polynomials, we work things out from scratch rather than merely stating results as known. In the section after that we summarize various problems that involve sums of the type we are talking about in this paper.

2 The Bernoulli polynomials, the Euler-Maclaurin summation formula, and Euler’s constant

For k0, the Bernoulli polynomial Bk(x) is defined by

zexzez-1=k=0Bk(x)zkk!,|z|<2π. (1)

The Bernoulli numbers are Bk=Bk(0), the constant terms of the Bernoulli polynomials. For any x, using L’Hospital’s rule the left-hand side of (1) tends to 1 as z0, and the right-hand side tends to B0(x), hence B0(x)=1. Differentiating (1) with respect to x,

k=0Bk(x)zkk!=z2exzez-1=k=0Bk(x)zk+1k!=k=1Bk-1(x)zk(k-1)!,

so B0(x)=0 and for k1 we have Bk(x)k!=Bk-1(x)(k-1)!, i.e.

Bk(x)=kBk-1(x).

Furthermore, integrating (1) with respect to x on [0,1] gives, since exz𝑑x=ez-1z,

1=k=0(01Bk(x)𝑑x)zkk!,|z|<2π,

hence 01B0(x)𝑑x=1 and for k1,

01Bk(x)𝑑x=0.

The first few Bernoulli polynomials are

B0(x)=1,B1(x)=x-12,B2(x)=x2-x+16,B3(x)=x3-32x2+12x.

The Bernoulli polynomials satisfy the following:

k=0Bk(x+1)zkk! =ze(x+1)zez-1
=zexz(ez-1+1)ez-1
=zexz+zexzez-1
=k=0xkzk+1k!+k=0Bk(x)zkk!
=k=1xk-1zk(k-1)!+k=0Bk(x)zkk!,

hence

Bk(x+1)=kxk-1+Bk(x),k1,x. (2)

In particular, for k2, Bk(1)=Bk(0). The identity (2) yields Faulhaber’s formula, for k0 and positive integers a<b,

ambmk=Bk+1(b+1)-Bk+1(a)k+1. (3)

The following identity is the multiplication formula for the Bernoulli polynomials, found by Raabe [122, pp. 19–24, §13].

Lemma 1.

For k0, q1, and x,

qBk(qx)=qkj=0q-1Bk(x+jq).
Proof.

Using (2) with x=qn+jq=n+jq,

(qn+j)k=qkk+1(Bk+1(n+jq+1)-Bk+1(n+jq)),

thus

m=qNq-1mk =n=1N-1j=0q-1(nq+j)k
=qkk+1j=0q-1n=1N-1(Bk+1(n+jq+1)-Bk+1(n+jq))
=qkk+1j=0q-1(Bk+1(N+jq)-Bk+1(1+jq)).

Then by (3),

Bk+1(qN)-Bk+1(q)=qkj=0q-1(Bk+1(N+jq)-Bk+1(1+jq)).

Let Fk,q(x)=Bk+1(qx)-qkj=0q-1Bk+1(x+j/q), which is a polynomial of degree k+1. The above means that for N1, Fk,q(N)=Fk,q(1), which implies that the polynomial Fk,q(x)-Fk,q(1) is identically 0 and, a fortiori, Fk,q(x) is identically 0:

qBk+1(qx)-qkj=0q-1Bk+1(x+jq)=0.

Using Bk+1(x)=(k+1)Bk(x),

qBk(qx)=qkj=0q-1Bk(x+jq).

We remark that for prime p and for k0, one uses Lemma 1 to prove that there is a unique p-adic distribution μB,k on the p-adic integers p such that μB,k(a+pNp)=pN(k-1)Bk(a/pN) [80, p. 35, Chapter II, §4], called a Bernoulli distribution.

For x, let [x] be the greatest integer x, and let R(x)=x-[x], called the fractional part of x. Write 𝕋=/ and define the periodic Bernoulli functions Pk:𝕋 by

Pk=BkR.

For k2, because Bk(1)=Bk(0), the function Pk is continuous. For f:𝕋 define its Fourier series f^: by

f^(n)=𝕋f(t)e-2πint𝑑t,n.

For k1, one calculates P^k(0)=0 and using integration by parts, P^k(n)=-1(2πin)k for n0. Thus for k1, the Fourier series of Pk is

Pk(t)nP^k(n)e2πint=-1(2πi)kn0n-ke2πint.

For k2, n|P^k(n)|<, from which it follows that |n|NP^k(n)e2πint converges to Pk(t) uniformly for t𝕋. Furthermore, for t [102, p. 499, Theorem B.2],

P1(t)=-1πn=11nsin2πnt.

Thus for example,

B1(12π)=P1(12π)=-1πk=1sinkk.

The Euler-Maclaurin summation formula is the following [102, p. 500, Theorem B.5]. If a<b are real numbers, K is a positive integer, and f is a CK function on an open set that contains [a,b], then

a<mbf(m) =abf(x)𝑑x+k=1K(-1)kk!(Pk(b)f(k-1)(b)-Pk(a)f(k-1)(a))
-(-1)KK!abPK(x)f(K)(x)𝑑x.

Applying the Euler-Maclaurin summation formula with a=1,b=n,K=2,f(x)=logx yields [102, p. 503, Eq. B.25]

1mnlogn=nlogn-n+12logn+12log2π+O(n-1).

Using e1+O(n-1)=1+O(n-1), taking the exponential of the above equation gives Stirling’s approximation,

n!=nne-n2πn(1+O(n-1)).

Write an=-logn+1mn1m. Because xlog(1-x) is concave,

an-an-1=1n+log(1-1n)1+1-1n=0,

which means that the sequence an is nonincreasing. For f(x)=1x, because f is positive and nonincreasing,

1mnf(m)1n+1f(x)𝑑x=log(n+1)>logn,

hence an>0. Because the sequence an is positive and nonincreasing, there exists some nonnegative limit, γ, called Euler’s constant. Using the Euler-Maclaurin summation formula with a=1,b=n,K=1,f(x)=1x, as P1(x)=[x]-12,

1<mn1m=logn+12n-12+121n1x2𝑑x-1nR(x)1x2𝑑x,

which is

1<mn1m=logn-1R(x)x2𝑑x+nR(x)x2𝑑x.

As 0R(x)x-2x-2, the function xR(x)x-2 is integrable on [1,); let C=1-1R(x)x-2. Since 0nR(x)x-2𝑑xnx-2𝑑x=n-1,

1mn1m=logn+C+O(n-1)

f. But -logn+1mn1mγ as n, from which it follows that C=γ and thus

1mn1m=logn+γ+O(n-1).

3 Background

For x, let [x] be the greatest integer x and let R(x)=x-[x]. It will be handy to review some properties of x[x]. For n it is immediate that [x+n]=[x]. For x,y we have 0x-[x]+y-[y]<2, which means 0=[0][x-[y]+y-[y]]<[2]=2, and using [x+n]=[x] this is 0[x+y]-[x]-[y]<2, therefore

[x]+[y][x+y][x]+[y]+1.

For m,n, n>1, and x,

[x+mn]=[[x]+mn].

For n a positive integer and for real x,

[x]=[x+0n][x+1n][x+n-1n][x+nn]=[x]+1.

There is a unique ν, 1νn, such that [x+ν-1n]=[x] and [x+νn]=[x]+1, and therefore [R(x)+ν-1n]=0 and [R(x)+νn]=1, consequently R(x)+ν-1n<1 and R(x)+νn1, which means 1-νnR(x)<1-ν-1n, from which finally we get n[nx]-n[x]+ν<n+1 and so n=[nx]-n[x]+ν. But

k=0n-1[x+kn]=k=0ν-1[x]+k=νn-1([x]+1)=ν[x]+(n-ν)([x]+1)=n[x]+n-ν,

and using ν=n-[nx]+n[x],

k=0n-1[x+kn]=n[x]+n-(n-[nx]+n[x])=[nx].

This identity is proved by Hermite [65, pp. 310–315, §V].

The Legendre symbol is defined in the following way. Let p be an odd prime, and let a be an integer that is not a multiple of p. If there is an integer b such that ab2(modp) then (ap)=1, and otherwise (ap)=-1. In other words, for an integer a that is relatively prime to p, if a is a square mod p then (ap)=1, and if a is not a square mod p then (ap)=-1. For example, one checks that there is no integer b such that b26(mod7), and hence (67)=-1, while 322(mod7), and so (27)=1.

If p and q are distinct odd primes, define integers uk, 1kp-12, by

kq=p[kqp]+uk;

namely, uk is the remainder of kq when divided by p. We have 1ukp-1. Let μ(q,p) be the number of k such that uk>p-12. It can be shown that [59, p. 74, Theorem 92]

(qp)=(-1)μ(q,p);

this fact is called Gauss’s lemma. For example, for p=13 and q=3 we work out that

u1=3,u2=6,u3=9,u4=12,u5=2,u6=5,

and hence μ(3,13)=2, so (-1)μ(3,13)=1; on the other hand, 423(mod13), hence (313)=1. With

S(q,p)=j=1p-12[jqp],

it is known that [59, pp. 77-78, §6.13]

S(q,p)μ(q,p)(mod2).

And it can be shown that [59, p. 76, Theorem 100]

S(q,p)+S(p,q)=p-12q-12. (4)

Thus

(pq)(qp) = (-1)μ(p,q)+μ(q,p)
= (-1)S(p,q)+S(q,p)
= (-1)p-12q-12.

This is Gauss’s third proof of the law of quadratic reciprocity in the numbering [7, p. 50, §20]. This proof was published in Gauss’s 1808 “Theorematis arithmetici demonstratio nova”, which is translated in [140, pp. 112–118]. Dirichlet [38, pp. 65–72, §§42–44] gives a presentation of the proof. Eisenstein’s streamlined version of Gauss’s third proof is presented with historical remarks in [93]. Lemmermeyer [95] gives a comprehensive history of the law of quadratic reciprocity, and in particular writes about Gauss’s third proof [95, pp. 9–10]. The formula (4) resembles the reciprocity formula for Dedekind sums [123, p. 4, Theorem 1].

Gauss obtains (4) from the following [140, p. 116, §5]: if x is irrational and n is a positive integer, then

k=1n[kx]+k=1[nx][kx]=n[nx], (5)

which he proves as follows. If [jx]<k[j+1x], then [kx]=j. Therefore

k=1n[kx] = j=1[nx]j([j+1x]-[jx])
= n[nx]-j=1[nx][jx].

Bachmann [6, pp. 654–658, §4] surveys later work on sums similar to (5); see also Dickson [35, Chapter X]. If m and n are relatively prime, then

{R(mn),R(2mn),,R((n-1)mn)}={1n,2n,,n-1n},

and so

k=1n-1R(kmn)=k=1n-1kn=1n(n-1)n2=n-12.

Hence

k=1n-1[kmn]=k=1n-1kmn-k=1n-1R(kmn)=(m-1)(n-1)2. (6)

There is also a simple lattice point counting argument [118, p. 113, No. 18] that gives (6).

In 1849, Dirichlet [37] shows that

k=1nd(k)=k=1n[nk],

where d(n) denotes the number of positive divisors of an integer n. (This equality is Dirichlet’s “hyperbola method”.) He then proves that

k=1n[nk]=nlogn+(2γ-1)n+O(n).

Hardy and Wright [59, pp. 264–265, Theorem 320] give a proof of this. Finding the best possible error term in the estimate for k=1nd(k) is “Dirichlet’s divisor problem”. Dirichlet cites the end of Section V Gauss’s Disquisitiones Arithmeticae as precedent for determining average magnitudes of arithmetic functions. (In Section V, Articles 302–304, of the Disquisitiones Arithmeticae, Gauss writes about averages of class numbers of binary quadratic forms, cf. [36, Chapter VI].)

Define (x) to be 0 if x+12; if x+12 then there is an integer mx for which |x-mx|<|x-n| for all integers nmx, and we define (x) to be x-mx. Riemann [125, p. 105, §6] defines

f(x)=n=1(nx)n2;

for any x, the series converges absolutely because |(-nx)|<12. Riemann states that if p and m are relatively prime and x=p2m, then

f(x+)=limh0+f(x+h)=f(x)-π216m2,f(x-)=limh0-f(x+h)=f(x)+π216m2,

thus

f(x-)-f(x+)=π28m2,

and hence that f is discontinuous at such points, and says that at all other points f is continuous; see Laugwitz [94, §2.1.1, pp. 183-184], Neuenschwander [108], and Pringsheim [119, p. 37] about Riemann’s work on pathological functions. The role of pathological functions in the development of set theory is explained by Dauben [32, Chapter 1, p. 19] and Ferreirós [49, Chapter V, §1, p. 152].

For any interval [a,b] and any σ>0, it is apparent from the above that there are only finitely many x[a,b] for which f(x-)-f(x+)>σ, and Riemann deduces from this that f is Riemann integrable on [a,b]; cf. Hawkins [63, p. 18] on the history of Riemann integration. Later in the same paper [125, p. 129, §13], Riemann states that the function

xn=1(nx)n,

is not Riemann integrable in any interval.

In 1897, Cesàro [28] asks the following question (using the pseudonym, and anagram, “Rosace” [112, p. 331]). Let ϵ(x)=x-[x]-12. Is the series

n=1ϵ(nx)n (7)

convergent for all non-integer x? This is plausible because the expected value of ϵ(x) is 0. Landau [89] answers this question in 1901. Landau proves that if there is some g such that

k=1n[kx]=n(n+1)x2-n2+O(g(n))

where g is a nonnegative function such g(n)=o(n) and such that n=1g(n)n(n+1) converges, then (7) converges. And he proves that if x is rational then (7) diverges. We return to this series in §9.

Also in 1898, Franel [52] asks whether for irrational x and for ϵ>0 we have

k=1n[kx]=n(n+1)x2-n2+O(nϵ).

Then in 1899, Franel [53] asks if we can do better than this: is the error term in fact O(1)? Cesàro and Franel each contributed many problems to L’Intermédiaire des mathématiciens, the periodical in which they posed their questions. Information about Franel is given in [82].

Lerch [96] answers Franel’s questions in 1904. If x is irrational and pq is a convergent of x (which we will define in §4), then using Theorem 2 (from §4) we can show that if 1kq then [kx]=[kpq]. Lerch uses this and (6) to show that if x is irrational and pq is a convergent of x then

k=1q[kx]=q(q+1)x2-q2+R,0<R<12.

Lerch states that if the continued fraction expansion of x has bounded partial quotients (defined in §4) then, for any positive integer n,

k=1n[kx]=n(n+1)x2-n2+O(logn).

Lerch only gives a brief indication of the proof of this. This result is proved by Hardy and Littlewood in 1922 [57, p. 24, Theorem B3], and also in 1922 by Ostrowski [110, p. 81]. On the other hand, Lerch also constructs examples of x such that, for some positive integer c,

|k=1n[kx]-n(n+1)x2+n2|n1-1c.

Nevertheless, in 1909 Sierpinski [135] proves that if x is irrational then

k=1n[kx]=n(n+1)x2-n2+o(n).

A bibliography of Lerch’s works is given in [139]. Lerch had written earlier papers on Gauss sums, Fourier series, theta functions, and the class number; many of his papers are in Czech, but some of them are in French, several of which were published in the Paris Comptes rendus. Several of Lerch’s papers are discussed in Cresse’s survey of the class number of binary quadratic forms [36, Chapter VI].

In 1899, a writer using the pseudonym “Quemquaeris” [121] (“quem quaeris” = “whom you seek”) asks if we can characterize ϕ(n) such that for all irrational θ the series

n=1ϕ(n)sinnπθ

converges. In particular, the writer asks if ϕ(n)=1n! satisfies this. In the same year, de la Vallée-Poussin [34] answers this question. (There are also responses following de la Vallée-Poussin’s by Borel and Fabry.) For a given function ϕ(n), de la Vallée-Poussin shows that if we have an>1ϕ(qn-1) for all n, for an the nth partial quotient of θ and qn the denominator of the nth convergent of θ, then the series

n=1ϕ(n)sinnπθ

will diverge. Hardy and Littlewood prove numerous results on similar series, e.g. for ϕ(n)=n-r for real r>1 and for certain classes of θ, in their papers on Diophantine approximation [61]. In 1931, Walfisz [152, p. 570, Hilfssatz 4] shows, following work of Behnke [13, p. 289, §16], that for almost all irrational x[0,1], if ϵ>0 then

k=1n1kθ=O(n(logn)2+ϵ),

where x=min(R(x),1-R(x)). Walfisz’s paper includes many results on related sums.

In 1916, Watson [153] finds the following asymptotic series for Sn=m=1n-1csc(mπn):

πSn2nlog(2n)+2n(γ-logπ)+j=1(-1)jB2j2(22j-2)π2jj(2j)!n2j-1,

where γ is Euler’s constant and Bj are the Bernoulli numbers. Truncating the asymptotic series and rewriting gives

Sn=2nlognπ+n2log2+2γ-2logππ+O(1n).

For example, computing S1000 directly we get S1000=4477.593932, and computing the right-hand side of the above formula without the error term we obtain 4477.594019 A cleaner derivation of the asymptotic series using the Euler-Maclaurin summation formula is given later by Williams [155].

Early surveys of Diophantine approximation are given by Bohr and Cramér [20, pp. 833–836, §39] and Koksma [81, pp. 102–110]. Hlawka and Binder [66] present the history of the initial years of the theory of uniform distribution. Narkiewicz [106, pp. 82–95, §2.5 and pp. 175–183 §3.5] gives additional historical references on Diophantine approximation. The papers of Hardy and Littlewood on Diophantine approximation are reprinted in [61]. Perron [113] and Brezinski [23] give historical references on continued fractions, and there is reliable material on the use of continued fractions by 17th century mathematicians in Whiteside [154]. Fowler [51] presents a prehistory of continued fractions in Greek mathematics.

4 Preliminaries on continued fractions

Let

x=mink|x-k|=min(R(x),1-R(x)).

Let μ be Lebesgue measure on [0,1], and let Ω=[0,1].

For positive integers a1,,an, we define

[a1,,an]=1a1+1+1an-1+1an.

For example, [1,1,1]=23.

Let be the set of positive integers. We call a a continued fraction, and we call an the nth partial quotient of a. If there is some K>0 such that anK for all n then we say that a has bounded partial quotients. We call [a1,,an] the nth convergent of a. For n1 let

pnqn=[a1,,an],

with pn,qn positive integers that are relatively prime, and set

p0=0,q0=1.

One can show by induction [44, p. 70, Lemma 3.1] that for n1 we have

(pnpn-1qnqn-1)=(0110)(a1110)(an110). (8)

We have

p1=1,q1=a1,

and from (8) we get for all n1 that

pn+1=an+1pn+pn-1,

and

qn+1=an+1qn+qn-1.

Since the an are positive integers we get by induction that for all n1,

pn2(n-1)/2,qn2(n-1)/2. (9)

In fact, setting F1=1,F2=1, Fn+1=Fn+Fn-1 for n2, with Fn the nth Fibonacci number, as an1 we check by induction that

pnFn,qnFn+1.

Taking determinants of (8) gives us for all n1 that

pnqn-1-pn-1qn=(-1)n+1, (10)

and then by induction we have for all n1,

pnqn=k=1n(-1)k+1qk-1qk.

For any a, as n this sequence of sums converges and we denote its limit by v(a). We have for all n1,

v(a)-pnqn=k=n+1(-1)k+1qk-1qk.

Since the right hand side is an alternating series we obtain for n1,

|v(a)-pnqn|<1qnqn+1, (11)

and

|v(a)-pnqn|>1qnqn+1-1qn+1qn+2=an+2qnqn+21qn(qn+1+qn), (12)

and

p2q2<p4q4<<v(a)<<p3q3<p1q1. (13)

For xΩ, we say that pq, q>0, is a best approximation to x if qx=|qx-p| and qx>qx for 1q<q. The following theorem shows in particular that the convergents of a continued fraction a are best approximations to v(a) [127, p. 22, Chapter 2, §3, Theorem 1].

Theorem 2 (Best approximations).

Let a. For any n1,

qnv(a)=|pn-qnv(a)|.

If 1q<qn+1, then for any p,

|qv(a)-p||pn-qnv(a)|.
Proof.

By (11), |qnv(a)-pn|<1qn+1. But

qn+1q2=a2q1+q0q1+q0=a1+12.

So |qnv(a)-pn|<12, which means qnv(a)=|qnv(a)-pn|.

Write x=v(a) and let A=(pn+1pnqn+1qn). Applying (10),

detA=pn+1qn-pnqn+1=(-1)n.

Let

(μν)=A-1(pq)=1detA(qn-pn-qn+1pn+1)(pq)=(-1)n(qn-pn-qn+1pn+1)(pq).

Then

qx-p=(μqn+1+νqn)x-(pn+1μ+pnν)=μ(qn+1x-pn+1)+ν(qnx-pn)

and

(pq)=A(μν)=(pn+1pnqn+1qn)(μν)=(pn+1μ+pnνqn+1μ+qnν).

In particular, μ,ν. Suppose by contradiction that ν=0. Then (q-μqn+1)x=p-μpn+1, and as x it must then be that q=μqn+1 and p=μpn+1. But q=μqn+1, 1q<qn+1, and μ are together a contradiction. Therefore ν0. Either μ=0 or μ0. For μ=0,

|qx-p|=|ν||qnx-pn||qnx-pn|,

which is the claim. For μ0 we use the fact q=qn+1μ+qnν and 1q<qn+1. If μ,ν>0 then q<qn+1 is contradicted, and if μ,ν<0 then q1 is contradicted. Therefore μ and ν have different signs, say μ=(-1)N|μ| and ν=(-1)N+1|ν|. Furthermore, we get from (13) that

sgn(qnx-pn)=(-1)n,sgn(qn+1x-pn+1)=(-1)n+1.

Therefore

qx-p =μ(qn+1x-pn+1)+ν(qnx-pn)
=(-1)N|μ|(-1)n+1|qn+1x-pn+1|+(-1)N+1|ν|(-1)n|qnx-pn|,

hence

|qx-p|=|μ||qn+1x-pn+1|+|ν||qnx-pn||ν||qnx-pn||qnx-pn|,

which is the claim. ∎

The above theorem says, a fortiori, that the convergents of a are best approximations to v(a). It can also be proved that if pq, q>0, is a best approximation to v(a) then pq is a convergent of a [92, p. 9, Theorem 6]. Cassels [27, p. 2, Chapter I] works out the theory of continued fractions according to this point of view. Similarly, Milnor [101, p. 234, Appendix C] works out the theory of continued fractions in the language of rotations of the unit circle.

We define the Gauss transformation T:ΩΩ by T(x)=R(1x) for xΩ, and we define Φ:Ω by

(Φ(x))n=[1Tn-1(x)],n1.

One can check that if a then v(a)Ω [44, p. 73, Lemma 3.2]. (Namely, the value of a nonterminating continued fraction is irrational.) One can prove that v:Ω is injective [44, p. 75, Lemma 3.4], and for xΩ that [44, p. 78, Lemma 3.6]

(vΦ)(x)=x.

Therefore Φ:Ω is a bijection. Moreover, Φ is a homeomorphism, when has discrete topology, has the product topology, and Ω has the subspace topology inherited from [44, p. 86, Exercise 3.2.2]. That and Ω are homeomorphic can also be proved without using continued fractions [2, p. 106, Theorem 3.68]. In descriptive set theory, the topological space 𝒩= is called the Baire space, and the Alexandrov-Urysohn theorem states that 𝒩 has the universal property that any nonempty Polish space that is zero-dimensional (there is a basis of clopen sets for the topology) and all of whose compact subsets have empty interior is homeomorphic to 𝒩 [75, p. 37, Theorem 7.7]. Some of Baire’s work on 𝒩 is described in [71, pp. 119–120] and [5, pp. 349, 372].

For I=[0,1] and for T:II, T(x)=R(1/x) for x>0 and T(0)=0. For k1 let Ik=(1k+1,1k), so if xIk then T(x)=x-1-k. Then for xI=[0,1],

T(x)=k=11Ik(x)(x-1-k).

For S={0}{k-1:k1}, IS=k1Ik, and for xIS,

T(x)=-k=11Ik(x)x-2,

and for xIk, k2<|T(x)|<(k+1)2. Differentiability and dynamical properties of the Gauss transformation are worked out by Cornfeld, Fomin and Sinai [30, pp. 165–177, Chapter 7, §4], as an instance of piecewise monotonic transformations.

For each n1 we define an:Ω by an(x)=(Φ(x))n. For example, e-2Ω, and it is known [92, p. 74, Theorem 2] that for k1,

a3k(e-2)=a3k-2(e-2)=1anda3k-1(e-2)=2k.

The pattern for the continued fraction expansion of e seems first to have been worked out by Roger Cotes in 1714 [50], and was later proved by Euler using a method involving the Riccati equation [31].

For n1 and in, let

In(i)={ωΩ:ak(x)=ik,1kn}.

For xIn(i),

pn(x)qn(x)=[i1,,in],pn-1(x)qn-1(x)=[i1,,in-1].

The following is an expression for the sets In(i) [69, p. 18, Theorem 1.2.2].

Theorem 3.

Let n1, i1n, and for pn=pn(x), qn=qn(x), xIn(i),

un(i)={pn+pn-1qn+qn-1n oddpnqnn even

and

vn(i)={pnqnn oddpn+pn-1qn+qn-1n even.

Then

In(i)=Ω(un(i),vn(i)).

It follows from the above that for in, n1,

μ(In(i))=1qn(qn+qn-1).

5 Diophantine conditions

For real τ,γ>0 let

𝒟(τ,γ) =q1,p{x[0,1]:|x-pq|γq-τ}
=q1{x[0,1]:qxγq-τ+1},

and let

𝒟(τ)=γ>0𝒟(τ,γ).

We relate the sets 𝒟(τ) and continued fractions expansions [101, p. 241, Lemma C.6], cf. [158, p. 130, Proposition 2.4].

Lemma 4.

For τ>0 and xΩ, x𝒟(τ) if and only if there is some C=C(x)>0 such that qn+1(x)Cqn(x)τ-1 for all n1.

Proof.

For xΩ, write qn=qn(x). By (12), qnx>1qn+1+qn, and by (11), qnx<1qn+1. Suppose x𝒟(τ), so there is some γ>0 such that x𝒟(τ,γ). Then

qn+1<1qnxγ-1qnτ-1.

Suppose qn+1Cqnτ-1 for all n1. For q1, take qnq<qn+1. Using Theorem 2,

qxqnx>1qn+1+qn>12qn+112C-1qn-τ+112Cq-τ+1,

which means that x𝒟(τ,12C). ∎

For K a positive integer, let

K={xΩ:an(x)K for all n1},

so =K1K is the set of those xΩ with bounded partial quotients.

Lemma 5.

For xΩ, x𝒟(2) if and only if x.

Proof.

Write an=an(x) and qn=qn(x). If x𝒟(2) then there is some γ>0 such that x𝒟(2,γ), hence for n1,

qn+1<1qnxγ-1qn.

Now, qn+1=an+1qn+qn-1 for n1, so

an+1<qn+1qn-1<qn-1γ-1qn=γ-1,

which shows that x has bounded partial quotients.

If xK, let q1 and let qnq<qn+1. Using Theorem 2 and then (12),

qxqnx>1qn+1+qn=1an+1qn+qn-1+qn>1(an+1+2)qn.

As xK,

qx>1(K+2)qn1K+2q-1,

which means that x𝒟(2,1K+2). ∎

A complex number α is called an algebraic number of degree d, d0, if there is some polynomial f[x] with degree d such that f(α)=0 and if g[x] has degree <d and g(α)=0 then g=0. An algebraic number number of degree 2 is called a quadratic irrational. Let αΩ. It was proved by Euler [59, p. 144, Theorem 176] that if there is some p>0 and some L such that al+p(α)=al(α) for all lL then α is a quadratic irrational The converse of this was proved by Lagrange [59, p. 144, Theorem 177], namely that a quadratic irrational has eventually periodic partial quotients. For example, α=11-3Ω is a quadratic irrational, being a root of x2+6x-2, and one works out that a1(α)=3,a2(α)=6, and that al+2(α)=al(α) for l1. In particular, if αΩ is a quadratic irrational, then α has bounded partial quotients.

Liouville [59, p. 161, Theorem 191] proved that if xΩ is an algebraic number of degree d2, then x𝒟(d). The Thue-Siegel-Roth theorem [48, p. 55, Theorem 1.23] states that if xΩ is an algebraic number, then for any δ>0 there is some qδ1 such that for all qqδ,

qxq-1-δ.

See Schmidt [131, p. 195, Theorem 2B].

6 Sums of reciprocals

We are interested in getting bounds on the sum j=1m1jx. This is an appealing question because the terms 1jx are unbounded.

Rather than merely stating that k=11k=, we give more information by giving the estimate

k=1n1k=logn+γ+O(n-1),

where γ is Euler’s constant. Likewise, rather than merely stating that there are infinitely many primes, we state more information with [90, p. 102, §28]

px1p=loglogx+B+O(1logx),

for a certain constant B (namely “Merten’s constant”), or with [90, p. 226, §61]

pxp=x22logx+O(x2(logx)2).

Because |sin(πx)|=sin(πx)πx and

|sin(πx)|=sin(πx)2ππx=2x,

we have

1πj=1m1jxj=1m1|sinπjx|12j=1m1jx. (14)

Thus, getting bounds on j=1m1jx will give us bounds on j=1m1|sinπjx|.

Let ψ be a nondecreasing function defined on the positive integers such that ψ(h)>0 for h1 (for example, ψ(h)=log(2h)). Following Kuipers and Niederreiter [85, p. 121, Definition 3.3], we say that an irrational number x is of type <ψ if hx1hψ(h) for all integers h1. If ψ is a constant function, then we say that x is of constant type.

Lemma 6.

xΩ is of constant type if and only if it has bounded partial quotients.

Proof.

Suppose that xΩ is of constant type. So there is some K>0 such that hx1hK for all integers h1. For n2 we have qn=anqn-1+qn-2, and hence, by (11),

an=qnqn-1-qn-2qn-1<qnqn-1qnKqn-1x<K,

showing that x has bounded partial quotients.

Suppose that xΩ has bounded partial quotients, say anK for all n1. Let h be a positive integer and take qnh<qn+1. Then first by Theorem 2 and then by (12),

hxqnx>1qn+qn+1>12qn+1=12(an+1qn+qn-1)>12(an+1qn+qn),

and so

hx>12(K+1)1qn12(K+1)1h,

showing that x is of constant type. ∎

However, almost all x do not have bounded partial quotients [78, p. 60, Theorem 29]. Shallit [134] gives a survey on numbers with bounded partial quotients.

We state and prove a result of Khinchin’s [78, p. 69, Theorem 32] that we then use.

Lemma 7.

Let f be a positive function on the positive integers. If

q=1f(q)<,

then for almost all xΩ there are only finitely many q such that qx<f(q).

Proof.

For each positive integer q, let Eq={tΩ:qt<f(q)}. If tEq, then there is some integer p with 0pq such that |t-pq|<f(q)q. It follows that

Eq(0,f(q)q)(1-f(q)q,1)p=1q-1(pq-f(q)q,pq+f(q)q).

Therefore

q=1μ(Eq)q=12qf(q)q=2q=1f(q)<.

Let E=lim supqEq, i.e. E={tΩ:tEqfor infinitely many q}. Then by the Borel-Cantelli lemma [19, p. 59, Theorem 4.3] we have that μ(E)=0. Therefore, for almost all tΩ there are only finitely many q such that tEq, i.e., for almost all tΩ there are only finitely many q such that qt<f(q). ∎

The above lemma is proved in Benedetto and Czaja [15, p. 183, Theorem 4.3.3] using the fact that a function of bounded variation is differentiable almost everywhere. We outline the proof. Define F:[0,1]>0 by F(x)=f(q)q if x=aq, gcd(a,q)=1, 0aq, and F(x)=0 if xΩ. Writing 𝒜q={aq:0aq,gcd(a,q)=1}, for 0=t0<t1<<tN=1,

j=0NF(tj)=q=1j=0NF(tj)1𝒜q(tj)=q=1f(q)qj=0N1𝒜q(tj)q=1f(q).

It follows that the total variation of F is 2q=1f(q) and hence F is a function of bounded variation. Because F has bounded variation, the set DF of x[0,1] at which F is differentiable is a Borel set with λ(DF)=1. Check that F(x)=0 for xDF, and using this, if anqnx with gcd(an,qn)=1 and 0anqn then for some N, if nN then |x-anq|F(qn)qn.

We use the above lemma to prove the following result.

Lemma 8.

Let ϵ>0. For almost all xΩ, there is some K>0 such that x is of type <K(logh)1+ϵ.

Proof.

Let

E={tΩ:ht<1h(logh)1+ϵfor infinitely many h}.

Since h=11h(logh)1+ϵ converges, we have by Lemma 7 that μ(E)=0. Let tΩE. Then ht1h(logh)1+ϵ for all sufficiently large h. It follows that there is some K such that t is of type <K(logh)1+ϵ. ∎

The following technical lemma is from Kuipers and Niederreiter [85, p. 130, Exercise 3.9]; cf. Lang [92, p. 39, Lemma].

Lemma 9.

Let xΩ be of type <ψ. If n0 and if 0h0<qn+1, then

j+h0<qn+11jqn1(j+h0)x<6qn(ψ(qn)+logqn).
Proof.

Since pn and qn are relatively prime, the remainders of jpn, j=1,,qn, when divided by qn are all distinct. Then also, the remainders of jpn+h0pn, j=1,,qn, when divided by qn are all distinct. Let λj, j=1,,qn, be the remainder of jpn+h0pn when divided by qn. We have {λj:1jqn}={0,,qn-1}; let λj1=0, λj2=1, and λj3=qn-1.

Write x=pnqn+δnqnqn+1; by (11) we have |δn|<1. If j+h0<qn+1 then by Theorem 2 we have (j+h0)xqnx, and since x is of type <ψ we have

(j+h0)xqnx1qnψ(qn).

Let, for i=1,2,3,

Ai={1,if ji+h0<qn+1,0,if ji+h0qn+1.

If j+h0<qn+1 and jj1,j2,j3, then

λjqn+(j+h0)δnqnqn+1min{λjqn+1qn,λjqn-1qn}.

It follows that

j+h0<qn+11jqn1(j+h0)x = A1(j1+h0)x+A2(j2+h0)x+A3(j3+h0)x
+jj1,j2,j3j+h0<qn+11jqn1λjqn+(j+h0)δnqnqn+1
3qnψ(qn)+jj1,j2,j3j+h0<qn+11jqn1λjqn+(j+h0)δnqnqn+1
3qnψ(qn)+jj1,j2,j3j+h0<qn+11jqn1λjqn+1qn
+jj1,j2,j3j+h0<qn+11jqn1λjqn-1qn
< 3qnψ(qn)+2k=1qn-11kqn.

But 1y<1R(y)+11-R(y) for y, so

k=1qn-11kqn < k=1qn-11R(kqn)+11-R(kqn)
= k=1qn-11kqn+11-kqn
= 2qnk=1qn-11k
< 3qnlogqn;

the last inequality is because, for all m1,

k=1m1k<32log(m+1).

We now use Lemma 9 to obtain a bound on j=1m1jx in terms of the type of x. This is from Kuipers and Niederreiter [85, p. 131, Exercise 3.11]; cf. Lang [92, p. 39, Theorem 2].

Theorem 10.

If xΩ is of type <ψ, then for all m1 we have

j=1m1jx<12m(ψ(m)+logm).
Proof.

We shall prove the claim by induction. Because x is of type <ψ, we have

1xψ(1)<12ψ(1),

so the claim is true for m=1. Take m>1 and assume that the claim is true for all 1m<m. We shall show that it is true for m.

Let qnm<qn+1. Either m<2qn or m2qn. In the first case, using Lemma 9 we have

j=1m1jx = j=1qn1jx+j=1m-qn1(j+qn)x
< 12qn(ψ(qn)+logqn)
< 12m(ψ(m)+logm).

In the second case, using the induction assumption (with m=m-qn) and Lemma 9 we have, because qn<m-qn,

j=1m1jx = j=1m-qn1jx+j=1qn1(j+m-qn)x
< 12(m-qn)(ψ(m-qn)+log(m-qn))+6qn(ψ(qn)+logqn)
< 12(m-qn)(ψ(m)+logm)+12qn(ψ(qn)+logqn)
= 12m(ψ(m)+logm)-12qn(ψ(m)-ψ(qn)+logm-logqn)
12m(ψ(m)+logm).

The claim is true in both cases, which completes the proof by induction. ∎

We can now establish for almost all xΩ a tractable upper bound on the sum j=1m1jx, and thus by (14) also on j=1m1|sinπjx|; cf. Lang [92, p. 44, Theorem 3].

Theorem 11.

Let ϵ>0. For almost all xΩ, we have

j=1m1jx=O(m(logm)1+ϵ),

while if x has bounded partial quotients then

j=1m1jx=O(mlogm).
Proof.

Let ϵ>0. By Lemma 8, for almost all xΩ there is some K such that x is of type <K(logh)1+ϵ. For such an x, it follows from Theorem 10 that for all m1,

j=1m1jx<12m(K(logm)1+ϵ+logm)=O(m(logm)1+ϵ).

If x has bounded partial quotients then by Lemma 6 it is of constant type, say ψ(m)=K for some K. It follows from Theorem 10 that for all m1,

j=1m1jx<12m(K+logm)=O(mlogm).

For example, take x=-1+52, for which an(x)=1 for all n, and so in particular x has bounded partial quotients. In Figure 1 we plot 1mlogmj=1m1jx for m=5000,,40000. These computations suggest that there is some constant C for which j=1m1jx>Cmlogm for all m. In Theorem 13 we shall prove that for almost all xΩ there is such a C(x).

 for
Figure 1: 1mlogmj=1m1jx for x=-1+52, for m=5000,,40000

However, the estimate in Theorem 11 is not true for all xΩ. Define a as follows. Let a1 be any element of . Then inductively, define an+1 to be any element of that is >qnn-1. Then for any n, using (11) and qn+1=an+1qn+qn-1>an+1qn we get

|qnv(a)-pn|<1qn+1<1an+1qn<1qnn,

hence qnv(a)<1qnn, and then

j=1qn1jv(a)>1qnv(a)>qnn.

Using ϵ=1, it is then straightforward to check that there is no constant C such that j=1m1jv(a)Cm(logm)2 for all m.

We will need the following lemma [19, p. 324, Lemma 3] to prove a theorem; cf. Khinchin [78, p. 63, Theorem 30].

Lemma 12.

If ϕ is a function defined on the positive integers such that ϕ(n)1 for all n and

n=11ϕ(n)<,

then for almost all xΩ there are only finitely many n such that an(x)ϕ(n).

Proof.

For a measurable set AΩ, define

γ(A)=1log2A11+x𝑑μ(x),

in other words dγ(x)=1(1+x)log2dμ(x). Thus for a measurable set AΩ we have

γ(A)1log2A𝑑x=1log2μ(A)

and

γ(A)1log2A12𝑑x=12log2μ(A).

We will use that γ is an invariant measure for the Gauss transformation T:ΩΩ [44, p. 77, Lemma 3.5], i.e., if AΩ is a measurable set then

(T*γ)(A)=γ(T-1(A))=γ(A).

Let An={xΩ:an(x)ϕ(n)}, n1. As

an(x)=[1Tn-1(x)],

we have

An {xΩ:1Tn-1(x)>ϕ(n)}
= {xΩ:Tn-1(x)<1ϕ(n)}
= (Tn-1)-1([0,1ϕ(n))).

Hence

μ(An) μ((Tn-1)-1([0,1ϕ(n))))
2log2γ((Tn-1)-1([0,1ϕ(n))))
= 2log2γ([0,1ϕ(n)))
2log21log2μ([0,1ϕ(n)))
= 2ϕ(n).

It follows that

n=1μ(An)<,

and thus by the Borel-Cantelli lemma [19, p. 59, Theorem 4.3] we have

μ(lim supnAn)=0.

Let λ be Lebesgue measure on I=[0,1], let dγ(x)=1(1+x)log2dλ(x), and let T:II be the Gauss transformation, T(x)=x-1-[x]-1 for x>0 and T(0)=0, for which T*γ=γ [44, p. 77, Lemma 3.5]. Suppose that ν is a Borel probability measure on [0,1] such that the pushforward measure T*ν is absolutely continuous with respect to ν. For fL1(ν), define dνf=fdν, and define Pν:L1(ν)L1(ν) by

Pνf=d(T*νf)dν,fL1(ν).

Thus, for gL(ν), using the change of variables formula,

IgPνf𝑑ν=Igd(T*νf)=IgT𝑑νf=I(gT)f𝑑ν,

in particular,

IPνf𝑑ν=If𝑑ν.

We call Pν:L1(ν)L1(ν) a Perron-Frobenius operator for T. It is a fact that if f0 then Pνf0 [69, p. 57, Proposition 2.1.1], namely Pν0. It can be proved that for fL1(γ), for almost all xI [69, p. 59, Proposition 2.1.2],

(Pγf)(x)=k=1x+1(x+k)(x+k+1)f(1x+k),

and for fL1(λ), for almost all xI [69, p. 60, Corollary 2.1.4],

(Pλf)(x)=k=11(x+k)2f(1x+k),

and with g(x)=(x+1)f(x), for n1 it holds for almost all xI that (Pλnf)(x)=(Pγng)(x)x+1. Iosifescu and Kraaikamp [69, Chapter 2] give a detailed presentation of Perron-Frobenius operators for the Gauss map. We make the final remark that Pν1I=1I is equivalent with I1E𝑑ν=I1T-1(E)𝑑ν for all Borel sets E in I, i.e. ν(E)=ν(T-1(E)), which in turn means T*ν=ν, cf. Markov operators [83, Chapter 5, §5.1, pp. 177-186]. An object similar to Perron-Frobenius operators for the Gauss transformation is the zeta-function for the Gauss transformation, for which see Lagarias [87, p. 58, §3.3].

The following theorem gives a lower bound on the sum j=1m1jx, cf. [150, p. 4, Theorem 3.1].

Theorem 13.

For almost all xΩ there is some C>0 such that

j=1m1jx>Cmlogm.
Proof.

For all xΩ, if n1 then qn2n-12, by (9). Take ϕ(n)=2n-22. The series n=11ϕ(n) converges, so by Lemma 12, for almost all xΩ there are only finitely many n such that anϕ(n). That is, for almost all xΩ there is some n0 such that if nn0 then

an<ϕ(n)=2n-22qn-1.

Hence, if nn0 then

qn=anqn-1+qn-2<qn-12+qn-2<2qn-12.

It follows that for almost all xΩ there is some K such that

qn+1<Kqn2 (15)

for all n0.

For such an x, let m be a positive integer and let qnm<qn+1. For 1jm we have by (11),

jx-jpnqn|jx-jpnqn|=j|x-pnqn|<jqnqn+1<1qn.

Therefore for 1jm we have

jxjx-jpnqn+jpnqn<1qn+jpnqn.

Let L=[mqn], so Lqnm. Then,

j=1m1jx > j=1m11qn+jpnqn
l=0L-1h=1qn11qn+(lqn+h)pnqn
= qnl=0L-1h=1qn11+qnhpnqn
= Lqnh=1qn11+qnhpnqn
= Lqnk=0qn-111+qnkqn
> Lqnlogqn.

But if y1 then [y]>y2, so L=[mqn]>m2qn. Hence by (15),

j=1m1jx>m2logqn>m2logqn+1K>m2logmK,

and thus there is some C>0 such that j=1m1jx>Cmlogm for all m1. ∎

The following is from Kuipers and Niederreiter [85, p. 131, Exercise 3.12].

Theorem 14.

If xΩ is of type <ψ, then for all m1 we have

j=1m1jjx<24((logm)2+ψ(m)+j=1mψ(j)j).
Proof.

Summation by parts is the following identity, which can be easily checked:

n=1Nan(bn+1-bn)=aN+1bN+1-a1b1-n=1Nbn+1(an+1-an).

Let aj=1j, let sj=h=1j1hx, let b1=0, and let bj=sj-1 for j2. Doing summation by parts gives

j=1m1jjx=1m+1sm-j=1msj(1j+1-1j)=1m+1sm+j=1msj1j(j+1).

As x is of type <ψ, we can use Theorem 10 to get sj<12j(ψ(j)+logj) for each j1. Therefore

j=1m1jjx < 1m+112m(ψ(m)+logm)+j=1m12(ψ(j)+logj)j+1
< 12(ψ(m)+logm)+12(logm)2+12j=1mψ(j)j+1
< 24(logm)2+12ψ(m)+12j=1mψ(j)j+1.

Erdős [45] proves that for almost all x,

j=1m1jjx=(1+o(1))(logm)2.

Kruse [84] gives a comprehensive investigation of the sums j=1m1jsjxt, s,t0. The results depend on whether s and t are are less than, equal, or greater than 1, and on whether t<s. One of the theorems proved by Kruse is the following [84, p. 260, Theorem 7]. If t>1 and 0st, and if ϵ>0, then for almost all x we have

j=1m1jsjxt=O(mt-s(logm)(1+ϵ)t).

Haber and Osgood [56, p. 387, Theorem 1] prove that for real t1, A>1, M>0, r>0, there is some C=C(t,A,M,r)>0 such that for all xΩ satisfying qn+1(x)<Mqn(x)r, for all positive integers K,

n=K+1[AK]nx-t>{CKlogKt=1CK1+(t-1)/rt>1.

We remind ourselves that according to Theorem 4, the elements of 𝒟(r+1) are those xΩ for which there is some c(x)>0 such that qn+1(x)C(x)qn(x)r for all n1.

For x+12, define {{x}}=12. If x+12, then there is an integer mx for which |x-mx|<|x-n| for all integers nmx, and we define {{x}}=x-mx. Sinai and Ulcigrai [138, p. 96, Proposition 2] prove that if α has bounded partial quotients, then there is some C(α) such that for all M,

|m=1M1{{mα}}|C(α)M.

7 Weyl’s inequality, Vinogradov’s estimate, Farey fractions, and the circle method

Write

𝒜={(a,q)2:gcd(a,q)=1,q1}.

We first prove four estimates following Nathanson [107, pp. 104–110, Lemmas 4.8–4.11] that we will use in what follows; cf. Vinogradov [149, p. 26, Chapter I, Lemma 8b].

Lemma 15.

There is some C such that if α, (a,q)𝒜, and

|α-aq|1q2,

then

1rq/21αrCqlogq.
Proof.

For q=1, 1rq/21αr=0. For q2, let 1rq2. As gcd(a,q)=1 and r0(modq), ar0(modq). So for μr=[arq], there is some 1σrq-1 such that ar=μrq+σr. Then

arq=σrq{σrq,1-σrq}={σrq,q-σrq}.

Put srq=arq, so (i) sr=σr or (ii) sr=q-σr. In case (i), srq=arq-μr. In case (ii), srq=1-(arq-μr). In case (i) let ϵr=1,mr=μr, and in case (ii) let ϵr=-1,mr=μr+1. Thus, whether (i) or (ii) holds we have

srq=ϵr(arq-mr),srq=arq,1srq2.

Write

α-aq=θq2,

for some real θ, |θ|1. For θr=2rqθ, which satisfies |θr||θ|1,

αr=arq+rθq2=arq+θr2q.

Then

αr =arq+θr2q
=ϵrsrq+mr+θr2q
ϵrsrq+mr-θr2q
=srq-|θr2q|
srq-12q.

Take 1r1,r2q2 and suppose that sr1=sr2. So

ϵr1(ar1q-mr1)=ϵr2(ar2q-mr2)

hence ar1ϵr1ϵr2ar2(modq). As gcd(a,q)=1, r1ϵr1ϵr2r2(modq). Because 1r1,r2q, if r1r2(modq) then r1=r2 and if r1-r2(modq) then r1=q2 and r2=q2, so in any case r1=r2. Therefore

{srq:1rq2}={sq:1sq2}.

Using the two things we have established,

1rq/21αr 1rq/21srq-12q
=1sq/21sq-12q
=2q1sq/212s-1
2q1sq/21s
2q(logq2+γ+O(q-1))
=O(qlogq).

Lemma 16.

There is some C such that if α, (a,q)𝒜, and

|α-aq|1q2,

then for any positive real V and nonnegative integer h,

r=1qmin(V,1α(hq+r))C(V+qlogq).
Proof.

Write

α=aq+θq2,

which satisfies |θ|1, and for 1rq define

δr=R(θh)+θrq,

which satisfies -1δr<2. Then

α(hq+r) =(aq+θq2)(hq+r)
=ah+arq+θhq+θrq2
=ah+arq+R(θh)+[θh]q+θrq2
=ah+ar+[θh]+δrq.

For mr=[ar+[θh]+δrq2],

R(α(hq+r))=R(ar+[θh]+δrq)=ar+[θh]+δrq-mr.

Suppose that t[0,1-1q] and that tR(α(hq+r))t+1q. Then

qtar+[θh]+δr-qmrqt+1.

This implies, as δr-1,

ar-qmrqt+1-[θh]-δrqt+1-[θh]+1=qt-[θh]+2

and, as δr<2,

ar-qmrqt-[θh]-δr>qt-[θh]-2,

so ar-qmrJt, writing

Jt=(qt-[θh]-2,qt-[θh]+2].

For 1r1,r2q, if ar1-qmr1=ar2-qmr2 then ar1ar2(modq), and gcd(a,q)=1 implies r1r2(modq); and 1r1,r2q so r1=r2. For t[0,1-1q], four integers belong to Jt, hence

{1rq:ar-qmrJt}

has at most four elements. But

{1rq:R(α(hq+r))[t,t+1q]}{1rq:ar-qmrJt}.

Now,

{1rq:α(hq+r)[t,t+1q]}={1rq:R(α(hq+r))[t,t+1q]}{1rq:1-R(α(hq+r))[t,t+1q]}={1rq:R(α(hq+r))[t,t+1q]}{1rq:R(α(hq+r))[1-1q-t,1-t]},

whence

{1rq:α(hq+r)[t,t+1q]} {1rq:ar-qmrJt}
{1rq:ar-qmrJ1-1q-t}.

This shows that if t[0,1-1q] then

{1rq:α(hq+r)[t,t+1q]}

has at most eight elements. For 0k<q2, writing

Ik=[kq,kq+1q],

the set {1rq:α(hq+r)Ik} has at most eight elements. Therefore

1rqmin(V,1α(hq+r)) =0k<q/21rq,α(hq+r)Ikmin(V,1α(hq+r))
8V+1k<q/21rq,α(hq+r)Ik1α(hq+r)
8V+1k<q/28qk
=O(V+qlogq).

Lemma 17.

There is some C such that if α, (a,q)𝒜,

|α-aq|1q2,

U1 is a real number, and n is a positive integer, then

1kUmin(nk,1αk)C(nq+U+q)log2qU.
Proof.

For 1kU there is some 0hk<Uq and 1rkq such that k=qhk+rk, and then

1kUmin(nk,1αk) 0h<U/q1rqmin(nqh+r,1α(hq+r))
1rq/21αr+q/2<rqmin(nr,1αr)
+1h<U/q1rqmin(nqh+r,1α(hq+r))
C1qlogq+q/2<rqmin(nr,1αr)
+1h<U/q1rqmin(nqh+r,1α(hq+r)),

the last inequality by Lemma 15. If q2<rq then 1r<2q=2(h+1)q for h=0, and if 1h<Uq and 1rq then hh+12 so hq+r>hq(h+1)q2 and hence 1hq+r<2(h+1)q, whence

q/2<rqmin(nr,1αr)+1h<U/q1rqmin(nqh+r,1α(hq+r))2q/2<rqmin(n(h+1)q,1αr)+21h<U/q1rqmin(n(h+1)q,1α(hq+r)).

Consquently

1kUmin(nk,1αk)C1qlogq+20h<U/q1rqmin(n(h+1)q,1α(hq+r)).

Lemma 16 with V=n(h+1)q says

1rqmin(n(h+1)q,1α(hq+r))C2(n(h+1)q+qlogq),

therefore

1kUmin(nk,1αk) qlogq+0h<U/q(n(h+1)q+qlogq)
qlogq+nq1h<Uq+11h+q(logq)(Uq+1)
qlogq+nqlog(Uq+1)+Ulogq
Ulog2qU+qlog2qU+nqlog(Uq+1).

If Uq then Uq+122qU, and if U>q then Uq+1U+12U2qU, hence

1kUmin(nk,1αk)Ulog2qU+qlog2qU+nqlog2qU.

Lemma 18.

There is some C such that if α, (a,q)𝒜,

|α-pq|1q2,

and U,V1 are real numbers, then

1kUmin(V,1αk)C(q+U+V+UVq)max{1,logq}.
Proof.

For 1kU there is some 0hk<Uq and 1rkq such that k=qhk+rk, and then, as in the proof of Lemma 17,

1kUmin(V,1αk) 0h<U/q1rqmin(V,1α(hq+r))
C1qlogq+20h<U/q1rqmin(V,1α(hq+r)).

Using Lemma 16,

1kUmin(V,1αk) C1qlogq+2C20h<U/q(V+qlogq)
qlogq+(V+qlogq)(Uq+1)
qlogq+UVq+V+Ulogq.

Weyl’s inequality [107, p. 114, Theorem 4.3] is the following. For k2 and ϵ>0, there is some C(k,ϵ) such that if α, f(x) is a real polynomial with highest degree term αxk, (a,q)𝒜, and

|α-aq|1q2,

then, writing SN(f)=j=1Ne2πif(j) and K=2k-1,

|SN(f)|C(k,ϵ)N1+ϵ(N-1+q-1+N-kq)1K.

Weyl’s inequality is proved using Lemma 18.

Montgomery [103, Chapter 3] gives a similar but more streamlined presentation of Weyl’s inequality. Chandrasekharan [29] gives a historical survey of exponential sums.

Vinogradov’s estimate [146, p. 26, Theorem 3.1] states that there is some C such that for n2, 1qn, gcd(a,q)=1, and |α-aq|q-2, then

|fn(α)|C(nq-1/2+n4/5+n1/2q1/2)(logn)4, (16)

where fn(α)=pn(logp)e2πiαp; cf. Nathanson [107, p. 220, Theorem 8.5] and Vinogradov [149, p. 131, Chapter IX, Theorem 1]. This is proved using Lemma 17.

Fix B>0 and let Pn=(logn)B. For 1aqPn and gcd(a,q)=1, let

𝔐n(q,a)={α:|α-aq|Pnn-1},

called a major arc. One checks that there is some nB such that if nnB, then 𝔐n(q,a) and 𝔐n(q,a) are disjoint when (q,a)(q,a). Let

𝔐n=1aqPn,gcd(a,q)=1𝔐n(q,a).

The Farey fractions of order N are

𝔉N={hk:0hkN,gcd(h,k)=1}.

Cf. the Stern-Brocot tree [55, §4.5]. For early appearances of Farey fractions, see Dickson [35, pp. 155–158, Chapter V]. It is proved by Cauchy that if h/k and h/k are successive elements of 𝔉N, then kh-hk=1 [59, p. 23, Theorem 28]. Let

ϕ(m)=|{1km:gcd(k,m)=1}|,

the Euler phi function, and write Φ(N)=1mNϕ(m). One sees that |𝔉N|=1+Φ(N), and it was proved by Mertens [59, p. 268] that

Φ(N)=3N2π2+O(NlogN).

Let λ be Lebesgue measure on . For nnB, because the major arcs are pairwise disjoint,

λ(𝔐n) =1aqPn,gcd(a,q)=12Pnn-1
=(m=1Pnϕ(m))2Pnn-1
=6π2Pn3n-1+O(Pn2n-1logPn).

Let In=(Pnn-1,1+Pnn-1], which for n>2Pn2 contains 𝔐n. Let

𝔪n=In𝔐n,

called the minor arcs.

With fn(α)=pn(logp)e2πiαp and

R(n)=p1+p2+p3=n(logp1)(logp2)(logp3),

we have

R(n)=Infn(α)3e-2πinα𝑑α=𝔐nfn(α)3e-2πinα𝑑α+𝔪nfn(α)3e-2πinα𝑑α.

Using (16), it can be proved that for A>0 with B2A+10 [146, p. 29, Theorem 3.2],

𝔪n|fn(α)|3𝑑α=O(n2(logn)-A).

Writing

𝔖(n)=(pn(1+(p-1)-3))pn(1-(p-1)-2),

called the singular series, it is proved, using the Siegel-Walfisz theorem on primes in arithmetic progressions [102, p. 381, Corollary 11.19], that for A>0 with B2A [146, p. 31, Theorem 3.3],

𝔐nfn(α)3e-2πinα𝑑α=12n2𝔖(n)+O(n2(logn)-A).

Thus

R(n)=12n2𝔖(n)+O(n2(logn)-A),

and it follows from this that there is some n0 such that if nn0 is odd then there are primes p1,p2,p3 such that n=p1+p2+p3.

For integers a,b with gcd(a,b)=1, the Ford circle C(a,b) is the circle in that touches the line Imz=0 at z=ab and has radius 12b2; in other words, C(a,b) is the circle in with center ab+i2b2 and radius 12b2. It is straightforward to prove that if C(a,b) and C(c,d) are Ford circles, then they are tangent if and only if (bc-ad)2=1, and otherwise they are disjoint [4, p. 100, Theorem 5.6]. It is also straightforward to prove [4, p. 101, Theorem 5.7] that if h1k1<hk<h2k2 are successive elements of 𝔉N, then C(h1,k1) and C(h,k) touch at

hk-k1k(k2+k12)+ik2+k12

and C(h,k) and C(h2,k2) touch at

hk+k2k(k2+k22)+ik2+k22.

Bonahon [21, pp. 207 ff., Chapter 8] explains Ford circles in the language of hyperbolic geometry.

We remind ourselves that |𝔉N|=1+Φ(N)=1+m=1Nϕ(m) and let ρ0,N<<ρΦ(N),N be the elements of 𝔉N. In particular, ρ0,N=0 and ρΦ(N),N=1. For ρn,N=hn,Nkn,N with gcd(hh,N,kn,N)=1, write Cn,N=C(hn,N,kn,N).

Let A0,N be the clockwise arc of C0,N from i to the point at which C0,N and C1,N touch. For 0<n<Φ(N), let An,N be the clockwise arc of Cn,N from the point at which Cn-1,N and Cn,N touch to the point at which Cn,N and Cn+1,N touch. Finally, let AΦ(N),N be the clockwise arc of CΦ(N),N from the point at which CΦ(N)-1,N and CΦ(N),N touch to i+1. Let AN be the composition of the arcs

A0,N,,AΦ(N),N, (17)

which is a contour from i to i+1.

Write H={τ:Imτ>0}. The Dedekind eta function η:H is defined by

η(τ)=eπiτ/12m=1(1-e2πimτ).

It is straightforward to check that η is analytic and that η(τ)0 for all τH [143, pp. 17–18, §1.44]. For (h,k)𝒜, let

s(h,k)=r=1k-1rk(hrk-[hrk]-12)=r=1k-1rkP1(hr/k),

called a Dedekind sum; P1 is the periodic Bernoulli function. Also, for (abcd)SL2() write

ϵ(a,b,c,d)=exp(πi(a+d12c+s(-d,c))).

The functional equation for the Dedekind eta function [4, p. 52, Theorem 3.4] is

η(aτ+bcτ+d)=ϵ(a,b,c,d)(-i(cτ+d))1/2η(τ),(abcd)SL2(),τH.

Let p(n) be the number of ways of writing n as a sum of positive integers where the order does not matter, called the partition function. For example, 4,3+1,2+2,2+1+1,1+1+1+1 are the partitions of 4, so p(4)=5. Denoting by D(0,1) the open disc with center 0 and radius 1, define F:D(0,1) by

F(z)=m=1(1-zm)-1=n=0p(n)zn;

that the product and the series are equal was found by Euler. F is analytic. On the one hand, p(n)=F(n)(0)n!, and on the other hand, by Cauchy’s integral formula [143, p. 82, Theorem 2.41], if C is a circle with center 0 and radius 0<R<1 then

F(n)(0)=n!2πiCF(z)zn+1𝑑z.

Taking C to be the circle with center 0 and radius e-2π and doing the change of variable z=e2πiτ,

p(n) =12πiii+1F(e2πiτ)e2πi(n+1)τ2πie2πiτ𝑑τ
=ii+1F(e2πiτ)e-2πinτ𝑑τ
=ANF(e2πiτ)e-2πinτ𝑑τ,

where we remind ourselves that AN is the contour (17) from i to i+1. Using this and the functional equation for the Dedekind eta function, Rademacher [4, p. 104, Theorem 5.10] proves that for n1,

p(n)=1π2k=1Ak(n)k1/2ddnsinh(π231kn-124)n-124,

for Ak(n)=0h<k,gcd(h,k)=1eπis(h,k)-2πinh/k.

We make a final remark about the Farey fractions. Writing the elements of 𝔉N as ρ0,N<<ρΦ(N),N, let ηn,N=ρn,N-nΦ(N) for 1nN. For example, Φ(5)=10 and

{ρ1,5,,ρ10,5}={15,14,13,25,12,35,23,34,45,1},

and

{η1,5,,η10,5}={110,120,130,0,0,0,-130,-120,-110,0}.

Landau [91], following work of Franel, proves that the Riemann hypothesis is true if and only if for every ϵ>0,

n=1Φ(N)|ηn,N|=O(N12+ϵ).

For example, for N=5, the left-hand side is 1130. See Narkiewicz [106, p. 40, §2.2.3].

8 Discrepancy and exponential sums

Discrepancy and Diophantine approximation are covered by Kuipers and Niederreiter [85, Chapters 1–2 ] and by Drmota and Tichy [41, §§1.1–1.4], especially [41, pp. 48–66, §1.4.1].

Let ω=(xn), n1, be a sequence of real numbers, and let E[0,1). For a positive integer N, let A(E;N;ω) be the number of xn, 1nN, such that R(xn)E. We say that the sequence ω is uniformly distributed modulo 1 if we have for all a and b with 0a<b1 that

limNA([a,b);N;ω)N=b-a.

It can be shown [85, p. 3, Corollary 1.1] that a sequence xn is uniformly distributed modulo 1 if and only if for every Riemann integrable function f:[0,1] we have

limN1Nn=1Nf(R(xn))=01f(t)𝑑t. (18)

Thus, if a sequence is uniformly distributed then the integral of any Riemann integrable function on [0,1] can be approximated by sampling according to this sequence. This approximation can be quantified using the notion of discrepancy.

It can be proved [85, p. 7, Theorem 2.1] that a sequence xn is uniformly distributed modulo 1 if and only if for all nonzero integers h we have

limN1Nn=1Ne2πihxn=0;

this is called Weyl’s criterion. But if xΩ, then

|n=1Ne2πihnx|=|1-e2πihNx1-e2πihx|2|1-e2πihx|=1|sinπhx|. (19)

We thus obtain the following theorem.

Theorem 19.

If xΩ then the sequence nx is uniformly distributed modulo 1.

The discrepancy of a sequence ω is defined, for N a positive integer, by

DN(ω)=sup0a<b1|A([a,b);N;ω)N-(b-a)|

One proves that the sequence ω is uniformly distributed modulo 1 if and only if DN(ω)0 as N [85, p. 89, Theorem 1.1].

For f:[0,1], let V(f) denote the total variation of f. Koksma’s inequality [85, p. 143, Theorem 5.1] states that for any sequence ω=(xn), for any f:[0,1] of bounded variation, and for any positive integer N, we have

|1Nn=1Nf(R(xn))-01f(t)𝑑t|V(f)DN(ω). (20)

Following Kuipers and Niederreiter [85, p. 122, Lemma 3.2], we can bound the discrepancy of the sequence nx in terms of the sum on the left-hand side of Theorem 14

Lemma 20.

There is some C>0 such that for all xΩ, ω=(nx), and for all positive integers m we have

DN(ω)<C(1m+1Nj=1m1jjx).
Proof.

We shall use the following inequality, which lets us bound the discrepancy of a sequence in terms of exponential sums formed from the elements of the sequence. The Erdős-Turán theorem [85, p. 114, Eq. 2.42] states that there is some constant C>0 such that for any sequence ω=(xn) of real numbers, any positive integer N, and any positive integer m we have

DN(ω)C(1m+j=1m1j|1Nn=1Ne2πijxn|). (21)

Take xn=nx. For each j1, by (19) we have

|n=1Ne2πijnx|1|sinπjx|=1sin(πjx).

But sint2πt for 0tπ2, so

1sin(πjx)12jx<1jx.

Using this in (21) gives us

DN(ω)<C(1m+j=1m1j1N1jx)=C(1m+1Nj=1m1jjx),

which is the claim. ∎

It follows from Theorem 14 and Lemma 20 (taking m=N) that if x is of type <K(logh)1+ϵ then

DN(ω)=O((logN)2+ϵN). (22)

Lemma 8 tells us that for almost all xΩ there is some K>0 such that x is of type <K(logh)1+ϵ, so for almost all xΩ, the bound (22) is true. It likewise follows that if x has bounded partial quotients then

DN(ω)=O((logN)2N). (23)

In fact, it can be proved that if xK then, for g=1+52 [85, p. 125, Theorem 3.4],

DN(ω)3N-1+(1logg+Klog(K+1))N-1logN.

We use the above bounds in the proof of the following theorem.

Theorem 21.

Let ϵ>0. For almost all x we have

n=1Nnx=N4+O((logN)2+ϵ),

while if x has bounded partial quotients then

n=1Nnx=N4+O((logN)2).
Proof.

Let f(t)=t. Then V(f)=1 and 01f(t)𝑑t=14, so we get from Koksma’s inequality (20) that

|1Nn=1Nnx-14|DN(ω),

thus

n=1Nnx=N4+O(NDN(ω)).

The claims then follow respectively from (22) and (23). ∎

Like we mentioned at the beginning of §6, because |sin(πx)|=sin(πx)πx and |sin(πx)|=sin(πx)2ππx=2x, we have

2n=1Nnxn=1N|sin(πnx)|πn=1Nnx.

Thus Theorem 21 also gives estimates for n=1N|sin(πnx)|.

We can investigate the sum n=1NR(nx) rather than n=1Nnx; see Lang [92, p. 37, Theorem 1], who proves that for almost all xΩ,

n=1NR(nx)=N2+O((logN)2+ϵ).

For xΩ, let qn=qn(x), the denominator of the nth convergent of the continued fraction expansion of x, and let an=an(x), the nth partial quotient of the continued fraction expansion of x. For m1, one can prove [18, p. 211, Proposition 1] that m can be written in one and only one way in the form

m=k=1zkqk-1=k=1tzkqk-1, (24)

where (i) 0z1a1-1, (ii) 0zkak for k2, (iii) for k1, if zk+1=ak+1 then zk=0, and (iv) zt0 and zk=0 for k>t. The expression (24) is called the Ostrowski expansion of m. We emphasize that this expansion depends on x. Berthé [18] surveys applications of this numeration system in combinatorics. For n0, define d2n=q2nx-p2n and d2n+1=p2n+1-q2n+1x. Brown and Shiue [24, p. 184, Theorem 1] prove that for xΩ,

k=1m(R(kx)-12)=k=1t(-1)kzk(12-dk-1(mk-1+12zkqk-1+12)), (25)

where m0=0 and if k1 then mk=j=1kzjqj-1. If k0, then by (11) we have 0<dk<1qk+1. For k1, using the fact that qkmk+1 (for the same reason that if the highest power of 2 appearing in a number’s binary expansion is 2k-1, then the number is 2k-1),

mk-1+12zkqk-1+12 = mk-12zkqk-1+12
qk-1-12zkqk-1+12
= qk-12-12zkqk-1
< qk.

Using (25), this inequality, and the inequality 0<dk<1qk+1, we obtain

|k=1m(R(kx)-12)| k=1tzk|12-dk-1(mk-1+12zkqk-1+12)|
< 12k=1tzk
< 12k=1tak.

If the continued fraction expansion of x has bounded partial quotients, say akK for all k, we obtain from the above that

|k=1m(R(kx)-12)|<Kt2.

It can be proved [24, p. 185, Fact 2] that t<3logm. Thus, if akK for all k, then for m1,

|k=1m(R(kx)-12)|<3Klogm2.

This is Lerch’s claim stated in §3. For example, if x=-1+52Ω then ak(x)=1 for all k1. We compute that

k=11000000(R(kx)-12)=0.941799;

on the other hand, we compute that

k=11000000(R(kπ)-12)=19.223414.

Brown and Shiue [24, p. 185, Fact 1] use (25) to obtain the result of Sierpinski stated in §3 that for all xΩ,

k=1mR(kx)=m2+o(m).

They also prove [24, p. 188, Theorem 4] that for A>0, there exists some dA>0 such that for xΩ, if there are infinitely many t such that k=1takAt (which happens in particular if x has bounded partial quotients), then there are infinitely many m such that

k=1m(R(kx)-12)>dAlogm,

and there are infinitely many m such that

k=1m(R(kx)-12)<-dAlogm.

It can be shown [92, p. 44, Theorem 4] that if k is a positive integer and ϵ>0, then for almost all x we have

|n=1Ne2πinkx|=O(N12+ϵ).

Lang attributes this result to Vinogradov. But it is not so easy to obtain a bound on this exponential sum for specific x. For k=2, one can prove [92, p. 45, Lemma] that for any xΩ,

|n=1Ne2πin2x|2N+4n=1N1|sin4πnx|<N+4n=14N1|sinπnx|;

cf. Steele [142, Problem 14.2]. By (14) this gives us

|n=1Ne2πin2x|2<N+2n=14N1jx.

If x has bounded partial quotients, it follows from Theorem 11 that

|n=1Ne2πin2x|=O(N1/2(logN)1/2).

Hardy and Littlewood [57, p. 28, Theorem B5] prove that if xΩ is an algebraic number, then there is some 0<α(x)<1 such that n=1NR(nx)=N2+O(Nα(x)). Pillai [115] gives a different proof of this.

Theorem 22 (Hardy and Littlewood, Pillai).

For τ>2, if x𝒟(τ) then for α=τ-2τ-1,

n=1NR(nx)=N2+O(Nα).

Pillai [114] proves other identities and inequalities for n=1NR(nx), some for all xΩ and some for all algebraic xΩ.

For ω=(xn), n1 and for E[0,1), we remind ourselves that A(E;M;ω) denotes the number of xn, 1nM, such that R(xn)E. Define for M1,

DM*(ω)=sup0<β1|A([0,β);M;ω)M-β|.

It is straightforward to prove that DM*DM2DM* [85, p. 91, Theorem 1.3]. For N1 write r=logN. Let ϵ0>0 and let N1/2τNexp(-rϵ0). Suppose that α and

α=aq+θqτ,gcd(a,q)=1,exp(rϵ0)qτ,|θ|1.

For 0<β<1 denote by Hβ(N) the number of primes pN such that R(αp)β. Vinogradov [149, p. 177, Chapter XI, Theorem] proves that for ϵ>0,

Hβ(N)=βπ(N)+O(N(q-1+qN-1)12-ϵ+N45+ϵ),N.

Let ω=(pnα), n1, where pn is the nth prime. Using Vinogradov’s estimate, one proves that for α, DN*(ω)0 as N, which implies that the sequence (pnα) is uniformly distributed modulo 1. A clean proof of this is given by Pollicott [116, p. 200, Theorem 1], and this is also proved by Vaaler [145] using a Tauberian theorem. See also the early survey by Hua [68, pp. 98–99, §38].

Defining Sα(n)=k=1n(R(kα)-12), Beck [11, p. 14, Theorem 3.1] proves that there is some c>0 such that for every λ,

1N|{1nN:S2(n)clognλ}|12π-λexp(-u22)𝑑u

as N. Beck [12, p. 20, Theorem 1.2] further proves that if α is a quadratic irrational, there are C1=C1(α) and C2=C2(α)>0 such that for A,B, A<B,

1N|{0n<N:ASα(n)-C3logNC4logNB}|=(2π)-1/2ABe-u2/2𝑑u+O((logN)-1/10loglogN).

Beck and Chen [10]

9 Dirichlet series

The result of de la Vallée-Poussin [34] stated in §3 implies that there is no s such that for all irrational x the Dirichlet series n=1n-s|sinnπx| converges. It follows from this fact that there is no s such that for all irrational x the Dirichlet series n=1n-snx converges.

Lerch [97] in 1904 gives some statements without proof about the series

ν=1cotνωπ(2νπ)2m+1.

He states that if ω is a real algebraic number that does not belong to , then for sufficiently large m this series converges, and states, for example, that with ω=1+52,

5ν=1cotνωπ(2νπ)7=-810!.

Writing cr(θ)=n=1cot(πnθ)n2r-1, Berndt [17, p. 135, Theorem 5.1] proves that if θ is a real algebraic number of degree d and 1<d<2r-1 (to say that d>1 is to say that θ is irrational), then cr(θ) converges.

For a Dirichlet series n=1ann-s, one can show [143, pp. 289–290, §9.11] that if the series is convergent at s=σ0+it0, then for σ>σ0 and any t the series is convergent at s=σ+it. It follows that there is some σ0[-,] such if σ<σ0 then the series diverges at s=σ+it, and if σ>σ0 then the series converges at s=σ+it. We call σ0 the abscissa of convergence of the Dirichlet series. If each an is a nonnegative real number, then the function

F(s)=n=1ann-s,Res>σ0,

cannot be analytically continued to any domain that includes s=σ0 [67, p. 101, Proposition 18].

Let an be a sequence of complex numbers. It can be shown [143, pp. 292–293, §9.14] that if sn=a1++an and the sequence sn diverges, then the abscissa of convergence of the Dirichlet series n=1ann-s is given by

σ0=lim supnlog|sn|logn.

By Theorem 11 and Theorem 13 (taking, say, ϵ=1), for almost all xΩ there are C1,C2 such that for all positive integers n we have

C1nlogn<j=1n1jx<C2n(logn)2.

Thus, if an=1nx, then

logC1logn+1+loglognlogn<logsnlogn<logC2logn+1+2loglognlogn,

and hence

limnlogsnlogn=1.

It follows that for almost all xΩ the abscissa of convergence of the Dirichlet series n=11nxn-s is σ0=1.

Likewise, by Theorem 21 (taking ϵ=1), we get for almost all xΩ that j=1njx=n4+O((logn)3). We can then check that limnlogsnlogn=1, and hence that the abscissa of convergence of the Dirichlet series n=1nxn-s is σ0=1.

A 1953 result of Mahler [48, pp. 107–108] implies that if α is an algebraic number of degree d, then, for m=[2025(d-1)2], the Dirichlet series

n=1n-ssinnα

has abscissa of convergence σ0d(m+1)log(m+1), and the power series

n=1znsinnα

has radius of convergence 1.

Rivoal [126] presents later work on similar Dirichlet series. See also Queffélec and Queffélec [120]. Lalín, Rodrigue and Rogers [88] prove results about Dirichlet series of the form n=1n-scos(nπz). Duke and Imamoḡlu [42] review Hardy and Littlewood’s work on estimating lattice points in triangles, and prove results about lattice points in cones.

For θΩ, write

Rr,θ(ζ,m)=n=0m1r!Pr(ζ+nθ),

where Pr is a periodic Bernoulli function. Spencer [141] proves that for any ϵ>0 and almost all θΩ,

R1,θ(ζ,m)=O((logm)1+ϵ).

Another result Spencer proves in this paper is that if qn(θ)=O(qn-1h), then Rr,θ(ζ,m)=O(m1-rh) for 1r<h. Schoißengeier [132] gives an explicit formula for k=0N-1P2(kα).

10 Power series

For a power series anzn with radius of convergence 0R, the Cauchy-Hadamard formula [124, p. 111, Chapter 4, §1] states

R=1lim supn|an|1/n=lim infn|an|-1/n. (26)

The radius of convergence R is equal to the supremum of those t0 for which |an|tn is a bounded sequence.

Lemma 23.

If xΩ, then the power series

n=1znnx

and

n=1zn|sinπnx|,

have the same radius of convergence.

Proof.

The radii of convergence of these power series are respectively

lim infnnx1/nandlim infn|sin(πnx)|1/n.

On the one hand,

|sin(πnx)|=sin(πnx)πnx

Therefore, since limnπ1/n=1,

lim infn|sin(πnx)|1/nlim infn(πnx)1/n=lim infnnx1/n

On the other hand, since sint2πt for t0,

|sin(πnx)|=sin(πnx)2ππnx=2nx.

Therefore, using limn21/n=1, we have

lim infn|sin(πnx)|1/nlim infn(2nx)1/n=lim infnnx1/n,

showing that the two power series have the same radius of convergence. ∎

We show in the following theorem that for almost all x, the power series n=1znnx has radius of convergence 1.

Theorem 24.

For almost all xΩ, the power series

n=1znnx (27)

has radius of convergence 1.

Proof.

For xΩ, let Rx be the radius of convergence of the power series (27). We have 0<nx<12, so 1nx>2. Therefore n=1N1nx as N, and so the power series (27) diverges at z=1. Therefore Rx1 for all xΩ.

We shall use Lemma 7 to get a lower bound on Rx that holds for almost all xΩ. Let A={xΩ:Rx<1}, let Am={xΩ:Rx<1-1m}, and let Bm be those xΩ such that nx1/n<1-1m infinitely often. If xAm, then Rx=lim infnnx1/n<1-1m, and this implies that there are infinitely many n such that nx1/n<1-1m, so xBm, i.e. AmBm. But let fm(n)=(1-1m)n. Then n=1fm(n) converges, since 1-1m<1, so, by Lemma 7, for almost all xΩ there are only finitely many n such that nx<fm(n). Thus μ(Bm)=0. Hence μ(Am)=0, and since A=m=2Am we get that

μ(A)m=2μ(Am)=0,

that is, Rx1 for almost all xΩ. In conclusion, Rx=1 for almost all xΩ. ∎

In fact, we can prove the above theorem using the bounds we obtained in Theorem 11. By Theorem 11, for almost all xΩ we have that j=1m1jx=O(m2). (Here we will merely need the fact that the sum is subexponential in m.) For such an x, take 0<r<1. Let an=rn, let sn=j=1n1jx, let b1=0, and let bn=sn-1 for n2. Using summation by parts, namely

n=1Nan(bn+1-bn)=aN+1bN+1-a1b1-n=1Nbn+1(an+1-an),

we get

n=1Nrnnx=rN+1sN+n=1Nsnrn(1-r).

Therefore

n=1Nrnnx=O(rN+1N2)+O(n=1Nn2rn)=O(1).

Since n=1Nrnnx is increasing in N (being a sum of positive terms), we obtain that the series n=1rnnx converges. Since this is true for all r with 0<r<1, it follows that Rx1.

For xΩ let Rx be the radius of convergence of the power series q=1zqqx. We proved in Theorem 27 that for almost all xΩ, Rx=1.

Theorem 25.

For xΩ, let Rx be the radius of convergence of the power series q=1zqqx, and let an=an(x) and qn=qn(x). Then

Rx=lim infnan+1-1/qn.

For any 0R1 there is some xΩ such that Rx=R.

Proof.

From the Cauchy-Hadamard formula (26),

Rx=lim infqqx1/q.

Then Rxlim infnqnx1/qn. On the one hand, by (11), qnx<qn+1-1, and qn+1=an+1qn+qn-1>an+1qn hence qnx<an+1-1qn-1, and using that limnqn-1/qn=1,

Rxlim infnan+1-1/qnqn-1/qn=lim infnan+1-1/qn.

On the other hand, let q2 and take qnq<qn+1. Applying (12),

qnx>1qn+1+qn>12qn+1=12(an+1qn+qn-1)>14an+1qn.

Then applying Theorem 2, and using that 0<qnx<1 and qqn,

qx1/qqnx1/qqnx1/qn>(14an+1qn)1/qn.

As (4qn)-1/qn1 as n, this implies

Rx=lim infqqx1/qlim infnan+1-1/qn.

For 0<R<1, let R=e-r for r>0. Define a as follows. Define a1=1. Suppose for n1 that we have defined a1,,an and thus p1,,pn and q1,,qn. Define an+1=[erqn]. Then an+11/qner, so an+1-1/qne-r. Therefore for x=v(a), Rxe-r. Now, erqn>1 so an+1=[erqn]2-1erqn. Then an+11/qn2-1/qner, hence

Rx=lim infnan+1-1/qnlim infn21/qne-r=e-r.

We have therefore established that when x=v(a), Rx=e-r.

For R=0, define a by a1=1 and an+1=[enqn], which satisfies an+12-1enqn. For x=v(a),

Rx=lim infnan+1-1/qnlim infn21/qne-n=0.

For R=1, define a by an=1 for all n1. Namely, v(a)=-1+52Ω. For x=v(a) it is immediate that Rx1. ∎

Since R(nx)<1, of course the power series n=1R(nx)zn has radius of convergence 1. The following result, for which Pólya and Szegő [117, p. 280, Part II, No. 168] cite Hecke, shows in particular that the radius of convergence of this power series is 1 for xΩ and is thus equal to 1.

Theorem 26.

For xΩ, let

f(z)=n=1R(nx)zn,|z|<1.

We have

limr1-(1-r)f(re2πix)=12πi.
Proof.

Since xΩ, the sequence nx is uniformly distributed modulo 1. Therefore, with f(t)=te2πit we have by (18) that

limN1Nn=1NR(nx)e2πinx=limN1Nn=1Nf(R(nx))=01te2πit𝑑t=12πi.

We will use the following result [117, p. 21, Part I, No. 88]. If a sequence of complex numbers an satisfies limN1Nn=1Nan=s, then

limt1-(1-t)n=1antn=s.

Let an=R(nx)e2πinx, and we thus have

limt1-(1-t)n=1R(nx)e2πinxtn=12πi.

It follows from the above theorem that if xΩ then |z|=1 is a natural boundary of the function f defined on the open unit disc by f(z)=n=1R(nx)zn; cf. Segal [133, p. 255, Chapter 6], who writes about this power series, and who gives a thorough introduction to natural boundaries in the same chapter. Breur and Simon [22] prove a generalization of this result.

Hata [62, p. 173, Problem 12.6] mentions the appearance of the function f from the above theorem in the study of the Caianiello neuron equations.

11 Product

We will use the following lemma proved by Hardy and Littlewood [58, p. 89], whose brief proof we expand.

Lemma 27.

Let ψ:(0,) be positive and nondecreasing. If

k=11kψ(k)<,

then for almost all xΩ, there exists some H such that for all n1 and for all real hH, there are at most max{nψ(h)h,1} integers m{1,,n} that satisfy mx<1h.

Proof.

By Lemma 7, for almost all xΩ there is some K such that if kK then

kx2kψ(k). (28)

Let H be large enough so that

mink<Kkx2H;

also let ψ(H)1. Now suppose by contradiction that there is some n1 and some hH such that there are more than max{nψ(h)h,1} integers m{1,,n} that satisfy mx<1h. Then there are some 1m1<m2n satisfying m1x<1h and m2x<1h and such that

μ=m2-m1<nnψ(h)h=hψ(h),

so μψ(h)<h. On the other hand,

μxm1x+m2x<1h+1h=2h,

so h<2μx. Thus μψ(h)<2μx, i.e.,

μx<2μψ(h)2μψ(μ);

μ<h because μ<hψ(h) and ψ(h)ψ(H)1. Moreover, since μx<2h2H, we have μK. This contradicts (28). ∎

Hardy and Littlewood [58, p. 89, Theorem 4] prove the following theorem that gives us the conclusion (18) for certain functions that are not Riemann integrable on [0,1].

Theorem 28.

Let f:(0,1) be nonnegative, let f be nonincreasing on (0,12) and nondecreasing on (12,1), and let

01f(t)𝑑t<.

Let ψ:(1,) be a positive and nondecreasing function such that

k=21kψ(k)<.

If

01f(t)(ψ(1t)+ψ(11-t))𝑑t<,

then for almost all xΩ,

limn1nm=1nf(R(mx))=01f(t)𝑑t.
Proof.

For 0<δ<12, define

fδ(t)={f(t),δt1-δ,0,0<t<δor1-δ<t<1.

From Lemma 27, for almost all xΩ there is some C such that for all n and h there are at most Cnψ(h)h integers m{1,,n} satisfying mx<1h. Let

Sn=1nm=1nf(R(mx))=Sn1(δ)+Sn2(δ),

where

Sn1(δ)=1nm=1nfδ(R(mx))

and

Sn2(δ) = 1nm=1nf(R(mx))-fδ(R(mx))
= 1nmx<δ1mnf(R(mx))
= 1nk=0δ2k+1mx<δ2k1mnf(R(mx))
= 1nk=0Tk,n(δ).

There are at most Cnψ(2kδ)2kδ integers m{1,,n} that satisfy mx<δ2k, thus, as ψ is nondecreasing, there are at most 2Cnδψ(2kδ)2k+12Cnδψ(2k+1δ)2k+1 terms in Tk,n(δ). For each term f(R(mx)) in Tk,n(δ), since δ2k+1mx we have, by assumption on f, either f(R(mx))f(δ2k+1) or f(R(mx))f(1-δ2k+1), and hence

f(R(mx))f(δ2k+1)+f(1-δ2k+1).

Therefore,

Sn2(δ) 1nk=02Cnδψ(2k+1δ)2k+1(f(δ2k+1)+f(1-δ2k+1))
= 4Ck=0δ2k+2ψ(2k+1δ)f(δ2k+1)+4Ck=0δ2k+2ψ(2k+1δ)f(1-δ2k+1)
4C0δ2ψ(1t)f(t)𝑑t+4C1-δ21ψ(11-t)f(t)𝑑t.

Let ϵ>0. Because 01f(t)(ψ(1t)+ψ(11-t))𝑑t<, there exists a δ1 such that if δδ1 then Sn2(δ)<ϵ.

On the other hand, since fδ is Riemann integrable on [0,1] and because, by (19), the sequence mx is uniformly distributed modulo 1, we obtain from (18) that

limnSn1(δ)=01fδ(t)𝑑t=δ1-δf(t)𝑑t.

As 01f(t)𝑑t<, there exists a δ2 such that for δδ2 and for sufficiently large n,

|Sn1(δ)-01f(t)𝑑t||Sn1(δ)-δ1-δf(t)𝑑t|+|0δf(t)𝑑t|+|1-δ1f(t)𝑑t|<3ϵ.

Therefore, for sufficiently large n and for sufficiently small δ,

|Sn-01f(t)𝑑t||Sn1(δ)-01f(t)𝑑t|+|Sn2(δ)|<4ϵ.

Thus for sufficiently large n,

|Sn-01f(t)𝑑t|4ϵ.

By the Birkhoff ergodic theorem [44, p. 44, Theorem 2.30], if fL1[0,1] and xΩ, then for almost all α[0,1],

limn1nj=1nf(R(α+jx))=01f(t)𝑑t.

This equality holding for α=0 is the conclusion of Theorem 28.

Baxa [9] reviews further results that give conditions when a function f:[0,1]{+} that is not Riemann integrable on [0,1] nevertheless satisfies

limn1nm=1nf(R(mx))=01f(t)𝑑t

for certain xΩ. Oskolkov [109, p. 170, Theorem 1] shows that if f:(0,1) satisfies limt0+f(t)=+ and limt1-f(t)=+, and also the improper Riemann integral of f on [0,1] exists, then, for xΩ,

limn1nm=1nf(R(mx))=01f(t)𝑑t

if and only if

limn1qn(x)f(R(qn(x)x))=0,

where qn(x) is the denominator of the nth convergent of the continued fraction expansion of x.

Driver, Lubinsky, Petruska and Sarnak [40]

Using Theorem 28 we can now prove the following theorem of Hardy and Littlewood [58, p. 88, Theorem 2].

Theorem 29.

For almost all xΩ,

limn(k=1n|sinkπx|)1/n=12.
Proof.

Let f(t)=-logsinπt. Using cos(t-π2)=sint and sin2t=2sintcost, one can check that 01logsinπtdt=-log2. (The earliest evaluation of this integral of which we are aware is by Euler [46], who gives two derivations, the first using the Euler-Maclaurin summation formula, the power series expansion for log(1+z1-z), and the power series expansion of zcot(z), and the second using the Fourier series of log|sint|.) Thus, 01f(t)𝑑t=log2<. So f satisfies the conditions of Theorem 28.

Let ψ(t)=(logt)2. First, upper bounding the series by an integral,

k=21k(logk)212(log2)2+21t(logt)2𝑑t=12(log2)2+log2<.

Second,

01f(t)(ψ(1t)+ψ(11-t))𝑑t = 01f(t)(ψ(t)+ψ(1-t))𝑑t
= 012-2logsin(πt)
((logt)2+(log(1-t))2)dt
012-2log(2t)((logt)2+(log(1-t))2)dt
< .

Therefore by Theorem 28, for almost all xΩ,

limn1nm=1n-logsin(πR(mx))=01-logsinπtdt,

i.e.

limn1nm=1nlog|sinπmx|=log12.

Hardy and Littlewood give another proof [58, p. 86, Theorem 1] of the above theorem, which we now work out. This proof is complicated and we greatly expand on the abbreviated presentation of Hardy and Littlewood.

We remind ourselves that the Cauchy-Hadamard formula states that the radius of convergence R of a power series anzn satisfies

R=1lim supn|an|1/n=lim infn|an|-1/n.
Theorem 30.

Fix xΩ and write q0=e2πix. Let ρ and R respectively be the radii of convergence of the power series

f(z)=n=1znn(1-q0n),F(z)=1+n=1zn(1-q0)(1-q02)(1-q0n).

Then R=ρ, and if |z|<ρ then F(z)=ef(z).

The functions f:D(0,ρ) and F:D(0,R) are analytic [143, p. 69, Theorem 2.16].

Using the Cauchy-Hadamard formula we have

ρ=lim infnn1/n|1-q0n|1/nlim infn(2n)1/n=1

and

R=lim infn(|1-q0||1-q02||1-q0n|)1/n2ρ. (29)
Lemma 31.

For |u|=1 and for 0r<1,

|1-u1-ru|2.
Proof.
|1-u||1-ru|+|ru-u|=|1-ru|+1-r.

Because Reu1 we have 1-r1-rReu=Re(1-ru)|1-ru|. Hence |1-u|2|1-ru|, from which the claim follows. ∎

We assert the following as a common fact in complex analysis.

Lemma 32.

For |w|<1 define

L(w)=n=1wnn.

If |w|<1, then eL(w)=(1-w)-1.

For |q|,|z|<1, we define

f(z,q)=n=1znn(1-qn),F(z,q)=1+n=1zn(1-q)(1-q2)(1-qn).

For |q|<1, define c0(q)=0 and for n1,

cn(q)=1n(1-qn),

and thus for |z|<1,

f(z,q)=n=0cn(q)zn.

Furthermore define γ0=0 and for n1,

γn=1n(1-q0n),

and thus for |z|<ρ,

f(z)=n=0γnzn.

For |q|<1, define C0(q)=1 and for n1,

Cn(q)=1(1-q)(1-q2)(1-qn),

and thus for |z|<1,

F(z,q)=n=0Cn(q)zn.

Furthermore define Γ0=1 and for n1,

Γn=1(1-q0)(1-q02)(1-q0n),

and thus for |z|<R,

F(z)=n=0Γnzn.

We prove directly the following, which is an instance of the q-binomial formula [3, p. 17, Theorem 2.1].

Proposition 33.

If |q|,|z|<1 then

F(z,q)=ef(z,q).
Proof.

By Lemma 32,

f(z,q)=n=1znn(m=0qnm)=m=0(n=1(zqm)nn)=m=0L(zqm).

Because eL(w)=(1-w)-1 for |w|<1,

ef(z,q)=m=0eL(zqm)=m=0(1-zqm)-1.

Define

G(z,q)=m=0(1-zqm)-1=n=0gn(q)zn.

On the one hand, g0(q)=G(0,q)=1. On the other hand,

G(qz,q)=(1-z)G(z,q),

thus

n=0gn(q)zn+1=n=0gn(q)(1-qn)zn,

and therefore for n1 we have gn(q)=(1-qn)-1gn-1(q). Thus by induction, for n1,

gn(q)=1(1-q)(1-q2)(1-qn).

Hence

ef(z,q)=1+n=1zn(1-q)(1-q2)(1-qn)=F(z,q).

Proposition 34.

Rρ, and if |z|<ρ then ef(z)=F(z).

Proof.

If ρ=0 then the claim is immediate. Otherwise, 0<ρ1. Let 0<t<ρ, 0<r<1, and define G(θ)=F(teiθ,rq0). On the one hand,

G(θ)=n=0Cn(rq0)(teiθ)n=n=0Cn(rq0)tneinθ,

which implies that G^(n)=Cn(rq0)tn for n0 and G^(n)=0 for n<0. On the other hand, for n, using Proposition 33,

G^(n) =12π02πG(θ)e-inθ𝑑θ
=12π02πF(teiθ,rq0)e-inθ𝑑θ