Hensel’s lemma, valuations, and p-adic numbers

Jordan Bell
November 2, 2014

1 Hensel’s lemma

Let $p$ be prime and $f(x) \in \mathbb{Z}[x]$ (see Hua Loo Keng, Introduction to Number Theory, Chapter 15, “p-adic numbers”). Suppose that $a_0$, with $0 \le a_0 < p$, satisfies

$$f(a_0) \equiv 0 \pmod{p}$$

and

$$f'(a_0) \not\equiv 0 \pmod{p}.$$

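Using the Taylor expansion (a finite sum, since $f$ is a polynomial)

$$f(a_0+h) = f(a_0) + f'(a_0)h + \frac{f''(a_0)}{2}h^2 + \cdots,$$

for any $y \in \mathbb{Z}$ we have

$$f(a_0+py) = f(a_0) + f'(a_0)py + \frac{f''(a_0)}{2}p^2y^2 + \cdots$$

so

$$\frac{f(a_0+py)}{p} = \frac{f(a_0)}{p} + f'(a_0)y + \frac{f''(a_0)}{2}py^2 + \cdots.$$

Because $f(a_0) \equiv 0 \pmod{p}$ and the Taylor coefficients $f^{(k)}(a_0)/k!$ of a polynomial with integer coefficients are integers, each term on the right-hand side is an integer. Then, $f(a_0+py) \equiv 0 \pmod{p^2}$ is equivalent to

$$\frac{f(a_0)}{p} + f'(a_0)y + \frac{f''(a_0)}{2}py^2 + \cdots \equiv 0 \pmod{p},$$

i.e.,

$$f'(a_0)y \equiv -\frac{f(a_0)}{p} \pmod{p}.$$

Because $f'(a_0) \not\equiv 0 \pmod{p}$, there is a unique $y \pmod{p}$ that solves the above congruence, so there is a unique $y \pmod{p}$ that solves $f(a_0+py) \equiv 0 \pmod{p^2}$. This $y$ is

$$y \equiv -\frac{f(a_0)}{p}\,(f'(a_0))^{-1} \pmod{p}.$$

Let $a_1$, with $0 \le a_1 < p$, be defined by $a_1 \equiv y \pmod{p}$.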

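Suppose that

$$x = a_0 + a_1p + a_2p^2 + \cdots + a_{l-2}p^{l-2}, \qquad 0 \le a_j < p,$$

satisfies

$$f(x) \equiv 0 \pmod{p^{l-1}}$$

and

$$f'(x) \not\equiv 0 \pmod{p}.$$

Using the Taylor expansion

$$f(x+h) = f(x) + f'(x)h + \frac{f''(x)}{2}h^2 + \cdots,$$

for any $y \in \mathbb{Z}$ we have

$$f(x+p^{l-1}y) = f(x) + f'(x)p^{l-1}y + \frac{f''(x)}{2}p^{2l-2}y^2 + \cdots,$$

i.e.

$$\frac{f(x+p^{l-1}y)}{p^{l-1}} = \frac{f(x)}{p^{l-1}} + f'(x)y + \frac{f''(x)}{2}p^{l-1}y^2 + \cdots.$$

Because $f(x) \equiv 0 \pmod{p^{l-1}}$, each term on the right-hand side is an integer. Then, $f(x+p^{l-1}y) \equiv 0 \pmod{p^l}$ is equivalent to

$$\frac{f(x)}{p^{l-1}} + f'(x)y + \frac{f''(x)}{2}p^{l-1}y^2 + \cdots \equiv 0 \pmod{p},$$

i.e.,

$$f'(x)y \equiv -\frac{f(x)}{p^{l-1}} \pmod{p}.$$

Because $f'(x) \not\equiv 0 \pmod{p}$, there is a unique $y \pmod{p}$ that solves the above congruence, so there is a unique $y \pmod{p}$ that solves $f(x+p^{l-1}y) \equiv 0 \pmod{p^l}$. This $y$ is

$$y \equiv -\frac{f(x)}{p^{l-1}}\,(f'(x))^{-1} \pmod{p}.$$

Let $a_{l-1}$, with $0 \le a_{l-1} < p$, be defined by $a_{l-1} \equiv y \pmod{p}$.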

We have thus inductively defined a sequence $a_0, a_1, a_2, \ldots$, with $0 \le a_j < p$, such that for any $l \ge 1$,

$$f(a_0 + a_1p + \cdots + a_{l-1}p^{l-1}) \equiv 0 \pmod{p^l}.$$

We wish to make sense of the infinite expression

$$a_0 + a_1p + a_2p^2 + a_3p^3 + \cdots$$

Calling this $x$, it ought to be the case that $f(x) \equiv 0 \pmod{p}$, $f(x) \equiv 0 \pmod{p^2}$, $f(x) \equiv 0 \pmod{p^3}$, etc.

Example 1.

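Take $p = 3$ and $f(x) = x^2 - 7$, $f'(x) = 2x$. The two conditions $f(x) \equiv 0 \pmod{p}$ and $f'(x) \not\equiv 0 \pmod{p}$ are satisfied both by $a_0 = 1$ and $a_0 = 2$. Take $a_0 = 1$. Then

$$a_1 \equiv -\frac{f(1)}{3}(f'(1))^{-1} \equiv -\frac{-6}{3}(2)^{-1} \equiv 1 \pmod{3}.$$

So $a_1 = 1$. Then,

$$a_2 \equiv -\frac{f(1+1\cdot 3)}{3^2}(f'(1+1\cdot 3))^{-1} \equiv -\frac{9}{9}(8)^{-1} \equiv -2 \equiv 1 \pmod{3}.$$

So $a_2 = 1$. Then,

$$a_3 \equiv -\frac{f(1+1\cdot 3+1\cdot 3^2)}{3^3}(f'(1+1\cdot 3+1\cdot 3^2))^{-1} \equiv -6\cdot 2 \equiv 0 \pmod{3}.$$

So, $a_3 = 0$. Then,

$$a_4 \equiv -\frac{f(1+1\cdot 3+1\cdot 3^2+0\cdot 3^3)}{3^4}(f'(1+1\cdot 3+1\cdot 3^2+0\cdot 3^3))^{-1} \equiv -2\cdot 2 \equiv 2 \pmod{3}.$$

So, $a_4 = 2$, etc.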
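The lifting procedure of this section can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the cited sources: the function name `hensel_digits` and its parameters are hypothetical, the polynomial and its derivative are passed as callables, and Python 3.8 or later is assumed so that `pow(a, -1, p)` returns a modular inverse.

```python
def hensel_digits(f, df, a0, p, count):
    """Return the digits a_0, ..., a_{count-1} produced by Hensel lifting.

    f and df are callables for the polynomial and its derivative
    (a hypothetical interface chosen for this sketch).
    """
    if f(a0) % p != 0 or df(a0) % p == 0:
        raise ValueError("need f(a0) = 0 and f'(a0) != 0 modulo p")
    digits = [a0]
    x = a0  # x = a_0 + a_1 p + ... + a_{l-1} p^{l-1}
    for l in range(1, count):
        # f(x) is divisible by p^l, so this integer division is exact.
        quotient = f(x) // p**l
        # Solve f'(x) y = -f(x)/p^l (mod p) for the next digit y.
        y = (-quotient * pow(df(x), -1, p)) % p
        digits.append(y)
        x += y * p**l
    return digits


# Example 1: p = 3, f(x) = x^2 - 7, starting from a_0 = 1.
print(hensel_digits(lambda x: x * x - 7, lambda x: 2 * x, 1, 3, 5))
# prints [1, 1, 1, 0, 2]
```

Starting instead from $a_0 = 2$ lifts the other root of $x^2 - 7$ modulo powers of 3.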

2 Absolute values on fields

If $K$ is a field, an absolute value on $K$ is a map $|\cdot| : K \to \mathbb{R}_{\ge 0}$ such that $|x| = 0$ if and only if $x = 0$, $|xy| = |x||y|$, and $|x+y| \le |x| + |y|$. The trivial absolute value on $K$ is $|0| = 0$ and $|x| = 1$ for all nonzero $x \in K$.

If $|\cdot|$ is an absolute value on $K$, then $d(x,y) = |x-y|$ is a metric on $K$. The trivial absolute value yields the discrete metric. Two absolute values $|\cdot|_1, |\cdot|_2$ on $K$ are said to be equivalent if they induce the same topology on $K$.

The following theorem characterizes equivalent absolute values (see Absolute values, valuations and completion, https://www.math.ethz.ch/education/bachelor/seminars/fs2008/algebra/Crivelli.pdf).

Theorem 2.

Two nontrivial absolute values $|\cdot|_1, |\cdot|_2$ are equivalent if and only if there is some real $s > 0$ such that

$$|x|_1 = |x|_2^s, \qquad x \in K.$$

Proof.

Suppose that $s > 0$ and that $|x|_1 = |x|_2^s$ for all $x \in K$. Then

$$\begin{aligned}
B_{d_1}(x,r) &= \{y \in K : |y-x|_1 < r\} \\
&= \{y \in K : |y-x|_2^s < r\} \\
&= \{y \in K : |y-x|_2 < r^{1/s}\} \\
&= B_{d_2}(x, r^{1/s}).
\end{aligned}$$

Since the collection of open balls for $d_1$ is equal to the collection of open balls for $d_2$, the absolute values $|\cdot|_1, |\cdot|_2$ induce the same topology on $K$.

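Suppose that $|\cdot|_1, |\cdot|_2$ are equivalent. If $|x|_1 < 1$ then $d_1(x^n, 0) = |x^n|_1 = |x|_1^n \to 0$ as $n \to \infty$. Thus $x^n \to 0$ in $d_1$ and hence, because the topologies induced by $|\cdot|_1$ and $|\cdot|_2$ are equal, $x^n \to 0$ in $d_2$, i.e. $|x|_2^n = |x^n|_2 = d_2(x^n, 0) \to 0$. Therefore $|x|_2 < 1$. By symmetry, $|x|_1 < 1$ if and only if $|x|_2 < 1$.

Let $y \in K$ be such that $|y|_1 > 1$ (there is such an element because $|\cdot|_1$ is nontrivial and $|y^{-1}|_1 = |y|_1^{-1}$) and let $x \in K$ with $|x|_1 \neq 0, 1$. There is some nonzero $\alpha \in \mathbb{R}$ such that $|x|_1 = |y|_1^\alpha$. Let $\frac{m_i}{n_i}$ be rational numbers with $n_i > 0$, all greater than $\alpha$ and converging to $\alpha$. Then, because $|y|_1 > 1$, we have $|x|_1 = |y|_1^\alpha < |y|_1^{m_i/n_i}$, hence $|x|_1^{n_i} < |y|_1^{m_i}$, hence $\frac{|x^{n_i}|_1}{|y^{m_i}|_1} < 1$, hence

$$\left|\frac{x^{n_i}}{y^{m_i}}\right|_1 < 1.$$

Because $|\cdot|_1$ and $|\cdot|_2$ are equivalent,

$$\frac{|x|_2^{n_i}}{|y|_2^{m_i}} = \left|\frac{x^{n_i}}{y^{m_i}}\right|_2 < 1,$$

so $|x|_2 < |y|_2^{m_i/n_i}$. Taking $i \to \infty$ gives

$$|x|_2 \le |y|_2^\alpha.$$

Similarly, using rational numbers less than $\alpha$ that converge to $\alpha$, we check that

$$|x|_2 \ge |y|_2^\alpha.$$

Therefore,

$$|x|_2 = |y|_2^\alpha.$$

Using this and $|x|_1 = |y|_1^\alpha$, we have

$$\log|x|_1 = \alpha\log|y|_1, \qquad \log|x|_2 = \alpha\log|y|_2,$$

and so, as $\alpha \neq 0$,

$$\frac{\log|x|_1}{\log|x|_2} = \frac{\log|y|_1}{\log|y|_2}.$$

This is true for any $x \in K$ with $|x|_1 \neq 0, 1$. We define $s$ to be this common value. The fact that $|y|_1 > 1$ implies, because $|\cdot|_1$ and $|\cdot|_2$ are equivalent, that $|y|_2 > 1$, and so $s > 0$.

Now take $x \in K$. If $x = 0$ then $|x|_1 = 0 = 0^s = |x|_2^s$. Because $|\cdot|_1$ and $|\cdot|_2$ are equivalent, $|x|_2 > 1$ implies that $|x|_1 > 1$ and $|x|_2 < 1$ implies that $|x|_1 < 1$, so if $|x|_1 = 1$ then $|x|_2 = 1$ and hence $|x|_1 = 1 = 1^s = |x|_2^s$. If $|x|_1 \neq 0, 1$, then the above shows that

$$\frac{\log|x|_1}{\log|x|_2} = s,$$

i.e., $|x|_1 = |x|_2^s$, proving the claim. ∎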

An absolute value $|\cdot| : K \to \mathbb{R}_{\ge 0}$ is said to be non-Archimedean if

$$|x+y| \le \max\{|x|, |y|\}, \qquad x, y \in K.$$

An absolute value is called Archimedean if it is not non-Archimedean. For example, the usual absolute value on the field $\mathbb{R}$ is Archimedean, since $|1+1| = 2 > \max\{|1|, |1|\} = 1$.

Lemma 3.

If $|\cdot|$ is a non-Archimedean absolute value on a field $K$ and $|x| \neq |y|$, then

$$|x+y| = \max\{|x|, |y|\}.$$

3 Valuations

A valuation on a field $K$ is a function $v : K \to \mathbb{R} \cup \{\infty\}$ satisfying $v(x) = \infty$ if and only if $x = 0$, $v(xy) = v(x) + v(y)$, and

$$v(x+y) \ge \min\{v(x), v(y)\}.$$

The trivial valuation is $v(x) = 0$ for $x \neq 0$ and $v(0) = \infty$.

Lemma 4.

Let $v$ be a valuation on a field $K$. If $v(x) \neq v(y)$, then $v(x+y) = \min\{v(x), v(y)\}$.

Proof.

Take $v(y) < v(x)$. For $x = 0$,

$$v(x+y) = v(y) = \min\{\infty, v(y)\} = \min\{v(x), v(y)\}.$$

For $x \neq 0$, assume by contradiction that $\min\{v(x+y), v(x)\} = v(x)$. Then, since $v(-x) = v(-1 \cdot x) = v(-1) + v(x) = v(x)$,

$$v(x) > v(y) = v(x+y-x) \ge \min\{v(x+y), v(x)\} = v(x),$$

a contradiction. Hence $\min\{v(x+y), v(x)\} = v(x+y)$. Then

$$\begin{aligned}
v(y) &= v(x+y-x) \\
&\ge \min\{v(x+y), v(x)\} \\
&= v(x+y) \\
&\ge \min\{v(x), v(y)\} \\
&= v(y).
\end{aligned}$$

Hence $v(x+y) = v(y) = \min\{v(x), v(y)\}$, completing the proof. ∎

Theorem 5.

Let $K$ be a field. If $|\cdot|$ is a non-Archimedean absolute value on $K$ and $s > 0$, then $v_s : K \to \mathbb{R} \cup \{\infty\}$ defined by $v_s(x) = -s\log|x|$ for $x \neq 0$ and $v_s(0) = \infty$ is a valuation on $K$.

If $v$ is a valuation on $K$ and $q > 1$, then the function $|\cdot|_q : K \to \mathbb{R}_{\ge 0}$ defined by $|x|_q = q^{-v(x)}$ for $x \neq 0$ and $|0|_q = 0$ is a non-Archimedean absolute value on $K$.

Proof.

Suppose that $|\cdot|$ is a non-Archimedean absolute value on $K$ and that $s > 0$. Let $x, y \in K$. If either is $0$, then it is immediate that $v_s(xy) = \infty = v_s(x) + v_s(y)$. If neither is $0$, then

$$v_s(xy) = -s\log|xy| = -s\log(|x||y|) = -s\log|x| - s\log|y| = v_s(x) + v_s(y).$$

Now, if both $x, y$ are $0$ then

$$v_s(x+y) = v_s(0) = \infty = \min\{\infty, \infty\} = \min\{v_s(x), v_s(y)\}.$$

If $x = 0$ and $y \neq 0$ then

$$v_s(x+y) = v_s(y) = -s\log|y| = \min\{-s\log|y|, \infty\} = \min\{v_s(y), v_s(x)\}.$$

If neither $x, y$ is $0$ but $x = -y$, then

$$v_s(x+y) = v_s(0) = \infty \ge \min\{v_s(x), v_s(y)\}.$$

Finally, if neither $x, y$ is $0$ and $x \neq -y$, then, because $|\cdot|$ is non-Archimedean,

$$\begin{aligned}
v_s(x+y) &= -s\log|x+y| \\
&\ge -s\log(\max\{|x|, |y|\}) \\
&= \min\{-s\log|x|, -s\log|y|\} \\
&= \min\{v_s(x), v_s(y)\}.
\end{aligned}$$

Thus $v_s$ is a valuation on $K$.

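Suppose that $v$ is a valuation on $K$ and that $q > 1$. If $x, y$ are nonzero, then

$$|xy|_q = q^{-v(xy)} = q^{-v(x)-v(y)} = q^{-v(x)}q^{-v(y)} = |x|_q|y|_q.$$

Let $x, y \in K$. To show that $|x+y|_q \le |x|_q + |y|_q$, it suffices to show that $|x+y|_q \le \max\{|x|_q, |y|_q\}$; proving this will establish that $|\cdot|_q$ is an absolute value and furthermore that $|\cdot|_q$ is non-Archimedean. If $x, y$ are both $0$, then $|x+y|_q = |0|_q = 0 = \max\{0, 0\} = \max\{|x|_q, |y|_q\}$. If $x = 0$ and $y \neq 0$, then $|x+y|_q = |y|_q = q^{-v(y)} = \max\{q^{-v(y)}, 0\} = \max\{|y|_q, |x|_q\}$. If neither $x, y$ is $0$ but $x = -y$, then

$$|x+y|_q = |0|_q = 0 \le \max\{|x|_q, |y|_q\}.$$

Finally, if neither $x, y$ is $0$ and $x \neq -y$, then

$$\begin{aligned}
|x+y|_q &= q^{-v(x+y)} \\
&\le q^{-\min\{v(x), v(y)\}} \\
&= \max\{q^{-v(x)}, q^{-v(y)}\} \\
&= \max\{|x|_q, |y|_q\}.
\end{aligned}$$

Thus $|\cdot|_q$ is a non-Archimedean absolute value on $K$. ∎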

Two valuations $v_1, v_2$ on a field $K$ are said to be equivalent if there is some real $s > 0$ such that

$$v_1 = sv_2.$$

A valuation $v$ on a field $K$ is said to be discrete if there is some real $s > 0$ such that

$$v(K^*) = s\mathbb{Z}.$$

A valuation is said to be normalized if

$$v(K^*) = \mathbb{Z}.$$

4 Valuation rings

Theorem 6.

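If $K$ is a field and $v$ is a nontrivial valuation on $K$, then

$$\mathcal{O}_v = \{x \in K : v(x) \ge 0\}$$

is a maximal proper subring of $K$, and for all $x \neq 0$, $x \in \mathcal{O}_v$ or $x^{-1} \in \mathcal{O}_v$. The set

$$\{x \in K : v(x) = 0\}$$

is the group of invertible elements of $\mathcal{O}_v$, and the set

$$\mathfrak{p}_v = \{x \in K : v(x) > 0\}$$

is the unique maximal ideal of $\mathcal{O}_v$.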

Proof.

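It is immediate that $0, 1 \in \mathcal{O}_v$. For $x \in \mathcal{O}_v$, $v(-x) = v(x) \ge 0$, so $-x \in \mathcal{O}_v$. For $x, y \in \mathcal{O}_v$, $v(xy) = v(x) + v(y) \ge 0$, so $xy \in \mathcal{O}_v$. And $v(x+y) \ge \min\{v(x), v(y)\} \ge 0$, so $x+y \in \mathcal{O}_v$. Thus $\mathcal{O}_v$ is a subring of $K$. For nonzero $x \in K$, if $v(x) \ge 0$ then $x \in \mathcal{O}_v$, and if $v(x) < 0$ then $v(x^{-1}) = -v(x) > 0$, so $x^{-1} \in \mathcal{O}_v$.

Since $v$ is nontrivial, there is some $x \in K$ with $v(x) \neq 0, \infty$. If $x \in \mathcal{O}_v$ then $v(x) > 0$ and so $v(x^{-1}) = -v(x) < 0$, giving $x^{-1} \notin \mathcal{O}_v$; otherwise $x \notin \mathcal{O}_v$. Hence $\mathcal{O}_v \neq K$, showing that $\mathcal{O}_v$ is a proper subring of $K$.

To show that $\mathcal{O}_v$ is a maximal proper subring, it suffices to show that if $z \in K \setminus \mathcal{O}_v$ then $\mathcal{O}_v[z] = K$, i.e., that the smallest ring containing $\mathcal{O}_v$ and $z$ is $K$. As $z \notin \mathcal{O}_v$, $v(z) < 0$. Let $y \in K$. For any positive integer $j$ we have $v(yz^{-j}) = v(y) - jv(z)$, and because $v(z) < 0$, there is some $j = j(y)$ such that $v(yz^{-j}) > 0$. For this $j$, $yz^{-j} \in \mathcal{O}_v$. Hence $y = (yz^{-j})z^j \in \mathcal{O}_v[z]$, and so $\mathcal{O}_v[z] = K$, showing that $\mathcal{O}_v$ is a maximal proper subring.

Suppose that $x \in \mathcal{O}_v$ and $x^{-1} \in \mathcal{O}_v$. If $v(x) > 0$, then $v(x^{-1}) = -v(x) < 0$, contradicting that $x^{-1} \in \mathcal{O}_v$. Hence $v(x) = 0$. Conversely, if $v(x) = 0$, then, as $x^{-1} \in K$, $v(x^{-1}) = -v(x) = 0$, so $x^{-1} \in \mathcal{O}_v$, hence $x$ is an element of $\mathcal{O}_v$ whose inverse is in $\mathcal{O}_v$.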

Let $x, y \in \mathfrak{p}_v$. Then, since $v(x) > 0$ and $v(y) > 0$,

$$v(x-y) \ge \min\{v(x), v(-y)\} = \min\{v(x), v(y)\} > 0,$$

showing that $x-y \in \mathfrak{p}_v$, and thus that $\mathfrak{p}_v$ is an additive subgroup of $\mathcal{O}_v$. Let $x \in \mathfrak{p}_v$ and $z \in \mathcal{O}_v$. Then, since $v(z) \ge 0$ and $v(x) > 0$,

$$v(zx) = v(z) + v(x) \ge v(x) > 0,$$

showing that $zx \in \mathfrak{p}_v$. Therefore $\mathfrak{p}_v$ is an ideal in the ring $\mathcal{O}_v$. Since $v(1) = 0$, $1 \notin \mathfrak{p}_v$, so $\mathfrak{p}_v$ is a proper ideal.

The fact that $\mathfrak{p}_v$ is maximal follows from it being the set of noninvertible elements of $\mathcal{O}_v$. Suppose that $B$ is a maximal ideal of $\mathcal{O}_v$. Because $B$ is a proper ideal it contains no invertible elements, and hence is contained in $\mathfrak{p}_v$, the set of noninvertible elements of $\mathcal{O}_v$. Since $B$ is maximal, it must be that $B = \mathfrak{p}_v$. Therefore, any maximal ideal of $\mathcal{O}_v$ is $\mathfrak{p}_v$, showing that $\mathfrak{p}_v$ is the unique maximal ideal of $\mathcal{O}_v$. ∎

The above ring $\mathcal{O}_v$ is called the valuation ring. Generally, a ring that has a unique maximal ideal is called a local ring, and thus the above theorem shows that the valuation ring is a local ring. We call the quotient $\mathcal{O}_v/\mathfrak{p}_v$ the residue field of $\mathcal{O}_v$.

Lemma 7.

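If $v$ is a normalized valuation on a field $K$ then for all nonzero $x \in K$ and $t \in \mathfrak{p}_v$ with $v(t) = 1$, there is some $u \in \mathcal{O}_v^*$ such that

$$x = ut^n, \qquad n = v(x).$$

Proof.

Since $x \neq 0$, $v(x) = n \in \mathbb{Z}$. Hence $v(xt^{-n}) = v(x) - nv(t) = v(x) - n = 0$, and therefore $u = xt^{-n} \in \mathcal{O}_v^*$. Then $x = ut^n$, completing the proof. ∎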

Theorem 8.

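If $v$ is a normalized valuation on a field $K$, then $\mathcal{O}_v$ is a principal ideal domain. If $A$ is a nonzero ideal of $\mathcal{O}_v$, then there is some $t \in \mathfrak{p}_v$ with $v(t) = 1$ and some $n \ge 0$ such that

$$A = t^n\mathcal{O}_v = \{x \in K : v(x) \ge n\} = \mathfrak{p}_v^n,$$

and

$$\mathfrak{p}_v^n/\mathfrak{p}_v^{n+1} \cong \mathcal{O}_v/\mathfrak{p}_v,$$

as $\mathcal{O}_v/\mathfrak{p}_v$-linear vector spaces.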

Proof.

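Let $A \neq \{0\}$ be an ideal of $\mathcal{O}_v$. For any $y \in A$, $v(y) \ge 0$, and because $v$ takes integer values on nonzero elements we can take $x \in A$ such that

$$v(x) = \min\{v(y) : y \in A\}. \qquad (1)$$

Since $v(K^*) = \mathbb{Z}$, there is some $t \in K$ with $v(t) = 1$, and because $v(t) > 0$, $t \in \mathfrak{p}_v$. By Lemma 7, there is some $u \in \mathcal{O}_v^*$ such that $x = ut^n$, $n = v(x)$. For any $z \in \mathcal{O}_v$, $xz \in A$ and so $t^nz = x(u^{-1}z) \in A$. Thus $t^n\mathcal{O}_v \subseteq A$. On the other hand, let $y \in A$ be nonzero. Then also by Lemma 7 there is some $w \in \mathcal{O}_v^*$ such that $y = wt^m$, $m = v(y)$. By (1), $m = v(y) \ge v(x) = n$, so $v(t^{m-n}) = (m-n)v(t) = m-n \ge 0$ so $t^{m-n} \in \mathcal{O}_v$, giving

$$y = wt^m = t^n(wt^{m-n}) \in t^n\mathcal{O}_v.$$

Therefore $A \subseteq t^n\mathcal{O}_v$, and so $A = t^n\mathcal{O}_v$. That is, $A$ is the principal ideal generated by $t^n$, which shows that $\mathcal{O}_v$ is a principal ideal domain.

Let $t \in \mathfrak{p}_v$ with $v(t) = 1$, and define $\phi : \mathfrak{p}_v^n \to \mathcal{O}_v/\mathfrak{p}_v$ by $\phi(at^n) = a + \mathfrak{p}_v$, for $a \in \mathcal{O}_v$. This map is a surjective homomorphism of $\mathcal{O}_v$-modules, and $\phi(at^n) = 0$ if and only if $a \in \mathfrak{p}_v$, i.e. if and only if $at^n \in \mathfrak{p}_v^{n+1}$, so it induces the stated isomorphism. ∎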

Lemma 9.

If $v_1, v_2$ are discrete valuations on a field $K$ such that $\mathcal{O}_{v_1} = \mathcal{O}_{v_2}$, then $v_1$ and $v_2$ are equivalent.

5 p-adic valuations

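Fix a prime number $p$. For nonzero $a \in \mathbb{Q}$, there are unique integers $n, r, s$ satisfying

$$a = \frac{r}{s}p^n,$$

where $r, s$ are coprime, $s > 0$, and $p \nmid rs$. We define $v_p(a) = n$. Furthermore, we define $v_p(0) = \infty$.

Theorem 10.

$v_p : \mathbb{Q} \to \mathbb{Z} \cup \{\infty\}$ is a normalized valuation.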

Proof.

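For nonzero $a, b \in \mathbb{Q}$, write

$$a = \frac{r_1}{s_1}p^m, \qquad b = \frac{r_2}{s_2}p^n,$$

where $\gcd(r_1, s_1) = \gcd(r_2, s_2) = 1$, $s_1, s_2 > 0$, and $p \nmid r_1s_1$, $p \nmid r_2s_2$. Then,

$$ab = \frac{r_1r_2}{s_1s_2}p^{m+n},$$

where $p \nmid r_1s_1r_2s_2$; the fraction $\frac{r_1r_2}{s_1s_2}$ need not be in lowest terms. So $v_p(ab) = m+n = v_p(a) + v_p(b)$.

Suppose that $v_p(a) \le v_p(b)$, i.e. $m \le n$. Then

$$a+b = \frac{r_1}{s_1}p^m + \frac{r_2}{s_2}p^n = \left(\frac{r_1}{s_1} + \frac{r_2}{s_2}p^{n-m}\right)p^m = \frac{r_1s_2 + r_2s_1p^{n-m}}{s_1s_2}p^m.$$

Since $p \nmid s_1$ and $p \nmid s_2$, the denominator $s_1s_2$ is not divisible by $p$, and so

$$v_p(a+b) \ge m = v_p(a) = \min\{v_p(a), v_p(b)\}.$$

Finally, $v_p(p^n) = n$ for every $n \in \mathbb{Z}$, so $v_p(\mathbb{Q}^*) = \mathbb{Z}$ and $v_p$ is normalized. ∎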

We call $v_p$ the p-adic valuation. The valuation ring of $\mathbb{Q}$ corresponding to $v_p$ is

$$\mathcal{O}_p = \{x \in \mathbb{Q} : v_p(x) \ge 0\},$$

in other words, those rational numbers such that in lowest terms, $p$ does not divide their denominator. For example, $\frac{111}{69}, -\frac{9}{35} \in \mathcal{O}_3$, and $\frac{5}{3} \notin \mathcal{O}_3$. By Theorem 6, the group of units of the valuation ring $\mathcal{O}_p$ is

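$$\mathcal{O}_p^* = \{x \in \mathbb{Q} : v_p(x) = 0\},$$

in other words, those rational numbers such that in lowest terms, $p$ divides neither their numerator nor their denominator. As well by Theorem 6, $\mathcal{O}_p$ is a local ring whose unique maximal ideal is

$$\mathfrak{p}_p = \{x \in \mathbb{Q} : v_p(x) > 0\},$$

in other words, those rational numbers such that in lowest terms, $p$ divides their numerator and does not divide their denominator. We see that $p \in \mathfrak{p}_p$ and $v_p(p) = 1$, so by Theorem 8 the nonzero ideals of $\mathcal{O}_p$ are of the form

$$p^n\mathcal{O}_p, \qquad n \ge 0.$$

Lemma 11.

$\mathcal{O}_p/\mathfrak{p}_p \cong \mathbb{Z}/p\mathbb{Z}$.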

6 p-adic absolute values and metrics

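We define $|\cdot|_p : \mathbb{Q} \to \mathbb{R}_{\ge 0}$ by $|a|_p = p^{-v_p(a)}$ for $a \neq 0$ and $|0|_p = 0$. By Theorem 5 this is a non-Archimedean absolute value on $\mathbb{Q}$, which we call the p-adic absolute value.

Example 12.

For $p = 3$ and $a = -\frac{57}{10}$, we have $n = 1$, $r = -19$, $s = 10$. Thus $\left|-\frac{57}{10}\right|_3 = 3^{-1}$.

For $p = 5$ and $a = \frac{28}{75}$, we have $n = -2$, $r = 28$, $s = 3$. Thus $\left|\frac{28}{75}\right|_5 = 5^2$.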
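The definitions of $v_p$ and $|\cdot|_p$ on the rationals translate directly into code. The following is a minimal sketch using Python's standard `fractions.Fraction`; the function names `vp` and `abs_p` are hypothetical names chosen for this illustration, and `float("inf")` stands in for the value $\infty$ assigned to $0$.

```python
from fractions import Fraction


def vp(a, p):
    """p-adic valuation of a rational a; float('inf') stands for v_p(0)."""
    a = Fraction(a)
    if a == 0:
        return float("inf")
    n = 0
    num, den = a.numerator, a.denominator  # Fraction keeps a in lowest terms
    while num % p == 0:  # powers of p in the numerator raise the valuation
        num //= p
        n += 1
    while den % p == 0:  # powers of p in the denominator lower it
        den //= p
        n -= 1
    return n


def abs_p(a, p):
    """p-adic absolute value |a|_p = p^(-v_p(a)) as an exact Fraction."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    return Fraction(p) ** (-vp(a, p))


# The two computations of Example 12.
print(vp(Fraction(-57, 10), 3), abs_p(Fraction(-57, 10), 3))  # prints 1 1/3
print(vp(Fraction(28, 75), 5), abs_p(Fraction(28, 75), 5))    # prints -2 25
```

Returning exact fractions rather than floats keeps comparisons such as $|x+y|_p \le \max\{|x|_p, |y|_p\}$ free of rounding error.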

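We define $d_p(x,y) = |x-y|_p$. The sequences $x_l = a_0 + a_1p + a_2p^2 + \cdots + a_{l-1}p^{l-1}$ constructed when applying Hensel’s lemma satisfy, for $m < n$,

$$x_n - x_m = a_mp^m + a_{m+1}p^{m+1} + \cdots + a_{n-1}p^{n-1} \equiv 0 \pmod{p^m},$$

so

$$|x_n - x_m|_p \le p^{-m},$$

and

$$f(x_n) \equiv 0 \pmod{p^n},$$

so

$$|f(x_n)|_p \le p^{-n}.$$

Thus, $x_n$ is a Cauchy sequence in the p-adic metric $d_p(x,y) = |x-y|_p$, and $f(x_n) \to 0$ as $n \to \infty$.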
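These two estimates can be checked numerically on the digits found in Example 1. The sketch below is illustrative only: the helper names `vp_int` and `check_hensel_sequence` are hypothetical, and the inequalities are tested in the equivalent form $v_p(x_n - x_m) \ge m$ and $v_p(f(x_n)) \ge n$.

```python
def vp_int(n, p):
    """p-adic valuation of a nonzero integer n (n = 0 must be handled by the caller)."""
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count


def check_hensel_sequence(digits, p, f):
    """Check |x_n - x_m|_p <= p^-m (m < n) and |f(x_n)|_p <= p^-n on the partial sums."""
    # partial[l - 1] holds x_l = a_0 + a_1 p + ... + a_{l-1} p^{l-1}.
    partial = []
    total = 0
    for j, a in enumerate(digits):
        total += a * p**j
        partial.append(total)
    for n in range(1, len(partial) + 1):
        x_n = partial[n - 1]
        # |f(x_n)|_p <= p^-n means v_p(f(x_n)) >= n.
        if f(x_n) != 0 and vp_int(f(x_n), p) < n:
            return False
        for m in range(1, n):
            diff = x_n - partial[m - 1]
            # |x_n - x_m|_p <= p^-m means v_p(x_n - x_m) >= m.
            if diff != 0 and vp_int(diff, p) < m:
                return False
    return True


# Digits a_0, ..., a_4 from Example 1 (p = 3, f(x) = x^2 - 7).
print(check_hensel_sequence([1, 1, 1, 0, 2], 3, lambda x: x * x - 7))  # prints True
```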

Lemma 13.

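If $x_n$ and $y_n$ are Cauchy sequences in $(\mathbb{Q}, d_p)$, then $x_n + y_n$ and $x_n \cdot y_n$ are Cauchy sequences in $(\mathbb{Q}, d_p)$.

Proof.

The claim follows from

$$|x_n + y_n - (x_m + y_m)|_p \le |x_n - x_m|_p + |y_n - y_m|_p$$

and

$$\begin{aligned}
|x_ny_n - x_my_m|_p &= |x_ny_n - x_my_n + x_my_n - x_my_m|_p \\
&\le |x_n - x_m|_p|y_n|_p + |x_m|_p|y_n - y_m|_p,
\end{aligned}$$

and the fact that $x_n, y_n$ being Cauchy implies that $|x_n|_p, |y_n|_p$ are bounded. ∎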

7 Completions of metric spaces

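If $(X,d)$ is a metric space, a completion of $X$ is a complete metric space $(Y,\rho)$ and an isometry $i : X \to Y$ such that for every complete metric space $(Z,r)$ and isometry $j : X \to Z$, there is a unique isometry $J : Y \to Z$ such that $J \circ i = j$. It is a fact that any metric space has a completion, and that if $(Y_1,\rho_1)$ and $(Y_2,\rho_2)$ are completions then there is a unique isometric isomorphism $f : Y_1 \to Y_2$ compatible with the isometries from $X$.

For $p$ prime, let $(\mathbb{Q}_p, d_p)$ be the completion of $(\mathbb{Q}, d_p)$. Elements of $\mathbb{Q}_p$ are called p-adic numbers. For $x, y \in \mathbb{Q}_p$, there are Cauchy sequences $x_n, y_n$ in $(\mathbb{Q}, d_p)$ such that $x_n \to x$ and $y_n \to y$ in $(\mathbb{Q}_p, d_p)$. We define addition and multiplication on the set $\mathbb{Q}_p$ by

$$x + y = \lim(x_n + y_n), \qquad x \cdot y = \lim(x_n \cdot y_n);$$

that these limits exist follows from Lemma 13. If $x \in \mathbb{Q}_p$, $x \neq 0$, then there is a sequence $x_n \in \mathbb{Q}$, each term of which is $\neq 0$, such that $x_n \to x$ in $(\mathbb{Q}_p, d_p)$. Then $x_n^{-1}$ is a Cauchy sequence in $(\mathbb{Q}, d_p)$ hence converges to some $y \in \mathbb{Q}_p$ which satisfies $x \cdot y = 1$. Therefore $\mathbb{Q}_p$ is a field.

We define $v_p : \mathbb{Q}_p \to \mathbb{Z} \cup \{\infty\}$ by

$$v_p(x) = \lim v_p(x_n), \qquad x_n \in \mathbb{Q}, \quad x_n \to x.$$

One proves that $v_p$ is a normalized valuation on the field $\mathbb{Q}_p$ (cf. Paul Garrett, Classical definitions of Zp and A, http://www.math.umn.edu/~garrett/m/mfms/notes/05_compare_classical.pdf). We then define $|\cdot|_p : \mathbb{Q}_p \to \mathbb{R}_{\ge 0}$ by $|x|_p = p^{-v_p(x)}$ for $x \neq 0$ and $|0|_p = 0$.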

8 The exponential function

Lemma 14.

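For $a_1, \ldots, a_r \in \mathbb{Q}_p$,

$$|a_1 + \cdots + a_r|_p \le \max\{|a_1|_p, \ldots, |a_r|_p\}.$$

Lemma 15.

A sequence $a_i \in \mathbb{Q}_p$ is Cauchy if and only if $a_{i+1} - a_i \to 0$ as $i \to \infty$.

Proof.

Assume that $a_{i+1} - a_i \to 0$ and let $\epsilon > 0$. Then there is some $i_0$ such that $i \ge i_0$ implies $|a_{i+1} - a_i|_p < \epsilon$. For $i_0 \le i < j$, by Lemma 14,

$$\begin{aligned}
|a_j - a_i|_p &= |a_j - a_{j-1} + a_{j-1} - \cdots - a_{i+1} + a_{i+1} - a_i|_p \\
&= |(a_j - a_{j-1}) + \cdots + (a_{i+1} - a_i)|_p \\
&\le \max\{|a_j - a_{j-1}|_p, \ldots, |a_{i+1} - a_i|_p\} \\
&< \epsilon.
\end{aligned}$$

Hence $a_i$ is Cauchy. The converse is immediate from the definition of a Cauchy sequence. ∎

The above shows that if $a_i \to 0$ in $(\mathbb{Q}_p, d_p)$ then the series $\sum a_i$ converges in $(\mathbb{Q}_p, d_p)$: its partial sums form a Cauchy sequence, and $\mathbb{Q}_p$ is complete.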

Lemma 16 (Exponential power series).

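If $v_p(x) > \frac{1}{p-1}$, then

$$\sum_{n=0}^\infty \frac{x^n}{n!}$$

converges in $(\mathbb{Q}_p, d_p)$.

Proof.

$$v_p(n!) = \sum_{j=1}^\infty \left\lfloor \frac{n}{p^j} \right\rfloor \le \sum_{j=1}^\infty \frac{n}{p^j} = \frac{n}{p} \cdot \frac{1}{1 - \frac{1}{p}} = \frac{n}{p-1}.$$

Then

$$v_p\left(\frac{x^n}{n!}\right) = nv_p(x) - v_p(n!) \ge nv_p(x) - \frac{n}{p-1} = n\left(v_p(x) - \frac{1}{p-1}\right).$$

As $n \to \infty$ this tends to $+\infty$, hence

$$\left|\frac{x^n}{n!}\right|_p = p^{-v_p\left(\frac{x^n}{n!}\right)} \to 0,$$

and thus the series $\sum_{n=0}^\infty \frac{x^n}{n!}$ converges. ∎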
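The formula for $v_p(n!)$ used in this proof, and the resulting bound $v_p(n!) \le \frac{n}{p-1}$, are easy to check by machine. The following is a small sketch with hypothetical helper names `vp_factorial` and `vp_int`; it compares the floor-sum formula against a direct factorization of `math.factorial(n)`.

```python
from math import factorial


def vp_factorial(n, p):
    """v_p(n!) computed as the sum of floor(n / p^j) over j >= 1."""
    total = 0
    power = p
    while power <= n:  # terms with p^j > n contribute 0
        total += n // power
        power *= p
    return total


def vp_int(m, p):
    """p-adic valuation of a nonzero integer m, by repeated division."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


for p in (2, 3, 5, 7):
    for n in range(1, 60):
        v = vp_factorial(n, p)
        assert v == vp_int(factorial(n), p)  # the two computations agree
        assert v * (p - 1) <= n              # v_p(n!) <= n / (p - 1)

print(vp_factorial(10, 3))  # prints 4, since 10! = 2^8 * 3^4 * 5^2 * 7
```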

Lemma 17 (Logarithm power series).

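If $v_p(x) > 0$, then

$$\sum_{n=1}^\infty (-1)^{n+1}\frac{x^n}{n}$$

converges in $(\mathbb{Q}_p, d_p)$.

Proof.

For $n$ a positive integer we have $v_p(n) \le \log_p n$. Then,

$$v_p\left(\frac{x^n}{n}\right) = nv_p(x) - v_p(n) \ge nv_p(x) - \log_p n.$$

If $v_p(x) > 0$ then this tends to $+\infty$ as $n \to \infty$. ∎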

9 Topology

We define $\mathbb{Z}_p$ to be the valuation ring of $\mathbb{Q}_p$. Elements of $\mathbb{Z}_p$ are called p-adic integers. For $x \in \mathbb{Q}_p$ and real $r > 0$, write

$$\overline{B}_p(x,r) = \{y \in \mathbb{Q}_p : |x-y|_p \le r\} = \{y \in \mathbb{Q}_p : v_p(x-y) \ge -\log_p r\}.$$

In particular,

$$\overline{B}_p(0,1) = \mathbb{Z}_p.$$

Because $v_p$ is discrete, there is some $\epsilon > 0$ such that

$$\{y \in \mathbb{Q}_p : |x-y|_p \le r\} = \{y \in \mathbb{Q}_p : |x-y|_p < r + \epsilon\}.$$

This shows that $\overline{B}_p(x,r)$ is open in the topology induced by $v_p$, and thus is both closed and open. It follows that $\mathbb{Q}_p$ is totally disconnected (Gerald B. Folland, A Course in Abstract Harmonic Analysis, pp. 34–36).

Theorem 18.

$\mathbb{Z}_p$ is totally bounded.

The fact that $\mathbb{Z}_p$ is a closed, totally bounded subset of a complete metric space implies that $\mathbb{Z}_p$ is compact. Then because

$$\overline{B}_p(0,p^k) = \{y \in \mathbb{Q}_p : |y|_p \le p^k\} = \{y \in \mathbb{Q}_p : |p^ky|_p \le 1\} = p^{-k}\mathbb{Z}_p$$

and both multiplication by a nonzero element and translation are homeomorphisms of $\mathbb{Q}_p$, any closed ball in $\mathbb{Q}_p$ is compact. Therefore $\mathbb{Q}_p$ is locally compact.

$\mathbb{Q}_p$ is a locally compact abelian group under addition, and we take Haar measure on it satisfying $\mu(\mathbb{Z}_p) = 1$. One can explicitly calculate the characters on $\mathbb{Q}_p$ (Gerald B. Folland, A Course in Abstract Harmonic Analysis, pp. 91–93, 104; cf. Keith Conrad, The character group of Q, http://www.math.uconn.edu/~kconrad/blurbs/gradnumthy/characterQ.pdf).