This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Solving pp-adic polynomial equations using Jarratt’s Method

Stephan Baier, Swarup Kumar Das and Saayan Mukherjee Stephan Baier, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected] Swarup Kumar Das, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected] Saayan Mukherjee, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected]
Abstract.

We implement an iterative numerical method to solve polynomial equations f(x)=0f(x)=0 in the pp-adic numbers, where f(x)p[x]f(x)\in\mathbb{Z}_{p}[x]. This method is a simplified pp-adic analogue of Jarratt’s method for finding roots of functions over the real numbers. We establish that our method has a higher order of convergence than J.F.T. Rabago’s [9] pp-adic version of Olver’s method from 2016. Moreover, we weaken the initial conditions in Rabago’s method, which allows us to start the iteration with a multiple root of the congruence f(x)0modpf(x)\equiv 0\bmod{p}.

Key words and phrases:
Jarratt’s method, p-adic polynomials, Thurston’s Method, Hensel’s lemma
2020 Mathematics Subject Classification:
11D88,11C08,65H04

1. Introduction

Solving polynomial equations is one of the oldest problems in mathematics which dates back to the Babylonians. There are several analytic, algebraic and geometric approaches for finding roots of polynomials. In this paper we discuss an iterative method to approximate solutions of polynomial equations of the form f(x)=0f(x)=0 in the pp-adic numbers, where fp[x]f\in\mathbb{Z}_{p}[x].

Well-known numerical methods to approximate roots of functions on the real numbers are Newton’s Method, Olver’s Method [8], Halley’s method [3] and others. In this paper, we look at a method developed by P. Jarratt in 1966 [5]. This method has a higher order of convergence than all of the afore-mentioned methods. We set up a pp-adic analogue of Jarratt’s method. Analogues of Olver’s and Halley’s method in the pp-adic setting have previously been established by J.F.T. Rabago [9]. Apart from improving the order of convergence, we also weaken the initial conditions in Rabago’s work. Below is a definition of the term “order of convergence” for general Banach spaces, including the complete valued fields \mathbb{R} and p\mathbb{Q}_{p}.

Definition 1.1.

Let VV be a Banach space with norm |.||.|. Assume that (xn)(x_{n}) is a sequence in VV converging to γV\gamma\in V. Then we say that (xn)(x_{n}) converges with order qq if

Rn=|xn+1γ||xnγ|qR_{n}=\frac{\left\lvert x_{n+1}-\gamma\right\rvert}{\left\lvert x_{n}-\gamma\right\rvert^{q}}

is a bounded sequence in \mathbb{R}.

Before turning to Jarret’s method, let us briefly review Newton’s and Olver’s methods in the context of pp-adic numbers. Suppose pp is a prime and let f(x)p[x]f(x)\in\mathbb{Z}_{p}[x] be a polynomial. Throughout the sequel, we assume the leading coefficient of ff to be a unit in p\mathbb{Z}_{p}. Then all solutions of ff in p\mathbb{Q}_{p} lie in p\mathbb{Z}_{p}. Assume that x1px_{1}\in\mathbb{Z}_{p} such that f(x1)0modpf(x_{1})\equiv 0\bmod{p} but f(x1)0modpf^{\prime}(x_{1})\not\equiv 0\bmod{p}. Newton’s method uses the iteration

(1.1) xn+1=xnf(xn)f(xn)for n1,x_{n+1}=x_{n}-\frac{f(x_{n})}{f^{\prime}(x_{n})}\quad\mbox{for }n\geq 1,

which yields a sequence in p\mathbb{Z}_{p} which converges to a simple root of ff in p\mathbb{Z}_{p}. In the case when f[x]f\in\mathbb{Z}[x], this provides a convenient proof of Hensel’s lemma which, for all mm\in\mathbb{N}, asserts that a solution x1x_{1}\in\mathbb{Z} of the congruence f(x1)0modpf(x_{1})\equiv 0\bmod{p} satisfying f(x1)0modpf^{\prime}(x_{1})\not\equiv 0\bmod{p} can be uniquely lifted to a root xx of the congruence f(x)0modpmf(x)\equiv 0\bmod{p^{m}}. In fact, one may replace the condition f(x1)0modpf^{\prime}(x_{1})\not\equiv 0\bmod{p} by the weaker condition

(1.2) |f(x1)|p<|f(x1)|p2|f(x_{1})|_{p}<|f^{\prime}(x_{1})|_{p}^{2}

which leads to a generalized version of Hensel’s lemma (see [6]). The order of convergence of Newton’s method is 2.

Olver’s method, an iteration with higher order of convergence, was carried over to pp-adic numbers by J.F.T. Rabago in 2016 [9]. He proved that if f(x)p[x]f(x)\in\mathbb{Z}_{p}[x] and if x1x_{1}\in\mathbb{Z} satisfies f(x1)0modpf(x_{1})\equiv 0\bmod{p} and f(x1)0modpf^{\prime}(x_{1})\not\equiv 0\bmod{p}, then the sequence (xn)(x_{n}) defined iteratively as

(1.3) xn+1=xnf(xn)f(xn)12f(xn)2f′′(xn)f(xn)3x_{n+1}=x_{n}-\frac{f(x_{n})}{f^{\prime}(x_{n})}-\frac{1}{2}\cdot\frac{f(x_{n})^{2}f^{\prime\prime}(x_{n})}{f^{\prime}(x_{n})^{3}}

converges to a simple root of ff in p\mathbb{Z}_{p}. The order of convergence of this method is 3 for p3p\geq 3 and 4 for p=2p=2.

Our goal is to implement a pp-adic analogue of a simplified version of Jarratt’s method which has order of convergence 4. We will also weaken the initial condition f(x1)0modpf^{\prime}(x_{1})\not\equiv 0\bmod{p}, replacing it by the condition (1.2) above. This allows us to begin with an x1x_{1} which possibly is a multiple root of the congruence f(x)0modpf(x)\equiv 0\bmod{p}. At the end of this article we will briefly discuss Thurston’s method which is useful when the initial condition (1.2) is not satisfied. In a nutshell, Thurston’s method either tells us after finitely many steps that a Hensel lifting of a given root x1x_{1} of the congruence f(x)0modpf(x)\equiv 0\bmod{p} to a solution of f(x)=0f(x)=0 in p\mathbb{Z}_{p} does not exist, or it produces a new polynomial FF in place of ff which meets the above-mentioned initial conditions. In this case we can perform Jarratt’s method for FF, which in turn gives us an approximation to a root of ff in p\mathbb{Z}_{p}.

2. Simplified Jarratt Method for functions on \mathbb{R}

Jarratt’s original method is an iterative numerical method for solving equations f(x)=0f(x)=0, where ff is a real-valued function defined on intervals in \mathbb{R}. The iteration is defined as

(2.1) xn+1=xn12f(xn)f(xn)+f(xn)f(xn)3f(xn23f(xn)f(xn)),n.x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{f(x_{n})}{f^{\prime}(x_{n})-3f^{\prime}\left(x_{n}-\frac{2}{3}\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)},\hskip 28.45274pt\forall n\in\mathbb{N}.

Precise initial conditions on x1x_{1} for a generalized version of this method on Banach spaces were worked out in [1]. The order of convergence of this method is 4, as established in [5].

Instead of using the original method, we here use a simplified version. The idea is as follows. Using a Taylor approximation (if existent), we find

f(xn23f(xn)f(xn))=f(xn)23f(xn)f(xn)f′′(xn)+29(f(xn)f(xn))2f′′′(xn)+Error.f^{\prime}\left(x_{n}-\frac{2}{3}\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)=f^{\prime}(x_{n})-\frac{2}{3}\cdot\frac{f(x_{n})}{f^{\prime}(x_{n})}\cdot f^{\prime\prime}(x_{n})+\frac{2}{9}\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})+\mbox{Error}.

Plugging the right-hand side above into (2.1) and neglecting the error, we arrive at the simpler iteration

(2.2) xn+1=xn12f(xn)f(xn)+3f(xn)f(xn)26f(xn)3+6f(xn)f(xn)f′′(xn)2f(xn)2f′′′(xn),x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})f^{\prime}(x_{n})^{2}}{-6f^{\prime}(x_{n})^{3}+6f(x_{n})f^{\prime}(x_{n})f^{\prime\prime}(x_{n})-2f(x_{n})^{2}f^{\prime\prime\prime}(x_{n})},

We will refer to this iteration as Simplified Jarratt Method (SJM) in the rest of this article. Below we will prove that SJM has order of convergence 4 over the real numbers. If p>3p>3, this result carries over to p\mathbb{Q}_{p}, as we will see later.

Theorem 2.1.

Assume II is an open interval in \mathbb{R} and f:If:I\rightarrow\mathbb{R} is a five times differentiable function. Assume that γI\gamma\in I satisfies f(γ)=0f(\gamma)=0 and f(γ)0f^{\prime}(\gamma)\not=0. Then there exists δ>0\delta>0 such that if x1(γδ,γ+δ)x_{1}\in(\gamma-\delta,\gamma+\delta), the sequence (xn)(x_{n}) defined by (2.2) converges to γ\gamma with order 44. Moreover,

(2.3) limn|xn+1γ||xnγ|4=|18f′′′(γ)f′′(γ)f(γ)18f′′(γ)35f(4)(γ)f(γ)248f(γ)3|.\lim_{n\rightarrow\infty}\frac{\left\lvert x_{n+1}-\gamma\right\rvert}{\left\lvert x_{n}-\gamma\right\rvert^{4}}=\left|\frac{18f^{\prime\prime\prime}(\gamma)f^{\prime\prime}(\gamma)f^{\prime}(\gamma)-18f^{\prime\prime}(\gamma)^{3}-5f^{(4)}(\gamma)f^{\prime}(\gamma)^{2}}{48f^{\prime}(\gamma)^{3}}\right|.
Proof.

Let nn\in\mathbb{N}. Taylor’s theorem provides us with the approximations

(2.4) f(xn)=f(γ)(en+c2en2+c3en3+c4en4+O(en5)),f(xn)=f(γ)(1+2c2en+3c3en2+4c4en3+O(en4)),f′′(xn)=f(γ)(2c2+6c3en+12c4en2+O(en3)),f′′′(xn)=f(γ)(6c3+24c4en+O(en2)),\begin{split}f(x_{n})=&f^{\prime}(\gamma)\left(e_{n}+c_{2}e_{n}^{2}+c_{3}e_{n}^{3}+c_{4}e_{n}^{4}+O(e_{n}^{5})\right),\\ f^{\prime}(x_{n})=&f^{\prime}(\gamma)\left(1+2c_{2}e_{n}+3c_{3}e_{n}^{2}+4c_{4}e_{n}^{3}+O(e_{n}^{4})\right),\\ f^{\prime\prime}(x_{n})=&f^{\prime}(\gamma)\left(2c_{2}+6c_{3}e_{n}+12c_{4}e_{n}^{2}+O(e_{n}^{3})\right),\\ f^{\prime\prime\prime}(x_{n})=&f^{\prime}(\gamma)\left(6c_{3}+24c_{4}e_{n}+O(e_{n}^{2})\right),\end{split}

where

en:=xnγandck:=1k!f(k)(γ)f(γ).e_{n}:=x_{n}-\gamma\quad\mbox{and}\quad c_{k}:=\frac{1}{k!}\frac{f^{(k)}(\gamma)}{f^{\prime}(\gamma)}.

Dividing the first two equations in (2.4), we get

(2.5) f(xn)f(xn)=enc2en2+2(c22c3)en3+(7c2c34c223c4)en4+O(en5)\frac{f(x_{n})}{f^{\prime}(x_{n})}=e_{n}-c_{2}{e_{n}^{2}}+2(c_{2}^{2}-c_{3})e_{n}^{3}+(7c_{2}c_{3}-4c_{2}^{2}-3c_{4})e_{n}^{4}+O(e_{n}^{5})

after a short calculation. We further write (2.2) in the form

(2.6) xn+1=xn12f(xn)f(xn)+3f(xn)6f(xn)+6f(xn)f(xn)f′′(xn)2(f(xn)f(xn))2f′′′(xn).x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})}{-6f^{\prime}(x_{n})+6\cdot\frac{f(x_{n})}{f^{\prime}(x_{n})}\cdot f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})}.

Combining (2.4) and (2.5), we approximate the denominator in the last term on the right-hand side of (2.6) by

(2.7) 6f(xn)+6(f(xn)f(xn))f′′(xn)2(f(xn)f(xn))2f′′′(xn)=f(γ)((24c3248c23+96c3c2212c4c2)en4+(24c2336c3c2)en3+(6c312c22)en26)+O(en5)).\begin{split}&-6f^{\prime}(x_{n})+6\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})\\ =&f^{\prime}(\gamma)\left((-24c_{3}^{2}-48c_{2}^{3}+96c_{3}c_{2}^{2}-12c_{4}c_{2})e_{n}^{4}+(24c_{2}^{3}-36c_{3}c_{2})e_{n}^{3}+\right.\\ &\qquad\left.(6c_{3}-12c_{2}^{2})e_{n}^{2}-6)+O(e_{n}^{5})\right).\end{split}

From the first line in (2.4) and (2.7), we deduce that

(2.8) 3f(xn)6f(xn)+6(f(xn)f(xn))f′′(xn)2(f(xn)f(xn))2f′′′(xn)=en2c22en2+(c22c3)en3+c4+2c235c3c22en4+O(en5),\begin{split}&\frac{3f(x_{n})}{-6f^{\prime}(x_{n})+6\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})}\\ &=-\frac{e_{n}}{2}-\frac{c_{2}}{2}e_{n}^{2}+(c_{2}^{2}-c_{3})e_{n}^{3}+\frac{c_{4}+2c_{2}^{3}-5c_{3}c_{2}}{2}\cdot e_{n}^{4}+O(e_{n}^{5}),\end{split}

which may be calculated using a suitable computer algebra system. Now plugging (2.5) and (2.8) into (2.6) and simplifying, we arrive at

xn+1=γ+en12(enc2en2+2(c22c3)en3+(7c3c24c233c4)en4)+(en2c22en2+(c22c3)en3+c4+2c235c3c22en4+O(en5))=γ+9c3c26c235c42en4+O(en5),\begin{split}x_{n+1}&=\gamma+e_{n}-\frac{1}{2}(e_{n}-c_{2}e_{n}^{2}+2(c_{2}^{2}-c_{3})e_{n}^{3}+(7c_{3}c_{2}-4c_{2}^{3}-3c_{4})e_{n}^{4})+\\ &\quad\left(-\frac{e_{n}}{2}-\frac{c_{2}}{2}e_{n}^{2}+(c_{2}^{2}-c_{3})e_{n}^{3}+\frac{c_{4}+2c_{2}^{3}-5c_{3}c_{2}}{2}\cdot e_{n}^{4}+O(e_{n}^{5})\right)\\ &=\gamma+\frac{9c_{3}c_{2}-6c_{2}^{3}-5c_{4}}{2}\cdot e_{n}^{4}+O(e_{n}^{5}),\end{split}

and hence

(2.9) en+1=ρen4+O(en5)e_{n+1}=\rho e_{n}^{4}+O(e_{n}^{5})

with

(2.10) ρ:=9c3c26c235c42.\rho:=\frac{9c_{3}c_{2}-6c_{2}^{3}-5c_{4}}{2}.

Therefore, if x1x_{1} is sufficiently close to γ\gamma, then the sequence (xn)(x_{n}) converges to γ\gamma with order 4, and we have

limn|en+1||en|4=|ρ|,\lim_{n\rightarrow\infty}\frac{\left\lvert e_{n+1}\right\rvert}{\left\lvert e_{n}\right\rvert^{4}}=|\rho|,

which implies (2.3). This completes the proof. ∎

3. pp-adic analogue of SJM

Now we apply SJM to polynomials in p[x]\mathbb{Z}_{p}[x] subject to certain initial conditions which we work out in detail. We shall prove the following theorem, which is our main result.

Theorem 3.1.

Let p>3p>3 be a prime and f(x)p[x]f(x)\in\mathbb{Z}_{p}[x] be a polynomial with leading coefficent a unit in p\mathbb{Z}_{p}. Suppose that x1px_{1}\in\mathbb{Z}_{p} satisfies the condition

(3.1) |f(x1)|p<|f(x1)|p2.|f(x_{1})|_{p}<|f^{\prime}(x_{1})|_{p}^{2}.

(In particular, f(x1)0f^{\prime}(x_{1})\not=0 and |f(x1)|p<1|f(x_{1})|_{p}<1 and thus f(x1)0modpf(x_{1})\equiv 0\bmod{p}.) Then the sequence (xn)(x_{n}) defined iteratively as

xn+1=xn12f(xn)f(xn)+3f(xn)f(xn)26f(xn)3+6f(xn)f(xn)f′′(xn)2f(xn)2f′′′(xn)x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})f^{\prime}(x_{n})^{2}}{-6f^{\prime}(x_{n})^{3}+6f(x_{n})f^{\prime}(x_{n})f^{\prime\prime}(x_{n})-2f(x_{n})^{2}f^{\prime\prime\prime}(x_{n})}

converges to a simple root γp\gamma\in\mathbb{Z}_{p} of ff with order of convergence 4.

As noted in the introduction, in [9], a pp-adic analogue of Olver’s method was worked out. This method converges with order 3 and was formulated in [9] under the stronger initial condition that |f(x1)|p<1|f(x_{1})|_{p}<1 and |f(x1)|p=1|f^{\prime}(x_{1})|_{p}=1, which means that x1x_{1} is a simple root of the congruence f(x)0modpf(x)\equiv 0\bmod{p}. In contrast, our condition (3.1) allows us to take x1x_{1} as a multiple root of the said congruence. We point out that (3.1) matches the condition in the following theorem which is a version of the Generalized Hensel Lemma (see [6]).

Theorem 3.2 (Generalised Hensel Lemma).

Let f(x)p[x]f(x)\in\mathbb{Z}_{p}[x] and apa\in\mathbb{Z}_{p} satisfying |f(a)|p<|f(a)|p2|f(a)|_{p}<|f^{\prime}(a)|_{p}^{2}. Then there is a unique γp\gamma\in\mathbb{Z}_{p} such that f(γ)=0f(\gamma)=0 and |γa|p<|f(a)|p|\gamma-a|_{p}<|f^{\prime}(a)|_{p}. Moreover, |f(γ)|p=|f(a)|p|f^{\prime}(\gamma)|_{p}=|f^{\prime}(a)|_{p}.

This shows that a root aa of the congruence f(x)0modpf(x)\equiv 0\bmod{p} can be lifted uniquely to a simple root γ\gamma of ff in p\mathbb{Z}_{p}, provided that |f(a)|p<|f(a)|p2|f(a)|_{p}<|f^{\prime}(a)|_{p}^{2}. Hence, if x1=ax_{1}=a satisfies this condition, then the SJM iteration gives a sequence which converges to precisely this unique lifting γ\gamma of aa.

Theorem 3.2 can be proved using Newton’s method (see [2]). In our proof of Theorem 3.1, we replace Newton’s method by SJM.

4. Preliminaries

For the proof of Theorem 3.1, we will use the lemma below.

Lemma 4.1.

Let p>3p>3 be a prime, f(x)p[x]f(x)\in\mathbb{Z}_{p}[x] and ApA\in\mathbb{Z}_{p} such that f(A)0f^{\prime}(A)\not=0. Suppose that

(4.1) |f(A)f(A)2|p<1.\left|\frac{f(A)}{f^{\prime}(A)^{2}}\right|_{p}<1.

Define

B:=12f(A)f(A)+3f(A)f(A)3M,B:=-\frac{1}{2}\frac{f(A)}{f^{\prime}(A)}+\frac{3f(A)}{f^{\prime}(A)-3M},

where

M:=f(A)23f(A)f(A)f′′(A)+29(f(A)f(A))2f′′′(A).M:=f^{\prime}(A)-\frac{2}{3}\cdot\frac{f(A)}{f^{\prime}(A)}\cdot f^{\prime\prime}(A)+\frac{2}{9}\left(\frac{f(A)}{f^{\prime}(A)}\right)^{2}f^{\prime\prime\prime}(A).

Then we have

|B|p|f(A)f(A)|p.|B|_{p}\leq\left|\frac{f(A)}{f^{\prime}(A)}\right|_{p}.
Proof.

We write

(4.2) B=f(A)f(A)(12+313M/f(A))B=\frac{f(A)}{f^{\prime}(A)}\left(-\frac{1}{2}+\frac{3}{1-3M/f^{\prime}(A)}\right)

and

13Mf(A)=2+2f(A)f(A)2(f′′(A)13f(A)f′′′(A)f(A)).1-\frac{3M}{f^{\prime}(A)}=-2+2\cdot\frac{f(A)}{f^{\prime}(A)^{2}}\left(f^{\prime\prime}(A)-\frac{1}{3}\cdot\frac{f(A)f^{\prime\prime\prime}(A)}{f^{\prime}(A)}\right).

From our assumption (4.1), it follows that

|13Mf(A)|p=|2|p=1\left|1-\frac{3M}{f^{\prime}(A)}\right|_{p}=|-2|_{p}=1

if p>3p>3. Hence,

|12+313M/f(A)|p1\left|-\frac{1}{2}+\frac{3}{1-3M/f^{\prime}(A)}\right|_{p}\leq 1

if p>3p>3, which together with (4.2) implies the claim. ∎

5. Proof of the main result

Now we turn to the

Proof of Theorem 3.1. We set

t:=|f(x1)f(x1)2|pt:=\left\lvert\frac{f(x_{1})}{f^{\prime}(x_{1})^{2}}\right\rvert_{p}

and recall our assumption |t|p<1|t|_{p}<1. We shall proceed similarly as in the proof of the Generalized Hensel Lemma in [2] and aim to show the following statements by induction over nn:

  1. (1)

    |xn|p1\left\lvert x_{n}\right\rvert_{p}\leq 1,

  2. (2)

    |f(xn)|p=|f(x1)|p\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p},

  3. (3)

    |f(xn)|p|f(x1)|p2t4n1\left\lvert f(x_{n})\right\rvert_{p}\leq\left\lvert f^{\prime}(x_{1})\right\rvert_{p}^{2}t^{4^{n-1}}.

For n=1n=1, these conditions are clearly true. So let us assume the conditions to be true for nn. Now for n+1n+1, we have

(5.1) |xn+1|p=|xn+yn|pmax{|xn|p,|yn|p},\left\lvert x_{n+1}\right\rvert_{p}=\left\lvert x_{n}+y_{n}\right\rvert_{p}\leq\max\{\left\lvert x_{n}\right\rvert_{p},\left\lvert y_{n}\right\rvert_{p}\},

where yn:=xn+1xny_{n}:=x_{n+1}-x_{n}. From Lemma 4.1 we know that

(5.2) |yn|p|f(xn)f(xn)|p,\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{n})}\right\rvert_{p},

where we use statement (3) together with our assumption t<1t<1 above. Further, using (2), we deduce that |yn|p|f(xn)/f(x1)|p\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert{f(x_{n})}/{f^{\prime}(x_{1})}\right\rvert_{p}, and from (3), we then get

(5.3) |yn|p|f(xn)f(x1)|p|f(x1)|pt4n11.\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{1})}\right\rvert_{p}\leq\left\lvert f^{\prime}(x_{1})\right\rvert_{p}t^{4^{n-1}}\leq 1.

Finally, from (1), we know that |xn|p1\left\lvert x_{n}\right\rvert_{p}\leq 1 and hence we have |xn+1|p1\left\lvert x_{n+1}\right\rvert_{p}\leq 1 by appealing to (5.1). Thus, we have established (1) for n+1n+1 in place of nn. To prove (2) for n+1n+1, we use the fact that if Fp[x]F\in\mathbb{Z}_{p}[x], then

(5.4) F(x)F(y)=(xy)G(x,y)F(x)-F(y)=(x-y)G(x,y)

for some polynomial Gp[x,y]G\in\mathbb{Z}_{p}[x,y] and hence

|F(x)F(y)|p|xy|p|F(x)-F(y)|_{p}\leq|x-y|_{p}

for any x,ypx,y\in\mathbb{Z}_{p}. Applying this to F=fF=f^{\prime}, we have

|f(xn+1)f(xn)|p|xn+1xn|p=|yn|p|f(xn)f(xn)|p|f(x1)|pt4n1<|f(x1)|p,\begin{split}\left\lvert f^{\prime}(x_{n+1})-f^{\prime}(x_{n})\right\rvert_{p}\leq&\left\lvert x_{n+1}-x_{n}\right\rvert_{p}=\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{n})}\right\rvert_{p}\\ \leq&\left\lvert f^{\prime}(x_{1})\right\rvert_{p}t^{4^{n-1}}<\left\lvert f^{\prime}(x_{1})\right\rvert_{p},\end{split}

where we use (5.2) and (3). Now if |f(xn+1)|p|f(xn)|p\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p}\not=\left\lvert f^{\prime}(x_{n})\right\rvert_{p}, then we have

(5.5) |f(xn+1)f(xn)|p=max{|f(xn+1)|p,|f(xn)|p}|f(xn)|p=|f(x1)|p.\left\lvert f^{\prime}(x_{n+1})-f^{\prime}(x_{n})\right\rvert_{p}=\max\{\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p},\left\lvert f^{\prime}(x_{n})\right\rvert_{p}\}\geq\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p}.

Hence we arrive at a contradiction. So |f(xn+1)|p=|f(xn)|p=|f(x1)|p\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p}=\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p}.

To prove (3) for n+1n+1, we use Taylor’s formula and recall that p>3p>3, obtaining

(5.6) f(xn+1)=f(xn+yn)=f(xn)+f(xn)yn+f′′(xn)2!yn2+f′′′(xn)3!yn3+zyn4,f(x_{n+1})=f(x_{n}+y_{n})=f(x_{n})+f^{\prime}(x_{n})y_{n}+\frac{f^{\prime\prime}(x_{n})}{2!}y_{n}^{2}+\frac{f^{\prime\prime\prime}(x_{n})}{3!}y_{n}^{3}+zy_{n}^{4},

for some zpz\in\mathbb{Z}_{p}.

(5.7) f(xn)+f(xn)yn+f′′(xn)2!yn2+f′′′(xn)3!yn3=f(xn)4f(xn)3AnBn=f(xn)4f(xn)6f(xn)3AnBn,f(x_{n})+f^{\prime}(x_{n})y_{n}+\frac{f^{\prime\prime}(x_{n})}{2!}y_{n}^{2}+\frac{f^{\prime\prime\prime}(x_{n})}{3!}y_{n}^{3}=\frac{f(x_{n})^{4}}{f^{\prime}(x_{n})^{3}}\frac{A_{n}}{B_{n}}=\frac{f(x_{n})^{4}}{f^{\prime}(x_{n})^{6}}\frac{f^{\prime}(x_{n})^{3}A_{n}}{B_{n}},

where

An:=a5d415a4d3bc6a3d3b3+81a3d2b2c2189a2db3c3+18a2d2b4c36ad2b6+162ab4c4+54adb5c2162b6c3+108db7c\begin{split}A_{n}:=&a^{5}d^{4}-15a^{4}d^{3}bc-6a^{3}d^{3}b^{3}+81a^{3}d^{2}b^{2}c^{2}-189a^{2}db^{3}c^{3}+18a^{2}d^{2}b^{4}c-\\ &36ad^{2}b^{6}+162ab^{4}c^{4}+54adb^{5}c^{2}-162b^{6}c^{3}+108db^{7}c\end{split}

and

Bn:=48(3abca2d3b3)3B_{n}:=48(3abc-a^{2}d-3b^{3})^{3}

with a:=f(xn)a:=f(x_{n}), b:=f(xn)b:=f^{\prime}(x_{n}), c:=f′′(xn)c:=f^{\prime\prime}(x_{n}), d:=f′′′(xn)d:=f^{\prime\prime\prime}(x_{n}). The above expressions may be calculated conveniently using a suitable computer algebra system.

We note that |b|p=|f(x1)|p|b|_{p}=|f(x_{1})|_{p} by (2) and |a|p<|b|p2|a|_{p}<|b|_{p}^{2} by (2) and (3). It follows that |An|p|b|p6|A_{n}|_{p}\leq|b|_{p}^{6} and |Bn|p=|b|p9|B_{n}|_{p}=|b|_{p}^{9}, and hence

|f(xn+1)|p=|a4b3AnBn+zyn4|pmax{|a4b6|p,|yn|p4}|a4b6|pmax{1,|b|p2}|a4b6|p|f(x1)|p2t4n,\begin{split}|f(x_{n+1})|_{p}=&\left|\frac{a^{4}}{b^{3}}\cdot\frac{A_{n}}{B_{n}}+zy_{n}^{4}\right|_{p}\leq\max\left\{\left|\frac{a^{4}}{b^{6}}\right|_{p},|y_{n}|_{p}^{4}\right\}\\ \leq&\left|\frac{a^{4}}{b^{6}}\right|_{p}\cdot\max\left\{1,|b|^{2}_{p}\right\}\leq\left|\frac{a^{4}}{b^{6}}\right|_{p}\leq|f^{\prime}(x_{1})|_{p}^{2}t^{4^{n}},\end{split}

where for the second inequality above, we use (5.2) again, and for the last inequality, we use (2) and (3). Hence we are done with the induction.

Using (5.3) and t<1t<1, it is clear that (xn)(x_{n}) is a Cauchy sequence in p\mathbb{Z}_{p} and hence converges to some pp-adic integer γ\gamma. Further, from (2) and (3), we get |f(γ)|p=|f(x1)|p>0|f^{\prime}(\gamma)|_{p}=|f^{\prime}(x_{1})|_{p}>0 and f(γ)=0f(\gamma)=0. Finally, using arguments parallel to those in the proof of (2.9), we see that

|en+1ρen4|p=O(|en|p5),\left|e_{n+1}-\rho e_{n}^{4}\right|_{p}=O(|e_{n}|_{p}^{5}),

where en:=xnγe_{n}:=x_{n}-\gamma and ρ\rho as defined in (2.10). Thus we get

(5.8) limn|en+1|p|en|p4=|ρ|p\lim_{n\rightarrow\infty}\frac{\left\lvert e_{n+1}\right\rvert_{p}}{\left\lvert e_{n}\right\rvert_{p}^{4}}=\left\lvert\rho\right\rvert_{p}

and hence, (xn)(x_{n}) converges with order 4. ∎

6. Brief concluding remarks

Two questions remain to be answered.

(A) Can every zero in p\mathbb{Z}_{p} of ff be approximated using SJM, provided we choose x1x_{1} suitably?

(B) What happens if the initial condition (3.1) on x1x_{1} in Theorem 3.1 is not satisfied? Is there a method which produces a suitable x1x_{1} satisfying this condition?

Regarding question (A), we note that SJM only approximates simple roots of ff. If γ\gamma is a simple root, then clearly there is x1x_{1} satisfying the initial condition in Theorem 3.1, giving rise to a sequence (xn)(x_{n}) converging to γ\gamma: In particular, we may just take x1:=γx_{1}:=\gamma. If ff has multiple roots, then it is easy to produce a polynomial f~\tilde{f} whose roots are all simple and coincide with the roots of ff: Take f~=f/gcd(f,f)\tilde{f}=f/\mbox{gcd}(f,f^{\prime}), where gcd(f,f)\mbox{gcd}(f,f^{\prime}) is a greatest common divisor (unique up to multiplication by a unit in p\mathbb{Z}_{p}) of ff and ff^{\prime} in p[x]\mathbb{Z}_{p}[x]. Such a greatest common divisor can be conveniently calculated using the Euclidean algorithm in p[x]\mathbb{Q}_{p}[x].

To address question (B), we may employ Thurston’s method. Here we only describe briefly how this method works. For the details, we refer the interested reader to [13]. Let

(6.1) γ=a0+a1p+a2p2+\gamma=a_{0}+a_{1}p+a_{2}p^{2}+\cdots

be a root of ff in p\mathbb{Z}_{p}. In particular, f(a0)0modpf(a_{0})\equiv 0\bmod{p}. Let

γi=ai+ai+1p+ai+2p2+ for i0.\gamma_{i}=a_{i}+a_{i+1}p+a_{i+2}p^{2}+\cdots\mbox{ for }i\in\mathbb{N}_{0}.

Now Thurston’s method produces a chain of polynomials F0,F1,F2,F3,F_{0},F_{1},F_{2},F_{3},..., starting from F0=fF_{0}=f, where FiF_{i} satisfies the equation Fi(γi)=0F_{i}(\gamma_{i})=0. In particular, Fi(ai)0modpF_{i}(a_{i})\equiv 0\bmod{p}. Hence, we may proceed as follows to solve f(x)=0f(x)=0 in p\mathbb{Z}_{p}: First, we find a0a_{0} such that F0(a0)=f(a0)0modpF_{0}(a_{0})=f(a_{0})\equiv 0\bmod{p}. Then we produce F1F_{1} and find a1a_{1} such that F1(a1)0modpF_{1}(a_{1})\equiv 0\bmod{p}, if existent. This a1a_{1} may not be unique. Then we produce F2F_{2} and find (a not necessarily unique) a2a_{2} such that F2(a2)0modpF_{2}(a_{2})\equiv 0\bmod{p}, if existent. We continue this process. If we can continue this process ad infinitum, this gives rise to a sequence (a0+a1p++anpn)(a_{0}+a_{1}p+\cdots+a_{n}p^{n}) which converges to a root γ\gamma in p\mathbb{Z}_{p}. Generally, this sequence converges only with order 1, though.

Thurston proved that if ff has only simple roots in p\mathbb{Z}_{p}, then for every a0a_{0} satisfying f(a0)0modpf(a_{0})\equiv 0\bmod{p}, the above process either stops at some point in which case a0a_{0} cannot be lifted to a root γ\gamma of ff in p\mathbb{Z}_{p}, or, after finitely many steps, we get an index nn for which there exists ana_{n} with Fn(an)0modpF_{n}(a_{n})\equiv 0\bmod{p} and Fn(an)0modpF_{n}^{\prime}(a_{n})\not\equiv 0\bmod{p}. In this case, we are in the situation of Hensel’s Lemma, and ana_{n} therefore lifts uniquely to a root γn\gamma_{n} of FnF_{n}. To apply SJM, we only need the weaker condition |Fn(an)|p<|Fn(an)|p2|F_{n}(a_{n})|_{p}<|F_{n}^{\prime}(a_{n})|_{p}^{2} from the Generalized Hensel Lemma. Therefore, if ff has only simple roots in p\mathbb{Z}_{p}, then we can get all roots of ff in p\mathbb{Z}_{p} as follows: We apply Thurston’s method initially, starting with an a0a_{0} satisfying f(a0)0modpf(a_{0})\equiv 0\bmod{p} and, as soon as we reach the said index nn, apply the faster SJM with x1=anx_{1}=a_{n} and FnF_{n} in place of ff. This gives rise to a sequence (xn)(x_{n}) which converges to a root γnp\gamma_{n}\in\mathbb{Z}_{p} of FnF_{n}. From this, we retrieve a root

γ=a0+a1p++an1pn1+γnpnp\gamma=a_{0}+a_{1}p+...+a_{n-1}p^{n-1}+\gamma_{n}p^{n}\in\mathbb{Z}_{p}

of f(x)f(x).

One last brief remark on practical applications of SJM. This method produces a sequence whose elements lie in p\mathbb{Z}_{p}. For practical applications, however, only sequences in \mathbb{Z} are useful since we cannot encode an infinite pp-adic representation using a computer. Therefore, to implement SJM in practice, one needs to cut off the pp-adic representation of xnx_{n} at an appropriate point, getting an integer xnx_{n}^{\prime}. Then one calculates xn+1x_{n+1} (and hence xn+1x_{n+1}^{\prime}) using xnx_{n}^{\prime} in place of xnx_{n}. This will give a sequence (xn)(x_{n}^{\prime}) of integers converging to our root γp\gamma\in\mathbb{Z}_{p} of ff.

References

  • [1] I.K. Argyros, Dong Chen, and Qingshan Qian. The jarratt method in banach space setting. Journal of Computational and Applied Mathematics, 51(1):103–106, May 1994.
  • [2] Keith Conrad. Hensel’s lemma. Hensel’s Lemma by K.Conrad.
  • [3] Walter Gander. On halley’s iteration method. The American Mathematical Monthly, 92(2):131–134, 1985.
  • [4] Fernando Q. Gouvêa. p-adic Numbers. Springer International Publishing, 2020.
  • [5] P. Jarratt. Some fourth order multipoint iterative methods for solving equations. Mathematics of Computation, 20(95):434–437, 1966.
  • [6] Svetlana Katok. p-adic analysis compared with real. American Mathematical Society Mathematics Advanced Study Semesters, Providence, R.I. University Park, PA, 2007.
  • [7] Neal Koblitz. p-adic Numbers, p-adic Analysis, and Zeta-Functions. Springer New York, 1984.
  • [8] F. W. J. Olver. The evaluation of zeros of high-degree polynomials. Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences, 244(885):385–415, 1952.
  • [9] Julius Fergy Rabago. Olver’s method for solving roots of p-adic polynomial equations. Italian Journal of Pure and Applied Mathematics, 36:739, 08 2016.
  • [10] B. Surender Reddy. On equivalence of p-adic 2-norms in p-adic linear 2-normed spaces. 2010.
  • [11] Alain M. Robert. A Course in p-adic Analysis. Springer New York, 2000.
  • [12] W. H. Schikhof. Ultrametric Calculus: An Introduction to p-Adic Analysis. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1985.
  • [13] H. S. Thurston. The solution of p-adic equations. The American Mathematical Monthly, 50(3):142–148, 1943.
  • [14] Axel G. R. Turnquist. p-adic numbers and solving p-adic equations. 2012.
  • [15] Xiuhua Wang, Jisheng Kou, and Yitian Li. Modified jarratt method with sixth-order convergence. Applied Mathematics Letters, 22(12):1798–1802, 2009.