Solving $p$ -adic polynomial equations using Jarratt’s Method

Stephan Baier, Swarup Kumar Das and Saayan Mukherjee Stephan Baier, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected] Swarup Kumar Das, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected] Saayan Mukherjee, Ramakrishna Mission Vivekananda Educational and Research Institute, Department of Mathematics, G. T. Road, PO Belur Math, Howrah, West Bengal 711202, India [email protected]

Abstract.

We implement an iterative numerical method to solve polynomial equations $f(x)=0$ in the $p$ -adic numbers, where $f(x)\in\mathbb{Z}_{p}[x]$ . This method is a simplified $p$ -adic analogue of Jarratt’s method for finding roots of functions over the real numbers. We establish that our method has a higher order of convergence than J.F.T. Rabago’s [9] $p$ -adic version of Olver’s method from 2016. Moreover, we weaken the initial conditions in Rabago’s method, which allows us to start the iteration with a multiple root of the congruence $f(x)\equiv 0\bmod{p}$ .

Key words and phrases:

Jarratt’s method, p-adic polynomials, Thurston’s Method, Hensel’s lemma

2020 Mathematics Subject Classification:

11D88,11C08,65H04

1. Introduction

Solving polynomial equations is one of the oldest problems in mathematics which dates back to the Babylonians. There are several analytic, algebraic and geometric approaches for finding roots of polynomials. In this paper we discuss an iterative method to approximate solutions of polynomial equations of the form $f(x)=0$ in the $p$ -adic numbers, where $f\in\mathbb{Z}_{p}[x]$ .

Well-known numerical methods to approximate roots of functions on the real numbers are Newton’s Method, Olver’s Method [8], Halley’s method [3] and others. In this paper, we look at a method developed by P. Jarratt in 1966 [5]. This method has a higher order of convergence than all of the afore-mentioned methods. We set up a $p$ -adic analogue of Jarratt’s method. Analogues of Olver’s and Halley’s method in the $p$ -adic setting have previously been established by J.F.T. Rabago [9]. Apart from improving the order of convergence, we also weaken the initial conditions in Rabago’s work. Below is a definition of the term “order of convergence” for general Banach spaces, including the complete valued fields $\mathbb{R}$ and $\mathbb{Q}_{p}$ .

Definition 1.1.

Let $V$ be a Banach space with norm $|.|$ . Assume that $(x_{n})$ is a sequence in $V$ converging to $\gamma\in V$ . Then we say that $(x_{n})$ converges with order $q$ if

R_{n}=\frac{\left\lvert x_{n+1}-\gamma\right\rvert}{\left\lvert x_{n}-\gamma\right\rvert^{q}}

is a bounded sequence in $\mathbb{R}$ .

Before turning to Jarret’s method, let us briefly review Newton’s and Olver’s methods in the context of $p$ -adic numbers. Suppose $p$ is a prime and let $f(x)\in\mathbb{Z}_{p}[x]$ be a polynomial. Throughout the sequel, we assume the leading coefficient of $f$ to be a unit in $\mathbb{Z}_{p}$ . Then all solutions of $f$ in $\mathbb{Q}_{p}$ lie in $\mathbb{Z}_{p}$ . Assume that $x_{1}\in\mathbb{Z}_{p}$ such that $f(x_{1})\equiv 0\bmod{p}$ but $f^{\prime}(x_{1})\not\equiv 0\bmod{p}$ . Newton’s method uses the iteration

(1.1)

x_{n+1}=x_{n}-\frac{f(x_{n})}{f^{\prime}(x_{n})}\quad\mbox{for }n\geq 1,

which yields a sequence in $\mathbb{Z}_{p}$ which converges to a simple root of $f$ in $\mathbb{Z}_{p}$ . In the case when $f\in\mathbb{Z}[x]$ , this provides a convenient proof of Hensel’s lemma which, for all $m\in\mathbb{N}$ , asserts that a solution $x_{1}\in\mathbb{Z}$ of the congruence $f(x_{1})\equiv 0\bmod{p}$ satisfying $f^{\prime}(x_{1})\not\equiv 0\bmod{p}$ can be uniquely lifted to a root $x$ of the congruence $f(x)\equiv 0\bmod{p^{m}}$ . In fact, one may replace the condition $f^{\prime}(x_{1})\not\equiv 0\bmod{p}$ by the weaker condition

(1.2)

|f(x_{1})|_{p}<|f^{\prime}(x_{1})|_{p}^{2}

which leads to a generalized version of Hensel’s lemma (see [6]). The order of convergence of Newton’s method is 2.

Olver’s method, an iteration with higher order of convergence, was carried over to $p$ -adic numbers by J.F.T. Rabago in 2016 [9]. He proved that if $f(x)\in\mathbb{Z}_{p}[x]$ and if $x_{1}\in\mathbb{Z}$ satisfies $f(x_{1})\equiv 0\bmod{p}$ and $f^{\prime}(x_{1})\not\equiv 0\bmod{p}$ , then the sequence $(x_{n})$ defined iteratively as

(1.3)

x_{n+1}=x_{n}-\frac{f(x_{n})}{f^{\prime}(x_{n})}-\frac{1}{2}\cdot\frac{f(x_{n})^{2}f^{\prime\prime}(x_{n})}{f^{\prime}(x_{n})^{3}}

converges to a simple root of $f$ in $\mathbb{Z}_{p}$ . The order of convergence of this method is 3 for $p\geq 3$ and 4 for $p=2$ .

Our goal is to implement a $p$ -adic analogue of a simplified version of Jarratt’s method which has order of convergence 4. We will also weaken the initial condition $f^{\prime}(x_{1})\not\equiv 0\bmod{p}$ , replacing it by the condition (1.2) above. This allows us to begin with an $x_{1}$ which possibly is a multiple root of the congruence $f(x)\equiv 0\bmod{p}$ . At the end of this article we will briefly discuss Thurston’s method which is useful when the initial condition (1.2) is not satisfied. In a nutshell, Thurston’s method either tells us after finitely many steps that a Hensel lifting of a given root $x_{1}$ of the congruence $f(x)\equiv 0\bmod{p}$ to a solution of $f(x)=0$ in $\mathbb{Z}_{p}$ does not exist, or it produces a new polynomial $F$ in place of $f$ which meets the above-mentioned initial conditions. In this case we can perform Jarratt’s method for $F$ , which in turn gives us an approximation to a root of $f$ in $\mathbb{Z}_{p}$ .

2. Simplified Jarratt Method for functions on $\mathbb{R}$

Jarratt’s original method is an iterative numerical method for solving equations $f(x)=0$ , where $f$ is a real-valued function defined on intervals in $\mathbb{R}$ . The iteration is defined as

(2.1)

x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{f(x_{n})}{f^{\prime}(x_{n})-3f^{\prime}\left(x_{n}-\frac{2}{3}\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)},\hskip 28.45274pt\forall n\in\mathbb{N}.

Precise initial conditions on $x_{1}$ for a generalized version of this method on Banach spaces were worked out in [1]. The order of convergence of this method is 4, as established in [5].

Instead of using the original method, we here use a simplified version. The idea is as follows. Using a Taylor approximation (if existent), we find

f^{\prime}\left(x_{n}-\frac{2}{3}\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)=f^{\prime}(x_{n})-\frac{2}{3}\cdot\frac{f(x_{n})}{f^{\prime}(x_{n})}\cdot f^{\prime\prime}(x_{n})+\frac{2}{9}\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})+\mbox{Error}.

Plugging the right-hand side above into (2.1) and neglecting the error, we arrive at the simpler iteration

(2.2)

x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})f^{\prime}(x_{n})^{2}}{-6f^{\prime}(x_{n})^{3}+6f(x_{n})f^{\prime}(x_{n})f^{\prime\prime}(x_{n})-2f(x_{n})^{2}f^{\prime\prime\prime}(x_{n})},

We will refer to this iteration as Simplified Jarratt Method (SJM) in the rest of this article. Below we will prove that SJM has order of convergence 4 over the real numbers. If $p>3$ , this result carries over to $\mathbb{Q}_{p}$ , as we will see later.

Theorem 2.1.

Assume $I$ is an open interval in $\mathbb{R}$ and $f:I\rightarrow\mathbb{R}$ is a five times differentiable function. Assume that $\gamma\in I$ satisfies $f(\gamma)=0$ and $f^{\prime}(\gamma)\not=0$ . Then there exists $\delta>0$ such that if $x_{1}\in(\gamma-\delta,\gamma+\delta)$ , the sequence $(x_{n})$ defined by (2.2) converges to $\gamma$ with order $4$ . Moreover,

(2.3)

\lim_{n\rightarrow\infty}\frac{\left\lvert x_{n+1}-\gamma\right\rvert}{\left\lvert x_{n}-\gamma\right\rvert^{4}}=\left|\frac{18f^{\prime\prime\prime}(\gamma)f^{\prime\prime}(\gamma)f^{\prime}(\gamma)-18f^{\prime\prime}(\gamma)^{3}-5f^{(4)}(\gamma)f^{\prime}(\gamma)^{2}}{48f^{\prime}(\gamma)^{3}}\right|.

Proof.

Let $n\in\mathbb{N}$ . Taylor’s theorem provides us with the approximations

(2.4)

\begin{split}f(x_{n})=&f^{\prime}(\gamma)\left(e_{n}+c_{2}e_{n}^{2}+c_{3}e_{n}^{3}+c_{4}e_{n}^{4}+O(e_{n}^{5})\right),\\ f^{\prime}(x_{n})=&f^{\prime}(\gamma)\left(1+2c_{2}e_{n}+3c_{3}e_{n}^{2}+4c_{4}e_{n}^{3}+O(e_{n}^{4})\right),\\ f^{\prime\prime}(x_{n})=&f^{\prime}(\gamma)\left(2c_{2}+6c_{3}e_{n}+12c_{4}e_{n}^{2}+O(e_{n}^{3})\right),\\ f^{\prime\prime\prime}(x_{n})=&f^{\prime}(\gamma)\left(6c_{3}+24c_{4}e_{n}+O(e_{n}^{2})\right),\end{split}

where

e_{n}:=x_{n}-\gamma\quad\mbox{and}\quad c_{k}:=\frac{1}{k!}\frac{f^{(k)}(\gamma)}{f^{\prime}(\gamma)}.

Dividing the first two equations in (2.4), we get

(2.5)

\frac{f(x_{n})}{f^{\prime}(x_{n})}=e_{n}-c_{2}{e_{n}^{2}}+2(c_{2}^{2}-c_{3})e_{n}^{3}+(7c_{2}c_{3}-4c_{2}^{2}-3c_{4})e_{n}^{4}+O(e_{n}^{5})

after a short calculation. We further write (2.2) in the form

(2.6)

x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})}{-6f^{\prime}(x_{n})+6\cdot\frac{f(x_{n})}{f^{\prime}(x_{n})}\cdot f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})}.

Combining (2.4) and (2.5), we approximate the denominator in the last term on the right-hand side of (2.6) by

(2.7)

\begin{split}&-6f^{\prime}(x_{n})+6\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})\\ =&f^{\prime}(\gamma)\left((-24c_{3}^{2}-48c_{2}^{3}+96c_{3}c_{2}^{2}-12c_{4}c_{2})e_{n}^{4}+(24c_{2}^{3}-36c_{3}c_{2})e_{n}^{3}+\right.\\ &\qquad\left.(6c_{3}-12c_{2}^{2})e_{n}^{2}-6)+O(e_{n}^{5})\right).\end{split}

From the first line in (2.4) and (2.7), we deduce that

(2.8)

\begin{split}&\frac{3f(x_{n})}{-6f^{\prime}(x_{n})+6\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)f^{\prime\prime}(x_{n})-2\left(\frac{f(x_{n})}{f^{\prime}(x_{n})}\right)^{2}f^{\prime\prime\prime}(x_{n})}\\ &=-\frac{e_{n}}{2}-\frac{c_{2}}{2}e_{n}^{2}+(c_{2}^{2}-c_{3})e_{n}^{3}+\frac{c_{4}+2c_{2}^{3}-5c_{3}c_{2}}{2}\cdot e_{n}^{4}+O(e_{n}^{5}),\end{split}

which may be calculated using a suitable computer algebra system. Now plugging (2.5) and (2.8) into (2.6) and simplifying, we arrive at

\begin{split}x_{n+1}&=\gamma+e_{n}-\frac{1}{2}(e_{n}-c_{2}e_{n}^{2}+2(c_{2}^{2}-c_{3})e_{n}^{3}+(7c_{3}c_{2}-4c_{2}^{3}-3c_{4})e_{n}^{4})+\\ &\quad\left(-\frac{e_{n}}{2}-\frac{c_{2}}{2}e_{n}^{2}+(c_{2}^{2}-c_{3})e_{n}^{3}+\frac{c_{4}+2c_{2}^{3}-5c_{3}c_{2}}{2}\cdot e_{n}^{4}+O(e_{n}^{5})\right)\\ &=\gamma+\frac{9c_{3}c_{2}-6c_{2}^{3}-5c_{4}}{2}\cdot e_{n}^{4}+O(e_{n}^{5}),\end{split}

and hence

(2.9)

e_{n+1}=\rho e_{n}^{4}+O(e_{n}^{5})

with

(2.10)

\rho:=\frac{9c_{3}c_{2}-6c_{2}^{3}-5c_{4}}{2}.

Therefore, if $x_{1}$ is sufficiently close to $\gamma$ , then the sequence $(x_{n})$ converges to $\gamma$ with order 4, and we have

\lim_{n\rightarrow\infty}\frac{\left\lvert e_{n+1}\right\rvert}{\left\lvert e_{n}\right\rvert^{4}}=|\rho|,

which implies (2.3). This completes the proof. ∎

3. $p$ -adic analogue of SJM

Now we apply SJM to polynomials in $\mathbb{Z}_{p}[x]$ subject to certain initial conditions which we work out in detail. We shall prove the following theorem, which is our main result.

Theorem 3.1.

Let $p>3$ be a prime and $f(x)\in\mathbb{Z}_{p}[x]$ be a polynomial with leading coefficent a unit in $\mathbb{Z}_{p}$ . Suppose that $x_{1}\in\mathbb{Z}_{p}$ satisfies the condition

(3.1)

|f(x_{1})|_{p}<|f^{\prime}(x_{1})|_{p}^{2}.

(In particular, $f^{\prime}(x_{1})\not=0$ and $|f(x_{1})|_{p}<1$ and thus $f(x_{1})\equiv 0\bmod{p}$ .) Then the sequence $(x_{n})$ defined iteratively as

x_{n+1}=x_{n}-\frac{1}{2}\frac{f(x_{n})}{f^{\prime}(x_{n})}+\frac{3f(x_{n})f^{\prime}(x_{n})^{2}}{-6f^{\prime}(x_{n})^{3}+6f(x_{n})f^{\prime}(x_{n})f^{\prime\prime}(x_{n})-2f(x_{n})^{2}f^{\prime\prime\prime}(x_{n})}

converges to a simple root $\gamma\in\mathbb{Z}_{p}$ of $f$ with order of convergence 4.

As noted in the introduction, in [9], a $p$ -adic analogue of Olver’s method was worked out. This method converges with order 3 and was formulated in [9] under the stronger initial condition that $|f(x_{1})|_{p}<1$ and $|f^{\prime}(x_{1})|_{p}=1$ , which means that $x_{1}$ is a simple root of the congruence $f(x)\equiv 0\bmod{p}$ . In contrast, our condition (3.1) allows us to take $x_{1}$ as a multiple root of the said congruence. We point out that (3.1) matches the condition in the following theorem which is a version of the Generalized Hensel Lemma (see [6]).

Theorem 3.2 (Generalised Hensel Lemma).

Let $f(x)\in\mathbb{Z}_{p}[x]$ and $a\in\mathbb{Z}_{p}$ satisfying $|f(a)|_{p}<|f^{\prime}(a)|_{p}^{2}$ . Then there is a unique $\gamma\in\mathbb{Z}_{p}$ such that $f(\gamma)=0$ and $|\gamma-a|_{p}<|f^{\prime}(a)|_{p}$ . Moreover, $|f^{\prime}(\gamma)|_{p}=|f^{\prime}(a)|_{p}$ .

This shows that a root $a$ of the congruence $f(x)\equiv 0\bmod{p}$ can be lifted uniquely to a simple root $\gamma$ of $f$ in $\mathbb{Z}_{p}$ , provided that $|f(a)|_{p}<|f^{\prime}(a)|_{p}^{2}$ . Hence, if $x_{1}=a$ satisfies this condition, then the SJM iteration gives a sequence which converges to precisely this unique lifting $\gamma$ of $a$ .

Theorem 3.2 can be proved using Newton’s method (see [2]). In our proof of Theorem 3.1, we replace Newton’s method by SJM.

4. Preliminaries

For the proof of Theorem 3.1, we will use the lemma below.

Lemma 4.1.

Let $p>3$ be a prime, $f(x)\in\mathbb{Z}_{p}[x]$ and $A\in\mathbb{Z}_{p}$ such that $f^{\prime}(A)\not=0$ . Suppose that

(4.1)

\left|\frac{f(A)}{f^{\prime}(A)^{2}}\right|_{p}<1.

Define

B:=-\frac{1}{2}\frac{f(A)}{f^{\prime}(A)}+\frac{3f(A)}{f^{\prime}(A)-3M},

where

M:=f^{\prime}(A)-\frac{2}{3}\cdot\frac{f(A)}{f^{\prime}(A)}\cdot f^{\prime\prime}(A)+\frac{2}{9}\left(\frac{f(A)}{f^{\prime}(A)}\right)^{2}f^{\prime\prime\prime}(A).

Then we have

|B|_{p}\leq\left|\frac{f(A)}{f^{\prime}(A)}\right|_{p}.

Proof.

We write

(4.2)

B=\frac{f(A)}{f^{\prime}(A)}\left(-\frac{1}{2}+\frac{3}{1-3M/f^{\prime}(A)}\right)

and

1-\frac{3M}{f^{\prime}(A)}=-2+2\cdot\frac{f(A)}{f^{\prime}(A)^{2}}\left(f^{\prime\prime}(A)-\frac{1}{3}\cdot\frac{f(A)f^{\prime\prime\prime}(A)}{f^{\prime}(A)}\right).

From our assumption (4.1), it follows that

\left|1-\frac{3M}{f^{\prime}(A)}\right|_{p}=|-2|_{p}=1

if $p>3$ . Hence,

\left|-\frac{1}{2}+\frac{3}{1-3M/f^{\prime}(A)}\right|_{p}\leq 1

if $p>3$ , which together with (4.2) implies the claim. ∎

5. Proof of the main result

Now we turn to the

Proof of Theorem 3.1. We set

t:=\left\lvert\frac{f(x_{1})}{f^{\prime}(x_{1})^{2}}\right\rvert_{p}

and recall our assumption $|t|_{p}<1$ . We shall proceed similarly as in the proof of the Generalized Hensel Lemma in [2] and aim to show the following statements by induction over $n$ :

(1)

$\left\lvert x_{n}\right\rvert_{p}\leq 1$ ,
(2)

$\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p}$ ,
(3)

$\left\lvert f(x_{n})\right\rvert_{p}\leq\left\lvert f^{\prime}(x_{1})\right\rvert_{p}^{2}t^{4^{n-1}}$ .

For $n=1$ , these conditions are clearly true. So let us assume the conditions to be true for $n$ . Now for $n+1$ , we have

(5.1)

\left\lvert x_{n+1}\right\rvert_{p}=\left\lvert x_{n}+y_{n}\right\rvert_{p}\leq\max\{\left\lvert x_{n}\right\rvert_{p},\left\lvert y_{n}\right\rvert_{p}\},

where $y_{n}:=x_{n+1}-x_{n}$ . From Lemma 4.1 we know that

(5.2)

\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{n})}\right\rvert_{p},

where we use statement (3) together with our assumption $t<1$ above. Further, using (2), we deduce that $\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert{f(x_{n})}/{f^{\prime}(x_{1})}\right\rvert_{p}$ , and from (3), we then get

(5.3)

\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{1})}\right\rvert_{p}\leq\left\lvert f^{\prime}(x_{1})\right\rvert_{p}t^{4^{n-1}}\leq 1.

Finally, from (1), we know that $\left\lvert x_{n}\right\rvert_{p}\leq 1$ and hence we have $\left\lvert x_{n+1}\right\rvert_{p}\leq 1$ by appealing to (5.1). Thus, we have established (1) for $n+1$ in place of $n$ . To prove (2) for $n+1$ , we use the fact that if $F\in\mathbb{Z}_{p}[x]$ , then

(5.4)

F(x)-F(y)=(x-y)G(x,y)

for some polynomial $G\in\mathbb{Z}_{p}[x,y]$ and hence

|F(x)-F(y)|_{p}\leq|x-y|_{p}

for any $x,y\in\mathbb{Z}_{p}$ . Applying this to $F=f^{\prime}$ , we have

\begin{split}\left\lvert f^{\prime}(x_{n+1})-f^{\prime}(x_{n})\right\rvert_{p}\leq&\left\lvert x_{n+1}-x_{n}\right\rvert_{p}=\left\lvert y_{n}\right\rvert_{p}\leq\left\lvert\frac{f(x_{n})}{f^{\prime}(x_{n})}\right\rvert_{p}\\ \leq&\left\lvert f^{\prime}(x_{1})\right\rvert_{p}t^{4^{n-1}}<\left\lvert f^{\prime}(x_{1})\right\rvert_{p},\end{split}

where we use (5.2) and (3). Now if $\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p}\not=\left\lvert f^{\prime}(x_{n})\right\rvert_{p}$ , then we have

(5.5)

\left\lvert f^{\prime}(x_{n+1})-f^{\prime}(x_{n})\right\rvert_{p}=\max\{\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p},\left\lvert f^{\prime}(x_{n})\right\rvert_{p}\}\geq\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p}.

Hence we arrive at a contradiction. So $\left\lvert f^{\prime}(x_{n+1})\right\rvert_{p}=\left\lvert f^{\prime}(x_{n})\right\rvert_{p}=\left\lvert f^{\prime}(x_{1})\right\rvert_{p}$ .

To prove (3) for $n+1$ , we use Taylor’s formula and recall that $p>3$ , obtaining

(5.6)

f(x_{n+1})=f(x_{n}+y_{n})=f(x_{n})+f^{\prime}(x_{n})y_{n}+\frac{f^{\prime\prime}(x_{n})}{2!}y_{n}^{2}+\frac{f^{\prime\prime\prime}(x_{n})}{3!}y_{n}^{3}+zy_{n}^{4},

for some $z\in\mathbb{Z}_{p}$ .

(5.7)

f(x_{n})+f^{\prime}(x_{n})y_{n}+\frac{f^{\prime\prime}(x_{n})}{2!}y_{n}^{2}+\frac{f^{\prime\prime\prime}(x_{n})}{3!}y_{n}^{3}=\frac{f(x_{n})^{4}}{f^{\prime}(x_{n})^{3}}\frac{A_{n}}{B_{n}}=\frac{f(x_{n})^{4}}{f^{\prime}(x_{n})^{6}}\frac{f^{\prime}(x_{n})^{3}A_{n}}{B_{n}},

where

\begin{split}A_{n}:=&a^{5}d^{4}-15a^{4}d^{3}bc-6a^{3}d^{3}b^{3}+81a^{3}d^{2}b^{2}c^{2}-189a^{2}db^{3}c^{3}+18a^{2}d^{2}b^{4}c-\\ &36ad^{2}b^{6}+162ab^{4}c^{4}+54adb^{5}c^{2}-162b^{6}c^{3}+108db^{7}c\end{split}

and

B_{n}:=48(3abc-a^{2}d-3b^{3})^{3}

with $a:=f(x_{n})$ , $b:=f^{\prime}(x_{n})$ , $c:=f^{\prime\prime}(x_{n})$ , $d:=f^{\prime\prime\prime}(x_{n})$ . The above expressions may be calculated conveniently using a suitable computer algebra system.

We note that $|b|_{p}=|f(x_{1})|_{p}$ by (2) and $|a|_{p}<|b|_{p}^{2}$ by (2) and (3). It follows that $|A_{n}|_{p}\leq|b|_{p}^{6}$ and $|B_{n}|_{p}=|b|_{p}^{9}$ , and hence

\begin{split}|f(x_{n+1})|_{p}=&\left|\frac{a^{4}}{b^{3}}\cdot\frac{A_{n}}{B_{n}}+zy_{n}^{4}\right|_{p}\leq\max\left\{\left|\frac{a^{4}}{b^{6}}\right|_{p},|y_{n}|_{p}^{4}\right\}\\ \leq&\left|\frac{a^{4}}{b^{6}}\right|_{p}\cdot\max\left\{1,|b|^{2}_{p}\right\}\leq\left|\frac{a^{4}}{b^{6}}\right|_{p}\leq|f^{\prime}(x_{1})|_{p}^{2}t^{4^{n}},\end{split}

where for the second inequality above, we use (5.2) again, and for the last inequality, we use (2) and (3). Hence we are done with the induction.

Using (5.3) and $t<1$ , it is clear that $(x_{n})$ is a Cauchy sequence in $\mathbb{Z}_{p}$ and hence converges to some $p$ -adic integer $\gamma$ . Further, from (2) and (3), we get $|f^{\prime}(\gamma)|_{p}=|f^{\prime}(x_{1})|_{p}>0$ and $f(\gamma)=0$ . Finally, using arguments parallel to those in the proof of (2.9), we see that

\left|e_{n+1}-\rho e_{n}^{4}\right|_{p}=O(|e_{n}|_{p}^{5}),

where $e_{n}:=x_{n}-\gamma$ and $\rho$ as defined in (2.10). Thus we get

(5.8)

\lim_{n\rightarrow\infty}\frac{\left\lvert e_{n+1}\right\rvert_{p}}{\left\lvert e_{n}\right\rvert_{p}^{4}}=\left\lvert\rho\right\rvert_{p}

and hence, $(x_{n})$ converges with order 4. ∎

6. Brief concluding remarks

Two questions remain to be answered.

(A) Can every zero in $\mathbb{Z}_{p}$ of $f$ be approximated using SJM, provided we choose $x_{1}$ suitably?

(B) What happens if the initial condition (3.1) on $x_{1}$ in Theorem 3.1 is not satisfied? Is there a method which produces a suitable $x_{1}$ satisfying this condition?

Regarding question (A), we note that SJM only approximates simple roots of $f$ . If $\gamma$ is a simple root, then clearly there is $x_{1}$ satisfying the initial condition in Theorem 3.1, giving rise to a sequence $(x_{n})$ converging to $\gamma$ : In particular, we may just take $x_{1}:=\gamma$ . If $f$ has multiple roots, then it is easy to produce a polynomial $\tilde{f}$ whose roots are all simple and coincide with the roots of $f$ : Take $\tilde{f}=f/\mbox{gcd}(f,f^{\prime})$ , where $\mbox{gcd}(f,f^{\prime})$ is a greatest common divisor (unique up to multiplication by a unit in $\mathbb{Z}_{p}$ ) of $f$ and $f^{\prime}$ in $\mathbb{Z}_{p}[x]$ . Such a greatest common divisor can be conveniently calculated using the Euclidean algorithm in $\mathbb{Q}_{p}[x]$ .

To address question (B), we may employ Thurston’s method. Here we only describe briefly how this method works. For the details, we refer the interested reader to [13]. Let

(6.1)

\gamma=a_{0}+a_{1}p+a_{2}p^{2}+\cdots

be a root of $f$ in $\mathbb{Z}_{p}$ . In particular, $f(a_{0})\equiv 0\bmod{p}$ . Let

\gamma_{i}=a_{i}+a_{i+1}p+a_{i+2}p^{2}+\cdots\mbox{ for }i\in\mathbb{N}_{0}.

Now Thurston’s method produces a chain of polynomials $F_{0},F_{1},F_{2},F_{3},...$ , starting from $F_{0}=f$ , where $F_{i}$ satisfies the equation $F_{i}(\gamma_{i})=0$ . In particular, $F_{i}(a_{i})\equiv 0\bmod{p}$ . Hence, we may proceed as follows to solve $f(x)=0$ in $\mathbb{Z}_{p}$ : First, we find $a_{0}$ such that $F_{0}(a_{0})=f(a_{0})\equiv 0\bmod{p}$ . Then we produce $F_{1}$ and find $a_{1}$ such that $F_{1}(a_{1})\equiv 0\bmod{p}$ , if existent. This $a_{1}$ may not be unique. Then we produce $F_{2}$ and find (a not necessarily unique) $a_{2}$ such that $F_{2}(a_{2})\equiv 0\bmod{p}$ , if existent. We continue this process. If we can continue this process ad infinitum, this gives rise to a sequence $(a_{0}+a_{1}p+\cdots+a_{n}p^{n})$ which converges to a root $\gamma$ in $\mathbb{Z}_{p}$ . Generally, this sequence converges only with order 1, though.

Thurston proved that if $f$ has only simple roots in $\mathbb{Z}_{p}$ , then for every $a_{0}$ satisfying $f(a_{0})\equiv 0\bmod{p}$ , the above process either stops at some point in which case $a_{0}$ cannot be lifted to a root $\gamma$ of $f$ in $\mathbb{Z}_{p}$ , or, after finitely many steps, we get an index $n$ for which there exists $a_{n}$ with $F_{n}(a_{n})\equiv 0\bmod{p}$ and $F_{n}^{\prime}(a_{n})\not\equiv 0\bmod{p}$ . In this case, we are in the situation of Hensel’s Lemma, and $a_{n}$ therefore lifts uniquely to a root $\gamma_{n}$ of $F_{n}$ . To apply SJM, we only need the weaker condition $|F_{n}(a_{n})|_{p}<|F_{n}^{\prime}(a_{n})|_{p}^{2}$ from the Generalized Hensel Lemma. Therefore, if $f$ has only simple roots in $\mathbb{Z}_{p}$ , then we can get all roots of $f$ in $\mathbb{Z}_{p}$ as follows: We apply Thurston’s method initially, starting with an $a_{0}$ satisfying $f(a_{0})\equiv 0\bmod{p}$ and, as soon as we reach the said index $n$ , apply the faster SJM with $x_{1}=a_{n}$ and $F_{n}$ in place of $f$ . This gives rise to a sequence $(x_{n})$ which converges to a root $\gamma_{n}\in\mathbb{Z}_{p}$ of $F_{n}$ . From this, we retrieve a root

\gamma=a_{0}+a_{1}p+...+a_{n-1}p^{n-1}+\gamma_{n}p^{n}\in\mathbb{Z}_{p}

of $f(x)$ .

One last brief remark on practical applications of SJM. This method produces a sequence whose elements lie in $\mathbb{Z}_{p}$ . For practical applications, however, only sequences in $\mathbb{Z}$ are useful since we cannot encode an infinite $p$ -adic representation using a computer. Therefore, to implement SJM in practice, one needs to cut off the $p$ -adic representation of $x_{n}$ at an appropriate point, getting an integer $x_{n}^{\prime}$ . Then one calculates $x_{n+1}$ (and hence $x_{n+1}^{\prime}$ ) using $x_{n}^{\prime}$ in place of $x_{n}$ . This will give a sequence $(x_{n}^{\prime})$ of integers converging to our root $\gamma\in\mathbb{Z}_{p}$ of $f$ .

References

[1] I.K. Argyros, Dong Chen, and Qingshan Qian. The jarratt method in banach space setting. Journal of Computational and Applied Mathematics, 51(1):103–106, May 1994.
[2] Keith Conrad. Hensel’s lemma. Hensel’s Lemma by K.Conrad.
[3] Walter Gander. On halley’s iteration method. The American Mathematical Monthly, 92(2):131–134, 1985.
[4] Fernando Q. Gouvêa. p-adic Numbers. Springer International Publishing, 2020.
[5] P. Jarratt. Some fourth order multipoint iterative methods for solving equations. Mathematics of Computation, 20(95):434–437, 1966.
[6] Svetlana Katok. p-adic analysis compared with real. American Mathematical Society Mathematics Advanced Study Semesters, Providence, R.I. University Park, PA, 2007.
[7] Neal Koblitz. p-adic Numbers, p-adic Analysis, and Zeta-Functions. Springer New York, 1984.
[8] F. W. J. Olver. The evaluation of zeros of high-degree polynomials. Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences, 244(885):385–415, 1952.
[9] Julius Fergy Rabago. Olver’s method for solving roots of p-adic polynomial equations. Italian Journal of Pure and Applied Mathematics, 36:739, 08 2016.
[10] B. Surender Reddy. On equivalence of p-adic 2-norms in p-adic linear 2-normed spaces. 2010.
[11] Alain M. Robert. A Course in p-adic Analysis. Springer New York, 2000.
[12] W. H. Schikhof. Ultrametric Calculus: An Introduction to p-Adic Analysis. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1985.
[13] H. S. Thurston. The solution of p-adic equations. The American Mathematical Monthly, 50(3):142–148, 1943.
[14] Axel G. R. Turnquist. p-adic numbers and solving p-adic equations. 2012.
[15] Xiuhua Wang, Jisheng Kou, and Yitian Li. Modified jarratt method with sixth-order convergence. Applied Mathematics Letters, 22(12):1798–1802, 2009.

Solving pp-adic polynomial equations using Jarratt’s Method

Abstract.

Key words and phrases:

2020 Mathematics Subject Classification:

1. Introduction

Definition 1.1.

2. Simplified Jarratt Method for functions on ℝ\mathbb{R}

Theorem 2.1.

Proof.

3. pp-adic analogue of SJM

Theorem 3.1.

Theorem 3.2 (Generalised Hensel Lemma).

4. Preliminaries

Lemma 4.1.

Proof.

5. Proof of the main result

6. Brief concluding remarks

References

Solving $p$ -adic polynomial equations using Jarratt’s Method

2. Simplified Jarratt Method for functions on $\mathbb{R}$

3. $p$ -adic analogue of SJM