An iterative algorithm for approximating roots of integers
Abstract
We explore an algorithm for approximating roots of integers, discuss its motivation and derivation, and analyze its convergence rates with varying parameters and inputs. We also perform comparisons with established methods for approximating square roots and other rational powers.
1 Introduction
Fixed-point iteration is a technique commonly used for approximating irrational constants, such as calculating the golden ratio or finding square roots via the Babylonian method. More generally, the Newton-Raphson method is an iterative process used for estimating the roots of functions, which can be applied to find the roots of certain integers. For example, searching for the positive real root of $f(x) = x^2 - 2$ will yield $\sqrt{2} \approx 1.41421$.
In this paper, we will introduce an algorithm for estimating rational powers of integers that employs fixed-point iteration, provide examples of its use, and compare it to the existing methods mentioned above.
It is important to note the definition of a “fixed point” in the context of functions and iterative processes, as well as what it means for a fixed point to be “stable.”
Definition 1.1.
A fixed point $x^*$ of a function $f$ is a value in the domain of $f$ such that $f(x^*) = x^*$.
In other words, a fixed point of a function occurs wherever the input and output values are equal.
Example 1.2.
The values $0$ and $1$ are the two fixed points of the function $f(x) = x^2$ because $f(0) = 0$ and $f(1) = 1$.
Not all fixed points are of equal importance, however. One important distinction to be made for a given fixed point is whether it can be classified as “stable” or “unstable,” which describes how the function acts in the neighborhood of this point.
Definition 1.3.
A fixed point $x^*$ of a function $f$ is stable if and only if the sequence $x_{k+1} = f(x_k)$ converges to $x^*$ when $x_0 = x^* + \epsilon$ for arbitrarily small $\epsilon$. Conversely, $x^*$ is unstable if and only if the sequence diverges from $x^*$ with small initial offset $\epsilon$.
The stability of fixed points of a function can be identified by the magnitude of the derivative at those points [1].
| Classification | Condition |
|---|---|
| unstable | $\lvert f'(x^*) \rvert > 1$ |
| neutral | $\lvert f'(x^*) \rvert = 1$ |
| stable | $\lvert f'(x^*) \rvert < 1$ |
| superstable | $f'(x^*) = 0$ |
When a fixed point is “superstable,” it is said that the algorithm “converges (at least) quadratically” because the error in the approximation $x_{k+1}$ must be some quadratic function of the error in $x_k$.
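To illustrate these classifications concretely, here is a short Python snippet (our own illustration, reusing the function from Example 1.2): iterates started near the superstable fixed point $0$ of $f(x) = x^2$ collapse quadratically, while iterates started near the unstable fixed point $1$ drift away.

```python
def iterate(f, x, steps):
    """Repeatedly apply f, recording each iterate."""
    out = [x]
    for _ in range(steps):
        x = f(x)
        out.append(x)
    return out

f = lambda x: x * x
print(iterate(f, 0.5, 4))   # [0.5, 0.25, 0.0625, 0.00390625, ...] -> 0 (superstable)
print(iterate(f, 1.01, 4))  # [1.01, 1.0201, 1.0406..., ...] -> drifts away from 1 (unstable)
```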
2 Motivation
In the Newton-Raphson method, we see an example of how fixed-point iteration can be used to home in on an estimated value for the root of a given integer. Suppose that we begin the process in a slightly different way.
For clarity of this example, suppose we are trying to find an estimated value of $\sqrt{7}$. Rather than defining the function $f(x) = x^2 - 7$ and trying to find the positive root, we simply consider the equation $x^2 = 7$ for some $x > 0$. Ideally, we could divide each side of the equation by $x$ to get $x = 7/x$ and iterate as $x_{k+1} = 7/x_k$ with arbitrary $x_0 \neq 0$ to get closer and closer to $\sqrt{7}$ as $k \to \infty$. There is, however, a fatal flaw with this approach: the fixed point of $x^* = \sqrt{7}$ has only neutral stability, so the value does not converge as we might hope. Instead, it can be easily seen that any nonzero starting value of $x_0$ will simply yield values alternating between $x_0$ and $7/x_0$.
This flaw can be remedied if we offset $x$ by some constant value. Suppose that we rewrite $\sqrt{7} = x - 2$. (As for where the 2 and the negative sign come from, that will be explained shortly.) Then, we have the initial equation $(x - 2)^2 = 7$, which rearranges to $x^2 = 4x + 3$, and can turn this into the fixed-point iteration

$$x_{k+1} = 4 + \frac{3}{x_k}$$

with fixed point $x^* = 2 + \sqrt{7}$.
Taking the derivative of the right-hand side with respect to $x$ at the fixed point of $x^* = 2 + \sqrt{7}$ gives

$$\left. \frac{d}{dx} \left( 4 + \frac{3}{x} \right) \right|_{x = 2 + \sqrt{7}} = -\frac{3}{(2 + \sqrt{7})^2} \approx -0.139.$$
The absolute value of the derivative at the fixed point is less than 1, so it is stable! This means that repeatedly iterating on the value of $x$ will cause it to tend towards $2 + \sqrt{7}$, at which point we may subtract $2$ to recover our approximation for $\sqrt{7}$ as desired.
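As a quick numerical check (our own illustration; the starting value and iteration count are arbitrary choices), a few lines of Python confirm this convergence:

```python
x = 1.0            # any nonzero starting value works
for _ in range(10):
    x = 4 + 3 / x  # the iteration x_{k+1} = 4 + 3/x_k
print(x - 2)       # prints ~2.6457513, matching sqrt(7) = 2.6457513...
```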
This is where the earlier rewriting of $\sqrt{7} = x - 2$ is important. Had we chosen a different constant or sign, it is entirely possible that the fixed point would not have been stable, ruining the results. In the following section, we generalize our results and make clear exactly how to choose this constant to make the algorithm work (and optimize its rate of convergence).
3 Derivation of general algorithm
First, we want to generalize to arbitrary input $n$, not just the number 7 used in the previous example. (Note that each of the examples given in this paper uses integer $n$, but there is no limitation that prevents calculating $\sqrt{7.3}$, for example. The main drawback is simply that it is more difficult to calculate by hand and of less interest to the general reader.) Start with the quadratic equation
$$(x - a)^2 = n \tag{1}$$
where $a$ is our “offset” value and we can solve to get $x = a \pm \sqrt{n}$. Next, rewrite equation (1) in expanded form as

$$x^2 - 2ax + a^2 = n$$
and perform algebraic manipulation to obtain

$$x = 2a + \frac{n - a^2}{x},$$
which can be easily transformed into the fixed-point iteration of
$$x_{k+1} = 2a + \frac{n - a^2}{x_k}. \tag{2}$$
From here, it is easily verified that the two fixed points for $x$ are given by $x^* = a + \sqrt{n}$ and $x^* = a - \sqrt{n}$. Determining which of these is stable is fundamental in knowing how to proceed.
We define $g(x) = 2a + \frac{n - a^2}{x}$ and take the absolute value of the derivative to get

$$\lvert g'(x) \rvert = \left\lvert \frac{a^2 - n}{x^2} \right\rvert = \frac{\lvert \sqrt{n} - a \rvert \cdot \lvert \sqrt{n} + a \rvert}{x^2}.$$
We first check the fixed point of $x^* = a + \sqrt{n}$. Substituting yields

$$\lvert g'(a + \sqrt{n}) \rvert = \frac{\lvert \sqrt{n} - a \rvert \cdot \lvert \sqrt{n} + a \rvert}{(a + \sqrt{n})^2} = \frac{\lvert \sqrt{n} - a \rvert}{\lvert \sqrt{n} + a \rvert}.$$
Consider the case in which we have picked $a$ such that $a < -\sqrt{n}$. We can rewrite $a = -\sqrt{n} - \epsilon$ for some $\epsilon > 0$. Then, we can simplify to

$$\lvert g'(x^*) \rvert = \frac{\lvert \sqrt{n} - (-\sqrt{n} - \epsilon) \rvert}{\lvert \sqrt{n} + (-\sqrt{n} - \epsilon) \rvert} = \frac{2\sqrt{n} + \epsilon}{\epsilon} > 1.$$
In this case, then, the fixed point is unstable. Consider instead the case of $-\sqrt{n} < a < 0$. We rewrite $a = -\sqrt{n} + \epsilon$ for some $0 < \epsilon < \sqrt{n}$. Then, we can simplify to

$$\lvert g'(x^*) \rvert = \frac{2\sqrt{n} - \epsilon}{\epsilon} > 1,$$
where the final inequality holds true because of the bounds on $\epsilon$. This is unstable once more.
Next, let us consider the case of $a > \sqrt{n}$. Rewriting $a = \sqrt{n} + \epsilon$ for some value $\epsilon > 0$ gives

$$\lvert g'(x^*) \rvert = \frac{\epsilon}{2\sqrt{n} + \epsilon} < 1.$$

Unlike the first two cases, this shows that the fixed point is stable for such a selection of $a$.
Fourth, we consider the case of $0 < a < \sqrt{n}$. Rewriting $a = \sqrt{n} - \epsilon$ for some value $0 < \epsilon < \sqrt{n}$ gives

$$\lvert g'(x^*) \rvert = \frac{\epsilon}{2\sqrt{n} - \epsilon} < 1.$$
Once again, the fixed point is stable.
The last inequality here may not be immediately clear, but it can be observed that the value of the denominator lies within the open interval from $2\sqrt{n}$ (which is greater in magnitude than any possible value of $\epsilon$) down to $\sqrt{n}$, at which endpoint the numerator and denominator would be equal in magnitude.
Now, we address the few remaining special cases. When we have the exact equality $a = -\sqrt{n}$, we see that $\lvert g'(x^*) \rvert = \frac{\lvert \sqrt{n} - a \rvert}{\lvert \sqrt{n} + a \rvert}$ grows without bound as the denominator vanishes, making it unstable. When $a = 0$, we have that $\lvert g'(x^*) \rvert = 1$, indicating a neutral stability (the same situation described as a “flaw” in the beginning of section 2). Finally, when we have $a = \sqrt{n}$, we see $\lvert g'(x^*) \rvert = 0$, making the point superstable.
Nearly identical calculations reveal that whenever the point $x^* = a + \sqrt{n}$ is stable, the point $x^* = a - \sqrt{n}$ is unstable, and vice versa. A summary table of all the stability results above for the different possible values of $a$ is provided below:
| Value of $a$ | $x^* = a + \sqrt{n}$ | $x^* = a - \sqrt{n}$ |
|---|---|---|
| $a < -\sqrt{n}$ | unstable | stable |
| $a = -\sqrt{n}$ | unstable | superstable |
| $-\sqrt{n} < a < 0$ | unstable | stable |
| $a = 0$ | neutral | neutral |
| $0 < a < \sqrt{n}$ | stable | unstable |
| $a = \sqrt{n}$ | superstable | unstable |
| $a > \sqrt{n}$ | stable | unstable |
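As a numerical sanity check of this table (our own illustration; the choice of $n = 7$ and the four sample offsets are arbitrary), we can evaluate $\lvert g'(x^*) \rvert$ at both fixed points:

```python
import math

def g_prime(n, a, x):
    # derivative of g(x) = 2a + (n - a^2)/x
    return (a * a - n) / (x * x)

n = 7
for a in [-4, -1, 1, 4]:  # one sample from each open interval in the table
    for fp in (a + math.sqrt(n), a - math.sqrt(n)):
        label = "stable" if abs(g_prime(n, a, fp)) < 1 else "unstable"
        print(f"a = {a:>2}, fixed point {fp:+.4f}: |g'| = {abs(g_prime(n, a, fp)):.4f} ({label})")
```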
Essentially, the key takeaway is that there is a kind of symmetry about $a = 0$ with respect to the stability of these fixed points. To make things simpler, we can choose to consider strictly positive $a$ and deal only with $x^* = a + \sqrt{n}$, or we may choose to consider only negative $a$ and the corresponding $x^* = a - \sqrt{n}$; either is equally valid. In this analysis, we opt for the former.
3.1 How to put the square root algorithm into practice
A quick glance at the summary table provides instructions on how to select the constant $a$: choose a positive value as close to $\sqrt{n}$ as possible to maximize the convergence rate. (Estimating $\sqrt{n}$ is what we seek to do with this algorithm, so coming up with a good initial guess is not necessarily an easy task. It can be noted, however, that for an input integer $n$ with $d$ digits, the square root of $n$ will have approximately $d/2$ digits, which can be used to rapidly generate a ballpark figure.)
Once we have chosen an appropriate value of $a$, we want to choose some initial value $x_0$ for our iteration. Fortunately, the convergence does not depend on this value, except that the denominator of (2) must never be zero. To avoid this, set $x_0$ to some number not equal to $0$, and if by chance a division by zero does occur, simply restart the process with a new value of $x_0$. (Ideally, $x_0 = a + \sqrt{n}$ will provide the best accuracy in the fewest iterations. We can use the same trick as before to estimate $\sqrt{n}$ here if desired, or simply note that $a \approx \sqrt{n}$ gives the estimate $x^* = a + \sqrt{n} \approx 2a$ and set $x_0 = 2a$.)
After values of $a$ and $x_0$ are both chosen, iterate with equation (2) until achieving the desired level of accuracy (in terms of unchanging decimal places, etc.).
To finish the process, calculate $x_k - a$ to get an estimate for $\sqrt{n}$.
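A minimal Python sketch of this procedure follows (our own illustration; the function name, tolerance, iteration cap, and restart value are our own choices):

```python
def approx_sqrt(n, a, x0=None, tol=1e-12, max_iter=1000):
    """Estimate sqrt(n) via the fixed-point iteration (2): x_{k+1} = 2a + (n - a^2)/x_k."""
    x = 2 * a if x0 is None else x0  # x0 = 2a approximates the fixed point a + sqrt(n)
    for _ in range(max_iter):
        if x == 0:
            x = 1.0  # restart from a different nonzero value, as suggested above
            continue
        x_new = 2 * a + (n - a * a) / x
        if abs(x_new - x) < tol:     # stop once successive iterates agree
            return x_new - a         # subtract the offset a to recover sqrt(n)
        x = x_new
    return x - a

print(approx_sqrt(26, a=5))  # ~5.0990195135927845
```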
3.2 Example of square-root application
Suppose we wish to calculate an approximation of $\sqrt{26}$. Since $25 < 26 < 36$, it stands to reason that $5 < \sqrt{26} < 6$, so we will choose the value $a = 5$ as our guess. Since we have a fairly good estimate for $\sqrt{26}$, it would make sense to choose $x_0 = 2a = 10$. For the purpose of demonstrating the convergence speed even in non-optimal cases, however, we will choose $x_0 = 1$ here instead. Now, we use the recursive formula

$$x_{k+1} = 10 + \frac{1}{x_k}$$

to calculate up to our desired level of precision. Afterwards, we will evaluate $x_k - 5$ to obtain an approximation of $\sqrt{26}$. The results of these calculations are provided below:

| $k$ | $x_k$ |
|---|---|
| 0 | 1 |
| 1 | 11 |
| 2 | 10.0909091 |
| 3 | 10.0990991 |
| 4 | 10.0990187 |
| 5 | 10.0990195 |

Finally, we conclude by computing $x_5 - 5 \approx 5.0990195$, to be compared against the desired value of $\sqrt{26} = 5.0990195\ldots$

Interestingly, this specific process produces the continued fraction

$$5 + \sqrt{26} = 10 + \cfrac{1}{10 + \cfrac{1}{10 + \cfrac{1}{10 + \cdots}}}$$

that clearly demonstrates why the simple continued fraction expansion of $\sqrt{26}$ is $[5; 10, 10, 10, \ldots]$.
3.3 Attempting to expand to $p$th roots
It is clear how to extend this reasoning to more than simply square roots. To estimate a power $n^{1/p}$, we start with the analogous form of equation (1):
$$(x - a)^p = n \tag{3}$$
To make analysis easier, we can swap the sign of $a$ to give

$$(x + a)^p = n.$$
Binomial expansion of the left equality yields

$$a^p + \binom{p}{1} a^{p-1} x + \binom{p}{2} a^{p-2} x^2 + \cdots + x^p = n.$$

All terms, except for the first, in the binomial expansion have at least one factor of $x$. Then, using the same algebraic manipulations as before, it is trivial to find that

$$\frac{n - a^p}{x} = \binom{p}{1} a^{p-1} + \binom{p}{2} a^{p-2} x + \cdots + x^{p-1}.$$
Cross-multiplying, it follows that the generalized form of (2) is given by
$$x_{k+1} = \frac{n - a^p}{\displaystyle \sum_{i=1}^{p} \binom{p}{i} a^{p-i} x_k^{\,i-1}} \tag{4}$$
where the fixed point for $x$ satisfies

$$x^* = \sqrt[p]{n} - a.$$
The equations (1) and (2) can be readily recovered by setting $p = 2$ and replacing $a$ with $-a$ in formulas (3) and (4) and simplifying.
Setting $p = 3$, on the other hand, results in the equation

$$x_{k+1} = \frac{n - a^3}{3a^2 + 3a x_k + x_k^2}.$$
Note that we can choose to avoid doing this simplification and just use equation (4) directly, which is especially helpful for large $p$.
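The following Python sketch (our own; the fixed iteration count is arbitrary and no guard against a zero denominator is included) implements equation (4) directly using binomial coefficients:

```python
from math import comb

def approx_root(n, p, a, x0=1.0, iters=200):
    """Iterate equation (4); the fixed point satisfies x* = n**(1/p) - a."""
    x = x0
    for _ in range(iters):
        denom = sum(comb(p, i) * a ** (p - i) * x ** (i - 1) for i in range(1, p + 1))
        x = (n - a ** p) / denom
    return x + a  # add back the (sign-swapped) offset to recover n**(1/p)

print(approx_root(29, p=3, a=2))  # ~3.0723, the cube root of 29
```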
However, trying this equation with test values may reveal a disappointing truth: the equation is not as versatile as the square root special case. As an exercise, attempt to calculate $\sqrt[3]{29}$ with the values $a = 1$ or $a = -4$. They don’t converge to the correct value! On the other hand, testing with $a = 2$ or $a = -5$ does produce the correct value of about $3.0723$. To understand why this occurs, we must return to the idea of fixed point stability.
In the quadratic special case, we found the fixed points of the algorithm as dictated by the $n$ and $a$ inputs. These were of the form $x^* = a + \sqrt{n}$ and $x^* = a - \sqrt{n}$. Now, in the higher-degree case, we have performed a sign swap, so these fixed points are of the form $x^* = \sqrt[p]{n} - a$ (with a $\pm$ on the radical term in the case of even $p$). Then, to analyze the fixed point stability, we differentiate the right-hand side of (4) with respect to $x$ at the fixed point(s). Finding where the magnitude of the derivative crosses one, we will see for which values of $a$ the fixed point(s) switch from being stable to unstable. In the example with $\sqrt[3]{29}$, this yields values of $a \approx -4.197$ and $a \approx 1.125$, where the algorithm fails to converge to the correct value within the interval but succeeds outside the interval. (Solving the resulting quadratic equation in $a$ analytically gives $a = \frac{(-1 \pm \sqrt{3})\sqrt[3]{n}}{2}$, which supports the aforementioned empirical values.)
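This crossing can be checked numerically. In the sketch below (our own; the closed form follows from differentiating the right-hand side of (4) and simplifying at the fixed point, and the helper name is our own), the magnitude of the derivative crosses $1$ between $a = 1.12$ and $a = 1.13$ for $n = 29$, $p = 3$:

```python
def deriv_at_fixed_point(n, p, a):
    # For g equal to the right-hand side of (4), one can show
    # g'(x*) = 1 - p * r^(p-1) * (r - a) / (r^p - a^p), where r = n^(1/p).
    r = n ** (1.0 / p)
    return 1 - p * r ** (p - 1) * (r - a) / (r ** p - a ** p)

for a in [1.0, 1.12, 1.13, 2.0]:
    print(a, abs(deriv_at_fixed_point(29, 3, a)))
# |g'| > 1 (unstable) for a = 1.0 and 1.12; |g'| < 1 (stable) for a = 1.13 and 2.0
```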
The cases of even higher degree work even less frequently. In computer-based testing, it seems that the algorithm works, but only for values of $a$ inside tight intervals. This is a stark contrast with the quadratic case, which required only that $a$ have a particular sign.
In the seventh-degree case, for example, the least positive value of $a$ for which calculating $\sqrt[7]{n}$ seems to work varies with $n$. Guessing that the seventh powers of these lower bounds are related to their corresponding values of $n$ (analogous to the cubic example), we see that the ratio $n / a^7 \approx 17.704$ in each case tested, which allows the lower bound for a new input $n$ to be predicted and then verified via computer. Evidently, this trend is real, although the exact mechanism by which the number 17.704 can be generated remains elusive.
Unfortunately, there does not seem to be a clear path to a general analytical solution to this problem of determining the interval(s) of validity. It should be noted that whereas there was no upper bound on the valid range of $a$ values for the cubic case, the same does not seem to hold true for higher-degree examples. Fortunately, computer testing suggests that even at high degrees, there are at least some values of $a$ for which this process works, so a solution could recover some of the usefulness of this technique.
4 Comparison to existing methods
There are two connections we will draw. Namely, we will discuss connections to continued fractions and the Newton-Raphson method mentioned before.
We recall that the special case of $p = 2$ in equation (4) results in (2):

$$x_{k+1} = 2a + \frac{n - a^2}{x_k}.$$
In this case, we see that the recurrence relation allows for an infinite generalized continued fraction to be constructed [2]:

$$a + \sqrt{n} = 2a + \cfrac{n - a^2}{2a + \cfrac{n - a^2}{2a + \cfrac{n - a^2}{2a + \cdots}}}$$
In 3.2, the numerators $n - a^2 = 26 - 5^2$ were each 1, making it a “simple” continued fraction.
Next, the Newton-Raphson method is another technique used for approximating roots of positive real numbers, and it also relies on fixed-point iteration in its calculations.
Suppose we want to find an approximation for $\sqrt{n}$. We know that we can achieve this by searching for the positive root of $f(x) = x^2 - n$. By setting $x_0$ to an initial guess and performing successive linear approximations with
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} = x_k - \frac{x_k^2 - n}{2 x_k} \tag{5}$$
we will have $x_k$ converge to the desired value. (This is equivalent to the Babylonian method $x_{k+1} = \frac{1}{2}\left(x_k + \frac{n}{x_k}\right)$, after algebraic simplification.)
In fact, we see that at the fixed point $x^* = \sqrt{n}$, the derivative of the right-hand side with respect to $x$ is zero. This indicates that the Newton-Raphson method converges at least quadratically (the number of accurate decimal places roughly doubles in each iteration), which is better than the method explored in this paper, which only converges linearly (since the derivative is generally nonzero, as discussed in the section on stability). This can be explained by the theory of Taylor expansion [3].
Thus, the Newton-Raphson method and higher-order analogues like Halley’s method experience a much faster rate of convergence, with the trade-off being that elementary differentiation of the polynomial is necessary for their computation. Since this polynomial is always of the form $f(x) = x^p - n$ with degree $p$ and constant $n$, we know that $f'(x_k)$ in the right-hand side of (5) will always be equal to $p x_k^{p-1}$ by the power rule. For Halley’s method and higher-order root-finding algorithms, the higher-order derivatives can be similarly computed without much effort.
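For comparison with the earlier sketches, here is a minimal Python version of this Newton-Raphson iteration for $p$th roots (our own illustration; the fixed iteration count is arbitrary):

```python
def newton_root(n, p, x0, iters=8):
    """Newton-Raphson on f(x) = x^p - n, using f'(x) = p*x^(p-1)."""
    x = x0
    for _ in range(iters):
        x = x - (x ** p - n) / (p * x ** (p - 1))
    return x

print(newton_root(26, 2, x0=5.0))  # ~5.0990195, with roughly twice as many
                                   # correct digits after each iteration
```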
References
- [1] Harold V. McIntosh, “Fixed points and their stability”, 2001. URL: http://delta.cs.cinvestav.mx/~mcintosh/comun/complex/node20.html
- [2] Robert Israel, “Sequence Converging to the Square Root of an Integer”, 2019. URL: https://math.stackexchange.com/a/3075334
- [3] Simply Beautiful Art, “In practice, what does it mean for the Newton’s method to converge quadratically (when it converges)?”, 2020. URL: https://math.stackexchange.com/a/3893355