
Root finding with interval arithmetic

Walter F. Mascarenhas Departamento de Computação, IME
Universidade de São Paulo, Brazil
Abstract

We consider the solution of nonlinear equations in one real variable, the problem usually called root finding. Although this is an old problem, we believe that some aspects of its solution using interval arithmetic are not well understood, and we present our views on this subject. We argue that problems with just one variable are much simpler than problems with more variables, and that we should use specific methods for them. We provide an implementation of our ideas in C++, and make this code available under the Mozilla Public License 2.0.

1 Introduction

Many books and articles explain how to solve the nonlinear equation $f(x) = 0$ for $x \in \mathbb{R}$, and some of them consider the verified solution of such equations, that is, finding solutions with rigorous bounds on them. Here we discuss the computation of verified solutions using interval arithmetic. This is an old subject, and since its very beginning the interval arithmetic literature has been mostly concerned with the solution of the general equation $f(x) = 0$ for $x \in \mathbb{R}^n$ using variations of Newton's method [2, 10, 11]. For instance, the third chapter of [13] gives a nice description of the interval Newton's method and [2, 3] present interesting ways to improve it. Here we focus on the simplest case $n = 1$, because it is very important in practice and there are techniques which are applicable only for $n = 1$. As far as we know, the most detailed discussion of this particular case is presented in chapter 9 of [4], but we believe that there is more to be said about this subject than what is presented there.

In this article, $f$ is a function from $\mathbb{R}$ to $\mathbb{R}$. We denote $f$'s $\ell$th derivative by $f^{(\ell)}$, and when we mention such derivatives we assume that they exist. We consider the class

$\mathbb{I} := \{\, \mathbb{x} = [\underline{x}, \overline{x}], \ \mathrm{with}\ -\infty \leq \underline{x} \leq \overline{x} \leq +\infty \,\} \cup \{\emptyset\}$

of all closed intervals, including the empty and unbounded ones. For $\mathcal{S} \subset \mathbb{R}$, we write

$f(\mathcal{S}) := \{\, f(x)\ \mathrm{for}\ x \in \mathcal{S} \,\}.$

We assume that we have extensions of $f$ and its derivatives, in the following sense:

Definition 1

We say that a function $\mathbb{f} : \mathbb{I} \rightarrow \mathbb{I}$ is an extension of a function $f : \mathbb{R} \rightarrow \mathbb{R}$ in an interval $\mathbb{x}$ if for every $\mathbb{x}' \subset \mathbb{x}$ we have that $f(\mathbb{x}') \subset \mathbb{f}(\mathbb{x}')$. $\blacktriangle$

We usually identify the point $t \in \mathbb{R}$ with the interval $[t, t]$, and for functions $\mathbb{f} : \mathbb{I} \rightarrow \mathbb{I}$ we write $\mathbb{f}(t) := \mathbb{f}([t, t])$. The set of roots is denoted by $\mathcal{R}$, and $r$ is a generic root.
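To make Definition 1 concrete, the following C++ sketch shows a bare-bones interval type and a natural extension of $f(x) = x^2 - 2$. The type Ival and the widen helper are our own illustration, not the API of any particular library; outward rounding is simulated with std::nextafter instead of switching the rounding mode, which a production implementation would do instead.

#include <algorithm>
#include <cmath>

// Illustrative interval type; not the API of any particular library.
struct Ival { double lo, hi; };

// Crude outward rounding: one ulp in each direction. A production
// implementation would control the FPU rounding mode instead.
Ival widen(double lo, double hi) {
    return { std::nextafter(lo, -INFINITY), std::nextafter(hi, INFINITY) };
}

Ival operator+(Ival a, Ival b) { return widen(a.lo + b.lo, a.hi + b.hi); }
Ival operator-(Ival a, double c) { return widen(a.lo - c, a.hi - c); }

Ival operator*(Ival a, Ival b) {
    double p[] = { a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi };
    return widen(*std::min_element(p, p + 4), *std::max_element(p, p + 4));
}

// A natural extension of f(x) = x^2 - 2: for every x' contained in x,
// f(x') is contained in F(x'), as Definition 1 requires. (The generic
// product enclosure is not the tightest possible square, but containment,
// which is all Definition 1 asks for, still holds.)
Ival F(Ival x) { return x * x - 2.0; }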

When $f$ is differentiable, the main tool for root finding is Newton's method

$x_{k+1} = x_k - f(x_k)/f'(x_k).$  (1)

This method has a natural extension to interval arithmetic:

$\mathbb{x}_{k+1} = \mathbb{x}_k \cap \left(\, t_k - \mathbb{f}(t_k) / \mathbb{f}'(\mathbb{x}_k) \,\right),$  (2)

where $t_k$ is a point in $\mathbb{x}_k$ and $\mathbb{f}$ and $\mathbb{f}'$ are extensions of $f$ and $f'$. Traditionally, we compute $t_k$ by rounding the midpoint of $\mathbb{x}_k$. The first question an alert numerical analyst would ask about Equation (2) is:

What should we do when $\mathbb{f}'(\mathbb{x}_k)$ contains 0?  (3)

However, we see few people asking the following question:

What should we do when $\mathbb{f}'(\mathbb{x}_k)$ does not contain 0?  (4)

Both questions are quite relevant, but before we answer them we must say that the interval version of Newton's method may be implemented without Equation (2). Instead, we could write $f$ in the centered form

$f(t) \in \mathbb{f}(t_k) + \mathbb{s}_{t_k}(\mathbb{x}_k)\,(\mathbb{x}_k - t_k) \quad \mathrm{for}\ t \in \mathbb{x}_k,$  (5)

and replace the extended derivative $\mathbb{f}'(\mathbb{x}_k)$ by an extension $\mathbb{s}_{t_k}(\mathbb{x}_k)$ of the slope $s_{t_k}$ of $f$ at $t_k$. This leads to an improved version of Newton's method given by

$\mathbb{x}_{k+1} = \mathbb{x}_k \cap \left(\, t_k - \mathbb{f}(t_k) / \mathbb{s}_{t_k}(\mathbb{x}_k) \,\right).$  (6)

The slope is defined as

$s_c(t) := s_{f,c}(t) := \begin{cases} f'(c) & \mathrm{if}\ t = c, \\ \dfrac{f(t) - f(c)}{t - c} & \mathrm{if}\ t \neq c, \end{cases}$  (7)

and centered forms are already mentioned in Moore's book [10]. They have been discussed in detail in the interval arithmetic literature [7, 12] and have generalizations called Taylor forms or Taylor models [9]. Our algorithm uses only the plain centered form (5), and in situations in which practical implementations of such generalizations are available we would use them only as a tool to obtain more accurate centered forms.
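For a polynomial $p$, the slope in (7) is itself a polynomial that synthetic division produces for free: $p(t) = (t - c)q(t) + p(c)$ implies $s_c(t) = q(t)$. The sketch below, in plain double arithmetic for clarity, is our own illustration; an interval code would run the same recurrence in interval arithmetic to obtain an extension of $s_c$.

#include <vector>

// Synthetic division of p(t) = a[0] + a[1] t + ... + a[n] t^n by (t - c):
// returns q with p(t) = (t - c) q(t) + p(c), so that q(t) = s_c(t).
std::vector<double> slope_poly(const std::vector<double>& a, double c) {
    int n = (int)a.size() - 1;
    std::vector<double> q(n);            // q has degree n - 1
    q[n - 1] = a[n];
    for (int k = n - 2; k >= 0; --k)
        q[k] = a[k + 1] + c * q[k + 1];  // Horner-like recurrence
    return q;                            // note: a[0] + c * q[0] equals p(c)
}

For $p(t) = t^2$, that is, $a = \{0, 0, 1\}$, and $c = 0$ this returns $q = \{0, 1\}$, i.e. $s_0(t) = t$, which is exactly the example discussed below.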

The Mean Value Theorem shows that

$t \leq c \Rightarrow s_c(t) \in f'([t, c]) \quad \mathrm{and} \quad t \geq c \Rightarrow s_c(t) \in f'([c, t]),$

and this implies that any extension of $f'$ is an extension of $s_{t_k}$, but there may be better ones, especially when the interval $\mathbb{x}_k$ is not small. For instance, if $f(t) := t^2$ then

$s_0(t) = t \quad \mathrm{and} \quad f'(t) = 2t,$

and any extension of $f'$ yields intervals twice as large as necessary for a slope.

In practice the gain from replacing derivatives by slopes may not be great, because usually $f'(r) \neq 0$ at the roots $r$, and in this case, for $t_k$ close to $r$, the difference between $s_{t_k}$ and $f'(t)$ for $t$ in a short interval $\mathbb{x}_k$ is not large. Moreover, extensions of derivatives have an important feature that extensions of slopes do not have: derivative extensions can detect monotonicity. In other words, if $\mathbb{f}'(\mathbb{x}) \subset (0, +\infty)$ then $f$ is increasing in $\mathbb{x}$, but we cannot reach the same conclusion from the fact that $\mathbb{s}_{t_k}(\mathbb{x}) \subset (0, +\infty)$. The ease with which we can use information about monotonicity when we have only one variable is what makes the case $n = 1$ so special, and it is the reason why this article was written in the first place.

We can ask questions similar to (3) and (4) about the modified Newton step (6):

What should we do when $\mathbb{s}_{t_k}(\mathbb{x}_k)$ contains 0?  (8)
What should we do when $\mathbb{s}_{t_k}(\mathbb{x}_k)$ does not contain 0?  (9)

and there are at least two more questions we must ask:

What information about the roots should our algorithm provide?  (10)
When should we stop the iterations?  (11)

After much thought about the questions above, we devised a root finding algorithm which combines the interval versions of Newton's method in Equations (2) and (6), and modifies them in order to exploit monotonicity. Our algorithm tries to strike a balance between theory and practice, and takes into account the following practical issues:

  1. Usually, the evaluation of interval extensions over intervals is much less accurate than the evaluation of interval extensions at points, that is, the width of the interval $\mathbb{f}(t_k)$ is much smaller than the width of the interval $\mathbb{f}(\mathbb{x}_k)$. The same applies to the derivative $\mathbb{f}'$.

  2. Usually, the floating point evaluation of $d_k = f'(t_k)$ yields a reasonable estimate of this derivative, at a much lower cost than the evaluation of the extensions $\mathbb{f}'(t_k)$ or $\mathbb{f}'(\mathbb{x}_k)$. The only defect of $d_k$ is that it does not come with a guarantee of its accuracy. As a result we can, and should, use floating point computations to obtain reasonable estimates (which may turn out to be inaccurate sometimes) and resort to interval computations only when we absolutely need guarantees about our results. In particular, the computed $\mathbb{f}'(\mathbb{x}_k)$ may be very wide even when the floating point $d_k = f'(t_k)$ would lead to a good Newton step $t_{k+1} = t_k - \hat{w}_k/d_k$, where $\hat{w}_k$ is the midpoint of $\mathbb{w}_k = \mathbb{f}(t_k)$.

  3. In interval arithmetic, our goal is to find short intervals $\mathbb{r}$ which may contain roots and to discard intervals guaranteed not to contain roots.

  4. The simplest way to ensure that an interval $\mathbb{r} = [\underline{r}, \overline{r}]$ contains a root is to prove that $f(\underline{r})$ and $f(\overline{r})$ have opposite signs. We can do that by evaluating $\mathbb{f}$ at the points $\underline{r}$ and $\overline{r}$, and as we noted above this evaluation tends to yield sharp results (see the sketch after this list).

  5. The simplest way to ensure that an interval $\mathbb{r}$ does not contain a root is to prove that $f(\underline{r})$ and $f(\overline{r})$ are different from zero and have the same sign and that $f'(\mathbb{r})$ does not contain 0. In practice, this is not as easy as in the previous item because the computed $\mathbb{f}'(\mathbb{r})$ tends to be inflated by the usual weaknesses of interval arithmetic. However, when we know that $f$ is monotonic we can dispense with $\mathbb{f}'(\mathbb{r})$ and check only the values of $f$ at the endpoints. This simplifies things immensely for monotonic functions. Actually, it makes little practical sense to compute $\mathbb{f}'(\mathbb{x}_k)$ once we already know that it does not contain 0, and we have this information for all nodes of the branch and bound tree below the first node $\mathbb{x}_k$ for which $\mathbb{f}'(\mathbb{x}_k)$ does not contain 0.

  6. Multiple roots, i.e., roots $r$ such that $f'(r) = 0$, are a major problem and should be handled differently from simple roots. Usually, it is hopeless to try to get them with the same accuracy as simple roots, and the cases in which we have some hope are too specific to deserve attention in generic software.
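Items 4 and 5 reduce to certified sign computations at points. A minimal sketch, reusing the illustrative Ival type from the introduction (in the actual code these signs are the $\sigma_i$, $\sigma_s$ stored with each node rather than recomputed):

// Certified sign of f at the point t: -1 or +1 only when the enclosure
// F([t,t]) lies strictly on one side of zero, and 0 (unknown) otherwise.
int certified_sign(Ival (*F)(Ival), double t) {
    Ival w = F({t, t});
    if (w.lo > 0) return +1;
    if (w.hi < 0) return -1;
    return 0;                    // enclosure straddles zero: sign unknown
}

// An interval [lo, hi] certainly contains a root when the signs at its
// endpoints are definite and opposite (item 4 above).
bool certified_root(Ival (*F)(Ival), double lo, double hi) {
    return certified_sign(F, lo) * certified_sign(F, hi) == -1;
}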

Combining the items above, we propose an algorithm which can be outlined as follows. For each candidate interval we keep seven additional pieces of information:

  • The sign $\sigma_i \in \{-1, 0, 1\}$ of $f$ at its infimum;

  • The sign $\sigma_s \in \{-1, 0, 1\}$ of $f$ at its supremum;

  • The sign $\sigma_d \in \{-1, 0, 1\}$ of $f'$ in $\mathbb{x}$;

  • A point $t$ in its interior, indicating where it should be split in the Newton step;

  • A point $\tilde{t}$, which lies near $\mathbb{x}$;

  • The value $\tilde{d} = f'(\tilde{t})$;

  • The expected sign $\sigma_t$ of $f(t)$.

Regarding the signs above, $-1$ means definitely negative, $+1$ means definitely positive, and $0$ means that we do not know the corresponding sign.

We then proceed in the usual branch and bound way:

  1. If the stack is empty then we are done. Otherwise we pop a tuple

     $(\mathbb{x}_k, \sigma_i, \sigma_s, \sigma_d, t_k, \tilde{t}, \tilde{d}, \sigma_t)$

     from the stack.

  2. If the width of $\mathbb{x}_k$ is below a given tolerance $\tau_x$ then we insert $(\mathbb{x}_k, \sigma_i, \sigma_s, \sigma_d)$ in a list of possible solutions and go to item 1.

  3. We compute $\mathbb{f}(\mathbb{x}_k)$. If it does not contain 0, then we drop $\mathbb{x}_k$ and go to item 1. (Tolerances are discussed in Section 3.)

  4. We compute the slope $\mathbb{s}_k := \mathbb{s}_{t_k}(\mathbb{x}_k)$. If it does not contain 0 then we compute the derivative $\mathbb{d}_k := \mathbb{f}'(\mathbb{x}_k)$. If $\mathbb{d}_k$ does not contain 0 then we change to the specific algorithm for strictly monotonic functions described in Section 5, and once we are done we go to item 1. Otherwise, we replace $\mathbb{s}_k$ by $\mathbb{s}_k \cap \mathbb{d}_k$ and continue.

  5. We compute $\mathbb{f}(t_k)$ and check whether it contains zero or is too close to zero. If it does then we use the algorithm described in Section 3 to handle $\mathbb{x}_k$ and go to item 1.

  6. If $\mathbb{f}(t_k)$ is not too close to zero then we apply the version of the interval Newton step described in Section 2, obtaining at most two intervals $\mathbb{x}_{k+1}$ and $\mathbb{x}_{k+1}'$. If there is no $\mathbb{x}_{k+1}$ then we drop $\mathbb{x}_k$ and go to item 1.

  7. If there is just one $\mathbb{x}_{k+1}$ then we check whether the sign $\sigma_t$ matches the sign of $\mathbb{f}(t_k)$. If it does not then we set $\sigma_t$ to zero, take $t_{k+1}$ as the midpoint of $\mathbb{x}_{k+1}$, obtain the extra information for $\mathbb{x}_{k+1}$, push it on the stack and go to item 1. Otherwise, we compute $d = f'(t_k)$ using floating point arithmetic and use $\tilde{d}$ and $\tilde{t}$ to check whether the corrected Newton step $t_{k+1}$ described in Section 4 lies in $\mathbb{x}_{k+1}$. If it does then we use this $t_{k+1}$, otherwise we take $t_{k+1}$ as the midpoint of $\mathbb{x}_{k+1}$. We then obtain the additional information for $\mathbb{x}_{k+1}$, push it on the stack and go to item 1.

  8. If there are two $\mathbb{x}_{k+1}$'s, then we check whether the width of $\mathbb{x}_k$ lies below the cluster threshold $\tau_c$. If it does then we deem $\mathbb{x}_k$ to contain a cluster of zeros, and proceed as in item 2. Otherwise we apply the same procedure as in item 7 to $\mathbb{x}_{k+1}$ and $\mathbb{x}_{k+1}'$ and go to item 1.
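In code, the outline above becomes a driver loop over a stack of nodes. The sketch below only fixes the data layout and the control flow of items 1 and 2; all names are ours, and the bodies of items 3-8 are compressed into comments because they are exactly the steps detailed in Sections 2-5.

#include <stack>
#include <vector>

// One branch-and-bound node: the tuple (x, si, ss, sd, t, t~, d~, st)
// from item 1 of the outline. Ival is the illustrative type from above.
struct Node {
    Ival   x;                // candidate interval
    int    si, ss, sd;       // signs of f at inf/sup and of f' on x
    double t;                // split / Newton point
    double t_near, d_near;   // t~ and d~ = f'(t~)
    int    st;               // expected sign of f(t)
};

void solve(std::stack<Node>& stk, std::vector<Node>& out,
           double tau_x, double tau_c) {
    while (!stk.empty()) {                       // item 1
        Node n = stk.top(); stk.pop();
        if (n.x.hi - n.x.lo < tau_x) {           // item 2: short enough
            out.push_back(n);
            continue;
        }
        // item 3: drop n if 0 is not in F(n.x);
        // item 4: slope/derivative tests, possibly switching to the
        //         monotone algorithm of Section 5;
        // items 5-8: Newton step of Section 2, pushing 0, 1 or 2
        //         children onto stk; widths below tau_c are clusters.
    }
}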

We implemented the algorithm outlined above in C++, using our Moore interval arithmetic library [8]. This code is available with the arXiv version of this article, and is distributed under the Mozilla Public License 2.0. Unfortunately, the code is much more involved than the summary above, because it must deal with many technical details which were omitted here in order not to make this article longer and more complex than it already is.

In the rest of the article we discuss in more detail several aspects of the algorithm outlined above. We start with Section 2, in which we present a version of Newton's method for interval arithmetic. This version is similar to the ones found in the literature, but it is slightly different because it ensures that $f$ is different from zero at some of the extreme points of the new intervals, and computes the signs of $f$ at these extremes at a low extra cost. As a result, our algorithm yields not only intervals containing the roots but also the signs of $f$ at the extremes of such intervals, and these signs certify the existence of a root in the interval when they are different from each other. In Section 3 we make some comments about interval arithmetic in general and discuss the thorny subject of tolerances, which are unavoidable for defining stopping criteria for interval root solvers. Section 4 presents yet another version of the classic Newton's method for exact, "point arithmetic." This "point version" is the motivation for the interval version of Newton's method for monotone functions presented in Section 5. Finally, in Section 6 we discuss the important subject of testing.

2 The Interval version of Newton’s step

This section is about the interval version of Newton’s step

$\mathbb{x}_{k+1} = \mathbb{x}_k \cap \left(\, t_k - \mathbb{f}(t_k) / \mathbb{d}_k \,\right),$  (12)

where $\mathbb{d}_k = [\underline{d}_k, \overline{d}_k]$ can be either the derivative $\mathbb{f}'(\mathbb{x}_k)$ or the slope $\mathbb{s}_{t_k}(\mathbb{x}_k)$. Here we make the simplifying assumption that the interval

$\mathbb{w}_k := \mathbb{f}(t_k) := [\underline{w}_k, \overline{w}_k]$

does not contain 0, and we answer questions (3), (4), (8) and (9) from the introduction in this case. By replacing $f$ by $-f$ if necessary, we can assume that $\overline{w}_k < 0$, and we make this simplifying assumption from now on.

The answer to questions (3) and (8), in which case $\mathbb{d}_k$ contains 0, is illustrated in Figure 1. In this figure, the inclined lines have equations

$w = \overline{w}_k + \underline{d}_k (t - t_k) \quad \mathrm{and} \quad w = \overline{w}_k + \overline{d}_k (t - t_k),$  (13)

and by intersecting these lines with the axis $w = 0$ we obtain a better bound on $\mathcal{R} \cap \mathbb{x}_k$ (recall that $\mathcal{R}$ is the set of roots). There are three possibilities for our bounds on $\mathcal{R} \cap \mathbb{x}_k$ after we take into account the intersections above. We may find that:

  • $\mathcal{R} \cap \mathbb{x}_k = \emptyset$. In this case we drop $\mathbb{x}_k$.

  • $\mathcal{R} \cap \mathbb{x}_k$ is contained in a single interval $\mathbb{r} = [\underline{r}, \overline{r}]$, as in the second and third cases in Figure 1. In this case we take $\mathbb{x}_{k+1} = \mathbb{r}$.

  • $\mathcal{R} \cap \mathbb{x}_k$ is contained in the union of two intervals $\mathbb{r}_1$ and $\mathbb{r}_2$ such that $\overline{r}_1 \leq \underline{r}_2$, as in the last case of Figure 1.

Figure 1: The case $0 \in \mathbb{d}_k$, $\overline{w}_k < 0$, with $-\underline{d}_k$ and $\overline{d}_k$ large and small.

There is nothing new in our approach up to this point. We just explained with a picture what most people do algebraically. Our slight improvement comes in the way we compute the intersections in Equation (13). These intersections are

$\overline{r}_1 := t_k - \overline{w}_k / \underline{d}_k \quad \mathrm{and} \quad \underline{r}_2 := t_k - \overline{w}_k / \overline{d}_k,$

and, as usual in the Moore library [8], we propose that $\overline{r}_1$ and $\underline{r}_2$ be computed with the rounding mode upwards, using the following expressions:

$\overline{r}_1 := t_k + \overline{q}_k \quad \mathrm{for} \quad \overline{q}_k := \overline{w}_k / (-\underline{d}_k),$  (14)
$\underline{r}_2 := -u_k \quad \mathrm{for} \quad u_k := \underline{q}_k - t_k \quad \mathrm{and} \quad \underline{q}_k := \overline{w}_k / \overline{d}_k.$  (15)

This order of evaluation ensures that $\overline{r}_1$ and $\underline{r}_2$ are rounded in the correct direction, so that we do not risk losing roots. The expressions in Equations (14) and (15) are so simple that we can find out whether $\overline{r}_1$ and $\underline{r}_2$ were computed exactly by checking whether

$\overline{r}_1 - t_k = \overline{q}_k \quad \mathrm{and} \quad \overline{q}_k\,(-\underline{d}_k) = \overline{w}_k$  (16)

and

$u_k + t_k = \underline{q}_k \quad \mathrm{and} \quad \underline{q}_k \overline{d}_k = \overline{w}_k.$  (17)

If we find that $\overline{r}_1$ was computed exactly then we increase it to the next floating point number, and if $\underline{r}_2$ was computed exactly then we decrease it to the previous floating point number. By doing so, we ensure that $f(\overline{r}_1) < 0$ and $f(\underline{r}_2) < 0$, without computing $\mathbb{f}(\overline{r}_1)$ or $\mathbb{f}(\underline{r}_2)$. We should also mention that it is possible to prove that, even after the rounding, incrementing and decrementing steps above, $\overline{r}_1 \leq t_k \leq \underline{r}_2$, and there is no risk of $\overline{r}_1$ crossing $\underline{r}_2$.
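For concreteness, here is how (14) and (16) might look in portable C++ for $\overline{r}_1$; this is our own sketch, not the Moore library code. std::fesetround and std::nextafter are standard; in the Moore library the rounding mode is already upwards, so the fesetround call disappears. The exactness test mirrors (16), written with the signs of the variables used here.

#include <cfenv>
#include <cmath>

// r1 = t_k - w_hi / d_lo computed with rounding upwards, as in (14).
// If both checks of (16) succeed the quotient was exact, f(r1) could be
// zero, and we bump r1 one ulp upwards to guarantee f(r1) < 0.
double upper_intersection(double tk, double w_hi, double d_lo) {
    std::fesetround(FE_UPWARD);               // every operation rounds up
    double q  = w_hi / (-d_lo);               // q_k in (14), rounded up
    double r1 = tk + q;                       // r1 in (14), rounded up
    bool exact = (r1 - tk == q) && (q * (-d_lo) == w_hi);  // test (16)
    if (exact)
        r1 = std::nextafter(r1, INFINITY);    // step past the exact root
    return r1;
}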

Regarding the cost of all this, note that in the Moore library the rounding mode is always upwards, so there is no cost associated with changing rounding modes. Moreover, the expensive operations in Equations (14) and (15) are the divisions, and the extra couple of sums and multiplications does not increase the overall cost of computing the intersections by much. In fact, if we take into account branch prediction and speculative execution and the fact that computations are usually inexact, the extra cost due to the verification in Equations (16) and (17) and the increment of $\overline{r}_1$ and the decrement of $\underline{r}_2$ is likely to be minimal. We would not be surprised if, by performing the verification in Equations (16) and (17) above, the code turned out to be faster than blindly incrementing $\overline{r}_1$ and decrementing $\underline{r}_2$ (we did not have the time to check this).

Finally, by using symmetry, we can reduce the analysis of questions (4) and (9) in the case in which $\mathbb{f}(t_k)$ does not contain 0 to the cases described in Figure 2. In this figure, the inclined lines have equations

$w = \overline{w}_k + \overline{d}_k (t - t_k) \quad \mathrm{and} \quad w = \underline{w}_k + \underline{d}_k (t - t_k),$

and we may have either no intersection or one intersection. We can use the same technique described above to find intersections which are correctly rounded and such that $f(u)$ is different from zero at the new extreme points $u$, without evaluating $\mathbb{f}(u)$ (the old extreme points stay as they were).

Figure 2: The case $\underline{d}_k > 0$ and $\overline{w}_k < 0$, with $\underline{d}_k$ and $\overline{d}_k$ large and small.

3 What we can expect from Interval Arithmetic

In order to appreciate the content of this article, one must realize that interval arithmetic is quite different from floating point arithmetic, and it is used for different purposes. Floating point methods are usually faster than interval methods, but do not provide rigorous bounds on their results. We use interval arithmetic when we want global solutions to our problems, with rigorous bounds on them, and we are willing to pay more for that.

Figure 3: If $\mathbb{f}$ is all we know, then the best estimate we can hope for $r$ is the interval $\mathbb{r} = [\underline{r}, \overline{r}]$. Knowing that $\underline{\mathbb{f}}'(t) > 0$ for all $t \in \mathbb{r}$, we can improve this to $\mathbb{r}' = [\underline{r}', \overline{r}']$. This is the best we can do if $\mathbb{f}$ and $\mathbb{f}'$ are all the information we have about $f$.

Interval arithmetic is fundamentally limited. When we evaluate the extension $\mathbb{f}$ at a point $t$ we obtain an interval $\mathbb{w} = \mathbb{f}(t)$ containing $f(t)$. The width of $\mathbb{w}$ depends on the quality of the implementation of $\mathbb{f}$ and the underlying arithmetic. When $f$ is not too complex and $f(t)$ is $O(1)$, we expect the width of $\mathbb{w}$ to be $O(\epsilon)$, where $\epsilon$ is the machine precision. It is also reasonable to expect that the functions $\underline{\mathbb{f}}, \overline{\mathbb{f}} : \mathbb{x} \rightarrow \mathbb{R}$ will be as in Figure 3 when $r \in \mathbb{x}$ is a root of $f$.

When the evaluation of ff using interval arithmetic leads to wide intervals, we can (and do) use centered forms and Taylor models to reduce the width of 𝕗(𝕩){\mathbb{f}}\!\left(\mathbb{x}\right), but this will not free us from facing the reality that the evaluation of interval extensions is usually imperfect. This is a fact, and we need to be prepared to handle not only the monotone situation described in Figure 3, but also the noisy situation described in Figure 4, which often occurs near a double root, and gets worse near roots of higher multiplicities.

Figure 4: The hardest case our algorithm faces: a cluster of zeros below the accuracy of $\mathbb{f}$ and $\mathbb{f}'$. There are no silver bullets in such cases, and we must find reasonable ways to cope with the $\mathbb{f}$ and $\mathbb{f}'$ which are provided to us, not the ones that we dream about. Using the tolerances $\tau_c$ and $\tau_w$ described in the text is a reasonable compromise for handling such situations (though it is far from perfect).

Another fundamental fact about interval arithmetic is that it is a pessimistic theory: it always considers the worst scenario, and it would be inconsistent if it were otherwise. As a consequence of this fundamental fact, the functions $f_1$ and $f_2$ in Figure 4 are indistinguishable by their common extension $\mathbb{f}$, that is, if $\mathbb{f}$ is all the information that we have, then all our conclusions about $f_1$ must also apply to $f_2$. This forces us to be very conservative, and to include all possible roots of $f_1$ as candidates for roots of $f_2$ too, and vice versa. We emphasize this point because we found that it is not always clear in the minds of people trying to implement root finders like the one we propose here, and this makes them underestimate the effort required for this task in real life.

Taking into account all that was said since the beginning of this section, we reached the conclusion that we should use three tolerances to try to handle the problems caused by the very nature of interval arithmetic. First, we should have a tolerance $\tau_x$ to handle "normal" roots $r$ like the one in Figure 3, at which $f'(r) \neq 0$, so that we consider intervals around such roots with width below $\tau_x$ to be good enough. Clusters of zeros of $\mathbb{f}$, as in Figure 4, are a different kind of beast, as the experience with other problems in interval arithmetic has shown [5, 6]. By using the same tolerance $\tau_x$ for them we can end up with literally thousands of small intervals as candidates for containing roots, in the favorable case in which the program actually ends and returns something. We say this from our experience with our own interval root finders as well as with interval root finders developed by other people.

Our algorithm asks the user to also provide a "cluster tolerance" $\tau_c$, which could be something of the order of $\sqrt{\tau_x}$, and a tolerance $\tau_w$ such that function values $w = f(t)$ with magnitude below $\tau_w$ are deemed to be negligible. We take the liberty of increasing $\tau_w$ if we find it to be smaller than 16 times the maximum width of the intervals $\mathbb{f}(t)$ which we compute along the execution of the algorithm. Contrary to what is proposed in [4], we believe that all tolerances should be absolute, and not relative (and users are free to choose which approach suits their needs best). Using these tolerances, we can implement Algorithm 1, which "expands" a zero $z \in \mathbb{x}$. By applying this algorithm to the point $z$ and the interval $\mathbb{x} = [\underline{x}, \overline{x}]$ in Figure 4 we would identify the cluster $[\underline{c}, \overline{c}]$ and submit the intervals $[\underline{x}, \underline{c}]$ and $[\overline{c}, \overline{x}]$ to further examination. Finally, we must say that the actual algorithms for zero expansion in the C++ code are more involved than Algorithm 1, but the details are too cumbersome to be presented here.

Algorithm 1 Zero expansion
procedure expand_zero($z$, $\mathbb{x}$)
     $\overline{c} \leftarrow z$
     while  $\overline{c} + \tau_c \leq \overline{x}$  do
          $\mathbb{w} \leftarrow \mathbb{f}(\overline{c} + \tau_c)$
          $\tau_w \leftarrow \max\{16\,\mathrm{wid}(\mathbb{w}), \tau_w\}$
          if  $\underline{w} \geq \tau_w$ or $\overline{w} \leq -\tau_w$  then
               break
          $\overline{c} \leftarrow \overline{c} + \tau_c$

     $\underline{c} \leftarrow z$
     while  $\underline{c} - \tau_c \geq \underline{x}$  do
          $\mathbb{w} \leftarrow \mathbb{f}(\underline{c} - \tau_c)$
          $\tau_w \leftarrow \max\{16\,\mathrm{wid}(\mathbb{w}), \tau_w\}$
          if  $\underline{w} \geq \tau_w$ or $\overline{w} \leq -\tau_w$  then
               break
          $\underline{c} \leftarrow \underline{c} - \tau_c$

     if  $\underline{c} > \underline{x}$  then
          if  $\overline{c} < \overline{x}$  then
               return $[\underline{x}, \underline{c}],\ [\underline{c}, \overline{c}],\ [\overline{c}, \overline{x}]$
          else
               return $[\underline{x}, \underline{c}],\ [\underline{c}, \overline{c}],\ \emptyset$
     else
          if  $\overline{c} < \overline{x}$  then
               return $\emptyset,\ [\underline{c}, \overline{c}],\ [\overline{c}, \overline{x}]$
          else
               return $\emptyset,\ [\underline{c}, \overline{c}],\ \emptyset$
end procedure
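A C++ rendering of Algorithm 1, again over the illustrative Ival type, may clarify the control flow; the three intervals returned by the pseudocode become the cluster plus the two flanks that the caller re-examines (names and signature are ours):

#include <algorithm>

struct Cluster { double lo, hi; };

// Sketch of Algorithm 1: walk outwards from z in steps of tau_c until
// f is certainly nonzero; tau_w is enlarged on the fly from the widths
// of the enclosures F(t), exactly as in the pseudocode above.
Cluster expand_zero(Ival (*F)(Ival), double z, Ival x,
                    double tau_c, double& tau_w) {
    double chi = z, clo = z;
    while (chi + tau_c <= x.hi) {                     // expand right
        Ival w = F({chi + tau_c, chi + tau_c});
        tau_w = std::max(16 * (w.hi - w.lo), tau_w);
        if (w.lo >= tau_w || w.hi <= -tau_w) break;   // certainly nonzero
        chi += tau_c;
    }
    while (clo - tau_c >= x.lo) {                     // expand left
        Ival w = F({clo - tau_c, clo - tau_c});
        tau_w = std::max(16 * (w.hi - w.lo), tau_w);
        if (w.lo >= tau_w || w.hi <= -tau_w) break;
        clo -= tau_c;
    }
    // caller re-examines [x.lo, clo] and [chi, x.hi] when nonempty
    return { clo, chi };
}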

4 The modified Newton’s method

As mentioned in the introduction, the main tool for root finding is Newton’s method

$t_{k+1} = t_k - f(t_k)/f'(t_k).$  (18)

This method was devised to find a "point approximation" to the root $r$, that is, we look for $\tilde{r}$ such that $|r - \tilde{r}|$ is small. It is our opinion that the goal of interval arithmetic is a bit different: we look for a short interval $\mathbb{r}$ such that $r \in \mathbb{r}$. Of course, in the end both goals amount to the same, but they suggest slightly different perspectives. We can rephrase the interval arithmetic goal as finding a short interval $\mathbb{r} = [\underline{r}, \overline{r}]$ such that $f(\underline{r})$ and $f(\overline{r})$ have opposite signs (modulo degenerate cases in which $f(\underline{r}) = 0$ or $f(\overline{r}) = 0$). From this perspective, we believe that we should modify the classic Newton step (18) so that it produces iterates $t_k$ such that the signs of $w_k := f(t_k)$ alternate, in order to ensure that the interval

$\mathbb{r}_k := [\underline{r}_k, \overline{r}_k] \quad \mathrm{for} \quad \underline{r}_k := \min\{t_k, t_{k+1}\} \quad \mathrm{and} \quad \overline{r}_k := \max\{t_k, t_{k+1}\}$

always contains a root.

(Figure 5 panels: $d_k := f'(t_k)$, $s_k := -w_k/d_k$, Newton iterate $n_{k+1} := t_k + s_k$, curvature $h_k \approx f''(t_k) < 0$, corrected iterate $t_{k+1} := n_{k+1} - h_k s_k^2/d_k$.)
Figure 5: The modified Newton step when $w_k = f(t_k) < 0$ and $d_k = f'(t_k) > 0$. At left, $f$ is concave and we modify the Newton iterate $n_{k+1}$ in order to ensure that $w_{k+1}$ and $w_k$ have opposite signs. At right, $f$ is convex and $t_{k+1} = n_{k+1}$: no modification is needed.

A simple modification with this purpose is described in Figure 5. This figure illustrates only the case in which $w_k < 0$ and $d_k > 0$, but the cases in which $w_k > 0$ or $d_k < 0$ are similar and can be reduced to the case $w_k < 0$ and $d_k > 0$ by replacing $f$ by $-f$ or $t$ by $-t$. A simple way to obtain a reasonable approximation $h_k$ to the second derivative mentioned in Figure 5 is to use the quotient

$h_k = \dfrac{d_{k-m} - d_k}{t_{k-m} - t_k},$

with data $d_{k-m}$ and $t_{k-m}$ from previous iterations.
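In code, one step of the modification is tiny. The sketch below covers the case $w_k < 0$, $d_k > 0$ of Figure 5 and matches the update used inside Algorithm 2; the function name is ours.

// One modified Newton step for the case w_k < 0, d_k > 0 of Figure 5.
// h is the secant estimate (d_{k-m} - d_k) / (t_{k-m} - t_k) of f''.
// When f is concave (h < 0) the plain Newton point falls short of the
// root, so the displacement is extended by -h*s*s/d to force the sign
// of f(t_{k+1}) to flip, as required in Section 4.
double modified_newton_step(double tk, double wk, double dk, double h) {
    double s = -wk / dk;        // plain Newton displacement
    if (h < 0)
        s -= h * s * s / dk;    // correction: lengthens s since h < 0
    return tk + s;
}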

Motivated by Figure 5, we propose Algorithm 2 below for finding a short interval $\mathbb{r}$ containing the single root of $f$ in the interval $\mathbb{x}$ using exact arithmetic, in the case in which $f(\underline{x}) < 0 < f(\overline{x})$ and $f'(t) > 0$ for $t \in \mathbb{x}$. It is not difficult to prove that Algorithm 2 has the same properties as the many other versions of Newton's method that we find in the literature:

  • When the model

    $f(t) \approx f(t_k) + f'(t_k)(t - t_k) + f''(t)(t - t_k)^2/2$  (19)

    is inaccurate for $t \in \mathbb{x}$, we fall back on a bisection step. This guarantees that eventually the interval $\mathbb{x}$ will become very short, and the analysis in the next item will apply.

  • When the quadratic model (19) is accurate in $\mathbb{x}$ and the interval $\mathbb{x}$ is very short, $h$ will be a good approximation of $f''(t_k)$, our modification of the Newton step will ensure that bisection stops being necessary, the signs of the $w_k$ will alternate, and the iterates will converge at the same quadratic rate as the classic method.
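One way to read the size of the correction (this reading is ours): substituting the plain Newton displacement $s_0 = -w_k/d_k$ into the quadratic model (19) leaves the residual

$f(t_k + s_0) \approx w_k + d_k s_0 + \frac{h}{2} s_0^2 = \frac{h}{2} s_0^2,$

and one further Newton step on this residual would lengthen the displacement by $-\frac{h}{2} s_0^2/d_k$. Algorithm 2 lengthens it by twice that amount, $-h s_0^2/d_k$, deliberately overshooting the root of the model so that the sign of $w_{k+1}$ flips.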

Algorithm 2 Find a short interval $\mathbb{r}$ containing the root $r$ of $f$ with $f'(t) > 0$.
procedure exact_newton($\mathbb{x}$, $f$, $\tau_x$, $\tau_w$)
     $\underline{w} \leftarrow f(\underline{x})$,  $\overline{w} \leftarrow f(\overline{x})$,  $\underline{d} \leftarrow \mathrm{nan}$,  $\overline{d} \leftarrow \mathrm{nan}$,  $d \leftarrow \mathrm{nan}$,  $t_d \leftarrow \mathrm{nan}$,  $h \leftarrow 0$
     if  $\underline{w} > 0$ or $\overline{w} < 0$  then
          return $\emptyset$

     regular case:
     if  $(\overline{x} - \underline{x} < \tau_x)$ or $(\overline{w} - \underline{w} < \tau_w)$  then
          return $\mathbb{x}$
     if  $-\underline{w} \leq \overline{w}$  then
          if  $!\mathrm{isnan}(\underline{d})$  then
               goto bisection
          $t_k \leftarrow \underline{x}$,  $d_k \leftarrow f'(t_k)$
          if  $!\mathrm{isnan}(d)$ and $t_d \neq t_k$  then
               $h \leftarrow (d - d_k)/(t_d - t_k)$,  $d \leftarrow d_k$,  $t_d \leftarrow t_k$
          $\underline{d} \leftarrow d_k$,  $d \leftarrow d_k$,  $t_d \leftarrow t_k$,  $s_k \leftarrow -\underline{w}/d_k$
          if  $h < 0$  then
               $s_k \leftarrow s_k - h s_k^2/d_k$
          if  $2 s_k > \overline{x} - \underline{x}$  then
               goto bisection
          $t_{k+1} \leftarrow t_k + s_k$,  $w_{k+1} \leftarrow f(t_{k+1})$
          if  $w_{k+1} \geq 0$  then
               $\overline{w} \leftarrow w_{k+1}$,  $\overline{x} \leftarrow t_{k+1}$,  $\overline{d} \leftarrow \mathrm{nan}$
               goto regular case
          else
               $\underline{w} \leftarrow w_{k+1}$,  $\underline{x} \leftarrow t_{k+1}$,  $\underline{d} \leftarrow \mathrm{nan}$
               goto bisection
     else
          the case $-\underline{w} > \overline{w}$ is analogous to $-\underline{w} \leq \overline{w}$

     bisection:
     $t_{k+1} \leftarrow (\underline{x} + \overline{x})/2$,  $w_{k+1} \leftarrow f(t_{k+1})$
     if  $w_{k+1} > 0$  then
          $\overline{w} \leftarrow w_{k+1}$,  $\overline{x} \leftarrow t_{k+1}$,  $\overline{d} \leftarrow \mathrm{nan}$
     else
          $\underline{w} \leftarrow w_{k+1}$,  $\underline{x} \leftarrow t_{k+1}$,  $\underline{d} \leftarrow \mathrm{nan}$
     goto regular case
end procedure

5 The Monotonic Method

This section presents an interval Newton's method for functions with $f'(t) > \kappa > 0$ for $t \in \mathbb{x}$ (to use it for functions with $f'(t) < -\kappa < 0$ we can simply replace $f$ by $-f$). The method is a simple adaptation of the "point method" presented in Section 4, taking into account the fundamental differences between exact (point) arithmetic and interval arithmetic. Among such differences we have the following ones:

  • In interval arithmetic, for a strictly increasing function, we may have that $0 \in \mathbb{f}(t)$ for many $t$'s. In exact arithmetic this can only happen for one value of $t$.

  • In interval arithmetic we have a natural measure for the error in the values of $f$, given by the maximum width of the intervals $\mathbb{f}(t_k)$. This allows us to correct a poor choice of the tolerance $\tau_w$ by the user and to dispense with the cluster tolerance $\tau_c$ mentioned in Section 3.

As we mentioned in the introduction, the increasing case is much simpler than the general one. In this case, we need to compute neither $\mathbb{f}(\mathbb{x}_k)$ nor $\mathbb{f}'(\mathbb{x}_k)$. We only need the interval $\mathbb{f}(t_k)$ and the floating point number $f'(t_k)$, and this entails a considerable reduction in the cost of the steps. The analysis of degenerate cases is also much simpler in the increasing case. These ideas are explored in Algorithm 3, which is presented after the bibliography. For brevity, we omit the expand_zero label in this code, but it is similar to the zero expansion algorithms presented in Section 3.

Algorithm 3 Find a short interval $\mathbb{r}$ containing the root $r$ of $f$ with $\underline{\mathbb{f}}'(\mathbb{x}) \geq \kappa > 0$.
procedure increasing_interval_newton($\mathbb{x}$, $f$, $\tau_x$, $\tau_w$)
     $\mathbb{w}_- \leftarrow \mathbb{f}(\underline{x})$,  $\mathbb{w}_+ \leftarrow \mathbb{f}(\overline{x})$,  $\underline{d} \leftarrow \mathrm{nan}$,  $\overline{d} \leftarrow \mathrm{nan}$,  $d \leftarrow \mathrm{nan}$,  $t_d \leftarrow \mathrm{nan}$,  $h \leftarrow 0$
     if  $\underline{w}_- > 0$ or $\overline{w}_+ < 0$  then
          return $\emptyset$
     $\tau_w \leftarrow \max\{16\,\mathrm{wid}(\mathbb{w}_-),\ 16\,\mathrm{wid}(\mathbb{w}_+),\ \tau_w\}$
     if  $\overline{w}_- \geq 0$  then
          $t_z \leftarrow \underline{x}$ and goto expand zero
     if  $\underline{w}_+ \leq 0$  then
          $t_z \leftarrow \overline{x}$ and goto expand zero

     regular case:
     if  $(\overline{x} - \underline{x} < \tau_x)$ or $(\overline{w}_+ - \underline{w}_- < \tau_w)$  then
          return $\mathbb{x}$
     if  $-\underline{w}_- \leq \overline{w}_+$  then
          if  $!\mathrm{isnan}(\underline{d})$  then
               goto bisection
          $t_k \leftarrow \underline{x}$,  $d_k \leftarrow \max\{\kappa, f'(t_k)\}$
          if  $!\mathrm{isnan}(d)$ and $t_d \neq t_k$  then
               $h \leftarrow (d - d_k)/(t_d - t_k)$,  $d \leftarrow d_k$,  $t_d \leftarrow t_k$
          $\underline{d} \leftarrow d_k$,  $d \leftarrow d_k$,  $t_d \leftarrow t_k$,  $s_k \leftarrow -\underline{w}_-/d_k$
          if  $h < 0$  then
               $s_k \leftarrow s_k - h s_k^2/d_k$
          if  $2 s_k > \overline{x} - \underline{x}$  then
               goto bisection
          $t_{k+1} \leftarrow t_k + s_k$,  $\mathbb{w}_{k+1} \leftarrow \mathbb{f}(t_{k+1})$,  $\tau_w \leftarrow \max\{16\,\mathrm{wid}(\mathbb{w}_{k+1}), \tau_w\}$
          if  $\underline{w}_{k+1} > 0$  then
               $\mathbb{w}_+ \leftarrow \mathbb{w}_{k+1}$,  $\overline{x} \leftarrow t_{k+1}$,  $\overline{d} \leftarrow \mathrm{nan}$ and goto regular case
          else
               if  $\overline{w}_{k+1} < 0$  then
                    $\mathbb{w}_- \leftarrow \mathbb{w}_{k+1}$,  $\underline{x} \leftarrow t_{k+1}$,  $\underline{d} \leftarrow \mathrm{nan}$
                    goto bisection
               else
                    $t_z \leftarrow t_{k+1}$ and goto expand zero
     else
          the case $-\underline{w}_- > \overline{w}_+$ is analogous to $-\underline{w}_- \leq \overline{w}_+$

     bisection:
     $t_{k+1} \leftarrow (\underline{x} + \overline{x})/2$,  $\mathbb{w}_{k+1} \leftarrow \mathbb{f}(t_{k+1})$
     if  $\underline{w}_{k+1} > 0$  then
          $\mathbb{w}_+ \leftarrow \mathbb{w}_{k+1}$,  $\overline{x} \leftarrow t_{k+1}$,  $\overline{d} \leftarrow \mathrm{nan}$
     else
          if  $\overline{w}_{k+1} < 0$  then
               $\mathbb{w}_- \leftarrow \mathbb{w}_{k+1}$,  $\underline{x} \leftarrow t_{k+1}$,  $\underline{d} \leftarrow \mathrm{nan}$
          else
               $t_z \leftarrow t_{k+1}$ and goto expand zero
     goto regular case
end procedure

6 Testing

In practice, it is quite difficult to implement the algorithm presented here, due to the amount of attention to detail it requires. It is quite likely that the first attempts at such an implementation will fail, as ours did. Even worse, it is also likely that the failure will go undetected, especially if the details involved in the actual coding of the algorithm are considered to be of little relevance. As a result, in order to have some assurance of the reliability of our code in practice it is necessary to have a good suite of tests. Readers should not underestimate this point (and should not underestimate our remark that they should not underestimate this point, and so on, recursively…)

It is also important to realize that a good test suite may reveal problems not only in the code but also in the theory upon which it is based. A good example of this point is the choice of the stopping criterion presented in chapter 9 of [4]. When using that stopping criterion for the simple polynomial

$f(x) = (x-1)(x-2)(x-3)(x-4)(x-5)$  (20)

in the interval $\mathbb{x} = [1, 5]$, we would obtain that the solution set $\mathcal{R}$ is the whole interval $\mathbb{x}$, regardless of the tolerances provided by the user, and a good set of tests could detect this problem. The authors of [4] do mention that their stopping criterion may be problematic in some rare cases, and that users would detect this kind of problem afterwards. Readers may think that the polynomial in Equation (20) is one of such cases, but we beg to differ: we believe that a proper algorithm should get the roots of a function as simple as the one in Equation (20) with high accuracy and without user intervention, and if it does not then it must be fixed. Asking users to analyze the result may not be an option, for instance, when our algorithm is being used thousands of times as part of a larger routine. In this scenario, users will not have the chance to look at the result of each call of the algorithm, and they will need either to rely on the algorithm to provide good answers, or to write their own code to do what the algorithm should have done for them.

The example in Equation (20) can be thought of as a mild version of Wilkinson's polynomial, and we believe that we can obtain a good test suite by adding multiple roots to polynomials similar to it. In our opinion, a robust root finder must not choke when trying to find the roots of polynomials of the form

$p_{d,\mathbf{e}}(x) := \pm \prod_{i=-m}^{m} (x - i)^{e_i} \quad \mathrm{for} \quad x \in \mathbb{x} = [-m - \underline{\delta},\ m + \overline{\delta}],$  (21)

where $m$ and the exponents $e_i$ are integers and

$\underline{\delta}, \overline{\delta} \in \{0, 1\}, \quad \sum_{i=-m}^{m} e_i = d > 0, \quad m > 0 \quad \mathrm{and} \quad e_i \geq 0 \ \mathrm{for}\ i = -m, \dots, m.$

If $d$ and $m$ are not large then the coefficients of $p_{d,\mathbf{e}}$ can be computed exactly in floating point arithmetic. For instance, when $d = 20$ and $m = 5$ the coefficients of $p_{d,\mathbf{e}}$ can be computed exactly with the usual IEEE 754 double precision arithmetic. There are

$n_{d,m} := 8 \dbinom{2m + d}{d}$  (22)

elements in the family of polynomials in Equation (21). For $m = 5$ we have that

$n_5 = \sum_{d=1}^{20} n_{d,5} \approx 700\ \mathrm{million},$

and for tolerances $\tau_x = \tau_w = 10^{-6}$ and $\tau_c = 0.001$ our code can test all these 700 million cases in a couple of days on our desktop machine, which has an AMD® Ryzen 7 2700X eight-core processor and 64 GB of RAM (see the discussion about tolerances in Section 3).
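The 700 million figure can be checked with the hockey-stick identity:

$n_5 = \sum_{d=1}^{20} 8\dbinom{10 + d}{d} = 8\left(\dbinom{31}{20} - 1\right) = 8\,(84672315 - 1) = 677378512 \approx 7 \times 10^8.$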

The family of functions in Equation (21) has the following features, which make it a good test suite:

  1. By taking $\underline{\delta} = 0$ or $\overline{\delta} = 0$ we can check how our code handles roots on the border of the interval $\mathbb{x}$. This situation is a common source of bugs in root finders.

  2. Usually some $e_i$ will be greater than one, and then the polynomials $p_{d,\mathbf{e}}$ have multiple roots. In fact, they may have roots with multiplicity as high as $d$, or a couple of roots with multiplicity $d/2$. These problems are quite challenging, and some root finders may return thousands of candidate intervals for such multiple roots.

  3. It is easy to write an algorithm to generate these polynomials by writing them in the factored form

     $p_{d,\mathbf{e}}(x) = \pm q_1(-x)\, x^{e_0} q_2(x),$  (23)

     where the $q_k$ are polynomials of the form

     $q_k(x) = \prod_{i=1}^{m} (x - i)^{e_i}$

     with degree at most $d$. For $m = 5$ and $d \leq 20$ there are

     $n_q = \sum_{j=1}^{20} \dbinom{m + j - 1}{j} = 53129$

     polynomials $q_k$, and we can keep them in a table in memory and let several threads build the polynomials $p_{d,\mathbf{e}}$ from them using Equation (23).

  4. When the coefficients $c_k$ of the expanded version of $p_{d,\mathbf{e}}$,

     $p_{d,\mathbf{e}}(x) = \sum_{k=0}^{d} c_k x^k,$  (24)

     can be computed exactly, we know the exact roots of the polynomial in Equation (24) and we can use this knowledge to evaluate the robustness of our code with respect to the inaccuracies in the evaluation of $p_{d,\mathbf{e}}$ and its derivatives using the expanded form. In other words, we should test our code using Horner's method or a Taylor model version of it to evaluate $p_{d,\mathbf{e}}$ in its expanded form (24), because the evaluation of the expanded form is usually much less accurate than the evaluation of the product form in Equation (21), and the purpose of testing is to stress our code (see the sketch after this list).
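The sketch below shows the two evaluation routes of item 4 side by side, reusing the illustrative Ival arithmetic from the introduction: the factored form (21), which is accurate, and the expanded form (24) via Horner's method, which is deliberately less accurate and therefore stresses the root finder harder.

#include <vector>

// Factored form (21): accurate evaluation from the exponent vector e,
// where e[i + m] is the multiplicity of the root i, for i = -m, ..., m.
Ival eval_factored(const std::vector<int>& e, int m, Ival x) {
    Ival p{1.0, 1.0};
    for (int i = -m; i <= m; ++i)
        for (int k = 0; k < e[i + m]; ++k)
            p = p * (x - double(i));
    return p;
}

// Expanded form (24) by Horner's method: usually much less accurate,
// which is exactly what we want when stressing the root finder.
Ival eval_horner(const std::vector<double>& c, Ival x) {
    Ival p{c.back(), c.back()};
    for (int k = (int)c.size() - 2; k >= 0; --k)
        p = p * x + Ival{c[k], c[k]};    // p <- p * x + c_k
    return p;
}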

Once we have chosen the family of functions and the intervals on which we will search for roots, we must decide which tolerances to use for testing. Our advice is to use both large and small tolerances. Using large tolerances, like 0.1, 0.01 and 0.001, we can detect gross mistakes, like mistyping. We may miss some of these gross mistakes if we only use tiny tolerances, because with tiny tolerances the part of the code affected by the mistyping may never be executed during the tests, or may only cause bugs which are not severe enough to be detected (we say this based on our personal experience with our own blunders). Tests with small tolerances are necessary to assess the algorithm's accuracy and to check how much time it takes to find accurate roots.

References

  • [1] Alefeld, G., On the convergence of some interval arithmetic modifications of Newton’s method, SIAM J. Numer. Anal. 21, 363–372 (1984)
  • [2] Hansen, E., On Solving Systems of Equations Using Interval Arithmetic. Math. Comp. 22, 374–384 (1968)
  • [3] Hansen, E., Interval Forms of Newton's Method, Computing 20, 153–163 (1978)
  • [4] Hansen, E., Walster, G., Global Optimization and Interval Analysis, Second Edition, Revised and Expanded, Marcel Dekker (2004)
  • [5] Kearfott, R. B., and Du, K., The cluster problem in global optimization, the univariate case. Computing Supplement 9, 117–127 (1992)
  • [6] Kearfott, R. B., and Du, K., The cluster problem in multivariate global optimization. Journal of Global Optimization, 5:253–265 (1994)
  • [7] Krawczyk, R. and Neumaier, A., Interval Slopes for Rational Functions and Associated Centered Forms, SIAM J. Numer. Anal. 22, 604–616 (1985)
  • [8] Mascarenhas, W. F., Moore: Interval Arithmetic in C++20, In: Barreto G., Coelho R. (eds) Fuzzy Information Processing. NAFIPS 2018. Communications in Computer and Information Science, vol 831, pp 519–529 (2018)
  • [9] Neumaier, A., Taylor Forms–Use and Limits. Reliable Computing 9, 43–79 (2003)
  • [10] Moore, R. E., Interval Analysis. Prentice-Hall, Englewood Cliffs, NJ. (1966)
  • [11] Nickel, K., On the Newton Method in Interval Analysis. Mathematics Research Center Report 1136, University of Wisconsin (1971)
  • [12] Ratschek, H., Centered forms, SIAM J. Numer. Anal. vol 15, no. 5, 656-662 (1980)
  • [13] Ratschek, H. and Rokne, J., Geometric Computations with Interval and New Robust Methods: Applications in Computer Graphics, GIS and Computational Geometry. Horwood Publishing, Chichester (2003)