
Backward Error Measures for Roots of Polynomials

Simon Telen (Department of Computer Science, KU Leuven), Sascha Timme (Technische Universität Berlin, Chair of Discrete Mathematics/Geometry), and Marc Van Barel (Department of Computer Science, KU Leuven)

Sascha Timme was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation) Graduiertenkolleg Facets of Complexity (GRK 2434). Marc Van Barel was partially supported by the Research Council KU Leuven, C1-project (Numerical Linear Algebra and Polynomial Computations), by the Fund for Scientific Research Flanders (Belgium), G.0828.14N (Multivariate polynomial and rational interpolation and approximation), and by EOS Project no. 30468160.
(January 1, 2025)
Abstract

We analyze different measures for the backward error of a set of numerical approximations for the roots of a polynomial. We focus mainly on the element-wise mixed backward error introduced by Mastronardi and Van Dooren, and the tropical backward error introduced by Tisseur and Van Barel. We show that these measures are equivalent under suitable assumptions. We also show relations between these measures and the classical element-wise and norm-wise backward error measures.

1 Introduction

In this article we analyze the problem of measuring the backward error for a set of approximations for the roots of a polynomial with complex coefficients. For a general introduction to the notion of backward error analysis, the reader can consult for instance [Hig02, Section 1.5]. Consider a set of approximate solutions $\hat{X}=\{\hat{x}_1,\ldots,\hat{x}_d\}\subset\mathbb{C}^{*}=\mathbb{C}\setminus\{0\}$ of a polynomial equation with nonzero coefficients

\[ f = c_0 + c_1x + \cdots + c_{d-1}x^{d-1} + c_dx^d = c_d(x-x_1)\cdots(x-x_d) = 0, \quad f\in\mathbb{C}[x], \]

with solutions $X=\{x_1,\ldots,x_d\}\subset\mathbb{C}^{*}$. The backward error of $\hat{X}$ is a measure for the ‘distance’ of $f$ to the polynomial

\[ \hat{f} = \hat{c}_0 + \hat{c}_1x + \cdots + \hat{c}_{d-1}x^{d-1} + c_dx^d = c_d(x-\hat{x}_1)\cdots(x-\hat{x}_d), \]

whose roots are exactly the points in $\hat{X}$. How to measure this distance turns out to be a surprisingly subtle problem. A first and natural measure is the 2-norm distance between the coefficient vectors of $f$ and $\hat{f}$. The norm-wise backward error (NBE) of $\hat{X}$ is

\[ \textup{NBE}(\hat{X}) = \sqrt{\frac{|c_0-\hat{c}_0|^2+\cdots+|c_{d-1}-\hat{c}_{d-1}|^2}{|c_0|^2+\cdots+|c_d|^2}} = \frac{\lVert c-\hat{c}\rVert_2}{\lVert c\rVert_2}, \]

where $c=(c_0,\ldots,c_d)\in\mathbb{C}^{d+1}$ and $\hat{c}=(\hat{c}_0,\ldots,c_d)\in\mathbb{C}^{d+1}$. In [AMVW15], an algorithm is proposed that computes a set of approximate solutions $\hat{X}$ satisfying $\textup{NBE}(\hat{X})=O(u)$, where $u$ is the unit round-off. Such an algorithm is called norm-wise backward stable. However, it turns out that this type of stability is too ‘weak’ in a sense we explain by means of an example. Consider the polynomial $f=a(x-10^{6})(x-10^{-6})$. The set of approximate solutions $\hat{X}=\{10^{6}+u10^{6},\,10^{-6}+u\}$ would satisfy $\textup{NBE}(\hat{X})=O(u)$. Indeed,

\[ \hat{c} = a\left(1+(u10^{6}+u+u^{2}10^{6}),\; -(10^{6}+10^{-6})-u(10^{6}+1),\; 1\right), \quad c = a\left(1,\; -(10^{6}+10^{-6}),\; 1\right), \]

and $\lVert c-\hat{c}\rVert_2/\lVert c\rVert_2=O(u)$. This means that we would allow a relative error of size $u10^{6}$ on the coefficient vector $c$. However, computing the roots of $f$ with $a=0.2$ using the Julia package PolynomialRoots (via the command roots), we get

julia> abs.((c - chat)./c)
3-element Array{Float64,1}:
 2.7755575615628914e-16
 0.0
 0.0

which shows that much better element-wise accuracy on the coefficient vector can be obtained. This suggests another, stricter measure for the backward error. The element-wise backward error (EBE) of $\hat{X}$ is

\[ \textup{EBE}(\hat{X}) = \max_{i=0,\ldots,d-1}\left|\frac{c_i-\hat{c}_i}{c_i}\right|. \]
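The gap between the two measures can be checked concretely on the example above. The following Python sketch (a plain-Python analogue of the Julia computation; the perturbed roots are the hypothetical ones from the text) evaluates both the NBE and the EBE:

```python
import math

u = 2.0 ** -53                      # unit round-off in IEEE double precision
a = 0.2

# f = a*(x - 1e6)*(x - 1e-6), coefficients ordered by increasing degree
c = [a * 1.0, -a * (1e6 + 1e-6), a * 1.0]

# the perturbed roots from the text: relative error u on the large root,
# absolute error u on the small one
x1h, x2h = 1e6 + u * 1e6, 1e-6 + u
chat = [a * x1h * x2h, -a * (x1h + x2h), a * 1.0]

diff = [ci - chi for ci, chi in zip(c, chat)]
nbe = math.sqrt(sum(d * d for d in diff)) / math.sqrt(sum(ci * ci for ci in c))
ebe = max(abs(d / ci) for d, ci in zip(diff, c))

print(nbe)   # of the order of u: norm-wise backward stable
print(ebe)   # of the order of u * 1e6: element-wise accuracy is much worse
```

The NBE is of the order of $u$, while the EBE is of the order of $u10^{6}$, which is exactly the discrepancy the example illustrates.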

Unfortunately, this measure turns out to be too strict. In [MVD15], Mastronardi and Van Dooren show that no algorithm for finding the roots of a general quadratic polynomial is element-wise backward stable, meaning that it computes $\hat{X}$ such that $\textup{EBE}(\hat{X})=O(u)$. As an alternative measure, the authors of [MVD15] propose the following definition in the case $d=2$. The element-wise mixed backward error (EMBE) of $\hat{X}$, denoted $\textup{EMBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that there exist some $\tilde{X}=\{\tilde{x}_1,\ldots,\tilde{x}_d\}\subset\mathbb{C}$ and

\[ \tilde{f} = \tilde{c}_0 + \tilde{c}_1x + \cdots + \tilde{c}_{d-1}x^{d-1} + c_dx^d = c_d(x-\tilde{x}_1)\cdots(x-\tilde{x}_d) \]

such that

\begin{align*}
|\hat{x}_i-\tilde{x}_i| &\leq \varepsilon|\tilde{x}_i|, \quad i=1,\ldots,d,\\
|c_i-\tilde{c}_i| &\leq \varepsilon|c_i|, \quad i=0,\ldots,d-1.
\end{align*}

Note that the second set of inequalities is equivalent to $\textup{EBE}(\tilde{X})\leq\varepsilon$. In the same paper, the authors also show the implication $\textup{EMBE}(\hat{X})=O(u)\Rightarrow\textup{NBE}(\hat{X})=O(u)$ in the case $d=2$. The implication $\textup{EBE}(\hat{X})=O(u)\Rightarrow\textup{EMBE}(\hat{X})=O(u)$ is obvious from the definition.
The advantage of the EMBE as a measure for the backward error is that it results in a notion of stability, namely element-wise mixed backward stability, which is stronger than norm-wise backward stability and can provably be obtained for quadratic polynomials [MVD15]. A drawback of this measure is that $\textup{EMBE}(\hat{X})$ is hard to compute for a given set of approximate solutions $\hat{X}$ because of its rather abstract definition. In [TVB20], Tisseur and Van Barel define the min-max element-wise backward error of $\hat{X}$ as

\[ \textup{TBE}(\hat{X}) = \max_{i=0,\ldots,d-1}\left|\frac{c_i-\hat{c}_i}{r_ic_i}\right|, \]

where $r_i\geq 1$ are constants that can be computed in linear time from the coefficients of $f$. The $r_i$ depend only on the tropical roots of $f$, which is why we will refer to this error measure as the tropical backward error (TBE). We will give a definition of the numbers $r_i$ in Section 3. The authors of [TVB20] also provide an algorithm that, under some assumptions on the numerical behavior of a modified QZ-algorithm (see [TVB20, Section 5, Assumption 1]), computes a set of approximate roots $\hat{X}$ satisfying $\textup{TBE}(\hat{X})=O(u)$.

In this paper, we investigate the relations between the TBE and the EMBE. In particular, we show that these error measures are equivalent under suitable assumptions. The following is a simplified version of our first main theorem.

Theorem 1.1.

Assume that the tropical roots of $f$ are of the same order of magnitude as the corresponding classical roots and that the $|\hat{x}_j|$ are of the same order of magnitude as the $|x_j|$. Then $\textup{EMBE}(\hat{X})=O(u)$ implies $\textup{TBE}(\hat{X})=O(u)$.

The strategy for proving Theorem 1.1 will also allow us to prove that, under the assumptions of the theorem, $\textup{EMBE}(\hat{X})=O(u)$ implies $\textup{NBE}(\hat{X})=O(u)$. This was proved in [MVD15] for the case $d=2$. Under some stronger assumptions, we also prove the reverse implication.

Theorem 1.2.

Assume that the tropical roots of $f$ are of the same order of magnitude as the corresponding classical roots and that the $|\hat{x}_j|$ are of the same order of magnitude as the $|x_j|$. Moreover, assume that for each $x_j\in X$ there are two terms $c_{\beta'}x_j^{\beta'}$ and $c_{\beta}x_j^{\beta}$ of $f(x_j)$ such that

\[ |c_ix_j^i| \ll |c_{\beta'}x_j^{\beta'}| \quad\text{and}\quad |c_ix_j^i| \ll |c_{\beta}x_j^{\beta}|, \quad\text{for all } i\neq\beta',\beta. \]

Then $\textup{TBE}(\hat{X})=O(u)$ implies $\textup{EMBE}(\hat{X})=O(u)$.

We will give numerical evidence that Theorem 1.2 holds without the extra assumption on the polynomial $f$. In summary, we have the following diagram, in which the arrows are implications.

\[
\begin{aligned}
&\textup{EBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{TBE}(\hat{X})=O(u), \qquad
\textup{EBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{EMBE}(\hat{X})=O(u),\\
&\textup{EMBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{TBE}(\hat{X})=O(u), \qquad
\textup{EMBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{NBE}(\hat{X})=O(u),\\
&\textup{TBE}(\hat{X})=O(u) \;\overset{*}{\Longrightarrow}\; \textup{EMBE}(\hat{X})=O(u).
\end{aligned} \tag{1}
\]

Here, the implications in the first line are obvious from the definitions. The implications in the second line are proved under the assumptions of Theorem 1.1. The starred implication in the last line represents Theorem 1.2, which uses stronger assumptions.

This article is organized as follows. In the next section we prove the equivalence of the tropical and element-wise mixed backward error measures in the case $d=2$. This can be seen as an extension of the analysis in [MVD15], and it is instructive for the general case. The proofs for general $d$ are given in Section 3. In Section 4 we show some computational experiments and give numerical evidence that Theorem 1.2 holds under weaker assumptions.

2 Backward Error for Quadratic Polynomials

In this section, we prove that the element-wise mixed backward error (EMBE) as introduced by Mastronardi and Van Dooren [MVD15] and the tropical backward error (TBE) from [TVB20] are equivalent backward error measures for the roots of a quadratic polynomial

\[ f = ax^2 + bx + c = a(x-x_1)(x-x_2). \]

For simplicity, we assume that $a,b,c\in\mathbb{C}^{*}=\mathbb{C}\setminus\{0\}$. For the approximate roots $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$ of $f$, $\textup{EMBE}(\hat{X})$ is the smallest number $\varepsilon\geq 0$ such that there exists $\tilde{X}=\{\tilde{x}_1,\tilde{x}_2\}$ with

\begin{align*}
\tilde{f} = a(x-\tilde{x}_1)(x-\tilde{x}_2) &= ax^2 + \tilde{b}x + \tilde{c},\\
|\hat{x}_1-\tilde{x}_1| \leq \varepsilon|\tilde{x}_1|, &\quad |\hat{x}_2-\tilde{x}_2| \leq \varepsilon|\tilde{x}_2|,\\
|b-\tilde{b}| \leq \varepsilon|b|, &\quad |c-\tilde{c}| \leq \varepsilon|c|.
\end{align*}

In analogy with [TVB20], we define $\textup{TBE}(\hat{X})$ to be the smallest number $\varepsilon\geq 0$ such that

\begin{align*}
\hat{f} = a(x-\hat{x}_1)(x-\hat{x}_2) &= ax^2 + \hat{b}x + \hat{c},\\
|b-\hat{b}| \leq r_b\varepsilon|b|, &\quad |c-\hat{c}| \leq \varepsilon|c|,
\end{align*}

where $r_b=\max(1,\sqrt{|ac|}/|b|)$. The definition of $r_b$ will be clarified in Section 3. For the approximate roots $\hat{x}_j$ and $\tilde{x}_j$ in these definitions, we will assume that the orders of magnitude of $|\tilde{x}_j|$ and $|\hat{x}_j|$ are the same as the order of magnitude of $|x_j|$ (that is, we allow relative errors of size at most 1). Note that, by the definition of the EMBE, it is sufficient that this holds for $|\hat{x}_j|$.
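To get a feel for the weight $r_b$, the following Python sketch evaluates it for two hypothetical quadratics (the numerical values are illustrative examples, not taken from the text): one with well-separated roots and one with nearly opposite roots of the same magnitude, where $b$ is tiny due to cancellation:

```python
import math

def r_b(a, b, c):
    """Weight on the middle coefficient: r_b = max(1, sqrt(|ac|)/|b|)."""
    return max(1.0, math.sqrt(abs(a * c)) / abs(b))

a = 1.0

# well-separated roots 1e-6 and 1e6: no relaxation on b is needed
x1, x2 = 1e-6, 1e6
print(r_b(a, -(x1 + x2), x1 * x2))   # 1.0

# nearly opposite roots of the same magnitude: b is tiny by cancellation,
# so errors of size r_b * u * |b| on b are allowed
x1, x2 = 1.0, -1.0 + 1e-6
print(r_b(a, -(x1 + x2), x1 * x2))   # of the order of 1e6
```

In the first case $\textup{TBE}$ coincides with the EBE condition on $b$; in the second case it relaxes it by a factor of about $10^{6}$.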

We will now relate these two error measures. Let $\varepsilon=\textup{EMBE}(\hat{X})$. We observe

\begin{align*}
\frac{\hat{c}-c}{c} &= \frac{\hat{c}-\tilde{c}}{c} + \frac{\tilde{c}-c}{c} = \frac{\hat{x}_1\hat{x}_2-\tilde{x}_1\tilde{x}_2}{x_1x_2} + \frac{\tilde{c}-c}{c}\\
&= \frac{(\tilde{x}_1+(\hat{x}_1-\tilde{x}_1))(\tilde{x}_2+(\hat{x}_2-\tilde{x}_2))-\tilde{x}_1\tilde{x}_2}{x_1x_2} + \frac{\tilde{c}-c}{c}.
\end{align*}

Since $|c-\tilde{c}|\leq\varepsilon|c|$, we have $\left|1-\frac{\tilde{x}_1\tilde{x}_2}{x_1x_2}\right|\leq\varepsilon$, and it follows that

\[ \left|\frac{\hat{c}-c}{c}\right| \leq (2\varepsilon+\varepsilon^2)\left|\frac{\tilde{x}_1\tilde{x}_2}{x_1x_2}\right| + \varepsilon \lesssim 3\varepsilon. \]

For the coefficient $\hat{b}$, we find in an analogous way that

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(1+\frac{|\tilde{x}_1|+|\tilde{x}_2|}{|x_1+x_2|}\right). \tag{2} \]

If the solutions $x_1$ and $x_2$ have different orders of magnitude, no cancellation occurs in the denominator of the right-hand side of (2), which implies $\left|\frac{\hat{b}-b}{b}\right|\lesssim 2\varepsilon$. However, if both solutions have the same order of magnitude, the factor multiplying $\varepsilon$ may be significantly larger than 1 due to cancellation. We will now make this precise and relate it to the number $r_b$. We define $\gamma=x_1/x_2$. By the assumption that $|\tilde{x}_i|$ is of the same order of magnitude as $|x_i|$, (2) can be written as

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(1+K\frac{|\gamma|+1}{|\gamma+1|}\right) \tag{3} \]

with $K$ a small constant. We assume, without loss of generality, that $0<|\gamma|\leq 1$. Note that for $0<|\gamma|<1$ we have the inequality

\[ \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{\big||\gamma|-1\big|} = \frac{|\gamma|+1}{1-|\gamma|}, \tag{4} \]

which follows from $|x-y|\geq\big||x|-|y|\big|$ for all $x,y\in\mathbb{C}$, applied to $x=\gamma$, $y=-1$. First, assume that

\[ \frac{\sqrt{|ac|}}{|b|} = \frac{\sqrt{|\gamma|}}{|\gamma+1|} \leq 1. \]

In this case, we have

\[ \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{\sqrt{|\gamma|}}. \tag{5} \]

Now, assume

\[ \frac{\sqrt{|ac|}}{|b|} = \frac{\sqrt{|\gamma|}}{|\gamma+1|} \geq 1. \]

In this case

\[ \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|ac|}}{|b|}\right)^{-1} \leq \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{1-|\gamma|}. \tag{6} \]

Also,

\[ \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|ac|}}{|b|}\right)^{-1} = \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|\gamma|}}{|\gamma+1|}\right)^{-1} = \frac{|\gamma|+1}{\sqrt{|\gamma|}}. \tag{7} \]

Using (4)-(7), we find that

\[ \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right), \]

which gives

\[ \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \begin{cases}\dfrac{\alpha+1}{1-\alpha}=\sqrt{5}, & 0<|\gamma|\leq\alpha,\\[1ex] \dfrac{\alpha+1}{\sqrt{\alpha}}=\sqrt{5}, & \alpha\leq|\gamma|\leq 1,\end{cases} \quad\Rightarrow\quad \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \sqrt{5}, \]

where $\alpha=\frac{3}{2}-\frac{\sqrt{5}}{2}$. This is illustrated in Figure 1.

Figure 1: The value of $\min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right)$ for $0\leq|\gamma|\leq 1$.
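The bound $\sqrt{5}$ can be checked numerically. The following Python sketch evaluates $\min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right)$ on a grid of real values $0<|\gamma|<1$ (a small verification of the case analysis, not part of the proof):

```python
import math

alpha = 1.5 - math.sqrt(5) / 2        # crossover point, about 0.382

def bound(g):
    # min of the two upper bounds derived in (4)-(7), for 0 < g < 1
    first = (g + 1) / (1 - g)
    second = (g + 1) / math.sqrt(g)
    return min(first, second)

worst = max(bound(k / 10000) for k in range(1, 10000))
print(worst)          # stays below sqrt(5) ~ 2.236
print(bound(alpha))   # both branches meet at sqrt(5) exactly at alpha
```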

It follows immediately from this observation and (3) that

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(\sqrt{5}K+r_b^{-1}\right)r_b \leq \varepsilon\left(\sqrt{5}K+1\right)r_b. \]

Figure 2 illustrates the values of $r_b$ and $\frac{|\gamma|+1}{|\gamma+1|}r_b^{-1}$ as a function of $\gamma$ in the unit disk.

Figure 2: Left: the value of $\log(r_b)$ as a function of $\gamma$ for $0\leq|\gamma|\leq 1$. Right: the value of $\frac{|\gamma|+1}{|\gamma+1|}r_b^{-1}$ for $0\leq|\gamma|\leq 1$.

This shows that element-wise mixed backward stability implies tropical backward stability. The converse also holds, as we will now show. Let $\textup{TBE}(\hat{X})=\varepsilon$ for the computed roots $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$. If $r_b=1$, then we can take $\tilde{x}_1=\hat{x}_1$ and $\tilde{x}_2=\hat{x}_2$, so that $\textup{EMBE}(\hat{X})\leq\textup{TBE}(\hat{X})$. Suppose now that $r_b=\sqrt{|ac|}/|b|>1$ and assume without loss of generality that $|\hat{x}_1|\leq|\hat{x}_2|$. $\textup{TBE}(\hat{X})=\varepsilon$ implies that there exists $\hat{\delta}_b\in\mathbb{C}$ with $|\hat{\delta}_b|\leq\varepsilon$ such that

\[ -\hat{b} = a(\hat{x}_1+\hat{x}_2) = -b(1+r_b\hat{\delta}_b). \]

Let $\tilde{x}_1=\hat{x}_1+\frac{b}{a}r_b\hat{\delta}_b$ and $\tilde{x}_2=\hat{x}_2$. Then we have

\[ -\tilde{b} = a(\tilde{x}_1+\tilde{x}_2) = a\left(\hat{x}_1+\tfrac{b}{a}r_b\hat{\delta}_b+\hat{x}_2\right) = a(\hat{x}_1+\hat{x}_2)+br_b\hat{\delta}_b = -b(1+r_b\hat{\delta}_b)+br_b\hat{\delta}_b = -b. \]

Note that

\[ |\tilde{x}_1-\hat{x}_1| = \left|\frac{b}{a}r_b\hat{\delta}_b\right| = \sqrt{\left|\frac{c}{a}\right|}\,|\hat{\delta}_b| = \sqrt{|x_1x_2|}\,|\hat{\delta}_b| \lesssim |\tilde{x}_1|\varepsilon. \]

We also have

\[ \tilde{c}-\hat{c} = a\tilde{x}_1\tilde{x}_2-a\hat{x}_1\hat{x}_2 = a\left(\hat{x}_1+\tfrac{b}{a}r_b\hat{\delta}_b\right)\hat{x}_2 - a\hat{x}_1\hat{x}_2 = br_b\hat{\delta}_b\hat{x}_2, \]

from which we get

\[ |\tilde{c}-\hat{c}| = |a|\left|\frac{b}{a}r_b\hat{\delta}_b\hat{x}_2\right| \lesssim \varepsilon|a||\tilde{x}_1\tilde{x}_2| = \varepsilon|\tilde{c}|. \]

We conclude that tropical backward stability implies element-wise mixed backward stability.
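The construction above can be verified numerically. The following Python sketch uses a hypothetical quadratic ($a=1$, $b=-10^{-6}$, $c=-1$, so $r_b\approx 10^{6}$) and simulated computed roots; it absorbs the whole residual on $b$ into $\tilde{x}_1$ exactly as in the text:

```python
import math

# hypothetical quadratic with nearly opposite roots: r_b = sqrt(|ac|)/|b| >> 1
a, b, c = 1.0, -1e-6, -1.0
rb = max(1.0, math.sqrt(abs(a * c)) / abs(b))

# exact roots, then simulated computed roots with tiny relative perturbations
disc = math.sqrt(b * b - 4 * a * c)
x1, x2 = (-b - disc) / (2 * a), (-b + disc) / (2 * a)
xh1, xh2 = x1 * (1 + 3e-12), x2 * (1 - 2e-12)

# residual on b, scaled as in the definition of the TBE
bh = -a * (xh1 + xh2)
delta = (bh - b) / (rb * b)              # plays the role of delta_b-hat

# the construction from the text: x~1 = x^1 + (b/a) r_b delta, x~2 = x^2
xt1, xt2 = xh1 + (b / a) * rb * delta, xh2
bt = -a * (xt1 + xt2)

print(abs(bt - b) / abs(b))              # b is recovered up to round-off
print(abs(xt1 - xh1) / abs(xt1))         # the root moves by a relative O(eps) only
```

Although the computed roots carry a large relative error on $b$ after cancellation, the modified root $\tilde{x}_1$ restores $b$ while moving only by a relative amount of the order of $|\hat{\delta}_b|$.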

Remark 1.

We assumed $a,b,c\in\mathbb{C}^{*}$ in this discussion. If $a=0$, we are solving a linear equation and there is nothing to prove. If $c=0$, the root $x_1=0$ can be deflated and we are again left with a linear equation. If $b=0$, a similar derivation can be made. We omit the details but give a brief outline. We replace the conditions on $\tilde{b}$ and $\hat{b}$ in the definitions of $\textup{EMBE}(\hat{X})$ and $\textup{TBE}(\hat{X})$ by $|\tilde{b}|\leq\varepsilon$ and $|\hat{b}|\leq r_b\varepsilon$, respectively, where $r_b=\max(1,\sqrt{|ac|})=\max(1,|ax_1|)$ (note that in this case $|x_1|=|x_2|$). One derives bounds in a similar way for the implication $\textup{EMBE}(\hat{X})=\varepsilon\Rightarrow\textup{TBE}(\hat{X})=O(\varepsilon)$. For the other implication, there is again nothing to prove when $r_b=1$. When $r_b=|ax_1|$, we observe that we can write $\hat{b}=-r_b\hat{\delta}_b$ with $|\hat{\delta}_b|\leq\varepsilon$, and we set $\tilde{x}_1=\hat{x}_1+a^{-1}r_b\hat{\delta}_b$ and $\tilde{x}_2=\hat{x}_2$.

We arrive at the following statement.

Proposition 2.1.

Let $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$ be a set of approximations for the roots $X=\{x_1,x_2\}$ of a quadratic polynomial $f=ax^2+bx+c\in\mathbb{C}[x]$, where $a,c\neq 0$. Under the assumption that $|\hat{x}_i|$ has the same order of magnitude as $|x_i|$, $i=1,2$, we have that

\[ \textup{EMBE}(\hat{X})=O(u) \quad\text{if and only if}\quad \textup{TBE}(\hat{X})=O(u). \]

3 Backward Error for Polynomials of General Degree

We now generalize the results of the previous section to polynomials of arbitrary degree $d$. In the following, let $f=\sum_{i=0}^{d}c_ix^i\in\mathbb{C}[x]$, with $c_0,c_d\neq 0$, be a polynomial of degree $d$ with roots $X=\{x_1,\ldots,x_d\}\subset\mathbb{C}^{*}$, and let $\hat{X}=\{\hat{x}_1,\ldots,\hat{x}_d\}\subset\mathbb{C}^{*}$ be the approximate roots. Without loss of generality, we assume that the roots are labeled such that $|x_1|\leq|x_2|\leq\cdots\leq|x_d|$.

We first formally generalize the notion of the element-wise mixed backward error due to Mastronardi and Van Dooren [MVD15] from quadratic polynomials to general polynomials of degree dd.

Definition 3.1 (Element-wise mixed backward error).

The element-wise mixed backward error of $\hat{X}$, denoted $\textup{EMBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that there exist points $\tilde{X}=\{\tilde{x}_1,\ldots,\tilde{x}_d\}\subset\mathbb{C}^{*}$ satisfying $|\hat{x}_j-\tilde{x}_j|\leq\varepsilon|\tilde{x}_j|$, $j=1,\ldots,d$, and

\[ \begin{cases}|c_i-\tilde{c}_i|\leq\varepsilon|c_i|, & c_i\neq 0,\\ |c_i-\tilde{c}_i|\leq\varepsilon, & c_i=0,\end{cases} \]

for all $i=0,\ldots,d-1$, where the $\tilde{c}_i$ are the coefficients of

\[ \tilde{f} = c_d\prod_{j=1}^{d}(x-\tilde{x}_j) = c_dx^d+\sum_{i=0}^{d-1}\tilde{c}_ix^i. \]

Before we can state the generalization of the tropical backward error to degree $d$ polynomials, we need to introduce some definitions and concepts. We define the tropical polynomial $\mathrm{t}f(\tau)$ associated to $f(x)$ as

\[ \mathrm{t}f:\mathbb{R}\cup\{-\infty\}\rightarrow\mathbb{R}\cup\{-\infty\},\quad \tau\mapsto\max_{0\leq i\leq d}\,(v_i+i\tau), \tag{8} \]

where $v_i=\log|c_i|$. The number $v_i$ is the valuation of $c_i$ under the valuation map $\log|\cdot|$. Any base for the logarithm can be used in theory. We want to think of the image under the valuation map as the ‘order of magnitude’ of the modulus of a complex number. In this paper, when we state that $\log|c|\approx 0$, $c\in\mathbb{C}$, we mean that $|c|$ is of order 1. In tropical geometry, the map $\log|\cdot|$ is referred to as an Archimedean valuation. For a general introduction to tropical geometry we refer to [Sha11, MS15].

Figure 3: Consider $f(x)=\frac{1}{2}x^7+3x^6+\frac{1}{2}x^5+5x^4+\frac{5}{2}x^3+3x^2+6x+\frac{5}{2}$. The figure depicts the upper convex hull of the lifted Newton polytope, the geometric derivation of the $r_i$ values, and the induced subdivision $\Delta=\{(\beta_0=0,\beta_1=1),(1,\beta_2=4),(4,\beta_3=6),(6,\beta_4=7)\}$.

The Newton polytope of $f$ is the line segment $[0,d]\subset\mathbb{R}$. The convex hull of the points $\{(i,v_i)\}_{0\leq i\leq d}\subset\mathbb{R}^2$ is called the lifted Newton polytope. We will consider the upper hull of the lifted Newton polytope. For a specific example, this is shown as a solid black line in Figure 3. The vertices of this upper hull are the points $(\beta_\ell,v_{\beta_\ell})$, $\ell=0,\ldots,s$, with

\[ 0=\beta_0<\beta_1<\cdots<\beta_s=d. \]

We call the set $\Delta=\{(\beta_0,\beta_1),(\beta_1,\beta_2),\ldots,(\beta_{s-1},\beta_s)\}$ the subdivision induced by the coefficients $c_i$, or the induced subdivision for short. We say that a point $\tau\in\mathbb{R}\cup\{-\infty\}$ is a root of $\mathrm{t}f$ if the maximum in (8) is attained at least twice. A root of $\mathrm{t}f$ is called a tropical root of $f$. The multiplicity of a root $\tau$ of $\mathrm{t}f$ is the number $\beta_\ell-\beta_{\ell-1}$, where $\beta_\ell$ and $\beta_{\ell-1}$ are the largest, respectively smallest, value of $i$ for which $v_i+i\tau$ is maximal. Counted with multiplicity, $\mathrm{t}f$ has $d$ roots $\tau_1,\ldots,\tau_d$, and we can give a closed formula for them. For $\beta_{\ell-1}<i\leq\beta_\ell$ we have

\[ \tau_i = \frac{1}{m_\ell}\left(v_{\beta_{\ell-1}}-v_{\beta_\ell}\right) = \log\left|\frac{c_{\beta_{\ell-1}}}{c_{\beta_\ell}}\right|^{\frac{1}{m_\ell}}, \]

where $m_\ell=\beta_\ell-\beta_{\ell-1}$ is the multiplicity. In particular, the definition implies

\[ \tau_1=\cdots=\tau_{\beta_1}<\tau_{\beta_1+1}=\cdots=\tau_{\beta_2}<\cdots<\tau_{\beta_{s-1}+1}=\cdots=\tau_{\beta_s}=\tau_d. \]

Tropical roots of polynomials are used for scaling (matrix) polynomial eigenvalue problems, see for instance [BNS13, GS09, NST15].

Furthermore, for $i=0,\ldots,d$ we define the constants

\[ r_i = \begin{cases}1, & i=\beta_\ell\text{ for some }\ell,\\ \exp(v_{\beta_\ell}+(\beta_\ell-i)\tau_i-v_i), & \beta_{\ell-1}<i<\beta_\ell\text{ for some }\ell\text{ and }v_i\in\mathbb{R},\\ \exp(v_{\beta_\ell}+(\beta_\ell-i)\tau_i), & \beta_{\ell-1}<i<\beta_\ell\text{ for some }\ell\text{ and }v_i=-\infty.\end{cases} \]

Geometrically, if $c_i\neq 0$, then $\log r_i$ is the distance of $v_i=\log|c_i|$ to the upper convex hull of the lifted Newton polytope. Figure 3 illustrates these concepts. Note that $\tau_i$, $\beta_{\ell-1}<i\leq\beta_\ell$, is minus the slope of the line connecting $(\beta_{\ell-1},v_{\beta_{\ell-1}})$ and $(\beta_\ell,v_{\beta_\ell})$. We note that the complexity of computing the tropical roots of $f$ is linear in the degree $d$; see e.g. [Sha11, Proposition 2.2.1].
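The quantities introduced above can be computed in a few lines. The following Python sketch (an independent illustration, not the algorithm of [TVB20]) computes the induced subdivision, the tropical roots and the weights $r_i$ for the example polynomial of Figure 3:

```python
import math

# Figure 3 example: f = 1/2 x^7 + 3 x^6 + 1/2 x^5 + 5 x^4 + 5/2 x^3 + 3 x^2 + 6 x + 5/2
c = [2.5, 6.0, 3.0, 2.5, 5.0, 0.5, 3.0, 0.5]    # c[i] multiplies x^i
v = [math.log(abs(ci)) for ci in c]             # valuations v_i = log|c_i|

def upper_hull(v):
    """Indices of the vertices of the upper convex hull of {(i, v_i)}."""
    hull = [0]
    for i in range(1, len(v)):
        while len(hull) >= 2:
            j, k = hull[-2], hull[-1]
            # drop k if it lies on or below the segment from j to i
            if (v[k] - v[j]) * (i - k) <= (v[i] - v[k]) * (k - j):
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

beta = upper_hull(v)
print(beta)                                      # [0, 1, 4, 6, 7], as in Figure 3

taus, r = [], [1.0] * len(c)
for b0, b1 in zip(beta, beta[1:]):
    m = b1 - b0                                  # multiplicity of this tropical root
    tau = (v[b0] - v[b1]) / m
    taus += [tau] * m
    for i in range(b0 + 1, b1):                  # log r_i = distance to the hull
        r[i] = math.exp(v[b1] + (b1 - i) * tau - v[i])

print([round(t, 3) for t in taus])               # increasing, counted with multiplicity
print([round(ri, 2) for ri in r])                # r_i = 1 at hull vertices, > 1 inside
```

The hull pass is linear in the degree once the points are given in order, matching the complexity remark above.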

Definition 3.2 (Tropical backward error).

The tropical backward error of $\hat{X}$, denoted $\textup{TBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that for all $i=0,\ldots,d-1$

\[ \begin{cases}|c_i-\hat{c}_i|\leq r_i\varepsilon|c_i|, & c_i\neq 0,\\ |c_i-\hat{c}_i|\leq r_i\varepsilon, & c_i=0,\end{cases} \]

where the $\hat{c}_i$ are the coefficients of

\[ \hat{f} = c_d\prod_{j=1}^{d}(x-\hat{x}_j) = c_dx^d+\sum_{i=0}^{d-1}\hat{c}_ix^i. \]

In the following, we show that an element-wise mixed backward error of the order of the machine precision implies a tropical backward error of the same order, and that, under suitable assumptions, the converse holds as well. As in Section 2 for the quadratic case, we assume in the following that $|x_j|$, $|\hat{x}_j|$ and $|\tilde{x}_j|$ have the same order of magnitude for $j=1,\ldots,d$.

3.1 Element-wise Mixed Backward Stability implies Tropical Backward Stability

We start by showing that $\textup{EMBE}(\hat{X})=\varepsilon$ implies $\textup{TBE}(\hat{X})=O(\varepsilon)$. The coefficients of the polynomial $f$ can be considered as functions of the roots $x_1,\ldots,x_d$. Let

\[ \sigma_k(x_1,\ldots,x_d) = \sum_{|I|=k}\prod_{i\in I}x_i \]

be the $k$-th elementary symmetric polynomial. We have the identity

\[ f = c_d\prod_{i=1}^{d}(x-x_i) = c_d\sum_{k=0}^{d}(-1)^{d-k}\sigma_{d-k}(x_1,\ldots,x_d)\,x^k. \]
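This identity (Vieta's formulas) is easy to verify numerically. The following Python sketch, with hypothetical roots and leading coefficient, compares the coefficients obtained from the elementary symmetric polynomials with a direct expansion of the product:

```python
from itertools import combinations
from math import prod

def sigma(k, xs):
    """k-th elementary symmetric polynomial of the roots xs."""
    return sum(prod(sub) for sub in combinations(xs, k))

# hypothetical roots and leading coefficient
roots, cd = [1.0, -2.0, 0.5, 3.0], 2.0
d = len(roots)

# c_k = cd * (-1)^(d-k) * sigma_{d-k}(roots), increasing-degree order
coeffs = [cd * (-1) ** (d - k) * sigma(d - k, roots) for k in range(d + 1)]

def mul_linear(p, r):
    """Multiply p (increasing-degree coefficients) by (x - r)."""
    q = [0.0] + p
    for i, pi in enumerate(p):
        q[i] -= r * pi
    return q

# cross-check by expanding cd*(x - x1)...(x - xd) directly
expanded = [cd]
for root in roots:
    expanded = mul_linear(expanded, root)

print(coeffs)
print(expanded)   # the two coefficient lists agree
```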
Lemma 3.1.

If $\textup{EMBE}(\hat{X})=\varepsilon$, then the coefficients of

\[ \hat{f} = c_dx^d+\sum_{i=0}^{d-1}\hat{c}_ix^i = c_d(x-\hat{x}_1)\cdots(x-\hat{x}_d) \]

satisfy

\begin{align*}
\left|\frac{\hat{c}_i-c_i}{c_i}\right| &\leq \varepsilon\left(1+(d-i)\frac{\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+O(\varepsilon)\right) &&\text{when } c_i\neq 0,\\
|\hat{c}_i-c_i| &\leq \varepsilon\left(1+|c_d|(d-i)\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)+O(\varepsilon)\right) &&\text{when } c_i=0.
\end{align*}
Proof.

Suppose $c_i\neq 0$. We have

\[ \frac{\hat{c}_i-c_i}{c_i} = \frac{\hat{c}_i-\tilde{c}_i}{c_i}+\frac{\tilde{c}_i-c_i}{c_i}. \]

This implies

\begin{align*}
\left|\frac{\hat{c}_i-c_i}{c_i}\right| &\leq \left|\frac{\hat{c}_i-\tilde{c}_i}{c_i}\right|+\left|\frac{\tilde{c}_i-c_i}{c_i}\right|\\
&\leq \frac{|\sigma_{d-i}(\hat{x}_1,\ldots,\hat{x}_d)-\sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)|}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+\varepsilon\\
&= \frac{|\sigma_{d-i}(\tilde{x}_1(1+\hat{\delta}_1),\ldots,\tilde{x}_d(1+\hat{\delta}_d))-\sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)|}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+\varepsilon,
\end{align*}

where $|\hat{\delta}_i|\leq\varepsilon$, $i=1,\ldots,d$. Note that the second inequality and the equality both use $\textup{EMBE}(\hat{X})=\varepsilon$. We now observe

\[ \sigma_{d-i}(\tilde{x}_1(1+\hat{\delta}_1),\ldots,\tilde{x}_d(1+\hat{\delta}_d)) = \sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)+\sum_{|I|=d-i}\Big(\sum_{j\in I}\hat{\delta}_j\Big)\prod_{j\in I}\tilde{x}_j+\text{h.o.t.}, \]

where the ‘higher order terms’ contain at least two of the $\hat{\delta}_i$. This, together with the triangle inequality, shows

\[ \left|\frac{\hat{c}_i-c_i}{c_i}\right| \leq \varepsilon(d-i)\frac{\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+O(\varepsilon^2)+\varepsilon. \]

The case $c_i=0$ is completely analogous. ∎

It is well known that the values $\exp(\tau_i)$ are related to the moduli of the classical roots $x_i$; see for instance [Sha11]. In what follows, we will make the assumption that the order of magnitude of $\exp(\tau_i)$ is equal to that of $|x_i|$ (and, by our previous assumption, also to that of $|\hat{x}_i|$ and $|\tilde{x}_i|$). Under this assumption, we have for $\beta_{\ell-1}<i\leq\beta_\ell$ that

\[ \sigma_{d-i}(|x_1|,\ldots,|x_d|) = \sum_{|I|=d-i}\prod_{j\in I}|x_j| = D_i\binom{m_\ell}{\beta_\ell-i}\prod_{k=0}^{d-i-1}\exp(\tau_{d-k}), \]

with $D_i$ a not too large constant and

\[ \binom{m_\ell}{\beta_\ell-i} = \frac{m_\ell!}{(\beta_\ell-i)!\,(i-\beta_{\ell-1})!} \]

the binomial coefficient. This can be seen as follows. The important terms in the expansion of $\sigma_{d-i}(|x_1|,\ldots,|x_d|)$ are those containing $d-i$ large roots. By the ordering of the roots by their moduli, these terms have the order of magnitude of $\prod_{k=0}^{d-i-1}\exp(\tau_{d-k})$. We can assume that each of these terms contains the $m_{\ell+1}+\cdots+m_s\leq d-i$ largest roots. For the remaining factors, we can choose $\beta_\ell-i$ roots among $\{x_{\beta_{\ell-1}+1},\ldots,x_{\beta_\ell}\}$.
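This counting argument can be illustrated numerically. In the following Python sketch, the root moduli are hypothetical and chosen so that $i=2$ lies strictly inside a segment of the subdivision; the constant $D_i$ then comes out close to 1:

```python
from itertools import combinations
from math import comb, prod

def sigma(k, xs):
    """k-th elementary symmetric polynomial (here applied to moduli)."""
    return sum(prod(s) for s in combinations(xs, k))

# root moduli grouped by order of magnitude: exp(tau) = 1e-3, 1 (twice), 100
mods = [1e-3, 1.0, 1.0, 100.0]
d = len(mods)

# i = 2 lies strictly inside the segment (beta_{l-1}, beta_l) = (1, 3), m_l = 2
i, beta_prev, beta_next = 2, 1, 3
m = beta_next - beta_prev

# dominant contribution: the d - i largest moduli, with binom(m, beta_l - i)
# choices for the factors taken from the middle group
dominant = comb(m, beta_next - i) * prod(sorted(mods)[i:])
D = sigma(d - i, mods) / dominant
print(D)   # close to 1: the estimate captures the order of magnitude
```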

By our assumption that $|\tilde{x}_i|\approx|x_i|$, we have

\[ \sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|) = \sum_{|I|=d-i}\prod_{j\in I}|\tilde{x}_j| = \tilde{D}_i\binom{m_\ell}{\beta_\ell-i}\prod_{k=0}^{d-i-1}\exp(\tau_{d-k}), \]

with $\tilde{D}_i$ a not too large constant. Taking valuations on both sides of this equality, we get

\[ \log|\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)| = w_i+\sum_{k=0}^{d-i-1}\tau_{d-k}, \tag{9} \]

where $w_i=\log\left|\tilde{D}_i\binom{m_\ell}{\beta_\ell-i}\right|$ and $\exp(w_i)$ is a not too large positive number. Equation (9) is the assumption we will use in the next theorem.

Theorem 3.1.

If \textup{EMBE}(\hat{X})=\varepsilon, the order of magnitude of |x_{i}| equals that of |\hat{x}_{i}| and of \exp(\tau_{i}) for i=1,\ldots,d, and (9) holds, then the coefficients of

f^=cdxd+i=0d1c^ixi=cd(xx^1)(xx^d)\hat{f}=c_{d}x^{d}+\sum_{i=0}^{d-1}\hat{c}_{i}x^{i}=c_{d}(x-\hat{x}_{1})\cdots(x-\hat{x}_{d})

satisfy

|c^icici|\displaystyle\left|\frac{\hat{c}_{i}-c_{i}}{c_{i}}\right| ε(1+(di)exp(wi)ri+O(ε))when ci0\displaystyle\leq\varepsilon\left(1+(d-i)\exp(w_{i})r_{i}+O(\varepsilon)\right)\quad\text{when }c_{i}\neq 0
and
|c^ici|\displaystyle|\hat{c}_{i}-c_{i}| ε(1+(di)exp(wi)ri+O(ε))when ci=0.\displaystyle\leq\varepsilon\left(1+(d-i)\exp(w_{i})r_{i}+O(\varepsilon)\right)\quad\text{when }c_{i}=0.

In particular, TBE(X^)εmaxi(1+(di)exp(wi)+O(ε))\textup{TBE}(\hat{X})\leq\varepsilon\max_{i}(1+(d-i)\exp(w_{i})+O(\varepsilon)).

Proof.

For β1iβ\beta_{\ell-1}\leq i\leq\beta_{\ell}, assume ci0c_{i}\neq 0. Note that

\log|r_{i}| =v_{\beta_{\ell}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{\beta_{\ell+1}}+m_{\ell+1}\tau_{\beta_{\ell+1}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{\beta_{\ell+2}}+m_{\ell+2}\tau_{\beta_{\ell+2}}+m_{\ell+1}\tau_{\beta_{\ell+1}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=\ldots
=v_{\beta_{s}}+\sum_{k=\ell+1}^{s}m_{k}\tau_{\beta_{k}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{d}+\sum_{k=0}^{d-i-1}\tau_{d-k}-v_{i}.

Now, using (9) and vi=vd+log|σdi(x1,,xd)|v_{i}=v_{d}+\log|\sigma_{d-i}(x_{1},\ldots,x_{d})| we get

log|ri|=log|σdi(|x~1|,,|x~d|)|log|σdi(x1,,xd)|wi.\log|r_{i}|=\log|\sigma_{d-i}(|\tilde{x}_{1}|,\ldots,|\tilde{x}_{d}|)|-\log|\sigma_{d-i}(x_{1},\ldots,x_{d})|-w_{i}.

Therefore

exp(wi)ri=exp(wi+log|ri|)=σdi(|x~1|,,|x~d|)|σdi(x1,,xd)|\exp(w_{i})r_{i}=\exp(w_{i}+\log|r_{i}|)=\frac{\sigma_{d-i}(|\tilde{x}_{1}|,\ldots,|\tilde{x}_{d}|)}{|\sigma_{d-i}(x_{1},\ldots,x_{d})|}

and we are done by Lemma 3.1. The proof for ci=0c_{i}=0 is analogous. ∎

In [MVD15], Mastronardi and Van Dooren show that, for d=2d=2, EMBE(X^)=O(u)\textup{EMBE}(\hat{X})=O(u) implies NBE(X^)=O(u)\textup{NBE}(\hat{X})=O(u), where NBE(X^)\textup{NBE}(\hat{X}) is the norm-wise backward error as defined in the introduction. Theorem 3.1 allows us to prove this statement for general degrees.

Proposition 3.1.

If EMBE(X^)=ε\textup{EMBE}(\hat{X})=\varepsilon and the assumptions of Theorem 3.1 are satisfied, we have that

(c0,,cd1,cd)(c^0,,c^d1,cd)2=O(ε)(c0,,cd1,cd)2.\lVert{(c_{0},\ldots,c_{d-1},c_{d})-(\hat{c}_{0},\ldots,\hat{c}_{d-1},c_{d})}\rVert_{2}=O(\varepsilon)\lVert{(c_{0},\ldots,c_{d-1},c_{d})}\rVert_{2}.
Proof.

Suppose that ci0c_{i}\neq 0. We have

|cic^i|\displaystyle|c_{i}-\hat{c}_{i}| |cic~i|+|c^ic~i|.\displaystyle\leq|c_{i}-\tilde{c}_{i}|+|\hat{c}_{i}-\tilde{c}_{i}|.

Since \textup{EMBE}(\hat{X})=\varepsilon, we have that |c_{i}-\tilde{c}_{i}|\leq\varepsilon|c_{i}|. Combining this with Theorem 3.1, we have that

|c^ic~i|=O(ε)|ci|ri.|\hat{c}_{i}-\tilde{c}_{i}|=O(\varepsilon)|c_{i}|r_{i}.

Hence, we obtain the bound

|cic^i|ε|ci|+O(ε)|ci|ri.|c_{i}-\hat{c}_{i}|\leq\varepsilon|c_{i}|+O(\varepsilon)|c_{i}|r_{i}.

Analogously, when ci=0c_{i}=0 we obtain

|cic^i|ε+O(ε)ri.|c_{i}-\hat{c}_{i}|\leq\varepsilon+O(\varepsilon)r_{i}.

It follows that

\lVert(c_{0}-\hat{c}_{0},\ldots,c_{d-1}-\hat{c}_{d-1},c_{d}-c_{d})\rVert_{2}\leq\varepsilon\lVert(c_{0},\ldots,c_{d})\rVert_{2}+O(\varepsilon)\lVert(r_{0}c_{0},\ldots,r_{d}c_{d})\rVert_{2}
\leq\varepsilon\lVert(c_{0},\ldots,c_{d})\rVert_{2}+O(\varepsilon)\sqrt{d+1}\lVert(c_{0},\ldots,c_{d})\rVert_{2},

where the last inequality follows from

\max_{i=0,\ldots,d}|c_{i}|\leq\lVert(c_{0},\ldots,c_{d})\rVert_{2}\leq\sqrt{d+1}\max_{i=0,\ldots,d}|c_{i}|,
\max_{i=0,\ldots,d}|c_{i}|\leq\lVert(r_{0}c_{0},\ldots,r_{d}c_{d})\rVert_{2}\leq\sqrt{d+1}\max_{i=0,\ldots,d}|c_{i}|

because r_{i}=1 for i=\operatorname{argmax}_{\ell=0,\ldots,d}|c_{\ell}|. ∎

We note that the proof of Proposition 3.1 can be summarized as

EMBE(X^)=εTheorem 3.1TBE(X^)=O(ε)NBE(X^)=O(ε).\textup{EMBE}(\hat{X})=\varepsilon\overset{\text{Theorem \ref{thm:EMBEimpliesTBE}}}{\Longrightarrow}\textup{TBE}(\hat{X})=O(\varepsilon)\Longrightarrow\textup{NBE}(\hat{X})=O(\varepsilon).
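The chain of implications can be illustrated numerically. The Python sketch below (hypothetical data, monic normalization, with numpy.poly rebuilding coefficients from roots) perturbs each root by a relative \varepsilon and measures the norm-wise relative change of the coefficient vector:

```python
import numpy as np
rng = np.random.default_rng(1)

eps = 1e-8                              # size of the relative root perturbation
x = np.array([0.1, 1.0, 2.0, 10.0])     # hypothetical well-separated roots
x_hat = x * (1 + eps * rng.uniform(-1, 1, x.size))
c = np.poly(x)                          # coefficients of prod (t - x_i), monic
c_hat = np.poly(x_hat)
nbe = np.linalg.norm(c - c_hat) / np.linalg.norm(c)
print(nbe)                              # of order eps
```

For well-conditioned roots such as these, the measured norm-wise error stays of the same order as the root perturbation.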

3.2 Tropical Backward Stability implies Element-wise Backward Stability?

We now show that a tropical backward error of order ε\varepsilon also implies a mixed element-wise backward error of the same magnitude under some assumptions. For this, consider the perturbed polynomials

f^=f+Δ^f\displaystyle\hat{f}=f+\hat{\Delta}f =i=0d1ci(1+riδi)xi+cdxd\displaystyle=\sum_{i=0}^{d-1}c_{i}(1+r_{i}\delta_{i})x^{i}+c_{d}x^{d}
and
f~=f+Δ~f\displaystyle\tilde{f}=f+\tilde{\Delta}f =i=0d1ci(1+κiδie1θi)xi+cdxd\displaystyle=\sum_{i=0}^{d-1}c_{i}(1+\kappa_{i}\delta_{i}e^{\sqrt{-1}\theta_{i}})x^{i}+c_{d}x^{d} (10)

where \log|\delta_{i}|\approx\log|\varepsilon|=v_{\varepsilon} and \kappa_{i}\in\mathbb{R}, \theta_{i}\in[0,2\pi) are parameters. We assume that \kappa_{i} is not too large, i.e. \log|\kappa_{i}|\approx 0, so that \tilde{\Delta}f is a ‘small’ perturbation. Observe that for the roots \hat{x}_{j}=x_{j}+\hat{\Delta}x_{j} of \hat{f} we have

0=(f+Δ^f)(xj+Δ^xj)=f(xj+Δ^xj)+Δ^f(xj+Δ^xj)=f(xj)+f(xj)Δ^xj+f′′(xj)2Δ^xj2++f(d)(xj)d!Δ^xjd+Δ^f(xj)+Δ^f(xj)Δ^xj+Δ^f′′(xj)2Δ^xj2++Δ^f(d)(xj)d!Δ^xjd.\begin{array}[]{rl}0&=(f+\hat{\Delta}f)(x_{j}+\hat{\Delta}x_{j})\\ &=f(x_{j}+\hat{\Delta}x_{j})+\hat{\Delta}f(x_{j}+\hat{\Delta}x_{j})\\ &=f(x_{j})+f^{\prime}(x_{j})\hat{\Delta}x_{j}+\frac{f^{\prime\prime}(x_{j})}{2}\hat{\Delta}x_{j}^{2}+\cdots+\frac{f^{(d)}(x_{j})}{d!}\hat{\Delta}x_{j}^{d}\\ &\phantom{=}+\hat{\Delta}f(x_{j})+\hat{\Delta}f^{\prime}(x_{j})\hat{\Delta}x_{j}+\frac{\hat{\Delta}f^{\prime\prime}(x_{j})}{2}\hat{\Delta}x_{j}^{2}+\cdots+\frac{\hat{\Delta}f^{(d)}(x_{j})}{d!}\hat{\Delta}x_{j}^{d}.\end{array} (11)

From this we conclude that Δ^xj\hat{\Delta}x_{j} is a root of the polynomial

E^(x)=Δ^f(xj)+(f(xj)+Δ^f(xj))x1!++(f(d)(xj)+Δ^f(d)(xj))xdd!.\hat{E}(x)=\hat{\Delta}f(x_{j})+(f^{\prime}(x_{j})+\hat{\Delta}f^{\prime}(x_{j}))\frac{x}{1!}+\cdots+(f^{(d)}(x_{j})+\hat{\Delta}f^{(d)}(x_{j}))\frac{x^{d}}{d!}.

Similarly, for the roots x~1=x1+Δ~x1,,x~d=xd+Δ~xd\tilde{x}_{1}=x_{1}+\tilde{\Delta}x_{1},\ldots,\tilde{x}_{d}=x_{d}+\tilde{\Delta}x_{d} of f~\tilde{f} we have that Δ~xj\tilde{\Delta}x_{j} is a root of the polynomial

E~(x)=Δ~f(xj)+(f(xj)+Δ~f(xj))x1!++(f(d)(xj)+Δ~f(d)(xj))xdd!.\tilde{E}(x)=\tilde{\Delta}f(x_{j})+(f^{\prime}(x_{j})+\tilde{\Delta}f^{\prime}(x_{j}))\frac{x}{1!}+\cdots+(f^{(d)}(x_{j})+\tilde{\Delta}f^{(d)}(x_{j}))\frac{x^{d}}{d!}.

To show that tropical backward stability implies element-wise mixed backward stability we need three assumptions: the maximum in (8) defining the tropical root \tau_{i} should be attained at exactly two indices, the remaining terms should be significantly smaller, and \tau_{i} should be of the same order of magnitude as \log|x_{i}|.
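The first of these assumptions, that the maximum is attained at exactly two indices, is easy to check numerically. A minimal Python sketch for the made-up polynomial f=(x-10)(x-0.1)=x^{2}-10.1x+1:

```python
import math

c = [1.0, -10.1, 1.0]                   # f = (x - 10)(x - 0.1)
v = [math.log(abs(ci)) for ci in c]     # valuations v_i = log|c_i|
tau = v[1] - v[2]                       # tropical root where terms 1 and 2 tie
vals = [v[i] + i * tau for i in range(3)]
attain = [i for i, s in enumerate(vals) if math.isclose(s, max(vals))]
print(attain)                           # the max is attained exactly at [1, 2]
```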

Lemma 3.2.

If for the tropical root τj\tau_{j} of ff we have

  1. \{\beta\in\{0,\ldots,d\}\;|\;v_{\beta}+\beta\tau_{j}=\max_{i}v_{i}+i\tau_{j}\}=\{\beta_{\ell-1},\beta_{\ell}\} with \beta_{\ell-1}<\beta_{\ell},

  2. \log|x_{j}|\approx\tau_{j},

  3. |c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}| for i\neq\beta_{\ell-1},\beta_{\ell},

then for k1k\geq 1

log|f(k)(xj)|vβ+(βk)τj.\log|f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j}.

Here ‘\lesssim’ can be replaced by ‘\approx’ for k=1k=1.

Proof.

We have that xkf(k)(x)=i=kdcii!(ik)!xix^{k}f^{(k)}(x)=\sum_{i=k}^{d}c_{i}\frac{i!}{(i-k)!}x^{i}. We distinguish three different cases.

  1. (\beta_{\ell-1}-k\geq 0,\ \beta_{\ell}-k\geq 0). In this case

    |xjkf(k)(xj)|=K1|cβ1β1!(β1k)!xjβ1+cββ!(βk)!xjβ|,|x_{j}^{k}f^{(k)}(x_{j})|=K_{1}\left|c_{\beta_{\ell-1}}\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}x_{j}^{\beta_{\ell-1}}+c_{\beta_{\ell}}\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}x_{j}^{\beta_{\ell}}\right|,

    with log|K1|0\log|K_{1}|\approx 0. Since f(xj)=0f(x_{j})=0 and by assumption |cixji||cβxjβ|,iβ1,β|c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,i\neq\beta_{\ell-1},\beta_{\ell}, we have that

    |cβ1xjβ1+cβxjβ|\displaystyle|c_{\beta_{\ell-1}}x_{j}^{\beta_{\ell-1}}+c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}| =|iβ1,βcixji|iβ1,β|cixji||cβxjβ|.\displaystyle=\left|\sum_{i\neq\beta_{\ell-1},\beta_{\ell}}c_{i}x_{j}^{i}\right|\leq\sum_{i\neq\beta_{\ell-1},\beta_{\ell}}|c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|.

    Then

    |xjkf(k)(xj)|\displaystyle|x_{j}^{k}f^{(k)}(x_{j})| =K1|β1!(β1k)!(cβ1xβ1+cβxβ)+(β!(βk)!β1!(β1k)!)cβxjβ|\displaystyle=K_{1}\left|\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}(c_{\beta_{\ell-1}}x^{\beta_{\ell-1}}+c_{\beta_{\ell}}x^{\beta_{\ell}})+\left(\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}-\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}\right)c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}\right|
    =K2|cβxjβ|\displaystyle=K_{2}|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|

    with log|K2|0\log|K_{2}|\approx 0. The lemma now follows from taking valuations.

  2. (\beta_{\ell-1}-k<0,\ \beta_{\ell}-k\geq 0). The lemma follows from the observation that in this case

    |xjkf(k)(xj)|=K1|cββ!(βk)!xjβ|=K2|cβxjβ|,|x_{j}^{k}f^{(k)}(x_{j})|=K_{1}\left|c_{\beta_{\ell}}\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}x_{j}^{\beta_{\ell}}\right|=K_{2}|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

    with log|K1|0,log|K2|0\log|K_{1}|\approx 0,\log|K_{2}|\approx 0.

  3. (\beta_{\ell-1}-k<0,\ \beta_{\ell}-k<0). In this case

    |xjkf(k)(xj)|=|i=kdcii!(ik)!xji|i=kd|cii!(ik)!xji||cβxjβ|.|x_{j}^{k}f^{(k)}(x_{j})|=\left|\sum_{i=k}^{d}c_{i}\frac{i!}{(i-k)!}x_{j}^{i}\right|\leq\sum_{i=k}^{d}\left|c_{i}\frac{i!}{(i-k)!}x_{j}^{i}\right|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|.

Note that if k=1k=1, the third case is not possible because β>β10\beta_{\ell}>\beta_{\ell-1}\geq 0. ∎
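The k=1 case (where ‘\lesssim’ becomes ‘\approx’) can be checked on a small example. In this hypothetical Python sketch for f=(x-10)(x-0.1), the large root has \beta_{\ell}=2 and \tau_{j}=\log 10.1:

```python
import math

c = [1.0, -10.1, 1.0]                   # f = (x - 10)(x - 0.1) = x^2 - 10.1x + 1
xj = 10.0                               # the large root
lhs = math.log(abs(c[1] + 2 * c[2] * xj))        # log|f'(x_j)|
v_bl, tau = math.log(abs(c[2])), math.log(10.1)  # v_{beta_l}, tau_j (beta_l = 2)
rhs = v_bl + (2 - 1) * tau              # v_{beta_l} + (beta_l - 1) tau_j
print(lhs, rhs)                         # both about 2.3
```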

Lemma 3.3.

Under the assumptions of Lemma 3.2, we have that

log|Δ^f(xj)|vβ+vε+βτj,log|Δ~f(xj)|vβ+vε+βτj,\log|\hat{\Delta}f(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j},\quad\log|\tilde{\Delta}f(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j},
log|Δ^f(xj)|vβ+vε+(β1)τj,log|Δ~f(xj)|vβ+vε+(β1)τj,\log|\hat{\Delta}f^{\prime}(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j},\quad\log|\tilde{\Delta}f^{\prime}(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j},
log|f(k)(xj)+Δ^f(k)(xj)|vβ+(βk)τj,log|f(k)(xj)+Δ~f(k)(xj)|vβ+(βk)τj.\log|f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j},\quad\log|f^{(k)}(x_{j})+\tilde{\Delta}f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j}.

In the last line, for k=1k=1 we can replace ‘\lesssim’ by ‘\approx’.

Proof.

We have

|\hat{\Delta}f(x_{j})| =\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}r_{i}x_{j}^{i}\right|=K_{1}\left|\sum_{\beta_{\ell-1}\leq i\leq\beta_{\ell}}c_{i}\delta_{i}r_{i}x_{j}^{i}\right|\leq K_{1}(\beta_{\ell}-\beta_{\ell-1})|c_{\beta_{\ell}}\delta_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

with log|K1(ββ1)|0\log|K_{1}(\beta_{\ell}-\beta_{\ell-1})|\approx 0, which proves the first statement. The second statement is proven by a completely analogous argument. The third statement follows from

|xjΔ^f(xj)|\displaystyle|x_{j}\hat{\Delta}f^{\prime}(x_{j})| =|i=1d1ciiδirixji|K1(ββ1)|cβδβxjβ|,\displaystyle=\left|\sum_{i=1}^{d-1}c_{i}i\delta_{i}r_{i}x_{j}^{i}\right|\leq K_{1}(\beta_{\ell}-\beta_{\ell-1})|c_{\beta_{\ell}}\delta_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

with log|K1(ββ1)|0\log|K_{1}(\beta_{\ell}-\beta_{\ell-1})|\approx 0. The fourth statement is analogous. The fifth statement follows from

log|f(k)(xj)+Δ^f(k)(xj)|log|f(k)(xj)|\log|f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})|\approx\log|f^{(k)}(x_{j})|

and Lemma 3.2. The sixth statement follows again from an analogous argument. ∎

It follows from Lemma 3.3 that we can bound the lifted Newton polytopes of the polynomials E^,E~\hat{E},\tilde{E} from above. An example is shown in Figure 4.

Refer to caption
Figure 4: The blue line shows an upper bound for the lifted Newton polytopes of E^(x)\hat{E}(x) and E~(x)\tilde{E}(x). The actual lifted polytopes will meet the blue line (approximately) in the point (1,vβ+(β1)τj)(1,v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j}) (indicated with a small box).

The expansion (11) is used to approximate Δ^xj\hat{\Delta}x_{j} as

Δ^xjΔ^f(xj)f(xj)+Δ^f(xj).\hat{\Delta}x_{j}\approx\frac{-\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})+\hat{\Delta}f^{\prime}(x_{j})}. (12)

It is clear that this is an approximation for the smallest root of \hat{E}(x), which corresponds to the smallest tropical root \tau_{\hat{E}} of \hat{E}; this root is bounded by (see Figure 4)

τE^(vβ+vε+βτj)(vβ+(β1)τj)=τj+vε.\tau_{\hat{E}}\leq(v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j})-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=\tau_{j}+v_{\varepsilon}. (13)

Analogously, we have for the smallest tropical root τE~\tau_{\tilde{E}} of E~\tilde{E} that τE~τj+vε\tau_{\tilde{E}}\leq\tau_{j}+v_{\varepsilon}. We will also make our usual assumption that the tropical roots give an indication for the order of magnitude of the classical roots, i.e.

log|Δ^xj|τj+vε,log|Δ~xj|τj+vε.\log|\hat{\Delta}x_{j}|\lesssim\tau_{j}+v_{\varepsilon},\quad\log|\tilde{\Delta}x_{j}|\lesssim\tau_{j}+v_{\varepsilon}. (14)

We conclude that the assumptions of Lemma 3.2 imply that \hat{X} and \tilde{X} have a relative forward error of size O(\varepsilon). This implies that \textup{EMBE}(\hat{X})=O(\varepsilon) (take \tilde{x}_{j}=x_{j}), which gives the following result.

Theorem 3.2.

Under the assumptions of Lemma 3.2, if TBE(X^)=ε\textup{TBE}(\hat{X})=\varepsilon and the order of magnitude of |xj||x_{j}| is equal to that of |xj^||\hat{x_{j}}| and exp(τj)\exp(\tau_{j}), then EMBE(X^)=O(ε)\textup{EMBE}(\hat{X})=O(\varepsilon).

In fact, the assumptions of Lemma 3.2 imply that the roots XX of ff are well-conditioned. Consider the first order approximation

|Δxj||xj|maxi=0,,d|cixji||f(xj)||xj||Δf(xj)|maxi=0,,d|cixji|,\frac{|\Delta x_{j}|}{|x_{j}|}\approx\frac{\max_{i=0,...,d}|c_{i}x_{j}^{i}|}{|f^{\prime}(x_{j})||x_{j}|}\frac{|\Delta f(x_{j})|}{\max_{i=0,...,d}|c_{i}x_{j}^{i}|},

for a perturbation Δf\Delta f on ff causing a perturbation Δxj\Delta x_{j} on xjx_{j}. Here maxi=0,,d|cixji|\max_{i=0,...,d}|c_{i}x_{j}^{i}| is used to measure the residual |Δf(xj)||\Delta f(x_{j})| with a relative criterion. The condition of a root xjx_{j} can be measured by

\frac{\max_{i=0,\ldots,d}|c_{i}x_{j}^{i}|}{|f^{\prime}(x_{j})||x_{j}|}=\frac{|c_{\beta_{\ell}}|\exp(\tau_{j})^{\beta_{\ell}}}{|f^{\prime}(x_{j})||x_{j}|}.

Using Lemma 3.2, we find that this number is of order 1.
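A quick numeric illustration (hypothetical example, not from the paper): for f=(x-10)(x-0.1), whose root magnitudes match its tropical roots, the condition number above is indeed of order 1 at both roots.

```python
coeffs = [1.0, -10.1, 1.0]              # f = (x - 10)(x - 0.1)
def df(x):                              # derivative f'(x)
    return sum(i * ci * x**(i - 1) for i, ci in enumerate(coeffs) if i > 0)

conds = []
for xj in (0.1, 10.0):
    num = max(abs(ci * xj**i) for i, ci in enumerate(coeffs))
    conds.append(num / (abs(df(xj)) * abs(xj)))
print(conds)                            # both close to 1
```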

In what follows, we will give a constructive proof for Theorem 3.2. That is, in the proof of Theorem 3.3 (which implies Theorem 3.2) we will give values for κi,θi\kappa_{i},\theta_{i} in (10) which realize a small EMBE. It uses the following lemma.

Lemma 3.4.

Under the assumptions of Lemma 3.2, we have that

log|Δ^xjΔ^f(xj)f(xj)|τj+2vε,log|Δ~xjΔ~f(xj)f(xj)|τj+2vε.\log\left|\hat{\Delta}x_{j}-\frac{-\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+2v_{\varepsilon},\quad\log\left|\tilde{\Delta}x_{j}-\frac{-\tilde{\Delta}f(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+2v_{\varepsilon}.
Proof.

It follows from E^(Δ^xj)=0\hat{E}(\hat{\Delta}x_{j})=0 that

f^{\prime}(x_{j})\hat{\Delta}x_{j}=-\left(\hat{\Delta}f(x_{j})+\hat{\Delta}x_{j}\hat{\Delta}f^{\prime}(x_{j})+\sum_{k=2}^{d}\hat{\Delta}x_{j}^{k}\frac{f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})}{k!}\right).

The valuation of the first neglected term in the approximation (12) is

\log\left|\frac{\hat{\Delta}x_{j}\hat{\Delta}f^{\prime}(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+v_{\varepsilon}+v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j}-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=2v_{\varepsilon}+\tau_{j},

where we used Lemma 3.2, Lemma 3.3 and (14). For the term corresponding to k=2k=2, we have

\log\left|\frac{\hat{\Delta}x_{j}^{2}f^{(2)}(x_{j})}{2f^{\prime}(x_{j})}\right|\lesssim 2(\tau_{j}+v_{\varepsilon})+v_{\beta_{\ell}}+(\beta_{\ell}-2)\tau_{j}-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=2v_{\varepsilon}+\tau_{j}.

For the terms corresponding to higher values of kk, we get in the same way a valuation of τj+kvε<τj+2vε\tau_{j}+kv_{\varepsilon}<\tau_{j}+2v_{\varepsilon}. The reasoning for Δ~xj\tilde{\Delta}x_{j} is completely analogous. ∎

Theorem 3.3.

Under the assumptions of Lemma 3.2, there are choices of the parameters κi\kappa_{i}, θi\theta_{i} with log|κi|0\log|\kappa_{i}|\approx 0 such that log|x^jx~j|vε+τj\log|\hat{x}_{j}-\tilde{x}_{j}|\approx v_{\varepsilon}+\tau_{j}.

Proof.

By Lemma 3.4 we have x^j=xjΔ^f(xj)f(xj)+O(ε2exp(τj))\hat{x}_{j}=x_{j}-\frac{\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})}+O(\varepsilon^{2}\exp(\tau_{j})) and x~j=xjΔ~f(xj)f(xj)+O(ε2exp(τj))\tilde{x}_{j}=x_{j}-\frac{\tilde{\Delta}f(x_{j})}{f^{\prime}(x_{j})}+O(\varepsilon^{2}\exp(\tau_{j})). Hence it suffices to show that the valuation of

|x^jx~j|\displaystyle|\hat{x}_{j}-\tilde{x}_{j}| |1f(xj)||i=0d1ciδi(riκie1θi)xji|\displaystyle\approx\left|\frac{1}{f^{\prime}(x_{j})}\right|\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}(r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}})x_{j}^{i}\right|

is bounded by vε+τjv_{\varepsilon}+\tau_{j}. We have

|1f(xj)||i=0d1ciδi(riκie1θi)xji|\displaystyle\left|\frac{1}{f^{\prime}(x_{j})}\right|\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}(r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}})x_{j}^{i}\right| |1f(xj)|i=0d1|ci||δi||riκie1θi||xji|\displaystyle\leq\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{i=0}^{d-1}|c_{i}||\delta_{i}||r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}}||x_{j}^{i}|
ε|1f(xj)|i=0d1|ci||riκie1θi||xji|.\displaystyle\leq\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{i=0}^{d-1}|c_{i}||r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}}||x_{j}^{i}|.

We now specify the parameters κi\kappa_{i}, θi\theta_{i}. For ri=O(1)r_{i}=O(1), we choose κi\kappa_{i}, θi\theta_{i} such that ri=κie1θir_{i}=\kappa_{i}e^{\sqrt{-1}\theta_{i}}. Note that log|κi|0\log|\kappa_{i}|\approx 0. For the other ii, we set κi=θi=0\kappa_{i}=\theta_{i}=0. We get

|x^jx~j|ε|1f(xj)|ri1|ci||ri||xji|.|\hat{x}_{j}-\tilde{x}_{j}|\leq\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{r_{i}\gg 1}|c_{i}||r_{i}||x_{j}^{i}|.

Since by assumption log|xj|τj\log|x_{j}|\approx\tau_{j}, the dominant terms in the sum are those with β1<i<β\beta_{\ell-1}<i<\beta_{\ell}, where τβ1+1==τβ=τj\tau_{\beta_{\ell-1}+1}=\cdots=\tau_{\beta_{\ell}}=\tau_{j}. Therefore

|x^jx~j|Kε|1f(xj)|β1<i<β|ci||ri||xji||\hat{x}_{j}-\tilde{x}_{j}|\leq K\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{\beta_{\ell-1}<i<\beta_{\ell}}|c_{i}||r_{i}||x_{j}^{i}|

with log|K|0\log|K|\approx 0. The valuation of one of the terms in the sum is

log|K|+vεlog|f(xj)|+vi+vβ+(βi)τjvi+iτj\log|K|+v_{\varepsilon}-\log|f^{\prime}(x_{j})|+v_{i}+v_{\beta_{\ell}}+(\beta_{\ell}-i)\tau_{j}-v_{i}+i\tau_{j}

which is equal to log|K|log|f(xj)|+vε+vβ+βτj\log|K|-\log|f^{\prime}(x_{j})|+v_{\varepsilon}+v_{\beta_{\ell}}+\beta_{\ell}\tau_{j}. Note that this is independent of ii, and hence we get

log|x^jx~j|log|f(xj)|+vε+vβ+βτj.\log|\hat{x}_{j}-\tilde{x}_{j}|\approx-\log|f^{\prime}(x_{j})|+v_{\varepsilon}+v_{\beta_{\ell}}+\beta_{\ell}\tau_{j}.

Using Lemma 3.2 we get

\log\left|\frac{\hat{x}_{j}-\tilde{x}_{j}}{\tilde{x}_{j}}\right|\approx v_{\varepsilon}. ∎

4 Computational Experiments

In Subsection 3.2 we proved that, under some assumptions, a tropical backward error of order \varepsilon implies a mixed element-wise backward error of the same magnitude. Unfortunately, we were not able to prove this result in general. However, based on several numerical experiments that we performed, we are convinced that a small TBE implies a small EMBE in general as well. To support this conjecture, the following numerical experiment was performed.
Numerical Experiment 1 Take 1000 polynomials of degree d whose coefficients have modulus 10^{e}, with e chosen uniformly at random in [-k,k], and argument chosen uniformly at random in [0,2\pi). These polynomials do not always satisfy the assumptions of Theorem 3.2. The zeros are approximated with an eigenvalue method from the Julia package Polynomials, resulting in the computed zeros \hat{X}, for which the TBE is computed. To compute an upper bound for the EMBE, each root \hat{x}_{j} is refined separately using Newton’s method in extended precision applied to the original polynomial. The correspondence between the TBE and the EMBE, shown in Figure 5, clearly indicates that a small TBE implies a small EMBE.

Refer to caption
Figure 5: Results of Numerical Experiment 1.
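A scaled-down Python sketch of the setup of Experiment 1 (the paper used 1000 polynomials and the Julia package Polynomials; here a single degree-10 sample and numpy.roots stand in, and a scaled residual serves as a cheap quality proxy instead of the TBE/EMBE computation):

```python
import numpy as np
rng = np.random.default_rng(0)

# coefficient model: modulus 10^e with e uniform in [-k, k], uniform argument
d, k = 10, 2
c = 10.0**rng.uniform(-k, k, d + 1) * np.exp(2j * np.pi * rng.uniform(0, 1, d + 1))
roots = np.roots(c[::-1])               # np.roots expects c_d first

# scaled residual |f(x_hat)| / max_i |c_i x_hat^i| for each computed root
res = []
for x in roots:
    powers = x ** np.arange(d + 1)
    res.append(abs(np.dot(c, powers)) / np.max(np.abs(c * powers)))
print(max(res))
```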

In several of our statements, we assumed that the tropical roots of ff are of the same order of magnitude as the corresponding classical roots. To check if this is a reasonable assumption in practice, we performed the following numerical experiment.

Refer to caption
Figure 6: Results of Numerical Experiment 2.

Numerical Experiment 2 Ten thousand polynomials of degree d are taken whose coefficients have modulus 10^{e}, with e chosen uniformly at random in [-k,k], and argument chosen uniformly at random in [0,2\pi). For each of these polynomials the tropical roots are compared to the roots computed in high precision. Figure 6 gives a histogram of the measured ratios \exp(\tau_{i})/|x_{i}|. The results show that for the vast majority of roots the magnitudes differ by at most 10 percent.
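A small-scale Python sketch of Experiment 2 (hypothetical parameters d=8, k=2, and one sample instead of ten thousand; the tropical roots are obtained from the upper convex hull of the points (i,\log|c_{i}|), and numpy.roots stands in for the high-precision root computation):

```python
import math
import numpy as np
rng = np.random.default_rng(2)

def tropical_roots(c):
    """Tropical roots from the upper convex hull of (i, log|c_i|):
    a hull edge from (i1, v1) to (i2, v2) yields the tropical root
    (v1 - v2) / (i2 - i1) with multiplicity i2 - i1."""
    pts = [(i, math.log(abs(ci))) for i, ci in enumerate(c)]
    hull = []
    for p in pts:
        # pop hull[-1] while it lies on or below the segment hull[-2] -> p
        while len(hull) >= 2 and (hull[-1][0] - hull[-2][0]) * (p[1] - hull[-2][1]) \
                >= (p[0] - hull[-2][0]) * (hull[-1][1] - hull[-2][1]):
            hull.pop()
        hull.append(p)
    taus = []
    for (i1, v1), (i2, v2) in zip(hull, hull[1:]):
        taus.extend([(v1 - v2) / (i2 - i1)] * (i2 - i1))
    return taus

# coefficient model of Experiment 2: modulus 10^e with e uniform in [-k, k]
d, k = 8, 2
c = 10.0**rng.uniform(-k, k, d + 1) * np.exp(2j * np.pi * rng.uniform(0, 1, d + 1))
trop = sorted(math.exp(t) for t in tropical_roots(c))
clas = sorted(abs(r) for r in np.roots(c[::-1]))
ratios = [t / x for t, x in zip(trop, clas)]   # exp(tau_i) / |x_i|
print(ratios)
```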

5 Conclusion

We have shown the relations (1) between different measures for the backward error of an approximate set of roots \hat{X} of a polynomial. Under some assumptions, the tropical backward error measure of [TVB20], which is easy to compute, is shown to be equivalent to the element-wise mixed backward error measure of [MVD15], introduced there for d=2. We have given numerical evidence that the equivalence holds more generally.

References

  • [AMVW15] Jared L Aurentz, Thomas Mach, Raf Vandebril, and David S Watkins. Fast and backward stable computation of roots of polynomials. SIAM Journal on Matrix Analysis and Applications, 36(3):942–973, 2015.
  • [BNS13] Dario A Bini, Vanni Noferini, and Meisam Sharify. Locating the eigenvalues of matrix polynomials. SIAM Journal on Matrix Analysis and Applications, 34(4):1708–1727, 2013.
  • [GS09] Stéphane Gaubert and Meisam Sharify. Tropical scaling of polynomial matrices. In Positive systems, volume 389 of Lecture Notes in Control and Information Sciences, pages 291–303. Springer, 2009.
  • [Hig02] Nicholas J Higham. Accuracy and stability of numerical algorithms. SIAM, 2002.
  • [MS15] Diane Maclagan and Bernd Sturmfels. Introduction to tropical geometry, volume 161 of Graduate Studies in Mathematics. American Mathematical Society, 2015.
  • [MVD15] Nicola Mastronardi and Paul Van Dooren. Revisiting the stability of computing the roots of a quadratic polynomial. Electronic Transactions on Numerical Analysis, 44:73–82, 2015.
  • [NST15] Vanni Noferini, Meisam Sharify, and Françoise Tisseur. Tropical roots as approximations to eigenvalues of matrix polynomials. SIAM Journal on Matrix Analysis and Applications, 36(1):138–157, 2015.
  • [Sha11] Meisam Sharify. Scaling Algorithms and Tropical Methods in Numerical Matrix Analysis: Application to the Optimal Assignment Problem and to the Accurate Computation of Eigenvalues. PhD thesis, 2011.
  • [TVB20] Françoise Tisseur and Marc Van Barel. Min-max elementwise backward error for roots of polynomials and a corresponding backward stable root finder. arXiv:2001.05281, 2020.