
Backward Error Measures for Roots of Polynomials

Simon Telen (Department of Computer Science, KU Leuven), Sascha Timme (Technische Universität Berlin, Chair of Discrete Mathematics/Geometry), and Marc Van Barel (Department of Computer Science, KU Leuven)

Sascha Timme was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation) Graduiertenkolleg Facets of Complexity (GRK 2434). Marc Van Barel was partially supported by the Research Council KU Leuven, C1-project (Numerical Linear Algebra and Polynomial Computations), by the Fund for Scientific Research Flanders (Belgium), G.0828.14N (Multivariate polynomial and rational interpolation and approximation), and by EOS Project no. 30468160.
(January 1, 2025)
Abstract

We analyze different measures for the backward error of a set of numerical approximations for the roots of a polynomial. We focus mainly on the element-wise mixed backward error introduced by Mastronardi and Van Dooren, and the tropical backward error introduced by Tisseur and Van Barel. We show that these measures are equivalent under suitable assumptions. We also show relations between these measures and the classical element-wise and norm-wise backward error measures.

1 Introduction

In this article we analyze the problem of measuring the backward error for a set of approximations for the roots of a polynomial with complex coefficients. For a general introduction to the notion of backward error analysis, the reader can consult for instance [Hig02, Section 1.5]. Consider a set of approximate solutions $\hat{X}=\{\hat{x}_1,\ldots,\hat{x}_d\}\subset\mathbb{C}^{*}=\mathbb{C}\setminus\{0\}$ of a polynomial equation with nonzero coefficients

\[ f = c_0 + c_1x + \cdots + c_{d-1}x^{d-1} + c_dx^d = c_d(x-x_1)\cdots(x-x_d) = 0, \quad f\in\mathbb{C}[x], \]

with solutions $X=\{x_1,\ldots,x_d\}\subset\mathbb{C}^{*}$. The backward error of $\hat{X}$ is a measure for the ‘distance’ of $f$ to the polynomial

\[ \hat{f} = \hat{c}_0 + \hat{c}_1x + \cdots + \hat{c}_{d-1}x^{d-1} + c_dx^d = c_d(x-\hat{x}_1)\cdots(x-\hat{x}_d), \]

whose roots are exactly the points in $\hat{X}$. How to measure this distance turns out to be a surprisingly subtle problem. A first and natural measure is the 2-norm distance between the coefficient vectors of $f$ and $\hat{f}$. The norm-wise backward error (NBE) of $\hat{X}$ is

\[ \textup{NBE}(\hat{X}) = \sqrt{\frac{|c_0-\hat{c}_0|^2+\cdots+|c_{d-1}-\hat{c}_{d-1}|^2}{|c_0|^2+\cdots+|c_d|^2}} = \frac{\lVert c-\hat{c}\rVert_2}{\lVert c\rVert_2}, \]

where $c=(c_0,\ldots,c_d)\in\mathbb{C}^{d+1}$ and $\hat{c}=(\hat{c}_0,\ldots,c_d)\in\mathbb{C}^{d+1}$. In [AMVW15], an algorithm is proposed that computes a set of approximate solutions $\hat{X}$ satisfying $\textup{NBE}(\hat{X})=O(u)$, where $u$ is the unit round-off. Such an algorithm is called norm-wise backward stable. However, it turns out that this type of stability is too ‘weak’ in a sense we explain by means of an example. Consider the polynomial $f=a(x-10^{6})(x-10^{-6})$. The set of approximate solutions $\hat{X}=\{10^{6}+u10^{6},\,10^{-6}+u\}$ would satisfy $\textup{NBE}(\hat{X})=O(u)$. Indeed,

\[ \hat{c} = a\left(1+(u10^{6}+u+u^{2}10^{6}),\; -(10^{6}+10^{-6})-u(10^{6}+1),\; 1\right), \quad c = a\left(1,\; -(10^{6}+10^{-6}),\; 1\right), \]

and $\lVert c-\hat{c}\rVert_2/\lVert c\rVert_2=O(u)$. This means that we would allow a relative error of size $u10^{6}$ on the coefficient vector $c$. However, computing the roots of $f$ with $a=0.2$ using the Julia package PolynomialRoots (via the command roots), we get

julia> abs.((c - chat)./c)
3-element Array{Float64,1}:
 2.7755575615628914e-16
 0.0
 0.0

which shows that much better element-wise accuracy on the coefficient vector can be obtained. This suggests another, stricter measure for the backward error. The element-wise backward error (EBE) of $\hat{X}$ is

\[ \textup{EBE}(\hat{X}) = \max_{i=0,\ldots,d-1}\left|\frac{c_i-\hat{c}_i}{c_i}\right|. \]
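The gap between the two measures can be checked concretely on the example above. The following Python sketch (a plain-Python analogue of the Julia computation; the perturbed roots are the hypothetical ones from the text) evaluates both the NBE and the EBE:

```python
import math

u = 2.0 ** -53                      # unit round-off in IEEE double precision
a = 0.2

# f = a*(x - 1e6)*(x - 1e-6), coefficients ordered by increasing degree
c = [a * 1.0, -a * (1e6 + 1e-6), a * 1.0]

# the perturbed roots from the text: relative error u on the large root,
# absolute error u on the small one
x1h, x2h = 1e6 + u * 1e6, 1e-6 + u
chat = [a * x1h * x2h, -a * (x1h + x2h), a * 1.0]

diff = [ci - chi for ci, chi in zip(c, chat)]
nbe = math.sqrt(sum(d * d for d in diff)) / math.sqrt(sum(ci * ci for ci in c))
ebe = max(abs(d / ci) for d, ci in zip(diff, c))

print(nbe)   # of the order of u: norm-wise backward stable
print(ebe)   # of the order of u * 1e6: element-wise accuracy is much worse
```

The NBE is of the order of $u$, while the EBE is of the order of $u10^{6}$, which is exactly the discrepancy the example illustrates.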

Unfortunately, this measure turns out to be too strict. In [MVD15], Mastronardi and Van Dooren show that no algorithm for finding the roots of a general quadratic polynomial is element-wise backward stable, meaning that it computes $\hat{X}$ such that $\textup{EBE}(\hat{X})=O(u)$. As an alternative measure, the authors of [MVD15] propose the following definition in the case $d=2$. The element-wise mixed backward error (EMBE) of $\hat{X}$, denoted $\textup{EMBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that there exist some $\tilde{X}=\{\tilde{x}_1,\ldots,\tilde{x}_d\}\subset\mathbb{C}$ and

\[ \tilde{f} = \tilde{c}_0 + \tilde{c}_1x + \cdots + \tilde{c}_{d-1}x^{d-1} + c_dx^d = c_d(x-\tilde{x}_1)\cdots(x-\tilde{x}_d) \]

such that

\begin{align*}
|\hat{x}_i-\tilde{x}_i| &\leq \varepsilon|\tilde{x}_i|, \quad i=1,\ldots,d,\\
|c_i-\tilde{c}_i| &\leq \varepsilon|c_i|, \quad i=0,\ldots,d-1.
\end{align*}

Note that the second set of inequalities is equivalent to $\textup{EBE}(\tilde{X})\leq\varepsilon$. In the same paper, the authors also show the implication $\textup{EMBE}(\hat{X})=O(u)\Rightarrow\textup{NBE}(\hat{X})=O(u)$ in the case $d=2$. The implication $\textup{EBE}(\hat{X})=O(u)\Rightarrow\textup{EMBE}(\hat{X})=O(u)$ is obvious from the definition.
The advantage of the EMBE as a measure for the backward error is that it results in a notion of stability, namely element-wise mixed backward stability, which is stronger than norm-wise backward stability and can provably be obtained for quadratic polynomials [MVD15]. A drawback of this measure is that $\textup{EMBE}(\hat{X})$ is hard to compute for a given set of approximate solutions $\hat{X}$ because of its rather abstract definition. In [TVB20], Tisseur and Van Barel define the min-max element-wise backward error of $\hat{X}$ as

\[ \textup{TBE}(\hat{X}) = \max_{i=0,\ldots,d-1}\left|\frac{c_i-\hat{c}_i}{r_ic_i}\right|, \]

where $r_i\geq 1$ are constants that can be computed in linear time from the coefficients of $f$. The $r_i$ depend only on the tropical roots of $f$, which is why we will refer to this error measure as the tropical backward error (TBE). We will give a definition of the numbers $r_i$ in Section 3. The authors of [TVB20] also provide an algorithm that, under some assumptions on the numerical behavior of a modified QZ-algorithm (see [TVB20, Section 5, Assumption 1]), computes a set of approximate roots $\hat{X}$ satisfying $\textup{TBE}(\hat{X})=O(u)$.

In this paper, we investigate the relations between the TBE and the EMBE. In particular, we show that these error measures are equivalent under suitable assumptions. The following is a simplified version of our first main theorem.

Theorem 1.1.

Assume that the tropical roots of $f$ are of the same order of magnitude as the corresponding classical roots and that the $|\hat{x}_j|$ are of the same order of magnitude as the $|x_j|$. Then $\textup{EMBE}(\hat{X})=O(u)$ implies $\textup{TBE}(\hat{X})=O(u)$.

The strategy for proving Theorem 1.1 will also allow us to prove that, under the assumptions of the theorem, $\textup{EMBE}(\hat{X})=O(u)$ implies $\textup{NBE}(\hat{X})=O(u)$. This was proved in [MVD15] for the case $d=2$. Under some stronger assumptions, we also prove the reverse implication.

Theorem 1.2.

Assume that the tropical roots of $f$ are of the same order of magnitude as the corresponding classical roots and that the $|\hat{x}_j|$ are of the same order of magnitude as the $|x_j|$. Moreover, assume that for each $x_j\in X$ there are two terms $c_{\beta'}x_j^{\beta'}$ and $c_{\beta}x_j^{\beta}$ of $f(x_j)$ such that

\[ |c_ix_j^i| \ll |c_{\beta'}x_j^{\beta'}| \quad\text{and}\quad |c_ix_j^i| \ll |c_{\beta}x_j^{\beta}|, \quad\text{for all } i\neq\beta',\beta. \]

Then $\textup{TBE}(\hat{X})=O(u)$ implies $\textup{EMBE}(\hat{X})=O(u)$.

We will give numerical evidence that Theorem 1.2 holds without the extra assumption on the polynomial $f$. In summary, we have the following diagram, in which the arrows are implications.

\[
\begin{aligned}
&\textup{EBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{TBE}(\hat{X})=O(u), \qquad
\textup{EBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{EMBE}(\hat{X})=O(u),\\
&\textup{EMBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{TBE}(\hat{X})=O(u), \qquad
\textup{EMBE}(\hat{X})=O(u) \;\Longrightarrow\; \textup{NBE}(\hat{X})=O(u),\\
&\textup{TBE}(\hat{X})=O(u) \;\overset{*}{\Longrightarrow}\; \textup{EMBE}(\hat{X})=O(u).
\end{aligned} \tag{1}
\]

Here, the implications in the first line are obvious from the definitions. The implications in the second line are proved under the assumptions of Theorem 1.1. The starred implication in the last line represents Theorem 1.2, which uses stronger assumptions.

This article is organized as follows. In the next section we prove the equivalence of the tropical and element-wise mixed backward error measures in the case $d=2$. This can be seen as an extension of the analysis in [MVD15], and it is instructive for the general case. The proofs for general $d$ are given in Section 3. In Section 4 we show some computational experiments and give numerical evidence that Theorem 1.2 holds under weaker assumptions.

2 Backward Error for Quadratic Polynomials

In this section, we prove that the element-wise mixed backward error (EMBE) as introduced by Mastronardi and Van Dooren [MVD15] and the tropical backward error (TBE) from [TVB20] are equivalent backward error measures for the roots of a quadratic polynomial

\[ f = ax^2 + bx + c = a(x-x_1)(x-x_2). \]

For simplicity, we assume that $a,b,c\in\mathbb{C}^{*}=\mathbb{C}\setminus\{0\}$. For the approximate roots $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$ of $f$, $\textup{EMBE}(\hat{X})$ is the smallest number $\varepsilon\geq 0$ such that there exists $\tilde{X}=\{\tilde{x}_1,\tilde{x}_2\}$ with

\begin{align*}
\tilde{f} = a(x-\tilde{x}_1)(x-\tilde{x}_2) &= ax^2 + \tilde{b}x + \tilde{c},\\
|\hat{x}_1-\tilde{x}_1| \leq \varepsilon|\tilde{x}_1|, &\quad |\hat{x}_2-\tilde{x}_2| \leq \varepsilon|\tilde{x}_2|,\\
|b-\tilde{b}| \leq \varepsilon|b|, &\quad |c-\tilde{c}| \leq \varepsilon|c|.
\end{align*}

In analogy with [TVB20], we define $\textup{TBE}(\hat{X})$ to be the smallest number $\varepsilon\geq 0$ such that

\begin{align*}
\hat{f} = a(x-\hat{x}_1)(x-\hat{x}_2) &= ax^2 + \hat{b}x + \hat{c},\\
|b-\hat{b}| \leq r_b\varepsilon|b|, &\quad |c-\hat{c}| \leq \varepsilon|c|,
\end{align*}

where $r_b=\max(1,\sqrt{|ac|}/|b|)$. The definition of $r_b$ will be clarified in Section 3. For the approximate roots $\hat{x}_j$ and $\tilde{x}_j$ in these definitions, we will assume that the orders of magnitude of $|\tilde{x}_j|$ and $|\hat{x}_j|$ are the same as the order of magnitude of $|x_j|$ (that is, we allow relative errors of size at most 1). Note that, by the definition of the EMBE, it is sufficient that this holds for $|\hat{x}_j|$.
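To get a feel for the weight $r_b$, the following Python sketch evaluates it for two hypothetical quadratics (the numerical values are illustrative examples, not taken from the text): one with well-separated roots and one with nearly opposite roots of the same magnitude, where $b$ is tiny due to cancellation:

```python
import math

def r_b(a, b, c):
    """Weight on the middle coefficient: r_b = max(1, sqrt(|ac|)/|b|)."""
    return max(1.0, math.sqrt(abs(a * c)) / abs(b))

a = 1.0

# well-separated roots 1e-6 and 1e6: no relaxation on b is needed
x1, x2 = 1e-6, 1e6
print(r_b(a, -(x1 + x2), x1 * x2))   # 1.0

# nearly opposite roots of the same magnitude: b is tiny by cancellation,
# so errors of size r_b * u * |b| on b are allowed
x1, x2 = 1.0, -1.0 + 1e-6
print(r_b(a, -(x1 + x2), x1 * x2))   # of the order of 1e6
```

In the first case $\textup{TBE}$ coincides with the EBE condition on $b$; in the second case it relaxes it by a factor of about $10^{6}$.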

We will now relate these two error measures. Let $\varepsilon=\textup{EMBE}(\hat{X})$. We observe

\begin{align*}
\frac{\hat{c}-c}{c} &= \frac{\hat{c}-\tilde{c}}{c} + \frac{\tilde{c}-c}{c} = \frac{\hat{x}_1\hat{x}_2-\tilde{x}_1\tilde{x}_2}{x_1x_2} + \frac{\tilde{c}-c}{c}\\
&= \frac{(\tilde{x}_1+(\hat{x}_1-\tilde{x}_1))(\tilde{x}_2+(\hat{x}_2-\tilde{x}_2))-\tilde{x}_1\tilde{x}_2}{x_1x_2} + \frac{\tilde{c}-c}{c}.
\end{align*}

Since $|c-\tilde{c}|\leq\varepsilon|c|$, we have $\left|1-\frac{\tilde{x}_1\tilde{x}_2}{x_1x_2}\right|\leq\varepsilon$, and it follows that

\[ \left|\frac{\hat{c}-c}{c}\right| \leq (2\varepsilon+\varepsilon^2)\left|\frac{\tilde{x}_1\tilde{x}_2}{x_1x_2}\right| + \varepsilon \lesssim 3\varepsilon. \]

For the coefficient $\hat{b}$, we find in an analogous way that

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(1+\frac{|\tilde{x}_1|+|\tilde{x}_2|}{|x_1+x_2|}\right). \tag{2} \]

If the solutions $x_1$ and $x_2$ have different orders of magnitude, no cancellation occurs in the denominator of the right-hand side of (2), which implies $\left|\frac{\hat{b}-b}{b}\right|\lesssim 2\varepsilon$. However, if both solutions have the same order of magnitude, the factor multiplying $\varepsilon$ may be significantly larger than 1 due to cancellation. We will now make this precise and relate it to the number $r_b$. We define $\gamma=x_1/x_2$. By the assumption that $|\tilde{x}_i|$ is of the same order of magnitude as $|x_i|$, (2) can be written as

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(1+K\frac{|\gamma|+1}{|\gamma+1|}\right) \tag{3} \]

with $K$ a small constant. We assume, without loss of generality, that $0<|\gamma|\leq 1$. Note that for $0<|\gamma|<1$ we have the inequality

\[ \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{\big||\gamma|-1\big|} = \frac{|\gamma|+1}{1-|\gamma|}, \tag{4} \]

which follows from $|x-y|\geq\big||x|-|y|\big|$ for all $x,y\in\mathbb{C}$, applied to $x=\gamma$, $y=-1$. First, assume that

\[ \frac{\sqrt{|ac|}}{|b|} = \frac{\sqrt{|\gamma|}}{|\gamma+1|} \leq 1. \]

In this case, we have

\[ \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{\sqrt{|\gamma|}}. \tag{5} \]

Now, assume

\[ \frac{\sqrt{|ac|}}{|b|} = \frac{\sqrt{|\gamma|}}{|\gamma+1|} \geq 1. \]

In this case

\[ \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|ac|}}{|b|}\right)^{-1} \leq \frac{|\gamma|+1}{|\gamma+1|} \leq \frac{|\gamma|+1}{1-|\gamma|}. \tag{6} \]

Also,

\[ \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|ac|}}{|b|}\right)^{-1} = \frac{|\gamma|+1}{|\gamma+1|}\left(\frac{\sqrt{|\gamma|}}{|\gamma+1|}\right)^{-1} = \frac{|\gamma|+1}{\sqrt{|\gamma|}}. \tag{7} \]

Using (4)-(7), we find that

\[ \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right), \]

which gives

\[ \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \begin{cases}\dfrac{\alpha+1}{1-\alpha}=\sqrt{5}, & 0<|\gamma|\leq\alpha,\\[1ex] \dfrac{\alpha+1}{\sqrt{\alpha}}=\sqrt{5}, & \alpha\leq|\gamma|\leq 1,\end{cases} \quad\Rightarrow\quad \frac{|\gamma|+1}{|\gamma+1|}\,r_b^{-1} \leq \sqrt{5}, \]

where $\alpha=\frac{3}{2}-\frac{\sqrt{5}}{2}$. This is illustrated in Figure 1.

Figure 1: The value of $\min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right)$ for $0\leq|\gamma|\leq 1$.
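The bound $\sqrt{5}$ can be checked numerically. The following Python sketch evaluates $\min\left(\frac{|\gamma|+1}{1-|\gamma|},\frac{|\gamma|+1}{\sqrt{|\gamma|}}\right)$ on a grid of real values $0<|\gamma|<1$ (a small verification of the case analysis, not part of the proof):

```python
import math

alpha = 1.5 - math.sqrt(5) / 2        # crossover point, about 0.382

def bound(g):
    # min of the two upper bounds derived in (4)-(7), for 0 < g < 1
    first = (g + 1) / (1 - g)
    second = (g + 1) / math.sqrt(g)
    return min(first, second)

worst = max(bound(k / 10000) for k in range(1, 10000))
print(worst)          # stays below sqrt(5) ~ 2.236
print(bound(alpha))   # both branches meet at sqrt(5) exactly at alpha
```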

It follows immediately from this observation and (3) that

\[ \left|\frac{\hat{b}-b}{b}\right| \leq \varepsilon\left(\sqrt{5}K+r_b^{-1}\right)r_b \leq \varepsilon\left(\sqrt{5}K+1\right)r_b. \]

Figure 2 illustrates the values of $r_b$ and $\frac{|\gamma|+1}{|\gamma+1|}r_b^{-1}$ as a function of $\gamma$ in the unit disk.

Figure 2: Left: the value of $\log(r_b)$ as a function of $\gamma$ for $0\leq|\gamma|\leq 1$. Right: the value of $\frac{|\gamma|+1}{|\gamma+1|}r_b^{-1}$ for $0\leq|\gamma|\leq 1$.

This shows that element-wise mixed backward stability implies tropical backward stability. The converse also holds, as we will now show. Let $\textup{TBE}(\hat{X})=\varepsilon$ for the computed roots $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$. If $r_b=1$, then we can take $\tilde{x}_1=\hat{x}_1$ and $\tilde{x}_2=\hat{x}_2$, so that $\textup{EMBE}(\hat{X})\leq\textup{TBE}(\hat{X})$. Suppose now that $r_b=\sqrt{|ac|}/|b|>1$ and assume without loss of generality that $|\hat{x}_1|\leq|\hat{x}_2|$. $\textup{TBE}(\hat{X})=\varepsilon$ implies that there exists $\hat{\delta}_b\in\mathbb{C}$ with $|\hat{\delta}_b|\leq\varepsilon$ such that

\[ -\hat{b} = a(\hat{x}_1+\hat{x}_2) = -b(1+r_b\hat{\delta}_b). \]

Let $\tilde{x}_1=\hat{x}_1+\frac{b}{a}r_b\hat{\delta}_b$ and $\tilde{x}_2=\hat{x}_2$. Then we have

\[ -\tilde{b} = a(\tilde{x}_1+\tilde{x}_2) = a\left(\hat{x}_1+\tfrac{b}{a}r_b\hat{\delta}_b+\hat{x}_2\right) = a(\hat{x}_1+\hat{x}_2)+br_b\hat{\delta}_b = -b(1+r_b\hat{\delta}_b)+br_b\hat{\delta}_b = -b. \]

Note that

\[ |\tilde{x}_1-\hat{x}_1| = \left|\frac{b}{a}r_b\hat{\delta}_b\right| = \sqrt{\left|\frac{c}{a}\right|}\,|\hat{\delta}_b| = \sqrt{|x_1x_2|}\,|\hat{\delta}_b| \lesssim |\tilde{x}_1|\varepsilon. \]

We also have

\[ \tilde{c}-\hat{c} = a\tilde{x}_1\tilde{x}_2-a\hat{x}_1\hat{x}_2 = a\left(\hat{x}_1+\tfrac{b}{a}r_b\hat{\delta}_b\right)\hat{x}_2 - a\hat{x}_1\hat{x}_2 = br_b\hat{\delta}_b\hat{x}_2, \]

from which we get

\[ |\tilde{c}-\hat{c}| = |a|\left|\frac{b}{a}r_b\hat{\delta}_b\hat{x}_2\right| \lesssim \varepsilon|a||\tilde{x}_1\tilde{x}_2| = \varepsilon|\tilde{c}|. \]

We conclude that tropical backward stability implies element-wise mixed backward stability.
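The construction above can be verified numerically. The following Python sketch uses a hypothetical quadratic ($a=1$, $b=-10^{-6}$, $c=-1$, so $r_b\approx 10^{6}$) and simulated computed roots; it absorbs the whole residual on $b$ into $\tilde{x}_1$ exactly as in the text:

```python
import math

# hypothetical quadratic with nearly opposite roots: r_b = sqrt(|ac|)/|b| >> 1
a, b, c = 1.0, -1e-6, -1.0
rb = max(1.0, math.sqrt(abs(a * c)) / abs(b))

# exact roots, then simulated computed roots with tiny relative perturbations
disc = math.sqrt(b * b - 4 * a * c)
x1, x2 = (-b - disc) / (2 * a), (-b + disc) / (2 * a)
xh1, xh2 = x1 * (1 + 3e-12), x2 * (1 - 2e-12)

# residual on b, scaled as in the definition of the TBE
bh = -a * (xh1 + xh2)
delta = (bh - b) / (rb * b)              # plays the role of delta_b-hat

# the construction from the text: x~1 = x^1 + (b/a) r_b delta, x~2 = x^2
xt1, xt2 = xh1 + (b / a) * rb * delta, xh2
bt = -a * (xt1 + xt2)

print(abs(bt - b) / abs(b))              # b is recovered up to round-off
print(abs(xt1 - xh1) / abs(xt1))         # the root moves by a relative O(eps) only
```

Although the computed roots carry a large relative error on $b$ after cancellation, the modified root $\tilde{x}_1$ restores $b$ while moving only by a relative amount of the order of $|\hat{\delta}_b|$.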

Remark 1.

We assumed $a,b,c\in\mathbb{C}^{*}$ in this discussion. If $a=0$, we are solving a linear equation and there is nothing to prove. If $c=0$, the root $x_1=0$ can be deflated and we are again left with a linear equation. If $b=0$, a similar derivation can be made. We omit the details but give a brief outline. We replace the conditions on $\tilde{b}$ and $\hat{b}$ in the definitions of $\textup{EMBE}(\hat{X})$ and $\textup{TBE}(\hat{X})$ by $|\tilde{b}|\leq\varepsilon$ and $|\hat{b}|\leq r_b\varepsilon$, respectively, where $r_b=\max(1,\sqrt{|ac|})=\max(1,|ax_1|)$ (note that in this case $|x_1|=|x_2|$). One derives bounds in a similar way for the implication $\textup{EMBE}(\hat{X})=\varepsilon\Rightarrow\textup{TBE}(\hat{X})=O(\varepsilon)$. For the other implication, there is again nothing to prove when $r_b=1$. When $r_b=|ax_1|$, we observe that we can write $\hat{b}=-r_b\hat{\delta}_b$ with $|\hat{\delta}_b|\leq\varepsilon$, and we set $\tilde{x}_1=\hat{x}_1+a^{-1}r_b\hat{\delta}_b$ and $\tilde{x}_2=\hat{x}_2$.

We arrive at the following statement.

Proposition 2.1.

Let $\hat{X}=\{\hat{x}_1,\hat{x}_2\}$ be a set of approximations for the roots $X=\{x_1,x_2\}$ of a quadratic polynomial $f=ax^2+bx+c\in\mathbb{C}[x]$, where $a,c\neq 0$. Under the assumption that $|\hat{x}_i|$ has the same order of magnitude as $|x_i|$, $i=1,2$, we have that

\[ \textup{EMBE}(\hat{X})=O(u) \quad\text{if and only if}\quad \textup{TBE}(\hat{X})=O(u). \]

3 Backward Error for Polynomials of General Degree

We now generalize the results of the previous section to polynomials of arbitrary degree $d$. In the following, let $f=\sum_{i=0}^{d}c_ix^i\in\mathbb{C}[x]$, with $c_0,c_d\neq 0$, be a polynomial of degree $d$ with roots $X=\{x_1,\ldots,x_d\}\subset\mathbb{C}^{*}$, and let $\hat{X}=\{\hat{x}_1,\ldots,\hat{x}_d\}\subset\mathbb{C}^{*}$ be the approximate roots. Without loss of generality, we assume that the roots are labeled such that $|x_1|\leq|x_2|\leq\cdots\leq|x_d|$.

We first formally generalize the notion of the element-wise mixed backward error due to Mastronardi and Van Dooren [MVD15] from quadratic polynomials to general polynomials of degree dd.

Definition 3.1 (Element-wise mixed backward error).

The element-wise mixed backward error of $\hat{X}$, denoted $\textup{EMBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that there exist points $\tilde{X}=\{\tilde{x}_1,\ldots,\tilde{x}_d\}\subset\mathbb{C}^{*}$ satisfying $|\hat{x}_j-\tilde{x}_j|\leq\varepsilon|\tilde{x}_j|$, $j=1,\ldots,d$, and

\[ \begin{cases}|c_i-\tilde{c}_i|\leq\varepsilon|c_i|, & c_i\neq 0,\\ |c_i-\tilde{c}_i|\leq\varepsilon, & c_i=0,\end{cases} \]

for all $i=0,\ldots,d-1$, where the $\tilde{c}_i$ are the coefficients of

\[ \tilde{f} = c_d\prod_{j=1}^{d}(x-\tilde{x}_j) = c_dx^d+\sum_{i=0}^{d-1}\tilde{c}_ix^i. \]

Before we can state the generalization of the tropical backward error to degree $d$ polynomials, we need to introduce some definitions and concepts. We define the tropical polynomial $\mathrm{t}f(\tau)$ associated to $f(x)$ as

\[ \mathrm{t}f:\mathbb{R}\cup\{-\infty\}\rightarrow\mathbb{R}\cup\{-\infty\},\quad \tau\mapsto\max_{0\leq i\leq d}\,(v_i+i\tau), \tag{8} \]

where $v_i=\log|c_i|$. The number $v_i$ is the valuation of $c_i$ under the valuation map $\log|\cdot|$. Any base for the logarithm can be used in theory. We want to think of the image under the valuation map as the ‘order of magnitude’ of the modulus of a complex number. In this paper, when we state that $\log|c|\approx 0$, $c\in\mathbb{C}$, we mean that $|c|$ is of order 1. In tropical geometry, the map $\log|\cdot|$ is referred to as an Archimedean valuation. For a general introduction to tropical geometry we refer to [Sha11, MS15].

Figure 3: Consider $f(x)=\frac{1}{2}x^7+3x^6+\frac{1}{2}x^5+5x^4+\frac{5}{2}x^3+3x^2+6x+\frac{5}{2}$. The figure depicts the upper convex hull of the lifted Newton polytope, the geometric derivation of the $r_i$ values, and the induced subdivision $\Delta=\{(\beta_0=0,\beta_1=1),(1,\beta_2=4),(4,\beta_3=6),(6,\beta_4=7)\}$.

The Newton polytope of $f$ is the line segment $[0,d]\subset\mathbb{R}$. The convex hull of the points $\{(i,v_i)\}_{0\leq i\leq d}\subset\mathbb{R}^2$ is called the lifted Newton polytope. We will consider the upper hull of the lifted Newton polytope. For a specific example, this is shown as a solid black line in Figure 3. The vertices of this upper hull are the points $(\beta_\ell,v_{\beta_\ell})$, $\ell=0,\ldots,s$, with

\[ 0=\beta_0<\beta_1<\cdots<\beta_s=d. \]

We call the set $\Delta=\{(\beta_0,\beta_1),(\beta_1,\beta_2),\ldots,(\beta_{s-1},\beta_s)\}$ the subdivision induced by the coefficients $c_i$, or the induced subdivision for short. We say that a point $\tau\in\mathbb{R}\cup\{-\infty\}$ is a root of $\mathrm{t}f$ if the maximum in (8) is attained at least twice. A root of $\mathrm{t}f$ is called a tropical root of $f$. The multiplicity of a root $\tau$ of $\mathrm{t}f$ is the number $\beta_\ell-\beta_{\ell-1}$, where $\beta_\ell$ and $\beta_{\ell-1}$ are the largest, respectively smallest, value of $i$ for which $v_i+i\tau$ is maximal. Counted with multiplicity, $\mathrm{t}f$ has $d$ roots $\tau_1,\ldots,\tau_d$, and we can give a closed formula for them. For $\beta_{\ell-1}<i\leq\beta_\ell$ we have

\[ \tau_i = \frac{1}{m_\ell}\left(v_{\beta_{\ell-1}}-v_{\beta_\ell}\right) = \log\left|\frac{c_{\beta_{\ell-1}}}{c_{\beta_\ell}}\right|^{\frac{1}{m_\ell}}, \]

where $m_\ell=\beta_\ell-\beta_{\ell-1}$ is the multiplicity. In particular, the definition implies

\[ \tau_1=\cdots=\tau_{\beta_1}<\tau_{\beta_1+1}=\cdots=\tau_{\beta_2}<\cdots<\tau_{\beta_{s-1}+1}=\cdots=\tau_{\beta_s}=\tau_d. \]

Tropical roots of polynomials are used for scaling (matrix) polynomial eigenvalue problems, see for instance [BNS13, GS09, NST15].

Furthermore, for $i=0,\ldots,d$ we define the constants

\[ r_i = \begin{cases}1, & i=\beta_\ell\text{ for some }\ell,\\ \exp(v_{\beta_\ell}+(\beta_\ell-i)\tau_i-v_i), & \beta_{\ell-1}<i<\beta_\ell\text{ for some }\ell\text{ and }v_i\in\mathbb{R},\\ \exp(v_{\beta_\ell}+(\beta_\ell-i)\tau_i), & \beta_{\ell-1}<i<\beta_\ell\text{ for some }\ell\text{ and }v_i=-\infty.\end{cases} \]

Geometrically, if $c_i\neq 0$, then $\log r_i$ is the distance of $v_i=\log|c_i|$ to the upper convex hull of the lifted Newton polytope. Figure 3 illustrates these concepts. Note that $\tau_i$, $\beta_{\ell-1}<i\leq\beta_\ell$, is minus the slope of the line connecting $(\beta_{\ell-1},v_{\beta_{\ell-1}})$ and $(\beta_\ell,v_{\beta_\ell})$. We note that the complexity of computing the tropical roots of $f$ is linear in the degree $d$; see e.g. [Sha11, Proposition 2.2.1].
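The quantities introduced above can be computed in a few lines. The following Python sketch (an independent illustration, not the algorithm of [TVB20]) computes the induced subdivision, the tropical roots and the weights $r_i$ for the example polynomial of Figure 3:

```python
import math

# Figure 3 example: f = 1/2 x^7 + 3 x^6 + 1/2 x^5 + 5 x^4 + 5/2 x^3 + 3 x^2 + 6 x + 5/2
c = [2.5, 6.0, 3.0, 2.5, 5.0, 0.5, 3.0, 0.5]    # c[i] multiplies x^i
v = [math.log(abs(ci)) for ci in c]             # valuations v_i = log|c_i|

def upper_hull(v):
    """Indices of the vertices of the upper convex hull of {(i, v_i)}."""
    hull = [0]
    for i in range(1, len(v)):
        while len(hull) >= 2:
            j, k = hull[-2], hull[-1]
            # drop k if it lies on or below the segment from j to i
            if (v[k] - v[j]) * (i - k) <= (v[i] - v[k]) * (k - j):
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

beta = upper_hull(v)
print(beta)                                      # [0, 1, 4, 6, 7], as in Figure 3

taus, r = [], [1.0] * len(c)
for b0, b1 in zip(beta, beta[1:]):
    m = b1 - b0                                  # multiplicity of this tropical root
    tau = (v[b0] - v[b1]) / m
    taus += [tau] * m
    for i in range(b0 + 1, b1):                  # log r_i = distance to the hull
        r[i] = math.exp(v[b1] + (b1 - i) * tau - v[i])

print([round(t, 3) for t in taus])               # increasing, counted with multiplicity
print([round(ri, 2) for ri in r])                # r_i = 1 at hull vertices, > 1 inside
```

The hull pass is linear in the degree once the points are given in order, matching the complexity remark above.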

Definition 3.2 (Tropical backward error).

The tropical backward error of $\hat{X}$, denoted $\textup{TBE}(\hat{X})$, is the smallest number $\varepsilon\geq 0$ such that for all $i=0,\ldots,d-1$

\[ \begin{cases}|c_i-\hat{c}_i|\leq r_i\varepsilon|c_i|, & c_i\neq 0,\\ |c_i-\hat{c}_i|\leq r_i\varepsilon, & c_i=0,\end{cases} \]

where the $\hat{c}_i$ are the coefficients of

\[ \hat{f} = c_d\prod_{j=1}^{d}(x-\hat{x}_j) = c_dx^d+\sum_{i=0}^{d-1}\hat{c}_ix^i. \]

In the following, we show that an element-wise mixed backward error of the order of the machine precision implies a tropical backward error of the same order, and that, under suitable assumptions, the converse holds as well. As in Section 2 for the quadratic case, we assume in the following that $|x_j|$, $|\hat{x}_j|$ and $|\tilde{x}_j|$ have the same order of magnitude for $j=1,\ldots,d$.

3.1 Element-wise Mixed Backward Stability implies Tropical Backward Stability

We start by showing that $\textup{EMBE}(\hat{X})=\varepsilon$ implies $\textup{TBE}(\hat{X})=O(\varepsilon)$. The coefficients of the polynomial $f$ can be considered as functions of the roots $x_1,\ldots,x_d$. Let

\[ \sigma_k(x_1,\ldots,x_d) = \sum_{|I|=k}\prod_{i\in I}x_i \]

be the $k$-th elementary symmetric polynomial. We have the identity

\[ f = c_d\prod_{i=1}^{d}(x-x_i) = c_d\sum_{k=0}^{d}(-1)^{d-k}\sigma_{d-k}(x_1,\ldots,x_d)\,x^k. \]
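This identity (Vieta's formulas) is easy to verify numerically. The following Python sketch, with hypothetical roots and leading coefficient, compares the coefficients obtained from the elementary symmetric polynomials with a direct expansion of the product:

```python
from itertools import combinations
from math import prod

def sigma(k, xs):
    """k-th elementary symmetric polynomial of the roots xs."""
    return sum(prod(sub) for sub in combinations(xs, k))

# hypothetical roots and leading coefficient
roots, cd = [1.0, -2.0, 0.5, 3.0], 2.0
d = len(roots)

# c_k = cd * (-1)^(d-k) * sigma_{d-k}(roots), increasing-degree order
coeffs = [cd * (-1) ** (d - k) * sigma(d - k, roots) for k in range(d + 1)]

def mul_linear(p, r):
    """Multiply p (increasing-degree coefficients) by (x - r)."""
    q = [0.0] + p
    for i, pi in enumerate(p):
        q[i] -= r * pi
    return q

# cross-check by expanding cd*(x - x1)...(x - xd) directly
expanded = [cd]
for root in roots:
    expanded = mul_linear(expanded, root)

print(coeffs)
print(expanded)   # the two coefficient lists agree
```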
Lemma 3.1.

If $\textup{EMBE}(\hat{X})=\varepsilon$, then the coefficients of

\[ \hat{f} = c_dx^d+\sum_{i=0}^{d-1}\hat{c}_ix^i = c_d(x-\hat{x}_1)\cdots(x-\hat{x}_d) \]

satisfy

\begin{align*}
\left|\frac{\hat{c}_i-c_i}{c_i}\right| &\leq \varepsilon\left(1+(d-i)\frac{\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+O(\varepsilon)\right) &&\text{when } c_i\neq 0,\\
|\hat{c}_i-c_i| &\leq \varepsilon\left(1+|c_d|(d-i)\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)+O(\varepsilon)\right) &&\text{when } c_i=0.
\end{align*}
Proof.

Suppose $c_i\neq 0$. We have

\[ \frac{\hat{c}_i-c_i}{c_i} = \frac{\hat{c}_i-\tilde{c}_i}{c_i}+\frac{\tilde{c}_i-c_i}{c_i}. \]

This implies

\begin{align*}
\left|\frac{\hat{c}_i-c_i}{c_i}\right| &\leq \left|\frac{\hat{c}_i-\tilde{c}_i}{c_i}\right|+\left|\frac{\tilde{c}_i-c_i}{c_i}\right|\\
&\leq \frac{|\sigma_{d-i}(\hat{x}_1,\ldots,\hat{x}_d)-\sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)|}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+\varepsilon\\
&= \frac{|\sigma_{d-i}(\tilde{x}_1(1+\hat{\delta}_1),\ldots,\tilde{x}_d(1+\hat{\delta}_d))-\sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)|}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+\varepsilon,
\end{align*}

where $|\hat{\delta}_i|\leq\varepsilon$, $i=1,\ldots,d$. Note that the second inequality and the equality both use $\textup{EMBE}(\hat{X})=\varepsilon$. We now observe

\[ \sigma_{d-i}(\tilde{x}_1(1+\hat{\delta}_1),\ldots,\tilde{x}_d(1+\hat{\delta}_d)) = \sigma_{d-i}(\tilde{x}_1,\ldots,\tilde{x}_d)+\sum_{|I|=d-i}\Big(\sum_{j\in I}\hat{\delta}_j\Big)\prod_{j\in I}\tilde{x}_j+\text{h.o.t.}, \]

where the ‘higher order terms’ contain at least two of the $\hat{\delta}_i$. This, together with the triangle inequality, shows

\[ \left|\frac{\hat{c}_i-c_i}{c_i}\right| \leq \varepsilon(d-i)\frac{\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)}{|\sigma_{d-i}(x_1,\ldots,x_d)|}+O(\varepsilon^2)+\varepsilon. \]

The case $c_i=0$ is completely analogous. ∎

It is well known that the values $\exp(\tau_i)$ are related to the moduli of the classical roots $x_i$; see for instance [Sha11]. In what follows, we will make the assumption that the order of magnitude of $\exp(\tau_i)$ is equal to that of $|x_i|$ (and, by our previous assumption, also to that of $|\hat{x}_i|$ and $|\tilde{x}_i|$). Under this assumption, we have for $\beta_{\ell-1}<i\leq\beta_\ell$ that

\[ \sigma_{d-i}(|x_1|,\ldots,|x_d|) = \sum_{|I|=d-i}\prod_{j\in I}|x_j| = D_i\binom{m_\ell}{\beta_\ell-i}\prod_{k=0}^{d-i-1}\exp(\tau_{d-k}), \]

with $D_i$ a not too large constant and

\[ \binom{m_\ell}{\beta_\ell-i} = \frac{m_\ell!}{(\beta_\ell-i)!\,(i-\beta_{\ell-1})!} \]

the binomial coefficient. This can be seen as follows. The important terms in the expansion of $\sigma_{d-i}(|x_1|,\ldots,|x_d|)$ are those containing $d-i$ large roots. By the ordering of the roots by their moduli, these terms have the order of magnitude of $\prod_{k=0}^{d-i-1}\exp(\tau_{d-k})$. We can assume that each of these terms contains the $m_{\ell+1}+\cdots+m_s\leq d-i$ largest roots. For the remaining factors, we can choose $\beta_\ell-i$ roots among $\{x_{\beta_{\ell-1}+1},\ldots,x_{\beta_\ell}\}$.
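This counting argument can be illustrated numerically. In the following Python sketch, the root moduli are hypothetical and chosen so that $i=2$ lies strictly inside a segment of the subdivision; the constant $D_i$ then comes out close to 1:

```python
from itertools import combinations
from math import comb, prod

def sigma(k, xs):
    """k-th elementary symmetric polynomial (here applied to moduli)."""
    return sum(prod(s) for s in combinations(xs, k))

# root moduli grouped by order of magnitude: exp(tau) = 1e-3, 1 (twice), 100
mods = [1e-3, 1.0, 1.0, 100.0]
d = len(mods)

# i = 2 lies strictly inside the segment (beta_{l-1}, beta_l) = (1, 3), m_l = 2
i, beta_prev, beta_next = 2, 1, 3
m = beta_next - beta_prev

# dominant contribution: the d - i largest moduli, with binom(m, beta_l - i)
# choices for the factors taken from the middle group
dominant = comb(m, beta_next - i) * prod(sorted(mods)[i:])
D = sigma(d - i, mods) / dominant
print(D)   # close to 1: the estimate captures the order of magnitude
```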

By our assumption that $|\tilde{x}_i|\approx|x_i|$, we have

\[ \sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|) = \sum_{|I|=d-i}\prod_{j\in I}|\tilde{x}_j| = \tilde{D}_i\binom{m_\ell}{\beta_\ell-i}\prod_{k=0}^{d-i-1}\exp(\tau_{d-k}), \]

with $\tilde{D}_i$ a not too large constant. Taking valuations on both sides of this equality, we get

\[ \log|\sigma_{d-i}(|\tilde{x}_1|,\ldots,|\tilde{x}_d|)| = w_i+\sum_{k=0}^{d-i-1}\tau_{d-k}, \tag{9} \]

where $w_i=\log\left|\tilde{D}_i\binom{m_\ell}{\beta_\ell-i}\right|$ and $\exp(w_i)$ is a not too large positive number. Equation (9) is the assumption we will use in the next theorem.

Theorem 3.1.

If \textup{EMBE}(\hat{X})=\varepsilon, the order of magnitude of |x_{i}| equals that of |\hat{x}_{i}| and of \exp(\tau_{i}) for i=1,\ldots,d, and (9) holds, then the coefficients of

f^=cdxd+i=0d1c^ixi=cd(xx^1)(xx^d)\hat{f}=c_{d}x^{d}+\sum_{i=0}^{d-1}\hat{c}_{i}x^{i}=c_{d}(x-\hat{x}_{1})\cdots(x-\hat{x}_{d})

satisfy

|c^icici|\displaystyle\left|\frac{\hat{c}_{i}-c_{i}}{c_{i}}\right| ε(1+(di)exp(wi)ri+O(ε))when ci0\displaystyle\leq\varepsilon\left(1+(d-i)\exp(w_{i})r_{i}+O(\varepsilon)\right)\quad\text{when }c_{i}\neq 0
and
|c^ici|\displaystyle|\hat{c}_{i}-c_{i}| ε(1+(di)exp(wi)ri+O(ε))when ci=0.\displaystyle\leq\varepsilon\left(1+(d-i)\exp(w_{i})r_{i}+O(\varepsilon)\right)\quad\text{when }c_{i}=0.

In particular, TBE(X^)εmaxi(1+(di)exp(wi)+O(ε))\textup{TBE}(\hat{X})\leq\varepsilon\max_{i}(1+(d-i)\exp(w_{i})+O(\varepsilon)).

Proof.

For β1iβ\beta_{\ell-1}\leq i\leq\beta_{\ell}, assume ci0c_{i}\neq 0. Note that

\log|r_{i}| =v_{\beta_{\ell}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{\beta_{\ell+1}}+m_{\ell+1}\tau_{\beta_{\ell+1}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{\beta_{\ell+2}}+m_{\ell+2}\tau_{\beta_{\ell+2}}+m_{\ell+1}\tau_{\beta_{\ell+1}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=\ldots
=v_{\beta_{s}}+\sum_{k=\ell+1}^{s}m_{k}\tau_{\beta_{k}}+(\beta_{\ell}-i)\tau_{\beta_{\ell}}-v_{i}
=v_{d}+\sum_{k=0}^{d-i-1}\tau_{d-k}-v_{i}.

Now, using (9) and vi=vd+log|σdi(x1,,xd)|v_{i}=v_{d}+\log|\sigma_{d-i}(x_{1},\ldots,x_{d})| we get

log|ri|=log|σdi(|x~1|,,|x~d|)|log|σdi(x1,,xd)|wi.\log|r_{i}|=\log|\sigma_{d-i}(|\tilde{x}_{1}|,\ldots,|\tilde{x}_{d}|)|-\log|\sigma_{d-i}(x_{1},\ldots,x_{d})|-w_{i}.

Therefore

exp(wi)ri=exp(wi+log|ri|)=σdi(|x~1|,,|x~d|)|σdi(x1,,xd)|\exp(w_{i})r_{i}=\exp(w_{i}+\log|r_{i}|)=\frac{\sigma_{d-i}(|\tilde{x}_{1}|,\ldots,|\tilde{x}_{d}|)}{|\sigma_{d-i}(x_{1},\ldots,x_{d})|}

and we are done by Lemma 3.1. The proof for ci=0c_{i}=0 is analogous. ∎

In [MVD15], Mastronardi and Van Dooren show that, for d=2d=2, EMBE(X^)=O(u)\textup{EMBE}(\hat{X})=O(u) implies NBE(X^)=O(u)\textup{NBE}(\hat{X})=O(u), where NBE(X^)\textup{NBE}(\hat{X}) is the norm-wise backward error as defined in the introduction. Theorem 3.1 allows us to prove this statement for general degrees.

Proposition 3.1.

If EMBE(X^)=ε\textup{EMBE}(\hat{X})=\varepsilon and the assumptions of Theorem 3.1 are satisfied, we have that

(c0,,cd1,cd)(c^0,,c^d1,cd)2=O(ε)(c0,,cd1,cd)2.\lVert{(c_{0},\ldots,c_{d-1},c_{d})-(\hat{c}_{0},\ldots,\hat{c}_{d-1},c_{d})}\rVert_{2}=O(\varepsilon)\lVert{(c_{0},\ldots,c_{d-1},c_{d})}\rVert_{2}.
Proof.

Suppose that ci0c_{i}\neq 0. We have

|cic^i|\displaystyle|c_{i}-\hat{c}_{i}| |cic~i|+|c^ic~i|.\displaystyle\leq|c_{i}-\tilde{c}_{i}|+|\hat{c}_{i}-\tilde{c}_{i}|.

Since \textup{EMBE}(\hat{X})=\varepsilon, we have that |c_{i}-\tilde{c}_{i}|\leq\varepsilon|c_{i}|. Combining this with Theorem 3.1, we have that

|c^ic~i|=O(ε)|ci|ri.|\hat{c}_{i}-\tilde{c}_{i}|=O(\varepsilon)|c_{i}|r_{i}.

Hence, we obtain the bound

|cic^i|ε|ci|+O(ε)|ci|ri.|c_{i}-\hat{c}_{i}|\leq\varepsilon|c_{i}|+O(\varepsilon)|c_{i}|r_{i}.

Analogously, when ci=0c_{i}=0 we obtain

|cic^i|ε+O(ε)ri.|c_{i}-\hat{c}_{i}|\leq\varepsilon+O(\varepsilon)r_{i}.

It follows that

\lVert(c_{0}-\hat{c}_{0},\ldots,c_{d-1}-\hat{c}_{d-1},c_{d}-c_{d})\rVert_{2}\leq\varepsilon\lVert(c_{0},\ldots,c_{d})\rVert_{2}+O(\varepsilon)\lVert(r_{0}c_{0},\ldots,r_{d}c_{d})\rVert_{2}
\leq\varepsilon\lVert(c_{0},\ldots,c_{d})\rVert_{2}+O(\varepsilon)\sqrt{d+1}\lVert(c_{0},\ldots,c_{d})\rVert_{2},

where the last inequality follows from

\max_{i=0,\ldots,d}|c_{i}|\leq\lVert(c_{0},\ldots,c_{d})\rVert_{2}\leq\sqrt{d+1}\max_{i=0,\ldots,d}|c_{i}|,
\max_{i=0,\ldots,d}|c_{i}|\leq\lVert(r_{0}c_{0},\ldots,r_{d}c_{d})\rVert_{2}\leq\sqrt{d+1}\max_{i=0,\ldots,d}|c_{i}|

because r_{i}=1 for i=\operatorname{argmax}_{\ell=0,\ldots,d}|c_{\ell}|. ∎

We note that the proof of Proposition 3.1 can be summarized as

EMBE(X^)=εTheorem 3.1TBE(X^)=O(ε)NBE(X^)=O(ε).\textup{EMBE}(\hat{X})=\varepsilon\overset{\text{Theorem \ref{thm:EMBEimpliesTBE}}}{\Longrightarrow}\textup{TBE}(\hat{X})=O(\varepsilon)\Longrightarrow\textup{NBE}(\hat{X})=O(\varepsilon).
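The chain of implications can be illustrated numerically. The Python sketch below (hypothetical data, monic normalization, with numpy.poly rebuilding coefficients from roots) perturbs each root by a relative \varepsilon and measures the norm-wise relative change of the coefficient vector:

```python
import numpy as np
rng = np.random.default_rng(1)

eps = 1e-8                              # size of the relative root perturbation
x = np.array([0.1, 1.0, 2.0, 10.0])     # hypothetical well-separated roots
x_hat = x * (1 + eps * rng.uniform(-1, 1, x.size))
c = np.poly(x)                          # coefficients of prod (t - x_i), monic
c_hat = np.poly(x_hat)
nbe = np.linalg.norm(c - c_hat) / np.linalg.norm(c)
print(nbe)                              # of order eps
```

For well-conditioned roots such as these, the measured norm-wise error stays of the same order as the root perturbation.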

3.2 Tropical Backward Stability implies Element-wise Backward Stability?

We now show that a tropical backward error of order ε\varepsilon also implies a mixed element-wise backward error of the same magnitude under some assumptions. For this, consider the perturbed polynomials

f^=f+Δ^f\displaystyle\hat{f}=f+\hat{\Delta}f =i=0d1ci(1+riδi)xi+cdxd\displaystyle=\sum_{i=0}^{d-1}c_{i}(1+r_{i}\delta_{i})x^{i}+c_{d}x^{d}
and
f~=f+Δ~f\displaystyle\tilde{f}=f+\tilde{\Delta}f =i=0d1ci(1+κiδie1θi)xi+cdxd\displaystyle=\sum_{i=0}^{d-1}c_{i}(1+\kappa_{i}\delta_{i}e^{\sqrt{-1}\theta_{i}})x^{i}+c_{d}x^{d} (10)

where \log|\delta_{i}|\approx\log|\varepsilon|=v_{\varepsilon} and \kappa_{i}\in\mathbb{R}, \theta_{i}\in[0,2\pi) are parameters. We assume that \kappa_{i} is not too large, i.e. \log|\kappa_{i}|\approx 0, so that \tilde{\Delta}f is a ‘small’ perturbation. Observe that for the roots \hat{x}_{j}=x_{j}+\hat{\Delta}x_{j} of \hat{f} we have

0=(f+Δ^f)(xj+Δ^xj)=f(xj+Δ^xj)+Δ^f(xj+Δ^xj)=f(xj)+f(xj)Δ^xj+f′′(xj)2Δ^xj2++f(d)(xj)d!Δ^xjd+Δ^f(xj)+Δ^f(xj)Δ^xj+Δ^f′′(xj)2Δ^xj2++Δ^f(d)(xj)d!Δ^xjd.\begin{array}[]{rl}0&=(f+\hat{\Delta}f)(x_{j}+\hat{\Delta}x_{j})\\ &=f(x_{j}+\hat{\Delta}x_{j})+\hat{\Delta}f(x_{j}+\hat{\Delta}x_{j})\\ &=f(x_{j})+f^{\prime}(x_{j})\hat{\Delta}x_{j}+\frac{f^{\prime\prime}(x_{j})}{2}\hat{\Delta}x_{j}^{2}+\cdots+\frac{f^{(d)}(x_{j})}{d!}\hat{\Delta}x_{j}^{d}\\ &\phantom{=}+\hat{\Delta}f(x_{j})+\hat{\Delta}f^{\prime}(x_{j})\hat{\Delta}x_{j}+\frac{\hat{\Delta}f^{\prime\prime}(x_{j})}{2}\hat{\Delta}x_{j}^{2}+\cdots+\frac{\hat{\Delta}f^{(d)}(x_{j})}{d!}\hat{\Delta}x_{j}^{d}.\end{array} (11)

From this we conclude that Δ^xj\hat{\Delta}x_{j} is a root of the polynomial

E^(x)=Δ^f(xj)+(f(xj)+Δ^f(xj))x1!++(f(d)(xj)+Δ^f(d)(xj))xdd!.\hat{E}(x)=\hat{\Delta}f(x_{j})+(f^{\prime}(x_{j})+\hat{\Delta}f^{\prime}(x_{j}))\frac{x}{1!}+\cdots+(f^{(d)}(x_{j})+\hat{\Delta}f^{(d)}(x_{j}))\frac{x^{d}}{d!}.

Similarly, for the roots x~1=x1+Δ~x1,,x~d=xd+Δ~xd\tilde{x}_{1}=x_{1}+\tilde{\Delta}x_{1},\ldots,\tilde{x}_{d}=x_{d}+\tilde{\Delta}x_{d} of f~\tilde{f} we have that Δ~xj\tilde{\Delta}x_{j} is a root of the polynomial

E~(x)=Δ~f(xj)+(f(xj)+Δ~f(xj))x1!++(f(d)(xj)+Δ~f(d)(xj))xdd!.\tilde{E}(x)=\tilde{\Delta}f(x_{j})+(f^{\prime}(x_{j})+\tilde{\Delta}f^{\prime}(x_{j}))\frac{x}{1!}+\cdots+(f^{(d)}(x_{j})+\tilde{\Delta}f^{(d)}(x_{j}))\frac{x^{d}}{d!}.

To show that tropical backward stability implies element-wise mixed backward stability we need three assumptions: the maximum in (8) defining the tropical root \tau_{i} should be attained at exactly two indices, the remaining terms should be significantly smaller, and \tau_{i} should be of the same order of magnitude as \log|x_{i}|.
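The first of these assumptions, that the maximum is attained at exactly two indices, is easy to check numerically. A minimal Python sketch for the made-up polynomial f=(x-10)(x-0.1)=x^{2}-10.1x+1:

```python
import math

c = [1.0, -10.1, 1.0]                   # f = (x - 10)(x - 0.1)
v = [math.log(abs(ci)) for ci in c]     # valuations v_i = log|c_i|
tau = v[1] - v[2]                       # tropical root where terms 1 and 2 tie
vals = [v[i] + i * tau for i in range(3)]
attain = [i for i, s in enumerate(vals) if math.isclose(s, max(vals))]
print(attain)                           # the max is attained exactly at [1, 2]
```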

Lemma 3.2.

If for the tropical root τj\tau_{j} of ff we have

  1. \{\beta\in\{0,\ldots,d\}\;|\;v_{\beta}+\beta\tau_{j}=\max_{i}v_{i}+i\tau_{j}\}=\{\beta_{\ell-1},\beta_{\ell}\} with \beta_{\ell-1}<\beta_{\ell},

  2. \log|x_{j}|\approx\tau_{j},

  3. |c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}| for i\neq\beta_{\ell-1},\beta_{\ell},

then for k1k\geq 1

log|f(k)(xj)|vβ+(βk)τj.\log|f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j}.

Here ‘\lesssim’ can be replaced by ‘\approx’ for k=1k=1.

Proof.

We have that xkf(k)(x)=i=kdcii!(ik)!xix^{k}f^{(k)}(x)=\sum_{i=k}^{d}c_{i}\frac{i!}{(i-k)!}x^{i}. We distinguish three different cases.

  1. (\beta_{\ell-1}-k\geq 0,\ \beta_{\ell}-k\geq 0). In this case

    |xjkf(k)(xj)|=K1|cβ1β1!(β1k)!xjβ1+cββ!(βk)!xjβ|,|x_{j}^{k}f^{(k)}(x_{j})|=K_{1}\left|c_{\beta_{\ell-1}}\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}x_{j}^{\beta_{\ell-1}}+c_{\beta_{\ell}}\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}x_{j}^{\beta_{\ell}}\right|,

    with log|K1|0\log|K_{1}|\approx 0. Since f(xj)=0f(x_{j})=0 and by assumption |cixji||cβxjβ|,iβ1,β|c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,i\neq\beta_{\ell-1},\beta_{\ell}, we have that

    |cβ1xjβ1+cβxjβ|\displaystyle|c_{\beta_{\ell-1}}x_{j}^{\beta_{\ell-1}}+c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}| =|iβ1,βcixji|iβ1,β|cixji||cβxjβ|.\displaystyle=\left|\sum_{i\neq\beta_{\ell-1},\beta_{\ell}}c_{i}x_{j}^{i}\right|\leq\sum_{i\neq\beta_{\ell-1},\beta_{\ell}}|c_{i}x_{j}^{i}|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|.

    Then

    |xjkf(k)(xj)|\displaystyle|x_{j}^{k}f^{(k)}(x_{j})| =K1|β1!(β1k)!(cβ1xβ1+cβxβ)+(β!(βk)!β1!(β1k)!)cβxjβ|\displaystyle=K_{1}\left|\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}(c_{\beta_{\ell-1}}x^{\beta_{\ell-1}}+c_{\beta_{\ell}}x^{\beta_{\ell}})+\left(\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}-\frac{\beta_{\ell-1}!}{(\beta_{\ell-1}-k)!}\right)c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}\right|
    =K2|cβxjβ|\displaystyle=K_{2}|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|

    with log|K2|0\log|K_{2}|\approx 0. The lemma now follows from taking valuations.

  2. (\beta_{\ell-1}-k<0,\ \beta_{\ell}-k\geq 0). The lemma follows from the observation that in this case

    |xjkf(k)(xj)|=K1|cββ!(βk)!xjβ|=K2|cβxjβ|,|x_{j}^{k}f^{(k)}(x_{j})|=K_{1}\left|c_{\beta_{\ell}}\frac{\beta_{\ell}!}{(\beta_{\ell}-k)!}x_{j}^{\beta_{\ell}}\right|=K_{2}|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

    with log|K1|0,log|K2|0\log|K_{1}|\approx 0,\log|K_{2}|\approx 0.

  3. (\beta_{\ell-1}-k<0,\ \beta_{\ell}-k<0). In this case

    |xjkf(k)(xj)|=|i=kdcii!(ik)!xji|i=kd|cii!(ik)!xji||cβxjβ|.|x_{j}^{k}f^{(k)}(x_{j})|=\left|\sum_{i=k}^{d}c_{i}\frac{i!}{(i-k)!}x_{j}^{i}\right|\leq\sum_{i=k}^{d}\left|c_{i}\frac{i!}{(i-k)!}x_{j}^{i}\right|\ll|c_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|.

Note that if k=1k=1, the third case is not possible because β>β10\beta_{\ell}>\beta_{\ell-1}\geq 0. ∎
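The k=1 case (where ‘\lesssim’ becomes ‘\approx’) can be checked on a small example. In this hypothetical Python sketch for f=(x-10)(x-0.1), the large root has \beta_{\ell}=2 and \tau_{j}=\log 10.1:

```python
import math

c = [1.0, -10.1, 1.0]                   # f = (x - 10)(x - 0.1) = x^2 - 10.1x + 1
xj = 10.0                               # the large root
lhs = math.log(abs(c[1] + 2 * c[2] * xj))        # log|f'(x_j)|
v_bl, tau = math.log(abs(c[2])), math.log(10.1)  # v_{beta_l}, tau_j (beta_l = 2)
rhs = v_bl + (2 - 1) * tau              # v_{beta_l} + (beta_l - 1) tau_j
print(lhs, rhs)                         # both about 2.3
```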

Lemma 3.3.

Under the assumptions of Lemma 3.2, we have that

log|Δ^f(xj)|vβ+vε+βτj,log|Δ~f(xj)|vβ+vε+βτj,\log|\hat{\Delta}f(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j},\quad\log|\tilde{\Delta}f(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j},
log|Δ^f(xj)|vβ+vε+(β1)τj,log|Δ~f(xj)|vβ+vε+(β1)τj,\log|\hat{\Delta}f^{\prime}(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j},\quad\log|\tilde{\Delta}f^{\prime}(x_{j})|\lesssim v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j},
log|f(k)(xj)+Δ^f(k)(xj)|vβ+(βk)τj,log|f(k)(xj)+Δ~f(k)(xj)|vβ+(βk)τj.\log|f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j},\quad\log|f^{(k)}(x_{j})+\tilde{\Delta}f^{(k)}(x_{j})|\lesssim v_{\beta_{\ell}}+(\beta_{\ell}-k)\tau_{j}.

In the last line, for k=1k=1 we can replace ‘\lesssim’ by ‘\approx’.

Proof.

We have

|\hat{\Delta}f(x_{j})| =\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}r_{i}x_{j}^{i}\right|=K_{1}\left|\sum_{\beta_{\ell-1}\leq i\leq\beta_{\ell}}c_{i}\delta_{i}r_{i}x_{j}^{i}\right|\leq K_{1}(\beta_{\ell}-\beta_{\ell-1})|c_{\beta_{\ell}}\delta_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

with log|K1(ββ1)|0\log|K_{1}(\beta_{\ell}-\beta_{\ell-1})|\approx 0, which proves the first statement. The second statement is proven by a completely analogous argument. The third statement follows from

|xjΔ^f(xj)|\displaystyle|x_{j}\hat{\Delta}f^{\prime}(x_{j})| =|i=1d1ciiδirixji|K1(ββ1)|cβδβxjβ|,\displaystyle=\left|\sum_{i=1}^{d-1}c_{i}i\delta_{i}r_{i}x_{j}^{i}\right|\leq K_{1}(\beta_{\ell}-\beta_{\ell-1})|c_{\beta_{\ell}}\delta_{\beta_{\ell}}x_{j}^{\beta_{\ell}}|,

with log|K1(ββ1)|0\log|K_{1}(\beta_{\ell}-\beta_{\ell-1})|\approx 0. The fourth statement is analogous. The fifth statement follows from

log|f(k)(xj)+Δ^f(k)(xj)|log|f(k)(xj)|\log|f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})|\approx\log|f^{(k)}(x_{j})|

and Lemma 3.2. The sixth statement follows again from an analogous argument. ∎

It follows from Lemma 3.3 that we can bound the lifted Newton polytopes of the polynomials E^,E~\hat{E},\tilde{E} from above. An example is shown in Figure 4.

Refer to caption
Figure 4: The blue line shows an upper bound for the lifted Newton polytopes of E^(x)\hat{E}(x) and E~(x)\tilde{E}(x). The actual lifted polytopes will meet the blue line (approximately) in the point (1,vβ+(β1)τj)(1,v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j}) (indicated with a small box).

The expansion (11) is used to approximate Δ^xj\hat{\Delta}x_{j} as

Δ^xjΔ^f(xj)f(xj)+Δ^f(xj).\hat{\Delta}x_{j}\approx\frac{-\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})+\hat{\Delta}f^{\prime}(x_{j})}. (12)

It is clear that this is an approximation for the smallest root of \hat{E}(x), which corresponds to the smallest tropical root \tau_{\hat{E}} of \hat{E}; this root is bounded by (see Figure 4)

τE^(vβ+vε+βτj)(vβ+(β1)τj)=τj+vε.\tau_{\hat{E}}\leq(v_{\beta_{\ell}}+v_{\varepsilon}+\beta_{\ell}\tau_{j})-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=\tau_{j}+v_{\varepsilon}. (13)

Analogously, we have for the smallest tropical root τE~\tau_{\tilde{E}} of E~\tilde{E} that τE~τj+vε\tau_{\tilde{E}}\leq\tau_{j}+v_{\varepsilon}. We will also make our usual assumption that the tropical roots give an indication for the order of magnitude of the classical roots, i.e.

log|Δ^xj|τj+vε,log|Δ~xj|τj+vε.\log|\hat{\Delta}x_{j}|\lesssim\tau_{j}+v_{\varepsilon},\quad\log|\tilde{\Delta}x_{j}|\lesssim\tau_{j}+v_{\varepsilon}. (14)

We conclude that the assumptions of Lemma 3.2 imply that \hat{X} and \tilde{X} have a relative forward error of size O(\varepsilon). This implies that \textup{EMBE}(\hat{X})=O(\varepsilon) (take \tilde{x}_{j}=x_{j}), which gives the following result.

Theorem 3.2.

Under the assumptions of Lemma 3.2, if TBE(X^)=ε\textup{TBE}(\hat{X})=\varepsilon and the order of magnitude of |xj||x_{j}| is equal to that of |xj^||\hat{x_{j}}| and exp(τj)\exp(\tau_{j}), then EMBE(X^)=O(ε)\textup{EMBE}(\hat{X})=O(\varepsilon).

In fact, the assumptions of Lemma 3.2 imply that the roots XX of ff are well-conditioned. Consider the first order approximation

|Δxj||xj|maxi=0,,d|cixji||f(xj)||xj||Δf(xj)|maxi=0,,d|cixji|,\frac{|\Delta x_{j}|}{|x_{j}|}\approx\frac{\max_{i=0,...,d}|c_{i}x_{j}^{i}|}{|f^{\prime}(x_{j})||x_{j}|}\frac{|\Delta f(x_{j})|}{\max_{i=0,...,d}|c_{i}x_{j}^{i}|},

for a perturbation Δf\Delta f on ff causing a perturbation Δxj\Delta x_{j} on xjx_{j}. Here maxi=0,,d|cixji|\max_{i=0,...,d}|c_{i}x_{j}^{i}| is used to measure the residual |Δf(xj)||\Delta f(x_{j})| with a relative criterion. The condition of a root xjx_{j} can be measured by

\frac{\max_{i=0,\ldots,d}|c_{i}x_{j}^{i}|}{|f^{\prime}(x_{j})||x_{j}|}=\frac{|c_{\beta_{\ell}}|\exp(\tau_{j})^{\beta_{\ell}}}{|f^{\prime}(x_{j})||x_{j}|}.

Using Lemma 3.2, we find that this number is of order 1.
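A quick numeric illustration (hypothetical example, not from the paper): for f=(x-10)(x-0.1), whose root magnitudes match its tropical roots, the condition number above is indeed of order 1 at both roots.

```python
coeffs = [1.0, -10.1, 1.0]              # f = (x - 10)(x - 0.1)
def df(x):                              # derivative f'(x)
    return sum(i * ci * x**(i - 1) for i, ci in enumerate(coeffs) if i > 0)

conds = []
for xj in (0.1, 10.0):
    num = max(abs(ci * xj**i) for i, ci in enumerate(coeffs))
    conds.append(num / (abs(df(xj)) * abs(xj)))
print(conds)                            # both close to 1
```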

In what follows, we will give a constructive proof for Theorem 3.2. That is, in the proof of Theorem 3.3 (which implies Theorem 3.2) we will give values for κi,θi\kappa_{i},\theta_{i} in (10) which realize a small EMBE. It uses the following lemma.

Lemma 3.4.

Under the assumptions of Lemma 3.2, we have that

log|Δ^xjΔ^f(xj)f(xj)|τj+2vε,log|Δ~xjΔ~f(xj)f(xj)|τj+2vε.\log\left|\hat{\Delta}x_{j}-\frac{-\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+2v_{\varepsilon},\quad\log\left|\tilde{\Delta}x_{j}-\frac{-\tilde{\Delta}f(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+2v_{\varepsilon}.
Proof.

It follows from E^(Δ^xj)=0\hat{E}(\hat{\Delta}x_{j})=0 that

f^{\prime}(x_{j})\hat{\Delta}x_{j}=-\left(\hat{\Delta}f(x_{j})+\hat{\Delta}x_{j}\hat{\Delta}f^{\prime}(x_{j})+\sum_{k=2}^{d}\hat{\Delta}x_{j}^{k}\frac{f^{(k)}(x_{j})+\hat{\Delta}f^{(k)}(x_{j})}{k!}\right).

The valuation of the first neglected term in the approximation (12) is

\log\left|\frac{\hat{\Delta}x_{j}\hat{\Delta}f^{\prime}(x_{j})}{f^{\prime}(x_{j})}\right|\lesssim\tau_{j}+v_{\varepsilon}+v_{\beta_{\ell}}+v_{\varepsilon}+(\beta_{\ell}-1)\tau_{j}-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=2v_{\varepsilon}+\tau_{j},

where we used Lemma 3.2, Lemma 3.3 and (14). For the term corresponding to k=2k=2, we have

\log\left|\frac{\hat{\Delta}x_{j}^{2}f^{(2)}(x_{j})}{2f^{\prime}(x_{j})}\right|\lesssim 2(\tau_{j}+v_{\varepsilon})+v_{\beta_{\ell}}+(\beta_{\ell}-2)\tau_{j}-(v_{\beta_{\ell}}+(\beta_{\ell}-1)\tau_{j})=2v_{\varepsilon}+\tau_{j}.

For the terms corresponding to higher values of kk, we get in the same way a valuation of τj+kvε<τj+2vε\tau_{j}+kv_{\varepsilon}<\tau_{j}+2v_{\varepsilon}. The reasoning for Δ~xj\tilde{\Delta}x_{j} is completely analogous. ∎

Theorem 3.3.

Under the assumptions of Lemma 3.2, there are choices of the parameters κi\kappa_{i}, θi\theta_{i} with log|κi|0\log|\kappa_{i}|\approx 0 such that log|x^jx~j|vε+τj\log|\hat{x}_{j}-\tilde{x}_{j}|\approx v_{\varepsilon}+\tau_{j}.

Proof.

By Lemma 3.4 we have x^j=xjΔ^f(xj)f(xj)+O(ε2exp(τj))\hat{x}_{j}=x_{j}-\frac{\hat{\Delta}f(x_{j})}{f^{\prime}(x_{j})}+O(\varepsilon^{2}\exp(\tau_{j})) and x~j=xjΔ~f(xj)f(xj)+O(ε2exp(τj))\tilde{x}_{j}=x_{j}-\frac{\tilde{\Delta}f(x_{j})}{f^{\prime}(x_{j})}+O(\varepsilon^{2}\exp(\tau_{j})). Hence it suffices to show that the valuation of

|x^jx~j|\displaystyle|\hat{x}_{j}-\tilde{x}_{j}| |1f(xj)||i=0d1ciδi(riκie1θi)xji|\displaystyle\approx\left|\frac{1}{f^{\prime}(x_{j})}\right|\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}(r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}})x_{j}^{i}\right|

is bounded by vε+τjv_{\varepsilon}+\tau_{j}. We have

|1f(xj)||i=0d1ciδi(riκie1θi)xji|\displaystyle\left|\frac{1}{f^{\prime}(x_{j})}\right|\left|\sum_{i=0}^{d-1}c_{i}\delta_{i}(r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}})x_{j}^{i}\right| |1f(xj)|i=0d1|ci||δi||riκie1θi||xji|\displaystyle\leq\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{i=0}^{d-1}|c_{i}||\delta_{i}||r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}}||x_{j}^{i}|
ε|1f(xj)|i=0d1|ci||riκie1θi||xji|.\displaystyle\leq\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{i=0}^{d-1}|c_{i}||r_{i}-\kappa_{i}e^{\sqrt{-1}\theta_{i}}||x_{j}^{i}|.

We now specify the parameters κi\kappa_{i}, θi\theta_{i}. For ri=O(1)r_{i}=O(1), we choose κi\kappa_{i}, θi\theta_{i} such that ri=κie1θir_{i}=\kappa_{i}e^{\sqrt{-1}\theta_{i}}. Note that log|κi|0\log|\kappa_{i}|\approx 0. For the other ii, we set κi=θi=0\kappa_{i}=\theta_{i}=0. We get

|x^jx~j|ε|1f(xj)|ri1|ci||ri||xji|.|\hat{x}_{j}-\tilde{x}_{j}|\leq\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{r_{i}\gg 1}|c_{i}||r_{i}||x_{j}^{i}|.

Since by assumption log|xj|τj\log|x_{j}|\approx\tau_{j}, the dominant terms in the sum are those with β1<i<β\beta_{\ell-1}<i<\beta_{\ell}, where τβ1+1==τβ=τj\tau_{\beta_{\ell-1}+1}=\cdots=\tau_{\beta_{\ell}}=\tau_{j}. Therefore

|x^jx~j|Kε|1f(xj)|β1<i<β|ci||ri||xji||\hat{x}_{j}-\tilde{x}_{j}|\leq K\varepsilon\left|\frac{1}{f^{\prime}(x_{j})}\right|\sum_{\beta_{\ell-1}<i<\beta_{\ell}}|c_{i}||r_{i}||x_{j}^{i}|

with log|K|0\log|K|\approx 0. The valuation of one of the terms in the sum is

log|K|+vεlog|f(xj)|+vi+vβ+(βi)τjvi+iτj\log|K|+v_{\varepsilon}-\log|f^{\prime}(x_{j})|+v_{i}+v_{\beta_{\ell}}+(\beta_{\ell}-i)\tau_{j}-v_{i}+i\tau_{j}

which is equal to log|K|log|f(xj)|+vε+vβ+βτj\log|K|-\log|f^{\prime}(x_{j})|+v_{\varepsilon}+v_{\beta_{\ell}}+\beta_{\ell}\tau_{j}. Note that this is independent of ii, and hence we get

log|x^jx~j|log|f(xj)|+vε+vβ+βτj.\log|\hat{x}_{j}-\tilde{x}_{j}|\approx-\log|f^{\prime}(x_{j})|+v_{\varepsilon}+v_{\beta_{\ell}}+\beta_{\ell}\tau_{j}.

Using Lemma 3.2 we get

\log\left|\frac{\hat{x}_{j}-\tilde{x}_{j}}{\tilde{x}_{j}}\right|\approx v_{\varepsilon}. ∎

4 Computational Experiments

In Subsection 3.2 we proved that, under some assumptions, a tropical backward error of order \varepsilon implies a mixed element-wise backward error of the same magnitude. Unfortunately, we were not able to prove this result in general. However, based on several numerical experiments that we performed, we are convinced that a small TBE implies a small EMBE in general as well. To support this conjecture, the following numerical experiment was performed.
Numerical Experiment 1 Take 1000 polynomials of degree d whose coefficients have modulus 10^{e}, with e chosen uniformly at random in [-k,k], and argument chosen uniformly at random in [0,2\pi). These polynomials do not always satisfy the assumptions of Theorem 3.2. The zeros are approximated with an eigenvalue method from the Julia package Polynomials, resulting in the computed zeros \hat{X}, for which the TBE is computed. To compute an upper bound for the EMBE, each root \hat{x}_{j} is refined separately using Newton’s method in extended precision applied to the original polynomial. The correspondence between the TBE and the EMBE, shown in Figure 5, clearly indicates that a small TBE implies a small EMBE.

Refer to caption
Figure 5: Results of Numerical Experiment 1.
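A scaled-down Python sketch of the setup of Experiment 1 (the paper used 1000 polynomials and the Julia package Polynomials; here a single degree-10 sample and numpy.roots stand in, and a scaled residual serves as a cheap quality proxy instead of the TBE/EMBE computation):

```python
import numpy as np
rng = np.random.default_rng(0)

# coefficient model: modulus 10^e with e uniform in [-k, k], uniform argument
d, k = 10, 2
c = 10.0**rng.uniform(-k, k, d + 1) * np.exp(2j * np.pi * rng.uniform(0, 1, d + 1))
roots = np.roots(c[::-1])               # np.roots expects c_d first

# scaled residual |f(x_hat)| / max_i |c_i x_hat^i| for each computed root
res = []
for x in roots:
    powers = x ** np.arange(d + 1)
    res.append(abs(np.dot(c, powers)) / np.max(np.abs(c * powers)))
print(max(res))
```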

In several of our statements, we assumed that the tropical roots of ff are of the same order of magnitude as the corresponding classical roots. To check if this is a reasonable assumption in practice, we performed the following numerical experiment.

Refer to caption
Figure 6: Results of Numerical Experiment 2.

Numerical Experiment 2 Ten thousand polynomials of degree d are taken whose coefficients have modulus 10^{e}, with e chosen uniformly at random in [-k,k], and argument chosen uniformly at random in [0,2\pi). For each of these polynomials the tropical roots are compared to the roots computed in high precision. Figure 6 gives a histogram of the measured ratios \exp(\tau_{i})/|x_{i}|. The results show that for the vast majority of roots the magnitudes differ by at most 10 percent.
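A small-scale Python sketch of Experiment 2 (hypothetical parameters d=8, k=2, and one sample instead of ten thousand; the tropical roots are obtained from the upper convex hull of the points (i,\log|c_{i}|), and numpy.roots stands in for the high-precision root computation):

```python
import math
import numpy as np
rng = np.random.default_rng(2)

def tropical_roots(c):
    """Tropical roots from the upper convex hull of (i, log|c_i|):
    a hull edge from (i1, v1) to (i2, v2) yields the tropical root
    (v1 - v2) / (i2 - i1) with multiplicity i2 - i1."""
    pts = [(i, math.log(abs(ci))) for i, ci in enumerate(c)]
    hull = []
    for p in pts:
        # pop hull[-1] while it lies on or below the segment hull[-2] -> p
        while len(hull) >= 2 and (hull[-1][0] - hull[-2][0]) * (p[1] - hull[-2][1]) \
                >= (p[0] - hull[-2][0]) * (hull[-1][1] - hull[-2][1]):
            hull.pop()
        hull.append(p)
    taus = []
    for (i1, v1), (i2, v2) in zip(hull, hull[1:]):
        taus.extend([(v1 - v2) / (i2 - i1)] * (i2 - i1))
    return taus

# coefficient model of Experiment 2: modulus 10^e with e uniform in [-k, k]
d, k = 8, 2
c = 10.0**rng.uniform(-k, k, d + 1) * np.exp(2j * np.pi * rng.uniform(0, 1, d + 1))
trop = sorted(math.exp(t) for t in tropical_roots(c))
clas = sorted(abs(r) for r in np.roots(c[::-1]))
ratios = [t / x for t, x in zip(trop, clas)]   # exp(tau_i) / |x_i|
print(ratios)
```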

5 Conclusion

We have shown the relations (1) between different measures for the backward error of an approximate set of roots \hat{X} of a polynomial. Under some assumptions, the tropical backward error measure of [TVB20], which is easy to compute, is shown to be equivalent to the element-wise mixed backward error measure of [MVD15], introduced there for d=2. We have given numerical evidence that the equivalence holds more generally.

References

  • [AMVW15] Jared L Aurentz, Thomas Mach, Raf Vandebril, and David S Watkins. Fast and backward stable computation of roots of polynomials. SIAM Journal on Matrix Analysis and Applications, 36(3):942–973, 2015.
  • [BNS13] Dario A Bini, Vanni Noferini, and Meisam Sharify. Locating the eigenvalues of matrix polynomials. SIAM Journal on Matrix Analysis and Applications, 34(4):1708–1727, 2013.
  • [GS09] Stéphane Gaubert and Meisam Sharify. Tropical scaling of polynomial matrices. In Positive systems, volume 389 of Lecture Notes in Control and Information Sciences, pages 291–303. Springer, 2009.
  • [Hig02] Nicholas J Higham. Accuracy and stability of numerical algorithms. SIAM, 2002.
  • [MS15] Diane Maclagan and Bernd Sturmfels. Introduction to tropical geometry, volume 161 of Graduate Studies in Mathematics. American Mathematical Society, 2015.
  • [MVD15] Nicola Mastronardi and Paul Van Dooren. Revisiting the stability of computing the roots of a quadratic polynomial. Electronic Transactions on Numerical Analysis, 44:73–82, 2015.
  • [NST15] Vanni Noferini, Meisam Sharify, and Françoise Tisseur. Tropical roots as approximations to eigenvalues of matrix polynomials. SIAM Journal on Matrix Analysis and Applications, 36(1):138–157, 2015.
  • [Sha11] Meisam Sharify. Scaling Algorithms and Tropical Methods in Numerical Matrix Analysis: Application to the Optimal Assignment Problem and to the Accurate Computation of Eigenvalues. PhD thesis, 2011.
  • [TVB20] Françoise Tisseur and Marc Van Barel. Min-max elementwise backward error for roots of polynomials and a corresponding backward stable root finder. arXiv:2001.05281, 2020.