
Affine Iterations and Wrapping Effect: Various Approaches

Nathalie Revol, INRIA - LIP, ENS de Lyon, France. ORCID: https://orcid.org/0000-0002-2503-2274. Nathalie.Revol@inria.fr
Abstract

Affine iterations of the form $x_{n+1} = Ax_n + b$ converge, using real arithmetic, if the spectral radius of the matrix $A$ is less than 1. However, substituting interval arithmetic for real arithmetic may lead to divergence of these iterations, in particular if the spectral radius of the absolute value of $A$ is greater than 1. We will review different approaches to limit the overestimation of the iterates, when the components of the initial vector $x_0$ and $b$ are intervals. We will compare, both theoretically and experimentally, the widths of the iterates computed by these different methods: the naive iteration, methods based on the QR- and SVD-factorizations of $A$, and Lohner's QR-factorization method. The method based on the SVD-factorization is computationally less demanding and gives good results when the matrix is poorly scaled; it is superseded either by the naive iteration or by Lohner's method otherwise.

keywords:
interval analysis, affine iterations, matrix powers, Lohner’s QR algorithm, QR factorization, SVD factorization

1 Introduction

The problem we consider is the evaluation of the successive iterates of

\left\{\begin{array}{l} x_{n+1} = Ax_n + b,\\ x_0 \mbox{ given}, \end{array}\right.

where $A \in \mathbb{R}^{d\times d}$, $x_n \in \mathbb{R}^d$ for every $n \in \mathbb{N}$, and $b \in \mathbb{R}^d$. More specifically, the focus is on the use of interval arithmetic to evaluate these iterates.

In what follows, interval quantities will be denoted in boldface.

1.1 A Toy Example

This problem was brought to us through the following example of an IIR (Infinite Impulse Response) linear filter in state-space form:

x_n = 1.8 \, x_{n-1} - 0.9 \, x_{n-2} + 4.7 \cdot 10^{-2} \, (u_{n-2} + u_{n-1} + u_n)

for $x_0 = 0$ and $x_1 \in [1, 1.1]$. We assume that $u_n \in \mathbf{u} = [9.95, 10.05]$ for every $n$.

This iteration can also be written as a linear recurrence in $\mathbb{R}^2$:

\begin{pmatrix} x_{n-1}\\ x_n \end{pmatrix} = A \begin{pmatrix} x_{n-2}\\ x_{n-1} \end{pmatrix} + b_n, \quad \mbox{where } A = \begin{pmatrix} 0 & 1\\ -0.9 & 1.8 \end{pmatrix} \mbox{ and } b_n = \begin{pmatrix} 0\\ 4.7 \cdot 10^{-2} \, (u_{n-2} + u_{n-1} + u_n) \end{pmatrix}.

This toy example will be used to illustrate the various approaches mentioned in this paper. The first iterates, obtained using floating-point arithmetic, with random values for $x_1 \in [1, 1.1]$ and each $u_n \in [9.95, 10.05]$, are given in the two left columns of the table below.

 n    x_n        n    x_n        |   n    wid(x_n)     n    wid(x_n)
 0    0          20   9.1518     |   0    0            20   3.2293e+05
 1    1.0617     30   17.0186    |   1    0.1000       30   8.8808e+08
 2    3.3183     40   12.4414    |   2    0.1941       40   2.4423e+12
 3    6.4234     50   15.0305    |   3    0.4535       50   6.7164e+15
 4    9.9851     60   13.6130    |   4    1.0051       60   1.8470e+19
 5    13.6031    70   14.3858    |   5    2.2313       70   5.0794e+22
 6    16.9117    80   13.9680    |   6    4.9350       80   1.3969e+26
 7    19.6103    90   14.1510    |   7    10.905       90   3.8415e+29
 8    21.4884    100  14.0949    |   8    24.085       100  1.0564e+33
 9    22.4394    200  14.0870    |   9    53.182       200  2.6137e+67
 10   22.4595    300  14.1443    |   10   117.42       300  6.4663e+101
 12   20.1508    400  14.1282    |   12   572.31       400  1.5998e+136
 15   13.8931    500  14.0828    |   15   6158.0       500  3.9580e+170

The system stabilizes around 14, with variations due to the random values taken by the $u_n$. However, the following snippet of Octave code computes the successive iterates using interval arithmetic, with the interval $[1, 1.1]$ for $\mathbf{x}_1$ and $\mathbf{u} = [9.95, 10.05]$ for the $u_n$; that is, we replace $u_{n-2}+u_{n-1}+u_n$ by $3\mathbf{u}$.

pkg load interval   % Heimlich's interval package for GNU Octave
A = [0 1; -0.9 1.8];
xn = [infsup(0,0); infsup(1,1.1)];                      % (x_0, x_1)
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];   % affine term with 3u
n = 500; for i = 1:n, i, xn = A*xn + b, wid(xn(1)), end

The two right columns of the table above give the widths of the successive iterates $\mathbf{x}_n$: the widths of the iterates diverge rapidly to infinity.

The explanation of this phenomenon is the following. The spectral radius of $A$ is strictly less than 1, $\rho(A) \simeq 0.9487 < 1$, and thus the exact (and, for that matter, floating-point) iterations converge. However, the recurrence satisfied by the widths of the iterates is $\mathrm{wid}(\mathbf{x}_n) = 1.8 \, \mathrm{wid}(\mathbf{x}_{n-1}) + 0.9 \, \mathrm{wid}(\mathbf{x}_{n-2}) + 4.7 \cdot 10^{-2} \cdot 3 \cdot \mathrm{wid}(\mathbf{u})$, which corresponds to the 2-dimensional iteration $w_n = |A| \, w_{n-1} + w_b$, with $w_n = \mathrm{wid}(\mathbf{x}_n)$, $|A|$ the matrix whose coefficients are the absolute values of the coefficients of $A$, and $w_b = 4.7 \cdot 10^{-2} \cdot 3 \cdot \mathrm{wid}(\mathbf{u})$. As the spectral radius of $|A|$ is larger than 1, indeed $\rho(|A|) \simeq 2.208 > 1$, the iterations diverge.
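
These spectral radii and the width recurrence are easy to check numerically. The following Octave sketch (plain floating point, no interval package needed) reproduces the widths given in the table above:

A = [0 1; -0.9 1.8];
rho_A = max(abs(eig(A)))          % ~0.9487 < 1: the exact iteration converges
rho_absA = max(abs(eig(abs(A))))  % ~2.208  > 1: the widths diverge
wb = [0; 4.7e-2 * 3 * 0.1];       % w_b, with wid(u) = 0.1
w = [0; 0.1];                     % (wid(x_0), wid(x_1))
for n = 2:10
  w = abs(A) * w + wb;            % w_n = |A| w_{n-1} + w_b
end
w(2)                              % wid(x_10) ~ 117.42, as in the table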

This phenomenon is a special case of the so-called wrapping effect. Its ubiquity in interval computations has been highlighted by Lohner in [6].

1.2 The Wrapping Effect

The wrapping effect is ubiquitous, as defined and developed in [6]. It can be described as the overestimation due to the enclosure of the sought set in a set of a given, simple structure. In our case, this simple structure corresponds to multidimensional intervals or boxes, that is, parallelepipeds with sides parallel to the axes of the coordinate system. When the computation is iterative, and when each iteration produces such an overestimating set that is used as the starting point of the next iteration, the size of the computed set may grow exponentially with the number of iterations, even when the exact solution set remains bounded and small.

Lohner also showed that the affine iteration we study in this paper, namely $x_{n+1} = Ax_n + b$, or more generally $x_{n+1} = A_nx_n + b_n$ with $x_{n+1}$, $x_n$ and $b_n$ vectors in $\mathbb{R}^d$ and $A_n \in \mathbb{R}^{d\times d}$ for every $n \in \mathbb{N}$, is archetypal. It occurs in many algorithms; the examples cited in [6] include

  • matrix-vector iterations such as the ones studied in this paper;

  • discrete dynamical systems: $\mathbf{x}_{n+1} = f(\mathbf{x}_n)$, $\mathbf{x}_0$ given and $f$ sufficiently smooth;

  • continuous dynamical systems (ODEs): $x'(t) = g(t, x(t))$, $x(0) = x_0$, which is studied through a numerical one-step (or multistep) method of the kind $\mathbf{x}_{n+1} = \mathbf{x}_n + h\Phi(\mathbf{x}_n, t_n) + \mathbf{z}_{n+1}$;

  • difference equations: $a_0\mathbf{z}_n + a_1\mathbf{z}_{n+1} + \ldots + a_m\mathbf{z}_{n+m} = b_n$ with $\mathbf{z}_1, \ldots, \mathbf{z}_m$ given;

  • linear systems with (banded) triangular matrix;

  • automatic differentiation.

In this paper, we concentrate on examples similar to the toy example presented above: for every initial value $x_0 \in \mathbb{R}^d$, the sequence of iterates $(x_n)_{n\in\mathbb{N}}$ converges to a finite value $x^* \in \mathbb{R}^d$, since $\rho(A) < 1$; however, the computations performed using interval arithmetic diverge, because their behaviour is dictated by $\rho(|A|)$, which is larger than 1. We are interested in the iterates computed using interval arithmetic: it is established that these iterates increase in width; however, different approaches can be applied to counteract the exponential growth of the width of the iterates. Several of them, some new as in Sections 2.3.2 and 2.3.3, and some already well established as in Section 2.4, will be tried and compared, in terms of the widths of the results and of the computational time.

2 Theoretical Results

2.1 Problem and Notations

Let $A$ be a $d \times d$ matrix in $\mathbb{R}^{d\times d}$, $\mathbf{x}_0 \in \mathrm{I}\mathbb{R}^d$ be an interval vector (boldface font is used for interval quantities and $\mathrm{I}\mathbb{R}$ stands for the set of real intervals), $x_0$ a vector in $\mathbb{R}^d$ with $x_0 \in \mathbf{x}_0$, $\mathbf{b} \in \mathrm{I}\mathbb{R}^d$ an interval vector, and $b$ a vector in $\mathbb{R}^d$ with $b \in \mathbf{b}$. In what follows, $n$ denotes the number of iterations.

It is assumed that $\rho(A) < 1$ and $\rho(|A|) > 1$.

A first goal is to determine the set of all fixed-points of the iteration

\left\{\begin{array}{l} x_0 \in \mathbf{x}_0, \quad b \in \mathbf{b},\\ x_{n+1} = Ax_n + b \end{array}\right.

for every $x_0 \in \mathbf{x}_0$ and every $b \in \mathbf{b}$.

It is known that xnx_{n} can be written as

x_n = A^n x_0 + \sum_{i=0}^{n-1} A^i b,

thus

\{x_n : x_0 \in \mathbf{x}_0\} \subset A^n \mathbf{x}_0 + \left(\sum_{i=0}^{n-1} A^i\right) \mathbf{b}.
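
As an illustration, the following Octave sketch (assuming Heimlich's interval package) evaluates this direct enclosure for the toy example of Section 1.1. For simplicity, the matrix powers are computed here in plain floating point; a fully rigorous version would enclose them with outward rounding.

pkg load interval
A = [0 1; -0.9 1.8];
x0 = [infsup(0,0); infsup(1,1.1)];
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];
n = 50;
S = eye(2); Ai = eye(2);
for i = 1:n-1
  Ai = Ai * A;              % A^i
  S = S + Ai;               % S = sum_{i=0}^{n-1} A^i
end
xn = (Ai * A) * x0 + S * b  % A^n x0 + S b stays bounded, unlike the naive loop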

However, when the vectors $x_0$ and $b$ are replaced in the iterative formula by their interval enclosures $\mathbf{x}_0$ and $\mathbf{b}$, one obtains the new interval vector $\mathbf{x}_{n+1}$, which is computed as:

\left\{\begin{array}{l} \mathbf{x}_0 \mbox{ and } \mathbf{b} \mbox{ given},\\ \mathbf{x}_{n+1} = A\mathbf{x}_n + \mathbf{b}. \end{array}\right.

Another goal is to determine a tight enclosure for each iterate of this diverging sequence of interval vectors.

As mentioned above, the increase in the widths of the iterates can be attributed to the use of parallelepipeds with sides parallel to the axes of the coordinate system, and not to the geometry of the transformation. To cure this problem, changes of coordinates will be applied, using an invertible matrix $B$, with $x = By \Leftrightarrow y = B^{-1}x$ and its interval counterpart $\mathbf{x} = B\mathbf{y}$. This yields the iteration

\left\{\begin{array}{lcl} x_{n+1} &=& By_{n+1}\\ y_{n+1} &=& B^{-1}ABy_n + B^{-1}b \end{array}\right.

and its interval counterpart

\left\{\begin{array}{lcl} \mathbf{x}_{n+1} &=& B\mathbf{y}_{n+1}\\ \mathbf{y}_{n+1} &=& B^{-1}AB\mathbf{y}_n + B^{-1}\mathbf{b}. \end{array}\right.

In what follows, to establish bounds and their proofs, we assume $A$ diagonalizable (this will not necessarily be the case for the experiments): $A$ can be diagonalized as $A = P^{-1}\Lambda P$, where $\Lambda$ is a diagonal matrix with the eigenvalues $\lambda_1, \ldots, \lambda_d$ of $A$ on the diagonal, and the columns of $P^{-1}$ are the corresponding eigenvectors.

The iteration considered in this paper corresponds to $x_n = A^n x_0 + \sum_{i=0}^{n-1} A^i b$. The numerical instability of computing the matrix power $A^n$ and applying it to a vector is well known: $A^n x_0$ tends to be aligned with the eigenvector of $A$ associated with the largest (in modulus) eigenvalue, and the information corresponding to the contribution of the other eigenvectors is lost. To avoid this well-known problem of the power method, we will consider orthogonal changes of coordinates. The choice of the orthogonal matrices is related to $A$, the matrix of the iteration.

We will first consider the QR-factorization of $A$: $A = QR$ with $Q \in \mathbb{R}^{d\times d}$ orthogonal, that is, $QQ' = Q'Q = I$ the identity matrix, and $R \in \mathbb{R}^{d\times d}$ upper triangular.

The other factorization used in this paper is the SVD-factorization of $A$: $A = U\Sigma V'$ with $U$, $V$ and $\Sigma$ in $\mathbb{R}^{d\times d}$, where $U$ and $V$ are orthogonal and $\Sigma$ is a diagonal matrix with the singular values $\sigma_1, \ldots, \sigma_d$ of $A$ on the diagonal. We also assume that $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_d \geq 0$. This idea has been sketched, but not completely developed, by Beaumont in [2].

2.2 Known Results

Mayer and his co-authors have extensively studied the existence of a fixed-point for the iteration studied in this paper. In [7], Mayer and Warnke thoroughly established formulas for the fixed-point in the case $\rho(|A|) < 1$: this fixed-point is independent of the starting interval $\mathbf{x}_0$. In [1], Arndt and Mayer established a necessary and sufficient condition on $A$ for a fixed-point to exist when $\rho(|A|) = 1$. In this case, the fixed-point is an interval of nonzero width, that is, a non-degenerate interval. It is well known that the widths of the iterates diverge when $\rho(|A|) > 1$, and thus that no fixed-point exists in this case. Our goal is to study the speed of divergence of the iterates in this case.

2.3 Different Approaches along with Theoretical Bounds

The main idea is to use an orthogonal change of coordinates which is related to the matrix of the iteration. As the matrix $A$ is kept constant for all iterations (and this is not the case in the more general approach of Lohner, see Section 2.4), the change of coordinates is also kept constant and given by an orthogonal matrix $B$. The two orthogonal matrices considered in what follows are either $B = Q$ from the QR-factorization of $A$, or $B = U$, resp. $B = V$, from the SVD-factorization of $A$.

2.3.1 Orthogonal Change of Coordinates

Before diving into the specificities of these changes of coordinates, let us study the general change of coordinates using an orthogonal matrix $B$, that is, $B^{-1} = B'$, with $x = By \Leftrightarrow y = B^{-1}x$ and its interval counterpart $\mathbf{x} = B\mathbf{y}$. The interval iteration is

\left\{\begin{array}{lcl} \mathbf{x}_{n+1} &=& B\mathbf{y}_{n+1}\\ \mathbf{y}_{n+1} &=& B^{-1}AB\mathbf{y}_n + B^{-1}\mathbf{b}. \end{array}\right.
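
To fix ideas, here is a minimal Octave sketch of this generic scheme (assuming Heimlich's interval package), instantiated on the toy example with $B = Q$ from $A = QR$ as a placeholder choice of orthogonal matrix. The products $B'AB$ and $B'\mathbf{b}$ are computed in plain floating point; a rigorous implementation would enclose them with outward rounding.

pkg load interval
A = [0 1; -0.9 1.8];
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];
x = [infsup(0,0); infsup(1,1.1)];
[B, R] = qr(A);     % any orthogonal B; here B = Q from A = QR
M = B' * A * B;     % iteration matrix in the new coordinates
c = B' * b;         % transformed affine term
y = B' * x;         % transformed initial box
for n = 1:50
  y = M * y + c;    % interval iteration in the B-coordinates
end
x = B * y           % back to the original coordinates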

Thus the iteration satisfied by the widths of the $\mathbf{y}_n$ is

\begin{array}{lcl} \mathrm{wid}(\mathbf{y}_{n+1}) &=& |B^{-1}AB| \, \mathrm{wid}(\mathbf{y}_n) + |B^{-1}| \, \mathrm{wid}(\mathbf{b})\\ &\leq& |B^{-1}|.|A|.|B| \, \mathrm{wid}(\mathbf{y}_n) + |B^{-1}| \, \mathrm{wid}(\mathbf{b}) \end{array}

where the inequalities are to be understood componentwise. By induction on $n$,

\mathrm{wid}(\mathbf{y}_n) \leq (|B^{-1}|.|A|.|B|)^n \, \mathrm{wid}(\mathbf{y}_0) + \sum_{i=0}^{n-1} (|B^{-1}|.|A|.|B|)^i.|B^{-1}| \, \mathrm{wid}(\mathbf{b}).

Taking norms, one gets

\begin{array}{lcl} \|\mathrm{wid}(\mathbf{y}_n)\| &\leq& \left(\| |B^{-1}| \|.\| |A| \|.\| |B| \|\right)^n \|\mathrm{wid}(\mathbf{y}_0)\| + \sum_{i=0}^{n-1} \left(\| |B^{-1}| \|.\| |A| \|.\| |B| \|\right)^i.\| |B^{-1}| \|.\|\mathrm{wid}(\mathbf{b})\|\\ &\leq& \left(\| |B^{-1}| \|.\| |A| \|.\| |B| \|\right)^n \|\mathrm{wid}(\mathbf{y}_0)\| + \frac{\left(\| |B^{-1}| \|.\| |A| \|.\| |B| \|\right)^n - 1}{\| |B^{-1}| \|.\| |A| \|.\| |B| \| - 1} \, \| |B^{-1}| \|.\|\mathrm{wid}(\mathbf{b})\|. \end{array}

Remark: if the considered norm is the matrix norm induced by the Euclidean vector norm, then $\| |B| \|_2 = \|B\|_2$ for any matrix $B$. Similarly, $\| |B| \|_\infty = \|B\|_\infty \leq \sqrt{d}$ for any $d \times d$ orthogonal matrix $B$. In such cases, the bound becomes

\|\mathrm{wid}(\mathbf{y}_n)\| \leq \left(\kappa(B).\| |A| \|\right)^n \|\mathrm{wid}(\mathbf{y}_0)\| + \frac{\left(\kappa(B).\| |A| \|\right)^n - 1}{\kappa(B).\| |A| \| - 1} \, \|B^{-1}\|.\|\mathrm{wid}(\mathbf{b})\|,

where $\kappa(B)$ denotes $\|B\|.\|B^{-1}\|$, the condition number of $B$ for the problem of solving a linear system.

Since $\|B\|_2 = \|B^{-1}\|_2 = \kappa_2(B) = 1$ for an orthogonal matrix $B$, this bound simplifies even further with the Euclidean norm:

\|\mathrm{wid}(\mathbf{y}_n)\| \leq \|A\|^n \, \|\mathrm{wid}(\mathbf{y}_0)\| + \frac{\|A\|^n - 1}{\|A\| - 1} \, \|\mathrm{wid}(\mathbf{b})\|.
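
For instance, for the toy matrix of Section 1.1, a direct computation gives $\|A\|_2 = \sigma_1 \simeq 2.21 > 1$, so this bound still grows like $2.21^n$, even though $\rho(A) \simeq 0.9487 < 1$.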

In other words, theoretically there is no difference in the bounds on the widths of the iterates, whether an orthogonal change of coordinates takes place or not.

In what follows, we assume again $A$ diagonalizable (this will not necessarily be the case for the experiments), with $A = P^{-1}\Lambda P$ where $\Lambda$ is diagonal. If we replace $A$ by $P^{-1}\Lambda P$ in the iteration, we get the mathematically equivalent formulation

\begin{array}{lcl} \mathbf{y}_{n+1} &=& B^{-1}P^{-1}\Lambda PB \, \mathbf{y}_n + B^{-1}\mathbf{b}\\ &=& (PB)^{-1}\Lambda(PB) \, \mathbf{y}_n + B^{-1}\mathbf{b}, \end{array}

thus

\mathrm{wid}(\mathbf{y}_{n+1}) \leq (|(PB)^{-1}|.|\Lambda|.|PB|) \, \mathrm{wid}(\mathbf{y}_n) + |B^{-1}| \, \mathrm{wid}(\mathbf{b}),

and by induction

\mathrm{wid}(\mathbf{y}_n) \leq (|(PB)^{-1}|.|\Lambda|.|PB|)^n \, \mathrm{wid}(\mathbf{y}_0) + \sum_{i=0}^{n-1} (|(PB)^{-1}|.|\Lambda|.|PB|)^i.|B^{-1}| \, \mathrm{wid}(\mathbf{b}).

Taking the Euclidean norm of vectors and the induced matrix norm, one gets

\|\mathrm{wid}(\mathbf{y}_n)\|_2 \leq \left(\kappa_2(PB)\|\Lambda\|_2\right)^n \|\mathrm{wid}(\mathbf{y}_0)\|_2 + \frac{\left(\kappa_2(PB)\|\Lambda\|_2\right)^n - 1}{\kappa_2(PB)\|\Lambda\|_2 - 1} \, \|\mathrm{wid}(\mathbf{b})\|_2.

Let us note that $\kappa_2(PB) = \kappa_2(P)$. Furthermore, as $\Lambda$ is diagonal, $\|\Lambda\|_2$ is the largest eigenvalue (in modulus) of $A$, that is, $\|\Lambda\|_2 = \rho(A) < 1$. This implies

\|\mathrm{wid}(\mathbf{y}_n)\|_2 \leq \left(\kappa_2(P)\rho(A)\right)^n \|\mathrm{wid}(\mathbf{y}_0)\|_2 + \frac{\left(\kappa_2(P)\rho(A)\right)^n - 1}{\kappa_2(P)\rho(A) - 1} \, \|\mathrm{wid}(\mathbf{b})\|_2.

This inequality highlights the influence of the condition number of $P$, the matrix of eigenvectors. For instance, in the ideal case where the eigenvectors form an orthonormal basis, $\kappa_2(P) = 1$ and no overestimation occurs.

2.3.2 Use of the QR Factorization

When the orthogonal change of coordinates involves $Q$ from the QR-factorization of $A$, the algorithm can be written as

\begin{array}{rcl} A &=& QR,\\ x_{n+1} &=& Qy_{n+1} \;\Leftrightarrow\; y_{n+1} = Q'x_{n+1},\\ y_{n+1} &=& Q'AQ \, y_n + Q'b. \end{array}

In exact arithmetic, one should get

y_{n+1} = RQ \, y_n + Q'b.

The interval counterpart is

\begin{array}{lcl} \mathbf{x}_{n+1} &=& Q\mathbf{y}_{n+1}\\ \mathbf{y}_{n+1} &=& Q'AQ \, \mathbf{y}_n + Q'\mathbf{b}. \end{array}

2.3.3 Use of the SVD Factorization

Our second and third proposals consist in using respectively $U$ and $V$ from the SVD-factorization of $A$: from $A = U\Sigma V'$, we use either $B = U$ or $B = V$.

With $B = U$, this yields

\begin{array}{rcl} x_{n+1} &=& Uy_{n+1} \;\Leftrightarrow\; y_{n+1} = U'x_{n+1},\\ y_{n+1} &=& U'AU \, y_n + U'b. \end{array}

In exact arithmetic, this corresponds to $y_{n+1} = \Sigma V'U \, y_n + U'b$. The interval counterpart is

\begin{array}{lcl} \mathbf{x}_{n+1} &=& U\mathbf{y}_{n+1}\\ \mathbf{y}_{n+1} &=& U'AU \, \mathbf{y}_n + U'\mathbf{b}. \end{array}

With $B = V$, this yields

\begin{array}{rcl} x_{n+1} &=& Vy_{n+1} \;\Leftrightarrow\; y_{n+1} = V'x_{n+1},\\ y_{n+1} &=& V'AV \, y_n + V'b. \end{array}

In exact arithmetic, this corresponds to $y_{n+1} = V'U\Sigma \, y_n + V'b$. The interval counterpart is

\begin{array}{lcl} \mathbf{x}_{n+1} &=& V\mathbf{y}_{n+1}\\ \mathbf{y}_{n+1} &=& V'AV \, \mathbf{y}_n + V'\mathbf{b}. \end{array}
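
A minimal Octave sketch of these two SVD-based variants for the toy example (assuming Heimlich's interval package; as before, the change-of-basis products are computed in plain floating point, so this is illustrative rather than fully rigorous):

pkg load interval
A = [0 1; -0.9 1.8];
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];
[U, S, V] = svd(A);
yu = U' * [infsup(0,0); infsup(1,1.1)];   % B = U variant
yv = V' * [infsup(0,0); infsup(1,1.1)];   % B = V variant
for n = 1:50
  yu = (U' * A * U) * yu + U' * b;
  yv = (V' * A * V) * yv + V' * b;
end
xu = U * yu, xv = V * yv                   % back to the original coordinates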

Remark: $V'U$ is also an orthogonal matrix, as a product of orthogonal matrices.

2.4 Lohner’s QR Method

A well-known approach is given by Lohner, e.g. in [6], and studied in detail by Nedialkov and Jackson in [8]. It is usually presented for the iteration $x_{n+1} = A_nx_n + b_n$, that is, when the matrix and the affine term vary at each iteration.

Lohner’s QR method consists in performing the following iteration:

\left\{\begin{array}{lcl} \multicolumn{3}{l}{y_0 = x_0, \quad Q_0 = I, \quad [Q_1, R_1] = \mathrm{qr}(A) \mbox{ that is, } A = Q_1R_1}\\ {}[Q_{n+1}, R_{n+1}] &=& \mathrm{qr}(R_nQ_n)\\ y_{n+1} &=& Q_{n+1}'AQ_n \, y_n + Q_{n+1}'b\\ x_{n+1} &=& Q_{n+1}y_{n+1} \end{array}\right.

and its interval counterpart is

\left\{\begin{array}{lcl} \multicolumn{3}{l}{\mathbf{y}_0 = \mathbf{x}_0, \quad Q_0 = I, \quad [Q_1, R_1] = \mathrm{qr}(A) \mbox{ that is, } A = Q_1R_1}\\ {}[Q_{n+1}, R_{n+1}] &=& \mathrm{qr}(R_nQ_n)\\ \mathbf{y}_{n+1} &=& Q_{n+1}'AQ_n \, \mathbf{y}_n + Q_{n+1}'\mathbf{b}\\ \mathbf{x}_{n+1} &=& Q_{n+1}\mathbf{y}_{n+1}. \end{array}\right.
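
A minimal Octave sketch of this iteration for the toy example (assuming Heimlich's interval package; the orthogonal factors and the products involving them are computed in plain floating point, without the outward rounding a rigorous implementation would require):

pkg load interval
A = [0 1; -0.9 1.8];
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];
yn = [infsup(0,0); infsup(1,1.1)];   % y_0 = x_0
Qn = eye(2);                         % Q_0 = I
[Qnext, Rn] = qr(A);                 % A = Q_1 R_1
for n = 1:50
  yn = (Qnext' * A * Qn) * yn + Qnext' * b;
  xn = Qnext * yn;                   % x_n in the original coordinates
  Qn = Qnext;
  [Qnext, Rn] = qr(Rn * Qn);         % [Q_{n+1}, R_{n+1}] = qr(R_n Q_n)
end
xn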

In the case of a matrix $A$ that is constant throughout the iterations, one can recognize Francis' and Kublanovskaya's QR-algorithm. Using the convergence of $(R_n)$ towards the matrix of eigenvalues of $A$ (or towards its Schur form), Nedialkov and Jackson established in [8] the following bound:

w(\mathbf{x}_n) \leq \mathrm{cond}(P)\rho(A)^n \, w(\mathbf{x}_0) + \frac{\mathrm{cond}(P)\rho(A)^{n-1} - 1}{\mathrm{cond}(P)\rho(A) - 1} \, w(\mathbf{b}) + \mathbf{b}

where we recall that $A$ is diagonalizable: $A = P^{-1}\Lambda P$.

2.5 Comparison

Two aspects of each method are compared: its computational complexity and its accuracy, that is, the bounds on the widths of its iterates.

Let us first examine the computational complexity. Let us recall that the QR-factorization, resp. SVD-factorization, of a $d \times d$ matrix has a computational complexity of $\mathcal{O}(d^3)$. In the algorithms of Sections 2.3.2 and 2.3.3, the factorization of a matrix is performed only once, and not at every iteration: for $n$ iterations, these algorithms thus have complexity $\mathcal{O}(d^3 + nd^2)$, the cost of the factorization being negligible when the number $n$ of iterations is large. In comparison, Lohner's QR method has complexity $\mathcal{O}(nd^3)$, which is significantly larger when $d$ is large.

Let us now compare the accuracy of these different methods, from a theoretical point of view. The bounds we get on the width of the iterate $\mathbf{x}_n$ are larger than the bounds obtained by Nedialkov and Jackson: the condition number of the matrix $P$ appears raised to the $n$-th power in the formulas for the QR- and SVD-algorithms, whereas it appears without this $n$-th power in the bound for Lohner's QR-algorithm. As a condition number is always larger than or equal to 1, this means that the bound for Lohner's QR-algorithm is tighter than the bounds for the QR- and SVD-algorithms.

3 Experiments

3.1 Experimental Setup

After the results on the widths of the iterates of the toy example given in Section 1.1, Section 3.2 presents the computation of each corner of the initial box, to illustrate that it is possible to get tight enclosures on such a small example.

All algorithms presented in this paper, namely the naive (or brute-force) application of the iteration, the QR-algorithm of Section 2.3.2, the two versions of the SVD-algorithm of Section 2.3.3, and Lohner's QR algorithm given in Section 2.4, have been implemented in Octave using Heimlich's interval package [3], then in Matlab using Rump's Intlab package [10]. Two other methods have been implemented and compared. The first technique [9] consists in determining $k$ such that $\rho(|A^k|) < 1$, then computing only one iterate every $k$ steps; in other words, it computes

x_{k(n+1)} = A^k x_{kn} + \sum_{i=0}^{k-1} A^i b.

This iteration converges even when interval arithmetic is employed; a sketch is given below.
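
A minimal Octave sketch of this $k$-step technique for the toy example (assuming Heimlich's interval package; the power $A^k$ and the accumulated affine term are computed in plain floating point):

pkg load interval
A = [0 1; -0.9 1.8];
b = 4.7e-2 * 3.0 * [infsup(0,0); infsup(9.95,10.05)];
k = 1; Ak = A;
while max(abs(eig(abs(Ak)))) >= 1    % smallest k with rho(|A^k|) < 1
  k = k + 1; Ak = Ak * A;
end
c = b; Ai = eye(2);
for i = 1:k-1                        % c = sum_{i=0}^{k-1} A^i b
  Ai = Ai * A; c = c + Ai * b;
end
xk = [infsup(0,0); infsup(1,1.1)];
for j = 1:floor(500/k)
  xk = Ak * xk + c;                  % one interval step simulates k original steps
end
xk                                   % remains bounded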

The other technique is the use of affine arithmetic, as advocated by Rump in a private communication. In the experimental results presented below, each technique is associated with a color:

algorithm            color
brute force          black
QR                   cyan
SVD U                red
SVD V                magenta
Lohner's QR          dark blue
every k-th iterate   green
affine arithmetic    yellow

The implementations use only the basic QR and SVD factorizations available in Matlab, and neither the pivoted QR recommended by Lohner in [6] nor the more elaborate versions presented by Higham in [4].

Sections 3.3 and 3.4 contain the evolution of the radii of the iterates computed by these different techniques, for two matrix dimensions: $10\times 10$ and $100\times 100$. The $y$-axis for the radii uses a logarithmic scale. For both dimensions, four kinds of matrices $A$ have been used for the experiments. On the one hand, matrices which are well-conditioned (with a condition number of order $10^2$) and ill-conditioned (with a condition number of order $10^{10}$) have been generated. On the other hand, the scaling of the matrices varies: matrices which are well-scaled and matrices which are ill-scaled, with the orders of magnitude of their coefficients varying between 1 and $10^{10}$. These are only orders of magnitude, as the matrices, originally generated by a call to Matlab's randsvd, were then added to a multiple of the identity matrix and multiplied by a constant, in order to satisfy both $\rho(A) < 1$ and $\rho(|A|) > 1$; a sketch of this generation process is given below. It can also be noted that degrading the scaling of a matrix also degrades its condition number; in other words, a "well-conditioned ill-scaled" matrix has a much worse condition number than a "well-conditioned well-scaled" matrix, even if the required condition numbers, in the call to randsvd, are initially the same.
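
The following Matlab/Octave sketch illustrates this generation process; the shift and scaling constants are placeholders, not the values used in the actual experiments.

d = 10;
A = gallery('randsvd', d, 1e2);      % random matrix with condition number ~1e2
A = A + 0.5 * eye(d);                % add a multiple of the identity
A = A / (1.05 * max(abs(eig(A))));   % rescale so that rho(A) < 1
max(abs(eig(A)))                     % ~0.95 < 1
max(abs(eig(abs(A))))                % should be > 1 for a relevant test case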

All experiments have been performed on a 2.7 GHz Quad-Core Intel Core i7 with 16GB RAM. Timings are averaged over 100 executions, except for affine arithmetic where at most 10 executions were performed.

3.2 Toy Example

First, the toy example presented in Section 1.1 is considered. As the iteration is affine, one can compute separately the images of the endpoints of the initial vector, to get the endpoints of the successive iterates. That is, we compute separately

x_n = 1.8 \, x_{n-1} - 0.9 \, x_{n-2} + 4.7 \cdot 10^{-2} \cdot 3 \cdot \mathbf{u}

for $x_0 = 0$ and $x_1 = 1$, and for $x_0 = 0$ and $x_1 = 1.1$. However, we use the interval $\mathbf{u} = [9.95, 10.05]$ in the iteration. The convex hulls of the first 10 iterates are represented on the left part of Figure 1. It is obvious that the widths of the successive iterates grow rapidly.

Then we compute separately

x_n = 1.8 \, x_{n-1} - 0.9 \, x_{n-2} + 4.7 \cdot 10^{-2} \cdot 3 \cdot u

for $x_0 = 0$ and $x_1 = 1$, and for $x_0 = 0$ and $x_1 = 1.1$, and for $u = 9.95$ and $u = 10.05$. The convex hulls of the first 10 iterates are represented on the right part of Figure 1. In this case, the widths of the successive iterates remain small, of the order of magnitude of 1% of the midpoint of the interval.

Figure 1: Left: the first 10 iterates of the toy example, where the endpoints of $\mathbf{x}_0$ are considered separately. Right: the first 10 iterates of the toy example, computed corner by corner.

In this toy example, the iterations had to be performed 4 times, that is, once for each corner of the initial values, in order to get a tight enclosure; a sketch of this corner-by-corner computation is given below. This clearly cannot be generalized to high dimensions, as the number of corners grows as $4^d$ with the dimension $d$ of the problem.
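
A minimal Octave sketch of this corner-by-corner computation (plain floating point; the iteration being affine, the extreme values are attained at the 4 corners of $(x_1, u)$):

N = 10; lo = inf(1, N+1); hi = -inf(1, N+1);
for x1 = [1, 1.1]
  for u = [9.95, 10.05]
    x = zeros(1, N+1); x(2) = x1;       % x(1) = x_0 = 0, x(2) = x_1
    for n = 3:N+1
      x(n) = 1.8*x(n-1) - 0.9*x(n-2) + 4.7e-2 * 3 * u;
    end
    lo = min(lo, x); hi = max(hi, x);   % running hull over the corners
  end
end
[lo; hi]                                % tight hull of x_0, ..., x_N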

3.3 Example of dimension 10

Figure 2 gives the radii (in logarithmic scale) of the successive iterates computed by the methods detailed above. When the number of iterations is large (visually, above 30 or 40 iterations), the iterates computed by all methods presented in Section 2 diverge rapidly, as can be seen on the plots on the left. When one concentrates on the first iterations, the behaviours differ. One can also note that degrading the scaling of the matrix $A$ speeds up the divergence, for all methods. On the contrary, the $k$-step method and the use of affine arithmetic preserve the convergence of the iterates.

Figure 2: The case of a $10\times 10$ matrix (right part: zoom of the left part): a) well-conditioned and well-scaled, b) ill-conditioned well-scaled, c) well-conditioned ill-scaled, d) ill-conditioned ill-scaled.

The timings in seconds are given below:

method         well-cond.    ill-cond.     well-cond.   ill-cond.
               well-scaled   well-scaled   ill-scaled   ill-scaled
naive          0.0088        0.0084        0.0093       0.0086
k-th step      0.0044        0.0015        0.0043       0.0045
QR             0.0161        0.0155        0.0160       0.0157
SVD U          0.0162        0.0156        0.0161       0.0166
SVD V          0.0147        0.0145        0.0324       0.0332
Lohner's QR    0.0170        0.0164        0.0165       0.0177
affine arith.  8.1859        8.7911        8.6595       8.2754

The methods presented in Sections 2.3.2, 2.3.3 and 2.4 all exhibit similar execution times. The naive method performs fewer operations and is thus faster. The $k$-th step method is fast as well; the variations in its execution time are due to the preprocessing, that is, to the determination of the power $k$ such that $\rho(|A^k|) < 1$: the execution time is larger when $k$ is larger. With this method, the convergence is good. The use of affine arithmetic significantly slows down the computations; however, the iterates converge.

3.4 Example of dimension 100

Figure 3 gives the radii (in logarithmic scale) of the successive iterates computed by the methods detailed in Section 3.1. When the number of iterations is large (visually, above 40 or 50 iterations), the iterates computed by all methods, except the $k$-step method, diverge rapidly, as can be seen on the plots on the left. Again, when one concentrates on the first iterations, the behaviours differ.

Figure 3: The case of a $100\times 100$ matrix (right part: zoom of the left part): a) well-conditioned and well-scaled, b) ill-conditioned well-scaled, c) well-conditioned ill-scaled, d) ill-conditioned ill-scaled.

The timings in seconds are given below:

method         well-cond.    ill-cond.     well-cond.   ill-cond.
               well-scaled   well-scaled   ill-scaled   ill-scaled
naive          0.0163        0.0145        0.0142       0.0142
k-th step      0.0046        0.0071        0.0048       0.0072
QR             0.1577        0.0363        0.0408       0.0409
SVD U          0.0427        0.0420        0.0437       0.0437
SVD V          0.0423        0.0390        0.0643       0.0638
Lohner's QR    0.0611        0.0758        0.0646       0.0804
affine arith.  70.8724       72.6441       76.6102      71.2518

The comments on the timings apply again, with the exception of the use of affine arithmetic, which is still much slower and no longer manages to preserve the convergence for very long.

3.5 Comments

One can note that the $k$-step method, that is, the method that resorts to a convergent interval iteration, performs very well at a moderate computational cost. Even the preprocessing time needed to determine the value of $k$ is negligible.

This method is a totally ad hoc approach for this problem and cannot be generalized. However, in the framework of filters and control theory, it has a physical meaning: the divergence of the iterations can be attributed to a sampling time which is too small to allow variations to be observed. Multiplying the sampling time by kk means sampling less frequently (by a factor kk) and thus being able to measure the evolution of the observed quantities.

The use of affine arithmetic, on the contrary, is a very general method and it exhibits a very good accuracy, even if it eventually diverges (see the experiments with the $100\times 100$ matrices in Section 3.4). The counterpart is the execution time, which is at least a thousand times larger than for the other methods. This is not an issue for the experiments presented here, as the time remains on the order of a minute.

The methods based on the QR or SVD factorizations of the matrix $A$ were developed with geometric principles in mind. For the QR-algorithm, the idea was to align the current box with the directions that are preserved by multiplication by $A$, with a tradeoff between aligning the box along the eigenvectors and preserving an orthonormal system of coordinates, hence the choice of $Q$. For the SVD-algorithm, the idea was to align the box along the direction which gets the maximal elongation, that is, along the singular vector corresponding to the largest singular value.

In both cases, the benefit of these geometric transformations is offset by the overestimation entailed by the extra computations: there is either no clear benefit, for the QR-based approach, or a delicate balance, for the SVD-based approach. The SVD-algorithm is interesting when the matrix is ill-scaled, particularly for the first iterations.

The methods of choice remain either the naive approach, when the matrix $A$ is well-conditioned and well-scaled, or Lohner's QR method when the matrix is ill-conditioned. Surprisingly, the overhead of Lohner's QR method, in terms of computational time, is not as large as the formula for its complexity suggests.

Our general recommendation is thus:

  • to preprocess the matrix $A$ in order to scale it;

  • then to execute in parallel the naive approach and Lohner's QR approach, in order to converge reasonably well for any condition number of $A$.

Affine arithmetic is a solution of choice when other solutions fail and when analysis and development time is a scarce resource.

4 Conclusion and Future Work

This study, both theoretical and experimental, has compared several approaches to counteract the wrapping effect in the computation of affine iterations. Geometric considerations have led to the proposed algorithms. The benefit of these approaches is not always clear, as a better configuration is obtained through extra computations and thus extra overestimation. To deepen this geometric approach, we will aim at simplifying the resulting formulas, at getting formulas that are closer to the mathematically equivalent, but simpler, ones given after each proposed transformation. The main difficulty is to perform products such as $Q\,Q'$ or $U'U$ without replacing them by the identity, but in a certified and tight way. As the SVD-based approach seems more promising, our future work will concentrate on the use of a certified SVD factorization, as proposed by van der Hoeven and Yakoubsohn in [11]. We also plan to consider an interval version of the matrix, using the results in [5] to keep guarantees on the singular quantities involved in the computations.

References

  • [1] Arndt, H.-R., and Mayer, G. On the semi-convergence of interval matrices. Linear Algebra and its Applications, 393:15-35, 2004. doi:10.1016/j.laa.2003.10.015.
  • [2] Beaumont, O. Solving Interval Linear Systems with Oblique Boxes. Technical Report 1315, Irisa, 2000. ftp://ftp.irisa.fr/techreports/2000/PI-1313.ps.gz.
  • [3] Heimlich, O. Interval arithmetic in GNU Octave. In SWIM 2016: Summer Workshop on Interval Methods, France, 2016. https://swim2016.sciencesconf.org/data/SWIM2016_book_of_abstracts.pdf#page=27.
  • [4] Higham, N.J. QR factorization with complete pivoting and accurate computation of the SVD. Linear Algebra and its Applications, 309:153-174, 2000. doi:10.1016/S0024-3795(99)00230-X.
  • [5] Hladík, M., Daney, D., and Tsigaridas, E. Bounds on Real Eigenvalues and Singular Values of Interval Matrices. SIAM J. Matrix Analysis and Applications, 31(4):2116-2129, 2010. doi:10.1137/090753991.
  • [6] Lohner, R. On the Ubiquity of the Wrapping Effect in the Computation of Error Bounds. In Kulisch, Lohner, Facius, editors, Perspectives on Enclosure Methods, pages 201-217. Springer, 2001.
  • [7] Mayer, G. and Warnke, I. On the fixed points of the interval function $[f]([x])=[A][x]+[b]$. Linear Algebra and its Applications, 363:202-216, 2003. doi:10.1016/S0024-3795(02)00254-9.
  • [8] Nedialkov, N. and Jackson, K. A New Perspective on the Wrapping Effect in Interval Methods for Initial Value Problems for Ordinary Differential Equations. In Kulisch, Lohner, Facius, editors, Perspectives on Enclosure Methods, pages 219-264. Springer, 2001.
  • [9] Revol, N. Convergent linear recurrences (with scalar coefficients) with divergent interval simulations. In SCAN: 11th GAMM-IMACS Int. Symp. on Scientific Computing, Computer Arithmetic, and Validated Numerics (Japan), 2004.
  • [10] Rump, S.M. INTLAB - INTerval LABoratory. In Csendes, Tibor, editor, Developments in Reliable Computing, pages 77-104. Kluwer Academic Publishers, Dordrecht, 1999. doi:10.1007/978-94-017-1247-7_7.
  • [11] van der Hoeven, J. and Yakoubsohn, J.-C. Certified Singular Value Decomposition. Technical report, 2018. https://hal.archives-ouvertes.fr/hal-01941987.