
M. A. Botchev

Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences, Miusskaya Sq., 4, Moscow, 125047, Russia

On convergence of waveform relaxation for nonlinear systems of ordinary differential equations

To the memory of Willem Hundsdorfer
(April 4, 2024)
Abstract

To integrate large systems of nonlinear differential equations in time, we consider a variant of nonlinear waveform relaxation (also known as dynamic iteration or Picard–Lindelöf iteration), where at each iteration a linear inhomogeneous system of differential equations has to be solved. This is done by the exponential block Krylov subspace (EBK) method. Thus, we have an inner-outer iterative method, where iterative approximations are determined over a certain time interval, with no time stepping involved. This approach has recently been shown to be efficient as a time-parallel integrator within the PARAEXP framework. In this paper, the convergence behavior of this method is assessed theoretically and practically. We examine the efficiency of the method by testing it on nonlinear Burgers, three-dimensional Liouville–Bratu–Gelfand, and three-dimensional nonlinear heat conduction equations and comparing its performance with that of conventional time-stepping integrators.

keywords:
waveform relaxation, Krylov subspaces, exponential time integrators, time-parallel methods, Burgers equation, Liouville–Bratu–Gelfand equation

1 Introduction

Large systems of time-dependent ordinary differential equations arise in various applications and in many cases have to be integrated in time by implicit methods, see e.g. [1, 2]. In recent decades the niche of implicit methods has gradually been taken up by exponential time integrators [3]. For implicit and exponential methods the key issue is how to solve the arising nonlinear systems and (or) evaluate the related matrix functions efficiently. To achieve efficiency, different approaches exist and are widely applied, such as inexact Newton methods combined with powerful preconditioned linear solvers [4, 5, 6], splitting and Rosenbrock methods [7, 8, 9, 10, 1], and approximate iterative implicit schemes (which can be seen as stabilized explicit schemes) [11, 12, 13, 14, 15, 16, 17, 18, 19].

Another important approach to achieving efficiency in implicit and exponential time integrators is based on waveform relaxation methods [20, 21, 22], where the iterative approximations are time-dependent functions rather than time step values of a numerical solution. These methods have also been known as dynamic iteration or Picard–Lindelöf iteration [23]. They have been developed since the 1980s [21, 24, 25, 26, 27] and have received renewed attention recently in connection with the time-parallel exponential method PARAEXP [28] and its extension to nonlinear problems [29, 30].

Typically, matrix function evaluations within exponential integrators are carried out by special iterative linear algebra procedures (often Krylov subspace or Chebyshev polynomial methods) [31, 32, 33, 14, 34, 35, 36]. The key attractive feature of waveform relaxation methods is that they employ this linear algebra machinery across a certain time interval rather than within a single time step, so that computational costs are distributed across time. This leads to higher computational efficiency as well as to parallelism across time [22].

One promising time-integration exponential method of this class is the nonlinear EBK (exponential block Krylov) method [29, 30]. It is based on block Krylov subspace projections [37] combined with waveform relaxation iteration employed to handle nonlinearities. The nonlinear EBK method possesses an across-time parallelism, and, in its parallel implementation, can be seen as an efficient version of the time-parallel PARAEXP method [29, 30].

Convergence of waveform relaxation methods has been studied in different settings, in particular, for linear initial-value problems, for nonlinear Gauss–Seidel and Jacobi iterations (the workhorses of classical waveform relaxation) and for time-discretized settings, see [23, 22] for a survey. Convergence results for waveform relaxation in general nonlinear settings are scarce, and it is often assumed that “studying the linear case carefully … might be what users really need” [23]. Apart from the book [22], which contains some general introductory convergence results, one of the few papers where nonlinear waveform relaxation is studied is [38]. Below, in Remark 2, we comment more specifically on the results presented there. The aim of this paper is to extend convergence results for waveform relaxation to the nonlinear setting employed in the EBK method. This also yields insight into its convergence behavior in practice.

The paper is organized as follows. In Section 2.1 the problem setting and the assumptions made are formulated. The main convergence results are presented in Section 2.2. Since these results are formulated for the error function, which is in general unknown, in Section 2.3 an error estimate in terms of a computable nonlinear residual is given. At each waveform relaxation iteration a linear initial-value problem (IVP) has to be solved which, in the current setting, is done by an iterative block Krylov subspace method (the linear EBK method). Therefore, in Section 2.4 we show how the nonlinear iteration convergence is affected if the linear IVP is solved inexactly. In Section 2.5 implementation issues are briefly discussed. Numerical experiments are described in Section 3.1 (for a 1D Burgers equation), Section 3.2 (for a 3D Liouville–Bratu–Gelfand equation), and Section 3.3 (for a 3D nonlinear heat conduction problem). We finish the paper with conclusions drawn in Section 4.

2 Nonlinear waveform relaxation and its convergence

2.1 Problem setting, assumptions and notation

We consider an IVP for a nonlinear ODE system

y^{\prime}(t)=\Phi(t,y(t)),\quad y(0)=v,\quad t\in[0,T] (1)

where \Phi:\mathbb{R}\times\mathbb{R}^{N}\rightarrow\mathbb{R}^{N}, v\in\mathbb{R}^{N} and T>0 are given. Without loss of generality, we assume that the time dependence in the right-hand side function \Phi is of the form

\Phi(t,y)=\bar{\Phi}(y)+g(t),\quad\bar{\Phi}:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N},\quad g:\mathbb{R}\rightarrow\mathbb{R}^{N}. (2)

Otherwise, we can transform (1) to an equivalent autonomous form (usually this is done by extending the unknown vector function y(t) with an additional (N+1)-th coordinate y_{N+1}(t)=t and adding the ODE y_{N+1}^{\prime}(t)=1 to the system).
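The autonomization trick mentioned above is easy to sketch. The following minimal Python snippet (function and variable names are illustrative, not from the EBK code) wraps a non-autonomous right-hand side into an autonomous one by appending the coordinate y_{N+1}(t)=t:

```python
import numpy as np

def autonomize(phi):
    """Wrap y' = phi(t, y), y in R^N, into an autonomous system in
    R^(N+1) by appending the coordinate y_{N+1}(t) = t, whose ODE is
    y_{N+1}' = 1 (the standard construction described above)."""
    def phi_aut(z):
        y, t = z[:-1], z[-1]
        return np.append(phi(t, y), 1.0)  # last component: dt/dt = 1
    return phi_aut

# example right-hand side: y' = -y + sin(t)  (scalar, N = 1)
phi = lambda t, y: -y + np.sin(t)
phi_aut = autonomize(phi)
z = np.array([2.0, 0.5])  # augmented state (y, t)
dz = phi_aut(z)           # equals (-2 + sin(0.5), 1)
```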

Iterative methods considered in this paper are based on a family of splittings

\Phi(t,y)=-A_{k}y+f_{k}(y)+g(t),\qquad\forall\;y\in\mathbb{R}^{N},\;t\geqslant 0, (3)

with k being the iteration number, so that the matrices A_{k}\in\mathbb{R}^{N\times N} and mappings f_{k}:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N} may vary from iteration to iteration. The mappings f_{k} in (3) are all assumed to be Lipschitz continuous with a common Lipschitz constant L, i.e.,

\forall k:\quad\|f_{k}(u)-f_{k}(v)\|\leqslant L\|u-v\|,\qquad\forall\;u,v\in\mathbb{R}^{N}. (4)

Here and throughout the paper \|\cdot\| denotes a vector norm in \mathbb{C}^{N} or the corresponding induced matrix norm. Furthermore, we assume that for all the matrices A_{k} in (3) there exist constants C>0 and \omega\geqslant 0 such that

\|\exp(-tA_{k})\|\leqslant Ce^{-\omega t},\qquad t\geqslant 0. (5)

For a particular matrix A=A_{k}, condition (5) holds, for instance, in the 2-norm if the numerical range of A lies in the complex halfplane \{z=x+iy\;|\;x\geqslant 0,\;y\in\mathbb{R},\;i^{2}=-1\}. In this case C=1 and \omega is the smallest eigenvalue of the symmetric part \frac{1}{2}(A+A^{T}) of A. Condition (5) is closely related to the concept of the matrix logarithmic norm, see, e.g., [39, 1]. In particular, if \mu(-A) is the logarithmic norm of -A then condition (5) holds, with C=1 and for all t\geqslant 0, if and only if [1, Theorem I.2.4]

\mu(-A)\leqslant-\omega.
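As a quick numerical illustration of (5), consider the symmetric case, where C=1, \omega equals the smallest eigenvalue of A, and the matrix exponential is available through an eigendecomposition. A minimal Python sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 0.1 * np.eye(5)      # symmetric positive definite
w, Q = np.linalg.eigh(A)
omega = w.min()                    # smallest eigenvalue of (A + A^T)/2 = A

def expm_neg(t):
    """exp(-t*A) via the eigendecomposition (valid since A is symmetric)."""
    return Q @ np.diag(np.exp(-t * w)) @ Q.T

for t in [0.1, 1.0, 5.0]:
    # ||exp(-t A)||_2 = e^{-t*omega}, so (5) holds with C = 1
    assert np.linalg.norm(expm_neg(t), 2) <= np.exp(-omega * t) + 1e-12
```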

In the analysis below, we also use the functions \varphi_{j} [3, relation (2.10)], defined for j=0,1,2,\dots as

\varphi_{j+1}(z)=\frac{\varphi_{j}(z)-\varphi_{j}(0)}{z},\qquad\varphi_{0}(z)=e^{z}, (6)

where we set \varphi_{j}(0)=1/j!, which makes the \varphi_{j} entire functions. In this paper, the following variation-of-constants formula (see, e.g., [1, Section I.2.3], [3, relation (1.5)]) is instrumental:

y(t)=\exp(-tA_{k})v+\int_{0}^{t}\exp(-(t-s)A_{k})\left[f_{k}(y(s))+g(s)\right]\mathrm{d}s,\quad t\in[0,T], (7)

where y(t) is the solution of IVP (1),(3).
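The recurrence (6) also gives a direct way to evaluate the \varphi_{j} numerically; near z=0 a truncated Taylor series \varphi_{j}(z)=\sum_{n\geqslant 0}z^{n}/(n+j)! avoids the cancellation in the recurrence (a standard safeguard, not specific to this paper). A small Python sketch:

```python
import math

def phi(j, z):
    """phi_j(z) from recurrence (6), with a Taylor-series branch near
    z = 0 to avoid cancellation."""
    if abs(z) < 1e-3:
        return sum(z**n / math.factorial(n + j) for n in range(8))
    if j == 0:
        return math.exp(z)
    return (phi(j - 1, z) - 1.0 / math.factorial(j - 1)) / z

# checks against the closed forms phi_1(z) = (e^z - 1)/z,
# phi_2(z) = (e^z - 1 - z)/z^2, and phi_j(0) = 1/j!
assert abs(phi(1, 1.0) - (math.e - 1.0)) < 1e-12
assert abs(phi(2, -2.0) - (math.exp(-2.0) - 1.0 + 2.0) / 4.0) < 1e-12
assert abs(phi(3, 0.0) - 1.0 / 6.0) < 1e-14
```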

2.2 Nonlinear waveform relaxation

Let y_{0}(t) be an approximation to the unknown function y(t) for t\in[0,T], with y_{0}(0)=v. Usually y_{0}(t) is taken to be y_{0}(t)\equiv v for all t\in[0,T]. To solve (1), in this paper we consider a nonlinear waveform relaxation iteration where we successively solve, for k=0,1,2,\dots, the linear inhomogeneous IVP

y_{k+1}^{\prime}(t)=-A_{k}y_{k+1}(t)+f_{k}(y_{k}(t))+g(t),\quad y_{k+1}(0)=v,\quad t\in[0,T]. (8)

Here the matrices A_{k}\in\mathbb{R}^{N\times N} and the mappings f_{k}:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N} form a splitting of \Phi, see (3), and satisfy, for all k, the assumptions (4),(5). We emphasize that at each iteration an approximation y_{k+1}(t) is computed and stored for the whole time range t\in[0,T]. In Section 2.5 below we briefly discuss how to do this efficiently.
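To illustrate iteration (8), the following self-contained Python sketch applies it to a scalar model problem y' = -ay + \sin y + \cos t, y(0)=v, with the fixed splitting A_{k}\equiv a and f_{k}\equiv\sin (so L=1 in (4) and C=1, \omega=a in (5)). The inner linear IVP is solved here by the trapezoidal rule, standing in for the linear EBK solver of the paper; all names and parameter values are illustrative:

```python
import numpy as np

a, v, T = 2.0, 1.0, 1.0
t = np.linspace(0.0, T, 2001)
h = t[1] - t[0]

def linear_ivp(b):
    """Trapezoidal-rule solve of y' = -a*y + b(t), y(0) = v, on the grid t
    (a stand-in for the linear EBK solver)."""
    y = np.empty_like(t)
    y[0] = v
    for i in range(len(t) - 1):
        y[i + 1] = ((1.0 - 0.5 * h * a) * y[i]
                    + 0.5 * h * (b[i] + b[i + 1])) / (1.0 + 0.5 * h * a)
    return y

yk = np.full_like(t, v)            # initial guess y_0(t) = v
diffs = []
for k in range(10):
    ynew = linear_ivp(np.sin(yk) + np.cos(t))   # iteration (8)
    diffs.append(np.max(np.abs(ynew - yk)))
    yk = ynew
# successive differences contract roughly by C*L*T*phi_1(-omega*T), cf. (9)
```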

The following proposition provides a sufficient condition for iteration (8) to converge.

Proposition 1.

Let IVP (1) be solved iteratively by (8) and let assumptions (4),(5) hold for A_{k} and f_{k} in (8), k=0,1,\dots. Then the error \epsilon_{k+1}(t)\equiv y(t)-y_{k+1}(t) of the iterative approximation y_{k+1}(t) satisfies, for k=0,1,\dots,

\|\epsilon_{k+1}(t)\|\leqslant CLt\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|\epsilon_{k}(s)\|,\quad\forall t\in[0,T], (9)

and nonlinear waveform relaxation (8) converges for t\in[0,T] to the solution y(t) of IVP (1),(3) provided that

CLT\varphi_{1}(-\omega T)<1. (10)
Proof.

Subtracting the iteration formula (8) from the ODE y^{\prime}(t)=-A_{k}y(t)+f_{k}(y(t))+g(t) and taking into account that y(0)=y_{k+1}(0)=v, we see that the error \epsilon_{k+1}(t) satisfies the IVP

\epsilon_{k+1}^{\prime}(t)=-A_{k}\epsilon_{k+1}(t)+f_{k}(y(t))-f_{k}(y_{k}(t)),\quad\epsilon_{k+1}(0)=0. (11)

Then, applying the variation-of-constants formula (7) to IVP (11), we obtain

\epsilon_{k+1}(t)=\int_{0}^{t}\exp(-(t-s)A_{k})\left[f_{k}(y(s))-f_{k}(y_{k}(s))\right]\mathrm{d}s.

Therefore, using (4) and (5), we can bound

\|\epsilon_{k+1}(t)\| \leqslant\int_{0}^{t}\|\exp(-(t-s)A_{k})\|\,\|f_{k}(y(s))-f_{k}(y_{k}(s))\|\mathrm{d}s
\leqslant\int_{0}^{t}\|\exp(-(t-s)A_{k})\|\,L\|y(s)-y_{k}(s)\|\mathrm{d}s
\leqslant L\max_{s\in[0,t]}\|\epsilon_{k}(s)\|\int_{0}^{t}\|\exp(-(t-s)A_{k})\|\,\mathrm{d}s
\leqslant CL\max_{s\in[0,t]}\|\epsilon_{k}(s)\|\int_{0}^{t}e^{-\omega(t-s)}\,\mathrm{d}s=CLt\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|\epsilon_{k}(s)\|.

Thus, (9) is proved. Taking into account that t\varphi_{1}(-\omega t) is a monotonically increasing function of t, we see that (10) guarantees contraction on the whole interval [0,T], which proves convergence. ∎

Remark 1.

Proposition 1 shows that we can choose the length T of the time interval [0,T] such that nonlinear waveform relaxation converges. The larger the Lipschitz constant L, the smaller T should be taken.
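Since T\varphi_{1}(-\omega T) is monotonically increasing in T, the largest interval length meeting a contraction-factor target q(T)=CLT\varphi_{1}(-\omega T)\leqslant\delta can be found by bisection. A small Python sketch (the function names are ours, not from the paper):

```python
import math

def q(T, C, L, omega):
    """Contraction factor C*L*T*phi_1(-omega*T) from (9),(10)."""
    x = omega * T
    phi1 = (1.0 - math.exp(-x)) / x if x > 1e-12 else 1.0
    return C * L * T * phi1

def max_interval(C, L, omega, delta, hi=1e6):
    """Largest T with q(T) <= delta, by bisection (q is monotone in T)."""
    if q(hi, C, L, omega) <= delta:
        return hi
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if q(mid, C, L, omega) <= delta else (lo, mid)
    return lo

# example: C = 1, L = 2, omega = 1 gives q(T) = 2(1 - e^{-T}),
# so q(T) = 0.9 exactly at T = -ln(0.55)
T_star = max_interval(1.0, 2.0, 1.0, delta=0.9)
```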

Remark 2.

To solve an initial-value problem for an autonomous ODE system

C(y)y^{\prime}(t)-\bar{F}(y(t))=0,

in [38] a nonlinear waveform iteration

y^{\prime}_{k+1}(t)-\bar{F}(y_{k+1}(t))=\widehat{g}(y^{\prime}_{k}(t)) (12)

is considered, where C(y) is a matrix and C(y)y^{\prime}=y^{\prime}-\widehat{g}(y^{\prime}). A convergence analysis of (12) is given under the assumption that \widehat{g} is Lipschitz continuous with a Lipschitz constant less than one. In [38], a particular case of (12) is also considered with \bar{F}(y)=-Ay+f(y), i.e.,

y^{\prime}_{k+1}(t)+Ay_{k+1}(t)-f(y_{k+1}(t))=\widehat{g}(y^{\prime}_{k}(t)).

Hence, our results here do not overlap with the results in [38].

In case f(y) is linear, waveform relaxation (8) is known to have the attractive property of converging superlinearly for finite T, see, e.g., [40, Theorem 5.1] and, for inexact waveform relaxation, [41, Theorem 2]. The following result shows that this superlinear convergence property is shared by nonlinear waveform relaxation (8).

Proposition 2.

Let IVP (1) be solved iteratively by (8) and let assumptions (4),(5) hold for A_{k} and f_{k} in (8), k=0,1,\dots. Then the error \epsilon_{k}(t) of the iterative approximation y_{k}(t) satisfies, for k=1,2,\dots,

\|\epsilon_{k}(t)\| \leqslant(CL)^{k}t^{k}e^{-\omega t}\varphi_{k}(\omega t)\max_{s\in[0,t]}\|\epsilon_{0}(s)\| (13)
\leqslant(CL)^{k}\frac{t^{k}}{k!}\max_{s\in[0,t]}\|\epsilon_{0}(s)\|,\quad\forall t\in[0,T].
Proof.

The proof is very similar to the proof of [40, Theorem 5.1], where a linear counterpart of (1),(8) is considered. The estimate (13) is proven by induction. Note that t\varphi_{1}(-\omega t)=te^{-\omega t}\varphi_{1}(\omega t). Therefore, the estimate (13) holds for k=1, as it coincides with (9) for k=0. Assume that (13) holds for a certain k. From the proof of Proposition 1, we see that

\|\epsilon_{k+1}(t)\| \leqslant CL\int_{0}^{t}e^{-(t-s)\omega}\|\epsilon_{k}(s)\|\mathrm{d}s
and, by the induction assumption,
\leqslant CL\int_{0}^{t}e^{-(t-s)\omega}(CL)^{k}s^{k}e^{-\omega s}\varphi_{k}(\omega s)\max_{\tilde{s}\in[0,s]}\|\epsilon_{0}(\tilde{s})\|\mathrm{d}s
=(CL)^{k+1}e^{-\omega t}\int_{0}^{t}s^{k}\varphi_{k}(\omega s)\max_{\tilde{s}\in[0,s]}\|\epsilon_{0}(\tilde{s})\|\mathrm{d}s
\leqslant(CL)^{k+1}e^{-\omega t}\max_{s\in[0,t]}\|\epsilon_{0}(s)\|\int_{0}^{t}s^{k}\varphi_{k}(\omega s)\mathrm{d}s
=(CL)^{k+1}t^{k+1}e^{-\omega t}\varphi_{k+1}(\omega t)\max_{s\in[0,t]}\|\epsilon_{0}(s)\|,

where the relation \int_{0}^{t}s^{k}\varphi_{k}(\omega s)\mathrm{d}s=t^{k+1}\varphi_{k+1}(\omega t) is employed. Thus, the induction step is done. Finally, using the definition of \varphi_{k} and the fact that \omega\geqslant 0, it is not difficult to check that

t^{k}e^{-\omega t}\varphi_{k}(\omega t)\leqslant\frac{t^{k}}{k!},\quad t\geqslant 0,

which proves the second inequality in (13). ∎

2.3 Residual of the nonlinear problem

To control convergence of waveform relaxation (8) in practice, it is natural to consider the residual r_{k}(t) of an iterative approximation y_{k}(t) in (8). Since y_{k}(t) is defined by (8),(3) written for k-1, it is natural to define the residual with respect to IVP (1) combined with splitting (3) for k-1, i.e., for the IVP

y^{\prime}(t)=-A_{k-1}y(t)+f_{k-1}(y(t))+g(t),\quad y(0)=v,\quad t\in[0,T]. (14)

Hence, we define

r_{k}(t)\equiv-A_{k-1}y_{k}(t)+f_{k-1}(y_{k}(t))+g(t)-y_{k}^{\prime}(t),\quad t\in[0,T], (15)

and write

r_{k}(t) =-A_{k-1}y_{k}(t)+f_{k-1}(y_{k}(t))+g(t)-y_{k}^{\prime}(t)\pm f_{k-1}(y_{k-1}(t)) (16)
=\underbrace{-A_{k-1}y_{k}(t)+f_{k-1}(y_{k-1}(t))+g(t)-y_{k}^{\prime}(t)}_{=0}+f_{k-1}(y_{k}(t))-f_{k-1}(y_{k-1}(t))
=f_{k-1}(y_{k}(t))-f_{k-1}(y_{k-1}(t)),\quad t\in[0,T],\quad k=1,2,\dots.

Note that the residual r_{k}(t) possesses the backward error property, i.e., the iterative approximation y_{k}(t) can be viewed as the exact solution of the perturbed IVP (cf. (14))

y_{k}^{\prime}(t)=-A_{k-1}y_{k}(t)+f_{k-1}(y_{k}(t))+g(t)-r_{k}(t),\quad y_{k}(0)=v,\quad t\in[0,T]. (17)

Subtracting the ODE of this problem from the ODE in (14), we obtain an IVP for the error \epsilon_{k}(t):

\epsilon_{k}^{\prime}(t)=-A_{k-1}\epsilon_{k}(t)+f_{k-1}(y(t))-f_{k-1}(y_{k}(t))+r_{k}(t),\quad\epsilon_{k}(0)=0, (18)

which is equivalent to IVP (11). The following proposition shows that the residual yields an upper bound for the error.

Proposition 3.

Let IVP (1) be solved iteratively by (8) and let assumptions (4),(5) hold for A_{k} and f_{k} in (8), k=0,1,\dots. Let T>0 be chosen such that the sufficient condition (10) for convergence of (8) holds, i.e.,

CLT\varphi_{1}(-\omega T)\leqslant\delta<1, (19)

for a certain constant \delta\in(0,1). Then, for all t\in[0,T],

\max_{s\in[0,t]}\|\epsilon_{k}(s)\|\leqslant\frac{Ct\varphi_{1}(-\omega t)}{1-CLt\varphi_{1}(-\omega t)}\max_{s\in[0,t]}\|r_{k}(s)\|\leqslant\frac{\delta}{(1-\delta)L}\max_{s\in[0,t]}\|r_{k}(s)\|. (20)

Note that T\varphi_{1}(-\omega T) increases with T monotonically.

Proof.

Employing the variation-of-constants formula for IVP (18), we can bound

\|\epsilon_{k}(t)\| \leqslant\int_{0}^{t}\left\|\exp(-(t-s)A_{k-1})\left[f_{k-1}(y(s))-f_{k-1}(y_{k}(s))+r_{k}(s)\right]\right\|\mathrm{d}s
\leqslant\max_{s\in[0,t]}\left\|f_{k-1}(y(s))-f_{k-1}(y_{k}(s))+r_{k}(s)\right\|\int_{0}^{t}\|\exp(-(t-s)A_{k-1})\|\,\mathrm{d}s
\leqslant\max_{s\in[0,t]}\left\|f_{k-1}(y(s))-f_{k-1}(y_{k}(s))+r_{k}(s)\right\|\,Ct\varphi_{1}(-\omega t)
\leqslant\left(L\max_{s\in[0,t]}\|\epsilon_{k}(s)\|+\max_{s\in[0,t]}\|r_{k}(s)\|\right)Ct\varphi_{1}(-\omega t)
\leqslant CLt\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|\epsilon_{k}(s)\|+Ct\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|r_{k}(s)\|.

Taking into account that CLt\varphi_{1}(-\omega t)\leqslant CLT\varphi_{1}(-\omega T)\leqslant\delta<1 for t\in[0,T], we obtain

(1-CLt\varphi_{1}(-\omega t))\max_{s\in[0,t]}\|\epsilon_{k}(s)\| \leqslant Ct\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|r_{k}(s)\|,
\max_{s\in[0,t]}\|\epsilon_{k}(s)\| \leqslant\frac{Ct\varphi_{1}(-\omega t)}{1-CLt\varphi_{1}(-\omega t)}\max_{s\in[0,t]}\|r_{k}(s)\|,

which yields (20). ∎
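In practice the bound (20) turns the computable residual into an error estimate. A minimal Python sketch (illustrative names; iterates stored as arrays of time samples) computes the residual norm via identity (16), i.e. without differentiating y_{k}, and the corresponding error bound:

```python
import numpy as np

def residual_norm(f, yk, ykm1):
    """max_t ||r_k(t)|| via (16): r_k = f(y_k) - f(y_{k-1}); the iterates
    are stored with one column per time sample."""
    return np.max(np.linalg.norm(f(yk) - f(ykm1), axis=0))

def error_bound(res_norm, delta, L):
    """Right-hand side of (20): delta / ((1 - delta) L) * max ||r_k||."""
    return delta / ((1.0 - delta) * L) * res_norm

# illustrative data: N = 3 unknowns at 5 time samples, f = tanh (L = 1)
yk = np.zeros((3, 5))
ykm1 = 0.1 * np.ones((3, 5))
rho = residual_norm(np.tanh, yk, ykm1)
bound = error_bound(rho, delta=0.5, L=1.0)
```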

As stated by the proposition below, an upper bound for the nonlinear residual can be obtained in terms of the error at the previous iteration.

Proposition 4.

Let IVP (1) be solved iteratively by (8) and let assumptions (4),(5) hold for A_{k} and f_{k} in (8), k=0,1,\dots. Then, for k=1,2,\dots,

\|r_{k}(t)\| \leqslant(1+CLt\varphi_{1}(-\omega t))L\max_{s\in[0,t]}\|\epsilon_{k-1}(s)\|,\quad\forall t\in[0,T]. (21)
If, in addition, convergence condition (19) is satisfied, we have
\|r_{k}(t)\| \leqslant(1+\delta)L\max_{s\in[0,t]}\|\epsilon_{k-1}(s)\|<2L\max_{s\in[0,t]}\|\epsilon_{k-1}(s)\|,\quad\forall t\in[0,T]. (22)
Proof.

Using relation (16), we can write, for t\in[0,T],

r_{k}(t) =f_{k-1}(y_{k}(t))-f_{k-1}(y(t))+f_{k-1}(y(t))-f_{k-1}(y_{k-1}(t)),
\|r_{k}(t)\| \leqslant\|f_{k-1}(y_{k}(t))-f_{k-1}(y(t))\|+\|f_{k-1}(y(t))-f_{k-1}(y_{k-1}(t))\|
\leqslant L\|y_{k}(t)-y(t)\|+L\|y(t)-y_{k-1}(t)\|=L(\|\epsilon_{k}(t)\|+\|\epsilon_{k-1}(t)\|)
\leqslant L\left(CLt\varphi_{1}(-\omega t)\max_{s\in[0,t]}\|\epsilon_{k-1}(s)\|+\|\epsilon_{k-1}(t)\|\right),

which leads to (21),(22). ∎

2.4 Linear inner iteration and its residual

In practice, the linear ODE system (8) at each waveform relaxation iteration can be solved by any suitable integrator inexactly, with a certain accuracy tolerance. In the context of the time-parallel PARAEXP scheme, nonlinear waveform relaxation appeared to be efficient for fluid dynamics problems [29, 30], with the linear IVP (8) solved by the exponential block Krylov subspace (EBK) method [37]. Thus, in this setting we have an inner-outer iterative process where each nonlinear waveform relaxation iteration (8) involves an inner Krylov subspace iterative process \tilde{y}_{k+1,\ell}(t)\rightarrow y_{k+1}(t), with \ell being the inner iteration number. For notational simplicity we omit the dependence on the inner iteration number \ell in \tilde{y}_{k+1,\ell}(t) and write \tilde{y}_{k+1}(t). These inner EBK iterations are controlled by checking the norm of a residual defined, for \tilde{y}_{k+1}(t)\approx y_{k+1}(t), as

r^{\mathsf{lin}}_{k+1}(t)\equiv-A_{k}\tilde{y}_{k+1}(t)+f_{k}(\tilde{y}_{k}(t))+g(t)-\tilde{y}^{\prime}_{k+1}(t),\quad t\in[0,T]. (23)

Here we assume that the previous iterative approximation is also computed inexactly and therefore we replace y_{k}(t) by \tilde{y}_{k}(t). The inner iterations stop after a certain number \ell of them as soon as the linear residual r^{\mathsf{lin}}_{k+1}(t), defined by (23), is small enough in norm. This residual turns out to be easily computable as a by-product of the block Krylov subspace method [37]. Moreover, the residual in (23) can be used as a controller for the error in the linear IVP (8), see [37, relation (16)].

As Proposition 5 below shows, convergence of the nonlinear waveform relaxation is retained if the IVP (8) at each nonlinear iteration is solved approximately, such that residual (23) is small enough in norm. To formulate the proposition, it is convenient to assume that the linear residual is bounded in norm by the difference of the last two nonlinear iterates, cf. (24).

Proposition 5.

Let IVP (1) be solved iteratively by (8) and let assumptions (4),(5) and convergence condition (19) hold. Furthermore, let the IVP (8) at each nonlinear iteration k=0,1,\dots be solved approximately with \ell inner iterations such that for the inner iteration residual (23) one of the following two conditions holds:

\mathrm{(a)}\quad \|r^{\mathsf{lin}}_{k+1}(t)\|\leqslant\eta\max_{s\in[0,t]}\|f_{k}(\tilde{y}_{k+1}(s))-f_{k}(\tilde{y}_{k}(s))\|,\quad t\in[0,T], (24)
\mathrm{(b)}\quad \|r^{\mathsf{lin}}_{k+1}(t)\|\leqslant\eta L\max_{s\in[0,t]}\|\tilde{y}_{k+1}(s)-\tilde{y}_{k}(s)\|,\quad t\in[0,T],

with a constant \eta\in(0,1). Then the error \tilde{\epsilon}_{k+1}(t)\equiv y(t)-\tilde{y}_{k+1}(t) of the iterative solution \tilde{y}_{k+1}(t) in the inexact inner-outer iteration (8),(23) satisfies, for k=0,1,\dots,

\|\tilde{\epsilon}_{k+1}(t)\|\leqslant\frac{(1+\eta)CLt\varphi_{1}(-\omega t)}{1-\eta CLt\varphi_{1}(-\omega t)}\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|,\quad\forall t\in[0,T]. (25)

Moreover, the inexact iteration (8),(23) converges for t\in[0,T] to the solution y(t) of IVP (1),(3) provided that

\eta<\frac{1-\delta}{2\delta}. (26)
Proof.

Note that, due to (4),

\|f_{k}(\tilde{y}_{k+1}(t))-f_{k}(\tilde{y}_{k}(t))\|\leqslant L\|\tilde{y}_{k+1}(t)-\tilde{y}_{k}(t)\|

and, hence, condition (24),(b) follows from condition (24),(a). Therefore, without loss of generality, we may assume that (24),(b) holds. It follows from (23) that \tilde{y}_{k+1}(t) solves the perturbed ODE system

\tilde{y}^{\prime}_{k+1}(t)=-A_{k}\tilde{y}_{k+1}(t)+f_{k}(\tilde{y}_{k}(t))+g(t)-r^{\mathsf{lin}}_{k+1}(t). (27)

Subtracting this equation from the original nonlinear ODE y^{\prime}(t)=-A_{k}y(t)+f_{k}(y(t))+g(t) and taking into account that \tilde{y}_{k+1}(0)=y(0)=v, we obtain an IVP for the error \tilde{\epsilon}_{k+1}(t)\equiv y(t)-\tilde{y}_{k+1}(t) (cf. (11)):

\tilde{\epsilon}^{\prime}_{k+1}(t)=-A_{k}\tilde{\epsilon}_{k+1}(t)+f_{k}(y(t))-f_{k}(\tilde{y}_{k}(t))+r^{\mathsf{lin}}_{k+1}(t),\quad\tilde{\epsilon}_{k+1}(0)=0.

Application of the variation-of-constants formula (7) leads to

\tilde{\epsilon}_{k+1}(t)=\int_{0}^{t}\exp(-(t-s)A_{k})\left[f_{k}(y(s))-f_{k}(\tilde{y}_{k}(s))+r^{\mathsf{lin}}_{k+1}(s)\right]\mathrm{d}s.

Then, using (4),(5),(24) we get, for t\in[0,T],

\|\tilde{\epsilon}_{k+1}(t)\| \leqslant\int_{0}^{t}\|\exp(-(t-s)A_{k})\|\,\left\|f_{k}(y(s))-f_{k}(\tilde{y}_{k}(s))+r^{\mathsf{lin}}_{k+1}(s)\right\|\mathrm{d}s
\leqslant Ct\varphi_{1}(-\omega t)\left[L\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|+\max_{s\in[0,t]}\|r^{\mathsf{lin}}_{k+1}(s)\|\right]
\leqslant Ct\varphi_{1}(-\omega t)\left[L\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|+\eta L\max_{s\in[0,t]}\|\tilde{y}_{k+1}(s)-\tilde{y}_{k}(s)\|\right].

Furthermore,

\max_{s\in[0,t]}\|\tilde{y}_{k+1}(s)-\tilde{y}_{k}(s)\pm y(s)\|\leqslant\max_{s\in[0,t]}\|\tilde{\epsilon}_{k+1}(s)\|+\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|,

so that, for t\in[0,T],

\|\tilde{\epsilon}_{k+1}(t)\|\leqslant CLt\varphi_{1}(-\omega t)\left[(1+\eta)\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|+\eta\max_{s\in[0,t]}\|\tilde{\epsilon}_{k+1}(s)\|\right],
(1-CLt\varphi_{1}(-\omega t)\eta)\max_{s\in[0,t]}\|\tilde{\epsilon}_{k+1}(s)\|\leqslant CLt\varphi_{1}(-\omega t)(1+\eta)\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|.

Taking into account that CLt\varphi_{1}(-\omega t)\leqslant CLT\varphi_{1}(-\omega T)\leqslant\delta<1 and 0<\eta<1, we then obtain

\max_{s\in[0,t]}\|\tilde{\epsilon}_{k+1}(s)\|\leqslant\frac{CLt\varphi_{1}(-\omega t)(1+\eta)}{1-CLt\varphi_{1}(-\omega t)\eta}\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|\leqslant\frac{\delta(1+\eta)}{1-\delta\eta}\max_{s\in[0,t]}\|\tilde{\epsilon}_{k}(s)\|.

This yields (25). Condition (26) is equivalent to \frac{\delta(1+\eta)}{1-\delta\eta}<1, which guarantees convergence. ∎

Remark 3.

Note that for \eta\rightarrow 0 the obtained convergence condition (26) transforms into (19). In general, condition (24),(b) is probably easier to check in practice than (24),(a), although IVPs where the quantity in (24),(a) is cheap to evaluate may occur as well. Since \tilde{y}_{k}(t) and \tilde{y}_{k+1}(t) are solutions of IVPs with the same initial vector v, we may expect that \max_{s\in[0,t]}\|\tilde{y}_{k+1}(s)-\tilde{y}_{k}(s)\|\approx\|\tilde{y}_{k+1}(t)-\tilde{y}_{k}(t)\|.

Nevertheless, employing condition (24) in practice does not seem to be necessary: as the numerical tests below demonstrate, simple stopping criteria (cf. (33),(40)) based on smallness of \|r^{\mathsf{lin}}_{k+1}(t)\| work well in practice, leading to an efficient inner-outer iterative method.
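For orientation, the admissible inexactness in (26) and the resulting contraction factor in (25) are easy to tabulate; a tiny Python sketch (function names are ours):

```python
def eta_max(delta):
    """Largest admissible inner tolerance factor, cf. (26)."""
    return (1.0 - delta) / (2.0 * delta)

def contraction(delta, eta):
    """Bound delta*(1 + eta)/(1 - delta*eta) on the factor in (25)."""
    return delta * (1.0 + eta) / (1.0 - delta * eta)

# example: delta = 1/2 allows eta up to 1/2; eta = 0.4 still contracts
assert eta_max(0.5) == 0.5
assert contraction(0.5, 0.4) < 1.0
```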

2.5 Implementation of nonlinear waveform relaxation

We take the initial guess function y_{0}(t) to be identically (i.e., for all t\in[0,T]) equal to the initial vector, y_{0}(t)\equiv v. Then, at each iteration (8) we set \tilde{g}(t):=f_{k}(y_{k}(t))+g(t) and solve the linear IVP

y^{\prime}_{k+1}(t)=-A_{k}y_{k+1}(t)+\tilde{g}(t),\quad y_{k+1}(0)=v,\quad t\in[0,T]. (28)

Since this IVP is solved by the EBK method [37], the solution y_{k+1}(t) at every iteration is obtained as a linear combination of Krylov subspace vectors, stored columnwise in a matrix V_{\ell\cdot m}\in\mathbb{R}^{N\times\ell m}, with \ell being the number of inner iterations,

y_{k+1}(t)=V_{\ell\cdot m}u_{k+1}(t),\quad t\in[0,T],\quad u_{k+1}:\mathbb{R}\rightarrow\mathbb{R}^{\ell\cdot m},

where u_{k+1}(t) is the solution of a small projected IVP (see [37] for details). This makes it possible to store y_{k+1}(t) in a compact way for all t\in[0,T]. At each nonlinear iteration k the EBK method stops as soon as the residual (23) is small enough in norm. The residual norm is easily computable and is checked for several values of t\in[0,T].

Starting with the second iteration k=1, before the IVP (28) is solved, the vector function \tilde{g}(t)\equiv f_{k}(y_{k}(t))+g(t):\mathbb{R}\rightarrow\mathbb{R}^{N} is sampled at n_{s} points t_{1}, …, t_{n_{s}} covering the time interval [0,T]. Usually, it is sensible to take t_{1}=0, t_{n_{s}}=T and the other t_{j} to be shifted Chebyshev polynomial roots:

t_{j}=\frac{T}{2}\left(1-\cos\frac{\pi(j-3/2)}{n_{s}-2}\right),\quad j=2,\dots,n_{s}-1.
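This sampling rule (t_{1}=0, t_{n_{s}}=T, interior shifted Chebyshev nodes) can be sketched in a few lines of Python (names of our choosing):

```python
import numpy as np

def sample_points(T, ns):
    """t_1 = 0, t_ns = T, interior points from the Chebyshev formula above."""
    j = np.arange(2, ns)                     # j = 2, ..., ns - 1
    interior = 0.5 * T * (1.0 - np.cos(np.pi * (j - 1.5) / (ns - 2)))
    return np.concatenate(([0.0], interior, [T]))

t = sample_points(2.0, 12)
# increasing points in [0, T], clustered towards both endpoints
```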

The number of sample points n_{s} should be taken such that \tilde{g}(t) is sufficiently well approximated by its linear interpolation at t_{1}, …, t_{n_{s}}. The computed samples \tilde{g}(t_{j}), j=1,\dots,n_{s}, are then stored as the columns of a matrix \tilde{G}\in\mathbb{R}^{N\times n_{s}}, and the thin singular value decomposition of \tilde{G} is employed to obtain a low-rank representation

\tilde{g}(t)\approx Up(t),\quad U\in\mathbb{R}^{N\times m},\quad t\in[0,T], (29)

where typically m<10. For more details of this procedure we refer to [37] or [29]. Usually the computational work needed to obtain the representation \tilde{g}(t)\approx Up(t) is negligible with respect to the other work in the EBK method. The nonlinear iteration (8) is stopped as soon as the nonlinear residual (16) is small enough in norm (in our limited experience, in practice it suffices to check the residual norm at the final time t=T only). The resulting algorithm is outlined below.

  Algorithm 1  Given: IVP (1), initial value vector vv, tolerance tol, block size mm, sample number nsn_{s}.
Result: numerical solution y(T)y(T) of IVP (1).
Initialize nonlinear iteration:
y0(t):=vy_{0}(t):=v (constant in time)
for k=0,1,2,k=0,1,2,\dots (main iteration loop)
    if (k=0k=0) then
       resnorm:=r0(T)=Φ(T,v)\texttt{resnorm}:=\|r_{0}(T)\|=\|\Phi(T,v)\|
    else
       resnorm:=rk(T)=fk1(yk(T))fk1(yk1(T))\texttt{resnorm}:=\|r_{k}(T)\|=\|f_{k-1}(y_{k}(T))-f_{k-1}(y_{k-1}(T))\|
    endif
    if resnorm is small enough (cf. (32),(39)) then
       stop iteration
    endif
    define fk(y)f_{k}(y), AkA_{k}, g~(t):=fk(yk(t))+g(t)\tilde{g}(t):=f_{k}(y_{k}(t))+g(t)
    set inner iteration tolerance tol𝚕𝚒𝚗\texttt{tol}_{\mathtt{lin}} (cf. (33),(40))
    compute the low rank approximation (29), g~(t)Up(t)\tilde{g}(t)\approx Up(t)
    solve linear IVP (28) by inner block Krylov subspace iteration (linear EBK)
    \Rightarrow obtain yk+1(t)=Vmuk+1(t)y_{k+1}(t)=V_{\ell\cdot m}u_{k+1}(t), t[0,T]t\in[0,T]
endfor (main iteration loop)  
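The sampling and compression step of Algorithm 1, i.e., the low rank representation (29), can be sketched as follows. This is an illustrative sketch, not the EBK implementation itself; the function and variable names are ours, and a generic vector function g_tilde stands in for f_k(y_k(t))+g(t).

```python
import numpy as np

def low_rank_approx(g_tilde, T, n_s, m):
    # sample points: t_1 = 0, t_{n_s} = T, Chebyshev-root interior nodes
    j = np.arange(2, n_s)
    t = np.concatenate(([0.0],
                        T / 2 * (1 - np.cos(np.pi * (j - 1.5) / (n_s - 2))),
                        [T]))
    # N x n_s sample matrix and its thin SVD
    G = np.column_stack([g_tilde(tj) for tj in t])
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    P = s[:m, None] * Vt[:m, :]          # m x n_s coefficient matrix
    return t, U[:, :m], P                # samples are approximated by U @ P

# toy check: g_tilde(t) = a + t*b has exact rank 2, so m = 2 is lossless
rng = np.random.default_rng(0)
a, b = rng.standard_normal(50), rng.standard_normal(50)
t, U, P = low_rank_approx(lambda tau: a + tau * b, T=1.0, n_s=20, m=2)
print(np.allclose(U @ P, np.column_stack([a + ti * b for ti in t])))  # True
```

The truncated singular values returned by the SVD (those beyond index m) give the error indicator sigma_{m+1} mentioned in the Burgers experiment below.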

3 Numerical experiments

3.1 1D Burgers equation

This IVP is taken from [42, Ch. IV.10]. To find the unknown function u(x,t), we solve the 1D Burgers equation

ut=νuxxuux,0x1,0tT,u_{t}=\nu u_{xx}-uu_{x},\quad 0\leqslant x\leqslant 1,\quad 0\leqslant t\leqslant T,

with viscosity ν=3104\nu=3\cdot 10^{-4} and ν=3105\nu=3\cdot 10^{-5}. Initial and boundary conditions are

u(x,0)=32x(1x)2,u(0,t)=u(1,t)=0,u(x,0)=\frac{3}{2}x(1-x)^{2},\quad u(0,t)=u(1,t)=0,

and the final time T is set to 0.5, 1.0, and 1.5. We discretize the Burgers equation by finite differences on a uniform mesh with N internal nodes x_j=j\Delta x, j=1,\dots,N, \Delta x=1/(N+1). To avoid numerical energy dissipation and to guarantee that the discretization of the advection term uu_x results in a skew-symmetric matrix [43], we apply the standard second-order central differences in the following way

uux=13uux+23(u22)x13uiui+1ui12Δx+23ui+12ui124Δx.uu_{x}=\frac{1}{3}uu_{x}+\frac{2}{3}\left(\frac{u^{2}}{2}\right)_{x}\approx\frac{1}{3}u_{i}\frac{u_{i+1}-u_{i-1}}{2\Delta x}+\frac{2}{3}\frac{u_{i+1}^{2}-u_{i-1}^{2}}{4\Delta x}.

The diffusion term νuxx\nu u_{xx} is discretized by second-order central differences in the usual way. Then this finite difference discretization leads to an IVP of the form

y(t)=Asymmy(t)Askew(y(t))y(t),y(0)=v,y^{\prime}(t)=-A_{\mathrm{symm}}y(t)-A_{\mathrm{skew}}(y(t))y(t),\quad y(0)=v, (30)

where

Asymmyνuxx,Askew(y)yuux,A_{\mathrm{symm}}y\approx-\nu u_{xx},\quad A_{\mathrm{skew}}(y)y\approx uu_{x},

A_{\mathrm{symm}}=A_{\mathrm{symm}}^{T}\in\mathbb{R}^{N\times N} is positive definite and (A_{\mathrm{skew}}(y))^{T}=-A_{\mathrm{skew}}(y)\in\mathbb{R}^{N\times N} for all y\in\mathbb{R}^{N}. It is important to make sure that the discretization of the convective term results in a skew-symmetric matrix [43, 44]. Otherwise, the convective term contributes to the symmetric part of the Jacobian matrix, which distorts the proper energy dissipation in the ODE system and may prevent the symmetric part from being positive semidefinite. In the context of this paper, this would mean that \omega\geqslant 0 is not guaranteed anymore.
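The skew-symmetry claim is easy to check directly. The stencil above implies a tridiagonal matrix with entries A_skew(y)_{i,i+1} = (y_i + y_{i+1})/(6 Delta x) and A_skew(y)_{i,i-1} = -(y_i + y_{i-1})/(6 Delta x); the following illustrative sketch (our own code, not the paper's) assembles this matrix and verifies A_skew^T = -A_skew:

```python
import numpy as np

def a_skew(y, dx):
    # tridiagonal skew part: row i has (y_i + y_{i+1})/(6 dx) on the
    # superdiagonal and -(y_i + y_{i-1})/(6 dx) on the subdiagonal
    N = len(y)
    A = np.zeros((N, N))
    for i in range(N - 1):
        A[i, i + 1] = (y[i] + y[i + 1]) / (6 * dx)
        A[i + 1, i] = -(y[i + 1] + y[i]) / (6 * dx)
    return A

N = 200
dx = 1.0 / (N + 1)
x = dx * np.arange(1, N + 1)
y = 1.5 * x * (1 - x) ** 2              # the initial profile of this test
A = a_skew(y, dx)
print(np.allclose(A.T, -A))             # True: skew-symmetry holds exactly
```

Expanding a row of A @ y reproduces the stencil: (1/3) y_i (y_{i+1}-y_{i-1})/(2 dx) + (2/3)(y_{i+1}^2 - y_{i-1}^2)/(4 dx).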

A straightforward way to apply the nonlinear iteration (8) to (30) would be to set A_k:=A_{\mathrm{symm}} and f_k(y):=-A_{\mathrm{skew}}(y)y in (3). However, to decrease the Lipschitz constant of f_k(y) (cf. condition (10)), we also include a linear part of -A_{\mathrm{skew}}(y)y in the -A_k y term in (3). More specifically, we apply the nonlinear iteration (8) to the semidiscrete Burgers equation (30) in the form

yk+1(t)=(Asymm+Askew(y¯k)Ak)yk+1(t)+[Askew(y¯k)Askew(yk(t))]yk(t)fk(yk(t)),\displaystyle y_{k+1}^{\prime}(t)=-(\,\underbrace{A_{\mathrm{symm}}+A_{\mathrm{skew}}(\bar{y}_{k})}_{\displaystyle A_{k}}\,)y_{k+1}(t)+\underbrace{\left[A_{\mathrm{skew}}(\bar{y}_{k})-A_{\mathrm{skew}}(y_{k}(t))\right]y_{k}(t)}_{\displaystyle f_{k}(y_{k}(t))}, (31)
k=0,1,2,,\displaystyle k=0,1,2,\dots,

where y¯k=yk(T)\bar{y}_{k}=y_{k}(T). The nonlinear residual (16) in this case reads, for k=0,1,k=0,1,\dots,

rk+1(t)\displaystyle r_{k+1}(t) =fk(yk+1(t))fk(yk(t))\displaystyle=f_{k}(y_{k+1}(t))-f_{k}(y_{k}(t))
=[Askew(y¯k)Askew(yk+1(t))]yk+1(t)[Askew(y¯k)Askew(yk(t))]yk(t),\displaystyle=\left[A_{\mathrm{skew}}(\bar{y}_{k})-A_{\mathrm{skew}}(y_{k+1}(t))\right]y_{k+1}(t)-\left[A_{\mathrm{skew}}(\bar{y}_{k})-A_{\mathrm{skew}}(y_{k}(t))\right]y_{k}(t),
rk+1(T)\displaystyle r_{k+1}(T) =[Askew(y¯k)Askew(yk+1(T))]yk+1(T).\displaystyle=\left[A_{\mathrm{skew}}(\bar{y}_{k})-A_{\mathrm{skew}}(y_{k+1}(T))\right]y_{k+1}(T).

Since the symmetric part of the matrix AkA_{k} is 12(Ak+AkT)=Asymm\frac{1}{2}(A_{k}+A_{k}^{T})=A_{\mathrm{symm}}, the ω\omega value is the smallest eigenvalue of AsymmA_{\mathrm{symm}}. The lowest eigenmodes should be well approximated even on the coarsest grid. Indeed, computing ω\omega numerically (which is easy for this one-dimensional problem) yields approximately the same result on all grids: we get ω2.96e-03\omega\approx\texttt{2.96e-03} for ν=3104\nu=3\cdot 10^{-4} and ω2.96e-04\omega\approx\texttt{2.96e-04} for ν=3105\nu=3\cdot 10^{-5}. We stop our nonlinear iterative process (8) as soon as

rk(T)tol,\|r_{k}(T)\|\leqslant\texttt{tol}, (32)

where tol is a tolerance value. As we primarily aim at solving IVPs stemming from space discretization of PDEs, where space discretization error is significant, we settle for a moderate tolerance value tol=103\texttt{tol}=10^{-3} in this test. To obtain the low rank representation (29) at each iteration kk we use ns=100n_{s}=100 sample points tjt_{j}, j=1,,nsj=1,\dots,n_{s} and set the number mm of singular vectors to m=7m=7. The largest truncated singular value σm+1\sigma_{m+1}, serving as an indicator of the relative low rank representation error in (29), is then of order 10610^{-6} for T=0.5T=0.5 and of order 10310^{-3} for T=1.5T=1.5.
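The grid independence of omega observed above can be verified directly: the smallest eigenvalue of the scaled 1D diffusion matrix converges to nu*pi^2 as the grid is refined, which for nu = 3e-4 gives the quoted 2.96e-03. A minimal numerical check (illustrative code, not part of the solver):

```python
import numpy as np

# A_symm is the standard 1D diffusion matrix nu/dx^2 * tridiag(-1, 2, -1);
# its smallest eigenvalue is nu*(2 - 2*cos(pi*dx))/dx^2 -> nu*pi^2,
# hence nearly the same on every grid.
nu = 3e-4
for N in (50, 200, 800):
    dx = 1.0 / (N + 1)
    A_symm = nu / dx**2 * (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1))
    omega = np.linalg.eigvalsh(A_symm)[0]      # smallest eigenvalue
    print(N, omega)                            # approx 2.96e-03 on each grid
```

For nu = 3e-5 the same computation scales everything by 0.1, reproducing omega approx 2.96e-04.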

For the inner iterative solver EBK employed to solve (8) we set the residual tolerance to tol𝚕𝚒𝚗=tol\texttt{tol}_{\mathtt{lin}}=\texttt{tol}, which means that inner iterations are stopped as soon as the inner linear residual rk+1𝗅𝗂𝗇(t)r^{\mathsf{lin}}_{k+1}(t) satisfies

rk+1𝗅𝗂𝗇(T)tol𝚕𝚒𝚗.\|r^{\mathsf{lin}}_{k+1}(T)\|\leqslant\texttt{tol}_{\mathtt{lin}}. (33)

Krylov subspace dimension is set to 1010. As the block size in the Krylov subspace process is m=7m=7, the EBK solver requires m(10+1)=77m(10+1)=77 vectors to store. Since the problem becomes stiffer as the spatial grid gets finer, we use the EBK solver in the shift-and-invert (SAI) mode [45, 46], i.e., the block Krylov subspace is computed for the SAI matrix (I+γAk)1(I+\gamma A_{k})^{-1}, γ=T/10\gamma=T/10, with AkA_{k} defined in (31). A sparse (banded in this test) LU factorization of I+γAkI+\gamma A_{k} is computed once at each nonlinear iteration and used every time an action of (I+γAk)1(I+\gamma A_{k})^{-1} is required.
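The factor-once, solve-many pattern of the SAI mode can be sketched as follows. Since I + gamma*A_k is tridiagonal in this 1D test, a Thomas (tridiagonal LU) factorization plays the role of the banded sparse LU; this is a simplified stand-in for the actual EBK implementation, with illustrative names.

```python
import numpy as np

def thomas_factor(lower, diag, upper):
    # LU factorization of a tridiagonal matrix (Thomas algorithm)
    n = len(diag)
    d = diag.copy(); l = lower.copy()
    for i in range(1, n):
        l[i - 1] = l[i - 1] / d[i - 1]          # L factor entry
        d[i] = d[i] - l[i - 1] * upper[i - 1]   # update of the pivot
    return l, d, upper

def thomas_solve(l, d, u, b):
    # forward/back substitution reusing the stored factors
    n = len(d); x = b.copy()
    for i in range(1, n):
        x[i] -= l[i - 1] * x[i - 1]
    x[-1] /= d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (x[i] - u[i] * x[i + 1]) / d[i]
    return x

N, T = 1000, 0.5
gamma = T / 10
c = gamma * (N + 1) ** 2                        # gamma/dx^2 scaling
lo = -c * np.ones(N - 1)
di = 1 + 2 * c * np.ones(N)                     # diagonal of I + gamma*A_k
up = -c * np.ones(N - 1)
l, d, u = thomas_factor(lo, di, up)             # one factorization ...
w = thomas_solve(l, d, u, np.ones(N))           # ... reused for every rhs
```

Each further SAI action (I + gamma*A_k)^{-1} v within the Krylov process calls only `thomas_solve`, never a new factorization, mirroring the one-LU-per-nonlinear-iteration cost reported in the tables.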

To compare our nonlinear iterative EBK solver to another solver, we also solve this test with the MATLAB stiff ODE solver ode15s, a variable step size, variable order implicit multistep method [47]. The ode15s solver is run with absolute and relative tolerances set, respectively, to \texttt{AbsTol}=\texttt{tol} and \texttt{RelTol}=10^{4}\,\texttt{tol}, with \texttt{tol}=10^{-9}. For these tolerance values both ode15s and our nonlinear EBK deliver a comparable accuracy. The relative error values reported below are computed as

yyrefyref,\frac{\|y-y_{\mathrm{ref}}\|}{\|y_{\mathrm{ref}}\|}, (34)

where yy is a numerical solution at final time TT and yrefy_{\mathrm{ref}} is a reference solution computed at final time by the ode15s solver run with strict tolerance values AbsTol=1012\texttt{AbsTol}=10^{-12} and RelTol=108\texttt{RelTol}=10^{-8}. As yrefy_{\mathrm{ref}} is computed on the same spatial grid as yy, the error value can be seen as a reliable measure of time accuracy [13].

Actual delivered accuracies and the corresponding computational work for both solvers are reported in Table 1 (for viscosity \nu=3\cdot 10^{-4}) and Table 2 (for \nu=3\cdot 10^{-5}). For our nonlinear EBK solver the work is reported as the number of nonlinear iterations, with one LU factorization and one matrix-vector product (matvec) per iteration, and the total number of linear system solves (which equals the total number of Krylov steps times the block size m). The reported total number of LU applications is not necessarily a multiple of the block size m=7 because at the first nonlinear iteration the approximate solution is constant in time and, hence, we set m=1 in (29) at the first iteration. The computational work for the ode15s solver is reported as the number of time steps, computed LU factorizations and ODE right hand side function evaluations (fevals). In Tables 1 and 2 we also report the norm of A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T)), which can be seen as a measure of the stiffness of the ODE system.

Table 1: The 1D Burgers test problem, viscosity \nu=3\cdot 10^{-4}. Attained error and computational work for our nonlinear EBK method and the ode15s solver. For the EBK method the work is reported as the number of nonlinear iterations, LU factorizations, their applications and matvecs. For the ode15s solver the work is measured as the number of time steps, LU factorizations, their applications and fevals.
method, iter./ LUs (LUs applic.), relative
TT tolerance steps matvecs/fevals error
grid N=500N=500, Asymm+Askew(y(T))1300\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 300
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (141), 5 5.17e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (220), 7 2.03e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 10 10 (340), 10 5.31e-05
0.50.5 ode15s, 1e-09 59 14 (73), 575 1.93e-06
1.01.0 ode15s, 1e-09 70 16 (86), 588 8.38e-06
1.51.5 ode15s, 1e-09 83 19 (118), 620 2.21e-05
grid N=1000N=1000, Asymm+Askew(y(T))11 200\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 1\,200
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (170), 5 5.06e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (256), 7 2.00e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 10 10 (389), 10 5.30e-05
0.50.5 ode15s, 1e-09 67 15 (83), 1085 1.76e-06
1.01.0 ode15s, 1e-09 78 17 (104), 1106 1.13e-05
1.51.5 ode15s, 1e-09 91 19 (134), 1136 2.48e-05
grid N=2000N=2000, Asymm+Askew(y(T))14 800\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 4\,800
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (177), 5 5.07e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (277), 7 2.00e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 11 11 (452), 11 4.38e-05
0.50.5 ode15s, 1e-09 73 16 (91), 2093 2.29e-06
1.01.0 ode15s, 1e-09 85 19 (107), 2109 7.96e-06
1.51.5 ode15s, 1e-09 98 22 (139), 2141 2.22e-05
grid N=4000N=4000, Asymm+Askew(y(T))119 200\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 19\,200
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (193), 5 5.06e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 8 8 (347), 8 4.82e-06
1.51.5 nonlin.EBK(m=7m=7), 1e-03 11 11 (501), 11 4.38e-05
0.50.5 ode15s, 1e-09 79 18 (99), 4101 1.93e-06
1.01.0 ode15s, 1e-09 90 20 (112), 4114 8.25e-06
1.51.5 ode15s, 1e-09 103 23 (144), 4146 2.21e-05

As we can see in Tables 1 and 2, our solver requires fewer LU factorizations than the ode15s solver. It does require more applications of the LU factors, but these are relatively cheap. Moreover, in EBK they are carried out simultaneously for blocks of m right hand sides. For both solvers, in Figure 1 we plot the numbers of LU factorizations (presented in the tables) versus the grid size. The number of nonlinear iterations in the EBK solver (with one LU factorization per iteration) remains practically constant as the grid gets finer, whereas the number of LU factorizations in ode15s increases with the grid size. Furthermore, we see that the ode15s solver requires more time steps and significantly more fevals on the finer grids. T=1.5 is approximately the largest possible time for which our nonlinear EBK solver converges in this test. From Figure 1 we see that a higher efficiency is reached for larger T values. Indeed, on the finest grid N=4000, the solver needs 5/0.5=10 LU factorizations per unit time for T=0.5, 8 factorizations for T=1 and 11/1.5\approx 7 factorizations for T=1.5.

Comparing Tables 1 and 2, we see that the performance of the ode15s solver improves for the smaller viscosity value \nu=3\cdot 10^{-5}. This is expected, as the stiffness of the space-discretized Burgers equation decreases with \nu. Indeed, on finer grids, where the contributions of A_{\mathrm{symm}} (proportional to (\Delta x)^{-2}) dominate those of A_{\mathrm{skew}}(y(t)) (proportional to (\Delta x)^{-1}), we have \|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(t))\|\approx\mathcal{O}(\nu), see the values \|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\| reported in the tables.

In Figure 2 we plot the reference solution yref(T)y_{\mathrm{ref}}(T) and the first three iterative approximations y0(T)=vy_{0}(T)=v, y1(T)y_{1}(T) and y2(T)y_{2}(T). Convergence plots of the residual and error norms are shown in Figure 3. There, the shown residual norm is computed at t=Tt=T according to (16) and the error norm is the relative norm defined in (34). As we see, the residual turns out to be a reliable measure of the error. Furthermore, convergence behavior is not influenced by viscosity ν\nu, which is in accordance with theory: ν\nu does not change the nonlinear term fkf_{k} in (8) and its Lipschitz constant.

Table 2: The 1D Burgers test problem, viscosity \nu=3\cdot 10^{-5}. Attained error and computational work for our nonlinear EBK method and the ode15s solver. For the EBK method the work is reported as the number of nonlinear iterations, LU factorizations, their applications and matvecs. For the ode15s solver the work is measured as the number of time steps, LU factorizations, their applications and fevals.
method, iter./ LUs (LUs applic.), relative
TT tolerance steps matvecs/fevals error
grid N=500N=500, Asymm+Askew(y(T))190\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 90
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (69), 5 1.82e-05
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (139), 7 2.26e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 13 13 (414), 13 1.10e-04
0.50.5 ode15s, 1e-09 28 7 (38), 540 3.71e-06
1.01.0 ode15s, 1e-09 38 9 (53), 555 1.23e-05
1.51.5 ode15s, 1e-09 52 13 (89), 591 3.38e-05
grid N=1000N=1000, Asymm+Askew(y(T))1210\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 210
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (90), 5 6.20e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (176), 7 2.25e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 12 12 (430), 12 1.07e-04
0.50.5 ode15s, 1e-09 35 9 (44), 1046 2.44e-06
1.01.0 ode15s, 1e-09 45 10 (60), 1062 1.64e-05
1.51.5 ode15s, 1e-09 59 15 (99), 2102 3.83e-05
grid N=2000N=2000, Asymm+Askew(y(T))1540\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 540
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (120), 5 5.29e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 7 7 (190), 7 2.22e-05
1.51.5 nonlin.EBK(m=7m=7), 1e-03 12 12 (494), 12 1.06e-04
0.50.5 ode15s, 1e-09 42 10 (55), 2057 3.91e-06
1.01.0 ode15s, 1e-09 52 12 (70), 2072 1.32e-05
1.51.5 ode15s, 1e-09 66 17 (108), 4111 3.32e-05
grid N=4000N=4000, Asymm+Askew(y(T))11 920\|A_{\mathrm{symm}}+A_{\mathrm{skew}}(y(T))\|_{1}\approx 1\,920
0.50.5 nonlin.EBK(m=7m=7), 1e-03 5 5 (149), 5 5.24e-06
1.01.0 nonlin.EBK(m=7m=7), 1e-03 8 8 (276), 8 5.52e-06
1.51.5 nonlin.EBK(m=7m=7), 1e-03 12 12 (578), 12 1.07e-04
0.50.5 ode15s, 1e-09 47 11 (61), 4063 3.86e-06
1.01.0 ode15s, 1e-09 58 15 (80), 4082 1.22e-05
1.51.5 ode15s, 1e-09 71 18 (112), 8115 3.36e-05
Refer to caption
Refer to caption
Figure 1: The Burgers test. Number of required LU factorizations versus the grid size in the nonlinear EBK solver (left) and ode15s (right) for different final time values TT and viscosity ν=3104\nu=3\cdot 10^{-4}

Refer to caption

Figure 2: The Burgers test. The first three iterative approximations y0(T)=vy_{0}(T)=v, y1(T)y_{1}(T), y2(T)y_{2}(T) and reference solution yref(T)y_{\mathrm{ref}}(T), grid size N=2000N=2000, final time T=1.5T=1.5, viscosity ν=3104\nu=3\cdot 10^{-4}.
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 3: The Burgers test. Residual norm (16) and relative error norm (34) versus iteration number for T=0.5T=0.5 (top plots) and T=1.5T=1.5 (bottom plots), ν=3104\nu=3\cdot 10^{-4} (left plots) and ν=3105\nu=3\cdot 10^{-5} (right plots). The grid size is N=4000N=4000. The error stagnation visible in the bottom right plot can be repaired by slightly increasing the block size mm.

3.2 3D Bratu test problem

We now consider a nonstationary IVP for the Liouville–Bratu–Gelfand equation [48]: find u=u(x,y,z,t) such that

ut=104uxx+102uyy+uzz+Ceu+g𝗌𝗋𝖼(x,y,z,t),\displaystyle u_{t}=10^{4}u_{xx}+10^{2}u_{yy}+u_{zz}+Ce^{u}+g^{\mathsf{src}}(x,y,z,t), (35)
(x,y,z)Ω=[0,1]3,0tT,C=3104,\displaystyle\qquad\qquad(x,y,z)\in\Omega=[0,1]^{3},\quad 0\leqslant t\leqslant T,\quad C=3\cdot 10^{4},
u(x,y,z,0)=e100((x0.2)2+(y0.4)2+(z0.5)2),u|Ω=0.\displaystyle u(x,y,z,0)=e^{-100\left((x-0.2)^{2}+(y-0.4)^{2}+(z-0.5)^{2}\right)},\quad u\bigl{|}_{\partial\Omega}=0.

Here the source function g𝗌𝗋𝖼g^{\mathsf{src}} is defined as

g𝗌𝗋𝖼(x,y,z,t)={e100((xx0(t))2+(yy0(t))2+(z0.5)2)+Cu(x,y,z,0),t5105,e100((xx0(t))2+(yy0(t))2+(z0.5)2),t>5105,\displaystyle g^{\mathsf{src}}(x,y,z,t)=\begin{cases}e^{-100((x-x_{0}(t))^{2}+(y-y_{0}(t))^{2}+(z-0.5)^{2})}+Cu(x,y,z,0),\;&t\leqslant 5\cdot 10^{-5},\\ e^{-100((x-x_{0}(t))^{2}+(y-y_{0}(t))^{2}+(z-0.5)^{2})},\;&t>5\cdot 10^{-5},\end{cases}
x0(t)=0.5+0.3cos(2000πt),y0(t)=0.5+0.3sin(2000πt),\displaystyle x_{0}(t)=0.5+0.3\cos(2000\pi t),\quad y_{0}(t)=0.5+0.3\sin(2000\pi t),

and the final time T takes the values 5\cdot 10^{-5}, 1\cdot 10^{-4} and, in some runs, 1\cdot 10^{-3} (see Table 3). The solution to (35) is shown in Figure 4.

Refer to caption Refer to caption

Figure 4: Left: initial value function u(x,y,z,0)u(x,y,z,0) of the Bratu test on a uniform 40×40×4040\times 40\times 40 grid for z=0.5z=0.5. Right: numerical solution u(x,y,z,t)u(x,y,z,t) on the same grid for z=0.5z=0.5 and t=1103t=1\cdot 10^{-3}.

The regular second order finite-difference discretization of the spatial derivatives in (35) on a uniform grid yields an IVP

y(t)=Ay(t)+f^(y(t))+g(t),y(0)=v,y^{\prime}(t)=-Ay(t)+\hat{f}(y(t))+g(t),\quad y(0)=v, (36)

where the entries of the vector functions y(t)y(t) and g(t)g(t) contain the values of the numerical solution and function g𝗌𝗋𝖼(x,y,z,t)g^{\mathsf{src}}(x,y,z,t) at the grid nodes, respectively, and

Ay(t)(104uxx+102uyy+uzz).Ay(t)\approx-(10^{4}u_{xx}+10^{2}u_{yy}+u_{zz}). (37)

Furthermore, the components [f^(y)]i[\hat{f}(y)]_{i} of the vector function f^(y)\hat{f}(y) are defined as

[f^(y)]i=Ceyi,i=1,,N.\left[\hat{f}(y)\right]_{i}=Ce^{y_{i}},\quad i=1,\dots,N.

To apply the nonlinear waveform relaxation (8) to (36), one could simply set A_k:=A and f_k(y):=\hat{f}(y). However, to obtain a smaller Lipschitz constant L of f_k(y), we supplement A_k with a linear part of \hat{f}, setting

yk+1(t)=(AJ(y¯k)Ak)yk+1(t)+[f^(yk(t))J(y¯k)yk(t)]fk(yk(t))+g(t),\displaystyle y_{k+1}^{\prime}(t)=-(\,\underbrace{A-J(\bar{y}_{k})}_{\displaystyle A_{k}}\,)y_{k+1}(t)+\underbrace{\left[\hat{f}(y_{k}(t))-J(\bar{y}_{k})y_{k}(t)\right]}_{\displaystyle f_{k}(y_{k}(t))}+g(t), (38)
k=0,1,2,,\displaystyle k=0,1,2,\dots,

where J(y¯k)J(\bar{y}_{k}) is the Jacobian matrix of f^(y)\hat{f}(y) evaluated at y¯k=yk(T)\bar{y}_{k}=y_{k}(T) and an approximation

f^(yk+1(t))f^(yk(t))+J(y¯k)(yk+1(t)yk(t))\hat{f}(y_{k+1}(t))\approx\hat{f}(y_{k}(t))+J(\bar{y}_{k})(y_{k+1}(t)-y_{k}(t))

is used. The nonlinear residual (16) takes in this case the form, for k=0,1,k=0,1,\dots,

rk+1(t)\displaystyle r_{k+1}(t) =fk(yk+1(t))fk(yk(t))\displaystyle=f_{k}(y_{k+1}(t))-f_{k}(y_{k}(t))
=[f^(yk+1(t))J(y¯k)yk+1(t)][f^(yk(t))J(y¯k)yk(t)]\displaystyle=\left[\hat{f}(y_{k+1}(t))-J(\bar{y}_{k})y_{k+1}(t)\right]-\left[\hat{f}(y_{k}(t))-J(\bar{y}_{k})y_{k}(t)\right]
=f^(yk+1(t))f^(yk(t))J(y¯k)(yk+1(t)yk(t)).\displaystyle=\hat{f}(y_{k+1}(t))-\hat{f}(y_{k}(t))-J(\bar{y}_{k})(y_{k+1}(t)-y_{k}(t)).
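For this Bratu nonlinearity the Jacobian used in (38) is diagonal, J(y) = diag(C exp(y_i)), so forming and applying it is cheap. A small sketch (illustrative names, not the solver code) confirming the first-order accuracy of the linearization above:

```python
import numpy as np

C = 3e4
f_hat = lambda y: C * np.exp(y)          # Bratu nonlinearity, componentwise
jac_diag = lambda y: C * np.exp(y)       # diagonal of the Jacobian J(y)

rng = np.random.default_rng(1)
y_bar = rng.random(100)                  # linearization point y_k(T)
y = y_bar + 1e-4 * rng.standard_normal(100)   # nearby state y_{k+1}(t)

# first-order model f_hat(y_bar) + J(y_bar)(y - y_bar), cf. the text
lin = f_hat(y_bar) + jac_diag(y_bar) * (y - y_bar)
err = np.linalg.norm(f_hat(y) - lin) / np.linalg.norm(f_hat(y))
print(err)                               # O(|y - y_bar|^2), tiny here
```

The residual r_{k+1}(t) above is exactly this linearization error evaluated along the iterates, which is why it shrinks rapidly once y_{k+1} is close to y_k.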

Since this IVP is stiff and anisotropic, the nonlinear residual can be very large in norm, \mathcal{O}(10^{8}). Therefore we switch from the absolute stopping criterion \|r_k(T)\|\leqslant\texttt{tol} used in the previous test to the relative stopping criterion

rk(T)tolr0(T),\|r_{k}(T)\|\leqslant\texttt{tol}\|r_{0}(T)\|, (39)

with tolerances tol varying in this test from 10210^{-2} to 10410^{-4} (the specific values are reported in Table 3 discussed below). Note that y0(t)vy_{0}(t)\equiv v, as discussed in Section 2.5.

The low rank representation (29) at each iteration kk is computed at ns=100n_{s}=100 sample points tjt_{j}, j=1,,nsj=1,\dots,n_{s} for either m=4m=4 or m=5m=5 low rank terms. The EBK solver applied to solve (8) is stopped as soon as

rk+1𝗅𝗂𝗇(T)tol𝚕𝚒𝚗=fk(yk(0))+g(0)tol/10,\|r^{\mathsf{lin}}_{k+1}(T)\|\leqslant\texttt{tol}_{\mathtt{lin}}=\|f_{k}(y_{k}(0))+g(0)\|\,\texttt{tol}/10, (40)

which means that the linear residual should be small with respect to the other terms in the perturbed ODE (27). We switch from the absolute inner stopping criterion (33) to (40) because in this stiff anisotropic problem the term fk(yk(t))+g(t)f_{k}(y_{k}(t))+g(t) can be very large in norm. Krylov subspace dimension is set to 1010. Hence, for the block size m=5m=5, the EBK solver has to store 5(10+1)=555\cdot(10+1)=55 vectors. Since the problem is stiff, just as in the previous test, the EBK solver is employed in the SAI mode, with a sparse LU factorization of I+γAkI+\gamma A_{k} computed and applied throughout each nonlinear iteration. The relative error values we report below are computed according to (34).

We compare our nonlinear EBK solver to the two-stage Rosenbrock method ROS2, see [10] or [1, Chapter IV.5]. For IVP (1), it can be written as

y+1\displaystyle y^{\ell+1} =y+32τk1+12τk2,\displaystyle=y^{\ell}+\frac{3}{2}\tau k_{1}+\frac{1}{2}\tau k_{2}, (41)
(IγτA^)k1\displaystyle(I-\gamma\tau\widehat{A})k_{1} =Φ(t,y),\displaystyle=\Phi(t_{\ell},y^{\ell}),
(IγτA^)k2\displaystyle(I-\gamma\tau\widehat{A})k_{2} =Φ(t+1,y+τk1)2k1.\displaystyle=\Phi(t_{\ell+1},y^{\ell}+\tau k_{1})-2k_{1}.

Here y^{\ell}\approx y(\ell\tau) is the numerical solution at time step \ell=0,1,2,\dots (y^{0}=v) obtained with a time step size \tau>0. This scheme is second order consistent for any matrix \widehat{A}\in\mathbb{R}^{N\times N}. Here we take \widehat{A} to be the Jacobian matrix of \Phi computed at t_{\ell}=\ell\tau and y^{\ell}. At each time step a sparse LU factorization of I-\gamma\tau\widehat{A} can be computed and used to solve the two linear systems for k_{1,2}. The numerical tests presented here are carried out in Matlab and, as it turns out, for the grid sizes used, it is much faster to compute the action of the matrix (I-\gamma\tau\widehat{A})^{-1} by the Matlab backslash operator than to compute a sparse LU factorization first and to apply it twice. (To solve a linear system Ax=b with a square nonsingular matrix A, one can use the Matlab backslash operator as x=A\b, in which case the operator computes and applies an LU factorization.) This is apparently because Matlab incurs an overhead to call the sparse LU factorization routines (of the UMFPACK software [49, 50]) and then to store the computed LU factors as Matlab variables. Note that both the backslash operator and the LU factorization for large sparse matrices in Matlab are based on UMFPACK.
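A single step of scheme (41) can be sketched as follows. The value gamma = 1 + 1/sqrt(2) is the standard choice making ROS2 L-stable; it is our assumption here, since the scheme is stated for generic gamma, and the dense solve is for illustration only.

```python
import numpy as np

def ros2_step(phi, A_hat, t, y, tau, gamma=1 + 1 / np.sqrt(2)):
    # one ROS2 step per (41); M is formed densely for illustration
    M = np.eye(len(y)) - gamma * tau * A_hat
    k1 = np.linalg.solve(M, phi(t, y))
    k2 = np.linalg.solve(M, phi(t + tau, y + tau * k1) - 2 * k1)
    return y + 1.5 * tau * k1 + 0.5 * tau * k2

# second-order check on y' = -y: halving tau shrinks the one-step error ~8x
phi = lambda t, y: -y
y0 = np.array([1.0])
e1 = abs(ros2_step(phi, np.array([[-1.0]]), 0.0, y0, 0.01)[0] - np.exp(-0.01))
e2 = abs(ros2_step(phi, np.array([[-1.0]]), 0.0, y0, 0.005)[0] - np.exp(-0.005))
print(e1, e1 / e2)        # local error is O(tau^3), so the ratio is near 8
```

In an actual implementation the two dense solves would be replaced by two applications of one factorization (or backslash calls) per step, which is exactly the per-step cost counted in Tables 3 and 5.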

Comparison results for our nonlinear EBK solver and the ROS2 method are presented in Table 3. The results clearly show that, for this test problem, the nonlinear EBK solver is much more efficient than the ROS2 method both in terms of required LU factorizations (and their actions) and CPU time. We also see that the number of nonlinear waveform relaxation iterations hardly changes with the grid size and with the final time T (see also Table 4). While the grid independence of the convergence is expected (the grid size does not affect the nonlinear term Ce^{u} in the ODE), the weak dependence of the convergence on T is probably a property of this specific test problem. From the plots in Figure 5 we see that the convergence behavior does depend on T.

Table 3: The 3D Bratu test problem. Attained error and computational work for our nonlinear EBK method and the ROS2 solver. For the EBK method the work is reported as the number of nonlinear iterations, LU factorizations, their applications and matvecs. For the ROS2 solver the work is shown as the number of time steps, LU factorizations, their applications and fevals.
method, time steps (t.s.) iter./ LUs (LUs applic.), relative CPU
tolerance / τ\tau steps matvecs/fevals error time, s
grid 40×40×4040\times 40\times 40, T=5105T=5\cdot 10^{-5}, TA(y(T))13.40e+03T\|A(y(T))\|_{1}\approx\texttt{3.40e+03}
nonlin.EBK(m=4m=4), 1 t.s., 1e-02 2 2 (37), 2 1.17e-04 6.3
nonlin.EBK(m=5m=5), 1 t.s., 1e-04 3 3 (111), 3 4.04e-05 12.9
ROS2, τ=T/320\tau=T/320 320 320 (640), 640 2.36e-04 225
ROS2, τ=T/640\tau=T/640 640 640 (1280), 1280 5.73e-07 442
grid 40×40×4040\times 40\times 40, T=1104T=1\cdot 10^{-4}, TA(y(T))16.79e+03T\|A(y(T))\|_{1}\approx\texttt{6.79e+03}
nonlin.EBK(m=4m=4), 1 t.s., 1e-03 3 3 (70), 3 2.09e-05 10.4
nonlin.EBK(m=5m=5), 1 t.s., 1e-03 3 3 (80), 3 1.38e-05 11.1
ROS2, τ=T/320\tau=T/320 320 320 (640), 640 1.51e-05 201
ROS2, τ=T/640\tau=T/640 640 640 (1280), 1280 7.51e-06 401
grid 40×40×4040\times 40\times 40, T=1103T=1\cdot 10^{-3}, TA(y(T))16.79e+04T\|A(y(T))\|_{1}\approx\texttt{6.79e+04}
nonlin.EBK(m=5m=5), 20 t.s., 1e-03 46 46 (908), 46 1.34e-05 148
nonlin.EBK(m=5m=5), 10 t.s., 1e-03 30 30 (646), 30 1.19e-05 102
ROS2, 3200 t.s., τ=T/3200\tau=T/3200 3200 3200 (6400), 3200 2.19e-06 2007
grid 60×60×6060\times 60\times 60, T=5105T=5\cdot 10^{-5}, TA(y(T))17.50e+03T\|A(y(T))\|_{1}\approx\texttt{7.50e+03}
nonlin.EBK(m=4m=4), 1 t.s., 1e-02 2 2 (41), 2 1.18e-04 51.5
nonlin.EBK(m=4m=4), 1 t.s., 1e-04 3 3 (91), 3 6.91e-05 90
ROS2, τ=T/320\tau=T/320 320 320 (640), 640 2.35e-04 2177
ROS2, τ=T/640\tau=T/640 640 640 (1280), 1280 1.66e-05 4790
grid 60×60×6060\times 60\times 60, T=1104T=1\cdot 10^{-4}, TA(y(T))11.50e+04T\|A(y(T))\|_{1}\approx\texttt{1.50e+04}
nonlin.EBK(m=4m=4), 1 t.s., 1e-02 2 2 (38), 2 3.48e-03 51
nonlin.EBK(m=4m=4), 1 t.s., 1e-03 3 3 (74), 3 2.05e-05 83
ROS2, τ=T/320\tau=T/320 320 320 (640), 640 2.11e-05 2013
ROS2, τ=T/640\tau=T/640 640 640 (1280), 1280 1.66e-05 4368
grid 60×60×6060\times 60\times 60, T=1103T=1\cdot 10^{-3}, TA(y(T))11.50e+05T\|A(y(T))\|_{1}\approx\texttt{1.50e+05}
nonlin.EBK(m=5m=5), 10 t.s., 1e-04 30 30 (671), 30 1.18e-05 800
ROS2, τ=T/3200\tau=T/3200 3200 3200 (6400), 3200 2.18e-06 20017
Refer to caption
Refer to caption
Figure 5: The Bratu 3D test. Residual and relative error norm versus iteration number in the nonlinear EBK solver for the grid size 40×40×4040\times 40\times 40, T=5105T=5\cdot 10^{-5} (left plots) and T=1104T=1\cdot 10^{-4} (right plots). The plots are made with increased block size m=32m=32 and number of samples ns=300n_{s}=300. Similar behavior is observed for the 60×60×6060\times 60\times 60 grid.

Results presented in Table 3 for the nonlinear EBK solver are obtained with the block sizes m=4 and m=5. The value m=4 is marginal from the accuracy point of view but does yield acceptable error values. Therefore, the plots in Figure 5, made to get an insight into the error convergence, are produced with a larger block size m=32 and n_s=300 samples. A convergence stagnation is clearly visible in both plots. Although we do not have a precise explanation for the stagnation, we note that it disappears if we switch off the strong anisotropy in (37), taking A to be a Laplacian. The accuracy level at which stagnation occurs appears to be related to the condition number of the Jacobian of the ODE system being solved. The nonlinear EBK solver relies on repeatedly made low-rank approximations which are instrumental for the inner block Krylov subspace solver. Another possible source of inaccuracies is the linear system solution within the block Krylov subspace solver, where linear systems with matrices I+\gamma A_k are solved for \gamma>0 orders of magnitude larger than typical time step sizes in implicit time integrators. Nevertheless, the convergence stagnation is observed at an accuracy level \approx 10^{-8}, which seems by far acceptable for PDE solvers.

Finally, we briefly comment on the convergence of the ROS2 method. Since its convergence order cannot be seen from the error values shown in Table 3, we give an error convergence plot in Figure 6, where relative error values, computed as given in (34), are plotted versus the total number of time steps. The second order convergence is clearly seen.

Refer to caption

Figure 6: The Bratu 3D test. Relative error of the ROS2 method at t=Tt=T, computed as given in (34), versus the time step number.
Table 4: Number of nonlinear EBK iterations needed to converge to relative tolerance tol=103\texttt{tol}=10^{-3}, cf. (39), for different grid sizes and TT
20×20×2020\times 20\times 20 40×40×4040\times 40\times 40 60×60×6060\times 60\times 60
T=2.5105T=2.5\cdot 10^{-5} 2 2 2
T=5105T=5\cdot 10^{-5} 2 2 3
T=1104T=1\cdot 10^{-4} 3 3 3

3.3 3D nonlinear heat conduction

In the previous tests the ODE right hand side function has a linear part (namely, -A_{\mathrm{symm}}y(t) in the first test and -Ay(t) in the second one, see (30) and (36), respectively). To examine whether our nonlinear EBK solver is able to handle nonlinear problems with no linear part, we now consider the following nonlinear heat conduction problem, described in [51]: find the unknown u=u(x,y,z,t) which satisfies

ut=(k(x)(u)ux)x+(k(y)(u)uy)y+(k(z)(u)uz)z,(x,y,z)[0,1]3,0tT,u|x=0=u|x=1,u|y=0=900,u|y=1=300,uz|z=0=uz|z=1=0,u(x,y,z,0)=1800e60((x0.5)2+(y0.5)2+(z0.5)2),k(x)(u)=u/300,k(y)(u)=k(z)(u)=k(x)(u)/10.\begin{gathered}u_{t}=(k^{(x)}(u)u_{x})_{x}+(k^{(y)}(u)u_{y})_{y}+(k^{(z)}(u)u_{z})_{z},\\ (x,y,z)\in[0,1]^{3},\quad 0\leqslant t\leqslant T,\\ u\bigl{|}_{x=0}=u\bigl{|}_{x=1},\quad u\bigl{|}_{y=0}=900,\quad u\bigl{|}_{y=1}=300,\quad u_{z}\bigl{|}_{z=0}=u_{z}\bigl{|}_{z=1}=0,\\ u(x,y,z,0)=1800\,e^{-60((x-0.5)^{2}+(y-0.5)^{2}+(z-0.5)^{2})},\\ k^{(x)}(u)=u/300,\quad k^{(y)}(u)=k^{(z)}(u)=k^{(x)}(u)/10.\end{gathered} (42)

We discretize this initial boundary value problem on a regular grid of nx×ny×nzn_{x}\times n_{y}\times n_{z} nodes (xi,yj,zk)(x_{i},y_{j},z_{k}), xi=ihxx_{i}=ih_{x}, i=1,,nxi=1,\dots,n_{x}, yj=jhyy_{j}=jh_{y}, j=1,,nyj=1,\dots,n_{y}, zk=(k1/2)hzz_{k}=(k-1/2)h_{z}, k=1,,nzk=1,\dots,n_{z}, with grid widths hx,y=1/(nx,y+1)h_{x,y}=1/(n_{x,y}+1) and hz=1/nzh_{z}=1/n_{z}. The shifted grid nodes are taken for the zz direction to accommodate the Neumann boundary conditions. The standard second order finite difference approximation is employed, which reads, for the xx derivative,

(k(x)(u)ux)x|i,j,kk(x)(ui+1/2,j,k)(ui+1,j,kui,j,k)k(x)(ui1/2,j,k)(ui,j,kui1,j,k)hx2,(k^{(x)}(u)u_{x})_{x}\bigl{|}_{i,j,k}\approx\frac{k^{(x)}(u_{i+1/2,j,k})(u_{i+1,j,k}-u_{i,j,k})-k^{(x)}(u_{i-1/2,j,k})(u_{i,j,k}-u_{i-1,j,k})}{h_{x}^{2}},

where ui±1/2,j,k=(ui±1,j,k+ui,j,k)/2u_{i\pm 1/2,j,k}=(u_{i\pm 1,j,k}+u_{i,j,k})/2. The same finite difference approximations are applied in yy and zz directions. This space discretization of the initial boundary value problem (42) yields an IVP

y(t)=A(y(t))y(t)+g,y(0)=v,y^{\prime}(t)=-A(y(t))y(t)+g,\quad y(0)=v, (43)

where the entries of the vector function y(t)y(t) contain the numerical solution values and the constant in time vector gg contains the inhomogeneous boundary condition contributions. The waveform relaxation iteration is applied with the splitting (3) where we set, for y¯k=yk(T)\bar{y}_{k}=y_{k}(T),

\[
A_{k}=A(\bar{y}_{k}),\quad f_{k}(y)=A(\bar{y}_{k})y-A(y)y.
\]

In this relation, it is not difficult to recognize the same splitting as is used for the Burgers equation test (with $A_{\mathrm{symm}}=0$ and $A_{\mathrm{skew}}=A$).
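The outer waveform relaxation loop with this splitting can be sketched as follows (a simplified Python sketch: in the paper the linear inner problem is solved by the EBK method, whereas here plain implicit Euler serves as a stand-in; `A` is assumed given as a matrix-valued function and `g` as a constant vector):

```python
import numpy as np

def waveform_relaxation(A, g, v, T, n_t=100, n_iter=20, tol=1e-8):
    """Nonlinear waveform relaxation for y' = -A(y) y + g, y(0) = v.
    Each sweep freezes A at ybar_k = y_k(T), i.e. A_k = A(ybar_k), and
    solves the linear IVP  y' = -A_k y + g + f_k(y_k(t))  with
    f_k(y) = A_k y - A(y) y.  Implicit Euler replaces the EBK solver."""
    tau = T / n_t
    n = len(v)
    Y = np.tile(v, (n_t + 1, 1))            # across-time iterate y_k
    for _ in range(n_iter):
        Ak = A(Y[-1])                       # freeze at ybar_k = y_k(T)
        Minv = np.linalg.inv(np.eye(n) + tau * Ak)  # one factorization per sweep
        Ynew = np.empty_like(Y)
        Ynew[0] = v
        for j in range(1, n_t + 1):
            fk = Ak @ Y[j] - A(Y[j]) @ Y[j]  # nonlinear term at old iterate
            Ynew[j] = Minv @ (Ynew[j - 1] + tau * (g + fk))
        done = np.linalg.norm(Ynew - Y) <= tol * np.linalg.norm(Ynew)
        Y = Ynew
        if done:
            break
    return Y
```

Note that only one matrix factorization per sweep is needed, mirroring the small number of LU factorizations reported for the EBK solver below. For the scalar test $y'=-y^2$, $y(0)=1$ (i.e. $A(y)=\operatorname{diag}(y)$, $g=0$), the iteration converges to the implicit Euler solution, close to the exact value $y(1)=1/2$.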

The EBK inner-outer iterations are applied in exactly the same setting as in the previous test, with the same stopping criteria, see (39), (40), and the same number of samples $n_{s}$. As in the previous test, we compare the EBK solver with the ROS2 integrator. The results of the test runs are presented in Table 5. As we see, on the $40\times 40\times 40$ grid the two time integration schemes perform equally well in terms of CPU time; note, however, that the EBK solver requires far fewer LU factorizations. On the finer $60\times 60\times 60$ grid the EBK solver clearly outperforms ROS2, both in terms of CPU time and of required LU factorizations. When switching to the finer grid, to retain convergence of the EBK scheme we have to decrease the time interval $T$ from $T=0.01$ to $T=0.005$. Hence, we observe a deterioration of the EBK convergence in this test. There is also a moderate increase in the required number of nonlinear iterations: with $T=0.005$, EBK needs in total 73 and 94 iterations on the two grids, respectively.

Table 5: The 3D nonlinear heat conduction test. Attained error and computational work for our nonlinear EBK method and the ROS2 solver. For the EBK method the work is reported as the number of nonlinear iterations, the number of LU factorizations, their applications, and matvecs. For the ROS2 solver the work is shown as the number of time steps, LU factorizations, their applications, and fevals.
| method, time steps (t.s.), tolerance / $\tau$ | iter./steps | LUs (LU applic.), matvecs/fevals | relative error | CPU time, s |
|---|---|---|---|---|
| **grid $40\times 40\times 40$, $T=0.1$, $T\|A(y(T))\|_{1}\approx$ 2.19e+03** | | | | |
| nonlin. EBK ($m=6$), 10 t.s., 1e-02 | 52 | 52 (1223), 52 | 1.30e-04 | 387 |
| nonlin. EBK ($m=10$), 10 t.s., 1e-02 | 52 | 52 (1936), 52 | 5.91e-05 | 448 |
| ROS2, $\tau=T/250$ | 250 | 250 (500) | 7.26e-05 | 347 |
| ROS2, $\tau=T/500$ | 500 | 500 (1000) | 2.02e-05 | 821 |
| **grid $60\times 60\times 60$, $T=0.1$, $T\|A(y(T))\|_{1}\approx$ 4.96e+04** | | | | |
| nonlin. EBK ($m=6$), 20 t.s., 1e-02 | 94 | 94 (2068), 94 | 4.41e-05 | 8229 |
| ROS2, $\tau=T/500$ | 500 | 500 (1000) | 3.99e-05 | 19578 |

4 Conclusions

In this paper nonlinear waveform relaxation iterations and their block Krylov subspace implementation (the nonlinear EBK method) have been examined theoretically and practically. Theoretically, we have shown that convergence of the considered nonlinear waveform relaxation iteration is determined by the Lipschitz constant $L$ of the nonlinear term $f_{k}$ in (3), the logarithmic norm bound $\omega$ of the linear term matrix $A_{k}$, and the time interval length $T$, see relation (10). We have also established superlinear convergence of our nonlinear iteration, which is known to hold in the linear case, see Proposition 2. Furthermore, we have linked the residual convergence of the nonlinear iteration to the error convergence, see relation (20). Finally, it is shown that inexact solution of the linear initial-value problem arising at each waveform relaxation iteration leads to a process whose convergence is equivalent to that of the exact waveform relaxation iteration with an increased Lipschitz constant $L+l$, where $l$ is the inexactness tolerance.
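For the reader's convenience, the mechanism behind this superlinear behavior can be sketched by the classical waveform relaxation argument (a sketch in the notation above, not a reproduction of relation (10)): with the error $e_{k}=y_{k}-y$, the variation-of-constants formula yields

```latex
\[
\|e_{k+1}(t)\|\;\leqslant\; L\int_{0}^{t} e^{\omega(t-s)}\,\|e_{k}(s)\|\,\mathrm{d}s,
\qquad 0\leqslant t\leqslant T,
\]
and induction over $k$ then gives
\[
\max_{0\leqslant t\leqslant T}\|e_{k}(t)\|\;\leqslant\;
\frac{\bigl(L\,T\,e^{\max(\omega,0)\,T}\bigr)^{k}}{k!}\,
\max_{0\leqslant t\leqslant T}\|e_{0}(t)\|,
\]
```

so that for fixed $L$, $\omega$ and $T$ the factor $1/k!$ drives the superlinear decay, and shorter intervals $T$ improve the contraction.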

Practical performance of nonlinear waveform relaxation heavily depends on the ability to solve the arising linear initial-value problem at each relaxation iteration quickly and to handle the iterative across-time solutions efficiently. As we show here, both these goals can be successfully reached by employing the exponential block Krylov subspace method EBK [37]. In the presented experiments the EBK method is used in the shift-and-invert (SAI) mode, i.e., it has been assumed that the SAI linear systems (similar to those arising in implicit time stepping) can be solved efficiently.

Comparisons with implicit time integration methods demonstrated the superiority and potential of our EBK-SAI based nonlinear waveform relaxation iterations. The presented nonlinear residual stopping criterion has proven to be a reliable way to control convergence and to estimate accuracy. Moreover, as the first two numerical tests demonstrate, the grid-independent convergence of linear SAI Krylov subspace exponential time integrators (see [45, 46, 52]) can be inherited by the nonlinear waveform relaxation iterations.

As the numerical tests and our experience suggest, the presented method has its limitations. First, it is not intended for strongly nonlinear problems with rapidly changing solutions; the method clearly profits if the right-hand side function of the ODE system contains a linear part. Second, the linear systems arising in the presented nonlinear EBK method have a structure similar to the linear systems occurring in implicit time integrators with significantly increased time step sizes. Therefore, the overall efficiency of the method for a particular application heavily depends on the availability of efficient linear solvers. Third, the nonlinear EBK method is aimed at solving initial-value problems with moderate accuracy (of order $10^{-4}$ or $10^{-5}$), which is usually sufficient for solving PDE problems. The accuracy of the presented method can be limited by the built-in low-rank approximations and by the solution of large ill-conditioned linear systems. Finally, some tuning (such as a proper choice of the block size $m$ and of the time interval length) is required to apply the proposed approach successfully. Some of these issues can be addressed in more detail in future work.

Nevertheless, as the results in this paper show, the presented framework is promising and of interest for the numerical community. Recently, an alternative approach based on exponential Krylov subspace integrators has been presented in [53, 54] for nonlinear heat equations. In that problem a rapid change in the solution seems to exclude the use of waveform relaxation with block Krylov subspaces as presented here. Further research should be done to gain insight into when to switch between the block Krylov subspace approach discussed here and the approach of [53]. Finally, it seems useful to gain more insight into how these two approaches compare to other exponential time integrators for nonlinear initial-value problems [3, 55, 56].

Acknowledgments

The author thanks the anonymous reviewer for careful reading and for suggesting several valuable improvements. We acknowledge the use of the hybrid supercomputer K-100 facilities installed in the Supercomputer Center of Collective Usage at the Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences.

References

  • [1] W. Hundsdorfer, J.G. Verwer, Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations (Springer Verlag, 2003)
  • [2] H.C. Elman, D.J. Silvester, A.J. Wathen, Finite Elements and Fast Iterative Solvers with Applications in Incompressible Fluid Dynamics (Oxford University Press, Oxford, 2005)
  • [3] M. Hochbruck, A. Ostermann, Exponential integrators. Acta Numer. 19, 209–286 (2010). 10.1017/S0962492910000048
  • [4] P.N. Brown, Y. Saad, Convergence theory of nonlinear Newton–Krylov algorithms. SIAM J. Optimization 4(2), 297–330 (1994)
  • [5] R. Choquet, J. Erhel, Newton–GMRES algorithm applied to compressible flows. Int. J. Numer. Meth. Fluids 23, 177–190 (1996)
  • [6] D. Tromeur-Dervout, Y. Vassilevski, Choice of initial guess in iterative solution of series of systems arising in fluid flow simulations. Journal of Computational Physics 219(1), 210–227 (2006). https://doi.org/10.1016/j.jcp.2006.03.014
  • [7] N.N. Yanenko, The method of fractional steps. The solution of problems of mathematical physics in several variables (Springer-Verlag, New York, 1971), pp. viii+160
  • [8] E.G. D’yakonov, Difference systems of second order accuracy with a divided operator for parabolic equations without mixed derivatives. USSR Comput. Math. Math. Phys. 4(5), 206–216 (1964)
  • [9] P. Csomós, I. Faragó, A. Havasi, Weighted sequential splittings and their analysis. Comput. Math. Appl. pp. 1017–1031 (2005)
  • [10] J.G. Verwer, E.J. Spee, J.G. Blom, W. Hundsdorfer, A second order Rosenbrock method applied to photochemical dispersion problems. SIAM J. Sci. Comput. 20, 456–480 (1999)
  • [11] P.J. van der Houwen, B.P. Sommeijer, On the internal stability of explicit $m$-stage Runge–Kutta methods for large values of $m$. Z. Angew. Math. Mech. 60, 479–485 (1980)
  • [12] V.O. Lokutsievskii, O.V. Lokutsievskii, On numerical solution of boundary value problems for equations of parabolic type. Soviet Math. Dokl. 34(3), 512–516 (1987)
  • [13] B.P. Sommeijer, L.F. Shampine, J.G. Verwer, RKC: An explicit solver for parabolic PDEs. J. Comput. Appl. Math. 88, 315–326 (1998)
  • [14] H. Tal-Ezer, Spectral methods in time for parabolic problems. SIAM J. Numer. Anal. 26(1), 1–11 (1989)
  • [15] V.I. Lebedev, Explicit difference schemes for solving stiff systems of ODEs and PDEs with complex spectrum. Russian J. Numer. Anal. Math. Modelling 13(2), 107–116 (1998). 10.1515/rnam.1998.13.2.107
  • [16] J.G. Verwer, B.P. Sommeijer, W. Hundsdorfer, RKC time-stepping for advection–diffusion–reaction problems. Journal of Computational Physics 201(1), 61–79 (2004). https://doi.org/10.1016/j.jcp.2004.05.002
  • [17] M.A. Botchev, G.L.G. Sleijpen, H.A. van der Vorst, Stability control for approximate implicit time stepping schemes with minimum residual iterations. Appl. Numer. Math. 31(3), 239–253 (1999). https://doi.org/10.1016/S0168-9274(98)00138-X
  • [18] M.A. Botchev, H.A. van der Vorst, A parallel nearly implicit scheme. Journal of Computational and Applied Mathematics 137, 229–243 (2001). https://doi.org/10.1016/S0377-0427(01)00358-2
  • [19] V.T. Zhukov, Explicit methods of numerical integration for parabolic equations. Mathematical Models and Computer Simulations 3(3), 311–332 (2011). https://doi.org/10.1134/S2070048211030136
  • [20] E. Lelarasmee, A.E. Ruehli, A.L. Sangiovanni-Vincentelli, The waveform relaxation method for time-domain analysis of large scale integrated circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 1(3), 131–145 (1982). 10.1109/TCAD.1982.1270004
  • [21] A.R. Newton, A.L. Sangiovanni-Vincentelli, Relaxation-based electrical simulation. IEEE Transactions on Electron Devices 30(9), 1184–1207 (1983). 10.1109/T-ED.1983.21275
  • [22] S. Vandewalle, Parallel Multigrid Waveform Relaxation for Parabolic Problems (Teubner, Stuttgart, 1993)
  • [23] U. Miekkala, O. Nevanlinna, Iterative solution of systems of linear differential equations. Acta Numerica 5, 259–307 (1996). https://doi.org/10.1017/S096249290000266X
  • [24] J. White, F. Odeh, A.L. Sangiovanni-Vincentelli, A. Ruehli, Waveform relaxation: Theory and practice. Tech. Rep. UCB/ERL M85/65, EECS Department, University of California, Berkeley (1985). www.eecs.berkeley.edu/Pubs/TechRpts/1985/543.html
  • [25] U. Miekkala, O. Nevanlinna, Convergence of dynamic iteration methods for initial value problems. SIAM Journal on Scientific and Statistical Computing 8(4), 459–482 (1987). https://doi.org/10.1137/0908046
  • [26] C. Lubich, A. Ostermann, Multi-grid dynamic iteration for parabolic equations. BIT Numerical Mathematics 27, 216–234 (1987). https://doi.org/10.1007/BF01934186
  • [27] J. Janssen, S. Vandewalle, Multigrid waveform relaxation of spatial finite element meshes: The continuous-time case. SIAM J. Numer. Anal. 33(2), 456–474 (1996). 10.1137/0733024
  • [28] M.J. Gander, S. Güttel, PARAEXP: A parallel integrator for linear initial-value problems. SIAM Journal on Scientific Computing 35(2), C123–C142 (2013)
  • [29] G.L. Kooij, M.A. Botchev, B.J. Geurts, A block Krylov subspace implementation of the time-parallel Paraexp method and its extension for nonlinear partial differential equations. Journal of Computational and Applied Mathematics 316(Supplement C), 229–246 (2017). https://doi.org/10.1016/j.cam.2016.09.036
  • [30] G. Kooij, M.A. Botchev, B.J. Geurts, An exponential time integrator for the incompressible Navier–Stokes equation. SIAM J. Sci. Comput. 40(3), B684–B705 (2018). https://doi.org/10.1137/17M1121950
  • [31] T.J. Park, J.C. Light, Unitary quantum time evolution by iterative Lanczos reduction. J. Chem. Phys. 85, 5870–5876 (1986)
  • [32] H.A. van der Vorst, An iterative solution method for solving $f(A)x=b$, using Krylov subspace information obtained for the symmetric positive definite matrix $A$. J. Comput. Appl. Math. 18, 249–263 (1987)
  • [33] V.L. Druskin, L.A. Knizhnerman, Two polynomial methods of calculating functions of symmetric matrices. U.S.S.R. Comput. Maths. Math. Phys. 29(6), 112–121 (1989)
  • [34] E. Gallopoulos, Y. Saad, Efficient solution of parabolic equations by Krylov approximation methods. SIAM J. Sci. Statist. Comput. 13(5), 1236–1264 (1992). http://doi.org/10.1137/0913071
  • [35] V.L. Druskin, L.A. Knizhnerman, Krylov subspace approximations of eigenpairs and matrix functions in exact and computer arithmetic. Numer. Lin. Alg. Appl. 2, 205–217 (1995)
  • [36] M. Hochbruck, C. Lubich, On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 34(5), 1911–1925 (1997)
  • [37] M.A. Botchev, A block Krylov subspace time-exact solution method for linear ordinary differential equation systems. Numer. Linear Algebra Appl. 20(4), 557–574 (2013). http://doi.org/10.1002/nla.1865
  • [38] O. Nevanlinna, F. Odeh, Remarks on the convergence of waveform relaxation method. Numerical functional analysis and optimization 9(3-4), 435–445 (1987). https://doi.org/10.1080/01630568708816241
  • [39] K. Dekker, J.G. Verwer, Stability of Runge–Kutta methods for stiff non-linear differential equations (North-Holland Elsevier Science Publishers, 1984)
  • [40] M.A. Botchev, V. Grimm, M. Hochbruck, Residual, restarting and Richardson iteration for the matrix exponential. SIAM J. Sci. Comput. 35(3), A1376–A1397 (2013). http://doi.org/10.1137/110820191
  • [41] M.A. Botchev, I.V. Oseledets, E.E. Tyrtyshnikov, Iterative across-time solution of linear differential equations: Krylov subspace versus waveform relaxation. Computers & Mathematics with Applications 67(12), 2088–2098 (2014). http://doi.org/10.1016/j.camwa.2014.03.002
  • [42] E. Hairer, G. Wanner, Solving Ordinary Differential Equations II. Stiff and Differential–Algebraic Problems, 2nd edn. Springer Series in Computational Mathematics 14 (Springer–Verlag, 1996)
  • [43] L.A. Krukier, Implicit difference schemes and an iterative method for solving them for a certain class of systems of quasi-linear equations. Sov. Math. 23(7), 43–55 (1979). Translation from Izv. Vyssh. Uchebn. Zaved., Mat. 1979, No. 7(206), 41–52 (1979)
  • [44] R.W.C.P. Verstappen, A.E.P. Veldman, Symmetry-preserving discretization of turbulent flow. J. Comput. Phys. 187(1), 343–368 (2003). http://doi.org/10.1016/S0021-9991(03)00126-8
  • [45] I. Moret, P. Novati, RD rational approximations of the matrix exponential. BIT 44, 595–615 (2004)
  • [46] J. van den Eshof, M. Hochbruck, Preconditioning Lanczos approximations to the matrix exponential. SIAM J. Sci. Comput. 27(4), 1438–1457 (2006)
  • [47] L.F. Shampine, M.W. Reichelt, The MATLAB ODE suite. SIAM J. Sci. Comput. 18(1), 1–22 (1997). Available at www.mathworks.com
  • [48] J. Jacobsen, K. Schmitt, The Liouville–Bratu–Gelfand problem for radial operators. Journal of Differential Equations 184, 283–298 (2002). https://doi.org/10.1006/jdeq.2001.4151
  • [49] T.A. Davis, A column pre-ordering strategy for the unsymmetric-pattern multifrontal method. ACM Trans. Math. Software 30(2), 167–195 (2004). 10.1145/992200.992205
  • [50] T.A. Davis, Algorithm 832: UMFPACK V4.3—an unsymmetric-pattern multifrontal method. ACM Trans. Math. Software 30(2), 196–199 (2004). 10.1145/992200.992206
  • [51] V.T. Zhukov, N. Novikova, O.B. Feodoritova, On direct solving conjugate heat transfer of gas mixture and solid body. KIAM Preprint 2023(12) (2023). https://doi.org/10.20948/prepr-2023-12
  • [52] V. Grimm, Resolvent Krylov subspace approximation to operator functions. BIT 52(3), 639–659 (2012). 10.1007/s10543-011-0367-8
  • [53] M.A. Botchev, V.T. Zhukov, Exponential Euler and backward Euler methods for nonlinear heat conduction problems. Lobachevskii Journal of Mathematics 44(1), 10–19 (2023). https://doi.org/10.1134/S1995080223010067
  • [54] M.A. Botchev, V.T. Zhukov, Adaptive iterative explicit time integration for nonlinear heat conduction problems. Lobachevskii Journal of Mathematics 45(1), 12–20 (2024). https://doi.org/10.1134/S1995080224010086
  • [55] D. Hipp, M. Hochbruck, A. Ostermann, An exponential integrator for non-autonomous parabolic problems. ETNA 41, 497–511 (2014). http://etna.mcs.kent.edu/volumes/2011-2020/vol41/
  • [56] E. Hansen, A. Ostermann, High-order splitting schemes for semilinear evolution equations. BIT Numerical Mathematics 56(4), 1303–1316 (2016)