This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: Vu Thai Luan 22institutetext: Department of Mathematics and Statistics, Mississippi State University
410 Allen Hall, 175 President’s Circle, Mississippi State, MS, 39762
22email: [email protected]

Efficient exponential Runge–Kutta methods of high order: construction and implementation thanks: This work has been supported in part by National Science Foundation through award NSF DMS–2012022.

Vu Thai Luan

Dedicated to Professor Alexander Ostermann on the occasion of his 60th birthday.
Abstract

Exponential Runge–Kutta methods have shown to be competitive for the time integration of stiff semilinear parabolic PDEs. The current construction of stiffly accurate exponential Runge–Kutta methods, however, relies on a convergence result that requires weakening many of the order conditions, resulting in schemes whose stages must be implemented in a sequential way. In this work, after showing a stronger convergence result, we are able to derive two new families of fourth- and fifth-order exponential Runge–Kutta methods, which, in contrast to the existing methods, have multiple stages that are independent of one another and share the same format, thereby allowing them to be implemented in parallel or simultaneously, and making the methods to behave like using with much less stages. Moreover, all of their stages involve only one linear combination of the product of φ\varphi-functions (using the same argument) with vectors. Overall, these features make these new methods to be much more efficient to implement when compared to the existing methods of the same orders. Numerical experiments on a one-dimensional semilinear parabolic problem, a nonlinear Schrödinger equation, and a two-dimensional Gray–Scott model are given to confirm the accuracy and efficiency of the two newly constructed methods.

Keywords:
Exponential Runge–Kutta methods Exponential integrators Stiff PDEs Efficient implementation
MSC:
MSC 65L04 MSC 65M06 MSC 65N12
journal: BIT

1 Introduction

In this paper, we are concerned with the construction and implementation of new efficient exponential Runge–Kutta integrators for solving stiff parabolic PDEs. These PDEs, upon their spatial discretizations, can be cast in the form of semilinear problems

u(t)=Au(t)+g(t,u(t))=F(t,u(t)),u(t0)=u0,u^{\prime}(t)=Au(t)+g(t,u(t))=F(t,u(t)),\qquad u(t_{0})=u_{0}, (1)

where the linear part AuAu usually causes stiffness. The nonlinearity g(t,u)g(t,u) is assumed to satisfy a local Lipschitz condition in a strip along the exact solution.

Exponential Runge–Kutta methods are a popular class of exponential integrators HO10 , which have shown a great promise as an alternative to standard time integration solvers for stiff systems and applications in recent years, see e.g. HO05 ; LO12b ; LO14b ; LO13 ; LO14a ; Luan2014 ; LO16 ; L17 ; Michels2017 ; Ju2017 ; LD18 ; Luan18 ; Pieper2019a . The main idea behind these methods is to solve the linear portion of (1) exactly and integrate the remaining nonlinear portion explicitly based on a representation of the exact solution using the variation-of-constants formula.

A ss-stage explicit exponential Runge–Kutta (expRK) method HO05 applied to (1) can be reformulated (see LO12b ; LO14b ) as

Uni\displaystyle U_{ni} =un+cihφ1(cihA)F(tn,un)+hj=2i1aij(hA)Dnj, 2is,\displaystyle=u_{n}+c_{i}h\varphi_{1}(c_{i}hA)F(t_{n},u_{n})+h\sum_{j=2}^{i-1}a_{ij}(hA)D_{nj},\ 2\leq i\leq s, (2a)
un+1\displaystyle u_{n+1} =un+hφ1(hA)F(tn,un)+hi=2sbi(hA)Dni,\displaystyle=u_{n}+h\varphi_{1}(hA)F(t_{n},u_{n})+h\sum_{i=2}^{s}b_{i}(hA)D_{ni}, (2b)

where

Dni=g(tn+cih,Uni)g(tn,un),2is.D_{ni}=g(t_{n}+c_{i}h,U_{ni})-g(t_{n},u_{n}),\qquad 2\leq i\leq s. (3)

Here, UniU_{ni} denote the internal stages that approximate u(tn+cih)u(t_{n}+c_{i}h) using the time step size h=tn+1tn>0h=t_{n+1}-t_{n}>0 and nodes cic_{i}. By construction, the coefficients aij(z)a_{ij}(z) and bi(z)b_{i}(z) are usually linear combinations of the entire functions

φk(z)=01e(1θ)zθk1(k1)!dθ,k1\varphi_{k}(z)=\int_{0}^{1}{\rm e}\hskip 1.0pt^{(1-\theta)z}\frac{\theta^{k-1}}{(k-1)!}\,\text{d}\theta,\quad k\geq 1 (4)

and their scaled versions φk(ciz)\varphi_{k}(c_{i}z).

A common approach that has been used to determine the unknown matrix functions aij(hA)a_{ij}(hA) and bi(hA)b_{i}(hA) is to expand them as aij(hA)=k0αij(k)(hA)ka_{ij}(hA)=\sum_{k\geq 0}\alpha^{(k)}_{ij}(hA)^{k},  bi(hA)=k0βi(k)(hA)kb_{i}(hA)=\sum_{k\geq 0}\beta^{(k)}_{i}(hA)^{k} (e.g. using classical Taylor series expansions) to obtain order conditions. Clearly, the boundedness of the remainder terms of these expansions (and thus the error terms) are dependent of A\|A\|. Due to stability reasons, such resulting methods might not be suitable for integrating stiff PDEs, which AA typically has a large norm or is even unbounded operator. These methods are thus usually referred as classical (non-stiffly accurate) expRK methods. Unlike this approach, in a seminal contribution HO05 , Hochbruck and Ostermann derived a new error expansion with the remainder terms that are bounded independently of the stiffness (i.e. not involving the powers of AA), leading to stiff order conditions, which give rise to the construction of stiffly accurate expRK methods of orders up to four. Following this, in LO13 Luan and Ostermann developed a systematic theory of deriving stiff order conditions for expRK methods of arbitrary order, thereby allowing the construction of a fifth-order method in LO14b .

In view of the existing stiffly accurate expRK methods in the literature, we observe that they were derived based on a convergence result that requires weakening many of the stiff order conditions (in order to minimize the number of required stages ss and matrix functions used in each internal stages UniU_{ni}). As a result, their structures contain internal stages UniU_{ni} that are dependent of the preceding stages, implying that such methods must be implemented by computing each of these stages sequentially. Also, the very last stages usually involve several different linear combinations of φk(cihA)\varphi_{k}(c_{i}hA)-functions (using different nodes cic_{i} in their arguments) acting on different sets of vectors. This would introduce additional computational effort for these stages. For more details, we refer to Sections 2 and 5.

Motivated by the observations above, in this work we show a stronger convergence result for expRK methods up to order five which requires weakening only one order condition (thereby could improve the stability and accuracy) and offers more degree of freedoms in solving order conditions. Using this result and inspired by our recent algorithm, 𝚙𝚑𝚒𝚙𝚖_𝚜𝚒𝚖𝚞𝚕_𝚒𝚘𝚖\mathtt{phipm\_simul\_iom}, proposed in Luan18 (which allows one to simultaneously compute multiple linear combinations of φ\varphi- functions acting on a same set of vectors), we construct new methods of orders 4 and 5 which involve only one linear combination of φ\varphi- functions for each stage and have multiple internal stages UniU_{ni} that are independent of one another, thereby allowing them to be computed in parallel. Furthermore, one can derive these independent stages in a way that they share the same form of linear combination of φk(cihA)\varphi_{k}(c_{i}hA)- functions acting on the same set of vectors, allowing them to be implemented simultaneously (by one evaluation). While these independent states can be computed in parallel (as mentioned above) by any algorithm which approximates the action of (the linear combination of) φ\varphi- functions, we note that the possibility to compute them simultaneously is a new feature that can be used with our algorithm 𝚙𝚑𝚒𝚙𝚖_𝚜𝚒𝚖𝚞𝚕_𝚒𝚘𝚖\mathtt{phipm\_simul\_iom} (other algorithms, e.g., that do not require the construction of Krylov subspaces, might not support computing these stages simultaneously). Overall, this makes the new methods to behave like methods using much less number of stages (even when compared to the existing methods of the same orders), meaning that they require much less number of evaluations for linear combinations of φ\varphi- functions, and are thus more efficient.

The paper is organized as follows. In Section 2, we describe our motivation, propose new ideas, and review the existing expRK methods in the literature with respect to these ideas. Following this, in Section 3 we prove a stronger convergence result (Theorem 3.1) for expRK methods, which requires relaxing only one order condition. This allows us to construct more efficient methods in Section 4. In particular, we are able to derive two new families of fourth- and fifth- order stiffly accurate expRK methods called 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} (4th-order 6-stage but requires 4 independent stage evaluation only) and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} (5th-order 10-stage but requires 5 independent stage evaluation only), respectively. In Section 5, we present details implementation of these two new methods, as well as the existing stiffly accurate expRK schemes of the same orders (for comparison). In the last section, numerical examples including one and two-dimensional stiff PDEs are presented to demonstrate the accuracy and efficiency of the two newly constructed expRK integrators.

2 Motivation and existing methods

In this section, we start our motivation by taking a closer look at an efficient way for implementing expRK methods (2). Then, we propose some ideas to derive more efficient methods with respect to this efficient implementation along with reviewing the current methods.

2.1 An efficient way of implementation

Clearly, each stage (UniU_{ni} or un+1u_{n+1}) of (2) requires computing matrix functions of the form φk(cihA)vk\varphi_{k}(c_{i}hA)v_{k} (0<ci10<c_{i}\leq 1), where vkv_{k} is some vector (could be F(tn,un),DniF(t_{n},u_{n}),D_{ni} or a linear combination of these). Thanks to recent developments AH11 ; NW12 ; CKOS16 ; GaudreaultPudykiewicz16 , one can efficiently compute a linear combination of φ\varphi-functions acting on a set of input vectors V0,,VqV_{0},\ldots,V_{q}

φ0(M)V0+φ1(M)V1+φ2(M)V2++φq(M)Vq,\varphi_{0}(M)V_{0}+\varphi_{1}(M)V_{1}+\varphi_{2}(M)V_{2}+\cdots+\varphi_{q}(M)V_{q}, (5)

where MM is some square matrix. This is crucial when implementing exponential integrators. Very recently, in Luan18 , we were able to improve the implementations presented in NW12 ; GaudreaultPudykiewicz16 , resulting in the routine 𝚙𝚑𝚒𝚙𝚖_𝚜𝚒𝚖𝚞𝚕_𝚒𝚘𝚖\mathtt{phipm\_simul\_iom}. The underlying method in this algorithm is the use of an adaptive time-stepping technique combined with Krylov subspace methods, which allows us to simultaneously compute multiple linear combinations of type (5) using different scaling factors ρ1,,ρr\rho_{1},\cdots,\rho_{r} of MM, i.e.,

φ0(ρ1M)V0+φ1(ρ1M)V1\displaystyle\varphi_{0}(\rho_{1}M)V_{0}+\varphi_{1}(\rho_{1}M)V_{1} +φ2(ρ1M)V2++φN(ρ1M)Vq,\displaystyle+\varphi_{2}(\rho_{1}M)V_{2}+\cdots+\varphi_{N}(\rho_{1}M)V_{q}, (6)
\displaystyle\vdots
φ0(ρrM)V0+φ1(ρrM)V1\displaystyle\varphi_{0}(\rho_{r}M)V_{0}+\varphi_{1}(\rho_{r}M)V_{1} +φ2(ρrM)V2++φN(ρrM)Vq.\displaystyle+\varphi_{2}(\rho_{r}M)V_{2}+\cdots+\varphi_{N}(\rho_{r}M)V_{q}.

Now taking M=hAM=hA and considering ρk\rho_{k} (1kr1\leq k\leq r) as nodes cic_{i} used in expRK methods immediately suggests that one can compute the following (s1s-1) linear linear combinations

φ1(cihA)V1+φ2(cihA)V2++φN(cihA)Vq,2is\varphi_{1}(c_{i}hA)V_{1}+\varphi_{2}(c_{i}hA)V_{2}+\ldots+\varphi_{N}(c_{i}hA)V_{q},\quad 2\leq i\leq s (7)

simultaneously by using only one evaluation (i.e., one call to 𝚙𝚑𝚒𝚙𝚖_𝚜𝚒𝚖𝚞𝚕_𝚒𝚘𝚖\mathtt{phipm\_simul\_iom}). Note that this requires the use of a same set of vectors [V1,,Vq][V_{1},\ldots,V_{q}] for all the linear combinations in (7).

Motivated by this, we see that if a ss-stage expRK scheme (2) is constructed in such a way that each internal stage UniU_{ni} has the form

Uni=un+φ1(cihA)V1i+φ2(cihA)V2i++φN(cihA)Vqi,U_{ni}=u_{n}+\varphi_{1}(c_{i}hA)V_{1i}+\varphi_{2}(c_{i}hA)V_{2i}+\ldots+\varphi_{N}(c_{i}hA)V_{qi}, (8)

which includes only one linear combination of φ\varphi- functions using exactly node cic_{i} as an argument in all φk\varphi_{k} functions, then the scheme will contain a total of ss such linear combinations (s1s-1 for UniU_{ni} and 1 for un+1u_{n+1} as (2b) can be always written in the form of (8) with ci=1c_{i}=1), thereby requiring ss evaluations only. Furthermore, since the sets of vectors [V1i,V2i,,Vqi][V_{1i},V_{2i},\cdots,V_{qi}] in (8) are usually different for each UniU_{ni}, (7) also suggests that the efficiency will be significantly increased if one could build such stages (or a group of) UniU_{ni} of the form (8) that share the same format (i.e., having the same set of acting vectors V1iV1,,VqiVqV_{1i}\equiv V_{1},\ldots,V_{qi}\equiv V_{q}) or that are independent of one another. As this allows to compute such stages simultaneously by one evaluation or to implement them in parallel similarly to our construction of parallel exponential Rosenbrock methods LO16 ), it certainly reduces the total number of required evaluations and thus speedups the computing time.

With respect to these observations, we now review the existing expRK schemes in the literature. Since our focus is on stiff problems, we will discuss only on stiffly accurate expRK methods, meaning that they satisfy the stiff order conditions (see Section 3 below).

2.2 Existing schemes and remarks

In HO05 , expRK methods of orders up to four have been derived. For later reference, we name the second-order, the third-order, and the fourth-order methods in that work as 𝚎𝚡𝚙𝚁𝙺𝟸𝚜𝟸\mathtt{expRK2s2}, 𝚎𝚡𝚙𝚁𝙺𝟹𝚜𝟹\mathtt{expRK3s3}, and 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, respectively. In LO14b , we have constructed an expRK method of order five called 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}. To discuss all of these schemes in terms of the implementation, we rewrite their internal stages UniU_{ni} and un+1u_{n+1} as linear combinations of φ\varphi- functions like (8) and display them as follows (Note that since the first-order method, the exponential Euler scheme un+1=un+φ1(hA)hF(tn,un),u_{n+1}=u_{n}+\varphi_{1}(hA)hF(t_{n},u_{n}), has no internal stage, we do not consider it here).

𝚎𝚡𝚙𝚁𝙺𝟸𝚜𝟸\mathtt{expRK2s2}:

Un2=un\displaystyle U_{n2}=u_{n} +φ1(c2hA)c2hF(tn,un),\displaystyle+\varphi_{1}(c_{2}hA)c_{2}hF(t_{n},u_{n}), (9)
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)1c2hDn2.\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)\tfrac{1}{c_{2}}hD_{n2}.

𝚎𝚡𝚙𝚁𝙺𝟹𝚜𝟹\mathtt{expRK3s3} (a representative with c223c_{2}\neq\tfrac{2}{3}):

Un2=un\displaystyle U_{n2}=u_{n} +φ1(c2hA)c2hF(tn,un),\displaystyle+\varphi_{1}(c_{2}hA)c_{2}hF(t_{n},u_{n}), (10)
Un3=un\displaystyle U_{n3}=u_{n} +φ1(23hA)23hF(tn,un)+φ2(23hA)49c2hDn2,\displaystyle+\varphi_{1}(\tfrac{2}{3}hA)\tfrac{2}{3}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{2}{3}hA)\tfrac{4}{9c_{2}}hD_{n2},
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)32hDn2.\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)\tfrac{3}{2}hD_{n2}.

𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} (the only existing fourth-order stiffly accurate expRK method constructed by Hochbruck and Ostermann HO05 ):

Un2=un\displaystyle U_{n2}=u_{n} +φ1(12hA)12hF(tn,un),\displaystyle+\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n}), (11)
Un3=un\displaystyle U_{n3}=u_{n} +φ1(12hA)12hF(tn,un)+φ2(12hA)hDn2,\displaystyle+\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{2}hA)hD_{n2},
Un4=un\displaystyle U_{n4}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)h(Dn2+Dn3),\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)h(D_{n2}+D_{n3}),
Un5=un\displaystyle U_{n5}=u_{n} +[φ1(12hA)12hF(tn,un)+φ2(12hA)14h(2Dn2+2Dn3Dn4)\displaystyle+[\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{2}hA)\tfrac{1}{4}h(2D_{n2}+2D_{n3}-D_{n4})
+φ3(12hA)12h(Dn2Dn3+Dn4)]+[φ2(hA)14h(Dn2+Dn3Dn4)\displaystyle+\varphi_{3}(\tfrac{1}{2}hA)\tfrac{1}{2}h(-D_{n2}-D_{n3}+D_{n4})]+[\varphi_{2}(hA)\tfrac{1}{4}h(D_{n2}+D_{n3}-D_{n4})
+φ3(hA)h(Dn2Dn3+Dn4)],\displaystyle+\varphi_{3}(hA)h(-D_{n2}-D_{n3}+D_{n4})],
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)h(Dn4+4Dn5)+φ3(hA)h(4Dn48Dn5).\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)h(-D_{n4}+4D_{n5})+\varphi_{3}(hA)h(4D_{n4}-8D_{n5}).

𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} (the only existing fifth-order stiffly accurate expRK method constructed by Luan and Ostermann LO14b ):

Un2=un\displaystyle U_{n2}=u_{n} +φ1(12hA)12hF(tn,un),\displaystyle+\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n}),
Un3=un\displaystyle U_{n3}=u_{n} +φ1(12hA)12hF(tn,un)+φ2(12hA)12hDn2,\displaystyle+\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{2}hA)\tfrac{1}{2}hD_{n2},
Un4=un\displaystyle U_{n4}=u_{n} +φ1(14hA)14hF(tn,un)+φ2(14hA)18hDn3,\displaystyle+\varphi_{1}(\tfrac{1}{4}hA)\tfrac{1}{4}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{4}hA)\tfrac{1}{8}hD_{n3},
Un5=un\displaystyle U_{n5}=u_{n} +φ1(12hA)12hF(tn,un)+φ2(12hA)12h(Dn3+4Dn4)+φ3(12hA)h(2Dn34Dn4)\displaystyle+\varphi_{1}(\tfrac{1}{2}hA)\tfrac{1}{2}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{2}hA)\tfrac{1}{2}h(-D_{n3}+4D_{n4})+\varphi_{3}(\tfrac{1}{2}hA)h(2D_{n3}-4D_{n4})
Un6=un\displaystyle U_{n6}=u_{n} +φ1(15hA)15hF(tn,un)+φ2(15hA)125h(8Dn42Dn5)+φ3(15hA)1125h(32Dn4+16Dn5),\displaystyle+\varphi_{1}(\tfrac{1}{5}hA)\tfrac{1}{5}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{1}{5}hA)\tfrac{1}{25}h(8D_{n4}-2D_{n5})+\varphi_{3}(\tfrac{1}{5}hA)\tfrac{1}{125}h(-32D_{n4}+16D_{n5}),
Un7=un\displaystyle U_{n7}=u_{n} +[φ1(23hA)23hF(tn,un)+φ2(23hA)h(1627Dn5+10027Dn6)+φ3(23hA)h(32081Dn580081Dn6)]\displaystyle+[\varphi_{1}(\tfrac{2}{3}hA)\tfrac{2}{3}hF(t_{n},u_{n})+\varphi_{2}(\tfrac{2}{3}hA)h(\tfrac{-16}{27}D_{n5}+\tfrac{100}{27}D_{n6})+\varphi_{3}(\tfrac{2}{3}hA)h(\tfrac{320}{81}D_{n5}-\tfrac{800}{81}D_{n6})]
+[φ2(15hA)h(2081Dn4+5243Dn5+125486Dn6)+φ3(15hA)h(1681Dn44243Dn550243Dn6)],\displaystyle+[\varphi_{2}(\tfrac{1}{5}hA)h(\tfrac{-20}{81}D_{n4}+\tfrac{5}{243}D_{n5}+\tfrac{125}{486}D_{n6})+\varphi_{3}(\tfrac{1}{5}hA)h(\tfrac{16}{81}D_{n4}-\tfrac{4}{243}D_{n5}-\tfrac{50}{243}D_{n6})],
Un8=un\displaystyle U_{n8}=u_{n} +[φ1(hA)hF(tn,un)+φ2(hA)h(163Dn5+25021Dn6+2714Dn7)\displaystyle+\big{[}\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)h(\tfrac{-16}{3}D_{n5}+\tfrac{250}{21}D_{n6}+\tfrac{27}{14}D_{n7})
+φ3(hA)h(2083Dn52503Dn627Dn7)+φ4(hA)h(240Dn5+15007Dn6+8107Dn7)]\displaystyle+\varphi_{3}(hA)h(\tfrac{208}{3}D_{n5}-\tfrac{250}{3}D_{n6}-27D_{n7})+\varphi_{4}(hA)h(-240D_{n5}+\tfrac{1500}{7}D_{n6}+\tfrac{810}{7}D_{n7})\big{]}
+[φ2(15hA)h(47Dn5+2549Dn6+2798Dn7)+φ3(15hA)h(85Dn5107Dn62735Dn7)\displaystyle+\big{[}\varphi_{2}(\tfrac{1}{5}hA)h(\tfrac{-4}{7}D_{n5}+\tfrac{25}{49}D_{n6}+\tfrac{27}{98}D_{n7})+\varphi_{3}(\tfrac{1}{5}hA)h(\tfrac{8}{5}D_{n5}-\tfrac{10}{7}D_{n6}-\tfrac{27}{35}D_{n7})
+φ4(15hA)h(4835Dn5+6049Dn6+162245Dn7)]\displaystyle+\varphi_{4}(\tfrac{1}{5}hA)h(\tfrac{-48}{35}D_{n5}+\tfrac{60}{49}D_{n6}+\tfrac{162}{245}D_{n7})\big{]}
+[φ2(23hA)h(28835Dn5+36049Dn6+972245Dn7)+φ3(23hA)h(3845Dn54807Dn6129635Dn7)\displaystyle+\big{[}\varphi_{2}(\tfrac{2}{3}hA)h(\tfrac{-288}{35}D_{n5}+\tfrac{360}{49}D_{n6}+\tfrac{972}{245}D_{n7})+\varphi_{3}(\tfrac{2}{3}hA)h(\tfrac{384}{5}D_{n5}-\tfrac{480}{7}D_{n6}-\tfrac{1296}{35}D_{n7})
+φ4(23hA)h(15367Dn5+960049Dn6+518449Dn7)],\displaystyle+\varphi_{4}(\tfrac{2}{3}hA)h(\tfrac{-1536}{7}D_{n5}+\tfrac{9600}{49}D_{n6}+\tfrac{5184}{49}D_{n7})\big{]},
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)h(12514Dn62714Dn7+12Dn8)\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)h(\tfrac{125}{14}D_{n6}-\tfrac{27}{14}D_{n7}+\tfrac{1}{2}D_{n8})
+φ3(hA)h(62514Dn6+1627Dn7132Dn8)+φ4(hA)h(112514Dn64057Dn7+452Dn8).\displaystyle+\varphi_{3}(hA)h(\tfrac{-625}{14}D_{n6}+\tfrac{162}{7}D_{n7}-\tfrac{13}{2}D_{n8})+\varphi_{4}(hA)h(\tfrac{1125}{14}D_{n6}-\tfrac{405}{7}D_{n7}+\tfrac{45}{2}D_{n8}).
Remark 1

In view of the structures of these schemes, one can see that only the second- and third-oder schemes (𝚎𝚡𝚙𝚁𝙺𝟸𝚜𝟸\mathtt{expRK2s2}, 𝚎𝚡𝚙𝚁𝙺𝟹𝚜𝟹\mathtt{expRK3s3}) have all UniU_{ni} in the form (8). While 𝚎𝚡𝚙𝚁𝙺𝟸𝚜𝟸\mathtt{expRK2s2} requires one internal stage Un2U_{n2}, 𝚎𝚡𝚙𝚁𝙺𝟹𝚜𝟹\mathtt{expRK3s3} needs two internal stages with Un3U_{n3} depends on Un2U_{n2}, making these stages cannot be computed simultaneously. As for 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, to the best of our knowledge, this 5-stage scheme is the only existing fourth-order stiffly accurate expRK method. As seen, among its internal stages the three internal stages Un2U_{n2}, Un3U_{n3}, and Un4U_{n4} are of the form (8) but again their corresponding sets of vectors [Vki][V_{ki}] are not the same ([Vk2]=[12hF(tn,un)],[Vk3]=[12hF(tn,un),hDn2],[Vk4]=[hF(tn,un),h(Dn2+Dn3)][V_{k2}]=[\tfrac{1}{2}hF(t_{n},u_{n})],[V_{k3}]=[\tfrac{1}{2}hF(t_{n},u_{n}),hD_{n2}],[V_{k4}]=[hF(t_{n},u_{n}),h(D_{n2}+D_{n3})]), and because of (3), they are not independent of one another (Un4U_{n4} and Un3U_{n3} depend on their preceding stages). Therefore one needs 3 sequential evaluations for computing these three stages. Also, we note that the last internal stage Un5U_{n5} depends on all of its preceding stages and involves two different linear combinations of φk\varphi_{k}- functions with different scaling factors c5=12c_{5}=\tfrac{1}{2} and c4=1c_{4}=1, namely, kφk(12hA)Vk\sum_{k}\varphi_{k}(\tfrac{1}{2}hA)V_{k} and kφk(hA)Wk\sum_{k}\varphi_{k}(hA)W_{k} (grouped in two different brackets [][\ ]), which has to be implemented by 2 separate evaluations. The final stage un+1u_{n+1} depends on Un4U_{n4} and Un5U_{n5}. As a result, this scheme must be implemented in a sequential way, which requires totally 6 evaluations for 6 different linear combinations. Similarly, to the best of our knowledge, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} is also the only existing fifth-order stiffly accurate expRK methods. From the construction of this scheme LO14b , one needs 8 stages. Among them, the first five internal stages are of the form (8). We note, however, that the last two internal stages Un7U_{n7} and Un8U_{n8} involves 2 and 3 different linear combinations (grouped in different brackets [][\ ]) of φk\varphi_{k}- functions (with different scaling factors) acting on different sets of vectors. And each stage (UniU_{ni} or un+1u_{n+1}) depends on all the preceding stages (except for the first stage Un2U_{n2}). Thus, this scheme must be also implemented in a sequential way (it also does not have any group of internal stages that can be computed simultaneously). Clearly, it requires totally 11 evaluations (11 different linear combinations of φ\varphi functions).

Remark 2

The resulting structures of the expRK schemes discussed in Remark 1 can be explained by taking a closer look at their constructions presented in HO05 ; LO14b . Namely, these methods have been analyzed and derived by using a weakened convergence result, i.e., weakening of many order conditions in order to minimize the number of required stages ss and the number of matrix functions in each internal stage UniU_{ni}. Specifically, for fourth-order methods (e.g., 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}) 4 out of 9 order conditions have to be relaxed and for fifth-order methods (e.g., 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}) 9 out of 16 order conditions have to be relaxed. As a trade off, each stage of these methods depends on the preceding stages (thus the resulting schemes must be implemented by computing each stage sequentially) and the very last stages usually involve different linear combinations of φk\varphi_{k}-functions (with several different nodes cic_{i} as scaling factors) acting on not the same set of vectors, which then require additional sequential evaluations. For more details, see Section 4 below.

3 Stiff order conditions and convergence analysis

Inspired by the motivation and remarks in Section 2, we next present a stronger convergence result which later allows a construction of new efficient methods of high order. For this, we first recall the stiff order conditions for expRK methods up to order 5 (see LO12b ; LO14b ).

3.1 Stiff order conditions for methods up to order 5

Let e~n+1=u^n+1u(tn+1)\tilde{e}_{n+1}=\hat{u}_{n+1}-u(t_{n+1}) denote the local error of (2), i.e., the difference between the numerical solution u^n+1\hat{u}_{n+1} obtained by (2) after one step starting from the ‘initial condition’ u(tn)u(t_{n}) and the corresponding exact solution u(tn+1)u(t_{n+1}) of (1) at tn+1t_{n+1}.

To simplify the notation in this section we set f(t)=g(t,u(t))f(t)=g(t,u(t)) as done in HO05 , and additionally denote Gk,n=Dkg(tn,u(tn))G_{k,n}=D^{k}g(t_{n},u(t_{n})) be the kk-th partial Fréchet derivative (with respect to uu) evaluated at u(tn)u(t_{n}). Our results in LO12b (Sect. 4.2) or LO13 (Sect. 5.1) showed that

e~n+1\displaystyle\tilde{e}_{n+1} =h2ψ2(hA)f(tn)+h3ψ3(hA)f′′(tn)+h4ψ4(hA)f′′′(tn)+h5ψ5(hA)f(4)(tn)\displaystyle=h^{2}\psi_{2}(hA)\,f^{\prime}(t_{n})+h^{3}\psi_{3}(hA)\,f^{\prime\prime}(t_{n})+h^{4}\psi_{4}(hA)\,f^{\prime\prime\prime}(t_{n})+h^{5}\psi_{5}(hA)\,f^{(4)}(t_{n}) (12)
+𝐑n+𝒪(h6)\displaystyle\quad+\mathbf{R}_{n}+\mathcal{O}(h^{6})

with the remaining terms

𝐑n\displaystyle\mathbf{R}_{n} =h3i=2sbiG1,nψ2,if(tn)+h4i=2sbiG1,nψ3,if′′(tn)+h4i=2sbiG1,nj=2i1aijG1,nψ2,jf(tn)\displaystyle=h^{3}\sum_{i=2}^{s}b_{i}G_{1,n}\psi_{2,i}\,f^{\prime}(t_{n})+h^{4}\sum_{i=2}^{s}b_{i}G_{1,n}\psi_{3,i}\,f^{\prime\prime}(t_{n})+h^{4}\sum_{i=2}^{s}b_{i}G_{1,n}\sum_{j=2}^{i-1}a_{ij}G_{1,n}\psi_{2,j}\,f^{\prime}(t_{n})
+h4i=2sbiciG2,n(u(tn),ψ2,if(tn))+h5i=2sbiG1,nψ4,if′′′(tn)\displaystyle+h^{4}\sum_{i=2}^{s}b_{i}c_{i}G_{2,n}\big{(}u^{\prime}(t_{n}),\psi_{2,i}\,f^{\prime}(t_{n})\big{)}+h^{5}\sum_{i=2}^{s}b_{i}G_{1,n}\psi_{4,i}\,f^{\prime\prime\prime}(t_{n})
+h5i=2sbiG1,nj=2i1aijG1,nψ3,jf′′(tn)+h5i=2sbiG1,nj=2i1aijG1,nk=2j1ajkG1,nψ2,kf(tn)\displaystyle+h^{5}\sum_{i=2}^{s}b_{i}G_{1,n}\sum_{j=2}^{i-1}a_{ij}G_{1,n}\psi_{3,j}\,f^{\prime\prime}(t_{n})+h^{5}\sum_{i=2}^{s}b_{i}G_{1,n}\sum_{j=2}^{i-1}a_{ij}G_{1,n}\sum_{k=2}^{j-1}a_{jk}G_{1,n}\psi_{2,k}\,f^{\prime}(t_{n}) (13)
+h5i=2sbiG1,nj=2i1aijcjG2,n(u(tn),ψ2,jf(tn))+h5i=2sbiciG2,n(u(tn),ψ3,if′′(tn))\displaystyle+h^{5}\sum_{i=2}^{s}b_{i}G_{1,n}\sum_{j=2}^{i-1}a_{ij}c_{j}G_{2,n}\big{(}u^{\prime}(t_{n}),\psi_{2,j}\,f^{\prime}(t_{n})\big{)}+h^{5}\sum_{i=2}^{s}b_{i}c_{i}G_{2,n}\big{(}u^{\prime}(t_{n}),\psi_{3,i}\,f^{\prime\prime}(t_{n})\big{)}
+h5i=2sbiciG2,n(u(tn),j=2i1aijG1,nψ2,jf(tn))+h5i=2sbi2!G2,n(ψ2,if(tn),ψ2,if(tn))\displaystyle+h^{5}\sum_{i=2}^{s}b_{i}c_{i}G_{2,n}\big{(}u^{\prime}(t_{n}),\sum_{j=2}^{i-1}a_{ij}G_{1,n}\psi_{2,j}\,f^{\prime}(t_{n})\big{)}+h^{5}\sum_{i=2}^{s}\frac{b_{i}}{2!}G_{2,n}\big{(}\psi_{2,i}\,f^{\prime}(t_{n}),\psi_{2,i}\,f^{\prime}(t_{n})\big{)}
+h5i=2sbici22!G2,n(u′′(tn),ψ2,if(tn))+h5i=2sbici22!G3,n(u(tn),u(tn),ψ2,if(tn)).\displaystyle+h^{5}\sum_{i=2}^{s}b_{i}\frac{c^{2}_{i}}{2!}G_{2,n}\big{(}u^{\prime\prime}(t_{n}),\psi_{2,i}\,f^{\prime}(t_{n})\big{)}+h^{5}\sum_{i=2}^{s}b_{i}\frac{c^{2}_{i}}{2!}G_{3,n}\big{(}u^{\prime}(t_{n}),u^{\prime}(t_{n}),\psi_{2,i}\,f^{\prime}(t_{n})\big{)}.

Here, (and from now on) we use the abbreviations aij=aij(hA)a_{ij}=a_{ij}(hA), bi=bi(hA),φj,i=φj(cihA)b_{i}=b_{i}(hA),\varphi_{j,i}=\varphi_{j}(c_{i}hA) and

ψj(hA)\displaystyle\psi_{j}(hA) =i=2sbicij1(j1)!φj(hA),j2\displaystyle=\sum_{i=2}^{s}b_{i}\frac{c^{j-1}_{i}}{(j-1)!}-\varphi_{j}(hA),\quad j\geq 2 (14a)
ψj,i\displaystyle\psi_{j,i} =ψj,i(hA)=k=2i1aikckj1(j1)!cijφj,i.\displaystyle=\psi_{j,i}(hA)=\sum_{k=2}^{i-1}a_{ik}\frac{c^{j-1}_{k}}{(j-1)!}-c^{j}_{i}\varphi_{j,i}. (14b)

Requiring a local error truncation e~n+1=𝒪(h6)\tilde{e}_{n+1}=\mathcal{O}(h^{6}) results in the stiff order conditions for methods of order up to 5, which are displayed in Table 1 below.

Table 1: Stiff order conditions for explicit exponential Runge–Kutta methods up to order 5. The variables ZZ, JJ, KK, LL denote arbitrary square matrices, and BB an arbitrary bilinear mapping of appropriate dimensions. The functions ψl\psi_{l} and ψk,l\psi_{k,l} are defined in (14).
 No. Stiff order condition  Order
1 ψ2(Z)=0i=2sbi(Z)ci=φ2(Z)\psi_{2}(Z)=0\Longleftrightarrow\sum_{i=2}^{s}b_{i}(Z)c_{i}=\varphi_{2}(Z) 2
2 ψ3(Z)=0i=2sbi(Z)ci22!=φ3(Z)\psi_{3}(Z)=0\Longleftrightarrow\sum_{i=2}^{s}b_{i}(Z)\frac{c_{i}^{2}}{2!}=\varphi_{3}(Z) 3
3 i=2sbi(Z)Jψ2,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\psi_{2,i}(Z)=0 3
4 ψ4(Z)=0i=2sbi(Z)ci33!=φ4(Z)\psi_{4}(Z)=0\Longleftrightarrow\sum_{i=2}^{s}b_{i}(Z)\frac{c_{i}^{3}}{3!}=\varphi_{4}(Z) 4
5 i=2sbi(Z)Jψ3,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\psi_{3,i}(Z)=0 4
6 i=2sbi(Z)Jj=2i1aij(Z)Jψ2,j(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\sum_{j=2}^{i-1}a_{ij}(Z)J\psi_{2,j}(Z)=0 4
7 i=2sbi(Z)ciKψ2,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)c_{i}K\psi_{2,i}(Z)=0 4
8 ψ5(Z)=0i=2sbi(Z)ci44!=φ5(Z)\psi_{5}(Z)=0\Longleftrightarrow\sum_{i=2}^{s}b_{i}(Z)\frac{c_{i}^{4}}{4!}=\varphi_{5}(Z) 5
9 i=2sbi(Z)Jψ4,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\psi_{4,i}(Z)=0 5
10 i=2sbi(Z)Jj=2i1aij(Z)Jψ3,j(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\sum_{j=2}^{i-1}a_{ij}(Z)J\psi_{3,j}(Z)=0 5
11 i=2sbi(Z)Jj=2i1aij(Z)Jk=2j1ajk(Z)Jψ2,k(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\sum_{j=2}^{i-1}a_{ij}(Z)J\sum_{k=2}^{j-1}a_{jk}(Z)J\psi_{2,k}(Z)=0\quad 5
12 i=2sbi(Z)Jj=2i1aij(Z)cjKψ2,j(Z)=0\sum_{i=2}^{s}b_{i}(Z)J\sum_{j=2}^{i-1}a_{ij}(Z)c_{j}K\psi_{2,j}(Z)=0 5
13 i=2sbi(Z)ciKψ3,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)c_{i}K\psi_{3,i}(Z)=0 5
14 i=2sbi(Z)ciKj=2i1aij(Z)Jψ2,j(Z)=0\sum_{i=2}^{s}b_{i}(Z)c_{i}K\sum_{j=2}^{i-1}a_{ij}(Z)J\psi_{2,j}(Z)=0 5
15 i=2sbi(Z)B(ψ2,i(Z),ψ2,i(Z))=0\sum_{i=2}^{s}b_{i}(Z)B\big{(}\psi_{2,i}(Z),\psi_{2,i}(Z)\big{)}=0 5
16 i=2sbi(Z)ci2Lψ2,i(Z)=0\sum_{i=2}^{s}b_{i}(Z)c^{2}_{i}L\psi_{2,i}(Z)=0 5

3.2 A stronger convergence result

The convergence analysis of exponential Runge–Kutta methods is usually performed in the framework of analytic semigroups on a Banach space XX with the following assumptions (see e.g. HO05 ; LO14b ):

Assumption 1. The linear operator AA is the infinitesimal generator of an analytic semigroup etA{\rm e}\hskip 1.0pt^{tA} on XX. This implies that

etAXXC,t0\|{\rm e}\hskip 1.0pt^{tA}\|_{X\leftarrow X}\leq C,\quad t\geq 0 (15)

and consequently φk(hA)\varphi_{k}(hA), the coefficients aij(hA)a_{ij}(hA) and bi(hA)b_{i}(hA) of the method are bounded operators. Furthermore, the following stability bound (see (HO05, , Lemma 1))

hAj=1nejhAXXC\left\|hA\sum_{j=1}^{n}{\rm e}\hskip 1.0pt^{jhA}\right\|_{X\leftarrow X}\leq C (16)

holds uniformly for all n1n\geq 1 and h>0h>0 with 0<nhTt00<nh\leq T-t_{0}.

Assumption 2 (for high-order methods). The solution u:[t0,T]Xu:[t_{0},T]\to X of (1) is sufficiently smooth with derivatives in XX and g:[t0,T]Xg:[t_{0},T]\to X is sufficiently often Fréchet differentiable in a strip along the exact solution. All occurring derivatives are assumed to be uniformly bounded.

Let en+1=un+1u(tn+1)e_{n+1}=u_{n+1}-u(t_{n+1}) denote the global error at time tn+1t_{n+1}. In LO14b , we have shown that ene_{n} satisfies the recursion

en=hj=0n1e(nj)hA𝒦j(ej)ej+j=0n1ejhAe~nj,e_{n}=h\sum_{j=0}^{n-1}{\rm e}\hskip 1.0pt^{(n-j)hA}\mathcal{K}_{j}(e_{j})e_{j}+\sum_{j=0}^{n-1}{\rm e}\hskip 1.0pt^{jhA}\tilde{e}_{n-j}, (17)

where 𝒦j(ej)\mathcal{K}_{j}(e_{j}) are bounded operators on XX.

Motivated by Remark 2, we now give a stronger convergence result (compared to those results given in HO05 ; LO14b ) in the sense that it requires relaxing only one order condition.

Theorem 3.1

(Convergence) Let the initial value problem (1) satisfy Assumptions 1–2. Consider for its numerical solution an explicit exponential Runge–Kutta method (2) that fulfills the order conditions of Table 1 up to order pp (2p52\leq p\leq 5) in a strong form with the exception that only one condition ψp(Z)=0\psi_{p}(Z)=0 holds in a weakened form, i.e., ψp(0)=0\psi_{p}(0)=0. Then, the method is convergent of order pp. In particular, the numerical solution unu_{n} satisfies the error bound

unu(tn)Chp\|u_{n}-u(t_{n})\|\leq Ch^{p} (18)

uniformly on compact time intervals  t0tn=t0+nhTt_{0}\leq t_{n}=t_{0}+nh\leq T with a constant CC that depends on Tt0T-t_{0}, but is independent of nn and hh.

Proof

The proof can be carried out in a very similar way as done in (LO14b, , Theorem 4.2). In view of (12) and (13) and employing the assumptions of Theorem 3.1 on the order conditions, we have 𝐑n=0\mathbf{R}_{n}=0 and thus

e~n+1=hp(ψp(hA)ψp(0))Gp1,n+hp+1𝐒n,\tilde{e}_{n+1}=h^{p}\big{(}\psi_{p}(hA)-\psi_{p}(0)\big{)}G_{p-1,n}+h^{p+1}\mathbf{S}_{n}, (19)

where Gp1,nG_{p-1,n} is defined in Section 3.1 and 𝐒n\mathbf{S}_{n} involves the terms multiplying hp+1h^{p+1} and higher order in (12) (clearly, 𝐒nC\|\mathbf{S}_{n}\|\leq C). Inserting (19) (with index nj1n-j-1 in place of nn) into (17) and using the fact that there exists a bounded operator ψ~p(hA)\tilde{\psi}_{p}(hA) such that ψp(hA)ψp(0)=ψ~p(hA)hA\psi_{p}(hA)-\psi_{p}(0)=\tilde{\psi}_{p}(hA)hA yields

en=hj=0n1e(nj)hA𝒦j(ej)ej+hpj=0n1hAejhAψ~p(hA)Gp1,nj1+hp+1𝐒nj1.e_{n}=h\sum_{j=0}^{n-1}{\rm e}\hskip 1.0pt^{(n-j)hA}\mathcal{K}_{j}(e_{j})e_{j}+h^{p}\sum_{j=0}^{n-1}hA{\rm e}\hskip 1.0pt^{jhA}\tilde{\psi}_{p}(hA)G_{p-1,n-j-1}+h^{p+1}\mathbf{S}_{n-j-1}. (20)

Using (15), (16) and an application of a discrete Gronwall lemma shows (18).∎

With the result of Theorem 3.1 in hand, we are now ready to derive more efficient methods. In particular, we will solve the system of stiff order conditions of Table 1 in the context of Theorem 3.1. It turns out that for methods of high order this will require an increase in the number of stages ss. However, we will have more degree of freedoms for constructing our desired methods as seen in Section 4 below. In addition, by relaxing only one order condition, we expect methods resulted from Theorem 3.1 to have better stability (and thus may be more accurate) when integrating stiff systems (see Section 6).

4 Derivation of new efficient exponential Runge–Kutta methods

In this section, we will derive methods which have the following features: (i) containing multiple internal stages UniU_{ni} that are independent of each other (henceforth called parallel stages) and share the same format (thereby allowing them to be implemented in parallel); (ii) involving less number of evaluations of the form (8) when compared to the existing methods of the same orders (thus behaving like methods that use fewer number of stages ss).

We first start with methods of order p3p\leq 3. When solving order conditions for these methods (requiring at least s=2s=2 and s=3s=3 for second- and third-order methods, respectively), one can easily show that it is not possible to fulfill the desired feature (ii), particularly when comparing with 𝚎𝚡𝚙𝚁𝙺𝟸𝚜𝟸\mathtt{expRK2s2} (order 2, 2-stage) and 𝚎𝚡𝚙𝚁𝙺𝟹𝚜𝟹\mathtt{expRK3s3} (order 3, 3-stage) mentioned in Section 2. We omit the details. Therefore, we will focus on the derivation of new methods of higher orders, namely, orders 4 and 5.

4.1 A family of fourth-order methods with parallel stages

Deriving methods of order 4 requires solving the set of 7 stiff order conditions 1–7 in Table 1. First, we discuss on the required number of stages ss. It is shown in (HO05, , Sect.5.3) that s=5s=5 is the minimal number of stages required to construct a family of fourth-order methods which satisfies conditions 1–3 in the strong sense and conditions 4–7 in the weakened form (relaxing bi(Z)b_{i}(Z) as bi(0)b_{i}(0)). In other words, with s=5s=5 it is not possible to fulfill the order conditions in the context of Theorem 3.1, which requires only condition 4 holds in a weakened form ψ4(0)=0\psi_{4}(0)=0 or equivalently i=2sbi(0)ci33!=φ4(0)=1/24\sum_{i=2}^{s}b_{i}(0)\frac{c_{i}^{3}}{3!}=\varphi_{4}(0)=1/24. Therefore, we consider s=6s=6. In this case, conditions 1, 2, and the weakened condition 4 are

b2c2+b3c3+b4c4+b5c5+b6c6\displaystyle b_{2}c_{2}+b_{3}c_{3}+b_{4}c_{4}+b_{5}c_{5}+b_{6}c_{6} =φ2,\displaystyle=\varphi_{2}, (21a)
b2c22+b3c32+b4c42+b5c52+b6c62\displaystyle b_{2}c^{2}_{2}+b_{3}c^{2}_{3}+b_{4}c^{2}_{4}+b_{5}c^{2}_{5}+b_{6}c^{2}_{6} =2φ3,\displaystyle=2\varphi_{3}, (21b)
b2(0)c23+b3(0)c33+b4(0)c43+b5(0)c53+b6(0)c63\displaystyle b_{2}(0)c^{3}_{2}+b_{3}(0)c^{3}_{3}+b_{4}(0)c^{3}_{4}+b_{5}(0)c^{3}_{5}+b_{6}(0)c^{3}_{6} =6φ4(0)=1/4,\displaystyle=6\varphi_{4}(0)=1/4, (21c)

and conditions 3, 5, 7 and 6 are

b2Jψ2,2+b3Jψ2,3+b4Jψ2,4+b5Jψ2,5+b6Jψ2,6=0,\displaystyle b_{2}J\psi_{2,2}+b_{3}J\psi_{2,3}+b_{4}J\psi_{2,4}+b_{5}J\psi_{2,5}+b_{6}J\psi_{2,6}=0, (22a)
b2Jψ3,2+b3Jψ3,3+b4Jψ3,4+b5Jψ3,5+b6Jψ3,6=0,\displaystyle b_{2}J\psi_{3,2}+b_{3}J\psi_{3,3}+b_{4}J\psi_{3,4}+b_{5}J\psi_{3,5}+b_{6}J\psi_{3,6}=0, (22b)
b2c2Kψ2,2+b3c3Kψ2,3+b4c4Kψ2,4+b5c5Kψ2,5+b6c6Kψ2,6=0,\displaystyle b_{2}c_{2}K\psi_{2,2}+b_{3}c_{3}K\psi_{2,3}+b_{4}c_{4}K\psi_{2,4}+b_{5}c_{5}K\psi_{2,5}+b_{6}c_{6}K\psi_{2,6}=0, (22c)
b3Ja32Jψ2,2+b4J(a42Jψ2,2+a43Jψ2,3)+b5J(a52Jψ2,2+a53Jψ2,3+a54Jψ2,4)\displaystyle b_{3}Ja_{32}J\psi_{2,2}+b_{4}J(a_{42}J\psi_{2,2}+a_{43}J\psi_{2,3})+b_{5}J(a_{52}J\psi_{2,2}+a_{53}J\psi_{2,3}+a_{54}J\psi_{2,4}) (22d)
+b6J(a62Jψ2,2+a63Jψ2,3+a64Jψ2,4+a65Jψ2,5)=0.\displaystyle+b_{6}J(a_{62}J\psi_{2,2}+a_{63}J\psi_{2,3}+a_{64}J\psi_{2,4}+a_{65}J\psi_{2,5})=0.

We now solve these order conditions. We note from (14b) that

ψ2,i=j=2i1aijcjci2φ2,i,ψ3,i=j=2i1aijcj22!ci3φ3,i\psi_{2,i}=\sum_{j=2}^{i-1}a_{ij}c_{j}-c^{2}_{i}\varphi_{2,i},\quad\psi_{3,i}=\sum_{j=2}^{i-1}a_{ij}\frac{c^{2}_{j}}{2!}-c^{3}_{i}\varphi_{3,i} (23)

and thus ψ2,2=c22φ2,20,ψ3,2=c23φ3,20\psi_{2,2}=-c^{2}_{2}\varphi_{2,2}\neq 0,\quad\psi_{3,2}=-c^{3}_{2}\varphi_{3,2}\neq 0 (since c20c_{2}\neq 0). Using (23), one can infer that either ψ2,3\psi_{2,3} or ψ3,3\psi_{3,3} must be nonzero as well (if both are zero then a32=c32c2φ2,3=2c33c22φ3,3a_{32}=\dfrac{c^{2}_{3}}{c_{2}}\varphi_{2,3}=\dfrac{2c^{3}_{3}}{c^{2}_{2}}\varphi_{3,3}, which is impossible since c3>0c_{3}>0 and {φ2,φ3}\{\varphi_{2},\varphi_{3}\} are linearly independent). This strongly suggests that b2=b3=0b_{2}=b_{3}=0 in order to later fulfill (22) in the strong sense with arbitrary square matrices JJ and KK. Next, we further observe that if b40b_{4}\neq 0 one may need both ψ2,4=ψ3,4=0\psi_{2,4}=\psi_{3,4}=0 (which solves a420a_{42}\neq 0, a430a_{43}\neq 0). However, this makes the second term in (22d) to be nonzero which is then very difficult to satisfy (22d) in the strong form. Putting together, it requires that b2=b3=b4=0b_{2}=b_{3}=b_{4}=0. Using this sufficient condition we can easily solve (21) to get

b5=c6φ2+2φ3c5(c5c6),b6=c5φ2+2φ3c6(c6c5)b_{5}=\dfrac{-c_{6}\varphi_{2}+2\varphi_{3}}{c_{5}(c_{5}-c_{6})},\quad b_{6}=\dfrac{-c_{5}\varphi_{2}+2\varphi_{3}}{c_{6}(c_{6}-c_{5})}

for any choice of distinct nodes c5,c6>0c_{5},c_{6}>0, satisfying the condition

c5=4c636c64.c_{5}=\dfrac{4c_{6}-3}{6c_{6}-4}. (24)

Since b5,b60b_{5},b_{6}\neq 0, we must enforce ψ2,5=ψ3,5=0\psi_{2,5}=\psi_{3,5}=0 and ψ2,6=ψ3,6=0\psi_{2,6}=\psi_{3,6}=0 to satisfy conditions (22a)–(22c). Using (23), this leads to the following 2 systems of two linear equations

a52c2+a53c3+a54c4\displaystyle a_{52}c_{2}+a_{53}c_{3}+a_{54}c_{4} =c52φ2,5,\displaystyle=c^{2}_{5}\varphi_{2,5}, (25a)
a52c22+a53c32+a54c42\displaystyle a_{52}c^{2}_{2}+a_{53}c^{2}_{3}+a_{54}c^{2}_{4} =2c53φ3,5,\displaystyle=2c^{3}_{5}\varphi_{3,5}, (25b)

and

a62c2+a63c3+a64c4+a65c5\displaystyle a_{62}c_{2}+a_{63}c_{3}+a_{64}c_{4}+a_{65}c_{5} =c62φ2,6,\displaystyle=c^{2}_{6}\varphi_{2,6}, (26a)
a62c22+a63c32+a64c42+a65c52\displaystyle a_{62}c^{2}_{2}+a_{63}c^{2}_{3}+a_{64}c^{2}_{4}+a_{65}c^{2}_{5} =2c63φ3,6.\displaystyle=2c^{3}_{6}\varphi_{3,6}. (26b)

To satisfy conditions (22d), we further enforce a52=a62=0a_{52}=a_{62}=0 (since ψ2,20\psi_{2,2}\neq 0), which immediately solves (25) for coefficients (with c3c4c_{3}\neq c_{4})

a53=c4c52φ2,5+2c53φ3,5c3(c3c4)0,a54=c3c52φ2,5+2c53φ3,5c4(c4c3)0,a_{53}=\dfrac{-c_{4}c^{2}_{5}\varphi_{2,5}+2c^{3}_{5}\varphi_{3,5}}{c_{3}(c_{3}-c_{4})}\neq 0,\quad a_{54}=\dfrac{-c_{3}c^{2}_{5}\varphi_{2,5}+2c^{3}_{5}\varphi_{3,5}}{c_{4}(c_{4}-c_{3})}\neq 0, (27)

and thus we also need ψ2,3=ψ2,4=0\psi_{2,3}=\psi_{2,4}=0 (since ψ2,5=0\psi_{2,5}=0), which gives

a32\displaystyle a_{32} =c32c2φ2,3,\displaystyle=\dfrac{c^{2}_{3}}{c_{2}}\varphi_{2,3}, (28a)
a42c2+a43c3\displaystyle a_{42}c_{2}+a_{43}c_{3} =c42φ2,4.\displaystyle=c^{2}_{4}\varphi_{2,4}. (28b)

After fulfilling all the required order conditions in (21)–(22), we see from (26) and (28b) that either a42a_{42} or a43a_{43} and one of the coefficients among a63,a64,a65a_{63},\ a_{64},\ a_{65} can be taken as free parameters. We now use them to construct parallel stages. Guided by (27) and (28a), we choose a43=0a_{43}=0 to make Un4U_{n4} is independent of Un3U_{n3} so that both these stages only depend on Un2U_{n2}, and choose a65=0a_{65}=0 to make Un6U_{n6} is independent of Un5U_{n5} so that both these stages only depend on the two preceding stages Un3,Un4U_{n3},U_{n4} (since a52=a62=0a_{52}=a_{62}=0). From this we determine the remaining coefficients

a42=c42c2φ2,4,a63=c4c62φ2,6+2c63φ3,6c3(c3c4),a64=c3c62φ2,6+2c63φ3,6c4(c4c3).a_{42}=\dfrac{c^{2}_{4}}{c_{2}}\varphi_{2,4},\quad a_{63}=\dfrac{-c_{4}c^{2}_{6}\varphi_{2,6}+2c^{3}_{6}\varphi_{3,6}}{c_{3}(c_{3}-c_{4})},\quad a_{64}=\dfrac{-c_{3}c^{2}_{6}\varphi_{2,6}+2c^{3}_{6}\varphi_{3,6}}{c_{4}(c_{4}-c_{3})}. (29)

Putting altogether and rearranging terms in Uni,un+1U_{ni},\ u_{n+1} as linear combinations of φ\varphi functions, we obtain the following family of 4th-order 6-stage methods (with the pairs of parallel stages {Un3,Un4}\{U_{n3},U_{n4}\} and {Un5,Un6}\{U_{n5},U_{n6}\}), which will be called 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6}:

Un2=un\displaystyle U_{n2}=u_{n} +φ1(c2hA)c2hF(tn,un),\displaystyle+\varphi_{1}(c_{2}hA)c_{2}hF(t_{n},u_{n}), (30a)
Un,k=un\displaystyle U_{n,k}=u_{n} +φ1(ckhA)ckhF(tn,un)+φ2(ckhA)ck2c2hDn2,k=3,4\displaystyle+\varphi_{1}(c_{k}hA)c_{k}hF(t_{n},u_{n})+\varphi_{2}(c_{k}hA)\tfrac{c^{2}_{k}}{c_{2}}hD_{n2},\quad\hskip 42.67912ptk=3,4 (30b)
Un,j=un\displaystyle U_{n,j}=u_{n} +φ1(cjhA)cjhF(tn,un)+φ2(cjhA)cj2c3c4h(c4c3Dn3+c3c4Dn4)\displaystyle+\varphi_{1}(c_{j}hA)c_{j}hF(t_{n},u_{n})+\varphi_{2}(c_{j}hA)\tfrac{c^{2}_{j}}{c_{3}-c_{4}}h\big{(}\tfrac{-c_{4}}{c_{3}}D_{n3}+\tfrac{c_{3}}{c_{4}}D_{n4}\big{)}
+φ3(cjhA)2cj3c3c4h(1c3Dn31c4Dn4),j=5,6\displaystyle+\varphi_{3}(c_{j}hA)\tfrac{2c^{3}_{j}}{c_{3}-c_{4}}h\big{(}\tfrac{1}{c_{3}}D_{n3}-\tfrac{1}{c_{4}}D_{n4}\big{)},\quad\quad\ \hskip 56.9055ptj=5,6 (30c)
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)1c5c6h(c6c5Dn5+c5c6Dn6)\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)\tfrac{1}{c_{5}-c_{6}}h\big{(}\tfrac{-c_{6}}{c_{5}}D_{n5}+\tfrac{c_{5}}{c_{6}}D_{n6}\big{)}
+φ3(hA)2c5c6h(1c5Dn51c6Dn6).\displaystyle+\varphi_{3}(hA)\tfrac{2}{c_{5}-c_{6}}h\big{(}\tfrac{1}{c_{5}}D_{n5}-\tfrac{1}{c_{6}}D_{n6}\big{)}. (30d)

For the numerical experiments given in Section 6, we choose c2=c3=12,c4=13c_{2}=c_{3}=\frac{1}{2},c_{4}=\frac{1}{3}, c6=13c_{6}=\frac{1}{3} which gives c5=56c_{5}=\frac{5}{6} due to (24).

Remark 3

(A comparison with 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}). As seen, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is resulted from weakening only condition 4 of Table 1 instead of weakening four conditions 4–7 as in the derivation of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}. While the 5-stage method 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} requires 6 sequential evaluations in each step (as mentioned in Section 2), the new fourth-order 6-stage method 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} requires only 4 sequential evaluations, making it to behave like a 4-stage method. This is due to the fact 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} has the pairs of parallel stages {Un3,Un4}\{U_{n3},U_{n4}\} and {Un5,Un6}\{U_{n5},U_{n6}\} and all UniU_{ni} within these pairs have the same format, i.e., same (one) linear combination of φk(cihA)vk\varphi_{k}(c_{i}hA)v_{k}, allowing them to be computed in parallel or simultaneously (see Section 5).

4.2 A family of fifth-order methods with parallel stages

Constructing fifth-order exponential Runge-Kutta methods needs much more effort as one has to solve 16 order conditions in Table 1. As mentioned in Section 2, the only existing method of order 5 in the literature is 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} (see LO14b ) which requires s=8s=8 stages. Like 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, this method does not have any parallel stages and must be implemented in a sequential way. It also does not satisfy the assumption on the order conditions stated in Theorem 3.1. Indeed, it was constructed by fulfilling conditions 1–7 in the strong form and weakening conditions 8–16 (9 out of 16 order conditions) with bi(0)b_{i}(0) in place of bi(Z)b_{i}(Z). This resulted in the last two internal stages Un7U_{n7} and Un8U_{n8} that involve several different linear combinations of φk(cihA)vk\varphi_{k}(c_{i}hA)v_{k} (with different scalings c6,c7,c8c_{6},c_{7},c_{8} of hAhA), for which additional computational efforts are required to compute those stages (as shown in Section 2).

Therefore, to derive a method based on Theorem 3.1 which later allows us to derive parallel stages schemes with UniU_{ni} involving only one linear combination of φk(cihA)vk\varphi_{k}(c_{i}hA)v_{k}, we have to increase s9s\geqslant 9. To make it easier for readers to follow, we consider s=10s=10 first and later employ the similar procedure to show that it is not possible to fulfill condition 11 of Table 1 in the strong form (and thus not satisfying Theorem 3.1) with s=9s=9.

a) The case s=10s=10: Similarly to the derivation presented in Subsection 4.1, using (23), it strongly suggests b2=b3=b4=b5=b6=b7=0b_{2}=b_{3}=b_{4}=b_{5}=b_{6}=b_{7}=0 in order to solve conditions 3, 5, 9, 7, 16, 13, and 15 in their strong form. Using this, these conditions now read as

b8Jψ2,8+b9Jψ2,9+b10Jψ2,10=0,\displaystyle b_{8}J\psi_{2,8}+b_{9}J\psi_{2,9}+b_{10}J\psi_{2,10}=0, (31a)
b8Jψ3,8+b9Jψ3,9+b10Jψ3,10=0,\displaystyle b_{8}J\psi_{3,8}+b_{9}J\psi_{3,9}+b_{10}J\psi_{3,10}=0, (31b)
b8Jψ4,8+b9Jψ4,9+b10Jψ4,10=0,\displaystyle b_{8}J\psi_{4,8}+b_{9}J\psi_{4,9}+b_{10}J\psi_{4,10}=0, (31c)
b8c8Kψ2,8+b9c9Kψ2,9+b10c10Kψ2,10=0,\displaystyle b_{8}c_{8}K\psi_{2,8}+b_{9}c_{9}K\psi_{2,9}+b_{10}c_{10}K\psi_{2,10}=0, (31d)
b8c82Lψ2,8+b9c92Lψ2,9+b10c102Lψ2,10=0,\displaystyle b_{8}c^{2}_{8}L\psi_{2,8}+b_{9}c^{2}_{9}L\psi_{2,9}+b_{10}c^{2}_{10}L\psi_{2,10}=0, (31e)
b8c8Kψ3,8+b9c9Kψ3,9+b10c10Kψ3,10=0,\displaystyle b_{8}c_{8}K\psi_{3,8}+b_{9}c_{9}K\psi_{3,9}+b_{10}c_{10}K\psi_{3,10}=0, (31f)
b8B(ψ2,8,ψ2,8)+b9B(ψ2,9,ψ2,9)+b10B(ψ2,10,ψ2,10)=0,\displaystyle b_{8}B(\psi_{2,8},\psi_{2,8})+b_{9}B(\psi_{2,9},\psi_{2,9})+b_{10}B(\psi_{2,10},\psi_{2,10})=0, (31g)

respectively. And conditions 1, 2, 4, and 8 (weakened form) become

b8c8+b9c9+b10c10\displaystyle b_{8}c_{8}+b_{9}c_{9}+b_{10}c_{10} =φ2,\displaystyle=\varphi_{2}, (32a)
b8c82+b9c92+b10c102\displaystyle b_{8}c^{2}_{8}+b_{9}c^{2}_{9}+b_{10}c^{2}_{10} =2φ3,\displaystyle=2\varphi_{3}, (32b)
b8c83+b9c93+b10c103\displaystyle b_{8}c^{3}_{8}+b_{9}c^{3}_{9}+b_{10}c^{3}_{10} =6φ4,\displaystyle=6\varphi_{4}, (32c)
b8(0)c84+b9(0)c94+b10(0)c104\displaystyle b_{8}(0)c^{4}_{8}+b_{9}(0)c^{4}_{9}+b_{10}(0)c^{4}_{10} =24φ5(0)=1/5.\displaystyle=24\varphi_{5}(0)=1/5. (32d)

Solving (32) gives

b8\displaystyle b_{8} =c9c10φ22(c9+c10)φ3+6φ4c8(c8c9)(c8c10),\displaystyle=\dfrac{c_{9}c_{10}\varphi_{2}-2(c_{9}+c_{10})\varphi_{3}+6\varphi_{4}}{c_{8}(c_{8}-c_{9})(c_{8}-c_{10})}, (33a)
b9\displaystyle b_{9} =c8c10φ22(c8+c10)φ3+6φ4c9(c9c8)(c9c10),\displaystyle=\dfrac{c_{8}c_{10}\varphi_{2}-2(c_{8}+c_{10})\varphi_{3}+6\varphi_{4}}{c_{9}(c_{9}-c_{8})(c_{9}-c_{10})}, (33b)
b10\displaystyle b_{10} =c8c9φ22(c8+c9)φ3+6φ4c10(c10c8)(c10c9)\displaystyle=\dfrac{c_{8}c_{9}\varphi_{2}-2(c_{8}+c_{9})\varphi_{3}+6\varphi_{4}}{c_{10}(c_{10}-c_{8})(c_{10}-c_{9})} (33c)

where c8,c9c_{8},c_{9}, and c10c_{10} are distinct and positive nodes satisfying the algebraic equation

c8+c9+c104c8c9+c8c10+c9c103+c8c9c102=15.\dfrac{c_{8}+c_{9}+c_{10}}{4}-\dfrac{c_{8}c_{9}+c_{8}c_{10}+c_{9}c_{10}}{3}+\dfrac{c_{8}c_{9}c_{10}}{2}=\dfrac{1}{5}. (34)

Clearly, b8,b9,b100b_{8},b_{9},b_{10}\neq 0 so one has to enforce

ψ2,j=ψ3,j=ψ4,j=0(j=8,9,10)\psi_{2,j}=\psi_{3,j}=\psi_{4,j}=0\ (j=8,9,10) (35)

to satisfy (31) in the strong sense with arbitrary square matrices J,K,LJ,K,L and BB. Next, we consider conditions 6 and 10 taken into account that bi=0b_{i}=0 (i=2,,7i=2,\cdots,7) and (35), which can be now simplified as

j=27(b8Ja8j+b9Ja9j+b10Ja10j)Jψm,j=0(m=2,3),\sum_{j=2}^{7}(b_{8}Ja_{8j}+b_{9}Ja_{9j}+b_{10}Ja_{10j})J\psi_{m,j}=0\ \ (m=2,3), (36)

respectively. In order to satisfy the strong form of (36) one needs

a8j=a9j=a10j=0(j=2,3,4)a_{8j}=a_{9j}=a_{10j}=0\ (j=2,3,4) (37)

(this is again due to (23)) and

ψ2,j=ψ3,j=0(j=5,6,7).\psi_{2,j}=\psi_{3,j}=0\ (j=5,6,7). (38)

With (37), we note that Un8,Un9,Un10U_{n8},U_{n9},U_{n10} are independent of the internal stages Un2,Un3,Un4U_{n2},U_{n3},U_{n4}. Taking into all the requirements above, one can easily see that conditions 12 and 14 are now automatically fulfilled. Therefore, the only remaining condition to satisfy is condition 11.

Before working with condition 11, we first solve (35) using (37). For this, we observe that several coefficients aija_{ij} can be considered as free parameters. To have Un8,Un9,Un10U_{n8},U_{n9},U_{n10} are independent of each other, we choose

a98=a10,8=a10,9=0.a_{98}=a_{10,8}=a_{10,9}=0. (39)

The resulting systems of linear equations from (35) is then solved with the unique solution

aij=ci2ckclφ2,i2ci3(ck+cl)φ3,i+6ci4φ4,icj(cjck)(cjcl),i=8,9,10;j,k,l{5,6,7},jkla_{ij}=\dfrac{c^{2}_{i}c_{k}c_{l}\varphi_{2,i}-2c^{3}_{i}(c_{k}+c_{l})\varphi_{3,i}+6c^{4}_{i}\varphi_{4,i}}{c_{j}(c_{j}-c_{k})(c_{j}-c_{l})},\quad i=8,9,10;\ j,k,l\in\{5,6,7\},\ j\neq k\neq l (40)

(i.e., c5,c6,c7>0c_{5},c_{6},c_{7}>0 are distinct nodes).
We now use bi=0b_{i}=0 (i=2,,7i=2,\cdots,7), (35), (37), (38), and (39) to simplify condition 11 as

i=810biJj=57aijJ(aj2Jψ2,2+aj3Jψ2,3+aj4Jψ2,4)=0.\sum_{i=8}^{10}b_{i}J\sum_{j=5}^{7}a_{ij}J\big{(}a_{j2}J\psi_{2,2}+a_{j3}J\psi_{2,3}+a_{j4}J\psi_{2,4}\big{)}=0. (41)

Since b8,b9,b100b_{8},b_{9},b_{10}\neq 0, coefficients aija_{ij} in (40) (i{8,9,10},j{5,6,7}i\in\{8,9,10\},j\in\{5,6,7\}) are also nonzero, and that ψ2,20\psi_{2,2}\neq 0, we must enforce

aj2=0(j=5,6,7),i.e.,a52=a62=a72=0a_{j2}=0\ (j=5,6,7),\ \text{i.e.,}\ a_{52}=a_{62}=a_{72}=0 (42)

and require that

ψ2,3=ψ2,4=0\psi_{2,3}=\psi_{2,4}=0 (43)

in order to satisfy (41) in the strong sense. Note, because of (42), one could not require a53=0a_{53}=0 or a54=0a_{54}=0 (j=5j=5) in (41) or both as this does not agree with the requirement ψ2,5=ψ3,5=0\psi_{2,5}=\psi_{3,5}=0 in (38) (in other words, the linear system of equations displayed in (25) represented for this requirement has no solution). This justifies the requirement (43).

Finally, we solve (43) and (38) for the remaining coefficients aija_{ij}. When solving (43) (see (28)), we choose a43=0a_{43}=0 to have Un4U_{n4} is independent of Un3U_{n3}. This gives

a32=c32c2φ2,3,a42=c42c2φ2,4.a_{32}=\dfrac{c^{2}_{3}}{c_{2}}\varphi_{2,3},\quad a_{42}=\dfrac{c^{2}_{4}}{c_{2}}\varphi_{2,4}. (44)

When solving (38) (using (42)), we choose a65=a75=a76=0a_{65}=a_{75}=a_{76}=0 to have Un5,Un6,Un7U_{n5},U_{n6},U_{n7} are independent of each other. This results in the following 6 coefficients:

aij=ci2ckφ2,i+2ci3φ3,icj(cjck),i=5,6,7;j,k{3,4},jka_{ij}=\dfrac{-c^{2}_{i}c_{k}\varphi_{2,i}+2c^{3}_{i}\varphi_{3,i}}{c_{j}(c_{j}-c_{k})},\quad i=5,6,7;\ j,k\in\{3,4\},\ j\neq k (45)

(i.e., c3,c4>0c_{3},c_{4}>0 are distinct nodes).

Inserting all the obtained coefficients aija_{ij} and bib_{i} into Uni,un+1U_{ni},\ u_{n+1} and rewriting these stages as linear combinations of φ\varphi functions, we obtain the following family of 5th-order 10-stage methods (with the groups of parallel stages {Un3,Un4}\{U_{n3},U_{n4}\}, {Un5,Un6,Un7}\{U_{n5},U_{n6},U_{n7}\}, and {Un8,Un9,Un10}\{U_{n8},U_{n9},U_{n10}\}) which will be called 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}:

Un2=un\displaystyle U_{n2}=u_{n} +φ1(c2hA)c2hF(tn,un),\displaystyle+\varphi_{1}(c_{2}hA)c_{2}hF(t_{n},u_{n}),
Un=un\displaystyle U_{n\ell}=u_{n} +φ1(chA)chF(tn,un)+φ2(chA)c2c2hDn2,=3,4\displaystyle+\varphi_{1}(c_{\ell}hA)c_{\ell}hF(t_{n},u_{n})+\varphi_{2}(c_{\ell}hA)\tfrac{c^{2}_{\ell}}{c_{2}}hD_{n2},\hskip 93.89418pt\ell=3,4
Unm=un\displaystyle U_{nm}=u_{n} +φ1(cmhA)cmhF(tn,un)+φ2(cmhA)cm2h(c4c3(c4c3)Dn3+c3c4(c3c4)Dn4)\displaystyle+\varphi_{1}(c_{m}hA)c_{m}hF(t_{n},u_{n})+\varphi_{2}(c_{m}hA)c^{2}_{m}h\big{(}\tfrac{c_{4}}{c_{3}(c_{4}-c_{3})}D_{n3}+\tfrac{c_{3}}{c_{4}(c_{3}-c_{4})}D_{n4}\big{)}
+φ3(cmhA)cm3h(2c3(c3c4)Dn32c4(c3c4)Dn4),m=5,6,7\displaystyle+\varphi_{3}(c_{m}hA)c^{3}_{m}h\big{(}\tfrac{2}{c_{3}(c_{3}-c_{4})}D_{n3}-\tfrac{2}{c_{4}(c_{3}-c_{4})}D_{n4}\big{)},\quad\hskip 71.13188ptm=5,6,7
Unq=un\displaystyle U_{nq}=u_{n} +φ1(cqhA)cqhF(tn,un)+φ2(cqhA)cq2h(α5Dn5+α6Dn6+α7Dn7)\displaystyle+\varphi_{1}(c_{q}hA)c_{q}hF(t_{n},u_{n})+\varphi_{2}(c_{q}hA)c^{2}_{q}h\big{(}\alpha_{5}D_{n5}+\alpha_{6}D_{n6}+\alpha_{7}D_{n7}\big{)}
+φ3(cqhA)cq3h(β5Dn5β6Dn6β7Dn7)\displaystyle+\varphi_{3}(c_{q}hA)c^{3}_{q}h\big{(}\beta_{5}D_{n5}-\beta_{6}D_{n6}-\beta_{7}D_{n7}\big{)}
+φ4(cqhA)cq4h(γ5Dn5+γ6Dn6+γ7Dn7),q=8,9,10\displaystyle+\varphi_{4}(c_{q}hA)c^{4}_{q}h\big{(}\gamma_{5}D_{n5}+\gamma_{6}D_{n6}+\gamma_{7}D_{n7}\big{)},\hskip 96.73918ptq=8,9,10
un+1=un\displaystyle u_{n+1}=u_{n} +φ1(hA)hF(tn,un)+φ2(hA)h(α8Dn8+α9Dn9+α10Dn10)\displaystyle+\varphi_{1}(hA)hF(t_{n},u_{n})+\varphi_{2}(hA)h\big{(}\alpha_{8}D_{n8}+\alpha_{9}D_{n9}+\alpha_{10}D_{n10}\big{)}
φ3(hA)h(β8Dn8+β9Dn9+β10Dn10)+φ4(hA)h(γ8Dn8+γ9Dn9+γ10Dn10),\displaystyle-\varphi_{3}(hA)h\big{(}\beta_{8}D_{n8}+\beta_{9}D_{n9}+\beta_{10}D_{n10}\big{)}+\varphi_{4}(hA)h\big{(}\gamma_{8}D_{n8}+\gamma_{9}D_{n9}+\gamma_{10}D_{n10}\big{)},

where

αi=ckclci(cick)(cicl),βi=2(ck+cl)ci(cick)(cicl),γi=6ci(cick)(cicl)\alpha_{i}=\dfrac{c_{k}c_{l}}{c_{i}(c_{i}-c_{k})(c_{i}-c_{l})},\quad\beta_{i}=\dfrac{2(c_{k}+c_{l})}{c_{i}(c_{i}-c_{k})(c_{i}-c_{l})},\quad\gamma_{i}=\dfrac{6}{c_{i}(c_{i}-c_{k})(c_{i}-c_{l})} (46)

with i{5,6,7}i\in\{5,6,7\} for k,l{5,6,7}\ k,l\in\{5,6,7\}, and i{8,9,10}i\in\{8,9,10\} for k,l{8,9,10}\ k,l\in\{8,9,10\} (note that i,k,li,k,l are distinct indices and that ci,ck,clc_{i},c_{k},c_{l} are distinct (positive) nodes).
For our numerical experiments, we choose c2=c3=c5=12c_{2}=c_{3}=c_{5}=\tfrac{1}{2}, c4=c6=13c_{4}=c_{6}=\tfrac{1}{3}, c7=14c_{7}=\tfrac{1}{4}, c8=310c_{8}=\tfrac{3}{10}, c9=34c_{9}=\tfrac{3}{4}, and c10=1c_{10}=1 (satisfying (34)).

Remark 4

(A comparison with 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}). Although the new fifth-order method 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} has 10 stages (compared to 8 stages of 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} displayed in Section 2), its special structure offers much more efficient for implementation. In particular, all UniU_{ni} in this scheme involve only one linear combination of φk(cihA)vk\varphi_{k}(c_{i}hA)v_{k} which can be computed by one evaluation for each; and more importantly, due to the same format of multiple stages within each of the three groups {Un3,Un4}\{U_{n3},U_{n4}\}, {Un5,Un6,Un7}\{U_{n5},U_{n6},U_{n7}\}, and {Un8,Un9,Un10}\{U_{n8},U_{n9},U_{n10}\} (same linear combination with different inputs cic_{i}), they can be computed simultaneously or implemented in parallel (see Section 5). This makes 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} to behave like a 5-stage method only, thereby requiring only 5 sequential evaluations in each step. Moreover, while 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} requires weakening 9 out of 16 order conditions of Table 1, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} requires only one condition (number 8) held in the weakened form. Note that by following the similar way of deriving 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}, we can derive a scheme that satisfies all the stiff order conditions in Table 1 in the strong sense with s=11s=11. Such a scheme, however, still behaves like a 5-stage method. Therefore, we do not discuss further on this case.

b) The case s=9s=9 (which does not work): Clearly, in this case we have less degree of freedoms than the case s=10s=10 when solving the order conditions in Table 1. Nonetheless, one can still proceed in a similar way as done for s=10s=10. Again, it strongly suggests b2=b3=b4=b5=b6=0b_{2}=b_{3}=b_{4}=b_{5}=b_{6}=0 (which solves for b7,b8,b90b_{7},b_{8},b_{9}\neq 0 from conditions 1, 2, 4) and

ψ2,j=ψ3,j=ψ4,j=0(j=7,8,9)\psi_{2,j}=\psi_{3,j}=\psi_{4,j}=0\ (j=7,8,9) (47)

in order to satisfy conditions 1, 2, 3, 4, 5, 7, 9, 13, 15, 16 in the strong form. With this, conditions 6 and 10 now become

j=26(b7Ja7j+b8Ja8j+b9Ja9j)Jψm,j=0(m=2,3).\sum_{j=2}^{6}(b_{7}Ja_{7j}+b_{8}Ja_{8j}+b_{9}Ja_{9j})J\psi_{m,j}=0\ \ (m=2,3). (48)

Again, due to the fact that ψ2,2,ψ3,20\psi_{2,2},\psi_{3,2}\neq 0 and either ψ2,3\psi_{2,3} or ψ3,3\psi_{3,3} must be nonzero, one needs to enforce a7j=a8j=a9j=0(j=2,3)a_{7j}=a_{8j}=a_{9j}=0\ (j=2,3) in (48). Using this to solve (47) for j=7j=7 (ψ2,7=ψ3,7=ψ4,7=0\psi_{2,7}=\psi_{3,7}=\psi_{4,7}=0) gives a unique solution (with c4,c5,c6>0c_{4},c_{5},c_{6}>0 and are distinct) for a74,a75,a760a_{74},a_{75},a_{76}\neq 0, which then determines Un7U_{n7}. Next, one can solve (47) for j=8,9j=8,9 to obtain Un8,Un9U_{n8},U_{n9} that are independent of Un7U_{n7}, as well as are independent of each other, by requiring the three free parameters a87=a97=a98=0a_{87}=a_{97}=a_{98}=0. As a result, one gets a7j,a8j,a9j0(j=5,6)a_{7j},a_{8j},a_{9j}\neq 0\ (j=5,6). This immediately suggests ψ2,j=ψ3,j=0(j=4,5,6)\psi_{2,j}=\psi_{3,j}=0\ (j=4,5,6) to completely fulfill (48) with arbitrary square matrix JJ. With all of these in place, conditions 12 and 14 are automatically fulfilled, and condition 11 is now reduced to

i=79biJj=46aijJ(aj2Jψ2,2+aj3Jψ2,3)=0.\sum_{i=7}^{9}b_{i}J\sum_{j=4}^{6}a_{ij}J\big{(}a_{j2}J\psi_{2,2}+a_{j3}J\psi_{2,3}\big{)}=0. (49)

Clearly, since b7,b8,b90b_{7},b_{8},b_{9}\neq 0, a7j,a8j,a9j0(j=4,5,6)a_{7j},a_{8j},a_{9j}\neq 0\ (j=4,5,6), and ψ2,20\psi_{2,2}\neq 0, (49) can be satisfied in the strong sense only if we have one of the following conditions: aj2=aj3=0a_{j2}=a_{j3}=0 or aj2=ψ2,3=0,(j=4,5,6)a_{j2}=\psi_{2,3}=0,\ (j=4,5,6). Unfortunately, either of these requirements is in contradiction with ψ2,j=ψ3,j=0(j=4,5,6)\psi_{2,j}=\psi_{3,j}=0\ (j=4,5,6) which is needed for conditions 6 and 10 mentioned above. For example, solving ψ2,4=ψ3,4=0\psi_{2,4}=\psi_{3,4}=0 results in a42,a430a_{42},a_{43}\neq 0.

5 Details implementation of fourth- and fifth-order schemes

In this section, we present details implementation of the old and new fourth- and fifth-order expRK schemes (𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6}, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}) mentioned above.

As mentioned in Section 2.1, we will use the MATLAB routine phipm_simul_iom (described in details in Luan18 ) to implement expRK methods. In particular, given the following inputs: an array of scaling factors 𝚝=[ρ1,,ρr]\mathtt{t}=[\rho_{1},\cdots,\rho_{r}] with 0<ρ1<ρ2<<ρr10<\rho_{1}<\rho_{2}<\cdots<\rho_{r}\leq 1 (𝚝\mathtt{t} could be a positive scalar), an nn-by-nn matrix MM, and a set of column vectors 𝚅=[V0,,vq]\mathtt{V}=[V_{0},\ldots,v_{q}] (each viv_{i} is an nn-by-11 vector), a tolerance 𝚝𝚘𝚕\mathtt{tol}, an initial value mm for the dimension of the Krylov subspace, and an incomplete orthogonalization length of 𝚒𝚘𝚖\mathtt{iom}, a call to this function

𝚙𝚑𝚒𝚙𝚖_𝚜𝚒𝚖𝚞𝚕_𝚒𝚘𝚖(𝚝,𝙼,𝚅,𝚝𝚘𝚕,𝚖,𝚒𝚘𝚖)\mathtt{phipm\_simul\_iom(t,M,V,tol,m,iom)} (50)

simultaneously computes the following rr linear combinations

Lρi,𝚅=φ0(ρiM)v0+φ1(ρiM)ρiv1+φ2(ρiM)ρi2v2++φq(ρiM)ρiqvq, 1ir.L_{\rho_{i},\mathtt{V}}=\varphi_{0}(\rho_{i}M)v_{0}+\varphi_{1}(\rho_{i}M)\rho_{i}v_{1}+\varphi_{2}(\rho_{i}M)\rho_{i}^{2}v_{2}+\cdots+\varphi_{q}(\rho_{i}M)\rho_{i}^{q}v_{q},\ 1\leq i\leq r. (51)

Note that, by setting Vj=ρijvjV_{j}=\rho_{i}^{j}v_{j} (j=0,,qj=0,\cdots,q), (51) becomes (6). In other words, all the linear combinations in (6) ( if VjV_{j} are given instead of vjv_{j}) can be then computed at the same time with one call (50) by using scaled vectors vj=Vj/ρijv_{j}=V_{j}/\rho_{i}^{j} for the input 𝚅\mathtt{V}.

In the following, we set

M=hA,ρi=ci,v=hF(tn,un),di=hDni.M=hA,\quad\rho_{i}=c_{i},\quad v=hF(t_{n},u_{n}),\quad d_{i}=hD_{ni}. (52)

to simplify notations in presenting details of implementation of the fourth- and fifth-order methods mentioned above. When calling (50), we use 𝚝𝚘𝚕=1012\mathtt{tol}=10^{-12}, 𝚖=1\mathtt{m}=1 (default value), and 𝚒𝚖𝚘=2\mathtt{imo}=2 (as in Luan18 ).

Implementation of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} (c2=c3=c5=12,c4=1c_{2}=c_{3}=c_{5}=\tfrac{1}{2},c_{4}=1): As discussed in Remark 1, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} requires a sequential implementation of the following 6 different linear combinations of the form (51), corresponding to 6 calls to phipm_simul_iom:

  • (i)

    Evaluate Lc2,𝚅L_{c_{2},\mathtt{V}} with 𝚝=c2,𝚅=[0,v]\mathtt{t}=c_{2},\mathtt{V}=[0,v] to get Un2=un+Lc2,𝚅U_{n2}=u_{n}+L_{c_{2},\mathtt{V}}.

  • (ii)

    Evaluate Lc3,𝚅L_{c_{3},\mathtt{V}} with 𝚝=c3,𝚅=[0,v,d2/c32]\mathtt{t}=c_{3},\mathtt{V}=[0,v,d_{2}/c^{2}_{3}] to get Un3=un+Lc3,𝚅U_{n3}=u_{n}+L_{c_{3},\mathtt{V}}.

  • (iii)

    Evaluate Lc4,𝚅L_{c_{4},\mathtt{V}} with 𝚝=c4,𝚅=[0,v,d2+d3]\mathtt{t}=c_{4},\mathtt{V}=[0,v,d_{2}+d_{3}] to get Un4=un+Lc4,𝚅U_{n4}=u_{n}+L_{c_{4},\mathtt{V}}.

  • (iv)

    Evaluate Lc5,𝚅𝟷L_{c_{5},\mathtt{V_{1}}} with 𝚝=c5,𝚅1=[0,v,2d2+2d3d4,(d2d3+d4)/c52]\mathtt{t}=c_{5},\mathtt{V}_{1}=[0,v,2d_{2}+2d_{3}-d_{4},(-d_{2}-d_{3}+d_{4})/c^{2}_{5}] and

  • (v)

    Evaluate Lc4,𝚅𝟸L_{c_{4},\mathtt{V_{2}}} with 𝚝=c4,𝚅2=[0,0,(d2+d3d4)/4,(d2d3+d4)]\mathtt{t}=c_{4},\mathtt{V}_{2}=[0,0,(d_{2}+d_{3}-d_{4})/4,(-d_{2}-d_{3}+d_{4})]
    to get Un5=un+Lc5,𝚅1+Lc4,𝚅𝟸U_{n5}=u_{n}+L_{c_{5},\mathtt{V}_{1}}+L_{c_{4},\mathtt{V_{2}}}.

  • (vi)

    Evaluate L1,𝚅L_{1,\mathtt{V}} with 𝚝=1,𝚅=[0,v,d4+5d5,4d48d5]\mathtt{t}=1,\mathtt{V}=[0,v,-d_{4}+5d_{5},4d_{4}-8d_{5}] to get un+1=un+L1,𝚅u_{n+1}=u_{n}+L_{1,\mathtt{V}}.

Since di=hDnid_{i}=hD_{ni} which depends on UniU_{ni}, these are the 6 (sequential) evaluations.

Implementation of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} (c2=c3=12,c4=c6=13,c5=56c_{2}=c_{3}=\frac{1}{2},c_{4}=c_{6}=\frac{1}{3},c_{5}=\frac{5}{6}): As discussed in Remark 3, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} can be implemented like a 4-stage method by evaluating the following 4 sequential evaluations, corresponding to 4 calls to phipm_simul_iom:

  • (i)

    Evaluate Lc2,𝚅L_{c_{2},\mathtt{V}} with 𝚝=c2,𝚅=[0,v]\mathtt{t}=c_{2},\mathtt{V}=[0,v] to get Un2=un+Lc2,𝚅U_{n2}=u_{n}+L_{c_{2},\mathtt{V}}.

  • (ii)

    Evaluate Lc4,𝚅L_{c_{4},\mathtt{V}} and Lc3,𝚅L_{c_{3},\mathtt{V}} simultaneously using 𝚝=[c4,c3],𝚅=[0,v,d2/c2]\mathtt{t}=[c_{4},c_{3}],\mathtt{V}=[0,v,d_{2}/c_{2}] to get both Un3=un+Lc3,𝚅U_{n3}=u_{n}+L_{c_{3},\mathtt{V}} and Un4=un+Lc4,𝚅U_{n4}=u_{n}+L_{c_{4},\mathtt{V}}.

  • (iii)

    Evaluate Lc5,𝚅L_{c_{5},\mathtt{V}} and Lc6,𝚅L_{c_{6},\mathtt{V}} simultaneously with 𝚝=[c6,c5],𝚅=[0,v,c4(c3c4)c3d3+c3(c3c4)c4d4,1(c3c4)c3d31(c3c4)c4d4]\mathtt{t}=[c_{6},c_{5}],\\ \mathtt{V}=[0,v,\tfrac{-c_{4}}{(c_{3}-c_{4})c_{3}}d_{3}+\tfrac{c_{3}}{(c_{3}-c_{4})c_{4}}d_{4},\tfrac{1}{(c_{3}-c_{4})c_{3}}d_{3}-\tfrac{1}{(c_{3}-c_{4})c_{4}}d_{4}] to get both Un5=un+Lc5,𝚅U_{n5}=u_{n}+L_{c_{5},\mathtt{V}} and Un6=un+Lc6,𝚅U_{n6}=u_{n}+L_{c_{6},\mathtt{V}}.

  • (iv)

    Evaluate L1,𝚅L_{1,\mathtt{V}} with 𝚝=1,𝚅=[0,v,1c5c6(c6c5d5+c5c6d6),2c5c6(1c5d51c6d6)]\mathtt{t}=1,\mathtt{V}=[0,v,\tfrac{1}{c_{5}-c_{6}}(\tfrac{-c_{6}}{c_{5}}d_{5}+\tfrac{c_{5}}{c_{6}}d_{6}),\tfrac{2}{c_{5}-c_{6}}(\tfrac{1}{c_{5}}d_{5}-\tfrac{1}{c_{6}}d_{6})] to get un+1=un+L1,𝚅u_{n+1}=u_{n}+L_{1,\mathtt{V}}.

Implementation of 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} (c2=c3=c5=12,c4=14,c6=15,c7=23,c8=1c_{2}=c_{3}=c_{5}=\frac{1}{2},c_{4}=\frac{1}{4},c_{6}=\frac{1}{5},c_{7}=\frac{2}{3},c_{8}=1): As discussed in Remark 1, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} requires a sequential implementation of 11 different linear combinations of the form (51), corresponding to the following 11 calls to phipm_simul_iom:

  • (i)

    Evaluate Lc2,𝚅L_{c_{2},\mathtt{V}} with 𝚝=c2,𝚅=[0,v]\mathtt{t}=c_{2},\mathtt{V}=[0,v] to get Un2=un+Lc2,𝚅U_{n2}=u_{n}+L_{c_{2},\mathtt{V}}.

  • (ii)

    Evaluate Lc3,𝚅L_{c_{3},\mathtt{V}} with 𝚝=c3,𝚅=[0,v,d2/c32]\mathtt{t}=c_{3},\mathtt{V}=[0,v,d_{2}/c^{2}_{3}] to get Un3=un+Lc3,𝚅U_{n3}=u_{n}+L_{c_{3},\mathtt{V}}.

  • (iii)

    Evaluate Lc4,𝚅L_{c_{4},\mathtt{V}} with 𝚝=c4,𝚅=[0,v,d3/c42]\mathtt{t}=c_{4},\mathtt{V}=[0,v,d_{3}/c^{2}_{4}] to get Un4=un+Lc4,𝚅U_{n4}=u_{n}+L_{c_{4},\mathtt{V}}.

  • (iv)

    Evaluate Lc5,𝚅L_{c_{5},\mathtt{V}} with 𝚝=c5,𝚅=[0,v,(d3+4d4)/c52,(2d34d4)/c53]\mathtt{t}=c_{5},\mathtt{V}=[0,v,(-d_{3}+4d_{4})/c^{2}_{5},(2d_{3}-4d_{4})/c^{3}_{5}] to get Un5=un+Lc5,𝚅U_{n5}=u_{n}+L_{c_{5},\mathtt{V}}.

  • (v)

    Evaluate Lc6,𝚅L_{c_{6},\mathtt{V}} with 𝚝=c6,𝚅=[0,v,(8d42d5)/c62,(32d4+16d5)/c63]\mathtt{t}=c_{6},\mathtt{V}=[0,v,(8d_{4}-2d_{5})/c^{2}_{6},(-32d_{4}+16d_{5})/c^{3}_{6}] to get Un6=un+Lc6,𝚅U_{n6}=u_{n}+L_{c_{6},\mathtt{V}}.

  • (vi)

    Evaluate Lc7,𝚅𝟷L_{c_{7},\mathtt{V_{1}}} with 𝚝=c7,𝚅1=[0,v,(1627d5+10027d6)/c72,(32081d580081dn6)/c73]\mathtt{t}=c_{7},\mathtt{V}_{1}=[0,v,(\tfrac{-16}{27}d_{5}+\tfrac{100}{27}d_{6})/c^{2}_{7},(\tfrac{320}{81}d_{5}-\tfrac{800}{81}d_{n6})/c^{3}_{7}] and

  • (vii)

    Evaluate Lc6,𝚅𝟸L_{c_{6},\mathtt{V_{2}}} with 𝚝=c6,𝚅2=[0,0,(2081d4+5243d5+125486d6)/c62,(1681d44243d550243d6)/c63]\mathtt{t}=c_{6},\mathtt{V}_{2}=[0,0,(\tfrac{-20}{81}d_{4}+\tfrac{5}{243}d_{5}+\tfrac{125}{486}d_{6})/c^{2}_{6},(\tfrac{16}{81}d_{4}-\tfrac{4}{243}d_{5}-\tfrac{50}{243}d_{6})/c^{3}_{6}] to get Un7=un+Lc7,𝚅1+Lc6,𝚅𝟸U_{n7}=u_{n}+L_{c_{7},\mathtt{V}_{1}}+L_{c_{6},\mathtt{V_{2}}}.

  • (viii)

    Evaluate Lc8,𝚅𝟷L_{c_{8},\mathtt{V_{1}}} with 𝚝=c8,𝚅1=[0,v,(163d5+25021d6+2714d7)/c82,(2083d52503d627d7)/c83,(240d5+15007d6+8107d7)/c84]\mathtt{t}=c_{8},\mathtt{V}_{1}=[0,v,(\tfrac{-16}{3}d_{5}+\tfrac{250}{21}d_{6}+\tfrac{27}{14}d_{7})/c^{2}_{8},(\tfrac{208}{3}d_{5}-\tfrac{250}{3}d_{6}-27d_{7})/c^{3}_{8},(-240d_{5}+\tfrac{1500}{7}d_{6}+\tfrac{810}{7}d_{7})/c^{4}_{8}] and

  • (ix)

    Evaluate Lc6,𝚅𝟸L_{c_{6},\mathtt{V_{2}}} with 𝚝=c6,𝚅2=[0,0,(47d5+2549d6+2798d7)/c62,(85d5107d62735d7)/c63,(4835d5+6049d6+162245d7)/c64]\mathtt{t}=c_{6},\\ \mathtt{V}_{2}=[0,0,(\tfrac{-4}{7}d_{5}+\tfrac{25}{49}d_{6}+\tfrac{27}{98}d_{7})/c^{2}_{6},(\tfrac{8}{5}d_{5}-\tfrac{10}{7}d_{6}-\tfrac{27}{35}d_{7})/c^{3}_{6},(\tfrac{-48}{35}d_{5}+\tfrac{60}{49}d_{6}+\tfrac{162}{245}d_{7})/c^{4}_{6}] and

  • (x)

    Evaluate Lc7,𝚅𝟹L_{c_{7},\mathtt{V_{3}}} with 𝚝=c7,𝚅3=[0,0,(28835d5+36049d6+972245d7)/c72,(3845d54807d6129635d7)/c73,(15367d5+960049d6+518449d7)/c74]\mathtt{t}=c_{7},\mathtt{V}_{3}=[0,0,(\tfrac{-288}{35}d_{5}+\tfrac{360}{49}d_{6}+\tfrac{972}{245}d_{7})/c^{2}_{7},(\tfrac{384}{5}d_{5}-\tfrac{480}{7}d_{6}-\tfrac{1296}{35}d_{7})/c^{3}_{7},(\tfrac{-1536}{7}d_{5}+\tfrac{9600}{49}d_{6}+\tfrac{5184}{49}d_{7})/c^{4}_{7}]
    to get Un8=un+Lc8,𝚅1+Lc6,𝚅2+Lc7,𝚅𝟹U_{n8}=u_{n}+L_{c_{8},\mathtt{V}_{1}}+L_{c_{6},\mathtt{V}_{2}}+L_{c_{7},\mathtt{V_{3}}}.

  • (xi)

    Evaluate L1,𝚅L_{1,\mathtt{V}} with 𝚝=1,𝚅=[0,v,12514d62714d7+12d8,62514d6+1627d7132d8,112514d64057d7+452d8]\mathtt{t}=1,\mathtt{V}=[0,v,\tfrac{125}{14}d_{6}-\tfrac{27}{14}d_{7}+\tfrac{1}{2}d_{8},\tfrac{-625}{14}d_{6}+\tfrac{162}{7}d_{7}-\tfrac{13}{2}d_{8},\tfrac{1125}{14}d_{6}-\tfrac{405}{7}d_{7}+\tfrac{45}{2}d_{8}] to get un+1=un+L1,𝚅u_{n+1}=u_{n}+L_{1,\mathtt{V}}.

Implementation of 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} (c2=c3=c5=12c_{2}=c_{3}=c_{5}=\tfrac{1}{2}, c4=c6=13c_{4}=c_{6}=\tfrac{1}{3}, c7=14c_{7}=\tfrac{1}{4}, c8=310c_{8}=\tfrac{3}{10}, c9=34c_{9}=\tfrac{3}{4}, and c10=1c_{10}=1): As discussed in Remark 4, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} can be implemented like a 5-stage method by evaluating the following 5 sequential evaluations, corresponding to 5 calls to phipm_simul_iom:

  • (i)

    Evaluate Lc2,𝚅L_{c_{2},\mathtt{V}} with 𝚝=c2,𝚅=[0,v]\mathtt{t}=c_{2},\mathtt{V}=[0,v] to get Un2=un+Lc2,𝚅U_{n2}=u_{n}+L_{c_{2},\mathtt{V}}.

  • (ii)

    Evaluate Lc4,𝚅L_{c_{4},\mathtt{V}} and Lc3,𝚅L_{c_{3},\mathtt{V}} simultaneously using 𝚝=[c4,c3],𝚅=[0,v,d2/c2]\mathtt{t}=[c_{4},c_{3}],\mathtt{V}=[0,v,d_{2}/c_{2}] to get both Un3=un+Lc3,𝚅U_{n3}=u_{n}+L_{c_{3},\mathtt{V}} and Un4=un+Lc4,𝚅U_{n4}=u_{n}+L_{c_{4},\mathtt{V}}.

  • (iii)

    Evaluate Lc5,𝚅L_{c_{5},\mathtt{V}}, Lc6,𝚅L_{c_{6},\mathtt{V}}, and Lc7,𝚅L_{c_{7},\mathtt{V}} simultaneously using 𝚝=[c7,c6,c5]\mathtt{t}=[c_{7},c_{6},c_{5}],
    𝚅=[0,v,c4c3(c4c3)d3+c3c4(c3c4)d4,2c3(c3c4)d32c4(c3c4)d4]\mathtt{V}=[0,v,\tfrac{c_{4}}{c_{3}(c_{4}-c_{3})}d_{3}+\tfrac{c_{3}}{c_{4}(c_{3}-c_{4})}d_{4},\tfrac{2}{c_{3}(c_{3}-c_{4})}d_{3}-\tfrac{2}{c_{4}(c_{3}-c_{4})}d_{4}]
    to get Un5=un+Lc5,𝚅U_{n5}=u_{n}+L_{c_{5},\mathtt{V}}, Un6=un+Lc6,𝚅U_{n6}=u_{n}+L_{c_{6},\mathtt{V}}, Un7=un+Lc7,𝚅U_{n7}=u_{n}+L_{c_{7},\mathtt{V}}.

  • (iv)

    Evaluate Lc8,𝚅L_{c_{8},\mathtt{V}}, Lc9,𝚅L_{c_{9},\mathtt{V}}, and Lc10,𝚅L_{c_{10},\mathtt{V}} simultaneously using 𝚝=[c9,c10,c8]\mathtt{t}=[c_{9},c_{10},c_{8}],
    𝚅=[0,v,α5d5+α6d6+α7d7,β5d5β6d6β7d7,γ5d5+γ6d6+γ7d7]\mathtt{V}=[0,v,\alpha_{5}d_{5}+\alpha_{6}d_{6}+\alpha_{7}d_{7},\beta_{5}d_{5}-\beta_{6}d_{6}-\beta_{7}d_{7},\gamma_{5}d_{5}+\gamma_{6}d_{6}+\gamma_{7}d_{7}]
    to get Un8=un+Lc8,𝚅U_{n8}=u_{n}+L_{c_{8},\mathtt{V}}, Un9=un+Lc9,𝚅U_{n9}=u_{n}+L_{c_{9},\mathtt{V}}, Un10=un+Lc10,𝚅U_{n10}=u_{n}+L_{c_{10},\mathtt{V}}.

  • (v)

    Evaluate L1,𝚅L_{1,\mathtt{V}} with 𝚝=1,𝚅=[0,v,α8d8+α9d9+α10d10,β8d8+β9d9+β10d10,γ8d8+γ9d9+γ10d10]\mathtt{t}=1,\mathtt{V}=[0,v,\alpha_{8}d_{8}+\alpha_{9}d_{9}+\alpha_{10}d_{10},\beta_{8}d_{8}+\beta_{9}d_{9}+\beta_{10}d_{10},\gamma_{8}d_{8}+\gamma_{9}d_{9}+\gamma_{10}d_{10}] to get un+1=un+L1,𝚅u_{n+1}=u_{n}+L_{1,\mathtt{V}} (coefficients αi,βi,γi\alpha_{i},\beta_{i},\gamma_{i} are given in (46)).

6 Numerical experiments

In this section, we demonstrate the efficiency of our newly derived fourth- and fifth-order expRK time integration methods (𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6}, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}). Specifically, we will compare their performance against the existing methods of the same orders (𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}) on several examples of stiff PDEs. All the numerical simulations are performed in MATLAB on a single workstation, using an iMac 3.6 GHz Intel Core i7, 32 GB 2400 MHz DDR4.

Example 1

(A one-dimensional semilinear parabolic problem HO05 ): We first verify the order of convergence for the new derived fourth- and fifth-order expRK schemes (𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6}, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}) by considering the following PDE for u(x,t)u(x,t), x[0,1],t[0,1]x\in[0,1],t\in[0,1], and subject to homogeneous Dirichlet boundary conditions,

u(x,t)t2u(x,t)x2=11+u2(x,t)+Φ(x,t),\frac{\partial u(x,t)}{\partial t}-\frac{\partial^{2}u(x,t)}{\partial x^{2}}=\frac{1}{1+u^{2}(x,t)}+\Phi(x,t), (53)

whose exact solution is known to be u(x,t)=x(1x)etu(x,t)=x(1-x){\rm e}\hskip 1.0pt^{t} for a suitable choice of the source function Φ(x,t)\Phi(x,t).

Spatial discretization: For this example, we use standard second order finite differences with 200200 grid points. This leads to a very stiff system of the form (1) (with A1.6×105\|A\|_{\infty}\approx 1.6\times 10^{5}).

The resulting system is then integrated on the time interval [0,1][0,1] using constant step sizes, corresponding to the number of time steps N=4,8,16,32,64N=4,8,16,32,64. The time integration errors at the final time t=1t=1 are measured in the maximum norm.

In Figure 1, we plot orders for all the employed integrators in the left diagram and the total CPU time versus the global errors in the right diagram. The left diagram clearly shows a perfect agreement with our convergence result in Theorem 3.1, meaning that the two new integrators 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} fully achieve orders 4 and 5, respectively. When compared to the old integrators of the same orders 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, we note that, given the same number of time steps, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is slightly more accurate but is much faster than 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} (see the right diagram). In a similar manner, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} gives almost identical global errors but is also much faster than 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}. Finally, we observe that, for this example, for a global error that is larger than 10610^{-6}, the new fourth-order method 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is the fastest one, and for more stringent errors, 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} is the fastest integrator.

Refer to caption    Refer to caption
Figure 1: Order plots (left) and total CPU times (right) of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} , 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} when applied to (53). The global errors at time t=1t=1 are plotted as functions of the number of time steps (left) and the total CPU time in second (right). For comparison, straight lines with slopes 4 and 5 are added.
Example 2

(A nonlinear Schrödinger equation Cazenave1989 ; Berland2005 ): We consider the following one-dimensional nonlinear Schrödinger (NLS) equation with periodic boundary conditions

𝚒Ψ(x,t)t\displaystyle\mathtt{i}\frac{\partial\Psi(x,t)}{\partial t} =2Ψ(x,t)x2+(V(x)+λ|Ψ(x,t)|2)Ψ(x,t),\displaystyle=-\frac{\partial^{2}\Psi(x,t)}{\partial x^{2}}+\big{(}V(x)+\lambda|\Psi(x,t)|^{2}\big{)}\Psi(x,t), (54)
Ψ(π,t)\displaystyle\Psi(-\pi,t) =Ψ(π,t),t0\displaystyle=\Psi(\pi,t),\quad t\geq 0
Ψ(0,t)\displaystyle\Psi(0,t) =Ψ0(x),x[π,π]\displaystyle=\Psi_{0}(x),\quad x\in[-\pi,\pi]

where the potential function V(x)=11+sin2(x)V(x)=\dfrac{1}{1+\sin^{2}(x)}, the initial condition Ψ0(x)=esin(2x)\Psi_{0}(x)={\rm e}\hskip 1.0pt^{\sin(2x)}, and the constant λ=1\lambda=1 (see Berland2005 ).

Spatial discretization: For this example, we use a discrete Fourier transform \mathcal{F} with ND=128ND=128 modes, leading to a mildly stiff system of the form (1) with

A\displaystyle A =diag(𝚒k2),k=ND2+1,,ND2=63,,64\displaystyle=\text{diag}(-\mathtt{i}k^{2}),\ k=-\frac{ND}{2}+1,\cdots,\frac{ND}{2}=-63,\cdots,64 (55)
g(t,u)\displaystyle g(t,u) =𝚒((V(x)+λ|1(u)|2)1(u).\displaystyle=-\mathtt{i}\mathcal{F}((V(x)+\lambda|\mathcal{F}^{-1}(u)|^{2})\mathcal{F}^{-1}(u).

Next, we integrate this system on the time interval [0,3][0,3] with constant step sizes, corresponding to the number of time steps N=64,128N=64,128, 256,512,1024256,512,1024. Since the exact solution Ψ(x,t)\Psi(x,t) of (54) is unknown, a reliable reference solution is computed by the stiff solver 𝚘𝚍𝚎𝟷𝟻𝚜\mathtt{ode15s} with ATOL=RTOL=1014ATOL=RTOL=10^{-14}. Again, the time integration errors are measured in a discrete maximum norm at the final time t=3t=3.

As seen from the two double-logarithmic diagrams in Figure 2, we plot the accuracy of the four employed integrators (𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} , 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}) as functions of the number of time steps (left) and the total CPU time (right). The left digram clearly indicates that the two new integrators 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} achieve their corresponding expected orders 4 and 5. While 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} is a little more accurate than 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is much more accurate than 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} for a given same number of time steps, meaning that it can take much larger time steps while achieving the same accuracy. Moreover, the right precision digram displays the efficiency plot indicating that both 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} are much faster than their counterparts 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, respectively. More specifically, a similar story is observed: for lower accuracy requirements, say error 107\sim 10^{-7}, the new fourth-order method 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is the most efficient, whereas for error 108\sim 10^{-8} or tighter the new fifth-order method 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} is the most efficient.

Refer to caption    Refer to caption
Figure 2: Order plots (left) and total CPU times (right) of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} , 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} when applied to Example 2. The errors at time t=3t=3 are plotted as functions of the number of time steps (left) and the total CPU time in second (right). For comparison, straight lines with slopes 4 and 5 are added.
Example 3

(A 2D Gray–Scott model Gray1984 ; Berland2007 ): Consider the following two-dimensional reaction-diffusion equation–the Gray–Scott equation model, for u=u(x,y,t),v=v(x,y,t)u=u(x,y,t),\ v=v(x,y,t) on the square Ω=[0,L]2\Omega=[0,L]^{2}, (here, we choose L=1.5L=1.5) subject to periodic boundary conditions

ut\displaystyle\frac{\partial u}{\partial t} =duΔuuv2+α(1u),\displaystyle=d_{u}\Delta u-uv^{2}+\alpha(1-u), (56)
vt\displaystyle\frac{\partial v}{\partial t} =dvΔv+uv2(α+β)v,\displaystyle=d_{v}\Delta v+uv^{2}-(\alpha+\beta)v,

where Δ\Delta is the Laplacian operator, the diffusion coefficients du=0.02,dv=0.01d_{u}=0.02,\ d_{v}=0.01, and the bifurcation parameters α=0.065,β=0.035\alpha=0.065,\ \beta=0.035. The initial conditions are Gaussian pulses

u(x,y,0)=1e150((xL)2+(yL)2),v(x,y,0)=e150((xL)2+2(yL)2).u(x,y,0)=1-e^{-150\big{(}(x-L)^{2}+(y-L)^{2}\big{)}},\ v(x,y,0)=e^{-150\big{(}(x-L)^{2}+2(y-L)^{2}\big{)}}.

Spatial discretization: For this example, we use standard second order finite differences using 150 grid points in each direction with mesh width Δx=Δy=L/150\Delta x=\Delta y=L/150. This gives a stiff system of the form (1).

The system is then solved on the time interval [0,2][0,2] using constant step sizes. In the absence of an analytical solution of (56), a high-accuracy reference solution is computed using the 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} method with a sufficient small time step. Errors are measured in a discrete maximum norm at the final time t=2t=2.

In Figure 3, using the same number of time steps N=32,64,128N=32,64,128, 256,512,1024256,512,1024, we again display the order plots of the taken integrators. One can see that 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is much more accurate than 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} is slightly more accurate than 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}.

Refer to caption
Figure 3: Order plots of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} , 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} when applied to Example 3. The errors at time t=2t=2 are plotted as functions of the number of time steps. For comparison, straight lines with slopes 4 and 5 are added.

In Figure 4, we display the efficiency plot for which the time step sizes were chosen for each integrator to obtain about same error thresholds 10i,i=5,,1110^{-i},\ i=5,\cdots,11 (The corresponding number of time steps for each integrator are displayed in Table 2. As seen, given about the same level of accuracy, the new methods use smaller steps than the old ones of the same order, meaning that they can take larger step sizes). Again, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} is much faster than 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} and it is interesting that this new fourth-order method turns out to be the most efficient (although for error thresholds tighter than 101110^{-11} the new fifth-order method 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} seems to become the most efficient).

Refer to caption
Figure 4: Total CPU times of 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5}, 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} , 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8}, and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} when applied to Example 3. The time step sizes were chosen in such a way that each integrator achieves about the same error thresholds 10i,i=5,,1110^{-i},\ i=5,\cdots,11. The errors at time t=2t=2 are plotted as functions of the total CPU time (in second).
Method Error threshold vs. Number of time steps
10510^{-5} 10610^{-6} 10710^{-7} 10810^{-8} 10910^{-9} 101010^{-10} 101110^{-11}
𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟻\mathtt{expRK4s5} 18 36 66 121 215 385 685
𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} 10 19 28 46 122 230 420
𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟾\mathtt{expRK5s8} 7 18 33 57 92 149 238
𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10} 8 17 30 51 82 130 208
Table 2: The number of time steps taken to achieve about the same error thresholds 10i,i=5,,1110^{-i},\ i=5,\cdots,11.

The numerical results presented on the three examples above clearly confirm the advantage of constructing parallel stages expRK methods based on Theorem 3.1, leading to more efficient and accurate methods 𝚎𝚡𝚙𝚁𝙺𝟺𝚜𝟼\mathtt{expRK4s6} and 𝚎𝚡𝚙𝚁𝙺𝟻𝚜𝟷𝟶\mathtt{expRK5s10}.

Acknowledgements.
The author would like to thank Reviewer 1 for the valuable comments and helpful suggestions. He would like also to thank the National Science Foundation, which supported this research under award NSF DMS–2012022.

References

  • (1) Al-Mohy, A.H., Higham, N.J.: Computing the action of the matrix exponential, with an application to exponential integrators. SIAM J. Sci. Comput. 33, 488–511 (2011)
  • (2) Berland, H., Skaflestad, B.: Solving the nonlinear Schrödinger equation using exponential integrators. Tech. rep. (2005)
  • (3) Berland, H., Skaflestad, B., Wright, W.M.: Expint—a matlab package for exponential integrators. ACM Transactions on Mathematical Software (TOMS) 33(1), 4–es (2007)
  • (4) Caliari, M., Kandolf, P., Ostermann, A., Rainer, S.: The Leja method revisited: Backward error analysis for the matrix exponential. SIAM J. Sci. Comp. 38(3), A1639–A1661 (2016)
  • (5) Cazenave, T.: An introduction to nonlinear Schrödinger equations, vol. 22. Universidade Federal do Rio de Janeiro, Centro de Ciências Matemáticas e da … (1989)
  • (6) Gaudreault, S., Pudykiewicz, J.: An efficient exponential time integration method for the numerical solution of the shallow water equations on the sphere. J. Comput. Phys. 322, 827–848 (2016)
  • (7) Gray, P., Scott, S.: Autocatalytic reactions in the isothermal, continuous stirred tank reactor: Oscillations and instabilities in the system A+2B3B;BCA+2B\rightarrow 3B;B\rightarrow C. Chemical Engineering Science 39(6), 1087–1097 (1984)
  • (8) Hochbruck, M., Ostermann, A.: Explicit exponential Runge–Kutta methods for semilinear parabolic problems. SIAM J. Numer. Anal. 43, 1069–1090 (2005)
  • (9) Hochbruck, M., Ostermann, A.: Exponential integrators. Acta Numerica 19, 209–286 (2010)
  • (10) Ju, L., Wang, Z.: Exponential time differencing Gauge method for incompressible viscous flows. Communications in Computational Physics 22(2), 517–541 (2017)
  • (11) Luan, V.T.: High-order exponential integrators. Ph.D. thesis, University of Innsbruck (2014)
  • (12) Luan, V.T.: Fourth-order two-stage explicit exponential integrators for time-dependent PDEs. Applied Numerical Mathematics 112, 91–103 (2017)
  • (13) Luan, V.T., Michels, D.: Explicit exponential Rosenbrock methods and their application in visual computing. (Revised) (2020)
  • (14) Luan, V.T., Ostermann, A.: Exponential B-series: The stiff case. SIAM J. Numer. Anal. 51, 3431–3445 (2013)
  • (15) Luan, V.T., Ostermann, A.: Explicit exponential Runge–Kutta methods of high order for parabolic problems. J. Comput. Appl. Math. 256, 168–179 (2014)
  • (16) Luan, V.T., Ostermann, A.: Exponential Rosenbrock methods of order five–construction, analysis and numerical comparisons. J. Comput. Appl. Math. 255, 417–431 (2014)
  • (17) Luan, V.T., Ostermann, A.: Stiff order conditions for exponential Runge–Kutta methods of order five. In: H.B. et al. (ed.) Modeling, Simulation and Optimization of Complex Processes - HPSC 2012, pp. 133–143. Springer (2014)
  • (18) Luan, V.T., Ostermann, A.: Parallel exponential Rosenbrock methods. Comput. Math. Appl. 71, 1137–1150 (2016)
  • (19) Luan, V.T., Pudykiewicz, J.A., Reynolds, D.R.: Further development of efficient and accurate time integration schemes for meteorological models. J. Comput. Phys. 376, 817–837 (2019)
  • (20) Michels, D.L., Luan, V.T., Tokman, M.: A stiffly accurate integrator for elastodynamic problems. ACM Transactions on Graphics (TOG) 36(4), 116 (2017)
  • (21) Niesen, J., Wright, W.M.: Algorithm 919: A Krylov subspace algorithm for evaluating the φ\varphi-functions appearing in exponential integrators. ACM Trans. Math. Soft. (TOMS) 38(3), 22 (2012)
  • (22) Pieper, K., Sockwell, K.C., Gunzburger, M.: Exponential time differencing for mimetic multilayer ocean models. J. Comput. Phys. 398, 817–837 (2019)