Stability of variable-step BDF2 and BDF3 methods

Zhaoyi Li Hong-lin Liao Department of Mathematics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, P. R. China. Email: [email protected] author. ORCID 0000-0003-0777-6832. Department of Mathematics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Key Laboratory of Mathematical Modelling and High Performance Computing of Air Vehicles (NUAA), MIIT, Nanjing 211106, China. Hong-lin Liao ([email protected] and [email protected]) is supported by a grant 12071216 from National Natural Science Foundation of China.

Abstract

We prove that the two-step backward differentiation formula (BDF2) method is stable on arbitrary time grids; while the variable-step BDF3 scheme is stable if almost all adjacent step ratios are less than 2.553. These results relax the severe mesh restrictions in the literature and provide a new understanding of variable-step BDF methods. Our main tools include the discrete orthogonal convolution kernels and an elliptic-type matrix norm.
Keywords: variable-step BDF formula, stability, step-ratio condition, discrete orthogonal convolution kernels
AMS subject classifications: 65M06, 65M12

1 Introduction

In this paper, we revisit the stability of variable-step backward differentiation formulas (BDF) for the following initial value problem

\displaystyle\frac{\,\mathrm{d}v}{\,\mathrm{d}t}=f(t,v)\quad\text{for $0<t\leq T$}.

Choose time levels $0=t_{0}<t_{1}<t_{2}<\cdots<t_{N}=T$ with the variable time-step $\tau_{k}:=t_{k}-t_{k-1}$ for $1\leq k\leq N$ , and the maximum step size $\tau:=\max_{1\leq k\leq N}\tau_{k}$ . Put the adjacent time-step ratios

r_{k}:=\frac{\tau_{k}}{\tau_{k-1}}\quad\text{for $2\leq k\leq N$}\quad\text{with}\quad r_{1}:=0.

For any sequences $\{v^{n}\}_{n=0}^{N}$ , denote $\partial_{\tau}v^{n}:=(v^{n}-v^{n-1})/\tau_{n}$ . Let $\Pi_{n,k}v$ ( $k=1,2,3$ ) be the interpolating polynomial of a function $v$ over the nodes $t_{n-k}$ , $\cdots$ , $t_{n-1}$ and $t_{n}$ . To describe the variable coefficients, define the following three functions

$\displaystyle d_{0}(x,y):=$	$\displaystyle\,\frac{1+2x}{1+x}+\frac{xy}{1+y+xy},$	(1.1)
$\displaystyle d_{1}(x,y):=$	$\displaystyle\,-\frac{x}{1+x}-\frac{xy}{1+y+xy}-\frac{xy^{2}}{1+y+xy}\frac{1+x}{1+y},$	(1.2)
$\displaystyle d_{2}(x,y):=$	$\displaystyle\,\frac{xy^{2}}{1+y+xy}\frac{1+x}{1+y},\qquad\text{for $x,y\geq 0$.}$	(1.3)

Taking $v^{n}=v(t_{n})$ , one has the BDF1 formula $D_{1}v^{n}:=\left(\Pi_{n,1}v\right)^{\prime}(t_{n})=d_{0}(0,0)\partial_{\tau}v^{n}$ for $n\geq 1$ , and the variable-step BDF2 formula

\displaystyle D_{2}v^{n}:=\left(\Pi_{n,2}v\right)^{\prime}(t_{n})=d_{0}(r_{n},0)\partial_{\tau}v^{n}+d_{1}(r_{n},0)\partial_{\tau}v^{n-1}\quad\text{for $n\geq 2$}.

(1.4)

The variable-step BDF3 formula $D_{3}v^{n}:=\left(\Pi_{n,3}v\right)^{\prime}(t_{n})$ for $n\geq 3$ ,

\displaystyle D_{3}v^{n}=

\displaystyle\,d_{0}(r_{n},r_{n-1})\partial_{\tau}v^{n}+d_{1}(r_{n},r_{n-1})\partial_{\tau}v^{n-1}+d_{2}(r_{n},r_{n-1})\partial_{\tau}v^{n-2}.

(1.5)

As seen, the formula (1.5) also represents a variable-order variable-step BDF formula from the first-order to third-order. The theoretical results on (1.5) are naturally valid for any combinations of the BDF-k formulas if the time-step ratios are allowed to be zero in the associated coefficients.

We consider the variable-step BDF- $\mathrm{k}$ time-stepping scheme for the ODE problem,

\displaystyle D_{\mathrm{k}}v^{n}=f(t_{n},v^{n})\quad\text{for $\mathrm{k}\leq n\leq N$.}

(1.6)

Without losing the generality, assume that the starting values $u^{1}$ , $\cdots$ , $u^{\mathrm{k}-1}$ are available by some other high-order approaches, such as Runge-Kutta methods. For the stability analysis, assume further that the perturbed solution $\bar{v}^{n}$ solves the equation $D_{\mathrm{k}}\bar{v}^{n}=f(t_{n},\bar{v}^{n})+\varepsilon^{n}$ with a bounded sequence $\{\varepsilon^{n}\}$ . Then the difference $\tilde{v}^{i}=\bar{v}^{i}-v^{i}$ solves

\displaystyle D_{\mathrm{k}}\tilde{v}^{i}=f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})+\varepsilon^{i}\quad\text{for $\mathrm{k}\leq i\leq N$.}

(1.7)

These well-known stiff solvers have been tested to be practically valuable for differential-algebraic problems [5, 9, 10] and for the hyperbolic systems with multiscale relaxation [1]. Also, the variable-step versions are computationally efficient in capturing the multi-scale time behaviors [2, 4, 10, 11, 12, 13, 15] via adaptive time-stepping strategies. On the other hand, the stability and convergence analysis of variable-step BDF methods are difficult and remain incomplete, cf. [2, 4, 6, 7, 8, 9, 15], because they involve multiple degrees of freedom (independent time-step sizes).

One of the main defects in the existing theory is the severe mesh condition required for stability. For examples, the zero-stability of variable-step BDF2 method requires a step-ratio condition $0<r_{k}\leq 1+\sqrt{2}\approx 2.414$ , see [6, 7]. By considering a specific time grid with constant step-ratio $r_{k}=r$ , it was also shown in [6] that the zero-stability of variable-step BDF3 method requires the maximum step-ratio limit $R_{3}<(1+\sqrt{5})/{2}\approx 1.618$ . By using an elliptic type norm, Calvo, Grande and Grigorieff [2] gave the zero-stability constraint $r_{k}\leq 1.476$ in 1990. Twenty years ago, Guglielmi and Zennaro [8] applied a spectral radius approach to find the step-ratio condition $r_{k}\leq 1.501$ , which seems to be the best result for the variable-step BDF3 method.

We aim to relax the existing mesh conditions for the perturbed stability of variable-step BDF2 and BDF3 methods according to the following definition.

Definition 1 (Stability).

Assume that the nonlinear function $f(t,v)$ is a Lipschitz-continuous function with the Lipschitz constant $L_{f}>0$ . The $\mathrm{k}$ -step time-stepping (1.6) with unequal time-steps is stable if there exists a real number $\tau_{0}$ and a fixed constant $C$ (independent of the step sizes $\tau_{n}$ ) such that the numerical solution of (1.7) fulfills

\displaystyle\max_{\mathrm{k}\leq n\leq N}\big{|}\tilde{v}^{i}\big{|}\leq C\left(\sum_{i=0}^{\mathrm{k}-1}\big{|}\tilde{v}^{i}\big{|}+\tau\sum_{i=1}^{\mathrm{k}-1}\big{|}\partial_{\tau}\tilde{v}^{i}\big{|}+\max_{\mathrm{k}\leq i\leq N}\big{|}\varepsilon^{i}\big{|}\right)\quad\text{for all $\tau_{n}\leq\tau_{0}$.}

This definition is practically reasonable in the sense that one always uses some high-order scheme to start the $\mathrm{k}$ -step time-stepping (1.6). In the uniform time-step case, it reduces into the classical zero-stability, cf. [3, Definition 2.2],

\displaystyle\max_{\mathrm{k}\leq n\leq N}\big{|}\tilde{v}^{i}\big{|}\leq C\left(\sum_{i=0}^{\mathrm{k}-1}\big{|}\tilde{v}^{i}\big{|}+\max_{\mathrm{k}\leq i\leq N}\big{|}\varepsilon^{i}\big{|}\right)\quad\text{for all $\tau_{n}\leq\tau_{0}$.}

Definition 1 would be as useful as the original zero-stability for the numerical analysis of variable-step, multi-step methods. Actually, it was proven in [14] that the step-ratio condition $r_{k}<3.561$ is sufficient to the stability of variable-step BDF2 method for linear parabolic problem. In a recent work [12], this step-ratio condition was slightly improved to a new one, $r_{k}<4.864$ . These step-ratio conditions are weaker than the classical zero-stability restriction $r_{k}<2.414$ , but both of them are are sufficient for the stability of BDF2 method in the sense of Definition 1.

By developing a new framework with discrete orthogonal convolution (DOC) kernels, we apply Definition 1 to examine the ratio-step conditions for the stability of BDF2 and BDF3 methods in solving the ODE problems. The main contribution is two-fold:

•

Theorem 2.1 shows that the BDF2 method is stable on arbitrary time grids.
•

Theorem 3.1 shows that the variable-step BDF3 method is stable if the adjacent step ratios $r_{k}<R_{3},$ where $R_{3}\approx 2.553$ is the unique positive root of

$R_{3}^{6}+R_{3}^{5}-4R_{3}^{4}-8R_{3}^{3}-10R_{3}^{2}-6R_{3}-2=0.$

We start from a novel viewpoint by writing the BDF- $\mathrm{k}$ ( $\mathrm{k}=2,3$ ) formula as a convolution summation of local difference quotients (this viewpoint is quite different from that in previous works [12, 14])

\displaystyle D_{\mathrm{k}}v^{n}:=\sum_{j=1}^{n}d^{(\mathrm{k},n)}_{n-j}\partial_{\tau}v^{j}\quad\text{for $n\geq\mathrm{k}$,}

(1.8)

where the first superscript index $\mathrm{k}$ in the discrete kernels $d^{(\mathrm{k},n)}_{n-j}$ denotes the order of method. From the BDF2 formula (1.4) with $n\geq 2$ , we have

\displaystyle d^{(2,n)}_{j}:=d_{j}(r_{n},0)\;\;\text{for $j=0,1$,}\quad\text{and}\quad d^{(2,n)}_{j}:=0\quad\text{for $n\geq j+1\geq 3$}.

(1.9)

Similarly, from the BDF3 formula (1.5) with $n\geq 3$ , we have

\displaystyle d^{(3,n)}_{j}:=d_{j}(r_{n},r_{n-1})\;\;\text{for $j=0,1,2$,}\quad\text{and}\quad d^{(3,n)}_{j}:=0\quad\text{for $n\geq j+1\geq 4$}.

(1.10)

Here and hereafter, assume that the summation $\sum_{k=i}^{j}\cdot$ to be zero and the product $\prod_{k=i}^{j}\cdot$ to be one if the index $i>j$ . As for the BDF- $\mathrm{k}$ kernels $d^{(\mathrm{k},n)}_{n-j}$ with any fixed indexes $n$ , we recall a class of discrete orthogonal convolution (DOC) kernels $\big{\{}\vartheta_{n-j}^{(\mathrm{k},n)}\big{\}}_{j=\mathrm{k}}^{n}$ by a recursive procedure, also see [14, 12],

\displaystyle\vartheta_{0}^{(\mathrm{k},n)}:=\frac{1}{d^{(\mathrm{k},n)}_{0}}\quad\text{and}\quad\vartheta_{n-j}^{(\mathrm{k},n)}:=-\frac{1}{d^{(\mathrm{k},j)}_{0}}\sum_{i=j+1}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}d^{(\mathrm{k},i)}_{i-j}\quad\text{for $\mathrm{k}\leq j\leq n-1$.}

(1.11)

Obviously, the DOC kernels $\vartheta_{n-j}^{(\mathrm{k},n)}$ satisfy the following discrete orthogonality identity

\displaystyle\sum_{i=j}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}d^{(\mathrm{k},i)}_{i-j}\equiv\delta_{nj}\quad\text{for any $\mathrm{k}\leq j\leq n$,}

(1.12)

where $\delta_{nj}$ is the Kronecker delta symbol with $\delta_{nj}=0$ if $j\neq n$ . Furthermore, with the identity matrix $I_{m\times m}$ ( $m:=n-\mathrm{k}+1$ ), the above discrete orthogonality identity (1.12) also implies

\mathbf{\Theta}_{\mathrm{k}}\mathbf{D}_{\mathrm{k}}=I_{m\times m},

where the two $m\times m$ matrices $\mathbf{D}_{\mathrm{k}}$ and $\mathbf{\Theta}_{\mathrm{k}}$ are defined by

\mathbf{D}_{\mathrm{k}}:=\left(\begin{array}[]{cccc}d_{0}^{(\mathrm{k},\mathrm{k})}&&&\\ d_{1}^{(\mathrm{k},\mathrm{k}+1)}&d_{0}^{(\mathrm{k},\mathrm{k}+1)}&&\\ \vdots&\vdots&\ddots&\\ d_{n-\mathrm{k}}^{(\mathrm{k},n)}&d_{n-\mathrm{k}-1}^{(\mathrm{k},n)}&\cdots&d_{0}^{(\mathrm{k},n)}\\ \end{array}\right)\;\;\text{and}\;\;\mathbf{\Theta}_{\mathrm{k}}:=\left(\begin{array}[]{cccc}\vartheta_{0}^{(\mathrm{k},\mathrm{k})}&&&\\ \vartheta_{1}^{(\mathrm{k},\mathrm{k}+1)}&\vartheta_{0}^{(\mathrm{k},\mathrm{k}+1)}&&\\ \vdots&\vdots&\ddots&\\ \vartheta_{n-\mathrm{k}}^{(\mathrm{k},n)}&\vartheta_{n-\mathrm{k}-1}^{(\mathrm{k},n)}&\cdots&\vartheta_{0}^{(\mathrm{k},n)}\\ \end{array}\right).

Obviously, one has

\mathbf{D}_{\mathrm{k}}\mathbf{\Theta}_{\mathrm{k}}=I_{m\times m},

which implies the following mutual orthogonality identity

\displaystyle\sum_{i=j}^{n}d_{n-i}^{(\mathrm{k},n)}\vartheta^{(\mathrm{k},i)}_{i-j}\equiv\delta_{nj}\quad\text{for any $\mathrm{k}\leq j\leq n$.}

(1.13)

We will use this identity (1.13) to study the decaying property of DOC kernels $\vartheta_{n-j}^{(\mathrm{k},n)}$ .

We will derive an equivalent convolution form of the difference equation (1.7). By exchanging the summation order and using (1.12), one has

$\displaystyle\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}D_{\mathrm{k}}\tilde{v}^{i}=$	$\displaystyle\,\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}\sum_{j=1}^{\mathrm{k}-1}d^{(\mathrm{k},i)}_{i-j}\partial_{\tau}v^{j}+\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}\sum_{j=\mathrm{k}}^{i}d^{(\mathrm{k},i)}_{i-j}\partial_{\tau}v^{j}$
$\displaystyle=$	$\displaystyle\,\sum_{j=1}^{\mathrm{k}-1}\partial_{\tau}v^{j}\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}d^{(\mathrm{k},i)}_{i-j}+\sum_{j=\mathrm{k}}^{n}\partial_{\tau}v^{j}\sum_{i=j}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}d^{(\mathrm{k},i)}_{i-j}$
$\displaystyle=$	$\displaystyle\,\mathcal{I}_{\mathrm{k}}^{n}[v]+\partial_{\tau}v^{n}\qquad\text{for $n\geq\mathrm{k}$,}$	(1.14)

where $\mathcal{I}_{\mathrm{k}}^{n}[v]$ represents the starting effect on the numerical solution at $t_{n}$ ,

\displaystyle\mathcal{I}_{\mathrm{k}}^{n}[v]:=

\displaystyle\,\sum_{j=1}^{\mathrm{k}-1}\partial_{\tau}v^{j}\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}d^{(\mathrm{k},i)}_{i-j}\qquad\text{for $n\geq\mathrm{k}$.}

(1.15)

Multiplying both sides of the difference equation (1.7) by the DOC kernels $\vartheta_{n-i}^{(\mathrm{k},n)}$ , and summing $i$ from $\mathrm{k}$ to $n$ , we apply (1) to get

\displaystyle\partial_{\tau}\tilde{v}^{n}=-\mathcal{I}_{\mathrm{k}}^{n}[\tilde{v}]+\sum_{i=\mathrm{k}}^{n}\vartheta_{n-i}^{(\mathrm{k},n)}\left[f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})+\varepsilon^{i}\right]\quad\text{for $\mathrm{k}\leq n\leq N$.}

(1.16)

In next two sections, we will present the stability analysis for the variable-step BDF2 and BDF3 methods, respectively, via this discrete convolution form (1.16) of the perturbed equation. Numerical experiments on the graded and random time meshes are included in Section 4 to support our theory. Some concluding remarks end this article.

2 Unconditional stability of BDF2 method

For the BDF2 method, the associated DOC kernels are positive and decay exponentially.

Lemma 2.1.

The DOC kernels $\vartheta_{n-j}^{(2,n)}$ in (1.11) have an explicit formula

\displaystyle\vartheta_{n-j}^{(2,n)}=\frac{1}{d^{(2,j)}_{0}}\prod_{i=j+1}^{n}\frac{r_{i}}{1+2r_{i}}>0\quad\text{ for $2\leq j\leq n$},

and satisfy

\displaystyle\sum_{j=2}^{n}\vartheta_{n-j}^{(2,n)}=1-\prod_{i=2}^{n}\frac{r_{i}}{1+2r_{i}}<1\quad\text{for $n\geq 2$.}

Proof.

For any fixed $j$ , by taking $n=j$ in the identity (1.13), one has $\vartheta_{0}^{(2,j)}=1/d^{(2,j)}_{0}$ for $j\geq 2$ . According to the definition (1.9), one has $d^{(2,n)}_{n-i}=0$ for $n\geq i+2$ . The identity (1.13) gives

\displaystyle\vartheta_{m-j}^{(2,m)}=-\frac{d^{(2,{m})}_{1}}{d^{(2,m)}_{0}}\vartheta_{m-1-j}^{(2,m-1)}=-\frac{d_{1}(r_{m},0)}{d_{0}(r_{m},0)}\vartheta_{m-1-j}^{(2,m-1)}=\frac{r_{m}}{1+2r_{m}}\vartheta_{m-1-j}^{(2,m-1)}

for $m\geq j+1\geq 2$ . It yields immediately

\displaystyle\vartheta_{n-j}^{(2,n)}=

\displaystyle\,\vartheta_{n-2-j}^{(2,n-2)}\prod_{i=n-1}^{n}\frac{r_{i}}{1+2r_{i}}=\cdots=\vartheta_{0}^{(2,j)}\prod_{i=j+1}^{n}\frac{r_{i}}{1+2r_{i}}\quad\text{ for $2\leq j\leq n$}.

It leads to the first result. By taking $v^{n}=t_{n}$ in (1.8), one has $\partial_{\tau}t_{n}=1$ and $D_{2}t_{n}=1$ for $n\geq 2$ . Thus applying the equality (1) with $\mathrm{k}=2$ yields

	$\displaystyle\sum_{j=2}^{n}\vartheta_{n-j}^{(2,n)}=$	$\displaystyle\,\sum_{j=2}^{n}\vartheta_{n-j}^{(2,n)}D_{2}t_{j}=(\partial_{\tau}t_{1})\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}d^{(2,i)}_{i-1}+\partial_{\tau}t_{n}$
	$\displaystyle=$	$\displaystyle\,1+\vartheta_{n-2}^{(2,n)}d^{(2,2)}_{1}=1-\prod_{i=2}^{n}\frac{r_{i}}{1+2r_{i}}<1\quad\text{for $n\geq 1$,}$		(2.1)

where the explicit formula of $\vartheta_{n-2}^{(2,n)}$ and the definition (1.9) were used. It completes the proof. ∎

Theorem 2.1.

If $\tau\leq 1/(4L_{f})$ , the BDF2 solution of (1.7) satisfies

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq 2\exp(4L_{f}t_{n-1})\Big{(}\big{|}\tilde{v}^{1}\big{|}+2\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+2t_{n}\max_{2\leq i\leq n}\big{|}\varepsilon^{i}\big{|}\Big{)}\quad\text{for $2\leq n\leq N$.}

Thus the variable-step BDF2 scheme is stable on arbitrary time meshes.

Proof.

Take $\mathrm{k}=2$ . Multiplying both sides of (1.16) with $2\tau_{n}\tilde{v}^{n}$ , one applies Lemma 2.1 (the positivity and boundedness of DOC kernels) to obtain that

	$\displaystyle\big{\|}\tilde{v}^{n}\big{\|}^{2}-$	$\displaystyle\,\big{\|}\tilde{v}^{n-1}\big{\|}^{2}+\big{\|}\tau_{n}\partial_{\tau}\tilde{v}^{n}\big{\|}^{2}=-2\tau_{n}\tilde{v}^{n}\mathcal{I}_{2}^{n}[\tilde{v}]+2\tau_{n}\tilde{v}^{n}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\left[f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})+\varepsilon^{i}\right]$
	$\displaystyle\leq$	$\displaystyle\,2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\big{\|}\mathcal{I}_{2}^{n}[\tilde{v}]\big{\|}+2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\big{\|}f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})\big{\|}+2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\max_{1\leq i\leq n}\big{\|}\varepsilon^{i}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}$
	$\displaystyle\leq$	$\displaystyle\,2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\big{\|}\mathcal{I}_{2}^{n}[\tilde{v}]\big{\|}+2L_{f}\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\big{\|}\tilde{v}^{i}\big{\|}+2\big{\|}\tilde{v}^{n}\big{\|}\tau_{n}\max_{1\leq i\leq n}\big{\|}\varepsilon^{i}\big{\|},$

where the Lipschitz continuous property of the nonlinear function $f$ has been used. By dropping the nonnegative term at the left side and summing $n$ from $n=2$ to $m$ , we have

\displaystyle\big{|}\tilde{v}^{m}\big{|}^{2}\leq

\displaystyle\,\big{|}\tilde{v}^{1}\big{|}^{2}+2\sum_{j=2}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}\big{|}\mathcal{I}_{2}^{j}[\tilde{v}]\big{|}+2L_{f}\sum_{j=2}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}\sum_{i=2}^{j}\vartheta_{j-i}^{(2,j)}\big{|}\tilde{v}^{i}\big{|}+2\sum_{j=2}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}\max_{1\leq i\leq j}\big{|}\varepsilon^{i}\big{|}.

Let $m_{0}$ ( $1\leq m_{0}\leq m$ ) be an integer such that $\big{|}\tilde{v}^{m_{0}}\big{|}=\max_{1\leq k\leq m}\big{|}\tilde{v}^{k}\big{|}$ . Now we take $m:=m_{0}$ in the above inequality and get

\displaystyle\big{|}\tilde{v}^{m_{0}}\big{|}^{2}\leq

\displaystyle\,\big{|}\tilde{v}^{1}\big{|}\big{|}\tilde{v}^{m_{0}}\big{|}+2\big{|}\tilde{v}^{m_{0}}\big{|}\sum_{j=2}^{m_{0}}\tau_{j}\big{|}\mathcal{I}_{2}^{j}[\tilde{v}]\big{|}+2L_{f}\big{|}\tilde{v}^{m_{0}}\big{|}\sum_{j=2}^{m_{0}}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+2\big{|}\tilde{v}^{m_{0}}\big{|}\max_{2\leq i\leq m_{0}}\big{|}\varepsilon^{i}\big{|}\sum_{j=2}^{m_{0}}\tau_{j}.

It leads to

\displaystyle\big{|}\tilde{v}^{m}\big{|}\leq\big{|}\tilde{v}^{m_{0}}\big{|}\leq\big{|}\tilde{v}^{1}\big{|}+2\sum_{j=2}^{m}\tau_{j}\big{|}\mathcal{I}_{2}^{j}[\tilde{v}]\big{|}+2L_{f}\sum_{j=2}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+2t_{m}\max_{2\leq i\leq m}\big{|}\varepsilon^{i}\big{|}\quad\text{for $2\leq m\leq N$.}

(2.2)

By using Lemma 2.1, the starting error $\mathcal{I}_{2}^{n}[\tilde{v}]$ in (1.15) reads

\displaystyle\mathcal{I}_{2}^{n}[\tilde{v}]=\partial_{\tau}\tilde{v}^{1}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}d^{(2,i)}_{i-1}=\vartheta_{n-2}^{(2,n)}d^{(2,2)}_{1}(\partial_{\tau}\tilde{v}^{1})=-(\partial_{\tau}\tilde{v}^{1})\prod_{i=2}^{n}\frac{r_{i}}{1+2r_{i}}\quad\text{for $n\geq 2$}

such that

\displaystyle\sum_{j=2}^{m}\tau_{j}\big{|}\mathcal{I}_{2}^{j}[\tilde{v}]\big{|}\leq\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}\sum_{j=2}^{m}\frac{1}{2^{j-1}}\leq\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}\quad\text{for $m\geq 2$.}

It follows from (2.2) that

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq\big{|}\tilde{v}^{1}\big{|}+2\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+2L_{f}\sum_{j=2}^{n}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+2t_{n}\max_{2\leq i\leq n}\big{|}\varepsilon^{i}\big{|}\quad\text{for $2\leq n\leq N$.}

Assuming that $\tau\leq 1/(4L_{f})$ , one has

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq 2\big{|}\tilde{v}^{1}\big{|}+4\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+4L_{f}\sum_{j=2}^{n-1}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+4t_{n}\max_{2\leq i\leq n}\big{|}\varepsilon^{i}\big{|}\quad\text{for $2\leq n\leq N$.}

The standard discrete Grönwall inequality, e.g. [14, Lemma 3.1], completes the proof. ∎

3 Stability analysis of BDF3 method

3.1 Decaying of DOC kernels

Note that, the variable-step BDF3 method (1.8) of $\mathrm{k}=3$ is exact for the linear polynomial $v=t$ . Taking $v^{n}=t_{n}$ in (1.8), one can find that $D_{3}t_{n}=1$ for $n\geq 3$ . As done in (2), we can apply the discrete equality (1) with (1.15) to get

\displaystyle\sum_{j=3}^{n}\vartheta_{n-j}^{(3,n)}=

\displaystyle\,\sum_{j=1}^{2}\sum_{i=3}^{n}\vartheta_{n-i}^{(3,n)}d^{(3,i)}_{i-j}+1\quad\text{for $n\geq 3$.}

However, it does not provide enough information for the subsequent stability analysis because no explicit formulas of DOC kernels $\vartheta_{n-j}^{(3,n)}$ are available. Furthermore, the DOC kernels are not always positive, see Remark 4 below, we turn to bound $\sum_{j=3}^{n}\big{|}\vartheta_{n-j}^{(3,n)}\big{|}$ for any time-levels $t_{n}$ by imposing certain step-ratio constraint.

To do so, introduce the modified DOC kernels

\displaystyle\widehat{\vartheta}_{i-j}^{(3,i)}:=\vartheta_{i-j}^{(3,i)}d^{(3,j)}_{0}\quad\text{for $i\geq j\geq 3$.}

(3.1)

The discrete identity (1.13) gives

\displaystyle\sum_{i=j}^{n}d_{n-i}^{(3,n)}\widehat{\vartheta}^{(3,i)}_{i-j}\equiv\delta_{nj}d^{(3,j)}_{0}\quad\text{for $n\geq j\geq 3$.}

(3.2)

For any fixed index $j\geq 3$ , by taking $n=j$ and $n=j+1$ in the identity (3.2), respectively, one can derive that

\displaystyle\widehat{\vartheta}^{(3,j)}_{0}=1\quad\text{and}\quad\widehat{\vartheta}^{(3,j+1)}_{1}=-\frac{d_{1}^{(3,j+1)}}{d_{0}^{(3,j+1)}}\quad\text{for $j\geq 3$.}

(3.3)

According to the definition (1.10), one has $d^{(3,n)}_{n-i}=0$ for $n\geq i+3$ . The identity (1.13) gives

\displaystyle\widehat{\vartheta}_{m-j}^{(3,m)}+\frac{d^{(3,{m})}_{1}}{d^{(3,m)}_{0}}\widehat{\vartheta}_{m-1-j}^{(3,m-1)}+\frac{d^{(3,{m})}_{2}}{d^{(3,m)}_{0}}\widehat{\vartheta}_{m-2-j}^{(3,m-2)}=0\quad\text{for $m\geq j+2\geq 5$,}

or the difference equation

\displaystyle\widehat{\vartheta}_{m-j}^{(3,m)}-\alpha_{m}\widehat{\vartheta}_{m-1-j}^{(3,m-1)}+\beta_{m}\widehat{\vartheta}_{m-2-j}^{(3,m-2)}=0\quad\text{for $m\geq j+2\geq 5$,}

(3.4)

where, by using (1.1)-(1.3) and (A.2)-(A.3), the coefficients $\alpha_{m}$ and $\beta_{m}$ are defined by

\displaystyle\alpha_{m}:=-\frac{d^{(3,{m})}_{1}}{d^{(3,m)}_{0}}=\alpha(r_{m},r_{m-1})\quad\text{and}\quad\beta_{m}:=\frac{d^{(3,{m})}_{2}}{d^{(3,m)}_{0}}=\beta(r_{m},r_{m-1})\quad\text{for $m\geq 3$.}

(3.5)

To evaluate the asymptotic behaviors of the DOC kernels $\widehat{\vartheta}_{m-j}^{(3,m)}$ , introduce the following vector and companion matrix

\displaystyle\vec{u}_{m}=\left(\begin{array}[]{c}\widehat{\vartheta}_{m-j}^{(3,m)}\\[8.61108pt] \widehat{\vartheta}_{m-1-j}^{(3,m-1)}\\ \end{array}\right)\quad\text{and}\quad A_{m}:=\left(\begin{array}[]{cc}\alpha_{m}&-\beta_{m}\\[8.61108pt] 1&0\\ \end{array}\right).

(3.10)

For any fixed index $j\geq 1$ , the difference equation (3.4) reads

\displaystyle\vec{u}_{m}=A_{m}\vec{u}_{m-1}\quad\text{for $m\geq j+2\geq 5$,}

\displaystyle\vec{u}_{n}=A_{n}\vec{u}_{n-1}=A_{n}A_{n-1}\vec{u}_{n-2}=\cdots=\prod_{i=j+2}^{n}A_{i}\vec{u}_{j+1}\quad\text{for $n\geq j+2\geq 5$.}

(3.11)

The associated initial vector $\vec{u}_{j+1}=(\alpha_{j+1},1)^{T}$ according to (3.3).

If all of step-ratios $0<r_{k}<\hat{R}_{3}\approx 3.4405$ ( $k\geq 2$ ), Lemmas A.1-A.2 show that

\displaystyle\beta_{i}<\alpha_{i}<1+\beta_{i}<2\quad\text{for $i\geq 3$.}

(3.12)

They imply that the roots of $\lambda^{2}-\alpha_{i}\lambda+\beta_{i}=0$ satisfy $\left|\lambda\right|<1$ . Thus the eigenvalues of $A_{i}$ have the module less than 1 and the spectral radius $\rho(A_{i})<1$ for any $i\geq 3$ . In general, it is not sufficient to ensure the global decaying of DOC kernels $\widehat{\vartheta}_{n-j}^{(3,n)}$ .

It needs to examine the joint companion matrix $\prod_{i=j+2}^{n}A_{i}$ for $n\geq j+2\geq 5$ . Let $\left\|\cdot\right\|_{\infty}$ be the maximum norm on the space $\mathbb{C}^{2}$ and let the same symbol $\left\|\cdot\right\|_{\infty}$ denote also the corresponding induced matrix norm. We adopt the technique of Calvo et al [2] by considering a complex constant $\mu$ with $\mathrm{Im}(\mu)\neq 0$ and the following elliptic type norm

\displaystyle\left\|A_{i}\right\|_{H,\infty}:=\left\|H^{-1}A_{i}H\right\|_{\infty}\quad\text{with}\quad H:=\left(\begin{array}[]{cc}\mu&\bar{\mu}\\[8.61108pt] 1&1\\ \end{array}\right).

(3.15)

Then one can bound the DOC kernels via

\displaystyle\big{\|}\vec{u}_{n}\big{\|}_{\infty}\leq\big{\|}H\big{\|}_{\infty}\prod_{i=j+2}^{n}\big{\|}A_{i}\big{\|}_{H,\infty}\big{\|}H^{-1}\big{\|}_{\infty}\big{\|}\vec{u}_{j+1}\big{\|}_{\infty}\quad\text{for $n\geq j+2\geq 5$.}

(3.16)

To process the analysis, it is to determine a fixed parameter $\mu$ . We will choose a proper constant $\tilde{R}_{3}<\hat{R}_{3}\approx 3.4405$ to ensure that

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}<1\quad\text{for $i\geq 3$,}

(3.17)

if all of step-ratios $r_{k}<\tilde{R}_{3}$ ( $k\geq 2$ ). It is easy to derive that

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}=\frac{\left|\mu^{2}-\alpha_{i}\mu+\beta_{i}\right|+\big{|}\left|\mu\right|^{2}-\alpha_{i}\mu+\beta_{i}\big{|}}{\left|\mu-\bar{\mu}\right|}\quad\text{for $i\geq 3$.}

(3.18)

Noticing that the following fact

\displaystyle\big{|}\left|\mu\right|^{2}-\alpha_{i}\mu+\beta_{i}\big{|}^{2}-\big{|}\mu^{2}-\alpha_{i}\mu+\beta_{i}\big{|}^{2}=\beta_{i}\big{|}\mu-\bar{\mu}\big{|}^{2},

one has

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}=\frac{\left|\mu^{2}-\alpha_{i}\mu+\beta_{i}\right|}{\left|\mu-\bar{\mu}\right|}+\sqrt{\beta_{i}+\frac{\left|\mu^{2}-\alpha_{i}\mu+\beta_{i}\right|^{2}}{\left|\mu-\bar{\mu}\right|^{2}}}\quad\text{for $i\geq 3$.}

(3.19)

Then the requirement (3.17) is equivalent to

\displaystyle\frac{\big{|}\mu^{2}-\alpha_{i}\mu+\beta_{i}\big{|}}{\left|\mu-\bar{\mu}\right|}<\frac{1}{2}(1-\beta_{i})\quad\text{for $i\geq 3$}.

(3.20)

Then, taking into account the coefficient relationship (3.12), we can find that the condition (3.20) can be written equivalently in the following form, also see [2, Eq. (14)],

\displaystyle\left|\mu-\frac{1}{2}\alpha_{i}-\frac{\imath}{2}\sqrt{(1+\beta_{i})^{2}-\alpha_{i}^{2}}\right|<\frac{1}{2}(1-\beta_{i})\quad\text{for $i\geq 3$.}

(3.21)

Note that, the other branch requiring $\big{|}\mu-\frac{1}{2}\alpha_{i}+\frac{\imath}{2}\sqrt{(1+\beta_{i})^{2}-\alpha_{i}^{2}}\big{|}<\frac{1}{2}(1-\beta_{i})$ is omitted here since it is equivalent to (3.21) by replacing the undetermined parameter $\mu$ with $\bar{\mu}$ . Since the coefficients $\alpha_{i},\beta_{i}$ are dependent on the ratios $r_{i}$ and $r_{i-1}$ , The inequalities in (3.21) define a family of complex disks $\mathfrak{D}(r_{i},r_{i-1})$ centered at $\big{(}\frac{1}{2}\alpha_{i},\frac{1}{2}\sqrt{(1+\beta_{i})^{2}-\alpha_{i}^{2}}\big{)}$ with the radius $\frac{1}{2}(1-\beta_{i})$ .

A heuristic analysis is considered firstly to obtain a rough estimate for the complex parameter $\mu$ , while the mathematical proof is left to next subsection. To ensure (3.21), one should choose a fixed $\mu_{0}$ such that the intersection set

\bigcap_{r_{i},r_{i-1}\in[0,\tilde{R}_{3}]}\mathfrak{D}(r_{i},r_{i-1})\quad\text{ is not empty.}

Reminding the increasing functions (A.2)-(A.3), we see that the disk $\mathfrak{D}(0,0)$ centered at $\big{(}0,\frac{1}{2}\big{)}$ has the maximum radius $\frac{1}{2}$ , while the disk $\mathfrak{D}(\tilde{R}_{3},\tilde{R}_{3})$ centered at

\left(\frac{1}{2}\alpha(\tilde{R}_{3},\tilde{R}_{3}),\frac{1}{2}\sqrt{\left[1+\beta(\tilde{R}_{3},\tilde{R}_{3})\right]^{2}-\alpha^{2}(\tilde{R}_{3},\tilde{R}_{3})}\right)

has the minimum radius $\frac{1}{2}\big{(}1-\beta(\tilde{R}_{3},\tilde{R}_{3})\big{)}$ . Obviously, the largest value of $\tilde{R}_{3}$ may be determined from the fact that the smallest disk $\mathfrak{D}(\tilde{R}_{3},\tilde{R}_{3})$ is tangential to the largest one $\mathfrak{D}(0,0)$ at the tangential point $\mu_{0}$ . This necessary condition holds if and only if

\displaystyle\frac{1}{4}\alpha^{2}(\tilde{R}_{3},\tilde{R}_{3})+\left(\frac{1}{2}\sqrt{\left[1+\beta(\tilde{R}_{3},\tilde{R}_{3})\right]^{2}-\alpha^{2}(\tilde{R}_{3},\tilde{R}_{3})}-\frac{1}{2}\right)^{2}=\frac{1}{4}\big{[}2-\beta(\tilde{R}_{3},\tilde{R}_{3})\big{]}^{2}\,.

which leads to

\displaystyle\alpha^{2}(\tilde{R}_{3},\tilde{R}_{3})+8\beta^{2}(\tilde{R}_{3},\tilde{R}_{3})-8\beta(\tilde{R}_{3},\tilde{R}_{3})=0,

(3.22)

or equivalently,

\displaystyle 9\tilde{R}_{3}^{6}-2\tilde{R}_{3}^{5}-35\tilde{R}_{3}^{4}-42\tilde{R}_{3}^{3}-22\tilde{R}_{3}^{2}-4\tilde{R}_{3}+1=0.

This polynomial equation has two positive roots $\tilde{R}_{3}\approx 2.5808$ and $\tilde{R}_{3}\approx 0.1304$ (it is checked that this situation also occurs in [2]). We throw away the small root $\tilde{R}_{3}\approx 0.1304$ and choose the large one $\tilde{R}_{3}\approx 2.5808$ with the corresponding tangential point $\tilde{\mu}_{*}\approx 0.4979+0.5454\imath$ .

Remark 1.

It is to mention that, by following the joint spectral radius approach in [8], one can obtain a slightly improved step-ratio constraint $0<r_{k}<2.705$ to ensure the boundedness of DOC kernels; however, it seems difficult to obtain the desired decaying behavior of DOC kernels theoretically from the extremal norm estimate of the family of companion matrices $\{A_{i}\,|\,i\geq 3\}$ .

Remark 2.

The condition (3.21) is sufficient for (3.17), or $\left\|A_{i}\right\|_{H,\infty}<1$ ; but different choices of the parameter $\mu$ leads to different values of $\left\|A_{i}\right\|_{H,\infty}$ . Consider the simple case with $r_{i-1}\equiv 0$ for any $i\geq 3$ , corresponding to the variable-step BDF2 scheme. By using the definition (3.5) together with (1.1)-(1.3), one has $\alpha_{i}=\frac{r_{i}}{1+2r_{i}}$ , $\beta_{i}=0$ and the reduced class of complex discs

\displaystyle\mathfrak{D}(r_{i},0)=\left\{\mu\,\left|\,\Big{|}\mu-\frac{1}{2}\alpha(r_{i},0)-\frac{\imath}{2}\sqrt{1-\alpha^{2}(r_{i},0)}\Big{|}<\frac{1}{2}\right.\right\}\quad\text{for $i\geq 3$.}

Recalling the fact $(\alpha_{i}+\frac{1}{2})(\alpha_{i}-1)<0$ for any $r_{i}\geq 0$ , one can check that

\left|\frac{1}{4}-\frac{1}{2}\alpha(r_{i},0)+\frac{\imath}{2}\Big{(}\frac{\sqrt{3}}{2}-\sqrt{1-\alpha^{2}(r_{i},0)}\Big{)}\right|<\frac{1}{2}.

That is, the distance between the center $\big{(}\frac{1}{4},\frac{\sqrt{3}}{4}\big{)}$ of $\mathfrak{D}(\infty,0)$ and the center $\big{(}\frac{1}{2}\alpha_{i},\frac{1}{2}\sqrt{1-\alpha_{i}^{2}}\big{)}$ of $\mathfrak{D}(r_{i},0)$ is always less than $\frac{1}{2}$ . Thus these complex discs are always have a common region for any $r_{i}>0$ , that is, $\cap_{r_{i}>0}\mathfrak{D}(r_{i},0)$ is not empty. There exists a fixed parameter $\mu$ such that $\big{\|}A_{i}\big{\|}_{H,\infty}<1$ for any $r_{i}>0$ . By taking $\mu_{1}=\imath/2$ in (3.19), one has

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}=\frac{2\left|\mu_{1}\right|\left|\mu_{1}-\alpha_{i}\right|}{\left|\mu_{1}-\bar{\mu}_{1}\right|}=\sqrt{\alpha_{i}^{2}+\frac{1}{4}}\quad\text{for $r_{i}>0$.}

Since $\alpha_{i}\in(0,\frac{1}{2})$ for any $r_{i}>0$ , one gets $\left\|A_{i}\right\|_{H,\infty}\in\big{(}\tfrac{1}{2},\tfrac{1}{\sqrt{2}}\big{)}$ . On the other hand, taking $\mu_{2}=1/4+\imath/4$ in (3.19) arrives at

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}=\frac{2\left|\mu_{2}\right|\left|\mu_{2}-\alpha_{i}\right|}{\left|\mu_{2}-\bar{\mu}_{2}\right|}=\sqrt{2\big{(}\frac{1}{4}-\alpha_{i}\big{)}^{2}+\frac{1}{8}}\quad\text{for $r_{i}>0$.}

For any $r_{i}>0$ , we have $\left\|A_{i}\right\|_{H,\infty}\in\big{(}\frac{\sqrt{2}}{4},\frac{1}{2}\big{)}$ , which are less than 1/2. Always, the estimate (3.16) implies the globally asymptotic decaying of DOC kernels, as shown by Lemma 2.1.

3.2 Stability analysis

To avoid the undesired equation (3.22) which has two different positive roots, and to simplify the subsequent mathematical derivations, we fix the complex parameter

\mu=\mu_{*}:=1/2+\imath/2,

which is very close to the optimal tangential point $\tilde{\mu}_{*}\approx 0.4979+0.5454\imath$ in the heuristic analysis. Reminding the condition (3.21), we determine the maximum step-ratio $R_{3}$ by

\displaystyle\left|1-\alpha(R_{3},R_{3})+\imath\Big{(}1-\sqrt{(1+\beta(R_{3},R_{3}))^{2}-\alpha(R_{3},R_{3})^{2}}\Big{)}\right|=1-\beta(R_{3},R_{3}),

which leads to the polynomial equation

\displaystyle R_{3}^{6}+R_{3}^{5}-4R_{3}^{4}-8R_{3}^{3}-10R_{3}^{2}-6R_{3}-2=0.

It has a unique positive root $R_{3}\approx 2.553$ . Now we prove the following result.

Lemma 3.1.

If the step ratios $r_{k}<R_{3}$ $(k\geq 2)$ , there exists a constant $C_{3}>0$ , independent of the time $t_{n}$ , such that the DOC kernels $\vartheta_{n-j}^{(3,n)}$ in (1.11) satisfy

\displaystyle\sum_{j=3}^{n}\big{|}\vartheta_{n-j}^{(3,n)}\big{|}\leq C_{3}\quad\text{and}\quad\sum_{j=i}^{n}\big{|}\vartheta_{j-i}^{(3,j)}\big{|}\leq C_{3}\quad\text{for $n\geq 3\;(i\geq 3)$.}

Proof.

The definitions (1.1) and (1.10) give $d^{(3,j)}_{0}=d_{0}(r_{j},r_{j-1})\geq 1$ for $j\geq 3$ . By using (3.1), (3.3) and the coefficient bound (3.12), one gets

\vartheta_{0}^{(3,j)}=\frac{1}{d^{(3,j)}_{0}}\leq 1\quad\text{and}\quad\vartheta_{1}^{(3,j+1)}=\frac{\alpha_{j+1}}{d^{(3,j)}_{0}}\leq\frac{2}{d^{(3,j)}_{0}}\leq 2,\quad\text{for $j\geq 3$.}

That is, the first two kernels are always positive and bounded so that

\displaystyle\sum_{j=3}^{n}\big{|}\vartheta_{n-j}^{(3,n)}\big{|}\leq 3\quad\text{and}\quad\sum_{j=i}^{n}\big{|}\vartheta_{j-i}^{(3,j)}\big{|}\leq 3\quad\text{for $n=3,4\;(i\geq 3)$.}

Now consider the general case $n\geq 5$ . Taking $\mu=\mu^{*}$ in the condition (3.20) yields

\displaystyle\big{|}2\beta_{i}-\alpha_{i}+\imath(1-\alpha_{i})\big{|}<1-\beta_{i}\quad\text{for $i\geq 3$,}

or equivalently,

\displaystyle 2\alpha_{i}^{2}+3\beta_{i}^{2}-4\alpha_{i}\beta_{i}-2\alpha_{i}+2\beta_{i}<0\quad\text{for $i\geq 3$.}

Thanks to Lemma A.3, they are valid if all of step-ratios $0<r_{k}<R_{3}$ $(k\geq 2)$ . According to (3.10) and (3.19), the elliptic norm $\big{\|}A_{i}\big{\|}_{H,\infty}$ is a continuous function with respect to $r_{i},r_{i-1}\in[\epsilon,R_{3}-\epsilon]$ for aribitrary small $\epsilon>0$ . Thus there exists a constant $\delta\in(0,1)$ such that

\displaystyle\big{\|}A_{i}\big{\|}_{H,\infty}\leq\delta<1\quad\text{for $i\geq 3$.}

By taking the parameter $\mu=\mu^{*}$ in (3.15) and (3.19), one has $\big{\|}H\big{\|}_{\infty}=2$ and $\big{\|}H^{-1}\big{\|}_{\infty}=1+\frac{\sqrt{2}}{2}.$ Moreover, by (A.4)-(A.5), one has $\big{\|}\vec{u}_{j+1}\big{\|}_{\infty}\leq\max\{\alpha(R_{3},R_{3}),1\}=\alpha(R_{3},R_{3})$ . Thus, the estimate (3.16) gives

\displaystyle\big{|}\vartheta_{n-j}^{(3,n)}\big{|}d^{(3,j)}_{0}\leq\big{\|}\vec{u}_{n}\big{\|}_{\infty}\leq c_{R}\delta^{n-j-1}\quad\text{for $n\geq j+2\geq 5$,}

where $c_{R}=(2+\sqrt{2})\alpha(R_{3},R_{3})$ , and then

\displaystyle\big{|}\vartheta_{n-j}^{(3,n)}\big{|}\leq c_{R}\delta^{n-j-1}\quad\text{for $n\geq j+2\geq 5$.}

(3.23)

Therefore it follows that

	$\displaystyle\sum_{j=3}^{n}\big{\|}\vartheta_{n-j}^{(3,n)}\big{\|}\leq c_{R}\sum_{j=3}^{n}\delta^{n-j-1}<\frac{c_{R}}{\delta(1-\delta)}\quad\text{for $n\geq 5$,}$
	$\displaystyle\sum_{j=i}^{n}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}\leq c_{R}\sum_{j=i}^{n}\delta^{j-i-1}<\frac{c_{R}}{\delta(1-\delta)}\quad\text{for $n\geq i\geq 3$.}$

We obtain the desired estimates by taking a constant $C_{3}:=\frac{c_{R}}{\delta(1-\delta)},$ which is independent of the time-level index $n$ . This proof is complete. ∎

Remark 3.

The elliptic type norm in (3.15) admits some other nonsingular matrices $H$ . Here consider a simple case with

H_{0}:=\Big{(}\begin{array}[]{cc}1&0\\ 1&1\\ \end{array}\Big{)}\quad\text{such that}\quad H_{0}^{-1}A_{i}H_{0}=\left(\begin{array}[]{cc}\alpha_{i}-\beta_{i}&-\beta_{i}\\ 1-\alpha_{i}+\beta_{i}&\beta_{i}\\ \end{array}\right)

Let $R_{3,0}:=\frac{1}{3}+\frac{1}{3}\sqrt[3]{19+3\sqrt{33}}+\frac{1}{3}\sqrt[3]{19-3\sqrt{33}}\approx 1.839$ be the unique positive root of the cubic equation $R_{3,0}^{3}-R_{3,0}^{2}-R_{3,0}-1=0$ . If the step-ratios satisfy $0<r_{k}<R_{3,0}$ (it is also superior to the existing mesh conditions in the literature [2, 6, 8]), one can follow the proofs of Lemmas A.1-A.2 to check that

2\beta_{i}<\alpha_{i}<1\quad\text{ for $i\geq 3$.}

It follows that

\big{\|}A_{i}\big{\|}_{H_{0},\infty}=\max\left\{\alpha_{i},1-\alpha_{i}+2\beta_{i}\right\}<1\quad\text{for $i\geq 3$,}

and then the DOC kernels $\vartheta_{n-j}^{(3,n)}$ are globally decaying. Although the proof of Lemma 3.1 is technically complex, the matrix $H$ in (3.15) would be optimal in the sense that the companion matrix $A_{i}$ always has a pair of conjugate complex eigenvalues. Actually, the inequality $\alpha_{i}^{2}<4\beta_{i}$ holds when the adjacent step ratios $r_{i},r_{i-1}$ are close to 1, cf. Remark 4.

Remark 4.

Consider the uniform time-step $\tau_{k}=\tau$ , the BDF3 kernels (1.10) give

d_{0}^{(3,n)}=\frac{11}{6},\quad d_{1}^{(3,n)}=-\frac{7}{6},\quad d_{2}^{(3,n)}=\frac{1}{3}\quad\text{and}\quad d_{j}^{(3,n)}=0\quad\text{for $j\geq 3$}.

The difference equation (3.4) becomes

\displaystyle\widehat{\vartheta}_{n-j}^{(3,n)}-\frac{7}{11}\widehat{\vartheta}_{n-1-j}^{(3,n-1)}+\frac{2}{11}\widehat{\vartheta}_{n-2-j}^{(3,n-2)}=0\quad\text{for $n\geq j+2\geq 5$.}

Since the associated characteristic equation $\lambda^{2}-\frac{7}{11}\lambda+\frac{2}{11}=0$ has two roots $\lambda_{1,2}=(7\pm\imath\sqrt{39})/22$ , we have the following explicit formula of DOC kernels

\displaystyle\vartheta_{n-j}^{(3,n)}=\frac{11\,\imath}{d_{0}^{(3,j)}\sqrt{39}}\left[\Big{(}\frac{7-\imath\sqrt{39}}{22}\Big{)}^{n-j+1}-\Big{(}\frac{7+\imath\sqrt{39}}{22}\Big{)}^{n-j+1}\right]\quad\text{for $n\geq j+2\geq 5$.}

Refer to caption — Figure 1: The DOC kernels $\vartheta_{j}^{(3,n)}\,(j\geq 2)$ for different step-ratio patterns.

In Figure 1, we fix $n=30$ and list the values of DOC kernels $\vartheta_{j}^{(3,n)}$ for different choices (let $0<\epsilon_{k}<1$ be the uniformly distributed random number):

(a)

the uniform mesh with $r_{k}=1$ for $2\leq k\leq 30$ ;
(b)

the random mesh with $r_{k}:=R_{3}\epsilon_{k}$ for $2\leq k\leq 30$ ;
(c)

the random mesh with $r_{k}:=3\epsilon_{k}$ for $2\leq k\leq 30$ .

As observed, the DOC kernels $\vartheta_{n-j}^{(3,n)}$ are not always positive, but they always decay fast, as predicted by (3.23). It is interesting to find that, from the case (c), the DOC kernels maintain the fast damping property as a whole although there are many of step ratios greater than $R_{3}$ (about $15\%$ in the above test).

Theorem 3.1.

If $\tau\leq 1/(4C_{3}L_{f})$ , the BDF3 solution of (1.7) satisfies

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq 2\exp(4C_{3}L_{f}t_{n-1})\Big{(}\big{|}\tilde{v}^{2}\big{|}+5C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{2}\big{|}+2C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+2C_{3}t_{n}\max_{3\leq i\leq n}\big{|}\varepsilon^{i}\big{|}\Big{)}\quad\text{for $3\leq n\leq N$.}

Thus the variable-step BDF3 scheme is stable if $0<r_{k}<R_{3}$ for $k\geq 2$ .

Proof.

Multiplying both sides of (1.16) with $2\tau_{n}\tilde{v}^{n}$ , one applies Lemma 3.1 to obtain that

	$\displaystyle\big{\|}\tilde{v}^{n}\big{\|}^{2}-\big{\|}\tilde{v}^{n-1}\big{\|}^{2}\leq$	$\displaystyle\,-2\tau_{n}\tilde{v}^{n}\mathcal{I}_{3}^{n}[\tilde{v}]+2\tau_{n}\tilde{v}^{n}\sum_{i=3}^{n}\vartheta_{n-i}^{(3,n)}\left[f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})+\varepsilon^{i}\right]$
	$\displaystyle\leq$	$\displaystyle\,2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\big{\|}\mathcal{I}_{3}^{n}[\tilde{v}]\big{\|}+2L_{f}\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\sum_{i=3}^{n}\big{\|}\vartheta_{n-i}^{(3,n)}\big{\|}\big{\|}\tilde{v}^{i}\big{\|}+2C_{3}\big{\|}\tilde{v}^{n}\big{\|}\tau_{n}\max_{3\leq i\leq n}\big{\|}\varepsilon^{i}\big{\|}.$

By summing the above inequality for $n$ from $n=3$ to $m$ , we have

\displaystyle\big{|}\tilde{v}^{m}\big{|}^{2}\leq

\displaystyle\,\big{|}\tilde{v}^{2}\big{|}^{2}+2\sum_{j=3}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}\big{|}\mathcal{I}_{3}^{j}[\tilde{v}]\big{|}+2L_{f}\sum_{j=3}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}\sum_{i=3}^{j}\big{|}\vartheta_{j-i}^{(3,j)}\big{|}\big{|}\tilde{v}^{i}\big{|}+2C_{3}\sum_{j=3}^{m}\big{|}\tilde{v}^{j}\big{|}\tau_{j}\max_{3\leq i\leq j}\big{|}\varepsilon^{i}\big{|}.

Taking some integer $m_{1}$ ( $2\leq m_{1}\leq m$ ) such that $\big{|}\tilde{v}^{m_{1}}\big{|}=\max_{2\leq k\leq m}\big{|}\tilde{v}^{k}\big{|}$ . Now we take $m:=m_{1}$ in the above inequality and get

\displaystyle\big{|}\tilde{v}^{m}\big{|}\leq\big{|}\tilde{v}^{m_{1}}\big{|}\leq\big{|}\tilde{v}^{2}\big{|}+2\tau\sum_{j=3}^{m}\big{|}\mathcal{I}_{3}^{j}[\tilde{v}]\big{|}+2C_{3}L_{f}\sum_{j=3}^{m}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+2C_{3}t_{m}\max_{3\leq i\leq m}\big{|}\varepsilon^{i}\big{|}

(3.24)

for $3\leq m\leq N$ . It remains to evaluate the error $\sum_{j=3}^{m}\big{|}\mathcal{I}_{3}^{j}[\tilde{v}]\big{|}$ from the starting values. Taking the index $\mathrm{k}=3$ in (1.15) gives

\displaystyle\mathcal{I}_{3}^{n}[\tilde{v}]=

\displaystyle\,\partial_{\tau}\tilde{v}^{2}\sum_{i=3}^{n}\vartheta_{n-i}^{(3,n)}d^{(3,i)}_{i-2}+\partial_{\tau}\tilde{v}^{1}\sum_{i=3}^{n}\vartheta_{n-i}^{(3,n)}d^{(3,i)}_{i-1}\quad\text{for $n\geq 3$.}

Recalling the definition (1.10) of BDF3 kernels with the increasing property (A.1), it is easy to check that $d^{(3,3)}_{2}\leq d_{2}(R_{3},R_{3})\leq 2$ and $\big{|}d^{(3,3)}_{1}\big{|}+d^{(3,4)}_{2}\leq-d_{1}(R_{3},R_{3})+d_{2}(R_{3},R_{3})\leq 5$ . Thus we apply Lemma 3.1 to get

	$\displaystyle\sum_{j=3}^{m}\big{\|}\mathcal{I}_{3}^{j}[\tilde{v}]\big{\|}\leq$	$\displaystyle\,\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\sum_{j=3}^{m}\sum_{i=3}^{j}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}\big{\|}d^{(3,i)}_{i-2}\big{\|}+\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\sum_{j=3}^{m}\sum_{i=3}^{j}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}\big{\|}d^{(3,i)}_{i-1}\big{\|}$
	$\displaystyle=$	$\displaystyle\,\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\sum_{i=3}^{m}\big{\|}d^{(3,i)}_{i-2}\big{\|}\sum_{j=i}^{m}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}+\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\sum_{i=3}^{m}\big{\|}d^{(3,i)}_{i-1}\big{\|}\sum_{j=i}^{m}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}$
	$\displaystyle\leq$	$\displaystyle\,C_{3}\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\big{(}\big{\|}d^{(3,3)}_{1}\big{\|}+d^{(3,4)}_{2}\big{)}+C_{3}\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}d^{(3,3)}_{2}$
	$\displaystyle\leq$	$\displaystyle\,5C_{3}\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}+2C_{3}\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\qquad\text{for $m\geq 3$.}$

It follows from (3.24) that

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq\big{|}\tilde{v}^{2}\big{|}+5C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{2}\big{|}+2C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+2C_{3}L_{f}\sum_{j=3}^{n}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+2C_{3}t_{n}\max_{3\leq i\leq n}\big{|}\varepsilon^{i}\big{|}.

Assuming that $\tau\leq 1/(4C_{3}L_{f})$ , one has

\displaystyle\big{|}\tilde{v}^{n}\big{|}\leq 2\big{|}\tilde{v}^{2}\big{|}+10C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{2}\big{|}+4C_{3}\tau\big{|}\partial_{\tau}\tilde{v}^{1}\big{|}+4C_{3}L_{f}\sum_{j=3}^{n-1}\tau_{j}\big{|}\tilde{v}^{j}\big{|}+4C_{3}t_{n}\max_{3\leq i\leq n}\big{|}\varepsilon^{i}\big{|}

for $3\leq n\leq N$ . The standard Grönwall inequality completes the proof. ∎

From the proof of Lemma 3.1, the uniform boundedness of $\sum_{j=1}^{n}\big{|}\vartheta_{n-j}^{(3,n)}\big{|}$ and $\sum_{j=i}^{n}\big{|}\vartheta_{j-i}^{(3,j)}\big{|}$ does not require all DOC kernels to decay rapidly. It always allows a finite number of bounded DOC kernels. As seen from the numerical tests in the next section, the imposed step-ratio condition $0<r_{k}<R_{3}$ in Theorem 3.1 is sufficient, but far from necessary. A practical step-ratio constraint for stability is that

almost all step ratios satisfy $0<r_{k}<R_{3}$ for $2\leq k\leq N$ .

4 Numerical tests

Consider an ODE model $v^{\prime}=2v-3e^{-t}$ for $0<t\leq 1$ with a smooth solution $v=\exp(-t)$ in our numerical tests. We run the variable-step BDF2 and BDF3 schemes in two scenarios:

(a)

The graded meshes $t_{k}=(k/N)^{\gamma}$ for $0\leq k\leq N$ with grading parameters $\gamma>1$ . The maximum step-ratio is $r_{\max}=r_{2}=2^{\gamma}-1$ and $\tau/\tau_{1}=\tau_{N}/\tau_{1}=N^{\gamma}-(N-1)^{\gamma}\approx\gamma N^{\gamma-1}$ .
(b)

The random meshes with time-steps $\tau_{k}=\epsilon_{k}/\sum_{k=1}^{N}\epsilon_{k}$ , where $\epsilon_{k}\in(0,1)$ are uniformly distributed random numbers.

To start the multi-step schemes, a two-stage third-order singly diagonally implicit Runge-Kutta method is employed. We record the maximum error $e(N):=\max_{1\leq{n}\leq{N}}|v(t_{n})-v^{n}|$ in each run and compute the convergence order by

\text{Order}\approx\frac{\log\left(e(N)/e(2N)\right)}{\log\left(\tau(N)/\tau(2N)\right)}

where $\tau(N)$ denotes the maximum time-step size for total $N$ subintervals.

Table 1: Numerical results of BDF2 method on graded time meshes

$N$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=2,\,\,r_{\max}=3$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=3,\,\,r_{\max}=7$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=4,\,\,r_{\max}=15$ $e(N)$ Order $e(N)$ Order $e(N)$ Order 40 7.9e+01 5.28e-04 $-$ 4.7e+03 8.77e-04 $-$ 2.5e+05 1.35e-03 $-$ 80 1.6e+02 1.34e-04 1.97 1.9e+04 2.25e-04 1.96 2.0e+06 3.49e-04 1.95 160 3.2e+02 3.39e-05 1.99 7.6e+04 5.72e-05 1.98 1.6e+07 8.91e-05 1.97 320 6.4e+02 8.52e-06 1.99 3.1e+05 1.44e-05 1.99 1.3e+08 2.25e-05 1.98 640 1.3e+03 2.14e-06 2.00 1.2e+06 3.61e-06 1.99 1.1e+09 5.66e-06 1.99 1280 2.6e+03 5.34e-07 2.00 4.9e+06 9.06e-07 2.00 8.4e+09 1.42e-06 2.00

Table 2: Numerical results of BDF3 method on graded time meshes

$N$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=2,\,\,r_{\max}=3$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=3,\,\,r_{\max}=7$ $\dfrac{\tau}{\tau_{1}}$ $\gamma=4,\,\,r_{\max}=15$ $e(N)$ Order $e(N)$ Order $e(N)$ Order 40 7.9e+01 1.27e-05 $-$ 4.7e+03 2.94e-05 $-$ 2.5e+05 5.73e-05 $-$ 80 1.6e+02 1.65e-06 2.95 1.9e+04 3.91e-06 2.91 2.0e+06 7.85e-06 2.87 160 3.2e+02 2.10e-07 2.97 7.6e+04 5.05e-07 2.96 1.6e+07 1.03e-06 2.94 320 6.4e+02 2.65e-08 2.99 3.1e+05 6.41e-08 2.98 1.3e+08 1.31e-07 2.97 640 1.3e+03 3.32e-09 2.99 1.2e+06 8.07e-09 2.99 1.1e+09 1.66e-08 2.98 1280 2.6e+03 4.16e-10 3.00 4.9e+06 1.01e-09 2.99 8.4e+09 2.08e-09 2.99

Tables 1-2 list the numerical results of the BDF2 and BDF3 methods on graded meshes with three grading parameters, respectively. We see that, although the ratio $\tau/\tau_{1}$ increases as fast as $O(N^{\gamma-1})$ , the numerical solutions remains stable and convergent with full accuracy.

Table 3: Numerical results of BDF2 method on random time meshes

$N$ $\tau(N)$ $e(N)$ Order $r_{\max}$ $\tau/\tau_{1}$ 40 4.49e-02 5.86e-04 – 28.30 1.22 80 2.40e-02 1.43e-04 2.03 91.41 1.06 160 1.18e-02 4.09e-05 1.81 32.54 19.88 320 6.18e-03 1.03e-05 1.99 418.41 1.21 640 3.01e-03 2.51e-06 2.04 604.02 2.66 1280 1.55e-03 6.63e-07 1.92 1963.80 1.02

Table 4: Numerical results of BDF3 method on random time meshes

$N$ $\tau(N)$ $e(N)$ Order $r_{\max}$ $N_{1}$ $\tau/\tau_{1}$ 40 4.60e-02 8.20e-06 – 7.58 9 1.57 80 2.46e-02 1.18e-06 2.79 361.49 18 1.93 160 1.22e-02 1.83e-07 2.69 1682.18 35 4.51 320 6.11e-03 2.40e-08 2.93 79.90 60 2.91 640 3.27e-03 3.40e-09 2.82 5765.00 146 1.20 1280 1.56e-03 4.16e-10 3.03 9677.92 250 6.64

Tables 3-4 record the numerical results on random time meshes. Table 3 suggests that the variable-step BDF2 method is robust with respect to the step-ratios $r_{k}$ , and supports the theoretical findings in Theorem 2.1. Reminding the step-ratio restriction $0<r_{k}<2.533$ in Theorem 3.1 for the BDF3 method, we also record the number (denoted by $N_{1}$ in Table 4) of time levels with $r_{k}\geq 2.533$ . It is seen that the variable-step BDF3 method is stable and third-order convergent on random meshes, even if there are many of (about $20\%$ in our tests) step-ratios do not meet our theoretical condition. Nonetheless, it remains mysterious to us.

5 Concluding remarks

The stability of BDF2 and BDF3 methods with unequal time-steps is investigated by a new theoretical framework using the discrete orthogonal convolution kernels. Thanks to the global analysis, that is, the present technique formulates the current BDF solution as a convolution summation of all previous information with DOC kernels as the convolutional weights, see the form (1.16), we improved the setp-ratio constraints for the stability. It is to mention that, the global decaying estimates of DOC kernels are also critical in the numerical analysis of BDF methods for parabolic equations, cf. [11, 12, 14]. The stability and convergence theory of variable-step BDF3 scheme for these stiff problems will be addressed in forthcoming reports.

For the high-order BDF- $\mathrm{k}$ ( $\mathrm{k}=4,5,6$ ) time-stepping methods, we can also define the associated DOC kernels $\vartheta_{n-j}^{(\mathrm{k},n)}$ via the recursive procedure (1.11). From the proof of Theorem 3.1, one needs a result similar to Lemma 3.1 under certain step-ratio condition. Actually, the uniform boundedness (there exists a constant $C_{\mathrm{k}}$ independent of the time-level indexes $n$ ) of the absolute summations of DOC kernels

\displaystyle\sum_{j=\mathrm{k}}^{n}\big{|}\vartheta_{n-j}^{(\mathrm{k},n)}\big{|}\leq C_{\mathrm{k}}\quad\text{and}\quad\sum_{j=i}^{n}\big{|}\vartheta_{j-i}^{(\mathrm{k},j)}\big{|}\leq C_{\mathrm{k}}\quad\text{for any $n\geq i\geq\mathrm{k}$,}

is sufficient to the stability of BDF- $\mathrm{k}$ schemes for initial value problems. However, this issue seems to be theoretically challenging for $\mathrm{k}\geq 4$ and remains open to us.

Acknowledgements

The authors would like to thank Dr. Ji Bingquan for his valuable discussions.

Appendix A Two auxiliary functions for the BDF3 method

By using the definitions (1.1)-(1.3), it is easy to check that

\displaystyle\frac{\partial}{\partial x}\left|d_{\nu}(x,y)\right|>0\quad\text{and}\quad\frac{\partial}{\partial y}\left|d_{\nu}(x,y)\right|>0\quad\text{for $\nu=0,1,2$ and $x,y>0.$}

(A.1)

That is, the functions $d_{0}(x,y)$ , $-d_{1}(x,y)$ and $d_{2}(x,y)$ are increasing with respect to $x,y>0$ . We consider two auxiliary functions for the analysis of variable-step BDF3 method,

	$\displaystyle\alpha(x,y):=$	$\displaystyle\,-\frac{d_{1}(x,y)}{d_{0}(x,y)}=\frac{x\left(x^{2}y^{2}+4xy^{2}+3y^{2}+2xy+3y+1\right)}{(y+1)\left(3x^{2}y+4xy+2x+y+1\right)},$		(A.2)
	$\displaystyle\beta(x,y):=$	$\displaystyle\,\frac{d_{2}(x,y)}{d_{0}(x,y)}=\frac{x(x+1)^{2}y^{2}}{(y+1)\left(3x^{2}y+4xy+2x+y+1\right)}.$		(A.3)

Their properties will be examined with respect to two independent variables $x,y>0$ due to the facts $\alpha_{m}=\alpha(r_{m},r_{m-1})$ and $\beta_{m}=\beta(r_{m},r_{m-1})$ for $m\geq 2$ according to (3.5).

The above functions $\alpha$ and $\beta$ are strictly increasing with respect to $x,y>0$ . Actually, by simple but tedious calculations for (A.2) -(A.3), it is not difficult to check that

	$\displaystyle\frac{\partial\alpha}{\partial x}=\frac{(x+1)^{2}\left(3x^{2}+2x+3\right)y^{3}+2\left(2x^{3}+5x^{2}+6x+3\right)y^{2}+(x+2)^{2}y+1}{(y+1)\left(3x^{2}y+4xy+2x+y+1\right)^{2}},$		(A.4)
	$\displaystyle\frac{\partial\alpha}{\partial y}=\frac{x(x+1)^{2}(xy+y+1)(3xy+3y+1)}{(y+1)^{2}\left(3x^{2}y+4xy+2x+y+1\right)^{2}},$		(A.5)

and

	$\displaystyle\frac{\partial\beta}{\partial x}=\frac{(x+1)y^{2}\left(3x^{3}y+5x^{2}y+4x^{2}+3xy+3x+y+1\right)}{(y+1)\left(3x^{2}y+4xy+2x+y+1\right)^{2}},$		(A.6)
	$\displaystyle\frac{\partial\beta}{\partial y}=\frac{x(x+1)^{2}y\left(3x^{2}y+6xy+4x+2y+2\right)}{(y+1)^{2}\left(3x^{2}y+4xy+2x+y+1\right)^{2}}\quad\text{for $x,y>0$.}$		(A.7)

We have the following results. Some of proofs are technically complex and the mathematical derivations have been checked carefully by a symbolic calculation software.

Lemma A.1.

It holds that

0<\beta(x,y)<\alpha(x,y)<1+\beta(x,y)\quad\text{for $x,y>0$}.

Proof.

Simple calculations lead to

	$\displaystyle\alpha(x,y)-\beta(x,y)=$	$\displaystyle\,\frac{x(2xy+2y+1)}{3x^{2}y+4xy+2x+y+1},$
	$\displaystyle 1-\alpha(x,y)+\beta(x,y)=$	$\displaystyle\,\frac{(x+1)(xy+y+1)}{3x^{2}y+4xy+2x+y+1}.$

It completes the proof. ∎

Lemma A.2.

It holds that

0<\beta(x,y)<1\quad\text{for $0<x,y<\hat{R}_{3}\approx 3.4405$,}

where $\hat{R}_{3}$ is the unique positive root of $\hat{R}_{3}^{4}-2\hat{R}_{3}^{3}-4\hat{R}_{3}^{2}-3\hat{R}_{3}-1=0$ .

Proof.

According to (A.6)-(A.7), it holds that

\displaystyle\beta(x,y)-1<\beta(\hat{R}_{3},\hat{R}_{3})-1=\frac{\hat{R}_{3}^{4}-2\hat{R}_{3}^{3}-4\hat{R}_{3}^{2}-3\hat{R}_{3}-1}{3\hat{R}_{3}^{3}+4\hat{R}_{3}^{2}+3\hat{R}_{3}+1}=0\quad\text{for $0<x,y<\hat{R}_{3}$.}

It completes the proof. ∎

Lemma A.3.

For $0<x,y<R_{3}\approx 2.553$ , it holds that

2\alpha^{2}(x,y)+3\beta^{2}(x,y)-4\alpha(x,y)\beta(x,y)-2\alpha(x,y)+2\beta(x,y)<0.

Proof.

Consider an auxiliary function, see Figure 2,

	$\displaystyle g(x,y):=$	$\displaystyle\,\frac{1}{x(x+1)}\Big{[}2\alpha(x,y)^{2}+3\beta(x,y)^{2}-4\alpha(x,y)\beta(x,y)-2\alpha(x,y)+2\beta(x,y)\Big{]}$
	$\displaystyle=$	$\displaystyle\,\frac{\left(3x^{2}y+4xy+2x+y+1\right)}{\gamma(x,y)}\Big{[}(x+1)^{2}\left(x^{2}+x-4\right)y^{4}-2\left(4x^{2}+11x+7\right)y^{3}\Big{.}$
		$\displaystyle\,\hskip 99.58464pt\Big{.}-2\left(2x^{2}+10x+9\right)y^{2}-2(3x+5)y-2\Big{]}\qquad\text{for $0\leq x,y<R_{3}$,}$

where

\gamma(x,y):=(y+1)^{2}\left(3x^{2}y+4xy+2x+y+1\right)^{3}.

Obviously, it is sufficient to show that $g(x,y)<0$ for $0<x,y<R_{3}$ . Simple but tedious calculations yield

	$\displaystyle\frac{\partial g}{\partial x}=$	$\displaystyle\,\frac{1}{\gamma(x,y)}\Big{[}2\left(27x^{2}+88x+63\right)y^{2}+2\left(12x^{3}+90x^{2}+158x+79\right)y^{3}\Big{.}$
		$\displaystyle\,\hskip 28.45274pt\Big{.}+\left(4x^{4}+58x^{3}+207x^{2}+252x+99\right)y^{4}+(25-x)(x+1)^{3}y^{5}+(36x+50)y+8\Big{]}.$

Obviously, we have $\frac{\partial g}{\partial x}>0$ such that $g$ is increasing with respect to $x$ for $0<x,y<R_{3}$ . Thus it has no extreme points over the open square $(0,R_{3})^{2}$ . It remains to consider the maximum value along the four sides of $[0,R_{3}]^{2}$ :

(i)

Along the side $y=0$ , we have $g(x,0)=-\frac{2}{(2x+1)^{2}}<0$ for $0<x<R_{3}$ ..
(ii)

Along the side $y=R_{3}$ , $g(x,R_{3})<g(R_{3},R_{3})\approx-0.0000277<0$ for $0<x<R_{3}$ .
(iii)

Along the side $x=0$ , we have $g(0,y)=-\frac{4y+2}{y+1}<0$ for $0<y<R_{3}$ ..

(iv)

Along the side $x=R_{3}$ , one has

\displaystyle g(R_{3},y)\approx

\displaystyle\,\frac{0.06763y^{4}-0.12922y^{3}-0.100507y^{2}-0.0267488y-0.002113}{(y+1)^{2}(y+0.19847)^{2}}\quad\text{for $0<y<R_{3}$.}

It has a unique minimum point at $y\approx 0.103652$ , while

g(R_{3},y)<\max\{g(R_{3},0),g(R_{3},R_{3})\}=g(R_{3},R_{3})<0\quad\text{for $0<y<R_{3}$.}

In summary, we have $g(x,y)<0$ for $0<x,y<R_{3}$ . It completes the proof. ∎

References

[1] G. Albi, G. Dimarco and L. Pareschi, Implicit-explicit multistep methods for hyperbolic systems with multiscale relaxation, SIAM J. Sci. Comput., 42(4) (2020) A2402–A2435 .
[2] M. Calvo, T. Grande and R.D. Grigorieff, On the zero stability of the variable order variable stepsize BDF-formulas, Numer. Math., 57 (1990), 39–50.
[3] M. Crouzeix and F.J. Lisbona, The convergence of variable-stepsize, variable formula, multistep methods, SIAM J. Numer. Anal., 21 (1984), 512–534.
[4] V. DeCaria, A. Guzel, W. Layton and Y. Li, A variable stepsize, variable order family of low complexity, SIAM J. Sci. Comp., 43(3) (2021), A2130-A2160.
[5] G. Dimarco and L. Pareschi, Implicit-explicit linear multistep methods for stiff kinetic equations, SIAM J. Numer. Anal., 55 (2017), 664–690.
[6] R.D. Grigorieff, Stability of multistep-methods on variable grids, Numer. Math., 42 (1983), 359–377.
[7] R.D. Grigorieff and P.J. Paes-Leme, On the zero-stability of the 3-step BDF-formula on nonuniform grids, BIT (1984), 24, 85–91.
[8] N. Guglielmi and M. Zennaro, On the zero-stability of variable stepsize multistep methods: the spectral radius approach, Numer. Math., 88 (2001), 445–458.
[9] E. Hairer, S.P. Nørsett and G. Wanner, Solving Ordinary Differential Equations I: Nonstiff Problems, Volume 8 of Springer Series in Computational Mathematics, Second Edition, Springer-Verlag, 1992.
[10] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer Series in Computational Mathematics Volume 14, Second Edition, Springer-Verlag, 2002.
[11] H.-L. Liao, B. Ji and L. Zhang, An adaptive BDF2 implicit time-stepping method for the phase field crystal model, IMA J. Numer. Anal., 2020, doi:10.1093/imanum/draa075.
[12] H.-L. Liao, B. Ji, L. Wang and Z. Zhang, Mesh-robustness of an energy stable BDF2 scheme with variable steps for the Cahn-Hilliard model, arXiv:2102.03731v1, 2021.
[13] H.-L. Liao, T. Tang and T. Zhou, On energy stable, maximum-bound preserving, second-order BDF scheme with variable steps for the Allen-Cahn equation, SIAM J. Numer. Anal., 58(4) (2020), 2294-2314.
[14] H.-L. Liao and Z. Zhang, Analysis of adaptive BDF2 scheme for diffusion equations, Math. Comp., 90 (2021), 1207–1226.
[15] D. Wang and S.T. Ruuth, Variable step-size implicit-explicit linear multistep methods for time-dependent partial differential equations, J. Comput. Math., 26(6) (2008), 838–855.

	$\displaystyle\big{\|}\tilde{v}^{n}\big{\|}^{2}-$	$\displaystyle\,\big{\|}\tilde{v}^{n-1}\big{\|}^{2}+\big{\|}\tau_{n}\partial_{\tau}\tilde{v}^{n}\big{\|}^{2}=-2\tau_{n}\tilde{v}^{n}\mathcal{I}_{2}^{n}[\tilde{v}]+2\tau_{n}\tilde{v}^{n}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\left[f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})+\varepsilon^{i}\right]$
	$\displaystyle\leq$	$\displaystyle\,2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\big{\|}\mathcal{I}_{2}^{n}[\tilde{v}]\big{\|}+2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\big{\|}f(t_{i},\bar{v}^{i})-f(t_{i},v^{i})\big{\|}+2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\max_{1\leq i\leq n}\big{\|}\varepsilon^{i}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}$
	$\displaystyle\leq$	$\displaystyle\,2\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\big{\|}\mathcal{I}_{2}^{n}[\tilde{v}]\big{\|}+2L_{f}\tau_{n}\big{\|}\tilde{v}^{n}\big{\|}\sum_{i=2}^{n}\vartheta_{n-i}^{(2,n)}\big{\|}\tilde{v}^{i}\big{\|}+2\big{\|}\tilde{v}^{n}\big{\|}\tau_{n}\max_{1\leq i\leq n}\big{\|}\varepsilon^{i}\big{\|},$

	$\displaystyle\sum_{j=3}^{m}\big{\|}\mathcal{I}_{3}^{j}[\tilde{v}]\big{\|}\leq$	$\displaystyle\,\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\sum_{j=3}^{m}\sum_{i=3}^{j}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}\big{\|}d^{(3,i)}_{i-2}\big{\|}+\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\sum_{j=3}^{m}\sum_{i=3}^{j}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}\big{\|}d^{(3,i)}_{i-1}\big{\|}$
	$\displaystyle=$	$\displaystyle\,\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\sum_{i=3}^{m}\big{\|}d^{(3,i)}_{i-2}\big{\|}\sum_{j=i}^{m}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}+\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\sum_{i=3}^{m}\big{\|}d^{(3,i)}_{i-1}\big{\|}\sum_{j=i}^{m}\big{\|}\vartheta_{j-i}^{(3,j)}\big{\|}$
	$\displaystyle\leq$	$\displaystyle\,C_{3}\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}\big{(}\big{\|}d^{(3,3)}_{1}\big{\|}+d^{(3,4)}_{2}\big{)}+C_{3}\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}d^{(3,3)}_{2}$
	$\displaystyle\leq$	$\displaystyle\,5C_{3}\big{\|}\partial_{\tau}\tilde{v}^{2}\big{\|}+2C_{3}\big{\|}\partial_{\tau}\tilde{v}^{1}\big{\|}\qquad\text{for $m\geq 3$.}$