Stability of variable-step BDF2 and BDF3 methods
Abstract
We prove that the two-step backward differentiation formula (BDF2)
method is stable on arbitrary time grids;
while the variable-step BDF3 scheme is stable if almost all adjacent step ratios are less than 2.553.
These results relax the severe mesh restrictions in the literature and
provide a new understanding of variable-step BDF methods.
Our main tools include the discrete orthogonal convolution kernels
and an elliptic-type matrix norm.
Keywords: variable-step BDF formula,
stability, step-ratio condition, discrete orthogonal convolution kernels
AMS subject classifications: 65M06, 65M12
1 Introduction
In this paper, we revisit the stability of variable-step backward differentiation formulas (BDF) for the following initial value problem
Choose time levels with the variable time-step for , and the maximum step size . Put the adjacent time-step ratios
For any sequences , denote . Let () be the interpolating polynomial of a function over the nodes , , and . To describe the variable coefficients, define the following three functions
(1.1) | ||||
(1.2) | ||||
(1.3) |
Taking , one has the BDF1 formula for , and the variable-step BDF2 formula
(1.4) |
The variable-step BDF3 formula for ,
(1.5) |
As seen, the formula (1.5) also represents a variable-order variable-step BDF formula from the first-order to third-order. The theoretical results on (1.5) are naturally valid for any combinations of the BDF-k formulas if the time-step ratios are allowed to be zero in the associated coefficients.
We consider the variable-step BDF- time-stepping scheme for the ODE problem,
(1.6) |
Without losing the generality, assume that the starting values , , are available by some other high-order approaches, such as Runge-Kutta methods. For the stability analysis, assume further that the perturbed solution solves the equation with a bounded sequence . Then the difference solves
(1.7) |
These well-known stiff solvers have been tested to be practically valuable for differential-algebraic problems [5, 9, 10] and for the hyperbolic systems with multiscale relaxation [1]. Also, the variable-step versions are computationally efficient in capturing the multi-scale time behaviors [2, 4, 10, 11, 12, 13, 15] via adaptive time-stepping strategies. On the other hand, the stability and convergence analysis of variable-step BDF methods are difficult and remain incomplete, cf. [2, 4, 6, 7, 8, 9, 15], because they involve multiple degrees of freedom (independent time-step sizes).
One of the main defects in the existing theory is the severe mesh condition required for stability. For examples, the zero-stability of variable-step BDF2 method requires a step-ratio condition , see [6, 7]. By considering a specific time grid with constant step-ratio , it was also shown in [6] that the zero-stability of variable-step BDF3 method requires the maximum step-ratio limit . By using an elliptic type norm, Calvo, Grande and Grigorieff [2] gave the zero-stability constraint in 1990. Twenty years ago, Guglielmi and Zennaro [8] applied a spectral radius approach to find the step-ratio condition , which seems to be the best result for the variable-step BDF3 method.
We aim to relax the existing mesh conditions for the perturbed stability of variable-step BDF2 and BDF3 methods according to the following definition.
Definition 1 (Stability).
This definition is practically reasonable in the sense that one always uses some high-order scheme to start the -step time-stepping (1.6). In the uniform time-step case, it reduces into the classical zero-stability, cf. [3, Definition 2.2],
Definition 1 would be as useful as the original zero-stability for the numerical analysis of variable-step, multi-step methods. Actually, it was proven in [14] that the step-ratio condition is sufficient to the stability of variable-step BDF2 method for linear parabolic problem. In a recent work [12], this step-ratio condition was slightly improved to a new one, . These step-ratio conditions are weaker than the classical zero-stability restriction , but both of them are are sufficient for the stability of BDF2 method in the sense of Definition 1.
By developing a new framework with discrete orthogonal convolution (DOC) kernels, we apply Definition 1 to examine the ratio-step conditions for the stability of BDF2 and BDF3 methods in solving the ODE problems. The main contribution is two-fold:
We start from a novel viewpoint by writing the BDF- () formula as a convolution summation of local difference quotients (this viewpoint is quite different from that in previous works [12, 14])
(1.8) |
where the first superscript index in the discrete kernels denotes the order of method. From the BDF2 formula (1.4) with , we have
(1.9) |
Similarly, from the BDF3 formula (1.5) with , we have
(1.10) |
Here and hereafter, assume that the summation to be zero and the product to be one if the index . As for the BDF- kernels with any fixed indexes , we recall a class of discrete orthogonal convolution (DOC) kernels by a recursive procedure, also see [14, 12],
(1.11) |
Obviously, the DOC kernels satisfy the following discrete orthogonality identity
(1.12) |
where is the Kronecker delta symbol with if . Furthermore, with the identity matrix (), the above discrete orthogonality identity (1.12) also implies
where the two matrices and are defined by
Obviously, one has
which implies the following mutual orthogonality identity
(1.13) |
We will use this identity (1.13) to study the decaying property of DOC kernels .
We will derive an equivalent convolution form of the difference equation (1.7). By exchanging the summation order and using (1.12), one has
(1.14) |
where represents the starting effect on the numerical solution at ,
(1.15) |
Multiplying both sides of the difference equation (1.7) by the DOC kernels , and summing from to , we apply (1) to get
(1.16) |
In next two sections, we will present the stability analysis for the variable-step BDF2 and BDF3 methods, respectively, via this discrete convolution form (1.16) of the perturbed equation. Numerical experiments on the graded and random time meshes are included in Section 4 to support our theory. Some concluding remarks end this article.
2 Unconditional stability of BDF2 method
For the BDF2 method, the associated DOC kernels are positive and decay exponentially.
Lemma 2.1.
Proof.
For any fixed , by taking in the identity (1.13), one has for . According to the definition (1.9), one has for . The identity (1.13) gives
for . It yields immediately
It leads to the first result. By taking in (1.8), one has and for . Thus applying the equality (1) with yields
(2.1) |
where the explicit formula of and the definition (1.9) were used. It completes the proof. ∎
Theorem 2.1.
If , the BDF2 solution of (1.7) satisfies
Thus the variable-step BDF2 scheme is stable on arbitrary time meshes.
Proof.
Take . Multiplying both sides of (1.16) with , one applies Lemma 2.1 (the positivity and boundedness of DOC kernels) to obtain that
where the Lipschitz continuous property of the nonlinear function has been used. By dropping the nonnegative term at the left side and summing from to , we have
Let () be an integer such that . Now we take in the above inequality and get
It leads to
(2.2) |
By using Lemma 2.1, the starting error in (1.15) reads
such that
It follows from (2.2) that
Assuming that , one has
The standard discrete Grönwall inequality, e.g. [14, Lemma 3.1], completes the proof. ∎
3 Stability analysis of BDF3 method
3.1 Decaying of DOC kernels
Note that, the variable-step BDF3 method (1.8) of is exact for the linear polynomial . Taking in (1.8), one can find that for . As done in (2), we can apply the discrete equality (1) with (1.15) to get
However, it does not provide enough information for the subsequent stability analysis because no explicit formulas of DOC kernels are available. Furthermore, the DOC kernels are not always positive, see Remark 4 below, we turn to bound for any time-levels by imposing certain step-ratio constraint.
To do so, introduce the modified DOC kernels
(3.1) |
The discrete identity (1.13) gives
(3.2) |
For any fixed index , by taking and in the identity (3.2), respectively, one can derive that
(3.3) |
According to the definition (1.10), one has for . The identity (1.13) gives
or the difference equation
(3.4) |
where, by using (1.1)-(1.3) and (A.2)-(A.3), the coefficients and are defined by
(3.5) |
To evaluate the asymptotic behaviors of the DOC kernels , introduce the following vector and companion matrix
(3.10) |
For any fixed index , the difference equation (3.4) reads
or
(3.11) |
The associated initial vector according to (3.3).
If all of step-ratios (), Lemmas A.1-A.2 show that
(3.12) |
They imply that the roots of satisfy . Thus the eigenvalues of have the module less than 1 and the spectral radius for any . In general, it is not sufficient to ensure the global decaying of DOC kernels .
It needs to examine the joint companion matrix for . Let be the maximum norm on the space and let the same symbol denote also the corresponding induced matrix norm. We adopt the technique of Calvo et al [2] by considering a complex constant with and the following elliptic type norm
(3.15) |
Then one can bound the DOC kernels via
(3.16) |
To process the analysis, it is to determine a fixed parameter . We will choose a proper constant to ensure that
(3.17) |
if all of step-ratios (). It is easy to derive that
(3.18) |
Noticing that the following fact
one has
(3.19) |
Then the requirement (3.17) is equivalent to
(3.20) |
Then, taking into account the coefficient relationship (3.12), we can find that the condition (3.20) can be written equivalently in the following form, also see [2, Eq. (14)],
(3.21) |
Note that, the other branch requiring is omitted here since it is equivalent to (3.21) by replacing the undetermined parameter with . Since the coefficients are dependent on the ratios and , The inequalities in (3.21) define a family of complex disks centered at with the radius .
A heuristic analysis is considered firstly to obtain a rough estimate for the complex parameter , while the mathematical proof is left to next subsection. To ensure (3.21), one should choose a fixed such that the intersection set
Reminding the increasing functions (A.2)-(A.3), we see that the disk centered at has the maximum radius , while the disk centered at
has the minimum radius . Obviously, the largest value of may be determined from the fact that the smallest disk is tangential to the largest one at the tangential point . This necessary condition holds if and only if
which leads to
(3.22) |
or equivalently,
This polynomial equation has two positive roots and (it is checked that this situation also occurs in [2]). We throw away the small root and choose the large one with the corresponding tangential point .
Remark 1.
It is to mention that, by following the joint spectral radius approach in [8], one can obtain a slightly improved step-ratio constraint to ensure the boundedness of DOC kernels; however, it seems difficult to obtain the desired decaying behavior of DOC kernels theoretically from the extremal norm estimate of the family of companion matrices .
Remark 2.
The condition (3.21) is sufficient for (3.17), or ; but different choices of the parameter leads to different values of . Consider the simple case with for any , corresponding to the variable-step BDF2 scheme. By using the definition (3.5) together with (1.1)-(1.3), one has , and the reduced class of complex discs
Recalling the fact for any , one can check that
That is, the distance between the center of and the center of is always less than . Thus these complex discs are always have a common region for any , that is, is not empty. There exists a fixed parameter such that for any . By taking in (3.19), one has
Since for any , one gets . On the other hand, taking in (3.19) arrives at
For any , we have , which are less than 1/2. Always, the estimate (3.16) implies the globally asymptotic decaying of DOC kernels, as shown by Lemma 2.1.
3.2 Stability analysis
To avoid the undesired equation (3.22) which has two different positive roots, and to simplify the subsequent mathematical derivations, we fix the complex parameter
which is very close to the optimal tangential point in the heuristic analysis. Reminding the condition (3.21), we determine the maximum step-ratio by
which leads to the polynomial equation
It has a unique positive root . Now we prove the following result.
Lemma 3.1.
If the step ratios , there exists a constant , independent of the time , such that the DOC kernels in (1.11) satisfy
Proof.
The definitions (1.1) and (1.10) give for . By using (3.1), (3.3) and the coefficient bound (3.12), one gets
That is, the first two kernels are always positive and bounded so that
Now consider the general case . Taking in the condition (3.20) yields
or equivalently,
Thanks to Lemma A.3, they are valid if all of step-ratios . According to (3.10) and (3.19), the elliptic norm is a continuous function with respect to for aribitrary small . Thus there exists a constant such that
By taking the parameter in (3.15) and (3.19), one has and Moreover, by (A.4)-(A.5), one has . Thus, the estimate (3.16) gives
where , and then
(3.23) |
Therefore it follows that
We obtain the desired estimates by taking a constant which is independent of the time-level index . This proof is complete. ∎
Remark 3.
The elliptic type norm in (3.15) admits some other nonsingular matrices . Here consider a simple case with
Let be the unique positive root of the cubic equation . If the step-ratios satisfy (it is also superior to the existing mesh conditions in the literature [2, 6, 8]), one can follow the proofs of Lemmas A.1-A.2 to check that
It follows that
and then the DOC kernels are globally decaying. Although the proof of Lemma 3.1 is technically complex, the matrix in (3.15) would be optimal in the sense that the companion matrix always has a pair of conjugate complex eigenvalues. Actually, the inequality holds when the adjacent step ratios are close to 1, cf. Remark 4.
Remark 4.
Consider the uniform time-step , the BDF3 kernels (1.10) give
The difference equation (3.4) becomes
Since the associated characteristic equation has two roots , we have the following explicit formula of DOC kernels

In Figure 1, we fix and list the values of DOC kernels for different choices (let be the uniformly distributed random number):
-
(a)
the uniform mesh with for ;
-
(b)
the random mesh with for ;
-
(c)
the random mesh with for .
As observed, the DOC kernels are not always positive, but they always decay fast, as predicted by (3.23). It is interesting to find that, from the case (c), the DOC kernels maintain the fast damping property as a whole although there are many of step ratios greater than (about in the above test).
Theorem 3.1.
Proof.
Multiplying both sides of (1.16) with , one applies Lemma 3.1 to obtain that
By summing the above inequality for from to , we have
Taking some integer () such that . Now we take in the above inequality and get
(3.24) |
for . It remains to evaluate the error from the starting values. Taking the index in (1.15) gives
Recalling the definition (1.10) of BDF3 kernels with the increasing property (A.1), it is easy to check that and . Thus we apply Lemma 3.1 to get
It follows from (3.24) that
Assuming that , one has
for . The standard Grönwall inequality completes the proof. ∎
From the proof of Lemma 3.1, the uniform boundedness of and does not require all DOC kernels to decay rapidly. It always allows a finite number of bounded DOC kernels. As seen from the numerical tests in the next section, the imposed step-ratio condition in Theorem 3.1 is sufficient, but far from necessary. A practical step-ratio constraint for stability is that
almost all step ratios satisfy for .
4 Numerical tests
Consider an ODE model for with a smooth solution in our numerical tests. We run the variable-step BDF2 and BDF3 schemes in two scenarios:
-
(a)
The graded meshes for with grading parameters . The maximum step-ratio is and .
-
(b)
The random meshes with time-steps , where are uniformly distributed random numbers.
To start the multi-step schemes, a two-stage third-order singly diagonally implicit Runge-Kutta method is employed. We record the maximum error in each run and compute the convergence order by
where denotes the maximum time-step size for total subintervals.
Order Order Order 40 7.9e+01 5.28e-04 4.7e+03 8.77e-04 2.5e+05 1.35e-03 80 1.6e+02 1.34e-04 1.97 1.9e+04 2.25e-04 1.96 2.0e+06 3.49e-04 1.95 160 3.2e+02 3.39e-05 1.99 7.6e+04 5.72e-05 1.98 1.6e+07 8.91e-05 1.97 320 6.4e+02 8.52e-06 1.99 3.1e+05 1.44e-05 1.99 1.3e+08 2.25e-05 1.98 640 1.3e+03 2.14e-06 2.00 1.2e+06 3.61e-06 1.99 1.1e+09 5.66e-06 1.99 1280 2.6e+03 5.34e-07 2.00 4.9e+06 9.06e-07 2.00 8.4e+09 1.42e-06 2.00
Order Order Order 40 7.9e+01 1.27e-05 4.7e+03 2.94e-05 2.5e+05 5.73e-05 80 1.6e+02 1.65e-06 2.95 1.9e+04 3.91e-06 2.91 2.0e+06 7.85e-06 2.87 160 3.2e+02 2.10e-07 2.97 7.6e+04 5.05e-07 2.96 1.6e+07 1.03e-06 2.94 320 6.4e+02 2.65e-08 2.99 3.1e+05 6.41e-08 2.98 1.3e+08 1.31e-07 2.97 640 1.3e+03 3.32e-09 2.99 1.2e+06 8.07e-09 2.99 1.1e+09 1.66e-08 2.98 1280 2.6e+03 4.16e-10 3.00 4.9e+06 1.01e-09 2.99 8.4e+09 2.08e-09 2.99
Tables 1-2 list the numerical results of the BDF2 and BDF3 methods on graded meshes with three grading parameters, respectively. We see that, although the ratio increases as fast as , the numerical solutions remains stable and convergent with full accuracy.
Order 40 4.49e-02 5.86e-04 – 28.30 1.22 80 2.40e-02 1.43e-04 2.03 91.41 1.06 160 1.18e-02 4.09e-05 1.81 32.54 19.88 320 6.18e-03 1.03e-05 1.99 418.41 1.21 640 3.01e-03 2.51e-06 2.04 604.02 2.66 1280 1.55e-03 6.63e-07 1.92 1963.80 1.02
Order 40 4.60e-02 8.20e-06 – 7.58 9 1.57 80 2.46e-02 1.18e-06 2.79 361.49 18 1.93 160 1.22e-02 1.83e-07 2.69 1682.18 35 4.51 320 6.11e-03 2.40e-08 2.93 79.90 60 2.91 640 3.27e-03 3.40e-09 2.82 5765.00 146 1.20 1280 1.56e-03 4.16e-10 3.03 9677.92 250 6.64
Tables 3-4 record the numerical results on random time meshes. Table 3 suggests that the variable-step BDF2 method is robust with respect to the step-ratios , and supports the theoretical findings in Theorem 2.1. Reminding the step-ratio restriction in Theorem 3.1 for the BDF3 method, we also record the number (denoted by in Table 4) of time levels with . It is seen that the variable-step BDF3 method is stable and third-order convergent on random meshes, even if there are many of (about in our tests) step-ratios do not meet our theoretical condition. Nonetheless, it remains mysterious to us.
5 Concluding remarks
The stability of BDF2 and BDF3 methods with unequal time-steps is investigated by a new theoretical framework using the discrete orthogonal convolution kernels. Thanks to the global analysis, that is, the present technique formulates the current BDF solution as a convolution summation of all previous information with DOC kernels as the convolutional weights, see the form (1.16), we improved the setp-ratio constraints for the stability. It is to mention that, the global decaying estimates of DOC kernels are also critical in the numerical analysis of BDF methods for parabolic equations, cf. [11, 12, 14]. The stability and convergence theory of variable-step BDF3 scheme for these stiff problems will be addressed in forthcoming reports.
For the high-order BDF- () time-stepping methods, we can also define the associated DOC kernels via the recursive procedure (1.11). From the proof of Theorem 3.1, one needs a result similar to Lemma 3.1 under certain step-ratio condition. Actually, the uniform boundedness (there exists a constant independent of the time-level indexes ) of the absolute summations of DOC kernels
is sufficient to the stability of BDF- schemes for initial value problems. However, this issue seems to be theoretically challenging for and remains open to us.
Acknowledgements
The authors would like to thank Dr. Ji Bingquan for his valuable discussions.
Appendix A Two auxiliary functions for the BDF3 method
By using the definitions (1.1)-(1.3), it is easy to check that
(A.1) |
That is, the functions , and are increasing with respect to . We consider two auxiliary functions for the analysis of variable-step BDF3 method,
(A.2) | ||||
(A.3) |
Their properties will be examined with respect to two independent variables due to the facts and for according to (3.5).
The above functions and are strictly increasing with respect to . Actually, by simple but tedious calculations for (A.2) -(A.3), it is not difficult to check that
(A.4) | |||
(A.5) |
and
(A.6) | |||
(A.7) |
We have the following results. Some of proofs are technically complex and the mathematical derivations have been checked carefully by a symbolic calculation software.
Lemma A.1.
It holds that
Proof.
Simple calculations lead to
It completes the proof. ∎
Lemma A.2.
It holds that
where is the unique positive root of .
Lemma A.3.
For , it holds that
Proof.

Obviously, it is sufficient to show that for . Simple but tedious calculations yield
Obviously, we have such that is increasing with respect to for . Thus it has no extreme points over the open square . It remains to consider the maximum value along the four sides of :
-
(i)
Along the side , we have for ..
-
(ii)
Along the side , for .
-
(iii)
Along the side , we have for ..
-
(iv)
Along the side , one has
It has a unique minimum point at , while
In summary, we have for . It completes the proof. ∎
References
- [1] G. Albi, G. Dimarco and L. Pareschi, Implicit-explicit multistep methods for hyperbolic systems with multiscale relaxation, SIAM J. Sci. Comput., 42(4) (2020) A2402–A2435 .
- [2] M. Calvo, T. Grande and R.D. Grigorieff, On the zero stability of the variable order variable stepsize BDF-formulas, Numer. Math., 57 (1990), 39–50.
- [3] M. Crouzeix and F.J. Lisbona, The convergence of variable-stepsize, variable formula, multistep methods, SIAM J. Numer. Anal., 21 (1984), 512–534.
- [4] V. DeCaria, A. Guzel, W. Layton and Y. Li, A variable stepsize, variable order family of low complexity, SIAM J. Sci. Comp., 43(3) (2021), A2130-A2160.
- [5] G. Dimarco and L. Pareschi, Implicit-explicit linear multistep methods for stiff kinetic equations, SIAM J. Numer. Anal., 55 (2017), 664–690.
- [6] R.D. Grigorieff, Stability of multistep-methods on variable grids, Numer. Math., 42 (1983), 359–377.
- [7] R.D. Grigorieff and P.J. Paes-Leme, On the zero-stability of the 3-step BDF-formula on nonuniform grids, BIT (1984), 24, 85–91.
- [8] N. Guglielmi and M. Zennaro, On the zero-stability of variable stepsize multistep methods: the spectral radius approach, Numer. Math., 88 (2001), 445–458.
- [9] E. Hairer, S.P. Nørsett and G. Wanner, Solving Ordinary Differential Equations I: Nonstiff Problems, Volume 8 of Springer Series in Computational Mathematics, Second Edition, Springer-Verlag, 1992.
- [10] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer Series in Computational Mathematics Volume 14, Second Edition, Springer-Verlag, 2002.
- [11] H.-L. Liao, B. Ji and L. Zhang, An adaptive BDF2 implicit time-stepping method for the phase field crystal model, IMA J. Numer. Anal., 2020, doi:10.1093/imanum/draa075.
- [12] H.-L. Liao, B. Ji, L. Wang and Z. Zhang, Mesh-robustness of an energy stable BDF2 scheme with variable steps for the Cahn-Hilliard model, arXiv:2102.03731v1, 2021.
- [13] H.-L. Liao, T. Tang and T. Zhou, On energy stable, maximum-bound preserving, second-order BDF scheme with variable steps for the Allen-Cahn equation, SIAM J. Numer. Anal., 58(4) (2020), 2294-2314.
- [14] H.-L. Liao and Z. Zhang, Analysis of adaptive BDF2 scheme for diffusion equations, Math. Comp., 90 (2021), 1207–1226.
- [15] D. Wang and S.T. Ruuth, Variable step-size implicit-explicit linear multistep methods for time-dependent partial differential equations, J. Comput. Math., 26(6) (2008), 838–855.