This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\DeclareSourcemap\maps

[datatype=bibtex] \map[overwrite=true] \step[fieldsource=fjournal] \step[fieldset=journal, origfieldval]

Positivity preservation of implicit discretizations of the advection equation

Yiannis Hadjimichael [email protected], MTA-ELTE Numerical Analysis and Large Networks Research Group, Eötvös Loránd University, Pázmány Péter sétány 1/C, H-1117, Budapest, Hungary, and Weierstrass Institute (WIAS), Mohrenstraße 39, 10117, Berlin, Germany.    David I. Ketcheson [email protected], King Abdullah University of Science and Technology (KAUST), Computer Electrical and Mathematical Science and Engineering Division (CEMSE), Thuwal, 23955-6900, Saudi Arabia.    Lajos Lóczi [email protected], Department of Numerical Analysis, Eötvös Loránd University, and Department of Differential Equations, Budapest University of Technology and Economics, Hungary.
Abstract

We analyze, from the viewpoint of positivity preservation, certain discretizations of a fundamental partial differential equation, the one-dimensional advection equation with periodic boundary condition. The full discretization is obtained by coupling a finite difference spatial semi-discretization (the second- and some higher-order centered difference schemes, or the Fourier spectral collocation method) with an arbitrary θ\theta-method in time (including the forward and backward Euler methods, and a second-order method by choosing θ[0,1]\theta\in[0,1] suitably). The full discretization generates a two-parameter family of circulant matrices Mm×mM\in\mathbb{R}^{m\times m}, where each matrix entry is a rational function in θ\theta and ν\nu. Here, ν\nu denotes the CFL number, being proportional to the ratio between the temporal and spatial discretization step sizes. The entrywise non-negativity of the matrix MM—which is equivalent to the positivity preservation of the fully discrete scheme—is investigated via discrete Fourier analysis and also by solving some low-order parametric linear recursions. We find that positivity preservation of the fully discrete system is impossible if the number of spatial grid points mm is even. However, it turns out that positivity preservation of the fully discrete system is recovered for odd values of mm provided that θ1/2\theta\geq 1/2 and ν\nu are chosen suitably. These results are interesting since the systems of ordinary differential equations obtained via the spatial semi-discretizations studied are not positivity preserving.

The Application-domain Specific Highly Reliable IT Solutions project has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the Thematic Excellence Programme TKP2020-NKA-06 (National Challenges Subprogramme) funding scheme. This work was supported by the King Abdullah University of Science and Technology (KAUST), 4700 Thuwal, 23955-6900, Saudi Arabia, and by the Leibniz Competition.

1 Background and motivation

In this work, we investigate the positivity of some discretizations of the advection equation with periodic boundary condition

Ut(x,t)=aUx(x,t),x[0,1],t>0,U(x,0)=U0(x),U(0,t)=U(1,t),\displaystyle\begin{split}U_{t}(x,t)&=aU_{x}(x,t),\qquad x\in[0,1],t>0,\\ U(x,0)&=U_{0}(x),\\ U(0,t)&=U(1,t),\end{split} (1)

where U:×[0,+)U:\mathbb{R}\times[0,+\infty)\to\mathbb{R} is the unknown function, U0:U_{0}:\mathbb{R}\to\mathbb{R} is a given differentiable initial function, and a>0a>0 is a constant. The exact solution of the Cauchy problem (1), given by U(x,t)=U0({x+at})U(x,t)=U_{0}(\{x+at\}) (where {}\{\cdot\} denotes the fractional part), is positivity preserving; i.e.

x[0,1],t>0U0(x)0U(x,t)0.\displaystyle\forall x\in[0,1],\forall t>0\quad\quad U_{0}(x)\geq 0\implies U(x,t)\geq 0. (2)

Positivity is often important in this context, since UU may represent a concentration or density that cannot be negative.

Remark 1.

Herein the term positivity is always meant in the weak sense; i.e. it means non-negativity.

Finite difference spatial semi-discretization of (1) on a uniform grid {Δx,2Δx,,mΔx}[0,1]\{\Delta x,2\Delta x,\ldots,m\Delta x\}\subset[0,1] with mesh spacing Δx>0\Delta x>0 and mΔx=1m\Delta x=1 yields a system of ordinary differential equations

u(t)\displaystyle u^{\prime}(t) =aΔxLu(t),\displaystyle=\frac{a}{\Delta x}Lu(t), (3)

where u:mu:\mathbb{R}\to\mathbb{R}^{m}, Lm×mL\in\mathbb{R}^{m\times m} is a circulant matrix [2, Section 5.16], and m+m\in\mathbb{N}^{+} is the number of grid points (the points x=0x=0 and x=1x=1 are identified due to the periodic boundary condition). If one uses an upwind spatial discretization

L=(1100001100001000001110001),L=\left(\begin{array}[]{cccccc}-1&1&0&\cdots&0&0\\ 0&-1&1&\cdots&0&0\\ 0&0&-1&\ddots&0&0\\ \vdots&\vdots&\ddots&\ddots&\ddots&\vdots\\ 0&0&\cdots&0&-1&1\\ 1&0&\cdots&0&0&-1\\ \end{array}\right), (4)

then the exact solution of (3) is also positivity preserving. Moreover, a corresponding full discretization will be positivity preserving too, under an appropriate time step size restriction 0<ΔtΔt00<\Delta t\leq\Delta t_{0} if, for example, the forward (explicit) Euler method or any strong stability preserving method [5] is used in time; see, e.g. [3].

Positivity-preserving methods for transport equations are typically based on low-order upwind-biased spatial discretizations like that above, or involve nonlinear limiters (or both). Here we instead consider the positivity of linear higher-order centered discretizations. A second-order scheme is obtained with the centered difference discretization

L=(0120012120120001200000120121200120).L=\left(\begin{array}[]{cccccc}0&\frac{1}{2}&0&\cdots&0&-\frac{1}{2}\\ -\frac{1}{2}&0&\frac{1}{2}&\cdots&0&0\\ 0&-\frac{1}{2}&0&\ddots&0&0\\ \vdots&\vdots&\ddots&\ddots&\ddots&\vdots\\ 0&0&\cdots&-\frac{1}{2}&0&\frac{1}{2}\\ \frac{1}{2}&0&\cdots&0&-\frac{1}{2}&0\\ \end{array}\right). (5)

However, this spatial semi-discretization is not positivity preserving, since the matrix LL has at least one negative off-diagonal entry [7, Chapter I, Theorem 7.2]. This implies that any consistent full discretization based on (5) must fail to preserve positivity under sufficiently small step sizes Δt>0\Delta t>0. Indeed, a full discretization based on the scheme (5) and forward Euler in time is not positivity preserving for any step size Δt>0\Delta t>0.

On the other hand, interestingly, using (5) with backward (implicit) Euler time integration, one observes positivity preservation under large enough time step sizes provided that the parity of the number of spatial grid points is odd. To investigate the differences between the behavior of the forward and backward Euler methods, we will study the θ\theta-method [6, Chapter IV.3] as time discretization applied to (3)

un+1=un+aΔtΔx((1θ)Lun+θLun+1).\displaystyle u^{n+1}=u^{n}+\frac{a\Delta t}{\Delta x}((1-\theta)Lu^{n}+\theta Lu^{n+1}). (6)

For θ[0,1]\theta\in[0,1], the θ\theta-family includes both Euler methods as limiting cases: the forward Euler method for θ=0\theta=0, the backward Euler method for θ=1\theta=1, and the only second-order θ\theta-method for θ=1/2\theta=1/2; any θ\theta-method with θ(0,1]\theta\in(0,1] is implicit.

Remark 2.

As is customary in the context of space-time discretizations of partial differential equations, superscripts of uu in (6) (and later in this work) are not exponents but denote time discretization steps.

Remark 3.

Similarly to (2), the semi-discrete system (3) and the fully discrete system (6) are said to be positivity preserving, if for any componentwise non-negative vector of initial condition

  • u(0)u(0), the solution u(t)u(t) of (3) stays componentwise non-negative t>0\forall t>0

  • u0u^{0}, the solution unu^{n} of (6) stays componentwise non-negative n+\forall n\in\mathbb{N}^{+},

respectively.

The motivation for this work is not to develop new positivity preserving methods, but to study the positivity of some of fundamental discretizations such as the second-order centered difference method (5) and the θ\theta-method (6). As we will see, the combination of these methods does not preserve positivity in general, nor in the limit of small time step size, so it is not typically recommended in practice. Nevertheless, this study may both shed light on the behavior of more complicated methods used in practice and provide tools that can be used to study the positivity of those methods. In the later sections of the paper, we combine higher-order methods in space with the θ\theta-method in time, as a next step in this direction.

1.1 Structure of the paper and notation

In Section 2, we first characterize positivity preservation of full discretizations of (1) resulting from finite difference spatial and one-step time discretizations. Then, in Section 2.1, we study positivity of the second-order centered differences in space combined with the θ\theta-method in time, using discrete Fourier analysis. We also point out some connections with structured non-negative inverse eigenvalue problems. In Section 3, we study this particular full discretization in more detail. In Section 3.1, by setting up and solving certain parametric linear recursions, we derive explicit, non-trigonometric formulae for the entries of the full discretization matrix MM. Then, in Section 3.2, we use these formulae to provide precise results on the non-negativity of MM in terms of roots of some sparse polynomials. In Section 4, we discuss some observations and results regarding higher-order spatial discretizations, including high-order centered differences (in Section 4.1) and spectral collocation methods (in Section 4.2). We summarize our findings in Section 5.

Throughout the paper, the set of positive integers is denoted by +\mathbb{N}^{+}, the complex imaginary unit is ı\imath, the identity matrix is Im×mI\in\mathbb{R}^{m\times m}, and to emphasize the dimensions of a matrix, we will sometimes write, for example, Lm×mL_{m\times m}. The symbol M0M\geq 0 means that Mi,j0M_{i,j}\geq 0 for every entry 1i,jm1\leq i,j\leq m of the matrix Mm×mM\in\mathbb{R}^{m\times m}. The positive integer mm is the number of spatial grid points within the interval [0,1][0,1], and the matrices LL and MM are the matrices corresponding to the spatial and the full discretizations, respectively. The three key parameters in our investigations will be mm, θ[0,1]\theta\in[0,1] and ν>0\nu>0 (see (6) and (12)).

The computations in this work have been carried out by using Wolfram Mathematica version 11.

2 Discrete Fourier analysis

From here on, we consider the problem (1) on the domain x[0,1]x\in[0,1] with periodic boundary condition U(0,t)=U(1,t)U(0,t)=U(1,t). Finite difference discretization in space with step size Δx\Delta x leads to (3) with uju(jΔx)u_{j}\approx u(j\Delta x) for 1jm1\leq j\leq m. The circulant matrix Lm×mL\in\mathbb{R}^{m\times m} has the eigendecomposition

L=Λ,\displaystyle L={\cal{F}}\Lambda{\cal{F}}^{*}, (7)

where the (unitary) matrix of normalized eigenvectors {\cal{F}} has entries

fj,1mexp(ı(j1)ξ)(1j,m),\displaystyle f_{j,\ell}\coloneqq\frac{1}{\sqrt{m}}\exp(\imath(j-1)\xi_{\ell})\quad\quad(1\leq j,\ell\leq m), (8)

and Λ\Lambda is the diagonal matrix of eigenvalues λ\lambda_{\ell}, which depends on the particular finite difference method chosen. Here ξ\xi_{\ell} are evenly spaced angles

ξ2π(1)m(1m),\xi_{\ell}\coloneqq\frac{2\pi(\ell-1)}{m}\quad\quad(1\leq\ell\leq m), (9)

such that exp(ıξ)\exp(\imath\xi_{\ell}) are the mthm^{\text{th}} roots of unity. Applying a one-step time discretization with step size Δt\Delta t and stability function R:R:\mathbb{C}\to\mathbb{C} to (3) leads to the iteration

un+1\displaystyle u^{n+1} =Mun,\displaystyle=Mu^{n}, (10)

where

M:=R(νL)=R(νΛ),M:=R(\nu L)={\cal{F}}R(\nu\Lambda){\cal{F}}^{*}, (11)

and

ν:=aΔtΔx>0\nu:=a\frac{\Delta t}{\Delta x}>0 (12)

is the CFL number. Then it is easily seen that

positivity preservation of the fully discrete numerical solutionM0.\boxed{\text{positivity preservation of the fully discrete numerical solution}\quad\Longleftrightarrow\quad M\geq 0.}

For one-step methods, RR is a rational function, and products and inverses of circulant matrices are also circulant [2, Fact 5.16.7], so MM is also a real, circulant matrix. Thus it is defined completely by the entries of its first row, which are given by

M1,j\displaystyle M_{1,j} =1m=1mR(νλ)exp(ı(j1)ξ)(1jm).\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}R(\nu\lambda_{\ell})\exp(-\imath(j-1)\xi_{\ell})\quad\quad(1\leq j\leq m). (13)

2.1 Second-order centered discretization in space, θ\theta-method in time

In what follows we assume m3m\geq 3. Consider the case of a 3-point centered difference approximation in space (having order 2):

Ux|x=xjuj+1uj12Δx,U_{x}\Big{|}_{x=x_{j}}\approx\frac{u_{j+1}-u_{j-1}}{2\Delta x},

so that Lm×mL\in\mathbb{R}^{m\times m} is a circulant matrix with entries (1/2,0,1/2)(-1/2,0,1/2) on the central three diagonals:

L:=(0120012120120001200000120121200120),L:=\left(\begin{array}[]{cccccc}0&\frac{1}{2}&0&\cdots&0&-\frac{1}{2}\\ -\frac{1}{2}&0&\frac{1}{2}&\cdots&0&0\\ 0&-\frac{1}{2}&0&\ddots&0&0\\ \vdots&\vdots&\ddots&\ddots&\ddots&\vdots\\ 0&0&\cdots&-\frac{1}{2}&0&\frac{1}{2}\\ \frac{1}{2}&0&\cdots&0&-\frac{1}{2}&0\\ \end{array}\right), (14)

that is,

Li,i1\displaystyle L_{i,i-1} 12,Li,i+112\displaystyle\coloneqq-\frac{1}{2},\quad\quad L_{i,i+1}\coloneqq\frac{1}{2} (15a)
L1,m\displaystyle L_{1,m} 12,Lm,112.\displaystyle\coloneqq-\frac{1}{2},\quad\quad L_{m,1}\coloneqq\frac{1}{2}. (15b)

It is known that the eigenvalues of LL are

λ=ısin(ξ)(1m).\lambda_{\ell}=\imath\sin(\xi_{\ell})\quad(1\leq\ell\leq m). (16)

Now we consider the θ\theta-method [6, Chapter IV.3] in time, whose stability function is

R(z)1+(1θ)z1θz,\displaystyle R(z)\coloneqq\frac{1+(1-\theta)z}{1-\theta z}, (17)

so with the second-order centered difference in space we get from (13) for 1jm1\leq j\leq m that the entries of the full discretization matrix are

M1,j=1m=1m1+(1θ)νısin(ξ)1θνısin(ξ)exp(ı(j1)ξ)=1m=1m(1θ(1θ)ν2sin2(ξ))cos((j1)ξ)+νsin(ξ)sin((j1)ξ)1+θ2ν2sin2(ξ)ım=1m(1θ(1θ)ν2sin2(ξ))sin((j1)ξ)νsin(ξ)cos((j1)ξ)1+θ2ν2sin2(ξ).\begin{split}M_{1,j}&=\frac{1}{m}\sum_{\ell=1}^{m}\frac{1+(1-\theta)\nu\imath\sin(\xi_{\ell})}{1-\theta\nu\imath\sin(\xi_{\ell})}\exp\left(-\imath(j-1)\xi_{\ell}\right)\\ &=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\cos((j-1)\xi_{\ell})+\nu\sin(\xi_{\ell})\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}\\ &\quad-\frac{\imath}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\sin((j-1)\xi_{\ell})-\nu\sin(\xi_{\ell})\cos((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}.\end{split} (18)

Note that the angles {(j1)ξ}=1m\{(j-1)\xi_{\ell}\}_{\ell=1}^{m} are symmetric about the xx-axis if mm is odd, and also if mm is even and jj is odd. If both mm and jj are even, then the angles are symmetric about the origin. Therefore, we have that

=1msin((j1)ξ)=0and=1msin(ξ)cos((j1)ξ)=0,\displaystyle\sum_{\ell=1}^{m}\sin((j-1)\xi_{\ell})=0\qquad\text{and}\qquad\sum_{\ell=1}^{m}\sin(\xi_{\ell})\cos((j-1)\xi_{\ell})=0,

for all 1jm1\leq j\leq m and for any value of mm. Moreover the factors 1θ(1θ)ν2sin2(ξ)1+θ2ν2sin2(ξ)\frac{1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})} and ν1+θ2ν2sin2(ξ)\frac{\nu}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})} in (18) keep this symmetry. Thus, the imaginary part of (18) vanishes, yielding M1,jM_{1,j}\in\mathbb{R} for all jj, as expected. So for 1jm1\leq j\leq m we get

M1,j=1m=1m(1θ(1θ)ν2sin2(ξ))cos((j1)ξ)+νsin(ξ)sin((j1)ξ)1+θ2ν2sin2(ξ).M_{1,j}=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\cos((j-1)\xi_{\ell})+\nu\sin(\xi_{\ell})\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}.

We will also make use of the identities

cos((m1)ξ)\displaystyle\cos((m-1)\xi_{\ell}) =cos(ξ),\displaystyle=\cos(\xi_{\ell}), sin((m1)ξ)\displaystyle\sin((m-1)\xi_{\ell}) =sin(ξ).\displaystyle=-\sin(\xi_{\ell}). (19)

This leads to the following expressions for the first, second and last entries of the first row of MM:

M1,1\displaystyle M_{1,1} =1m=1m1θ(1θ)ν2sin2(ξ)1+θ2ν2sin2(ξ),\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}\frac{1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})},
M1,2\displaystyle M_{1,2} =1m=1m(1θ(1θ)ν2sin2(ξ))cos(ξ)+νsin2(ξ)1+θ2ν2sin2(ξ),\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\cos(\xi_{\ell})+\nu\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})},
M1,m\displaystyle M_{1,m} =1m=1m(1θ(1θ)ν2sin2(ξ))cos(ξ)νsin2(ξ)1+θ2ν2sin2(ξ).\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\cos(\xi_{\ell})-\nu\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}.

These entries will have a special role in the forthcoming analysis. We distinguish two cases.

Case 1: mm is even.

By using similar symmetry arguments as before, we conclude that for even m=2k4m=2k\geq 4 the entries of matrix MM are given by

M1,j={1m=1m(1θ(1θ)ν2sin2(ξ))cos((j1)ξ)1+θ2ν2sin2(ξ),if j is odd,1m=1mνsin(ξ)sin((j1)ξ)1+θ2ν2sin2(ξ),if j is even.\displaystyle M_{1,j}=\begin{cases}\displaystyle\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})\right)\cos((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})},&\mbox{if }j\text{ is odd},\\[20.0pt] \displaystyle\frac{1}{m}\sum_{\ell=1}^{m}\frac{\nu\sin(\xi_{\ell})\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})},&\mbox{if }j\text{ is even}.\end{cases}

Considering the above expression for j=mj=m we have

M1,m\displaystyle M_{1,m} =1m=1mνsin2(ξ)1+θ2ν2sin2(ξ)<0.\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}\frac{-\nu\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}<0.

Thus the discretization using second-order centered differences in space and the θ\theta-method in time cannot preserve positivity when mm is even, regardless of the values of θ[0,1]\theta\in[0,1] and ν>0\nu>0. We can arrive at the same conclusion by observing that for any m=2k4m=2k\geq 4 we have M1,2=M1,mM_{1,2}=-M_{1,m}, so that one of these (non-zero) entries must always be negative.

Case 2: mm is odd.

Let us now consider the case of odd m=2k+13m=2k+1\geq 3. Then sin(ξ)0\sin(\xi_{\ell})\neq 0 for 2m2\leq\ell\leq m. Writing

M1,1\displaystyle M_{1,1} =1m(1+=2m1θ(1θ)ν2sin2(ξ)1+θ2ν2sin2(ξ))\displaystyle=\frac{1}{m}\left(1+\sum_{\ell=2}^{m}\frac{1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}\right)
M1,j\displaystyle M_{1,j} =1m(1+=2m(1θ(1θ)ν2sin2(ξ))cos((j1)ξ)+νsin(ξ)sin((j1)ξ)1+θ2ν2sin2(ξ))(j2),\displaystyle=\frac{1}{m}\left(1+\sum_{\ell=2}^{m}\frac{(1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell}))\cos((j-1)\xi_{\ell})+\nu\sin(\xi_{\ell})\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}\right)\quad(j\geq 2),

and taking ν+\nu\to+\infty with mm and θ(0,1]\theta\in(0,1] fixed, we find that

M1,1limν+M1,1\displaystyle M_{1,1}^{\infty}\coloneqq\lim_{\nu\to+\infty}M_{1,1} =1m(1=2m1θθ)=1m1mθ\displaystyle=\frac{1}{m}\left(1-\sum_{\ell=2}^{m}\frac{1-\theta}{\theta}\right)=1-\frac{m-1}{m\theta}
M1,jlimν+M1,j\displaystyle M_{1,j}^{\infty}\coloneqq\lim_{\nu\to+\infty}M_{1,j} =1m(1=2m1θθcos((j1)ξ))=1m(1+1θθ)=1mθ(j2).\displaystyle=\frac{1}{m}\left(1-\sum_{\ell=2}^{m}\frac{1-\theta}{\theta}\cos((j-1)\xi_{\ell})\right)=\frac{1}{m}\left(1+\frac{1-\theta}{\theta}\right)=\frac{1}{m\theta}\quad(j\geq 2).

We see that

M1,j>0 for all 2jm and θ(0,1],M_{1,j}^{\infty}>0\text{ for all }2\leq j\leq m\text{ and }\theta\in(0,1],

while

M1,1>0θ>m1m.M_{1,1}^{\infty}>0\quad\Longleftrightarrow\quad\theta>\frac{m-1}{m}.

Thus for fixed m3m\geq 3 and θ>m1m\theta>\frac{m-1}{m}, the matrix MM is non-negative if ν>0\nu>0 is large enough.

We now show that M0M\geq 0 also holds for θ=m1m\theta=\frac{m-1}{m} with mm fixed and for ν>0\nu>0 large enough. Clearly, we only need to verify the non-negativity of entry M1,1M_{1,1} for ν>0\nu>0 large enough. In fact, for θ=m1m\theta=\frac{m-1}{m} and for any ν>0\nu>0 we have M1,1>0M_{1,1}>0. To see this, consider a summand with 2m2\leq\ell\leq m in M1,1M_{1,1}:

1θ(1θ)ν2sin2(ξ)1+θ2ν2sin2(ξ)|θ=m1m=m2(m1)ν2sin2(ξ)m2+(m1)2ν2sin2(ξ)=:φ(ν,).\frac{1-\theta(1-\theta)\nu^{2}\sin^{2}(\xi_{\ell})}{1+\theta^{2}\nu^{2}\sin^{2}(\xi_{\ell})}\Bigg{|}_{\theta=\frac{m-1}{m}}=\frac{m^{2}-(m-1)\nu^{2}\sin^{2}\left(\xi_{\ell}\right)}{m^{2}+(m-1)^{2}\nu^{2}\sin^{2}\left(\xi_{\ell}\right)}=:\varphi(\nu,\ell).

Its partial derivative with respect to ν\nu is

νφ(ν,)=2(m1)m3νsin2(ξ)(m2+(m1)2ν2sin2(ξ))2<0,\partial_{\nu}\varphi(\nu,\ell)=-\frac{2(m-1)m^{3}\nu\sin^{2}\left(\xi_{\ell}\right)}{\left(m^{2}+(m-1)^{2}\nu^{2}\sin^{2}\left(\xi_{\ell}\right)\right)^{2}}<0,

and φ(0,)=1\varphi(0,\ell)=1, hence the function

νM1,1|θ=m1m=1m(1+=2mφ(ν,))\displaystyle\nu\mapsto M_{1,1}\Big{|}_{\theta=\frac{m-1}{m}}=\frac{1}{m}\left(1+\sum_{\ell=2}^{m}\varphi(\nu,\ell)\right)

is positive at ν=0\nu=0, strictly decreases, and its limit when ν+\nu\to+\infty is M1,1|θ=m1m=0M_{1,1}^{\infty}\big{|}_{\theta=\frac{m-1}{m}}=0, completing the proof of the claim.

Summarizing the above, we have proved the following for any ν>0\nu>0.

Theorem 1.

Consider the advection equation (1) with periodic boundary condition discretized using 2nd2^{\text{nd}}-order centered differences in space and the θ\theta-method in time with m3m\geq 3 spatial grid points and θ[0,1]\theta\in[0,1]. The full discretization takes the form (10), where
(i) if mm is even, then MM has at least one negative entry;
(ii) if mm is odd and θ[m1m,1]\theta\in\left[\frac{m-1}{m},1\right], then for large enough Δt\Delta t all entries of MM are non-negative.

A refinement of Theorem 1 for odd values of mm will be given at the end of Section 3; see Theorem 2. As for the interval θ[m1m,1]\theta\in\left[\frac{m-1}{m},1\right] appearing in Theorem 1, see also Figures 34.

Remark 4.

In the formulae leading to Theorem 1 we used a trigonometric representation of the matrix entries M1,jM_{1,j}. Here we highlight a related approach to studying the non-negativity of MM by relying only on the eigenvalues σ\sigma_{\ell} (1m1\leq\ell\leq m) of MM. According to (11), (16) and (17), we have

σ:=R(νλ)=1+(1θ)νısin(ξ)1θνısin(ξ).\sigma_{\ell}:=R(\nu\lambda_{\ell})=\frac{1+(1-\theta)\nu\imath\sin\left(\xi_{\ell}\right)}{1-\theta\nu\imath\sin\left(\xi_{\ell}\right)}.

The main question in the context of non-negative inverse eigenvalue problems is to find (necessary or sufficient) conditions for a set Σ:={σ1,,σm}\Sigma:=\{\sigma_{1},\ldots,\sigma_{m}\}\subset\mathbb{C} to be the spectrum of some non-negative m×mm\times m matrix. One such condition is the following. It is known [1, Chapter 4] that if Σ\Sigma is the spectrum of an m×mm\times m non-negative matrix, then

p,q+:0(j=1mσjp)qmq1j=1mσjpq.\forall\,p,q\in\mathbb{N}^{+}:\quad\quad 0\leq\left(\sum_{j=1}^{m}\sigma_{j}^{\,p}\right)^{q}\leq m^{q-1}\sum_{j=1}^{m}\sigma_{j}^{\,pq}. (20)

For example, for m=5m=5 and θ=1\theta=1, (20) with p{1,,9}p\in\{1,\ldots,9\} and q{2,3}q\in\{2,3\} yields the lower bounds

νν(p,q),\nu\geq\nu_{*}(p,q), (21)

where the approximate values of ν(p,q)\nu_{*}(p,q) are given below:

ν(p,q)p=1p=2p=3p=4p=5p=6p=7p=8p=9q=23.00741.4620.96690.72190.57530.47780.40820.35630.3160q=32.14971.02690.66940.49410.39070.32270.27490.23930.2119\begin{array}[]{|c|c|c|c|c|c|c|c|c|c|}\hline\cr\nu_{*}(p,q)&p=1&p=2&p=3&p=4&p=5&p=6&p=7&p=8&p=9\\ \hline\cr q=2&3.0074&1.462&0.9669&0.7219&0.5753&0.4778&0.4082&0.3563&0.3160\\ \hline\cr q=3&2.1497&1.0269&0.6694&0.4941&0.3907&0.3227&0.2749&0.2393&0.2119\\ \hline\cr\end{array}

As we see, the necessary condition (20)—valid for any non-negative matrix—already implies that there are positive lower bounds on ν\nu, although these bounds are not optimal.

It is possible to sharpen the lower bounds in (21) by making use of some more specific results. We know in addition that the matrix MM is circulant, which leads us to the realm of structured non-negative inverse eigenvalue problems. For example, the spectra of non-negative circulant matrices have been characterized (with a necessary and sufficient condition) in [9, Theorem 10]. From this theorem we get (still for m=5m=5 and θ=1\theta=1) the lower bound

ν3.9173.\nu\geq 3.9173.

As we will see, the precise lower bound for this matrix—according to our Theorem 2 with k=2k=2 and θ=1\theta=1—is

ννR(2,1)4.4111.\nu\geq\nu_{R}(2,1)\approx 4.4111.
Remark 5.

It is not restrictive to assume a>0a>0 in (1). If we assumed a<0a<0 instead, then the results of Theorems 1 and 2 would remain valid (together with Figures 34, for example), with all the arguments in their proofs being essentially the same. For example, as we will see in Section 3, the non-negativity of (the first row of) matrix MM is governed by the elements M1,1M_{1,1} and M1,mM_{1,m} for a>0a>0 and mm odd—this would change to elements M1,1M_{1,1} and M1,2M_{1,2} for a<0a<0 and mm odd.

3 Second-order centered discretization in space and θ\theta-method in time—algebraic characterization of the entries of the full discretization matrix

The results of Section 2 are based on the eigendecomposition of the full discretization matrix M=R(νΛ)M={\cal{F}}R(\nu\Lambda){\cal{F}}^{*}. In this section, instead of using trigonometric functions, we give an algebraic description of the matrix entires by exploiting the relation M=R(νL)M=R(\nu L) in (11) with LL defined in (14). Explicitly, this means

M(m,θ,ν)=(IθνL)1(I+(1θ)νL)m×m,M(m,\theta,\nu)=(I-\theta\nu L)^{-1}(I+(1-\theta)\nu L)\in\mathbb{R}^{m\times m}, (22)

but the dependence of MM on its parameters will often be suppressed.

It is trivial that for θ=0\theta=0 we have M(m,0,ν)=I+νLM(m,0,\nu)=I+\nu L, hence M0M\geq 0 cannot hold for any ν>0\nu>0. The case m=2km=2k has been discussed in Section 2.1. Thus, throughout the rest of this section, we can assume that

m=2k+1(k+),ν>0 and 0<θ1.\boxed{m=2k+1\quad(k\in\mathbb{N}^{+}),\ \ \ \nu>0\text{\ \ and\ \ }0<\theta\leq 1.} (23)

3.1 Explicit description of the matrix entries for odd values of mm

To illustrate the structure of MM, we present its first row (as a vector, and with the common denominator of the entries in front of it) for the smallest values of mm.

Example 1.

For m=3m=3 the first row of (22) is

13θ2ν2/4+1(3θ2ν24θν22+1,θν24+ν2,θν24ν2),\frac{1}{{3\theta^{2}\nu^{2}}/{4}+1}\left(\frac{3\theta^{2}\nu^{2}}{4}-\frac{\theta\nu^{2}}{2}+1,\frac{\theta\nu^{2}}{4}+\frac{\nu}{2},\frac{\theta\nu^{2}}{4}-\frac{\nu}{2}\right),

while for m=5m=5 we have

15θ4ν4/16+5θ2ν2/4+1(5θ4ν416θ3ν44+5θ2ν24θν22+1,\frac{1}{{5\theta^{4}\nu^{4}}/{16}+{5\theta^{2}\nu^{2}}/{4}+1}\left(\frac{5\theta^{4}\nu^{4}}{16}-\frac{\theta^{3}\nu^{4}}{4}+\frac{5\theta^{2}\nu^{2}}{4}-\frac{\theta\nu^{2}}{2}+1,\right.
θ3ν416+θ2ν34+ν2,θ3ν416θ2ν38+θν24,θ3ν416+θ2ν38+θν24,θ3ν416θ2ν34ν2).\left.\frac{\theta^{3}\nu^{4}}{16}+\frac{\theta^{2}\nu^{3}}{4}+\frac{\nu}{2},\frac{\theta^{3}\nu^{4}}{16}-\frac{\theta^{2}\nu^{3}}{8}+\frac{\theta\nu^{2}}{4},\frac{\theta^{3}\nu^{4}}{16}+\frac{\theta^{2}\nu^{3}}{8}+\frac{\theta\nu^{2}}{4},\frac{\theta^{3}\nu^{4}}{16}-\frac{\theta^{2}\nu^{3}}{4}-\frac{\nu}{2}\right).

Each element of MM is a rational function in the variables θ\theta and ν\nu. From (22) it is clear that

M1,j=𝒫j,k(θ,ν)𝒟k(θ,ν)(j=1,2,,2k+1),M_{1,j}=\frac{{\cal{P}}_{j,k}(\theta,\nu)}{{\cal{D}}_{k}(\theta,\nu)}\quad\quad(j=1,2,\ldots,2k+1), (24)

where 𝒫j,k{\cal{P}}_{j,k} and 𝒟k{\cal{D}}_{k} are certain bivariate polynomials in θ\theta and ν\nu, and

𝒟k:=det(I(2k+1)×(2k+1)θνL(2k+1)×(2k+1)).{\cal{D}}_{k}:=\det\left(I_{(2k+1)\times(2k+1)}-\theta\nu L_{(2k+1)\times(2k+1)}\right). (25)
Remark 6.

The subscripts of 𝒫j,k{\cal{P}}_{j,k} thus refer to the position of the polynomial within the first row of MM, and the size of M(2k+1)×(2k+1)M\in\mathbb{R}^{(2k+1)\times(2k+1)}, respectively.

The key to describing MM algebraically is the observation that the polynomials 𝒫j,k{\cal{P}}_{j,k} and 𝒟k{\cal{D}}_{k} satisfy certain low-order linear recursions with constant coefficients. As already indicated by Section 2.1, the leftmost entry (j=1j=1) behaves differently than the rest (2j2k+12\leq j\leq 2k+1).

Remark 7.

Mathematica’s FindLinearRecurrence command proved to be an efficient tool for discovering these linear recursions.

First, let us introduce some new variables. On the one hand, as suggested by Example 1, it seems convenient to set

μ:=θ2ν2>0.\mu:=\theta^{2}\nu^{2}>0.

Then, due to the sign assumptions, μ=θν\sqrt{\mu}=\theta\nu. On the other hand, as we will soon see, the polynomial

κ2κ(1+μ2)+μ216\kappa^{2}-\kappa\left(1+\frac{\mu}{2}\right)+\frac{\mu^{2}}{16}

will appear as a (factor of a) characteristic polynomial, and its roots are

κ1,2=2+μ±2μ+14=(1+μ±12)2.\kappa_{1,2}=\frac{2+\mu\pm 2\sqrt{\mu+1}}{4}=\left(\frac{\sqrt{1+\mu}\pm 1}{2}\right)^{2}. (26)

This motivates us to introduce yet another variable, which will further simplify our exposition. We set

y:=1+μ1μ=1+θ2ν21θν(0,1).y:=\frac{\sqrt{1+\mu}-1}{\sqrt{\mu}}=\frac{\sqrt{1+\theta^{2}\nu^{2}}-1}{\theta\nu}\in(0,1). (27)

It is seen that the transformation

(0,+)μy(0,1)(0,+\infty)\ni\mu\longleftrightarrow y\in(0,1)

is a bijection. Moreover, the following (inverse) relations

μ=(2y1y2)2,\mu=\left(\frac{2y}{1-y^{2}}\right)^{2},
μy2=2+μ21+μ,\mu y^{2}=2+\mu-2\sqrt{1+\mu},
μ/y2=2+μ+21+μ,\mu/y^{2}=2+\mu+2\sqrt{1+\mu},

and

ν=2y1y21θ\nu=\frac{2y}{1-y^{2}}\cdot\frac{1}{\theta} (28)

are easily verified. We can now start describing the entries of the first row of MM.

Remark 8.

Although the expressions 𝒫j,k{\cal{P}}_{j,k} and 𝒟k{\cal{D}}_{k} will become in general rational functions in the variable yy, we still call them polynomials (referring to their structure in the original variables θ\theta and ν\nu).

\bullet The polynomials 𝒟k{\cal{D}}_{k}. By carrying out some determinant expansions, we see that the determinants (25) obey the second-order parametric recursion

𝒟k+2=(1+μ2)𝒟k+1μ216𝒟k{\cal{D}}_{k+2}=\left(1+\frac{\mu}{2}\right){\cal{D}}_{k+1}-\frac{\mu^{2}}{16}{\cal{D}}_{k} (29)

with initial conditions

𝒟1=1+3μ4,𝒟2=1+5μ4+5μ216{\cal{D}}_{1}=1+\frac{3\mu}{4},\quad{\cal{D}}_{2}=1+\frac{5\mu}{4}+\frac{5\mu^{2}}{16}

(cf. Example 1). After solving this recursion, we obtain

𝒟k=(1+μ+12)2k+1(1+μ12)2k+1,{\cal{D}}_{k}=\left(\frac{\sqrt{1+\mu}+1}{2}\right)^{2k+1}-\left(\frac{\sqrt{1+\mu}-1}{2}\right)^{2k+1},

which, in terms of the variable yy, becomes

𝒟k=1y4k+2(1y2)2k+1.{\cal{D}}_{k}=\frac{1-y^{4k+2}}{\left(1-y^{2}\right)^{2k+1}}. (30)

\bullet The polynomials 𝒫1,k{\cal{P}}_{1,k}. They satisfy the recursion

𝒫1,k+2=(1+μ2)𝒫1,k+1μ216𝒫1,k,{\cal{P}}_{1,k+2}=\left(1+\frac{\mu}{2}\right){\cal{P}}_{1,k+1}-\frac{\mu^{2}}{16}{\cal{P}}_{1,k},

that is, with coefficients being the same as in (29), but with initial conditions

𝒫1,1=1+3μ4μ/θ2,𝒫1,2=1+5μ4+5μ216μ/θ2μ2/θ4{\cal{P}}_{1,1}=1+\frac{3\mu}{4}-\frac{\mu/\theta}{2},\quad{\cal{P}}_{1,2}=1+\frac{5\mu}{4}+\frac{5\mu^{2}}{16}-\frac{\mu/\theta}{2}-\frac{\mu^{2}/\theta}{4}

(cf. Example 1). By solving this recursion, we derive that

𝒫1,k=PL,k,θ(y)(1+y2)(1y2)2k+1θ,{\cal{P}}_{1,k}=\frac{P_{L,k,\theta}(y)}{\left(1+y^{2}\right)\left(1-y^{2}\right)^{2k+1}\theta}, (31)

where the numerator is

PL,k,θ(y):=θy4k+4(θ2)y4k+2+(θ2)y2+θ.P_{L,k,\theta}(y):=-\theta y^{4k+4}-(\theta-2)y^{4k+2}+(\theta-2)y^{2}+\theta. (32)
Remark 9.

Here, the subscript LL stands for leftmost. This polynomial will play a special role in the next section.

\bullet The polynomials 𝒫2,k{\cal{P}}_{2,k}. They satisfy a third-order recursion in the variable kk,

𝒫2,k+3=(1+3μ4)𝒫2,k+2(μ4+3μ216)𝒫2,k+1+μ364𝒫2,k,{\cal{P}}_{2,k+3}=\left(1+\frac{3\mu}{4}\right){\cal{P}}_{2,k+2}-\left(\frac{\mu}{4}+\frac{3\mu^{2}}{16}\right){\cal{P}}_{2,k+1}+\frac{\mu^{3}}{64}{\cal{P}}_{2,k}, (33)

with initial conditions

𝒫2,1=(12+μ4)ν,𝒫2,2=(12+μ4+μ3/216)ν,{\cal{P}}_{2,1}=\left(\frac{1}{2}+\frac{\sqrt{\mu}}{4}\right)\nu,\quad{\cal{P}}_{2,2}=\left(\frac{1}{2}+\frac{\mu}{4}+\frac{\mu^{3/2}}{16}\right)\nu,
𝒫2,3=(12+μ2+3μ232+μ5/264)ν.{\cal{P}}_{2,3}=\left(\frac{1}{2}+\frac{\mu}{2}+\frac{3\mu^{2}}{32}+\frac{\mu^{5/2}}{64}\right)\nu.

The characteristic polynomial of recursion (33) is

κ3κ2(1+3μ4)+κ(μ4+3μ216)μ364=(κμ4)(κ2κ(1+μ2)+μ216),\kappa^{3}-\kappa^{2}\left(1+\frac{3\mu}{4}\right)+\kappa\left(\frac{\mu}{4}+\frac{3\mu^{2}}{16}\right)-\frac{\mu^{3}}{64}=\left(\kappa-\frac{\mu}{4}\right)\left(\kappa^{2}-\kappa\left(1+\frac{\mu}{2}\right)+\frac{\mu^{2}}{16}\right),

hence the characteristic roots are κ1,2\kappa_{1,2} as in (26), and κ3=μ/4\kappa_{3}={\mu}/{4}. Based on this, one easily obtains the explicit solution as

𝒫2,k=ν(1y2)12k(1+y2k1+y2k+1y4k)2(1+y2).{\cal{P}}_{2,k}=\frac{\nu\left(1-y^{2}\right)^{1-2k}\left(1+y^{2k-1}+y^{2k+1}-y^{4k}\right)}{2\left(1+y^{2}\right)}. (34)

\bullet The polynomials 𝒫3,k{\cal{P}}_{3,k}. They satisfy the same third-order recursion in the variable kk as (33),

𝒫3,k+3=(1+3μ4)𝒫3,k+2(μ4+3μ216)𝒫3,k+1+μ364𝒫3,k,{\cal{P}}_{3,k+3}=\left(1+\frac{3\mu}{4}\right){\cal{P}}_{3,k+2}-\left(\frac{\mu}{4}+\frac{3\mu^{2}}{16}\right){\cal{P}}_{3,k+1}+\frac{\mu^{3}}{64}{\cal{P}}_{3,k},

but with initial conditions

𝒫3,1=(12+μ4)ν,𝒫3,2=(μ4μ8+μ3/216)ν,{\cal{P}}_{3,1}=\left(-\frac{1}{2}+\frac{\sqrt{\mu}}{4}\right)\nu,\quad{\cal{P}}_{3,2}=\left(\frac{\sqrt{\mu}}{4}-\frac{\mu}{8}+\frac{\mu^{3/2}}{16}\right)\nu,
𝒫3,3=(μ4μ232+3μ3/216+μ5/264)ν.{\cal{P}}_{3,3}=\left(\frac{\sqrt{\mu}}{4}-\frac{\mu^{2}}{32}+\frac{3\mu^{3/2}}{16}+\frac{\mu^{5/2}}{64}\right)\nu.

The explicit solution of this recursion is

𝒫3,k=ν(1y2)12k(yy2k2+y2k+2+y4k1)2(1+y2).{\cal{P}}_{3,k}=\frac{\nu\left(1-y^{2}\right)^{1-2k}\left(y-y^{2k-2}+y^{2k+2}+y^{4k-1}\right)}{2\left(1+y^{2}\right)}. (35)
Remark 10.

We note that, for any fixed j2j\geq 2, the polynomials 𝒫j,k{\cal{P}}_{j,k} satisfy the same third-order recursion (33) in the variable kk, with triplets of initial conditions depending on jj. However, we cannot use this approach to proceed, since setting up the initial conditions would require, among others, the knowledge of the polynomials 𝒫j,1{\cal{P}}_{j,1} (for j=2,3j=2,3), 𝒫j,2{\cal{P}}_{j,2} (for j=4,5j=4,5), 𝒫j,3{\cal{P}}_{j,3} (for j=6,7j=6,7), and so on.

\bullet The polynomials 𝒫j,k{\cal{P}}_{j,k} (4j2k+14\leq j\leq 2k+1, k2k\geq 2). They satisfy the following second-order recursion in the variable jj when kk is fixed (hence having only finitely many terms for a particular kk):

𝒫j+2,k=2μ𝒫j+1,k+𝒫j,k.{\cal{P}}_{j+2,k}=-\frac{2}{\sqrt{\mu}}{\cal{P}}_{j+1,k}+{\cal{P}}_{j,k}.

For the initial conditions of this final recursion, we use the general forms of 𝒫2,k{\cal{P}}_{2,k} and 𝒫3,k{\cal{P}}_{3,k} in (34) and (35) to get for any k1k\geq 1 and 2j2k+12\leq j\leq 2k+1 that

𝒫j,k=ν(1y2)12k2(1+y2)Pj,k(y),{\cal{P}}_{j,k}=\frac{\nu\left(1-y^{2}\right)^{1-2k}}{2\left(1+y^{2}\right)}P_{j,k}(y), (36)

where the polynomials Pj,kP_{j,k} are defined as

Pj,k(y)(1)j1y4k+2j+y2k1+j+(1)jy2k+1j+yj2.P_{j,k}(y)\coloneqq(-1)^{j-1}y^{4k+2-j}+y^{2k-1+j}+(-1)^{j}y^{2k+1-j}+y^{j-2}. (37)

As a special case, we set

PR,k(y):=P2k+1,k(y),P_{R,k}(y):=P_{2k+1,k}(y),

in other words we have

PR,k(y)=y4k+y2k+1+y2k11,P_{R,k}(y)=y^{4k}+y^{2k+1}+y^{2k-1}-1, (38)

where the subscript RR stands for rightmost.

Remark 11.

As a by-product, we have obtained the following set of identities by comparing the trigonometric and algebraic representations presented so far. They are also interesting from a structural point of view: although the number of terms in the trigonometric sums increases as kk gets larger, the polynomials in yy are sparse polynomials (also known as lacunary polynomials or fewnomials)—the number of terms does not increase as the polynomial degree increases.

Corollary 1.

With MM defined in (22), θ>0\theta>0, ν>0\nu>0, k+k\in\mathbb{N}^{+}, y=1+θ2ν21θνy=\frac{\sqrt{1+\theta^{2}\nu^{2}}-1}{\theta\nu}, and ξ=2π(1)2k+1\xi_{\ell}=\frac{2\pi(\ell-1)}{2k+1}, we have that

12k+1=12k+11+ı(1θ)νsin(ξ)1ıθνsin(ξ)=\frac{1}{2k+1}\sum_{\ell=1}^{2k+1}\frac{1+\imath(1-\theta)\nu\sin(\xi_{\ell})}{1-\imath\theta\nu\sin(\xi_{\ell})}=
M1,1=𝒫1,k𝒟k=M_{1,1}=\frac{{\cal{P}}_{1,k}}{{\cal{D}}_{k}}=
θy4k+4(θ2)y4k+2+(θ2)y2+θ(1+y2)(1y4k+2)θ.\frac{-\theta y^{4k+4}-(\theta-2)y^{4k+2}+(\theta-2)y^{2}+\theta}{\left(1+y^{2}\right)\left(1-y^{4k+2}\right)\theta}.

Moreover, for j=2,3,,2k+1j=2,3,\ldots,2k+1 we have that

12k+1=12k+11+ı(1θ)νsin(ξ)1ıθνsin(ξ)exp(ı(j1)ξ)=\frac{1}{2k+1}\sum_{\ell=1}^{2k+1}\frac{1+\imath(1-\theta)\nu\sin(\xi_{\ell})}{1-\imath\theta\nu\sin(\xi_{\ell})}\exp\left(-\imath(j-1)\xi_{\ell}\right)=
M1,j=𝒫j,k𝒟k=M_{1,j}=\frac{{\cal{P}}_{j,k}}{{\cal{D}}_{k}}=
ν(1y2)22(1+y2)(1y4k+2)((1)j1y4k+2j+y2k1+j+(1)jy2k+1j+yj2).\frac{\nu\left(1-y^{2}\right)^{2}}{2\left(1+y^{2}\right)\left(1-y^{4k+2}\right)}\Big{(}(-1)^{j-1}y^{4k+2-j}+y^{2k-1+j}+(-1)^{j}y^{2k+1-j}+y^{j-2}\Big{)}.

In particular,

=12k+1(1ıθνsin(ξ))=𝒟k=\prod_{\ell=1}^{2k+1}(1-\imath\theta\nu\sin(\xi_{\ell}))={\cal{D}}_{k}=
(1+θ2ν2+12)2k+1(1+θ2ν212)2k+1=1y4k+2(1y2)2k+1.\left(\frac{\sqrt{1+\theta^{2}\nu^{2}}+1}{2}\right)^{2k+1}-\left(\frac{\sqrt{1+\theta^{2}\nu^{2}}-1}{2}\right)^{2k+1}=\frac{1-y^{4k+2}}{\left(1-y^{2}\right)^{2k+1}}.

3.2 Non-negativity of the matrix entries for odd values of mm

In this section we present a detailed description of the non-negativity properties of the matrix MM, thanks to the explicit forms for the entries M1,jM_{1,j} obtained in Section 3.1. Throughout this section we still assume (23).

By taking into account (24), (30), (31), (32), (36), (37), and the fact that now y(0,1)y\in(0,1) (see (27)), the following corollary is evident.

Corollary 2.

For a given pair (θ,ν)(\theta,\nu)

M1,1(2k+1,θ,ν)0PL,k,θ(y)0 (see (32)),M_{1,1}(2k+1,\theta,\nu)\geq 0\quad\Longleftrightarrow\quad P_{L,k,\theta}(y)\geq 0\quad\text{ (see }\eqref{poldef}\text{)},

and for any 2j2k+12\leq j\leq 2k+1

M1,j(2k+1,θ,ν)0Pj,k(y)0 (see (37)).M_{1,j}(2k+1,\theta,\nu)\geq 0\quad\Longleftrightarrow\quad P_{j,k}(y)\geq 0\quad\text{ (see }\eqref{Pjky}\text{)}.
Refer to caption
Figure 1: The typical behavior of the polynomials Pj,kP_{j,k} appearing in Corollary 2 for 2j2k+12\leq j\leq 2k+1 and kk fixed: curves in shades of gray (or black) correspond to even jj, while curves in shades of red (or orange) correspond to odd jj indices. Based on this figure, one can make the following observations. On the one hand, for each fixed and even jj, Pj,kP_{j,k} is strictly increasing in yy; however, for any fixed y(0,1)y\in(0,1), Pj,kP_{j,k} is in general not monotone in its even index jj. On the other hand, for each fixed and odd jj, Pj,kP_{j,k} is in general not monotone in yy; however, for any fixed y(0,1)y\in(0,1), Pj,kP_{j,k} is strictly decreasing in its odd index jj.

The following lemma proves some of the observations about the polynomials Pj,kP_{j,k} suggested by Figure 1 for even and odd indices 2j2k+12\leq j\leq 2k+1.

Lemma 1.

Let us fix y(0,1)y\in(0,1) arbitrarily. Then
\bullet for any 1k1\leq\ell\leq k, P2,k(y)>0P_{2\ell,k}(y)>0;
\bullet for any 2k2\leq\ell\leq k, P2+1,k(y)<P21,k(y)P_{2\ell+1,k}(y)<P_{2\ell-1,k}(y).

Proof.

For the even indices, we have

P2,k(y)=y2k2l+1(1y2k+1)+y2k+2l1+y2l2>0,P_{2\ell,k}(y)=y^{2k-2l+1}(1-y^{2k+1})+y^{2k+2l-1}+y^{2l-2}>0,

while for the odd indices,

P2+1,k(y)P21,k(y)=(1y2)(y2k+2l2+y2l3+y2k2l(1y2k+1))<0.P_{2\ell+1,k}(y)-P_{2\ell-1,k}(y)=-(1-y^{2})\Big{(}y^{2k+2l-2}+y^{2l-3}+y^{2k-2l}(1-y^{2k+1})\Big{)}<0.

By combining Corollary 2 and Lemma 1, we have obtained the following result, expressing the fact that the non-negativity of M(2k+1,θ,ν)M(2k+1,\theta,\nu) is determined only by the polynomials appearing in the numerators of its top left and top right entries.

Corollary 3.

For a given pair (θ,ν)(\theta,\nu)

M1,1(2k+1,θ,ν)0PL,k,θ(y)0(see (32)),M_{1,1}(2k+1,\theta,\nu)\geq 0\quad\Longleftrightarrow\quad P_{L,k,\theta}(y)\geq 0\quad\text{(see }\eqref{poldef}\text{)},

and

M1,j(2k+1,θ,ν)0 for each 2j2k+1PR,k(y)0(see (38)).M_{1,j}(2k+1,\theta,\nu)\geq 0\text{ for each }2\leq j\leq 2k+1\quad\Longleftrightarrow\quad P_{R,k}(y)\geq 0\quad\text{(see }\eqref{pordef}\text{)}.

The non-negativity of M(2k+1,θ,ν)M(2k+1,\theta,\nu) has therefore been reduced to studying the simultaneous non-negativity of two parametric polynomials, PL,k,θP_{L,k,\theta} and PR,kP_{R,k}, over the yy-interval (0,1)(0,1). The content of Lemmas 2 and 3 is illustrated by Figure 2.

Lemma 2 (about the sign of PR,k(y)P_{R,k}(y)).

Let us fix kk arbitrarily, and recall that by definition PR,k(y)=y4k+y2k+1+y2k11P_{R,k}(y)=y^{4k}+y^{2k+1}+y^{2k-1}-1. Then there is a unique y(0,1)y\in(0,1) such that PR,k(y)=0P_{R,k}(y)=0.
Let

yR(k) denote this root.y_{R}(k)\text{ denote this root.} (39)

Then PR,k(y)<0P_{R,k}(y)<0 for y(0,yR(k))y\in(0,y_{R}(k)), and PR,k(y)>0P_{R,k}(y)>0 for y(yR(k),1)y\in(y_{R}(k),1).
Moreover, yR(k)<yR(k+1)y_{R}(k)<y_{R}(k+1), limk+yR(k)=1\lim_{k\to+\infty}y_{R}(k)=1, and

(21)12k1<yR(k)<(21)12k+1.\left(\sqrt{2}-1\right)^{\frac{1}{2k-1}}<y_{R}(k)<\left(\sqrt{2}-1\right)^{\frac{1}{2k+1}}. (40)
Proof.

For fixed kk, the continuous function yPR,k(y)=y4k+y2k+1+y2k11y\mapsto P_{R,k}(y)=y^{4k}+y^{2k+1}+y^{2k-1}-1 is strictly increasing, PR,k(0)<0P_{R,k}(0)<0 and PR,k(1)>0P_{R,k}(1)>0, hence there is a unique root. This root is strictly increasing in kk, because the function kPR,k(y)k\mapsto P_{R,k}(y) is strictly decreasing for fixed y(0,1)y\in(0,1). Finally notice that y4k+y2k+1+y2k11=0y^{4k}+y^{2k+1}+y^{2k-1}-1=0 is equivalent to (y2k1+1)(y2k+1+1)=2\left(y^{2k-1}+1\right)\left(y^{2k+1}+1\right)=2, and for any y(0,1)y\in(0,1) one has

(y2k+1+1)2<(y2k1+1)(y2k+1+1)<(y2k1+1)2.\left(y^{2k+1}+1\right)^{2}<\left(y^{2k-1}+1\right)\left(y^{2k+1}+1\right)<\left(y^{2k-1}+1\right)^{2}.

From this we easily get (40) and also the limit of yR(k)y_{R}(k) (k+k\to+\infty). ∎

Remark 12.

The asymptotic series (as k+k\to+\infty) of both bounds in (40) has the form

1+ln(21)2k+𝒪(1k2)10.44069k+𝒪(1k2).1+\frac{\ln\left(\sqrt{2}-1\right)}{2k}+{\cal{O}}\left(\frac{1}{k^{2}}\right)\approx 1-\frac{0.44069}{k}+{\cal{O}}\left(\frac{1}{k^{2}}\right).
Lemma 3 (about the sign of PL,k,θ(y)P_{L,k,\theta}(y)).

Let us fix kk arbitrarily, and recall that by definition PL,k,θ(y)=θy4k+4(θ2)y4k+2+(θ2)y2+θP_{L,k,\theta}(y)=-\theta y^{4k+4}-(\theta-2)y^{4k+2}+(\theta-2)y^{2}+\theta.
(i) Suppose that 2k2k+1θ1\frac{2k}{2k+1}\leq\theta\leq 1. Then, for any y(0,1)y\in(0,1), PL,k,θ(y)>0P_{L,k,\theta}(y)>0.
(ii) Suppose now that 0<θ<2k2k+10<\theta<\frac{2k}{2k+1}. Then there is a unique y(0,1)y\in(0,1) such that PL,k,θ(y)=0P_{L,k,\theta}(y)=0.
Let

yL(k,θ) denote this root.y_{L}(k,\theta)\text{ denote this root.} (41)

Then PL,k,θ(y)>0P_{L,k,\theta}(y)>0 for y(0,yL(k,θ))y\in(0,y_{L}(k,\theta)), and PL,k,θ(y)<0P_{L,k,\theta}(y)<0 for y(yL(k,θ),1)y\in(y_{L}(k,\theta),1).
Moreover, on the one hand, for fixed 0<θ<2k2k+10<\theta<\frac{2k}{2k+1}, the function kyL(k,θ)k\mapsto y_{L}(k,\theta) is strictly decreasing, and limk+yL(k,θ)=θ2θ(0,1)\lim_{k\to+\infty}y_{L}(k,\theta)=\sqrt{\frac{\theta}{2-\theta}}\in(0,1).
On the other hand, for fixed k+k\in\mathbb{N}^{+}, the function (0,2k2k+1)θyL(k,θ)\left(0,\frac{2k}{2k+1}\right)\ni\theta\mapsto y_{L}(k,\theta) is strictly increasing, and we have the one-sided limits limθ0+0yL(k,θ)=0\lim_{\theta\to 0+0}y_{L}(k,\theta)=0 and limθ2k2k+10yL(k,θ)=1\lim_{\theta\to\frac{2k}{2k+1}-0}y_{L}(k,\theta)=1.

Proof.

We notice that the expression PL,k,θ(y)P_{L,k,\theta}(y) is linear in θ\theta, so by setting

Θ(y,k)2y2(1y4k)(1+y2)(1y4k+2),\Theta(y,k)\coloneqq\frac{2y^{2}\left(1-y^{4k}\right)}{\left(1+y^{2}\right)\left(1-y^{4k+2}\right)},

we easily get for any y(0,1)y\in(0,1) that

PL,k,θ(y)0θΘ(y,k),P_{L,k,\theta}(y)\lesseqqgtr 0\quad\Longleftrightarrow\quad\theta\lesseqqgtr\Theta(y,k), (42)

where the symbol \lesseqqgtr denotes either <<, or ==, or >> on both sides of the equivalence. It is seen that for fixed kk we have the one-sided limits

limy0+0Θ(y,k)=0andlimy10Θ(y,k)=2k2k+1.\lim_{y\to 0+0}\Theta(y,k)=0\quad\text{and}\quad\lim_{y\to 1-0}\Theta(y,k)=\frac{2k}{2k+1}. (43)

Now we show that the function

(0,1)yΘ(y,k)is strictly increasing.(0,1)\ni y\mapsto\Theta(y,k)\quad\text{is strictly increasing.} (44)

The partial derivative

yΘ(y,k)=4y(1(2k+1)y4k+(2k+1)y4k+4y8k+4)(1+y2)2(1y4k+2)2\partial_{y}\Theta(y,k)=\frac{4y\left(1-(2k+1)y^{4k}+(2k+1)y^{4k+4}-y^{8k+4}\right)}{\left(1+y^{2}\right)^{2}\left(1-y^{4k+2}\right)^{2}}

is positive, if P~(y,k):=1(2k+1)y4k+(2k+1)y4k+4y8k+4>0\widetilde{P}(y,k):=1-(2k+1)y^{4k}+(2k+1)y^{4k+4}-y^{8k+4}>0. But

P~(0,k)=1andP~(1,k)=0,\widetilde{P}(0,k)=1\quad\text{and}\quad\widetilde{P}(1,k)=0,

so the positivity of P~(y,k)\widetilde{P}(y,k) will follow if we show that yP~(y,k)y\mapsto\widetilde{P}(y,k) is strictly decreasing. Indeed,

yP~(y,k)=4(2k+1)y4k1Q~(y,k),\partial_{y}\widetilde{P}(y,k)=-4(2k+1)y^{4k-1}\widetilde{Q}(y,k),

where

Q~(y,k):=y4k+4(k+1)y4+k,\widetilde{Q}(y,k):=y^{4k+4}-(k+1)y^{4}+k,

hence it is enough to verify Q~(y,k)>0\widetilde{Q}(y,k)>0. And this is true, since Q~(0,k)=k\widetilde{Q}(0,k)=k, Q~(1,k)=0\widetilde{Q}(1,k)=0 and

yQ~(y,k)=4(k+1)y3(1y4k)<0.\partial_{y}\widetilde{Q}(y,k)=-4(k+1)y^{3}\left(1-y^{4k}\right)<0.

Now, as (44) has been checked, it is obvious that continuity, (42), (43) and (44) imply statement (i) of the lemma, and, at the same time, regarding statement (ii) of the lemma, the existence of a unique root yL(k,θ)(0,1)y_{L}(k,\theta)\in(0,1), the positivity of PL,k,θP_{L,k,\theta} on (0,yL(k,θ))(0,y_{L}(k,\theta)), and the negativity of PL,k,θP_{L,k,\theta} on (yL(k,θ),1)(y_{L}(k,\theta),1).

We finally discuss the monotonicity and limit properties of the root yL(k,θ)y_{L}(k,\theta). For fixed y(0,1)y\in(0,1), the function kΘ(y,k)k\mapsto\Theta(y,k) is strictly increasing, since

Θ(y,k+1)Θ(y,k)=2(1y2)2y4k+2(1y4k+2)(1y4k+6)>0.\Theta(y,k+1)-\Theta(y,k)=\frac{2\left(1-y^{2}\right)^{2}y^{4k+2}}{\left(1-y^{4k+2}\right)\left(1-y^{4k+6}\right)}>0.

This implies that, for any fixed θ(0,2k2k+1)\theta\in\left(0,\frac{2k}{2k+1}\right), the function kyL(k,θ)k\mapsto y_{L}(k,\theta) is strictly decreasing. Moreover, for fixed y(0,1)y\in(0,1), we see from the definition that limk+Θ(y,k)=2y21+y2\lim_{k\to+\infty}\Theta(y,k)=\frac{2y^{2}}{1+y^{2}}, so, due to (42) with “equality”, one has for fixed θ(0,2k2k+1)\theta\in\left(0,\frac{2k}{2k+1}\right) that y(θ)limk+yL(k,θ)y_{\infty}(\theta)\coloneqq\lim_{k\to+\infty}y_{L}(k,\theta) solves θ=2y(θ)21+y(θ)2\theta=\frac{2y_{\infty}(\theta)^{2}}{1+y_{\infty}(\theta)^{2}}; in other words, y(θ)=θ2θ(0,1)y_{\infty}(\theta)=\sqrt{\frac{\theta}{2-\theta}}\in(0,1). To show the validity of the last sentence of the lemma, we fix k+k\in\mathbb{N}^{+}, and simply take into account again (42) with “equality”, (43) and (44). ∎

Refer to caption
Figure 2: The two solid curves show the functions yPR,k(y)y\mapsto P_{R,k}(y) for some k=k0k=k_{0} (solid black) and k=k1k=k_{1} (solid red) with k0<k1k_{0}<k_{1}. The dashed black curves show the functions yPL,k,θ(y)y\mapsto P_{L,k,\theta}(y) for k=k0k=k_{0} and for various values of θ(0,1]\theta\in(0,1]. Finally, the dotted red curves show the functions yPL,k,θ(y)y\mapsto P_{L,k,\theta}(y) for k=k1k=k_{1} and for the same values of θ(0,1]\theta\in(0,1].

In order to return to the original variables (θ,ν)(\theta,\nu) from the variable yy—based on (28), (39) and (41)—we define

νR(k,θ):=2yR(k)1yR(k)21θ,\nu_{R}(k,\theta):=\frac{2y_{R}(k)}{1-y_{R}(k)^{2}}\cdot\frac{1}{\theta}, (45)

and similarly,

νL(k,θ):={2yL(k,θ)1yL(k,θ)21θfor 0<θ<2k2k+1+for 2k2k+1θ1.\nu_{L}(k,\theta):=\begin{cases}\frac{2y_{L}(k,\theta)}{1-y_{L}(k,\theta)^{2}}\cdot\frac{1}{\theta}&\text{for }0<\theta<\frac{2k}{2k+1}\\ +\infty&\text{for }\frac{2k}{2k+1}\leq\theta\leq 1.\end{cases} (46)

The value ++\infty is introduced here for convenience so as to make our descriptions shorter.

A reformulation of Corollary 3 in terms of the variables (θ,ν)(\theta,\nu) is given below.

Corollary 4.

For any k+k\in\mathbb{N}^{+} and θ(0,1]\theta\in(0,1] we have

M1,1(2k+1,θ,ν)0ννL(k,θ),M_{1,1}(2k+1,\theta,\nu)\geq 0\quad\Longleftrightarrow\quad\nu\leq\nu_{L}(k,\theta),

and

M1,j(2k+1,θ,ν)0 for each 2j2k+1ννR(k,θ).M_{1,j}(2k+1,\theta,\nu)\geq 0\text{ for each }2\leq j\leq 2k+1\quad\Longleftrightarrow\quad\nu\geq\nu_{R}(k,\theta).

In particular,

M1,j(2k+1,θ,ν)0 for each 1j2k+1νR(k,θ)ννL(k,θ).M_{1,j}(2k+1,\theta,\nu)\geq 0\text{ for each }1\leq j\leq 2k+1\quad\Longleftrightarrow\quad\nu_{R}(k,\theta)\leq\nu\leq\nu_{L}(k,\theta).
Proof.

By taking into account Corollary 3, Lemmas 2 and 3, and the fact that the map in (28)

(0,1)y2y1y2(0,+) is a strictly increasing bijection,(0,1)\ni y\mapsto\frac{2y}{1-y^{2}}\in(0,+\infty)\text{ is a strictly increasing bijection,} (47)

we get for fixed kk and θ\theta that PR,k(y)0P_{R,k}(y)\geq 0 is equivalent to ννR(k,θ)\nu\geq\nu_{R}(k,\theta), and PL,k,θ(y)0P_{L,k,\theta}(y)\geq 0 is equivalent to ννL(k,θ)\nu\leq\nu_{L}(k,\theta). In particular, due to the definition of νL(k,θ)\nu_{L}(k,\theta) in (46), this last inequality means that there is no upper bound on ν\nu for 2k2k+1θ1\frac{2k}{2k+1}\leq\theta\leq 1. ∎

Some growth rates, monotonicity and limit properties of νR(k,θ)\nu_{R}(k,\theta) and νL(k,θ)\nu_{L}(k,\theta)—defined in (45)–(46)—are collected below; see also Figures 3 and 4.

Corollary 5.

(i) For any k+k\in\mathbb{N}^{+} and θ(0,1]\theta\in(0,1], we have νR(k,θ)<νR(k+1,θ)\nu_{R}(k,\theta)<\nu_{R}(k+1,\theta), and

2(2+1)12k1(2+1)22k111θ<νR(k,θ)<2(21)12k+11(21)22k+11θ.\frac{2\left(\sqrt{2}+1\right)^{\frac{1}{2k-1}}}{\left(\sqrt{2}+1\right)^{\frac{2}{2k-1}}-1}\cdot\frac{1}{\theta}<\nu_{R}(k,\theta)<\frac{2\left(\sqrt{2}-1\right)^{\frac{1}{2k+1}}}{1-\left(\sqrt{2}-1\right)^{\frac{2}{2k+1}}}\cdot\frac{1}{\theta}. (48)

The asymptotic series for these lower and upper bounds have the form

(2ln(2+1)k1ln(2+1)+𝒪(1k))1θ,\left(\frac{2}{\ln\left(\sqrt{2}+1\right)}k\mp\frac{1}{\ln\left(\sqrt{2}+1\right)}+{\cal{O}}\left(\frac{1}{k}\right)\right)\cdot\frac{1}{\theta},

being approximately (2.26919k1.13459+𝒪(1k))1θ.\left(2.26919k\mp 1.13459+{\cal{O}}\left(\frac{1}{k}\right)\right)\cdot\frac{1}{\theta}. In particular, limk+νR(k,θ)=+\lim_{k\to+\infty}\nu_{R}(k,\theta)=+\infty.
(ii) For fixed 0<θ<2k2k+10<\theta<\frac{2k}{2k+1}, νL(k,θ)>νL(k+1,θ)\nu_{L}(k,\theta)>\nu_{L}(k+1,\theta) (and νL(k,θ)=+\nu_{L}(k,\theta)=+\infty for 2k2k+1θ1\frac{2k}{2k+1}\leq\theta\leq 1). Finally, for fixed θ(0,1)\theta\in(0,1), we have the limit

limk+νL(k,θ)=11θ2θθ,\lim_{k\to+\infty}\nu_{L}(k,\theta)=\frac{1}{1-\theta}\sqrt{\frac{2-\theta}{\theta}}, (49)

and, for fixed k+k\in\mathbb{N}^{+}, the one-sided limits

limθ0+0νL(k,θ)=+=limθ2k2k+10νL(k,θ).\lim_{\theta\to 0+0}\nu_{L}(k,\theta)=+\infty=\lim_{\theta\to\frac{2k}{2k+1}-0}\nu_{L}(k,\theta). (50)
Proof.

(i) The monotonicity of νR(k,θ)\nu_{R}(k,\theta) in kk for fixed θ\theta follows from the monotonicity of yR(k)y_{R}(k) in Lemma 2 together with (47), and inequality (48) is just (40) under the transformation (47).
(ii) We similarly obtain the monotonicity of νL(k,θ)\nu_{L}(k,\theta) in kk for fixed θ\theta, and the limit (49) from Lemma 3 via (47), by also noting that

2θ2θ1(θ2θ)21θ=11θ2θθ.\frac{2\sqrt{\frac{\theta}{2-\theta}}}{1-\left(\sqrt{\frac{\theta}{2-\theta}}\right)^{2}}\cdot\frac{1}{\theta}=\frac{1}{1-\theta}\sqrt{\frac{2-\theta}{\theta}}.

As for the θ2k2k+10\theta\to\frac{2k}{2k+1}-0 limit in (50), we know from Lemma 3 that yL(k,θ)1y_{L}(k,\theta)\to 1 (from below), and limy102y1y2=+\lim_{y\to 1-0}\frac{2y}{1-y^{2}}=+\infty, hence νL(k,θ)+\nu_{L}(k,\theta)\to+\infty when θ2k2k+10\theta\to\frac{2k}{2k+1}-0.

One needs to take care only when evaluating the θ0+0\theta\to 0+0 limit in (50) for fixed k+k\in\mathbb{N}^{+}, since 2yL(k,θ)1yL(k,θ)20\frac{2y_{L}(k,\theta)}{1-y_{L}(k,\theta)^{2}}\to 0 and 1θ+\frac{1}{\theta}\to+\infty in (46) when θ0+0\theta\to 0+0. But the monotonicity of νL(k,θ)\nu_{L}(k,\theta) in kk for fixed θ\theta, and (49) imply for any kk and θ(0,1)\theta\in(0,1) that

νL(k,θ)11θ2θθ,\nu_{L}(k,\theta)\geq\frac{1}{1-\theta}\sqrt{\frac{2-\theta}{\theta}},

and the right-hand side here tends to ++\infty as θ0+0\theta\to 0+0. ∎

Refer to caption
Figure 3: The parameter regions in the (θ,ν)(\theta,\nu) parameter plane ensuring M(2k+1,θ,ν)0M(2k+1,\theta,\nu)\geq 0 for k=1,2,,6k=1,2,\ldots,6 (different values of kk are represented by different colors). The regions continue to extend to infinity “upward”, but “shrink” in the horizontal direction as kk is increased.
Refer to caption
Figure 4: A typical shaded region in Figure 3 for which M(2k+1,θ,ν)0M(2k+1,\theta,\nu)\geq 0; in this particular case, for k=1k=1. The gray region is described by the inequalities νR(k,θ)ννL(k,θ)\nu_{R}(k,\theta)\leq\nu\leq\nu_{L}(k,\theta). The black dotted curve represents the function θνR(k,θ)\theta\mapsto\nu_{R}(k,\theta), while the red dashed curve is the function θνL(k,θ)\theta\mapsto\nu_{L}(k,\theta), having a vertical asymptote at θ=2k/(2k+1)\theta=2k/(2k+1).

The following result explains why the “left half” of Figure 3 is “empty” (cf. Corollary 4)—the result is non-trivial, since for fixed kk, limθ0+0νL(k,θ)=+=limθ0+0νR(k,θ)\lim_{\theta\to 0+0}\nu_{L}(k,\theta)=+\infty=\lim_{\theta\to 0+0}\nu_{R}(k,\theta) (cf. Figure 4).

Lemma 4.

For any k+k\in\mathbb{N}^{+} there is a unique θk[12,2k2k+1)\theta_{k}\in\left[\frac{1}{2},\frac{2k}{2k+1}\right) such that

νR(k,θ)=νL(k,θ)(for θ(0,1])θ=θk.\nu_{R}(k,\theta)=\nu_{L}(k,\theta)\ \ \ (\text{for }\theta\in(0,1])\quad\Longleftrightarrow\quad\theta=\theta_{k}.

This θk\theta_{k} also satisfies

νR(k,θ)<νL(k,θ)θ>θk.\nu_{R}(k,\theta)<\nu_{L}(k,\theta)\quad\Longleftrightarrow\quad\theta>\theta_{k}.

Moreover, the sequence θk\theta_{k} is strictly increasing in kk, and θ1=12\theta_{1}=\frac{1}{2}. In particular, for any k+k\in\mathbb{N}^{+} and θ(0,12)\theta\in\left(0,\frac{1}{2}\right) we have

νR(k,θ)>νL(k,θ).\nu_{R}(k,\theta)>\nu_{L}(k,\theta).
Proof.

Let us fix kk. Due to (45)–(46), νR(k,θ)\nu_{R}(k,\theta) is finite but νL(k,θ)\nu_{L}(k,\theta) is infinite for any θ[2k2k+1,1]\theta\in\left[\frac{2k}{2k+1},1\right], so νR(k,θ)=νL(k,θ)\nu_{R}(k,\theta)=\nu_{L}(k,\theta) cannot hold. For θ(0,2k2k+1)\theta\in\left(0,\frac{2k}{2k+1}\right), by also using (47), we have

νR(k,θ)=νL(k,θ)2yR(k)1yR(k)2=2yL(k,θ)1yL(k,θ)2yR(k)=yL(k,θ).\nu_{R}(k,\theta)=\nu_{L}(k,\theta)\quad\Longleftrightarrow\quad\frac{2y_{R}(k)}{1-y_{R}(k)^{2}}=\frac{2y_{L}(k,\theta)}{1-y_{L}(k,\theta)^{2}}\quad\Longleftrightarrow\quad y_{R}(k)=y_{L}(k,\theta).

Here yR(k)y_{R}(k) is independent of θ\theta, and kk is fixed, so by definitions (39) and (41) this means that θ\theta must be chosen in a way such that PL,k,θ(yR(k))=0P_{L,k,\theta}(y_{R}(k))=0. By using the notation introduced in the proof of Lemma 3, this is equivalent to θ=Θ(yR(k),k)=:θk\theta=\Theta(y_{R}(k),k)=:\theta_{k}. Hence, if νR(k,θ)=νL(k,θ)\nu_{R}(k,\theta)=\nu_{L}(k,\theta) holds for some θ(0,1]\theta\in(0,1], then θ=θk(0,2k2k+1)\theta=\theta_{k}\in\left(0,\frac{2k}{2k+1}\right). Now, by Lemma 2, we have yR(k)<yR(k+1)y_{R}(k)<y_{R}(k+1), and Θ\Theta is strictly increasing in its first argument (see (44)), so the sequence θk\theta_{k} is also strictly increasing.

The same monotonicity argument shows that νR(k,θ)<νL(k,θ)\nu_{R}(k,\theta)<\nu_{L}(k,\theta) holds for some θ(0,1]\theta\in(0,1] if and only if θ>θk\theta>\theta_{k} (see (42) and the characterization of PL,k,θ>0P_{L,k,\theta}>0 in Lemma 3).

For k=1k=1, one explicitly computes that

νR(1,θ)=2θ and νL(1,θ)=2θ(23θ),\nu_{R}(1,\theta)=\frac{2}{\theta}\quad\text{ and }\quad\nu_{L}(1,\theta)=\frac{2}{\sqrt{\theta(2-3\theta)}}, (51)

so νR(1,θ)=νL(1,θ)θ=θ1=1/2\nu_{R}(1,\theta)=\nu_{L}(1,\theta)\Longleftrightarrow\theta=\theta_{1}=1/2. Therefore, we have θk(12,2k2k+1)\theta_{k}\in\left(\frac{1}{2},\frac{2k}{2k+1}\right) for k2k\geq 2, also implying that for any k+k\in\mathbb{N}^{+} and θ(0,12)\theta\in\left(0,\frac{1}{2}\right), we have νR(k,θ)>νL(k,θ)\nu_{R}(k,\theta)>\nu_{L}(k,\theta). ∎

The following theorem summarizes the results of Section 3. In the theorem, we assume k+k\in\mathbb{N}^{+}, θ[0,1]\theta\in[0,1] and ν(0,+)\nu\in(0,+\infty).

Theorem 2 (About the full discretization matrix corresponding to the 2nd2^{\text{nd}}-order centered discretization in space and θ\theta-method in time).
  • \bullet

    Fix 0θ<1/20\leq\theta<1/2 arbitrarily. Then M(2k+1,θ,ν)0M(2k+1,\theta,\nu)\geq 0 can never hold, i.e. for any k+k\in\mathbb{N}^{+} and ν>0\nu>0 there is at least one strictly negative entry of the matrix MM.

  • \bullet

    Let θ=1/2\theta=1/2. Then

    M(2k+1,12,ν)0k=1 and ν=4(see (52)).M\left(2k+1,\frac{1}{2},\nu\right)\geq 0\quad\Longleftrightarrow\quad k=1\text{ and }\nu=4\quad\text{(see }\eqref{rem10thetahalfmatrix}\text{)}.
  • \bullet

    Fix 1/2<θ<11/2<\theta<1 arbitrarily. Then there are finitely many values of kk for which there exists ν>0\nu>0 with M(2k+1,θ,ν)0M(2k+1,\theta,\nu)\geq 0. For any such value of kk, the set of admissible values of ν\nu has the form νR(k,θ)ννL(k,θ)\nu_{R}(k,\theta)\leq\nu\leq\nu_{L}(k,\theta), with suitable constants 0<νR(k,θ)νL(k,θ)+0<\nu_{R}(k,\theta)\leq\nu_{L}(k,\theta)\leq+\infty (the possible case νL(k,θ)=+\nu_{L}(k,\theta)=+\infty means that there is no upper but only a lower bound on ν\nu); see also Corollary 5.

  • \bullet

    Let θ=1\theta=1. Then for each k+k\in\mathbb{N}^{+} there is a constant νR(k,1)>0\nu_{R}(k,1)>0 such that

    M(2k+1,1,ν)0ννR(k,1).M(2k+1,1,\nu)\geq 0\quad\Longleftrightarrow\quad\nu\geq\nu_{R}(k,1).

    In addition, νR(k,1)<νR(k+1,1)\nu_{R}(k,1)<\nu_{R}(k+1,1) for any kk, limk+νR(k,1)=+\lim_{k\to+\infty}\nu_{R}(k,1)=+\infty, and the two-sided estimates in (48) with θ=1\theta=1 hold.

Proof.

The case θ=0\theta=0 has already been discussed at the beginning of Section 3. In general, for θ(0,1]\theta\in\left(0,1\right], we know from Corollary 4 that, for any kk, the set of ν\nu values for which M(2k+1,θ,ν)0M(2k+1,\theta,\nu)\geq 0 holds has the form νR(k,θ)ννL(k,θ)\nu_{R}(k,\theta)\leq\nu\leq\nu_{L}(k,\theta).

The range θ(0,12)\theta\in\left(0,\frac{1}{2}\right) is covered by Lemma 4.

For θ=1/2\theta=1/2, (51) shows that for k=1k=1 one has νR(1,1/2)=νL(1,1/2)=4\nu_{R}(1,1/2)=\nu_{L}(1,1/2)=4. But for any k2k\geq 2 we know (see Corollary 5) that

νL(k,1/2)<νL(1,1/2)=νR(1,1/2)<νR(k,1/2),\nu_{L}(k,1/2)<\nu_{L}(1,1/2)=\nu_{R}(1,1/2)<\nu_{R}(k,1/2),

hence νR(k,1/2)νL(k,1/2)\nu_{R}(k,1/2)\leq\nu_{L}(k,1/2) cannot hold for any k2k\geq 2.

For fixed 1/2<θ<11/2<\theta<1, νL(k,θ)\nu_{L}(k,\theta) becomes finite for all sufficiently large kk (see (46)). But according to Corollary 5, νL(k,θ)\nu_{L}(k,\theta) is decreasing in kk for θ<2k2k+1\theta<\frac{2k}{2k+1}, and limk+νR(k,θ)=+\lim_{k\to+\infty}\nu_{R}(k,\theta)=+\infty, so the inequality νR(k,θ)νL(k,θ)\nu_{R}(k,\theta)\leq\nu_{L}(k,\theta) can hold only for finitely many values of kk.

Finally, for θ=1\theta=1, νL(k,1)=+\nu_{L}(k,1)=+\infty and we can use Corollary 5 (i) with θ=1\theta=1. ∎

Remark 13.

The “lower left corner point” of each shaded region in Figure 3 corresponds to a pair (θ,ν)(\theta,\nu) for which ν=νR(k,θ)=νL(k,θ)\nu=\nu_{R}(k,\theta)=\nu_{L}(k,\theta). This means that here the leftmost and the rightmost entries of the first row of M(2k+1,θ,ν)M(2k+1,\theta,\nu) simultaneously vanish (and the other entries are non-negative). For k=1k=1, this happens for θ=θ1=1/2\theta=\theta_{1}=1/2 and ν=4\nu=4; the corresponding matrix is

M(3,12,4)=(010001100).M\left(3,\frac{1}{2},4\right)=\left(\begin{array}[]{ccc}0&1&0\\ 0&0&1\\ 1&0&0\\ \end{array}\right). (52)

4 Other spatial discretizations

The spatial semi-discretization considered in Sections 2.13 is not positivity preserving. The same will hold true for each spatial semi-discretization to be investigated in Sections 4.14.2 below: it is well-known [7, Chapter I, Theorem 7.2] that a linear constant-coefficient system of ordinary differential equations (3) is positivity preserving if and only if the matrix aΔxL\frac{a}{\Delta x}L has no negative off-diagonal entries. The violation of this last condition is clear for the matrix LL in (14), for all matrices LL in Section 4.1, and also for the ones in Section 4.2 (due to L1,2=L2,10L_{1,2}=-L_{2,1}\neq 0).

However, as we will see, it is again possible to obtain a positivity-preserving full discretization scheme when the spatial discretizations covered in this section (higher-order centered spatial discretizations and Fourier spectral collocation methods) are suitably combined with the θ\theta-method.

Remark 14.

For θ=0\theta=0 (that is, when the explicit Euler time discretization is applied), the matrix MM in (11) becomes M=I+νLM=I+\nu L, hence M0M\geq 0 cannot hold for any ν>0\nu>0 due to the negative off-diagonal entries of LL. Therefore, in what follows, we can assume θ(0,1]\theta\in(0,1].

4.1 Higher-order centered discretizations in space, θ\theta-method in time

The coefficients of the centered differences can be found, e.g., in [4], from which the corresponding circulant matrices LL describing the spatial discretization can be constructed. Here, we examine the first few cases.

  • When the stencil width is 3 (implying m3m\geq 3 and 2nd2^{\text{nd}}-order accuracy), the entries on the central diagonals are (1/2,0,1/2)(-1/2,0,1/2). This is matrix LL in (14) that has been considered in Sections 2.13.

  • When the stencil width is 5 (implying m5m\geq 5 and 4th4^{\text{th}}-order accuracy), the entries on the central diagonals are (1/12,2/3,0,2/3,1/12)(1/12,-2/3,0,2/3,-1/12).

  • When the stencil width is 7 (implying m7m\geq 7 and 6th6^{\text{th}}-order accuracy), the entries on the central diagonals are (1/60,3/20,3/4,0,3/4,3/20,1/60)(-1/60,3/20,-3/4,0,3/4,-3/20,1/60).

In all above central diagonals, the middle 0 corresponds to the main diagonal. We can obtain an eigendecomposition (7) for the matrix LL, with eigenvectors given by (8) and eigenvalues λ=ıψ(ξ)\lambda_{\ell}=\imath\psi(\xi_{\ell}), where

ψ(x):=2k=1(N1)/2Cksin(kx)(x).\displaystyle\psi(x):=2\sum_{k=1}^{(N-1)/2}C_{k}\sin(kx)\quad(x\in\mathbb{R}). (53)

As before, ξ\xi_{\ell} is defined in (9), and CC is a vector consisting of the last (N1)/2(N-1)/2 coefficients of the central diagonals of LL, with NN denoting the stencil width. For instance, the vector CC is equal to (1/2)(1/2), (2/3,1/12)(2/3,-1/12), and (3/4,3/20,1/60)(3/4,-3/20,1/60) for stencil widths 33, 55, and 77, respectively.

After the matrix LL has been chosen, we couple this spatial discretization with the θ\theta-method as time discretization, and the full discretization matrix MM is obtained (see (11) and (17)). As seen in Section 2.1, the matrix MM is a real, circulant matrix so it can be characterized by the entries of its first row, which take the form

M1,j=1m=1m(1θ(1θ)ν2ψ2(ξ))cos((j1)ξ)+νψ(ξ)sin((j1)ξ)1+θ2ν2ψ2(ξ)(1jm).\displaystyle M_{1,j}=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\psi^{2}(\xi_{\ell})\right)\cos((j-1)\xi_{\ell})+\nu\psi(\xi_{\ell})\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}\quad(1\leq j\leq m). (54)

Our computations suggest that the non-negativity properties of the matrix family M(m,θ,ν)M(m,\theta,\nu) again depend on the parity of mm.

Case 1: mm is even.

By using symbolic calculations, we have found that, for the 4th4^{\text{th}}-order scheme, M(m,θ,ν)0M(m,\theta,\nu)\geq 0 cannot hold for any θ(0,1]\theta\in(0,1] and ν>0\nu>0 when m{6,8,10}m\in\{6,8,10\}. Similarly, for the 6th6^{\text{th}}-order scheme, we checked (again symbolically) that M(m,θ,ν)0M(m,\theta,\nu)\geq 0 does not hold for any θ(0,1]\theta\in(0,1] and ν>0\nu>0 when m{8,10}m\in\{8,10\}. Therefore, positivity preservation is impossible in these cases.

The following proposition extends the above observations for the 4th4^{\text{th}}-order scheme when mm is a general even number—although only for sufficiently large values of ν\nu.

Proposition 1.

Consider the iterative formula (10) applied to the advection equation (1) with periodic boundary condition. Let the matrix Mm×mM_{m\times m} result from the 4th4^{\text{th}}-order centered discretization in space with mm spatial grid points and the θ\theta-method in time. Also, let ν\nu be the CFL number defined in (12). If m6m\geq 6 is even, then there exists ν0>0\nu_{0}>0 such that the matrix MM has at least one negative entry for any ν>ν0\nu>\nu_{0}.

Proof.

We show that M1,m<0M_{1,m}<0 for any θ(0,1]\theta\in(0,1] and ν>ν0\nu>\nu_{0}, where ν0>0\nu_{0}>0 is a constant depending on mm and θ\theta.

Let j=mj=m in (54), then by using (19) and 1m=1mcos(ξ)=0\frac{1}{m}\sum_{\ell=1}^{m}\cos(\xi_{\ell})=0 we have

M1,m\displaystyle M_{1,m} =1m=1m(1θ(1θ)ν2ψ2(ξ))cos(ξ)νψ(ξ)sin(ξ)1+θ2ν2ψ2(ξ)\displaystyle=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\psi^{2}(\xi_{\ell})\right)\cos(\xi_{\ell})-\nu\psi(\xi_{\ell})\sin(\xi_{\ell})}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}
=(1m=1m(1θ(1θ)ν2ψ2(ξ))cos(ξ)νψ(ξ)sin(ξ)1+θ2ν2ψ2(ξ))1m=1mcos(ξ)\displaystyle=\left(\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\psi^{2}(\xi_{\ell})\right)\cos(\xi_{\ell})-\nu\psi(\xi_{\ell})\sin(\xi_{\ell})}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}\right)-\frac{1}{m}\sum_{\ell=1}^{m}\cos(\xi_{\ell})
=νm=1mψ(ξ)sin(ξ)+θνψ2(ξ)cos(ξ)1+θ2ν2ψ2(ξ)\displaystyle=-\frac{\nu}{m}\sum_{\ell=1}^{m}\frac{\psi(\xi_{\ell})\sin(\xi_{\ell})+\theta\nu\psi^{2}(\xi_{\ell})\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}
=2νm=2m/2ψ(ξ)(sin(ξ)+θνψ(ξ)cos(ξ))1+θ2ν2ψ2(ξ),\displaystyle=-\frac{2\nu}{m}\sum_{\ell=2}^{m/2}\frac{\psi(\xi_{\ell})(\sin(\xi_{\ell})+\theta\nu\psi(\xi_{\ell})\cos(\xi_{\ell}))}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}, (55)

where in the last equality we used ψ(ξ1)=ψ(0)=0\psi(\xi_{1})=\psi(0)=0, ψ(ξm/2+1)=0\psi(\xi_{m/2+1})=0, and the symmetry of angles ξ\xi_{\ell} (1m1\leq\ell\leq m) about the xx-axis when mm is even (explicitly, the identities sin(k2π(1)m)=sin(k2π(m+1)m)\sin\left(k\frac{2\pi(\ell-1)}{m}\right)=-\sin\left(k\frac{2\pi(m-\ell+1)}{m}\right) and cos(k2π(1)m)=cos(k2π(m+1)m)\cos\left(k\frac{2\pi(\ell-1)}{m}\right)=\cos\left(k\frac{2\pi(m-\ell+1)}{m}\right) for positive integers kk, \ell and mm). Define the function

f(x;θ,ν)ψ(x)(sin(x)+θνψ(x)cos(x))1+θ2ν2ψ2(x).\displaystyle f(x;\theta,\nu)\coloneqq\frac{\psi(x)(\sin(x)+\theta\nu\psi(x)\cos(x))}{1+\theta^{2}\nu^{2}\psi^{2}(x)}.

Then, we can express (4.1) by summing only over indices \ell for which 0<ξ<π/20<\xi_{\ell}<\pi/2 (and separating the case ξ=π/2\xi_{\ell}=\pi/2 when mm is divisible by 4), yielding

M1,m={2νψ(π/2)m(1+θ2ν2ψ2(π/2))2νm=2m/4(f(ξ;θ,ν)+f(πξ;θ,ν)),if m0(mod 4),2νm=2(m+2)/4(f(ξ;θ,ν)+f(πξ;θ,ν)),if m2(mod 4).\displaystyle M_{1,m}=\begin{cases}-\dfrac{2\nu\psi(\pi/2)}{m\left(1+\theta^{2}\nu^{2}\psi^{2}(\pi/2)\right)}-\dfrac{2\nu}{m}\displaystyle{\sum\limits_{\ell=2}^{m/4}}\Big{(}f(\xi_{\ell};\theta,\nu)+f(\pi-\xi_{\ell};\theta,\nu)\Big{)},&\text{if }m\equiv 0\ (\mathrm{mod}\ 4),\\[20.0pt] -\dfrac{2\nu}{m}\displaystyle{\sum\limits_{\ell=2}^{(m+2)/4}}\Big{(}f(\xi_{\ell};\theta,\nu)+f(\pi-\xi_{\ell};\theta,\nu)\Big{)},&\text{if }m\equiv 2\ (\mathrm{mod}\ 4).\end{cases} (56)

We will also use the identity

f(ξ;θ,ν)+f(πξ;θ,ν)=\displaystyle f(\xi_{\ell};\theta,\nu)+f(\pi-\xi_{\ell};\theta,\nu)=
(ψ(ξ)+ψ(πξ))(sin(ξ)+θνψ(ξ)cos(ξ)+θνψ(πξ)(θνψ(ξ)sin(ξ)cos(ξ)))(1+θ2ν2ψ2(ξ))(1+θ2ν2ψ2(πξ)).\displaystyle\frac{\Bigl{(}\psi(\xi_{\ell})+\psi(\pi-\xi_{\ell})\Bigr{)}\Bigl{(}\sin(\xi_{\ell})+\theta\nu\psi(\xi_{\ell})\cos(\xi_{\ell})+\theta\nu\psi(\pi-\xi_{\ell})\bigl{(}\theta\nu\psi(\xi_{\ell})\sin(\xi_{\ell})-\cos(\xi_{\ell})\bigr{)}\Bigr{)}}{\bigl{(}1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})\bigr{)}\bigl{(}1+\theta^{2}\nu^{2}\psi^{2}(\pi-\xi_{\ell})\bigr{)}}.

First, observe that sin(ξ)\sin(\xi_{\ell}) and cos(ξ)\cos(\xi_{\ell}) are positive for each index 2(m+2)/42\leq\ell\leq(m+2)/4 in (56). Now for the 4th4^{\text{th}}-order centered spatial discretization, easy calculations show that ψ(π/2)=4/3\psi(\pi/2)=4/3,

ψ(ξ)=13sin(ξ)(4cos(ξ)),andψ(πξ)=13sin(ξ)(4+cos(ξ)),\displaystyle\psi(\xi_{\ell})=\frac{1}{3}\sin(\xi_{\ell})(4-\cos(\xi_{\ell})),\quad\text{and}\quad\psi(\pi-\xi_{\ell})=\frac{1}{3}\sin(\xi_{\ell})(4+\cos(\xi_{\ell})),

so they are all positive as well. Let us fix θ(0,1]\theta\in(0,1] arbitrarily, and notice that for each \ell we can find ν>0\nu_{\ell}>0 such that θνψ(ξ)sin(ξ)cos(ξ)>0\theta\nu\psi(\xi_{\ell})\sin(\xi_{\ell})-\cos(\xi_{\ell})>0 for ν>ν\nu>\nu_{\ell}. Let ν0:=maxν\nu_{0}:=\max\nu_{\ell}, then f(ξ;θ,ν)+f(πξ;θ,ν)>0f(\xi_{\ell};\theta,\nu)+f(\pi-\xi_{\ell};\theta,\nu)>0 for ν>ν0\nu>\nu_{0}. Therefore, M1,m<0M_{1,m}<0 for any θ(0,1]\theta\in(0,1] and ν>ν0\nu>\nu_{0}.

Remark 15.

It is easily seen that the previous proof boils down to the fact that for the 4th4^{\text{th}}-order centered spatial discretization we have ψ(x)>0\psi(x)>0 for x(0,π)x\in(0,\pi), where

ψ(x)=2(23sin(x)112sin(2x))=13sin(x)(4cos(x)).\psi(x)=2\left(\frac{2}{3}\sin(x)-\frac{1}{12}\sin(2x)\right)=\frac{1}{3}\sin(x)\bigl{(}4-\cos(x)\bigr{)}.

We know from (53) that for the 6th6^{\text{th}}-order scheme

ψ(x)=2(34sin(x)320sin(2x)+160sin(3x))=115sin(x)(239cos(x)+cos(2x)),\psi(x)=2\left(\frac{3}{4}\sin(x)-\frac{3}{20}\sin(2x)+\frac{1}{60}\sin(3x)\right)=\frac{1}{15}\sin(x)\bigl{(}23-9\cos(x)+\cos(2x)\bigr{)},

so we again have ψ(x)>0\psi(x)>0 for x(0,π)x\in(0,\pi). Therefore, the analogue of Proposition 1 is true for the 6th6^{\text{th}}-order scheme as well (c.f. the proof of Proposition 2).

Remark 16.

Based on the observations above Proposition 1, we conjecture the following: given an arbitrary finite difference centered discretization in space coupled with the θ\theta-method, then M1,m<0M_{1,m}<0 for even mm, and for all values ν>0\nu>0 and θ(0,1]\theta\in(0,1]—that is, ν0=0\nu_{0}=0 can be chosen in general.

Case 2: mm is odd.

In this case we have found that positivity preservation is possible for a suitable set of θ(0,1]\theta\in(0,1] and ν>0\nu>0 values; see Figures 5 and 6.

Also, if we assume ψ(ξ)0\psi(\xi_{\ell})\neq 0 for each 2m2\leq\ell\leq m, then we can extend the asymptotic results of Section 2.1 for an arbitrary high-order centered discretization (c.f. the “odd mm” case in Section 4.2).

Refer to caption
Figure 5: Non-negativity of the matrix family generated by the 4th4^{\text{th}}-order centered discretization in space and the θ\theta-method in time; see Section 4.1. The parameter regions in the (θ,ν)(\theta,\nu) parameter plane ensuring M(m,θ,ν)0M(m,\theta,\nu)\geq 0 are highlighted for m{5,7,9}m\in\{5,7,9\} (different values of mm are represented by different colors; the regions “shrink” as mm gets larger). For m=5m=5, m=7m=7, and m=9m=9, the “lower left corner” point of the region has θ0.726106\theta\approx 0.726106, θ0.809401\theta\approx 0.809401, and θ0.853562\theta\approx 0.853562, respectively.
Refer to caption
Figure 6: Non-negativity of the matrix family generated by the 6th6^{\text{th}}-order centered discretization in space and the θ\theta-method in time; see Section 4.1. The parameter regions in the (θ,ν)(\theta,\nu) parameter plane ensuring M(m,θ,ν)0M(m,\theta,\nu)\geq 0 are highlighted for m{7,9}m\in\{7,9\} (different values of mm are represented by different colors; the regions “shrink” as mm gets larger). For m=7m=7 and m=9m=9, the “lower left corner” point of the region has θ0.807042\theta\approx 0.807042 and θ0.851437\theta\approx 0.851437, respectively.

4.2 Fourier spectral collocation in space, θ\theta-method in time

Here, we consider the spectral method that results from extending the finite difference stencil to include the whole spatial grid. In this section, we assume m4m\geq 4. The resulting spatial semi-discretization can again be written in the form (3) where the matrix LL takes the form [10, 8]

Li,j={0if i=j,πm(1)i+jcot((ij)πm)if ij,\displaystyle L_{i,j}=\begin{cases}0&\text{if }i=j,\\[10.0pt] \dfrac{\pi}{m}(-1)^{i+j}\cot{\!\left(\frac{(i-j)\pi}{m}\right)}&\text{if }i\neq j,\end{cases}

for even mm, and

Li,j={0if i=j,πm(1)i+jcsc((ij)πm)if ij,\displaystyle L_{i,j}=\begin{cases}0&\text{if }i=j,\\[10.0pt] \dfrac{\pi}{m}(-1)^{i+j}\csc{\!\left(\frac{(i-j)\pi}{m}\right)}&\text{if }i\neq j,\end{cases}

for odd mm. As in Section 2, the spectral collocation matrices have an eigendecomposition given by (7), with eigenvectors (8), but now the entries of the diagonal matrix Λ\Lambda are

ıλ\displaystyle\imath\lambda_{\ell} ={ıξ1<m2+1,0=m2+1 and m is even,ı(ξ2π)m2+1<m,\displaystyle=\begin{cases}\imath\xi_{\ell}&1\leq\ell<\frac{m}{2}+1,\\ 0&\ell=\frac{m}{2}+1\text{ and }m\text{ is even},\\ \imath(\xi_{\ell}-2\pi)&\frac{m}{2}+1<\ell\leq m,\end{cases} (57)

where ξ\xi_{\ell} has been defined in (9).

When this matrix LL is coupled with the θ\theta-method as time discretization, the full discretization matrix MM in (11) is obtained. The entries of the first row of the circulant matrix MM are again given by (13), and now they take the form

M1,j=1m=1m(1θ(1θ)ν2λ2)cos((j1)ξ)+νλsin((j1)ξ)1+θ2ν2λ2(1jm).\displaystyle M_{1,j}=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\lambda_{\ell}^{2}\right)\cos((j-1)\xi_{\ell})+\nu\lambda_{\ell}\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}\quad(1\leq j\leq m). (58)

Similarly to the Sections 2 and 4.1, we distinguish between even and odd sizes of the discretization matrices MM.

Case 1: mm is even.

From (58) we have (also using (19)) that

M1,m=1m=1m(1θ(1θ)ν2λ2)cos(ξ)νλsin(ξ)1+θ2ν2λ2.\displaystyle M_{1,m}=\frac{1}{m}\sum_{\ell=1}^{m}\frac{\left(1-\theta(1-\theta)\nu^{2}\lambda_{\ell}^{2}\right)\cos(\xi_{\ell})-\nu\lambda_{\ell}\sin(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}. (59)

The following proposition proves that (59) is negative for sufficiently large CFL number ν\nu.

Proposition 2.

Consider the iterative formula (10) applied to the advection equation (1) with periodic boundary condition. Let the matrix Mm×mM_{m\times m} result from a given even spactral collocation method in space (with mm spatial grid points) and the θ\theta-method in time. Let also ν\nu be the CFL number defined in (12). If m4m\geq 4 is even, then there exists ν0>0\nu_{0}>0 such that the matrix MM has at least one negative entry for any ν>ν0\nu>\nu_{0}.

Proof.

The proof is similar to that of Proposition 1: we show that M1,m<0M_{1,m}<0 for all θ(0,1]\theta\in(0,1] and ν>ν0\nu>\nu_{0}, where ν0>0\nu_{0}>0 depends on mm and θ\theta.

First, observe that by using 1m=1mcos(ξ)=0\frac{1}{m}\sum_{\ell=1}^{m}\cos(\xi_{\ell})=0 we can rewrite (59) as

M1,m=νm=1mλsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2.M_{1,m}=-\frac{\nu}{m}\sum_{\ell=1}^{m}\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}.

Now, since λ1=λm/2+1=0\lambda_{1}=\lambda_{m/2+1}=0, we have

M1,m=νm=2m/2(λsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2)νm=m/2+2m(λsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2).M_{1,m}=-\frac{\nu}{m}\sum_{\ell=2}^{m/2}\left(\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}\right)-\frac{\nu}{m}\sum_{\ell=m/2+2}^{m}\left(\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}\right).

One easily checks from (57) and (9) that for any m2+2m\frac{m}{2}+2\leq\ell\leq m

λ=ξ2π=ξm+2=λm+2.\lambda_{\ell}=\xi_{\ell}-2\pi=-\xi_{m+2-\ell}=-\lambda_{m+2-\ell}.

This implies that

=2m/2λsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2==m/2+2mλsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2,\displaystyle\sum_{\ell=2}^{m/2}\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}=\sum_{\ell=m/2+2}^{m}\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}},

so

M1,m=2νm=2m/2λsin(ξ)+θνλ2cos(ξ)1+θ2ν2λ2.M_{1,m}=-\frac{2\nu}{m}\sum_{\ell=2}^{m/2}\frac{\lambda_{\ell}\sin(\xi_{\ell})+\theta\nu\lambda_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda_{\ell}^{2}}.

Now notice (due to λ=ξ\lambda_{\ell}=\xi_{\ell} for 2m22\leq\ell\leq\frac{m}{2}) that we also have

M1,m=2νm=2m/2ξsin(ξ)+θνξ2cos(ξ)1+θ2ν2ξ2,M_{1,m}=-\frac{2\nu}{m}\sum_{\ell=2}^{m/2}\frac{\xi_{\ell}\sin(\xi_{\ell})+\theta\nu\xi_{\ell}^{2}\cos(\xi_{\ell})}{1+\theta^{2}\nu^{2}\xi_{\ell}^{2}},

thus

M1,m=2νm=2m/2ψ(ξ)(sin(ξ)+θνψ(ξ)cos(ξ))1+θ2ν2ψ2(ξ)M_{1,m}=-\frac{2\nu}{m}\sum_{\ell=2}^{m/2}\frac{\psi(\xi_{\ell})(\sin(\xi_{\ell})+\theta\nu\psi(\xi_{\ell})\cos(\xi_{\ell}))}{1+\theta^{2}\nu^{2}\psi^{2}(\xi_{\ell})}

with ψ(x):=x\psi(x):=x. Since ψ(x)>0\psi(x)>0 for x(0,π)x\in(0,\pi), according to Remark 15, the proof is complete. ∎

As previously, we conjecture that M1,m<0M_{1,m}<0 for all values of ν>0\nu>0 and θ(0,1]\theta\in(0,1]. We have been able to verify this for m{4,6,8,10}m\in\{4,6,8,10\}, as follows. The expression M1,mM_{1,m} is a rational function in ν\nu and θ\theta, whose denominator is positive. By introducing a new variable y:=πθν0y:=\pi\theta\nu\geq 0 and dividing by ν>0\nu>0, we can write the numerator as a univariate polynomial pm(y)p_{m}(y). For example,

p8(y)=3(8+32)y4+64y332(7+52)y2+256y256(2+2).p_{8}(y)=-3\left(8+3\sqrt{2}\right)y^{4}+64y^{3}-32\left(7+5\sqrt{2}\right)y^{2}+256y-256\left(2+\sqrt{2}\right).

We have confirmed with symbolic calculations that pm(y)<0p_{m}(y)<0 for all y0y\geq 0 in the cases m{4,6,8,10}m\in\{4,6,8,10\}.

Remark 17.

When generating the matrix MM for the symbolic calculations for larger values of mm for the actual full discretization, it is of course computationally more efficient to use the formula R(νΛ){\cal{F}}R(\nu\Lambda){\cal{F}}^{*} in (11) instead of R(νL)R(\nu L) (because in the latter form one would need to evaluate the inverse of a non-sparse matrix).

Case 2: mm is odd.

This time we find a behavior similar to that observed in Section 2.1; see Figure 7.
Moreover, since this time λ0\lambda_{\ell}\neq 0 for 2m2\leq\ell\leq m, the same asymptotic results hold as in the case of the 2nd2^{\text{nd}}-order finite difference scheme in Section 2.1. Indeed, for odd m=2k+15m=2k+1\geq 5 we have that

M1,1=1m(1+=2m1θ(1θ)ν2λ21+θ2ν2λ2)M1,j=1m(1+=2m(1θ(1θ)ν2λ2)cos((j1)ξ)+νλsin((j1)ξ)1+θ2ν2λ2)(j2).\displaystyle\begin{split}M_{1,1}&=\frac{1}{m}\left(1+\sum_{\ell=2}^{m}\frac{1-\theta(1-\theta)\nu^{2}\lambda^{2}_{\ell}}{1+\theta^{2}\nu^{2}\lambda^{2}_{\ell}}\right)\\ M_{1,j}&=\frac{1}{m}\left(1+\sum_{\ell=2}^{m}\frac{(1-\theta(1-\theta)\nu^{2}\lambda^{2}_{\ell})\cos((j-1)\xi_{\ell})+\nu\lambda_{\ell}\sin((j-1)\xi_{\ell})}{1+\theta^{2}\nu^{2}\lambda^{2}_{\ell}}\right)\quad(j\geq 2).\end{split} (60)

Taking ν+\nu\to+\infty and θ(0,1]\theta\in(0,1] fixed in (60), yields

M1,1limν+M1,1\displaystyle M_{1,1}^{\infty}\coloneqq\lim_{\nu\to+\infty}M_{1,1} =1m1mθ\displaystyle=1-\frac{m-1}{m\theta}
M1,jlimν+M1,j\displaystyle M_{1,j}^{\infty}\coloneqq\lim_{\nu\to+\infty}M_{1,j} =1mθ(j2).\displaystyle=\frac{1}{m\theta}\quad(j\geq 2).

As a result, we conclude that

M1,j>0 for all 2jm and θ(0,1],\displaystyle M_{1,j}^{\infty}>0\text{ for all }2\leq j\leq m\text{ and }\theta\in(0,1],

while

M1,1>0θ>m1m.\displaystyle M_{1,1}^{\infty}>0\quad\Longleftrightarrow\quad\theta>\frac{m-1}{m}.
Refer to caption
Figure 7: Non-negativity of the matrix family generated by the Fourier spectral collocation method in space and the θ\theta-method in time; see Section 4.2. The parameter regions in the (θ,ν)(\theta,\nu) parameter plane ensuring M(m,θ,ν)0M(m,\theta,\nu)\geq 0 are highlighted for m{5,7,9}m\in\{5,7,9\} (different values of mm are represented by different colors; the regions “shrink” as mm gets larger).

5 Conclusions

In this work, we have studied the positivity preservation of certain fundamental discretizations of the advection equation with periodic boundary condition (1). Our detailed investigations in Sections 23 were devoted to the full discretization obtained by coupling the second-order centered differences in space with the θ\theta-method in time. Rather than using SSP theory [5], we have employed a direct approach, first based on discrete Fourier analysis and then on a polynomial representation of the entries of the full discretization matrix. The characterization of the matrix entries, along with the related trigonometric identities presented in Corollary 1, may be of independent interest. In Section 4, we considered higher-order centered differences or Fourier spectral collocation in space, and again the θ\theta-method in time.

For all full discretizations constructed this way, we have found similar behavior. If the number of spatial grid points mm is even, no method is positivity preserving, while if mm is odd, some methods may be positivity preserving. Positivity is generally enhanced by taking larger values of the CFL number ν>0\nu>0, larger values of the time-discretization parameter θ[0,1]\theta\in[0,1], or smaller values of the spatial grid points m+m\in\mathbb{N}^{+}. These tendencies, and more specific results, are described in Theorem 2, and can be seen in Figures 3, 5, 6, and 7. Our positive results about the full discretizations are perhaps unexpected, since neither of the underlying spatial semi-discretizations preserves positivity.

Although some of the spatial discretizations considered above have high order, the θ\theta-method as time discretization typically has order only 1 (order 2 occurs only for θ=1/2\theta=1/2). Therefore, we emphasize that our goal in this work is not to provide efficient discretizations but rather to understand the behavior of these simple building blocks, as a means of gaining insight and understanding the positivity of more complicated discretizations that may not be amenable to a thorough analysis.

There are several possible future directions for research building on this work. Other finite difference spatial discretizations could be studied using similar techniques, and higher-order one-step time discretizations could easily be incorporated via (11). Similarly, finite difference discretizations of other linear partial differential equations could be analyzed with the same techniques. Further areas for extension might include other boundary conditions or multidimensional problems.

References

  • [1] Abraham Berman and Robert J. Plemmons “Nonnegative matrices in the mathematical sciences” Revised reprint of the 1979 original 9, Classics in Applied Mathematics Society for IndustrialApplied Mathematics (SIAM), Philadelphia, PA, 1994, pp. xx+340 DOI: 10.1137/1.9781611971262
  • [2] Dennis S. Bernstein “Matrix mathematics” Theory, facts, and formulas Princeton University Press, Princeton, NJ, 2009, pp. xlii+1139 DOI: 10.1515/9781400833344
  • [3] Imre Fekete, David I. Ketcheson and Lajos Lóczi “Positivity for convective semi-discretizations” In Journal of Scientific Computing 74.1, 2018, pp. 244–266 DOI: 10.1007/s10915-017-0432-9
  • [4] Bengt Fornberg “Generation of finite difference formulas on arbitrarily spaced grids” In Mathematics of Computation 51.184, 1988, pp. 699–706 DOI: 10.2307/2008770
  • [5] Sigal Gottlieb, David Ketcheson and Chi-Wang Shu “Strong stability preserving Runge-Kutta and multistep time discretizations” World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2011, pp. xii+176 DOI: 10.1142/7498
  • [6] E. Hairer and G. Wanner “Solving ordinary differential equations. II” Stiff and differential-algebraic problems, Second revised edition 14, Springer Series in Computational Mathematics Springer-Verlag, Berlin, 2010, pp. xvi+614 DOI: 10.1007/978-3-642-05221-7
  • [7] Willem Hundsdorfer and Jan Verwer “Numerical solution of time-dependent advection-diffusion-reaction equations” 33, Springer Series in Computational Mathematics Springer-Verlag, Berlin, 2003, pp. x+471 DOI: 10.1007/978-3-662-09017-6
  • [8] Roger Peyret “Spectral methods for incompressible viscous flow” 148, Applied Mathematical Sciences Springer-Verlag, New York, 2002, pp. xii+432 DOI: 10.1007/978-1-4757-6557-1
  • [9] Oscar Rojo and Ricardo L. Soto “Existence and construction of nonnegative matrices with complex spectrum” In Linear Algebra and its Applications 368, 2003, pp. 53–69 DOI: 10.1016/S0024-3795(02)00650-X
  • [10] Lloyd N. Trefethen “Spectral methods in MATLAB” 10, Software, Environments, and Tools Society for IndustrialApplied Mathematics (SIAM), Philadelphia, PA, 2000, pp. xviii+165 DOI: 10.1137/1.9780898719598