
Block $\omega$-circulant preconditioners for parabolic optimal control problems

Po Yin Fung, Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong SAR ([email protected]). Sean Hon (Corresponding Author), Department of Mathematics, Hong Kong Baptist University, Hong Kong SAR ([email protected]).
Abstract

In this work, we propose a class of novel preconditioned Krylov subspace methods for solving an optimal control problem of parabolic equations. Namely, we develop a family of block $\omega$-circulant based preconditioners for the all-at-once linear system arising from the concerned optimal control problem, where both first order and second order time discretization methods are considered. The proposed preconditioners can be efficiently diagonalized by fast Fourier transforms in a parallel-in-time fashion, and their effectiveness is theoretically shown in the sense that the eigenvalues of the preconditioned matrix are clustered around $\pm 1$, which leads to rapid convergence when the minimal residual method is used. When the generalized minimal residual method is deployed, the efficacy of the proposed preconditioners is justified in the sense that the singular values of the preconditioned matrices are proven to be clustered around unity. Numerical results are provided to demonstrate the effectiveness of our proposed solvers.

Keywords: Toeplitz matrices, skew-circulant/circulant matrices, $\omega$-circulant matrices, preconditioners, parallel-in-time


AMS Subject Classification: 65F08, 65F10, 65M22, 15B05

1 Introduction

In recent years, there has been a growing interest in analyzing and solving optimization problems constrained by parabolic equations. We refer to [25, 14, 34, 6] and the references therein for a comprehensive overview.

In this work, we are interested in solving the distributed optimal control model problem. Namely, the following quadratic cost functional is minimized:

\min_{y,u}~\mathcal{J}(y,u):=\frac{1}{2}\|y-g\|^{2}_{L^{2}(\Omega\times(0,T))}+\frac{\gamma}{2}\|u\|^{2}_{L^{2}(\Omega\times(0,T))}, \qquad (1)

subject to a parabolic equation with certain initial and boundary conditions

\left\{\begin{array}{ll} y_{t}-\mathcal{L}y=f+u, & (x,t)\in\Omega\times(0,T],\\ y=0, & (x,t)\in\partial\Omega\times(0,T],\\ y(x,0)=y_{0}, & x\in\Omega,\end{array}\right. \qquad (2)

where $u,g\in L^{2}(\Omega\times(0,T))$ are the distributed control and the desired tracking trajectory, respectively, $\gamma>0$ is a regularization parameter, $\mathcal{L}=\nabla\cdot(a(x)\nabla)$, and $f$ and $y_{0}$ are given problem-dependent functions. Under appropriate assumptions, theoretical aspects such as the existence, uniqueness, and regularity of the solution were well studied in [25]. The optimal solution of (1) & (2) can be characterized by the following system:

\left\{\begin{array}{ll} y_{t}-\mathcal{L}y-\frac{1}{\gamma}p=f, & (x,t)\in\Omega\times(0,T],\\ y=0, & (x,t)\in\partial\Omega\times(0,T],\\ y(x,0)=y_{0}, & x\in\Omega,\\ -p_{t}-\mathcal{L}p+y=g, & (x,t)\in\Omega\times(0,T],\\ p=0, & (x,t)\in\partial\Omega\times(0,T],\\ p(x,T)=0, & x\in\Omega,\end{array}\right. \qquad (3)

where the control variable uu has been eliminated.

Following [36, 37], we discretize (3) using the $\theta$-method in time and some space discretization, which gives

M_{m}\frac{\mathbf{y}_{m}^{(k+1)}-\mathbf{y}_{m}^{(k)}}{\tau}+K_{m}(\theta\mathbf{y}_{m}^{(k+1)}+(1-\theta)\mathbf{y}_{m}^{(k)}) = M_{m}\Big(\theta\mathbf{f}_{m}^{(k+1)}+(1-\theta)\mathbf{f}_{m}^{(k)}+\frac{1}{\gamma}\big(\theta\mathbf{p}_{m}^{(k)}+(1-\theta)\mathbf{p}_{m}^{(k+1)}\big)\Big),
-M_{m}\frac{\mathbf{p}_{m}^{(k+1)}-\mathbf{p}_{m}^{(k)}}{\tau}+K_{m}(\theta\mathbf{p}_{m}^{(k)}+(1-\theta)\mathbf{p}_{m}^{(k+1)}) = M_{m}\big(\theta\mathbf{g}_{m}^{(k)}+(1-\theta)\mathbf{g}_{m}^{(k+1)}-\theta\mathbf{y}_{m}^{(k+1)}-(1-\theta)\mathbf{y}_{m}^{(k)}\big).

The backward Euler method corresponds to $\theta=1$, while the Crank-Nicolson method is adopted when $\theta=1/2$.

Combining the given initial and boundary conditions, one needs to solve the following linear system

\widetilde{\mathcal{A}}\begin{bmatrix}\mathbf{y}\\ \mathbf{p}\end{bmatrix}=\begin{bmatrix}\widetilde{\mathbf{g}}\\ \widetilde{\mathbf{f}}\end{bmatrix}, \qquad (4)

where we have $\mathbf{y}=[\mathbf{y}_{m}^{(1)},\cdots,\mathbf{y}_{m}^{(n)}]^{\top}$, $\mathbf{p}=[\mathbf{p}_{m}^{(0)},\cdots,\mathbf{p}_{m}^{(n-1)}]^{\top}$,

\widetilde{\mathbf{f}}=\begin{bmatrix}M_{m}(\theta\tau\mathbf{f}_{m}^{(1)}+(1-\theta)\tau\mathbf{f}_{m}^{(0)})+(M_{m}-(1-\theta)\tau K_{m})\mathbf{y}_{m}^{(0)}\\ M_{m}(\theta\tau\mathbf{f}_{m}^{(2)}+(1-\theta)\tau\mathbf{f}_{m}^{(1)})\\ \vdots\\ M_{m}(\theta\tau\mathbf{f}_{m}^{(n)}+(1-\theta)\tau\mathbf{f}_{m}^{(n-1)})\end{bmatrix},\qquad \widetilde{\mathbf{g}}=\tau\begin{bmatrix}M_{m}(\theta\mathbf{g}_{m}^{(0)}+(1-\theta)\mathbf{g}_{m}^{(1)}-(1-\theta)\mathbf{y}_{m}^{(0)})\\ M_{m}(\theta\mathbf{g}_{m}^{(1)}+(1-\theta)\mathbf{g}_{m}^{(2)})\\ \vdots\\ M_{m}(\theta\mathbf{g}_{m}^{(n-1)}+(1-\theta)\mathbf{g}_{m}^{(n)})\end{bmatrix},
\widetilde{\mathcal{A}} = \begin{bmatrix}\tau B_{n}^{(2)}\otimes M_{m}&(B_{n}^{(1)})^{\top}\otimes M_{m}+\tau(B_{n}^{(2)})^{\top}\otimes K_{m}\\ B_{n}^{(1)}\otimes M_{m}+\tau B_{n}^{(2)}\otimes K_{m}&-\frac{\tau}{\gamma}(B_{n}^{(2)})^{\top}\otimes M_{m}\end{bmatrix}, \qquad (5)

and the matrices $B_{n}^{(1)}$, $B_{n}^{(2)}$ are, respectively,

B_{n}^{(1)}=\begin{bmatrix}1&&&&\\ -1&1&&&\\ &-1&1&&\\ &&\ddots&\ddots&\\ &&&-1&1\end{bmatrix},\quad B_{n}^{(2)}=\begin{bmatrix}\theta&&&&\\ 1-\theta&\theta&&&\\ &1-\theta&\theta&&\\ &&\ddots&\ddots&\\ &&&1-\theta&\theta\end{bmatrix}.

We assume that the matrix $M_{m}$ is symmetric positive definite and that the matrix $K_{m}$ is symmetric positive semi-definite. If a finite element method is employed, $M_{m}$ and $K_{m}$ represent the mass matrix and the stiffness matrix, respectively. For the finite difference method, the linear system is such that $M_{m}=I_{m}$ and $K_{m}=-L_{m}$, where $-L_{m}$ is the discretization matrix of the negative Laplacian.

Following a similar idea in [21], we can further transform (4) into the following equivalent system

{\mathcal{A}}\begin{bmatrix}\sqrt{\gamma}\,\widetilde{\mathbf{y}}\\ \widetilde{\mathbf{p}}\end{bmatrix}=\begin{bmatrix}\mathbf{g}\\ \sqrt{\gamma}\,\mathbf{f}\end{bmatrix}, \qquad (6)

where $\mathbf{f}=(I_{n}\otimes M_{m}^{-\frac{1}{2}})\widetilde{\mathbf{f}}$, $\mathbf{g}=(I_{n}\otimes M_{m}^{-\frac{1}{2}})\widetilde{\mathbf{g}}$, $\widetilde{\mathbf{y}}=(B_{n}^{(2)}\otimes M_{m}^{\frac{1}{2}})[\mathbf{y}_{m}^{(1)},\cdots,\mathbf{y}_{m}^{(n)}]^{\top}$, $\widetilde{\mathbf{p}}=((B_{n}^{(2)})^{\top}\otimes M_{m}^{\frac{1}{2}})[\mathbf{p}_{m}^{(0)},\cdots,\mathbf{p}_{m}^{(n-1)}]^{\top}$, and

{\mathcal{A}} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&B_{n}^{\top}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}\\ B_{n}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}&-\alpha I_{n}\otimes I_{m}\end{bmatrix} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&\mathcal{T}^{\top}\\ \mathcal{T}&-\alpha I_{n}\otimes I_{m}\end{bmatrix}. \qquad (7)

Note that $\frac{1}{2}\leq\theta\leq 1$, $\alpha=\frac{\tau}{\sqrt{\gamma}}$, $I_{m}$ is the $m\times m$ identity matrix, and

\mathcal{T}=B_{n}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}, \qquad (8)

where $B_{n}$ is a lower triangular Toeplitz matrix whose entries are known explicitly as

B_{n}=\begin{bmatrix}\frac{1}{\theta}&&&&\\ \frac{-1}{\theta^{2}}&\frac{1}{\theta}&&&\\ \frac{-(\theta-1)}{\theta^{3}}&\frac{-1}{\theta^{2}}&\frac{1}{\theta}&&\\ \vdots&\ddots&\ddots&\ddots&\\ \frac{-(\theta-1)^{n-2}}{\theta^{n}}&\cdots&\frac{-(\theta-1)}{\theta^{3}}&\frac{-1}{\theta^{2}}&\frac{1}{\theta}\end{bmatrix}.
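As a quick numerical sanity check (an illustrative sketch, not part of the analysis; numpy is used and the parameter values are arbitrary), one can assemble the bidiagonal factors $B_{n}^{(1)}$ and $B_{n}^{(2)}$ and verify that the product $B_{n}^{(1)}(B_{n}^{(2)})^{-1}$ reproduces the closed-form entries of $B_{n}$ shown above:

```python
import numpy as np

n, theta = 6, 0.5

# bidiagonal Toeplitz factors of the theta-method
B1 = np.eye(n) - np.eye(n, k=-1)                        # B_n^{(1)}: 1 on diag, -1 on subdiag
B2 = theta * np.eye(n) + (1 - theta) * np.eye(n, k=-1)  # B_n^{(2)}: theta / (1 - theta)

Bn = B1 @ np.linalg.inv(B2)

# first column of B_n from the closed form: 1/theta, then -(theta-1)^{k-1}/theta^{k+1}
col = np.array([1 / theta] + [-(theta - 1) ** (k - 1) / theta ** (k + 1)
                              for k in range(1, n)])
assert np.allclose(Bn[:, 0], col)
# the two factors commute: both are polynomials in the lower shift matrix
assert np.allclose(Bn, np.linalg.inv(B2) @ B1)
```

Since $B_{n}$ is lower triangular Toeplitz, its first column determines the whole matrix.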

Incidentally, $B_{n}$ can be expressed as the product of two Toeplitz matrices, i.e., $B_{n}=B_{n}^{(1)}(B_{n}^{(2)})^{-1}=(B_{n}^{(2)})^{-1}B_{n}^{(1)}$. As will be explained in Section 2, the Toeplitz matrices $B_{n}^{(1)}$ and $B_{n}^{(2)}$ are respectively generated by the functions

f_{1}(\phi)=1-\exp(\mathbf{i}\phi) \qquad (9)

and

f_{2}(\phi)=\theta+(1-\theta)\exp(\mathbf{i}\phi). \qquad (10)

In what follows, we focus on using the finite difference method to discretize the system (3), namely, $M_{m}=I_{m}$ and $K_{m}=-L_{m}$ in the linear system (6). However, we point out that our proposed preconditioning methods are still applicable, with minimal modification, when a finite element method is used. We first develop a preconditioned generalized minimal residual (GMRES) method for a nonsymmetric equivalent system of (6), which is

\mathcal{\widehat{A}}\begin{bmatrix}\sqrt{\gamma}\,\widetilde{\mathbf{y}}\\ \widetilde{\mathbf{p}}\end{bmatrix}=\begin{bmatrix}\sqrt{\gamma}\,\mathbf{f}\\ \mathbf{g}\end{bmatrix}, \qquad (11)

where

\mathcal{\widehat{A}} = \begin{bmatrix}\mathcal{T}&-\alpha I_{n}\otimes I_{m}\\ \alpha I_{n}\otimes I_{m}&\mathcal{T}^{\top}\end{bmatrix}. \qquad (12)

For $\mathcal{\widehat{A}}$, we propose the following novel block preconditioner:

\mathcal{P}_{S}=\begin{bmatrix}\mathcal{S}&-\alpha I_{n}\otimes I_{m}\\ \alpha I_{n}\otimes I_{m}&\mathcal{S}^{*}\end{bmatrix}, \qquad (13)

where

\mathcal{S}=S_{n}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}). \qquad (14)

Notice that $S_{n}:=S_{n}^{(1)}(S_{n}^{(2)})^{-1}$, where

S_{n}^{(1)}=\begin{bmatrix}1&&&&-\omega\\ -1&1&&&\\ &-1&1&&\\ &&\ddots&\ddots&\\ &&&-1&1\end{bmatrix},\quad S_{n}^{(2)}=\begin{bmatrix}\theta&&&&\omega(1-\theta)\\ 1-\theta&\theta&&&\\ &1-\theta&\theta&&\\ &&\ddots&\ddots&\\ &&&1-\theta&\theta\end{bmatrix},

and $\omega=e^{\mathbf{i}\zeta}\in\mathbb{C}$ with $\zeta\in[0,2\pi)$. Clearly, both $S_{n}^{(1)}$ and $S_{n}^{(2)}$ are $\omega$-circulant matrices [3, 2], so they admit the eigendecompositions

S_{n}^{(j)}=(\Gamma_{n}\mathbb{F}_{n})\Lambda_{n}^{(j)}(\Gamma_{n}\mathbb{F}_{n})^{*},\quad j=1,2, \qquad (15)

where $\Gamma_{n}={\rm diag}\big(\exp(\mathbf{i}\zeta\frac{k-1}{n})\big)_{k=1}^{n}$, $\mathbb{F}_{n}=\frac{1}{\sqrt{n}}[\theta_{n}^{(i-1)(j-1)}]_{i,j=1}^{n}$ with $\theta_{n}=\exp(\frac{2\pi\mathbf{i}}{n})$, and $\Lambda_{n}^{(j)}={\rm diag}\big(f_{j}(\frac{\zeta+2\pi k}{n})\big)_{k=0}^{n-1}$, where $f_{1}$ and $f_{2}$ are defined by (9) and (10), respectively.
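To make the eigendecomposition (15) concrete, the following numpy sketch builds the $\omega$-circulant factor $S_{n}^{(1)}$ and checks that it is diagonalized by a scaled Fourier matrix with eigenvalues given by samples of $f_{1}$. The sign conventions for $\Gamma_{n}$, $\mathbb{F}_{n}$, and the ordering of the sample points vary between references; the ones below are one consistent choice, not necessarily the exact convention of the text:

```python
import numpy as np

n, zeta = 8, 0.7
omega = np.exp(1j * zeta)

# S_n^{(1)} = I_n - V_omega, where V_omega is the omega-circulant shift matrix
V = np.eye(n, k=-1).astype(complex)
V[0, -1] = omega
S1 = np.eye(n) - V

j = np.arange(n)
Gamma = np.diag(np.exp(-1j * zeta * j / n))           # one common sign convention
F = np.exp(2j * np.pi * np.outer(j, j) / n) / np.sqrt(n)
Q = Gamma @ F                                         # Gamma_n * F_n, unitary

lam = 1 - np.exp(1j * (zeta - 2 * np.pi * j) / n)     # samples of f_1 at the n-th roots of omega
assert np.allclose(Q @ Q.conj().T, np.eye(n))
assert np.allclose(S1, Q @ np.diag(lam) @ Q.conj().T)
```

The eigenvalues are $f_{1}$ evaluated at the arguments of the $n$-th roots of $\omega$, as the explicit formulas in Remark 1 below also indicate for $S_{n}^{(2)}$.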

Remark 1.

Since $S_{n}^{(2)}$ is an $\omega$-circulant matrix, its eigenvalues $\lambda_{k}(S_{n}^{(2)})$ can be found explicitly, i.e., $\lambda_{k}(S_{n}^{(2)})=\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)$, $k=0,\dots,n-1$. It is known that $S_{n}^{(2)}$ can be singular. For example, $S_{n}^{(2)}$ has a zero eigenvalue for even $n$ when $\theta=\frac{1}{2}$ and $\zeta=0$ (i.e., $\omega=1$), which was discussed in [36]. When this happens, a remedy is to replace the zero eigenvalue by a nonzero real number. One can thereby easily create a nonsingular circulant matrix $\widetilde{S}_{n}^{(2)}$ such that $\mathrm{rank}(\widetilde{S}_{n}^{(2)}-S_{n}^{(2)})=1$. Hence, our preconditioning approach can work with $S_{n}$ replaced by $\widetilde{S}_{n}=S_{n}^{(1)}(\widetilde{S}_{n}^{(2)})^{-1}$, without the restrictive assumption needed in [36] (i.e., that $n$ should be chosen odd). Therefore, for ease of exposition, we assume that $S_{n}^{(2)}$ is nonsingular in the rest of this work.

Remark 2.

It should be noted that our adopted $\omega$-circulant preconditioning is distinct from the $\epsilon$-circulant preconditioning that has received much attention in the literature due to its excellent performance for solving PDE problems (see, e.g., [24, 26, 33, 27]), even though these two kinds of matrices admit a similar decomposition like (15). A notable difference between the two is that $\omega$ is a complex number in general, while $\epsilon$ is chosen to be real. Correspondingly, the diagonal matrix $\Gamma_{n}$ for $\omega$-circulant matrices is unitary in general, while that for $\epsilon$-circulant matrices is not.

When $\omega=1$, $\mathcal{P}_{S}$ becomes the block circulant based preconditioner proposed in [36]. The existing block skew-circulant based preconditioner [4], proposed only for the backward Euler method, is also included in our preconditioning strategy when $\omega=-1$ and $\theta=1$. As extensively studied in [8, 9], $\omega$-circulant matrices as preconditioners for Toeplitz systems can substantially outperform Strang type preconditioners [32] (i.e., when $\omega=1$), especially in the ill-conditioned case. For related studies on the unsatisfactory performance of Strang preconditioners for ill-conditioned nonsymmetric (block) Toeplitz systems, we refer to [19, 18].

Moreover, $\omega$-circulant matrices can be efficiently diagonalized by fast Fourier transforms (FFTs), which are parallelizable over different processors. Hence, our preconditioner $\mathcal{P}_{S}$ is especially advantageous in a high performance computing environment.
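To illustrate the parallel-in-time structure, the following sketch (for $\theta=1$, where $S_{n}=S_{n}^{(1)}$, and with one particular sign convention for $\Gamma_{n}$ that may differ from other references) solves a system with $\mathcal{S}=S_{n}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})$: an FFT in time decouples the problem into $n$ independent $m\times m$ shifted systems:

```python
import numpy as np

def apply_S_inverse(v, zeta, L, tau):
    """Solve (S_n ⊗ I_m + tau * I_n ⊗ (-L)) z = v by diagonalizing S_n with an
    FFT in time (theta = 1, so S_n = S_n^{(1)}). After the transform, the n
    shifted m-by-m solves are mutually independent: the parallel-in-time step."""
    m = L.shape[0]
    n = v.size // m
    j = np.arange(n)
    gamma = np.exp(-1j * zeta * j / n)                  # diagonal scaling Gamma_n
    lam = 1 - np.exp(1j * (zeta - 2 * np.pi * j) / n)   # eigenvalues of S_n
    V = v.reshape(n, m)
    W = np.fft.fft(gamma.conj()[:, None] * V, axis=0) / np.sqrt(n)  # (Gamma F)^* v
    for k in range(n):                                  # embarrassingly parallel in k
        W[k] = np.linalg.solve(lam[k] * np.eye(m) - tau * L, W[k])
    return (gamma[:, None] * np.fft.ifft(W, axis=0) * np.sqrt(n)).reshape(-1)

# small check against a directly assembled S
zeta, tau, n, m = 0.3, 0.1, 6, 4
L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))  # 1D Laplacian stencil
Sn = np.eye(n, dtype=complex) - np.eye(n, k=-1)
Sn[0, -1] = -np.exp(1j * zeta)                           # corner entry -omega
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
v = np.random.default_rng(0).standard_normal(n * m)
z = apply_S_inverse(v, zeta, L, tau)
assert np.allclose(S @ z, v)
```

A production implementation would additionally exploit the fast diagonalizability of $-L_{m}$ where available, as discussed for $|\mathcal{P}_{S}|$ later in the text.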

In order to support our GMRES solver with $\mathcal{P}_{S}$ as a preconditioner, we will show that the singular values of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are clustered around unity. However, despite its success, which can be seen in the numerical experiments in Section 4, the convergence study of preconditioning strategies for nonsymmetric problems is to a great extent heuristic. As mentioned in [35, Chapter 6], descriptive convergence bounds are usually not available for GMRES or any of the other applicable nonsymmetric Krylov subspace iterative methods.

Therefore, as an alternative solver, we develop a preconditioned minimal residual (MINRES) method for the symmetric system (6), instead of (11). Notice that our proposed MINRES method is in contrast with the aforementioned GMRES solvers, such as [36, 4], where a block (skew-)circulant type preconditioner was proposed and the eigenvalues of the preconditioned matrix were shown to be clustered around unity. As well explained in [13], the convergence behaviour of GMRES cannot in general be rigorously analyzed by using eigenvalues alone. Thus, our MINRES solver circumvents these theoretical difficulties of GMRES.

Based on the spectral distribution of $\mathcal{A}$, we first propose the following novel symmetric positive definite (SPD) block diagonal preconditioner as an ideal preconditioner for $\mathcal{A}$:

|\mathcal{A}|:=\sqrt{\mathcal{A}^{2}}=\begin{bmatrix}\sqrt{\mathcal{T}^{\top}\mathcal{T}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{T}\mathcal{T}^{\top}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}. \qquad (16)

Despite its excellent preconditioning effect for $\mathcal{A}$, which will be shown in Section 3.2, the matrix $|\mathcal{A}|$ is computationally expensive to invert. Thus, we then propose the following parallel-in-time (PinT) preconditioner, which mimics $|\mathcal{A}|$ and can be implemented efficiently:

|\mathcal{P}_{S}|:=\sqrt{\mathcal{P}_{S}^{*}\mathcal{P}_{S}}=\begin{bmatrix}\sqrt{\mathcal{S}^{*}\mathcal{S}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{S}\mathcal{S}^{*}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}. \qquad (17)

However, the preconditioner $|\mathcal{P}_{S}|$ requires fast diagonalizability of $L_{m}$ in order to be efficiently implemented. When such diagonalizability is not available, we further propose the following preconditioner $\mathcal{P}_{MS}$ as a modification of $|\mathcal{P}_{S}|$:

\mathcal{P}_{MS} = \begin{bmatrix}\sqrt{S_{n}^{*}S_{n}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{S_{n}S_{n}^{*}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\end{bmatrix}.

One of our main contributions in this work is to develop a preconditioned MINRES method with the proposed preconditioners, which has theoretically guaranteed convergence based on eigenvalues.

It is worth noting that our preconditioning approaches are fundamentally different from another kind of related existing work (see, e.g., [31, 23]), which represents a typical preconditioning approach in the context of saddle point systems. Its effectiveness is based on the approximation of Schur complements (e.g., [30, 21]) and classical preconditioning techniques [1, 28]. In contrast, our preconditioning proposal extends the MINRES preconditioning strategy proposed in [17] from optimal control of wave equations to that of parabolic equations, resulting in a spectrum clustered around $\{\pm 1\}$. Moreover, the implementation of our preconditioners based on FFTs is parallel-in-time.

The paper is organized as follows. In Section 2, we review some preliminary results on block Toeplitz matrices. In Section 3, we provide our main results on the spectral analysis of our proposed preconditioners. Numerical examples are given in Section 4 to support the performance of our proposed preconditioners.

2 Preliminaries on Toeplitz matrices

In this section, we provide some useful background knowledge regarding Toeplitz matrices.

We let $L^{1}([-\pi,\pi])$ be the Banach space of all functions that are Lebesgue integrable over $[-\pi,\pi]$ and periodically extended to the whole real line. The Toeplitz matrix generated by $f\in L^{1}([-\pi,\pi])$ is denoted by $T_{n}[f]$, namely,

T_{n}[f]=\begin{bmatrix}a_{0}&a_{-1}&\cdots&a_{-n+2}&a_{-n+1}\\ a_{1}&a_{0}&a_{-1}&&a_{-n+2}\\ \vdots&a_{1}&a_{0}&\ddots&\vdots\\ a_{n-2}&&\ddots&\ddots&a_{-1}\\ a_{n-1}&a_{n-2}&\cdots&a_{1}&a_{0}\end{bmatrix},

where

a_{k}=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\theta)e^{-\mathbf{i}k\theta}\,d\theta,\quad k=0,\pm 1,\pm 2,\dots

are the Fourier coefficients of $f$. The function $f$ is called the generating function of $T_{n}[f]$. If $f$ is complex-valued, then $T_{n}[f]$ is non-Hermitian for all sufficiently large $n$. Conversely, if $f$ is real-valued, then $T_{n}[f]$ is Hermitian for all $n$. If $f$ is real-valued and nonnegative, but not identically zero almost everywhere, then $T_{n}[f]$ is Hermitian positive definite for all $n$. If $f$ is real-valued and even, then $T_{n}[f]$ is symmetric for all $n$. For thorough discussions of the related properties of block Toeplitz matrices, we refer readers to [11] and the references therein; for computational features, see [29, 7, 12] and the references reported there.
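To connect the definition of $T_{n}[f]$ with the matrices from Section 1, the following sketch (an illustrative helper, not from the paper; the Fourier coefficients are approximated by a uniform-grid quadrature, which is exact for these trigonometric polynomials) confirms numerically that $T_{n}[f_{1}]=B_{n}^{(1)}$ and $T_{n}[f_{2}]=B_{n}^{(2)}$:

```python
import numpy as np

def toeplitz_from_symbol(f, n, n_quad=4096):
    """Assemble T_n[f] with (i, j) entry a_{i-j}, where the Fourier coefficients
    a_k of f are approximated by the trapezoidal rule on [-pi, pi]."""
    phi = np.linspace(-np.pi, np.pi, n_quad, endpoint=False)
    vals = f(phi)
    k = np.arange(-(n - 1), n)
    a = np.array([np.mean(vals * np.exp(-1j * kk * phi)) for kk in k])
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return a[(i - j) + (n - 1)]

n, theta = 5, 0.7
T1 = toeplitz_from_symbol(lambda p: 1 - np.exp(1j * p), n)          # f_1
T2 = toeplitz_from_symbol(lambda p: theta + (1 - theta) * np.exp(1j * p), n)  # f_2

B1 = np.eye(n) - np.eye(n, k=-1)                        # B_n^{(1)}
B2 = theta * np.eye(n) + (1 - theta) * np.eye(n, k=-1)  # B_n^{(2)}
assert np.allclose(T1, B1)
assert np.allclose(T2, B2)
```

Both symbols are trigonometric polynomials of degree one, so only the coefficients $a_{0}$ and $a_{1}$ are nonzero, matching the bidiagonal structure.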

3 Main results

In this section, we provide the main results supporting the effectiveness of our proposed preconditioners. Implementation issues are also discussed.

3.1 GMRES - block $\omega$-circulant based preconditioner

Proposition 3.1.

Let $\mathcal{\widehat{A}}\in\mathbb{R}^{2mn\times 2mn}$ and $\mathcal{P}_{S}\in\mathbb{C}^{2mn\times 2mn}$ be defined by (12) and (13), respectively. Then,

\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}=I_{2mn}+\widetilde{\mathcal{R}}_{1},

where $I_{2mn}$ is the $2mn\times 2mn$ identity matrix and $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$.

Proof.

First, we observe that

\mathcal{P}_{S}-\mathcal{\widehat{A}} =\begin{bmatrix}\mathcal{S}-\mathcal{T}&\\ &(\mathcal{S}-\mathcal{T})^{*}\end{bmatrix} =\begin{bmatrix}(S_{n}-B_{n})\otimes I_{m}&\\ &(S_{n}-B_{n})^{*}\otimes I_{m}\end{bmatrix}.

Now, we examine $\mathrm{rank}(S_{n}-B_{n})$ via the following matrix decomposition:

S_{n}-B_{n} = S_{n}^{(1)}(S_{n}^{(2)})^{-1}-B_{n}^{(1)}(B_{n}^{(2)})^{-1} = (S_{n}^{(1)}-B_{n}^{(1)})(S_{n}^{(2)})^{-1}+B_{n}^{(1)}\big((S_{n}^{(2)})^{-1}-(B_{n}^{(2)})^{-1}\big).

From the simple structure of these matrices, it is clear that $\mathrm{rank}(S_{n}^{(1)}-B_{n}^{(1)})\leq 1$ and $\mathrm{rank}\big((S_{n}^{(2)})^{-1}-(B_{n}^{(2)})^{-1}\big)\leq 1$ (because $\mathrm{rank}(S_{n}^{(2)}-B_{n}^{(2)})\leq 1$). Thus, we have $\mathrm{rank}(S_{n}-B_{n})\leq 2$, implying $\mathrm{rank}(\mathcal{P}_{S}-\mathcal{\widehat{A}})\leq 4m$. Then, we have

\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}} = I_{2mn}+\underbrace{\big(-\mathcal{P}_{S}^{-1}(\mathcal{P}_{S}-\mathcal{\widehat{A}})\big)}_{=:\widetilde{\mathcal{R}}_{1}},

where $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$. ∎

As a consequence of Proposition 3.1, we can show that the singular values of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are clustered around unity, except for a number of outliers whose count is independent of $n$ in general. From the point of view of preconditioning for nonsymmetric Toeplitz systems, such a singular value cluster is often used to support the preconditioning effectiveness of $\mathcal{P}_{S}$ for $\mathcal{\widehat{A}}$ when GMRES is used. We refer to [29] for a systematic exposition of preconditioning for non-Hermitian Toeplitz systems. One could further show that the eigenvalues of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are also clustered around unity, as in many existing works. However, as mentioned in Section 1, the convergence of GMRES in general cannot be rigorously analyzed by using eigenvalues alone. As such, in the next subsections, we provide theoretical support for our proposed MINRES solvers.
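The singular value cluster can be observed directly on a small example (an illustrative numpy sketch with $\theta=1$, a 1D Laplacian stencil, and arbitrary parameter values; dense linear algebra is used only because the problem is tiny):

```python
import numpy as np

# Small-scale check of Proposition 3.1: all but O(m) singular values of
# P_S^{-1} A_hat equal 1.
n, m = 8, 4
tau, gam, zeta = 0.1, 1e-2, 0.5
alpha = tau / np.sqrt(gam)

L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))  # 1D Laplacian stencil
Bn = np.eye(n) - np.eye(n, k=-1)                         # B_n for theta = 1
Sn = Bn.astype(complex).copy()
Sn[0, -1] = -np.exp(1j * zeta)                           # omega-circulant corner

T = np.kron(Bn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
I = np.eye(n * m)
A_hat = np.block([[T, -alpha * I], [alpha * I, T.T]]).astype(complex)
P_S = np.block([[S, -alpha * I], [alpha * I, S.conj().T]])

sv = np.linalg.svd(np.linalg.solve(P_S, A_hat), compute_uv=False)
outliers = int(np.sum(np.abs(sv - 1) > 1e-8))
assert outliers <= 8 * m   # rank(R_1) <= 4m moves at most 8m singular values off 1
```

Since a rank-$r$ perturbation of the identity can move at most $2r$ singular values away from one, the bound $8m$ follows from $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$.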

3.2 MINRES - ideal preconditioner

In what follows, we will show that an ideal preconditioner for $\mathcal{A}$ is the SPD matrix $|\mathcal{A}|$ defined by (16).

Proposition 3.2.

Let $\mathcal{A}\in\mathbb{R}^{2mn\times 2mn}$ be defined as in Section 1. Then, the preconditioned matrix $|\mathcal{A}|^{-1}\mathcal{A}$ is both (real) symmetric and orthogonal.

Proof.

Considering the singular value decomposition $\mathcal{T}=U\Sigma V^{\top}$, the associated decomposition of $\mathcal{A}$ is obtained by direct computation, that is,

\mathcal{A} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&V\Sigma U^{\top}\\ U\Sigma V^{\top}&-\alpha I_{n}\otimes I_{m}\end{bmatrix} = \begin{bmatrix}V&\\ &U\end{bmatrix}\begin{bmatrix}\alpha I_{n}\otimes I_{m}&\Sigma\\ \Sigma&-\alpha I_{n}\otimes I_{m}\end{bmatrix}\begin{bmatrix}V&\\ &U\end{bmatrix}^{\top} = \begin{bmatrix}V&\\ &U\end{bmatrix}\mathcal{\widehat{Q}}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{\widehat{Q}}^{\top}\begin{bmatrix}V&\\ &U\end{bmatrix}^{\top},

where $\mathcal{\widehat{Q}}$ is an orthogonal matrix given by

\mathcal{\widehat{Q}}=\begin{bmatrix}\Sigma D_{1}^{-1}&-\Sigma D_{2}^{-1}\\ \big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}-\alpha I_{n}\otimes I_{m}\big)D_{1}^{-1}&\big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)D_{2}^{-1}\end{bmatrix}

with

D_{1}=\sqrt{\big(-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)^{2}+\Sigma^{2}}

and

D_{2}=\sqrt{\big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)^{2}+\Sigma^{2}}.

It is obvious that both diagonal matrices $D_{1}$ and $D_{2}$ are invertible. Thus, $\mathcal{\widehat{Q}}$ is well-defined.

Since both $\mathcal{\widehat{Q}}$ and $\begin{bmatrix}V&\\ &U\end{bmatrix}$ are orthogonal, $\mathcal{Q}:=\begin{bmatrix}V&\\ &U\end{bmatrix}\mathcal{\widehat{Q}}$ is also orthogonal. Hence, from the decomposition above, we have obtained an eigendecomposition of $\mathcal{A}$, i.e.,

\mathcal{A}=\mathcal{Q}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{Q}^{\top}.

Thus, we have

|\mathcal{A}| = \mathcal{Q}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{Q}^{\top},

where $\mathcal{Q}$ is orthogonal and $\Sigma$ is a diagonal matrix containing the singular values of $\mathcal{T}$. Thus,

|\mathcal{A}|^{-1}\mathcal{A} = \mathcal{Q}\begin{bmatrix}I_{n}\otimes I_{m}&\\ &-I_{n}\otimes I_{m}\end{bmatrix}\mathcal{Q}^{\top},

which is both symmetric and orthogonal. The proof is complete. ∎

In other words, Proposition 3.2 shows that the preconditioner $|\mathcal{A}|$ renders the eigenvalues exactly $\pm 1$, providing a good guide for designing effective preconditioners for $\mathcal{A}$. As a consequence of Proposition 3.2, we conclude that MINRES with $|\mathcal{A}|$ as a preconditioner achieves mesh-independent convergence, i.e., a convergence rate independent of both the mesh and the regularization parameter.
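Proposition 3.2 is easy to confirm numerically (an illustrative numpy sketch with a random stand-in for the block $\mathcal{T}$ and an arbitrary value of $\alpha$; since $\mathcal{A}$ is symmetric, $|\mathcal{A}|=\sqrt{\mathcal{A}^{2}}$ can be formed from its eigendecomposition):

```python
import numpy as np

# |A|^{-1} A should be symmetric, orthogonal, and have eigenvalues exactly +1/-1
rng = np.random.default_rng(1)
N, alpha = 12, 0.5
T = rng.standard_normal((N, N))          # stand-in for the block T
A = np.block([[alpha * np.eye(N), T.T], [T, -alpha * np.eye(N)]])

w, Q = np.linalg.eigh(A)                 # A is symmetric
absA = Q @ np.diag(np.abs(w)) @ Q.T      # |A| = sqrt(A^2)

P = np.linalg.solve(absA, A)
assert np.allclose(P, P.T, atol=1e-10)                  # symmetric
assert np.allclose(P @ P, np.eye(2 * N), atol=1e-10)    # orthogonal involution
assert np.allclose(np.abs(np.linalg.eigvalsh((P + P.T) / 2)), 1.0)
```

The eigenvalues of $A$ come in pairs $\pm\sqrt{\sigma_{i}^{2}+\alpha^{2}}$ and are bounded away from zero by $\alpha$, so $|A|$ is safely invertible here.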

Despite the fact that $|\mathcal{A}|$ is an ideal preconditioner, its direct application has the drawback of being computationally expensive in general. Proposition 3.2 reveals an eigendecomposition of both $\mathcal{A}$ and $|\mathcal{A}|$, allowing us to develop preconditioners based on the spectral symbol. In what follows, we will show that $|\mathcal{P}_{S}|$ defined by (17) is a good preconditioner for $\mathcal{A}$ in the sense that the preconditioned matrix $|\mathcal{P}_{S}|^{-1}\mathcal{A}$ can be expressed as the sum of a Hermitian unitary matrix and a low-rank matrix.

3.3 MINRES - block $\omega$-circulant based preconditioner

The following theorem accounts for the preconditioning effect of MINRES-$\mathcal{P}_{S}$.

Theorem 3.3.

Let $\mathcal{A}\in\mathbb{R}^{2mn\times 2mn}$ and $|\mathcal{P}_{S}|\in\mathbb{C}^{2mn\times 2mn}$ be defined as in Section 1 and by (17), respectively. Then,

|\mathcal{P}_{S}|^{-1}\mathcal{A}=\widetilde{\mathcal{Q}}_{1}+\widetilde{\mathcal{R}}_{2},

where $\widetilde{\mathcal{Q}}_{1}$ is both Hermitian and unitary and $\mathrm{rank}(\widetilde{\mathcal{R}}_{2})\leq 4m$.

Proof.

Let $s(\mathcal{A})=\begin{bmatrix}\alpha I_{n}\otimes I_{m}&\mathcal{S}^{*}\\ \mathcal{S}&-\alpha I_{n}\otimes I_{m}\end{bmatrix}$. Notice that $|s(\mathcal{A})|=\sqrt{s(\mathcal{A})^{2}}=|\mathcal{P}_{S}|=\sqrt{\mathcal{P}_{S}^{*}\mathcal{P}_{S}}$. Simple calculations show that

s(\mathcal{A})-\mathcal{A} =\begin{bmatrix}&(\mathcal{S}-\mathcal{T})^{*}\\ \mathcal{S}-\mathcal{T}&\end{bmatrix} =\begin{bmatrix}&(S_{n}-B_{n})^{*}\otimes I_{m}\\ (S_{n}-B_{n})\otimes I_{m}&\end{bmatrix}.

Thus, $\mathrm{rank}(s(\mathcal{A})-\mathcal{A})\leq 4m$, following the argument from the proof of Proposition 3.1. Hence, we have

|\mathcal{P}_{S}|^{-1}\mathcal{A} = |s(\mathcal{A})|^{-1}\mathcal{A} = \underbrace{|s(\mathcal{A})|^{-1}s(\mathcal{A})}_{=:\widetilde{\mathcal{Q}}_{1}}+\underbrace{\big(-|s(\mathcal{A})|^{-1}(s(\mathcal{A})-\mathcal{A})\big)}_{=:\widetilde{\mathcal{R}}_{2}},

where $\mathrm{rank}(\widetilde{\mathcal{R}}_{2})\leq 4m$.

Since $s(\mathcal{A})$ is Hermitian, we have $s(\mathcal{A})=\mathcal{W}\Psi\mathcal{W}^{*}$, where $\mathcal{W}$ is a unitary matrix and $\Psi$ is a diagonal matrix containing the eigenvalues of $s(\mathcal{A})$. Correspondingly, we have $|s(\mathcal{A})|=\mathcal{W}|\Psi|\mathcal{W}^{*}$, where $|\Psi|$ is a diagonal matrix containing the absolute values of the eigenvalues of $s(\mathcal{A})$. Thus, we have $\widetilde{\mathcal{Q}}_{1}=\mathcal{W}|\Psi|^{-1}\Psi\mathcal{W}^{*}$, which is clearly Hermitian. Also, as $\widetilde{\mathcal{Q}}_{1}^{2}=\mathcal{W}|\Psi|^{-1}\Psi|\Psi|^{-1}\Psi\mathcal{W}^{*}=\mathcal{W}I_{2nm}\mathcal{W}^{*}=I_{2nm}$, we know that $\widetilde{\mathcal{Q}}_{1}$ is unitary.

The proof is complete. ∎

As a consequence of Theorem 3.3 and [5, Corollary 3], we know that the preconditioned matrix $|\mathcal{P}_{S}|^{-1}\mathcal{A}$ has eigenvalues clustered at $\pm 1$, with a number of outliers independent of $n$ in general (i.e., depending only on $m$). Thus, the convergence is independent of the time step in general, and we can expect that MINRES for $\mathcal{A}$ will converge rapidly in exact arithmetic with $|\mathcal{P}_{S}|$ as the preconditioner.
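The eigenvalue cluster can also be observed on a small example (an illustrative numpy sketch with $\theta=1$, a 1D Laplacian stencil, and arbitrary parameters; $|\mathcal{P}_{S}|$ is formed densely via the eigendecomposition of the Hermitian matrix $s(\mathcal{A})$, which is affordable only at this toy scale):

```python
import numpy as np

# Small-scale check of Theorem 3.3: all but O(m) eigenvalues of |P_S|^{-1} A
# sit at +1 or -1.
n, m = 16, 3
tau, gam, zeta = 0.05, 1e-2, 0.8
alpha = tau / np.sqrt(gam)

L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
Bn = np.eye(n) - np.eye(n, k=-1)
Sn = Bn.astype(complex).copy()
Sn[0, -1] = -np.exp(1j * zeta)
T = np.kron(Bn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
I = np.eye(n * m)

A = np.block([[alpha * I, T.T], [T, -alpha * I]]).astype(complex)
sA = np.block([[(alpha * I).astype(complex), S.conj().T], [S, -alpha * I]])

w, W = np.linalg.eigh(sA)                   # s(A) is Hermitian
absP = W @ np.diag(np.abs(w)) @ W.conj().T  # |P_S| = sqrt(s(A)^2)

ev = np.linalg.eigvals(np.linalg.solve(absP, A))
outliers = int(np.sum(np.minimum(np.abs(ev - 1), np.abs(ev + 1)) > 1e-6))
assert outliers <= 8 * m   # here rank(S_n - B_n) = 1, so the perturbation rank is <= 2m
```

For $\theta=1$ the difference $S_{n}-B_{n}$ is only the corner entry, so the perturbation rank is at most $2m$ and at most $8m$ eigenvalues can leave $\{\pm 1\}$.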

3.4 MINRES - modified block $\omega$-circulant based preconditioner

To support the preconditioning effect of $\mathcal{P}_{MS}$, we will show that it is spectrally equivalent to $\mathcal{P}_{S}$. Before proceeding, we introduce the following auxiliary matrix $\mathcal{P}_{AS}$, which is useful for showing the preconditioning effect of $\mathcal{P}_{MS}$:

\mathcal{P}_{AS} = \begin{bmatrix}\sqrt{(S_{n}^{*}S_{n}+\alpha^{2}I_{n})\otimes I_{m}+\tau^{2}I_{n}\otimes L_{m}^{2}}&\\ &\sqrt{(S_{n}S_{n}^{*}+\alpha^{2}I_{n})\otimes I_{m}+\tau^{2}I_{n}\otimes L_{m}^{2}}\end{bmatrix}.

Also, the following lemma is useful.

Lemma 3.4.

Let $S_{n}\in\mathbb{C}^{n\times n}$ be defined as in (14). Then, $S_{n}^{*}+S_{n}$ is (Hermitian) positive semi-definite.

Proof.

The eigenvalues of $S_{n}$ can be found explicitly, which are

\lambda_{k}(S_{n})=\frac{\lambda_{k}(S_{n}^{(1)})}{\lambda_{k}(S_{n}^{(2)})}=\frac{1-\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)},

$k=0,\dots,n-1$. Thus, we have

\lambda_{k}(S_{n}^{*}+S_{n}) = \frac{1-\exp\big(-\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(-\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}+\frac{1-\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)} = \frac{2(2\theta-1)\big(1-\cos(\frac{\zeta+2\pi k}{n})\big)}{\theta^{2}+(1-\theta)^{2}+2\theta(1-\theta)\cos(\frac{\zeta+2\pi k}{n})},

Since \theta\in[\frac{1}{2},1], the numerator is non-negative, while the denominator equals |\theta+(1-\theta)\exp{(\mathbf{i}(\frac{\zeta+2\pi k}{n}))}|^{2}, which is positive because S_{n}^{(2)} is nonsingular by assumption. Hence, every eigenvalue of S_{n}^{*}+S_{n} is non-negative. The proof is complete. ∎
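The eigenvalue formula above can be checked numerically. The following sketch (with illustrative n, \zeta, and \theta) verifies both the identity \lambda_{k}(S_{n}^{*}+S_{n})=2\,\mathrm{Re}\,\lambda_{k}(S_{n}), which holds since S_{n} is normal, and the non-negativity for \theta\in[1/2,1]:

```python
import numpy as np

# Illustrative parameters (not from the paper): n grid points in time,
# theta-method parameter in [1/2, 1], and zeta = pi (i.e., omega = -1).
n, theta, zeta = 32, 0.75, np.pi
phi = (zeta + 2 * np.pi * np.arange(n)) / n

# Eigenvalues of S_n per the stated formula.
lam_S = (1 - np.exp(1j * phi)) / (theta + (1 - theta) * np.exp(1j * phi))

# Eigenvalues of S_n^* + S_n per the closed form derived in the proof.
lam_sym = 2 * (2 * theta - 1) * (1 - np.cos(phi)) / (
    theta**2 + (1 - theta)**2 + 2 * theta * (1 - theta) * np.cos(phi))

assert np.allclose(lam_sym, 2 * lam_S.real)  # equals lambda_k + conj(lambda_k)
assert lam_sym.min() >= -1e-14               # non-negative for theta >= 1/2
```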

Denote by σ(𝐂)\sigma({\bf C}) the spectrum of a square matrix 𝐂{\bf C}.

Lemma 3.5.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ(XX+YY+α2InIm1(X+Y)(X+Y)+α2InIm)[1,2].\displaystyle\sigma\bigg{(}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq[1,\sqrt{2}].
Proof.

Let

Z=(X+Y)(X+Y)+α2InImZ=\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}

and

Z~=XX+YY+α2InIm.\widetilde{Z}=\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}.

Knowing that the eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top} with Lm-L_{m} assumed SPD, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a real-valued diagonal matrix containing the eigenvalues of Lm-L_{m}, we have

Z\displaystyle Z =\displaystyle= (X+Y)(X+Y)+α2InIm\displaystyle\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm+τInΩm)(ΛnIm+τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})^{*}(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m)\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}

and

Z~\displaystyle\widetilde{Z} =\displaystyle= XX+YY+α2InIm\displaystyle\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm)(ΛnIm)+(τInΩm)(τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m).\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}.

Since ZZ and Z~\widetilde{Z} are diagonalized by the unitary matrix =Γn𝔽n𝕌m\mathbb{Q}=\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}, they are simultaneously diagonalizable.

Since both Z and \widetilde{Z} are Hermitian positive definite by construction, they are invertible. To examine the target spectrum of \widetilde{Z}^{-1}Z, we consider the Rayleigh quotient for complex \mathbf{v}\neq\mathbf{0}:

R:=\frac{\mathbf{v}^{*}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}{\mathbf{v}^{*}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}=\frac{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})^{*}(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}}{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}},\quad\mathbf{w}=\mathbb{Q}^{*}\mathbf{v}.

By the invertibility of ZZ and Z~\widetilde{Z}, both numerator and denominator are positive.

On one hand, we estimate an upper bound for R. Let z_{1} and z_{2} be corresponding diagonal entries of \Lambda_{n}\otimes I_{m} and \tau I_{n}\otimes\Omega_{m}, respectively. We have

12(z1+z2¯)(z1+z2)z¯1z1+z¯2z2,\displaystyle\frac{{1}}{2}(\overline{z_{1}+z_{2}})(z_{1}+z_{2})\leq{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}},

which implies

12(z1+z2¯)(z1+z2)+α2z¯1z1+z¯2z2+α2,\displaystyle\frac{{1}}{2}(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}\leq{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2},

since α2=τ2γ\alpha^{2}=\frac{\tau^{2}}{{\gamma}} is positive. Thus, we have

12(z1+z2¯)(z1+z2)+α2z¯1z1+z¯2z2+α2.\displaystyle\frac{{1}}{\sqrt{2}}\sqrt{(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}}\leq\sqrt{{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2}}.

Therefore,

R2.R\leq\sqrt{2}.

On the other hand, we estimate a lower bound for R by first examining the definiteness of the matrix X^{*}Y+Y^{*}X. Since S_{n}^{*}+S_{n} is (Hermitian) positive semi-definite by Lemma 3.4, the matrix

X^{*}Y+Y^{*}X=(S_{n}\otimes I_{m})^{*}(\tau I_{n}\otimes(-L_{m}))+(\tau I_{n}\otimes(-L_{m}))^{*}(S_{n}\otimes I_{m})=\tau(S_{n}^{*}+S_{n})\otimes(-L_{m})

is also positive semi-definite, since -L_{m} is symmetric positive definite and the Kronecker product of two positive semi-definite matrices is positive semi-definite. Thus, we also have

(z1+z2¯)(z1+z2)+α2\displaystyle\sqrt{(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}} =\displaystyle= z¯1z1+z¯2z2+z¯1z2+z¯2z1+α2\displaystyle\sqrt{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}+\bar{z}_{1}{z_{2}}+\bar{z}_{2}{z_{1}}+\alpha^{2}}
\displaystyle\geq z¯1z1+z¯2z2+α2,\displaystyle\sqrt{{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2}},

implying

1R.1\leq R.

The proof is complete. ∎
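The proof above reduces to a scalar inequality on corresponding diagonal entries. A quick numerical check (with an illustrative \alpha and random samples satisfying \mathrm{Re}\,z_{1}\geq 0, per Lemma 3.4, and z_{2}\geq 0, per the SPD assumption on -L_{m}) confirms the claimed range [1,\sqrt{2}]:

```python
import numpy as np

# Illustrative alpha = tau / sqrt(gamma); any positive value works here.
alpha = 0.3
rng = np.random.default_rng(0)

# z1 models eigenvalues of S_n (non-negative real part, cf. Lemma 3.4);
# z2 models entries of tau * Omega_m (real and non-negative).
z1 = np.abs(rng.standard_normal(1000)) + 1j * rng.standard_normal(1000)
z2 = np.abs(rng.standard_normal(1000))

# Scalar form of the Rayleigh quotient bounded in Lemma 3.5.
R = np.sqrt(np.abs(z1 + z2)**2 + alpha**2) / \
    np.sqrt(np.abs(z1)**2 + z2**2 + alpha**2)

assert np.all(R >= 1 - 1e-12) and np.all(R <= np.sqrt(2) + 1e-12)
```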

Similarly, we can show the following lemma:

Lemma 3.6.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ(XX+YY+α2InIm1(X+Y)(X+Y)+α2InIm)[1,2].\displaystyle\sigma\bigg{(}\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\sqrt{(X+Y)(X+Y)^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq[1,\sqrt{2}].
Proposition 3.7.

Let |𝒫S|,𝒫AS2mn×2mn|\mathcal{P}_{S}|,\mathcal{P}_{AS}\in\mathbb{C}^{2mn\times 2mn} be defined by (17) and (3.4), respectively. Then,

σ(𝒫AS1|𝒫S|)[1,2].\displaystyle\sigma(\mathcal{P}_{AS}^{-1}|\mathcal{P}_{S}|)\subseteq[1,\sqrt{2}].
Proof.

Knowing that

|𝒫S|=[(X+Y)(X+Y)+α2InIm(X+Y)(X+Y)+α2InIm]|\mathcal{P}_{S}|=\begin{bmatrix}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{(X+Y)(X+Y)^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}

and

𝒫AS=[XX+YY+α2InImXX+YY+α2InIm],\mathcal{P}_{AS}=\begin{bmatrix}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix},

we know that \sigma(\mathcal{P}_{AS}^{-1}|\mathcal{P}_{S}|)\subseteq[1,\sqrt{2}] by Lemmas 3.5 and 3.6. ∎

Remark 3.

When the Crank-Nicolson method (i.e., \theta=\frac{1}{2}) is adopted, S_{n}^{*}+S_{n} is the zero matrix by Lemma 3.4. Using this fact in the proofs of Lemmas 3.5 and 3.6, we can show that X^{*}Y+Y^{*}X is also the zero matrix, which implies \mathcal{P}_{AS}=|\mathcal{P}_{S}|.

Lemma 3.8.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ((XX+α2InIm+YY)1XX+YY+α2InIm)[12,1].\displaystyle\sigma\bigg{(}\big{(}\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}\big{)}^{-1}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proof.

Similar to the proof of Lemma 3.5, we let

Z^=XX+α2InIm+YY\widehat{Z}=\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}

and

Z~=XX+YY+α2InIm.\widetilde{Z}=\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}.

Knowing that the eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top}, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a real diagonal matrix containing the eigenvalues of Lm-L_{m}, we have

Z^\displaystyle\widehat{Z} =\displaystyle= XX+α2InIm+YY\displaystyle\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×((ΛnIm)(ΛnIm)+α2InIm+τInΩm)\displaystyle\times\big{(}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}+\tau I_{n}\otimes\Omega_{m}\big{)}
×(Γn𝔽n𝕌m)\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}

and, again,

Z~\displaystyle\widetilde{Z} =\displaystyle= XX+YY+α2InIm\displaystyle\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm)(ΛnIm)+(τInΩm)(τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m).\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}.

Thus, Z^\widehat{Z} and Z~\widetilde{Z} are simultaneously diagonalized by the unitary matrix =Γn𝔽n𝕌m\mathbb{Q}=\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}.

Since both \widehat{Z} and \widetilde{Z} are Hermitian positive definite by construction, they are invertible. To examine the target spectrum of \widehat{Z}^{-1}\widetilde{Z}, we consider the Rayleigh quotient for complex \mathbf{v}\neq\mathbf{0}:

\widehat{R}:=\frac{\mathbf{v}^{*}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}{\mathbf{v}^{*}\big(\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}\big)\mathbf{v}}=\frac{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}}{\mathbf{w}^{*}\big(\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}+\tau I_{n}\otimes\Omega_{m}\big)\mathbf{w}},\quad\mathbf{w}=\mathbb{Q}^{*}\mathbf{v}.

For any two non-negative numbers c_{1} and c_{2}, it holds that

12(c1+c2)c12+c22c1+c2.\displaystyle\frac{1}{\sqrt{2}}(c_{1}+c_{2})\leq\sqrt{c_{1}^{2}+c_{2}^{2}}\leq c_{1}+c_{2}.

Therefore, by letting c_{1} and c_{2} be corresponding diagonal entries of \sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}} and \tau I_{n}\otimes\Omega_{m}, respectively, we have

12R^1.\frac{1}{\sqrt{2}}\leq\widehat{R}\leq 1.

The proof is complete. ∎
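The scalar inequality underpinning this proof can be sampled numerically (illustrative data only); it directly yields the spectral inclusion [\frac{1}{\sqrt{2}},1]:

```python
import numpy as np

# For non-negative c1, c2: (c1 + c2)/sqrt(2) <= sqrt(c1^2 + c2^2) <= c1 + c2.
rng = np.random.default_rng(1)
c1, c2 = rng.random(1000), rng.random(1000)   # illustrative non-negative samples
s = np.sqrt(c1**2 + c2**2)

assert np.all((c1 + c2) / np.sqrt(2) <= s + 1e-12)
assert np.all(s <= c1 + c2 + 1e-12)
```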

Similarly, we can show the following lemma and proposition.

Lemma 3.9.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ((XX+α2InIm+YY)1XX+YY+α2InIm)[12,1].\displaystyle\sigma\bigg{(}\big{(}\sqrt{XX^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{YY^{*}}\big{)}^{-1}\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proposition 3.10.

Let 𝒫AS,𝒫MS2mn×2mn\mathcal{P}_{AS},\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (3.4) and (1), respectively. Then,

σ(𝒫MS1𝒫AS)[12,1].\displaystyle\sigma(\mathcal{P}_{MS}^{-1}\mathcal{P}_{AS})\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proof.

Knowing that

𝒫MS=[XX+α2InIm+YYXX+α2InIm+YY]\mathcal{P}_{MS}=\begin{bmatrix}\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}&\\ &\sqrt{XX^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{YY^{*}}\\ \end{bmatrix}

and

𝒫AS=[XX+YY+α2InImXX+YY+α2InIm],\mathcal{P}_{AS}=\begin{bmatrix}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix},

we know that σ(𝒫MS1𝒫AS)[12,1],\sigma(\mathcal{P}_{MS}^{-1}\mathcal{P}_{AS})\subseteq\big{[}\frac{1}{\sqrt{2}},1\big{]}, by Lemmas 3.8 and 3.9. ∎

Now, we are ready to show that 𝒫MS\mathcal{P}_{MS} and 𝒫S\mathcal{P}_{S} are spectrally equivalent in the following theorem, which explains the effectiveness of 𝒫MS\mathcal{P}_{MS}.

Theorem 3.11.

Let |𝒫S|,𝒫MS2mn×2mn|\mathcal{P}_{S}|,\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (17) and (1), respectively. Then,

σ(𝒫MS1|𝒫S|)[12,2].\displaystyle\sigma(\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}|)\subseteq\bigg{[}\frac{1}{\sqrt{2}},\sqrt{2}\bigg{]}.
Proof.

Notice that for complex 𝐯𝟎\mathbf{v}\neq\mathbf{0},

𝐯|𝒫S|𝐯𝐯𝒫MS𝐯=𝐯|𝒫S|𝐯𝐯𝒫AS𝐯𝐯𝒫AS𝐯𝐯𝒫MS𝐯.\displaystyle\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}=\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{AS}\mathbf{v}}\cdot\frac{\mathbf{v}^{*}\mathcal{P}_{AS}\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}.

By Propositions 3.7 and 3.10, we have

12𝐯|𝒫S|𝐯𝐯𝒫MS𝐯2.\frac{1}{\sqrt{2}}\leq\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}\leq\sqrt{2}.

The proof is complete. ∎

Remark 4.

From Remark 3, we know that σ(𝒫MS1|𝒫S|)[12,1]\sigma(\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}|)\subseteq[\frac{1}{\sqrt{2}},1] when θ=12\theta=\frac{1}{2}.

In what follows, we will justify the preconditioning effectiveness of 𝒫MS\mathcal{P}_{MS} for 𝒜\mathcal{A}.

Lemma 3.12.

[16, Theorem 4.5.9 (Ostrowski)] Let A_{m},W_{m} be m\times m matrices. Suppose A_{m} is Hermitian and W_{m} is nonsingular. Let the eigenvalues of A_{m} and W_{m}W_{m}^{*} be arranged in increasing order. For each k=1,2,\dots,m, there exists a positive real number \theta_{k} such that \lambda_{1}(W_{m}W_{m}^{*})\leq\theta_{k}\leq\lambda_{m}(W_{m}W_{m}^{*}) and

λk(WmAmWm)=θkλk(Am).\lambda_{k}(W_{m}A_{m}W_{m}^{*})=\theta_{k}\lambda_{k}(A_{m}).
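Ostrowski's theorem is easy to illustrate numerically. In the sketch below (illustrative sizes; A is a diagonal Hermitian matrix with nonzero eigenvalues, W is generically nonsingular), the ratios \theta_{k} of the increasingly sorted eigenvalues indeed lie within the spectrum bounds of WW^{*}:

```python
import numpy as np

m = 6
# Hermitian test matrix with distinct nonzero eigenvalues (illustrative).
A = np.diag(np.linspace(-2.5, 3.0, m))
# Generically nonsingular congruence factor (illustrative).
W = np.random.default_rng(2).standard_normal((m, m)) + 0.5 * np.eye(m)

# theta_k = lambda_k(W A W^*) / lambda_k(A), both sorted increasingly.
theta = np.sort(np.linalg.eigvalsh(W @ A @ W.T)) / np.sort(np.linalg.eigvalsh(A))
lamWW = np.linalg.eigvalsh(W @ W.T)

assert np.all(theta >= lamWW.min() - 1e-8)
assert np.all(theta <= lamWW.max() + 1e-8)
```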

As a consequence of Theorem 3.11, Lemma 3.12, and [5, Corollary 3], we can show the following corollary accounting for the preconditioning effectiveness of 𝒫MS\mathcal{P}_{MS}:

Corollary 3.13.

Let \mathcal{A}\in\mathbb{R}^{2mn\times 2mn},\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (1) and (1), respectively. Then, the eigenvalues of the matrix \mathcal{P}_{MS}^{-1}\mathcal{A} are contained in [-\sqrt{2},-\frac{1}{\sqrt{2}}]\cup[\frac{1}{\sqrt{2}},\sqrt{2}], with a number of outliers independent of n in general (i.e., depending only on m).

Proof.

Note that

𝒫MS1/2𝒜𝒫MS1/2=𝒫MS1/2|𝒫S|1/2|𝒫S|1/2𝒜|𝒫S|1/2|𝒫S|1/2𝒫MS1/2.\displaystyle\mathcal{P}_{MS}^{-1/2}\mathcal{A}\mathcal{P}_{MS}^{-1/2}=\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|^{1/2}|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}|\mathcal{P}_{S}|^{1/2}\mathcal{P}_{MS}^{-1/2}.

From Lemma 3.12 and Theorem 3.11, we know that, for each k=1,2,,2mnk=1,2,\dots,2mn, there exists a positive real number θk\theta_{k} such that

12λmin(𝒫MS1/2|𝒫S|𝒫MS1/2)θkλmax(𝒫MS1/2|𝒫S|𝒫MS1/2)2\frac{1}{\sqrt{2}}\leq\lambda_{\min}(\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|\mathcal{P}_{MS}^{-1/2})\leq\theta_{k}\leq\lambda_{\max}(\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|\mathcal{P}_{MS}^{-1/2})\leq\sqrt{2}

and

λk(𝒫MS1/2𝒜𝒫MS1/2)=θkλk(|𝒫S|1/2𝒜|𝒫S|1/2).\lambda_{k}(\mathcal{P}_{MS}^{-1/2}\mathcal{A}\mathcal{P}_{MS}^{-1/2})=\theta_{k}\lambda_{k}(|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}).

Recalling from Theorem 3.3 and [5, Corollary 3] that the eigenvalues \lambda_{k}(|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}) are clustered around \pm 1, except for a number of outliers independent of n in general, the proof is complete. ∎

Remark 5.

When the Crank-Nicolson method is used, we can show that the eigenvalues of the matrix \mathcal{P}_{MS}^{-1}\mathcal{A} are contained in [-1,-\frac{1}{\sqrt{2}}]\cup[\frac{1}{\sqrt{2}},1], with a number of outliers independent of n in general.

In light of the last corollary, we can expect that MINRES for 𝒜\mathcal{A} will converge rapidly in exact arithmetic with 𝒫MS\mathcal{P}_{MS} as the preconditioner.

3.5 Implementation

We begin by discussing the computation of \mathcal{\widehat{A}}\mathbf{v} (and \mathcal{A}\mathbf{v}) for a given vector \mathbf{v}. The matrix-vector product \mathcal{\widehat{A}}\mathbf{v} can be computed in \mathcal{O}(mn\log{n}) operations using fast Fourier transforms, since \mathcal{\widehat{A}} contains two (dense) block Toeplitz matrices. The required storage is \mathcal{O}(mn). In the special case \theta=1, for instance, the product \mathcal{\widehat{A}}\mathbf{v} requires only linear complexity \mathcal{O}(mn), since the matrix is sparse with a simple bidiagonal Toeplitz block B_{n}.
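The \mathcal{O}(mn\log{n}) cost rests on the classical fact that a Toeplitz matrix-vector product can be computed by embedding the Toeplitz matrix into a circulant of twice the size and applying FFTs. A minimal one-dimensional sketch with illustrative data:

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(col, row, v):
    """Multiply the Toeplitz matrix with first column `col` and first row
    `row` by `v`, via a 2n-point circulant embedding and FFTs."""
    n = len(v)
    # First column of the circulant embedding: [col, 0, reversed row tail].
    c = np.r_[col, 0.0, row[:0:-1]]
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.r_[v, np.zeros(n)]))
    return y[:n].real

rng = np.random.default_rng(6)
col, row = rng.standard_normal(8), rng.standard_normal(8)
row[0] = col[0]                 # Toeplitz requires matching corner entry
v = rng.standard_normal(8)

assert np.allclose(toeplitz_matvec(col, row, v), toeplitz(col, row) @ v)
```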

In each GMRES iteration, the matrix-vector product 𝒫S1𝐯\mathcal{P}_{S}^{-1}\mathbf{v} for a given vector 𝐯\mathbf{v} needs to be computed. Since ω\omega-circulant matrices are diagonalizable by the product of a diagonal matrix and a discrete Fourier matrix 𝔽n=1n[θn(i1)(j1)]i,j=1nn×n\mathbb{F}_{n}=\frac{1}{\sqrt{n}}[\theta_{n}^{(i-1)(j-1)}]_{i,j=1}^{n}\in\mathbb{C}^{n\times n} with θn=exp(2π𝐢n)\theta_{n}=\exp{(\frac{2\pi\mathbf{i}}{n})}, we can represent the matrix SnS_{n} defined by (14) using eigendecomposition Sn=Γn𝔽nΛn𝔽nΓnS_{n}=\Gamma_{n}\mathbb{F}_{n}\Lambda_{n}\mathbb{F}_{n}^{*}\Gamma_{n}^{*}. Note that Λn\Lambda_{n} is a diagonal matrix.

Hence, we can decompose 𝒫S\mathcal{P}_{S} from (13) as follows:

𝒫S\displaystyle\mathcal{P}_{S} =[𝒮αInImαInIm𝒮]\displaystyle=\begin{bmatrix}\mathcal{S}&-\alpha{I}_{n}\otimes I_{m}\\ \alpha{I}_{n}\otimes I_{m}&\mathcal{S}^{*}\\ \end{bmatrix}
=𝒰~([ΛnαInαInΛn]𝒢Im+τ[InIn](Lm))𝒰~,\displaystyle=\widetilde{\mathcal{U}}\left(\underbrace{\begin{bmatrix}\Lambda_{n}&-\alpha I_{n}\\ \alpha I_{n}&\Lambda_{n}^{*}\\ \end{bmatrix}}_{\mathcal{G}}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\widetilde{\mathcal{U}}^{*},

where \widetilde{\mathcal{U}}=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes I_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes I_{m})^{*}\\ \end{bmatrix} is a unitary matrix. Note that the matrix \mathcal{G} can be further decomposed using the following lemma from [36].

Lemma 3.14.

([36], Lemma 2.3) Let G1,2,3,4n×nG_{1,2,3,4}\in\mathbb{C}^{n\times n} be four diagonal matrices and 𝐆=[G1G2G3G4]\mathbf{G}=\begin{bmatrix}G_{1}&G_{2}\\ G_{3}&G_{4}\\ \end{bmatrix}. Suppose G2G_{2} and G3G_{3} are invertible. Then, it holds that

𝐆=𝐖[G1+G2M1G4+G3M2]𝐖1,𝐖=[InM2M1In],\displaystyle\mathbf{G}=\mathbf{W}\begin{bmatrix}G_{1}+G_{2}M_{1}&\\ &G_{4}+G_{3}M_{2}\\ \end{bmatrix}\mathbf{W}^{-1},\quad\mathbf{W}=\begin{bmatrix}I_{n}&M_{2}\\ M_{1}&I_{n}\\ \end{bmatrix},

provided 𝐖\mathbf{W} is invertible, where Inn×nI_{n}\in\mathbb{R}^{n\times n} is the identity matrix and

M1\displaystyle M_{1} =12G21(G4G1+(G4G1)2+4G2G3),\displaystyle=\frac{1}{2}G_{2}^{-1}\left(G_{4}-G_{1}+\sqrt{(G_{4}-G_{1})^{2}+4G_{2}G_{3}}\right),
M2\displaystyle M_{2} =12G31(G4G1+(G4G1)2+4G2G3).\displaystyle=\frac{-1}{2}G_{3}^{-1}\left(G_{4}-G_{1}+\sqrt{(G_{4}-G_{1})^{2}+4G_{2}G_{3}}\right).
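Lemma 3.14 is straightforward to verify numerically for diagonal blocks. The sketch below (with illustrative random diagonals; G_{2}, G_{3} are generically invertible) forms M_{1}, M_{2}, and \mathbf{W} as stated and checks the factorization:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
# Four illustrative complex diagonal blocks, stored as vectors of diagonals.
g1, g2, g3, g4 = (rng.standard_normal(n) + 1j * rng.standard_normal(n)
                  for _ in range(4))

# Entrywise square root from the lemma (complex branch).
s = np.sqrt((g4 - g1)**2 + 4 * g2 * g3)
m1 = (g4 - g1 + s) / (2 * g2)
m2 = -(g4 - g1 + s) / (2 * g3)

I = np.eye(n)
W = np.block([[I, np.diag(m2)], [np.diag(m1), I]])
D = np.diag(np.r_[g1 + g2 * m1, g4 + g3 * m2])      # blkdiag(G1+G2*M1, G4+G3*M2)
G = np.block([[np.diag(g1), np.diag(g2)], [np.diag(g3), np.diag(g4)]])

assert np.allclose(W @ D @ np.linalg.inv(W), G)
```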

Applying Lemma 3.14 to \mathcal{G}, i.e., writing \mathcal{G}=\mathcal{WDW}^{-1}, we can further decompose \mathcal{P}_{S} as follows:

𝒫S\displaystyle\mathcal{P}_{S} =𝒰~(𝒲𝒟𝒲1Im+τ[InIn](Lm))𝒰~\displaystyle=\widetilde{\mathcal{U}}\left(\mathcal{WDW}^{-1}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\widetilde{\mathcal{U}}^{*}
=𝒱(𝒟Im+τ[InIn](Lm))𝒱,\displaystyle=\mathcal{V}\left(\mathcal{D}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\mathcal{V}^{*},
=𝒱[D1Im+τIn(Lm)D2Im+τIn(Lm)]𝒱,\displaystyle=\mathcal{V}\begin{bmatrix}D_{1}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &D_{2}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}\mathcal{V}^{*},

where 𝒱=𝒰~(𝒲Im)\mathcal{V}=\widetilde{\mathcal{U}}\left(\mathcal{W}\otimes I_{m}\right), and the matrices 𝒲\mathcal{W} and 𝒟=[D1D2]\mathcal{D}=\begin{bmatrix}D_{1}&\\ &D_{2}\\ \end{bmatrix} are explicitly known from Lemma 3.14.

Therefore, the computation of 𝐰=𝒫S1𝐯\mathbf{w}=\mathcal{P}_{S}^{-1}\mathbf{v} can be implemented by the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\mathcal{V}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}D_{1}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &D_{2}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}^{-1}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\mathcal{V}\widetilde{\mathbf{w}}.

Both Steps 1 and 3 can be computed by fast Fourier transforms in \mathcal{O}(mn\log{n}) operations. In Step 2, the shifted Laplacian systems can be solved efficiently, for instance, by a multigrid method. A detailed description of this highly effective implementation can be found in [15], for example.
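As a self-contained illustration of the diagonalization underlying Steps 1 and 3, the sketch below solves a single \omega-circulant system in \mathcal{O}(n\log n): scaling by \Gamma=\mathrm{diag}(e^{\mathbf{i}\zeta j/n}) turns the \omega-circulant into an ordinary circulant, which FFTs diagonalize. This is one common convention with illustrative data, not the paper's exact routine:

```python
import numpy as np
from scipy.linalg import circulant

def solve_omega_circ(c, b, zeta):
    """Solve C x = b where C is omega-circulant (omega = exp(i*zeta)) with
    first column c, using C = Gamma^{-1} Circ(c * gamma) Gamma."""
    n = len(c)
    gamma = np.exp(1j * zeta * np.arange(n) / n)   # diagonal of Gamma
    lam = np.fft.fft(c * gamma)                    # eigenvalues of Circ(c*gamma)
    return np.conj(gamma) * np.fft.ifft(np.fft.fft(gamma * b) / lam)

# Dense reference: a circulant whose strict upper triangle is scaled by omega.
n, zeta = 8, np.pi                                 # zeta = pi gives omega = -1
rng = np.random.default_rng(4)
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
C = circulant(c)
C[np.triu_indices(n, 1)] *= np.exp(1j * zeta)
b = rng.standard_normal(n)

x = solve_omega_circ(c, b, zeta)
assert np.allclose(C @ x, b)
```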

In each MINRES iteration, we need to compute a matrix-vector product in the form of |𝒫S|1𝐯|\mathcal{P}_{S}|^{-1}\mathbf{v} for some given vector 𝐯\mathbf{v}. The eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top} with Lm-L_{m} assumed SPD, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a diagonal matrix containing the eigenvalues of Lm-L_{m}.

Hence, we can rewrite |𝒫S||\mathcal{P}_{S}| from (17) as follows:

|\mathcal{P}_{S}| =\begin{bmatrix}\sqrt{\mathcal{S}^{*}\mathcal{S}+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{S}\mathcal{S}^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}
=𝒰[|Λ|2+α2InIm|Λ|2+α2InIm]𝒰,\displaystyle=\mathcal{U}\begin{bmatrix}\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}\mathcal{U}^{*},

where \mathcal{U}:=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}\\ \end{bmatrix} is a unitary matrix and \Lambda:=\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m}.

Therefore, the computation of 𝐰=|𝒫S|1𝐯\mathbf{w}=|\mathcal{P}_{S}|^{-1}\mathbf{v} can be implemented with the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\mathcal{U}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}&\\ &\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\\ \end{bmatrix}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\mathcal{U}\widetilde{\mathbf{w}}.

When the spatial grid is uniformly partitioned, the orthogonal matrix \mathbb{U}_{m} becomes the discrete sine matrix \mathbb{S}_{m}. In this case, Steps 1 and 3 can be computed efficiently by the fast Fourier transform and the fast sine transform in \mathcal{O}(mn\log{n}) operations. Step 2 takes only \mathcal{O}(mn) operations, since the matrix involved is a simple diagonal matrix.
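On a uniform grid, the diagonal solve in Step 2 amounts to entrywise divisions in the sine-transformed basis. A minimal sketch of one shifted one-dimensional Laplacian solve via DST-I (illustrative sizes and a real illustrative shift; complex diagonal shifts are handled entrywise in exactly the same way):

```python
import numpy as np
from scipy.fft import dst, idst
from scipy.linalg import toeplitz

m, tau = 15, 0.1
h = 1.0 / (m + 1)

# -L_m: the SPD tridiagonal 1D negative Laplacian, diagonalized by DST-I.
negL = toeplitz(np.r_[2.0, -1.0, np.zeros(m - 2)]) / h**2
# Its eigenvalues: (2 - 2 cos(k*pi*h)) / h^2, k = 1, ..., m.
eig = (2 - 2 * np.cos(np.pi * np.arange(1, m + 1) * h)) / h**2

d = 0.7                                       # illustrative diagonal shift
v = np.random.default_rng(5).standard_normal(m)

# Sine transform, divide by shifted eigenvalues, transform back: O(m log m).
w = idst(dst(v, type=1) / (d + tau * eig), type=1)

assert np.allclose((d * np.eye(m) + tau * negL) @ w, v)
```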

The product of 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} for any vector 𝐯\mathbf{v} can be implemented following the above procedures. Note that

𝒫MS\displaystyle\mathcal{P}_{MS}
=[SnSn+α2InIm+τIn(Lm)SnSn+α2InIm+τIn(Lm)]\displaystyle=\begin{bmatrix}\sqrt{S_{n}^{*}S_{n}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{S_{n}S_{n}^{*}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}
=\widetilde{\mathcal{U}}\begin{bmatrix}\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}\widetilde{\mathcal{U}}^{*},

where \widetilde{\mathcal{U}}=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes I_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes I_{m})^{*}\\ \end{bmatrix} is a unitary matrix.

The computation of 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} can be implemented by the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\widetilde{\mathcal{U}}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}(\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}))^{-1}&\\ &(\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}))^{-1}\\ \end{bmatrix}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\widetilde{\mathcal{U}}\widetilde{\mathbf{w}}.

Both Steps 1 and 3 can be computed by fast Fourier transforms in \mathcal{O}(mn\log{n}) operations. As for Step 2, again, the shifted Laplacian systems can be solved efficiently by a multigrid method. We refer to [36] for more details regarding this efficient implementation.

4 Numerical examples

In this section, we provide several numerical results to show the performance of our proposed preconditioners. All numerical experiments are carried out using MATLAB 2022b on a PC with Intel i5-13600KF CPU 3.50GHz and 32 GB RAM.

The CPU time in seconds is measured using the MATLAB built-in functions tic/toc. Steps 1–3 in Section 3.5 are implemented with the built-in functions dst and fft for the discrete sine transform and the fast Fourier transform, respectively. All Krylov subspace solvers used are MATLAB built-in functions. We choose a zero initial guess and a stopping tolerance of 10^{-8} based on the reduction in relative residual norms for all Krylov subspace solvers tested, unless otherwise indicated.

We adopt the notation MINRES-|\mathcal{P}_{S}| and MINRES-\mathcal{P}_{MS} to represent the MINRES solvers with the preconditioners |\mathcal{P}_{S}| and \mathcal{P}_{MS}, respectively. Also, GMRES-\mathcal{P}_{S} represents the GMRES solver with the proposed preconditioner \mathcal{P}_{S}. We compare our proposed methods against the state-of-the-art solver recently proposed in [23] (denoted by PCG-\mathcal{P}_{\epsilon}), where an \epsilon-circulant preconditioner \mathcal{P}_{\epsilon} was constructed. Note that we did not compare with the matching Schur complement preconditioners proposed in [30, 21], since the numerical tests in [23] indicate that their effectiveness does not surpass that of PCG-\mathcal{P}_{\epsilon}.

In the related tables, we denote by 'Iter' the number of iterations for solving a linear system by an iterative solver within the given accuracy, and by 'DoF' the number of unknowns in the linear system. Let p^{*} and y^{*} denote the approximate solutions to p and y, respectively. Then, we define the error measure e_{h} as

eh=[yp][yp]Lτ(L2(Ω)).e_{h}=\left\|\begin{bmatrix}y^{*}\\ p^{*}\end{bmatrix}-\begin{bmatrix}y\\ p\end{bmatrix}\right\|_{L^{\infty}_{\tau}(L^{2}(\Omega))}. (21)

The time interval [0,T] and the spatial domain are partitioned uniformly with step sizes \tau=T/n in time and h=1/(m+1) in space, where h can be found in the related tables. Also, only \zeta=\pi is used for \omega=e^{\textbf{i}\zeta} in the block \omega-circulant preconditioners, since extensive trials consistently showed that the preconditioner corresponding to \zeta=\pi yielded the best results.
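The paper does not spell out the discrete realization of the norm in (21); one common choice, assumed here purely for illustration, is the maximum over time steps of the scaled discrete L^{2}(\Omega) norm (on a uniform 2D grid with spacing h, the scaling factor is h):

```python
import numpy as np

def error_measure(num, exact, h):
    """Illustrative discrete L-inf-in-time, L2-in-space error: num and exact
    are arrays of shape (n_time, m_space) stacking the unknowns per step;
    on a uniform 2D grid, ||u||_{L2} is approximated by h * ||u||_2."""
    diff = num - exact
    return np.max(h * np.linalg.norm(diff, axis=1))

# Toy data: constant error of 1e-3 on a 15 x 15 grid (225 points), 4 steps.
h = 1.0 / 16
u = np.zeros((4, 225))
v = np.full((4, 225), 1e-3)

assert np.isclose(error_measure(u, v, h), h * np.sqrt(225) * 1e-3)
```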

Example 1.

In this example [23], we consider the following two-dimensional problem of solving (1), where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=1a(x_{1},x_{2})=1, and

f(x1,x2,t)\displaystyle f(x_{1},x_{2},t) =(2π21)etsin(πx1)sin(πx2),\displaystyle=(2\pi^{2}-1)e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},
g(x1,x2,t)\displaystyle g(x_{1},x_{2},t) =etsin(πx1)sin(πx2),\displaystyle=e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},

The analytical solution is given by

y(x1,x2,t)\displaystyle y(x_{1},x_{2},t) =etsin(πx1)sin(πx2),p=0.\displaystyle=e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},\quad p=0.

To support the result of Theorem 3.11, we display the eigenvalues of the matrix 𝒫MS1|𝒫S|\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| for various values of γ\gamma in Figure 1. The illustration confirms that the eigenvalues consistently fall within the interval [12,1][\frac{1}{\sqrt{2}},1], aligning with the expectations set forth in Remark 4. Furthermore, it is evident that as γ\gamma diminishes, the eigenvalues of 𝒫MS1|𝒫S|\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| exhibit increased clustering around one. This trend can be attributed to the fact that α=τγ\alpha=\frac{\tau}{\sqrt{\gamma}} grows larger when γ\gamma is reduced, assuming the matrix size (or τ\tau) remains constant. As α\alpha becomes larger, it follows from their respective definitions that 𝒫MS\mathcal{P}_{MS} becomes closer to |𝒫S||\mathcal{P}_{S}|.

Figure 1: Eigenvalues of \mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| with n=16, m=15, and various \gamma: (a) \gamma=10^{-10}; (b) \gamma=10^{-6}; (c) \gamma=10^{-2}.

Table 1 displays the iteration counts, CPU times, and errors for GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| when applying the Crank-Nicolson method with different values of γ\gamma. Note that MINRES-𝒫MS\mathcal{P}_{MS} was not implemented for this example, as |𝒫S||\mathcal{P}_{S}| can already be efficiently implemented using fast sine transforms. We observe that: (i) both GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| with ζ=π\zeta=\pi perform excellently and stably, considering both iteration counts and CPU times across various values of γ\gamma; and (ii) the error decreases as the mesh is refined, except in the case when γ=1010\gamma=10^{-10}. In such an instance, the error exhibits only a slight decrease as the matrix size grows, which is likely due to the convergence tolerance used for MINRES not being sufficiently small to demonstrate the anticipated reduction.

In Table 2, we compare our preconditioners against PCG-\mathcal{P}_{\epsilon} from [23] with \epsilon=\frac{1}{2}\min\{\frac{\tau}{24\sqrt{\gamma}},\frac{\tau^{\frac{3}{2}}}{2\sqrt{6\gamma}T},\frac{\tau^{2}}{8\sqrt{3\gamma}T},\frac{1}{3}\}, where only the Crank-Nicolson method is considered. We report that for larger values \gamma\geq 10^{-6}, our proposed GMRES-\mathcal{P}_{S} with \omega=-1 significantly outperforms PCG-\mathcal{P}_{\epsilon}; namely, the computational time PCG-\mathcal{P}_{\epsilon} needs for convergence is roughly twice as large. When \gamma is small, GMRES-\mathcal{P}_{S} remains highly comparable with PCG-\mathcal{P}_{\epsilon} in terms of CPU time. Overall, GMRES-\mathcal{P}_{S} is stable and robust in both iteration counts and computational time for a wide range of \gamma.

Example 2.

In this example, we consider the following two-dimensional problem of solving (1) with a variable function a(x1,x2)a(x_{1},x_{2}), where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=105sin(πx1x2)a(x_{1},x_{2})=10^{-5}\sin{(\pi x_{1}x_{2})}, and

f(x1,x2,t)=sin(πt)sin(πx1)sin(πx2)+etx1(1x1)[2×105sin(πx1x2)x2(1x2)105πcos(πx1x2)x(12x2)]+etx2(1x2)[2×105sin(πx1x2)105πcos(πx1x2)x2(12x1)],f(x_{1},x_{2},t)=-\sin{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}+e^{-t}x_{1}(1-x_{1})[2\times 10^{-5}\sin{(\pi x_{1}x_{2})}\\ -x_{2}(1-x_{2})-10^{-5}\pi\cos{(\pi x_{1}x_{2})}x(1-2x_{2})]\\ +e^{-t}x_{2}(1-x_{2})[2\times 10^{-5}\sin{(\pi x_{1}x_{2})}-10^{-5}\pi\cos{(\pi x_{1}x_{2})}x_{2}(1-2x_{1})],
g(x1,x2,t)=γπcos(πt)sin(πx1)sin(πx2)+etx1(1x1)x2(1x2)105γπ2sin(πt)[2sin(πx1x2)sin(πx1)sin(πx2)+cos(πx1x2)(x1sin(πx1)cos(πx2)+x2cos(πx1)sin(πx2))].g(x_{1},x_{2},t)=-\gamma\pi\cos{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}+e^{-t}x_{1}(1-x_{1})x_{2}(1-x_{2})\\ -10^{-5}\gamma\pi^{2}\sin{(\pi t)}[-2\sin{(\pi x_{1}x_{2})}\sin{(\pi x_{1})}\sin{(\pi x_{2})}\\ +\cos{(\pi x_{1}x_{2})}(x_{1}\sin{(\pi x_{1})}\cos{(\pi x_{2})}+x_{2}\cos{(\pi x_{1})}\sin{(\pi x_{2})})].

The analytical solution is given by

y(x1,x2,t)=etx1(1x1)x2(1x2),p(x1,x2,t)=γsin(πt)sin(πx1)sin(πx2).\displaystyle y(x_{1},x_{2},t)=e^{-t}x_{1}(1-x_{1})x_{2}(1-x_{2}),\quad p(x_{1},x_{2},t)=\gamma\sin{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}.

In this example, the direct application of MINRES-|𝒫S||\mathcal{P}_{S}| is not feasible, since Lm-L_{m} cannot be diagonalized by fast transform methods. Consequently, we adopt MINRES-𝒫MS\mathcal{P}_{MS}, which incorporates a multigrid method. Specifically, to solve the shifted Laplacian linear system (detailed in Subsection 3.5) and compute 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} for any vector 𝐯\mathbf{v}, we apply one iteration of the V-cycle geometric multigrid method, with the Gauss-Seidel method employed as the pre-smoother.

Table 3 shows the iteration numbers, CPU times, and errors of GMRES-𝒫S\mathcal{P}_{S} and MINRES-𝒫MS\mathcal{P}_{MS} when the Crank-Nicolson method is applied with a range of γ\gamma. This example investigates the effectiveness of our solvers when a(x1,x2)a(x_{1},x_{2}) in (3) is not constant. The results show that (i) GMRES-𝒫S\mathcal{P}_{S} with ω=1\omega=-1 maintains relatively stable iteration numbers and CPU times across a wide range of γ\gamma; (ii) MINRES-𝒫MS\mathcal{P}_{MS} performs well for small γ\gamma, but its efficiency decreases as γ\gamma increases, a phenomenon also observed and reported in a previous study [17]. This behavior may be attributed to the eigenvalue distribution of 𝒜\mathcal{A}. Specifically, when γ\gamma is very small compared with τ\tau, the parameter α=τγ\alpha=\frac{\tau}{\sqrt{\gamma}} is quite large. Indeed, as γ\gamma approaches 0+0^{+}, it is corroborated by [17, Corollary 3.2] that the matrix-sequence {𝒜α}n\left\{{\mathcal{A}\over\alpha}\right\}_{n} has eigenvalues clustered around ±1\pm 1, which facilitates solving the all-at-once system. Additionally, as discussed in the preceding example, the increasing size of α\alpha makes 𝒫MS\mathcal{P}_{MS} closely resemble |𝒫S||\mathcal{P}_{S}|, which in turn leads to an improved preconditioning effect.
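To put the size of α\alpha in perspective, taking τ=25\tau=2^{-5} for illustration (our assumption; the tables report the spatial mesh hh), α=τ/γ\alpha=\tau/\sqrt{\gamma} spans roughly four orders of magnitude over the reported range of γ\gamma:

```python
import math

tau = 2.0 ** -5                       # illustrative time step (our assumption)
for gamma in (1e-2, 1e-6, 1e-10):
    alpha = tau / math.sqrt(gamma)
    print(f"gamma = {gamma:.0e}:  alpha = tau/sqrt(gamma) = {alpha:.4g}")
```

Here α\alpha grows from 0.31250.3125 at γ=102\gamma=10^{-2} to 31253125 at γ=1010\gamma=10^{-10}, consistent with the clustering behavior described above.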

Example 3.

This example aims to test the robustness of our proposed method with the (homogeneous) Neumann boundary condition. We consider the following two-dimensional problem, where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=103a(x_{1},x_{2})=10^{-3}, and

f(x1,x2,t)\displaystyle f(x_{1},x_{2},t) =(103×8π21)etcos(2πx1)cos(2πx2),\displaystyle=(10^{-3}\times 8\pi^{2}-1)e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})},
g(x1,x2,t)\displaystyle g(x_{1},x_{2},t) =etcos(2πx1)cos(2πx2).\displaystyle=e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})}.

The analytical solution is given by

y(x1,x2,t)\displaystyle y(x_{1},x_{2},t) =etcos(2πx1)cos(2πx2),p=0.\displaystyle=e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})},\quad p=0.

Note that we use one iteration of the V-cycle geometric multigrid method, with the Gauss-Seidel method as the pre-smoother, to solve the shifted Laplacian linear system for GMRES-𝒫S\mathcal{P}_{S}. Also, MINRES is not applicable in this case with the Neumann boundary condition, since 𝒜\mathcal{A} is not symmetric; thus, we resort to GMRES-𝒫S\mathcal{P}_{S}.
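In practice, wiring such an approximate solve into a Krylov method only requires exposing it as a linear operator. The following SciPy sketch is ours, not the paper's code: a small non-symmetric test matrix stands in for the all-at-once matrix, and a simple Jacobi (diagonal) solve stands in for the multigrid V-cycle preconditioner.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, gmres

n = 200
# Non-symmetric, diagonally dominant stand-in for the all-at-once matrix
# (hypothetical test data; the actual matrix comes from the discretization).
A = diags([-1.2, 4.0, -0.8], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Preconditioner: one approximate solve per application. Here a Jacobi
# (diagonal) solve stands in for the single V-cycle used in the experiments.
d = A.diagonal()
M = LinearOperator((n, n), matvec=lambda v: v / d)

x, info = gmres(A, b, M=M, atol=1e-12)
print(info, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

The same `LinearOperator` pattern accepts any black-box approximate solve, so swapping the Jacobi step for a V-cycle changes one line.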

Table 4 shows the iteration numbers, CPU time, and error of GMRES-𝒫S\mathcal{P}_{S} when the Crank-Nicolson method is applied with various values of γ\gamma. The results indicate that GMRES-𝒫S\mathcal{P}_{S} maintains stable and low iteration numbers across a wide range of γ\gamma.

5 Conclusions

In this work, we have provided a unifying framework for circulant-based preconditioning applied to the concerned parabolic control problem. The framework is applicable to both first order (i.e., θ=1\theta=1) and second order (i.e., θ=1/2\theta=1/2) time discretization schemes. Moreover, it encompasses both the circulant (i.e., ω=1\omega=1) and skew-circulant (i.e., ω=1\omega=-1 and θ=1\theta=1) preconditioners previously proposed in existing works. We note that it appears feasible to extend our preconditioning theory to various implicit time discretization methods, such as backward differentiation formulas, as long as the block Toeplitz structure within the resulting all-at-once linear system remains intact. When considering more general discretization schemes, such as (semi-)implicit Runge-Kutta methods, a good starting point could be the recent study [20], which develops a robust preconditioning approach designed specifically for all-at-once linear systems derived from the Runge-Kutta time discretization of time-dependent PDEs.

Specifically, we have proposed a class of block ω\omega-circulant based preconditioners for the all-at-once system of the parabolic control problem. First, when GMRES is considered, we have proposed a PinT preconditioner 𝒫S\mathcal{P}_{S} for the concerned system. Second, when MINRES is used for the symmetrized system, we have constructed an ideal preconditioner |𝒜||\mathcal{A}|, which serves as a prototype for designing efficient preconditioners. Based on it, we have designed two novel preconditioners |𝒫S||\mathcal{P}_{S}| and 𝒫MS\mathcal{P}_{MS} for the same problem, both of which can be efficiently implemented in a PinT manner. All proposed preconditioners have been shown to be effective both in numerical tests and in a theoretical study. Our numerical tests demonstrate that the proposed solver GMRES-𝒫S\mathcal{P}_{S} with ω=1\omega=-1 and θ=1/2\theta=1/2 achieves rapid convergence, consistently maintaining stable iteration counts across a wide range of γ\gamma values.
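To make the PinT implementation concrete: an ω\omega-circulant matrix, i.e., a Toeplitz-like matrix whose wrap-around entries are scaled by ω\omega, is diagonalized by the FFT after a diagonal scaling, so each preconditioner application costs O(nlogn)O(n\log n). The following self-contained sketch (scalar case; in the preconditioners the same idea is applied blockwise) illustrates this standard construction:

```python
import numpy as np

def omega_circulant(c, omega):
    """Dense n-by-n omega-circulant with first column c (for verification only):
    entries that wrap around the top are scaled by omega."""
    n = len(c)
    A = np.empty((n, n), dtype=complex)
    for j in range(n):
        for k in range(n):
            A[j, k] = c[j - k] if j >= k else omega * c[n + j - k]
    return A

def omega_circulant_solve(c, omega, b):
    """Solve C_omega x = b in O(n log n) operations, using
    C_omega = D^{-1} F^{-1} diag(lam) F D with D = diag(eta^0, ..., eta^{n-1}),
    eta a principal n-th root of omega, and lam = fft(c * eta^arange(n))."""
    n = len(c)
    eta = complex(omega) ** (1.0 / n)
    scale = eta ** np.arange(n)
    lam = np.fft.fft(c * scale)                  # eigenvalues of C_omega
    return np.fft.ifft(np.fft.fft(b * scale) / lam) / scale

# Demo: circulant (omega = 1) and skew-circulant (omega = -1) solves.
n = 64
c = np.zeros(n); c[0], c[1], c[-1] = 4.0, -1.0, -1.0   # well-conditioned generator
b = np.sin(np.arange(n, dtype=float))
for omega in (1.0, -1.0):
    x = omega_circulant_solve(c, omega, b)
    print(omega, np.max(np.abs(omega_circulant(c, omega) @ x - b)))
```

Here ω=1\omega=1 recovers the circulant case and ω=1\omega=-1 the skew-circulant case; for real data the computed solution is real up to rounding, so one may take its real part.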

We stress that the development of our proposed MINRES approach for optimal control problems is still in its infancy. As future work, we plan to develop more efficient preconditioned Krylov subspace solvers by integrating an ϵ\epsilon-circulant matrix, where a small ϵ>0\epsilon>0 is chosen. In recent years, this approach has been shown successful for solving various PDEs (see, e.g., [24, 26, 33, 27]), achieving clustered singular values without outliers. We will investigate whether such a combination can reduce the number of singular value/eigenvalue outliers produced by our preconditioners, which could achieve parameter-independent convergence in the MINRES framework.

Acknowledgments

The work of Sean Hon was supported in part by the Hong Kong RGC under grant 22300921 and a start-up grant from the Croucher Foundation.

Table 1: Results of GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| for Example 1 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S} MINRES-|𝒫S||\mathcal{P}_{S}|
Iter CPU ehe_{h} Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.039 1.18e-9 3 0.033 3.18e-9
262^{-6} 508032 3 0.35 1.18e-9 5 0.48 1.45e-9
272^{-7} 4129024 3 3.32 1.04e-9 6 4.91 1.04e-9
282^{-8} 33292800 3 27.96 4.49e-10 6 40.91 4.49e-10
10810^{-8} 252^{-5} 61504 3 0.030 1.12e-7 6 0.045 1.26e-7
262^{-6} 508032 3 0.37 6.71e-8 6 0.52 6.71e-8
272^{-7} 4129024 3 3.30 1.81e-8 6 4.90 1.81e-8
282^{-8} 33292800 3 27.85 4.53e-9 6 40.83 4.53e-9
10610^{-6} 252^{-5} 61504 3 0.029 2.90e-6 6 0.044 2.90e-6
262^{-6} 508032 3 0.34 7.26e-7 6 0.56 7.26e-7
272^{-7} 4129024 3 3.35 1.81e-7 6 4.91 1.81e-7
282^{-8} 33292800 3 27.76 4.54e-8 6 40.98 4.54e-8
10410^{-4} 252^{-5} 61504 3 0.029 2.87e-5 6 0.045 2.87e-5
262^{-6} 508032 3 0.39 7.19e-6 6 0.51 7.19e-6
272^{-7} 4129024 3 3.34 1.80e-6 6 4.91 1.80e-6
282^{-8} 33292800 3 27.96 4.49e-7 6 40.84 4.49e-7
10210^{-2} 252^{-5} 61504 3 0.030 2.77e-4 6 0.046 2.77e-4
262^{-6} 508032 3 0.37 6.91e-5 6 0.55 6.91e-5
272^{-7} 4129024 3 3.35 1.73e-5 6 4.90 1.73e-5
282^{-8} 33292800 3 27.80 4.31e-6 6 40.96 4.31e-6
Table 2: Results of PCG-𝒫ϵ\mathcal{P}_{\epsilon} for Example 1 with Crank-Nicolson method (θ=12\theta=\frac{1}{2})
γ\gamma hh DoF PCG-𝒫ϵ\mathcal{P}_{\epsilon}
Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.084 1.18e-9
262^{-6} 508032 3 0.23 1.18e-9
272^{-7} 4129024 4 2.49 1.04e-9
282^{-8} 33292800 5 24.55 4.49e-10
10810^{-8} 252^{-5} 61504 4 0.023 1.12e-7
262^{-6} 508032 5 0.34 6.71e-8
272^{-7} 4129024 5 3.01 1.81e-8
282^{-8} 33292800 5 24.70 4.53e-9
10610^{-6} 252^{-5} 61504 5 0.046 2.90e-6
262^{-6} 508032 5 0.33 7.26e-7
272^{-7} 4129024 6 3.50 1.81e-7
282^{-8} 33292800 6 28.64 4.54e-8
10410^{-4} 252^{-5} 61504 9 0.063 2.87e-5
262^{-6} 508032 9 0.57 7.19e-6
272^{-7} 4129024 9 4.95 1.80e-6
282^{-8} 33292800 10 44.46 4.49e-7
10210^{-2} 252^{-5} 61504 11 0.061 2.77e-4
262^{-6} 508032 11 0.68 6.91e-5
272^{-7} 4129024 11 5.95 1.73e-5
282^{-8} 33292800 12 53.96 4.31e-6
Table 3: Results of GMRES-𝒫S\mathcal{P}_{S} and MINRES-𝒫MS\mathcal{P}_{MS} for Example 2 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S} MINRES-𝒫MS\mathcal{P}_{MS}
Iter CPU ehe_{h} Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.14 6.60e-13 3 0.11 2.91e-10
262^{-6} 508032 3 0.67 4.68e-13 5 0.74 2.55e-11
272^{-7} 4129024 3 5.35 7.51e-13 6 6.27 4.43e-13
282^{-8} 33292800 3 48.26 8.40e-12 6 61.63 1.73e-11
10810^{-8} 252^{-5} 61504 3 0.13 6.30e-11 6 0.17 6.29e-11
262^{-6} 508032 3 0.64 5.48e-11 6 0.84 5.39e-11
272^{-7} 4129024 3 5.39 1.13e-10 7 7.51 7.28e-10
282^{-8} 33292800 3 59.97 1.14e-10 9 107.00 3.90e-11
10610^{-6} 252^{-5} 61504 3 0.17 2.53e-9 7 0.24 2.79e-9
262^{-6} 508032 3 0.81 1.26e-9 10 1.65 5.65e-10
272^{-7} 4129024 3 6.71 1.14e-9 10 11.98 3.22e-10
282^{-8} 33292800 3 74.15 1.14e-9 10 126.17 5.38e-10
10410^{-4} 252^{-5} 61504 5 0.25 1.53e-7 14 0.42 1.53e-7
262^{-6} 508032 5 1.24 3.40e-8 15 2.57 3.40e-8
272^{-7} 4129024 5 10.89 8.51e-9 18 23.00 8.51e-9
282^{-8} 33292800 5 100.60 2.13e-9 23 313.30 2.13e-9
10210^{-2} 252^{-5} 61504 5 0.25 1.16e-5 20 0.57 1.16e-5
262^{-6} 508032 5 1.54 2.90e-6 24 4.28 2.90e-6
272^{-7} 4129024 5 12.99 7.25e-7 30 39.82 7.25e-7
282^{-8} 33292800 5 121.50 1.81e-7 52 739.91 1.81e-7
Table 4: Results of GMRES-𝒫S\mathcal{P}_{S} for Example 3 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S}
Iter CPU ehe_{h}
101010^{-10} 252^{-5} 65536 3 0.23 1.51e-11
262^{-6} 524288 3 1.02 1.39e-11
272^{-7} 4194304 3 6.98 1.18e-11
282^{-8} 33554432 3 81.18 4.99e-12
10810^{-8} 252^{-5} 65536 3 0.18 1.43e-9
262^{-6} 524288 3 1.07 7.93e-10
272^{-7} 4194304 3 8.37 2.06e-10
282^{-8} 33554432 3 94.33 5.05e-11
10610^{-6} 252^{-5} 65536 3 0.22 3.69e-8
262^{-6} 524288 3 1.26 8.56e-9
272^{-7} 4194304 3 12.36 2.06e-9
282^{-8} 33554432 3 152.13 5.05e-10
10410^{-4} 252^{-5} 65536 3 0.27 3.73e-7
262^{-6} 524288 3 1.70 8.64e-8
272^{-7} 4194304 3 18.68 2.08e-8
282^{-8} 33554432 3 249.61 5.10e-9
10210^{-2} 252^{-5} 65536 3 0.35 4.10e-6
262^{-6} 524288 3 2.28 9.51e-7
272^{-7} 4194304 3 22.44 2.29e-7
282^{-8} 33554432 3 282.79 5.61e-8

References

  • [1] Owe Axelsson and Maya Neytcheva. Eigenvalue estimates for preconditioned saddle point matrices. Numerical Linear Algebra with Applications, 13(4):339-360, 2006.
  • [2] Daniele Bertaccini and Michael K. Ng. Block ω\omega-circulant preconditioners for the systems of differential equations. Calcolo, 40(2):71-90, 2003.
  • [3] Dario A. Bini, Guy Latouche, and Beatrice Meini. Numerical Methods for Structured Markov Chains. Oxford University Press, New York, 2005.
  • [4] Arne Bouillon, Giovanni Samaey, and Karl Meerbergen. On generalized preconditioners for time-parallel parabolic optimal control. arXiv preprint, arXiv:2302.06406, 2023.
  • [5] Jan H. Brandts and Ricardo Reis da Silva. Computable eigenvalue bounds for rank-kk perturbations. Linear Algebra and Its Applications, 432(12), 3100-3116, 2010.
  • [6] Alfio Borzì and Volker Schulz. Computational optimization of systems governed by partial differential equations. Society for Industrial and Applied Mathematics. 2011.
  • [7] Raymond H. Chan and Michael K. Ng. Conjugate gradient methods for Toeplitz systems. SIAM Review, 38(3), 427-482, 1996.
  • [8] Daniel Potts and Gabriele Steidl. Preconditioners for ill-conditioned Toeplitz matrices. BIT Numerical Mathematics, 39(3), 513-533, 1999.
  • [9] Daniel Potts and Gabriele Steidl. Preconditioners for ill-conditioned Toeplitz systems constructed from positive kernels. SIAM Journal on Scientific Computing, 22(5), 1741-1761, 2001.
  • [10] Paola Ferrari, Isabella Furci, Sean Hon, Mohammad Ayman-Mursaleen, and Stefano Serra-Capizzano. The eigenvalue distribution of special 2-by-2 block matrix-sequences with applications to the case of symmetrized Toeplitz structures. SIAM Journal on Matrix Analysis and Applications, 40(3), 1066-1086, 2019.
  • [11] Carlo Garoni and Stefano Serra-Capizzano. Generalized locally Toeplitz sequences: Theory and applications, volume 1. Springer, Cham, 2017.
  • [12] Carlo Garoni and Stefano Serra-Capizzano. Generalized locally Toeplitz sequences: Theory and applications, volume 2. Springer, Cham, 2018.
  • [13] Anne Greenbaum, Vlastimil Pták, and Zdeněk Strakoš. Any nonincreasing convergence curve is possible for GMRES. SIAM Journal on Matrix Analysis and Applications, 17(3):465–469, 1996.
  • [14] Michael Hinze, Rene Pinnau, Michael Ulbrich, and Stefan Ulbrich. Optimization with PDE constraints (Vol. 23). Springer Science & Business Media. 2008.
  • [15] Yunhui He and Jun Liu. A Vanka-type multigrid solver for complex-shifted Laplacian systems from diagonalization-based parallel-in-time algorithms. Applied Mathematics Letters, 132, 108125, 2022.
  • [16] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press. 1990.
  • [17] Sean Hon, Jiamei Dong, and Stefano Serra-Capizzano. A preconditioned MINRES method for optimal control of wave equations and its asymptotic spectral distribution theory. SIAM Journal on Matrix Analysis and Applications, 44(4):1477–1509, 2023.
  • [18] Sean Hon, Po Yin Fung, Jiamei Dong, and Stefano Serra-Capizzano. A sine transform based preconditioned MINRES method for all-at-once systems from constant and variable-coefficient evolutionary PDEs. Numerical Algorithms, 2023.
  • [19] Sean Hon, Stefano Serra-Capizzano, and Andy Wathen. Band-Toeplitz preconditioners for ill-conditioned Toeplitz systems. BIT Numerical Mathematics. 2021.
  • [20] Santolo Leveque, Luca Bergamaschi, Ángeles Martínez, and John W. Pearson. Fast iterative solver for the all-at-once Runge–Kutta discretization. arXiv preprint, arXiv:2303.02090, 2023.
  • [21] Santolo Leveque and John W. Pearson. Fast iterative solver for the optimal control of time-dependent PDEs with Crank–Nicolson discretization in time. Numerical Linear Algebra with Applications, 29:e2419, 2022.
  • [22] Buyang Li, Jun Liu, and Mingqing Xiao. A new multigrid method for unconstrained heat optimal control problems. Journal of Computational and Applied Mathematics, 326, 358-373, 2017.
  • [23] Xue-lei Lin and Shu-Lin Wu. A parallel-in-time preconditioner for Crank-Nicolson discretization of a parabolic optimal control problem. arXiv preprint, arXiv:2109.12524, 2023.
  • [24] Xue-lei Lin and Michael Ng. An All-at-once preconditioner for evolutionary partial differential equations. SIAM Journal on Scientific Computing, 43(4), A2766–A2784, 2021.
  • [25] Jacques Louis Lions. Optimal control of systems governed by partial differential equations. Springer-Verlag Berlin Heidelberg. 1971.
  • [26] Jun Liu and Shu-Lin Wu. A fast block α\alpha-circulant preconditioner for all-at-once systems from wave equations. SIAM Journal on Matrix Analysis and Applications, 41(4), 1912–1943, 2020.
  • [27] Martin J. Gander, Jun Liu, Shu-Lin Wu, Xiaoqiang Yue, and Tao Zhou. ParaDiag: parallel-in-time algorithms based on the diagonalization technique. arXiv e-prints, arXiv:2005.09158, 2020.
  • [28] Malcolm F. Murphy, Gene H. Golub, and Andrew J. Wathen. A note on preconditioning for indefinite linear systems. SIAM Journal on Scientific Computing, 21(6):1969–1972, 2000.
  • [29] Michael K. Ng. Iterative Methods for Toeplitz Systems. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2004.
  • [30] John W. Pearson and Andrew J. Wathen. A new approximation of the Schur complement in preconditioners for PDE-constrained optimization. Numerical Linear Algebra and Applications, 19(5):816-829, 2012.
  • [31] John W. Pearson, Martin Stoll, and Andrew J. Wathen. Regularization-robust preconditioners for time-dependent PDE-constrained optimization problems. SIAM Journal on Matrix Analysis and Applications, 33:1126-1152, 2012.
  • [32] Gilbert Strang. A Proposal for Toeplitz Matrix Calculations. Studies in Applied Mathematics, 74(2), 171-176, 1986.
  • [33] Yafei Sun, Shu-Lin Wu, and Yingxiang Xu. A parallel-in-time implementation of the Numerov method for wave equations. Journal of Scientific Computing, 90(1), 2022.
  • [34] Ferdi Tröltzsch. Optimal control of partial differential equations: theory, methods, and applications (Vol. 112). American Mathematical Soc. 2010.
  • [35] Andrew J. Wathen. Preconditioning. Acta Numerica. 24, 329-376, 2015.
  • [36] Shu-Lin Wu and Tao Zhou. Diagonalization-based parallel-in-time algorithms for heat PDE-constrained optimization problems. ESAIM: Control, Optimisation and Calculus of Variations, 26, 88, 2020.
  • [37] Shu-Lin Wu, Zhiyong Wang, and Tao Zhou. PinT Preconditioner for Forward-Backward Evolutionary Equations. SIAM Journal on Matrix Analysis and Applications, 44(4): 1771-1798, 2023.