
Block $\omega$-circulant preconditioners for parabolic optimal control problems

Po Yin Fung, Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong SAR ([email protected]). Sean Hon (Corresponding Author), Department of Mathematics, Hong Kong Baptist University, Hong Kong SAR ([email protected]).
Abstract

In this work, we propose a class of novel preconditioned Krylov subspace methods for solving an optimal control problem of parabolic equations. Namely, we develop a family of block $\omega$-circulant based preconditioners for the all-at-once linear system arising from the concerned optimal control problem, where both first order and second order time discretization methods are considered. The proposed preconditioners can be efficiently diagonalized by fast Fourier transforms in a parallel-in-time fashion, and their effectiveness is theoretically shown in the sense that the eigenvalues of the preconditioned matrix are clustered around $\pm 1$, which leads to rapid convergence when the minimal residual method is used. When the generalized minimal residual method is deployed, the efficacy of the proposed preconditioners is justified in the sense that the singular values of the preconditioned matrices are proven to be clustered around unity. Numerical results are provided to demonstrate the effectiveness of our proposed solvers.

Keywords: Toeplitz matrices, skew-circulant/circulant matrices, $\omega$-circulant matrices, preconditioners, parallel-in-time


AMS Subject Classification: 65F08, 65F10, 65M22, 15B05

1 Introduction

In recent years, there has been a growing interest in analyzing and solving optimization problems constrained by parabolic equations. We refer to [25, 14, 34, 6] and the references therein for a comprehensive overview.

In this work, we are interested in solving the distributed optimal control model problem. Namely, the following quadratic cost functional is minimized:

\min_{y,u}~\mathcal{J}(y,u):=\frac{1}{2}\|y-g\|^{2}_{L^{2}(\Omega\times(0,T))}+\frac{\gamma}{2}\|u\|^{2}_{L^{2}(\Omega\times(0,T))}, \qquad (1)

subject to a parabolic equation with certain initial and boundary conditions

\left\{\begin{array}{ll} y_{t}-\mathcal{L}y=f+u, & (x,t)\in\Omega\times(0,T],\\ y=0, & (x,t)\in\partial\Omega\times(0,T],\\ y(x,0)=y_{0}, & x\in\Omega,\end{array}\right. \qquad (2)

where $u,g\in L^{2}(\Omega\times(0,T))$ are the distributed control and the desired tracking trajectory, respectively, $\gamma>0$ is a regularization parameter, $\mathcal{L}=\nabla\cdot(a(x)\nabla)$, and $f$ and $y_{0}$ are given problem-dependent functions. Under appropriate assumptions, theoretical aspects such as the existence, uniqueness, and regularity of the solution were well studied in [25]. The optimal solution of (1) & (2) can be characterized by the following system:

\left\{\begin{array}{ll} y_{t}-\mathcal{L}y-\frac{1}{\gamma}p=f, & (x,t)\in\Omega\times(0,T],\\ y=0, & (x,t)\in\partial\Omega\times(0,T],\\ y(x,0)=y_{0}, & x\in\Omega,\\ -p_{t}-\mathcal{L}p+y=g, & (x,t)\in\Omega\times(0,T],\\ p=0, & (x,t)\in\partial\Omega\times(0,T],\\ p(x,T)=0, & x\in\Omega,\end{array}\right. \qquad (3)

where the control variable uu has been eliminated.

Following [36, 37], we discretize (3) using the $\theta$-method in time and some space discretization, which gives

M_{m}\frac{\mathbf{y}_{m}^{(k+1)}-\mathbf{y}_{m}^{(k)}}{\tau}+K_{m}(\theta\mathbf{y}_{m}^{(k+1)}+(1-\theta)\mathbf{y}_{m}^{(k)}) = M_{m}\Big(\theta\mathbf{f}_{m}^{(k+1)}+(1-\theta)\mathbf{f}_{m}^{(k)}+\frac{1}{\gamma}\big(\theta\mathbf{p}_{m}^{(k)}+(1-\theta)\mathbf{p}_{m}^{(k+1)}\big)\Big),
-M_{m}\frac{\mathbf{p}_{m}^{(k+1)}-\mathbf{p}_{m}^{(k)}}{\tau}+K_{m}(\theta\mathbf{p}_{m}^{(k)}+(1-\theta)\mathbf{p}_{m}^{(k+1)}) = M_{m}\big(\theta\mathbf{g}_{m}^{(k)}+(1-\theta)\mathbf{g}_{m}^{(k+1)}-\theta\mathbf{y}_{m}^{(k+1)}-(1-\theta)\mathbf{y}_{m}^{(k)}\big).

The backward Euler method corresponds to $\theta=1$, while the Crank-Nicolson method is adopted when $\theta=1/2$.

Combining the given initial and boundary conditions, one needs to solve the following linear system

\widetilde{\mathcal{A}}\begin{bmatrix}\mathbf{y}\\ \mathbf{p}\end{bmatrix}=\begin{bmatrix}\widetilde{\mathbf{g}}\\ \widetilde{\mathbf{f}}\end{bmatrix}, \qquad (4)

where we have $\mathbf{y}=[\mathbf{y}_{m}^{(1)},\cdots,\mathbf{y}_{m}^{(n)}]^{\top}$, $\mathbf{p}=[\mathbf{p}_{m}^{(0)},\cdots,\mathbf{p}_{m}^{(n-1)}]^{\top}$,

\widetilde{\mathbf{f}}=\begin{bmatrix}M_{m}(\theta\tau\mathbf{f}_{m}^{(1)}+(1-\theta)\tau\mathbf{f}_{m}^{(0)})+(M_{m}-(1-\theta)\tau K_{m})\mathbf{y}_{m}^{(0)}\\ M_{m}(\theta\tau\mathbf{f}_{m}^{(2)}+(1-\theta)\tau\mathbf{f}_{m}^{(1)})\\ \vdots\\ M_{m}(\theta\tau\mathbf{f}_{m}^{(n)}+(1-\theta)\tau\mathbf{f}_{m}^{(n-1)})\end{bmatrix},\qquad \widetilde{\mathbf{g}}=\tau\begin{bmatrix}M_{m}(\theta\mathbf{g}_{m}^{(0)}+(1-\theta)\mathbf{g}_{m}^{(1)}-(1-\theta)\mathbf{y}_{m}^{(0)})\\ M_{m}(\theta\mathbf{g}_{m}^{(1)}+(1-\theta)\mathbf{g}_{m}^{(2)})\\ \vdots\\ M_{m}(\theta\mathbf{g}_{m}^{(n-1)}+(1-\theta)\mathbf{g}_{m}^{(n)})\end{bmatrix},
\widetilde{\mathcal{A}} = \begin{bmatrix}\tau B_{n}^{(2)}\otimes M_{m}&(B_{n}^{(1)})^{\top}\otimes M_{m}+\tau(B_{n}^{(2)})^{\top}\otimes K_{m}\\ B_{n}^{(1)}\otimes M_{m}+\tau B_{n}^{(2)}\otimes K_{m}&-\frac{\tau}{\gamma}(B_{n}^{(2)})^{\top}\otimes M_{m}\end{bmatrix}, \qquad (5)

and the matrices $B_{n}^{(1)}$, $B_{n}^{(2)}$ are, respectively,

B_{n}^{(1)}=\begin{bmatrix}1&&&&\\ -1&1&&&\\ &-1&1&&\\ &&\ddots&\ddots&\\ &&&-1&1\end{bmatrix},\quad B_{n}^{(2)}=\begin{bmatrix}\theta&&&&\\ 1-\theta&\theta&&&\\ &1-\theta&\theta&&\\ &&\ddots&\ddots&\\ &&&1-\theta&\theta\end{bmatrix}.

We assume that the matrix $M_{m}$ is symmetric positive definite and that the matrix $K_{m}$ is symmetric positive semi-definite. If a finite element method is employed, $M_{m}$ and $K_{m}$ represent the mass matrix and the stiffness matrix, respectively. For the finite difference method, the linear system is such that $M_{m}=I_{m}$ and $K_{m}=-L_{m}$, where $-L_{m}$ is the discretization matrix of the negative Laplacian.

Following a similar idea in [21], we can further transform (4) into the following equivalent system

{\mathcal{A}}\begin{bmatrix}\sqrt{\gamma}\,\widetilde{\mathbf{y}}\\ \widetilde{\mathbf{p}}\end{bmatrix}=\begin{bmatrix}\mathbf{g}\\ \sqrt{\gamma}\,\mathbf{f}\end{bmatrix}, \qquad (6)

where $\mathbf{f}=(I_{n}\otimes M_{m}^{-\frac{1}{2}})\widetilde{\mathbf{f}}$, $\mathbf{g}=(I_{n}\otimes M_{m}^{-\frac{1}{2}})\widetilde{\mathbf{g}}$, $\widetilde{\mathbf{y}}=(B_{n}^{(2)}\otimes M_{m}^{\frac{1}{2}})[\mathbf{y}_{m}^{(1)},\cdots,\mathbf{y}_{m}^{(n)}]^{\top}$, $\widetilde{\mathbf{p}}=((B_{n}^{(2)})^{\top}\otimes M_{m}^{\frac{1}{2}})[\mathbf{p}_{m}^{(0)},\cdots,\mathbf{p}_{m}^{(n-1)}]^{\top}$, and

{\mathcal{A}} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&B_{n}^{\top}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}\\ B_{n}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}&-\alpha I_{n}\otimes I_{m}\end{bmatrix} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&\mathcal{T}^{\top}\\ \mathcal{T}&-\alpha I_{n}\otimes I_{m}\end{bmatrix}. \qquad (7)

Note that $\frac{1}{2}\leq\theta\leq 1$, $\alpha=\frac{\tau}{\sqrt{\gamma}}$, $I_{m}$ is the $m\times m$ identity matrix, and

\mathcal{T}=B_{n}\otimes I_{m}+\tau I_{n}\otimes M_{m}^{-\frac{1}{2}}K_{m}M_{m}^{-\frac{1}{2}}, \qquad (8)

where $B_{n}$ is a lower triangular Toeplitz matrix whose entries are known explicitly as

B_{n}=\begin{bmatrix}\frac{1}{\theta}&&&&\\ \frac{-1}{\theta^{2}}&\frac{1}{\theta}&&&\\ \frac{-(\theta-1)}{\theta^{3}}&\frac{-1}{\theta^{2}}&\frac{1}{\theta}&&\\ \vdots&\ddots&\ddots&\ddots&\\ \frac{-(\theta-1)^{n-2}}{\theta^{n}}&\cdots&\frac{-(\theta-1)}{\theta^{3}}&\frac{-1}{\theta^{2}}&\frac{1}{\theta}\end{bmatrix}.
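As a quick numerical sanity check (an illustrative sketch, not part of the analysis; numpy is used and the parameter values are arbitrary), one can assemble the bidiagonal factors $B_{n}^{(1)}$ and $B_{n}^{(2)}$ and verify that the product $B_{n}^{(1)}(B_{n}^{(2)})^{-1}$ reproduces the closed-form entries of $B_{n}$ shown above:

```python
import numpy as np

n, theta = 6, 0.5

# bidiagonal Toeplitz factors of the theta-method
B1 = np.eye(n) - np.eye(n, k=-1)                        # B_n^{(1)}: 1 on diag, -1 on subdiag
B2 = theta * np.eye(n) + (1 - theta) * np.eye(n, k=-1)  # B_n^{(2)}: theta / (1 - theta)

Bn = B1 @ np.linalg.inv(B2)

# first column of B_n from the closed form: 1/theta, then -(theta-1)^{k-1}/theta^{k+1}
col = np.array([1 / theta] + [-(theta - 1) ** (k - 1) / theta ** (k + 1)
                              for k in range(1, n)])
assert np.allclose(Bn[:, 0], col)
# the two factors commute: both are polynomials in the lower shift matrix
assert np.allclose(Bn, np.linalg.inv(B2) @ B1)
```

Since $B_{n}$ is lower triangular Toeplitz, its first column determines the whole matrix.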

Incidentally, $B_{n}$ can be expressed as the product of two Toeplitz matrices, i.e., $B_{n}=B_{n}^{(1)}(B_{n}^{(2)})^{-1}=(B_{n}^{(2)})^{-1}B_{n}^{(1)}$. As will be explained in Section 2, the Toeplitz matrices $B_{n}^{(1)}$ and $B_{n}^{(2)}$ are respectively generated by the functions

f_{1}(\phi)=1-\exp(\mathbf{i}\phi) \qquad (9)

and

f_{2}(\phi)=\theta+(1-\theta)\exp(\mathbf{i}\phi). \qquad (10)

In what follows, we focus on using the finite difference method to discretize the system (3), namely, $M_{m}=I_{m}$ and $K_{m}=-L_{m}$ in the linear system (6). However, we point out that our proposed preconditioning methods are still applicable, with minimal modification, when a finite element method is used. We first develop a preconditioned generalized minimal residual (GMRES) method for a nonsymmetric equivalent system of (6), which is

\mathcal{\widehat{A}}\begin{bmatrix}\sqrt{\gamma}\,\widetilde{\mathbf{y}}\\ \widetilde{\mathbf{p}}\end{bmatrix}=\begin{bmatrix}\sqrt{\gamma}\,\mathbf{f}\\ \mathbf{g}\end{bmatrix}, \qquad (11)

where

\mathcal{\widehat{A}} = \begin{bmatrix}\mathcal{T}&-\alpha I_{n}\otimes I_{m}\\ \alpha I_{n}\otimes I_{m}&\mathcal{T}^{\top}\end{bmatrix}. \qquad (12)

For $\mathcal{\widehat{A}}$, we propose the following novel block preconditioner:

\mathcal{P}_{S}=\begin{bmatrix}\mathcal{S}&-\alpha I_{n}\otimes I_{m}\\ \alpha I_{n}\otimes I_{m}&\mathcal{S}^{*}\end{bmatrix}, \qquad (13)

where

\mathcal{S}=S_{n}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}). \qquad (14)

Notice that $S_{n}:=S_{n}^{(1)}(S_{n}^{(2)})^{-1}$, where

S_{n}^{(1)}=\begin{bmatrix}1&&&&-\omega\\ -1&1&&&\\ &-1&1&&\\ &&\ddots&\ddots&\\ &&&-1&1\end{bmatrix},\quad S_{n}^{(2)}=\begin{bmatrix}\theta&&&&\omega(1-\theta)\\ 1-\theta&\theta&&&\\ &1-\theta&\theta&&\\ &&\ddots&\ddots&\\ &&&1-\theta&\theta\end{bmatrix},

and $\omega=e^{\mathbf{i}\zeta}\in\mathbb{C}$ with $\zeta\in[0,2\pi)$. Clearly, both $S_{n}^{(1)}$ and $S_{n}^{(2)}$ are $\omega$-circulant matrices [3, 2], so they admit the eigendecompositions

S_{n}^{(j)}=(\Gamma_{n}\mathbb{F}_{n})\Lambda_{n}^{(j)}(\Gamma_{n}\mathbb{F}_{n})^{*},\quad j=1,2, \qquad (15)

where $\Gamma_{n}={\rm diag}\big(\exp(\mathbf{i}\zeta\frac{k-1}{n})\big)_{k=1}^{n}$, $\mathbb{F}_{n}=\frac{1}{\sqrt{n}}[\theta_{n}^{(i-1)(j-1)}]_{i,j=1}^{n}$ with $\theta_{n}=\exp(\frac{2\pi\mathbf{i}}{n})$, and $\Lambda_{n}^{(j)}={\rm diag}\big(f_{j}(\frac{\zeta+2\pi k}{n})\big)_{k=0}^{n-1}$, where $f_{1}$ and $f_{2}$ are defined by (9) and (10), respectively.
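To make the eigendecomposition (15) concrete, the following numpy sketch builds the $\omega$-circulant factor $S_{n}^{(1)}$ and checks that it is diagonalized by a scaled Fourier matrix with eigenvalues given by samples of $f_{1}$. The sign conventions for $\Gamma_{n}$, $\mathbb{F}_{n}$, and the ordering of the sample points vary between references; the ones below are one consistent choice, not necessarily the exact convention of the text:

```python
import numpy as np

n, zeta = 8, 0.7
omega = np.exp(1j * zeta)

# S_n^{(1)} = I_n - V_omega, where V_omega is the omega-circulant shift matrix
V = np.eye(n, k=-1).astype(complex)
V[0, -1] = omega
S1 = np.eye(n) - V

j = np.arange(n)
Gamma = np.diag(np.exp(-1j * zeta * j / n))           # one common sign convention
F = np.exp(2j * np.pi * np.outer(j, j) / n) / np.sqrt(n)
Q = Gamma @ F                                         # Gamma_n * F_n, unitary

lam = 1 - np.exp(1j * (zeta - 2 * np.pi * j) / n)     # samples of f_1 at the n-th roots of omega
assert np.allclose(Q @ Q.conj().T, np.eye(n))
assert np.allclose(S1, Q @ np.diag(lam) @ Q.conj().T)
```

The eigenvalues are $f_{1}$ evaluated at the arguments of the $n$-th roots of $\omega$, as the explicit formulas in Remark 1 below also indicate for $S_{n}^{(2)}$.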

Remark 1.

Since $S_{n}^{(2)}$ is an $\omega$-circulant matrix, its eigenvalues $\lambda_{k}(S_{n}^{(2)})$ can be found explicitly, i.e., $\lambda_{k}(S_{n}^{(2)})=\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)$, $k=0,\dots,n-1$. It is known that $S_{n}^{(2)}$ can be singular. For example, $S_{n}^{(2)}$ has a zero eigenvalue for even $n$ when $\theta=\frac{1}{2}$ and $\zeta=0$ (i.e., $\omega=1$), which was discussed in [36]. When this happens, a remedy is to replace the zero eigenvalue by a nonzero real number. One can thereby easily create a nonsingular circulant matrix $\widetilde{S}_{n}^{(2)}$ such that $\mathrm{rank}(\widetilde{S}_{n}^{(2)}-S_{n}^{(2)})=1$. Hence, our preconditioning approach can work with $S_{n}$ replaced by $\widetilde{S}_{n}=S_{n}^{(1)}(\widetilde{S}_{n}^{(2)})^{-1}$, without the restrictive assumption needed in [36] (i.e., that $n$ should be chosen odd). Therefore, for ease of exposition, we assume that $S_{n}^{(2)}$ is nonsingular in the rest of this work.

Remark 2.

It should be noted that our adopted $\omega$-circulant preconditioning is distinct from the $\epsilon$-circulant preconditioning that has received much attention in the literature due to its excellent performance for solving PDE problems (see, e.g., [24, 26, 33, 27]), even though these two kinds of matrices admit a similar decomposition like (15). A notable difference between the two is that $\omega$ is a complex number in general, while $\epsilon$ is chosen to be real. Correspondingly, the diagonal matrix $\Gamma_{n}$ for $\omega$-circulant matrices is unitary in general, while that for $\epsilon$-circulant matrices is not.

When $\omega=1$, $\mathcal{P}_{S}$ becomes the block circulant based preconditioner proposed in [36]. The existing block skew-circulant based preconditioner [4], proposed only for the backward Euler method, is also included in our preconditioning strategy when $\omega=-1$ and $\theta=1$. As extensively studied in [8, 9], $\omega$-circulant matrices as preconditioners for Toeplitz systems can substantially outperform Strang type preconditioners [32] (i.e., when $\omega=1$), especially in the ill-conditioned case. For related studies on the unsatisfactory performance of Strang preconditioners for ill-conditioned nonsymmetric (block) Toeplitz systems, we refer to [19, 18].

Moreover, $\omega$-circulant matrices can be efficiently diagonalized by fast Fourier transforms (FFTs), which are parallelizable over different processors. Hence, our preconditioner $\mathcal{P}_{S}$ is especially advantageous in a high performance computing environment.
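To illustrate the parallel-in-time structure, the following sketch (for $\theta=1$, where $S_{n}=S_{n}^{(1)}$, and with one particular sign convention for $\Gamma_{n}$ that may differ from other references) solves a system with $\mathcal{S}=S_{n}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})$: an FFT in time decouples the problem into $n$ independent $m\times m$ shifted systems:

```python
import numpy as np

def apply_S_inverse(v, zeta, L, tau):
    """Solve (S_n ⊗ I_m + tau * I_n ⊗ (-L)) z = v by diagonalizing S_n with an
    FFT in time (theta = 1, so S_n = S_n^{(1)}). After the transform, the n
    shifted m-by-m solves are mutually independent: the parallel-in-time step."""
    m = L.shape[0]
    n = v.size // m
    j = np.arange(n)
    gamma = np.exp(-1j * zeta * j / n)                  # diagonal scaling Gamma_n
    lam = 1 - np.exp(1j * (zeta - 2 * np.pi * j) / n)   # eigenvalues of S_n
    V = v.reshape(n, m)
    W = np.fft.fft(gamma.conj()[:, None] * V, axis=0) / np.sqrt(n)  # (Gamma F)^* v
    for k in range(n):                                  # embarrassingly parallel in k
        W[k] = np.linalg.solve(lam[k] * np.eye(m) - tau * L, W[k])
    return (gamma[:, None] * np.fft.ifft(W, axis=0) * np.sqrt(n)).reshape(-1)

# small check against a directly assembled S
zeta, tau, n, m = 0.3, 0.1, 6, 4
L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))  # 1D Laplacian stencil
Sn = np.eye(n, dtype=complex) - np.eye(n, k=-1)
Sn[0, -1] = -np.exp(1j * zeta)                           # corner entry -omega
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
v = np.random.default_rng(0).standard_normal(n * m)
z = apply_S_inverse(v, zeta, L, tau)
assert np.allclose(S @ z, v)
```

A production implementation would additionally exploit the fast diagonalizability of $-L_{m}$ where available, as discussed for $|\mathcal{P}_{S}|$ later in the text.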

In order to support our GMRES solver with $\mathcal{P}_{S}$ as a preconditioner, we will show that the singular values of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are clustered around unity. However, despite its success, which can be seen in the numerical experiments in Section 4, the convergence study of preconditioning strategies for nonsymmetric problems is to a great extent heuristic. As mentioned in [35, Chapter 6], descriptive convergence bounds are usually not available for GMRES or any of the other applicable nonsymmetric Krylov subspace iterative methods.

Therefore, as an alternative solver, we develop a preconditioned minimal residual (MINRES) method for the symmetric system (6), instead of (11). Notice that our proposed MINRES method is in contrast with the aforementioned GMRES solvers, such as [36, 4], where a block (skew-)circulant type preconditioner was proposed and the eigenvalues of the preconditioned matrix were shown to be clustered around unity. As well explained in [13], the convergence behaviour of GMRES cannot in general be rigorously analyzed by using eigenvalues alone. Thus, our MINRES solver circumvents these theoretical difficulties of GMRES.

Based on the spectral distribution of $\mathcal{A}$, we first propose the following novel symmetric positive definite (SPD) block diagonal preconditioner as an ideal preconditioner for $\mathcal{A}$:

|\mathcal{A}|:=\sqrt{\mathcal{A}^{2}}=\begin{bmatrix}\sqrt{\mathcal{T}^{\top}\mathcal{T}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{T}\mathcal{T}^{\top}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}. \qquad (16)

Despite its excellent preconditioning effect for $\mathcal{A}$, which will be shown in Section 3.2, the matrix $|\mathcal{A}|$ is computationally expensive to invert. Thus, we then propose the following parallel-in-time (PinT) preconditioner, which mimics $|\mathcal{A}|$ and can be implemented efficiently:

|\mathcal{P}_{S}|:=\sqrt{\mathcal{P}_{S}^{*}\mathcal{P}_{S}}=\begin{bmatrix}\sqrt{\mathcal{S}^{*}\mathcal{S}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{S}\mathcal{S}^{*}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}. \qquad (17)

However, the preconditioner $|\mathcal{P}_{S}|$ requires fast diagonalizability of $L_{m}$ in order to be efficiently implemented. When such diagonalizability is not available, we further propose the following preconditioner $\mathcal{P}_{MS}$ as a modification of $|\mathcal{P}_{S}|$:

\mathcal{P}_{MS} = \begin{bmatrix}\sqrt{S_{n}^{*}S_{n}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{S_{n}S_{n}^{*}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\end{bmatrix}.

One of our main contributions in this work is to develop a preconditioned MINRES method with the proposed preconditioners, which has theoretically guaranteed convergence based on eigenvalues.

It is worth noting that our preconditioning approaches are fundamentally different from another kind of related existing work (see, e.g., [31, 23]), which represents a typical preconditioning approach in the context of saddle point systems. Its effectiveness is based on the approximation of Schur complements (e.g., [30, 21]) and classical preconditioning techniques [1, 28]. In contrast, our preconditioning proposal extends the MINRES preconditioning strategy proposed in [17] from optimal control of wave equations to that of parabolic equations, resulting in a spectrum clustered around $\{\pm 1\}$. Moreover, the implementation of our preconditioners based on FFTs is parallel-in-time.

The paper is organized as follows. In Section 2, we review some preliminary results on block Toeplitz matrices. In Section 3, we provide our main results on the spectral analysis of our proposed preconditioners. Numerical examples are given in Section 4 to support the performance of our proposed preconditioners.

2 Preliminaries on Toeplitz matrices

In this section, we provide some useful background knowledge regarding Toeplitz matrices.

We let $L^{1}([-\pi,\pi])$ be the Banach space of all functions that are Lebesgue integrable over $[-\pi,\pi]$ and periodically extended to the whole real line. The Toeplitz matrix generated by $f\in L^{1}([-\pi,\pi])$ is denoted by $T_{n}[f]$, namely,

T_{n}[f]=\begin{bmatrix}a_{0}&a_{-1}&\cdots&a_{-n+2}&a_{-n+1}\\ a_{1}&a_{0}&a_{-1}&&a_{-n+2}\\ \vdots&a_{1}&a_{0}&\ddots&\vdots\\ a_{n-2}&&\ddots&\ddots&a_{-1}\\ a_{n-1}&a_{n-2}&\cdots&a_{1}&a_{0}\end{bmatrix},

where

a_{k}=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\theta)e^{-\mathbf{i}k\theta}\,d\theta,\quad k=0,\pm 1,\pm 2,\dots

are the Fourier coefficients of $f$. The function $f$ is called the generating function of $T_{n}[f]$. If $f$ is complex-valued, then $T_{n}[f]$ is non-Hermitian for all sufficiently large $n$. Conversely, if $f$ is real-valued, then $T_{n}[f]$ is Hermitian for all $n$. If $f$ is real-valued and nonnegative, but not identically zero almost everywhere, then $T_{n}[f]$ is Hermitian positive definite for all $n$. If $f$ is real-valued and even, then $T_{n}[f]$ is symmetric for all $n$. For thorough discussions of the related properties of block Toeplitz matrices, we refer readers to [11] and the references therein; for computational features, see [29, 7, 12] and the references reported there.
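To connect the definition of $T_{n}[f]$ with the matrices from Section 1, the following sketch (an illustrative helper, not from the paper; the Fourier coefficients are approximated by a uniform-grid quadrature, which is exact for these trigonometric polynomials) confirms numerically that $T_{n}[f_{1}]=B_{n}^{(1)}$ and $T_{n}[f_{2}]=B_{n}^{(2)}$:

```python
import numpy as np

def toeplitz_from_symbol(f, n, n_quad=4096):
    """Assemble T_n[f] with (i, j) entry a_{i-j}, where the Fourier coefficients
    a_k of f are approximated by the trapezoidal rule on [-pi, pi]."""
    phi = np.linspace(-np.pi, np.pi, n_quad, endpoint=False)
    vals = f(phi)
    k = np.arange(-(n - 1), n)
    a = np.array([np.mean(vals * np.exp(-1j * kk * phi)) for kk in k])
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return a[(i - j) + (n - 1)]

n, theta = 5, 0.7
T1 = toeplitz_from_symbol(lambda p: 1 - np.exp(1j * p), n)          # f_1
T2 = toeplitz_from_symbol(lambda p: theta + (1 - theta) * np.exp(1j * p), n)  # f_2

B1 = np.eye(n) - np.eye(n, k=-1)                        # B_n^{(1)}
B2 = theta * np.eye(n) + (1 - theta) * np.eye(n, k=-1)  # B_n^{(2)}
assert np.allclose(T1, B1)
assert np.allclose(T2, B2)
```

Both symbols are trigonometric polynomials of degree one, so only the coefficients $a_{0}$ and $a_{1}$ are nonzero, matching the bidiagonal structure.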

3 Main results

In this section, we provide the main results supporting the effectiveness of our proposed preconditioners. Implementation issues are also discussed.

3.1 GMRES - block $\omega$-circulant based preconditioner

Proposition 3.1.

Let $\mathcal{\widehat{A}}\in\mathbb{R}^{2mn\times 2mn}$ and $\mathcal{P}_{S}\in\mathbb{C}^{2mn\times 2mn}$ be defined by (12) and (13), respectively. Then,

\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}=I_{2mn}+\widetilde{\mathcal{R}}_{1},

where $I_{2mn}$ is the $2mn\times 2mn$ identity matrix and $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$.

Proof.

First, we observe that

\mathcal{P}_{S}-\mathcal{\widehat{A}} =\begin{bmatrix}\mathcal{S}-\mathcal{T}&\\ &(\mathcal{S}-\mathcal{T})^{*}\end{bmatrix} =\begin{bmatrix}(S_{n}-B_{n})\otimes I_{m}&\\ &(S_{n}-B_{n})^{*}\otimes I_{m}\end{bmatrix}.

Now, we examine $\mathrm{rank}(S_{n}-B_{n})$ via the following matrix decomposition:

S_{n}-B_{n} = S_{n}^{(1)}(S_{n}^{(2)})^{-1}-B_{n}^{(1)}(B_{n}^{(2)})^{-1} = (S_{n}^{(1)}-B_{n}^{(1)})(S_{n}^{(2)})^{-1}+B_{n}^{(1)}\big((S_{n}^{(2)})^{-1}-(B_{n}^{(2)})^{-1}\big).

From the simple structure of these matrices, it is clear that $\mathrm{rank}(S_{n}^{(1)}-B_{n}^{(1)})\leq 1$ and $\mathrm{rank}\big((S_{n}^{(2)})^{-1}-(B_{n}^{(2)})^{-1}\big)\leq 1$ (because $\mathrm{rank}(S_{n}^{(2)}-B_{n}^{(2)})\leq 1$). Thus, we have $\mathrm{rank}(S_{n}-B_{n})\leq 2$, implying $\mathrm{rank}(\mathcal{P}_{S}-\mathcal{\widehat{A}})\leq 4m$. Then, we have

\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}} = I_{2mn}+\underbrace{\big(-\mathcal{P}_{S}^{-1}(\mathcal{P}_{S}-\mathcal{\widehat{A}})\big)}_{=:\widetilde{\mathcal{R}}_{1}},

where $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$. ∎

As a consequence of Proposition 3.1, we can show that the singular values of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are clustered around unity, except for a number of outliers whose count is independent of $n$ in general. From the point of view of preconditioning for nonsymmetric Toeplitz systems, such a singular value cluster is often used to support the preconditioning effectiveness of $\mathcal{P}_{S}$ for $\mathcal{\widehat{A}}$ when GMRES is used. We refer to [29] for a systematic exposition of preconditioning for non-Hermitian Toeplitz systems. One could further show that the eigenvalues of $\mathcal{P}_{S}^{-1}\mathcal{\widehat{A}}$ are also clustered around unity, as in many existing works. However, as mentioned in Section 1, the convergence of GMRES in general cannot be rigorously analyzed by using eigenvalues alone. As such, in the next subsections, we provide theoretical support for our proposed MINRES solvers.
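The singular value cluster can be observed directly on a small example (an illustrative numpy sketch with $\theta=1$, a 1D Laplacian stencil, and arbitrary parameter values; dense linear algebra is used only because the problem is tiny):

```python
import numpy as np

# Small-scale check of Proposition 3.1: all but O(m) singular values of
# P_S^{-1} A_hat equal 1.
n, m = 8, 4
tau, gam, zeta = 0.1, 1e-2, 0.5
alpha = tau / np.sqrt(gam)

L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))  # 1D Laplacian stencil
Bn = np.eye(n) - np.eye(n, k=-1)                         # B_n for theta = 1
Sn = Bn.astype(complex).copy()
Sn[0, -1] = -np.exp(1j * zeta)                           # omega-circulant corner

T = np.kron(Bn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
I = np.eye(n * m)
A_hat = np.block([[T, -alpha * I], [alpha * I, T.T]]).astype(complex)
P_S = np.block([[S, -alpha * I], [alpha * I, S.conj().T]])

sv = np.linalg.svd(np.linalg.solve(P_S, A_hat), compute_uv=False)
outliers = int(np.sum(np.abs(sv - 1) > 1e-8))
assert outliers <= 8 * m   # rank(R_1) <= 4m moves at most 8m singular values off 1
```

Since a rank-$r$ perturbation of the identity can move at most $2r$ singular values away from one, the bound $8m$ follows from $\mathrm{rank}(\widetilde{\mathcal{R}}_{1})\leq 4m$.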

3.2 MINRES - ideal preconditioner

In what follows, we will show that an ideal preconditioner for $\mathcal{A}$ is the SPD matrix $|\mathcal{A}|$ defined by (16).

Proposition 3.2.

Let $\mathcal{A}\in\mathbb{R}^{2mn\times 2mn}$ be defined as in Section 1. Then, the preconditioned matrix $|\mathcal{A}|^{-1}\mathcal{A}$ is both (real) symmetric and orthogonal.

Proof.

Considering the singular value decomposition $\mathcal{T}=U\Sigma V^{\top}$, the associated decomposition of $\mathcal{A}$ is obtained by direct computation, that is,

\mathcal{A} = \begin{bmatrix}\alpha I_{n}\otimes I_{m}&V\Sigma U^{\top}\\ U\Sigma V^{\top}&-\alpha I_{n}\otimes I_{m}\end{bmatrix} = \begin{bmatrix}V&\\ &U\end{bmatrix}\begin{bmatrix}\alpha I_{n}\otimes I_{m}&\Sigma\\ \Sigma&-\alpha I_{n}\otimes I_{m}\end{bmatrix}\begin{bmatrix}V&\\ &U\end{bmatrix}^{\top} = \begin{bmatrix}V&\\ &U\end{bmatrix}\mathcal{\widehat{Q}}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{\widehat{Q}}^{\top}\begin{bmatrix}V&\\ &U\end{bmatrix}^{\top},

where $\mathcal{\widehat{Q}}$ is an orthogonal matrix given by

\mathcal{\widehat{Q}}=\begin{bmatrix}\Sigma D_{1}^{-1}&-\Sigma D_{2}^{-1}\\ \big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}-\alpha I_{n}\otimes I_{m}\big)D_{1}^{-1}&\big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)D_{2}^{-1}\end{bmatrix}

with

D_{1}=\sqrt{\big(-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)^{2}+\Sigma^{2}}

and

D_{2}=\sqrt{\big(\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}+\alpha I_{n}\otimes I_{m}\big)^{2}+\Sigma^{2}}.

It is obvious that both diagonal matrices $D_{1}$ and $D_{2}$ are invertible. Thus, $\mathcal{\widehat{Q}}$ is well-defined.

Since both $\mathcal{\widehat{Q}}$ and $\begin{bmatrix}V&\\ &U\end{bmatrix}$ are orthogonal, $\mathcal{Q}:=\begin{bmatrix}V&\\ &U\end{bmatrix}\mathcal{\widehat{Q}}$ is also orthogonal. Hence, from the decomposition above, we have obtained an eigendecomposition of $\mathcal{A}$, i.e.,

\mathcal{A}=\mathcal{Q}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &-\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{Q}^{\top}.

Thus, we have

|\mathcal{A}| = \mathcal{Q}\begin{bmatrix}\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}&\\ &\sqrt{\Sigma^{2}+\alpha^{2}I_{n}\otimes I_{m}}\end{bmatrix}\mathcal{Q}^{\top},

where $\mathcal{Q}$ is orthogonal and $\Sigma$ is a diagonal matrix containing the singular values of $\mathcal{T}$. Thus,

|\mathcal{A}|^{-1}\mathcal{A} = \mathcal{Q}\begin{bmatrix}I_{n}\otimes I_{m}&\\ &-I_{n}\otimes I_{m}\end{bmatrix}\mathcal{Q}^{\top},

which is both symmetric and orthogonal. The proof is complete. ∎

In other words, Proposition 3.2 shows that the preconditioner $|\mathcal{A}|$ renders the eigenvalues exactly $\pm 1$, providing a good guide for designing effective preconditioners for $\mathcal{A}$. As a consequence of Proposition 3.2, we conclude that MINRES with $|\mathcal{A}|$ as a preconditioner achieves mesh-independent convergence, i.e., a convergence rate independent of both the mesh and the regularization parameter.
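Proposition 3.2 is easy to confirm numerically (an illustrative numpy sketch with a random stand-in for the block $\mathcal{T}$ and an arbitrary value of $\alpha$; since $\mathcal{A}$ is symmetric, $|\mathcal{A}|=\sqrt{\mathcal{A}^{2}}$ can be formed from its eigendecomposition):

```python
import numpy as np

# |A|^{-1} A should be symmetric, orthogonal, and have eigenvalues exactly +1/-1
rng = np.random.default_rng(1)
N, alpha = 12, 0.5
T = rng.standard_normal((N, N))          # stand-in for the block T
A = np.block([[alpha * np.eye(N), T.T], [T, -alpha * np.eye(N)]])

w, Q = np.linalg.eigh(A)                 # A is symmetric
absA = Q @ np.diag(np.abs(w)) @ Q.T      # |A| = sqrt(A^2)

P = np.linalg.solve(absA, A)
assert np.allclose(P, P.T, atol=1e-10)                  # symmetric
assert np.allclose(P @ P, np.eye(2 * N), atol=1e-10)    # orthogonal involution
assert np.allclose(np.abs(np.linalg.eigvalsh((P + P.T) / 2)), 1.0)
```

The eigenvalues of $A$ come in pairs $\pm\sqrt{\sigma_{i}^{2}+\alpha^{2}}$ and are bounded away from zero by $\alpha$, so $|A|$ is safely invertible here.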

Despite the fact that $|\mathcal{A}|$ is an ideal preconditioner, its direct application has the drawback of being computationally expensive in general. Proposition 3.2 reveals an eigendecomposition of both $\mathcal{A}$ and $|\mathcal{A}|$, allowing us to develop preconditioners based on the spectral symbol. In what follows, we will show that $|\mathcal{P}_{S}|$ defined by (17) is a good preconditioner for $\mathcal{A}$ in the sense that the preconditioned matrix $|\mathcal{P}_{S}|^{-1}\mathcal{A}$ can be expressed as the sum of a Hermitian unitary matrix and a low-rank matrix.

3.3 MINRES - block $\omega$-circulant based preconditioner

The following theorem accounts for the preconditioning effect of MINRES-$\mathcal{P}_{S}$.

Theorem 3.3.

Let $\mathcal{A}\in\mathbb{R}^{2mn\times 2mn}$ and $|\mathcal{P}_{S}|\in\mathbb{C}^{2mn\times 2mn}$ be defined as in Section 1 and by (17), respectively. Then,

|\mathcal{P}_{S}|^{-1}\mathcal{A}=\widetilde{\mathcal{Q}}_{1}+\widetilde{\mathcal{R}}_{2},

where $\widetilde{\mathcal{Q}}_{1}$ is both Hermitian and unitary and $\mathrm{rank}(\widetilde{\mathcal{R}}_{2})\leq 4m$.

Proof.

Let $s(\mathcal{A})=\begin{bmatrix}\alpha I_{n}\otimes I_{m}&\mathcal{S}^{*}\\ \mathcal{S}&-\alpha I_{n}\otimes I_{m}\end{bmatrix}$. Notice that $|s(\mathcal{A})|=\sqrt{s(\mathcal{A})^{2}}=|\mathcal{P}_{S}|=\sqrt{\mathcal{P}_{S}^{*}\mathcal{P}_{S}}$. Simple calculations show that

s(\mathcal{A})-\mathcal{A} =\begin{bmatrix}&(\mathcal{S}-\mathcal{T})^{*}\\ \mathcal{S}-\mathcal{T}&\end{bmatrix} =\begin{bmatrix}&(S_{n}-B_{n})^{*}\otimes I_{m}\\ (S_{n}-B_{n})\otimes I_{m}&\end{bmatrix}.

Thus, $\mathrm{rank}(s(\mathcal{A})-\mathcal{A})\leq 4m$, following the argument from the proof of Proposition 3.1. Hence, we have

|\mathcal{P}_{S}|^{-1}\mathcal{A} = |s(\mathcal{A})|^{-1}\mathcal{A} = \underbrace{|s(\mathcal{A})|^{-1}s(\mathcal{A})}_{=:\widetilde{\mathcal{Q}}_{1}}+\underbrace{\big(-|s(\mathcal{A})|^{-1}(s(\mathcal{A})-\mathcal{A})\big)}_{=:\widetilde{\mathcal{R}}_{2}},

where $\mathrm{rank}(\widetilde{\mathcal{R}}_{2})\leq 4m$.

Since $s(\mathcal{A})$ is Hermitian, we have $s(\mathcal{A})=\mathcal{W}\Psi\mathcal{W}^{*}$, where $\mathcal{W}$ is a unitary matrix and $\Psi$ is a diagonal matrix containing the eigenvalues of $s(\mathcal{A})$. Correspondingly, we have $|s(\mathcal{A})|=\mathcal{W}|\Psi|\mathcal{W}^{*}$, where $|\Psi|$ is a diagonal matrix containing the absolute values of the eigenvalues of $s(\mathcal{A})$. Thus, we have $\widetilde{\mathcal{Q}}_{1}=\mathcal{W}|\Psi|^{-1}\Psi\mathcal{W}^{*}$, which is clearly Hermitian. Also, as $\widetilde{\mathcal{Q}}_{1}^{2}=\mathcal{W}|\Psi|^{-1}\Psi|\Psi|^{-1}\Psi\mathcal{W}^{*}=\mathcal{W}I_{2nm}\mathcal{W}^{*}=I_{2nm}$, we know that $\widetilde{\mathcal{Q}}_{1}$ is unitary.

The proof is complete. ∎

As a consequence of Theorem 3.3 and [5, Corollary 3], we know that the preconditioned matrix $|\mathcal{P}_{S}|^{-1}\mathcal{A}$ has eigenvalues clustered at $\pm 1$, with a number of outliers independent of $n$ in general (i.e., depending only on $m$). Thus, the convergence is independent of the time step in general, and we can expect that MINRES for $\mathcal{A}$ will converge rapidly in exact arithmetic with $|\mathcal{P}_{S}|$ as the preconditioner.
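The eigenvalue cluster can also be observed on a small example (an illustrative numpy sketch with $\theta=1$, a 1D Laplacian stencil, and arbitrary parameters; $|\mathcal{P}_{S}|$ is formed densely via the eigendecomposition of the Hermitian matrix $s(\mathcal{A})$, which is affordable only at this toy scale):

```python
import numpy as np

# Small-scale check of Theorem 3.3: all but O(m) eigenvalues of |P_S|^{-1} A
# sit at +1 or -1.
n, m = 16, 3
tau, gam, zeta = 0.05, 1e-2, 0.8
alpha = tau / np.sqrt(gam)

L = -(2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
Bn = np.eye(n) - np.eye(n, k=-1)
Sn = Bn.astype(complex).copy()
Sn[0, -1] = -np.exp(1j * zeta)
T = np.kron(Bn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
S = np.kron(Sn, np.eye(m)) + tau * np.kron(np.eye(n), -L)
I = np.eye(n * m)

A = np.block([[alpha * I, T.T], [T, -alpha * I]]).astype(complex)
sA = np.block([[(alpha * I).astype(complex), S.conj().T], [S, -alpha * I]])

w, W = np.linalg.eigh(sA)                   # s(A) is Hermitian
absP = W @ np.diag(np.abs(w)) @ W.conj().T  # |P_S| = sqrt(s(A)^2)

ev = np.linalg.eigvals(np.linalg.solve(absP, A))
outliers = int(np.sum(np.minimum(np.abs(ev - 1), np.abs(ev + 1)) > 1e-6))
assert outliers <= 8 * m   # here rank(S_n - B_n) = 1, so the perturbation rank is <= 2m
```

For $\theta=1$ the difference $S_{n}-B_{n}$ is only the corner entry, so the perturbation rank is at most $2m$ and at most $8m$ eigenvalues can leave $\{\pm 1\}$.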

3.4 MINRES - modified block $\omega$-circulant based preconditioner

To support the preconditioning effect of $\mathcal{P}_{MS}$, we will show that it is spectrally equivalent to $\mathcal{P}_{S}$. Before proceeding, we introduce the following auxiliary matrix $\mathcal{P}_{AS}$, which is useful for showing the preconditioning effect of $\mathcal{P}_{MS}$:

\mathcal{P}_{AS} = \begin{bmatrix}\sqrt{(S_{n}^{*}S_{n}+\alpha^{2}I_{n})\otimes I_{m}+\tau^{2}I_{n}\otimes L_{m}^{2}}&\\ &\sqrt{(S_{n}S_{n}^{*}+\alpha^{2}I_{n})\otimes I_{m}+\tau^{2}I_{n}\otimes L_{m}^{2}}\end{bmatrix}.

Also, the following lemma is useful.

Lemma 3.4.

Let $S_{n}\in\mathbb{C}^{n\times n}$ be defined as in (14). Then, $S_{n}^{*}+S_{n}$ is (Hermitian) positive semi-definite.

Proof.

The eigenvalues of $S_{n}$ can be found explicitly, which are

\lambda_{k}(S_{n})=\frac{\lambda_{k}(S_{n}^{(1)})}{\lambda_{k}(S_{n}^{(2)})}=\frac{1-\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)},

$k=0,\dots,n-1$. Thus, we have

\lambda_{k}(S_{n}^{*}+S_{n}) = \frac{1-\exp\big(-\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(-\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}+\frac{1-\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)}{\theta+(1-\theta)\exp\big(\mathbf{i}\frac{\zeta+2\pi k}{n}\big)} = \frac{2(2\theta-1)\big(1-\cos(\frac{\zeta+2\pi k}{n})\big)}{\theta^{2}+(1-\theta)^{2}+2\theta(1-\theta)\cos(\frac{\zeta+2\pi k}{n})},

Since \theta\in[\frac{1}{2},1], the numerator is non-negative, while the denominator equals |\theta+(1-\theta)\exp{(\mathbf{i}(\frac{\zeta+2\pi k}{n}))}|^{2}, which is positive because S_{n}^{(2)} is nonsingular by assumption. Hence, every eigenvalue of S_{n}^{*}+S_{n} is non-negative. The proof is complete. ∎
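The eigenvalue formula above can be checked numerically. The following sketch (with illustrative n, \zeta, and \theta) verifies both the identity \lambda_{k}(S_{n}^{*}+S_{n})=2\,\mathrm{Re}\,\lambda_{k}(S_{n}), which holds since S_{n} is normal, and the non-negativity for \theta\in[1/2,1]:

```python
import numpy as np

# Illustrative parameters (not from the paper): n grid points in time,
# theta-method parameter in [1/2, 1], and zeta = pi (i.e., omega = -1).
n, theta, zeta = 32, 0.75, np.pi
phi = (zeta + 2 * np.pi * np.arange(n)) / n

# Eigenvalues of S_n per the stated formula.
lam_S = (1 - np.exp(1j * phi)) / (theta + (1 - theta) * np.exp(1j * phi))

# Eigenvalues of S_n^* + S_n per the closed form derived in the proof.
lam_sym = 2 * (2 * theta - 1) * (1 - np.cos(phi)) / (
    theta**2 + (1 - theta)**2 + 2 * theta * (1 - theta) * np.cos(phi))

assert np.allclose(lam_sym, 2 * lam_S.real)  # equals lambda_k + conj(lambda_k)
assert lam_sym.min() >= -1e-14               # non-negative for theta >= 1/2
```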

Denote by σ(𝐂)\sigma({\bf C}) the spectrum of a square matrix 𝐂{\bf C}.

Lemma 3.5.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ(XX+YY+α2InIm1(X+Y)(X+Y)+α2InIm)[1,2].\displaystyle\sigma\bigg{(}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq[1,\sqrt{2}].
Proof.

Let

Z=(X+Y)(X+Y)+α2InImZ=\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}

and

Z~=XX+YY+α2InIm.\widetilde{Z}=\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}.

Knowing that the eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top} with Lm-L_{m} assumed SPD, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a real-valued diagonal matrix containing the eigenvalues of Lm-L_{m}, we have

Z\displaystyle Z =\displaystyle= (X+Y)(X+Y)+α2InIm\displaystyle\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm+τInΩm)(ΛnIm+τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})^{*}(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m)\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}

and

Z~\displaystyle\widetilde{Z} =\displaystyle= XX+YY+α2InIm\displaystyle\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm)(ΛnIm)+(τInΩm)(τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m).\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}.

Since ZZ and Z~\widetilde{Z} are diagonalized by the unitary matrix =Γn𝔽n𝕌m\mathbb{Q}=\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}, they are simultaneously diagonalizable.

Since both Z and \widetilde{Z} are Hermitian positive definite by construction, they are invertible. To examine the target spectrum of \widetilde{Z}^{-1}Z, we consider the Rayleigh quotient for complex \mathbf{v}\neq\mathbf{0}:

R:=\frac{\mathbf{v}^{*}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}{\mathbf{v}^{*}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}=\frac{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})^{*}(\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}}{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}},\quad\mathbf{w}=\mathbb{Q}^{*}\mathbf{v}.

By the invertibility of ZZ and Z~\widetilde{Z}, both numerator and denominator are positive.

On one hand, we estimate an upper bound for R. Let z_{1} and z_{2} be corresponding diagonal entries of \Lambda_{n}\otimes I_{m} and \tau I_{n}\otimes\Omega_{m}, respectively. We have

12(z1+z2¯)(z1+z2)z¯1z1+z¯2z2,\displaystyle\frac{{1}}{2}(\overline{z_{1}+z_{2}})(z_{1}+z_{2})\leq{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}},

which implies

12(z1+z2¯)(z1+z2)+α2z¯1z1+z¯2z2+α2,\displaystyle\frac{{1}}{2}(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}\leq{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2},

since α2=τ2γ\alpha^{2}=\frac{\tau^{2}}{{\gamma}} is positive. Thus, we have

12(z1+z2¯)(z1+z2)+α2z¯1z1+z¯2z2+α2.\displaystyle\frac{{1}}{\sqrt{2}}\sqrt{(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}}\leq\sqrt{{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2}}.

Therefore,

R2.R\leq\sqrt{2}.

On the other hand, we estimate a lower bound for R by first examining the definiteness of the matrix X^{*}Y+Y^{*}X. Since S_{n}^{*}+S_{n} is (Hermitian) positive semi-definite by Lemma 3.4, the matrix

X^{*}Y+Y^{*}X=(S_{n}\otimes I_{m})^{*}(\tau I_{n}\otimes(-L_{m}))+(\tau I_{n}\otimes(-L_{m}))^{*}(S_{n}\otimes I_{m})=\tau(S_{n}^{*}+S_{n})\otimes(-L_{m})

is also positive semi-definite, since -L_{m} is symmetric positive definite and the Kronecker product of two positive semi-definite matrices is positive semi-definite. Thus, we also have

(z1+z2¯)(z1+z2)+α2\displaystyle\sqrt{(\overline{z_{1}+z_{2}})(z_{1}+z_{2})+\alpha^{2}} =\displaystyle= z¯1z1+z¯2z2+z¯1z2+z¯2z1+α2\displaystyle\sqrt{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}+\bar{z}_{1}{z_{2}}+\bar{z}_{2}{z_{1}}+\alpha^{2}}
\displaystyle\geq z¯1z1+z¯2z2+α2,\displaystyle\sqrt{{\bar{z}_{1}{z_{1}}+\bar{z}_{2}{z_{2}}}+\alpha^{2}},

implying

1R.1\leq R.

The proof is complete. ∎
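The proof above reduces to a scalar inequality on corresponding diagonal entries. A quick numerical check (with an illustrative \alpha and random samples satisfying \mathrm{Re}\,z_{1}\geq 0, per Lemma 3.4, and z_{2}\geq 0, per the SPD assumption on -L_{m}) confirms the claimed range [1,\sqrt{2}]:

```python
import numpy as np

# Illustrative alpha = tau / sqrt(gamma); any positive value works here.
alpha = 0.3
rng = np.random.default_rng(0)

# z1 models eigenvalues of S_n (non-negative real part, cf. Lemma 3.4);
# z2 models entries of tau * Omega_m (real and non-negative).
z1 = np.abs(rng.standard_normal(1000)) + 1j * rng.standard_normal(1000)
z2 = np.abs(rng.standard_normal(1000))

# Scalar form of the Rayleigh quotient bounded in Lemma 3.5.
R = np.sqrt(np.abs(z1 + z2)**2 + alpha**2) / \
    np.sqrt(np.abs(z1)**2 + z2**2 + alpha**2)

assert np.all(R >= 1 - 1e-12) and np.all(R <= np.sqrt(2) + 1e-12)
```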

Similarly, we can show the following lemma:

Lemma 3.6.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ(XX+YY+α2InIm1(X+Y)(X+Y)+α2InIm)[1,2].\displaystyle\sigma\bigg{(}\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\sqrt{(X+Y)(X+Y)^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq[1,\sqrt{2}].
Proposition 3.7.

Let |𝒫S|,𝒫AS2mn×2mn|\mathcal{P}_{S}|,\mathcal{P}_{AS}\in\mathbb{C}^{2mn\times 2mn} be defined by (17) and (3.4), respectively. Then,

σ(𝒫AS1|𝒫S|)[1,2].\displaystyle\sigma(\mathcal{P}_{AS}^{-1}|\mathcal{P}_{S}|)\subseteq[1,\sqrt{2}].
Proof.

Knowing that

|𝒫S|=[(X+Y)(X+Y)+α2InIm(X+Y)(X+Y)+α2InIm]|\mathcal{P}_{S}|=\begin{bmatrix}\sqrt{(X+Y)^{*}(X+Y)+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{(X+Y)(X+Y)^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}

and

𝒫AS=[XX+YY+α2InImXX+YY+α2InIm],\mathcal{P}_{AS}=\begin{bmatrix}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix},

we know that \sigma(\mathcal{P}_{AS}^{-1}|\mathcal{P}_{S}|)\subseteq[1,\sqrt{2}] by Lemmas 3.5 and 3.6. ∎

Remark 3.

When the Crank-Nicolson method (i.e., \theta=\frac{1}{2}) is adopted, S_{n}^{*}+S_{n} is the zero matrix by Lemma 3.4. Using this fact in the proofs of Lemmas 3.5 and 3.6, we can show that X^{*}Y+Y^{*}X is also the zero matrix, which implies \mathcal{P}_{AS}=|\mathcal{P}_{S}|.

Lemma 3.8.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ((XX+α2InIm+YY)1XX+YY+α2InIm)[12,1].\displaystyle\sigma\bigg{(}\big{(}\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}\big{)}^{-1}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proof.

Similar to the proof of Lemma 3.5, we let

Z^=XX+α2InIm+YY\widehat{Z}=\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}

and

Z~=XX+YY+α2InIm.\widetilde{Z}=\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}.

Knowing that the eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top}, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a real diagonal matrix containing the eigenvalues of Lm-L_{m}, we have

Z^\displaystyle\widehat{Z} =\displaystyle= XX+α2InIm+YY\displaystyle\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×((ΛnIm)(ΛnIm)+α2InIm+τInΩm)\displaystyle\times\big{(}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}+\tau I_{n}\otimes\Omega_{m}\big{)}
×(Γn𝔽n𝕌m)\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}

and, again,

Z~\displaystyle\widetilde{Z} =\displaystyle= XX+YY+α2InIm\displaystyle\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}
=\displaystyle= (Γn𝔽n𝕌m)\displaystyle(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})
×(ΛnIm)(ΛnIm)+(τInΩm)(τInΩm)+α2InIm\displaystyle\times\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}
×(Γn𝔽n𝕌m).\displaystyle\times(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}.

Thus, Z^\widehat{Z} and Z~\widetilde{Z} are simultaneously diagonalized by the unitary matrix =Γn𝔽n𝕌m\mathbb{Q}=\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}.

Since both \widehat{Z} and \widetilde{Z} are Hermitian positive definite by construction, they are invertible. To examine the target spectrum of \widehat{Z}^{-1}\widetilde{Z}, we consider the Rayleigh quotient for complex \mathbf{v}\neq\mathbf{0}:

\widehat{R}:=\frac{\mathbf{v}^{*}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{v}}{\mathbf{v}^{*}\big(\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}\big)\mathbf{v}}=\frac{\mathbf{w}^{*}\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+(\tau I_{n}\otimes\Omega_{m})^{\top}(\tau I_{n}\otimes\Omega_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}\,\mathbf{w}}{\mathbf{w}^{*}\big(\sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}}+\tau I_{n}\otimes\Omega_{m}\big)\mathbf{w}},\quad\mathbf{w}=\mathbb{Q}^{*}\mathbf{v}.

For any two non-negative numbers c_{1} and c_{2}, it holds that

12(c1+c2)c12+c22c1+c2.\displaystyle\frac{1}{\sqrt{2}}(c_{1}+c_{2})\leq\sqrt{c_{1}^{2}+c_{2}^{2}}\leq c_{1}+c_{2}.

Therefore, by letting c_{1} and c_{2} be corresponding diagonal entries of \sqrt{(\Lambda_{n}\otimes I_{m})^{*}(\Lambda_{n}\otimes I_{m})+\alpha^{2}{I}_{n}\otimes I_{m}} and \tau I_{n}\otimes\Omega_{m}, respectively, we have

12R^1.\frac{1}{\sqrt{2}}\leq\widehat{R}\leq 1.

The proof is complete. ∎
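The scalar inequality underpinning this proof can be sampled numerically (illustrative data only); it directly yields the spectral inclusion [\frac{1}{\sqrt{2}},1]:

```python
import numpy as np

# For non-negative c1, c2: (c1 + c2)/sqrt(2) <= sqrt(c1^2 + c2^2) <= c1 + c2.
rng = np.random.default_rng(1)
c1, c2 = rng.random(1000), rng.random(1000)   # illustrative non-negative samples
s = np.sqrt(c1**2 + c2**2)

assert np.all((c1 + c2) / np.sqrt(2) <= s + 1e-12)
assert np.all(s <= c1 + c2 + 1e-12)
```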

Similarly, we can show the following lemma and proposition.

Lemma 3.9.

Let X=SnImmn×mnX=S_{n}\otimes I_{m}\in\mathbb{C}^{mn\times mn} and Y=τIn(Lm)mn×mnY=\tau I_{n}\otimes(-L_{m})\in\mathbb{R}^{mn\times mn}, with the involved notation defined in (14). Then,

σ((XX+α2InIm+YY)1XX+YY+α2InIm)[12,1].\displaystyle\sigma\bigg{(}\big{(}\sqrt{XX^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{YY^{*}}\big{)}^{-1}\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\bigg{)}\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proposition 3.10.

Let 𝒫AS,𝒫MS2mn×2mn\mathcal{P}_{AS},\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (3.4) and (1), respectively. Then,

σ(𝒫MS1𝒫AS)[12,1].\displaystyle\sigma(\mathcal{P}_{MS}^{-1}\mathcal{P}_{AS})\subseteq\bigg{[}\frac{1}{\sqrt{2}},1\bigg{]}.
Proof.

Knowing that

𝒫MS=[XX+α2InIm+YYXX+α2InIm+YY]\mathcal{P}_{MS}=\begin{bmatrix}\sqrt{X^{*}X+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{Y^{*}Y}&\\ &\sqrt{XX^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}+\sqrt{YY^{*}}\\ \end{bmatrix}

and

𝒫AS=[XX+YY+α2InImXX+YY+α2InIm],\mathcal{P}_{AS}=\begin{bmatrix}\sqrt{X^{*}X+Y^{*}Y+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{XX^{*}+YY^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix},

we know that σ(𝒫MS1𝒫AS)[12,1],\sigma(\mathcal{P}_{MS}^{-1}\mathcal{P}_{AS})\subseteq\big{[}\frac{1}{\sqrt{2}},1\big{]}, by Lemmas 3.8 and 3.9. ∎

Now, we are ready to show that 𝒫MS\mathcal{P}_{MS} and 𝒫S\mathcal{P}_{S} are spectrally equivalent in the following theorem, which explains the effectiveness of 𝒫MS\mathcal{P}_{MS}.

Theorem 3.11.

Let |𝒫S|,𝒫MS2mn×2mn|\mathcal{P}_{S}|,\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (17) and (1), respectively. Then,

σ(𝒫MS1|𝒫S|)[12,2].\displaystyle\sigma(\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}|)\subseteq\bigg{[}\frac{1}{\sqrt{2}},\sqrt{2}\bigg{]}.
Proof.

Notice that for complex 𝐯𝟎\mathbf{v}\neq\mathbf{0},

𝐯|𝒫S|𝐯𝐯𝒫MS𝐯=𝐯|𝒫S|𝐯𝐯𝒫AS𝐯𝐯𝒫AS𝐯𝐯𝒫MS𝐯.\displaystyle\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}=\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{AS}\mathbf{v}}\cdot\frac{\mathbf{v}^{*}\mathcal{P}_{AS}\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}.

By Propositions 3.7 and 3.10, we have

12𝐯|𝒫S|𝐯𝐯𝒫MS𝐯2.\frac{1}{\sqrt{2}}\leq\frac{\mathbf{v}^{*}|\mathcal{P}_{S}|\mathbf{v}}{\mathbf{v}^{*}\mathcal{P}_{MS}\mathbf{v}}\leq\sqrt{2}.

The proof is complete. ∎

Remark 4.

From Remark 3, we know that σ(𝒫MS1|𝒫S|)[12,1]\sigma(\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}|)\subseteq[\frac{1}{\sqrt{2}},1] when θ=12\theta=\frac{1}{2}.

In what follows, we will justify the preconditioning effectiveness of 𝒫MS\mathcal{P}_{MS} for 𝒜\mathcal{A}.

Lemma 3.12.

[16, Theorem 4.5.9 (Ostrowski)] Let A_{m},W_{m} be m\times m matrices. Suppose A_{m} is Hermitian and W_{m} is nonsingular. Let the eigenvalues of A_{m} and W_{m}W_{m}^{*} be arranged in increasing order. For each k=1,2,\dots,m, there exists a positive real number \theta_{k} such that \lambda_{1}(W_{m}W_{m}^{*})\leq\theta_{k}\leq\lambda_{m}(W_{m}W_{m}^{*}) and

λk(WmAmWm)=θkλk(Am).\lambda_{k}(W_{m}A_{m}W_{m}^{*})=\theta_{k}\lambda_{k}(A_{m}).
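Ostrowski's theorem is easy to illustrate numerically. In the sketch below (illustrative sizes; A is a diagonal Hermitian matrix with nonzero eigenvalues, W is generically nonsingular), the ratios \theta_{k} of the increasingly sorted eigenvalues indeed lie within the spectrum bounds of WW^{*}:

```python
import numpy as np

m = 6
# Hermitian test matrix with distinct nonzero eigenvalues (illustrative).
A = np.diag(np.linspace(-2.5, 3.0, m))
# Generically nonsingular congruence factor (illustrative).
W = np.random.default_rng(2).standard_normal((m, m)) + 0.5 * np.eye(m)

# theta_k = lambda_k(W A W^*) / lambda_k(A), both sorted increasingly.
theta = np.sort(np.linalg.eigvalsh(W @ A @ W.T)) / np.sort(np.linalg.eigvalsh(A))
lamWW = np.linalg.eigvalsh(W @ W.T)

assert np.all(theta >= lamWW.min() - 1e-8)
assert np.all(theta <= lamWW.max() + 1e-8)
```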

As a consequence of Theorem 3.11, Lemma 3.12, and [5, Corollary 3], we can show the following corollary accounting for the preconditioning effectiveness of 𝒫MS\mathcal{P}_{MS}:

Corollary 3.13.

Let \mathcal{A}\in\mathbb{R}^{2mn\times 2mn},\mathcal{P}_{MS}\in\mathbb{C}^{2mn\times 2mn} be defined by (1) and (1), respectively. Then, the eigenvalues of the matrix \mathcal{P}_{MS}^{-1}\mathcal{A} are contained in [-\sqrt{2},-\frac{1}{\sqrt{2}}]\cup[\frac{1}{\sqrt{2}},\sqrt{2}], with a number of outliers independent of n in general (i.e., depending only on m).

Proof.

Note that

𝒫MS1/2𝒜𝒫MS1/2=𝒫MS1/2|𝒫S|1/2|𝒫S|1/2𝒜|𝒫S|1/2|𝒫S|1/2𝒫MS1/2.\displaystyle\mathcal{P}_{MS}^{-1/2}\mathcal{A}\mathcal{P}_{MS}^{-1/2}=\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|^{1/2}|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}|\mathcal{P}_{S}|^{1/2}\mathcal{P}_{MS}^{-1/2}.

From Lemma 3.12 and Theorem 3.11, we know that, for each k=1,2,,2mnk=1,2,\dots,2mn, there exists a positive real number θk\theta_{k} such that

12λmin(𝒫MS1/2|𝒫S|𝒫MS1/2)θkλmax(𝒫MS1/2|𝒫S|𝒫MS1/2)2\frac{1}{\sqrt{2}}\leq\lambda_{\min}(\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|\mathcal{P}_{MS}^{-1/2})\leq\theta_{k}\leq\lambda_{\max}(\mathcal{P}_{MS}^{-1/2}|\mathcal{P}_{S}|\mathcal{P}_{MS}^{-1/2})\leq\sqrt{2}

and

λk(𝒫MS1/2𝒜𝒫MS1/2)=θkλk(|𝒫S|1/2𝒜|𝒫S|1/2).\lambda_{k}(\mathcal{P}_{MS}^{-1/2}\mathcal{A}\mathcal{P}_{MS}^{-1/2})=\theta_{k}\lambda_{k}(|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}).

Recalling from Theorem 3.3 and [5, Corollary 3] that the eigenvalues \lambda_{k}(|\mathcal{P}_{S}|^{-1/2}\mathcal{A}|\mathcal{P}_{S}|^{-1/2}) are clustered around \pm 1, except for a number of outliers independent of n in general, the proof is complete. ∎

Remark 5.

When the Crank-Nicolson method is used, we can show that the eigenvalues of the matrix \mathcal{P}_{MS}^{-1}\mathcal{A} are contained in [-1,-\frac{1}{\sqrt{2}}]\cup[\frac{1}{\sqrt{2}},1], with a number of outliers independent of n in general.

In light of the last corollary, we can expect that MINRES for 𝒜\mathcal{A} will converge rapidly in exact arithmetic with 𝒫MS\mathcal{P}_{MS} as the preconditioner.

3.5 Implementation

We begin by discussing the computation of \mathcal{\widehat{A}}\mathbf{v} (and \mathcal{A}\mathbf{v}) for a given vector \mathbf{v}. The matrix-vector product \mathcal{\widehat{A}}\mathbf{v} can be computed in \mathcal{O}(mn\log{n}) operations using fast Fourier transforms, since \mathcal{\widehat{A}} contains two (dense) block Toeplitz matrices. The required storage is \mathcal{O}(mn). In the special case \theta=1, for instance, the product \mathcal{\widehat{A}}\mathbf{v} requires only linear complexity \mathcal{O}(mn), since the matrix is sparse with a simple bidiagonal Toeplitz block B_{n}.
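The \mathcal{O}(mn\log{n}) cost rests on the classical fact that a Toeplitz matrix-vector product can be computed by embedding the Toeplitz matrix into a circulant of twice the size and applying FFTs. A minimal one-dimensional sketch with illustrative data:

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(col, row, v):
    """Multiply the Toeplitz matrix with first column `col` and first row
    `row` by `v`, via a 2n-point circulant embedding and FFTs."""
    n = len(v)
    # First column of the circulant embedding: [col, 0, reversed row tail].
    c = np.r_[col, 0.0, row[:0:-1]]
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.r_[v, np.zeros(n)]))
    return y[:n].real

rng = np.random.default_rng(6)
col, row = rng.standard_normal(8), rng.standard_normal(8)
row[0] = col[0]                 # Toeplitz requires matching corner entry
v = rng.standard_normal(8)

assert np.allclose(toeplitz_matvec(col, row, v), toeplitz(col, row) @ v)
```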

In each GMRES iteration, the matrix-vector product 𝒫S1𝐯\mathcal{P}_{S}^{-1}\mathbf{v} for a given vector 𝐯\mathbf{v} needs to be computed. Since ω\omega-circulant matrices are diagonalizable by the product of a diagonal matrix and a discrete Fourier matrix 𝔽n=1n[θn(i1)(j1)]i,j=1nn×n\mathbb{F}_{n}=\frac{1}{\sqrt{n}}[\theta_{n}^{(i-1)(j-1)}]_{i,j=1}^{n}\in\mathbb{C}^{n\times n} with θn=exp(2π𝐢n)\theta_{n}=\exp{(\frac{2\pi\mathbf{i}}{n})}, we can represent the matrix SnS_{n} defined by (14) using eigendecomposition Sn=Γn𝔽nΛn𝔽nΓnS_{n}=\Gamma_{n}\mathbb{F}_{n}\Lambda_{n}\mathbb{F}_{n}^{*}\Gamma_{n}^{*}. Note that Λn\Lambda_{n} is a diagonal matrix.

Hence, we can decompose 𝒫S\mathcal{P}_{S} from (13) as follows:

𝒫S\displaystyle\mathcal{P}_{S} =[𝒮αInImαInIm𝒮]\displaystyle=\begin{bmatrix}\mathcal{S}&-\alpha{I}_{n}\otimes I_{m}\\ \alpha{I}_{n}\otimes I_{m}&\mathcal{S}^{*}\\ \end{bmatrix}
=𝒰~([ΛnαInαInΛn]𝒢Im+τ[InIn](Lm))𝒰~,\displaystyle=\widetilde{\mathcal{U}}\left(\underbrace{\begin{bmatrix}\Lambda_{n}&-\alpha I_{n}\\ \alpha I_{n}&\Lambda_{n}^{*}\\ \end{bmatrix}}_{\mathcal{G}}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\widetilde{\mathcal{U}}^{*},

where \widetilde{\mathcal{U}}=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes I_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes I_{m})^{*}\\ \end{bmatrix} is a unitary matrix. Note that the matrix \mathcal{G} can be further decomposed using the following lemma from [36].

Lemma 3.14.

([36], Lemma 2.3) Let G1,2,3,4n×nG_{1,2,3,4}\in\mathbb{C}^{n\times n} be four diagonal matrices and 𝐆=[G1G2G3G4]\mathbf{G}=\begin{bmatrix}G_{1}&G_{2}\\ G_{3}&G_{4}\\ \end{bmatrix}. Suppose G2G_{2} and G3G_{3} are invertible. Then, it holds that

𝐆=𝐖[G1+G2M1G4+G3M2]𝐖1,𝐖=[InM2M1In],\displaystyle\mathbf{G}=\mathbf{W}\begin{bmatrix}G_{1}+G_{2}M_{1}&\\ &G_{4}+G_{3}M_{2}\\ \end{bmatrix}\mathbf{W}^{-1},\quad\mathbf{W}=\begin{bmatrix}I_{n}&M_{2}\\ M_{1}&I_{n}\\ \end{bmatrix},

provided 𝐖\mathbf{W} is invertible, where Inn×nI_{n}\in\mathbb{R}^{n\times n} is the identity matrix and

M1\displaystyle M_{1} =12G21(G4G1+(G4G1)2+4G2G3),\displaystyle=\frac{1}{2}G_{2}^{-1}\left(G_{4}-G_{1}+\sqrt{(G_{4}-G_{1})^{2}+4G_{2}G_{3}}\right),
M2\displaystyle M_{2} =12G31(G4G1+(G4G1)2+4G2G3).\displaystyle=\frac{-1}{2}G_{3}^{-1}\left(G_{4}-G_{1}+\sqrt{(G_{4}-G_{1})^{2}+4G_{2}G_{3}}\right).
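Lemma 3.14 is straightforward to verify numerically for diagonal blocks. The sketch below (with illustrative random diagonals; G_{2}, G_{3} are generically invertible) forms M_{1}, M_{2}, and \mathbf{W} as stated and checks the factorization:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
# Four illustrative complex diagonal blocks, stored as vectors of diagonals.
g1, g2, g3, g4 = (rng.standard_normal(n) + 1j * rng.standard_normal(n)
                  for _ in range(4))

# Entrywise square root from the lemma (complex branch).
s = np.sqrt((g4 - g1)**2 + 4 * g2 * g3)
m1 = (g4 - g1 + s) / (2 * g2)
m2 = -(g4 - g1 + s) / (2 * g3)

I = np.eye(n)
W = np.block([[I, np.diag(m2)], [np.diag(m1), I]])
D = np.diag(np.r_[g1 + g2 * m1, g4 + g3 * m2])      # blkdiag(G1+G2*M1, G4+G3*M2)
G = np.block([[np.diag(g1), np.diag(g2)], [np.diag(g3), np.diag(g4)]])

assert np.allclose(W @ D @ np.linalg.inv(W), G)
```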

Applying Lemma 3.14 to \mathcal{G}, i.e., writing \mathcal{G}=\mathcal{WDW}^{-1}, we can further decompose \mathcal{P}_{S} as follows:

𝒫S\displaystyle\mathcal{P}_{S} =𝒰~(𝒲𝒟𝒲1Im+τ[InIn](Lm))𝒰~\displaystyle=\widetilde{\mathcal{U}}\left(\mathcal{WDW}^{-1}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\widetilde{\mathcal{U}}^{*}
=𝒱(𝒟Im+τ[InIn](Lm))𝒱,\displaystyle=\mathcal{V}\left(\mathcal{D}\otimes I_{m}+\tau\begin{bmatrix}I_{n}&\\ &I_{n}\\ \end{bmatrix}\otimes(-L_{m})\right)\mathcal{V}^{*},
=𝒱[D1Im+τIn(Lm)D2Im+τIn(Lm)]𝒱,\displaystyle=\mathcal{V}\begin{bmatrix}D_{1}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &D_{2}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}\mathcal{V}^{*},

where 𝒱=𝒰~(𝒲Im)\mathcal{V}=\widetilde{\mathcal{U}}\left(\mathcal{W}\otimes I_{m}\right), and the matrices 𝒲\mathcal{W} and 𝒟=[D1D2]\mathcal{D}=\begin{bmatrix}D_{1}&\\ &D_{2}\\ \end{bmatrix} are explicitly known from Lemma 3.14.

Therefore, the computation of 𝐰=𝒫S1𝐯\mathbf{w}=\mathcal{P}_{S}^{-1}\mathbf{v} can be implemented by the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\mathcal{V}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}D_{1}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &D_{2}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}^{-1}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\mathcal{V}\widetilde{\mathbf{w}}.

Both Steps 1 and 3 can be computed by fast Fourier transforms in \mathcal{O}(mn\log{n}) operations. In Step 2, the shifted Laplacian systems can be solved efficiently, for instance, by a multigrid method. A detailed description of this highly effective implementation can be found in [15], for example.
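As a self-contained illustration of the diagonalization underlying Steps 1 and 3, the sketch below solves a single \omega-circulant system in \mathcal{O}(n\log n): scaling by \Gamma=\mathrm{diag}(e^{\mathbf{i}\zeta j/n}) turns the \omega-circulant into an ordinary circulant, which FFTs diagonalize. This is one common convention with illustrative data, not the paper's exact routine:

```python
import numpy as np
from scipy.linalg import circulant

def solve_omega_circ(c, b, zeta):
    """Solve C x = b where C is omega-circulant (omega = exp(i*zeta)) with
    first column c, using C = Gamma^{-1} Circ(c * gamma) Gamma."""
    n = len(c)
    gamma = np.exp(1j * zeta * np.arange(n) / n)   # diagonal of Gamma
    lam = np.fft.fft(c * gamma)                    # eigenvalues of Circ(c*gamma)
    return np.conj(gamma) * np.fft.ifft(np.fft.fft(gamma * b) / lam)

# Dense reference: a circulant whose strict upper triangle is scaled by omega.
n, zeta = 8, np.pi                                 # zeta = pi gives omega = -1
rng = np.random.default_rng(4)
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
C = circulant(c)
C[np.triu_indices(n, 1)] *= np.exp(1j * zeta)
b = rng.standard_normal(n)

x = solve_omega_circ(c, b, zeta)
assert np.allclose(C @ x, b)
```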

In each MINRES iteration, we need to compute a matrix-vector product in the form of |𝒫S|1𝐯|\mathcal{P}_{S}|^{-1}\mathbf{v} for some given vector 𝐯\mathbf{v}. The eigendecomposition of Lm-L_{m} is given by Lm=𝕌mΩm𝕌m-L_{m}=\mathbb{U}_{m}\Omega_{m}\mathbb{U}_{m}^{\top} with Lm-L_{m} assumed SPD, where 𝕌m\mathbb{U}_{m} is orthogonal and Ωm\Omega_{m} is a diagonal matrix containing the eigenvalues of Lm-L_{m}.

Hence, we can rewrite |𝒫S||\mathcal{P}_{S}| from (17) as follows:

|\mathcal{P}_{S}| =\begin{bmatrix}\sqrt{\mathcal{S}^{*}\mathcal{S}+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{\mathcal{S}\mathcal{S}^{*}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}
=𝒰[|Λ|2+α2InIm|Λ|2+α2InIm]𝒰,\displaystyle=\mathcal{U}\begin{bmatrix}\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}&\\ &\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}\\ \end{bmatrix}\mathcal{U}^{*},

where \mathcal{U}:=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes\mathbb{U}_{m})^{*}\\ \end{bmatrix} is a unitary matrix and \Lambda:=\Lambda_{n}\otimes I_{m}+\tau I_{n}\otimes\Omega_{m}.

Therefore, the computation of 𝐰=|𝒫S|1𝐯\mathbf{w}=|\mathcal{P}_{S}|^{-1}\mathbf{v} can be implemented with the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\mathcal{U}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}&\\ &\sqrt{|\Lambda|^{2}+\alpha^{2}{I}_{n}\otimes I_{m}}^{-1}\\ \end{bmatrix}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\mathcal{U}\widetilde{\mathbf{w}}.

When the spatial grid is uniformly partitioned, the orthogonal matrix \mathbb{U}_{m} becomes the discrete sine matrix \mathbb{S}_{m}. In this case, Steps 1 and 3 can be computed efficiently by the fast Fourier transform and the fast sine transform in \mathcal{O}(mn\log{n}) operations. Step 2 takes only \mathcal{O}(mn) operations, since the matrix involved is a simple diagonal matrix.
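On a uniform grid, the diagonal solve in Step 2 amounts to entrywise divisions in the sine-transformed basis. A minimal sketch of one shifted one-dimensional Laplacian solve via DST-I (illustrative sizes and a real illustrative shift; complex diagonal shifts are handled entrywise in exactly the same way):

```python
import numpy as np
from scipy.fft import dst, idst
from scipy.linalg import toeplitz

m, tau = 15, 0.1
h = 1.0 / (m + 1)

# -L_m: the SPD tridiagonal 1D negative Laplacian, diagonalized by DST-I.
negL = toeplitz(np.r_[2.0, -1.0, np.zeros(m - 2)]) / h**2
# Its eigenvalues: (2 - 2 cos(k*pi*h)) / h^2, k = 1, ..., m.
eig = (2 - 2 * np.cos(np.pi * np.arange(1, m + 1) * h)) / h**2

d = 0.7                                       # illustrative diagonal shift
v = np.random.default_rng(5).standard_normal(m)

# Sine transform, divide by shifted eigenvalues, transform back: O(m log m).
w = idst(dst(v, type=1) / (d + tau * eig), type=1)

assert np.allclose((d * np.eye(m) + tau * negL) @ w, v)
```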

The product of 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} for any vector 𝐯\mathbf{v} can be implemented following the above procedures. Note that

𝒫MS\displaystyle\mathcal{P}_{MS}
=[SnSn+α2InIm+τIn(Lm)SnSn+α2InIm+τIn(Lm)]\displaystyle=\begin{bmatrix}\sqrt{S_{n}^{*}S_{n}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{S_{n}S_{n}^{*}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}
=\widetilde{\mathcal{U}}\begin{bmatrix}\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})&\\ &\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m})\\ \end{bmatrix}\widetilde{\mathcal{U}}^{*},

where \widetilde{\mathcal{U}}=\begin{bmatrix}\Gamma_{n}\mathbb{F}_{n}\otimes I_{m}&\\ &(\Gamma_{n}\mathbb{F}_{n}\otimes I_{m})^{*}\\ \end{bmatrix} is a unitary matrix.

The computation of 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} can be implemented by the following three steps.

  1. Compute \widetilde{\mathbf{v}}=\widetilde{\mathcal{U}}^{*}\mathbf{v},

  2. Compute \widetilde{\mathbf{w}}=\begin{bmatrix}(\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}))^{-1}&\\ &(\sqrt{|\Lambda_{n}|^{2}+\alpha^{2}I_{n}}\otimes I_{m}+\tau I_{n}\otimes(-L_{m}))^{-1}\\ \end{bmatrix}\widetilde{\mathbf{v}},

  3. Compute \mathbf{w}=\widetilde{\mathcal{U}}\widetilde{\mathbf{w}}.

Both Steps 1 and 3 can be computed by fast Fourier transforms in \mathcal{O}(mn\log{n}) operations. As for Step 2, again, the shifted Laplacian systems can be solved efficiently by a multigrid method. We refer to [36] for more details regarding this efficient implementation.

4 Numerical examples

In this section, we provide several numerical results to show the performance of our proposed preconditioners. All numerical experiments are carried out using MATLAB 2022b on a PC with Intel i5-13600KF CPU 3.50GHz and 32 GB RAM.

The CPU time in seconds is measured using the MATLAB built-in functions tic/toc. Steps 1–3 in Section 3.5 are implemented with the built-in functions dst and fft for the discrete sine transform and the fast Fourier transform, respectively. All Krylov subspace solvers used are MATLAB built-in functions. We choose a zero initial guess and a stopping tolerance of 10^{-8} based on the reduction in relative residual norms for all Krylov subspace solvers tested, unless otherwise indicated.

We adopt the notation MINRES-|\mathcal{P}_{S}| and MINRES-\mathcal{P}_{MS} to represent the MINRES solvers with the preconditioners |\mathcal{P}_{S}| and \mathcal{P}_{MS}, respectively. Also, GMRES-\mathcal{P}_{S} represents the GMRES solver with the proposed preconditioner \mathcal{P}_{S}. We compare our proposed methods against the state-of-the-art solver recently proposed in [23] (denoted by PCG-\mathcal{P}_{\epsilon}), where an \epsilon-circulant preconditioner \mathcal{P}_{\epsilon} was constructed. Note that we did not compare with the matching Schur complement preconditioners proposed in [30, 21], since the numerical tests in [23] indicate that their effectiveness does not surpass that of PCG-\mathcal{P}_{\epsilon}.

In the related tables, we denote by 'Iter' the number of iterations for solving a linear system by an iterative solver within the given accuracy, and by 'DoF' the number of unknowns in the linear system. Let p^{*} and y^{*} denote the approximate solutions to p and y, respectively. Then, we define the error measure e_{h} as

eh=[yp][yp]Lτ(L2(Ω)).e_{h}=\left\|\begin{bmatrix}y^{*}\\ p^{*}\end{bmatrix}-\begin{bmatrix}y\\ p\end{bmatrix}\right\|_{L^{\infty}_{\tau}(L^{2}(\Omega))}. (21)

The time interval [0,T] and the spatial domain are partitioned uniformly with step sizes \tau=T/n in time and h=1/(m+1) in space, where h can be found in the related tables. Also, only \zeta=\pi is used for \omega=e^{\textbf{i}\zeta} in the block \omega-circulant preconditioners, since extensive trials consistently showed that the preconditioner corresponding to \zeta=\pi yielded the best results.
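The paper does not spell out the discrete realization of the norm in (21); one common choice, assumed here purely for illustration, is the maximum over time steps of the scaled discrete L^{2}(\Omega) norm (on a uniform 2D grid with spacing h, the scaling factor is h):

```python
import numpy as np

def error_measure(num, exact, h):
    """Illustrative discrete L-inf-in-time, L2-in-space error: num and exact
    are arrays of shape (n_time, m_space) stacking the unknowns per step;
    on a uniform 2D grid, ||u||_{L2} is approximated by h * ||u||_2."""
    diff = num - exact
    return np.max(h * np.linalg.norm(diff, axis=1))

# Toy data: constant error of 1e-3 on a 15 x 15 grid (225 points), 4 steps.
h = 1.0 / 16
u = np.zeros((4, 225))
v = np.full((4, 225), 1e-3)

assert np.isclose(error_measure(u, v, h), h * np.sqrt(225) * 1e-3)
```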

Example 1.

In this example [23], we consider the following two-dimensional problem of solving (1), where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=1a(x_{1},x_{2})=1, and

f(x1,x2,t)\displaystyle f(x_{1},x_{2},t) =(2π21)etsin(πx1)sin(πx2),\displaystyle=(2\pi^{2}-1)e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},
g(x1,x2,t)\displaystyle g(x_{1},x_{2},t) =etsin(πx1)sin(πx2),\displaystyle=e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},

The analytical solution is given by

y(x1,x2,t)\displaystyle y(x_{1},x_{2},t) =etsin(πx1)sin(πx2),p=0.\displaystyle=e^{-t}\sin{(\pi x_{1})}\sin{(\pi x_{2})},\quad p=0.

To support the result of Theorem 3.11, we display the eigenvalues of the matrix 𝒫MS1|𝒫S|\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| for various values of γ\gamma in Figure 1. The illustration confirms that the eigenvalues consistently fall within the interval [12,1][\frac{1}{\sqrt{2}},1], aligning with the expectations set forth in Remark 4. Furthermore, it is evident that as γ\gamma diminishes, the eigenvalues of 𝒫MS1|𝒫S|\mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| exhibit increased clustering around one. This trend can be attributed to the fact that α=τγ\alpha=\frac{\tau}{\sqrt{\gamma}} grows larger when γ\gamma is reduced, assuming the matrix size (or τ\tau) remains constant. As α\alpha becomes larger, it follows from their respective definitions that 𝒫MS\mathcal{P}_{MS} becomes closer to |𝒫S||\mathcal{P}_{S}|.

Figure 1: Eigenvalues of \mathcal{P}_{MS}^{-1}|\mathcal{P}_{S}| with n=16, m=15, and various \gamma: (a) \gamma=10^{-10}; (b) \gamma=10^{-6}; (c) \gamma=10^{-2}.

Table 1 displays the iteration counts, CPU times, and errors for GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| when applying the Crank-Nicolson method with different values of γ\gamma. Note that MINRES-𝒫MS\mathcal{P}_{MS} was not implemented for this example, as |𝒫S||\mathcal{P}_{S}| can already be efficiently implemented using fast sine transforms. We observe that: (i) both GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| with ζ=π\zeta=\pi perform excellently and stably, considering both iteration counts and CPU times across various values of γ\gamma; and (ii) the error decreases as the mesh is refined, except in the case when γ=1010\gamma=10^{-10}. In such an instance, the error exhibits only a slight decrease as the matrix size grows, which is likely due to the convergence tolerance used for MINRES not being sufficiently small to demonstrate the anticipated reduction.

In Table 2, we compare our preconditioners against PCG-\mathcal{P}_{\epsilon} from [23] with \epsilon=\frac{1}{2}\min\{\frac{\tau}{24\sqrt{\gamma}},\frac{\tau^{\frac{3}{2}}}{2\sqrt{6\gamma}T},\frac{\tau^{2}}{8\sqrt{3\gamma}T},\frac{1}{3}\}, where only the Crank-Nicolson method is considered. We report that for larger values \gamma\geq 10^{-6}, our proposed GMRES-\mathcal{P}_{S} with \omega=-1 significantly outperforms PCG-\mathcal{P}_{\epsilon}; namely, the computational time PCG-\mathcal{P}_{\epsilon} needs for convergence is roughly twice as large. When \gamma is small, GMRES-\mathcal{P}_{S} remains highly comparable with PCG-\mathcal{P}_{\epsilon} in terms of CPU time. Overall, GMRES-\mathcal{P}_{S} is stable and robust in both iteration counts and computational time for a wide range of \gamma.

Example 2.

In this example, we consider the following two-dimensional problem of solving (1) with a variable function a(x1,x2)a(x_{1},x_{2}), where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=105sin(πx1x2)a(x_{1},x_{2})=10^{-5}\sin{(\pi x_{1}x_{2})}, and

f(x1,x2,t)=sin(πt)sin(πx1)sin(πx2)+etx1(1x1)[2×105sin(πx1x2)x2(1x2)105πcos(πx1x2)x(12x2)]+etx2(1x2)[2×105sin(πx1x2)105πcos(πx1x2)x2(12x1)],f(x_{1},x_{2},t)=-\sin{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}+e^{-t}x_{1}(1-x_{1})[2\times 10^{-5}\sin{(\pi x_{1}x_{2})}\\ -x_{2}(1-x_{2})-10^{-5}\pi\cos{(\pi x_{1}x_{2})}x(1-2x_{2})]\\ +e^{-t}x_{2}(1-x_{2})[2\times 10^{-5}\sin{(\pi x_{1}x_{2})}-10^{-5}\pi\cos{(\pi x_{1}x_{2})}x_{2}(1-2x_{1})],
g(x1,x2,t)=γπcos(πt)sin(πx1)sin(πx2)+etx1(1x1)x2(1x2)105γπ2sin(πt)[2sin(πx1x2)sin(πx1)sin(πx2)+cos(πx1x2)(x1sin(πx1)cos(πx2)+x2cos(πx1)sin(πx2))].g(x_{1},x_{2},t)=-\gamma\pi\cos{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}+e^{-t}x_{1}(1-x_{1})x_{2}(1-x_{2})\\ -10^{-5}\gamma\pi^{2}\sin{(\pi t)}[-2\sin{(\pi x_{1}x_{2})}\sin{(\pi x_{1})}\sin{(\pi x_{2})}\\ +\cos{(\pi x_{1}x_{2})}(x_{1}\sin{(\pi x_{1})}\cos{(\pi x_{2})}+x_{2}\cos{(\pi x_{1})}\sin{(\pi x_{2})})].

The analytical solution is given by

y(x1,x2,t)=etx1(1x1)x2(1x2),p(x1,x2,t)=γsin(πt)sin(πx1)sin(πx2).\displaystyle y(x_{1},x_{2},t)=e^{-t}x_{1}(1-x_{1})x_{2}(1-x_{2}),\quad p(x_{1},x_{2},t)=\gamma\sin{(\pi t)}\sin{(\pi x_{1})}\sin{(\pi x_{2})}.

In this example, the direct application of MINRES-|𝒫S||\mathcal{P}_{S}| is not feasible, since Lm-L_{m} cannot be diagonalized by fast transform methods. Consequently, we adopt MINRES-𝒫MS\mathcal{P}_{MS}, which incorporates a multigrid method. Specifically, to solve the shifted Laplacian linear system (detailed in Subsection 3.5) and compute 𝒫MS1𝐯\mathcal{P}_{MS}^{-1}\mathbf{v} for any vector 𝐯\mathbf{v}, we apply one iteration of the V-cycle geometric multigrid method, with the Gauss-Seidel method employed as the pre-smoother.

Table 3 shows the iteration numbers, CPU times, and errors of GMRES-𝒫S\mathcal{P}_{S} and MINRES-𝒫MS\mathcal{P}_{MS} when the Crank-Nicolson method is applied with a range of γ\gamma. This example investigates the effectiveness of our solvers when a(x1,x2)a(x_{1},x_{2}) in (3) is not constant. The results show that (i) GMRES-𝒫S\mathcal{P}_{S} with ω=1\omega=-1 maintains relatively stable iteration numbers and CPU times across a wide range of γ\gamma; (ii) MINRES-𝒫MS\mathcal{P}_{MS} performs well for small γ\gamma, but its efficiency decreases as γ\gamma increases, a phenomenon also observed and reported in a previous study [17]. This behavior may be attributed to the eigenvalue distribution of 𝒜\mathcal{A}. Specifically, when γ\gamma is very small compared with τ\tau, the parameter α=τγ\alpha=\frac{\tau}{\sqrt{\gamma}} is quite large. Indeed, as γ\gamma approaches 0+0^{+}, it is corroborated by [17, Corollary 3.2] that the matrix-sequence {𝒜α}n\left\{{\mathcal{A}\over\alpha}\right\}_{n} has eigenvalues clustered around ±1\pm 1, which facilitates solving the all-at-once system. Additionally, as discussed in the preceding example, the increasing size of α\alpha makes 𝒫MS\mathcal{P}_{MS} closely resemble |𝒫S||\mathcal{P}_{S}|, which in turn leads to an improved preconditioning effect.
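To put the size of α\alpha in perspective, taking τ=25\tau=2^{-5} for illustration (our assumption; the tables report the spatial mesh hh), α=τ/γ\alpha=\tau/\sqrt{\gamma} spans roughly four orders of magnitude over the reported range of γ\gamma:

```python
import math

tau = 2.0 ** -5                       # illustrative time step (our assumption)
for gamma in (1e-2, 1e-6, 1e-10):
    alpha = tau / math.sqrt(gamma)
    print(f"gamma = {gamma:.0e}:  alpha = tau/sqrt(gamma) = {alpha:.4g}")
```

Here α\alpha grows from 0.31250.3125 at γ=102\gamma=10^{-2} to 31253125 at γ=1010\gamma=10^{-10}, consistent with the clustering behavior described above.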

Example 3.

This example aims to test the robustness of our proposed method with the (homogeneous) Neumann boundary condition. We consider the following two-dimensional problem, where Ω=(0,1)2\Omega=(0,1)^{2}, T=1T=1, a(x1,x2)=103a(x_{1},x_{2})=10^{-3}, and

f(x1,x2,t)\displaystyle f(x_{1},x_{2},t) =(103×8π21)etcos(2πx1)cos(2πx2),\displaystyle=(10^{-3}\times 8\pi^{2}-1)e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})},
g(x1,x2,t)\displaystyle g(x_{1},x_{2},t) =etcos(2πx1)cos(2πx2).\displaystyle=e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})}.

The analytical solution is given by

y(x1,x2,t)\displaystyle y(x_{1},x_{2},t) =etcos(2πx1)cos(2πx2),p=0.\displaystyle=e^{-t}\cos{(2\pi x_{1})}\cos{(2\pi x_{2})},\quad p=0.

Note that we use one iteration of the V-cycle geometric multigrid method, with the Gauss-Seidel method as the pre-smoother, to solve the shifted Laplacian linear system for GMRES-𝒫S\mathcal{P}_{S}. Also, MINRES is not applicable in this case with the Neumann boundary condition, since 𝒜\mathcal{A} is not symmetric; thus, we resort to GMRES-𝒫S\mathcal{P}_{S}.
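In practice, wiring such an approximate solve into a Krylov method only requires exposing it as a linear operator. The following SciPy sketch is ours, not the paper's code: a small non-symmetric test matrix stands in for the all-at-once matrix, and a simple Jacobi (diagonal) solve stands in for the multigrid V-cycle preconditioner.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, gmres

n = 200
# Non-symmetric, diagonally dominant stand-in for the all-at-once matrix
# (hypothetical test data; the actual matrix comes from the discretization).
A = diags([-1.2, 4.0, -0.8], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Preconditioner: one approximate solve per application. Here a Jacobi
# (diagonal) solve stands in for the single V-cycle used in the experiments.
d = A.diagonal()
M = LinearOperator((n, n), matvec=lambda v: v / d)

x, info = gmres(A, b, M=M, atol=1e-12)
print(info, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

The same `LinearOperator` pattern accepts any black-box approximate solve, so swapping the Jacobi step for a V-cycle changes one line.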

Table 4 shows the iteration numbers, CPU time, and error of GMRES-𝒫S\mathcal{P}_{S} when the Crank-Nicolson method is applied with various values of γ\gamma. The results indicate that GMRES-𝒫S\mathcal{P}_{S} maintains stable and low iteration numbers across a wide range of γ\gamma.

5 Conclusions

In this work, we have provided a unifying framework for circulant-based preconditioning applied to the concerned parabolic control problem. The framework is applicable to both first order (i.e., θ=1\theta=1) and second order (i.e., θ=1/2\theta=1/2) time discretization schemes. Moreover, it encompasses both the circulant (i.e., ω=1\omega=1) and skew-circulant (i.e., ω=1\omega=-1 and θ=1\theta=1) preconditioners previously proposed in existing works. We note that it appears feasible to extend our preconditioning theory to various implicit time discretization methods, such as backward differentiation formulas, as long as the block Toeplitz structure within the resulting all-at-once linear system remains intact. When considering more general discretization schemes, such as (semi-)implicit Runge-Kutta methods, a good starting point could be the recent study [20], which develops a robust preconditioning approach designed specifically for all-at-once linear systems derived from the Runge-Kutta time discretization of time-dependent PDEs.

Specifically, we have proposed a class of block ω\omega-circulant based preconditioners for the all-at-once system of the parabolic control problem. First, when GMRES is considered, we have proposed a PinT preconditioner 𝒫S\mathcal{P}_{S} for the concerned system. Second, when MINRES is used for the symmetrized system, we have constructed an ideal preconditioner |𝒜||\mathcal{A}|, which serves as a prototype for designing efficient preconditioners. Based on it, we have designed two novel preconditioners |𝒫S||\mathcal{P}_{S}| and 𝒫MS\mathcal{P}_{MS} for the same problem, both of which can be efficiently implemented in a PinT manner. All proposed preconditioners have been shown to be effective both in numerical tests and in a theoretical study. Our numerical tests demonstrate that the proposed solver GMRES-𝒫S\mathcal{P}_{S} with ω=1\omega=-1 and θ=1/2\theta=1/2 achieves rapid convergence, consistently maintaining stable iteration counts across a wide range of γ\gamma values.
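To make the PinT implementation concrete: an ω\omega-circulant matrix, i.e., a Toeplitz-like matrix whose wrap-around entries are scaled by ω\omega, is diagonalized by the FFT after a diagonal scaling, so each preconditioner application costs O(nlogn)O(n\log n). The following self-contained sketch (scalar case; in the preconditioners the same idea is applied blockwise) illustrates this standard construction:

```python
import numpy as np

def omega_circulant(c, omega):
    """Dense n-by-n omega-circulant with first column c (for verification only):
    entries that wrap around the top are scaled by omega."""
    n = len(c)
    A = np.empty((n, n), dtype=complex)
    for j in range(n):
        for k in range(n):
            A[j, k] = c[j - k] if j >= k else omega * c[n + j - k]
    return A

def omega_circulant_solve(c, omega, b):
    """Solve C_omega x = b in O(n log n) operations, using
    C_omega = D^{-1} F^{-1} diag(lam) F D with D = diag(eta^0, ..., eta^{n-1}),
    eta a principal n-th root of omega, and lam = fft(c * eta^arange(n))."""
    n = len(c)
    eta = complex(omega) ** (1.0 / n)
    scale = eta ** np.arange(n)
    lam = np.fft.fft(c * scale)                  # eigenvalues of C_omega
    return np.fft.ifft(np.fft.fft(b * scale) / lam) / scale

# Demo: circulant (omega = 1) and skew-circulant (omega = -1) solves.
n = 64
c = np.zeros(n); c[0], c[1], c[-1] = 4.0, -1.0, -1.0   # well-conditioned generator
b = np.sin(np.arange(n, dtype=float))
for omega in (1.0, -1.0):
    x = omega_circulant_solve(c, omega, b)
    print(omega, np.max(np.abs(omega_circulant(c, omega) @ x - b)))
```

Here ω=1\omega=1 recovers the circulant case and ω=1\omega=-1 the skew-circulant case; for real data the computed solution is real up to rounding, so one may take its real part.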

We stress that the development of our proposed MINRES approach for optimal control problems is still in its infancy. As future work, we plan to develop more efficient preconditioned Krylov subspace solvers by integrating an ϵ\epsilon-circulant matrix, where a small ϵ>0\epsilon>0 is chosen. In recent years, this approach has been shown successful for solving various PDEs (see, e.g., [24, 26, 33, 27]), achieving clustered singular values without outliers. We will investigate whether such a combination can reduce the number of singular value/eigenvalue outliers produced by our preconditioners, which could achieve parameter-independent convergence in the MINRES framework.

Acknowledgments

The work of Sean Hon was supported in part by the Hong Kong RGC under grant 22300921 and a start-up grant from the Croucher Foundation.

Table 1: Results of GMRES-𝒫S\mathcal{P}_{S} and MINRES-|𝒫S||\mathcal{P}_{S}| for Example 1 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S} MINRES-|𝒫S||\mathcal{P}_{S}|
Iter CPU ehe_{h} Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.039 1.18e-9 3 0.033 3.18e-9
262^{-6} 508032 3 0.35 1.18e-9 5 0.48 1.45e-9
272^{-7} 4129024 3 3.32 1.04e-9 6 4.91 1.04e-9
282^{-8} 33292800 3 27.96 4.49e-10 6 40.91 4.49e-10
10810^{-8} 252^{-5} 61504 3 0.030 1.12e-7 6 0.045 1.26e-7
262^{-6} 508032 3 0.37 6.71e-8 6 0.52 6.71e-8
272^{-7} 4129024 3 3.30 1.81e-8 6 4.90 1.81e-8
282^{-8} 33292800 3 27.85 4.53e-9 6 40.83 4.53e-9
10610^{-6} 252^{-5} 61504 3 0.029 2.90e-6 6 0.044 2.90e-6
262^{-6} 508032 3 0.34 7.26e-7 6 0.56 7.26e-7
272^{-7} 4129024 3 3.35 1.81e-7 6 4.91 1.81e-7
282^{-8} 33292800 3 27.76 4.54e-8 6 40.98 4.54e-8
10410^{-4} 252^{-5} 61504 3 0.029 2.87e-5 6 0.045 2.87e-5
262^{-6} 508032 3 0.39 7.19e-6 6 0.51 7.19e-6
272^{-7} 4129024 3 3.34 1.80e-6 6 4.91 1.80e-6
282^{-8} 33292800 3 27.96 4.49e-7 6 40.84 4.49e-7
10210^{-2} 252^{-5} 61504 3 0.030 2.77e-4 6 0.046 2.77e-4
262^{-6} 508032 3 0.37 6.91e-5 6 0.55 6.91e-5
272^{-7} 4129024 3 3.35 1.73e-5 6 4.90 1.73e-5
282^{-8} 33292800 3 27.80 4.31e-6 6 40.96 4.31e-6
Table 2: Results of PCG-𝒫ϵ\mathcal{P}_{\epsilon} for Example 1 with Crank-Nicolson method (θ=12\theta=\frac{1}{2})
γ\gamma hh DoF PCG-𝒫ϵ\mathcal{P}_{\epsilon}
Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.084 1.18e-9
262^{-6} 508032 3 0.23 1.18e-9
272^{-7} 4129024 4 2.49 1.04e-9
282^{-8} 33292800 5 24.55 4.49e-10
10810^{-8} 252^{-5} 61504 4 0.023 1.12e-7
262^{-6} 508032 5 0.34 6.71e-8
272^{-7} 4129024 5 3.01 1.81e-8
282^{-8} 33292800 5 24.70 4.53e-9
10610^{-6} 252^{-5} 61504 5 0.046 2.90e-6
262^{-6} 508032 5 0.33 7.26e-7
272^{-7} 4129024 6 3.50 1.81e-7
282^{-8} 33292800 6 28.64 4.54e-8
10410^{-4} 252^{-5} 61504 9 0.063 2.87e-5
262^{-6} 508032 9 0.57 7.19e-6
272^{-7} 4129024 9 4.95 1.80e-6
282^{-8} 33292800 10 44.46 4.49e-7
10210^{-2} 252^{-5} 61504 11 0.061 2.77e-4
262^{-6} 508032 11 0.68 6.91e-5
272^{-7} 4129024 11 5.95 1.73e-5
282^{-8} 33292800 12 53.96 4.31e-6
Table 3: Results of GMRES-𝒫S\mathcal{P}_{S} and MINRES-𝒫MS\mathcal{P}_{MS} for Example 2 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S} MINRES-𝒫MS\mathcal{P}_{MS}
Iter CPU ehe_{h} Iter CPU ehe_{h}
101010^{-10} 252^{-5} 61504 3 0.14 6.60e-13 3 0.11 2.91e-10
262^{-6} 508032 3 0.67 4.68e-13 5 0.74 2.55e-11
272^{-7} 4129024 3 5.35 7.51e-13 6 6.27 4.43e-13
282^{-8} 33292800 3 48.26 8.40e-12 6 61.63 1.73e-11
10810^{-8} 252^{-5} 61504 3 0.13 6.30e-11 6 0.17 6.29e-11
262^{-6} 508032 3 0.64 5.48e-11 6 0.84 5.39e-11
272^{-7} 4129024 3 5.39 1.13e-10 7 7.51 7.28e-10
282^{-8} 33292800 3 59.97 1.14e-10 9 107.00 3.90e-11
10610^{-6} 252^{-5} 61504 3 0.17 2.53e-9 7 0.24 2.79e-9
262^{-6} 508032 3 0.81 1.26e-9 10 1.65 5.65e-10
272^{-7} 4129024 3 6.71 1.14e-9 10 11.98 3.22e-10
282^{-8} 33292800 3 74.15 1.14e-9 10 126.17 5.38e-10
10410^{-4} 252^{-5} 61504 5 0.25 1.53e-7 14 0.42 1.53e-7
262^{-6} 508032 5 1.24 3.40e-8 15 2.57 3.40e-8
272^{-7} 4129024 5 10.89 8.51e-9 18 23.00 8.51e-9
282^{-8} 33292800 5 100.60 2.13e-9 23 313.30 2.13e-9
10210^{-2} 252^{-5} 61504 5 0.25 1.16e-5 20 0.57 1.16e-5
262^{-6} 508032 5 1.54 2.90e-6 24 4.28 2.90e-6
272^{-7} 4129024 5 12.99 7.25e-7 30 39.82 7.25e-7
282^{-8} 33292800 5 121.50 1.81e-7 52 739.91 1.81e-7
Table 4: Results of GMRES-𝒫S\mathcal{P}_{S} for Example 3 with ζ=π\zeta=\pi and θ=12\theta=\frac{1}{2} (Crank-Nicolson)
γ\gamma hh DoF GMRES-𝒫S\mathcal{P}_{S}
Iter CPU ehe_{h}
101010^{-10} 252^{-5} 65536 3 0.23 1.51e-11
262^{-6} 524288 3 1.02 1.39e-11
272^{-7} 4194304 3 6.98 1.18e-11
282^{-8} 33554432 3 81.18 4.99e-12
10810^{-8} 252^{-5} 65536 3 0.18 1.43e-9
262^{-6} 524288 3 1.07 7.93e-10
272^{-7} 4194304 3 8.37 2.06e-10
282^{-8} 33554432 3 94.33 5.05e-11
10610^{-6} 252^{-5} 65536 3 0.22 3.69e-8
262^{-6} 524288 3 1.26 8.56e-9
272^{-7} 4194304 3 12.36 2.06e-9
282^{-8} 33554432 3 152.13 5.05e-10
10410^{-4} 252^{-5} 65536 3 0.27 3.73e-7
262^{-6} 524288 3 1.70 8.64e-8
272^{-7} 4194304 3 18.68 2.08e-8
282^{-8} 33554432 3 249.61 5.10e-9
10210^{-2} 252^{-5} 65536 3 0.35 4.10e-6
262^{-6} 524288 3 2.28 9.51e-7
272^{-7} 4194304 3 22.44 2.29e-7
282^{-8} 33554432 3 282.79 5.61e-8

References

  • [1] Owe Axelsson and Maya Neytcheva. Eigenvalue estimates for preconditioned saddle point matrices. Numerical Linear Algebra with Applications, 13(4):339-360, 2006.
  • [2] Daniele Bertaccini and Michael K. Ng. Block ω\omega-circulant preconditioners for the systems of differential equations. Calcolo, 40(2):71-90, 2003.
  • [3] Dario A. Bini, Guy Latouche, and Beatrice Meini. Numerical Methods for Structured Markov Chains. Oxford University Press, New York, 2005.
  • [4] Arne Bouillon, Giovanni Samaey, and Karl Meerbergen. On generalized preconditioners for time-parallel parabolic optimal control. arXiv preprint, arXiv:2302.06406, 2023.
  • [5] Jan H. Brandts and Ricardo Reis da Silva. Computable eigenvalue bounds for rank-kk perturbations. Linear Algebra and Its Applications, 432(12), 3100-3116, 2010.
  • [6] Alfio Borzì and Volker Schulz. Computational optimization of systems governed by partial differential equations. Society for Industrial and Applied Mathematics. 2011.
  • [7] Raymond H. Chan and Michael K. Ng. Conjugate gradient methods for Toeplitz systems. SIAM Review, 38(3), 427-482, 1996.
  • [8] Daniel Potts and Gabriele Steidl. Preconditioners for ill-conditioned Toeplitz matrices. BIT Numerical Mathematics, 39(3), 513-533, 1999.
  • [9] Daniel Potts and Gabriele Steidl. Preconditioners for ill-conditioned Toeplitz systems constructed from positive kernels. SIAM Journal on Scientific Computing, 22(5), 1741-1761, 2001.
  • [10] Paola Ferrari, Isabella Furci, Sean Hon, Mohammad Ayman-Mursaleen, and Stefano Serra-Capizzano. The eigenvalue distribution of special 2-by-2 block matrix-sequences with applications to the case of symmetrized Toeplitz structures. SIAM Journal on Matrix Analysis and Applications, 40(3), 1066-1086, 2019.
  • [11] Carlo Garoni and Stefano Serra-Capizzano. Generalized locally Toeplitz sequences: Theory and applications, volume 1. Springer, Cham, 2017.
  • [12] Carlo Garoni and Stefano Serra-Capizzano. Generalized locally Toeplitz sequences: Theory and applications, volume 2. Springer, Cham, 2018.
  • [13] Anne Greenbaum, Vlastimil Pták, and Zdeněk Strakoš. Any nonincreasing convergence curve is possible for GMRES. SIAM Journal on Matrix Analysis and Applications, 17(3):465–469, 1996.
  • [14] Michael Hinze, Rene Pinnau, Michael Ulbrich, and Stefan Ulbrich. Optimization with PDE constraints (Vol. 23). Springer Science & Business Media. 2008.
  • [15] Yunhui He and Jun Liu. A Vanka-type multigrid solver for complex-shifted Laplacian systems from diagonalization-based parallel-in-time algorithms. Applied Mathematics Letters, 132, 108125, 2022.
  • [16] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press. 1990.
  • [17] Sean Hon, Jiamei Dong, and Stefano Serra-Capizzano. A preconditioned MINRES method for optimal control of wave equations and its asymptotic spectral distribution theory. SIAM Journal on Matrix Analysis and Applications, 44(4):1477–1509, 2023.
  • [18] Sean Hon, Po Yin Fung, Jiamei Dong, and Stefano Serra-Capizzano. A sine transform based preconditioned MINRES method for all-at-once systems from constant and variable-coefficient evolutionary PDEs. Numerical Algorithms, 2023.
  • [19] Sean Hon, Stefano Serra-Capizzano, and Andy Wathen. Band-Toeplitz preconditioners for ill-conditioned Toeplitz systems. BIT Numerical Mathematics. 2021.
  • [20] Santolo Leveque, Luca Bergamaschi, Ángeles Martínez, and John W. Pearson. Fast iterative solver for the all-at-once Runge–Kutta discretization. arXiv preprint, arXiv:2303.02090, 2023.
  • [21] Santolo Leveque and John W. Pearson. Fast iterative solver for the optimal control of time-dependent PDEs with Crank–Nicolson discretization in time. Numerical Linear Algebra with Applications, 29:e2419, 2022.
  • [22] Buyang Li, Jun Liu, and Mingqing Xiao. A new multigrid method for unconstrained heat optimal control problems. Journal of Computational and Applied Mathematics, 326, 358-373, 2017.
  • [23] Xue-lei Lin and Shu-Lin Wu. A parallel-in-time preconditioner for Crank-Nicolson discretization of a parabolic optimal control problem. arXiv preprint, arXiv:2109.12524, 2023.
  • [24] Xue-lei Lin and Michael Ng. An All-at-once preconditioner for evolutionary partial differential equations. SIAM Journal on Scientific Computing, 43(4), A2766–A2784, 2021.
  • [25] Jacques Louis Lions. Optimal control of systems governed by partial differential equations. Springer-Verlag Berlin Heidelberg. 1971.
  • [26] Jun Liu and Shu-Lin Wu. A fast block α\alpha-circulant preconditioner for all-at-once systems from wave equations. SIAM Journal on Matrix Analysis and Applications, 41(4), 1912–1943, 2020.
  • [27] Martin J. Gander, Jun Liu, Shu-Lin Wu, Xiaoqiang Yue, and Tao Zhou. ParaDiag: parallel-in-time algorithms based on the diagonalization technique. arXiv e-prints, arXiv:2005.09158, 2020.
  • [28] Malcolm F. Murphy, Gene H. Golub, and Andrew J. Wathen. A note on preconditioning for indefinite linear systems. SIAM Journal on Scientific Computing, 21(6):1969–1972, 2000.
  • [29] Michael K. Ng. Iterative Methods for Toeplitz Systems. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2004.
  • [30] John W. Pearson and Andrew J. Wathen. A new approximation of the Schur complement in preconditioners for PDE-constrained optimization. Numerical Linear Algebra and Applications, 19(5):816-829, 2012.
  • [31] John W. Pearson, Martin Stoll, and Andrew J. Wathen. Regularization-robust preconditioners for time-dependent PDE-constrained optimization problems. SIAM Journal on Matrix Analysis and Applications, 33:1126-1152, 2012.
  • [32] Gilbert Strang. A Proposal for Toeplitz Matrix Calculations. Studies in Applied Mathematics, 74(2), 171-176, 1986.
  • [33] Yafei Sun, Shu-Lin Wu, and Yingxiang Xu. A parallel-in-time implementation of the Numerov method for wave equations. Journal of Scientific Computing, 90(1), 2022.
  • [34] Ferdi Tröltzsch. Optimal control of partial differential equations: theory, methods, and applications (Vol. 112). American Mathematical Soc. 2010.
  • [35] Andrew J. Wathen. Preconditioning. Acta Numerica. 24, 329-376, 2015.
  • [36] Shu-Lin Wu and Tao Zhou. Diagonalization-based parallel-in-time algorithms for heat PDE-constrained optimization problems. ESAIM: Control, Optimisation and Calculus of Variations, 26, 88, 2020.
  • [37] Shu-Lin Wu, Zhiyong Wang, and Tao Zhou. PinT Preconditioner for Forward-Backward Evolutionary Equations. SIAM Journal on Matrix Analysis and Applications, 44(4): 1771-1798, 2023.