This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\newsiamthm

claimClaim \newsiamremarkremarkRemark \newsiamremarkhypothesisHypothesis

On GSOR, the Generalized Successive Overrelaxation Method for Double Saddle-Point Problems

Na Huang Department of Applied Mathematics, College of Science, China Agricultural University, Beijing, China. E-mail: [email protected]. Research partially supported by National Natural Science Foundation of China (No. 12001531).    Yu-Hong Dai LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. E-mail: [email protected].    Dominique Orban GERAD and Department of Mathematics and Industrial Engineering, Polytechnique Montréal, QC, Canada. E-mail: [email protected]. Research partially supported by an NSERC Discovery Grant.    Michael A. Saunders Systems Optimization Laboratory, Department of Management Science and Engineering, Stanford University, Stanford, CA, USA. E-mail: [email protected].
Abstract

We consider the generalized successive overrelaxation (GSOR) method for solving a class of block three-by-three saddle-point problems. Based on the necessary and sufficient conditions for all roots of a real cubic polynomial to have modulus less than one, we derive convergence results under reasonable assumptions. We also analyze a class of block lower triangular preconditioners induced from GSOR and derive explicit and sharp spectral bounds for the preconditioned matrices. We report numerical experiments on test problems from the liquid crystal director model and the coupled Stokes-Darcy flow, demonstrating the usefulness of GSOR.

keywords:
iterative methods, double saddle-point systems, saddle-point problems, matrix splitting, successive overrelaxation, preconditioning.
{AMS}

65F10, 65F50.

1 Introduction

We consider the double saddle-point problem

(1) 𝒜w:=(ABTCTB00C0D)(xyz)=(fgh)=:b,{\cal A}\,w:=\begin{pmatrix}A&B^{T}\!&C^{T}\!\\ B&0&0\\ C&0&-D\end{pmatrix}\begin{pmatrix}x\\ y\\ z\end{pmatrix}=\begin{pmatrix}f\\ g\\ h\end{pmatrix}=:b,

where An×nA\in\mathds{R}^{n\times n} and Dp×pD\in\mathds{R}^{p\times p} are symmetric positive definite (SPD) matrices, Bm×nB\in\mathds{R}^{m\times n} has full row rank, and Cp×nC\in\mathds{R}^{p\times n}. Linear systems like (1) arise from many practical applications, such as mixed and mixed-hybrid finite element approximation of the liquid crystal director model [17] and coupled Stokes-Darcy flow [6, 12, 13, 8], and interior methods for quadratic programming problems [19, 9, 10]. We emphasize that (1) is importantly different from the block 3×33\times 3 systems considered by Huang et al. in [15, 14].

In principle, (1) can be treated as the block 22×\times22 saddle-point problem

(2) (HETEW)(xy)=(fg),\begin{pmatrix}H&E^{T}\!\\ E&-W\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}f\\ g\end{pmatrix},

which has been studied for decades [5]. We focus here on splitting iterative methods for (1) by fully utilizing the special structure of 𝒜\cal A. The generalized successive overrelaxation (GSOR) method of Bai et al. [1] is for (2) with W=0W=0. We extend GSOR to (1) by introducing three parameters. The convergence analysis of this new GSOR method is quite different from that of stationary methods; we derive convergence conditions based on the necessary and sufficient conditions for all roots of a real cubic polynomial to have modulus less than one. We also analyze a class of block lower triangular preconditioners induced from GSOR and show that all eigenvalues of the preconditioned matrices are positive real and can be clustered by appropriate selections of parameters.

For linear systems discretized from a mixed Stokes-Darcy model, Cai et al. [6] proposed preconditioning techniques by treating (1) as system (2) with W=0W=0. Ramage and Gartland Jr. [17] studied a preconditioned nullspace method for solving systems (1) that arise from discretizations of continuum models for the orientational properties of liquid crystals, in which they also partitioned 𝒜\cal A into a block 22×\times22 form. Recently, based on the special structure of 𝒜\cal A in (1), several preconditioners were proposed to accelerate Krylov subspace methods. Beik and Benzi [2, 3] analyzed several block diagonal and block triangular preconditioners and derived bounds for the eigenvalues of the preconditioned matrices. An alternating positive semidefinite splitting (APSS) preconditioner and its relaxed variant were proposed by Liang and Zhang [16] to solve double saddle-point problems arising from liquid crystal director models. The improved APSS preconditioner of Ren et al. [18] and the two-parameter block triangular preconditioner of Zhu et al. [21] were also constructed to deal with the same saddle-point problem. However, the latter preconditioners either do not fully exploit the special structure of 𝒜\cal A or need to solve several complicated and dense linear systems at each iteration.

It is generally difficult to analyze the spectral properties of a “full” block three-by-three matrix; i.e., one that cannot be reduced to a block 22×\times22 matrix. Little literature exists on iterative schemes for (1). Uzawa-like methods based on the splitting

(3) 𝒜=(A00B1αQ0C0M)(0BTCT01αQ000N){\cal A}=\begin{pmatrix}A&0&0\\ B&-\frac{1}{\alpha}Q&0\\ C&0&M\end{pmatrix}-\begin{pmatrix}0&-B^{T}\!&-C^{T}\!\\ 0&-\frac{1}{\alpha}Q&0\\ 0&0&N\end{pmatrix}

were studied by Benzi and Beik [4], where α>0\alpha>0 and the SPD matrix QQ are given, and D=NMD=N-M with MM negative definite. In addition, given a parameter ω0\omega\neq 0, they split 𝒜\cal A into

𝒜=1ω(ABT0B00ωC0D)1ω((1ω)A(1ω)BTωCT(1ω)B0000(1ω)D){\cal A}=\frac{1}{\omega}\begin{pmatrix}A&B^{T}\!&0\\ B&0&0\\ \omega C&0&-D\end{pmatrix}-\frac{1}{\omega}\begin{pmatrix}(1-\omega)A&(1-\omega)B^{T}\!&-\omega C^{T}\!\\ (1-\omega)B&0&0\\ 0&0&-(1-\omega)D\end{pmatrix}

and proposed a generalized block successive overrelaxation (GBSOR) method. The convergence analysis of these two methods is similar to that of stationary iterative schemes for block 22×\times22 linear systems, where convergence conditions are derived from a quadratic polynomial equation of the eigenvalues of the iteration matrix. Moreover, GBSOR needs to solve four linear systems at each step: two of the form Ax=r1Ax=r_{1}, one BA1BTy=r2BA^{-1}B^{T}\!y=r_{2}, and one Dz=r3Dz=r_{3}. By partitioning 𝒜\cal A into system (2) with H=AH=A, Dou and Liang [7] construct a class of block alternating splitting implicit (BASI) iteration methods. At each step, BASI needs to solve several linear systems of the form αI+A+aBTB+bCTC\alpha I+A+aB^{T}\!B+bC^{T}\!C and αI+D+cCCT\alpha I+D+cCC^{T}\!, where II is the identity and α\alpha, aa, bb, cc are real scalar constants.

The paper is organized as follows. In Section 2, we present the generalized successive overrelaxation method. In Section 3, convergence of GSOR is established under reasonable assumptions. Preconditioners are analyzed in Section 4. Numerical experiments are reported in Section 5. Conclusions are summarized in Section 6.

Notation

For any Sr×rS\in\mathds{R}^{r\times r}, its spectral radius, inverse and transpose are denoted ρ(S)\rho(S), S1S^{-1} and STS^{T}, respectively. For any srs\in\mathds{C}^{r}, its conjugate transpose is denoted ss^{*}.

2 The generalized successive overrelaxation (GSOR) method

In this section, we present GSOR for solving the double saddle-point problem (1). We consider the equivalent unsymmetric system

(4) 𝒜^w:=(ABTCTB00C0D)(xyz)=(fgh)=:b^.\widehat{{\cal A}}\,w:=\begin{pmatrix}A&B^{T}\!&C^{T}\!\\ -B&0&0\\ -C&0&D\end{pmatrix}\begin{pmatrix}x\\ y\\ z\end{pmatrix}=\begin{pmatrix}f\\ -g\\ -h\end{pmatrix}=:\widehat{b}.

Although 𝒜^\widehat{\cal A} is unsymmetric, it has certain desirable properties:

  1. 1.

    𝒜^\widehat{\cal A} is semipositive real: vT𝒜^v0v^{T}\!\widehat{\cal A}v\geq 0 for all vn+m+pv\in\mathds{R}^{n+m+p}.

  2. 2.

    𝒜^\widehat{\cal A} is positive semistable; i.e., its eigenvalues have nonnegative real part.

These properties enable convergence of the classical successive overrelaxation (SOR) method [20]. To improve efficiency, we modify the classical SOR method and propose a generalized version that extends the GSOR method considered in [1].

By introducing the three matrices

(5) 𝒟=(A000P000D),=(000B00C00),𝒰=(0BTCT0P0000),{{\cal D}}=\begin{pmatrix}A&0&0\\ 0&P&0\\ 0&0&D\end{pmatrix},\quad{{\cal L}}=\begin{pmatrix}0&0&0\\ B&0&0\\ C&0&0\end{pmatrix},\quad{{\cal U}}=\begin{pmatrix}0&-B^{T}\!&-C^{T}\!\\ 0&P&0\\ 0&0&0\end{pmatrix},

where Pm×mP\in\mathds{R}^{m\times m} is SPD, we can split 𝒜^\widehat{\cal A} as

𝒜^=𝒟𝒰.\widehat{\cal A}={{\cal D}}-{{\cal L}}-{{\cal U}}.

Let ω\omega, τ\tau, and θ\theta be three nonzero reals, InI_{n}, ImI_{m}, and IpI_{p} be identity matrices of appropriate order, and

Ω=(ωIn000τIm000θIp).\Omega=\begin{pmatrix}\omega I_{n}&0&0\\ 0&\tau I_{m}&0\\ 0&0&\theta I_{p}\end{pmatrix}.

Consider the following iteration for (1):

wk+1\displaystyle w_{k+1} =\displaystyle= (𝒟Ω)1[(IΩ)𝒟+Ω𝒰]wk+(𝒟Ω)1Ωb^\displaystyle({{\cal D}}-\Omega{{\cal L}})^{-1}[(I-\Omega){{\cal D}}+\Omega{{\cal U}}]w_{k}+({{\cal D}}-\Omega{{\cal L}})^{-1}\Omega\widehat{b}
=:\displaystyle=: 𝒯wk+(𝒟Ω)1Ωb^.\displaystyle{{\cal T}}w_{k}+({{\cal D}}-\Omega{{\cal L}})^{-1}\Omega\widehat{b}.

From (5),

(7) 𝒟Ω\displaystyle{{\cal D}}-\Omega{{\cal L}} =(A00τBP0θC0D),\displaystyle=\begin{pmatrix}A&0&0\\ -\tau B&P&0\\ -\theta C&0&D\end{pmatrix},
(8) (IΩ)𝒟+Ω𝒰\displaystyle(I-\Omega){{\cal D}}+\Omega{{\cal U}} =((1ω)AωBTωCT0P000(1θ)D).\displaystyle=\begin{pmatrix}(1-\omega)A&-\omega B^{T}\!&-\omega C^{T}\!\\ 0&P&0\\ 0&0&(1-\theta)D\end{pmatrix}.

Substituting (7) and (8) into (2), we obtain GSOR as stated in Algorithm 1.

Algorithm 1 The GSOR method
1:  Choose (x0,y0,z0)n+m+p(x_{0},y_{0},z_{0})\in\mathds{R}^{n+m+p}, Pm×mP\in\mathds{R}^{m\times m} SPD, and ω\omega, τ\tau, θ>0\theta>0.
2:  for k=0,1,k=0,1,\dots do
3:     Compute (xk+1,yk+1,zk+1)(x_{k+1},y_{k+1},z_{k+1}) according to the iteration
(2.7) {xk+1=xk+ωA1(fAxkBTykCTzk),yk+1=yk+τP1(Bxk+1g),zk+1=zk+θD1(Cxk+1Dzkh).\left\{\begin{array}[]{l}x_{k+1}=x_{k}+\omega A^{-1}(f-Ax_{k}-B^{T}\!y_{k}-C^{T}\!z_{k}),\\[2.0pt] y_{k+1}=y_{k}+\tau P^{-1}(Bx_{k+1}-g),\\[2.0pt] z_{k+1}=z_{k}+\theta D^{-1}(Cx_{k+1}-Dz_{k}-h).\end{array}\right.
4:  end for

At each step, GSOR needs to solve only three SPD systems (of order nn, mm, and pp). This is easier than in GBSOR [4], which solves four linear systems involving AA, AA, BA1BTBA^{-1}B^{T}\!, and DD.

Iteration scheme (2.7) can also be deduced from the splitting

(2.6) 𝒜=𝒩:=(1ωA00B1τP0C01θD)((1ω1)ABTCT01τP000(11θ)D).{\cal A}={\cal M}-{\cal N}:=\begin{pmatrix}\frac{1}{\omega}A&0&0\\ B&-\frac{1}{\tau}P&0\\ C&0&-\frac{1}{\theta}D\end{pmatrix}-\begin{pmatrix}(\frac{1}{\omega}-1)A&-B^{T}\!&-C^{T}\!\\ 0&-\frac{1}{\tau}P&0\\ 0&0&(1-\frac{1}{\theta})D\end{pmatrix}.

Therefore, GSOR is a splitting method. In particular if ω=1\omega=1, GSOR reduces to the Uzawa-like schemes studied in [4], where M=1θDM=-\frac{1}{\theta}D and N=(11θ)DN=\left(1-\frac{1}{\theta}\right)D in (3).

3 Convergence analysis for GSOR

First, the following two lemmas give readily verifiable necessary and sufficient conditions for all roots of a real polynomial of degree two or three, respectively, to have modulus less than one.

Lemma 3.1.

[11, Theorem 1.3] Consider the second-degree polynomial equation

(9) λ2+a1λ+a0=0,\lambda^{2}+a_{1}\lambda+a_{0}=0,

where a0a_{0} and a1a_{1} are real numbers. A necessary and sufficient condition for both roots of (9) to lie in the open disk |λ|<1|\lambda|<1 is

|a1|<1+a0<2.\displaystyle|a_{1}|<1+a_{0}<2.

Lemma 3.2.

[11, Theorem 1.4] Consider the third-degree polynomial equation

(10) λ3+a2λ2+a1λ+a0=0,\lambda^{3}+a_{2}\lambda^{2}+a_{1}\lambda+a_{0}=0,

where a0a_{0}, a1a_{1}, and a2a_{2} are real numbers. A necessary and sufficient condition for all roots of (10) to lie in the open disk |λ|<1|\lambda|<1 is

|a2+a0|<1+a1,\displaystyle|a_{2}+a_{0}|<1+a_{1},\quad |a23a0|<3a1,\displaystyle|a_{2}-3a_{0}|<3-a_{1},\quad a02+a1a0a2<1.\displaystyle a_{0}^{2}+a_{1}-a_{0}a_{2}<1.

GSOR is convergent if and only if the spectral radius of 𝒯{{\cal T}} is less than 11; i.e., ρ(𝒯)=ρ((𝒟Ω)1[(IΩ)𝒟+Ω𝒰])<1\rho({{\cal T}})=\rho(({{\cal D}}-\Omega{{\cal L}})^{-1}[(I-\Omega){{\cal D}}+\Omega{{\cal U}}])<1. From this point of view, we can now study the convergence properties of GSOR.

Theorem 3.1.

Assume that AA and DD are SPD, and that BB has full row rank. Let the maximum eigenvalues of A1BTP1BA^{-1}B^{T}\!P^{-1}B and A1CTD1CA^{-1}C^{T}\!D^{-1}C be μmax\mu_{\max} and νmax\nu_{\max}, respectively. Then GSOR is convergent if 0<θ<20<\theta<2 and

0<ω<4(2θ)(2θ)(2+τμmax)+2θνmax,\displaystyle 0<\omega<\dfrac{4(2-\theta)}{(2-\theta)(2+\tau\mu_{\max})+2\theta\nu_{\max}}, 0<τ<4(ω+θωθ)ωθμmax.\displaystyle 0<\tau<\frac{4(\omega+\theta-\omega\theta)}{\omega\theta\mu_{\max}}.

Proof 3.2.

Let λ\lambda and v=(x,y,z)v=(x,y,z) be an eigenvalue and eigenvector of 𝒯{{\cal T}}. Then

[(IΩ)𝒟+Ω𝒰]v=λ(𝒟Ω)v.[(I-\Omega){{\cal D}}+\Omega{{\cal U}}]v=\lambda({{\cal D}}-\Omega{{\cal L}})v.

Substituting (7) and (8) gives

(11) (1ω)AxωBTyωCTz=λAx,\displaystyle(1-\omega)Ax-\omega B^{T}\!y-\omega C^{T}\!z=\lambda Ax,
(12) Py=λτBx+λPy,\displaystyle Py=-\lambda\tau Bx+\lambda Py,
(13) (1θ)Dz=λθCx+λDz.\displaystyle(1-\theta)Dz=-\lambda\theta Cx+\lambda Dz.

To continue the proof, we consider sufficient conditions to guarantee |λ|<1|\lambda|<1.

If λ=1θ\lambda=1-\theta, we have |λ|<1|\lambda|<1 if and only if 0<θ<20<\theta<2. In the following, we assume that λ1θ\lambda\neq 1-\theta and 0<θ<20<\theta<2.

Note that we must have λ1\lambda\neq 1; otherwise, (11)–(13) give

(14) Ax+BTy+CTz=0,\displaystyle Ax+B^{T}\!y+C^{T}\!z=0, Bx=0,\displaystyle\quad Bx=0, Dz=Cx,\displaystyle\quad Dz=Cx,

which imply that 0=xTAx+xTBTy+xTCTz=xTAx+zTDz0=x^{T}\!Ax+x^{T}\!B^{T}\!y+x^{T}\!C^{T}\!z=x^{T}\!Ax+z^{T}\!Dz. Given that both AA and DD are SPD, we have xTAx=0x^{T}\!Ax=0 and zTDz=0z^{T}\!Dz=0, which further means x=0x=0 and z=0z=0. With BB having full row rank, the first equality in (14) gives y=0y=0. This contradicts the fact that vv is an eigenvector.

It follows from λ1\lambda\neq 1, λ1θ\lambda\neq 1-\theta, and (12)–(13) that

y=λτλ1P1Bx,\displaystyle y=\frac{\lambda\tau}{\lambda-1}P^{-1}Bx, z=λθλ+θ1D1Cx.\displaystyle z=\frac{\lambda\theta}{\lambda+\theta-1}D^{-1}Cx.

Substituting these into (11), we obtain

(1ω)Axλωτλ1BTP1Bxλωθλ+θ1CTD1Cx=λAx.(1-\omega)Ax-\frac{\lambda\omega\tau}{\lambda-1}B^{T}\!P^{-1}Bx-\frac{\lambda\omega\theta}{\lambda+\theta-1}C^{T}\!D^{-1}Cx=\lambda Ax.

Note that x0x\neq 0; otherwise, we have Py=0Py=0 and Dz=0Dz=0, which implies v=0v=0 because PP and DD are SPD. This contradicts the fact that vv is an eigenvector. Therefore, premultiplying both sides by x/(xAx)x^{*}/(x^{*}Ax) gives

(15) 1ωλωτλ1ϕ(x)λωθλ+θ1φ(x)=λ,1-\omega-\frac{\lambda\omega\tau}{\lambda-1}\phi(x)-\frac{\lambda\omega\theta}{\lambda+\theta-1}\varphi(x)=\lambda,

where

ϕ(x)=xBTP1BxxAx,\displaystyle\phi(x)=\frac{x^{*}B^{T}\!P^{-1}Bx}{x^{*}Ax}, φ(x)=xCTD1CxxAx.\displaystyle\varphi(x)=\frac{x^{*}C^{T}\!D^{-1}Cx}{x^{*}Ax}.

First, we consider the case xnull(B)x\in{\rm null}(B). Clearly, ϕ(x)=0\phi(x)=0. If xnull(C)x\in{\rm null}(C) as well, it follows from (15) that λ=1ω\lambda=1-\omega. To guarantee |λ|<1|\lambda|<1, we can assume that 0<ω<20<\omega<2. If xnull(C)x\notin{\rm null}(C), with ϕ(x)=0\phi(x)=0, (15) reduces to the quadratic polynomial equation

λ2+(ω+θ+ωθφ(x)2)λ+1+ωθωθ=0.\lambda^{2}+\left(\omega+\theta+\omega\theta\varphi(x)-2\right)\lambda+1+\omega\theta-\omega-\theta=0.

By Lemma 3.1, both roots λ\lambda of this real quadratic equation satisfy |λ|<1|\lambda|<1 if and only if

|ω+θ+ωθφ(x)2|<2+ωθωθ<2.|\omega+\theta+\omega\theta\varphi(x)-2|<2+\omega\theta-\omega-\theta<2.

If 0<ω<20<\omega<2 and 0<θ<20<\theta<2, we have ω+θ2ωθ>ωθω+θωθ>0\omega+\theta\geq 2\sqrt{\omega\theta}>\omega\theta\Rightarrow\omega+\theta-\omega\theta>0. This, along with φ(x)0\varphi(x)\geq 0 yields

(16) 0<ω<2(2θ)2θ+θφ(x)2.\displaystyle 0<\omega<\frac{2(2-\theta)}{2-\theta+\theta\varphi(x)}\leq 2\,.

Next, we consider the case xnull(B)x\notin{\rm null}(B). Then ϕ(x)>0\phi(x)>0 and (15) can be rewritten as a cubic polynomial equation λ3+a2λ2+a1λ+a0=0\lambda^{3}+a_{2}\lambda^{2}+a_{1}\lambda+a_{0}=0, where

a2\displaystyle a_{2} =θ+ω+ωτϕ(x)+ωθφ(x)3,\displaystyle=\theta+\omega+\omega\tau\phi(x)+\omega\theta\varphi(x)-3,
a1\displaystyle a_{1} =3+ωθ2ω2θωτ(1θ)ϕ(x)ωθφ(x),\displaystyle=3+\omega\theta-2\omega-2\theta-\omega\tau(1-\theta)\phi(x)-\omega\theta\varphi(x),
a0\displaystyle a_{0} =ω+θωθ1.\displaystyle=\omega+\theta-\omega\theta-1.

By Lemma 3.2, all roots λ\lambda of the above real cubic equation satisfy |λ|<1|\lambda|<1 if and only if

(17) | 2ω+2θωθ+ωτϕ(x)+ωθφ(x)4|\displaystyle\big{|}\,2\omega+2\theta-\omega\theta+\omega\tau\phi(x)+\omega\theta\varphi(x)-4\,\big{|} <4+ωθ2ω2θωτ(1θ)ϕ(x)ωθφ(x),\displaystyle<4+\omega\theta-2\omega-2\theta-\omega\tau(1-\theta)\phi(x)-\omega\theta\varphi(x),
(18) |ωτϕ(x)+ωθφ(x)2ω2θ+3ωθ|\displaystyle\big{|}\,\omega\tau\phi(x)+\omega\theta\varphi(x)-2\omega-2\theta+3\omega\theta\,\big{|} <2ω+2θωθ+ωτ(1θ)ϕ(x)+ωθφ(x),\displaystyle<2\omega+2\theta-\omega\theta+\omega\tau(1-\theta)\phi(x)+\omega\theta\varphi(x),
(19) θ(ωθωθ)(1+φ(x))ωτ(1θ)ϕ(x)\displaystyle\theta(\omega\theta-\omega-\theta)\left(1+\varphi(x)\right)-\omega\tau(1-\theta)\phi(x) <0.\displaystyle<0.

Note that with ϕ(x)>0\phi(x)>0, φ(x)0\varphi(x)\geq 0, and 0<θ<20<\theta<2, (17) leads to τ>0\tau>0 and

(20) 0<ω<4(2θ)42θ+τ(2θ)ϕ(x)+2θφ(x)2.0<\omega<\frac{4(2-\theta)}{4-2\theta+\tau(2-\theta)\phi(x)+2\theta\varphi(x)}\leq 2.

It follows from (18) that

(21) ωτθϕ(x)<4(ω+θωθ),\omega\tau\theta\phi(x)<4(\omega+\theta-\omega\theta),

and

(22) ωτ(2θ)ϕ(x)+2ωθφ(x)+2ωθ>0.\omega\tau(2-\theta)\phi(x)+2\omega\theta\varphi(x)+2\omega\theta>0.

If ω>0\omega>0, τ>0\tau>0 and 0<θ<20<\theta<2, as ϕ(x)>0\phi(x)>0 and φ(x)0\varphi(x)\geq 0, clearly (22) holds. This together with (21) implies that (18) holds if

(23) 0<τ<4(ω+θωθ)ωθϕ(x).\displaystyle 0<\tau<\frac{4(\omega+\theta-\omega\theta)}{\omega\theta\phi(x)}.

Inequality (19) holds if 0<θ10<\theta\leq 1 because ω,τ>0\omega,\,\tau>0. If 1<θ<21<\theta<2 and 0<ω<20<\omega<2, solving (19) leads to

τ<θ(ω+θωθ)(1+φ(x))ω(θ1)ϕ(x).\tau<\frac{\theta(\omega+\theta-\omega\theta)\left(1+\varphi(x)\right)}{\omega(\theta-1)\phi(x)}\,.

Note that ω>0\omega>0, ϕ(x)>0\phi(x)>0, ω+θωθ>0\omega+\theta-\omega\theta>0 and 4(θ1)<θ2(1+φ(x))4(\theta-1)<\theta^{2}(1+\varphi(x)), giving

4(ω+θωθ)ωθϕ(x)<θ(ω+θωθ)(1+φ(x))ω(θ1)ϕ(x).\frac{4(\omega+\theta-\omega\theta)}{\omega\theta\phi(x)}<\frac{\theta(\omega+\theta-\omega\theta)\left(1+\varphi(x)\right)}{\omega(\theta-1)\phi(x)}.

This implies that (19) holds under condition (23).

To sum up, by combining (16), (20), (23) and the fact that τ(2θ)ϕ(x)0\tau(2-\theta)\phi(x)\geq 0, we know that |λ|<1|\lambda|<1 if

(24) 0<θ<2,\displaystyle 0<\theta<2,
(25) 0<ω<4(2θ)42θ+τ(2θ)ϕ(x)+2θφ(x),\displaystyle 0<\omega<\dfrac{4(2-\theta)}{4-2\theta+\tau(2-\theta)\phi(x)+2\theta\varphi(x)},
(26) 0<τ<4(ω+θωθ)ωθϕ(x).\displaystyle 0<\tau<\frac{4(\omega+\theta-\omega\theta)}{\omega\theta\phi(x)}.

For any x0x\neq 0, we have 0ϕ(x)μmax0\leq\phi(x)\leq\mu_{\max} and 0φ(x)νmax0\leq\varphi(x)\leq\nu_{\max}. Combining with (24)–(26) completes the proof.

Remark 3.3.

We emphasize that parameters ω\omega, τ\tau and θ\theta can be chosen to satisfy the conditions derived in Theorem 3.1. Indeed, as 0<ω<20<\omega<2 and 0<θ<20<\theta<2, we get

4(ω+θωθ)ωθμmax=4θμmax(1+θωθ)>4θμmax(1+θ2θ)=2(2θ)θμmax.\frac{4(\omega+\theta-\omega\theta)}{\omega\theta\mu_{\max}}=\frac{4}{\theta\mu_{\max}}\left(1+\frac{\theta}{\omega}-\theta\right)>\frac{4}{\theta\mu_{\max}}\left(1+\frac{\theta}{2}-\theta\right)=\frac{2(2-\theta)}{\theta\mu_{\max}}.

Thus, we can first choose θ\theta satisfying 0<θ<20<\theta<2, and then choose τ\tau in the open interval (0,2(2θ)θμmax)\left(0,\,\frac{2(2-\theta)}{\theta\mu_{\max}}\right). Finally, we choose ω\omega satisfying

0<ω<4(2θ)(2θ)(2+τμmax)+2θνmax.\displaystyle 0<\omega<\dfrac{4(2-\theta)}{(2-\theta)(2+\tau\mu_{\max})+2\theta\nu_{\max}}.

Remark 3.4.

If ω=1\omega=1, (15) can be simplified as

λ2+(θ2+τϕ(x)+θφ(x))λ+1θτϕ(x)+τθϕ(x)θφ(x)=0.\lambda^{2}+(\theta-2+\tau\phi(x)+\theta\varphi(x))\lambda+1-\theta-\tau\phi(x)+\tau\theta\phi(x)-\theta\varphi(x)=0.

It follows from Lemma 3.1 that |λ|<1|\lambda|<1 holds if and only if

|θ2+τϕ(x)+θφ(x)|<2θτϕ(x)+τθϕ(x)θφ(x)<2.|\theta-2+\tau\phi(x)+\theta\varphi(x)|<2-\theta-\tau\phi(x)+\tau\theta\phi(x)-\theta\varphi(x)<2.

After some algebra, we see that GSOR with ω=1\omega=1 is convergent if

0<θ<21+νmax,\displaystyle 0<\theta<\frac{2}{1+\nu_{\max}}, 0<τ<2(2θθνmax)(2θ)μmax.\displaystyle 0<\tau<\frac{2(2-\theta-\theta\nu_{\max})}{(2-\theta)\mu_{\max}}.

Remark 3.5.

If ω=1\omega=1 and θ=1\theta=1, GSOR is the same as the Uzawa-like method studied in [4, section 2.2]. In this case, (15) reduces to

λ2+(τϕ(x)+φ(x)1)λφ(x)=0.\lambda^{2}+\left(\tau\phi(x)+\varphi(x)-1\right)\lambda-\varphi(x)=0.

By Lemma 3.1, we know that |λ|<1|\lambda|<1 holds if and only if

|τϕ(x)+φ(x)1|<1φ(x)<2.\big{|}\tau\phi(x)+\varphi(x)-1\big{|}<1-\varphi(x)<2.

This implies that, for any τ\tau, GSOR diverges when νmax1\nu_{\max}\geq 1. Therefore, for this special case, GSOR is convergent provided

νmax<1,\displaystyle\nu_{\max}<1, 0<τ<2(1νmax)μmax.\displaystyle 0<\tau<\frac{2(1-\nu_{\max})}{\mu_{\max}}.

This result is the same as [4, Theorem 3], which is the convergence theorem of the Uzawa-like method. However, we emphasize that the condition νmax<1\nu_{\max}<1 is strong. In fact, as shown in section 5.2 below, the saddle-point problems from the mixed Stokes-Darcy model in porous media applications do not satisfy this condition. With Remark 3.4, this shows that it is necessary to introduce another parameter.

4 The GSOR preconditioner

We develop and analyze a class of block lower triangular preconditioners to accelerate Krylov methods for (1).

The splitting in (2.6) can induce a preconditioner \cal M for (1). The corresponding preconditioned matrix 1𝒜{\cal M}^{-1}{\cal A} has the form

(ωIωA1BTωA1CT(ω1)τP1BωτP1BA1BTωτP1BA1CT(ω1)θD1CωθD1CA1BTθI+ωθD1CA1CT).\begin{pmatrix}\omega I&\omega A^{-1}B^{T}\!&\omega A^{-1}C^{T}\!\\ (\omega-1)\tau P^{-1}B&\omega\tau P^{-1}BA^{-1}B^{T}\!&\omega\tau P^{-1}BA^{-1}C^{T}\!\\ (\omega-1)\theta D^{-1}C&\omega\theta D^{-1}CA^{-1}B^{T}\!&\theta I+\omega\theta D^{-1}CA^{-1}C^{T}\!\end{pmatrix}.

When ω=1\omega=1, 1𝒜{\cal M}^{-1}{\cal A} has at least nn eigenvalues equal to 11. As clustered eigenvalues are desirable, we consider the block lower triangular preconditioner

𝒫=(A00B1τP0C01θD).{\cal P}=\begin{pmatrix}A&0&0\\ B&-\frac{1}{\tau}P&0\\ C&0&-\frac{1}{\theta}D\end{pmatrix}.

When 𝒫{\cal P} is used to precondition Krylov subspace methods, each step needs to solve three linear systems involving AA, PP, and DD. This is more practical than the block preconditioners of [3, 2] and the (relaxed) APSS preconditioners of [16], which need to solve several dense linear systems involving matrices like BA1BTBA^{-1}B^{T}\!, D+CA1CTD+CA^{-1}C^{T}, A+BTB/αA+B^{T}\!B/\alpha, and D+CCT/αD+CC^{T}/\alpha, where α\alpha is a positive number.

To illustrate further the efficiency of our preconditioner 𝒫{\cal P}, we derive explicit and sharp bounds on the spectrum of the preconditioned matrix 𝒫1𝒜{\cal P}^{-1}{\cal A}. By direct calculations, we have

𝒫1𝒜=(IA1BTA1CT0τP1BA1BTτP1BA1CT0θD1CA1BTθI+θD1CA1CT),{\cal P}^{-1}{\cal A}=\begin{pmatrix}I&A^{-1}B^{T}\!&A^{-1}C^{T}\!\\ 0&\tau P^{-1}BA^{-1}B^{T}\!&\tau P^{-1}BA^{-1}C^{T}\!\\ 0&\theta D^{-1}CA^{-1}B^{T}\!&\theta I+\theta D^{-1}CA^{-1}C^{T}\!\end{pmatrix},

which is similar to

(IB^TC^T0τB^B^TτB^C^T0θC^B^TθI+θC^C^T),\begin{pmatrix}I&\hat{B}^{T}\!&\hat{C}^{T}\!\\ 0&\tau\hat{B}\hat{B}^{T}\!&\tau\hat{B}\hat{C}^{T}\!\\ 0&\theta\hat{C}\hat{B}^{T}\!&\theta I+\theta\hat{C}\hat{C}^{T}\!\end{pmatrix},

where B^=P1/2BA1/2\hat{B}=P^{-1/2}BA^{-1/2} and C^=D1/2CA1/2\hat{C}=D^{-1/2}CA^{-1/2}. Thus 𝒫1𝒜{\cal P}^{-1}{\cal A} has eigenvalue 1 with multiplicity nn, and the remaining eigenvalues are the same as those of

K=(τB^B^TτB^C^TθC^B^TθI+θC^C^T).K=\begin{pmatrix}\tau\hat{B}\hat{B}^{T}\!&\tau\hat{B}\hat{C}^{T}\!\\ \theta\hat{C}\hat{B}^{T}\!&\theta I+\theta\hat{C}\hat{C}^{T}\!\end{pmatrix}.

We can now establish the following theorem.

Theorem 4.1.

Assume that AA and DD are SPD, and BB has full row rank. Let the minimum and maximum eigenvalues of P1BA1BTP^{-1}BA^{-1}B^{T}\! be μmin\mu_{\min} and μmax\mu_{\max}. Let the maximum eigenvalue of D1CA1CTD^{-1}CA^{-1}C^{T}\! be νmax\nu_{\max}. Then 𝒫1𝒜{\cal P}^{-1}{\cal A} has eigenvalue 1 with multiplicity at least nn, and the remaining eigenvalues lie in the interval

[Λ¯Λ¯24τθμmin2,Λ¯+Λ¯24τθμmax2],\left[\,\frac{\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}{2},\,\,\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2}\,\right],

where

(27) Λ¯=θ(1+νmax)+τμmin,\displaystyle\underline{\Lambda}=\theta(1+\nu_{\max})+\tau\mu_{\min},\quad Λ¯=θ(1+νmax)+τμmax.\displaystyle\quad\overline{\Lambda}=\theta(1+\nu_{\max})+\tau\mu_{\max}.

Proof 4.2.

We need to estimate spectral bounds for KK. Let λ\lambda be an eigenvalue of KK and (yT,zT)T(y^{T}\!,z^{T}\!)^{T}\! be a corresponding eigenvector. With τ>0\tau>0 and θ>0\theta>0 we see that KK is similar to a symmetric matrix, and hence λ\lambda is real. Also,

(28) τB^B^Ty+τB^C^Tz\displaystyle\tau\hat{B}\hat{B}^{T}\!y+\tau\hat{B}\hat{C}^{T}\!z =λy,\displaystyle=\lambda y,
(29) θC^B^Ty+θz+θC^C^Tz\displaystyle\theta\hat{C}\hat{B}^{T}\!y+\theta z+\theta\hat{C}\hat{C}^{T}\!z =λz.\displaystyle=\lambda z.

We obtain estimates of λ\lambda by considering two cases separately.

Case I: znull(C^T)z\in\mathrm{null}(\hat{C}^{T}). Clearly, τB^B^Ty=λy\tau\hat{B}\hat{B}^{T}\!y=\lambda y and θC^B^Ty=(λθ)z\theta\hat{C}\hat{B}^{T}\!y=(\lambda-\theta)z. This implies that λ=θ\lambda=\theta or λ\lambda is an eigenvalue of B^B^T\hat{B}\hat{B}^{T}\!. Note that B^B^T=P1/2BA1BTP1/2\hat{B}\hat{B}^{T}\!=P^{-1/2}BA^{-1}B^{T}\!P^{-1/2} is similar to P1BA1BTP^{-1}BA^{-1}B^{T}\!, so that λ=θ\lambda=\theta or τμminλτμmax\tau\mu_{\min}\leq\lambda\leq\tau\mu_{\max}.

Case II: znull(C^T)z\notin\mathrm{null}(\hat{C}^{T}). We only consider the case λ[τμmin,τμmax]\lambda\notin[\tau\mu_{\min},\,\tau\mu_{\max}], so that λIτB^B^T\lambda I-\tau\hat{B}\hat{B}^{T}\! is nonsingular. With (28), this leads to y=τ(λIτB^B^T)1B^C^Tz.y=\tau(\lambda I-\tau\hat{B}\hat{B}^{T})^{-1}\hat{B}\hat{C}^{T}\!z. Substituting into (29) gives

(30) τθC^B^T(λIτB^B^T)1B^C^Tz+θz+θC^C^Tz=λz.\displaystyle\tau\theta\hat{C}\hat{B}^{T}\!(\lambda I-\tau\hat{B}\hat{B}^{T})^{-1}\hat{B}\hat{C}^{T}\!z+\theta z+\theta\hat{C}\hat{C}^{T}\!z=\lambda z.

As (IτλB^TB^)1=I+τB^T(λIτB^B^T)1B^\left(I-\frac{\tau}{\lambda}\hat{B}^{T}\!\hat{B}\right)^{-1}=I+\tau\hat{B}^{T}\!(\lambda I-\tau\hat{B}\hat{B}^{T}\!)^{-1}\hat{B}, (30) yields

(31) θC^(IτλB^TB^)1C^Tz=(λθ)z.\displaystyle\theta\hat{C}\left(I-\frac{\tau}{\lambda}\hat{B}^{T}\!\hat{B}\right)^{-1}\hat{C}^{T}\!z=(\lambda-\theta)z.

We assert that λ>0\lambda>0. Otherwise, we have

(λθ)zTz<0andzTC^(IτλB^TB^)1C^Tz0,(\lambda-\theta)z^{T}\!z<0\qquad\mbox{and}\qquad z^{T}\!\hat{C}\left(I-\dfrac{\tau}{\lambda}\hat{B}^{T}\!\hat{B}\right)^{-1}\hat{C}^{T}\!z\geq 0,

which contradicts (31).

If λ>τμmax\lambda>\tau\mu_{\max}, as the matrices B^TB^\hat{B}^{T}\!\hat{B} and B^B^T\hat{B}\hat{B}^{T}\! have the same nonzero eigenvalues, it holds for any 0un0\neq u\in\mathds{R}^{n} that

uT(IτλB^TB^)u(1τμmaxλ)uTu>0.u^{T}\!\left(I-\frac{\tau}{\lambda}\hat{B}^{T}\!\hat{B}\right)u\geq\left(1-\frac{\tau\mu_{\max}}{\lambda}\right)u^{T}\!u>0.

With (31) and the fact that C^C^T=D1/2CA1CTD1/2\hat{C}\hat{C}^{T}\!=D^{-1/2}CA^{-1}C^{T}\!D^{-1/2} is similar to D1CA1CTD^{-1}CA^{-1}C^{T}\!, this leads to

λθθ(1τμmaxλ)1zTC^C^TzzTzθ(1τμmaxλ)1νmax.\lambda-\theta\leq\theta\left(1-\frac{\tau\mu_{\max}}{\lambda}\right)^{-1}\frac{z^{T}\!\hat{C}\hat{C}^{T}\!z}{z^{T}\!z}\leq\theta\left(1-\frac{\tau\mu_{\max}}{\lambda}\right)^{-1}\nu_{\max}.

Solving this inequality for λ\lambda gives

Λ¯Λ¯24τθμmax2λΛ¯+Λ¯24τθμmax2.\frac{\overline{\Lambda}-\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2}\leq\lambda\leq\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2}.

We can directly check that

Λ¯Λ¯24τθμmax2max{θ,τμmax}Λ¯+Λ¯24τθμmax.\overline{\Lambda}-\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}\,\leq 2\max\{\theta,\tau\mu_{\max}\}\leq\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}.

Therefore, λ\lambda admits the upper bound

(32) λΛ¯+Λ¯24τθμmax2.\lambda\leq\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2}.

If λ<τμmin\lambda<\tau\mu_{\min}, it can be verified that

zTC^(IτλB^TB^)1C^Tz(1τμminλ)1zTC^C^Tz(1τμminλ)1νmaxzTz.z^{T}\!\hat{C}\left(I-\frac{\tau}{\lambda}\hat{B}^{T}\!\hat{B}\right)^{-1}\hat{C}^{T}\!z\geq\left(1-\frac{\tau\mu_{\min}}{\lambda}\right)^{-1}z^{T}\!\hat{C}\hat{C}^{T}\!z\geq\left(1-\frac{\tau\mu_{\min}}{\lambda}\right)^{-1}\nu_{\max}z^{T}\!z.

Combining with (31) gives

λθ(1τμminλ)1θνmax=λθνmaxλτμmin.\lambda-\theta\geq\left(1-\frac{\tau\mu_{\min}}{\lambda}\right)^{-1}\theta\nu_{\max}=\frac{\lambda\theta\nu_{\max}}{\lambda-\tau\mu_{\min}}.

Note that λ<τμmin\lambda<\tau\mu_{\min} and the inequality can be simplified as

λ2(θ+θνmax+τμmin)λ+τθμmin0.\lambda^{2}-(\theta+\theta\nu_{\max}+\tau\mu_{\min})\lambda+\tau\theta\mu_{\min}\leq 0.

By the definition of Λ¯\underline{\Lambda}, we have

Λ¯Λ¯24τθμmin2λΛ¯+Λ¯24τθμmin2.\frac{\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}{2}\leq\lambda\leq\frac{\underline{\Lambda}+\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}{2}.

Similarly, we can check that

Λ¯Λ¯24τθμmin2min{θ,τμmin}Λ¯+Λ¯24τθμmin.\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}\,\leq 2\min\{\theta,\tau\mu_{\min}\}\leq\underline{\Lambda}+\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}.

This implies that λ\lambda admits the lower bound

τμmin>λΛ¯Λ¯24τθμmin2.\tau\mu_{\min}>\lambda\geq\frac{\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}{2}.

Combining with (32) and the bounds derived in Case I completes the proof.

Remark 4.3.

To precondition the equivalent unsymmetric system (4), we can use the block lower triangular preconditioner

𝒫^=(A00B1τP0C01θD).\widehat{\cal P}=\begin{pmatrix}A&0&0\\ -B&\frac{1}{\tau}P&0\\ -C&0&\frac{1}{\theta}D\end{pmatrix}.

Because 𝒫^1𝒜^=𝒫1𝒜\widehat{\cal P}^{-1}\widehat{{\cal A}}={\cal P}^{-1}{\cal A}, the preconditioned matrix 𝒫^1𝒜^\widehat{\cal P}^{-1}\widehat{\cal A} possesses the same spectral bounds as in Theorem 4.1.

Remark 4.4.

Theorem 4.1 shows that the preconditioned matrices 𝒫1𝒜=𝒫^1𝒜^{\cal P}^{-1}{\cal A}=\widehat{\cal P}^{-1}\widehat{\cal A} are positive stable. Moreover, their condition number is bounded by

max{Λ¯+Λ¯24τθμmax2,Λ¯+Λ¯24τθμmaxΛ¯Λ¯24τθμmin}.\max\left\{\,\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2},\,\,\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}\,\right\}.

Using (27), we obtain

Λ¯+Λ¯24τθμmax2Λ¯=θ(1+νmax)+τμmin\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{2}\leq\overline{\Lambda}=\theta(1+\nu_{\max})+\tau\mu_{\min}

and

Λ¯+Λ¯24τθμmaxΛ¯Λ¯24τθμmin=(Λ¯+Λ¯24τθμmax)(Λ¯+Λ¯24τθμmin)4τθμminΛ¯Λ¯τθμmin\displaystyle\frac{\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}}{\underline{\Lambda}-\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}}=\frac{\left(\overline{\Lambda}+\sqrt{\overline{\Lambda}^{2}-4\tau\theta\mu_{\max}}\right)\left(\underline{\Lambda}+\sqrt{\underline{\Lambda}^{2}-4\tau\theta\mu_{\min}}\right)}{4\tau\theta\mu_{\min}}\leq\frac{\underline{\Lambda}\overline{\Lambda}}{\tau\theta\mu_{\min}}
=θ2(1+νmax)2+τ2μminμmax+τθ(1+νmax)(μmax+μmin)τθμmin\displaystyle=\frac{\theta^{2}(1+\nu_{\max})^{2}+\tau^{2}\mu_{\min}\mu_{\max}+\tau\theta(1+\nu_{\max})(\mu_{\max}+\mu_{\min})}{\tau\theta\mu_{\min}}
=θτ(1+νmax)2μmin+τθμmax+(1+νmax)(1+μmaxμmin).\displaystyle=\frac{\theta}{\tau}\frac{(1+\nu_{\max})^{2}}{\mu_{\min}}+\frac{\tau}{\theta}\mu_{\max}+(1+\nu_{\max})\left(1+\frac{\mu_{\max}}{\mu_{\min}}\right).

This shows that the matrices 𝒫1𝒜{\cal P}^{-1}{\cal A} and 𝒫^1𝒜^\widehat{\cal P}^{-1}\widehat{\cal A} will be well-conditioned given appropriate selections of parameters τ\tau, θ\theta and matrix PP when νmax\nu_{\max} is not too large.111This is a reasonable request. As shown in Section 5 below, νmax\nu_{\max} of the saddle-point systems from the liquid crystal directors model and the mixed Stokes-Darcy model in porous media applications is 0.17500.1750 and 1.00571.0057, respectively.

5 Numerical experiments

We present the results of numerical tests to examine the feasibility and effectiveness of GSOR. All experiments were run using MATLAB R2015b on a PC with an Intel(R) Core(TM) i7-8550U CPU @ 1.8GHz and 16GB of RAM. The initial guess is taken to be the zero vector, and the algorithms are terminated when the number of iterations exceeds 10510^{5} or

Res:=b𝒜wk2/b108,{\rm Res}:=\|b-{\cal A}w_{k}\|_{2}/\|b\|\leq 10^{-8},

where wkw_{k} is the current approximate solution. We report the number of iterations, the CPU time, and the final value of the relative residual, denoted by “Iter”, “CPU” and “Res”, respectively.

For our GSOR method, we tried just a few values of the parameters ω\omega, τ\tau and θ\theta. We compared our method with the Uzawa-like method (denoted “Uzawa”) and the generalization of the block SOR method (denoted “GBSOR”) studied in [4, Section 2.2 and Section 3], respectively. We emphasize that the Uzawa method is a special case of our GSOR method with P=QP=Q, ω=1\omega=1, and θ=1\theta=1. For GBSOR, based on [4, Theorem 5], we chose ω=s/4,s/2, 3s/4\omega=s/4,\,s/2,\,3s/4 (denoted “GBSORa”, “GBSORb”, “GBSORc”, respectively), where s=2/(1+νmax)s=2/(1+\sqrt{\nu_{\max}}) is the upper bound of the convergence interval for the parameter ω\omega. We used the function “eigs” to compute νmax\nu_{\max}.

We also tested Krylov methods for (1) or (4), such as MINRES, GMRES, and BICGSTAB. For preconditioned MINRES (denoted “BPMINRES”), we use the block diagonal preconditioner

(A000BA1BT000D+CA1CT).\begin{pmatrix}A&0&0\\ 0&BA^{-1}B^{T}\!&0\\ 0&0&D+CA^{-1}C^{T}\!\end{pmatrix}.

For D=0D=0, this block diagonal preconditioner has been studied in [3]. For preconditioned GMRES, we test the GSOR preconditioner 𝒫\cal P with τ=θ=1\tau=\theta=1 (denoted “GPGMRES”) and the block triangular preconditioner [3] (denoted “BPGMRES”)

(ABTCT0BA1BT000(D+CA1CT)).\begin{pmatrix}A&B^{T}\!&C^{T}\!\\ 0&-BA^{-1}B^{T}\!&0\\ 0&0&-(D+CA^{-1}C^{T}\!)\end{pmatrix}.

5.1 Saddle-point systems from the liquid crystal directors model

Continuum models for the orientational properties of liquid crystals require minimization of free energy functionals of the form

(33) [u,v,w,U]=1201[(uz2+vz2+wz2)η2(β+w2)Uz2]dz,{\cal F}[u,v,w,U]=\frac{1}{2}\int_{0}^{1}[(u_{z}^{2}+v_{z}^{2}+w_{z}^{2})-\eta^{2}(\beta+w^{2})U_{z}^{2}]{\rm d}z,

where uu, vv, ww, and UU are functions of z[0,1]z\in[0,1] subject to suitable end-point conditions, uzu_{z}, vzv_{z}, wzw_{z}, and UzU_{z} denote the first derivatives of the corresponding functions with respect to zz, and η\eta and β\beta are prescribed positive parameters. By discretizing with a uniform piecewise-linear finite element scheme with N+1N+1 cells using nodal quadrature and the prescribed boundary conditions, we minimize the free energy (33) under the unit vector constraint. We apply the Lagrange multiplier method to solve this discretized minimization model, and Newton’s method to solve the nonlinear equations from the first-order conditions of the Lagrangian. Each step involves the solution of a linear system of the form (1) with n=3Nn=3N and m=p=Nm=p=N. For more details, we refer to [17].

In our numerical experiments we set η=3π/4\eta=\sqrt{3}\pi/4 and β=0.5\beta=0.5, which is known as the critical switching value. The discretized matrix AA is tridiagonal, so in all algorithms we solve systems Ax=rAx=r directly by the function “\”, which uses a tridiagonal solver. We set P=BA1BTP=BA^{-1}B^{T}\! and solve systems Py=rPy=r using Cholesky factorization. Numerical results are listed in Tables 1 and 2 with N=1023,2047,4095,8191,16383N=1023,2047,4095,8191,16383, where the parameter choices for GSOR and the corresponding notation are as follows:

Method GSORa GSORb GSORc GSORd
(ω,τ,θ)(\omega,\tau,\theta) (1,1,1)(1,1,1) (0.95,0.95,0.95)(0.95,0.95,0.95) (0.9,0.8,1)(0.9,0.8,1) (0.95,1,0.95)(0.95,1,0.95)

For this problem, ACTD1CA-C^{T}\!D^{-1}C is SPD, which guarantees convergence of the Uzawa-like method [4, Theorem 3]. We set Q=BA1BTQ=BA^{-1}B^{T}\! and α=1νmax\alpha=1-\nu_{\max}, where νmax=0.1750\nu_{\max}=0.1750 is the maximum eigenvalue of A1CTD1CA^{-1}C^{T}\!D^{-1}C. MINRES, GMRES and BICGSTAB without preconditioning failed to solve this problem. (For GMRES, we set the restart frequency to 100100.) BICGSTAB hit an error condition. Therefore in Table 2 we only report results from preconditioned MINRES and preconditioned GMRES.

Table 1: CPU time for Cholesky factorization of P=BA1BTP=BA^{-1}B^{T}.
NN 1023 2047 4095 8191 16383
nn 3069 6141 12285 24573 49149
mm 1023 2047 4095 8191 16383
pp 1023 2047 4095 8191 16383
n+m+pn+m+p 5115 10235 20475 40955 81915
BA1BTBA^{-1}B^{T}\! 0.086 0.43 1.95 8.93 145.2
Table 2: Numerical results for saddle-point systems from the liquid crystal directors model.
NN 1023 2047 4095 8191 16383
Iter 24 24 25 25 26
GSORa CPU 0.20 1.08 4.81 23.86 226.04
Res 5.95e-09 8.42e-09 5.44e-09 7.70e-09 4.96e-09
Iter 15 15 16 16 16
GSORb CPU 0.12 0.64 3.03 15.35 152.79
Res 5.02e-09 7.09e-09 2.53e-09 3.57e-09 5.05e-09
Iter 16 17 17 17 17
GSORc CPU 0.14 0.71 3.29 15.28 160.58
Res 8.41e-09 1.82e-09 2.01e-09 2.34e-09 2.88e-09
Iter 14 14 14 14 14
GSORd CPU 0.11 0.59 2.83 12.64 132.68
Res 7.30e-10 9.62e-10 1.31e-09 1.81e-09 2.53e-09
Iter 18 18 18 20 20
UZAWA CPU 0.14 0.76 3.51 18.35 187.64
Res 4.01e-09 5.67e-09 8.02e-09 1.56e-09 2.22e-09
Iter 72 73 75 76 77
GBSORa CPU 0.56 3.12 14.80 70.90 773.60
Res 8.38e-09 9.23e-09 7.90e-09 8.69e-09 9.57e-09
Iter 29 30 30 31 31
GBSORb CPU 0.23 1.31 5.98 28.03 294.17
Res 7.91e-09 5.95e-09 8.42e-09 6.32e-09 8.95e-09
Iter 35 36 36 37 38
GBSORc CPU 0.27 1.59 7.13 34.63 356.66
Res 7.59e-09 6.34e-09 8.97e-09 7.51e-09 6.30e-09
Iter 13 13 13 14 14
BPMINRES CPU 3.50 18.10 96.25 837.49 10736.63
Res 2.67e-09 4.75e-09 9.35e-09 1.41e-09 1.51e-09
Iter 8 8 8 8 8
BPGMRES CPU 3.35 18.63 91.74 543.05 10609.10
Res 6.42e-09 1.34e-08 9.35e-09 7.74e-08 1.86e-07
Iter 8 8 8 8 8
GPGMRES CPU 1.80 8.17 40.53 251.52 3548.65
Res 9.00e-10 1.92e-09 4.99e-09 1.66e-08 7.79e-08

To see the role of the parameters in the convergence behavior of GSOR, Fig. 1 shows the region of the parameters where GSOR satisfies Res108{\rm Res}\leq 10^{-8} within 5,0005,000 iterations, and the characteristic curves of the number of iterations versus the parameters for N=1,023N=1,023. In Fig. 2, we plot the eigenvalue distributions of the original matrix and the GSOR preconditioned matrix 𝒫1𝒜{\cal P}^{-1}{\cal A} with different τ\tau and ω\omega.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 1: Top left: The region of parameter values for which GSOR satisfies Res108{\rm Res}\leq 10^{-8} within 5,0005,000 iterations. Other plots: Characteristic curves for the number of iterations versus parameters ω\omega, τ\tau and θ\theta for GSOR with ω=1\omega=1 (top right), τ=1\tau=1 (bottom left), and θ=1\theta=1 (bottom right). All plots are for saddle-point systems from the liquid crystal directors model with n=3069n=3069, m=p=1023m=p=1023.
Refer to caption
(a) 𝒜\cal A
Refer to caption
(b) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=0.1,θ=1\tau=0.1,\theta=1
Refer to caption
(c) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=1,θ=1\tau=1,\theta=1
Refer to caption
(d) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=1,θ=0.1\tau=1,\theta=0.1
Figure 2: Eigenvalue distributions of the original matrix and the GSOR preconditioned matrices for saddle-point systems from the liquid crystal directors model with n=3069,m=p=1023n=3069,m=p=1023.

5.2 Saddle-point systems from the mixed Stokes-Darcy model in porous media applications

Fluid flow in Ωf2\Omega_{f}\subset\mathds{R}^{2} coupled with porous media flow in Ωp2\Omega_{p}\subset\mathds{R}^{2} is governed by the static Stokes equations

(34) υΔ𝒖f+pf=𝒇,anddiv𝒖f=0,𝒙Ωf,-\upsilon\Delta\,{\bm{u}}_{f}+\nabla\,p_{f}={\bm{f}},\quad\textup{and}\quad{\rm div}\,{\bm{u}}_{f}=0,\quad{\bm{x}}\in\Omega_{f},

where ΩfΩp=\Omega_{f}\cap\Omega_{p}=\emptyset and Ω¯fΩ¯p=Γ\overline{\Omega}_{f}\cap\overline{\Omega}_{p}=\Gamma with Γ\Gamma being an interface, υ>0\upsilon>0 is the kinematic viscosity, and 𝒇\bm{f} is the external force.

In the porous media region, the governing variable is ϕ=ppρfg\phi=\frac{p_{p}}{\rho_{f}g}, where ppp_{p} is the pressure in Ωp\Omega_{p}, ρf\rho_{f} is the fluid density, and gg is the gravity acceleration. The velocity 𝒖p{\bm{u}}_{p} of the porous media flow is related to ϕ\phi by Darcy’s law and is also divergence free:

(35) 𝒖p=ϵ2rυϕanddiv𝒖p=0,𝒙Ωp,{\bm{u}}_{p}=-\dfrac{\epsilon^{2}}{r\upsilon}\nabla\phi\quad\textup{and}\quad-{\rm div}\,{\bm{u}}_{p}=0,\quad{\bm{x}}\in\Omega_{p},

where rr is the volumetric porosity, and ϵ\epsilon the characteristic length of the porous media.

Applying finite element discretization to the mixed Stokes-Darcy model (34)–(35) with the Dirichlet boundary conditions leads to linear systems of form (1[6].

In our numerical experiments, we set υ=1\upsilon=1, r=1r=1, and ϵ=0.1\epsilon=\sqrt{0.1}. The computational domain is Ωf=(0,1)×(1,2)\Omega_{f}=(0,1)\times(1,2), Ωp=(0,1)×(0,1)\Omega_{p}=(0,1)\times(0,1) and the interface is Γ=(0,1)×{1}\Gamma=(0,1)\times\{1\}. We use a uniform mesh with grid parameters h=23, 24, 25, 26h=2^{-3},\,2^{-4},\,2^{-5},\,2^{-6} to decompose Ωf\Omega_{f}, P2–P1 elements in the fluid region, and P2 Lagrange elements in the porous media region.

For this problem, PP is the pressure mass matrix discretized from the decoupled problem of (34)–(35[6]. In all algorithms we use Cholesky factorization to solve the systems Ax=rAx=r, Py=rPy=r and BA1BTy=rBA^{-1}B^{T}\!y=r. Numerical results for saddle-point systems from the mixed Stokes-Darcy model (34)–(35) are listed in Tables 3 and 4, where the parameters choices for GSOR and the corresponding notation are as follows.

Method GSORa GSORb GSORc GSORd
(ω,τ,θ)(\omega,\tau,\theta) (0.5,1.5,1.0)(0.5,1.5,1.0) (0.5,1.7,0.8)(0.5,1.7,0.8) (0.5,1.6,1.2)(0.5,1.6,1.2) (0.6,1.5,1.0)(0.6,1.5,1.0)

For this problem, νmax=1.0057\nu_{\max}=1.0057. The matrix ACTD1CA-C^{T}\!D^{-1}C is no longer SPD, so convergence of the Uzawa-like method cannot be guaranteed [4, Theorem 3]. We tested several α\alpha ranging from 0.0050.005 to 0.50.5 for h=23h=2^{-3}. Uzawa failed in all cases. Thus, we do not report results for Uzawa in Table 4. As MINRES and GMRES worked only for systems with h25h\geq 2^{-5}, we again do not report their results.

To see the role of the parameters in the convergence behavior of GSOR, Figure 3 shows the region of parameters for which GSOR satisfies Res108{\rm Res}\leq 10^{-8} within 5,0005,000 steps, and the characteristic curves of iteration numbers versus parameters for h=23h=2^{-3}. In Fig. 4, we plot the eigenvalue distributions of the original matrix and the GSOR preconditioned matrix 𝒫1𝒜{\cal P}^{-1}{\cal A} with different τ\tau and ω\omega.

Table 3: The CPU time of the Cholesky factorization.
hh 232^{-3} 242^{-4} 252^{-5} 262^{-6} 272^{-7}
nn 578 2178 8450 33282 132098
mm 81 289 1089 4225 16641
pp 289 1089 4225 16641 66049
n+m+pn+m+p 948 3556 13764 54148 214788
AA 0.0008 0.0048 0.029 0.23 1.75
PP 0.0003 0.0004 0.0011 0.02 0.10
BA1BTBA^{-1}B^{T}\! 0.0064 0.18 7.18 555.59 31368.15
Table 4: Numerical results for saddle-point systems from mixed the Stokes-Darcy model.
hh 232^{-3} 242^{-4} 252^{-5} 262^{-6} 272^{-7}
Iter 50 49 49 47 47
GSORa CPU 0.05 0.15 0.89 10.19 54.22
Res 5.99e-09 9.80e-09 8.41e-09 9.53e-09 6.85e-09
Iter 50 50 50 50 50
GSORb CPU 0.03 0.15 0.91 10.13 64.10
Res 6.54e-09 5.93e-09 5.51e-09 5.82e-09 5.74e-09
Iter 50 50 50 50 49
GSORc CPU 0.04 0.15 0.90 10.02 62.91
Res 5.98e-09 5.19e-09 6.92e-09 7.00e-09 9.53e-09
Iter 48 45 42 39 38
GSORd CPU 0.04 0.14 0.80 7.70 49.98
Res 8.39e-09 8.53e-09 8.52e-09 9.63e-09 5.26e-09
Iter 158 150 141 132 124
GBSORa CPU 0.21 0.84 6.01 90.33 948.46
Res 9.79e-09 9.16e-09 9.46e-09 9.96e-09 9.77e-09
Iter 73 69 65 61 58
GBSORb CPU 0.08 0.36 2.74 41.47 441.51
Res 9.17e-09 9.18e-09 9.28e-09 9.57e-09 8.28e-09
Iter 44 42 40 37 35
GBSORc CPU 0.04 0.19 1.73 25.59 265.67
Res 9.33e-09 8.25e-09 7.38e-09 9.57e-09 8.61e-09
Iter 767.5 1491 2997.5 5912.5 13826.5
BICGSTAB CPU 0.09 0.41 2.23 21.51 288.07
Res 7.96e-09 7.64e-09 9.57e-09 4.76e-09 9.48e-09
Iter 18 18 18 17 18
BPMINRES CPU 0.13 0.71 6.11 216.39 3479.83
Res 8.34e-09 3.66e-09 1.49e-09 5.70e-09 2.85e-09
Iter 10 10 10 10 10
BPGMRES CPU 0.14 0.81 10.00 323.95 3992.25
Res 1.60e-09 1.56e-09 1.05e-09 7.06e-10 1.39e-09
Iter 24 24 25 26 27
GPGMRES CPU 0.14 0.84 3.53 17.34 92.48
Res 1.06e-09 4.98e-09 6.87e-09 1.53e-09 6.15e-10
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 3: Top left: The region of parameter values for which GSOR satisfies Res108{\rm Res}\leq 10^{-8} within 5,0005,000 iterations. Other plots: Characteristic curves for the number of iterations versus parameters ω\omega, τ\tau and θ\theta for GSOR with ω=1\omega=1 (top right), τ=1\tau=1 (bottom left), and θ=1\theta=1 (bottom right). All plots are for saddle-point systems from the mixed Stokes-Darcy model with n=578n=578, m=81m=81, p=289p=289.
Refer to caption
(a) 𝒜\cal A
Refer to caption
(b) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=0.1,θ=1\tau=0.1,\theta=1
Refer to caption
(c) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=1,θ=1\tau=1,\theta=1
Refer to caption
(d) 𝒫1𝒜{\cal P}^{-1}\cal A with τ=1,θ=0.1\tau=1,\theta=0.1
Figure 4: Eigenvalue distributions of the original matrix and the GSOR preconditioned matrices for saddle-point systems from the mixed Stokes-Darcy model with n=578,m=81,p=289n=578,m=81,p=289.

Tables 1, 2, 3 and 4 and Figures 1, 2, 3 and 4 illustrate that GSOR is a practical method, and its advantages increase with the problem size. We see from Tables 1, 2, 3 and 4 that BPMINRES and BPGMRES are not practical in terms of CPU times. Figures 1 and 3 indicate that the convergence rate of GSOR depends strongly on ω\omega, τ\tau and θ\theta. Figures 2 and 4 show that 𝒫\cal P greatly improves the eigenvalue distribution of the original 𝒜\cal A.

6 Conclusions

We presented a theoretical and numerical study of the GSOR method for solving the double saddle-point problem (1). GSOR is convergent with suitable parameters ω\omega, τ\tau, and θ\theta. Unlike existing work, our proof is based on the necessary and sufficient conditions for all roots of a real cubic polynomial to have modulus less than one. We analyzed a class of block lower triangular preconditioners 𝒫\cal P induced from GSOR and derived explicit and sharp bounds for the eigenvalues of preconditioned matrices. The numerical results presented are highly encouraging. GSOR requires the least CPU time, and especially for larger problems, its advantages are clear. A shortcoming is the need to choose the three parameters. A practical method to choose them is a topic for future research.

Acknowledgments

This research began with the work of our colleague and friend Dr Oleg Burdakov. In 2019, Oleg focused on Barzilai-Borwein-type methods to solve quasi-definite linear systems, and he conducted preliminary tests on double saddle-point problems. While testing the BB-type methods, we found that GSOR performs well on the double saddle-point problems and were motivated to start this work. We are grateful for Oleg’s insight and foresight.

We are also grateful to Mingchao Cai, Alison Ramage, and Zhaozheng Liang for providing the test problems used in our numerical experiments.

References

  • [1] Z. Z. Bai, B. N. Parlett, and Z. Q. Wang, On generalized successive overrelaxation methods for augmented linear systems, Numer. Math., 102 (2005), pp. 1–38.
  • [2] F. P. A. Beik and M. Benzi, Block preconditioners for saddle point systems arising from liquid crystal directors modeling, Calcolo, 55 (2018), pp. 1–16.
  • [3] F. P. A. Beik and M. Benzi, Iterative methods for double saddle point systems, SIAM J. Matrix Anal. Appl., 39 (2018), pp. 902–921.
  • [4] M. Benzi and F. P. A. Beik, Uzawa-type and augmented Lagrangian methods for double saddle point systems, in Structured Matrices in Numerical Linear Algebra, Springer, 2019, pp. 215–236.
  • [5] M. Benzi, G. H. Golub, and J. Liesen, Numerical solution of saddle point problems, Acta Numerica, 14 (2005), pp. 1–137.
  • [6] M. C. Cai, M. Mu, and J. C. Xu, Preconditioning techniques for a mixed Stokes/Darcy model in porous media applications, J. Comput. Appl. Math., 233 (2009), pp. 346–355.
  • [7] Y. Dou and Z. Z. Liang, A class of block alternating splitting implicit iteration methods for double saddle point linear systems, Numer. Linear Algebra Appl., (2022), p. e2455.
  • [8] A. J. Ellingsrud, Preconditioning unified mixed discretizations of coupled Darcy-Stokes flow, master’s thesis, University of Oslo, 2015.
  • [9] A. Ghannad, D. Orban, and M. A. Saunders, Linear systems arising in interior methods for convex optimization: a symmetric formulation with bounded condition number, Optim. Method Softw., 0 (2021), pp. 1–26, https://doi.org/10.1080/10556788.2021.1965599.
  • [10] C. Greif, E. Moulding, and D. Orban, Bounds on eigenvalues of matrices arising from interior-point methods, SIAM J. Optim., 24 (2014), pp. 49–83.
  • [11] E. A. Grove and G. Ladas, Periodicities in nonlinear difference equations, Chapman and Hall/CRC, 2004.
  • [12] K. E. Holter, M. Kuchta, and K. A. Mardal, Robust preconditioning of monolithically coupled multiphysics problems, arXiv preprint arXiv:2001.05527, (2020).
  • [13] K. E. Holter, M. Kuchta, and K. A. Mardal, Robust preconditioning for coupled Stokes-Darcy problems with the Darcy problem in primal form, Comput. Math. Appl., 91 (2021), pp. 53–66.
  • [14] N. Huang, Variable parameter Uzawa method for solving a class of block three-by-three saddle point problems, Numer. Algor., 85 (2020), pp. 1233––1254.
  • [15] N. Huang, Y. H. Dai, and Q. Y. Hu, Uzawa methods for a class of block three-by-three saddle-point problems, Numer. Linear Algebra Appl., 26 (2019), p. e2265.
  • [16] Z. Z. Liang and G. F. Zhang, Alternating positive semidefinite splitting preconditioners for double saddle point problems, Calcolo, 56 (2019), pp. 1–17.
  • [17] A. Ramage and E. C. Gartland Jr, A preconditioned nullspace method for liquid crystal director modeling, SIAM J. Sci. Comput., 35 (2013), pp. B226–B247.
  • [18] B.-C. Ren, F. Chen, and X.-L. Wang, Improved splitting preconditioner for double saddle point problems arising from liquid crystal director modeling, Numer. Algor., (2022), pp. 1–17.
  • [19] S. J. Wright, Primal-dual Interior-point Methods, SIAM, Philadelphia, 1997.
  • [20] D. M. Young, Iterative Solution of Large Linear Systems, Elsevier, 2014.
  • [21] J.-L. Zhu, Y.-J. Wu, and A.-L. Yang, A two-parameter block triangular preconditioner for double saddle point problem arising from liquid crystal directors modeling, Numer. Algor., 89 (2022), pp. 987–1006.