This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\marginsize

2cm2cm2.54cm2cm

footnotetext: Email address: [email protected]

The standard form and convergence theory of the relaxation Kaczmarz-Tanabe method for solving linear systems

Chuan-gang Kang
School of Mathematical Sciences
Tiangong University Tianjin 300387 People’s Republic of China
Abstract

The Kaczmarz method is a popular iterative method for solving consistent, overdetermined linear system such as medical imaging in computerized tomography. The Kaczmarz’s iteration repeatedly scans all equations in order, which leads to lower computational efficiency especially in solving a large scale problem. The standard form of Kaczmarz-Tanabe’s iteration proposed recently effectively overcomes the computational redundancy problem of the Kaczmarz method. In this paper, we introduce relaxation parameters 𝐮=(μ1,,μm){\bf u}=(\mu_{1},\ldots,\mu_{m}) into the Kaczmarz-Tanabe method based on the relaxation Kaczmarz method, and consider the standard form and convergence of this combination. Moreover, we analyze and prove the sufficient conditions for convergence of the relaxation Kaczmarz-Tanabe method, i.e., μi(0,2)\mu_{i}\in(0,2). Numerical experiments show the convergence characteristics of the relaxation Kaczmarz-Tanabe method corresponding to these parameters.

Key words. Kaczmarz method; Relaxation Kaczmarz method; Relaxation Kaczmarz-Tanabe method; Computerized tomography

AMS subject classification. 65F10; 65F08; 65N22; 65J20

1 Introduction

In medical imaging tomography (see, i.e., [1, 2, 3]) such as computerized tomography (CT), magnetic resonance imaging (MRI) and electrical impedance tomography (EIT), people are often required to solve the following linear system of equations [4, 5, 6],

Ax=b,\displaystyle Ax=b, (1.1)

where Am×n,bmA\in\mathbb{R}^{m\times n},b\in\mathbb{R}^{m} are also called projection matrix and measurement vector, respectively. We suppose that (1.1) is over-determined and consistent, and xx^{*} is a true solution. When (1.1) has many solutions, xx^{\dagger} is used to denote the minimum least-squares solution.

The Kaczmarz method [7] is one of the most popular methods to solve (1.1) in computerized tomography. Let A=(a1,a2,,am)TA=(a_{1},a_{2},\ldots,a_{m})^{T}, Kaczmarz’s iteration is written as

xk=xk1+biai,xk1ai22ai,k=1,2,,\displaystyle x_{k}=x_{k-1}+\frac{b_{i}-\langle a_{i},x_{k-1}\rangle}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots, (1.2)

where i=mod(k1,m)+1i=\bmod(k-1,m)+1, x,y=xTy\langle x,y\rangle=x^{T}y and x2=x,x\|x\|_{2}=\sqrt{\langle x,x\rangle} denote the inner product of x,yx,y and the square norm of xx in n\mathbb{R}^{n}, respectively.

The Kaczmarz method was originally discovered by Kaczmarz in 1937, and it was rediscovered by Gordon, Bender and Herman [8] to deal with X-ray photography as an algebraic reconstruction technique (ART) in 1970. The convergence of the Kaczmarz method was established by Tanabe in 1971. Recently, Kang and Zhou gave the convergence rate results of the Kaczmarz method [9]. In order to improve computing speed, Eggermont [10], Censor [11], et al. considered the problem of block calculation and adopted approximate and executable iterative forms of the block Kaczmarz method to solve this problem. The strictly block Kaczmarz method was given by Rebrova and Needell [12]. In addition, the simultaneous iterative reconstructive technique (SIRT) as another important branch is also developed rapidly, and there appeared many known methods) such as Cimmino method[13], Landweber method [14], CAV method [15], DROP method [16], and SART method[17], etc.

With the development of hardware technology and computing technology, the idea of stochastic theory is continuously integrated into the non-randomized methods. In this context, people introduced and investigated the randomized Kaczmarz method [18, 19, 20, 21, 22] and greedy randomized Kaczmarz methods [23, 24].

From the perspective of SIRT methods, Kang [25] considered the matrix-vector form of the Kaczmarz method, i.e., the Kaczmarz-Tanabe’s iteration which is named by Popa [26], and the iteration reads

yk+1=(IA𝒮TMA)yk+A𝒮TMb,k=0,1,2,,\displaystyle y_{k+1}=(I-A_{\mathcal{S}}^{T}MA)y_{k}+A_{\mathcal{S}}^{T}Mb,\qquad k=0,1,2,\ldots, (1.3)

where II denotes identity matrix of whatever size appropriate to the context, and

Pi=IaiaiTai22,Qm=I,Qj=PmPm1Pj+1,Q=PmPm1P1,\displaystyle P_{i}=I-\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}},\quad Q_{m}=I,Q_{j}=P_{m}P_{m-1}\ldots P_{j+1},\quad Q=P_{m}P_{m-1}\cdots P_{1},
M=diag(1/a122,1/a222,,1/am22),A𝒮=(Q1a1,Q2a2,,Qmam)T.\displaystyle M=\text{diag}(1/\|a_{1}\|_{2}^{2},1/\|a_{2}\|_{2}^{2},\ldots,1/\|a_{m}\|_{2}^{2}),\quad A_{\mathcal{S}}=(Q_{1}a_{1},Q_{2}a_{2},\ldots,Q_{m}a_{m})^{T}.

Compared with the classic form of the Kaczmarz-Tanabe iteration, the iteration (1.3) has been greatly improved. However, each column of A𝒮A_{\mathcal{S}} is the product of matrix QiQ_{i} and column vector aia_{i} from AA. Notice this defect of A𝒮A_{\mathcal{S}}, Kang [27] considered the decomposition of A𝒮A_{\mathcal{S}} and gave the standard form of the Kaczmarz-Tanabe’s iteration, i.e.,

yk+1=yk+ATCTM(bAyk),k=0,1,2,,\displaystyle y_{k+1}=y_{k}+A^{T}C^{T}M(b-Ay_{k}),\qquad k=0,1,2,\ldots, (1.4)

where CC is defined as [27, Corollary 2.7]. The iterative scheme (1.4) has no essential difference from the iterative formula of the SIRT methods.

In this paper, we combine the relaxation Kaczmarz method and the Kaczmarz-Tanabe method, and introduce relaxation parameters into Kaczmarz-Tanabe method and propose the relaxation Kaczmarz-Tanabe method. Just as the convergence of the relaxation method is related to the relaxation parameters, the convergence of the relaxation Kaczmarz-Tanabe is also related to the relaxation parameters. Hence we further consider the convergence conditions (about the relaxation parameters) and convergence results of the relaxation Kaczmarz-Tanabe method.

The rest of this work is organized as follows. In Section 2, we consider the relaxation Kaczmarz method and derive the sufficient conditions for its convergence. In Section 3, we consider the standard form of the relaxation Kaczmarz-Tanabe method, and discuss their properties such as decomposition of A𝒮(𝐮)A_{\mathcal{S}}({\bf u}), convergence condition, convergence, and so on. Section 4 is about the algorithm flow to compute C(𝐮)C({\bf u}) appearing in (3.19) and (3.24). In Section LABEL:section.numerical.tests, we verify the effectiveness of the Kaczmarz-Tanabe method with given relaxation parameters by several classical numerical examples.

2 The relaxation Kaczmarz method and its convergence

The relaxation Kaczmarz’s iteration is written as

xk=xk1+μibiai,xk1ai22ai,k=1,2,,\displaystyle x_{k}=x_{k-1}+\mu_{i}\frac{b_{i}-\langle a_{i},x_{k-1}\rangle}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots, (2.1)

where i=mod(k1,m)+1i=\bmod(k-1,m)+1 and μi\mu_{i} are relaxation parameters. Let

Pi(μi)=IμiaiaiTai22,i=1,2,,m,\displaystyle P_{i}(\mu_{i})=I-\mu_{i}\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}},\quad i=1,2,\ldots,m, (2.2)

then it follows from (2.1) that

xk=Pi(μi)xk1+μibiai22ai,k=1,2,.\displaystyle x_{k}=P_{i}(\mu_{i})x_{k-1}+\mu_{i}\frac{b_{i}}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots. (2.3)

Denote

Pi(μ):=IμaiaiT/ai22,\displaystyle P_{i}(\mu):=I-\mu a_{i}a_{i}^{T}/\|a_{i}\|_{2}^{2}, (2.4)

then the following lemma gives a sufficient condition for the nonexpansion of linear operator Pi(μ)P_{i}(\mu).

Lemma 2.1.

Let {Pi(μ),i=1,2,,m}\{P_{i}(\mu),i=1,2,\cdots,m\} be defined as in (2.4), then, for any 0<μ<20<\mu<2,

Pi(μ)x2x2,\displaystyle\|P_{i}(\mu)x\|_{2}\leq\|x\|_{2},

hold and the equality hold iif xN(aiT){x:aiTx=0}x\in N(a_{i}^{T})\triangleq\{x:a_{i}^{T}x=0\}. Moreover, for any μ\mu\in\mathbb{R} there also holds Pi(μ)x=xP_{i}(\mu)x=x when xN(aiT)x\in N(a_{i}^{T}).

Proof. For any 1im1\leq i\leq m, we have

Pi(μ)x22=(IμaiaiTai22)x22=x22μ(2μ)aiTx22/ai22.\displaystyle\|P_{i}(\mu)x\|_{2}^{2}=\|(I-\mu\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}})x\|_{2}^{2}=\|x\|_{2}^{2}-\mu(2-\mu)\|a_{i}^{T}x\|_{2}^{2}/\|a_{i}\|_{2}^{2}. (2.5)

If xN(aiT)x\notin N(a_{i}^{T}) and 0<μ<20<\mu<2, then μ(2μ)aiTx22>0\mu(2-\mu)\|a_{i}^{T}x\|_{2}^{2}>0, consequently, Pi(μ)x2<x2\|P_{i}(\mu)x\|_{2}<\|x\|_{2}. Conversely, if xN(aiT)x\in N(a_{i}^{T}), then aiTx=0a_{i}^{T}x=0, and it follows from (2.5) that Pi(μ)x22=x22\|P_{i}(\mu)x\|_{2}^{2}=\|x\|_{2}^{2}. Moreover, there also holds Pi(μ)x=xP_{i}(\mu)x=x when xN(aiT)x\in N(a_{i}^{T}). ∎

Let {xk,k>0}\{x_{k},k>0\} be generated by (2.1) and ek=xkPN(A)x0xe_{k}=x_{k}-P_{N(A)}x_{0}-x^{\dagger}, thus

ek=(IμiaiaiTai22)ek1=Pi(μi)ek1,\displaystyle e_{k}=(I-\mu_{i}\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}})e_{k-1}=P_{i}(\mu_{i})e_{k-1}, (2.6)

where i=mod(k1,m)+1i=\bmod(k-1,m)+1, and we have the following set property.

Lemma 2.2.

Let {xk,k>0}\{x_{k},k>0\} be generated by (2.1), then, for any μi\mu_{i}\in\mathbb{R},

ekN(A)\displaystyle e_{k}\in N(A)^{\bot}

holds.

Proof. We prove the result by mathematical induction. First, we know that e0=y0xPN(A)y0N(A)e_{0}=y_{0}-x^{\dagger}-P_{N(A)}y_{0}\in N(A)^{\bot}. Second, we suppose that for any given k0k\geq 0 there holds ekN(A)e_{k}\in N(A)^{\bot}, then ek+1N(A)e_{k+1}\in N(A)^{\bot}. In fact, for any zN(A)z\in N(A),

ek+1,z=Pi(μi)ek,z=ek,zμibiai22ai,z=0.\displaystyle\langle e_{k+1},z\rangle=\langle P_{i}(\mu_{i})e_{k},z\rangle=\langle e_{k},z\rangle-\mu_{i}\frac{b_{i}}{\|a_{i}\|_{2}^{2}}\langle a_{i},z\rangle=0.

This proves ek+1N(A)e_{k+1}\in N(A)^{\bot}. To sum up, ekN(A)e_{k}\in N(A)^{\bot} for any k0k\geq 0.∎

Lemma 2.2 shows that, for any relaxation parameter μi\mu_{i}, the iterative error {ek,k0}\{e_{k},k\geq 0\} of the relaxation Kaczmarz method belong to N(A)N(A)^{\bot}. The following theorem gives the convergence result of the relaxation Kaczmarz method for solving a consistent linear system when 0<μi<20<\mu_{i}<2 (where i=1,2,,mi=1,2,\ldots,m).

Theorem 2.3.

For a consistent linear system Ax=bAx=b, let {xk,k0}\{x_{k},k\geq 0\} be generated by (2.1) and relaxation parameter 0<μi<20<\mu_{i}<2. Then the sequence {xk,k0}\{x_{k},k\geq 0\} converges to the solution xx^{\dagger} when x0R(AT)x_{0}\in R(A^{T}).

Proof. Let P¯i(μi)Pi(μi)|N(A)\bar{P}_{i}(\mu_{i})\triangleq P_{i}(\mu_{i})|_{N(A)^{\bot}}, when 0<μi<20<\mu_{i}<2, then from Lemma 2.1 there holds P¯i(μi)2<1\|\bar{P}_{i}(\mu_{i})\|_{2}<1, consequently ζmax1imP¯i(μi)<1\zeta\triangleq\max\limits_{1\leq i\leq m}\|\bar{P}_{i}(\mu_{i})\|<1. It follows from (2.6) that

ek2=Pi(μi)ek12.\displaystyle\|e_{k}\|_{2}=\|P_{i}(\mu_{i})e_{k-1}\|_{2}.

By Lemma 2.2, then

ek2=P¯i(μi)ek12P¯i(μi)ek12<ζek12<<ζke02,\displaystyle\|e_{k}\|_{2}=\|\bar{P}_{i}(\mu_{i})e_{k-1}\|_{2}\leq\|\bar{P}_{i}(\mu_{i})\|\|e_{k-1}\|_{2}<\zeta\|e_{k-1}\|_{2}<\ldots<\zeta^{k}\|e_{0}\|_{2},

which means ek20\|e_{k}\|_{2}\rightarrow 0 as kk\rightarrow\infty, i.e. xkx_{k} converges to x+PN(A)x0x^{\dagger}+P_{N(A)}x_{0}. Moreover xkx_{k} also converges to xx^{\dagger} when x0R(AT)x_{0}\in R(A^{T}). ∎

3 The standard form of the relaxation Kaczmarz-Tanabe method

We start with the first equation of (1.1) and perform the iteration (2.3) in turn until the mm-th equation, and denote the mm-th iteration with xmx_{m}, then get

xm\displaystyle x_{m} =Pm(μm)P1(μ1)x0\displaystyle=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x_{0}
+μ1Pm(μm)P2(μ2)b1a122a1+μ2Pm(μm)P3(μ3)b2a222a2++μmbmam22am.\displaystyle\quad+\mu_{1}P_{m}(\mu_{m})\cdots P_{2}(\mu_{2})\frac{b_{1}}{\|a_{1}\|_{2}^{2}}a_{1}+\mu_{2}P_{m}(\mu_{m})\cdots P_{3}(\mu_{3})\frac{b_{2}}{\|a_{2}\|_{2}^{2}}a_{2}+\cdots+\mu_{m}\frac{b_{m}}{\|a_{m}\|_{2}^{2}}a_{m}. (3.1)

For 1jm11\leq j\leq m-1, let

𝐮:=(μ1,,μm),𝐮j:=(μj+1,,μm),\displaystyle{\bf u}:=(\mu_{1},\ldots,\mu_{m}),\quad{\bf u}_{j}:=(\mu_{j+1},\ldots,\mu_{m}), (3.2)
Q(𝐮):=Pm(μm)P1(μ1),Qj(𝐮j):=Pm(μm)Pj+1(μj+1).\displaystyle Q({\bf u}):=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1}),\quad Q_{j}({\bf u}_{j}):=P_{m}(\mu_{m})\cdots P_{j+1}(\mu_{j+1}). (3.3)

If μi1\mu_{i}\equiv 1 for each 1im1\leq i\leq m, then Q(𝐮)=Q,Q(𝐮j)=QjQ({\bf u})=Q,Q({\bf u}_{j})=Q_{j}. In addition, we agree that Qm(𝐮m)=IQ_{m}({\bf u}_{m})=I, where II denote identity matrix of whatever size appropriate to the context.

Let

A𝒮(𝐮)(Q1(𝐮1)a1,Q2(𝐮2)a2,,Qm(𝐮m)am)T,Λ=𝐝𝐢𝐚𝐠(μ1,,μm).\displaystyle A_{\mathcal{S}}({\bf u})\triangleq(Q_{1}({\bf u}_{1})a_{1},Q_{2}({\bf u}_{2})a_{2},\ldots,Q_{m}({\bf u}_{m})a_{m}\big{)}^{T},\quad\Lambda=\mathbf{diag}(\mu_{1},\ldots,\mu_{m}). (3.4)

Thus we get from (3), (3.2), (3.3) and (3.4) that

xm=Q(𝐮)x0+[A𝒮(𝐮)]TΛMb.\displaystyle x_{m}=Q({\bf u})x_{0}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb.

Consequently, it follows by repeating this iteration that

xkm=Q(𝐮)x(k1)m+[A𝒮(𝐮)]TΛMb,k=1,2,.\displaystyle x_{k\cdot m}=Q({\bf u})x_{(k-1)\cdot m}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb,\quad k=1,2,\ldots. (3.5)

Taking yk=xkmy_{k}=x_{k\cdot m}, then we get the relaxation Kaczmarz-Tanabe’s iteration

yk=Q(𝐮)yk1+[A𝒮(𝐮)]TΛMb,k=1,2,,\displaystyle y_{k}=Q({\bf u})y_{k-1}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb,\quad k=1,2,\ldots, (3.6)

where y0=x0y_{0}=x_{0}.

Lemma 3.1.

Let Q(𝐮)Q(\bf{u}) and Qj(𝐮j)Q_{j}({\bf u}_{j}) be defined as in (3.3), then there holds

Q(𝐮)=I[A𝒮(𝐮)]TΛMA.\displaystyle Q({\bf u})=I-[A_{\mathcal{S}}({\bf u})]^{T}\Lambda MA. (3.7)

Proof. First, we have from (3.3) that

Q(𝐮)\displaystyle Q({\bf u}) =Q1(𝐮1)μ1Q1(𝐮1)a1a1Ta122.\displaystyle=Q_{1}({\bf u}_{1})-\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\|a_{1}\|_{2}^{2}}.

By Qj1(𝐮j1)=Qj(𝐮j)μjQj(𝐮j)ajajTaj22Q_{j-1}({\bf u}_{j-1})=Q_{j}({\bf u}_{j})-\mu_{j}Q_{j}({\bf u}_{j})\frac{a_{j}a_{j}^{T}}{\|a_{j}\|_{2}^{2}}, then

Q(𝐮)\displaystyle Q({\bf u}) =Q1(𝐮1)μ1Q1(𝐮1)a1a1Ta122\displaystyle=Q_{1}({\bf u}_{1})-\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\|a_{1}\|_{2}^{2}}
=Q2(𝐮2)(μ1Q1(𝐮1)a1a1Ta122+μ2Q2(𝐮2)a2a2Ta222)\displaystyle=Q_{2}({\bf u}_{2})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\|a_{1}\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\|a_{2}\|_{2}^{2}}\big{)}
=\displaystyle=\cdots
=Qm(𝐮m)(μ1Q1(𝐮1)a1a1Ta122+μ2Q2(𝐮2)a2a2Ta222++μmQm(𝐮m)amamTam22)\displaystyle=Q_{m}({\bf u}_{m})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\|a_{1}\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\|a_{2}\|_{2}^{2}}+\cdots+\mu_{m}Q_{m}({\bf u}_{m})\frac{a_{m}a_{m}^{T}}{\|a_{m}\|_{2}^{2}}\big{)}
=I(μ1Q1(𝐮1)a1,μ2Q2(𝐮2)a2,,μmQm(𝐮m)am)𝐝𝐢𝐚𝐠(1a122,,1am22)(a1,a2,,am)T.\displaystyle=I-\big{(}\mu_{1}Q_{1}({\bf u}_{1})a_{1},\mu_{2}Q_{2}({\bf u}_{2})a_{2},\cdots,\mu_{m}Q_{m}({\bf u}_{m})a_{m}\big{)}\mathbf{diag}(\frac{1}{\|a_{1}\|_{2}^{2}},\cdots,\frac{1}{\|a_{m}\|_{2}^{2}})\big{(}a_{1},a_{2},\cdots,a_{m}\big{)}^{T}.

holds. This proves (3.7).∎

By Lemma 3.1, we can get the alternative form of (3.6), i.e.,

yk=yk1+[A𝒮(𝐮)]TΛM(bAyk1),k=1,2,.\displaystyle y_{k}=y_{k-1}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda M(b-Ay_{k-1}),\quad k=1,2,\ldots.

For the further discussion to (3.6), we extend the conceptions of sequential projection matrix, sequential projection matrix set and sequential compatible to Qj(𝐮j)Q_{j}({\bf u}_{j}) (see [27, Definition 2.1 & Definition 2.2]).

Definition 3.2.

We call Qj(𝐮j)Q_{j}({\bf u}_{j}) a sequential projection matrix with parameters on aj+1,,ama_{j+1},\ldots,a_{m}, and denote the sequential projection matrix set with Ssp(a1,,am;𝐮)S_{sp}(a_{1},\ldots,a_{m};{\bf u}), i.e.,

Ssp(a1,a2,,am;𝐮)={Q1(𝐮1),Q2(𝐮2),,Qm1(𝐮m1)}.\displaystyle S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})=\{Q_{1}({\bf u}_{1}),Q_{2}({\bf u}_{2}),\ldots,Q_{m-1}({\bf u}_{m-1})\}. (3.8)
Definition 3.3.

For any Qi(𝐮i)Ssp(a1,a2,,am;𝐮)Q_{i}({\bf u}_{i})\in S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u}), if there exist ζi,1,,ζi,m\zeta_{i,1},\ldots,\zeta_{i,m} such that

Qi(𝐮i)ai=ζi,1(𝐮i)a1+ζi,m(𝐮i)am,\displaystyle Q_{i}({\bf u}_{i})a_{i}=\zeta_{i,1}({\bf u}_{i})a_{1}+\ldots\zeta_{i,m}({\bf u}_{i})a_{m}, (3.9)

then we call AA and Ssp(a1,a2,,am;𝐮)S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u}) sequentially compatible. In general, for any 1im,ijm1\leq i\leq m,i\leq j\leq m, if there exist ζ1(i,j)(𝐮j),,ζm(i,j)(𝐮j)\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j}) such that

Qj(𝐮j)ai=ζ1(i,j)(𝐮j)a1+ζm(i,j)(𝐮j)am,\displaystyle Q_{j}({\bf u}_{j})a_{i}=\zeta_{1}^{(i,j)}({\bf u}_{j})a_{1}+\ldots\zeta_{m}^{(i,j)}({\bf u}_{j})a_{m}, (3.10)

then we call AA and Ssp(a1,a2,,am;𝐮)S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u}) forward sequential compatible, and call (ζ1(i,j)(𝐮j),,ζm(i,j)(𝐮j))(\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j})) compatible vector of Qj(𝐮j)aiQ_{j}({\bf u}_{j})a_{i} on AA.

Theorem 3.4.

Suppose AA has no zero row, and Ssp(a1,,am;𝐮)S_{sp}(a_{1},\ldots,a_{m};{\bf u}) is defined by (3.8), then AA and Ssp(a1,,am;𝐮)S_{sp}(a_{1},\ldots,a_{m};{\bf u}) are forward sequential compatible.

Proof. Similar to [27, Theorem 2.6], we will complete the proof by mathematical induction. We take the subscript (i,j)(i,j) of Qj(𝐮j)aiQ_{j}({\bf u}_{j})a_{i} as an ordered array. Obviously, amT[Qm(𝐮m)]T=amTa_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}=a_{m}^{T}, that is, (3.10) holds for (i,j)=(m,m)(i,j)=(m,m) and (ζ1(m,m)(𝐮m),,ζm(m,m)(𝐮m))=(0,,0,1)(\zeta_{1}^{(m,m)}({\bf u}_{m}),\ldots,\zeta_{m}^{(m,m)}({\bf u}_{m}))=(0,\ldots,0,1). In fact, for any 1im1\leq i\leq m, (3.10) obviously holds for aiT[Qm(𝐮m)]Ta_{i}^{T}[Q_{m}({\bf u}_{m})]^{T} because Qm=IQ_{m}=I is the agreement. Consequently, As the first step of mathematical induction, we prove (3.10) holds for (i,j)=(m1,m1)(i,j)=(m-1,m-1). Actually,

am1T[Qm1(𝐮m1)]T=am1T[Pm(μm)]T[Qm(𝐮m)]T=am1Tμmam1Tamam22amT.\displaystyle a_{m-1}^{T}[Q_{m-1}({\bf u}_{m-1})]^{T}=a_{m-1}^{T}[P_{m}(\mu_{m})]^{T}[Q_{m}({\bf u}_{m})]^{T}=a_{m-1}^{T}-\mu_{m}\frac{a_{m-1}^{T}a_{m}}{\|a_{m}\|_{2}^{2}}a_{m}^{T}.

So (3.9) holds for i=m1i=m-1, where

(ζ1(m1,m1)(𝐮m1),,ζm(m1,m1)(𝐮m1))=(0,,0,1,μmam1Tam/am22).(\zeta_{1}^{(m-1,m-1)}({\bf u}_{m-1}),\ldots,\zeta_{m}^{(m-1,m-1)}({\bf u}_{m-1}))=(0,\ldots,0,1,-\mu_{m}a_{m-1}^{T}a_{m}/\|a_{m}\|_{2}^{2}).

Second, we suppose (3.10) holds for any given (i,j)(i,j) subjected to si<ms\leq i<m and st<j<ms\leq t<j<m, i.e., there exists (ζ1(i,j)(𝐮j),,ζm(i,j)(𝐮j))(\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j})) such that

aiT[Qj(𝐮j)]T=ζ1(i,j)(𝐮j)a1T+ζm(i,j)(𝐮j)amT.\displaystyle a_{i}^{T}[Q_{j}({\bf u}_{j})]^{T}=\zeta_{1}^{(i,j)}({\bf u}_{j})a_{1}^{T}+\ldots\zeta_{m}^{(i,j)}({\bf u}_{j})a_{m}^{T}.

Then, we prove (3.10) holds for (i,j)=(s,t)(i,j)=(s,t). Because of Qt(𝐮t)=Qt+1(𝐮t+1)Pt+1(μt+1)Q_{t}({\bf u}_{t})=Q_{t+1}({\bf u}_{t+1})P_{t+1}(\mu_{t+1}), then

asT[Qt(𝐮t)]T=asT[Pt+1(μt+1)]T[Qt+1(𝐮t+1)]T=asT[Qt+1(𝐮t+1)]Tμt+1asTat+1at+122at+1T[Qt+1(𝐮t+1)]T.\displaystyle a_{s}^{T}[Q_{t}({\bf u}_{t})]^{T}=a_{s}^{T}[P_{t+1}(\mu_{t+1})]^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=a_{s}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}-\mu_{t+1}\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}a_{t+1}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}. (3.11)

From the hypothesis, there exist (ζ1(s,t+1)(𝐮t+1),,ζm(s,t+1)(𝐮t+1))(\zeta_{1}^{(s,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})) and (ζ1(t+1,t+1)(𝐮t+1),,ζm(t+1,t+1)(𝐮t+1))(\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})) such that

asT[Qt+1(𝐮t+1)]T=ζ1(s,t+1)(𝐮t+1)a1T++ζm(s,t+1)(𝐮t+1)amT,\displaystyle a_{s}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})a_{m}^{T},
at+1T[Qt+1(𝐮t+1)]T=ζ1(t+1,t+1)(𝐮t+1)a1T++ζm(t+1,t+1)(𝐮t+1)amT.\displaystyle a_{t+1}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})a_{m}^{T}.

Then, it follows from (3.11) that

asT[Qt(𝐮t)]T\displaystyle a_{s}^{T}[Q_{t}({\bf u}_{t})]^{T}
=ζ1(s,t+1)(𝐮t+1)a1T++ζm(s,t+1)(𝐮t+1)amTasTat+1at+122(ζ1(t+1,t+1)(𝐮t+1)a1T++ζm(t+1,t+1)(𝐮t+1)amT)\displaystyle=\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})a_{m}^{T}-\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}\big{(}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})a_{m}^{T}\big{)}
=(ζ1(s,t+1)(𝐮t+1)asTat+1at+122ζ1(t+1,t+1)(𝐮t+1))a1T++(ζm(s,t+1)(𝐮t+1)asTat+1at+122ζm(t+1,t+1)(𝐮t+1))amT.\displaystyle=\big{(}\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}a_{1}^{T}+\ldots+\big{(}\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}a_{m}^{T}.

Denote

(ζ1(s,t)(𝐮t),,ζm(s,t)(𝐮t))\displaystyle\big{(}\zeta_{1}^{(s,t)}({\bf u}_{t}),\ldots,\zeta_{m}^{(s,t)}({\bf u}_{t})\big{)}
=(ζ1(s,t+1)(𝐮t+1)asTat+1at+122ζ1(t+1,t+1)(𝐮t+1),,ζm(s,t+1)(𝐮t+1)asTat+1at+122ζm(t+1,t+1)(𝐮t+1)).\displaystyle=\big{(}\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}.

This proves (3.10) holds for (i,j)=(s,t)(i,j)=(s,t).

To sum up the above, the conclusion is proved for all (i,j)(i,j) with respect to 1im,ijm1\leq i\leq m,i\leq j\leq m. Namely, AA and Ssp(a1,a2,,am;𝐮)S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u}) generated by relaxation Kaczmarz’s iteration are forward sequential compatible. ∎

In order to obtain the standard form (i.e., matrix-vector form) of the relaxation Kaczmarz-Tanabe’s iteration, we first import the following decomposition theorem about A𝒮(𝐮)A_{\mathcal{S}}({\bf u}).

Theorem 3.5.

Suppose AA has no zero rows, and let A𝒮(𝐮)A_{\mathcal{S}}({\bf u}) be defined by (3.4), then there exists a unit upper triangular matrix C(𝐮)m×mC({\bf u})\in\mathbb{R}^{m\times m} such that A𝒮(𝐮)=C(𝐮)AA_{\mathcal{S}}({\bf u})=C({\bf u})A.

Proof. According to A𝒮(𝐮)=(Q1(𝐮1)a1,,Qm(𝐮m)am)TA_{\mathcal{S}}({\bf u})=(Q_{1}({\bf u}_{1})a_{1},\ldots,Q_{m}({\bf u}_{m})a_{m})^{T} and Theorem 3.4, the corollary can be proved by taking [C(𝐮)](i,j)=ζi,j(𝐮i)[C({\bf u})](i,j)=\zeta_{i,j}({\bf u}_{i}), where ζi,j(𝐮i)\zeta_{i,j}({\bf u}_{i}) is determined by the forward sequential compatible of AA and Ssp(a1,,am;𝐮)S_{sp}(a_{1},\ldots,a_{m};{\bf u}). ∎

Theorem 3.5 is a general result, but it shows that there exists a structure-reserved decomposition in relaxation Kaczmarz-Tanabe’s iteration when a relaxation parameter is introduced into Kaczmarz’s iteration. However, the decomposition is not unique when AA is not row full rank.

By Theorem 3.5, then we get the standard form of the relaxation Kaczmarz-Tanabe’s iteration from (3.6) that

yk=yk1+AT[C(𝐮)]TΛM(bAyk1),k=1,2,.\displaystyle y_{k}=y_{k-1}+A^{T}[C({\bf u})]^{T}\Lambda M(b-Ay_{k-1}),\quad k=1,2,\ldots. (3.12)

In order to get the special form of CC in Kaczmarz-Tanabe’s iteration, Kang introduced the concept of index set Id(n1,n2,v)I_{d}(n_{1},n_{2},v) in [27, Definition 2.9], i.e.,

Definition 3.6.

The index set Id(n1,n2,v)I_{d}(n_{1},n_{2},v) is defined as follows

Id(n1,n2,v)={[Id(1),,Id(v)]},\displaystyle I_{d}(n_{1},n_{2},v)=\Big{\{}[I_{d}(1),\ldots,I_{d}(v)]\Big{\}},

where n1,n2,vn_{1},n_{2},v are positive integers satisfied |n1n2|v2|n_{1}-n_{2}|\geq v\geq 2. Id(i)I_{d}(i) is integer between n1n_{1} and n2n_{2}, and Id(1)=n1,Id(v)=n2I_{d}(1)=n_{1},I_{d}(v)=n_{2}. For any i<ji<j there satisfy

Id(i)<Id(j),n1<n2,Id(i)>Id(j),n1>n2.\displaystyle\begin{array}[]{ll}I_{d}(i)<I_{d}(j),&n_{1}<n_{2},\\ I_{d}(i)>I_{d}(j),&n_{1}>n_{2}.\end{array}

By Definition 3.6, we can give the following expression of aiT[Qi(𝐮i)]Tx~a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x} for any 1im1\leq i\leq m and x~N(A)\tilde{x}\in N(A)^{\bot}.

Lemma 3.7.

Suppose that AA has no zero row, QiQ_{i} is the sequential projection matrix of AA and x~N(A)\tilde{x}\in N(A)^{\bot}. For any 1im,i+1jm11\leq i\leq m,i+1\leq j\leq m-1, denote

di,j(𝐮i)=v=2ji+1(1)v1Id(i,j,v)s=1v1μId(s+1)s=1v1hId(s),Id(s+1).\displaystyle d_{i,j}({\bf u}_{i})=\sum_{v=2}^{j-i+1}(-1)^{v-1}\sum_{I_{d}(i,j,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}. (3.13)

Then,

aiT[Qi(𝐮i)]Tx~=(1,di,i+1(𝐮i),,di,m(𝐮i))(aiT,ai+1T,,amT)Tx~\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x}=(1,d_{i,i+1}({\bf u}_{i}),\ldots,d_{i,m}({\bf u}_{i}))(a_{i}^{T},a_{i+1}^{T},\ldots,a_{m}^{T})^{T}\tilde{x} (3.14)

holds. That is, the compatible vector of aiT[Qi(𝐮i)]Ta_{i}^{T}[Q_{i}({\bf u}_{i})]^{T} on AA is (0,,0,1,di,i+1(𝐮i),,di,m(𝐮i))(0,\ldots,0,1,d_{i,i+1}({\bf u}_{i}),\ldots,d_{i,m}({\bf u}_{i})).

Proof. When 1im1\leq i\leq m and x~N(A)\tilde{x}\in N(A)^{\bot}, we have

aiT[Qi(𝐮i)]Tx~\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x} =(1,μi+1hi,i+1)(aiT[Qi+1(𝐮i+1)]Tx~,ai+1T[Qi+1(𝐮i+1)]Tx~)T\displaystyle=(1,-\mu_{i+1}h_{i,i+1})(a_{i}^{T}[Q_{i+1}({\bf u}_{i+1})]^{T}\tilde{x},a_{i+1}^{T}[Q_{i+1}({\bf u}_{i+1})]^{T}\tilde{x})^{T}
=(1,μi+1hi,i+1,μi+2hi,i+2+μi+1μi+2hi,i+1hi+1,i+2)\displaystyle=(1,-\mu_{i+1}h_{i,i+1},-\mu_{i+2}h_{i,i+2}+\mu_{i+1}\mu_{i+2}h_{i,i+1}h_{i+1,i+2})
(aiT[Qi+2(𝐮i+2)]Tx~,ai+1T[Qi+2(𝐮i+2)]Tx~,ai+2T[Qi+2(𝐮i+2)]Tx~)T\displaystyle\quad\cdot(a_{i}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x},a_{i+1}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x},a_{i+2}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x})^{T}
=(1,μi+1hi,i+1,,v=2mi+1(1)m1Id(i,m,v)s=1v1μId(s+1)s=1v1hId(s),Id(s+1))\displaystyle=\big{(}1,-\mu_{i+1}h_{i,i+1},\ldots,\sum_{v=2}^{m-i+1}(-1)^{m-1}\sum_{I_{d}(i,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}
(aiT[Qm(𝐮m)]Tx~,ai+1T[Qm(𝐮m)]Tx~,,amT[Qm(𝐮m)]Tx~)T.\displaystyle\quad\cdot(a_{i}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x},a_{i+1}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x},\ldots,a_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x})^{T}.

By the agreement Qm(𝐮m)=IQ_{m}({\bf u}_{m})=I,

aiT[Qi(𝐮i)]Tx~\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x} =(1,μi+1hi,i+1,,v=2mi+1(1)v1Id(i,m,v)s=1v1μId(s+1)s=1v1hId(s),Id(s+1))\displaystyle=\big{(}1,-\mu_{i+1}h_{i,i+1},\ldots,\sum_{v=2}^{m-i+1}(-1)^{v-1}\sum_{I_{d}(i,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}
(aiTx~,ai+1Tx~,,amTx~)T.\displaystyle\quad\cdot(a_{i}^{T}\tilde{x},a_{i+1}^{T}\tilde{x},\ldots,a_{m}^{T}\tilde{x})^{T}. (3.15)

Thus (3.14) holds by taking di,j(𝐮i)d_{i,j}({\bf u}_{i}) according to (3.13). Because (3.14) holds for any x~n\tilde{x}\in\mathbb{R}^{n}, we then get the compatible vector of aiT[Qi(𝐮i)]Ta_{i}^{T}[Q_{i}({\bf u}_{i})]^{T} on AA. ∎

Theorem 3.8.

Under the Lemma 3.7, let Ω(𝐮)=(ωi,j(𝐮i))m×m\Omega({\bf u})=(\omega_{i,j}({\bf u}_{i}))_{m\times m} satisfy

ωi,j(𝐮i)={di,j(𝐮i),j>i,1,j=i,0,j<i.\displaystyle\omega_{i,j}({\bf u}_{i})=\left\{\begin{array}[]{ll}d_{i,j}({\bf u}_{i}),&j>i,\\ 1,&j=i,\\ 0,&j<i.\end{array}\right. (3.19)

Then,

A𝒮(𝐮)=Ω(𝐮)A.\displaystyle A_{\mathcal{S}}({\bf u})=\Omega({\bf u})A.

Proof. For any x~N(A)\tilde{x}\in N(A)^{\bot},

A𝒮(𝐮)x~=(a1T[Q1(𝐮1)]Tx~,a2T[Q2(𝐮2)]Tx~,,amT[Qm(𝐮m)]Tx~)T.\displaystyle A_{\mathcal{S}}({\bf u})\tilde{x}=(a_{1}^{T}[Q_{1}({\bf u}_{1})]^{T}\tilde{x},a_{2}^{T}[Q_{2}({\bf u}_{2})]^{T}\tilde{x},\ldots,a_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x})^{T}. (3.20)

By Lemma 3.7 and (3.19), we then get

A𝒮(𝐮)x~=Ω(𝐮)Ax~.\displaystyle A_{\mathcal{S}}({\bf u})\tilde{x}=\Omega({\bf u})A\tilde{x}. (3.21)

Moreover, when x~N(A)\tilde{x}\in N(A), (3.21) obviously holds. Therefore, for any x~n\tilde{x}\in\mathbb{R}^{n}, there holds A𝒮(𝐮)x~=Ω(𝐮)Ax~A_{\mathcal{S}}({\bf u})\tilde{x}=\Omega({\bf u})A\tilde{x}, which means A𝒮(𝐮)=Ω(𝐮)AA_{\mathcal{S}}({\bf u})=\Omega({\bf u})A. ∎

In the context, we especially call the iteration (3.12) relaxation Kaczmarz-Tanabe’s iteration when C(𝐮)=Ω(𝐮)C({\bf u})=\Omega({\bf u}) is taken according to (3.19).

Let E(j,i(μihj,i))E(j,i(-\mu_{i}h_{j,i})) be a matrix obtained by multiplying the ii-th row of the identity matrix by μihj,i-\mu_{i}h_{j,i} and adding it to the jj-th row, i.e., the diagonal elements of E(j,i(μihj,i))E(j,i(-\mu_{i}h_{j,i})) are all 11, the (j,i)(j,i)- element is μihj,i-\mu_{i}h_{j,i}, and all other elements are 0. Consequently, we have the following decomposition theorem of Ω(𝐮)\Omega({\bf u}).

Theorem 3.9.

For any μi\mu_{i}\in\mathbb{R}, let Ω(𝐮)\Omega({\bf u}) be defined by (3.19), and define

H1(μ1)=I,Hi(μi)=j=1i1E(j,i(μihj,i)),1<im.\displaystyle H_{1}(\mu_{1})=I,H_{i}(\mu_{i})=\prod\limits_{j=1}^{i-1}E(j,i(-\mu_{i}h_{j,i})),\quad 1<i\leq m.

Then,

Ω(𝐮)=H1(μ1)H2(μ2)Hm(μm).\displaystyle\Omega({\bf u})=H_{1}(\mu_{1})H_{2}(\mu_{2})\cdots H_{m}(\mu_{m}). (3.22)

Proof. For any x~N(A)\tilde{x}\in N(A)^{\bot}, from (3), we have

ajT[Qj(𝐮j)]Tx~\displaystyle a_{j}^{T}[Q_{j}({\bf u}_{j})]^{T}\tilde{x} =(0,,0,1,μj+1hj,j+1,,v=2mj+1(1)v1Id(j,m,v)s=1v1μId(s+1)s=1v1hId(s),Id(s+1))\displaystyle=\big{(}0,\ldots,0,1,-\mu_{j+1}h_{j,j+1},\ldots,\sum_{v=2}^{m-j+1}(-1)^{v-1}\sum_{I_{d}(j,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}
(a1Tx~,a2Tx~,,amTx~)T.\displaystyle\quad\cdot(a_{1}^{T}\tilde{x},a_{2}^{T}\tilde{x},\ldots,a_{m}^{T}\tilde{x})^{T}. (3.23)

From (3), the coefficient of aiTx~a_{i}^{T}\tilde{x} is actually the (j,i)(j,i)-element of Ω(𝐮)\Omega({\bf u}) when i>ji>j, i.e.,

ωj,i(𝐮j)=μihj,i+μiμi1hj,i1hi1,i++(1)ijμiμi1μj+1hj,j+1hj+1,j+2hi1,i.\displaystyle\omega_{j,i}({\bf u}_{j})=-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots+(-1)^{i-j}\mu_{i}\mu_{i-1}\ldots\mu_{j+1}h_{j,j+1}h_{j+1,j+2}\cdot\ldots\cdot h_{i-1,i}.

Denote H^(𝐮)=H1(μ1)Hm(μm)\widehat{H}({\bf u})=H_{1}(\mu_{1})\cdots H_{m}(\mu_{m}), in order to show Ω(𝐮)=H1(μ1)Hm(μm)\Omega({\bf u})=H_{1}(\mu_{1})\cdots H_{m}(\mu_{m}), we only need to prove ωj,i(𝐮j)=[H^(𝐮)]j,i\omega_{j,i}({\bf u}_{j})=[\widehat{H}({\bf u})]_{j,i}, i.e.,

ωj,i(𝐮j)=ejTH^(𝐮)ei,\displaystyle\omega_{j,i}({\bf u}_{j})=e_{j}^{T}\widehat{H}({\bf u})e_{i},

where eje_{j} and eie_{i} are the jjth and iith columns of the identity matrix in m\mathbb{R}^{m}, respectively. Owing to ejTHk(μk)=ejTe_{j}^{T}H_{k}(\mu_{k})=e_{j}^{T} when jkj\geq k and Hl(μl)ei=eiH_{l}(\mu_{l})e_{i}=e_{i} when ili\neq l, when i>ji>j it then follows from the notation of H^(𝐮)\widehat{H}({\bf u}) that

ejTH^(𝐮)ei\displaystyle e_{j}^{T}\widehat{H}({\bf u})e_{i} =ejTHj+1(μj+1)Hi(μi)ei\displaystyle=e_{j}^{T}H_{j+1}(\mu_{j+1})\cdots H_{i}(\mu_{i})e_{i}
=(ejTHj+1(μj+1))Hj+2(μj+2)Hi(μi)ei\displaystyle=(e_{j}^{T}H_{j+1}(\mu_{j+1}))H_{j+2}(\mu_{j+2})\cdots H_{i}(\mu_{i})e_{i}
=((0,,1,μj+1hj,j+1,0,,0)Hj+2(μj+2))Hj+3(μj+3)Hi(μi)ei\displaystyle=((0,\ldots,1,-\mu_{j+1}h_{j,j+1},0,\ldots,0)H_{j+2}(\mu_{j+2}))H_{j+3}(\mu_{j+3})\cdots H_{i}(\mu_{i})e_{i}
=((0,,1,μj+1hj,j+1,μj+1hj,j+1+μj+2μj+1hj,j+1hj+1,j+2,,0)Hj+2(μj+2))\displaystyle=((0,\ldots,1,-\mu_{j+1}h_{j,j+1},-\mu_{j+1}h_{j,j+1}+\mu_{j+2}\mu_{j+1}h_{j,j+1}h_{j+1,j+2},\ldots,0)H_{j+2}(\mu_{j+2}))
Hj+3(μj+3)Hi(μi)ei\displaystyle\quad\cdot H_{j+3}(\mu_{j+3})\cdots H_{i}(\mu_{i})e_{i}
=(0,,1,μj+1hj,j+1,,μihj,i+μiμi1hj,i1hi1,i+\displaystyle=(0,\ldots,1,-\mu_{j+1}h_{j,j+1},\ldots,-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots
+(1)ijμiμi1μj+1hj,j+1hi1,i,0,,0)ei\displaystyle\quad+(-1)^{i-j}\mu_{i}\mu_{i-1}\ldots\mu_{j+1}h_{j,j+1}\cdots h_{i-1,i},0,\ldots,0)e_{i}
=μihj,i+μiμi1hj,i1hi1,i++(1)ijμiμi1μj+1hj,j+1hj+1,j+2hi1,i.\displaystyle=-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots+(-1)^{i-j}\mu_{i}\mu_{i-1}\cdots\mu_{j+1}h_{j,j+1}h_{j+1,j+2}\cdots h_{i-1,i}.

This proves

ωj,i(𝐮i)=[H^(𝐮)]j,i\omega_{j,i}({\bf u}_{i})=[\widehat{H}({\bf u})]_{j,i}

for any 1jn11\leq j\leq n-1 and i>ji>j. Additionally, there also holds

ωj,j(𝐮j)=[H^(𝐮)]j,j=1\omega_{j,j}({\bf u}_{j})=[\widehat{H}({\bf u})]_{j,j}=1

for any 1jn1\leq j\leq n. Consequently, the conclusion is proved. ∎

Theorem 3.9 actually provides a computational method of C(𝐮)C({\bf u}) for the relaxation Kaczmarz-Tanabe’s algorithm. Moreover, there also holds the following decomposition about ΛC(𝐮)\Lambda C({\bf u}).

Lemma 3.10.

Let C(𝐮)C({\bf u}) be defined like Ω(𝐮)\Omega({\bf u}) in Theorem 3.8, then there exists decomposition of

ΛC(𝐮)=Tu(𝐮)M,\displaystyle\Lambda C({\bf u})=T_{u}({\bf u})M, (3.24)

where Tu(𝐮)m×mT_{u}({\bf u})\in\mathbb{R}^{m\times m} is an upper triangular matrix with diagonal elements [Tu(𝐮)](i,i)=μiai22[T_{u}({\bf u})](i,i)=\mu_{i}\|a_{i}\|_{2}^{2}.

Proof. The proof of the lemma is trivial and omitted here. ∎

3.1 The convergence of the relaxation Kaczmarz-Tanabe method

Next, we consider the convergence of the relaxation Kaczmarz-Tanabe method and give a sufficient condition of convergence for the relaxation Kaczmarz-Tanabe’s iteration. Notice that each relaxation Kaczmarz-Tanabe’s iteration is a composite of mm relaxation Kaczmarz’s iterations, consequently, it is natural to transfer the convergence condition of the relaxation Kaczmarz method to the relaxation Kaczmarz-Tanabe method.

Lemma 3.11.

For a linear system Ax=bAx=b, let Q(𝐮)Q({\bf u}) be defined by (3.3), and we suppose 0<μi<20<\mu_{i}<2 for each 1im1\leq i\leq m. Then, for any xnx\in\mathbb{R}^{n}, there holds Q(𝐮)x2x2\|Q({\bf u})x\|_{2}\leq\|x\|_{2}, and the equality holds iif xN(A)x\in N(A).

Proof. When xN(A)x\in N(A), from Pj(μj)x=x,j=1,,mP_{j}(\mu_{j})x=x,j=1,\ldots,m, therefore there holds

Q(𝐮)x=Pm(μm)P1(μ1)x=x.\displaystyle Q({\bf u})x=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x=x. (3.25)

If xN(A)x\notin N(A), from Lemma 2.1, there exists at least an index s0Θ{1,2,,m}s_{0}\in\Theta\triangleq\{1,2,\ldots,m\} such that Ps0(μs0)x2<x2\|P_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}. We suppose that s0s_{0} is the smallest index in Θ\Theta that Ps0(μs0)x2<x2\|P_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}, then there holds

Q(𝐮)x2=Pm(μm)P1(μ1)x2Ps0(μs0)P1(μ1)x2ps0(μs0)x2<x2.\|Q({\bf u})x\|_{2}=\|P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x\|_{2}\leq\|P_{s_{0}}(\mu_{s_{0}})\cdots P_{1}(\mu_{1})x\|_{2}\leq\|p_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}.

These prove the conclusions. ∎

Lemma 3.11 actually gives the condition to ensure the nonexpansion of Q(𝐮)Q({\bf u}) on n\mathbb{R}^{n}. Let {yk}\{y_{k}\} be generated by (3.12) and e¯k=ykxPN(A)y0\bar{e}_{k}=y_{k}-x^{\dagger}-P_{N(A)}y_{0}, then

e¯k=(IAT[C(𝐮)]TΛMA)e¯k1,\displaystyle\bar{e}_{k}=\big{(}I-A^{T}[C({\bf u})]^{T}\Lambda MA\big{)}\bar{e}_{k-1},

it follows from Lemma 3.1 that

e¯k=Q(𝐮)e¯k1.\displaystyle\bar{e}_{k}=Q({\bf u})\bar{e}_{k-1}. (3.26)

By Lemma 3.11, thus we have the following convergence of the relaxation Kaczmarz-Tanabe method.

Theorem 3.12.

For a consistent linear system (1.1), supposing that {yk,k0}\{y_{k},k\geq 0\} is generated by the relaxation Kaczmarz-Tanabe’s iteration (3.12) with 0<μi<20<\mu_{i}<2 (where 1im1\leq i\leq m). Then {yk,k>0}\{y_{k},k>0\} converges to x+PN(A)y0x^{\dagger}+P_{N(A)}y_{0}.

Proof. Let Qr(𝐮)=Q(𝐮)|N(A)Q_{r}({\bf u})=Q({\bf u})|_{N(A)^{\bot}}, by Lemma 3.11, we have Qr(𝐮)2<1\|Q_{r}({\bf u})\|_{2}<1. Because the sequence generated by the relaxation Kaczmarz-Tanabe’s iteration is a subsequence of the sequence generated by the relaxation Kaczmarz’s iteration, there holds from Lemma 2.2 that e¯kN(A)\bar{e}_{k}\in N(A)^{\bot}. Consequently, by (3.26), we have

e¯k2=Q(𝐮)e¯k12=Qr(𝐮)e¯k12Qr(𝐮)2e¯k12Qr(𝐮)2ke¯02,\displaystyle\|\bar{e}_{k}\|_{2}=\|Q({\bf u})\bar{e}_{k-1}\|_{2}=\|Q_{r}({\bf u})\bar{e}_{k-1}\|_{2}\leq\|Q_{r}({\bf u})\|_{2}\|\bar{e}_{k-1}\|_{2}\leq\cdots\leq\|Q_{r}({\bf u})\|_{2}^{k}\|\bar{e}_{0}\|_{2},

which shows e¯k0\|\bar{e}_{k}\|\rightarrow 0 as kk\rightarrow\infty, i.e., ykx+PN(A)y0y_{k}\rightarrow x^{\dagger}+P_{N(A)}y_{0}.∎

Moreover, we have the following convergence rate of the relaxation Kaczmarz-Tanabe method for solving error-free linear system.

Corollary 3.13.

Under the conditions of Theorem 3.12, there also hold

e¯k2max0<σi(𝐮)<1σi(𝐮)e¯k12\displaystyle\|\bar{e}_{k}\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}\sigma_{i}({\bf u})\|\bar{e}_{k-1}\|_{2} (3.27)

and

e¯k2max0<σi(𝐮)<1[σi(𝐮)]ke¯02,\displaystyle\|\bar{e}_{k}\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}[\sigma_{i}({\bf u})]^{k}\|\bar{e}_{0}\|_{2}, (3.28)

where σi(𝐮)\sigma_{i}({\bf{u}}) is the singular value of Q(𝐮)Q({\bf u}).

Proof. From the proof of Theorem 3.12, we have

e¯k2Qr(𝐮)2e¯k12.\displaystyle\|\bar{e}_{k}\|_{2}\leq\|Q_{r}({\bf u})\|_{2}\|\bar{e}_{k-1}\|_{2}.

When 0<μi<20<\mu_{i}<2, Q(𝐮)21\|Q({\bf u})\|_{2}\leq 1 and Qr(𝐮)2<1\|Q_{r}({\bf u})\|_{2}<1. Then,

Qr(𝐮)2max0<σi(𝐮)<1[σi(𝐮)].\displaystyle\|Q_{r}({\bf u})\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}[\sigma_{i}({\bf u})].

Consequently, (3.27) and (3.28) hold. ∎

4 The related algorithms

The implementation of the relaxation Kaczmarz-Tanabe method are divided into two processes. The first process is to compute C(𝐮)C({\bf u}) for given 𝐮m{\bf u}\in\mathbb{R}^{m}, and the second process is to execute the Kaczmarz-Tanabe’s iteration repeatedly. In Algorithm 4, we list the flowchart to compute C(𝐮)C({\bf u}). After computing C(𝐮)C({\bf u}), the implementation of the relaxation Kaczmarz-Tanabe’s iteration is trivial, and we list the process in Algorithm 4.

  Algorithm 1 The calculation of matrix C(𝐮)C({\bf u})

 

1:Input
2:A=(a1,a2,,am)TA=(a_{1},a_{2},\ldots,a_{m})^{T}
3:𝐮=(μ1,μ2,,μm){\bf u}=(\mu_{1},\mu_{2},\ldots,\mu_{m}) \triangleright 𝐮{\bf u} relaxation parameter, μi=1\mu_{i}=1 is for decomposition without relaxation
4:C=ImC=I_{m} \triangleright ImI_{m} is an identity matrix with order mm
5:kmk\leftarrow m
6:while k>1k>1 do
7:     iki\leftarrow k
8:     while i>1i>1 do
9:         jmj\leftarrow m
10:         while j>k1j>k-1 do
11:              if akTak=0a_{k}^{T}a_{k}=0 then
12:                  C(i1,j)=C(i1,j)C(i-1,j)=C(i-1,j)
13:              else
14:                  C(i1,j)=C(i1,j)+μk(ai1Tak/(akTak)C(k,j)C(i-1,j)=C(i-1,j)+\mu_{k}(-a_{i-1}^{T}a_{k}/(a_{k}^{T}a_{k})C(k,j)
15:              end if
16:              jj1j\leftarrow j-1
17:         end while
18:         ii1i\leftarrow i-1
19:     end while
20:     kk1k\leftarrow k-1
21:end while
22:Output C(𝐮)C({\bf u})

 

  Algorithm 2 The relaxation Kaczmarz-Tanabe method solver

 

1:Input
2:A,M,Λ,b,𝐮A,M,\Lambda,b,{\bf u}
3:y0=x0y_{0}=x_{0} \triangleright Initial value
4:KmaxK_{\max} \triangleright Maximal iteration number
5:Compute C(𝐮)C({\bf u}) by Algorithm 4
6:k=1k=1
7:while k<Kmaxk<K_{\max} do
8:     yk+1=yk+AT[C(𝐮)]TΛM(bAyk)y_{k+1}=y_{k}+A^{T}[C({\bf u})]^{T}\Lambda M(b-Ay_{k})
9:     kk+1k\leftarrow k+1
10:end while
11:Output yky_{k}. \triangleright yky_{k} is the acquired solution

 

5 Conclusion

In this paper, we consider the relaxation Kaczmarz-Tanabe method by introducing the relaxation parameter idea into Kaczmarz-Tanabe method. The development of this work is based on the research to the Kaczmarz-Tanabe method. We first make some exploration on the selections of the relaxation parameter for the relaxation Kaczmarz-Tanabe method, then analyze the related theories of relaxation Kaczmarz-Tanabe’s iteration such as the the decompositions of A𝒮(𝐮)A_{\mathcal{S}}({\bf u}) and C(𝐮)C({\bf u}).

In numerical tests, we experiment several classic examples to verify these theoretical results of the selections of relaxation parameters for the relaxation Kaczmarz-Tanabe method. Although we prove that μi(0,2)\mu_{i}\in(0,2) is the sufficient condition for the convergence of the relaxation Kaczmarz-Tanabe method, the operation of adjusting parameters one by one is tedious and unnecessary. Consequently, we carry out many experiments with constant and random parameters to test the effect of parameter selection, moreover the experimental results of random parameters setting show the effectiveness of the specific case. However the selection of optimal parameters still needs further research, and in the current situation μi1\mu_{i}\equiv 1 may still be a compromise for the relaxation Kaczmarz-Tanabe method.

Reference

References

  • [1] R. S. Ledley and W. R. Ayers. Computerized medical imaging and graphics evolves from computerized tomography. Comput. Med. Imag. Grap., 12(1):v–xviii, 1988.
  • [2] G. T. Herman. Fundamentals of computerized tomography. Academic Press, 2010.
  • [3] F. Natterer. The mathematics of computerized tomography. Medical Physics, 32, 1986.
  • [4] B. Jin and Y. Xu. Adaptive reconstruction for electrical impedance tomography with a piecewise constant conductivity. Inverse Problems, 36:014003, 2020.
  • [5] G. S. Alberti, H. Ammari, B. Jin, J. K. Seo, and W. Zhang. The linearized inverse problem in multifrequency electrical impedance tomography. SIAM J. Imaging Sci., 9:1525–51, 2016.
  • [6] Y. T. Chow, K. Ito, and J. Zou. A direct sampling method for electrical impedance tomography. Inverse Problems, 30:095003, 2014.
  • [7] S. Kaczmarz. Angenäherte auflösung von systemen linearer gleichungen. Bull. Int. Acad. Pol. Sci. Lett. A, 35:355–357, 1937.
  • [8] R. Gordon, R. Bender, and G. T. Herman. Algebraic reconstrction techniques (ART) for three dimensional electron microscopy and X-ray photography. J. Theoret. Biol., 29(3):471–481, 1970.
  • [9] C. Kang and H. Zhou. The extension of convergence rates of Kaczmarz type methods. J. Comput. Appl. Math., 382:113099, 2021.
  • [10] P. P. B. Eggermont, G. T. Herman, and A. Lent. Iterative algorithms for large partitioned linear systems, with applications to image reconstruction. Linear Algebra Appl., 40(OCT):37–67, 1981.
  • [11] Y. Censor, P. P. B. Eggermont, and D. Gordon. Strong underrelaxation in Kaczmarz’s method for inconsistent systems. Numer. Math., 41(1):83–92, 1983.
  • [12] E. Rebrova and D. Needell. On block gaussian sketching for the kaczmarz method. Numer. Algorithms, 86(1):443–473, 2021.
  • [13] G. Cimmino. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La Ricerca Scientifica,Series II, 9:326–333, 1938.
  • [14] L. Landweber. An iteration formula for Fredholm integral equations of the first kind. Am. J. Math., 73(3):615–624, 1951.
  • [15] Y. Censor, G. Dan, and R. Gordon. Component averaging: An efficient iterative parallel algorithm for large and sparse unstructured problems. Parallel Comput., 27(6):777–808, 2001.
  • [16] Y. Censor, T. Elfving, G. T. Herman, and T. Nikazad. On diagonally relaxed orthogonal projection methods. SIAM J. Sci. Comput., 30(1):473–504, 2007.
  • [17] P. C. Hansen and J. S. Jorgensen. AIR Tools II: Algebraic iterative reconstruction method, improved implementation. Numer. Algorithms, 79(1):107–137, 2018.
  • [18] T. Strohmer and R. Vershynin. A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl, 15:262–278, 2009.
  • [19] Y. L. Jiao, B. T. Jin, and X. L. Lu. Preasymptotic convergence of randomized Kaczmarz method. Inverse Problems, 33, 2017.
  • [20] D. Needell. Randomized Kaczmarz solver for noisy linear systems. Behav. Inf. Technol., 50(2):395–403, 2010.
  • [21] D. Needell and J. A. Tropp. Paved with good intentions: Analysis of a randomized block Kaczmarz method. Linear Algebra Appl., 441(1):199–221, 2014.
  • [22] D. Needell, R. Zhao, and A. Zouzias. Randomized block Kaczmarz method with projection for solving least squares. Linear Algebra Appl., 484:322–343, 2015.
  • [23] Z. Z. Bai and W. T. Wu. On convergence rate of the randomized Kaczmarz method. Linear Algebra Appl., 553:252–269, 2018.
  • [24] Z. Z. Bai and W. T. Wu. On greedy randomized Kaczmarz method for solving large sparse linear systems. SIAM J. Sci. Comput., 40:A592–A606, 2018.
  • [25] C. Kang. Convergence rates of the Kaczmarz-Tanabe method for linear system. J. Comput. Appl. Math., 394:113577, 2021.
  • [26] C. Popa. Convergence rates for Kaczmarz-type algorithms. Numer. Algorithms, 79:1–17, 2018.
  • [27] C. Kang. The standard forms and convergence theory of the Kaczmarz-Tanabe type methods for solving linear systems. Journal of Computational and Applied Mathematics, page 115333, 2023.