\marginsize

2cm2cm2.54cm2cm

^†^†footnotetext: Email address: [email protected]

The standard form and convergence theory of the relaxation Kaczmarz-Tanabe method for solving linear systems

Chuan-gang Kang
School of Mathematical Sciences Tiangong University Tianjin 300387 People’s Republic of China

Abstract

The Kaczmarz method is a popular iterative method for solving consistent, overdetermined linear system such as medical imaging in computerized tomography. The Kaczmarz’s iteration repeatedly scans all equations in order, which leads to lower computational efficiency especially in solving a large scale problem. The standard form of Kaczmarz-Tanabe’s iteration proposed recently effectively overcomes the computational redundancy problem of the Kaczmarz method. In this paper, we introduce relaxation parameters ${\bf u}=(\mu_{1},\ldots,\mu_{m})$ into the Kaczmarz-Tanabe method based on the relaxation Kaczmarz method, and consider the standard form and convergence of this combination. Moreover, we analyze and prove the sufficient conditions for convergence of the relaxation Kaczmarz-Tanabe method, i.e., $\mu_{i}\in(0,2)$ . Numerical experiments show the convergence characteristics of the relaxation Kaczmarz-Tanabe method corresponding to these parameters.

Key words. Kaczmarz method; Relaxation Kaczmarz method; Relaxation Kaczmarz-Tanabe method; Computerized tomography

AMS subject classification. 65F10; 65F08; 65N22; 65J20

1 Introduction

In medical imaging tomography (see, i.e., [1, 2, 3]) such as computerized tomography (CT), magnetic resonance imaging (MRI) and electrical impedance tomography (EIT), people are often required to solve the following linear system of equations [4, 5, 6],

\displaystyle Ax=b,

(1.1)

where $A\in\mathbb{R}^{m\times n},b\in\mathbb{R}^{m}$ are also called projection matrix and measurement vector, respectively. We suppose that (1.1) is over-determined and consistent, and $x^{*}$ is a true solution. When (1.1) has many solutions, $x^{\dagger}$ is used to denote the minimum least-squares solution.

The Kaczmarz method [7] is one of the most popular methods to solve (1.1) in computerized tomography. Let $A=(a_{1},a_{2},\ldots,a_{m})^{T}$ , Kaczmarz’s iteration is written as

\displaystyle x_{k}=x_{k-1}+\frac{b_{i}-\langle a_{i},x_{k-1}\rangle}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots,

(1.2)

where $i=\bmod(k-1,m)+1$ , $\langle x,y\rangle=x^{T}y$ and $\|x\|_{2}=\sqrt{\langle x,x\rangle}$ denote the inner product of $x,y$ and the square norm of $x$ in $\mathbb{R}^{n}$ , respectively.

The Kaczmarz method was originally discovered by Kaczmarz in 1937, and it was rediscovered by Gordon, Bender and Herman [8] to deal with X-ray photography as an algebraic reconstruction technique (ART) in 1970. The convergence of the Kaczmarz method was established by Tanabe in 1971. Recently, Kang and Zhou gave the convergence rate results of the Kaczmarz method [9]. In order to improve computing speed, Eggermont [10], Censor [11], et al. considered the problem of block calculation and adopted approximate and executable iterative forms of the block Kaczmarz method to solve this problem. The strictly block Kaczmarz method was given by Rebrova and Needell [12]. In addition, the simultaneous iterative reconstructive technique (SIRT) as another important branch is also developed rapidly, and there appeared many known methods) such as Cimmino method[13], Landweber method [14], CAV method [15], DROP method [16], and SART method[17], etc.

With the development of hardware technology and computing technology, the idea of stochastic theory is continuously integrated into the non-randomized methods. In this context, people introduced and investigated the randomized Kaczmarz method [18, 19, 20, 21, 22] and greedy randomized Kaczmarz methods [23, 24].

From the perspective of SIRT methods, Kang [25] considered the matrix-vector form of the Kaczmarz method, i.e., the Kaczmarz-Tanabe’s iteration which is named by Popa [26], and the iteration reads

\displaystyle y_{k+1}=(I-A_{\mathcal{S}}^{T}MA)y_{k}+A_{\mathcal{S}}^{T}Mb,\qquad k=0,1,2,\ldots,

(1.3)

where $I$ denotes identity matrix of whatever size appropriate to the context, and

	$\displaystyle P_{i}=I-\frac{a_{i}a_{i}^{T}}{\\|a_{i}\\|_{2}^{2}},\quad Q_{m}=I,Q_{j}=P_{m}P_{m-1}\ldots P_{j+1},\quad Q=P_{m}P_{m-1}\cdots P_{1},$
	$\displaystyle M=\text{diag}(1/\\|a_{1}\\|_{2}^{2},1/\\|a_{2}\\|_{2}^{2},\ldots,1/\\|a_{m}\\|_{2}^{2}),\quad A_{\mathcal{S}}=(Q_{1}a_{1},Q_{2}a_{2},\ldots,Q_{m}a_{m})^{T}.$

Compared with the classic form of the Kaczmarz-Tanabe iteration, the iteration (1.3) has been greatly improved. However, each column of $A_{\mathcal{S}}$ is the product of matrix $Q_{i}$ and column vector $a_{i}$ from $A$ . Notice this defect of $A_{\mathcal{S}}$ , Kang [27] considered the decomposition of $A_{\mathcal{S}}$ and gave the standard form of the Kaczmarz-Tanabe’s iteration, i.e.,

\displaystyle y_{k+1}=y_{k}+A^{T}C^{T}M(b-Ay_{k}),\qquad k=0,1,2,\ldots,

(1.4)

where $C$ is defined as [27, Corollary 2.7]. The iterative scheme (1.4) has no essential difference from the iterative formula of the SIRT methods.

In this paper, we combine the relaxation Kaczmarz method and the Kaczmarz-Tanabe method, and introduce relaxation parameters into Kaczmarz-Tanabe method and propose the relaxation Kaczmarz-Tanabe method. Just as the convergence of the relaxation method is related to the relaxation parameters, the convergence of the relaxation Kaczmarz-Tanabe is also related to the relaxation parameters. Hence we further consider the convergence conditions (about the relaxation parameters) and convergence results of the relaxation Kaczmarz-Tanabe method.

The rest of this work is organized as follows. In Section 2, we consider the relaxation Kaczmarz method and derive the sufficient conditions for its convergence. In Section 3, we consider the standard form of the relaxation Kaczmarz-Tanabe method, and discuss their properties such as decomposition of $A_{\mathcal{S}}({\bf u})$ , convergence condition, convergence, and so on. Section 4 is about the algorithm flow to compute $C({\bf u})$ appearing in (3.19) and (3.24). In Section LABEL:section.numerical.tests, we verify the effectiveness of the Kaczmarz-Tanabe method with given relaxation parameters by several classical numerical examples.

2 The relaxation Kaczmarz method and its convergence

The relaxation Kaczmarz’s iteration is written as

\displaystyle x_{k}=x_{k-1}+\mu_{i}\frac{b_{i}-\langle a_{i},x_{k-1}\rangle}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots,

(2.1)

where $i=\bmod(k-1,m)+1$ and $\mu_{i}$ are relaxation parameters. Let

\displaystyle P_{i}(\mu_{i})=I-\mu_{i}\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}},\quad i=1,2,\ldots,m,

(2.2)

then it follows from (2.1) that

\displaystyle x_{k}=P_{i}(\mu_{i})x_{k-1}+\mu_{i}\frac{b_{i}}{\|a_{i}\|_{2}^{2}}a_{i},\quad k=1,2,\ldots.

(2.3)

Denote

\displaystyle P_{i}(\mu):=I-\mu a_{i}a_{i}^{T}/\|a_{i}\|_{2}^{2},

(2.4)

then the following lemma gives a sufficient condition for the nonexpansion of linear operator $P_{i}(\mu)$ .

Lemma 2.1.

Let $\{P_{i}(\mu),i=1,2,\cdots,m\}$ be defined as in (2.4), then, for any $0<\mu<2$ ,

\displaystyle\|P_{i}(\mu)x\|_{2}\leq\|x\|_{2},

hold and the equality hold iif $x\in N(a_{i}^{T})\triangleq\{x:a_{i}^{T}x=0\}$ . Moreover, for any $\mu\in\mathbb{R}$ there also holds $P_{i}(\mu)x=x$ when $x\in N(a_{i}^{T})$ .

Proof. For any $1\leq i\leq m$ , we have

\displaystyle\|P_{i}(\mu)x\|_{2}^{2}=\|(I-\mu\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}})x\|_{2}^{2}=\|x\|_{2}^{2}-\mu(2-\mu)\|a_{i}^{T}x\|_{2}^{2}/\|a_{i}\|_{2}^{2}.

(2.5)

If $x\notin N(a_{i}^{T})$ and $0<\mu<2$ , then $\mu(2-\mu)\|a_{i}^{T}x\|_{2}^{2}>0$ , consequently, $\|P_{i}(\mu)x\|_{2}<\|x\|_{2}$ . Conversely, if $x\in N(a_{i}^{T})$ , then $a_{i}^{T}x=0$ , and it follows from (2.5) that $\|P_{i}(\mu)x\|_{2}^{2}=\|x\|_{2}^{2}$ . Moreover, there also holds $P_{i}(\mu)x=x$ when $x\in N(a_{i}^{T})$ . ∎

Let $\{x_{k},k>0\}$ be generated by (2.1) and $e_{k}=x_{k}-P_{N(A)}x_{0}-x^{\dagger}$ , thus

\displaystyle e_{k}=(I-\mu_{i}\frac{a_{i}a_{i}^{T}}{\|a_{i}\|_{2}^{2}})e_{k-1}=P_{i}(\mu_{i})e_{k-1},

(2.6)

where $i=\bmod(k-1,m)+1$ , and we have the following set property.

Lemma 2.2.

Let $\{x_{k},k>0\}$ be generated by (2.1), then, for any $\mu_{i}\in\mathbb{R}$ ,

\displaystyle e_{k}\in N(A)^{\bot}

holds.

Proof. We prove the result by mathematical induction. First, we know that $e_{0}=y_{0}-x^{\dagger}-P_{N(A)}y_{0}\in N(A)^{\bot}$ . Second, we suppose that for any given $k\geq 0$ there holds $e_{k}\in N(A)^{\bot}$ , then $e_{k+1}\in N(A)^{\bot}$ . In fact, for any $z\in N(A)$ ,

\displaystyle\langle e_{k+1},z\rangle=\langle P_{i}(\mu_{i})e_{k},z\rangle=\langle e_{k},z\rangle-\mu_{i}\frac{b_{i}}{\|a_{i}\|_{2}^{2}}\langle a_{i},z\rangle=0.

This proves $e_{k+1}\in N(A)^{\bot}$ . To sum up, $e_{k}\in N(A)^{\bot}$ for any $k\geq 0$ .∎

Lemma 2.2 shows that, for any relaxation parameter $\mu_{i}$ , the iterative error $\{e_{k},k\geq 0\}$ of the relaxation Kaczmarz method belong to $N(A)^{\bot}$ . The following theorem gives the convergence result of the relaxation Kaczmarz method for solving a consistent linear system when $0<\mu_{i}<2$ (where $i=1,2,\ldots,m$ ).

Theorem 2.3.

For a consistent linear system $Ax=b$ , let $\{x_{k},k\geq 0\}$ be generated by (2.1) and relaxation parameter $0<\mu_{i}<2$ . Then the sequence $\{x_{k},k\geq 0\}$ converges to the solution $x^{\dagger}$ when $x_{0}\in R(A^{T})$ .

Proof. Let $\bar{P}_{i}(\mu_{i})\triangleq P_{i}(\mu_{i})|_{N(A)^{\bot}}$ , when $0<\mu_{i}<2$ , then from Lemma 2.1 there holds $\|\bar{P}_{i}(\mu_{i})\|_{2}<1$ , consequently $\zeta\triangleq\max\limits_{1\leq i\leq m}\|\bar{P}_{i}(\mu_{i})\|<1$ . It follows from (2.6) that

\displaystyle\|e_{k}\|_{2}=\|P_{i}(\mu_{i})e_{k-1}\|_{2}.

By Lemma 2.2, then

\displaystyle\|e_{k}\|_{2}=\|\bar{P}_{i}(\mu_{i})e_{k-1}\|_{2}\leq\|\bar{P}_{i}(\mu_{i})\|\|e_{k-1}\|_{2}<\zeta\|e_{k-1}\|_{2}<\ldots<\zeta^{k}\|e_{0}\|_{2},

which means $\|e_{k}\|_{2}\rightarrow 0$ as $k\rightarrow\infty$ , i.e. $x_{k}$ converges to $x^{\dagger}+P_{N(A)}x_{0}$ . Moreover $x_{k}$ also converges to $x^{\dagger}$ when $x_{0}\in R(A^{T})$ . ∎

3 The standard form of the relaxation Kaczmarz-Tanabe method

We start with the first equation of (1.1) and perform the iteration (2.3) in turn until the $m$ -th equation, and denote the $m$ -th iteration with $x_{m}$ , then get

	$\displaystyle x_{m}$	$\displaystyle=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x_{0}$
		$\displaystyle\quad+\mu_{1}P_{m}(\mu_{m})\cdots P_{2}(\mu_{2})\frac{b_{1}}{\\|a_{1}\\|_{2}^{2}}a_{1}+\mu_{2}P_{m}(\mu_{m})\cdots P_{3}(\mu_{3})\frac{b_{2}}{\\|a_{2}\\|_{2}^{2}}a_{2}+\cdots+\mu_{m}\frac{b_{m}}{\\|a_{m}\\|_{2}^{2}}a_{m}.$		(3.1)

For $1\leq j\leq m-1$ , let

	$\displaystyle{\bf u}:=(\mu_{1},\ldots,\mu_{m}),\quad{\bf u}_{j}:=(\mu_{j+1},\ldots,\mu_{m}),$		(3.2)
	$\displaystyle Q({\bf u}):=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1}),\quad Q_{j}({\bf u}_{j}):=P_{m}(\mu_{m})\cdots P_{j+1}(\mu_{j+1}).$		(3.3)

If $\mu_{i}\equiv 1$ for each $1\leq i\leq m$ , then $Q({\bf u})=Q,Q({\bf u}_{j})=Q_{j}$ . In addition, we agree that $Q_{m}({\bf u}_{m})=I$ , where $I$ denote identity matrix of whatever size appropriate to the context.

Let

\displaystyle A_{\mathcal{S}}({\bf u})\triangleq(Q_{1}({\bf u}_{1})a_{1},Q_{2}({\bf u}_{2})a_{2},\ldots,Q_{m}({\bf u}_{m})a_{m}\big{)}^{T},\quad\Lambda=\mathbf{diag}(\mu_{1},\ldots,\mu_{m}).

(3.4)

Thus we get from (3), (3.2), (3.3) and (3.4) that

\displaystyle x_{m}=Q({\bf u})x_{0}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb.

Consequently, it follows by repeating this iteration that

\displaystyle x_{k\cdot m}=Q({\bf u})x_{(k-1)\cdot m}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb,\quad k=1,2,\ldots.

(3.5)

Taking $y_{k}=x_{k\cdot m}$ , then we get the relaxation Kaczmarz-Tanabe’s iteration

\displaystyle y_{k}=Q({\bf u})y_{k-1}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda Mb,\quad k=1,2,\ldots,

(3.6)

where $y_{0}=x_{0}$ .

Lemma 3.1.

Let $Q(\bf{u})$ and $Q_{j}({\bf u}_{j})$ be defined as in (3.3), then there holds

\displaystyle Q({\bf u})=I-[A_{\mathcal{S}}({\bf u})]^{T}\Lambda MA.

(3.7)

Proof. First, we have from (3.3) that

\displaystyle Q({\bf u})

\displaystyle=Q_{1}({\bf u}_{1})-\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\|a_{1}\|_{2}^{2}}.

By $Q_{j-1}({\bf u}_{j-1})=Q_{j}({\bf u}_{j})-\mu_{j}Q_{j}({\bf u}_{j})\frac{a_{j}a_{j}^{T}}{\|a_{j}\|_{2}^{2}}$ , then

	$\displaystyle Q({\bf u})$	$\displaystyle=Q_{1}({\bf u}_{1})-\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}$
		$\displaystyle=Q_{2}({\bf u}_{2})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\\|a_{2}\\|_{2}^{2}}\big{)}$
		$\displaystyle=\cdots$
		$\displaystyle=Q_{m}({\bf u}_{m})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\\|a_{2}\\|_{2}^{2}}+\cdots+\mu_{m}Q_{m}({\bf u}_{m})\frac{a_{m}a_{m}^{T}}{\\|a_{m}\\|_{2}^{2}}\big{)}$
		$\displaystyle=I-\big{(}\mu_{1}Q_{1}({\bf u}_{1})a_{1},\mu_{2}Q_{2}({\bf u}_{2})a_{2},\cdots,\mu_{m}Q_{m}({\bf u}_{m})a_{m}\big{)}\mathbf{diag}(\frac{1}{\\|a_{1}\\|_{2}^{2}},\cdots,\frac{1}{\\|a_{m}\\|_{2}^{2}})\big{(}a_{1},a_{2},\cdots,a_{m}\big{)}^{T}.$

holds. This proves (3.7).∎

By Lemma 3.1, we can get the alternative form of (3.6), i.e.,

\displaystyle y_{k}=y_{k-1}+[A_{\mathcal{S}}({\bf u})]^{T}\Lambda M(b-Ay_{k-1}),\quad k=1,2,\ldots.

For the further discussion to (3.6), we extend the conceptions of sequential projection matrix, sequential projection matrix set and sequential compatible to $Q_{j}({\bf u}_{j})$ (see [27, Definition 2.1 & Definition 2.2]).

Definition 3.2.

We call $Q_{j}({\bf u}_{j})$ a sequential projection matrix with parameters on $a_{j+1},\ldots,a_{m}$ , and denote the sequential projection matrix set with $S_{sp}(a_{1},\ldots,a_{m};{\bf u})$ , i.e.,

\displaystyle S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})=\{Q_{1}({\bf u}_{1}),Q_{2}({\bf u}_{2}),\ldots,Q_{m-1}({\bf u}_{m-1})\}.

(3.8)

Definition 3.3.

For any $Q_{i}({\bf u}_{i})\in S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})$ , if there exist $\zeta_{i,1},\ldots,\zeta_{i,m}$ such that

\displaystyle Q_{i}({\bf u}_{i})a_{i}=\zeta_{i,1}({\bf u}_{i})a_{1}+\ldots\zeta_{i,m}({\bf u}_{i})a_{m},

(3.9)

then we call $A$ and $S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})$ sequentially compatible. In general, for any $1\leq i\leq m,i\leq j\leq m$ , if there exist $\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j})$ such that

\displaystyle Q_{j}({\bf u}_{j})a_{i}=\zeta_{1}^{(i,j)}({\bf u}_{j})a_{1}+\ldots\zeta_{m}^{(i,j)}({\bf u}_{j})a_{m},

(3.10)

then we call $A$ and $S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})$ forward sequential compatible, and call $(\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j}))$ compatible vector of $Q_{j}({\bf u}_{j})a_{i}$ on $A$ .

Theorem 3.4.

Suppose $A$ has no zero row, and $S_{sp}(a_{1},\ldots,a_{m};{\bf u})$ is defined by (3.8), then $A$ and $S_{sp}(a_{1},\ldots,a_{m};{\bf u})$ are forward sequential compatible.

Proof. Similar to [27, Theorem 2.6], we will complete the proof by mathematical induction. We take the subscript $(i,j)$ of $Q_{j}({\bf u}_{j})a_{i}$ as an ordered array. Obviously, $a_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}=a_{m}^{T}$ , that is, (3.10) holds for $(i,j)=(m,m)$ and $(\zeta_{1}^{(m,m)}({\bf u}_{m}),\ldots,\zeta_{m}^{(m,m)}({\bf u}_{m}))=(0,\ldots,0,1)$ . In fact, for any $1\leq i\leq m$ , (3.10) obviously holds for $a_{i}^{T}[Q_{m}({\bf u}_{m})]^{T}$ because $Q_{m}=I$ is the agreement. Consequently, As the first step of mathematical induction, we prove (3.10) holds for $(i,j)=(m-1,m-1)$ . Actually,

\displaystyle a_{m-1}^{T}[Q_{m-1}({\bf u}_{m-1})]^{T}=a_{m-1}^{T}[P_{m}(\mu_{m})]^{T}[Q_{m}({\bf u}_{m})]^{T}=a_{m-1}^{T}-\mu_{m}\frac{a_{m-1}^{T}a_{m}}{\|a_{m}\|_{2}^{2}}a_{m}^{T}.

So (3.9) holds for $i=m-1$ , where

(\zeta_{1}^{(m-1,m-1)}({\bf u}_{m-1}),\ldots,\zeta_{m}^{(m-1,m-1)}({\bf u}_{m-1}))=(0,\ldots,0,1,-\mu_{m}a_{m-1}^{T}a_{m}/\|a_{m}\|_{2}^{2}).

Second, we suppose (3.10) holds for any given $(i,j)$ subjected to $s\leq i<m$ and $s\leq t<j<m$ , i.e., there exists $(\zeta_{1}^{(i,j)}({\bf u}_{j}),\ldots,\zeta_{m}^{(i,j)}({\bf u}_{j}))$ such that

\displaystyle a_{i}^{T}[Q_{j}({\bf u}_{j})]^{T}=\zeta_{1}^{(i,j)}({\bf u}_{j})a_{1}^{T}+\ldots\zeta_{m}^{(i,j)}({\bf u}_{j})a_{m}^{T}.

Then, we prove (3.10) holds for $(i,j)=(s,t)$ . Because of $Q_{t}({\bf u}_{t})=Q_{t+1}({\bf u}_{t+1})P_{t+1}(\mu_{t+1})$ , then

\displaystyle a_{s}^{T}[Q_{t}({\bf u}_{t})]^{T}=a_{s}^{T}[P_{t+1}(\mu_{t+1})]^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=a_{s}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}-\mu_{t+1}\frac{a_{s}^{T}a_{t+1}}{\|a_{t+1}\|_{2}^{2}}a_{t+1}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}.

(3.11)

From the hypothesis, there exist $(\zeta_{1}^{(s,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(s,t+1)}({\bf u}_{t+1}))$ and $(\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1}))$ such that

	$\displaystyle a_{s}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})a_{m}^{T},$
	$\displaystyle a_{t+1}^{T}[Q_{t+1}({\bf u}_{t+1})]^{T}=\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})a_{m}^{T}.$

Then, it follows from (3.11) that

	$\displaystyle a_{s}^{T}[Q_{t}({\bf u}_{t})]^{T}$
	$\displaystyle=\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})a_{m}^{T}-\frac{a_{s}^{T}a_{t+1}}{\\|a_{t+1}\\|_{2}^{2}}\big{(}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})a_{1}^{T}+\ldots+\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})a_{m}^{T}\big{)}$
	$\displaystyle=\big{(}\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\\|a_{t+1}\\|_{2}^{2}}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}a_{1}^{T}+\ldots+\big{(}\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\\|a_{t+1}\\|_{2}^{2}}\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}a_{m}^{T}.$

Denote

	$\displaystyle\big{(}\zeta_{1}^{(s,t)}({\bf u}_{t}),\ldots,\zeta_{m}^{(s,t)}({\bf u}_{t})\big{)}$
	$\displaystyle=\big{(}\zeta_{1}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\\|a_{t+1}\\|_{2}^{2}}\zeta_{1}^{(t+1,t+1)}({\bf u}_{t+1}),\ldots,\zeta_{m}^{(s,t+1)}({\bf u}_{t+1})-\frac{a_{s}^{T}a_{t+1}}{\\|a_{t+1}\\|_{2}^{2}}\zeta_{m}^{(t+1,t+1)}({\bf u}_{t+1})\big{)}.$

This proves (3.10) holds for $(i,j)=(s,t)$ .

To sum up the above, the conclusion is proved for all $(i,j)$ with respect to $1\leq i\leq m,i\leq j\leq m$ . Namely, $A$ and $S_{sp}(a_{1},a_{2},\ldots,a_{m};{\bf u})$ generated by relaxation Kaczmarz’s iteration are forward sequential compatible. ∎

In order to obtain the standard form (i.e., matrix-vector form) of the relaxation Kaczmarz-Tanabe’s iteration, we first import the following decomposition theorem about $A_{\mathcal{S}}({\bf u})$ .

Theorem 3.5.

Suppose $A$ has no zero rows, and let $A_{\mathcal{S}}({\bf u})$ be defined by (3.4), then there exists a unit upper triangular matrix $C({\bf u})\in\mathbb{R}^{m\times m}$ such that $A_{\mathcal{S}}({\bf u})=C({\bf u})A$ .

Proof. According to $A_{\mathcal{S}}({\bf u})=(Q_{1}({\bf u}_{1})a_{1},\ldots,Q_{m}({\bf u}_{m})a_{m})^{T}$ and Theorem 3.4, the corollary can be proved by taking $[C({\bf u})](i,j)=\zeta_{i,j}({\bf u}_{i})$ , where $\zeta_{i,j}({\bf u}_{i})$ is determined by the forward sequential compatible of $A$ and $S_{sp}(a_{1},\ldots,a_{m};{\bf u})$ . ∎

Theorem 3.5 is a general result, but it shows that there exists a structure-reserved decomposition in relaxation Kaczmarz-Tanabe’s iteration when a relaxation parameter is introduced into Kaczmarz’s iteration. However, the decomposition is not unique when $A$ is not row full rank.

By Theorem 3.5, then we get the standard form of the relaxation Kaczmarz-Tanabe’s iteration from (3.6) that

\displaystyle y_{k}=y_{k-1}+A^{T}[C({\bf u})]^{T}\Lambda M(b-Ay_{k-1}),\quad k=1,2,\ldots.

(3.12)

In order to get the special form of $C$ in Kaczmarz-Tanabe’s iteration, Kang introduced the concept of index set $I_{d}(n_{1},n_{2},v)$ in [27, Definition 2.9], i.e.,

Definition 3.6.

The index set $I_{d}(n_{1},n_{2},v)$ is defined as follows

\displaystyle I_{d}(n_{1},n_{2},v)=\Big{\{}[I_{d}(1),\ldots,I_{d}(v)]\Big{\}},

where $n_{1},n_{2},v$ are positive integers satisfied $|n_{1}-n_{2}|\geq v\geq 2$ . $I_{d}(i)$ is integer between $n_{1}$ and $n_{2}$ , and $I_{d}(1)=n_{1},I_{d}(v)=n_{2}$ . For any $i<j$ there satisfy

\displaystyle\begin{array}[]{ll}I_{d}(i)<I_{d}(j),&n_{1}<n_{2},\\ I_{d}(i)>I_{d}(j),&n_{1}>n_{2}.\end{array}

By Definition 3.6, we can give the following expression of $a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x}$ for any $1\leq i\leq m$ and $\tilde{x}\in N(A)^{\bot}$ .

Lemma 3.7.

Suppose that $A$ has no zero row, $Q_{i}$ is the sequential projection matrix of $A$ and $\tilde{x}\in N(A)^{\bot}$ . For any $1\leq i\leq m,i+1\leq j\leq m-1$ , denote

\displaystyle d_{i,j}({\bf u}_{i})=\sum_{v=2}^{j-i+1}(-1)^{v-1}\sum_{I_{d}(i,j,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}.

(3.13)

Then,

\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x}=(1,d_{i,i+1}({\bf u}_{i}),\ldots,d_{i,m}({\bf u}_{i}))(a_{i}^{T},a_{i+1}^{T},\ldots,a_{m}^{T})^{T}\tilde{x}

(3.14)

holds. That is, the compatible vector of $a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}$ on $A$ is $(0,\ldots,0,1,d_{i,i+1}({\bf u}_{i}),\ldots,d_{i,m}({\bf u}_{i}))$ .

Proof. When $1\leq i\leq m$ and $\tilde{x}\in N(A)^{\bot}$ , we have

	$\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x}$	$\displaystyle=(1,-\mu_{i+1}h_{i,i+1})(a_{i}^{T}[Q_{i+1}({\bf u}_{i+1})]^{T}\tilde{x},a_{i+1}^{T}[Q_{i+1}({\bf u}_{i+1})]^{T}\tilde{x})^{T}$
		$\displaystyle=(1,-\mu_{i+1}h_{i,i+1},-\mu_{i+2}h_{i,i+2}+\mu_{i+1}\mu_{i+2}h_{i,i+1}h_{i+1,i+2})$
		$\displaystyle\quad\cdot(a_{i}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x},a_{i+1}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x},a_{i+2}^{T}[Q_{i+2}({\bf u}_{i+2})]^{T}\tilde{x})^{T}$
		$\displaystyle=\big{(}1,-\mu_{i+1}h_{i,i+1},\ldots,\sum_{v=2}^{m-i+1}(-1)^{m-1}\sum_{I_{d}(i,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}$
		$\displaystyle\quad\cdot(a_{i}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x},a_{i+1}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x},\ldots,a_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x})^{T}.$

By the agreement $Q_{m}({\bf u}_{m})=I$ ,

	$\displaystyle a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}\tilde{x}$	$\displaystyle=\big{(}1,-\mu_{i+1}h_{i,i+1},\ldots,\sum_{v=2}^{m-i+1}(-1)^{v-1}\sum_{I_{d}(i,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}$
		$\displaystyle\quad\cdot(a_{i}^{T}\tilde{x},a_{i+1}^{T}\tilde{x},\ldots,a_{m}^{T}\tilde{x})^{T}.$		(3.15)

Thus (3.14) holds by taking $d_{i,j}({\bf u}_{i})$ according to (3.13). Because (3.14) holds for any $\tilde{x}\in\mathbb{R}^{n}$ , we then get the compatible vector of $a_{i}^{T}[Q_{i}({\bf u}_{i})]^{T}$ on $A$ . ∎

Theorem 3.8.

Under the Lemma 3.7, let $\Omega({\bf u})=(\omega_{i,j}({\bf u}_{i}))_{m\times m}$ satisfy

\displaystyle\omega_{i,j}({\bf u}_{i})=\left\{\begin{array}[]{ll}d_{i,j}({\bf u}_{i}),&j>i,\\ 1,&j=i,\\ 0,&j<i.\end{array}\right.

(3.19)

Then,

\displaystyle A_{\mathcal{S}}({\bf u})=\Omega({\bf u})A.

Proof. For any $\tilde{x}\in N(A)^{\bot}$ ,

\displaystyle A_{\mathcal{S}}({\bf u})\tilde{x}=(a_{1}^{T}[Q_{1}({\bf u}_{1})]^{T}\tilde{x},a_{2}^{T}[Q_{2}({\bf u}_{2})]^{T}\tilde{x},\ldots,a_{m}^{T}[Q_{m}({\bf u}_{m})]^{T}\tilde{x})^{T}.

(3.20)

By Lemma 3.7 and (3.19), we then get

\displaystyle A_{\mathcal{S}}({\bf u})\tilde{x}=\Omega({\bf u})A\tilde{x}.

(3.21)

Moreover, when $\tilde{x}\in N(A)$ , (3.21) obviously holds. Therefore, for any $\tilde{x}\in\mathbb{R}^{n}$ , there holds $A_{\mathcal{S}}({\bf u})\tilde{x}=\Omega({\bf u})A\tilde{x}$ , which means $A_{\mathcal{S}}({\bf u})=\Omega({\bf u})A$ . ∎

In the context, we especially call the iteration (3.12) relaxation Kaczmarz-Tanabe’s iteration when $C({\bf u})=\Omega({\bf u})$ is taken according to (3.19).

Let $E(j,i(-\mu_{i}h_{j,i}))$ be a matrix obtained by multiplying the $i$ -th row of the identity matrix by $-\mu_{i}h_{j,i}$ and adding it to the $j$ -th row, i.e., the diagonal elements of $E(j,i(-\mu_{i}h_{j,i}))$ are all $1$ , the $(j,i)$ - element is $-\mu_{i}h_{j,i}$ , and all other elements are $0$ . Consequently, we have the following decomposition theorem of $\Omega({\bf u})$ .

Theorem 3.9.

For any $\mu_{i}\in\mathbb{R}$ , let $\Omega({\bf u})$ be defined by (3.19), and define

\displaystyle H_{1}(\mu_{1})=I,H_{i}(\mu_{i})=\prod\limits_{j=1}^{i-1}E(j,i(-\mu_{i}h_{j,i})),\quad 1<i\leq m.

Then,

\displaystyle\Omega({\bf u})=H_{1}(\mu_{1})H_{2}(\mu_{2})\cdots H_{m}(\mu_{m}).

(3.22)

Proof. For any $\tilde{x}\in N(A)^{\bot}$ , from (3), we have

	$\displaystyle a_{j}^{T}[Q_{j}({\bf u}_{j})]^{T}\tilde{x}$	$\displaystyle=\big{(}0,\ldots,0,1,-\mu_{j+1}h_{j,j+1},\ldots,\sum_{v=2}^{m-j+1}(-1)^{v-1}\sum_{I_{d}(j,m,v)}\prod_{s=1}^{v-1}\mu_{I_{d}(s+1)}\prod_{s=1}^{v-1}h_{I_{d}(s),I_{d}(s+1)}\big{)}$
		$\displaystyle\quad\cdot(a_{1}^{T}\tilde{x},a_{2}^{T}\tilde{x},\ldots,a_{m}^{T}\tilde{x})^{T}.$		(3.23)

From (3), the coefficient of $a_{i}^{T}\tilde{x}$ is actually the $(j,i)$ -element of $\Omega({\bf u})$ when $i>j$ , i.e.,

\displaystyle\omega_{j,i}({\bf u}_{j})=-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots+(-1)^{i-j}\mu_{i}\mu_{i-1}\ldots\mu_{j+1}h_{j,j+1}h_{j+1,j+2}\cdot\ldots\cdot h_{i-1,i}.

Denote $\widehat{H}({\bf u})=H_{1}(\mu_{1})\cdots H_{m}(\mu_{m})$ , in order to show $\Omega({\bf u})=H_{1}(\mu_{1})\cdots H_{m}(\mu_{m})$ , we only need to prove $\omega_{j,i}({\bf u}_{j})=[\widehat{H}({\bf u})]_{j,i}$ , i.e.,

\displaystyle\omega_{j,i}({\bf u}_{j})=e_{j}^{T}\widehat{H}({\bf u})e_{i},

where $e_{j}$ and $e_{i}$ are the $j$ th and $i$ th columns of the identity matrix in $\mathbb{R}^{m}$ , respectively. Owing to $e_{j}^{T}H_{k}(\mu_{k})=e_{j}^{T}$ when $j\geq k$ and $H_{l}(\mu_{l})e_{i}=e_{i}$ when $i\neq l$ , when $i>j$ it then follows from the notation of $\widehat{H}({\bf u})$ that

	$\displaystyle e_{j}^{T}\widehat{H}({\bf u})e_{i}$	$\displaystyle=e_{j}^{T}H_{j+1}(\mu_{j+1})\cdots H_{i}(\mu_{i})e_{i}$
		$\displaystyle=(e_{j}^{T}H_{j+1}(\mu_{j+1}))H_{j+2}(\mu_{j+2})\cdots H_{i}(\mu_{i})e_{i}$
		$\displaystyle=((0,\ldots,1,-\mu_{j+1}h_{j,j+1},0,\ldots,0)H_{j+2}(\mu_{j+2}))H_{j+3}(\mu_{j+3})\cdots H_{i}(\mu_{i})e_{i}$
		$\displaystyle=((0,\ldots,1,-\mu_{j+1}h_{j,j+1},-\mu_{j+1}h_{j,j+1}+\mu_{j+2}\mu_{j+1}h_{j,j+1}h_{j+1,j+2},\ldots,0)H_{j+2}(\mu_{j+2}))$
		$\displaystyle\quad\cdot H_{j+3}(\mu_{j+3})\cdots H_{i}(\mu_{i})e_{i}$
		$\displaystyle=(0,\ldots,1,-\mu_{j+1}h_{j,j+1},\ldots,-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots$
		$\displaystyle\quad+(-1)^{i-j}\mu_{i}\mu_{i-1}\ldots\mu_{j+1}h_{j,j+1}\cdots h_{i-1,i},0,\ldots,0)e_{i}$
		$\displaystyle=-\mu_{i}h_{j,i}+\mu_{i}\mu_{i-1}h_{j,i-1}h_{i-1,i}+\ldots+(-1)^{i-j}\mu_{i}\mu_{i-1}\cdots\mu_{j+1}h_{j,j+1}h_{j+1,j+2}\cdots h_{i-1,i}.$

This proves

\omega_{j,i}({\bf u}_{i})=[\widehat{H}({\bf u})]_{j,i}

for any $1\leq j\leq n-1$ and $i>j$ . Additionally, there also holds

\omega_{j,j}({\bf u}_{j})=[\widehat{H}({\bf u})]_{j,j}=1

for any $1\leq j\leq n$ . Consequently, the conclusion is proved. ∎

Theorem 3.9 actually provides a computational method of $C({\bf u})$ for the relaxation Kaczmarz-Tanabe’s algorithm. Moreover, there also holds the following decomposition about $\Lambda C({\bf u})$ .

Lemma 3.10.

Let $C({\bf u})$ be defined like $\Omega({\bf u})$ in Theorem 3.8, then there exists decomposition of

\displaystyle\Lambda C({\bf u})=T_{u}({\bf u})M,

(3.24)

where $T_{u}({\bf u})\in\mathbb{R}^{m\times m}$ is an upper triangular matrix with diagonal elements $[T_{u}({\bf u})](i,i)=\mu_{i}\|a_{i}\|_{2}^{2}$ .

Proof. The proof of the lemma is trivial and omitted here. ∎

3.1 The convergence of the relaxation Kaczmarz-Tanabe method

Next, we consider the convergence of the relaxation Kaczmarz-Tanabe method and give a sufficient condition of convergence for the relaxation Kaczmarz-Tanabe’s iteration. Notice that each relaxation Kaczmarz-Tanabe’s iteration is a composite of $m$ relaxation Kaczmarz’s iterations, consequently, it is natural to transfer the convergence condition of the relaxation Kaczmarz method to the relaxation Kaczmarz-Tanabe method.

Lemma 3.11.

For a linear system $Ax=b$ , let $Q({\bf u})$ be defined by (3.3), and we suppose $0<\mu_{i}<2$ for each $1\leq i\leq m$ . Then, for any $x\in\mathbb{R}^{n}$ , there holds $\|Q({\bf u})x\|_{2}\leq\|x\|_{2}$ , and the equality holds iif $x\in N(A)$ .

Proof. When $x\in N(A)$ , from $P_{j}(\mu_{j})x=x,j=1,\ldots,m$ , therefore there holds

\displaystyle Q({\bf u})x=P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x=x.

(3.25)

If $x\notin N(A)$ , from Lemma 2.1, there exists at least an index $s_{0}\in\Theta\triangleq\{1,2,\ldots,m\}$ such that $\|P_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}$ . We suppose that $s_{0}$ is the smallest index in $\Theta$ that $\|P_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}$ , then there holds

\|Q({\bf u})x\|_{2}=\|P_{m}(\mu_{m})\cdots P_{1}(\mu_{1})x\|_{2}\leq\|P_{s_{0}}(\mu_{s_{0}})\cdots P_{1}(\mu_{1})x\|_{2}\leq\|p_{s_{0}}(\mu_{s_{0}})x\|_{2}<\|x\|_{2}.

These prove the conclusions. ∎

Lemma 3.11 actually gives the condition to ensure the nonexpansion of $Q({\bf u})$ on $\mathbb{R}^{n}$ . Let $\{y_{k}\}$ be generated by (3.12) and $\bar{e}_{k}=y_{k}-x^{\dagger}-P_{N(A)}y_{0}$ , then

\displaystyle\bar{e}_{k}=\big{(}I-A^{T}[C({\bf u})]^{T}\Lambda MA\big{)}\bar{e}_{k-1},

it follows from Lemma 3.1 that

\displaystyle\bar{e}_{k}=Q({\bf u})\bar{e}_{k-1}.

(3.26)

By Lemma 3.11, thus we have the following convergence of the relaxation Kaczmarz-Tanabe method.

Theorem 3.12.

For a consistent linear system (1.1), supposing that $\{y_{k},k\geq 0\}$ is generated by the relaxation Kaczmarz-Tanabe’s iteration (3.12) with $0<\mu_{i}<2$ (where $1\leq i\leq m$ ). Then $\{y_{k},k>0\}$ converges to $x^{\dagger}+P_{N(A)}y_{0}$ .

Proof. Let $Q_{r}({\bf u})=Q({\bf u})|_{N(A)^{\bot}}$ , by Lemma 3.11, we have $\|Q_{r}({\bf u})\|_{2}<1$ . Because the sequence generated by the relaxation Kaczmarz-Tanabe’s iteration is a subsequence of the sequence generated by the relaxation Kaczmarz’s iteration, there holds from Lemma 2.2 that $\bar{e}_{k}\in N(A)^{\bot}$ . Consequently, by (3.26), we have

\displaystyle\|\bar{e}_{k}\|_{2}=\|Q({\bf u})\bar{e}_{k-1}\|_{2}=\|Q_{r}({\bf u})\bar{e}_{k-1}\|_{2}\leq\|Q_{r}({\bf u})\|_{2}\|\bar{e}_{k-1}\|_{2}\leq\cdots\leq\|Q_{r}({\bf u})\|_{2}^{k}\|\bar{e}_{0}\|_{2},

which shows $\|\bar{e}_{k}\|\rightarrow 0$ as $k\rightarrow\infty$ , i.e., $y_{k}\rightarrow x^{\dagger}+P_{N(A)}y_{0}$ .∎

Moreover, we have the following convergence rate of the relaxation Kaczmarz-Tanabe method for solving error-free linear system.

Corollary 3.13.

Under the conditions of Theorem 3.12, there also hold

\displaystyle\|\bar{e}_{k}\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}\sigma_{i}({\bf u})\|\bar{e}_{k-1}\|_{2}

(3.27)

and

\displaystyle\|\bar{e}_{k}\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}[\sigma_{i}({\bf u})]^{k}\|\bar{e}_{0}\|_{2},

(3.28)

where $\sigma_{i}({\bf{u}})$ is the singular value of $Q({\bf u})$ .

Proof. From the proof of Theorem 3.12, we have

\displaystyle\|\bar{e}_{k}\|_{2}\leq\|Q_{r}({\bf u})\|_{2}\|\bar{e}_{k-1}\|_{2}.

When $0<\mu_{i}<2$ , $\|Q({\bf u})\|_{2}\leq 1$ and $\|Q_{r}({\bf u})\|_{2}<1$ . Then,

\displaystyle\|Q_{r}({\bf u})\|_{2}\leq\max_{0<\sigma_{i}({\bf u})<1}[\sigma_{i}({\bf u})].

Consequently, (3.27) and (3.28) hold. ∎

4 The related algorithms

The implementation of the relaxation Kaczmarz-Tanabe method are divided into two processes. The first process is to compute $C({\bf u})$ for given ${\bf u}\in\mathbb{R}^{m}$ , and the second process is to execute the Kaczmarz-Tanabe’s iteration repeatedly. In Algorithm 4, we list the flowchart to compute $C({\bf u})$ . After computing $C({\bf u})$ , the implementation of the relaxation Kaczmarz-Tanabe’s iteration is trivial, and we list the process in Algorithm 4.

Algorithm 1 The calculation of matrix $C({\bf u})$

1:Input

A=(a_{1},a_{2},\ldots,a_{m})^{T}

{\bf u}=(\mu_{1},\mu_{2},\ldots,\mu_{m})

\triangleright

{\bf u}

relaxation parameter,

\mu_{i}=1

is for decomposition without relaxation

C=I_{m}

\triangleright

I_{m}

is an identity matrix with order

m

k\leftarrow m

6:while

k>1

i\leftarrow k

8: while

i>1

j\leftarrow m

10: while

j>k-1

11: if

a_{k}^{T}a_{k}=0

then

12:

C(i-1,j)=C(i-1,j)

13: else

14:

C(i-1,j)=C(i-1,j)+\mu_{k}(-a_{i-1}^{T}a_{k}/(a_{k}^{T}a_{k})C(k,j)

15: end if

16:

j\leftarrow j-1

17: end while

18:

i\leftarrow i-1

19: end while

20:

k\leftarrow k-1

21:end while

22:Output

C({\bf u})

Algorithm 2 The relaxation Kaczmarz-Tanabe method solver

1:Input

A,M,\Lambda,b,{\bf u}

y_{0}=x_{0}

\triangleright

Initial value

K_{\max}

\triangleright

Maximal iteration number

5:Compute

C({\bf u})

by Algorithm 4

k=1

7:while

k<K_{\max}

y_{k+1}=y_{k}+A^{T}[C({\bf u})]^{T}\Lambda M(b-Ay_{k})

k\leftarrow k+1

10:end while

11:Output

y_{k}

\triangleright

y_{k}

is the acquired solution

5 Conclusion

In this paper, we consider the relaxation Kaczmarz-Tanabe method by introducing the relaxation parameter idea into Kaczmarz-Tanabe method. The development of this work is based on the research to the Kaczmarz-Tanabe method. We first make some exploration on the selections of the relaxation parameter for the relaxation Kaczmarz-Tanabe method, then analyze the related theories of relaxation Kaczmarz-Tanabe’s iteration such as the the decompositions of $A_{\mathcal{S}}({\bf u})$ and $C({\bf u})$ .

In numerical tests, we experiment several classic examples to verify these theoretical results of the selections of relaxation parameters for the relaxation Kaczmarz-Tanabe method. Although we prove that $\mu_{i}\in(0,2)$ is the sufficient condition for the convergence of the relaxation Kaczmarz-Tanabe method, the operation of adjusting parameters one by one is tedious and unnecessary. Consequently, we carry out many experiments with constant and random parameters to test the effect of parameter selection, moreover the experimental results of random parameters setting show the effectiveness of the specific case. However the selection of optimal parameters still needs further research, and in the current situation $\mu_{i}\equiv 1$ may still be a compromise for the relaxation Kaczmarz-Tanabe method.

Reference

References

[1] R. S. Ledley and W. R. Ayers. Computerized medical imaging and graphics evolves from computerized tomography. Comput. Med. Imag. Grap., 12(1):v–xviii, 1988.
[2] G. T. Herman. Fundamentals of computerized tomography. Academic Press, 2010.
[3] F. Natterer. The mathematics of computerized tomography. Medical Physics, 32, 1986.
[4] B. Jin and Y. Xu. Adaptive reconstruction for electrical impedance tomography with a piecewise constant conductivity. Inverse Problems, 36:014003, 2020.
[5] G. S. Alberti, H. Ammari, B. Jin, J. K. Seo, and W. Zhang. The linearized inverse problem in multifrequency electrical impedance tomography. SIAM J. Imaging Sci., 9:1525–51, 2016.
[6] Y. T. Chow, K. Ito, and J. Zou. A direct sampling method for electrical impedance tomography. Inverse Problems, 30:095003, 2014.
[7] S. Kaczmarz. Angenäherte auflösung von systemen linearer gleichungen. Bull. Int. Acad. Pol. Sci. Lett. A, 35:355–357, 1937.
[8] R. Gordon, R. Bender, and G. T. Herman. Algebraic reconstrction techniques (ART) for three dimensional electron microscopy and X-ray photography. J. Theoret. Biol., 29(3):471–481, 1970.
[9] C. Kang and H. Zhou. The extension of convergence rates of Kaczmarz type methods. J. Comput. Appl. Math., 382:113099, 2021.
[10] P. P. B. Eggermont, G. T. Herman, and A. Lent. Iterative algorithms for large partitioned linear systems, with applications to image reconstruction. Linear Algebra Appl., 40(OCT):37–67, 1981.
[11] Y. Censor, P. P. B. Eggermont, and D. Gordon. Strong underrelaxation in Kaczmarz’s method for inconsistent systems. Numer. Math., 41(1):83–92, 1983.
[12] E. Rebrova and D. Needell. On block gaussian sketching for the kaczmarz method. Numer. Algorithms, 86(1):443–473, 2021.
[13] G. Cimmino. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La Ricerca Scientifica,Series II, 9:326–333, 1938.
[14] L. Landweber. An iteration formula for Fredholm integral equations of the first kind. Am. J. Math., 73(3):615–624, 1951.
[15] Y. Censor, G. Dan, and R. Gordon. Component averaging: An efficient iterative parallel algorithm for large and sparse unstructured problems. Parallel Comput., 27(6):777–808, 2001.
[16] Y. Censor, T. Elfving, G. T. Herman, and T. Nikazad. On diagonally relaxed orthogonal projection methods. SIAM J. Sci. Comput., 30(1):473–504, 2007.
[17] P. C. Hansen and J. S. Jorgensen. AIR Tools II: Algebraic iterative reconstruction method, improved implementation. Numer. Algorithms, 79(1):107–137, 2018.
[18] T. Strohmer and R. Vershynin. A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl, 15:262–278, 2009.
[19] Y. L. Jiao, B. T. Jin, and X. L. Lu. Preasymptotic convergence of randomized Kaczmarz method. Inverse Problems, 33, 2017.
[20] D. Needell. Randomized Kaczmarz solver for noisy linear systems. Behav. Inf. Technol., 50(2):395–403, 2010.
[21] D. Needell and J. A. Tropp. Paved with good intentions: Analysis of a randomized block Kaczmarz method. Linear Algebra Appl., 441(1):199–221, 2014.
[22] D. Needell, R. Zhao, and A. Zouzias. Randomized block Kaczmarz method with projection for solving least squares. Linear Algebra Appl., 484:322–343, 2015.
[23] Z. Z. Bai and W. T. Wu. On convergence rate of the randomized Kaczmarz method. Linear Algebra Appl., 553:252–269, 2018.
[24] Z. Z. Bai and W. T. Wu. On greedy randomized Kaczmarz method for solving large sparse linear systems. SIAM J. Sci. Comput., 40:A592–A606, 2018.
[25] C. Kang. Convergence rates of the Kaczmarz-Tanabe method for linear system. J. Comput. Appl. Math., 394:113577, 2021.
[26] C. Popa. Convergence rates for Kaczmarz-type algorithms. Numer. Algorithms, 79:1–17, 2018.
[27] C. Kang. The standard forms and convergence theory of the Kaczmarz-Tanabe type methods for solving linear systems. Journal of Computational and Applied Mathematics, page 115333, 2023.

	$\displaystyle Q({\bf u})$	$\displaystyle=Q_{1}({\bf u}_{1})-\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}$
		$\displaystyle=Q_{2}({\bf u}_{2})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\\|a_{2}\\|_{2}^{2}}\big{)}$
		$\displaystyle=\cdots$
		$\displaystyle=Q_{m}({\bf u}_{m})-\big{(}\mu_{1}Q_{1}({\bf u}_{1})\frac{a_{1}a_{1}^{T}}{\\|a_{1}\\|_{2}^{2}}+\mu_{2}Q_{2}({\bf u}_{2})\frac{a_{2}a_{2}^{T}}{\\|a_{2}\\|_{2}^{2}}+\cdots+\mu_{m}Q_{m}({\bf u}_{m})\frac{a_{m}a_{m}^{T}}{\\|a_{m}\\|_{2}^{2}}\big{)}$
		$\displaystyle=I-\big{(}\mu_{1}Q_{1}({\bf u}_{1})a_{1},\mu_{2}Q_{2}({\bf u}_{2})a_{2},\cdots,\mu_{m}Q_{m}({\bf u}_{m})a_{m}\big{)}\mathbf{diag}(\frac{1}{\\|a_{1}\\|_{2}^{2}},\cdots,\frac{1}{\\|a_{m}\\|_{2}^{2}})\big{(}a_{1},a_{2},\cdots,a_{m}\big{)}^{T}.$