
Chebyshev Inertial Landweber Algorithm
for Linear Inverse Problems

Tadashi Wadayama and Satoshi Takabe
Nagoya Institute of Technology, Gokiso, Nagoya, Aichi, 466-8555, Japan,
{wadayama, s_takabe}@nitech.ac.jp
Abstract

The Landweber algorithm defined on complex/real Hilbert spaces is a gradient descent algorithm for linear inverse problems. Our contribution is to present a novel method for accelerating the convergence of the Landweber algorithm. In this paper, we first extend the theory of the Chebyshev inertial iteration to the Landweber algorithm on Hilbert spaces. An upper bound on the convergence rate clarifies the speed of global convergence of the proposed method. The Chebyshev inertial Landweber algorithm can be applied to a wide class of signal recovery problems on a Hilbert space, including deconvolution for continuous signals. The theoretical discussion developed in this paper naturally leads to a novel practical signal recovery algorithm. As a demonstration, a MIMO detection algorithm based on the projected Landweber algorithm is derived. The proposed MIMO detection algorithm achieves a much smaller symbol error rate than the MMSE detector.

I Introduction

A number of signal detection problems in wireless communications and signal processing can be classified as linear inverse problems. In a linear inverse problem, a source signal $x \in \mathbb{F}^n$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$) is inferred from the noisy linear observation $y = Hx + w$, where $H \in \mathbb{F}^{m \times n}$ and the additive noise $w \in \mathbb{F}^m$.

One of the simplest approaches to this task is to rely on the least-squares principle, i.e., to minimize $(1/2)\|y - Hx\|^2$, which corresponds to the maximum likelihood estimation rule for the source signal $x$ under the Gaussian noise assumption. The Landweber algorithm [1, 5] is defined by the fixed-point iteration

$x^{(k+1)} = x^{(k)} - \omega H^{\dagger}(Hx^{(k)} - y), \quad k = 0, 1, 2, \ldots$ (1)

where $H^{\dagger} := H^T$ if $\mathbb{F} = \mathbb{R}$ and $H^{\dagger} := H^H$ otherwise. The notation $X^H$ indicates the Hermitian transpose of $X$, and the parameter $\omega$ is a real constant. Note that the Landweber algorithm can be defined not only on a finite dimensional Euclidean space but also on an infinite dimensional complex/real Hilbert space. The Landweber algorithm defined on a Hilbert space is especially important for linear inverse problems involving convolutions of continuous signals.
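As a concrete illustration, the following minimal Python/NumPy sketch implements the iteration (1) for the real case; the problem sizes and the step size choice are illustrative assumptions, not values from the paper.

import numpy as np

def landweber(H, y, omega, num_iters=100):
    """Plain Landweber iteration (1): x <- x - omega * H^T (H x - y)."""
    x = np.zeros(H.shape[1])
    for _ in range(num_iters):
        x = x - omega * H.T @ (H @ x - y)
    return x

# Illustrative usage: any omega in (0, 2/lambda_max(H^T H)) yields
# convergence to the least-squares solution.
rng = np.random.default_rng(0)
H = rng.standard_normal((64, 32))
x_true = rng.standard_normal(32)
y = H @ x_true + 0.01 * rng.standard_normal(64)
omega = 1.0 / np.linalg.eigvalsh(H.T @ H).max()
x_hat = landweber(H, y, omega, num_iters=500)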

The Landweber algorithm can be regarded as a gradient descent method for minimizing the objective function $(1/2)\|y - Hx\|^2$ because $H^{\dagger}(Hx^{(k)} - y)$ is the gradient of the objective function. The Landweber algorithm has been widely employed as a signal reconstruction algorithm for image deconvolution [2], inverse problems regarding diffusion partial differential equations [4], MIMO detectors [3], and so on. An advantage of the Landweber algorithm is that we can easily include a proximal or projection operation that utilizes prior knowledge of the source signal after the gradient descent step (1); the resulting method is often called the projected Landweber algorithm [5].

One evident drawback of the Landweber algorithm is that its convergence is often too slow, so an appropriate acceleration method is needed. Recently, Takabe and Wadayama [8] found that a step-size sequence determined from Chebyshev polynomials can accelerate the convergence of gradient descent algorithms. Wadayama and Takabe [9] generalized the central idea of [8] to general fixed-point iterations; the method is called the Chebyshev inertial iteration. It is therefore natural to apply the Chebyshev inertial iteration to improve the convergence speed of the Landweber algorithm.

Successive over-relaxation (SOR) is a common method for accelerating a fixed-point iteration $x^{(k+1)} = f(x^{(k)})$ for solving linear equations, such as the Jacobi method. The SOR iteration corresponding to the above equation is given by $x^{(k+1)} = x^{(k)} + \omega_k\left(f(x^{(k)}) - x^{(k)}\right)$, where $\omega_k \in (0, 2)$. A fixed SOR factor $\omega_k = \omega$ is commonly used in practice. The Chebyshev inertial iteration [9] is a natural generalization of the SOR method that employs a factor sequence $\{\omega_k\}$ defined via Chebyshev polynomials. Its fundamental properties, including the convergence rate, are analyzed in [8, 9].

The goal of this paper is twofold. The first goal is to extend the theory of the Chebyshev inertial iteration to the Landweber algorithm on Hilbert spaces. The arguments in [8, 9] are restricted to the case where the underlying space is a finite dimensional Euclidean space. By extending the argument to Hilbert spaces, the essential idea of the Chebyshev inertial iteration becomes applicable to iterative algorithms for infinite dimensional linear inverse problems, and we can obtain accelerated convergence of these algorithms. The Chebyshev inertial Landweber algorithm presented in this paper can be applied to a wide class of signal reconstruction algorithms on complex/real Hilbert spaces, such as deconvolution for continuous signals.

The second goal is to present a novel MIMO detection algorithm based on the projected Landweber algorithm with the Chebyshev inertial iteration, demonstrating that the theoretical discussion developed in this paper naturally leads to a practical signal detection algorithm.

II Preliminaries

In this section, several basic facts on functional analysis required for the subsequent argument are briefly reviewed. Precise definitions regarding functional analysis presented in this section can be found in [6].

II-A Hilbert space

Let $p$ be a real number satisfying $1 \le p < \infty$. The set of infinite sequences $(a_1, a_2, \ldots)$ over $\mathbb{F}$ satisfying $\sum_{i=1}^{\infty}|a_i|^p < \infty$ is denoted by $l^p$, where $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$. If the sequences have finite length $N$, i.e., the space is a complex or real vector space, we obtain the finite dimensional normed space $l^p(N)$. The set of measurable functions $f$ on the closed interval $[a, b] \subset \mathbb{R}$ satisfying $\int_a^b |f(x)|^p dx < \infty$ is denoted by $L^p(a, b)$.

Let ${\cal H}$ be a complex or real vector space. For any $f, g \in {\cal H}$, an inner product $\langle f, g \rangle$ is assumed to be given. The norm (inner product norm) associated with the inner product is defined by $\|f\| := \sqrt{\langle f, f \rangle}$, $f \in {\cal H}$. If the normed space $({\cal H}, \|\cdot\|)$ is complete, the space is said to be a Hilbert space. The normed spaces $l^2$ and $L^2(a, b)$ are Hilbert spaces. In the case of $l^2$, the inner product is defined by $\langle f, g \rangle := \sum_{i=1}^{\infty} f_i \overline{g_i}$. On the other hand, the inner product for $L^2(a, b)$ is defined by $\langle f, g \rangle := \int_a^b f(x)\overline{g(x)}\,dx$.

Let $T$ be a bounded linear operator on ${\cal H}$. The operator norm of $T$ is defined by $\|T\| := \sup_{\|f\|=1} \|Tf\|$, where $f$ is an element of ${\cal H}$. For any $x \in {\cal H}$ and a bounded linear operator $T$, we have the sub-multiplicative inequality

$\|Tx\| \le \|T\|\,\|x\|.$ (2)

II-B Compact operator and spectral mapping theorem

Let $T$ be a linear operator on ${\cal H}$. A complex number $\lambda \in \mathbb{C}$ is an eigenvalue of $T$ if and only if some nonzero vector $x \in {\cal H}$ satisfies $Tx = \lambda x$, where $x$ is said to be an eigenvector corresponding to $\lambda$. In the following, the set of all eigenvalues of $T$ is denoted by $\sigma(T)$. The spectral radius of $T$ is given by

$\rho(T) := \max\{|\lambda| : \lambda \in \sigma(T)\}.$ (3)

Suppose that a kernel function $K(x, y)$ satisfies $\int_a^b \int_a^b |K(x, y)|^2\,dx\,dy < \infty$. For any $f \in L^2(a, b)$, let $(Tf)(x) := \int_a^b K(x, y) f(y)\,dy$, which defines a linear operator on $L^2(a, b)$; the operator $T$ is a compact operator [6]. If the kernel function satisfies $K(x, y) = \overline{K(y, x)}$, then the operator $T$ is a compact self-adjoint operator. In the case of the finite dimensional space $l^2(N)$, a linear operator defined by a Hermitian matrix $T$ corresponds to a compact self-adjoint operator.

The next theorem plays an important role in our analysis described below.

Theorem 1 (Spectral mapping theorem [6])

Let $T$ be a compact operator defined on ${\cal H}$. If a complex-valued function $f : \mathbb{C} \to \mathbb{C}$ is analytic around $\sigma(T)$, then the set of eigenvalues of $f(T)$ is given by $\{f(\lambda) : \lambda \in \sigma(T)\}$.

Let $T$ be a compact self-adjoint operator on ${\cal H}$. It is known that the set of eigenvalues of a compact self-adjoint operator is a countable set of real numbers [6]. Let $\sigma(T) := \{\lambda_1, \lambda_2, \ldots\}$ ($\lambda_i \in \mathbb{R}$). A polynomial $f(x) := a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ defined on $\mathbb{C}$ is analytic on the whole of $\mathbb{C}$. As a consequence of the spectral mapping theorem, the set of eigenvalues of $f(T)$ is given by $\sigma(f(T)) = \{f(\lambda_1), f(\lambda_2), \ldots\}$. For a compact self-adjoint operator $T$, we also have

$\|f(T)\| = \rho(f(T)).$ (4)

II-C Landweber iteration

Assume that $x$ is an element of a Hilbert space. The aim of the Landweber iteration is to recover the original signal $x$ from a measurement $y = Tx$ or a noisy measurement $y$, where $T$ is a linear operator. The Landweber iteration is a gradient descent algorithm on a Hilbert space that minimizes the error functional $(1/2)\|Tx - y\|^2$. It is defined by the fixed-point iteration

$x^{(k+1)} = x^{(k)} - \omega T^*(Tx^{(k)} - y).$ (5)

III Chebyshev Inertial Iteration

III-A Fixed-point iteration

Suppose that we have a fixed-point iteration defined on a Hilbert space ${\cal H}$:

$x^{(k+1)} = Ax^{(k)} + b, \quad k = 0, 1, 2, \ldots,$ (6)

where $x^{(k)}, b \in {\cal H}$. The operator $A : {\cal H} \to {\cal H}$ is a compact self-adjoint operator on ${\cal H}$.

Assume also that $A$ is a contraction mapping, so that there exists a fixed point $x^* \in {\cal H}$ satisfying $x^* = Ax^* + b$; the existence of the fixed point is guaranteed by the Banach fixed-point theorem [6]. The inertial iteration [9] corresponding to the original iteration (6) is defined by

$x^{(k+1)} = x^{(k)} + \omega_k\left(Ax^{(k)} + b - x^{(k)}\right),$ (7)

where $\{\omega_k\}$ are called the inertial factors. It should be remarked that the fixed point of the inertial iteration exactly coincides with the fixed point of the original iteration (6).
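As a minimal illustration (our own sketch, not code from the paper), the following Python function applies the inertial iteration (7) to a finite dimensional affine map $x \mapsto Ax + b$; setting all factors to $\omega_k = 1$ recovers the original iteration (6).

import numpy as np

def inertial_iteration(A, b, omegas, num_iters, x0):
    """Inertial iteration (7): x <- x + omega_k (A x + b - x).

    The factor sequence omegas is reused cyclically (period T = len(omegas)).
    """
    x = x0.copy()
    for k in range(num_iters):
        omega = omegas[k % len(omegas)]
        x = x + omega * (A @ x + b - x)
    return x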

III-B Analysis on spectral radius

From the inertial iteration (7), we have the equivalent update equation

$x^{(k+1)} = (I - \omega_k B)x^{(k)} + \omega_k b,$ (8)

where $B := I - A$. From the assumption that $A$ is a compact self-adjoint operator, we can show that $B$ is also self-adjoint. Since the fixed point $x^*$ satisfies

$x^* = (I - \omega_k B)x^* + \omega_k b$ (9)

for any nonnegative $k$, subtracting (9) from (8) immediately yields a recursive formula for the residual errors:

$x^{(k+1)} - x^* = (I - \omega_k B)(x^{(k)} - x^*).$ (10)

In the following discussion, we assume inertial factors satisfying the periodicity condition

$\omega_{\ell T + j} = \omega_j, \quad \ell = 0, 1, 2, \ldots, \quad j = 0, 1, \ldots, T-1,$ (11)

where the positive integer $T$ represents the period.

Recursively applying (10) $T$ times, we get

$x^{(T)} - x^* = \left(\prod_{k=0}^{T-1}(I - \omega_k B)\right)(x^{(0)} - x^*).$ (12)

The norm of the left-hand side can be upper bounded as

$\|x^{(T)} - x^*\| = \left\|\left(\prod_{k=0}^{T-1}(I - \omega_k B)\right)(x^{(0)} - x^*)\right\|$ (13)
$\le \left\|\prod_{k=0}^{T-1}(I - \omega_k B)\right\| \|x^{(0)} - x^*\|$ (14)
$= \rho\left(\prod_{k=0}^{T-1}(I - \omega_k B)\right) \|x^{(0)} - x^*\|.$ (15)

The inequality (14) is based on the sub-multiplicative inequality (2), and the last equality (15) is due to (4).

III-C Chebyshev inertial factors

As discussed above, the operator $B = I - A$ is self-adjoint and, since $A$ is compact, has countably many real eigenvalues, denoted by $\lambda_k$ ($k = 1, 2, \ldots$). Hereafter, the minimum and maximum eigenvalues of $B$ are denoted by $l_{min}$ and $l_{max}$, respectively.

In the following discussion, we will use the inertial factors defined by

$\omega_k^* := \left[\lambda_+ + \lambda_- \cos\left(\frac{2k+1}{2T}\pi\right)\right]^{-1}, \quad k = 0, 1, \ldots, T-1,$ (16)

where $\lambda_+ := (l_{max} + l_{min})/2$ and $\lambda_- := (l_{max} - l_{min})/2$. These inertial factors are called the Chebyshev inertial factors. A Chebyshev inertial factor is the inverse of a root of a Chebyshev polynomial [8, 9].
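A direct Python translation of (16) is given below; this helper is our own sketch and is reused in the later examples.

import numpy as np

def chebyshev_factors(l_min, l_max, T):
    """Chebyshev inertial factors of eq. (16) for the eigenvalue range [l_min, l_max]."""
    lam_plus = (l_max + l_min) / 2.0
    lam_minus = (l_max - l_min) / 2.0
    k = np.arange(T)
    return 1.0 / (lam_plus + lam_minus * np.cos((2 * k + 1) * np.pi / (2 * T)))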

Let $\beta_T(x) := \prod_{k=0}^{T-1}(1 - \omega_k^* x)$. If the Chebyshev inertial factors are employed in the inertial iteration, we can use Lemma 2 proved in [9] to bound $|\beta_T(\lambda)|$ as

$|\beta_T(\lambda)| \le \mathrm{sech}\left(T \cosh^{-1}\left(\frac{\lambda_+}{\lambda_-}\right)\right)$ (17)

for $l_{min} \le \lambda \le l_{max}$. Combining the spectral mapping theorem and the above inequality, we immediately obtain

$\rho(\beta_T(B)) = \rho\left(\prod_{k=0}^{T-1}(I - \omega_k^* B)\right) \le \mathrm{sech}\left(T \cosh^{-1}\left(\frac{\lambda_+}{\lambda_-}\right)\right).$ (18)

The above argument can be summarized in the following theorem.

Theorem 2

Let $T$ be a positive integer. For $\ell = 1, 2, 3, \ldots$, the Landweber iteration with the Chebyshev inertial factors satisfies the residual error bound

$\|x^{(\ell T)} - x^*\| \le U(T)^{\ell} \|x^{(0)} - x^*\|,$ (19)

where $U(T) := \mathrm{sech}\left(T \cosh^{-1}(\lambda_+/\lambda_-)\right)$.

This theorem implies that the error norm is tightly upper bounded whenever the iteration index is a multiple of $T$.
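To give a feel for the bound, the following short Python sketch (our illustration; the eigenvalue range $[0.1, 0.9]$ is an arbitrary assumption) evaluates $U(T)$ of Theorem 2 for several periods:

import numpy as np

def U(T, l_min, l_max):
    """Upper bound U(T) = sech(T * arccosh(lambda_+ / lambda_-)) of Theorem 2."""
    lam_plus = (l_max + l_min) / 2.0
    lam_minus = (l_max - l_min) / 2.0
    return 1.0 / np.cosh(T * np.arccosh(lam_plus / lam_minus))

for T in (1, 2, 4, 8):
    print(T, U(T, 0.1, 0.9))   # the per-period factor U(T) decays rapidly in T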

Combining the Landweber iteration with the Chebyshev inertial iteration, we have the Chebyshev inertial Landweber algorithm:

$x^{(k+1)} = x^{(k)} + \omega_k^*\left(x^{(k)} - \omega T^*(Tx^{(k)} - y) - x^{(k)}\right)$ (20)
$= x^{(k)} - \omega_k^* \omega T^*\left(Tx^{(k)} - y\right).$

The iteration is almost the same as the original Landweber iteration except for the use of the Chebyshev inertial factors $\{\omega_k^*\}$. It is common to set the fixed factor $\omega$ to the inverse of the maximum eigenvalue of $T^*T$.
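Putting the pieces together, a minimal NumPy sketch of iteration (20) for the finite dimensional real case might look as follows; the eigenvalue-based choices of omega, l_min, and l_max mirror the conventions stated above, and chebyshev_factors is the helper sketched earlier.

import numpy as np

def chebyshev_landweber(H, y, T=8, num_iters=100):
    """Chebyshev inertial Landweber iteration (20), minimal real-valued sketch."""
    eigs = np.linalg.eigvalsh(H.T @ H)
    omega = 1.0 / eigs.max()                    # common choice stated above
    # B = omega * H^T H, so its eigenvalue range sets the inertial factors.
    l_min, l_max = omega * eigs.min(), omega * eigs.max()
    omegas = chebyshev_factors(l_min, l_max, T)
    x = np.zeros(H.shape[1])
    for k in range(num_iters):
        x = x - omegas[k % T] * omega * H.T @ (H @ x - y)
    return x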

IV Experiments

IV-A Deconvolution by Landweber iteration

In this subsection, deconvolution by the Landweber iteration on the Hilbert space ${\cal H} = L^2(-\infty, \infty)$ over $\mathbb{R}$ is studied. Suppose that $f, g \in {\cal H}$ are measurable functions on $\mathbb{R}$. The convolution of $f$ and $g$ is defined as

$y(x) := \int_{-\infty}^{\infty} f(u) g(x - u)\,du,$ (21)

which defines a compact operator on ${\cal H}$ [6]. We thus can write $y = Gf$, where $G : {\cal H} \to {\cal H}$ is the compact operator defined by (21) and $f, g \in {\cal H}$. In the context of an inverse problem regarding a linear system involving a convolution, the function $f$ can be considered the source signal and $g$ represents a point spread function (PSF) or impulse response. In the context of a partial differential equation, $g$ represents the Green's function or the integral kernel corresponding to the given partial differential equation.

We further assume that $g$ is an even function satisfying $g(x) = g(-x)$. This implies that the operator $G$ is a compact self-adjoint operator. In the following, we try to recover the source signal $f$ from the blurred signal $y$ by using the Landweber iteration on ${\cal H}$ given by $s^{(k+1)} = s^{(k)} - \omega G^*(Gs^{(k)} - y)$ with the initial condition $s^{(0)} := y$. Note that $G^* = G$ holds due to the assumption on $g$.

In the experiments shown below, the closed interval from $-8.192$ to $8.192$ is discretized into 16384 bins, and the convolution integral (21) is approximated by cyclic convolution computed with a 16384-point FFT and frequency domain products. Figure 1 presents the source signal $f$, the convolution kernel $g$, and the convolved signal $y$.
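As a rough illustration of this discretization (our own sketch; the grid and the signals follow the definitions given in Fig. 1), the operator $G$ and the Landweber update can be realized with NumPy FFTs as follows:

import numpy as np

# 16384 bins on [-8.192, 8.192); dx scales the Riemann sum approximating (21).
N = 16384
dx = 2 * 8.192 / N
x = -8.192 + dx * np.arange(N)
f = (0.5 * np.exp(-x**2) + np.exp(-(x - 2)**2)
     - np.exp(-(x - 3)**2) + 0.5 * np.exp(-(x - 4)**2))
g = np.exp(-x**2)

G_hat = np.fft.fft(np.fft.ifftshift(g)) * dx     # frequency response of G

def G_apply(s):
    """Cyclic FFT approximation of the convolution integral (21)."""
    return np.real(np.fft.ifft(np.fft.fft(s) * G_hat))

y = G_apply(f)
s = y.copy()                                     # initial condition s^(0) = y
omega = 0.3                                      # as in the experiments below
for k in range(100):                             # G* = G because g is even
    s = s - omega * G_apply(G_apply(s) - y)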

Figure 1: Plots of the source signal $f$, convolution kernel $g$, and the convolved signal $y$. The definitions of $f$ and $g$ are $f(x) := \frac{1}{2}\exp(-x^2) + \exp(-(x-2)^2) - \exp(-(x-3)^2) + \frac{1}{2}\exp(-(x-4)^2)$ and $g(x) := \exp(-x^2)$.
Figure 2: Deconvolution process: (upper) original Landweber iteration, (lower) Chebyshev inertial Landweber iteration. The coefficient $\omega$ is set to $0.3$ in both cases. For the Chebyshev inertial Landweber algorithm, $l_{min} = 0.1$ and $l_{max} = 0.9$ are used, and the period $T$ is set to $8$. The index $k$ represents the number of iterations.

Figure 2 presents the deconvolution process of the original Landweber algorithm (upper) and the Chebyshev inertial Landweber algorithm (lower). The tentative estimates $s^{(k)}$ are shown every 30 iterations. In both cases, we can observe that $s^{(k)}$ gradually approaches the source signal $f$. This is because $I - \omega G^*G$ is a contraction mapping under the setting $\omega = 0.3$. Comparing the two panels, we can see that the Chebyshev inertial Landweber algorithm converges to the source signal $f$ much faster. This observation is also supported by the error curves depicted in Fig. 3. The Chebyshev inertial Landweber algorithm ($T = 1, 2, 8$) shows faster convergence than the original Landweber iteration. For example, the error $\|s^{(k)} - f\| = 0.1$ is achieved at $k = 100$ with the original Landweber iteration, whereas the Chebyshev inertial iteration requires only $k = 25$ iterations.

Figure 3: Error norms $\|s^{(k)} - f\|$ as functions of the number of iterations in the deconvolution processes.

IV-B Finite-dimensional least square problem

In this subsection, we discuss a least-squares problem closely related to MIMO detection. We here assume the finite dimensional Hilbert space ${\cal H} = l^2(n)$ over $\mathbb{C}$.

Let $H \in \mathbb{C}^{n \times n}$ be a complex matrix whose elements follow the complex normal distribution ${\cal CN}(0, 1)$. Our task is to recover a source signal $x \in \mathbb{C}^n$ from a noisy linear measurement $y = Hx + w$, where $w$ is a complex Gaussian noise vector. The problem can be seen as a MIMO detection problem in which the numbers of transmit and receive antennas are both $n$. The least-squares problem $\hat{x} := \mathrm{argmin}_{x \in \mathbb{C}^n}\,(1/2)\|y - Hx\|^2$ is equivalent to the maximum likelihood estimation rule for the above setting.

In this case, the Landweber iteration is exactly the same as the gradient descent process

$s^{(k+1)} = s^{(k)} - \omega H^H(Hs^{(k)} - y)$ (22)

for minimizing the objective function. It is known that the asymptotic convergence rate is optimal if $\omega = \omega_{opt} := 2\left(\lambda_{min}(H^H H) + \lambda_{max}(H^H H)\right)^{-1}$ holds. In this case, the matrix $A := I - \omega_{opt} H^H H$ is Hermitian and thus $A$ is a compact self-adjoint operator on ${\cal H}$. This means that we can apply the Chebyshev inertial iteration to the Landweber iteration.

The details of the experiment are summarized as follows. The dimension $n$ is set to $32$. Each element of $x$ is chosen uniformly at random from the 8-PSK constellation $\{\exp((2\pi jk)/8)\}$ ($k = 0, 1, \ldots, 7$). The optimal factor $\omega_{opt}$ is employed in the Landweber iteration. Each element of the noise vector $w$ follows ${\cal CN}(0, \sigma^2)$ with $\sigma = 10^{-4}$. The minimum and maximum eigenvalues of $\omega_{opt} H^H H$ are used as $l_{min}$ and $l_{max}$ for determining the Chebyshev inertial factors.
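A minimal complex-valued sketch of this setup (our reconstruction from the description above, reusing the chebyshev_factors helper) is:

import numpy as np

rng = np.random.default_rng(0)
n, T, sigma = 32, 8, 1e-4
H = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
x = np.exp(2j * np.pi * rng.integers(0, 8, n) / 8)           # 8-PSK symbols
w = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
y = H @ x + w

eigs = np.linalg.eigvalsh(H.conj().T @ H)
omega_opt = 2.0 / (eigs.min() + eigs.max())
l_min, l_max = omega_opt * eigs.min(), omega_opt * eigs.max()
omegas = chebyshev_factors(l_min, l_max, T)

s = np.zeros(n, dtype=complex)
for k in range(100):
    s = s - omegas[k % T] * omega_opt * H.conj().T @ (H @ s - y)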

Figure 4 compares the squared error norms $\|s^{(k)} - x\|^2$ of the original Landweber iteration and the accelerated iterations. From Fig. 4 (left), we can immediately recognize the zigzag-shaped error curves of the Chebyshev inertial Landweber algorithm. This is because the error norm is exponentially upper bounded only when the iteration index $k$ is a multiple of $T$, as shown in Theorem 2. It means that the lower envelope of the zigzag curves can be seen as the guaranteed performance of the Chebyshev inertial Landweber algorithm. Figure 4 (right) shows the squared errors up to $k = 30000$. The proposed schemes (Chebyshev $T = 2, 8$) show much faster convergence than the original Landweber iteration.

Figure 4: Squared error norms $\|s^{(k)} - x\|^2$ as functions of the number of iterations: (left) up to 50 iterations, (right) up to 30000 iterations. The squared error norms are averaged over 100 trials.

The speed of convergence of the Landweber iteration is dominated by the spectral radius $\rho(A) = \rho\left(I - \omega_{opt} H^H H\right)$. On the other hand, $U(T)$ gives an upper bound on the normalized asymptotic convergence rate depending on the period $T$. Figure 5 (left) presents both $\rho(A)$ and $U(T)$ as functions of $T$. As $T$ becomes larger, the value of $U(T)$ becomes strictly smaller than $\rho(A)$. This means that the Chebyshev inertial iteration certainly accelerates the asymptotic convergence rate. Figure 5 (right) shows approximate error norms derived from $\rho(A)$ and $U(T)$, given by $\rho(A)^k$, $U(2)^k$, and $U(8)^k$. Although these values do not include the initial error $\|s^{(0)} - x\|$, the curves qualitatively explain the behavior of the actual error curves in Fig. 4 (right). These results support the usefulness of the theoretical analysis discussed in the previous section.

Figure 5: (left) Normalized convergence rates $\rho(A)$ and $U(T)$; (right) approximate error norms derived from $\rho(A)$ and $U(T)$, given by $\rho(A)^k$, $U(2)^k$, and $U(8)^k$, respectively.

IV-C Projected Landweber-based MIMO detection

It is natural to consider the projected Landweber algorithm [5] to make use of prior information about the source signal and thereby improve the reconstruction performance. We here examine the detection performance of a simple projected Landweber-based MIMO detection algorithm.

The soft projection operator $\eta : \mathbb{C} \to \mathbb{C}$ used here is defined by

$\eta(r) := \frac{\sum_{p \in S} p \exp\left(-|r - p|^2/\alpha^2\right)}{\sum_{p \in S} \exp\left(-|r - p|^2/\alpha^2\right)},$ (23)

where $S \subset \mathbb{C}$ is a signal constellation [7]. The iteration of the projected Landweber algorithm is fairly simple: the soft projection operator is applied element-wise to the output $s^{(k+1)}$ of (22), and the result is then passed to the inertial iteration process, as sketched below. The details of the experiments are as follows. The channel model is exactly the same as in the previous subsection ($32 \times 32$ MIMO channel with 8-PSK input). The minimum and maximum eigenvalues $l_{min}$ and $l_{max}$ are determined according to the Marchenko-Pastur law. Figure 6 presents the symbol error rate (SER) of the proposed scheme. The projected Landweber detectors achieve one to two orders of magnitude smaller SER than the MMSE detector. The accelerated schemes (Chebyshev $T = 4, 8, 16$) provide steeper error curves than the projected Landweber detector without the Chebyshev inertial iteration (denoted by β€œLandweber”). This observation supports the potential of the Chebyshev inertial iteration.
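A compact sketch of the soft projection (23) and one projected Landweber step, reflecting our reading of the description above, is:

import numpy as np

PSK8 = np.exp(2j * np.pi * np.arange(8) / 8)     # signal constellation S

def soft_projection(r, alpha2, S=PSK8):
    """Soft projection eta of eq. (23), applied element-wise to a vector r."""
    # Gaussian-weighted average of the constellation points for each element of r.
    w = np.exp(-np.abs(r[:, None] - S[None, :])**2 / alpha2)
    return (w @ S) / w.sum(axis=1)

def projected_landweber_step(s, H, y, omega_k, omega, alpha2):
    """Gradient step (22), element-wise soft projection, then inertial update (7)."""
    z = soft_projection(s - omega * H.conj().T @ (H @ s - y), alpha2)
    return s + omega_k * (z - s)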

Figure 6: Symbol error rate of the projected Landweber algorithm with the Chebyshev inertial iteration. The channel model is a $32 \times 32$ MIMO channel with 8-PSK input. The number of iterations for all the algorithms (except MMSE) is set to 100. As a baseline, the performance of the MMSE detector defined by $\hat{x} := (H^H H + \sigma^2 I)^{-1} H^H y$ is included. The SNR $snr$ (in dB) and $\sigma$ are related by $\sigma = \sqrt{10^{-snr/10}}$. In the projected Landweber algorithms, $\alpha^2 = 0.5$ is used when the iteration index $k$ is less than 20; for $k \ge 20$, $\alpha^2 = 0.25$ is used.

Acknowledgement

This work was supported by JSPS Grant-in-Aid for Scientific Research (B) Grant Number 19H02138.

References

  • [1] L. Landweber, β€œAn iteration formula for Fredholm integral equations of the first kind,” American Journal of Mathematics, vol. 73, no. 3, pp. 615–624, 1951.
  • [2] F. Vankawala and A. Ganatra, β€œModified Landweber algorithm for deblur and denoise images,” International Journal of Soft Computing and Engineering (IJSCE), vol. 5, no. 6, pp. 79–82, 2016.
  • [3] W. Zhang, X. Bao, and J. Dai, β€œLow-complexity detection based on Landweber method in the uplink of massive MIMO systems,” 2017.
  • [4] F. Yang, Y.-P. Ren, X.-X. Li, and D. G. Li, β€œLandweber iterative method for identifying a space-dependent source for the time-fractional diffusion equation,” Boundary Value Problems, Springer, 2017.
  • [5] P. L. Combettes and J. C. Pesquet, β€œProximal splitting methods in signal processing,” in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, Springer-Verlag, pp. 185–212, 2011.
  • [6] J. Muscat, Functional Analysis, Springer, 2014.
  • [7] S. Takabe, T. Wadayama, and Y. C. Eldar, β€œComplex trainable ISTA for linear and nonlinear inverse problems,” arXiv:1904.07409v2, 2019.
  • [8] S. Takabe and T. Wadayama, β€œTheoretical interpretation of learned step size in deep-unfolded gradient descent,” arXiv:2001.05142, 2020.
  • [9] T. Wadayama and S. Takabe, β€œChebyshev inertial iteration for accelerating fixed-point iterations,” arXiv:2001.03280, 2020.