
Efficient Decomposition-Based Algorithms for 1\ell_{1}-Regularized Inverse Problems with Column-Orthogonal and Kronecker Product Matrices

Brian Sweeney, Malena I. Español, and Rosemary Renaut
School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ ([email protected], [email protected], and [email protected])
Abstract

We consider an 1\ell_{1}-regularized inverse problem where both the forward and regularization operators have a Kronecker product structure. By leveraging this structure, a joint decomposition can be obtained using generalized singular value decompositions. We show how this joint decomposition can be effectively integrated into the Split Bregman and Majorization-Minimization methods to solve the 1\ell_{1}-regularized inverse problem. Furthermore, for cases involving column-orthogonal regularization matrices, we prove that the joint decomposition can be derived directly from the singular value decomposition of the system matrix. As a result, we show that framelet and wavelet operators are efficient for these decomposition-based algorithms in the context of 1\ell_{1}-regularized image deblurring problems.

Keywords: \ell_{1} regularization, Kronecker product, Framelets, Wavelets, Generalized singular value decomposition

AMS subject classifications: 65F22, 65F10, 68W40

1 Introduction

We are interested in solving discrete ill-posed inverse problems of the form 𝐀𝐱𝐛{\bf A}{\bf x}\approx{\bf b}, where 𝐀m×n{\bf A}\in\mathbb{R}^{m\times n}, mnm\geq n, 𝐛m{\bf b}\in\mathbb{R}^{m}, and 𝐱n{\bf x}\in\mathbb{R}^{n}. It is assumed that the matrix 𝐀{\bf A} has full column rank but is ill-conditioned, and 𝐛{\bf b} is contaminated by additive Gaussian noise: 𝐛=𝐀𝐱+𝐞{\bf b}={\bf A}{\bf x}+{\bf e}, where 𝐞{\bf e} is a Gaussian noise vector. Throughout this paper, we normalize the problem by dividing 𝐀{\bf A} and 𝐛{\bf b} by the standard deviation of 𝐞{\bf e}. This corresponds to whitening the noise in the data. Due to the ill-posedness of this problem, the solution of the normal equations, 𝐱=(𝐀𝐀)1𝐀𝐛{\bf x}=({\bf A}^{\top}{\bf A})^{-1}{\bf A}^{\top}{\bf b}, is poor, and therefore, we will apply regularization. With generalized Tikhonov regularization [48], 𝐱{\bf x} is selected to solve the minimization problem

(1) min𝐱{12𝐀𝐱𝐛22+λ22𝐋𝐱22},\displaystyle\min_{{\bf x}}\left\{\frac{1}{2}\left\lVert{\bf A}{\bf x}-{\bf b}\right\rVert_{2}^{2}+\frac{\lambda^{2}}{2}\left\lVert{\bf L}{\bf x}\right\rVert_{2}^{2}\right\},

where λ\lambda is a regularization parameter and 𝐋p×n{\bf L}\in\mathbb{R}^{p\times n} is a regularization matrix. 𝐋{\bf L} is often selected as the discretization of some derivative operator [28]. We also assume that

(2) \displaystyle\mbox{Null}({\bf A})\cap\mbox{Null}({\bf L})=\{{\bf 0}\},

where Null(𝐂)\mbox{Null}({\bf C}) denotes the null space of the matrix 𝐂{\bf C}.

Instead of considering the Tikhonov problem Eq. 1, we will consider a sparsity-preserving problem

(3) min𝐱{12𝐀𝐱𝐛22+μ𝐋𝐱1},\displaystyle\min_{\bf x}\left\{\frac{1}{2}\left\lVert{\bf A}{\bf x}-{\bf b}\right\rVert_{2}^{2}+\mu\left\lVert{\bf L}{\bf x}\right\rVert_{1}\right\},

where μ\mu is a regularization parameter. We will solve Eq. 3 using two iterative methods, split Bregman (SB) [24] and Majorization-Minimization (MM) [31, 32]. Both methods share a minimization sub-problem of the same form [47].

Our focus is on Kronecker Product (KP) matrices 𝐀{\bf A} and 𝐋{\bf L}, that is, matrices that can be written as 𝐀=𝐀1𝐀2{\bf A}={\bf A}_{1}\otimes{\bf A}_{2} and 𝐋=𝐋1𝐋2,{\bf L}={\bf L}_{1}\otimes{\bf L}_{2}, where \otimes denotes the Kronecker product, which is defined for matrices 𝐂m×n{\bf C}\in\mathbb{R}^{m\times n} and 𝐃p×q{\bf D}\in\mathbb{R}^{p\times q} by

(4) 𝐂𝐃=[c11𝐃c12𝐃c1n𝐃c21𝐃c22𝐃c2n𝐃cm1𝐃cm2𝐃cmn𝐃]mp×nq.{\bf C}\otimes{\bf D}=\begin{bmatrix}c_{11}{\bf D}&c_{12}{\bf D}&\cdots&c_{1n}{\bf D}\\ c_{21}{\bf D}&c_{22}{\bf D}&\cdots&c_{2n}{\bf D}\\ \vdots&\vdots&&\vdots\\ c_{m1}{\bf D}&c_{m2}{\bf D}&\cdots&c_{mn}{\bf D}\end{bmatrix}_{mp\times nq}.

Thus, we are interested in solving problem Eq. 3 of the specific form

(5) min𝐱{12(𝐀1𝐀2)𝐱𝐛22+μ(𝐋1𝐋2)𝐱1}.\displaystyle\min_{\bf x}\left\{\frac{1}{2}\left\lVert({\bf A}_{1}\otimes{\bf A}_{2}){\bf x}-{\bf b}\right\rVert_{2}^{2}+\mu\left\lVert({\bf L}_{1}\otimes{\bf L}_{2}){\bf x}\right\rVert_{1}\right\}.

KP matrices 𝐀{\bf A} and 𝐋{\bf L} arise in many applications. In image deblurring problems, 𝐀{\bf A} is a KP matrix, provided that the blur can be separated in the horizontal and vertical directions [29]. Another case where 𝐀{\bf A} is a KP matrix is in 2D nuclear magnetic resonance (NMR) problems [6, 23], where the goal is to reconstruct a map of relaxation times from a probed material. 𝐋{\bf L} is a KP matrix with certain forms of regularization, such as when orthonormal framelets [44] or wavelets [12] are used in two dimensions. Both are tight frames, namely, they form redundant orthogonal bases for n\mathbb{R}^{n}. This redundancy allows for robust representations where the loss of information is more tolerable [11]. When used as regularization in Eq. 3, framelet [8, 10, 11, 54] and wavelet-based [18] methods enforce sparsity on the solution within the tight frame.

In [8], tight frames are used for regularization in a general p\ell_{p}-q\ell_{q} problem without the assumptions of having KP matrices. Other minimization problems with KP matrices 𝐀{\bf A} or 𝐋{\bf L} have been studied before. Least-squares problems of the form

min𝐱(𝐀1𝐀2)𝐱𝐛22,\displaystyle\min_{\bf x}\left\lVert({\bf A}_{1}\otimes{\bf A}_{2}){\bf x}-{\bf b}\right\rVert^{2}_{2},

have been solved efficiently using the singular value and QR decompositions [19, 20]. The special case where the forward operator has the form 𝐀=𝐈𝐀2{\bf A}={\bf I}\otimes{\bf A}_{2}, which is equivalent to a matrix inverse problem, has also been analyzed [26]. For least-squares problems of the form

(6) min𝐱[𝐀1𝐀2𝐁1𝐁2]𝐱𝐛22,\displaystyle\min_{\bf x}\left\lVert\begin{bmatrix}{\bf A}_{1}\otimes{\bf A}_{2}\\ {\bf B}_{1}\otimes{\bf B}_{2}\end{bmatrix}{\bf x}-{\bf b}\right\rVert^{2}_{2},

the generalized singular value decompositions (GSVDs) of {𝐀1,𝐁1}\{{\bf A}_{1},{\bf B}_{1}\} and {𝐀2,𝐁2}\{{\bf A}_{2},{\bf B}_{2}\} can be used to obtain a joint decomposition and solve the problem [3, 35, 52]. In particular, Givens rotations can be used to diagonalize the matrix containing the generalized singular values, which then diagonalizes the problem. A regularized problem of the form

(7) min𝐱[𝐀x𝐋y𝐋x𝐀y]𝐱𝐛22+λ2[𝐈𝐋y𝐋x𝐈]𝐱22\displaystyle\min_{\bf x}\left\lVert\begin{bmatrix}{\bf A}_{x}\otimes{\bf L}_{y}\\ {\bf L}_{x}\otimes{\bf A}_{y}\end{bmatrix}{\bf x}-{\bf b}\right\rVert^{2}_{2}+\lambda^{2}\left\lVert\begin{bmatrix}{\bf I}\otimes{\bf L}_{y}\\ {\bf L}_{x}\otimes{\bf I}\end{bmatrix}{\bf x}\right\rVert_{2}^{2}

occurs in adaptive optics [3, 35]. The structure of this problem can be utilized to rewrite it as a least-squares problem with a preconditioner. Another problem that utilizes KP matrices is the constrained least-squares problem

min𝐱(𝐀1𝐀2)𝐱𝐛22,s.t. (𝐋1𝐋2)𝐱=𝐡,\displaystyle\min_{\bf x}\left\lVert({\bf A}_{1}\otimes{\bf A}_{2}){\bf x}-{\bf b}\right\rVert^{2}_{2},\quad\text{s.t. }({\bf L}_{1}\otimes{\bf L}_{2}){\bf x}={\bf h},

which is considered in [4]. In this case, both the forward operator and the constraint are KP matrices, and a null space method can be applied to the matrix equations. The solution is unique under certain circumstances, which can be analyzed using the GSVD [4].

In cases where 𝐀{\bf A} is not a KP matrix, there are methods to approximate it by a Kronecker product 𝐀𝐂𝐃{\bf A}\approx{\bf C}\otimes{\bf D} so that the benefits of the KP structure can still be utilized. The nearest Kronecker product can be found by minimizing

𝐀𝐂𝐃F2.\displaystyle\left\lVert{\bf A}-{\bf C}\otimes{\bf D}\right\rVert_{F}^{2}.

Van Loan and Pitsianis [53] developed a method for solving this problem, and in [42], Pitsianis extended it to solve the problem

𝐀i=1k𝐂i𝐃iF2.\displaystyle\left\lVert{\bf A}-\sum_{i=1}^{k}{\bf C}_{i}\otimes{\bf D}_{i}\right\rVert_{F}^{2}.

These KP approximations have been applied to the forward matrix 𝐀{\bf A} in imaging problems [7, 22, 33, 34].

Main Contributions: We utilize joint decompositions that exploit the KP structure, computed at the first iteration only, to solve Eq. 5 with SB and MM where the regularization parameter is selected at each iteration as in [47]. We introduce a family of regularization operators that are column orthogonal. This family includes framelet and wavelet matrices. We show that with these matrices, we can obtain the GSVDs needed for the joint decompositions directly from the singular value decompositions (SVDs) of 𝐀1{\bf A}_{1} and 𝐀2{\bf A}_{2}, allowing us to use framelets and wavelets without computing a GSVD. Other regularization matrices, such as finite-difference operators, would require computing GSVDs to implement these same decomposition-based algorithms. This makes framelet and wavelet operators efficient for solving 1\ell_{1}-regularized problems with SB and MM. We also show that, for a fixed regularization parameter, the generalized Tikhonov problem produces the same solution whether framelets or wavelets are used.

This paper is organized as follows. In Section 2, we review two iterative methods for solving Eq. 3, split Bregman and Majorization-Minimization. In Section 3, the GSVD is presented and we discuss how it can be used to obtain a joint decomposition that can then be utilized within the iterative methods. Theoretical results are also presented for when 𝐋1{\bf L}_{1} and 𝐋2{\bf L}_{2} are column orthogonal, which allow this joint decomposition to be obtained from the SVD rather than the GSVD. In Section 4, we introduce a family of column-orthogonal regularization operators that include the framelet and wavelet regularization matrices. The iterative methods are then tested on image deblurring examples in Section 5. Although the results presented in Section 3 apply to a general 𝐀{\bf A}, the examples are presented for cases with mnm\geq n. Conclusions are presented in Section 6.

2 Iterative techniques for solution of the sparsity constrained problem

In this section, we will consider solving Eq. 3 using two iterative methods: the split Bregman (SB) method and the Majorization-Minimization (MM) method. Both methods only require computing two GSVDs, which can be reused for every iteration.

2.1 Split Bregman

The SB method was first introduced by Goldstein and Osher [24]. This method rewrites Eq. 3 as a constrained optimization problem

(8) min𝐱{12𝐀𝐱𝐛22+μ𝐝1} s.t. 𝐋𝐱=𝐝,\displaystyle\min_{{\bf x}}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|^{2}_{2}+\mu\|{\bf d}\|_{1}\right\}\quad\text{ s.t. }{\bf L}{\bf x}={\bf d},

which can then be converted to the unconstrained optimization problem

(9) min𝐱,𝐝{12𝐀𝐱𝐛22+λ22𝐋𝐱𝐝22+μ𝐝1}.\displaystyle\min_{{\bf x},{\bf d}}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|^{2}_{2}+\frac{\lambda^{2}}{2}\|{\bf L}{\bf x}-{\bf d}\|^{2}_{2}+\mu\|{\bf d}\|_{1}\right\}.

This unconstrained optimization problem can be solved through a series of minimization steps and updates, here indicated by superscripts for iteration kk, known as the SB iteration

(10) (𝐱(k+1),𝐝(k+1))\displaystyle({\bf x}^{(k+1)},{\bf d}^{(k+1)}) =argmin𝐱,𝐝{12𝐀𝐱𝐛22+λ22𝐋𝐱𝐝(k)+𝐠(k)22+μ𝐝1}\displaystyle=\operatorname*{arg\,min}_{{\bf x},{\bf d}}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|^{2}_{2}+\frac{\lambda^{2}}{2}\|{\bf L}{\bf x}-{\bf d}^{(k)}+{\bf g}^{(k)}\|^{2}_{2}+\mu\|{\bf d}\|_{1}\right\}
(11) 𝐠(k+1)\displaystyle{\bf g}^{(k+1)} =𝐠(k)+(𝐋𝐱(k+1)𝐝(k+1)).\displaystyle={\bf g}^{(k)}+({\bf L}{\bf x}^{(k+1)}-{\bf d}^{(k+1)}).

In Eq. 10, the vectors 𝐱{\bf x} and 𝐝{\bf d} can be found separately as

(12) 𝐱(k+1)\displaystyle{\bf x}^{(k+1)} =argmin𝐱{12𝐀𝐱𝐛22+λ22𝐋𝐱(𝐝(k)𝐠(k))22}\displaystyle=\operatorname*{arg\,min}_{\bf x}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|_{2}^{2}+\frac{\lambda^{2}}{2}\|{\bf L}{\bf x}-({\bf d}^{(k)}-{\bf g}^{(k)})\|_{2}^{2}\right\}
(13) 𝐝(k+1)\displaystyle{\bf d}^{(k+1)} =argmin𝐝{μ𝐝1+λ22𝐝(𝐋𝐱(k+1)+𝐠(k))22}.\displaystyle=\operatorname*{arg\,min}_{\bf d}\left\{\mu\|{\bf d}\|_{1}+\frac{\lambda^{2}}{2}\|{\bf d}-({\bf L}{\bf x}^{(k+1)}+{\bf g}^{(k)})\|_{2}^{2}\right\}.

Here, and in the update Eq. 11, 𝐠{\bf g} is the vector of Lagrange multipliers. Defining τ=μ/λ2\tau=\mu/\lambda^{2}, Eq. 13 becomes

(14) 𝐝(k+1)=argmin𝐝{τ𝐝1+12𝐝(𝐋𝐱(k+1)+𝐠(k))22}.\displaystyle{\bf d}^{(k+1)}=\operatorname*{arg\,min}_{\bf d}\left\{\tau\|{\bf d}\|_{1}+\frac{1}{2}\|{\bf d}-({\bf L}{\bf x}^{(k+1)}+{\bf g}^{(k)})\|_{2}^{2}\right\}.

Since the elements of 𝐝{\bf d} are decoupled in Eq. 14, 𝐝(k+1){\bf d}^{(k+1)} can be computed using shrinkage operators. Consequently, each element is given by

dj(k+1)=shrink((𝐋𝐱(k+1))j+gj(k),τ),\displaystyle d_{j}^{(k+1)}=\text{shrink}\left(({\bf L}{\bf x}^{(k+1)})_{j}+g^{(k)}_{j},\tau\right),

where shrink(x,τ)=sign(x)max(|x|τ,0)\text{shrink}(x,\tau)=\text{sign}(x)\cdot\text{max}(|x|-\tau,0). We will also select the parameter λ\lambda at each iteration by applying parameter selection methods to Eq. 12 as in [47]. We terminate iterations once the relative change in the solution 𝐱{\bf x}, defined as

(15) RC(𝐱(k+1))=𝐱(k+1)𝐱(k)2𝐱(k)2,\displaystyle\text{RC}({\bf x}^{(k+1)})=\frac{\left\lVert{\bf x}^{(k+1)}-{\bf x}^{(k)}\right\rVert_{2}}{\left\lVert{\bf x}^{(k)}\right\rVert_{2}},

drops below a specified tolerance toltol or after a specified maximum number of iterations KmaxK_{max} is reached. The SB algorithm is summarized in Algorithm 1. Notice that SB is related to applying the alternating direction method of multipliers (ADMM) to the augmented Lagrangian in Eq. 9 [17].
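For illustration, the shrinkage step in Eq. 14 is a purely element-wise computation. The following is a minimal NumPy sketch of the soft-thresholding operator used in the 𝐝-update of Algorithm 1; the array v stands for 𝐋𝐱^(k+1)+𝐠^(k), and the names are illustrative.

```python
import numpy as np

def shrink(v, tau):
    """Element-wise soft thresholding: sign(v) * max(|v| - tau, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# d-update of Eq. 14 for given L, x_new, g:  d_new = shrink(L @ x_new + g, tau)
```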

Algorithm 1 The SB Method for the 2\ell_{2}-1\ell_{1} Problem Eq. 3
1:𝐀,𝐛,𝐋,τ,𝐝(0)=𝐠(0)=𝟎{\bf A},{\bf b},{\bf L},\tau,{\bf d}^{(0)}={\bf g}^{(0)}={{\bf 0}}, toltol, KmaxK_{max}
2:𝐱{\bf x}
3:for k=0,1,k=0,1,\dots until RC(𝐱(k+1))<tol\text{RC}({\bf x}^{(k+1)})<tol or k=Kmaxk=K_{max} do
4:     Estimate λ(k)\lambda^{(k)}
5:     𝐱(k+1)=argmin𝐱{12𝐀𝐱𝐛22+(λ(k))22𝐋𝐱(𝐝(k)𝐠(k))22}{\bf x}^{(k+1)}=\operatorname*{arg\,min}_{\bf x}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|_{2}^{2}+\frac{\left(\lambda^{(k)}\right)^{2}}{2}\|{\bf L}{\bf x}-({\bf d}^{(k)}-{\bf g}^{(k)})\|_{2}^{2}\right\}
6:     𝐝(k+1)=argmin𝐝{τ𝐝1+12𝐝(𝐋𝐱(k+1)+𝐠(k))22}{\bf d}^{(k+1)}=\operatorname*{arg\,min}_{\bf d}\left\{\tau\|{\bf d}\|_{1}+\frac{1}{2}\|{\bf d}-({\bf L}{\bf x}^{(k+1)}+{\bf g}^{(k)})\|_{2}^{2}\right\}
7:     𝐠(k+1)=𝐠(k)+(𝐋𝐱(k+1)𝐝(k+1)){\bf g}^{(k+1)}={\bf g}^{(k)}+\left({\bf L}{\bf x}^{(k+1)}-{\bf d}^{(k+1)}\right)
8:end for

2.2 Majorization-Minimization

The MM method is another iterative method for solving Eq. 3. It is an optimization method that utilizes two steps: majorization and minimization [32]. First, in the majorization step, the function is majorized with a surrogate convex function. The convexity of this function is then utilized in the minimization step. When applied to Eq. 3, MM is often combined with the generalized Krylov subspace (GKS) method [9, 31, 41] to solve large-scale problems. In this method, known as MM-GKS, the Krylov subspace is enlarged at each iteration, and MM is then applied to the problem in the subspace. In this paper, we will apply the MM method directly to Eq. 3 instead of building a Krylov subspace. For the majorant, we will use the fixed quadratic majorant from [31]. There are other ways for majorizing Eq. 3 as well, including both fixed and adaptive quadratic majorants [1, 9, 41].

The fixed majorant in [31] depends on a parameter ε>0\varepsilon>0. With this fixed majorant, the minimization problem at the kthk^{\text{th}} iteration is

(16) min𝐱{12𝐀𝐱𝐛22+λ22𝐋𝐱𝐰reg(k)22},\displaystyle\min_{{\bf x}}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|^{2}_{2}+\frac{\lambda^{2}}{2}\|{\bf L}{\bf x}-{\bf w}_{reg}^{(k)}\|^{2}_{2}\right\},

where λ=μ/ε\lambda=\sqrt{\mu/\varepsilon} and

(17) 𝐰reg(k)=𝐮(k)(1(ε2(𝐮(k))2+ε2)12)\displaystyle{\bf w}_{reg}^{(k)}={\bf u}^{(k)}\left(1-\left(\frac{\varepsilon^{2}}{({\bf u}^{(k)})^{2}+\varepsilon^{2}}\right)^{\frac{1}{2}}\right)

with 𝐮(k)=𝐋𝐱(k){\bf u}^{(k)}={\bf L}{\bf x}^{(k)}. All operations in Eq. 17 are component-wise. As in [47], we will consider selecting the parameter λ\lambda at each iteration using the minimization in Eq. 16. The iterations are again terminated when RC(𝐱(k+1))\text{RC}({\bf x}^{(k+1)}) drops below a specified tolerance toltol or after KmaxK_{max} iterations are reached. The MM algorithm is summarized in Algorithm 2.

Algorithm 2 The MM Method for the 2\ell_{2}-1\ell_{1} Problem Eq. 3 with a Fixed Quadratic Majorant
1:𝐀,𝐛,𝐋,𝐱(0),ε{\bf A},{\bf b},{\bf L},{\bf x}^{(0)},\varepsilon, toltol, KmaxK_{max}
2:𝐱{\bf x}
3:for k=0,1,k=0,1,\dots until RC(𝐱(k+1))<tol\text{RC}({\bf x}^{(k+1)})<tol or k=Kmaxk=K_{max} do
4:     𝐮(k)=𝐋𝐱(k){\bf u}^{(k)}={\bf L}{\bf x}^{(k)}
5:     𝐰reg(k)=𝐮(k)(1(ε2(𝐮(k))2+ε2)12){\bf w}_{reg}^{(k)}={\bf u}^{(k)}\left(1-\left(\frac{\varepsilon^{2}}{\left({\bf u}^{(k)}\right)^{2}+\varepsilon^{2}}\right)^{\frac{1}{2}}\right)
6:     Estimate λ(k)\lambda^{(k)}
7:     𝐱(k+1)=argmin𝐱{12𝐀𝐱𝐛22+(λ(k))22𝐋𝐱𝐰reg(k)22}{\bf x}^{(k+1)}=\operatorname*{arg\,min}_{{\bf x}}\left\{\frac{1}{2}\|{\bf A}{\bf x}-{\bf b}\|^{2}_{2}+\frac{\left(\lambda^{(k)}\right)^{2}}{2}\|{\bf L}{\bf x}-{\bf w}_{reg}^{(k)}\|^{2}_{2}\right\}
8:end for
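As a small illustration, the majorant weight in Eq. 17 is also a purely element-wise computation. Below is a minimal NumPy sketch of one MM weight update; Lx stands for 𝐮^(k)=𝐋𝐱^(k) and eps for ε, and the names are illustrative.

```python
import numpy as np

def mm_weights(Lx, eps):
    """Element-wise majorant weights w_reg of Eq. 17 for the fixed quadratic majorant."""
    return Lx * (1.0 - np.sqrt(eps**2 / (Lx**2 + eps**2)))
```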

2.3 Parameter Selection Methods

In both iterative methods, we will solve a minimization problem of the form

(18) min𝐱{12𝐀𝐱𝐛22+(λ(k))22𝐋𝐱𝐡(k)22},\displaystyle\min_{\bf x}\left\{\frac{1}{2}\left\lVert{\bf A}{\bf x}-{\bf b}\right\rVert_{2}^{2}+\frac{\left(\lambda^{(k)}\right)^{2}}{2}\left\lVert{\bf L}{\bf x}-{\bf h}^{(k)}\right\rVert_{2}^{2}\right\},

at each iteration. To select λ(k)\lambda^{(k)}, we will use both generalized cross validation (GCV) and the χ2\chi^{2} degrees of freedom (dof) test as discussed in the context of application to SB and MM algorithms in [47]. GCV [2, 25] selects λ\lambda to minimize the average predictive risk when an entry of 𝐛{\bf b} is left out. This method does not require any information about the noise. We will follow [47] in which the GCV is extended to this inner minimization problem.

The χ2\chi^{2} dof test [37, 38, 43] is a statistical test that treats 𝐱{\bf x} as a random variable with mean 𝐱0{\bf x}_{0} and utilizes information about the noise to select λ\lambda. Under the assumption that 𝐞{\bf e} is normally distributed with mean 𝟎{\bf{0}} and covariance matrix 𝐈{\bf I}, denoted 𝐞𝒩(𝟎,𝐈){\bf e}\sim\mathcal{N}({\bf{0}},{\bf I}), and 𝐋(𝐱𝐱0)𝒩(𝟎,λ2𝐈){\bf L}({\bf x}-{\bf x}_{0})\sim\mathcal{N}({\bf{0}},\lambda^{-2}{\bf I}), the functional

J(𝐱)=𝐀𝐱𝐛22+λ2𝐋(𝐱𝐱0)22\displaystyle J({\bf x})=\|{\bf A}{\bf x}-{\bf b}\|_{2}^{2}+\lambda^{2}\left\lVert{\bf L}({\bf x}-{\bf x}_{0})\right\rVert_{2}^{2}

follows a χ2\chi^{2} distribution with rank(𝐋)+max(mn,0)\text{rank}({\bf L})+\max{(m-n,0)} dof [47]. The χ2\chi^{2} dof test then selects λ\lambda so that JJ most resembles this χ2\chi^{2} distribution. As an approximation of 𝐱0{\bf x}_{0}, we will use 𝐱0=𝐋𝐀𝐡(k){\bf x}_{0}={\bf L}_{\bf A}^{\dagger}{\bf h}^{(k)} as in [47]. In this case, the determination of λ\lambda is implemented efficiently as a one-dimensional root-finding algorithm. On the other hand, if 𝐱0{\bf x}_{0} is not the mean of 𝐱{\bf x}, then JJ instead follows a non-central χ2\chi^{2} distribution [43] and a non-central χ2\chi^{2} test can be applied in a similar, but more difficult manner to select λ\lambda.
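To make the χ² dof test concrete, the sketch below selects λ by one-dimensional root finding so that the functional J, evaluated at the minimizer of Eq. 18, matches the expected number of degrees of freedom, rank(𝐋)+max(m−n,0). This is only a schematic dense-algebra version: it assumes 𝐱0 is supplied (in [47] it is taken as 𝐋_𝐀^†𝐡^(k)), it assumes the bracket [lam_lo, lam_hi] contains a sign change, and it does not reproduce the efficient decomposition-based evaluation used in the paper.

```python
import numpy as np
from scipy.optimize import brentq

def chi2_select_lambda(A, L, b, h, x0, lam_lo=1e-8, lam_hi=1e4):
    """Select lambda so that J(x(lambda)) matches its expected chi^2 dof.

    x(lambda) minimizes 0.5*||A x - b||^2 + 0.5*lambda^2*||L x - h||^2 (Eq. 18),
    and J(x) = ||A x - b||^2 + lambda^2*||L (x - x0)||^2.
    """
    m, n = A.shape
    dof = np.linalg.matrix_rank(L) + max(m - n, 0)

    def mismatch(lam):
        # Dense normal-equations solve of Eq. 18 (illustration only; the paper
        # evaluates this through the joint decomposition instead).
        lhs = A.T @ A + lam**2 * (L.T @ L)
        rhs = A.T @ b + lam**2 * (L.T @ h)
        x = np.linalg.solve(lhs, rhs)
        J = np.linalg.norm(A @ x - b)**2 + lam**2 * np.linalg.norm(L @ (x - x0))**2
        return J - dof

    # One-dimensional root find; a bracket with a sign change is assumed.
    return brentq(mismatch, lam_lo, lam_hi)
```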

3 Solving Generalized Tikhonov with Kronecker Product Matrices

Under the assumption that both 𝐀{\bf A} and 𝐋{\bf L} are KP matrices, we can solve the minimization problem Eq. 18 occurring for both the SB and MM methods by using a joint decomposition resulting from the individual GSVDs of {𝐀1,𝐋1}\{{\bf A}_{1},{\bf L}_{1}\} and {𝐀2,𝐋2}\{{\bf A}_{2},{\bf L}_{2}\}. The GSVD is a joint decomposition of a matrix pair {𝐀,𝐋}\{{\bf A},{\bf L}\} introduced by Van Loan [51]. It has been formulated in different ways [30, 39], but for this paper, we use the definition of the GSVD given in [30].

Lemma 3.1.

For 𝐀m×n{\bf A}\in\mathbb{R}^{m\times n} and 𝐋p×n{\bf L}\in\mathbb{R}^{p\times n}, let 𝐊=[𝐀;𝐋]{\bf K}=\left[{\bf A};{\bf L}\right]. Let t=rank(𝐊)t=\text{rank}({\bf K}), r=rank(𝐊)rank(𝐋)r=\text{rank}({\bf K})-\text{rank}({\bf L}), and s=rank(𝐀)+rank(𝐋)rank(𝐊)s=\text{rank}({\bf A})+\text{rank}({\bf L})-\text{rank}({\bf K}). Then, there exist orthogonal matrices 𝐔m×m{\bf U}\in\mathbb{R}^{m\times m}, 𝐕p×p{\bf V}\in\mathbb{R}^{p\times p}, 𝐖t×t{\bf W}\in\mathbb{R}^{t\times t}, and 𝐐n×n{\bf Q}\in\mathbb{R}^{n\times n}, and non-singular matrix 𝐑t×t{\bf R}\in\mathbb{R}^{t\times t} such that

(19) 𝐔𝐀𝐐=𝚼~[𝐖𝐑,𝟎t×(nt)]𝐕𝐋𝐐=𝐌~[𝐖𝐑,𝟎t×(nt)],{\bf U}^{\top}{\bf A}{\bf Q}=\tilde{\bm{\Upsilon}}\left[{\bf W}^{\top}{\bf R},{\bf 0}_{t\times(n-t)}\right]\quad{\bf V}^{\top}{\bf L}{\bf Q}=\tilde{\bf M}\left[{\bf W}^{\top}{\bf R},{\bf 0}_{t\times(n-t)}\right],

where 𝚼~m×t\tilde{\bm{\Upsilon}}\in\mathbb{R}^{m\times t} and 𝐌~p×t\tilde{\bf M}\in\mathbb{R}^{p\times t} with

(20a) 𝚼~\displaystyle\tilde{\bm{\Upsilon}} =[𝐈𝐀𝚼𝟎𝐀],𝐌~=[𝟎𝐋𝐌𝐈𝐋]\displaystyle=\begin{bmatrix}{\bf I}_{\bf A}&&\\ &{\bm{\Upsilon}}&\\ &&{\bf 0}_{\bf A}\end{bmatrix},\quad\tilde{\bf M}=\begin{bmatrix}{\bf 0}_{\bf L}&&\\ &{\bf M}&\\ &&{\bf I}_{\bf L}\end{bmatrix}
(20b) 𝚼\displaystyle{\bm{\Upsilon}} =diag(υr+1,,υr+s) with 1>υr+1υr+2υr+s>0\displaystyle=\mathrm{diag}(\upsilon_{r+1},\dots,\upsilon_{r+s})\text{ with }1>\upsilon_{r+1}\geq\upsilon_{r+2}\geq\dots\geq\upsilon_{r+s}>0
(20c) 𝐌\displaystyle{\bf M} =diag(μr+1,,μr+s) with 0<μr+1μr+2μr+s<1, and\displaystyle=\mathrm{diag}(\mu_{r+1},\dots,\mu_{r+s})\text{ with }0<\mu_{r+1}\leq\mu_{r+2}\leq\dots\leq\mu_{r+s}<1,\text{ and }
(20d) 1\displaystyle 1 =υi2+μi2,i=r+1:r+s,\displaystyle=\upsilon_{i}^{2}+\mu_{i}^{2},i=r+1:r+s,

defining υi=1,μi=0 for i<r+1, and υi=0,μi=1, for i>r+s.\text{defining }\upsilon_{i}=1,\;\mu_{i}=0\text{ for }i<r+1,\text{ and }\upsilon_{i}=0,\;\mu_{i}=1,\text{ for }i>r+s. Here the identity matrices are 𝐈𝐀r×r and 𝐈𝐋(trs)×(trs){\bf I}_{\bf A}\in\mathbb{R}^{r\times r}\text{ and }{\bf I}_{\bf L}\in\mathbb{R}^{(t-r-s)\times(t-r-s)} and the zero matrices are 𝟎𝐀(mrs)×(trs) and 𝟎𝐋(pt+r)×r.{\bf 0}_{\bf A}\in\mathbb{R}^{(m-r-s)\times(t-r-s)}\text{ and }{\bf 0}_{\bf L}\in\mathbb{R}^{(p-t+r)\times r}. The singular values of 𝐑{\bf R} are the nonzero singular values of 𝐊{\bf K}. Defining

𝐙=𝐐[𝐑𝐖𝟎(nt)×t],\displaystyle{\bf Z}={\bf Q}\begin{bmatrix}{\bf R}^{\top}{\bf W}\\ {\bf{0}}_{(n-t)\times t}\end{bmatrix},

then 𝐀=𝐔𝚼~𝐙{\bf A}={\bf U}\tilde{\bm{\Upsilon}}{\bf Z}^{\top} and 𝐋=𝐕𝐌~𝐙{\bf L}={\bf V}\tilde{\bf M}{\bf Z}^{\top}, where 𝐙n×t{\bf Z}\in\mathbb{R}^{n\times t} has full column rank. If t=nt=n, then 𝐙{\bf Z} is invertible and we have that 𝐀=𝐔𝚼~𝐘1{\bf A}={\bf U}\tilde{\bm{\Upsilon}}{\bf Y}^{-1} and 𝐋=𝐕𝐌~𝐘1{\bf L}={\bf V}\tilde{\bf M}{\bf Y}^{-1}, where 𝐘1=𝐙{\bf Y}^{-1}={\bf Z}^{\top}.

The ratios γi=υi/μi\gamma_{i}=\upsilon_{i}/\mu_{i} in the GSVD are known as the generalized singular values (GSVs) [27] and are sorted in decreasing order in Lemma 3.1. If μi=0\mu_{i}=0 and υi=1\upsilon_{i}=1, then the corresponding GSV γi\gamma_{i} is infinite. The columns of 𝐔{\bf U} that correspond to υi=0\upsilon_{i}=0 span Null(𝐀)\mbox{Null}({\bf A}) while the columns of 𝐕{\bf V} that correspond to μi=0\mu_{i}=0 span Null(𝐋)\mbox{Null}({\bf L}).

From the GSVDs of the factors {𝐀1,𝐋1}\{{\bf A}_{1},{\bf L}_{1}\} and {𝐀2,𝐋2}\{{\bf A}_{2},{\bf L}_{2}\},

(21) [𝐀1𝐋1]=[𝐔1𝚼~1𝐙1𝐕1𝐌~1𝐙1]and[𝐀2𝐋2]=[𝐔2𝚼~2𝐙2𝐕2𝐌~2𝐙2],\begin{bmatrix}{\bf A}_{1}\\ {\bf L}_{1}\end{bmatrix}=\begin{bmatrix}{\bf U}_{1}\tilde{\bm{\Upsilon}}_{1}{\bf Z}_{1}^{\top}\bf\\ {\bf V}_{1}\tilde{\bf M}_{1}{\bf Z}_{1}^{\top}\end{bmatrix}\quad\rm and\quad\begin{bmatrix}{\bf A}_{2}\\ {\bf L}_{2}\end{bmatrix}=\begin{bmatrix}{\bf U}_{2}\tilde{\bm{\Upsilon}}_{2}{\bf Z}_{2}^{\top}\bf\\ {\bf V}_{2}\tilde{\bf M}_{2}{\bf Z}_{2}^{\top}\end{bmatrix},

we can construct a joint decomposition of 𝐀{\bf A} and 𝐋{\bf L} using the properties of the Kronecker product [52]. This joint decomposition is given by

(22) 𝐊=[𝐀1𝐀2𝐋1𝐋2]=[𝐔1𝐔2𝟎𝟎𝐕1𝐕2][𝚼~1𝚼~2𝐌~1𝐌~2][𝐙1𝐙2].{\bf K}=\begin{bmatrix}{\bf A}_{1}\otimes{\bf A}_{2}\\ {\bf L}_{1}\otimes{\bf L}_{2}\end{bmatrix}=\begin{bmatrix}{\bf U}_{1}\otimes{\bf U}_{2}&{\bf 0}\\ {\bf 0}&{\bf V}_{1}\otimes{\bf V}_{2}\end{bmatrix}\begin{bmatrix}\tilde{\bm{\Upsilon}}_{1}\otimes\tilde{\bm{\Upsilon}}_{2}\\ \tilde{\bf M}_{1}\otimes\tilde{\bf M}_{2}\end{bmatrix}\begin{bmatrix}{\bf Z}_{1}^{\top}\otimes{\bf Z}_{2}^{\top}\end{bmatrix}.

Let us assume, without loss of generality, that 𝐱{\bf x} is square, meaning that array(𝐱)=𝐗n×n\text{array}({\bf x})={\bf X}\in\mathbb{R}^{n\times n}, where array(𝐱)\text{array}({\bf x}) is an operator that reshapes the vector 𝐱n2×1{\bf x}\in\mathbb{R}^{n^{2}\times 1} into an n×nn\times n matrix. We will also assume that 𝐀1,𝐀2,𝐋1,𝐋2{\bf A}_{1},{\bf A}_{2},{\bf L}_{1},{\bf L}_{2} have at least as many rows as columns. With assumption Eq. 2, t=nt=n in Lemma 3.1 and the solution to Eq. 18 is then given as

(23) 𝐱=(𝐀𝐀+λ2𝐋𝐋)1(𝐀𝐛+λ2𝐋𝐡(k)).\displaystyle{\bf x}=({\bf A}^{\top}{\bf A}+\lambda^{2}{\bf L}^{\top}{\bf L})^{-1}({\bf A}^{\top}{\bf b}+\lambda^{2}{\bf L}^{\top}{\bf h}^{(k)}).

Since t=nt=n, 𝐙1{\bf Z}_{1}, 𝐙2{\bf Z}_{2}, and 𝐙=𝐙1𝐙2{\bf Z}={\bf Z}_{1}\otimes{\bf Z}_{2} are invertible, and we may set 𝐙1=𝐘11{\bf Z}_{1}^{\top}={\bf Y}_{1}^{-1} and 𝐙2=𝐘21{\bf Z}_{2}^{\top}={\bf Y}_{2}^{-1}. Let 𝐔~1\tilde{{\bf U}}_{1} and 𝐔~2\tilde{{\bf U}}_{2} be the first nn columns of 𝐔1{\bf U}_{1} and 𝐔2{\bf U}_{2}, respectively, and let 𝐕~1\tilde{{\bf V}}_{1} and 𝐕~2\tilde{{\bf V}}_{2} be the last nn columns of 𝐕1{\bf V}_{1} and 𝐕2{\bf V}_{2}, respectively. Define also 𝐁~=𝐔~2𝐁𝐔~1\tilde{{\bf B}}=\tilde{{\bf U}}_{2}^{\top}{\bf B}\tilde{{\bf U}}_{1} and 𝐇~(k)=𝐕~2𝐇(k)𝐕~1\tilde{{\bf H}}^{(k)}=\tilde{{\bf V}}_{2}^{\top}{\bf H}^{(k)}\tilde{{\bf V}}_{1}, where 𝐁=array(𝐛){\bf B}=\text{array}({\bf b}) and 𝐇(k)=array(𝐡(k)){\bf H}^{(k)}=\text{array}({\bf h}^{(k)}). Then, using Eq. 22 and

(24) (𝐂𝐃)vec(𝐅)=vec(𝐃𝐅𝐂),\displaystyle({\bf C}\otimes{\bf D})\text{vec}({\bf F})=\text{vec}({\bf D}{\bf F}{\bf C}^{\top}),

where vec(𝐅)\text{vec}({\bf F}) is the vectorization of the matrix 𝐅{\bf F}, the solution in Eq. 23 can be written as

(25) \displaystyle{\bf x}=\text{vec}\left({\bf Y}_{2}\left[\tilde{\bm{\Phi}}\odot\tilde{{\bf B}}+\tilde{\bm{\Psi}}\odot\tilde{{\bf H}}^{(k)}\right]{\bf Y}_{1}^{\top}\right),

where \odot denotes element-wise multiplication and 𝚽~\tilde{\bm{\Phi}} and 𝚿~\tilde{\bm{\Psi}} are given by

𝚽~\displaystyle\tilde{\bm{\Phi}} =array([υ1υ12+λ2μ12,,υn2υn22+λ2μn22])\displaystyle=\text{array}\left(\left[\frac{\upsilon_{1}}{\upsilon_{1}^{2}+\lambda^{2}\mu_{1}^{2}},\dots,\frac{\upsilon_{n^{2}}}{\upsilon_{n^{2}}^{2}+\lambda^{2}\mu_{n^{2}}^{2}}\right]^{\top}\right)
𝚿~\displaystyle\tilde{\bm{\Psi}} =array([λ2μ1υ12+λ2μ12,,λ2μn2υn22+λ2μn22]).\displaystyle=\text{array}\left(\left[\frac{\lambda^{2}\mu_{1}}{\upsilon_{1}^{2}+\lambda^{2}\mu_{1}^{2}},\dots,\frac{\lambda^{2}\mu_{n^{2}}}{\upsilon_{n^{2}}^{2}+\lambda^{2}\mu_{n^{2}}^{2}}\right]^{\top}\right).

Here, υi\upsilon_{i} and μi\mu_{i} refer to the (potentially) non-zero entry in column ii of 𝚼~1𝚼~2\tilde{\bm{\Upsilon}}_{1}\otimes\tilde{\bm{\Upsilon}}_{2} or 𝐌~1𝐌~2\tilde{\bf M}_{1}\otimes\tilde{\bf M}_{2}.

In both SB and MM, 𝐋𝐱{\bf L}{\bf x} must also be computed at each iteration. This computation can be written using Eq. 24 as

(26) 𝐋𝐱=vec(𝐕~2[𝐌^(𝐘21𝐗𝐘1)]𝐕~1),\displaystyle{\bf L}{\bf x}=\text{vec}\left(\tilde{{\bf V}}_{2}[\hat{{\bf M}}\odot({\bf Y}_{2}^{-1}{\bf X}{\bf Y}_{1}^{-\top})]\tilde{{\bf V}}_{1}^{\top}\right),

where 𝐌^=array([μ1,,μn2])\hat{{\bf M}}=\text{array}(\left[\mu_{1},\dots,\mu_{n^{2}}\right]^{\top}).

In the χ2\chi^{2} dof test, we also compute 𝐋𝐀𝐡(k){\bf L}^{\dagger}_{\bf A}{\bf h}^{(k)}. Defining 𝐦¯n2\overline{{\bf m}}\in\mathbb{R}^{n^{2}} by

m¯i={1/μi,μi>00,μi=0,\displaystyle\overline{m}_{i}=\begin{cases}1/\mu_{i},&\mu_{i}>0\\ 0,&\mu_{i}=0\end{cases},

and 𝐌¯=array(𝐦¯)\overline{{\bf M}}=\text{array}(\overline{{\bf m}}), we have that

(27) {\bf L}^{\dagger}_{\bf A}{\bf h}^{(k)}=\text{vec}\left({\bf Y}_{2}\left[\overline{{\bf M}}\odot\tilde{{\bf H}}^{(k)}\right]{\bf Y}_{1}^{\top}\right).

Notice that Eq. 25, Eq. 26, and Eq. 27 allow each step in the iterative methods to be computed using only the individual GSVDs in Eq. 21.
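The following NumPy sketch illustrates how Eq. 25 and Eq. 26 can be evaluated once the factor GSVDs in Eq. 21 are available. It assumes t = n (so 𝐙_i^⊤ = 𝐘_i^{-1}), uses column-major (Fortran-order) reshaping to match the vec/array convention of Eq. 24, and takes the per-factor diagonals of 𝚼_i and 𝐌_i as inputs; all function and variable names are illustrative, not the authors' implementation.

```python
import numpy as np

def gsvd_kron_step(Y1, Y2, U1t, U2t, V1t, V2t, u1, u2, m1, m2, b, h, lam):
    """Evaluate Eq. 25 (the x-update) and Eq. 26 (L x) from the factor GSVDs.

    Y1, Y2   : the n x n matrices Y_i (= Z_i^{-T}, assuming t = n)
    U1t, U2t : first n columns of U_1, U_2;  V1t, V2t : last n columns of V_1, V_2
    u1, u2   : length-n diagonals of Upsilon_1, Upsilon_2 (including any 1s and 0s)
    m1, m2   : length-n diagonals of M_1, M_2
    b, h     : stacked data and shift vectors; lam : regularization parameter
    """
    m1r, m2r = U1t.shape[0], U2t.shape[0]
    p1, p2 = V1t.shape[0], V2t.shape[0]

    # array(.) is the inverse of the column-stacking vec(.) used in Eq. 24.
    B = b.reshape(m2r, m1r, order="F")
    H = h.reshape(p2, p1, order="F")
    Bt = U2t.T @ B @ U1t                      # B-tilde
    Ht = V2t.T @ H @ V1t                      # H-tilde

    # array(kron(u1, u2)) = outer(u2, u1): reshaped diagonals of Upsilon_1 (x) Upsilon_2.
    Ups = np.outer(u2, u1)
    Mus = np.outer(m2, m1)
    denom = Ups**2 + lam**2 * Mus**2          # positive because of Eq. 2
    Phi = Ups / denom
    Psi = lam**2 * Mus / denom

    # Eq. 25: x = vec( Y2 [Phi .* Bt + Psi .* Ht] Y1^T ).
    X = Y2 @ (Phi * Bt + Psi * Ht) @ Y1.T
    x = X.flatten(order="F")

    # Eq. 26: Lx = vec( V2t [M-hat .* (Y2^{-1} X Y1^{-T})] V1t^T ).
    inner = np.linalg.solve(Y2, X)            # Y2^{-1} X
    inner = np.linalg.solve(Y1, inner.T).T    # (Y2^{-1} X) Y1^{-T}
    Lx = (V2t @ (Mus * inner) @ V1t.T).flatten(order="F")

    return x, Lx
```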

3.1 Theoretical Properties of the GSVD of Column-Orthogonal Matrices

Here, we examine the properties of the GSVD of a matrix pair {𝐀,𝐋}\{{\bf A},{\bf L}\} where one is a column-orthogonal matrix. We refer to a matrix 𝐂t×n{\bf C}\in\mathbb{R}^{t\times n}, tnt\geq n, as column orthogonal if 𝐂𝐂=𝐈n{\bf C}^{\top}{\bf C}={\bf I}_{n}. First, we recall the singular value decomposition (SVD) of 𝐀{\bf A}. The SVD is a decomposition of the form 𝐀=𝐔𝐀𝚺𝐕𝐀{\bf A}={\bf U}_{\bf A}\bm{\Sigma}{\bf V}_{\bf A}^{\top}, where 𝐔𝐀m×m{\bf U}_{\bf A}\in\mathbb{R}^{m\times m} and 𝐕𝐀n×n{\bf V}_{\bf A}\in\mathbb{R}^{n\times n} are orthogonal and 𝚺m×n\bm{\Sigma}\in\mathbb{R}^{m\times n} is diagonal with entries σ1,,σm~\sigma_{1},\dots,\sigma_{\tilde{m}} and m~=min{m,n}\tilde{m}=\min\{m,n\}. These diagonal entries are known as the singular values (SVs) of 𝐀{\bf A} and are sorted in decreasing order: σ1σ2σm~0\sigma_{1}\geq\sigma_{2}\geq\dots\geq\sigma_{\tilde{m}}\geq 0. The rank of 𝐀{\bf A} is equal to the number of non-zero SVs. It is well known that when 𝐋=𝐈{\bf L}={\bf I}, the GSVD of {𝐀,𝐋}\{{\bf A},{\bf L}\} reduces to the SVD of 𝐀{\bf A} with γi=σi\gamma_{i}=\sigma_{i}, where the γi\gamma_{i} are sorted in decreasing order [45, 50]. If 𝐋{\bf L} has full column rank, the GSVs are the SVs of 𝐀𝐋{\bf A}{\bf L}^{\dagger} [14, 15, 36, 56]. If, however, 𝐋{\bf L} is column orthogonal, we can say more about the GSVD of {𝐀,𝐋}\{{\bf A},{\bf L}\} as described by the following lemma.

Lemma 3.2.

Let 𝐋{\bf L} be column orthogonal and k=rank(𝐀)k=\text{rank}({\bf A}). Then, in the GSVD of {𝐀,𝐋}\{{\bf A},{\bf L}\}, γi=σi\gamma_{i}=\sigma_{i} for i=1,,m~i=1,\dots,\tilde{m} and γi=0\gamma_{i}=0 for i=m~+1,,ni=\tilde{m}+1,\dots,n. Further, 𝐘=𝐕𝐀𝐆{\bf Y}={\bf V}_{\bf A}{\bf G}, where 𝐆=diag(1/σ12+1,,1/σk2+1,𝟏nk){\bf G}=\text{diag}(1/\sqrt{\sigma_{1}^{2}+1},\dots,1/\sqrt{\sigma_{k}^{2}+1},{\bf 1}_{n-k}). The γi\gamma_{i} are sorted in decreasing order as in Lemma 3.1.

Proof 3.3.

Let kk be the rank of 𝐀{\bf A} and let the SVD of 𝐀{\bf A} be given by 𝐀=𝐔𝐀𝚺𝐕𝐀{\bf A}={\bf U}_{\bf A}\bm{\Sigma}{\bf V}_{\bf A}^{\top}. Since 𝐕𝐀{\bf V}_{\bf A} is orthogonal, we can write that 𝐋=𝐋𝐕𝐀𝐈𝐕𝐀{\bf L}={\bf L}{\bf V}_{\bf A}{\bf I}{\bf V}_{\bf A}^{\top}. This provides a joint decomposition of the form

(28) [𝐀𝐋]=[𝐔𝐀𝟎𝟎𝐋𝐕𝐀][𝚺𝐈]𝐕𝐀.\displaystyle\begin{bmatrix}{\bf A}\\ {\bf L}\end{bmatrix}=\begin{bmatrix}{\bf U}_{\bf A}&{\bf{0}}\\ {\bf{0}}&{\bf L}{\bf V}_{\bf A}\end{bmatrix}\begin{bmatrix}\bm{\Sigma}\\ {\bf I}\end{bmatrix}{\bf V}_{\bf A}^{\top}.

Since 𝐆{\bf G} is invertible, we have that

(29) [𝚺𝐈]𝐕𝐀=[𝚺𝐈](𝐆𝐆1)𝐕𝐀=([𝚺𝐈]𝐆)(𝐆1𝐕𝐀)=[𝚼𝐌]𝐘1,\displaystyle\begin{bmatrix}\bm{\Sigma}\\ {\bf I}\end{bmatrix}{\bf V}_{\bf A}^{\top}=\begin{bmatrix}\bm{\Sigma}\\ {\bf I}\end{bmatrix}\left({\bf G}{\bf G}^{-1}\right){\bf V}_{\bf A}^{\top}=\left(\begin{bmatrix}\bm{\Sigma}\\ {\bf I}\end{bmatrix}{\bf G}\right)\left({\bf G}^{-1}{\bf V}_{\bf A}^{\top}\right)=\begin{bmatrix}\bm{\Upsilon}\\ {\bf M}\end{bmatrix}{\bf Y}^{-1},

where 𝚼=𝚺𝐆{\bm{\Upsilon}}=\bm{\Sigma}{\bf G}, 𝐌=𝐆{\bf M}={\bf G}, and 𝐘=𝐕𝐀𝐆{\bf Y}={\bf V}_{\bf A}{\bf G}. With this,

𝚼=diag(σ1σ12+1,,σkσk2+1,𝟎m~k)and 𝐌=diag(1σ12+1,,1σk2+1,𝟏nk).\displaystyle{\bm{\Upsilon}}=\text{diag}\left(\frac{\sigma_{1}}{\sqrt{\sigma_{1}^{2}+1}},\dots,\frac{\sigma_{k}}{\sqrt{\sigma_{k}^{2}+1}},{\bf{0}}_{\tilde{m}-k}\right)\text{and }{\bf M}=\text{diag}\left(\frac{1}{\sqrt{\sigma_{1}^{2}+1}},\dots,\frac{1}{\sqrt{\sigma_{k}^{2}+1}},{\bf{1}}_{n-k}\right).

Hence, Eq. 20d is satisfied. 𝐔𝐀{\bf U}_{\bf A} is orthogonal from the SVD and 𝐋𝐕𝐀{\bf L}{\bf V}_{\bf A} is column orthogonal since 𝐋{\bf L} is column orthogonal and 𝐕𝐀{\bf V}_{\bf A} is orthogonal. 𝐋𝐕𝐀{\bf L}{\bf V}_{\bf A} is not necessarily orthogonal since 𝐋{\bf L} may have more rows than columns. In this case, we add columns to the left of 𝐋𝐕𝐀{\bf L}{\bf V}_{\bf A} to make it orthogonal. To maintain the decomposition, we also add rows of zeros to the top of 𝐌{\bf M} that correspond to these additional columns in 𝐋𝐕𝐀{\bf L}{\bf V}_{\bf A}. Following this, the decomposition is then a GSVD.

Notice that the GSVs for i=1,,ki=1,\dots,k are

γi=υiμi=σi/σi2+11/σi2+1=σi,\displaystyle\gamma_{i}=\frac{\upsilon_{i}}{\mu_{i}}=\frac{\sigma_{i}/\sqrt{\sigma_{i}^{2}+1}}{1/\sqrt{\sigma_{i}^{2}+1}}=\sigma_{i},

and for i=k+1,,m~i=k+1,\dots,\tilde{m} are γi=0/1=0=σi\gamma_{i}=0/1=0=\sigma_{i}. Thus, if mnm\geq n, the GSVs are the SVs of 𝐀{\bf A}. If m<nm<n, then the GSVs for i=m+1,,ni=m+1,\dots,n come from the columns of zeros in 𝚼{\bm{\Upsilon}} and are γi=0/1=0\gamma_{i}=0/1=0. From Eq. 29, we can also see that 𝐘{\bf Y} is 𝐕𝐀{\bf V}_{\bf A} from the SVD of 𝐀{\bf A} with scaled columns. In particular, the scaling implies that for i=1,,ki=1,\dots,k, μi=yi2\mu_{i}=\left\lVert y_{i}\right\rVert_{2} and υi=σiyi2\upsilon_{i}=\sigma_{i}\left\lVert y_{i}\right\rVert_{2}. This also shows that 𝐕{\bf V} is related to 𝐕𝐀{\bf V}_{\bf A} and, by extension, to 𝐘{\bf Y}, since 𝐕=𝐋𝐕𝐀=𝐋𝐘𝐆1{\bf V}={\bf L}{\bf V}_{\bf A}={\bf L}{\bf Y}{\bf G}^{-1}.
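The constructive proof above is easy to check numerically. The sketch below builds a random 𝐀 and a column-orthogonal 𝐋 (via a QR factorization), forms 𝚼, 𝐌, and 𝐘 from the SVD of 𝐀 as in Eq. 29, and verifies that the factors reproduce 𝐀 and 𝐋 and that the generalized singular values equal the singular values of 𝐀. Full column rank of 𝐀 is assumed so that no zero singular values need special handling; the setup is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 30, 40, 20
A = rng.standard_normal((m, n))                    # full column rank (with probability 1)
L, _ = np.linalg.qr(rng.standard_normal((p, n)))   # column orthogonal: L^T L = I_n

# Economy SVD of A and the scaling matrix G of Lemma 3.2.
UA, s, VAt = np.linalg.svd(A, full_matrices=False)
VA = VAt.T
g = 1.0 / np.sqrt(s**2 + 1.0)

Upsilon = np.diag(s * g)      # Upsilon = Sigma * G
M = np.diag(g)                # M = G
Y = VA * g                    # Y = V_A G (column scaling)
Yinv = np.linalg.inv(Y)

assert np.allclose(UA @ Upsilon @ Yinv, A)     # A = U Upsilon Y^{-1}
assert np.allclose((L @ VA) @ M @ Yinv, L)     # L = (L V_A) M Y^{-1}
assert np.allclose((s * g) / g, s)             # gamma_i = upsilon_i / mu_i = sigma_i
```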

If 𝐀{\bf A} is column orthogonal instead of 𝐋{\bf L}, a similar result holds with the roles of 𝐀{\bf A} and 𝐋{\bf L} reversed: a decomposition of 𝐀{\bf A} can be written in terms of the SVD of 𝐋{\bf L}, and the SVD of 𝐋{\bf L} then determines the matrices in the GSVD of {𝐀,𝐋}\{{\bf A},{\bf L}\}. In practice, however, if 𝐀{\bf A} is column orthogonal, then 𝐀𝐀=𝐈{\bf A}^{\top}{\bf A}={\bf I} and the system of equations is not ill-posed; we immediately obtain 𝐱=𝐀𝐛{\bf x}={\bf A}^{\top}{\bf b}, and we do not need the GSVD.

To obtain the same result as in Lemma 3.2, we can also consider the generalized eigenvalue problem [40] involving the matrix pair {𝐀𝐀,𝐋𝐋}\{{\bf A}^{\top}{\bf A},{\bf L}^{\top}{\bf L}\}, in which the goal is to find a scalar λg\lambda_{g} and non-zero vector 𝐱g{\bf x}_{g} such that

(30) 𝐀𝐀𝐱g=λg𝐋𝐋𝐱g,\displaystyle{\bf A}^{\top}{\bf A}{\bf x}_{g}=\lambda_{g}{\bf L}^{\top}{\bf L}{\bf x}_{g},

where 𝐀𝐀,𝐋𝐋n×n{\bf A}^{\top}{\bf A},{\bf L}^{\top}{\bf L}\in\mathbb{R}^{n\times n} are symmetric. When 𝐋{\bf L} is column orthogonal, Eq. 30 simplifies to the eigenvalue problem

(31) 𝐀𝐀𝐱s=λs𝐱s,\displaystyle{\bf A}^{\top}{\bf A}{\bf x}_{s}=\lambda_{s}{\bf x}_{s},

since 𝐋𝐋=𝐈{\bf L}^{\top}{\bf L}={\bf I} [46, 55].

Lemma 3.4.

When 𝐋{\bf L} is column orthogonal, the λg\lambda_{g} and 𝐱g{\bf x}_{g} that solve Eq. 30 are the same as the λs\lambda_{s} and 𝐱s{\bf x}_{s} that solve Eq. 31: λg=λs\lambda_{g}=\lambda_{s} and 𝐱g=𝐱s{\bf x}_{g}={\bf x}_{s}.

Proof 3.5.

Since 𝐀𝐀+𝐋𝐋{\bf A}^{\top}{\bf A}+{\bf L}^{\top}{\bf L} is positive definite, there exists a solution to Eq. 30 and any λg\lambda_{g} and 𝐱g{\bf x}_{g} that solve Eq. 30 are real [21, 40]. The λg\lambda_{g} is a generalized eigenvalue and 𝐱g{\bf x}_{g} is the associated eigenvector. We can write Eq. 30 to include all eigenvalues and eigenvectors as a matrix equation

(32) 𝐀𝐀𝚽=𝐋𝐋𝚽𝚲,\displaystyle{\bf A}^{\top}{\bf A}{\bm{\Phi}}={\bf L}^{\top}{\bf L}{\bm{\Phi}}\bm{\Lambda},

where 𝚽n×n{\bm{\Phi}}\in\mathbb{R}^{n\times n} contains the nn generalized eigenvectors and 𝚲n×n\bm{\Lambda}\in\mathbb{R}^{n\times n} is a diagonal matrix with the generalized eigenvalues. When 𝐋{\bf L} is column orthogonal, Eq. 30 simplifies to the eigenvalue problem for 𝐀𝐀{\bf A}^{\top}{\bf A}:

𝐀𝐀𝚽\displaystyle{\bf A}^{\top}{\bf A}{\bm{\Phi}} =𝚽𝚲\displaystyle={\bm{\Phi}}\bm{\Lambda}
𝚽𝐀𝐀𝚽\displaystyle{\bm{\Phi}}^{\top}{\bf A}^{\top}{\bf A}{\bm{\Phi}} =𝚲.\displaystyle=\bm{\Lambda}.

Thus, the eigenvalues of 𝐀𝐀{\bf A}^{\top}{\bf A} are the eigenvalues of 𝚲\bm{\Lambda}, meaning that the generalized eigenvalues of {𝐀𝐀,𝐋𝐋}\{{\bf A}^{\top}{\bf A},{\bf L}^{\top}{\bf L}\} are the eigenvalues of 𝐀𝐀{\bf A}^{\top}{\bf A}. The associated eigenvectors for 𝐀𝐀{\bf A}^{\top}{\bf A} are the columns of 𝚽{\bm{\Phi}}, which are also the generalized eigenvectors of {𝐀𝐀,𝐋𝐋}\{{\bf A}^{\top}{\bf A},{\bf L}^{\top}{\bf L}\} from Eq. 32. When mnm\geq n, this analysis can be connected back to the GSVs. Since the SVs of 𝐀{\bf A} are the square roots of the eigenvalues of 𝐀𝐀{\bf A}^{\top}{\bf A} and the finite GSVs of {𝐀,𝐋}\{{\bf A},{\bf L}\} are the square roots of the finite generalized eigenvalues of {𝐀𝐀,𝐋𝐋}\{{\bf A}^{\top}{\bf A},{\bf L}^{\top}{\bf L}\} [5], the GSVs of {𝐀,𝐋}\{{\bf A},{\bf L}\} are the SVs of 𝐀{\bf A}.

From the proof of Lemma 3.2, we can see that when 𝐋{\bf L} is column orthogonal, the GSVD of {𝐀,𝐋}\{{\bf A},{\bf L}\} can be obtained from the SVD of 𝐀{\bf A}. This means that instead of computing the GSVD, we can use the SVD of 𝐀{\bf A}. Another result of Lemma 3.2 is that the basis for the solution of Eq. 18 is determined solely by the right singular vectors of 𝐀{\bf A} and not by 𝐋{\bf L}. Similarly, the GSVs are also determined by the SVs of 𝐀{\bf A}. The only matrix in the GSVD that changes between framelets and wavelets is the 𝐕{\bf V} matrix, which is used in the solution Eq. 25 when 𝐡(k)𝟎{\bf h}^{(k)}\neq{\bf{0}}. When 𝐡(k)=𝟎{\bf h}^{(k)}={\bf{0}}, however, as in the generalized Tikhonov problem Eq. 1, both choices of 𝐋{\bf L} produce the same solution, leading to the following corollary.

Corollary 3.6.

Let 𝐋{\bf L} and 𝐋~\tilde{{\bf L}} both be column orthogonal. Then, for a given λ\lambda, it makes no difference whether 𝐋{\bf L} or 𝐋~\tilde{{\bf L}} is used as the regularization matrix for the generalized Tikhonov problem Eq. 1 as the problem reduces to standard Tikhonov with 𝐋=𝐈{\bf L}={\bf I} for both matrices. Thus, the solutions will be the same, to machine precision.

When 𝐋{\bf L} is column orthogonal, Eq. 25 can be written as

(33) \displaystyle{\bf x}=\text{vec}\left({\bf Y}_{2}\left[\overline{\tilde{\bm{\Phi}}}\odot\tilde{{\bf B}}+\overline{\tilde{\bm{\Psi}}}\odot\tilde{{\bf H}}^{(k)}\right]{\bf Y}_{1}^{\top}\right),

where

𝚽~¯=array([υ1υ12+λ2,,υn2υn22+λ2]) and 𝚿~¯=array([λ2μ1υ12+λ2,,λ2μn2υn22+λ2]).\displaystyle\overline{\tilde{\bm{\Phi}}}=\text{array}\left(\left[\frac{\upsilon_{1}}{\upsilon_{1}^{2}+\lambda^{2}},\dots,\frac{\upsilon_{n^{2}}}{\upsilon_{n^{2}}^{2}+\lambda^{2}}\right]^{\top}\right)\mbox{ and }\overline{\tilde{\bm{\Psi}}}=\text{array}\left(\left[\frac{\lambda^{2}\mu_{1}}{\upsilon_{1}^{2}+\lambda^{2}},\dots,\frac{\lambda^{2}\mu_{n^{2}}}{\upsilon_{n^{2}}^{2}+\lambda^{2}}\right]^{\top}\right).

4 Family of Regularization Operators

This section presents a family of regularization operators 𝐋{\bf L}, which we adopted for the solution of Eq. 3. Specifically, we will consider a two-level framelet analysis operator [13, 44] and the Daubechies D4 discrete wavelet transform [12].

Consider a matrix that corresponds to a one-dimensional (1D) transform of the form

𝐋T1D=[𝐓0𝐓1𝐓t],\displaystyle{\bf L}_{T}^{1D}=\begin{bmatrix}{\bf T}_{0}\\ {\bf T}_{1}\\ \vdots\\ {\bf T}_{t}\end{bmatrix},

and the corresponding two-dimensional (2D) transform has the form

𝐋T2D=[𝐓00𝐓01𝐓ij𝐓tt],\displaystyle{\bf L}_{T}^{2D}=\begin{bmatrix}{\bf T}_{00}\\ {\bf T}_{01}\\ \vdots\\ {\bf T}_{ij}\\ \vdots\\ {\bf T}_{tt}\end{bmatrix},

where {\bf T}_{ij}={\bf T}_{i}\otimes{\bf T}_{j} for i,j=0,\dots,t. This then yields the 2D-transform-regularized problem

(34) min𝐱{12(𝐀1𝐀2)𝐱𝐛22+μ2𝐋T2D𝐱1}.\displaystyle\min_{\bf x}\left\{\frac{1}{2}\left\lVert({\bf A}_{1}\otimes{\bf A}_{2}){\bf x}-{\bf b}\right\rVert_{2}^{2}+\frac{\mu}{2}\left\lVert{\bf L}_{T}^{2D}{\bf x}\right\rVert_{1}\right\}.

To solve Eq. 34 we observe that 𝐋T1D𝐋T1D=𝐏T𝐋T2D{\bf L}_{T}^{1D}\otimes{\bf L}_{T}^{1D}={\bf P}_{T}{\bf L}_{T}^{2D}, where 𝐏T{\bf P}_{T} is a permutation matrix. Since

𝐋T2D𝐱1=𝐏𝐋T2D𝐱1\displaystyle\left\lVert{\bf L}_{T}^{2D}{\bf x}\right\rVert_{1}=\left\lVert{\bf P}{\bf L}_{T}^{2D}{\bf x}\right\rVert_{1}

for any permutation matrix 𝐏{\bf P}, we have that

𝐋T2D𝐱1=𝐏T𝐋T2D𝐱1=(𝐋T1D𝐋T1D)𝐱1.\displaystyle\left\lVert{\bf L}_{T}^{2D}{\bf x}\right\rVert_{1}=\left\lVert{\bf P}_{T}{\bf L}_{T}^{2D}{\bf x}\right\rVert_{1}=\left\lVert({\bf L}_{T}^{1D}\otimes{\bf L}_{T}^{1D}){\bf x}\right\rVert_{1}.

As a result, solving

(35) min𝐱{12(𝐀1𝐀2)𝐱𝐛22+μ2(𝐋T1D𝐋T1D)𝐱1},\displaystyle\min_{\bf x}\left\{\frac{1}{2}\left\lVert({\bf A}_{1}\otimes{\bf A}_{2}){\bf x}-{\bf b}\right\rVert_{2}^{2}+\frac{\mu}{2}\left\lVert({\bf L}_{T}^{1D}\otimes{\bf L}_{T}^{1D}){\bf x}\right\rVert_{1}\right\},

is equivalent to solving Eq. 34. Thus, we will solve Eq. 35 as it provides the advantage of the KP structure.
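The permutation argument can be checked numerically on a small example. In the sketch below, the blocks 𝐓0 and 𝐓1 are hypothetical random matrices; the stacked 2D operator and the Kronecker product 𝐋_T^{1D}⊗𝐋_T^{1D} are compared through the 1-norm of their action on a random vector, which is invariant under row permutations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
T = [rng.standard_normal((n, n)) for _ in range(2)]         # hypothetical blocks T_0, T_1

L1d = np.vstack(T)                                           # L_T^{1D} = [T_0; T_1]
L2d = np.vstack([np.kron(Ti, Tj) for Ti in T for Tj in T])   # L_T^{2D} = [T_00; T_01; T_10; T_11]
K = np.kron(L1d, L1d)                                        # L_T^{1D} (x) L_T^{1D}

x = rng.standard_normal(n**2)
# Row permutations do not change the 1-norm of the product, so the two norms agree.
assert np.isclose(np.linalg.norm(L2d @ x, 1), np.linalg.norm(K @ x, 1))
```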

4.1 Framelets

Following the presentation in [8], we use tight frames that are determined by B-splines for regularization. With these frames, we have one low-pass filter 𝐅0n×n{\bf F}_{0}\in\mathbb{R}^{n\times n} and two high pass filters 𝐅1,𝐅2n×n{\bf F}_{1},{\bf F}_{2}\in\mathbb{R}^{n\times n} given by

𝐅0=14[31001211210013],𝐅1=24[11001011010011] and 𝐅2=14[11001211210011].\scriptsize{\bf F}_{0}=\frac{1}{4}\begin{bmatrix}3&1&0&\cdots&0\\ 1&2&1&&\\ &\ddots&\ddots&\ddots&\\ &&1&2&1\\ 0&\cdots&0&1&3\end{bmatrix},\quad{\bf F}_{1}=\frac{\sqrt{2}}{4}\begin{bmatrix}-1&1&0&\cdots&0\\ -1&0&1&&\\ &\ddots&\ddots&\ddots&\\ &&-1&0&1\\ 0&\cdots&0&-1&1\end{bmatrix}\text{ and }{\bf F}_{2}=\frac{1}{4}\begin{bmatrix}1&-1&0&\cdots&0\\ -1&2&-1&&\\ &\ddots&\ddots&\ddots&\\ &&-1&2&-1\\ 0&\cdots&0&-1&1\end{bmatrix}.

The one-dimensional operator 𝐋F1D3n×n{\bf L}_{F}^{1D}\in\mathbb{R}^{3n\times n} is given by

𝐋F1D=[𝐅0𝐅1𝐅2].\displaystyle{\bf L}_{F}^{1D}=\begin{bmatrix}{\bf F}_{0}\\ {\bf F}_{1}\\ {\bf F}_{2}\end{bmatrix}.

Since we are concerned with image deblurring in two dimensions, the two-dimensional operator is then constructed using the Kronecker product:

𝐅ij=𝐅i𝐅j,i,j=0,1,2.\displaystyle{\bf F}_{ij}={\bf F}_{i}\otimes{\bf F}_{j},\quad i,j=0,1,2.

The matrix 𝐅00{\bf F}_{00} is a low-pass filter, while each of the other matrices contains at least one of the high-pass filters. The 2D operator 𝐋F2D9n2×n2{\bf L}_{F}^{2D}\in\mathbb{R}^{9n^{2}\times n^{2}}, obtained by the concatenation of these matrices, is given by

𝐋F2D=[𝐅00𝐅01𝐅02𝐅22].\displaystyle{\bf L}_{F}^{2D}=\begin{bmatrix}{\bf F}_{00}\\ {\bf F}_{01}\\ {\bf F}_{02}\\ \vdots\\ {\bf F}_{22}\end{bmatrix}.
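As a sanity check, the sketch below assembles the three framelet filter matrices displayed above for a small n and verifies that 𝐋_F^{1D} is column orthogonal, i.e., (𝐋_F^{1D})^⊤𝐋_F^{1D}=𝐈, which is the property used in Lemma 3.2. The dense assembly and the function name are illustrative only.

```python
import numpy as np

def framelet_1d(n):
    """Dense B-spline framelet analysis operator L_F^{1D} = [F_0; F_1; F_2] (3n x n)."""
    F0, F1, F2 = np.zeros((n, n)), np.zeros((n, n)), np.zeros((n, n))
    for i in range(n):
        F0[i, i] = 2 / 4
        F2[i, i] = 2 / 4
        if i > 0:
            F0[i, i - 1] = 1 / 4
            F1[i, i - 1] = -np.sqrt(2) / 4
            F2[i, i - 1] = -1 / 4
        if i < n - 1:
            F0[i, i + 1] = 1 / 4
            F1[i, i + 1] = np.sqrt(2) / 4
            F2[i, i + 1] = -1 / 4
    # Boundary entries as in the filters displayed above.
    F0[0, 0], F0[-1, -1] = 3 / 4, 3 / 4
    F1[0, 0], F1[-1, -1] = -np.sqrt(2) / 4, np.sqrt(2) / 4
    F2[0, 0], F2[-1, -1] = 1 / 4, 1 / 4
    return np.vstack([F0, F1, F2])

L = framelet_1d(16)
assert np.allclose(L.T @ L, np.eye(16))   # column orthogonality used in Lemma 3.2
```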

4.2 Wavelets

Discrete wavelet transforms are composed of a low-pass filter, 𝐖1{\bf W}_{1}^{\top} and a high-pass filter, 𝐖2{\bf W}_{2}^{\top} [16, 49]. For a discrete wavelet transform of order kk, let h0,h1,,hk1h_{0},h_{1},\dots,h_{k-1} be the filter coefficients. Then, the filter matrices have the forms

𝐖1\displaystyle{\bf W}_{1}^{\top} =[h0h1hk2hk1h0h1hk2hk1h0h1hk2hk1hk2hk1h0h1hk2hk1h0h1]\displaystyle=\setcounter{MaxMatrixCols}{12}\begin{bmatrix}h_{0}&h_{1}&\cdots&\cdots&h_{k-2}&h_{k-1}&&&&&&\\ &&h_{0}&h_{1}&\cdots&\cdots&h_{k-2}&h_{k-1}&&&&\\ &&&&\cdots&\cdots&\cdots&\cdots&\cdots&\cdots&&\\ &&&&&&h_{0}&h_{1}&\cdots&\cdots&h_{k-2}&h_{k-1}\\ h_{k-2}&h_{k-1}&&&&&&&h_{0}&h_{1}&\cdots&\cdots\\ \cdots&\cdots&\cdots&\cdots&&&&&&&\cdots&\cdots\\ \cdots&\cdots&h_{k-2}&h_{k-1}&&&&&&&h_{0}&h_{1}\end{bmatrix}
𝐖2\displaystyle{\bf W}_{2}^{\top} =[g0g1gk2gk1g0g1gk2gk1g0g1gk2gk1gk2gk1g0g1gk2gk1g0g1],\displaystyle=\setcounter{MaxMatrixCols}{12}\begin{bmatrix}g_{0}&g_{1}&\cdots&\cdots&g_{k-2}&g_{k-1}&&&&&&\\ &&g_{0}&g_{1}&\cdots&\cdots&g_{k-2}&g_{k-1}&&&&\\ &&&&\cdots&\cdots&\cdots&\cdots&\cdots&\cdots&&\\ &&&&&&g_{0}&g_{1}&\cdots&\cdots&g_{k-2}&g_{k-1}\\ g_{k-2}&g_{k-1}&&&&&&&g_{0}&g_{1}&\cdots&\cdots\\ \cdots&\cdots&\cdots&\cdots&&&&&&&\cdots&\cdots\\ \cdots&\cdots&g_{k-2}&g_{k-1}&&&&&&&g_{0}&g_{1}\end{bmatrix},

where gj=(1)jhkj1,j=0,,k1g_{j}=(-1)^{j}h_{k-j-1},\;j=0,\dots,k-1. The 1D DWT matrix is then

𝐋W1D=[𝐖1𝐖2].\displaystyle{\bf L}_{W}^{1D}=\begin{bmatrix}{\bf W}^{\top}_{1}\\ {\bf W}_{2}^{\top}\end{bmatrix}.

As with the framelets, the 2D operator is constructed by taking the Kronecker products of the low-pass and high-pass filters:

\displaystyle{\bf W}^{\top}_{ij}={\bf W}^{\top}_{i}\otimes{\bf W}^{\top}_{j},\quad i,j=1,2.

The 2D operator is then

𝐋W2D=[𝐖11𝐖12𝐖21𝐖22].\displaystyle{\bf L}_{W}^{2D}=\begin{bmatrix}{\bf W}_{11}\\ {\bf W}_{12}\\ {\bf W}_{21}\\ {\bf W}_{22}\end{bmatrix}.

For the examples, we will use the D4 wavelet transform, which is of order 4 and has the filter coefficients

\displaystyle h_{0}=\frac{1+\sqrt{3}}{4\sqrt{2}},\,h_{1}=\frac{3+\sqrt{3}}{4\sqrt{2}},\,h_{2}=\frac{3-\sqrt{3}}{4\sqrt{2}},\,h_{3}=\frac{1-\sqrt{3}}{4\sqrt{2}}.
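As with the framelets, the column orthogonality of the wavelet operator can be verified directly. The sketch below is a minimal dense construction of 𝐋_W^{1D}=[𝐖_1^⊤;𝐖_2^⊤] with periodic wrap-around as in the matrices displayed earlier, assuming the orthonormal (unit 2-norm) scaling of the D4 filter; the assembly and names are illustrative only.

```python
import numpy as np

def dwt_matrix_d4(n):
    """Dense 1D D4 analysis matrix L_W^{1D} = [W_1^T; W_2^T] (n x n, n even)."""
    h = np.array([1 + np.sqrt(3), 3 + np.sqrt(3),
                  3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
    g = np.array([(-1) ** j * h[3 - j] for j in range(4)])   # g_j = (-1)^j h_{k-j-1}
    W1t = np.zeros((n // 2, n))
    W2t = np.zeros((n // 2, n))
    for i in range(n // 2):
        cols = [(2 * i + j) % n for j in range(4)]           # shift by 2, periodic wrap-around
        W1t[i, cols] = h
        W2t[i, cols] = g
    return np.vstack([W1t, W2t])

L = dwt_matrix_d4(16)
assert np.allclose(L.T @ L, np.eye(16))   # the DWT matrix is orthogonal, hence column orthogonal
```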

5 Numerical Results

In this section, the SB and MM methods are applied to image deblurring examples where the blur is separable. For the blurring matrix 𝐀{\bf A}, we consider 𝐀jn×n{\bf A}_{j}\in\mathbb{R}^{n\times n}, j=1j=1, 22, to be a one-dimensional blur with either zero or periodic boundary conditions. With zero boundary conditions, 𝐀j{\bf A}_{j} is a symmetric Toeplitz matrix with the first row

𝐳j=12πσ~j[exp([0:bandj1]22σ~j2)𝟎nbandj],\displaystyle{\bf z}_{j}=\frac{1}{\sqrt{2\pi}\tilde{\sigma}_{j}}\begin{bmatrix}\text{exp}\left(\frac{-[0:\text{band}_{j}-1]^{2}}{2\tilde{\sigma}_{j}^{2}}\right)&{{\bf 0}_{n-\text{band}_{j}}}\end{bmatrix},

while with periodic boundary conditions, 𝐀j{\bf A}_{j} is a circulant matrix with first row

𝐳j=12πσ~j[exp([0:bandj1]22σ~j2)𝟎n2bandj+1exp([bandj1:1]22σ~j2)].\displaystyle{\bf z}_{j}=\frac{1}{\sqrt{2\pi}\tilde{\sigma}_{j}}\begin{bmatrix}\text{exp}\left(\frac{-[0:\text{band}_{j}-1]^{2}}{2\tilde{\sigma}_{j}^{2}}\right)&{{\bf 0}_{n-2\text{band}_{j}+1}}&\text{exp}\left(\frac{-[\text{band}_{j}-1:1]^{2}}{2\tilde{\sigma}_{j}^{2}}\right)\end{bmatrix}.

Here, σ~j\tilde{\sigma}_{j} and bandj\text{band}_{j} are blur parameters. We then add Gaussian noise 𝐞{\bf e} to 𝐛true{\bf b}_{true} according to a specified blurred signal to noise ratio (BSNR), which is defined as

BSNR=20log10(𝐛true2𝐞2).\displaystyle\text{BSNR}=20\log_{10}\left(\frac{\left\lVert{\bf b}_{true}\right\rVert_{2}}{\left\lVert{\bf e}\right\rVert_{2}}\right).

The image is then deblurred using the SB and MM methods, with 𝐋{\bf L} being defined by framelets or wavelets as described in Section 4. We set tol=0.01tol=0.01 and Kmax=20K_{max}=20. The methods are compared using the relative error (RE) defined by

RE=𝐱𝐱true2𝐱true2,\displaystyle\text{RE}=\frac{\left\lVert{\bf x}-{\bf x}_{true}\right\rVert_{2}}{\left\lVert{\bf x}_{true}\right\rVert_{2}},

and the improved signal to noise ratio (ISNR), which is defined by

ISNR=20log10(𝐛𝐱true2𝐱𝐱true2).\displaystyle\text{ISNR}=20\log_{10}\left(\frac{\left\lVert{\bf b}-{\bf x}_{true}\right\rVert_{2}}{\left\lVert{\bf x}-{\bf x}_{true}\right\rVert_{2}}\right).

Both metrics measure the quality of the solution, with a smaller RE and a larger ISNR indicating a better solution.
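For completeness, the sketch below builds the 1D Gaussian blurring factor 𝐀_j with zero boundary conditions (a banded symmetric Toeplitz matrix with first row 𝐳_j) and adds Gaussian noise at a prescribed BSNR. The function names and the use of scipy.linalg.toeplitz are illustrative choices, not the authors' implementation; the parameters shown correspond to Example 1.

```python
import numpy as np
from scipy.linalg import toeplitz

def gaussian_blur_1d(n, sigma, band):
    """Symmetric banded Toeplitz blur factor A_j (zero boundary conditions)."""
    z = np.zeros(n)
    k = np.arange(band)
    z[:band] = np.exp(-(k**2) / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return toeplitz(z)

def add_noise(b_true, bsnr, seed=0):
    """Add white Gaussian noise e scaled so that 20*log10(||b_true||/||e||) = BSNR."""
    e = np.random.default_rng(seed).standard_normal(b_true.shape)
    e *= np.linalg.norm(b_true) / (np.linalg.norm(e) * 10 ** (bsnr / 20))
    return b_true + e, e

# Example 1 setup (illustrative): 128 x 128 image, sigma_1 = 3, sigma_2 = 1, band = 15.
A1 = gaussian_blur_1d(128, sigma=3.0, band=15)
A2 = gaussian_blur_1d(128, sigma=1.0, band=15)
```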

5.1 Blurring Examples

Figure 1: True image {\bf x} (128\times 128 pixels), PSF (17\times 17 pixels), blurred image {\bf b}_{true}, and the blurred and noisy image {\bf b} with BSNR=10 for Example 1. Panels: (a) {\bf x}; (b) PSF; (c) {\bf b}_{true}; (d) {\bf b}.

5.1.1 Example 1

For the first example, we blur the satellite image in Fig. 1(a) with size n=128n=128. For 𝐀{\bf A}, we use zero boundary conditions, σ~1=3\tilde{\sigma}_{1}=3, σ~2=1\tilde{\sigma}_{2}=1, and band1=band2=15\text{band}_{1}=\text{band}_{2}=15. The corresponding point spread function (PSF) is shown in Fig. 1(b), and the blurred image 𝐛true{\bf b}_{true} is shown in Fig. 1(c). Gaussian noise with BSNR=10\text{BSNR}=10 is added to obtain 𝐛{\bf b} as seen in Fig. 1(d). This problem is then solved using SB with τ=0.04\tau=0.04 and MM with ε=0.03\varepsilon=0.03. The solutions in Fig. 2 show that framelet regularization produces solutions with fewer artifacts in the background. The RE and ISNR for these methods are shown in Fig. 3. This shows that SB and MM perform similarly and that the framelet solutions have larger ISNRs than the wavelet solutions. The results are summarized in Table 1 along with timing. From this, we can observe that with χ2\chi^{2}, wavelet methods converge faster than framelet methods while with GCV, framelet methods converge faster. In all cases, the χ2\chi^{2} selection method, which is a root-finding technique, is faster than GCV despite GCV taking fewer iterations to converge. GCV and χ2\chi^{2} produce solutions with comparable RE and ISNR with either framelets or wavelets.

Figure 2: Solutions to Example 1 in Fig. 1. The solutions use either SB or MM to solve the problem and either framelets or wavelets for regularization. The parameter is either fixed optimally or selected with GCV or the non-central \chi^{2} test. Panels: (a) SB Framelets, Opt; (b) SB Wavelets, Opt; (c) MM Framelets, Opt; (d) MM Wavelets, Opt; (e) SB Framelets, GCV; (f) SB Wavelets, GCV; (g) MM Framelets, GCV; (h) MM Wavelets, GCV; (i) SB Framelets, \chi^{2}; (j) SB Wavelets, \chi^{2}; (k) MM Framelets, \chi^{2}; (l) MM Wavelets, \chi^{2}.
Figure 3: RE and ISNR by iteration for the methods applied to Example 1 in Fig. 1. Panels: (a) RE: SB Framelets; (b) RE: SB Wavelets; (c) RE: MM Framelets; (d) RE: MM Wavelets; (e) ISNR: SB Framelets; (f) ISNR: SB Wavelets; (g) ISNR: MM Framelets; (h) ISNR: MM Wavelets.
Table 1: Results for Example 1 in Fig. 1. Timing includes computing the SVD, running the iterative method to convergence, and selecting λ\lambda at each iteration. For the χ2\chi^{2} test, we report the results for the non-central χ2\chi^{2} test.
Method RE ISNR Iterations Timing (s)
SB Framelets, Optimal 0.26 35.0 9
SB Framelets, GCV 0.26 34.7 8 0.17
SB Framelets, χ2\chi^{2} 0.26 34.9 10 0.08
SB Wavelets, Optimal 0.28 34.2 11
SB Wavelets, GCV 0.28 34.2 9 0.21
SB Wavelets, χ2\chi^{2} 0.28 34.2 13 0.04
MM Framelets, Optimal 0.28 34.3 8
MM Framelets, GCV 0.29 34.4 6 0.13
MM Framelets, χ2\chi^{2} 0.27 34.5 8 0.08
MM Wavelets, Optimal 0.28 34.1 11
MM Wavelets, GCV 0.28 34.2 10 0.17
MM Wavelets, χ2\chi^{2} 0.28 34.1 14 0.05

5.1.2 Example 2

As a larger example, we blur the image of the Hubble telescope in Fig. 4(a), which has size n=512n=512. Here, we use zero boundary conditions, σ~1=2\tilde{\sigma}_{1}=2, σ~2=8\tilde{\sigma}_{2}=8, and band1=band2=50\text{band}_{1}=\text{band}_{2}=50. Gaussian noise with BSNR=10\text{BSNR}=10 is added to the blurred image to produce 𝐛{\bf b}. The PSF, 𝐛true{\bf b}_{true}, and 𝐛{\bf b} are shown in Fig. 4. We solve this problem using SB with τ=0.04\tau=0.04 and MM with ε=0.03\varepsilon=0.03. The solutions are shown in Fig. 5 with the RE and ISNR in Fig. 6. When comparing RE and ISNR, SB and MM perform similarly for both framelets and wavelets. We observe that framelets produce solutions with slightly smaller REs and larger ISNRs than wavelets. With wavelets, the χ2\chi^{2} test does not perform as well as GCV, but with framelets, both GCV and χ2\chi^{2} achieve solutions near the optimal. The results are summarized in Table 2 along with timing. From this, we can see that although the framelet methods produce better solutions, they take longer than wavelet methods to run. We also observe again that the χ2\chi^{2} selection method is faster than GCV, even though GCV takes fewer iterations.

Figure 4: True image {\bf x} (512\times 512 pixels), PSF (39\times 39 pixels), blurred image {\bf b}_{true}, and the blurred and noisy image {\bf b} with BSNR=10 for Example 2. Panels: (a) {\bf x}; (b) PSF; (c) {\bf b}_{true}; (d) {\bf b}.
Figure 5: Solutions to Example 2 in Fig. 4. The solutions use either SB or MM to solve the problem and either framelets or wavelets for regularization. The parameter is either fixed optimally or selected with GCV or the non-central \chi^{2} test. Panels: (a) SB Framelets, Opt; (b) SB Wavelets, Opt; (c) MM Framelets, Opt; (d) MM Wavelets, Opt; (e) SB Framelets, GCV; (f) SB Wavelets, GCV; (g) MM Framelets, GCV; (h) MM Wavelets, GCV; (i) SB Framelets, \chi^{2}; (j) SB Wavelets, \chi^{2}; (k) MM Framelets, \chi^{2}; (l) MM Wavelets, \chi^{2}.
Figure 6: RE and ISNR by iteration for the methods applied to Example 2 in Fig. 4. Panels: (a) RE: SB Framelets; (b) RE: SB Wavelets; (c) RE: MM Framelets; (d) RE: MM Wavelets; (e) ISNR: SB Framelets; (f) ISNR: SB Wavelets; (g) ISNR: MM Framelets; (h) ISNR: MM Wavelets.
Table 2: Results for Example 2 in Fig. 4. Timing includes computing the SVD, running the iterative method to convergence, and selecting λ\lambda at each iteration. For the χ2\chi^{2} test, we report the results for the non-central χ2\chi^{2} test.
Method RE ISNR Iterations Timing (s)
SB Framelets, Optimal 0.23 32.9 8
SB Framelets, GCV 0.23 32.9 8 4.76
SB Framelets, χ2\chi^{2} 0.23 33.0 14 2.75
SB Wavelets, Optimal 0.24 32.6 8
SB Wavelets, GCV 0.25 32.3 7 3.29
SB Wavelets, χ2\chi^{2} 0.30 30.7 14 1.14
MM Framelets, Optimal 0.24 32.7 6
MM Framelets, GCV 0.23 32.9 6 4.22
MM Framelets, χ2\chi^{2} 0.23 32.8 10 2.81
MM Wavelets, Optimal 0.24 32.6 7
MM Wavelets, GCV 0.25 32.4 7 3.89
MM Wavelets, χ2\chi^{2} 0.29 30.9 16 1.65

5.1.3 Example 3

Now, we consider the barcode image in Fig. 7(a), which has size n=128n=128. Here, we use periodic boundary conditions for the blurring matrix along with parameters σ~1=1.5\tilde{\sigma}_{1}=1.5, σ~2=0.8\tilde{\sigma}_{2}=0.8, and band1=band2=15\text{band}_{1}=\text{band}_{2}=15. Gaussian noise with BSNR=20\text{BSNR}=20 is added to the blurred image to obtain 𝐛{\bf b}. The PSF, 𝐛true{\bf b}_{true}, and 𝐛{\bf b} are shown in Fig. 7. For this problem, we will compare regularization using column-orthogonal matrices with regularization using the first derivative operator in one direction. For this derivative operator, we use periodic boundary conditions, which gives a matrix of the form 𝐋D=𝐅1D𝐈n{\bf L}_{D}={\bf F}_{1D}\otimes{\bf I}_{n}, where 𝐅1Dn×n{\bf F}_{1D}\in\mathbb{R}^{n\times n} is given by

𝐅1D=[1100011000011000111001].\displaystyle{\bf F}_{1D}=\begin{bmatrix}-1&1&0&\cdots&&0\\ 0&-1&1&0&\cdots&0\\ \vdots&&\ddots&\ddots&&\vdots\\ 0&\cdots&0&-1&1&0\\ 0&\cdots&&0&-1&1\\ 1&0&\cdots&&0&-1\end{bmatrix}.

This matrix only captures the derivative in the horizontal direction, which fits well with this example. We solve this problem using SB and MM where τ=0.02\tau=0.02 in SB and ε=0.02\varepsilon=0.02 in MM.

The results for framelets and 𝐋D{\bf L}_{D} are shown in Fig. 8 with the RE and ISNR plotted in Fig. 9. Wavelet results are not shown, but the conclusions for wavelets are the same as in the previous examples. The solutions show that the derivative operator produces horizontal artifacts while the framelet solutions do not. This is reflected in the framelet methods, which have smaller REs and larger ISNRs. For parameter selection methods, GCV and χ2\chi^{2} reach similar RE and ISNR for both framelets and 𝐋D{\bf L}_{D}. With framelets, χ2\chi^{2} again takes more iterations but less time to converge than GCV and provides approximately comparable results in terms of RE and ISNR. With 𝐋D{\bf L}_{D} though, MM with GCV reached the maximum number of iterations KmaxK_{max}. The results are summarized in Table 3 along with timing. The timing shows that the χ2\chi^{2} selection method is faster than GCV in all cases. We also see that framelet methods take longer to run but achieve better RE and ISNR. This difference in timing comes primarily from the framelet matrix having more rows since the number of iterations for framelets and 𝐋D{\bf L}_{D} to converge is generally comparable. With framelets, computing the SVDs takes about 0.008 seconds compared to 0.013 seconds for the GSVD with 𝐋D{\bf L}_{D}. Within the iterative methods, however, the last nn columns of 𝐕1{\bf V}_{1} and 𝐕2{\bf V}_{2} are needed, which are of length p=3np=3n with framelets and of length nn with 𝐋D{\bf L}_{D}. Similarly, 𝐡(k){\bf h}^{(k)} has 9n29n^{2} entries with framelets compared to n2n^{2} with 𝐋D{\bf L}_{D}.

Figure 7: True image {\bf x} (128\times 128 pixels), PSF (9\times 9 pixels), blurred image {\bf b}_{true}, and the blurred and noisy image {\bf b} with BSNR=20 for Example 3. Panels: (a) {\bf x}; (b) PSF; (c) {\bf b}_{true}; (d) {\bf b}.
Figure 8: Solutions to Example 3 in Fig. 7. The solutions use either SB or MM and either framelets or {\bf L}_D for regularization; the parameter is either fixed optimally or selected with GCV or the non-central χ² test. Panels: (a) SB Framelets, Opt; (b) SB {\bf L}_D, Opt; (c) MM Framelets, Opt; (d) MM {\bf L}_D, Opt; (e) SB Framelets, GCV; (f) SB {\bf L}_D, GCV; (g) MM Framelets, GCV; (h) MM {\bf L}_D, GCV; (i) SB Framelets, χ²; (j) SB {\bf L}_D, χ²; (k) MM Framelets, χ²; (l) MM {\bf L}_D, χ².
Figure 9: RE and ISNR by iteration for the methods applied to Example 3 in Fig. 7. Panels: (a) RE, SB Framelets; (b) RE, SB {\bf L}_D; (c) RE, MM Framelets; (d) RE, MM {\bf L}_D; (e) ISNR, SB Framelets; (f) ISNR, SB {\bf L}_D; (g) ISNR, MM Framelets; (h) ISNR, MM {\bf L}_D.
Table 3: Results for Example 3 in Fig. 7. Timing includes computing the decomposition (SVD for framelets, GSVD for {\bf L}_D), running the iterative method to convergence, and selecting λ at each iteration; no timing is reported when the parameter is fixed at its optimal value. For the χ² test, we report the results for the non-central χ² test.

Method                    | RE   | ISNR | Iterations | Timing (s)
SB Framelets, Optimal     | 0.22 | 34.8 | 9          | –
SB Framelets, GCV         | 0.22 | 34.8 | 10         | 0.22
SB Framelets, χ²          | 0.22 | 34.8 | 15         | 0.08
SB {\bf L}_D, Optimal     | 0.28 | 32.8 | 10         | –
SB {\bf L}_D, GCV         | 0.28 | 32.7 | 9          | 0.17
SB {\bf L}_D, χ²          | 0.28 | 32.8 | 10         | 0.04
MM Framelets, Optimal     | 0.22 | 34.9 | 8          | –
MM Framelets, GCV         | 0.21 | 35.0 | 9          | 0.23
MM Framelets, χ²          | 0.22 | 34.9 | 14         | 0.14
MM {\bf L}_D, Optimal     | 0.28 | 32.7 | 9          | –
MM {\bf L}_D, GCV         | 0.27 | 32.9 | 20         | 0.40
MM {\bf L}_D, χ²          | 0.28 | 32.7 | 9          | 0.03

6 Conclusions

In this paper, we presented decomposition-based algorithms for solving an ℓ_1-regularized problem with KP-structured matrices. Within these algorithms, we exploited the column orthogonality of a family of regularization operators that includes framelets and wavelets. These operators allow a joint decomposition to be obtained from SVDs rather than GSVDs, making them more efficient than matrices of comparable size without this property, such as finite-difference operators. In the numerical examples, framelet-based regularization methods are slower than wavelet-based ones but generally produce better solutions in terms of RE and ISNR. Among the parameter selection methods, GCV and the χ² test both perform well with framelets, with the χ² test being faster than GCV. Overall, among the algorithm choices considered here, the results support solving Eq. 3 with the SB algorithm, framelet regularization, and the χ² parameter selection method. We also make the interesting observation that the solution of the generalized Tikhonov problem with a column-orthogonal regularization matrix is simply the solution of the standard Tikhonov problem, independent of the choice of this column-orthogonal matrix. In the future, these decomposition-based algorithms could also be extended to ℓ_p regularization.
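As a brief verification of this observation, a column-orthogonal regularization matrix in the sense used here satisfies {\bf L}^{\top}{\bf L} = {\bf I}_n, so that for every {\bf x}

\|{\bf L}{\bf x}\|_2^2 = {\bf x}^{\top}{\bf L}^{\top}{\bf L}\,{\bf x} = {\bf x}^{\top}{\bf x} = \|{\bf x}\|_2^2,

and the generalized Tikhonov functional in Eq. 1 coincides with the standard Tikhonov functional obtained with {\bf L} = {\bf I}_n, regardless of which column-orthogonal matrix is chosen. No analogous reduction holds for the ℓ_1 term in Eq. 3, since \|{\bf L}{\bf x}\|_1 \neq \|{\bf x}\|_1 in general, which is why the choice among column-orthogonal operators still matters in the ℓ_1 setting.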

Acknowledgments

Funding: This work was partially supported by the National Science Foundation (NSF) under grant DMS-2152704 for Renaut. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. M.I. Español was supported through a Karen Uhlenbeck EDGE Fellowship.

References

  • [1] M. Alotaibi, A. Buccini, and L. Reichel, Restoration of blurred images corrupted by impulse noise via median filters and ℓ_p-ℓ_q minimization, in 2021 21st International Conference on Computational Science and Its Applications (ICCSA), IEEE, 2021, pp. 112–122.
  • [2] R. C. Aster, B. Borchers, and C. H. Thurber, Parameter Estimation and Inverse Problems, Elsevier, Amsterdam, 2018.
  • [3] J. M. Bardsley, S. Knepper, and J. Nagy, Structured linear algebra problems in adaptive optics imaging, Advances in Computational Mathematics, 35 (2011), pp. 103–117.
  • [4] A. Barrlund, Efficient solution of constrained least squares problems with Kronecker product structure, SIAM Journal on Matrix Analysis and Applications, 19 (1998), pp. 154–160.
  • [5] T. Betcke, The generalized singular value decomposition and the method of particular solutions, SIAM Journal on Scientific Computing, 30 (2008), pp. 1278–1295.
  • [6] V. Bortolotti, R. Brown, P. Fantazzini, G. Landi, and F. Zama, Uniform penalty inversion of two-dimensional NMR relaxation data, Inverse Problems, 33 (2016), p. 015003.
  • [7] A. Bouhamidi and K. Jbilou, A Kronecker approximation with a convex constrained optimization method for blind image restoration, Optimization Letters, 6 (2012), pp. 1251–1264.
  • [8] A. Buccini, M. Pasha, and L. Reichel, Modulus-based iterative methods for constrained ℓ_p-ℓ_q minimization, Inverse Problems, 36 (2020), p. 084001.
  • [9] A. Buccini and L. Reichel, An ℓ_p-ℓ_q minimization method with cross-validation for the restoration of impulse noise contaminated images, Journal of Computational and Applied Mathematics, 375 (2020), p. 112824.
  • [10] J.-F. Cai, S. Osher, and Z. Shen, Linearized Bregman iterations for frame-based image deblurring, SIAM Journal on Imaging Sciences, 2 (2009), pp. 226–252.
  • [11] J.-F. Cai, S. Osher, and Z. Shen, Split Bregman methods and frame based image restoration, Multiscale Modeling & Simulation, 8 (2010), pp. 337–369.
  • [12] I. Daubechies, Ten lectures on wavelets, SIAM, 1992.
  • [13] I. Daubechies, B. Han, A. Ron, and Z. Shen, Framelets: MRA-based constructions of wavelet frames, Applied and Computational Harmonic Analysis, 14 (2003), pp. 1–46.
  • [14] L. Dykes and L. Reichel, Simplified GSVD computations for the solution of linear discrete ill-posed problems, Journal of Computational and Applied Mathematics, 255 (2014), pp. 15–27.
  • [15] A. Edelman and Y. Wang, The GSVD: Where are the ellipses?, matrix trigonometry, and more, SIAM Journal on Matrix Analysis and Applications, 41 (2020), pp. 1826–1856.
  • [16] M. I. Español, Multilevel methods for discrete ill-posed problems: Application to deblurring, PhD thesis, Tufts University, 2009.
  • [17] E. Esser, Applications of Lagrangian-based alternating direction methods and connections to split Bregman, CAM report, 9 (2009), p. 31.
  • [18] H. Fang and H. Zhang, Wavelet-based double-difference seismic tomography with sparsity regularization, Geophysical Journal International, 199 (2014), pp. 944–955.
  • [19] D. W. Fausett and C. T. Fulton, Large least squares problems involving Kronecker products, SIAM Journal on Matrix Analysis and Applications, 15 (1994), pp. 219–227.
  • [20] D. W. Fausett, C. T. Fulton, and H. Hashish, Improved parallel QR method for large least squares problems involving Kronecker products, Journal of Computational and Applied Mathematics, 78 (1997), pp. 63–78.
  • [21] X.-B. Gao, G. H. Golub, and L.-Z. Liao, Continuous methods for symmetric generalized eigenvalue problems, Linear Algebra and its Applications, 428 (2008), pp. 676–696.
  • [22] C. Garvey, C. Meng, and J. G. Nagy, Singular value decomposition approximation via Kronecker summations for imaging applications, SIAM Journal on Matrix Analysis and Applications, 39 (2018), pp. 1836–1857.
  • [23] S. Gazzola, P. C. Hansen, and J. G. Nagy, IR Tools: a MATLAB package of iterative regularization methods and large-scale test problems, Numerical Algorithms, 81 (2019), pp. 773–811.
  • [24] T. Goldstein and S. Osher, The split Bregman method for ℓ_1-regularized problems, SIAM Journal on Imaging Sciences, 2 (2009), pp. 323–343.
  • [25] G. H. Golub, M. Heath, and G. Wahba, Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics, 21 (1979), pp. 215–223.
  • [26] F. Greensite, Inverse problems with I \otimes A Kronecker structure, SIAM Journal on Matrix Analysis and Applications, 27 (2005), pp. 218–237.
  • [27] P. C. Hansen, Regularization, GSVD and truncated GSVD, BIT Numerical Mathematics, 29 (1989), pp. 491–504.
  • [28] P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, Philadelphia, 2010.
  • [29] P. C. Hansen, J. G. Nagy, and D. P. O'Leary, Deblurring Images: Matrices, Spectra, and Filtering, SIAM, Philadelphia, 2006.
  • [30] P. Howland and H. Park, Generalizing discriminant analysis using the generalized singular value decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 (2004), pp. 995–1006.
  • [31] G. Huang, A. Lanza, S. Morigi, L. Reichel, and F. Sgallari, Majorization–minimization generalized Krylov subspace methods for ℓ_p-ℓ_q optimization applied to image restoration, BIT Numerical Mathematics, 57 (2017), pp. 351–378.
  • [32] D. R. Hunter and K. Lange, A tutorial on MM algorithms, The American Statistician, 58 (2004), pp. 30–37.
  • [33] J. Kamm and J. G. Nagy, Kronecker product and SVD approximations in image restoration, Linear Algebra and its Applications, 284 (1998), pp. 177–192.
  • [34] J. Kamm and J. G. Nagy, Optimal Kronecker product approximation of block Toeplitz matrices, SIAM Journal on Matrix Analysis and Applications, 22 (2000), pp. 155–172.
  • [35] S. M. Knepper, Large-Scale Inverse Problems in Imaging: two Case Studies, PhD thesis, Emory University, 2011.
  • [36] H. Li, Characterizing GSVD by singular value expansion of linear operators and its computation, arXiv preprint arXiv:2404.00655, (2024).
  • [37] J. L. Mead, Parameter estimation: a new approach to weighting a priori information, Journal of Inverse and Ill-posed Problems, 16 (2008), pp. 175–194.
  • [38] J. L. Mead and R. A. Renaut, A Newton root-finding algorithm for estimating the regularization parameter for solving ill-conditioned least squares problems, Inverse Problems, 25 (2008), p. 025002.
  • [39] C. C. Paige and M. A. Saunders, Towards a generalized singular value decomposition, SIAM Journal on Numerical Analysis, 18 (1981), pp. 398–405.
  • [40] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, 1998.
  • [41] M. Pasha, Krylov Subspace Type Methods for the Computation of Non-negative or Sparse Solutions of Ill-posed Problems, PhD thesis, Kent State University, Kent, OH, 2020.
  • [42] N. P. Pitsianis, The Kronecker product in approximation and fast transform generation, PhD thesis, Cornell University, 1997.
  • [43] R. A. Renaut, I. Hnětynková, and J. Mead, Regularization parameter estimation for large-scale Tikhonov regularization using a priori information, Computational Statistics & Data Analysis, 54 (2010), pp. 3430–3445.
  • [44] A. Ron and Z. Shen, Affine systems in L_2(ℝ^d): the analysis of the analysis operator, Journal of Functional Analysis, 148 (1997), pp. 408–447.
  • [45] B. Schimpf and F. Schreier, Robust and efficient inversion of vertical sounding atmospheric high-resolution spectra by means of regularization, Journal of Geophysical Research: Atmospheres, 102 (1997), pp. 16037–16055.
  • [46] B. K. Sriperumbudur, D. A. Torres, and G. R. Lanckriet, A majorization-minimization approach to the sparse generalized eigenvalue problem, Machine learning, 85 (2011), pp. 3–39.
  • [47] B. Sweeney, R. Renaut, and M. I. Español, Parameter selection by GCV and a χ² test within iterative methods for ℓ_1-regularized inverse problems, arXiv preprint arXiv:2404.19156, (2024).
  • [48] A. N. Tikhonov and V. Y. Arsenin, Solution of incorrectly formulated problems and the regularization method, Soviet Mathematics Doklady, 5 (1963), pp. 1035–1038.
  • [49] P. J. Van Fleet, Discrete wavelet transformations: An elementary approach with applications, John Wiley & Sons, 2019.
  • [50] S. Van Huffel and J. Vandewalle, Analysis and properties of the generalized total least squares problem AXBAX\approx B when some or all columns in AA are subject to error, SIAM Journal on Matrix Analysis and Applications, 10 (1989), p. 294.
  • [51] C. F. Van Loan, Generalizing the singular value decomposition, SIAM Journal on Numerical Analysis, 13 (1976), pp. 76–83.
  • [52] C. F. Van Loan, The ubiquitous Kronecker product, Journal of Computational and Applied Mathematics, 123 (2000), pp. 85–100.
  • [53] C. F. Van Loan and N. Pitsianis, Approximation with Kronecker products, Springer, 1993.
  • [54] G. Wang, J. Xu, Z. Pan, and Z. Diao, Ultrasound image denoising using backward diffusion and framelet regularization, Biomedical Signal Processing and Control, 13 (2014), pp. 212–217.
  • [55] X. Wang, M. Che, and Y. Wei, Recurrent neural network for computation of generalized eigenvalue problem with real diagonalizable matrix pair and its applications, Neurocomputing, 216 (2016), pp. 230–241.
  • [56] H. Zha, Computing the generalized singular values/vectors of large sparse or structured matrix pairs, Numerische Mathematik, 72 (1996), pp. 391–417.