Alternating Direction Method of Multipliers Based on $\ell_{2,0}$-norm for Multiple Measurement Vector Problem
Abstract
The multiple measurement vector (MMV) problem is an extension of the single measurement vector (SMV) problem, and it has many applications. Nowadays, most studies of the MMV problem are based on the $\ell_{2,1}$-norm relaxation, which will fail in recovery under some adverse conditions. We propose an alternating direction method of multipliers (ADMM)-based optimization algorithm to achieve a larger undersampling rate for the MMV problem. The key innovation is the introduction of an $\ell_{2,0}$-norm sparsity constraint to describe the joint-sparsity of the MMV problem; this differs from the $\ell_{2,1}$-norm constraint that has been widely used in previous studies. To illustrate the advantages of the $\ell_{2,0}$-norm, we first prove the equivalence of the sparsity of the row support set of a matrix and its $\ell_{2,0}$-norm. Then, the MMV problem based on the $\ell_{2,0}$-norm is proposed. Next, we give our algorithm, called MMV-ADMM-$\ell_{2,0}$, by applying ADMM to the reformulated problem. Moreover, based on the Kurdyka-Łojasiewicz property of the objective functions, we prove that the iteration generated by the proposed algorithm globally converges to the optimal solution of the MMV problem. Finally, the performance of the proposed algorithm and comparisons with other algorithms under different conditions are studied with simulated examples. The results show that the proposed algorithm can solve a larger range of MMV problems even under adverse conditions.
1 Introduction
The multiple measurement vector (MMV) problem [1] has received considerable attention in the field of compressed sensing [2, 3]. In the single measurement vector (SMV) problem, the purpose is to recover a vector $x\in\mathbb{R}^n$ from its measurement $y=Ax\in\mathbb{R}^m$, where $A\in\mathbb{R}^{m\times n}$ and $y$ are given. Because $m<n$, the problem is underdetermined. With the prior information that $x$ is sparse enough [4], we can solve
(1.1) $\min_{x}\ \|x\|_0 \quad \text{s.t.}\quad Ax=y$
to obtain the unique solution, where the $\ell_0$-norm of a vector is the number of its nonzero elements. However, problem (1.1) is NP-hard and is difficult to solve directly. A common approach to overcome this drawback is to relax the $\ell_0$-norm to the $\ell_1$-norm; this is known as basis pursuit [5]:
(1.2) $\min_{x}\ \|x\|_1 \quad \text{s.t.}\quad Ax=y$
Problem (1.2) is convex, implying that it can be solved much more efficiently. By solving problem (1.2) we can obtain the same solution as that for problem (1.1) under suitable conditions [3].
The MMV problem is an extension of the SMV problem. It describes the case in which multiple jointly sparse signals are measured by the same sensing matrix, that is,
$y_i = A s_i,\qquad i=1,\dots,J,$
where all of $s_1,\dots,s_J$ have the same locations of nonzero elements. The MMV problem occurs in many applications including hyperspectral imagery [6], computation of sparse solutions to linear inverse problems [7], neuromagnetic imaging [8], source localization [9], and equalization of sparse communication channels [10]. Given $A$ and $Y=[y_1,\dots,y_J]$, the MMV problem is to recover $S=[s_1,\dots,s_J]$ from $Y=AS$, where only a few rows of $S$ have nonzero elements. For this purpose, the most widely used approach is $\ell_{2,1}$-norm minimization [7, 11, 12]:
(1.3) $\min_{S}\ \|S\|_{2,1} \quad \text{s.t.}\quad AS=Y,$
where the generalized $\ell_{2,1}$-norm of a matrix $S\in\mathbb{R}^{n\times J}$ is defined as
$\|S\|_{2,1}=\sum_{i=1}^{n}\|S_{i,:}\|_2,$
with $S_{i,:}$ denoting the $i$th row of $S$. Owing to the additional joint-sparsity property, both the recovery accuracy and the total computation time for the MMV problem are better than those of the SMV problem [11, 12].
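To make the two matrix norms concrete, the following minimal NumPy sketch computes both of them under the notation reconstructed above; the function names l21_norm and l20_norm are illustrative, not taken from the paper, and the $\ell_{2,0}$ value is simply the count of nonzero rows.

```python
import numpy as np

def l21_norm(S: np.ndarray) -> float:
    """Generalized l_{2,1}-norm: sum of the Euclidean norms of the rows of S."""
    return float(np.sum(np.linalg.norm(S, axis=1)))

def l20_norm(S: np.ndarray, tol: float = 0.0) -> int:
    """l_{2,0} "norm": number of rows of S whose Euclidean norm exceeds tol."""
    return int(np.sum(np.linalg.norm(S, axis=1) > tol))

# Example: a 5 x 3 matrix with two nonzero rows has l20_norm == 2.
S = np.zeros((5, 3))
S[[1, 3]] = np.random.default_rng(0).standard_normal((2, 3))
print(l21_norm(S), l20_norm(S))
```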
In recent decades, several recovery algorithms for the MMV problem have been proposed [11, 7, 13, 14]. Because the joint-sparsity recovery problem is NP-hard, [13] employed a greedy pursuit strategy to recover the signals. As a convex relaxation technique analogous to that used for the SMV problem (1.2), [7, 11, 12] used $\ell_{2,1}$-norm minimization to solve the MMV problem. For hyperspectral imagery applications, [6] applied the alternating direction method of multipliers (ADMM) to solve problem (1.3). Most theoretical results on the relaxation of the SMV problem can be extended to the MMV problem [11]. However, just as (1.2) is a convex relaxation of (1.1) that yields the same solution only under suitable conditions [3], (1.3) is only a convex relaxation of the $\ell_{2,0}$-norm minimization problem for MMV. (1.3) cannot provide an accurate solution under certain unfavorable conditions, whereas the $\ell_{2,0}$-norm minimization problem can. These drawbacks limit the use of previous algorithms.
In this study, instead of considering the widely used convex relaxation ($\ell_{2,1}$-norm minimization), we directly study the original $\ell_{2,0}$-norm minimization in the MMV problem. We first note the equivalence of the joint-sparsity property and the $\ell_{2,0}$-norm in the MMV problem. Then, we reformulate the MMV problem via the sparsity constraint in [11]. Next, we propose our algorithm, called MMV-ADMM-$\ell_{2,0}$, by applying ADMM [15] to the reformulated problem. Theoretical analysis shows that the proposed algorithm globally converges to the unique solution of the MMV problem. Compared with existing algorithms, the proposed algorithm can solve the MMV problem when the signals are not very sparse or the number of sensors is small, and it can achieve a larger undersampling rate.
The rest of this paper is organized as follows: In Section 2, we present some definitions used for the subsequent analysis. In Section 3, we propose the optimization problem by reformulating the MMV problem. In Section 4, we present the proposed MMV-ADMM-$\ell_{2,0}$ algorithm and discuss the subproblems in detail. In Section 5, we present the convergence results obtained with the proposed algorithm. In Section 6, we describe numerical experiments performed to verify the validity of the proposed algorithm and compare it with other algorithms. In Section 7, we present the conclusions of this study.
2 Preliminaries
We list some of the notations and definitions used for the subsequent analysis. For any vector $x\in\mathbb{R}^n$, its sparse support set is defined as
$\mathrm{supp}(x)=\{i: x_i\neq 0\}.$
Note that the assumption of the MMV problem is that all the signal vectors $s_1,\dots,s_J$ have the same sparse support set, implying that
$\mathrm{supp}(s_1)=\mathrm{supp}(s_2)=\cdots=\mathrm{supp}(s_J).$
Definition 2.1.
For a set of vectors $s_1,\dots,s_J$ sharing the same sparse support set, the common sparse support set is defined as
$\mathrm{supp}(s_1,\dots,s_J):=\mathrm{supp}(s_1)=\cdots=\mathrm{supp}(s_J).$
In fact, although it is almost impossible to satisfy the above assumption with real datasets, the MMV problem works well when real data have sufficient joint-sparsity [11].
The following few definitions are used for the subsequent analysis.
Definition 2.2 ([16]).
Let $f:\mathbb{R}^n\to\mathbb{R}\cup\{\pm\infty\}$ be an extended real-valued function.
(i) For a nonempty set $\mathcal{X}\subseteq\mathbb{R}^n$, we call $f$ proper on $\mathcal{X}$ if there exists $x\in\mathcal{X}$ such that $f(x)<+\infty$, and $f(x)>-\infty$ for any $x\in\mathcal{X}$.
(ii) $f$ is a lower semicontinuous function if $\liminf_{x\to x_0} f(x)\ge f(x_0)$ for any $x_0\in\mathbb{R}^n$.
(iii) $f$ is a closed function if its epigraph
$\mathrm{epi}\,f=\{(x,t)\in\mathbb{R}^{n}\times\mathbb{R}: f(x)\le t\}$
is a closed set.
For an extended real-valued function $f$, lower semicontinuity of $f$ is equivalent to $f$ being a closed function.
Definition 2.3 ([17]).
For a proper lower semicontinuous function $f$, a critical point is a point $x$ that satisfies $0\in\partial f(x)$.
In keeping with the first-order optimality condition, if $x^*$ is a minimizer of $f$, it must first be a critical point of $f$.
Definition 2.4 ([18]).
A set $\mathcal{S}\subseteq\mathbb{R}^n$ is a semi-algebraic set if there exist polynomial functions $p_{ij}, q_{ij}:\mathbb{R}^n\to\mathbb{R}$ with $1\le i\le s$, $1\le j\le t$ such that
$\mathcal{S}=\bigcup_{i=1}^{s}\bigcap_{j=1}^{t}\{x\in\mathbb{R}^n: p_{ij}(x)=0,\ q_{ij}(x)<0\}.$
A proper function $f$ is semi-algebraic if its graph
$\{(x,t)\in\mathbb{R}^{n}\times\mathbb{R}: f(x)=t\}$
is a semi-algebraic subset of $\mathbb{R}^{n+1}$.
Semi-algebraic sets and functions have a few useful properties:
(i) Semi-algebraic sets are closed under finite unions, finite intersections, Cartesian products, and images under the projection operator.
(ii) Many common functions are semi-algebraic, including real polynomials, compositions of semi-algebraic functions, and indicator functions of semi-algebraic sets.
The definitions of Gâteaux differentiability, the subdifferential, the Karush-Kuhn-Tucker (KKT) conditions, and the Kurdyka-Łojasiewicz (KL) property can be found elsewhere [19, 17, 16, 18, 20].
The following proposition states the connection between semi-algebraic functions and KL functions.
Proposition 2.1 ([20]).
If a proper lower semicontinuous function is semi-algebraic, it must be a KL function.
3 Problem formulation
First, we state a proposition that is useful to describe the MMV problem in mathematical language.
Proposition 3.1.
For a set of vectors $s_1,\dots,s_J\in\mathbb{R}^n$ with a common sparse support set, denote $S=[s_1,\dots,s_J]$. Then, the sparsity of the common sparse support set of $S$ equals $\|S\|_{2,0}$, the number of nonzero rows of $S$.
Proof.
Denote the common sparse support set of $s_1,\dots,s_J$ as $\mathcal{I}$. $S$ can be expressed by its rows as $S=[S_{1,:};\dots;S_{n,:}]$. Then, we can conclude that
$i\in\mathcal{I} \iff s_{j,i}\neq 0 \text{ for some } j \iff S_{i,:}\neq 0,$
so that $|\mathcal{I}|=\#\{i: S_{i,:}\neq 0\}=\|S\|_{2,0}$.
This completes the proof.
Next, we convert the MMV problem to an optimization problem by using Proposition 3.1.
Assume there are $J$ sensors; the sparse vectors after sparse representation are $s_1,\dots,s_J$, where $s_j\in\mathbb{R}^n$ for all $j$, and we denote $S=[s_1,\dots,s_J]\in\mathbb{R}^{n\times J}$. Assume the sensing matrix is $A\in\mathbb{R}^{m\times n}$ with $m<n$; the measurement vectors are $y_1,\dots,y_J$, where $y_j=As_j\in\mathbb{R}^m$ for all $j$, and we denote $Y=[y_1,\dots,y_J]\in\mathbb{R}^{m\times J}$. By Proposition 3.1, minimizing the joint-sparsity of $s_1,\dots,s_J$ is equivalent to minimizing $\|S\|_{2,0}$. Therefore, the MMV problem can be described as
(3.1) $\min_{S}\ \|S\|_{2,0} \quad \text{s.t.}\quad AS=Y.$
The following theorem offers the prerequisite to make a further conversion of problem (3.1).
Theorem 3.1 ([11]).
If a matrix $S$ satisfies $AS=Y$ and
(3.2) $\|S\|_{2,0}<\dfrac{\mathrm{spark}(A)-1+\mathrm{rank}(Y)}{2},$
then $S$ is the unique solution of problem (3.1).
Based on Theorem 3.1, we do not require all vectors $s_1,\dots,s_J$ to share exactly the same sparse support set in the MMV problem. Instead, we only require that $S$ satisfies (3.2).
To account for the parity of $\mathrm{spark}(A)-1+\mathrm{rank}(Y)$ and the measurement error between $Y$ and the real $AS$, set
(3.3) $k=\left\lfloor\dfrac{\mathrm{spark}(A)+\mathrm{rank}(Y)}{2}\right\rfloor-1,$
where $\lfloor\cdot\rfloor$ is the round-down (floor) operator. Problem (3.1) can be converted to
(3.4) $\min_{S}\ \frac{1}{2}\|AS-Y\|_F^2 \quad \text{s.t.}\quad \|S\|_{2,0}\le k.$
This is a nonconvex constrained optimization problem. The indicator function
$I_{\Omega}(B)=\begin{cases}0, & B\in\Omega,\\ +\infty, & B\notin\Omega,\end{cases}$
of the set
$\Omega=\{B\in\mathbb{R}^{n\times J}: \|B\|_{2,0}\le k\}$
is introduced to move the nonconvex constraint into the objective function, and the matrix $B$ is introduced as an auxiliary variable of $S$ to reformulate problem (3.4) as
(3.5) $\min_{S,B}\ \frac{1}{2}\|AS-Y\|_F^2+I_{\Omega}(B) \quad \text{s.t.}\quad S-B=0.$
Now we have obtained the final problem formulation (3.5) to describe the MMV problem.
4 Algorithm
Problem (3.5) is a two block nonconvex optimization problem with a linear equality constraint; it can be solved using ADMM [15].
The augmented Lagrangian function associated with problem (3.5) is defined as
(4.1) $\mathcal{L}_{\rho}(S,B,L)=\frac{1}{2}\|AS-Y\|_F^2+I_{\Omega}(B)+\langle L,\,S-B\rangle+\frac{\rho}{2}\|S-B\|_F^2,$
where $L\in\mathbb{R}^{n\times J}$ is the Lagrangian multiplier of the equality constraint, $\langle\cdot,\cdot\rangle$ is the inner product of matrices, and $\rho>0$ is the penalty parameter.
By using the ADMM scheme, we can obtain an approximate solution of problem (3.5) by minimizing the variables one by one with others fixed, as follows:
(4.2) $B^{t+1}=\arg\min_{B}\ \mathcal{L}_{\rho}(S^{t},B,L^{t}),$
$S^{t+1}=\arg\min_{S}\ \mathcal{L}_{\rho}(S,B^{t+1},L^{t}),$
$L^{t+1}=L^{t}+\rho\,(S^{t+1}-B^{t+1}).$
Next, the subproblems in Equation (4.2) are solved one by one.
4.1 Update B
Fix $S^{t}$ and $L^{t}$:
(4.3) $B^{t+1}=\arg\min_{B}\ I_{\Omega}(B)+\frac{\rho}{2}\left\|B-\Big(S^{t}+\tfrac{1}{\rho}L^{t}\Big)\right\|_F^2=\Pi_{\Omega}\Big(S^{t}+\tfrac{1}{\rho}L^{t}\Big),$
where $\Pi_{\Omega}$ is the projection operator onto the set $\Omega$.
In fact, $\Pi_{\Omega}$ is the hard thresholding operator [21]. If $\|S^{t}+\frac{1}{\rho}L^{t}\|_{2,0}\le k$, then $B^{t+1}=S^{t}+\frac{1}{\rho}L^{t}$; otherwise, $B^{t+1}$ is obtained by truncation, preserving the $k$ rows with the largest $\ell_2$-norm and setting the remaining rows to zero.
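As an illustration of this row-wise hard thresholding step, the following NumPy sketch projects a matrix onto $\Omega=\{B:\|B\|_{2,0}\le k\}$; the function name and the tie-breaking implied by np.argsort are illustrative choices, not specified in the paper.

```python
import numpy as np

def project_row_sparse(Z: np.ndarray, k: int) -> np.ndarray:
    """Row-wise hard thresholding: project Z onto {B : ||B||_{2,0} <= k} by
    keeping the k rows with the largest l2-norm and zeroing the rest."""
    row_norms = np.linalg.norm(Z, axis=1)
    if np.count_nonzero(row_norms) <= k:
        return Z.copy()                      # already inside the constraint set
    keep = np.argsort(row_norms)[-k:]        # indices of the k largest rows
    B = np.zeros_like(Z)
    B[keep] = Z[keep]
    return B

# In the B-update (4.3), Z would be S^t + L^t / rho.
```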
4.2 Update S
Fix $B^{t+1}$ and $L^{t}$:
(4.4) $S^{t+1}=\arg\min_{S}\ \frac{1}{2}\|AS-Y\|_F^2+\langle L^{t},\,S-B^{t+1}\rangle+\frac{\rho}{2}\|S-B^{t+1}\|_F^2.$
Denote the objective of (4.4) by $h(S)$. The first-order optimality condition for this unconstrained differentiable optimization problem gives $\nabla h(S)=0$.
It is easy to calculate
$\nabla h(S)=A^{\top}(AS-Y)+L^{t}+\rho\,(S-B^{t+1}).$
Therefore, by solving the linear equation $(A^{\top}A+\rho I)S=A^{\top}Y+\rho B^{t+1}-L^{t}$, we obtain
(4.5) $S^{t+1}=(A^{\top}A+\rho I)^{-1}(A^{\top}Y+\rho B^{t+1}-L^{t}).$
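A minimal sketch of the S-update (4.5) under the reconstructed notation ($A$ the $m\times n$ sensing matrix, $Y$ the measurements, $B$ the auxiliary variable, $L$ the multiplier, rho the penalty) might look as follows; in practice the fixed matrix $A^{\top}A+\rho I$ would be factorized once and reused.

```python
import numpy as np

def update_S(A, Y, B, L, rho):
    """Closed-form S-update (4.5): solve (A^T A + rho*I) S = A^T Y + rho*B - L."""
    n = A.shape[1]
    lhs = A.T @ A + rho * np.eye(n)          # constant across iterations; factor once in practice
    rhs = A.T @ Y + rho * B - L
    return np.linalg.solve(lhs, rhs)
```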
4.3 Update L
Fix $S^{t+1}$ and $B^{t+1}$. The Lagrangian multiplier can be updated as
(4.6) $L^{t+1}=L^{t}+\rho\,(S^{t+1}-B^{t+1}).$
4.4 Convergence criterion
The convergence criterion of the proposed algorithm is given below, and the reasons will be discussed in the following section.
Assuming $\{(S^{t},B^{t},L^{t})\}$ is the sequence generated using the ADMM procedure (4.2), the convergence criterion is given as
(4.7) $\|L^{t+1}\|_F\to 0,\qquad \|S^{t+1}-S^{t}\|_F\to 0,\qquad \|S^{t+1}-B^{t+1}\|_F\to 0\qquad (t\to\infty).$
We call our algorithm for solving problem (3.5) MMV-ADMM-$\ell_{2,0}$; it is summarized in Algorithm 1. Section 6 describes the parameter initialization.
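Since Algorithm 1 itself is not reproduced here, the following self-contained NumPy sketch assembles the three updates (4.3)-(4.6) into one loop. The initialization, the default parameter values, and the particular residuals used for stopping are assumptions based on the reconstruction of (4.7) above, not the paper's exact settings.

```python
import numpy as np

def mmv_admm_l20(A, Y, k, rho=1.0, max_iter=500, tol=1e-10):
    """Minimal sketch of MMV-ADMM-l2,0 assembling updates (4.3)-(4.6).
    Stopping follows the reconstructed criterion (4.7): small multiplier,
    small change in S, and small gap between S and B (Frobenius norms)."""
    n = A.shape[1]
    J = Y.shape[1]
    rng = np.random.default_rng(0)
    S = rng.standard_normal((n, J))          # Gaussian random initialization (cf. Section 6.1)
    B = np.zeros((n, J))
    L = np.zeros((n, J))
    lhs = A.T @ A + rho * np.eye(n)          # fixed across iterations; factor once in practice
    AtY = A.T @ Y

    def project(Z):
        # keep the k rows of Z with the largest l2-norm, zero out the rest
        keep = np.argsort(np.linalg.norm(Z, axis=1))[-k:]
        P = np.zeros_like(Z)
        P[keep] = Z[keep]
        return P

    for _ in range(max_iter):
        B = project(S + L / rho)                           # B-update (4.3)
        S_new = np.linalg.solve(lhs, AtY + rho * B - L)    # S-update (4.5)
        L = L + rho * (S_new - B)                          # multiplier update (4.6)
        done = (np.linalg.norm(L) < tol and
                np.linalg.norm(S_new - S) < tol and
                np.linalg.norm(S_new - B) < tol)
        S = S_new
        if done:
            break
    return S
```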
Notably, the matrix $A^{\top}A+\rho I$ is unchanged during the iteration; therefore, its inverse needs to be calculated only once, so Algorithm 1 can be accelerated in an obvious way. Moreover, when updating $S$ by (4.5), we need to calculate the inverse of an $n\times n$ matrix; the computational time required for this purpose is high when $n$ is large. Because $m\ll n$ in compressed sensing, we can use the Sherman-Morrison-Woodbury (SMW) formula [22, 23] to simplify the calculation of the inverse. Specifically, based on the SMW formula, we have
(4.8) $(A^{\top}A+\rho I)^{-1}=\frac{1}{\rho}\Big(I-A^{\top}(\rho I+AA^{\top})^{-1}A\Big).$
This converts the calculation of an $n\times n$ matrix inverse into the inverse of an $m\times m$ matrix. Because $m$ is much smaller than $n$ in compressed sensing, this reduces the computational time. We call our algorithm MMV-ADMM-$\ell_{2,0}$-SMW when using (4.8) to update $S$.
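A hedged sketch of the SMW-accelerated S-update based on (4.8) is given below; only an $m\times m$ system is solved per iteration. The function name is illustrative, and, as with (4.5), the small $m\times m$ matrix is fixed and could be factorized once.

```python
import numpy as np

def update_S_smw(A, Y, B, L, rho):
    """S-update using the SMW identity (4.8):
    (A^T A + rho I)^{-1} = (1/rho) (I - A^T (rho I + A A^T)^{-1} A),
    so only an m x m linear system is solved."""
    m = A.shape[0]
    rhs = A.T @ Y + rho * B - L              # same right-hand side as in (4.5)
    small = rho * np.eye(m) + A @ A.T        # m x m matrix; fixed across iterations
    return (rhs - A.T @ np.linalg.solve(small, A @ rhs)) / rho
```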
5 Convergence analysis
Algorithm 1 is a two-block ADMM for a nonconvex problem; the convergence of such schemes in general remains an open problem. However, owing to the KL property of the objective functions in problem (3.5), Algorithm 1 has global convergence.
First, we prove some properties of objective functions in problem (3.5).
Proposition 5.1.
The $\ell_{2,0}$-norm of a matrix is proper and lower semicontinuous.
Proof.
First, we prove that the $\ell_0$-norm of a vector is lower semicontinuous. Denote $f(x)=\|x\|_0$ for $x\in\mathbb{R}^n$.
For any $x^0\in\mathbb{R}^n$, assume $\{x^j\}$ is a sequence with $x^j\to x^0$.
When $x^0_i\neq 0$, for the corresponding component sequence $x^j_i\to x^0_i$, there exists $N_i$ such that $x^j_i\neq 0$ for all $j>N_i$, according to the sign-preserving property of limits.
Therefore, when $j>\max_{i\in\mathrm{supp}(x^0)}N_i$, every index in $\mathrm{supp}(x^0)$ also belongs to $\mathrm{supp}(x^j)$,
implying $f(x^j)\ge f(x^0)$ for all sufficiently large $j$.
Hence $\liminf_{j\to\infty}f(x^j)\ge f(x^0)$; therefore, $f$ is lower semicontinuous.
For any $S\in\mathbb{R}^{n\times J}$, $S$ can be expressed by its rows as $S=[S_{1,:};\dots;S_{n,:}]$. Denote
$g(S)=\big(\|S_{1,:}\|_2,\dots,\|S_{n,:}\|_2\big)^{\top}\in\mathbb{R}^n.$
Then, the $\ell_{2,0}$-norm of a matrix is the composition of $f$ and $g$, that is, $\|S\|_{2,0}=f(g(S))$.
It is apparent that $g$ is continuous. Combining this with the lower semicontinuity of $f$, we conclude that for any $S^0$ and any sequence $S^j\to S^0$,
$\liminf_{j\to\infty}\|S^j\|_{2,0}=\liminf_{j\to\infty}f(g(S^j))\ge f(g(S^0))=\|S^0\|_{2,0}.$
Therefore, $\|\cdot\|_{2,0}$ is lower semicontinuous.
Apparently, the $\ell_{2,0}$-norm is proper. Therefore, it is proper and lower semicontinuous.
Based on the equivalence of lower semicontinuous functions and closed functions, Corollary 5.1 holds immediately.
Corollary 5.1.
The $\ell_{2,0}$-norm of a matrix is a closed function.
Corollary 5.2.
The indicator function $I_{\Omega}$ of the set $\Omega=\{B:\|B\|_{2,0}\le k\}$ is also proper and lower semicontinuous.
Proof.
We show that $I_{\Omega}$ is a closed function. In fact, for any sequence $\{(B^j,t^j)\}\subset\mathrm{epi}\,I_{\Omega}$ with $(B^j,t^j)\to(B,t)$, we have $I_{\Omega}(B^j)\le t^j$. According to Definition 2.2, we only need to prove that if $(B^j,t^j)\to(B,t)$, then $(B,t)\in\mathrm{epi}\,I_{\Omega}$, that is, $I_{\Omega}(B)\le t$.
Because $t^j\to t<+\infty$ and $I_{\Omega}$ takes only the values $0$ and $+\infty$, for $j$ large enough we have $I_{\Omega}(B^j)=0$, i.e., $B^j\in\Omega$ and $\|B^j\|_{2,0}\le k$. Through Corollary 5.1, we know that the $\ell_{2,0}$-norm of a matrix is a closed, hence lower semicontinuous, function. Therefore,
$\|B\|_{2,0}\le\liminf_{j\to\infty}\|B^j\|_{2,0}\le k,$
implying that $B\in\Omega$ and $I_{\Omega}(B)=0\le t$ (note that $t=\lim_{j}t^j\ge 0$). Therefore, $(B,t)\in\mathrm{epi}\,I_{\Omega}$.
Further, $\mathrm{epi}\,I_{\Omega}$ is closed, which proves that $I_{\Omega}$ is closed. It is apparent that $I_{\Omega}$ is proper. Therefore, $I_{\Omega}$ is also proper and lower semicontinuous.
Proposition 5.2.
The $\ell_{2,0}$-norm of a matrix is a KL function.
Proof.
[20] proved that both $\|x\|_0$ and $\|x\|_p$ with rational $p>0$ are semi-algebraic. Denote the functions $f$ and $g$ in the same way as in the proof of Proposition 5.1. For any $S\in\mathbb{R}^{n\times J}$, expressed by its rows as $S=[S_{1,:};\dots;S_{n,:}]$, the map
$g(S)=\big(\|S_{1,:}\|_2,\dots,\|S_{n,:}\|_2\big)^{\top}$
is a Cartesian product of $\ell_2$-norms. Because semi-algebraicity is stable under Cartesian products, $g$ is also semi-algebraic. Therefore, the composition $\|S\|_{2,0}=f(g(S))$ is semi-algebraic, and by Propositions 5.1 and 2.1 it is a KL function.
Corollary 5.3.
The indicator function $I_{\Omega}$ of the set $\Omega$ is also a KL function.
Proof.
First, we prove that $\Omega$ is semi-algebraic. Because $\|B\|_{2,0}$ takes only the integer values $0,1,\dots,n$, $\Omega$ can be expressed as
$\Omega=\{B\in\mathbb{R}^{n\times J}:\|B\|_{2,0}\le k\}=\bigcup_{r=0}^{k}\{B:\|B\|_{2,0}=r\}.$
Proposition 5.2 proves that the $\ell_{2,0}$-norm is a semi-algebraic function. Therefore, we conclude through Definition 2.4 that its graph
$G=\{(B,t)\in\mathbb{R}^{n\times J}\times\mathbb{R}:\|B\|_{2,0}=t\}$
is a semi-algebraic set.
Denote the projection operator
$P:(B,t)\mapsto B.$
Then,
$\{B:\|B\|_{2,0}=r\}=P\big(G\cap(\mathbb{R}^{n\times J}\times\{r\})\big),\qquad r=0,1,\dots,k.$
Because the semi-algebra is stable under finite unions and the projection operator $P$, $\Omega$ is semi-algebraic. Consequently, the indicator function $I_{\Omega}$ of the semi-algebraic set $\Omega$ is semi-algebraic; combining Corollary 5.2 and Proposition 2.1, $I_{\Omega}$ is a KL function.
In [24] the authors discuss the following optimization problem:
(5.1) $\min_{x,y}\ f(x)+g(y) \quad \text{s.t.}\quad Mx+Ny=b,$
where $g$ is proper and lower semicontinuous, and $f$ is gradient-Lipschitz continuous, that is, $\nabla f$ is Lipschitz continuous with some constant $\ell>0$.
The authors make the following assumptions:
(i) The two minimization subproblems of the corresponding ADMM scheme (cf. (4.2)) possess solutions;
(ii) the penalty parameter $\rho$ is chosen sufficiently large relative to the Lipschitz constant $\ell$;
(iii) the constraint matrices satisfy $N^{\top}N\succeq\sigma I$ for some $\sigma>0$.
Set $x=S$, $y=B$, $f(x)=\frac{1}{2}\|AS-Y\|_F^2$, $g(y)=I_{\Omega}(B)$, $M=I$, $N=-I$, and $b=0$ in (5.1); this gives problem (3.5). It is apparent that problem (3.5) satisfies the form of (5.1) and the above assumptions.
Next, we refer to the sufficient conditions in [24] to guarantee that the sequence generated by Algorithm 1 is bounded.
Lemma 5.1 ([24]).
By applying the classic ADMM to (5.1), we obtain a sequence $\{(x^t,y^t,\lambda^t)\}$. Suppose that
(5.2) $\inf_{x} f(x)>-\infty \quad\text{and}\quad \inf_{y} g(y)>-\infty.$
Then $\{(x^t,y^t,\lambda^t)\}$ is bounded as long as one of the following two statements holds:
(i) $f$ and $g$ are both coercive;
(ii) $N$ is invertible and $\mathrm{Im}(M)\subseteq\mathrm{Im}(N)$.
The main result in [24] is shown below.
Theorem 5.1 ([24]).
Suppose that the above assumptions hold, the sequence $\{(x^t,y^t,\lambda^t)\}$ generated by the ADMM is bounded, and $f$ and $g$ are KL functions. Then the sequence globally converges to a KKT point of problem (5.1).
As an application of the above theorem, Algorithm 1 has global convergence.
Theorem 5.2.
The sequence $\{(S^t,B^t,L^t)\}$ generated by Algorithm 1 is bounded and globally converges to a KKT point of problem (3.5).
Proof.
First, we prove that $\{(S^t,B^t,L^t)\}$ is bounded.
For $g=I_{\Omega}$, we have $I_{\Omega}(B)\ge 0$ and $\inf_{B}I_{\Omega}(B)=0>-\infty$. For any $S$, $f(S)=\frac{1}{2}\|AS-Y\|_F^2\ge 0$.
Therefore, (5.2) is satisfied. Moreover, because the constraint matrices of (3.5) are $M=I$ and $N=-I$, it is not hard to see that (ii) in Lemma 5.1 holds. Therefore, $\{(S^t,B^t,L^t)\}$ is bounded.
Corollary 5.3 proves that $I_{\Omega}$ is a KL function. Because $\frac{1}{2}\|AS-Y\|_F^2$ is a real polynomial, it is also a KL function. Applying Theorem 5.1 completes the proof.
Finally, we state the theorem that justifies the choice of the convergence criterion (4.7).
Theorem 5.3.
The sequence $\{(S^t,B^t,L^t)\}$ generated by Algorithm 1 converges to the optimal solution of problem (3.5) if and only if the convergence criterion (4.7) is satisfied.
Proof.
In [25], the authors note that for the following problem
(5.3) $\min_{x}\ f(x) \quad \text{s.t.}\quad x\in P,\ \ \|x\|_0\le k,$
where $P$ denotes a convex polyhedron, the optimal solution must also satisfy the KKT conditions.
It is easy to see that problem (3.4) satisfies the form of (5.3). Because problem (3.5) is equivalent to problem (3.4), the optimal point of (3.5) is also its KKT point.
Therefore, if $\{(S^t,B^t,L^t)\}$ converges to the optimal solution $(S^*,B^*,L^*)$, then $(S^*,B^*,L^*)$ is a KKT point of (3.5). The Lagrangian function of (3.5) is
(5.4) $\mathcal{L}(S,B,L)=\frac{1}{2}\|AS-Y\|_F^2+I_{\Omega}(B)+\langle L,\,S-B\rangle,$
and it satisfies the KKT conditions at $(S^*,B^*,L^*)$. Therefore,
(5.5) $A^{\top}(AS^*-Y)+L^*=0,$
$0\in\partial I_{\Omega}(B^*)-L^*,$
$S^*-B^*=0.$
From $B^*\in\Omega$ and the third formula of (5.5), we can conclude that $\|S^*\|_{2,0}\le k$.
Denote $h$ in the same way as in Subsection 4.2. When updating $S$, we have
$\nabla h(S^{t+1})=A^{\top}(AS^{t+1}-Y)+L^{t}+\rho\,(S^{t+1}-B^{t+1})=0.$
Let $L^{t+1}=L^{t}+\rho\,(S^{t+1}-B^{t+1})$. We obtain $A^{\top}(AS^{t+1}-Y)+L^{t+1}=0$, implying that the first formula of (5.5) is satisfied in Algorithm 1 without any further conditions.
From the rule of updating $B$ in (4.3), we know that $B^{t}\in\Omega$ is always satisfied during the iterations of Algorithm 1. According to Corollary 5.2, $I_{\Omega}$ is proper and lower semicontinuous. Thus, every accumulation point of $\{B^{t}\}$ also lies in $\Omega$.
Therefore, when restricting attention to the iteration points generated by Algorithm 1 and their accumulation points, we have $I_{\Omega}\equiv 0$. Thus, $\partial I_{\Omega}(B^*)=\{0\}$. Therefore, the second formula of (5.5) holds if and only if $L^*=0$, which is equivalent to $\|L^{t}\|_F\to 0$.
Thus far, we have proved that the KKT conditions (5.5) are equivalent to the first and third formulas in (4.7). From the global convergence of the sequence established in Theorem 5.2, we have $\|S^{t+1}-S^{t}\|_F\to 0$, which is the second formula in (4.7). Therefore, the optimal point must satisfy the convergence criterion (4.7). This completes the necessity.
For sufficiency, the convergence criterion for a generalized constrained optimization problem [16] is that the distance between adjacent iteration points tends to zero and the KKT conditions are nearly satisfied at the current iteration point. We have proved that the first and third formulas in (4.7) are equivalent to the KKT conditions; therefore, they ensure that the KKT conditions are satisfied. Combining this with the second formula in (4.7), which implies that the distance between adjacent iteration points tends to zero, we know that the convergence criterion (4.7) guarantees convergence to a KKT point of (3.5). Denote the KKT point as $(S^*,B^*,L^*)$; we now show that it is also the optimal point of (3.5).
In fact, at the KKT point $(S^*,B^*,L^*)$, we have proved that $L^*=0$ and that (5.5) is satisfied. Therefore, from the first formula of (5.5), we can conclude that
(5.6) $A^{\top}(AS^*-Y)=0.$
The sensing matrix $A$ in the MMV problem is known to have full row rank. Thus, from (5.6), we know that $AS^*=Y$. We have proved $\|S^*\|_{2,0}\le k$, implying that $S^*$ satisfies (3.2). Because $S^*=B^*$, $B^*$ satisfies it too. Therefore, the KKT point satisfies the conditions of Theorem 3.1. From Theorem 3.1, $S^*$ is the unique optimal point of (3.1). According to the equivalence of (3.1) and (3.5), we can conclude that $(S^*,B^*)$ is also the unique optimal point of (3.5).
Therefore, when the convergence criterion is satisfied, Algorithm 1 converges to the optimal solution of (3.5).
This completes the proof.
In fact, it is not hard to find that all of the discussions above remain valid when we replace the $\ell_{2,0}$-norm with the $\ell_{p,0}$-norm for rational $p\ge 1$, implying that the proposed algorithm is also applicable to the $\ell_{p,0}$-norm case. However, for convenience of calculation, we consider only the $\ell_{2,0}$-norm case.
6 Numerical simulations
We designed numerical experiments to verify the validity of the proposed algorithm and theorems, and then explored the performance and advantages of our MMV-ADMM-$\ell_{2,0}$ by comparing it with existing algorithms.
6.1 Experiment setup
All experiments were based on synthetic data and were performed on a PC with an Intel Core i7-8565U 1.8 GHz CPU and 20 GB RAM. The results in Subsections 6.4-6.7 are averaged over 100 independent experiments. Our datasets are generated as follows: the sensing matrix $A$ is an $m\times n$ Gaussian random matrix with unit-norm columns, and the ground truth $S^{\mathrm{true}}$ of the recovery problem is generated in two steps. First, $r$ rows were randomly selected as nonzero rows, with the remaining rows all being zero. Then, the elements of the selected rows were independently generated from the standard Gaussian distribution. The solution obtained from the different recovery algorithms is denoted as $\hat{S}$. The root-mean-square error (RMSE), given as
$\mathrm{RMSE}=\frac{\|\hat{S}-S^{\mathrm{true}}\|_F}{\sqrt{nJ}},$
was used to compare the recovery quality. The average running time was used to measure the computational cost. Recovery is considered successful when the RMSE falls below a small fixed threshold.
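The following NumPy sketch generates synthetic MMV data in the way described above and evaluates the RMSE; the normalization by $\sqrt{nJ}$ and the use of a standard Gaussian for the nonzero entries are assumptions consistent with the text, and the function names are illustrative.

```python
import numpy as np

def make_mmv_data(m, n, J, r, seed=0):
    """Synthetic MMV data: Gaussian sensing matrix with unit-norm columns and a
    ground truth with r nonzero rows whose entries are standard Gaussian."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    A /= np.linalg.norm(A, axis=0)           # normalize columns
    S_true = np.zeros((n, J))
    rows = rng.choice(n, size=r, replace=False)
    S_true[rows] = rng.standard_normal((r, J))
    return A, S_true, A @ S_true             # Y = A @ S_true

def rmse(S_hat, S_true):
    """Root-mean-square error between recovered and true signal matrices."""
    return np.linalg.norm(S_hat - S_true) / np.sqrt(S_true.size)
```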
For comparison, we test SOMP [13] and MFOCUSS [7], both of which are traditional algorithms that show good performance for the MMV problem. We also include the MMV-SPG algorithm based on the $\ell_{2,1}$-norm, which is derived from a solver for large-scale sparse reconstruction called SPGL1 [26], to solve the MMV problem. Because the projection operator in (4.3) is a hard thresholding operator, we also check SNIHT [14] for the MMV problem. As the most direct comparison, MMV-ADMM-$\ell_{2,1}$ [6] is an algorithm with the same ADMM scheme but based on the $\ell_{2,1}$-norm. We include it in the experiments to compare the change in performance after the $\ell_{2,1}$-norm is replaced by the $\ell_{2,0}$-norm.
The parameters of all algorithms are set as shown in Table 1. For MMV-ADMM-$\ell_{2,1}$ and MMV-ADMM-$\ell_{2,0}$, $S$ and $L$ are initialized as an $n\times J$ Gaussian random matrix and a zero matrix, respectively. For MMV-ADMM-$\ell_{2,0}$, if we have the prior information that the joint-sparsity of $S^{\mathrm{true}}$ is about $r$, then we set $k$ to be slightly larger than $r$. Otherwise, we use (3.3) to initialize $k$. The spark of the matrix $A$ is not easy to calculate accurately; however, the estimation given in [4] is enough for the proposed algorithm.
6.2 Validity of MMV-ADMM-$\ell_{2,0}$
In the first experiment, we observed the recovery quality of our MMV-ADMM-$\ell_{2,0}$. MMV-ADMM-$\ell_{2,0}$ was applied without a convergence criterion (that is, it iterates up to the maximum number of iterations) on datasets of a fixed size over 10 random repetitions. As shown in Table 2, the RMSE of MMV-ADMM-$\ell_{2,0}$ is close to machine precision, and the running time is less than 1 s; this indicates that the proposed algorithm can recover the sparse signals precisely and efficiently.
Data | RMSE | Time(s) |
---|---|---|
1 | 8.9904e-16 | 0.6610 |
2 | 9.1155e-16 | 0.6290 |
3 | 9.1562e-16 | 0.5604 |
4 | 8.4978e-16 | 0.5965 |
5 | 9.1531e-16 | 0.6480 |
6 | 9.0638e-16 | 0.6051 |
7 | 9.9356e-16 | 0.5715 |
8 | 9.1655e-16 | 0.5679 |
9 | 7.7866e-16 | 0.6154 |
10 | 7.6170e-16 | 0.6352 |
Mean | 8.8484e-16 | 0.6090 |
Std | 6.9661e-17 | 0.0350 |
6.3 Test for convergence criterion
In the second experiment, we tested the convergence criterion (4.7) given by Theorem 5.3. The aim was to observe the average change of the three quantities in the convergence criterion (4.7) over the iterations, averaged over 10 random repetitions. The average change of the convergence criterion over 200 iterations is shown in Figure 1. All three quantities in (4.7) tend to zero and reach a very small order of magnitude well within the 200 iterations; this agrees with our analysis in Theorem 5.3.
Then, we studied the influence of the convergence criterion (4.7) on our algorithm. When all three quantities in the convergence criterion (4.7) were less than a prescribed tolerance, the proposed algorithm was considered to have found an approximate optimal solution, and the iteration was stopped. As shown in Table 3, with the convergence criterion, the proposed algorithm can successfully recover the sparse signals in a short time.
In the following experiments, we applied MMV-ADMM-$\ell_{2,0}$ with the proposed convergence criterion.
data | RMSE (without criterion) | Time(s) (without criterion) | RMSE (with criterion) | Time(s) (with criterion)
---|---|---|---|---
1 | 8.2155e-16 | 0.5765 | 9.4005e-10 | 0.1560 |
2 | 9.0799e-16 | 0.5193 | 7.4102e-10 | 0.1179 |
3 | 7.6515e-16 | 0.4940 | 6.9234e-10 | 0.1116 |
4 | 8.4516e-16 | 0.5109 | 8.8110e-10 | 0.1132 |
5 | 8.5083e-16 | 0.5278 | 7.4768e-10 | 0.1269 |
6 | 9.4163e-16 | 0.5697 | 8.0561e-10 | 0.1508 |
7 | 8.2227e-16 | 0.5152 | 6.5987e-10 | 0.1108 |
8 | 8.9286e-16 | 0.5053 | 7.4714e-10 | 0.1069 |
9 | 7.6474e-16 | 0.4863 | 8.1573e-10 | 0.0949 |
10 | 8.0768e-16 | 0.4796 | 7.8128e-10 | 0.1024 |
Mean | 8.4199e-16 | 0.5185 | 7.8118e-10 | 0.1191 |
Std | 5.8515e-17 | 0.0324 | 8.4008e-11 | 0.0200 |
6.4 Performance with different sparsity
In the third experiment, we studied how the sparsity influenced the recovery quality of all these algorithms. With the other problem dimensions fixed, we let the number of nonzero rows $r$ range from 25 to 150 with a step size of 25 and observed how the percentage of successful recovery changes with $r$ for the different algorithms. The experimental results are shown in Figure 2.
It is not hard to see that all these algorithms fail to recover the original signals when the sparsity of the signals is too large. However, when the sparsity is moderate, the proposed algorithm successfully recovers all signals over the 100 experiments, whereas the other algorithms perform poorly. In addition, in the cases of successful recovery, all these algorithms are fast enough. Therefore, we can conclude that, for the MMV problem, the proposed algorithm is more suitable when the original signals are not sparse enough.
6.5 Performance with different undersampling rate
The main purpose of compressed sensing is to reduce the number of measurements while ensuring good recovery quality. To illustrate that the proposed algorithm performs better at a larger undersampling rate than the other algorithms, we studied the percentage of successful recovery under different undersampling rates. With the other settings fixed, we let the undersampling rate range from 1.6 to 8 with a step size of 1.6.
As shown in Figure 3, when the undersampling rate is too large, all algorithms fail to recover the original signals. However, at a somewhat smaller (but still large) undersampling rate, the proposed algorithm can still recover all signals, whereas MFOCUSS recovers only approximately 94% of the signals and the other algorithms perform poorly. With regard to the running time, we only need to consider the cases of successful recovery, in which the proposed algorithm shows average performance among all algorithms. This implies that the proposed algorithm tolerates a larger undersampling rate than the other algorithms, which is the key goal of compressed sensing.
6.6 Performance with different number of sensors
In this experiment, we studied how the recovery quality was affected by the number of sensors. With the other settings fixed, we let the number of sensors range from 1 to 32, doubling it at each step.
Figure 4 shows that when the number of sensors is 1 (that is, an SMV problem), none of the algorithms can successfully recover all signals. However, the proposed algorithm can recover approximately 82% of the signals, whereas the other algorithms perform poorly. With only a few sensors, the proposed algorithm can already successfully recover all signals, whereas the other algorithms require considerably more sensors to do so. The running time of the proposed algorithm is about the average of those of the other algorithms, all of which are fast enough in the cases of successful recovery. This experiment shows that the proposed algorithm performs well even with a small number of sensors, whereas the other algorithms do not.
6.7 Performance with different sample dimensions
We tested the recovery quality of our algorithm for varying signal dimensions. We fixed the other settings and tested two different dimension settings.
Figure 5 shows that MFOCUSS, MMV-ADMM-$\ell_{2,1}$, and the proposed algorithm all perform well irrespective of how the dimension changes. MFOCUSS and the proposed algorithm have similar running times; however, the running time of MMV-ADMM-$\ell_{2,1}$ is higher when the dimension $n$ is large. This experiment shows that the proposed algorithm is also suitable for problems with a large dimension.
6.8 Performance with the SMW formula
Finally, we designed an experiment to illustrate the advantage of MMV-ADMM-$\ell_{2,0}$ with the SMW formula when the dimension $n$ is large. We chose a large problem size and observed the RMSE and running time over 10 random experiments.
As shown in Table 4, with the SMW formula, MMV-ADMM-$\ell_{2,0}$-SMW requires less time to obtain solutions with the same precision as MMV-ADMM-$\ell_{2,0}$.
data | MMV-ADMM-$\ell_{2,0}$ RMSE | MMV-ADMM-$\ell_{2,0}$ Time(s) | MMV-ADMM-$\ell_{2,0}$-SMW RMSE | MMV-ADMM-$\ell_{2,0}$-SMW Time(s)
---|---|---|---|---
1 | 2.5689e-10 | 12.5324 | 2.3941e-10 | 11.5070 |
2 | 2.6148e-10 | 13.4957 | 2.4949e-10 | 12.0098 |
3 | 2.5017e-10 | 13.3999 | 2.6850e-10 | 12.7097 |
4 | 2.5103e-10 | 13.8419 | 2.5707e-10 | 12.2158 |
5 | 2.4778e-10 | 13.6656 | 2.4574e-10 | 12.1301 |
6 | 2.4846e-10 | 13.5132 | 2.5370e-10 | 12.0958 |
7 | 2.4829e-10 | 13.5649 | 2.4438e-10 | 12.4199 |
8 | 2.5644e-10 | 14.2626 | 2.5308e-10 | 12.4762 |
9 | 2.6028e-10 | 13.6527 | 2.5655e-10 | 12.3948 |
10 | 2.4507e-10 | 13.8025 | 2.5998e-10 | 12.3597
Mean | 2.5259e-10 | 13.5732 | 2.5279e-10 | 12.2319 |
Std | 5.7252e-12 | 0.4394 | 8.4480e-12 | 0.3283 |
7 Conclusions
In this study, we propose MMV-ADMM-$\ell_{2,0}$, an ADMM algorithm based on the $\ell_{2,0}$-norm for the MMV problem. Instead of using the $\ell_{2,1}$-norm relaxation, we directly solve the $\ell_{2,0}$-norm minimization problem; this is the main difference between our study and other works [7, 11, 12]. We prove that the proposed algorithm has global convergence and determine its convergence criterion (4.7). Through numerical simulations, we test the validity of the proposed algorithm and its convergence criterion. The experimental results are consistent with Theorems 5.2 and 5.3. In comparisons against other algorithms, MMV-ADMM-$\ell_{2,0}$ shows better performance with high sparsity and few sensors, and it is insensitive to the signal dimension. Moreover, MMV-ADMM-$\ell_{2,0}$ tolerates a larger undersampling rate than the other algorithms, especially MMV-ADMM-$\ell_{2,1}$, which is the key goal of compressed sensing.
The proposed algorithm has some drawbacks. Although its running time is roughly the average of those of the other algorithms, its efficiency remains lower than that of SOMP and SNIHT. We propose MMV-ADMM-$\ell_{2,0}$-SMW to reduce the computational cost, and it runs faster; however, this improvement is not significant until the dimension becomes large.
The research in this paper can be taken further in two directions. First, Algorithm 1 can be extended to the generalized JSM-2 problem, in which multiple vectors are measured by different sensing matrices; the derivation of the algorithm for this case is nearly the same as the discussion in Sections 3 and 4 of this paper. Second, Algorithm 1 can be accelerated even further. The most time-consuming step in its iterations is updating $S$ by (4.5), which amounts to solving a high-dimensional positive-definite linear system. Drawing on recent advances in numerical linear algebra, $S$ can be updated more efficiently, so that Algorithm 1 runs faster. In future research, we will focus on these two points: developing a more efficient algorithm for the MMV problem and extending it to the generalized JSM-2 problem.
References
- [1] Ewout van den Berg and Michael P. Friedlander. Theoretical and empirical results for recovery from multiple measurements. IEEE Transactions on Information Theory, 56(5):2516–2527, 2010.
- [2] D.L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.
- [3] E.J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, 2006.
- [4] David L. Donoho and Michael Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell^1$ minimization. Proceedings of the National Academy of Sciences of the United States of America, 100:2197–2202, 2003.
- [5] Scott Shaobing Chen, David L. Donoho, and Michael A. Saunders. Atomic decomposition by basis pursuit. SIAM J. Sci. Comput., 20:33–61, 1998.
- [6] Qing Qu, Nasser M. Nasrabadi, and Trac D. Tran. Abundance estimation for bilinear mixture models via joint sparse and low-rank representation. IEEE Transactions on Geoscience and Remote Sensing, 52(7):4404–4423, 2014.
- [7] Shane F. Cotter, Bhaskar D. Rao, Kjersti Engan, and Kenneth Kreutz-Delgado. Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Transactions on Signal Processing, 53:2477–2488, 2005.
- [8] Irina F. Gorodnitsky, John S. George, and B.D. Rao. Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroencephalography and Clinical Neurophysiology, 95(4):231–251, 1995.
- [9] Dmitry M. Malioutov, Müjdat Çetin, and Alan S. Willsky. Source localization by enforcing sparsity through a Laplacian prior: an SVD-based approach. IEEE Workshop on Statistical Signal Processing, 2003, pages 573–576, 2004.
- [10] S.F. Cotter and B.D. Rao. Sparse channel estimation via matching pursuit with application to equalization. IEEE Transactions on Communications, 50(3):374–377, 2002.
- [11] Jie Chen and Xiaoming Huo. Theoretical results on sparse representations of multiple-measurement vectors. IEEE Transactions on Signal Processing, 54(12):4634–4643, 2006.
- [12] Yonina C. Eldar and Moshe Mishali. Robust recovery of signals from a structured union of subspaces. IEEE Transactions on Information Theory, 55(11):5302–5316, 2009.
- [13] Joel A. Tropp, Anna C. Gilbert, and Martin Strauss. Algorithms for simultaneous sparse approximation. part i: Greedy pursuit. Signal Process., 86:572–588, 2006.
- [14] Jeffrey D. Blanchard, Michael Cermak, David Hanle, and Yirong Jing. Greedy algorithms for joint sparse recovery. IEEE Transactions on Signal Processing, 62(7):1694–1704, 2014.
- [15] Stephen P. Boyd, Neal Parikh, Eric King wah Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3:1–122, 2011.
- [16] Wen Zaiwen, Hu Jiang, Li Yongfeng, and Liu Haoyang. Optimization: Modeling, Algorithm and Theory(in Chinese). Higher Education Press, 2020.
- [17] Stephen P. Boyd and Lieven Vandenberghe. Convex optimization. IEEE Transactions on Automatic Control, 51:1859–1859, 2004.
- [18] Hédy Attouch, Jérôme Bolte, and Benar F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Mathematical Programming, 2013.
- [19] Boris S. Mordukhovich. Variational analysis and generalized differentiation I: Basic theory. Grundlehren der mathematischen Wissenschaften. Springer Berlin, Heidelberg, 2006.
- [20] Jérôme Bolte, Shoham Sabach, and Marc Teboulle. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146:459–494, 2014.
- [21] Simon Foucart. Hard thresholding pursuit: An algorithm for compressive sensing. SIAM J. Numer. Anal., 49:2543–2563, 2011.
- [22] Jack Sherman and Winifred J. Morrison. Adjustment of an inverse matrix corresponding to a change in one element of a given matrix. Annals of Mathematical Statistics, 21:124–127, 1950.
- [23] M Woodbury. Inverting Modified Matrices. Statistical Research Group, Princeton University, Princeton, 1950.
- [24] Ke Guo, Deren Han, and Tingting Wu. Convergence of alternating direction method for minimizing sum of two nonconvex functions with linear constraints. International Journal of Computer Mathematics, 94:1653 – 1669, 2017.
- [25] Oleg P. Burdakov, Christian Kanzow, and Alexandra Schwartz. Mathematical programs with cardinality constraints: Reformulation by complementarity-type conditions and a regularization method. SIAM J. on Optimization, 26(1):397–425, jan 2016.
- [26] Ewout van den Berg and Michael P. Friedlander. Sparse optimization with least-squares constraints. SIAM J. Optim., 21:1201–1229, 2011.