
Covariance-Based Cooperative Activity Detection for Massive Grant-Free Random Access

Xiaodan Shao$^{1}$, Xiaoming Chen$^{1}$, Derrick Wing Kwan Ng$^{2}$, Caijun Zhong$^{1}$, and Zhaoyang Zhang$^{1}$
$^{1}$College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
$^{2}$School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, Australia
E-mails: {shaoxiaodan, chen_xiaoming, caijunzhong, ning_ming}@zju.edu.cn, w.k.ng@unsw.edu.au
Abstract

This paper designs a cooperative activity detection framework for massive grant-free random access in sixth-generation (6G) cell-free wireless networks based on the covariance of the received signals at the access points (APs). In particular, multiple APs cooperatively detect the device activity by exchanging only low-dimensional intermediate local information with their neighbors. The cooperative activity detection problem is non-smooth and its unknown variables are coupled with each other, so conventional approaches are inapplicable. Therefore, this paper proposes a covariance-based algorithm that exploits sparsity-promoting and similarity-promoting terms of the device state vectors among neighboring APs. An approximate splitting approach based on the proximal gradient method is proposed for solving the formulated problem. Simulation results show that the proposed algorithm is efficient for large-scale activity detection problems while requiring shorter pilot sequences than state-of-the-art algorithms to achieve the same system performance.

Index Terms:
Cooperative activity detection, massive access, 6G cell-free wireless networks, covariance-based detection.

I Introduction

The massive machine-type communications (mMTC), which is a typical application scenario for 6G wireless networks, aims to meet the demand for massive connectivity for hundreds of billions of Internet-of-Things (IoT) devices. For the massive access, conventional grant-based random access schemes lead to an exceedingly long access latency and a prohibitive signaling overhead. To this end, grant-free random access schemes have been considered as a promising candidate technique for realizing 6G cellular IoT [1], where active devices transmit their data signals without obtaining a grant from the base station (BS) after sending pre-assigned pilot sequences. Hence, the key to grant-free random access is active device detection at the BS based on the received pilot sequences [2].

Inspired by the sporadic characteristics of IoT applications, several compressed sensing-based approaches have been proposed to detect active devices for grant-free random access systems. For instance, in [3] and [4], approximate message passing (AMP) algorithms were designed for activity detection in different scenarios by exploiting the statistics of wireless channels. However, the AMP algorithms have high computational complexity. As a result, the authors in [5] proposed a low-complexity dimension reduction-based algorithm, which projects the original device state matrix onto a low-dimensional space by exploiting its sparse and low-rank structure. Note that the approaches in [3]-[5] perform activity detection based on the instantaneous received signals. Recently, a covariance-based algorithm has been proposed in [6] to improve the performance of device activity detection, where the detection problem was solved by a coordinate descent algorithm with random sampling. In general, these algorithms, e.g., [3]-[6], exploit the sparsity structure of the device state matrix and enjoy reasonable detection performance. However, due to the large number of devices and the limited radio resources available for massive access in 6G networks, active device detection remains a challenging problem.

To overcome this challenge, multi-cell massive access with multiple APs has been applied to the problem of active device detection. For example, multi-cell sparse activity detection was proposed in [7], where each AP operates independently to perform activity detection and channel estimation for the devices distributed in its own cell by treating the inter-cell interference as noise. In fact, if the APs can jointly process the pilot sequences received from the devices in neighboring APs, the detection performance can be further improved even with only short pilot sequences. Motivated by this fact, this paper considers a 6G cell-free wireless network, where multiple APs are deployed in a vast area to serve all devices located in this area [8, 9]. In particular, cooperative activity detection among the APs requires extra information exchanges in the system. To reduce the associated signaling overhead, this paper designs a scalable and computationally efficient algorithm to detect the active devices, which is reliable and robust to AP and/or backhaul link failures and to variations in channel statistics. The main contributions of this paper are as follows:

  1. The paper proposes a novel cooperative activity detection framework for grant-free random access in 6G cell-free wireless networks based on the covariance of the received signals.

  2. This paper proposes a cooperative massive detection (CMD) algorithm by exploiting the special characteristic of the device state vectors of interest among the neighboring APs, namely joint similarity and sparsity.

  3. This paper analyzes the computational complexity and the communication cost of the proposed CMD algorithm, which shows its effectiveness in 6G cell-free wireless networks.

II System Model

Consider a 6G cell-free wireless network comprising $B$ APs. The APs are equipped with $M$ antennas each, serving $N$ uniformly distributed single-antenna IoT devices in a vast area. Each AP is connected to several adjacent APs via backhaul links and, to reduce the communication load, can only communicate with its one-hop neighbors, as shown in Fig. 1. Due to the bursty nature of IoT applications, only a fraction of the IoT devices are active in any given time slot. Let $|\cdot|_{c}$ denote the cardinality of a set. We use $\mathcal{K}$ to denote the set of active devices, with $K=|\mathcal{K}|_{c}\ll N$ being the number of active devices. For convenience, we define $\chi_{n}$ as the binary activity indicator, with $\chi_{n}=1$ if the $n$th device is active and $\chi_{n}=0$ otherwise. Moreover, we represent the $M$-dimensional channel vector from the $n$th device to the $b$th AP as $\sqrt{g_{b,n}}\mathbf{h}_{b,n}$, where $g_{b,n}$ is the large-scale fading component, which depends on the device's location, and $\mathbf{h}_{b,n}\in\mathbb{C}^{M}$ is the small-scale fading vector, whose entries are independent and identically distributed (i.i.d.) complex Gaussian with zero mean and unit variance.

Figure 1: Illustration of a 6G cell-free wireless network with multiple APs.

The grant-free random access protocol is adopted in this paper [10]. Specifically, at the beginning of each time slot, the active devices transmit their pilot sequences over the uplink channels simultaneously, and the APs then perform activity detection based on the received signals in a cooperative manner. All pilot sequences, $\mathbf{s}_{n}\in\mathbb{C}^{L}$, $n\in\{1,2,\cdots,N\}$, are generated from an i.i.d. complex Gaussian distribution with zero mean and unit variance and are known at the APs in advance. Thus, the received signal $\mathbf{Y}_{b}\in\mathbb{C}^{L\times M}$ at the $b$th AP can be expressed as

$$\mathbf{Y}_{b}=\sum_{n=1}^{N}\chi_{n}\mathbf{s}_{n}\sqrt{g_{b,n}}\mathbf{h}_{b,n}^{T}+\mathbf{W}_{b}=\mathbf{S}\bm{\Gamma}_{b}^{\frac{1}{2}}\mathbf{H}_{b}+\mathbf{W}_{b}, \tag{1}$$

where $\mathbf{H}_{b}=[\mathbf{h}_{b,1},\cdots,\mathbf{h}_{b,N}]^{T}\in\mathbb{C}^{N\times M}$ denotes the small-scale fading channel matrix, $\mathbf{S}=[\mathbf{s}_{1},\cdots,\mathbf{s}_{N}]\in\mathbb{C}^{L\times N}$ denotes the horizontal stack of all pilot sequences, and $\mathbf{W}_{b}\in\mathbb{C}^{L\times M}$ is the additive white Gaussian noise (AWGN) matrix with i.i.d. $\mathcal{CN}(0,\sigma^{2})$ entries, where $\sigma^{2}$ denotes the noise power at each antenna. In this paper, we adopt $(\cdot)^{H}$ and $(\cdot)^{T}$ to denote the conjugate transpose and the transpose, respectively. Define $\bm{\gamma}_{b}=[\gamma_{b1},\cdots,\gamma_{bN}]^{T}\in\mathbb{R}^{N\times 1}$ as the diagonal entries of the diagonal matrix $\bm{\Gamma}_{b}$, representing the device state vector of the $b$th AP with $\gamma_{bn}=\chi_{n}g_{b,n}$. The APs detect the active devices by estimating the term $\chi_{n}g_{b,n}$. In particular, since $\bm{\Gamma}_{b}$ can be determined from the covariance of the received signal, we aim to design a covariance-based cooperative activity detection algorithm via limited cooperation among multiple APs.
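As a rough illustration of the model in (1), the following NumPy sketch draws the received signal at a single AP and compares its sample covariance with $\mathbf{S}\bm{\Gamma}_{b}\mathbf{S}^{H}+\sigma^{2}\mathbf{I}$. The dimensions, the uniform range used for the large-scale fading, and the helper names `crandn` and `simulate_ap` are assumptions made only for this example; they are not values or functions from the paper.

```python
import numpy as np

def crandn(rng, *shape):
    """i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_ap(rng, S, chi, g_b, M, sigma2):
    """Received signal Y_b = S Gamma_b^{1/2} H_b + W_b at one AP, as in (1)."""
    L, N = S.shape
    gamma_b = chi * g_b                          # device state vector gamma_b
    H_b = crandn(rng, N, M)                      # small-scale fading, N x M
    W_b = np.sqrt(sigma2) * crandn(rng, L, M)    # AWGN, L x M
    Y_b = S @ (np.sqrt(gamma_b)[:, None] * H_b) + W_b
    return Y_b, gamma_b

rng = np.random.default_rng(0)
L, N, K, M, sigma2 = 8, 20, 3, 20000, 0.1        # illustrative sizes (assumed)
S = crandn(rng, L, N)                            # pilot matrix
chi = np.zeros(N)
chi[rng.choice(N, K, replace=False)] = 1.0       # K active devices
g_b = rng.uniform(0.5, 2.0, N)                   # placeholder large-scale fading
Y_b, gamma_b = simulate_ap(rng, S, chi, g_b, M, sigma2)

Sigma_hat = Y_b @ Y_b.conj().T / M               # sample covariance over antennas
Sigma_b = (S * gamma_b) @ S.conj().T + sigma2 * np.eye(L)
rel_err = np.linalg.norm(Sigma_hat - Sigma_b) / np.linalg.norm(Sigma_b)
```

With a large (here deliberately unrealistic) number of antennas, `rel_err` becomes small, which is the property the covariance-based approach relies on.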

III Cooperative Massive Detection Algorithm

In this section, we first propose a cooperative detection framework for 6G cell-free wireless networks with a massive number of IoT devices. Then, we design a corresponding cooperative detection algorithm.

III-A Cooperative Massive Detection Framework

For the problem in model (1), the unknown device state vectors for different APs are different. Moreover, there are some common characteristics among the neighboring APs. To enhance the detection performance, we first associate a local estimator with each AP. Incorporating the estimates of the neighboring APs, i.e., sparsity-promoting and the similarity-promoting terms [5, 11], we can modify the local estimator to associate a regularized local cost function with each AP.

Firstly, we design the local estimator. It is well known that covariance-based massive activity detection is equivalent to recovering the device state vector $\bm{\gamma}_{b}$ from the noisy measurements $\mathbf{Y}_{b}$ with knowledge of the pre-defined pilot sequence matrix $\mathbf{S}$. In general, the estimation of the device state vector $\bm{\gamma}_{b}$ can be formulated as a maximum likelihood estimation problem [6]. In particular, for a given $\bm{\gamma}_{b}$, each column of $\mathbf{Y}_{b}$, denoted as $\mathbf{y}_{bm}$, $1\leq m\leq M$, can be regarded as an independent sample having the following multivariate complex Gaussian distribution:

$$\mathbf{y}_{bm}\sim\mathcal{CN}(\mathbf{0},\mathbf{S}\bm{\Gamma}_{b}\mathbf{S}^{H}+\sigma^{2}\mathbf{I}), \tag{2}$$

where the covariance matrix is calculated by $\mathbb{E}[\mathbf{y}_{bm}\mathbf{y}_{bm}^{H}]$ and $\mathbf{I}$ denotes the identity matrix. For convenience, we define $\bm{\Sigma}_{b}=\mathbf{S}\bm{\Gamma}_{b}\mathbf{S}^{H}+\sigma^{2}\mathbf{I}$. Then, the likelihood of $\mathbf{Y}_{b}$ given $\bm{\gamma}_{b}$ can be represented as

$$P(\mathbf{Y}_{b}|\bm{\gamma}_{b})=\frac{1}{\det(\pi\bm{\Sigma}_{b})^{M}}\exp\left(-\text{tr}(\bm{\Sigma}_{b}^{-1}\mathbf{Y}_{b}\mathbf{Y}_{b}^{H})\right), \tag{3}$$

where $\det(\cdot)$ and $\text{tr}(\cdot)$ are operators that return the determinant and the trace of a matrix, respectively. By exploiting the Gaussianity, the maximum likelihood (ML) estimator of $\bm{\gamma}_{b}$ at the $b$th AP minimizes the normalized negative log-likelihood with the constant term $L\ln\pi$ removed:

$$f(\bm{\gamma}_{b})=-\frac{1}{M}\ln P(\mathbf{Y}_{b}|\bm{\gamma}_{b})-L\ln\pi=\ln\det(\bm{\Sigma}_{b})+\text{tr}(\bm{\Sigma}_{b}^{-1}\hat{\bm{\Sigma}}_{b\mathbf{y}}), \tag{4}$$

where $\hat{\bm{\Sigma}}_{b\mathbf{y}}=\frac{1}{M}\mathbf{Y}_{b}\mathbf{Y}_{b}^{H}$ denotes the sample covariance matrix of the received signal of the $b$th AP averaged over different antennas. Based on (4), the maximum likelihood estimation problem can be formulated as $\arg\min_{\bm{\gamma}_{b}\in\mathbb{R}_{+}^{N}}f(\bm{\gamma}_{b})$.
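To make the cost in (4) concrete, the sketch below evaluates $f$ for a few candidate state vectors. It assumes an idealized setting in which the sample covariance equals the true covariance (the large-$M$ limit); the sizes, the chosen active indices, and the function name `ml_cost` are illustrative assumptions.

```python
import numpy as np

def ml_cost(gamma, S, Sigma_hat, sigma2):
    """f(gamma) in (4): ln det(Sigma) + tr(Sigma^{-1} Sigma_hat)."""
    L = S.shape[0]
    Sigma = (S * gamma) @ S.conj().T + sigma2 * np.eye(L)   # S diag(gamma) S^H + sigma2 I
    _, logdet = np.linalg.slogdet(Sigma)
    return float(logdet + np.trace(np.linalg.solve(Sigma, Sigma_hat)).real)

rng = np.random.default_rng(1)
L, N, sigma2 = 6, 12, 0.1                                   # illustrative sizes (assumed)
S = (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
gamma_true = np.zeros(N)
gamma_true[[2, 7]] = [1.0, 0.6]                             # two active devices (assumed)
# Idealized covariance, i.e. the M -> infinity limit of the sample covariance.
Sigma_hat = (S * gamma_true) @ S.conj().T + sigma2 * np.eye(L)

f_true = ml_cost(gamma_true, S, Sigma_hat, sigma2)
f_zero = ml_cost(np.zeros(N), S, Sigma_hat, sigma2)             # "nobody active"
f_shift = ml_cost(np.roll(gamma_true, 1), S, Sigma_hat, sigma2)  # wrong support
```

In this idealized case the true state vector attains the smallest cost among the three candidates, since $\ln\det(\bm{\Sigma})+\text{tr}(\bm{\Sigma}^{-1}\hat{\bm{\Sigma}})$ is minimized over positive definite $\bm{\Sigma}$ at $\bm{\Sigma}=\hat{\bm{\Sigma}}$.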

Secondly, since activity detection is a typical sparse signal processing problem, we propose a sparsity-promoting term to facilitate cooperative detection. The same sparsity pattern is observed at different APs, namely the indices of the nonzero entries of $\bm{\gamma}_{b}$ are consistent for $b=1,2,\cdots,B$. Because each AP only communicates with its neighboring APs, it cannot obtain global information about the sparsity pattern. Moreover, it is quite challenging to split this global quantity into several local quantities consisting of components only from the neighboring nodes. In this case, for the $b$th AP, we define a local parameter matrix consisting of the parameter vectors of all its neighbors, which can be directly obtained as follows:

$$\mathbf{R}_{b}=\left[\bm{\gamma}_{l_{1}},\bm{\gamma}_{l_{2}},\bm{\gamma}_{l_{i}},\cdots,\bm{\gamma}_{l_{|\mathcal{N}_{b}^{-}|_{c}}},\bm{\gamma}_{b}\right]\in\mathbb{C}^{N\times|\mathcal{N}_{b}|_{c}}, \tag{5}$$

where $l_{i}\in\mathcal{N}_{b}^{-}$ is the index of the $i$th neighbor of the $b$th AP, $\mathcal{N}_{b}^{-}$ denotes the index set of the neighbors of the $b$th AP excluding itself, $\mathcal{N}_{b}$ denotes the index set of the neighbors of the $b$th AP including itself, and $|\mathcal{N}_{b}^{-}|_{c}$ and $|\mathcal{N}_{b}|_{c}$ denote the cardinalities of the sets $\mathcal{N}_{b}^{-}$ and $\mathcal{N}_{b}$, respectively. Consequently, we aim to impose sparsity constraints on the row vectors of the matrix $\mathbf{R}_{b}$ to exploit the joint sparsity. To this end, this paper designs a novel sparsity-promoting term, which is given by

$$g(\bm{\gamma}_{b})=\sum_{n=1}^{N}\left(\left\|\mathbf{R}_{b}(n,:)\right\|_{2}-\frac{1}{\theta}\ln\left(1+\theta\left\|\mathbf{R}_{b}(n,:)\right\|_{2}\right)\right), \tag{6}$$

where $\theta>0$ is the penalty parameter, $\mathbf{R}_{b}(n,:)$ is the $n$th row of $\mathbf{R}_{b}$, and $\left\|\cdot\right\|_{2}$ denotes the $l_{2}$ norm of a vector. Herein, $g(\bm{\gamma}_{b})$ is a logarithmic smooth function that promotes row sparsity [5], since the nonzero rows are penalized when minimizing $g(\bm{\gamma}_{b})$. In this way, a common sparsity profile across the columns of the local parameter matrix $\mathbf{R}_{b}$ is promoted. Although the sparsity-promoting term is imposed on the local parameter matrix $\mathbf{R}_{b}$, the cooperative nature of the scheme promotes a common sparsity profile across all the global device state vectors $\{\bm{\gamma}_{b}\}_{b=1}^{B}$.
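The behavior of the term in (6) can be checked with a few lines of plain Python. This is only a sketch: the matrix values and the function name `sparsity_penalty` are assumed for illustration. All-zero rows contribute nothing, and the penalty grows with the number and the norm of the nonzero rows.

```python
import math

def sparsity_penalty(R, theta):
    """g in (6): sum over rows of r - ln(1 + theta * r) / theta, r = row l2 norm."""
    total = 0.0
    for row in R:
        r = math.sqrt(sum(v * v for v in row))
        total += r - math.log1p(theta * r) / theta
    return total

theta = 2.0                                       # assumed penalty parameter
g_none = sparsity_penalty([[0.0, 0.0], [0.0, 0.0]], theta)   # no active rows
g_one = sparsity_penalty([[0.0, 0.0], [0.3, 0.4]], theta)    # one active row
g_two = sparsity_penalty([[0.3, 0.4], [0.3, 0.4]], theta)    # two active rows
```

Each row of `R` plays the role of $\mathbf{R}_{b}(n,:)$, i.e. the state of device $n$ as seen by the $b$th AP and its neighbors.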

Thirdly, we design a similarity-promoting term to improve the detection performance. The supports of the global device state vectors $\{\bm{\gamma}_{b}\}_{b=1}^{B}$ are the same for all APs, but the amplitudes of the nonzero entries differ from AP to AP due to the different path losses. In particular, the device state vectors of neighboring APs have a large number of similar entries and only a relatively small number of distinct entries. Motivated by these observations, we design a similarity-promoting function as follows:

$$\Psi(\bm{\gamma}_{b})=\sum_{l\in\mathcal{N}_{b}}c_{lb}\Psi_{l}(\bm{\gamma}_{b}-\bm{\gamma}_{l}), \tag{7}$$

where $c_{lb}$ are linear weights satisfying the conditions $\sum_{l\in\mathcal{N}_{b}}c_{lb}=1$ and $c_{lb}=0$, $\forall l\notin\mathcal{N}_{b}$. Here, $\Psi_{l}(\bm{\gamma}_{b}-\bm{\gamma}_{l})$ is a convex penalty function, minimized at $\bm{\gamma}_{b}-\bm{\gamma}_{l}=\mathbf{0}$, which encourages similarity between $\bm{\gamma}_{b}$ and $\bm{\gamma}_{l}$. Note that the log-likelihood $f(\bm{\gamma}_{b})$ depends on the empirical covariance $\hat{\bm{\Sigma}}_{b\mathbf{y}}$. In high-dimensional settings, where the length of the pilot sequences $L$ is larger than the number of AP antennas $M$, $\hat{\bm{\Sigma}}_{b\mathbf{y}}$ can deviate considerably from the covariance matrix $\bm{\Sigma}_{b}$. By enforcing structural similarity, the estimate of each $\bm{\Sigma}_{b}$ can benefit from the fact that neighboring AP estimates should be similar to each other.

The specific form of the penalty function $\Psi_{l}(\cdot)$ can be chosen in different ways. In general, it depends on the assumptions imposed on the problem, and one may choose the most appropriate penalty for the data at hand. One example is the $l_{1}$-norm penalty $\Psi_{l}(\mathbf{x})=\sum_{n=1}^{N}|x_{n}|$, where $x_{n}$ and $|\cdot|$ denote the $n$th element of the vector $\mathbf{x}$ and the absolute value, respectively. This penalty function allows only a limited number of values to change between neighboring APs, while the rest of the structure remains the same. In other words, it borrows information aggressively across neighbors, encouraging not only similar structure but also similar values. As a result, this penalty is suitable for massive access, where only a small fraction of the potential devices change their states in a time slot, and it is adopted in this paper.
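For the $l_{1}$ choice of $\Psi_{l}$, the similarity term in (7) reduces to a weighted sum of entry-wise absolute differences. The following sketch evaluates it for one AP; the AP labels, state vectors, and weights are hypothetical values chosen so that the weights sum to one over $\mathcal{N}_{b}$, as required.

```python
def l1_penalty(x):
    """Psi_l(x) = sum_n |x_n|."""
    return sum(abs(v) for v in x)

def similarity_term(gamma_b, neighbor_gammas, weights):
    """Psi in (7) with the l1 penalty: sum_l c_lb * ||gamma_b - gamma_l||_1."""
    return sum(
        weights[l] * l1_penalty([a - b for a, b in zip(gamma_b, gamma_l)])
        for l, gamma_l in neighbor_gammas.items()
    )

gamma_b = [1.0, 0.0, 0.5, 0.0]
neighbor_gammas = {                  # N_b includes the AP itself (hypothetical labels)
    "self": gamma_b,
    "ap2": [1.0, 0.0, 0.5, 0.0],     # identical estimate, contributes nothing
    "ap3": [1.0, 0.0, 0.0, 0.25],    # differs in two entries
}
weights = {"self": 0.5, "ap2": 0.25, "ap3": 0.25}
psi = similarity_term(gamma_b, neighbor_gammas, weights)
```

Only the two entries in which `ap3` differs from `gamma_b` contribute, giving `psi = 0.25 * (0.5 + 0.25) = 0.1875`.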

After defining the similarity-promoting term and the sparsity-promoting term, adding them to (4) leads to the following novel regularized local cost function at the $b$th AP:

$$F(\bm{\gamma}_{b})=f(\bm{\gamma}_{b})+\beta g(\bm{\gamma}_{b})+\tau\Psi(\bm{\gamma}_{b}),~\forall b\in\{1,2,\cdots,B\}, \tag{8}$$

where $\beta>0$ and $\tau>0$ are the penalty parameters used to enforce sparsity and similarity, respectively. In the following, we design a massive activity detection algorithm to minimize the local cost function at each AP.

III-B A Decentralized Approximate Separating Strategy

Note that the first term of (8), i.e., $f(\bm{\gamma}_{b})$, is differentiable and geodesically convex [12]. However, as stated in the above subsection, $\Psi_{l}(\bm{\gamma}_{b}-\bm{\gamma}_{l})$ is non-differentiable, i.e., the third term of the local cost function is a sum of non-smooth functions, and the second term is also potentially non-differentiable. In addition, the unknown variables $\bm{\gamma}_{b}$ of neighboring APs are coupled with each other. These obstacles make the problem difficult to solve, and existing algorithms are not applicable to it. In the following, we design a decentralized approximate separating strategy for minimizing the cost function in (8) based on the forward-backward splitting strategy [13], which can handle the non-smooth problem and is especially well suited to the high-dimensional activity detection problem due to its fast convergence rate and its conceptual and mathematical simplicity.

Before proceeding, we recall the forward-backward splitting approach for minimizing (8), which is given by the iteration

$$\bm{\gamma}_{b}^{t+1}=\text{prox}_{\eta_{b}(\tau\Psi+\beta g)}\left(\bm{\gamma}_{b}^{t}-\eta_{b}\nabla f(\bm{\gamma}_{b}^{t})\right), \tag{9}$$

where $\nabla f(\cdot)$ is the gradient of the function $f$, $\eta_{b}$ is the step size for the $b$th AP, and $\bm{\gamma}_{b}^{t}$ denotes the value of $\bm{\gamma}_{b}$ in the $t$th iteration. The gradient descent step is the forward step and the proximal step is the backward step [13]. Note that the proximal operator of a function $h$ is the mapping given by $\text{prox}_{\eta h}(\mathbf{y})=\arg\min_{\mathbf{u}}h(\mathbf{u})+\frac{1}{2\eta}\left\|\mathbf{u}-\mathbf{y}\right\|_{2}^{2}$ with variables $\mathbf{y}$ and $\mathbf{u}$ and a step size $\eta>0$ [14].
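The forward-backward pattern of (9) is easiest to see on a toy problem. The sketch below is not the cost function (8); it assumes the scalar objective $\frac{1}{2}(u-a)^{2}+\lambda|u|$, whose proximal operator is the standard soft-thresholding map and whose minimizer is known in closed form. The function names and parameter values are illustrative.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * |u| evaluated at v."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def forward_backward(a, lam, eta=0.5, iters=100):
    """Minimize 0.5 * (u - a)**2 + lam * |u| with the iteration pattern of (9)."""
    u = 0.0
    for _ in range(iters):
        grad = u - a                                    # gradient of the smooth part
        u = soft_threshold(u - eta * grad, eta * lam)   # forward step, then proximal step
    return u

u_star = forward_backward(a=3.0, lam=1.0)    # closed-form minimizer is a - lam = 2.0
u_zero = forward_backward(a=0.5, lam=1.0)    # |a| <= lam, so the minimizer is 0.0
```

The second call shows the sparsifying effect of the non-smooth term: when $|a|\leq\lambda$ the iterate stays exactly at zero.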

Unfortunately, it is prohibitively challenging to directly evaluate the proximal operators with respect to the similarity-promoting function $\Psi(\bm{\gamma}_{b})$ and the sum $\beta g(\bm{\gamma}_{b})+\tau\Psi(\bm{\gamma}_{b})$. Moreover, evaluating $\Psi(\bm{\gamma}_{b})$ over all $|\mathcal{N}_{b}|_{c}$ neighbors in each iteration is expensive. Motivated by the Douglas-Rachford splitting in [13], where two proximal operators can be updated alternately, this paper handles the proximal operators of the functions $\Psi(\bm{\gamma}_{b})$ and $g(\bm{\gamma}_{b})$ separately. Specifically, we first calculate an estimator $\mathbf{x}_{b}^{t}$ of the subgradient $\partial\Psi(\bm{\gamma}_{b}^{t})$ and then incorporate the gradient descent step into the proximal step with respect to the sparsity-promoting term $g(\cdot)$ for the $b$th AP, which is given by

$$\mathbf{z}_{b}^{t}=\text{prox}_{\beta\eta_{b}g}\left(\bm{\gamma}_{b}^{t}-\eta_{b}\nabla f(\bm{\gamma}_{b}^{t})-\tau\eta_{b}\mathbf{x}_{b}^{t}\right), \tag{10}$$

where $\mathbf{z}_{b}^{t}$ is an intermediate variable. Then, according to the update rule of the Douglas-Rachford splitting, we incorporate $\mathbf{z}_{b}^{t}$ into the proximal operator with respect to the similarity-promoting function $\Psi(\bm{\gamma}_{b})$:

$$\bm{\gamma}_{b}^{t+1}=\text{prox}_{\tau\eta_{b}\Psi}\left(\mathbf{z}_{b}^{t}+\tau\eta_{b}\mathbf{x}_{b}^{t}\right). \tag{11}$$

The intermediate variable $\mathbf{z}_{b}^{t}$ and the device state vector $\bm{\gamma}_{b}^{t}$ are updated alternately and their values approach each other; at convergence, they are identical. Afterwards, in order to overcome the difficulty in processing the non-smooth finite sum term and to reduce the computational overhead, this paper proposes a splitting strategy that uses the proximal operator of a single function $\Psi_{l}$ in each iteration to approximate the proximal operator of the average of the $|\mathcal{N}_{b}|_{c}$ non-smooth functions $\Psi_{l}$. In mathematical terms, we first choose $l$ randomly from the set $\mathcal{N}_{b}$ with probabilities $\{p_{1},p_{2},\cdots,p_{|\mathcal{N}_{b}|_{c}}\}$. Then, utilizing the proximal operator, a specific step is introduced as follows:

$$\bm{\gamma}_{b}^{t+1}=\text{prox}_{\tau\eta_{b}^{l}\Psi_{l}}\left(\mathbf{z}_{b}^{t}+\tau\eta_{b}^{l}\mathbf{x}_{b}^{l,t}\right), \tag{12}$$

where $\mathbf{x}_{b}^{l,t}$ is the estimator of the subgradient $\partial\Psi_{l}(\bm{\gamma}_{b}^{t+1})$ for the randomly selected $l$th neighbor of the $b$th AP in the $t$th iteration. Let $c_{lb}^{t}$ denote the combiner at the $t$th iteration. In the sequel, $\eta_{b}^{l}$ is set to $\eta_{b}^{l}=\frac{c_{lb}^{t}\eta_{b}}{p_{l}}$, which is a stochastic approximation of $\eta_{b}$ controlled by the combiner and the probability of being selected. In this way, we are able to treat the difficult non-smooth finite sum term in (8) for any cardinality $|\mathcal{N}_{b}|_{c}$.

Since $\mathbf{z}_{b}^{t}$ and $\bm{\gamma}_{b}^{t}$ converge to the same value, (12) is an accurate approximation of (9) if $\mathbf{x}_{b}^{l,t}=\partial\Psi_{l}(\bm{\gamma}_{b}^{t+1})$ and $\mathbf{x}_{b}^{t}=\partial\Psi(\bm{\gamma}_{b}^{t+1})$ hold. Thus, we must ensure that $\partial\Psi_{l}(\bm{\gamma}_{b}^{t+1})$ is close to $\mathbf{x}_{b}^{l,t}$ to obtain an accurate estimator. According to the definition of the proximal operator, equation (12) satisfies

$$\frac{\mathbf{z}_{b}^{t}+\tau\eta_{b}^{l}\mathbf{x}_{b}^{l,t}-\text{prox}_{\tau\eta_{b}^{l}\Psi_{l}}\left(\mathbf{z}_{b}^{t}+\tau\eta_{b}^{l}\mathbf{x}_{b}^{l,t}\right)}{\tau\eta_{b}^{l}}\in\partial\Psi_{l}(\bm{\gamma}_{b}^{t+1}). \tag{13}$$

Hence, we arrive at the following subgradient estimator $\mathbf{x}_{b}^{l,t+1}$ such that (12) holds:

$$\mathbf{x}_{b}^{l,t+1}=\mathbf{x}_{b}^{l,t}+\frac{1}{\tau\eta_{b}^{l}}\left(\mathbf{z}_{b}^{t}-\bm{\gamma}_{b}^{t+1}\right), \tag{14}$$

where the right-hand side (RHS) of (14) is obtained by replacing the proximal step in (13) by $\bm{\gamma}_{b}^{t+1}$ and reorganizing the left-hand side (LHS) of (13). Consequently, the subgradient estimator $\mathbf{x}_{b}^{t}$ in (10) can be updated as

$$\mathbf{x}_{b}^{t+1}=\mathbf{x}_{b}^{t}+c_{lb}^{t}\left(\mathbf{x}_{b}^{l,t+1}-\mathbf{x}_{b}^{l,t}\right), \tag{15}$$

which exploits the fact that $\mathbf{x}_{b}^{t}=\sum_{l=1}^{|\mathcal{N}_{b}|_{c}}c_{lb}^{t}\mathbf{x}_{b}^{l,t}$ and that, as stated in (12), only the single selected $\mathbf{x}_{b}^{l,t}$ is updated in each iteration.
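The bookkeeping in (14) and (15) can be sketched as follows. The example assumes, for simplicity, that the combiner weights stay fixed between iterations (the paper uses adaptive weights); under that assumption the incremental update (15) keeps $\mathbf{x}_{b}$ equal to the weighted sum of the per-neighbor estimators without recomputing the full sum. All numerical values and the neighbor labels are hypothetical.

```python
def update_estimators(x_b, x_l, z, gamma_next, c_lb, tau, eta_l):
    """One pass of (14) then (15) for the selected neighbor l.

    x_b is the aggregate estimator, x_l the estimator kept for neighbor l.
    """
    x_l_next = [xl + (zi - gi) / (tau * eta_l)
                for xl, zi, gi in zip(x_l, z, gamma_next)]        # update (14)
    x_b_next = [xb + c_lb * (new - old)
                for xb, new, old in zip(x_b, x_l_next, x_l)]      # update (15)
    return x_b_next, x_l_next

# Hypothetical example with two neighbors and fixed weights.
c = {"l1": 0.25, "l2": 0.75}
x = {"l1": [0.0, 1.0], "l2": [2.0, -1.0]}
x_b = [sum(c[l] * x[l][n] for l in c) for n in range(2)]    # x_b = sum_l c_l x^l
z, gamma_next = [0.5, 0.5], [0.25, 1.0]

# Neighbor "l1" is the one selected in this iteration.
x_b, x["l1"] = update_estimators(x_b, x["l1"], z, gamma_next,
                                 c["l1"], tau=1.0, eta_l=0.5)
recomputed = [sum(c[l] * x[l][n] for l in c) for n in range(2)]
```

After the update, `x_b` and `recomputed` agree, which is the identity that (15) relies on.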

III-C Derivation of Proximal Operators and Combiners

Since the proximal operators need to be calculated at each iteration in (10) and (12), it is important to derive closed-form expressions for evaluating them exactly. According to the well-known Sherman-Morrison rank-1 update identity [15], we obtain

$$\left(\bm{\Sigma}_{b}-\gamma_{bn}\mathbf{s}_{n}\mathbf{s}_{n}^{H}+\gamma_{bn}\mathbf{s}_{n}\mathbf{s}_{n}^{H}\right)^{-1}=\bm{\Sigma}_{bn}^{-1}-\frac{\gamma_{bn}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n}\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}}{1+\gamma_{bn}\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n}}, \tag{16}$$

with $\bm{\Sigma}_{bn}=\bm{\Sigma}_{b}-\gamma_{bn}\mathbf{s}_{n}\mathbf{s}_{n}^{H}$, where $\gamma_{bn}$ is the $n$th element of $\bm{\gamma}_{b}$. Applying the matrix determinant lemma yields

$$\det(\bm{\Sigma}_{bn}+\gamma_{bn}\mathbf{s}_{n}\mathbf{s}_{n}^{H})=(1+\gamma_{bn}\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n})\det(\bm{\Sigma}_{bn}). \tag{17}$$

Then, substituting (16) and (17) into (4) and taking the derivative of $f(\bm{\gamma}_{b})$ with respect to $\gamma_{bn}$, we have

$$\nabla f(\gamma_{bn})=\frac{\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n}}{1+\gamma_{bn}\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n}}-\frac{\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\hat{\bm{\Sigma}}_{b\mathbf{y}}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n}}{(1+\gamma_{bn}\mathbf{s}_{n}^{H}\bm{\Sigma}_{bn}^{-1}\mathbf{s}_{n})^{2}}. \tag{18}$$

Correspondingly, the gradient $\nabla f(\bm{\gamma}_{b}^{t})$ can be obtained by stacking these derivatives: $\nabla f(\bm{\gamma}_{b}^{t})=\text{col}\{\nabla f(\gamma_{b1}^{t}),\cdots,\nabla f(\gamma_{bN}^{t})\}$, where $\text{col}\{\cdot\}$ denotes a column vector. By substituting it into (10) and calculating the closed-form expression of the proximal operator of $g(\bm{\gamma}_{b})$, we obtain the following intermediate recursion

$$\mathbf{z}_{b}^{t}=\bm{\varsigma}_{b}^{t}-\eta_{b}\beta\,\text{col}\left\{\frac{\varsigma_{b1}^{t}}{\left\|\mathbf{R}_{b}^{t}(1,:)\right\|_{2}},\cdots,\frac{\varsigma_{bN}^{t}}{\left\|\mathbf{R}_{b}^{t}(N,:)\right\|_{2}}\right\}, \tag{19}$$

with $\bm{\varsigma}_{b}^{t}=\bm{\gamma}_{b}^{t}-\eta_{b}\nabla f(\bm{\gamma}_{b}^{t})-\tau\eta_{b}\mathbf{x}_{b}^{t}$, where $\varsigma_{bn}^{t}$ is the $n$th element of $\bm{\varsigma}_{b}^{t}$.
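The coordinate-wise derivative in (18) can be sanity-checked numerically against a central finite difference of the cost in (4). The sketch below does this for one coordinate; the problem sizes, the random data, and the function names are assumptions made for the check and do not come from the paper.

```python
import numpy as np

def covariance(gamma, S, sigma2):
    """Sigma = S diag(gamma) S^H + sigma2 I."""
    return (S * gamma) @ S.conj().T + sigma2 * np.eye(S.shape[0])

def ml_cost(gamma, S, Sigma_hat, sigma2):
    """f(gamma) = ln det(Sigma) + tr(Sigma^{-1} Sigma_hat), as in (4)."""
    Sigma = covariance(gamma, S, sigma2)
    return np.linalg.slogdet(Sigma)[1] + np.trace(np.linalg.solve(Sigma, Sigma_hat)).real

def grad_coord(gamma, n, S, Sigma_hat, sigma2):
    """Partial derivative of f with respect to gamma_n, following (18)."""
    s = S[:, n]
    Sigma_bn = covariance(gamma, S, sigma2) - gamma[n] * np.outer(s, s.conj())
    u = np.linalg.solve(Sigma_bn, s)             # Sigma_bn^{-1} s_n
    a = (s.conj() @ u).real                      # s^H Sigma_bn^{-1} s
    c = (u.conj() @ Sigma_hat @ u).real          # s^H Sigma_bn^{-1} Sigma_hat Sigma_bn^{-1} s
    d = 1.0 + gamma[n] * a
    return a / d - c / d**2

rng = np.random.default_rng(2)
L, N, M, sigma2 = 6, 10, 16, 0.5                 # illustrative sizes (assumed)
S = (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
Y = (rng.standard_normal((L, M)) + 1j * rng.standard_normal((L, M))) / np.sqrt(2)
Sigma_hat = Y @ Y.conj().T / M                   # arbitrary sample covariance
gamma = rng.uniform(0.1, 1.0, N)

n, eps = 3, 1e-6
e = np.zeros(N)
e[n] = eps
numeric = (ml_cost(gamma + e, S, Sigma_hat, sigma2)
           - ml_cost(gamma - e, S, Sigma_hat, sigma2)) / (2 * eps)
analytic = grad_coord(gamma, n, S, Sigma_hat, sigma2)
```

Agreement between `analytic` and `numeric` confirms that (18) is the partial derivative of (4) with the other entries of $\bm{\gamma}_{b}$ held fixed.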

Now, we turn to deriving the recursion of $\bm{\gamma}_{b}^{t}$ in (12). Since $\Psi_{l}(\cdot)$ in (12) is fully separable, its proximal operator can be evaluated component-wise:

$$\gamma_{bn}^{t+1}=\begin{cases}-\min\left(\tau\eta_{b}^{l}\dfrac{z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}-\gamma_{ln}^{t}}{|z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}-\gamma_{ln}^{t}|},\;z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}\right)+z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}, & \text{if}~z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}\neq 0,\\ 0, & \text{if}~z_{bn}^{t}+\tau\eta_{b}^{l}x_{bn}^{l,t}=0,\end{cases} \tag{23}$$

where the minimization operator preserves the non-negativity of $\gamma_{bn}$. Here, $z_{bn}^{t}$ and $x_{bn}^{l,t}$ are the $n$th entries of the vectors $\mathbf{z}_{b}^{t}$ and $\mathbf{x}_{b}^{l,t}$, respectively. For (15) and (23), the estimation performance depends, to a great extent, on the cooperation strategy specified by the combiner $c_{lb}^{t}$. This paper adopts the following adaptive combiner

$$c_{lb}^{t}=\begin{cases}\dfrac{2}{|\mathcal{N}_{b}^{-}|_{c}}\dfrac{1}{1+\exp(\rho\|\bm{\gamma}_{b}^{t-1}-\bm{\gamma}_{l}^{t-1}\|_{2})}, & l\in\mathcal{N}_{b}^{-},\\ 1-\sum_{l\in\mathcal{N}_{b}^{-}}c_{lb}^{t}, & l=b,\\ 0, & l\notin\mathcal{N}_{b},\end{cases} \tag{27}$$

where $\rho$ is a large constant set beforehand. Note that the term $\|\bm{\gamma}_{b}^{t-1}-\bm{\gamma}_{l}^{t-1}\|_{2}$ in (27) is the distance between the local estimates of the $b$th AP and its $l$th neighbor. The combiner $c_{lb}^{t}$ decreases monotonically with this distance. When the distance between the estimates of two APs is large, the $b$th AP decreases the combination weight, or even effectively discards the information from this neighbor. Conversely, the $b$th AP increases the combination weight when the distance between the estimates of two APs is small. For clarity, the pseudo-code of the CMD algorithm is summarized in Algorithm 1.
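The combiner in (27) is straightforward to compute from the previous-iteration estimates. The following sketch uses hypothetical neighbor indices, estimates, and a value of $\rho$ chosen only for illustration; the cap on the exponent is an implementation safeguard added here, not part of (27).

```python
import math

def adaptive_combiners(gamma_b, neighbor_gammas, rho):
    """Weights of (27) computed from previous-iteration estimates.

    neighbor_gammas maps each neighbor index in N_b^- to its estimate.
    Returns (neighbor weights, self weight c_bb).
    """
    n_nb = len(neighbor_gammas)
    weights = {}
    for l, gamma_l in neighbor_gammas.items():
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(gamma_b, gamma_l)))
        # Cap the exponent so math.exp cannot overflow for large rho * dist.
        weights[l] = (2.0 / n_nb) / (1.0 + math.exp(min(rho * dist, 700.0)))
    return weights, 1.0 - sum(weights.values())

w, w_self = adaptive_combiners(
    [1.0, 0.0, 0.5],
    {1: [1.0, 0.0, 0.5],      # identical estimate: largest possible weight
     2: [0.0, 1.0, 0.5]},     # distant estimate: weight close to zero
    rho=5.0,
)
```

Because each neighbor weight is at most $1/|\mathcal{N}_{b}^{-}|_{c}$, the self weight returned by the function is always non-negative.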

Once an estimate $\{\bm{\gamma}_{b}\}_{b=1}^{B}$ is obtained, we employ element-wise thresholding at each AP to determine $\chi_{n}$ from $\gamma_{bn}$, which is the $n$th entry of $\bm{\gamma}_{b}$. Specifically, $\chi_{n}=1$ if $\gamma_{b^{0}n}>\imath\sigma^{2}$ for a pre-specified threshold $\imath>0$, and $\chi_{n}=0$ otherwise [16]. Herein, $b^{0}$ is the AP closest to device $n$. Note that there exists a particular fixed point $\bm{\gamma}_{b}^{*}$ of the problem $\arg\min_{\bm{\gamma}_{b}\in\mathbb{R}_{+}^{N}}F(\bm{\gamma}_{b})$ for (10) and (12). Based on the information across the 6G wireless network, it can be shown that the iterates $\bm{\gamma}_{b}^{t}$ converge to this particular fixed point under certain step-size conditions.
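The thresholding rule above amounts to a one-line comparison per device. In this sketch, `gamma_hat[n]` is assumed to already hold the estimate $\gamma_{b^{0}n}$ taken from the AP closest to device $n$; the numerical values and the function name are illustrative.

```python
def detect_activity(gamma_hat, sigma2, iota):
    """Return chi_n = 1 if gamma_hat[n] > iota * sigma2, and 0 otherwise."""
    threshold = iota * sigma2
    return [1 if g > threshold else 0 for g in gamma_hat]

# Hypothetical estimates for four devices, with noise power 0.1 and iota = 1.
chi_hat = detect_activity([0.0, 0.9, 0.02, 1.4], sigma2=0.1, iota=1.0)
```

Here devices 1 and 3 (zero-based) are declared active, since their estimates exceed the threshold $\imath\sigma^{2}=0.1$.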

Algorithm 1 Cooperative Massive Detection Algorithm
1:  Input: $\{\mathbf{Y}_{b}\}_{b=1}^{B}$, $\mathbf{S}$, $\{\hat{\bm{\Sigma}}_{b\mathbf{y}}=\frac{1}{M}\mathbf{Y}_{b}\mathbf{Y}_{b}^{H}\}_{b=1}^{B}$, step sizes $\{\eta_{b}\}_{b=1}^{B}$, and total iterations $T$.
2:  Initialization: $\{\bm{\gamma}_{b}^{0}=\mathbf{0}\}_{b=1}^{B}$, $\{\bm{\Sigma}_{b}^{0}=\sigma^{2}\mathbf{I}\}_{b=1}^{B}$, $\{\mathbf{x}_{b}^{l,0},l\in\mathcal{N}_{b}\}_{b=1}^{B}$, $\{\mathbf{x}_{b}^{0}=\sum_{l\in\mathcal{N}_{b}}c_{lb}^{0}\mathbf{x}_{b}^{l,0}\}_{b=1}^{B}$.
3:  for $t=1:T$ do
4:     for each AP $b$:
5:     Adaptation:
6:     Compute $\mathbf{z}_{b}^{t}$ based on (19)
7:     Choose $l$ randomly from the set $\mathcal{N}_{b}$ with probabilities $\{p_{1},p_{2},\cdots,p_{|\mathcal{N}_{b}|_{c}}\}$
8:     Compute the adaptive combiner $c_{lb}^{t}$ based on (27)
9:     Compute $\eta_{b}^{l}=\frac{c_{lb}^{t}\eta_{b}}{p_{l}}$
10:     for $n=1:N$ do
11:        Update $\gamma_{bn}^{t+1}$ based on (23)
12:        $\bm{\Sigma}_{b}^{t+1}=\bm{\Sigma}_{b}^{t}+(\gamma_{bn}^{t+1}-\gamma_{bn}^{t})\mathbf{s}_{n}\mathbf{s}_{n}^{H}$
13:     end for
14:     Compute $\mathbf{x}_{b}^{l,t+1}$ based on (14)
15:     Compute $\mathbf{x}_{b}^{t+1}$ based on (15)
16:     Communication:
17:     Transmit $\bm{\gamma}_{b}^{t}$ to its one-hop neighbor APs
18:  end for
19:  Output: $\{\bm{\gamma}_{b}^{t+1}\}_{b=1}^{B}$

III-D Computational Complexity and Communication Cost

In what follows, the computational complexity and communication cost of the proposed CMD algorithm are analyzed. In each iteration, for an arbitrary AP, the computational complexity mainly arises from the matrix multiplication, and the overall computational complexity of the CMD algorithm is $\mathcal{O}(L^{2}N)$. Although computing the sample covariance $\hat{\bm{\Sigma}}_{b\mathbf{y}}$ costs $\mathcal{O}(L^{2}M)$, it only needs to be calculated once in each time slot, before the iterations start. As for the communication cost, in each iteration, each AP needs to transmit the $N$-dimensional intermediate estimate $\bm{\gamma}_{b}^{t}$ to its neighboring APs. Thus, over all APs, the CMD algorithm needs to exchange $N\sum_{b=1}^{B}\left|\mathcal{N}_{b}^{-}\right|_{c}$ parameters. Since the APs exchange intermediate variables instead of the received signal matrix $\mathbf{Y}_{b}$, the communication cost does not grow as the number $M$ of AP antennas increases.
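
As a rough numerical illustration of the communication cost expression, the sketch below counts the exchanged parameters for a small hypothetical topology. It assumes that $\mathcal{N}_{b}^{-}$ denotes the neighbors of AP $b$ excluding $b$ itself; the function name and the ring topology are made up for this example.

```python
from typing import Dict, List


def params_exchanged_per_iteration(neighbors: Dict[int, List[int]],
                                   num_devices: int) -> int:
    """Return N * sum_b |N_b^-|, where N_b^- excludes AP b itself (assumed)."""
    total_links = sum(len([l for l in nbrs if l != b])
                      for b, nbrs in neighbors.items())
    return num_devices * total_links


# Hypothetical ring of 4 APs, each connected to its two adjacent APs.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
cost = params_exchanged_per_iteration(ring, num_devices=1000)
print(cost)  # 8000
```

Note that the number of antennas $M$ does not appear anywhere in this count, which is consistent with the observation that the communication cost is independent of $M$.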

Remark 1: It is worth emphasizing that the computational complexity and the communication cost at each iteration of the CMD algorithm do not grow as the number of antennas at each AP, $M$, increases. Note that the CMD algorithm can also be adopted for data detection in unsourced random access [17], where an arbitrary AP only needs to know which messages are sent without identifying which message belongs to which device. Suppose that each active device has $q$ bits to send. Then the total number of potential devices $N$ in the device activity detection problem is replaced by $2^{J}$, where the small size $J$ is obtained by dividing the $q$-bit message into $\mathcal{Z}$ blocks. Please refer to our journal paper for the details. Thus, the communication cost of the CMD algorithm reduces to $2^{J}\sum_{b=1}^{B}\left|\mathcal{N}_{b}^{-}\right|_{c}$. This implies that the communication cost of the proposed CMD algorithm does not grow with the total number of potential devices, which is an appealing feature for practical IoT networks.

IV Numerical Results

In this section, we present numerical simulations to validate the effectiveness of the proposed CMD algorithm. We simulate an underdetermined 6G cell-free wireless network comprising $B=20$ APs geographically distributed over a vast area to serve $N$ potential devices. The AP-to-AP distance is $500$ m, and the number of connections between APs can be set differently. The positive constant $\theta$ is set to $1/0.039$. The penalty parameters $\beta$ and $\tau$ are set to $0.038$ and $0.0075$, respectively. The step size is $\eta_{b}=0.003$ for all APs, $p_{l}=\frac{1}{|\mathcal{N}_{b}|_{c}}$, and $\rho=500$.

As a performance measure, we adopt the activity error rate (AER). The AER is the sum of the missed detection probability, defined as the probability that a device is active but is declared to be inactive, and the false-alarm probability, defined as the probability that a device is inactive but the detector declares it to be active. As a reference, we compare the proposed CMD algorithm with two baseline schemes: the conventional ML-based multi-cell algorithm [6] and the AMP-based multi-cell algorithm [7], where each AP only serves the devices in its own cell without multi-cell cooperation and treats the inter-cell interference as noise.
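
For concreteness, an empirical AER can be computed from a true and an estimated activity vector as sketched below. This is an illustration under stated assumptions: both error counts are normalized by the total number of devices (the text does not spell out the normalization), and the function name `activity_error_rate` is hypothetical.

```python
from typing import List


def activity_error_rate(chi_true: List[int], chi_est: List[int]) -> float:
    """Empirical AER = missed-detection rate + false-alarm rate.

    Both rates are normalized by the number of devices N here, which is
    an assumption; other normalizations are possible.
    """
    if len(chi_true) != len(chi_est) or not chi_true:
        raise ValueError("activity vectors must be non-empty and equally long")
    n = len(chi_true)
    # Active devices that were declared inactive.
    missed = sum(1 for a, e in zip(chi_true, chi_est) if a == 1 and e == 0)
    # Inactive devices that were declared active.
    false_alarm = sum(1 for a, e in zip(chi_true, chi_est) if a == 0 and e == 1)
    return missed / n + false_alarm / n


# Toy example with 8 devices: one missed detection and one false alarm.
chi_true = [1, 1, 0, 0, 0, 1, 0, 0]
chi_est = [1, 0, 0, 1, 0, 1, 0, 0]
aer = activity_error_rate(chi_true, chi_est)
print(aer)  # 0.25
```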

Fig. 2 depicts the detection performance versus the number of APs participating in the cooperation. Initially, in the regime with only a few cooperating APs, the AER decreases sharply as the number of cooperating APs increases. As the number of cooperating APs continues to increase, the performance improvement diminishes. In addition, it is observed that the value of this performance saturation point depends on the number of AP antennas $M$ and the pilot length $L$, i.e., increasing $M$ or $L$ helps decrease the saturation point value, which indicates that the AP becomes more capable of detecting the activity with a low communication cost. The reason for this phenomenon is that, for an arbitrary AP, more connections lead to more intermediate estimates being exchanged in the proposed CMD algorithm, which improves the detection performance. However, the channel strengths from a specific active device to faraway APs are approximately zero, so exchanging intermediate estimates with remote APs cannot significantly improve the detection performance any further. The results also indicate that only a small number of cooperating APs is required, which strikes a tradeoff between the detection performance and the communication cost.

Figure 2: The AER for different numbers of cooperating APs, with $N=1{,}000$ potential devices, $K=200$ active devices, and an SNR of 10 dB.

In the rest of the simulations, the number of cooperating APs is set to $5$. Fig. 3 shows the activity detection performance versus the number of AP antennas $M$ with $N=1{,}000$ potential devices, $K=200$ active devices, pilot length $L=100$, and an SNR of 10 dB. It is seen that the proposed CMD algorithm provides a much lower AER than the baseline schemes, and the performance gap widens as the number of AP antennas increases. In other words, the superiority of the proposed CMD algorithm is evident in massive MIMO systems, which is a key technique for 6G wireless networks. This advantage of the cooperative strategy mainly stems from the fact that the proposed algorithm exploits the joint sparsity and the similarities across multiple APs, and that closed-form expressions for the proximal operators are derived to achieve higher efficiency. In contrast, the ML-based and AMP-based multi-cell approaches ignore such prior information and only perform activity detection for the devices located in their own cells, where the inter-cell interference is also a severely limiting factor for reliable activity detection.

Figure 3: The AER versus $M$.
Figure 4: The AER versus $L$.

Fig. 4 shows the detection performance versus the pilot sequence length $L$ with $N=500$ potential devices, $K=100$ active devices, $M=32$ antennas at each AP, and an SNR of 10 dB. From this figure, we observe that the activity detection performance of all the considered algorithms improves as the pilot length increases, and that the CMD algorithm achieves a substantial performance gain over the ML-based and the AMP-based multi-cell algorithms. Note that the CMD algorithm does not require knowledge of the channel strengths and only needs to estimate a smaller number of unknown parameters; thus, it is more efficient for activity detection than the AMP-based multi-cell detection approach. We can also see that the performance gap between the proposed algorithm and the baseline schemes is large, especially when the pilot sequence is short.

V Conclusion

This paper designed a grant-free cooperative random access framework for mMTC in 6G cell-free wireless networks based on the covariance of the received signals. By exploiting the special characteristics of the device state vectors of interest, we developed a covariance-based high-accuracy and low-complexity algorithm. Simulation results showed that the proposed algorithm can achieve near-optimal activity detection performance.

References

  • [1] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” IEEE J. Sel. Areas Commun., vol. PP, no. 99, pp. 1-1, Aug. 2020.
  • [2] V. W. S. Wong, R. Schober, D. W. K. Ng, and L.-C. Wang, Key Technologies for 5G Wireless Systems. Cambridge, U.K.: Cambridge Univ. Press, 2017.
  • [3] X. Shao, X. Chen, C. Zhong, J. Zhao, and Z. Zhang, “A unified design of massive access for cellular Internet of Things,” IEEE Internet of Things J., vol. 6, no. 2, pp. 3934-3947, Apr. 2019.
  • [4] L. Liu and W. Yu, “Massive connectivity with massive MIMO-Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933-2946, Jun. 2018.
  • [5] X. Shao, X. Chen, and R. Jia, “A dimension reduction-based joint activity detection and channel estimation algorithm for massive access,” IEEE Trans. Signal Process., vol. 68, pp. 420-435, 2020.
  • [6] S. Haghighatshoar, P. Jung, and G. Caire, “A new scaling law for activity detection in massive MIMO systems,” [Online]: arXiv:1803.02288, Mar. 2018.
  • [7] Z. Chen, F. Sohrabi, and W. Yu, “Multi-cell sparse activity detection for massive random access: Massive MIMO versus cooperative MIMO,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 1558-2248, Aug. 2019.
  • [8] M. Ke, Z. Gao, Y. Wu, X. Gao, and K. Wong, “Massive access in cell-free massive MIMO-based Internet of Things: cloud computing and edge computing paradigms,” IEEE J. Sel. Areas Commun., vol. PP, no. 99, pp. 1-1, Aug. 2020.
  • [9] J. Zhang, E. Björnson, M. Matthaiou, D. W. K. Ng, H. Yang and D. J. Love, “Prospective multiple antenna technologies for beyond 5G,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1637-1660, Aug. 2020.
  • [10] Y. Qiang, X. Shao, and X. Chen, “A model-driven deep learning algorithm for joint activity detection and channel estimation,” IEEE Commun. Lett., vol. PP, no. 99, pp. 1-1, 2020.
  • [11] R. Nassif, C. Richard, A. Ferrari, and A. H. Sayed, “Multitask diffusion LMS with sparsity-based regularization,” IEEE Intern. Conf. on Acous. Speech and Signal Process. (ICASSP), Brisbane, QLD, pp. 3516-3520, 2015.
  • [12] A. Wiesel, “Geodesic convexity and covariance estimation,” IEEE Trans. Signal Process., vol. 60, no. 12, pp. 6182, 2012.
  • [13] N. Parikh and S. Boyd, “Proximal algorithms,” Found. Trends Optim., vol. 1, no. 3, pp. 123-231, 2013.
  • [14] A. Defazio, “A simple practical accelerated method for finite sums,” Advances in Neural Information Processing Systems (NIPS), 2016.
  • [15] J. Sherman and W. J. Morrison, “Adjustment of an inverse matrix corresponding to a change in one element of a given matrix,” The Annals of Mathematical Statistics, vol. 21, no. 1, pp. 124-127, 1950.
  • [16] X. Shao, X. Chen, C. Zhong and Z. Zhang, “Joint activity detection and channel estimation for mmW/THz wideband massive access,” IEEE Intern. Conf. Commun. (ICC), Dublin, Ireland, Jun. 2020, pp. 1-6.
  • [17] A. Fengler, G. Caire, P. Jung, and S. Haghighatshoar, “Massive MIMO unsourced random access,” [Online]: http://arxiv.org/abs/1901.00828, Jan. 2019.