

Untrained Neural Network based Bayesian Detector for OTFS Modulation Systems

Hao Chang1, Alva Kosasih1, Wibowo Hardjawana1, Xinwei Qu1, and Branka Vucetic1
1Centre of Excellence in IoT and Telecommunications, The University of Sydney, Sydney, Australia.
{hao.chang,alva.kosasih,wibowo.hardjawana,xinwei.qu,branka.vucetic}@sydney.edu.au
Abstract

The design of orthogonal time frequency space (OTFS) symbol detectors for high-mobility communication scenarios has recently received considerable attention. Current state-of-the-art OTFS detectors can be divided into two main categories: iterative detectors and training-based deep neural network (DNN) detectors. Many practical iterative detectors rely on a minimum-mean-square-error (MMSE) denoiser to obtain the initial symbol estimates, but their computational complexity increases exponentially with the number of detected symbols. Training-based DNN detectors typically depend on the availability of large computational resources and on the fidelity of the synthetic datasets used in the training phase, both of which are costly. In this paper, we propose an untrained DNN based on the deep image prior (DIP) and a decoder architecture, referred to as D-DIP, that replaces the MMSE denoiser in the iterative detector. DIP is a type of DNN that requires no training, which makes it attractive for OTFS detector design. We then combine the D-DIP denoiser with the Bayesian parallel interference cancellation (BPIC) detector to perform iterative symbol detection, referred to as D-DIP-BPIC. Our simulation results show that the proposed D-DIP-BPIC detector outperforms practical state-of-the-art detectors by 0.5 dB in symbol error rate (SER) performance while retaining low computational complexity.

Index Terms:
OTFS, symbol detection, deep image prior, Bayesian parallel interference cancellation, mobile cellular networks.

I Introduction

Future mobile systems will support various high-mobility scenarios (e.g., unmanned aerial vehicles and autonomous cars) with strict mobility requirements [1]. However, current orthogonal frequency division multiplexing (OFDM) [2] is not suitable for these scenarios due to the high inter-carrier interference (ICI) caused by a large number of high-mobility moving reflectors. Orthogonal time frequency space (OTFS) modulation was proposed in [1] to address this issue, as it allows the ICI to be tracked during the symbol estimation process. Multiple OTFS symbol detectors [3, 4, 5, 6, 7, 8, 9, 10] have been investigated in the literature.

Several iterative detectors have been proposed for OTFS systems, e.g., the message passing (MP) [3], approximate message passing (AMP) [4], Bayesian parallel interference cancellation (BPIC) with a minimum-mean-square-error (MMSE) denoiser [5], unitary approximate message passing (UAMP) [6], and expectation propagation (EP) [7] detectors. These detectors provide a significant symbol error rate (SER) performance gain over the classical MMSE detector [8]. Unfortunately, when a large number of moving reflectors exist, MP and AMP suffer from performance degradation due to high ICI [5]. The UAMP detector addresses this issue by performing a singular value decomposition (SVD) that exploits the structure of the OTFS channel prior to executing AMP. Similar reliability and complexity to the UAMP detector have also been achieved by our iterative MMSE-BPIC detector in [5], which combines an MMSE denoiser, the Bayesian concept, and parallel interference cancellation (PIC) to perform iterative symbol detection. Unfortunately, their performance is still suboptimal in comparison with the EP OTFS detector [7]. EP uses the Bayesian concept and multivariate Gaussian distributions to iteratively approximate the mean and variance of the posterior of the detected symbols from the observed received signals. The superior performance of the EP detector comes at the cost of high computational complexity, as it performs iterative matrix inversion operations.

In addition to these iterative detectors, deep neural network (DNN) based approaches are widely used in symbol detector design. They can be divided into two categories: 1) training-based DNNs and 2) untrained DNNs. A training-based DNN requires a large dataset to train the symbol detector prior to deployment. Recent examples in the training-based DNN category are the 2-D convolutional neural network (CNN) based OTFS detector in [9] and our recently proposed BPICNet OTFS detector in [10], which integrates the MMSE denoiser, BPIC and a DNN, whereby the modified BPIC parameters are trained by the DNN. The training-based DNN approach has two major disadvantages: 1) it depends on the availability of large computational resources, which entails substantial energy consumption, CO2 emissions and cost for the training phase [11]; and 2) the limited fidelity, in real environments, of synthetic training data that are generated artificially because acquiring real datasets is costly [12]. For example, a high-fidelity training dataset implies that the distribution functions of all possible velocities of the mobile reflectors are known beforehand, which is impossible.

The second category, untrained DNNs, avoids the need for training datasets. The deep image prior (DIP) proposed in [13] has been widely used in image restoration as an untrained DNN approach. The encoder-decoder architecture used in the original DIP shows excellent performance in image restoration tasks, but its use of up to millions of trainable parameters results in high latency, and thus it still cannot be used for an OTFS detector, which requires close-to-real-time processing. Recently, the authors in [14] showed that a decoder-only DIP offers performance similar to that of an encoder-decoder DIP architecture when applied to magnetic resonance imaging (MRI). The complexity of the decoder-only DIP is significantly lower than that of the original encoder-decoder DIP, enhancing its potential use in a real-time OTFS detector. To date, no study has been conducted on untrained DNN based OTFS detectors.

In this paper, we propose to use an untrained DNN with BPIC to perform iterative symbol detection. Specifically, we use a DIP with a decoder-only architecture, referred to as D-DIP, to act as a denoiser and provide the initial symbol estimates for the BPIC detector. We choose BPIC in order to keep the computational complexity of the OTFS receiver low. We first describe a single-input single-output (SISO) OTFS system model consisting of the transmitter, channel and receiver. We then review the MMSE-BPIC detector in [15, 5], which uses the MMSE denoiser to obtain the initial symbol estimates. Instead of MMSE, we propose a high-performance D-DIP denoiser to calculate the initial symbol estimates fed to the BPIC. We then explain the proposed D-DIP in detail and provide computational complexity and performance comparisons with other schemes. Simulation results indicate an average SER gain of approximately 0.5 dB over other practical schemes in the literature.

The main contribution of this paper is that it is the first to propose the combination of a decoder-only DIP denoiser and the BPIC OTFS detector. The proposed denoiser 1) provides better initial symbol estimates for the BPIC detector and 2) has lower computational complexity than the MMSE denoiser. As a result, the proposed scheme achieves the closest SER performance to the EP scheme among the compared schemes, with much lower computational complexity (approximately 15 times lower than EP).

Figure 1: The system model of the OTFS modulation scheme

Notations: $a$, ${\bf a}$, and ${\bf A}$ denote a scalar, a vector, and a matrix, respectively. $\mathbb{C}^{M\times N}$ denotes the set of $M\times N$ dimensional complex matrices. We use ${\bf I}_N$, ${\bf F}_N$, and ${\bf F}_N^{\bf H}$ to represent the $N$-dimensional identity matrix, the $N$-point discrete Fourier transform (DFT) matrix, and the $N$-point inverse discrete Fourier transform (IDFT) matrix, respectively. $(\cdot)^T$ represents the transpose operation. We define ${\bf a}={\sf vec}({\bf A})$ as the column-wise vectorization of matrix ${\bf A}$, and ${\bf A}={\sf vec}^{-1}({\bf a})$ denotes the vector elements folded back into a matrix. The Kronecker product is denoted as $\otimes$. $\lfloor\frac{a}{b}\rfloor$ represents the floor operation, and $[\cdot]_M$ represents the mod-$M$ operation. The Euclidean norm of vector ${\bf x}$ is denoted as $\|{\bf x}\|$. We use $\mathcal{N}({\bf x}:{\boldsymbol{\mu}},{\boldsymbol{\Sigma}})$ to express the multivariate Gaussian distribution of a vector ${\bf x}$, where ${\boldsymbol{\mu}}$ is the mean and ${\boldsymbol{\Sigma}}$ is the covariance matrix.

II OTFS System Model

We consider an OTFS system, as illustrated in Fig. 1. In the following, we explain the details of the OTFS transmitter, channel and receiver.

II-A OTFS Transmitter

On the transmitter side, $MN$ information symbols ${\bf X}_{\rm DD}\in\mathbb{C}^{M\times N}$ from a modulation alphabet $\mathbb{A}=\{a_1,\cdots,a_Q\}$ of size $Q$ are allocated to an $M\times N$ grid in the delay-Doppler (DD) domain, where $M$ and $N$ represent the number of subcarriers and time slots used, respectively. As illustrated in Fig. 1, the DD domain symbols are transformed into the time-frequency (TF) domain by using the inverse symplectic finite Fourier transform (ISFFT) [1]. Here, the TF domain is discretized into an $M$ by $N$ grid with uniform intervals $\Delta f$ (Hz) and $T_s=1/\Delta f$ (seconds), respectively; the sampling time is therefore $T_s/M$. The TF domain sample ${\bf X}_{\rm TF}\in\mathbb{C}^{M\times N}$ is an OTFS frame, which occupies a bandwidth of $M\Delta f$ and a duration of $NT_s$, and is given as

${\bf X}_{\rm TF}={\bf F}_{M}{\bf X}_{\rm DD}{\bf F}_{N}^{\bf H}, \qquad (1)$

where ${\bf F}_M\in\mathbb{C}^{M\times M}$ and ${\bf F}_N^{\bf H}\in\mathbb{C}^{N\times N}$ are the $M$-point DFT and $N$-point IDFT matrices, whose $(p,q)$-th entries are $\frac{1}{\sqrt{M}}e^{-j2\pi pq/M}$, $p,q=0,\cdots,M-1$, and $\frac{1}{\sqrt{N}}e^{j2\pi pq/N}$, $p,q=0,\cdots,N-1$, respectively. The $(m,n)$-th entry $X_{\rm TF}[m,n]$ of ${\bf X}_{\rm TF}$ is written as

$X_{\rm TF}[m,n]=\frac{1}{\sqrt{MN}}\sum_{k=0}^{N-1}\sum_{l=0}^{M-1}X_{\rm DD}[k,l]e^{j2\pi(\frac{nk}{N}-\frac{ml}{M})}, \qquad (2)$

where $X_{\rm DD}[k,l]$ represents the $(k,l)$-th entry of ${\bf X}_{\rm DD}$ for $k=0,\cdots,N-1$, $l=0,\cdots,M-1$. The (discrete) Heisenberg transform [1] is then applied to generate the time-domain transmitted signal. By using (1) and the Kronecker product rule (a matrix multiplication is often expressed by using vectorization with the Kronecker product, i.e., ${\sf vec}({\bf A}{\bf B}{\bf C})=({\bf C}^T\otimes{\bf A}){\sf vec}({\bf B})$), the vector form of the transmitted signal can be written as

${\bf s}={\sf vec}({\bf G}_{\rm tx}{\bf F}^{\bf H}_{M}{\bf X}_{\rm TF})=({\bf F}^{\bf H}_{N}\otimes{\bf G}_{\rm tx}){\bf x}_{\rm DD}, \qquad (3)$

where ${\bf G}_{\rm tx}$ is the pulse-shaping waveform; we consider a rectangular waveform of duration $T_s$, which leads to ${\bf G}_{\rm tx}={\bf I}_M$ [16]. Here, ${\bf x}_{\rm DD}={\sf vec}({\bf X}_{\rm DD})=[x_{\rm DD}(0),\cdots,x_{\rm DD}(MN-1)]^T$, and ${\bf s}=[s(0),\cdots,s(n),\cdots,s(MN-1)]^T\in\mathbb{C}^{MN\times 1}$, $n=0,\cdots,MN-1$, is the vector form of the transmitted signal, where $s(n)$ can be written as

$s(n)=\frac{1}{\sqrt{N}}\sum_{k=0}^{N-1}e^{j2\pi\lfloor\frac{n}{M}\rfloor k/N}x_{\rm DD}([n]_M+kM). \qquad (4)$

We insert a cyclic prefix (CP) at the beginning of each OTFS frame; the CP length equals the maximum delay index $l_{max}$, i.e., $N_{\rm cp}=l_{max}$. The time duration after adding the CP is thus $NT_s+N_{\rm cp}\frac{T_s}{M}$. After adding the CP, ${\bf s}=[s(MN-N_{\rm cp}+1),s(MN-N_{\rm cp}+2),\cdots,s(MN-1),s(0),\cdots,s(n),\cdots,s(MN-1)]^T$, and ${\bf s}$ is transmitted through a time-varying channel.
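To make the transmitter chain concrete, the mapping of (3)-(4) from the DD grid to the time-domain frame can be sketched in NumPy as below. This is an illustrative sketch, not the authors' implementation; `otfs_modulate` and `otfs_modulate_samples` are hypothetical helper names, and CP insertion is omitted.

```python
import numpy as np

def otfs_modulate(X_DD):
    """Eq. (3) with G_tx = I_M: s = (F_N^H kron I_M) vec(X_DD)."""
    M, N = X_DD.shape
    F_N = np.fft.fft(np.eye(N)) / np.sqrt(N)       # N-point unitary DFT matrix
    x_dd = X_DD.flatten(order="F")                  # column-wise vectorization
    return np.kron(F_N.conj().T, np.eye(M)) @ x_dd

def otfs_modulate_samples(X_DD):
    """Equivalent per-sample form, Eq. (4)."""
    M, N = X_DD.shape
    x_dd = X_DD.flatten(order="F")
    s = np.zeros(M * N, dtype=complex)
    for n in range(M * N):
        for k in range(N):
            s[n] += np.exp(2j * np.pi * (n // M) * k / N) * x_dd[(n % M) + k * M]
    return s / np.sqrt(N)
```

Both forms produce the same frame, which is a useful consistency check on the Kronecker-product derivation.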

II-B OTFS Wireless Channel

The OTFS wireless channel is a time-varying multipath channel, represented by its impulse response in the DD domain,

$h(\tau,v)=\sum_{i=1}^{P}h_i\delta(\tau-\tau_i)\delta(v-v_i), \qquad (5)$

where $\delta(\cdot)$ is the Dirac delta function, $h_i\sim\mathcal{N}(0,1/P)$ denotes the $i$-th path gain, and $P$ is the total number of paths. Each path represents a channel between a moving reflector/transmitter and the receiver with different delay ($\tau_i$) and/or Doppler ($v_i$) characteristics. The delay and Doppler shifts are given as $\tau_i=l_i\frac{T_s}{M}$ and $v_i=k_i\frac{\Delta f}{N}$, respectively. The ICI depends on the delay and Doppler of the channel, as illustrated in [16]. Here, for every path, the randomly selected integers $l_i\in[0,l_{max}]$ and $k_i\in[-k_{max},k_{max}]$ denote the indices of the delay and Doppler shifts, where $l_{max}$ and $k_{max}$ are the indices of the maximum delay and maximum Doppler shifts among all channel paths. Note that the combination of $l_i$ and $k_i$ is different for every path. For our wireless channel, we assume $l_{max}\leq M-1$ and $k_{max}\leq\lfloor\frac{N}{2}\rfloor$, implying maximum channel delay and Doppler shifts of less than $T_s$ seconds and $\Delta f$ Hz, respectively.
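The path parameters above fully determine the discrete time-domain channel: each path contributes a circular delay shift and a diagonal Doppler shift, which is the matrix form ${\bf H}=\sum_i h_i{\bf I}_{MN}(l_i){\boldsymbol{\Delta}}(k_i)$ given with (8) below. A minimal NumPy sketch, with a hypothetical helper name:

```python
import numpy as np

def build_time_domain_channel(M, N, path_gains, delays, dopplers):
    """H = sum_i h_i * I_MN(l_i) * Delta(k_i), as in Eq. (8).
    I_MN(l): identity with columns circularly left-shifted by l (delay);
    Delta(k): diagonal Doppler-shift matrix."""
    MN = M * N
    H = np.zeros((MN, MN), dtype=complex)
    for h, l, k in zip(path_gains, delays, dopplers):
        Pi = np.roll(np.eye(MN), -l, axis=1)   # circular left column shift by l
        Delta = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))
        H += h * Pi @ Delta
    return H
```

Applied to a signal, a pure delay path cyclically shifts the samples, while a pure Doppler path multiplies them by a complex exponential, matching Eq. (7).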

II-C OTFS Receiver

At the receiver side, the time-domain received signal $r(t)$ is given as [1]

$r(t)=\int\!\!\int h(\tau,v)s(t-\tau)e^{j2\pi v(t-\tau)}\,d\tau\,dv+w(t), \qquad (6)$

where $s(t)$ is the continuous-time representation of the transmitted signal ${\bf s}$, and $h(\tau,v)$ is the DD domain channel in (5). The received signal $r(t)$ is then sampled at $t=\frac{n}{M\Delta f}$, where $n=0,\cdots,MN-1$. After discarding the CP, the discrete received signal $r(n)$ is obtained from (5) and (6) as

$r(n)=\sum_{i=1}^{P}h_i e^{j2\pi\frac{k_i(n-l_i)}{MN}}s([n-l_i]_{MN})+w(n). \qquad (7)$

We then write (7) in the vector form as

${\bf r}={\bf H}{\bf s}+{\bf w}, \qquad (8)$

where ${\bf w}$ is complex independent and identically distributed (i.i.d.) white Gaussian noise following $\mathcal{N}({\bf 0},\sigma_c^2{\bf I})$, with $\sigma_c^2$ the noise variance, and ${\bf H}=\sum_{i=1}^{P}h_i{\bf I}_{MN}(l_i){\boldsymbol{\Delta}}(k_i)$. Here, ${\bf I}_{MN}(l_i)$ denotes the $MN\times MN$ matrix obtained by circularly left-shifting the columns of the identity matrix by $l_i$, and ${\boldsymbol{\Delta}}(k_i)={\sf diag}[e^{\frac{j2\pi k_i(0)}{MN}},e^{\frac{j2\pi k_i(1)}{MN}},\cdots,e^{\frac{j2\pi k_i(MN-1)}{MN}}]$ is the $MN\times MN$ diagonal Doppler-shift matrix, where ${\sf diag}(\cdot)$ denotes the diagonalization of a vector. Note that the matrices ${\bf I}_{MN}(l_i)$ and ${\boldsymbol{\Delta}}(k_i)$ model the delay and Doppler shifts in (5), respectively. As shown in Fig. 1, the TF domain received signal ${\bf Y}_{\rm TF}\in\mathbb{C}^{M\times N}$ is obtained by applying the Wigner transform [16], shown as

${\bf Y}_{\rm TF}={\bf F}_M{\bf G}_{\rm rx}{\bf R}, \qquad (9)$

where ${\bf R}={\sf vec}^{-1}({\bf r})$, and ${\bf G}_{\rm rx}={\bf I}_M$ is the receive rectangular waveform of duration $T_s$. The DD domain received signal ${\bf Y}_{\rm DD}\in\mathbb{C}^{M\times N}$ is then obtained by using the symplectic finite Fourier transform (SFFT), which is

${\bf Y}_{\rm DD}={\bf F}^{\bf H}_M{\bf Y}_{\rm TF}{\bf F}_N={\bf F}^{\bf H}_M{\bf F}_M{\bf G}_{\rm rx}{\bf R}{\bf F}_N={\bf G}_{\rm rx}{\bf R}{\bf F}_N. \qquad (10)$

By following the vectorization with Kronecker product rule, we can rewrite (10) as

${\bf y}_{\rm DD}={\sf vec}({\bf Y}_{\rm DD})={\sf vec}({\bf G}_{\rm rx}{\bf R}{\bf F}_N)=({\bf F}_N\otimes{\bf G}_{\rm rx}){\bf r}. \qquad (11)$

By substituting (3) into (8) and (11) we obtain

${\bf y}_{\rm DD}={\bf H}_{\rm DD}{\bf x}_{\rm DD}+\tilde{{\bf w}}, \qquad (12)$

where ${\bf H}_{\rm DD}=({\bf F}_N\otimes{\bf G}_{\rm rx}){\bf H}({\bf F}^{\bf H}_N\otimes{\bf G}_{\rm tx})$ and $\tilde{{\bf w}}=({\bf F}_N\otimes{\bf G}_{\rm rx}){\bf w}$ denote the effective channel and noise in the DD domain, respectively. Here, $\tilde{{\bf w}}$ remains i.i.d. Gaussian noise, since ${\bf F}_N\otimes{\bf G}_{\rm rx}$ is a unitary matrix [1, 16]. For convenience, we transform the complex-valued model (12) into a real-valued model. Accordingly, ${\bf x}=[\Re({\bf x}_{\rm DD})^T\ \Im({\bf x}_{\rm DD})^T]^T$, ${\bf y}=[\Re({\bf y}_{\rm DD})^T\ \Im({\bf y}_{\rm DD})^T]^T$, ${\bf n}=[\Re(\tilde{{\bf w}})^T\ \Im(\tilde{{\bf w}})^T]^T$, and ${\bf H}_{\rm eff}=\begin{bmatrix}\Re({\bf H}_{\rm DD})&-\Im({\bf H}_{\rm DD})\\ \Im({\bf H}_{\rm DD})&\Re({\bf H}_{\rm DD})\end{bmatrix}$, where $\Re(\cdot)$ and $\Im(\cdot)$ are the real and imaginary parts, respectively. Thus, the variance of ${\bf n}$ is $\sigma^2=\sigma_c^2/2$; ${\bf x},{\bf y},{\bf n}$ are vectors of size $2MN$ and ${\bf H}_{\rm eff}$ is a matrix of size $2MN\times 2MN$. We can then rewrite (12) as

${\bf y}={\bf H}_{\rm eff}{\bf x}+{\bf n}. \qquad (13)$

We assume ${\bf H}_{\rm eff}$ is known at the detector. For notational simplicity, we omit the subscript of ${\bf H}_{\rm eff}$ in (13) and write it as ${\bf H}$ in all subsequent sections. The signal-to-noise ratio (SNR) of the system is defined as ${\rm SNR}=10\log_{10}(\frac{1}{\sigma_c^2})$ dB.
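Under the rectangular-pulse assumption ${\bf G}_{\rm tx}={\bf G}_{\rm rx}={\bf I}_M$, the effective DD channel of (12) and the real-valued model of (13) can be sketched as follows (hypothetical helper names, not the authors' code):

```python
import numpy as np

def dd_effective_channel(H, M, N):
    """H_DD = (F_N kron I_M) H (F_N^H kron I_M), per Eq. (12)."""
    F_N = np.fft.fft(np.eye(N)) / np.sqrt(N)   # unitary DFT matrix
    A = np.kron(F_N, np.eye(M))
    return A @ H @ A.conj().T

def to_real_model(H_DD, y_DD):
    """Stack real/imaginary parts to obtain y = H_eff x + n, Eq. (13)."""
    H_eff = np.block([[H_DD.real, -H_DD.imag],
                      [H_DD.imag,  H_DD.real]])
    y = np.concatenate([y_DD.real, y_DD.imag])
    return H_eff, y
```

The real-valued stacking doubles the dimension to $2MN$ but lets the detector work with real Gaussian statistics only.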

III MMSE-BPIC Detector

In this section, we briefly describe the BPIC detector with an MMSE denoiser, recently proposed in [15]. The structure of the BPIC detector is shown in Fig. 2. It consists of four modules: the denoiser, Bayesian symbol observation (BSO), Bayesian symbol estimation (BSE), and decision statistics combining (DSC) modules.

In the denoiser module, the MMSE scheme is used to obtain the initial symbol estimates $\hat{{\bf x}}^{(0)}$ for the first BPIC iteration [15], as shown in Fig. 2. The MMSE denoiser can be expressed as

$\hat{{\bf x}}^{(0)}=({\bf H}^T{\bf H}+\sigma^2{\bf I})^{-1}{\bf H}^T{\bf y}. \qquad (14)$
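A minimal sketch of the MMSE denoiser (14), using a linear solve rather than an explicit matrix inverse for numerical stability (an illustrative helper, not the authors' code):

```python
import numpy as np

def mmse_denoiser(H, y, sigma2):
    """Initial estimate x^(0) = (H^T H + sigma^2 I)^{-1} H^T y, Eq. (14)."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + sigma2 * np.eye(n), H.T @ y)
```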

In the BSO module, a matched-filter-based PIC scheme is used to detect the transmitted symbols, given as

$\mu_q^{(t)}=\hat{x}_q^{(t-1)}+\frac{{\bf h}_q^T({\bf y}-{\bf H}\hat{{\bf x}}^{(t-1)})}{\|{\bf h}_q\|^2}, \qquad (15)$

where $\mu_q^{(t)}$ is the soft estimate of the $q$-th symbol $x_q$ in iteration $t$, ${\bf h}_q$ is the $q$-th column of matrix ${\bf H}$, and $\hat{{\bf x}}^{(t-1)}=[\hat{x}_1^{(t-1)},\cdots,\hat{x}_q^{(t-1)},\cdots,\hat{x}_{2MN}^{(t-1)}]^T$ is the vector of estimated symbols. The variance $\Sigma_q^{(t)}$ of the $q$-th symbol estimate is derived in [15] as

$\Sigma_q^{(t)}=\frac{1}{({\bf h}_q^T{\bf h}_q)^2}\Big(\sum_{j=1,\,j\neq q}^{2MN}({\bf h}_q^T{\bf h}_j)^2 v_j^{(t-1)}+({\bf h}_q^T{\bf h}_q)\sigma^2\Big), \qquad (16)$

where $v_j^{(t-1)}$ is the $j$-th element of the vector of symbol-estimate variances ${\bf v}^{(t-1)}=[v_1^{(t-1)},\cdots,v_q^{(t-1)},\cdots,v_{2MN}^{(t-1)}]^T$ in iteration $t-1$. We set ${\bf v}^{(0)}={\bf 0}$ because we have no prior knowledge of the variance at the beginning. The estimated symbols ${\boldsymbol{\mu}}^{(t)}=[\mu_1^{(t)},\cdots,\mu_q^{(t)},\cdots,\mu_{2MN}^{(t)}]^T$ and variances ${\boldsymbol{\Sigma}}^{(t)}=[\Sigma_1^{(t)},\cdots,\Sigma_q^{(t)},\cdots,\Sigma_{2MN}^{(t)}]^T$ are forwarded to the BSE module, as shown in Fig. 2.

In the BSE module, we compute the Bayesian symbol estimate and variance of the $q$-th symbol from the BSO module outputs, given as

$\hat{x}_q^{(t)}=\mathbb{E}\big[x_q\big|\mu_q^{(t)},\Sigma_q^{(t)}\big]=\sum_{a\in\Omega}a\,\hat{p}^{(t)}(x_q=a|{\bf y}), \qquad (17)$
$v_q^{(t)}=\mathbb{E}\Big[\big|x_q-\mathbb{E}\big[x_q\big|\mu_q^{(t)},\Sigma_q^{(t)}\big]\big|^2\Big], \qquad (18)$

where $\hat{p}^{(t)}(x_q|{\bf y})=\mathcal{N}(x_q:\mu_q^{(t)},\Sigma_q^{(t)})$ is obtained from the BSO module and is normalized so that $\sum_{a\in\Omega}\hat{p}^{(t)}(x_q=a|{\bf y})=1$. The outputs of the BSE module, $\hat{x}_q^{(t)}$ and $v_q^{(t)}$, are then sent to the following DSC module.

The DSC module performs a linear combination of the symbol estimates in two consecutive iterations, shown as

$\hat{x}_q^{(t)}=(1-\rho_q^{(t)})\hat{x}_q^{(t-1)}+\rho_q^{(t)}\hat{x}_q^{(t)}, \qquad (19)$
$v_q^{(t)}=(1-\rho_q^{(t)})v_q^{(t-1)}+\rho_q^{(t)}v_q^{(t)}. \qquad (20)$

The weighting coefficient is determined by maximizing the signal-to-interference-plus-noise ratio (SINR), given as

$\rho_q^{(t)}=\frac{e_q^{(t-1)}}{e_q^{(t)}+e_q^{(t-1)}}, \qquad (21)$

where $e_q^{(t)}$ is defined as the instantaneous squared error of the $q$-th symbol estimate, computed using the maximum ratio combining (MRC) filter,

$e_q^{(t)}=\Big\|\frac{{\bf h}_q^T}{\|{\bf h}_q\|^2}\big({\bf y}-{\bf H}\hat{{\bf x}}^{(t)}\big)\Big\|^2. \qquad (22)$

The weighted symbol estimates $\hat{{\bf x}}^{(t)}$ and their variances ${\bf v}^{(t)}$ are then returned to the BSO module to continue the iteration. After $T$ iterations, $\hat{{\bf x}}^{(T)}$ is taken as the vector of symbol estimates.
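Putting the four modules together, the BPIC iterations (15)-(22) can be sketched as below. This is a simplified NumPy sketch for the real-valued model (13), not the authors' code; `alphabet` is the per-dimension constellation $\Omega$, and `x0` is the initial estimate from any denoiser (MMSE here, D-DIP in Section IV):

```python
import numpy as np

def bpic_detect(H, y, sigma2, x0, alphabet, T=10):
    """Iterative BPIC detection: BSO (15)-(16), BSE (17)-(18), DSC (19)-(22)."""
    G = H.T @ H
    g = np.diag(G)                              # h_q^T h_q for each column
    x_hat, v = x0.copy(), np.zeros(H.shape[1])  # v^(0) = 0
    e_prev = None
    for _ in range(T):
        # BSO: matched-filter parallel interference cancellation, Eq. (15)
        mu = x_hat + H.T @ (y - H @ x_hat) / g
        # BSO variance, Eq. (16): cross-correlation terms j != q plus noise
        G2 = G ** 2
        Sigma = (G2 @ v - np.diag(G2) * v + g * sigma2) / g ** 2
        # BSE: normalized Gaussian posterior over the alphabet, Eqs. (17)-(18)
        logp = -(alphabet[None, :] - mu[:, None]) ** 2 / (2 * Sigma[:, None])
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        x_bse = p @ alphabet
        v_bse = p @ alphabet ** 2 - x_bse ** 2
        # DSC: combine consecutive estimates weighted by residual errors,
        # Eqs. (19)-(22)
        e = (H.T @ (y - H @ x_bse) / g) ** 2
        if e_prev is None:
            x_hat, v = x_bse, v_bse
        else:
            rho = e_prev / (e + e_prev + 1e-12)
            x_hat = (1 - rho) * x_hat + rho * x_bse
            v = (1 - rho) * v + rho * v_bse
        e_prev = e
    return x_hat
```

In the first iteration there is no $e_q^{(0)}$, so the sketch simply takes the BSE output; the small constant guards against a zero denominator in (21).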

Figure 2: BPIC detector architecture
Figure 3: D-DIP structure

IV D-DIP Denoiser for Symbol Estimation

In this section, we propose D-DIP to improve the initial symbol estimates used by the BPIC detector. The whole iterative process of D-DIP is shown in Fig. 3.

The DNN used in D-DIP is a fully connected decoder DNN consisting of $L=5$ fully connected layers: an input layer, three hidden layers, and an output layer with $p_1=4$, $p_2=8$, $p_3=16$, $p_4=32$, and $p_5=2MN$ neurons, respectively. We use a random vector ${\bf z}_0$ of size $4\times 1$ drawn from a normal distribution $\mathcal{N}({\bf 0},{\bf I})$ as the input of the first (input) layer of the DNN. ${\bf z}_0$ is kept fixed during the D-DIP iterative process.

The DNN output ${\bf x}_{\rm D\text{-}DIP}^{(i)}$ at iteration $i$ is obtained by passing ${\bf z}_0$ through the five layers, shown as

${\bf x}_{\rm D\text{-}DIP}^{(i)}=c\,f_L^{(i)}(f_{L-1}^{(i)}(\cdots f_2^{(i)}({\bf z}_0))), \qquad (23)$

where $c$ is a constant used to control the output range of the DNN and $f_l^{(i)}$ is the output of layer $l$ at iteration $i$,

$f_l^{(i)}={\rm Tanh}({\bf W}_l^{(i)}f_{l-1}^{(i)}+{\bf b}_l^{(i)}),\quad l=2,\dots,L, \qquad (24)$

where $f_1^{(i)}={\bf z}_0$, ${\bf W}_l^{(i)}$ represents the weight matrix between layers $l$ and $l-1$ at iteration $i$, and ${\bf b}_l^{(i)}$ is the bias vector of layer $l$ at iteration $i$. Initially, each entry of ${\bf W}_l^{(0)}$ and ${\bf b}_l^{(0)}$ is drawn from a uniform distribution over $(\frac{-1}{\sqrt{p_l}},\frac{1}{\sqrt{p_l}})$ [17], where $p_l$ represents the number of neurons in layer $l$. ${\rm Tanh}$ is the activation function applied after each layer.
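A minimal sketch of the decoder forward pass (23)-(24) and the uniform initialization. The function names are hypothetical, the initialization bound is taken per-layer fan-in as in the cited scheme, and a deep-learning framework such as PyTorch would normally supply the backward pass:

```python
import numpy as np

def init_decoder(layer_sizes, rng):
    """Uniform init over (-1/sqrt(p_l), 1/sqrt(p_l)), fan-in convention [17]."""
    params = []
    for p_in, p_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        bound = 1 / np.sqrt(p_in)
        W = rng.uniform(-bound, bound, size=(p_out, p_in))
        b = rng.uniform(-bound, bound, size=p_out)
        params.append((W, b))
    return params

def ddip_forward(z0, params, c):
    """x = c * f_L(... f_2(z0)), Tanh after every layer, Eqs. (23)-(24)."""
    f = z0
    for W, b in params:
        f = np.tanh(W @ f + b)
    return c * f
```

Because the last activation is Tanh, the output is confined to $(-c, c)$, which is why $c$ is matched to the constellation power in Section VI.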

We then use the stopping scheme of [18] to control the iterative process of D-DIP and avoid overfitting caused by the over-parameterization of the DIP. The stopping scheme is based on the variance of the DNN output, given as

$\varsigma^{(i)}=\frac{1}{W}\sum_{j=i-W}^{i}\Big\|{\bf x}_{\rm D\text{-}DIP}^{(j)}-\frac{1}{W}\sum_{j'=i-W}^{i}{\bf x}_{\rm D\text{-}DIP}^{(j')}\Big\|^2,\quad i\geq W, \qquad (25)$

where $\varsigma^{(i)}$ is the variance value at iteration $i$; for $i<W$ the variance calculation is inactive. $W$ is a constant determined experimentally and should be smaller than the number of iterations D-DIP needs to converge. As shown in Fig. 3, we compare $\varsigma^{(i)}$ with a threshold $\epsilon$. If $\varsigma^{(i)}<\epsilon$, the iterative process of D-DIP stops, and the output ${\bf x}_{\rm D\text{-}DIP}^{(I)}$ is forwarded to the BPIC as the initial symbol estimate, i.e., $\hat{{\bf x}}^{(0)}={\bf x}_{\rm D\text{-}DIP}^{(I)}$, where $I$ is the index of the last D-DIP iteration. Otherwise, we compute the mean square error (MSE) loss

$\mathcal{L}^{(i)}=\frac{1}{2MN}\|{\bf H}{\bf x}_{\rm D\text{-}DIP}^{(i)}-{\bf y}\|^2. \qquad (26)$

The DNN parameters, i.e., the weights ${\bf W}_l^{(i)}$ and biases ${\bf b}_l^{(i)}$, are then updated by the Adam optimizer [19] using the loss in (26), and the process repeats as shown in Fig. 3.
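The variance-based stopping test (25) alone can be sketched as follows, assuming the per-iteration DNN outputs are collected in a list (hypothetical helper name; the window here uses the last $W$ outputs):

```python
import numpy as np

def should_stop(outputs, W, eps):
    """Early-stopping rule of Eq. (25): stop once the windowed variance
    of the DNN outputs falls below the threshold eps."""
    if len(outputs) < W:
        return False          # variance calculation inactive for i < W
    window = np.stack(outputs[-W:])
    mean = window.mean(axis=0)
    variance = np.mean(np.sum((window - mean) ** 2, axis=1))
    return variance < eps
```

Intuitively, once the network output stops changing across the window, further Adam steps would only fit the noise, so the iteration halts.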

V Complexity Analysis

In this section, we analyze the computational complexity of the proposed D-DIP-BPIC detector. For D-DIP, the dominant cost is the matrix-vector multiplications performed per iteration, giving $\mathcal{O}(M^2N^2I)$, where $I$ denotes the number of D-DIP iterations. The computational complexity of the different detection algorithms is shown in Table I, where $T$ represents the number of iterations of the BPIC, UAMP, EP and BPICNet detectors. For instance, for $M=12$, $N=7$, $T=10$, $I=50$, the complexity of D-DIP-BPIC is approximately 1.5 times lower than those of MMSE-BPIC, UAMP and BPICNet, and approximately 15 times lower than that of EP. Our proposed detector thus has the lowest complexity among these high-performance detectors.
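The quoted factors can be sanity-checked from the leading terms in Table I. Constant factors are ignored, so the ratios below are only indicative of the asymptotic counts:

```python
# Leading-term operation counts from Table I for M=12, N=7, T=10, I=50.
M, N, T, I = 12, 7, 10, 50
d_dip_bpic = M**2 * N**2 * I + M**2 * N**2 * T   # O(M^2 N^2 I + M^2 N^2 T)
mmse_bpic = M**3 * N**3 + M**2 * N**2 * T        # O(M^3 N^3 + M^2 N^2 T)
ep = M**3 * N**3 * T                             # O(M^3 N^3 T)
print(mmse_bpic / d_dip_bpic)   # ratio vs. MMSE-BPIC (and UAMP)
print(ep / d_dip_bpic)          # ratio vs. EP
```

The leading terms give ratios of roughly 1.6 and 14, consistent with the approximate factors quoted above.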

Note that BPICNet incurs extra complexity due to its training requirements: it uses a large dataset for training prior to deployment, e.g., $b=5.12\times 10^6$ samples in [10].

Fig. 4 shows the cumulative distribution function (CDF) of $I$ (i.e., the number of D-DIP iterations needed to satisfy the stopping scheme (25)) for $M=12,24,36,48$, $N=7$, $l_{max}=M-1$, $k_{max}=3$, ${\rm SNR}=15$ dB. The figure shows that the number of iterations required for D-DIP to converge, $I$, is not sensitive to the OTFS frame size (i.e., $M$ and $N$), which is a significant advantage.

Detector        | Complexity order (Training)              | Complexity order (Deployment)
MMSE-BPIC [5]   | Not required                             | $\mathcal{O}(M^3N^3+M^2N^2T)$
UAMP [20]       | Not required                             | $\mathcal{O}(M^3N^3+M^2N^2T)$
EP [7]          | Not required                             | $\mathcal{O}(M^3N^3T)$
BPICNet [10]    | $\mathcal{O}(b(M^3N^3+MN+M^2N^2T))$      | $\mathcal{O}(M^3N^3+MN+M^2N^2T)$
D-DIP-BPIC      | Not required                             | $\mathcal{O}(M^2N^2I+M^2N^2T)$
Table I: Computational complexity comparison
Figure 4: CDF of $I$

VI Numerical Results

In this section, we evaluate the proposed detector by comparing its SER performance with those of MMSE-BPIC [5], UAMP [20], EP [7] and BPICNet [10]. We use the UAMP variant in [20] because the UAMP proposed in [6] is not suitable for our system model, as shown in [5]. For the simulations, we set $N=7$, $l_{max}=M-1$, and $\Delta f=15$ kHz, and the carrier frequency is set to $f_c=10$ GHz. $4$-QAM modulation is employed, and we set $c=1/\sqrt{2}$, corresponding to the normalized constellation power, to normalize the DNN output. The same DNN configuration described in Section IV (e.g., number of layers and number of neurons per layer) is used in all simulations. We use the Adam optimizer with a learning rate of $0.01$ to optimize the DNN parameters. The stopping parameter $W$ in (25) is set to 30, and the threshold $\epsilon$ is set to 0.001. The number of iterations of BPIC, UAMP, EP and BPICNet is set to $T=10$ to ensure convergence. For BPICNet, we use the same training setting as in [10], where $M=12$, $N=7$, $l_{max}=11$, $k_{max}=3$, and 500 epochs are used during the training process, with 40 batches of 256 samples generated per epoch; $P\in\{6,\dots,12\}$ is chosen randomly and the SNR values are uniformly distributed over a certain range (see [10] for details).

Fig. 5(a) demonstrates that the proposed D-DIP-BPIC detector achieves around a 0.5 dB performance gain over MMSE-BPIC and UAMP; in fact, its SER performance is very close to those of BPICNet and EP. Fig. 5(b) evaluates the scalability of the proposed D-DIP-BPIC detector: as we increase the OTFS frame size (i.e., the number of subcarriers), D-DIP-BPIC maintains its gain over MMSE-BPIC and UAMP and achieves performance close to BPICNet and EP. Fig. 5(c) shows that when the number of paths (i.e., mobile reflectors) increases, the D-DIP-BPIC detector still achieves performance close to BPICNet and EP and outperforms the others. As shown in Fig. 5(d), the performance of the BPICNet detector degrades for $k_{max}=1$ compared with $k_{max}=2,3$, because the fidelity of its training data is compromised, while our D-DIP-BPIC still retains its benefit.

VII Conclusion

We proposed an untrained neural network based OTFS detector that achieves excellent performance compared to state-of-the-art OTFS detectors. Our simulation results showed that the proposed D-DIP-BPIC detector achieves a 0.5 dB SER performance improvement over MMSE-BPIC and performs close to EP in SER with much lower complexity.

Figure 5: SER performance. (a) $M=12$, $P=6$, $k_{max}=3$; (b) $P=6$, $k_{max}=3$, SNR $=15$ dB; (c) $M=12$, $k_{max}=3$, SNR $=15$ dB; (d) $M=12$, $P=12$, SNR $=15$ dB.

References

  • [1] R. Hadani, S. Rakib, M. Tsatsanis, A. Monk, A. J. Goldsmith, A. F. Molisch, and R. Calderbank, “Orthogonal time frequency space modulation,” in Proc. IEEE Wireless Commun. and Netw. Conf. (WCNC), USA, Mar. 2017, pp. 1–6.
  • [2] T. Jiang, H. H. Chen, H. C. Wu, and Y. Yi, “Channel modeling and inter-carrier interference analysis for V2V communication systems in frequency-dispersive channels,” Mob. Netw. Appl., vol. 15, no. 1, pp. 4–12, May 2010.
  • [3] P. Raviteja, K. T. Phan, Y. Hong, and E. Viterbo, “Interference cancellation and iterative detection for orthogonal time frequency space modulation,” IEEE Trans. Wirel. Commun., vol. 17, no. 10, pp. 6501–6515, Aug. 2018.
  • [4] M. Khumalo, W. T. Shi, and C. K. Wen, “Fixed-point implementation of approximate message passing (AMP) algorithm in massive MIMO systems,” Digit. Commun. Netw., vol. 2, no. 4, pp. 218–224, Sept. 2016.
  • [5] X. Qu, A. Kosasih, W. Hardjawana, V. Onasis, and B. Vucetic, “Bayesian-based symbol detector for orthogonal time frequency space modulation systems,” in Proc. IEEE Int. Symp. Pers. Indoor Mob. Radio Commun. (PIMRC), Finland, Sept. 2021.
  • [6] Z. Yuan, F. Liu, W. Yuan, Q. Guo, Z. Wang, and J. Yuan, “Iterative detection for orthogonal time frequency space modulation with unitary approximate message passing,” IEEE Trans. Wireless Commun., vol. 21, no. 2, pp. 714–725, 2022.
  • [7] F. Long, K. Niu, and J. Lin, “Low complexity block equalizer for OTFS based on expectation propagation,” IEEE Wireless Commun. Lett., vol. 11, no. 2, pp. 376–380, Feb. 2022.
  • [8] P. Singh, A. Gupta, H. B. Mishra, and R. Budhiraja. Low-complexity ZF/MMSE receivers for MIMO-OTFS systems with imperfect CSI. [Online]. Available: https://arxiv.org/abs/2010.04057
  • [9] Y. K. Enku, B. Bai, F. Wan, C. U. Guyo, I. N. Tiba, C. Zhang, and S. Li, “Two-dimensional convolutional neural network-based signal detection for OTFS systems,” IEEE Wireless Commun. Lett., vol. 10, no. 11, pp. 2514–2518, 2021.
  • [10] A. Kosasih, X. Qu, W. Hardjawana, C. Yue, and B. Vucetic, “Bayesian neural network detector for an orthogonal time frequency space modulation,” IEEE Wireless Commun. Lett., vol. 11, no. 12, pp. 2570–2574, 2022.
  • [11] E. Strubell, A. Ganesh, and A. McCallum, “Energy and policy considerations for deep learning in NLP,” in Annu. Meet. Assoc. Comput. Linguist. (ACL), Jul. 2019, pp. 3645–3650.
  • [12] A. Linden, “Is synthetic data the future of AI?” gartner.com. https://www.gartner.com/en/newsroom/press-releases/2022-06-22-is-synthetic-data-the-future-of-ai (accessed Jan. 8, 2023).
  • [13] V. Lempitsky, A. Vedaldi, and D. Ulyanov, “Deep image prior,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., USA, Jun. 2018, pp. 9446–9454.
  • [14] M. Z. Darestani and R. Heckel, “Accelerated MRI with un-trained neural networks,” IEEE Trans. Comput. Imag., vol. 7, pp. 724–733, 2021.
  • [15] A. Kosasih, V. Miloslavskaya, W. Hardjawana, C. She, C. K. Wen, and B. Vucetic, “A Bayesian receiver with improved complexity-reliability trade-off in massive MIMO systems,” IEEE Trans. Commun., vol. 69, no. 9, pp. 6251–6266, Sept. 2021.
  • [16] P. Raviteja, Y. Hong, E. Viterbo, and E. Biglieri, “Practical pulse-shaping waveforms for reduced-cyclic-prefix OTFS,” IEEE Trans. Vehic. Technol., vol. 68, no. 1, pp. 957–961, Jan. 2019.
  • [17] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Chile, Dec. 2015, pp. 1026–1034.
  • [18] H. Wang, T. Li, Z. Zhuang, T. Chen, H. Liang, and J. Sun, “Early stopping for deep image prior,” 2021. [Online]. Available: https://arxiv.org/abs/2112.06074
  • [19] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Represent. (ICLR), USA, 2015, pp. 1–15.
  • [20] Z. Yuan, Q. Guo, and M. Luo, “Approximate message passing with unitary transformation for robust bilinear recovery,” IEEE Trans. Signal Process., vol. 69, pp. 617–630, 2021.