Performance Analysis of Coded OTFS Systems over High-Mobility Channels

Shuangyang Li, Jinhong Yuan, Weijie Yuan, Zhiqiang Wei, Baoming Bai, and Derrick Wing Kwan Ng

Part of the paper has been submitted to the IEEE International Conference on Communication Workshops 2021 [1].

Abstract

Orthogonal time frequency space (OTFS) modulation is a recently developed multi-carrier multi-slot transmission scheme for wireless communications in high-mobility environments. In this paper, the error performance of coded OTFS modulation over high-mobility channels is investigated. We start from the study of the conditional pairwise-error probability (PEP) of the OTFS scheme, based on which a performance upper bound for the coded OTFS system is derived. Then, we show that the coding improvement for OTFS systems depends on the squared Euclidean distance among codeword pairs and the number of independent resolvable paths of the channel. More importantly, we show that there exists a fundamental trade-off between the coding gain and the diversity gain for OTFS systems, i.e., the diversity gain of OTFS systems improves with the number of resolvable paths, while the coding gain declines. Furthermore, based on our analysis, the impact of channel coding parameters on the performance of the coded OTFS systems is unveiled. The error performance of various coded OTFS systems over high-mobility channels is then evaluated. Simulation results demonstrate a significant performance improvement of OTFS modulation over the conventional orthogonal frequency division multiplexing (OFDM) modulation in high-mobility channels. Analytical results and the effectiveness of the proposed code design are also verified by simulations with the application of both classical and modern codes for OTFS systems.

Index Terms:

OTFS modulation, diversity analysis, code design, high-mobility, fading channels.

I Introduction

Beyond the fifth-generation (B5G) wireless communication systems are required to accommodate various emerging applications in high-mobility environments, such as mobile communications on board aircraft (MCA), low-earth-orbit satellites (LEOSs), high speed trains, and unmanned aerial vehicles (UAVs) [2, 3, 4]. Currently deployed orthogonal frequency division multiplexing (OFDM) modulation may not be able to support efficient and reliable communications in such scenarios [5]. Therefore, as a potential solution to supporting heterogeneous requirements of B5G wireless systems in high-mobility scenarios, the recently proposed orthogonal time frequency space (OTFS) modulation has attracted substantial attention [6].

In high-mobility scenarios, wireless channels are usually doubly-dispersive in the time-frequency (TF) domain [7, 8]. Specifically, the time dispersion is caused by multi-path propagation, while the frequency dispersion is caused by Doppler shifts. Conventionally, OFDM modulation can efficiently mitigate the intersymbol interference (ISI) induced by the time dispersion by introducing a cyclic prefix (CP). However, the success of OFDM modulation relies heavily on maintaining the orthogonality among all the sub-carriers. Note that perfect orthogonality is difficult to maintain at the receiver, especially in high-mobility environments, due to the exceedingly large frequency dispersion, and consequently, the performance of conventional OFDM systems is unsatisfactory in such scenarios [5]. On the other hand, by invoking the two-dimensional (2D) symplectic finite Fourier transform (SFFT), OTFS modulates the information symbols in the delay-Doppler (DD) domain, where the channel parameters are relatively stable compared to those in the TF domain [7]. More importantly, it can be shown that by modulating the information symbols in the DD domain rather than the TF domain, each symbol principally experiences the full fluctuation of the TF channel over an OTFS frame, and thus OTFS modulation offers the potential of exploiting the full channel diversity, achieving a better error performance than the conventional OFDM modulation in a high-mobility environment [6]. Furthermore, with the domain transformation performed in OTFS modulation, one can represent the highly dynamic channel parameters in the TF domain equivalently by a sparse representation in the DD domain. This unique property suggests that the acquisition of channel state information (CSI) for OTFS systems can be performed with a low pilot signaling overhead and that the symbol detection for OTFS systems can be carried out with a low complexity [9, 10].

To unleash the potential of OTFS modulation, some recent works have focused on the implementation of practical OTFS systems. For example, a low-complexity modem structure for OTFS systems was proposed in [11]. In particular, it showed that the OTFS modulation can be efficiently implemented with simple pre- and post-processing units based on the conventional OFDM modulator. Besides, to reduce the signaling overhead, an embedded pilot-aided channel estimation method for OTFS modulation was proposed in [9]. By taking advantage of the sparse representation of the wireless channel in the DD domain, this channel estimation method only requires one pilot symbol with a small number of guard symbols in the DD domain. Furthermore, in order to reduce the detection complexity, several variants of sum-product algorithms (SPAs) have been proposed to facilitate the OTFS detection. Specifically, Raviteja et al. proposed an SPA-based OTFS detector, where the inter-Doppler interference (IDI) was approximated as a Gaussian variable to reduce the detection complexity [10]. A variational Bayes detector for OTFS systems was proposed in [12]. The basic idea of this detector is to approximate the corresponding a posteriori distribution of the optimal detection by exploiting the Kullback-Leibler (KL) divergence such that the SPA can be implemented based on a simpler factor graph compared to that of the original OTFS modulation.

Although the aforementioned excellent works have provided guidelines for practical OTFS system designs, the theoretical error performance advantages of OTFS systems over the conventional OFDM systems have not been thoroughly studied yet, especially for coded cases. We note that there are several previous works [13, 14, 15] on the error performance analysis of OTFS systems. However, these works mainly considered the uncoded case and their analysis may not be directly extended to the coded cases. Consequently, to the best of our knowledge, the error performance analysis of coded OTFS systems is still missing in the literature. As commonly recognized, channel coding is an efficient tool to combat fading and channel impairments and thus is a key enabler for reliable communications between high-mobility users [16]. For OTFS modulation, the 2D transformation from the TF domain to the DD domain provides the potential of exploiting the full TF diversity. In this case, a good channel code needs to couple the coded symbols to the 2D OTFS modulation, in order to exploit the full diversity and in the meantime maximize the coding gain. However, it is still unknown which coding parameter determines the coding gain for OTFS modulation. Therefore, in order to facilitate the implementation of practical OTFS systems, the error performance analysis for coded OTFS systems needs to be investigated.

In this paper, we aim to analyze the error performance of coded OTFS systems. To this end, we start from the study of conditional pairwise-error probability (PEP) [17, 18] for a given channel realization. In order to obtain an accurate performance analysis, we consider two cases of the OTFS transmission depending on the number of independent resolvable paths of the channel and derive the corresponding conditional PEPs. Since the exact unconditional PEP is generally intractable [19], we resort to the application of some proper bounding techniques to study the conditional PEP and derive the unconditional performance upper bounds. Based on the unconditional performance bound, the impact of channel coding parameters on the performance of OTFS modulation is unveiled. In particular, we find that the squared Euclidean distance between a pair of codewords is the key parameter that determines the coding gain for coded OTFS systems, given the number of independent resolvable paths. Therefore, the code design criterion is formulated to optimize the coding gain by maximizing the minimum squared Euclidean distance between all codeword pairs. The main contributions of this paper can be summarized as follows.

• We investigate the conditional PEP of OTFS systems for a given channel realization by studying the pairwise Euclidean distance between OTFS codewords. Based on the conditional PEP, we derive the unconditional performance upper bounds for OTFS systems, according to the number of independent resolvable paths. We also show a few properties on the trace and determinant of the codeword difference matrix. Furthermore, according to the derived bounds, we define the coding gain and diversity gain of OTFS systems. More importantly, we show that the coding gain of OTFS systems depends on the squared Euclidean distance and the number of independent resolvable paths.

• According to the derived performance bounds, we show that there is a fundamental trade-off between the diversity gain and the coding gain for OTFS systems. In particular, the diversity gain of OTFS systems improves with the number of independent resolvable paths, while the coding gain declines.

• Based on the derived performance bounds, we propose our code design criterion to optimize the coding gain, which is to maximize the minimum squared Euclidean distance among all codeword pairs. In other words, traditional good codes with a large minimum Euclidean distance can be directly applied to OTFS systems.

• We demonstrate a significant performance improvement achieved by the coded OTFS modulation over the coded OFDM modulation over high-mobility channels by numerical simulations. We also provide numerical results of coded OTFS systems over high-mobility channels with various channel codes, such as classical convolutional codes and state-of-the-art low-density parity-check (LDPC) codes. Our performance analysis and code design are explicitly verified by these results.

The rest of this paper is organized as follows. We provide a brief overview and the system model of OTFS modulation in Section II. In Section III, the derivation of error performance bounds and the code design criterion for coded OTFS systems are investigated. The numerical results of coded OTFS systems are presented in Section IV and finally a summary is provided in Section V.

Notations: The blackboard bold letters ${\mathbb{A}}$ and ${\mathbb{H}}$ denote the constellation set and an arbitrary subspace, respectively; the notations $(\cdot)^{\rm{T}}$ , $(\cdot)^{*}$ , $\left\|{\cdot}\right\|$ , $(\cdot)^{-1}$ , and $(\cdot)^{\rm{H}}$ denote the transpose, the conjugate, the Euclidean norm, the inverse, and the Hermitian transpose of a matrix, respectively; $\mathbb{E}[\cdot]$ denotes the expectation; $\textrm{det}(\cdot)$ , $\textrm{tr}(\cdot)$ , and $\textrm{vec}(\cdot)$ denote the determinant, the trace, and the vectorization operation, respectively; $\textrm{diag}{\{\cdot\}}$ denotes the diagonal matrix; “ $\otimes$ ” denotes the Kronecker product operator; ${{{\bf{F}}_{N}}}$ and ${{{\bf{I}}_{M}}}$ denote the discrete Fourier transform (DFT) matrix of size $N\times N$ and the identity matrix of size $M\times M$ , respectively; $Q(\cdot)$ denotes the tail distribution function of the standard normal distribution, $\delta(\cdot)$ denotes the Dirac delta function, and ${I_{0}}\left({\cdot}\right)$ denotes the zero-order modified Bessel function of the first kind; $P(\cdot)$ denotes the probability and $p(\cdot)$ denotes the probability density function (PDF); $f(\cdot)$ denotes an arbitrary function; ${\left({f(\cdot)}\right)_{\max}}$ and ${\left({f(\cdot)}\right)_{\min}}$ denote the maximum and minimum values of function $f(\cdot)$ , respectively.

II OTFS System Model

In this section, we first review the OTFS concept and then introduce the considered system model.

II-A Coded OTFS System Model

Figure 1: The block diagram of an OTFS system.

Without loss of generality, we consider a coded OTFS system as shown in Fig. 1. Let $M$ be the number of sub-carriers and $N$ be the number of time slots for each OTFS symbol, respectively. An information sequence $\bf{u}$ is channel-encoded and then modulated into ${\bf{x}}\in{{\mathbb{A}}^{MN}}$ with length $MN$ . Let us arrange ${\bf{x}}$ into a 2D matrix ${\bf{X}}\in{{\mathbb{A}}^{M\times N}}$ , i.e., ${\bf{x}}\buildrel\Delta\over{=}\textrm{vec}\left({\bf{X}}\right)$ , representing the symbols in the DD domain, whose $(k,l)$ -th element $x\left[{k,l}\right]$ is the modulated signal in the $k$ -th Doppler and $l$ -th delay grid [6], for $0\leq k\leq N-1,0\leq l\leq M-1$ . Each transmitted symbol in the TF domain $X\left[{n,m}\right],0\leq n\leq N-1,0\leq m\leq M-1$ , is then obtained based on ${\bf{X}}$ via the inverse symplectic finite Fourier transform (ISFFT) as follows [6]

X\left[{n,m}\right]=\frac{1}{{\sqrt{NM}}}\sum\limits_{k=0}^{N-1}{\sum\limits_{l=0}^{M-1}{x\left[{k,l}\right]}}{e^{j2\pi\left({\frac{{nk}}{N}-\frac{{ml}}{M}}\right)}}.

(1)
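As an illustration, the ISFFT in (1) can be sketched with NumPy FFTs. This is a minimal sketch under stated assumptions: the DD domain symbols are stored as an array `x_dd[k, l]` of shape $(N, M)$, the function names `isfft`, `sfft`, and `isfft_direct` are ours, and `sfft` is simply the noise-free inverse of (1).

```python
import numpy as np

def isfft(x_dd):
    """DD -> TF following (1); x_dd[k, l] has shape (N, M) (layout is an assumption)."""
    # Sum over k with e^{+j2*pi*nk/N}: unitary inverse FFT along the Doppler axis.
    # Sum over l with e^{-j2*pi*ml/M}: unitary forward FFT along the delay axis.
    return np.fft.fft(np.fft.ifft(x_dd, axis=0, norm="ortho"), axis=1, norm="ortho")

def sfft(X_tf):
    """TF -> DD, the noise-free inverse of isfft."""
    return np.fft.ifft(np.fft.fft(X_tf, axis=0, norm="ortho"), axis=1, norm="ortho")

def isfft_direct(x_dd):
    """Literal double sum of (1), used only as a cross-check."""
    N, M = x_dd.shape
    n = np.arange(N)[:, None, None, None]
    m = np.arange(M)[None, :, None, None]
    k = np.arange(N)[None, None, :, None]
    l = np.arange(M)[None, None, None, :]
    kernel = np.exp(2j * np.pi * (n * k / N - m * l / M))
    return (kernel * x_dd[None, None, :, :]).sum(axis=(2, 3)) / np.sqrt(N * M)

# Hypothetical small frame with random QPSK symbols.
N, M = 4, 8
rng = np.random.default_rng(0)
x_dd = (rng.choice([-1.0, 1.0], (N, M)) + 1j * rng.choice([-1.0, 1.0], (N, M))) / np.sqrt(2)
X_tf = isfft(x_dd)
print("max |FFT-based - direct| =", np.abs(X_tf - isfft_direct(x_dd)).max())
```

Because both FFTs are unitary, the transform preserves the frame energy, which is consistent with the $1/\sqrt{NM}$ normalization in (1).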

A brief diagram regarding the DD and TF domain transformation is shown in Fig. 2, where $\Delta f$ is the frequency spacing between adjacent sub-carriers and $T=1/\Delta f$ is the corresponding TF domain time slot duration. Therefore, each OTFS frame occupies a total bandwidth of $M\Delta f$ and a duration of $NT$ . In particular, the sampling time $1/(M\Delta f)$ and sampling frequency $1/(NT)$ are referred to as the delay resolution and the Doppler resolution of the DD grid, respectively [10], which indicate how precise the acquisition of the channel delay and Doppler can be for the underlying OTFS system. After the domain transformation (ISFFT), the TF domain transmitted symbol $X\left[{n,m}\right]$ is then modulated via a conventional OFDM modulator. The time domain OTFS signal $s\left(t\right)$ is written as

s\left(t\right)=\sum\limits_{n=0}^{N-1}{\sum\limits_{m=0}^{M-1}{X\left[{n,m}\right]{g_{{\rm{tx}}}}\left({t-nT}\right){e^{j2\pi m\Delta f\left({t-nT}\right)}}}},

(2)

where $g_{{\rm{tx}}}(t)$ denotes the pulse shaping filter. As described above, for OFDM-based OTFS implementation, the OTFS modulator can be viewed as a concatenation of a precoder (ISFFT) and a conventional OFDM modulator [11], where the OFDM modulator consists of an inverse FFT (IFFT) block and a pulse shaping filter $g_{{\rm{tx}}}(t)$ . Similar to [6], we consider the DD domain representation of the time-varying channel, where the channel impulse response is given by

h\left({\tau,\nu}\right)=\sum\limits_{i=1}^{P}{{h_{i}}\delta\left({\tau-{\tau_{i}}}\right)\delta\left({\nu-{\nu_{i}}}\right)}.

(3)

In (3), $P$ is the number of independent resolvable paths while $h_{i}$ , $\tau_{i}$ , and $\nu_{i}$ are the channel coefficients, delay shifts, and Doppler shifts corresponding to the $i$ -th path, respectively. Let $\bar{w}\left(t\right)$ be the additive white Gaussian noise process with one-sided power spectral density (PSD) $N_{0}$ . The received signal can be written as

r\left(t\right)=\int{\int{h\left({\tau,\nu}\right)s\left({t-\tau}\right)}}{e^{j2\pi\nu\left({t-\tau}\right)}}d\tau d\nu+\bar{w}\left(t\right).

(4)

Let $g_{{\rm{rx}}}(t)$ be the filter adopted at the receiver side. The received symbols $Y\left[{n,m}\right]$ in the TF domain are then obtained by

Y\left[{n,m}\right]=\int{r\left(t\right)g_{{\rm{rx}}}^{*}\left({t-nT}\right){e^{-j2\pi m\Delta f\left({t-nT}\right)}}}dt.

(5)

Substituting (4) into (5) and after similar manipulations as in [10], (5) can be simplified as

Y\left[{n,m}\right]=\sum\limits_{n^{\prime}=0}^{N-1}{\sum\limits_{m^{\prime}=0}^{M-1}{{H_{n,m}}\left[{n^{\prime},m^{\prime}}\right]}}X\left[{n^{\prime},m^{\prime}}\right]+\bar{w}\left[{n,m}\right],

(6)

where $\bar{w}\left[{n,m}\right]$ is the corresponding TF domain noise sample and ${H_{n,m}}\left[{n^{\prime},m^{\prime}}\right]$ is the TF domain channel response, given by

{H_{n,m}}\left[{n^{\prime},m^{\prime}}\right]=\int{\int{h\left({\tau,\nu}\right){A_{{g_{{\rm{tx}}}},{g_{{\rm{rx}}}}}}\Big({\left({n-n^{\prime}}\right)T-\tau,\left({m-m^{\prime}}\right)\Delta f-\nu}\Big)}}{e^{j2\pi\left({\nu+m^{\prime}\Delta f}\right)\left({\left({n-n^{\prime}}\right)T-\tau}\right)}}{e^{j2\pi\nu n^{\prime}T}}d\tau d\nu.

(7)

In (7), the function ${A_{{g_{{\rm{tx}}}},{g_{{\rm{rx}}}}}}\left({{\tau_{\Delta}},{\nu_{\Delta}}}\right)$ is the so-called cross-ambiguity function, which indicates the interference level between the TF domain symbols due to the channel dispersion and is given by [10]

{A_{{g_{{\rm{tx}}}},{g_{{\rm{rx}}}}}}\left({{\tau_{\Delta}},{\nu_{\Delta}}}\right)\buildrel\Delta\over{=}\int{{g_{{\rm{tx}}}}\left(t\right)g_{{\rm{rx}}}^{*}}\left({t-{\tau_{\Delta}}}\right){e^{j2\pi{\nu_{\Delta}}\Delta ft}}dt.

(8)

Hence, the received symbols $y\left[{k,l}\right]$ in the DD domain are obtained by performing the SFFT on the TF domain received symbols $Y\left[{n,m}\right]$ which are written as

y\left[{k,l}\right]=\frac{1}{{\sqrt{NM}}}\sum\limits_{n=0}^{N-1}{\sum\limits_{m=0}^{M-1}{Y\left[{n,m}\right]{e^{-j2\pi\left({\frac{{nk}}{N}-\frac{{ml}}{M}}\right)}}}}+w\left[{k,l}\right],

(9)

where $w\left[{k,l}\right]$ denotes the equivalent AWGN samples in the DD domain. Specifically, the DD domain received symbols $y\left[{k,l}\right]$ can be arranged into the 2D received symbol matrix ${\bf{Y}}$ according to the DD grid, whose $(k,l)$ -th element is $y\left[{k,l}\right]$ . For ease of presentation and analysis, we consider the vector form representation of the input-output relationship of the OTFS system in the DD domain based on (9) in the sequel.

II-B Vector Form Representation of OTFS

Let ${\bf{x}}\buildrel\Delta\over{=}\textrm{vec}\left({\bf{X}}\right)\in{{\mathbb{A}}^{MN}}$ and ${\bf{y}}\buildrel\Delta\over{=}\textrm{vec}\left({\bf{Y}}\right)\in{{\mathbb{C}}^{MN}}$ denote the vector forms of the transmitted symbols ${\bf{X}}$ and the (complex-valued) received symbols ${\bf{Y}}$ in the DD domain, respectively. According to (9), we have

{\bf{y}}={{\bf{H}}_{{\rm{eff}}}}{\bf{x}}+{\bf{w}},

(10)

where $\bf{w}$ is the corresponding noise vector and ${{\bf{H}}_{{\rm{eff}}}}$ of size $MN\times MN$ is the effective channel matrix in the DD domain. Assuming that both $g_{{\rm{tx}}}(t)$ and $g_{{\rm{rx}}}(t)$ are rectangular pulses, with a reduced CP frame format, the effective channel matrix ${{\bf{H}}_{{\rm{eff}}}}$ is given by [20]

{{\bf{H}}_{{\rm{eff}}}}=\sum\limits_{i=1}^{P}{{h_{i}}}\left({{{\bf{F}}_{N}}\otimes{{\bf{I}}_{M}}}\right){{\bm{\Pi}}^{{l_{i}}}}{{\bm{\Delta}}^{{k_{i}}}}\left({{\bf{F}}_{N}^{\rm{H}}\otimes{{\bf{I}}_{M}}}\right),

(11)

where ${\bm{\Pi}}$ is the permutation matrix (forward cyclic shift), i.e.,

{\bm{\Pi}}={\left[{\begin{array}[]{*{20}{c}}0&\cdots&0&1\\ 1&\ddots&0&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\cdots&1&0\end{array}}\right]_{MN\times MN}},

(12)

and ${\bm{\Delta}}=\textrm{diag}\{z^{0},z^{1},...,z^{MN-1}\}$ is a diagonal matrix with $z\buildrel\Delta\over{=}{e^{\frac{{j2\pi}}{{MN}}}}$ [20]. In (11), $l_{i}$ and $k_{i}$ are the delay and Doppler indices corresponding to the $i$ -th path, respectively, and we have

{\tau_{i}}=\frac{{l_{i}}}{{M\Delta f}},\quad{\nu_{i}}=\frac{{{k_{i}}+{\kappa_{i}}}}{{NT}}.

(13)
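To make the structure of (11) and (12) concrete, the following minimal sketch builds ${{\bf{H}}_{{\rm{eff}}}}$ for integer delay and Doppler indices. The helper names `xi` and `h_eff`, the frame size, the path gains, and the path indices are illustrative assumptions; ${\bf{F}}_{N}$ is taken to be the unitary (normalized) DFT matrix.

```python
import numpy as np

def xi(N, M, l, k):
    """(F_N kron I_M) Pi^l Delta^k (F_N^H kron I_M) for integer indices l >= 0 and k."""
    MN = M * N
    F = np.kron(np.fft.fft(np.eye(N), norm="ortho"), np.eye(M))     # unitary DFT, kron I_M
    Pi = np.roll(np.eye(MN), 1, axis=0)                             # forward cyclic shift, (12)
    Delta_k = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))  # Delta^k with z = e^{j2pi/MN}
    return F @ np.linalg.matrix_power(Pi, l) @ Delta_k @ F.conj().T

def h_eff(N, M, h, delays, dopplers):
    """Effective DD domain channel matrix, a direct transcription of (11)."""
    return sum(h_i * xi(N, M, l_i, k_i) for h_i, l_i, k_i in zip(h, delays, dopplers))

# Hypothetical three-path channel on a small 4 x 4 grid.
N, M = 4, 4
h = np.array([0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.4j])
delays, dopplers = [0, 1, 2], [1, -1, 0]
H = h_eff(N, M, h, delays, dopplers)

# With integer indices, each path contributes one entry per row.
nonzeros_per_row = (np.abs(H) > 1e-9).sum(axis=1)
print("non-zero entries per row:", nonzeros_per_row)
```

In this integer-index example each row of the matrix has only as many non-zero entries as there are paths, which illustrates the sparsity of the DD domain channel mentioned in the Introduction.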

Note that the term $-1/2\leq{\kappa_{i}}\leq 1/2$ denotes the fractional Doppler, which corresponds to the fractional shift from the nearest Doppler index [10]. On the other hand, since the sampling time $1/(M\Delta f)$ in the delay domain is usually sufficiently small, the impact of fractional delays in typical wide-band systems can be neglected [8]. It should be noted that (11) still holds even for the fractional Doppler case, i.e., ${\kappa_{i}}\neq 0$ , due to the properties of the diagonal matrix $\bm{\Delta}$ . For simplicity, we only consider the integer Doppler case in this paper. To facilitate the following analysis, we assume that the delay and Doppler indices $l_{i}$ and $k_{i}$ follow the discrete uniform distribution. Specifically, we have $l_{i}\in\left[{0,{l_{\max}}}\right]$ , where $l_{\max}$ is the maximum delay index. We also have $k_{i}\in\left[{-{k_{\max}},{k_{\max}}}\right]$ , where $k_{\max}$ is the maximum Doppler index. (Footnote 1: The maximum Doppler shift is given by ${\nu_{\max}}=\frac{v}{c}{f_{c}}$ , where $v$ is the relative user equipment (UE) speed, $c$ is the speed of light, and $f_{c}$ is the carrier frequency. Thus, the maximum Doppler index is given by ${k_{\max}}={\nu_{\max}}NT$ [10]. The maximum delay shift is given by ${\tau_{\max}}=\frac{d_{\rm{max}}}{c}$ , where $d_{\rm{max}}$ is the maximum distance among the $P$ channel paths.) In particular, let ${{\bm{\omega}}_{\tau}}$ and ${{\bm{\omega}}_{\nu}}$ denote the vectors of delay indices and Doppler indices, respectively, i.e., ${{\bm{\omega}}_{\tau}}=\left[{{l_{1}},{l_{2}},\ldots,{l_{P}}}\right]^{\rm{T}}$ and ${{\bm{\omega}}_{\nu}}=\left[{{k_{1}},{k_{2}},\ldots,{k_{P}}}\right]^{\rm{T}}$ . Therefore, according to [13], (10) can be rewritten as

\displaystyle{\bf{y}}={\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{x}}\right){\bf{h}}+{\bf{w}},

(14)

where ${\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{x}}\right)$ is referred to as the equivalent codeword matrix and it is a concatenated matrix of size $MN\times P$ constructed by the column vector ${{\bf{\Xi}}_{i}}{\bf{x}}$ , i.e.,

{\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{x}}\right)=\left[{{{\bf{\Xi}}_{1}}{\bf{x}}\quad{{\bf{\Xi}}_{2}}{\bf{x}}\quad\cdots\quad{{\bf{\Xi}}_{P}}{\bf{x}}}\right],

(15)

and ${{\bf{\Xi}}_{i}}$ is given by

{{\bf{\Xi}}_{i}}\buildrel\Delta\over{=}\left({{{\bf{F}}_{N}}\otimes{{\bf{I}}_{M}}}\right){{\bm{\Pi}}^{{l_{i}}}}{{\bm{\Delta}}^{{k_{i}}}}\left({{\bf{F}}_{N}^{\rm{H}}\otimes{{\bf{I}}_{M}}}\right),1\leq i\leq P.

(16)
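The equivalence between the two input-output forms (10) and (14) can be checked numerically in the noise-free case. The sketch below is illustrative only: the helper names `xi` and `phi`, the frame size, the path parameters, and the random QPSK codeword are assumptions, and ${\bf{F}}_{N}$ is taken to be the unitary DFT matrix.

```python
import numpy as np

def xi(N, M, l, k):
    """Xi_i in (16) for integer indices l >= 0 and k."""
    MN = M * N
    F = np.kron(np.fft.fft(np.eye(N), norm="ortho"), np.eye(M))
    Pi = np.roll(np.eye(MN), 1, axis=0)
    Delta_k = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))
    return F @ np.linalg.matrix_power(Pi, l) @ Delta_k @ F.conj().T

def phi(N, M, x, delays, dopplers):
    """Equivalent codeword matrix of (15): column i is Xi_i x."""
    return np.column_stack([xi(N, M, l, k) @ x for l, k in zip(delays, dopplers)])

# Hypothetical setup: 4 x 4 grid, three paths, random QPSK codeword.
N, M = 4, 4
delays, dopplers = [0, 1, 2], [1, -1, 0]
h = np.array([0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.4j])
rng = np.random.default_rng(0)
x = (rng.choice([-1.0, 1.0], M * N) + 1j * rng.choice([-1.0, 1.0], M * N)) / np.sqrt(2)

H_eff = sum(h_i * xi(N, M, l, k) for h_i, l, k in zip(h, delays, dopplers))
Phi = phi(N, M, x, delays, dopplers)

# Noise-free outputs of (10) and (14) should coincide.
print("max |H_eff x - Phi(x) h| =", np.abs(H_eff @ x - Phi @ h).max())
```

The rewriting moves the codeword into the matrix and the channel into the vector, which is what allows the path coefficients to be averaged out in the subsequent PEP analysis.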

In (14), $\bf{h}$ is the path coefficient vector of size $P\times 1$ , i.e., ${\bf{h}}={\left[{{h_{1}},{h_{2}},...,{h_{P}}}\right]^{\rm{T}}}$ , where the elements in ${\bf{h}}$ are assumed to be independent and identically distributed complex Gaussian random variables. Besides, we assume that $h_{i}$ has the mean $\mu$ and the variance $1/(2P)$ per real dimension, for $1\leq i\leq P$ , i.e., we assume a uniform power delay profile of the channel. In particular, we note that if $\mu=0$ , $\left|{{h_{i}}}\right|$ follows the Rayleigh distribution, which will be considered as a special case in our error performance analysis and code design. Based on (14), the error performance analysis of the OTFS systems will be conducted in the next section.

III Error Performance Analysis

In order to investigate the error performance of the coded OTFS systems, we assume that ideal channel state information (CSI) is available at the receiver, including $\bf{h}$ , ${{\bm{\omega}}_{\tau}}$ , and ${{\bm{\omega}}_{\nu}}$ [9, 21]. We note that matrix ${\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{x}}\right)$ depends on ${{\bm{\omega}}_{\tau}}$ , ${{\bm{\omega}}_{\nu}}$ , and the transmitted symbol vector ${\bf{x}}$ . Therefore, for a given channel realization, we define the conditional Euclidean distance ${d_{\bf{h},{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}^{2}\left({{\bf{x}},{\bf{x^{\prime}}}}\right)}$ between a pair of codewords ${\bf{x}}$ and ${\bf{x^{\prime}}}$ ( ${\bf{x}}\neq{\bf{x^{\prime}}}$ ) as

d_{\bf{h},{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}^{2}\left({{\bf{x}},{\bf{x^{\prime}}}}\right)=d_{\bf{h},{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}^{2}\left({\bf{e}}\right)\buildrel\Delta\over{=}{\left\|{{{\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right)}{\bf{h}}}\right\|^{2}}={{\bf{h}}^{\rm{H}}}{\bf{\Omega}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right){\bf{h}},

(17)

where ${\bf{e}}={\bf{x}}-{\bf{x^{\prime}}}$ is the corresponding codeword difference (error) sequence and ${\bf{\Omega}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right)={\left({{\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right)}\right)^{\rm{H}}}\left({\bf{\Phi}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right)\right)$ is referred to as the codeword difference matrix. Without loss of generality and for notational simplicity, we henceforth drop the subscript of ${\bf{\Omega}}_{{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}\left({\bf{e}}\right)$ and we now have

{\bf{\Omega}}\left({\bf{e}}\right)=\left[{\begin{array}[]{*{20}{c}}{{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{1}^{\rm{H}}{{\bf{\Xi}}_{1}}{\bf{e}}}&{{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{1}^{\rm{H}}{{\bf{\Xi}}_{2}}{\bf{e}}}&\cdots&{{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{1}^{\rm{H}}{{\bf{\Xi}}_{P}}{\bf{e}}}\\ {{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{2}^{\rm{H}}{{\bf{\Xi}}_{1}}{\bf{e}}}&{{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{2}^{\rm{H}}{{\bf{\Xi}}_{2}}{\bf{e}}}&{}\hfil&\vdots\\ \vdots&{}\hfil&\ddots&\vdots\\ {{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{P}^{\rm{H}}{{\bf{\Xi}}_{1}}{\bf{e}}}&\cdots&\cdots&{{{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{P}^{\rm{H}}{{\bf{\Xi}}_{P}}{\bf{e}}}\end{array}}\right].

(18)
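The definitions in (17) and (18) can be exercised numerically as follows. This is a minimal sketch, assuming integer path indices, a unitary DFT matrix, and a hypothetical pair of random QPSK codewords; the helper names `xi` and `phi` are ours.

```python
import numpy as np

def xi(N, M, l, k):
    """Xi_i in (16) for integer indices l >= 0 and k."""
    MN = M * N
    F = np.kron(np.fft.fft(np.eye(N), norm="ortho"), np.eye(M))
    Pi = np.roll(np.eye(MN), 1, axis=0)
    Delta_k = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))
    return F @ np.linalg.matrix_power(Pi, l) @ Delta_k @ F.conj().T

def phi(N, M, v, delays, dopplers):
    """Equivalent codeword matrix of (15) evaluated at the vector v."""
    return np.column_stack([xi(N, M, l, k) @ v for l, k in zip(delays, dopplers)])

# Hypothetical setup: 4 x 4 grid, three paths, two random QPSK codewords.
N, M = 4, 4
delays, dopplers = [0, 1, 2], [1, -1, 0]
h = np.array([0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.4j])
rng = np.random.default_rng(1)
qpsk = lambda: (rng.choice([-1.0, 1.0], M * N) + 1j * rng.choice([-1.0, 1.0], M * N)) / np.sqrt(2)
x, x_prime = qpsk(), qpsk()

e = x - x_prime                          # codeword difference (error) sequence
Phi_e = phi(N, M, e, delays, dopplers)
Omega = Phi_e.conj().T @ Phi_e           # codeword difference matrix, (18)

d2_norm = np.linalg.norm(Phi_e @ h) ** 2   # ||Phi(e) h||^2
d2_quad = np.vdot(h, Omega @ h).real       # h^H Omega(e) h
print("conditional squared distance:", d2_norm, d2_quad)
```

Both expressions in (17) give the same value, and the resulting matrix is Hermitian with non-negative eigenvalues, as expected of a Gram matrix.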

The conditional PEP is upper-bounded by [17, 18]

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{\bf{h}},{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq\exp\left({-\frac{{{E_{s}}}}{{4{N_{0}}}}{d_{\bf{h},{{\bm{\omega}}_{\tau}},{{\bm{\omega}}_{\nu}}}^{2}\left({{\bf{x}},{\bf{x^{\prime}}}}\right)}}\right),

(19)

where $E_{s}$ is the average symbol energy. Note that the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ is nonnegative definite Hermitian. Let us denote by $\left\{{{{\bf{v}}_{1}},{{\bf{v}}_{2}},...,{{\bf{v}}_{P}}}\right\}$ the eigenvectors of ${\bf{\Omega}}\left({\bf{e}}\right)$ and $\left\{{{\lambda_{1}},{\lambda_{2}},...,{\lambda_{P}}}\right\}$ the corresponding nonnegative real eigenvalues sorted in the descending order. Thus, (19) can be further expanded as [17, 18]

P\left({{\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{\bf{h}},{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}}\right)\leq\exp\left({-\frac{{{E_{s}}}}{{4{N_{0}}}}\sum\limits_{i=1}^{r}{{\lambda_{i}}{{\left|{{{\tilde{h}}_{i}}}\right|}^{2}}}}\right),

(20)

where $r$ is the rank of ${\bf{\Omega}}\left({\bf{e}}\right)$ , i.e., $r\leq P$ , and ${{\tilde{h}}_{i}}={\bf{h}}\cdot{{\bf{v}}_{i}}$ , for $1\leq i\leq r$ . It can be shown that $\left\{{{{\tilde{h}}_{1}},{{\tilde{h}}_{2}},...,{{\tilde{h}}_{r}}}\right\}$ are independent complex Gaussian random variables with mean ${\mu_{{{\tilde{h}}_{i}}}}=\mathbb{E}\left[{\bf{h}}\right]\cdot{{\bf{v}}_{i}}$ and variance $1/(2P)$ per real dimension. Thus, it is obvious that ${\small|{{{\tilde{h}}_{i}}}\small|}$ follows the Rician distribution with Rician factor ${K_{i}}={\left|{{\mu_{{{\tilde{h}}_{i}}}}}\right|^{2}}$ [17], and its PDF is given by

p\left({\left|{{{\tilde{h}}_{i}}}\right|}\right)=2P\left|{{{\tilde{h}}_{i}}}\right|\exp\left({-P{{\left|{{{\tilde{h}}_{i}}}\right|}^{2}}-P{K_{i}}}\right){I_{0}}\left({2P\left|{{{\tilde{h}}_{i}}}\right|\sqrt{{K_{i}}}}\right).

(21)
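As a sanity check on (21), the sketch below integrates the PDF numerically and compares its second moment with $K_{i}+1/P$, the value expected for a complex Gaussian variable with mean-square $K_{i}$ and variance $1/(2P)$ per real dimension. The values of $P$ and $K_{i}$ and the function name `rician_pdf` are illustrative assumptions.

```python
import numpy as np

def rician_pdf(a, P, K):
    """PDF of |h_tilde_i| in (21); np.i0 is the zero-order modified Bessel function."""
    return 2 * P * a * np.exp(-P * a**2 - P * K) * np.i0(2 * P * a * np.sqrt(K))

P, K = 4, 0.5                       # hypothetical number of paths and Rician factor
a = np.linspace(0.0, 6.0, 600001)   # the PDF is negligible beyond a = 6 for these values
da = a[1] - a[0]
p = rician_pdf(a, P, K)

total_probability = np.sum(p) * da       # should be close to 1
second_moment = np.sum(a**2 * p) * da    # should be close to K + 1/P
print(total_probability, second_moment, K + 1 / P)
```

Setting `K = 0` in this sketch gives the Rayleigh special case, for which the second moment reduces to $1/P$.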

In the following, we focus on the analysis of the unconditional PEP. To this end, we aim to calculate the average of (20) over the channel distribution according to (21), and consider the impact of the distributions of delay and Doppler indices. More specifically, we will discuss two important cases depending on the number of independent resolvable paths $P$ .

Remarks: It has been defined in the previous works [13, 14, 15] that the rank of ${\bf{\Omega}}\left({\bf{e}}\right)$ is the diversity gain of the OTFS modulation. Specifically, it has been shown in [15] that the diversity gain of uncoded OTFS modulation systems can be one but the full diversity can be obtained by suitable precoding schemes. Furthermore, [13] has shown that the full diversity can be achieved almost surely when the frame size is sufficiently large, even for uncoded OTFS modulation systems. Therefore, in the following, we will mainly focus on the analysis of the coding gain when the OTFS modulation achieves the full diversity, i.e., $r=P$ .
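The rank of ${\bf{\Omega}}\left({\bf{e}}\right)$ and the eigen-expansion used in (20) can be examined with a small numerical example. This is a sketch under stated assumptions: integer path indices, a unitary DFT matrix, a hypothetical error sequence with two non-zero entries, and ${{\tilde{h}}_{i}}$ computed as the inner product ${\bf{v}}_{i}^{\rm{H}}{\bf{h}}$ (our reading of the dot product in the text).

```python
import numpy as np

def xi(N, M, l, k):
    """Xi_i in (16) for integer indices l >= 0 and k."""
    MN = M * N
    F = np.kron(np.fft.fft(np.eye(N), norm="ortho"), np.eye(M))
    Pi = np.roll(np.eye(MN), 1, axis=0)
    Delta_k = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))
    return F @ np.linalg.matrix_power(Pi, l) @ Delta_k @ F.conj().T

# Hypothetical setup: 4 x 4 grid, three paths, error sequence with two non-zero entries.
N, M, P = 4, 4, 3
delays, dopplers = [0, 1, 2], [1, -1, 0]
h = np.array([0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.4j])
e = np.zeros(M * N, dtype=complex)
e[0], e[5] = np.sqrt(2), 1j * np.sqrt(2)

Phi_e = np.column_stack([xi(N, M, l, k) @ e for l, k in zip(delays, dopplers)])
Omega = Phi_e.conj().T @ Phi_e

lam, V = np.linalg.eigh(Omega)            # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]             # sort in descending order, as in the text
lam, V = lam[order], V[:, order]
r = np.linalg.matrix_rank(Omega)

h_tilde = V.conj().T @ h                  # projections of h onto the eigenvectors
quad_form = np.vdot(h, Omega @ h).real    # h^H Omega(e) h from (17)
expansion = np.sum(lam[:r] * np.abs(h_tilde[:r]) ** 2)  # exponent sum in (20)
print("rank:", r, "eigenvalues:", lam)
```

For this particular example the matrix has full rank $r=P$, which is the case the remainder of the analysis concentrates on.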

III-A Error Performance Analysis for Coded OTFS systems

Notice that $\bf{h}$ , ${{\bm{\omega}}_{\tau}}$ , and ${{{\bm{\omega}}_{\nu}}}$ are mutually independent. Therefore, the unconditional PEP can be derived by first averaging (20) over ${|{{{\tilde{h}}_{i}}}|}$ term by term, which results in

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq\prod\limits_{i=1}^{P}{\frac{1}{{1+\frac{{{E_{s}}}}{{4{N_{0}}}}\cdot\frac{{{\lambda_{i}}}}{{P}}}}\exp\left({-\frac{{{K_{i}}\frac{{{E_{s}}}}{{4{N_{0}}}}\cdot\frac{{{\lambda_{i}}}}{{P}}}}{{1+\frac{{{E_{s}}}}{{4{N_{0}}}}\cdot\frac{{{\lambda_{i}}}}{{P}}}}}\right)}.

(22)

Furthermore, we consider the special case where ${{K_{i}}}=0$ and ${\small|{{{\tilde{h}}_{i}}}\small|}$ follows the Rayleigh distribution, i.e., ${\small|{{{h}_{i}}}\small|}$ also follows the Rayleigh distribution. In this Rayleigh fading case, by using $1/(1+a)\leq 1/a$ for $a>0$ (which is tight at high signal-to-noise ratios), (22) can be further simplified as

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq{\left({\prod\limits_{i=1}^{P}{{\lambda_{i}}/P}}\right)^{-1}}{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)^{-P}}.

(23)

By noticing that the term $\prod\nolimits_{i=1}^{P}{{\lambda_{i}}}$ equals the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ , (23) can be written as

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq\frac{1}{{\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}}{\left({\frac{{{E_{s}}}}{{4{N_{0}}P}}}\right)^{-P}}.

(24)
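The chain of bounds (22) to (24) for the Rayleigh case can be compared numerically. The sketch below is illustrative only: the frame size, path indices, error sequence, and the value of $E_{s}/N_{0}$ are assumptions, and the Monte Carlo average is taken over the exponential bound in (19), not over the exact PEP.

```python
import numpy as np

def xi(N, M, l, k):
    """Xi_i in (16) for integer indices l >= 0 and k."""
    MN = M * N
    F = np.kron(np.fft.fft(np.eye(N), norm="ortho"), np.eye(M))
    Pi = np.roll(np.eye(MN), 1, axis=0)
    Delta_k = np.diag(np.exp(2j * np.pi * k * np.arange(MN) / MN))
    return F @ np.linalg.matrix_power(Pi, l) @ Delta_k @ F.conj().T

# Hypothetical setup: 4 x 4 grid, three paths, error sequence with two non-zero entries.
N, M, P = 4, 4, 3
delays, dopplers = [0, 1, 2], [1, -1, 0]
e = np.zeros(M * N, dtype=complex)
e[0], e[5] = np.sqrt(2), 1j * np.sqrt(2)
Phi_e = np.column_stack([xi(N, M, l, k) @ e for l, k in zip(delays, dopplers)])
Omega = Phi_e.conj().T @ Phi_e
lam = np.linalg.eigvalsh(Omega)

g = 10.0 / 4.0                                   # Es / (4 N0) with Es/N0 = 10 (assumed)
b22 = np.prod(1.0 / (1.0 + g * lam / P))         # (22) with K_i = 0
b23 = 1.0 / np.prod(lam / P) * g ** (-P)         # (23)
b24 = (g / P) ** (-P) / np.linalg.det(Omega).real  # (24)

# Monte Carlo average of exp(-g h^H Omega h) over Rayleigh paths, variance 1/(2P) per dimension.
rng = np.random.default_rng(1)
n = 200_000
h = (rng.standard_normal((n, P)) + 1j * rng.standard_normal((n, P))) * np.sqrt(1 / (2 * P))
quad = np.einsum("np,pq,nq->n", h.conj(), Omega, h).real
mc = np.mean(np.exp(-g * quad))
print("(22):", b22, " Monte Carlo:", mc, " (23):", b23, " (24):", b24)
```

In this example the Monte Carlo average agrees with (22), while (23) and (24) coincide and are looser, as expected from dropping the constant 1 in each denominator.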

It should be noted that (24) is consistent with the analysis in [14]. On the other hand, the PEP in (24) depends on the delay and Doppler indices ${\bm{\omega}}_{\tau}$ and ${\bm{\omega}}_{\nu}$ . In order to derive the unconditional PEP, we need to find the statistical distribution for the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ regarding the delay and Doppler indices. Unfortunately, such a task is generally intractable [19] and is normally handled by applying the Monte Carlo method without providing any important insight. Instead of resorting to the Monte Carlo method, we apply proper bounding techniques to evaluate the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ in order to obtain some general results about the unconditional PEP. In particular, we assume that ${\bf{\Omega}}\left({\bf{e}}\right)$ is of full-rank for any channel parameters ${{\bm{\omega}}_{\tau}}$ and ${{{\bm{\omega}}_{\nu}}}$ and error sequence [13, 15], in which case we have the following property.

Property 1 (Gram matrix [22]): Let ${\bar{\bf{u}}_{i}}\buildrel\Delta\over{=}{{\bf{\Xi}}_{i}}{\bf{e}}$ , for $1\leq i\leq P$ , where ${\bf{\Xi}}_{i}$ is given by (16). Then, $\left\{{{\bar{\bf{u}}_{1}},{\bar{\bf{u}}_{2}},\ldots{\bar{\bf{u}}_{P}}}\right\}$ form a list of independent vectors chosen from the $P$ -dimensional complex inner-product subspace $\mathbb{H}^{P}$ . Thus, the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ is positive definite Hermitian and it is a Gram matrix corresponding to the vectors $\left\{{{\bar{\bf{{u}}}_{1}},{\bar{\bf{{u}}}_{2}},\ldots{\bar{\bf{{u}}}_{P}}}\right\}$ .

Based on the property of ${\bf{\Omega}}\left({\bf{e}}\right)$ , we now introduce four important lemmas, which will serve as the building blocks for our error performance analysis for coded OTFS systems.

Lemma 1 (Main diagonal elements of ${\bf{\Omega}}\left({\bf{e}}\right)$ ): The main diagonal elements of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ are of the same value ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ , where $d_{\rm{E}}^{2}\left({\bf{e}}\right)={{\bf{e}}^{\rm{H}}}{\bf{e}}$ is the squared Euclidean distance for a pair of codewords ${\bf{x}}$ and ${\bf{x^{\prime}}}$ corresponding to the error sequence $\bf{e}$ .

Proof: By considering (15), the $i$ -th diagonal element of ${\bf{\Omega}}\left({\bf{e}}\right)$ is given by ${{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{i}^{\rm{H}}{{\bf{\Xi}}_{i}}{\bf{e}}$ , and it is equal to the inner product of ${{{{\bf{\bar{u}}}}_{i}}}$ with itself. Furthermore, we have

$\displaystyle{\bf{\Xi}}_{i}^{\rm{H}}{{\bf{\Xi}}_{i}}=$	$\displaystyle\left({{{\bf{F}}_{N}}\otimes{{\bf{I}}_{M}}}\right){\left({{{\bm{\Delta}}^{{k_{i}}}}}\right)^{\rm{H}}}{\left({{{\bm{\Pi}}^{{l_{i}}}}}\right)^{\rm{H}}}{{\bm{\Pi}}^{{l_{i}}}}{{\bm{\Delta}}^{{k_{i}}}}\left({{\bf{F}}_{N}^{\rm{H}}\otimes{{\bf{I}}_{M}}}\right)$
$\displaystyle=$	$\displaystyle\left({{{\bf{F}}_{N}}\otimes{{\bf{I}}_{M}}}\right){{\bm{\Delta}}^{-{k_{i}}}}{{\bm{\Pi}}^{-{l_{i}}}}{{\bm{\Pi}}^{{l_{i}}}}{{\bm{\Delta}}^{{k_{i}}}}\left({{\bf{F}}_{N}^{\rm{H}}\otimes{{\bf{I}}_{M}}}\right)$	(25)
$\displaystyle=$	$\displaystyle{{\bf{I}}_{MN}},$	(26)

where (25) is due to the diagonal properties of ${\bm{\Pi}}$ and $\bm{\Delta}$ , respectively, and (26) is due to the property of the Kronecker product. Therefore, the term ${{\bf{e}}^{\rm{H}}}{\bf{\Xi}}_{i}^{\rm{H}}{{\bf{\Xi}}_{i}}{\bf{e}}$ is further simplified as ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ . This completes the proof of Lemma 1.
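As a small numerical illustration of Lemma 1, the sketch below checks that a phase rotation followed by a cyclic shift preserves the squared norm of an error vector. The concrete forms used here, a cyclic-shift permutation for ${\bm{\Pi}}$ and a diagonal matrix with unit-modulus entries $\exp(j2\pi n/(MN))$ for $\bm{\Delta}$ , are assumptions made for illustration only, since (16) is not restated in this section. The unitary DFT factors are omitted because they preserve the norm as well, and the function names are hypothetical.

```python
import cmath
import random

def shift_and_rotate(e, l, k):
    """Apply an assumed Delta^k (phase ramp) and then an assumed Pi^l (cyclic shift) to e."""
    n_total = len(e)
    # Delta^k: multiply entry n by exp(j*2*pi*k*n/n_total), a unit-modulus phase.
    rotated = [e[n] * cmath.exp(2j * cmath.pi * k * n / n_total) for n in range(n_total)]
    # Pi^l: cyclically shift the entries by l positions.
    return [rotated[(n - l) % n_total] for n in range(n_total)]

def squared_norm(v):
    """Squared Euclidean norm v^H v."""
    return sum(abs(x) ** 2 for x in v)

random.seed(0)
M, N = 2, 5
# BPSK error sequence: the entries of x - x' lie in {-2, 0, +2}.
e = [random.choice([-2.0, 0.0, 2.0]) for _ in range(M * N)]
d2 = squared_norm(e)
for l, k in [(0, 0), (1, -2), (2, 4)]:
    u = shift_and_rotate(e, l, k)
    # Each pair prints the same squared norm as the error sequence itself.
    print(l, k, round(squared_norm(u), 9), d2)
```

Under these assumptions, every delay and Doppler pair leaves the squared norm equal to $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ , which is the content of Lemma 1.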

Lemma 2 (Trace of ${\bf{\Omega}}\left({\bf{e}}\right)$ ): The trace of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ is ${Pd_{\rm{E}}^{2}\left({\bf{e}}\right)}$ .

Proof: This lemma is a straightforward extension of Lemma 1.

Lemma 3 (Lower bound on the trace of ${{{\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}^{-1}}}$ ): The trace of the inverse of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ is lower-bounded by

{{\rm{tr}}\left({{{\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}^{-1}}}\right)}\geq\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}},

(27)

Proof: The proof is given in Appendix A.

Lemma 4 (Lower bound on the summation of eigenvalue squares of ${{{{{\bf{\Omega}}\left({\bf{e}}\right)}}}}$ ): The summation of the squared eigenvalues of ${{{{{\bf{\Omega}}\left({\bf{e}}\right)}}}}$ is lower-bounded by

\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}\geq P{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{2}},

(28)

Proof: It is obvious that the eigenvalues $\lambda_{i}$ of the codeword difference matrix ${{\bf{\Omega}}\left({\bf{e}}\right)}$ are all positive when ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is full-rank. Therefore, we apply the Cauchy-Schwarz inequality and the following holds

\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}\geq\frac{1}{P}{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)^{2}}=\frac{1}{P}{\left({{\rm{tr}}\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}\right)^{2}}=P{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{2}},

(29)

where the equality is only achieved when the eigenvalues of ${\bf{\Omega}}\left({\bf{e}}\right)$ are the same, e.g., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ . This completes the proof of Lemma 4.
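Lemmas 2 to 4 can be checked by hand for $P=2$ , where a codeword difference matrix with equal diagonal entries $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ and off-diagonal entry $c$ has eigenvalues $d_{\rm{E}}^{2}\left({\bf{e}}\right)\pm|c|$ . The following sketch evaluates the three quantities for one illustrative choice; the numerical values of `d2` and `c` are arbitrary assumptions and are not taken from the simulations.

```python
def eig_2x2_equal_diag(d, c):
    """Eigenvalues of the Hermitian matrix [[d, c], [conj(c), d]]: d + |c| and d - |c|."""
    return (d + abs(c), d - abs(c))

P = 2
d2 = 8.0          # squared Euclidean distance (illustrative value)
c = 3.0 - 2.0j    # off-diagonal entry with |c| < d2, so the matrix is positive definite
lams = eig_2x2_equal_diag(d2, c)

trace = sum(lams)                           # Lemma 2: equals P * d2
trace_inv = sum(1.0 / lam for lam in lams)  # Lemma 3: at least P / d2
sum_sq = sum(lam ** 2 for lam in lams)      # Lemma 4: at least P * d2**2

print("trace:", trace, "vs", P * d2)
print("trace of inverse:", trace_inv, "lower bound", P / d2)
print("sum of squared eigenvalues:", sum_sq, "lower bound", P * d2 ** 2)
```

Setting `c = 0` gives the diagonal case, in which both bounds are met with equality.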

The above Lemmas show some important properties of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ . Based on these properties of ${\bf{\Omega}}\left({\bf{e}}\right)$ , we can now consider the following lower bounds of the determinant for the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ .

Theorem 1 (Lower bound on the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ ): The determinant of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ is lower-bounded by

\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)\geq{\left({d_{\rm{E}}^{2}}\left({\bf{e}}\right)\right)^{P}}\exp\left({P-d_{\rm{E}}^{2}\left({\bf{e}}\right){\rm{tr}}\left({{{\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}^{-1}}}\right)}\right),

(30)

Proof: The proof is given in Appendix B.
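The bound in (30) can likewise be evaluated in closed form for $P=2$ . The sketch below, with a hypothetical helper name and illustrative values, compares the determinant with the right-hand side of (30) as the magnitude of the off-diagonal entry grows.

```python
import math

def theorem1_terms(d2, c_abs):
    """Return (det, right-hand side of (30)) for [[d2, c], [conj(c), d2]] with |c| = c_abs."""
    P = 2
    det = d2 ** 2 - c_abs ** 2
    trace_inv = 2.0 * d2 / det  # trace of the inverse of the 2x2 matrix
    bound = d2 ** P * math.exp(P - d2 * trace_inv)
    return det, bound

d2 = 4.0
for c_abs in (0.0, 1.0, 2.0, 3.0):
    det, bound = theorem1_terms(d2, c_abs)
    # The bound is tight for c_abs = 0 and loosens as the off-diagonal entry grows.
    print(c_abs, det, round(bound, 4))
```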

It should be noted that (30) still depends on the channel parameters ${\bm{\omega}}_{\tau}$ and ${\bm{\omega}}_{\nu}$ . To obtain an unconditional lower bound, we apply an approximation to the determinant bound which is summarized in the following Theorem.

Theorem 2 (Approximated lower bound on the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ ): The determinant of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ can be approximately lower-bounded by

\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)\mathbin{\lower 1.29167pt\hbox{$\buildrel>\over{\smash{\scriptstyle\sim}\vphantom{{}_{x}}}$}}{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{P}},

(31)

where the approximation is exact if ${\bf{\Omega}}\left({\bf{e}}\right)$ is a diagonal matrix, i.e., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ .

Proof: The proof is given in Appendix C.

Note that the approximated lower bound of the determinant of ${\bf{\Omega}}\left({\bf{e}}\right)$ does not depend on the delay and Doppler indices. Furthermore, based on Theorem 2, it is not hard to notice that the determinant of the codeword difference matrix ${\bf{\Omega}}\left({\bf{e}}\right)$ can be approximated by the term ${\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{P}}$ . In particular, the approximation becomes exact if ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ , which can be interpreted as the projections of the error sequence $\bf{e}$ onto each independent resolvable path, i.e., ${\bar{\bf{u}}_{i}}$ , are orthogonal to each other. According to Theorem 2, we obtain the unconditional PEP as

P\left({{{\bf{x}},{\bf{x^{\prime}}}}}\right)\mathbin{\lower 1.29167pt\hbox{$\buildrel<\over{\smash{\scriptstyle\sim}\vphantom{{}_{x}}}$}}{\left({\frac{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{P}}\right)^{-P}}{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)^{-P}}.

(32)

Based on (32), we notice that the unconditional PEP for OTFS modulation depends on ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ and the number of independent resolvable paths $P$ . In particular, the term $P$ in the denominator can be interpreted as the energy averaging with respect to the number of independent paths, while the term ${\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{-P}}$ indicates the potential improvement of the error performance introduced by channel coding. Furthermore, according to (32), we refer to the exponent of the signal-to-noise ratio (SNR) as the diversity gain, which dominates the exponential behaviour of the error performance for OTFS systems against the average SNR. On the other hand, the term ${{d_{\rm{E}}^{2}\left({\bf{e}}\right)}\mathord{\left/{\vphantom{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}P}}\right.\kern-1.2pt}P}$ is referred to as the coding gain, which characterizes the approximate improvement of coded OTFS systems over the uncoded counterpart with the same diversity gain, i.e., the same exponent $-P$ [17]. It is interesting to see from (32) that there exists a fundamental trade-off between the diversity gain and the coding gain, since $P$ appears both as the exponent and in the denominator of the coding gain term. This trade-off is formally stated in the following.

Corollary 1 (Trade-off between diversity and coding gain): For a given channel code, the diversity gain of OTFS systems improves with the number of independent resolvable paths $P$ , while the coding gain declines.

Based on Corollary 1, we note that when $P$ is small, the diversity gain is small. In this case, the squared Euclidean distance between codewords is crucial for OTFS systems as an optimized code can greatly improve the error performance. On the other hand, when $P$ is large, there is a large diversity gain. In this case, it is expected that the code design can only offer a limited error performance improvement. However, it should also be noted that the coding gain always improves with the increase of $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ , regardless of the value of channel parameter $P$ according to (32). Therefore, a preliminary guideline for the code design for the OTFS systems is to maximize the minimum value of $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ among all pairs of codewords of the code.
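To make the trade-off concrete, the following sketch evaluates the coding gain term ${{d_{\rm{E}}^{2}\left({\bf{e}}\right)}/P}$ and the right-hand side of (32) for a few values of $P$ . The function names and the chosen values ( $d_{\rm{E}}^{2}\left({\bf{e}}\right)=40$ , $E_{s}/N_{0}=10$ dB) are illustrative assumptions.

```python
import math

def coding_gain_db(d2, P):
    """Coding gain term d_E^2 / P from (32), expressed in dB."""
    return 10.0 * math.log10(d2 / P)

def pep_bound_small_p(d2, P, es_n0_db):
    """Right-hand side of (32) for a given E_s/N_0 in dB."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return (d2 / P) ** (-P) * (es_n0 / 4.0) ** (-P)

d2 = 40.0
for P in (2, 4, 8):
    # The coding gain shrinks with P while the exponent of the SNR term grows.
    print(P, round(coding_gain_db(d2, P), 2), pep_bound_small_p(d2, P, 10.0))
```

For a fixed code, the coding gain column decreases with $P$ while the bound itself falls off faster in SNR, which is the behaviour stated in Corollary 1.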

To verify the accuracy of the derived unconditional PEP bound, we numerically compare the coding gain corresponding to (32) and (24). In particular, recalling (24), after some manipulations, we obtain

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq{\left({\frac{{{{\left({\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}\right)}^{\frac{{\rm{1}}}{P}}}}}{P}}\right)^{-P}}{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)^{-P}}.

(33)

Hence, we refer to the term ${{{{\left({\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}\right)}^{\frac{{\rm{1}}}{P}}}}\mathord{\left/{\vphantom{{{{\left({\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}\right)}^{\frac{{\rm{1}}}{P}}}}P}}\right.\kern-1.2pt}P}$ as the conditional coding gain of the OTFS systems for given channel parameters ${\bm{\omega}}_{\tau}$ and ${\bm{\omega}}_{\nu}$ , as ${{\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}}$ is a function of ${{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}$ . We can also obtain the average coding gain with respect to various channel parameters ${{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}$ and error sequences $\bf{e}$ . On the other hand, from the unconditional PEP upper bound (32), we call the function $f\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)={{d_{\rm{E}}^{2}\left({\bf{e}}\right)}\mathord{\left/{\vphantom{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}P}}\right.\kern-1.2pt}P}$ the coding gain bound of the OTFS systems. Now let us compare the average coding gain and the coding gain obtained from the performance bound via simulations. In particular, we consider a binary phase shift keying (BPSK) signal for the OTFS system with $M=2$ and $N=5$ , and error sequences with ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ , where the maximum delay index $l_{\rm{max}}$ and the maximum Doppler index $k_{\rm{max}}$ are set to $2$ and $4$ , respectively. To be more specific, we numerically average the conditional coding gains over all error sequences with a given ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ and over the channel parameters ${{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}$ to obtain the average coding gain.

Fig. 3 shows the comparison between the average coding gain and the corresponding coding gain bound in decibels (dB) for $P=2$ . As shown in the figure, different values of maximum delay and Doppler indices do not have a strong impact on the average coding gain. Meanwhile, it can be observed in the figure that the average coding gain improves with the increase of the squared Euclidean distance ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ . Furthermore, the derived coding gain bound shows a close match with the overall average coding gain, especially when ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ is small.

Fig. 4 illustrates the average coding gain and the corresponding coding gain bound (in dashed lines) in decibels (dB) with respect to different values of $P$ . Without loss of generality, we set the maximum delay index $l_{\rm{max}}$ and the maximum Doppler index $k_{\rm{max}}$ as $2$ and $4$ , respectively. As shown in the figure, given ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ , the average coding gain decreases with the increase of the number of paths $P$ , which is consistent with Corollary 1. Similar to the previous figure, the coding gain bounds match well with the general behaviour of the average coding gains, especially when $P$ is small, which verifies the correctness of our derivation. On the other hand, we notice that the derived coding gain bound slightly diverges from the average coding gains when $P$ is large. This observation motivates us to derive a more suitable approximation for the coding gain for large $P$ , the details of which are given in the following subsection.

III-B Error Performance Analysis for Large Values of $P$

When there is a large number of independent resolvable paths of the channel, i.e., the value of $P$ is large, the unconditional PEP can be more accurately bounded by considering the strong law of large numbers [23, 18]. Specifically, the term $\sum\limits_{i=1}^{P}{{\lambda_{i}}{{|{{{\tilde{h}}_{i}}}|}^{2}}}$ in (20) approaches a Gaussian random variable, due to the central limit theorem [24].

Notice that ${{\tilde{h}}_{i}}$ follows the complex Gaussian distribution, with mean ${\mu_{{{\tilde{h}}_{i}}}}=\mathbb{E}\left[{\bf{h}}\right]\cdot{{\bf{v}}_{i}}$ and variance $1/(2P)$ per real dimension. Therefore, for the ease of derivation, we normalize the variance and rewrite (20) as

P\left({{\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{\bf{h}},{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}}\right)\leq\exp\left({-\frac{{{E_{s}}}}{{4{N_{0}}}}\sum\limits_{i=1}^{r}{\frac{{{\lambda_{i}}}}{P}{{\left|{{{\bar{h}}_{i}}}\right|}^{2}}}}\right),

(34)

where ${\bar{h}_{i}}=\sqrt{P}{\tilde{h}_{i}}$ . Note that $\left\{{{{\bar{h}}_{1}},{{\bar{h}}_{2}},...,{{\bar{h}}_{r}}}\right\}$ are independent complex Gaussian random variables with mean ${\mu_{{{\bar{h}}_{i}}}}={\sqrt{P}}{\mu_{{{\tilde{h}}_{i}}}}$ and variance $1/2$ per real dimension. Furthermore, it is obvious that $\left|{{{\bar{h}}_{i}}}\right|$ follows the Rician distribution with Rician factor ${{\bar{K}}_{i}}={\left|{{\mu_{{{\bar{h}}_{i}}}}}\right|^{2}}$ and a unit variance. Thus, it can be shown that ${{{\left|{{{\bar{h}}_{i}}}\right|}^{2}}}$ follows a noncentral chi-squared distribution with two degrees of freedom (DoFs) and noncentrality parameter $S={{\bar{K}}_{i}}$ , whose mean and variance are given by

	$\displaystyle{\mu_{{{\left|{{{\bar{h}}_{i}}}\right|}^{2}}}}$	$\displaystyle=1+{{\bar{K}}_{i}},$		(35)
	$\displaystyle\sigma_{{{\left|{{{\bar{h}}_{i}}}\right|}^{2}}}^{2}$	$\displaystyle=1+2{{\bar{K}}_{i}}.$		(36)

Next, we derive the unconditional PEP by means of Gaussian approximation. To start with, let $\psi=\sum\limits_{i=1}^{P}{{\lambda_{i}}{{\left|{{{\bar{h}}_{i}}}\right|}^{2}}}$ . According to (35) and (36), we approximate $\psi$ as a Gaussian random variable, whose mean is ${\mu_{\psi}}=\sum\limits_{i=1}^{P}{{\lambda_{i}}\left({1+{{{\bar{K}}_{i}}}}\right)}$ and variance is $\sigma_{\psi}^{2}=\sum\limits_{i=1}^{P}{\lambda_{i}^{2}\left({1+2{{\bar{K}}_{i}}}\right)}$ . Thus, according to the Gaussian distribution of $\psi$ , the conditional PEP in (20) is upper-bounded by

P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq\int\limits_{0}^{+\infty}{\exp\left({-\frac{{{E_{s}}}}{{4{N_{0}}P}}\psi}\right)p}\left(\psi\right)d\psi.

(37)

Considering

\displaystyle\int\limits_{0}^{+\infty}{\exp\left({-\gamma\psi}\right)p}\left(\psi\right)d\psi=\exp\left({\frac{1}{2}{\gamma^{2}}\sigma_{\psi}^{2}-\gamma{\mu_{\psi}}}\right)Q\left({\frac{{\gamma\sigma_{\psi}^{2}-{\mu_{\psi}}}}{{{\sigma_{\psi}}}}}\right),\gamma>0,

(38)

we obtain

\displaystyle P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq\exp\left({\frac{1}{2}{{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)}^{2}}\cdot\frac{{\sigma_{\psi}^{2}}}{{{P^{2}}}}-\frac{{{E_{s}}}}{{4{N_{0}}}}\cdot\frac{{{\mu_{\psi}}}}{P}}\right)Q\left({\frac{{{E_{s}}}}{{4{N_{0}}}}\cdot\frac{{{\sigma_{\psi}}}}{P}-\frac{{{\mu_{\psi}}}}{{{\sigma_{\psi}}}}}\right).

(39)
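The identity in (38), on which (39) relies, can be checked numerically. The sketch below compares a midpoint-rule approximation of the left-hand side of (38) with its closed form, using the standard relation $Q(x)=\frac{1}{2}{\rm{erfc}}(x/\sqrt{2})$ . The parameter values and function names are illustrative assumptions.

```python
import math

def q_func(x):
    """Gaussian Q-function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def closed_form(gamma, mu, sigma):
    """Right-hand side of (38)."""
    scale = math.exp(0.5 * gamma ** 2 * sigma ** 2 - gamma * mu)
    return scale * q_func((gamma * sigma ** 2 - mu) / sigma)

def numeric_integral(gamma, mu, sigma, steps=20000):
    """Midpoint-rule approximation of the left-hand side of (38) over [0, mu + 12*sigma]."""
    upper = max(mu, 0.0) + 12.0 * sigma
    h = upper / steps
    total = 0.0
    for n in range(steps):
        psi = (n + 0.5) * h
        pdf = math.exp(-0.5 * ((psi - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += math.exp(-gamma * psi) * pdf
    return total * h

gamma, mu, sigma = 0.3, 4.0, 1.5
print(closed_form(gamma, mu, sigma), numeric_integral(gamma, mu, sigma))
```

Substituting $\gamma={E_{s}}/(4{N_{0}}P)$ into the closed form gives the right-hand side of (39).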

Similar to the previous subsection, we consider the special case of Rayleigh fading.

In the case of Rayleigh fading, i.e., $|{{{\bar{h}}_{i}}}|$ and $|{{h_{i}}}|$ follow the Rayleigh distribution, we have ${\mu_{\psi}}=\sum\limits_{i=1}^{P}{{\lambda_{i}}}$ and $\sigma_{\psi}^{2}=\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}$ . Therefore, the right hand side of (39) is given by

	$\displaystyle P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq$	$\displaystyle\exp\left({\frac{1}{2}{{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)}^{2}}\sum\limits_{i=1}^{P}{\frac{{\lambda_{i}^{2}}}{{{P^{2}}}}}-\frac{{{E_{s}}}}{{4{N_{0}}}}\sum\limits_{i=1}^{P}{\frac{{{\lambda_{i}}}}{P}}}\right)$
		$\displaystyle\quad\quad Q\left({\frac{{{E_{s}}}}{{4{N_{0}}P}}\sqrt{\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}-\frac{{\sum\nolimits_{i=1}^{P}{{\lambda_{i}}}}}{{\sqrt{\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}}}}}\right).$		(40)

Furthermore, we consider the Chernoff bound of the Q-function [23]

Q\left(\gamma\right)\leq\exp\left({-\frac{1}{2}{\gamma^{2}}}\right),\gamma>0,

(41)

and (40) can be further upper-bounded by

$\displaystyle P\left({\left.{{\bf{x}},{\bf{x^{\prime}}}}\right|{{{\bm{\omega}}_{\tau}}},{{{\bm{\omega}}_{\nu}}}}\right)\leq$	$\displaystyle\exp\left({\frac{1}{2}{{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)}^{2}}\sum\limits_{i=1}^{P}{\frac{{\lambda_{i}^{2}}}{{{P^{2}}}}}-\frac{{{E_{s}}}}{{4{N_{0}}}}\sum\limits_{i=1}^{P}{\frac{{{\lambda_{i}}}}{P}}}\right)$
	$\displaystyle\exp\left({-\frac{1}{2}{{\left({\frac{{{E_{s}}}}{{4{N_{0}}}}}\right)}^{2}}\sum\limits_{i=1}^{P}{\frac{{\lambda_{i}^{2}}}{{{P^{2}}}}}-\frac{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}}{{2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}}+\frac{{{E_{s}}}}{{4{N_{0}}}}\sum\limits_{i=1}^{P}{\frac{{{\lambda_{i}}}}{P}}}\right)$
$\displaystyle=$	$\displaystyle\exp\left({-\frac{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}}{{2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}}}\right),$	(42)

when

\frac{{{E_{s}}}}{{4{N_{0}}}}\geq\frac{P{\sum\nolimits_{i=1}^{P}{{\lambda_{i}}}}}{{\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}}}.

(43)
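The step from (40) to (42) can be illustrated numerically. The sketch below evaluates both right-hand sides for a hypothetical set of eigenvalues whose sum equals $Pd_{\rm{E}}^{2}\left({\bf{e}}\right)$ with $P=4$ and $d_{\rm{E}}^{2}\left({\bf{e}}\right)=8$ , and for values of ${E_{s}}/(4{N_{0}})$ above the threshold in (43). The eigenvalues, SNR values, and function names are illustrative assumptions.

```python
import math

def q_func(x):
    """Gaussian Q-function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def rayleigh_pep_bounds(lams, es_4n0):
    """Return (right-hand side of (40), right-hand side of (42), threshold of (43))."""
    P = len(lams)
    s1 = sum(lams)
    s2 = sum(lam ** 2 for lam in lams)
    exponent = 0.5 * es_4n0 ** 2 * s2 / P ** 2 - es_4n0 * s1 / P
    q_arg = es_4n0 * math.sqrt(s2) / P - s1 / math.sqrt(s2)
    rhs40 = math.exp(exponent) * q_func(q_arg)
    rhs42 = math.exp(-s1 ** 2 / (2.0 * s2))
    threshold = P * s1 / s2
    return rhs40, rhs42, threshold

lams = [10.0, 8.0, 6.0, 8.0]  # hypothetical eigenvalues, trace = 4 * 8
# Moderate SNR values only: the exponential factor in (40) overflows a float for very large SNR.
for es_4n0 in (0.5, 1.0, 2.0, 4.0):
    rhs40, rhs42, threshold = rayleigh_pep_bounds(lams, es_4n0)
    print(es_4n0, es_4n0 >= threshold, rhs40, rhs42)
```

Note that the right-hand side of (42) no longer depends on the SNR, which is why Theorem 3 needs a further step (Appendix D) to relate it to ${E_{s}}/{N_{0}}$ .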

Based on (42), the unconditional PEP can be approximately upper-bounded as shown in the following Theorem.

Theorem 3 (Unconditional PEP upper bound for large $P$ ): For a large value of $P$ and a reasonably high SNR, i.e., $\frac{{{E_{s}}}}{{4{N_{0}}}}\geq\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}$ , the unconditional PEP of OTFS systems can be approximately upper-bounded by

P\left({{\bf{x}},{\bf{x^{\prime}}}}\right)\mathbin{\lower 1.29167pt\hbox{$\buildrel<\over{\smash{\scriptstyle\sim}\vphantom{{}_{x}}}$}}\exp\left({-\frac{{{E_{s}}}}{{8{N_{0}}}}d_{\rm{E}}^{2}}\left({\bf{e}}\right)\right).

(44)

Proof: The proof is given in Appendix D.

It should be noted that, for $P\geq 4$ , the approximation in Theorem 3 is sufficiently accurate owing to the strong law of large numbers [18, 23]. On the other hand, the number of paths $P$ is usually smaller than the squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ for practical wireless transmissions with good channel codes (for example, a popular industry-standard rate- $1/2$ convolutional code with code memory of $6$ has a minimum squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)=40$ [25]). Therefore, our SNR assumption is reasonable. Compared with Corollary 1, it is not surprising that the unconditional PEP only depends on the squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ , regardless of the delay and Doppler indices. Furthermore, it should be noted that (44) has a form similar to that of the error performance for AWGN channels [26]. This is because the impact of fading is mitigated by a large number of diversity branches; in other words, the channel with a large number of diversity paths approaches an AWGN model [26].
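As a worked illustration of the SNR condition in Theorem 3, take the minimum squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)=40$ quoted above and assume, purely for illustration, $P=8$ paths:

```latex
\frac{E_{s}}{4N_{0}}\geq\frac{P}{d_{\rm{E}}^{2}\left({\bf{e}}\right)}
\;\Longleftrightarrow\;
\frac{E_{s}}{N_{0}}\geq\frac{4P}{d_{\rm{E}}^{2}\left({\bf{e}}\right)}
=\frac{4\cdot 8}{40}=0.8\approx-0.97~\text{dB}.
```

Under these values, the condition is already satisfied slightly below $0$ dB, which supports the remark that the SNR assumption is mild.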

In this section, we have derived the error performance analysis of coded OTFS systems. It should be noted that our error performance analysis only considers the integer Doppler case. However, the extension of the above analysis to the fractional Doppler case is straightforward. In the following, we will design suitable channel codes according to our error performance analysis.

III-C Code Design Issues

According to the derived analysis, the rule-of-thumb channel code design criterion is discussed in this section. Without loss of generality, we only consider the Rayleigh fading channel in the following. It should be noticed from the previous analysis that the code design criterion for the coded OTFS system is to maximize the minimum squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ .

Proposition 1 (The squared Euclidean distance criterion): The channel code should be designed to maximize the minimum squared Euclidean distance among all pairs of possible codewords.

We note that even with the designed code, the error performance of coded OTFS systems may still vary with different channel parameters, e.g., ${\bm{\omega}}_{\tau}$ and ${\bm{\omega}}_{\nu}$ . This detrimental effect due to channel realizations is widely observed in the system designs for fading channels, such as in [19, 27]. Specifically, with different channel parameters, the value of the conditional coding gain can be different even with the same error sequence, which may potentially jeopardize the overall error performance of OTFS systems. In order to obtain a more robust performance, it is desirable to apply an interleaver to permute the coded symbols before sending them to the constellation mapper or the OTFS modulator in the DD domain. As pointed out in [27], such an interleaver can “whiten” the transmitted symbols from the information theoretic point of view and the detrimental effect on the error performance due to the channel parameters can thus be alleviated.

To examine our performance analysis of the coded OTFS systems, we perform numerical simulations for OTFS systems over high-mobility channels, the results of which will be shown in the next section.

IV Numerical Results

In this section, the error performance of the coded OTFS system with various channel codes is evaluated via numerical simulations. We consider the BPSK signal for the OTFS system, where the data sequence is first encoded and interleaved, and then BPSK mapped, according to Fig. 1. Without loss of generality, we consider the maximum-likelihood (ML) detection of OTFS modulation. In order to verify the accuracy of the analytical results, we consider four different convolutional codes (with trellis termination) with different minimum squared Euclidean distances $d_{\rm{E}}^{2}{\left({\bf{e}}\right)}$ among all possible codeword pairs. The details of the code parameters are given in Table I, including the generator matrix, the memory length, and the smallest value of the squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ among all possible codeword pairs for each code. Specifically, we consider a coded OTFS system with $N=8$ and $M=16$ and correspondingly the codeword length for all considered simulations is $128$ bits unless otherwise specified. The channel decoder adopts the logarithm domain Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm [28]. Furthermore, we consider the Rayleigh fading case. If not otherwise specified, we set the maximum delay index as $l_{\max}=3$ and the maximum Doppler index as $k_{\max}=5$ , which corresponds to a relative UE speed around $250$ km/h with $4$ GHz carrier frequency and $1.5$ kHz sub-carrier spacing. For each channel realization, we randomly select the delay and Doppler indices such that we have $-{k_{\max}}\leq{k_{i}}\leq{k_{\max}}$ and $0\leq{l_{i}}\leq{l_{\max}}$ .

TABLE I: Code Parameters

Code structure	Generator matrix	Memory length	Minimum $d_{\rm{E}}^{2}\left({\bf{e}}\right)$
A	$\left[1+D,D\right]$	1	12
B	$\left[1+{D^{2}},1+D+{D^{2}}\right]$	2	20
C	$\left[{1+{D^{2}}+{D^{5}},1+D+{D^{2}}+{D^{3}}+{D^{4}}+{D^{5}}}\right]$	5	32
D	$\left[{1+D+{D^{2}}+{D^{5}}+{D^{6}},1+{D^{2}}+{D^{3}}+{D^{4}}+{D^{6}}}\right]$	6	40
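The last column of Table I can be related to the free Hamming distance of each code. Assuming BPSK symbols $\pm 1$ (an assumption about the normalization, suggested by the value $12$ for code A), each differing coded bit contributes $4$ to $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ , so the minimum $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ equals four times the free distance. The sketch below computes the free distance of a rate- $1/n$ feedforward convolutional code by a shortest-path search over its trellis; the function name and tap representation are hypothetical, and only codes A and B are evaluated here.

```python
import heapq

def free_distance(generators, memory):
    """Free Hamming distance of a rate-1/n feedforward convolutional code.

    generators: list of tap tuples, where taps[j] is the coefficient of D^j.
    """
    def step(state, bit):
        # Register contents: current input followed by the `memory` previous inputs.
        reg = (bit,) + state
        weight = sum(sum(t * r for t, r in zip(taps, reg)) % 2 for taps in generators)
        return reg[:-1], weight

    zero = (0,) * memory
    # Leave the all-zero state with input 1, then search for the lightest path back.
    start, w0 = step(zero, 1)
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, state = heapq.heappop(heap)
        if state == zero:
            return w
        if w > best.get(state, float("inf")):
            continue
        for bit in (0, 1):
            nxt, dw = step(state, bit)
            if w + dw < best.get(nxt, float("inf")):
                best[nxt] = w + dw
                heapq.heappush(heap, (w + dw, nxt))
    return None

code_a = [(1, 1), (0, 1)]        # [1 + D, D]
code_b = [(1, 0, 1), (1, 1, 1)]  # [1 + D^2, 1 + D + D^2]
print("code A:", 4 * free_distance(code_a, 1))
print("code B:", 4 * free_distance(code_b, 2))
```

The same call can be applied to codes C and D by listing their taps in increasing powers of $D$ .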

The frame-error-rate (FER) performance of the OTFS systems with $P=2$ is shown in Fig. 5. It can be observed that the FER curve for the uncoded OTFS system is slightly less steep than that for coded OTFS systems. This indicates that the uncoded OTFS system with $P=2$ does not guarantee the full diversity for all possible channel realizations, which is consistent with the analysis in [13]. More importantly, this observation also shows that the application of channel coding can improve the diversity gain of OTFS systems in the case where the OTFS modulation fails to achieve the full diversity. Moreover, we observe that employing the channel code with a larger minimum squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ indeed leads to a larger coding gain. Specifically, we observe from the figure that for FER $\approx{10^{-2}}$ , the required SNRs for codes A, B, C and D are $15.28$ dB, $13.64$ dB, $12.65$ dB, and $12.54$ dB, respectively. Compared to uncoded OTFS systems, these four coded OTFS systems achieve coding gains of roughly $2.99$ dB, $4.63$ dB, $5.62$ dB, and $5.73$ dB, respectively. This observation clearly substantiates the proposed performance analysis and code design criterion in Proposition 1.

Fig. 6 shows the FER performance of the OTFS systems with $P=8$ . We notice that the FER curve for the uncoded OTFS system with $P=8$ is slightly less steep than that for coded OTFS systems, which is consistent with the previous figure. On the other hand, similar to Fig. 5, we observe that the channel code with a larger minimum squared Euclidean distance $d_{\rm{E}}^{2}\left({\bf{e}}\right)$ enjoys a larger coding gain compared to the uncoded OTFS system, which is consistent with the analysis of Theorem 3, i.e., the channel approaches an AWGN model with a large number of diversity branches. Together with the observations from Fig. 5, one can conclude that our proposed code design criterion in Proposition 1 is universal for general OTFS systems regardless of the channel parameters and the number of paths $P$ .

The FER performance of the OTFS modulation with code D and different numbers of paths $P$ is illustrated in Fig. 7. It can be observed from the figure that, given the same code, the error performance of the coded OTFS systems improves with the increase of the number of distinguishable paths. Furthermore, with the same code, a larger value of $P$ corresponds to a larger diversity advantage as indicated in the figure, which is consistent with our analysis.

Fig. 8 presents the trade-off between the diversity and coding gain. In particular, we consider the FER performance of code D with $P=3$ and $P=8$ , compared with that of the corresponding uncoded OTFS systems. As shown in the figure, at FER $\approx{10^{-3}}$ , the coded OTFS system with $P=3$ exhibits around $5.7$ dB coding gain compared to the uncoded OTFS system with the same FER, while only around $5.0$ dB coding gain is obtained for the coded OTFS system with $P=8$ . This observation matches the prediction in Corollary 1, which indicates that the coding gain reduces with the increase of $P$ , given the same channel code.

We compare the FER performance of code A with $P=3$ and $P=8$ , and that of the corresponding OFDM systems in Fig. 9. As observed from the figure, the OTFS system enjoys better error performance than the OFDM system with the same code, for both $P=3$ and $P=8$ . Furthermore, the FER curve of the OFDM system shares almost the same slope as that of the OTFS system for $P=3$ . As for $P=8$ , the achieved diversity gain of the OTFS system is clearly higher than that of the OFDM system. Note that the diversity gain of coded OFDM systems is determined by the smaller value of the minimum symbol-wise Hamming distance of the code ${\delta_{\rm{H}}}$ and the number of paths $P$ [29], i.e., $r_{\rm{OFDM}}={\rm{min}}({\delta_{\rm{H}}},P)$ , where $r_{\rm{OFDM}}$ is the achievable diversity gain of a coded OFDM system. Specifically, we have $r_{\rm{OFDM}}=3$ for both $P=3$ and $P=8$ with code A. In contrast, OTFS systems can obtain the full diversity almost surely regardless of the employment of channel codes. Therefore, this observation clearly shows the advantage of the OTFS systems over the OFDM systems.

The FER performance of both OTFS systems and OFDM systems with various values of the relative UE speed is compared in Fig. 10. We consider $l_{\max}=3$ and $k_{\max}=3,5,6$ , which correspond to the cases where the relative UE speeds are $150$ km/h, $250$ km/h, and $300$ km/h, respectively. We apply the ML detection for both the OTFS and OFDM systems to have a fair comparison. We observe that the error performance of both OTFS and OFDM systems with ML detection does not change much with different relative UE speeds, which is consistent with the observations in [10]. Note that ML detection is not practically feasible for conventional OFDM systems due to the high detection complexity. Therefore, frequency domain equalization is usually deployed for OFDM systems [30]. In this case, the error performance of OFDM systems will degrade dramatically with the increase of the speed [30, 21] due to the severe inter-carrier interference (ICI) induced by the Doppler spread. Furthermore, the OTFS systems outperform the OFDM systems in terms of FER, in both coded and uncoded cases and for all considered values of the speed. Similar to the previous figure, we also observe that the achieved diversity gain of the OTFS system is higher than that of the OFDM system, which agrees with our analysis.

TABLE II: Code Parameters for Fig. 11

Code	Data length	Codeword length	Code rate
Convolutional Code D	250	512	0.488
5G LDPC	256	512	0.5
LTE Turbo	250	512	0.488

We present the FER results of coded OTFS systems with modern codes in Fig. 11. We consider an LDPC code from the 5G communication standard [31] (referred to as the 5G LDPC code) and the Turbo code from the 3GPP long term evolution (LTE) standard [32] (referred to as the LTE Turbo code), where the code parameters are given in Table II and we have $N=16$ and $M=32$ for OTFS modulation. As a benchmark, the FER performance of the convolutional code D is also given in Fig. 11. We observe that the LTE Turbo code achieves the best error performance compared to the 5G LDPC code and the convolutional code D with $P=4$ , although they all share the same diversity gain. More specifically, at FER $\approx{10^{-3}}$ , the 5G LDPC code and the LTE Turbo code show around $0.5$ dB and $0.7$ dB SNR gain compared to the convolutional code D, respectively. Note that a similar trend can be observed over the AWGN channel, where the LTE Turbo code has the best performance while the convolutional code D has the worst performance. Therefore, this observation indicates that codes optimized for AWGN channels can also achieve a good error performance in the OTFS systems, which is consistent with our analysis.

V Conclusion

In this paper, we analyzed the error performance of coded OTFS systems over high-mobility channels. We first derived the conditional PEP for a given channel realization and then obtained the unconditional PEP by leveraging proper bounding techniques. We discussed two cases of the OTFS transmission according to the number of independent resolvable paths of the channel, where we showed that the coding improvement of OTFS systems depends on the squared Euclidean distance between a pair of codewords. More importantly, we demonstrated the fundamental trade-off between the diversity gain and the coding gain for OTFS systems. Furthermore, we proposed a code design criterion based on the derived unconditional bound. The analysis and the code design criterion were verified by numerical simulations with various channel codes. Our future work will focus on the modern code design for OTFS systems by considering analytical tools such as the density evolution and the extrinsic information transfer (EXIT) chart.

Appendix A Proof of Lemma 3

Notice that ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is a positive definite Hermitian matrix. Hence, the eigenvalues $\left\{{{\lambda_{i}}}\right\}$ of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ are all positive. Considering the arithmetic mean and geometric mean (AM-GM) inequality, we obtain

\sum\limits_{i=1}^{P}{\frac{1}{{{\lambda_{i}}}}}\geq P{\left({\prod\limits_{i=1}^{P}{\frac{1}{{{\lambda_{i}}}}}}\right)^{\frac{1}{P}}}=\frac{P}{{{{\left({\prod\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{\frac{1}{P}}}}}.

(45)

Then, we apply the AM-GM inequality again to the denominator of the right-hand side of (45), which yields

\frac{P}{{{{\left({\prod\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{\frac{1}{P}}}}}\geq\frac{{{P^{2}}}}{{\sum\limits_{i=1}^{P}{{\lambda_{i}}}}}\geq\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}.

(46)

It is obvious that the equality only holds when the eigenvalues $\left\{{{\lambda_{i}}}\right\}$ are of the same value, e.g., the codeword difference matrix ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is a diagonal matrix, i.e., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ . This completes the proof of Lemma 3.
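As a numerical sanity check of (45) and (46), the following sketch draws a random positive definite Hermitian matrix, rescales it so that its trace equals $Pd_{\rm{E}}^{2}\left({\bf{e}}\right)$ (the normalization implied by (46) and (53)), and compares ${\rm{tr}}\left({{\bf{\Omega}}^{-1}}\right)$ with $P/d_{\rm{E}}^{2}\left({\bf{e}}\right)$. The helper name `random_pd_hermitian`, the dimension, and the value of $d_{\rm{E}}^{2}$ are illustrative assumptions and are not taken from the simulations in this paper.

```python
import numpy as np

def random_pd_hermitian(P, d2, rng):
    """Hypothetical helper: random P x P positive definite Hermitian
    matrix, rescaled so that its trace equals P * d2."""
    A = rng.standard_normal((P, P)) + 1j * rng.standard_normal((P, P))
    omega = A.conj().T @ A + 1e-3 * np.eye(P)  # Gram matrix plus a small ridge
    return omega * (P * d2 / np.trace(omega).real)

rng = np.random.default_rng(0)
P, d2 = 4, 2.0                          # illustrative values only
omega = random_pd_hermitian(P, d2, rng)
lam = np.linalg.eigvalsh(omega)         # real eigenvalues in ascending order
trace_inv = float(np.sum(1.0 / lam))    # tr(Omega^{-1}) = sum of 1 / lambda_i
lower_bound = P / d2                    # right-hand side of (46)

# Equality case: all eigenvalues equal to d2
lam_eq = np.linalg.eigvalsh(d2 * np.eye(P))
trace_inv_eq = float(np.sum(1.0 / lam_eq))

print(trace_inv, lower_bound, trace_inv_eq)
```

For a generic draw the trace of the inverse lies strictly above $P/d_{\rm{E}}^{2}$, while the scaled identity matrix meets the bound with equality.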

Appendix B Proof of Theorem 1

The determinant of the codeword difference matrix ${{\bf{\Omega}}\left({\bf{e}}\right)}$ can be written as

\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)=\exp\left({\ln\left({\prod\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}\right)=\exp\left({\sum\limits_{i=1}^{P}{\ln\left({{\lambda_{i}}}\right)}}\right),

(47)

where $\{\lambda_{i}\}$ for $i\in\left\{{1,\ldots,P}\right\}$ are the eigenvalues of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ . Furthermore, let us consider the inequality $\ln\left(\gamma\right)\geq 1-\frac{1}{\gamma},\gamma\in\left({0,+\infty}\right)$ , where the equality only holds when $\gamma=1$ . Therefore, we have

$\displaystyle\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$	$\displaystyle=\exp\left({\sum\limits_{i=1}^{P}{\ln\left({{\lambda_{i}}}\right)}}\right)$
	$\displaystyle=\exp\left({P\ln\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)+\sum\limits_{i=1}^{P}{\ln\left({\frac{{{\lambda_{i}}}}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}}\right)}}\right)$
	$\displaystyle\geq\exp\left({P\ln\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)+\sum\limits_{i=1}^{P}{\left({1-\frac{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{{{\lambda_{i}}}}}\right)}}\right)$	(48)
	$\displaystyle={\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{P}}\exp\left({P-d_{\rm{E}}^{2}\left({\bf{e}}\right){\rm{tr}}\left({{{\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)}^{-1}}}\right)}\right).$	(49)

It is obvious that the equality holds in (48) only if all the eigenvalues $\{\lambda_{i}\}$ of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ are equal to ${d_{\rm{E}}^{2}\left({\bf{e}}\right)}$ . Notice that the eigenvalues $\{\lambda_{i}\}$ of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ are equal to the main diagonal elements when ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is a diagonal matrix, i.e., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}\left({\bf{e}}\right),\ldots,d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right\}$ , in which case we have $\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)={\left({d_{\rm{E}}^{2}}\left({\bf{e}}\right)\right)^{P}}$ . This completes the proof of Theorem 1.
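The bound in (49) can also be checked numerically. The sketch below evaluates both sides for a random positive definite Hermitian matrix and for the equality case of a scaled identity matrix. The function name `det_lower_bound` and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def det_lower_bound(omega, d2):
    """Right-hand side of (49): (d2)^P * exp(P - d2 * tr(Omega^{-1}))."""
    P = omega.shape[0]
    lam = np.linalg.eigvalsh(omega)
    return float(d2 ** P * np.exp(P - d2 * np.sum(1.0 / lam)))

rng = np.random.default_rng(1)
P, d2 = 3, 1.5                                   # illustrative values only
A = rng.standard_normal((P, P)) + 1j * rng.standard_normal((P, P))
omega = A.conj().T @ A                           # positive definite almost surely
omega = omega * (P * d2 / np.trace(omega).real)  # enforce tr(Omega) = P * d2

det_omega = float(np.prod(np.linalg.eigvalsh(omega)))  # product of eigenvalues
bound = det_lower_bound(omega, d2)

# Equality case: Omega = diag{d2, ..., d2}
omega_eq = d2 * np.eye(P)
det_eq = float(np.prod(np.linalg.eigvalsh(omega_eq)))
bound_eq = det_lower_bound(omega_eq, d2)

print(det_omega, bound, det_eq, bound_eq)
```

Since the inequality $\ln\left(\gamma\right)\geq 1-\frac{1}{\gamma}$ holds for every $\gamma>0$, the check does not depend on the particular draw; the trace normalization is only applied to match the setting of the theorem.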

Appendix C Proof of Theorem 2

Following Theorem 1, we note that the term ${P-{d_{\rm{E}}^{2}\left({\bf{e}}\right)}{\rm{tr}}\left({{\bf{\Omega}}_{i}^{-1}}\right)}$ is no greater than zero according to Lemma 3. Hence, the value of $\exp\left({P-{d_{\rm{E}}^{2}\left({\bf{e}}\right)}{\rm{tr}}\left({{\bf{\Omega}}_{i}^{-1}}\right)}\right)$ changes slowly with the increase of ${\rm{tr}}\left({{\bf{\Omega}}_{i}^{-1}}\right)$ , owing to the slow decay property of the exponential function. Therefore, we apply the result from Lemma 3 to approximate the determinant of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ [23, 18] as

$\displaystyle\det\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$	$\displaystyle\geq{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{P}}\exp\left({P-d_{\rm{E}}^{2}\left({\bf{e}}\right){\rm{tr}}\left({{\bf{\Omega}}_{i}^{-1}}\right)}\right)$
	$\displaystyle\mathbin{\lower 1.29167pt\hbox{$\buildrel>\over{\smash{\scriptstyle\sim}\vphantom{{}_{x}}}$}}{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)^{P}}\exp\left({P-d_{\rm{E}}^{2}\left({\bf{e}}\right){{\left({{\rm{tr}}\left({{\bf{\Omega}}_{i}^{-1}}\right)}\right)}_{\min}}}\right)$
	$\displaystyle={{{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)}^{P}}\exp\left({P-d_{\rm{E}}^{2}\left({\bf{e}}\right)\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}}\right)}$
	$\displaystyle={{{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)}^{P}}},$	(50)

where the equality holds if ${\bf{\Omega}}\left({\bf{e}}\right)$ is a diagonal matrix, i.e., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ . Mathematically, the above approximation may be loose if ${\bf{\Omega}}\left({\bf{e}}\right)$ is ill-conditioned. Therefore, we justify the accuracy of our approximation as follows. A commonly adopted measure for testing whether a matrix is ill-conditioned is the $\bf{P}$ -condition number [33]. Specifically, we consider the lower bound on the $\bf{P}$ -condition number of a Gram matrix [33]. The ${\bf{P}}$ -condition number of ${\bf{\Omega}}\left({\bf{e}}\right)$ is defined by

{\bf{P}}\left({\bf{\Omega}}\left({\bf{e}}\right)\right)\buildrel\Delta\over{=}r\left({\bf{\Omega}}\left({\bf{e}}\right)\right)r\left({\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)^{-1}}\right),

(51)

where $r\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$ is the spectral radius of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ , i.e., $r\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$ equals the largest eigenvalue of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ . In particular, the matrix ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is said to be ill-conditioned if ${\bf{P}}\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$ is large and well-conditioned if ${\bf{P}}\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)$ is small. According to [33], we have ${\bf{P}}\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)\geq\frac{\left({\left\|{\bf{u}}_{i}\right\|}^{2}\right)_{\max}}{\left({\left\|{\bf{u}}_{j}\right\|}^{2}\right)_{\min}}$ for $1\leq i,j\leq P$ , which yields

{\bf{P}}\left({{\bf{\Omega}}\left({\bf{e}}\right)}\right)\geq\frac{d_{\rm{E}}^{2}\left({\bf{e}}\right)}{d_{\rm{E}}^{2}\left({\bf{e}}\right)}=1.

(52)

We can see that the $\bf{P}$ -condition number of ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is always greater than or equal to $1$ , which indicates that ${{\bf{\Omega}}\left({\bf{e}}\right)}$ is generally well-conditioned. This completes the proof of Theorem 2.
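The $\bf{P}$ -condition number in (51) and its Gram-matrix lower bound can be illustrated as follows. The sketch assumes that ${\bf{\Omega}}\left({\bf{e}}\right)$ is formed as the Gram matrix ${\bf{U}}^{\rm{H}}{\bf{U}}$ of the column vectors ${\bf{u}}_{i}$; the matrix sizes and the function names `p_condition_number` and `gram_lower_bound` are hypothetical choices made only for this example.

```python
import numpy as np

def p_condition_number(omega):
    """P-condition number (51): r(Omega) * r(Omega^{-1}), which equals
    lam_max / lam_min for a positive definite Hermitian matrix."""
    lam = np.linalg.eigvalsh(omega)  # ascending order
    return float(lam[-1] / lam[0])

def gram_lower_bound(U):
    """Ratio of the largest to the smallest squared column norm of U."""
    norms2 = np.sum(np.abs(U) ** 2, axis=0)
    return float(norms2.max() / norms2.min())

rng = np.random.default_rng(2)
U = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
cond = p_condition_number(U.conj().T @ U)
bound = gram_lower_bound(U)

# Equal-norm columns, as in (52): the lower bound collapses to 1
U_eq = U / np.linalg.norm(U, axis=0)
cond_eq = p_condition_number(U_eq.conj().T @ U_eq)
bound_eq = gram_lower_bound(U_eq)

print(cond, bound, cond_eq, bound_eq)
```

The bound follows because the largest eigenvalue of a Hermitian matrix is at least its largest diagonal entry and the smallest eigenvalue is at most its smallest diagonal entry, and the diagonal of a Gram matrix holds the squared column norms.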

Appendix D Proof of Theorem 3

Recalling (43), we note that (42) only holds if

\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}\geq\frac{{P\sum\nolimits_{i=1}^{P}{{\lambda_{i}}}}}{{{{{E_{s}}}\mathord{\left/{\vphantom{{{E_{s}}}{\left({4{N_{0}}}\right)}}}\right.\kern-1.2pt}{\left({4{N_{0}}}\right)}}}}=\frac{{{P^{2}}d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{{{{{E_{s}}}\mathord{\left/{\vphantom{{{E_{s}}}{\left({4{N_{0}}}\right)}}}\right.\kern-1.2pt}{\left({4{N_{0}}}\right)}}}}.

(53)

Therefore, we consider the approximation of (42) as follows

	$\displaystyle P\left({{\bf{x}},{\bf{x^{\prime}}}}\right)$	$\displaystyle\leq\exp\left({-{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}\mathord{\left/{\vphantom{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}{\left({2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}}}\right.\kern-1.2pt}{\left({2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}}}\right)\mathbin{\lower 1.29167pt\hbox{$\buildrel<\over{\smash{\scriptstyle\sim}\vphantom{{}_{x}}}$}}\exp\left({-{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}\mathord{\left/{\vphantom{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}{2{{\left({\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}_{\min}}}}}\right.\kern-1.2pt}{2{{\left({\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}_{\min}}}}}\right)$
		$\displaystyle=\exp\left({-{{{{\left({Pd_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)}^{2}}}\mathord{\left/{\vphantom{{{{\left({Pd_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)}^{2}}}{\left({2\frac{{{P^{2}}d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{{{{{E_{s}}}\mathord{\left/{\vphantom{{{E_{s}}}{\left({4{N_{0}}}\right)}}}\right.\kern-1.2pt}{\left({4{N_{0}}}\right)}}}}}\right)}}}\right.\kern-1.2pt}{\left({\frac{{2{P^{2}}d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{{{{{E_{s}}}\mathord{\left/{\vphantom{{{E_{s}}}{\left({4{N_{0}}}\right)}}}\right.\kern-1.2pt}{\left({4{N_{0}}}\right)}}}}}\right)}}}\right)=\exp\left({-\frac{{{E_{s}}}}{{8{N_{0}}}}d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right).$		(54)

In particular, this approximation is reasonable since the value of $\exp\left({-{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}\mathord{\left/{\vphantom{{{{\left({\sum\limits_{i=1}^{P}{{\lambda_{i}}}}\right)}^{2}}}{\left({2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}}}\right.\kern-1.2pt}{\left({2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}\right)}}}\right)$ changes slowly with the increase of ${2\sum\limits_{i=1}^{P}{\lambda_{i}^{2}}}$ , owing to the slow decay property of the corresponding function [23, 18]. On the other hand, the justification of the SNR assumption of (43) is necessary. According to Lemma 4, we have

\frac{{P\sum\nolimits_{i=1}^{P}{{\lambda_{i}}}}}{{\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}}}\leq\frac{{{P^{2}}d_{\rm{E}}^{2}\left({\bf{e}}\right)}}{{P{{\left({d_{\rm{E}}^{2}\left({\bf{e}}\right)}\right)}^{2}}}}=\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}},

(55)

where the equality only holds when the eigenvalues of ${\bf{\Omega}}\left({\bf{e}}\right)$ share the same value, e.g., ${\bf{\Omega}}\left({\bf{e}}\right)=\textrm{diag}\left\{{d_{\rm{E}}^{2}{\left({\bf{e}}\right)},\ldots,d_{\rm{E}}^{2}}{\left({\bf{e}}\right)}\right\}$ . Therefore, we can see that the term $\frac{P{\sum\nolimits_{i=1}^{P}{{\lambda_{i}}}}}{{\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}}}$ is upper-bounded by $\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}$ . Hence, the SNR assumption in (43) can be further restricted to $\frac{{{E_{s}}}}{{4{N_{0}}}}\geq\frac{P}{{d_{\rm{E}}^{2}\left({\bf{e}}\right)}}$ . This completes the proof of Theorem 3.
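The two steps of this proof can be reproduced numerically. The sketch below checks the upper bound in (55) for a random set of positive eigenvalues whose sum is $Pd_{\rm{E}}^{2}\left({\bf{e}}\right)$, and confirms that substituting the minimum of $\sum\nolimits_{i=1}^{P}{\lambda_{i}^{2}}$ from (53) into the exponent gives the closed form in (54). The function names and all numerical values, including the SNR, are assumptions chosen for illustration.

```python
import numpy as np

def snr_threshold(lam):
    """Left-hand side of (55): P * sum(lam) / sum(lam^2)."""
    lam = np.asarray(lam, dtype=float)
    return float(lam.size * lam.sum() / np.sum(lam ** 2))

def approx_pep_bound(es_over_n0, d2):
    """Closed form in (54): exp(-(Es / (8 N0)) * d_E^2)."""
    return float(np.exp(-es_over_n0 * d2 / 8.0))

P, d2 = 4, 2.0                     # illustrative values only
es_over_n0 = 10.0                  # assumed linear-scale Es / N0
rng = np.random.default_rng(3)
lam = rng.uniform(0.1, 1.0, size=P)
lam = lam * (P * d2 / lam.sum())   # enforce sum(lam) = P * d2

threshold = snr_threshold(lam)     # data-dependent threshold
restricted = P / d2                # right-hand side of (55)

# Exponent of (54) with sum(lam^2) replaced by its minimum from (53)
snr = es_over_n0 / 4.0             # Es / (4 N0)
sum_lam2_min = P ** 2 * d2 / snr
pep_from_min = float(np.exp(-(P * d2) ** 2 / (2.0 * sum_lam2_min)))

print(threshold, restricted, pep_from_min, approx_pep_bound(es_over_n0, d2))
```

The threshold computed from the eigenvalues never exceeds $P/d_{\rm{E}}^{2}$, so requiring $\frac{{{E_{s}}}}{{4{N_{0}}}}\geq\frac{P}{{d_{\rm{E}}^{2}}}$ is sufficient regardless of how the eigenvalues are distributed.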

References

[1] S. Li, J. Yuan, W. Yuan, Z. Wei, B. Bai, and D. W. K. Ng, “On the performance of coded OTFS modulation over high-mobility channels,” submitted to IEEE Int. Conf. Commun. Workshops, pp. 1–6, 2021.
[2] G. Meyer and S. Beiker, Road Vehicle Automation. Springer International Publishing, 2019.
[3] G. Giambene, S. Kota, and P. Pillai, “Satellite-5G integration: A network perspective,” IEEE Netw., vol. 32, no. 5, pp. 25–31, Oct. 2018.
[4] Y. Cai, Z. Wei, R. Li, D. W. K. Ng, and J. Yuan, “Joint trajectory and resource allocation design for energy-efficient secure UAV communication systems,” IEEE Trans. Commun., vol. 68, no. 7, pp. 4536–4553, 2020.
[5] T. Hwang, C. Yang, G. Wu, S. Li, and G. Y. Li, “OFDM and its wireless applications: A survey,” IEEE Trans. Veh. Technol., vol. 58, no. 4, pp. 1673–1694, May 2008.
[6] R. Hadani, S. Rakib, M. Tsatsanis, A. Monk, A. J. Goldsmith, A. F. Molisch, and R. Calderbank, “Orthogonal time frequency space modulation,” in Proc. 2017 IEEE Wireless Commun. Net. Conf., Mar. 2017, pp. 1–6.
[7] F. Hlawatsch and G. Matz, Wireless Communications over Rapidly Time-varying Channels. Academic Press, 2011.
[8] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.
[9] P. Raviteja, K. T. Phan, and Y. Hong, “Embedded pilot-aided channel estimation for OTFS in delay-Doppler channels,” IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4906–4917, May 2019.
[10] P. Raviteja, K. T. Phan, Y. Hong, and E. Viterbo, “Interference cancellation and iterative detection for orthogonal time frequency space modulation,” IEEE Trans. Wireless Commun., vol. 17, no. 10, pp. 6501–6515, Oct. 2018.
[11] A. Farhang, A. RezazadehReyhani, L. E. Doyle, and B. Farhang-Boroujeny, “Low complexity modem structure for OFDM-based orthogonal time frequency space modulation,” IEEE Wireless Commun. Lett., vol. 7, no. 3, pp. 344–347, Jun. 2017.
[12] W. Yuan, Z. Wei, J. Yuan, and D. W. K. Ng, “A simple variational Bayes detector for orthogonal time frequency space (OTFS) modulation,” IEEE Trans. Veh. Technol., vol. 69, no. 7, pp. 7976–7980, Jul. 2020.
[13] P. Raviteja, Y. Hong, E. Viterbo, and E. Biglieri, “Effective diversity of OTFS modulation,” IEEE Wireless Commun. Lett., vol. 9, no. 2, pp. 249–253, Feb. 2020.
[14] E. Biglieri, P. Raviteja, and Y. Hong, “Error performance of orthogonal time frequency space (OTFS) modulation,” in IEEE Int. Conf. Commun. Workshops (ICC Workshops), May 2019, pp. 1–6.
[15] G. D. Surabhi, R. M. Augustine, and A. Chockalingam, “On the diversity of uncoded OTFS modulation in doubly-dispersive channels,” IEEE Trans. Wireless Commun., vol. 18, no. 6, pp. 3049–3063, Jun. 2019.
[16] J. Wu and P. Fan, “A survey on high mobility wireless communications: Challenges, opportunities and solutions,” IEEE Access, vol. 4, pp. 450–476, 2016.
[17] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction,” IEEE Trans. Inf. Theory, vol. 44, no. 2, pp. 744–765, Mar. 1998.
[18] B. Vucetic and J. Yuan, Space-Time Coding. John Wiley & Sons, 2003.
[19] B. Lu and X. Wang, “Space-time code design in OFDM systems,” in IEEE Global Telecommun. Conf., vol. 2, 2000, pp. 1000–1004.
[20] P. Raviteja, Y. Hong, E. Viterbo, and E. Biglieri, “Practical pulse-shaping waveforms for reduced-cyclic-prefix OTFS,” IEEE Trans. Veh. Technol., vol. 68, no. 1, pp. 957–961, Jan. 2019.
[21] J. Yuan, W. Yuan and Z. Wei, “Orthogonal time frequency space (OTFS) channel estimation,” Interim Report, 2020.
[22] M. Rothstein, The Gram matrix, orthogonal projection, and volume. Tutorial, Dept. of Math., Georgia University.
[23] J. Yuan, Z. Chen, B. Vucetic, and W. Firmanto, “Performance and design of space-time coding in fading channels,” IEEE Trans. Commun., vol. 51, no. 12, pp. 1991–1996, Dec. 2003.
[24] A. Papoulis and S. U. Pillai, Probability, Random Variables, and Stochastic Processes. Tata McGraw-Hill Education, 2002.
[25] W. Ryan and S. Lin, Channel Codes: Classical and Modern. Cambridge University Press, 2009.
[26] J. Ventura-Traveset, G. Caire, E. Biglieri, and G. Taricco, “Impact of diversity reception on fading channels with coded modulation. I. coherent detection,” IEEE Trans. Commun., vol. 45, no. 5, pp. 563–572, May 1997.
[27] E. Biglieri, J. Proakis, and S. Shamai, “Fading channels: Information-theoretic and communications aspects,” IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2619–2692, Oct. 1998.
[28] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate (corresp.),” IEEE Trans. Inf. Theory, vol. 20, no. 2, pp. 284–287, Mar. 1974.
[29] J. Yuan and J. Choi, “Channel adaptive space-time (CAST) coding and precoding for wireless downlink packet services,” Interim Report, 2006.
[30] Y. Mostofi and D. C. Cox, “ICI mitigation for pilot-aided OFDM mobile systems,” IEEE Trans. Wireless Commun., vol. 4, no. 2, pp. 765–774, 2005.
[31] 3GPP, “NR; Multiplexing and channel coding,” Release 15, Technical Specification (TS) 38.212, 2017.
[32] ——, “Group Radio Access Network, Evolved Universal Terrestrial Radio Access, Multiplexing and Channel Coding,” Release 8, Technical Specification (TS) 36.212, 2007.
[33] J. Taylor, “The condition of Gram matrices and related problems,” Proc. the Royal Soc. of Edinburgh, vol. 80, no. 1-2, pp. 45–56, Nov. 1978.