A Fast Recursive Algorithm for G-STBC
Abstract
This paper proposes a fast recursive algorithm for Group-wise Space-Time Block Code (G-STBC), which takes full advantage of the Alamouti structure in the equivalent channel matrix to reduce the computational complexity. With respect to the existing efficient algorithms for G-STBC, the proposed algorithm achieves better performance and usually requires less computational complexity.
Index Terms:
Fast recursive algorithms, MIMO, V-BLAST, G-STBC, DSTTD, Alamouti structure.I Introduction
Multiple-input multiple-output (MIMO) wireless communication systems possess huge channel capacities [1] when the multipath scattering is sufficiently rich. A MIMO system can provide two types of gains simultaneously, i.e. spatial multiplexing gain and diversity gain, while usually the increase in one type of gain needs the sacrifice in the other type of gain [2]. Two extreme examples of MIMO systems are Vertical Bell Laboratories Layered Space-Time architecture (V-BLAST) [3] and space-time block code (STBC) [4, 5], which achieve full spatial multiplexing gain and full diversity gain, respectively. The combination of V-BLAST and STBC is known as Group-wise STBC (G-STBC) or Multi-layered STBC (MLSTBC) [6, 8, 7, 9, 10, 11, 12, 13, 14], which achieves both spatial multiplexing gain and diversity gain concurrently. Specifically, the Alamouti code (the simplest and most popular STBC) [4] and the G-STBC consisting of two Alamouti codes [8, 12, 13, 14] have been adopted in wireless standards [15], where they are named space-time transmit diversity (STTD) and double-STTD (DSTTD), respectively.
In G-STBC, the transmit antennas are divided into layers (groups), of which each has transmit antennas and a corresponding STBC encoder. To detect one layer by suppressing the interferences from the other layers, the detector in [6] requires receive antennas, while the linear or successive interference cancelation (SIC) detectors in [8, 7, 9, 10, 11, 12, 13, 14] only require receive antennas, which are developed from the equivalent channel model based on the temporal and spatial structure of STBC. Moreover, group-wise SIC detectors and trellis-coded modulation (TCM) encoders/decoders are jointly designed for G-STBC in [16]. The minimum mean-square error (MMSE) ordered SIC (OSIC) detector for G-STBC [7] and that only for DSTTD [8] both require high computational complexities. Then some efficient detectors for G-STBC or only for DSTTD are proposed in [9, 10, 11, 12, 13, 14]. The MMSE sub-optimal OSIC detector for V-BLAST [17] using sorted QR decomposition (SQRD) are extended to decode G-STBC in [9]. The simple linear Zero Forcing (ZF) detector for G-STBC in [10] is further simplified in [11] by utilizing Strassen algorithm. For DSTTD, SIC detectors are reported in [12, 13]. Recently the authors of [14] notice that none of the DSTTD detectors in [8, 12, 13] utilizes the post-cancellation orthogonal structure, and then utilizes that structure to develop the SIC detector [14] for DSTDD that reduces the complexity at the sacrifice of performance.
In this letter, we consider the G-STBC consisting of Alamouti Codes. We deduce an efficient recursive algorithm to compute the initial estimation error covariance matrix, based on which we propose a fast recursive algorithm for G-STBC. The proposed algorithm exploits the Alamouti structure [18] in the equivalent channel matrix to reduce the complexity dramatically.
In what follows, , , and denote matrix transposition, matrix conjugate, and matrix conjugate transposition, respectively. is the identity matrix of size .
II System Model
The considered G-STBC system consists of transmit antennas and () receive antennas in a rich-scattering and flat-fading wireless channel. It includes parallel and independent STBC encoders. In this letter, we consider the Alamouti STBC encoder [4] with . The transmitted data stream is de-multiplexed into sub-streams. Each sub-stream () is encoded by an independent Alamouti encoder, and then fed to its respective transmit antennas. Correspondingly the received symbols over two time slots are
(1) |
where is the symbol received by the receive antenna, is the fading gain from the transmitter to the receiver, and is the complex Gaussian noise matrix. We can transform (1) into the equivalent channel model [8]
(2) |
where the equivalent received symbol vector , and the equivalent channel matrix
(8) |
In (2), the transmitted data stream is with the covariance , and the Gaussian noise vector is with zero mean and the covariance .
The MMSE OSIC detector for V-BLAST can be extended to the equivalent channel model (2) for G-STBC, since mathematically (2) has the same form as the standard channel model for V-BLAST [7]. Correspondingly the linear MMSE estimate of can be [7, 9]
(9) |
where , and is the estimation error covariance matrix.
The OSIC detector for G-STBC [7] detects entries of iteratively with the optimal order. In each iteration, the entry with the highest SINR (Signal to Interference plus Noise Ratio) is detected, and then its interference is cancelled in . Suppose that the entries of are permuted such that the successive detection order is . Then in the () iteration, the undetected entries can be represented as that includes the first entries of , while the corresponding equivalent channel matrix
(10) |
where () denotes the column of .
III A Fast Recursive Algorithm for G-STBC
In this section, we deduce an efficient recursive algorithm to compute the initial , based on which we develop a fast recursive algorithm for G-STBC.
The Alamouti structure in the equivalent channel matrix , which remains invariant under several matrix operations [18], is regarded as the matrix representation of a quaternion and then called as a block entry [18]. Block entries can form vectors and matrices, which are then called as block vectors and block matrices [18], respectively. In what follows, scalars, vectors and matrices with overlines always represent the above-mentioned block entries (i.e. block scalars), block vectors, and block matrices, respectively. We define the block matrix
(11) |
where , and the block column vector . Then we write
(12c) | |||||
(12f) |
where and are the leading principal sub-matrices of and [19, 20], respectively. In (12c), (computed from the block matrix ) must be Hermitian with diagonal sub-blocks that are non-negative scaled multiples of the identity matrix and with off-diagonal sub-blocks that are Alamouti matrices [18, Property 3 in the subsection “B. Block Matrices”], and the same is . Then in (12c), the block entry
(13) |
(where is a scalar), while is the block column vector, which can be represented as Moreover, in Appendix A we derive , and in (12f) respectively as
(14a) | |||||
(14b) | |||||
(14c) | |||||
(14d) |
Base on (14d), we develop the recursive G-STBC algorithm. In the initialization phase, to obtain from , we compute from recursively (for ) by (14d) and (12f). Moreover, as the V-BLAST algorithms in [19, 20], we perform interference cancellation not in , but in . Then we need to compute the initial
(15) |
In the recursion phase, we detect layers of STBC encoded symbols recursively by group-wise OSIC. The subscript is added to some variables for the recursion phase, to distinguish them from the variables for the initialization phase, e.g., usually . In each recursion, estimate the layer, i.e. the layer with the highest SINR among all the undetected layers, by
(16) |
where and are the and columns of the permuted , respectively. Quantize and to obtain and , respectively. Then as in [19, 20], we cancel the effect of and in by
(17) |
where is with the last two entries removed, and is in , as shown in (12c). In the next recursion, is required to compute (16). Thus we deflate by (14d) to obtain
(18) |
To some extent, (18) (that deflates a block matrix recursively) has a similar form as equation (15) in [21] and equation (21) in [20] (that deflates a matrix recursively), while the vector in the equations of [21, 20] is replaced with the block vector in (18). On the other hand, as (18), equation (24) in [22] also deflates a block matrix recursively, while it deflates the matched-filtered (MF) real-valued channel matrix for G-STBC that is obtained from the augmented real-valued channel matrix.
Table \@slowromancapi@ summarizes the proposed recursive G-STBC algorithm, where and are the entries in the row and column of the permuted and , respectively. Notice that () and consist of Alamouti sub-blocks[18, Lemma “Invariance Under Inversion”].
Initialization | Compute by (15). Compute the initial . Compute from recursively by (14d) |
---|---|
and (12f) for , to obtain the initial . | |
Recursion | Set , and let denote the entry of . For : |
(a) Find , since the entry with the highest SINR is the one with the least | |
mean-square error [19, 20]. In and , permute rows and columns (, ) with rows | |
and columns (, ). Permute entries (, ) with entries (, ) in . | |
Interchange entries and of . | |
(b) Compute and by (16), which are quantized into and . | |
(c) Cancel the effect of in by (17), to obtain . | |
(d) Deflate to get by (18). Remove the last two columns and rows of to get . | |
Solution | When , only execute the above-described step (b). The hard decisions of are |
, for . |
Algorithm (Alg.) | Total Complexity |
---|---|
Proposed G-STBC Alg. | real mult., |
real add. | |
G-STBC Alg. in [9] | real mult., |
real add. [9, equation (19)] | |
Proposed G-STBC Alg. with | real mult. |
ZF G-STBC Alg. in [11] with | real mult. [11, Table \@slowromancapi@] |
Proposed DSTTD Alg. | real mult., |
real add. ( is always for DSTTD) | |
One-step SIC Alg. for | real mult., |
DSTTD in [14] | real add.[24] |
IV Performance Analysis and Numerical Results
In the recursion phase, the proposed group-wise OSIC only needs to compute the block matrices s () that consist of Alamouti sub-blocks, while the symbol-wise OSIC [7] implemented by the recursive V-BLAST algorithm in [20] needs to compute the matrices s () that usually do not consist of Alamouti sub-blocks. So the proposed group-wise OSIC can save much computational complexity, with respect to the symbol-wise OSIC [7]. On the other hand, the proposed group-wise OSIC is equivalent to the symbol-wise SIC with the group-wise optimal detection order, as shown in Appendix B. Thus its performance loss with respect to the symbol-wise OSIC (for G-STBC) [7] only comes from different detection orders (i.e., the group-wise optimal order vs. the symbol-wise optimal order), and is relatively small, as shown in the rest of this section.
We list the computational complexities of the presented G-STBC and DSTTD algorithms in Table \@slowromancapii@, where the exact complexities of the presented DSTTD algorithms have been verified by our numerical experiments to count the floating-point operations (flops — One flop can be one real multiplication or one real addition). In this table, “Proposed DSTTD Alg.” denotes the proposed G-STBC algorithm applied to D-STTD. When counting the complexities, we utilize the fact that one complex multiplication needs real multiplications and real additions, while one complex addition needs real additions. The total complexity of the proposed G-STBC algorithm is the sum of the complexities to compute , and s () that are , and , respectively, where denotes complex multiplications and complex additions. The algorithm in [9] nearly requires the same number of complex multiplications and additions [9], [17, Table \@slowromancapi@], while [9, equation (19)] claims a complexity of operations, where , , and an operation can be a complex multiplication or addition. The complexity of the one-step SIC algorithm for DSTTD in [14] has been revised in [24], while the two-step SIC algorithm and the SINR-ordered SIC algorithm for DSTTD in [14] both need nearly double the complexity of the one-step SIC algorithm.
For DSTTD, the special case of G-STBC with the minimum number of transmit antennas, the ML(maximum likelihood)-like sphere detectors [23] are alternative methods worth considering. For example, with respect to the SIC DSTTD detector in [8], the ML-like DSTTD detector in [23] performs about dB better at BER=, but requires about times of complexity [23, Fig. 5]. Although ML-like DSTTD detectors can be preferred in some applications, they usually do not have advantage in both complexity and BER performance [23]. Thus quite a few recent literatures focusing on SIC DSTTD receivers [12, 13, 14, 16] did not discuss ML-like sphere detectors, and neither does this letter focusing on SIC G-STBC receivers, which, due to the limited space, cannot give too extensive discussion for only the special case of G-STBC (i.e. DSTTD).
From Table \@slowromancapii@, we can compare the complexities of the presented algorithms. When , the proposed G-STBC algorithm speeds up the G-STBC algorithm in [9] by approximately, and even requires less real multiplications than the linear ZF G-STBC algorithm in [11]. When , the proposed DSTTD algorithm is faster than the one-step SIC DSTTD algorithm in [14], and the speedup (in the number of flops) grows with , which is for , for , and for . However, when , the one-step SIC DSTTD algorithm in [14] (with much performance loss) is faster than the proposed DSTTD algorithm, while the speedup is .
Let . For different , we carried out numerical experiments to count the average flops per time slot of the G-STBC algorithm in [9] and the proposed G-STBC algorithm in Fig. 1. It can be seen that they are consistent with the theoretical flops calculation. On the other hand, Fig. 2 shows the BER (bit error rate) performance of the algorithms in [7, 9] and the proposed algorithm in a G-STBC system with transmit and receive antennas, while Fig. 3 and Fig. 4 show the BER performance of the algorithms in [7, 8, 14] and the proposed algorithm in a DSTTD system with and receive antennas, respectively. We used QPSK modulation in Fig. 2, Fig. 3 and Fig. 4. As shown in Fig. 2 and Fig. 3, the proposed algorithm performs better than the efficient G-STBC algorithms in [9] and the efficient DSTTD algorithms in [14]. For example, it performs about dB better than the SQRD algorithm in [9] at BER , and performs about dB better than the one-step fixed-order SIC algorithm for DSTTD in [14] at BER. Fig. 2 and Fig. 3 also shows that with respect to the MMSE OSIC algorithm in [7], the proposed algorithm performs about dB worse in the G-STBC system and dB worse in the DSTTD system, at BER . Moreover, it can be seen from Fig. 4 that for a large number of receive antennas, the performance difference among the presented DSTTD algorithms is small and even negligible.




V Conclusion
We propose a fast recursive algorithm for G-STBC, which takes full advantage of the Alamouti structure in the equivalent channel matrix to reduce the computational complexity dramatically. With respect to the existing efficient algorithms for G-STBC or only for DSTTD, the proposed group-wise MMSE OSIC algorithm for G-STBC achieves better performance and usually requires less complexity. When , the proposed algorithm speeds up the MMSE sub-optimal OSIC algorithm for G-STBC by , speeds up the recently proposed DSTTD algorithm by up to , and even requires less real multiplications than the linear ZF G-STBC algorithm.
Appendix A The Derivation of (14d)
In what follows, is the zero matrix. Substitute (12f) into to obtain , from which we can deduce
(19a) | |||||
(19b) | |||||
(19c) |
On the other hand, let us extend equation (17) in [17] to in (11) for G-STBC, to obtain
(20) |
where is QR decomposed into and the orthogonal , i.e., . consists of Alamouti sub-blocks [18], and so do [18, Lemma “Invariance Under QR Factorization”] and [18, Lemma “Invariance Under Inversion”]. Then it can be seen from (20) that must be Hermitian with diagonal sub-blocks that are non-negative scaled multiples of the identity matrix and with off-diagonal sub-blocks that are Alamouti matrices [18, Property 3 in the subsection “B. Block Matrices”]. Correspondingly the diagonal sub-block in must satisfy (14b).
Appendix B Proposed Group-wise OSIC vs. Symbol-wise SIC with Group-wise Ordering
The proposed G-STBC algorithm applies (16) to compute the estimates of and , which are the last two entries of
(21) |
(22) |
In , the interferences of the detected symbols have been cancelled. Then
(23) |
where the noise is irrelevant to our discussion and can be neglected. Substitute (23) into (22), and then substitute (22) into (21) to obtain , i.e.,
(24) |
in (24) is with diagonal sub-blocks that are non-negative scaled multiples of the identity matrix, as shown in Appendix A. Thus (24) can be represented as
(25) |
where denotes the non-zero entry. Now it can be seen from (25) that the interference of does not affect the estimate of (i.e. ), and vice versa, for . That is to say, in the linear MMSE estimates of , each layer of STBC encoded symbols, i.e. and , remain orthogonal. Thus the performance will remain unchanged if we modify the proposed group-wise OSIC detector into the corresponding symbol-wise SIC detector with the group-wise optimal detection order, which detects immediately after the interference of is cancelled.
Acknowledgment
The authors would like to transfer their appreciation to the editor and the reviewers for their valuable comments leading to the improvement of this paper.
References
- [1] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fadiig environment when using multiple antennas,” Wireless Personal Communications, vol. 6, pp. 311-335, Mar. 1998.
- [2] L. Zheng and D. N. C. Tse, “Diversity and multiplexing: a fundamental tradeoff in multiple-antenna channels,” IEEE Transactions on Information Theory, vol. 49, no. 5, pp. 1073-1096, May 2003.
- [3] P. W. Wolniansky, G. J. Foschini, G. D. Golden and R. A. Valenzuela, “V-BLAST: an architecture for realizing very high data rates over the rich-scattering wireless channel”, URSI International Symposium on Signals, Systems, and Electronics, pp. 295-300, Sept. 1998.
- [4] S. M. Alamouti., “A simple transmit diversity technique for wireless communications,” IEEE J. Selected Areas in Communications, vol. 16, no. 8, pp. 1451-1458, Oct. 1998.
- [5] V. Tarokh, H. Jafarkhani and A. R. Calderbank, “Space-time block codes from orthogonal designs,” IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1456-1467, July 1999.
- [6] V. Tarokh, A. Naguib, N. Seshadri and A. R. Calderbank, “Combined array processing and space-time coding”, IEEE Transactions on Information Theory, vol. 45, no. 4, pp. 1121-1128, May 1999.
- [7] X. Fan, H. Zhang, H. Luo and J. Huang, “Optimal MMSE successive interference cancellation in group-wise STBC MIMO systems”, Journal of Systems Engineering and Electronics, vol. 17, no. 1, pp. 85-90, 2006
- [8] A. F. Naguib, N. Seshadri and A. R. Calderbank, “Applications of space-time block codes and interference suppression for high capacity and high data rate wireless systems”, IEEE Asilomar SSC, vol. 2, pp. 1803-1810, Nov. 1998.
- [9] M. Gomaa and A. Ghrayeb, “A Low Complexity MMSE Detector for Multiuser Layered Space-Time Coded MIMO Systems”, Canadian Conference on Electrical and Computer Engineering, 22-26 April, 2007.
- [10] A. Stamoulis, N. Al-Dhahir and A. R. Calderbank, “Further results on interference cancellation and space-time block codes”, IEEE Asilomar SSC, vol. 1, pp. 257-261, Nov. 2001.
- [11] T. Hardtke, K. Chin and S. Sun, “A linear groupwise STBC detector based on Strassen algorithm”, ICICS 2007.
- [12] H. Kim and H. Park, “Efficient successive interference cancellation algorithms for the DSTTD system,” IEEE PIMRC 2005, vol. 1, pp. 62-66, Sept. 2005.
- [13] F. Wang, W. Zhao and Y. Xiong, “Ordered group interference cancellation for quasi-orthogonal space time block codes,” Proc. WICOM 2006.
- [14] S. Jung and J. Lee, “A New ML Based Interference Cancellation Technique for Layered Space-Time Codes”, IEEE Transactions on Communications, vol. 57, no. 4, pp. 930-936, April 2009.
- [15] IEEE Std 802.16e-2005, December 2005.
- [16] N. Prasad, M.K. Varanasi, L. Venturino, X. Wang, “An Analysis of the MIMO SDMA Channel With Space Time Orthogonal and Quasi-Orthogonal User Transmissions and Efficient Successive Cancellation Decoders”, IEEE Transactions on Information Theory, vol 54, no. 12, pp. 5427-5446, Dec. 2008.
- [17] R. Buhnke, D. Wubben, V. Kuhn and K. D. Kameyer, “MMSE extension of V-BLAST based on sorted QR decomposition”, IEEE Vehicular Technology Conference 2003-Fall.
- [18] A. H. Sayed, W. M., Younis and A. Tarighat, “An invariant matrix structure in multiantenna communications”, IEEE Signal Processing Letters, vol. 12, pp. 749-752, Nov. 2005.
- [19] H. Zhu, Z. Lei, F. P. S. Chin, “An improved recursive algorithm for BLAST”, Signal Processing, vol. 87, no. 6, pp. 1408-1411, Jun. 2007.
- [20] Y. Shang and X. G. Xia, “On Fast Recursive Algorithms For V-BLAST With Optimal Ordered SIC Detection”, IEEE Transactions on Wireless Communications, vol. 8, pp. 2860-2865, June 2009.
- [21] T. Liu and Y. Liu, “Modified fast recursive algorithm for efficient MMSE-SIC detection of the V-BLAST system”, IEEE Transactions on Wireless Communications, vol. 7, pp. 3713-3717, Oct. 2008.
- [22] C. L. Ho, J. Y. Wu, T. S. Lee, “Group-wise V-BLAST detection in multiuser space-time dual-signalling wireless systems”, IEEE Trans. Wireless Commun, vol. 5, no. 7, pp. 1896-1909, July 2006.
- [23] J. Cha, S. Hur, C. Min, H. Lee, J. Kang, “Efficient Modified Fano Detection with Reduced Branches for DSTTD System”, IEEE International Conference on Communications 2006.
- [24] H. Zhu and W. Chen, “Comments on a New ML Based Interference Cancellation Technique for Layered Space-Time Codes,” IEEE Transactions on Communications, vol. 58, no. 11, pp. 3054-3055, Nov. 2010.