Distributed Arithmetic Coding for Sources with Hidden Markov Correlation
Abstract
Distributed arithmetic coding (DAC) has been shown to be effective for Slepian-Wolf coding, especially for short data blocks. In this letter, we propose to use the DAC to compress memory-correlated sources. More specifically, the correlation between sources is modeled as a hidden Markov process. Experimental results show that the performance is close to the theoretical Slepian-Wolf limit.
Index Terms:
Distributed Source Coding (DSC), Slepian-Wolf Coding (SWC), Distributed Arithmetic Coding (DAC), Hidden Markov Correlation, Forward Algorithm.
I Introduction
We consider the problem of Slepian-Wolf Coding (SWC) with decoder Side Information (SI). The encoder compresses a discrete source $X$ in the absence of the discretely correlated SI $Y$, which is available only at the decoder. The Slepian-Wolf theorem points out that lossless compression is achievable at rates $R \ge H(X|Y)$, the conditional entropy of $X$ given $Y$, where both $X$ and $Y$ are discrete random processes [1]. Conventionally, channel codes, such as turbo codes [2] or Low-Density Parity-Check (LDPC) codes [3], are used to deal with the SWC problem.
Recently, several SWC techniques based on entropy coding have been proposed, such as Distributed Arithmetic Coding (DAC) [4, 5] and Overlapped Quasi-Arithmetic Coding (OQAC) [6]. These schemes can be seen as extensions of classic Arithmetic Coding (AC); their principle is to encode the source $X$ at rates $R < H(X)$ by allowing overlapped intervals. The overlapping leads to a larger final interval and hence a shorter codeword; however, ambiguous codewords are unavoidable at the same time. A soft joint decoder exploits the SI $Y$ to decode $X$. Subsequently, the time-shared DAC (TS-DAC) [7] was proposed to deal with the symmetric SWC problem, and the rate-compatible DAC was proposed in [8] to realize rate-incremental SWC.
In this letter, we investigate how to use the DAC to compress sources with hidden Markov correlation.
II Binary Distributed Arithmetic Coding
Let $p$ be the bias probability of binary source $X$, i.e., $p = \Pr\{X = 1\}$. In the classic AC, source symbol $x_i$ is iteratively mapped onto sub-intervals of $[0, 1)$, whose lengths are proportional to $(1 - p)$ and $p$. The resulting rate is $R = H(X)$. In the DAC [4], interval lengths are proportional to the modified probabilities $\tilde{p}_0$ and $\tilde{p}_1$, where

$$\tilde{p}_0 = (1 - p)^k, \qquad \tilde{p}_1 = p^k, \qquad 0 < k \le 1. \tag{1}$$
The resulting rate is $R = kH(X) \le H(X)$. Since $\tilde{p}_0 + \tilde{p}_1 \ge 1$, the sub-intervals have to be partially overlapped to fit into $[0, 1)$. More specifically, symbols $0$ and $1$ correspond to intervals $[0, \tilde{p}_0)$ and $[1 - \tilde{p}_1, 1)$, respectively. It is precisely this overlapping that leads to a larger final interval, and hence a shorter codeword. However, as a cost, the decoder cannot decode $X$ unambiguously without $Y$.
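To make the interval mapping concrete, here is a minimal Python sketch of the encoder-side computation, assuming the overlap rule of (1); the function name and the example values are illustrative only.

```python
def dac_intervals(p, k):
    """Return the (overlapped) sub-intervals for symbols 0 and 1.

    p : bias probability Pr{X = 1}
    k : rate factor in (0, 1]; k = 1 recovers classic AC
    """
    p0t = (1.0 - p) ** k   # modified probability of symbol 0, Eq. (1)
    p1t = p ** k           # modified probability of symbol 1, Eq. (1)
    return (0.0, p0t), (1.0 - p1t, 1.0)

i0, i1 = dac_intervals(p=0.5, k=0.7)
print(i0, i1)  # the two intervals overlap on [1 - p1~, p0~)
```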
To describe the decoding process, we define a ternary symbol set $\{0, 1, \mathrm{A}\}$, where $\mathrm{A}$ represents a decoding ambiguity. Let $w \in [0, 1)$ be the codeword value, rescaled to the current interval, and $\tilde{x}_i$ be the $i$-th decoded symbol; then

$$\tilde{x}_i = \begin{cases} 0, & w \in [0, 1 - \tilde{p}_1) \\ \mathrm{A}, & w \in [1 - \tilde{p}_1, \tilde{p}_0) \\ 1, & w \in [\tilde{p}_0, 1). \end{cases} \tag{2}$$
When the $i$-th symbol is decoded, if $\tilde{x}_i = \mathrm{A}$, the decoder performs a branching: two candidate branches are generated, corresponding to the two alternative symbols $0$ and $1$. For each new branch, its metric is updated and the corresponding interval is selected for the next iteration. To reduce complexity, every time after decoding a symbol, the decoder uses the $M$-algorithm to keep at most $M$ branches with the best partial metrics, and prunes the others [4].
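The branching-and-pruning structure can be sketched as follows. The decide and metric_update callbacks are placeholders standing in for the interval test of (2) and the branch metric of Section IV; the value of M and all names are illustrative, not those of [4].

```python
import heapq

M = 4  # maximum number of surviving branches (illustrative value)

def decode_step(branches, decide, metric_update):
    """Expand every branch by one symbol, then keep the best M.

    decide(branch) returns 0, 1, or 'A' (ambiguous), as in Eq. (2);
    metric_update(branch, s) returns the new partial metric.
    """
    expanded = []
    for br in branches:
        sym = decide(br)
        hypotheses = (0, 1) if sym == 'A' else (sym,)
        for s in hypotheses:  # fork on ambiguity, else follow the symbol
            expanded.append({"path": br["path"] + [s],
                             "metric": metric_update(br, s)})
    # M-algorithm: prune all but the M best partial metrics
    return heapq.nlargest(M, expanded, key=lambda b: b["metric"])
```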
Note that the metric is not reliable for the very last symbols of a finite-length sequence [5]. This problem is solved by encoding the last $T$ symbols without interval overlapping [5]. That is, for $i \le n - T$, $x_i$ is mapped onto $[0, \tilde{p}_0)$ and $[1 - \tilde{p}_1, 1)$; while for $i > n - T$, $x_i$ is mapped onto $[0, 1 - p)$ and $[1 - p, 1)$.
Therefore, a binary DAC system can be described by four parameters: $(p, k, M, T)$.
III Hidden Markov Model and Forward Algorithm
Let $S = (s_1, s_2, \ldots, s_n)$ be a sequence of states and $O = (o_1, o_2, \ldots, o_n)$ be a sequence of observations. A hidden Markov process is defined by $\lambda = (A, B, \pi)$:
$A = \{a_{ij}\}$: state transition probability matrix, where $a_{ij} = \Pr\{s_{t+1} = j \mid s_t = i\}$;
$B = \{b_j(o)\}$: observation probability distribution, where $b_j(o) = \Pr\{o_t = o \mid s_t = j\}$;
$\pi = \{\pi_i\}$: initial state distribution, where $\pi_i = \Pr\{s_1 = i\}$.
The aim of the forward algorithm is to compute $\Pr\{O \mid \lambda\}$, given observation sequence $O$ and model $\lambda$. Let $\alpha_t(i)$ be the probability of observing the partial sequence $(o_1, \ldots, o_t)$ such that state $s_t$ is $i$, i.e.,

$$\alpha_t(i) = \Pr\{o_1, \ldots, o_t, s_t = i \mid \lambda\}. \tag{3}$$
Initially, we have

$$\alpha_1(i) = \pi_i\, b_i(o_1). \tag{4}$$
For $t = 1, \ldots, n - 1$, $\alpha_{t+1}(j)$ can be induced through the iteration

$$\alpha_{t+1}(j) = \Big[ \sum_i \alpha_t(i)\, a_{ij} \Big] b_j(o_{t+1}). \tag{5}$$
Therefore,

$$\Pr\{O \mid \lambda\} = \sum_i \alpha_n(i). \tag{6}$$
In practice, to avoid numerical underflow, $\alpha_t(i)$ is usually normalized by

$$\hat{\alpha}_t(i) = \alpha_t(i) / c_t, \tag{7}$$

where

$$c_t = \sum_i \alpha_t(i) \tag{8}$$

and the normalized values $\hat{\alpha}_{t-1}(i)$ are substituted into (5) at each step. In this case, we have

$$\Pr\{O \mid \lambda\} = \prod_{t=1}^{n} c_t. \tag{9}$$
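For reference, a compact NumPy implementation of the scaled recursion (4)-(9), working in the log domain, might look as follows; the function name, array layout, and the example model are our own choices, not mandated by the text.

```python
import numpy as np

def log_forward(obs, A, B, pi):
    """Return log Pr{O | lambda} via the normalized forward recursion.

    obs : sequence of observation indices o_1..o_n
    A   : (N, N) matrix, A[i, j] = Pr{s_{t+1} = j | s_t = i}
    B   : (N, K) matrix, B[j, o] = Pr{o_t = o | s_t = j}
    pi  : (N,) initial state distribution
    """
    alpha = pi * B[:, obs[0]]          # Eq. (4)
    c = alpha.sum()                    # Eq. (8)
    alpha = alpha / c                  # Eq. (7)
    log_prob = np.log(c)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # Eq. (5), with normalized alpha
        c = alpha.sum()
        alpha = alpha / c
        log_prob += np.log(c)          # Eq. (9): log Pr = sum_t log c_t
    return log_prob

# Arbitrary 2-state, 2-output example model (not a Table I entry)
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.95, 0.05], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
print(log_forward([0, 0, 1, 0], A, B, pi))
```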
IV DAC for Hidden Markov Correlation
Assume that binary source $X$ and SI $Y$ are correlated by $X = Y \oplus Z$, where $Z$ is generated by a hidden Markov model with parameter $\lambda = (A, B, \pi)$. $X$ is encoded using a DAC encoder. The decoding process is very similar to that described in [4]. The only difference is that the forward algorithm is embedded into the DAC decoder, so that the metric of each branch is replaced by $\Pr\{z_1, \ldots, z_i \mid \lambda\}$, where $z_i = \tilde{x}_i \oplus y_i$.
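To illustrate how the forward recursion is threaded through the branch metric, the following hedged sketch advances one branch by one symbol; each branch is assumed to carry its normalized forward vector together with the accumulated log-metric, and all names are our own.

```python
import numpy as np

def branch_metric_step(alpha_hat, log_metric, x_hat, y, A, B):
    """Advance one decoder branch by one symbol.

    alpha_hat  : normalized forward vector carried by this branch
    log_metric : accumulated log Pr{z_1..z_{i-1} | lambda}
    x_hat      : hypothesized source symbol (0 or 1)
    y          : side-information symbol y_i
    """
    z = x_hat ^ y                      # hypothesized correlation symbol
    alpha = (alpha_hat @ A) * B[:, z]  # one step of Eq. (5)
    c = alpha.sum()                    # Eq. (8)
    return alpha / c, log_metric + np.log(c)
```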
V Experimental Results
We have implemented a 16-bit DAC codec system. The bias probability $p$ of $X$ is fixed across all tests, and the parameters $M$ and $T$ are set according to the recommendation of [5]. The same 2-state (0 and 1) and 2-output (0 and 1) sources as in [9] are used in the simulations (see Table I), with the same data-block length $n$ in every test.
To achieve lossless compression, each test starts from the Slepian-Wolf limit $R = H(X|Y)$ (see Table II). If the decoding fails, we increase $R$ by 0.01. This process is iterated until the decoding succeeds. For each model, results are averaged over 100 trials. Experimental results are listed in Table II.
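This rate search can be outlined as below; encode and decode are placeholders for the DAC codec of Section IV and are not specified here.

```python
def min_lossless_rate(x, y, sw_limit, encode, decode, step=0.01):
    """Raise the rate from the Slepian-Wolf limit until decoding succeeds."""
    R = sw_limit
    while decode(encode(x, R), y) != x:  # retry until lossless
        R += step
    return R
```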
For comparison, Table II also includes the experimental results reported in [9] for the same settings. In each test of [9], the source symbols are encoded using an LDPC code. In addition, to synchronize the hidden Markov model, a proportion of the original source symbols is sent to the decoder directly, without compression.
The results show that the DAC performs similarly to, or slightly better than (for models 1 and 2), the LDPC-based approach [9]. Moreover, for hidden Markov correlation, the DAC outperforms the LDPC-based approach in two aspects:
1). The LDPC-based approach requires long codes to achieve good performance, while the DAC is insensitive to code length [4].
2). For the LDPC-based approach, to synchronize the hidden Markov model, a certain proportion of original source symbols must be sent to the decoder as “seeds”. However, it is hard to determine the optimal proportion of “seeds”. The results reported in [9] were obtained through an exhaustive search, which limits the approach's practicality.
Model | HMM parameters
---|---
1 | {0.01, 0.03, 0.99, 0.98}
2 | {0.01, 0.065, 0.95, 0.925}
3 | {0.97, 0.967, 0.93, 0.973}
4 | {0.99, 0.989, 0.945, 0.9895}

Table I: Parameters of the hidden Markov models used in the simulations (the same models as in [9]).
Model | SW limit $H(X|Y)$ | [9] | DAC
---|---|---|---
1 | 0.24 | 0.36 | 0.345908
2 | 0.52 | 0.67 | 0.648594
3 | 0.45 | 0.58 | 0.585645
4 | 0.28 | 0.42 | 0.427236

Table II: Slepian-Wolf limits and average rates (bits per symbol) of the LDPC-based approach [9] and the DAC.
VI Conclusion
This letter investigates the compression of sources with hidden Markov correlation using the DAC. The forward algorithm is incorporated into the DAC decoder. The results are similar to those of the LDPC-based approach [9]. Compared to the LDPC-based approach, the DAC is more suitable for practical applications.
References
- [1] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Info. Theory, vol. 19, no. 4, pp. 471-480, Jul. 1973.
- [2] J. Garcia-Frias, “Compression of correlated binary sources using turbo codes,” IEEE Commun. Lett., vol. 5, no. 10, pp. 417-419, Oct. 2001.
- [3] A. Liveris, Z. Xiong, and C. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 440-442, Oct. 2002.
- [4] M. Grangetto, E. Magli, and G. Olmo, “Distributed arithmetic coding,” IEEE Commun. Lett., vol. 11, no. 11, pp. 883-885, Nov. 2007.
- [5] M. Grangetto, E. Magli, and G. Olmo, “Distributed arithmetic coding for the asymmetric Slepian-Wolf problem,” IEEE Trans. Signal Process. (submitted), preprint available at www1.tlc.polito.it/sas-ipl/Magli/index.php.
- [6] X. Artigas, S. Malinowski, C. Guillemot, and L. Torres, “Overlapped Quasi-Arithmetic codes for Distributed Video Coding,” IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, Sep. 2007.
- [7] M. Grangetto, E. Magli, and G. Olmo, “Symmetric Distributed Arithmetic Coding of Correlated Sources,” IEEE International Workshop on Multimedia Signal Processing (MMSP), Crete, Greece, Oct. 2007.
- [8] M. Grangetto, E. Magli, R. Tron, and G. Olmo, “Rate-compatible distributed arithmetic coding,” IEEE Commun. Lett., vol. 12, no. 8, pp. 575-577, Aug. 2008.
- [9] J. Garcia-Frias and W. Zhong, “LDPC Codes for Compression of Multi-Terminal Sources With Hidden Markov Correlation,” IEEE Commun. Lett., vol. 7, no. 3, pp. 115-117, Mar. 2003.