
Distributed Arithmetic Coding for Sources with Hidden Markov Correlation

Yong Fang and Jechang Jeong This research was supported by Seoul Future Contents Convergence (SFCC) Cluster established by Seoul R&BD Program.The authors are with the Lab. of ICSP, Dep. of Electronics and Communications Engineering, Hanyang University, Haengdang-dong, Seongdong-gu, Seoul 133-791, Korea (e-mail: [email protected]).
Abstract

Distributed arithmetic coding (DAC) has been shown to be effective for Slepian-Wolf coding, especially for short data blocks. In this letter, we propose to use the DAC to compress memory-correlated sources. More specifically, the correlation between the sources is modeled as a hidden Markov process. Experimental results show that the performance is close to the theoretical Slepian-Wolf limit.

Index Terms:
Distributed Source Coding (DSC), Slepian-Wolf Coding (SWC), Distributed Arithmetic Coding (DAC), Hidden Markov Correlation, Forward Algorithm.

I Introduction

We consider the problem of Slepian-Wolf Coding (SWC) with decoder Side Information (SI). The encoder compresses a discrete source $X=\{x_t\}_{t=1}^{N}$ in the absence of the correlated discrete SI $Y=\{y_t\}_{t=1}^{N}$. The Slepian-Wolf theorem states that lossless compression is achievable at rates $R\geq H(X|Y)$, the conditional entropy of $X$ given $Y$, where both $X$ and $Y$ are discrete random processes [1]. Conventionally, channel codes, such as turbo codes [2] or Low-Density Parity-Check (LDPC) codes [3], are used to deal with the SWC problem.

Recently, several SWC techniques based on entropy coding have been proposed, such as Distributed Arithmetic Coding (DAC) [4, 5] and Overlapped Quasi-Arithmetic Coding (OQAC) [6]. These schemes can be seen as extensions of classic Arithmetic Coding (AC): source $X$ is encoded at rates $H(X|Y)\leq R<H(X)$ by allowing intervals to overlap. The overlapping leads to a larger final interval and hence a shorter codeword; however, ambiguous codewords are then unavoidable, so a soft joint decoder exploits the SI $Y$ to decode $X$. Later, the time-shared DAC (TS-DAC) [7] was proposed to deal with the symmetric SWC problem, and the rate-compatible DAC was proposed in [8] to realize rate-incremental SWC.

In this letter, we investigate how to use the DAC to compress sources with hidden Markov correlation.

II Binary Distributed Arithmetic Coding

Let $p$ be the bias probability of binary source $X$, i.e., $p=P(x_t=1)$. In classic AC, source symbol $x_t$ is iteratively mapped onto sub-intervals of $[0,1)$, whose lengths are proportional to $(1-p)$ and $p$. The resulting rate is $R\geq H(X)$. In the DAC [4], interval lengths are instead proportional to the modified probabilities $(1-p)^\gamma$ and $p^\gamma$, where

$\frac{H(X|Y)}{H(X)}\leq\gamma\leq 1.$ (1)

The resulting rate is $R\geq\gamma H(X)\geq H(X|Y)$. To fit the $[0,1)$ interval, the sub-intervals have to be partially overlapped. More specifically, symbols $x_t=0$ and $x_t=1$ correspond to the intervals $[0,(1-p)^\gamma)$ and $[1-p^\gamma,1)$, respectively. It is precisely this overlapping that leads to a larger final interval, and hence a shorter codeword. However, as a cost, the decoder cannot decode $X$ unambiguously without $Y$.
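To make the interval mapping concrete, the following is a minimal sketch of the per-symbol encoder update, assuming idealized arbitrary-precision arithmetic (a practical codec, such as the 16-bit one used in Section V, renormalizes finite-precision integer registers instead); the function name and structure are ours, not taken from [4].

```python
# Minimal sketch of the DAC per-symbol interval update (idealized,
# infinite-precision; a practical codec renormalizes integer registers).
def dac_update_interval(low, high, bit, p, gamma):
    """Map the current interval [low, high) onto the sub-interval of
    symbol `bit`; gamma < 1 makes the two sub-intervals overlap."""
    width = high - low
    q0 = (1.0 - p) ** gamma   # enlarged length for bit 0
    q1 = p ** gamma           # enlarged length for bit 1
    if bit == 0:
        # bit 0 -> [0, (1-p)^gamma), rescaled into [low, high)
        return low, low + width * q0
    else:
        # bit 1 -> [1 - p^gamma, 1); since q0 + q1 >= 1 the two
        # sub-intervals overlap, which shortens the final codeword
        return low + width * (1.0 - q1), high
```

Encoding folds this update over all $N$ symbols; any number in the final interval identifies the sequence, up to the ambiguities that must be resolved at the decoder.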

To describe the decoding process, we define a ternary symbol set $\{0,\chi,1\}$, where $\chi$ represents a decoding ambiguity. Let $C_X$ be the codeword and $\tilde{x}_t$ be the $t$-th decoded symbol; then

$\tilde{x}_t=\begin{cases}0,&0\leq C_X<1-p^\gamma\\ \chi,&1-p^\gamma\leq C_X<(1-p)^\gamma\\ 1,&(1-p)^\gamma\leq C_X<1\end{cases}$ (2)

When the $t$-th symbol is decoded, if $\tilde{x}_t=\chi$, the decoder performs a branching: two candidate branches are generated, corresponding to the two alternative symbols $x_t=0$ and $x_t=1$. For each new branch, its metric is updated and the corresponding interval is selected for the next iteration. To reduce complexity, after decoding each symbol, the decoder uses the $M$-algorithm to keep at most $M$ branches with the best partial metrics and prunes the others [4].
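As an illustration, here is a sketch of the ternary decision of Eq. (2) together with $M$-algorithm pruning; `branches` is a hypothetical list of partial decoding paths, each carrying its own interval and metric.

```python
CHI = 'chi'  # the ambiguity symbol chi of Eq. (2)

def dac_decision(c, p, gamma):
    """Ternary decision of Eq. (2); c is the codeword value rescaled
    into the current interval [0, 1)."""
    q0 = (1.0 - p) ** gamma
    q1 = p ** gamma
    if c < 1.0 - q1:
        return 0      # only the bit-0 sub-interval contains c
    elif c < q0:
        return CHI    # overlapped region: branch on both hypotheses
    else:
        return 1      # only the bit-1 sub-interval contains c

def prune(branches, M):
    """M-algorithm: keep the M branches with the best partial metric."""
    return sorted(branches, key=lambda b: b['metric'], reverse=True)[:M]
```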

Note that the metric is not reliable for the very last symbols of a finite-length sequence $X$ [5]. This problem is solved by encoding the last $T$ symbols without interval overlapping [5]. That is, for $1\leq t\leq(N-T)$, $x_t$ is mapped onto $[0,(1-p)^\gamma)$ or $[1-p^\gamma,1)$, while for $(N-T+1)\leq t\leq N$, $x_t$ is mapped onto $[0,1-p)$ or $[1-p,1)$.

Therefore, a binary DAC system can be described by four parameters: $\{p,\gamma,M,T\}$.

III Hidden Markov Model and Forward Algorithm

Let $S=\{s_t\}_{t=1}^{N}$ be a sequence of states and $Z=\{z_t\}_{t=1}^{N}$ be a sequence of observations. A hidden Markov process is defined by $\lambda=(A,B,\pi)$:

$A=\{a_{ji}\}$: the state transition probability matrix, where $a_{ji}=P(s_t=i|s_{t-1}=j)$;

$B=\{b_i(k)\}$: the observation probability distribution, where $b_i(k)=P(z_t=k|s_t=i)$;

$\pi=\{\pi_i\}$: the initial state distribution, where $\pi_i=P(s_1=i)$.

The aim of the forward algorithm is to compute $P(z_1,\dots,z_t|\lambda)$, given the observations $\{z_1,\dots,z_t\}$ and the model $\lambda$. Let $\alpha_t(i)$ be the probability of observing the partial sequence $\{z_1,\dots,z_t\}$ such that state $s_t$ is $i$, i.e.,

$\alpha_t(i)=P(z_1,\dots,z_t,s_t=i|\lambda).$ (3)

Initially, we have

$\alpha_1(i)=\pi_i b_i(z_1).$ (4)

For $t>1$, $\alpha_t(i)$ can be computed through the iteration

$\alpha_t(i)=\Big\{\sum_j[\alpha_{t-1}(j)a_{ji}]\Big\}b_i(z_t).$ (5)

Therefore,

$P(z_1,\dots,z_t|\lambda)=\sum_i\alpha_t(i).$ (6)

In practice, $\alpha_t(i)$ is usually normalized by

$\alpha_t(i)=\frac{\alpha_t(i)}{\delta_t},$ (7)

where

$\delta_t=\sum_i\alpha_t(i).$ (8)

In this case, we have

$P(z_1,\dots,z_t|\lambda)=\prod_{t'=1}^{t}\delta_{t'}.$ (9)
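For reference, the following is a compact sketch of the scaled forward recursion of Eqs. (4)-(9); working in the log domain to avoid underflow is our choice, not mandated by the text.

```python
import numpy as np

def forward_scaled(A, B, pi, z):
    """Scaled forward algorithm, Eqs. (4)-(9).
    A[j, i] = P(s_t = i | s_{t-1} = j), B[i, k] = P(z_t = k | s_t = i),
    pi[i] = P(s_1 = i).  Returns log P(z_1, ..., z_T | lambda)."""
    alpha = pi * B[:, z[0]]                  # Eq. (4)
    delta = alpha.sum()                      # Eq. (8)
    log_prob = np.log(delta)
    alpha = alpha / delta                    # Eq. (7)
    for t in range(1, len(z)):
        alpha = (alpha @ A) * B[:, z[t]]     # Eq. (5)
        delta = alpha.sum()                  # Eq. (8)
        log_prob += np.log(delta)            # Eq. (9) in the log domain
        alpha = alpha / delta                # Eq. (7)
    return log_prob
```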

IV DAC for Hidden Markov Correlation

Assume that binary source $X$ and SI $Y$ are correlated by $Y=X\oplus Z$, where $Z$ is generated by a hidden Markov model with parameter $\lambda$. $X$ is encoded using a $\{p,\gamma,M,T\}$ DAC encoder. The decoding process is very similar to that described in [4]. The only difference is that the forward algorithm is embedded into the DAC decoder and the metric of each branch is replaced by $P(z_1,\dots,z_t|\lambda)$, where $z_t=x_t\oplus y_t$.
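A sketch of how this embedding might look per decoded symbol: each branch keeps its own normalized forward vector, and its log-metric accumulates the scaling factors of Eq. (9). The helper name and branch layout are our assumptions, not the authors' implementation.

```python
import numpy as np

def update_branch(alpha, log_metric, x_t, y_t, A, B):
    """One incremental forward step for a decoder branch that
    hypothesizes x_t; with the SI y_t this fixes z_t = x_t XOR y_t.
    (For t = 1, alpha is instead initialized as pi * B[:, z_1], Eq. (4).)"""
    z_t = x_t ^ y_t
    alpha = (alpha @ A) * B[:, z_t]          # Eq. (5)
    delta = alpha.sum()                      # Eq. (8)
    return alpha / delta, log_metric + np.log(delta)   # Eqs. (7), (9)
```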

V Experimental Results

We have implemented a 16-bit DAC codec system. The bias probability of $X$ is $p=0.5$. Following the recommendation of [5], we set $M=2048$ and $T=15$. The same 2-state (0 and 1), 2-output (0 and 1) sources as in [9] are used in the simulations (see Table I). The length of the data block used in each test is $N=1024$.

To achieve lossless compression, each test starts from $\gamma=H(X|Y)$ (see Table II). If the decoding fails, we increase $\gamma$ by 0.01; this process is iterated until the decoding succeeds. For each model, results are averaged over 100 trials. Experimental results are listed in Table II.
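This rate search can be sketched as the loop below; `dac_encode` and `dac_decode` stand for hypothetical wrappers around the codec of Sections II and IV.

```python
def find_lossless_gamma(x, y, h_cond, step=0.01):
    """Raise gamma from H(X|Y) in steps of 0.01 until the DAC decoder
    reproduces x exactly (dac_encode/dac_decode are assumed helpers)."""
    gamma = h_cond                 # since p = 0.5, H(X) = 1 and R = gamma
    while gamma <= 1.0:
        codeword = dac_encode(x, gamma)
        if dac_decode(codeword, y, gamma) == x:
            return gamma           # achieved lossless rate
        gamma += step
    return 1.0                     # gamma = 1 reduces to classic AC
```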

For comparison, Table II also includes experimental results for the same settings from [9]. In each test of [9], $N=16384$ source symbols are encoded using an LDPC code. In addition, to synchronize the hidden Markov model, $N\alpha$ original source symbols are sent to the decoder directly without compression.

The results show that the DAC performs similarly to the LDPC-based approach [9], and slightly better for models 1 and 2. Moreover, for hidden Markov correlation, the DAC outperforms the LDPC-based approach in two aspects:

1). The LDPC-based approach requires longer codes to achieve better performance, while the DAC is insensitive to code length [4].

2). For the LDPC-based approach, to synchronize the hidden Markov model, a certain proportion of original source symbols must be sent to the decoder as “seeds”. However, it is hard to determine $\alpha$, the optimal proportion of “seeds”. The results reported in [9] were obtained through an exhaustive search, which limits the practical applicability of this approach.

TABLE I: Models for Simulation
model   $\{a_{00}, a_{11}, b_0(0), b_1(1)\}$
1       {0.01, 0.03, 0.99, 0.98}
2       {0.01, 0.065, 0.95, 0.925}
3       {0.97, 0.967, 0.93, 0.973}
4       {0.99, 0.989, 0.945, 0.9895}
TABLE II: Experimental Results (rates in bits per source symbol)
model   $H(X|Y)$   rate of [9]   rate of DAC
1       0.24       0.36          0.345908
2       0.52       0.67          0.648594
3       0.45       0.58          0.585645
4       0.28       0.42          0.427236

VI Conclusion

This letter studies the compression of sources with hidden Markov correlation using the DAC. The forward algorithm is incorporated into the DAC decoder. The results are similar to those of the LDPC-based approach. Compared with the LDPC-based approach, the DAC is more suitable for practical applications.

References

  • [1] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Info. Theory, vol. 19, no. 4, pp. 471-480, Jul. 1973.
  • [2] J. Garcia-Frias, “Compression of correlated binary sources using turbo codes,” IEEE Commun. Lett., vol. 5, no. 10, pp. 417-419, Oct. 2001.
  • [3] A. Liveris, Z. Xiong, and C. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 440-442, Oct. 2002.
  • [4] M. Grangetto, E. Magli, and G. Olmo, “Distributed arithmetic coding,” IEEE Commun. Lett., vol. 11, no. 11, pp. 883-885, Nov. 2007.
  • [5] M. Grangetto, E. Magli, and G. Olmo, “Distributed arithmetic coding for the asymmetric Slepian-Wolf problem,” IEEE Trans. Signal Process. (submitted), preprint available at www1.tlc.polito.it/sas-ipl/Magli/index.php.
  • [6] X. Artigas, S. Malinowski, C. Guillemot, and L. Torres, “Overlapped Quasi-Arithmetic codes for Distributed Video Coding,” IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, Sep. 2007.
  • [7] M. Grangetto, E. Magli, and G. Olmo, “Symmetric Distributed Arithmetic Coding of Correlated Sources,” IEEE International Workshop on Multimedia Signal Processing (MMSP), Crete, Greece, Oct. 2007.
  • [8] M. Grangetto, E. Magli, R. Tron, and G. Olmo, “Rate-compatible distributed arithmetic coding,” IEEE Commun. Lett., vol. 12, no. 8, pp. 575-577, Aug. 2008.
  • [9] J. Garcia-Frias and W. Zhong, “LDPC Codes for Compression of Multi-Terminal Sources With Hidden Markov Correlation,” IEEE Commun. Lett., vol. 7, no. 3, pp. 115-117, Mar. 2003.