
Upper Bound Scalability on Achievable Rates of
Batched Codes for Line Networks

Shenghao Yang and Jie Wang The Chinese University of Hong Kong, Shenzhen
Abstract

The capacity of line networks with buffer size constraints is an open but practically important problem. In this paper, the upper bound on the achievable rate of a class of codes, called batched codes, is studied for line networks. Batched codes enable a range of buffer size constraints, and are general enough to include special coding schemes studied in the literature for line networks. Existing works have characterized the achievable rates of batched codes for several classes of parameter sets, but leave the cut-set bound as the best existing general upper bound. In this paper, we provide upper bounds on the achievable rates of batched codes as functions of the line network length for these parameter sets. Our upper bounds match the existing achievability results in their order of the network length.

I Introduction

The communication in a network from a source node to a destination node may go through multiple hops, each of which introduces errors. In this paper, we are interested in how the communication rate scales with the number of hops when the intermediate nodes have buffer size constraints.

In particular, we consider a line network of $L$ hops formed by a sequence of nodes, where discrete memoryless channels (DMCs) exist only between two adjacent nodes. We call the first node the source node and the last node the destination node. Except for the source and destination nodes, all the other nodes, called intermediate nodes, have one incoming channel and one outgoing channel. Each intermediate node has a buffer of $B$ bits to keep the content used between different intermediate processing steps. There are no other storage or computation constraints on the network nodes.

For some cases of the problem, the answers are known. When the buffer size $B$ is allowed to increase with the block length at the source node, the min-cut capacity can be achieved using hop-by-hop decoding and re-encoding [1]. When the zero-error capacity of each channel is nonzero, a constant buffer size $B$ can achieve the zero-error capacity for any value of $L$ [2].

In this paper, we focus on DMCs in the line network with finite input and output alphabets and zero zero-error capacity. Note that for most common channel models, e.g., binary symmetric channels and erasure channels, the zero-error capacity is zero. When all cascaded channels are identical, Niesen, Fragouli, and Tuninetti [2] showed that a class of codes with a constant buffer size $B$ can achieve rates $\Omega(e^{-cL})$, where $c$ is a constant. They also showed that if the buffer size $B$ is of order $\ln L$, any rate below the capacity of the channel $Q$ can be achieved. Recently, Yang et al. [3, 4] showed that the end-to-end throughput can be lower bounded by $\Omega(1/\ln L)$ using an intermediate node buffer size of $O(\ln\ln L)$.¹

¹ In this paper, we say that $f(n)=\Omega(g(n))$ if there exist a real constant $c>0$ and an integer constant $n_{0}\geq 1$ such that $f(n)\geq c\cdot g(n)$ for every integer $n\geq n_{0}$; $f(n)=O(g(n))$ if there exist a real constant $c>0$ and an integer constant $n_{0}\geq 1$ such that $f(n)\leq c\cdot g(n)$ for every integer $n\geq n_{0}$; and $f(n)=\Theta(g(n))$ if both $f(n)=\Omega(g(n))$ and $f(n)=O(g(n))$ are satisfied.

In contrast to these achievability results, the min-cut bound is still the best known upper bound. Characterizing a non-trivial, general upper bound for a line network with buffer size constraints could be difficult, as hinted in [5]. We relax the problem by asking instead how the upper bound scales with the network length $L$ for a class of codes, called batched codes.

Batched codes provide a general coding framework for line networks with buffer size constraints, and include the codes studied in the previous works [2, 3, 4] to show the achievability results as special cases. A batched code has an outer code and an inner code. The outer code encodes the information messages into batches, each of which is a sequence of coded symbols, while the inner code performs a general network coding for the symbols belonging to the same batch. The inner code, comprising recoding at network nodes on each batch separately, should be designed for specific channels. Batched codes have been studied for designing efficient network coding for packet erasure channels (see, for example, [6, 7]), and practical designs have been provided [8, 9].

The upper bound scalability on the achievable rates of batched codes provides important guidance for designing batched codes for large networks. For example, we want to know whether the exponential decay of the achievable rate with $L$ is necessary for $B=O(1)$, and whether we can do better than $\Omega(1/\ln L)$ when $B=O(\ln\ln L)$. These questions are answered in this paper (see Table I). In particular, we show that when $N=O(1)$, which implies $M,B=O(1)$, the achievable rates must decay exponentially with $L$. When $N=O(\ln L)$ and $M=O(1)$, which implies $B=O(\ln L)$, the achievable rate is $O(1/\ln L)$. These upper bounds have the same order in $L$ as the previous achievability results, and hence, together, they provide tight capacity scalability results of batched codes for these parameter sets.

Our results are proved in a general setting of line networks where the DMCs in the line network can be arbitrarily different except for a mild technical condition. The main technique of our converse is to decompose the end-to-end transition matrix induced by the inner code into a convex combination of two parts, where one part captures the communication bottleneck in the line network and the other part can be simply upper bounded.

After introducing batched codes, we first use line networks of packet erasure channels to illustrate our main technique (Section III). We then generalize the results to a broad class of channels called canonical channels, which includes BSCs and BECs (Section IV-A). Finally, we present a technique that handles line networks of general DMCs with zero zero-error capacity (Section IV-B).

TABLE I: Summary of the achievable rate scalability for channels with zero zero-error capacity using batched codes. Here, $c$ and $c'$ are constants that do not change with $L$.

(a) Lower bounds

batch size $M$ | inner blocklength $N$ | buffer size $B$ | lower bound
$O(1)$ | $O(1)$ | $O(1)$ | $\Omega(e^{-cL})$
$O(1)$ | $O(\ln L)$ | $O(\ln\ln L)$ | $\Omega(1/\ln L)$
$O(\ln L)$ | $O(\ln L)$ | $O(\ln L)$ | $\Omega(1)$

(b) Upper bounds

batch size $M$ | inner blocklength $N$ | buffer size $B$ | upper bound
arbitrary | $O(1)$ | $O(1)$ | $O(e^{-c'L})$
$O(1)$ | $O(\ln L)$ | $O(\ln L)$ | $O(1/\ln L)$
$O(\ln L)$ | $O(\ln L)$ | $O(\ln L)$ | min-cut

II Line Networks and Batched Codes

In this section, we describe a general line network model and introduce batched codes, which form a general coding framework for line networks with buffer size constraints.

II-A General Description

A line network of length $L$ is formed by a sequence of nodes $v_{\ell}$, $\ell=0,1,\ldots,L$, where communication links exist only between nodes $v_{\ell-1}$ and $v_{\ell}$ for $\ell=1,\ldots,L$. We assume that each link $(v_{\ell-1},v_{\ell})$ is a discrete memoryless channel (DMC) with transition matrix $Q_{\ell}$, where the input and output alphabets are $\mathcal{Q}_{\mathrm{in}}$ and $\mathcal{Q}_{\mathrm{out}}$, respectively, both finite. We study the communication from the source node $v_{0}$ to the destination node $v_{L}$, where all the intermediate nodes $v_{1},\ldots,v_{L-1}$ can help with the communication.

Let $K$, $n$ and $M$ be positive integers, and $\mathcal{A}$ and $\mathcal{B}$ be finite alphabets. A batched code has an outer code and an inner code, described as follows. The message of the source node is formed by $K$ symbols from $\mathcal{A}$. The outer code of a batched code, performed at $v_{0}$, encodes the message and generates $n$ batches, each of which has $M$ symbols from $\mathcal{A}$. Here $M$ is called the batch size, and $n$ is called the outer blocklength.

Let $N$ be a positive integer called the inner blocklength. The inner code of a batched code is performed on different batches separately, and includes the recoding operations at nodes $v_{0},\ldots,v_{L-1}$:

  • At the source node $v_{0}$ that generates the batches, recoding is performed on the original $M$ symbols of a batch to generate $N$ recoded symbols (in $\mathcal{Q}_{\mathrm{in}}$) to be transmitted on the outgoing link of the source node.

  • At an intermediate network node $v$ that does not need to decode the input symbols, recoding is performed on the received symbols (in $\mathcal{Q}_{\mathrm{out}}$) belonging to the same batch to generate $N$ recoded symbols (in $\mathcal{Q}_{\mathrm{in}}$) to be transmitted on the outgoing link of $v$.

In general, the number of recoded symbols transmitted by different nodes can be different. Here we assume that they are all the same for simplicity of the discussion.

II-B Recoding Formulations

Let us formulate recoding for a generic batch $\mathbf{X}$. We denote by $\mathbf{X}[k]$ ($1\leq k\leq M$) the $k$th symbol in $\mathbf{X}$. (Similar notation applies to other sequences of entries.) The recoding at the source node is a function $f:\mathcal{A}^{M}\rightarrow\mathcal{Q}_{\mathrm{in}}^{N}$. For $\ell=1,\ldots,L$, denote by $\mathbf{U}_{\ell}$ and $\mathbf{Y}_{\ell}$ the input and output of $N$ uses of the link $(v_{\ell-1},v_{\ell})$, where $\mathbf{U}_{1}=f(\mathbf{X})$. Since the channel is memoryless,

$$\Pr\{\mathbf{Y}_{\ell}=\mathbf{y}|\mathbf{U}_{\ell}=\mathbf{u}\}=Q_{\ell}^{\otimes N}(\mathbf{y}|\mathbf{u})\triangleq\prod_{i=1}^{N}Q_{\ell}(\mathbf{y}[i]|\mathbf{u}[i]). \quad (1)$$

The channel inputs $\mathbf{U}_{\ell}$, $\ell=2,3,\ldots,L$ can be formulated recursively. Let $N'$ be an integer in $\{0,1,\ldots,N\}$ used to represent the input-output latency. For $i=0,1,\ldots,N+N'$, let $\mathbf{B}_{\ell}[i]$ be a random variable over the finite set $\mathcal{B}$ with $\mathbf{B}_{\ell}[0]$ a constant, which represents the content of the buffer for the batch $\mathbf{X}$. The recoding at $v_{\ell}$ is the function $\phi_{\ell}$ such that for $i=1,\ldots,N+N'$,

$$\left(\mathbf{B}_{\ell}[i],\mathbf{U}_{\ell+1}[i-N^{\prime}]\right)=\phi_{\ell}\left(\mathbf{B}_{\ell}[i-1],\mathbf{Y}_{\ell}[i]\right), \quad (2)$$

where $\mathbf{U}_{\ell+1}[i]$ and $\mathbf{Y}_{\ell}[i]$ with $i\notin\{1,\ldots,N\}$ are regarded as empty random variables. In other words,

  • For the first $N'$ received symbols, the recoding only updates its buffer content, but does not generate any channel inputs.

  • After receiving $N'$ symbols, the recoding generates $N$ channel inputs.

An inner code (or recoding) scheme is the specification of $f$, $N$, $N'$ and $\{\phi_{\ell}\}$.
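To make the recursion (2) concrete, the following is a minimal Python sketch of recoding at one intermediate node. The forwarding function `phi` and the latency value are hypothetical toy choices for illustration, not a recoding scheme from this paper.

```python
import random

# A minimal sketch (hypothetical, not a scheme from the paper) of the buffered
# recoding recursion (2) at one intermediate node: for i = 1, ..., N + N', the
# node updates its buffer from Y_l[i] and, once i > N', emits U_{l+1}[i - N'].

N = 4        # inner blocklength: channel uses per link for one batch
N_PRIME = 1  # input-output latency N'

def phi(buffer, received):
    """Toy recoding function phi_l: remember the last non-erased symbol and
    forward the buffer content. Returns (new_buffer, emitted_symbol)."""
    new_buffer = received if received is not None else buffer
    return new_buffer, new_buffer

def recode_at_node(y):
    """Apply (2) for i = 1, ..., N + N': consume y[1..N], emit N symbols."""
    buffer, u_next = None, []
    for i in range(1, N + N_PRIME + 1):
        received = y[i - 1] if i <= N else None  # Y_l[i] is empty for i > N
        buffer, out = phi(buffer, received)
        if i > N_PRIME:                          # U_{l+1}[i - N'] exists only here
            u_next.append(out)
    return u_next

# One batch over one erasure link: each symbol erased with probability 0.2.
y_received = [s if random.random() > 0.2 else None for s in "abcd"]
print(y_received, "->", recode_at_node(y_received))
```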

At the destination node, all received symbols (which may belong to different batches) are decoded jointly. The end-to-end transformation of a batch is given by the transition matrix from $\mathbf{X}$ to $\mathbf{Y}_{L}$, which can be derived using (1) and (2) recursively. In general, the source recoding function $f$ and the intermediate recoding functions $\{\phi_{\ell}\}$ can be random. Let $F$ be the transition matrix from $\mathbf{X}$ to $\mathbf{U}_{1}$ and let $\Phi_{\ell}$ be the transition matrix from $\mathbf{Y}_{\ell}$ to $\mathbf{U}_{\ell+1}$. We have the Markov chain

$$\mathbf{X}\rightarrow\mathbf{U}_{1}\rightarrow\mathbf{Y}_{1}\rightarrow\cdots\rightarrow\mathbf{U}_{L}\rightarrow\mathbf{Y}_{L}.$$

The end-to-end transition matrix from $\mathbf{X}$ to $\mathbf{Y}_{L}$ is

$$W_{L}\triangleq FQ_{1}^{\otimes N}\Phi_{1}Q_{2}^{\otimes N}\Phi_{2}\cdots Q_{L-1}^{\otimes N}\Phi_{L-1}Q_{L}^{\otimes N}. \quad (3)$$
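The structure of (3) can be checked numerically. For a toy setting with $N=1$, binary alphabets, identical BSC links, and identity recoding matrices (all hypothetical values, not from the paper), $W_L$ is just a product of stochastic matrices, and its rows flatten as $L$ grows, which is exactly the information loss our converse quantifies:

```python
import numpy as np

# A sketch of the product form (3) for a toy line network: N = 1 use per link,
# binary alphabets, identical BSC links, and identity recoding Phi_l.

def bsc(p):
    """Row-stochastic transition matrix of a BSC with flip probability p."""
    return np.array([[1 - p, p], [p, 1 - p]])

L = 10
F = np.eye(2)      # source recoding: transmit the message symbol as-is
Phi = np.eye(2)    # intermediate recoding: forward the received symbol

W = F @ bsc(0.1)   # W_L = F Q_1 Phi_1 Q_2 ... Phi_{L-1} Q_L, per (3)
for _ in range(L - 1):
    W = W @ Phi @ bsc(0.1)

print(W)           # rows flatten toward [0.5, 0.5] as L grows
```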

II-C Design Considerations

The major parameters of a batched code include: i) the batch size $M$, ii) the inner blocklength $N$, and iii) the buffer size $B$ at the intermediate nodes. The buffer size is $B=\log|\mathcal{B}|$ when $N'=0$, and $B=2\log|\mathcal{B}|$ when $N'>0$. For a given recoding scheme, the maximum achievable rate of the outer code is $\max_{p_{\mathbf{X}}}I(\mathbf{X};\mathbf{Y}_{L})$ for $N$ channel uses. In other words, the design goal of a recoding scheme is to maximize

$$\frac{1}{N}\max_{p_{\mathbf{X}}}I(\mathbf{X};\mathbf{Y}_{L})=\frac{1}{N}\max_{p_{\mathbf{X}}}I(p_{\mathbf{X}},W_{L}) \quad (4)$$

under certain constraints on $M$, $N$ and $B$ to be discussed later. For a given recoding scheme, an outer code should be designed for the transition matrix $W_{L}$. The optimal value of (4) over all recoding schemes is called the capacity of the line network with batched codes (under given constraints on $M$, $N$ and $B$), denoted by $C_{L}$.
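For a fixed recoding scheme, the maximization over $p_{\mathbf{X}}$ in (4) is a standard channel capacity computation. A minimal Blahut-Arimoto sketch is given below; it is not part of the paper, and the $2\times 2$ matrix is a hypothetical stand-in for $W_L$:

```python
import numpy as np

# A minimal Blahut-Arimoto sketch for the objective (4): given the end-to-end
# matrix W_L of a fixed recoding scheme, compute max_{p_X} I(p_X, W_L).

def blahut_arimoto(W, iterations=500):
    """Return max_p I(p, W) in nats; W is row-stochastic (inputs x outputs)."""
    num_in = W.shape[0]
    p = np.full(num_in, 1.0 / num_in)            # start from the uniform input
    logW = np.where(W > 0, np.log(np.where(W > 0, W, 1.0)), 0.0)
    for _ in range(iterations):
        q = p @ W                                # induced output distribution
        logq = np.log(np.where(q > 0, q, 1.0))
        d = np.sum(W * (logW - logq), axis=1)    # D( W(.|x) || q ) per input x
        p *= np.exp(d)                           # multiplicative B-A update
        p /= p.sum()
    q = p @ W
    logq = np.log(np.where(q > 0, q, 1.0))
    return float(p @ np.sum(W * (logW - logq), axis=1))

W = np.array([[0.9, 0.1], [0.2, 0.8]])           # hypothetical 2x2 "W_L"
print(blahut_arimoto(W))                         # ~0.28 nats (~0.40 bits)
```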

Since the mutual information $I(p_{\mathbf{X}},W_{L})$ is convex in $W_{L}$ when $p_{\mathbf{X}}$ is fixed, we have the following proposition.

Proposition 1.

There exists a deterministic capacity-achieving recoding scheme.

II-D Capacity Scalability

Under various constraints on $M$, $N$ and $B$, we study how the capacity of a line network with batched codes scales with the network length $L$. Denote by $C(Q_{\ell})$ and $C_{0}(Q_{\ell})$ the channel capacity and the zero-error capacity of $Q_{\ell}$, respectively. Note that if $C_{0}(Q_{\ell})>0$ for all $\ell$, a constant rate can be achieved for any network length $L$ using fixed $M$, $N$ and $B$ (see also [2]). The same scalability result extends to a line network with only a fixed number of DMCs $Q_{\ell}$ with $C_{0}(Q_{\ell})=0$. Henceforth in this paper, we consider the case that $C_{0}(Q_{\ell})=0$ for all $\ell$.

II-D1 $M=\Theta(N)$, $B=\Theta(N)$ and $N\rightarrow\infty$

Decode-and-forward is an optimal recoding scheme and achieves the min-cut capacity $\min_{1\leq\ell\leq L}C(Q_{\ell})$ when i) $B$ is not limited and ii) $N$ is allowed to be arbitrarily large [1].

II-D2 $M=\Theta(N)$, $B=\Theta(N)$ and $N=O(\ln L)$

As $N$ does not tend to infinity, the error probability at each intermediate node does not tend to zero if decode-and-forward is applied. When $Q_{\ell}$, $\ell=1,\ldots,L$ are identical, any constant rate below the channel capacity $C(Q_{\ell})$ can be achieved using batched codes with $n\rightarrow\infty$ [2].

II-D3 $N=O(1)$

When $N$ is a fixed number that does not change with $L$, it is sufficient to consider fixed $B$ and $M$. When $Q_{\ell}$, $\ell=1,\ldots,L$ are identical, $C_{L}$ tends to zero as $L\rightarrow\infty$ [2]. It was also shown in [2] that when $\Phi_{\ell}$, $\ell=1,\ldots,L-1$ are also identical, the maximum achievable rate converges to zero exponentially fast.

When $N=O(1)$, the scalability of $C_{L}$ in general remains open. For example, it is unknown whether the exponential convergence of the achievable rate still holds when the channels and the recoding functions at intermediate nodes can be different. In this paper, we answer this question by a general upper bound that decreases exponentially in $L$.

II-D4 $M=O(1)$

We are also interested in the case that $M$ is a relatively small, fixed number that does not change with the network length $L$, so that the major parameters of the outer code do not depend on the network size. This may have certain advantages for the hardware implementation of the outer code. It was shown in [4] that when $N=O(\ln L)$ and $B=O(\ln\ln L)$, rate $\Omega(1/\ln L)$ can be achieved. Note that a buffer size of order $\ln N$ is necessary when a node needs at least to count how many packets of a batch have been received. In this paper, we will show that when $N=O(\ln L)$ and $B=O(N)$, $C_{L}$ is $O(1/\ln L)$.

III Line Networks of Packet Erasure Channels

We first discuss the special case in which the channels $\{Q_{\ell}\}$ are identical packet erasure channels with transition matrix $Q_{\text{erasure}}$. Fix an alphabet $\mathcal{Q}^{*}$ with $|\mathcal{Q}^{*}|\geq 2$. Suppose that the input alphabet $\mathcal{Q}_{\mathrm{in}}$ and the output alphabet $\mathcal{Q}_{\mathrm{out}}$ are both $\mathcal{Q}^{*}\cup\{0\}$, where $0\notin\mathcal{Q}^{*}$ is called the erasure. For each $x\in\mathcal{Q}^{*}$,

$$Q_{\text{erasure}}(y|x)=\begin{cases}1-\epsilon&\text{if }y=x,\\ \epsilon&\text{if }y=0,\end{cases}$$

where $\epsilon$ is a constant in $(0,1)$ called the erasure probability. The input $0$ can be used to model the case that the channel is not used for transmission, and we define $Q_{\text{erasure}}(0|0)=1$.

The relation between the input $X$ and the output $Y$ of a packet erasure channel can be written as a function $Y=XZ$, where $Z$ is a binary random variable independent of $X$ with $\Pr\{Z=0\}=1-\Pr\{Z=1\}=\epsilon$. In other words, $Z$ indicates whether the channel output is the erasure or not. With this formulation, we can write for $\ell=1,\ldots,L$ and $i=1,\ldots,N$,

$$\mathbf{Y}_{\ell}[i]=\mathbf{U}_{\ell}[i]\mathbf{Z}_{\ell}[i],$$

where the $\mathbf{Z}_{\ell}[i]$ are independent binary random variables with $\Pr\{\mathbf{Z}_{\ell}[i]=0\}=\epsilon$.
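A quick Monte Carlo sketch of this multiplicative model (toy parameter values, not from the paper) confirms the probability of the bottleneck event $E_0$ introduced below, namely $p_0 = 1-(1-\epsilon^{N})^{L}$:

```python
import random

# A Monte Carlo sketch (toy values) of the model Y_l[i] = U_l[i] Z_l[i]:
# estimate the probability that some link erases all N uses of a batch,
# and compare it with the analytical value p0 = 1 - (1 - eps^N)^L.

def bottleneck_free(L, N, eps):
    """True iff every link delivers at least one of the N uses of the batch."""
    for _ in range(L):
        z = [random.random() >= eps for _ in range(N)]  # Z_l[i] = 1 w.p. 1 - eps
        if not any(z):                                  # Z_l = 0: all N erased
            return False
    return True

L, N, eps, trials = 100, 3, 0.2, 20_000
p0_hat = sum(not bottleneck_free(L, N, eps) for _ in range(trials)) / trials
p0 = 1 - (1 - eps ** N) ** L
print(f"empirical p0 = {p0_hat:.3f}, analytical p0 = {p0:.3f}")  # both ~0.55
```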

The main idea of our converse is that the worst link in a line network restricts the capacity. We define the event $E_{0}$ to capture the communication bottleneck:

$$E_{0}=\cup_{\ell=1}^{L}\{\mathbf{Z}_{\ell}=0\}=\left\{\lor_{\ell=1}^{L}(\mathbf{Z}_{\ell}=0)\right\},$$

where $\mathbf{Z}_{\ell}=0$ means $\mathbf{Z}_{\ell}[i]=0$ for all $i$. In other words, $E_{0}$ is the event that for at least one link, all the $N$ uses of the channel for a batch are erasures. Define $W_{L}^{(0)}$ and $W_{L}^{(1)}$ as the transition matrices from $\mathcal{A}^{M}$ to $\mathcal{Q}_{\mathrm{out}}^{N}$ such that

$$W_{L}^{(0)}(\mathbf{y}|\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}|\mathbf{X}=\mathbf{x},E_{0}\},$$
$$W_{L}^{(1)}(\mathbf{y}|\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}|\mathbf{X}=\mathbf{x},\overline{E_{0}}\},$$

where $\overline{E_{0}}=\left\{\land_{\ell=1}^{L}(\mathbf{Z}_{\ell}\neq 0)\right\}$. As $\mathbf{X}$ and $\mathbf{Z}_{\ell}$, $\ell=1,\ldots,L$ are independent, we have

$$W_{L}=W_{L}^{(0)}p_{0}+W_{L}^{(1)}p_{1},$$

where $p_{0}=\Pr\{E_{0}\}$ and $p_{1}=1-p_{0}$. As $I(p_{\mathbf{X}},W_{L})$ is a convex function of $W_{L}$ when $p_{\mathbf{X}}$ is fixed, we obtain

$$I(p_{\mathbf{X}},W_{L})\leq p_{0}I(p_{\mathbf{X}},W_{L}^{(0)})+p_{1}I(p_{\mathbf{X}},W_{L}^{(1)}).$$
Lemma 2.

For a line network of identical packet erasure channels, $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Proof.

Denote by $P$ the (joint) probability mass function of the random variables we have defined for the batched code. To prove the lemma, we only need to show that for all $\mathbf{x}\in\mathcal{A}^{M}$ and $\mathbf{y}_{L}\in\mathcal{Q}_{\mathrm{out}}^{N}$,

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})P(\mathbf{y}_{L},E_{0}), \quad (5)$$

which implies $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Define a sequence of events for $\ell=1,\ldots,L$:

$$\overline{E_{0}^{(\ell)}}=\{\mathbf{Z}_{\ell^{\prime}}\neq 0,\ \forall\ell^{\prime}>\ell\}.$$

The events $\{\mathbf{Z}_{\ell}=0\}\cap\overline{E_{0}^{(\ell)}}$, $\ell=1,\ldots,L$ are disjoint and $E_{0}=\cup_{\ell=1}^{L}\left[\{\mathbf{Z}_{\ell}=0\}\cap\overline{E_{0}^{(\ell)}}\right]$. Therefore,

$$\begin{aligned}
P(\mathbf{y}_{L},\mathbf{x},E_{0}) &= \sum_{\ell}P\left(\mathbf{y}_{L},\mathbf{x},\mathbf{Z}_{\ell}=0,\overline{E_{0}^{(\ell)}}\right)\\
&= \sum_{\ell}\sum_{\mathbf{y}_{\ell}}\sum_{\mathbf{u}_{\ell}}P\left(\mathbf{y}_{L},\mathbf{x},\mathbf{y}_{\ell},\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0,\overline{E_{0}^{(\ell)}}\right)\\
&= \sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell},\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).
\end{aligned}$$

We have

$$\begin{aligned}
\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell},\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0) &= \sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=0)P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0)\\
&= \sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=0)P(\mathbf{y}_{\ell}|\mathbf{Z}_{\ell}=0)\\
&= P(\mathbf{x})P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0),
\end{aligned}$$

where $P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0)=P(\mathbf{y}_{\ell}|\mathbf{Z}_{\ell}=0)$ follows from the fact that $\mathbf{Y}_{\ell}=0$ when $\mathbf{Z}_{\ell}=0$. Hence, we obtain

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).$$

Similarly, we have

$$P(\mathbf{y}_{L},E_{0})=\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).$$

Therefore, (5) holds. ∎

As $p_{1}=(1-\epsilon^{N})^{L}$ and

$$I(p_{\mathbf{X}},W_{L}^{(1)})\leq\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\},$$

we have

$$C_{L}\leq\frac{(1-\epsilon^{N})^{L}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\}. \quad (6)$$
Theorem 3.

For a line network of length $L$ of packet erasure channels with erasure probability $\epsilon$:

1. When $N=O(1)$, $C_{L}=O((1-\epsilon^{N})^{L})$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

Proof.

The theorem can be proved by substituting $M$ and $N$ in each case into (6). ∎
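As a numeric illustration of Theorem 3 (toy alphabet sizes and erasure probability, not values from the paper), one can evaluate the right-hand side of (6) under the scalings of cases 1 and 2:

```python
import math

# A numeric sketch of the bound (6) under the scalings of Theorem 3,
# with hypothetical toy values (binary alphabets, eps = 0.2).

def bound_6(L, M, N, eps, A=2, Q_out=2):
    """RHS of (6): (1 - eps^N)^L / N * min(M ln|A|, N ln|Q_out|)."""
    return (1 - eps ** N) ** L / N * min(M * math.log(A), N * math.log(Q_out))

eps = 0.2
for L in (10, 100, 1000, 10000):
    case1 = bound_6(L, M=4, N=4, eps=eps)            # case 1: N = O(1)
    N2 = max(1, round(math.log(L)))                  # case 2: N = Theta(ln L)
    case2 = bound_6(L, M=4, N=N2, eps=eps)
    print(f"L={L:>6}: N=O(1) bound {case1:.2e}, N=ln(L) bound {case2:.3f}")
```

The printout shows the fixed-$N$ bound decaying exponentially in $L$, while the $N=\Theta(\ln L)$ bound decays only like $1/\ln L$, matching the two regimes of the theorem.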

IV Converse for General Channels

Consider a generic channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$. The relation between the input $X$ and the output $Y$ of $Q$ can be modeled as a function $\alpha$ (see [10, Section 7.1]):

$$Y=\alpha(X,Z=(Z_{x},x\in\mathcal{Q}_{\mathrm{in}}))=\sum_{x\in\mathcal{Q}_{\mathrm{in}}}\bm{1}\{X=x\}Z_{x}, \quad (7)$$

where $\bm{1}$ denotes the indicator function, and $Z_{x}$, $x\in\mathcal{Q}_{\mathrm{in}}$ are independent random variables with alphabet $\mathcal{Q}_{\mathrm{out}}$ defined as

$$\Pr\{Z_{x}=y\}=Q(y|x). \quad (8)$$

For $N$ uses of the channel $Q$, we can write

$$\mathbf{Y}=\alpha^{(N)}(\mathbf{U},\mathbf{Z}), \quad (9)$$

where $\mathbf{Y}[i]=\alpha(\mathbf{U}[i],\mathbf{Z}[i])$.
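The functional model (7)-(8) is easy to validate by simulation. The following sketch uses a hypothetical $2\times 2$ transition matrix (not a channel from the paper) and checks that setting $Y=Z_X$ reproduces $Q(\cdot|x)$ empirically:

```python
import random
from collections import Counter

# A sketch validating (7)-(8) by simulation: sample Z = (Z_x) with
# Pr{Z_x = y} = Q(y|x), independent across x, and set Y = Z_X.

Q = {0: {0: 0.9, 1: 0.1},   # Q(y|x) for input x = 0 (hypothetical values)
     1: {0: 0.2, 1: 0.8}}   # Q(y|x) for input x = 1

def sample_Z():
    """One realization of Z = (Z_x, x in Q_in)."""
    return {x: random.choices(list(Q[x]), weights=list(Q[x].values()))[0]
            for x in Q}

def alpha(x, z):
    """Y = alpha(X, Z) = sum_x 1{X = x} Z_x, which is simply Z_X."""
    return z[x]

counts = Counter(alpha(0, sample_Z()) for _ in range(100_000))
print({y: c / 100_000 for y, c in counts.items()})  # close to Q(.|0) = {0: 0.9, 1: 0.1}
```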

In this section, we consider general DMCs $Q_{\ell}$, each of which can be modeled by a function $\alpha_{\ell}$ as above. With this formulation, we can write for $\ell=1,\ldots,L$,

$$\mathbf{Y}_{\ell}=\alpha_{\ell}^{(N)}(\mathbf{U}_{\ell},\mathbf{Z}_{\ell}). \quad (10)$$

IV-A Canonical Channels

For $0<\varepsilon\leq 1$, we call a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ an $\varepsilon$-canonical channel if there exists $y_{0}\in\mathcal{Q}_{\mathrm{out}}$ such that $Q(y_{0}|x)\geq\varepsilon$ for every $x\in\mathcal{Q}_{\mathrm{in}}$. The packet erasure channel, the BSC and the BEC are all canonical channels. Note that a canonical channel $Q$ has $C_{0}(Q)=0$. We first consider the case that the channels $\{Q_{\ell}\}$ are all $\varepsilon$-canonical channels.
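Checking the $\varepsilon$-canonical property for a given transition matrix amounts to finding an output column whose minimum entry is largest. A small sketch with toy BSC and BEC matrices (hypothetical parameter values):

```python
import numpy as np

# A sketch of an eps-canonical check: find an output y0 whose probability is
# at least eps under every input, i.e., min_x Q(y0|x) >= eps.

def canonical_eps(Q):
    """Return (best eps, y0): the largest eps for which Q is eps-canonical."""
    col_mins = Q.min(axis=0)          # min over inputs x of Q(y|x), per output y
    y0 = int(np.argmax(col_mins))
    return float(col_mins[y0]), y0

bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])          # outputs {0, 1}
bec = np.array([[0.8, 0.0, 0.2],
                [0.0, 0.8, 0.2]])     # outputs {0, 1, erasure}

print(canonical_eps(bsc))             # (0.1, 0): 0.1-canonical via y0 = 0
print(canonical_eps(bec))             # (0.2, 2): 0.2-canonical via the erasure
```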

Define the event

$$E_{0}=\left\{\lor_{\ell=1}^{L}(\mathbf{Z}_{\ell}=y_{0})\right\},$$

where $\mathbf{Z}_{\ell}=y_{0}$ means $(\mathbf{Z}_{\ell}[i])_{x}=y_{0}$ for all $i$ and $x$. The event $E_{0}$ means that there exists one link of the network such that all uses of the channel for transmitting a batch have the same output $y_{0}$. Similar to the discussion in Section III, the transition matrix $W_{L}$ can be expressed as

$$W_{L}=W_{L}^{(0)}p_{0}+W_{L}^{(1)}p_{1},$$

where $p_{0}=\Pr\{E_{0}\}$, $p_{1}=\Pr\{\overline{E_{0}}\}$, and

$$W_{L}^{(0)}(\mathbf{y}\mid\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}\mid\mathbf{X}=\mathbf{x},E_{0}\},$$
$$W_{L}^{(1)}(\mathbf{y}\mid\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}\mid\mathbf{X}=\mathbf{x},\overline{E_{0}}\}.$$

Hence,

$$I(p_{\mathbf{X}},W_{L})\leq p_{0}I(p_{\mathbf{X}},W_{L}^{(0)})+p_{1}I(p_{\mathbf{X}},W_{L}^{(1)}).$$
Lemma 4.

When $Q_{\ell}$, $\ell=1,\ldots,L$ are all $\varepsilon$-canonical channels, $p_{1}\leq(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L}$.

Proof.

We write

$$\begin{aligned}
p_{1} &= \prod_{\ell=1}^{L}\bigg[1-\prod_{i=1}^{N}\prod_{x\in\mathcal{Q}_{\mathrm{in}}}\Pr\{(\mathbf{Z}_{\ell}[i])_{x}=y_{0}\}\bigg]\\
&= \prod_{\ell=1}^{L}\bigg[1-\prod_{i=1}^{N}\prod_{x\in\mathcal{Q}_{\mathrm{in}}}Q_{\ell}(y_{0}|x)\bigg]\leq(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L},
\end{aligned}$$

where the second equality follows from (8). ∎

Lemma 5.

For a line network of length $L$ of $\varepsilon$-canonical channels, $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Proof.

Similar to the proof of Lemma 2, we have

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=y_{0})P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=y_{0}).$$

By (10), given $\mathbf{Z}_{\ell}=y_{0}$,

$$\mathbf{Y}_{\ell}=\alpha_{\ell}^{(N)}(\mathbf{U}_{\ell},\mathbf{Z}_{\ell})=(y_{0},\ldots,y_{0}),$$

and hence $P(\mathbf{y}_{\ell}\mid\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=y_{0})=P(\mathbf{y}_{\ell}\mid\mathbf{Z}_{\ell}=y_{0})$. Following the same argument as in Lemma 2,

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})P(\mathbf{y}_{L},E_{0}),$$

which implies $I(p_{\mathbf{X}},W_{L}^{(0)})=0$. ∎

Combining Lemmas 4 and 5, we obtain

$$C_{L}\leq\frac{(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\},$$

which implies the following theorem:

Theorem 6.

For a length-$L$ line network of $\varepsilon$-canonical channels with finite input and output alphabets:

1. When $N=O(1)$, $C_{L}=O((1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L})$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

IV-B General Channels

Consider a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ with $C_{0}(Q)=0$, modeled as in (7)-(9). Denote by $\varepsilon_{Q}$ the maximum value such that for any $x,x^{\prime}\in\mathcal{Q}_{\mathrm{in}}$, there exists $y\in\mathcal{Q}_{\mathrm{out}}$ such that $Q(y|x)\geq\varepsilon_{Q}$ and $Q(y|x^{\prime})\geq\varepsilon_{Q}$. Note that $C_{0}(Q)=0$ if and only if $\varepsilon_{Q}>0$.
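From the definition, $\varepsilon_Q$ can be computed directly as a minimum over input pairs of a maximum over outputs. A small sketch with a hypothetical ternary channel (not from the paper):

```python
import numpy as np
from itertools import combinations

# A sketch of computing eps_Q from its definition: the largest eps such that
# every pair of inputs x, x' shares some output y with Q(y|x) >= eps and
# Q(y|x') >= eps. eps_Q > 0 iff every input pair is confusable, i.e. C_0(Q) = 0.

def eps_Q(Q):
    """eps_Q = min over input pairs of max over y of min(Q(y|x), Q(y|x'))."""
    pair_best = [np.max(np.minimum(Q[x], Q[xp]))
                 for x, xp in combinations(range(Q.shape[0]), 2)]
    return float(min(pair_best)) if pair_best else 1.0

Q = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])       # hypothetical ternary channel
print(eps_Q(Q))                       # 0.5, so C_0(Q) = 0 by the criterion above
```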

Lemma 7.

For a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ with $C_{0}(Q)=0$ and any non-empty $\mathcal{A}\subseteq\mathcal{Q}_{\mathrm{in}}^{N}$, there exist $\mathbf{z}=((\mathbf{z}[i])_{x}\in\mathcal{Q}_{\mathrm{out}},x\in\mathcal{Q}_{\mathrm{in}},i=1,\ldots,N)$ and a subset $\mathcal{B}\subseteq\mathcal{Q}_{\mathrm{out}}^{N}$ with $|\mathcal{B}|\leq\lceil|\mathcal{A}|/2\rceil$ such that $\alpha^{(N)}(\mathbf{x},\mathbf{z})\in\mathcal{B}$ for any $\mathbf{x}\in\mathcal{A}$, and $\Pr\{\mathbf{Z}=\mathbf{z}\}\geq\varepsilon_{Q}^{|\mathcal{Q}_{\mathrm{in}}|N}$.

Proof.

The sequences in $\mathcal{A}$ can be put into $\lceil|\mathcal{A}|/2\rceil$ pairs. For each pair $\mathbf{x}$ and $\mathbf{x}^{\prime}$, there exists $\mathbf{y}$ such that for each $i=1,\ldots,N$, $Q(\mathbf{y}[i]|\mathbf{x}[i])\geq\varepsilon_{Q}$ and $Q(\mathbf{y}[i]|\mathbf{x}^{\prime}[i])\geq\varepsilon_{Q}$. Let $(\mathbf{z}[i])_{\mathbf{x}[i]}=\mathbf{y}[i]$ and $(\mathbf{z}[i])_{\mathbf{x}^{\prime}[i]}=\mathbf{y}[i]$. After going through all the $\lceil|\mathcal{A}|/2\rceil$ pairs, let $\mathcal{B}$ be the collection of all such $\mathbf{y}$, which satisfies $|\mathcal{B}|\leq\lceil|\mathcal{A}|/2\rceil$. For all $(\mathbf{z}[i])_{x}$ that have not been assigned, let $(\mathbf{z}[i])_{x}=y$ for some $y$ with $Q(y|x)\geq\varepsilon_{Q}$. Hence $\Pr\{\mathbf{Z}=\mathbf{z}\}=\Pr\{(\mathbf{Z}[i])_{x}=(\mathbf{z}[i])_{x},x\in\mathcal{Q}_{\mathrm{in}},i=1,\ldots,N\}\geq\varepsilon_{Q}^{|\mathcal{Q}_{\mathrm{in}}|N}$. ∎
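A sketch of a single pairing step of this construction, using the hypothetical ternary channel from the previous sketch (with $\varepsilon_Q=1/2$); the helper names are illustrative, not from the paper:

```python
import numpy as np

# One pairing step from the proof of Lemma 7 for a single pair (x, x'): build
# a channel realization z that maps both sequences to the same output y, with
# every assigned entry having probability >= eps_Q.

Q = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])                    # eps_Q = 0.5 for this Q

def pair_realization(x, xp, N):
    """Return (z, y): z[i][a] is the forced output of input a at channel use i."""
    # Unassigned entries default to each input's most likely output (>= eps_Q).
    z = [[int(np.argmax(Q[a])) for a in range(Q.shape[0])] for _ in range(N)]
    y = []
    for i in range(N):
        yi = int(np.argmax(np.minimum(Q[x[i]], Q[xp[i]])))  # common output
        z[i][x[i]] = z[i][xp[i]] = yi
        y.append(yi)
    return z, y

x, xp = (0, 1, 2), (1, 1, 0)
z, y = pair_realization(x, xp, N=3)
print(y, [z[i][x[i]] for i in range(3)], [z[i][xp[i]] for i in range(3)])
prob = np.prod([Q[a][z[i][a]] for i in range(3) for a in range(3)])
print(prob, 0.5 ** 9)  # Pr{Z = z} = prob >= eps_Q ** (|Q_in| * N) = 0.5^9
```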

Assume that $L=L^{\prime}K$, where $L^{\prime}$ and $K$ are integers. The end-to-end transition matrix $W_{L}$ can then be written as

$$W_{L}=FG_{1}\Phi_{K}G_{2}\Phi_{2K}\cdots G_{L^{\prime}},$$

where for $i=1,\ldots,L^{\prime}$,

$$G_{i}=Q_{K(i-1)+1}^{\otimes N}\Phi_{K(i-1)+1}\cdots\Phi_{Ki-1}Q_{Ki}^{\otimes N}.$$

The length-$L$ network can be regarded as a length-$L^{\prime}$ network of channels $G_{i}$, $i=1,\ldots,L^{\prime}$. By Proposition 1, without loss of optimality, we assume a deterministic recoding scheme, i.e., $F$ and $\Phi_{\ell}$ are deterministic transition matrices. The input $\mathbf{X}$ and output $\mathbf{Y}$ of $G_{i}$ can then be written as a function

$$\mathbf{Y}=\alpha_{G_{i}}(\mathbf{X},\mathbf{Z}_{\ell},\ell=K(i-1)+1,\ldots,Ki),$$

where $\alpha_{G_{i}}$ can be determined recursively from $F$, $\{\Phi_{\ell}\}$ and (9).

When $K\geq N\log_{2}|\mathcal{Q}_{\mathrm{in}}|$ and $\varepsilon_{Q_{\ell}}\geq\varepsilon$ for all $\ell$, applying Lemma 7 inductively (each application at least halves the number of reachable sequences), we know that there exist $\mathbf{y}_{i}$ and $\{\mathbf{z}_{\ell}\}$ such that $\alpha_{G_{i}}(\mathbf{x},\mathbf{z}_{\ell},\ell=K(i-1)+1,\ldots,Ki)=\mathbf{y}_{i}$ for all $\mathbf{x}\in\mathcal{Q}_{\mathrm{in}}^{N}$, and

$$\Pr\{\mathbf{Z}_{\ell}=\mathbf{z}_{\ell},\ell=K(i-1)+1,\ldots,Ki\}\geq\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|NK}.$$
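The role of the condition $K\geq N\log_{2}|\mathcal{Q}_{\mathrm{in}}|$ can be checked arithmetically: each application of Lemma 7 shrinks the reachable set from $|\mathcal{A}|$ to at most $\lceil|\mathcal{A}|/2\rceil$, so $K$ halvings reduce the $|\mathcal{Q}_{\mathrm{in}}|^{N}$ input sequences of $G_{i}$ to a single $\mathbf{y}_{i}$. A toy check (hypothetical sizes):

```python
import math

# Arithmetic check for the inductive use of Lemma 7: each hop at least halves
# the set of reachable sequences, so K >= N log2 |Q_in| halvings reduce the
# |Q_in|^N input sequences down to a single output sequence.

Q_in, N = 3, 4
size = Q_in ** N                       # 81 input sequences initially
K = math.ceil(N * math.log2(Q_in))     # 7 halvings suffice
for k in range(1, K + 1):
    size = math.ceil(size / 2)         # |B| <= ceil(|A| / 2) per Lemma 7
    print(f"after hop {k}: at most {size} sequences")  # ends at 1
```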

For $i=1,\ldots,L^{\prime}$, define the events

$$E_{i}=\{\mathbf{Z}_{\ell}=\mathbf{z}_{\ell},\ \ell=K(i-1)+1,\ldots,Ki\}.$$

Define the event

$$E_{0}=\left\{\lor_{i=1}^{L^{\prime}}E_{i}\right\}.$$

Performing an analysis similar to that in Section IV-A for the length-$L^{\prime}$ network of channels $G_{i}$, $i=1,\ldots,L^{\prime}$ with $E_{0}$ defined above, we obtain

$$C_{L}\leq\frac{(1-\varepsilon^{NK|\mathcal{Q}_{\mathrm{in}}|})^{L/K}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\}, \quad (11)$$

and hence the following result holds:

Theorem 8.

For a length-$L$ line network of channels $Q_{\ell}$ with finite input and output alphabets and $\varepsilon_{Q_{\ell}}\geq\varepsilon>0$ for all $\ell$:

1. When $N=O(1)$, $C_{L}=O((1-\varepsilon^{\prime})^{L})$ for a certain $\varepsilon^{\prime}\in(0,1)$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

V Concluding Remarks

This paper characterized upper bounds on the achievable rates of batched codes for line networks whose channels have finite alphabets and zero zero-error capacity; the bounds are tight in the order of the network length.

Generalizing our analysis to channels with infinite alphabets and to continuous channels is of research interest. The study of batched code designs for a line network of channels like the BSC is also desirable.

Lastly, we are curious whether our upper bound holds without the batched code constraint.

References

  • [1] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. John Wiley & Sons, 2006.
  • [2] U. Niesen, C. Fragouli, and D. Tuninetti, "On capacity of line networks," IEEE Transactions on Information Theory, vol. 53, no. 11, pp. 4039–4058, Nov. 2007.
  • [3] S. Yang and R. W. Yeung, BATS Codes: Theory and Practice, ser. Synthesis Lectures on Communication Networks. Morgan & Claypool Publishers, 2017.
  • [4] S. Yang, J. Wang, Y. Dong, and Y. Zhang, "On the capacity scalability of line networks with buffer size constraints," in Proc. IEEE International Symposium on Information Theory (ISIT), 2019.
  • [5] B. N. Vellambi, N. Torabkhani, and F. Fekri, "Throughput and latency in finite-buffer line networks," IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3622–3643, June 2011.
  • [6] P. A. Chou, Y. Wu, and K. Jain, "Practical network coding," in Proc. Allerton Conference on Communication, Control, and Computing, Oct. 2003.
  • [7] D. Silva, W. Zeng, and F. R. Kschischang, "Sparse network coding with overlapping classes," in Proc. NetCod '09, 2009, pp. 74–79.
  • [8] S. Yang and R. W. Yeung, "Coding for a network coded fountain," in Proc. IEEE International Symposium on Information Theory (ISIT), Saint Petersburg, Russia, Jul. 31–Aug. 5, 2011, pp. 2647–2651.
  • [9] S. Yang and R. W. Yeung, "Batched sparse codes," IEEE Transactions on Information Theory, vol. 60, no. 9, pp. 5322–5346, Sep. 2014.
  • [10] R. W. Yeung, Information Theory and Network Coding. Springer, 2008.