
Upper Bound Scalability on Achievable Rates of
Batched Codes for Line Networks

Shenghao Yang and Jie Wang The Chinese University of Hong Kong, Shenzhen
Abstract

The capacity of line networks with buffer size constraints is an open but practically important problem. In this paper, the upper bound on the achievable rate of a class of codes, called batched codes, is studied for line networks. Batched codes enable a range of buffer size constraints, and are general enough to include special coding schemes studied in the literature for line networks. Existing works have characterized the achievable rates of batched codes for several classes of parameter sets, but leave the cut-set bound as the best existing general upper bound. In this paper, we provide upper bounds on the achievable rates of batched codes as functions of the line network length for these parameter sets. Our upper bounds match the existing achievability results in their order of the network length.

I Introduction

The communication in a network from a source node to a destination node may go through multiple hops, each of which introduces errors. In this paper, we are interested in how the communication rate scales with the number of hops when the intermediate nodes have buffer size constraints.

In particular, we consider a line network of $L$ hops formed by a sequence of nodes, where discrete memoryless channels (DMCs) exist only between two adjacent nodes. We call the first node the source node and the last node the destination node. Except for the source and destination nodes, all the other nodes, called intermediate nodes, have one incoming channel and one outgoing channel. Each intermediate node has a buffer of $B$ bits to keep the content used between different intermediate processing steps. There are no other storage or computation constraints on the network nodes.

For some cases of the problem, the answers are known. When the buffer size $B$ is allowed to increase with the block length at the source node, the min-cut capacity can be achieved using hop-by-hop decoding and re-encoding [1]. When the zero-error capacity of each channel is nonzero, a constant buffer size $B$ can achieve the zero-error capacity for any value of $L$ [2].

In this paper, we focus on DMCs in the line network with finite input and output alphabets and zero zero-error capacity. Note that for most common channel models, e.g., binary symmetric channels and erasure channels, the zero-error capacity is zero. When all cascaded channels are identical, Niesen, Fragouli, and Tuninetti [2] showed that a class of codes with a constant buffer size $B$ can achieve rates $\Omega(e^{-cL})$, where $c$ is a constant. They also showed that if the buffer size $B$ is of order $\ln L$, any rate below the capacity of the channel $Q$ can be achieved. Recently, Yang et al. [3, 4] showed that the end-to-end throughput can be lower bounded by $\Omega(1/\ln L)$ using an intermediate node buffer size of $O(\ln\ln L)$.¹

¹ In this paper, we say that $f(n)=\Omega(g(n))$ if there exist a real constant $c>0$ and an integer constant $n_{0}\geq 1$ such that $f(n)\geq c\cdot g(n)$ for every integer $n\geq n_{0}$; $f(n)=O(g(n))$ if there exist a real constant $c>0$ and an integer constant $n_{0}\geq 1$ such that $f(n)\leq c\cdot g(n)$ for every integer $n\geq n_{0}$; and $f(n)=\Theta(g(n))$ if both $f(n)=\Omega(g(n))$ and $f(n)=O(g(n))$ are satisfied.

In contrast to these achievability results, the min-cut bound is still the best known upper bound. Characterizing a non-trivial, general upper bound for a line network with buffer size constraints could be difficult, as hinted in [5]. We relax the problem by asking instead how the upper bound scales with the network length $L$ for a class of codes, called batched codes.

Batched codes provide a general coding framework for line networks with buffer size constraints, and include the codes studied in the previous works [2, 3, 4] to show the achievability results as special cases. A batched code has an outer code and an inner code. The outer code encodes the information messages into batches, each of which is a sequence of coded symbols, while the inner code performs a general network coding for the symbols belonging to the same batch. The inner code, comprising recoding at network nodes on each batch separately, should be designed for specific channels. Batched codes have been studied for designing efficient network coding for packet erasure channels (see, for example, [6, 7]), and practical designs have been provided [8, 9].

The upper bound scalability on the achievable rates of batched codes provides important guidance for designing batched codes for large networks. For example, we want to know whether the exponential decay of the achievable rate with $L$ is necessary for $B=O(1)$, and whether we can do better than $\Omega(1/\ln L)$ when $B=O(\ln\ln L)$. These questions are answered in this paper (see Table I). In particular, we show that when $N=O(1)$, which implies $M,B=O(1)$, the achievable rates must decay exponentially with $L$. When $N=O(\ln L)$ and $M=O(1)$, which implies $B=O(\ln L)$, the achievable rate is $O(1/\ln L)$. These upper bounds have the same order in $L$ as the previous achievability results, and hence, together, they provide tight capacity scalability results of batched codes for these parameter sets.

Our results are proved in a general setting of line networks where the DMCs in the line network can be arbitrarily different except for a mild technical condition. The main technique of our converse is to decompose the end-to-end transition matrix induced by the inner code into a convex combination of two parts, where one part captures the communication bottleneck in the line network and the other part can be simply upper bounded.

After introducing batched codes, we first use line networks of packet erasure channels to illustrate our main technique (Section III). We then generalize the results to a broad class of channels called canonical channels, which includes BSCs and BECs (Section IV-A). Finally, we present a technique that handles line networks of general DMCs with zero zero-error capacity (Section IV-B).

TABLE I: Summary of the achievable rate scalability for channels with zero zero-error capacity using batched codes. Here, $c$ and $c'$ are constants that do not change with $L$.

(a) Lower bounds

batch size $M$ | inner blocklength $N$ | buffer size $B$ | lower bound
$O(1)$ | $O(1)$ | $O(1)$ | $\Omega(e^{-cL})$
$O(1)$ | $O(\ln L)$ | $O(\ln\ln L)$ | $\Omega(1/\ln L)$
$O(\ln L)$ | $O(\ln L)$ | $O(\ln L)$ | $\Omega(1)$

(b) Upper bounds

batch size $M$ | inner blocklength $N$ | buffer size $B$ | upper bound
arbitrary | $O(1)$ | $O(1)$ | $O(e^{-c'L})$
$O(1)$ | $O(\ln L)$ | $O(\ln L)$ | $O(1/\ln L)$
$O(\ln L)$ | $O(\ln L)$ | $O(\ln L)$ | min-cut

II Line Networks and Batched Codes

In this section, we describe a general line network model and introduce batched codes, which form a general coding framework for line networks with buffer size constraints.

II-A General Description

A line network of length $L$ is formed by a sequence of nodes $v_{\ell}$, $\ell=0,1,\ldots,L$, where communication links exist only between nodes $v_{\ell-1}$ and $v_{\ell}$ for $\ell=1,\ldots,L$. We assume that each link $(v_{\ell-1},v_{\ell})$ is a discrete memoryless channel (DMC) with transition matrix $Q_{\ell}$, where the input and output alphabets are $\mathcal{Q}_{\mathrm{in}}$ and $\mathcal{Q}_{\mathrm{out}}$, respectively, both finite. We study the communication from the source node $v_{0}$ to the destination node $v_{L}$, where all the intermediate nodes $v_{1},\ldots,v_{L-1}$ can help with the communication.

Let $K$, $n$ and $M$ be positive integers, and $\mathcal{A}$ and $\mathcal{B}$ be finite alphabets. A batched code has an outer code and an inner code, described as follows. The message of the source node is formed by $K$ symbols from $\mathcal{A}$. The outer code of a batched code, performed at $v_{0}$, encodes the message and generates $n$ batches, each of which has $M$ symbols from $\mathcal{A}$. Here $M$ is called the batch size, and $n$ is called the outer blocklength.

Let $N$ be a positive integer called the inner blocklength. The inner code of a batched code is performed on different batches separately, and includes the recoding operations at nodes $v_{0},\ldots,v_{L-1}$:

  • At the source node $v_{0}$ that generates the batches, recoding is performed on the original $M$ symbols of a batch to generate $N$ recoded symbols (in $\mathcal{Q}_{\mathrm{in}}$) to be transmitted on the outgoing link of the source node.

  • At an intermediate network node $v$ that does not need to decode the input symbols, recoding is performed on the received symbols (in $\mathcal{Q}_{\mathrm{out}}$) belonging to the same batch to generate $N$ recoded symbols (in $\mathcal{Q}_{\mathrm{in}}$) to be transmitted on the outgoing link of $v$.

In general, the number of recoded symbols transmitted by different nodes can be different. Here we assume that they are all the same for simplicity of the discussion.

II-B Recoding Formulations

Let us formulate recoding for a generic batch $\mathbf{X}$. We denote by $\mathbf{X}[k]$ ($1\leq k\leq M$) the $k$th symbol in $\mathbf{X}$. (Similar notation applies to other sequences of entries.) The recoding at the source node is a function $f:\mathcal{A}^{M}\rightarrow\mathcal{Q}_{\mathrm{in}}^{N}$. For $\ell=1,\ldots,L$, denote by $\mathbf{U}_{\ell}$ and $\mathbf{Y}_{\ell}$ the input and output of $N$ uses of the link $(v_{\ell-1},v_{\ell})$, where $\mathbf{U}_{1}=f(\mathbf{X})$. Since the channel is memoryless,

$$\Pr\{\mathbf{Y}_{\ell}=\mathbf{y}|\mathbf{U}_{\ell}=\mathbf{u}\}=Q_{\ell}^{\otimes N}(\mathbf{y}|\mathbf{u})\triangleq\prod_{i=1}^{N}Q_{\ell}(\mathbf{y}[i]|\mathbf{u}[i]). \quad (1)$$

The channel inputs $\mathbf{U}_{\ell}$, $\ell=2,3,\ldots,L$ can be formulated recursively. Let $N'$ be an integer in $\{0,1,\ldots,N\}$ used to represent the input-output latency. For $i=0,1,\ldots,N+N'$, let $\mathbf{B}_{\ell}[i]$ be a random variable over the finite set $\mathcal{B}$ with $\mathbf{B}_{\ell}[0]$ a constant, which represents the content of the buffer for the batch $\mathbf{X}$. The recoding at $v_{\ell}$ is the function $\phi_{\ell}$ such that for $i=1,\ldots,N+N'$,

$$\left(\mathbf{B}_{\ell}[i],\mathbf{U}_{\ell+1}[i-N^{\prime}]\right)=\phi_{\ell}\left(\mathbf{B}_{\ell}[i-1],\mathbf{Y}_{\ell}[i]\right), \quad (2)$$

where $\mathbf{U}_{\ell+1}[i]$ and $\mathbf{Y}_{\ell}[i]$ with $i\notin\{1,\ldots,N\}$ are regarded as empty random variables. In other words,

  • For the first $N'$ received symbols, the recoding only updates its buffer content, but does not generate any channel inputs.

  • After receiving $N'$ symbols, the recoding generates $N$ channel inputs.

An inner code (or recoding) scheme is the specification of $f$, $N$, $N'$ and $\{\phi_{\ell}\}$.
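To make the recursion (2) concrete, the following is a minimal Python sketch of recoding at one intermediate node. The forwarding function `phi` and the latency value are hypothetical toy choices for illustration, not a recoding scheme from this paper.

```python
import random

# A minimal sketch (hypothetical, not a scheme from the paper) of the buffered
# recoding recursion (2) at one intermediate node: for i = 1, ..., N + N', the
# node updates its buffer from Y_l[i] and, once i > N', emits U_{l+1}[i - N'].

N = 4        # inner blocklength: channel uses per link for one batch
N_PRIME = 1  # input-output latency N'

def phi(buffer, received):
    """Toy recoding function phi_l: remember the last non-erased symbol and
    forward the buffer content. Returns (new_buffer, emitted_symbol)."""
    new_buffer = received if received is not None else buffer
    return new_buffer, new_buffer

def recode_at_node(y):
    """Apply (2) for i = 1, ..., N + N': consume y[1..N], emit N symbols."""
    buffer, u_next = None, []
    for i in range(1, N + N_PRIME + 1):
        received = y[i - 1] if i <= N else None  # Y_l[i] is empty for i > N
        buffer, out = phi(buffer, received)
        if i > N_PRIME:                          # U_{l+1}[i - N'] exists only here
            u_next.append(out)
    return u_next

# One batch over one erasure link: each symbol erased with probability 0.2.
y_received = [s if random.random() > 0.2 else None for s in "abcd"]
print(y_received, "->", recode_at_node(y_received))
```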

At the destination node, all received symbols (which may belong to different batches) are decoded jointly. The end-to-end transformation of a batch is given by the transition matrix from $\mathbf{X}$ to $\mathbf{Y}_{L}$, which can be derived using (1) and (2) recursively. In general, the source recoding function $f$ and the intermediate recoding functions $\{\phi_{\ell}\}$ can be random. Let $F$ be the transition matrix from $\mathbf{X}$ to $\mathbf{U}_{1}$ and let $\Phi_{\ell}$ be the transition matrix from $\mathbf{Y}_{\ell}$ to $\mathbf{U}_{\ell+1}$. We have the Markov chain

$$\mathbf{X}\rightarrow\mathbf{U}_{1}\rightarrow\mathbf{Y}_{1}\rightarrow\cdots\rightarrow\mathbf{U}_{L}\rightarrow\mathbf{Y}_{L}.$$

The end-to-end transition matrix from $\mathbf{X}$ to $\mathbf{Y}_{L}$ is

$$W_{L}\triangleq FQ_{1}^{\otimes N}\Phi_{1}Q_{2}^{\otimes N}\Phi_{2}\cdots Q_{L-1}^{\otimes N}\Phi_{L-1}Q_{L}^{\otimes N}. \quad (3)$$
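The structure of (3) can be checked numerically. For a toy setting with $N=1$, binary alphabets, identical BSC links, and identity recoding matrices (all hypothetical values, not from the paper), $W_L$ is just a product of stochastic matrices, and its rows flatten as $L$ grows, which is exactly the information loss our converse quantifies:

```python
import numpy as np

# A sketch of the product form (3) for a toy line network: N = 1 use per link,
# binary alphabets, identical BSC links, and identity recoding Phi_l.

def bsc(p):
    """Row-stochastic transition matrix of a BSC with flip probability p."""
    return np.array([[1 - p, p], [p, 1 - p]])

L = 10
F = np.eye(2)      # source recoding: transmit the message symbol as-is
Phi = np.eye(2)    # intermediate recoding: forward the received symbol

W = F @ bsc(0.1)   # W_L = F Q_1 Phi_1 Q_2 ... Phi_{L-1} Q_L, per (3)
for _ in range(L - 1):
    W = W @ Phi @ bsc(0.1)

print(W)           # rows flatten toward [0.5, 0.5] as L grows
```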

II-C Design Considerations

The major parameters of a batched code include: i) the batch size $M$, ii) the inner blocklength $N$, and iii) the buffer size $B$ at the intermediate nodes. The buffer size is $B=\log|\mathcal{B}|$ when $N'=0$, and $B=2\log|\mathcal{B}|$ when $N'>0$. For a given recoding scheme, the maximum achievable rate of the outer code is $\max_{p_{\mathbf{X}}}I(\mathbf{X};\mathbf{Y}_{L})$ for $N$ channel uses. In other words, the design goal of a recoding scheme is to maximize

$$\frac{1}{N}\max_{p_{\mathbf{X}}}I(\mathbf{X};\mathbf{Y}_{L})=\frac{1}{N}\max_{p_{\mathbf{X}}}I(p_{\mathbf{X}},W_{L}) \quad (4)$$

under certain constraints on $M$, $N$ and $B$ to be discussed later. For a given recoding scheme, an outer code should be designed for the transition matrix $W_{L}$. The optimal value of (4) over all recoding schemes is called the capacity of the line network with batched codes (under given constraints on $M$, $N$ and $B$), denoted by $C_{L}$.
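For a fixed recoding scheme, the maximization over $p_{\mathbf{X}}$ in (4) is a standard channel capacity computation. A minimal Blahut-Arimoto sketch is given below; it is not part of the paper, and the $2\times 2$ matrix is a hypothetical stand-in for $W_L$:

```python
import numpy as np

# A minimal Blahut-Arimoto sketch for the objective (4): given the end-to-end
# matrix W_L of a fixed recoding scheme, compute max_{p_X} I(p_X, W_L).

def blahut_arimoto(W, iterations=500):
    """Return max_p I(p, W) in nats; W is row-stochastic (inputs x outputs)."""
    num_in = W.shape[0]
    p = np.full(num_in, 1.0 / num_in)            # start from the uniform input
    logW = np.where(W > 0, np.log(np.where(W > 0, W, 1.0)), 0.0)
    for _ in range(iterations):
        q = p @ W                                # induced output distribution
        logq = np.log(np.where(q > 0, q, 1.0))
        d = np.sum(W * (logW - logq), axis=1)    # D( W(.|x) || q ) per input x
        p *= np.exp(d)                           # multiplicative B-A update
        p /= p.sum()
    q = p @ W
    logq = np.log(np.where(q > 0, q, 1.0))
    return float(p @ np.sum(W * (logW - logq), axis=1))

W = np.array([[0.9, 0.1], [0.2, 0.8]])           # hypothetical 2x2 "W_L"
print(blahut_arimoto(W))                         # ~0.28 nats (~0.40 bits)
```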

Since the mutual information $I(p_{\mathbf{X}},W_{L})$ is convex in $W_{L}$ when $p_{\mathbf{X}}$ is fixed, we have the following proposition.

Proposition 1.

There exists a deterministic capacity-achieving recoding scheme.

II-D Capacity Scalability

Under various constraints on $M$, $N$ and $B$, we study how the capacity of a line network with batched codes scales with the network length $L$. Denote by $C(Q_{\ell})$ and $C_{0}(Q_{\ell})$ the channel capacity and the zero-error capacity of $Q_{\ell}$, respectively. Note that if $C_{0}(Q_{\ell})>0$ for all $\ell$, a constant rate can be achieved for any network length $L$ using fixed $M$, $N$ and $B$ (see also [2]). The same scalability result extends to a line network with only a fixed number of DMCs $Q_{\ell}$ with $C_{0}(Q_{\ell})=0$. Henceforth in this paper, we consider the case that $C_{0}(Q_{\ell})=0$ for all $\ell$.

II-D1 $M=\Theta(N)$, $B=\Theta(N)$ and $N\rightarrow\infty$

Decode-and-forward is an optimal recoding scheme and achieves the min-cut capacity $\min_{1\leq\ell\leq L}C(Q_{\ell})$ when i) $B$ is not limited and ii) $N$ is allowed to be arbitrarily large [1].

II-D2 $M=\Theta(N)$, $B=\Theta(N)$ and $N=O(\ln L)$

As $N$ does not tend to infinity, the error probability at each intermediate node does not tend to zero if decode-and-forward is applied. When $Q_{\ell}$, $\ell=1,\ldots,L$ are identical, any constant rate below the channel capacity $C(Q_{\ell})$ can be achieved using batched codes with $n\rightarrow\infty$ [2].

II-D3 $N=O(1)$

When $N$ is a fixed number that does not change with $L$, it is sufficient to consider fixed $B$ and $M$. When $Q_{\ell}$, $\ell=1,\ldots,L$ are identical, $C_{L}$ tends to zero as $L\rightarrow\infty$ [2]. It was also shown in [2] that when $\Phi_{\ell}$, $\ell=1,\ldots,L-1$ are also identical, the maximum achievable rate converges to zero exponentially fast.

When $N=O(1)$, the scalability of $C_{L}$ in general remains open. For example, it is unknown whether the exponential convergence of the achievable rate still holds when the channels and the recoding functions at intermediate nodes can be different. In this paper, we answer this question by a general upper bound that decreases exponentially in $L$.

II-D4 $M=O(1)$

We are also interested in the case that $M$ is a relatively small, fixed number that does not change with the network length $L$, so that the major parameters of the outer code do not depend on the network size. This may have certain advantages for the hardware implementation of the outer code. It was shown in [4] that when $N=O(\ln L)$ and $B=O(\ln\ln L)$, rate $\Omega(1/\ln L)$ can be achieved. Note that a buffer size of order $\ln N$ is necessary when a node needs at least to count how many packets of a batch have been received. In this paper, we will show that when $N=O(\ln L)$ and $B=O(N)$, $C_{L}$ is $O(1/\ln L)$.

III Line Networks of Packet Erasure Channels

We first discuss the special case in which the channels $\{Q_{\ell}\}$ are identical packet erasure channels with transition matrix $Q_{\text{erasure}}$. Fix an alphabet $\mathcal{Q}^{*}$ with $|\mathcal{Q}^{*}|\geq 2$. Suppose that the input alphabet $\mathcal{Q}_{\mathrm{in}}$ and the output alphabet $\mathcal{Q}_{\mathrm{out}}$ are both $\mathcal{Q}^{*}\cup\{0\}$, where $0\notin\mathcal{Q}^{*}$ is called the erasure. For each $x\in\mathcal{Q}^{*}$,

$$Q_{\text{erasure}}(y|x)=\begin{cases}1-\epsilon&\text{if }y=x,\\ \epsilon&\text{if }y=0,\end{cases}$$

where $\epsilon$ is a constant in $(0,1)$ called the erasure probability. The input $0$ can be used to model the case that the channel is not used for transmission, and we define $Q_{\text{erasure}}(0|0)=1$.

The relation between the input $X$ and the output $Y$ of a packet erasure channel can be written as a function $Y=XZ$, where $Z$ is a binary random variable independent of $X$ with $\Pr\{Z=0\}=1-\Pr\{Z=1\}=\epsilon$. In other words, $Z$ indicates whether the channel output is the erasure or not. With this formulation, we can write for $\ell=1,\ldots,L$ and $i=1,\ldots,N$,

$$\mathbf{Y}_{\ell}[i]=\mathbf{U}_{\ell}[i]\mathbf{Z}_{\ell}[i],$$

where the $\mathbf{Z}_{\ell}[i]$ are independent binary random variables with $\Pr\{\mathbf{Z}_{\ell}[i]=0\}=\epsilon$.
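A quick Monte Carlo sketch of this multiplicative model (toy parameter values, not from the paper) confirms the probability of the bottleneck event $E_0$ introduced below, namely $p_0 = 1-(1-\epsilon^{N})^{L}$:

```python
import random

# A Monte Carlo sketch (toy values) of the model Y_l[i] = U_l[i] Z_l[i]:
# estimate the probability that some link erases all N uses of a batch,
# and compare it with the analytical value p0 = 1 - (1 - eps^N)^L.

def bottleneck_free(L, N, eps):
    """True iff every link delivers at least one of the N uses of the batch."""
    for _ in range(L):
        z = [random.random() >= eps for _ in range(N)]  # Z_l[i] = 1 w.p. 1 - eps
        if not any(z):                                  # Z_l = 0: all N erased
            return False
    return True

L, N, eps, trials = 100, 3, 0.2, 20_000
p0_hat = sum(not bottleneck_free(L, N, eps) for _ in range(trials)) / trials
p0 = 1 - (1 - eps ** N) ** L
print(f"empirical p0 = {p0_hat:.3f}, analytical p0 = {p0:.3f}")  # both ~0.55
```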

The main idea of our converse is that the worst link in a line network restricts the capacity. We define the event $E_{0}$ to capture the communication bottleneck:

$$E_{0}=\cup_{\ell=1}^{L}\{\mathbf{Z}_{\ell}=0\}=\left\{\lor_{\ell=1}^{L}(\mathbf{Z}_{\ell}=0)\right\},$$

where $\mathbf{Z}_{\ell}=0$ means $\mathbf{Z}_{\ell}[i]=0$ for all $i$. In other words, $E_{0}$ is the event that for at least one link, all the $N$ uses of the channel for a batch are erasures. Define $W_{L}^{(0)}$ and $W_{L}^{(1)}$ as the transition matrices from $\mathcal{A}^{M}$ to $\mathcal{Q}_{\mathrm{out}}^{N}$ such that

$$W_{L}^{(0)}(\mathbf{y}|\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}|\mathbf{X}=\mathbf{x},E_{0}\},$$
$$W_{L}^{(1)}(\mathbf{y}|\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}|\mathbf{X}=\mathbf{x},\overline{E_{0}}\},$$

where $\overline{E_{0}}=\left\{\land_{\ell=1}^{L}(\mathbf{Z}_{\ell}\neq 0)\right\}$. As $\mathbf{X}$ and $\mathbf{Z}_{\ell}$, $\ell=1,\ldots,L$ are independent, we have

$$W_{L}=W_{L}^{(0)}p_{0}+W_{L}^{(1)}p_{1},$$

where $p_{0}=\Pr\{E_{0}\}$ and $p_{1}=1-p_{0}$. As $I(p_{\mathbf{X}},W_{L})$ is a convex function of $W_{L}$ when $p_{\mathbf{X}}$ is fixed, we obtain

$$I(p_{\mathbf{X}},W_{L})\leq p_{0}I(p_{\mathbf{X}},W_{L}^{(0)})+p_{1}I(p_{\mathbf{X}},W_{L}^{(1)}).$$
Lemma 2.

For a line network of identical packet erasure channels, $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Proof.

Denote by $P$ the (joint) probability mass function of the random variables we have defined for the batched code. To prove the lemma, we only need to show that for all $\mathbf{x}\in\mathcal{A}^{M}$ and $\mathbf{y}_{L}\in\mathcal{Q}_{\mathrm{out}}^{N}$,

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})P(\mathbf{y}_{L},E_{0}), \quad (5)$$

which implies $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Define a sequence of events for $\ell=1,\ldots,L$:

$$\overline{E_{0}^{(\ell)}}=\{\mathbf{Z}_{\ell^{\prime}}\neq 0,\ \forall\ell^{\prime}>\ell\}.$$

The events $\{\mathbf{Z}_{\ell}=0\}\cap\overline{E_{0}^{(\ell)}}$, $\ell=1,\ldots,L$ are disjoint and $E_{0}=\cup_{\ell=1}^{L}\left[\{\mathbf{Z}_{\ell}=0\}\cap\overline{E_{0}^{(\ell)}}\right]$. Therefore,

$$\begin{aligned}
P(\mathbf{y}_{L},\mathbf{x},E_{0}) &= \sum_{\ell}P\left(\mathbf{y}_{L},\mathbf{x},\mathbf{Z}_{\ell}=0,\overline{E_{0}^{(\ell)}}\right)\\
&= \sum_{\ell}\sum_{\mathbf{y}_{\ell}}\sum_{\mathbf{u}_{\ell}}P\left(\mathbf{y}_{L},\mathbf{x},\mathbf{y}_{\ell},\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0,\overline{E_{0}^{(\ell)}}\right)\\
&= \sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell},\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).
\end{aligned}$$

We have

$$\begin{aligned}
\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell},\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0) &= \sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=0)P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0)\\
&= \sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=0)P(\mathbf{y}_{\ell}|\mathbf{Z}_{\ell}=0)\\
&= P(\mathbf{x})P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0),
\end{aligned}$$

where $P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=0)=P(\mathbf{y}_{\ell}|\mathbf{Z}_{\ell}=0)$ follows from the fact that $\mathbf{Y}_{\ell}=0$ when $\mathbf{Z}_{\ell}=0$. Hence, we obtain

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).$$

Similarly, we have

$$P(\mathbf{y}_{L},E_{0})=\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)P(\mathbf{y}_{\ell},\mathbf{Z}_{\ell}=0).$$

Therefore, (5) holds. ∎

As $p_{1}=(1-\epsilon^{N})^{L}$ and

$$I(p_{\mathbf{X}},W_{L}^{(1)})\leq\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\},$$

we have

$$C_{L}\leq\frac{(1-\epsilon^{N})^{L}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\}. \quad (6)$$
Theorem 3.

For a line network of length $L$ of packet erasure channels with erasure probability $\epsilon$:

1. When $N=O(1)$, $C_{L}=O((1-\epsilon^{N})^{L})$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

Proof.

The theorem can be proved by substituting $M$ and $N$ in each case into (6). ∎
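As a numeric illustration of Theorem 3 (toy alphabet sizes and erasure probability, not values from the paper), one can evaluate the right-hand side of (6) under the scalings of cases 1 and 2:

```python
import math

# A numeric sketch of the bound (6) under the scalings of Theorem 3,
# with hypothetical toy values (binary alphabets, eps = 0.2).

def bound_6(L, M, N, eps, A=2, Q_out=2):
    """RHS of (6): (1 - eps^N)^L / N * min(M ln|A|, N ln|Q_out|)."""
    return (1 - eps ** N) ** L / N * min(M * math.log(A), N * math.log(Q_out))

eps = 0.2
for L in (10, 100, 1000, 10000):
    case1 = bound_6(L, M=4, N=4, eps=eps)            # case 1: N = O(1)
    N2 = max(1, round(math.log(L)))                  # case 2: N = Theta(ln L)
    case2 = bound_6(L, M=4, N=N2, eps=eps)
    print(f"L={L:>6}: N=O(1) bound {case1:.2e}, N=ln(L) bound {case2:.3f}")
```

The printout shows the fixed-$N$ bound decaying exponentially in $L$, while the $N=\Theta(\ln L)$ bound decays only like $1/\ln L$, matching the two regimes of the theorem.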

IV Converse for General Channels

Consider a generic channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$. The relation between the input $X$ and the output $Y$ of $Q$ can be modeled as a function $\alpha$ (see [10, Section 7.1]):

$$Y=\alpha(X,Z=(Z_{x},x\in\mathcal{Q}_{\mathrm{in}}))=\sum_{x\in\mathcal{Q}_{\mathrm{in}}}\bm{1}\{X=x\}Z_{x}, \quad (7)$$

where $\bm{1}$ denotes the indicator function, and $Z_{x}$, $x\in\mathcal{Q}_{\mathrm{in}}$ are independent random variables with alphabet $\mathcal{Q}_{\mathrm{out}}$ defined as

$$\Pr\{Z_{x}=y\}=Q(y|x). \quad (8)$$

For $N$ uses of the channel $Q$, we can write

$$\mathbf{Y}=\alpha^{(N)}(\mathbf{U},\mathbf{Z}), \quad (9)$$

where $\mathbf{Y}[i]=\alpha(\mathbf{U}[i],\mathbf{Z}[i])$.
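The functional model (7)-(8) is easy to validate by simulation. The following sketch uses a hypothetical $2\times 2$ transition matrix (not a channel from the paper) and checks that setting $Y=Z_X$ reproduces $Q(\cdot|x)$ empirically:

```python
import random
from collections import Counter

# A sketch validating (7)-(8) by simulation: sample Z = (Z_x) with
# Pr{Z_x = y} = Q(y|x), independent across x, and set Y = Z_X.

Q = {0: {0: 0.9, 1: 0.1},   # Q(y|x) for input x = 0 (hypothetical values)
     1: {0: 0.2, 1: 0.8}}   # Q(y|x) for input x = 1

def sample_Z():
    """One realization of Z = (Z_x, x in Q_in)."""
    return {x: random.choices(list(Q[x]), weights=list(Q[x].values()))[0]
            for x in Q}

def alpha(x, z):
    """Y = alpha(X, Z) = sum_x 1{X = x} Z_x, which is simply Z_X."""
    return z[x]

counts = Counter(alpha(0, sample_Z()) for _ in range(100_000))
print({y: c / 100_000 for y, c in counts.items()})  # close to Q(.|0) = {0: 0.9, 1: 0.1}
```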

In this section, we consider general DMCs $Q_{\ell}$, each of which can be modeled by a function $\alpha_{\ell}$ as above. With this formulation, we can write for $\ell=1,\ldots,L$,

$$\mathbf{Y}_{\ell}=\alpha_{\ell}^{(N)}(\mathbf{U}_{\ell},\mathbf{Z}_{\ell}). \quad (10)$$

IV-A Canonical Channels

For $0<\varepsilon\leq 1$, we call a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ an $\varepsilon$-canonical channel if there exists $y_{0}\in\mathcal{Q}_{\mathrm{out}}$ such that $Q(y_{0}|x)\geq\varepsilon$ for every $x\in\mathcal{Q}_{\mathrm{in}}$. The packet erasure channel, the BSC and the BEC are all canonical channels. Note that a canonical channel $Q$ has $C_{0}(Q)=0$. We first consider the case that the channels $\{Q_{\ell}\}$ are all $\varepsilon$-canonical channels.
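Checking the $\varepsilon$-canonical property for a given transition matrix amounts to finding an output column whose minimum entry is largest. A small sketch with toy BSC and BEC matrices (hypothetical parameter values):

```python
import numpy as np

# A sketch of an eps-canonical check: find an output y0 whose probability is
# at least eps under every input, i.e., min_x Q(y0|x) >= eps.

def canonical_eps(Q):
    """Return (best eps, y0): the largest eps for which Q is eps-canonical."""
    col_mins = Q.min(axis=0)          # min over inputs x of Q(y|x), per output y
    y0 = int(np.argmax(col_mins))
    return float(col_mins[y0]), y0

bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])          # outputs {0, 1}
bec = np.array([[0.8, 0.0, 0.2],
                [0.0, 0.8, 0.2]])     # outputs {0, 1, erasure}

print(canonical_eps(bsc))             # (0.1, 0): 0.1-canonical via y0 = 0
print(canonical_eps(bec))             # (0.2, 2): 0.2-canonical via the erasure
```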

Define the event

$$E_{0}=\left\{\lor_{\ell=1}^{L}(\mathbf{Z}_{\ell}=y_{0})\right\},$$

where $\mathbf{Z}_{\ell}=y_{0}$ means $(\mathbf{Z}_{\ell}[i])_{x}=y_{0}$ for all $i$ and $x$. The event $E_{0}$ means that there exists one link of the network such that all uses of the channel for transmitting a batch have the same output $y_{0}$. Similar to the discussion in Section III, the transition matrix $W_{L}$ can be expressed as

$$W_{L}=W_{L}^{(0)}p_{0}+W_{L}^{(1)}p_{1},$$

where $p_{0}=\Pr\{E_{0}\}$, $p_{1}=\Pr\{\overline{E_{0}}\}$, and

$$W_{L}^{(0)}(\mathbf{y}\mid\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}\mid\mathbf{X}=\mathbf{x},E_{0}\},$$
$$W_{L}^{(1)}(\mathbf{y}\mid\mathbf{x})=\Pr\{\mathbf{Y}_{L}=\mathbf{y}\mid\mathbf{X}=\mathbf{x},\overline{E_{0}}\}.$$

Hence,

$$I(p_{\mathbf{X}},W_{L})\leq p_{0}I(p_{\mathbf{X}},W_{L}^{(0)})+p_{1}I(p_{\mathbf{X}},W_{L}^{(1)}).$$
Lemma 4.

When $Q_{\ell}$, $\ell=1,\ldots,L$ are all $\varepsilon$-canonical channels, $p_{1}\leq(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L}$.

Proof.

We write

$$\begin{aligned}
p_{1} &= \prod_{\ell=1}^{L}\bigg[1-\prod_{i=1}^{N}\prod_{x\in\mathcal{Q}_{\mathrm{in}}}\Pr\{(\mathbf{Z}_{\ell}[i])_{x}=y_{0}\}\bigg]\\
&= \prod_{\ell=1}^{L}\bigg[1-\prod_{i=1}^{N}\prod_{x\in\mathcal{Q}_{\mathrm{in}}}Q_{\ell}(y_{0}|x)\bigg]\leq(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L},
\end{aligned}$$

where the second equality follows from (8). ∎

Lemma 5.

For a line network of length $L$ of $\varepsilon$-canonical channels, $I(p_{\mathbf{X}},W_{L}^{(0)})=0$.

Proof.

Similar to the proof of Lemma 2, we have

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=\sum_{\ell}\sum_{\mathbf{y}_{\ell}}P\left(\mathbf{y}_{L},\overline{E_{0}^{(\ell)}}\,\middle|\,\mathbf{y}_{\ell}\right)\sum_{\mathbf{u}_{\ell}}P(\mathbf{x},\mathbf{u}_{\ell})P(\mathbf{Z}_{\ell}=y_{0})P(\mathbf{y}_{\ell}|\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=y_{0}).$$

By (10), given $\mathbf{Z}_{\ell}=y_{0}$,

$$\mathbf{Y}_{\ell}=\alpha_{\ell}^{(N)}(\mathbf{U}_{\ell},\mathbf{Z}_{\ell})=(y_{0},\ldots,y_{0}),$$

and hence $P(\mathbf{y}_{\ell}\mid\mathbf{u}_{\ell},\mathbf{Z}_{\ell}=y_{0})=P(\mathbf{y}_{\ell}\mid\mathbf{Z}_{\ell}=y_{0})$. Following the same argument as in Lemma 2,

$$P(\mathbf{y}_{L},\mathbf{x},E_{0})=P(\mathbf{x})P(\mathbf{y}_{L},E_{0}),$$

which implies $I(p_{\mathbf{X}},W_{L}^{(0)})=0$. ∎

Combining Lemmas 4 and 5, we obtain

$$C_{L}\leq\frac{(1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\},$$

which implies the following theorem:

Theorem 6.

For a length-$L$ line network of $\varepsilon$-canonical channels with finite input and output alphabets:

1. When $N=O(1)$, $C_{L}=O((1-\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|N})^{L})$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

IV-B General Channels

Consider a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ with $C_{0}(Q)=0$, modeled as in (7)-(9). Denote by $\varepsilon_{Q}$ the maximum value such that for any $x,x^{\prime}\in\mathcal{Q}_{\mathrm{in}}$, there exists $y\in\mathcal{Q}_{\mathrm{out}}$ such that $Q(y|x)\geq\varepsilon_{Q}$ and $Q(y|x^{\prime})\geq\varepsilon_{Q}$. Note that $C_{0}(Q)=0$ if and only if $\varepsilon_{Q}>0$.
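From the definition, $\varepsilon_Q$ can be computed directly as a minimum over input pairs of a maximum over outputs. A small sketch with a hypothetical ternary channel (not from the paper):

```python
import numpy as np
from itertools import combinations

# A sketch of computing eps_Q from its definition: the largest eps such that
# every pair of inputs x, x' shares some output y with Q(y|x) >= eps and
# Q(y|x') >= eps. eps_Q > 0 iff every input pair is confusable, i.e. C_0(Q) = 0.

def eps_Q(Q):
    """eps_Q = min over input pairs of max over y of min(Q(y|x), Q(y|x'))."""
    pair_best = [np.max(np.minimum(Q[x], Q[xp]))
                 for x, xp in combinations(range(Q.shape[0]), 2)]
    return float(min(pair_best)) if pair_best else 1.0

Q = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])       # hypothetical ternary channel
print(eps_Q(Q))                       # 0.5, so C_0(Q) = 0 by the criterion above
```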

Lemma 7.

For a channel $Q:\mathcal{Q}_{\mathrm{in}}\to\mathcal{Q}_{\mathrm{out}}$ with $C_{0}(Q)=0$ and any non-empty $\mathcal{A}\subseteq\mathcal{Q}_{\mathrm{in}}^{N}$, there exist $\mathbf{z}=((\mathbf{z}[i])_{x}\in\mathcal{Q}_{\mathrm{out}},x\in\mathcal{Q}_{\mathrm{in}},i=1,\ldots,N)$ and a subset $\mathcal{B}\subseteq\mathcal{Q}_{\mathrm{out}}^{N}$ with $|\mathcal{B}|\leq\lceil|\mathcal{A}|/2\rceil$ such that $\alpha^{(N)}(\mathbf{x},\mathbf{z})\in\mathcal{B}$ for any $\mathbf{x}\in\mathcal{A}$, and $\Pr\{\mathbf{Z}=\mathbf{z}\}\geq\varepsilon_{Q}^{|\mathcal{Q}_{\mathrm{in}}|N}$.

Proof.

The sequences in $\mathcal{A}$ can be put into $\lceil|\mathcal{A}|/2\rceil$ pairs. For each pair $\mathbf{x}$ and $\mathbf{x}^{\prime}$, there exists $\mathbf{y}$ such that for each $i=1,\ldots,N$, $Q(\mathbf{y}[i]|\mathbf{x}[i])\geq\varepsilon_{Q}$ and $Q(\mathbf{y}[i]|\mathbf{x}^{\prime}[i])\geq\varepsilon_{Q}$. Let $(\mathbf{z}[i])_{\mathbf{x}[i]}=\mathbf{y}[i]$ and $(\mathbf{z}[i])_{\mathbf{x}^{\prime}[i]}=\mathbf{y}[i]$. After going through all the $\lceil|\mathcal{A}|/2\rceil$ pairs, let $\mathcal{B}$ be the collection of all such $\mathbf{y}$, which satisfies $|\mathcal{B}|\leq\lceil|\mathcal{A}|/2\rceil$. For all $(\mathbf{z}[i])_{x}$ that have not been assigned, let $(\mathbf{z}[i])_{x}=y$ for some $y$ with $Q(y|x)\geq\varepsilon_{Q}$. Hence $\Pr\{\mathbf{Z}=\mathbf{z}\}=\Pr\{(\mathbf{Z}[i])_{x}=(\mathbf{z}[i])_{x},x\in\mathcal{Q}_{\mathrm{in}},i=1,\ldots,N\}\geq\varepsilon_{Q}^{|\mathcal{Q}_{\mathrm{in}}|N}$. ∎
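A sketch of a single pairing step of this construction, using the hypothetical ternary channel from the previous sketch (with $\varepsilon_Q=1/2$); the helper names are illustrative, not from the paper:

```python
import numpy as np

# One pairing step from the proof of Lemma 7 for a single pair (x, x'): build
# a channel realization z that maps both sequences to the same output y, with
# every assigned entry having probability >= eps_Q.

Q = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])                    # eps_Q = 0.5 for this Q

def pair_realization(x, xp, N):
    """Return (z, y): z[i][a] is the forced output of input a at channel use i."""
    # Unassigned entries default to each input's most likely output (>= eps_Q).
    z = [[int(np.argmax(Q[a])) for a in range(Q.shape[0])] for _ in range(N)]
    y = []
    for i in range(N):
        yi = int(np.argmax(np.minimum(Q[x[i]], Q[xp[i]])))  # common output
        z[i][x[i]] = z[i][xp[i]] = yi
        y.append(yi)
    return z, y

x, xp = (0, 1, 2), (1, 1, 0)
z, y = pair_realization(x, xp, N=3)
print(y, [z[i][x[i]] for i in range(3)], [z[i][xp[i]] for i in range(3)])
prob = np.prod([Q[a][z[i][a]] for i in range(3) for a in range(3)])
print(prob, 0.5 ** 9)  # Pr{Z = z} = prob >= eps_Q ** (|Q_in| * N) = 0.5^9
```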

Assume that $L=L^{\prime}K$, where $L^{\prime}$ and $K$ are integers. The end-to-end transition matrix $W_{L}$ can then be written as

$$W_{L}=FG_{1}\Phi_{K}G_{2}\Phi_{2K}\cdots G_{L^{\prime}},$$

where for $i=1,\ldots,L^{\prime}$,

$$G_{i}=Q_{K(i-1)+1}^{\otimes N}\Phi_{K(i-1)+1}\cdots\Phi_{Ki-1}Q_{Ki}^{\otimes N}.$$

The length-$L$ network can be regarded as a length-$L^{\prime}$ network of channels $G_{i}$, $i=1,\ldots,L^{\prime}$. By Proposition 1, without loss of optimality, we assume a deterministic recoding scheme, i.e., $F$ and $\Phi_{\ell}$ are deterministic transition matrices. The input $\mathbf{X}$ and output $\mathbf{Y}$ of $G_{i}$ can then be written as a function

$$\mathbf{Y}=\alpha_{G_{i}}(\mathbf{X},\mathbf{Z}_{\ell},\ell=K(i-1)+1,\ldots,Ki),$$

where $\alpha_{G_{i}}$ can be determined recursively from $F$, $\{\Phi_{\ell}\}$ and (9).

When $K\geq N\log_{2}|\mathcal{Q}_{\mathrm{in}}|$ and $\varepsilon_{Q_{\ell}}\geq\varepsilon$ for all $\ell$, applying Lemma 7 inductively (each application at least halves the number of reachable sequences), we know that there exist $\mathbf{y}_{i}$ and $\{\mathbf{z}_{\ell}\}$ such that $\alpha_{G_{i}}(\mathbf{x},\mathbf{z}_{\ell},\ell=K(i-1)+1,\ldots,Ki)=\mathbf{y}_{i}$ for all $\mathbf{x}\in\mathcal{Q}_{\mathrm{in}}^{N}$, and

$$\Pr\{\mathbf{Z}_{\ell}=\mathbf{z}_{\ell},\ell=K(i-1)+1,\ldots,Ki\}\geq\varepsilon^{|\mathcal{Q}_{\mathrm{in}}|NK}.$$
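The role of the condition $K\geq N\log_{2}|\mathcal{Q}_{\mathrm{in}}|$ can be checked arithmetically: each application of Lemma 7 shrinks the reachable set from $|\mathcal{A}|$ to at most $\lceil|\mathcal{A}|/2\rceil$, so $K$ halvings reduce the $|\mathcal{Q}_{\mathrm{in}}|^{N}$ input sequences of $G_{i}$ to a single $\mathbf{y}_{i}$. A toy check (hypothetical sizes):

```python
import math

# Arithmetic check for the inductive use of Lemma 7: each hop at least halves
# the set of reachable sequences, so K >= N log2 |Q_in| halvings reduce the
# |Q_in|^N input sequences down to a single output sequence.

Q_in, N = 3, 4
size = Q_in ** N                       # 81 input sequences initially
K = math.ceil(N * math.log2(Q_in))     # 7 halvings suffice
for k in range(1, K + 1):
    size = math.ceil(size / 2)         # |B| <= ceil(|A| / 2) per Lemma 7
    print(f"after hop {k}: at most {size} sequences")  # ends at 1
```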

For $i=1,\ldots,L^{\prime}$, define the events

$$E_{i}=\{\mathbf{Z}_{\ell}=\mathbf{z}_{\ell},\ \ell=K(i-1)+1,\ldots,Ki\}.$$

Define the event

$$E_{0}=\left\{\lor_{i=1}^{L^{\prime}}E_{i}\right\}.$$

Performing an analysis similar to that in Section IV-A for the length-$L^{\prime}$ network of channels $G_{i}$, $i=1,\ldots,L^{\prime}$ with $E_{0}$ defined above, we obtain

$$C_{L}\leq\frac{(1-\varepsilon^{NK|\mathcal{Q}_{\mathrm{in}}|})^{L/K}}{N}\min\{M\ln|\mathcal{A}|,N\ln|\mathcal{Q}_{\mathrm{out}}|\}, \quad (11)$$

and hence the following result holds:

Theorem 8.

For a length-$L$ line network of channels $Q_{\ell}$ with finite input and output alphabets and $\varepsilon_{Q_{\ell}}\geq\varepsilon>0$ for all $\ell$:

1. When $N=O(1)$, $C_{L}=O((1-\varepsilon^{\prime})^{L})$ for a certain $\varepsilon^{\prime}\in(0,1)$.

2. When $M=O(1)$ and $N=\Theta(\ln L)$, $C_{L}=O(1/\ln L)$.

3. When $M=\Omega(\ln L)$ and $N=\Omega(\ln L)$, $C_{L}=O(1)$.

V Concluding Remarks

This paper characterized upper bounds on the achievable rates of batched codes for line networks whose channels have finite alphabets and zero zero-error capacity; the bounds are tight in the order of the network length.

Generalizing our analysis to channels with infinite alphabets and to continuous channels is of research interest. The study of batched code designs for a line network of channels like the BSC is also desirable.

Lastly, we are curious whether our upper bound holds without the batched code constraint.

References

  • [1] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. John Wiley & Sons, 2006.
  • [2] U. Niesen, C. Fragouli, and D. Tuninetti, "On capacity of line networks," IEEE Transactions on Information Theory, vol. 53, no. 11, pp. 4039–4058, Nov. 2007.
  • [3] S. Yang and R. W. Yeung, BATS Codes: Theory and Practice, ser. Synthesis Lectures on Communication Networks. Morgan & Claypool Publishers, 2017.
  • [4] S. Yang, J. Wang, Y. Dong, and Y. Zhang, "On the capacity scalability of line networks with buffer size constraints," in Proc. IEEE International Symposium on Information Theory (ISIT), 2019.
  • [5] B. N. Vellambi, N. Torabkhani, and F. Fekri, "Throughput and latency in finite-buffer line networks," IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3622–3643, June 2011.
  • [6] P. A. Chou, Y. Wu, and K. Jain, "Practical network coding," in Proc. Allerton Conference on Communication, Control, and Computing, Oct. 2003.
  • [7] D. Silva, W. Zeng, and F. R. Kschischang, "Sparse network coding with overlapping classes," in Proc. NetCod '09, 2009, pp. 74–79.
  • [8] S. Yang and R. W. Yeung, "Coding for a network coded fountain," in Proc. IEEE International Symposium on Information Theory (ISIT), Saint Petersburg, Russia, Jul. 31–Aug. 5, 2011, pp. 2647–2651.
  • [9] S. Yang and R. W. Yeung, "Batched sparse codes," IEEE Transactions on Information Theory, vol. 60, no. 9, pp. 5322–5346, Sep. 2014.
  • [10] R. W. Yeung, Information Theory and Network Coding. Springer, 2008.