
A dynamic program to achieve capacity of multiple access channel with noiseless feedback

Deepanshu Vasal Northwestern University
Evanston, IL 60201, USA
[email protected]
Abstract

In this paper, we consider the problem of evaluating the capacity of a multiple access channel (MAC) with noiseless feedback. To date, the capacity of this channel is known only through a multi-letter directed-information expression due to Kramer [1]. Recently, it was shown in [2] that evaluating this expression can be posed as a dynamic optimization problem; however, no dynamic program was provided, as the authors argued that there is no notion of a state observed by both senders. In this paper, we build upon [2] and show that there does exist such a state, and therefore a dynamic program (DP) that decomposes this dynamic optimization problem, and equivalently a Bellman fixed-point equation that evaluates the capacity of this channel. We do so by defining a common belief on the private messages and the private beliefs of the two senders, and using this common belief as the state of the system. We further show that this DP can be reduced to a DP whose state is the common belief on the messages alone. This provides a single-letter characterization of the capacity of this channel.

I Introduction

Finding the capacity, and a capacity achieving scheme, of the multiple access channel with noiseless feedback (MAC-NF) is a fundamental problem in communication that has been studied through the lens of information theory [3]. Gaarder and Wolf [4] showed that feedback strictly increases the capacity of this channel. Later, Cover and Leung [5] provided an achievable rate region using block Markov coding and showed it to be tight for the class of channels for which each encoder can perfectly decode the transmitted symbol of the other from the feedback and its own transmitted symbol (henceforth referred to as Cover-Leung channels). Kramer [1] provided a multi-letter directed-information expression for the capacity of this channel. For the Gaussian MAC with feedback, Ozarow [6] provided a linear scheme that achieves capacity and showed that for this channel the capacity expression is single letter. Since then, finding a single-letter expression for the MAC with feedback has remained an open problem.

Recently, Anastasopoulos and Pradhan [2] presented a connection between the problem of decentralized sequential active hypothesis testing (DSAHT) and that of finding the capacity of a multiple access channel with noiseless feedback. The DSAHT problem consists of minimizing the terminal probability of error for the MAC-NF. The authors show that it can be posed as a decentralized stochastic control problem and, using the common information approach [7], derive a dynamic program for a common agent and show that there exist optimal Markovian strategies, where these strategies are Markovian with respect to a common belief state $\pi_t$ that is a belief on the private messages of the users. For the problem of achieving capacity, they show that the multi-letter capacity expression of Kramer [1] defines a dynamic optimization problem. However, the instantaneous costs involve the private beliefs of the senders on their own messages, which are not observed by each other. Thus the instantaneous costs cannot be written as a function of a state that is observed by all the players, which suggests that there is no notion of state and that the problem cannot be decomposed sequentially through a dynamic program. They argue, however, that one can still restrict attention to the Markovian policies that are optimal for the DSAHT problem when solving the dynamic optimization problem, which in turn solves the capacity problem. In summary, their main result shows that there exists a common-belief-based Markovian strategy that achieves the capacity of this channel, so one may restrict attention to this class of strategies when evaluating capacity. The capacity is then given by a dynamic optimization problem that is not sequentially decomposed (i.e. does not have a dynamic program).

In this paper, we build upon their work to show that there does exist a state, and therefore a dynamic program that decomposes this optimization problem, and thus a Bellman fixed-point equation to evaluate the capacity of this channel. We do this by again using the common agent approach [7] and constructing a new common belief state $\tilde{\pi}_t$ that puts a measure on the private messages of the users as well as on their private beliefs, which appear in the instantaneous cost described above. We show that this new common belief is indeed a state of the system: it is a controlled Markov process, and the instantaneous cost can be written as a function of this state and the users' partial encoding functions that map their private information to their transmitted symbols. This shows that one can, in principle, obtain the capacity expression, and a capacity achieving policy, by solving a parameterized Bellman-type fixed-point dynamic programming (DP) equation with state $\tilde{\pi}_t$. We further show that this DP can be simplified to a DP whose state is the common belief on just the messages of the transmitters, i.e. $\pi_t$, the same state considered in [2] for the DSAHT problem. This provides a single-letter characterization of the capacity of this channel.

The paper is structured as follows. Section II presents the channel model. In Section III, we discuss previous results. In Section IV, we present our main result. In Section V, we present conclusions and future work.

II Channel Model

We consider a two-user discrete memoryless multiple access channel with noiseless feedback (MAC-NF). The input symbols $X^1$, $X^2$ and the output symbol $Y$ take values in the finite alphabets $\mathcal{X}^1$, $\mathcal{X}^2$ and $\mathcal{Y}$, respectively. (We use discrete alphabets to simplify exposition and avoid the measure-theoretic technicalities that arise with continuous alphabets; our results go through for continuous alphabets under appropriate technical regularity assumptions.) The channel is memoryless in the sense that, given the current channel inputs, the current channel output is independent of all past channel inputs and outputs, i.e.,

\mathbb{P}(y_{t}|x^{1}_{1:t},x^{2}_{1:t},y_{1:t-1})=Q(y_{t}|x^{1}_{t},x^{2}_{t}). (1)

Our model assumes noiseless feedback with unit delay; that is, at time $t$ the channel outputs $y_{1:t-1}$ are available to both encoders.

Consider the problem of transmission of messages $m^{i}\in\mathcal{M}^{i}=\{1,\ldots,\mathbb{M}^{i}\}$, $i=1,2$, over the MAC-NF using fixed-length codes of length $n$. Encoders generate their channel inputs based on their private messages and past outputs. Thus

X^{i}_{t}=\tilde{f}_{t}^{i}(M^{i},X^{i}_{1:t-1},Y_{1:t-1})=f_{t}^{i}(M^{i},Y_{1:t-1}),\quad i=1,2. (2)

The decoder estimates the messages $M^1$ and $M^2$ based on the $n$ channel outputs $Y_{1:n}$ as

(\hat{M}^{1},\hat{M}^{2})=g(Y_{1:n}). (3)

A fixed-length transmission scheme for the channel $Q$ is the pair $s=(f,g)$, consisting of the encoding functions $f=(f^1,f^2)$ with $f^i=f^i_{1:n}$ and the decoding function $g$. The error probability associated with the transmission scheme $s$ is defined as

P_{e}(s)=\mathbb{P}^{s}\big((M^{1},M^{2})\neq(\hat{M}^{1},\hat{M}^{2})\big). (4)
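To make the setup concrete, the following minimal sketch (our own illustration, not part of the development above) estimates $P_e(s)$ in (4) by Monte Carlo for a hand-picked, deliberately simple feedback scheme. The binary noisy-XOR channel $Q$, the message-set sizes, and the encoders $f^i_t$ are assumptions chosen purely for illustration; any scheme of the form (2)-(3) could be plugged into the same harness.

```python
import itertools, random

random.seed(0)
eps = 0.1                        # channel noise level (assumed)
M1, M2, n = 2, 2, 6              # message-set sizes |M^i| and block length

def Q_sample(x1, x2):
    """Draw Y_t ~ Q(.|x1, x2) for the assumed noisy-XOR channel."""
    y = x1 ^ x2
    return y ^ 1 if random.random() < eps else y

def f1(m1, y_hist):              # encoder 1: ignores the feedback
    return m1 % 2

def f2(m2, y_hist):              # encoder 2: time-shares based on the feedback length
    return (m2 % 2) if len(y_hist) % 2 == 0 else 0

def g(y_hist):                   # ML decoder over message pairs, knowing the encoders
    best_p, best_m = -1.0, (0, 0)
    for m1, m2 in itertools.product(range(M1), range(M2)):
        p = 1.0
        for t, y in enumerate(y_hist):
            x1, x2 = f1(m1, y_hist[:t]), f2(m2, y_hist[:t])
            p *= (1 - eps) if y == (x1 ^ x2) else eps
        if p > best_p:
            best_p, best_m = p, (m1, m2)
    return best_m

trials, errors = 20000, 0
for _ in range(trials):
    m1, m2 = random.randrange(M1), random.randrange(M2)
    y_hist = []
    for t in range(n):
        y_hist.append(Q_sample(f1(m1, y_hist), f2(m2, y_hist)))
    errors += (g(y_hist) != (m1, m2))
print("estimated P_e(s):", errors / trials)
```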

III DSAHT and MAC-NF channel capacity

III-A DSAHT

The problem of decentralized sequential active hypothesis testing (DSAHT) involves minimizing the terminal probability of error $P_e(s)$ over all possible transmission schemes $s$, as follows. Given the alphabets $\mathcal{M}^1$, $\mathcal{M}^2$, $\mathcal{X}^1$, $\mathcal{X}^2$, $\mathcal{Y}$, the channel $Q$, and a fixed length $T$, design the optimal transmission scheme $s=(f,g)$ that minimizes the error probability $P_e(s)$:

P_{e}=\min_{s}P_{e}(s). (P1)

For any fixed encoding functions, the optimal decoder is the maximum-likelihood (ML) decoder (assuming equally likely hypotheses), denoted by $g_{ML}$.

III-B Multi-letter capacity expressions

Shannon derived the capacity of a point-to-point discrete memoryless channel as the maximum of the mutual information between the transmitted and received symbols [8]; this is the supremum of the rates below which one can reliably transmit data. However, a similar single-letter characterization of the capacity of the MAC-NF is not known. Kramer [1] provided a multi-letter capacity expression for the discrete memoryless MAC-NF, which can be stated as follows.

Fact 1 (Theorem 5.1 in [1])

The capacity region of the discrete memoryless MAC-NF is $\mathcal{C}_{FB}=\bigcup_{n=1}^{\infty}\mathcal{C}_{n}$, where $\mathcal{C}_{n}$, the $n$-th directed-information inner bound region, is defined as $\mathcal{C}_{n}=\text{co}(\mathcal{R}_{n})$, where $\text{co}(A)$ denotes the convex hull of a set $A$, and

\mathcal{R}_{n}=\bigcup_{\mathcal{P}_{n}}\big\{(R_{1},R_{2}):\ 0\leq R_{1}\leq I_{n}(X^{1}\rightarrow Y||X^{2}),\ 0\leq R_{2}\leq I_{n}(X^{2}\rightarrow Y||X^{1}),\ 0\leq R_{1}+R_{2}\leq I_{n}(X^{1},X^{2}\rightarrow Y)\big\}, (5)

where $I_{n}(A\rightarrow B||C)=\frac{1}{n}\sum_{t=1}^{n}I(A_{1:t};B_{t}|C_{1:t},B_{1:t-1})=\frac{1}{n}\sum_{t=1}^{n}I(A_{t};B_{t}|C_{1:t},B_{1:t-1})$. The above information quantities are evaluated using the joint distribution

\mathbb{P}(x^{1}_{1:n},x^{2}_{1:n},y_{1:n})=\prod_{t=1}^{n}Q(y_{t}|x^{1}_{t},x^{2}_{t})\,q^{1}_{t}(x^{1}_{t}|x^{1}_{1:t-1},y_{1:t-1})\,q^{2}_{t}(x^{2}_{t}|x^{2}_{1:t-1},y_{1:t-1}), (6)

and the union is over all input joint distributions on $x^{1}_{t},x^{2}_{t}$ that are conditionally factorizable as

\mathbb{P}(x^{1}_{t},x^{2}_{t}|x^{1}_{1:t-1},x^{2}_{1:t-1},y_{1:t-1})=q^{1}_{t}(x^{1}_{t}|x^{1}_{1:t-1},y_{1:t-1})\,q^{2}_{t}(x^{2}_{t}|x^{2}_{1:t-1},y_{1:t-1}) (7)

for $t=1,2,\ldots,n$.

Furthermore, the regions $\mathcal{C}_{n}$ can be expressed in the form [9]

\mathcal{C}_{n}=\left\{(R_{1},R_{2})\geq 0:\ \forall\,\underline{\lambda}=(\lambda_{1},\lambda_{2},\lambda_{3})\in\mathbb{R}^{3}_{+},\ \lambda_{1}R_{1}+\lambda_{2}R_{2}+\lambda_{3}(R_{1}+R_{2})\leq C_{n}(\underline{\lambda})\right\}, (8)

where

C_{n}(\underline{\lambda})\triangleq\sup_{\mathcal{P}_{n}}I_{n}(\underline{\lambda}) (9a)
I_{n}(\underline{\lambda})\triangleq\lambda_{1}I_{n}(X^{1}\rightarrow Y||X^{2})+\lambda_{2}I_{n}(X^{2}\rightarrow Y||X^{1})+\lambda_{3}I_{n}(X^{1},X^{2}\rightarrow Y) (9b)
=\frac{1}{n}\sum_{t=1}^{n}\big[\lambda_{1}I(X^{1}_{t};Y_{t}|X^{2}_{1:t},Y_{1:t-1})+\lambda_{2}I(X^{2}_{t};Y_{t}|X^{1}_{1:t},Y_{1:t-1})+\lambda_{3}I(X^{1}_{t},X^{2}_{t};Y_{t}|Y_{1:t-1})\big] (9c)

and in the above, the set $\mathcal{P}_{n}$ is defined as

\mathcal{P}_{n}=\left\{(q^{1}_{t},q^{2}_{t})_{t=1,\ldots,n}:\ q^{i}_{t}:(\mathcal{X}^{i})^{t-1}\times\mathcal{Y}^{t-1}\rightarrow\mathcal{P}(\mathcal{X}^{i})\right\}. (10)
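As an illustration of how the multi-letter quantities in Fact 1 can be evaluated, the sketch below (our own; the toy channel and the feedback policies $q^i_t$ are assumptions, and nothing here is taken from [1] or [2]) computes the sum-rate term $I_n(X^1,X^2\rightarrow Y)$ by exact enumeration of the joint law (6) for binary alphabets and $n=2$. The individual-rate terms can be computed analogously by additionally conditioning on the relevant input histories.

```python
import itertools, math
from collections import defaultdict

n, eps = 2, 0.1
def Q(y, x1, x2):                # assumed noisy-XOR channel
    return 1 - eps if y == (x1 ^ x2) else eps

def q1(x1, x1_hist, y_hist):     # user 1: uniform inputs, ignores feedback (assumed)
    return 0.5

def q2(x2, x2_hist, y_hist):     # user 2: biased toward repeating the last output (assumed)
    if not y_hist:
        return 0.5
    return 0.8 if x2 == y_hist[-1] else 0.2

# Joint distribution P(x^1_{1:n}, x^2_{1:n}, y_{1:n}) built exactly as in (6).
joint = {}
for seq in itertools.product([0, 1], repeat=3 * n):
    x1s, x2s, ys = seq[:n], seq[n:2 * n], seq[2 * n:]
    p = 1.0
    for t in range(n):
        p *= (q1(x1s[t], x1s[:t], ys[:t]) * q2(x2s[t], x2s[:t], ys[:t])
              * Q(ys[t], x1s[t], x2s[t]))
    joint[(x1s, x2s, ys)] = p

def H(dist):                     # entropy (bits) of a dict of probabilities
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

total = 0.0
for t in range(n):
    prefix, upto_t = defaultdict(float), defaultdict(float)
    in_past, in_full = defaultdict(float), defaultdict(float)
    for (x1s, x2s, ys), p in joint.items():
        prefix[ys[:t]] += p                       # P(y_{1:t-1})
        upto_t[ys[:t + 1]] += p                   # P(y_{1:t})
        in_past[(x1s[t], x2s[t], ys[:t])] += p    # P(x^1_t, x^2_t, y_{1:t-1})
        in_full[(x1s[t], x2s[t], ys[:t + 1])] += p
    # I(X^1_t, X^2_t; Y_t | Y_{1:t-1}) = H(Y_t|Y_{1:t-1}) - H(Y_t|X^1_t, X^2_t, Y_{1:t-1})
    total += (H(upto_t) - H(prefix)) - (H(in_full) - H(in_past))
print("I_n(X^1, X^2 -> Y) =", round(total / n, 4), "bits per channel use")
```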

III-C Previous results

Anastasopoulos and Pradhan in [2] considered the problem of DSAHT and DM-MAC-NF capacity. We summarize their problem statement and results as follows.

They first reformulate problem (P1) into an equivalent optimization problem. Using the "common agent" methodology for decentralized dynamic team problems [7], they decompose the encoding process $x^i_t=f^i_t(m^i,y_{1:t-1})$ into an equivalent two-stage process. In the first stage, based on the common information $y_{1:t-1}$, the mappings (or "partial encoding functions") $e^i_t$, $i=1,2$, are generated as $e^i_t=\phi^i_t[y_{1:t-1}]$ (we use square brackets here because $e^i_t$ is itself a function), or collectively $e_t=(e^1_t,e^2_t)=\phi_t[y_{1:t-1}]$, where $e^i_t:\mathcal{M}^i\rightarrow\mathcal{X}^i$. In the second stage, each of these mappings is evaluated at the private information of the corresponding agent, producing $x^i_t=e^i_t(m^i)$. More formally, for $i=1,2$, let $\mathcal{E}^i$ be the collection of all (deterministic) encoding functions $e^i:\mathcal{M}^i\rightarrow\mathcal{X}^i$. In the first stage, the common information $y_{1:t-1}$ is transformed using the mappings $\phi^i_t:\mathcal{Y}^{t-1}\rightarrow\mathcal{E}^i$ to produce a pair of encoding functions $e_t=(e^1_t,e^2_t)$. In the second stage, these functions are evaluated at the private messages $m^i$, producing $x^i_t=e^i_t(m^i)=\phi^i_t[y_{1:t-1}](m^i)$.
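In the toy snippet below (our own example, not from [2]), $\phi^1_t$ maps the feedback history to a table $e^1_t$, which is then evaluated at the private message; the particular mapping rule is an arbitrary assumption used only to make the mechanics concrete.

```python
def phi1(y_hist):
    """Stage 1: common-information map phi^1_t producing a table e^1_t : M^1 -> X^1.
    The rule below (flip the codebook when the last output was 1) is an arbitrary assumption."""
    return (0, 1) if (not y_hist or y_hist[-1] == 0) else (1, 0)

def encode1(m1, y_hist):
    """Stage 2: evaluate the partial encoding function at the private message m^1."""
    return phi1(y_hist)[m1]

print(encode1(1, []), encode1(1, [0, 1]))   # same message, different e^1_t, different symbols
```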

For the DSAHT problem, the authors first define a common belief $\pi_t(m)=P^{\phi}(m|y_{1:t})$ and show that there exists an update function $F$ for $\pi_t$, independent of $\phi$, such that

\pi_{t}=F(\pi_{t-1},e_{t},y_{t}), (11)

where, more explicitly, $F$ is given by

\pi_{t}(m^{1},m^{2})=\frac{Q(y_{t}|e^{1}_{t}(m^{1}),e^{2}_{t}(m^{2}))\,\pi_{t-1}(m^{1},m^{2})}{\sum_{\tilde{m}^{1},\tilde{m}^{2}}Q(y_{t}|e^{1}_{t}(\tilde{m}^{1}),e^{2}_{t}(\tilde{m}^{2}))\,\pi_{t-1}(\tilde{m}^{1},\tilde{m}^{2})}. (12)

The DSAHT problem can then be solved by the following dynamic program:

V_{T+1}(\pi_{T})=\mathbb{E}^{\phi}\big[1-\max_{m^{1},m^{2}}\pi_{T}(m^{1},m^{2})\big], (13)
V_{t}(\pi_{t-1})=\min_{e_{t}}\mathbb{E}\big[V_{t+1}(F(\pi_{t-1},e_{t},Y_{t}))\,\big|\,\pi_{t-1},e_{t}\big] (14)
=\min_{e_{t}}\sum_{y_{t},m^{1},m^{2}}Q(y_{t}|e_{t}^{1}(m^{1}),e_{t}^{2}(m^{2}))\,\pi_{t-1}(m^{1},m^{2})\,V_{t+1}(F(\pi_{t-1},e_{t},y_{t})). (15)

Regarding the capacity problem, the authors in [2] note that evaluating the capacity region involves evaluating $C_n(\underline{\lambda})$ for every $\underline{\lambda}$, and is therefore at least as hard as evaluating $C_n(\underline{\lambda})$ for a fixed $\underline{\lambda}$. The optimization problem involved in evaluating $C_n(\underline{\lambda})$ can be thought of as a decentralized optimization problem involving two agents: the first chooses the distribution $q^1_t$ on $x^1_t$ after observing the common information $y_{1:t-1}$ and her private information $x^1_{1:t-1}$, while the second chooses the distribution $q^2_t$ on $x^2_t$ after observing the common information $y_{1:t-1}$ and her private information $x^2_{1:t-1}$. They further show that the capacity expression in (9) can be rewritten as follows. They first define a private belief $\hat{\pi}^i_t$ as the marginal belief that user $i$ maintains on her own message $m^i$, given her information $(x^i_{1:t},y_{1:t})$ up to time $t$, i.e.

\hat{\pi}^{i}_{t}(m^{i})\triangleq\mathbb{P}^{g}(M^{i}=m^{i}|x^{i}_{1:t},y_{1:t}),\qquad i=1,2. (16)

Note that this belief is not conditioned on all the information available to the transmitter, and in particular not on her own message $m^i$. Furthermore, they show that there exist functions $\hat{F}^i$, independent of the policies of the transmitters, such that

\hat{\pi}_{t}^{i}=\hat{F}^{i}(\hat{\pi}^{i}_{t-1},e^{i}_{t},x^{i}_{t}), (17)

where, more explicitly, $\hat{F}^i$ is given by

\hat{\pi}^{i}_{t}(m^{i})=\frac{1_{e^{i}_{t}(m^{i})}(x^{i}_{t})\,\hat{\pi}^{i}_{t-1}(m^{i})}{\sum_{\tilde{m}^{i}}1_{e^{i}_{t}(\tilde{m}^{i})}(x^{i}_{t})\,\hat{\pi}^{i}_{t-1}(\tilde{m}^{i})}. (18)

Although the authors do not state it explicitly, repeated application of the above update implies that

\hat{\pi}^{i}_{t}(m^{i})=\mathbb{P}^{g}(M^{i}=m^{i}|x^{i}_{1:t})=\mathbb{P}(M^{i}=m^{i}|e^{i}_{1:t},x^{i}_{1:t}), (19)

i.e. the belief $\hat{\pi}^i_t$ does not depend on $y_{1:t}$; a small numerical illustration of this recursion follows. Based on this, they derive simplified expressions for the mutual information quantities appearing in $I_n(\underline{\lambda})$ in (9), namely $I(X^1_t;Y_t|X^2_{1:t},Y_{1:t-1})$, $I(X^2_t;Y_t|X^1_{1:t},Y_{1:t-1})$ and $I(X^1_t,X^2_t;Y_t|Y_{1:t-1})$, or equivalently for the entropies $H(Y_t|X^2_{1:t},Y_{1:t-1})$, $H(Y_t|X^1_{1:t},Y_{1:t-1})$, $H(Y_t|Y_{1:t-1})$ and $H(Y_t|X^1_t,X^2_t)$. Their results are summarized in Fact 2 below.
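The following small sketch (our own; the message-set size and the encoding functions are arbitrary choices) applies (18) twice and illustrates (19): the resulting belief is uniform on exactly those messages consistent with the symbols the user has transmitted, and $y_{1:t}$ never enters.

```python
def private_belief_update(pi_hat, e_i, x_i):
    """One application of (18); pi_hat is a list indexed by m^i, e_i maps m^i -> x^i."""
    new = [p if e_i[m] == x_i else 0.0 for m, p in enumerate(pi_hat)]
    z = sum(new)
    return [p / z for p in new]

# Example: |M^i| = 4, two encoding steps, true message m^i = 2 (all choices arbitrary).
pi_hat = [0.25, 0.25, 0.25, 0.25]
for e_i in [(0, 0, 1, 1), (0, 1, 0, 1)]:
    x_i = e_i[2]                               # the symbol the user actually transmits
    pi_hat = private_belief_update(pi_hat, e_i, x_i)
    print(pi_hat)
# The final belief is uniform on {m : e_s(m) = e_s(2) for all s}, as (19) suggests.
```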

Fact 2 (Anastasopoulos and Pradhan, 2020)

The mutual information quantities appearing in the expression for $I_n(\underline{\lambda})$ in (9) can be evaluated as expectations of time-invariant quantities that depend only on $\Pi_{t-1}$, $\hat{\Pi}^i_{t-1}$ and $E_t$. Specifically, for each $t=1,\ldots,n$,

I(X^{1}_{t};Y_{t}|X^{2}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[i_{1}(\hat{\Pi}^{2}_{t-1},\Pi_{t-1},E_{t})] (20a)
I(X^{2}_{t};Y_{t}|X^{1}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[i_{2}(\hat{\Pi}^{1}_{t-1},\Pi_{t-1},E_{t})] (20b)
I(X^{1}_{t},X^{2}_{t};Y_{t}|Y_{1:t-1})=\mathbb{E}^{\theta}[i_{3}(\Pi_{t-1},E_{t})], (20c)

where the functions $i_1$, $i_2$, $i_3$ are specified in the proof of Theorem 1 in [2], and the expectations are taken with respect to the joint distribution

\mathbb{P}^{\theta}(\pi_{0:n-1},\hat{\pi}_{0:n-1},e_{1:n})=\prod_{t=0}^{n-1}\mathbb{P}^{\theta}(\pi_{t},\hat{\pi}_{t},e_{t+1}|\pi_{0:t-1},\hat{\pi}_{0:t-1},e_{1:t}) (21a)
=\prod_{t=0}^{n-1}1_{\theta_{t+1}[\pi_{t}]}(e_{t+1})\sum_{y_{t},x^{1}_{t},x^{2}_{t}}Q(y_{t}|x^{1}_{t},x^{2}_{t})\,1_{F(\pi_{t-1},e_{t},y_{t})}(\pi_{t})\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},x^{1}_{t})}(\hat{\pi}^{1}_{t})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e^{2}_{t},x^{2}_{t})}(\hat{\pi}^{2}_{t})\sum_{m^{1},m^{2}}1_{e^{1}_{t}(m^{1})}(x^{1}_{t})\,1_{e^{2}_{t}(m^{2})}(x^{2}_{t})\,\hat{\pi}^{1}_{t-1}(m^{1})\,\hat{\pi}^{2}_{t-1}(m^{2}). (21b)

Equivalently, for a fixed $\underline{\lambda}\in\mathbb{R}^{3}_{+}$, Fact 2 shows that the expression $I_n(\underline{\lambda})$ in (9) involved in evaluating the channel capacity can be expressed as

I_{n}(\underline{\lambda})=\frac{1}{n}\sum_{t=1}^{n}\mathbb{E}^{\theta}[i(\Pi_{t-1},\hat{\Pi}_{t-1},E_{t};\underline{\lambda})]. (22)

Furthermore, the unstructured optimization problem for finding $C_n(\underline{\lambda})$ in (9) can now be restated as

C_{n}(\underline{\lambda})=\sup_{\theta}\frac{1}{n}\sum_{t=1}^{n}\mathbb{E}^{\theta}[i(\Pi_{t-1},\hat{\Pi}_{t-1},E_{t};\underline{\lambda})]. (23)

The authors argue that the above expression suggests viewing $C_n(\underline{\lambda})$ as the average reward received from a dynamical system with state process $(\hat{\Pi}_{t-1},\Pi_{t-1})$ partially controlled by the encoding functions $E_t=\theta_t[\Pi_{t-1}]$, optimized over all such policies. (Note, however, that $(\hat{\Pi}_{t-1},\Pi_{t-1})$ is not observed by any single agent, and thus this observation does not by itself yield a DP for capacity.) They also argue that there exists a capacity achieving distribution that is Markovian in the sense of the DSAHT problem, based on the following argument. Consider a capacity achieving sequence of transmission schemes indexed by the code length (horizon) $n$, with message size $M_n$ and encoding/decoding functions $f_n,g_n$. This is a sequence of systems indexed by $n$ such that their rates $R_n:=(\log_2 M_n)/n\rightarrow C$ and their error probabilities $P_e(n)\rightarrow 0$. For each element of this sequence, with given $n,M_n$, one can design an optimal scheme for the DSAHT problem; the optimal scheme for the system $(n,M_n)$ does not change its rate but can only improve its error probability.

IV Dynamic program for DM-MAC-NF capacity

In this section, we show that there indeed exists a state, and consequently a dynamic program, for the capacity achieving problem. For any policy $\phi$ of the transmitters, we define a new common belief

\tilde{\pi}_{t}(m,\hat{\pi}_{t}):=P^{\phi}(m,\hat{\pi}_{t}|y_{1:t}). (24)

Note that $\pi_t$, the common belief on $m$ defined in the previous section, can be obtained from $\tilde{\pi}_t$ as a marginal. As discussed in the previous section, it was shown in [2] that there exists a capacity achieving policy of the transmitters that is also optimal for DSAHT and is thus of the form $x^i_t=\theta^i[\pi_{t-1}](m^i)=f^i_t(\pi_{t-1},m^i)$. We therefore also restrict ourselves to this class of policies, which depend on the private information of player $i$ only through $m^i$. In other words, we do not consider policies that depend on player $i$'s complete private information $(\hat{\pi}^i_t,m^i)$, but only on the part $m^i$, as in the previous section. In the following lemma, we show that $\tilde{\pi}_{t-1}$ can be updated using Bayes' rule.

Lemma 1

There exists a function $\tilde{F}$, independent of the transmitters' policy $\phi$, such that

\tilde{\pi}_{t}=\tilde{F}(\tilde{\pi}_{t-1},e_{t},y_{t}). (25)
Proof:
\tilde{\pi}_{t}(m,\hat{\pi}_{t})=P^{\phi}(m,\hat{\pi}_{t}|y_{1:t})
=\frac{\sum_{\hat{\pi}_{t-1}}P^{\phi}(m,\hat{\pi}_{t-1},y_{t},\hat{\pi}_{t}|y_{1:t-1})}{\sum_{m,\hat{\pi}_{t-1},\hat{\pi}_{t}}P^{\phi}(m,\hat{\pi}_{t-1},y_{t},\hat{\pi}_{t}|y_{1:t-1})} (26)
=\frac{\sum_{\hat{\pi}_{t-1}}P^{\phi}(m,\hat{\pi}_{t-1}|y_{1:t-1})\,Q(y_{t}|e_{t}^{1}(m^{1}),e^{2}_{t}(m^{2}))\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})}{\sum_{m,\hat{\pi}_{t-1},\hat{\pi}_{t}}P^{\phi}(m,\hat{\pi}_{t-1}|y_{1:t-1})\,Q(y_{t}|e_{t}^{1}(m^{1}),e^{2}_{t}(m^{2}))\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})} (27)
=\frac{\sum_{\hat{\pi}_{t-1}}\tilde{\pi}_{t-1}(m,\hat{\pi}_{t-1})\,Q(y_{t}|e_{t}^{1}(m^{1}),e^{2}_{t}(m^{2}))\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})}{\sum_{m,\hat{\pi}_{t-1},\hat{\pi}_{t}}\tilde{\pi}_{t-1}(m,\hat{\pi}_{t-1})\,Q(y_{t}|e_{t}^{1}(m^{1}),e^{2}_{t}(m^{2}))\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})} (28)
=\frac{\pi_{t-1}(m)\,Q(y_{t}|e_{t}^{1}(m^{1}),e^{2}_{t}(m^{2}))}{\sum_{\tilde{m}}\pi_{t-1}(\tilde{m})\,Q(y_{t}|e_{t}^{1}(\tilde{m}^{1}),e^{2}_{t}(\tilde{m}^{2}))}\times\frac{\sum_{\hat{\pi}_{t-1}}\tilde{\pi}_{t-1}(\hat{\pi}_{t-1}|m)\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})}{\sum_{\hat{\pi}_{t-1},\hat{\pi}_{t}}\tilde{\pi}_{t-1}(\hat{\pi}_{t-1}|m)\,1_{\hat{F}^{1}(\hat{\pi}^{1}_{t-1},e^{1}_{t},e_{t}^{1}(m^{1}))}(\hat{\pi}_{t}^{1})\,1_{\hat{F}^{2}(\hat{\pi}^{2}_{t-1},e_{t}^{2},e_{t}^{2}(m^{2}))}(\hat{\pi}_{t}^{2})}, (29)

which depends only on $(\tilde{\pi}_{t-1},e_t,y_t)$ and not on $\phi$. ∎

The next step is to derive simplified expressions for the mutual information quantities appearing in $I_n(\underline{\lambda})$ in (9), namely $I(X^1_t;Y_t|X^2_{1:t},Y_{1:t-1})$, $I(X^2_t;Y_t|X^1_{1:t},Y_{1:t-1})$ and $I(X^1_t,X^2_t;Y_t|Y_{1:t-1})$, or equivalently the entropies $H(Y_t|X^2_{1:t},Y_{1:t-1})$, $H(Y_t|X^1_{1:t},Y_{1:t-1})$, $H(Y_t|Y_{1:t-1})$ and $H(Y_t|X^1_t,X^2_t)$, now in terms of $\tilde{\Pi}_{t-1}$. Our result is summarized in the following lemma.

Lemma 2

The mutual information quantities appearing in the expression for $I_n(\underline{\lambda})$ in (9) can be evaluated as expectations of time-invariant quantities that depend only on $\tilde{\Pi}_{t-1}$ and $E_t$. Specifically, for each $t=1,\ldots,n$ we have

I(X^{1}_{t};Y_{t}|X^{2}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{1}(\tilde{\Pi}_{t-1},E_{t})] (30a)
I(X^{2}_{t};Y_{t}|X^{1}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{2}(\tilde{\Pi}_{t-1},E_{t})] (30b)
I(X^{1}_{t},X^{2}_{t};Y_{t}|Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{3}(\tilde{\Pi}_{t-1},E_{t})], (30c)

where the functions $\tilde{i}_1$, $\tilde{i}_2$, $\tilde{i}_3$ are specified in the proof, and the expectations are taken with respect to the joint distribution

\mathbb{P}^{\theta}(\tilde{\pi}_{0:n-1},e_{1:n})=\prod_{t=0}^{n-1}\mathbb{P}^{\theta}(\tilde{\pi}_{t},e_{t+1}|\tilde{\pi}_{0:t-1},e_{1:t}) (31a)
=\prod_{t=0}^{n-1}1_{\theta_{t+1}[\tilde{\pi}_{t}]}(e_{t+1})\sum_{y_{t},m^{1},m^{2}}\pi_{t}(m^{1},m^{2})\,Q(y_{t}|e^{1}_{t}(m^{1}),e^{2}_{t}(m^{2}))\,1_{\tilde{F}(\tilde{\pi}_{t-1},e_{t},y_{t})}(\tilde{\pi}_{t}). (31b)
Proof:

Let $i_1(\hat{\Pi}^2_{t-1},\Pi_{t-1},E_t)$, $i_2(\hat{\Pi}^1_{t-1},\Pi_{t-1},E_t)$ and $i_3(\Pi_{t-1},E_t)$ be as defined in Theorem 1 of [2]. Define

\tilde{i}_{1}(\tilde{\pi}_{t-1},e)=\sum_{\hat{\pi}_{t-1}}i_{1}(\hat{\pi}_{t-1}^{2},\pi_{t-1},e)\,\tilde{\pi}_{t-1}(\hat{\pi}_{t-1}^{2}) (32a)
\tilde{i}_{2}(\tilde{\pi}_{t-1},e)=\sum_{\hat{\pi}_{t-1}}i_{2}(\hat{\pi}_{t-1}^{1},\pi_{t-1},e)\,\tilde{\pi}_{t-1}(\hat{\pi}_{t-1}^{1}) (32b)
\tilde{i}_{3}(\tilde{\pi}_{t-1},e)=i_{3}(\pi_{t-1},e), (32c)

where $\pi_{t-1}$ and $\tilde{\pi}_{t-1}(\hat{\pi}^i_{t-1})$ denote the corresponding marginals of $\tilde{\pi}_{t-1}$, and $\tilde{i}(\tilde{\pi}_{t-1},e_t)$ is defined from the above quantities in the same way that $i(\pi_{t-1},\hat{\pi}_{t-1},e_t)$ in (22) is defined from the quantities in (20).

Consequently, the mutual information quantities at time $t$ become

I(X^{1}_{t};Y_{t}|X^{2}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{1}(\tilde{\Pi}_{t-1},E_{t})] (33a)
I(X^{2}_{t};Y_{t}|X^{1}_{1:t},Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{2}(\tilde{\Pi}_{t-1},E_{t})] (33b)
I(X^{1}_{t},X^{2}_{t};Y_{t}|Y_{1:t-1})=\mathbb{E}^{\theta}[\tilde{i}_{3}(\tilde{\Pi}_{t-1},E_{t})]. (33c)
∎

Based on Lemma 2 we can now solve for the capacity of DM-MAC-NF as a dynamic program for the common agent as follows.

Theorem 1

The dynamic optimization problem in (23) can be solved using the following dynamic program

J+V(\tilde{\pi})=\max_{e}\mathbb{E}\big[\tilde{i}(\tilde{\pi},e;\underline{\lambda})+V(\tilde{F}(\tilde{\pi},e,Y))\,\big|\,\tilde{\pi},e\big] (34a)
=\max_{e}\ \tilde{i}(\tilde{\pi},e;\underline{\lambda})+\sum_{y,m^{1},m^{2}}\pi(m^{1},m^{2})\,Q(y|e^{1}(m^{1}),e^{2}(m^{2}))\,V(\tilde{F}(\tilde{\pi},e,y)). (34b)
Proof:

We note that $\{\tilde{\pi}_t,e_t\}_t$ is a controlled Markov process for this problem, since the instantaneous cost can be written as a function of $(\tilde{\pi}_t,e_t)$ and $P(\tilde{\pi}_{t+1}|\tilde{\pi}_{1:t},e_{1:t})=P(\tilde{\pi}_{t+1}|\tilde{\pi}_t,e_t)$, as implied by (25). The result then follows from Markov decision process theory [10]. ∎

The above result implies that there exists an optimal Markovian policy for transmitter $i$ that is a function of the state $\tilde{\pi}_t$ and $m^i$.

IV-A Simplification

We recall that it was shown in [2] that there exist capacity achieving policies of the players that depend only on $(\pi_t,m^i)$ (rather than on $(\tilde{\pi}_t,m^i)$). Theorem 1 above, however, yields through the DP an optimal strategy that depends on $(\tilde{\pi}_t,m^i)$. Here we show that, along any realization, $\tilde{\pi}_t$ is determined by $\pi_t$ together with the applied encoding functions $e_{1:t}$, so one can restrict attention to the smaller state $\pi_t$ and still recover the belief $\tilde{\pi}_t$. To show this equivalence, we note that

\tilde{\pi}_{t}(m,\hat{\pi}_{t}):=P^{\phi}(m,\hat{\pi}_{t}|y_{1:t}) (35)
=P^{\phi}(m|y_{1:t})\,P^{\phi}(\hat{\pi}_{t}|m,y_{1:t}) (36)
=\pi_{t}(m)\,\tilde{\pi}_{t}(\hat{\pi}_{t}|m). (37)

Now note that $\hat{\pi}^i_t$ is a function of $(e^i_{1:t},x^i_{1:t})=(e^i_{1:t},e^i_{1:t}(m^i))$, as shown in (19). Thus, knowing $(m,\phi,y_{1:t})$, or equivalently $(m,e_{1:t})$, the private belief $\hat{\pi}_t$ is perfectly determined, i.e. $\tilde{\pi}_t(\hat{\pi}_t|m)=\delta\big(\hat{\pi}_t(\cdot)=P(\cdot|e_{1:t},e_{1:t}(m))\big)$. This can also be seen from (29): by induction, if $\tilde{\pi}_{t-1}(\hat{\pi}_{t-1}|m)$ is a delta function then so is $\tilde{\pi}_t(\hat{\pi}_t|m)$. Thus one can reduce the state of the DP in (34) from $\tilde{\pi}_t$ to $\pi_t$.

Theorem 2

The dynamic optimization problem in (23) can be solved using the following dynamic program

J+V(\pi)=\max_{e}\mathbb{E}\big[\tilde{i}(\tilde{\pi},e;\underline{\lambda})+V(F(\pi,e,Y))\,\big|\,\pi,e\big], (38)

where, as argued above, $\tilde{\pi}$ can be derived from $\pi$.

Proof:

The result follows from Theorem 1 and the discussion above. ∎
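Once a per-step reward is fixed, the structure of the reduced DP in (38) can be explored numerically. The sketch below (our own, not an implementation of any algorithm from [2]) runs the backward recursion in the reduced state $\pi$ over a finite horizon, as a surrogate for the average-reward fixed point, for the sum-rate point $\underline{\lambda}=(0,0,1)$. The toy channel, the small message sets, and the explicit form of the per-step reward $i_3(\pi,e)=H(Y|\pi,e)-\mathbb{E}_\pi[H(Y|X^1,X^2)]$ (consistent with (20c) and (32c), but not spelled out in this paper) are all assumptions made for illustration.

```python
import itertools, math

M1, M2, eps, n = 2, 2, 0.1, 3
def Q(y, x1, x2):                                  # assumed noisy-XOR channel
    return 1 - eps if y == (x1 ^ x2) else eps

E1 = list(itertools.product((0, 1), repeat=M1))    # partial encoding functions
E2 = list(itertools.product((0, 1), repeat=M2))

def H(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

def i3(pi, e1, e2):                                # assumed sum-rate reward
    py = [sum(p * Q(y, e1[m[0]], e2[m[1]]) for m, p in pi.items()) for y in (0, 1)]
    return H(py) - sum(p * H([Q(y, e1[m[0]], e2[m[1]]) for y in (0, 1)])
                       for m, p in pi.items())

def F(pi, e1, e2, y):                              # common-belief update (12)
    new = {m: Q(y, e1[m[0]], e2[m[1]]) * p for m, p in pi.items()}
    z = sum(new.values())
    return {m: p / z for m, p in new.items()}

def V(pi, t):                                      # backward recursion, maximizing the sum of rewards
    if t == n + 1:
        return 0.0
    best = float("-inf")
    for e1, e2 in itertools.product(E1, E2):
        val = i3(pi, e1, e2)
        for y in (0, 1):
            py = sum(p * Q(y, e1[m[0]], e2[m[1]]) for m, p in pi.items())
            if py > 0:
                val += py * V(F(pi, e1, e2, y), t + 1)
        best = max(best, val)
    return best

pi0 = {(m1, m2): 1 / (M1 * M2) for m1 in range(M1) for m2 in range(M2)}
print("max average sum-rate reward over", n, "uses:", round(V(pi0, 1) / n, 3), "bits/use")
```

Solving the fixed-point equation (34)/(38) itself requires handling the continuous belief state, e.g. by discretization or function approximation; the finite-horizon recursion above is only meant to illustrate the DP structure.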

V Conclusion and future work

In this paper, we considered the problem of finding the capacity of the discrete memoryless multiple access channel with noiseless feedback. In [2], the authors connected the problem of finding the capacity of this channel to the problem of minimizing the terminal probability of error, and showed that achieving capacity is a dynamic optimization problem. We build upon their work and show that there exists a state, and a dynamic program, to find the capacity of this channel. Thus we show that there is a single-letter characterization of the capacity expression whose alphabet is a belief state.

Future work involves deriving known results in this domain within this new framework, such as the capacity of the Gaussian MAC with feedback [6], the capacity of Cover-Leung channels [5], and the sum capacity of the N-user Gaussian MAC with feedback [11], and building upon them.

References

  • [1] G. Kramer, “Directed information for channels with feedback,” Ph.D. dissertation, ETH Series in Information Processing, Hartung-Gorre Verlag, Konstanz, 1998.
  • [2] A. Anastasopoulos and S. Pradhan, “Decentralized sequential active hypothesis testing and the MAC feedback capacity,” in Proc. IEEE International Symposium on Information Theory (ISIT), pp. 2085–2090, Jun. 2020.
  • [3] A. El Gamal and Y.-H. Kim, Network Information Theory. Cambridge University Press, 2011.
  • [4] N. T. Gaarder and J. K. Wolf, “The Capacity Region of a Multiple-Access Discrete Memoryless Channel Can Increase with Feedback,” IEEE Transactions on Information Theory, vol. 21, no. 1, pp. 100–102, 1975.
  • [5] T. M. Cover and C. S. Leung, “An Achievable Rate Region for the Multiple-Access Channel with Feedback,” IEEE Transactions on Information Theory, vol. 27, no. 3, pp. 292–298, 1981.
  • [6] L. H. Ozarow, “The Capacity of the White Gaussian Multiple Access Channel with Feedback,” IEEE Transactions on Information Theory, vol. 30, no. 4, pp. 623–629, 1984.
  • [7] A. Nayyar, A. Mahajan, and D. Teneketzis, “Decentralized stochastic control with partial history sharing: A common information approach,” IEEE Transactions on Automatic Control, vol. 58, no. 7, pp. 1644–1658, 2013.
  • [8] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
  • [9] M. Salehi, “Cardinality bounds on auxiliary variables in multiple-user theory via the method of Ahlswede and Körner,” Dept. Statistics, Stanford Univ., Stanford, CA, Tech. Rep. 33, 1978.
  • [10] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control. Prentice Hall, 1986.
  • [11] E. Sula, M. Gastpar, and G. Kramer, “Sum-rate capacity for symmetric Gaussian multiple access channels with feedback,” IEEE Transactions on Information Theory, vol. 66, no. 5, pp. 2860–2871, May 2020.