
Covariance Decomposition as a Universal Limit on Correlations in Networks

Salman Beigi1 and Marc-Olivier Renou2
1School of Mathematics, Institute for Research in Fundamental Sciences (IPM), P.O. Box 19395-5746, Tehran, Iran
2ICFO-Institut de Ciencies Fotoniques, 08860 Castelldefels (Barcelona), Spain
Abstract

Parties connected to independent sources through a network can generate correlations among themselves. Notably, the space of feasible correlations for a given network depends on the physical nature of the sources and the measurements performed by the parties. In particular, quantum sources give access to nonlocal correlations that cannot be generated classically. In this paper, we derive a universal limit on correlations in networks in terms of their covariance matrix. We show that in a network satisfying a certain condition, the covariance matrix of any feasible correlation can be decomposed as a sum of positive semidefinite matrices, each term of which corresponds to a source in the network. Our result is universal in the sense that it holds in any physical theory of correlations in networks, including the classical and quantum theories and all generalized probabilistic theories.

1 Introduction

The causal inference problem, i.e., determining the causes of observed phenomena, is a fundamental problem in many disciplines. Given two categories of variables, observed and latent, it asks to deduce the causation patterns between the two sets, where the former can be measured while the latter are hidden. More specifically, the problem is whether the statistics of the observed variables can be modeled with a given causal pattern [29, 18].

In this paper, we restrict ourselves to network causation patterns, which are determined by a bipartite graph with one part for observed variables and one part for latent variables [10, 6]. We can think of the vertices associated to latent variables as sources that produce signals independent of each other. We can think of the vertices associated to observed variables as parties who receive signals from their adjacent sources. After receiving these signals, each party performs a measurement whose output determines the value of the corresponding observed variable. For instance, if the signals are classical random variables, the measurement is a function of these variables. Here, we emphasize that the distributions of the signals (latent variables) are mutually independent, and that an observed variable equals the outcome of the measurement applied to the adjacent sources. See Figure 1 for an example of a network. Now, given such a network with n parties and a joint probability distribution p(a_{1},\dots,a_{n}) on their observed variables, the question is: can p(a_{1},\dots,a_{n}) be modeled as the output distribution of measurements of sources distributed in the network?

Quantum physics brings fundamentally new aspects to the causal inference problem [13, 11, 2]. The point is that the sources may produce quantum signals, i.e., multipartite entangled states, and parties may perform quantum measurements on the received signals. In this case, the answer to the causal inference problem may differ from its answer in the completely classical (local) setting. Indeed, already in the primary example of Bell’s scenario we observe nonlocal correlations in the quantum setting that cannot be modeled classically [5]; this is Bell’s nonlocality. It is the elementary manifestation of a broader phenomenon, called network nonlocality, in which quantum nonlocal correlations admitting a model in some network cannot be modeled classically in the same network.

The existence of quantum models, which give access to more correlations than classical ones, raises the question of the ultimate limits on correlations in networks. In the case of Bell’s scenario, this ultimate limit is given by the no-signaling principle [21]. We imagine that any correlation satisfying this principle can be obtained through ‘measurements’ of ‘states’ in some hypothetical Generalised Probabilistic Theory (GPT) [4, 26, 14, 28]. Such theories generalize quantum physics and formalize the Popescu-Rohrlich box [20, 25]. However, the no-signaling principle alone is insufficient to define the ultimate limits of correlations in networks [12, 3]. In the following, we define these limits through the causality principle (see Figure 1). This resolution, combined with the inflation technique [32], allows us to derive a universal limit on correlations in networks.

Refer to caption
Figure 1: This is an example of a network with two sources S_{\alpha},S_{\beta} and three parties A_{1},A_{2},A_{3}. Each party, after receiving the signal from its adjacent sources, performs a measurement to determine her output. We denote the output of party A_{i} by a_{i}, and the joint distribution of outputs by p(a_{1},a_{2},a_{3}).
First causality law: Imagine that this network is part of a larger network, yet the structure of the network in the neighborhood of A_{2} is the same. Then the marginal distribution p(a_{2}) in the larger network remains the same.
Second causality law: Since in this network A_{1} and A_{3} do not have a common source, we have that p(a_{1},a_{3})=p(a_{1})p(a_{3}).

Recently, Kale et al. [15] proved a limit on the set of distributions p(a_{1},\dots,a_{n}) that can be modeled in a network in the classical setting. They showed that the covariance matrix of functions f_{1}(a_{1}),\dots,f_{n}(a_{n}) of the output variables associated to such a distribution p(a_{1},\dots,a_{n}) can be decomposed as a sum of positive semidefinite matrices whose terms correspond to sources in the network. Later, the same result was proven by Åberg et al. [1] in the quantum setting, and discussed in Kraft et al. [16]. The main contribution of our paper is that this decomposition continues to hold, in a large class of networks, in any underlying physical theory satisfying the aforementioned causality principle. More precisely, we prove that in any such physical theory, for local scalar functions f_{1},\dots,f_{n} of the outputs of the parties, their covariance matrix can be decomposed into a sum of positive semidefinite terms that are associated to the sources of the network.

In the rest of this section, after giving some preliminary definitions, we present the statement of our main result and explain its proof techniques.

1.1 Main result

In this paper, we adopt Dirac’s ket-bra notation. We let [d]:=\{1,\dots,d\} and denote the computational basis of the d-dimensional complex vector space \mathbb{C}^{d} by \{|1\rangle,\dots,|d\rangle\}=\{|i\rangle:\,i\in[d]\}. The complex conjugate of x\in\mathbb{C} is \bar{x}. Capital letters represent matrices, and the (i,j)-th entry of a matrix M is denoted by M_{ij}. For two matrices M,N of the same size, M\circ N is the Schur or Hadamard product of M and N, that is, the matrix with entries

(M\circ N)_{ij}=M_{ij}\cdot N_{ij}.

M\succeq 0 indicates that M is positive semidefinite. We represent the Kronecker delta function by \delta_{k,\ell}.
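A fact about this notation worth recalling is the Schur product theorem: the Schur product of two positive semidefinite matrices is again positive semidefinite. A quick numerical illustration (the matrices below are arbitrary random choices, not from the paper):

```python
import numpy as np

# Two small PSD matrices: each is of the form G^T G, hence positive semidefinite.
rng = np.random.default_rng(0)
G1, G2 = rng.normal(size=(2, 3, 3))
M = G1.T @ G1
N = G2.T @ G2

# Schur (Hadamard) product: entrywise multiplication, (M ∘ N)_{ij} = M_{ij} N_{ij}.
S = M * N

# Schur product theorem: the Hadamard product of PSD matrices is PSD.
eigvals = np.linalg.eigvalsh(S)
print(eigvals.min() >= -1e-10)  # True
```

Here `*` is numpy's entrywise product, matching the definition of M\circ N above.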

Networks.

A network \mathcal{N} is a bipartite graph whose vertex set consists of sources and parties. The sources are denoted by S_{\alpha} for \alpha=1,\dots,m, and the parties are denoted by A_{i} for i=1,\dots,n. S_{\alpha}\to A_{i} (and sometimes \alpha\to i) indicates that the source S_{\alpha} is connected to the party A_{i}. We assume throughout this paper that the networks do not have isolated vertices, meaning that each source is adjacent to at least one party, and each party is adjacent to at least one source. Moreover, we assume that there are no two sources in the network whose sets of adjacent parties are comparable under inclusion, since otherwise one of them is redundant and can be merged with the other one.

Output distribution.

In the network \mathcal{N}, each source S_{\alpha} sends signals to its adjacent parties, and each party A_{i}, after receiving signals from its adjacent sources, performs a measurement and outputs the result a_{i}. The joint distribution of these outputs is denoted by p(a_{1},\dots,a_{n}).

Underlying physical theory.

In this paper, we do not make any assumption about the nature of signals and measurements. Hence, we consider any correlation which can be obtained in any Generalized Probabilistic Theory (GPT), i.e., from any hypothetical model for correlations in networks beyond classical and quantum physics.

The causal structure of the Bell scenario can be derived from the no-signaling principle alone. This approach fails in general networks. Hence, we need to accept the network causal structure. Here, we assume the causality principle, from which two laws follow. By the first causality law, the marginal distribution of the outputs of a group of parties depends only on the sources adjacent to them, and not on the structure of the network beyond those sources. By the second causality law, two groups of parties who do not share a common source are uncorrelated, and their marginal distribution factorizes. We think of these two causality laws as the only generic restrictions on GPTs in networks.¹ See Figure 1.

¹Our first causality law follows from the no-signaling principle, and our second causality law is called the independence principle in [12, 3]. Thus, as an alternative, we can also take the No-Signaling and Independence principles (called NSI [12, 3]) instead of the causality principle. See [19] for a discussion of these different approaches.

We also assume that we can make independent and identical copies of the sources and parties of a network \mathcal{N}. That is, we assume that we can repeat the process (experiment) under which the sources produce their signals and the parties perform their measurements. For instance, for a source S_{\alpha} of \mathcal{N} that emits some signal, we can make an independent copy S_{\alpha^{\prime}} of it that emits the same signal. We emphasize that we do not clone this signal; we only repeat the process under which it is produced. We give more details on these assumptions and their consequences in Subsection 2.1.

Covariance matrix.

Suppose that each party A_{i} applies a real or complex function f_{i}(a_{i}) to her output. Associated to the distribution p(a_{1},\dots,a_{n}) and these functions, we consider the covariance matrix \mathcal{C}=\mathcal{C}(f_{1},\dots,f_{n}). This is an n\times n matrix whose (i,j)-th entry equals

\mathcal{C}_{ij}=\mathsf{Cov}(f_{i},f_{j})=\mathsf{E}[\bar{f}_{i}f_{j}]-\mathsf{E}[\bar{f}_{i}]\cdot\mathsf{E}[f_{j}]=\mathsf{E}\big{[}(\bar{f}_{i}-\mathsf{E}[\bar{f}_{i}])\cdot(f_{j}-\mathsf{E}[f_{j}])\big{]},

where the expectations are with respect to the output distribution p(a1,,an)p(a_{1},\dots,a_{n}). It is well-known and can easily be verified that the covariance matrix is always positive semidefinite.
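As a sanity check of this definition, the following sketch computes \mathcal{C} for a hypothetical joint distribution on three binary outputs and complex-valued local functions, and verifies positive semidefiniteness; the distribution and the functions are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
# An arbitrary (hypothetical) joint distribution p(a1, a2, a3) on binary outputs.
p = rng.random((2, 2, 2))
p /= p.sum()

# Local functions f_i(a_i), complex-valued for illustration.
f = [np.array([1.0, -1.0]), np.array([0.0, 1.0j]), np.array([2.0, 3.0])]

outcomes = [(a1, a2, a3) for a1 in (0, 1) for a2 in (0, 1) for a3 in (0, 1)]
E = lambda g: sum(p[a] * g(a) for a in outcomes)  # expectation under p

n = 3
C = np.empty((n, n), dtype=complex)
for i in range(n):
    for j in range(n):
        # C_ij = E[conj(f_i) f_j] - E[conj(f_i)] E[f_j]
        C[i, j] = (E(lambda a: np.conj(f[i][a[i]]) * f[j][a[j]])
                   - E(lambda a: np.conj(f[i][a[i]])) * E(lambda a: f[j][a[j]]))

# A covariance matrix is always Hermitian and positive semidefinite.
print(np.linalg.eigvalsh(C).min() >= -1e-9)  # True
```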

In this paper, we are interested in certain decompositions of covariance matrices.

Definition 1 (\mathcal{N}-compatible matrix decomposition).

Let \mathcal{N} be a network with sources S_{\alpha}, \alpha=1,\dots,m, and parties A_{i}, i=1,\dots,n. For any source S_{\alpha}, let \mathcal{L}_{\alpha} be the set of n\times n matrices M_{\alpha} such that

  • M_{\alpha} is positive semidefinite,

  • The (i,j)-th entry of M_{\alpha} is non-zero only if \alpha\to i and \alpha\to j.

Then we say that a positive semidefinite n\times n matrix M admits an \mathcal{N}-compatible matrix decomposition if there are M_{\alpha}\in\mathcal{L}_{\alpha}, \alpha=1,\dots,m, such that M=\sum_{\alpha}M_{\alpha}.

For an example of a network-compatible matrix decomposition, consider the network \mathcal{N} of Figure 1. As mentioned in the caption of this figure, by the independence principle p(a_{1},a_{3})=p(a_{1})p(a_{3}). This means that \mathcal{C}_{13}=\bar{\mathcal{C}}_{31}=\mathsf{Cov}(f_{1},f_{3})=0. Then an \mathcal{N}-compatible matrix decomposition for the covariance matrix \mathcal{C} takes the form

\mathcal{C}=\begin{pmatrix}\mathsf{Var}(f_{1})&\mathsf{Cov}(f_{1},f_{2})&0\\ \mathsf{Cov}(f_{2},f_{1})&\mathsf{Var}(f_{2})&\mathsf{Cov}(f_{2},f_{3})\\ 0&\mathsf{Cov}(f_{3},f_{2})&\mathsf{Var}(f_{3})\end{pmatrix}=\begin{pmatrix}\ast&\ast&0\\ \ast&\ast&0\\ 0&0&0\end{pmatrix}+\begin{pmatrix}0&0&0\\ 0&\ast&\ast\\ 0&\ast&\ast\end{pmatrix},

where the two matrices on the right-hand side are positive semidefinite and belong to \mathcal{L}_{\alpha} and \mathcal{L}_{\beta}, respectively.
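For concreteness, the following sketch exhibits such a decomposition for a hypothetical covariance matrix of the Figure 1 network (the numbers are illustrative): the variance of f_{2} is split between the two sources so that each block is positive semidefinite.

```python
import numpy as np

# Hypothetical covariance matrix for the network of Figure 1 (C_13 = 0).
C = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.8],
              [0.0, 0.8, 1.0]])

# Split Var(f_2) between the two sources: the alpha block is PSD iff its
# (2,2) entry is at least C_12^2 / C_11, and similarly for the beta block.
v_a = C[0, 1] ** 2 / C[0, 0]
v_b = C[1, 2] ** 2 / C[2, 2]
assert v_a + v_b <= C[1, 1]  # a decomposition exists for this particular C

M_alpha = np.array([[C[0, 0], C[0, 1], 0.0],
                    [C[1, 0], v_a,     0.0],
                    [0.0,     0.0,     0.0]])
M_beta = C - M_alpha  # supported on parties {2, 3}, as required

print(np.allclose(M_alpha + M_beta, C))            # True
print(np.linalg.eigvalsh(M_alpha).min() >= -1e-9)  # True
print(np.linalg.eigvalsh(M_beta).min() >= -1e-9)   # True
```

Note that not every split of \mathsf{Var}(f_{2}) works; the choice above makes the \alpha block singular and leaves the rest to the \beta block.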

In this paper, we prove that the covariance matrix \mathcal{C}(f_{1},\dots,f_{n}) defined above admits an \mathcal{N}-compatible matrix decomposition under the minimal aforementioned assumptions on the underlying physical theory. To prove our result we need to restrict to a subclass of networks first introduced in [22].

Definition 2 (No double common-source network).

A network \mathcal{N} is called a No Double Common-Source (NDCS) network if for any two parties A_{i}\neq A_{j}, there is at most one source S_{\alpha} adjacent to both of them.

Observe that a network in which all sources are bipartite (i.e., each source is adjacent to exactly two parties) is automatically NDCS. (Recall that we assume there is no redundant source in the network.) We can now formally state the main result of this paper.

Theorem 3 (Main result).

Let \mathcal{N} be an NDCS network with sources S_{\alpha}, \alpha=1,\dots,m, and parties A_{i}, i=1,\dots,n. Consider signals emitted by the sources and measurements performed by the parties in an arbitrary underlying physical theory that satisfies the no-signaling and independence principles. Suppose that this results in an output joint distribution p(a_{1},\dots,a_{n}). Let f_{i}(a_{i}) be a function of the output of party A_{i}. Then the covariance matrix \mathcal{C}(f_{1},\dots,f_{n}) admits an \mathcal{N}-compatible matrix decomposition.

In [15] and [1] the above theorem was proven for the classical and quantum theories, without the NDCS assumption and for vector-valued functions beyond scalar ones. Nevertheless, Theorem 3 establishes the validity of the covariance decomposition in the box world [28, 26] as well as in any GPT [4, 14].

Note that this theorem is tight, in the sense that given an NDCS network \mathcal{N} and a positive semidefinite matrix M that admits an \mathcal{N}-compatible matrix decomposition, there exists an output distribution of the network whose covariance matrix is M. Indeed, letting M=\sum_{\alpha}M_{\alpha} be an \mathcal{N}-compatible matrix decomposition of M, assume that source \alpha sends a multivariate Gaussian distribution with covariance matrix M_{\alpha} to its adjacent parties (each coordinate to its associated party). Then, if each party outputs the sum of all received (one-dimensional) Gaussian signals, the joint output distribution is a Gaussian distribution whose covariance matrix is M.
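This Gaussian construction is easy to simulate. The sketch below uses a hypothetical decomposition M=M_{\alpha}+M_{\beta} for a network with sources \alpha\to\{A_{1},A_{2}\} and \beta\to\{A_{2},A_{3}\} (the matrices are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical N-compatible decomposition M = M_alpha + M_beta for a network
# with sources alpha -> {A1, A2} and beta -> {A2, A3}.
M_alpha = np.array([[1.0, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]])
M_beta = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.7], [0.0, 0.7, 1.0]])
M = M_alpha + M_beta

# Each source emits a centered Gaussian with covariance M_alpha (resp. M_beta),
# and each party outputs the sum of the signals it receives.
N = 200_000
signals = (rng.multivariate_normal(np.zeros(3), M_alpha, size=N)
           + rng.multivariate_normal(np.zeros(3), M_beta, size=N))

# The empirical covariance of the outputs approaches M.
print(np.allclose(np.cov(signals.T), M, atol=0.05))  # True
```

The singular covariance matrices are handled by sampling a degenerate Gaussian supported on the coordinates of the source's adjacent parties.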

Proof ideas.

Our main conceptual idea in proving Theorem 3 is the so-called inflation technique [32] (see Subsection 2.1). Using non-fanout inflations, we consider certain expansions of the network \mathcal{N} obtained by adding copies of the sources and parties and connecting them based on rules similar to those in \mathcal{N}. Then we consider the covariance matrix associated to the inflated network. As a covariance matrix, it is again positive semidefinite. This fact imposes some conditions on the original covariance matrix. We then show that these conditions imply that the covariance matrix admits an \mathcal{N}-compatible matrix decomposition.

From a more technical point of view, we first note that, fixing a network \mathcal{N}, the space of covariance matrices associated to \mathcal{N} forms a convex cone (see Definition 4). On the other hand, the space of matrices that admit an \mathcal{N}-compatible matrix decomposition is also a convex cone. To prove Theorem 3 we need to show that these two convex cones coincide. To this end, based on the theory of dual cones (see Subsection 2.2), it suffices to show that the associated dual cones are equal. Next, to prove equality in the dual picture, we use the inflation technique as discussed above. Putting these together, the theorem is proven when \mathcal{N} contains only bipartite sources (see Theorem 8). For general NDCS networks we also need the theory of embezzlement, first introduced in [30] in the context of entanglement theory (see Subsection 2.3).

2 Main tools

In this section we explain the three main tools we use in the proof of Theorem 3, namely, non-fanout inflations, the theory of dual cones, and the theory of embezzlement.

2.1 Non-fanout inflation

Refer to caption
Figure 2: The network on the right is a non-fanout inflation of the triangle network on the left. Here, parties A_{i}^{(1)},A_{i}^{(2)} are two copies of A_{i}, for i=1,2,3. Moreover, sources \alpha^{(1)},\alpha^{(2)} are two independent copies of \alpha, and similarly for \beta,\gamma. We note that, e.g., the party A_{1}^{(1)} is adjacent to copies \beta^{(1)},\gamma^{(1)} of \beta,\gamma, respectively, which are adjacent to A_{1} in the original network. Thus, A_{1}^{(1)} receives signals of the same types as the ones received by A_{1} in the triangle, and can behave exactly the same as A_{1}; she can perform the same measurement on the received signals. According to the setup of the proof of Lemma 10, this inflation corresponds to the sign function \epsilon(\alpha)=\epsilon(\gamma)=+1 and \epsilon(\beta)=-1.

The inflation technique [32] is a method to obtain constraints over the output correlations produced in a network 𝒩\mathcal{N}. Here, we give a formal definition of a restricted class of inflations called non-fanout inflations.

Consider a network \mathcal{N} with sources S_{\alpha}, \alpha=1,\dots,m, and parties A_{i}, i=1,\dots,n, outputting a_{1},\dots,a_{n} with joint distribution p(a_{1},\dots,a_{n}). In the inflated network of order d\geq 2, which we denote by \widetilde{\mathcal{N}}, we consider d copies of each source S_{\alpha}, which we denote by S_{\alpha}^{(1)},\dots,S_{\alpha}^{(d)}, and d copies of each party A_{i}, which we denote by A_{i}^{(1)},\dots,A_{i}^{(d)}. Let S_{\alpha} and A_{i} be an adjacent source and party in \mathcal{N}. We assume that their associated copies are connected in \widetilde{\mathcal{N}} as follows. Fix some permutation \pi_{i}^{\alpha} on \{1,\dots,d\}. Then we assume that S_{\alpha}^{(k)} is adjacent to A_{i}^{(\pi_{i}^{\alpha})^{-1}(k)} for any 1\leq k\leq d. Observe that in \widetilde{\mathcal{N}}, any party A_{i}^{(k)} is adjacent to the same number of sources, and of the same types, as in \mathcal{N}. Similarly, any source S_{\alpha}^{(k)} in \widetilde{\mathcal{N}} is adjacent to the same number of parties, and of the same types, as in \mathcal{N}. Therefore, sources and parties in \widetilde{\mathcal{N}} can behave similarly to the sources and parties in the original network by producing the same signals and performing the same measurements. The output of A_{i}^{(k)} is denoted by a_{i}^{(k)}, and the joint distribution of the outputs is denoted by

p\Big{(}a_{1}^{(1)},\dots,a_{1}^{(d)},a_{2}^{(1)},\dots,a_{2}^{(d)},\dots\dots,a_{n}^{(1)},\dots,a_{n}^{(d)}\Big{)}. (1)

For an example of a non-fanout inflation see Figure 2.
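The bookkeeping of a non-fanout inflation can be sketched in code. The triangle network and the random choice of permutations below are illustrative; the check at the end confirms that every party copy sees exactly the same source types as its original, as described above.

```python
import random

# Triangle network: three bipartite sources (an illustrative example).
parties = [1, 2, 3]
sources = {"alpha": (2, 3), "beta": (1, 3), "gamma": (1, 2)}

def inflate(d, seed=0):
    """Order-d non-fanout inflation: for each adjacent (source, party) pair pick
    a permutation pi of {0,...,d-1}; copy S^(k) connects to copy A^(pi^{-1}(k))."""
    rng = random.Random(seed)
    edges = []  # ((source, k), (party, copy)) adjacencies of the inflated network
    for name, adj in sources.items():
        perms = {i: rng.sample(range(d), d) for i in adj}   # pi_i^alpha
        for i in adj:
            inv = {perms[i][l]: l for l in range(d)}        # (pi_i^alpha)^{-1}
            for k in range(d):
                edges.append(((name, k), (i, inv[k])))
    return edges

edges = inflate(d=2)

# Every party copy is adjacent to exactly one copy of each of its original sources.
for i in parties:
    for c in range(2):
        types = sorted(s for (s, _), (j, cc) in edges if (j, cc) == (i, c))
        assert types == sorted(s for s, adj in sources.items() if i in adj)
print("degrees and source types preserved")
```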

In this paper, we assume that the underlying physical theory satisfies causality in the network, which is defined through the following two laws:

First causality law.

The marginal distribution of the outputs of a group of parties depends only on the structure of the network near those parties. For instance, suppose that A_{i}^{(k)},A_{j}^{(k^{\prime})} share a source S_{\alpha}^{(\ell)} in \widetilde{\mathcal{N}} and that S_{\alpha}^{(\ell)} is the unique such source. This means that S_{\alpha}\to A_{i},A_{j} in \mathcal{N}. Then the first law states that the marginal distribution of the outputs of A_{i}^{(k)},A_{j}^{(k^{\prime})} in \widetilde{\mathcal{N}} coincides with that of A_{i},A_{j} in \mathcal{N}, i.e., p\big{(}a_{i}^{(k)},a_{j}^{(k^{\prime})}\big{)}=p(a_{i},a_{j}). The point is that the correlation between parties A_{i}^{(k)},A_{j}^{(k^{\prime})} in \widetilde{\mathcal{N}} is created in the very same way as the one between parties A_{i},A_{j} in \mathcal{N}.

As a consequence of this law, if A_{i},A_{j} compute functions f_{i},f_{j} on their outputs, and similarly A_{i}^{(k)},A_{j}^{(k^{\prime})} compute the identical functions f_{i}^{(k)},f_{j}^{(k^{\prime})}, respectively, then we have \mathsf{Cov}(f_{i}^{(k)},f_{j}^{(k^{\prime})})=\mathsf{Cov}(f_{i},f_{j}).

Second causality law.

If two groups of parties do not share any common source, then their marginal distribution factorizes. For instance, if A_{i}^{(k)},A_{j}^{(k^{\prime})} do not share any source in \widetilde{\mathcal{N}} (which may be the case even if A_{i},A_{j} share a source in \mathcal{N}), then the marginal distribution p\big{(}a_{i}^{(k)},a_{j}^{(k^{\prime})}\big{)} factorizes as p\big{(}a_{i}^{(k)},a_{j}^{(k^{\prime})}\big{)}=p\big{(}a_{i}^{(k)}\big{)}p\big{(}a_{j}^{(k^{\prime})}\big{)}=p(a_{i})p(a_{j}), where the second equality follows from the first law. This, in particular, means that if f_{i}^{(k)} and f_{j}^{(k^{\prime})} are functions of the outputs of A_{i}^{(k)},A_{j}^{(k^{\prime})}, then we have \mathsf{Cov}(f_{i}^{(k)},f_{j}^{(k^{\prime})})=0.

Importantly, these two laws and inflation are more than a method to find limitations on correlations. They provide the definition of the ultimate limits of correlations in networks, and hence should be satisfied by any GPT in a network (see [7, 19]). We emphasize that we only consider non-fanout inflations, meaning that we do not clone signals. In the classical theory signals can be cloned, so, e.g., a bipartite source in \mathcal{N} may become tripartite in \widetilde{\mathcal{N}}. Indeed, the inflation technique with unbounded fanout exactly characterizes distributions in classical networks [17]. Here, as we consider arbitrary theories, we restrict only to non-fanout inflations. Such inflations have also been considered in [12].

2.2 Cones and duality

One of the main tools that we use to prove our main theorem is the theory of dual cones [8]. Here, restricting to cones of real or complex self-adjoint (Hermitian) matrices, we review the basic definitions and properties. Let us first introduce the Hilbert-Schmidt inner product on the space of matrices, given by \langle X,Y\rangle:=\text{\rm tr}(X^{\dagger}Y), where X^{\dagger} is the adjoint of X. Cones and dual cones are defined as follows:

Definition 4.

A subset \mathcal{K} of self-adjoint n\times n matrices is called a convex cone if

  • For X\in\mathcal{K} and r\geq 0 we have rX\in\mathcal{K},

  • For any X_{1},X_{2}\in\mathcal{K} we have X_{1}+X_{2}\in\mathcal{K}.

Moreover, the dual cone of \mathcal{K} is defined by \mathcal{K}^{*}:=\{Y\,|\,\langle X,Y\rangle\geq 0,~\forall X\in\mathcal{K}\}.

An important example of a convex cone is the set of positive semidefinite matrices, which is self-dual.
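Self-duality of the PSD cone can be probed numerically: \text{tr}(XY)\geq 0 whenever X,Y\succeq 0, while any non-PSD Y is separated from the cone by a rank-one PSD witness built from an eigenvector with negative eigenvalue. A small sketch (the matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

def rand_psd(n):
    G = rng.normal(size=(n, n))
    return G @ G.T  # G G^T is always positive semidefinite

# One direction of self-duality: <X, Y> = tr(X Y) >= 0 for all PSD X, Y.
ok = all(np.trace(rand_psd(4) @ rand_psd(4)) >= 0 for _ in range(1000))
print(ok)  # True

# Conversely, a non-PSD Y is separated from the cone by some PSD X:
# take X = v v^T for an eigenvector v of Y with negative eigenvalue.
Y = np.diag([1.0, -2.0])
v = np.array([[0.0], [1.0]])
print(np.trace((v @ v.T) @ Y))  # -2.0 < 0, witnessing that Y is not PSD
```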

In the proof of Theorem 3 we will use the following basic properties of dual cones, whose proofs are left for Appendix A.

Lemma 5.

[8] The following hold for both the real and complex scalar fields:

  • (i) For any cone \mathcal{K}, we have \mathcal{K}^{*}=(\,\overline{\mathcal{K}}\,)^{*}, where \overline{\mathcal{K}} is the closure of \mathcal{K}. Moreover, the dual cone \mathcal{K}^{*} is convex and closed.

  • (ii) For any two cones \mathcal{L}\subseteq\mathcal{K} we have \mathcal{K}^{*}\subseteq\mathcal{L}^{*}.

  • (iii) For any closed convex cone \mathcal{K} we have (\mathcal{K}^{*})^{*}=\mathcal{K}.

  • (iv) For any closed convex cones \mathcal{K}_{1},\mathcal{K}_{2}, the intersection \mathcal{K}_{1}\cap\mathcal{K}_{2} is a closed convex cone.

  • (v) For any closed convex cones \mathcal{K}_{1},\mathcal{K}_{2}, their sum \mathcal{K}_{1}+\mathcal{K}_{2}=\big{\{}X_{1}+X_{2}\,|\,X_{1}\in\mathcal{K}_{1},X_{2}\in\mathcal{K}_{2}\big{\}} is a convex cone (but not necessarily closed).

  • (vi) For any closed convex cones \mathcal{K}_{1},\mathcal{K}_{2} we have (\mathcal{K}_{1}\cap\mathcal{K}_{2})^{*}=\overline{\mathcal{K}_{1}^{*}+\mathcal{K}_{2}^{*}}.

  • (vii) For any closed convex cones \mathcal{K}_{1},\mathcal{K}_{2} we have (\mathcal{K}_{1}+\mathcal{K}_{2})^{*}=\mathcal{K}_{1}^{*}\cap\mathcal{K}_{2}^{*}.

2.3 Embezzlement

Embezzlement is another pivotal idea in our proof of Theorem 3. Entanglement embezzlement is a compelling notion in quantum information theory, first introduced in [30]. In the following we present the construction of the universal embezzling family used in this paper, but without referring to entanglement. For any integer R define

|\mu_{R}\rangle:=\frac{1}{\sqrt{\chi_{R}}}\sum_{r=1}^{R}\frac{1}{\sqrt{r}}|r\rangle, (2)

where \chi_{R}=\sum_{r=1}^{R}r^{-1} is the R-th harmonic number, which serves as a normalization factor.

Lemma 6.

[30] For any \varepsilon>0, integer d, and sufficiently large R the following holds. Let |\phi\rangle=\sum_{j=1}^{d}c_{j}|j\rangle be a normalized d-dimensional vector satisfying c_{j}\geq 0 for all j. Then there exists a permutation \pi:[dR]\to[dR] such that

\langle\mu_{R}|P_{\pi}\,(|\phi\rangle\otimes|\mu_{R}\rangle)\geq 1-\varepsilon.

Here we identified [d]\times[R] with [dR], e.g., via (j,r)\mapsto(j-1)R+r, and P_{\pi} is the permutation matrix associated with \pi, given by P_{\pi}|j\rangle\otimes|r\rangle=|\pi((j-1)R+r)\rangle. Note that by abuse of notation we consider |\mu_{R}\rangle as a vector in both \mathbb{C}^{R} and \mathbb{C}^{dR} (by padding it with zero coordinates).

For the sake of completeness, we give the proof of this lemma in Appendix B.
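The content of Lemma 6 can also be checked numerically for small cases. The sketch below uses the sorting permutation: arranging the coordinates of |\phi\rangle\otimes|\mu_{R}\rangle in decreasing order to match the decreasing coordinates of |\mu_{R}\rangle, which by the rearrangement inequality maximizes the overlap over all permutations. The vector |\phi\rangle is an arbitrary example.

```python
import numpy as np

def mu(R):
    """|mu_R>: coordinates proportional to 1/sqrt(r), r = 1..R, as in Eq. (2)."""
    v = 1.0 / np.sqrt(np.arange(1.0, R + 1))
    return v / np.linalg.norm(v)

phi = np.array([0.6, 0.8])  # an arbitrary normalized vector with c_j >= 0

overlaps = []
for R in (10, 1000, 100_000):
    prod = np.kron(phi, mu(R))                  # |phi> ⊗ |mu_R>
    best = np.sort(prod)[::-1]                  # the sorting permutation P_pi
    target = np.concatenate([mu(R), np.zeros(len(prod) - R)])  # padded |mu_R>
    overlaps.append(float(target @ best))
    print(R, overlaps[-1])                      # overlap approaches 1 with R
```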

We now generalize the above lemma to the case where the coordinates of |\phi\rangle are not necessarily non-negative. For this generalization we will use another class of states. For an integer T, let \omega=e^{2\pi i/T} be a T-th root of unity and define

|\vartheta_{T}\rangle=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\omega^{t}|t\rangle. (3)
Lemma 7.

For any \varepsilon>0, integer d, and sufficiently large R,T the following holds. Let |\phi\rangle=\sum_{j=1}^{d}c_{j}|j\rangle be an arbitrary normalized d-dimensional vector. Then there exists a permutation \pi:[TdR]\to[TdR] such that

(\langle\vartheta_{T}|\otimes\langle\mu_{R}|)P_{\pi}\,(|\vartheta_{T}\rangle\otimes|\phi\rangle\otimes|\mu_{R}\rangle)\geq 1-\varepsilon.

Here, as in the previous lemma, we identify [T]\times[d]\times[R] with [TdR] via a fixed map, and by abuse of notation we consider |\vartheta_{T}\rangle\otimes|\mu_{R}\rangle as a vector in both \mathbb{C}^{T}\otimes\mathbb{C}^{R} and \mathbb{C}^{TdR} (by padding it with zero coordinates).

Proof.

Let c_{j}=b_{j}e^{2\pi i\theta_{j}} with 0\leq\theta_{j}<1 and b_{j}\geq 0. Then for any j there exists an integer 0\leq r_{j}<T such that

\big{|}\theta_{j}-\frac{r_{j}}{T}\big{|}\leq\frac{1}{T}. (4)

Let Q be the permutation matrix acting on \mathbb{C}^{T} that shifts the basis vectors by 1:

Q|t\rangle=|(t+1)\mod T\rangle,\qquad 1\leq t\leq T.

Then we have Q|\vartheta_{T}\rangle=\omega^{-1}|\vartheta_{T}\rangle. Next, let

\widetilde{Q}=\sum_{j=1}^{d}Q^{r_{j}}\otimes|j\rangle\langle j|.

Note that \widetilde{Q} is a permutation matrix acting on \mathbb{C}^{T}\otimes\mathbb{C}^{d}. Then we have

\widetilde{Q}\,|\vartheta_{T}\rangle\otimes|\phi\rangle=\sum_{j=1}^{d}c_{j}Q^{r_{j}}|\vartheta_{T}\rangle\otimes|j\rangle=\sum_{j=1}^{d}\omega^{-r_{j}}c_{j}|\vartheta_{T}\rangle\otimes|j\rangle=\sum_{j=1}^{d}b_{j}e^{2\pi i(\theta_{j}-r_{j}/T)}|\vartheta_{T}\rangle\otimes|j\rangle.

Therefore, letting |\hat{\phi}\rangle=\sum_{j=1}^{d}b_{j}|j\rangle and using (4), for sufficiently large T, the vector \widetilde{Q}\,|\vartheta_{T}\rangle\otimes|\phi\rangle is arbitrarily close to |\vartheta_{T}\rangle\otimes|\hat{\phi}\rangle. On the other hand, by Lemma 6, there is a permutation matrix P acting on |\hat{\phi}\rangle\otimes|\mu_{R}\rangle such that for sufficiently large R, the vector P|\hat{\phi}\rangle\otimes|\mu_{R}\rangle is arbitrarily close to |\mu_{R}\rangle. Putting these together, we find that for sufficiently large R,T, the vector (I\otimes P)(\widetilde{Q}\otimes I)|\vartheta_{T}\rangle\otimes|\phi\rangle\otimes|\mu_{R}\rangle is arbitrarily close to |\vartheta_{T}\rangle\otimes|\mu_{R}\rangle. We are done since (I\otimes P)(\widetilde{Q}\otimes I) is a product of permutation matrices, and hence a permutation matrix itself.

3 Bipartite-source networks

In this section, as a warm-up and to convey some of our ideas, we prove Theorem 3 for networks all of whose sources are bipartite and whose functions f_{i} are real. Such a network can be visualized as a graph with nodes representing parties and edges representing sources. Thus, for such networks, a source S_{\alpha} adjacent to two parties A_{i},A_{j} is denoted by S_{ij}, and sometimes by \alpha=(i,j).

In the following we restate Theorem 3 for bipartite-source networks and real functions.

Theorem 8 (Main result, bipartite-source networks).

Let \mathcal{N} be a network with parties A_{i}, i=1,\dots,n, and with sources S_{\alpha}, \alpha=1,\dots,m, that are all bipartite. Consider signals emitted by the sources and measurements performed by the parties in an arbitrary underlying physical theory that satisfies the no-signaling and independence principles. Suppose that this results in an output joint distribution p(a_{1},\dots,a_{n}). Let f_{i}(a_{i}) be a real function of the output of party A_{i}. Then the covariance matrix \mathcal{C}(f_{1},\dots,f_{n}) admits an \mathcal{N}-compatible matrix decomposition.

Our proof of this theorem proceeds in two steps. First, based on non-fanout inflations, we show in Lemma 10 that the Schur product of the covariance matrix \mathcal{C}(f_{1},\dots,f_{n}) with any sign matrix (to be defined) is positive semidefinite. Then, using the theory of dual cones, we show in Lemma 11 that any matrix with the aforementioned property admits an \mathcal{N}-compatible matrix decomposition.

Definition 9 (Sign matrix).

Consider a network 𝒩\mathcal{N} all of whose sources are bipartite. A sign function ϵ\epsilon on 𝒩\mathcal{N} is a function that assigns ϵ(α){±1}\epsilon(\alpha)\in\{\pm 1\} to any source SαS_{\alpha} of 𝒩\mathcal{N}. Then a sign matrix Γϵ=(γij)\Gamma_{\epsilon}=(\gamma_{ij}) associated to ϵ\epsilon is an n×nn\times n symmetric matrix such that

γij={1i=j,ϵ(α)ij&αi,j.\displaystyle\gamma_{ij}=\begin{cases}1&\qquad i=j,\\ \epsilon(\alpha)&\qquad i\neq j~{}\&~{}\alpha\to i,j.\end{cases} (5)

Note that if parties iji\neq j do not have a common source, then the entry γij\gamma_{ij} is unconstrained and can take any value.

We can now state our first lemma.

Lemma 10.

Let 𝒩\mathcal{N} be a network with parties A1,,AnA_{1},\dots,A_{n} and with bipartite sources. Let fi(ai)f_{i}(a_{i}) be an arbitrary real function of the output of AiA_{i}. Then for any sign matrix Γϵ\Gamma_{\epsilon}, the Schur product 𝒞(f1,,fn)Γϵ\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon} is positive semidefinite.

Note that although the entries γij\gamma_{ij} of a sign matrix Γϵ\Gamma_{\epsilon} for which i,ji,j do not have a common source are not uniquely determined in terms of ϵ\epsilon, the Schur product 𝒞(f1,,fn)Γϵ\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon} depends only on ϵ\epsilon (by the independence principle, the corresponding entries in 𝒞(f1,,fn)\mathcal{C}(f_{1},\dots,f_{n}) vanish, forcing those entries in 𝒞(f1,,fn)Γϵ\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon} to be zero).

Proof.

To prove this lemma we use the inflation technique discussed in Subsection 2.1. Consider the following non-fanout inflation of 𝒩\mathcal{N}. For any party AiA_{i} in 𝒩\mathcal{N} consider two parties Ai(1),Ai(2)A_{i}^{(1)},A_{i}^{(2)} in the inflated network 𝒩~\widetilde{\mathcal{N}}. Then for any source α=(i,j)\alpha=(i,j) in 𝒩\mathcal{N} consider two sources α(1),α(2)\alpha^{(1)},\alpha^{(2)} in 𝒩~\widetilde{\mathcal{N}}. If ϵ(α)=+1\epsilon(\alpha)=+1, we connect one of them to Ai(1),Aj(1)A_{i}^{(1)},A_{j}^{(1)} and the other one to Ai(2),Aj(2)A_{i}^{(2)},A_{j}^{(2)}. If ϵ(α)=1\epsilon(\alpha)=-1, then we connect one of them to Ai(1),Aj(2)A_{i}^{(1)},A_{j}^{(2)} and the other one to Ai(2),Aj(1)A_{i}^{(2)},A_{j}^{(1)}. See Figure 2 for an example of a sign function and its associated inflation.

Now suppose that party Ai(k)A_{i}^{(k)} computes function fif_{i} of her output, and let 𝒞~\widetilde{\mathcal{C}} be the covariance matrix of these 2n2n functions, that is a (2n)×(2n)(2n)\times(2n) positive semidefinite matrix. Suppose that we index rows and columns of 𝒞~\widetilde{\mathcal{C}} by iki_{k}, 1in1\leq i\leq n and k=1,2k=1,2, representing party Ai(k)A_{i}^{(k)}. Then as mentioned in Subsection 2.1 entries of 𝒞~\widetilde{\mathcal{C}} can be computed as follows:

  • 𝒞~ik,i=δk,𝖵𝖺𝗋(fi)\widetilde{\mathcal{C}}_{i_{k},i_{\ell}}=\delta_{k,\ell}\mathsf{Var}(f_{i}),

  • 𝒞~ik,j=0\widetilde{\mathcal{C}}_{i_{k},j_{\ell}}=0 if AiAjA_{i}\neq A_{j} do not have a common source in 𝒩\mathcal{N},

  • 𝒞~ik,j=δk,𝖢𝗈𝗏(fi,fj)\widetilde{\mathcal{C}}_{i_{k},j_{\ell}}=\delta_{k,\ell}\mathsf{Cov}(f_{i},f_{j}) if α=(i,j)\alpha=(i,j) is a source in 𝒩\mathcal{N} and ϵ(α)=+1\epsilon(\alpha)=+1,

  • 𝒞~ik,j=δ3k,𝖢𝗈𝗏(fi,fj)\widetilde{\mathcal{C}}_{i_{k},j_{\ell}}=\delta_{3-k,\ell}\mathsf{Cov}(f_{i},f_{j}) if α=(i,j)\alpha=(i,j) is a source in 𝒩\mathcal{N} and ϵ(α)=1\epsilon(\alpha)=-1.

Indeed, with the convention that the summation below runs over all ordered pairs (i,j)(i,j) sharing a source, as well as the diagonal pairs i=ji=j for which we set 𝖢𝗈𝗏(fi,fi)=𝖵𝖺𝗋(fi)\mathsf{Cov}(f_{i},f_{i})=\mathsf{Var}(f_{i}) and ϵ(i,i)=+1\epsilon(i,i)=+1, the covariance matrix 𝒞~\widetilde{\mathcal{C}} can be written as

𝒞~=α=(i,j)𝖢𝗈𝗏(fi,fj)|ij|Tϵ(α),\widetilde{\mathcal{C}}=\sum_{\alpha=(i,j)}\mathsf{Cov}(f_{i},f_{j})|i\rangle\langle j|\otimes T_{\epsilon(\alpha)}, (6)

where

T+1=(1001),T1=(0110).T_{+1}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\qquad\qquad T_{-1}=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}.

Next, introducing the Hadamard matrix

H=12(1111),H=\frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\ 1&-1\end{pmatrix},

we find that

(IH)𝒞~(IH)=α=(i,j)𝖢𝗈𝗏(fi,fj)|ij|(100ϵ(α)).(I\otimes H)\cdot\widetilde{\mathcal{C}}\cdot(I\otimes H^{\dagger})=\sum_{\alpha=(i,j)}\mathsf{Cov}(f_{i},f_{j})|i\rangle\langle j|\otimes\begin{pmatrix}1&0\\ 0&\epsilon(\alpha)\end{pmatrix}. (7)

Recall that 𝒞~\widetilde{\mathcal{C}} is positive semidefinite. Then the matrix on the right-hand side of (7), and any of its principal submatrices, is also positive semidefinite. This means that the submatrix consisting of rows and columns indexed by i2i_{2}, 1in1\leq i\leq n, which is equal to

α=(i,j)𝖢𝗈𝗏(fi,fj)ϵ(α)|ij|=𝒞Γϵ,\sum_{{\alpha}=(i,j)}\mathsf{Cov}(f_{i},f_{j})\epsilon(\alpha)|i\rangle\langle j|=\mathcal{C}\circ\Gamma_{\epsilon},

is positive semidefinite. ∎
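
To illustrate Lemma 10, consider the triangle network with classical ±1 sources, where each party outputs the sum of the two signals it receives; then Var(f_i)=2 and Cov(f_i,f_j)=1. The following numpy sketch (a toy check of ours, not part of the proof) verifies that the Schur product of this covariance matrix with every sign matrix is positive semidefinite.

```python
import numpy as np
from itertools import product

# Triangle network: parties A1, A2, A3; one source S_ij per pair (Ai, Aj).
# Classical model: each source emits an independent +-1 bit s_ij, and party
# Ai outputs f_i = sum of its two received bits. Exact covariances:
# Var(f_i) = 2 and Cov(f_i, f_j) = Var(s_ij) = 1.
C = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

def sign_matrix(e12, e13, e23):
    """Sign matrix Gamma_eps of Definition 9 for the triangle network."""
    return np.array([[1.0, e12, e13],
                     [e12, 1.0, e23],
                     [e13, e23, 1.0]])

# Lemma 10: C o Gamma_eps is positive semidefinite for every sign function.
for eps in product([+1, -1], repeat=3):
    schur = C * sign_matrix(*eps)      # Schur (entrywise) product
    assert np.linalg.eigvalsh(schur).min() > -1e-12, eps
print("all 8 sign matrices pass")
```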

We now show that any matrix satisfying the condition of Lemma 10 admits an 𝒩\mathcal{N}-compatible matrix decomposition.

Lemma 11.

Let 𝒩\mathcal{N} be a network with parties A1,,AnA_{1},\dots,A_{n} and with bipartite sources. Consider an n×nn\times n positive semidefinite matrix MM such that Mij=0M_{ij}=0 if AiAjA_{i}\neq A_{j} do not share a source. Then MM admits an 𝒩\mathcal{N}-compatible matrix decomposition if and only if M^\widehat{M} is positive semidefinite where

M^ij={Miii=j,|Mij|ij,\displaystyle\widehat{M}_{ij}=\begin{cases}M_{ii}&\quad i=j,\\ -|M_{ij}|&\quad i\neq j,\end{cases}

is the comparison matrix of MM.

We note that this lemma was first proven in [24] in the context of coherence theory, in the special case where the underlying graph is complete (i.e., every two parties share a source). Moreover, the relevance of this lemma to the causal inference problem was pointed out in [16]. This lemma also has applications in entanglement theory [27]. We generalize it in Lemma 14. Here we present a proof different from those of [24, 27].
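
As a concrete illustration of Lemma 11 (our own numerical sketch), the all-ones 3×3 matrix is positive semidefinite, yet its comparison matrix has a negative eigenvalue; by the lemma it therefore admits no triangle-compatible decomposition. This recovers the fact that three pairwise perfectly correlated, nonconstant outputs cannot be produced in the triangle network.

```python
import numpy as np

def comparison_matrix(M):
    """M-hat: keep the diagonal, negate absolute values off the diagonal."""
    H = -np.abs(M).astype(float)
    np.fill_diagonal(H, np.diag(M).real)
    return H

# The all-ones matrix is positive semidefinite ...
M = np.ones((3, 3))
assert np.linalg.eigvalsh(M).min() > -1e-12

# ... but its comparison matrix (= 2I - J) has eigenvalues {-1, 2, 2}, so by
# Lemma 11 it admits no triangle-compatible decomposition.
print(np.linalg.eigvalsh(comparison_matrix(M)).min())   # close to -1.0
```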

Proof.

We prove this lemma using the theory of dual cones. We first introduce two sets of positive semidefinite matrices as follows:

  • We let 𝒦\mathcal{K} be the set of n×nn\times n positive semidefinite matrices MM such that Mij=0M_{ij}=0 if AiAjA_{i}\neq A_{j} do not have a common source in 𝒩\mathcal{N}, and M^0\widehat{M}\succeq 0.

  • We let \mathcal{L} be the set of positive semidefinite n×nn\times n matrices that admit an 𝒩\mathcal{N}-compatible matrix decomposition. Note that using the notation we introduced in Definition 1 we have =αα\mathcal{L}=\sum_{\alpha}\mathcal{L}_{\alpha}.

The statement of the lemma says that 𝒦=\mathcal{K}=\mathcal{L}. As shown in Appendix C it is not hard to verify that both 𝒦,\mathcal{K},\mathcal{L} are closed convex cones. Then using Lemma 5, to establish 𝒦=\mathcal{K}=\mathcal{L} it suffices to prove that 𝒦\mathcal{L}\subseteq\mathcal{K} and 𝒦\mathcal{L}^{*}\subseteq\mathcal{K}^{*}.

𝒦\mathcal{L}\subseteq\mathcal{K}:

For any MM\in\mathcal{L} there is a decomposition M=αMαM=\sum_{\alpha}M_{\alpha} with MααM_{\alpha}\in\mathcal{L}_{\alpha}. We note that for any 2×22\times 2 positive semidefinite matrix XX we have X^0\widehat{X}\succeq 0. This means that M^α0\widehat{M}_{\alpha}\succeq 0 for any α\alpha. Therefore, M^=αM^α\widehat{M}=\sum_{\alpha}\widehat{M}_{\alpha} is positive semidefinite and M𝒦M\in\mathcal{K}.

𝒦\mathcal{L}^{*}\subseteq\mathcal{K}^{*}:

We need to show that for any XX\in\mathcal{L}^{*} and M𝒦M\in\mathcal{K} we have X,M=tr(XM)0\langle X,M\rangle=\text{\rm tr}(X^{\dagger}M)\geq 0. To this end, first remark that by Lemma 5, we have =αα\mathcal{L}^{*}=\bigcap_{\alpha}\mathcal{L}_{\alpha}^{*}. On the other hand, for α=(i,j)\alpha=(i,j), the cone α\mathcal{L}_{\alpha}^{*} is the set of matrices whose α\alpha-block, i.e., the principal submatrix consisting of rows and columns indexed by i,ji,j, is positive semidefinite. Therefore, XX\in\mathcal{L}^{*} means that for any α=(i,j)\alpha=(i,j) we have |Xij|XiiXjj|X_{ij}|\leq\sqrt{X_{ii}X_{jj}}. Then we have

tr(XM)\displaystyle\text{\rm tr}\left(X^{\dagger}M\right) =iMiiXii+α=(i,j)(MijX¯ij+MjiX¯ji)\displaystyle=\sum_{i}M_{ii}X_{ii}+\sum_{\alpha=(i,j)}\big{(}M_{ij}\bar{X}_{ij}+M_{ji}\bar{X}_{ji}\big{)}
iMiiXii2α=(i,j)|MijXij|\displaystyle\geq\sum_{i}M_{ii}X_{ii}-2\sum_{\alpha=(i,j)}|M_{ij}X_{ij}|
iMiiXii2α=(i,j)|Mij|XiiXjj\displaystyle\geq\sum_{i}M_{ii}X_{ii}-2\sum_{\alpha=(i,j)}|M_{ij}|\sqrt{X_{ii}X_{jj}}
v|M^|v\displaystyle\geq\langle v|\,\widehat{M}\,|v\rangle
0,\displaystyle\geq 0,

where |v|v\rangle is the vector with coordinates Xii\sqrt{X_{ii}}, 1in1\leq i\leq n. Here the last inequality holds since by assumption M𝒦M\in\mathcal{K} and M^0\widehat{M}\succeq 0. ∎
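
The dual step above can also be checked numerically. The sketch below (ours, for the triangle network, where every pair of parties shares a source) samples M∈𝒦 by making its comparison matrix diagonally dominant, hence positive semidefinite by Gershgorin's circle theorem, samples X∈ℒ* by enforcing |X_ij| ≤ √(X_ii X_jj), and verifies tr(XM) ≥ 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # triangle network: every pair of parties shares a source

for _ in range(200):
    # M in K: build M so that its comparison matrix c*I - |A| is diagonally
    # dominant, hence positive semidefinite by Gershgorin's circle theorem.
    A = rng.standard_normal((n, n)); A = A + A.T; np.fill_diagonal(A, 0.0)
    c = np.abs(A).sum(axis=1).max() + rng.random()
    M = c * np.eye(n) + A

    # X in L*: every 2x2 alpha-block PSD, i.e. |X_ij| <= sqrt(X_ii X_jj).
    d = rng.random(n) + 0.1
    s = rng.uniform(-1.0, 1.0, (n, n)); s = (s + s.T) / 2
    X = s * np.sqrt(np.outer(d, d)); np.fill_diagonal(X, d)

    assert np.trace(X @ M) >= -1e-10  # the inequality chain of Lemma 11
print("ok")
```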

We can now give the proof of Theorem 8.

Proof of Theorem 8.

By Lemma 11 we need to show that 𝒞^\widehat{\mathcal{C}} is positive semidefinite where 𝒞=𝒞(f1,,fn)\mathcal{C}=\mathcal{C}(f_{1},\dots,f_{n}). Since by assumption the entries of 𝒞\mathcal{C} are real, there is a sign matrix Γϵ\Gamma_{\epsilon} such that 𝒞^=𝒞Γϵ\widehat{\mathcal{C}}=\mathcal{C}\circ\Gamma_{\epsilon}; indeed, for any source α=(i,j)\alpha=(i,j) it suffices to choose ϵ(α)\epsilon(\alpha) opposite to the sign of 𝖢𝗈𝗏(fi,fj)\mathsf{Cov}(f_{i},f_{j}). Next, by Lemma 10 we know that 𝒞Γϵ\mathcal{C}\circ\Gamma_{\epsilon} is positive semidefinite. We are done. ∎
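
In other words, for real covariance matrices the comparison matrix is itself a Schur product with a sign matrix. A minimal numpy check of this observation (the sample matrix is ours, for a triangle network where any sign choice is a valid sign matrix):

```python
import numpy as np

def comparison_matrix(M):
    H = -np.abs(M); np.fill_diagonal(H, np.diag(M)); return H

# Choosing eps(alpha) = -sign(Cov(f_i, f_j)) for the source alpha = (i, j)
# turns the Schur product C o Gamma_eps into the comparison matrix of C.
C = np.array([[2.0,  1.0, -0.5],
              [1.0,  2.0,  1.0],
              [-0.5, 1.0,  2.0]])
Gamma = np.where(C >= 0, -1.0, 1.0)  # eps = -sign on the off-diagonal
np.fill_diagonal(Gamma, 1.0)         # diagonal of a sign matrix is +1
assert np.allclose(C * Gamma, comparison_matrix(C))
```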

We remark that in Theorem 8, for simplicity of presentation and to convey some of our ideas, we restricted to real functions fif_{i}. In the next section, where we prove Theorem 3 in its most general form, we cover complex functions fif_{i} as well. Nevertheless, here we briefly explain the changes that should be made in the above proofs in order to include complex functions.

Lemma 11 already works for complex matrices, so we only need to generalize Lemma 10. To this end, let ϵ\epsilon be a function that associates a norm-1 complex number ϵ(α)\epsilon(\alpha) to any source α\alpha. Then we define the generalised sign matrix Γϵ=(γij)\Gamma_{\epsilon}=(\gamma_{ij}) to be an n×nn\times n matrix with entries

γij={1i=j,ϵ(α)i<j&αi,j,ϵ¯(α)i>j&αi,j.\displaystyle\gamma_{ij}=\begin{cases}1&\qquad i=j,\\ \epsilon(\alpha)&\qquad i<j~{}\&~{}\alpha\to i,j,\\ \bar{\epsilon}(\alpha)&\qquad i>j~{}\&~{}\alpha\to i,j.\end{cases} (8)

We note that for any positive semidefinite matrix MM, there is an appropriate ϵ\epsilon such that M^=MΓϵ\widehat{M}=M\circ\Gamma_{\epsilon}. Thus for complex functions fif_{i}, we need to generalize Lemma 10 and show that 𝒞(f1,,fn)Γϵ0\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon}\succeq 0 for any function ϵ(α)\epsilon(\alpha) with |ϵ(α)|=1|\epsilon(\alpha)|=1. To prove such a lemma, observe that for sufficiently large dd, there are 0tα<d0\leq t_{\alpha}<d such that |ϵ(α)ωtα||\epsilon(\alpha)-\omega^{t_{\alpha}}| is arbitrarily small, where ω=e2πi/d\omega=e^{2\pi i/d} is a primitive dd-th root of unity. Then by a continuity argument it suffices to show that 𝒞(f1,,fn)Γϵ0\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon}\succeq 0 in the special case where ϵ(α)=ωtα\epsilon(\alpha)=\omega^{t_{\alpha}}. To prove this, we consider an inflation of 𝒩\mathcal{N} with dd copies per source and party. Letting α1,,αd\alpha_{1},\dots,\alpha_{d} be copies of α=(i,j)\alpha=(i,j), we connect αk\alpha_{k} to parties Ai(k)A_{i}^{(k)} and Aj(k+tα)A_{j}^{(k+t_{\alpha})}, the kk-th and (k+tα)(k+t_{\alpha})-th copies of AiA_{i} and AjA_{j} respectively, with indices taken modulo dd. Next, we consider the covariance matrix associated to this inflated network, which is an (nd)×(nd)(nd)\times(nd) positive semidefinite matrix. The d×dd\times d block of this matrix associated to the source α=(i,j)\alpha=(i,j) equals 𝖢𝗈𝗏(fi,fj)\mathsf{Cov}(f_{i},f_{j}) times a permutation matrix that is a cyclic shift by tαt_{\alpha}. Then, conjugating this (nd)×(nd)(nd)\times(nd) matrix with the block-diagonal matrix all of whose diagonal blocks are the dd-dimensional Fourier transform, we can simultaneously diagonalize all blocks of the (nd)×(nd)(nd)\times(nd) covariance matrix. Next, putting the second diagonal element of these blocks together we obtain an n×nn\times n matrix that is equal to 𝒞(f1,,fn)Γϵ\mathcal{C}(f_{1},\dots,f_{n})\circ\Gamma_{\epsilon}.
Thus, this matrix is positive semidefinite since it is a principal submatrix of a positive semidefinite matrix. This gives the proof of Lemma 10 for complex functions.
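
The Fourier-diagonalization step can be verified directly in numpy (our sketch; the sign of the exponent in the resulting eigenvalues depends on the shift and Fourier conventions, so the second diagonal entry below comes out as ω^(-t) rather than ω^t):

```python
import numpy as np

d, t = 5, 2
omega = np.exp(2j * np.pi / d)          # primitive d-th root of unity

S = np.roll(np.eye(d), t, axis=0)       # cyclic shift: S|k> = |k + t mod d>
F = omega ** np.outer(np.arange(d), np.arange(d)) / np.sqrt(d)  # Fourier

# The Fourier transform diagonalizes every cyclic shift simultaneously;
# the m-th diagonal entry of F^dag S F is omega^(-m*t).
D = F.conj().T @ S @ F
assert np.allclose(F @ F.conj().T, np.eye(d))
assert np.allclose(D, np.diag(omega ** (-t * np.arange(d))))
print(D[1, 1])   # second diagonal entry: omega^(-t)
```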

In the next section, building on the above ideas, we generalize Theorem 8 for arbitrary NDCS networks and for arbitrary complex functions.

4 NDCS networks

In this section we prove Theorem 3 in its most general form. As in the previous section, the proof is in two steps. We first show in Lemma 13 (which generalizes Lemma 10) that the Schur product of the covariance matrix 𝒞(f1,,fn)\mathcal{C}(f_{1},\dots,f_{n}) with any twisted Gram matrix (to be defined) is positive semidefinite. We prove this using non-fanout inflations. Then in Lemma 14 (which generalizes Lemma 11) we show that any matrix with the aforementioned property admits an 𝒩\mathcal{N}-compatible matrix decomposition. To prove this result we use the theory of dual cones and the idea of embezzlement discussed in Subsection 2.3.

Let us first generalize the notion of sign matrices. Here, the sign function in Definition 9 is replaced by a collection of permutations and a set of vectors.

Definition 12 (Twisted Gram matrix).

Let 𝒩\mathcal{N} be an arbitrary NDCS network with nn parties A1,,AnA_{1},\dots,A_{n}. Let |ψ1,,|ψnd|\psi_{1}\rangle,\dots,|\psi_{n}\rangle\in\mathbb{C}^{d} be arbitrary dd-dimensional vectors. Also for any source α\alpha and parties AiA_{i} with αi\alpha\to i, let πiα\pi_{i}^{\alpha} be an arbitrary permutation on {1,,d}\{1,\dots,d\}. Let PπiαP_{\pi_{i}^{\alpha}} be the associated permutation matrix acting on d\mathbb{C}^{d} (i.e., Pπiα|x=|πiα(x)P_{\pi_{i}^{\alpha}}|x\rangle=|\pi_{i}^{\alpha}(x)\rangle for any 1xd1\leq x\leq d). Then a twisted Gram matrix associated to these vectors and permutations is an n×nn\times n matrix WW such that

Wij={ψi|ψii=j,ψi|PπiαPπjα|ψjij&αi,j.\displaystyle W_{ij}=\begin{cases}\langle\psi_{i}|\psi_{i}\rangle&\qquad i=j,\\ \langle\psi_{i}|P^{\dagger}_{\pi_{i}^{\alpha}}P_{\pi_{j}^{\alpha}}|\psi_{j}\rangle&\qquad i\neq j~{}\&~{}\alpha\to i,j.\end{cases} (9)

Note that, when parties iji\neq j do not share any common source, WijW_{ij} is unconstrained and can take any value. Also, note that WW is well-defined since by the NDCS assumption if Ai,AjA_{i},A_{j} have a common source, then it is unique.
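
A twisted Gram matrix is straightforward to construct numerically. The following sketch (ours, for the triangle network, with random vectors and permutations) builds WW from Definition 12 and confirms that every α-block is a 2×2 Gram matrix and hence positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 3
# Triangle network: one source alpha = (i, j) per pair of parties, so the
# NDCS condition holds (each pair shares exactly one source).
sources = {(0, 1), (0, 2), (1, 2)}

psi = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for _ in range(n)]
# one permutation matrix per (party, source) incidence
perm = {(i, a): np.eye(d)[rng.permutation(d)] for a in sources for i in a}

W = np.zeros((n, n), dtype=complex)
for i in range(n):
    W[i, i] = np.vdot(psi[i], psi[i])            # <psi_i|psi_i>
for (i, j) in sources:
    Pi, Pj = perm[(i, (i, j))], perm[(j, (i, j))]
    W[i, j] = np.vdot(Pi @ psi[i], Pj @ psi[j])  # <psi_i|Pi^dag Pj|psi_j>
    W[j, i] = np.conj(W[i, j])

# Each alpha-block of W is a 2x2 Gram matrix, hence positive semidefinite.
for (i, j) in sources:
    block = W[np.ix_([i, j], [i, j])]
    assert np.linalg.eigvalsh(block).min() > -1e-9
print("all alpha-blocks PSD")
```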

We can now generalize Lemma 10.

Lemma 13.

Let 𝒩\mathcal{N} be an arbitrary NDCS network with nn parties A1,,AnA_{1},\dots,A_{n}. Let fi(ai)f_{i}(a_{i}), i=1,,ni=1,\dots,n, be an arbitrary scalar function of the output of AiA_{i}. Then for any twisted Gram matrix WW, the Schur product 𝒞(f1,,fn)W\mathcal{C}(f_{1},\dots,f_{n})\circ W is positive semidefinite.

Proof.

Let WW be a twisted Gram matrix given by (9). Note that the entries of WW that are not specified by (9) are not important since by the independence principle the corresponding entries in 𝒞(f1,,fn)\mathcal{C}(f_{1},\dots,f_{n}) vanish, so 𝒞(f1,,fn)W\mathcal{C}(f_{1},\dots,f_{n})\circ W is independent of those entries of WW. We consider a non-fanout inflation of the network as follows. For any party AiA_{i} we consider dd copies Ai(1),,Ai(d)A_{i}^{(1)},\dots,A_{i}^{(d)}, and for any source SαS_{\alpha} we consider dd copies Sα(1),,Sα(d)S_{\alpha}^{(1)},\dots,S_{\alpha}^{(d)}. If source SαS_{\alpha} is adjacent to party AiA_{i} in 𝒩\mathcal{N}, in the inflated network we connect source Sα(k)S_{\alpha}^{(k)} to the party Ai((πiα)1(k))A_{i}^{((\pi_{i}^{\alpha})^{-1}(k))}. This fully describes the inflated network. Next, we assume that party Ai(k)A_{i}^{(k)} computes fi(k)f_{i}^{(k)} that is identical to fif_{i}. Then the covariance matrix

𝒞~=𝒞(f1(1),,f1(d),f2(1),,f2(d),,fn(1),,fn(d)),\widetilde{\mathcal{C}}=\mathcal{C}\big{(}f_{1}^{(1)},\dots,f_{1}^{(d)},f_{2}^{(1)},\dots,f_{2}^{(d)},\dots\dots,f_{n}^{(1)},\dots,f_{n}^{(d)}\big{)},

is positive semidefinite. Observe that 𝒞~\widetilde{\mathcal{C}} is an (nd)×(nd)(nd)\times(nd) matrix consisting of d2d^{2} blocks of size n×nn\times n. These blocks labeled by pairs (i,j)(i,j) with 1i,jn1\leq i,j\leq n and denoted by 𝒞~ij\widetilde{\mathcal{C}}_{ij} are described as follows:

  • The (i,i)(i,i)-block is diagonal with 𝖵𝖺𝗋(fi)=𝖢𝗈𝗏(fi,fi)\mathsf{Var}(f_{i})=\mathsf{Cov}(f_{i},f_{i}) on the diagonal, i.e., 𝒞~ii=𝖵𝖺𝗋(fi)I\widetilde{\mathcal{C}}_{ii}=\mathsf{Var}(f_{i})I.

  • If iji\neq j and parties Ai,AjA_{i},A_{j} do not share any source, then the (i,j)(i,j)-block is 𝒞~ij=0\widetilde{\mathcal{C}}_{ij}=0.

  • If iji\neq j and parties Ai,AjA_{i},A_{j} share source SαS_{\alpha}, then the (i,j)(i,j)-block equals 𝒞~ij=𝖢𝗈𝗏(fi,fj)PπiαPπjα\widetilde{\mathcal{C}}_{ij}=\mathsf{Cov}(f_{i},f_{j})P_{\pi_{i}^{\alpha}}^{\dagger}P_{\pi_{j}^{\alpha}}. See Figure 3 to verify this.

Figure 3: If αAi,Aj\alpha\to A_{i},A_{j} in 𝒩\mathcal{N}, then in the inflated network Sα(k)S_{\alpha}^{(k)} is adjacent to Ai((πiα)1(k))A_{i}^{((\pi_{i}^{\alpha})^{-1}(k))} and Aj((πjα)1(k))A_{j}^{((\pi_{j}^{\alpha})^{-1}(k))} for any kk. This means that Sα(πjα(k))S_{\alpha}^{(\pi_{j}^{\alpha}(k))} is adjacent to Aj(k)A_{j}^{(k)} and Ai((πiα)1πjα(k))A_{i}^{((\pi_{i}^{\alpha})^{-1}\circ\pi_{j}^{\alpha}(k))}.

Let RiR_{i} be a d×dd\times d matrix such that

R_{i}R_{i}^{\dagger}=R_{i}^{\dagger}R_{i}=\|\psi_{i}\|^{2}I\qquad\quad R_{i}|1\rangle=|\psi_{i}\rangle.

Such a matrix can be constructed as follows. If |ψi=0|\psi_{i}\rangle=0, then let Ri=0R_i=0. Otherwise, start with 1ψi|ψi\frac{1}{\|\psi_i\|}|\psi_{i}\rangle and extend it to an orthonormal basis; put those basis vectors in the columns of a matrix, with 1ψi|ψi\frac{1}{\|\psi_i\|}|\psi_{i}\rangle as the first column, to get a unitary matrix; finally multiply this unitary by ψi\|\psi_i\| to obtain RiR_i. Next, let RR be the (nd)×(nd)(nd)\times(nd) block diagonal matrix whose ii-th block on the diagonal is RiR_i. Then R𝒞~RR^{\dagger}\widetilde{\mathcal{C}}R is a positive semidefinite matrix consisting of blocks Ri𝒞~ijRjR_i^{\dagger}\widetilde{\mathcal{C}}_{ij}R_j. Therefore, the principal submatrix of R𝒞~RR^{\dagger}\widetilde{\mathcal{C}}R consisting of the (1,1)(1,1)-entry of all these blocks is also a positive semidefinite matrix. Denoting this n×nn\times n matrix by MM, the entries of MM are computed as follows:

  • Mii=(Ri𝒞~iiRi)11=𝖵𝖺𝗋(fi)ψi2M_{ii}=(R_{i}^{\dagger}\,\widetilde{\mathcal{C}}_{ii}\,R_{i})_{11}=\mathsf{Var}(f_{i})\|\psi_{i}\|^{2}.

  • If iji\neq j and parties Ai,AjA_{i},A_{j} do not share any source, then Mij=0M_{ij}=0.

  • If iji\neq j and parties Ai,AjA_{i},A_{j} share source SαS_{\alpha}, then Mij=𝖢𝗈𝗏(fi,fj)ψi|PπiαPπjα|ψjM_{ij}=\mathsf{Cov}(f_{i},f_{j})\langle\psi_{i}|P_{\pi_{i}^{\alpha}}^{\dagger}P_{\pi_{j}^{\alpha}}|\psi_{j}\rangle.

Therefore,

M=𝒞(f1,,fn)W,M=\mathcal{C}(f_{1},\dots,f_{n})\circ W,

and it is a positive semidefinite matrix. We are done. ∎
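
The matrices R_i used above can be constructed explicitly, for instance via a QR decomposition. The helper below (ours) completes ψ_i/∥ψ_i∥ to an orthonormal basis, so that R_i is ∥ψ_i∥ times a unitary whose first column is ψ_i/∥ψ_i∥.

```python
import numpy as np

def scaled_unitary_with_first_column(psi):
    """Build R with R R^dag = R^dag R = ||psi||^2 I and R|1> = |psi>:
    complete psi/||psi|| to an orthonormal basis via QR, then rescale."""
    d = len(psi); norm = np.linalg.norm(psi)
    if norm == 0:
        return np.zeros((d, d))
    A = np.column_stack([psi, np.random.randn(d, d - 1)])
    Q, _ = np.linalg.qr(A)
    # QR fixes the first column only up to a phase; correct it.
    ov = np.vdot(Q[:, 0], psi)
    Q[:, 0] *= ov / abs(ov)
    return norm * Q

psi = np.array([3.0, 4.0, 0.0]) / 5 * 2.0          # ||psi|| = 2
R = scaled_unitary_with_first_column(psi)
assert np.allclose(R @ R.conj().T, np.linalg.norm(psi) ** 2 * np.eye(3))
assert np.allclose(R[:, 0], psi)                   # R|1> = |psi>
```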

In the following lemma we present a description of the dual cone of positive semidefinite matrices that admit a network-compatible matrix decomposition. This lemma may be of independent interest, notably in coherence and entanglement theory.

Lemma 14.

Let 𝒩\mathcal{N} be an arbitrary NDCS network with nn parties AiA_{i}, i=1,,ni=1,\dots,n and sources SαS_{\alpha}, α=1,,m\alpha=1,\dots,m. Let \mathcal{L} be the cone of positive semidefinite matrices that admit an 𝒩\mathcal{N}-compatible matrix decomposition. Then

=𝒲¯,\mathcal{L}^{*}=\overline{\mathcal{W}},

where 𝒲\mathcal{W} is the set of twisted Gram matrices of Definition 12, and 𝒲¯\overline{\mathcal{W}} is its closure.

Note that this lemma is a generalization of Lemma 11, expressed in the dual picture. In the bipartite case, twisted Gram matrices reduce to (generalized) sign matrices, which have a much simpler description. Then the dual cone 𝒲\mathcal{W}^{*}, which is the cone of interest, is easily characterized: it consists of all matrices whose comparison matrix is positive semidefinite. This provides the explicit characterization of \mathcal{L} in Lemma 11. In the multipartite case, the set 𝒲\mathcal{W} has a much more complicated structure, so characterizing the dual cone 𝒲\mathcal{W}^{*} is not easy. To clarify this point, note that when all sources are bipartite, the permutations that we need to use are either transpositions (in the real case) or shifts (in the complex case). Such permutations can be simultaneously diagonalized using the Fourier transform, as we did in the proof of Lemma 10 and mentioned at the end of Section 3. In the multipartite case, however, we need to use arbitrary permutations, which cannot in general be simultaneously diagonalized. This makes it difficult to obtain a simple direct description of the cone 𝒲\mathcal{W}.

Proof.

For any n×nn\times n matrix MM we let M(α)M^{(\alpha)} be the principal submatrix of MM consisting of rows and columns indexed by parties ii with αi\alpha\to i. We call M(α)M^{(\alpha)} the α\alpha-block of MM. Let α\mathcal{L}_{\alpha} be the cone of positive semidefinite matrices MM whose only non-zero entries are in their α\alpha-block, i.e., Mij0M_{ij}\neq 0 only if αi,j\alpha\to i,j. We note that as the dual of the cone of positive semidefinite matrices is itself, α\mathcal{L}_{\alpha}^{*} is the space of matrices whose α\alpha-block is positive semidefinite. On the other hand, by definition =αα\mathcal{L}=\sum_{\alpha}\mathcal{L}_{\alpha}, so by Lemma 5 we have

=αα.\mathcal{L}^{*}=\bigcap_{\alpha}\mathcal{L}_{\alpha}^{*}.

That is, \mathcal{L}^{*} consists of matrices whose α\alpha-block for any source SαS_{\alpha} is positive semidefinite.

𝒲¯\overline{\mathcal{W}}\subseteq\mathcal{L}^{*}:

By definition, any α\alpha-block of any matrix W𝒲W\in\mathcal{W} is a Gram matrix and is positive semidefinite. Then by the above discussion, we have 𝒲\mathcal{W}\subseteq\mathcal{L}^{*}. On the other hand \mathcal{L}^{*}, as a dual cone, is closed. Therefore, 𝒲¯\overline{\mathcal{W}}\subseteq\mathcal{L}^{*}.

𝒲¯\mathcal{L}^{*}\subseteq\overline{\mathcal{W}}:

Let MM\in\mathcal{L}^{*}. In the following we show that MM can be approximated by elements of 𝒲\mathcal{W} with arbitrary precision, which shows that M𝒲¯M\in\overline{\mathcal{W}}. To this end, we note that if iji\neq j do not share any source, the (i,j)(i,j)-entries of matrices in 𝒲\mathcal{W} can take any value. So such entries of MM can be ignored in approximating MM with elements of 𝒲\mathcal{W}. Indeed, it suffices to consider entries of MM that belong to an α\alpha-block for some source α\alpha.

Since by the characterization of \mathcal{L}^{*}, the α\alpha-block of MM is positive semidefinite, there are vectors |ϕiα|\phi_{i}^{\alpha}\rangle, for any ii with αi\alpha\to i, such that

Mij=(M(α))ij=ϕiα|ϕjα.M_{ij}=(M^{(\alpha)})_{ij}=\langle\phi_{i}^{\alpha}|\phi_{j}^{\alpha}\rangle.

By padding these vectors with zero coordinates if necessary, we assume that |ϕiαd|\phi_{i}^{\alpha}\rangle\in\mathbb{C}^{d} for all i,αi,\alpha, for some dd. To prove the lemma we need to show that for any ε>0\varepsilon>0 there are vectors |ψi|\psi_{i}\rangle and permutations πiα\pi_{i}^{\alpha} such that

|ψi|PπiαPπjα|ψjϕiα|ϕjα|ε.\big{|}\langle\psi_{i}|P_{\pi_{i}^{\alpha}}^{\dagger}P_{\pi_{j}^{\alpha}}|\psi_{j}\rangle-\langle\phi_{i}^{\alpha}|\phi_{j}^{\alpha}\rangle\big{|}\leq\varepsilon.

To construct these vectors we use Lemma 7. Let

|\psi_{i}\rangle=\|\phi_{i}^{\alpha}\|\cdot|\vartheta_{T}\rangle\otimes|\mu_{R}\rangle=\sqrt{M_{ii}}\,|\vartheta_{T}\rangle\otimes|\mu_{R}\rangle,

where |μR|\mu_{R}\rangle and |ϑT|\vartheta_{T}\rangle are defined in (2) and (3) respectively. Note that ϕiα=Mii\|\phi_{i}^{\alpha}\|=\sqrt{M_{ii}} does not depend on α\alpha, so |ψi|\psi_{i}\rangle is well-defined as a single vector per party. By Lemma 7 there exists a permutation πiα\pi_{i}^{\alpha} such that

\langle\psi_{i}|P^{\dagger}_{\pi_{i}^{\alpha}}\big(|\vartheta_{T}\rangle\otimes|\phi_{i}^{\alpha}\rangle\otimes|\mu_{R}\rangle\big)\geq M_{ii}\,\kappa_{T,R},

where κT,R1\kappa_{T,R}\to 1 as T,RT,R\to\infty. This means that for sufficiently large T,RT,R, the vector Pπiα|ψiP_{\pi_{i}^{\alpha}}|\psi_{i}\rangle is arbitrarily close to |ϑT|ϕiα|μR|\vartheta_{T}\rangle\otimes|\phi_{i}^{\alpha}\rangle\otimes|\mu_{R}\rangle. Therefore, for sufficiently large T,RT,R we have

ε|ψi|PπiαPπjα|ψjϑT|ϑTϕiα|ϕjαμR|μR|=|ψi|PπiαPπjα|ψjϕiα|ϕjα|.\varepsilon\geq\big{|}\langle\psi_{i}|P_{\pi_{i}^{\alpha}}^{\dagger}P_{\pi_{j}^{\alpha}}|\psi_{j}\rangle-\langle\vartheta_{T}|\vartheta_{T}\rangle\cdot\langle\phi_{i}^{\alpha}|\phi_{j}^{\alpha}\rangle\cdot\langle\mu_{R}|\mu_{R}\rangle\big{|}=\big{|}\langle\psi_{i}|P_{\pi_{i}^{\alpha}}^{\dagger}P_{\pi_{j}^{\alpha}}|\psi_{j}\rangle-\langle\phi_{i}^{\alpha}|\phi_{j}^{\alpha}\rangle\big{|}.

We are done. ∎

Putting Lemma 13 and Lemma 14 together, we can now prove Theorem 3.

Proof of Theorem 3.

Using the notation of Lemma 14 we need to show that

𝒞=𝒞(f1,,fn).\mathcal{C}=\mathcal{C}(f_{1},\dots,f_{n})\in\mathcal{L}.

To this end, using ()=(\mathcal{L}^{*})^{*}=\mathcal{L}, it suffices to show that for any WW\in\mathcal{L}^{*} we have

𝒞,W=tr(𝒞W)0.\langle\mathcal{C},W\rangle=\text{\rm tr}(\mathcal{C}W)\geq 0.

Then using =𝒲¯\mathcal{L}^{*}=\overline{\mathcal{W}} established in Lemma 14 and by a continuity argument, we may assume that W𝒲W\in\mathcal{W} and takes the form (9). Now letting |J=j=1n|j|J\rangle=\sum_{j=1}^{n}|j\rangle, we have

𝒞,W=J|𝒞W|J.\langle\mathcal{C},W\rangle=\langle J|\mathcal{C}\circ W|J\rangle.

On the other hand, by Lemma 13 the matrix 𝒞W\mathcal{C}\circ W is positive semidefinite. Therefore, 𝒞,W0\langle\mathcal{C},W\rangle\geq 0. ∎
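
The identity ⟨𝒞,W⟩ = ⟨J|𝒞∘W|J⟩ used above is elementary; in the real symmetric case it reads tr(CW) = Σ_ij C_ij W_ij, and can be checked directly (random sample matrices ours; in the complex Hermitian case the analogous identity holds with the appropriate conjugations).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
C = rng.standard_normal((n, n)); C = C + C.T     # real symmetric
W = rng.standard_normal((n, n)); W = W + W.T

J = np.ones(n)                                   # |J> = sum_j |j>
# For real symmetric matrices, tr(C W) = <J| C o W |J> = sum_ij C_ij W_ij.
assert np.isclose(np.trace(C @ W), J @ (C * W) @ J)
```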

5 Conclusion

In this paper we showed that the covariance matrix of any output distribution of NDCS networks can be written as a summation of certain positive semidefinite matrices. This result holds in the local classical theory, the quantum theory, the box world [28, 26] and more generally in any GPT [4, 14, 31] compatible with the network structure. To our knowledge, our result provides the first universal limit on correlations in generic networks, beyond constraints derived in specific ones (see [12, 3]). As also mentioned in [15, 1] this covariance decomposition condition can be stated as a semidefinite program and can be verified efficiently.

Our proof technique is valid only for the subclass of NDCS networks, while the covariance decomposition is known to hold for all networks in the case of local classical and quantum models. Hence, the necessity of the NDCS assumption in the case of generic GPTs is an open question: does the network-compatible matrix decomposition hold in all GPTs for networks that do not satisfy the NDCS assumption? Another open problem is to generalize our results from scalar functions to vector-valued ones.

The main conceptual tool in our proof is the inflation technique. It is proven in [17] that inflations with unbounded fanout characterize correlations in the local classical model. On the other hand, non-fanout inflation allows for a characterization of the ultimate limits on correlations in networks [12, 7, 19]. Our results show that even non-fanout inflations induce quite powerful limits on correlations in networks. To our knowledge, our work contains the first proof exploiting inflated networks of arbitrary sizes.

Note that the covariance matrix of a correlation depends only on its bipartite marginals. Yet, even the simple hexagon inflation of the triangle network (see Figure 2) allows for proving restrictions on multipartite marginals [12]. Thus, it would be interesting to find generic limits imposed on multipartite correlation functions by considering well-chosen families of non-fanout inflations, possibly of unbounded size. For instance, it would be interesting to see if Finner's inequality [9], as a constraint on multipartite correlations, holds for arbitrary GPTs. This inequality is proven in [23] for the quantum as well as the box world when the underlying network is the triangle.

Finally, let us mention that our results may be of interest beyond causal inference in networks, particularly in coherence and entanglement theory. In Lemma 11 and Lemma 14 we characterized the space of matrices that admit a network-compatible matrix decomposition. Lemma 11 has applications in coherence and entanglement theory. Thus, Lemma 14, as a generalization of this lemma, may find applications in these theories or elsewhere. Moreover, the proof of this lemma shows that the idea of embezzlement may find applications in matrix analysis beyond entanglement theory.

Acknowledgements.

The authors are thankful to Amin Gohari for bringing the example of Gaussian sources to their attention to show the tightness of Theorem 3; and to Elie Wolfe, Stefano Pironio and Nicolas Gisin for clarifying the different ways to define the ultimate limits of correlations in networks. M.-O.R. is supported by the Swiss National Fund Early Mobility Grant P2GEP2_191444 and acknowledges the Government of Spain (FIS2020-TRANQI and Severo Ochoa CEX2019-000910-S), Fundació Cellex, Fundació Mir-Puig, Generalitat de Catalunya (CERCA, AGAUR SGR 1381) and the ERC AdG CERQUTE.

References

  • [1] J. Åberg, R. Nery, C. Duarte, and R. Chaves. Semidefinite tests for quantum network topologies. Physical Review Letters, 125(11):110505, 2020.
  • [2] J.-M. A. Allen, J. Barrett, D. C. Horsman, C. M. Lee, and R. W. Spekkens. Quantum common causes and quantum causal models. Phys. Rev. X, 7:031021, Jul 2017.
  • [3] J.-D. Bancal and N. Gisin. Non-local boxes for networks. arXiv preprint arXiv:2102.03597, 2021.
  • [4] J. Barrett. Information processing in generalized probabilistic theories. Phys. Rev. A, 75:032304, Mar 2007.
  • [5] J. S. Bell. On the Einstein Podolsky Rosen paradox. Physics Physique Fizika, 1(3):195, 1964.
  • [6] C. Branciard, N. Gisin, and S. Pironio. Characterizing the Nonlocal Correlations Created via Entanglement Swapping. Phys. Rev. Lett., 104:170401, Apr 2010.
  • [7] X. Coiteux-Roy, E. Wolfe, and M.-O. Renou. In preparation. 2021.
  • [8] J. Dattorro. Convex optimization & Euclidean distance geometry. Meboo Publishing, 2010.
  • [9] H. Finner. A Generalization of Hölder's Inequality and Some Probability Inequalities. The Annals of Probability, 20:1893–1901, 1992.
  • [10] T. Fritz. Beyond Bell’s theorem: correlation scenarios. New Journal of Physics, 14(10):103001, oct 2012.
  • [11] T. Fritz. Beyond Bell's theorem II: Scenarios with arbitrary causal structure. Communications in Mathematical Physics, 341:391–434, 2016.
  • [12] N. Gisin, J.-D. Bancal, Y. Cai, P. Remy, A. Tavakoli, E. Z. Cruzeiro, S. Popescu, and N. Brunner. Constraints on nonlocality in networks from no-signaling and independence. Nature Communications, 11(1):1–6, 2020.
  • [13] J. Henson, R. Lal, and M. F. Pusey. Theory-independent limits on correlations from generalized Bayesian networks. New Journal of Physics, 16:113043, 2014.
  • [14] P. Janotta and R. Lal. Generalized probabilistic theories without the no-restriction hypothesis. Phys. Rev. A, 87:052131, May 2013.
  • [15] A. Kela, K. Von Prillwitz, J. Åberg, R. Chaves, and D. Gross. Semidefinite tests for latent causal structures. IEEE Transactions on Information Theory, 66(1):339–349, 2019.
  • [16] T. Kraft, C. Spee, X.-D. Yu, and O. Gühne. Characterizing quantum networks: Insights from coherence theory. arXiv preprint arXiv:2006.06693, 2020.
  • [17] M. Navascués and E. Wolfe. The inflation technique completely solves the causal compatibility problem. Journal of Causal Inference, 8(1):70–91, 2020.
  • [18] J. Pearl. Causality. Cambridge University Press, 2009.
  • [19] S. Pironio. In preparation. 2021.
  • [20] S. Popescu and D. Rohrlich. Quantum Nonlocality as an Axiom. Found. Phys., 24(3):379–385, 1994.
  • [21] S. Popescu and D. Rohrlich. Causality and nonlocality as axioms for quantum mechanics. In Causality and Locality in Modern Physics, pages 383–389. Springer, 1998.
  • [22] M.-O. Renou and S. Beigi. Network nonlocality via rigidity of token-counting and color-matching. arXiv preprint arXiv:2011.02769, 2020.
  • [23] M.-O. Renou, Y. Wang, S. Boreiri, S. Beigi, N. Gisin, and N. Brunner. Limits on correlations in networks for quantum and no-signaling resources. Physical Review Letters, 123(7):070403, 2019.
  • [24] M. Ringbauer, T. R. Bromley, M. Cianciaruso, L. Lami, W. S. Lau, G. Adesso, A. G. White, A. Fedrizzi, and M. Piani. Certification and quantification of multilevel quantum coherence. Physical Review X, 8(4):041007, 2018.
  • [25] V. Scarani. Feats, Features and Failures of the PR-box. In AIP Conference Proceedings. AIP, 2006.
  • [26] A. J. Short and J. Barrett. Strong nonlocality: a trade-off between states and measurements. New J. Phys., 12(3):033034, 2010.
  • [27] S. Singh and I. Nechita. The ppt2 conjecture holds for all choi-type maps. arXiv preprint arXiv:2011.03809, 2020.
  • [28] P. Skrzypczyk and N. Brunner. Couplers for non-locality swapping. New J. Phys., 11(7):073014, 2009.
  • [29] P. Spirtes, C. Glymour, R. Scheines, D. Heckerman, C. Meek, G. Cooper, and T. Richardson. Causation, prediction, and search. MIT Press, 2000.
  • [30] W. van Dam and P. Hayden. Universal entanglement transformations without communication. Physical Review A, 67(6):060302, 2003.
  • [31] M. Weilenmann and R. Colbeck. Self-testing of physical theories, or, is quantum theory optimal with respect to some information-processing task? Phys. Rev. Lett., 125:060406, Aug 2020.
  • [32] E. Wolfe, R. W. Spekkens, and T. Fritz. The inflation technique for causal inference with latent variables. Journal of Causal Inference, 7(2), 2019.

Appendix A Proof of Lemma 5

Items (i), (ii), (iv) and (v) follow from the definition of dual cones.

To prove (iii), note that by definition $\mathcal{K}\subseteq(\mathcal{K}^*)^*$. To prove the inclusion in the other direction, let $Z\notin\mathcal{K}$. Since $\mathcal{K}$ is closed and convex, by the Hahn-Banach separation theorem there exist $Y$ and $c\in\mathbb{R}$ such that $\langle Y,X\rangle\geq c$ for all $X\in\mathcal{K}$ and $\langle Y,Z\rangle<c$. We note that by the definition of cones, $X=0$ belongs to $\mathcal{K}$. Therefore, $c\leq 0=\langle Y,0\rangle$. On the other hand, suppose that $X\in\mathcal{K}$ is such that $x=\langle Y,X\rangle<0$. Then letting $r=(c-1)/x>0$, we have $rX\in\mathcal{K}$, so $c\leq\langle Y,rX\rangle=r\langle Y,X\rangle=c-1$, which is a contradiction. This means that $\langle Y,X\rangle\geq 0$ for any $X\in\mathcal{K}$. As a result, $Y\in\mathcal{K}^*$, while $\langle Y,Z\rangle<c\leq 0$. This means that $Z$ does not belong to $(\mathcal{K}^*)^*$. Therefore, the complement of $\mathcal{K}$ is a subset of the complement of $(\mathcal{K}^*)^*$, which means that $(\mathcal{K}^*)^*\subseteq\mathcal{K}$.
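As a numerical sanity check of the bipolar identity in (iii) (an illustration only, not part of the proof), one can verify $(\mathcal{K}^*)^*=\mathcal{K}$ for a simple polyhedral cone in $\mathbb{R}^2$; the particular cone and its explicit dual generators below are hypothetical choices made for the example:

```python
import numpy as np

# K is the cone in R^2 generated by the rays (1,0) and (1,1), i.e.
#   K  = {x : x_2 >= 0 and x_1 >= x_2}.
# Its dual under the Euclidean inner product is
#   K* = {y : y_1 >= 0 and y_1 + y_2 >= 0},
# which is generated by the rays (0,1) and (1,-1).
K_gens = np.array([[1.0, 0.0], [1.0, 1.0]])
K_dual_gens = np.array([[0.0, 1.0], [1.0, -1.0]])

def in_K(x, tol=1e-9):
    # membership via the explicit inequality description of K
    return x[1] >= -tol and x[0] - x[1] >= -tol

def in_bidual(x, tol=1e-9):
    # x lies in (K*)* iff <x,h> >= 0 for every generator h of K*
    return all(x @ h >= -tol for h in K_dual_gens)

# the dual generators indeed have nonnegative inner product with K
assert all(h @ g >= -1e-9 for h in K_dual_gens for g in K_gens)

# (K*)* and K agree on random sample points
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=2)
    assert in_K(x) == in_bidual(x)
print("bipolar identity verified on 1000 samples")
```

For this polyhedral cone both memberships reduce to the same two linear inequalities, which is exactly the content of $(\mathcal{K}^*)^*=\mathcal{K}$.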

To prove (vi), note that by (ii) we have $\mathcal{K}_i^*\subseteq(\mathcal{K}_1\cap\mathcal{K}_2)^*$ for $i=1,2$. Then since $(\mathcal{K}_1\cap\mathcal{K}_2)^*$ is a closed convex cone and closed under summation, we have $\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}\subseteq(\mathcal{K}_1\cap\mathcal{K}_2)^*$. To prove the other direction, suppose that $Z\notin\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}$. Then by the Hahn-Banach separation theorem, following similar steps as in the proof of (iii), we find that there exists $Y$ such that $\langle Y,X\rangle\geq 0$ for all $X\in\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}$, and $\langle Y,Z\rangle<0$. Since $\mathcal{K}_i^*\subseteq\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}$ for $i=1,2$, by definition $Y\in(\mathcal{K}_i^*)^*$, and then by (iii) we have $Y\in\mathcal{K}_i$. This means that $Y\in\mathcal{K}_1\cap\mathcal{K}_2$. On the other hand, $\langle Y,Z\rangle<0$, so $Z\notin(\mathcal{K}_1\cap\mathcal{K}_2)^*$. We conclude that the complement of $\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}$ is a subset of the complement of $(\mathcal{K}_1\cap\mathcal{K}_2)^*$, which means that $(\mathcal{K}_1\cap\mathcal{K}_2)^*\subseteq\overline{\mathcal{K}_1^*+\mathcal{K}_2^*}$.
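Item (vi) can also be illustrated numerically for two polyhedral halfplane cones in $\mathbb{R}^2$, where no closure is needed; the specific cones below are hypothetical choices for the example:

```python
import numpy as np

# Illustration of (vi): take K1 = {x : x_1 >= 0} and
# K2 = {x : x_1 + x_2 >= 0}. Then K1* is the ray through (1,0) and
# K2* the ray through (1,1). The intersection K1 ∩ K2 is generated by
# the rays (0,1) and (1,-1), and both (K1 ∩ K2)* and K1* + K2* equal
# {y : y_2 >= 0 and y_1 >= y_2}.

def in_dual_of_intersection(y, tol=1e-9):
    # y lies in (K1 ∩ K2)* iff it has nonnegative inner product with
    # both generators (0,1) and (1,-1) of K1 ∩ K2
    return y @ np.array([0.0, 1.0]) >= -tol and y @ np.array([1.0, -1.0]) >= -tol

def in_sum_of_duals(y, tol=1e-9):
    # y = a*(1,0) + b*(1,1) with a,b >= 0 forces b = y_2, a = y_1 - y_2
    a, b = y[0] - y[1], y[1]
    return a >= -tol and b >= -tol

rng = np.random.default_rng(1)
for _ in range(1000):
    y = rng.normal(size=2)
    assert in_dual_of_intersection(y) == in_sum_of_duals(y)
print("(K1 ∩ K2)* = K1* + K2* verified on 1000 samples")
```

For polyhedral cones the sum $\mathcal{K}_1^*+\mathcal{K}_2^*$ is automatically closed, which is why the closure in (vi) plays no role in this toy case.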

For (vii), take the dual of both sides of (iv) and use (iii).

Appendix B Proof of Lemma 6

Let $|\lambda_R\rangle\in\mathbb{C}^{dR}$ be the state obtained from $|\phi\rangle\otimes|\mu_R\rangle$ via the identification of $[d]\times[R]$ with $[dR]$, so the coordinates of $|\lambda_R\rangle$ are $\frac{c_j}{\sqrt{r\chi_R}}$ for $(j,r)\in[d]\times[R]$. Let $\pi$ be the permutation that sorts these coordinates in decreasing order. Therefore,

$$|\tilde{\lambda}_R\rangle=P_\pi|\lambda_R\rangle=\sum_{s=1}^{dR}\tilde{\lambda}_s|s\rangle,$$

with the multi-set of the $\tilde{\lambda}_s$’s being the same as the multi-set of the $\frac{c_j}{\sqrt{r\chi_R}}$’s, and $\tilde{\lambda}_1\geq\tilde{\lambda}_2\geq\cdots\geq\tilde{\lambda}_{dR}$. In the following we show that $\langle\mu_R|\tilde{\lambda}_R\rangle$ is close to $1$ when $R$ is large.

Let

$$N_j^t=\Big|\Big\{r:\,\frac{c_j}{\sqrt{r\chi_R}}>\frac{1}{\sqrt{t\chi_R}}\Big\}\Big|=\big|\big\{r:\,c_j^2 t>r\big\}\big|.$$

Then we have $N_j^t<c_j^2 t$. Moreover, by the normalization of $|\phi\rangle$ we have

$$\sum_{j=1}^d N_j^t<t\sum_{j=1}^d c_j^2=t.$$

This means that

$$\Big|\Big\{s:\,\tilde{\lambda}_s>\frac{1}{\sqrt{t\chi_R}}\Big\}\Big|<t.$$

Next, using the fact that $\tilde{\lambda}_1\geq\cdots\geq\tilde{\lambda}_t$, we conclude that

$$\frac{1}{\sqrt{t\chi_R}}\geq\tilde{\lambda}_t,\qquad\forall t.$$

Therefore,

$$\langle\mu_R|\tilde{\lambda}_R\rangle=\sum_{r=1}^R\frac{\tilde{\lambda}_r}{\sqrt{r\chi_R}}\geq\sum_{r=1}^R\tilde{\lambda}_r^2\geq\Big(\sum_{j=1}^d c_j^2\Big)\cdot\Big(\sum_{s=1}^{\lfloor R/d\rfloor}\frac{1}{s\chi_R}\Big)=\frac{\chi_{\lfloor R/d\rfloor}}{\chi_R}\geq\frac{\ln R-\ln d}{\ln R+1}.$$

Thus, $\langle\mu_R|\tilde{\lambda}_R\rangle$ is arbitrarily close to $1$ for sufficiently large $R$. We are done since $|\tilde{\lambda}_R\rangle$ is obtained from $|\phi\rangle\otimes|\mu_R\rangle$ by applying a permutation.
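The chain of inequalities above can be checked numerically. The sketch below (with a hypothetical choice of amplitude vector $c=(c_1,\dots,c_d)$) builds the coordinates $c_j/\sqrt{r\chi_R}$, sorts them, and verifies the lower bound $(\ln R-\ln d)/(\ln R+1)$ on the overlap:

```python
import numpy as np

# chi_R is the harmonic sum sum_{r=1}^R 1/r, and |mu_R> has coordinates
# 1/sqrt(r * chi_R) for r = 1,...,R (a unit vector).
def overlap(c, R):
    chi_R = np.sum(1.0 / np.arange(1, R + 1))
    mu = 1.0 / np.sqrt(np.arange(1, R + 1) * chi_R)   # |mu_R>
    lam = np.outer(c, mu).ravel()                     # coordinates c_j / sqrt(r chi_R)
    lam_sorted = np.sort(lam)[::-1]                   # |lambda~_R>, sorted decreasingly
    return float(mu @ lam_sorted[:R])                 # <mu_R | lambda~_R>

# hypothetical normalized amplitude vector |phi> with d = 3
c = np.array([0.8, 0.5, np.sqrt(1 - 0.8**2 - 0.5**2)])
d = len(c)
for R in [10, 100, 1000, 10000]:
    bound = (np.log(R) - np.log(d)) / (np.log(R) + 1)
    ov = overlap(c, R)
    assert bound <= ov <= 1.0 + 1e-9   # the bound derived in the text
print("overlap at R = 10000:", overlap(c, 10000))
```

As the proof predicts, the printed overlap creeps toward $1$ as $R$ grows, albeit only logarithmically.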

Appendix C 𝒦\mathcal{K} and \mathcal{L} are closed convex cones

In this appendix we show that the two sets $\mathcal{K},\mathcal{L}$ defined in the proof of Lemma 11 are closed convex cones.

From the definition it is clear that $\mathcal{K}$ is closed and that $rM\in\mathcal{K}$ for $M\in\mathcal{K}$ and $r\geq 0$. Thus it remains to show that for $P,Q\in\mathcal{K}$, we have $M=P+Q\in\mathcal{K}$. Let $|v\rangle=\sum_i v_i|i\rangle$ be an arbitrary vector. We compute

\begin{align*}
\langle v|\widehat{M}|v\rangle &\geq \sum_i |v_i|^2 M_{ii}-\sum_{i\neq j}|v_iv_j|\cdot|M_{ij}|\\
&=\sum_i |v_i|^2 (P_{ii}+Q_{ii})-\sum_{i\neq j}|v_iv_j|\cdot|P_{ij}+Q_{ij}|\\
&\geq \sum_i |v_i|^2 (P_{ii}+Q_{ii})-\sum_{i\neq j}|v_iv_j|\cdot\big(|P_{ij}|+|Q_{ij}|\big)\\
&=\langle\hat v|\widehat{P}|\hat v\rangle+\langle\hat v|\widehat{Q}|\hat v\rangle\\
&\geq 0,
\end{align*}

where $|\hat v\rangle$ is the vector whose coordinates are the $|v_i|$’s, and in the last line we use $\widehat{P},\widehat{Q}\succeq 0$. As a result, $\widehat{M}\succeq 0$ and $M\in\mathcal{K}$. Therefore, $\mathcal{K}$ is a closed convex cone.
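This closure under addition can be probed numerically. The sketch below assumes, as the computation above suggests, that $\widehat{M}$ keeps the diagonal of $M$ and replaces each off-diagonal entry $M_{ij}$ by $-|M_{ij}|$; the sampling scheme is a hypothetical way of producing members of $\mathcal{K}$:

```python
import numpy as np

def hat(M):
    # M_hat has the same diagonal as M and off-diagonal entries -|M_ij|
    # (assumed form of the hat map from the proof of Lemma 11)
    H = -np.abs(M)
    np.fill_diagonal(H, np.diag(M))
    return H

def psd(M, tol=1e-9):
    return np.min(np.linalg.eigvalsh((M + M.T) / 2)) >= -tol

rng = np.random.default_rng(2)
n, found = 4, 0
while found < 100:
    # rejection-sample symmetric P, Q with hat(P), hat(Q) PSD; the large
    # diagonal shift makes the comparison matrices diagonally dominant
    P = rng.normal(size=(n, n)); P = (P + P.T) / 2 + 2 * n * np.eye(n)
    Q = rng.normal(size=(n, n)); Q = (Q + Q.T) / 2 + 2 * n * np.eye(n)
    if psd(hat(P)) and psd(hat(Q)):
        assert psd(hat(P + Q))   # the sum stays in K, as proven above
        found += 1
print("closure under addition verified on", found, "random pairs")
```

Note that $\widehat{P+Q}$ is generally not $\widehat{P}+\widehat{Q}$; the inequality chain in the proof is exactly what makes the assertion hold anyway.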

From the definition it is clear that $\mathcal{L}$ is a convex cone. Thus we need to show that $\mathcal{L}$ is closed. Indeed, the sum of closed convex cones is not necessarily closed, yet this holds for $\mathcal{L}=\sum_\alpha\mathcal{L}_\alpha$. To prove this, suppose that the sequence $\{X^{(j)}:j\geq 1\}\subset\mathcal{L}$ tends to $X$ as $j\to\infty$. Since the $X^{(j)}$’s and $X$ are positive semidefinite, for sufficiently large $j$ we have $0\preceq X^{(j)}\preceq X+I$. On the other hand, since $X^{(j)}$ belongs to $\mathcal{L}$, there are $X^{(j)}_\alpha\in\mathcal{L}_\alpha$ such that $X^{(j)}=\sum_\alpha X^{(j)}_\alpha$. Again using the fact that the $X^{(j)}_\alpha$’s are positive semidefinite, for sufficiently large $j$ we have

$$0\preceq X_\alpha^{(j)}\preceq X^{(j)}\preceq X+I.$$

Then by a compactness argument, there is a subsequence $\{j_k:\,k\geq 1\}$ such that $\lim_{k\to\infty}X_\alpha^{(j_k)}=X_\alpha$ exists for all $\alpha$. Now since $\mathcal{L}_\alpha$ is closed, we have $X_\alpha\in\mathcal{L}_\alpha$. On the other hand,

$$X=\lim_{k\to\infty}X^{(j_k)}=\lim_{k\to\infty}\sum_\alpha X^{(j_k)}_\alpha=\sum_\alpha X_\alpha.$$

This means that $X\in\mathcal{L}$. Therefore, $\mathcal{L}$ is also a closed convex cone.
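The remark above that a sum of closed convex cones need not be closed (so that the uniform bound $0\preceq X^{(j)}_\alpha\preceq X+I$ really is needed) can be illustrated by the classic counterexample below; the cones chosen are hypothetical and unrelated to the $\mathcal{L}_\alpha$ in the text:

```python
import numpy as np

# K1 = second-order cone {(x,y,z) : sqrt(x^2+y^2) <= z} in R^3,
# K2 = ray through (1,0,-1). The points
#   p_t = (-t, 1, sqrt(t^2+1)) + t*(1,0,-1) = (0, 1, sqrt(t^2+1) - t)
# all lie in K1 + K2 and converge to (0,1,0) as t -> infinity, yet
# (0,1,0) is NOT in K1 + K2: a decomposition would force the K1 part
# to be (-t, 1, t), contradicting sqrt(t^2+1) > t.

for t in [1.0, 10.0, 100.0, 1000.0]:
    k1 = np.array([-t, 1.0, np.hypot(t, 1.0)])
    assert np.hypot(k1[0], k1[1]) <= k1[2] + 1e-12     # k1 is in K1
    p = k1 + t * np.array([1.0, 0.0, -1.0])            # p is in K1 + K2
    assert np.allclose(p[:2], [0.0, 1.0])              # p -> (0,1,0)

# no feasible decomposition of (0,1,0): sqrt(t^2+1) <= t never holds
assert all(np.hypot(t, 1.0) > t for t in np.linspace(0.0, 100.0, 1001))
print("limit point (0,1,0) lies in the closure but not in the sum")
```

By contrast, the compactness argument above rules out exactly this kind of "escape to infinity" of the summands $X^{(j)}_\alpha$.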