Rigidity of superdense coding

Ashwin Nayak Department of Combinatorics and Optimization, and Institute for Quantum Computing, University of Waterloo, 200 University Ave. W., Waterloo, ON, N2L 3G1, Canada. Email: [email protected] . Henry Yuen Department of Computer Science, Columbia University, New York, USA. E-mail: [email protected] .

Abstract

The famous superdense coding protocol of Bennett and Wiesner demonstrates that it is possible to communicate two bits of classical information by sending only one qubit and using a shared EPR pair. Our first result is that an arbitrary protocol for achieving this task (where there are no assumptions on the sender’s encoding operations or the dimension of the shared entangled state) is locally equivalent to the canonical Bennett-Wiesner protocol. In other words, the superdense coding task is rigid. In particular, we show that the sender and receiver only use additional entanglement (beyond the EPR pair) as a source of classical randomness.

We also investigate several questions about higher-dimensional superdense coding, where the goal is to communicate one of $d^{2}$ possible messages by sending a $d$ -dimensional quantum state, for general dimensions $d$ . Unlike the $d=2$ case (i.e. sending a single qubit), there can be inequivalent superdense coding protocols for higher $d$ . We present concrete constructions of inequivalent protocols, based on constructions of inequivalent orthogonal unitary bases for all $d>2$ . Finally, we analyze the performance of superdense coding protocols where the encoding operators are independently sampled from the Haar measure on the unitary group. Our analysis involves bounding the distinguishability of random maximally entangled states, which may be of independent interest.

1 Introduction

In quantum information theory, rigidity is a phenomenon where optimal performance in an information processing task requires using a protocol satisfying extremely stringent constraints; in some cases, there is essentially a unique optimal protocol. The primary examples of rigidity come from nonlocal games (also known as Bell tests in the physics literature). In this setting two spatially separated parties, Alice and Bob, play a game with a third party called the referee. In order to maximize their chances of winning, before the game starts Alice and Bob choose an entangled state to share as well as local measurements to perform on the state. For example, in the famous CHSH game the optimal winning probability is $\cos^{2}(\pi/8)$ , and a canonical strategy that achieves this uses a (rotated) EPR pair and single-qubit Pauli measurements. The CHSH game is rigid in the sense that any optimal strategy for the CHSH game is identical to this canonical strategy, up to local changes of basis.

The study of rigidity in quantum information processing arguably started with the work of Mayers and Yao [MY98, MY04], who initiated the concept of device-independent cryptography. The idea behind this subject is that a classical user can verify that untrusted quantum hardware is behaving as intended (say, generating random keys or performing a quantum computation) simply by verifying that the hardware is employing a (near-)optimal strategy in a rigid nonlocal game. Since the work of Mayers and Yao, nonlocal game rigidity has been an extremely fruitful concept in quantum cryptography (see, e.g., Refs. [VV19] and [CGJV19]), complexity theory (see Ref. [JNV⁺20] and the references therein), and quantum information more generally [ŠB20]. This motivates the following question: what other tasks in quantum information also exhibit rigidity phenomena?

To our knowledge, the only other work on rigidity phenomena outside of nonlocal games is that reported in Refs. [TKV⁺18, FK19] on the rigidity of quantum random access codes (QRACs). The authors study “ $2^{d}\rightarrow 1$ ” QRACs, which encode $2$ classical dits $x,y\in[d]$ into a $d$ -dimensional system, such that either $x$ or $y$ may be retrieved by performing a suitable measurement. These works show that $2^{d}\to 1$ QRACs are rigid, and in fact certify measurements based on mutually unbiased bases (MUBs).

In this paper we investigate the rigidity properties of superdense coding, which plays a fundamental role in quantum Shannon theory (see, e.g., Ref. [Wil13, Chapter 6]). The superdense coding task is to communicate one of four possible messages while only transmitting one quantum bit across a channel. The superdense coding protocol, first proposed by Bennett and Wiesner [BW92], achieves this task in the following way: Alice and Bob share one qubit each of an EPR pair (i.e., the maximally entangled state $\tfrac{1}{\sqrt{2}}|00\rangle+\tfrac{1}{\sqrt{2}}|11\rangle$ ) in advance, and to transmit a message $i\in\{1,2,3,4\}$ , Alice applies one of four Pauli operators to her half of the EPR pair and sends her qubit. Bob then performs a Bell measurement on the qubit received from Alice and his qubit to determine $i$ .
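The protocol just described can be checked numerically. The following is a minimal sketch in plain Python, not code from the paper: the helper names `encode`, `inner`, and `decode` are our own, messages are indexed from 0 rather than 1, and the Pauli operators are listed in the order $\mathds{1},{\mathrm{Z}},{\mathrm{X}},{\mathrm{Y}}$. It confirms that the four encoded states are pairwise orthogonal, so a measurement in that basis recovers the message.

```python
import math

# Pauli operators as 2x2 nested lists, in the order 1, Z, X, Y.
I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
PAULIS = [I2, Z, X, Y]

# EPR pair (|00> + |11>)/sqrt(2); the amplitude index is 2*a + b
# for Alice's bit a and Bob's bit b.
EPR = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]


def encode(pauli, state):
    """Apply a single-qubit operator to Alice's qubit (the first tensor factor)."""
    out = [0j] * 4
    for a_out in range(2):
        for a_in in range(2):
            for b in range(2):
                out[2 * a_out + b] += pauli[a_out][a_in] * state[2 * a_in + b]
    return out


def inner(u, v):
    """Inner product <u|v> of two amplitude lists."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


# The four states Bob may hold after receiving Alice's qubit.
encoded = [encode(p, EPR) for p in PAULIS]


def decode(state):
    """Measure in the basis of encoded states and return the most likely index."""
    probs = [abs(inner(e, state)) ** 2 for e in encoded]
    return max(range(4), key=lambda i: probs[i])


decoded = [decode(e) for e in encoded]
```

Because the encoded states form an orthonormal basis of the two-qubit space, `decoded` reproduces the message indices exactly.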

1.1 Rigidity for superdense coding of two classical bits

The first result in our paper is to show that superdense coding is rigid: any protocol that accomplishes this task is “locally equivalent” to the Bennett-Wiesner protocol. We model arbitrary protocols for superdense coding in the following manner: Alice and Bob share a density matrix $\tau$ on a bipartite Hilbert space $\mathcal{H}_{A}\otimes\mathcal{H}_{B}$ , where we assume without loss of generality that $\mathcal{H}_{A}$ factors into $\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}}$ where $\mathcal{H}_{A^{\prime\prime}}$ is isomorphic to ${\mathbb{C}}^{2}$ . Given an input $i\in\{1,2,3,4\}$ , Alice applies a unitary operator $U_{i}$ (called an encoding operator) to her share of $\tau$ (with support in the space $\mathcal{H}_{A}$ ), sends the qubit $A^{\prime\prime}$ to Bob, and Bob then performs an optimal distinguishing measurement on the Hilbert space $\mathcal{H}_{A^{\prime\prime}}\otimes\mathcal{H}_{B}$ to determine what the input $i$ was. See Figure 1 for an illustration of a general superdense coding protocol.

[Circuit diagram: Alice applies the encoding operator $U_{i}$ , controlled on the classical input $i$ , to registers $A^{\prime}$ and $A^{\prime\prime}$ ; register $A^{\prime\prime}$ is sent to Bob, who applies the measurement $M$ to registers $A^{\prime\prime}$ and $B$ and outputs $i$ .]

Figure 1: A general superdense coding protocol. This quantum circuit is modified from [Cha20].

A priori it appears daunting to characterize the structure of an arbitrary superdense coding protocol. For one, the dimensions of the spaces $\mathcal{H}_{A^{\prime}}$ and $\mathcal{H}_{B}$ are unbounded, and the state $\tau$ is uncharacterized. Furthermore, the encoding unitary operators $U_{i}$ can be extremely complicated, potentially performing complex entangling operations between the spaces $\mathcal{H}_{A^{\prime}}$ and $\mathcal{H}_{A^{\prime\prime}}$ (the qubit to be sent over to Bob). However, the property of being a superdense coding protocol is extremely constraining. Theorem 1.1 gives a precise characterization of how an arbitrary superdense protocol is locally equivalent to the canonical Bennett-Wiesner protocol. In the statement of the theorem, “ $=_{\tau^{\prime}}$ ” denotes equality of two unitary operators with respect to the state $\tau^{\prime}$ ; in other words, $C=_{\rho}D$ means that $C\rho C^{*}=D\rho D^{*}$ .

Theorem 1.1 (Rigidity for superdense coding).

Let $(\tau,(U_{i}))$ denote a superdense coding protocol. Then there exist

1. Unitary operators $V$ acting on $\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}}$ and $(C_{i})_{i\in[4]}$ acting on $\mathcal{H}_{A^{\prime}}$ ,
2. An isometry $W$ mapping $\mathcal{H}_{B}$ to a Hilbert space $\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{B^{\prime\prime}}$ where $\mathcal{H}_{B^{\prime\prime}}$ is isomorphic to $\mathbb{C}^{2}$ ,
3. A density matrix $\rho$ on $\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ ,
4. A set of pairwise orthogonal projectors $\{P_{r}\}$ that sum to the identity on $\mathcal{H}_{A^{\prime}}$ , and
5. A collection of $2\times 2$ unitary operators $\{S_{r}\}$ ,

such that, letting $\tau^{\prime}\coloneqq(V\otimes W)\tau(V\otimes W)^{*}$ , we have

\tau^{\prime}=\rho^{A^{\prime}B^{\prime}}\otimes|\mathrm{EPR}\rangle\!\langle\mathrm{EPR}|^{A^{\prime\prime}B^{\prime\prime}}

and for $i\in\{1,2,3,4\}$ ,

(C_{i}^{*}\otimes\mathds{1})U_{i}V^{*}=_{\tau^{\prime}}\sum_{r}P_{r}\otimes S_{r}\sigma_{i}S_{r}^{*}

where $\sigma_{1}\coloneqq\mathds{1}$ , $\sigma_{2}\coloneqq{\mathrm{Z}}$ , $\sigma_{3}\coloneqq{\mathrm{X}}$ , and $\sigma_{4}\coloneqq{\mathrm{Y}}$ are the one-qubit Pauli matrices.

Theorem 1.1 can be interpreted as expressing rigidity for superdense coding in the following way: given an arbitrary protocol $(\tau,(U_{i}))$ for superdense coding, there exist local isometries $V,W$ where if Alice applies $V$ and Bob applies $W$ to their share of $\tau$ , then an EPR pair is extracted with an auxiliary state $\rho$ remaining. By pre-applying $V^{*}$ to Alice’s unitary operators $U_{i}$ , we discover that $U_{i}V^{*}$ has a very regular form: operationally, it can be interpreted as performing some projective measurement $\{P_{r}\}$ on Alice’s part of the auxiliary state $\rho$ to obtain some outcome $r$ in a set $\mathcal{R}$ , and then based on $r$ , applying a rotated version of the standard Bennett-Wiesner superdense coding protocol to Alice’s part of the EPR pair. Finally, after sending her EPR qubit, Alice then applies some unitary operator $C_{i}$ on her remaining qubits (which does not affect Bob’s measurement in any way). This considerably strengthens and extends the characterization of “tight” superdense coding protocols due to Vollbrecht and Werner [VW00, Lemma 3] (see also Ref. [Wer01]); they studied protocols in which the shared entangled state $\tau$ is a state on ${\mathbb{C}}^{2}\otimes{\mathbb{C}}^{2}$ (or on ${\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}$ in the case of $d$ -dimensional superdense coding protocols; see Section 1.2). This difference would be significant in a cryptographic setting; in the context of quantum key distribution, this is the difference between the device-independent and semi-device-independent settings.

The proof of Theorem 1.1 is given in Section 3. It proceeds via a number of reductions: first, using an information-theoretic argument, we show every superdense coding protocol $(\tau,(U_{i}))$ is locally equivalent to one that uses an EPR pair in the state $\tau$ . Given this, we then show that each of the encoding operators $U_{i}$ can be individually block-diagonalized with respect to the EPR pair. Finally, we show that the blocks across the different encoding operators $U_{i}$ can be “matched up” in a way that they correspond to the Pauli matrices. Each of these steps requires carefully deducing the structure imposed on the state and the encoding operators by the correctness of the protocol.

1.2 Rigidity for higher dimensional superdense coding?

We then consider the generalization of superdense coding to communicating more than $2$ classical bits. Specifically, we consider protocols for communicating one of $d^{2}$ possible messages by sending a $d$ -dimensional quantum system over the channel — we call these $d$ -dimensional superdense coding protocols. A canonical protocol for $d$ -dimensional superdense coding is as follows: the players share a $d$ -dimensional maximally entangled state $|\upphi_{d}\rangle\coloneqq\frac{1}{\sqrt{d}}\sum_{e=0}^{d-1}|e\rangle|e\rangle$ , and given message $i\in[d^{2}]$ , Alice applies a unitary operator $E_{i}$ to her share of $|\upphi_{d}\rangle$ , and sends it over to Bob. The family of unitary operators $\{E_{i}\}$ can be any orthogonal unitary basis for the space of $d\times d$ matrices. (The orthogonality property means that $\operatorname{Tr}(E_{i}^{*}E_{j})=0$ if and only if $i\neq j$ .) An example of such a basis is the set of Heisenberg-Weyl operators. In dimension $d$ , these are a set of $d\times d$ matrices $\{P_{i,j}:0\leq i,j<d\}$ defined as follows. Let $\omega_{d}\coloneqq\exp\left(\frac{2\pi{\mathrm{i}}}{d}\right)$ be a primitive $d$ th root of unity. For $i,j\in\{0,1,2,\ldots,d-1\}$ , let $P_{i,j}={\mathrm{X}}^{i}_{d}{\mathrm{Z}}^{j}_{d}$ where ${\mathrm{X}}_{d}\coloneqq\sum_{k=0}^{d-1}|k+1\,(\mathrm{mod}\,d)\rangle\!\langle k|$ is the “shift” operator, and ${\mathrm{Z}}_{d}\coloneqq\sum_{k=0}^{d-1}\omega^{k}_{d}\,|k\rangle\!\langle k|$ is the “clock” operator. Does the rigidity phenomenon also extend to dimensions $d$ larger than $2$ ?
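The clock and shift construction above is easy to verify for small $d$. The following is a minimal sketch in plain Python, offered as an illustration rather than as code from the paper: the helper names (`shift`, `clock`, `matmul`, `matpow`, `hs_inner`, `heisenberg_weyl`) are our own, and matrices are stored as nested lists. It builds the $d^{2}$ operators $P_{i,j}={\mathrm{X}}_{d}^{i}{\mathrm{Z}}_{d}^{j}$ and computes the Hilbert-Schmidt inner products $\operatorname{Tr}(P_{i,j}^{*}P_{k,l})$.

```python
import cmath


def shift(d):
    """X_d = sum_k |k+1 mod d><k| as a d x d nested list."""
    return [[1 if r == (c + 1) % d else 0 for c in range(d)] for r in range(d)]


def clock(d):
    """Z_d = sum_k omega^k |k><k| with omega = exp(2*pi*i/d)."""
    omega = cmath.exp(2j * cmath.pi / d)
    return [[omega ** r if r == c else 0 for c in range(d)] for r in range(d)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def matpow(a, e):
    """a raised to the non-negative integer power e."""
    n = len(a)
    out = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    for _ in range(e):
        out = matmul(out, a)
    return out


def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^* b)."""
    n = len(a)
    return sum(complex(a[r][c]).conjugate() * b[r][c]
               for r in range(n) for c in range(n))


def heisenberg_weyl(d):
    """Return the d^2 operators P_{i,j} = X_d^i Z_d^j, keyed by (i, j)."""
    x, z = shift(d), clock(d)
    return {(i, j): matmul(matpow(x, i), matpow(z, j))
            for i in range(d) for j in range(d)}


# Example: the nine Heisenberg-Weyl operators in dimension 3.
basis = heisenberg_weyl(3)
```

For distinct index pairs the inner product vanishes, and for equal pairs it equals $d$, which is the orthogonality property stated above.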

The second result of this paper is that $d$ -dimensional superdense coding for $d\geq 3$ is not rigid in the same sense as Theorem 1.1: there are $d$ -dimensional superdense coding protocols which are not locally equivalent to each other. This is because in dimensions three and higher there are inequivalent orthogonal unitary bases. (In contrast, all orthogonal unitary bases in dimension two are equivalent to the Pauli matrices.) Here, equivalence between two unitary bases $\{E_{i}\}$ and $\{F_{i}\}$ means there exist unitary operators $U,V$ such that for all $i$ , we have $F_{i}=\alpha_{i}UE_{i}V$ for some choice of complex phase $\alpha_{i}$ .

Theorem 1.2 (Existence of inequivalent orthogonal unitary bases for all $d\geq 3$ ).

For every dimension $d\geq 3$ , there are orthogonal unitary bases that are not equivalent to each other.

The uniqueness of orthogonal unitary bases was first studied by Vollbrecht and Werner [VW00], and the existence of non-equivalent orthogonal unitary bases for all dimensions greater than $2$ was observed in follow-up work by Werner [Wer01]. (We elaborate on prior work on the topic in Section 4.2.) Werner described, without proof, how non-equivalent bases may be constructed. We present explicit constructions of such bases in Section 4. The construction for $d\geq 4$ is based on the observation that the shift operator ${\mathrm{X}}_{d}$ corresponds to a perfect matching in $K_{d,d}$ , the complete bipartite graph. Moreover, its powers $\left\{{\mathrm{X}}_{d}^{i}:0\leq i<d\right\}$ correspond to a partition of the edge set of $K_{d,d}$ into $d$ disjoint perfect matchings. By replacing this partition with another carefully chosen such partition, we obtain an orthogonal unitary basis that is not equivalent to the clock and shift construction. The proof of non-equivalence involves comparing the spectra of the operators in the two bases, taking into account the complex phase and unitary operators that witness a potential equivalence map. For $d=3$ , we follow a construction described by Werner [Wer01]. We prove non-equivalence to the clock and shift basis by showing that the resulting basis is not a commutative projective group (again accounting for a potential equivalence map).

In a previous version of this paper, we conjectured that rigidity for higher dimensional superdense coding holds up to choosing orthogonal unitary bases [NY20, Conjecture 1.3]. That is, every $d$ -dimensional superdense coding protocol is locally equivalent (in the sense of Theorem 1.1) to one where Alice and Bob share an entangled state $\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}}$ for some density matrix $\rho^{A^{\prime}B^{\prime}}$ , and Alice’s encoding operators are of the form

U_{i}=\sum_{r}P_{r}\otimes E_{r,i}

where $\{P_{r}\}$ is a set of pairwise orthogonal projectors that sum to the identity on $\mathcal{H}_{A^{\prime}}$ and for every $r$ , the set $\{E_{r,i}\}_{i\in[d^{2}]}$ is an orthogonal unitary basis for the space of $d\times d$ complex matrices. This would be a natural extension of the statement of Theorem 1.1 to the case of general $d\geq 2$ where the registers $A^{\prime}B^{\prime}$ are treated as a source of “shared randomness” to help Alice and Bob synchronize their choice of orthogonal unitary basis.

We can show that when the shared entangled state between Alice and Bob is a pure state in $\mathbb{C}^{d}\otimes\mathbb{C}^{d}$ , then this conjecture holds (see Section 4.1): up to local unitary operators, the shared state is necessarily the maximally entangled state $|\upphi_{d}\rangle$ of local dimension $d$ , and the encoding operators $\{U_{i}\}$ necessarily form an orthogonal unitary basis. However, this conjecture is false for protocols where Alice sends only a part of her entangled state. In work subsequent to ours, Farkas, Kaniewski, and Nayak [FKN22] show that there exist infinitely many superdense coding protocols that are not locally equivalent to a protocol of the form described in the conjecture. In particular, in these counterexample protocols Alice may perform a complicated entangling operation between her message and the rest of her state, rather than just treating the ancilla system as a source of shared randomness.

1.3 Superdense coding protocols with error

Finally, we consider probabilistic protocols for $d$ -dimensional superdense coding, where Bob’s decoding only needs to succeed with high probability. In particular, we say that $(\tau,(U_{i}))$ is a $(d,\varepsilon)$ -superdense coding protocol if Bob is able to decode Alice’s message $i$ with probability at least $1-\varepsilon$ , for all $i$ . We focus on the case where Alice and Bob share an entangled state in $\mathbb{C}^{d}\otimes\mathbb{C}^{d}$ (i.e., have local dimension $d$ ). As mentioned previously, in the exact case $\varepsilon=0$ , their shared state is necessarily the maximally entangled state and Alice’s encoding unitary operators form an orthogonal unitary basis. We conjecture that even in the probabilistic setting, this characterization of $d$ -dimensional superdense coding protocols is robust, in the following sense.

Conjecture 1.3.

There exist functions $\delta_{1},\delta_{2}:[0,1]\to[0,1]$ where $\delta_{1}(\varepsilon)$ and $\delta_{2}(\varepsilon)$ monotonically decrease to $0$ as $\varepsilon\to 0$ , such that the following holds. For all $(d,\varepsilon)$ -superdense coding protocols $(\tau,(U_{i}))$ such that $\tau$ is a density matrix on $\mathbb{C}^{d}\otimes\mathbb{C}^{d}$ and $U_{i}$ are unitary operators in ${\mathsf{U}}(\mathbb{C}^{d})$ , we have

\langle\upphi_{d}|\tau|\upphi_{d}\rangle\quad\geq\quad 1-\delta_{1}(\varepsilon)\enspace,

and there exists an orthogonal unitary basis $\{E_{i}\}_{i\in[d^{2}]}$ for the space of $d\times d$ complex matrices such that for all $i\in[d^{2}]$ ,

\|U_{i}-E_{i}\|_{\mathrm{nhs}}\quad\leq\quad\delta_{2}(\varepsilon)\enspace,

where $\|X\|_{\mathrm{nhs}}\coloneqq\sqrt{\frac{1}{d}\operatorname{Tr}(XX^{\dagger})}$ denotes the normalized Hilbert-Schmidt norm on the space of $d\times d$ matrices.

We note that the choice of the normalized Hilbert-Schmidt norm in the statement of Conjecture 1.3 is somewhat arbitrary; one can also consider other formulations of the conjecture with other norms (such as the spectral norm).
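To give a sense of scale for the quantity $\|U_{i}-E_{i}\|_{\mathrm{nhs}}$ in Conjecture 1.3, the following minimal sketch in plain Python evaluates the normalized Hilbert-Schmidt norm on small examples. The function names `nhs_norm` and `subtract` are our own, and the choice of the Pauli matrices ${\mathrm{X}}$ and ${\mathrm{Z}}$ as test inputs is purely illustrative.

```python
import math


def nhs_norm(x):
    """Normalized Hilbert-Schmidt norm sqrt(Tr(X X^dagger) / d) of a d x d nested list."""
    d = len(x)
    # Tr(X X^dagger) equals the sum of squared absolute values of the entries.
    return math.sqrt(sum(abs(entry) ** 2 for row in x for entry in row) / d)


def subtract(a, b):
    """Entrywise difference of two matrices of the same shape."""
    return [[p - q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

unit_norm = nhs_norm(X)          # every unitary has normalized norm 1
gap = nhs_norm(subtract(X, Z))   # two orthogonal unitaries are sqrt(2) apart
```

More generally, for unitary $U,V$ one has $\|U-V\|_{\mathrm{nhs}}^{2}=2-\tfrac{2}{d}\operatorname{Re}\operatorname{Tr}(U^{*}V)$ , so any two distinct elements of an orthogonal unitary basis are at distance $\sqrt{2}$ in this norm.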

The last part of our paper analyzes a possible challenge to Conjecture 1.3 proposed by Aram Harrow. Consider the following probabilistic construction for a potential $(d,\varepsilon)$ -superdense coding protocol: independently sample $d^{2}$ matrices ${\bm{U}}_{1},\ldots,{\bm{U}}_{d^{2}}$ from the Haar measure on ${\mathsf{U}}({\mathbb{C}}^{d})$ , the group of $d\times d$ complex unitary matrices. Let $\tau\coloneqq|\upphi_{d}\rangle\!\langle\upphi_{d}|$ denote the $d$ -dimensional maximally entangled state. How well does the protocol $(\tau,({\bm{U}}_{i}))$ accomplish superdense coding?

In classical and quantum communication, many tasks can be performed near-optimally via probabilistically constructed protocols. See, e.g., the text by Wilde [Wil13] for examples from Shannon theory. A simple example from communication complexity is the task of quantum fingerprinting [BCWdW01], which enables checking whether two $n$ -bit strings $x$ and $y$ are equal by only comparing two $\mathrm{O}(\log n)$ -qubit fingerprints of the strings. It can be shown that picking random $\mathrm{O}(\log n)$ -qubit states for each $n$ -bit string $x$ yields a good quantum fingerprinting protocol.

Let ${\bm{\Pi}}_{d}$ denote the random superdense protocol specified by $(\upphi_{d},({\bm{U}}_{i}))$ . Note that the error ${\bm{\varepsilon}}$ of ${\bm{\Pi}}_{d}$ , when averaged over the choice of random unitaries $({\bm{U}}_{i})$ , is some function of $d$ . We first argue that the conjecture implies that the error of a random superdense coding protocol, when averaged over the choice of $({\bm{U}}_{i})$ , cannot be too small:

Proposition 1.4.

Suppose Conjecture 1.3 were true. Let $\delta_{2}(\varepsilon)$ be the function from Conjecture 1.3. Then the random superdense coding protocol ${\bm{\Pi}}_{d}$ specified by $(\upphi_{d},({\bm{U}}_{i}))$ must have error ${\bm{\varepsilon}}$ satisfying

\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\quad\geq\quad(2d)^{-2}\enspace.

Put another way, it cannot be that both Conjecture 1.3 is true and the random superdense protocol has error vanishing so quickly that $\delta_{2}({\bm{\varepsilon}})$ is smaller than $(2d)^{-2}$ , on average. Due to the concentration of measure phenomenon for Haar-random states and unitary operators (as expressed by, e.g., the Lévy-like property in Theorem 5.5), it is plausible a priori that the average error ${\bm{\varepsilon}}$ , and therefore also $\delta_{2}({\bm{\varepsilon}})$ , scale as $\mathrm{o}(d^{-2})$ . Thus, the random superdense protocol is potentially a counterexample to Conjecture 1.3.

We show that this probabilistic construction does not yield a good superdense coding protocol: with overwhelmingly high probability over the choice of random unitary operators $(U_{i})$ , the protocol has a nonzero probability of error that is independent of $d$ . Thus, Conjecture 1.3 is not ruled out by the random protocol construction.

Theorem 1.5 (Performance of a random superdense coding protocol).

The random superdense coding protocol ${\bm{\Pi}}_{d}$ specified by $(\upphi_{d},({\bm{U}}_{i}))$ where ${\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d})$ are Haar-random unitary operators has error at least $1-\tfrac{8}{3\pi}\approx 0.15$ as $d\to\infty$ , with high probability over the choice of $({\bm{U}}_{i})$ .

We prove Theorem 1.5 by showing that the distinguishability of the ensemble of random states $\{({\bm{U}}_{i}\otimes\mathds{1})|\upphi_{d}\rangle\}_{i\in[d^{2}]}$ is bounded away from $1$ (with high probability). The generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b] relate the distinguishability of an ensemble $\{(p_{i},\rho_{i})\}$ to the quantity

\operatorname{Tr}\Big{(}\sum_{i}p_{i}^{2}\rho_{i}^{2}\,\Big{)}^{1/2}\enspace.

(1.1)
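As a sanity check on the quantity in Equation 1.1, it can be evaluated by hand for two extreme ensembles. The following worked computation rests on our own illustrative assumptions (uniform probabilities $p_{i}=1/n$ and pure states $\rho_{i}=|\psi_{i}\rangle\!\langle\psi_{i}|$ , so that $\rho_{i}^{2}=\rho_{i}$ ); it is not taken from the cited references.

```latex
% Case 1: n mutually orthogonal pure states, so \Pi := \sum_i \rho_i is a rank-n projector.
\operatorname{Tr}\Big(\sum_{i=1}^{n}\tfrac{1}{n^{2}}\,\rho_{i}^{2}\Big)^{1/2}
  = \operatorname{Tr}\Big(\tfrac{1}{n^{2}}\,\Pi\Big)^{1/2}
  = \tfrac{1}{n}\operatorname{Tr}(\Pi)
  = 1 .

% Case 2: n copies of the same pure state \rho.
\operatorname{Tr}\Big(\sum_{i=1}^{n}\tfrac{1}{n^{2}}\,\rho^{2}\Big)^{1/2}
  = \operatorname{Tr}\Big(\tfrac{1}{n}\,\rho\Big)^{1/2}
  = \tfrac{1}{\sqrt{n}} .
```

The first case covers the canonical protocol with an orthogonal unitary basis $\{E_{i}\}$ and $n=d^{2}$ , since $\langle\upphi_{d}|(E_{i}^{*}E_{j}\otimes\mathds{1})|\upphi_{d}\rangle=\tfrac{1}{d}\operatorname{Tr}(E_{i}^{*}E_{j})=0$ for $i\neq j$ . The value $1$ is what one would expect of a perfectly distinguishable ensemble, while the value in the second case decays with $n$ , as one would expect of states that cannot be told apart.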

Our analysis of this quantity is largely inspired by work due to Montanaro [Mon07] on the distinguishability of random pure quantum states. However, extending his approach to the ensemble of interest to us—one consisting of random maximally entangled states—involves significant technical difficulties. The approach involves relating the distinguishability of an ensemble of states to the spectrum of the ensemble average. In the case of Haar-random pure states, the ensemble average is well approximated by the ensemble average of unnormalized complex gaussian vectors with suitably chosen variance. The spectrum of such matrices in the asymptotic limit is given by the Marčenko-Pastur Theorem from random matrix theory. In our case, the entries of the random vectors in the ensemble are not independent. We instead bound the generalized Holevo-Curlander quantity in Equation 1.1 by employing a recent generalization of the Marčenko-Pastur Theorem due to Yaskov [Yas16]. (The theorem was proven for ensembles of random real vectors. We verify that its proof also extends to complex random vectors with analogous properties.) In the process, we show that random maximally entangled states satisfy a pseudo-isotropy condition that suffices for the theorem to hold.

A subtlety in the use of the Marčenko-Pastur law is that we would like to deduce the convergence of a sequence of means to the mean of the limiting distribution from the convergence of a sequence of distributions. This does not necessarily hold in general. In order to prove such a relation between the two forms of convergence, we show that random maximally entangled states are sub-gaussian. This allows us to draw on a generalization of the Bai-Yin Theorem, which bounds the norm of matrices whose columns are given by i.i.d. sub-gaussian vectors. We thus show that the norm of the ensemble average has an exponentially decaying tail, which in turn guarantees the form of convergence we seek.

We believe the techniques used in our analysis are of independent interest. In fact, the subtlety mentioned above was overlooked by Montanaro; the ideas we develop may also be used to close a gap in his analysis (see Section 5.4 for the details).

1.4 Further remarks and open questions

In this paper we have initiated the study of rigidity phenomena in superdense coding protocols. Given its importance in quantum Shannon theory, our study may shed new light on protocols based on superdense coding. The power of entanglement as a resource in distributed quantum computation, in particular in two-party communication complexity, remains a mystery. The rigidity theorem we establish (Theorem 1.1) gives a complete picture for a simple but fundamental task. The property shown in the analysis of random superdense coding protocols (Theorem 1.5) may also be interpreted as placing a limit on how closely a sequence of random unitary operators approximates an orthogonal basis. This may be of relevance to the theory of error-correction, where unitary error bases play a central role.

We list several open questions that arise from this work:

1. Is Conjecture 1.3 true? Does a robust version of Theorem 1.1 hold?
2. Do $d$ -dimensional superdense coding protocols, in which the shared state between Alice and Bob may have local dimension larger than $d$ , also exhibit some non-trivial form of rigidity?
3. Does rigidity also hold for quantum teleportation, a task that is “dual” to superdense coding? Can this be derived in a black-box way from the rigidity of superdense coding?
4. Are there any connections between the QRAC rigidity results of [TKV⁺18, FK19] and our results on the rigidity of superdense coding?
5. What other quantum information processing tasks have the rigidity property?

We believe the investigation of these questions will lead to significant new insights into the nature of quantum information, with wide-ranging ramifications.

Acknowledgements.

We thank the anonymous journal reviewers for their thorough feedback on an earlier version of the paper. We thank Adam Bouland, Chinmay Nirkhe, and Zeph Landau for stimulating discussions at the beginning of this project. H.Y. would like to especially thank Adam Bouland for his integral role in formulating the questions explored in this paper. We would like to thank Pavel Yaskov for the correspondence on his work on the Marčenko–Pastur theorem. We thank Jędrzej Kaniewski and Mate Farkas for their pointers to the literature on rigidity of QRACs. A.N. would like to thank Kanstantsin Pashkovich and Vern Paulsen for helpful discussions on bases of orthogonal unitary operators, and is grateful to the Berkeley CS Theory Group for their hospitality during a visit in Fall 2017, when this work was initiated. A.N.’s research is supported in part by a Discovery Grant from NSERC Canada. H.Y. is supported by an NSERC Discovery Grant, a Google Quantum Research Award, and AFOSR Grant No. FA9550-21-1-0040. This research was partly conducted at the Kavli Institute for Theoretical Physics during the Quantum Physics of Information program in 2017 (and thus this research was supported in part by the National Science Foundation under Grant No. NSF PHY17-48958).

2 Properties of superdense coding

2.1 Quantum information basics

We refer the reader to texts such as [NC11, Wat18, Wil13] for the basics of quantum information, and mention some notational conventions here.

We write $\mathds{1}$ to denote the identity operator on a Hilbert space. We use superscripts on quantum states, e.g., $|\psi\rangle^{AB}$ or $\rho^{AB}$ , to denote the registers in which they are stored. Similarly, we use subscripts on operators to indicate the registers on which they act, unless this is clear from the context. Given a bipartite density matrix $\rho^{AB}$ , we write $\rho^{A}$ to denote its reduction to register $A$ (i.e., the partial trace over $B$ ).

Given operators $A,B$ and a density matrix $\rho$ acting on a Hilbert space $\mathcal{H}$ , we write $A=_{\rho}B$ to denote $A\rho A^{*}=B\rho B^{*}$ . In other words, the operators $A$ and $B$ have the same action on the state $\rho$ .
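The relation $=_{\rho}$ depends on the state as well as on the operators. As a small illustration of our own (the choice of operators and states is not from the text), compare the Pauli operator ${\mathrm{Z}}$ with the identity on two different single-qubit states:

```latex
% With \rho = |0\rangle\langle 0|: since Z|0\rangle = |0\rangle,
{\mathrm{Z}}\,|0\rangle\!\langle 0|\,{\mathrm{Z}}^{*}
  = |0\rangle\!\langle 0|
  = \mathds{1}\,|0\rangle\!\langle 0|\,\mathds{1}^{*} ,
\qquad\text{so } {\mathrm{Z}} =_{\rho} \mathds{1} \text{ although } {\mathrm{Z}}\neq\mathds{1} .

% With \rho = |+\rangle\langle +|, where |\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}: since Z|+\rangle = |-\rangle,
{\mathrm{Z}}\,|+\rangle\!\langle +|\,{\mathrm{Z}}^{*}
  = |-\rangle\!\langle -|
  \neq |+\rangle\!\langle +| ,
\qquad\text{so } {\mathrm{Z}} =_{\rho} \mathds{1} \text{ fails for this state.}
```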

We write $|\mathrm{EPR}\rangle$ to denote the maximally entangled state $\frac{1}{\sqrt{2}}\Big{(}|00\rangle+|11\rangle\Big{)}$ on two qubits. We write $|\upphi_{d}\rangle=\frac{1}{\sqrt{d}}\sum_{i=0}^{d-1}|i\rangle|i\rangle$ to denote the $d$ -dimensional maximally entangled state, or simply $|\upphi\rangle$ if the dimension $d$ is clear from context. We recall the single qubit Pauli matrices:

\mathds{1}\coloneqq\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\qquad{\mathrm{X}}\coloneqq\begin{pmatrix}0&1\\ 1&0\end{pmatrix}\qquad{\mathrm{Y}}\coloneqq\begin{pmatrix}0&-{\mathrm{i}}\\ {\mathrm{i}}&0\end{pmatrix}\qquad{\mathrm{Z}}\coloneqq\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}\;.

2.2 Basic properties of superdense coding

Here we give a formal definition of a general superdense coding protocol, and prove some basic properties of such protocols.

Definition 2.1 (Superdense coding protocol).

Let $d$ be a positive integer. Let $\mathcal{H}_{A}\coloneqq\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}}$ and $\mathcal{H}_{B}$ be finite dimensional Hilbert spaces where $\mathcal{H}_{A^{\prime\prime}}$ is isomorphic to ${\mathbb{C}}^{d}$ . Let $\tau$ denote a density matrix on $\mathcal{H}_{A}\otimes\mathcal{H}_{B}$ and let $(U_{i})_{i\in[d^{2}]}$ denote a sequence of $d^{2}$ unitary operators (called encoding operators) acting on $\mathcal{H}_{A}$ . We say that $(\tau,(U_{i}))$ is a $(d,\varepsilon)$ -superdense coding protocol if there exists a POVM $\{M_{i}\}_{i\in[d^{2}]}$ acting on $\mathcal{H}_{A^{\prime\prime}}\otimes\mathcal{H}_{B}$ such that

\operatorname{Tr}(M_{i}\,\rho_{i})\geq 1-\varepsilon\qquad\forall\,i\in[d^{2}]

(2.1)

where $\rho_{i}$ denotes the reduced density matrix of $(U_{i}\otimes\mathds{1})\tau(U_{i}\otimes\mathds{1})^{*}$ on registers $A^{\prime\prime}B$ . When $\varepsilon=0$ we simply call $(\tau,(U_{i}))$ a $d$ -dimensional superdense coding protocol.

Lemma 2.2 (Orthogonality conditions I).

Let $(\tau,(U_{i}))$ be a $d$ -dimensional superdense coding protocol. Then letting $\rho_{i}$ denote the reduced density matrix of $(U_{i}\otimes\mathds{1})\tau(U_{i}\otimes\mathds{1})^{*}$ on registers $A^{\prime\prime}B$ , we have that

\operatorname{Tr}(\rho_{i}\,\rho_{j})=0\qquad\forall i\neq j\in[d^{2}]\;.

Proof.

Let $\{M_{i}\}$ denote a POVM satisfying Equation 2.1 for $\varepsilon=0$ . Then for all $i\in[d^{2}]$ , we have $\rho_{i}\leq M_{i}$ according to the positive semidefinite ordering. This is because if we write a spectral decomposition $\rho_{i}=\sum_{k}p_{ik}|\psi_{ik}\rangle\!\langle\psi_{ik}|$ with orthonormal vectors $|\psi_{ik}\rangle$ and positive probabilities $\{p_{ik}\}$ , then $\operatorname{Tr}(\rho_{i}M_{i})=1$ together with $M_{i}\leq\mathds{1}$ implies that for all $k$ , $\langle\psi_{ik}|M_{i}|\psi_{ik}\rangle=1$ , which implies that $|\psi_{ik}\rangle$ is an eigenvector of $M_{i}$ with eigenvalue $1$ . This implies that $M_{i}=\sum_{k}|\psi_{ik}\rangle\!\langle\psi_{ik}|+M_{i}^{\prime}$ for some positive semidefinite operator $M_{i}^{\prime}$ , and this is at least $\rho_{i}$ in the positive semidefinite ordering.

This means then that for all $j\neq i$ ,

0\leq\operatorname{Tr}(\rho_{i}\,M_{j})\leq\operatorname{Tr}(\rho_{i}\,(\mathds{1}-M_{i}))=\operatorname{Tr}(\rho_{i})-\operatorname{Tr}(\rho_{i}\,M_{i})=0

where the first inequality is due to the positivity of $\rho_{i}$ and $M_{j}$ , the second inequality is due to the fact that $\sum_{i}M_{i}=\mathds{1}$ , and the last equality is due to the fact that $\operatorname{Tr}(\rho_{i})=\operatorname{Tr}(\rho_{i}\,M_{i})=1$ . Therefore $\operatorname{Tr}(\rho_{i}\,M_{j})=0$ . Thus we have

0\leq\operatorname{Tr}(\rho_{i}\,\rho_{j})\leq\operatorname{Tr}(\rho_{i}\,M_{j})=0\enspace,

so $\operatorname{Tr}(\rho_{i}\,\rho_{j})=0$ . ∎
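As a sanity check on Lemma 2.2, the following minimal sketch evaluates $\operatorname{Tr}(\rho_{i}\,\rho_{j})$ for the canonical Bennett-Wiesner protocol with $d=2$. It assumes the register $A^{\prime}$ is trivial, the shared state is a single EPR pair, and the encoding operators are the four Pauli matrices. The helper names (`encoded_state`, `overlap_sq`, `gram`) are illustrative and not from the paper.

```python
import math

# Pauli matrices as nested lists: the canonical encoding operators for d = 2.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

def encoded_state(U):
    """Return (U tensor 1)|phi_2> as a length-4 vector indexed by 2*a + b."""
    s = 1 / math.sqrt(2)
    epr = [s, 0.0, 0.0, s]  # (|00> + |11>) / sqrt(2)
    out = [0j] * 4
    for a in range(2):
        for b in range(2):
            for a0 in range(2):
                out[2 * a + b] += U[a][a0] * epr[2 * a0 + b]
    return out

def overlap_sq(u, v):
    """|<u|v>|^2, which equals Tr(rho_u rho_v) for pure states."""
    inner = sum(x.conjugate() * y for x, y in zip(u, v))
    return abs(inner) ** 2

# With A' trivial, rho_i is the full pure state on A''B.
states = [encoded_state(U) for U in PAULIS]
gram = [[overlap_sq(states[i], states[j]) for j in range(4)] for i in range(4)]
```

Under these assumptions, `gram` is the $4\times 4$ identity up to floating-point error, which is the orthogonality condition of the lemma in this special case.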

Lemma 2.3 (Orthogonality conditions II).

Let $(\tau,(U_{i}))$ be a $d$ -dimensional superdense coding protocol. Then for all $i\neq j\in[d^{2}]$ ,

\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}\tau^{A}U_{j}^{*}\right)=0

where $\tau^{A}$ denotes the reduced density matrix of $\tau$ on register $A$ and $\operatorname{Tr}_{A^{\prime\prime}}(\cdot)$ denotes the partial trace over register $A^{\prime\prime}$ .

Proof.

Let $|\tau\rangle$ denote a purification of $\tau$ on the Hilbert space $\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{R}$ , where $\mathcal{H}_{R}$ is the purifying space. Clearly, the protocol where Bob also has access to the purifying space $\mathcal{H}_{R}$ is also a $d$ -dimensional superdense coding protocol.

Let $\{|1\rangle,\ldots,|\dim A^{\prime}\rangle\}$ denote an orthonormal basis for $\mathcal{H}_{A^{\prime}}$ . Let $|\rho_{ik}\rangle$ be the (sub-normalized) pure state on registers $A^{\prime\prime}BR$ given by

|\rho_{ik}\rangle\coloneqq(\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle\enspace.

Intuitively, $|\rho_{ik}\rangle$ represents the residual state of the protocol on registers $A^{\prime\prime}B$ when Alice applies unitary operator $U_{i}$ , and then measures the $A^{\prime}$ subsystem in the standard basis to obtain outcome $|k\rangle$ .

Note that if we let $\rho_{i}$ denote the state $(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|(U_{i}\otimes\mathds{1})^{*}$ reduced to the registers $A^{\prime\prime}BR$ , we have the identity

\rho_{i}=\sum_{k}|\rho_{ik}\rangle\!\langle\rho_{ik}|\;,

because we can think of $\rho_{i}$ as the result of first measuring the $A^{\prime}$ register in the standard basis, and discarding the outcome. Applying Lemma 2.2 to the purified protocol $(|\tau\rangle\!\langle\tau|,(U_{i}))$ , we have for $i\neq j$ ,

0=\operatorname{Tr}(\rho_{i}\,\rho_{j})=\sum_{k,k^{\prime}}\operatorname{Tr}(|\rho_{ik}\rangle\!\langle\rho_{ik}|\cdot|\rho_{jk^{\prime}}\rangle\!\langle\rho_{jk^{\prime}}|)=\sum_{k,k^{\prime}}|\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle|^{2}

and therefore $|\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle|^{2}=0$ for all $k,k^{\prime}$ . This implies that $\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle=0$ for all $k,k^{\prime}$ , which can be rewritten as

\langle\tau|(U_{j}\otimes\mathds{1})^{*}(|k^{\prime}\rangle\!\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle=\operatorname{Tr}\Big{(}(\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|(U_{j}^{*}\otimes\mathds{1})(|k^{\prime}\rangle^{A^{\prime}}\otimes\mathds{1})\Big{)}=0.

This is equivalent to the statement that

\langle k|^{A^{\prime}}\,\operatorname{Tr}_{A^{\prime\prime}}\Big{(}U_{i}\tau^{A}U_{j}^{*}\Big{)}\,|k^{\prime}\rangle^{A^{\prime}}=0.

Since this holds for all $k,k^{\prime}$ , the matrix $\operatorname{Tr}_{A^{\prime\prime}}\Big{(}U_{i}\tau^{A}U_{j}^{*}\Big{)}$ is identically zero, which completes the proof of the Lemma. ∎
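To illustrate Lemma 2.3 with a nontrivial register $A^{\prime}$, the sketch below uses a hypothetical $d=2$ protocol. Alice and Bob share a classical coin (Alice's copy in $A^{\prime}$, Bob's copy in $B$) together with an EPR pair, and Alice applies the Pauli operators when the coin is $0$ and their Hadamard conjugates when it is $1$. Then $\tau^{A}=\mathrm{diag}(p,1-p)\otimes\frac{\mathds{1}}{2}$. The coin bias `p` and all helper names are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def controlled(R0, R1):
    """|0><0| tensor R0 + |1><1| tensor R1 on A'A'', with index 2*a' + a''."""
    M = [[0j] * 4 for _ in range(4)]
    for r in range(2):
        for c in range(2):
            M[r][c] = complex(R0[r][c])
            M[2 + r][2 + c] = complex(R1[r][c])
    return M

def ptrace_second(M):
    """Partial trace over the second qubit (A'') of a 4x4 matrix."""
    return [[sum(M[2 * a + c][2 * b + c] for c in range(2)) for b in range(2)]
            for a in range(2)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
minus_Y = [[0, 1j], [-1j, 0]]

# Coin 0 uses (1, X, Y, Z); coin 1 uses the Hadamard conjugates (1, Z, -Y, X).
U = [controlled(P, Q) for P, Q in zip([I2, X, Y, Z], [I2, Z, minus_Y, X])]

p = 0.3  # hypothetical bias of the shared classical coin
tau_A = [[0j] * 4 for _ in range(4)]
for idx, w in enumerate([p / 2, p / 2, (1 - p) / 2, (1 - p) / 2]):
    tau_A[idx][idx] = complex(w)

def lemma_2_3_block(i, j):
    """Tr_{A''}(U_i tau^A U_j^*) as a 2x2 matrix on A'."""
    return ptrace_second(matmul(matmul(U[i], tau_A), dagger(U[j])))
```

In this example `lemma_2_3_block(i, j)` vanishes for $i\neq j$, while for $i=j$ it returns $\mathrm{diag}(p,1-p)$, the reduced state on $A^{\prime}$.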

Next we define what it means for superdense protocols to be locally equivalent.

Definition 2.4.

Let $\mathcal{H}_{A}\coloneqq\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}}$ be a Hilbert space where $\mathcal{H}_{A^{\prime\prime}}$ is isomorphic to $\mathbb{C}^{d}$ . Let $\tau,\tau^{\prime}$ be density matrices on $\mathcal{H}_{A}\otimes\mathcal{H}_{B}$ . Let $(U_{i}),(U_{i}^{\prime})$ be unitary operators acting on $\mathcal{H}_{A}$ . We say that $(\tau,(U_{i}))$ and $(\tau^{\prime},(U_{i}^{\prime}))$ are locally equivalent if there exists

1.

A unitary operator $V$ acting on $\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}}$ ,
2.

A set of unitary operators $(C_{i})_{i\in[d^{2}]}$ acting on $\mathcal{H}_{A^{\prime}}$

such that

1.

$\tau^{\prime}=(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*}$ , and
2.

$U_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}$ .

Lemma 2.5 (Local unitary freedom of superdense coding protocols).

The following properties hold for local equivalence.

1.

If $(\tau,(U_{i}))$ and $(\tau^{\prime},(U_{i}^{\prime}))$ are locally equivalent, then $(\tau,(U_{i}))$ is a $(d,\varepsilon)$ -superdense coding protocol if and only if $(\tau^{\prime},(U_{i}^{\prime}))$ is.
2.

Local equivalence is transitive.

Proof.

For any fixed $i$ , after Alice applies her encoding unitary operator, the reduced density matrix on registers $A^{\prime\prime}B$ is the same whether the protocol $(\tau,(U_{i}))$ or $(\tau^{\prime},(U_{i}^{\prime}))$ is used. Thus Bob’s ability to distinguish between the different messages is exactly the same. This establishes Item 1.

If $(\tau,(U_{i}))$ and $(\tau^{\prime},(U_{i}^{\prime}))$ are locally equivalent, then $\tau^{\prime}=(V\otimes\mathds{1})\tau(V^{*}\otimes\mathds{1})$ , and $U_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}$ . If $(\tau^{\prime},(U_{i}^{\prime}))$ and $(\tau^{\prime\prime},(U_{i}^{\prime\prime}))$ are locally equivalent, then $\tau^{\prime\prime}=(V^{\prime}\otimes\mathds{1})\tau^{\prime}((V^{\prime})^{*}\otimes\mathds{1})$ , and $U_{i}^{\prime\prime}=(C_{i}^{\prime}\otimes\mathds{1})U_{i}^{\prime}(V^{\prime})^{*}$ . Thus we have

	$\displaystyle\tau^{\prime\prime}\quad$	$\displaystyle=\quad(V^{\prime}V\otimes\mathds{1})\tau(V^{*}(V^{\prime})^{*}\otimes\mathds{1})\enspace,\qquad\text{and}$
	$\displaystyle U_{i}^{\prime\prime}\quad$	$\displaystyle=\quad(C_{i}^{\prime}C_{i}\otimes\mathds{1})U_{i}V^{*}(V^{\prime})^{*}\enspace,$

which implies that $(\tau,(U_{i}))$ is locally equivalent to $(\tau^{\prime\prime},(U_{i}^{\prime\prime}))$ . This establishes Item 2. ∎

2.3 Nice form protocols

In this section, we define nice form protocols and then show that every superdense coding protocol is locally equivalent to one that has a nice form.

Definition 2.6.

A $d$ -dimensional superdense coding protocol $(\tau,(U_{i}))$ has a nice form if

1.

$U_{1}=\mathds{1}$ ,
2.

There exists an isometry $W:\mathcal{H}_{B}\to\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{B^{\prime\prime}}$ where $\mathcal{H}_{B^{\prime\prime}}$ is isomorphic to $\mathbb{C}^{d}$ such that

(\mathds{1}\otimes W)\tau(\mathds{1}\otimes W)^{*}=\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}}

for some density matrix $\rho$ on $\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ .
3.

For all $i\in[d^{2}]$ , we have that

U_{i}\operatorname{Tr}_{B}(\tau)U_{i}^{*}=U_{i}\left(\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{d}\right)U_{i}^{*}=\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{d}\;

where $\rho^{A^{\prime}}$ denotes the reduced density matrix of $\tau$ on $\mathcal{H}_{A^{\prime}}$ .

4.

Let the spectral decomposition of $\rho^{A^{\prime}}$ be $\sum_{k}\lambda_{k}\Pi_{k}$ where $\lambda_{k}>0$ for all $k$ , with $\lambda_{k}$ distinct. Then for all $k$ and $i\neq j$ , we have

$\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})\,U_{i}U_{j}^{*}\,(\Pi_{k}\otimes\mathds{1}))=0.$

Item 2 says that up to a local unitary operation, the two parties share a maximally entangled state (in addition to other entanglement), and Item 3 turns out to be a consequence of this. Item 4 is equivalent to saying that the encodings of distinct messages $i\neq j$ are orthogonal to each other. The proof of Lemma 2.7 below may give the reader further intuition into these properties.

In the proof of Lemma 2.7, we make use of an information-theoretic argument that involves quantities such as von Neumann entropy $\operatorname{H}(A)$ , conditional entropy $\operatorname{H}(A|B)$ , and mutual information $\operatorname{I}(A:B)$ . For a comprehensive reference on these quantities and their basic properties, we recommend Wilde’s textbook [Wil13]. It is an interesting question whether Lemma 2.7 can be proved without making use of these information-theoretic quantities.

Lemma 2.7.

All superdense coding protocols $(\tau,(U_{i}))$ are locally equivalent to a superdense coding protocol $(\tau^{\prime},(U_{i}^{\prime}))$ that has a nice form.

Proof.

We define a unitary operator $V$ acting on $\mathcal{H}_{A}$ and unitary operators $(C_{i}:i\in[d^{2}])$ acting on $\mathcal{H}_{A^{\prime}}$ such that, letting $\tau^{\prime}\coloneqq(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*}$ and $U_{i}^{\prime}\coloneqq(C_{i}\otimes\mathds{1})U_{i}V^{*}$ , the pair $(\tau^{\prime},(U_{i}^{\prime}))$ is a superdense coding protocol and has a nice form.

Let $V\coloneqq U_{1}$ and let $C_{1}\coloneqq\mathds{1}$ . This already yields Item 1 of Definition 2.6.

Let $|\tau\rangle^{ABR}$ be a purification of $\tau$ where $\mathcal{H}_{R}$ is a reference system of dimension $\dim(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ . Consider the cq-state

\xi\coloneqq\frac{1}{d^{2}}\sum_{i}|i\rangle\!\langle i|^{X}\otimes(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|^{ABR}(U_{i}^{*}\otimes\mathds{1})\enspace,

where the Hilbert space of register $X$ is $\mathcal{H}_{X}$ . By Lemma 2.5, the protocol $(V\tau V^{*},(U_{i}V^{*}))$ is a superdense coding protocol. Therefore the information contained in registers $A^{\prime\prime}B$ about $X$ in the state $\xi$ is

\operatorname{I}(X:A^{\prime\prime}B)_{\xi}=2\log_{2}d\enspace.

Intuitively, this is because Bob can perfectly recover the value of $i\in[d^{2}]$ , i.e., $2\log_{2}d$ bits of information, from the registers $A^{\prime\prime}B$ of $\xi$ . On the other hand, we have that

\operatorname{I}(X:B)_{\xi}=0

because without the $d$ -dimensional register $A^{\prime\prime}$ , Bob has no information about $X$ (the state of the register $B$ is the same for all $i$ ). Therefore we get

\operatorname{I}(X:A^{\prime\prime}|B)_{\xi}=\operatorname{I}(X:A^{\prime\prime}B)_{\xi}-\operatorname{I}(X:B)_{\xi}=2\log_{2}d\enspace.

Using the entropy characterization of conditional mutual information, we get

2\log_{2}d=\operatorname{I}(X:A^{\prime\prime}|B)_{\xi}=\operatorname{H}(A^{\prime\prime}|B)_{\xi}-\operatorname{H}(A^{\prime\prime}|XB)_{\xi}\enspace.

Since $\operatorname{H}(A^{\prime\prime}|B)_{\xi}\leq\log_{2}d$ and $\operatorname{H}(A^{\prime\prime}|XB)_{\xi}\geq-\log_{2}d$ (because the dimension of register $A^{\prime\prime}$ is $d$ ), we get that $\operatorname{H}(A^{\prime\prime}|B)_{\xi}=\log_{2}d$ and $\operatorname{H}(A^{\prime\prime}|XB)_{\xi}=-\log_{2}d$ .

Since $X$ is a classical register, we can write $\operatorname{H}(A^{\prime\prime}|XB)_{\xi}$ as

-\log_{2}d=\operatorname{H}(A^{\prime\prime}|XB)_{\xi}=\operatorname*{\mathbb{E}}_{i}\operatorname{H}(A^{\prime\prime}|B,X=i)

where $\operatorname{H}(A^{\prime\prime}|B,X=i)$ is defined as $\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}}$ with $|\xi_{i}\rangle\coloneqq(U_{i}\otimes\mathds{1})|\tau\rangle$ . Since $\operatorname{H}(A^{\prime\prime}|B,X=i)\geq-\log_{2}d$ , we have $\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}}=-\log_{2}d$ for all $i$ , and in particular for $i=1$ .

Then $\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}}=-\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}}$ (because $\xi_{i}$ is pure), so $\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}}=\log_{2}d$ . On one hand, we have that $\operatorname{I}(A^{\prime\prime}:RA^{\prime})_{\xi_{i}}=\operatorname{H}(A^{\prime\prime})_{\xi_{i}}-\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}}$ , and on the other hand, mutual information is always nonnegative. Thus $\operatorname{H}(A^{\prime\prime})_{\xi_{i}}=\log_{2}d$ , and the reduced density matrix of $\xi_{i}$ on the $A^{\prime\prime}$ register is maximally mixed. Furthermore we have $\operatorname{I}(A^{\prime\prime}:RA^{\prime})_{\xi_{i}}=0$ , so $\xi_{i}$ has no correlations between registers $A^{\prime\prime}$ and $RA^{\prime}$ :

\operatorname{Tr}_{B}(\xi_{i})=\rho_{i}^{RA^{\prime}}\otimes\frac{\mathds{1}}{d}

(2.2)

where $\rho_{i}$ is some density matrix on the $RA^{\prime}$ registers.

Fix $i=1$ , and let $\rho^{RA^{\prime}}$ denote $\rho_{1}^{RA^{\prime}}$ . Let $\mathcal{H}_{B^{\prime}}$ be a Hilbert space with dimension $\dim(\mathcal{H}_{R}\otimes\mathcal{H}_{A^{\prime}})$ and let $\mathcal{H}_{B^{\prime\prime}}$ be isomorphic to $\mathbb{C}^{d}$ . Let $|\rho\rangle^{RA^{\prime}B^{\prime}}$ denote a purification of $\rho^{RA^{\prime}}$ . Notice that $|\rho\rangle^{RA^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle^{A^{\prime\prime}B^{\prime\prime}}$ is a purification of the state in Equation 2.2. Using Uhlmann’s Theorem [Uhl76] (also known as the Schrödinger-HJW Theorem [Sch35, HJW93]), there exists an isometry $W$ on $\mathcal{H}_{B}$ such that

(\mathds{1}\otimes W)|\xi_{1}\rangle^{RA^{\prime}A^{\prime\prime}B}=|\rho\rangle^{RA^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle^{A^{\prime\prime}B^{\prime\prime}}\;.

Since $|\xi_{1}\rangle=(V\otimes\mathds{1})|\tau\rangle$ , we have that

\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}}=\operatorname{Tr}_{R}((\mathds{1}\otimes W)|\xi_{1}\rangle\!\langle\xi_{1}|(\mathds{1}\otimes W)^{*})=(V\otimes W)\tau(V\otimes W)^{*}\;.

Since $\tau^{\prime}=(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*}$ we obtain Item 2 of Definition 2.6 for the protocol $(\tau^{\prime},(U_{i}V^{*}))$ .

In what follows we let $\zeta$ denote $\rho^{A^{\prime}}$ (which also equals $\tau^{\prime A^{\prime}}$ ). We now establish Item 3 of Definition 2.6. Let $\sum_{k}\lambda_{k}\Pi_{k}$ be the spectral decomposition of $\zeta$ where the $\{\lambda_{k}\}$ are distinct and nonzero, and $\Pi_{k}$ is the orthogonal projector onto the eigenspace of $\zeta$ corresponding to eigenvalue $\lambda_{k}$ . Since by Equation 2.2 we have

U_{i}V^{*}\left(\zeta\otimes\frac{\mathds{1}}{d}\right)(U_{i}V^{*})^{*}=\operatorname{Tr}_{BR}(\xi_{i})=\zeta_{i}\otimes\frac{\mathds{1}}{d}

(2.3)

for some density matrix $\zeta_{i}$ , the states $\zeta_{i}$ and $\zeta$ have the same eigenvalues with the same multiplicities. That is, there is a set of mutually orthogonal projectors $\{\Pi_{k}^{(i)}\}_{k}$ such that

\zeta_{i}=\sum_{k}\lambda_{k}\Pi_{k}^{(i)}\enspace,

where $\dim(\Pi_{k}^{(i)})=\dim(\Pi_{k})$ for all $i\in[d^{2}]$ . It follows that for all $i$ ,

U_{i}V^{*}\left(\Pi_{k}\otimes\frac{\mathds{1}}{d}\right)(U_{i}V^{*})^{*}=\Pi_{k}^{(i)}\otimes\frac{\mathds{1}}{d}\enspace.

For $i\in[d^{2}]$ let $C_{i}$ be a unitary operator on $\mathcal{H}_{A^{\prime}}$ such that $C_{i}\Pi_{k}^{(i)}C_{i}^{*}=\Pi_{k}$ for all $k$ . Since $\Pi_{k}^{(1)}=\Pi_{k}$ , our choice of $C_{1}=\mathds{1}$ suffices. Let $U_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}$ . By Lemma 2.5, we have that $(\tau^{\prime},(U_{i}^{\prime}))$ is a superdense coding protocol. Furthermore, Equation 2.3 implies that

U_{i}^{\prime}\Big{(}\sum_{k}\lambda_{k}\,\Pi_{k}\otimes\frac{\mathds{1}}{d}\Big{)}(U_{i}^{\prime})^{*}=\sum_{k}\lambda_{k}\,\Pi_{k}\otimes\frac{\mathds{1}}{d}\enspace,

which implies Item 3 of Definition 2.6.

Lemma 2.5 implies that $(\tau^{\prime},(U_{i}^{\prime}))$ is also a superdense coding protocol, so by Lemma 2.3 we have that for all $i\neq j$ ,

\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\zeta\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)=0\;.

Since $\Pi_{k}\otimes\mathds{1}$ commutes with $U_{i}^{\prime}$ for all $i$ and $k$ , we have

	$\displaystyle 0$	$\displaystyle=\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\zeta\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)$
		$\displaystyle=\sum_{k}\lambda_{k}\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\Pi_{k}\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)$
		$\displaystyle=\frac{1}{d}\sum_{k}\lambda_{k}\operatorname{Tr}_{A^{\prime\prime}}\left(\left(\Pi_{k}\otimes\mathds{1}\right)U_{i}^{\prime}(U_{j}^{\prime})^{*}\right)\enspace.$

Since the $\lambda_{k}$ ’s are positive, we have

\operatorname{Tr}_{A^{\prime\prime}}\left((\Pi_{k}\otimes\mathds{1})U_{i}^{\prime}(U_{j}^{\prime})^{*}(\Pi_{k}\otimes\mathds{1})\right)=0

for all $k$ , and $i\neq j$ . This implies Item 4 of Definition 2.6.

∎
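As a sanity check on the entropy bookkeeping in the proof above, the following worked example evaluates the relevant quantities for the canonical protocol. It assumes $d=2$, a single shared EPR pair, Pauli encoding operators, and trivial registers $A^{\prime}$ and $R$. In that case the state on $A^{\prime\prime}B$ is one of the four Bell states, chosen uniformly by $X$.

```latex
% Canonical protocol: d = 2, tau = |phi_2><phi_2|, U_i Pauli, A' and R trivial.
\begin{align*}
\operatorname{H}(A''B)_{\xi} &= 2, \qquad \operatorname{H}(A''B|X)_{\xi} = 0, \qquad
\operatorname{H}(B)_{\xi} = \operatorname{H}(B|X)_{\xi} = 1, \\
\operatorname{I}(X:A''B)_{\xi} &= \operatorname{H}(A''B)_{\xi} - \operatorname{H}(A''B|X)_{\xi} = 2 = 2\log_{2} d, \\
\operatorname{H}(A''|B)_{\xi} &= \operatorname{H}(A''B)_{\xi} - \operatorname{H}(B)_{\xi} = 2 - 1 = \log_{2} d, \\
\operatorname{H}(A''|XB)_{\xi} &= \operatorname{H}(A''B|X)_{\xi} - \operatorname{H}(B|X)_{\xi} = 0 - 1 = -\log_{2} d.
\end{align*}
```

Both extremal values $\operatorname{H}(A^{\prime\prime}|B)_{\xi}=\log_{2}d$ and $\operatorname{H}(A^{\prime\prime}|XB)_{\xi}=-\log_{2}d$ are attained here, consistent with the argument in the proof.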

3 Rigidity for two-dimensional superdense coding

In this section we prove Theorem 1.1, that is, rigidity for $2$ -dimensional superdense coding protocols (coding $2$ bits into one qubit, with no error). For the remainder of this section we drop the qualification “ $2$ -dimensional” for brevity.

The proof involves a number of steps. First, we invoke Lemma 2.7, which states that every superdense coding protocol is locally equivalent to one that has a nice form. Then, we argue that up to local equivalence, in every nice form superdense coding protocol, the encoding operators $(U_{i})$ can be block-diagonalized. That is, we can write $U_{i}=\sum_{\ell}Q_{i\ell}\otimes R_{i\ell}$ where the $\{Q_{i\ell}\}$ are a set of orthogonal projectors acting on $\mathcal{H}_{A^{\prime}}$ summing to the identity, and the $\{R_{i\ell}\}$ are a set of Hermitian unitary operators acting on $\mathcal{H}_{A^{\prime\prime}}$ . Next, we argue that (again up to local equivalence) across the different $i$ ’s, the projectors $\{Q_{i\ell}\}$ can be “matched up”, and the corresponding operators $R_{i\ell}$ are all pairwise orthogonal. This implies that in fact $\{R_{1\ell},R_{2\ell},R_{3\ell},R_{4\ell}\}$ are unitarily equivalent to the standard Pauli matrices $\{\mathds{1},{\mathrm{X}},{\mathrm{Y}},{\mathrm{Z}}\}$ . Using the property that local equivalence is a transitive relation, this concludes the argument.

Lemma 2.7 is proven in Section 2.3. We now proceed to prove the remaining steps in detail.

3.1 Block-diagonalizing nice form protocols

In this section, we analyze the structure of the encoding operators in a nice form superdense coding protocol in dimension two. We show that they apply a $2\times 2$ unitary operator on register $A^{\prime\prime}$ , controlled by the state in register $A^{\prime}$ .

Theorem 3.1.

Let $(\tau,(U_{i}))$ be a nice-form protocol. Then there exists a locally-equivalent protocol $(\tau^{\prime},(U_{i}^{\prime}))$ that has a nice form and for $i\in\{2,3,4\}$ we have

U_{i}^{\prime}=_{\tau^{\prime}}\sum_{\ell}Q_{i\ell}\otimes R_{i\ell}\;.

for some orthogonal projectors $\{Q_{i\ell}\}_{\ell}$ on $\mathcal{H}_{A^{\prime}}$ that sum to $\mathds{1}$ , and $2\times 2$ traceless, Hermitian unitary matrices $\{R_{i\ell}\}_{\ell}$ on $\mathcal{H}_{A^{\prime\prime}}$ .

Proof.

Fix an $i\in\{2,3,4\}$ . Since $(\tau,(U_{i}))$ has a nice form (see Definition 2.6), this means that $\tau^{A}=\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{2}$ for some density matrix $\rho$ , and $\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}U_{1}^{*}(\Pi_{k}\otimes\mathds{1}))=\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1}))=0$ , where $\Pi_{k}$ is the projector onto an eigenspace of $\rho^{A^{\prime}}$ with non-zero eigenvalue.

Fix a $k$ . By property 3 of a nice-form protocol, $U_{i}$ commutes with $\Pi_{k}\otimes\mathds{1}$ , and we can write

U_{i}\quad=\quad(\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1})+(\mathds{1}-\Pi_{k}\otimes\mathds{1})U_{i}(\mathds{1}-\Pi_{k}\otimes\mathds{1})\enspace.

Let $\hat{U}_{ik}=(\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1})$ , and note that $\hat{U}_{ik}$ is unitary on the image of $\Pi_{k}\otimes\mathds{1}$ , i.e.:

\hat{U}_{ik}\hat{U}_{ik}^{*}=\hat{U}_{ik}^{*}\hat{U}_{ik}=\Pi_{k}\otimes\mathds{1}\;.

For notational convenience we drop the subscripts $i$ and $k$ until the very end. Let $\hat{U}\coloneqq\hat{U}_{ik}$ and let $\Pi\coloneqq\Pi_{k}$ .

The condition that $\operatorname{Tr}_{A^{\prime\prime}}(\hat{U})=0$ implies that we can write $\hat{U}$ as

\hat{U}=\left(\begin{array}[]{c|c}F&G\\ \hline\cr H&-F\end{array}\right)\enspace,

where $F,G,H$ are block matrices that act on the image of $\Pi$ and the block partitions are with respect to the tensor factor $\mathcal{H}_{A^{\prime\prime}}$ .

Let

F=D_{F}T_{F}~{},\qquad G=D_{G}T_{G}~{},\qquad H=D_{H}T_{H}

give the polar decompositions of $F,G,H$ respectively where $D_{F},D_{G},D_{H}$ are positive semidefinite and $T_{F},T_{G},T_{H}$ are unitary on the image of $\Pi$ .

Then

\hat{U}=\left(\begin{array}[]{c|c}D_{F}T_{F}&D_{G}T_{G}\\ \hline\cr D_{H}T_{H}&-D_{F}T_{F}\end{array}\right).

The relation $\hat{U}\hat{U}^{*}=\Pi\otimes\mathds{1}$ implies that $D_{F}^{2}=\Pi-D_{G}^{2}=\Pi-D_{H}^{2}$ . For notational brevity we write $D\coloneqq D_{F}$ and $\widetilde{D}\coloneqq D_{G}=D_{H}=\sqrt{\Pi-D^{2}}$ . Note that $D$ and $\widetilde{D}$ have support in the image of $\Pi$ and are simultaneously diagonalizable. Write $K\coloneqq T_{F}^{*}DT_{F}$ and $\widetilde{K}\coloneqq T_{F}^{*}\widetilde{D}T_{F}$ . Note that $K$ and $\widetilde{K}$ are positive semidefinite, and also simultaneously diagonalizable. Write $W_{G}\coloneqq T_{F}^{*}T_{G}$ and $W_{H}^{*}\coloneqq T_{F}^{*}T_{H}$ . Continuing our simplification, we see that

(T_{F}^{*}\otimes\mathds{1})\hat{U}=\left(\begin{array}[]{c|c}K&\widetilde{K}W_{G}\\ \hline\cr\widetilde{K}W_{H}^{*}&-K\end{array}\right).

Now our goal is to find a unitary operator $E$ acting on $\mathcal{H}_{A^{\prime}}$ such that $(ET_{F}^{*}\otimes\mathds{1})\hat{U}$ is Hermitian. This is equivalent to the conditions that $EK=KE^{*}$ and $E\widetilde{K}W_{G}=(E\widetilde{K}W_{H}^{*})^{*}$ . We construct such an operator $E$ using some relations between the operators $K,\widetilde{K},W_{G},W_{H}$ that we derive below.

We use the unitarity relation $\hat{U}^{*}\hat{U}=\Pi\otimes\mathds{1}$ to obtain the equations

	$\displaystyle K^{2}+W_{G}^{*}\widetilde{K}^{2}W_{G}$	$\displaystyle=\Pi\enspace,\qquad\text{and}$		(3.1)
	$\displaystyle K^{2}+W_{H}\widetilde{K}^{2}W_{H}^{*}$	$\displaystyle=\Pi\enspace.$		(3.2)

These equations, along with the definitions of $K$ and $\widetilde{K}$ , imply that the unitary operators $W_{G},W_{H}$ are block-diagonal with respect to the eigenspaces of $K$ and $\widetilde{K}$ (and therefore commute with $K$ and $\widetilde{K}$ ). Another relation we get from unitarity is $K\widetilde{K}W_{G}=W_{H}\widetilde{K}K$ , which via commutativity of $W_{H},\widetilde{K},K$ implies

K\widetilde{K}W_{G}=K\widetilde{K}W_{H}\;.

(3.3)

Let $\Pi_{+}$ be the orthogonal projector onto $\operatorname{supp}(K)$ . Since $W_{G}$ commutes with $K,\widetilde{K}$ (and so does $W_{H}$ ), we have that $\Pi_{+}$ commutes with $\widetilde{K},W_{G},W_{H}$ . By Equation 3.3, we have $\Pi_{+}\widetilde{K}W_{G}=\Pi_{+}\widetilde{K}W_{H}=W_{H}\widetilde{K}\,\Pi_{+}$ . Since $K^{2}+\widetilde{K}^{2}=\Pi$ , we also have that $(\Pi-\Pi_{+})\widetilde{K}=(\Pi-\Pi_{+})$ . Note that Equation 3.3, together with the fact that $W_{H},W_{G}$ are block-diagonal with respect to the eigenspaces of the product $K\widetilde{K}$ , implies that $W_{G}$ and $W_{H}$ must be equal on the support of $K\widetilde{K}$ (equivalently, the support of $\Pi_{+}$ ).

We now construct the desired unitary $E$ . Let $\Pi_{0}$ denote $\Pi-\Pi_{+}$ (i.e., the projection onto the kernel of $K$ within the image of $\Pi$ ). Left-multiplying both sides of Equation 3.3 with $K^{+}$ (the pseudoinverse of $K$ on the image of $\Pi_{+}$ ) and right-multiplying both sides by $\Pi_{+}$ we get

\Pi_{+}\widetilde{K}W_{G}\Pi_{+}=\Pi_{+}\widetilde{K}W_{H}\Pi_{+}\;.

(3.4)

Recall that $\Pi_{0}\widetilde{K}=\Pi_{0}$ ; combined with the fact that $W_{G}$ and $W_{H}$ are both block-diagonal with respect to the eigenspaces of $K$ and $\widetilde{K}$ , we have

\widetilde{K}W_{G}=\Pi_{+}\widetilde{K}W_{G}\Pi_{+}+\Pi_{0}W_{G}\Pi_{0}\qquad\text{and}\qquad\widetilde{K}W_{H}^{*}=\Pi_{+}\widetilde{K}W_{H}^{*}\Pi_{+}+\Pi_{0}W_{H}^{*}\Pi_{0}~{}.

(3.5)

Furthermore, $V_{G}\coloneqq\Pi_{0}W_{G}\Pi_{0}$ and $V_{H}\coloneqq\Pi_{0}W_{H}^{*}\Pi_{0}$ are unitary on $\Pi_{0}$ . Let $M\coloneqq V_{H}^{*}V_{G}$ , and consider its spectral decomposition $M=\sum_{j}e^{{\mathrm{i}}\theta_{j}}|v_{j}\rangle\!\langle v_{j}|$ where $\{|v_{j}\rangle\}$ is an orthonormal basis for the support of $\Pi_{0}$ . Let $M^{1/2}\coloneqq\sum_{j}e^{{\mathrm{i}}\theta_{j}/2}|v_{j}\rangle\!\langle v_{j}|$ denote the principal square root of $M$ . Define $E_{0}\coloneqq M^{1/2}V_{G}^{*}$ . Observe that $E_{0}$ is supported only on $\Pi_{0}$ and satisfies

E_{0}V_{G}=M^{1/2}=(M^{1/2}M^{*})^{*}=(E_{0}V_{H})^{*}.

(3.6)
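The identity in Equation 3.6 can be checked numerically in a simple special case. The sketch below assumes that $V_{G}$ and $V_{H}$ are simultaneously diagonal, so each is represented by its list of unit-modulus eigenvalues; the general case needs a spectral decomposition of $M$, which is not attempted here. The function name `build_E0` and the sample phases are illustrative.

```python
import cmath

def build_E0(vG, vH):
    """Diagonal case: vG[j], vH[j] are the unit-modulus eigenvalues of V_G, V_H."""
    m = [h.conjugate() * g for g, h in zip(vG, vH)]       # M = V_H^* V_G
    root = [cmath.sqrt(x) for x in m]                     # principal root M^{1/2}
    return [r * g.conjugate() for r, g in zip(root, vG)]  # E_0 = M^{1/2} V_G^*

# Arbitrary sample phases for the two diagonal unitaries.
vG = [cmath.exp(0.4j), cmath.exp(2.5j), cmath.exp(-1.1j)]
vH = [cmath.exp(1.3j), cmath.exp(-0.2j), cmath.exp(3.0j)]
E0 = build_E0(vG, vH)

lhs = [e * g for e, g in zip(E0, vG)]                # E_0 V_G
rhs = [(e * h).conjugate() for e, h in zip(E0, vH)]  # (E_0 V_H)^*
```

In this diagonal setting `lhs` and `rhs` agree entrywise and every entry of `E0` has modulus one, matching the claim that $E_{0}$ is unitary on $\Pi_{0}$ and satisfies Equation 3.6.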

Consider the operator $E\coloneqq\Pi_{+}+E_{0}$ that is unitary on the support of $\Pi$ , and acts non-trivially only on the support of $\Pi_{0}$ . Combining Equations (3.4), (3.5) and (3.6) we get

	$\displaystyle E\widetilde{K}W_{G}$	$\displaystyle=\Pi_{+}\widetilde{K}W_{G}\Pi_{+}+E_{0}V_{G}$
		$\displaystyle=\Pi_{+}\widetilde{K}W_{H}\Pi_{+}+(E_{0}V_{H})^{*}$
		$\displaystyle=(\Pi_{+}W_{H}^{*}\widetilde{K}\Pi_{+}+E_{0}V_{H})^{*}$
		$\displaystyle=(\Pi_{+}\widetilde{K}W_{H}^{*}\Pi_{+}+E_{0}V_{H})^{*}$
		$\displaystyle=(E\widetilde{K}W_{H}^{*})^{*}\;,$

where the second line follows from Equation 3.4 and Equation 3.6, the fourth line follows because $\widetilde{K}$ commutes with $W_{H}^{*}$ , and the last line follows because of Equation 3.5. We also have that $E$ commutes with $\widetilde{K}$ : this is because $\Pi_{+}$ commutes with $\widetilde{K}$ and also $E_{0}$ acts nontrivially only on the eigenspace of $\widetilde{K}$ with eigenvalue $1$ .

Let $L\coloneqq E\widetilde{K}W_{G}$ . Putting everything together, we have

(ET_{F}^{*}\otimes\mathds{1})\hat{U}=\left(\begin{array}[]{c|c}EK&E\widetilde{K}W_{G}\\ \hline\cr E\widetilde{K}W_{H}^{*}&-EK\end{array}\right)=\left(\begin{array}[]{c|c}K&L\\ \hline\cr L^{*}&-K\end{array}\right).

(3.7)

where we used the fact that $EK=K$ .

Let $K=\sum_{r}\alpha_{r}P_{r}$ and $\widetilde{K}=\sum_{r}\sqrt{1-\alpha_{r}^{2}}\,P_{r}$ be spectral decompositions of $K$ and $\widetilde{K}$ where the real numbers $\alpha_{r}$ are nonnegative and distinct, and the operators $P_{r}$ are orthogonal projectors summing to $\Pi$ . The operator $\widetilde{K}$ has such a spectral decomposition because $K^{2}+\widetilde{K}^{2}=\Pi$ . Next, since the unitary operators $E$ and $W_{G}$ commute with $\widetilde{K}$ , they are block-diagonal with respect to the projectors $\{P_{r}\}$ . Thus

P_{r}LP_{r}=P_{r}E\widetilde{K}W_{G}P_{r}=P_{r}\sqrt{\widetilde{K}}EW_{G}\sqrt{\widetilde{K}}P_{r}=\sqrt{1-\alpha_{r}^{2}}\,\,P_{r}EW_{G}P_{r}~{}.

The operator $P_{r}EW_{G}P_{r}$ is unitary on $P_{r}$ and we can express it as $\sum_{s}\beta_{rs}Q_{rs}$ where the $\beta_{rs}$ ’s are complex numbers on the unit circle and $\{Q_{rs}\}_{s}$ are orthogonal projectors that sum to $P_{r}$ . Thus we can write $K=\sum_{r,s}\alpha_{r}\,Q_{rs}$ and $L=\sum_{r,s}\sqrt{1-\alpha_{r}^{2}}\,\beta_{rs}\,Q_{rs}$ , and $(ET_{F}^{*}\otimes\mathds{1})\hat{U}$ can be written as

(ET_{F}^{*}\otimes\mathds{1})\hat{U}=\sum_{r,s}Q_{rs}\otimes R_{rs}

(3.8)

where $R_{rs}$ is the $2\times 2$ matrix

\begin{pmatrix}\alpha_{r}&\sqrt{1-\alpha_{r}^{2}}\cdot\beta_{rs}\\ \sqrt{1-\alpha_{r}^{2}}\cdot\beta_{rs}^{*}&-\alpha_{r}\end{pmatrix}\enspace.

Notice that $R_{rs}$ has determinant $-1$ and is traceless, therefore its eigenvalues are $\{+1,-1\}$ .
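The stated properties of $R_{rs}$ are easy to confirm numerically. The following minimal sketch builds the matrix for one arbitrary choice of $\alpha_{r}\in[0,1]$ and a unit-modulus $\beta_{rs}$, then computes its trace, determinant, and square. The function names and the sample values are illustrative assumptions.

```python
import cmath
import math

def R_block(alpha, beta):
    """The 2x2 matrix R_rs for alpha in [0, 1] and beta on the unit circle."""
    s = math.sqrt(1 - alpha ** 2)
    return [[alpha, s * beta], [s * beta.conjugate(), -alpha]]

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha, beta = 0.6, cmath.exp(0.7j)  # arbitrary illustrative values
R = R_block(alpha, beta)

trace = R[0][0] + R[1][1]
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
square = matmul2(R, R)  # equals the identity when R is a Hermitian unitary
```

For these values the trace is $0$, the determinant is $-1$, and `square` is the identity up to rounding, so `R` is a traceless Hermitian unitary with eigenvalues $\pm 1$.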

Re-introducing the indices $i\in\{2,3,4\}$ and $k$ , we have deduced that for every block of $U_{i}$ corresponding to the eigenspace $\Pi_{k}$ , there exists a map $S_{ik}$ that is unitary on the image of $\Pi_{k}$ such that

(S_{ik}\otimes\mathds{1})\hat{U}_{ik}=\sum_{r,s}Q_{ikrs}\otimes R_{ikrs}

where the $(R_{ikrs})_{r,s}$ are $2\times 2$ Hermitian unitary operators with trace $0$ . Define the unitary operator $S_{i}$ on $\mathcal{H}_{A^{\prime}}$ as $S_{i}\coloneqq(\mathds{1}-\sum_{k}\Pi_{k})+\sum_{k}S_{ik}$ . If we sum over $k$ , we get

(S_{i}\otimes\mathds{1})\hat{U}_{i}=\sum_{\ell}Q_{i\ell}\otimes R_{i\ell}\enspace,

where $\hat{U}_{i}\coloneqq\sum_{k}\hat{U}_{ik}$ and we have re-indexed the sum over $k,r,s$ to be a sum over indices $\ell$ . Let $U_{i}^{\prime}\coloneqq(S_{i}\otimes\mathds{1})U_{i}$ for all $i\in\{2,3,4\}$ . Then, letting $P\coloneqq\sum_{k}\Pi_{k}$ denote the projector onto the support of $\rho$ , we have

U_{i}^{\prime}\tau(U_{i}^{\prime})^{*}=(S_{i}U_{i}P)\,\tau\,(S_{i}U_{i}P)^{*}=(S_{i}\hat{U}_{i}P)\,\tau\,(S_{i}\hat{U}_{i}P)^{*}=S_{i}\hat{U}_{i}\,\tau\,\hat{U}_{i}^{*}S_{i}^{*}

where we have suppressed the tensoring with identity that extends all the operators to the same space, and used the property that $(P\otimes\mathds{1})\tau(P\otimes\mathds{1})=\tau$ , $\hat{U}_{i}P=U_{i}P$ , and $\hat{U}_{i}P=\hat{U}_{i}$ . Thus, the unitary operators $U_{i}^{\prime}$ satisfy the conclusions of the theorem statement. Let $\tau^{\prime}\coloneqq\tau$ , so that $(\tau^{\prime},(U_{i}^{\prime}))$ is a superdense coding protocol by Lemma 2.5.

Furthermore, since $(\tau,(U_{i}))$ has a nice form, it can be verified that $(\tau^{\prime},(U_{i}^{\prime}))$ also has a nice form. First, since $S_{1}=\mathds{1}$ , we have that $U_{1}^{\prime}=\mathds{1}$ (and hence Item 1 of Definition 2.6 is satisfied). Item 2 of Definition 2.6 is satisfied since $\tau^{\prime}=\tau$ . Third, since $U_{i}$ commutes with $\rho\otimes\frac{\mathds{1}}{2}$ and $S_{i}$ is block-diagonal with respect to the eigenspaces of $\rho$ , it follows that $U_{i}^{\prime}$ also commutes with $\rho\otimes\frac{\mathds{1}}{2}$ (so Item 3 of Definition 2.6 is satisfied). Finally, we have

	$\displaystyle\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}^{\prime}(U_{j}^{\prime})^{*}(\Pi_{k}\otimes\mathds{1}))$	$\displaystyle=\operatorname{Tr}_{A^{\prime\prime}}((S_{ik}\otimes\mathds{1})\hat{U}_{ik}\hat{U}_{jk}^{*}(S_{jk}^{*}\otimes\mathds{1}))$
		$\displaystyle=S_{ik}\left[\operatorname{Tr}_{A^{\prime\prime}}(\hat{U}_{ik}\hat{U}_{jk}^{*})\right]S_{jk}^{*}$
		$\displaystyle=S_{ik}\left[\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}U_{j}^{*}(\Pi_{k}\otimes\mathds{1}))\right]S_{jk}^{*}$
		$\displaystyle=0.$

Thus, Item 4 of Definition 2.6 is satisfied. This completes the proof of the Theorem. ∎

3.2 Matching the blocks of the encoding operators

In the previous section we saw how, up to local equivalence of protocols, we can express the encoding operator $U_{i}$ in a two-dimensional superdense coding protocol as a block-diagonal matrix with $2\times 2$ Hermitian unitary operators on the diagonal. In this section we relate the decompositions to each other. Ultimately, the conclusion is that the blocks “line up”, so that the operators in the same diagonal block of the four encoding operators are the four single qubit Pauli operators.

Theorem 3.2.

Let $(\tau,(U_{i}))$ be a superdense coding protocol that has a nice form where for $i\in\{2,3,4\}$ we have

U_{i}=_{\tau}\sum_{k}Q_{ik}\otimes R_{ik}\;,

for some orthogonal projectors $\{Q_{ik}\}_{k}$ on $\mathcal{H}_{A^{\prime}}$ that sum to $\mathds{1}$ , and $2\times 2$ traceless, Hermitian unitary matrices $\{R_{ik}\}_{k}$ on $\mathcal{H}_{A^{\prime\prime}}$ . Then $(\tau,(U_{i}))$ is locally equivalent to a superdense coding protocol $(\tau^{\prime},(U_{i}^{\prime}))$ that has a nice form and satisfies the following: there exist orthogonal projectors $\{K_{r}\}$ on $\mathcal{H}_{A^{\prime}}$ that sum to the identity and $2\times 2$ traceless, Hermitian unitary operators $\{R_{ir}\}$ such that for all $i\in\{2,3,4\}$ ,

U_{i}^{\prime}=_{\tau^{\prime}}\sum_{r}K_{r}\otimes R_{ir}\;.

Furthermore, for all $r,i\neq j$ we have

\operatorname{Tr}(R_{ir}R_{jr})=0\enspace.

Proof.

The first step is to “coarse-grain” the projectors $\{Q_{ik}\}$ so that the associated operators $R_{ik}$ are all inequivalent in the following sense. For each $i$ , we say that $k$ and $k^{\prime}$ are $i$ -equivalent if $R_{ik}=\pm R_{ik^{\prime}}$ . For every $i$ , this forms an equivalence relation on the $k$ ’s. Let $p_{i}(k)$ denote the least $k^{\prime}$ such that $k^{\prime}$ and $k$ are $i$ -equivalent. Define $s_{ik}\in\{\pm 1\}$ to be such that $R_{ik}=s_{ik}R_{ip_{i}(k)}$ .

For every $i\in\{2,3,4\}$ , define the unitary operator $S_{i}=\sum_{k}s_{ik}Q_{ik}$ which acts on $\mathcal{H}_{A^{\prime}}$ (and set $S_{1}=\mathds{1}$ ). Then if we define $U^{\prime}_{i}=(S_{i}\otimes\mathds{1})U_{i}$ and $\tau^{\prime}=\tau$ , by Lemma 2.5 we get that the pair $(\tau^{\prime},(U_{i}^{\prime}))$ is a superdense coding protocol, and furthermore the operators $(U_{i}^{\prime})$ admit a block-diagonalization where for all $i$ , the associated $2\times 2$ unitary operators $R_{ik}$ are all inequivalent. It is also straightforward to check that $(\tau^{\prime},(U_{i}^{\prime}))$ has a nice form.

Next, from Lemma 2.7 we have that for $i\neq j$

0=\operatorname{Tr}_{A^{\prime\prime}}(U_{i}^{\prime}(U_{j}^{\prime})^{*})=\sum_{k,\ell}Q_{ik}Q_{j\ell}\cdot\operatorname{Tr}(R_{ik}R_{j\ell}).

(3.9)

Left-multiplying the above expression by $Q_{ik}$ for some $k$ and right-multiplying by $Q_{j\ell}$ for some $\ell$ yields $Q_{ik}Q_{j\ell}\cdot\operatorname{Tr}(R_{ik}R_{j\ell})=0$ . Therefore, if $Q_{ik}Q_{j\ell}\neq 0$ , it follows that $\operatorname{Tr}(R_{ik}R_{j\ell})=0$ .

Given three sets of projectors $\{Q_{2k}\}$ , $\{Q_{3\ell}\}$ , and $\{Q_{4m}\}$ we can define the following tripartite graph $G$ , which we call the overlap graph. Associate a vertex with every projector $Q_{ik}$ for $i\in\{2,3,4\}$ . Include an edge between $Q_{ik}$ and $Q_{j\ell}$ if and only if $Q_{ik}Q_{j\ell}\neq 0$ . A triangle $T=(k,\ell,m)$ in the graph $G$ corresponds to a triple of projectors $Q_{2k},Q_{3\ell},Q_{4m}$ such that the pairwise products are all nonzero. Given encoding operators as in Equation 3.9, we use triangles to match their blocks.

Lemma 3.3 (Reduction Lemma).

Let $\{Q_{2k}\}$ , $\{Q_{3\ell}\}$ , and $\{Q_{4m}\}$ be sets of orthogonal projectors with the following properties:

1. $\sum_{k}Q_{2k}=\sum_{\ell}Q_{3\ell}=\sum_{m}Q_{4m}$ .

2. For $i\neq j$ , for all $k,\ell$ , $Q_{ik}Q_{j\ell}\neq 0$ implies that $\operatorname{Tr}(R_{ik}R_{j\ell})=0$ .

3. For all $i\in\{2,3,4\}$ , the $\{R_{ik}\}_{k}$ are inequivalent.

Then there exists a triangle $T=(k,\ell,m)$ and a unit vector $|v\rangle\in\mathcal{H}_{A^{\prime}}$ such that

Q_{2k}|v\rangle=Q_{3\ell}|v\rangle=Q_{4m}|v\rangle=|v\rangle.

We shall assume for now that the Reduction Lemma holds. We show how this gives us an iterative decomposition procedure to construct the orthogonal projectors $\{K_{r}\}$ satisfying the conclusions of the Theorem.

The sets $Q_{2}^{(0)}\coloneqq\{Q_{2k}\}$ , $Q_{3}^{(0)}\coloneqq\{Q_{3\ell}\}$ , and $Q_{4}^{(0)}\coloneqq\{Q_{4m}\}$ satisfy the required conditions of the Reduction Lemma with $\sum_{k}Q_{2k}=\sum_{\ell}Q_{3\ell}=\sum_{m}Q_{4m}=\mathds{1}$ . Thus there exists a triangle $T_{0}=(k_{0},\ell_{0},m_{0})$ and a vector $|v_{0}\rangle$ that is a common eigenvector of $Q_{2k_{0}},Q_{3\ell_{0}},Q_{4m_{0}}$ . Thus we can write

Q_{2k_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{2k_{0}}^{\prime}\qquad Q_{3\ell_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{3\ell_{0}}^{\prime}\qquad Q_{4m_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{4m_{0}}^{\prime}

where $Q_{2k_{0}}^{\prime}$ , $Q_{3\ell_{0}}^{\prime}$ , and $Q_{4m_{0}}^{\prime}$ are orthogonal projectors with rank one smaller.

Define the sets $Q_{2}^{(1)},Q_{3}^{(1)},Q_{4}^{(1)}$ to be the sets $Q_{2}^{(0)},Q_{3}^{(0)},Q_{4}^{(0)}$ with the projectors $Q_{2k_{0}},Q_{3\ell_{0}},Q_{4m_{0}}$ replaced by $Q_{2k_{0}}^{\prime},Q_{3\ell_{0}}^{\prime},Q_{4m_{0}}^{\prime}$ .

Observe that $Q_{2}^{(1)},Q_{3}^{(1)},Q_{4}^{(1)}$ satisfy the required conditions of the Reduction Lemma, with

\sum_{F\in Q_{i}^{(1)}}F\quad=\quad\mathds{1}-|v_{0}\rangle\!\langle v_{0}|\enspace,

for all $i\in\left\{2,3,4\right\}$ . Applying the Reduction Lemma again, we find another triangle $T_{1}$ and a common eigenvector $|v_{1}\rangle$ of the triangle. We continue this process of reducing the rank of at least one operator each in the sets $Q_{2}^{(r)},Q_{3}^{(r)},Q_{4}^{(r)}$ and finding common eigenvectors $|v_{r}\rangle$ until we have fully expressed

U_{i}^{\prime}=_{\tau}\sum_{r}K_{r}\otimes R_{ir}\enspace,

where $K_{r}\coloneqq|v_{r}\rangle\!\langle v_{r}|$ , and for every $r$ , the pairwise inner products satisfy $\operatorname{Tr}(R_{2r}R_{3r})=\operatorname{Tr}(R_{2r}R_{4r})=\operatorname{Tr}(R_{3r}R_{4r})=0$ . This concludes the proof of the Theorem. ∎

Before proving the Reduction Lemma we establish the following lemma, which claims that up to conjugation by the same unitary operator, the only collection of $2\times 2$ traceless, Hermitian, unitary, mutually orthogonal matrices are the single-qubit Pauli matrices.

Lemma 3.4.

Let $R_{2},R_{3},R_{4}$ be $2\times 2$ unitary matrices that are traceless, Hermitian, and satisfy $\operatorname{Tr}(R_{i}R_{j})=0$ for all $i\neq j$ . Then there exists a $2\times 2$ unitary operator $S$ such that

R_{2}=S{\mathrm{Z}}S^{*}\qquad R_{3}=S{\mathrm{X}}S^{*}\qquad R_{4}=S{\mathrm{Y}}S^{*}.

Proof.

We find a sequence of unitary operators $S_{1},S_{2},S_{3}$ such that $S\coloneqq S_{1}^{*}S_{2}^{*}S_{3}^{*}$ satisfies the conclusions of the lemma. Because $R_{2}$ is unitary, Hermitian and traceless, we can unitarily diagonalize it as $R_{2}=|a\rangle\!\langle a|-|b\rangle\!\langle b|$ . Define $S_{1}$ as the unitary operator with

S_{1}|a\rangle=|0\rangle\qquad S_{1}|b\rangle=|1\rangle.

Let $R_{i}^{\prime}\coloneqq S_{1}R_{i}S_{1}^{*}$ for $i\in\{2,3,4\}$ . These are all traceless, Hermitian, pairwise orthogonal unitary matrices, and furthermore $R_{2}^{\prime}={\mathrm{Z}}$ . Suppose

R_{3}^{\prime}=\begin{pmatrix}r&s\\ t&u\end{pmatrix}\;.

Since $\operatorname{Tr}(R_{2}^{\prime}R_{3}^{\prime})=0$ and $R_{3}^{\prime}$ is traceless, we have that $r=u=0$ , and since $R_{3}^{\prime}$ is Hermitian and unitary, we have $s=t^{*}={\mathrm{e}}^{{\mathrm{i}}\theta}$ for some $\theta\in[0,2\pi)$ . Let

S_{2}\coloneqq\begin{pmatrix}{\mathrm{e}}^{-{\mathrm{i}}\theta/2}&0\\ 0&{\mathrm{e}}^{{\mathrm{i}}\theta/2}\end{pmatrix}\enspace,

and $R_{i}^{\prime\prime}\coloneqq S_{2}R_{i}^{\prime}S_{2}^{*}$ for $i\in\{2,3,4\}$ . Again, the operators $R_{i}^{\prime\prime}$ remain traceless, Hermitian unitary, and pairwise orthogonal, and furthermore $R_{2}^{\prime\prime}={\mathrm{Z}}$ and $R_{3}^{\prime\prime}={\mathrm{X}}$ . Suppose

R_{4}^{\prime\prime}=\begin{pmatrix}w&x\\ y&z\end{pmatrix}\;.

From $\operatorname{Tr}(R_{2}^{\prime\prime}R_{4}^{\prime\prime})=0$ , Hermiticity, unitarity, and tracelessness of $R_{4}^{\prime\prime}$ we have again that $w=z=0$ and $x=y^{*}={\mathrm{e}}^{{\mathrm{i}}\phi}$ for some $\phi\in[0,2\pi)$ . From $\operatorname{Tr}(R_{3}^{\prime\prime}R_{4}^{\prime\prime})=0$ we have that $x=-y$ , which means that $x=\pm{\mathrm{i}}$ . If $x=-{\mathrm{i}}$ , then set $S_{3}\coloneqq\mathds{1}$ . Otherwise, set $S_{3}\coloneqq-{\mathrm{i}}\,{\mathrm{Z}}$ . Let $R_{i}^{\prime\prime\prime}\coloneqq S_{3}R_{i}^{\prime\prime}S_{3}^{*}$ for $i\in\{2,3,4\}$ . We have that $R_{2}^{\prime\prime\prime}={\mathrm{Z}}$ , $R_{3}^{\prime\prime\prime}={\mathrm{X}}$ , $R_{4}^{\prime\prime\prime}={\mathrm{Y}}$ . Thus, letting $S=S_{1}^{*}S_{2}^{*}S_{3}^{*}$ , we obtain the desired conclusion of the lemma. ∎

We now turn to proving the Reduction Lemma.

Proof of Lemma 3.3.

Define $\Pi=\sum_{k}Q_{2k}$ .

No two triangles in $G$ share an edge.

Suppose we have two triangles corresponding to projectors $(Q_{2k},Q_{3\ell},Q_{4m})$ and $(Q_{2k},Q_{3\ell},Q_{4m^{\prime}})$ for some $k,\ell,m,m^{\prime}$ . We then have the equations

\operatorname{Tr}(R_{2k}R_{3\ell})=\operatorname{Tr}(R_{2k}R_{4m})=\operatorname{Tr}(R_{3\ell}R_{4m})=\operatorname{Tr}(R_{2k}R_{4m^{\prime}})=\operatorname{Tr}(R_{3\ell}R_{4m^{\prime}})=0.

By Lemma 3.4, this implies that there exists a unitary operator $S$ such that

R_{2k}=S{\mathrm{Z}}S^{*}\qquad R_{3\ell}=S{\mathrm{X}}S^{*}\qquad R_{4m}=S{\mathrm{Y}}S^{*}\;.

Therefore we obtain that

\operatorname{Tr}({\mathrm{Z}}S^{*}R_{4m^{\prime}}S)=\operatorname{Tr}({\mathrm{X}}S^{*}R_{4m^{\prime}}S)=0\enspace.

Writing

S^{*}R_{4m^{\prime}}S=\begin{pmatrix}a&b\\ b^{*}&d\end{pmatrix}\enspace,

the above equation implies that $a=d=0$ and $b=-b^{*}$ , which together with unitarity gives $b=\pm{\mathrm{i}}$ . Thus $S^{*}R_{4m^{\prime}}S=\pm{\mathrm{Y}}=\pm S^{*}R_{4m}S$ , or in other words $R_{4m}=\pm R_{4m^{\prime}}$ , contradicting the assumption that $R_{4m}$ and $R_{4m^{\prime}}$ are inequivalent.

Every vertex is in a triangle.

Consider $Q_{2k}$ for some $k$ . There exists an index $\ell$ such that $Q_{2k}Q_{3\ell}\neq 0$ , because the operators $\{Q_{3\ell}\}$ form a resolution of $\Pi$ . Since $\{Q_{4m}\}$ also forms an orthogonal resolution of $\Pi$ , we have that

0\neq Q_{2k}Q_{3\ell}=Q_{2k}\left(\sum_{m}Q_{4m}\right)Q_{3\ell}\enspace.

This implies that there exists an index $m$ such that $Q_{2k}Q_{4m}\neq 0$ and $Q_{4m}Q_{3\ell}\neq 0$ .

Finding a common eigenvector of a triangle.

Fix a triangle $T=(k,\ell,m)$ . For notational simplicity we shall write $C\coloneqq Q_{2k}$ , $D\coloneqq Q_{3\ell}$ , and $E\coloneqq Q_{4m}$ . First, observe that if $C(\Pi-D)E\neq 0$ , then there exists some index $\ell^{\prime}$ such that $CQ_{3\ell^{\prime}}E\neq 0$ . This implies that $(k,\ell^{\prime},m)$ forms a triangle in $G$ . But this cannot happen as this triangle would share the edge $(k,m)$ with $T$ . Therefore $C(\Pi-D)E=0$ , i.e., $CE=CDE$ .

By symmetry, we also get that $CD=CED$ and $ED=ECD$ . Thus we have

0\neq CE=CDE=CEDE=CECDE=CDECDE.

Since $CDE$ is a product of three projectors, its spectral norm is at most $1$ . Let $|v\rangle$ be a unit vector realizing the spectral norm of $CDE$ , i.e. such that $\|CDE|v\rangle\|=\|CDE\|>0$ . Then

\displaystyle\|CDE\|=\|CDE|v\rangle\|=\|CDECDE|v\rangle\|\leq\|CDECDE\|\leq\|CDE\|^{2}.

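The inequality $\|CDE\|\leq\|CDE\|^{2}$ implies that $\|CDE\|=1$ (since it is not zero and is at most one). Since projectors do not increase norms, $1=\|CDE|v\rangle\|\leq\|DE|v\rangle\|\leq\|E|v\rangle\|\leq 1$ , so all of these norms equal $1$ . A projector preserves the norm of a vector only if it fixes that vector, so $E|v\rangle=|v\rangle$ , then $D|v\rangle=DE|v\rangle=|v\rangle$ , and finally $C|v\rangle=CDE|v\rangle=|v\rangle$ . Therefore $|v\rangle$ is a vector such that $C|v\rangle=D|v\rangle=E|v\rangle=|v\rangle$ . ∎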

We now put everything in this section together to prove Theorem 1.1, which we restate here for convenience.

See Theorem 1.1.

Proof.

Putting together Lemma 2.7, Theorem 3.1, and Theorem 3.2, we get that all superdense coding protocols $(\tau,(U_{i}))$ are locally equivalent to one that has a nice form and satisfies the conclusions of Theorem 3.2. Finally, we apply Lemma 3.4 to the conclusions of Theorem 3.2 to obtain the conclusions of Theorem 1.1. ∎

4 Superdense coding and orthogonal unitary bases

In this section, we prove that there are multiple non-equivalent superdense coding protocols for transmitting $d^{2}$ messages for $d\geq 3$ , even when no ancilla is used in the encoding process, and there is no error in decoding. This implies that rigidity of superdense coding protocols for $d\geq 3$ may only hold in a relaxed form: as we see in this section, rigidity may hold only up to the choice of an orthogonal unitary basis for the space of linear operators ${\mathsf{L}}({\mathbb{C}}^{d})$ .

4.1 The connection with unitary bases

We draw a connection between superdense coding and bases for the vector space of $d\times d$ complex matrices. Although this connection may be inferred from Lemma 2.7, we give a simple and direct derivation here.

For any integer $d>1$ , consider a protocol for superdense coding of $d^{2}$ classical strings using a shared entangled state with local dimension $d$ , and a single $d$ -dimensional message. Assume that the protocol does not use any ancilla in the encoding process, and that there is no decoding error. Such a protocol necessarily has a simple form, as we describe below.

First, we argue that the initial shared state is maximally entangled. Bob’s state after the message has support in a $d^{2}$ -dimensional space. Since there are $d^{2}$ strings, and these are decoded without error, the corresponding states are orthogonal and pure. So the mixed state of the entire encoded state, corresponding to a uniformly random string, is completely mixed. However, the marginal of this state on the register initially held by Bob is the same as the marginal for any fixed string. Thus Bob’s share of the initial state is also the $d$ -dimensional completely mixed state. This implies that the initial shared state is maximally entangled.

Any maximally entangled state with local dimension $d$ is of the form $(U\otimes V)|\upphi_{d}\rangle$ , where $U,V$ are unitary operators in ${\mathsf{U}}({\mathbb{C}}^{d})$ , and $|\upphi_{d}\rangle\coloneqq\tfrac{1}{\sqrt{d}}\sum_{k=0}^{d-1}|k\rangle|k\rangle$ . Therefore, without loss of generality, we may assume that Alice and Bob initially share the state $|\upphi_{d}\rangle$ . When the dimension $d$ is clear from the context, we omit it from the subscript.

Second, since the encoding of any message is pure, Alice’s local operations satisfy the following properties. On input $i\in[d^{2}]$ , Alice applies a unitary operator $U_{i}\in{\mathsf{U}}({\mathbb{C}}^{d})$ to her share of the state $|\upphi\rangle$ , and sends the share to Bob. Since Bob can decode the input $i$ with probability $1$ , the states $(U_{i}\otimes\mathds{1})|\upphi\rangle$ are all orthogonal, i.e., for all distinct $i,j\in[d^{2}]$ , we have

\langle\upphi|(U_{i}^{*}U_{j}\otimes\mathds{1})|\upphi\rangle\quad=\quad 0\enspace.

This condition is equivalent to the property that the operators $U_{i}$ are mutually orthogonal with respect to the Hilbert-Schmidt inner product:

\operatorname{Tr}(U_{i}^{*}U_{j})\quad=\quad 0\enspace,\qquad\textrm{ for all }i,j\in[d^{2}],~{}i\neq j\enspace.

(4.1)

Thus, the operators form an orthogonal unitary basis for the space of linear operators on ${\mathbb{C}}^{d}$ .
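The equivalence of the two orthogonality conditions can be checked directly from the definition of $|\upphi\rangle$ . The following short calculation is a sketch for an arbitrary operator $A$ on ${\mathbb{C}}^{d}$ (the symbol $A$ is introduced here only for illustration):

\langle\upphi|(A\otimes\mathds{1})|\upphi\rangle\quad=\quad\frac{1}{d}\sum_{k,l=0}^{d-1}\langle k|A|l\rangle\,\langle k|l\rangle\quad=\quad\frac{1}{d}\operatorname{Tr}(A)\enspace.

Taking $A=U_{i}^{*}U_{j}$ shows that the inner product of the encoded states vanishes exactly when $\operatorname{Tr}(U_{i}^{*}U_{j})=0$ .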

It is straightforward to verify that any such basis for ${\mathsf{L}}({\mathbb{C}}^{d})$ leads to an errorless superdense coding protocol for $d^{2}$ classical messages. Thus, the study of rigidity of superdense coding protocols as above is equivalent to the study of orthogonal unitary bases.

A well-known example of an orthogonal unitary basis in dimension $d$ is generated by the “clock” and “shift” operators. The elements of this basis are also known as the generalized Pauli operators or the Heisenberg-Weyl operators. Let $\omega_{d}\coloneqq\exp\left(\tfrac{2\pi{\mathrm{i}}}{d}\right)$ be a primitive $d$ th root of unity. For $i,j\in\left\{0,1,\dotsc,d-1\right\}$ , the $(i,j)$ th operator $P_{ij}$ in the basis is defined as $P_{ij}\coloneqq{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}$ , where ${\mathrm{X}}_{d}\coloneqq\sum_{k=0}^{d-1}|k+1\pmod{d}\rangle\!\langle k|$ is the shift (or Pauli X) operator, and ${\mathrm{Z}}_{d}\coloneqq\sum_{k=0}^{d-1}\omega_{d}^{k}|k\rangle\!\langle k|$ is the clock (or Pauli Z) operator.
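As a numerical sanity check of the definitions above, the following minimal sketch builds the clock and shift operators as plain nested lists and verifies the orthogonality condition of Eq. (4.1) for small $d$ . The helper names (`shift`, `clock`, `hs_inner`, `clock_shift_basis`, `max_cross_overlap`) are hypothetical and chosen only for this illustration.

```python
import cmath

def shift(d):
    # X_d sends |k> to |k+1 mod d>: entry (row, col) is 1 when row = col + 1 mod d.
    return [[1 + 0j if r == (c + 1) % d else 0j for c in range(d)] for r in range(d)]

def clock(d):
    # Z_d is diagonal with entries omega_d^k, where omega_d = exp(2*pi*i/d).
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** r if r == c else 0j for c in range(d)] for r in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def mat_power(A, n):
    result = [[1 + 0j if r == c else 0j for c in range(len(A))] for r in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result

def hs_inner(A, B):
    # Hilbert-Schmidt inner product Tr(A^* B).
    n = len(A)
    return sum(A[r][c].conjugate() * B[r][c] for r in range(n) for c in range(n))

def clock_shift_basis(d):
    # P_ij = X_d^i Z_d^j for 0 <= i, j < d.
    X, Z = shift(d), clock(d)
    return {(i, j): matmul(mat_power(X, i), mat_power(Z, j))
            for i in range(d) for j in range(d)}

def max_cross_overlap(basis):
    # Largest |Tr(U^* V)| over distinct basis elements; should vanish up to rounding.
    ops = list(basis.values())
    return max(abs(hs_inner(ops[p], ops[q]))
               for p in range(len(ops)) for q in range(len(ops)) if p != q)
```

For instance, `max_cross_overlap(clock_shift_basis(3))` should be zero up to floating-point error, while `hs_inner(U, U)` equals $d$ for every element `U`, as expected for unitary operators.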

4.2 Uniqueness of orthogonal unitary bases

Given an orthogonal unitary basis $B$ for ${\mathsf{L}}({\mathbb{C}}^{d})$ , we may derive other such bases by conjugating elements of $B$ by a pair of unitary operators, and multiplying each basis element by a potentially different complex number of unit modulus. Since this is a rather straightforward method to derive new bases, we consider the new basis to be equivalent to $B$ .

Definition 4.1.

Let $B_{1}\coloneqq\left\{U_{i}:i\in[d^{2}]\right\}$ be an orthogonal unitary basis for ${\mathsf{L}}({\mathbb{C}}^{d})$ . We say that an orthogonal unitary basis $B_{2}$ is equivalent to $B_{1}$ if there exist unit complex numbers $\alpha_{i}\in{\mathsf{U}}({\mathbb{C}})$ and a pair of unitary operators $V,W\in{\mathsf{U}}({\mathbb{C}}^{d})$ such that

B_{2}\quad=\quad\left\{\alpha_{i}VU_{i}W:~{}~{}i\in[d^{2}]\right\}\enspace.

(4.2)

We may verify that this defines an equivalence relation.

Another way to construct an orthogonal unitary basis is by taking tensor products of bases in lower dimensions. Suppose $d$ is composite, with $d=d_{1}d_{2}$ and $1<d_{1},d_{2}<d$ , and $\left\{U_{i}:i\in[d_{1}^{2}]\right\}$ and $\left\{V_{j}:j\in[d_{2}^{2}]\right\}$ are orthogonal unitary bases for ${\mathsf{L}}({\mathbb{C}}^{d_{1}})$ and ${\mathsf{L}}({\mathbb{C}}^{d_{2}})$ , respectively. Then

\left\{U_{i}\otimes V_{j}:i\in[d_{1}^{2}];~{}j\in[d_{2}^{2}]\right\}

is an orthogonal unitary basis for ${\mathsf{L}}({\mathbb{C}}^{d})$ . This hints at the possibility that there are bases that are not equivalent to each other under operations as in Eq. (4.2). The following proposition confirms this for dimensions which are powers of two.
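The tensor product construction can be illustrated numerically. The sketch below is only an example under stated assumptions: it takes the clock and shift bases in dimensions $d_{1}=2$ and $d_{2}=3$ as the two input bases, and all helper names (`kron`, `basis_list`, `tensor_basis`, `is_orthogonal_unitary_basis`) are hypothetical.

```python
import cmath

def shift(d):
    return [[1 + 0j if r == (c + 1) % d else 0j for c in range(d)] for r in range(d)]

def clock(d):
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** r if r == c else 0j for c in range(d)] for r in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def mat_power(A, n):
    result = [[1 + 0j if r == c else 0j for c in range(len(A))] for r in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result

def dagger(A):
    n = len(A)
    return [[A[c][r].conjugate() for c in range(n)] for r in range(n)]

def kron(A, B):
    # Kronecker (tensor) product of two square matrices.
    m = len(B)
    size = len(A) * m
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)] for r in range(size)]

def basis_list(d):
    # Clock and shift basis in dimension d, as a list.
    X, Z = shift(d), clock(d)
    return [matmul(mat_power(X, i), mat_power(Z, j)) for i in range(d) for j in range(d)]

def tensor_basis(basis1, basis2):
    return [kron(U, V) for U in basis1 for V in basis2]

def is_orthogonal_unitary_basis(ops, tol=1e-9):
    d = len(ops[0])
    if len(ops) != d * d:
        return False
    for p, U in enumerate(ops):
        for q, V in enumerate(ops):
            G = matmul(dagger(U), V)
            if p == q:
                # Unitarity: U^* U must be the identity.
                if any(abs(G[r][c] - (1 if r == c else 0)) > tol
                       for r in range(d) for c in range(d)):
                    return False
            elif abs(sum(G[r][r] for r in range(d))) > tol:
                # Orthogonality: Tr(U^* V) must vanish for distinct elements.
                return False
    return True
```

With these helpers, `is_orthogonal_unitary_basis(tensor_basis(basis_list(2), basis_list(3)))` checks the claim for $d=6$ .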

Proposition 4.2.

Suppose $d=2^{k}$ for an integer $k>1$ . Let $B_{1}$ be the basis for ${\mathsf{L}}({\mathbb{C}}^{d})$ obtained by taking tensor products of $k$ two-dimensional Pauli X and Z operators, i.e.,

B_{1}\quad\coloneqq\quad\left\{\bigotimes_{i=1}^{k}P_{i}:P_{i}\in\left\{\mathds{1},{\mathrm{X}}_{2},{\mathrm{Z}}_{2},{\mathrm{X}}_{2}{\mathrm{Z}}_{2}\right\}\right\}\enspace.

The basis $B_{1}$ is not equivalent to $B_{2}$ , the $d$ -dimensional clock and shift basis.

Proof.

The intuition behind the statement is that tensor products of the two-dimensional Pauli operators in $B_{1}$ all have at most two distinct eigenvalues (either $1$ , or $\pm 1$ , or $\pm{\mathrm{i}}$ ), whereas some of the operators in the clock and shift basis have $d$ distinct complex eigenvalues. Due to the freedom available in generating equivalent bases, we need additional arguments to formalize this intuition.

Suppose that $B_{1}$ and $B_{2}$ are equivalent and consider unitary operators $V,W\in{\mathsf{U}}({\mathbb{C}}^{d})$ which show their equivalence. Consider the operator $P\in B_{1}$ that is mapped to the identity in $B_{2}$ under the equivalence. Let $\alpha$ be a complex number of unit modulus such that $\alpha VPW=\mathds{1}$ . Then $V=\alpha^{*}W^{*}P^{*}$ .

Suppose the operator $Q\in B_{1}$ is mapped to the clock operator ${\mathrm{Z}}_{d}\in B_{2}$ , and that ${\mathrm{Z}}_{d}=\beta VQW$ for some complex number $\beta$ of unit modulus. Then ${\mathrm{Z}}_{d}=\beta\alpha^{*}W^{*}P^{*}QW$ . The operator on the right hand side has at most two eigenvalues (either $\beta\alpha^{*}$ , or $\pm\beta\alpha^{*}$ , or $\pm{\mathrm{i}}\beta\alpha^{*}$ ) as $P^{*}Q$ has eigenvalues $1$ , or $\pm 1$ , or $\pm{\mathrm{i}}$ . However, the clock operator ${\mathrm{Z}}_{d}$ has $d$ distinct eigenvalues, the $d$ th complex roots of unity. Since $d\geq 4$ , we get a contradiction, and we conclude that $B_{1}$ and $B_{2}$ are not equivalent. ∎
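The eigenvalue count used in the proof can be checked numerically for $k=2$ (so $d=4$ ). The sketch below relies on the elementary fact that an operator whose square is a scalar multiple of the identity has at most two distinct eigenvalues; it tests this property for every product $P^{*}Q$ with $P,Q\in B_{1}$ , and counts the distinct diagonal entries of ${\mathrm{Z}}_{4}$ . The helper names are hypothetical.

```python
import cmath

def shift(d):
    return [[1 + 0j if r == (c + 1) % d else 0j for c in range(d)] for r in range(d)]

def clock(d):
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** r if r == c else 0j for c in range(d)] for r in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def dagger(A):
    n = len(A)
    return [[A[c][r].conjugate() for c in range(n)] for r in range(n)]

def kron(A, B):
    m = len(B)
    size = len(A) * m
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)] for r in range(size)]

def squares_to_scalar(A, tol=1e-9):
    # True if A*A = c * identity for some scalar c (so A has at most two eigenvalues).
    S = matmul(A, A)
    n = len(A)
    return all(abs(S[r][c] - (S[0][0] if r == c else 0)) < tol
               for r in range(n) for c in range(n))

def count_distinct(values, tol=1e-9):
    reps = []
    for v in values:
        if all(abs(v - u) > tol for u in reps):
            reps.append(v)
    return len(reps)

def two_qubit_pauli_basis():
    # B_1 for k = 2: tensor products of {1, X_2, Z_2, X_2 Z_2}.
    X, Z = shift(2), clock(2)
    identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
    paulis = [identity, X, Z, matmul(X, Z)]
    return [kron(P, Q) for P in paulis for Q in paulis]

B1 = two_qubit_pauli_basis()
Z4 = clock(4)
```

Here `squares_to_scalar(matmul(dagger(P), Q))` should hold for all `P`, `Q` in `B1`, whereas `Z4` fails the same test and has four distinct diagonal entries.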

It is then natural to wonder if there is a unique orthogonal unitary basis in prime dimensions, up to the equivalence defined above. In Section 4.4 we show that even this does not hold, by giving an explicit construction of a basis in any dimension $d\geq 5$ that is not equivalent to the clock and shift basis.

After our discovery of non-equivalent bases, we learned that the question of uniqueness has been studied before by Vollbrecht and Werner [VW00]. They prove the uniqueness of the basis consisting of the Pauli operators in dimension two, and state that the problem of characterizing orthogonal unitary bases in dimensions larger than two is open. They also give a construction of “shift-and-multiply” bases from a collection of $d$ complex Hadamard matrices of dimension $d\times d$ and a $d\times d$ Latin square. This construction and non-equivalent bases are discussed in more detail by Werner in subsequent work [Wer01], although the notion of equivalence there does not include multiplication by phases (complex numbers of unit modulus). Werner states without proof that the existence of non-equivalent bases in dimension at least five follows from the existence of non-equivalent Hadamard matrices or non-equivalent Latin squares, even when the dimension is prime. In dimension three, Werner describes how we may construct non-equivalent bases, but does not explicitly present them. We present a concrete instance of this construction in Proposition 4.9. Altogether, we have the following result.

Theorem 4.3.

For every dimension $d\geq 3$ , there are orthogonal unitary bases that are not equivalent to the clock and shift basis.

The theorem implies that for any $d\geq 3$ , there are non-equivalent superdense coding protocols for transmitting $d^{2}$ messages, even when no ancilla is used in the encoding process, and there is no error in decoding.

Orthogonal unitary bases have also been studied in the context of quantum error-correction under the name “unitary error bases” (see, e.g., Ref. [MV16] and the references therein). In addition to the shift-and-multiply construction, several other methods such as the “Hadamard method” and the “algebraic method” have been proposed for their construction. The “quantum shift-and-multiply” method due to Musto and Vicary [MV16] simultaneously generalizes the shift-and-multiply and Hadamard methods. Musto and Vicary give examples of orthogonal unitary bases resulting from this method that are not equivalent to those derived from any of the other methods mentioned above. However, they give explicit examples only in dimension 4. As far as we can tell, earlier explicit constructions, for example those due to Klappenecker and Rötteler [KR03], were also for a few small dimensions.

4.3 Some useful properties

Here we present two properties that are used in an explicit construction leading to Theorem 4.3. The following property of the eigenvalues of the clock and shift operators helps in proving non-equivalence to another basis. Recall that $\omega_{d}\coloneqq\exp\left(\tfrac{2\pi{\mathrm{i}}}{d}\right)$ is a primitive $d$ th root of unity.

Lemma 4.4.

Let $d>1$ be an integer, and let $a,b\in\left\{0,1,\dotsc,d-1\right\}$ . The eigenvalues of the operator ${\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b}$ are all of the form

\omega_{d}^{l}\cdot\exp\!\left(\frac{ab(d-1)\pi{\mathrm{i}}}{d}\right)\enspace,

for some $l\in\left\{0,1,\dotsc,d-1\right\}$ .

Proof.

Since ${\mathrm{X}}_{d}\,{\mathrm{Z}}_{d}=\omega^{*}_{d}\,{\mathrm{Z}}_{d}{\mathrm{X}}_{d}$ , we have $\left({\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b}\right)^{d}=\omega_{d}^{abd(d-1)/2}\,{\mathrm{X}}_{d}^{ad}\,{\mathrm{Z}}_{d}^{bd}=\omega_{d}^{abd(d-1)/2}\,\mathds{1}$ . So the eigenvalues of ${\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b}$ are $d$ th complex roots of $\omega_{d}^{abd(d-1)/2}$ , and the lemma follows. ∎
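The identity at the heart of this proof can be verified numerically for small dimensions. The following sketch (with hypothetical helper names) computes $\left({\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b}\right)^{d}$ and compares it with $\omega_{d}^{abd(d-1)/2}\,\mathds{1}$ , using $\omega_{d}^{abd(d-1)/2}=\exp(\pi{\mathrm{i}}\,ab(d-1))$ .

```python
import cmath

def shift(d):
    return [[1 + 0j if r == (c + 1) % d else 0j for c in range(d)] for r in range(d)]

def clock(d):
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** r if r == c else 0j for c in range(d)] for r in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def mat_power(A, n):
    result = [[1 + 0j if r == c else 0j for c in range(len(A))] for r in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result

def max_deviation(d):
    # Largest entrywise gap between (X^a Z^b)^d and omega_d^{a b d (d-1)/2} * identity.
    X, Z = shift(d), clock(d)
    worst = 0.0
    for a in range(d):
        for b in range(d):
            op = matmul(mat_power(X, a), mat_power(Z, b))
            lhs = mat_power(op, d)
            phase = cmath.exp(1j * cmath.pi * a * b * (d - 1))
            gap = max(abs(lhs[r][c] - (phase if r == c else 0))
                      for r in range(d) for c in range(d))
            worst = max(worst, gap)
    return worst
```

The value `max_deviation(d)` should be zero up to floating-point error for each small `d`.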

We also use the following simple number-theoretic property in the construction of new orthogonal unitary bases.

Lemma 4.5.

Any integer $d\geq 5$ has at most $d-2$ positive integer divisors.

Proof.

If $d$ is prime, then it has exactly two positive integer divisors, $1,d$ , and the lemma holds.

Suppose $d$ is composite and has prime factorization $p_{1}^{a_{1}}p_{2}^{a_{2}}\dotsb p_{k}^{a_{k}}$ , where $k$ and $a_{1},a_{2},\dotsc,a_{k}$ are positive integers, and $p_{1},p_{2},\dotsc,p_{k}$ are distinct prime numbers arranged in increasing order. The number of positive integer divisors of $d$ equals $(a_{1}+1)(a_{2}+1)\dotsb(a_{k}+1)$ .

Since $d$ is composite, either $k=1$ and $a_{1}\geq 2$ , or $k\geq 2$ .

Suppose $k=1$ . If $p_{1}\geq 3$ , the lemma follows since $n+1\leq q^{n}-2$ for all positive integers $n\geq 2$ , for any $q\geq 3$ . If $p_{1}=2$ , we have $a_{1}\geq 3$ since $d=p_{1}^{a_{1}}\geq 5$ . Since $n+1\leq 2^{n}-2$ for all integers $n\geq 3$ , the lemma again follows.

Now suppose $k\geq 2$ . Since $n+1\leq q^{n}$ and $n+1\leq r^{n}-1$ for all $n\geq 1$ whenever $q\geq 2$ and $r\geq 3$ , the number of divisors of $d$ is bounded as

\begin{array}[]{rcl}(a_{1}+1)(a_{2}+1)\dotsb(a_{k}+1)&\leq&p_{1}^{a_{1}}(p_{2}^{a_{2}}-1)\,p_{3}^{a_{3}}\dotsb p_{k}^{a_{k}}\\ &\leq&p_{1}^{a_{1}}p_{2}^{a_{2}}p_{3}^{a_{3}}\dotsb p_{k}^{a_{k}}-p_{1}^{a_{1}}\\ &\leq&d-2\enspace,\end{array}

as claimed. ∎
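The bound is easy to test by brute force. The sketch below (hypothetical helper names) counts divisors directly and checks the inequality over a range of $d$ ; it also shows why the hypothesis $d\geq 5$ is needed, since $d=4$ has three divisors.

```python
def num_divisors(d):
    # Number of positive integer divisors of d, by direct enumeration.
    return sum(1 for k in range(1, d + 1) if d % k == 0)

def divisor_bound_holds(low, high):
    # Checks that every d in [low, high] has at most d - 2 divisors.
    return all(num_divisors(d) <= d - 2 for d in range(low, high + 1))
```

For example, `divisor_bound_holds(5, 1000)` should return `True`, while `num_divisors(4)` is `3`, which exceeds $4-2$ .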

4.4 Explicit constructions

We now proceed to describe an explicit construction for all dimensions $d\geq 5$ . The construction we give has the same form as the shift-and-multiply construction due to Vollbrecht and Werner [VW00]. In particular, the bases we present correspond to the construction with the Fourier transform over the cyclic group of order $d$ as the Hadamard matrix, and certain Latin squares that are not equivalent to the one generated by ${\mathrm{X}}_{d}$ .

We construct bases that are not equivalent to the clock and shift basis by introducing a modification. In particular, we replace the operators generated by ${\mathrm{X}}_{d}$ by another sequence. Note that the operators ${\mathrm{X}}_{d}^{i}$ correspond to permutations on ${\mathbb{Z}}/d{\mathbb{Z}}$ , i.e., they permute the standard basis of ${\mathbb{C}}^{d}$ . Conversely any permutation $P$ on ${\mathbb{Z}}/d{\mathbb{Z}}$ corresponds to the operator $\sum_{a\in{\mathbb{Z}}/d{\mathbb{Z}}}|P(a)\rangle\!\langle a|$ which permutes standard basis elements of ${\mathbb{C}}^{d}$ . So this is a bijection.

It is also helpful to view a permutation $P$ on ${\mathbb{Z}}/d{\mathbb{Z}}$ as a perfect matching in the complete bipartite graph $K_{d,d}$ , with vertex $a$ in one part being matched with the vertex $P(a)$ in the other. This mapping defines a bijection between permutations and perfect matchings. The construction we give relies on these three equivalent views of a permutation, and uses the same letter to refer to the corresponding matching and the linear operator on ${\mathbb{C}}^{d}$ .

We start with the following observation.

Lemma 4.6.

Let $P_{0},P_{1},\dotsc,P_{d-1}$ be a sequence of $d$ disjoint matchings in the graph $K_{d,d}$ . Then the matrices

\left\{P_{i}{\mathrm{Z}}_{d}^{j}:0\leq i,j<d\right\}

form an orthogonal unitary basis.

Proof.

Since $P_{i}$ permutes the standard basis vectors of ${\mathbb{C}}^{d}$ , it is a unitary operator. Therefore $P_{i}{\mathrm{Z}}_{d}^{j}$ is also unitary.

Now consider the inner product of $P_{i}{\mathrm{Z}}_{d}^{j}$ and $P_{k}{\mathrm{Z}}_{d}^{l}$ for two pairs $(i,j)$ and $(k,l)$ . We have

\operatorname{Tr}({\mathrm{Z}}_{d}^{-j}P_{i}^{-1}P_{k}{\mathrm{Z}}_{d}^{l})\quad=\quad\sum_{m=0}^{d-1}\omega_{d}^{(l-j)m}\langle P_{i}(m)|P_{k}(m)\rangle\enspace.

If $i\neq k$ , the inner product is $0$ since for any $m$ , the disjoint matchings $P_{i}$ and $P_{k}$ match the vertex $m$ to distinct vertices. If $i=k$ , but $l\neq j$ , the inner product is again $0$ as $\omega_{d}$ is a primitive $d$ th root of unity and $l-j$ is not a multiple of $d$ . ∎
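To illustrate the lemma with matchings that do not come from powers of ${\mathrm{X}}_{d}$ , the sketch below uses, as an assumed example for $d=4$ , the permutations $a\mapsto a\oplus i$ (bitwise XOR). These are pairwise disjoint as matchings, and the resulting operators $P_{i}{\mathrm{Z}}_{4}^{j}$ are checked for orthogonality. All helper names are hypothetical.

```python
import cmath

def clock(d):
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** r if r == c else 0j for c in range(d)] for r in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def mat_power(A, n):
    result = [[1 + 0j if r == c else 0j for c in range(len(A))] for r in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result

def hs_inner(A, B):
    # Hilbert-Schmidt inner product Tr(A^* B).
    n = len(A)
    return sum(A[r][c].conjugate() * B[r][c] for r in range(n) for c in range(n))

def permutation_matrix(perm):
    # The operator sum_a |perm(a)><a|.
    d = len(perm)
    return [[1 + 0j if r == perm[c] else 0j for c in range(d)] for r in range(d)]

def are_disjoint(perms):
    # Matchings are disjoint if every vertex a has distinct partners across them.
    d = len(perms[0])
    return all(len({p[a] for p in perms}) == len(perms) for a in range(d))

def matching_basis(perms):
    d = len(perms)
    Z = clock(d)
    return [matmul(permutation_matrix(p), mat_power(Z, j)) for p in perms for j in range(d)]

def max_cross_overlap(ops):
    return max(abs(hs_inner(ops[p], ops[q]))
               for p in range(len(ops)) for q in range(len(ops)) if p != q)

xor_perms = [[a ^ i for a in range(4)] for i in range(4)]
```

Here `max_cross_overlap(matching_basis(xor_perms))` should vanish up to rounding, whereas a sequence of permutations that repeats a matching yields a nonzero overlap.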

We show that to derive non-equivalent bases, it suffices to have two of the matchings satisfy simple properties.

Lemma 4.7.

Let $d\geq 2$ , and $P_{0},P_{1},\dotsc,P_{d-1}$ be a sequence of $d$ disjoint matchings in $K_{d,d}$ such that

1. $P_{0}$ is the identity permutation, i.e., matches vertex $i$ in one part to vertex $i$ in the other part; and

2. the permutation $P_{1}$ has a cycle of length $k$ such that $k$ does not divide $d$ .

Then the basis $B\coloneqq\left\{P_{i}{\mathrm{Z}}_{d}^{j}:0\leq i,j<d\right\}$ is not equivalent to the clock and shift basis.

Proof.

The intuition here is the following. The operator corresponding to the permutation $P_{1}$ has the $k$ distinct $k$ th roots of unity as eigenvalues. In particular, the eigenvalues include $1$ and $\omega_{k}$ . On the other hand, the eigenvalues of any operator in the clock and shift basis are of the form $\gamma\omega_{d}^{l}$ for some integer $l$ , and a fixed unit complex number $\gamma\in{\mathsf{U}}({\mathbb{C}})$ depending only on the operator. Since $k$ does not divide $d$ , the operator $P_{1}$ does not belong to the clock and shift basis.

Formally, suppose that $B$ is equivalent to the clock and shift basis, and the equivalence is given by unitary operators $V,W$ . Suppose that the identity operator $P_{0}\in B$ is mapped to ${\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}$ and $P_{1}$ is mapped to ${\mathrm{X}}_{d}^{s}\,{\mathrm{Z}}_{d}^{t}$ under this equivalence. That is, ${\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}=\alpha VP_{0}W=\alpha VW$ and ${\mathrm{X}}_{d}^{s}\,{\mathrm{Z}}_{d}^{t}=\beta\,VP_{1}W$ for some $\alpha,\beta\in{\mathsf{U}}({\mathbb{C}})$ .

From the equation for $P_{0}$ we have $V=\alpha^{*}\,{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}W^{*}$ , so that ${\mathrm{X}}_{d}^{s}\,{\mathrm{Z}}_{d}^{t}=\beta\alpha^{*}\,{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}W^{*}P_{1}W$ . Equivalently, we have

\alpha\beta^{*}\omega_{d}^{c}\,{\mathrm{X}}_{d}^{s-i}\,{\mathrm{Z}}_{d}^{t-j}\quad=\quad W^{*}P_{1}W\enspace,

(4.3)

where $c=-(s-i)j$ . By Lemma 4.4, there is a fixed $\gamma\in{\mathsf{U}}({\mathbb{C}})$ such that the eigenvalues of the operator on the left hand side of Eq. (4.3) are of the form $\gamma\omega_{d}^{l}$ for some integer $l$ . On the other hand, the operator on the right hand side of the equation is similar to $P_{1}$ . Since $P_{1}$ has eigenvalues $1$ and $\omega_{k}$ , we have $1=\gamma\omega_{d}^{m}$ and $\omega_{k}=\gamma\omega_{d}^{n}$ for some integers $m,n$ . Eliminating $\gamma$ , we get $\omega_{k}=\omega_{d}^{n-m}$ . This implies that

\frac{2\pi{\mathrm{i}}}{k}\quad=\quad\frac{2\pi{\mathrm{i}}(n-m)}{d}+2\pi{\mathrm{i}}p\enspace,

for some integer $p$ , or equivalently that $d=(n-m+pd)k$ . This is a contradiction, as $k$ does not divide $d$ . ∎

Finally, we prove that matchings as in the hypothesis of Lemma 4.7 exist.

Lemma 4.8.

For any integer $d\geq 5$ , there is a sequence of $d$ disjoint matchings $P_{0},P_{1},\dotsc,P_{d-1}$ in $K_{d,d}$ such that

1. $P_{0}$ is the identity permutation, i.e., matches vertex $i$ in one part with vertex $i$ in the other part; and

2. the permutation $P_{1}$ has a cycle of length $k$ such that $k$ does not divide $d$ .

Proof.

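By Lemma 4.5, for any $d\geq 5$ , there is an integer $k\in[2,d-2]$ that does not divide $d$ . Indeed, by the lemma at least two integers in $[1,d]$ are not divisors of $d$ ; neither of them is $1$ or $d$ , and at most one of them equals $d-1$ .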

Let $P_{0}$ be the identity permutation, and let $P_{1}\coloneqq(0,1,\dotsc,k-1)(k,k+1,\dotsc,d-1)$ be a permutation consisting of two cycles of length $k$ and $d-k$ , respectively. The perfect matchings corresponding to $P_{0}$ and $P_{1}$ are disjoint, as $P_{0}$ maps each element to itself while $P_{1}$ cyclically shifts every element within each of its two cycles (both of which are of length at least two).

Consider the graph $G$ obtained by deleting the edges in the matchings $P_{0}$ and $P_{1}$ from $K_{d,d}$ . The graph $G$ is a $(d-2)$ -regular bipartite graph. Thus, by Hall’s theorem [Hal35], $G$ can be decomposed into $(d-2)$ disjoint perfect matchings, which we take as $P_{2},\dotsc,P_{d-1}$ . ∎
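The existence proof above can be turned into a small procedure. The following is a minimal sketch, not the authors' implementation: it picks the smallest $k\in[2,d-2]$ that does not divide $d$ , builds $P_{0}$ and $P_{1}$ as in the proof, and then repeatedly extracts a perfect matching from the remaining regular bipartite graph with a standard augmenting-path search (a technique swapped in here for the non-constructive appeal to Hall’s theorem). All function names are hypothetical.

```python
def two_cycle_permutation(d, k):
    # The permutation (0 1 ... k-1)(k ... d-1); perm[a] is the image of a.
    perm = [0] * d
    for a in range(k):
        perm[a] = (a + 1) % k
    for a in range(k, d):
        perm[a] = k + (a - k + 1) % (d - k)
    return perm

def perfect_matching(allowed):
    # allowed[a] is the set of right vertices still available to left vertex a.
    d = len(allowed)
    match_of_right = [None] * d

    def try_assign(a, seen):
        # Augmenting-path step: try to place a, re-routing earlier choices if needed.
        for b in allowed[a]:
            if b in seen:
                continue
            seen.add(b)
            if match_of_right[b] is None or try_assign(match_of_right[b], seen):
                match_of_right[b] = a
                return True
        return False

    for a in range(d):
        if not try_assign(a, set()):
            raise ValueError("no perfect matching found")
    perm = [None] * d
    for b, a in enumerate(match_of_right):
        perm[a] = b
    return perm

def disjoint_matchings(d):
    # Requires d >= 5 so that a non-divisor k in [2, d-2] exists.
    k = next(k for k in range(2, d - 1) if d % k != 0)
    matchings = [list(range(d)), two_cycle_permutation(d, k)]
    allowed = [set(range(d)) - {m[a] for m in matchings} for a in range(d)]
    while len(matchings) < d:
        perm = perfect_matching(allowed)
        for a in range(d):
            allowed[a].discard(perm[a])
        matchings.append(perm)
    return k, matchings

def cycle_length(perm, start):
    length, a = 1, perm[start]
    while a != start:
        a = perm[a]
        length += 1
    return length
```

Calling `disjoint_matchings(d)` for some $d\geq 5$ returns the chosen $k$ together with $d$ permutations; after each extraction the remaining graph is still regular, so the search always succeeds.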

Lemma 4.8 and Lemma 4.7 together imply that for any dimension $d\geq 5$ , there are multiple non-equivalent orthogonal unitary bases. The same property holds for $d=4$ due to Proposition 4.2, and for $d=3$ due to Proposition 4.9 below. This proves Theorem 4.3.

Proposition 4.9.

There are orthogonal unitary bases for ${\mathsf{L}}({\mathbb{C}}^{3})$ that are not equivalent to the clock and shift basis.

Proof.

Denote the clock and shift basis by $B$ . Note that $B$ is a commutative projective group under operator composition, i.e., it is closed under taking products of the operators and the operators commute, all up to some phase (a unit complex number) that may depend on the operators. We construct a basis $B^{\prime}$ such that the equivalence of $B$ and $B^{\prime}$ implies that $B^{\prime}$ also is a commutative projective group. However, the basis $B^{\prime}$ has elements that do not commute even up to a phase, which is a contradiction.

We construct $B^{\prime}$ following an idea due to Werner [Wer01]; see the discussion after Proposition 9 in the paper. Let $M\coloneqq\beta|0\rangle\!\langle 0|+|1\rangle\!\langle 1|+|2\rangle\!\langle 2|$ , where $\beta\in{\mathsf{U}}({\mathbb{C}})$ is a unit complex number such that $\beta\neq 1$ . Let $B^{\prime}\coloneqq\left\{U_{ij}:0\leq i,j\leq 2\right\}$ , where for any $i\in\left\{0,1,2\right\}$

U_{ij}\quad\coloneqq\quad\begin{cases}{\mathrm{X}}_{3}^{j}\,{\mathrm{Z}}_{3}^{i}&\quad j\in\left\{0,2\right\}\enspace,\quad\text{and}\\ {\mathrm{X}}_{3}\,{\mathrm{Z}}_{3}^{i}M&\quad j=1\enspace.\end{cases}

We may verify that this is an orthogonal unitary basis for any choice of $\beta\in{\mathsf{U}}({\mathbb{C}})$ .

Suppose the basis $B^{\prime}$ is equivalent to $B$ , and the equivalence is given by the operators $V,W\in{\mathsf{U}}({\mathbb{C}}^{3})$ and unit complex numbers $\alpha_{ij}\in{\mathsf{U}}({\mathbb{C}})$ . Consider the element ${\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b}$ of $B$ that corresponds to the operator $U_{00}\in B^{\prime}$ . We have ${\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b}=\alpha_{00}VU_{00}W=\alpha_{00}VW$ . Then $W=\alpha_{00}^{*}V^{*}{\mathrm{X}}_{3}^{a}{\mathrm{Z}}_{3}^{b}$ , and

B\quad=\quad\left\{\alpha_{ij}\,\alpha_{00}^{*}VU_{ij}V^{*}{\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b}~{}:~{}0\leq i,j\leq 2\right\}\enspace.

Since $B$ is closed under right multiplication by $({\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b})^{*}$ up to phases, the set of operators

\left\{VU_{ij}V^{*}:0\leq i,j\leq 2\right\}

is also a commutative projective group, as is the basis $B^{\prime}$ .

We show next that not all operators in the set $B^{\prime}$ commute, even up to a phase. Consider the operators $U_{01}$ and $U_{02}$ . These operators commute up to a phase if and only if there is a unit complex number $\gamma$ such that

\begin{array}[]{rrl}&U_{01}U_{02}\quad=&\gamma\,U_{02}U_{01}\\ \Longleftrightarrow&{\mathrm{X}}_{3}M\,{\mathrm{X}}_{3}^{2}\quad=&\gamma\,{\mathrm{X}}_{3}^{2}\,{\mathrm{X}}_{3}M\\ \Longleftrightarrow&|0\rangle\!\langle 0|+\beta|1\rangle\!\langle 1|+|2\rangle\!\langle 2|\quad=&\gamma\left(\beta|0\rangle\!\langle 0|+|1\rangle\!\langle 1|+|2\rangle\!\langle 2|\right)\enspace.\end{array}

This implies that $\gamma=\beta=1$ . As we chose $\beta\neq 1$ , this is a contradiction, and $B$ and $B^{\prime}$ are not equivalent. ∎
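The two facts used in this proof can be checked numerically. The following Python sketch is only a sanity check, not part of the argument: it builds $B^{\prime}$ for a sample value of $\beta$, confirms that it is an orthogonal unitary basis, and confirms that $U_{01}$ and $U_{02}$ do not commute up to a phase. The conventions ${\mathrm{X}}_{3}|k\rangle=|k+1 \bmod 3\rangle$ and ${\mathrm{Z}}_{3}|k\rangle=\omega^{k}|k\rangle$ , and all function names, are assumptions made for illustration.

```python
import cmath

D = 3
OMEGA = cmath.exp(2j * cmath.pi / D)  # primitive cube root of unity

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(D)] for i in range(D)]

def trace(a):
    return sum(a[i][i] for i in range(D))

def power(a, p):
    out = [[complex(i == j) for j in range(D)] for i in range(D)]
    for _ in range(p):
        out = matmul(out, a)
    return out

# Assumed conventions: shift X|k> = |k+1 mod 3>, clock Z|k> = omega^k |k>.
X = [[complex(i == (j + 1) % D) for j in range(D)] for i in range(D)]
Z = [[OMEGA ** i if i == j else 0j for j in range(D)] for i in range(D)]

def werner_basis(beta):
    """Return {(i, j): U_ij} for the basis B' built from M = diag(beta, 1, 1)."""
    M = [[(beta if i == 0 else 1) * complex(i == j) for j in range(D)]
         for i in range(D)]
    basis = {}
    for i in range(D):
        zi = power(Z, i)
        basis[i, 0] = zi                          # X^0 Z^i
        basis[i, 1] = matmul(matmul(X, zi), M)    # X Z^i M
        basis[i, 2] = matmul(power(X, 2), zi)     # X^2 Z^i
    return basis

def is_unitary(u, tol=1e-9):
    p = matmul(dagger(u), u)
    return all(abs(p[i][j] - (i == j)) <= tol for i in range(D) for j in range(D))

def is_orthogonal_unitary_basis(basis, tol=1e-9):
    """Check unitarity and pairwise Hilbert-Schmidt orthogonality."""
    ops = list(basis.values())
    if not all(is_unitary(u, tol) for u in ops):
        return False
    return all(abs(trace(matmul(dagger(u), v))) <= tol
               for s, u in enumerate(ops) for t, v in enumerate(ops) if s != t)

def commute_up_to_phase(a, b, tol=1e-9):
    """For unitaries, ab = gamma * ba for some unit gamma iff |Tr((ba)* ab)| = 3
    (the equality case of the Cauchy-Schwarz inequality)."""
    overlap = trace(matmul(dagger(matmul(b, a)), matmul(a, b)))
    return abs(abs(overlap) - D) <= tol
```

For instance, with `basis = werner_basis(cmath.exp(1j))` the call `commute_up_to_phase(basis[0, 1], basis[0, 2])` is expected to return `False`, whereas with $\beta=1$ the construction reduces to the clock and shift basis and every pair commutes up to a phase.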

5 Random superdense coding protocols

In this section we study a random protocol for approximate superdense coding. Its analysis draws heavily on results in high-dimensional probability. We present these results in Section 5.1 and develop some properties of random entangled vectors in Section 5.2, before proceeding to the analysis in Section 5.3. Finally, in Section 5.4, we address a subtle issue that we encounter in the analysis.

5.1 Background from random matrix theory

In this section, we present some useful results from random matrix theory.

Definition 5.1 (Isotropic vector).

We say a random vector $|{\bm{\xi}}\rangle\in{\mathbb{C}}^{n}$ is isotropic if $\operatorname{{\mathbb{E}}}|{\bm{\xi}}\rangle\!\langle{\bm{\xi}}|=\mathds{1}$ .

Random variables which have tails that decay as fast as the normal distribution play an important role in high dimensional probability. Let $S^{n-1}$ denote the set of unit vectors in ${\mathbb{C}}^{n}$ .

Definition 5.2 (Sub-gaussian random variables and vectors).

A random variable ${\bm{x}}\in{\mathbb{C}}$ is sub-gaussian if there exists a parameter $\kappa>0$ such that

\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa^{2}\big{)}

for all $t\geq 0$ . The sub-gaussian norm of ${\bm{x}}$ , denoted by $\|{\bm{x}}\|_{{\uppsi_{2}}}$ , is defined as

\|{\bm{x}}\|_{{\uppsi_{2}}}\quad\coloneqq\quad\inf\left\{t>0:\operatorname{{\mathbb{E}}}\exp(\left\lvert{\bm{x}}\right\rvert^{2}/t^{2})\leq 2\right\}\enspace.

A random vector $|{\bm{v}}\rangle\in{\mathbb{C}}^{n}$ is sub-gaussian if for all unit vectors $|u\rangle\in S^{n-1}$ , the inner product $\langle u|{\bm{v}}\rangle$ is sub-gaussian. The sub-gaussian norm of $|{\bm{v}}\rangle$ is defined as

\left\|{\bm{v}}\right\|_{{\uppsi_{2}}}\quad\coloneqq\quad\sup_{u\in S^{n-1}}\left\|\langle u|{\bm{v}}\rangle\right\|_{{\uppsi_{2}}}\enspace.
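To make the definition of the sub-gaussian norm concrete, the following sketch evaluates it for a random variable with finitely many values by bisection on $t$ . It relies on the fact that $\operatorname{{\mathbb{E}}}\exp(|{\bm{x}}|^{2}/t^{2})$ is non-increasing in $t$ . The function name `psi2_norm` and the fixed iteration count are illustrative assumptions.

```python
import math

def psi2_norm(values, probs, iterations=200):
    """Sub-gaussian norm inf{t > 0 : E exp(|x|^2 / t^2) <= 2} of a finite
    discrete (possibly complex) random variable; probs must sum to 1."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        return 0.0

    def moment(t):
        try:
            return sum(p * math.exp(abs(v) ** 2 / t ** 2)
                       for v, p in zip(values, probs))
        except OverflowError:
            return math.inf

    # At t = largest / sqrt(ln 2) every term is at most 2 * p, so the
    # constraint holds there; bisect downwards from this upper bracket.
    lo, hi = 0.0, largest / math.sqrt(math.log(2))
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if moment(mid) <= 2:
            hi = mid
        else:
            lo = mid
    return hi
```

For a uniformly random sign ${\bm{x}}\in\{+1,-1\}$ we have $\operatorname{{\mathbb{E}}}\exp({\bm{x}}^{2}/t^{2})=\exp(1/t^{2})$ , so the norm is $1/\sqrt{\ln 2}$ , which the routine reproduces up to floating-point error.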

The sub-gaussian norm can be characterized in multiple ways. The following lemma describes two of them; see [Ver18, Proposition 2.5.2, Section 2.5.1].

Lemma 5.3.

There are positive universal constants $c_{1},c_{2}$ such that for any random variable ${\bm{x}}\in{\mathbb{C}}$ ,

1.

If for some parameter $\kappa_{1}>0$ ,

$\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa_{1}^{2}\big{)}$

for all $t\geq 0$ , then ${\bm{x}}$ is sub-gaussian and $\left\|{\bm{x}}\right\|_{\uppsi_{2}}\leq c_{1}\kappa_{1}$ .
2.

If ${\bm{x}}$ is sub-gaussian with $\left\|{\bm{x}}\right\|_{\uppsi_{2}}\leq\kappa_{2}$ for some parameter $\kappa_{2}>0$ , then

$\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/c_{2}\kappa_{2}^{2}\big{)}$

for all $t\geq 0$ .

The following theorem gives a sharp bound on the largest singular value of a class of random matrices; see the text by Vershynin [Ver18] for this and related results. Vershynin states the result for real matrices, but the proof extends to complex matrices in a straightforward manner.

Theorem 5.4 ([Ver18], Theorem 4.6.1).

Let ${\bm{A}}\coloneqq\sum_{i=1}^{m}|i\rangle\!\langle{\bm{x}}_{i}|$ be a complex $m\times n$ matrix whose rows $|{\bm{x}}_{i}\rangle$ are independent, mean zero, sub-gaussian isotropic vectors in ${\mathbb{C}}^{n}$ . Then there is a universal constant $c>0$ such that for all $t\geq 0$ , we have

\|{\bm{A}}\|\quad\leq\quad\sqrt{m}+c\kappa^{2}(\sqrt{n}+t)

with probability at least $1-2\exp(-t^{2})$ , where $\kappa\coloneqq\max_{i}\|{\bm{x}}_{i}\|_{{\uppsi_{2}}}$ .

Let $\|\cdot\|_{2}$ denote the Hilbert-Schmidt norm on ${\mathsf{L}}({\mathbb{C}}^{d})$ :

\|A\|_{2}\quad\coloneqq\quad\sqrt{\operatorname{Tr}(A^{*}A)}\enspace.

This norm induces the following $\ell_{2}$ -sum metric on $\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}$ :

\left\|(U_{1},U_{2},\dotsc,U_{m})-(V_{1},V_{2},\dotsc,V_{m})\right\|_{2}\quad\coloneqq\quad\left(\sum_{i=1}^{m}\left\|U_{i}-V_{i}\right\|_{2}^{2}\right)^{1/2}\enspace.

Let $f:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}\rightarrow{\mathbb{R}}$ be a continuous function. We say $f$ is $\kappa$ -Lipschitz with respect to the $\ell_{2}$ -sum of Hilbert-Schmidt metrics if for all $(U_{i}),(V_{i})\in\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}$ , we have

\left|f(U_{1},U_{2},\dotsc,U_{m})-f(V_{1},V_{2},\dotsc,V_{m})\right|\quad\leq\quad\kappa\left\|(U_{1},U_{2},\dotsc,U_{m})-(V_{1},V_{2},\dotsc,V_{m})\right\|_{2}\enspace.

Let ${\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d})$ , $1\leq i\leq m$ be i.i.d. Haar-random unitary operators. If $\kappa$ is sufficiently smaller than the dimension $d$ , with high probability, the random variable $f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m})$ is close to its expectation. This concentration of measure property is formalized by the following theorem, which is a special case of Theorem 5.17 in the book on random matrix theory by Meckes [Mec19].

Theorem 5.5 ([Mec19], Theorem 5.17, page 159).

Let ${\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d})$ , $i\in[m]$ , be i.i.d. random unitary operators chosen according to the Haar measure. Suppose the function $f:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}\rightarrow{\mathbb{R}}$ is $\kappa$ -Lipschitz with respect to the $\ell_{2}$ -sum of Hilbert-Schmidt metrics, with $\kappa>0$ . Then for every positive real number $t$ , we have

\Pr\!\left(f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m})\geq\varphi+t\right)\quad\leq\quad\exp\!\left(-\frac{(d-2)t^{2}}{24\kappa^{2}}\right)\enspace,

where $\varphi\coloneqq\operatorname{{\mathbb{E}}}f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m})$ .

The Marčenko–Pastur theorem characterises the spectrum of a wide class of random matrices in the limit of large dimension. We rely on a version of the theorem due to Yaskov [Yas16] that applies to matrices whose entries need not all be independent. While Yaskov states the result for real matrices, the proof extends to complex matrices with straightforward modifications. We sketch the observations and the modifications which enable this extension after the statement of the theorem.

The columns of the random matrices we consider satisfy a certain asymptotic isotropy condition.

Definition 5.6.

Let $m(n)$ be a sequence of positive integers such that $m\to\infty$ as $n\to\infty$ . Let $(|{\bm{x}}_{m}\rangle)$ be a sequence of random vectors with $|{\bm{x}}_{m}\rangle\in{\mathbb{C}}^{m}$ . We say that the sequence $(|{\bm{x}}_{m}\rangle)$ is pseudo-isotropic if for all sequences of complex matrices $(A_{m})$ with $A_{m}\in{\mathbb{C}}^{m\times m}$ and with uniformly bounded spectral norm (i.e., $\|A_{m}\|\leq\kappa$ for all $m$ for a universal constant $\kappa$ ),

\frac{1}{m}\left(\langle{\bm{x}}_{m}|A_{m}|{\bm{x}}_{m}\rangle-\operatorname{Tr}(A_{m})\right)\overset{{\mathrm{P}}}{\longrightarrow}0

as $m\to\infty$ .

Define the empirical spectral distribution (ESD) of an $m\times m$ positive semi-definite matrix $A$ as $\tfrac{1}{m}\sum_{i=1}^{m}\operatorname{\updelta}(x-\lambda_{i})$ , where $(\lambda_{i}:i\in[m])$ are the eigenvalues of $A$ , and $\operatorname{\updelta}$ is the Dirac-delta function. This is the probability density function of a uniformly random eigenvalue of $A$ .

Theorem 5.7 (Marčenko-Pastur law [Yas16]).

Fix an $r>0$ , and let $m,n$ be integers with $n,m\geq 1$ and $m$ a function of $n$ such that $m/n\to r$ as $n\to\infty$ . For each $m$ , let $|{\bm{x}}_{m}\rangle$ in ${\mathbb{C}}^{m}$ be a random vector such that the sequence of vectors $(|{\bm{x}}_{m}\rangle)$ is pseudo-isotropic. Let $({\bm{M}}_{n,m})$ be a sequence of $m\times n$ random matrices whose columns are i.i.d. copies of the random vector $|{\bm{x}}_{m}\rangle$ , and let ${\bm{\mu}}_{n,m}$ be the ESD of the matrix $\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*}$ . Then, as $n\to\infty$ , the ESD ${\bm{\mu}}_{n,m}$ converges weakly to the density $\operatorname{p}_{r}$ almost surely, where

\operatorname{p}_{r}(x)\quad\coloneqq\quad\max\left\{0,1-1/r\right\}\operatorname{\updelta}(x)+\frac{\sqrt{(x-a)(b-x)}}{2\pi rx}\;\mathbf{1}(a\leq x\leq b)\enspace,

with $a\coloneqq(1-\sqrt{r}\,)^{2}$ , $b\coloneqq(1+\sqrt{r}\,)^{2}$ .

In other words, as $n\to\infty$ , with probability $1$ , the cumulative distribution function of a uniformly random eigenvalue of the matrix $\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*}$ converges point-wise to that given by the probability density function $\operatorname{p}_{r}$ .
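As a quick plausibility check of the density $\operatorname{p}_{r}$ , the sketch below integrates its continuous part numerically with the midpoint rule. The continuous part should carry mass $\min\{1,1/r\}$ (the rest sits in the atom at $0$ when $r>1$ ) and should have first moment $1$ for every $r$ . The function names and the step count are assumptions chosen for illustration.

```python
import math

def mp_support(r):
    """Endpoints a, b of the support of the continuous part of p_r."""
    root = math.sqrt(r)
    return (1 - root) ** 2, (1 + root) ** 2

def mp_continuous_density(x, r):
    """Continuous part of the Marchenko-Pastur density p_r at the point x."""
    a, b = mp_support(r)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((x - a) * (b - x)) / (2 * math.pi * r * x)

def mp_continuous_moment(k, r, steps=100_000):
    """Midpoint-rule estimate of the integral of x^k p_r(x) over [a, b],
    excluding the atom at 0."""
    a, b = mp_support(r)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += x ** k * mp_continuous_density(x, r)
    return total * h
```

For example, `mp_continuous_moment(0, 2.0)` should be close to $1/2$ , matching the weight $1-1/r=1/2$ of the atom at $0$ when $r=2$ .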

Theorem 5.7 follows from the proof of Theorem 2.1 in Ref. [Yas16] by noting the following points. The eigenvalues of the matrix $\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*}$ are all real, and therefore the Stieltjes continuity theorem applies to ${\bm{\mu}}_{n,m}$ . Further, the Sherman-Morrison formula also extends to the sum $A+|u\rangle\!\langle v|$ , where $A$ is an invertible $m\times m$ complex matrix, and $|u\rangle,|v\rangle$ are in ${\mathbb{C}}^{m}$ : the matrix $A+|u\rangle\!\langle v|$ is invertible if and only if $1+\langle v|A^{-1}|u\rangle\neq 0$ , and if the latter condition holds,

(A+|u\rangle\!\langle v|)^{-1}\quad=\quad A^{-1}-\frac{A^{-1}|u\rangle\!\langle v|A^{-1}}{1+\langle v|A^{-1}|u\rangle}\enspace.

We can prove that the Stieltjes transform ${\bm{s}}_{n}(z)$ of ${\bm{\mu}}_{n,m}$ tends to its expectation $\operatorname{{\mathbb{E}}}{\bm{s}}_{n}(z)$ almost surely as $n\to\infty$ , following Step 1 in the proof of Theorem 1.1 in Ref. [BZ08]. The rest of the proof in Ref. [Yas16] now extends to the case of interest to us by replacing all instances of the transpose of a real vector by the conjugate transpose of the corresponding complex vector.

5.2 Pseudo-isotropy of random maximally entangled vectors

In this section, we develop properties of linear operators with certain symmetries, and use these to prove that a sequence of random maximally entangled vectors is pseudo-isotropic. This property is later used in the analysis of a random superdense coding protocol.

We consider operators on ${\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}$ , and label the four tensor factors with $A,B,C,D$ , respectively. As is the convention in quantum information, we use superscripts to indicate the tensor factors on which an operator acts. Let ${\mathrm{F}}\coloneqq\sum_{i,j=1}^{d}|i,j\rangle\!\langle j,i|$ be the swap operator on ${\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}$ ; it permutes the two tensor factors.

Lemma 5.8.

Let $W\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d})$ . Suppose $W$ commutes with $\mathds{1}^{AB}\otimes U^{C}\otimes U^{D}$ for all unitary operators $U\in{\mathsf{U}}({\mathbb{C}}^{d})$ , as well as with $U^{A}\otimes U^{B}\otimes\mathds{1}^{CD}$ . Then $W$ is a linear combination of operators of the form $P^{AB}\otimes Q^{CD}$ , where $P,Q\in\left\{\mathds{1},{\mathrm{F}}\right\}$ .

Proof.

Let $(E_{i})$ be a basis for the vector space ${\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d})$ , which has dimension $d^{4}$ . We may express $W$ as $W=\sum_{i=1}^{d^{4}}E_{i}\otimes W_{i}$ for some operators $W_{i}\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d})$ . Since $W$ commutes with $\mathds{1}^{AB}\otimes U^{C}\otimes U^{D}$ , we have

\sum_{i=1}^{d^{4}}E_{i}\otimes W_{i}\quad=\quad\sum_{i=1}^{d^{4}}E_{i}\otimes(U\otimes U)W_{i}(U^{*}\otimes U^{*})\enspace,

for all $U\in{\mathsf{U}}({\mathbb{C}}^{d})$ . Since the operators $E_{i}$ form a basis, we conclude that

W_{i}\quad=\quad(U\otimes U)W_{i}(U^{*}\otimes U^{*})\enspace,

i.e., the operator $W_{i}$ commutes with $U\otimes U$ for every $i$ . As a consequence of the von Neumann double commutant theorem [Wat18, Theorem 7.15, Section 7.1], each operator $W_{i}$ may be written as a linear combination of $\left\{\mathds{1},{\mathrm{F}}\right\}$ . So

W\quad=\quad\sum_{i=1}^{d^{4}}E_{i}\otimes(\alpha_{i}\mathds{1}+\beta_{i}{\mathrm{F}})\enspace,

for some complex numbers $\alpha_{i},\beta_{i}$ . Rearranging the sum, we get that

W\quad=\quad G\otimes\mathds{1}+H\otimes{\mathrm{F}}\enspace,

for some operators $G,H\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d})$ . Since $W$ commutes with $U^{A}\otimes U^{B}\otimes\mathds{1}^{CD}$ as well, and $\mathds{1}$ and ${\mathrm{F}}$ are linearly independent, by [Wat18, Theorem 7.15, Section 7.1] we similarly get that $G$ and $H$ are also linear combinations of $\left\{\mathds{1},{\mathrm{F}}\right\}$ . The lemma follows. ∎

Consider the random vector $|{\bm{\psi}}\rangle$ defined as $|{\bm{\psi}}\rangle\coloneqq({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle$ , where ${\bm{U}}\in{\mathsf{U}}({\mathbb{C}}^{d})$ is a Haar-random unitary operator and $|\upphi_{d}\rangle$ is the maximally entangled state $\tfrac{1}{\sqrt{d}}\sum_{k=1}^{d}|k\rangle|k\rangle$ with local dimension $d$ . We would like to compute a closed form expression for the operator $M$ on $\mathbb{C}^{d}\otimes\mathbb{C}^{d}\otimes\mathbb{C}^{d}\otimes\mathbb{C}^{d}$ defined as:

M\quad\coloneqq\quad\operatorname*{\mathbb{E}}|{\bm{\psi}}\rangle\!\langle{\bm{\psi}}|^{\otimes 2}\enspace.

(5.1)

We use the symmetries of $M$ in order to do so.

Lemma 5.9.

Let $M$ be the operator defined in Eq. (5.1). Then

M\quad=\quad\beta\left[\mathds{1}+{\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD}\right]+\gamma\left[\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD}+{\mathrm{F}}^{AC}\otimes\mathds{1}^{BD}\right]\enspace,

where $\beta\coloneqq d^{-2}(d^{2}-1)^{-1}$ and $\gamma\coloneqq-d^{-3}(d^{2}-1)^{-1}$ .

Proof.

Since

M\quad=\quad\operatorname*{\mathbb{E}}~{}({\bm{U}}^{A}\otimes\mathds{1}^{B})|\upphi_{d}\rangle\!\langle\upphi_{d}|({\bm{U}}^{*A}\otimes\mathds{1}^{B})\otimes({\bm{U}}^{C}\otimes\mathds{1}^{D})|\upphi_{d}\rangle\!\langle\upphi_{d}|({\bm{U}}^{*C}\otimes\mathds{1}^{D})\enspace,

and ${\bm{U}}$ is Haar random, the operator $M$ commutes with $V^{A}\otimes V^{C}\otimes\mathds{1}^{BD}$ for all $V\in{\mathsf{U}}({\mathbb{C}}^{d})$ . Further, since $(\mathds{1}\otimes V)|\upphi_{d}\rangle=(V^{\top}\otimes\mathds{1})|\upphi_{d}\rangle$ , the operator $M$ also commutes with $\mathds{1}^{AC}\otimes V^{B}\otimes V^{D}$ . By Lemma 5.8, we have

M\quad=\quad\alpha\mathds{1}+\beta({\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD})+\gamma(\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})+\delta({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})\enspace.

(5.2)

Consider the following linear functionals:

1.

$X\mapsto\operatorname{Tr}(X)$
2.

$X\mapsto\operatorname{Tr}(({\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD})X)$
3.

$X\mapsto\operatorname{Tr}(({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})X)$
4.

$X\mapsto\operatorname{Tr}((\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})X)$

We apply these functionals to both sides of Eq. (5.2). We calculate the value of the functional on the left hand side directly from the definition of $M$ , i.e., Eq. (5.1), and on the right hand side from Eq. (5.2). We thus obtain the following linear equations:

1.

$1=\alpha d^{4}+\beta d^{2}+\gamma d^{3}+\delta d^{3}$ ,
2.

$1=\alpha d^{2}+\beta d^{4}+\gamma d^{3}+\delta d^{3}$ ,
3.

$1/d=\alpha d^{3}+\beta d^{3}+\gamma d^{2}+\delta d^{4}$ , and
4.

$1/d=\alpha d^{3}+\beta d^{3}+\gamma d^{4}+\delta d^{2}$ , respectively.

Solving for $\alpha,\beta,\gamma,\delta$ , we get the unique solution

\alpha=\beta=\frac{1}{d^{2}(d^{2}-1)}\enspace,\quad\text{and}\quad\gamma=\delta=\frac{-1}{d^{3}(d^{2}-1)}\enspace.

∎
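The closed form for $\alpha,\beta,\gamma,\delta$ can be confirmed with exact rational arithmetic. The following sketch solves the four linear equations above by Gauss-Jordan elimination for a given integer $d\geq 2$ ; the function name `solve_trace_equations` is a hypothetical helper introduced only for this check.

```python
from fractions import Fraction

def solve_trace_equations(d):
    """Solve the four trace equations for (alpha, beta, gamma, delta) exactly.

    Each row is [coeff_alpha, coeff_beta, coeff_gamma, coeff_delta, rhs].
    Assumes d >= 2, for which the system is non-singular.
    """
    d = Fraction(d)
    rows = [
        [d**4, d**2, d**3, d**3, Fraction(1)],
        [d**2, d**4, d**3, d**3, Fraction(1)],
        [d**3, d**3, d**2, d**4, 1 / d],
        [d**3, d**3, d**4, d**2, 1 / d],
    ]
    n = 4
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = 1 / rows[col][col]
        rows[col] = [x * inv for x in rows[col]]
        # Eliminate the pivot column from all other rows.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return tuple(rows[i][n] for i in range(n))
```

For $d=2$ , for example, the solution is $\alpha=\beta=1/12$ and $\gamma=\delta=-1/24$ , in agreement with the formulas in Lemma 5.9.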

Let $n\coloneqq d^{2}$ . Consider the random vector $|{\bm{\xi}}_{n}\rangle\in{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}$ defined as $|{\bm{\xi}}_{n}\rangle\coloneqq d|{\bm{\psi}}\rangle=d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle$ . We prove that the sequence of these vectors is pseudo-isotropic.

Lemma 5.10.

The sequence of vectors $(|{\bm{\xi}}_{n}\rangle)$ is pseudo-isotropic.

Proof.

Let $(A_{n}\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}):n\geq 1)$ be a sequence of complex matrices with spectral norm $\left\|A_{n}\right\|$ bounded by a constant $\kappa$ , for each $n$ . We use the Chebyshev Inequality to show that

\displaystyle\frac{1}{n}\left(\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle-\operatorname{Tr}(A_{n})\right)\overset{{\mathrm{P}}}{\longrightarrow}0

(5.3)

as $n\to\infty$ . Let ${\bm{x}}_{n}$ be the complex random variable defined as ${\bm{x}}_{n}\coloneqq\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle$ . We may verify that $\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|=\mathds{1}$ , so that $\operatorname{{\mathbb{E}}}{\bm{x}}_{n}=\operatorname{{\mathbb{E}}}\operatorname{Tr}(|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|A_{n})=\operatorname{Tr}(A_{n})$ . Eq. (5.3) is equivalent to showing that for every $\epsilon>0$ , $\Pr(\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert>\epsilon n)\to 0$ as $n\to\infty$ . By the Chebyshev Inequality,

\Pr(\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert>\epsilon n)\quad\leq\quad\frac{1}{\epsilon^{2}n^{2}}\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}\enspace.

So it suffices to show that the variance of ${\bm{x}}_{n}$ is $\mathrm{o}(n^{2})$ .

The variance $\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}=\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}=\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}$ . To calculate the second moment of ${\bm{x}}_{n}$ , we rewrite it as follows.

\begin{aligned}
\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}\quad&=\quad\operatorname{{\mathbb{E}}}~{}\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle\langle{\bm{\xi}}_{n}|A_{n}^{*}|{\bm{\xi}}_{n}\rangle\\
&=\quad\operatorname{{\mathbb{E}}}~{}\operatorname{Tr}\big{[}(|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|\otimes|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|)(A_{n}\otimes A_{n}^{*})\big{]}\\
&=\quad n^{2}\operatorname{Tr}\big{[}M(A_{n}\otimes A_{n}^{*})\big{]}\enspace,
\end{aligned}

where $M$ is the matrix defined in Eq. (5.1). By Lemma 5.9, and the Hölder Inequality (namely, $\left\lvert\operatorname{Tr}(AB)\right\rvert\leq\left\|A\right\|_{\mathrm{tr}}\left\|B\right\|$ ),

\begin{aligned}
\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}\quad&=\quad n^{2}\Big{[}\beta\big{(}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\operatorname{Tr}(A_{n}^{*}A_{n})\big{)}\\
&\qquad\mbox{}+\gamma\big{(}\operatorname{Tr}\big{(}({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})(A_{n}\otimes A_{n}^{*})\big{)}+\operatorname{Tr}\big{(}(\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})(A_{n}\otimes A_{n}^{*})\big{)}\big{)}\Big{]}\\
&\leq\quad n^{2}\Big{[}\beta\big{(}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\kappa^{2}n\big{)}+2\left\lvert\gamma\right\rvert\kappa^{2}n^{2}\Big{]}\enspace,
\end{aligned}

where $\beta=1/n(n-1)$ and $\gamma=-1/n^{3/2}(n-1)$ . Thus the variance is bounded as

\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}\quad\leq\quad\frac{1}{n-1}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\frac{\kappa^{2}n^{2}}{n-1}+\frac{2\kappa^{2}n^{5/2}}{n-1}\enspace,

which is $\mathrm{o}(n^{2})$ as $\left\lvert\operatorname{Tr}(A_{n})\right\rvert\leq\kappa n$ . This proves that the sequence $(|{\bm{\xi}}_{n}\rangle)$ is pseudo-isotropic. ∎

5.3 Analysis of a random protocol

Consider the following random protocol ${\bm{\Pi}}_{d}$ . Let $d$ be an integer $\geq 2$ , and $n\coloneqq d^{2}$ . Alice and Bob agree on a choice of $n$ independently chosen Haar-random unitary operators ${\bm{U}}_{1},\dotsc,{\bm{U}}_{n}\in{\mathsf{U}}({\mathbb{C}}^{d})$ . They also share the maximally entangled state $|\upphi_{d}\rangle\coloneqq\tfrac{1}{\sqrt{d}}\sum_{k=1}^{d}|k\rangle|k\rangle$ with local dimension $d$ . When Alice gets message $i\in[n]$ , she applies ${\bm{U}}_{i}$ to her half of $|\upphi_{d}\rangle$ , and sends it over to Bob. Bob now holds the state $|{\bm{\psi}}_{i}\rangle\coloneqq({\bm{U}}_{i}\otimes\mathds{1})|\upphi_{d}\rangle$ . He performs an optimal measurement to identify $i$ , given that the state is drawn from the ensemble ${\bm{\mathcal{E}}}_{d}\coloneqq\big{(}|{\bm{\psi}}_{j}\rangle:j\in[n]\big{)}$ .

Aram Harrow (personal communication) suggested the protocol ${\bm{\Pi}}_{d}$ as a candidate for an approximate $(d,\epsilon)$ -superdense coding protocol with vanishing error $\epsilon$ in the limit of large dimension. If this random construction of superdense coding protocols did indeed have error that vanishes rapidly as a function of $d$ , then this could potentially refute Conjecture 1.3. This is formalized by the following proposition (which was stated in Section 1.3, and is reproduced here for convenience).

See Proposition 1.4.

Proof.

Let ${\bm{\Pi}}_{d}$ be the random protocol and let $({\bm{U}}_{i})$ be the ensemble of random unitaries specified in the Proposition statement. Suppose for contradiction that Conjecture 1.3 were true and the error ${\bm{\varepsilon}}$ of the protocol ${\bm{\Pi}}_{d}$ satisfied

\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\quad<\quad(2d)^{-2}~{}.

(5.4)

First we argue that

\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})|^{2}\quad=\quad 1\enspace,

(5.5)

where the first expectation is over the ensemble of random unitary operators $({\bm{U}}_{i})$ , and the second expectation is over a uniformly random pair of distinct indices ${\bm{j}},{\bm{k}}\in[n]$ , ${\bm{j}}\neq{\bm{k}}$ . To prove this, note that for all $j\neq k$ , we have $\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}|\operatorname{Tr}({\bm{U}}_{j}{\bm{U}}_{k}^{*})|^{2}=\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}|\operatorname{Tr}({\bm{U}}_{1}{\bm{U}}_{2}^{*})|^{2}$ because the ${\bm{U}}_{i}$ are independent, identically distributed Haar-random unitary operators. Furthermore, by the rotation invariance of the Haar measure, ${\bm{U}}_{1}{\bm{U}}_{2}^{*}$ is also distributed according to the Haar measure. So the above quantity is equal to $\operatorname*{\mathbb{E}}_{\bm{U}}|\operatorname{Tr}({\bm{U}})|^{2}$ for Haar-random ${\bm{U}}$ . Hence the LHS of Equation 5.5 equals

\begin{aligned}
\operatorname*{\mathbb{E}}_{\bm{U}}|\operatorname{Tr}({\bm{U}})|^{2}\quad&=\quad\operatorname*{\mathbb{E}}_{\bm{U}}\left\lvert\sum_{j=1}^{d}\langle j|{\bm{U}}|j\rangle\right\rvert^{2}\\
&=\quad\operatorname*{\mathbb{E}}_{\bm{U}}\sum_{j,k=1}^{d}\langle j|{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}|k\rangle\\
&=\quad\sum_{j,k}\langle j|\Big{(}\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}\Big{)}|k\rangle~{}.
\end{aligned}

Since $\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle j|{\bm{U}}^{*}=\mathds{1}/d$ and $\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}=0$ when $j\neq k$ , we have $\operatorname*{\mathbb{E}}|\operatorname{Tr}({\bm{U}})|^{2}=1$ , which establishes Equation 5.5.
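The identity $\operatorname*{\mathbb{E}}|\operatorname{Tr}({\bm{U}})|^{2}=1$ can also be observed empirically. The sketch below samples Haar-random unitaries by applying Gram-Schmidt orthonormalization to i.i.d. complex Gaussian columns (a standard sampling method, used here as an assumption rather than taken from the text) and averages $|\operatorname{Tr}({\bm{U}})|^{2}$ . The function names, sample count, and seed are illustrative.

```python
import random

def haar_unitary(d, rng):
    """Sample a d x d Haar-random unitary, returned as a list of its columns,
    by Gram-Schmidt on i.i.d. standard complex Gaussian vectors."""
    cols = []
    for _ in range(d):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
        for u in cols:
            overlap = sum(a.conjugate() * b for a, b in zip(u, v))  # <u|v>
            v = [b - overlap * a for a, b in zip(u, v)]
        norm = sum(abs(b) ** 2 for b in v) ** 0.5
        cols.append([b / norm for b in v])
    return cols

def mean_abs_trace_squared(d, samples, seed=0):
    """Monte Carlo estimate of E |Tr(U)|^2 over Haar-random U."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        cols = haar_unitary(d, rng)
        # Diagonal entry (j, j) is component j of column j.
        tr = sum(cols[j][j] for j in range(d))
        total += abs(tr) ** 2
    return total / samples
```

With a few thousand samples the estimate should be close to $1$ for any fixed $d\geq 2$ , up to statistical fluctuations of order one over the square root of the sample count.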

On the other hand, the rigidity condition promised by Conjecture 1.3 implies that every collection of $d\times d$ unitary operators $(U_{i})$ yields a superdense coding protocol with some error $\varepsilon$ , and in turn there exists an orthogonal unitary basis $(E_{i})$ such that $\|U_{i}-E_{i}\|_{\mathrm{nhs}}\leq\delta_{2}(\varepsilon)$ for all $i\in[d^{2}]$ . Note that $\varepsilon$ and $(E_{i})$ depend on $(U_{i})$ , and let ${\bm{\varepsilon}}$ and $({\bm{E}}_{i})$ be the error of the protocol ${\bm{\Pi}}_{d}$ and the corresponding orthogonal unitary basis, respectively. Then

\begin{aligned}
\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})|^{2}\quad&=\quad\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\left|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})-\operatorname{Tr}({\bm{E}}_{\bm{j}}{\bm{E}}_{\bm{k}}^{*})\right|^{2}\\
&\leq\quad\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\Big{(}\left|\operatorname{Tr}(({\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}){\bm{U}}_{\bm{k}}^{*})\right|+\left|\operatorname{Tr}({\bm{E}}_{\bm{j}}({\bm{U}}_{\bm{k}}^{*}-{\bm{E}}_{\bm{k}}^{*}))\right|\Big{)}^{2}\\
&\leq\quad 2\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\Big{(}\left|\operatorname{Tr}(({\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}){\bm{U}}_{\bm{k}}^{*})\right|^{2}+\left|\operatorname{Tr}({\bm{E}}_{\bm{j}}({\bm{U}}_{\bm{k}}^{*}-{\bm{E}}_{\bm{k}}^{*}))\right|^{2}\Big{)}\enspace,
\end{aligned}

(5.6)

where the first equality is due to the orthogonality condition $\operatorname{Tr}({\bm{E}}_{j}{\bm{E}}_{k}^{*})=0$ whenever $j\neq k$ , the second line is due to the triangle inequality, and the third line is due to the inequality $(a+b)^{2}\leq 2a^{2}+2b^{2}$ for real numbers $a,b$ . By the Cauchy-Schwarz inequality for the Hilbert-Schmidt inner product, we have $|\operatorname{Tr}(({\bm{U}}_{j}-{\bm{E}}_{j}){\bm{U}}_{k}^{*})|^{2}\leq\|{\bm{U}}_{j}-{\bm{E}}_{j}\|_{2}^{2}\cdot\|{\bm{U}}_{k}\|_{2}^{2}$ and $|\operatorname{Tr}({\bm{E}}_{j}({\bm{U}}_{k}^{*}-{\bm{E}}_{k}^{*}))|^{2}\leq\|{\bm{E}}_{j}\|_{2}^{2}\cdot\|{\bm{U}}_{k}-{\bm{E}}_{k}\|_{2}^{2}$ , where $\|X\|_{2}=\sqrt{\operatorname{Tr}(XX^{*})}$ denotes the (unnormalized) Hilbert-Schmidt norm. Since $\|A\|_{2}^{2}=d$ for all $d\times d$ unitary matrices $A$ , we can upper bound the RHS of Equation 5.6 as

\begin{aligned}
&\leq\quad 2d\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\left(\|{\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}\|_{2}^{2}+\|{\bm{U}}_{\bm{k}}-{\bm{E}}_{\bm{k}}\|_{2}^{2}\right)\\
&=\quad 4\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\sum_{j}\|{\bm{U}}_{j}-{\bm{E}}_{j}\|_{\mathrm{nhs}}^{2}\\
&\leq\quad 4d^{2}\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\\
&<\quad 1\enspace,
\end{aligned}

where the last inequality follows from the assumption in Equation 5.4. However, this contradicts Equation 5.5. Thus, either the conjecture does not hold, or the random superdense coding protocol ${\bm{\Pi}}_{d}$ has error satisfying $\operatorname*{\mathbb{E}}\delta_{2}({\bm{\varepsilon}})^{2}\geq(2d)^{-2}$ . ∎

In the rest of this section, we prove that for sufficiently large dimension, with high probability, the protocol ${\bm{\Pi}}_{d}$ has positive constant error. This indicates that random maximally entangled quantum states are not very reliable for transmitting classical information, and proves Theorem 1.5. Thus, the random protocol ${\bm{\Pi}}_{d}$ does not rule out a robust rigidity theorem for superdense coding.

To analyze the decoding error of the protocol, we study the distinguishability of the ensemble ${\bm{\mathcal{E}}}_{d}$ . This is the probability that, if the pure state $|{\bm{\psi}}_{i}\rangle$ is selected uniformly at random from the ensemble, an optimal measurement correctly identifies the state.

Definition 5.11.

Let ${\mathcal{F}}\coloneqq\big{(}(p_{i},\rho_{i}):\rho_{i}\in{\mathsf{D}}({\mathbb{C}}^{k}),i\in[m]\big{)}$ be an ensemble of states in which state $\rho_{i}$ occurs with probability $p_{i}$ . We define the distinguishability of ${\mathcal{F}}$ as

\operatorname{\upgamma}({\mathcal{F}})\quad\coloneqq\quad\max_{\text{POVM }M}\sum_{i=1}^{m}p_{i}\operatorname{Tr}(M_{i}\rho_{i})\enspace,

where the maximization is over all measurements (i.e., POVMs) $M$ with elements $M_{1},\ldots,M_{m}$ .

We can estimate the distinguishability of an ensemble of states via the generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b].

Theorem 5.12 (generalized Holevo-Curlander bounds [ON99, Tys09b]).

Let ${\mathcal{F}}\coloneqq\big{(}(p_{i},\rho_{i}):\rho_{i}\in{\mathsf{D}}({\mathbb{C}}^{k}),i\in[m]\big{)}$ be an ensemble of $m$ quantum states. Then the distinguishability of ${\mathcal{F}}$ satisfies

\left(\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\right)^{2}\quad\leq\quad\operatorname{\upgamma}({\mathcal{F}})\quad\leq\quad\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\enspace,

where

\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\quad\coloneqq\quad\operatorname{Tr}\sqrt{\sum_{i=1}^{m}p_{i}^{2}\rho_{i}^{2}}\enspace.

We only need the upper bound on distinguishability above for a uniform ensemble of pure states. This bound was given by Curlander [Cur79] in the case of linearly independent states. It was generalized to the case of equiprobable, possibly mixed states by Ogawa and Nagaoka [ON99, Lemma 1]. The proof they gave also extends with minor modifications to non-uniform ensembles. The two bounds in Theorem 5.12 were proven (re-proven independently, in the case of the upper bound [Tys09a]) by Tyson [Tys09b, Theorem 10]. Tyson later gave another proof of the bounds which also generalizes to error-recovery [Tys10, Section III].
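A small worked instance may help calibrate the bounds in Theorem 5.12. For two equiprobable pure states with overlap magnitude $c=|\langle\psi_{1}|\psi_{2}\rangle|$ , the operator $|\psi_{1}\rangle\!\langle\psi_{1}|+|\psi_{2}\rangle\!\langle\psi_{2}|$ has non-zero eigenvalues $1\pm c$ , and the optimal success probability is given by the well-known Helstrom formula $\tfrac{1}{2}\big(1+\sqrt{1-c^{2}}\big)$ . The sketch below compares the two quantities; the function names are illustrative assumptions, and the Helstrom formula is brought in from outside the text.

```python
import math

def holevo_curlander_two_pure(c):
    """alpha_Holevo for two equiprobable pure states with overlap magnitude c.

    With p_i = 1/2 and rho_i^2 = rho_i, alpha = Tr sqrt(Q / 4), where
    Q = |psi_1><psi_1| + |psi_2><psi_2| has non-zero eigenvalues 1 + c, 1 - c.
    """
    return (math.sqrt(1 + c) + math.sqrt(1 - c)) / 2

def helstrom_two_pure(c):
    """Optimal success probability for two equiprobable pure states."""
    return (1 + math.sqrt(1 - c * c)) / 2
```

In this two-state case the lower bound of the theorem is tight: squaring $\tfrac{1}{2}\big(\sqrt{1+c}+\sqrt{1-c}\big)$ gives exactly $\tfrac{1}{2}\big(1+\sqrt{1-c^{2}}\big)$ .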

We show that the expectation of the quantity $\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d})$ for the ensemble of random maximally entangled states is at most a constant strictly less than $1$ , for sufficiently large dimension $d$ . This implies that the distinguishability $\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})$ is also strictly less than $1$ in expectation, and that any measurement Bob makes has a non-zero constant probability of failure, on average.

Theorem 5.13.

The expected distinguishability $\operatorname{{\mathbb{E}}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})$ of the random superdense coding protocol ${\bm{\Pi}}_{d}$ is at most $\tfrac{8}{3\pi}\approx 0.85$ in the limit $d\to\infty$ .

Proof.

Define the matrix ${\bm{Q}}$ as ${\bm{Q}}\coloneqq\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle{\bm{\psi}}_{i}|$ , and ${\bm{\Lambda}}_{d}$ as a uniformly random eigenvalue of ${\bm{Q}}$ . Then $\operatorname{{\mathbb{E}}}\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d})$ is the expectation of the random variable $\sqrt{{\bm{\Lambda}}_{d}}$ , and we aim to bound this from above.

Define the $n\times n$ matrix ${\bm{R}}\coloneqq d\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle i|$ so that ${\bm{Q}}=\tfrac{1}{n}{\bm{R}}{\bm{R}}^{*}$ . Consider the random vector $|{\bm{\xi}}_{n}\rangle$ defined as $|{\bm{\xi}}_{n}\rangle\coloneqq d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle$ , where ${\bm{U}}\in{\mathsf{U}}({\mathbb{C}}^{d})$ is a Haar-random unitary operator. I.e., $|{\bm{\xi}}_{n}\rangle$ is a scaled random maximally entangled state. Lemma 5.10 shows that the sequence $(|{\bm{\xi}}_{n}\rangle)$ is pseudo-isotropic. Since the columns of the matrix ${\bm{R}}$ are i.i.d. copies of the random vector $|{\bm{\xi}}_{n}\rangle$ , the random matrix ${\bm{Q}}$ is of the form described in Theorem 5.7. Thus the limiting distribution of the uniformly random eigenvalue ${\bm{\Lambda}}_{d}$ of ${\bm{Q}}$ follows the Marčenko-Pastur law with density $\operatorname{p}_{1}$ (i.e., with parameter $r=1$ ):

\displaystyle\operatorname{p}_{1}(x)\quad\coloneqq\quad\begin{cases}\frac{1}{2\pi}\sqrt{\frac{4-x}{x}}&\text{ if }0\leq x\leq 4\enspace,\text{ and}\\ 0&\text{otherwise}\enspace.\end{cases}

(5.7)
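As a sanity check on Eq. (5.7), the following minimal sketch numerically confirms that $\operatorname{p}_{1}$ integrates to $1$ and has mean $1$. The substitution $x=4\sin^{2}\theta$ , which turns $\operatorname{p}_{1}(x)\,\mathrm{d}x$ into $\tfrac{4}{\pi}\cos^{2}\theta\,\mathrm{d}\theta$ on $[0,\pi/2]$ , is our choice for removing the integrable singularity at $x=0$ ; the function name `mp_expectation` is illustrative and does not come from the paper.

```python
import math

def mp_expectation(g, steps=100_000):
    """Approximate E[g(X)] for X with density p_1, using the midpoint rule.

    With x = 4 sin^2(theta), p_1(x) dx becomes (4/pi) cos^2(theta) d(theta)
    on [0, pi/2], so the 1/sqrt(x) singularity at x = 0 disappears.
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        x = 4.0 * math.sin(theta) ** 2
        total += g(x) * (4.0 / math.pi) * math.cos(theta) ** 2
    return total * h

total_mass = mp_expectation(lambda x: 1.0)  # should be close to 1
mean = mp_expectation(lambda x: x)          # should be close to 1
print(f"mass = {total_mass:.6f}, mean = {mean:.6f}")
```

A mean of $1$ is what one expects here: ${\bm{Q}}$ is a sum of $n$ rank-one projectors acting on an $n$ -dimensional space, so its trace is $n$ and its average eigenvalue is exactly $1$ for every $d$ .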

We would like to use the density $\operatorname{p}_{1}$ to estimate the limit of the expectation of $\sqrt{{\bm{\Lambda}}_{d}}$ . A subtle issue is that weak convergence (i.e., convergence in distribution of the random variables) does not necessarily imply that the limit of the expectation values $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ equals the expectation of the limiting random variable. A simple example for which this does not hold is described in Section 5.4. Nonetheless, in Theorem 5.16 in Section 5.4, we show that the sequence of random variables ${\bm{\Lambda}}_{d}$ satisfies the stronger property we need. Namely, $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ converges to $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}$ as $d\to\infty$ , where ${\bm{\Lambda}}$ is a random variable with density $\operatorname{p}_{1}$ . We may thus bound the distinguishability of the ensemble ${\bm{\mathcal{E}}}_{d}$ as $d\to\infty$ as follows.

	$\displaystyle\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\quad$	$\displaystyle\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d})$
		$\displaystyle=\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}$
		$\displaystyle=\quad\int_{-\infty}^{\infty}\sqrt{x}\,\operatorname{p}_{1}(x)\mathop{}\!\mathrm{d}x$
		$\displaystyle=\quad\frac{1}{2\pi}\int_{0}^{4}\sqrt{4-x}\mathop{}\!\mathrm{d}x$
		$\displaystyle=\quad\frac{8}{3\pi}\quad\approx\quad 0.85\enspace.$

∎
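For completeness, the last two steps of the calculation above can be spelled out as follows. For $0<x\leq 4$ the factor $\sqrt{x}$ cancels against the denominator of $\operatorname{p}_{1}(x)$ , leaving an elementary integral.

```latex
\int_{0}^{4}\sqrt{x}\cdot\frac{1}{2\pi}\sqrt{\frac{4-x}{x}}\,\mathrm{d}x
  \;=\; \frac{1}{2\pi}\int_{0}^{4}\sqrt{4-x}\,\mathrm{d}x
  \;=\; \frac{1}{2\pi}\Big[-\tfrac{2}{3}(4-x)^{3/2}\Big]_{0}^{4}
  \;=\; \frac{1}{2\pi}\cdot\frac{2}{3}\cdot 4^{3/2}
  \;=\; \frac{8}{3\pi}
  \;\approx\; 0.8488 .
```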

We can strengthen this result to show that the distinguishability of ${\bm{\mathcal{E}}}_{d}$ is tightly concentrated around the mean. So for large $d$ , all but an exponentially small fraction of the superdense coding protocols using $|\upphi_{d}\rangle$ succeed with probability at most $\approx 0.85$ , up to a vanishing additive term.

Theorem 5.14.

The distinguishability $\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})$ of the random superdense coding protocol ${\bm{\Pi}}_{d}$ satisfies

\Pr\Big{(}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\geq\operatorname*{\mathbb{E}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})+t\Big{)}\leq\exp\!\left(-\frac{d^{3}(d-2)t^{2}}{96}\right)\enspace.

Proof.

Define the function $f:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{n}\rightarrow{\mathbb{R}}$ as

f(U_{1},\ldots,U_{n})\quad\coloneqq\quad\sup_{\text{POVM }M}\frac{1}{n}\sum_{i=1}^{n}\operatorname{Tr}(M_{i}\psi_{i})\enspace,

where $|\psi_{i}\rangle=(U_{i}\otimes\mathds{1})|\upphi_{d}\rangle$ , and we denote $|\psi\rangle\!\langle\psi|$ by $\psi$ . So $f({\bm{U}}_{1},\ldots,{\bm{U}}_{n})=\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})$ , the distinguishability of the ensemble ${\bm{\mathcal{E}}}_{d}$ . We bounded the limiting expected distinguishability by $\approx 0.85<1$ in Theorem 5.13. We show that $\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})$ is tightly concentrated around its expectation using Theorem 5.5. To do so, we compute a bound on the Lipschitz constant of $f$ .

Fix unitary operators $U_{1},\ldots,U_{n},U_{1}^{\prime},\ldots,U_{n}^{\prime}\in{\mathsf{U}}({\mathbb{C}}^{d})$ and let $|\psi_{i}\rangle=(U_{i}\otimes\mathds{1})|\upphi_{d}\rangle$ and $|\psi_{i}^{\prime}\rangle=(U_{i}^{\prime}\otimes\mathds{1})|\upphi_{d}\rangle$ . Since the space of $n$ -dimensional POVMs with $n$ outcomes is compact, for any sequence $(U_{i})$ the supremum in the definition of $f$ is attained at some POVM $M$ . Let $M$ , $M^{\prime}$ correspond to the POVMs achieving $f(U_{1},\ldots,U_{n})$ and $f(U_{1}^{\prime},\ldots,U_{n}^{\prime})$ , respectively, and let $\alpha,\alpha^{\prime}$ denote these quantities. Assume without loss of generality that $\alpha^{\prime}\leq\alpha$ .

We have that

\displaystyle|\alpha-\alpha^{\prime}|\quad=\quad\alpha-\alpha^{\prime}\quad=\quad\frac{1}{d^{2}}\Big{(}\sum_{i}\operatorname{Tr}(M_{i}\psi_{i})-\operatorname{Tr}(M_{i}^{\prime}\psi_{i}^{\prime})\Big{)}\quad\leq\quad\frac{1}{d^{2}}\sum_{i}\operatorname{Tr}(M_{i}(\psi_{i}-\psi_{i}^{\prime}))\enspace,

as the POVM $M$ may not be an optimal distinguishing measurement for the ensemble $(\psi_{i}^{\prime})$ . We bound this by

	$\displaystyle\frac{1}{d^{2}}\sum_{i}\Big{|}\operatorname{Tr}(M_{i}(\psi_{i}-\psi_{i}^{\prime}))\Big{|}$
	$\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}\|M_{i}\|\cdot\|\psi_{i}-\psi_{i}^{\prime}\|_{1}$	$\displaystyle(\text{H\"{o}lder inequality})$
	$\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}\|\psi_{i}-\psi_{i}^{\prime}\|_{1}$	$\displaystyle(\text{$M$ is a POVM})$
	$\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}2\,\big{\|}|\psi_{i}\rangle-|\psi_{i}^{\prime}\rangle\big{\|}$
	$\displaystyle=\quad\frac{2}{d^{2}}\sum_{i}\sqrt{\langle\upphi|(U_{i}-U_{i}^{\prime})^{*}(U_{i}-U_{i}^{\prime})|\upphi\rangle}$
	$\displaystyle\leq\quad 2\sqrt{\frac{1}{d^{2}}\sum_{i}\frac{1}{d}\|U_{i}-U_{i}^{\prime}\|_{2}^{2}}$	$\displaystyle(\text{Jensen inequality})$

where in the fourth line we used the property that for any two pure states $|\varphi\rangle$ and $|\theta\rangle$ , the trace distance between $|\varphi\rangle\!\langle\varphi|$ and $|\theta\rangle\!\langle\theta|$ is at most $2\||\varphi\rangle-|\theta\rangle\|$ . In the last line, we used the concavity of the square root together with the identity $\langle\upphi_{d}|(A\otimes\mathds{1})|\upphi_{d}\rangle=\operatorname{Tr}(A)/d$ , which holds for any $d\times d$ matrix $A$ .
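The identity $\langle\upphi_{d}|(A\otimes\mathds{1})|\upphi_{d}\rangle=\operatorname{Tr}(A)/d$ used in the last step can be checked numerically. The sketch below is an illustration only: it assumes $|\upphi_{d}\rangle=d^{-1/2}\sum_{j}|j\rangle|j\rangle$ in the standard basis, and the helper name `maximally_entangled_expectation` is ours.

```python
import random

def maximally_entangled_expectation(A):
    """Return <phi_d|(A ⊗ 1)|phi_d> for |phi_d> = d^{-1/2} sum_j |j>|j>."""
    d = len(A)
    # Amplitude of |i>|j> in |phi_d> is delta_{ij}/sqrt(d); store it as a d x d table.
    phi = [[(1.0 / d ** 0.5) if i == j else 0.0 for j in range(d)] for i in range(d)]
    # (A ⊗ 1)|phi_d> has amplitude sum_k A[i][k] * phi[k][j] on |i>|j>.
    out = [[sum(A[i][k] * phi[k][j] for k in range(d)) for j in range(d)]
           for i in range(d)]
    # Inner product <phi_d|out>; phi is real, so no conjugation is needed.
    return sum(phi[i][j] * out[i][j] for i in range(d) for j in range(d))

random.seed(0)
d = 4
# An arbitrary complex d x d matrix (it need not be unitary for the identity).
A = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(d)]
     for _ in range(d)]
lhs = maximally_entangled_expectation(A)
rhs = sum(A[j][j] for j in range(d)) / d  # Tr(A) / d
print(abs(lhs - rhs))
```

Taking $A=(U_{i}-U_{i}^{\prime})^{*}(U_{i}-U_{i}^{\prime})$ gives $\operatorname{Tr}(A)/d=\|U_{i}-U_{i}^{\prime}\|_{2}^{2}/d$ , which is the quantity under the square root in the final line.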

Thus $|\alpha-\alpha^{\prime}|\leq 2d^{-3/2}\sqrt{\sum_{i}\|U_{i}-U_{i}^{\prime}\|_{2}^{2}}$ , which implies that $f$ is $2d^{-3/2}$ -Lipschitz. Applying Theorem 5.5 we obtain

\Pr\Big{(}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\geq\operatorname*{\mathbb{E}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})+t\Big{)}\quad\leq\quad\exp\!\left(-\frac{d^{3}(d-2)t^{2}}{96}\right)\enspace.

∎
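To get a feel for how quickly this bound decays, the sketch below evaluates it for a few dimensions. It assumes that the tail bound of Theorem 5.5 has the form $\exp(-(d-2)t^{2}/(24L^{2}))$ for an $L$ -Lipschitz function, which is the form consistent with the two applications in this section; the function names are illustrative.

```python
import math

def lipschitz_tail_bound(d, lipschitz, t):
    """Assumed form of the Theorem 5.5 bound: exp(-(d - 2) t^2 / (24 L^2))."""
    return math.exp(-(d - 2) * t * t / (24.0 * lipschitz ** 2))

def theorem_5_14_bound(d, t):
    """Right-hand side of Theorem 5.14: exp(-d^3 (d - 2) t^2 / 96)."""
    return math.exp(-(d ** 3) * (d - 2) * t * t / 96.0)

t = 0.1
for d in (4, 10, 20):
    # Plugging in the Lipschitz constant L = 2 d^{-3/2} recovers Theorem 5.14.
    general = lipschitz_tail_bound(d, 2.0 * d ** -1.5, t)
    specific = theorem_5_14_bound(d, t)
    print(d, general, specific)
```

Already for $d=20$ and $t=0.1$ the exponent equals $-15$ , so deviations of this size above the mean are very unlikely even in moderate dimension.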

5.4 A subtle issue

Recall from the proof of Theorem 5.13 in Section 5.3 that ${\bm{\Lambda}}_{d}$ denotes a uniformly random eigenvalue of the matrix ${\bm{Q}}$ . The generalized Marčenko-Pastur Law (Theorem 5.7) tells us that ${\bm{\Lambda}}_{d}$ converges in distribution to a random variable ${\bm{\Lambda}}$ with density $\operatorname{p}_{1}$ given in Eq. (5.7) as $d\rightarrow\infty$ . We used this limiting distribution to estimate the limit of the mean of $\sqrt{{\bm{\Lambda}}_{d}}$ in Theorem 5.13. We pointed out the subtle issue that convergence in distribution does not necessarily imply that the limit of means equals the mean of the limiting random variable. A simple example which illustrates this issue is the following. For any positive integer $k$ , let the random variable ${\bm{x}}_{k}$ take value $k$ with probability $1/k$ , and value $0$ with the remaining probability. Then ${\bm{x}}_{k}$ converges in distribution to the constant $0$ , whereas $\operatorname{{\mathbb{E}}}{\bm{x}}_{k}=1$ for all $k$ .
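The example can be made concrete with exact arithmetic. The short sketch below (function names are ours) shows the mean of ${\bm{x}}_{k}$ staying at $1$ while the probability of a non-zero value shrinks to $0$ .

```python
from fractions import Fraction

def mean_of_x(k):
    """Exact mean of x_k, which equals k with probability 1/k and 0 otherwise."""
    p = Fraction(1, k)
    return k * p + 0 * (1 - p)

def prob_nonzero(k):
    """Pr(x_k != 0); this tends to 0, so x_k converges in distribution to 0."""
    return Fraction(1, k)

for k in (1, 10, 1000, 10**6):
    print(k, mean_of_x(k), float(prob_nonzero(k)))
```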

The example above highlights the reason underlying this phenomenon: while the probability of an interval on the line may go to zero in the limit, the rate of convergence may not be fast enough to dampen the contribution to the mean from that interval. We show that the probability that the random variable ${\bm{\Lambda}}_{d}$ exceeds a threshold $t$ decays exponentially in $t$ , once $t$ is larger than a universal constant. This helps us conclude the convergence of the mean $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ to $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}$ .

A similar property was assumed to hold by Montanaro [Mon07] in his work on the distinguishability of random quantum states. Let ${\bm{S}}\coloneqq\sum_{i=1}^{k}|{\bm{\zeta}}_{i}\rangle\!\langle i|$ , where $|{\bm{\zeta}}_{i}\rangle$ are i.i.d. random vectors in ${\mathbb{C}}^{d}$ with i.i.d. complex gaussian entries with mean $0$ and variance $1$ . Montanaro approximates $\operatorname{{\mathbb{E}}}\operatorname{Tr}({\bm{S}}{\bm{S}}^{*})^{1/2}$ using the Marčenko-Pastur Law. He justifies this using estimates on the rate of convergence of the expected distribution of a uniformly random eigenvalue of $(1/k){\bm{S}}{\bm{S}}^{*}$ to the limiting distribution given by the Marčenko-Pastur Law (see the discussion after Lemma 5 in Ref. [Mon07]). The rate of convergence is measured in terms of the Kolmogorov distance between the two distributions. (The Kolmogorov distance between the cumulative distribution functions $F_{1}$ and $F_{2}$ of real random variables is defined as $\sup_{x\in{\mathbb{R}}}\left\lvert F_{1}(x)-F_{2}(x)\right\rvert$ .) The Kolmogorov distance was shown to be $\mathrm{O}(k^{-5/48})$ by Bai [Bai93]. However, vanishing Kolmogorov distance does not necessarily imply the convergence of the mean to the mean of the limiting distribution. For example, the Kolmogorov distance of the distribution of the random variable ${\bm{x}}_{k}$ defined above from the constant $0$ is $1/k$ . The approach we take in this section can also be used to fill the gap in Montanaro’s work. In fact, the analogue of Lemma 5.15 we need for this purpose follows directly, as the columns of ${\bm{S}}$ are gaussian. We leave the details to the reader, and return to the analysis of the random variable ${\bm{\Lambda}}_{d}$ .
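The claim about the Kolmogorov distance of ${\bm{x}}_{k}$ from the constant $0$ can be verified directly from the two cumulative distribution functions. The following sketch is illustrative (the helper names are ours) and uses the fact that both functions are piecewise constant with jumps only at $0$ and $k$ .

```python
from fractions import Fraction

def cdf_x(k, x):
    """CDF of x_k: mass 1 - 1/k at 0 and mass 1/k at k."""
    if x < 0:
        return Fraction(0)
    if x < k:
        return 1 - Fraction(1, k)
    return Fraction(1)

def cdf_zero(x):
    """CDF of the constant random variable 0."""
    return Fraction(1) if x >= 0 else Fraction(0)

def kolmogorov_distance(k):
    # Both CDFs are constant on (-inf, 0), [0, k) and [k, inf),
    # so one sample point per piece attains the supremum.
    samples = (-1, 0, k)
    return max(abs(cdf_x(k, x) - cdf_zero(x)) for x in samples)

for k in (1, 2, 50):
    print(k, kolmogorov_distance(k))
```

So the distance vanishes at rate $1/k$ even though the mean of ${\bm{x}}_{k}$ never moves from $1$ .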

In order to show that ${\bm{\Lambda}}_{d}$ has an exponentially decaying tail, it suffices to show that the spectral norm of the matrix ${\bm{Q}}$ —i.e., its largest eigenvalue—has this property. So we proceed by deriving a tail bound for the spectral norm of ${\bm{Q}}$ .

Lemma 5.15.

Let $d\geq 3$ . There are positive universal constants $c_{1},c_{2}$ such that for all $t\geq c_{1}$ ,

\Pr\left(\left\|{\bm{Q}}\right\|>t\right)\quad\leq\quad 2\exp(-tn/c_{2})\enspace.

(5.8)

Proof.

Recall that $n\coloneqq d^{2}$ , ${\bm{Q}}\coloneqq\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle{\bm{\psi}}_{i}|$ , and that ${\bm{R}}\coloneqq d\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle i|$ . We have $\left\|{\bm{Q}}\right\|=n^{-1}\left\|{\bm{R}}{\bm{R}}^{*}\right\|=n^{-1}\left\|{\bm{R}}\right\|^{2}$ , so

\Pr\left(\left\|{\bm{Q}}\right\|>t\right)\quad=\quad\Pr\left(\left\|{\bm{R}}\right\|>\sqrt{nt}\right)\enspace.

It thus suffices to give a suitable tail bound for $\left\|{\bm{R}}\right\|$ .

The vectors $d|{\bm{\psi}}_{i}\rangle$ are i.i.d. copies of the random vector $|{\bm{\xi}}_{n}\rangle$ defined as $|{\bm{\xi}}_{n}\rangle\coloneqq d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle$ , where ${\bm{U}}$ is a Haar-random unitary operator on ${\mathbb{C}}^{d}$ . Since $\operatorname{{\mathbb{E}}}{\bm{U}}=0$ for a Haar-random unitary operator, the vectors $d|{\bm{\psi}}_{i}\rangle$ have zero mean. We can verify that $\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|=\mathds{1}$ , so the vectors $d|{\bm{\psi}}_{i}\rangle$ are isotropic. We prove below that the vector $|{\bm{\xi}}_{n}\rangle$ is sub-gaussian, with sub-gaussian norm at most a universal constant $\kappa$ . So the matrix ${\bm{R}}^{*}=d\sum_{i=1}^{n}|i\rangle\!\langle{\bm{\psi}}_{i}|$ satisfies the conditions of Theorem 5.4, and we have that for some positive universal constant $c_{3}$ ,

\left\|{\bm{R}}\right\|=\left\|{\bm{R}}^{*}\right\|\quad>\quad\sqrt{n}+c_{3}\kappa^{2}(\sqrt{n}+t_{1})

with probability at most $2\exp(-t_{1}^{2})$ , for all $t_{1}\geq 0$ . Let $t_{1}$ be such that the right hand side above equals $\sqrt{nt}$ , i.e.,

t_{1}\quad\coloneqq\quad\frac{\sqrt{n}}{c_{3}\kappa^{2}}\left(\sqrt{t}-(1+c_{3}\kappa^{2})\right)\enspace.

Let $c_{1}\coloneqq 4(1+c_{3}\kappa^{2})^{2}$ . Whenever $t\geq c_{1}$ , we see that $t_{1}\geq\sqrt{nt}/2c_{3}\kappa^{2}\geq 0$ . So

\Pr\left(\left\|{\bm{R}}\right\|>\sqrt{nt}\right)\quad\leq\quad 2\exp\!\big{(}-nt/4c_{3}^{2}\kappa^{4}\big{)}\enspace,

and the lemma holds with $c_{2}\coloneqq 4c_{3}^{2}\kappa^{4}$ .
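The choice of $t_{1}$ and $c_{1}$ above can be checked numerically. In the sketch below, the value assigned to $c_{3}\kappa^{2}$ is purely hypothetical (the text only asserts that such universal constants exist), and the function names are ours.

```python
import math

def t1_from_t(n, t, c3k2):
    """t_1 chosen so that sqrt(n) + c3k2 * (sqrt(n) + t_1) equals sqrt(n * t)."""
    return math.sqrt(n) / c3k2 * (math.sqrt(t) - (1.0 + c3k2))

def check_choice(n, t, c3k2):
    """Check that t_1 solves the equation and that t_1 >= sqrt(n t) / (2 c3k2)."""
    t1 = t1_from_t(n, t, c3k2)
    threshold = math.sqrt(n) + c3k2 * (math.sqrt(n) + t1)
    solves_equation = math.isclose(threshold, math.sqrt(n * t), rel_tol=1e-9)
    # Small relative slack guards against rounding when t equals c_1 exactly.
    lower_bound_ok = t1 >= math.sqrt(n * t) / (2.0 * c3k2) * (1.0 - 1e-9)
    return solves_equation and lower_bound_ok

C3K2 = 2.5                        # hypothetical value of c_3 * kappa^2
C1 = 4.0 * (1.0 + C3K2) ** 2      # c_1 as defined in the proof
results = [check_choice(d * d, t, C3K2)
           for d in (3, 10, 50)
           for t in (C1, 2 * C1, 100 * C1)]
print(all(results))
```

The lower bound on $t_{1}$ holds with equality at $t=c_{1}$ , which is why $c_{1}$ cannot be taken smaller with this argument.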

It remains to prove that the sub-gaussian norm $\kappa$ of $|{\bm{\xi}}_{n}\rangle$ is at most some universal constant. By Lemma 5.3, it suffices to show that for any unit vector $|u\rangle\in{\mathbb{C}}^{n}$ , the random variable ${\bm{x}}\coloneqq\langle u|{\bm{\xi}}_{n}\rangle$ has sub-gaussian tails: for a positive universal constant $\kappa_{1}$ ,

\Pr(\left\lvert{\bm{x}}\right\rvert\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa_{1}^{2}\big{)}\enspace,

(5.9)

for all $t\geq 0$ . We establish this by appealing to Theorem 5.5.

Since $|{\bm{\xi}}_{n}\rangle$ is isotropic,

\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert^{2}\quad=\quad\langle u|\;(\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|)\;|u\rangle\quad=\quad 1\enspace.

So $\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert\leq\big{(}\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert^{2}\big{)}^{1/2}\leq 1$ .

Define the function $f:{\mathsf{U}}({\mathbb{C}}^{d})\rightarrow{\mathbb{R}}$ as $f(U)\coloneqq d\left\lvert\langle u|(U\otimes\mathds{1})|\upphi_{d}\rangle\right\rvert$ . Then $\left\lvert{\bm{x}}\right\rvert=f({\bm{U}})$ . To show that $f$ is Lipschitz, consider $U,V\in{\mathsf{U}}({\mathbb{C}}^{d})$ . Since $|u\rangle$ is a unit vector and $\langle\upphi_{d}|(W\otimes\mathds{1})|\upphi_{d}\rangle=\tfrac{1}{d}\operatorname{Tr}(W)$ for any $d\times d$ matrix $W$ , we have

	$\displaystyle\left\lvert f(U)-f(V)\right\rvert\quad$	$\displaystyle\leq\quad d\left\lvert\langle u|((U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right\rvert$
		$\displaystyle\leq\quad d\left\|((U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right\|$
		$\displaystyle=\quad d\left(\langle\upphi_{d}|((U-V)^{*}(U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right)^{1/2}$
		$\displaystyle=\quad\sqrt{d}\left\|U-V\right\|_{2}\enspace,$

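where $\left\|\cdot\right\|_{2}$ denotes the Hilbert-Schmidt norm on ${\mathsf{L}}({\mathbb{C}}^{d})$ . So the function $f$ is $\sqrt{d}$ -Lipschitz with respect to the Hilbert-Schmidt metric. Since $\operatorname{{\mathbb{E}}}f({\bm{U}})=\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert\leq 1$ , Theorem 5.5 implies that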

\Pr(\left\lvert{\bm{x}}\right\rvert\geq 1+t_{1})\quad\leq\quad\exp\!\left(-\frac{(d-2)t_{1}^{2}}{24d}\right)

for every $t_{1}>0$ . For $d\geq 3$ , we have $d-2\geq d/3$ . So the right hand side is at most $\exp(-t_{1}^{2}/72)$ , and we have

\Pr(\left\lvert{\bm{x}}\right\rvert\geq t)\quad\leq\quad\exp\!\left(-\frac{(t-1)^{2}}{3\cdot 24}\right)\quad\leq\quad\exp\!\left(-\frac{t^{2}}{12\cdot 24}\right)\enspace,

for all $t\geq 2$ (as $t-1\geq t/2$ ). Note that Eq. (5.9) holds trivially for $t\in[0,2]$ for any choice of positive constant $\kappa_{1}$ such that $2\exp(-4/\kappa_{1}^{2})\geq 1$ . Thus, taking $\kappa_{1}\coloneqq 24$ , we see that Eq. (5.9) holds for all $t\geq 0$ whenever $d\geq 3$ . ∎
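The chain of constants at the end of this proof is easy to get wrong, so the sketch below checks it on a grid of values. It only tests the elementary inequalities between the exponential expressions; it does not sample Haar-random unitaries. The function name and grid are our choices.

```python
import math

KAPPA_1 = 24.0  # the constant chosen at the end of the proof

def tail_chain_holds(d, t, slack=1e-12):
    """Check, for d >= 3 and t >= 2, the chain
    exp(-(d-2)(t-1)^2/(24d)) <= exp(-(t-1)^2/72) <= exp(-t^2/288) <= 2 exp(-t^2/kappa_1^2).
    """
    theorem_5_5 = math.exp(-(d - 2) * (t - 1.0) ** 2 / (24.0 * d))
    after_d_bound = math.exp(-((t - 1.0) ** 2) / 72.0)   # uses d - 2 >= d/3
    after_t_bound = math.exp(-(t ** 2) / 288.0)          # uses t - 1 >= t/2
    target = 2.0 * math.exp(-(t ** 2) / KAPPA_1 ** 2)    # right side of Eq. (5.9)
    return (theorem_5_5 <= after_d_bound * (1.0 + slack)
            and after_d_bound <= after_t_bound * (1.0 + slack)
            and after_t_bound <= target)

# Range t >= 2, where the concentration bound is used.
large_t_ok = all(tail_chain_holds(d, 2.0 + 0.25 * j)
                 for d in (3, 4, 10, 100) for j in range(200))
# Range t in [0, 2], where Eq. (5.9) is trivial because its right side is >= 1.
small_t_ok = all(2.0 * math.exp(-((0.01 * j) ** 2) / KAPPA_1 ** 2) >= 1.0
                 for j in range(201))
print(large_t_ok, small_t_ok)
```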

Lemma 5.15 implies that for a large enough constant $\alpha$ , the contribution to the mean $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ outside an interval $[0,\alpha]$ goes to $0$ as $d\to\infty$ . Within this interval, the contribution to the mean tends to that for $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}$ . This helps us derive the limiting value of the mean.

Theorem 5.16.

$\lim_{d\rightarrow\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.$

Proof.

We formalize the intuition given above by appealing to a weaker property implied by convergence in distribution, namely that the expectation of any bounded continuous function $f$ of the random variable ${\bm{\Lambda}}_{d}$ converges to $\operatorname{{\mathbb{E}}}f({\bm{\Lambda}})$ .

Fix $\alpha\geq\max\left\{c_{1},4\right\}$ , where $c_{1}$ is the constant in the statement of Lemma 5.15 and consider the function $f_{\alpha}$ defined as follows:

f_{\alpha}(x)\quad\coloneqq\quad\begin{cases}0&\quad x\leq 0\\ \sqrt{x}&\quad 0<x\leq\alpha\\ \sqrt{\alpha}&\quad\alpha<x\enspace.\end{cases}

Since $f_{\alpha}$ is continuous and bounded, and ${\bm{\Lambda}}\in[0,4]$ ,

\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\quad=\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}})\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.

On the other hand, ${\bm{\Lambda}}_{d}\geq 0$ and $f_{\alpha}(x)\leq\sqrt{x}$ for all $x\geq 0$ . So $\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\leq\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ , and

\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\quad=\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\quad\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\enspace.

We prove the reverse inequality using Lemma 5.15. Let $p(x)$ be the probability density function of ${\bm{\Lambda}}_{d}$ . By the definition of $f_{\alpha}$ ,

\displaystyle\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad\leq\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})+\int_{x\geq\alpha}\sqrt{x}p(x)\mathop{}\!\mathrm{d}x\enspace.

(5.10)

Let $g(\alpha,d)$ denote the second term on the right hand side of Eq. (5.10) above. This is the contribution to $\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}$ outside of the interval $[0,\alpha]$ . Using $\alpha\geq 4$ (so that $\sqrt{x}\leq x$ for all $x\geq\alpha$ ), Fubini’s Theorem, ${\bm{\Lambda}}_{d}\leq\left\|{\bm{Q}}\right\|$ , and Lemma 5.15, we have

	$\displaystyle g(\alpha,d)\quad$	$\displaystyle\leq\quad\int_{x\geq\alpha}xp(x)\mathop{}\!\mathrm{d}x$
		$\displaystyle=\quad\int_{x\geq\alpha}\int_{y\in[0,x]}p(x)\mathop{}\!\mathrm{d}y\mathop{}\!\mathrm{d}x$
		$\displaystyle=\quad\int_{y\geq 0}\int_{x\geq\max\left\{\alpha,y\right\}}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y$
		$\displaystyle=\quad\int_{y\in[0,\alpha]}\int_{x\geq\alpha}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y+\int_{y\geq\alpha}\int_{x\geq y}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y$
		$\displaystyle=\quad\int_{y\in[0,\alpha]}\Pr({\bm{\Lambda}}_{d}\geq\alpha)\mathop{}\!\mathrm{d}y+\int_{y\geq\alpha}\Pr({\bm{\Lambda}}_{d}\geq y)\mathop{}\!\mathrm{d}y$
		$\displaystyle\leq\quad 2\alpha\exp(-\alpha n/c_{2})+2\int_{y\geq\alpha}\exp(-yn/c_{2})\mathop{}\!\mathrm{d}y$
		$\displaystyle=\quad 2(\alpha+c_{2}/n)\exp(-\alpha n/c_{2})\enspace,$

where $c_{2}$ is the universal constant in the statement of Lemma 5.15. Since $n=d^{2}$ , $g(\alpha,d)$ vanishes as $d$ goes to $\infty$ . By Eq. (5.10),

	$\displaystyle\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad$	$\displaystyle\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})+\lim_{d\to\infty}g(\alpha,d)$
		$\displaystyle=\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}})\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.$

This proves the theorem. ∎
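To illustrate how fast the tail contribution $g(\alpha,d)$ disappears, the sketch below evaluates the final bound $2(\alpha+c_{2}/n)\exp(-\alpha n/c_{2})$ for growing $d$ . The values of $c_{2}$ and $\alpha$ are hypothetical placeholders, since the proof only guarantees that suitable universal constants exist.

```python
import math

def tail_bound(alpha, d, c2):
    """Upper bound 2 (alpha + c2/n) exp(-alpha n / c2) on g(alpha, d), with n = d^2."""
    n = d * d
    return 2.0 * (alpha + c2 / n) * math.exp(-alpha * n / c2)

C2 = 50.0      # hypothetical value of the universal constant c_2
ALPHA = 60.0   # hypothetical value standing in for max{c_1, 4}
dims = (3, 5, 10, 20)
values = [tail_bound(ALPHA, d, C2) for d in dims]
for d, v in zip(dims, values):
    print(d, v)
```

Because the exponent scales with $n=d^{2}$ , the bound decays much faster than any inverse polynomial in $d$ , which is all the proof requires.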

References

[Bai93] Z. D. Bai. Convergence rate of expected spectral distributions of large random matrices. Part II. Sample covariance matrices. Annals of Probability, 21(2):649–672, April 1993.
[BCWdW01] Harry Buhrman, Richard Cleve, John Watrous, and Ronald de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
[BW92] Charles H. Bennett and Stephen J. Wiesner. Communication via one- and two-particle operators on Einstein-Podolsky-Rosen states. Physical Review Letters, 69(20):2881, 1992.
[BZ08] Zhidong Bai and Wang Zhou. Large sample covariance matrices without independence structures in columns. Statistica Sinica, 18(2):425–442, 2008.
[CGJV19] Andrea Coladangelo, Alex B. Grilo, Stacey Jeffery, and Thomas Vidick. Verifier-on-a-leash: New schemes for verifiable delegated quantum computation, with quasilinear resources. In Yuval Ishai and Vincent Rijmen, editors, Advances in Cryptology – EUROCRYPT 2019, volume 11478 of Lecture Notes in Computer Science, pages 247–277, Cham, 2019. Springer International Publishing.
[Cha20] Michael Charezma. Quantum circuit diagrams. https://warwick.ac.uk/fac/sci/physics/research/cfsa/people/pastmembers/charemzam/pastprojects, 2006 (accessed October 14, 2020).
[Cur79] Paul Joseph Curlander. Quantum Limitations on Communication Systems. PhD thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1979.
[FK19] Máté Farkas and Jędrzej Kaniewski. Self-testing mutually unbiased bases in the prepare-and-measure scenario. Physical Review A, 99(3):032316, 2019.
[FKN22] Máté Farkas, Jędrzej Kaniewski, and Ashwin Nayak. Mutually unbiased measurements, Hadamard matrices, and Superdense Coding. Technical Report arXiv:2204.11886 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, April 2022.
[Hal35] Philip Hall. On representatives of subsets. Journal of the London Mathematical Society, s1-10(1):26–30, 1935.
[HJW93] Lane P. Hughston, Richard Jozsa, and William K. Wootters. A complete classification of quantum ensembles having a given density matrix. Physics Letters A, 183(1):14–18, 1993.
[JNV⁺20] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. $\mathsf{MIP}^{*}=\mathsf{RE}$ . Technical Report arXiv:2001.04383 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, January 2020.
[Kho79] Alexander S. Kholevo. On asymptotically optimal hypothesis testing in quantum statistics. Theory of Probability & Its Applications, 23(2):411–415, 1979.
[KR03] Andreas Klappenecker and Martin Rötteler. Unitary error bases: Constructions, equivalence, and applications. In Marc Fossorier, Tom Høholdt, and Alain Poli, editors, Proceedings of the 15th International Symposium on Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes (AAECC), volume 2643 of Lecture Notes in Computer Science, pages 139–149. Springer, Berlin / Heidelberg, Germany, May 12–16, 2003.
[Mec19] Elizabeth S. Meckes. The Random Matrix Theory of the Classical Compact Groups, volume 218 of Cambridge Tracts in Mathematics. Cambridge University Press, July 2019.
[Mon07] Ashley Montanaro. On the distinguishability of random quantum states. Communications in Mathematical Physics, 273(3):619–636, August 2007.
[MV16] Benjamin Musto and Jamie Vicary. Quantum Latin squares and unitary error bases. Quantum Information and Computation, 16(15-16):1318–1332, November 2016.
[MY98] Dominic Mayers and Andrew Yao. Quantum cryptography with imperfect apparatus. In Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No. 98CB36280), pages 503–509. IEEE, 1998.
[MY04] Dominic Mayers and Andrew Yao. Self testing quantum apparatus. Quantum Information & Computation, 4(4):273–286, July 2004.
[NC11] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, New York, NY, USA, 2011. 10th Anniversary Edition.
[NY20] Ashwin Nayak and Henry Yuen. Rigidity of superdense coding. Technical Report arXiv:2012.01672v1 [quant-ph], arXiv Pre-print server, https://arxiv.org/abs/2012.01672, December 2020.
[ON99] Tomohiro Ogawa and Hiroshi Nagaoka. Strong converse to the quantum channel coding theorem. IEEE Transactions on Information Theory, 45(7):2486–2489, 1999.
[ŠB20] Ivan Šupić and Joseph Bowles. Self-testing of quantum systems: a review. Quantum, 4:337, 2020.
[Sch35] Erwin Schrödinger. Discussion of probability relations between separated systems. Mathematical Proceedings of the Cambridge Philosophical Society, 31(4):555–563, 1935.
[TKV⁺18] Armin Tavakoli, Jędrzej Kaniewski, Tamás Vértesi, Denis Rosset, and Nicolas Brunner. Self-testing quantum states and measurements in the prepare-and-measure scenario. Physical Review A, 98(6):062307, 2018.
[Tys09a] Jon Tyson. Erratum: “Minimum-error quantum distinguishability bounds from matrix monotone functions: A comment on ‘Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds’ ” [J. Math. Phys. 50, 062102 (2009)]. Journal of Mathematical Physics, 50(10):109902, 2009.
[Tys09b] Jon Tyson. Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds. Journal of Mathematical Physics, 50(3):032106, 2009.
[Tys10] Jon Tyson. Two-sided bounds on minimum-error quantum measurement, on the reversibility of quantum dynamics, and on maximum overlap using directional iterates. Journal of Mathematical Physics, 51(9):092204, 2010.
[Uhl76] Armin Uhlmann. The “transition probability” in the state space of a $*$ -algebra. Reports on Mathematical Physics, 9(2):273–279, 1976.
[Ver18] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 2018.
[VV19] Umesh Vazirani and Thomas Vidick. Fully device independent quantum key distribution. Communications of the ACM, 62(4):133–133, 2019.
[VW00] Karl Gerd H. Vollbrecht and Reinhard F. Werner. Why two qubits are special. Journal of Mathematical Physics, 41(10):6772–6782, 2000.
[Wat18] John Watrous. The Theory of Quantum Information. Cambridge University Press, May 2018.
[Wer01] Reinhard F. Werner. All teleportation and dense coding schemes. Journal of Physics A: Mathematical and General, 34(35):7081–7094, August 2001.
[Wil13] Mark M. Wilde. Quantum Information Theory. Cambridge University Press, Cambridge, UK, 2013.
[Yas16] Pavel Yaskov. A short proof of the Marchenko–Pastur theorem. Une courte démonstration du théorème de Marchenko–Pastur. Comptes Rendus Mathématique, 354(3):319–322, March 2016.
