This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Rigidity of superdense coding

Ashwin Nayak Department of Combinatorics and Optimization, and Institute for Quantum Computing, University of Waterloo, 200 University Ave. W., Waterloo, ON, N2L 3G1, Canada. Email: [email protected] .    Henry Yuen Department of Computer Science, Columbia University, New York, USA. E-mail: [email protected] .
Abstract

The famous superdense coding protocol of Bennett and Wiesner demonstrates that it is possible to communicate two bits of classical information by sending only one qubit and using a shared EPR pair. Our first result is that an arbitrary protocol for achieving this task (where there are no assumptions on the sender’s encoding operations or the dimension of the shared entangled state) is locally equivalent to the canonical Bennett-Wiesner protocol. In other words, the superdense coding task is rigid. In particular, we show that the sender and receiver only use additional entanglement (beyond the EPR pair) as a source of classical randomness.

We also investigate several questions about higher-dimensional superdense coding, where the goal is to communicate one of d2d^{2} possible messages by sending a dd-dimensional quantum state, for general dimensions dd. Unlike the d=2d=2 case (i.e. sending a single qubit), there can be inequivalent superdense coding protocols for higher dd. We present concrete constructions of inequivalent protocols, based on constructions of inequivalent orthogonal unitary bases for all d>2d>2. Finally, we analyze the performance of superdense coding protocols where the encoding operators are independently sampled from the Haar measure on the unitary group. Our analysis involves bounding the distinguishability of random maximally entangled states, which may be of independent interest.

1 Introduction

In quantum information theory, rigidity is a phenomenon where optimal performance in an information processing task requires using a protocol satisfying extremely stringent constraints — in some cases, there is essentially a unique optimal protocol. The primary examples of rigidity come from nonlocal games (also known as Bell tests in the physics literature). In this setting two spatially separated parties Alice and Bob play a game with a third-party called the referee. In order to maximize their chances of winning, before the game starts Alice and Bob choose an entangled state to share as well as local measurements to perform on the state. For example, in the famous CHSH game the optimal winning probability is cos2(π/8)\cos^{2}(\pi/8), and a canonical strategy that achieves this uses a (rotated) EPR pair and single-qubit Pauli measurements. The CHSH game is rigid in the sense that any optimal strategy for the CHSH game is identical to this canonical strategy, up to local changes of basis.

The study of rigidity in quantum information processing arguably started with the work of Mayers and Yao [MY98, MY04], who initiated the concept of device-independent cryptography. The idea behind this subject is that a classical user can verify that untrusted quantum hardware is behaving as intended — say, generating random keys or performing a quantum computation – simply by verifying that the hardware is employing a (near)-optimal strategy in a rigid nonlocal game. Since the work of Mayers and Yao, nonlocal game rigidity has been an extremely fruitful concept in quantum cryptography (see, e.g., Refs. [VV19] and [CGJV19]), complexity theory (see Ref. [JNV+20] and the references therein), and quantum information more generally [ŠB20]. This motivates the following question: what other tasks in quantum information also exhibit rigidity phenomena?

To our knowledge, the only other work on rigidity phenomena outside of nonlocal games is that reported in Refs. [TKV+18, FK19] on the rigidity of quantum random access codes (QRACs). The authors study “2d12^{d}\rightarrow 1” QRACs, which encode 22 classical dits x,y[d]x,y\in[d] into a dd-dimensional system, such that either xx or yy may be retrieved by performing a suitable measurement. These works show that 2d12^{d}\to 1 QRACs are rigid, and in fact certify measurements based on mutually unbiased bases (MUBs).

In this paper we investigate the rigidity properties of superdense coding, which plays a fundamental role in quantum Shannon theory (see, e.g., Ref. [Wil13, Chapter 6]). The superdense coding task is to communicate one of four possible messages while only transmitting one quantum bit across a channel. The superdense coding protocol, first proposed by Bennett and Wiesner [BW92], achieves this task in the following way: Alice and Bob share one qubit each of an EPR pair (i.e., the maximally entangled state 12|00+12|11\tfrac{1}{\sqrt{2}}|00\rangle+\frac{1}{\sqrt{2}}|11\rangle) in advance, and to transmit a message i{1,2,3,4}i\in\{1,2,3,4\}, Alice applies a one of four Pauli operators to her half of the EPR pair and sends her qubit. Bob then performs a Bell measurement on the qubit received from Alice and his qubit to determine ii.

1.1 Rigidity for superdense coding of two classical bits

The first result in our paper is to show that superdense coding is rigid: any protocol that accomplishes this task is “locally equivalent” to the Bennett-Wiesner protocol. We model arbitrary protocols for superdense coding in the following manner: Alice and Bob share a density matrix τ\tau on a bipartite Hilbert space AB\mathcal{H}_{A}\otimes\mathcal{H}_{B}, where we assume without loss of generality that A\mathcal{H}_{A} factors into AA′′\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}} where A′′\mathcal{H}_{A^{\prime\prime}} is isomorphic to 2{\mathbb{C}}^{2}. Given an input i{1,2,3,4}i\in\{1,2,3,4\}, Alice applies a unitary operator UiU_{i} (called an encoding operator) to her share of τ\tau (with support in the space A\mathcal{H}_{A}), sends the qubit A′′A^{\prime\prime} to Bob, and Bob then performs an optimal distinguishing measurement on the Hilbert space A′′B\mathcal{H}_{A^{\prime\prime}}\otimes\mathcal{H}_{B} to determine what the input ii was. See Figure 1 for an illustration of a general superdense coding protocol.

\Qcircuit@C=1.5em @R=1em & Alice
\lsticki  \control\cw\qwx[1] \cw \cw \cw \cw \cw \cw \lsticki
\lstickA’  \multigate1U_i \qw \qw \qw \qw \qw \rstick  \qw
\lstickA”  \ghostU_i \qw \qw\qwx[1] \push
\qwx[1]
\qwx[1] Bob
\push \multigate1M \qw \qw \qw
\lstickB  \qw \qw \qw \ghostM \qw \meter \push \cw \lsticki \gategroup1148.7em. \gategroup6188.7em.

Figure 1: A general superdense coding protocol. This quantum circuit is modified from [Cha20].

A priori it appears daunting to characterize the structure of an arbitrary superdense coding protocol. For one, the dimension of the spaces A\mathcal{H}_{A^{\prime}} and B\mathcal{H}_{B} are unbounded, and the state τ\tau is uncharacterized. Furthermore, the encoding unitary operators UiU_{i} can be extremely complicated, potentially performing complex entangling operations between the space A\mathcal{H}_{A^{\prime}} and A′′\mathcal{H}_{A^{\prime\prime}} (the qubit to be sent over to Bob). However, the property of being a superdense coding protocol is extremely constraining. Theorem 1.1 gives a precise characterization of how an arbitrary superdense protocol is locally equivalent to the canonical Bennett-Wiesner protocol. In the statement of the theorem, “=τ=_{\tau^{\prime}}” denotes equality of two unitary operators with respect to the state τ\tau^{\prime}; in other words, C=ρDC=_{\rho}D means that CρC=DρDC\rho C^{*}=D\rho D^{*}.

Theorem 1.1 (Rigidity for superdense coding).

Let (τ,(Ui))(\tau,(U_{i})) denote a superdense coding protocol. Then there exist

  1. 1.

    Unitary operators VV acting on AA′′\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}} and (Ci)i[4](C_{i})_{i\in[4]} acting on A\mathcal{H}_{A^{\prime}},

  2. 2.

    An isometry WW mapping B\mathcal{H}_{B} to a Hilbert space BB′′\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{B^{\prime\prime}} where B′′\mathcal{H}_{B^{\prime\prime}} is isomorphic to 2\mathbb{C}^{2},

  3. 3.

    A density matrix ρ\rho on AB\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}},

  4. 4.

    A set of pairwise orthogonal projectors {Pr}\{P_{r}\} that sum to the identity on A\mathcal{H}_{A^{\prime}}, and

  5. 5.

    A collection of 2×22\times 2 unitary operators {Sr}\{S_{r}\},

such that, letting τ(VW)τ(VW)\tau^{\prime}\coloneqq(V\otimes W)\tau(V\otimes W)^{*}, we have

τ=ρAB|EPREPR|A′′B′′\tau^{\prime}=\rho^{A^{\prime}B^{\prime}}\otimes|\mathrm{EPR}\rangle\!\langle\mathrm{EPR}|^{A^{\prime\prime}B^{\prime\prime}}

and for i{1,2,3,4}i\in\{1,2,3,4\},

(Ci𝟙)UiV=τrPrSrσiSr(C_{i}^{*}\otimes\mathds{1})U_{i}V^{*}=_{\tau^{\prime}}\sum_{r}P_{r}\otimes S_{r}\sigma_{i}S_{r}^{*}

where σ1𝟙\sigma_{1}\coloneqq\mathds{1}, σ2Z\sigma_{2}\coloneqq{\mathrm{Z}}, σ3X\sigma_{3}\coloneqq{\mathrm{X}}, and σ4Y\sigma_{4}\coloneqq{\mathrm{Y}} are the one-qubit Pauli matrices.

Theorem 1.1 can be interpreted as expressing rigidity for superdense coding in the following way: given an arbitrary protocol (τ,(Ui))(\tau,(U_{i})) for superdense coding, there exists local isometries V,WV,W where if Alice applies VV and Bob applies WW to their share of τ\tau, then an EPR pair is extracted with an auxiliary state ρ\rho remaining. By pre-applying VV^{*} to Alice’s unitary operators UiU_{i}, we discover that UiVU_{i}V^{*} has a very regular form: operationally, it can be interpreted as performing some projective measurement {Pr}\{P_{r}\} on Alice’s part of the auxiliary state ρ\rho to obtain some outcome rr in a set \mathcal{R}, and then based on rr, applying a rotated version of the standard Bennett-Wiesner superdense coding protocol to Alice’s part of the EPR pair. Finally, after sending her EPR qubit, Alice then applies some unitary operator CiC_{i} on her remaining qubits (which does not affect Bob’s measurement in any way). This considerably strengthens and extends the characterization of “tight” superdense coding protocols due to Vollbrecht and Werner [VW00, Lemma 3] (see also Ref. [Wer01]); they studied protocols in which the shared entangled state τ\tau is a state on 22{\mathbb{C}}^{2}\otimes{\mathbb{C}}^{2} (or on dd{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d} in the case of dd-dimensional superdense coding protocols; see Section 1.2). This difference would be significant in a cryptographic setting; in the context of quantum key distribution, this is the difference between the device-independent and semi-device-independent settings.

The proof of Theorem 1.1 is given in Section 3. It proceeds via a number of reductions: first, using an information-theoretic argument, we show every superdense coding protocol (τ,(Ui))(\tau,(U_{i})) is locally equivalent to one that uses an EPR pair in the state τ\tau. Given this, we then show that each of the encoding operators UiU_{i} can be individually block-diagonalized with respect to the EPR pair. Finally, we show that the blocks across the different encoding operators UiU_{i} can be “matched up” in a way that they correspond to the Pauli matrices. Each of these steps requires carefully deducing the structure imposed on the state and the encoding operators by the correctness of the protocol.

1.2 Rigidity for higher dimensional superdense coding?

We then consider the generalization of superdense coding to communicating more than 22 classical bits. Specifically, we consider protocols for communicating one of d2d^{2} possible messages by sending a dd-dimensional quantum system over the channel — we call these dd-dimensional superdense coding protocols. A canonical protocol for dd-dimensional superdense coding is as follows: the players share a dd-dimensional maximally entangled state |ϕd1de=0d1|e|e|\upphi_{d}\rangle\coloneqq\frac{1}{\sqrt{d}}\sum_{e=0}^{d-1}|e\rangle|e\rangle, and given message i[d2]i\in[d^{2}], Alice applies a unitary operator EiE_{i} to her share of |ϕd|\upphi_{d}\rangle, and sends it over to Bob. The family of unitary operators {Ei}\{E_{i}\} can be any orthogonal unitary basis for the space of d×dd\times d matrices. (The orthogonality property means that Tr(EiEj)=0\operatorname{Tr}(E_{i}^{*}E_{j})=0 if and only if iji\neq j.) An example of such a basis is the set of Heisenberg-Weyl operators. In dimension dd, these are a set of d×dd\times d matrices {Pi,j:0i,j<d}\{P_{i,j}:0\leq i,j<d\} defined as follows. Let ωdexp(2πid)\omega_{d}\coloneqq\exp\left(\frac{2\pi{\mathrm{i}}}{d}\right) be a primitive ddth root of unity. For i,j{0,1,2,,d1}i,j\in\{0,1,2,\ldots,d-1\}, let Pi,j=XdiZdjP_{i,j}={\mathrm{X}}^{i}_{d}{\mathrm{Z}}^{j}_{d} where Xdk=0d1|k+1(modd)k|{\mathrm{X}}_{d}\coloneqq\sum_{k=0}^{d-1}|k+1\,(\mathrm{mod}\,d)\rangle\!\langle k| is the “shift” operator, and Zdk=0d1ωdk|kk|{\mathrm{Z}}_{d}\coloneqq\sum_{k=0}^{d-1}\omega^{k}_{d}\,|k\rangle\!\langle k| is the “clock” operator. Does the rigidity phenomenon also extend to dimensions dd larger than 22?

The second result of this paper is that dd-dimensional superdense coding for d3d\geq 3 is not rigid in the same sense as Theorem 1.1: there are dd-dimensional superdense coding protocols which are not locally equivalent to each other. This is because in dimensions three and higher there are inequivalent orthogonal unitary bases. (In contrast, all orthogonal unitary bases in dimension two are equivalent to the Pauli matrices.) Here, equivalence between two unitary bases {Ei}\{E_{i}\} and {Fi}\{F_{i}\} means there exist unitary operators U,VU,V such that for all ii, we have Fi=αiUEiVF_{i}=\alpha_{i}UE_{i}V for some choice of complex phase αi\alpha_{i}.

Theorem 1.2 (Existence of inequivalent orthogonal unitary bases for all d3d\geq 3).

For every dimension d3d\geq 3, there are orthogonal unitary bases that are not equivalent to each other.

The uniqueness of orthogonal unitary bases was first studied by Vollbrecht and Werner [VW00], and the existence of non-equivalent orthogonal unitary bases for all dimensions greater than 22 was observed in follow-up work by Werner [Wer01]. (We elaborate on prior work on the topic in Section 4.2.) Werner described, without proof, how non-equivalent bases may be constructed. We present explicit constructions of such bases in Section 4. The construction for d4d\geq 4 is based on the observation that the shift operator Xd{\mathrm{X}}_{d} corresponds to a perfect matching in Kd,dK_{d,d}, the complete bipartite graph. Moreover, its powers {Xdi:0i<d}\left\{{\mathrm{X}}_{d}^{i}:0\leq i<d\right\} correspond to a partition of the edge set of Kd,dK_{d,d} into dd disjoint perfect matchings. By replacing this partition with another carefully chosen such partition, we obtain an orthogonal unitary basis that is not equivalent to the clock and shift construction. The proof of non-equivalence involves comparing the spectra of the operators in the two bases, taking into account the complex phase and unitary operators that witness a potential equivalence map. For d=3d=3, we follow a construction described by Werner [Wer01]. We prove non-equivalence to the clock and shift basis by showing that the resulting basis is not a commutative projective group (again accounting for a potential equivalence map).

In a previous version of this paper, we conjectured that rigidity for higher dimensional superdense coding holds up to choosing orthogonal unitary bases [NY20, Conjecture 1.3]. That is, every dd-dimensional superdense coding protocol is locally equivalent (in the sense of Theorem 1.1) to one where Alice and Bob share an entangled state ρAB|ϕdϕd|A′′B′′\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}} for some density matrix ρAB\rho^{A^{\prime}B^{\prime}}, and Alice’s encoding operators are of the form

Ui=rPrEr,iU_{i}=\sum_{r}P_{r}\otimes E_{r,i}

where {Pr}\{P_{r}\} is a set of pairwise orthogonal projectors that sum to the identity on A\mathcal{H}_{A^{\prime}} and for every rr, the set {Er,i}i[d2]\{E_{r,i}\}_{i\in[d^{2}]} is an orthogonal unitary basis for the space of d×dd\times d complex matrices. This would be a natural extension of the statement of Theorem 1.1 to the case of general d2d\geq 2 where the registers ABA^{\prime}B^{\prime} are treated as a source of “shared randomness” to help Alice and Bob synchronize their choice of orthogonal unitary basis.

We can show that when the shared entangled state between Alice and Bob is a pure state in dd\mathbb{C}^{d}\otimes\mathbb{C}^{d}, then this conjecture holds (see Section 4.1): up to local unitary operators, the shared state is necessarily the maximally entangled state |ϕd|\upphi_{d}\rangle of local dimension dd, and the encoding operators {Ui}\{U_{i}\} necessarily form an orthogonal unitary basis. However, this conjecture is false for protocols where Alice sends only a part of her entangled state. In work subsequent to ours, Farkas, Kaniewski, and Nayak [FKN22] show that there exist infinitely many superdense coding protocols that are not locally equivalent to a protocol of the form described in the conjecture. In particular, in these counterexample protocols Alice may perform a complicated entangling operation between her message and the rest of her state, rather than just treating the ancilla system as a source of shared randomness.

1.3 Superdense coding protocols with error

Finally, we consider probabilistic protocols for dd-dimensional superdense coding, where Bob’s decoding only needs to succeed with high probability. In particular, we say that (τ,(Ui))(\tau,(U_{i})) is a (d,ε)(d,\varepsilon)-superdense coding protocol if Bob is able to decode Alice’s message ii with probability at least 1ε1-\varepsilon, for all ii. We focus on the case where Alice and Bob share an entangled state in dd\mathbb{C}^{d}\otimes\mathbb{C}^{d} (i.e., have local dimension dd). As mentioned previously, in the exact case ε=0\varepsilon=0, their shared state is necessarily the maximally entangled state and Alice’s encoding unitary operators form an orthogonal unitary basis. We conjecture that even in the probabilistic setting, this characterization of dd-dimensional superdense coding protocols is robust, in the following sense.

Conjecture 1.3.

There exist functions δ1,δ2:[0,1][0,1]\delta_{1},\delta_{2}:[0,1]\to[0,1] where δ1(ε)\delta_{1}(\varepsilon) and δ2(ε)\delta_{2}(\varepsilon) monotonically decrease to 0 as ε0\varepsilon\to 0, such that the following holds. For all (d,ε)(d,\varepsilon)-superdense coding protocols (τ,(Ui))(\tau,(U_{i})) such that τ\tau is a density matrix on dCd\mathbb{C}^{d}\otimes C^{d} and UiU_{i} are unitary operators in 𝖴(d){\mathsf{U}}(\mathbb{C}^{d}), we have

ϕd|τ|ϕd1δ1(ε),\langle\upphi_{d}|\tau|\upphi_{d}\rangle\quad\geq\quad 1-\delta_{1}(\varepsilon)\enspace,

and there exists an orthogonal unitary basis {Ei}i[d2]\{E_{i}\}_{i\in[d^{2}]} for the space of d×dd\times d complex matrices such that for all i[d2]i\in[d^{2}],

UiEinhsδ2(ε),\|U_{i}-E_{i}\|_{\mathrm{nhs}}\quad\leq\quad\delta_{2}(\varepsilon)\enspace,

where Xnhs1dTr(XX)\|X\|_{\mathrm{nhs}}\coloneqq\sqrt{\frac{1}{d}\operatorname{Tr}(XX^{\dagger})} denotes the normalized Hilbert-Schmidt norm on the space of d×dd\times d matrices.

We note that the choice of the normalized Hilbert-Schmidt norm in the statement of 1.3 is somewhat arbitrary; one can also consider other formulations of the conjecture with other norms (such as the spectral norm, etc.).

The last part of our paper analyzes a possible challenge to 1.3 proposed by Aram Harrow. Consider the following probabilistic construction for a potential (d,ε)(d,\varepsilon)-superdense coding protocol: independently sample d2d^{2} matrices 𝑼1,,𝑼d2{\bm{U}}_{1},\ldots,{\bm{U}}_{d^{2}} from the Haar measure on 𝖴(d){\mathsf{U}}({\mathbb{C}}^{d}), the group of d×dd\times d complex unitary matrices. Let τ|ϕdϕd|\tau\coloneqq|\upphi_{d}\rangle\!\langle\upphi_{d}| denote the dd-dimensional maximally entangled state. How well does the protocol (τ,(𝑼i))(\tau,({\bm{U}}_{i})) accomplish superdense coding?

In classical and quantum communication, many tasks can be performed near-optimally via probabilistically constructed protocols. See, e.g., the text by Wilde [Wil13] for examples from Shannon theory. A simple example from communication complexity is the task of quantum fingerprinting [BCWdW01], which enables checking whether two nn-bit strings xx and yy are equal by only comparing two O(logn)\mathrm{O}(\log n)-qubit fingerprints of the strings. It can be shown that picking random O(logn)\mathrm{O}(\log n)-qubit states for each nn-bit string xx yields a good quantum fingerprinting protocol.

Let 𝚷d{\bm{\Pi}}_{d} denote the random superdense protocol specified by (ϕd,(𝑼i))(\upphi_{d},({\bm{U}}_{i})). Note that the error 𝜺{\bm{\varepsilon}} of 𝚷d{\bm{\Pi}}_{d}, when averaged over the choice of random unitaries (𝑼i)({\bm{U}}_{i}), is some function of dd. We first argue that the conjecture implies that the error of a random superdense coding protocol, when averaged over the choice of (𝑼i)({\bm{U}}_{i}), cannot be too small:

Proposition 1.4.

Suppose 1.3 were true. Let δ2(ε)\delta_{2}(\varepsilon) be the function from 1.3. Then the random superdense coding protocol 𝚷d{\bm{\Pi}}_{d} specified by (ϕd,(𝐔i))(\upphi_{d},({\bm{U}}_{i})) must have error 𝛆{\bm{\varepsilon}} satisfying

𝔼(𝑼i)δ2(𝜺)2(2d)2.\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\quad\geq\quad(2d)^{-2}\enspace.

Put another way, it cannot be that both 1.3 is true and also the random superdense protocol has error vanishing so quickly such that δ2(𝜺)\delta_{2}({\bm{\varepsilon}}) is smaller than (2d)2(2d)^{-2}, on average. Due to the concentration of measure phenomenon for Haar-random states and unitary operators (as expressed by, e.g., the Lévy-like property in Theorem 5.5), it is plausible a priori that the average error 𝜺{\bm{\varepsilon}}, and therefore also δ2(𝜺)\delta_{2}({\bm{\varepsilon}}), scale as o(d2)\mathrm{o}(d^{-2}). Thus, the random superdense protocol is potentially a counterexample to 1.3.

We show that this probabilistic construction does not yield a good superdense coding protocol: with overwhelmingly high probability over the choice of random unitary operators (Ui)(U_{i}), the protocol has a nonzero probability of error that is independent of dd. Thus, 1.3 is not ruled out by the random protocol construction.

Theorem 1.5 (Performance of a random superdense coding protocol).

The random superdense coding protocol 𝚷d{\bm{\Pi}}_{d} specified by (ϕd,(𝐔i))(\upphi_{d},({\bm{U}}_{i})) where 𝐔i𝖴(d){\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d}) are Haar-random unitary operators has error at least 183π0.151-\tfrac{8}{3\pi}\approx 0.15 as dd\to\infty, with high probability over the choice of (𝐔i)({\bm{U}}_{i}).

We prove Theorem 1.5 by showing that the distinguishability of the ensemble of random states {(𝑼i𝟙)|ϕd}i[d2]\{({\bm{U}}_{i}\otimes\mathds{1})|\upphi_{d}\rangle\}_{i\in[d^{2}]} is bounded away from 11 (with high probability). The generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b] relate the distinguishability of an ensemble {(pi,ρi)}\{(p_{i},\rho_{i})\} to the quantity

Tr(ipi2ρi2)1/2.\operatorname{Tr}\Big{(}\sum_{i}p_{i}^{2}\rho_{i}^{2}\,\Big{)}^{1/2}\enspace. (1.1)

Our analysis of this quantity is largely inspired by work due to Montanaro [Mon07] on the distinguishability of random pure quantum states. However, extending his approach to the ensemble of interest to us—one consisting of random maximally entangled states—involves significant technical difficulties. The approach involves relating the distinguishability of an ensemble of states to the spectrum of the ensemble average. In the case of Haar-random pure states, the ensemble average is well approximated by the ensemble average of unnormalized complex gaussian vectors with suitably chosen variance. The spectrum of such matrices in the asymptotic limit is given by the Marčenko-Pastur Theorem from random matrix theory. In our case, the entries of the random vectors in the ensemble are not independent. We instead bound the generalized Holevo-Curlander quantity in Equation 1.1 by employing a recent generalization of the Marčenko-Pastur Theorem due to Yaskov [Yas16]. (The theorem was proven for ensembles of random real vectors. We verify that its proof also extends to complex random vectors with analogous properties.) In the process, we show that random maximally entangled states satisfy a pseudo-isotropy condition that suffices for the theorem to hold.

A subtlety in the use of the Marčenko-Pastur law is that we would like to deduce the convergence of a sequence of means to the mean of the limiting distribution from the convergence of a sequence of distributions. This does not necessarily hold in general. In order to prove such a relation between the two forms of convergence, we show that random maximally entangled states are sub-gaussian. This allows us to draw on a generalization of the Bai-Yin Theorem, which bounds the norm of matrices whose columns are given by i.i.d. sub-gaussian vectors. We thus show that the norm of the ensemble average has an exponentially decaying tail, which in turn guarantees the form of convergence we seek.

We believe the techniques used in our analysis are of independent interest. In fact, the subtlety mentioned above was overlooked by Montanaro; the ideas we develop may also be used to close a gap in his analysis (see Section 5.4 for the details).

1.4 Further remarks and open questions

In this paper we have initiated the study of rigidity phenomena in superdense coding protocols. Given its importance in quantum Shannon theory, our study may shed new light on protocols based on superdense coding. The power of entanglement as a resource in distributed quantum computation, in particular in two-party communication complexity, remains a mystery. The rigidity theorem we establish (Theorem 1.1) gives a complete picture for a simple but fundamental task. The property shown in the analysis of random superdense coding protocols (Theorem 1.5) may also be interpreted as placing a limit on how closely a sequence of random unitary operators approximate an orthogonal basis. This may be of relevance to the theory of error-correction, where unitary error bases play a central role.

We list several open questions that arise from this work:

  1. 1.

    Is 1.3 true? Does a robust version of Theorem 1.1 hold?

  2. 2.

    Do dd-dimensional superdense coding protocols, in which the shared state between Alice and Bob may have local dimension larger than dd, also exhibit some non-trivial form of rigidity?

  3. 3.

    Does rigidity also hold for quantum teleportation, a task that is “dual” to superdense coding? Can this be derived in a black-box way from the rigidity of superdense coding?

  4. 4.

    Are there any connections between the QRAC rigidity results of [TKV+18, FK19] and our results on the rigidity of superdense coding?

  5. 5.

    What other quantum information processing tasks have the rigidity property?

We believe the investigation of these questions will lead to significant new insights into the nature of quantum information, with wide ranging ramifications.

Acknowledgements.

We thank the anonymous journal reviewers for their thorough feedback on an earlier version of the paper. We thank Adam Bouland, Chinmay Nirkhe, and Zeph Landau for stimulating discussions at the beginning of this project. H.Y. would like to especially thank Adam Bouland for his integral role in formulating the questions explored in this paper. We would like to thank Pavel Yaskov for the correspondence on his work on the Marčenko–Pastur theorem. We thank Jędrzej Kaniewski and Mate Farkas for their pointers to the literature on rigidity of QRACs. A.N. would like to thank Kanstantsin Pashkovich and Vern Paulsen for helpful discussions on bases of orthogonal unitary operators, and is grateful to the Berkeley CS Theory Group for their hospitality during a visit in Fall 2017, when this work was initiated. A.N.’s research is supported in part by a Discovery Grant from NSERC Canada. H.Y. is supported by an NSERC Discovery Grant, a Google Quantum Research Award, and AFOSR Grant No.​ FA9550-21-1-0040. This research was partly conducted at the Kavli Institute for Theoretical Physics during the Quantum Physics of Information program in 2017 (and thus this research was supported in part by the National Science Foundation under Grant No. NSF PHY17-48958).

2 Properties of superdense coding

2.1 Quantum information basics

We refer the reader to texts such as [NC11, Wat18, Wil13] for the basics of quantum information, and mention some notational conventions here.

We write 𝟙\mathds{1} to denote the identity operator on a Hilbert space. We use superscripts on quantum states, e.g., |ψAB|\psi\rangle^{AB} or ρAB\rho^{AB}, to denote the registers in which they are stored. Similarly, we use subscripts on operators to indicate the registers on which they act, unless this is clear from the context. Given a bipartite density matrix ρAB\rho^{AB}, we write ρA\rho^{A} to denote its reduction to register AA (i.e., the partial trace over BB).

Given operators A,BA,B and a density matrix ρ\rho acting on a Hilbert space \mathcal{H}, we write A=ρBA=_{\rho}B to denote AρA=BρBA\rho A^{*}=B\rho B^{*}. In other words, the operators AA and BB have the same action on the state ρ\rho.

We write |EPR|\mathrm{EPR}\rangle to denote the maximally entangled state 12(|00+|11)\frac{1}{\sqrt{2}}\Big{(}|00\rangle+|11\rangle\Big{)} on two qubits. We write |ϕd=1di=0d1|i|i|\upphi_{d}\rangle=\frac{1}{\sqrt{d}}\sum_{i=0}^{d-1}|i\rangle|i\rangle to denote the dd-dimensional maximally entangled state, or simply |ϕ|\upphi\rangle if the dimension dd is clear from context. We recall the single qubit Pauli matrices:

𝟙(1001)X(0110)Y(0ii0)Z(1001).\mathds{1}\coloneqq\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\qquad{\mathrm{X}}\coloneqq\begin{pmatrix}0&1\\ 1&0\end{pmatrix}\qquad{\mathrm{Y}}\coloneqq\begin{pmatrix}0&-{\mathrm{i}}\\ {\mathrm{i}}&0\end{pmatrix}\qquad{\mathrm{Z}}\coloneqq\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}\;.

2.2 Basic properties of superdense coding

Here we give a formal definition of a general superdense coding protocol, and prove some basic properties about them.

Definition 2.1 (Superdense coding protocol).

Let dd be a positive integer. Let AAA′′\mathcal{H}_{A}\coloneqq\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}} and B\mathcal{H}_{B} be finite dimensional Hilbert spaces where A′′\mathcal{H}_{A^{\prime\prime}} is isomorphic to d{\mathbb{C}}^{d}. Let τ\tau denote a density matrix on AB\mathcal{H}_{A}\otimes\mathcal{H}_{B} and let (Ui)i[d2](U_{i})_{i\in[d^{2}]} denote a sequence of d2d^{2} unitary operators (called encoding operators) acting on A\mathcal{H}_{A}. We say that (τ,(Ui))(\tau,(U_{i})) is a (d,ε)(d,\varepsilon)-superdense coding protocol if there exists a POVM {Mi}i[d2]\{M_{i}\}_{i\in[d^{2}]} acting on A′′B\mathcal{H}_{A^{\prime\prime}}\otimes\mathcal{H}_{B} such that

Tr(Miρi)1εi[d2]\operatorname{Tr}(M_{i}\,\rho_{i})\geq 1-\varepsilon\qquad\forall\,i\in[d^{2}] (2.1)

where ρi\rho_{i} denotes the reduced density matrix of (Ui𝟙)τ(Ui𝟙)(U_{i}\otimes\mathds{1})\tau(U_{i}\otimes\mathds{1})^{*} on registers A′′BA^{\prime\prime}B. When ε=0\varepsilon=0 we simply call (τ,(Ui))(\tau,(U_{i})) a dd-dimensional superdense coding protocol.

Lemma 2.2 (Orthogonality conditions I).

Let (τ,(Ui))(\tau,(U_{i})) be a dd-dimensional superdense coding protocol. Then letting ρi\rho_{i} denote the reduced density matrix of (Ui𝟙)τ(Ui𝟙)(U_{i}\otimes\mathds{1})\tau(U_{i}\otimes\mathds{1})^{*} on registers A′′BA^{\prime\prime}B, we have that

Tr(ρiρj)=0ij[d2].\operatorname{Tr}(\rho_{i}\,\rho_{j})=0\qquad\forall i\neq j\in[d^{2}]\;.
Proof.

Let {Mi}\{M_{i}\} denote a POVM satisfying Equation 2.1 for ε=0\varepsilon=0. Then for all i[d2]i\in[d^{2}], we have ρiMi\rho_{i}\leq M_{i} according to the positive semidefnite ordering. This is because if we write ρi=kpik|ψikψik|\rho_{i}=\sum_{k}p_{ik}|\psi_{ik}\rangle\!\langle\psi_{ik}| for some probabilities {pik}\{p_{ik}\}, then Tr(ρiMi)=1\operatorname{Tr}(\rho_{i}M_{i})=1 implies that for all kk, ψik|Mi|ψik=1\langle\psi_{ik}|M_{i}|\psi_{ik}\rangle=1, which implies that |ψik|\psi_{ik}\rangle is an eigenvector of MiM_{i} with eigenvalue 11. This implies that Mi=k|ψikψik|+MiM_{i}=\sum_{k}|\psi_{ik}\rangle\!\langle\psi_{ik}|+M_{i}^{\prime} for some positive semidefinite operator MiM_{i}^{\prime}, and this is at least ρi\rho_{i} in the positive semidefinite ordering.

This means then that for all jij\neq i,

0Tr(ρiMj)Tr(ρi(𝟙Mi))=Tr(ρi)Tr(ρiMi)=00\leq\operatorname{Tr}(\rho_{i}\,M_{j})\leq\operatorname{Tr}(\rho_{i}\,(\mathds{1}-M_{i}))=\operatorname{Tr}(\rho_{i})-\operatorname{Tr}(\rho_{i}\,M_{i})=0

where the first inequality is due to the positivity of ρi\rho_{i} and MjM_{j}, the second inequality is due to the fact that iMi=𝟙\sum_{i}M_{i}=\mathds{1}, and the last equality is due to the fact that Tr(ρi)=Tr(ρiMi)=1\operatorname{Tr}(\rho_{i})=\operatorname{Tr}(\rho_{i}\,M_{i})=1. Therefore Tr(ρiMj)=0\operatorname{Tr}(\rho_{i}\,M_{j})=0. Thus we have

0Tr(ρiρj)Tr(ρiMj)=0,0\leq\operatorname{Tr}(\rho_{i}\,\rho_{j})\leq\operatorname{Tr}(\rho_{i}\,M_{j})=0\enspace,

so Tr(ρiρj)=0\operatorname{Tr}(\rho_{i}\,\rho_{j})=0. ∎

Lemma 2.3 (Orthogonality conditions II).

Let (τ,(Ui))(\tau,(U_{i})) be a dd-dimensional superdense coding protocol. Then for all ij[d2]i\neq j\in[d^{2}],

TrA′′(UiτAUj)=0\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}\tau^{A}U_{j}^{*}\right)=0

where τA\tau^{A} denotes the reduced density matrix of τ\tau on register AA and TrA′′()\operatorname{Tr}_{A^{\prime\prime}}(\cdot) denotes the partial trace over register A′′A^{\prime\prime}.

Proof.

Let |τ|\tau\rangle denote a purification of τ\tau on the Hilbert space ABR\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{R}, where R\mathcal{H}_{R} is the purifying space. Clearly, the protocol where Bob also has access to the purifying space R\mathcal{H}_{R} is also a dd-dimensional superdense coding protocol.

Let {|1,,|dimA}\{|1\rangle,\ldots,|\dim A^{\prime}\rangle\} denote an orthonormal basis for A\mathcal{H}_{A^{\prime}}. Let |ρik|\rho_{ik}\rangle be the (sub-normalized) pure state on registers A′′BRA^{\prime\prime}BR given by

|ρik(k|A𝟙)(Ui𝟙)|τ.|\rho_{ik}\rangle\coloneqq(\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle\enspace.

Intuitively, |ρik|\rho_{ik}\rangle represents the residual state of the protocol on registers A′′BA^{\prime\prime}B when Alice applies unitary operator UiU_{i}, and then measures the AA^{\prime} subsystem in the standard basis to obtain outcome |k|k\rangle.

Note that if we let ρi\rho_{i} denote the state (Ui𝟙)|ττ|(Ui𝟙)(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|(U_{i}\otimes\mathds{1})^{*} reduced to the registers A′′BRA^{\prime\prime}BR, we have the identity

ρi=k|ρikρik|,\rho_{i}=\sum_{k}|\rho_{ik}\rangle\!\langle\rho_{ik}|\;,

because we can think of ρi\rho_{i} as the result of first measuring the AA^{\prime} register in the standard basis, and discarding the outcome. Applying Lemma 2.2 to the purified protocol (|ττ|,(Ui))(|\tau\rangle\!\langle\tau|,(U_{i})), we have for iji\neq j,

0=Tr(ρiρj)=k,kTr(|ρikρik||ρjkρjk|)=k,k|ρjk|ρik|20=\operatorname{Tr}(\rho_{i}\,\rho_{j})=\sum_{k,k^{\prime}}\operatorname{Tr}(|\rho_{ik}\rangle\!\langle\rho_{ik}|\cdot|\rho_{jk^{\prime}}\rangle\!\langle\rho_{jk^{\prime}}|)=\sum_{k,k^{\prime}}|\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle|^{2}

and therefore |ρjk|ρik|2=0|\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle|^{2}=0 for all k,kk,k^{\prime}. This implies that ρjk|ρik=0\langle\rho_{jk^{\prime}}|\rho_{ik}\rangle=0 for all k,kk,k^{\prime}, which can be rewritten as

τ|(Uj𝟙)(|kk|A𝟙)(Ui𝟙)|τ=Tr((k|A𝟙)(Ui𝟙)|ττ|(Uj𝟙)(|kA𝟙))=0.\langle\tau|(U_{j}\otimes\mathds{1})^{*}(|k^{\prime}\rangle\!\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle=\operatorname{Tr}\Big{(}(\langle k|^{A^{\prime}}\otimes\mathds{1})(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|(U_{j}^{*}\otimes\mathds{1})(|k^{\prime}\rangle^{A^{\prime}}\otimes\mathds{1})\Big{)}=0.

This is equivalent to the statement that

k|ATrA′′(UiτAUj)|kA=0.\langle k|^{A^{\prime}}\,\operatorname{Tr}_{A^{\prime\prime}}\Big{(}U_{i}\tau^{A}U_{j}^{*}\Big{)}\,|k^{\prime}\rangle^{A^{\prime}}=0.

Since this holds for all k,kk,k^{\prime}, the matrix TrA′′(UiτAUj)\operatorname{Tr}_{A^{\prime\prime}}\Big{(}U_{i}\tau^{A}U_{j}^{*}\Big{)} is identically zero, which completes the proof of the Lemma. ∎

Next we define what it means for superdense protocols to be locally equivalent.

Definition 2.4.

Let AAA′′\mathcal{H}_{A}\coloneqq\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}} be a Hilbert space where A′′\mathcal{H}_{A^{\prime\prime}} is isomorphic to d\mathbb{C}^{d}. Let τ,τ\tau,\tau^{\prime} be density matrices on AB\mathcal{H}_{A}\otimes\mathcal{H}_{B}. Let (Ui),(Ui)(U_{i}),(U_{i}^{\prime}) be unitary operators acting on A\mathcal{H}_{A}. We say that (τ,(Ui))(\tau,(U_{i})) and (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) are locally equivalent if there exists

  1. 1.

    A unitary operator VV acting on AA′′\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{A^{\prime\prime}} ,

  2. 2.

    A set of unitary operators (Ci)i[d2](C_{i})_{i\in[d^{2}]} acting on A\mathcal{H}_{A^{\prime}}

such that

  1. 1.

    τ=(V𝟙)τ(V𝟙)\tau^{\prime}=(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*}, and

  2. 2.

    Ui=(Ci𝟙)UiVU_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}.

Lemma 2.5 (Local unitary freedom of superdense coding protocols).

The following properties hold for local equivalence.

  1. 1.

    If (τ,(Ui))(\tau,(U_{i})) and (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) are locally equivalent, then (τ,(Ui))(\tau,(U_{i})) is a (d,ε)(d,\varepsilon)-superdense coding protocol if and only if (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is.

  2. 2.

    Local equivalence is transitive.

Proof.

For any fixed ii, after Alice applies her encoding unitary operator, the reduced density matrix on registers A′′BA^{\prime\prime}B is the same whether the protocol (τ,(Ui))(\tau,(U_{i})) or (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is used. Thus Bob’s ability to distinguish between the different messages is exactly the same. This establishes Item 1.

If (τ,(Ui))(\tau,(U_{i})) and (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) are locally equivalent, then τ=(V𝟙)τ(V𝟙)\tau^{\prime}=(V\otimes\mathds{1})\tau(V^{*}\otimes\mathds{1}), and Ui=(Ci𝟙)UiVU_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}. If (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) and (τ′′,(Ui′′))(\tau^{\prime\prime},(U_{i}^{\prime\prime})) are locally equivalent, then τ′′=(V𝟙)τ((V)𝟙)\tau^{\prime\prime}=(V^{\prime}\otimes\mathds{1})\tau^{\prime}((V^{\prime})^{*}\otimes\mathds{1}), and Ui′′=(Ci𝟙)Ui(V)U_{i}^{\prime\prime}=(C_{i}^{\prime}\otimes\mathds{1})U_{i}^{\prime}(V^{\prime})^{*}. Thus we have

τ′′\displaystyle\tau^{\prime\prime}\quad =(VV𝟙)τ(V(V)𝟙),and\displaystyle=\quad(V^{\prime}V\otimes\mathds{1})\tau(V^{*}(V^{\prime})^{*}\otimes\mathds{1})\enspace,\qquad\text{and}
Ui′′\displaystyle U_{i}^{\prime\prime}\quad =(CiCi𝟙)UiV(V),\displaystyle=\quad(C_{i}^{\prime}C_{i}\otimes\mathds{1})U_{i}V^{*}(V^{\prime})^{*}\enspace,

which implies that (τ,(Ui))(\tau,(U_{i})) is locally equivalent to (τ′′,(Ui′′))(\tau^{\prime\prime},(U_{i}^{\prime\prime})). This establishes Item 2. ∎

2.3 Nice form protocols

In this section, we define nice form protocols and then show that every superdense coding protocol is locally equivalent to one that has a nice form.

Definition 2.6.

dd-dimensional superdense coding protocol (τ,(Ui))(\tau,(U_{i})) has a nice form if

  1. 1.

    U1=𝟙U_{1}=\mathds{1},

  2. 2.

    There exists an isometry W:BBB′′W:\mathcal{H}_{B}\to\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{B^{\prime\prime}} where B′′\mathcal{H}_{B^{\prime\prime}} is isomorphic to d\mathbb{C}^{d} such that

    (𝟙W)τ(𝟙W)=ρAB|ϕdϕd|A′′B′′(\mathds{1}\otimes W)\tau(\mathds{1}\otimes W)^{*}=\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}}

    for some density matrix ρ\rho on AB\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}.

  3. 3.

    For all i[d2]i\in[d^{2}], we have that

    UiTrB(τ)Ui=Ui(ρA𝟙d)Ui=ρA𝟙dU_{i}\operatorname{Tr}_{B}(\tau)U_{i}^{*}=U_{i}\left(\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{d}\right)U_{i}^{*}=\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{d}\;

    where ρA\rho^{A^{\prime}} denotes the reduced density matrix of τ\tau on A\mathcal{H}_{A^{\prime}}.

  4. 4.

    Let the spectral decomposition of ρA\rho^{A^{\prime}} be kλkΠk\sum_{k}\lambda_{k}\Pi_{k} where λk>0\lambda_{k}>0 for all kk, with λk\lambda_{k} distinct. Then for all kk and iji\neq j, we have

    TrA′′((Πk𝟙)UiUj(Πk𝟙))=0.\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})\,U_{i}U_{j}^{*}\,(\Pi_{k}\otimes\mathds{1}))=0.

Item 2 says that up to a local unitary operation, the two parties share a maximally entangled state (in addition to other entanglement), and Item 3 turns out to be a consequence of this. Item 4 is equivalent to saying that the encoding of distinct messages iji\neq j are orthogonal to each other. The proof of Lemma 2.7 below may give the reader further intuition into these properties.

In the proof of Lemma 2.7, we make use of an information-theoretic argument that involves quantities such as von Neumann entropy H(A)\operatorname{H}(A), conditional entropy H(A|B)\operatorname{H}(A|B), and mutual information I(A:B)\operatorname{I}(A:B). For a comprehensive reference on these quantities and their basic properties, we recommend Wilde’s textbook [Wil13]. It is an interesting question whether Lemma 2.7 can be proved without making use of these information-theoretic quantities.

Lemma 2.7.

All superdense coding protocols (τ,(Ui))(\tau,(U_{i})) are locally equivalent to a superdense coding protocol (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) that has a nice form.

Proof.

We define a unitary operator VV acting on A\mathcal{H}_{A^{\prime}} and unitary operators (Ci:i[d2])(C_{i}:i\in[d^{2}]) acting on A\mathcal{H}_{A^{\prime}} such that, letting τ(V𝟙)τ(V𝟙)\tau^{\prime}\coloneqq(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*} and Ui(Ci𝟙)UiVU_{i}^{\prime}\coloneqq(C_{i}\otimes\mathds{1})U_{i}V^{*}, the pair (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is a superdense coding protocol and has a nice form.

Let VU1V\coloneqq U_{1} and let C1𝟙C_{1}\coloneqq\mathds{1}. This already yields Item 1 of Definition 2.6.

Let |τABR|\tau\rangle^{ABR} be a purification of τ\tau where R\mathcal{H}_{R} is a reference system of dimension dim(AB)\dim(\mathcal{H}_{A}\otimes\mathcal{H}_{B}). Consider the cq-state

ξ1d2i|ii|X(Ui𝟙)|ττ|ABR(Ui𝟙),\xi\coloneqq\frac{1}{d^{2}}\sum_{i}|i\rangle\!\langle i|^{X}\otimes(U_{i}\otimes\mathds{1})|\tau\rangle\!\langle\tau|^{ABR}(U_{i}^{*}\otimes\mathds{1})\enspace,

where the Hilbert space of register XX is X\mathcal{H}_{X}. By Lemma 2.5, the protocol (VτV,(UiV))(V\tau V^{*},(U_{i}V^{*})) is a superdense coding protocol. Therefore the information contained in registers A′′BA^{\prime\prime}B about XX in the state ξ\xi is

I(X:A′′B)ξ=2log2d.\operatorname{I}(X:A^{\prime\prime}B)_{\xi}=2\log_{2}d\enspace.

Intuitively, this is because Bob can perfectly recover the value of i[d2]i\in[d^{2}], i.e., 2log2d2\log_{2}d bits of information, from the registers A′′BA^{\prime\prime}B of ξ\xi. On the other hand, we have that

I(X:B)ξ=0\operatorname{I}(X:B)_{\xi}=0

because without the qubit register A′′A^{\prime\prime}, Bob has no information about XX (the state of the register BB is the same for all ii). Therefore we get

I(X:A′′|B)ξ=I(X:A′′B)ξI(X:B)ξ=2log2d.\operatorname{I}(X:A^{\prime\prime}|B)_{\xi}=\operatorname{I}(X:A^{\prime\prime}B)_{\xi}-\operatorname{I}(X:B)_{\xi}=2\log_{2}d\enspace.

Using the entropy characterization of conditional mutual information, we get

2log2d=I(X:A′′|B)ξ=H(A′′|B)ξH(A′′|XB)ξ.2\log_{2}d=\operatorname{I}(X:A^{\prime\prime}|B)_{\xi}=\operatorname{H}(A^{\prime\prime}|B)_{\xi}-\operatorname{H}(A^{\prime\prime}|XB)_{\xi}\enspace.

Since H(A′′|B)ξlog2d\operatorname{H}(A^{\prime\prime}|B)_{\xi}\leq\log_{2}d and H(A′′|XB)ξlog2d\operatorname{H}(A^{\prime\prime}|XB)_{\xi}\geq-\log_{2}d (because the dimension of register A′′A^{\prime\prime} is dd), we get that H(A′′|B)ξ=log2d\operatorname{H}(A^{\prime\prime}|B)_{\xi}=\log_{2}d and H(A′′|XB)ξ=log2d\operatorname{H}(A^{\prime\prime}|XB)_{\xi}=-\log_{2}d.

Since XX is a classical register, we can write H(A′′|XB)ξ\operatorname{H}(A^{\prime\prime}|XB)_{\xi} as

log2d=H(A′′|XB)ξ=𝔼iH(A′′|B,X=i)-\log_{2}d=\operatorname{H}(A^{\prime\prime}|XB)_{\xi}=\operatorname*{\mathbb{E}}_{i}\operatorname{H}(A^{\prime\prime}|B,X=i)

where H(A′′|B,X=i)\operatorname{H}(A^{\prime\prime}|B,X=i) is defined as H(A′′|B)ξi\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}} with |ξi(Ui𝟙)|τ|\xi_{i}\rangle\coloneqq(U_{i}\otimes\mathds{1})|\tau\rangle. Since H(A′′|B,X=i)log2d\operatorname{H}(A^{\prime\prime}|B,X=i)\geq-\log_{2}d, we have H(A′′|B)ξi=log2d\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}}=-\log_{2}d for all ii, and in particular for i=1i=1.

Then H(A′′|B)ξi=H(A′′|RA)ξi\operatorname{H}(A^{\prime\prime}|B)_{\xi_{i}}=-\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}} (because ξi\xi_{i} is pure), so H(A′′|RA)ξi=log2d\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}}=\log_{2}d. On one hand, we have that I(A′′:RA)ξi=H(A′′)ξiH(A′′|RA)ξi\operatorname{I}(A^{\prime\prime}:RA^{\prime})_{\xi_{i}}=\operatorname{H}(A^{\prime\prime})_{\xi_{i}}-\operatorname{H}(A^{\prime\prime}|RA^{\prime})_{\xi_{i}}, and on the other hand, mutual information is always nonnegative. Thus H(A′′)ξi=log2d\operatorname{H}(A^{\prime\prime})_{\xi_{i}}=\log_{2}d, and the reduced density matrix of ξi\xi_{i} on the A′′A^{\prime\prime} register is maximally mixed. Furthermore we have I(A′′:RA)ξi=0\operatorname{I}(A^{\prime\prime}:RA^{\prime})_{\xi_{i}}=0, so ξi\xi_{i} has no correlations between registers A′′A^{\prime\prime} and RARA^{\prime}:

TrB(ξi)=ρiRA𝟙d\operatorname{Tr}_{B}(\xi_{i})=\rho_{i}^{RA^{\prime}}\otimes\frac{\mathds{1}}{d} (2.2)

where ρi\rho_{i} is some density matrix on the RARA^{\prime} registers.

Fix i=1i=1, and let ρRA\rho^{RA^{\prime}} denote ρ1RA\rho_{1}^{RA^{\prime}}. Let B\mathcal{H}_{B^{\prime}} be a Hilbert space with dimension dim(RA)\dim(\mathcal{H}_{R}\otimes\mathcal{H}_{A^{\prime}}) and let B′′\mathcal{H}_{B^{\prime\prime}} be isomorphic to d\mathbb{C}^{d}. Let |ρRAB|\rho\rangle^{RA^{\prime}B^{\prime}} denote a purification of ρRA\rho^{RA^{\prime}}. Notice that |ρRAB|ϕdA′′B′′|\rho\rangle^{RA^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle^{A^{\prime\prime}B^{\prime\prime}} is a purification of the state in Equation 2.2. Using Uhlmann’s Theorem [Uhl76] (also known as the Schrödinger-HJW Theorem [Sch35, HJW93]), there exists an isometry WW on B\mathcal{H}_{B} such that

(𝟙W)|ξ1RAA′′B=|ρRAB|ϕdA′′B′′.(\mathds{1}\otimes W)|\xi_{1}\rangle^{RA^{\prime}A^{\prime\prime}B}=|\rho\rangle^{RA^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle^{A^{\prime\prime}B^{\prime\prime}}\;.

Since |ξ1=(V𝟙)|τ|\xi_{1}\rangle=(V\otimes\mathds{1})|\tau\rangle, we have that

ρAB|ϕdϕd|A′′B′′=TrR((𝟙W)|ξ1ξ1|(𝟙W))=(VW)τ(VW).\rho^{A^{\prime}B^{\prime}}\otimes|\upphi_{d}\rangle\!\langle\upphi_{d}|^{A^{\prime\prime}B^{\prime\prime}}=\operatorname{Tr}_{R}((\mathds{1}\otimes W)|\xi_{1}\rangle\!\langle\xi_{1}|(\mathds{1}\otimes W)^{*})=(V\otimes W)\tau(V\otimes W)^{*}\;.

Since τ=(V𝟙)τ(V𝟙)\tau^{\prime}=(V\otimes\mathds{1})\tau(V\otimes\mathds{1})^{*} we obtain Item 2 of Definition 2.6 for the protocol (τ,(UiV))(\tau^{\prime},(U_{i}V^{*})).

In what follows we let ζ\zeta denote ρA\rho^{A^{\prime}} (which also equals τA\tau^{\prime A^{\prime}}). We now establish Item 3 of Definition 2.6. Let kλkΠk\sum_{k}\lambda_{k}\Pi_{k} be the spectral decomposition of ζ\zeta where the {λk}\{\lambda_{k}\} are distinct and nonzero, and Πk\Pi_{k} is the orthogonal projector onto the eigenspace of ζ\zeta corresponding to eigenvalue λk\lambda_{k}. Since by Equation 2.2 we have

UiV(ζ𝟙d)(UiV)=TrBR(ξi)=ζi𝟙dU_{i}V^{*}\left(\zeta\otimes\frac{\mathds{1}}{d}\right)(U_{i}V^{*})^{*}=\operatorname{Tr}_{BR}(\xi_{i})=\zeta_{i}\otimes\frac{\mathds{1}}{d} (2.3)

for some density matrix ζi\zeta_{i}, the states ζi\zeta_{i} and ζ\zeta have the same eigenvalues with the same multiplicities. That is, there are an orthogonal set of projectors {Πk(i)}k\{\Pi_{k}^{(i)}\}_{k} such that

ζi=kλkΠk(i),\zeta_{i}=\sum_{k}\lambda_{k}\Pi_{k}^{(i)}\enspace,

where dim(Πk(i))=dim(Πk)\dim(\Pi_{k}^{(i)})=\dim(\Pi_{k}) for all i[d2]i\in[d^{2}]. It follows that for all ii,

UiV(Πk𝟙d)(UiV)=Πk(i)𝟙d.U_{i}V^{*}\left(\Pi_{k}\otimes\frac{\mathds{1}}{d}\right)(U_{i}V^{*})^{*}=\Pi_{k}^{(i)}\otimes\frac{\mathds{1}}{d}\enspace.

For i[d2]i\in[d^{2}] let CiC_{i} be a unitary operator on A\mathcal{H}_{A^{\prime}} such that CiΠk(i)Ci=ΠkC_{i}\Pi_{k}^{(i)}C_{i}^{*}=\Pi_{k} for all kk. Since Πk(1)=Πk\Pi_{k}^{(1)}=\Pi_{k}, our choice of C1=𝟙C_{1}=\mathds{1} suffices. Let Ui=(Ci𝟙)UiVU_{i}^{\prime}=(C_{i}\otimes\mathds{1})U_{i}V^{*}. By Lemma 2.5, we have (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is a superdense coding protocol. Furthermore, Equation 2.3 implies that

Ui(kλkΠk𝟙d)(Ui)=kλkΠk𝟙d,U_{i}^{\prime}\Big{(}\sum_{k}\lambda_{k}\,\Pi_{k}\otimes\frac{\mathds{1}}{d}\Big{)}(U_{i}^{\prime})^{*}=\sum_{k}\lambda_{k}\,\Pi_{k}\otimes\frac{\mathds{1}}{d}\enspace,

which implies Item 3 of Definition 2.6.

Lemma 2.5 implies that (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is also a superdense coding protocol, so by Lemma 2.3 we have that for all iji\neq j,

TrA′′(Ui(ζ𝟙d)(Uj))=0.\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\zeta\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)=0\;.

Since Πk\Pi_{k} commutes with the (Ui)(U_{i}^{\prime}) for all kk, we have

0\displaystyle 0 =TrA′′(Ui(ρ𝟙d)(Uj))\displaystyle=\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\rho\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)
=kλkTrA′′(Ui(Πk𝟙d)(Uj))\displaystyle=\sum_{k}\lambda_{k}\operatorname{Tr}_{A^{\prime\prime}}\left(U_{i}^{\prime}\left(\Pi_{k}\otimes\frac{\mathds{1}}{d}\right)(U_{j}^{\prime})^{*}\right)
=1dkλkTrA′′((Πk𝟙)Ui(Uj)).\displaystyle=\frac{1}{d}\sum_{k}\lambda_{k}\operatorname{Tr}_{A^{\prime\prime}}\left(\left(\Pi_{k}\otimes\mathds{1}\right)U_{i}^{\prime}(U_{j}^{\prime})^{*}\right)\enspace.

Since the λk\lambda_{k}’s are positive, we have

TrA′′((Πk𝟙)Ui(Uj)(Πk𝟙))=0\operatorname{Tr}_{A^{\prime\prime}}\left((\Pi_{k}\otimes\mathds{1})U_{i}^{\prime}(U_{j}^{\prime})^{*}(\Pi_{k}\otimes\mathds{1})\right)=0

for all kk, and iji\neq j. This implies Item 4 of Definition 2.6.

3 Rigidity for two-dimensional superdense coding

In this section we prove Theorem 1.1, that is, rigidity for 22-dimensional superdense coding protocols (coding 22 bits into one qubit, with no error). For the remainder of this section we drop the qualification “22-dimensional” for brevity.

The proof involves a number of steps. First, we invoke Lemma 2.7, which states that every superdense coding protocol is locally equivalent to one that has a nice form. Then, we argue that up to local equivalence, in every nice form superdense coding protocol, the encoding operators (Ui)(U_{i}) can be block-diagonalized. That is, we can write Ui=QiRiU_{i}=\sum_{\ell}Q_{i\ell}\otimes R_{i\ell} where the {Qi}\{Q_{i\ell}\} are a set of orthogonal projectors acting on A\mathcal{H}_{A^{\prime}} summing to the identity, and the {Ri}\{R_{i\ell}\} are a set of Hermitian unitary operators acting on A′′\mathcal{H}_{A^{\prime\prime}}. Next, we argue that (again up to local equivalence) across the different ii’s, the projectors {Qi}\{Q_{i\ell}\} can be “matched up”, and the corresponding operators RiR_{i\ell} are all pairwise orthogonal. This implies that in fact {R1,R2,R3,R4}\{R_{1\ell},R_{2\ell},R_{3\ell},R_{4\ell}\} are unitarily equivalent to the standard Pauli matrices {𝟙,X,Y,Z}\{\mathds{1},{\mathrm{X}},{\mathrm{Y}},{\mathrm{Z}}\}. Using the property that local equivalence is a transitive relation, this concludes the argument.

Lemma 2.7 is proven in Section 2.3. We now proceed to prove the remaining steps in detail.

3.1 Block-diagonalizing nice form protocols

In this section, we analyze the structure of the encoding operators in a nice form superdense coding protocol in dimension two. We show that they apply a 2×22\times 2 unitary operator on register A′′A^{\prime\prime}, controlled by the state in register AA^{\prime}.

Theorem 3.1.

Let (τ,(Ui))(\tau,(U_{i})) be a nice-form protocol. Then there exists a locally-equivalent protocol (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) that has a nice form and for i{2,3,4}i\in\{2,3,4\} we have

Ui=τQiRi.U_{i}^{\prime}=_{\tau^{\prime}}\sum_{\ell}Q_{i\ell}\otimes R_{i\ell}\;.

for some orthogonal projectors {Qi}\{Q_{i\ell}\}_{\ell} on A\mathcal{H}_{A^{\prime}} that sum to 𝟙\mathds{1}, and 2×22\times 2 traceless, Hermitian unitary matrices {Ri}\{R_{i\ell}\}_{\ell} on A′′\mathcal{H}_{A^{\prime\prime}}.

Proof.

Fix an i{2,3,4}i\in\{2,3,4\}. Since (τ,(Ui))(\tau,(U_{i})) has a nice form (see Definition 2.6), this means that τA=ρA𝟙2\tau^{A}=\rho^{A^{\prime}}\otimes\frac{\mathds{1}}{2} for some density matrix ρ\rho, and TrA′′((Πk𝟙)UiU1(Πk𝟙))=TrA′′((Πk𝟙)Ui(Πk𝟙))=0\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}U_{1}^{*}(\Pi_{k}\otimes\mathds{1}))=\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1}))=0, where Πk\Pi_{k} is a non-zero eigenspace of ρ\rho.

Fix a kk. By property 3 of a nice-form protocol, UiU_{i} commutes with Πk𝟙\Pi_{k}\otimes\mathds{1}, and we can write

Ui=(Πk𝟙)Ui(Πk𝟙)+(𝟙Πk𝟙)Ui(𝟙Πk𝟙).U_{i}\quad=\quad(\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1})+(\mathds{1}-\Pi_{k}\otimes\mathds{1})U_{i}(\mathds{1}-\Pi_{k}\otimes\mathds{1})\enspace.

Let U^ik=(Πk𝟙)Ui(Πk𝟙)\hat{U}_{ik}=(\Pi_{k}\otimes\mathds{1})U_{i}(\Pi_{k}\otimes\mathds{1}), and note that U^ik\hat{U}_{ik} is unitary on the image of Πk𝟙\Pi_{k}\otimes\mathds{1}, i.e.:

U^ikU^ik=U^ikU^ik=Πk𝟙.\hat{U}_{ik}\hat{U}_{ik}^{*}=\hat{U}_{ik}^{*}\hat{U}_{ik}=\Pi_{k}\otimes\mathds{1}\;.

For notational convenience we drop the subscripts ii and kk until the very end. Let U^U^ik\hat{U}\coloneqq\hat{U}_{ik} and let ΠΠk\Pi\coloneqq\Pi_{k}.

The condition that TrA′′(U^)=0\operatorname{Tr}_{A^{\prime\prime}}(\hat{U})=0 implies that we can write U^\hat{U} as

U^=(FGHF),\hat{U}=\left(\begin{array}[]{c|c}F&G\\ \hline\cr H&-F\end{array}\right)\enspace,

where F,G,HF,G,H are block matrices that act on the image of Π\Pi and the block partitions are with respect to the tensor factor A′′\mathcal{H}_{A^{\prime\prime}}.

Let

F=DFTF,G=DGTG,H=DHTHF=D_{F}T_{F}~{},\qquad G=D_{G}T_{G}~{},\qquad H=D_{H}T_{H}

give the polar decompositions of F,G,HF,G,H respectively where DF,DG,DHD_{F},D_{G},D_{H} are positive semidefinite and TF,TG,THT_{F},T_{G},T_{H} are unitary on the image of Π\Pi.

Then

U^=(DFTFDGTGDHTHDFTF).\hat{U}=\left(\begin{array}[]{c|c}D_{F}T_{F}&D_{G}T_{G}\\ \hline\cr D_{H}T_{H}&-D_{F}T_{F}\end{array}\right).

The relation U^U^=Π𝟙\hat{U}\hat{U}^{*}=\Pi\otimes\mathds{1} implies that DF2=ΠDG2=ΠDH2D_{F}^{2}=\Pi-D_{G}^{2}=\Pi-D_{H}^{2}. For notational brevity we write DDFD\coloneqq D_{F} and D~DG=DH=ΠD2\widetilde{D}\coloneqq D_{G}=D_{H}=\sqrt{\Pi-D^{2}}. Note that DD and D~\widetilde{D} have support in the image of Π\Pi and are simultaneously diagonalizable. Write KTFDTFK\coloneqq T_{F}^{*}DT_{F} and K~TFD~TF\widetilde{K}\coloneqq T_{F}^{*}\widetilde{D}T_{F}. Note that KK and K~\widetilde{K} are positive semidefinite, and also simultaneously diagonalizable. Write WGTFTGW_{G}\coloneqq T_{F}^{*}T_{G} and WHTFTHW_{H}^{*}\coloneqq T_{F}^{*}T_{H}. Continuing our simplification, we see that

(TF𝟙)U^=(KK~WGK~WHK).(T_{F}^{*}\otimes\mathds{1})\hat{U}=\left(\begin{array}[]{c|c}K&\widetilde{K}W_{G}\\ \hline\cr\widetilde{K}W_{H}^{*}&-K\end{array}\right).

Now our goal is to find a unitary operator EE acting on A\mathcal{H}_{A^{\prime}} such that (ETF𝟙)U^(ET_{F}^{*}\otimes\mathds{1})\hat{U} is Hermitian. This is equivalent to the conditions that EK=KEEK=KE^{*} and EK~WG=(EK~WH)E\widetilde{K}W_{G}=(E\widetilde{K}W_{H}^{*})^{*}. We construct such an operator EE using some relations between the operators K,K~,WG,WHK,\widetilde{K},W_{G},W_{H} that we derive below.

We use the unitarity relation U^U^=Π𝟙\hat{U}^{*}\hat{U}=\Pi\otimes\mathds{1} to obtain the equations

K2+WGK~2WG\displaystyle K^{2}+W_{G}^{*}\widetilde{K}^{2}W_{G} =Π,and\displaystyle=\Pi\enspace,\qquad\text{and} (3.1)
K2+WHK~2WH\displaystyle K^{2}+W_{H}\widetilde{K}^{2}W_{H}^{*} =Π.\displaystyle=\Pi\enspace. (3.2)

These equations, along with the definitions of KK and K~\widetilde{K}, imply that the unitary operators WG,WHW_{G},W_{H} are block-diagonal with respect to the eigenspaces of KK and K~\widetilde{K} (and therefore commute with KK and K~\widetilde{K}). Another relation we get from unitarity is KK~WG=WHK~KK\widetilde{K}W_{G}=W_{H}\widetilde{K}K, which via commutativity of WH,K~,KW_{H},\widetilde{K},K implies

KK~WG=KK~WH.K\widetilde{K}W_{G}=K\widetilde{K}W_{H}\;. (3.3)

Let Π+\Pi_{+} be the orthogonal projector onto supp(K)\operatorname{supp}(K). Since WGW_{G} commutes with K,K~K,\widetilde{K} (and so does WHW_{H}), we have that Π+\Pi_{+} commutes with K~,WG,WH\widetilde{K},W_{G},W_{H}. By Equation 3.3, we have Π+K~WG=Π+K~WH=WHK~Π+\Pi_{+}\widetilde{K}W_{G}=\Pi_{+}\widetilde{K}W_{H}=W_{H}\widetilde{K}\,\Pi_{+}. Since K2+K~2=ΠK^{2}+\widetilde{K}^{2}=\Pi, we also have that (ΠΠ+)K~=(ΠΠ+)(\Pi-\Pi_{+})\widetilde{K}=(\Pi-\Pi_{+}). Note that Equation 3.3, together with the fact that WH,WGW_{H},W_{G} are block-diagonal with respect to the eigenspaces of the product KK~K\widetilde{K}, implies that WGW_{G} and WHW_{H} must be equal on the support of KK~K\widetilde{K} (equivalently, the support of Π+\Pi_{+}).

We now construct the desired unitary EE. Let Π0\Pi_{0} denote ΠΠ+\Pi-\Pi_{+} (i.e., the projection onto the kernel of KK within the image of Π\Pi). Left-multiplying both sides of Equation 3.3 with K+K^{+} (the pseudoinverse of KK on the image of Π+\Pi_{+}) and right-multiplying both sides by Π+\Pi_{+} we get

Π+K~WGΠ+=Π+K~WHΠ+.\Pi_{+}\widetilde{K}W_{G}\Pi_{+}=\Pi_{+}\widetilde{K}W_{H}\Pi_{+}\;. (3.4)

Recall that Π0K~=Π0\Pi_{0}\widetilde{K}=\Pi_{0}; combined with the fact that WGW_{G} and WHW_{H} are both block-diagonal with respect to the eigenspaces of KK and K~\widetilde{K}, we have

K~WG=Π+K~WGΠ++Π0WGΠ0andK~WH=Π+K~WHΠ++Π0WHΠ0.\widetilde{K}W_{G}=\Pi_{+}\widetilde{K}W_{G}\Pi_{+}+\Pi_{0}W_{G}\Pi_{0}\qquad\text{and}\qquad\widetilde{K}W_{H}^{*}=\Pi_{+}\widetilde{K}W_{H}^{*}\Pi_{+}+\Pi_{0}W_{H}^{*}\Pi_{0}~{}. (3.5)

Furthermore, VGΠ0WGΠ0V_{G}\coloneqq\Pi_{0}W_{G}\Pi_{0} and VHΠ0WHΠ0V_{H}\coloneqq\Pi_{0}W_{H}^{*}\Pi_{0} are unitary on Π0\Pi_{0}. Let MVHVGM\coloneqq V_{H}^{*}V_{G}, and consider its spectral decomposition M=jeiθj|vjvj|M=\sum_{j}e^{{\mathrm{i}}\theta_{j}}|v_{j}\rangle\!\langle v_{j}| where {|vj}\{|v_{j}\rangle\} is an orthonormal basis for the support of Π0\Pi_{0}. Let M1/2jeiθj/2|vjvj|M^{1/2}\coloneqq\sum_{j}e^{{\mathrm{i}}\theta_{j}/2}|v_{j}\rangle\!\langle v_{j}| denote the principal square root of MM. Define E0M1/2VGE_{0}\coloneqq M^{1/2}V_{G}^{*}. Observe that E0E_{0} is supported only on Π0\Pi_{0} and satisfies

E0VG=M1/2=(M1/2M)=(E0VH).E_{0}V_{G}=M^{1/2}=(M^{1/2}M^{*})^{*}=(E_{0}V_{H})^{*}. (3.6)

Consider the operator EΠ++E0E\coloneqq\Pi_{+}+E_{0} that is unitary on the support of Π\Pi, and acts non-trivially only on the support of Π0\Pi_{0}. Combining Equations (3.4), (3.5) and (3.6) we get

EK~WG\displaystyle E\widetilde{K}W_{G} =Π+K~WGΠ++E0VG\displaystyle=\Pi_{+}\widetilde{K}W_{G}\Pi_{+}+E_{0}V_{G}
=Π+K~WHΠ++(E0VH)\displaystyle=\Pi_{+}\widetilde{K}W_{H}\Pi_{+}+(E_{0}V_{H})^{*}
=(Π+WHK~Π++E0VH)\displaystyle=(\Pi_{+}W_{H}^{*}\widetilde{K}\Pi_{+}+E_{0}V_{H})^{*}
=(Π+K~WHΠ++E0VH)\displaystyle=(\Pi_{+}\widetilde{K}W_{H}^{*}\Pi_{+}+E_{0}V_{H})^{*}
=(EK~WH).\displaystyle=(E\widetilde{K}W_{H}^{*})^{*}\;.

where the second line follows from Equation 3.4 and Equation 3.6, the fourth line follows because K~\widetilde{K} commutes with WHW_{H}^{*}, and the last line follows because of Equation 3.5. We also have that EE commutes with K~\widetilde{K}: this is because Π+\Pi_{+} commutes with K~\widetilde{K} and also E0E_{0} acts nontrivially only on the eigenspace of K~\widetilde{K} with eigenvalue 11.

Let LEK~WGL\coloneqq E\widetilde{K}W_{G}. Putting everything together, we have

(ETF𝟙)U^=(EKEK~WGEK~WHEK)=(KLLK).(ET_{F}^{*}\otimes\mathds{1})\hat{U}=\left(\begin{array}[]{c|c}EK&E\widetilde{K}W_{G}\\ \hline\cr E\widetilde{K}W_{H}^{*}&-EK\end{array}\right)=\left(\begin{array}[]{c|c}K&L\\ \hline\cr L^{*}&-K\end{array}\right). (3.7)

where we used the fact that EK=KEK=K.

Let K=rαrPrK=\sum_{r}\alpha_{r}P_{r} and K~=r1αr2Pr\widetilde{K}=\sum_{r}\sqrt{1-\alpha_{r}^{2}}\,P_{r} be spectral decompositions of KK and K~\widetilde{K} where the reals αr\alpha_{r}’s are nonnegative and distinct, and the operators PrP_{r} are orthogonal projectors summing to Π\Pi. The operator K~\widetilde{K} has such a spectral decomposition because K2+K~2=ΠK^{2}+\widetilde{K}^{2}=\Pi. Next, since the unitary operators EE and WGW_{G} commute with K~\widetilde{K}, they are block-diagonal with respect to the projectors {Pr}\{P_{r}\}. Thus

PrLPr\displaystyle P_{r}LP_{r} =PrEK~WGPr=PrK~EWGK~Pr=1αr2PrEWGPr.\displaystyle=P_{r}E\widetilde{K}W_{G}P_{r}=P_{r}\sqrt{\widetilde{K}}EW_{G}\sqrt{\widetilde{K}}P_{r}=\sqrt{1-\alpha_{r}^{2}}\,\,P_{r}EW_{G}P_{r}~{}.

The operator PrEWGPrP_{r}EW_{G}P_{r} is unitary on PrP_{r} and we can express it as sβrsQrs\sum_{s}\beta_{rs}Q_{rs} where the βrs\beta_{rs}’s are complex numbers on the unit circle and {Qrs}s\{Q_{rs}\}_{s} are orthogonal projectors that sum to PrP_{r}. Thus we can write K=r,sαrQrsK=\sum_{r,s}\alpha_{r}\,Q_{rs} and L=r,s1αr2βrsQrsL=\sum_{r,s}\sqrt{1-\alpha_{r}^{2}}\,\beta_{rs}\,Q_{rs}, and (ETF𝟙)U^(ET_{F}^{*}\otimes\mathds{1})\hat{U} can be written as

(ETF𝟙)U^=r,sQrsRrs(ET_{F}^{*}\otimes\mathds{1})\hat{U}=\sum_{r,s}Q_{rs}\otimes R_{rs} (3.8)

where RrsR_{rs} is the 2×22\times 2 matrix

(αr1αr2βrs1αr2βrsαr).\begin{pmatrix}\alpha_{r}&\sqrt{1-\alpha_{r}^{2}}\cdot\beta_{rs}\\ \sqrt{1-\alpha_{r}^{2}}\cdot\beta_{rs}^{*}&-\alpha_{r}\end{pmatrix}\enspace.

Notice that RrsR_{rs} has determinant 1-1 and is traceless, therefore its eigenvalues are {+1,1}\{+1,-1\}.

Re-introducing the indices i{2,3,4}i\in\{2,3,4\} and kk, we have deduced that for every block of UiU_{i} corresponding to the eigenspace Πk\Pi_{k}, there exists a map SikS_{ik} that is unitary on the image of Πk\Pi_{k} such that

(Sik𝟙)U^ik=r,sQikrsRikrs(S_{ik}\otimes\mathds{1})\hat{U}_{ik}=\sum_{r,s}Q_{ikrs}\otimes R_{ikrs}

where the (Rikrs)r,s(R_{ikrs})_{r,s} are 2×22\times 2 Hermitian unitary operators with trace 0. Define the unitary operator SiS_{i} on A\mathcal{H}_{A^{\prime}} as Si(𝟙kΠk)+kSikS_{i}\coloneqq(\mathds{1}-\sum_{k}\Pi_{k})+\sum_{k}S_{ik}. If we sum over kk, we get

(Si𝟙)U^i=QiRi,(S_{i}\otimes\mathds{1})\hat{U}_{i}=\sum_{\ell}Q_{i\ell}\otimes R_{i\ell}\enspace,

where U^ikU^ik\hat{U}_{i}\coloneqq\sum_{k}\hat{U}_{ik} and we have re-indexed the sum over k,r,sk,r,s to be a sum over indices \ell. Let Ui(Si𝟙)UiU_{i}^{\prime}\coloneqq(S_{i}\otimes\mathds{1})U_{i} for all i{2,3,4}i\in\{2,3,4\}. Then, letting PkΠkP\coloneqq\sum_{k}\Pi_{k} denote the projector onto the support of ρ\rho, we have

Uiτ(Ui)=(SiUiP)τ(SiUiP)=(SiU^iP)τ(SiU^iP)=SiU^iτU^iSiU_{i}^{\prime}\tau(U_{i}^{\prime})^{*}=(S_{i}U_{i}P)\,\tau\,(S_{i}U_{i}P)^{*}=(S_{i}\hat{U}_{i}P)\,\tau\,(S_{i}\hat{U}_{i}P)^{*}=S_{i}\hat{U}_{i}\,\tau\,\hat{U}_{i}^{*}S_{i}^{*}

where have suppressed the tensoring with identity that extends all the operators to the same space, and used the property that (P𝟙)τ(P𝟙)=τ(P\otimes\mathds{1})\tau(P\otimes\mathds{1})=\tau, U^iP=UiP\hat{U}_{i}P=U_{i}P, and U^iP=U^i\hat{U}_{i}P=\hat{U}_{i}. Thus, the unitary operators UiU_{i}^{\prime} satisfy the conclusions of the theorem statement. Let ττ\tau^{\prime}\coloneqq\tau, so that (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is a superdense coding protocol by Lemma 2.5.

Furthermore, since (τ,(Ui))(\tau,(U_{i})) has a nice form, it can be verified that (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) also has a nice form. First, since S1=𝟙S_{1}=\mathds{1}, we have that U1=𝟙U_{1}^{\prime}=\mathds{1} (and hence Item 1 of Definition 2.6 is satisfied). Item 2 of Definition 2.6 is satisfied since τ=τ\tau^{\prime}=\tau. Third, since UiU_{i} commutes with ρ𝟙2\rho\otimes\frac{\mathds{1}}{2} and SiS_{i} is block-diagonal with respect to the eigenspaces of ρ\rho, it follows that UiU_{i}^{\prime} also commutes with ρ𝟙2\rho\otimes\frac{\mathds{1}}{2} (so Item 3 of Definition 2.6 is satisfied). Finally, we have

TrA′′((Πk𝟙)Ui(Uj)(Πk𝟙))\displaystyle\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}^{\prime}(U_{j}^{\prime})^{*}(\Pi_{k}\otimes\mathds{1})) =TrA′′((Sik𝟙)U^ikU^jk(Sjk𝟙))\displaystyle=\operatorname{Tr}_{A^{\prime\prime}}((S_{ik}\otimes\mathds{1})\hat{U}_{ik}\hat{U}_{jk}^{*}(S_{jk}^{*}\otimes\mathds{1}))
=Sik[TrA′′(U^ikU^jk)]Sjk\displaystyle=S_{ik}\left[\operatorname{Tr}_{A^{\prime\prime}}(\hat{U}_{ik}\hat{U}_{jk}^{*})\right]S_{jk}^{*}
=Sik[TrA′′((Πk𝟙)UiUj(Πk𝟙))]Sjk\displaystyle=S_{ik}\left[\operatorname{Tr}_{A^{\prime\prime}}((\Pi_{k}\otimes\mathds{1})U_{i}U_{j}^{*}(\Pi_{k}\otimes\mathds{1}))\right]S_{jk}^{*}
=0.\displaystyle=0.

Thus, Item 4 of Definition 2.6 is satisfied. This completes the proof of the Theorem. ∎

3.2 Matching the blocks of the encoding operators

In the previous section we saw how, up to local equivalence of protocols, we can express the encoding operator UiU_{i} in a two-dimensional superdense coding protocol as a block-diagonal matrix with 2×22\times 2 Hermitian unitary operators on the diagonal. In this section we relate the decompositions to each other. Ultimately, the conclusion is that the blocks “line up”, so that the operators in the same diagonal block of the four encoding operators are the four single qubit Pauli operators.

Theorem 3.2.

Let (τ,(Ui))(\tau,(U_{i})) be a superdense coding protocol that has a nice form where for i{2,3,4}i\in\{2,3,4\} we have

Ui=τkQikRik,U_{i}=_{\tau}\sum_{k}Q_{ik}\otimes R_{ik}\;,

for some orthogonal projectors {Qik}k\{Q_{ik}\}_{k} on A\mathcal{H}_{A^{\prime}} that sum to 𝟙\mathds{1}, and 2×22\times 2 traceless, Hermitian unitary matrices {Rik}k\{R_{ik}\}_{k} on A′′\mathcal{H}_{A^{\prime\prime}}. Then (τ,(Ui))(\tau,(U_{i})) is locally equivalent to a superdense coding protocol (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) that has a nice form and satisfies the following: there exist orthogonal projectors {Kr}\{K_{r}\} on A\mathcal{H}_{A^{\prime}} that sum to the identity and 2×22\times 2 traceless, Hermitian unitary operators {Rir}\{R_{ir}\} such that for all i{2,3,4}i\in\{2,3,4\},

Ui=τrKrRir.U_{i}^{\prime}=_{\tau^{\prime}}\sum_{r}K_{r}\otimes R_{ir}\;.

Furthermore, for all r,ijr,i\neq j we have

Tr(RirRjr)=0.\operatorname{Tr}(R_{ir}R_{jr})=0\enspace.
Proof.

The first step is to “coarse-grain” the projectors {Qik}\{Q_{ik}\} so that the associated operators RikR_{ik} are all inequivalent in the following sense. For each ii, we say that kk and kk^{\prime} are ii-equivalent if Rik=±RikR_{ik}=\pm R_{ik^{\prime}}. For every ii, this forms an equivalence relation on the kk’s. Let pi(k)p_{i}(k) denote the least kk^{\prime} such that kk^{\prime} and kk are ii-equivalent. Define sik{±1}s_{ik}\in\{\pm 1\} to be such that Rik=sikRipi(k)R_{ik}=s_{ik}R_{ip_{i}(k)}.

For every i{2,3,4}i\in\{2,3,4\}, for every kk, define the unitary operator Si=ksikQikS_{i}=\sum_{k}s_{ik}Q_{ik} which acts on A\mathcal{H}_{A^{\prime}} (and set S1=𝟙S_{1}=\mathds{1}). Then if we define Ui=(Si𝟙)UiU^{\prime}_{i}=(S_{i}\otimes\mathds{1})U_{i} and τ=τ\tau^{\prime}=\tau, by Lemma 2.5 we get that the pair (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) is a superdense coding protocol, and furthermore the operators (Ui)(U_{i}^{\prime}) admit a block-diagonalization where for all ii, the associated 2×22\times 2 unitary operators RikR_{ik} are all inequivalent. It is also straightforward to check that (τ,(Ui))(\tau^{\prime},(U_{i}^{\prime})) has a nice form.

Next, from Lemma 2.7 we have that for iji\neq j

0=TrA′′(Ui(Uj))=k,QikQjTr(RikRj).0=\operatorname{Tr}_{A^{\prime\prime}}(U_{i}^{\prime}(U_{j}^{\prime})^{*})=\sum_{k,\ell}Q_{ik}Q_{j\ell}\cdot\operatorname{Tr}(R_{ik}R_{j\ell}). (3.9)

By left-multiplying the above expression by QikQ_{ik} for some kk and right-multiplying by QjQ_{j\ell} for some \ell yields QikQjTr(RikRj)=0Q_{ik}Q_{j\ell}\cdot\operatorname{Tr}(R_{ik}R_{j\ell})=0. Therefore, if QikQj0Q_{ik}Q_{j\ell}\neq 0, it follows that Tr(RikRj)=0\operatorname{Tr}(R_{ik}R_{j\ell})=0.

Given three sets of projectors {Q2k}\{Q_{2k}\}, {Q3}\{Q_{3\ell}\}, and {Q4m}\{Q_{4m}\} we can define the following tripartite graph GG, which we call the overlap graph. Associate a vertex with every projector QikQ_{ik} for i{2,3,4}i\in\{2,3,4\}. Include an edge between QikQ_{ik} and QjQ_{j\ell} if and only if QikQj0Q_{ik}Q_{j\ell}\neq 0. A triangle T=(k,,m)T=(k,\ell,m) in the graph GG corresponds to a triple of projectors Q2k,Q3,Q4mQ_{2k},Q_{3\ell},Q_{4m} such that the pairwise products are all nonzero. Given encoding operators as in Equation 3.9, we use triangles to match their blocks.

Lemma 3.3 (Reduction Lemma).

Let {Q2k}\{Q_{2k}\}, {Q3}\{Q_{3\ell}\}, and {Q4m}\{Q_{4m}\} be sets of orthogonal projectors with the following properties:

  1. 1.

    kQ2k=Q3=mQ4m\sum_{k}Q_{2k}=\sum_{\ell}Q_{3\ell}=\sum_{m}Q_{4m}

  2. 2.

    For iji\neq j, for all k,k,\ell, QikQj0Q_{ik}Q_{j\ell}\neq 0 implies that Tr(RikRj)=0\operatorname{Tr}(R_{ik}R_{j\ell})=0.

  3. 3.

    For all i{2,3,4}i\in\{2,3,4\}, the {Rik}k\{R_{ik}\}_{k} are inequivalent.

Then there exists a triangle T=(k,,m)T=(k,\ell,m) and a unit vector |vA|v\rangle\in\mathcal{H}_{A^{\prime}} such that

Q2k|v=Q3|v=Q4m|v=|v.Q_{2k}|v\rangle=Q_{3\ell}|v\rangle=Q_{4m}|v\rangle=|v\rangle.

We shall assume for now that the Reduction Lemma holds. We show how this gives us an iterative decomposition procedure to construct the orthogonal projectors {Kr}\{K_{r}\} satisfying the conclusions of the Theorem.

The sets Q2(0){Q2k}Q_{2}^{(0)}\coloneqq\{Q_{2k}\}, Q3(0){Q3}Q_{3}^{(0)}\coloneqq\{Q_{3\ell}\}, and Q4(0){Q4m}Q_{4}^{(0)}\coloneqq\{Q_{4m}\} satisfy the required conditions of the Reduction Lemma with kQ2k=Q3=mQ4m=𝟙\sum_{k}Q_{2k}=\sum_{\ell}Q_{3\ell}=\sum_{m}Q_{4m}=\mathds{1}. Thus there exists a triangle T0=(k0,0,m0)T_{0}=(k_{0},\ell_{0},m_{0}) and a vector |v0|v_{0}\rangle that is a common eigenvector of Q2k0,Q30,Q4m0Q_{2k_{0}},Q_{3\ell_{0}},Q_{4m_{0}}. Thus we can write

Q2k0=|v0v0|+Q2k0Q30=|v0v0|+Q30Q4m0=|v0v0|+Q4m0Q_{2k_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{2k_{0}}^{\prime}\qquad Q_{3\ell_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{3\ell_{0}}^{\prime}\qquad Q_{4m_{0}}=|v_{0}\rangle\!\langle v_{0}|+Q_{4m_{0}}^{\prime}

where Q2k0Q_{2k_{0}}^{\prime}, Q30Q_{3\ell_{0}}^{\prime}, and Q4m0Q_{4m_{0}}^{\prime} are orthogonal projectors with rank one smaller.

Define the sets Q2(1),Q3(1),Q4(1)Q_{2}^{(1)},Q_{3}^{(1)},Q_{4}^{(1)} to be the sets Q2(0),Q3(0),Q4(0)Q_{2}^{(0)},Q_{3}^{(0)},Q_{4}^{(0)} with the projectors Q2k0,Q30,Q4m0Q_{2k_{0}},Q_{3\ell_{0}},Q_{4m_{0}} replaced by Q2k0,Q30,Q4m0Q_{2k_{0}}^{\prime},Q_{3\ell_{0}}^{\prime},Q_{4m_{0}}^{\prime}.

Observe that Q2(1),Q3(1),Q4(1)Q_{2}^{(1)},Q_{3}^{(1)},Q_{4}^{(1)} satisfies the required conditions of the Reduction Lemma, with

FQi(1)F=𝟙|v0v0|,\sum_{F\in Q_{i}^{(1)}}F\quad=\quad\mathds{1}-|v_{0}\rangle\!\langle v_{0}|\enspace,

for all i{2,3,4}i\in\left\{2,3,4\right\}. Applying the Reduction Lemma again, we find another triangle T1T_{1} and a common eigenvector |v1|v_{1}\rangle of the triangle. We continue this process of reducing the rank of at least one operator each in the sets Q2(r),Q3(r),Q4(r)Q_{2}^{(r)},Q_{3}^{(r)},Q_{4}^{(r)} and finding common eigenvectors |vr|v_{r}\rangle until we have fully expressed

Ui=τrKrRir,U_{i}^{\prime}=_{\tau}\sum_{r}K_{r}\otimes R_{ir}\enspace,

where Kr|vrvr|K_{r}\coloneqq|v_{r}\rangle\!\langle v_{r}|, and for every rr, the pairwise inner products satisfy Tr(R2rR3r)=Tr(R2rR4r)=Tr(R3rR4r)=0\operatorname{Tr}(R_{2r}R_{3r})=\operatorname{Tr}(R_{2r}R_{4r})=\operatorname{Tr}(R_{3r}R_{4r})=0. This concludes the proof of the Theorem. ∎

Before proving the Reduction Lemma we establish the following lemma, which claims that up to conjugation by the same unitary operator, the only collection of 2×22\times 2 traceless, Hermitian, unitary, mutually orthogonal matrices are the single-qubit Pauli matrices.

Lemma 3.4.

Let R2,R3,R4R_{2},R_{3},R_{4} be 2×22\times 2 unitary matrices that are traceless, Hermitian, and satisfy Tr(RiRj)=0\operatorname{Tr}(R_{i}R_{j})=0 for all iji\neq j. Then there exists a 2×22\times 2 unitary operator SS such that

R2=SZSR3=SXSR4=SYS.R_{2}=S{\mathrm{Z}}S^{*}\qquad R_{3}=S{\mathrm{X}}S^{*}\qquad R_{4}=S{\mathrm{Y}}S^{*}.
Proof.

We find a sequence of unitary operators S1,S2,S3S_{1},S_{2},S_{3} such that SS1S2S3S\coloneqq S_{1}^{*}S_{2}^{*}S_{3}^{*} satisfies the conclusions of the lemma. Because R2R_{2} is unitary, Hermitian and traceless, we can unitarily diagonalize it as R2=|aa||bb|R_{2}=|a\rangle\!\langle a|-|b\rangle\!\langle b|. Define S1S_{1} as the unitary operator with

S1|a=|0S1|b=|1.S_{1}|a\rangle=|0\rangle\qquad S_{1}|b\rangle=|1\rangle.

Let RiS1RiS1R_{i}^{\prime}\coloneqq S_{1}R_{i}S_{1}^{*} for i{2,3,4}i\in\{2,3,4\}. These are all traceless, Hermitian, pairwise orthogonal unitary matrices, and furthermore R2=ZR_{2}^{\prime}={\mathrm{Z}}. Suppose

R3=(rstu).R_{3}^{\prime}=\begin{pmatrix}r&s\\ t&u\end{pmatrix}\;.

Since Tr(R2R3)=0\operatorname{Tr}(R_{2}^{\prime}R_{3}^{\prime})=0 and R3R_{3}^{\prime} is traceless, we have that r=u=0r=u=0, and since R3R_{3}^{\prime} is Hermitian and unitary, we have s=t=eiθs=t^{*}={\mathrm{e}}^{{\mathrm{i}}\theta} for some θ[0,2π)\theta\in[0,2\pi). Let

S2(eiθ/200eiθ/2),S_{2}\coloneqq\begin{pmatrix}{\mathrm{e}}^{-{\mathrm{i}}\theta/2}&0\\ 0&{\mathrm{e}}^{{\mathrm{i}}\theta/2}\end{pmatrix}\enspace,

and Ri′′S2RiS2R_{i}^{\prime\prime}\coloneqq S_{2}R_{i}^{\prime}S_{2}^{*} for i{2,3,4}i\in\{2,3,4\}. Again, the operators Ri′′R_{i}^{\prime\prime} remain traceless, Hermitian unitary, and pairwise orthogonal, and furthermore R2′′=ZR_{2}^{\prime\prime}={\mathrm{Z}} and R3′′=XR_{3}^{\prime\prime}={\mathrm{X}}. Suppose

R4′′=(wxyz).R_{4}^{\prime\prime}=\begin{pmatrix}w&x\\ y&z\end{pmatrix}\;.

From Tr(R2′′R4′′)=0\operatorname{Tr}(R_{2}^{\prime\prime}R_{4}^{\prime\prime})=0, Hermiticity, unitarity, and tracelessness of R4′′R_{4}^{\prime\prime} we have again that w=z=0w=z=0 and x=y=eiϕx=y^{*}={\mathrm{e}}^{{\mathrm{i}}\phi} for some ϕ[0,2π)\phi\in[0,2\pi). From Tr(R3′′R4′′)=0\operatorname{Tr}(R_{3}^{\prime\prime}R_{4}^{\prime\prime})=0 we have that x=yx=-y, which means that x=±ix=\pm{\mathrm{i}}. If x=ix=-{\mathrm{i}}, then set S3𝟙S_{3}\coloneqq\mathds{1}. Otherwise, set S3iZS_{3}\coloneqq-{\mathrm{i}}\,{\mathrm{Z}}. Let Ri′′′S3Ri′′S3R_{i}^{\prime\prime\prime}\coloneqq S_{3}R_{i}^{\prime\prime}S_{3}^{*} for i{2,3,4}i\in\{2,3,4\}. We have that R2′′′=ZR_{2}^{\prime\prime\prime}={\mathrm{Z}}, R3′′′=XR_{3}^{\prime\prime\prime}={\mathrm{X}}, R4′′′=YR_{4}^{\prime\prime\prime}={\mathrm{Y}}. Thus, letting S=S1S2S3S=S_{1}^{*}S_{2}^{*}S_{3}^{*}, we obtain the desired conclusion of the lemma. ∎

We now turn to proving the Reduction Lemma.

Proof of Lemma 3.3.

Define Π=kQ2k\Pi=\sum_{k}Q_{2k}.

No two triangles in GG share an edge.

Suppose we have two triangles corresponding to projectors (Q2k,Q3,Q4m)(Q_{2k},Q_{3\ell},Q_{4m}) and (Q2k,Q3,Q4m)(Q_{2k},Q_{3\ell},Q_{4m^{\prime}}) for some k,l,m,mk,l,m,m^{\prime}. We then have the equations

Tr(R2kR3)=Tr(R2kR4m)=Tr(R3R4m)=Tr(R2kR4m)=Tr(R3R4m)=0.\operatorname{Tr}(R_{2k}R_{3\ell})=\operatorname{Tr}(R_{2k}R_{4m})=\operatorname{Tr}(R_{3\ell}R_{4m})=\operatorname{Tr}(R_{2k}R_{4m^{\prime}})=\operatorname{Tr}(R_{3\ell}R_{4m^{\prime}})=0.

By Lemma 3.4, this implies that there exists a unitary operator SS such that

R2k=SZSR3=SXSR4m=SYS.R_{2k}=S{\mathrm{Z}}S^{*}\qquad R_{3\ell}=S{\mathrm{X}}S^{*}\qquad R_{4m}=S{\mathrm{Y}}S^{*}\;.

Therefore we obtain that

Tr(ZSR4mS)=Tr(XSR4mS)=0.\operatorname{Tr}({\mathrm{Z}}S^{*}R_{4m^{\prime}}S)=\operatorname{Tr}({\mathrm{X}}S^{*}R_{4m^{\prime}}S)=0\enspace.

If

SR4mS=(abbd),S^{*}R_{4m^{\prime}}S=\begin{pmatrix}a&b\\ b^{*}&d\end{pmatrix}\enspace,

the above equation implies that a=d=0a=d=0 and b=bb=-b^{*}, or equivalently that b=±ib=\pm{\mathrm{i}}. Thus SR4mS=±Y=±SR4mSS^{*}R_{4m^{\prime}}S=\pm{\mathrm{Y}}=\pm S^{*}R_{4m}S, or in other words R4m=±R4mR_{4m}=\pm R_{4m^{\prime}}, contradicting the assumption that R4mR_{4m} and R4mR_{4m^{\prime}} are inequivalent.

Every vertex is in a triangle.

Consider Q2kQ_{2k} for some kk. There exists an index \ell such that Q2kQ30Q_{2k}Q_{3\ell}\neq 0, because the operators {Q3}\{Q_{3\ell}\} form a resolution of Π\Pi. Since {Q4m}\{Q_{4m}\} also forms an orthogonal resolution of Π\Pi, we have that

0Q2kQ3=Q2k(mQ4m)Q3.0\neq Q_{2k}Q_{3\ell}=Q_{2k}\left(\sum_{m}Q_{4m}\right)Q_{3\ell}\enspace.

This implies that there exists an index mm such that Q2kQ4m0Q_{2k}Q_{4m}\neq 0 and Q4mQ30Q_{4m}Q_{3\ell}\neq 0.

Finding a common eigenvector of a triangle.

Fix a triangle T=(k,,m)T=(k,\ell,m). For notational simplicity we shall write CQ2kC\coloneqq Q_{2k}, DQ3D\coloneqq Q_{3\ell}, and EQ4mE\coloneqq Q_{4m}. First, observe that if C(ΠD)E0C(\Pi-D)E\neq 0, then there exists some index \ell^{\prime} such that CQ3E0CQ_{3\ell^{\prime}}E\neq 0. This implies that (k,,m)(k,\ell^{\prime},m) forms a triangle in GG. But this cannot happen as this triangle would share the edge (k,m)(k,m) with TT. Therefore C(ΠD)E=0C(\Pi-D)E=0, i.e., CE=CDECE=CDE.

By symmetry, we also get that CD=CEDCD=CED and ED=ECDED=ECD. Thus we have

0CE=CDE=CEDE=CECDE=CDECDE.0\neq CE=CDE=CEDE=CECDE=CDECDE.

Since CDECDE is a product of three projectors, its spectral norm is at most 11. Let |v|v\rangle be a unit vector realizing the spectral norm of CDECDE, i.e. such that CDE|v=CDE>0\|CDE|v\rangle\|=\|CDE\|>0. Then

CDE=CDE|v=CDECDE|vCDECDECDE2.\displaystyle\|CDE\|=\|CDE|v\rangle\|=\|CDECDE|v\rangle\|\leq\|CDECDE\|\leq\|CDE\|^{2}.

The inequality CDECDE2\|CDE\|\leq\|CDE\|^{2} implies that CDE=1\|CDE\|=1 (since it is not zero and is at most one). Therefore |v|v\rangle is a vector such that C|v=D|v=E|v=|vC|v\rangle=D|v\rangle=E|v\rangle=|v\rangle. ∎

We now put everything in this section together to prove Theorem 1.1, which we restate here for convenience.

See 1.1

Proof.

Putting together Lemma 2.7, Theorem 3.1, and Theorem 3.2, we get that all superdense coding protocols (τ,(Ui))(\tau,(U_{i})) are locally equivalent to one that has a nice form and satisfies the conclusions of Theorem 3.2. Finally, we apply Lemma 3.4 to the conclusions of Theorem 3.2 to obtain the conclusions of Theorem 1.1. ∎

4 Superdense coding and orthogonal unitary bases

In this section, we prove that there are multiple non-equivalent superdense coding protocols for transmitting d2d^{2} messages for d3d\geq 3, even when no ancilla is used in the encoding process, and there is no error in decoding. This implies that rigidity of superdense coding protocols for d3d\geq 3 may only hold in a relaxed form: as we see in this section, rigidity may hold only up to the choice of an orthogonal unitary basis for the space of linear operators 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}).

4.1 The connection with unitary bases

We draw a connection between superdense coding and bases for the vector space of d×dd\times d complex matrices. Although this connection may be inferred from Lemma 2.7, we give a simple and direct derivation here.

For any integer d>1d>1, consider a protocol for superdense coding of d2d^{2} classical strings using a shared entangled state with local dimension dd, and a single dd-dimensional message. Assume that the protocol does not use any ancilla in the encoding process, and that there is no decoding error. Such a protocol necessarily has a simple form, as we describe below.

First, we argue that the initial shared state is maximally entangled. Bob’s state after the message has support in a d2d^{2}-dimensional space. Since there are d2d^{2} strings, and these are decoded without error, the corresponding states are orthogonal and pure. So the mixed state of the entire encoded state, corresponding to a uniformly random string, is completely mixed. However, the marginal of this state on the register initially held by Bob is be the same as the marginal for any fixed string. Thus Bob’s share of the initial state is also the dd-dimensional completely mixed state. This implies that the initial shared state is maximally entangled.

Any maximally entangled state with local dimension dd is of the form (UV)|ϕd(U\otimes V)|\upphi_{d}\rangle, where U,VU,V are unitary operators in 𝖴(d){\mathsf{U}}({\mathbb{C}}^{d}), and |ϕd1dk=0d1|k|k|\upphi_{d}\rangle\coloneqq\tfrac{1}{\sqrt{d}}\sum_{k=0}^{d-1}|k\rangle|k\rangle. Therefore, without loss of generality, we may assume that Alice and Bob initially share the state |ϕd|\upphi_{d}\rangle. When the dimension dd is clear from the context, we omit it from the subscript.

Second, since the encoding of any message is pure, Alice’s local operations satisfy the following properties. On input i[d2]i\in[d^{2}], Alice applies a unitary operator Ui𝖴(d)U_{i}\in{\mathsf{U}}({\mathbb{C}}^{d}) to her share of the state |ϕ|\upphi\rangle, and sends the share to the Bob. Since Bob can decode the input ii with probability 11, the states (Ui𝟙)|ϕ(U_{i}\otimes\mathds{1})|\upphi\rangle are all orthogonal, i.e., for all distinct i,j[d2]i,j\in[d^{2}], we have

ϕ|(UiUj𝟙)|ϕ=0.\langle\upphi|(U_{i}^{*}U_{j}\otimes\mathds{1})|\upphi\rangle\quad=\quad 0\enspace.

This condition is equivalent to the property that the operators UiU_{i} are mutually orthogonal with respect to the Hilbert-Schmidt inner product:

Tr(UiUj)=0, for all i,j[d2],ij.\operatorname{Tr}(U_{i}^{*}U_{j})\quad=\quad 0\enspace,\qquad\textrm{ for all }i,j\in[d^{2}],~{}i\neq j\enspace. (4.1)

Thus, the operators form an orthogonal unitary basis for the space of linear operators on d{\mathbb{C}}^{d}.

It is straightforward to verify that any such basis for 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}) leads to an errorless superdense coding protocol for d2d^{2} classical messages. Thus, the study of rigidity of superdense coding protocols as above is equivalent to the study of orthogonal unitary bases.

A well-known example of an orthogonal unitary basis in dimension dd is generated by the “clock” and “shift” operators. The elements of this basis are also known as the generalized Pauli operators or the Heisenberg-Weyl operators. Let ωdexp(2πid)\omega_{d}\coloneqq\exp\left(\tfrac{2\pi{\mathrm{i}}}{d}\right) be a primitive ddth root of unity. For i,j{0,1,,d1}i,j\in\left\{0,1,\dotsc,d-1\right\}, the (i,j)(i,j)th operator PijP_{ij} in the basis is defined as PijXdiZdjP_{ij}\coloneqq{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}, where Xdk=0d1|k+1(modd)k|{\mathrm{X}}_{d}\coloneqq\sum_{k=0}^{d-1}|k+1\pmod{d}\rangle\!\langle k| is the shift (or Pauli X) operator, and Zdk=0d1ωdk|kk|{\mathrm{Z}}_{d}\coloneqq\sum_{k=0}^{d-1}\omega_{d}^{k}|k\rangle\!\langle k| is the clock (or Pauli Z) operator.

4.2 Uniqueness of orthogonal unitary bases

Given an orthogonal unitary basis BB for 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}), we may derive other such bases by conjugating elements of BB by a pair of unitary operators, and mutliplying each basis element by a potentially different complex number of unit modulus. Since this is a rather straightforward method to derive new bases, we consider the new basis to be equivalent to BB.

Definition 4.1.

Let B1{Ui:i[d2]}B_{1}\coloneqq\left\{U_{i}:i\in[d^{2}]\right\} be an orthogonal unitary basis for 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}). We say that an orthogonal unitary basis B2B_{2} is equivalent to B1B_{1} if there exist unit complex numbers αi𝖴()\alpha_{i}\in{\mathsf{U}}({\mathbb{C}}) and a pair of unitary operators V,W𝖴(d)V,W\in{\mathsf{U}}({\mathbb{C}}^{d}) such that

B2={αiVUiW:i[d2]}.B_{2}\quad=\quad\left\{\alpha_{i}VU_{i}W:~{}~{}i\in[d^{2}]\right\}\enspace. (4.2)

We may verify that this defines an equivalence relation.

Another way to construct an orthogonal unitary basis is by taking tensor products of bases in lower dimensions. Suppose dd is composite, with d=d1d2d=d_{1}d_{2} and 1<d1,d2<d1<d_{1},d_{2}<d, and {Ui:i[d12]}\left\{U_{i}:i\in[d_{1}^{2}]\right\} and {Vj:j[d22]}\left\{V_{j}:j\in[d_{2}^{2}]\right\} are orthogonal unitary bases for 𝖫(d1){\mathsf{L}}({\mathbb{C}}^{d_{1}}) and 𝖫(d2){\mathsf{L}}({\mathbb{C}}^{d_{2}}), respectively. Then

{UiVj:i[d12];j[d22]}\left\{U_{i}\otimes V_{j}:i\in[d_{1}^{2}];~{}j\in[d_{2}^{2}]\right\}

is an orthogonal unitary basis for 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}). This hints at the possibility that are bases that are not equivalent to each other under operations as in Eq. (4.2). The following proposition confirms this for dimensions which are powers of two.

Proposition 4.2.

Suppose d=2kd=2^{k} for an integer k>1k>1. Let B1B_{1} be the basis for 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}) obtained by taking tensor products of kk two-dimensional Pauli X and Z operators, i.e.,

B1{i=1kPi:Pi{𝟙,X2,Z2,X2Z2}}.B_{1}\quad\coloneqq\quad\left\{\bigotimes_{i=1}^{k}P_{i}:P_{i}\in\left\{\mathds{1},{\mathrm{X}}_{2},{\mathrm{Z}}_{2},{\mathrm{X}}_{2}{\mathrm{Z}}_{2}\right\}\right\}\enspace.

The basis B1B_{1} is not equivalent to B2B_{2}, the dd-dimensional clock and shift basis.

Proof.

The intuition behind the statement is that tensor products of the two-dimensional Pauli operators in B1B_{1} all have at most two distinct eigenvalues (either 11, or ±1\pm 1, or ±i\pm{\mathrm{i}}), whereas some of the operators in the clock and shift basis have dd distinct complex eigenvalues. Due to the freedom available in generating equivalent bases, we need additional arguments to formalize this intuition.

Suppose that B1B_{1} and B2B_{2} are equivalent and consider unitary operators V,W𝖴(d)V,W\in{\mathsf{U}}({\mathbb{C}}^{d}) which show their equivalence. Consider the operator PB1P\in B_{1} that is mapped to the identity in B2B_{2} under the equivalence. Let α\alpha be a complex number of unit modulus such that αVPW=𝟙\alpha VPW=\mathds{1}. Then V=αWPV=\alpha^{*}W^{*}P^{*}.

Suppose the operator QB1Q\in B_{1} is mapped to the clock operator ZdB2{\mathrm{Z}}_{d}\in B_{2}, and that Zd=βVQW{\mathrm{Z}}_{d}=\beta VQW for some complex number β\beta. Then Zd=βαWPQW{\mathrm{Z}}_{d}=\beta\alpha^{*}W^{*}P^{*}QW. The operator on the right hand side has at most two eigenvalues (either βα\beta\alpha^{*}, or ±βα\pm\beta\alpha^{*}, or ±iβα\pm{\mathrm{i}}\beta\alpha^{*}) as PQP^{*}Q has eigenvalues 11, or ±1\pm 1, or ±i\pm{\mathrm{i}}. However, the clock operator Zd{\mathrm{Z}}_{d} has dd distinct eigenvalues, the ddth complex roots of unity. Since d4d\geq 4, we get a contradiction, and we conclude that B1B_{1} and B2B_{2} are not equivalent. ∎

It is then natural to wonder if there is a unique orthogonal unitary basis in prime dimensions, up to the equivalence defined above. In Section 4.4 we show that even this does not hold, by giving an explicit construction of a basis in any dimension d5d\geq 5 that is not equivalent to the clock and shift basis.

After our discovery of non-equivalent bases, we learned that the question of uniqueness has been studied before by Vollbrecht and Werner [VW00]. They prove the uniqueness of the basis consisting of the Pauli operators in dimension two, and state that the problem of characterizing orthogonal unitary bases in dimensions larger than two is open. They also give a construction of “shift-and-multiply” bases from a collection of dd complex Hadamard matrices of dimension d×dd\times d and a d×dd\times d Latin square. This construction and non-equivalent bases are discussed in more detail by Werner in subsequent work [Wer01], although the notion of equivalence there does not include multiplication by phases (complex numbers of unit modulus). Werner states without proof that the existence of non-equivalent bases in dimension at least five follows from the existence of non-equivalent Hadamard matrices or non-equivalent Latin squares, even when the dimension is prime. In dimension three, Werner describes how we may construct non-equivalent bases, but does not explicitly present them. We present a concrete instance of this construction in Proposition 4.9. Altogether, we have the following result.

Theorem 4.3.

For every dimension d3d\geq 3, there are orthogonal unitary bases that are not equivalent to the clock and shift basis.

The theorem implies that for any d3d\geq 3, there are non-equivalent superdense coding protocols for transmitting d2d^{2} messages, even when no ancilla is used in the encoding process, and there is no error in decoding.

Orthogonal unitary bases have also been studied in the context of quantum error-correction under the name “unitary error bases” (see, e.g., Ref. [MV16] and the references therein). In addition to the shift-and-multiply construction, several other methods such as the “Hadamard method” and the “algebraic method” have been proposed for their construction. The “quantum shift-and-multiply” method due to Musto and Vicary [MV16] simultaneously generalizes the shift-and-multiply and Hadamard methods. Musto and Vicary give examples of orthogonal unitary bases resulting from this method that are not equivalent to those derived from any of the other methods mentioned above. However, they give explicit examples only in dimension 4. As far as we can tell, earlier explicit constructions, for example those due to Klappenecker and Rötteler [KR03], were also for a few small dimensions.

4.3 Some useful properties

Here we present two properties that are used in an explicit construction leading to Theorem 4.3. The following property of the eigenvalues of the clock and shift operators helps in proving non-equivalence to another basis. Recall that ωdexp(2πid)\omega_{d}\coloneqq\exp\left(\tfrac{2\pi{\mathrm{i}}}{d}\right) is a primitive ddth root of unity.

Lemma 4.4.

Let d>1d>1 be an integer, and let a,b{0,1,,d1}a,b\in\left\{0,1,\dotsc,d-1\right\}. The eigenvalues of the operator XdaZdb{\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b} are all of the form

ωdlexp(ab(d1)πid),\omega_{d}^{l}\cdot\exp\!\left(\frac{ab(d-1)\pi{\mathrm{i}}}{d}\right)\enspace,

for some l{0,1,,d1}l\in\left\{0,1,\dotsc,d-1\right\}.

Proof.

Since XdZd=ωdZdXd{\mathrm{X}}_{d}\,{\mathrm{Z}}_{d}=\omega^{*}_{d}\,{\mathrm{Z}}_{d}{\mathrm{X}}_{d}, we have (XdaZdb)d=ωdabd(d1)/2XdadZdbd=ωdabd(d1)/2 1\left({\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b}\right)^{d}=\omega_{d}^{abd(d-1)/2}\,{\mathrm{X}}_{d}^{ad}\,{\mathrm{Z}}_{d}^{bd}=\omega_{d}^{abd(d-1)/2}\,\mathds{1}. So the eigenvalues of XdaZdb{\mathrm{X}}_{d}^{a}\,{\mathrm{Z}}_{d}^{b} are ddth complex roots of ωdabd(d1)/2\omega_{d}^{abd(d-1)/2}, and the lemma follows. ∎

We also use the following simple number-theoretic property in the construction of new orthogonal unitary bases.

Lemma 4.5.

Any integer d5d\geq 5 has at most d2d-2 positive integer divisors.

Proof.

If dd is prime, then it has exactly two positive integer divisors, 1,d1,d, and the lemma holds.

Suppose dd is composite and has prime factorization p1a1p2a2pkakp_{1}^{a_{1}}p_{2}^{a_{2}}\dotsb p_{k}^{a_{k}}, where kk and a1,a2,,aka_{1},a_{2},\dotsc,a_{k} are positive integers, and p1,p2,,pkp_{1},p_{2},\dotsc,p_{k} are distinct prime numbers arranged in increasing order. The number of positive integer divisors of dd equals (a1+1)(a2+1)(ak+1)(a_{1}+1)(a_{2}+1)\dotsb(a_{k}+1).

Since dd is composite, either k=1k=1 and a12a_{1}\geq 2, or k2k\geq 2.

Suppose k=1k=1. If p13p_{1}\geq 3, the lemma follows since n+1qn2n+1\leq q^{n}-2 for all positive integers n2n\geq 2, for any q3q\geq 3. If p1=2p_{1}=2, we have a13a_{1}\geq 3 since d=p1a15d=p_{1}^{a_{1}}\geq 5. Since n+12n2n+1\leq 2^{n}-2 for all integers n3n\geq 3, the lemma again follows.

Now suppose k2k\geq 2. Since n+1qnn+1\leq q^{n} and n+1rn1n+1\leq r^{n}-1 for all n1n\geq 1 whenever q2q\geq 2 and r3r\geq 3, the number of divisors of dd is bounded as

(a1+1)(a2+1)(ak+1)\displaystyle(a_{1}+1)(a_{2}+1)\dotsb(a_{k}+1) \displaystyle\leq p1a1(p2a21)p3a3pkak\displaystyle p_{1}^{a_{1}}(p_{2}^{a_{2}}-1)\,p_{3}^{a_{3}}\dotsb p_{k}^{a_{k}}
\displaystyle\leq p1a1p2a2p3a3pkakp1a1\displaystyle p_{1}^{a_{1}}p_{2}^{a_{2}}p_{3}^{a_{3}}\dotsb p_{k}^{a_{k}}-p_{1}^{a_{1}}
\displaystyle\leq d2,\displaystyle d-2\enspace,

as claimed. ∎

4.4 Explicit constructions

We now proceed to describe an explicit construction for all dimensions d5d\geq 5. The construction we give has the same form as the shift-and-multiply construction due to Vollbrecht and Werner [VW00]. In particular, the bases we present correspond to the construction with the Fourier transform over the cyclic group of order dd as the Hadamard matrix, and certain Latin squares that are not equivalent to the one generated by Xd{\mathrm{X}}_{d}.

We construct bases that are not equivalent to the clock and shift basis by introducing a modification. In particular, we replace the operators generated by Xd{\mathrm{X}}_{d} by another sequence. Note that the operators Xdi{\mathrm{X}}_{d}^{i} correspond to permutations on /d{\mathbb{Z}}/d{\mathbb{Z}}, i.e., they permute the standard basis of d{\mathbb{C}}^{d}. Conversely any permutation PP on /d{\mathbb{Z}}/d{\mathbb{Z}} corresponds to the operator a/d|P(a)a|\sum_{a\in{\mathbb{Z}}/d{\mathbb{Z}}}|P(a)\rangle\!\langle a| which permutes standard basis elements of d{\mathbb{C}}^{d}. So this is a bijection.

It is also helpful to view a permutation PP on /d{\mathbb{Z}}/d{\mathbb{Z}} as a perfect matching in the complete bipartite graph Kd,dK_{d,d}, with vertex aa in one part being matched with the vertex P(a)P(a) in the other. This mapping defines a bijection between permutations and perfect matchings. The construction we give relies on these three equivalent views of a permutation, and uses the same letter to refer to the corresponding matching and the linear operator on d{\mathbb{C}}^{d}.

We start with the following observation.

Lemma 4.6.

Let P0,P1,,Pd1P_{0},P_{1},\dotsc,P_{d-1} be a sequence of dd disjoint matchings in the graph Kd,dK_{d,d}. Then the matrices

{PiZdj:0i,j<d}\left\{P_{i}{\mathrm{Z}}_{d}^{j}:0\leq i,j<d\right\}

form an orthogonal unitary basis.

Proof.

Since PiP_{i} permutes the standard basis vectors of d{\mathbb{C}}^{d}, it is a unitary operator. Therefore PiZdjP_{i}{\mathrm{Z}}_{d}^{j} is also unitary.

Now consider the inner product of PiZdjP_{i}{\mathrm{Z}}_{d}^{j} and PkZdlP_{k}{\mathrm{Z}}_{d}^{l} for two pairs (i,j)(i,j) and (k,l)(k,l). We have

Tr(ZdjPi1PkZdl)=m=0d1ωd(lj)mPi(m)|Pk(m).\operatorname{Tr}({\mathrm{Z}}_{d}^{-j}P_{i}^{-1}P_{k}{\mathrm{Z}}_{d}^{l})\quad=\quad\sum_{m=0}^{d-1}\omega_{d}^{(l-j)m}\langle P_{i}(m)|P_{k}(m)\rangle\enspace.

If iki\neq k, the inner product is 0 since for any mm, the disjoint matchings PiP_{i} and PkP_{k} match the vertex mm to distinct vertices. If i=ki=k, but ljl\neq j, the inner product is again 0 as ωd\omega_{d} is a ddth root of unity. ∎

We show that to derive non-equivalent bases, it suffices to have two of the matchings satisfy simple properties.

Lemma 4.7.

Let d2d\geq 2, and P0,P1,,Pd1P_{0},P_{1},\dotsc,P_{d-1} be a sequence of dd disjoint matchings in Kd,dK_{d,d} such that

  1. 1.

    P0P_{0} is the identity permutation, i.e., matches vertex ii in one part to vertex ii in the other part; and

  2. 2.

    the permutation P1P_{1} has a cycle of length kk such that kk does not divide dd.

Then the basis B{PiZdj:0i,j<d}B\coloneqq\left\{P_{i}{\mathrm{Z}}_{d}^{j}:0\leq i,j<d\right\} is not equivalent to the clock and shift basis.

Proof.

The intuition here is the following. The operator corresponding to the permutation P1P_{1} has the kk distinct kkth roots of unity as eigenvalues. In particular, the eigenvalues include 11 and ωk\omega_{k}. On the other hand, the eigenvalues of any operator in the clock and shift basis are of the form γωdl\gamma\omega_{d}^{l} for some integer ll, and a fixed unit complex number γ𝖴()\gamma\in{\mathsf{U}}({\mathbb{C}}) depending only on the operator. Since kk does not divide dd, the operator P1P_{1} does not belong to the clock and shift basis.

Formally, suppose that BB is equivalent to the clock and shift basis, and the equivalence is given by unitary operators U,WU,W. Suppose that the identity operator P0BP_{0}\in B is mapped to XdiZdj{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j} and P1P_{1} is mapped to XdkZdl{\mathrm{X}}_{d}^{k}\,{\mathrm{Z}}_{d}^{l} under this equivalence. That is, XdiZdj=αVP0W=αVW{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}=\alpha VP_{0}W=\alpha VW and XdkZdl=βVP1W{\mathrm{X}}_{d}^{k}\,{\mathrm{Z}}_{d}^{l}=\beta\,VP_{1}W for some α,β𝖴()\alpha,\beta\in{\mathsf{U}}({\mathbb{C}}).

From the equation for P0P_{0} we have V=αXdiZdjWV=\alpha^{*}\,{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}W^{*}, so that XdkZdl=βαXdiZdjWP1W{\mathrm{X}}_{d}^{k}\,{\mathrm{Z}}_{d}^{l}=\beta\alpha^{*}\,{\mathrm{X}}_{d}^{i}\,{\mathrm{Z}}_{d}^{j}W^{*}P_{1}W. Equivalently, we have

αβωdmXdkiZdlj=WP1W,\alpha\beta^{*}\omega_{d}^{m}\,{\mathrm{X}}_{d}^{k-i}\,{\mathrm{Z}}_{d}^{l-j}\quad=\quad W^{*}P_{1}W\enspace, (4.3)

where m=(ki)jm=-(k-i)j. By Lemma 4.4, there is a fixed γ𝖴()\gamma\in{\mathsf{U}}({\mathbb{C}}) such that the eigenvalues of the operator on the left hand side of Eq. (4.3) are of the form γωdl\gamma\omega_{d}^{l} for some integer ll. On the other hand, the operator on the right hand side of the equation is similar to P1P_{1}. Since P1P_{1} has eigenvalues 11 and ωk\omega_{k}, we have 1=γωdm1=\gamma\omega_{d}^{m} and ωk=γωdn\omega_{k}=\gamma\omega_{d}^{n} for some integers m,nm,n. Eliminating γ\gamma, we get ωk=ωdnm\omega_{k}=\omega_{d}^{n-m}. This implies that

2πik=2πi(nm)d+2πip,\frac{2\pi{\mathrm{i}}}{k}\quad=\quad\frac{2\pi{\mathrm{i}}(n-m)}{d}+2\pi{\mathrm{i}}p\enspace,

for some integer pp, or equivalently that d=(nm+pd)kd=(n-m+pd)k. This is a contradiction, as kk does not divide dd. ∎

Finally, we prove that matchings as in the hypothesis of Lemma 4.7 exist.

Lemma 4.8.

For any integer d5d\geq 5, there is a sequence of dd disjoint matchings P0,P1,,Pd1P_{0},P_{1},\dotsc,P_{d-1} in Kd,dK_{d,d} such that

  1. 1.

    P0P_{0} is the identity permutation, i.e., matches vertex ii in one part with vertex ii in the other part; and

  2. 2.

    the permutation P1P_{1} has a cycle of length kk such that kk does not divide dd.

Proof.

By Lemma 4.5, for any d5d\geq 5, there is an integer k[2,d2]k\in[2,d-2] that does not divide dd.

Let P0P_{0} be the identity permutation, and let P1(0,1,,k1)(k,k+1,,d1)P_{1}\coloneqq(0,1,\dotsc,k-1)(k,k+1,\dotsc,d-1) be a permutation consisting of two cycles of length kk and dkd-k, respectively. The perfect matchings corresponding to P0P_{0} and P1P_{1} are disjoint, as P0P_{0} maps each element to itself while P1P_{1} cyclically shifts every element within each of its two cycles (both of which are of length at least two).

Consider the graph GG obtained by deleting the edges in the matchings P0P_{0} and P1P_{1} from Kd,dK_{d,d}. The graph GG is a (d2)(d-2)-regular bipartite graph. Thus, by the Hall theorem [Hal35], GG can be decomposed into (d2)(d-2) disjoint perfect matchings. ∎

Lemma 4.8 and Lemma 4.7 together imply that for any dimension d5d\geq 5, there are multiple non-equivalent orthogonal unitary bases. The same property holds for d=4d=4 due to Proposition 4.2, and for d=3d=3 due to Proposition 4.9 below. This proves Theorem 4.3.

Proposition 4.9.

There are orthogonal unitary bases for 𝖫(3){\mathsf{L}}({\mathbb{C}}^{3}) that are not equivalent to the clock and shift basis.

Proof.

Denote the clock and shift basis by BB. Note that BB is a commutative projective group under operator composition, i.e., it is closed under taking products of the operators and the operators commute, all up to some phase (a unit complex number) that may depend on the operators. We construct a basis BB^{\prime} such that the equivalence of BB and BB^{\prime} implies that BB^{\prime} also is a commutative projective group. However, the basis BB^{\prime} has elements that do not commute even up to a phase, which is a contradiction.

We construct BB^{\prime} following an idea due to Werner [Wer01]; see the discussion after Proposition 9 in the paper. Let Mβ|00|+|11|+|22|M\coloneqq\beta|0\rangle\!\langle 0|+|1\rangle\!\langle 1|+|2\rangle\!\langle 2|, where β𝖴()\beta\in{\mathsf{U}}({\mathbb{C}}) is a unit complex number such that β1\beta\neq 1. Let B{Uij:0i,j2}B^{\prime}\coloneqq\left\{U_{ij}:0\leq i,j\leq 2\right\}, where for any i{0,1,2}i\in\left\{0,1,2\right\}

Uij\displaystyle U_{ij}\quad\coloneqq {X3jZ3ij{0,2},andX3Z3iMj=1.\displaystyle\quad\begin{cases}{\mathrm{X}}_{3}^{j}\,{\mathrm{Z}}_{3}^{i}&\quad j\in\left\{0,2\right\}\enspace,\quad\text{and}\\ {\mathrm{X}}_{3}\,{\mathrm{Z}}_{3}^{i}M&\quad j=1\enspace.\end{cases}

We may verify that this is an orthogonal unitary basis for any choice of β𝖴()\beta\in{\mathsf{U}}({\mathbb{C}}).

Suppose the basis BB^{\prime} is equivalent to BB, and the equivalence is given by the operators V,W𝖴(3)V,W\in{\mathsf{U}}({\mathbb{C}}^{3}) and unit complex numbers αij𝖴()\alpha_{ij}\in{\mathsf{U}}({\mathbb{C}}). Consider the element X3aZ3b{\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b} of BB that corresponds to the operator U00BU_{00}\in B^{\prime}. We have X3aZ3b=α00VU00W=α00VW{\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b}=\alpha_{00}VU_{00}W=\alpha_{00}VW. Then W=α00VX3aZ3bW=\alpha_{00}^{*}V^{*}{\mathrm{X}}_{3}^{a}{\mathrm{Z}}_{3}^{b}, and

B={αijα00VUijVX3aZ3b:0i,j2}.B\quad=\quad\left\{\alpha_{ij}\,\alpha_{00}^{*}VU_{ij}V^{*}{\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b}~{}:~{}0\leq i,j\leq 2\right\}\enspace.

Since BB is closed under right multiplication by (X3aZ3b)({\mathrm{X}}_{3}^{a}\,{\mathrm{Z}}_{3}^{b})^{*} up to phases, the set of operators

{VUijMV:0i,j2}\left\{VU_{ij}M^{*}V^{*}:0\leq i,j\leq 2\right\}

is also a commutative projective group, as is the basis BB^{\prime}.

We show next that not all operators in the set BB^{\prime} commute, even up to a phase. Consider the operators U01U_{01} and U02U_{02}. These operators commute up to a phase if and only if there is a unit complex number γ\gamma such that

U01U02=γU02U01X3MX32=γX32X3M|00|+β|11|+|22|=γ(β|00|+|11|+|22|).\begin{array}[]{rrl}&U_{01}U_{02}\quad=&\gamma\,U_{02}U_{01}\\ \Longleftrightarrow&{\mathrm{X}}_{3}M\,{\mathrm{X}}_{3}^{2}\quad=&\gamma\,{\mathrm{X}}_{3}^{2}\,{\mathrm{X}}_{3}M\\ \Longleftrightarrow&|0\rangle\!\langle 0|+\beta|1\rangle\!\langle 1|+|2\rangle\!\langle 2|\quad=&\gamma\left(\beta|0\rangle\!\langle 0|+|1\rangle\!\langle 1|+|2\rangle\!\langle 2|\right)\enspace.\end{array}

This implies that γ=β=1\gamma=\beta=1. As we chose β1\beta\neq 1, this is a contradiction, and BB and BB^{\prime} are not equivalent. ∎

5 Random superdense coding protocols

In this section we study a random protocol for approximate superdense coding. Its analysis draws heavily on results in high-dimensional probability. We present these results in Section 5.1 and develop some properties of random entangled vectors in Section 5.2, before proceeding to the analysis in Section 5.3. Finally, in Section 5.4, we address a subtle issue that we encounter in the analysis.

5.1 Background from random matrix theory

In this section, we present some useful results from random matrix theory.

Definition 5.1 (Isotropic vector).

We say a random vector |𝛏n|{\bm{\xi}}\rangle\in{\mathbb{C}}^{n} is isotropic if 𝔼|𝛏𝛏|=𝟙\operatorname{{\mathbb{E}}}|{\bm{\xi}}\rangle\!\langle{\bm{\xi}}|=\mathds{1}.

Random variables which have tails that decay as fast as the normal distribution play an important role in high dimensional probability. Let Sn1S^{n-1} denote the set of unit vectors in n{\mathbb{C}}^{n}.

Definition 5.2 (Sub-gaussian random variables and vectors).

A random variable 𝐱{\bm{x}}\in{\mathbb{C}} is sub-gaussian if there exists a parameter κ>0\kappa>0 such that

Pr(|𝒙|t)2exp(t2/κ2)\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa^{2}\big{)}

for all t0t\geq 0. The sub-gaussian norm of 𝐱{\bm{x}}, denoted by 𝐱ψ2\|{\bm{x}}\|_{{\uppsi_{2}}}, is defined as

𝒙ψ2inf{t>0:𝔼exp(|𝒙|2/t2)2}.\|{\bm{x}}\|_{{\uppsi_{2}}}\quad\coloneqq\quad\inf\left\{t>0:\operatorname{{\mathbb{E}}}\exp(\left\lvert{\bm{x}}\right\rvert^{2}/t^{2})\leq 2\right\}\enspace.

A random vector |𝐯n|{\bm{v}}\rangle\in{\mathbb{C}}^{n} is sub-gaussian if for all unit vectors |uSn1|u\rangle\in S^{n-1}, the inner product u|𝐯\langle u|{\bm{v}}\rangle is sub-gaussian. The sub-gaussian norm of |𝐯|{\bm{v}}\rangle is defined as

𝒗ψ2supuSn1u|𝒗ψ2.\left\|{\bm{v}}\right\|_{{\uppsi_{2}}}\quad\coloneqq\quad\sup_{u\in S^{n-1}}\left\|\langle u|{\bm{v}}\rangle\right\|_{{\uppsi_{2}}}\enspace.

Sub-gaussian norm can be characterized in multiple ways. The following lemma describes two of them; see [Ver18, Proposition 2.5.2, Section 2.5.1].

Lemma 5.3.

There are positive universal constants c1,c2c_{1},c_{2} such that for any random variable 𝐱{\bm{x}}\in{\mathbb{C}},

  1. 1.

    If for some parameter κ1>0\kappa_{1}>0,

    Pr(|𝒙|t)2exp(t2/κ12)\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa_{1}^{2}\big{)}

    for all t0t\geq 0, then 𝒙{\bm{x}} is sub-gaussian and 𝒙ψ2c1κ1\left\|{\bm{x}}\right\|_{\uppsi_{2}}\leq c_{1}\kappa_{1}.

  2. 2.

    If 𝒙{\bm{x}} is sub-gaussian with 𝒙ψ2κ2\left\|{\bm{x}}\right\|_{\uppsi_{2}}\leq\kappa_{2} for some parameter κ2>0\kappa_{2}>0, then

    Pr(|𝒙|t)2exp(t2/c2κ22)\Pr(|{\bm{x}}|\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/c_{2}\kappa_{2}^{2}\big{)}

    for all t0t\geq 0.

The following theorem gives a sharp bound on the largest singular value of a class of random matrices; see the text by Verhsynin [Ver18] for this and related results. Vershynin states the result for real matrices, but the proof extends to complex matrices in a straightforward manner.

Theorem 5.4 ([Ver18], Theorem 4.6.1).

Let 𝐀i=1m|i𝐱i|{\bm{A}}\coloneqq\sum_{i=1}^{m}|i\rangle\!\langle{\bm{x}}_{i}| be a complex m×nm\times n matrix whose rows |𝐱i|{\bm{x}}_{i}\rangle are independent, mean zero, sub-gaussian isotropic vectors in n{\mathbb{C}}^{n}. Then there is a universal constant c>0c>0 such that for all t0t\geq 0, we have

𝑨m+cκ2(n+t)\|{\bm{A}}\|\quad\leq\quad\sqrt{m}+c\kappa^{2}(\sqrt{n}+t)

with probability at least 12exp(t2)1-2\exp(-t^{2}), where κmaxi𝐱iψ2\kappa\coloneqq\max_{i}\|{\bm{x}}_{i}\|_{{\uppsi_{2}}}.

Let 2\|\cdot\|_{2} denote the Hilbert-Schmidt norm on 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}):

A2Tr(AA).\|A\|_{2}\quad\coloneqq\quad\sqrt{\operatorname{Tr}(A^{*}A)}\enspace.

This norm induces the following 2\ell_{2}-sum metric on (𝖴(d))m\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}:

(U1,U2,,Um)(V1,V2,,Vm)2(i=1mUiVi22)1/2.\left\|(U_{1},U_{2},\dotsc,U_{m})-(V_{1},V_{2},\dotsc,V_{m})\right\|_{2}\quad\coloneqq\quad\left(\sum_{i=1}^{m}\left\|U_{i}-V_{i}\right\|_{2}^{2}\right)^{1/2}\enspace.

Let f:(𝖴(d))mf:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}\rightarrow{\mathbb{R}} be a continuous function. We say ff is κ\kappa-Lipschitz with respect to the 2\ell_{2}-sum of Hilbert-Schmidt metrics if for all (Ui),(Vi)(𝖴(d))m(U_{i}),(V_{i})\in\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}, we have

|f(U1,U2,,Um)f(V1,V2,,Vm)|κ(U1,U2,,Um)(V1,V2,,Vm)2.\left|f(U_{1},U_{2},\dotsc,U_{m})-f(V_{1},V_{2},\dotsc,V_{m})\right|\quad\leq\quad\kappa\left\|(U_{1},U_{2},\dotsc,U_{m})-(V_{1},V_{2},\dotsc,V_{m})\right\|_{2}\enspace.

Let 𝑼i𝖴(d){\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d}), 1im1\leq i\leq m be i.i.d. Haar-random unitary operators. If κ\kappa is sufficiently smaller than the dimension dd, with high probability, the random variable f(𝑼1,𝑼2,,𝑼m)f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m}) is close to its expectation. This concentration of measure property is formalized by the following theorem, which is a special case of Theorem 5.17 in the book on random matrix theory by Meckes [Mec19].

Theorem 5.5 ([Mec19], Theorem 5.17, page 159).

Let 𝐔i𝖴(d){\bm{U}}_{i}\in{\mathsf{U}}({\mathbb{C}}^{d}), i[m]i\in[m], be i.i.d. random unitary operators chosen according to the Haar measure. Suppose the function f:(𝖴(d))mf:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{m}\rightarrow{\mathbb{R}} is κ\kappa-Lipschitz with respect to the 2\ell_{2}-sum of Hilbert-Schmidt metrics, with κ>0\kappa>0. Then for every positive real number tt, we have

Pr(f(𝑼1,𝑼2,,𝑼m)φ+t)exp((d2)t224κ2),\Pr\!\left(f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m})\geq\varphi+t\right)\quad\leq\quad\exp\!\left(-\frac{(d-2)t^{2}}{24\kappa^{2}}\right)\enspace,

where φ𝔼f(𝐔1,𝐔2,,𝐔m)\varphi\coloneqq\operatorname{{\mathbb{E}}}f({\bm{U}}_{1},{\bm{U}}_{2},\dotsc,{\bm{U}}_{m}).

The Marčenko–Pastur theorem characterises the spectrum of a wide class of random matrices in the limit of large dimension. We rely on a version of the theorem due to Yaskov [Yas16] that applies to matrices whose entries need not all be independent. While Yaskov states the result for real matrices, the proof extends to complex matrices with straightforward modifications. We sketch the observations and the modifications which enable this extension after the statement of the theorem.

The columns of the random matrices we consider satisfy a certain asymptotic isotropy condition.

Definition 5.6.

Let m(n)m(n) be a sequence of positive integers such that mm\to\infty as nn\to\infty. Let (|𝐱m)(|{\bm{x}}_{m}\rangle) be a sequence of random vectors with |𝐱mm|{\bm{x}}_{m}\rangle\in{\mathbb{C}}^{m}. We say that the sequence (|𝐱m)(|{\bm{x}}_{m}\rangle) is pseudo-isotropic if for all sequences of complex matrices (Am)(A_{m}) with Amm×mA_{m}\in{\mathbb{C}}^{m\times m} and with uniformly bounded spectral norm (i.e., Amκ\|A_{m}\|\leq\kappa for all mm for a universal constant κ\kappa),

1m(𝒙m|Am|𝒙mTr(Am))P0\frac{1}{m}\left(\langle{\bm{x}}_{m}|A_{m}|{\bm{x}}_{m}\rangle-\operatorname{Tr}(A_{m})\right)\overset{{\mathrm{P}}}{\longrightarrow}0

as mm\to\infty.

Define the empirical spectral distribution (ESD) of an m×mm\times m positive semi-definite matrix AA as 1mi=1mδ(xλi)\tfrac{1}{m}\sum_{i=1}^{m}\operatorname{\updelta}(x-\lambda_{i}), where (λi:i[m])(\lambda_{i}:i\in[m]) are the eigenvalues of AA, and δ\operatorname{\updelta} is the Dirac-delta function. This is the probability density function of a uniformly random eigenvalue of AA.

Theorem 5.7 (Marčenko-Pastur law [Yas16]).

Fix an r>0r>0, and let m,nm,n be integers with n,m1n,m\geq 1 and mm a function of nn such that m/nrm/n\to r as nn\to\infty. For each mm, let |𝐱m|{\bm{x}}_{m}\rangle in m{\mathbb{C}}^{m} be a random vector such that the sequence of vectors (|𝐱m)(|{\bm{x}}_{m}\rangle) is pseudo-isotropic. Let (𝐌n,m)({\bm{M}}_{n,m}) be a sequence of m×nm\times n random matrices whose columns are i.i.d. copies of the random vector |𝐱m|{\bm{x}}_{m}\rangle, and let 𝛍n,m{\bm{\mu}}_{n,m} be the ESD of the matrix 1n𝐌n,m𝐌n,m\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*}. Then, as nn\to\infty, the ESD 𝛍n,m{\bm{\mu}}_{n,m} converges weakly to the density pr\operatorname{p}_{r} almost surely, where

pr(x)max{0,11/r}δ(x)+(xa)(bx)2πrx 1(axb),\operatorname{p}_{r}(x)\quad\coloneqq\quad\max\left\{0,1-1/r\right\}\operatorname{\updelta}(x)+\frac{\sqrt{(x-a)(b-x)}}{2\pi rx}\;\mathbf{1}(a\leq x\leq b)\enspace,

with a(1r)2a\coloneqq(1-\sqrt{r}\,)^{2}, b(1+r)2b\coloneqq(1+\sqrt{r}\,)^{2}.

In other words, as nn\to\infty, with probability 11, the cumulative distribution function of a uniformly random eigenvalue of the matrix 1n𝑴n,m𝑴n,m\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*} converges point-wise to that given by the probability density function prp_{r}.

Theorem 5.7 follows from the proof of Theorem 2.1 in Ref. [Yas16] by noting the following points. The eigenvalues of the matrix 1n𝑴n,m𝑴n,m\frac{1}{n}{\bm{M}}_{n,m}{\bm{M}}_{n,m}^{*} are all real, and therefore the Stieltjes continuity theorem applies to 𝝁n,m{\bm{\mu}}_{n,m}. Further, the Sherman-Morrison formula also extends to the sum A+|uv|A+|u\rangle\!\langle v|, where AA is an invertible m×mm\times m complex matrix, and |u,|v|u\rangle,|v\rangle are in m{\mathbb{C}}^{m}: the matrix A+|uv|A+|u\rangle\!\langle v| is invertible if and only if 1+v|A1|u01+\langle v|A^{-1}|u\rangle\neq 0, and if the latter condition holds,

(A+|uv|)1=A1A1|uv|A11+v|A1|u.(A+|u\rangle\!\langle v|)^{-1}\quad=\quad A^{-1}-\frac{A^{-1}|u\rangle\!\langle v|A^{-1}}{1+\langle v|A^{-1}|u\rangle}\enspace.

We can prove that the Stieltjes transform 𝒔n(z){\bm{s}}_{n}(z) of 𝝁n,m{\bm{\mu}}_{n,m} tends to its expectation 𝔼𝒔n(z)\operatorname{{\mathbb{E}}}{\bm{s}}_{n}(z) almost surely as nn\to\infty, following Step 1 in the proof of Theorem 1.1 in Ref. [BZ08]. The rest of the proof in Ref. [Yas16] now extends to the case of interest to us by replacing all instances of the transpose of a real vector by the conjugate transpose of the corresponding complex vector.

5.2 Pseudo-isotropy of random maximally entangled vectors

In this section, we develop properties of linear operators with certain symmetries, and use these to prove that a sequence of random maximally entangled vectors is pseudo-isotropic. This property is later used in the analysis of a random superdense coding protocol.

We consider operators on dddd{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}, and label the four tensor factors with A,B,C,DA,B,C,D, respectively. As is the convention in quantum information, we use superscripts to indicate the tensor factors on which an operator acts. Let Fi,j=1d|i,jj,i|{\mathrm{F}}\coloneqq\sum_{i,j=1}^{d}|i,j\rangle\!\langle j,i| be the swap operator on dd{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}; it permutes the two tensor factors.

Lemma 5.8.

Let W𝖫(dddd)W\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}). Suppose WW commutes with 𝟙ABUCUD\mathds{1}^{AB}\otimes U^{C}\otimes U^{D} for all unitary operators U𝖴(d)U\in{\mathsf{U}}({\mathbb{C}}^{d}), as well as with UAUB𝟙CDU^{A}\otimes U^{B}\otimes\mathds{1}^{CD}. Then WW is a linear combination of operators of the form PABQCDP^{AB}\otimes Q^{CD}, where P,Q{𝟙,F}P,Q\in\left\{\mathds{1},{\mathrm{F}}\right\}.

Proof.

Let (Ei)(E_{i}) be a basis for the vector space 𝖫(dd){\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}). We may express WW as W=i=1d2EiWiW=\sum_{i=1}^{d^{2}}E_{i}\otimes W_{i} for some operators Wi𝖫(dd)W_{i}\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}). Since WW commutes with 𝟙ABUCUD\mathds{1}^{AB}\otimes U^{C}\otimes U^{D}, we have

i=1d2EiWi=i=1d2Ei(UU)Wi(UU),\sum_{i=1}^{d^{2}}E_{i}\otimes W_{i}\quad=\quad\sum_{i=1}^{d^{2}}E_{i}\otimes(U\otimes U)W_{i}(U^{*}\otimes U^{*})\enspace,

for all U𝖴(d)U\in{\mathsf{U}}({\mathbb{C}}^{d}). Since the operators EiE_{i} form a basis, we conclude that

Wi=(UU)Wi(UU),W_{i}\quad=\quad(U\otimes U)W_{i}(U^{*}\otimes U^{*})\enspace,

i.e., the operator WiW_{i} commutes with UUU\otimes U for every ii. As a consequence of the von Neumann double commutant theorem [Wat18, Theorem 7.15, Section 7.1], each operator WiW_{i} may be written as a linear combination of {𝟙,F}\left\{\mathds{1},{\mathrm{F}}\right\}. So

W=i=1d2Ei(αi𝟙+βiF),W\quad=\quad\sum_{i=1}^{d^{2}}E_{i}\otimes(\alpha_{i}\mathds{1}+\beta_{i}{\mathrm{F}})\enspace,

for some complex numbers αi,βi\alpha_{i},\beta_{i}. Rearranging the sum, we get that

W=G𝟙+HF,W\quad=\quad G\otimes\mathds{1}+H\otimes{\mathrm{F}}\enspace,

for some operators G,H𝖫(dd)G,H\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}). Since WW commutes with UAUB𝟙CDU^{A}\otimes U^{B}\otimes\mathds{1}^{CD} as well, and 𝟙\mathds{1} and F{\mathrm{F}} are linearly independent, by [Wat18, Theorem 7.15, Section 7.1] we similarly get that GG and HH are also linear combinations of {𝟙,F}\left\{\mathds{1},{\mathrm{F}}\right\}. The lemma follows. ∎

Consider the random vector |𝝍|{\bm{\psi}}\rangle defined as |𝝍(𝑼𝟙)|ϕd|{\bm{\psi}}\rangle\coloneqq({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle, where 𝑼𝖴(d){\bm{U}}\in{\mathsf{U}}({\mathbb{C}}^{d}) is a Haar-random unitary operator and |ϕd|\upphi_{d}\rangle is the maximally entangled state 1dk=1d|k|k\tfrac{1}{\sqrt{d}}\sum_{k=1}^{d}|k\rangle|k\rangle with local dimension dd. We would like to compute a closed form expression for the operator MM on dddd\mathbb{C}^{d}\otimes\mathbb{C}^{d}\otimes\mathbb{C}^{d}\otimes\mathbb{C}^{d} defined as:

M𝔼|𝝍𝝍|2.M\quad\coloneqq\quad\operatorname*{\mathbb{E}}|{\bm{\psi}}\rangle\!\langle{\bm{\psi}}|^{\otimes 2}\enspace. (5.1)

We use the symmetries of MM in order to do so.

Lemma 5.9.

Let MM be the operator defined in Eq. (5.1). Then

M=β[𝟙+FACFBD]+γ[𝟙ACFBD+FAC𝟙BD],M\quad=\quad\beta\left[\mathds{1}+{\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD}\right]+\gamma\left[\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD}+{\mathrm{F}}^{AC}\otimes\mathds{1}^{BD}\right]\enspace,

where βd2(d21)1\beta\coloneqq d^{-2}(d^{2}-1)^{-1} and γd3(d21)1\gamma\coloneqq-d^{-3}(d^{2}-1)^{-1}.

Proof.

Since

M=𝔼(𝑼A𝟙B)|ϕdϕd|(𝑼A𝟙B)(𝑼C𝟙D)|ϕdϕd|(𝑼C𝟙D),M\quad=\quad\operatorname*{\mathbb{E}}~{}({\bm{U}}^{A}\otimes\mathds{1}^{B})|\upphi_{d}\rangle\!\langle\upphi_{d}|({\bm{U}}^{*A}\otimes\mathds{1}^{B})\otimes({\bm{U}}^{C}\otimes\mathds{1}^{D})|\upphi_{d}\rangle\!\langle\upphi_{d}|({\bm{U}}^{*C}\otimes\mathds{1}^{D})\enspace,

and 𝑼{\bm{U}} is Haar random, the operator MM commutes with VAVC𝟙BDV^{A}\otimes V^{C}\otimes\mathds{1}^{BD} for all V𝖴(d)V\in{\mathsf{U}}({\mathbb{C}}^{d}). Further, since (𝟙V)|ϕd=(V𝟙)|ϕd(\mathds{1}\otimes V)|\upphi_{d}\rangle=(V^{\top}\otimes\mathds{1})|\upphi_{d}\rangle, the operator MM also commutes with 𝟙ACVBVD\mathds{1}^{AC}\otimes V^{B}\otimes V^{D}. By Lemma 5.8, we have

M=α𝟙+β(FACFBD)+γ(𝟙ACFBD)+δ(FAC𝟙BD).M\quad=\quad\alpha\mathds{1}+\beta({\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD})+\gamma(\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})+\delta({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})\enspace. (5.2)

Consider the following linear functionals:

  1. 1.

    XTr(X)X\mapsto\operatorname{Tr}(X)

  2. 2.

    XTr((FACFBD)X)X\mapsto\operatorname{Tr}(({\mathrm{F}}^{AC}\otimes{\mathrm{F}}^{BD})X)

  3. 3.

    XTr((FAC𝟙BD)X)X\mapsto\operatorname{Tr}(({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})X)

  4. 4.

    XTr((𝟙ACFBD)X)X\mapsto\operatorname{Tr}((\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})X)

We apply these functionals to both sides of Eq. (5.2). We calculate the value of the functional on the left hand side directly from the definition of MM, i.e., Eq. (5.1), and on the right hand side from Eq. (5.2). We thus obtain the following linear equations:

  1. 1.

    1=αd4+βd2+γd3+δd31=\alpha d^{4}+\beta d^{2}+\gamma d^{3}+\delta d^{3},

  2. 2.

    1=αd2+βd4+γd3+δd31=\alpha d^{2}+\beta d^{4}+\gamma d^{3}+\delta d^{3},

  3. 3.

    1/d=αd3+βd3+γd2+δd41/d=\alpha d^{3}+\beta d^{3}+\gamma d^{2}+\delta d^{4}, and

  4. 4.

    1/d=αd3+βd3+γd4+δd21/d=\alpha d^{3}+\beta d^{3}+\gamma d^{4}+\delta d^{2}, respectively.

Solving for α,β,γ,δ\alpha,\beta,\gamma,\delta, we get the unique solution

α=β=1d2(d21),andγ=δ=1d3(d21).\alpha=\beta=\frac{1}{d^{2}(d^{2}-1)}\enspace,\quad\text{and}\quad\gamma=\delta=\frac{-1}{d^{3}(d^{2}-1)}\enspace.

Let nd2n\coloneqq d^{2}. Consider the random vector |𝝃ndd|{\bm{\xi}}_{n}\rangle\in{\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d} defined as |𝝃nd|𝝍=d(𝑼𝟙)|ϕd|{\bm{\xi}}_{n}\rangle\coloneqq d|{\bm{\psi}}\rangle=d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle. We prove that the sequence of these vectors is pseudo-isotropic.

Lemma 5.10.

The sequence of vectors (|𝛏n)(|{\bm{\xi}}_{n}\rangle) is pseudo-isotropic.

Proof.

Let (An𝖫(dd):n1)(A_{n}\in{\mathsf{L}}({\mathbb{C}}^{d}\otimes{\mathbb{C}}^{d}):n\geq 1) be a sequence of complex matrices with spectral norm An\left\|A_{n}\right\| bounded by a constant κ\kappa, for each nn. We use the Chebyshev Inequality to show that

1n(𝝃n|An|𝝃nTr(An))P0\displaystyle\frac{1}{n}\left(\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle-\operatorname{Tr}(A_{n})\right)\overset{{\mathrm{P}}}{\longrightarrow}0 (5.3)

as nn\to\infty. Let 𝒙n{\bm{x}}_{n} be the complex random variable defined as 𝒙n𝝃n|An|𝝃n{\bm{x}}_{n}\coloneqq\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle. We may verify that 𝔼|𝝃n𝝃n|=𝟙\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|=\mathds{1}, so that 𝔼𝒙n=𝔼Tr(|𝝃n𝝃n|An)=Tr(An)\operatorname{{\mathbb{E}}}{\bm{x}}_{n}=\operatorname{{\mathbb{E}}}\operatorname{Tr}(|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|A_{n})=\operatorname{Tr}(A_{n}). Eq. (5.3) is equivalent to showing that for every ϵ>0\epsilon>0, Pr(|𝒙n𝔼𝒙n|>ϵn)0\Pr(\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert>\epsilon n)\to 0 as nn\to\infty. By the Chebyshev Inequality,

Pr(|𝒙n𝔼𝒙n|>ϵn)1ϵ2n2𝔼|𝒙n𝔼𝒙n|2.\Pr(\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert>\epsilon n)\quad\leq\quad\frac{1}{\epsilon^{2}n^{2}}\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}\enspace.

So it suffices to show that the variance of 𝒙n{\bm{x}}_{n} is o(n2)\mathrm{o}(n^{2}).

The variance 𝔼|𝒙n𝔼𝒙n|2=𝔼|𝒙n|2|𝔼𝒙n|2=𝔼|𝒙n|2|Tr(An)|2\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}-\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}=\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{{\mathbb{E}}}{\bm{x}}_{n}\right\rvert^{2}=\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}. To calculate the second moment of 𝒙n{\bm{x}}_{n}, we rewrite it as follows.

𝔼|𝒙n|2\displaystyle\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}\quad =𝔼𝝃n|An𝝃n𝝃n|An|𝝃n\displaystyle=\quad\operatorname{{\mathbb{E}}}~{}\langle{\bm{\xi}}_{n}|A_{n}|{\bm{\xi}}_{n}\rangle\langle{\bm{\xi}}_{n}|A_{n}^{*}|{\bm{\xi}}_{n}\rangle
=𝔼Tr[(|𝝃n𝝃n||𝝃n𝝃n|)(AnAn)]\displaystyle=\quad\operatorname{{\mathbb{E}}}~{}\operatorname{Tr}\big{[}(|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|\otimes|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|)(A_{n}\otimes A_{n}^{*})\big{]}
=n2Tr[M(AnAn)],\displaystyle=\quad n^{2}\operatorname{Tr}\big{[}M(A_{n}\otimes A_{n}^{*})\big{]}\enspace,

where MM is the matrix defined in Eq. (5.1). By Lemma 5.9, and the Hölder Inequality (namely, |Tr(AB)|AtrB\left\lvert\operatorname{Tr}(AB)\right\rvert\leq\left\|A\right\|_{\mathrm{tr}}\left\|B\right\|),

𝔼|𝒙n|2\displaystyle\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}\quad =n2[β(|Tr(An)|2+Tr(AnAn))\displaystyle=\quad n^{2}\Big{[}\beta\big{(}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\operatorname{Tr}(A_{n}^{*}A_{n})\big{)}
+γ(Tr((FAC𝟙BD)(AnAn))+Tr((𝟙ACFBD)(AnAn)))]\displaystyle\qquad\mbox{}+\gamma\big{(}\operatorname{Tr}\big{(}({\mathrm{F}}^{AC}\otimes\mathds{1}^{BD})(A_{n}\otimes A_{n}^{*})\big{)}+\operatorname{Tr}\big{(}(\mathds{1}^{AC}\otimes{\mathrm{F}}^{BD})(A_{n}\otimes A_{n}^{*})\big{)}\big{)}\Big{]}
n2[β(|Tr(An)|2+κ2n)+2|γ|κ2n2],\displaystyle\leq\quad n^{2}\Big{[}\beta\big{(}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\kappa^{2}n\big{)}+2\left\lvert\gamma\right\rvert\kappa^{2}n^{2}\Big{]}\enspace,

where β=1/n(n1)\beta=1/n(n-1) and γ=1/n3/2(n1)\gamma=-1/n^{3/2}(n-1). Thus the variance is bounded as

𝔼|𝒙n|2|Tr(An)|2\displaystyle\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}_{n}\right\rvert^{2}-\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}\quad 1n1|Tr(An)|2+κ2n2n1+2κ2n5/2n1,\displaystyle\leq\quad\frac{1}{n-1}\left\lvert\operatorname{Tr}(A_{n})\right\rvert^{2}+\frac{\kappa^{2}n^{2}}{n-1}+\frac{2\kappa^{2}n^{5/2}}{n-1}\enspace,

which is o(n2)\mathrm{o}(n^{2}) as |Tr(An)|κn\left\lvert\operatorname{Tr}(A_{n})\right\rvert\leq\kappa n. This proves that the sequence (|𝝃n)(|{\bm{\xi}}_{n}\rangle) is pseudo-isotropic. ∎

5.3 Analysis of a random protocol

Consider the following random protocol 𝚷d{\bm{\Pi}}_{d}. Let dd be an integer 2\geq 2, and nd2n\coloneqq d^{2}. Alice and Bob agree on a choice of nn independently chosen Haar-random unitary operators 𝑼1,,𝑼n𝖴(d){\bm{U}}_{1},\dotsc,{\bm{U}}_{n}\in{\mathsf{U}}({\mathbb{C}}^{d}). They also share the maximally entangled state |ϕd1dk=1d|k|k|\upphi_{d}\rangle\coloneqq\tfrac{1}{\sqrt{d}}\sum_{k=1}^{d}|k\rangle|k\rangle with local dimension dd. When Alice gets message i[n]i\in[n], she applies 𝑼i{\bm{U}}_{i} to her half of |ϕd|\upphi_{d}\rangle, and sends it over to Bob. Bob now holds the state |𝝍i(𝑼i𝟙)|ϕd|{\bm{\psi}}_{i}\rangle\coloneqq({\bm{U}}_{i}\otimes\mathds{1})|\upphi_{d}\rangle. He performs an optimal measurement to identify ii, given that the state is drawn from the ensemble 𝓔d(|𝝍j:j[n]){\bm{\mathcal{E}}}_{d}\coloneqq\big{(}|{\bm{\psi}}_{j}\rangle:j\in[n]\big{)}.

Aram Harrow (personal communication) suggested the protocol 𝚷d{\bm{\Pi}}_{d} as a candidate for an approximate (d,ϵ)(d,\epsilon)-superdense coding protocol with vanishing error ϵ\epsilon in the limit of large dimension. If this random construction of superdense coding protocols did indeed have error that vanishes rapidly as a function of dd, then this could potentially refute 1.3. This is formalized by the following proposition (which was stated in Section 1.3, and is reproduced here for convenience).

See 1.4

Proof.

Let 𝚷d{\bm{\Pi}}_{d} be the random protocol and let (𝑼i)({\bm{U}}_{i}) be the ensemble of random unitaries specified in the Proposition statement. Suppose for contradiction that 1.3 were true and the error 𝜺{\bm{\varepsilon}} of the protocol 𝚷d{\bm{\Pi}}_{d} satisfied

𝔼(𝑼i)δ2(𝜺)2<(2d)2.\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\quad<\quad(2d)^{-2}~{}. (5.4)

First we argue that

𝔼(𝑼i)𝔼𝒋𝒌|Tr(𝑼𝒋𝑼𝒌)|2=1,\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})|^{2}\quad=\quad 1\enspace, (5.5)

where the first expectation is over the ensemble of random unitary operators (𝑼i)({\bm{U}}_{i}), and the second expectation is over a uniformly random pair of distinct indices 𝒋,𝒌[d]{\bm{j}},{\bm{k}}\in[d], 𝒋𝒌{\bm{j}}\neq{\bm{k}}. To prove this, note that for all jkj\neq k 𝔼(𝑼i)|Tr(𝑼j𝑼k)|2=𝔼(𝑼i)|Tr(𝑼1𝑼2)|2\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}|\operatorname{Tr}({\bm{U}}_{j}{\bm{U}}_{k}^{*})|^{2}=\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}|\operatorname{Tr}({\bm{U}}_{1}{\bm{U}}_{2}^{*})|^{2} because 𝑼i{\bm{U}}_{i} are independent, identically distributed Haar-random unitaries operators. Furthermore, by the rotation invariance of the Haar measure, 𝑼1𝑼2{\bm{U}}_{1}{\bm{U}}_{2}^{*} is also distributed according to the Haar measure. So the above quantity is equal to 𝔼𝑼|Tr(𝑼)|2\operatorname*{\mathbb{E}}_{\bm{U}}|\operatorname{Tr}({\bm{U}})|^{2} for Haar-random 𝑼{\bm{U}}. So the LHS of Equation 5.5 equals

𝔼𝑼|Tr(𝑼)|2\displaystyle\operatorname*{\mathbb{E}}_{\bm{U}}|\operatorname{Tr}({\bm{U}})|^{2}\quad =𝔼𝑼|j=1dj|𝑼|j|2\displaystyle=\quad\operatorname*{\mathbb{E}}_{\bm{U}}\left\lvert\sum_{j=1}^{d}\langle j|{\bm{U}}|j\rangle\right\rvert^{2}
=𝔼𝑼j,k=1dj|𝑼|jk|𝑼|k\displaystyle=\quad\operatorname*{\mathbb{E}}_{\bm{U}}\sum_{j,k=1}^{d}\langle j|{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}|k\rangle
=j,kj|(𝔼𝑼𝑼|jk|𝑼)|k.\displaystyle=\quad\sum_{j,k}\langle j|\Big{(}\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}\Big{)}|k\rangle~{}.

Since 𝔼𝑼𝑼|jj|𝑼=𝟙/d\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle j|{\bm{U}}^{*}=\mathds{1}/d and 𝔼𝑼𝑼|jk|𝑼=0\operatorname*{\mathbb{E}}_{\bm{U}}{\bm{U}}|j\rangle\!\langle k|{\bm{U}}^{*}=0 when jkj\neq k, we have 𝔼|Tr(𝑼)|2=1\operatorname*{\mathbb{E}}|\operatorname{Tr}({\bm{U}})|^{2}=1, which establishes Equation 5.5.

On the other hand, the rigidity condition promised by 1.3 implies that every collection of d×dd\times d unitary operators (Ui)(U_{i}) yields a superdense protocol with some error ε\varepsilon, and in turn there exists an orthogonal unitary basis (Ei)(E_{i}) such that UiEinhsδ2(ε)\|U_{i}-E_{i}\|_{\mathrm{nhs}}\leq\delta_{2}(\varepsilon) for all i[d2]i\in[d^{2}]. Note that ε\varepsilon and (Ei)(E_{i}) depend on (Ui)(U_{i}), and let 𝜺{\bm{\varepsilon}} and (𝑬i)({\bm{E}}_{i}) be the error of the protocol 𝚷d{\bm{\Pi}}_{d} and the corresponding orthogonal unitary basis, respectively. Then

𝔼(𝑼i)𝔼𝒋𝒌|Tr(𝑼𝒋𝑼𝒌)|2\displaystyle\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})|^{2}\quad =𝔼(𝑼i)𝔼𝒋𝒌|Tr(𝑼𝒋𝑼𝒌)Tr(𝑬𝒋𝑬𝒌)|2\displaystyle=\quad\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\left|\operatorname{Tr}({\bm{U}}_{\bm{j}}{\bm{U}}_{\bm{k}}^{*})-\operatorname{Tr}({\bm{E}}_{\bm{j}}{\bm{E}}_{\bm{k}}^{*})\right|^{2}
𝔼(𝑼i)𝔼𝒋𝒌(|Tr((𝑼𝒋𝑬𝒋)𝑼𝒌)|+|Tr(𝑬𝒋(𝑼𝒌𝑬𝒌))|)2\displaystyle\leq\quad\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\Big{(}\left|\operatorname{Tr}(({\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}){\bm{U}}_{\bm{k}}^{*})|+|\operatorname{Tr}({\bm{E}}_{\bm{j}}({\bm{U}}_{\bm{k}}^{*}-{\bm{E}}_{\bm{k}}^{*}))\right|\Big{)}^{2}
2𝔼(𝑼i)𝔼𝒋𝒌|Tr((𝑼𝒋𝑬𝒋)𝑼𝒌)|2+|Tr(𝑬𝒋(𝑼𝒌𝑬𝒌))|2\displaystyle\leq\quad 2\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\left|\operatorname{Tr}(({\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}){\bm{U}}_{\bm{k}}^{*})\right|^{2}+\left|\operatorname{Tr}({\bm{E}}_{\bm{j}}({\bm{U}}_{\bm{k}}^{*}-{\bm{E}}_{\bm{k}}^{*}))\right|^{2} (5.6)
where the first equality is due to the orthogonality condition Tr(𝑬j𝑬k)=0\operatorname{Tr}({\bm{E}}_{j}{\bm{E}}_{k}^{*})=0 whenever jkj\neq k, the second line is due to the triangle inequality, and the third line is due to the inequality (a+b)22a2+2b2(a+b)^{2}\leq 2a^{2}+2b^{2} for real numbers a,ba,b. By the Cauchy-Schwarz inequality for the Hilbert-Schmidt inner product, we have |Tr((𝑼j𝑬j)𝑼k)|2𝑼j𝑬j22𝑼k22|\operatorname{Tr}(({\bm{U}}_{j}-{\bm{E}}_{j}){\bm{U}}_{k}^{*})|^{2}\leq\|{\bm{U}}_{j}-{\bm{E}}_{j}\|_{2}^{2}\cdot\|{\bm{U}}_{k}\|_{2}^{2} and |Tr(𝑬j(𝑼k𝑬k))|2𝑬j22𝑼k𝑬k22|\operatorname{Tr}({\bm{E}}_{j}({\bm{U}}_{k}^{*}-{\bm{E}}_{k}^{*}))|^{2}\leq\|{\bm{E}}_{j}\|_{2}^{2}\cdot\|{\bm{U}}_{k}-{\bm{E}}_{k}\|_{2}^{2}, where X2=Tr(XX)\|X\|_{2}=\sqrt{\operatorname{Tr}(XX^{*})} denotes the (unnormalized) Hilbert-Schmidt norm. Since A22=d\|A\|_{2}^{2}=d for all d×dd\times d unitary matrices AA, we can upper bound the RHS of Equation 5.6 as
2d𝔼(𝑼i)𝔼𝒋𝒌(𝑼𝒋𝑬𝒋22+𝑼𝒌𝑬𝒌22)\displaystyle\leq\quad 2d\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\operatorname*{\mathbb{E}}_{{\bm{j}}\neq{\bm{k}}}\left(\|{\bm{U}}_{\bm{j}}-{\bm{E}}_{\bm{j}}\|_{2}^{2}+\|{\bm{U}}_{\bm{k}}-{\bm{E}}_{\bm{k}}\|_{2}^{2}\right)
=4𝔼(𝑼i)j𝑼j𝑬jnhs2\displaystyle=\quad 4\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\sum_{j}\|{\bm{U}}_{j}-{\bm{E}}_{j}\|_{\mathrm{nhs}}^{2}
4d2𝔼(𝑼i)δ2(𝜺)2\displaystyle\leq\quad 4d^{2}\operatorname*{\mathbb{E}}_{({\bm{U}}_{i})}\delta_{2}({\bm{\varepsilon}})^{2}\enspace
<1,\displaystyle<\quad 1\enspace,

where the last inequality follows from the assumption in Equation 5.4. However, this contradicts Equation 5.5. Thus, either the conjecture does not hold, or the random superdense coding protocol 𝚷d{\bm{\Pi}}_{d} has error satisfying 𝔼δ2(𝜺)(2d)2\operatorname*{\mathbb{E}}\delta_{2}({\bm{\varepsilon}})\geq(2d)^{-2}. ∎

In the rest of this section, we prove that for sufficiently large dimension, with high probability, the protocol 𝚷d{\bm{\Pi}}_{d} has positive constant error. This indicates that random maximally entangled quantum states are not very reliable for transmitting classical information, and proves Theorem 1.5. Thus, the random protocol 𝚷d{\bm{\Pi}}_{d} does not rule out a robust rigidity theorem for superdense coding.

To analyze the decoding error of the protocol, we study the distinguishability of the ensemble 𝓔d{\bm{\mathcal{E}}}_{d}. This is the probability that, if the pure state |𝝍i|{\bm{\psi}}_{i}\rangle is selected uniformly at random from the ensemble, an optimal measurement correctly identifies the state.

Definition 5.11.

Let ((pi,ρi):ρi𝖣(k),i[m]){\mathcal{F}}\coloneqq\big{(}(p_{i},\rho_{i}):\rho_{i}\in{\mathsf{D}}({\mathbb{C}}^{k}),i\in[m]\big{)} be an ensemble of states in which state ρi\rho_{i} occurs with probability pip_{i}. We define the distinguishability of {\mathcal{F}} as

γ()maxPOVM Mi=1mpiTr(Miρi),\operatorname{\upgamma}({\mathcal{F}})\quad\coloneqq\quad\max_{\text{POVM }M}\sum_{i=1}^{m}p_{i}\operatorname{Tr}(M_{i}\rho_{i})\enspace,

where the maximization is over all measurements (i.e., POVMs) MM with elements M1,,MmM_{1},\ldots,M_{m}.

We can estimate the distinguishability of an ensemble of states via the generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b].

Theorem 5.12 (generalized Holevo-Curlander bounds [ON99, Tys09b]).

Let ((pi,ρi):ρi𝖣(k),i[m]){\mathcal{F}}\coloneqq\big{(}(p_{i},\rho_{i}):\rho_{i}\in{\mathsf{D}}({\mathbb{C}}^{k}),i\in[m]\big{)} be an ensemble of mm quantum states. Then the distinguishability of {\mathcal{F}} satisfies

(αHolevo())2γ()αHolevo(),\left(\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\right)^{2}\quad\leq\quad\operatorname{\upgamma}({\mathcal{F}})\quad\leq\quad\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\enspace,

where

αHolevo()Tri=1mpi2ρi2.\operatorname{\upalpha_{\text{Holevo}}}({\mathcal{F}})\quad\coloneqq\quad\operatorname{Tr}\sqrt{\sum_{i=1}^{m}p_{i}^{2}\rho_{i}^{2}}\enspace.

We only need the upper bound on distinguishability above for a uniform ensemble of pure states. This bound was given by Curlander [Cur79] in the case of linearly independent states. It was generalized to the case of equiprobable, possibly mixed states by Ogawa and Nagaoka [ON99, Lemma 1]. The proof they gave also extends with minor modifications to non-uniform ensembles. The two bounds in Theorem 5.12 were proven — re-proven independently in the case of the upper bound [Tys09a] — by Tyson[Tys09b, Theorem 10]. Tyson later gave another proof of the bounds which also generalizes to error-recovery [Tys10, Section III].

We show that the expectation of the quantity αHolevo(𝓔d)\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d}) for the ensemble of random maximally entangled states is at most a constant strictly less than 11, for sufficiently large dimension dd. This implies that the distinguishability γ(𝓔d)\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d}) is also strictly less than 11 in expectation, and that any measurement Bob makes has a non-zero constant probability of failure, on average.

Theorem 5.13.

The distinguishability γ(𝓔d)\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d}) of the random superdense coding protocol 𝚷d{\bm{\Pi}}_{d} tends to 83π0.85\tfrac{8}{3\pi}\approx 0.85 as dd\to\infty.

Proof.

Define the matrix 𝑸{\bm{Q}} as 𝑸i=1n|𝝍i𝝍i|{\bm{Q}}\coloneqq\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle{\bm{\psi}}_{i}|, and 𝚲d{\bm{\Lambda}}_{d} as a uniformly random eigenvalue of 𝑸{\bm{Q}}. Then 𝔼αHolevo(𝓔d)\operatorname{{\mathbb{E}}}\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d}) is the expectation of the random variable 𝚲d\sqrt{{\bm{\Lambda}}_{d}}, and we aim to bound this from above.

Define the n×nn\times n matrix 𝑹di=1n|𝝍ii|{\bm{R}}\coloneqq d\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle i| so that 𝑸=1n𝑹𝑹{\bm{Q}}=\tfrac{1}{n}{\bm{R}}{\bm{R}}^{*}. Consider the random vector |𝝃n|{\bm{\xi}}_{n}\rangle defined as |𝝃nd(𝑼𝟙)|ϕd|{\bm{\xi}}_{n}\rangle\coloneqq d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle, where 𝑼𝖴(d){\bm{U}}\in{\mathsf{U}}({\mathbb{C}}^{d}) is a Haar-random unitary operator. I.e., |𝝃n|{\bm{\xi}}_{n}\rangle is a scaled random maximally entangled state. Lemma 5.10 shows that the sequence (|𝝃n)(|{\bm{\xi}}_{n}\rangle) is pseudo-isotropic. Since the columns of the matrix 𝑹{\bm{R}} are i.i.d. copies of the random vector |𝝃n|{\bm{\xi}}_{n}\rangle, the random matrix 𝑸{\bm{Q}} is of the form described in Theorem 5.7. Thus the limiting distribution of the uniformly random eigenvalue 𝚲d{\bm{\Lambda}}_{d} of 𝑸{\bm{Q}} follows the Marčenko-Pastur law with density p1\operatorname{p}_{1} (i.e., with parameter r=1r=1):

p1(x)\displaystyle\operatorname{p}_{1}(x)\quad {12π4xx if 0x4, and0otherwise.\displaystyle\coloneqq\quad\begin{cases}\frac{1}{2\pi}\sqrt{\frac{4-x}{x}}&\text{ if }0\leq x\leq 4\enspace,\text{ and}\\ 0&\text{otherwise}\enspace.\end{cases} (5.7)

We would like to use the density p1\operatorname{p}_{1} to estimate the limit of the expectation of 𝚲d\sqrt{{\bm{\Lambda}}_{d}}. A subtle issue is that weak convergence (i.e., convergence in distribution of the random variables) does not necessarily imply that the limit of the expectation values 𝔼𝚲d\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}} equals the expectation of the limiting random variable. A simple example for which this does not hold is described in Section 5.4. Nonetheless, in Theorem 5.16 in Section 5.4, we show that the sequence of random variables 𝚲d{\bm{\Lambda}}_{d} satisfies the stronger property we need. Namely, 𝔼𝚲d\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}} converges to 𝔼𝚲\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}} as dd\to\infty, where 𝚲{\bm{\Lambda}} is a random variable with density p1\operatorname{p}_{1}. We may thus bound the distinguishability of the ensemble 𝓔d{\bm{\mathcal{E}}}_{d} as dd\to\infty as follows.

limd𝔼γ(𝓔d)\displaystyle\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\quad limd𝔼αHolevo(𝓔d)\displaystyle\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\operatorname{\upalpha_{\text{Holevo}}}({\bm{\mathcal{E}}}_{d})
=limd𝔼𝚲d=𝔼𝚲\displaystyle=\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}
=xp1(x)dx\displaystyle=\quad\int_{-\infty}^{\infty}\sqrt{x}\,\operatorname{p}_{1}(x)\mathop{}\!\mathrm{d}x
=12π044xdx\displaystyle=\quad\frac{1}{2\pi}\int_{0}^{4}\sqrt{4-x}\mathop{}\!\mathrm{d}x
=83π0.85.\displaystyle=\quad\frac{8}{3\pi}\quad\approx\quad 0.85\enspace.

We can strengthen this result to show that the distinguishability of 𝓔d{\bm{\mathcal{E}}}_{d} is tightly concentrated around the mean. So all but an exponentially small fraction of the superdense coding protocols using |ϕd|\upphi_{d}\rangle succeed with probability smaller than 0.85\approx 0.85.

Theorem 5.14.

The distinguishability γ(𝓔d)\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d}) of the random superdense coding protocol 𝚷d{\bm{\Pi}}_{d} satisfies

Pr(γ(𝓔d)𝔼γ(𝓔d)+t)exp(d3(d2)t296).\Pr\Big{(}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\geq\operatorname*{\mathbb{E}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})+t\Big{)}\leq\exp\!\left(-\frac{d^{3}(d-2)t^{2}}{96}\right)\enspace.
Proof.

Define the function f:(𝖴(d))nf:\big{(}{\mathsf{U}}({\mathbb{C}}^{d})\big{)}^{n}\rightarrow{\mathbb{R}} as

f(U1,,Un)supPOVM M1ni=1nTr(Miψi),f(U_{1},\ldots,U_{n})\quad\coloneqq\quad\sup_{\text{POVM }M}\frac{1}{n}\sum_{i=1}^{n}\operatorname{Tr}(M_{i}\psi_{i})\enspace,

where |ψi=(Ui𝟙)|ϕd|\psi_{i}\rangle=(U_{i}\otimes\mathds{1})|\upphi_{d}\rangle, and we denote |ψψ||\psi\rangle\!\langle\psi| by ψ\psi. So f(𝑼1,,𝑼n)=γ(𝓔d)f({\bm{U}}_{1},\ldots,{\bm{U}}_{n})=\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d}), the distinguishability of the ensemble 𝓔d{\bm{\mathcal{E}}}_{d}. We bounded the expected distinguishability by .85<1\approx.85<1 in Theorem 5.13. We show that γ(𝓔d)\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d}) is tightly concentrated around its expectation using Theorem 5.5. To do so, we compute a bound on the Lipschitz constant of ff.

Fix unitary operators U1,,Un,U1,,Un𝖴(d)U_{1},\ldots,U_{n},U_{1}^{\prime},\ldots,U_{n}^{\prime}\in{\mathsf{U}}({\mathbb{C}}^{d}) and let |ψi=(Ui𝟙)|ϕd|\psi_{i}\rangle=(U_{i}\otimes\mathds{1})|\upphi_{d}\rangle and |ψi=(Ui𝟙)|ϕd|\psi_{i}^{\prime}\rangle=(U_{i}^{\prime}\otimes\mathds{1})|\upphi_{d}\rangle. Since the space of nn-dimensional POVMs with nn outcomes is compact, for any sequence (Ui)(U_{i}) the supremum in the definition of ff is attained at some POVM MM. Let MM, MM^{\prime} correspond to the POVMs achieving f(U1,,Un)f(U_{1},\ldots,U_{n}) and f(U1,,Un)f(U_{1}^{\prime},\ldots,U_{n}^{\prime}), respectively, and let α,α\alpha,\alpha^{\prime} denote these quantities. Assume without loss of generality that αα\alpha^{\prime}\leq\alpha.

We have that

|αα|=αα=1d2(iTr(Miψi)Tr(Miψi))1d2iTr(Mi(ψiψi)),\displaystyle|\alpha-\alpha^{\prime}|\quad=\quad\alpha-\alpha^{\prime}\quad=\quad\frac{1}{d^{2}}\Big{(}\sum_{i}\operatorname{Tr}(M_{i}\psi_{i})-\operatorname{Tr}(M_{i}^{\prime}\psi_{i}^{\prime})\Big{)}\quad\leq\quad\frac{1}{d^{2}}\sum_{i}\operatorname{Tr}(M_{i}(\psi_{i}-\psi_{i}^{\prime}))\enspace,

as the POVM MM may not be an optimal distinguishing measurement for the ensemble (ψi)(\psi_{i}^{\prime}). We bound this by

1d2i|Tr(Mi(ψiψi))|\displaystyle\frac{1}{d^{2}}\sum_{i}\Big{|}\operatorname{Tr}(M_{i}(\psi_{i}-\psi_{i}^{\prime}))\Big{|}
1d2iMiψiψi1\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}\|M_{i}\|\cdot\|\psi_{i}-\psi_{i}^{\prime}\|_{1} (Hölder inequality)\displaystyle(\text{H\"{o}lder inequality})
1d2iψiψi1\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}\|\psi_{i}-\psi_{i}^{\prime}\|_{1} (M is a POVM)\displaystyle(\text{$M$ is a POVM})
1d2i2|ψi|ψi\displaystyle\leq\quad\frac{1}{d^{2}}\sum_{i}2\||\psi_{i}\rangle-|\psi_{i}^{\prime}\rangle\|
=2d2iϕ|(UiUi)(UiUi)|ϕ\displaystyle=\quad\frac{2}{d^{2}}\sum_{i}\sqrt{\langle\upphi|(U_{i}-U_{i}^{\prime})^{*}(U_{i}-U_{i}^{\prime})|\upphi\rangle}
21d2i1dUiUi22\displaystyle\leq\quad 2\sqrt{\frac{1}{d^{2}}\sum_{i}\frac{1}{d}\|U_{i}-U_{i}^{\prime}\|_{2}^{2}} (Jensen inequality)\displaystyle(\text{Jensen inequality})

where in the fourth line we used the property that for any two pure states |φ|\varphi\rangle and |θ|\theta\rangle, the trace distance between |φφ||\varphi\rangle\!\langle\varphi| and |θθ||\theta\rangle\!\langle\theta| is at most 2|φ|θ2\||\varphi\rangle-|\theta\rangle\|. In the last line, we used the identity ϕd|(A𝟙)|ϕd\langle\upphi_{d}|(A\otimes\mathds{1})|\upphi_{d}\rangle for any d×dd\times d matrix is equal to Tr(A)/d\operatorname{Tr}(A)/d.

Thus |αα|2d3/2iUiUi22|\alpha-\alpha^{\prime}|\leq 2d^{-3/2}\sqrt{\sum_{i}\|U_{i}-U_{i}^{\prime}\|_{2}^{2}}, which implies that ff is 2d3/22d^{-3/2}-Lipschitz. Applying Theorem 5.5 we obtain

Pr(γ(𝓔d)𝔼γ(𝓔d)+t)exp(d3(d2)t296).\Pr\Big{(}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})\geq\operatorname*{\mathbb{E}}\operatorname{\upgamma}({\bm{\mathcal{E}}}_{d})+t\Big{)}\quad\leq\quad\exp\!\left(-\frac{d^{3}(d-2)t^{2}}{96}\right)\enspace.

5.4 A subtle issue

Recall from the proof of Theorem 5.13 in Section 5.3 that 𝚲d{\bm{\Lambda}}_{d} denotes a uniformly random eigenvalue of the matrix 𝑸{\bm{Q}}. The generalized Marčenko-Pastur Law (Theorem 5.7) tells us that 𝚲d{\bm{\Lambda}}_{d} converges in distribution to a random variable 𝚲{\bm{\Lambda}} with density p1\operatorname{p}_{1} given in Eq. (5.7) as dd\rightarrow\infty. We used this limiting distribution to estimate the limit of the mean of 𝚲d\sqrt{{\bm{\Lambda}}_{d}} in Theorem 5.13. We pointed out the subtle issue that convergence in distribution does not necessarily imply that the limit of means equals the mean of the limiting random variable. A simple example which illustrates this issue is the following. For any positive integer kk, let the random variable 𝒙k{\bm{x}}_{k} take value kk with probability 1/k1/k, and value 0 with the remaining probability. Then 𝒙k{\bm{x}}_{k} converges in distribution to the constant 0, whereas 𝔼𝒙k=1\operatorname{{\mathbb{E}}}{\bm{x}}_{k}=1 for all kk.

The example above highlights the reason underlying this phenomenon: while the probability of an interval on the line may go to zero in the limit, the rate of convergence may not be fast enough to dampen the contribution to the mean from that interval. We show that the probability that the random variable 𝚲d{\bm{\Lambda}}_{d} deviates from zero decays exponentially. This helps us conclude the convergence of the mean 𝔼𝚲d\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}} to 𝔼𝚲\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}.

A similar property was assumed to hold by Montanaro [Mon07] in his work on the distinguishability of random quantum states. Let 𝑺i=1k|𝜻ii|{\bm{S}}\coloneqq\sum_{i=1}^{k}|{\bm{\zeta}}_{i}\rangle\!\langle i|, where |𝜻i|{\bm{\zeta}}_{i}\rangle are i.i.d. random vectors in d{\mathbb{C}}^{d} with i.i.d. complex gaussian entries with mean 0 and variance 11. Montanaro approximates 𝔼Tr(𝑺𝑺)1/2\operatorname{{\mathbb{E}}}\operatorname{Tr}({\bm{S}}{\bm{S}}^{*})^{1/2} using the Marčenko-Pastur Law. He justifies this using estimates on the rate of convergence of the expected distribution of a uniformly random eigenvalue of (1/k)𝑺𝑺(1/k){\bm{S}}{\bm{S}}^{*} to the limiting distribution given by the Marčenko-Pastur Law (see the discussion after Lemma 5 in Ref. [Mon07]). The rate of convergence is measured in terms of the Kolmogorov distance between the two distributions. (The Kolmogorov distance between the cumulative distribution functions F1F_{1} and F2F_{2} of real random variables is defined as supx|F1(x)F2(x)|\sup_{x\in{\mathbb{R}}}\left\lvert F_{1}(x)-F_{2}(x)\right\rvert.) The Kolmogorov distance was shown to be O(k5/48)\mathrm{O}(k^{-5/48}) by Bai [Bai93]. However, vanishing Kolmogorov distance does not necessarily imply the convergence of the mean to the mean of the limiting distribution. For example, the Kolmogorov distance of the distribution of the random variable 𝒙k{\bm{x}}_{k} defined above from the constant 0 is 1/k1/k. The approach we take in this section can also be used to fill the gap in Montanaro’s work. In fact, the analogue of Lemma 5.15 we need for this purpose follows directly, as the columns of 𝑺{\bm{S}} are gaussian. We leave the details to the reader, and return to the analysis of the random variable 𝚲d{\bm{\Lambda}}_{d}.

In order to show that 𝚲d{\bm{\Lambda}}_{d} has an exponentially decaying tail, it suffices to show that the spectral norm of the matrix 𝑸{\bm{Q}}—i.e., its largest eigenvalue—has this property. So we proceed by deriving a tail bound for the spectral norm of 𝑸{\bm{Q}}.

Lemma 5.15.

Let d3d\geq 3. There are positive universal constants c1,c2c_{1},c_{2} such that for all tc1t\geq c_{1},

Pr(𝑸>t)2exp(tn/c2).\Pr\left(\left\|{\bm{Q}}\right\|>t\right)\quad\leq\quad 2\exp(-tn/c_{2})\enspace. (5.8)
Proof.

Recall that nd2n\coloneqq d^{2}, 𝑸i=1n|𝝍i𝝍i|{\bm{Q}}\coloneqq\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle{\bm{\psi}}_{i}|, and that 𝑹di=1n|𝝍ii|{\bm{R}}\coloneqq d\sum_{i=1}^{n}|{\bm{\psi}}_{i}\rangle\!\langle i|. We have 𝑸=n1𝑹𝑹=n1𝑹2\left\|{\bm{Q}}\right\|=n^{-1}\left\|{\bm{R}}{\bm{R}}^{*}\right\|=n^{-1}\left\|{\bm{R}}\right\|^{2}, so

Pr(𝑸>t)=Pr(𝑹>nt).\Pr\left(\left\|{\bm{Q}}\right\|>t\right)\quad=\quad\Pr\left(\left\|{\bm{R}}\right\|>\sqrt{nt}\right)\enspace.

It thus suffices to give a suitable tail bound for 𝑹\left\|{\bm{R}}\right\|.

The vectors d|𝝍id|{\bm{\psi}}_{i}\rangle are i.i.d. copies of the random vector |𝝃n|{\bm{\xi}}_{n}\rangle defined as |𝝃nd(𝑼𝟙)|ϕd|{\bm{\xi}}_{n}\rangle\coloneqq d({\bm{U}}\otimes\mathds{1})|\upphi_{d}\rangle, where 𝑼{\bm{U}} is a Haar-random unitary operator on d{\mathbb{C}}^{d}. So the vectors d|𝝍id|{\bm{\psi}}_{i}\rangle have zero mean. We can verify that 𝔼|𝝃n𝝃n|=𝟙\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|=\mathds{1}, so the vectors d|𝝍id|{\bm{\psi}}_{i}\rangle are isotropic. We prove below that the vector |𝝃n|{\bm{\xi}}_{n}\rangle is sub-gaussian, with sub-gaussian norm at most a universal constant κ\kappa. So the matrix 𝑹=di=1n|iψi|{\bm{R}}^{*}=d\sum_{i=1}^{n}|i\rangle\!\langle\psi_{i}| satisfies the conditions of Theorem 5.4, and we have that for some positive universal constant c3c_{3},

𝑹=𝑹>n+c3κ2(n+t1)\left\|{\bm{R}}\right\|=\left\|{\bm{R}}^{*}\right\|\quad>\quad\sqrt{n}+c_{3}\kappa^{2}(\sqrt{n}+t_{1})

with probability at most 2exp(t12)2\exp(-t_{1}^{2}), for all t10t_{1}\geq 0. Let t1t_{1} be such that the right hand side above equals nt\sqrt{nt}, i.e.,

t1nc3κ2(t(1+c3κ2)).t_{1}\quad\coloneqq\quad\frac{\sqrt{n}}{c_{3}\kappa^{2}}\left(\sqrt{t}-(1+c_{3}\kappa^{2})\right)\enspace.

Let c14(1+c3κ2)2c_{1}\coloneqq 4(1+c_{3}\kappa^{2})^{2}. Whenever tc1t\geq c_{1}, we see that t1nt/2c3κ20t_{1}\geq\sqrt{nt}/2c_{3}\kappa^{2}\geq 0. So

Pr(𝑹>nt)2exp(nt/4c32κ4),\Pr\left(\left\|{\bm{R}}\right\|>\sqrt{nt}\right)\quad\leq\quad 2\exp\!\big{(}-nt/4c_{3}^{2}\kappa^{4}\big{)}\enspace,

and the theorem holds with c24c32κ4c_{2}\coloneqq 4c_{3}^{2}\kappa^{4}.

It remains to prove that the sub-gaussian norm κ\kappa of |𝝃n|{\bm{\xi}}_{n}\rangle is at most some universal constant. By Lemma 5.3, it suffices to show that for any unit vector |un|u\rangle\in{\mathbb{C}}^{n}, the random variable 𝒙u|𝝃n{\bm{x}}\coloneqq\langle u|{\bm{\xi}}_{n}\rangle has sub-gaussian tails: for a positive universal constant κ1\kappa_{1},

Pr(|𝒙|t)2exp(t2/κ12),\Pr(\left\lvert{\bm{x}}\right\rvert\geq t)\quad\leq\quad 2\exp\!\big{(}-t^{2}/\kappa_{1}^{2}\big{)}\enspace, (5.9)

for all t0t\geq 0. We establish this by appealing to Theorem 5.5.

Since |𝝃n|{\bm{\xi}}_{n}\rangle is isotropic,

𝔼|𝒙|2=u|(𝔼|𝝃n𝝃n|)|u=1.\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert^{2}\quad=\quad\langle u|\;(\operatorname{{\mathbb{E}}}|{\bm{\xi}}_{n}\rangle\!\langle{\bm{\xi}}_{n}|)\;|u\rangle\quad=\quad 1\enspace.

So 𝔼|𝒙|(𝔼|𝒙|2)1/21\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert\leq\big{(}\operatorname{{\mathbb{E}}}\left\lvert{\bm{x}}\right\rvert^{2}\big{)}^{1/2}\leq 1.

Define the function f:𝖴(d)f:{\mathsf{U}}({\mathbb{C}}^{d})\rightarrow{\mathbb{C}} as f(U)d|u|(U𝟙)|ϕd|f(U)\coloneqq d\left\lvert\langle u|(U\otimes\mathds{1})|\upphi_{d}\rangle\right\rvert. Then |𝒙|=f(𝑼)\left\lvert{\bm{x}}\right\rvert=f({\bm{U}}). To show that ff is Lipschitz, consider U,V𝖴(d)U,V\in{\mathsf{U}}({\mathbb{C}}^{d}). Since |u|u\rangle is a unit vector and ϕd|(W𝟙)|ϕd=1dTr(W)\langle\upphi_{d}|(W\otimes\mathds{1})|\upphi_{d}\rangle=\tfrac{1}{d}\operatorname{Tr}(W) we have

|f(U)f(V)|\displaystyle\left\lvert f(U)-f(V)\right\rvert\quad d|u|((UV)𝟙)|ϕd|\displaystyle\leq\quad d\left\lvert\langle u|((U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right\rvert
d((UV)𝟙)|ϕd\displaystyle\leq\quad d\left\|((U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right\|
=d(ϕd|((UV)(UV)𝟙)|ϕd)1/2\displaystyle=\quad d\left(\langle\upphi_{d}|((U-V)^{*}(U-V)\otimes\mathds{1})|\upphi_{d}\rangle\right)^{1/2}
=dUV2,\displaystyle=\quad\sqrt{d}\left\|U-V\right\|_{2}\enspace,

where 2\left\|\cdot\right\|_{2} denotes the Hilbert-Schmidt norm on 𝖫(d){\mathsf{L}}({\mathbb{C}}^{d}). So the function ff is d\sqrt{d}-Lipschitz with respect to Hilbert-Schmidt metric. By Theorem 5.5,

Pr(|𝒙|1+t1)exp((d2)t1224d)\Pr(\left\lvert{\bm{x}}\right\rvert\geq 1+t_{1})\quad\leq\quad\exp\!\left(-\frac{(d-2)t_{1}^{2}}{24d}\right)

for every t1>0t_{1}>0. For d3d\geq 3, we have d2d/3d-2\geq d/3. So the right hand side is at most exp(t12/72)\exp(-t_{1}^{2}/72), and we have

Pr(|𝒙|t)exp((t1)2324)exp(t21224),\Pr(\left\lvert{\bm{x}}\right\rvert\geq t)\quad\leq\quad\exp\!\left(-\frac{(t-1)^{2}}{3\cdot 24}\right)\quad\leq\quad\exp\!\left(-\frac{t^{2}}{12\cdot 24}\right)\enspace,

for all t2t\geq 2 (as t1t/2t-1\geq t/2). Note that Eq. (5.9) holds trivially for t[0,2]t\in[0,2] for any choice of positive constant κ1\kappa_{1} such that 2exp(4/κ12)12\exp(-4/\kappa_{1}^{2})\geq 1. Thus, taking κ124\kappa_{1}\coloneqq 24, we see that Eq. (5.9) holds for all t0t\geq 0 whenever d3d\geq 3. ∎

Lemma 5.15 implies that for a large enough constant α\alpha, the contribution to the mean 𝔼𝚲d\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}} outside an interval [0,α][0,\alpha] goes to 0 as dd\to\infty. Within this interval, the contribution to the mean tends to that for 𝔼𝚲\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}. This helps us derive the limiting value of the mean.

Theorem 5.16.

limd𝔼𝚲d=𝔼𝚲.\lim_{d\rightarrow\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.

Proof.

We formalize the intuition given above by appealing to a weaker property implied by convergence in distribution, namely that the expectation of any bounded continuous function ff of the random variable 𝚲d{\bm{\Lambda}}_{d} converges to 𝔼f(𝚲)\operatorname{{\mathbb{E}}}f({\bm{\Lambda}}).

Fix αmax{c1,4}\alpha\geq\max\left\{c_{1},4\right\}, where c1c_{1} is the constant in the statement of Lemma 5.15 and consider the function fαf_{\alpha} defined as follows:

fα(x){0x0x0<xααα<x.f_{\alpha}(x)\quad\coloneqq\quad\begin{cases}0&\quad x\leq 0\\ \sqrt{x}&\quad 0<x\leq\alpha\\ \sqrt{\alpha}&\quad\alpha<x\enspace.\end{cases}

Since fαf_{\alpha} is continuous and bounded, and 𝚲[0,4]{\bm{\Lambda}}\in[0,4],

limd𝔼fα(𝚲d)=𝔼fα(𝚲)=𝔼𝚲.\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\quad=\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}})\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.

On the other hand, 𝚲d0{\bm{\Lambda}}_{d}\geq 0 and fα(x)xf_{\alpha}(x)\leq\sqrt{x} for all x0x\geq 0. So 𝔼fα(𝚲d)𝔼𝚲d\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\leq\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}, and

𝔼𝚲=limd𝔼fα(𝚲d)limd𝔼𝚲d.\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\quad=\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})\quad\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\enspace.

We prove the reverse inequality using Lemma 5.15. Let p(x)p(x) be the probability density function of 𝚲d{\bm{\Lambda}}_{d}. By the definition of fαf_{\alpha},

𝔼𝚲d\displaystyle\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad 𝔼fα(𝚲d)+xαxp(x)dx.\displaystyle\leq\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})+\int_{x\geq\alpha}\sqrt{x}p(x)\mathop{}\!\mathrm{d}x\enspace. (5.10)

Let g(α,d)g(\alpha,d) denote the second term on the right hand side of Eq. (5.10) above. This is the contribution to 𝔼𝚲d\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}} outside of the interval [0,α][0,\alpha]. Using α4\alpha\geq 4, Fubini’s Theorem, 𝚲d𝑸{\bm{\Lambda}}_{d}\leq\left\|{\bm{Q}}\right\|, and Lemma 5.15, we have

g(α,d)\displaystyle g(\alpha,d)\quad xαxp(x)dx\displaystyle\leq\quad\int_{x\geq\alpha}xp(x)\mathop{}\!\mathrm{d}x
=xαy[0,x]p(x)dydx\displaystyle=\quad\int_{x\geq\alpha}\int_{y\in[0,x]}p(x)\mathop{}\!\mathrm{d}y\mathop{}\!\mathrm{d}x
=y0xmax{α,y}p(x)dxdy\displaystyle=\quad\int_{y\geq 0}\int_{x\geq\max\left\{\alpha,y\right\}}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y
=y[0,α]xαp(x)dxdy+yαxyp(x)dxdy\displaystyle=\quad\int_{y\in[0,\alpha]}\int_{x\geq\alpha}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y+\int_{y\geq\alpha}\int_{x\geq y}p(x)\mathop{}\!\mathrm{d}x\mathop{}\!\mathrm{d}y
=y[0,α]Pr(𝚲dα)dy+yαPr(𝚲dy)dy\displaystyle=\quad\int_{y\in[0,\alpha]}\Pr({\bm{\Lambda}}_{d}\geq\alpha)\mathop{}\!\mathrm{d}y+\int_{y\geq\alpha}\Pr({\bm{\Lambda}}_{d}\geq y)\mathop{}\!\mathrm{d}y
2αexp(αn/c2)+2yαexp(yn/c2)dy\displaystyle\leq\quad 2\alpha\exp(-\alpha n/c_{2})+2\int_{y\geq\alpha}\exp(-yn/c_{2})\mathop{}\!\mathrm{d}y
=2(α+c2/n)exp(αn/c2),\displaystyle=\quad 2(\alpha+c_{2}/n)\exp(-\alpha n/c_{2})\enspace,

where c2c_{2} is the universal constant in the statement of Lemma 5.15. Since n=d2n=d^{2}, g(α,d)g(\alpha,d) vanishes as dd goes to \infty. By Eq. (5.10),

limd𝔼𝚲d\displaystyle\lim_{d\to\infty}\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}_{d}}\quad limd𝔼fα(𝚲d)+limdg(α,d)\displaystyle\leq\quad\lim_{d\to\infty}\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}}_{d})+\lim_{d\to\infty}g(\alpha,d)
=𝔼fα(𝚲)=𝔼𝚲.\displaystyle=\quad\operatorname{{\mathbb{E}}}f_{\alpha}({\bm{\Lambda}})\quad=\quad\operatorname{{\mathbb{E}}}\sqrt{{\bm{\Lambda}}}\enspace.

This proves the theorem. ∎

References

  • [Bai93] Z. D. Bai. Convergence rate of expected spectral distributions of large random matrices. Part II. sample covariance matrices. Annals of Probability, 21(2):649–672, April 1993.
  • [BCWdW01] Harry Buhrman, Richard Cleve, John Watrous, and Ronald de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
  • [BW92] Charles H. Bennett and Stephen J. Wiesner. Communication via one- and two-particle operators on einstein-podolsky-rosen states. Physical review letters, 69(20):2881, 1992.
  • [BZ08] Zhidong Bai and Wang Zhou. Large sample covariance matrices without independence structures in columns. Statistica Sinica, 18(2):425–442, 2008.
  • [CGJV19] Andrea Coladangelo, Alex B. Grilo, Stacey Jeffery, and Thomas Vidick. Verifier-on-a-leash: New schemes for verifiable delegated quantum computation, with quasilinear resources. In Yuval Ishai and Vincent Rijmen, editors, Advances in Cryptology – EUROCRYPT 2019, volume 11478 of Lecture Notes in Computer Science, pages 247–277, Cham, 2019. Springer International Publishing.
  • [Cha20] Michael Charezma. Quantum circuit diagrams. https://warwick.ac.uk/fac/sci/physics/research/cfsa/people/pastmembers/charemzam/pastprojects, 2006 (accessed October 14, 2020).
  • [Cur79] Paul Joseph Curlander. Quantum Limitations on Communication Systems. PhD thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1979.
  • [FK19] Máté Farkas and Jędrzej Kaniewski. Self-testing mutually unbiased bases in the prepare-and-measure scenario. Physical Review A, 99(3):032316, 2019.
  • [FKN22] Máté Farkas, Jędrzej Kaniewski, and Ashwin Nayak. Mutually unbiased measurements, Hadamard matrices, and Superdense Coding. Technical Report arXiv:2204.11886 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, April 2022.
  • [Hal35] Philip Hall. On representatives of subsets. Journal of the London Mathematical Society, s1-10(1):26–30, 1935.
  • [HJW93] Lane P. Hughston, Richard Jozsa, and William K. Wootters. A complete classification of quantum ensembles having a given density matrix. Physics Letters A, 183(1):14–18, 1993.
  • [JNV+20] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. 𝖬𝖨𝖯=𝖱𝖤\mathsf{MIP}^{*}=\mathsf{RE}. Technical Report arXiv:2001.04383 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, January 2020.
  • [Kho79] Alexander S. Kholevo. On asymptotically optimal hypothesis testing in quantum statistics. Theory of Probability & Its Applications, 23(2):411–415, 1979.
  • [KR03] Andreas Klappenecker and Martin Rötteler. Unitary error bases: Constructions, equivalence, and applications. In Marc Fossorier, Tom Høholdt, and Alain Poli, editors, Proceedings of the 15th International Symposium on Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes (AAECC), volume 2643 of Lecture Notes in Computer Science, pages 139–149. Springer, Berlin / Heidelberg, Germany, May12–16, 2003.
  • [Mec19] Elizabeth S. Meckes. The Random Matrix Theory of the Classical Compact Groups, volume 218 of Cambridge Tracts in Mathematics. Cambridge University Press, July 2019.
  • [Mon07] Ashley Montanaro. On the distinguishability of random quantum states. Communications in Mathematical Physics, 273(3):619–636, August 2007.
  • [MV16] Benjamin Musto and Jamie Vicary. Quantum Latin squares and unitary error bases. Quantum Information and Computation, 16(15-16):1318–1332, November 2016.
  • [MY98] Dominic Mayers and Andrew Yao. Quantum cryptography with imperfect apparatus. In Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No. 98CB36280), pages 503–509. IEEE, 1998.
  • [MY04] Dominic Mayers and Andrew Yao. Self testing quantum apparatus. Quantum Information & Computation, 4(4):273–286, July 2004.
  • [NC11] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, New York, NY, USA, 2011. 10th Anniversary Edition.
  • [NY20] Ashwin Nayak and Henry Yuen. Rigidity of superdense coding. Technical Report arXiv:2012.01672v1 [quant-ph], arXiv Pre-print server, https://arxiv.org/abs/2012.01672, December 2020.
  • [ON99] Tomohiro Ogawa and Hiroshi Nagaoka. Strong converse to the quantum channel coding theorem. IEEE Transactions on Information Theory, 45(7):2486–2489, 1999.
  • [ŠB20] Ivan Šupić and Joseph Bowles. Self-testing of quantum systems: a review. Quantum, 4:337, 2020.
  • [Sch35] Erwin Schrödinger. Discussion of probability relations between separated systems. Mathematical Proceedings of the Cambridge Philosophical Society, 31(4):555–563, 1935.
  • [TKV+18] Armin Tavakoli, Jędrzej Kaniewski, Tamás Vértesi, Denis Rosset, and Nicolas Brunner. Self-testing quantum states and measurements in the prepare-and-measure scenario. Physical Review A, 98(6):062307, 2018.
  • [Tys09a] Jon Tyson. Erratum: “Minimum-error quantum distinguishability bounds from matrix monotone functions: A comment on ‘Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds’ ” [J. Math. Phys. 50, 062102 (2009)]. Journal of Mathematical Physics, 50(10):109902, 2009.
  • [Tys09b] Jon Tyson. Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds. Journal of Mathematical Physics, 50(3):032106, 2009.
  • [Tys10] Jon Tyson. Two-sided bounds on minimum-error quantum measurement, on the reversibility of quantum dynamics, and on maximum overlap using directional iterates. Journal of Mathematical Physics, 51(9):092204, 2010.
  • [Uhl76] Armin Uhlmann. The “transition probability” in the state space of a *-algebra. Reports on Mathematical Physics, 9(2):273–279, 1976.
  • [Ver18] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 2018.
  • [VV19] Umesh Vazirani and Thomas Vidick. Fully device independent quantum key distribution. Communications of the ACM, 62(4):133–133, 2019.
  • [VW00] Karl Gerd H. Vollbrecht and Reinhard F. Werner. Why two qubits are special. Journal of Mathematical Physics, 41(10):6772–6782, 2000.
  • [Wat18] John Watrous. The Theory of Quantum Information. Cambridge University Press, May 2018.
  • [Wer01] Reinhard F. Werner. All teleportation and dense coding schemes. Journal of Physics A: Mathematical and General, 34(35):7081–7094, August 2001.
  • [Wil13] Mark M. Wilde. Quantum Information Theory. Cambridge University Press, Cambridge, UK, 2013.
  • [Yas16] Pavel Yaskov. A short proof of the Marchenko–Pastur theorem. Une courte démonstration du théorème de Marchenko–Pastur. Comptes Rendus Mathématique, 354(3):319–322, March 2016.