Rigidity of superdense coding
Abstract
The famous superdense coding protocol of Bennett and Wiesner demonstrates that it is possible to communicate two bits of classical information by sending only one qubit and using a shared EPR pair. Our first result is that an arbitrary protocol for achieving this task (where there are no assumptions on the sender’s encoding operations or the dimension of the shared entangled state) is locally equivalent to the canonical Bennett-Wiesner protocol. In other words, the superdense coding task is rigid. In particular, we show that the sender and receiver only use additional entanglement (beyond the EPR pair) as a source of classical randomness.
We also investigate several questions about higher-dimensional superdense coding, where the goal is to communicate one of possible messages by sending a -dimensional quantum state, for general dimensions . Unlike the case (i.e. sending a single qubit), there can be inequivalent superdense coding protocols for higher . We present concrete constructions of inequivalent protocols, based on constructions of inequivalent orthogonal unitary bases for all . Finally, we analyze the performance of superdense coding protocols where the encoding operators are independently sampled from the Haar measure on the unitary group. Our analysis involves bounding the distinguishability of random maximally entangled states, which may be of independent interest.
1 Introduction
In quantum information theory, rigidity is a phenomenon where optimal performance in an information processing task requires using a protocol satisfying extremely stringent constraints — in some cases, there is essentially a unique optimal protocol. The primary examples of rigidity come from nonlocal games (also known as Bell tests in the physics literature). In this setting two spatially separated parties Alice and Bob play a game with a third-party called the referee. In order to maximize their chances of winning, before the game starts Alice and Bob choose an entangled state to share as well as local measurements to perform on the state. For example, in the famous CHSH game the optimal winning probability is , and a canonical strategy that achieves this uses a (rotated) EPR pair and single-qubit Pauli measurements. The CHSH game is rigid in the sense that any optimal strategy for the CHSH game is identical to this canonical strategy, up to local changes of basis.
The study of rigidity in quantum information processing arguably started with the work of Mayers and Yao [MY98, MY04], who initiated the concept of device-independent cryptography. The idea behind this subject is that a classical user can verify that untrusted quantum hardware is behaving as intended — say, generating random keys or performing a quantum computation – simply by verifying that the hardware is employing a (near)-optimal strategy in a rigid nonlocal game. Since the work of Mayers and Yao, nonlocal game rigidity has been an extremely fruitful concept in quantum cryptography (see, e.g., Refs. [VV19] and [CGJV19]), complexity theory (see Ref. [JNV+20] and the references therein), and quantum information more generally [ŠB20]. This motivates the following question: what other tasks in quantum information also exhibit rigidity phenomena?
To our knowledge, the only other work on rigidity phenomena outside of nonlocal games is that reported in Refs. [TKV+18, FK19] on the rigidity of quantum random access codes (QRACs). The authors study “” QRACs, which encode classical dits into a -dimensional system, such that either or may be retrieved by performing a suitable measurement. These works show that QRACs are rigid, and in fact certify measurements based on mutually unbiased bases (MUBs).
In this paper we investigate the rigidity properties of superdense coding, which plays a fundamental role in quantum Shannon theory (see, e.g., Ref. [Wil13, Chapter 6]). The superdense coding task is to communicate one of four possible messages while only transmitting one quantum bit across a channel. The superdense coding protocol, first proposed by Bennett and Wiesner [BW92], achieves this task in the following way: Alice and Bob share one qubit each of an EPR pair (i.e., the maximally entangled state ) in advance, and to transmit a message , Alice applies a one of four Pauli operators to her half of the EPR pair and sends her qubit. Bob then performs a Bell measurement on the qubit received from Alice and his qubit to determine .
1.1 Rigidity for superdense coding of two classical bits
The first result in our paper is to show that superdense coding is rigid: any protocol that accomplishes this task is “locally equivalent” to the Bennett-Wiesner protocol. We model arbitrary protocols for superdense coding in the following manner: Alice and Bob share a density matrix on a bipartite Hilbert space , where we assume without loss of generality that factors into where is isomorphic to . Given an input , Alice applies a unitary operator (called an encoding operator) to her share of (with support in the space ), sends the qubit to Bob, and Bob then performs an optimal distinguishing measurement on the Hilbert space to determine what the input was. See Figure 1 for an illustration of a general superdense coding protocol.
\Qcircuit@C=1.5em @R=1em
& Alice
\lsticki \control\cw\qwx[1] \cw \cw \cw \cw \cw \cw \lsticki
\lstickA’ \multigate1U_i \qw \qw \qw \qw \qw \rstick \qw
\lstickA” \ghostU_i \qw \qw\qwx[1] \push
\qwx[1]
\qwx[1] Bob
\push \multigate1M \qw \qw \qw
\lstickB \qw \qw \qw \ghostM \qw \meter \push \cw \lsticki
\gategroup1148.7em.
\gategroup6188.7em.
A priori it appears daunting to characterize the structure of an arbitrary superdense coding protocol. For one, the dimension of the spaces and are unbounded, and the state is uncharacterized. Furthermore, the encoding unitary operators can be extremely complicated, potentially performing complex entangling operations between the space and (the qubit to be sent over to Bob). However, the property of being a superdense coding protocol is extremely constraining. Theorem 1.1 gives a precise characterization of how an arbitrary superdense protocol is locally equivalent to the canonical Bennett-Wiesner protocol. In the statement of the theorem, “” denotes equality of two unitary operators with respect to the state ; in other words, means that .
Theorem 1.1 (Rigidity for superdense coding).
Let denote a superdense coding protocol. Then there exist
-
1.
Unitary operators acting on and acting on ,
-
2.
An isometry mapping to a Hilbert space where is isomorphic to ,
-
3.
A density matrix on ,
-
4.
A set of pairwise orthogonal projectors that sum to the identity on , and
-
5.
A collection of unitary operators ,
such that, letting , we have
and for ,
where , , , and are the one-qubit Pauli matrices.
Theorem 1.1 can be interpreted as expressing rigidity for superdense coding in the following way: given an arbitrary protocol for superdense coding, there exists local isometries where if Alice applies and Bob applies to their share of , then an EPR pair is extracted with an auxiliary state remaining. By pre-applying to Alice’s unitary operators , we discover that has a very regular form: operationally, it can be interpreted as performing some projective measurement on Alice’s part of the auxiliary state to obtain some outcome in a set , and then based on , applying a rotated version of the standard Bennett-Wiesner superdense coding protocol to Alice’s part of the EPR pair. Finally, after sending her EPR qubit, Alice then applies some unitary operator on her remaining qubits (which does not affect Bob’s measurement in any way). This considerably strengthens and extends the characterization of “tight” superdense coding protocols due to Vollbrecht and Werner [VW00, Lemma 3] (see also Ref. [Wer01]); they studied protocols in which the shared entangled state is a state on (or on in the case of -dimensional superdense coding protocols; see Section 1.2). This difference would be significant in a cryptographic setting; in the context of quantum key distribution, this is the difference between the device-independent and semi-device-independent settings.
The proof of Theorem 1.1 is given in Section 3. It proceeds via a number of reductions: first, using an information-theoretic argument, we show every superdense coding protocol is locally equivalent to one that uses an EPR pair in the state . Given this, we then show that each of the encoding operators can be individually block-diagonalized with respect to the EPR pair. Finally, we show that the blocks across the different encoding operators can be “matched up” in a way that they correspond to the Pauli matrices. Each of these steps requires carefully deducing the structure imposed on the state and the encoding operators by the correctness of the protocol.
1.2 Rigidity for higher dimensional superdense coding?
We then consider the generalization of superdense coding to communicating more than classical bits. Specifically, we consider protocols for communicating one of possible messages by sending a -dimensional quantum system over the channel — we call these -dimensional superdense coding protocols. A canonical protocol for -dimensional superdense coding is as follows: the players share a -dimensional maximally entangled state , and given message , Alice applies a unitary operator to her share of , and sends it over to Bob. The family of unitary operators can be any orthogonal unitary basis for the space of matrices. (The orthogonality property means that if and only if .) An example of such a basis is the set of Heisenberg-Weyl operators. In dimension , these are a set of matrices defined as follows. Let be a primitive th root of unity. For , let where is the “shift” operator, and is the “clock” operator. Does the rigidity phenomenon also extend to dimensions larger than ?
The second result of this paper is that -dimensional superdense coding for is not rigid in the same sense as Theorem 1.1: there are -dimensional superdense coding protocols which are not locally equivalent to each other. This is because in dimensions three and higher there are inequivalent orthogonal unitary bases. (In contrast, all orthogonal unitary bases in dimension two are equivalent to the Pauli matrices.) Here, equivalence between two unitary bases and means there exist unitary operators such that for all , we have for some choice of complex phase .
Theorem 1.2 (Existence of inequivalent orthogonal unitary bases for all ).
For every dimension , there are orthogonal unitary bases that are not equivalent to each other.
The uniqueness of orthogonal unitary bases was first studied by Vollbrecht and Werner [VW00], and the existence of non-equivalent orthogonal unitary bases for all dimensions greater than was observed in follow-up work by Werner [Wer01]. (We elaborate on prior work on the topic in Section 4.2.) Werner described, without proof, how non-equivalent bases may be constructed. We present explicit constructions of such bases in Section 4. The construction for is based on the observation that the shift operator corresponds to a perfect matching in , the complete bipartite graph. Moreover, its powers correspond to a partition of the edge set of into disjoint perfect matchings. By replacing this partition with another carefully chosen such partition, we obtain an orthogonal unitary basis that is not equivalent to the clock and shift construction. The proof of non-equivalence involves comparing the spectra of the operators in the two bases, taking into account the complex phase and unitary operators that witness a potential equivalence map. For , we follow a construction described by Werner [Wer01]. We prove non-equivalence to the clock and shift basis by showing that the resulting basis is not a commutative projective group (again accounting for a potential equivalence map).
In a previous version of this paper, we conjectured that rigidity for higher dimensional superdense coding holds up to choosing orthogonal unitary bases [NY20, Conjecture 1.3]. That is, every -dimensional superdense coding protocol is locally equivalent (in the sense of Theorem 1.1) to one where Alice and Bob share an entangled state for some density matrix , and Alice’s encoding operators are of the form
where is a set of pairwise orthogonal projectors that sum to the identity on and for every , the set is an orthogonal unitary basis for the space of complex matrices. This would be a natural extension of the statement of Theorem 1.1 to the case of general where the registers are treated as a source of “shared randomness” to help Alice and Bob synchronize their choice of orthogonal unitary basis.
We can show that when the shared entangled state between Alice and Bob is a pure state in , then this conjecture holds (see Section 4.1): up to local unitary operators, the shared state is necessarily the maximally entangled state of local dimension , and the encoding operators necessarily form an orthogonal unitary basis. However, this conjecture is false for protocols where Alice sends only a part of her entangled state. In work subsequent to ours, Farkas, Kaniewski, and Nayak [FKN22] show that there exist infinitely many superdense coding protocols that are not locally equivalent to a protocol of the form described in the conjecture. In particular, in these counterexample protocols Alice may perform a complicated entangling operation between her message and the rest of her state, rather than just treating the ancilla system as a source of shared randomness.
1.3 Superdense coding protocols with error
Finally, we consider probabilistic protocols for -dimensional superdense coding, where Bob’s decoding only needs to succeed with high probability. In particular, we say that is a -superdense coding protocol if Bob is able to decode Alice’s message with probability at least , for all . We focus on the case where Alice and Bob share an entangled state in (i.e., have local dimension ). As mentioned previously, in the exact case , their shared state is necessarily the maximally entangled state and Alice’s encoding unitary operators form an orthogonal unitary basis. We conjecture that even in the probabilistic setting, this characterization of -dimensional superdense coding protocols is robust, in the following sense.
Conjecture 1.3.
There exist functions where and monotonically decrease to as , such that the following holds. For all -superdense coding protocols such that is a density matrix on and are unitary operators in , we have
and there exists an orthogonal unitary basis for the space of complex matrices such that for all ,
where denotes the normalized Hilbert-Schmidt norm on the space of matrices.
We note that the choice of the normalized Hilbert-Schmidt norm in the statement of 1.3 is somewhat arbitrary; one can also consider other formulations of the conjecture with other norms (such as the spectral norm, etc.).
The last part of our paper analyzes a possible challenge to 1.3 proposed by Aram Harrow. Consider the following probabilistic construction for a potential -superdense coding protocol: independently sample matrices from the Haar measure on , the group of complex unitary matrices. Let denote the -dimensional maximally entangled state. How well does the protocol accomplish superdense coding?
In classical and quantum communication, many tasks can be performed near-optimally via probabilistically constructed protocols. See, e.g., the text by Wilde [Wil13] for examples from Shannon theory. A simple example from communication complexity is the task of quantum fingerprinting [BCWdW01], which enables checking whether two -bit strings and are equal by only comparing two -qubit fingerprints of the strings. It can be shown that picking random -qubit states for each -bit string yields a good quantum fingerprinting protocol.
Let denote the random superdense protocol specified by . Note that the error of , when averaged over the choice of random unitaries , is some function of . We first argue that the conjecture implies that the error of a random superdense coding protocol, when averaged over the choice of , cannot be too small:
Proposition 1.4.
Put another way, it cannot be that both 1.3 is true and also the random superdense protocol has error vanishing so quickly such that is smaller than , on average. Due to the concentration of measure phenomenon for Haar-random states and unitary operators (as expressed by, e.g., the Lévy-like property in Theorem 5.5), it is plausible a priori that the average error , and therefore also , scale as . Thus, the random superdense protocol is potentially a counterexample to 1.3.
We show that this probabilistic construction does not yield a good superdense coding protocol: with overwhelmingly high probability over the choice of random unitary operators , the protocol has a nonzero probability of error that is independent of . Thus, 1.3 is not ruled out by the random protocol construction.
Theorem 1.5 (Performance of a random superdense coding protocol).
The random superdense coding protocol specified by where are Haar-random unitary operators has error at least as , with high probability over the choice of .
We prove Theorem 1.5 by showing that the distinguishability of the ensemble of random states is bounded away from (with high probability). The generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b] relate the distinguishability of an ensemble to the quantity
(1.1) |
Our analysis of this quantity is largely inspired by work due to Montanaro [Mon07] on the distinguishability of random pure quantum states. However, extending his approach to the ensemble of interest to us—one consisting of random maximally entangled states—involves significant technical difficulties. The approach involves relating the distinguishability of an ensemble of states to the spectrum of the ensemble average. In the case of Haar-random pure states, the ensemble average is well approximated by the ensemble average of unnormalized complex gaussian vectors with suitably chosen variance. The spectrum of such matrices in the asymptotic limit is given by the Marčenko-Pastur Theorem from random matrix theory. In our case, the entries of the random vectors in the ensemble are not independent. We instead bound the generalized Holevo-Curlander quantity in Equation 1.1 by employing a recent generalization of the Marčenko-Pastur Theorem due to Yaskov [Yas16]. (The theorem was proven for ensembles of random real vectors. We verify that its proof also extends to complex random vectors with analogous properties.) In the process, we show that random maximally entangled states satisfy a pseudo-isotropy condition that suffices for the theorem to hold.
A subtlety in the use of the Marčenko-Pastur law is that we would like to deduce the convergence of a sequence of means to the mean of the limiting distribution from the convergence of a sequence of distributions. This does not necessarily hold in general. In order to prove such a relation between the two forms of convergence, we show that random maximally entangled states are sub-gaussian. This allows us to draw on a generalization of the Bai-Yin Theorem, which bounds the norm of matrices whose columns are given by i.i.d. sub-gaussian vectors. We thus show that the norm of the ensemble average has an exponentially decaying tail, which in turn guarantees the form of convergence we seek.
We believe the techniques used in our analysis are of independent interest. In fact, the subtlety mentioned above was overlooked by Montanaro; the ideas we develop may also be used to close a gap in his analysis (see Section 5.4 for the details).
1.4 Further remarks and open questions
In this paper we have initiated the study of rigidity phenomena in superdense coding protocols. Given its importance in quantum Shannon theory, our study may shed new light on protocols based on superdense coding. The power of entanglement as a resource in distributed quantum computation, in particular in two-party communication complexity, remains a mystery. The rigidity theorem we establish (Theorem 1.1) gives a complete picture for a simple but fundamental task. The property shown in the analysis of random superdense coding protocols (Theorem 1.5) may also be interpreted as placing a limit on how closely a sequence of random unitary operators approximate an orthogonal basis. This may be of relevance to the theory of error-correction, where unitary error bases play a central role.
We list several open questions that arise from this work:
-
1.
Is 1.3 true? Does a robust version of Theorem 1.1 hold?
-
2.
Do -dimensional superdense coding protocols, in which the shared state between Alice and Bob may have local dimension larger than , also exhibit some non-trivial form of rigidity?
-
3.
Does rigidity also hold for quantum teleportation, a task that is “dual” to superdense coding? Can this be derived in a black-box way from the rigidity of superdense coding?
- 4.
-
5.
What other quantum information processing tasks have the rigidity property?
We believe the investigation of these questions will lead to significant new insights into the nature of quantum information, with wide ranging ramifications.
Acknowledgements.
We thank the anonymous journal reviewers for their thorough feedback on an earlier version of the paper. We thank Adam Bouland, Chinmay Nirkhe, and Zeph Landau for stimulating discussions at the beginning of this project. H.Y. would like to especially thank Adam Bouland for his integral role in formulating the questions explored in this paper. We would like to thank Pavel Yaskov for the correspondence on his work on the Marčenko–Pastur theorem. We thank Jędrzej Kaniewski and Mate Farkas for their pointers to the literature on rigidity of QRACs. A.N. would like to thank Kanstantsin Pashkovich and Vern Paulsen for helpful discussions on bases of orthogonal unitary operators, and is grateful to the Berkeley CS Theory Group for their hospitality during a visit in Fall 2017, when this work was initiated. A.N.’s research is supported in part by a Discovery Grant from NSERC Canada. H.Y. is supported by an NSERC Discovery Grant, a Google Quantum Research Award, and AFOSR Grant No. FA9550-21-1-0040. This research was partly conducted at the Kavli Institute for Theoretical Physics during the Quantum Physics of Information program in 2017 (and thus this research was supported in part by the National Science Foundation under Grant No. NSF PHY17-48958).
2 Properties of superdense coding
2.1 Quantum information basics
We refer the reader to texts such as [NC11, Wat18, Wil13] for the basics of quantum information, and mention some notational conventions here.
We write to denote the identity operator on a Hilbert space. We use superscripts on quantum states, e.g., or , to denote the registers in which they are stored. Similarly, we use subscripts on operators to indicate the registers on which they act, unless this is clear from the context. Given a bipartite density matrix , we write to denote its reduction to register (i.e., the partial trace over ).
Given operators and a density matrix acting on a Hilbert space , we write to denote . In other words, the operators and have the same action on the state .
We write to denote the maximally entangled state on two qubits. We write to denote the -dimensional maximally entangled state, or simply if the dimension is clear from context. We recall the single qubit Pauli matrices:
2.2 Basic properties of superdense coding
Here we give a formal definition of a general superdense coding protocol, and prove some basic properties about them.
Definition 2.1 (Superdense coding protocol).
Let be a positive integer. Let and be finite dimensional Hilbert spaces where is isomorphic to . Let denote a density matrix on and let denote a sequence of unitary operators (called encoding operators) acting on . We say that is a -superdense coding protocol if there exists a POVM acting on such that
(2.1) |
where denotes the reduced density matrix of on registers . When we simply call a -dimensional superdense coding protocol.
Lemma 2.2 (Orthogonality conditions I).
Let be a -dimensional superdense coding protocol. Then letting denote the reduced density matrix of on registers , we have that
Proof.
Let denote a POVM satisfying Equation 2.1 for . Then for all , we have according to the positive semidefnite ordering. This is because if we write for some probabilities , then implies that for all , , which implies that is an eigenvector of with eigenvalue . This implies that for some positive semidefinite operator , and this is at least in the positive semidefinite ordering.
This means then that for all ,
where the first inequality is due to the positivity of and , the second inequality is due to the fact that , and the last equality is due to the fact that . Therefore . Thus we have
so . ∎
Lemma 2.3 (Orthogonality conditions II).
Let be a -dimensional superdense coding protocol. Then for all ,
where denotes the reduced density matrix of on register and denotes the partial trace over register .
Proof.
Let denote a purification of on the Hilbert space , where is the purifying space. Clearly, the protocol where Bob also has access to the purifying space is also a -dimensional superdense coding protocol.
Let denote an orthonormal basis for . Let be the (sub-normalized) pure state on registers given by
Intuitively, represents the residual state of the protocol on registers when Alice applies unitary operator , and then measures the subsystem in the standard basis to obtain outcome .
Note that if we let denote the state reduced to the registers , we have the identity
because we can think of as the result of first measuring the register in the standard basis, and discarding the outcome. Applying Lemma 2.2 to the purified protocol , we have for ,
and therefore for all . This implies that for all , which can be rewritten as
This is equivalent to the statement that
Since this holds for all , the matrix is identically zero, which completes the proof of the Lemma. ∎
Next we define what it means for superdense protocols to be locally equivalent.
Definition 2.4.
Let be a Hilbert space where is isomorphic to . Let be density matrices on . Let be unitary operators acting on . We say that and are locally equivalent if there exists
-
1.
A unitary operator acting on ,
-
2.
A set of unitary operators acting on
such that
-
1.
, and
-
2.
.
Lemma 2.5 (Local unitary freedom of superdense coding protocols).
The following properties hold for local equivalence.
-
1.
If and are locally equivalent, then is a -superdense coding protocol if and only if is.
-
2.
Local equivalence is transitive.
Proof.
For any fixed , after Alice applies her encoding unitary operator, the reduced density matrix on registers is the same whether the protocol or is used. Thus Bob’s ability to distinguish between the different messages is exactly the same. This establishes Item 1.
If and are locally equivalent, then , and . If and are locally equivalent, then , and . Thus we have
which implies that is locally equivalent to . This establishes Item 2. ∎
2.3 Nice form protocols
In this section, we define nice form protocols and then show that every superdense coding protocol is locally equivalent to one that has a nice form.
Definition 2.6.
A -dimensional superdense coding protocol has a nice form if
-
1.
,
-
2.
There exists an isometry where is isomorphic to such that
for some density matrix on .
-
3.
For all , we have that
where denotes the reduced density matrix of on .
-
4.
Let the spectral decomposition of be where for all , with distinct. Then for all and , we have
Item 2 says that up to a local unitary operation, the two parties share a maximally entangled state (in addition to other entanglement), and Item 3 turns out to be a consequence of this. Item 4 is equivalent to saying that the encoding of distinct messages are orthogonal to each other. The proof of Lemma 2.7 below may give the reader further intuition into these properties.
In the proof of Lemma 2.7, we make use of an information-theoretic argument that involves quantities such as von Neumann entropy , conditional entropy , and mutual information . For a comprehensive reference on these quantities and their basic properties, we recommend Wilde’s textbook [Wil13]. It is an interesting question whether Lemma 2.7 can be proved without making use of these information-theoretic quantities.
Lemma 2.7.
All superdense coding protocols are locally equivalent to a superdense coding protocol that has a nice form.
Proof.
We define a unitary operator acting on and unitary operators acting on such that, letting and , the pair is a superdense coding protocol and has a nice form.
Let and let . This already yields Item 1 of Definition 2.6.
Let be a purification of where is a reference system of dimension . Consider the cq-state
where the Hilbert space of register is . By Lemma 2.5, the protocol is a superdense coding protocol. Therefore the information contained in registers about in the state is
Intuitively, this is because Bob can perfectly recover the value of , i.e., bits of information, from the registers of . On the other hand, we have that
because without the qubit register , Bob has no information about (the state of the register is the same for all ). Therefore we get
Using the entropy characterization of conditional mutual information, we get
Since and (because the dimension of register is ), we get that and .
Since is a classical register, we can write as
where is defined as with . Since , we have for all , and in particular for .
Then (because is pure), so . On one hand, we have that , and on the other hand, mutual information is always nonnegative. Thus , and the reduced density matrix of on the register is maximally mixed. Furthermore we have , so has no correlations between registers and :
(2.2) |
where is some density matrix on the registers.
Fix , and let denote . Let be a Hilbert space with dimension and let be isomorphic to . Let denote a purification of . Notice that is a purification of the state in Equation 2.2. Using Uhlmann’s Theorem [Uhl76] (also known as the Schrödinger-HJW Theorem [Sch35, HJW93]), there exists an isometry on such that
Since , we have that
Since we obtain Item 2 of Definition 2.6 for the protocol .
In what follows we let denote (which also equals ). We now establish Item 3 of Definition 2.6. Let be the spectral decomposition of where the are distinct and nonzero, and is the orthogonal projector onto the eigenspace of corresponding to eigenvalue . Since by Equation 2.2 we have
(2.3) |
for some density matrix , the states and have the same eigenvalues with the same multiplicities. That is, there are an orthogonal set of projectors such that
where for all . It follows that for all ,
For let be a unitary operator on such that for all . Since , our choice of suffices. Let . By Lemma 2.5, we have is a superdense coding protocol. Furthermore, Equation 2.3 implies that
which implies Item 3 of Definition 2.6.
Lemma 2.5 implies that is also a superdense coding protocol, so by Lemma 2.3 we have that for all ,
Since commutes with the for all , we have
Since the ’s are positive, we have
for all , and . This implies Item 4 of Definition 2.6.
∎
3 Rigidity for two-dimensional superdense coding
In this section we prove Theorem 1.1, that is, rigidity for -dimensional superdense coding protocols (coding bits into one qubit, with no error). For the remainder of this section we drop the qualification “-dimensional” for brevity.
The proof involves a number of steps. First, we invoke Lemma 2.7, which states that every superdense coding protocol is locally equivalent to one that has a nice form. Then, we argue that up to local equivalence, in every nice form superdense coding protocol, the encoding operators can be block-diagonalized. That is, we can write where the are a set of orthogonal projectors acting on summing to the identity, and the are a set of Hermitian unitary operators acting on . Next, we argue that (again up to local equivalence) across the different ’s, the projectors can be “matched up”, and the corresponding operators are all pairwise orthogonal. This implies that in fact are unitarily equivalent to the standard Pauli matrices . Using the property that local equivalence is a transitive relation, this concludes the argument.
Lemma 2.7 is proven in Section 2.3. We now proceed to prove the remaining steps in detail.
3.1 Block-diagonalizing nice form protocols
In this section, we analyze the structure of the encoding operators in a nice form superdense coding protocol in dimension two. We show that they apply a unitary operator on register , controlled by the state in register .
Theorem 3.1.
Let be a nice-form protocol. Then there exists a locally-equivalent protocol that has a nice form and for we have
for some orthogonal projectors on that sum to , and traceless, Hermitian unitary matrices on .
Proof.
Fix an . Since has a nice form (see Definition 2.6), this means that for some density matrix , and , where is a non-zero eigenspace of .
Fix a . By property 3 of a nice-form protocol, commutes with , and we can write
Let , and note that is unitary on the image of , i.e.:
For notational convenience we drop the subscripts and until the very end. Let and let .
The condition that implies that we can write as
where are block matrices that act on the image of and the block partitions are with respect to the tensor factor .
Let
give the polar decompositions of respectively where are positive semidefinite and are unitary on the image of .
Then
The relation implies that . For notational brevity we write and . Note that and have support in the image of and are simultaneously diagonalizable. Write and . Note that and are positive semidefinite, and also simultaneously diagonalizable. Write and . Continuing our simplification, we see that
Now our goal is to find a unitary operator acting on such that is Hermitian. This is equivalent to the conditions that and . We construct such an operator using some relations between the operators that we derive below.
We use the unitarity relation to obtain the equations
(3.1) | ||||
(3.2) |
These equations, along with the definitions of and , imply that the unitary operators are block-diagonal with respect to the eigenspaces of and (and therefore commute with and ). Another relation we get from unitarity is , which via commutativity of implies
(3.3) |
Let be the orthogonal projector onto . Since commutes with (and so does ), we have that commutes with . By Equation 3.3, we have . Since , we also have that . Note that Equation 3.3, together with the fact that are block-diagonal with respect to the eigenspaces of the product , implies that and must be equal on the support of (equivalently, the support of ).
We now construct the desired unitary . Let denote (i.e., the projection onto the kernel of within the image of ). Left-multiplying both sides of Equation 3.3 with (the pseudoinverse of on the image of ) and right-multiplying both sides by we get
(3.4) |
Recall that ; combined with the fact that and are both block-diagonal with respect to the eigenspaces of and , we have
(3.5) |
Furthermore, and are unitary on . Let , and consider its spectral decomposition where is an orthonormal basis for the support of . Let denote the principal square root of . Define . Observe that is supported only on and satisfies
(3.6) |
Consider the operator that is unitary on the support of , and acts non-trivially only on the support of . Combining Equations (3.4), (3.5) and (3.6) we get
where the second line follows from Equation 3.4 and Equation 3.6, the fourth line follows because commutes with , and the last line follows because of Equation 3.5. We also have that commutes with : this is because commutes with and also acts nontrivially only on the eigenspace of with eigenvalue .
Let . Putting everything together, we have
(3.7) |
where we used the fact that .
Let and be spectral decompositions of and where the reals ’s are nonnegative and distinct, and the operators are orthogonal projectors summing to . The operator has such a spectral decomposition because . Next, since the unitary operators and commute with , they are block-diagonal with respect to the projectors . Thus
The operator is unitary on and we can express it as where the ’s are complex numbers on the unit circle and are orthogonal projectors that sum to . Thus we can write and , and can be written as
(3.8) |
where is the matrix
Notice that has determinant and is traceless, therefore its eigenvalues are .
Re-introducing the indices and , we have deduced that for every block of corresponding to the eigenspace , there exists a map that is unitary on the image of such that
where the are Hermitian unitary operators with trace . Define the unitary operator on as . If we sum over , we get
where and we have re-indexed the sum over to be a sum over indices . Let for all . Then, letting denote the projector onto the support of , we have
where have suppressed the tensoring with identity that extends all the operators to the same space, and used the property that , , and . Thus, the unitary operators satisfy the conclusions of the theorem statement. Let , so that is a superdense coding protocol by Lemma 2.5.
Furthermore, since has a nice form, it can be verified that also has a nice form. First, since , we have that (and hence Item 1 of Definition 2.6 is satisfied). Item 2 of Definition 2.6 is satisfied since . Third, since commutes with and is block-diagonal with respect to the eigenspaces of , it follows that also commutes with (so Item 3 of Definition 2.6 is satisfied). Finally, we have
Thus, Item 4 of Definition 2.6 is satisfied. This completes the proof of the Theorem. ∎
3.2 Matching the blocks of the encoding operators
In the previous section we saw how, up to local equivalence of protocols, we can express the encoding operator in a two-dimensional superdense coding protocol as a block-diagonal matrix with Hermitian unitary operators on the diagonal. In this section we relate the decompositions to each other. Ultimately, the conclusion is that the blocks “line up”, so that the operators in the same diagonal block of the four encoding operators are the four single qubit Pauli operators.
Theorem 3.2.
Let be a superdense coding protocol that has a nice form where for we have
for some orthogonal projectors on that sum to , and traceless, Hermitian unitary matrices on . Then is locally equivalent to a superdense coding protocol that has a nice form and satisfies the following: there exist orthogonal projectors on that sum to the identity and traceless, Hermitian unitary operators such that for all ,
Furthermore, for all we have
Proof.
The first step is to “coarse-grain” the projectors so that the associated operators are all inequivalent in the following sense. For each , we say that and are -equivalent if . For every , this forms an equivalence relation on the ’s. Let denote the least such that and are -equivalent. Define to be such that .
For every , for every , define the unitary operator which acts on (and set ). Then if we define and , by Lemma 2.5 we get that the pair is a superdense coding protocol, and furthermore the operators admit a block-diagonalization where for all , the associated unitary operators are all inequivalent. It is also straightforward to check that has a nice form.
Next, from Lemma 2.7 we have that for
(3.9) |
By left-multiplying the above expression by for some and right-multiplying by for some yields . Therefore, if , it follows that .
Given three sets of projectors , , and we can define the following tripartite graph , which we call the overlap graph. Associate a vertex with every projector for . Include an edge between and if and only if . A triangle in the graph corresponds to a triple of projectors such that the pairwise products are all nonzero. Given encoding operators as in Equation 3.9, we use triangles to match their blocks.
Lemma 3.3 (Reduction Lemma).
Let , , and be sets of orthogonal projectors with the following properties:
-
1.
-
2.
For , for all , implies that .
-
3.
For all , the are inequivalent.
Then there exists a triangle and a unit vector such that
We shall assume for now that the Reduction Lemma holds. We show how this gives us an iterative decomposition procedure to construct the orthogonal projectors satisfying the conclusions of the Theorem.
The sets , , and satisfy the required conditions of the Reduction Lemma with . Thus there exists a triangle and a vector that is a common eigenvector of . Thus we can write
where , , and are orthogonal projectors with rank one smaller.
Define the sets to be the sets with the projectors replaced by .
Observe that satisfies the required conditions of the Reduction Lemma, with
for all . Applying the Reduction Lemma again, we find another triangle and a common eigenvector of the triangle. We continue this process of reducing the rank of at least one operator each in the sets and finding common eigenvectors until we have fully expressed
where , and for every , the pairwise inner products satisfy . This concludes the proof of the Theorem. ∎
Before proving the Reduction Lemma we establish the following lemma, which claims that up to conjugation by the same unitary operator, the only collection of traceless, Hermitian, unitary, mutually orthogonal matrices are the single-qubit Pauli matrices.
Lemma 3.4.
Let be unitary matrices that are traceless, Hermitian, and satisfy for all . Then there exists a unitary operator such that
Proof.
We find a sequence of unitary operators such that satisfies the conclusions of the lemma. Because is unitary, Hermitian and traceless, we can unitarily diagonalize it as . Define as the unitary operator with
Let for . These are all traceless, Hermitian, pairwise orthogonal unitary matrices, and furthermore . Suppose
Since and is traceless, we have that , and since is Hermitian and unitary, we have for some . Let
and for . Again, the operators remain traceless, Hermitian unitary, and pairwise orthogonal, and furthermore and . Suppose
From , Hermiticity, unitarity, and tracelessness of we have again that and for some . From we have that , which means that . If , then set . Otherwise, set . Let for . We have that , , . Thus, letting , we obtain the desired conclusion of the lemma. ∎
We now turn to proving the Reduction Lemma.
Proof of Lemma 3.3.
Define .
No two triangles in share an edge.
Suppose we have two triangles corresponding to projectors and for some . We then have the equations
By Lemma 3.4, this implies that there exists a unitary operator such that
Therefore we obtain that
If
the above equation implies that and , or equivalently that . Thus , or in other words , contradicting the assumption that and are inequivalent.
Every vertex is in a triangle.
Consider for some . There exists an index such that , because the operators form a resolution of . Since also forms an orthogonal resolution of , we have that
This implies that there exists an index such that and .
Finding a common eigenvector of a triangle.
Fix a triangle . For notational simplicity we shall write , , and . First, observe that if , then there exists some index such that . This implies that forms a triangle in . But this cannot happen as this triangle would share the edge with . Therefore , i.e., .
By symmetry, we also get that and . Thus we have
Since is a product of three projectors, its spectral norm is at most . Let be a unit vector realizing the spectral norm of , i.e. such that . Then
The inequality implies that (since it is not zero and is at most one). Therefore is a vector such that . ∎
We now put everything in this section together to prove Theorem 1.1, which we restate here for convenience.
See 1.1
Proof.
Putting together Lemma 2.7, Theorem 3.1, and Theorem 3.2, we get that all superdense coding protocols are locally equivalent to one that has a nice form and satisfies the conclusions of Theorem 3.2. Finally, we apply Lemma 3.4 to the conclusions of Theorem 3.2 to obtain the conclusions of Theorem 1.1. ∎
4 Superdense coding and orthogonal unitary bases
In this section, we prove that there are multiple non-equivalent superdense coding protocols for transmitting messages for , even when no ancilla is used in the encoding process, and there is no error in decoding. This implies that rigidity of superdense coding protocols for may only hold in a relaxed form: as we see in this section, rigidity may hold only up to the choice of an orthogonal unitary basis for the space of linear operators .
4.1 The connection with unitary bases
We draw a connection between superdense coding and bases for the vector space of complex matrices. Although this connection may be inferred from Lemma 2.7, we give a simple and direct derivation here.
For any integer , consider a protocol for superdense coding of classical strings using a shared entangled state with local dimension , and a single -dimensional message. Assume that the protocol does not use any ancilla in the encoding process, and that there is no decoding error. Such a protocol necessarily has a simple form, as we describe below.
First, we argue that the initial shared state is maximally entangled. Bob’s state after the message has support in a -dimensional space. Since there are strings, and these are decoded without error, the corresponding states are orthogonal and pure. So the mixed state of the entire encoded state, corresponding to a uniformly random string, is completely mixed. However, the marginal of this state on the register initially held by Bob is be the same as the marginal for any fixed string. Thus Bob’s share of the initial state is also the -dimensional completely mixed state. This implies that the initial shared state is maximally entangled.
Any maximally entangled state with local dimension is of the form , where are unitary operators in , and . Therefore, without loss of generality, we may assume that Alice and Bob initially share the state . When the dimension is clear from the context, we omit it from the subscript.
Second, since the encoding of any message is pure, Alice’s local operations satisfy the following properties. On input , Alice applies a unitary operator to her share of the state , and sends the share to the Bob. Since Bob can decode the input with probability , the states are all orthogonal, i.e., for all distinct , we have
This condition is equivalent to the property that the operators are mutually orthogonal with respect to the Hilbert-Schmidt inner product:
(4.1) |
Thus, the operators form an orthogonal unitary basis for the space of linear operators on .
It is straightforward to verify that any such basis for leads to an errorless superdense coding protocol for classical messages. Thus, the study of rigidity of superdense coding protocols as above is equivalent to the study of orthogonal unitary bases.
A well-known example of an orthogonal unitary basis in dimension is generated by the “clock” and “shift” operators. The elements of this basis are also known as the generalized Pauli operators or the Heisenberg-Weyl operators. Let be a primitive th root of unity. For , the th operator in the basis is defined as , where is the shift (or Pauli X) operator, and is the clock (or Pauli Z) operator.
4.2 Uniqueness of orthogonal unitary bases
Given an orthogonal unitary basis for , we may derive other such bases by conjugating elements of by a pair of unitary operators, and mutliplying each basis element by a potentially different complex number of unit modulus. Since this is a rather straightforward method to derive new bases, we consider the new basis to be equivalent to .
Definition 4.1.
Let be an orthogonal unitary basis for . We say that an orthogonal unitary basis is equivalent to if there exist unit complex numbers and a pair of unitary operators such that
(4.2) |
We may verify that this defines an equivalence relation.
Another way to construct an orthogonal unitary basis is by taking tensor products of bases in lower dimensions. Suppose is composite, with and , and and are orthogonal unitary bases for and , respectively. Then
is an orthogonal unitary basis for . This hints at the possibility that are bases that are not equivalent to each other under operations as in Eq. (4.2). The following proposition confirms this for dimensions which are powers of two.
Proposition 4.2.
Suppose for an integer . Let be the basis for obtained by taking tensor products of two-dimensional Pauli X and Z operators, i.e.,
The basis is not equivalent to , the -dimensional clock and shift basis.
Proof.
The intuition behind the statement is that tensor products of the two-dimensional Pauli operators in all have at most two distinct eigenvalues (either , or , or ), whereas some of the operators in the clock and shift basis have distinct complex eigenvalues. Due to the freedom available in generating equivalent bases, we need additional arguments to formalize this intuition.
Suppose that and are equivalent and consider unitary operators which show their equivalence. Consider the operator that is mapped to the identity in under the equivalence. Let be a complex number of unit modulus such that . Then .
Suppose the operator is mapped to the clock operator , and that for some complex number . Then . The operator on the right hand side has at most two eigenvalues (either , or , or ) as has eigenvalues , or , or . However, the clock operator has distinct eigenvalues, the th complex roots of unity. Since , we get a contradiction, and we conclude that and are not equivalent. ∎
It is then natural to wonder if there is a unique orthogonal unitary basis in prime dimensions, up to the equivalence defined above. In Section 4.4 we show that even this does not hold, by giving an explicit construction of a basis in any dimension that is not equivalent to the clock and shift basis.
After our discovery of non-equivalent bases, we learned that the question of uniqueness has been studied before by Vollbrecht and Werner [VW00]. They prove the uniqueness of the basis consisting of the Pauli operators in dimension two, and state that the problem of characterizing orthogonal unitary bases in dimensions larger than two is open. They also give a construction of “shift-and-multiply” bases from a collection of complex Hadamard matrices of dimension and a Latin square. This construction and non-equivalent bases are discussed in more detail by Werner in subsequent work [Wer01], although the notion of equivalence there does not include multiplication by phases (complex numbers of unit modulus). Werner states without proof that the existence of non-equivalent bases in dimension at least five follows from the existence of non-equivalent Hadamard matrices or non-equivalent Latin squares, even when the dimension is prime. In dimension three, Werner describes how we may construct non-equivalent bases, but does not explicitly present them. We present a concrete instance of this construction in Proposition 4.9. Altogether, we have the following result.
Theorem 4.3.
For every dimension , there are orthogonal unitary bases that are not equivalent to the clock and shift basis.
The theorem implies that for any , there are non-equivalent superdense coding protocols for transmitting messages, even when no ancilla is used in the encoding process, and there is no error in decoding.
Orthogonal unitary bases have also been studied in the context of quantum error-correction under the name “unitary error bases” (see, e.g., Ref. [MV16] and the references therein). In addition to the shift-and-multiply construction, several other methods such as the “Hadamard method” and the “algebraic method” have been proposed for their construction. The “quantum shift-and-multiply” method due to Musto and Vicary [MV16] simultaneously generalizes the shift-and-multiply and Hadamard methods. Musto and Vicary give examples of orthogonal unitary bases resulting from this method that are not equivalent to those derived from any of the other methods mentioned above. However, they give explicit examples only in dimension 4. As far as we can tell, earlier explicit constructions, for example those due to Klappenecker and Rötteler [KR03], were also for a few small dimensions.
4.3 Some useful properties
Here we present two properties that are used in an explicit construction leading to Theorem 4.3. The following property of the eigenvalues of the clock and shift operators helps in proving non-equivalence to another basis. Recall that is a primitive th root of unity.
Lemma 4.4.
Let be an integer, and let . The eigenvalues of the operator are all of the form
for some .
Proof.
Since , we have . So the eigenvalues of are th complex roots of , and the lemma follows. ∎
We also use the following simple number-theoretic property in the construction of new orthogonal unitary bases.
Lemma 4.5.
Any integer has at most positive integer divisors.
Proof.
If is prime, then it has exactly two positive integer divisors, , and the lemma holds.
Suppose is composite and has prime factorization , where and are positive integers, and are distinct prime numbers arranged in increasing order. The number of positive integer divisors of equals .
Since is composite, either and , or .
Suppose . If , the lemma follows since for all positive integers , for any . If , we have since . Since for all integers , the lemma again follows.
Now suppose . Since and for all whenever and , the number of divisors of is bounded as
as claimed. ∎
4.4 Explicit constructions
We now proceed to describe an explicit construction for all dimensions . The construction we give has the same form as the shift-and-multiply construction due to Vollbrecht and Werner [VW00]. In particular, the bases we present correspond to the construction with the Fourier transform over the cyclic group of order as the Hadamard matrix, and certain Latin squares that are not equivalent to the one generated by .
We construct bases that are not equivalent to the clock and shift basis by introducing a modification. In particular, we replace the operators generated by by another sequence. Note that the operators correspond to permutations on , i.e., they permute the standard basis of . Conversely any permutation on corresponds to the operator which permutes standard basis elements of . So this is a bijection.
It is also helpful to view a permutation on as a perfect matching in the complete bipartite graph , with vertex in one part being matched with the vertex in the other. This mapping defines a bijection between permutations and perfect matchings. The construction we give relies on these three equivalent views of a permutation, and uses the same letter to refer to the corresponding matching and the linear operator on .
We start with the following observation.
Lemma 4.6.
Let be a sequence of disjoint matchings in the graph . Then the matrices
form an orthogonal unitary basis.
Proof.
Since permutes the standard basis vectors of , it is a unitary operator. Therefore is also unitary.
Now consider the inner product of and for two pairs and . We have
If , the inner product is since for any , the disjoint matchings and match the vertex to distinct vertices. If , but , the inner product is again as is a th root of unity. ∎
We show that to derive non-equivalent bases, it suffices to have two of the matchings satisfy simple properties.
Lemma 4.7.
Let , and be a sequence of disjoint matchings in such that
-
1.
is the identity permutation, i.e., matches vertex in one part to vertex in the other part; and
-
2.
the permutation has a cycle of length such that does not divide .
Then the basis is not equivalent to the clock and shift basis.
Proof.
The intuition here is the following. The operator corresponding to the permutation has the distinct th roots of unity as eigenvalues. In particular, the eigenvalues include and . On the other hand, the eigenvalues of any operator in the clock and shift basis are of the form for some integer , and a fixed unit complex number depending only on the operator. Since does not divide , the operator does not belong to the clock and shift basis.
Formally, suppose that is equivalent to the clock and shift basis, and the equivalence is given by unitary operators . Suppose that the identity operator is mapped to and is mapped to under this equivalence. That is, and for some .
From the equation for we have , so that . Equivalently, we have
(4.3) |
where . By Lemma 4.4, there is a fixed such that the eigenvalues of the operator on the left hand side of Eq. (4.3) are of the form for some integer . On the other hand, the operator on the right hand side of the equation is similar to . Since has eigenvalues and , we have and for some integers . Eliminating , we get . This implies that
for some integer , or equivalently that . This is a contradiction, as does not divide . ∎
Finally, we prove that matchings as in the hypothesis of Lemma 4.7 exist.
Lemma 4.8.
For any integer , there is a sequence of disjoint matchings in such that
-
1.
is the identity permutation, i.e., matches vertex in one part with vertex in the other part; and
-
2.
the permutation has a cycle of length such that does not divide .
Proof.
By Lemma 4.5, for any , there is an integer that does not divide .
Let be the identity permutation, and let be a permutation consisting of two cycles of length and , respectively. The perfect matchings corresponding to and are disjoint, as maps each element to itself while cyclically shifts every element within each of its two cycles (both of which are of length at least two).
Consider the graph obtained by deleting the edges in the matchings and from . The graph is a -regular bipartite graph. Thus, by the Hall theorem [Hal35], can be decomposed into disjoint perfect matchings. ∎
Lemma 4.8 and Lemma 4.7 together imply that for any dimension , there are multiple non-equivalent orthogonal unitary bases. The same property holds for due to Proposition 4.2, and for due to Proposition 4.9 below. This proves Theorem 4.3.
Proposition 4.9.
There are orthogonal unitary bases for that are not equivalent to the clock and shift basis.
Proof.
Denote the clock and shift basis by . Note that is a commutative projective group under operator composition, i.e., it is closed under taking products of the operators and the operators commute, all up to some phase (a unit complex number) that may depend on the operators. We construct a basis such that the equivalence of and implies that also is a commutative projective group. However, the basis has elements that do not commute even up to a phase, which is a contradiction.
We construct following an idea due to Werner [Wer01]; see the discussion after Proposition 9 in the paper. Let , where is a unit complex number such that . Let , where for any
We may verify that this is an orthogonal unitary basis for any choice of .
Suppose the basis is equivalent to , and the equivalence is given by the operators and unit complex numbers . Consider the element of that corresponds to the operator . We have . Then , and
Since is closed under right multiplication by up to phases, the set of operators
is also a commutative projective group, as is the basis .
We show next that not all operators in the set commute, even up to a phase. Consider the operators and . These operators commute up to a phase if and only if there is a unit complex number such that
This implies that . As we chose , this is a contradiction, and and are not equivalent. ∎
5 Random superdense coding protocols
In this section we study a random protocol for approximate superdense coding. Its analysis draws heavily on results in high-dimensional probability. We present these results in Section 5.1 and develop some properties of random entangled vectors in Section 5.2, before proceeding to the analysis in Section 5.3. Finally, in Section 5.4, we address a subtle issue that we encounter in the analysis.
5.1 Background from random matrix theory
In this section, we present some useful results from random matrix theory.
Definition 5.1 (Isotropic vector).
We say a random vector is isotropic if .
Random variables which have tails that decay as fast as the normal distribution play an important role in high dimensional probability. Let denote the set of unit vectors in .
Definition 5.2 (Sub-gaussian random variables and vectors).
A random variable is sub-gaussian if there exists a parameter such that
for all . The sub-gaussian norm of , denoted by , is defined as
A random vector is sub-gaussian if for all unit vectors , the inner product is sub-gaussian. The sub-gaussian norm of is defined as
Sub-gaussian norm can be characterized in multiple ways. The following lemma describes two of them; see [Ver18, Proposition 2.5.2, Section 2.5.1].
Lemma 5.3.
There are positive universal constants such that for any random variable ,
-
1.
If for some parameter ,
for all , then is sub-gaussian and .
-
2.
If is sub-gaussian with for some parameter , then
for all .
The following theorem gives a sharp bound on the largest singular value of a class of random matrices; see the text by Verhsynin [Ver18] for this and related results. Vershynin states the result for real matrices, but the proof extends to complex matrices in a straightforward manner.
Theorem 5.4 ([Ver18], Theorem 4.6.1).
Let be a complex matrix whose rows are independent, mean zero, sub-gaussian isotropic vectors in . Then there is a universal constant such that for all , we have
with probability at least , where .
Let denote the Hilbert-Schmidt norm on :
This norm induces the following -sum metric on :
Let be a continuous function. We say is -Lipschitz with respect to the -sum of Hilbert-Schmidt metrics if for all , we have
Let , be i.i.d. Haar-random unitary operators. If is sufficiently smaller than the dimension , with high probability, the random variable is close to its expectation. This concentration of measure property is formalized by the following theorem, which is a special case of Theorem 5.17 in the book on random matrix theory by Meckes [Mec19].
Theorem 5.5 ([Mec19], Theorem 5.17, page 159).
Let , , be i.i.d. random unitary operators chosen according to the Haar measure. Suppose the function is -Lipschitz with respect to the -sum of Hilbert-Schmidt metrics, with . Then for every positive real number , we have
where .
The Marčenko–Pastur theorem characterises the spectrum of a wide class of random matrices in the limit of large dimension. We rely on a version of the theorem due to Yaskov [Yas16] that applies to matrices whose entries need not all be independent. While Yaskov states the result for real matrices, the proof extends to complex matrices with straightforward modifications. We sketch the observations and the modifications which enable this extension after the statement of the theorem.
The columns of the random matrices we consider satisfy a certain asymptotic isotropy condition.
Definition 5.6.
Let be a sequence of positive integers such that as . Let be a sequence of random vectors with . We say that the sequence is pseudo-isotropic if for all sequences of complex matrices with and with uniformly bounded spectral norm (i.e., for all for a universal constant ),
as .
Define the empirical spectral distribution (ESD) of an positive semi-definite matrix as , where are the eigenvalues of , and is the Dirac-delta function. This is the probability density function of a uniformly random eigenvalue of .
Theorem 5.7 (Marčenko-Pastur law [Yas16]).
Fix an , and let be integers with and a function of such that as . For each , let in be a random vector such that the sequence of vectors is pseudo-isotropic. Let be a sequence of random matrices whose columns are i.i.d. copies of the random vector , and let be the ESD of the matrix . Then, as , the ESD converges weakly to the density almost surely, where
with , .
In other words, as , with probability , the cumulative distribution function of a uniformly random eigenvalue of the matrix converges point-wise to that given by the probability density function .
Theorem 5.7 follows from the proof of Theorem 2.1 in Ref. [Yas16] by noting the following points. The eigenvalues of the matrix are all real, and therefore the Stieltjes continuity theorem applies to . Further, the Sherman-Morrison formula also extends to the sum , where is an invertible complex matrix, and are in : the matrix is invertible if and only if , and if the latter condition holds,
We can prove that the Stieltjes transform of tends to its expectation almost surely as , following Step 1 in the proof of Theorem 1.1 in Ref. [BZ08]. The rest of the proof in Ref. [Yas16] now extends to the case of interest to us by replacing all instances of the transpose of a real vector by the conjugate transpose of the corresponding complex vector.
5.2 Pseudo-isotropy of random maximally entangled vectors
In this section, we develop properties of linear operators with certain symmetries, and use these to prove that a sequence of random maximally entangled vectors is pseudo-isotropic. This property is later used in the analysis of a random superdense coding protocol.
We consider operators on , and label the four tensor factors with , respectively. As is the convention in quantum information, we use superscripts to indicate the tensor factors on which an operator acts. Let be the swap operator on ; it permutes the two tensor factors.
Lemma 5.8.
Let . Suppose commutes with for all unitary operators , as well as with . Then is a linear combination of operators of the form , where .
Proof.
Let be a basis for the vector space . We may express as for some operators . Since commutes with , we have
for all . Since the operators form a basis, we conclude that
i.e., the operator commutes with for every . As a consequence of the von Neumann double commutant theorem [Wat18, Theorem 7.15, Section 7.1], each operator may be written as a linear combination of . So
for some complex numbers . Rearranging the sum, we get that
for some operators . Since commutes with as well, and and are linearly independent, by [Wat18, Theorem 7.15, Section 7.1] we similarly get that and are also linear combinations of . The lemma follows. ∎
Consider the random vector defined as , where is a Haar-random unitary operator and is the maximally entangled state with local dimension . We would like to compute a closed form expression for the operator on defined as:
(5.1) |
We use the symmetries of in order to do so.
Lemma 5.9.
Proof.
Since
and is Haar random, the operator commutes with for all . Further, since , the operator also commutes with . By Lemma 5.8, we have
(5.2) |
Consider the following linear functionals:
-
1.
-
2.
-
3.
-
4.
We apply these functionals to both sides of Eq. (5.2). We calculate the value of the functional on the left hand side directly from the definition of , i.e., Eq. (5.1), and on the right hand side from Eq. (5.2). We thus obtain the following linear equations:
-
1.
,
-
2.
,
-
3.
, and
-
4.
, respectively.
Solving for , we get the unique solution
∎
Let . Consider the random vector defined as . We prove that the sequence of these vectors is pseudo-isotropic.
Lemma 5.10.
The sequence of vectors is pseudo-isotropic.
Proof.
Let be a sequence of complex matrices with spectral norm bounded by a constant , for each . We use the Chebyshev Inequality to show that
(5.3) |
as . Let be the complex random variable defined as . We may verify that , so that . Eq. (5.3) is equivalent to showing that for every , as . By the Chebyshev Inequality,
So it suffices to show that the variance of is .
5.3 Analysis of a random protocol
Consider the following random protocol . Let be an integer , and . Alice and Bob agree on a choice of independently chosen Haar-random unitary operators . They also share the maximally entangled state with local dimension . When Alice gets message , she applies to her half of , and sends it over to Bob. Bob now holds the state . He performs an optimal measurement to identify , given that the state is drawn from the ensemble .
Aram Harrow (personal communication) suggested the protocol as a candidate for an approximate -superdense coding protocol with vanishing error in the limit of large dimension. If this random construction of superdense coding protocols did indeed have error that vanishes rapidly as a function of , then this could potentially refute 1.3. This is formalized by the following proposition (which was stated in Section 1.3, and is reproduced here for convenience).
See 1.4
Proof.
Let be the random protocol and let be the ensemble of random unitaries specified in the Proposition statement. Suppose for contradiction that 1.3 were true and the error of the protocol satisfied
(5.4) |
First we argue that
(5.5) |
where the first expectation is over the ensemble of random unitary operators , and the second expectation is over a uniformly random pair of distinct indices , . To prove this, note that for all because are independent, identically distributed Haar-random unitaries operators. Furthermore, by the rotation invariance of the Haar measure, is also distributed according to the Haar measure. So the above quantity is equal to for Haar-random . So the LHS of Equation 5.5 equals
Since and when , we have , which establishes Equation 5.5.
On the other hand, the rigidity condition promised by 1.3 implies that every collection of unitary operators yields a superdense protocol with some error , and in turn there exists an orthogonal unitary basis such that for all . Note that and depend on , and let and be the error of the protocol and the corresponding orthogonal unitary basis, respectively. Then
(5.6) | ||||
where the first equality is due to the orthogonality condition whenever , the second line is due to the triangle inequality, and the third line is due to the inequality for real numbers . By the Cauchy-Schwarz inequality for the Hilbert-Schmidt inner product, we have and , where denotes the (unnormalized) Hilbert-Schmidt norm. Since for all unitary matrices , we can upper bound the RHS of Equation 5.6 as | ||||
where the last inequality follows from the assumption in Equation 5.4. However, this contradicts Equation 5.5. Thus, either the conjecture does not hold, or the random superdense coding protocol has error satisfying . ∎
In the rest of this section, we prove that for sufficiently large dimension, with high probability, the protocol has positive constant error. This indicates that random maximally entangled quantum states are not very reliable for transmitting classical information, and proves Theorem 1.5. Thus, the random protocol does not rule out a robust rigidity theorem for superdense coding.
To analyze the decoding error of the protocol, we study the distinguishability of the ensemble . This is the probability that, if the pure state is selected uniformly at random from the ensemble, an optimal measurement correctly identifies the state.
Definition 5.11.
Let be an ensemble of states in which state occurs with probability . We define the distinguishability of as
where the maximization is over all measurements (i.e., POVMs) with elements .
We can estimate the distinguishability of an ensemble of states via the generalized Holevo-Curlander bounds [Kho79, Cur79, ON99, Tys09b].
Theorem 5.12 (generalized Holevo-Curlander bounds [ON99, Tys09b]).
Let be an ensemble of quantum states. Then the distinguishability of satisfies
where
We only need the upper bound on distinguishability above for a uniform ensemble of pure states. This bound was given by Curlander [Cur79] in the case of linearly independent states. It was generalized to the case of equiprobable, possibly mixed states by Ogawa and Nagaoka [ON99, Lemma 1]. The proof they gave also extends with minor modifications to non-uniform ensembles. The two bounds in Theorem 5.12 were proven — re-proven independently in the case of the upper bound [Tys09a] — by Tyson[Tys09b, Theorem 10]. Tyson later gave another proof of the bounds which also generalizes to error-recovery [Tys10, Section III].
We show that the expectation of the quantity for the ensemble of random maximally entangled states is at most a constant strictly less than , for sufficiently large dimension . This implies that the distinguishability is also strictly less than in expectation, and that any measurement Bob makes has a non-zero constant probability of failure, on average.
Theorem 5.13.
The distinguishability of the random superdense coding protocol tends to as .
Proof.
Define the matrix as , and as a uniformly random eigenvalue of . Then is the expectation of the random variable , and we aim to bound this from above.
Define the matrix so that . Consider the random vector defined as , where is a Haar-random unitary operator. I.e., is a scaled random maximally entangled state. Lemma 5.10 shows that the sequence is pseudo-isotropic. Since the columns of the matrix are i.i.d. copies of the random vector , the random matrix is of the form described in Theorem 5.7. Thus the limiting distribution of the uniformly random eigenvalue of follows the Marčenko-Pastur law with density (i.e., with parameter ):
(5.7) |
We would like to use the density to estimate the limit of the expectation of . A subtle issue is that weak convergence (i.e., convergence in distribution of the random variables) does not necessarily imply that the limit of the expectation values equals the expectation of the limiting random variable. A simple example for which this does not hold is described in Section 5.4. Nonetheless, in Theorem 5.16 in Section 5.4, we show that the sequence of random variables satisfies the stronger property we need. Namely, converges to as , where is a random variable with density . We may thus bound the distinguishability of the ensemble as as follows.
∎
We can strengthen this result to show that the distinguishability of is tightly concentrated around the mean. So all but an exponentially small fraction of the superdense coding protocols using succeed with probability smaller than .
Theorem 5.14.
The distinguishability of the random superdense coding protocol satisfies
Proof.
Define the function as
where , and we denote by . So , the distinguishability of the ensemble . We bounded the expected distinguishability by in Theorem 5.13. We show that is tightly concentrated around its expectation using Theorem 5.5. To do so, we compute a bound on the Lipschitz constant of .
Fix unitary operators and let and . Since the space of -dimensional POVMs with outcomes is compact, for any sequence the supremum in the definition of is attained at some POVM . Let , correspond to the POVMs achieving and , respectively, and let denote these quantities. Assume without loss of generality that .
We have that
as the POVM may not be an optimal distinguishing measurement for the ensemble . We bound this by
where in the fourth line we used the property that for any two pure states and , the trace distance between and is at most . In the last line, we used the identity for any matrix is equal to .
5.4 A subtle issue
Recall from the proof of Theorem 5.13 in Section 5.3 that denotes a uniformly random eigenvalue of the matrix . The generalized Marčenko-Pastur Law (Theorem 5.7) tells us that converges in distribution to a random variable with density given in Eq. (5.7) as . We used this limiting distribution to estimate the limit of the mean of in Theorem 5.13. We pointed out the subtle issue that convergence in distribution does not necessarily imply that the limit of means equals the mean of the limiting random variable. A simple example which illustrates this issue is the following. For any positive integer , let the random variable take value with probability , and value with the remaining probability. Then converges in distribution to the constant , whereas for all .
The example above highlights the reason underlying this phenomenon: while the probability of an interval on the line may go to zero in the limit, the rate of convergence may not be fast enough to dampen the contribution to the mean from that interval. We show that the probability that the random variable deviates from zero decays exponentially. This helps us conclude the convergence of the mean to .
A similar property was assumed to hold by Montanaro [Mon07] in his work on the distinguishability of random quantum states. Let , where are i.i.d. random vectors in with i.i.d. complex gaussian entries with mean and variance . Montanaro approximates using the Marčenko-Pastur Law. He justifies this using estimates on the rate of convergence of the expected distribution of a uniformly random eigenvalue of to the limiting distribution given by the Marčenko-Pastur Law (see the discussion after Lemma 5 in Ref. [Mon07]). The rate of convergence is measured in terms of the Kolmogorov distance between the two distributions. (The Kolmogorov distance between the cumulative distribution functions and of real random variables is defined as .) The Kolmogorov distance was shown to be by Bai [Bai93]. However, vanishing Kolmogorov distance does not necessarily imply the convergence of the mean to the mean of the limiting distribution. For example, the Kolmogorov distance of the distribution of the random variable defined above from the constant is . The approach we take in this section can also be used to fill the gap in Montanaro’s work. In fact, the analogue of Lemma 5.15 we need for this purpose follows directly, as the columns of are gaussian. We leave the details to the reader, and return to the analysis of the random variable .
In order to show that has an exponentially decaying tail, it suffices to show that the spectral norm of the matrix —i.e., its largest eigenvalue—has this property. So we proceed by deriving a tail bound for the spectral norm of .
Lemma 5.15.
Let . There are positive universal constants such that for all ,
(5.8) |
Proof.
Recall that , , and that . We have , so
It thus suffices to give a suitable tail bound for .
The vectors are i.i.d. copies of the random vector defined as , where is a Haar-random unitary operator on . So the vectors have zero mean. We can verify that , so the vectors are isotropic. We prove below that the vector is sub-gaussian, with sub-gaussian norm at most a universal constant . So the matrix satisfies the conditions of Theorem 5.4, and we have that for some positive universal constant ,
with probability at most , for all . Let be such that the right hand side above equals , i.e.,
Let . Whenever , we see that . So
and the theorem holds with .
It remains to prove that the sub-gaussian norm of is at most some universal constant. By Lemma 5.3, it suffices to show that for any unit vector , the random variable has sub-gaussian tails: for a positive universal constant ,
(5.9) |
for all . We establish this by appealing to Theorem 5.5.
Since is isotropic,
So .
Define the function as . Then . To show that is Lipschitz, consider . Since is a unit vector and we have
where denotes the Hilbert-Schmidt norm on . So the function is -Lipschitz with respect to Hilbert-Schmidt metric. By Theorem 5.5,
for every . For , we have . So the right hand side is at most , and we have
for all (as ). Note that Eq. (5.9) holds trivially for for any choice of positive constant such that . Thus, taking , we see that Eq. (5.9) holds for all whenever . ∎
Lemma 5.15 implies that for a large enough constant , the contribution to the mean outside an interval goes to as . Within this interval, the contribution to the mean tends to that for . This helps us derive the limiting value of the mean.
Theorem 5.16.
Proof.
We formalize the intuition given above by appealing to a weaker property implied by convergence in distribution, namely that the expectation of any bounded continuous function of the random variable converges to .
Fix , where is the constant in the statement of Lemma 5.15 and consider the function defined as follows:
Since is continuous and bounded, and ,
On the other hand, and for all . So , and
We prove the reverse inequality using Lemma 5.15. Let be the probability density function of . By the definition of ,
(5.10) |
Let denote the second term on the right hand side of Eq. (5.10) above. This is the contribution to outside of the interval . Using , Fubini’s Theorem, , and Lemma 5.15, we have
where is the universal constant in the statement of Lemma 5.15. Since , vanishes as goes to . By Eq. (5.10),
This proves the theorem. ∎
References
- [Bai93] Z. D. Bai. Convergence rate of expected spectral distributions of large random matrices. Part II. sample covariance matrices. Annals of Probability, 21(2):649–672, April 1993.
- [BCWdW01] Harry Buhrman, Richard Cleve, John Watrous, and Ronald de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
- [BW92] Charles H. Bennett and Stephen J. Wiesner. Communication via one- and two-particle operators on einstein-podolsky-rosen states. Physical review letters, 69(20):2881, 1992.
- [BZ08] Zhidong Bai and Wang Zhou. Large sample covariance matrices without independence structures in columns. Statistica Sinica, 18(2):425–442, 2008.
- [CGJV19] Andrea Coladangelo, Alex B. Grilo, Stacey Jeffery, and Thomas Vidick. Verifier-on-a-leash: New schemes for verifiable delegated quantum computation, with quasilinear resources. In Yuval Ishai and Vincent Rijmen, editors, Advances in Cryptology – EUROCRYPT 2019, volume 11478 of Lecture Notes in Computer Science, pages 247–277, Cham, 2019. Springer International Publishing.
- [Cha20] Michael Charezma. Quantum circuit diagrams. https://warwick.ac.uk/fac/sci/physics/research/cfsa/people/pastmembers/charemzam/pastprojects, 2006 (accessed October 14, 2020).
- [Cur79] Paul Joseph Curlander. Quantum Limitations on Communication Systems. PhD thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1979.
- [FK19] Máté Farkas and Jędrzej Kaniewski. Self-testing mutually unbiased bases in the prepare-and-measure scenario. Physical Review A, 99(3):032316, 2019.
- [FKN22] Máté Farkas, Jędrzej Kaniewski, and Ashwin Nayak. Mutually unbiased measurements, Hadamard matrices, and Superdense Coding. Technical Report arXiv:2204.11886 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, April 2022.
- [Hal35] Philip Hall. On representatives of subsets. Journal of the London Mathematical Society, s1-10(1):26–30, 1935.
- [HJW93] Lane P. Hughston, Richard Jozsa, and William K. Wootters. A complete classification of quantum ensembles having a given density matrix. Physics Letters A, 183(1):14–18, 1993.
- [JNV+20] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. . Technical Report arXiv:2001.04383 [quant-ph], ArXiv.org Preprint Archive, https://www.arxiv.org/, January 2020.
- [Kho79] Alexander S. Kholevo. On asymptotically optimal hypothesis testing in quantum statistics. Theory of Probability & Its Applications, 23(2):411–415, 1979.
- [KR03] Andreas Klappenecker and Martin Rötteler. Unitary error bases: Constructions, equivalence, and applications. In Marc Fossorier, Tom Høholdt, and Alain Poli, editors, Proceedings of the 15th International Symposium on Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes (AAECC), volume 2643 of Lecture Notes in Computer Science, pages 139–149. Springer, Berlin / Heidelberg, Germany, May12–16, 2003.
- [Mec19] Elizabeth S. Meckes. The Random Matrix Theory of the Classical Compact Groups, volume 218 of Cambridge Tracts in Mathematics. Cambridge University Press, July 2019.
- [Mon07] Ashley Montanaro. On the distinguishability of random quantum states. Communications in Mathematical Physics, 273(3):619–636, August 2007.
- [MV16] Benjamin Musto and Jamie Vicary. Quantum Latin squares and unitary error bases. Quantum Information and Computation, 16(15-16):1318–1332, November 2016.
- [MY98] Dominic Mayers and Andrew Yao. Quantum cryptography with imperfect apparatus. In Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No. 98CB36280), pages 503–509. IEEE, 1998.
- [MY04] Dominic Mayers and Andrew Yao. Self testing quantum apparatus. Quantum Information & Computation, 4(4):273–286, July 2004.
- [NC11] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, New York, NY, USA, 2011. 10th Anniversary Edition.
- [NY20] Ashwin Nayak and Henry Yuen. Rigidity of superdense coding. Technical Report arXiv:2012.01672v1 [quant-ph], arXiv Pre-print server, https://arxiv.org/abs/2012.01672, December 2020.
- [ON99] Tomohiro Ogawa and Hiroshi Nagaoka. Strong converse to the quantum channel coding theorem. IEEE Transactions on Information Theory, 45(7):2486–2489, 1999.
- [ŠB20] Ivan Šupić and Joseph Bowles. Self-testing of quantum systems: a review. Quantum, 4:337, 2020.
- [Sch35] Erwin Schrödinger. Discussion of probability relations between separated systems. Mathematical Proceedings of the Cambridge Philosophical Society, 31(4):555–563, 1935.
- [TKV+18] Armin Tavakoli, Jędrzej Kaniewski, Tamás Vértesi, Denis Rosset, and Nicolas Brunner. Self-testing quantum states and measurements in the prepare-and-measure scenario. Physical Review A, 98(6):062307, 2018.
- [Tys09a] Jon Tyson. Erratum: “Minimum-error quantum distinguishability bounds from matrix monotone functions: A comment on ‘Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds’ ” [J. Math. Phys. 50, 062102 (2009)]. Journal of Mathematical Physics, 50(10):109902, 2009.
- [Tys09b] Jon Tyson. Two-sided estimates of minimum-error distinguishability of mixed quantum states via generalized Holevo-Curlander bounds. Journal of Mathematical Physics, 50(3):032106, 2009.
- [Tys10] Jon Tyson. Two-sided bounds on minimum-error quantum measurement, on the reversibility of quantum dynamics, and on maximum overlap using directional iterates. Journal of Mathematical Physics, 51(9):092204, 2010.
- [Uhl76] Armin Uhlmann. The “transition probability” in the state space of a -algebra. Reports on Mathematical Physics, 9(2):273–279, 1976.
- [Ver18] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 2018.
- [VV19] Umesh Vazirani and Thomas Vidick. Fully device independent quantum key distribution. Communications of the ACM, 62(4):133–133, 2019.
- [VW00] Karl Gerd H. Vollbrecht and Reinhard F. Werner. Why two qubits are special. Journal of Mathematical Physics, 41(10):6772–6782, 2000.
- [Wat18] John Watrous. The Theory of Quantum Information. Cambridge University Press, May 2018.
- [Wer01] Reinhard F. Werner. All teleportation and dense coding schemes. Journal of Physics A: Mathematical and General, 34(35):7081–7094, August 2001.
- [Wil13] Mark M. Wilde. Quantum Information Theory. Cambridge University Press, Cambridge, UK, 2013.
- [Yas16] Pavel Yaskov. A short proof of the Marchenko–Pastur theorem. Une courte démonstration du théorème de Marchenko–Pastur. Comptes Rendus Mathématique, 354(3):319–322, March 2016.