Local measurement strategies for multipartite entanglement quantification

Luke Coffman Department of Physics, University of Colorado, Boulder, Colorado 80309, USA JILA, NIST and University of Colorado, Boulder, Colorado 80309, USA Akshay Seshadri Department of Physics, University of Colorado, Boulder, Colorado 80309, USA JILA, NIST and University of Colorado, Boulder, Colorado 80309, USA Graeme Smith Department of Physics, University of Colorado, Boulder, Colorado 80309, USA JILA, NIST and University of Colorado, Boulder, Colorado 80309, USA Institute for Quantum Computing and Department of Applied Math, University of Waterloo Jacob L. Beckey Department of Physics, University of Colorado, Boulder, Colorado 80309, USA JILA, NIST and University of Colorado, Boulder, Colorado 80309, USA

Abstract

Despite multipartite entanglement being a global property of a quantum state, a number of recent works have made it clear that it can be quantified using only local measurements. This is appealing because local measurements are the easiest to implement on current quantum hardware. However, it remains an open question what protocol one should use in order to minimize the resources required to estimate multipartite entanglement from local measurements alone. In this work, we construct and compare several estimators of multipartite entanglement based solely on the data from local measurements. We first construct statistical estimators for a broad family of entanglement measures using local randomized measurement (LRM) data before providing a general criterion for the construction of such estimators in terms of projective 2-designs. Importantly, this allows us to de-randomize the multipartite estimation protocol based on LRMs. In particular, we show how local symmetric informationally complete POVMs enable multipartite entanglement quantification with only a single measurement setting. For all estimators, we provide both the classical post-processing cost and rigorous performance guarantees in the form of analytical upper bounds on the number of measurements needed to estimate the measures to any desired precision.

I Introduction

In the nearly 90 years since the famous Einstein-Podolsky-Rosen (EPR) paper Einstein et al. (1935) initiated the study of entanglement, there has been an enormous effort to understand this phenomenon theoretically Horodecki et al. (2009) and, as quantum technologies have matured, to probe it experimentally Gühne and Tóth (2009); Islam et al. (2015); Kaufman et al. (2016); Bluvstein et al. (2022). While many questions in entanglement theory remain only partially resolved, the experimental violations of Bell’s inequalities Bell (1964); Clauser et al. (1969) over the past several decades Freedman and Clauser (1972); Hensen et al. (2015); Giustina et al. (2015); Shalm et al. (2015) imply, conclusively, that entanglement can lead to fundamentally non-local correlations. Nonetheless, the entanglement content of an unknown quantum state can be quantified and characterized using local measurements alone Elben et al. (2018); Brydges et al. (2019); Elben et al. (2019); Notarnicola et al. (2022); Elben et al. (2020); Vermersch et al. (2023).

The most informative measurements allowable in quantum mechanics correspond to positive operator-valued measures (POVMs) on many identical copies of ones quantum state (i.e. on $\rho^{\otimes n}$ ), so-called many-copy measurements. Because all single-copy POVMs are a subset of all POVMs, many-copy measurements must be at least powerful as single-copy measurements. The same argument holds when one restricts further to local measurements on the individual subsystems. There is a natural trade-off between ease of experimental implementation and information gained. This intuition can be made rigorous in a number of ways, including comparing lower bounds on the sample complexity of various learning tasks under different measurement restrictions.

For example, consider the canonical learning task of quantum state tomography (QST), where one’s goal is to produce an estimate $\hat{\rho}$ of an unknown $d$ -dimensional state $\rho$ , such that $\|\hat{\rho}-\rho\|_{1}<\epsilon$ with high probability. If restricting to local measurements, one requires at least $\Omega(\frac{d^{3}}{\epsilon^{2}})$ copies of the state Haah et al. (2017), while with many-copy measurements, $\Theta(\frac{d^{2}}{\epsilon^{2}})$ copies are necessary and sufficient O’Donnell and Wright (2016); Haah et al. (2017). Because $d$ scales exponentially with the number of quantum systems, this seemingly minor improvement would have significant impact in practice. While many-copy measurements are necessary for optimal sample complexity Bubeck et al. (2020), they are nowhere near being implementable on today’s quantum systems. Even global single-copy measurements are infeasible for moderate system sizes. Thus, in this work, we focus on local measurements on each individual qubit of an $n$ -qubit state $\rho$ .

Requiring full knowledge of state, as in QST, is almost always superfluous. Typically, we are interested in functions of quantum states that can be estimated directly without ever estimating the quantum state itself. This realization has led to a burgeoning field of study regarding the optimal estimation of properties of quantum states, in which the estimation procedures have lower sample complexity and classical post-processing requirements than full state tomography van Enk and Beenakker (2012); O’Donnell and Wright (2015); Elben et al. (2018, 2019); Rath et al. (2021); Huang et al. (2020); Elben et al. (2020); Huang et al. (2022); Elben et al. (2023); Vermersch et al. (2023). Our work contributes to the ongoing effort to devise practical methods for the experimental estimation of entanglement using easily implementable measurements.

In this work, we construct statistical estimators of a broad family of entanglement measures based only on local POVMs. While it is known that entangled measurements on many identical copies of a state are necessary for the optimal property testing Bubeck et al. (2020) and that there are fundamental trade-offs between entanglement characterization and detection with and without multi-copy measurements Lu et al. (2016); Liu et al. (2022), separations between resource requirements for the estimation of multipartite entanglement measures remain an interesting area of study with many open problems. Moreover, as these are the simplest measurements to implement in practice, our results are relevant to experiments, both current and near-term, that wish to probe entanglement in quantum systems without the need to prepare identical copies simultaneously or utilize multi-copy measurements.

Specifically, we generalize Ref. Ohnemus et al. (2023), which devised estimators for the generalized concurrence Carvalho et al. (2004) based on local randomized measurements (LRMs), to a more general family of multipartite entanglement measures called the concentratable entanglements (CEs) Beckey et al. (2021). We provide analytical performance guarantees on the estimation of this family of entanglement measures in the form of upper bounds on the number of measurements needed to estimate the CEs to $\epsilon$ precision with high probability (i.e. upper bounds on the sample complexity). Moreover, we provide estimates of the post-processing cost of computing our estimators, a crucial consideration in practice. We then provide a distinct estimation procedure, based on LRM data and using median-of-means (MoM) estimation, that provides a square-root enhancement in the sample complexity of this task – making multipartite entanglement quantification in systems of several tens of qubits a feasible prospect.

A remaining limitation of LRMs is that they require an exponential number of measurement settings to be implemented, which can be experimentally challenging. To address this, we provide a general theorem that shows that any projective $2$ -design can be used to construct an estimator for the CEs. In particular, this theorem implies that local symmetric informationally complete (SIC) POVMs can be used to estimate all of the CEs using a single measurement setting, generalizing the work in Ref. Stricker et al. (2022) to the study of multipartite entanglement. While the classical post-processing of this method is more costly than the other methods, it is likely preferable to having to change the experimental measurement setting an exponential number of times. Thus, we expect these methods to be of interest to the experimental community.

II Preliminaries

In this section, we establish our notation and provide some essential facts from quantum information theory and classical statistics that are needed to prove our main results. To begin, let $\mathcal{B}=\{|j\rangle\}$ denote a basis of a finite-dimensional Hilbert space $\mathcal{H}$ . Then, $\{|j\rangle|j^{\prime}\rangle\mid|j\rangle,|j^{\prime}\rangle\in\mathcal{B}\}$ forms a basis for the composite space $\mathcal{H}\otimes\mathcal{H}$ . The SWAP operator $\mathbb{F}:\mathcal{H}\otimes\mathcal{H}\rightarrow\mathcal{H}\otimes\mathcal{H}$ , as the name suggests, is the operator that swaps two tensor components

\displaystyle\mathbb{F}|j\rangle|j^{\prime}\rangle=|j^{\prime}\rangle|j\rangle,\quad\forall~{}|j\rangle,|j^{\prime}\rangle\in\mathcal{B}.

(1)

This operator is diagonal in the Bell basis and has eigenvalues $\pm 1$ . The $+1$ and $-1$ eigenspaces are called the symmetric and anti-symmetric subspaces, respectively. Denoting the symmetric (anti-symmetric) subspaces as $\Pi_{+}$ ( $\Pi_{-}$ ), we can express the spectral decomposition of the SWAP operator as $\mathbb{F}=\Pi_{+}-\Pi_{-}$ . Moreover, these projectors resolve the identity $\mathbb{I}\otimes\mathbb{I}=\Pi_{+}+\Pi_{-}$ , allowing us to express the subspace projectors as

\displaystyle\Pi_{\pm}=\frac{\mathbb{I}\otimes\mathbb{I}\pm\mathbb{F}}{2}.

(2)

The multipartite entanglement measures we consider in this paper are functions of the subsystem purities, so in the proof of our main results, we will utilize the following well-known relationship between the SWAP operator and a state’s purity, called the SWAP “trick.”

Lemma 1 (The SWAP “trick”).

For an $n$ -qubit state $\rho$ , the following equality holds

\displaystyle\textnormal{tr}\left[\mathbb{F}\rho^{\otimes 2}\right]

\displaystyle=\textnormal{tr}\left[\rho^{2}\right].

(3)

This lemma can be proven from the definitions of the SWAP operator and the trace (see, for example, Appendix A of Ref. Beckey et al. (2023)). Around the turn of this century, it was realized that Lemma 1 suggests a method of purity estimation if one could prepare two identical copies of the quantum state and coherently manipulate the systems with high fidelity Ekert et al. (2002). The necessary level of coherent control has only become possible in the past decade on ion trap and neutral atom systems Islam et al. (2015); Kaufman et al. (2016); Bluvstein et al. (2022), but remains at the cutting-edge. In an effort to avoid the use of two-copy measurements, Ref. van Enk and Beenakker (2012) initiated the study of purity estimation using single-copy measurements and many great theoretical and experimental works to this end have been produced Elben et al. (2018, 2019); Stricker et al. (2022). While the current work focuses on the task of estimating multipartite entanglement measures using only local measurements, we suspect our techniques could be adapted to the direct estimation of all non-linear functionals of the form $\textnormal{tr}\left[\rho^{k}\right]$ for $k\in\mathbb{Z}^{+}$ .

II.1 Concentratable Entanglement

In this work, we consider the estimation of the family of multipartite entanglement measures defined in Ref. Beckey et al. (2021). This family includes the pure state entanglement measures from Refs. Meyer and Wallach (2002); Brennen (2003); Carvalho et al. (2004) as special cases and has several properties that make them interesting (see Ref. Beckey et al. (2021) for details). That work shows how to compute the measures using a parallelized controlled-SWAP circuit, which requires two identical copies of an $n$ -qubit state and $n$ ancillary qubits. In a follow-up work Beckey et al. (2023), the requirements were decreased to two identical copies of the state of interest using Bell basis measurements. Still, creating two identical copies of a moderately-sized quantum state and acting coherently on it is at the cutting edge of quantum information processing Bluvstein et al. (2022, 2023).

Before constructing estimators of the CEs, we must introduce the quantities themselves. To that end, we now define the CEs and mention some useful facts about them.

Definition 1 (Concentratable Entanglement Beckey et al. (2021)).

Let $|\psi\rangle$ be a pure quantum state of $n$ qubits and let $[n]=\{1,2,\ldots,n\}$ denote the full set of qubit labels. For any non-empty set of qubit labels $S\in\mathcal{P}([n])\setminus\{\emptyset\}$ , the Concentratable Entanglement (CE) is defined as

\displaystyle\mathcal{C}_{|\psi\rangle}(S)

\displaystyle=1-\frac{1}{2^{s}}\sum_{\alpha\in\mathcal{P}(S)}\textnormal{tr}\left[\rho_{\alpha}^{2}\right],

(4)

where $s:=|S|$ denotes the cardinality of $S$ , and the $\rho_{\alpha}$ ’s are reduced states of $|\psi\rangle\langle\psi|$ obtained by tracing out subsystems with labels not in $\alpha$ . For the trivial subset, we define $\textnormal{tr}\left[\rho_{\emptyset}^{2}\right]:=1$ .

Crucial to the present work is the following expression for the CEs Beckey et al. (2023)

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-\textnormal{tr}\left[\rho^{\otimes 2}\prod_{i\in S}\Pi_{+}^{i}\right],

(5)

where $\Pi_{+}^{i}:=\mathbb{I}_{1}\otimes\dotsm\otimes\mathbb{I}_{i-1}\otimes\Pi_{+}^{i}\otimes\mathbb{I}_{i+1}\otimes\dotsm\otimes\mathbb{I}_{n}$ denotes the symmetric subspace projector on the $i$ -th qubit of the first and second copy of $\rho$ and identities for all $j\in S$ such that $j\neq i$ . Note this convention for the representation of operators is used throughout.

Importantly, when $S=[n]$ , the CE is related to the multipartite concurrence defined in Ref. Carvalho et al. (2004), denoted $c_{n}(|\psi\rangle)$ , via the simple expression

\displaystyle\mathcal{C}_{|\psi\rangle}([n])=\frac{c_{n}(|\psi\rangle)^{2}}{4},

(6)

which is the measure Ref. Ohnemus et al. (2023) consider estimating using local random measurements. Our results will hold for that measure as well as all others obtained by choice of any $S\subseteq[n]$ . While there are many interesting formulas for the CEs given in Refs. Beckey et al. (2021, 2023), these will suffice to appreciate our main results. We now turn to some facts from classical statistics that will be needed to prove our sample complexity upper bounds.

II.2 Results from Classical Statistics

In this work, we will work exclusively with bounded random variables $X_{1},\dots,X_{M}$ , such that each $X_{i}\in[a,b]$ and $a,b\in\mathbb{R}$ . We will denote the expectation value of a random variable $X$ by $\mathbb{E}[X]$ and its variance by $\textnormal{Var}(X)$ . We denote the covariance of two (possibly dependent) random variables $X$ and $Y$ by $\textnormal{Cov}[X,Y]=(X-\operatorname{\mathbb{E}}[X])(Y-\operatorname{\mathbb{E}}[Y])$ . We denote estimators of statistical quantities with hats. For example, if the actual parameter of interest is denoted $\theta$ , we will denote an estimator of the parameter as $\hat{\theta}$ . We say an estimator is unbiased if $\operatorname{\mathbb{E}}[\hat{\theta}]=\theta$ . Many of our estimators will be expressed using indicator functions, which are defined as

\displaystyle\textnormal{ind}[A]=\begin{cases}1,&\mathrm{condition}~{}A~{}\mathrm{is}~{}\mathrm{true}\\ 0,&\mathrm{otherwise}.\end{cases}

(7)

For example, given a fixed, length $s$ bitstring $\boldsymbol{z}_{0}\in\{0,1\}^{s}$ , $\textnormal{ind}[\boldsymbol{Z}=\boldsymbol{z}_{0}]$ takes the value $1$ when the random variable $\boldsymbol{Z}$ is equal to $\boldsymbol{z}_{0}$ , and takes the value $0$ otherwise. Crucially, the probability of obtaining a particular bitstring, $\boldsymbol{z}_{0}$ can be expressed as $\operatorname{\mathbb{E}}[\textnormal{ind}[\boldsymbol{Z}=\boldsymbol{z}_{0}]]=P(\boldsymbol{z}_{0}).$ Note that we will use uppercase letters to denote random variables, and the corresponding lowercase letter to denote a specific instance of the random variable.

In practice, finite sample statistics prohibit us from exactly determining quantities of interest. A natural approach is then to ask how many measurements are sufficient to guarantee that our estimate is close to the true value with high probability. This is a well-studied problem in classical statistics and results in so-called sample complexity upper bounds. We now present the two methods that will allow us to provide analytical performance guarantees on our estimators of multipartite entanglement.

We start with a well-known result called Hoeffding’s inequality, which provides exponential concentration for sums of independent bounded random variables. We will state it without proof as it is a standard result proven in most mathematical statistics textbooks (e.g. Sec. 2.2 of Ref. Vershynin (2018)).

Proposition 1 (Hoeffding’s inequality).

Let $X_{1},\dots,X_{M}$ be independent random variables such that $a\leqslant X_{i}\leqslant b$ and $\operatorname{\mathbb{E}}[X_{i}]=\mu$ for all $i\in[M]$ . Then, given a precision $\epsilon>0$ and a confidence level $1-\delta\in(0,1)$ , choosing $M=\lceil(b-a)^{2}\log(2/\delta)/(2\epsilon^{2})\rceil$ suffices to guarantee that

\displaystyle\mathbb{P}\left(\left|\frac{1}{M}\sum_{i=1}^{M}X_{i}-\mu\right|\geqslant\epsilon\right)\leqslant\delta.

(8)

If, however, the random variables in question have a large allowed range (but small variance), Hoeffding will often be looser than bounds that utilize the information about this second moment. In other words, the number of measurements required to achieve $\epsilon$ -precision will be overestimated. To provide stronger concentration, we will make use of MoM estimation (e.g. Sec. 2.3 of Ref. Lerasle (2019)).

Proposition 2 (Median-of-means (MoM) estimation).

Suppose that $X_{1},\dotsc,X_{M}$ are i.i.d. random variables, with variance bounded above by $\sigma^{2}>0$ . Let $\epsilon>0$ be the precision and $1-\delta$ be the confidence level, and $M=N_{B}B$ be the total number of samples, where the number of samples per batch is $N_{B}=\lceil 8\log(1/\delta)\rceil$ and the number of batches is $B=\lceil 4\sigma^{2}/\epsilon^{2}\rceil$ . Let $\widehat{\mu}_{b}$ denote the empirical mean of $X_{(b-1)B+1},\dotsc,X_{bB}$ , for $b\in\{1,\dotsc,N_{B}\}$ . Then, we have

\displaystyle\mathbb{P}\left(\left|\textnormal{median}(\widehat{\mu}_{1},\dotsc,\widehat{\mu}_{B})-\mathbb{E}[X]\right|\geqslant\epsilon\right)\leqslant\delta.

(9)

Both of these results allow us to derive analytical performance guarantees in the form of upper bounds on the sample complexity of an estimator. With these preliminaries in mind, we turn now to our main results.

Estimator Name	Sample Complexity Upper Bound	Classical Post-processing	# of Measurement Settings
LRM-Mean, $K=2$	$M_{1}=O\left(\left(\frac{9}{4}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$	$O(K^{2}sM_{1})$	$O\left(\left(\frac{9}{4}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$
LRM-MoM, $K=2$	$M_{2}=O\left(\left(\frac{3}{2}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$	$O(K^{2}sM_{2})$	$O\left(\left(\frac{3}{2}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$
SIC-MoM, $K=2$	$M_{3}=O\left(3^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$	$O(K^{2}sM_{3})$	1
SIC-MoM, $K=K_{\mathrm{opt}}$	$M_{4}=O\left(\left[\left(\frac{3}{2}\right)^{s}\epsilon^{-2}+\sqrt{3^{s}}\epsilon^{-1}\right]\log\left(\frac{1}{\delta}\right)\right)$	$\min\{O(K_{\mathrm{opt}}2^{s}M_{4}),O(K_{\mathrm{opt}}^{2}sM_{4})\}$	1

Table 1: Summary of Local Strategies for Multipartite Entanglement Quantification. The second column of this table represents asymptotic upper bounds on the number of measurements needed to estimate the CEs to precision

\epsilon

, with probability at least

1-\delta

. The third gives asymptotic estimates of the classical time complexity, where we assume that all elementary operations are

O(1)

. Finally, the fourth column summarizes the number of measurement settings needed for each estimator.

III Main Results

We first generalize and provide analytical performance guarantees for the estimators of multipartite entanglement via LRMs constructed in Ref. Ohnemus et al. (2023). We then utilize median-of-means estimation to obtain a (quadratically) better upper bound on the sample complexity CE estimation using LRM data. Then, we show that any local POVM that forms a projective $2$ -design can be used to construct an estimator for the CEs. A corollary of this result is the de-randomization of the LRM protocol which enables multipartite entanglement quantification using a single measurement setting. All of our sample complexity upper bounds, as well as estimates of the worst case classical post-processing cost, are summarized in Table 1.

III.1 CEs via LRMs

As mentioned above, randomized measurements have been studied in many estimation tasks, but most relevant to our work is the recent work of Ref. Ohnemus et al. (2023) in which the authors present a method of estimating the multipartite concurrence Carvalho et al. (2004) using local randomized measurement data. Because one recovers the generalized concurrence easily from the CEs (see Eq. (6)), Ref. Ohnemus et al. (2023) is easily generalizable to the entire family of CEs.

As depicted in Fig. 1, the LRM protocol entails preparing the state of interest, applying $U=\prod_{i=1}^{n}U_{i}$ , where each $U_{i}$ is taken from some single qubit matrix distribution (e.g. Haar distribution or single-qubit Cliffords Mele (2023)), to our state and then measuring in the computational basis $\{|\mathbf{z}\rangle\}$ . Note that This yields a bitstring $\mathbf{z}$ , where $\boldsymbol{z}:=z_{1}\cdot z_{2}\cdot\dotsm z_{n}$ , with $z_{i}\in\{0,1\}$ , and where the probability of obtaining $\mathbf{z}$ is given as $P(\mathbf{z})=\textnormal{tr}\left[U\rho U^{\dagger}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|\right]$ . Note that if one only cares about a subset $S\subseteq[n]$ of qubits, then one can restrict the unitary rotation and subsequent computational basis measurement to the subsystems defined by $S$ . Specifically, this corresponds to rotation by the unitary $U=\prod_{i\in S}U_{i}$ , by which mean $U_{i}\in U(2)$ is a unitary acting on the $i$ th qubit for $i\in S$ , and we implicitly assume that the action of the unitary is trivial (i.e., $\openone$ ) on qubits not in the set $S$ . We then have the following proposition.

Proposition 3 (CEs via LRMs).

Let $\rho=|\hskip 1.0pt\psi\rangle\langle\psi\hskip 1.0pt|$ be an $n$ -qubit pure state, $U=\prod_{i\in S}U_{i}$ for $U_{i}\in U(2)$ be the tensor product of single-qubit Haar random unitaries in $S\subseteq[n]$ , and $P_{U}(\mathbf{z})$ be the probability of measuring bit-string $\mathbf{z}\in\{0,1\}^{s}$ . The CEs can be obtained exactly via

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-3^{s}\operatorname{\mathbb{E}}_{U}[P_{U}(\mathbf{z})^{2}]

(10)

for any bitstring $\mathbf{z}$ , where $\operatorname{\mathbb{E}}_{U}[\cdot]$ represents an average over the Haar measure.

The proof of this theorem relies on the fact that

\displaystyle\mathbb{E}_{U}[U^{\otimes 2}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger^{\otimes 2}}]

\displaystyle=\frac{1}{3}\Pi_{+},

(11)

which follows from Schur’s lemma and Eq. (5). We include a detailed proof in Appendix B.1. Crucially, we emphasize that this holds for any fixed bitstring $\mathbf{z}\in\{0,1\}^{s}$ . Because this expression is equally valid for all bitstrings, it follows that we can simply take a uniform average over all bitstrings

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-\left(\frac{3}{2}\right)^{s}\sum_{\mathbf{z}\in\{0,1\}^{s}}\operatorname{\mathbb{E}}_{U}[P_{U}(\mathbf{z})^{2}],

(12)

which is a crucial experimental consideration because the probability of obtaining any particular bitstring decays exponentially with the number of qubits being probed. Moreover, taking such an average enables improved statistical estimators for two key reasons. First, as we will see below, the uniform average allows us to reduce the constant factor that appears in the sample complexity from $3^{s}$ to $\left(\frac{3}{2}\right)^{s}$ . And second, the uniform average allows us to focus on estimating the sum of squared probabilities, rather than the squared probabilities themselves. As we will see, this is exponentially more efficient in terms of classical post-processing for many cases of interest.

Refer to caption — Figure 1: Local randomized measurement. A local randomized measurement (LRM) is simply a random local unitary on each subsystem, followed by a computational basis measurement.

In practice, one cannot compute this average exactly over the Haar measure, but instead samples $L$ unitaries from the Haar measure (or a unitary design of the appropriate degree Mele (2023)) and approximates the exact average. Moreover, Eq. (10) depends on the probability, at a fixed unitary, of obtaining the bitstring $\mathbf{z}$ . This too, cannot be computed exactly, but is estimated by repeating, for each random unitary, the computational basis measurement $K$ times and taking the sample average. This yields a total measurement budget of $M=LK$ . Because of these two sources of finite-sampling error, providing convergence guarantees for estimators based on LRM data can be challenging, with many groups resorting to various approximations, special cases, or numerical simulations Brydges et al. (2019); Elben et al. (2018, 2019); Vermersch et al. (2018); Ohnemus et al. (2023). While these methods are still informative, it would be preferable to obtain analytical performance guarantees. Before showing how to obtain such guarantees, let us understand the methods used in Ref. Ohnemus et al. (2023), where they construct unbiased estimators of $P^{2}_{U}(\mathbf{z})$ to then approximate the actual quantity of interest $\operatorname{\mathbb{E}}_{U}[P_{U}(\mathbf{z})^{2}]$ .

To understand their result, let $\boldsymbol{z}\in\{0,1\}^{s}$ denote a bitstring resulting from an LRM as depicted in Fig. 1. Ohnemus et al. Ohnemus et al. (2023) show that

\displaystyle\widehat{P}^{(2)}_{U}({\boldsymbol{z}})

\displaystyle=\widehat{P}_{U}({\boldsymbol{z}})\frac{(K\widehat{P}_{U}({\boldsymbol{z}})-1)}{K-1},

(13)

is an unbiased estimator of the squared probability $P_{U}(\boldsymbol{z})^{2}$ , for a fixed unitary $U$ . Here, $\widehat{P}_{U}({\boldsymbol{z}})=\frac{1}{K}\sum_{k=1}^{K}\textnormal{ind}[\boldsymbol{Z}_{k}=\boldsymbol{z}]$ denotes the fraction of the $K$ outcomes equal to $\boldsymbol{z}$ . However, the classical run-time of estimating each probability is $O(K)$ , and there are $2^{s}$ terms in the sum that appears in Eq. (12). Thus, the classical post-processing scales exponentially in the number of subsystems being probed, i.e. $O(K2^{s})$ . Moreover, proving upper bounds on the sample complexity of the CEs is difficult if one estimates the squared probabilities. Hoeffding’s inequality and the union bound give an analytical sample complexity upper bound, but it is very loose. Alternatively, one could attempt to compute or bound the variance of this estimator and use the Chebyshev-Cantelli inequality, as done in Ref. Ohnemus et al. (2023), but this requires control of up to the fourth moment of a multinomial distribution, which is difficult to compute for general input states. We address these limitations with a simple shift in the estimation procedure.

III.1.1 LRM-Mean Estimator

Instead of estimating the probabilities, or even the squared probabilities, we estimate the expectation value of the sum of the squared probabilities that appear in Eq. (12).

Theorem 1 (Unbiased CE Estimation via LRM Data).

For each $K\geqslant 2$ ,

\displaystyle\hat{\mathcal{C}}_{|\psi\rangle}(S)=1-\left(\frac{3}{2}\right)^{s}\frac{1}{L}\sum_{l=1}^{L}\hat{S}_{l}^{(K)},

(14)

is an unbiased estimator of $\mathcal{C}_{|\psi\rangle}(S)$ , where

\displaystyle\hat{S}_{l}^{(K)}

\displaystyle=\frac{1}{K(K-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{Z}_{l,k^{\prime}}].

(15)

Given a precision $\epsilon>0$ and a confidence level of $1-\delta\in(0,1)$ , using at most $L=O((\frac{9}{4})^{s}\log{(1/\delta)}/\epsilon^{2})$ random unitaries, one is guaranteed

\displaystyle\Pr[|\hat{\mathcal{C}}_{|\psi\rangle}(S)-\mathcal{C}_{|\psi\rangle}(S)|\geqslant\epsilon]\leqslant\delta.

(16)

A detailed proof of this theorem and a derivation of the estimator can be found in Appendix B.1. We remark that $\hat{S}_{l}^{(K)}$ is mathematically equivalent to summing Eq. (53) over all bitstrings, but writing it as $\hat{S}_{l}^{(K)}$ allows for faster classical post-processing for a constant $K$ . As mentioned previously, computing each probability and summing would scale as $O(K2^{s})$ , which would become a bottleneck for moderate system sizes. In contrast, computing $\hat{S}_{l}^{(K)}$ directly can be done in $O(K^{2}s)$ . For constant $K$ , this represents an exponential improvement in classical post-processing.

Importantly, the concentration inequality in Theorem 1 holds for all $K\geqslant 2$ . Thus, using a given unitary for more than $2$ shots would increase the total measurement budget without providing provably better concentration. Moreover, the computational post-processing cost increases with $K$ . To see this, note that the sum $\hat{S}_{l}^{(K)}$ is computed by checking the number of distinct pairs of trials $(k,k^{\prime})$ that output the same string, and thus, has time complexity $O(K^{2}s)$ . This, too, indicates that one should choose $K=2$ . Note that this does not imply this is the optimal division of the total measurement budget. We will see in the next section that the sample complexity could be marginally improved by increasing $K$ , but with diminishing returns after a certain value. Finally, although the classical post-processing is efficient, the sample complexity upper bound scales as $\sim(\frac{9}{4})^{s}$ , which can become prohibitive for tens of qubits. As such, we now show how to quadratically improve this sample complexity using an alternative estimation procedure.

III.1.2 LRM-MoM Estimator

As mentioned in Sec. II.2, Hoeffding’s inequality holds for all independent random variables that are bounded. However, if the range of values the random variable takes is large relative to its standard deviation, and one can compute exactly or obtain a good upper bound on the variance of the random variable, median-of-means estimation can give tighter concentration. To apply Prop. 2 to our LRM estimator, we have to compute the variance of $\hat{S}_{l}^{(K)}$ given in Theorem 1. As we show in Appendix B.1, the variance can be expressed as

\displaystyle\begin{split}\textnormal{Var}[\hat{S}_{l}^{(K)}]=\frac{2P_{2}(1-P_{2})+4(K-2)(P_{3}-P_{2}^{2})}{K(K-1)}\\ +\frac{(K-2)(K-3)(P_{2,2}-P_{2}^{2})}{K(K-1)},\end{split}

(17)

where $P_{2}:=\operatorname{\mathbb{E}}_{U}[\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{2}]$ , $P_{3}:=\operatorname{\mathbb{E}}_{U}[\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{3}]$ , $P_{2,2}:=\operatorname{\mathbb{E}}_{U}[(\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z}))^{2}]$ . Note, importantly, that this variance is independent of $l$ and the second additive term dominates in the large $K$ limit, approaching $P_{2,2}-P_{2}^{2}$ . Thus, increasing $K$ past some threshold value will not lead to an appreciable decrease in the variance. Moreover, if one wishes to bound the variance for arbitrary states, it is necessary to bound the fourth moment implicit in $P_{2,2}$ . While local Cliffords are a common choice for implementing LRMs, they only form a $3$ -design, so one would need to either find a method to bound $P_{2,2}$ for (local) Clifford measurements, or perform (local) Haar random measurements, which can be more experimentally challenging. We therefore restrict to $K=2$ , which allows us to use local Clifford measurements, bound the variance for arbitrary states, and provably perform better than the estimation procedure given in Theorem 1. In addition, this allows for faster classical post-processing, as explained in the previous section.

Thus, choosing $K=2$ (i.e. employing only two measurements per unitary), Eq. (17) reduces to

\displaystyle\textnormal{Var}[\hat{S}_{l}^{(2)}]=P_{2}(1-P_{2}).

(18)

Note that Eq. (12) and the definition of $P_{2}$ above imply that the CE can be expressed as $\mathcal{C}_{|\psi\rangle}(S)=1-\left(\frac{3}{2}\right)^{s}P_{2}$ . Then, because $\mathcal{C}_{|\psi\rangle}(S)\geqslant 0$ , we have $P_{2}\leqslant\left(\frac{2}{3}\right)^{s}$ , which coupled with Eq. (18), implies that $\textnormal{Var}[\hat{S}_{l}^{(2)}]\leqslant\left(\frac{2}{3}\right)^{s}$ . Noting that we can write the estimator in Eq. (14) as $\hat{\mathcal{C}}_{|\psi\rangle}(S)=\sum_{i=1}^{L}\hat{\mathcal{C}}_{l}^{(K)}/L$ with $\hat{\mathcal{C}}_{l}^{(K)}=1-(3/2)^{s}\hat{S}_{l}^{(K)}$ , we obtain

\displaystyle\textnormal{Var}[\hat{C}_{l}^{(2)}]=\left(\frac{9}{4}\right)^{s}\textnormal{Var}[\hat{S}_{l}^{(2)}]\leqslant\left(\frac{3}{2}\right)^{s}.

(19)

This allows us to directly apply Prop. 2, which we state as our second theorem.

Theorem 2 (CE Estimation via MoM and LRMs).

Given precision of $\epsilon>0$ and a confidence level $1-\delta\in(0,1)$ , we randomly sample a total of $L=N_{B}B$ local unitaries of the form $U=\prod_{i\in S}U_{i}$ , where $N_{B}=\lceil 8\log(\frac{1}{\delta})\rceil$ and $B=\lceil 4\left(\frac{3}{2}\right)^{s}\epsilon^{-2}\rceil$ . Measuring $K=2$ outcomes per unitary, we denote the outcomes from the $L$ LRMs as $\boldsymbol{Z}_{1,1},\boldsymbol{Z}_{1,2},\ldots,\boldsymbol{Z}_{L,1},\boldsymbol{Z}_{L,2}$ . Breaking these $2N_{B}B$ outcomes into $N_{B}$ batches of size $2B$ , for each $1\leqslant b\leqslant N_{B}$ , compute the empirical mean

\displaystyle\overline{\mathcal{C}}_{|\psi\rangle}^{(b)}(S)=1-\left(\frac{3}{2}\right)^{s}\frac{1}{B}\sum_{l=(b-1)B+1}^{bB}\textnormal{ind}[\boldsymbol{Z}_{l,1}=\boldsymbol{Z}_{l,2}].

(20)

Then, given at most $O\left(\left(\frac{3}{2}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$ measurement outcomes, one is guaranteed to have

\displaystyle\Pr[|\textnormal{median}[\overline{\mathcal{C}}^{(1)}_{|\psi\rangle}(S),\ldots,\overline{\mathcal{C}}^{(N_{B})}_{|\psi\rangle}(S)]-\mathcal{C}_{|\psi\rangle}(S)|\geqslant\epsilon]\leqslant\delta.

(21)

The proof of this theorem can be found in Appendix B.1, and is a direct application of Proposition 2, using the variance bound in Eq. (19). While the classical post-processing needed to compute this estimator is no worse than the estimator in Theorem 1, it improves on the sample complexity by a factor of $\left(\frac{3}{2}\right)^{s}$ , which effectively doubles the size of the state one could probe with comparable resources (asymptotically).

Despite this significant improvement in sample complexity and classical post-processing, LRMs still require an exponential number of measurement settings which is time consuming experimentally. To address this limitation, we now show how to de-randomize these protocols and probe entanglement using a single experimental setting.

III.2 CEs via Projective 2-designs

As mentioned previously, the sample complexity of an estimation protocol is not the only relevant experimental consideration. While each measurement setting used in LRMs is easily implementable, we saw above that guaranteeing $\epsilon$ -close estimation of the CEs requires an exponential number of measurement settings. Inspired by Ref. Stricker et al. (2022), we propose a “de-randomization” of the LRM protocol that requires only a single experimental setting. That such a protocol is possible follows from Eq. (5) and the definition of a projective 2-design Scott (2006) as stated below.

Definition 2 (Projective 2-design).

A projective 2-design is a probability distribution over $N$ quantum states, $\{p_{i},|\phi_{i}\rangle\}_{i=1}^{N}$ , such that

\displaystyle\sum_{i}p_{i}(|\phi_{i}\rangle\langle\phi_{i}|)^{\otimes 2}

\displaystyle=\int_{\mathrm{Haar}}(|\psi\rangle\langle\psi|)^{\otimes 2}d\psi

(22)

where integration in the right-hand expression is with respect to the Haar measure.

From this definition and Schur’s lemma, we have

\displaystyle\mathbb{E}_{\phi}[(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}]

\displaystyle=\frac{1}{3}\Pi_{+},

(23)

where $\mathbb{E}_{\phi}[(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}]:=\sum_{i}p_{i}(|\phi_{i}\rangle\langle\phi_{i}|)^{\otimes 2}$ . Note that the state in $\mathbb{E}_{\phi}[(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}]$ denotes a random variable while $|\hskip 1.0pt\phi_{i}\rangle\langle\phi_{i}\hskip 1.0pt|$ is a specific state. Thus, we may estimate all of the CEs by appropriately post-processing the measurement data from any POVM that also forms a projective 2-design. This motivates the following theorem.

Theorem 3 (CEs via Projective 2-designs).

Let $\{p_{i},|\phi_{i}\rangle\}_{i=1}^{N}$ be a single-qubit projective 2-design with $N$ elements. Further, let $|\hskip 1.0pt\Phi_{\mathbf{q}}\rangle\langle\Phi_{\mathbf{q}}\hskip 1.0pt|=\prod_{i\in\mathbf{q}}|\hskip 1.0pt\phi_{i}\rangle\langle\phi_{i}\hskip 1.0pt|$ denote the projector onto output string $\mathbf{q}\in\{1,2,\ldots,N\}^{s}$ which occurs with probability $P(\mathbf{q})=\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi_{\mathbf{q}}\rangle\langle\Phi_{\mathbf{q}}\hskip 1.0pt|\right]$ when measuring the qubits in the set $S$ , given the state $\rho$ . Then, the CEs can be written as

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-3^{s}\mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]^{2}],

(24)

where $\mathbb{E}_{\Phi}[\cdot]$ denotes an expectation over the 2-design.

The proof is a direct application of Eq. (23) and Eq. (5), keeping in mind that each $|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|$ is independent. The details are provided in Appendix B.2.

While Thm. 3 holds for all projective 2-designs, we will restrict our attention to SIC POVMs, because they are minimal among single-qubit projective $2$ -designs. That is, they saturate the lower bound on the number of elements, $N$ , needed to form a projective $t$ -design in a $d$ -dimensional space Scott (2006)

\displaystyle N\geqslant\binom{d+\lceil t/2\rceil-1}{\lceil t/2\rceil}\binom{d+\lfloor t/2\rfloor-1}{\lfloor t/2\rfloor},

(25)

which becomes $N\geqslant 4$ for $2$ -designs for the $2$ -dimensional case on which we focus. Their minimal nature has important bearing on experimental implementations. Because the size of ancillary space required for implementation through Neumark’s (Naimark’s) Theorem (see Supp. Theorem A.1) is proportional to the number of design elements Chen et al. (2007), SIC POVMs are the cheapest to implement experimentally. We refer the reader to Refs. Renes et al. (2004); Scott (2006); Fuchs et al. (2017) for more about SICs generally, and to Ref. Scott (2006) for a detailed overview of SICs and projective 2-designs as we use them.

For qubits, and as depicted in Fig. 3c), the simplest SIC has the following 4 elements

\displaystyle\begin{split}|\phi_{1}\rangle&=|0\rangle,\\ |\phi_{2}\rangle&=\frac{1}{\sqrt{3}}|0\rangle+\sqrt{\frac{2}{3}}|1\rangle,\\ |\phi_{3}\rangle&=\frac{1}{\sqrt{3}}|0\rangle+\sqrt{\frac{2}{3}}e^{i2\pi/3}|1\rangle,\\ |\phi_{4}\rangle&=\frac{1}{\sqrt{3}}|0\rangle+\sqrt{\frac{2}{3}}e^{i4\pi/3}|1\rangle,\end{split}

(26)

which obey the following relation,

\displaystyle\frac{1}{4}\sum_{i=1}^{4}(|\hskip 1.0pt\phi_{i}\rangle\langle\phi_{i}\hskip 1.0pt|)^{\otimes 2}=\frac{1}{3}\Pi_{+}.

(27)

By defining $|\tilde{\phi}_{i}\rangle=\frac{1}{\sqrt{2}}|\phi_{i}\rangle$ , we can turn these SIC elements into a well-defined POVM satisfying

\displaystyle\sum_{i=1}^{4}|\hskip 1.0pt\tilde{\phi}_{i}\rangle\langle\tilde{\phi}_{i}\hskip 1.0pt|

\displaystyle=\mathbb{I}.

(28)

With this notation in place, we can state a corollary to Theorem 3 that yields a very simple expression for the CEs via local SICs.

Corollary 1 (CEs via local SICs).

Let $\{|\hskip 1.0pt\tilde{\phi}_{i}\rangle\langle\tilde{\phi}_{i}\hskip 1.0pt|\}_{i=1}^{4}$ be a single-qubit SIC-POVM and $|\hskip 1.0pt\tilde{\Phi}_{\mathbf{q}}\rangle\langle\tilde{\Phi}_{\mathbf{q}}\hskip 1.0pt|=\prod_{i\in\boldsymbol{q}}|\hskip 1.0pt\tilde{\phi}_{i}\rangle\langle\tilde{\phi}_{i}\hskip 1.0pt|$ denote the projector onto output string $\mathbf{q}\in\{1,2,3,4\}^{s}$ which occurs with probability,

\displaystyle P(\mathbf{q})=\textnormal{tr}\left[\rho|\hskip 1.0pt\tilde{\Phi}_{\mathbf{q}}\rangle\langle\tilde{\Phi}_{\mathbf{q}}\hskip 1.0pt|\right].

(29)

Then the CEs can be written as,

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-3^{s}\sum_{\mathbf{q}}P(\mathbf{q})^{2}.

(30)

Because we do not have to average over unitaries as in LRMs, the finite sampling error incurred when estimating Eq. (30) will only be due to approximating the probability of obtaining a particular bitstring. This important distinction between SICs and LRMs means we just need to estimate the sum of the squared probabilities obtained from a local SIC measurements. While this theorem implies one can quantify multipartite entanglement with a single experimental measurement setting, this simplicity comes at the cost of implementing a generalized POVM.

As shown schematically in Fig. 3, this can be achieved by employing one additional ancilla qubit for each system qubit Chen et al. (2007) or, as has been demonstrated experimentally in Refs. Stricker et al. (2022); Fischer et al. (2022), by encoding qubit states into 4-dimensional qudits (ququarts). Experimentally, Cor. 3 amounts to transforming each state-ancilla qubit pair or ququart by the unitary $U_{\mathrm{SIC}}^{\dagger}$ , where

\displaystyle U_{\mathrm{SIC}}=\begin{pmatrix}\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{6}}&\frac{1}{\sqrt{6}}&\frac{1}{\sqrt{6}}\\ 0&\frac{1}{\sqrt{3}}&\frac{e^{i2\pi/3}}{\sqrt{3}}&\frac{e^{i4\pi/3}}{\sqrt{3}}\\ 0&\frac{1}{\sqrt{3}}&\frac{e^{-i2\pi/3}}{\sqrt{3}}&\frac{e^{-i4\pi/3}}{\sqrt{3}}\\ \frac{1}{\sqrt{2}}&\frac{-1}{\sqrt{6}}&\frac{-1}{\sqrt{6}}&\frac{-1}{\sqrt{6}}\end{pmatrix},

(31)

and then performing a computational basis measurement. The construction of $U_{\mathrm{SIC}}$ follows from Neumark’s Theorem, which we prove carefully in Appendix A.1 for the reader’s convenience. The intuition is to convert the SIC elements into a $4$ -dimensional orthonormal basis by appending to each $|\tilde{\phi}_{i}\rangle$ a respective $|\tilde{\phi}^{\perp}_{i}\rangle$ such that $\langle\tilde{\phi}_{i}\hskip 1.0pt|\hskip 1.0pt\tilde{\phi}^{\perp}_{i}\rangle=0$ .

An estimator based on SIC POVMs follows naturally from Theorem 1 and Eq. (5). Because we saw that MoM gives tighter sample complexity upper bounds, we focus only on MoM estimation for SICs.

Theorem 4 (CE Estimation via SIC Data and MoM).

Given a precision $\epsilon>0$ and confidence level $1-\delta\in(0,1)$ , perform a total of $M=2N_{B}B$ SIC measurements, where $N_{B}=\lceil 8\log(1/\delta)\rceil$ and $B=\lceil 4(3^{s}/\epsilon^{2})\rceil$ . Let $\mathbf{Q}_{1},\ldots,\mathbf{Q}_{M}$ denote the outcomes obtained from these measurements. Break these $M$ outcomes into $N_{B}$ batches of size $2B$ , and for each $1\leqslant b\leqslant N_{B}$ , compute the empirical mean

\displaystyle\overline{\mathcal{C}}^{(b)}_{|\psi\rangle}(S)=1-3^{s}\frac{1}{B}\sum_{i=(b-1)B+1}^{bB}\textnormal{ind}[\boldsymbol{Q}_{2i-1}=\boldsymbol{Q}_{2i}].

(32)

Then, we have the guarantee that

\displaystyle\Pr[|\textnormal{median}[\overline{\mathcal{C}}^{(1)}_{|\psi\rangle}(S),\ldots,\overline{\mathcal{C}}^{(N_{B})}_{|\psi\rangle}(S)]-\mathcal{C}_{|\psi\rangle}(S)|\geqslant\epsilon]\leqslant\delta,

(33)

giving an upper bound of $O\left(3^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$ on the sample complexity of estimating $\mathcal{C}_{|\psi\rangle}(S)$ using SIC measurements.

The proof of this theorem can be found in Appendix B.1. Note that we have restricted our attention to the $K=2$ case in the main text. Evidently, this sample complexity is substantially worse than the LRMs. We can improve upon this by optimizing the number of samples used to construct a single estimate of the CE using SIC measurements (i.e. by optimizing $K$ ). As we show in Appendix B.2, the optimal value of $K$ is given as

\displaystyle\begin{split}K_{\mathrm{opt}}&=\Biggl{\lceil}\frac{1}{2}\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}+1\right)\\ &+\frac{1}{2}\sqrt{\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}-1\right)^{2}+\frac{32}{\epsilon^{2}}3^{s}}\Biggr{\rceil},\end{split}

(34)

which corresponds to the value of $K$ that minimizes the bound on the variance of the estimator for CE determined by SIC measurements. As shown in Appendix B.2, setting $K=K_{\mathrm{opt}}$ and appropriately modifying Theorem 4, it follows that $O\left(\left[\left(\frac{3}{2}\right)^{s}\epsilon^{-2}+\sqrt{3^{s}}\epsilon^{-1}\right]\log\left(\frac{1}{\delta}\right)\right)$ total measurements are sufficient to guarantee that

\displaystyle\Pr[|\textnormal{median}[\overline{\mathcal{C}}^{(1)}_{|\psi\rangle}(s),\ldots,\overline{\mathcal{C}}^{(N_{B})}_{|\psi\rangle}(s)]-\mathcal{C}_{|\psi\rangle}(s)|\geqslant\epsilon]\leqslant\delta.

(35)

With this optimization, we have shown that using a single experimental measurement setting, one can obtain an upper bound on the sample complexity that is similar to the bound obtained for LRMs, which requires an exponential number of different measurement settings. However, as noted in Table 1, the classical post-processing in this optimized case is more costly. Moreover, the local measurement itself is more difficult to implement than in the case of the simple projective measurements used in LRM protocols.

Nonetheless, we hope that our results will be useful to experimentalists that would like to quantify multipartite entanglement without the need to prepare and coherently manipulate multiple identical copies of a quantum state. While this capability is becoming possible on neutral atom platforms Bluvstein et al. (2022, 2023), it remains very challenging on other architectures.

Our motivation in this work was to improve upon protocols in the literature and find the most resource efficient method for multipartite entanglement quantification using only local measurements. While we have accomplished this, we also aimed to provide tools that experimentalists can use in current and near-term experiments where states are not perfectly pure. While there are a number of ways one could quantify mixed state entanglement, we sketch one technique to which the methods above could be directly applied.

III.3 Mixed States

Our motivation in this work was to optimize and compare various methods of quantifying pure state multipartite entanglement using only local POVMs. As such, we focused on pure states with ideal unitary evolution and perfect measurements. While this question is of theoretical interest, to be useful in practice, one must be able to handle mixed states. While we save the full treatment of mixed states for a future work (more on this below), we mention here one potential way forward that would utilize our results here.

A standard method of extending pure state entanglement measures to mixed state ones is via a convex roof extension Bennett et al. (1996); Uhlmann (2010). This was done for the CEs in Ref. Beckey et al. (2023), but we sketch the argument here to illustrate the main points. The convex roof extension can be expressed as

\displaystyle\mathcal{C}_{\rho}(S)

\displaystyle=\inf\sum_{i}p_{i}\mathcal{C}_{|\psi_{i}\rangle}(S),

(36)

where the infimum is over the set of decompositions of the form $\rho=\sum_{i}p_{i}|\psi_{i}\rangle\langle\psi_{i}|$ , with $p_{i}\geqslant 0$ for all $i$ and $\sum_{i}p_{i}=1$ . Because this optimization is generally difficult, one often considers a lower bound on Eq. (36). We will show how one could construct such lower bounds for the CEs and compute them from local measurement data alone, allowing one to bound the mixed state entanglement within the above framework developed for estimating pure state entanglement.

First we note that the bipartite concurrence Wootters (2001); Rungta et al. (2001) of a pure quantum state $|\psi\rangle_{AB}$ can be expressed as

\displaystyle c_{2}(|\psi\rangle_{AB})

\displaystyle=\sqrt{2(1-\textnormal{tr}\left[\rho_{A}^{2}\right])}.

(37)

We can use this to express the CE in terms of the bipartite concurrences by defining $c_{\alpha}:=\sqrt{2(1-\textnormal{tr}\left[\rho_{\alpha^{2}}\right])}$ , leading to

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=\frac{1}{2^{s+1}}\sum_{\alpha}c_{\alpha}^{2}(|\psi\rangle).

(38)

Then, one could use the observable lower bound on the bipartite concurrence introduced in Ref. Mintert and Buchleitner (2007) to construct a lower bound on the CEs. For example, when the $S=[n]$ and the CE is simply related to the multipartite concurrence of Ref. Carvalho et al. (2004), one obtains a lower bound of the form

\displaystyle\mathcal{C}_{\rho}^{\ell}([n])

\displaystyle=\frac{1}{2^{n}}+(1-\frac{1}{2^{n}})\textnormal{tr}\left[\rho^{2}\right]-\frac{1}{2^{n}}\sum_{\alpha\in\mathcal{P}([n])}\textnormal{tr}\left[\rho_{\alpha}^{2}\right].

(39)

Throughout this work, we have shown how to estimate the last term using various local measurement strategies. The middle term depends only on the quantum state’s purity, $\textnormal{tr}\left[\rho^{2}\right]$ . There exists a number of theoretical van Enk and Beenakker (2012); Elben et al. (2019) and experimental Daley et al. (2012); Islam et al. (2015); Kaufman et al. (2016); Vermersch et al. (2018); Brydges et al. (2019) works showing how to estimate the purity of quantum states using local random measurements, and local SICs were implemented experimentally in Ref. Stricker et al. (2022) to estimate purity. In fact, the purity of a quantum state is a quantity of great practical interest in its own right, and we suspect our results will be of interest to those researchers interested in proving rigorous performance guarantees on its estimation.

Coupled with these results from the literature, one could use the framework and techniques described above to probe mixed state entanglement in this manner. We further note that, for high-purity states that are becoming increasingly common in today’s state of the art experiments, this bound is very close to the pure state theoretical value. This can be seen by noting that $C_{|\psi\rangle}([n])-\mathcal{C}_{\rho}^{\ell}([n])=(1-2^{-n})(1-\textnormal{tr}\left[\rho^{2}\right]),$ which is very close to zero for nearly pure states.

Such a procedure described above would generalize the method used in Ref. Ohnemus et al. (2023) for the multipartite concurrence Mintert et al. (2005); Mintert and Buchleitner (2007), but is just one possible extension of our methods to mixed state entanglement. There have been a number of interesting works in the recent years that use local random measurement strategies to probe mixed state entanglement Elben et al. (2020); Neven et al. (2021); Vermersch et al. (2023), and exploring the connections between our proposed method and theirs is an interesting direction we save for future work.

IV Future Directions and Conclusions

Before concluding, we would like to mention some directions for future investigations and place our work in the broader context of the literature. As mentioned above, a timely follow-up to our work would be the careful analysis applying our methods to mixed states and, ideally, an implementation on real hardware.

In addition to matters of direct practical interest, our work also connects to interesting open theoretical questions regarding the ultimate limits on the learnability of quantum entanglement with local measurements. While we provide several upper bounds on the sample complexity of the estimation of multipartite entanglement with local measurements, the best of which scales as $(\frac{3}{2})^{s}$ , it remains an open question as to what the optimal scaling is. Finding matching lower and upper bounds on the sample complexity of this task would be an exciting result both for experimentalists wishing to probe entanglement in the lab and to the quantum learning theory community that aims to quantify the ultimate limits of the learnability of quantum properties. In particular, such bounds would allow one to establish ultimate separations between local, single-copy, and multi-copy measurement strategies. While such separations have been established for QST (see Table 1 in Ref. Lowe and Nayak (2022)) and other learning tasks, the authors are unaware of any such separation in the case of multipartite entanglement estimation.

However, partial results in that direction do exist. Given the ability to create, store, and coherently manipulate two copies of state at once (as in Refs. Bluvstein et al. (2022, 2023)), one could use Bell basis measurements on each qubit of the two copies of $\rho$ to estimate the CEs Beckey et al. (2023). Given access to these two-copy measurements, Ref. Beckey et al. (2023) showed that at most $O(\log{(\frac{1}{\delta})}\epsilon^{-2})$ measurements are needed to estimate the CEs to $\epsilon$ precision with probability $1-\delta$ . Thus, for constant $\epsilon,\delta$ , two-copy measurements allow for multipartite entanglement quantification using at most $O(1)$ measurements, while our best local strategy scales as $O((\frac{3}{2})^{s})$ . Proving an exponential lower bound on multipartite entanglement estimation with local measurements would imply an exponential separation between local and two-copy measurements, which would be very interesting to theorists and experimentalists alike.

To conclude, in this work we have generalized Ref. Ohnemus et al. (2023), which studies the estimation of multipartite concurrence using LRMs, to the CEs and then significantly simplified the error analysis and provided analytical upper bounds on the sample complexity of this task, as summarized in Table 1. While each measurement in an LRM protocol is easy to implement, the number of measurement settings required scales exponentially with the number of qubits. To address this experimental shortcoming of LRMs, we provided a de-randomization of the entanglement estimation procedure, allowing experimentalists to estimate many multipartite entanglement measures of interest using a single experimental setting.

Finally, the purity of a quantum state is a quantity of great practical interest, in its own right. Many works exploring the estimation of functions of the form $\textnormal{tr}\left[\rho^{k}\right]$ for integer $k$ fall short of the rigorous performance guarantees we provide here. We suspect our methods of statistical estimation could be adapted to those settings as well. In general, we hope our work will enable multipartite entanglement quantification in the increasingly large quantum systems being built, and coherently controlled, in experimental laboratories today.

Acknowledgements.

The authors thank Chris Fuchs for several enlightening discussions about SIC measurements during his visit to JILA. This work was supported by NSF award 2137984 and NSF PHY 1915407.

References

Einstein et al. (1935) A. Einstein, B. Podolsky, and N. Rosen, “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?” Physical Review 47, 777–780 (1935).
Horodecki et al. (2009) Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki, “Quantum entanglement,” Rev. Mod. Phys. 81, 865–942 (2009).
Gühne and Tóth (2009) Otfried Gühne and Géza Tóth, “Entanglement detection,” Physics Reports 474, 1–75 (2009).
Islam et al. (2015) Rajibul Islam, Ruichao Ma, Philipp M. Preiss, M. Eric Tai, Alexander Lukin, Matthew Rispoli, and Markus Greiner, “Measuring entanglement entropy in a quantum many-body system,” Nature 528, 77–83 (2015).
Kaufman et al. (2016) Adam M. Kaufman, M. Eric Tai, Alexander Lukin, Matthew Rispoli, Robert Schittko, Philipp M. Preiss, and Markus Greiner, “Quantum thermalization through entanglement in an isolated many-body system,” Science 353, 794–800 (2016).
Bluvstein et al. (2022) Dolev Bluvstein, Harry Levine, Giulia Semeghini, Tout T. Wang, Sepehr Ebadi, Marcin Kalinowski, Alexander Keesling, Nishad Maskara, Hannes Pichler, Markus Greiner, and et al., “A quantum processor based on coherent transport of entangled atom arrays,” Nature 604, 451–456 (2022).
Bell (1964) J. S. Bell, “On the Einstein Podolsky Rosen paradox,” Physics Physique Fizika 1, 195–200 (1964).
Clauser et al. (1969) John F. Clauser, Michael A. Horne, Abner Shimony, and Richard A. Holt, “Proposed Experiment to Test Local Hidden-Variable Theories,” Physical Review Letters 23, 880–884 (1969).
Freedman and Clauser (1972) Stuart J. Freedman and John F. Clauser, “Experimental Test of Local Hidden-Variable Theories,” Physical Review Letters 28, 938–941 (1972).
Hensen et al. (2015) B. Hensen, H. Bernien, A. E. Dréau, A. Reiserer, N. Kalb, M. S. Blok, J. Ruitenberg, R. F. L. Vermeulen, R. N. Schouten, C. Abellán, W. Amaya, V. Pruneri, M. W. Mitchell, M. Markham, D. J. Twitchen, D. Elkouss, S. Wehner, T. H. Taminiau, and R. Hanson, “Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres,” Nature 526, 682–686 (2015).
Giustina et al. (2015) Marissa Giustina, Marijn A. M. Versteegh, Sören Wengerowsky, Johannes Handsteiner, Armin Hochrainer, Kevin Phelan, Fabian Steinlechner, Johannes Kofler, Jan-Åke Larsson, Carlos Abellán, Waldimar Amaya, Valerio Pruneri, Morgan W. Mitchell, Jörn Beyer, Thomas Gerrits, Adriana E. Lita, Lynden K. Shalm, Sae Woo Nam, Thomas Scheidl, Rupert Ursin, Bernhard Wittmann, and Anton Zeilinger, “Significant-Loophole-Free Test of Bell’s Theorem with Entangled Photons,” Physical Review Letters 115, 250401 (2015).
Shalm et al. (2015) Lynden K. Shalm, Evan Meyer-Scott, Bradley G. Christensen, Peter Bierhorst, Michael A. Wayne, Martin J. Stevens, Thomas Gerrits, Scott Glancy, Deny R. Hamel, Michael S. Allman, Kevin J. Coakley, Shellee D. Dyer, Carson Hodge, Adriana E. Lita, Varun B. Verma, Camilla Lambrocco, Edward Tortorici, Alan L. Migdall, Yanbao Zhang, Daniel R. Kumor, William H. Farr, Francesco Marsili, Matthew D. Shaw, Jeffrey A. Stern, Carlos Abellán, Waldimar Amaya, Valerio Pruneri, Thomas Jennewein, Morgan W. Mitchell, Paul G. Kwiat, Joshua C. Bienfang, Richard P. Mirin, Emanuel Knill, and Sae Woo Nam, “Strong Loophole-Free Test of Local Realism,” Physical Review Letters 115, 250402 (2015).
Elben et al. (2018) A. Elben, B. Vermersch, M. Dalmonte, J. I. Cirac, and P. Zoller, “Rényi entropies from random quenches in atomic hubbard and spin models,” Phys. Rev. Lett. 120, 050406 (2018).
Brydges et al. (2019) Tiff Brydges, Andreas Elben, Petar Jurcevic, Benoît Vermersch, Christine Maier, Ben P. Lanyon, Peter Zoller, Rainer Blatt, and Christian F. Roos, “Probing rényi entanglement entropy via randomized measurements,” Science 364, 260–263 (2019).
Elben et al. (2019) A. Elben, B. Vermersch, C. F. Roos, and P. Zoller, “Statistical correlations between locally randomized measurements: A toolbox for probing entanglement in many-body quantum states,” Phys. Rev. A 99, 052323 (2019).
Notarnicola et al. (2022) Simone Notarnicola, Andreas Elben, Thierry Lahaye, Antoine Browaeys, Simone Montangero, and Benoit Vermersch, “A randomized measurement toolbox for rydberg quantum technologies,” arXiv preprint arXiv:2112.11046 (2022).
Elben et al. (2020) Andreas Elben, Richard Kueng, Hsin-Yuan (Robert) Huang, Rick van Bijnen, Christian Kokail, Marcello Dalmonte, Pasquale Calabrese, Barbara Kraus, John Preskill, Peter Zoller, and Benoît Vermersch, “Mixed-State Entanglement from Local Randomized Measurements,” Physical Review Letters 125, 200501 (2020).
Vermersch et al. (2023) Benoît Vermersch, Marko Ljubotina, J. Ignacio Cirac, Peter Zoller, Maksym Serbyn, and Lorenzo Piroli, “Many-body entropies and entanglement from polynomially-many local measurements,” arXiv pre-print arXiv:2311.08108 (2023).
Haah et al. (2017) Jeongwan Haah, Aram W. Harrow, Zhengfeng Ji, Xiaodi Wu, and Nengkun Yu, “Sample-Optimal Tomography of Quantum States,” IEEE Transactions on Information Theory 63, 5628–5641 (2017).
O’Donnell and Wright (2016) Ryan O’Donnell and John Wright, “Efficient quantum tomography,” in Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16 (Association for Computing Machinery, New York, NY, USA, 2016) pp. 899–912.
Bubeck et al. (2020) Sebastien Bubeck, Sitan Chen, and Jerry Li, “Entanglement is Necessary for Optimal Quantum Property Testing,” in 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) (2020) pp. 692–703.
van Enk and Beenakker (2012) S. J. van Enk and C. W. J. Beenakker, “Measuring $\mathrm{Tr}{\rho}^{n}$ on single copies of $\rho$ using random measurements,” Phys. Rev. Lett. 108, 110503 (2012).
O’Donnell and Wright (2015) Ryan O’Donnell and John Wright, “Quantum Spectrum Testing,” in Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15 (Association for Computing Machinery, New York, NY, USA, 2015) pp. 529–538.
Rath et al. (2021) Aniket Rath, Cyril Branciard, Anna Minguzzi, and Benoît Vermersch, “Quantum Fisher Information from Randomized Measurements,” Physical Review Letters 127, 260501 (2021).
Huang et al. (2020) Hsin-Yuan Huang, Richard Kueng, and John Preskill, “Predicting many properties of a quantum system from very few measurements,” Nature Physics 16, 1050–1057 (2020).
Huang et al. (2022) Hsin-Yuan Huang, Michael Broughton, Jordan Cotler, Sitan Chen, Jerry Li, Masoud Mohseni, Hartmut Neven, Ryan Babbush, Richard Kueng, John Preskill, and et al., “Quantum advantage in learning from experiments,” Science 376, 1182–1186 (2022).
Elben et al. (2023) Andreas Elben, Steven T. Flammia, Hsin-Yuan Huang, Richard Kueng, John Preskill, Benoît Vermersch, and Peter Zoller, “The randomized measurement toolbox,” Nature Reviews Physics 5, 9–24 (2023).
Lu et al. (2016) Dawei Lu, Tao Xin, Nengkun Yu, Zhengfeng Ji, Jianxin Chen, Guilu Long, Jonathan Baugh, Xinhua Peng, Bei Zeng, and Raymond Laflamme, “Tomography is Necessary for Universal Entanglement Detection with Single-Copy Observables,” Physical Review Letters 116, 230501 (2016).
Liu et al. (2022) Zhenhuan Liu, Pei Zeng, You Zhou, and Mile Gu, “Characterizing correlation within multipartite quantum systems via local randomized measurements,” Phys. Rev. A 105, 022407 (2022).
Ohnemus et al. (2023) Sophia Ohnemus, Heinz-Peter Breuer, and Andreas Ketterer, “Quantifying multiparticle entanglement with randomized measurements,” Physical Review A 107, 042406 (2023).
Carvalho et al. (2004) André R. R. Carvalho, Florian Mintert, and Andreas Buchleitner, “Decoherence and multipartite entanglement,” Phys. Rev. Lett. 93, 230501 (2004).
Beckey et al. (2021) Jacob L. Beckey, N. Gigena, Patrick J. Coles, and M. Cerezo, “Computable and operationally meaningful multipartite entanglement measures,” Phys. Rev. Lett. 127, 140501 (2021).
Stricker et al. (2022) Roman Stricker, Michael Meth, Lukas Postler, Claire Edmunds, Chris Ferrie, Rainer Blatt, Philipp Schindler, Thomas Monz, Richard Kueng, and Martin Ringbauer, “Experimental Single-Setting Quantum State Tomography,” PRX Quantum 3, 040310 (2022).
Beckey et al. (2023) Jacob L. Beckey, Gerard Pelegrí, Steph Foulds, and Natalie J. Pearson, “Multipartite entanglement measures via Bell-basis measurements,” Physical Review A 107, 062425 (2023).
Ekert et al. (2002) Artur K. Ekert, Carolina Moura Alves, Daniel K. L. Oi, Michał Horodecki, Paweł Horodecki, and L. C. Kwek, “Direct Estimations of Linear and Nonlinear Functionals of a Quantum State,” Physical Review Letters 88, 217901 (2002).
Meyer and Wallach (2002) David A Meyer and Nolan R Wallach, “Global entanglement in multiparticle systems,” Journal of Mathematical Physics 43, 4273–4278 (2002).
Brennen (2003) Gavin K Brennen, “An observable measure of entanglement for pure states of multi-qubit systems,” arXiv preprint quant-ph/0305094 (2003).
Bluvstein et al. (2023) Dolev Bluvstein, Simon J. Evered, Alexandra A. Geim, Sophie H. Li, Hengyun Zhou, Tom Manovitz, Sepehr Ebadi, Madelyn Cain, Marcin Kalinowski, Dominik Hangleiter, J. Pablo Bonilla Ataides, Nishad Maskara, Iris Cong, Xun Gao, Pedro Sales Rodriguez, Thomas Karolyshyn, Giulia Semeghini, Michael J. Gullans, Markus Greiner, Vladan Vuletić, and Mikhail D. Lukin, “Logical quantum processor based on reconfigurable atom arrays,” Nature , 1–3 (2023).
Vershynin (2018) Roman Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics (Cambridge University Press, Cambridge, 2018).
Lerasle (2019) Matthieu Lerasle, “Lecture Notes: Selected topics on robust statistical learning theory,” arXiv:1908.10761 (2019).
Mele (2023) Antonio Anna Mele, “Introduction to Haar Measure Tools in Quantum Information: A Beginner’s Tutorial,” (2023), arxiv:2307.08956 .
Vermersch et al. (2018) B. Vermersch, A. Elben, M. Dalmonte, J. I. Cirac, and P. Zoller, “Unitary $n$ -designs via random quenches in atomic hubbard and spin models: Application to the measurement of rényi entropies,” Phys. Rev. A 97, 023604 (2018).
Scott (2006) A J Scott, “Tight informationally complete quantum measurements,” Journal of Physics A: Mathematical and General 39, 13507–13530 (2006).
Chen et al. (2007) Ping-Xing Chen, János A. Bergou, Shi-Yao Zhu, and Guang-Can Guo, “Ancilla dimensions needed to carry out positive-operator-valued measurement,” Phys. Rev. A 76, 060303 (2007).
Renes et al. (2004) Joseph M. Renes, Robin Blume-Kohout, A. J. Scott, and Carlton M. Caves, “Symmetric informationally complete quantum measurements,” Journal of Mathematical Physics 45, 2171–2180 (2004).
Fuchs et al. (2017) Christopher Fuchs, Michael Hoang, and Blake Stacey, “The sic question: History and state of play,” Axioms 6, 21 (2017).
Fischer et al. (2022) Laurin E. Fischer, Daniel Miller, Francesco Tacchino, Panagiotis Kl. Barkoutsos, Daniel J. Egger, and Ivano Tavernelli, “Ancilla-free implementation of generalized measurements for qubits embedded in a qudit space,” Phys. Rev. Res. 4, 033027 (2022).
Bennett et al. (1996) Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters, “Mixed-state entanglement and quantum error correction,” Physical Review A 54, 3824–3851 (1996).
Uhlmann (2010) Armin Uhlmann, “Roofs and Convexity,” Entropy 12, 1799–1832 (2010).
Wootters (2001) William K. Wootters, “Entanglement of formation and concurrence,” Quantum Information and Computation 1, 27–44 (2001).
Rungta et al. (2001) Pranaw Rungta, V. Bužek, Carlton M. Caves, M. Hillery, and G. J. Milburn, “Universal state inversion and concurrence in arbitrary dimensions,” Physical Review A 64, 042315 (2001).
Mintert and Buchleitner (2007) Florian Mintert and Andreas Buchleitner, “Observable entanglement measure for mixed quantum states,” Phys. Rev. Lett. 98, 140505 (2007).
Daley et al. (2012) A. J. Daley, H. Pichler, J. Schachenmayer, and P. Zoller, “Measuring entanglement growth in quench dynamics of bosons in an optical lattice,” Phys. Rev. Lett. 109, 020505 (2012).
Mintert et al. (2005) Florian Mintert, Marek Kuś, and Andreas Buchleitner, “Concurrence of mixed multipartite quantum states,” Phys. Rev. Lett. 95, 260502 (2005).
Neven et al. (2021) Antoine Neven, Jose Carrasco, Vittorio Vitale, Christian Kokail, Andreas Elben, Marcello Dalmonte, Pasquale Calabrese, Peter Zoller, Benoit Vermersch, Richard Kueng, and Barbara Kraus, “Symmetry-resolved entanglement detection using partial transpose moments,” npj Quantum Information 7, 1–12 (2021).
Lowe and Nayak (2022) Angus Lowe and Ashwin Nayak, “Lower bounds for learning quantum states with single-copy measurements,” (2022), arXiv:2207.14438v2 .
Preskill (1997) John Preskill, “Quantum computation lecutre notes,” http://theory.caltech.edu/ preskill/ph229/notes/chap3.pdf (1997).

Appendix

Appendix A Preliminaries

A.1 Neumark’s (Naimark’s) Theorem

Neumark’s theorem is a procedure for realizing POVMs as a projective measurement on a larger Hilbert space. Because it is the key theoretical component that allows SICs to be implemented in practice, we provide a proof here. Our proof is based directly upon the one given in Preskill’s 1997 lecture notes Preskill (1997). Note that we focus on the case of POVMs with four elements due to our interest in single qubit SICs; however, the result can be generalized for POVMs with $N$ elements.

Theorem 5 (Neumark’s (Naimark’s) Theorem for single-qubit SICs).

Suppose our two-dimensional Hilbert space of interest $\mathcal{H}_{A}$ is actually a subspace of a larger Hilbert space with a direct sum structure $\mathcal{H}=\mathcal{H}_{A}\oplus\mathcal{H}_{A}^{\perp}$ , where $\mathcal{H}_{A}^{\perp}$ is another two-dimensional Hilbert space. Then, a single-qubit SIC-POVM has elements $\{F_{i}\}_{i=0}^{3}=\{\frac{1}{2}|\hskip 1.0pt\phi_{i}\rangle\langle\phi_{i}\hskip 1.0pt|\}_{i=0}^{3}=\{|\hskip 1.0pt\tilde{\phi_{i}}\rangle\langle\tilde{\phi_{i}}\hskip 1.0pt|\}_{i=0}^{3}\subset\mathcal{H}_{A}$ that only have support on $\mathcal{H}_{A}$ , i.e, $F_{i}|\phi^{\perp}\rangle=0=\langle\phi^{\perp}|F_{i}$ , for any $|\phi^{\perp}\rangle\in\mathcal{H}_{A}^{\perp}$ and for any $i$ . We can then realize the SIC-POVM as a projective measurement $\{|\hskip 1.0ptu_{i}\rangle\langle u_{i}\hskip 1.0pt|\}$ on $\mathcal{H}$ , where $\{|u_{0}\rangle,\dotsc,|u_{3}\rangle\}$ is an orthonormal basis, with $|u_{i}\rangle$ defined as

\displaystyle|u_{i}\rangle=|\tilde{\phi_{i}}\rangle\oplus|\tilde{\phi}_{i}^{\perp}\rangle,

(40)

where $|\tilde{\phi}_{i}^{\perp}\rangle$ is a vector of magnitude $1/2$ that is orthogonal to $|\tilde{\phi}_{i}\rangle$ for each $i$ given in Eq. (42).

Proof.

We have the following requirements to realize the single-qubit SIC-POVM as a projective POVM on the larger space $\mathcal{H}$ :

1.

$\{|u_{0}\rangle,\dotsc,|u_{3}\rangle\}$ forms an orthonormal basis for $\mathcal{H}$ , and

For any state $\rho$ on the space $\mathcal{H}_{A}$ , we have

\displaystyle\textnormal{tr}\left[(\rho\oplus 0_{A^{\perp}})|\hskip 1.0ptu_{i}\rangle\langle u_{i}\hskip 1.0pt|\right]=\textnormal{tr}\left[\rho|\hskip 1.0pt\tilde{\phi_{i}}\rangle\langle\tilde{\phi_{i}}\hskip 1.0pt|\right]

(41)

for all $i\in\{0,\dotsc,3\}$ , where $0_{A^{\perp}}$ denotes the zero matrix on $\mathcal{H}_{A}^{\perp}$ .

The second requirement above ensures that the POVM on the larger space reproduces the statistics of the single-qubit SIC POVM. This condition implies that each $|u_{i}\rangle$ is of the form $|u_{i}\rangle=|\tilde{\psi}_{i}\rangle\oplus|w_{i}\rangle$ for some vector $|w_{i}\rangle\in\mathcal{H}_{A}^{\perp}$ . Then, orthonormality of the $\{|u_{i}\rangle\}$ basis gives the constraint that $\langle w_{j}|w_{i}\rangle=-\langle\tilde{\phi}_{j}|\tilde{\phi}_{i}\rangle$ for all $i,j$ . It can be verified through explicit calculation that choosing $|w_{i}\rangle=|\tilde{\phi}_{i}^{\perp}\rangle$ to be a vector perpendicular to $|\tilde{\phi}_{i}\rangle$ for each $i$ given in the equation below satisfies the above requirements.

\displaystyle\begin{split}|\tilde{\phi}_{0}^{\perp}\rangle&=\frac{1}{\sqrt{2}}|1\rangle,\\ |\tilde{\phi}_{1}^{\perp}\rangle&=\frac{1}{\sqrt{3}}|0\rangle-\frac{1}{\sqrt{6}}|1\rangle,\\ |\tilde{\phi}_{2}^{\perp}\rangle&=\frac{e^{-i2\pi/3}}{\sqrt{3}}|0\rangle-\frac{1}{\sqrt{6}}|1\rangle,\\ |\tilde{\phi}_{3}^{\perp}\rangle&=\frac{e^{-i4\pi/3}}{\sqrt{3}}|0\rangle-\frac{1}{\sqrt{6}}|1\rangle.\end{split}

(42)

∎

Neumark’s Theorem involves a direct sum structure, but it can be exchanged for a tensor product structure with some caveats Chen et al. (2007). For the case of single-qubit SIC-POVM, however, this is not a problem as both $\mathcal{H}_{A}$ and $\mathcal{H}_{A}^{\perp}$ are two-dimensional. Specifically, we can obtain a tensor product structure by considering an ancillary qubit $B$ , and writing

\displaystyle|u_{i}\rangle=|\tilde{\phi_{i}}\rangle_{A}|0\rangle_{B}+|\tilde{\phi_{i}^{\perp}}\rangle_{A}|1\rangle_{B}.

(43)

This once again gives the same measurement statistics as the POVM,

\displaystyle\textnormal{tr}\left[\rho|\hskip 1.0pt\tilde{\phi_{i}}\rangle\langle\tilde{\phi_{i}}\hskip 1.0pt|\right]=\textnormal{tr}\left[(\rho\otimes|\hskip 1.0pt0\rangle\langle 0\hskip 1.0pt|)|\hskip 1.0ptu_{i}\rangle\langle u_{i}\hskip 1.0pt|\right].

(44)

To physically implement a SIC-POVM we can construct a unitary to act on our state such that a computational basis measurement afterward will give us the outcomes for a SIC. Let $U_{\mathrm{SIC}}=\begin{bmatrix}|u_{1}\rangle&|u_{2}\rangle&|u_{3}\rangle&|u_{4}\rangle\end{bmatrix},$ then

\displaystyle\textnormal{tr}\left[\rho|\hskip 1.0pt\tilde{\phi_{i}}\rangle\langle\tilde{\phi_{i}}\hskip 1.0pt|\right]=\textnormal{tr}\left[U_{\mathrm{SIC}}^{\dagger}(\rho\otimes|\hskip 1.0pt0\rangle\langle 0\hskip 1.0pt|)U_{\mathrm{SIC}}|\hskip 1.0ptq\rangle\langle q\hskip 1.0pt|\right]

(45)

where $q\in\{0,1,2,3\}$ . An explicit matrix for $U_{\mathrm{SIC}}$ is given in Eq. (31).

Appendix B Proofs of Main Results

B.1 LRMs

Proof of Proposition. 3.

Let $S\subseteq[n]$ denote the system that we are interested in. We define $U$ as follows:

\displaystyle U=\prod_{i\in S}U_{i},

(46)

where $U_{i}\in U(2)$ . We can then compute the Haar average quantity $\mathbb{E}[P_{U}(\mathbf{z})^{2}]$ as

\displaystyle\begin{split}\mathbb{E}[P_{U}(\mathbf{z})^{2}]&=\mathbb{E}_{U}[\textnormal{tr}\left[\rho U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger}]^{2}\right],\\ &=\mathbb{E}_{U}[\textnormal{tr}\left[\rho U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger}\right]\textnormal{tr}\left[\rho U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger}\right]],\\ &=\mathbb{E}_{U}[\textnormal{tr}\left[\rho U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger}\otimes\rho U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger}\right]],\\ &=\mathbb{E}_{U}[\textnormal{tr}\left[\rho^{\otimes 2}(U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger})^{\otimes 2}\right]],\\ &=\textnormal{tr}\left[\rho^{\otimes 2}\mathbb{E}_{U}[(U|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger})^{\otimes 2}]\right],\\ &=\textnormal{tr}\left[\rho^{\otimes 2}\prod_{i\in s}\mathbb{E}_{U_{i}}[U_{i}^{\otimes 2}|\hskip 1.0ptz_{i}\rangle\langle z_{i}\hskip 1.0pt|^{\otimes 2}U^{\dagger^{\otimes 2}}_{i}]\right],\\ \mathbb{E}[P_{U}(\mathbf{z})^{2}]&=\left(\frac{1}{3}\right)^{\lvert s\rvert}\textnormal{tr}\left[\rho^{\otimes 2}\prod_{i\in s}\Pi^{i}_{+}\right],\end{split}

(47)

where the last line follows from:

\displaystyle\begin{split}\mathbb{E}_{U}[U^{\otimes 2}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|U^{\dagger^{\otimes 2}}]&=\frac{1}{\textnormal{tr}\left[\Pi_{+}\right]}\Pi_{+}\textnormal{tr}\left[\Pi_{+}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|^{\otimes 2}\right]\\ &+\frac{1}{\textnormal{tr}\left[\Pi_{-}\right]}\Pi_{-}\textnormal{tr}\left[\Pi_{-}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|^{\otimes 2}\right],\\ &=\frac{2}{d(d+1)}\Pi_{+}\textnormal{tr}\left[\frac{\mathbb{I}+\mathbb{F}}{2}|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|^{\otimes 2}\right],\\ &=\frac{2}{d(d+1)}\Pi_{+}.\end{split}

(48)

Where we notice that since $|\hskip 1.0pt\mathbf{z}\rangle\langle\mathbf{z}\hskip 1.0pt|$ is a product state, under the SWAP operation ( $\mathbb{F}$ ) it remains the same and is thus annihilated by the anti-symmetric projector. Using the relation that $\Pi_{+}=\frac{\mathbb{I}+\mathbb{F}}{2}$ , $d=2$ , and Lemma 1, i.e. the SWAP trick, we can rewrite to the following:

$\displaystyle\mathbb{E}[P_{U}(\mathbf{z})^{2}]$	$\displaystyle=\left(\frac{1}{3}\right)^{\lvert s\rvert}\textnormal{tr}\left[\rho^{\otimes 2}\prod_{i\in s}\Pi_{+}^{i}\right],$	(49)
	$\displaystyle=\left(\frac{1}{6}\right)^{\lvert s\rvert}\textnormal{tr}\left[\prod_{i\in S}(\mathbb{F}_{i}+\mathbb{I}_{i})\rho^{\otimes 2}\right],$	(50)
	$\displaystyle=\left(\frac{1}{6}\right)^{\lvert s\rvert}\sum_{\alpha\in\mathcal{P}(s)}\textnormal{tr}\left[\rho_{\alpha}^{2}\right],$	(51)
$\displaystyle\implies$	$\displaystyle\mathcal{C}_{\|\psi\rangle}(S)=1-3^{\lvert s\rvert}\mathbb{E}[P_{U}(\mathbf{z})^{2}]$	(52)

∎

Proof of Theorem 1.

Let $\boldsymbol{Z}_{l,k}$ denote the $k$ th outcome observed after rotating a given subset $S$ of qubits by $U_{l}$ and measuring in the computational basis of those qubits. For a fixed $\boldsymbol{z}\in\{0,1\}^{s}$ bitstring, let $\widehat{P}_{l}(\boldsymbol{z})=\frac{1}{K}\sum_{k=1}^{K}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{z}]$ denote the fraction of $K$ outcomes equal to $\boldsymbol{z}$ . Observe that $\widehat{P}_{l}(\boldsymbol{z})$ is an unbiased estimator of $\mathbb{E}_{U}[P_{U}(\boldsymbol{z})]$ , which is the probability of observing the bitstring $\boldsymbol{z}$ , averaged over unitaries. Note that here we take the expectation value over the outcome probabilities as well as the unitaries. Then, following Ohnemus et al. Ohnemus et al. (2023),

\displaystyle\widehat{P}^{(2)}_{l}({\boldsymbol{z}})

\displaystyle=\widehat{P}_{l}({\boldsymbol{z}})\frac{(K\widehat{P}_{l}({\boldsymbol{z}})-1)}{K-1}.

(53)

is an unbiased estimator of the squared probability averaged over unitaries, $\mathbb{E}_{U}[P_{U}(\boldsymbol{z})^{2}]$ . Consequently, $\hat{S}_{l}^{(K)}=\sum_{\boldsymbol{z}\in\{0,1\}^{s}}\widehat{P}^{(2)}_{l}(\boldsymbol{z})$ is an unbiased estimator of the sum of squared probabilities averaged over unitaries. To see that $\hat{S}_{l}^{(K)}$ coincides with Eq. (15), we substitute the expression for $\widehat{P}_{l}(\boldsymbol{z})$ in terms of indicator function to obtain

\displaystyle\hat{S}_{l}^{(K)}=\frac{1}{K(K-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\sum_{\boldsymbol{z}}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{z}]\textnormal{ind}[\boldsymbol{Z}_{l,k^{\prime}}=\boldsymbol{z}].

(54)

Observe that $\sum_{\boldsymbol{z}}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{z}]\textnormal{ind}[\boldsymbol{Z}_{l,k^{\prime}}=\boldsymbol{z}]$ is $1$ when $\boldsymbol{Z}_{l,k}=\boldsymbol{Z}_{l,k^{\prime}}$ and $0$ otherwise. Thus, we can write $\sum_{\boldsymbol{z}}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{z}]\textnormal{ind}[\boldsymbol{Z}_{l,k^{\prime}}=\boldsymbol{z}]=\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{Z}_{l,k^{\prime}}]$ , and subsequently, we obtain

\displaystyle\hat{S}_{l}^{(K)}=\frac{1}{K(K-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\textnormal{ind}[\boldsymbol{Z}_{l,k}=\boldsymbol{Z}_{l,k^{\prime}}].

(55)

As a result, $\hat{\mathcal{C}}_{l}^{(K)}(S)=1-(3/2)^{s}\hat{S}_{l}^{(K)}$ is an unbiased estimator of CE for each $l$ , and $\hat{\mathcal{C}}_{|\psi\rangle}(S)$ defined in Eq. (14) is just the empirical average of $\hat{\mathcal{C}}_{l}^{(K)}(S)$ over $L$ randomly sampled unitaries. Then, since $\hat{S}_{l}^{(K)}$ is bounded between $0$ and $1$ , using Hoeffding’s inequality (Proposition 1), one obtains a sample complexity of $O\left(\left(\frac{9}{4}\right)^{s}\log(\frac{1}{\delta})\epsilon^{-2}\right)$ for estimating the CE. ∎

Proof of Theorem 2.

From the proof of Theorem 1, we know that the the quantity $\hat{\mathcal{C}}_{l}^{(2)}=1-(3/2)^{s}\hat{S}_{l}^{(2)}$ is an unbiased estimator of $\mathcal{C}_{|\psi\rangle}(S)$ . From Eq. (19), we know that variance of $\hat{\mathcal{C}}_{l}^{(2)}$ is bounded above by $(3/2)^{s}$ . Then applying Propostion 2 yields the advertised sample complexity of $O\left(\left(\frac{3}{2}\right)^{s}\log\left(\frac{1}{\delta}\right)\epsilon^{-2}\right)$ . The only remaining component of the proof is calculating the variance of $\hat{\mathcal{C}}_{l}^{(K)}$ , which is done in Lemma 2. ∎

Lemma 2.

The variance of the estimator $\hat{S}_{l}^{(K)}$ defined in Eq. (15) is given by

	$\displaystyle\textnormal{Var}[\hat{S}_{l}^{(K)}]$	$\displaystyle=\frac{2P_{2}(1-P_{2})+4(K-2)(P_{3}-P_{2}^{2})}{K(K-1)}$
		$\displaystyle\qquad+\frac{(K-2)(K-3)(P_{2,2}-P_{2}^{2})}{K(K-1)}.$		(56)

Proof.

For any $l\in[L]$ , we define $X_{k,k^{\prime}}=\textnormal{ind}[Z_{l,k}=Z_{l,k^{\prime}}]$ to be the Bernoulli random variable with mean $P_{2}=\operatorname{\mathbb{E}}_{U}[\sum_{\boldsymbol{z}}P(\boldsymbol{z})^{2}]$ . Then, we can write

\textnormal{Var}[\hat{S}_{l}^{(K)}]=\frac{1}{K^{2}(K-1)^{2}}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\sum_{\begin{subarray}{c}j,j^{\prime}=1\\ j\neq j^{\prime}\end{subarray}}^{K}\textnormal{Cov}[X_{k,k^{\prime}},X_{j,j^{\prime}}],

(57)

which we evaluate using combinatorial arguments. To facilitate counting, we compare the indices $(k,k^{\prime})$ and $(j,j^{\prime})$ under the requirement that $k\neq k^{\prime}$ and $j\neq j^{\prime}$ based on this contraint in the sum above. Observe that $X_{k,k^{\prime}}=X_{k^{\prime},k}$ , and thus, we need to account for indices $(k,k^{\prime})$ and $(k^{\prime},k)$ denoting the same random variable. To proceed, we break the indices appearing in the sum into three cases and evaluate each of them separately.

Case 1:

$(k,k^{\prime})=(j,j^{\prime})$ and its permutation $(k,k^{\prime})=(j^{\prime},j)$ . There are $2K(K-1)$ such terms including the permutation. This results in $X_{k,k^{\prime}}=X_{j,j^{\prime}}$ , and thus, $\textnormal{Cov}(X_{k,k^{\prime}}X_{j,j^{\prime}})=\textnormal{Var}(X_{k,k^{\prime}})=P_{2}(1-P_{2})$ .
Case 2:

$k=j$ & $k^{\prime}\neq j^{\prime}$ and its $3$ permutations ( $k=j^{\prime}$ & $k^{\prime}\neq j$ , $k^{\prime}=j$ & $k\neq j^{\prime}$ , $k^{\prime}=j^{\prime}$ & $k\neq j$ ). There are a total of $4K(K-1)(K-2)$ such terms including permutations. For $k=j$ & $k^{\prime}\neq j^{\prime}$ , we have $X_{k,k^{\prime}}X_{j,j^{\prime}}=\textnormal{ind}[\boldsymbol{Z}_{k}=\boldsymbol{Z}_{k^{\prime}}=\boldsymbol{Z}_{j^{\prime}}]$ , which takes the value $1$ when exactly $3$ independent outcome strings are equal and $0$ otherwise. Subsequently, $\textnormal{Cov}(X_{k,k^{\prime}},X_{j,j^{\prime}})=\mathbb{E}[X_{k,k^{\prime}}X_{j,j^{\prime}}]-\mathbb{E}[X_{k,k^{\prime}}]\mathbb{E}[X_{j,j^{\prime}}]=P_{3}-P_{2}^{2}$ , where $P_{3}=\mathbb{E}_{U}[\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{3}]$ . The permutations of $(k,k^{\prime})$ , $(j,j^{\prime})$ noted above give the same value for the covariance.
Case 3:

The last case corresponds to the situation where none of the indices are equal. There are $K(K-1)(K-2)(K-3)$ such terms. In this case, the random variables $X_{k,k^{\prime}}$ and $X_{j,j^{\prime}}$ are independent with respect to the outcome probabilities, and subsequently, $\textnormal{Cov}(X_{k,k^{\prime}},X_{j,j^{\prime}})=\mathbb{E}[X_{k,k^{\prime}}X_{j,j^{\prime}}]-\mathbb{E}[X_{k,k^{\prime}}]\mathbb{E}[X_{j,j^{\prime}}]=\mathbb{E}_{U}[(\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{2})^{2}]-P_{2}^{2}$ . For convenience, we denote $P_{2,2}=\mathbb{E}_{U}[(\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{2})^{2}]$ .

Using the values calculated in the above cases in Eq. (57), one can infer that the variance of $\hat{S}_{l}^{(K)}$ is given by Eq. (56). Importantly, $\textnormal{Var}[\hat{S}_{l}^{(K)}]$ is independent of $l$ . ∎

In Theorem 2, we restrict our attention to $K=2$ . Below, we briefly mention how one can obtain bounds on the variance for any $K\geqslant 2$ . First, we bound the variance from above by dropping terms corresponding to $-P_{2}^{2}$ to obtain

\displaystyle\textnormal{Var}[\hat{S}_{l}^{(K)}]

\displaystyle\leqslant\frac{2P_{2}+4(K-2)P_{3}}{K(K-1)}+\frac{(K-2)(K-3)P_{2,2}}{K(K-1)}.

As noted in Section III.1.2, we have the bound $P_{2}\leqslant(2/3)^{s}$ using the expression for CE given in Eq. (12). To bound $P_{3}$ , we integrate $P_{U}(\boldsymbol{z})^{3}$ over the Haar measure, which amounts to computing the third moment.

\displaystyle\begin{split}\mathbb{E}_{U}\left[P_{U}(\boldsymbol{z})^{3}\right]&=\textnormal{tr}\left[\rho^{\otimes 3}\prod_{i\in S}\mathbb{E}_{U_{i}}\left[U_{i}^{\dagger^{\otimes 3}}|\hskip 1.0pt\boldsymbol{z}\rangle\langle\boldsymbol{z}\hskip 1.0pt|^{\otimes 3}U_{i}^{\otimes 3}\right]\right],\\ &=\frac{1}{4^{s}}\textnormal{tr}\left[\rho^{\otimes 3}\prod_{i\in S}\Pi^{i}_{\mathrm{sym},3}\right],\\ &\leqslant\frac{1}{4^{s}},\end{split}

(58)

where $\Pi^{i}_{\mathrm{sym},3}$ denotes the projector onto the symmetric subspace defined by the symmetric group $S_{3}$ for the $i$ -th index Mele (2023). Thus, we obtain $P_{3}=\sum_{\boldsymbol{z}}\mathbb{E}_{U}[P_{U}(\boldsymbol{z})^{3}]\leqslant 1/2^{s}$ . Finally, for bounding $P_{2,2}$ , we need to compute the fourth moment. As mentioned previously, since Cliffords only form a $3$ -design, the value of $P_{2,2}$ will differ for local Cliffords and local Haar random unitaries. We leave this computation for future work, and instead give a simple bound on $P_{2,2}$ here. Since $\sum_{\boldsymbol{z}}P_{U}(\boldsymbol{z})^{2}\leqslant 1$ for any unitary $U$ , we obtain $P_{2,2}\leqslant P_{2}$ . This bound can likely be tightened by directly computing $P_{2,2}$ as noted above. In any case, since the variance does not go to zero and instead approaches $P_{2,2}-P_{2}^{2}$ for large $K$ , sampling and measuring many unitaries is unavoidable for estimating CE to a small enough precision. Moreover, a larger value of $K$ leads to a larger computational cost. For this reason, we focus on $K=2$ in this study.

B.2 SICs

Proof of Theorem. 3.

Let $S\subseteq[n]$ denote the system of interest. Let $\{p_{i},|\phi_{i}\rangle\}_{i=1}^{N}$ be a projective 2-design with associated probability distribution $\phi$ such that $\mathbb{E}_{\phi}[(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}]=\frac{1}{3}\Pi_{+}$ . Finally denote the joint probability distribution of the local projective 2-designs as $\Phi$ and have $|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|=\prod_{S}|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|$ denote the tensor product of local random variable states. Then the expectation of the squared projection of $\rho$ onto $|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|$ w.r.t. it’s probability distribution is

\displaystyle\begin{split}\mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]^{2}]&=\mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]],\\ &=\mathbb{E}_{\Phi}[\textnormal{tr}\left[(\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|)\otimes(\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|)\right]],\\ &=\mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho^{\otimes 2}(|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|)^{\otimes 2}\right]],\\ &=\textnormal{tr}\left[\rho^{\otimes 2}\mathbb{E}_{\Phi}[(|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|)^{\otimes 2}]\right],\\ &=\textnormal{tr}\left[\rho^{\otimes 2}\mathbb{E}_{\Phi}\left[\prod_{S}(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}\right]\right],\\ &=\textnormal{tr}\left[\rho^{\otimes 2}\prod_{S}\mathbb{E}_{\phi}[(|\hskip 1.0pt\phi\rangle\langle\phi\hskip 1.0pt|)^{\otimes 2}]\right],\\ &=\left(\frac{1}{3}\right)^{s}\textnormal{tr}\left[\rho^{\otimes 2}\prod_{i\in S}\Pi_{+}^{i}\right],\\ \mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]^{2}]&=\left(\frac{1}{6}\right)^{\lvert s\rvert}\sum_{\alpha\in\mathcal{P}(s)}\textnormal{tr}\left[\rho_{\alpha}^{2}\right].\end{split}

(59)

Here we used the independence of the marginal probability distributions to take the expectation of each random state variable in the tensor product separately, then applied Lemma 1, i.e. the SWAP trick. After shuffling around factors we arrive at,

\displaystyle\mathcal{C}_{|\psi\rangle}(S)=1-3^{s}\mathbb{E}_{\Phi}[\textnormal{tr}\left[\rho|\hskip 1.0pt\Phi\rangle\langle\Phi\hskip 1.0pt|\right]^{2}].

(60)

∎

Proof of Theorem 4.

Given two iid samples $\boldsymbol{Q}_{2i-1},\boldsymbol{Q}_{2i}$ , we define $\hat{S}^{(K)}=\textnormal{ind}[\boldsymbol{Q}_{2i-1},\boldsymbol{Q}_{2i}]$ , which is an unbiased estimator of $P_{2}:=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{2}$ . Thus, $\hat{\mathcal{C}}_{i}^{(2)}=1-3^{s}\hat{S}^{(2)}$ is an unbiased estimator for $\mathcal{C}_{|\psi\rangle}(S)$ using SIC measurements. From Lemma 3, we know that the variance of $\hat{S}^{(2)}$ is bounded above by $P_{2}$ for $K=2$ , and therefore, $\textnormal{Var}[\hat{\mathcal{C}}_{i}^{(2)}]\leqslant 9^{s}P_{2}$ . From the expression of CE in Eq. (30) in terms of SIC measurements, we know that $P_{2}\leqslant 1/3^{s}$ , giving $\textnormal{Var}[\hat{\mathcal{C}}_{i}^{(2)}]\leqslant 3^{s}$ . Then, using Proposition 2, we obtain the desired result. ∎

Theorem 6 (CE Estimation via SIC Data and MoM).

Given a precision $\epsilon>0$ and confidence level $1-\delta\in(0,1)$ , perform a total of $M=N_{B}K_{\mathrm{opt}}$ SIC measurements, where $N_{B}=\lceil 8\log(1/\delta)\rceil$ and

\displaystyle\begin{split}K_{\mathrm{opt}}&=\Biggl{\lceil}\frac{1}{2}\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}+1\right)\\ &\qquad+\frac{1}{2}\sqrt{\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}-1\right)^{2}+\frac{32}{\epsilon^{2}}3^{s}}\Biggr{\rceil}.\end{split}

(61)

Denote $\mathbf{Q}_{1},\ldots,\mathbf{Q}_{M}$ to be the outcomes obtained from these measurements. Break these $M$ outcomes into $N_{B}$ batches of size $K_{\mathrm{opt}}$ , and for each $1\leqslant b\leqslant N_{B}$ , compute estimate

\displaystyle\overline{\mathcal{C}}_{|\psi\rangle}^{(b)}(S)=1-3^{s}\hat{S}_{b}^{(K_{\mathrm{opt}})},

(62)

where

\displaystyle\hat{S}_{b}^{(K_{\mathrm{opt}})}

\displaystyle=\frac{1}{K_{\mathrm{opt}}(K_{\mathrm{opt}}-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=(b-1)K_{\mathrm{opt}}+1\\ k\neq k^{\prime}\end{subarray}}^{bK_{\mathrm{opt}}}\textnormal{ind}[\boldsymbol{Q}_{k}=\boldsymbol{Q}_{k^{\prime}}].

(63)

Then, we have the guarantee that

\displaystyle\Pr[|\textnormal{median}[\overline{\mathcal{C}}^{(1)}_{|\psi\rangle}(S),\ldots,\overline{\mathcal{C}}^{(N_{B})}_{|\psi\rangle}(S)]-\mathcal{C}_{|\psi\rangle}(S)|\geqslant\epsilon]\leqslant\delta,

(64)

giving an upper bound of

O\left(\left[\left(\frac{3}{2}\right)^{s}\epsilon^{-2}+\sqrt{3^{s}}\epsilon^{-1}\right]\log\left(\frac{1}{\delta}\right)\right)

(65)

on the sample complexity of estimating $\mathcal{C}_{|\psi\rangle}(S)$ using SIC measurements.

Proof.

From Lemma 3, we know that for any $K\geqslant 2$ ,

\hat{S}^{(K)}=\frac{1}{K(K-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\textnormal{ind}[\boldsymbol{Q}_{k}=\boldsymbol{Q}_{k^{\prime}}]

is an unbiased estimator of sum of squared probabilities for SIC measurents, with

\textnormal{Var}[\hat{S}^{(K)}]=\frac{2P_{2}(1-P_{2})+4(K-2)(P_{3}-P_{2}^{2})}{K(K-1)},

where $P_{2}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{2}$ and $P_{3}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{3}$ . Dropping $-P_{2}^{2}$ terms form the above equation for variance, we obtain the bound

\textnormal{Var}[\hat{S}^{(K)}]\leqslant\frac{2P_{2}+4(K-2)P_{3}}{K(K-1)}.

To proceed, we need to bound $P_{2}$ and $P_{3}$ . As mentioned in the proof of Theorem 4, we have $P_{2}\leqslant 1/3^{s}$ . To bound $P_{3}$ , we note that

\displaystyle\begin{split}P_{3}&=\sum_{\boldsymbol{q}}P(\boldsymbol{q})\cdot P(\boldsymbol{q})^{2}\\ &\leqslant\left(\max_{\boldsymbol{q}}P(\boldsymbol{q})\right)P_{2}\\ &\leqslant\left(\max_{\boldsymbol{q}}\textnormal{tr}\left[\rho\left(\frac{1}{2^{s}}\prod_{i\in\boldsymbol{q}}|\hskip 1.0pt\psi_{i}\rangle\langle\psi_{i}\hskip 1.0pt|\right)\right]\right)\left(\frac{1}{3^{s}}\right)\\ &\leqslant\frac{1}{6^{s}}.\end{split}

Plugging this back into the bound on $\textnormal{Var}[\hat{S}^{(K)}]$ , we obtain

\displaystyle\textnormal{Var}[\hat{S}^{(K)}]\leqslant\frac{2}{K(K-1)}\left(\frac{1}{3^{s}}+\frac{2(K-2)}{6^{s}}\right).

(66)

Since $\hat{\mathcal{C}}^{(K)}=1-3^{s}\hat{S}^{(K)}$ gives an unbiased estimate of $\mathcal{C}$ , we obtain

	$\displaystyle\textnormal{Var}[\hat{\mathcal{C}}^{(K)}]$	$\displaystyle=9^{s}\textnormal{Var}[\hat{S}^{(K)}]$
		$\displaystyle\leqslant\frac{2}{K(K-1)}\left(3^{s}+2(K-2)\left(\frac{3}{2}\right)^{s}\right).$		(67)

With the future use of median-of-means estimation in mind, we impose the requirement that $\textnormal{Var}[\hat{\mathcal{C}}^{(K)}]\leqslant\epsilon^{2}/4$ , given the precision $\epsilon>0$ for estimating CE. Using this requirement with the bound on the variance of $\hat{\mathcal{C}}^{(K)}$ determined in Eq. (67), we obtain the inequality

K^{2}-\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}+1\right)K-\frac{8\cdot 3^{s}}{\epsilon^{2}}\left(1-\frac{1}{2^{|s|-1}}\right)\geqslant 0.

Noting that $1-\frac{1}{2^{s-1}}\geqslant 0$ for $s\geqslant 1$ , we obtain the solution $K=K_{\mathrm{opt}}$ given in Eq. (61), which is the smallest integer value of $K$ satisfying the above inequality. Since for any integer $n$ , we have $n\leqslant\lceil n\rceil\leqslant n+1$ , one can verify using Eq. (67) that the above choice of $K$ gives $\textnormal{Var}[\hat{\mathcal{C}}^{(K)}]\leqslant\epsilon^{2}/4$ .

Thus, $\hat{\mathcal{C}}_{|\psi\rangle}^{(b)}(S)$ defined in Eq. (62) is an unbiased estimator of $\mathcal{C}_{|\psi\rangle}(S)$ , with $\textnormal{Var}[\hat{\mathcal{C}}_{|\psi\rangle}^{(b)}]\leqslant\epsilon^{2}/4$ for each $b$ . Consequently, using Proposition 2, we obtain the desired result. ∎

The only remaining step is to compute the variance of the estimator for the sum of squared probabilities for SIC measurements. We summarize this calculation in the lemma below.

Lemma 3.

Given iid outcomes $\boldsymbol{Q}_{1},\dotsc,\boldsymbol{Q}_{K}$ from SIC measurements with $K\geqslant 2$ , let

\hat{S}^{(K)}=\frac{1}{K(K-1)}\sum_{\begin{subarray}{c}k,k^{\prime}=1\\ k\neq k^{\prime}\end{subarray}}^{K}\textnormal{ind}[\boldsymbol{Q}_{k}=\boldsymbol{Q}_{k^{\prime}}]

(68)

be an unbiased estimator of $P_{2}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{2}$ , where $P(\boldsymbol{q})$ denotes the probability of observing the string $\boldsymbol{q}$ after a local SIC measurement. Then, the variance of this estimation is given by

\textnormal{Var}[\hat{S}^{(K)}]=\frac{2P_{2}(1-P_{2})+4(K-2)(P_{3}-P_{2}^{2})}{K(K-1)},

(69)

where $P_{3}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{3}$ .

Proof.

That $\hat{S}^{(K)}$ is an unbiased estimator of $P_{2}$ can be verified using similar arguments given for LRMs. To compute the variance, we again break the sum defining $\hat{S}^{(K)}$ into three cases, following the proof of Lemma 2 for LRMs. The first two cases are identical to the variance calculation for LRMs, except that $P_{2}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{2}$ and $P_{3}=\sum_{\boldsymbol{q}}P(\boldsymbol{q})^{3}$ . The third case is also the same, except that $\textnormal{Cov}(X_{k,k^{\prime}},X_{j,j^{\prime}})=\mathbb{E}[X_{k,k^{\prime}}X_{j,j^{\prime}}]-\mathbb{E}[X_{k,k^{\prime}}]\mathbb{E}[X_{j,j^{\prime}}]=0$ , since $X_{k,k^{\prime}}$ and $X_{j,j^{\prime}}$ are independent with respect to the outcome probabilities and we do not need to take an expectation over unitaries for SIC measurements. Putting these observations together, we find that the variance of $\hat{S}^{(K)}$ is given by Eq. (69). ∎

Appendix C Simulation Details

C.1 LRMs Simulations

In the main text, our simulations evaluate the CEs of two canonical examples of entangled states. First, the $n$ -qubit Greenberger-Horne-Zeilinger (GHZ) state, which is defined as

\displaystyle|\mathrm{GHZ}\rangle_{n}=\frac{1}{\sqrt{2}}(|0\rangle^{\otimes n}+|1\rangle^{\otimes n}).

(70)

Because the purity of any subsystem of this state is $1/2$ , the CE of the full state can easily be computed as Beckey et al. (2023)

\displaystyle\mathcal{C}_{|\mathrm{GHZ}\rangle}([n])=\frac{1}{2}-\frac{1}{2^{n}}.

(71)

The W state is defined as a uniform superposition of all states with Hamming weight one

\displaystyle|W\rangle_{n}=\frac{1}{\sqrt{n}}(|10\dotsm 0\rangle+|010\dotsm 0\rangle+\dotsm+|0\dotsm 01\rangle).

(72)

The general formula is somewhat harder to prove, but the result is similarly simple Beckey et al. (2023)

\displaystyle\mathcal{C}_{|W\rangle}([n])=\frac{1}{2}-\frac{1}{2n}.

(73)

These are the exact formulas used to benchmark the performance of our estimators as shown in Fig. 2.

Recall from Theorem 1 the estimator for the CEs based on $L$ rounds of LRMs, $\boldsymbol{Z}_{1,1},\ldots,\boldsymbol{Z}_{1,K},\ldots,\boldsymbol{Z}_{L,1},\ldots,\boldsymbol{Z}_{L,K}$ , is,

\displaystyle\hat{\mathcal{C}}_{|\psi\rangle}(S)=1-\left(\frac{3}{2}\right)^{s}\frac{1}{L}\sum_{l=1}^{L}\hat{S}^{(K)}_{l}.

(74)

From this, we employ the following algorithm to simulate this estimator:

Algorithm 1 LRM-Mean Simulation

1:Input:

n

-qubit pure state

\rho=|\hskip 1.0pt\psi\rangle\langle\psi\hskip 1.0pt|

S\subseteq[n]

, number of unitaries

L

of the form

\prod_{i\in S}U_{i}

, and

K\geqslant 2

shots per unitary.

2:Let

\mathcal{D}_{l}

denote a discrete probability distribution over bitstrings and

C(2)

denote the set of single qubit Clifford gates.

3:function MeanCEviaLRMs(

\rho,S,L,K

)

l=1

\hat{S}^{(K)}=0

6: while

l\leqslant L

U_{l}=\prod_{i\in S}U_{i},\quad U_{i}\in C(2)

\mathcal{D}_{l}=\{\langle\boldsymbol{z}|U_{l}^{\dagger}\rho U_{l}|\boldsymbol{z}\rangle\rightarrow\boldsymbol{z}\}_{\boldsymbol{z}=0}^{2^{s}-1}

M_{l}=\mathrm{Sample}(\mathcal{D}_{l},K)

10: for

i\neq j\in M_{l}

11: if

\boldsymbol{Z}_{l,i}=\boldsymbol{Z}_{l,j}

then

\hat{S}^{(K)}=\hat{S}^{(K)}+1

12: end if

13: end for

14:

l=l+1

15: end while

16: return

\hat{\mathcal{C}}_{|\psi\rangle}(S)=1-\left(\frac{3}{2}\right)^{s}\frac{1}{L}\frac{1}{K(K-1)}\hat{S}^{(K)}

17:end function

It is worth noting that for Fig. 2 unitaries were sampled from the Haar distribution rather than from the single qubit Cliffords. In practice, it is likely single qubit Cliffords would be preferable.

For median-of-means estimation, we directly call this function with a specific value for $L$ :

Algorithm 2 LRM-MoM Simulation

1:Input:

n

-qubit pure state

\rho=|\hskip 1.0pt\psi\rangle\langle\psi\hskip 1.0pt|

S\subseteq[n]

, number of unitaries

L

, shots per unitary

K\geqslant 2

, and confidence level

1-\delta\in(0,1)

2:function MoMCEviaLRMs(

\rho,S,L,K,\delta

)

N_{B}=\lceil 8\log(\frac{1}{\delta})\rceil

B=\lceil\frac{L}{N_{B}}\rceil

5: return

\mathrm{Median}[\{\mathrm{MeanCEviaLRMs}[\rho,S,B,K]\}_{i=1}^{N_{B}}]

6:end function

C.2 SICs Simulations

Following the same modality as in the LRM section, we have the following estimator from Theorem 4,

\displaystyle\hat{\mathcal{C}}_{|\psi\rangle}(S)=1-3^{s}\hat{S}^{(K)},

(75)

where $\boldsymbol{Q}_{1},\ldots,\boldsymbol{Q}_{M}$ are the outcomes from $M$ rounds of SIC measurements. For ease of notation, let $\rho\otimes|\hskip 1.0pt0\rangle\langle 0\hskip 1.0pt|^{\otimes n}$ denote the state with qubits ordered such that each system qubit is next to its ancilla qubit. Then the following algorithm simulates this estimator:

Algorithm 3 SIC-MoM Simulation

1:Input:

n

-qubit pure state

\rho=|\hskip 1.0pt\psi\rangle\langle\psi\hskip 1.0pt|

S\subseteq[n]

K

, and

\delta>0

confidence level.

2:Let

\mathcal{D}

denote a discrete probability distribution.

3:function MoMCEviaSICs(

\rho,S,K,\delta

)

N_{B}

\lceil 8\log(\frac{1}{\delta})\rceil

U=\prod_{i\in S}U_{\mathrm{SIC}}

\mathcal{D}=\{\langle\boldsymbol{q}|U^{\dagger}(\rho\otimes|\hskip 1.0pt0\rangle\langle 0\hskip 1.0pt|^{\otimes n})U|\boldsymbol{q}\rangle\rightarrow\boldsymbol{q}\}_{\boldsymbol{q}=0}^{4^{s}-1}

b=1

8: while

b\leqslant N_{B}

\hat{S}^{(K)}_{b}=0

10:

M_{b}=\mathrm{Sample}(\mathcal{D},K)

11: for

i\neq j\in M_{b}

12: if

\boldsymbol{Q}_{i}=\boldsymbol{Q}_{j}

then

\hat{S}^{(K)}_{b}=\hat{S}^{(K)}_{b}+1

13: end if

14: end for

15:

\overline{\mathcal{C}}^{(b)}_{|\psi\rangle}(S)=1-3^{s}\frac{1}{K(K-1)}\hat{S}^{(K)}_{b}

16: end while

17: return

\hat{\mathcal{C}}_{|\psi\rangle}(S)=\mathrm{median}[\{\overline{\mathcal{C}}_{|\psi\rangle}^{(b)}(S)\}_{b=1}^{N_{B}}]

18:end function

To reproduce Fig. 4, to each of the simulations outlined in Table 1 we input a $5$ -qubit GHZ state that is $\rho=|\hskip 1.0pt\mathrm{GHZ}_{5}\rangle\langle\mathrm{GHZ}_{5}\hskip 1.0pt|$ and set $\delta=0.05$ for methods based on median-of-means. The total measurement budget is fixed based on the measurements required for SIC-MoM, $K=K_{\mathrm{opt}}$ to have $\epsilon=0.05$ precision at $1-\delta=0.95$ confidence. That is a total measurement budget of $M=K_{\mathrm{opt}}\lceil 8\log(\frac{1}{\delta})\rceil$ where,

\displaystyle\begin{split}K_{\mathrm{opt}}&=\Biggl{\lceil}\frac{1}{2}\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}+1\right)\\ &+\frac{1}{2}\sqrt{\left(\frac{16}{\epsilon^{2}}\left(\frac{3}{2}\right)^{s}-1\right)^{2}+\frac{32}{\epsilon^{2}}3^{s}}\Biggr{\rceil}.\end{split}

(76)

The corresponding measurement splits were then computed from this total number. Then $1,000$ trials of $M$ total measurements of LRM data and SIC data were generated. Since both LRM-Mean and LRM-MoM have $K=2$ , i.e. two shots per unitary, the same data was post-processed according to each estimator. This is possible as median-of-means only changes how things are batched since the total measurement budget was fixed. The same was done for the SIC estimators. As a final comment, to begin to see the asymptotic behavior within this plot, one would have to go up to roughly 12-13 qubits. This is because of the various pre-factors hidden by big- $O$ notation.