This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Indiana University Bloomington, USA and https://homes.luddy.indiana.edu/qzhangcs/ [email protected]://orcid.org/0000-0002-6851-3115supported by NSF CCF-1844234 and IU Luddy Faculty FellowshipIndiana University Bloomington, USA and https://homes.luddy.indiana.edu/mheidar/[email protected] supported by NSF CCF-2211423 \CopyrightQin Zhang and Mohsen Heidari \ccsdesc[500]Theory of computation Quantum query complexity \ccsdesc[500]Information systems Query operators \ccsdesc[500]Information systems Query optimization \EventEditorsJohn Q. Open and Joan R. Access \EventNoEds2 \EventLongTitle42nd Conference on Very Important Topics (CVIT 2016) \EventShortTitleCVIT 2016 \EventAcronymCVIT \EventYear2016 \EventDateDecember 24–27, 2016 \EventLocationLittle Whinging, United Kingdom \EventLogo \SeriesVolume42 \ArticleNo23

Quantum Data Sketches

Qin Zhang    Mohsen Heidari
Abstract

Recent advancements in quantum technologies, particularly in quantum sensing and simulation, have facilitated the generation and analysis of inherently quantum data. This progress underscores the necessity for developing efficient and scalable quantum data management strategies. This goal faces immense challenges due to the exponential dimensionality of quantum data and its unique quantum properties such as no-cloning and measurement stochasticity. Specifically, classical storage and manipulation of an arbitrary nn-qubit quantum state requires exponential space and time. Hence, there is a critical need to revisit foundational data management concepts and algorithms for quantum data. In this paper, we propose succinct quantum data sketches to support basic database operations such as search and selection. We view our work as an initial step towards the development of quantum data management model, opening up many possibilities for future research in this direction.

keywords:
quantum data representation, data sketching, query execution

1 Introduction

Quantum information and computing, rooted in the principles of quantum mechanics, have emerged as an important field of study with far-reaching effects across a broad spectrum of disciplines. Central to the concept of quantum computing are quantum bits (or qubits), which set themselves apart from classical bits due to their ability to exist in a superposition of states, allowing a quantum computer to offer the potential computational advantage against classical computing.

Although significant advancements have been made in the development of quantum algorithms after several decades of research, only a handful provably outperform their classical counterparts. Notable examples include Shor’s algorithm for factorization [54], Grover’s algorithm for search [22], and linear system solvers [26]. These quantum algorithms typically start by encoding classical input data into quantum states, execute a series of quantum operations, and then measure the resulting quantum states and carry out specific post-processing on the measurement outcomes. The reasons for the difficulties in the design of quantum algorithms that can outperform classical counterparts on classical input data remain elusive.

In this paper, we take a different perspective, directing our attention towards quantum data themselves. The nature, along with scientific experiments spanning physics, chemistry, material science, biology, and other fields, generates massive quantities of quantum data every day. Sources include Hawking Radiation, Cosmic Microwave Background, quantum effects in neutron stars, quantum states in ultra-cold atoms, quantum information in DNA replication, etc. In many scenarios, there is a need for us to preserve quantum data that has been collected from nature or generated in labs for future analysis. For example, scientists often use photons collected from remote stars to study the properties of those astronomical objects. It would be beneficial to store those photons as quantum states in a database, since it may not be feasible to collect fresh photons from those astronomical objects at the time of data analysis. In the case that the quantum states are prepared in the labs, generating fresh copies of quantum states on demand is often time-consuming. Let us use quantum simulation as an example. Quantum simulation is a prominent advantage of quantum computers, with significant implications for numerous areas of scientific research, including computer-aided drug design [50], high-energy physics [41], quantum chemistry [58] and many-body physics [55]. Quantum simulation typically relies on solving the Schrödinger equation for the underlying Hamiltonian. The Hamiltonian is implemented by a quantum circuit, which is applied to an initial quantum state to generate target quantum states. The construction of the Hamiltonians and the preparation of the target states can be rather time-consuming.111For instance, the Hamiltonian of the two-dimensional Fermi-Hubbard model on an 8×88\times 8 lattice already requires approximately 10710^{7} Toffoli gates [43], which directly contribute to the query time if states need to be generated from scratch at query. Storing the generated molecular quantum states in a database would eliminate the need to repeat the state preparation procedures during data analysis.

Once the quantum states are stored in a database, and assuming each state is associated with additional information such as the nature sources recorded at the time of collection or parameters of the experimental setup used to produce them, numerous applications can be envisioned. For example, if scientists receive photons from an unknown remote star, they can search a photon database to find a matching quantum state. Upon finding a match, they can retrieve its associated properties and other information, such as the time and method of its previous observation. They may also want to sort the states using a local observable (see Definition 3.5 in Section 3) with respect to certain properties (such as energy or momentum) to get an order of the photons in the database, aiding in the understanding of the spectrum of the corresponding stars in the universe. In quantum simulation, if we want to produce molecular states with average energy levels above a certain threshold relative to a specific local observable, we can perform a selection operation in our database to identify those states, and then use the associated parameters for the experimental setup to produce more of such quantum states.

Nevertheless, quantum data management remains in its infant stage. Some of the previously mentioned motivating examples are more like anticipated future problems. There has been research that leverages quantum data for learning or optimization, such as quantum machine learning [33, 24, 3], quantum variational optimization algorithm [28, 17]. and quantum neural network [52, 48, 18, 31, 45, 20]. However, their primary focus is on the sample complexity (namely, the number of copies of the quantum state needed for the task) and the convergence to optimal points, rather than on developing methods for the efficient representation and storage of quantum data for subsequent analysis.

In this paper, we introduce several quantum data sketches to support basic database operations in a sustainable and efficient manner. This paper does not aim to formulate a comprehensive quantum data management model. Rather, we view our work as an initial step towards developing a sustainable model for representing, querying, and analyzing quantum data at scale.

Unique Challenges in the Quantum World

The quantum world possesses several unique properties, such as superposition and entanglement, that can be leveraged to reduce resource usage in computing and information exchange. However, some of these features also post significant challenges to quantum data management. We highlight a few below.

Post-Measurement State Disturbance.  The only way to extract information from a quantum state is to perform quantum measurements and observe probabilistic outcomes. However, each measurement has the effect of perturbing the quantum state. This characteristic implies that a quantum state might not be reusable post-measurement. In other words, we may need to consume many identical copies of a quantum state in order to derive enough useful information about it. This phenomenon is in stark contrast with the classical setting, in which we can consistently access the same data element for a number of times, always yielding the same result.

No-cloning.  A natural thought to resolve the issue caused by state disturbance is to clone the quantum state before the operations. Unfortunately, the no-cloning theorem (see, e.g., [46]) in quantum mechanics asserts that it is impossible to create an exact copy of an arbitrary unknown quantum state.

Lack of Large-Scale Quantum Storage Systems.  At the time of writing this paper, we are not aware of any reliable large-scale quantum storage systems. One reason for this is that qubits are highly susceptible to environmental disruptions such as temperature variations, electromagnetic radiation, or particle interactions. These disruptions lead to what is known as decoherence [40], resulting in the loss of quantum information.

Moreover, due to the quantum state disturbance and the no-cloning principle, even if we successfully build viable large-scale quantum storage systems in the future, we still need many identical copies of the quantum state for any nontrivial database operation. This implies that in order to accommodate an unlimited number of database operations (i.e., to be sustainable), we must prepare an unlimited number of copies for each quantum state in the storage, which is certainly not practical.

An alternative approach is to first learn the classical description of each quantum state and store it in a classical memory for future operations. Indeed, we believe that for the purpose of quantum data management, we have to store quantum states in the classical format. However, learning and storing the full information of a quantum state as a classical object is both time and space expensive, as the dimensionality of a quantum state is exponential in terms of the number of qubits.

We thus propose to design succinct classical representations (or, sketches) of quantum states that can be used to perform database operations efficiently. Based on the particular database operation it is intended to support, each sketch preserves only partial information of a quantum state. This is also the reason why we may be able to make the size of the sketch to be o(d)o(d), where dd is the dimension of the quantum state. We also note that the sample complexity for constructing data sketches is a secondary consideration for database management systems, as it is just a one-time preprocessing step in the database design. This is where our work departs from the quantum state learning/tomography literature, which we will discuss in Section 1.1.

Our Contribution

We give the first systematic approach to designing space-efficient sketches for quantum states. These sketches can then be used to develop time-efficient algorithms for basic database operations. In particular:

  1. 1.

    In Section 3, we have formalized a set of basic database operations for quantum data, including search, selection, sorting, and join. These operations differ from those for classical data as they inherently incorporate approximation in their definitions.

  2. 2.

    Our main technical results are the first set of classical vector sketches that preserve, up to a distortion of (1+ι)(1+\iota) for an arbitrarily small ι>0\iota>0, the trace distance of the quantum states with probability (1δ)(1-\delta). Our sketches have sizes O(log(1/δ)/ι2)O(\log(1/\delta)/\iota^{2}), which is independent of the dimension of the states. Coupled with efficient nearest neighbor search via locality sensitive hashing, they can be used to support the search and join operations with time sublinear in the database size and independent of the state dimension. See Section 4.1.

  3. 3.

    We make use of classical shadow seeds of quantum states [37] to approximate the expectation value of any given kk-local observable (to be defined in Section 3.3) using time and space independent of the dimension of the state. We also present a new hybrid quantum-classical algorithm to accelerate the query time. This sketch can be used for selection and sorting operations. See Section 4.2.

Paper Outline

In Section 2, we review some background on quantum information and computing as well as tools for classical data management. In Section 3, we define a set of basic database operations for quantum data. After these preparations, in Section 4, we present our classical sketches of quantum states and illustrate how to perform various database operations using these sketches. We review works that are most relevant to this paper in Section 1.1 and propose several directions for future research in Section 5.

1.1 Related Work

We are not aware of any prior work on designing classical sketches of quantum data, except for the paper [37] discussed in Section 4.2. There have been effort aiming to introduce quantum computing, quantum algorithms and quantum machine learning to the database community [13, 61, 44, 57, 9, 59]. We refer the readers to the recent tutorial [25] for an overview of these works. However, these initiatives either attempt to design and perform database operations directly on quantum data (i.e., assuming database elements are stored as quantum states) or focused on speeding up databases query optimization and transactions on classical data, setting them apart from the objectives pursued in this paper.

There are works [60, 38] focusing on applying classical data compression techniques (such as quantization) to the quantum state vector during quantum simulation. We note that our approach with sketches is quite different, as we aim to extract relevant information (often independent of the quantum states’ dimension) for various database operations.

Quantum State Learning

Many studies have explored the task of characterizing and learning properties of a quantum state using multiple copies of the state, including approximate state discrimination [12], quantum state discrimination [29], quantum state tomography [23, 49], quantum state property testing [27], quantum state certification [7], shadow tomography [2, 6], and pretty good tomography [1].

In the problem of approximate state discrimination, we are promised that a query quantum state ϕ{\phi} belongs to a set SS of quantum states. The algorithm’s task is to return a state ψS{\psi}\in S such that D(ϕ,ψ)ϵD({\phi},{\psi})\leq\epsilon. The algorithm for approximate state discrimination proposed in [12] can be used together with the equality testing to handle the search operation when the available number of copies of the query state is limited, at the cost of larger time and space complexities. However, the need of fresh copies of database states for equality testing would undermine the long-term sustainability of the database system.

The problem of quantum state discrimination is very similar: We are again promised that the query state ϕ{\phi} belongs to a set SS, but now the algorithm needs to return the exact ϕ{\phi}. Harrow and Winter [29] gave an algorithm for this problem where the sample complexity of the query state depends on a parameter FF, which is the maximum pairwise fidelity of states in the set SS.

In the quantum state tomography, we wanted to learn an unknown quantum state up to a trace distance ϵ\epsilon. Optimal sample complexity Θ~(d/ϵ2)\tilde{\Theta}(d/\epsilon^{2}) has been identified [23, 49].

Quantum state property testing [27] and quantum state certification [7] can be seen as relaxations of aforementioned problems. In the former, we are given a query state ϕ{\phi} and a set SS of quantum states, and asked to test whether ϕS{\phi}\in S or ϕ{\phi} is ϵ\epsilon-far from SS (that is, for any state ψS{\psi}\in S, we have D(ϕ,ψ)>ϵD({\phi},{\psi})>\epsilon), and in the latter, we are given a query state ϕ{\phi} and a known state ψ{\psi}, and asked to test whether ϕ=ψ{\phi}={\psi} or D(ϕ,ψ)>ϵD({\phi},{\psi})>\epsilon. The main issue with property testing and certification in the setting of data management is that the decision can be arbitrary even if the query state is very close to (but not the same as) a database state.

Both shadow tomography and pretty good tomography focus on approximating ϕMiϕ\phi^{\dagger}M_{i}\phi for a query state ϕ{\phi} and a set of known binary measurements {Mi}\{M_{i}\} [2, 6], or a distribution on them [1]. However, these algorithms cannot be used for the (η,ε)(\eta,\varepsilon)-selection for an arbitrary observable MM given at the time of query. Their running time is also polynomial in terms of the state dimension dd. Recently, Gong and Aaronson [21] generalized shadow tomography to a fixed set of measurements with multiple outcomes.

To the best of our knowledge, all the previous work on quantum state learning focuses on the sample complexity, but not on the space complexity for representing the quantum states for various data management operations.

2 Preliminaries

We start by giving a gentle introduction of the basics of quantum information and computing, particularly for readers who are not in the field yet. For a comprehensive treatment on this topic, we refer the readers to standard textbooks in the field, such as [46].

Quantum States and Qubits

The first axiom of quantum mechanics is concerned with quantum state as a way to describe a quantum system, such as a qubit. For accessibility of the paper we focus on pure state that are represented by complex-valued vectors. Moreover, we assume that each quantum data point is stored in nn-qubits. Therefore, the dimentionality of the space is d=2nd=2^{n}. In that case, the quantum stats are unit-norm vectors in d\mathbb{C}^{d}. Following the Dirac bra-ket notation, a vector udu\in\mathbb{C}^{d} is simply denoted by the ket |u\ket{u}. As an example, a qubit is a 22-dimensional vector represented as |ϕ=α0|0+α1|1\ket{\phi}=\alpha_{0}\ket{0}+\alpha_{1}\ket{1}, where |α0|2+|α1|2=1\absolutevalue{\alpha_{0}}^{2}+\absolutevalue{\alpha_{1}}^{2}=1. This decomposition is typically called a superposition. A well-known superposition is the state 12(|0+|1)\frac{1}{\sqrt{2}}(\ket{0}+\ket{1}). Similarly, an nn-qubit state is represented by a superposition as |u=x1xn{0,1}nαx1xn|x1xn,\ket{u}=\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\alpha_{x_{1}\cdots x_{n}}\ket{x_{1}\cdots x_{n}}, where x1xn{0,1}nαx1xn2=1\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\alpha^{2}_{x_{1}\cdots x_{n}}=1. For compactness, we use |i\ket{i} to represent each |x1xn\ket{x_{1}\cdots x_{n}}, where ii is the decimal representation of the binary string x1xnx_{1}\cdots x_{n}.

Quantum Operations

The second axiom of quantum mechanics states that the evolution of quantum states are described via unitary transformation. A unitary transformation is represented by a unitary matrix UU such that UU=UU=IU^{\dagger}U=UU^{\dagger}=I. If the initial state is |ϕ\ket{\phi}, then the evolved state is U|ϕU\ket{\phi}. In quantum computing UU is typically implemented in terms of elementary quantum logical gates. In this perspective, one can study the gate complexity of implementing UU. This axiom implies a unique feature of quantum, known as the no-cloning principle that prohibits making copies of quantum data. As a result one needs to adopt data management procedures that abide this rule.

Quantum Measurements

The third axiom of quantum mechanics asserts that any classical information about a quantum state is obtained via measuring it. The act of measuring a quantum system will collapse the quantum state inevitably. The specific outcome of a measurement is probabilistic and is governed by the Born’s law. These probabilities are determined by the initial state of the system and the nature of the interaction between the system and the measuring device. Measuring in an nn-qubit system is typically modeled in the so-called computational basis. When the quantum state is in the superposition |ϕ=iαi|i\ket{\phi}=\sum_{i}\alpha_{i}\ket{i}, the outcome of the measurement in the computational basis is going to be i[2n]i\in[2^{n}] with probability pi=|αi|2p_{i}=|\alpha_{i}|^{2}. For instance, measuring the state 12(|0+|1)\frac{1}{\sqrt{2}}(\ket{0}+\ket{1}) produces a random uniform binary output. The stochasticity of quantum measurements is another feature that calls for probabilistic data management frameworks. Moreover, the state collapse phenomenon significantly complicates the tasks, as the quantum state cannot be entirely “recycled” following a measurement.

One may attempt to think of a quantum state |ϕ=iαi|i\ket{\phi}=\sum_{i}\alpha_{i}\ket{i} - as far as measurement is concerned - as a discrete probability distribution {p1,,pd}\{p_{1},\ldots,p_{d}\}, but there are two fundamental differences. First, the coefficients (called amplitudes) αi\alpha_{i}’s are complex numbers that make superposition and interference possible. Second, the probability of an outcome in quantum mechanics is found by taking the absolute square of the amplitude, that is, pi=|αi|2p_{i}=|\alpha_{i}|^{2}.

In general, a certain measurement \mathcal{M} on a quantum state can be obtained in three stages: (i) applying an appropriate quantum operator UU to the state, (ii) measuring the evolved state U|ϕU\ket{\phi} in the computational basis; and (iii) applying classical post processing on the measurement outcomes. This procedure is compactly modeled as a matrix MM called an observable that is multiplied by the original state |ϕ\ket{\phi}. The eigenvalues of MM represent the possible values of the measurement outcomes. Moreover, by (|ϕ)\mathcal{M}(\ket{\phi}) we denote the probability distribution of the measurement outcomes after applying \mathcal{M} on |ϕ\ket{\phi}. Because the outcomes are probabilistic, we are often interested in their expectation values. The expectation of the outcome distributed by (|ϕ)\mathcal{M}(\ket{\phi}) is equal to ϕ|M|ϕ\expectationvalue{M}{\phi}, where ϕ|\bra{\phi} is the complex conjugate transpose of the vector |ϕ\ket{\phi}.

Standard Math Notations Versus Dirac Notations

As this paper is intended for an audience within the database community, we recognize that the Dirac bra-ket notation might appear unfamiliar to database researchers without a background in quantum information and computing. To simplify, in the main text we express a pure quantum state as a column vector with dimensions denoted as dd, and use ϕ\phi and ϕ\phi^{\dagger} to denote |ϕ\ket{\phi} and ϕ|\bra{\phi}, respectively. We use ϕMϕ\phi^{\dagger}M\phi to denote the expectation value ϕ|M|ϕ\expectationvalue{M}{\phi} of an observable MM. Throughout the paper, we reserve the notations ϕ\phi and ψ\psi for quantum states.

We have also included a more formal (but still gentle) introduction of quantum information and computing using Dirac bra-ket notations in Appendix A. We will use Dirac bra-ket notations in all the proofs in the appendix.

Trace Distance

Given two quantum states ϕ\phi and ψ\psi, we define their trace distance to be D(ϕ,ψ)=1|ψϕ|2D(\phi,\psi)=\sqrt{1-\absolutevalue{\psi^{\dagger}\phi}^{2}}. The trace distance is the most widely used distance measure for quantum states in the literature.

2.1 Performance Metrics

In the context of quantum data, similar to classical database design, the efficiency of space and time is crucial during database initialization, indexing, and querying. Minimizing the number of quantum state copies used for constructing sketches is also important, as obtaining state copies can be costly and they cannot be fully recycled due to post-measurement disturbance. However, as we mentioned earlier, sample complexity is a secondary consideration in the data management setting, since the sketch-building/initialization is a one-time process.

A unit-time quantum operation comprises standard single-qubit gates like the Hadamard gate, Pauli gates, phase gate, and TT gate, as well as a two-qubit gate, such as the Controlled-NOT (CNOT) gate, that enables entangling operations.222We refer the readers to [46] for a detailed introduction of these gates. The combination of these gates is sufficient to approximate any unitary operation to arbitrary accuracy. We call these gates unit gates, and define the size of a circuit (for representing a unitary operation) to be the number of unit gates in the circuit.

As mentioned, a typical quantum measurement \mathcal{M} on nn qubit systems consists of a unitary operator UU_{\mathcal{M}} followed by measurement in computational basis and classical post processing. Assuming that the classical post processing is polynomial, the overall time cost is typically dominated by the gate complexity of UU_{\mathcal{M}}. It has been shown in [56] that a circuit depth of Θ(2n/n)\Theta(2^{n}/n) (i.e., Θ(d/logd)\Theta(d/\log d)) is needed for constructing an arbitrary unitary operator UU. To simplify matters, we assume that both executing an arbitrary dd-dimensional quantum measurement and preparing an arbitrary dd-dimensional state require O(d)O(d) quantum time.

2.2 Nearest Neighbor in High Dimensions

As quantum states are inherently high dimensional, even after effective sketching and summarization that we will illustrate in the subsequent sections, we will thus use Approximate Nearest Neighbor (ANN) via Locality Sensitive Hashing (LSH) to further speed up some database operations. This subsection will take a brief detour from our discussion of quantum data management.

Definition 2.1 ((r,β)(r,\beta)-ANN-search).

Let XX be a database containing a set of vectors in d\mathbb{R}^{d} and qdq\in\mathbb{R}^{d} be a query vector. Let dist(,)dist(\cdot,\cdot) be a distance function. If there is at least one vector pXp\in X with dist(p,q)rdist(p,q)\leq r, return any pXp^{\prime}\in X with dist(p,q)βrdist(p^{\prime},q)\leq\beta r. Otherwise, either return a pXp^{\prime}\in X with dist(p,q)βrdist(p^{\prime},q)\leq\beta r or return \emptyset.

Let us focus on the case that the distance function dist(,)dist(\cdot,\cdot) is 1\ell_{1} or 2\ell_{2}. Indyk and Motwani [39] showed that (r,β)(r,\beta)-ANN can be solved efficiently via LSH. The idea is that we first apply multiple hash functions to each vector in XX; this part can be pre-computed and stored as an indexing. At the time of query, we apply the same set of hash functions to the query vector qq. We then run over all vectors pXp\in X such that pp and qq collide (i.e., fall into the same bin) on at least one hash function, and return the first vector pp if dist(p,q)βrdist(p,q)\leq\beta r. If no such pp found after traversing a certain number of vectors in XX, we return \emptyset.

We will use 𝙰𝙽𝙽(q,X,r,β)\mathtt{ANN}(q,X,r,\beta) to denote the (r,β)(r,\beta)-ANN search for a query vector qq in database XX. The following is a summary of results on LSH-based ANN for 1/2\ell_{1}/\ell_{2} distances.

Theorem 2.2 ([39, 15, 5]).

For dist(,)dist(\cdot,\cdot) being 1\ell_{1} or 2\ell_{2}, a database XX of mm vectors, and a dd-dimensional vector qq, there is an algorithm that solves 𝙰𝙽𝙽(q,X,r,β)\mathtt{ANN}(q,X,r,\beta) using O(dm+m1+γ)O(dm+m^{1+\gamma}) space and O(dmγ)O(dm^{\gamma}) classical time, where γ1/β\gamma\approx 1/\beta for 1\ell_{1} distance and γ1/β2\gamma\approx 1/\beta^{2} for 2\ell_{2} distance.

Remark 2.3.

We note that if we do not terminate the algorithm after encounter the first pXp\in X such that dist(p,q)βrdist(p,q)\leq\beta r, then the same algorithm can return a subset YXY\subseteq X including all vectors pp such that dist(p,q)rdist(p,q)\leq r, and excluding all vectors pp such that dist(p,q)βrdist(p,q)\geq\beta r.

Remark 2.4.

We can also use LSH to find a set JJ of pairs of vectors such that JJ includes all pairs (p,q)(p,q) such that dist(p,q)rdist(p,q)\leq r, and excludes all pairs (p,q)(p,q) such that dist(p,q)βrdist(p,q)\geq\beta r. To this end, we first hash all vectors, and then check the distances of all pairs of vectors that collide on at least one hash function.

3 Basic Operations on Quantum Data

The characteristics of quantum information dictate that we can only obtain an approximation of a quantum state ϕ{\phi} with a finite number of quantum state copies. A celebrated result in quantum state tomography states that to learn an unknown nn-qubit quantum state ϕ{\phi} up to a trace distance ϵ\epsilon, we already need Ω(d/ϵ2)\Omega\left({d}/{\epsilon^{2}}\right) copies of the quantum state, where d=2nd=2^{n} is the dimension of ϕ{\phi} [23, 49]. We thus consider two quantum states ϕ,ψ{\phi},{\psi} with D(ϕ,ψ)εD({\phi},{\psi})\leq\varepsilon the same state. Consequently, all the operations that we support in a quantum database also need to be approximate. The precise definition of ‘approximation’ varies for different operations.

In this section, we formulate basic quantum data operations that we aim to support using our proposed sketches. When we say the return of a quantum state ϕ\phi, we are referring to its identifier.

3.1 Equality Test

In the classical data setting, the equality test on two data objects returns 11 if p=qp=q, and returns 0 otherwise. In the quantum setting, since we cannot distinguish two quantum states using o(d/ε2)o(d/\varepsilon^{2}) copies of the states if their trace distance is at most ε\varepsilon, we need to introduce the approximation version of the equality test:

Definition 3.1 ((ε,β)(\varepsilon,\beta)-equality-test).

Given two quantum states ϕ{\phi} and ψ{\psi}, output 11 if D(ϕ,ψ)εD({\phi},{\psi})\leq\varepsilon, and 0 if D(ϕ,ψ)>βεD({\phi},{\psi})>\beta\varepsilon. The output can be arbitrary if ε<D(ϕ,ψ)βε\varepsilon<D({\phi},{\psi})\leq\beta\varepsilon.

In words, we consider two quantum states the same if their trace distance is at most ε\varepsilon, and different if their trace distance is more than βε\beta\varepsilon. If the distance falls between the two values, then the decision can be arbitrary. The gap between yes and no is inevitable for quantum data.

Given two quantum states ϕ{\phi} and ψ{\psi}, which may be unknown, the standard method for estimating their trace distance is the swap test [8]. The algorithm uses a controlled-SWAP gate (can be implemented using O(n)=O(logd)O(n)=O(\log d) unit gates) and two single-qubit Hadamard gates. The test outputs 11 with probability 1+|ϕψ|22=1D(ϕ,ψ)22\frac{1+\absolutevalue{\phi^{\dagger}\psi}^{2}}{2}=1-\frac{D({\phi},{\psi})^{2}}{2}, and 0 otherwise. Therefore, using Oβ(1ε2log(1δ))O_{\beta}\left(\frac{1}{\varepsilon^{2}}\log{\frac{1}{\delta}}\right) such tests (the constant hidden in the big-OO depends on the constant β\beta), we can differentiate the case D(ϕ,ψ)>βεD({\phi},{\psi})>\beta\varepsilon from D(ϕ,ψ)εD({\phi},{\psi})\leq\varepsilon with a probability 1δ1-\delta.

The main issue with this algorithm is that we have to consume fresh copies of database states for each equality test, which is unsustainable for a database system that is designed to answer an unlimited number of queries.

3.2 Search and Join

In the classical data setting, given a set of objects X={p1,,pn}X=\{p_{1},\ldots,p_{n}\} and a query object qq, the search operation returns some piXp_{i}\in X such that pi=qp_{i}=q if such pip_{i} exists, and \emptyset otherwise. In the quantum setting, again due to the difficulty of distinguishing two quantum states within a distance of ϵ\epsilon, we propose the following approximation version.

Definition 3.2 ((ε,β)(\varepsilon,\beta)-search).

Given a query state ϕ{\phi} and a database XX, if there exist a state ψX{\psi}\in X such that D(ϕ,ψ)εD({\phi},{\psi})\leq\varepsilon, return a state ψX{\psi^{\prime}}\in X with D(ϕ,ψ)βεD({\phi},{\psi^{\prime}})\leq\beta\varepsilon. Otherwise, either return a state ψX{\psi^{\prime}}\in X with D(ϕ,ψ)βεD({\phi},{\psi^{\prime}})\leq\beta\varepsilon or return \emptyset.

In other words, if there exists a state in the database which has a trace distance no more than ε\varepsilon from the query state ϕ\phi, we return a state in XX whose distance is no more than βε\beta\varepsilon from ϕ\phi (similar to the ANN search). Else if all states in the database have distances larger than βε\beta\varepsilon from the query state, we return \emptyset. In other cases, we either return a database state with distance no more than βε\beta\varepsilon from the query state or return \emptyset.

The most straightforward way is to perform the (ε,β)(\varepsilon,\beta)-equality-test for each database state ψX{\psi}\in X with the query state ϕ{\phi}. By the above algorithm for equality test (setting δ=1/m2\delta=1/m^{2}), we can determine with probability (1mδ)=(1o(1))(1-m\delta)=(1-o(1)) whether there exists a state ψX{\psi}\in X such that the (ε,β)(\varepsilon,\beta)-equality-test on ϕ{\phi} and ψ{\psi} returns 11. The above procedure takes O(mlogdlogm/ε2)O\left(m\log d\log m/\varepsilon^{2}\right) quantum time, which is linear in terms of the number of states in the database. Another significant limitation of this method is the necessity of using fresh copies of the database states for each search operation because of the equality test, making the database system unsustainable.

A closely related operation to search is join, which is one of the most important operations in relational database systems. We introduce the quantum version of natural join as follows.

Definition 3.3 ((ε,β)(\varepsilon,\beta)-natural-join).

Given two databases XX and YY of quantum states, we want to output a set that includes all pairs of states (ϕ,ψ)(ϕX,ψY)(\phi,\psi)\ (\phi\in X,\psi\in Y) such that D(ϕ,ψ)ϵD(\phi,\psi)\leq\epsilon, and excludes all pairs (ϕ,ψ)(\phi,\psi) such that D(ϕ,ψ)>βϵD(\phi,\psi)>\beta\epsilon. The decisions for other pairs can be arbitrary.

3.3 Selection and Sorting

In relational databases for classical data, selection is typically denoted by σθ(R)\sigma_{\theta}(R), where RR is a relation and θ\theta is a propositional formula that involves an attribute, a comparison operator in the set {<,>,,,=,}\{<,>,\leq,\geq,=,\neq\}, and a constant value for comparison (e.g., age 8\geq 8). However, in the quantum data setting, quantum states cannot be directly compared. We can only apply a measurement \mathcal{M} on the state ϕ{\phi} and get a random outcome according to the distribution (ϕ)\mathcal{M}({\phi}). As a classical analog, we would say a person’s age is 55 with probability 0.60.6 and 1010 with probability 0.40.4.333This assembles probabilistic databases, but in the quantum data setting the probability distribution is not given explicitly, and the support size of the distribution is exponential in terms of the number of qubits of each quantum state. We thus look at the expectation value ϕMϕ\phi^{\dagger}M\phi for the observable MM corresponding to \mathcal{M}.

The quantity ϕMϕ\phi^{\dagger}M\phi holds significant importance in quantum mechanics (see, e.g., the textbook [51]). It can be used to provide an estimate of the system’s average energy in a particular state, describe the level of non-classical correlations between entangled particles, quantify quantum information such as entropy, coherence, and entanglement, etc.

We define the ε\varepsilon-approximate ‘\geq’ selection operation for quantum data as follows.

Definition 3.4 ((η,ε)(\eta,\varepsilon)-selection).

Given a database XX, an observable MM, a threshold η\eta, and an error parameter ε\varepsilon, return a set of states SXS\subseteq X such that SS includes all database states ϕ{\phi} such that ϕMϕη\phi^{\dagger}M\phi\geq\eta, but excludes all ϕ{\phi} such that ϕMϕηε\phi^{\dagger}M\phi\leq\eta-\varepsilon.

Note that the ε\varepsilon-approximate equality selection can be implemented by taking the difference between (ηε,ε)(\eta-\varepsilon,\varepsilon)-selection and (η+2ε,ε)(\eta+2\varepsilon,\varepsilon)-selection, which includes all ϕ{\phi} with ηεϕMϕη+ε\eta-\varepsilon\leq\phi^{\dagger}M\phi\leq\eta+\varepsilon and excludes all ϕ{\phi} with ϕMϕη2ε\phi^{\dagger}M\phi\leq\eta-2\varepsilon or ϕMϕη+2ε\phi^{\dagger}M\phi\geq\eta+2\varepsilon. In the context of approximation, we can consider ‘<<’ and ‘>>’ the same as ‘\leq’ and ‘\geq’, respectively.

We also note that (ε,β)(\varepsilon,\beta)-search can also be handled by looking at ϕMϕ\phi^{\dagger}M\phi for a specific observable MM, although this solution is not as efficient as that using the particular sketches that we shall design for the search operation. We have included a reduction from (ϵ,β)(\epsilon,\beta)-search to (η,ϵ)(\eta,\epsilon)-selection in Appendix D.

In the context of databases, we are particularly interested in the following type of observables.

Definition 3.5 (kk-local observable).

An observable OO of a system with nn qubits is called kk-local if it can be written as a sum of a constant number of terms, each acting on at most kk qubits. For instance, a 22-local observable in a 33-qubit system might look like:

O=O12I3+I1O23,O=O_{12}\otimes I_{3}+I_{1}\otimes O_{23},

Where O12O_{12} and O23O_{23} are operators acting on the pairs of qubits (1,2) and (2,3) respectively, while I3I_{3} and I1I_{1} are the identity operators acting on the remaining qubits.

kk-local observables have been well studied in the literature (see [11, 42] and references therein). They are interesting because, in most practical scenarios, our goal is to identify specific properties of a quantum state (e.g., the energy, momentum, or spin of a photon) that rely on a small subset of qubits of the state. This is similar to the classical setting where most queries depend on a few attributes of a relational database table. For example, suppose we want to retrieve all records in a table containing patient information for individuals aged 8080 years or older with systolic blood pressure at least 140140, we only need to look at two attributes in the table: age and blood pressure. If we view each qubit of a quantum state as an attribute (e.g., spin, position, momentum, polarization, etc.), then a kk-local observable performs selection on at most kk attributes of the quantum state.

A related problem of selection is sorting. As a motivation, we would like to sort a set of given quantum states according to their average energy with respect to an observable determined by a particular application. Note that there is no natural order between the quantum states themselves. Therefore, introducing an observable and computing the expectation value is somewhat necessary to establish a total order between the quantum states.

We define the sorting operation with respect to an observable MM as follows. Similar to the selection operation, we introduce an additive approximation ε\varepsilon in the sorted order.

Definition 3.6 (ε\varepsilon-sorting).

Given a database XX of mm states, an observable MM, and an error parameter ε\varepsilon, return an order (ϕ1,ϕ2,,ϕm)(\phi_{1},\phi_{2},\ldots,\phi_{m}) of the states in XX such that for all i=1,,m1i=1,\ldots,m-1, we have ϕiMϕiϕi+1Mϕi+1+ε{\phi_{i}}^{\dagger}M\phi_{i}\leq{\phi_{i+1}}^{\dagger}M\phi_{i+1}+\varepsilon.

4 Sketches for Quantum Data Operations

In this section, we introduce two quantum data sketches, vector sketches and shadow seeds, which are summaries of the original states for efficiently handling previously mentioned database operations.

Before delving into the details, let us use metaphors to provide some very high-level intuition of the two data summarizing methods. The vector sketches can be seen as capturing snapshots of the state from different angles, while each shadow seed can be seen as a piece of information gleaned from the state. Using multiple shadow seeds, we can reconstruct the original state at varying levels of resolution.

4.1 Vector Sketches for Equality-Test, Search, and Join

The concept of vector sketch is to represent a quantum state ϕ{\phi} as a vector in t\mathbb{R}^{t} with tdt\ll d instead of a vector in d\mathbb{C}^{d}, while preserve certain distance properties. In this section, we design vector sketches for quantum states and then use them to conduct equality test, search, and join.

A natural way to construct the sketch is to take a number of random measurements on ϕ{\phi}, and write down the measurement outcomes as a vector. The following result is due to Sen [53], rewritten for pure quantum states.

Theorem 4.1 ([53]).

Let ϕ{\phi} and ψ{\psi} be two pure quantum states in d\mathbb{C}^{d}. With probability at least (1eΩ(d))\left(1-e^{-\Omega(d)}\right) over the choice of a random measurement basis d={M1,,Md}\mathcal{M}_{d}=\{M_{1},\ldots,M_{d}\}, there exists a universal constant c(0,1)c\in(0,1) such that

cD(ϕ,ψ)d(ϕ)d(ψ)1D(ϕ,ψ).c\cdot D({\phi},{\psi})\leq\norm{\mathcal{M}_{d}({\phi})-\mathcal{M}_{d}({\psi})}_{1}\leq D({\phi},{\psi}). (1)

Theorem 4.1 connects the trace distance of two quantum states to the 1\ell_{1} distance of their measurement outcome distributions. We note that the distortion in (1), D(ϕ,ψ)/(cD(ϕ,ψ))=1/cD({\phi},{\psi})/(cD({\phi},{\psi}))=1/c, is a big constant whose value left unspecified in [53].

Vectors d(ϕ)\mathcal{M}_{d}({\phi}) and d(ψ)\mathcal{M}_{d}({\psi}) are discrete distributions with outcomes {1,2,,d}\{1,2,\ldots,d\}. It is well-known that for a discrete distribution μ\mu over a domain of size dd, using Θ((d+log(1/δ))/ϵ2)\Theta\left({(d+\log(1/\delta))}/{\epsilon^{2}}\right) samples we can obtain an empirical distribution μ~\widetilde{\mu} such that μμ~1ϵ\norm{\mu-\widetilde{\mu}}_{1}\leq\epsilon with probability 1δ1-\delta (see, e.g., [10]).

Corollary 4.2.

Let d~(ϕ)\widetilde{\mathcal{M}_{d}}({\phi}) and d~(ψ)\widetilde{\mathcal{M}_{d}}({\psi}) be the empirical distributions of measurement outcomes by applying d\mathcal{M}_{d} in Theorem 4.1 to cs(d+log(1/δ))/ϵ2c_{s}(d+\log(1/\delta))/\epsilon^{2} (for a sufficiently large constant csc_{s}) copies of ϕ{\phi} and ψ{\psi}, respectively. With probability 1δeΩ(d)1-\delta-e^{-\Omega(d)}, we have

cD(ϕ,ψ)ϵd~(ϕ)d~(ψ)1D(ϕ,ψ)+ϵ,c\cdot D({\phi},{\psi})-\epsilon\leq\norm{\widetilde{\mathcal{M}_{d}}({\phi})-\widetilde{\mathcal{M}_{d}}({\psi})}_{1}\leq D({\phi},{\psi})+\epsilon,

where c(0,1)c\in(0,1) is a universal constant.

We can view d~(ϕ)\widetilde{\mathcal{M}_{d}}({\phi}) and d~(ψ)\widetilde{\mathcal{M}_{d}}({\psi}) as two empirical probability vectors. However, since d=2nd=2^{n} for a nn-qubit state, it is both space-expensive to store d~(ϕ)\widetilde{\mathcal{M}_{d}}({\phi}) and time-expensive to use it for database operations.

Embedding to L1L_{1}-space

We aim to address the issue of efficiency in both time and space by showing that there is another distribution of measurements whose number of outcomes is independent of the state dimension dd, for which a similar connection exists between the trace distance of two quantum states and the 1\ell_{1} distance of the corresponding measurement outcome distributions. Moreover, the distortion of our sketching can be made arbitrarily close to 11 (compared with 1/c1/c in (1)). It is worth noting that this distortion will significantly impact the efficiency of the search and join operations, as we will discuss shortly.

Our result is summarized in the following theorem.

Theorem 4.3.

Let ϕ{\phi} and ψ{\psi} be two pure dd-dimensional quantum states. For any ι>0\iota>0, there is a distribution π\pi of measurements with k=clog(1/δ)/ι2k=c{\log(1/\delta)}/{\iota^{2}} outcomes for a sufficiently large constant cc, such that a measurement k\mathcal{M}_{k} sampled randomly from π\pi satisfies

(1ι)D(ϕ,ψ)dkcτk(ϕ)k(ψ)1(1+ι)D(ϕ,ψ)(1-\iota)D({\phi},{\psi})\leq\sqrt{\frac{d}{k}}c_{\tau}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1}\leq(1+\iota)D({\phi},{\psi})

with probability at least (1δ)(1-\delta), where cτ[0.48,2]c_{\tau}\in[0.48,\sqrt{2}] is a universal computable constant. Additionally, the measurement sampling can be completed in O(log8d)O(\log^{8}d) time, and the sampled measurement can be represented as a quantum circuit with a gate complexity of O(log2d)O(\log^{2}d).

Proof 4.4 (Proof Overview).

At a high level, our approach leverages form dimension reduction through quantum measurements. We make use of a technique called pretty good measurement [36] to generate random projective quantum measurements \mathcal{M} with kk outcomes. The output of these measurements are random vectors serving as the embedding of the state ϕ\phi into k\mathbb{R}^{k}.

We start by picking a random basis for d\mathbb{C}^{d} based on the Haar measure [35]. Let xt,yt(t=1,,d)x_{t},y_{t}\ (t=1,\ldots,d) be independent Gaussian random variables with mean zero and variance σ2=12d\sigma^{2}=\frac{1}{2d}, and let g(c1,,cd)dg\triangleq(c_{1},\ldots,c_{d})\in\mathbb{C}^{d} be a random vector where ct=xt+iytc_{t}=x_{t}+iy_{t}. We repeat this process and generate dd complex Gaussian random vectors g1,,gdg_{1},\ldots,g_{d}. These vectors are linearly independent with probability one; but they are not necessarily orthonormal. We make use of pretty good measurement to orthogonalize and normalize these vectors. More precisely, we construct the operator (matrix) Γt[d]gtgt\Gamma\triangleq\sum_{t\in[d]}g_{t}^{\dagger}g_{t}, and define the vector γtΓ1/2gt\gamma_{t}\triangleq\Gamma^{-1/2}g_{t} for each t[d]t\in[d]. We can show that γ1,,γd\gamma_{1},\ldots,\gamma_{d} are linearly independent and are orthonormal. Moreover, the distribution of γt\gamma_{t} is unitary invariant, and hence the Haar measure. Intuitively, γt\gamma_{t} is distributed uniformly over surface of the unit sphere in d\mathbb{C}^{d}. Next, we randomly group γt{\gamma_{t}}’s into kk groups and form random projection operators as

Πj=[d/k](γj)γj.(j=1,,k)\Pi_{j}=\sum_{\ell\in[d/k]}(\gamma^{j}_{\ell})^{\dagger}\gamma^{j}_{\ell}\ .\quad\quad(j=1,\ldots,k)

Let k={Π1,,Πk}\mathcal{M}_{k}=\{\Pi_{1},\cdots,\Pi_{k}\} be the corresponding measurement. Clearly, \mathcal{M} is a valid measurement with probability one. This random measurement facilitates an embedding of the quantum states in d\mathbb{C}^{d} into k\mathbb{R}^{k}. We carefully analyze the distortion of the embedding (i.e., the outcome distribution by applying k\mathcal{M}_{k} to the quantum state) using tools from the concentration of measures and properties of the Haar distribution. We show that the distortion of this embedding is no more than (1+ι)(1+\iota) with probability (1δ)(1-\delta) when k=clog(1/δ)/ι2k=c\log(1/\delta)/\iota^{2} for a constant cc. The complete proof can be found in Appendix B.1.

The measurement construction described above could require polynomial time in dd. However, we demonstrate that it can be sampled more efficiently from the Clifford group in classical time O(log8d)O(\log^{8}d), leveraging the properties of unitary 2-designs from quantum information theory. The details are provided in Appendix B.1.1.

To approximate dkcτk(ϕ)k(ψ)1\sqrt{\frac{d}{k}}c_{\tau}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1} up to an additive error ϵ\epsilon, we have to approximate k(ϕ)k(ψ)1\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1} up to ϵ=ϵd/kcτ\epsilon^{\prime}=\frac{\epsilon}{\sqrt{d/k}\cdot c_{\tau}}. We have the following immediate corollary.

Corollary 4.5.

For any ι>0\iota>0, let k=clog(1/δ)/ι2k=c\log(1/\delta)/{\iota^{2}} for a sufficiently large constant cc, and let k~(ϕ)\widetilde{\mathcal{M}_{k}}({\phi}) and k~(ψ)\widetilde{\mathcal{M}_{k}}({\psi}) be the empirical distributions of the outcomes by applying independent random measurements k\mathcal{M}_{k} in Theorem 4.3 to csd/ϵ2c_{s}d/\epsilon^{2} (for a sufficiently large constant csc_{s}) copies of ϕ{\phi} and ψ{\psi}, respectively. With probability at least 1δ1-\delta, we have

(1ι)D(ϕ,ψ)ϵdkcτk~(ϕ)k~(ψ)1(1+ι)D(ϕ,ψ)+ϵ,\displaystyle(1-\iota)D({\phi},{\psi})-\epsilon\leq\sqrt{\frac{d}{k}}c_{\tau}\norm{\widetilde{\mathcal{M}_{k}}({\phi})-\widetilde{\mathcal{M}_{k}}({\psi})}_{1}\leq(1+\iota)D({\phi},{\psi})+\epsilon,

where cτ[0.48,2]c_{\tau}\in[0.48,\sqrt{2}] is the same constant in Theorem 4.3.

Embedding to L2L_{2}-space

The sketch we have constructed for the L1L_{1}-space can also be applied to the L2L_{2}-space, albeit through a different analysis. The 2\ell_{2} distance is interesting since we know from Theorem 2.2 that 2\ell_{2} enjoys a slightly better ANN scheme in term of time and space complexities, which will be useful for speeding up search and join operations. The proof of the following theorem can be found in Appendix B.2.

Theorem 4.6.

Let ϕ{\phi} and ψ{\psi} be two pure dd-dimensional quantum states. For any ι>0\iota>0, there is a distribution π\pi of measurements with k=clog(1/δ)/ι2k=c\log(1/\delta)/\iota^{2} outcomes for a sufficiently large constant cc, such that a measurement k\mathcal{M}_{k} sampled randomly from π\pi satisfies

(1ι)D(ϕ,ψ)d2k(ϕ)k(ψ)2(1+ι)D(ϕ,ψ)(1-\iota)D({\phi},{\psi})\leq\sqrt{\frac{d}{2}}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{2}\leq(1+\iota)D({\phi},{\psi})

with probability at least 1δ1-\delta. Additionally, the measurement sampling can be completed in O(log8d)O(\log^{8}d) time, and the sampled measurement can be represented as a quantum circuit with a gate complexity of O(log2d)O(\log^{2}d).

For a discrete distribution μ\mu over a domain of size dd for any d1d\geq 1, it takes Θ(log(1/δ)/ϵ2)\Theta\left({\log(1/\delta)}/{\epsilon^{2}}\right) samples to obtain an empirical distribution μ~\widetilde{\mu} such that μμ~2ϵ\norm{\mu-\widetilde{\mu}}_{2}\leq\epsilon with probability 1δ1-\delta (see, e.g., [10]). We have the following corollary.

Corollary 4.7.

For any ι>0\iota>0, let k=clog(1/δ)/ι2k=c{\log(1/\delta)}/{\iota^{2}} for a sufficiently large constant cc, and let k~(ϕ)\widetilde{\mathcal{M}_{k}}({\phi}) and k~(ψ)\widetilde{\mathcal{M}_{k}}({\psi}) be the empirical distributions of the outcomes by applying independent random measurements k\mathcal{M}_{k} in Theorem 4.6 to csdlog(1/δ)/ϵ2c_{s}{d\log(1/\delta)}/{\epsilon^{2}} (for a sufficiently large constant csc_{s}) copies of ϕ{\phi} and ψ{\psi}, respectively. With probability 1δ1-\delta, we have

(1ι)D(ϕ,ψ)ϵd2k~(ϕ)k~(ψ)2(1+ι)D(ϕ,ψ)+ϵ.\displaystyle(1-\iota)D({\phi},{\psi})-\epsilon\leq\sqrt{\frac{d}{2}}\norm{\widetilde{\mathcal{M}_{k}}({\phi})-\widetilde{\mathcal{M}_{k}}({\psi})}_{2}\leq(1+\iota)D({\phi},{\psi})+\epsilon.

Johnson-Lindenstrauss Lemma in Our Context.  It is natural to ask whether existing dimension reduction techniques, such as the Johnson–Lindenstrauss (JL) lemma, can be applied directly to the dd-dimensional vector representation α(ϕ)=(α1,,αd)d\alpha(\phi)=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{C}^{d} of a quantum state ϕ\phi, or the outcome distribution p(ϕ)=(p1,,pd)d(pi=|αi|2)p(\phi)=(p_{1},\ldots,p_{d})\in\mathbb{R}^{d}\ (p_{i}=\absolutevalue{\alpha_{i}}^{2}) when measured in the computational basis. After all, we can use quantum tomography to learn the representation (α1,,αd)(\alpha_{1},\ldots,\alpha_{d}) approximately. We would like to first point out that a direct application will not work, since we can construct simple examples demonstrating inherent distortions between the trace distance of quantum states and the 1/2\ell_{1}/\ell_{2} distances of their dd-dimensional vector representations (α(ϕ)\alpha(\phi) or p(ϕ)p(\phi)), even when all the coordinates are real-valued and before any dimension reduction step. We leave the detailed examples and calculation to Appendix C. In our examples, for the α(ϕ)\alpha(\phi) vector representation, the distortions between the trace distance of quantum states and the 1\ell_{1} and 2\ell_{2} distances of the two corresponding vectors are at least d/6\sqrt{{d}/{6}} and 1.5\sqrt{1.5}, respectively. And for the p(ϕ)p(\phi) vector representation, the distortions between the trace distance of quantum states and the 1\ell_{1} and 2\ell_{2} distances of the two corresponding vectors are at least 3\sqrt{3} and 3d/4\sqrt{{3d}/{4}}, respectively. Moreover, the JL lemma only takes real vectors.

We also note that there exists a near-linear lower bound for dimension reduction in the L1L_{1} space [4], indicating that, unlike the JL lemma for L2L_{2} space, dimension reduction in the L1L_{1} space is not generally possible.

We note that there is a way to circumvent the issues for embedding quantum states into the L2L_{2} space: for each state ϕ\phi, we write its density matrix ϕϕ\phi\phi^{\dagger} as a real-valued 2d22d^{2} dimensional vector vϕv_{\phi}. By some calculation, we can show that the 2\ell_{2} distance of vϕv_{\phi} and vψv_{\psi} preserves the trace distance of the two original pure states ϕ\phi and ψ\psi. We then perform dimension reduction on the vectors vϕv_{\phi} using the JL lemma. Our sketching algorithm has the following advantages compared with this “full tomography plus JL lemma” approach (setting the error probability δ=0.01\delta=0.01):

  1. 1.

    The memory usage of our sketch construction is independent of dd, while the memory needed for storing the classical vector representation of the quantum state ϕ\phi is O(d)O(d) and that for the density matrix ϕϕ\phi\phi^{\dagger} is O(d2)O(d^{2}).

  2. 2.

    Our sketch construction takes O~(d/ϵ2)\tilde{O}(d/\epsilon^{2}) time, while the full (pure) quantum state tomography takes O(d2/ϵ5)O(d^{2}/\epsilon^{5}) [19] time and the dimension reduction using the JL lemma needs another O(d2/ϵ2)O(d^{2}/\epsilon^{2}) time.

These comparisons demonstrate that our sketch construction using direct quantum measurements significantly outperforms the method of first converting the quantum state to its classical description followed by dimension reduction, both in terms of time and space, which are the main focus of this paper.

We now apply our embedding results to database operations.

The Equality-Test Operation

We observe that Corollary 4.5 and Corollary 4.7 directly provide a way for solving (ϵ,β)(\epsilon,\beta)-equality-test. We just set ι=ϵ=ε2\iota=\epsilon=\frac{\varepsilon}{2}, and use the 1\ell_{1} or 2\ell_{2} distances between the two vector sketches k~(ϕ)\widetilde{\mathcal{M}_{k}}({\phi}) and k~(ψ)\widetilde{\mathcal{M}_{k}}({\psi}) to estimate D(ϕ,ψ)D(\phi,\psi) up to an additive error ε\varepsilon with probability 1δ1-\delta. The running time is bounded by O(k)=O(log(1/δ)/ε2)O(k)=O(\log(1/\delta)/\varepsilon^{2}).

The Search Operation

We now illustrate how to use vector sketches and approximate nearest neighbor (ANN) to perform (ϵ,β)(\epsilon,\beta)-search on quantum states.

Let ϵ\epsilon and (1+ι)(1+\iota) be the additive error and multiplicative error in Corollary 4.5/Corollary 4.7 for building {k~(ϕ)|ϕX}\left\{\left.\widetilde{\mathcal{M}_{k}}({\phi})\ \right|\ {\phi}\in X\right\}, respectively. We assume that an LSH indexing structure has already been built on top of k~(ϕ)\widetilde{\mathcal{M}_{k}}({\phi})’s to achieve the time and space usages stated in Theorem 2.2. To handle (ε,β)(\varepsilon,\beta)-search, we call 𝙰𝙽𝙽(k~(ϕ),{k~(ψ)|ψX},(1+ι)ε,βnn)\mathtt{ANN}\left(\widetilde{\mathcal{M}_{k}}({\phi}),\left\{\left.\widetilde{\mathcal{M}_{k}}({\psi})\ \right|\ {\psi}\in X\right\},(1+\iota)\varepsilon,\beta_{nn}\right), where βnn=β/(1+ι+ϵ/ε)\beta_{nn}={\beta}/{(1+\iota+\epsilon/\varepsilon)} is the parameter for the tradeoff between the distortion and the time/space complexity in ANN. By Corollary 4.5/Corollary 4.7 and Theorem 2.2, if there exists a state ψX{\psi}\in X such that D(ϕ,ψ)εD({\phi},{\psi})\leq\varepsilon, then ANN returns a state ψD{\psi^{\prime}}\in D such that D(ϕ,ψ)βεD({\phi},{\psi^{\prime}})\leq\beta\varepsilon. On the other hand, ANN either returns a state ψD{\psi^{\prime}}\in D with D(ϕ,ψ)βεD({\phi},{\psi^{\prime}})\leq\beta\varepsilon, or returns \emptyset.

By Theorem 2.2, it takes O(kmγ)=O(mγlogm/ε2)O(km^{\gamma})=O(m^{\gamma}\log m/\varepsilon^{2}) classical time to perform the search. The space for storing the LHS index is O(km+m1+γ)=O(mlogm/ε2+m1+γ)O(km+m^{1+\gamma})=O(m\log m/\varepsilon^{2}+m^{1+\gamma}), where γ1/βnn\gamma\approx 1/\beta_{nn} for 1\ell_{1} and γ1/βnn2\gamma\approx 1/\beta_{nn}^{2} for 2\ell_{2}.

We note that in the above approach, we have to make sure that βnn1\beta_{nn}\geq 1. In other words, we can only handle (ϵ,β)(\epsilon,\beta)-search with β(1+ι+ϵ/ε)\beta\geq(1+\iota+\epsilon/\varepsilon). However, since ϵ\epsilon and ι\iota can be positive constants arbitrarily close to 0, we can essentially handle all constants β>1\beta>1. Certainly, the higher the value of β\beta, the larger βnn\beta_{nn} that we can pick for reducing the query time and space usage in the ANN search. In practice, a reasonably large constant β\beta may be okay, as the trace distance between two quantum states that are generated by separate entities or experiments is typically much larger than that between two states originating from the same entity or experiment (due to quantum noise or preparation errors).

Setting δ=1/m2\delta=1/m^{2}, ι=0.01\iota=0.01 and ϵ=0.01ε\epsilon=0.01\varepsilon, we have βnn0.98β\beta_{nn}\geq 0.98\beta, and consequently γ1.05/β2\gamma\leq 1.05/\beta^{2}. Applying our vector sketch with respect to the 2\ell_{2} distance and the corresponding ANN search, we have the following theorem.

Theorem 4.8.

There is an index of size O(mlogmε2+m1+1.05β2)O\left(\frac{m\log m}{\varepsilon^{2}}+m^{1+\frac{1.05}{\beta^{2}}}\right), using which we can solve (ϵ,β)(\epsilon,\beta)-search on a database of mm quantum states with success probability 1o(1)1-o(1) and classical time O(m1.05β2logmε2)O\left(m^{\frac{1.05}{\beta^{2}}}\cdot\frac{\log m}{\varepsilon^{2}}\right).

Note that the index space cost is independent of dd, and the query time is sublinear in mm (for β>1.05\beta>\sqrt{1.05}) and independent of the state dimension dd.

The Join Operation

The sketch-based approach can also be used for join. Given a set of sketch vectors {k~(ϕ)|ϕX}\left\{\left.\widetilde{\mathcal{M}_{k}}({\phi})\ \right|\ {\phi}\in X\right\}, we can apply the same hashing process as that for the ANN search, and then verify (by computing the actual distance) all pairs of vectors that collide on at least one hash function. The space cost is the same as that of the search. The query time is dependent on the size of the join output, but it is still independent of the state dimension dd.

4.2 Shadow Seeds for Selection and Sorting

In this section, we develop a classical data summarization that can be used to estimate the expectation value ϕMϕ\phi^{\dagger}M\phi for an arbitrary kk-local observable MM. We make use of the classical shadow tomography (CST), introduced in [37], to approximate ϕMϕ\phi^{\dagger}M\phi up to a small additive error. CST tries to extract minimal information about the quantum state, without performing complete tomography, to estimate certain properties of the state described by observables.

For completeness, let us briefly describe the CST procedure using Pauli measurements. For each of the NN copies of ϕ{\phi}, we select nn unitary operators, U1,,UnU_{1},\ldots,U_{n}, randomly and independently from the set {I,H,SH}\{I,H,S^{\dagger}H\}, where HH is the Hadamard gate and S=ZS=\sqrt{Z} is the square root of the Pauli-ZZ gate; see Appendix A.2 for their matrix representations. We then apply UjU_{j} to the jj-th qubit of ϕ{\phi} and measure the state on the computational basis. The result is a binary string b1,,bn{0,1}b_{1},\ldots,b_{n}\in\{0,1\}. The nn pairs {bj,𝚒𝚗𝚍𝚎𝚡(Uj))}j=1n\{b_{j},\mathtt{index}(U_{j}))\}_{j=1}^{n} form a row vector, where 𝚒𝚗𝚍𝚎𝚡(Uj)\mathtt{index}(U_{j}) is the index of UjU_{j} in the set {I,H,SH}\{I,H,S^{\dagger}H\}. We then repeat this process for NN times, getting NN rows, forming the seed matrix A(ϕ)={bi,j,𝚒𝚗𝚍𝚎𝚡(Ui,j)}i[N],j[n]A({\phi})=\{b_{i,j},\mathtt{index}(U_{i,j})\}_{i\in[N],j\in[n]}. We call A(ϕ)A({\phi}) the shadow seeds. Clearly, A(ϕ)A({\phi}) can be stored using O(nN)O(nN) classical bits, since each entry of A(ϕ)A({\phi}) belongs to {0,1}×{0,1,2}\{0,1\}\times\{0,1,2\}.

At the time of query, given a kk-local observable MM, we first construct kk-local classical shadows ϕ~i{\tilde{\phi}_{i}} of the database state ϕ{\phi} from each row i[N]i\in[N] of its seed matrix A(ϕ)A({\phi}) with respect to the kk-local observable MM. Suppose MM depends non-trivially on the kk qubits indexed by Q{q1,,qk}Q\triangleq\{q_{1},\ldots,q_{k}\}. Let e0=(0,1)T,e1=(1,0)Te_{0}=(0,1)^{T},e_{1}=(1,0)^{T} be the standard basis vectors in the two dimensional plane. For each row i[N]i\in[N] and column jQj\in Q, we first construct a vector vi,j=Ui,jebi,j{v_{i,j}}=U_{i,j}e_{b_{i,j}}. Next, we construct the ii-th shadow as a 2k×2k2^{k}\times 2^{k} matrix ρ^i=jQ(3vi,jvi,jI),\hat{\rho}_{i}=\bigotimes_{j\in Q}\quantity(3v_{i,j}v^{\dagger}_{i,j}-I), where II is the 2×22\times 2 identity matrix. Finally, the estimator for ϕMϕ\phi^{\dagger}M\phi is given by T=1Ni[N]tr(Mρ^i)T=\frac{1}{N}\sum_{i\in[N]}\tr{M\hat{\rho}_{i}}. The following theorem states that TT is a good approximation of the expectation value ϕMϕ\phi^{\dagger}M\phi.

Theorem 4.9 (Based on [37]).

The above procedure prepares an N×nN\times n shadow seed matrix A(ϕ)A({\phi}) given NN copies of an nn-qubit quantum state ϕ{\phi}, such that for any given kk-local observable MM, if N4kM2log(1/δ)/ε2N\geq 4^{k}\norm{M}_{\infty}^{2}{\log(1/\delta)}/{\varepsilon^{2}}, the estimator TT approximates ϕMϕ\phi^{\dagger}M\phi up to an additive error ε\varepsilon with probability (1δ)(1-\delta) using A(ϕ)A({\phi}). Moreover, the time for computing ϕMϕ\phi^{\dagger}M\phi using A(ϕ)A({\phi}) is bounded by O(22kN)(16k)O\left(2^{2k}N\right)(\propto 16^{k}), and the space for storing A(ϕ)A({\phi}) is O(Nn)O(Nn) classical bits.

Note that the space cost and query time are both independent of the state dimension dd.

Typically, the kk-local observable MM can be expressed as a quantum circuit with poly(k)\text{poly}(k) gate complexity. In this case, we propose a new estimation algorithm to further improve the total query time from O(16k)O(16^{k}) to O(9k)O(9^{k}) (omitting other less critical factors) by an approach we call QCQC (quantum\toclassical\toquantum\toclassical). We have the following theorem, whose proof can be found in Appendix B.3.

Theorem 4.10.

There is a procedure for preparing an N×nN\times n shadow seed matrix A(ϕ)A({\phi}) given NN copies of an nn-qubit quantum state ϕ{\phi}, such that for any given kk-local observable MM with poly(k)\text{poly}(k) gate complexity, if N9kM2log(1/δ)/ε2N\geq 9^{k}\norm{M}_{\infty}^{2}{\log(1/\delta)}/{\varepsilon^{2}}, we can approximate ϕMϕ\phi^{\dagger}M\phi up to an additive error ε\varepsilon with probability (1δ)(1-\delta) using A(ϕ)A({\phi}). Moreover, the quantum time for computing ϕMϕ\phi^{\dagger}M\phi using A(ϕ)A({\phi}) is bounded by O(Npoly(k))(9k)O\left(N\text{poly}(k)\right)(\propto 9^{k}), and the space for storing A(ϕ)A({\phi}) is O(Nn)O(Nn) classical bits.

The Selection Operation

It is easy to see that Theorem 4.10 directly implies an algorithm for handling (η,ε)(\eta,\varepsilon)-selection: Setting δ=1/m2\delta=1/m^{2}, we can estimate ϕMϕ\phi^{\dagger}M\phi up to an additive error ε\varepsilon with probability (11/m2)(1-1/m^{2}) for each nn-qubit database state ϕ{\phi} using an N×nN\times n shadow seed matrix, where N9kM22logm/ε2N\geq 9^{k}\norm{M}_{\infty}^{2}\cdot{2\log m}/{\varepsilon^{2}}. By a union bound over mm database states, we can solve the (η,ε)(\eta,\varepsilon)-selection problem with probability (11/m)(1-1/m). The query time is bounded by Nmpoly(k)=9kmlogmpoly(k)M2/ε2Nm\cdot\text{poly}(k)=9^{k}m\log m\cdot\text{poly}(k)\norm{M}_{\infty}^{2}/{\varepsilon^{2}}.

Theorem 4.11.

There is an index of size O(9knW2logm/ε2)O\left(9^{k}nW^{2}{\log m}/{\varepsilon^{2}}\right), using which we can solve for any kk-local observable M(MW)M\ (\norm{M}_{\infty}\leq W) the (η,ϵ)(\eta,\epsilon)-selection on a database of mm nn-qubit quantum states with success probability (1o(1))(1-o(1)) and quantum time 9kmlogmW2poly(k)/ε29^{k}m{\log m}W^{2}\text{poly}(k)/{\varepsilon^{2}}.

The Sorting Operation

Since the shadow seed matrix can be used for estimating the expectation value ϕMϕ\phi^{\dagger}M\phi up to an additive error ε\varepsilon, we can use it for ε\varepsilon-sorting with the same space and time complexity as that for the selection operation.

5 Conclusion and Future Work

In this paper, we have defined basic database queries for quantum data and proposed several classical sketches of quantum states to facilitate these queries. We consider our work a preliminary step towards a comprehensive quantum data management system. Numerous questions and directions remain open following this work. We list a few below.

Support More Data Operations

This paper primarily focuses on two basic database operations: search and selection, along with several related operations. We would like to expand the support to more complex operations for data analytics, such as clustering and classification, for which we may need to develop new classical summaries of the quantum states for the sake of efficiency.

Mixed States

In various scenarios, such as when the description of a quantum system is unknown due to quantum noise, the use of a density operator (or, density matrix) for describing mixed quantum states becomes more convenient. Suppose the quantum system is in one of a collection of dd-dimensional pure states {ϕ1,,ϕk}\{{\phi_{1}},\ldots,{\phi_{k}}\}, we can represent a mixed quantum state as ρ=i=1kpiϕiϕi\rho=\sum_{i=1}^{k}p_{i}\phi_{i}{\phi_{i}}^{\dagger}, where p1,,pk0p_{1},\ldots,p_{k}\geq 0 and i=ikpi=1\sum_{i=i}^{k}p_{i}=1. We can view ρ\rho as a convex combination of outer products of pure states ϕi{\phi_{i}}, where each ϕiϕi\phi_{i}{\phi_{i}}^{\dagger} is associated with a probability pip_{i}. We anticipate that results presented in this paper can be extended to mixed states, although the technical aspects of this generalization require further investigation.

The Integration with the Theory of Relational Databases

A key feature of our proposed model is that quantum data is represented entirely in the classical format. This unique aspect enables us to integrate our model with established theories related to indexing, query execution, and query optimization in relational databases designed for classical data. However, the integration process will likely require the redesign of multiple components to accommodate the inherent differences stemming from the distinct definitions of database operations for quantum data.

References

  • [1] Scott Aaronson. The learnability of quantum states. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088):3089–3114, 2007.
  • [2] Scott Aaronson. Shadow tomography of quantum states. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, STOC, pages 325–338. ACM, 2018.
  • [3] Scott Aaronson and Daniel Gottesman. Improved simulation of stabilizer circuits. Physical Review A, 70(5):052328, nov 2004. doi:10.1103/physreva.70.052328.
  • [4] Alexandr Andoni, Moses Charikar, Ofer Neiman, and Huy L. Nguyen. Near linear lower bound for dimension reduction in L1. In Rafail Ostrovsky, editor, FOCS, pages 315–323. IEEE Computer Society, 2011.
  • [5] Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459–468. IEEE Computer Society, 2006.
  • [6] Costin Badescu and Ryan O’Donnell. Improved quantum data analysis. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC, pages 1398–1411. ACM, 2021.
  • [7] Costin Badescu, Ryan O’Donnell, and John Wright. Quantum state certification. CoRR, abs/1708.06002, 2017.
  • [8] Harry Buhrman, Richard Cleve, John Watrous, and Ronald De Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
  • [9] Umut Çalikyilmaz, Sven Groppe, Jinghua Groppe, Tobias Winker, Stefan Prestel, Farida Shagieva, Daanish Arya, Florian Preis, and Le Gruenwald. Opportunities for quantum acceleration of databases: Optimization of queries and transaction schedules. Proc. VLDB Endow., 16(9):2344–2353, 2023.
  • [10] Clément L. Canonne. A short note on learning discrete distributions, 2020. arXiv:2002.11457.
  • [11] Thomas Chen, Shivam Nadimpalli, and Henry Yuen. Testing and learning quantum juntas nearly optimally. In Nikhil Bansal and Viswanath Nagarajan, editors, SODA, pages 1163–1185.
  • [12] Kai-Min Chung and Han-Hsuan Lin. Sample efficient algorithms for learning quantum channels in PAC model and the approximate state discrimination problem. In Min-Hsiu Hsieh, editor, TQC, volume 197 of LIPIcs, pages 3:1–3:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
  • [13] Paul Cockshott. Quantum relational databases, 1997. arXiv:quant-ph/9712025.
  • [14] Christoph Dankert, Richard Cleve, Joseph Emerson, and Etera Livine. Exact and approximate unitary 2-designs and their application to fidelity estimation. Physical Review A, 80(1):012304, July 2009. doi:10.1103/physreva.80.012304.
  • [15] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Jack Snoeyink and Jean-Daniel Boissonnat, editors, SOCG, pages 253–262. ACM, 2004.
  • [16] D.P. DiVincenzo, D.W. Leung, and B.M. Terhal. Quantum data hiding. IEEE Transactions on Information Theory, 48(3):580–598, March 2002. doi:10.1109/18.985948.
  • [17] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm, 2014. arXiv:1411.4028, doi:10.48550/ARXIV.1411.4028.
  • [18] Edward Farhi and Hartmut Neven. Classification with quantum neural networks on near term processors. February 2018. arXiv:1802.06002.
  • [19] Daniel Stilck França, Fernando G. S. L. Brandão, and Richard Kueng. Fast and robust quantum state tomography from few basis measurements. In Min-Hsiu Hsieh, editor, 16th Conference on the Theory of Quantum Computation, Communication and Cryptography, TQC 2021, July 5-8, 2021, Virtual Conference, volume 197 of LIPIcs, pages 7:1–7:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
  • [20] Siddhant Garg and Goutham Ramakrishnan. Advances in quantum deep learning: An overview. arXiv:2005.04316, May 2020. arXiv:2005.04316.
  • [21] Weiyuan Gong and Scott Aaronson. Learning distributions over quantum measurement outcomes. CoRR, abs/2209.03007, 2022.
  • [22] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Gary L. Miller, editor, STOC, pages 212–219. ACM, 1996.
  • [23] Jeongwan Haah, Aram W. Harrow, Zheng-Feng Ji, Xiaodi Wu, and Nengkun Yu. Sample-optimal tomography of quantum states. In Daniel Wichs and Yishay Mansour, editors, STOC, pages 913–925. ACM, 2016.
  • [24] Jeongwan Haah, Aram W. Harrow, Zhengfeng Ji, Xiaodi Wu, and Nengkun Yu. Sample-optimal tomography of quantum states. IEEE Transactions on Information Theory, pages 1–1, 2017. doi:10.1109/tit.2017.2719044.
  • [25] Rihan Hai, Shih-Han Hung, and Sebastian Feld. Quantum data management: From theory to opportunities. In ICDE, pages 5376–5381. IEEE, 2024. URL: https://doi.org/10.1109/ICDE60146.2024.00410, doi:10.1109/ICDE60146.2024.00410.
  • [26] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Physical review letters, 103(15):150502, 2009.
  • [27] Aram W. Harrow, Cedric Yen-Yu Lin, and Ashley Montanaro. Sequential measurements, disturbance and property testing. In Philip N. Klein, editor, SODA, pages 1598–1611. SIAM, 2017.
  • [28] Aram W. Harrow and John C. Napp. Low-depth gradient measurements can improve convergence in variational hybrid quantum-classical algorithms. Physical Review Letters, 126(14):140502, apr 2021. doi:10.1103/physrevlett.126.140502.
  • [29] Aram W. Harrow and Andreas J. Winter. How many copies are needed for state discrimination? IEEE Trans. Inf. Theory, 58(1):1–2, 2012.
  • [30] Patrick Hayden, Peter W. Shor, and Andreas Winter. Random quantum codes from gaussian ensembles and an uncertainty relation. Open Systems & Information Dynamics, 15(01):71–89, mar 2008.
  • [31] Mohsen Heidari, Ananth Y. Grama, and Wojciech Szpankowski. Toward physically realizable quantum neural networks. Association for the Advancement of Articial Intelligence (AAAI), 2022.
  • [32] Mohsen Heidari, Mobasshir A Naved, Zahra Honjani, Wenbo Xie, Arjun Jacob Grama, and Wojciech Szpankowski. Quantum shadow gradient descent for variational quantum algorithms. 2024. URL: https://arxiv.org/abs/2310.06935, arXiv:2310.06935.
  • [33] Mohsen Heidari, Arun Padakandla, and Wojciech Szpankowski. A theoretical framework for learning from quantum data. In IEEE International Symposium on Information Theory (ISIT), 2021.
  • [34] Fumio Hiai and Denes Petz. The Semicircle Law, Free Random Variables and Entropy (Mathematical Surveys & Monographs). American Mathematical Society, 2006.
  • [35] Alexander S. Holevo. Quantum Systems, Channels, Information. De Gruyter, Berlin, Boston, 2013. URL: https://doi.org/10.1515/9783110273403 [cited 2023-08-27], doi:doi:10.1515/9783110273403.
  • [36] Alexander Semenovich Holevo. On asymptotically optimal hypotheses testing in quantum statistics. Teoriya Veroyatnostei i ee Primeneniya, 23(2):429–432, 1978.
  • [37] Hsin-Yuan Huang, Richard Kueng, and John Preskill. Predicting many properties of a quantum system from very few measurements. Nature Physics 16, 1050–1057 (2020), February 2020.
  • [38] Noah Huffman, Dmitri Pavlichin, and Tsachy Weissman. Lossy compression for schrödinger-style quantum simulations, 2024. URL: https://arxiv.org/abs/2401.11088, arXiv:2401.11088.
  • [39] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Jeffrey Scott Vitter, editor, STOC, pages 604–613. ACM, 1998.
  • [40] E. Joos, H. D. Zeh, C. Kiefer, D. J. W. Giulini, J. Kupsch, and I. O. Stamatescu. Decoherence and the Appearance of a Classical World in Quantum Theory. Springer, 2003.
  • [41] Stephen P. Jordan, Keith S. M. Lee, and John Preskill. Quantum algorithms for quantum field theories. Science, 336(6085):1130–1133, jun 2012. doi:10.1126/science.1217069.
  • [42] Julia Kempe, Alexei Y. Kitaev, and Oded Regev. The complexity of the local hamiltonian problem. SIAM J. Comput., 35(5):1070–1097, 2006.
  • [43] Ian D. Kivlichan, Craig Gidney, Dominic W. Berry, Nathan Wiebe, Jarrod McClean, Wei Sun, Zhang Jiang, Nicholas Rubin, Austin Fowler, Alán Aspuru-Guzik, Hartmut Neven, and Ryan Babbush. Improved fault-tolerant quantum simulation of condensed-phase correlated electrons via trotterization. Quantum, 4:296, jul 2020. doi:10.22331/q-2020-07-16-296.
  • [44] Yang Liu and Gui Lu Long. Deleting a marked item from an unsorted database with a single query, 2007. arXiv:0710.3301.
  • [45] Fabio Valerio Massoli, Lucia Vadicamo, Giuseppe Amato, and Fabrizio Falchi. A leap among entanglement and neural networks: A quantum survey. arXiv:2107.03313, July 2021. arXiv:2107.03313.
  • [46] Isaac L. Chuang Michael A. Nielsen. Quantum Computation and Quantum Information. Cambridge University Pr., December 2010.
  • [47] Isaac L. Chuang Michael A. Nielsen. Quantum Computation and Quantum Information. Cambridge University Pr., December 2010. URL: https://www.ebook.de/de/product/13055864/michael_a_nielsen_isaac_l_chuang_quantum_computation_and_quantum_information.html.
  • [48] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii. Quantum circuit learning. Physical Review A, 98(3):032309, sep 2018. doi:10.1103/physreva.98.032309.
  • [49] Ryan O’Donnell and John Wright. Efficient quantum tomography. In Daniel Wichs and Yishay Mansour, editors, STOC, pages 899–912. ACM, 2016.
  • [50] Alexey Pyrkov, Alex Aliper, Dmitry Bezrukov, Yen-Chu Lin, Daniil Polykovskiy, Petrina Kamya, Feng Ren, and Alex Zhavoronkov. Quantum computing for near-term applications in generative chemistry and drug discovery. Drug Discovery Today, page 103675, 2023.
  • [51] Jun John Sakurai and Eugene D Commins. Modern quantum mechanics, revised edition, 1995.
  • [52] Maria Schuld, Ilya Sinayskiy, and Francesco Petruccione. The quest for a quantum neural network. Quantum Information Processing, 13(11):2567–2586, Aug 2014. doi:10.1007/s11128-014-0809-8.
  • [53] Pranab Sen. Random measurement bases, quantum state distinction and applications to the hidden subgroup problem. In CCC, pages 274–287. IEEE Computer Society, 2006.
  • [54] Peter W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput., 26(5):1484–1509, 1997.
  • [55] Adam Smith, MS Kim, Frank Pollmann, and Johannes Knolle. Simulating quantum many-body dynamics on a current digital quantum computer. npj Quantum Information, 5(1):106, 2019.
  • [56] Xiaoming Sun, Guojing Tian, Shuai Yang, Pei Yuan, and Shengyu Zhang. Asymptotically optimal circuit depth for quantum state preparation and general unitary synthesis, 2023. arXiv:2108.06150.
  • [57] Immanuel Trummer and Christoph Koch. Multiple query optimization on the d-wave 2x adiabatic quantum computer. Proc. VLDB Endow., 9(9):648–659, 2016. URL: http://www.vldb.org/pvldb/vol9/p648-trummer.pdf, doi:10.14778/2947618.2947621.
  • [58] James D. Whitfield, Jacob Biamonte, and Alán Aspuru-Guzik. Simulation of electronic structure hamiltonians using quantum computers. Molecular Physics, 109(5):735–750, mar 2011. doi:10.1080/00268976.2011.552441.
  • [59] Tobias Winker, Sven Groppe, Valter Uotila, Zhengtong Yan, Jiaheng Lu, Maja Franz, and Wolfgang Mauerer. Quantum machine learning: Foundation, new techniques, and opportunities for database research. In Sudipto Das, Ippokratis Pandis, K. Selçuk Candan, and Sihem Amer-Yahia, editors, SIGMOD, pages 45–52. ACM, 2023.
  • [60] Xin-Chuan Wu, Sheng Di, Emma Maitreyee Dasgupta, Franck Cappello, Hal Finkel, Yuri Alexeev, and Frederic T. Chong. Full-state quantum circuit simulation by using data compression. In Michela Taufer, Pavan Balaji, and Antonio J. Peña, editors, SC, pages 80:1–80:24. ACM, 2019.
  • [61] Ahmed Younes. Database manipulation on quantum computers, 2007. arXiv:0705.4303.

Appendix A More Preliminaries

A.1 Basics of Quantum Information (A More Formal Approach)

Quantum States and Qubits. We can represent a dd-dimensional pure quantum state as a vector in d\mathbb{C}^{d}. Using the standard Dirac bra-ket notation, we write |ϕ=i=0d1αi|i,\ket{\phi}=\sum_{i=0}^{d-1}\alpha_{i}\ket{i}, where |0,|1,,|d1\ket{0},\ket{1},\ldots,\ket{d-1} is an orthonormal basis in d\mathbb{C}^{d} (referred as the computational basis), and αi\alpha_{i}’s are called amplitudes with the property i=0d1|αi|2=1\sum_{i=0}^{d-1}\absolutevalue{\alpha_{i}}^{2}=1.

Let ϕ|\bra{\phi} denote the conjugate transpose of |ϕ\ket{\phi}, and let ϕ|φ\innerproduct{\phi}{\varphi} and |ϕφ|\outerproduct{\phi}{\varphi} denote the inner product and outer product of vectors |ϕ\ket{\phi} and φ|\bra{\varphi}, respectively.

A qubit is a 22-dimensional quantum state and can be represented as |ϕ=α0|0+α1|1\ket{\phi}=\alpha_{0}\ket{0}+\alpha_{1}\ket{1}, where |α0|2+|α1|2=1\absolutevalue{\alpha_{0}}^{2}+\absolutevalue{\alpha_{1}}^{2}=1. Generally, a nn-qubit state can be represented as

|ϕ=x1xn{0,1}nαx1xn|x1xn,\ket{\phi}=\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\alpha_{x_{1}\cdots x_{n}}\ket{x_{1}\cdots x_{n}},

where {|x1xn}\{\ket{x_{1}\cdots x_{n}}\} are the computational basis states of the nn-qubit system, and it holds that x1xn{0,1}n|αx1xn|2=1\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\absolutevalue{\alpha_{x_{1}\cdots x_{n}}}^{2}=1.

A quantum state is called separable if it can be written as a tensor product of at least two states |ϕ=|ϕ1|ϕ2|ϕk\ket{\phi}=\ket{\phi_{1}}\otimes\ket{\phi_{2}}\otimes\cdots\otimes\ket{\phi_{k}}, which is often abbreviated as |ϕ1|ϕ2|ϕk\ket{\phi_{1}}\ket{\phi_{2}}\cdots\ket{\phi_{k}}, or |ϕ1ϕ2ϕk\ket{\phi_{1}\phi_{2}\cdots\phi_{k}}. Otherwise, the state is called entangled. A classical entangled state is the Bell state |ϕ=|00+|112\ket{\phi}=\frac{\ket{00}+\ket{11}}{\sqrt{2}}.

Quantum Operations

There are two types of quantum operations. The first is called unitary transformation. That is, we apply a unitary operator UU to a quantum state |ϕ\ket{\phi} and get U|ϕU\ket{\phi}.444A unitary operator is a linear operator UU such that UU=UU=IUU^{\dagger}=U^{\dagger}U=I. This type of operation is used to describe the evolution of a closed quantum system. The second type of operations are measurements. Quantum measurements are the interface to obtain classical information about quantum states. Under the POVM (Positive Operator Valued Measures) formalism, a quantum measurement \mathcal{M} is described as a collection of d×dd\times d positive semi-definite operators {Mi}\{M_{i}\} with iMi=I\sum_{i}M_{i}=I; each MiM_{i} is associated with a measurement outcome oio_{i}, which can be chosen by the experimentalist. When performing \mathcal{M} on a quantum state |ϕ\ket{\phi}, the probability of getting the outcome oio_{i} is given by ϕ|Mi|ϕ\expectationvalue{M_{i}}{\phi}. Let (|ϕ)\mathcal{M}(\ket{\phi}) be the probability distribution of the measurement outcomes after applying \mathcal{M} on |ϕ\ket{\phi}.

A projective measurement is a special case of a POVM where MiM_{i}’s are projective operators, i.e., Mi2=MiM_{i}^{2}=M_{i}. An example of a projective measurement is the measurement on the computational basis where Mi=|ii|M_{i}=\outerproduct{i}{i}. Any POVM can be written as a unitary operator UU followed by a projective measurement.

Observables are physical variables that can be measured. An observable is represented by a Hermitian operator MM whose eigenvalues are the set of possible outcomes. The observable spectrally decomposes as M=iλiMiM=\sum_{i}\lambda_{i}M_{i}, where MiM_{i} represents the projector onto the eigenspace of MM associated with the eigenvalue {λi}\{\lambda_{i}\}. The observable MM can be associated with the measurement ={Mi}\mathcal{M}=\{M_{i}\} with outcomes {λi}\{\lambda_{i}\}. The expected value of an observable on a state |ϕ\ket{\phi} is expressed as 𝐄[(|ϕ)]=ϕ|M|ϕ\operatorname{{\mathop{\mathbf{E}}}}[\mathcal{M}(\ket{\phi})]=\expectationvalue{M}{\phi}.

When we say a measurement is performed on a dd-dimensional quantum state |ϕ\ket{\phi} in the computational basis, we mean that the measurement ={M0,M1,,Md1}\mathcal{M}=\{M_{0},M_{1},\ldots,M_{d-1}\} with Mi=|ii|M_{i}=\outerproduct{i}{i} is applied on |ϕ\ket{\phi}. It is noteworthy that the probability of observing the measurement outcome oio_{i}, denoted by ϕ|Mi|ϕ\expectationvalue{M_{i}}{\phi}, is equal to |ϕ|i|2=|αi|2\absolutevalue{\innerproduct{\phi}{i}}^{2}=\absolutevalue{\alpha_{i}}^{2}, wherein αi\alpha_{i} is the ii-th amplitude of the quantum state |ϕ\ket{\phi}.

A crucial property of quantum mechanics is that each measurement would cause a disturbance to the quantum states. If the measurement outcome is oio_{i}, then the post-measurement state of |ϕ\ket{\phi} can be written as

|ϕ=Bi|ϕϕ|BiBi|ϕ,\ket{\phi^{\prime}}=\frac{B_{i}\ket{\phi}}{\sqrt{\expectationvalue{B_{i}^{\dagger}B_{i}}{\phi}}}\ ,

where BiB_{i} satisfies BiBi=MiB_{i}^{\dagger}B_{i}=M_{i}. This particular phenomenon significantly complicates the quantum data management, as the quantum state cannot be entirely “recycled” following a measurement.

Trace Distance

Given two quantum states |ϕ\ket{\phi} and |ψ\ket{\psi}, we define their trace distance to be

D(|ϕ,|ψ)=12|ϕϕ||ψψ|1=1|ψ|ϕ|2.D(\ket{\phi},\ket{\psi})=\frac{1}{2}\norm{\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}}_{1}=\sqrt{1-\absolutevalue{\innerproduct{\psi}{\phi}}^{2}}.

The trace distance is the most widely used distance measure for quantum states in the literature. It also has a nice physical meaning: Let ={Mi}\mathcal{M}=\{M_{i}\} be a POVM, and let piϕ|Mi|ϕp_{i}\triangleq\expectationvalue{M_{i}}{\phi} and qiψ|Mi|ψq_{i}\triangleq\expectationvalue{M_{i}}{\psi}. That is, pip_{i} and qiq_{i} are the probabilities of obtaining measurement outcome oio_{i} on |ϕ\ket{\phi} and |ψ\ket{\psi}, respectively. We have D(|ϕ,|ψ)=maxi|piqi|,D(\ket{\phi},\ket{\psi})=\max_{\mathcal{M}}\sum_{i}\absolutevalue{p_{i}-q_{i}}, which implies that if two quantum states are close in terms of trace distance, then any measurement conducted on these quantum states will yield probability distributions that are close in terms of the total variance distance. In other words, two quantum states that are close in terms of the trace distance are statistically indistinguishable under measurements.

A.2 Some Basic Quantum States and Gates

We list the vector representations of some basic quantum states that we need to use in this paper.

  • |0\ket{0}:

    |0=(10)\ket{0}=\begin{pmatrix}1\\ 0\end{pmatrix}
  • |1\ket{1}:

    |1=(01)\ket{1}=\begin{pmatrix}0\\ 1\end{pmatrix}
  • |+\ket{+}:

    |+=12(11)\ket{+}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ 1\end{pmatrix}
  • |\ket{-}:

    |=12(11)\ket{-}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -1\end{pmatrix}
  • |+i\ket{+i}:

    |+i=12(1i)\ket{+i}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ i\end{pmatrix}
  • |i\ket{-i}:

    |i=12(1i)\ket{-i}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -i\end{pmatrix}

We make use of two basic quantum gates:

  • Hadamard gate

    H=12[1111].H=\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\ 1&-1\end{bmatrix}\ .

    It turns |0\ket{0} to (|0+|1)/2(\ket{0}+\ket{1})/\sqrt{2}, and turns |1\ket{1} to (|0|1)/2(\ket{0}-\ket{1})/\sqrt{2}.

  • Phase gate

    S=[100i].S=\begin{bmatrix}1&0\\ 0&i\end{bmatrix}\ .

    It leaves |0\ket{0} unchanged, and turns |1\ket{1} to i|1i\ket{1}.

  • Another ser of special unitary operators are the Pauli operators defined as:

    I=[1001],X=[0110],Y=[0ii0],Z=[1001].I=\begin{bmatrix}1&0\\ 0&1\end{bmatrix},\quad X=\begin{bmatrix}0&1\\ 1&0\end{bmatrix},\quad Y=\begin{bmatrix}0&-i\\ i&0\end{bmatrix},\quad Z=\begin{bmatrix}1&0\\ 0&-1\end{bmatrix}.

    Here, II is the identity operator, XX represents a bit-flip operation, ZZ represents a phase-flip operation, and YY combines both bit-flip and phase-flip with an imaginary phase factor.

A.3 Clifford Group

In this paper, we make use of a special class of quantum unitary operations called the Clifford group, which is a fundamental construct in quantum information theory. The Clifford group consists of unitary operations that map Pauli operators to other Pauli operators. Furthermore, the Clifford group is known for its efficient classical simulation, as described by the Gottesman-Knill theorem [47]. Any unitary in in this group can be implemented (up to a global phase factor) using a circuit with only Hadamard, Phase, and CNOT gates. The formal definition of the Clifford group is given below.

Definition A.1.

The Clifford group 𝒞n\mathcal{C}_{n} on nn-qubits is defined as the normalizer of the Pauli group 𝒫n\mathcal{P}_{n} under the action of conjugation, that is

𝒞n={UU(2n)|UPU𝒫n,P𝒫n},\mathcal{C}_{n}=\{U\in U(2^{n})\,|\,UPU^{\dagger}\in\mathcal{P}_{n},\,\forall P\in\mathcal{P}_{n}\},

where 𝒫n=iI,Xj,Yj,Zj|j=1,,n\mathcal{P}_{n}=\langle iI,X_{j},Y_{j},Z_{j}\,|\,j=1,\dots,n\rangle is the nn-qubit Pauli group generated by the identity II and the Pauli matrices X,Y,ZX,Y,Z on each qubit, along with the phase factor ii.

A.4 Mathematical Tools

Lemma A.2 (Hoeffding’s inequality).

Let X1,,Xn[0,1]X_{1},\dotsc,X_{n}\in[0,1] be i.i.d. random variables and X=1ni=1nXiX=\frac{1}{n}\sum_{i=1}^{n}X_{i}. Then

Pr[|X𝐄[X]|>t]2exp(2t2n).\Pr[\absolutevalue{X-\mathbf{E}[X]}>t]\leq 2\exp(-2t^{2}n)\,.
Lemma A.3 (Generic Chernoff bound).

For any aa\in\mathbb{R} and random variable XX with moment generating function MX(t):=𝐄[etX]M_{X}(t):=\mathbf{E}[e^{tX}],

𝐏𝐫[Xa]inft>0MX(t)eta.\displaystyle\mathbf{Pr}[X\geq a]\leq\inf_{t>0}M_{X}(t)e^{-ta}.
Lemma A.4 (Berry-Esseen theorem).

Let X1,X2,,XnX_{1},X_{2},...,X_{n} be i.i.d. random variables with 𝐄[Xi]=0,𝐄[Xi2]=σ2<\mathbf{E}[X_{i}]=0,\mathbf{E}\left[X_{i}^{2}\right]=\sigma^{2}<\infty and 𝐄[Xi3]=ρ<\mathbf{E}\left[X_{i}^{3}\right]=\rho<\infty. If Yn:=1σnjXjY_{n}:=\frac{1}{\sigma\sqrt{n}}\sum_{j}X_{j}, then

|𝐏𝐫[Ynx]ϕN(x)|cρσ3n,\absolutevalue{\mathbf{Pr}[Y_{n}\leq x]-\phi_{N}(x)}\leq\frac{c\rho}{\sigma^{3}\sqrt{n}},

where ϕN(x)\phi_{N}(x) is the CDF of N(0,1)N(0,1) and cσ2c\leq\sigma^{2} is a constant.

Appendix B Missing Proofs

B.1 Proof of Theorem 4.3

Measurements Construction

We first describe how to generate random measurements. We start by picking a random basis for d\mathbb{C}^{d} based on the Haar measure, which can be done using a Gaussian ensemble of pure states as is used in [30]: Let xt,yt(t=1,,d)x_{t},y_{t}\ (t=1,\ldots,d) be independent Gaussian random variables with mean zero and variance σ2=12d\sigma^{2}=\frac{1}{2d}, and let g(c1,,cd)dg\triangleq(c_{1},\ldots,c_{d})\in\mathbb{C}^{d} be a random vector where ct=xt+iytc_{t}=x_{t}+iy_{t}. We repeat this process and generate dd Gaussian random vectors (written in the ket notation) |g1,,|gd\ket{g_{1}},\ldots,\ket{g_{d}}.

We next create an orthonormal basis for d\mathbb{C}^{d} using |g1,,|gd\ket{g_{1}},\ldots,\ket{g_{d}}. It is clear that with probability one, |gt\ket{g_{t}}’s are linearly independent, which means that they span d\mathbb{C}^{d}. However, they are not necessarily orthogonal. To address this issue, we use the pretty good measurement technique [35]. Define the operator Γt[d]|gtgt|\Gamma\triangleq\sum_{t\in[d]}\outerproduct{g_{t}}{g_{t}}, and define the vector |γtΓ1/2|gt\ket{\gamma_{t}}\triangleq\Gamma^{-1/2}\ket{g_{t}} for each t[d]t\in[d].

We note that computing Γ1/2\Gamma^{-1/2} may be time expensive. In Section B.1.1, we will discuss a more efficient measurement construction via tt-design, which is a concept in quantum information theory that generalizes the idea of random sampling over the unitary group U(d)U(d) of d×dd\times d unitary matrices.

Claim 1.

The set {|γt:t[d]}\left\{\ket{\gamma_{t}}:t\in[d]\right\} forms an orthonormal basis for d\mathbb{C}^{d}.

Proof B.1.

Observe that

t[d]|γtγt|=Γ1/2(t[d]|gtgt|)Γ1/2=I.\sum_{t\in[d]}\outerproduct{\gamma_{t}}{\gamma_{t}}=\Gamma^{-1/2}\left(\sum_{t\in[d]}\outerproduct{g_{t}}{g_{t}}\right)\Gamma^{-1/2}=I.

Moreover, |γt\ket{\gamma_{t}} are linearly independent since |gt\ket{g_{t}}’s are linearly independent. Hence, |γt\ket{\gamma_{t}}’s are orthonormal.

We also note that the distribution of |γt\ket{\gamma_{t}} is unitary invariant. Hence, it is the Haar measure.

Next, we randomly group |γt\ket{\gamma_{t}}’s into kk groups and form random projection operators as

Πj=[d/k]|γjγj|,j=1,,k.\Pi_{j}=\sum_{\ell\in[d/k]}\outerproduct{\gamma^{j}_{\ell}}{\gamma^{j}_{\ell}},\qquad j=1,\ldots,k. (2)

Let k={Π1,,Πk}\mathcal{M}_{k}=\{\Pi_{1},\cdots,\Pi_{k}\} be the corresponding measurement. Clearly, \mathcal{M} is a valid POVM with probability one.

The Analysis of Distortion

Let k(|ϕ)\mathcal{M}_{k}(\ket{\phi}) and k(|ψ)\mathcal{M}_{k}(\ket{\psi}) be the probability distributions of the measurement outcomes when the states are |ϕ\ket{\phi} and |ψ\ket{\psi}, respectively. Then, the total variation distance between the two probability distributions can be written as

k(|ϕ)k(|ψ)1=j[k]|tr(Πj|ϕϕ|)tr(Πj|ψψ|)|=j[k]|tr(ΠjA)|,\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}\outerproduct{\phi}{\phi}}-\tr{\Pi_{j}\outerproduct{\psi}{\psi}}}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}, (3)

where we have used the Born’s law and set A|ϕϕ||ψψ|A\triangleq\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}.

We next show that AA has two eigenvalues ±D(|ϕ,|ψ)\pm D(\ket{\phi},\ket{\psi}), where D(,)D(\cdot,\cdot) is the trace distance. To this end, suppose |ω\ket{\omega} is an eigenstate of AA, and A|ω=λ|ωA\ket{\omega}=\lambda\ket{\omega}, where λ\lambda\in\mathbb{R} as AA is a Hermitian operator. Multiplying both sides by ϕ|\bra{\phi} gives

ϕ|A|ω=ϕ|ωϕ|ψψ|ω=λϕ|ω.\matrixelement{\phi}{A}{\omega}=\innerproduct{\phi}{\omega}-\innerproduct{\phi}{\psi}\innerproduct{\psi}{\omega}=\lambda\innerproduct{\phi}{\omega}. (4)

Similarly, multiplying both sides by ψ|\bra{\psi} gives

ψ|A|ω=ψ|ϕϕ|ωψ|ω=λψ|ω.\matrixelement{\psi}{A}{\omega}=\innerproduct{\psi}{\phi}\innerproduct{\phi}{\omega}-\innerproduct{\psi}{\omega}=\lambda\innerproduct{\psi}{\omega}. (5)

Combining (4) and (5) gives

(1+λ)ψ|ω=ψ|ϕϕ|ω=ψ|ϕ11λϕ|ψψ|ω.\displaystyle(1+\lambda)\innerproduct{\psi}{\omega}=\innerproduct{\psi}{\phi}\innerproduct{\phi}{\omega}=\innerproduct{\psi}{\phi}\frac{1}{1-\lambda}\innerproduct{\phi}{\psi}\innerproduct{\psi}{\omega}.

Hence, (1λ)(1+λ)=|ϕ|ψ|2(1-\lambda)(1+\lambda)=\absolutevalue{\innerproduct{\phi}{\psi}}^{2}, which implies

λ=±1|ϕ|ψ|2=±D(|ϕ,|ψ).\lambda=\pm\sqrt{1-\absolutevalue{\innerproduct{\phi}{\psi}}^{2}}=\pm D(\ket{\phi},\ket{\psi}). (6)

Now without loss of generality, let |1\ket{1} and |2\ket{2} denote the two eigenstates of AA. Hence,

A=|λ|(|11||22|).A=\absolutevalue{\lambda}(\outerproduct{1}{1}-\outerproduct{2}{2}). (7)

The right-hand side of (3) can be written as

j[k]|tr(ΠjA)|\displaystyle\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}} =\displaystyle= |λ|j[k]|1|Πj|12|Πj|2|\displaystyle\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\expectationvalue{\Pi_{j}}{1}-\expectationvalue{\Pi_{j}}{2}} (8)
=\displaystyle= |λ|j[k]|[d/k]|1|γj|2|2|γj|2|.\displaystyle\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\sum_{\ell\in[d/k]}\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}}. (9)

Let Wjd(|1|γj|2|2|γj|2)W_{\ell}^{j}\triangleq d\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right). Combining (3) and (9), we have

k(|ϕ)k(|ψ)1=|λ|j[k]|[d/k]1dWj|.\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\sum_{\ell\in[d/k]}\frac{1}{d}W_{\ell}^{j}}.

Multiplying both sides of the above equality by dk\sqrt{\frac{d}{k}} gives

dkk(|ϕ)k(|ψ)1\displaystyle\sqrt{\frac{d}{k}}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1} =\displaystyle= |λ|kj[k]|1d/k[d/k]Wj|\displaystyle\frac{\absolutevalue{\lambda}}{k}\sum_{j\in[k]}\absolutevalue{\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}} (10)
=(6)\displaystyle\stackrel{{\scriptstyle\eqref{eq:f-1}}}{{=}} D(|ϕ,|ψ)kj[k]|1d/k[d/k]Wj|.\displaystyle\frac{D(\ket{\phi},\ket{\psi})}{k}\sum_{j\in[k]}\absolutevalue{\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}}.

We try to analyze the expectation and variance of each WjW_{\ell}^{j}. First, note that 𝐄[Wj]=0\mathbf{E}\left[W_{\ell}^{j}\right]=0, because the distribution of |γj\ket{\gamma^{j}_{\ell}} is unitary invariant, which implies that |1|γj|2\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2} and |2|γj|2\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2} have an identical distribution.

The variance of WjW_{\ell}^{j} equals to

σW2\displaystyle\sigma_{W}^{2} \displaystyle\triangleq 𝐕𝐚𝐫[Wj]=𝐄[|Wj|2]\displaystyle\mathbf{Var}\left[W_{\ell}^{j}\right]=\mathbf{E}\left[|W_{\ell}^{j}|^{2}\right] (11)
=\displaystyle= d2𝐄[(|1|γj|2|2|γj|2)2]\displaystyle{d^{2}}\mathbf{E}\left[\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)^{2}\right]
=\displaystyle= 2d2(𝐄[|1|γ|4](𝐄[|1|γ|2|2|γ|2])),\displaystyle{2}{d^{2}}\left(\mathbf{E}\left[\absolutevalue{\innerproduct{1}{\gamma}}^{4}\right]-\left(\mathbf{E}\left[\absolutevalue{\innerproduct{1}{\gamma}}^{2}\absolutevalue{\innerproduct{2}{\gamma}}^{2}\right]\right)\right),\quad

where |γ\ket{\gamma} is a random pure state generated based on the Haar measure. Since the Haar measure is invariant under Unitary transformation, 1|γ\innerproduct{1}{\gamma} and 2|γ\innerproduct{2}{\gamma} have the same joint distribution as U11U_{11} and U21U_{21}, where UU is a random unitary matrix (a Haar unitary) and UijU_{ij} refers to the entry on the iiths row and jjth column of UU. Note that all entries UijU_{ij} of a Haar unitary UU are identically distributed [34]. Moreover, they can be written as Uij=reiθU_{ij}=re^{i\theta} with the distribution given by d1π(1r2)d2rrθ\frac{d-1}{\pi}(1-r^{2})^{d-2}r\partial r\partial\theta where r[0,1]r\in[0,1] and θ[0,2π]\theta\in[0,2\pi]. Therefore, the distribution of |U11|2|U_{11}|^{2} is given by (d1)(1r)d2r(d-1)(1-r)^{d-2}\partial r. Consequently, from (11) we have

σW2=2d2(𝐄[|U11|4]𝐄[|U11|2|U21|2]).\sigma_{W}^{2}={2}{d^{2}}\left(\mathbf{E}\left[|U_{11}|^{4}\right]-\mathbf{E}\left[|U_{11}|^{2}|U_{21}|^{2}\right]\right). (12)

The first expectation in (12) calculates as

𝐄[|U11|4]=(d1)01r2(1r)d2r=(d1)(3,d1),\mathbf{E}[|U_{11}|^{4}]=(d-1)\int_{0}^{1}r^{2}(1-r)^{d-2}\partial r=(d-1)\mathcal{B}(3,d-1),

where (,)\mathcal{B}(\cdot,\cdot) is the Beta function that is defined as

(α,β)=01rα1(1r)β1r.\mathcal{B}(\alpha,\beta)=\int_{0}^{1}r^{\alpha-1}(1-r)^{\beta-1}\partial r.

The Beta function at positive integers can be calculated combinatorically as

(m,n)=(m+n)/(mn)(m+nm).\mathcal{B}(m,n)=\frac{(m+n)/(mn)}{\binom{m+n}{m}}.

Therefore,

𝐄[|U11|4]=(d1)(d+2)/(3(d1))(d+23)=2d(d+1).\mathbf{E}\left[|U_{11}|^{4}\right]=(d-1)\frac{(d+2)/(3(d-1))}{\binom{d+2}{3}}=\frac{2}{d(d+1)}.

Similarly, the second expectation in (12) equals to

𝐄[|U11|2|U21|2]=1d(d+1).\mathbf{E}\left[|U_{11}|^{2}|U_{21}|^{2}\right]=\frac{1}{d(d+1)}.

Consequently, (12) reduces to

σW2=2d2(2d(d+1)1d(d+1))=2dd+1.\sigma_{W}^{2}={2}{d^{2}}\left(\frac{2}{d(d+1)}-\frac{1}{d(d+1)}\right)=\frac{2d}{d+1}. (13)

Implying that σW22\sigma_{W}^{2}\leq 2. Now we continue our investigation of Equality (10). Let

Zj1d/k[d/k]Wj.Z^{j}\triangleq\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}.

(10) simplifies to

Qdkk(|ϕ)k(|ψ)1=D(|ϕ,|ψ)1kj[k]|Zj|.Q\triangleq\sqrt{\frac{d}{k}}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=D(\ket{\phi},\ket{\psi})\cdot\frac{1}{k}\sum_{j\in[k]}|Z^{j}|. (14)

Let

UQ𝐄[Q]D(|ϕ,|ψ)=1kj[k](|Zj|𝐄[|Zj|]).U\triangleq\frac{Q-\mathbf{E}[Q]}{D(\ket{\phi},\ket{\psi})}=\frac{1}{k}\sum_{j\in[k]}\left(|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]\right). (15)

Applying the generic Chernoff bound (Lemma A.3) to UU gives

𝐏𝐫[Uη]inft>0MU(t)etη,\mathbf{Pr}[U\geq\eta]\leq\inf_{t>0}M_{U}(t)e^{-t\eta}, (16)

where MU(t)=𝐄[etU]M_{U}(t)=\mathbf{E}\left[e^{tU}\right] is the moment generating function.

Let t0ηk/2t_{0}\triangleq\eta k/2. We use t=t0t=t_{0} to upper bound the RHS of (16). Since ZjZ^{j}’s are i.i.d., we have

MU(t0)=j[k]𝐄[exp(t0k(|Zj|𝐄[|Zj|]))]=(MU~(t0k))k=(MU~(η2))k,M_{U}(t_{0})=\prod_{j\in[k]}\mathbf{E}\left[\exp\left({\frac{t_{0}}{k}\left(|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]\right)}\right)\right]=\left(M_{\tilde{U}}\left(\frac{t_{0}}{k}\right)\right)^{k}=\left(M_{\tilde{U}}\left(\frac{\eta}{2}\right)\right)^{k}, (17)

where U~:=|Zj|𝐄[|Zj|]\tilde{U}:=|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]. Apply Taylor’s expansion on MU~(η2)M_{\tilde{U}}\left(\frac{\eta}{2}\right) around η=0\eta=0, we have

MU~(η2)=MU~(0)+MU~(0)η2+MU~′′(0)12(η2)2+ζη,M_{\tilde{U}}\left(\frac{\eta}{2}\right)=M_{\tilde{U}}(0)+M^{\prime}_{\tilde{U}}(0)\cdot\frac{\eta}{2}+M^{{}^{\prime\prime}}_{\tilde{U}}(0)\cdot\frac{1}{2}\left(\frac{\eta}{2}\right)^{2}+\zeta_{\eta}, (18)

where ζηcηη3\zeta_{\eta}\leq c_{\eta}\eta^{3} (for a universal constant cη>0c_{\eta}>0) is the remainder term.

It is clear that MU~(0)=1M_{\tilde{U}}(0)=1. By the definition of the moment generating function, MU~(0)=𝐄[U~]=0M^{\prime}_{\tilde{U}}(0)=\mathbf{E}\left[\tilde{U}\right]=0 and

MU~′′(0)=𝐕𝐚𝐫[U~]=𝐕𝐚𝐫[|Zj|]𝐕𝐚𝐫[Zj]=𝐕𝐚𝐫[Wj]=σW2,M^{{}^{\prime\prime}}_{\tilde{U}}(0)=\mathbf{Var}\left[\tilde{U}\right]=\mathbf{Var}\left[|Z^{j}|\right]\leq\mathbf{Var}\left[Z^{j}\right]=\mathbf{Var}\left[W_{\ell}^{j}\right]=\sigma^{2}_{W}\ ,

(18) simplifies to the following:

MU~(η2)1+σW22(η2)2+cηη31+(η2)2+cηη3,M_{\tilde{U}}\left(\frac{\eta}{2}\right)\leq 1+\frac{\sigma^{2}_{W}}{2}\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3}\leq 1+\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3},

where we have used the fact that σW22\sigma_{W}^{2}\leq 2 (see (13)).

In the rest of the analysis, we will focus on the parameter η18cη\eta\leq\frac{1}{8c_{\eta}}; the actual value of η\eta will be determined later.

Hence, by (17) we have

MU(t0)(1+(η2)2+cηη3)kexp(η2k4+cηη3k),M_{U}(t_{0})\leq\left(1+\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3}\right)^{k}\leq\exp\left(\frac{\eta^{2}k}{4}+c_{\eta}\eta^{3}k\right), (19)

where in the second inequality we have used the fact (1+xk)kex\left(1+\frac{x}{k}\right)^{k}\leq e^{x}.

Plugging (19) to (16), we have

𝐏𝐫(Uη)exp(η2k4+cηη3k)exp(η2k2)=exp(Ω(η2k)).\mathbf{Pr}(U\geq\eta)\leq\exp\left(\frac{\eta^{2}k}{4}+c_{\eta}\eta^{3}k\right)\cdot\exp\left(-\frac{\eta^{2}k}{2}\right)=\exp\left(-\Omega(\eta^{2}k)\right).

where we have used cηη3kη2k8c_{\eta}\eta^{3}k\leq\frac{\eta^{2}k}{8} since η18cη\eta\leq\frac{1}{8c_{\eta}}.

For the other direction, we have

𝐏𝐫[Uη]=𝐏𝐫[Uη]MU(t0)et0ηexp(Ω(η2k)).\mathbf{Pr}\left[U\leq-\eta\right]=\mathbf{Pr}\left[-U\geq\eta\right]\leq M_{U}(-t_{0})e^{-t_{0}\eta}\leq\exp\left(-\Omega(\eta^{2}k)\right). (20)

Hence, for any δ(0,1)\delta\in(0,1), setting k=ck(1η2log1δ)k=c_{k}\left(\frac{1}{\eta^{2}}\log\frac{1}{\delta}\right) for a sufficiently large constant ck>0c_{k}>0, we have that |U|η|U|\leq\eta with probability at least (1δ)(1-\delta). Combining (20), (15), and (14), we have that with probability at least (1δ)(1-\delta),

|dkk(|ϕ)k(|ψ)1D(|ϕ,|ψ)𝐄[|Z|]|η,\absolutevalue{\sqrt{\frac{d}{k}}\cdot\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}}{D(\ket{\phi},\ket{\psi})}-\mathbf{E}[\absolutevalue{Z}]}\leq\eta, (21)

where ZZ is distributed identically as ZjZ^{j}’s.

It remains to investigate 𝐄[|Z|]\mathbf{E}[|Z|]. We show that 𝐄[|Z|]=Θ(1)\mathbf{E}[|Z|]=\Theta(1). By the definition we know that

𝐄[Z]=1d/k[d/k]𝐄[W1]=0.\mathbf{E}[Z]=\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}\mathbf{E}[W_{\ell}^{1}]=0.

We start with the upper bound.

𝐄[|Z|]𝐄[|Z|2]=𝐄[Z2]=𝐕𝐚𝐫[Z]=σW2.\mathbf{E}[|Z|]\leq\sqrt{\mathbf{E}[|Z|^{2}]}=\sqrt{\mathbf{E}[Z^{2}]}=\sqrt{\mathbf{Var}[Z]}=\sigma_{W}\leq\sqrt{2}. (22)

For the lower bound, Markov inequality implies that

𝐄[|Z|]a𝐏𝐫[|Z|>a],\displaystyle\mathbf{E}\left[|Z|\right]\geq a\mathbf{Pr}\left[|Z|>a\right],

for any a>0a>0. Since ZZ is distributed identically as sum of i.i.d. random variables {W1}[d/k]\{W_{\ell}^{1}\}_{\ell\in[d/k]} with E[W1]=0E[W_{\ell}^{1}]=0, E[(W1)2]=σW2E[(W_{\ell}^{1})^{2}]=\sigma_{W}^{2}, and E[(W1)3]<+E[(W_{\ell}^{1})^{3}]<+\infty, by the Berry-Esseen theorem (Lemma A.4) we get

𝐏𝐫[1σW|Z|>x]𝐏𝐫[|N(0,1)|x]O(1d/k)=(1erf(x2))O(1d/k).\mathbf{Pr}\left[\frac{1}{\sigma_{W}}|Z|>x\right]\geq\mathbf{Pr}[|N(0,1)|\geq x]-O\left(\frac{1}{\sqrt{d/k}}\right)=\left(1-\erf\left(\frac{x}{\sqrt{2}}\right)\right)-O\left(\frac{1}{\sqrt{d/k}}\right).

Hence,

𝐄[|Z|]supx>0σWx(1erf(x2))O(1d/k).\mathbf{E}\left[|Z|\right]\geq\sup_{x>0}\sigma_{W}x\left(1-\erf\left(\frac{x}{\sqrt{2}}\right)\right)-O\left(\frac{1}{\sqrt{d/k}}\right). (23)

Recall that σW2\sigma_{W}\approx\sqrt{2} (see (13)). By a numerical computation, the maximum of the RHS of (23) attends at x0.7475x\approx 0.7475, which gives

𝐄[|Z|]0.4807O(1d/k)0.48,\mathbf{E}\left[|Z|\right]\geq 0.4807-O\left(\frac{1}{\sqrt{d/k}}\right)\geq 0.48, (24)

as long as d/kd/k is a sufficiently large constant.

By (22) and (24), we have μZ𝐄[|Z|]=Θ(1)\mu_{Z}\triangleq\mathbf{E}[\absolutevalue{Z}]=\Theta(1). Now, for any given constant ι>0\iota>0, we set the η=min{ιμZ,18cη}\eta=\min\left\{\iota\mu_{Z},\frac{1}{8c_{\eta}}\right\}. (21) gives

|dk1μZk(|ϕ)k(|ψ)1D(|ϕ,|ψ)1|ι.\absolutevalue{\sqrt{\frac{d}{k}}\frac{1}{\mu_{Z}}\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}}{D(\ket{\phi},\ket{\psi})}-1}\leq\iota.

B.1.1 Efficient Implementations of Random Measurements

Lastly, we address the space and time complexity of the sketching procedure in this proof. Specifically, we consider efficient construction of k\mathcal{M}_{k}. Note that the runtime of the original measurement construction based on pretty good measurements is polynomial in dd, hence exponential in the number of qubits. This is because it relies on generating dd kets |γj\ket{\gamma_{\ell}^{j}} based on Haar measure and via complex Gaussian vectors in d\mathbb{C}^{d} and the inverse-square root of the matrix Γ\Gamma. Generally, the classical time and the circuit gate complexity for sampling from Haar distribution is exponential. In what follows, we present a construction for k\mathcal{M}_{k} with polylog(d)\text{poly}\log(d) time and space.

Our approach is based on 2-design methods. A unitary tt-design is a concept in quantum information theory that generalizes the idea of random sampling over the unitary group U(d)U(d) of d×dd\times d unitary matrices. It provides a way to approximate certain statistical properties of quantum states or operations without needing to sample from the entire group, which can be computationally expensive. Below we highlight basics of this concept.

Let Pt,t(U)P_{t,t}(U) denote a polynomial which is homogeneous with degree at most tt in the matrix elements of UU, and at most degree tt in the complex conjugates of these elements.

Definition B.2.

A unitary tt-design is a finite set of unitary matrices {U(i)}i=1N\{U^{(i)}\}_{i=1}^{N} such that for any homogeneous polynomial Pt,tP_{t,t}

1Ni=1NPt,t(U(i))=Pt,t𝑑μHaar(U)\frac{1}{N}\sum_{i=1}^{N}P_{t,t}(U^{(i)})=\int P_{t,t}\,d\mu_{\text{Haar}}(U)

where μHaar\mu_{\text{Haar}} is the Haar measure on the space of d×dd\times d unitary matrices.

Intuitively, this definition implies that a unitary tt-design is indistinguishable from Haar measure when only polynomials of degree at most tt are used. We show that a 2-design is sufficient to obtain Theorem 4.3.

Note that the analysis of sample complexity in our proof relies on the generic Chernoff bound which, per (18), depends on the first two moments of

Wj=d(|1|γj|2|2|γj|2).W_{\ell}^{j}=d\left(\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma_{\ell}^{j}}}^{2}\right).

We show that WjW_{\ell}^{j} can be written as a polynomial P1,1P_{1,1} of degree t=1t=1 in the elements of a unitary matrix UU. Let UU be a matrix with columns being the vectors of |γj\ket{\gamma_{\ell}^{j}}. That is r|γj\innerproduct{r}{\gamma_{\ell}^{j}} is the element of UU, denoted by Ur,(,j)U_{r,(\ell,j)}, located at the row rr and the column indexed by (,j)(\ell,j). With this definition, we can write |1|γj|2=U1,(,j)U1,(,j)\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{2}=U_{1,(\ell,j)}U_{1,(\ell,j)}^{*}, implying that Wj=P1,1(U)W_{\ell}^{j}=P_{1,1}(U). Therefore, the first moment of WjW_{\ell}^{j} is a polynomial P1,1P_{1,1} of degree t=1t=1 in UU. Moreover, the second moment of WjW_{\ell}^{j} is a polynomial P2,2P_{2,2} of degree t=2t=2 in UU. This is because based on (11), the second moment is a function of |1|γj|4\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{4}, which is written as U1,(,j)2(U1,(,j))2U_{1,(\ell,j)}^{2}(U_{1,(\ell,j)}^{*})^{2}. Hence, as long as the first and the second moments of WjW_{\ell}^{j} resemble the Haar measure, the proof remains valid, indicating that a 2-design is sufficient.

Based on the above argument, instead of using the pretty good measurement construction, we sample from 22-design unitaries. This can be implemented efficiently using the Clifford group, a specialized class of quantum operators (for more details see Section A.3). It is well-known that the Clifford group is a 2-design [14]. Furthermore, there exists an algorithm that samples uniformly from the Clifford group in classical time O(n8)O(n^{8}) and outputs a circuit representing the measurement with a gate complexity of O(n2)O(n^{2}), where n=logdn=\log d is the number of qubits [16]. To construct the desired measurement k\mathcal{M}_{k}, we first randomly generate a Clifford circuit, apply it to the input quantum state, and then measure in the computational basis. The outcome of the measurement is a binary string in {0,1}n\{0,1\}^{n}. Lastly, we design a binning function ff that takes the nn-bit string and outputs an index in [k][k]. This is achieved by partitioning the set {0,1}n\{0,1\}^{n} into kk equal-size bins, which can be efficiently implemented using a decision tree that only reads the first logk\log k bits of the binary measurement outcomes. There are kk possible outputs of the decision tree, meaning that it can be represented by a function f:{0,1}n[k]f:\{0,1\}^{n}\rightarrow[k] that determines the bin index for any binary string of length nn. This construction is summarized as Algorithm 1.

Input: k,n,|ϕk,n,\ket{\phi}
1 Sample from the Clifford group on nn-qubits and construct the random circuit UU.
2 Apply UU on the input state |ϕ\ket{\phi}
3 Measure the first logk\lceil\log k\rceil qubits along the computational basis.
4 return mod-kk of the decimal representation of the resulted binary string.
Algorithm 1 kk-sketching measurement

In conclusion, the above construction generates a random measurement k\mathcal{M}_{k} with O(n2)O(n^{2}) quantum gates and O(n8)O(n^{8}) classical time. Moreover, this measurement enjoys the same bound on the sample complexity as for the original construction based on the Haar measure.

B.2 Proof of Theorem 4.6

The proof for Theorem 4.6 (the 2\ell_{2}-norm case) is similar to that for Theorem 4.3 (the 1\ell_{1}-norm case), but the calculation will be different due to different distance functions. Let k={Π1,,Πk}\mathcal{M}_{k}=\{\Pi_{1},\ldots,\Pi_{k}\} be the same random POVM generated as that in the proof of Theorem 4.3.

The 2\ell_{2} distance between the output probability vectors can be written as

k(|ϕ)k(|ψ)22=j[k]|tr(Πj|ϕϕ|)tr(Πj|ψψ|)|2=j[k]|tr(ΠjA)|2,\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}\outerproduct{\phi}{\phi}}-\tr{\Pi_{j}\outerproduct{\psi}{\psi}}}^{2}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}^{2}, (25)

where A=|ϕϕ||ψψ|A=\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}, which can be rewritten as A=|λ|(|11||22|)A=|\lambda|(\outerproduct{1}{1}-\outerproduct{2}{2}), where |1\ket{1} and |2\ket{2} are two eigenstates of AA. Hence, we can rewrite (25) as

k(|ϕ)k(|ψ)22\displaystyle\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2} =\displaystyle= j[k]|tr(ΠjA)|2\displaystyle\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}^{2} (26)
=\displaystyle= |λ|2j[k]|1|Πj|12|Πj|2|2\displaystyle\absolutevalue{\lambda}^{2}\sum_{j\in[k]}\absolutevalue{\expectationvalue{\Pi_{j}}{1}-\expectationvalue{\Pi_{j}}{2}}^{2}
=\displaystyle= |λ|2j[k]([d/k]|1|γ(j)|2|2|γ(j)|2)2.\displaystyle\absolutevalue{\lambda}^{2}\sum_{j\in[k]}\left(\sum_{\ell\in[d/k]}\absolutevalue{\innerproduct{1}{\gamma^{(j)}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{(j)}_{\ell}}}^{2}\right)^{2}.

Multiplying both sides of (26) by a factor of dd, and letting Wjd(|1|γj|2|2|γj|2)W_{\ell}^{j}\triangleq d\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right), we have

dk(|ϕ)k(|ψ)22=|λ|21kj[k](1d/k[d/k]Wj)2.d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}=\absolutevalue{\lambda}^{2}\frac{1}{k}\sum_{j\in[k]}\left(\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}\right)^{2}. (27)

Let

Zj1d/k[d/k]Wj.Z^{j}\triangleq\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}. (28)

By a similar analysis as that in the proof of Theorem 4.3 (particularly, recall (6) and (13)), the expectation of the RHS of (27) is calculated to be

𝐄[|λ|21kj[k]|Zj|2]=|λ|2σW2=D2(|ϕ,|ψ)2dd+1.\mathbf{E}\left[\absolutevalue{\lambda}^{2}\frac{1}{k}\sum_{j\in[k]}|Z^{j}|^{2}\right]=\absolutevalue{\lambda}^{2}\sigma_{W}^{2}=D^{2}(\ket{\phi},\ket{\psi})\cdot\frac{2d}{d+1}. (29)

We next analyze the variance of the RHS of (27).

We write

V1|λ|2(dk(|ϕ)k(|ψ)22|λ|2σW2)=1kj[k](|Zj|2σW2).V\triangleq\frac{1}{|\lambda|^{2}}\left(d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}^{2}-|\lambda|^{2}\sigma_{W}^{2}\right)=\frac{1}{k}\sum_{j\in[k]}\left(|Z^{j}|^{2}-\sigma_{W}^{2}\right). (30)

By the generic Chernoff bound (Lemma A.3), we have

𝐏𝐫(Vη)inft>0MV(t)etη,\mathbf{Pr}(V\geq\eta)\leq\inf_{t>0}M_{V}(t)e^{-t\eta}, (31)

where MV(t)=𝐄[etV]M_{V}(t)=\mathbf{E}\left[e^{tV}\right]. Set t0=ctηkt_{0}=c_{t}\eta k for a constant ctc_{t} to be determined later . We use t=t0t=t_{0} to upper bound the RHS of (31). Since ZjZ^{j}’s are i.i.d., we have

MV(t0)=j[k]𝐄[exp(t0k(|Zj|2σW2))]=(MV~(t0k))k=(MV~(ctη))k,M_{V}(t_{0})=\prod_{j\in[k]}\mathbf{E}\left[\exp\left({\frac{t_{0}}{k}\left(|Z^{j}|^{2}-\sigma_{W}^{2}\right)}\right)\right]=\left(M_{\tilde{V}}\left(\frac{t_{0}}{k}\right)\right)^{k}=\left(M_{\tilde{V}}\left(c_{t}\eta\right)\right)^{k}, (32)

where V~=|Zj|2σW2\tilde{V}=|Z^{j}|^{2}-\sigma_{W}^{2}.

Apply Taylor’s expansion on MV~(ctη)M_{\tilde{V}}\left(c_{t}\eta\right) around η=0\eta=0, we get

MV~(ctη)=MV~(0)+MV~(0)ctη+MV~′′(0)12(ctη)2+ζη,M_{\tilde{V}}\left(c_{t}\eta\right)=M_{\tilde{V}}(0)+M^{\prime}_{\tilde{V}}(0)\cdot c_{t}\eta+M^{{}^{\prime\prime}}_{\tilde{V}}(0)\cdot\frac{1}{2}\left(c_{t}\eta\right)^{2}+\zeta_{\eta}, (33)

where ζηcηη3\zeta_{\eta}\leq c_{\eta}\eta^{3} (for a universal constant cη>0c_{\eta}>0) is the remainder term. We again have MV~(0)=1M_{\tilde{V}}(0)=1, MV~(0)=𝐄[V~]=0M^{\prime}_{\tilde{V}}(0)=\mathbf{E}\left[\tilde{V}\right]=0, and MV~′′(0)=𝐕𝐚𝐫[V~]cV~M^{{}^{\prime\prime}}_{\tilde{V}}(0)=\mathbf{Var}\left[\tilde{V}\right]\leq c_{\tilde{V}} for a universal constant cV~>0c_{\tilde{V}}>0.

In the rest of the analysis, we will focus on parameter ηcV~ct22cη\eta\leq\frac{c_{\tilde{V}}c_{t}^{2}}{2c_{\eta}}; the actual value of η\eta will be determined later.

We extend (33) as

MV~(ctη)1+cV~2(ctη)2+cηη31+cV~ct2η2.M_{\tilde{V}}\left(c_{t}\eta\right)\leq 1+\frac{c_{\tilde{V}}}{2}\left(c_{t}\eta\right)^{2}+c_{\eta}\eta^{3}\leq 1+c_{\tilde{V}}c_{t}^{2}\eta^{2}. (34)

Plugging (34) to (32), we have

MV(t0)=(MV~(ctη))k(1+cV~ct2η2)kexp(cV~ct2kη2).M_{V}(t_{0})=\left(M_{\tilde{V}}\left(c_{t}\eta\right)\right)^{k}\leq\left(1+c_{\tilde{V}}c_{t}^{2}\eta^{2}\right)^{k}\leq\exp\left(c_{\tilde{V}}c_{t}^{2}k\eta^{2}\right). (35)

Plugging (35) to (31), we have

𝐏𝐫(Vη)MV(t0)et0ηexp(ct2cV~kη2)exp(ctη2k)=exp(Ω(η2k)),\mathbf{Pr}(V\geq\eta)\leq M_{V}(t_{0})e^{-t_{0}\eta}\leq\exp\left(c_{t}^{2}c_{\tilde{V}}k\eta^{2}\right)\exp\left(-c_{t}\eta^{2}k\right)=\exp\left(-\Omega(\eta^{2}k)\right),

where for the last equality to hold, we set constant ct=12cV~c_{t}=\frac{1}{2c_{\tilde{V}}}.

For the other direction, we have

𝐏𝐫[Vη]=𝐏𝐫[Vη]MV(t0)et0ηexp(Ω(η2k)).\mathbf{Pr}\left[V\leq-\eta\right]=\mathbf{Pr}\left[-V\geq\eta\right]\leq M_{V}(-t_{0})e^{-t_{0}\eta}\leq\exp\left(-\Omega(\eta^{2}k)\right).

Hence, for any δ(0,1)\delta\in(0,1), setting k=ck(1η2log1δ)k=c_{k}\left(\frac{1}{\eta^{2}}\log\frac{1}{\delta}\right) for a sufficiently large constant ck>0c_{k}>0, we have that |V|η|V|\leq\eta with probability at least (1δ)(1-\delta). This, together with (27), (28) (29), and (30), we obtain

|dk(|ϕ)k(|ψ)22D2(|ϕ,|ψ)2dd+1|η,\absolutevalue{\frac{d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}}{D^{2}(\ket{\phi},\ket{\psi})}-\frac{2d}{d+1}}\leq\eta,

which implies

2ηo(1)dk(|ϕ)k(|ψ)2D(|ϕ,|ψ)2+η.\sqrt{2-\eta-o(1)}\leq\frac{\sqrt{d}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}}{D(\ket{\phi},\ket{\psi})}\leq\sqrt{2+\eta}. (36)

Now, for any constant ι>0\iota>0, we set η={ι,cV~ct22cη}\eta=\left\{\iota,\frac{c_{\tilde{V}}c_{t}^{2}}{2c_{\eta}}\right\}. (36) gives

|d2k(|ϕ)k(|ψ)2D(|ϕ,|ψ)1|ι.\absolutevalue{\sqrt{\frac{d}{2}}\cdot\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}}{D(\ket{\phi},\ket{\psi})}-1}\leq\iota.

B.3 Proof of Theorem 4.10

For the first part, we use the same procedure is CST to create the N×nN\times n seed matrix A(ϕ)A({\phi}), where each row i[N]i\in[N] consists of the nn pairs {bi,j,𝚒𝚗𝚍𝚎𝚡(Ui,j))}j=1n\{b_{i,j},\mathtt{index}(U_{i,j}))\}_{j=1}^{n}.

Next, for the query algorithm, we propose an encoding to turn the classical shadows into quantum states. This is a deviation from CST as we push the computations back to quantum via a QCQC approach. Let QQ be the index of the relevant qubits of a local observable MM in the query phase. For each jQj\in Q, we create a random binary pair (ci,j,wi,j)(c_{i,j},w_{i,j}) using the stored bit bi,jb_{i,j} as follows:

(ci,j,wi,j)={(bi,j,1)w.pr. 2/3,(1bi,j,1)w.pr. 1/3.(c_{i,j},w_{i,j})=\begin{cases}(b_{i,j},1)&\text{w.pr.\ }2/3\ ,\\ (1-b_{i,j},-1)&\text{w.pr.\ }1/3\ .\end{cases} (37)

We then prepare a qubit |ci,j\ket{c_{i,j}} and apply the corresponding operator Ui,jU_{i,j} to create qubit |vi,j=Ui,j|ci,j\ket{v_{i,j}}=U_{i,j}\ket{c_{i,j}}. Note that |vi,j\ket{v_{i,j}} takes one of the following states:

|0,|1,|+,|,|+i,|i,\ket{0},\ket{1},\ket{+},\ket{-},\ket{+i},\ket{-i},

that are easy to prepare (see Appendix A.2 for their vector representations). Then, we prepare the |0\ket{0} for the rest of qubits not indexed in QQ. Let

|vi,j={|vi,j,if jQ|0,otherwise.\ket{v^{\prime}_{i,j}}=\begin{cases}\ket{v_{i,j}},&\text{if\ }j\in Q\\ \ket{0},&\text{otherwise}.\end{cases} (38)

We construct the ii-th shadow sample as (written as the outer-product form)

|ϕ~i=j=1n|vi,j.\ket{\tilde{\phi}_{i}}=\bigotimes_{j=1}^{n}\ket{v^{\prime}_{i,j}}. (39)

We next measure each |ϕ~i\ket{\tilde{\phi}_{i}} using the observable MM, and obtain an outcome xix_{i}. Let

Si=3kxijQwi,j.S_{i}=3^{k}x_{i}\prod_{j\in Q}w_{i,j}. (40)

Then, our estimator is the empirical average

T=1NiSi.T=\frac{1}{N}\sum_{i}S_{i}.

We now show that when N9kM2log(1/δ)ε2N\geq 9^{k}\norm{M}_{\infty}^{2}\frac{\log(1/\delta)}{\varepsilon^{2}}, then TT approximates ϕ|M|ϕ\expectationvalue{M}{\phi} up to an additive error ε\varepsilon with probability (1δ)(1-\delta), proving the correctness part of Theorem 4.10. We will also give the query time analysis. Recall that the space needed for storing the seed matrix is O(Nn)O(Nn) classical bits.

In the rest of the proof, we will focus on the random variable SSiS\triangleq S_{i} for particular i[N]i\in[N]. Recall that the final approximation TT is the average of NN i.i.d. copies of SS.

Let XiX_{i}, WjW_{j}, Vi,jV_{i,j}, Bi,jB_{i,j} be the corresponding random variables of xix_{i}, wi,jw_{i,j}, vi,jv_{i,j}, bi,jb_{i,j}, respectively. Since we focus on a particular i[N]i\in[N], we will omit all subscripts ii in those random variables and write them as XX, WjW_{j}, VjV_{j}, BjB_{j}. We also write |ϕ~i\ket{\tilde{\phi}_{i}} as |ϕ~\ket{\tilde{\phi}}.

The following result can be inferred from [32]. We include a proof for completeness.

Lemma B.3.

For any i[N]i\in[N], let |φ~iφ~i|=j=1n|VjVj|\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}=\bigotimes_{j=1}^{n}\outerproduct{V_{j}}{V_{j}}. We have 𝐄[3nW|φ~iφ~i|]=|ϕϕ|\mathbf{E}\left[3^{n}W\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}\right]=\outerproduct{\phi}{\phi}, where W=j=1nWjW=\prod_{j=1}^{n}W_{j}.

Proof B.4.

Let Γ0\Gamma_{0} denote the shadow channel for Pauli measurements as defined in [37], which is given by

Γ0[O]U{I,H,SH}b{0,1}13b|UOU|bU|bb|U,\Gamma_{0}[O]\triangleq\sum_{U\in\{I,H,S^{\dagger}H\}}\sum_{b\in\{0,1\}}\frac{1}{3}\matrixelement{b}{U^{\dagger}OU}{b}\leavevmode\nobreak\ U\outerproduct{b}{b}U^{\dagger},

for any single qubit operator OO. By direct calculation, we have for a generic state |ψ=a0|0+a1|1\ket{\psi}=a_{0}\ket{0}+a_{1}\ket{1},

Γ01[|ψψ|]=[2|a0|2|a1|23a0a13a0a12|a1|2|a0|2].\Gamma_{0}^{-1}[\outerproduct{\psi}{\psi}]=\begin{bmatrix}2|a_{0}|^{2}-|a_{1}|^{2}&3a_{0}a_{1}^{*}\\ 3a_{0}^{*}a_{1}&2|a_{1}|^{2}-|a_{0}|^{2}\end{bmatrix}.

It is known that Γ0\Gamma_{0} has an inverse as it is a linear mapping. Applying Γ01\Gamma_{0}^{-1} on |Bj\ket{B_{j}}, we have, again by direct calculation, that

Γ01[|BjBj|]=2|BjBj||1Bj1Bj|.\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right]=2\outerproduct{B_{j}}{B_{j}}-\outerproduct{1-{B_{j}}}{1-{B_{j}}}.

By (37), taking the expectation of Wj|CjCj|W_{j}\outerproduct{C_{j}}{C_{j}} gives

𝐄[Wj|CjCj|]=23|BjBj|13|1Bj1Bj|=13Γ01[|BjBj|].\mathbf{E}\Big{[}W_{j}\outerproduct{C_{j}}{C_{j}}\Big{]}=\frac{2}{3}\outerproduct{B_{j}}{B_{j}}-\frac{1}{3}\outerproduct{1-B_{j}}{1-B_{j}}=\frac{1}{3}\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right].

Since |Vj=Uj|Cj\ket{V_{j}}=U_{j}\ket{C_{j}}, we get

𝐄[3Wj|VjVj|]=UjΓ01[|BjBj|]Uj=Γ01[Uj|BjBj|Uj].\mathbf{E}\left[3W_{j}\outerproduct{V_{j}}{V_{j}}\right]=U_{j}\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right]U_{j}^{\dagger}=\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]. (41)

Since (Uj,Bj)(U_{j},B_{j})’s are independent for different j[n]j\in[n],

𝐄[3nW|φ~iφ~i|]\displaystyle\mathbf{E}\left[3^{n}W\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}\right] =\displaystyle= 𝐄[j=1n(3Wj|vjvj|)]\displaystyle\mathbf{E}\left[\bigotimes_{j=1}^{n}\left(3W_{j}\outerproduct{v_{j}}{v_{j}}\right)\right]
=\displaystyle= 𝐄[j=1nΓ01[Uj|BjBj|Uj]]\displaystyle\mathbf{E}\left[\bigotimes_{j=1}^{n}\Gamma_{0}^{-1}[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}]\right]
=\displaystyle= j=1n𝐄[Γ01[Uj|BjBj|Uj]]\displaystyle\bigotimes_{j=1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]
=\displaystyle= |ϕϕ|.\displaystyle\outerproduct{\phi}{\phi}. (43)

where the last equality follows form [32, Lemma 6].

The next lemma shows that SS is an unbiased estimator of the quantity ϕ|M|ϕ\expectationvalue{M}{\phi}.

Lemma B.5.

𝐄[S]=ϕ|M|ϕ\operatorname{{\mathop{\mathbf{E}}}}[S]=\expectationvalue{M}{\phi} .

Proof B.6.

Without loss of generality, assume that q=q_{\ell}=\ell for all [k]\ell\in[k]. We have

𝐄[S]\displaystyle\mathbf{E}[S] =\displaystyle= 3k𝐄[Xj[k]Wj]\displaystyle 3^{k}\mathbf{E}\left[X\prod_{j\in[k]}W_{j}\right] (44)
=\displaystyle= 3k𝐄[ϕ~|M|ϕ~j[k]Wj]\displaystyle 3^{k}\mathbf{E}\left[\expectationvalue{M}{\tilde{\phi}}\prod_{j\in[k]}W_{j}\right]
=\displaystyle= 3k𝐄[tr(M|ϕ~ϕ~|)j[k]Wj]\displaystyle 3^{k}\mathbf{E}\left[\tr\left(M\outerproduct{\tilde{\phi}}{\tilde{\phi}}\right)\prod_{j\in[k]}W_{j}\right]
=\displaystyle= tr{M𝐄[j[k](3Wj|VjVj|)|0nk0nk|]}.\displaystyle\tr\left\{M\mathbf{E}\left[\bigotimes_{j\in[k]}\left(3W_{j}\outerproduct{V_{j}}{V_{j}}\right)\bigotimes\outerproduct{0^{n-k}}{0^{n-k}}\right]\right\}.\quad\quad

Since MM only depends on the first kk qubits, the expectation in (44) does not change if |0nk0nk|\outerproduct{0^{n-k}}{0^{n-k}} is replaced by the following state

j=k+1n𝐄[Γ01[Uj|bjbj|Uj]].\bigotimes_{j=k+1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{b_{j}}{b_{j}}U_{j}^{\dagger}\right]\right].

Hence, we can write 𝐄[S]\mathbf{E}[S] as

tr{M𝐄[j=1k(3Wi,j|Vi,jVi,j|)]j=k+1n𝐄[Γ01[Uj|BjBj|Uj]]}\displaystyle\tr\left\{M\mathbf{E}\left[\bigotimes_{j=1}^{k}\left(3W_{i,j}\outerproduct{V_{i,j}}{V_{i,j}}\right)\right]\bigotimes_{j=k+1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\right\}
=\displaystyle\stackrel{{\scriptstyle}}{{=}} tr{Mj=1n𝐄[Γ01[Uj|BjBj|Uj]]}\displaystyle\tr\left\{M\bigotimes_{j=1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\right\}
=\displaystyle\stackrel{{\scriptstyle}}{{=}} tr(M|ϕϕ|)=ϕ|M|ϕ,\displaystyle\tr{M\outerproduct{\phi}{\phi}}=\expectationvalue{M}{\phi},

where the second equality follows from (41), and the third equality follows from (43).

The correctness part of Theorem 4.10 follows immediately from Lemma B.5, Hoeffding’s inequality (Lemma A.2), and the fact that for all i[N]i\in[N], we have |Si|3kM\absolutevalue{S_{i}}\leq 3^{k}\norm{M}_{\infty}.

Running time

We now analyze the running time of the query estimation procedure. First, the preparation of each quantum state |vi,j(i[N],j[n])\ket{v^{\prime}_{i,j}}\ (i\in[N],j\in[n]) takes quantum time O(1)O(1). The construction of each shadow sample |ϕ~i\ket{\tilde{\phi}_{i}} (Eq. (39)) takes O(k)O(k) quantum time; note that we do not actually need to prepare those |vi,j\ket{v^{\prime}_{i,j}}’s with jQj\not\in Q, since the kk-local observable does not depend on those qubits. The quantum time for measuring each |ϕ~i\ket{\tilde{\phi}_{i}} with a kk-local observable MM is O(poly(k))O\left(poly(k)\right), as we have assumed that MM has a poly(k)\text{poly}(k) gate complexity. The computation of each SiS_{i} (Eq. (40)) can be done in O(n)O(n) classical time, and that of T=1NiSiT=\frac{1}{N}\sum_{i}S_{i} can be bounded by O(N)O(N) classical time. Summing up everything, the total running time can be bounded by O(poly(k)N)O\left(poly(k)N\right) quantum time plus O(kN)O(kN) classical time; the classical time can be ignored if we assume that a unit quantum time is at least a unit classical time.

Appendix C Distortions Between The Trace Distance and 1/2\ell_{1}/\ell_{2} Distances of Quantum States

ϕ\phi & ψ\psi |0\ket{0} & |1\ket{1} distortion w.r.t. DD
DD 34\sqrt{\frac{3}{4}} 11
L1L_{1} d2\sqrt{\frac{d}{2}} 22 d6\sqrt{\frac{d}{6}}
L2L_{2} 11 2\sqrt{2} 32\sqrt{\frac{3}{2}}
L1L^{\prime}_{1} 11 22 3\sqrt{3}
L2L^{\prime}_{2} 2d\sqrt{\frac{2}{d}} 2\sqrt{2} 3d4\sqrt{\frac{3d}{4}}
Table 1: Example of a simple table

Let α(|0)=(1,0,0,,0)T\alpha(\ket{0})=(1,0,0,\ldots,0)^{T} be the vector representation of a dd-dimensional quantum state |0\ket{0}, where the first coordinate is 11 and the others are 0. Let α(|1)=(0,1,0,,0)T\alpha(\ket{1})=(0,1,0,\ldots,0)^{T} be a dd-dimensional vector with the second coordinate being 11 and the others being 0.

Let α(ϕ)=1d/2(1,,1,1,,1,0,,0,0,,0)T\alpha(\phi)=\frac{1}{\sqrt{d/2}}(1,\ldots,1,1,\ldots,1,0,\ldots,0,0,\ldots,0)^{T} be the dd-dimensional vector with the first d/2d/2 coordinators being 11 and second half being 0, and

ψ=1d/2(0,,0,1,,1,1,,1,0,,0)T\psi=\frac{1}{\sqrt{d/2}}(0,\ldots,0,1,\ldots,1,1,\ldots,1,0,\ldots,0)^{T}

be the dd-dimensional vector with the middle d/2d/2 coordinates being 11 and the rest being 0.

Let D(ϕ,ψ)D(\phi,\psi) denote the trace distance between two quantum states ϕ\phi and ψ\psi. Let L1(ϕ,ψ)L_{1}(\phi,\psi) and L2(ϕ,ψ)L_{2}(\phi,\psi) denote the 1\ell_{1} and 2\ell_{2} distances between α(ϕ)\alpha(\phi) and α(ψ)\alpha(\psi), respectively. Let L1(ϕ,ψ)L^{\prime}_{1}(\phi,\psi) and L2(ϕ,ψ)L^{\prime}_{2}(\phi,\psi) denote the 1\ell_{1} and 2\ell_{2} distances between p(ϕ)p(\phi) and p(ψ)p(\psi), where p(ϕ)p(\phi) is α(ϕ)\alpha(\phi) after taking the coordinate-wise absolute square; that is, p(ϕ)=1d/2(1,,1,1,,1,0,,0,0,,0)Tp(\phi)=\frac{1}{d/2}(1,\ldots,1,1,\ldots,1,0,\ldots,0,0,\ldots,0)^{T}.

The distortion of a distance d(,)d(\cdot,\cdot) with respect to DD is lower bounded by

max{D(ϕ,ψ)d(ϕ,ψ)/D(|0,|1)d(|0,|1),d(ϕ,ψ)D(ϕ,ψ)/d(|0,|1)D(|0,|1)}.\sqrt{\max\bigg{\{}\frac{D(\phi,\psi)}{d(\phi,\psi)}\bigg{/}\frac{D(\ket{0},\ket{1})}{d(\ket{0},\ket{1})},\frac{d(\phi,\psi)}{D(\phi,\psi)}\bigg{/}\frac{d(\ket{0},\ket{1})}{D(\ket{0},\ket{1})}\bigg{\}}}.

In Table 1, we have calculated the distortions between the trace distance of the quantum states and the 1\ell_{1} and 2\ell_{2} distances of their corresponding classical vector representations. It is easy to see that all distortions are larger than 1.11.1 .

Appendix D Search As A Special Case of Selection

We note that the selection operation can also be used for search. Given two quantum states ϕ{\phi} and ψ{\psi}, letting M=ψψM=\psi\psi^{\dagger}, we have

D(ϕ,ψ)=1|ψϕ|2=1ϕMϕ.\displaystyle D({\phi},{\psi})=\sqrt{1-\absolutevalue{{\psi^{\dagger}}{\phi}}^{2}}=\sqrt{1-\phi^{\dagger}M\phi}.

Therefore, if we can estimate ϕMϕ\phi^{\dagger}M\phi up to an additive factor cεε2c_{\varepsilon}\varepsilon^{2} (for a sufficiently small constant cεc_{\varepsilon}) for any database state ϕ{\phi}, we can also solve (ε,β)(\varepsilon,\beta)-search for any constant β>1\beta>1. By Theorem 4.10 and the fact that M2=1\norm{M}_{\infty}^{2}=1 when M=ψψM=\psi\psi^{\dagger}, for a database consisting of mm states, the query (quantum) time using expectation value estimations is bounded by 9nmlogmpoly(n)/ε49^{n}m{\log m}\cdot\text{poly}(n)/{\varepsilon^{4}}. This approach is certainly much more time-expensive than that using state sketches presented in Section 4.1.