Qin Zhang, Indiana University Bloomington, USA. https://homes.luddy.indiana.edu/qzhangcs/ [email protected] https://orcid.org/0000-0002-6851-3115 Supported by NSF CCF-1844234 and IU Luddy Faculty Fellowship.

Mohsen Heidari, Indiana University Bloomington, USA. https://homes.luddy.indiana.edu/mheidar/ [email protected] Supported by NSF CCF-2211423.

Copyright: Qin Zhang and Mohsen Heidari

CCS concepts: Theory of computation → Quantum query complexity; Information systems → Query operators; Information systems → Query optimization

Quantum Data Sketches

Qin Zhang Mohsen Heidari

Abstract

Recent advancements in quantum technologies, particularly in quantum sensing and simulation, have facilitated the generation and analysis of inherently quantum data. This progress underscores the necessity for developing efficient and scalable quantum data management strategies. This goal faces immense challenges due to the exponential dimensionality of quantum data and its unique quantum properties such as no-cloning and measurement stochasticity. Specifically, classical storage and manipulation of an arbitrary $n$-qubit quantum state requires exponential space and time. Hence, there is a critical need to revisit foundational data management concepts and algorithms for quantum data. In this paper, we propose succinct quantum data sketches to support basic database operations such as search and selection. We view our work as an initial step towards the development of a quantum data management model, opening up many possibilities for future research in this direction.

keywords:

quantum data representation, data sketching, query execution

1 Introduction

Quantum information and computing, rooted in the principles of quantum mechanics, have emerged as an important field of study with far-reaching effects across a broad spectrum of disciplines. Central to the concept of quantum computing are quantum bits (or qubits), which set themselves apart from classical bits due to their ability to exist in a superposition of states, giving a quantum computer a potential computational advantage over classical computing.

Although significant advancements have been made in the development of quantum algorithms after several decades of research, only a handful provably outperform their classical counterparts. Notable examples include Shor’s algorithm for factorization [54], Grover’s algorithm for search [22], and linear system solvers [26]. These quantum algorithms typically start by encoding classical input data into quantum states, execute a series of quantum operations, and then measure the resulting quantum states and carry out specific post-processing on the measurement outcomes. The reasons for the difficulties in the design of quantum algorithms that can outperform classical counterparts on classical input data remain elusive.

In this paper, we take a different perspective, directing our attention towards quantum data themselves. Nature, along with scientific experiments spanning physics, chemistry, material science, biology, and other fields, generates massive quantities of quantum data every day. Sources include Hawking Radiation, Cosmic Microwave Background, quantum effects in neutron stars, quantum states in ultra-cold atoms, quantum information in DNA replication, etc. In many scenarios, there is a need for us to preserve quantum data that has been collected from nature or generated in labs for future analysis. For example, scientists often use photons collected from remote stars to study the properties of those astronomical objects. It would be beneficial to store those photons as quantum states in a database, since it may not be feasible to collect fresh photons from those astronomical objects at the time of data analysis. When the quantum states are prepared in labs, generating fresh copies of quantum states on demand is often time-consuming. Let us use quantum simulation as an example. Quantum simulation is a prominent advantage of quantum computers, with significant implications for numerous areas of scientific research, including computer-aided drug design [50], high-energy physics [41], quantum chemistry [58] and many-body physics [55]. Quantum simulation typically relies on solving the Schrödinger equation for the underlying Hamiltonian. The Hamiltonian is implemented by a quantum circuit, which is applied to an initial quantum state to generate target quantum states. The construction of the Hamiltonians and the preparation of the target states can be rather time-consuming.¹ Storing the generated molecular quantum states in a database would eliminate the need to repeat the state preparation procedures during data analysis.

¹ For instance, the Hamiltonian of the two-dimensional Fermi-Hubbard model on an $8\times 8$ lattice already requires approximately $10^{7}$ Toffoli gates [43], which directly contribute to the query time if states need to be generated from scratch at query time.

Once the quantum states are stored in a database, and assuming each state is associated with additional information such as the natural sources recorded at the time of collection or the parameters of the experimental setup used to produce them, numerous applications can be envisioned. For example, if scientists receive photons from an unknown remote star, they can search a photon database to find a matching quantum state. Upon finding a match, they can retrieve its associated properties and other information, such as the time and method of its previous observation. They may also want to sort the states using a local observable (see Definition 3.5 in Section 3) with respect to certain properties (such as energy or momentum) to get an order of the photons in the database, aiding in the understanding of the spectrum of the corresponding stars in the universe. In quantum simulation, if we want to produce molecular states with average energy levels above a certain threshold relative to a specific local observable, we can perform a selection operation in our database to identify those states, and then use the associated parameters of the experimental setup to produce more of such quantum states.

Nevertheless, quantum data management remains in its infancy. Some of the previously mentioned motivating examples are better viewed as anticipated future problems. There has been research that leverages quantum data for learning or optimization, such as quantum machine learning [33, 24, 3], quantum variational optimization algorithms [28, 17], and quantum neural networks [52, 48, 18, 31, 45, 20]. However, their primary focus is on the sample complexity (namely, the number of copies of the quantum state needed for the task) and the convergence to optimal points, rather than on developing methods for the efficient representation and storage of quantum data for subsequent analysis.

In this paper, we introduce several quantum data sketches to support basic database operations in a sustainable and efficient manner. This paper does not aim to formulate a comprehensive quantum data management model. Rather, we view our work as an initial step towards developing a sustainable model for representing, querying, and analyzing quantum data at scale.

Unique Challenges in the Quantum World

The quantum world possesses several unique properties, such as superposition and entanglement, that can be leveraged to reduce resource usage in computing and information exchange. However, some of these features also pose significant challenges to quantum data management. We highlight a few below.

Post-Measurement State Disturbance. The only way to extract information from a quantum state is to perform quantum measurements and observe probabilistic outcomes. However, each measurement perturbs the quantum state. This characteristic implies that a quantum state might not be reusable post-measurement. In other words, we may need to consume many identical copies of a quantum state in order to derive enough useful information about it. This phenomenon is in stark contrast with the classical setting, in which we can access the same data element any number of times, always yielding the same result.

No-cloning. A natural thought to resolve the issue caused by state disturbance is to clone the quantum state before the operations. Unfortunately, the no-cloning theorem (see, e.g., [46]) in quantum mechanics asserts that it is impossible to create an exact copy of an arbitrary unknown quantum state.

Lack of Large-Scale Quantum Storage Systems. At the time of writing this paper, we are not aware of any reliable large-scale quantum storage systems. One reason for this is that qubits are highly susceptible to environmental disruptions such as temperature variations, electromagnetic radiation, or particle interactions. These disruptions lead to what is known as decoherence [40], resulting in the loss of quantum information.

Moreover, due to the quantum state disturbance and the no-cloning principle, even if we successfully build viable large-scale quantum storage systems in the future, we still need many identical copies of the quantum state for any nontrivial database operation. This implies that in order to accommodate an unlimited number of database operations (i.e., to be sustainable), we must prepare an unlimited number of copies for each quantum state in the storage, which is certainly not practical.

An alternative approach is to first learn the classical description of each quantum state and store it in a classical memory for future operations. Indeed, we believe that for the purpose of quantum data management, we have to store quantum states in the classical format. However, learning and storing the full information of a quantum state as a classical object is both time and space expensive, as the dimensionality of a quantum state is exponential in terms of the number of qubits.
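To make the exponential cost concrete, the following is a small back-of-the-envelope sketch. It assumes, purely for illustration, that each complex amplitude is stored as two 64-bit floats (16 bytes); the function names are hypothetical and not part of the paper.

```python
def amplitudes(n_qubits):
    """Number of complex amplitudes in an n-qubit pure state: d = 2^n."""
    return 2 ** n_qubits


def storage_bytes(n_qubits, bytes_per_amplitude=16):
    """Bytes needed to store every amplitude explicitly.

    The default of 16 bytes per amplitude is an assumption
    (two 64-bit floats), not a figure from the paper.
    """
    return amplitudes(n_qubits) * bytes_per_amplitude


for n in (10, 30, 50):
    print(n, amplitudes(n), storage_bytes(n))
```

Under this assumption, a single 30-qubit state already occupies 16 GiB and a 50-qubit state 16 PiB, before any database of many states is considered.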

We thus propose to design succinct classical representations (or, sketches) of quantum states that can be used to perform database operations efficiently. Depending on the particular database operation it is intended to support, each sketch preserves only partial information of a quantum state. This is also the reason why we may be able to make the size of the sketch $o(d)$, where $d$ is the dimension of the quantum state. We also note that the sample complexity for constructing data sketches is a secondary consideration for database management systems, as it is just a one-time preprocessing step in the database design. This is where our work departs from the quantum state learning/tomography literature, which we will discuss in Section 1.1.

Our Contribution

We give the first systematic approach to designing space-efficient sketches for quantum states. These sketches can then be used to develop time-efficient algorithms for basic database operations. In particular:

1. In Section 3, we have formalized a set of basic database operations for quantum data, including search, selection, sorting, and join. These operations differ from those for classical data as they inherently incorporate approximation in their definitions.

2. Our main technical results are the first set of classical vector sketches that preserve, up to a distortion of $(1+\iota)$ for an arbitrarily small $\iota>0$, the trace distance of the quantum states with probability $(1-\delta)$. Our sketches have sizes $O(\log(1/\delta)/\iota^{2})$, which is independent of the dimension of the states. Coupled with efficient nearest neighbor search via locality sensitive hashing, they can be used to support the search and join operations with time sublinear in the database size and independent of the state dimension. See Section 4.1.

3. We make use of classical shadow seeds of quantum states [37] to approximate the expectation value of any given $k$-local observable (to be defined in Section 3.3) using time and space independent of the dimension of the state. We also present a new hybrid quantum-classical algorithm to accelerate the query time. This sketch can be used for selection and sorting operations. See Section 4.2.

Paper Outline

In Section 2, we review some background on quantum information and computing as well as tools for classical data management. In Section 3, we define a set of basic database operations for quantum data. After these preparations, in Section 4, we present our classical sketches of quantum states and illustrate how to perform various database operations using these sketches. We review works that are most relevant to this paper in Section 1.1 and propose several directions for future research in Section 5.

1.1 Related Work

We are not aware of any prior work on designing classical sketches of quantum data, except for the paper [37] discussed in Section 4.2. There have been efforts to introduce quantum computing, quantum algorithms, and quantum machine learning to the database community [13, 61, 44, 57, 9, 59]. We refer the readers to the recent tutorial [25] for an overview of these works. However, these initiatives either attempt to design and perform database operations directly on quantum data (i.e., assuming database elements are stored as quantum states) or focus on speeding up database query optimization and transactions on classical data, setting them apart from the objectives pursued in this paper.

There are works [60, 38] focusing on applying classical data compression techniques (such as quantization) to the quantum state vector during quantum simulation. We note that our approach with sketches is quite different, as we aim to extract relevant information (often independent of the quantum states’ dimension) for various database operations.

Quantum State Learning

Many studies have explored the task of characterizing and learning properties of a quantum state using multiple copies of the state, including approximate state discrimination [12], quantum state discrimination [29], quantum state tomography [23, 49], quantum state property testing [27], quantum state certification [7], shadow tomography [2, 6], and pretty good tomography [1].

In the problem of approximate state discrimination, we are promised that a query quantum state ${\phi}$ belongs to a set $S$ of quantum states. The algorithm’s task is to return a state ${\psi}\in S$ such that $D({\phi},{\psi})\leq\epsilon$. The algorithm for approximate state discrimination proposed in [12] can be used together with equality testing to handle the search operation when the available number of copies of the query state is limited, at the cost of larger time and space complexities. However, the need for fresh copies of database states for equality testing would undermine the long-term sustainability of the database system.

The problem of quantum state discrimination is very similar: We are again promised that the query state ${\phi}$ belongs to a set $S$ , but now the algorithm needs to return the exact ${\phi}$ . Harrow and Winter [29] gave an algorithm for this problem where the sample complexity of the query state depends on a parameter $F$ , which is the maximum pairwise fidelity of states in the set $S$ .

In quantum state tomography, we want to learn an unknown quantum state up to a trace distance $\epsilon$. The optimal sample complexity $\tilde{\Theta}(d/\epsilon^{2})$ has been identified [23, 49].

Quantum state property testing [27] and quantum state certification [7] can be seen as relaxations of the aforementioned problems. In the former, we are given a query state ${\phi}$ and a set $S$ of quantum states, and asked to test whether ${\phi}\in S$ or ${\phi}$ is $\epsilon$-far from $S$ (that is, for any state ${\psi}\in S$, we have $D({\phi},{\psi})>\epsilon$). In the latter, we are given a query state ${\phi}$ and a known state ${\psi}$, and asked to test whether ${\phi}={\psi}$ or $D({\phi},{\psi})>\epsilon$. The main issue with property testing and certification in the setting of data management is that the decision can be arbitrary even if the query state is very close to (but not the same as) a database state.

Both shadow tomography and pretty good tomography focus on approximating $\phi^{\dagger}M_{i}\phi$ for a query state ${\phi}$ and a set of known binary measurements $\{M_{i}\}$ [2, 6], or a distribution on them [1]. However, these algorithms cannot be used for $(\eta,\varepsilon)$-selection (see Section 3.3) with an arbitrary observable $M$ given at the time of query. Their running time is also polynomial in the state dimension $d$. Recently, Gong and Aaronson [21] generalized shadow tomography to a fixed set of measurements with multiple outcomes.

To the best of our knowledge, all the previous work on quantum state learning focuses on the sample complexity, but not on the space complexity for representing the quantum states for various data management operations.

2 Preliminaries

We start by giving a gentle introduction to the basics of quantum information and computing, particularly for readers who are new to the field. For a comprehensive treatment of this topic, we refer the readers to standard textbooks in the field, such as [46].

Quantum States and Qubits

The first axiom of quantum mechanics is concerned with the quantum state as a way to describe a quantum system, such as a qubit. For accessibility of the paper, we focus on pure states, which are represented by complex-valued vectors. Moreover, we assume that each quantum data point is stored in $n$ qubits. Therefore, the dimensionality of the space is $d=2^{n}$. In that case, the quantum states are unit-norm vectors in $\mathbb{C}^{d}$. Following the Dirac bra-ket notation, a vector $u\in\mathbb{C}^{d}$ is simply denoted by the ket $\ket{u}$. As an example, a qubit is a $2$-dimensional vector represented as $\ket{\phi}=\alpha_{0}\ket{0}+\alpha_{1}\ket{1}$, where $\absolutevalue{\alpha_{0}}^{2}+\absolutevalue{\alpha_{1}}^{2}=1$. This decomposition is typically called a superposition. A well-known superposition is the state $\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$. Similarly, an $n$-qubit state is represented by a superposition as $\ket{u}=\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\alpha_{x_{1}\cdots x_{n}}\ket{x_{1}\cdots x_{n}},$ where $\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\absolutevalue{\alpha_{x_{1}\cdots x_{n}}}^{2}=1$. For compactness, we use $\ket{i}$ to represent each $\ket{x_{1}\cdots x_{n}}$, where $i$ is the decimal representation of the binary string $x_{1}\cdots x_{n}$.
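As an illustration of this representation, the following minimal sketch stores a pure state as a plain list of $d=2^{n}$ complex amplitudes. The helper names `normalize` and `basis_label` are hypothetical, introduced only for this example.

```python
import math


def normalize(amplitudes):
    """Scale a list of complex amplitudes to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("the zero vector is not a quantum state")
    return [a / norm for a in amplitudes]


def basis_label(i, n):
    """Binary string x_1...x_n whose decimal value is i."""
    return format(i, "0{}b".format(n))


# The single-qubit superposition (|0> + |1>)/sqrt(2).
plus = normalize([1, 1])

# A 2-qubit state with complex amplitudes; it has d = 2^n = 4 entries.
n = 2
state = normalize([1, 1j, 0, -1])
for i, a in enumerate(state):
    print("|{}>: {:.3f}".format(basis_label(i, n), complex(a)))
```

The squared magnitudes of the entries of `state` sum to one, which is the normalization condition stated above; the index `i` of each entry is the decimal value of the basis string.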

Quantum Operations

The second axiom of quantum mechanics states that the evolution of a quantum state is described by a unitary transformation. A unitary transformation is represented by a unitary matrix $U$ such that $U^{\dagger}U=UU^{\dagger}=I$. If the initial state is $\ket{\phi}$, then the evolved state is $U\ket{\phi}$. In quantum computing, $U$ is typically implemented in terms of elementary quantum logic gates. From this perspective, one can study the gate complexity of implementing $U$. This axiom implies a unique feature of quantum mechanics, known as the no-cloning principle, which prohibits making copies of quantum data. As a result, one needs to adopt data management procedures that abide by this rule.
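A small numerical sketch of unitary evolution, using the Hadamard gate as the example unitary. The functions `apply` and `is_unitary` are hypothetical helpers written for this illustration; matrices are nested Python lists.

```python
import math


def apply(U, phi):
    """Matrix-vector product U|phi> for a d x d matrix U (nested lists)."""
    d = len(phi)
    return [sum(U[r][c] * phi[c] for c in range(d)) for r in range(d)]


def is_unitary(U, tol=1e-12):
    """Check U^dagger U = I entry by entry."""
    d = len(U)
    for r in range(d):
        for c in range(d):
            entry = sum(complex(U[k][r]).conjugate() * U[k][c] for k in range(d))
            if abs(entry - (1 if r == c else 0)) > tol:
                return False
    return True


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]    # Hadamard gate
zero = [1, 0]            # |0>
plus = apply(H, zero)    # evolves |0> into (|0> + |1>)/sqrt(2)
print(is_unitary(H), plus)
```

Because `H` is unitary, the evolved vector still has unit norm; a non-unitary matrix such as `[[1, 1], [0, 1]]` fails the check.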

Quantum Measurements

The third axiom of quantum mechanics asserts that any classical information about a quantum state is obtained by measuring it. The act of measuring a quantum system inevitably collapses the quantum state. The specific outcome of a measurement is probabilistic and is governed by Born’s rule. These probabilities are determined by the initial state of the system and the nature of the interaction between the system and the measuring device. Measurement of an $n$-qubit system is typically modeled in the so-called computational basis. When the quantum state is in the superposition $\ket{\phi}=\sum_{i}\alpha_{i}\ket{i}$, the outcome of the measurement in the computational basis is $i\in[2^{n}]$ with probability $p_{i}=|\alpha_{i}|^{2}$. For instance, measuring the state $\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$ produces a uniformly random bit. The stochasticity of quantum measurements is another feature that calls for probabilistic data management frameworks. Moreover, the state collapse phenomenon significantly complicates the tasks, as the quantum state cannot be entirely “recycled” following a measurement.

One may attempt to think of a quantum state $\ket{\phi}=\sum_{i}\alpha_{i}\ket{i}$, as far as measurement is concerned, as a discrete probability distribution $\{p_{1},\ldots,p_{d}\}$, but there are two fundamental differences. First, the coefficients $\alpha_{i}$ (called amplitudes) are complex numbers, which makes superposition and interference possible. Second, the probability of an outcome in quantum mechanics is found by taking the absolute square of the amplitude, that is, $p_{i}=|\alpha_{i}|^{2}$.
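The measurement rule can be simulated classically for small states. The sketch below samples computational-basis outcomes with probability $|\alpha_{i}|^{2}$; the function `measure` and the fixed seed are assumptions made for this illustration. On real hardware each sample would consume one fresh copy of the state.

```python
import random
from collections import Counter


def measure(phi, rng):
    """Sample an index i with probability |alpha_i|^2 (computational basis)."""
    probs = [abs(a) ** 2 for a in phi]
    return rng.choices(range(len(phi)), weights=probs, k=1)[0]


rng = random.Random(7)  # fixed seed, only for reproducibility
s = 2 ** -0.5
plus = [s, s]     # (|0> + |1>)/sqrt(2)
minus = [s, -s]   # same outcome probabilities, opposite sign on |1>

counts = Counter(measure(plus, rng) for _ in range(10000))
print(counts)
```

The states `plus` and `minus` induce the same outcome distribution in this basis although they are orthogonal as vectors, which illustrates why a state is more than the distribution $\{p_{i}\}$: the sign (more generally, the phase) of an amplitude is invisible to this measurement but matters for interference.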

In general, a measurement $\mathcal{M}$ on a quantum state can be carried out in three stages: (i) applying an appropriate quantum operator $U$ to the state; (ii) measuring the evolved state $U\ket{\phi}$ in the computational basis; and (iii) applying classical post-processing to the measurement outcomes. This procedure is compactly modeled by a Hermitian matrix $M$, called an observable, acting on the original state $\ket{\phi}$. The eigenvalues of $M$ represent the possible values of the measurement outcomes. Moreover, by $\mathcal{M}(\ket{\phi})$ we denote the probability distribution of the measurement outcomes after applying $\mathcal{M}$ to $\ket{\phi}$. Because the outcomes are probabilistic, we are often interested in their expectation values. The expectation of the outcome distributed according to $\mathcal{M}(\ket{\phi})$ is equal to $\expectationvalue{M}{\phi}$, where $\bra{\phi}$ is the complex conjugate transpose of the vector $\ket{\phi}$.
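For a state whose amplitudes are known, the expectation value can be computed directly as a vector-matrix-vector product. The following minimal sketch uses the Pauli matrices as example observables; the function name `expectation` is hypothetical.

```python
def expectation(M, phi):
    """Return phi^dagger M phi for a Hermitian matrix M (nested lists)."""
    d = len(phi)
    M_phi = [sum(M[r][c] * phi[c] for c in range(d)) for r in range(d)]
    value = sum(complex(phi[r]).conjugate() * M_phi[r] for r in range(d))
    return value.real  # the imaginary part vanishes when M is Hermitian


Z = [[1, 0], [0, -1]]   # Pauli-Z, eigenvalues +1 and -1
X = [[0, 1], [1, 0]]    # Pauli-X, eigenvalues +1 and -1
s = 2 ** -0.5
plus = [s, s]           # (|0> + |1>)/sqrt(2)

print(expectation(Z, plus))  # outcomes +1 and -1 are equally likely
print(expectation(X, plus))  # plus is an eigenvector of X with eigenvalue +1
```

Up to floating-point rounding, the two printed values are 0 and 1. The cost of this direct computation grows with $d^{2}$, so it only serves as a reference for small examples.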

Standard Math Notations Versus Dirac Notations

As this paper is intended for an audience within the database community, we recognize that the Dirac bra-ket notation might appear unfamiliar to database researchers without a background in quantum information and computing. To simplify, in the main text we express a pure quantum state as a column vector of dimension $d$, and use $\phi$ and $\phi^{\dagger}$ to denote $\ket{\phi}$ and $\bra{\phi}$, respectively. We use $\phi^{\dagger}M\phi$ to denote the expectation value $\expectationvalue{M}{\phi}$ of an observable $M$. Throughout the paper, we reserve the notations $\phi$ and $\psi$ for quantum states.

We have also included a more formal (but still gentle) introduction of quantum information and computing using Dirac bra-ket notations in Appendix A. We will use Dirac bra-ket notations in all the proofs in the appendix.

Trace Distance

Given two quantum states $\phi$ and $\psi$ , we define their trace distance to be $D(\phi,\psi)=\sqrt{1-\absolutevalue{\psi^{\dagger}\phi}^{2}}$ . The trace distance is the most widely used distance measure for quantum states in the literature.
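When both states are given as explicit amplitude vectors, the definition translates into a few lines of code. This is a minimal sketch for pure states only; the function name `trace_distance` is an assumption of this example.

```python
import math


def trace_distance(phi, psi):
    """D(phi, psi) = sqrt(1 - |psi^dagger phi|^2) for unit vectors."""
    overlap = sum(complex(b).conjugate() * a for a, b in zip(phi, psi))
    # Clamp to guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - abs(overlap) ** 2))


s = 2 ** -0.5
zero, one, plus = [1, 0], [0, 1], [s, s]

print(trace_distance(zero, one))            # orthogonal states
print(trace_distance(zero, plus))           # partially overlapping states
print(trace_distance(zero, [1j, 0]))        # states differing by a global phase
```

Orthogonal states are at distance 1, and two states that differ only by a global phase are at distance 0, since the definition depends only on the magnitude of the inner product.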

2.1 Performance Metrics

In the context of quantum data, similar to classical database design, the efficiency of space and time is crucial during database initialization, indexing, and querying. Minimizing the number of quantum state copies used for constructing sketches is also important, as obtaining state copies can be costly and they cannot be fully recycled due to post-measurement disturbance. However, as we mentioned earlier, sample complexity is a secondary consideration in the data management setting, since the sketch-building/initialization is a one-time process.

A unit-time quantum operation comprises standard single-qubit gates like the Hadamard gate, Pauli gates, phase gate, and $T$ gate, as well as a two-qubit gate, such as the Controlled-NOT (CNOT) gate, that enables entangling operations.² The combination of these gates is sufficient to approximate any unitary operation to arbitrary accuracy. We call these gates unit gates, and define the size of a circuit (for representing a unitary operation) to be the number of unit gates in the circuit.

² We refer the readers to [46] for a detailed introduction of these gates.

As mentioned, a typical quantum measurement $\mathcal{M}$ on an $n$-qubit system consists of a unitary operator $U_{\mathcal{M}}$ followed by a measurement in the computational basis and classical post-processing. Assuming that the classical post-processing takes polynomial time, the overall time cost is typically dominated by the gate complexity of $U_{\mathcal{M}}$. It has been shown in [56] that a circuit depth of $\Theta(2^{n}/n)$ (i.e., $\Theta(d/\log d)$) is needed for constructing an arbitrary unitary operator $U$. To simplify matters, we assume that both executing an arbitrary $d$-dimensional quantum measurement and preparing an arbitrary $d$-dimensional state require $O(d)$ quantum time.

2.2 Nearest Neighbor in High Dimensions

Quantum states are inherently high dimensional, and the dimension can remain large even after the sketching and summarization that we illustrate in the subsequent sections. We thus use Approximate Nearest Neighbor (ANN) search via Locality Sensitive Hashing (LSH) to further speed up some database operations. This subsection takes a brief detour from our discussion of quantum data management.

Definition 2.1 ($(r,\beta)$-ANN-search).

Let $X$ be a database containing a set of vectors in $\mathbb{R}^{d}$ and $q\in\mathbb{R}^{d}$ be a query vector. Let $dist(\cdot,\cdot)$ be a distance function. If there is at least one vector $p\in X$ with $dist(p,q)\leq r$ , return any $p^{\prime}\in X$ with $dist(p^{\prime},q)\leq\beta r$ . Otherwise, either return a $p^{\prime}\in X$ with $dist(p^{\prime},q)\leq\beta r$ or return $\emptyset$ .

Let us focus on the case that the distance function $dist(\cdot,\cdot)$ is $\ell_{1}$ or $\ell_{2}$. Indyk and Motwani [39] showed that $(r,\beta)$-ANN can be solved efficiently via LSH. The idea is that we first apply multiple hash functions to each vector in $X$; this part can be pre-computed and stored as an index. At the time of query, we apply the same set of hash functions to the query vector $q$. We then scan the vectors $p\in X$ such that $p$ and $q$ collide (i.e., fall into the same bin) on at least one hash function, and return the first vector $p$ with $dist(p,q)\leq\beta r$. If no such $p$ is found after traversing a certain number of vectors in $X$, we return $\emptyset$.
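The index-then-query idea can be sketched as follows for the $\ell_{2}$ distance. This toy index uses random Gaussian projections cut into bins of a fixed width, one common LSH family for $\ell_{2}$; the class name `L2LSHIndex` and all parameter values (`num_tables`, `hashes_per_table`, `width`, `max_candidates`) are illustrative assumptions and are not tuned to achieve the guarantees of the cited results.

```python
import math
import random
from collections import defaultdict


class L2LSHIndex:
    """Toy LSH index for l2 distance (random projections cut into bins)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=3, width=1.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.points = []
        self.tables = []  # one (hash functions, buckets) pair per table
        for _ in range(num_tables):
            funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)],
                      rng.uniform(0.0, width))
                     for _ in range(hashes_per_table)]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Each hash is floor((a . v + b) / width); a table key concatenates them.
        return tuple(
            math.floor((sum(x * y for x, y in zip(a, v)) + b) / self.width)
            for a, b in funcs)

    def add(self, v):
        """Pre-computation step: hash v into every table."""
        idx = len(self.points)
        self.points.append(list(v))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(idx)
        return idx

    def query(self, q, r, beta, max_candidates=100):
        """Index of a colliding p with dist(p, q) <= beta * r, else None."""
        seen = set()
        for funcs, buckets in self.tables:
            for idx in buckets.get(self._key(funcs, q), ()):
                if idx in seen:
                    continue
                seen.add(idx)
                if math.dist(self.points[idx], q) <= beta * r:
                    return idx
                if len(seen) >= max_candidates:
                    return None  # give up after inspecting enough candidates
        return None


data_rng = random.Random(1)
pts = [[data_rng.uniform(-1, 1) for _ in range(5)] for _ in range(50)]
index = L2LSHIndex(dim=5)
for p in pts:
    index.add(p)
hit = index.query(pts[3], r=0.1, beta=2.0)
```

Only vectors that share a bucket with the query are ever compared against it, which is the source of the sublinear query time; a `None` result plays the role of $\emptyset$.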

We will use $\mathtt{ANN}(q,X,r,\beta)$ to denote the $(r,\beta)$ -ANN search for a query vector $q$ in database $X$ . The following is a summary of results on LSH-based ANN for $\ell_{1}/\ell_{2}$ distances.

Theorem 2.2 ([39, 15, 5]).

For $dist(\cdot,\cdot)$ being $\ell_{1}$ or $\ell_{2}$ , a database $X$ of $m$ vectors, and a $d$ -dimensional vector $q$ , there is an algorithm that solves $\mathtt{ANN}(q,X,r,\beta)$ using $O(dm+m^{1+\gamma})$ space and $O(dm^{\gamma})$ classical time, where $\gamma\approx 1/\beta$ for $\ell_{1}$ distance and $\gamma\approx 1/\beta^{2}$ for $\ell_{2}$ distance.

Remark 2.3.

We note that if we do not terminate the algorithm after encountering the first $p\in X$ such that $dist(p,q)\leq\beta r$, then the same algorithm can return a subset $Y\subseteq X$ including all vectors $p$ such that $dist(p,q)\leq r$, and excluding all vectors $p$ such that $dist(p,q)\geq\beta r$.

Remark 2.4.

We can also use LSH to find a set $J$ of pairs of vectors such that $J$ includes all pairs $(p,q)$ such that $dist(p,q)\leq r$ , and excludes all pairs $(p,q)$ such that $dist(p,q)\geq\beta r$ . To this end, we first hash all vectors, and then check the distances of all pairs of vectors that collide on at least one hash function.

3 Basic Operations on Quantum Data

The characteristics of quantum information dictate that we can only obtain an approximation of a quantum state ${\phi}$ with a finite number of quantum state copies. A celebrated result in quantum state tomography states that to learn an unknown $n$-qubit quantum state ${\phi}$ up to a trace distance $\varepsilon$, we already need $\Omega\left({d}/{\varepsilon^{2}}\right)$ copies of the quantum state, where $d=2^{n}$ is the dimension of ${\phi}$ [23, 49]. We thus treat two quantum states ${\phi},{\psi}$ with $D({\phi},{\psi})\leq\varepsilon$ as the same state. Consequently, all the operations that we support in a quantum database also need to be approximate. The precise definition of ‘approximation’ varies for different operations.

In this section, we formulate basic quantum data operations that we aim to support using our proposed sketches. When we say that an operation returns a quantum state $\phi$, we mean that it returns the identifier of $\phi$.

3.1 Equality Test

In the classical data setting, the equality test on two data objects $p$ and $q$ returns $1$ if $p=q$, and returns $0$ otherwise. In the quantum setting, since we cannot distinguish two quantum states using $o(d/\varepsilon^{2})$ copies of the states if their trace distance is at most $\varepsilon$, we need to introduce an approximate version of the equality test:

Definition 3.1 ($(\varepsilon,\beta)$-equality-test).

Given two quantum states ${\phi}$ and ${\psi}$ , output $1$ if $D({\phi},{\psi})\leq\varepsilon$ , and $0$ if $D({\phi},{\psi})>\beta\varepsilon$ . The output can be arbitrary if $\varepsilon<D({\phi},{\psi})\leq\beta\varepsilon$ .

In words, we consider two quantum states the same if their trace distance is at most $\varepsilon$ , and different if their trace distance is more than $\beta\varepsilon$ . If the distance falls between the two values, then the decision can be arbitrary. The gap between yes and no is inevitable for quantum data.

Given two quantum states ${\phi}$ and ${\psi}$, which may be unknown, the standard method for estimating their trace distance is the swap test [8]. The algorithm uses a controlled-SWAP gate (which can be implemented using $O(n)=O(\log d)$ unit gates) and two single-qubit Hadamard gates. The test outputs $1$ with probability $\frac{1+\absolutevalue{\phi^{\dagger}\psi}^{2}}{2}=1-\frac{D({\phi},{\psi})^{2}}{2}$, and $0$ otherwise. Therefore, using $O_{\beta}\left(\frac{1}{\varepsilon^{2}}\log{\frac{1}{\delta}}\right)$ such tests (the constant hidden in the big-$O$ depends on the constant $\beta$), we can differentiate the case $D({\phi},{\psi})>\beta\varepsilon$ from $D({\phi},{\psi})\leq\varepsilon$ with probability $1-\delta$.
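The statistics of the swap test can be imitated classically for small, known states. The sketch below does not simulate the circuit; it draws each outcome from the stated probability $\frac{1+|\phi^{\dagger}\psi|^{2}}{2}$ and inverts the relation $\Pr[1]=1-D^{2}/2$ to estimate $D$. The function names, the decision threshold, and the number of trials are assumptions of this illustration. On hardware, each trial would consume one fresh copy of both states.

```python
import math
import random


def swap_test(phi, psi, rng):
    """One simulated outcome: 1 with probability (1 + |phi^dagger psi|^2)/2."""
    overlap = sum(complex(a).conjugate() * b for a, b in zip(phi, psi))
    p_one = (1.0 + abs(overlap) ** 2) / 2.0
    return 1 if rng.random() < p_one else 0


def estimate_distance(phi, psi, trials, rng):
    """Estimate D(phi, psi) from the empirical frequency of outcome 1."""
    ones = sum(swap_test(phi, psi, rng) for _ in range(trials))
    d_squared = 2.0 * (1.0 - ones / trials)  # from Pr[1] = 1 - D^2 / 2
    return math.sqrt(min(1.0, max(0.0, d_squared)))


def equality_test(phi, psi, eps, beta, trials, rng):
    """Output 1 if the estimate is below a threshold between eps and beta*eps."""
    threshold = math.sqrt((1.0 + beta ** 2) / 2.0) * eps  # an assumed choice
    return 1 if estimate_distance(phi, psi, trials, rng) <= threshold else 0


rng = random.Random(0)  # fixed seed, only for reproducibility
s = 2 ** -0.5
zero, one, plus = [1, 0], [0, 1], [s, s]
same = equality_test(plus, plus, eps=0.1, beta=2.0, trials=2000, rng=rng)
different = equality_test(zero, one, eps=0.1, beta=2.0, trials=2000, rng=rng)
print(same, different)
```

Identical states always produce outcome 1, so the test accepts them; for orthogonal states the outcome is a fair coin and the estimated distance concentrates near 1.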

The main issue with this algorithm is that we have to consume fresh copies of database states for each equality test, which is unsustainable for a database system that is designed to answer an unlimited number of queries.

3.2 Search and Join

In the classical data setting, given a set of objects $X=\{p_{1},\ldots,p_{n}\}$ and a query object $q$, the search operation returns some $p_{i}\in X$ such that $p_{i}=q$ if such $p_{i}$ exists, and $\emptyset$ otherwise. In the quantum setting, again due to the difficulty of distinguishing two quantum states within a distance of $\varepsilon$, we propose the following approximate version.

Definition 3.2 ($(\varepsilon,\beta)$-search).

Given a query state ${\phi}$ and a database $X$, if there exists a state ${\psi}\in X$ such that $D({\phi},{\psi})\leq\varepsilon$, return a state ${\psi^{\prime}}\in X$ with $D({\phi},{\psi^{\prime}})\leq\beta\varepsilon$. Otherwise, either return a state ${\psi^{\prime}}\in X$ with $D({\phi},{\psi^{\prime}})\leq\beta\varepsilon$ or return $\emptyset$.

In other words, if there exists a state in the database which has a trace distance no more than $\varepsilon$ from the query state $\phi$ , we return a state in $X$ whose distance is no more than $\beta\varepsilon$ from $\phi$ (similar to the ANN search). Else if all states in the database have distances larger than $\beta\varepsilon$ from the query state, we return $\emptyset$ . In other cases, we either return a database state with distance no more than $\beta\varepsilon$ from the query state or return $\emptyset$ .

The most straightforward way is to perform the $(\varepsilon,\beta)$ -equality-test for each database state ${\psi}\in X$ with the query state ${\phi}$ . By the above algorithm for equality test (setting $\delta=1/m^{2}$ ), we can determine with probability $(1-m\delta)=(1-o(1))$ whether there exists a state ${\psi}\in X$ such that the $(\varepsilon,\beta)$ -equality-test on ${\phi}$ and ${\psi}$ returns $1$ . The above procedure takes $O\left(m\log d\log m/\varepsilon^{2}\right)$ quantum time, which is linear in terms of the number of states in the database. Another significant limitation of this method is the necessity of using fresh copies of the database states for each search operation because of the equality test, making the database system unsustainable.

A closely related operation to search is join, which is one of the most important operations in relational database systems. We introduce the quantum version of natural join as follows.

Definition 3.3 ( $(\varepsilon,\beta)$ -natural-join).

Given two databases $X$ and $Y$ of quantum states, we want to output a set that includes all pairs of states $(\phi,\psi)\ (\phi\in X,\psi\in Y)$ such that $D(\phi,\psi)\leq\epsilon$ , and excludes all pairs $(\phi,\psi)$ such that $D(\phi,\psi)>\beta\epsilon$ . The decisions for other pairs can be arbitrary.
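To make the two definitions concrete, the following reference sketch implements $(\varepsilon,\beta)$ -search and $(\varepsilon,\beta)$ -natural-join by brute force on explicitly known amplitude vectors. It is an illustration of the semantics only, not one of the algorithms of this paper. The function names and the midpoint threshold $(1+\beta)\varepsilon/2$ are our assumptions; the midpoint is chosen because it stays valid when the distance function is only accurate up to an additive error below $(\beta-1)\varepsilon/2$ .

```python
import math

def trace_distance(phi, psi):
    """Trace distance of two pure states, assumed to be sqrt(1 - |<phi|psi>|^2)."""
    inner = sum(a.conjugate() * b for a, b in zip(phi, psi))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))

def linear_scan_search(query, database, eps, beta, distance=trace_distance):
    """Return the index of the closest state if it passes the threshold, else None."""
    if not database:
        return None
    threshold = (1.0 + beta) * eps / 2.0
    best = min(range(len(database)), key=lambda i: distance(query, database[i]))
    return best if distance(query, database[best]) <= threshold else None

def approx_natural_join(X, Y, eps, beta, distance=trace_distance):
    """Return index pairs (i, j) whose (estimated) distance is at most the threshold.

    Pairs at distance <= eps are always reported and pairs at distance
    > beta * eps are never reported, as long as the distance estimates
    are accurate to within (beta - 1) * eps / 2.
    """
    threshold = (1.0 + beta) * eps / 2.0
    return [(i, j)
            for i, phi in enumerate(X)
            for j, psi in enumerate(Y)
            if distance(phi, psi) <= threshold]

if __name__ == "__main__":
    t = 0.1
    X = [[1.0, 0.0], [0.0, 1.0]]
    Y = [[1.0, 0.0], [math.cos(t), math.sin(t)]]
    print(approx_natural_join(X, Y, eps=0.1, beta=2.0))
    print(linear_scan_search([math.cos(t), math.sin(t)], X, eps=0.1, beta=2.0))
```

Both functions take time linear (respectively quadratic) in the database size, which is the baseline that the sketches in Section 4 aim to improve.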

3.3 Selection and Sorting

In relational databases for classical data, selection is typically denoted by $\sigma_{\theta}(R)$ , where $R$ is a relation and $\theta$ is a propositional formula that involves an attribute, a comparison operator in the set $\{<,>,\leq,\geq,=,\neq\}$ , and a constant value for comparison (e.g., age $\geq 8$ ). However, in the quantum data setting, quantum states cannot be directly compared. We can only apply a measurement $\mathcal{M}$ on the state ${\phi}$ and get a random outcome according to the distribution $\mathcal{M}({\phi})$ . As a classical analog, we would say a person’s age is $5$ with probability $0.6$ and $10$ with probability $0.4$ . (Footnote 3: This resembles probabilistic databases, but in the quantum data setting the probability distribution is not given explicitly, and the support size of the distribution is exponential in the number of qubits of each quantum state.) We thus look at the expectation value $\phi^{\dagger}M\phi$ for the observable $M$ corresponding to $\mathcal{M}$ .

The quantity $\phi^{\dagger}M\phi$ holds significant importance in quantum mechanics (see, e.g., the textbook [51]). It can be used to provide an estimate of the system’s average energy in a particular state, describe the level of non-classical correlations between entangled particles, quantify quantum information such as entropy, coherence, and entanglement, etc.

We define the $\varepsilon$ -approximate ‘ $\geq$ ’ selection operation for quantum data as follows.

Definition 3.4 ( $(\eta,\varepsilon)$ -selection).

Given a database $X$ , an observable $M$ , a threshold $\eta$ , and an error parameter $\varepsilon$ , return a set of states $S\subseteq X$ such that $S$ includes all database states ${\phi}$ such that $\phi^{\dagger}M\phi\geq\eta$ , but excludes all ${\phi}$ such that $\phi^{\dagger}M\phi\leq\eta-\varepsilon$ .

Note that the $\varepsilon$ -approximate equality selection can be implemented by taking the difference between $(\eta-\varepsilon,\varepsilon)$ -selection and $(\eta+2\varepsilon,\varepsilon)$ -selection, which includes all ${\phi}$ with $\eta-\varepsilon\leq\phi^{\dagger}M\phi\leq\eta+\varepsilon$ and excludes all ${\phi}$ with $\phi^{\dagger}M\phi\leq\eta-2\varepsilon$ or $\phi^{\dagger}M\phi\geq\eta+2\varepsilon$ . In the context of approximation, we can consider ‘ $<$ ’ and ‘ $>$ ’ the same as ‘ $\leq$ ’ and ‘ $\geq$ ’, respectively.
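The set-difference construction can be checked on a toy example. The sketch below assumes the expectation values $\phi^{\dagger}M\phi$ are known exactly and realizes a valid $(\eta,\varepsilon)$ -selection by cutting at $\eta-\varepsilon/2$ (any cut inside $(\eta-\varepsilon,\eta]$ would do); the function names and the sample values are hypothetical.

```python
def select_geq(expectations, eta, eps):
    """One valid (eta, eps)-selection: keep every state with value >= eta - eps / 2."""
    cut = eta - eps / 2.0
    return {name for name, value in expectations.items() if value >= cut}

def select_approx_equal(expectations, eta, eps):
    """Approximate equality selection as a difference of two '>=' selections."""
    lower = select_geq(expectations, eta - eps, eps)      # keeps values >= eta - eps
    upper = select_geq(expectations, eta + 2 * eps, eps)  # drops values <= eta + eps
    return lower - upper

if __name__ == "__main__":
    values = {"a": 0.50, "b": 0.58, "c": 0.25, "d": 0.75, "e": 0.42}
    # With eta = 0.5 and eps = 0.1: "a", "b", "e" lie in [0.4, 0.6] and are kept,
    # while "c" (<= 0.3) and "d" (>= 0.7) are removed.
    print(sorted(select_approx_equal(values, eta=0.5, eps=0.1)))
```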

We also note that $(\varepsilon,\beta)$ -search can be handled by looking at $\phi^{\dagger}M\phi$ for a specific observable $M$ , although this solution is not as efficient as the one using the dedicated sketches that we shall design for the search operation. We have included a reduction from $(\epsilon,\beta)$ -search to $(\eta,\epsilon)$ -selection in Appendix D.

In the context of databases, we are particularly interested in the following type of observables.

Definition 3.5 ( $k$ -local observable).

An observable $O$ of a system with $n$ qubits is called $k$ -local if it can be written as a sum of a constant number of terms, each acting on at most $k$ qubits. For instance, a $2$ -local observable in a $3$ -qubit system might look like:

O=O_{12}\otimes I_{3}+I_{1}\otimes O_{23},

where $O_{12}$ and $O_{23}$ are operators acting on the pairs of qubits (1,2) and (2,3) respectively, while $I_{3}$ and $I_{1}$ are the identity operators acting on the remaining qubits.
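As a concrete instance of the $2$ -local example, the sketch below builds $O=O_{12}\otimes I_{3}+I_{1}\otimes O_{23}$ as an explicit $8\times 8$ matrix using Kronecker products. The choice $O_{12}=O_{23}=Z\otimes Z$ and the helper names are ours, made purely for illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def expectation(M, phi):
    """phi^dagger M phi for an amplitude vector phi."""
    n = len(phi)
    total = sum(phi[i].conjugate() * M[i][j] * phi[j]
                for i in range(n) for j in range(n))
    return total.real

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
ZZ = kron(Z, Z)                           # assumed two-qubit term
O = add(kron(ZZ, I2), kron(I2, ZZ))       # O_12 (x) I_3 + I_1 (x) O_23

def basis_state(index, dim=8):
    """Computational basis state |index> on three qubits."""
    return [1.0 if i == index else 0.0 for i in range(dim)]

if __name__ == "__main__":
    print(expectation(O, basis_state(0b000)))  # both Z Z terms equal +1
    print(expectation(O, basis_state(0b010)))  # both Z Z terms equal -1
```

Each term touches only two of the three qubits, which is what makes the observable $2$ -local even though the full matrix has dimension $2^{3}$ .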

$k$ -local observables have been well studied in the literature (see [11, 42] and references therein). They are interesting because, in most practical scenarios, our goal is to identify specific properties of a quantum state (e.g., the energy, momentum, or spin of a photon) that rely on a small subset of qubits of the state. This is similar to the classical setting where most queries depend on a few attributes of a relational database table. For example, to retrieve all records in a table containing patient information for individuals aged $80$ years or older with systolic blood pressure at least $140$ , we only need to look at two attributes in the table: age and blood pressure. If we view each qubit of a quantum state as an attribute (e.g., spin, position, momentum, polarization, etc.), then a $k$ -local observable performs selection on at most $k$ attributes of the quantum state.

A problem related to selection is sorting. As a motivation, we would like to sort a set of given quantum states according to their average energy with respect to an observable determined by a particular application. Note that there is no natural order between the quantum states themselves. Therefore, introducing an observable and computing its expectation value is necessary to establish a total order between the quantum states.

We define the sorting operation with respect to an observable $M$ as follows. Similar to the selection operation, we introduce an additive approximation $\varepsilon$ in the sorted order.

Definition 3.6 ( $\varepsilon$ -sorting).

Given a database $X$ of $m$ states, an observable $M$ , and an error parameter $\varepsilon$ , return an order $(\phi_{1},\phi_{2},\ldots,\phi_{m})$ of the states in $X$ such that for all $i=1,\ldots,m-1$ , we have ${\phi_{i}}^{\dagger}M\phi_{i}\leq{\phi_{i+1}}^{\dagger}M\phi_{i+1}+\varepsilon$ .
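A short argument shows why estimates suffice for $\varepsilon$ -sorting: if every estimate is within $\varepsilon/2$ of the true value $\phi^{\dagger}M\phi$ , then sorting by the estimates yields an order that satisfies the definition, since each of the two neighbouring estimates contributes at most $\varepsilon/2$ of error. The sketch below checks this on made-up numbers; the function names and values are hypothetical.

```python
def eps_sort(estimates):
    """Order the state identifiers by their estimated expectation values."""
    return sorted(estimates, key=estimates.get)

def is_eps_sorted(order, true_values, eps):
    """Check the epsilon-sorting condition on consecutive states."""
    return all(true_values[a] <= true_values[b] + eps
               for a, b in zip(order, order[1:]))

if __name__ == "__main__":
    true_values = {"a": 0.30, "b": 0.34, "c": 0.80}
    estimates = {"a": 0.33, "b": 0.31, "c": 0.78}   # each within 0.05 of the truth
    order = eps_sort(estimates)
    print(order, is_eps_sorted(order, true_values, eps=0.1))
```

In this example the estimates swap "a" and "b" relative to the true order, yet the result is still a valid $0.1$ -sorting.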

4 Sketches for Quantum Data Operations

In this section, we introduce two quantum data sketches, vector sketches and shadow seeds, which are summaries of the original states for efficiently handling previously mentioned database operations.

Before delving into the details, let us use metaphors to provide some very high-level intuition of the two data summarizing methods. The vector sketches can be seen as capturing snapshots of the state from different angles, while each shadow seed can be seen as a piece of information gleaned from the state. Using multiple shadow seeds, we can reconstruct the original state at varying levels of resolution.

4.1 Vector Sketches for Equality-Test, Search, and Join

The idea of a vector sketch is to represent a quantum state ${\phi}$ as a vector in $\mathbb{R}^{t}$ with $t\ll d$ instead of a vector in $\mathbb{C}^{d}$ , while preserving certain distance properties. In this section, we design vector sketches for quantum states and then use them to conduct equality test, search, and join.

A natural way to construct the sketch is to take a number of random measurements on ${\phi}$ , and write down the measurement outcomes as a vector. The following result is due to Sen [53], rewritten for pure quantum states.

Theorem 4.1 ([53]).

Let ${\phi}$ and ${\psi}$ be two pure quantum states in $\mathbb{C}^{d}$ . With probability at least $\left(1-e^{-\Omega(d)}\right)$ over the choice of a random measurement basis $\mathcal{M}_{d}=\{M_{1},\ldots,M_{d}\}$ , there exists a universal constant $c\in(0,1)$ such that

c\cdot D({\phi},{\psi})\leq\norm{\mathcal{M}_{d}({\phi})-\mathcal{M}_{d}({\psi})}_{1}\leq D({\phi},{\psi}).

(1)

Theorem 4.1 connects the trace distance of two quantum states to the $\ell_{1}$ distance of their measurement outcome distributions. We note that the distortion in (1), $D({\phi},{\psi})/(cD({\phi},{\psi}))=1/c$ , is a large constant whose value is left unspecified in [53].

Vectors $\mathcal{M}_{d}({\phi})$ and $\mathcal{M}_{d}({\psi})$ are discrete distributions with outcomes $\{1,2,\ldots,d\}$ . It is well-known that for a discrete distribution $\mu$ over a domain of size $d$ , using $\Theta\left({(d+\log(1/\delta))}/{\epsilon^{2}}\right)$ samples we can obtain an empirical distribution $\widetilde{\mu}$ such that $\norm{\mu-\widetilde{\mu}}_{1}\leq\epsilon$ with probability $1-\delta$ (see, e.g., [10]).

Corollary 4.2.

Let $\widetilde{\mathcal{M}_{d}}({\phi})$ and $\widetilde{\mathcal{M}_{d}}({\psi})$ be the empirical distributions of measurement outcomes by applying $\mathcal{M}_{d}$ in Theorem 4.1 to $c_{s}(d+\log(1/\delta))/\epsilon^{2}$ (for a sufficiently large constant $c_{s}$ ) copies of ${\phi}$ and ${\psi}$ , respectively. With probability $1-\delta-e^{-\Omega(d)}$ , we have

c\cdot D({\phi},{\psi})-\epsilon\leq\norm{\widetilde{\mathcal{M}_{d}}({\phi})-\widetilde{\mathcal{M}_{d}}({\psi})}_{1}\leq D({\phi},{\psi})+\epsilon,

where $c\in(0,1)$ is a universal constant.

We can view $\widetilde{\mathcal{M}_{d}}({\phi})$ and $\widetilde{\mathcal{M}_{d}}({\psi})$ as two empirical probability vectors. However, since $d=2^{n}$ for an $n$ -qubit state, it is both space-expensive to store $\widetilde{\mathcal{M}_{d}}({\phi})$ and time-expensive to use it for database operations.

Embedding to $L_{1}$ -space

We aim to address the issue of efficiency in both time and space by showing that there is another distribution of measurements whose number of outcomes is independent of the state dimension $d$ , for which a similar connection exists between the trace distance of two quantum states and the $\ell_{1}$ distance of the corresponding measurement outcome distributions. Moreover, the distortion of our sketching can be made arbitrarily close to $1$ (compared with $1/c$ in (1)). It is worth noting that this distortion will significantly impact the efficiency of the search and join operations, as we will discuss shortly.

Our result is summarized in the following theorem.

Theorem 4.3.

Let ${\phi}$ and ${\psi}$ be two pure $d$ -dimensional quantum states. For any $\iota>0$ , there is a distribution $\pi$ of measurements with $k=c{\log(1/\delta)}/{\iota^{2}}$ outcomes for a sufficiently large constant $c$ , such that a measurement $\mathcal{M}_{k}$ sampled randomly from $\pi$ satisfies

(1-\iota)D({\phi},{\psi})\leq\sqrt{\frac{d}{k}}c_{\tau}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1}\leq(1+\iota)D({\phi},{\psi})

with probability at least $(1-\delta)$ , where $c_{\tau}\in[0.48,\sqrt{2}]$ is a universal computable constant. Additionally, the measurement sampling can be completed in $O(\log^{8}d)$ time, and the sampled measurement can be represented as a quantum circuit with a gate complexity of $O(\log^{2}d)$ .

Proof 4.4 (Proof Overview).

At a high level, our approach leverages a form of dimension reduction through quantum measurements. We make use of a technique called pretty good measurement [36] to generate random projective quantum measurements $\mathcal{M}$ with $k$ outcomes. The outputs of these measurements are random vectors serving as the embedding of the state $\phi$ into $\mathbb{R}^{k}$ .

We start by picking a random basis for $\mathbb{C}^{d}$ based on the Haar measure [35]. Let $x_{t},y_{t}\ (t=1,\ldots,d)$ be independent Gaussian random variables with mean zero and variance $\sigma^{2}=\frac{1}{2d}$ , and let $g\triangleq(c_{1},\ldots,c_{d})\in\mathbb{C}^{d}$ be a random vector where $c_{t}=x_{t}+iy_{t}$ . We repeat this process and generate $d$ complex Gaussian random vectors $g_{1},\ldots,g_{d}$ . These vectors are linearly independent with probability one, but they are not necessarily orthonormal. We make use of pretty good measurement to orthogonalize and normalize these vectors. More precisely, we construct the operator (matrix) $\Gamma\triangleq\sum_{t\in[d]}g_{t}^{\dagger}g_{t}$ , and define the vector $\gamma_{t}\triangleq\Gamma^{-1/2}g_{t}$ for each $t\in[d]$ . We can show that $\gamma_{1},\ldots,\gamma_{d}$ are orthonormal (and hence linearly independent). Moreover, the distribution of $\gamma_{t}$ is unitarily invariant, and hence follows the Haar measure. Intuitively, $\gamma_{t}$ is distributed uniformly over the surface of the unit sphere in $\mathbb{C}^{d}$ . Next, we randomly group the ${\gamma_{t}}$ ’s into $k$ groups and form random projection operators as

\Pi_{j}=\sum_{\ell\in[d/k]}(\gamma^{j}_{\ell})^{\dagger}\gamma^{j}_{\ell}\ .\quad\quad(j=1,\ldots,k)

Let $\mathcal{M}_{k}=\{\Pi_{1},\cdots,\Pi_{k}\}$ be the corresponding measurement. Clearly, $\mathcal{M}_{k}$ is a valid measurement with probability one. This random measurement facilitates an embedding of the quantum states in $\mathbb{C}^{d}$ into $\mathbb{R}^{k}$ . We carefully analyze the distortion of the embedding (i.e., the outcome distribution obtained by applying $\mathcal{M}_{k}$ to the quantum state) using tools from the concentration of measure and properties of the Haar distribution. We show that the distortion of this embedding is no more than $(1+\iota)$ with probability $(1-\delta)$ when $k=c\log(1/\delta)/\iota^{2}$ for a constant $c$ . The complete proof can be found in Appendix B.1.

The measurement construction described above could require polynomial time in $d$ . However, we demonstrate that it can be sampled more efficiently from the Clifford group in classical time $O(\log^{8}d)$ , leveraging the properties of unitary 2-designs from quantum information theory. The details are provided in Appendix B.1.1.
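The basic (polynomial-in- $d$ ) construction from the proof overview can be imitated classically for small $d$ . The sketch below is only an illustration under several assumptions: it replaces the $\Gamma^{-1/2}$ step by Gram-Schmidt orthonormalization (a swapped-in technique that also produces an orthonormal basis from Gaussian vectors), it groups consecutive basis vectors instead of drawing a random grouping, it uses unit-variance Gaussians because the scale disappears after normalization, and it omits the rescaling by $\sqrt{d/k}\,c_{\tau}$ . It is not the efficient Clifford-based sampler, and all function names are hypothetical.

```python
import math
import random

def inner(u, v):
    """Inner product <u|v> of two complex vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def random_orthonormal_basis(d, rng):
    """Orthonormalize d complex Gaussian vectors by (modified) Gram-Schmidt."""
    basis = []
    while len(basis) < d:
        v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(d)]
        for u in basis:
            c = inner(u, v)
            v = [b - c * a for a, b in zip(u, v)]
        norm = math.sqrt(inner(v, v).real)
        if norm > 1e-9:           # skip the (measure-zero) degenerate case
            basis.append([x / norm for x in v])
    return basis

def sketch(phi, basis, k):
    """Outcome distribution of the k-outcome projective measurement on phi.

    Outcome j has probability sum over the j-th group of |<gamma|phi>|^2.
    """
    d = len(basis)
    assert d % k == 0, "this sketch assumes k divides d"
    size = d // k
    return [sum(abs(inner(g, phi)) ** 2 for g in basis[j * size:(j + 1) * size])
            for j in range(k)]

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

if __name__ == "__main__":
    rng = random.Random(7)
    basis = random_orthonormal_basis(8, rng)
    phi = [1.0] + [0.0] * 7
    psi = [0.0, 1.0] + [0.0] * 6
    p, q = sketch(phi, basis, 4), sketch(psi, basis, 4)
    print(sum(p), sum(q), l1_distance(p, q))
```

In a quantum implementation the vector `sketch(phi, ...)` is not computed but estimated from measurement outcomes on copies of $\phi$ .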

To approximate $\sqrt{\frac{d}{k}}c_{\tau}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1}$ up to an additive error $\epsilon$ , we have to approximate $\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{1}$ up to $\epsilon^{\prime}=\frac{\epsilon}{\sqrt{d/k}\cdot c_{\tau}}$ . We have the following immediate corollary.

Corollary 4.5.

For any $\iota>0$ , let $k=c\log(1/\delta)/{\iota^{2}}$ for a sufficiently large constant $c$ , and let $\widetilde{\mathcal{M}_{k}}({\phi})$ and $\widetilde{\mathcal{M}_{k}}({\psi})$ be the empirical distributions of the outcomes by applying independent random measurements $\mathcal{M}_{k}$ in Theorem 4.3 to $c_{s}d/\epsilon^{2}$ (for a sufficiently large constant $c_{s}$ ) copies of ${\phi}$ and ${\psi}$ , respectively. With probability at least $1-\delta$ , we have

\displaystyle(1-\iota)D({\phi},{\psi})-\epsilon\leq\sqrt{\frac{d}{k}}c_{\tau}\norm{\widetilde{\mathcal{M}_{k}}({\phi})-\widetilde{\mathcal{M}_{k}}({\psi})}_{1}\leq(1+\iota)D({\phi},{\psi})+\epsilon,

where $c_{\tau}\in[0.48,\sqrt{2}]$ is the same constant in Theorem 4.3.

Embedding to $L_{2}$ -space

The sketch we have constructed for the $L_{1}$ -space can also be applied to the $L_{2}$ -space, albeit through a different analysis. The $\ell_{2}$ distance is interesting since we know from Theorem 2.2 that $\ell_{2}$ enjoys a slightly better ANN scheme in terms of time and space complexities, which will be useful for speeding up search and join operations. The proof of the following theorem can be found in Appendix B.2.

Theorem 4.6.

Let ${\phi}$ and ${\psi}$ be two pure $d$ -dimensional quantum states. For any $\iota>0$ , there is a distribution $\pi$ of measurements with $k=c\log(1/\delta)/\iota^{2}$ outcomes for a sufficiently large constant $c$ , such that a measurement $\mathcal{M}_{k}$ sampled randomly from $\pi$ satisfies

(1-\iota)D({\phi},{\psi})\leq\sqrt{\frac{d}{2}}\norm{\mathcal{M}_{k}({\phi})-\mathcal{M}_{k}({\psi})}_{2}\leq(1+\iota)D({\phi},{\psi})

with probability at least $1-\delta$ . Additionally, the measurement sampling can be completed in $O(\log^{8}d)$ time, and the sampled measurement can be represented as a quantum circuit with a gate complexity of $O(\log^{2}d)$ .

For a discrete distribution $\mu$ over a domain of size $d$ for any $d\geq 1$ , it takes $\Theta\left({\log(1/\delta)}/{\epsilon^{2}}\right)$ samples to obtain an empirical distribution $\widetilde{\mu}$ such that $\norm{\mu-\widetilde{\mu}}_{2}\leq\epsilon$ with probability $1-\delta$ (see, e.g., [10]). We have the following corollary.

Corollary 4.7.

For any $\iota>0$ , let $k=c{\log(1/\delta)}/{\iota^{2}}$ for a sufficiently large constant $c$ , and let $\widetilde{\mathcal{M}_{k}}({\phi})$ and $\widetilde{\mathcal{M}_{k}}({\psi})$ be the empirical distributions of the outcomes by applying independent random measurements $\mathcal{M}_{k}$ in Theorem 4.6 to $c_{s}{d\log(1/\delta)}/{\epsilon^{2}}$ (for a sufficiently large constant $c_{s}$ ) copies of ${\phi}$ and ${\psi}$ , respectively. With probability $1-\delta$ , we have

\displaystyle(1-\iota)D({\phi},{\psi})-\epsilon\leq\sqrt{\frac{d}{2}}\norm{\widetilde{\mathcal{M}_{k}}({\phi})-\widetilde{\mathcal{M}_{k}}({\psi})}_{2}\leq(1+\iota)D({\phi},{\psi})+\epsilon.

Johnson-Lindenstrauss Lemma in Our Context. It is natural to ask whether existing dimension reduction techniques, such as the Johnson–Lindenstrauss (JL) lemma, can be applied directly to the $d$ -dimensional vector representation $\alpha(\phi)=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{C}^{d}$ of a quantum state $\phi$ , or the outcome distribution $p(\phi)=(p_{1},\ldots,p_{d})\in\mathbb{R}^{d}\ (p_{i}=\absolutevalue{\alpha_{i}}^{2})$ when measured in the computational basis. After all, we can use quantum tomography to learn the representation $(\alpha_{1},\ldots,\alpha_{d})$ approximately. We would like to first point out that a direct application will not work, since we can construct simple examples demonstrating inherent distortions between the trace distance of quantum states and the $\ell_{1}/\ell_{2}$ distances of their $d$ -dimensional vector representations ( $\alpha(\phi)$ or $p(\phi)$ ), even when all the coordinates are real-valued and before any dimension reduction step. We leave the detailed examples and calculations to Appendix C. In our examples, for the $\alpha(\phi)$ vector representation, the distortions between the trace distance of quantum states and the $\ell_{1}$ and $\ell_{2}$ distances of the two corresponding vectors are at least $\sqrt{{d}/{6}}$ and $\sqrt{1.5}$ , respectively. For the $p(\phi)$ vector representation, the distortions between the trace distance of quantum states and the $\ell_{1}$ and $\ell_{2}$ distances of the two corresponding vectors are at least $\sqrt{3}$ and $\sqrt{{3d}/{4}}$ , respectively. Moreover, the JL lemma only applies to real vectors.

We also note that there exists a near-linear lower bound for dimension reduction in the $L_{1}$ space [4], indicating that, unlike the JL lemma for $L_{2}$ space, dimension reduction in the $L_{1}$ space is not generally possible.

We note that there is a way to circumvent the issues for embedding quantum states into the $L_{2}$ space: for each state $\phi$ , we write its density matrix $\phi\phi^{\dagger}$ as a real-valued $2d^{2}$ dimensional vector $v_{\phi}$ . By some calculation, we can show that the $\ell_{2}$ distance of $v_{\phi}$ and $v_{\psi}$ preserves the trace distance of the two original pure states $\phi$ and $\psi$ . We then perform dimension reduction on the vectors $v_{\phi}$ using the JL lemma. Our sketching algorithm has the following advantages compared with this “full tomography plus JL lemma” approach (setting the error probability $\delta=0.01$ ):

1. The memory usage of our sketch construction is independent of $d$ , while the memory needed for storing the classical vector representation of the quantum state $\phi$ is $O(d)$ and that for the density matrix $\phi\phi^{\dagger}$ is $O(d^{2})$ .

2. Our sketch construction takes $\tilde{O}(d/\epsilon^{2})$ time, while the full (pure) quantum state tomography takes $O(d^{2}/\epsilon^{5})$ [19] time and the dimension reduction using the JL lemma needs another $O(d^{2}/\epsilon^{2})$ time.

These comparisons demonstrate that our sketch construction using direct quantum measurements significantly outperforms the method of first converting the quantum state to its classical description followed by dimension reduction, both in terms of time and space, which are the main focus of this paper.
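The claim used in the “full tomography plus JL lemma” baseline, namely that the $\ell_{2}$ distance between the flattened density matrices preserves the trace distance of pure states, can be checked numerically. The sketch below relies on a fact that is our own calculation rather than a statement from the text: for pure states, $\phi\phi^{\dagger}-\psi\psi^{\dagger}$ has eigenvalues $\pm D({\phi},{\psi})$ , so its Frobenius norm equals $\sqrt{2}\,D({\phi},{\psi})$ when $D=\sqrt{1-\absolutevalue{\phi^{\dagger}\psi}^{2}}$ . The function names are hypothetical.

```python
import math

def trace_distance(phi, psi):
    """Trace distance of two pure states, assumed to be sqrt(1 - |<phi|psi>|^2)."""
    inner = sum(a.conjugate() * b for a, b in zip(phi, psi))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))

def density_vector(phi):
    """Flatten phi phi^dagger into a real vector of length 2 d^2.

    The first d^2 coordinates are real parts, the last d^2 are imaginary parts.
    """
    entries = [complex(a) * complex(b).conjugate() for a in phi for b in phi]
    return [z.real for z in entries] + [z.imag for z in entries]

def l2_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

if __name__ == "__main__":
    phi = [1.0, 0.0]
    psi = [1 / math.sqrt(2), 1j / math.sqrt(2)]
    lhs = l2_distance(density_vector(phi), density_vector(psi))
    rhs = math.sqrt(2) * trace_distance(phi, psi)
    print(lhs, rhs)  # the two values agree up to rounding
```

The vector has $2d^{2}$ coordinates, which is the source of the $O(d^{2})$ memory and time costs listed in the comparison.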

We now apply our embedding results to database operations.

The Equality-Test Operation

We observe that Corollary 4.5 and Corollary 4.7 directly provide a way for solving $(\epsilon,\beta)$ -equality-test. We just set $\iota=\epsilon=\frac{\varepsilon}{2}$ , and use the $\ell_{1}$ or $\ell_{2}$ distances between the two vector sketches $\widetilde{\mathcal{M}_{k}}({\phi})$ and $\widetilde{\mathcal{M}_{k}}({\psi})$ to estimate $D(\phi,\psi)$ up to an additive error $\varepsilon$ with probability $1-\delta$ . The running time is bounded by $O(k)=O(\log(1/\delta)/\varepsilon^{2})$ .

The Search Operation

We now illustrate how to use vector sketches and approximate nearest neighbor (ANN) to perform $(\epsilon,\beta)$ -search on quantum states.

Let $\epsilon$ and $(1+\iota)$ be the additive error and multiplicative error in Corollary 4.5/Corollary 4.7 for building $\left\{\left.\widetilde{\mathcal{M}_{k}}({\phi})\ \right|\ {\phi}\in X\right\}$ , respectively. We assume that an LSH indexing structure has already been built on top of the $\widetilde{\mathcal{M}_{k}}({\phi})$ ’s to achieve the time and space usage stated in Theorem 2.2. To handle $(\varepsilon,\beta)$ -search, we call $\mathtt{ANN}\left(\widetilde{\mathcal{M}_{k}}({\phi}),\left\{\left.\widetilde{\mathcal{M}_{k}}({\psi})\ \right|\ {\psi}\in X\right\},(1+\iota)\varepsilon,\beta_{nn}\right)$ , where $\beta_{nn}={\beta}/{(1+\iota+\epsilon/\varepsilon)}$ is the parameter for the tradeoff between the distortion and the time/space complexity in ANN. By Corollary 4.5/Corollary 4.7 and Theorem 2.2, if there exists a state ${\psi}\in X$ such that $D({\phi},{\psi})\leq\varepsilon$ , then ANN returns a state ${\psi^{\prime}}\in X$ such that $D({\phi},{\psi^{\prime}})\leq\beta\varepsilon$ . Otherwise, ANN either returns a state ${\psi^{\prime}}\in X$ with $D({\phi},{\psi^{\prime}})\leq\beta\varepsilon$ , or returns $\emptyset$ .

By Theorem 2.2, it takes $O(km^{\gamma})=O(m^{\gamma}\log m/\varepsilon^{2})$ classical time to perform the search. The space for storing the LSH index is $O(km+m^{1+\gamma})=O(m\log m/\varepsilon^{2}+m^{1+\gamma})$ , where $\gamma\approx 1/\beta_{nn}$ for $\ell_{1}$ and $\gamma\approx 1/\beta_{nn}^{2}$ for $\ell_{2}$ .

We note that in the above approach, we have to make sure that $\beta_{nn}\geq 1$ . In other words, we can only handle $(\epsilon,\beta)$ -search with $\beta\geq(1+\iota+\epsilon/\varepsilon)$ . However, since $\epsilon$ and $\iota$ can be positive constants arbitrarily close to $0$ , we can essentially handle all constants $\beta>1$ . Naturally, the higher the value of $\beta$ , the larger the $\beta_{nn}$ that we can pick for reducing the query time and space usage in the ANN search. In practice, a reasonably large constant $\beta$ may be acceptable, as the trace distance between two quantum states that are generated by separate entities or experiments is typically much larger than that between two states originating from the same entity or experiment (due to quantum noise or preparation errors).

Setting $\delta=1/m^{2}$ , $\iota=0.01$ and $\epsilon=0.01\varepsilon$ , we have $\beta_{nn}\geq 0.98\beta$ , and consequently $\gamma\leq 1.05/\beta^{2}$ . Applying our vector sketch with respect to the $\ell_{2}$ distance and the corresponding ANN search, we have the following theorem.

Theorem 4.8.

There is an index of size $O\left(\frac{m\log m}{\varepsilon^{2}}+m^{1+\frac{1.05}{\beta^{2}}}\right)$ , using which we can solve $(\epsilon,\beta)$ -search on a database of $m$ quantum states with success probability $1-o(1)$ and classical time $O\left(m^{\frac{1.05}{\beta^{2}}}\cdot\frac{\log m}{\varepsilon^{2}}\right)$ .

Note that the index space cost is independent of $d$ , and the query time is sublinear in $m$ (for $\beta>\sqrt{1.05}$ ) and independent of the state dimension $d$ .
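The constants in Theorem 4.8 follow from plugging the chosen parameters into $\beta_{nn}={\beta}/{(1+\iota+\epsilon/\varepsilon)}$ and $\gamma\approx 1/\beta_{nn}^{2}$ . The sketch below reproduces this arithmetic; the function name is hypothetical, and the approximation $\gamma\approx 1/\beta_{nn}^{2}$ is treated as an equality purely for illustration.

```python
def ann_parameters(beta, iota, eps_ratio):
    """Return (beta_nn, gamma for l1, gamma for l2).

    eps_ratio is the ratio (additive sketch error) / (search radius),
    i.e. epsilon / varepsilon in the notation of the text.
    """
    beta_nn = beta / (1.0 + iota + eps_ratio)
    return beta_nn, 1.0 / beta_nn, 1.0 / beta_nn ** 2

if __name__ == "__main__":
    beta = 2.0
    beta_nn, gamma_l1, gamma_l2 = ann_parameters(beta, iota=0.01, eps_ratio=0.01)
    print(beta_nn >= 0.98 * beta)         # beta_nn = beta / 1.02
    print(gamma_l2 <= 1.05 / beta ** 2)   # 1.02^2 = 1.0404 <= 1.05
```

For $\beta=2$ this gives $\beta_{nn}\approx 1.96$ and a query-time exponent of roughly $0.26$ in $m$ .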

The Join Operation

The sketch-based approach can also be used for join. Given a set of sketch vectors $\left\{\left.\widetilde{\mathcal{M}_{k}}({\phi})\ \right|\ {\phi}\in X\right\}$ , we can apply the same hashing process as that for the ANN search, and then verify (by computing the actual distance) all pairs of vectors that collide on at least one hash function. The space cost is the same as that of the search. The query time is dependent on the size of the join output, but it is still independent of the state dimension $d$ .

4.2 Shadow Seeds for Selection and Sorting

In this section, we develop a classical data summarization that can be used to estimate the expectation value $\phi^{\dagger}M\phi$ for an arbitrary $k$ -local observable $M$ . We make use of the classical shadow tomography (CST), introduced in [37], to approximate $\phi^{\dagger}M\phi$ up to a small additive error. CST tries to extract minimal information about the quantum state, without performing complete tomography, to estimate certain properties of the state described by observables.

For completeness, let us briefly describe the CST procedure using Pauli measurements. For each of the $N$ copies of ${\phi}$ , we select $n$ unitary operators, $U_{1},\ldots,U_{n}$ , randomly and independently from the set $\{I,H,S^{\dagger}H\}$ , where $H$ is the Hadamard gate and $S=\sqrt{Z}$ is the square root of the Pauli- $Z$ gate; see Appendix A.2 for their matrix representations. We then apply $U_{j}$ to the $j$ -th qubit of ${\phi}$ and measure the state in the computational basis. The result is a binary string $b_{1},\ldots,b_{n}\in\{0,1\}$ . The $n$ pairs $\{(b_{j},\mathtt{index}(U_{j}))\}_{j=1}^{n}$ form a row vector, where $\mathtt{index}(U_{j})$ is the index of $U_{j}$ in the set $\{I,H,S^{\dagger}H\}$ . Repeating this process for all $N$ copies yields $N$ rows, which form the seed matrix $A({\phi})=\{(b_{i,j},\mathtt{index}(U_{i,j}))\}_{i\in[N],j\in[n]}$ . We call $A({\phi})$ the shadow seeds. Clearly, $A({\phi})$ can be stored using $O(nN)$ classical bits, since each entry of $A({\phi})$ belongs to $\{0,1\}\times\{0,1,2\}$ .

At the time of query, given a $k$ -local observable $M$ , we first construct $k$ -local classical shadows $\hat{\rho}_{i}$ of the database state ${\phi}$ from each row $i\in[N]$ of its seed matrix $A({\phi})$ with respect to the $k$ -local observable $M$ . Suppose $M$ depends non-trivially on the $k$ qubits indexed by $Q\triangleq\{q_{1},\ldots,q_{k}\}$ . Let $e_{0}=(1,0)^{T},e_{1}=(0,1)^{T}$ be the standard basis vectors of the two-dimensional space. For each row $i\in[N]$ and column $j\in Q$ , we first construct a vector ${v_{i,j}}=U_{i,j}e_{b_{i,j}}$ . Next, we construct the $i$ -th shadow as a $2^{k}\times 2^{k}$ matrix $\hat{\rho}_{i}=\bigotimes_{j\in Q}\quantity(3v_{i,j}v^{\dagger}_{i,j}-I),$ where $I$ is the $2\times 2$ identity matrix. Finally, the estimator for $\phi^{\dagger}M\phi$ is given by $T=\frac{1}{N}\sum_{i\in[N]}\tr{M\hat{\rho}_{i}}$ . The following theorem states that $T$ is a good approximation of the expectation value $\phi^{\dagger}M\phi$ .

Theorem 4.9 (Based on [37]).

The above procedure prepares an $N\times n$ shadow seed matrix $A({\phi})$ given $N$ copies of an $n$ -qubit quantum state ${\phi}$ , such that for any given $k$ -local observable $M$ , if $N\geq 4^{k}\norm{M}_{\infty}^{2}{\log(1/\delta)}/{\varepsilon^{2}}$ , the estimator $T$ approximates $\phi^{\dagger}M\phi$ up to an additive error $\varepsilon$ with probability $(1-\delta)$ using $A({\phi})$ . Moreover, the time for computing $\phi^{\dagger}M\phi$ using $A({\phi})$ is bounded by $O\left(2^{2k}N\right)(\propto 16^{k})$ , and the space for storing $A({\phi})$ is $O(Nn)$ classical bits.

Note that the space cost and query time are both independent of the state dimension $d$ .
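The estimator $T$ can be illustrated with a small classical simulation for observables that are tensor products of single-qubit operators, in which case $\tr{M\hat{\rho}_{i}}$ factorizes into a product of $2\times 2$ traces. This is a sketch under explicit assumptions: the simulated database state is a product state (so that sampling is easy), the basis index $0,1,2$ is taken to mean a measurement in the Pauli $Z$ , $X$ , $Y$ eigenbasis, and $v_{i,j}$ is taken to be the eigenvector of that basis matching the observed bit. The labelling and all function names are ours.

```python
import math
import random

S2 = 1.0 / math.sqrt(2.0)
# Assumed eigenvectors, keyed by (basis index, outcome bit): 0 = Z, 1 = X, 2 = Y.
EIGVEC = {
    (0, 0): (1.0, 0.0), (0, 1): (0.0, 1.0),
    (1, 0): (S2, S2), (1, 1): (S2, -S2),
    (2, 0): (S2, S2 * 1j), (2, 1): (S2, -S2 * 1j),
}

def snapshot(basis, bit):
    """Single-qubit factor 3 v v^dagger - I of the classical shadow."""
    v = EIGVEC[(basis, bit)]
    return [[3 * v[r] * v[c].conjugate() - (1 if r == c else 0)
             for c in range(2)] for r in range(2)]

def trace_product(A, B):
    """tr(A B) for two 2x2 matrices."""
    return sum(A[r][c] * B[c][r] for r in range(2) for c in range(2))

def sample_row(qubit_states, rng):
    """One seed row [(bit, basis), ...] for a product state (simulation only)."""
    row = []
    for phi in qubit_states:
        basis = rng.randrange(3)
        v0 = EIGVEC[(basis, 0)]
        p0 = abs(v0[0].conjugate() * phi[0] + v0[1].conjugate() * phi[1]) ** 2
        row.append((0 if rng.random() < p0 else 1, basis))
    return row

def estimate(seed_rows, qubits, factors):
    """Estimator T for M = tensor product of `factors` acting on `qubits`."""
    total = 0.0
    for row in seed_rows:
        term = 1.0
        for q, P in zip(qubits, factors):
            bit, basis = row[q]
            term = term * trace_product(P, snapshot(basis, bit))
        total += complex(term).real
    return total / len(seed_rows)

if __name__ == "__main__":
    Z = [[1, 0], [0, -1]]
    rng = random.Random(1)
    rows = [sample_row([(1.0, 0.0), (1.0, 0.0)], rng) for _ in range(20000)]
    print(estimate(rows, [0, 1], [Z, Z]))  # true value for |00> is 1
```

Only the columns in $Q$ are read at query time, and the seed rows are reused for every observable, which is why no fresh copies of the state are needed per query.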

Typically, the $k$ -local observable $M$ can be expressed as a quantum circuit with $\text{poly}(k)$ gate complexity. In this case, we propose a new estimation algorithm to further improve the total query time from $O(16^{k})$ to $O(9^{k})$ (omitting other less critical factors) by an approach we call QCQC (quantum $\to$ classical $\to$ quantum $\to$ classical). We have the following theorem, whose proof can be found in Appendix B.3.

Theorem 4.10.

There is a procedure for preparing an $N\times n$ shadow seed matrix $A({\phi})$ given $N$ copies of an $n$ -qubit quantum state ${\phi}$ , such that for any given $k$ -local observable $M$ with $\text{poly}(k)$ gate complexity, if $N\geq 9^{k}\norm{M}_{\infty}^{2}{\log(1/\delta)}/{\varepsilon^{2}}$ , we can approximate $\phi^{\dagger}M\phi$ up to an additive error $\varepsilon$ with probability $(1-\delta)$ using $A({\phi})$ . Moreover, the quantum time for computing $\phi^{\dagger}M\phi$ using $A({\phi})$ is bounded by $O\left(N\text{poly}(k)\right)(\propto 9^{k})$ , and the space for storing $A({\phi})$ is $O(Nn)$ classical bits.

The Selection Operation

It is easy to see that Theorem 4.10 directly implies an algorithm for handling $(\eta,\varepsilon)$ -selection: Setting $\delta=1/m^{2}$ , we can estimate $\phi^{\dagger}M\phi$ up to an additive error $\varepsilon$ with probability $(1-1/m^{2})$ for each $n$ -qubit database state ${\phi}$ using an $N\times n$ shadow seed matrix, where $N\geq 9^{k}\norm{M}_{\infty}^{2}\cdot{2\log m}/{\varepsilon^{2}}$ . By a union bound over $m$ database states, we can solve the $(\eta,\varepsilon)$ -selection problem with probability $(1-1/m)$ . The query time is bounded by $Nm\cdot\text{poly}(k)=9^{k}m\log m\cdot\text{poly}(k)\norm{M}_{\infty}^{2}/{\varepsilon^{2}}$ .

Theorem 4.11.

There is an index of size $O\left(9^{k}nW^{2}{\log m}/{\varepsilon^{2}}\right)$ , using which we can solve for any $k$ -local observable $M\ (\norm{M}_{\infty}\leq W)$ the $(\eta,\epsilon)$ -selection on a database of $m$ $n$ -qubit quantum states with success probability $(1-o(1))$ and quantum time $9^{k}m{\log m}W^{2}\text{poly}(k)/{\varepsilon^{2}}$ .
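To get a feel for the bounds, the sketch below evaluates the number of seed rows $N\geq 9^{k}\norm{M}_{\infty}^{2}\cdot 2\log m/\varepsilon^{2}$ used for selection and compares the $9^{k}$ and $4^{k}$ dependence of Theorems 4.10 and 4.9. It assumes natural logarithms and ignores all hidden constants, so the numbers are indicative only; the function name is hypothetical.

```python
import math

def shadow_rows_needed(k, op_norm, eps, m, base=9):
    """Rows N per database state for delta = 1 / m^2 (natural log, no constants).

    base = 9 corresponds to Theorem 4.10, base = 4 to Theorem 4.9.
    """
    return math.ceil(base ** k * op_norm ** 2 * 2 * math.log(m) / eps ** 2)

if __name__ == "__main__":
    k, op_norm, eps, m, n = 2, 1.0, 0.1, 1000, 50
    rows_9 = shadow_rows_needed(k, op_norm, eps, m, base=9)
    rows_4 = shadow_rows_needed(k, op_norm, eps, m, base=4)
    print(rows_9, rows_4)
    # Each seed entry lies in {0,1} x {0,1,2}, so 3 bits per entry suffice.
    print("bits per state:", 3 * rows_9 * n)
```

Neither quantity depends on the state dimension $d=2^{n}$ ; the number of qubits $n$ enters only linearly through the width of the seed matrix.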

The Sorting Operation

Since the shadow seed matrix can be used for estimating the expectation value $\phi^{\dagger}M\phi$ up to an additive error $\varepsilon$ , we can use it for $\varepsilon$ -sorting with the same space and time complexity as that for the selection operation.

5 Conclusion and Future Work

In this paper, we have defined basic database queries for quantum data and proposed several classical sketches of quantum states to facilitate these queries. We consider our work a preliminary step towards a comprehensive quantum data management system. Numerous questions and directions remain open following this work. We list a few below.

Support More Data Operations

This paper primarily focuses on two basic database operations: search and selection, along with several related operations. We would like to expand the support to more complex operations for data analytics, such as clustering and classification, for which we may need to develop new classical summaries of the quantum states for the sake of efficiency.

Mixed States

In various scenarios, such as when the description of a quantum system is unknown due to quantum noise, the use of a density operator (or density matrix) for describing mixed quantum states becomes more convenient. If the quantum system is in one of a collection of $d$ -dimensional pure states $\{{\phi_{1}},\ldots,{\phi_{k}}\}$ , we can represent a mixed quantum state as $\rho=\sum_{i=1}^{k}p_{i}\phi_{i}{\phi_{i}}^{\dagger}$ , where $p_{1},\ldots,p_{k}\geq 0$ and $\sum_{i=1}^{k}p_{i}=1$ . We can view $\rho$ as a convex combination of outer products of pure states ${\phi_{i}}$ , where each $\phi_{i}{\phi_{i}}^{\dagger}$ is associated with a probability $p_{i}$ . We anticipate that the results presented in this paper can be extended to mixed states, although the technical aspects of this generalization require further investigation.

The Integration with the Theory of Relational Databases

A key feature of our proposed model is that quantum data is represented entirely in the classical format. This unique aspect enables us to integrate our model with established theories related to indexing, query execution, and query optimization in relational databases designed for classical data. However, the integration process will likely require the redesign of multiple components to accommodate the inherent differences stemming from the distinct definitions of database operations for quantum data.

References

[1] Scott Aaronson. The learnability of quantum states. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088):3089–3114, 2007.
[2] Scott Aaronson. Shadow tomography of quantum states. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, STOC, pages 325–338. ACM, 2018.
[3] Scott Aaronson and Daniel Gottesman. Improved simulation of stabilizer circuits. Physical Review A, 70(5):052328, nov 2004. doi:10.1103/physreva.70.052328.
[4] Alexandr Andoni, Moses Charikar, Ofer Neiman, and Huy L. Nguyen. Near linear lower bound for dimension reduction in L1. In Rafail Ostrovsky, editor, FOCS, pages 315–323. IEEE Computer Society, 2011.
[5] Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459–468. IEEE Computer Society, 2006.
[6] Costin Badescu and Ryan O’Donnell. Improved quantum data analysis. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC, pages 1398–1411. ACM, 2021.
[7] Costin Badescu, Ryan O’Donnell, and John Wright. Quantum state certification. CoRR, abs/1708.06002, 2017.
[8] Harry Buhrman, Richard Cleve, John Watrous, and Ronald De Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
[9] Umut Çalikyilmaz, Sven Groppe, Jinghua Groppe, Tobias Winker, Stefan Prestel, Farida Shagieva, Daanish Arya, Florian Preis, and Le Gruenwald. Opportunities for quantum acceleration of databases: Optimization of queries and transaction schedules. Proc. VLDB Endow., 16(9):2344–2353, 2023.
[10] Clément L. Canonne. A short note on learning discrete distributions, 2020. arXiv:2002.11457.
[11] Thomas Chen, Shivam Nadimpalli, and Henry Yuen. Testing and learning quantum juntas nearly optimally. In Nikhil Bansal and Viswanath Nagarajan, editors, SODA, pages 1163–1185.
[12] Kai-Min Chung and Han-Hsuan Lin. Sample efficient algorithms for learning quantum channels in PAC model and the approximate state discrimination problem. In Min-Hsiu Hsieh, editor, TQC, volume 197 of LIPIcs, pages 3:1–3:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
[13] Paul Cockshott. Quantum relational databases, 1997. arXiv:quant-ph/9712025.
[14] Christoph Dankert, Richard Cleve, Joseph Emerson, and Etera Livine. Exact and approximate unitary 2-designs and their application to fidelity estimation. Physical Review A, 80(1):012304, July 2009. doi:10.1103/physreva.80.012304.
[15] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Jack Snoeyink and Jean-Daniel Boissonnat, editors, SOCG, pages 253–262. ACM, 2004.
[16] D.P. DiVincenzo, D.W. Leung, and B.M. Terhal. Quantum data hiding. IEEE Transactions on Information Theory, 48(3):580–598, March 2002. doi:10.1109/18.985948.
[17] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm, 2014. arXiv:1411.4028, doi:10.48550/ARXIV.1411.4028.
[18] Edward Farhi and Hartmut Neven. Classification with quantum neural networks on near term processors. February 2018. arXiv:1802.06002.
[19] Daniel Stilck França, Fernando G. S. L. Brandão, and Richard Kueng. Fast and robust quantum state tomography from few basis measurements. In Min-Hsiu Hsieh, editor, 16th Conference on the Theory of Quantum Computation, Communication and Cryptography, TQC 2021, July 5-8, 2021, Virtual Conference, volume 197 of LIPIcs, pages 7:1–7:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
[20] Siddhant Garg and Goutham Ramakrishnan. Advances in quantum deep learning: An overview, 2020. arXiv:2005.04316.
[21] Weiyuan Gong and Scott Aaronson. Learning distributions over quantum measurement outcomes. CoRR, abs/2209.03007, 2022.
[22] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Gary L. Miller, editor, STOC, pages 212–219. ACM, 1996.
[23] Jeongwan Haah, Aram W. Harrow, Zheng-Feng Ji, Xiaodi Wu, and Nengkun Yu. Sample-optimal tomography of quantum states. In Daniel Wichs and Yishay Mansour, editors, STOC, pages 913–925. ACM, 2016.
[24] Jeongwan Haah, Aram W. Harrow, Zhengfeng Ji, Xiaodi Wu, and Nengkun Yu. Sample-optimal tomography of quantum states. IEEE Transactions on Information Theory, pages 1–1, 2017. doi:10.1109/tit.2017.2719044.
[25] Rihan Hai, Shih-Han Hung, and Sebastian Feld. Quantum data management: From theory to opportunities. In ICDE, pages 5376–5381. IEEE, 2024. URL: https://doi.org/10.1109/ICDE60146.2024.00410, doi:10.1109/ICDE60146.2024.00410.
[26] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Physical review letters, 103(15):150502, 2009.
[27] Aram W. Harrow, Cedric Yen-Yu Lin, and Ashley Montanaro. Sequential measurements, disturbance and property testing. In Philip N. Klein, editor, SODA, pages 1598–1611. SIAM, 2017.
[28] Aram W. Harrow and John C. Napp. Low-depth gradient measurements can improve convergence in variational hybrid quantum-classical algorithms. Physical Review Letters, 126(14):140502, apr 2021. doi:10.1103/physrevlett.126.140502.
[29] Aram W. Harrow and Andreas J. Winter. How many copies are needed for state discrimination? IEEE Trans. Inf. Theory, 58(1):1–2, 2012.
[30] Patrick Hayden, Peter W. Shor, and Andreas Winter. Random quantum codes from gaussian ensembles and an uncertainty relation. Open Systems & Information Dynamics, 15(01):71–89, mar 2008.
[31] Mohsen Heidari, Ananth Y. Grama, and Wojciech Szpankowski. Toward physically realizable quantum neural networks. Association for the Advancement of Artificial Intelligence (AAAI), 2022.
[32] Mohsen Heidari, Mobasshir A Naved, Zahra Honjani, Wenbo Xie, Arjun Jacob Grama, and Wojciech Szpankowski. Quantum shadow gradient descent for variational quantum algorithms. 2024. URL: https://arxiv.org/abs/2310.06935, arXiv:2310.06935.
[33] Mohsen Heidari, Arun Padakandla, and Wojciech Szpankowski. A theoretical framework for learning from quantum data. In IEEE International Symposium on Information Theory (ISIT), 2021.
[34] Fumio Hiai and Denes Petz. The Semicircle Law, Free Random Variables and Entropy (Mathematical Surveys & Monographs). American Mathematical Society, 2006.
[35] Alexander S. Holevo. Quantum Systems, Channels, Information. De Gruyter, Berlin, Boston, 2013. doi:10.1515/9783110273403.
[36] Alexander Semenovich Holevo. On asymptotically optimal hypotheses testing in quantum statistics. Teoriya Veroyatnostei i ee Primeneniya, 23(2):429–432, 1978.
[37] Hsin-Yuan Huang, Richard Kueng, and John Preskill. Predicting many properties of a quantum system from very few measurements. Nature Physics, 16:1050–1057, 2020.
[38] Noah Huffman, Dmitri Pavlichin, and Tsachy Weissman. Lossy compression for schrödinger-style quantum simulations, 2024. URL: https://arxiv.org/abs/2401.11088, arXiv:2401.11088.
[39] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Jeffrey Scott Vitter, editor, STOC, pages 604–613. ACM, 1998.
[40] E. Joos, H. D. Zeh, C. Kiefer, D. J. W. Giulini, J. Kupsch, and I. O. Stamatescu. Decoherence and the Appearance of a Classical World in Quantum Theory. Springer, 2003.
[41] Stephen P. Jordan, Keith S. M. Lee, and John Preskill. Quantum algorithms for quantum field theories. Science, 336(6085):1130–1133, jun 2012. doi:10.1126/science.1217069.
[42] Julia Kempe, Alexei Y. Kitaev, and Oded Regev. The complexity of the local hamiltonian problem. SIAM J. Comput., 35(5):1070–1097, 2006.
[43] Ian D. Kivlichan, Craig Gidney, Dominic W. Berry, Nathan Wiebe, Jarrod McClean, Wei Sun, Zhang Jiang, Nicholas Rubin, Austin Fowler, Alán Aspuru-Guzik, Hartmut Neven, and Ryan Babbush. Improved fault-tolerant quantum simulation of condensed-phase correlated electrons via trotterization. Quantum, 4:296, jul 2020. doi:10.22331/q-2020-07-16-296.
[44] Yang Liu and Gui Lu Long. Deleting a marked item from an unsorted database with a single query, 2007. arXiv:0710.3301.
[45] Fabio Valerio Massoli, Lucia Vadicamo, Giuseppe Amato, and Fabrizio Falchi. A leap among entanglement and neural networks: A quantum survey, 2021. arXiv:2107.03313.
[46] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, December 2010.
[47] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, December 2010. URL: https://www.ebook.de/de/product/13055864/michael_a_nielsen_isaac_l_chuang_quantum_computation_and_quantum_information.html.
[48] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii. Quantum circuit learning. Physical Review A, 98(3):032309, sep 2018. doi:10.1103/physreva.98.032309.
[49] Ryan O’Donnell and John Wright. Efficient quantum tomography. In Daniel Wichs and Yishay Mansour, editors, STOC, pages 899–912. ACM, 2016.
[50] Alexey Pyrkov, Alex Aliper, Dmitry Bezrukov, Yen-Chu Lin, Daniil Polykovskiy, Petrina Kamya, Feng Ren, and Alex Zhavoronkov. Quantum computing for near-term applications in generative chemistry and drug discovery. Drug Discovery Today, page 103675, 2023.
[51] Jun John Sakurai and Eugene D Commins. Modern quantum mechanics, revised edition, 1995.
[52] Maria Schuld, Ilya Sinayskiy, and Francesco Petruccione. The quest for a quantum neural network. Quantum Information Processing, 13(11):2567–2586, Aug 2014. doi:10.1007/s11128-014-0809-8.
[53] Pranab Sen. Random measurement bases, quantum state distinction and applications to the hidden subgroup problem. In CCC, pages 274–287. IEEE Computer Society, 2006.
[54] Peter W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput., 26(5):1484–1509, 1997.
[55] Adam Smith, MS Kim, Frank Pollmann, and Johannes Knolle. Simulating quantum many-body dynamics on a current digital quantum computer. npj Quantum Information, 5(1):106, 2019.
[56] Xiaoming Sun, Guojing Tian, Shuai Yang, Pei Yuan, and Shengyu Zhang. Asymptotically optimal circuit depth for quantum state preparation and general unitary synthesis, 2023. arXiv:2108.06150.
[57] Immanuel Trummer and Christoph Koch. Multiple query optimization on the d-wave 2x adiabatic quantum computer. Proc. VLDB Endow., 9(9):648–659, 2016. URL: http://www.vldb.org/pvldb/vol9/p648-trummer.pdf, doi:10.14778/2947618.2947621.
[58] James D. Whitfield, Jacob Biamonte, and Alán Aspuru-Guzik. Simulation of electronic structure hamiltonians using quantum computers. Molecular Physics, 109(5):735–750, mar 2011. doi:10.1080/00268976.2011.552441.
[59] Tobias Winker, Sven Groppe, Valter Uotila, Zhengtong Yan, Jiaheng Lu, Maja Franz, and Wolfgang Mauerer. Quantum machine learning: Foundation, new techniques, and opportunities for database research. In Sudipto Das, Ippokratis Pandis, K. Selçuk Candan, and Sihem Amer-Yahia, editors, SIGMOD, pages 45–52. ACM, 2023.
[60] Xin-Chuan Wu, Sheng Di, Emma Maitreyee Dasgupta, Franck Cappello, Hal Finkel, Yuri Alexeev, and Frederic T. Chong. Full-state quantum circuit simulation by using data compression. In Michela Taufer, Pavan Balaji, and Antonio J. Peña, editors, SC, pages 80:1–80:24. ACM, 2019.
[61] Ahmed Younes. Database manipulation on quantum computers, 2007. arXiv:0705.4303.

Appendix A More Preliminaries

A.1 Basics of Quantum Information (A More Formal Approach)

Quantum States and Qubits. We can represent a $d$-dimensional pure quantum state as a vector in $\mathbb{C}^{d}$. Using the standard Dirac bra-ket notation, we write $\ket{\phi}=\sum_{i=0}^{d-1}\alpha_{i}\ket{i},$ where $\ket{0},\ket{1},\ldots,\ket{d-1}$ is an orthonormal basis of $\mathbb{C}^{d}$ (referred to as the computational basis), and the $\alpha_{i}$'s are called amplitudes and satisfy $\sum_{i=0}^{d-1}\absolutevalue{\alpha_{i}}^{2}=1$.

Let $\bra{\phi}$ denote the conjugate transpose of $\ket{\phi}$, and let $\innerproduct{\phi}{\varphi}$ and $\outerproduct{\phi}{\varphi}$ denote the inner product and outer product of the vectors $\ket{\phi}$ and $\ket{\varphi}$, respectively.

A qubit is a $2$-dimensional quantum state and can be represented as $\ket{\phi}=\alpha_{0}\ket{0}+\alpha_{1}\ket{1}$, where $\absolutevalue{\alpha_{0}}^{2}+\absolutevalue{\alpha_{1}}^{2}=1$. Generally, an $n$-qubit state can be represented as

\ket{\phi}=\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\alpha_{x_{1}\cdots x_{n}}\ket{x_{1}\cdots x_{n}},

where $\{\ket{x_{1}\cdots x_{n}}\}$ are the computational basis states of the $n$ -qubit system, and it holds that $\sum_{x_{1}\cdots x_{n}\in\{0,1\}^{n}}\absolutevalue{\alpha_{x_{1}\cdots x_{n}}}^{2}=1$ .

A quantum state is called separable if it can be written as a tensor product of at least two states $\ket{\phi}=\ket{\phi_{1}}\otimes\ket{\phi_{2}}\otimes\cdots\otimes\ket{\phi_{k}}$, which is often abbreviated as $\ket{\phi_{1}}\ket{\phi_{2}}\cdots\ket{\phi_{k}}$ or $\ket{\phi_{1}\phi_{2}\cdots\phi_{k}}$. Otherwise, the state is called entangled. A classic example of an entangled state is the Bell state $\ket{\phi}=\frac{\ket{00}+\ket{11}}{\sqrt{2}}$.
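For two subsystems, separability can be checked numerically through the Schmidt rank: a bipartite pure state is a tensor product exactly when its reshaped amplitude matrix has rank one. The sketch below assumes NumPy; the function name `schmidt_rank` and the tolerance are illustrative choices, and the check only concerns the single bipartition passed in.

```python
import numpy as np

def schmidt_rank(state, dim_a, dim_b, tol=1e-12):
    """Number of nonzero Schmidt coefficients across the (dim_a, dim_b) split."""
    amplitudes = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    singular_values = np.linalg.svd(amplitudes, compute_uv=False)
    return int(np.sum(singular_values > tol))

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

product = np.kron(ket0, ket_plus)                            # |0> tensor |+>
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)    # (|00> + |11>)/sqrt(2)

product_rank = schmidt_rank(product, 2, 2)   # rank 1: separable
bell_rank = schmidt_rank(bell, 2, 2)         # rank 2: entangled
```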

Quantum Operations

There are two types of quantum operations. The first is the unitary transformation: we apply a unitary operator $U$ to a quantum state $\ket{\phi}$ and get $U\ket{\phi}$. (A unitary operator is a linear operator $U$ such that $UU^{\dagger}=U^{\dagger}U=I$.) This type of operation describes the evolution of a closed quantum system. The second type is measurement. Quantum measurements are the interface for obtaining classical information about quantum states. Under the POVM (Positive Operator-Valued Measure) formalism, a quantum measurement $\mathcal{M}$ is described as a collection of $d\times d$ positive semi-definite operators $\{M_{i}\}$ with $\sum_{i}M_{i}=I$; each $M_{i}$ is associated with a measurement outcome $o_{i}$, which can be chosen by the experimentalist. When performing $\mathcal{M}$ on a quantum state $\ket{\phi}$, the probability of getting the outcome $o_{i}$ is given by $\expectationvalue{M_{i}}{\phi}$. Let $\mathcal{M}(\ket{\phi})$ be the probability distribution of the measurement outcomes after applying $\mathcal{M}$ to $\ket{\phi}$.

A projective measurement is a special case of a POVM where $M_{i}$ ’s are projective operators, i.e., $M_{i}^{2}=M_{i}$ . An example of a projective measurement is the measurement on the computational basis where $M_{i}=\outerproduct{i}{i}$ . Any POVM can be written as a unitary operator $U$ followed by a projective measurement.

Observables are physical variables that can be measured. An observable is represented by a Hermitian operator $M$ whose eigenvalues are the possible outcomes. The observable spectrally decomposes as $M=\sum_{i}\lambda_{i}M_{i}$, where $M_{i}$ is the projector onto the eigenspace of $M$ associated with the eigenvalue $\lambda_{i}$. The observable $M$ can be associated with the measurement $\mathcal{M}=\{M_{i}\}$ with outcomes $\{\lambda_{i}\}$. The expected value of an observable on a state $\ket{\phi}$ is expressed as $\operatorname{{\mathop{\mathbf{E}}}}[\mathcal{M}(\ket{\phi})]=\expectationvalue{M}{\phi}$.

When we say a measurement is performed on a $d$ -dimensional quantum state $\ket{\phi}$ in the computational basis, we mean that the measurement $\mathcal{M}=\{M_{0},M_{1},\ldots,M_{d-1}\}$ with $M_{i}=\outerproduct{i}{i}$ is applied on $\ket{\phi}$ . It is noteworthy that the probability of observing the measurement outcome $o_{i}$ , denoted by $\expectationvalue{M_{i}}{\phi}$ , is equal to $\absolutevalue{\innerproduct{\phi}{i}}^{2}=\absolutevalue{\alpha_{i}}^{2}$ , wherein $\alpha_{i}$ is the $i$ -th amplitude of the quantum state $\ket{\phi}$ .

A crucial property of quantum mechanics is that each measurement would cause a disturbance to the quantum states. If the measurement outcome is $o_{i}$ , then the post-measurement state of $\ket{\phi}$ can be written as

\ket{\phi^{\prime}}=\frac{B_{i}\ket{\phi}}{\sqrt{\expectationvalue{B_{i}^{\dagger}B_{i}}{\phi}}}\ ,

where $B_{i}$ satisfies $B_{i}^{\dagger}B_{i}=M_{i}$. This phenomenon significantly complicates quantum data management, as the quantum state cannot be entirely “recycled” following a measurement.
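The measurement rules above can be sketched for the computational basis, where $M_{i}=B_{i}=\outerproduct{i}{i}$. The code assumes NumPy; the function name `measure_computational_basis` and the example state are hypothetical.

```python
import numpy as np

def measure_computational_basis(phi, rng):
    """Sample outcome i with probability |alpha_i|^2; return (i, post-measurement state)."""
    phi = np.asarray(phi, dtype=complex)
    probs = np.abs(phi) ** 2                 # Born rule: <phi|M_i|phi> with M_i = |i><i|
    probs = probs / probs.sum()              # guard against rounding error
    i = int(rng.choice(len(phi), p=probs))
    projector = np.zeros((len(phi), len(phi)), dtype=complex)
    projector[i, i] = 1.0                    # B_i = |i><i|, so B_i^dagger B_i = M_i
    post = projector @ phi
    post = post / np.linalg.norm(post)       # divide by sqrt(<phi|B_i^dagger B_i|phi>)
    return i, post

rng = np.random.default_rng(0)
phi = np.array([np.sqrt(0.2), np.sqrt(0.8) * 1j])   # example amplitudes (an assumption)
outcome, post = measure_computational_basis(phi, rng)
```

Measuring `post` again returns the same outcome with certainty, which illustrates the disturbance: the amplitudes of the original state are no longer recoverable from the post-measurement state.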

Trace Distance

Given two quantum states $\ket{\phi}$ and $\ket{\psi}$ , we define their trace distance to be

D(\ket{\phi},\ket{\psi})=\frac{1}{2}\norm{\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}}_{1}=\sqrt{1-\absolutevalue{\innerproduct{\psi}{\phi}}^{2}}.

The trace distance is the most widely used distance measure for quantum states in the literature. It also has a nice physical meaning: Let $\mathcal{M}=\{M_{i}\}$ be a POVM, and let $p_{i}\triangleq\expectationvalue{M_{i}}{\phi}$ and $q_{i}\triangleq\expectationvalue{M_{i}}{\psi}$. That is, $p_{i}$ and $q_{i}$ are the probabilities of obtaining measurement outcome $o_{i}$ on $\ket{\phi}$ and $\ket{\psi}$, respectively. We have $D(\ket{\phi},\ket{\psi})=\max_{\mathcal{M}}\frac{1}{2}\sum_{i}\absolutevalue{p_{i}-q_{i}},$ which implies that if two quantum states are close in trace distance, then any measurement performed on them yields outcome distributions that are close in total variation distance. In other words, two quantum states that are close in terms of the trace distance are statistically indistinguishable under measurements.
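The operational meaning can be checked numerically for one particular measurement. The sketch below, assuming NumPy, compares the closed-form trace distance with the total variation distance (including the usual factor $1/2$) of the computational-basis outcome distributions; the helper names and the random test states are illustrative.

```python
import numpy as np

def trace_distance(phi, psi):
    """Closed form for pure states: sqrt(1 - |<psi|phi>|^2)."""
    overlap = abs(np.vdot(psi, phi)) ** 2
    return float(np.sqrt(max(0.0, 1.0 - overlap)))

def total_variation(phi, psi):
    """TV distance (with factor 1/2) of the computational-basis outcome distributions."""
    p = np.abs(phi) ** 2
    q = np.abs(psi) ** 2
    return 0.5 * float(np.sum(np.abs(p - q)))

def random_state(d, rng):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(1)
pairs = [(random_state(8, rng), random_state(8, rng)) for _ in range(100)]
# The trace distance should dominate the TV distance of any fixed measurement.
gaps = [trace_distance(a, b) - total_variation(a, b) for a, b in pairs]
```

Every entry of `gaps` should be non-negative up to rounding, since a fixed measurement can never distinguish two states better than the optimal one.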

A.2 Some Basic Quantum States and Gates

We list the vector representations of some basic quantum states that we need to use in this paper.

• $\ket{0}=\begin{pmatrix}1\\ 0\end{pmatrix}$
• $\ket{1}=\begin{pmatrix}0\\ 1\end{pmatrix}$
• $\ket{+}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ 1\end{pmatrix}$
• $\ket{-}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -1\end{pmatrix}$
• $\ket{+i}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ i\end{pmatrix}$
• $\ket{-i}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -i\end{pmatrix}$

We make use of two basic quantum gates:

• Hadamard gate: $H=\frac{1}{\sqrt{2}}\begin{bmatrix}1&1\\ 1&-1\end{bmatrix}$. It turns $\ket{0}$ to $(\ket{0}+\ket{1})/\sqrt{2}$, and turns $\ket{1}$ to $(\ket{0}-\ket{1})/\sqrt{2}$.
• Phase gate: $S=\begin{bmatrix}1&0\\ 0&i\end{bmatrix}$. It leaves $\ket{0}$ unchanged, and turns $\ket{1}$ to $i\ket{1}$.

Another set of special unitary operators is the Pauli operators, defined as:

I=\begin{bmatrix}1&0\\ 0&1\end{bmatrix},\quad X=\begin{bmatrix}0&1\\ 1&0\end{bmatrix},\quad Y=\begin{bmatrix}0&-i\\ i&0\end{bmatrix},\quad Z=\begin{bmatrix}1&0\\ 0&-1\end{bmatrix}.

Here, $I$ is the identity operator, $X$ represents a bit-flip operation, $Z$ represents a phase-flip operation, and $Y$ combines both bit-flip and phase-flip with an imaginary phase factor.
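The states and gates listed in this subsection can be written down directly as arrays, which makes the stated actions easy to verify. The sketch assumes NumPy; the variable names are illustrative.

```python
import numpy as np

# Pauli operators and the two basic gates, as defined above.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)

# Basic single-qubit states.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_minus = (ket0 - ket1) / np.sqrt(2)
ket_plus_i = (ket0 + 1j * ket1) / np.sqrt(2)

def is_unitary(u):
    """Check U U^dagger = I up to floating-point tolerance."""
    return np.allclose(u @ u.conj().T, np.eye(u.shape[0]))

h_on_0 = H @ ket0            # expected to equal |+>
s_on_plus = S @ ket_plus     # expected to equal |+i>
```

The identity $Y=iXZ$ makes precise the statement that $Y$ combines a bit flip and a phase flip with an imaginary phase factor.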

A.3 Clifford Group

In this paper, we make use of a special class of quantum unitary operations called the Clifford group, which is a fundamental construct in quantum information theory. The Clifford group consists of unitary operations that map Pauli operators to other Pauli operators under conjugation. Furthermore, circuits built from Clifford operations can be simulated efficiently on a classical computer, as described by the Gottesman-Knill theorem [47]. Any unitary in this group can be implemented (up to a global phase factor) using a circuit with only Hadamard, Phase, and CNOT gates. The formal definition of the Clifford group is given below.

Definition A.1.

The Clifford group $\mathcal{C}_{n}$ on $n$ -qubits is defined as the normalizer of the Pauli group $\mathcal{P}_{n}$ under the action of conjugation, that is

\mathcal{C}_{n}=\{U\in U(2^{n})\,|\,UPU^{\dagger}\in\mathcal{P}_{n},\,\forall P\in\mathcal{P}_{n}\},

where $\mathcal{P}_{n}=\langle iI,X_{j},Y_{j},Z_{j}\,|\,j=1,\dots,n\rangle$ is the $n$ -qubit Pauli group generated by the identity $I$ and the Pauli matrices $X,Y,Z$ on each qubit, along with the phase factor $i$ .
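For a single qubit, the defining property in Definition A.1 can be tested by brute force: conjugate each Pauli matrix by the candidate unitary and look for a match among the phased Paulis. The sketch assumes NumPy; the helper names are hypothetical, and the $T$ gate is included only as a familiar non-Clifford example that does not appear in the text.

```python
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}
PHASES = (1, -1, 1j, -1j)

def conjugated_pauli(u, pauli_name):
    """Return (phase, name) with U P U^dagger = phase * name, or None if no match."""
    image = u @ PAULIS[pauli_name] @ u.conj().T
    for name, p in PAULIS.items():
        for phase in PHASES:
            if np.allclose(image, phase * p):
                return phase, name
    return None

def is_single_qubit_clifford(u):
    """True if conjugation by u maps X, Y, and Z into the Pauli group."""
    return all(conjugated_pauli(u, name) is not None for name in "XYZ")

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)
T = np.diag([1, np.exp(1j * np.pi / 4)])   # pi/8 gate: not a Clifford operation
```

For instance, `conjugated_pauli(H, "X")` reports that the Hadamard gate exchanges $X$ and $Z$, while $T$ maps $X$ to $(X+Y)/\sqrt{2}$ and therefore fails the test.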

A.4 Mathematical Tools

Lemma A.2 (Hoeffding’s inequality).

Let $X_{1},\dotsc,X_{n}\in[0,1]$ be i.i.d. random variables and $X=\frac{1}{n}\sum_{i=1}^{n}X_{i}$ . Then

\Pr[\absolutevalue{X-\mathbf{E}[X]}>t]\leq 2\exp(-2t^{2}n)\,.
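A typical use of Lemma A.2 is to choose the number of samples: solving $2\exp(-2t^{2}n)\leq\delta$ for $n$ gives $n\geq\ln(2/\delta)/(2t^{2})$. The short sketch below performs this calculation with the Python standard library; the function names and the example values of $t$ and $\delta$ are illustrative.

```python
import math

def hoeffding_bound(t, n):
    """Right-hand side of Hoeffding's inequality: 2 * exp(-2 t^2 n)."""
    return 2.0 * math.exp(-2.0 * t * t * n)

def hoeffding_samples(t, delta):
    """Smallest n such that 2 * exp(-2 t^2 n) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * t * t))

# Example (an assumption): additive error t = 0.1 with failure probability 0.05.
n_needed = hoeffding_samples(0.1, 0.05)
```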

Lemma A.3 (Generic Chernoff bound).

For any $a\in\mathbb{R}$ and random variable $X$ with moment generating function $M_{X}(t):=\mathbf{E}[e^{tX}]$ ,

\displaystyle\mathbf{Pr}[X\geq a]\leq\inf_{t>0}M_{X}(t)e^{-ta}.

Lemma A.4 (Berry-Esseen theorem).

Let $X_{1},X_{2},\ldots,X_{n}$ be i.i.d. random variables with $\mathbf{E}[X_{i}]=0$, $\mathbf{E}\left[X_{i}^{2}\right]=\sigma^{2}\in(0,\infty)$, and $\mathbf{E}\left[\absolutevalue{X_{i}}^{3}\right]=\rho<\infty$. If $Y_{n}:=\frac{1}{\sigma\sqrt{n}}\sum_{j}X_{j}$, then

\absolutevalue{\mathbf{Pr}[Y_{n}\leq x]-\phi_{N}(x)}\leq\frac{c\rho}{\sigma^{3}\sqrt{n}},

where $\phi_{N}(x)$ is the CDF of $N(0,1)$ and $c>0$ is an absolute constant.

Appendix B Missing Proofs

B.1 Proof of Theorem 4.3

Measurements Construction

We first describe how to generate random measurements. We start by picking a random basis for $\mathbb{C}^{d}$ based on the Haar measure, which can be done using a Gaussian ensemble of pure states as is used in [30]: Let $x_{t},y_{t}\ (t=1,\ldots,d)$ be independent Gaussian random variables with mean zero and variance $\sigma^{2}=\frac{1}{2d}$ , and let $g\triangleq(c_{1},\ldots,c_{d})\in\mathbb{C}^{d}$ be a random vector where $c_{t}=x_{t}+iy_{t}$ . We repeat this process and generate $d$ Gaussian random vectors (written in the ket notation) $\ket{g_{1}},\ldots,\ket{g_{d}}$ .

We next create an orthonormal basis for $\mathbb{C}^{d}$ using $\ket{g_{1}},\ldots,\ket{g_{d}}$ . It is clear that with probability one, $\ket{g_{t}}$ ’s are linearly independent, which means that they span $\mathbb{C}^{d}$ . However, they are not necessarily orthogonal. To address this issue, we use the pretty good measurement technique [35]. Define the operator $\Gamma\triangleq\sum_{t\in[d]}\outerproduct{g_{t}}{g_{t}}$ , and define the vector $\ket{\gamma_{t}}\triangleq\Gamma^{-1/2}\ket{g_{t}}$ for each $t\in[d]$ .

We note that computing $\Gamma^{-1/2}$ may be computationally expensive. In Section B.1.1, we will discuss a more efficient measurement construction via $t$-designs, a concept in quantum information theory that generalizes the idea of random sampling over the unitary group $U(d)$ of $d\times d$ unitary matrices.

Claim 1.

The set $\left\{\ket{\gamma_{t}}:t\in[d]\right\}$ forms an orthonormal basis for $\mathbb{C}^{d}$ .

Proof B.1.

Observe that

\sum_{t\in[d]}\outerproduct{\gamma_{t}}{\gamma_{t}}=\Gamma^{-1/2}\left(\sum_{t\in[d]}\outerproduct{g_{t}}{g_{t}}\right)\Gamma^{-1/2}=I.

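Moreover, the vectors $\ket{\gamma_{t}}$ are linearly independent, since the $\ket{g_{t}}$'s are linearly independent and $\Gamma^{-1/2}$ is invertible. Applying the identity above to $\ket{\gamma_{s}}$ gives $\sum_{t\in[d]}\innerproduct{\gamma_{t}}{\gamma_{s}}\ket{\gamma_{t}}=\ket{\gamma_{s}}$, and linear independence forces $\innerproduct{\gamma_{t}}{\gamma_{s}}=\delta_{ts}$. Hence, the $\ket{\gamma_{t}}$'s are orthonormal.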

We also note that the distribution of $\ket{\gamma_{t}}$ is unitarily invariant. Hence, each $\ket{\gamma_{t}}$ is distributed according to the Haar measure.

Next, we randomly group $\ket{\gamma_{t}}$ ’s into $k$ groups and form random projection operators as

\Pi_{j}=\sum_{\ell\in[d/k]}\outerproduct{\gamma^{j}_{\ell}}{\gamma^{j}_{\ell}},\qquad j=1,\ldots,k.

(2)

Let $\mathcal{M}_{k}=\{\Pi_{1},\cdots,\Pi_{k}\}$ be the corresponding measurement. Clearly, $\mathcal{M}_{k}$ is a valid POVM with probability one.
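The construction above can be sketched numerically as follows. The code assumes NumPy and that $k$ divides $d$; the function names are hypothetical, and $\Gamma^{-1/2}$ is computed through an eigendecomposition, which is one straightforward option rather than necessarily the procedure the authors have in mind.

```python
import numpy as np

def random_orthonormal_basis(d, rng):
    """Columns are |gamma_t> = Gamma^{-1/2} |g_t> for Gaussian vectors |g_t>."""
    sigma = np.sqrt(1.0 / (2 * d))
    # Column t of g is |g_t>, with real and imaginary parts of variance 1/(2d).
    g = rng.normal(0.0, sigma, (d, d)) + 1j * rng.normal(0.0, sigma, (d, d))
    gamma_op = g @ g.conj().T                      # Gamma = sum_t |g_t><g_t|
    evals, evecs = np.linalg.eigh(gamma_op)        # Hermitian, positive definite a.s.
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T
    return inv_sqrt @ g

def random_projective_measurement(d, k, rng):
    """Randomly split the basis into k groups of size d/k and sum the projectors."""
    if d % k != 0:
        raise ValueError("this sketch assumes k divides d")
    basis = random_orthonormal_basis(d, rng)
    order = rng.permutation(d)
    size = d // k
    projectors = []
    for j in range(k):
        cols = basis[:, order[j * size:(j + 1) * size]]
        projectors.append(cols @ cols.conj().T)    # Pi_j = sum_l |gamma_l^j><gamma_l^j|
    return projectors

rng = np.random.default_rng(7)
d, k = 16, 4
basis = random_orthonormal_basis(d, rng)
projectors = random_projective_measurement(d, k, rng)
```

The columns of `basis` should be orthonormal, and the projectors should sum to the identity, which is the POVM condition stated above.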

The Analysis of Distortion

Let $\mathcal{M}_{k}(\ket{\phi})$ and $\mathcal{M}_{k}(\ket{\psi})$ be the probability distributions of the measurement outcomes when the states are $\ket{\phi}$ and $\ket{\psi}$ , respectively. Then, the total variation distance between the two probability distributions can be written as

\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}\outerproduct{\phi}{\phi}}-\tr{\Pi_{j}\outerproduct{\psi}{\psi}}}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}},

(3)

where we have used Born's rule and set $A\triangleq\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}$.

We next show that the nonzero eigenvalues of $A$ are $\pm D(\ket{\phi},\ket{\psi})$, where $D(\cdot,\cdot)$ is the trace distance. To this end, suppose $\ket{\omega}$ is an eigenstate of $A$ with $A\ket{\omega}=\lambda\ket{\omega}$ and $\lambda\neq 0$, where $\lambda\in\mathbb{R}$ since $A$ is a Hermitian operator. Multiplying both sides by $\bra{\phi}$ gives

\matrixelement{\phi}{A}{\omega}=\innerproduct{\phi}{\omega}-\innerproduct{\phi}{\psi}\innerproduct{\psi}{\omega}=\lambda\innerproduct{\phi}{\omega}.

(4)

Similarly, multiplying both sides by $\bra{\psi}$ gives

\matrixelement{\psi}{A}{\omega}=\innerproduct{\psi}{\phi}\innerproduct{\phi}{\omega}-\innerproduct{\psi}{\omega}=\lambda\innerproduct{\psi}{\omega}.

(5)

Combining (4) and (5) gives

\displaystyle(1+\lambda)\innerproduct{\psi}{\omega}=\innerproduct{\psi}{\phi}\innerproduct{\phi}{\omega}=\innerproduct{\psi}{\phi}\frac{1}{1-\lambda}\innerproduct{\phi}{\psi}\innerproduct{\psi}{\omega}.

Hence, $(1-\lambda)(1+\lambda)=\absolutevalue{\innerproduct{\phi}{\psi}}^{2}$ , which implies

\lambda=\pm\sqrt{1-\absolutevalue{\innerproduct{\phi}{\psi}}^{2}}=\pm D(\ket{\phi},\ket{\psi}).

(6)
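The spectrum derived in (6) is easy to confirm numerically for random states: $A$ should have one eigenvalue $+D$, one eigenvalue $-D$, and zeros elsewhere. The sketch assumes NumPy; the dimension and the random seed are arbitrary illustrative choices.

```python
import numpy as np

def random_state(d, rng):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(3)
d = 6
phi, psi = random_state(d, rng), random_state(d, rng)

# A = |phi><phi| - |psi><psi| is Hermitian with rank at most 2.
a = np.outer(phi, phi.conj()) - np.outer(psi, psi.conj())
eigenvalues = np.sort(np.linalg.eigvalsh(a))

# Predicted magnitude of the nonzero eigenvalues: sqrt(1 - |<phi|psi>|^2).
lam = np.sqrt(1.0 - abs(np.vdot(phi, psi)) ** 2)
```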

Now, without loss of generality, let $\ket{1}$ and $\ket{2}$ denote the eigenstates of $A$ with eigenvalues $\absolutevalue{\lambda}$ and $-\absolutevalue{\lambda}$, respectively. Hence,

A=\absolutevalue{\lambda}(\outerproduct{1}{1}-\outerproduct{2}{2}).

(7)

The right-hand side of (3) can be written as

\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}=\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\expectationvalue{\Pi_{j}}{1}-\expectationvalue{\Pi_{j}}{2}}

(8)

=\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\sum_{\ell\in[d/k]}\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)}.

(9)

Let $W_{\ell}^{j}\triangleq d\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)$ . Combining (3) and (9), we have

\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=\absolutevalue{\lambda}\sum_{j\in[k]}\absolutevalue{\sum_{\ell\in[d/k]}\frac{1}{d}W_{\ell}^{j}}.

Multiplying both sides of the above equality by $\sqrt{\frac{d}{k}}$ gives

\sqrt{\frac{d}{k}}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=\frac{\absolutevalue{\lambda}}{k}\sum_{j\in[k]}\absolutevalue{\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}}=\frac{D(\ket{\phi},\ket{\psi})}{k}\sum_{j\in[k]}\absolutevalue{\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}},

(10)

where the last equality uses (6).

We next analyze the expectation and variance of each $W_{\ell}^{j}$. First, note that $\mathbf{E}\left[W_{\ell}^{j}\right]=0$, because the distribution of $\ket{\gamma^{j}_{\ell}}$ is unitarily invariant, which implies that $\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}$ and $\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}$ are identically distributed.

The variance of $W_{\ell}^{j}$ equals

\sigma_{W}^{2}\triangleq\mathbf{Var}\left[W_{\ell}^{j}\right]=\mathbf{E}\left[\|W_{\ell}^{j}\|^{2}\right]

(11)

={d^{2}}\mathbf{E}\left[\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)^{2}\right]

={2}{d^{2}}\left(\mathbf{E}\left[\absolutevalue{\innerproduct{1}{\gamma}}^{4}\right]-\mathbf{E}\left[\absolutevalue{\innerproduct{1}{\gamma}}^{2}\absolutevalue{\innerproduct{2}{\gamma}}^{2}\right]\right),

where $\ket{\gamma}$ is a random pure state generated according to the Haar measure. Since the Haar measure is invariant under unitary transformations, $\innerproduct{1}{\gamma}$ and $\innerproduct{2}{\gamma}$ have the same joint distribution as $U_{11}$ and $U_{21}$, where $U$ is a random unitary matrix (a Haar unitary) and $U_{ij}$ refers to the entry in the $i$-th row and $j$-th column of $U$. Note that all entries $U_{ij}$ of a Haar unitary $U$ are identically distributed [34]. Moreover, they can be written as $U_{ij}=re^{i\theta}$ with the distribution given by $\frac{d-1}{\pi}(1-r^{2})^{d-2}r\partial r\partial\theta$, where $r\in[0,1]$ and $\theta\in[0,2\pi]$. Therefore, the distribution of $|U_{11}|^{2}$ is given by $(d-1)(1-r)^{d-2}\partial r$. Consequently, from (11) we have

\sigma_{W}^{2}={2}{d^{2}}\left(\mathbf{E}\left[|U_{11}|^{4}\right]-\mathbf{E}\left[|U_{11}|^{2}|U_{21}|^{2}\right]\right).

(12)

The first expectation in (12) calculates as

\mathbf{E}[|U_{11}|^{4}]=(d-1)\int_{0}^{1}r^{2}(1-r)^{d-2}\partial r=(d-1)\mathcal{B}(3,d-1),

where $\mathcal{B}(\cdot,\cdot)$ is the Beta function that is defined as

\mathcal{B}(\alpha,\beta)=\int_{0}^{1}r^{\alpha-1}(1-r)^{\beta-1}\partial r.

The Beta function at positive integers can be calculated combinatorially as

\mathcal{B}(m,n)=\frac{(m+n)/(mn)}{\binom{m+n}{m}}.

Therefore,

\mathbf{E}\left[|U_{11}|^{4}\right]=(d-1)\frac{(d+2)/(3(d-1))}{\binom{d+2}{3}}=\frac{2}{d(d+1)}.

Similarly, the second expectation in (12) equals

\mathbf{E}\left[|U_{11}|^{2}|U_{21}|^{2}\right]=\frac{1}{d(d+1)}.

Consequently, (12) reduces to

\sigma_{W}^{2}={2}{d^{2}}\left(\frac{2}{d(d+1)}-\frac{1}{d(d+1)}\right)=\frac{2d}{d+1}.

(13)
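The arithmetic leading to (13) can be double-checked with exact rational numbers. The sketch below uses only the Python standard library; the function names are illustrative, and the cross moment $\mathbf{E}[|U_{11}|^{2}|U_{21}|^{2}]=\frac{1}{d(d+1)}$ is taken from the text rather than recomputed.

```python
from fractions import Fraction
from math import comb, factorial

def beta_int(m, n):
    """Beta function at positive integers: (m-1)! (n-1)! / (m+n-1)!."""
    return Fraction(factorial(m - 1) * factorial(n - 1), factorial(m + n - 1))

def beta_combinatorial(m, n):
    """The combinatorial form used above: ((m+n)/(mn)) / C(m+n, m)."""
    return Fraction(m + n, m * n) / comb(m + n, m)

def fourth_moment(d):
    """E|U_11|^4 = (d-1) * B(3, d-1)."""
    return (d - 1) * beta_int(3, d - 1)

def sigma_w_squared(d):
    """2 d^2 (E|U_11|^4 - E[|U_11|^2 |U_21|^2]), as in (12)."""
    cross = Fraction(1, d * (d + 1))   # value quoted in the text
    return 2 * d * d * (fourth_moment(d) - cross)

values = {d: sigma_w_squared(d) for d in range(2, 50)}
```

Each entry of `values` should equal $\frac{2d}{d+1}$, which approaches $2$ from below as $d$ grows.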

By (13), $\sigma_{W}^{2}\leq 2$. We now return to Equality (10). Let

Z^{j}\triangleq\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}.

(10) simplifies to

Q\triangleq\sqrt{\frac{d}{k}}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}=D(\ket{\phi},\ket{\psi})\cdot\frac{1}{k}\sum_{j\in[k]}|Z^{j}|.

(14)

Let

U\triangleq\frac{Q-\mathbf{E}[Q]}{D(\ket{\phi},\ket{\psi})}=\frac{1}{k}\sum_{j\in[k]}\left(|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]\right).

(15)

Applying the generic Chernoff bound (Lemma A.3) to $U$ gives

\mathbf{Pr}[U\geq\eta]\leq\inf_{t>0}M_{U}(t)e^{-t\eta},

(16)

where $M_{U}(t)=\mathbf{E}\left[e^{tU}\right]$ is the moment generating function.

Let $t_{0}\triangleq\eta k/2$ . We use $t=t_{0}$ to upper bound the RHS of (16). Since $Z^{j}$ ’s are i.i.d., we have

M_{U}(t_{0})=\prod_{j\in[k]}\mathbf{E}\left[\exp\left({\frac{t_{0}}{k}\left(|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]\right)}\right)\right]=\left(M_{\tilde{U}}\left(\frac{t_{0}}{k}\right)\right)^{k}=\left(M_{\tilde{U}}\left(\frac{\eta}{2}\right)\right)^{k},

(17)

where $\tilde{U}:=|Z^{j}|-\mathbf{E}\left[|Z^{j}|\right]$. Applying Taylor's expansion to $M_{\tilde{U}}\left(\frac{\eta}{2}\right)$ around $\eta=0$, we have

M_{\tilde{U}}\left(\frac{\eta}{2}\right)=M_{\tilde{U}}(0)+M^{\prime}_{\tilde{U}}(0)\cdot\frac{\eta}{2}+M^{{}^{\prime\prime}}_{\tilde{U}}(0)\cdot\frac{1}{2}\left(\frac{\eta}{2}\right)^{2}+\zeta_{\eta},

(18)

where $\zeta_{\eta}\leq c_{\eta}\eta^{3}$ (for a universal constant $c_{\eta}>0$ ) is the remainder term.

It is clear that $M_{\tilde{U}}(0)=1$. By the definition of the moment generating function, $M^{\prime}_{\tilde{U}}(0)=\mathbf{E}\left[\tilde{U}\right]=0$ and

M^{{}^{\prime\prime}}_{\tilde{U}}(0)=\mathbf{Var}\left[\tilde{U}\right]=\mathbf{Var}\left[|Z^{j}|\right]\leq\mathbf{Var}\left[Z^{j}\right]=\mathbf{Var}\left[W_{\ell}^{j}\right]=\sigma^{2}_{W}\ .

Therefore, (18) simplifies to the following:

M_{\tilde{U}}\left(\frac{\eta}{2}\right)\leq 1+\frac{\sigma^{2}_{W}}{2}\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3}\leq 1+\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3},

where we have used the fact that $\sigma_{W}^{2}\leq 2$ (see (13)).

In the rest of the analysis, we will focus on the parameter $\eta\leq\frac{1}{8c_{\eta}}$ ; the actual value of $\eta$ will be determined later.

Hence, by (17) we have

M_{U}(t_{0})\leq\left(1+\left(\frac{\eta}{2}\right)^{2}+c_{\eta}\eta^{3}\right)^{k}\leq\exp\left(\frac{\eta^{2}k}{4}+c_{\eta}\eta^{3}k\right),

(19)

where in the second inequality we have used the fact $\left(1+\frac{x}{k}\right)^{k}\leq e^{x}$ .

Plugging (19) into (16), we have

\mathbf{Pr}(U\geq\eta)\leq\exp\left(\frac{\eta^{2}k}{4}+c_{\eta}\eta^{3}k\right)\cdot\exp\left(-\frac{\eta^{2}k}{2}\right)=\exp\left(-\Omega(\eta^{2}k)\right),

where we have used $c_{\eta}\eta^{3}k\leq\frac{\eta^{2}k}{8}$ since $\eta\leq\frac{1}{8c_{\eta}}$ .
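
To make the constant hidden in the exponent explicit, the following worked step spells out the arithmetic. It is only a sketch that uses the choice $t_{0}=\eta k/2$ and the restriction $\eta\leq\frac{1}{8c_{\eta}}$ stated above; no new assumption is introduced.

```latex
\frac{\eta^{2}k}{4}+c_{\eta}\eta^{3}k-\frac{\eta^{2}k}{2}
\leq\frac{\eta^{2}k}{4}+\frac{\eta^{2}k}{8}-\frac{\eta^{2}k}{2}
=-\frac{\eta^{2}k}{8},
\qquad\text{so}\qquad
\mathbf{Pr}[U\geq\eta]\leq\exp\left(-\frac{\eta^{2}k}{8}\right).
```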

For the other direction, we have

\mathbf{Pr}\left[U\leq-\eta\right]=\mathbf{Pr}\left[-U\geq\eta\right]\leq M_{U}(-t_{0})e^{-t_{0}\eta}\leq\exp\left(-\Omega(\eta^{2}k)\right).

(20)

Hence, for any $\delta\in(0,1)$ , setting $k=c_{k}\left(\frac{1}{\eta^{2}}\log\frac{1}{\delta}\right)$ for a sufficiently large constant $c_{k}>0$ , we have that $|U|\leq\eta$ with probability at least $(1-\delta)$ . Combining (20), (15), and (14), we have that with probability at least $(1-\delta)$ ,

\absolutevalue{\sqrt{\frac{d}{k}}\cdot\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}}{D(\ket{\phi},\ket{\psi})}-\mathbf{E}[\absolutevalue{Z}]}\leq\eta,

(21)

where $Z$ is distributed identically as $Z^{j}$ ’s.

It remains to investigate $\mathbf{E}[|Z|]$ . We show that $\mathbf{E}[|Z|]=\Theta(1)$ . By the definition we know that

\mathbf{E}[Z]=\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}\mathbf{E}[W_{\ell}^{1}]=0.

We start with the upper bound.

\mathbf{E}[|Z|]\leq\sqrt{\mathbf{E}[|Z|^{2}]}=\sqrt{\mathbf{E}[Z^{2}]}=\sqrt{\mathbf{Var}[Z]}=\sigma_{W}\leq\sqrt{2}.

(22)

For the lower bound, Markov’s inequality implies that

\displaystyle\mathbf{E}\left[|Z|\right]\geq a\mathbf{Pr}\left[|Z|>a\right],

for any $a>0$. Since $Z$ is distributed identically as a normalized sum of i.i.d. random variables $\{W_{\ell}^{1}\}_{\ell\in[d/k]}$ with $\mathbf{E}[W_{\ell}^{1}]=0$, $\mathbf{E}[(W_{\ell}^{1})^{2}]=\sigma_{W}^{2}$, and $\mathbf{E}[(W_{\ell}^{1})^{3}]<+\infty$, by the Berry-Esseen theorem (Lemma A.4) we get

\mathbf{Pr}\left[\frac{1}{\sigma_{W}}|Z|>x\right]\geq\mathbf{Pr}[|N(0,1)|\geq x]-O\left(\frac{1}{\sqrt{d/k}}\right)=\left(1-\erf\left(\frac{x}{\sqrt{2}}\right)\right)-O\left(\frac{1}{\sqrt{d/k}}\right).

Hence,

\mathbf{E}\left[|Z|\right]\geq\sup_{x>0}\sigma_{W}x\left(1-\erf\left(\frac{x}{\sqrt{2}}\right)\right)-O\left(\frac{1}{\sqrt{d/k}}\right).

(23)

Recall that $\sigma_{W}\approx\sqrt{2}$ (see (13)). By a numerical computation, the maximum of the RHS of (23) is attained at $x\approx 0.7475$, which gives

\mathbf{E}\left[|Z|\right]\geq 0.4807-O\left(\frac{1}{\sqrt{d/k}}\right)\geq 0.48,

(24)

as long as $d/k$ is a sufficiently large constant.
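
The numerical computation mentioned above can be reproduced with a few lines of code. The sketch below is one possible way to do it, not necessarily the procedure used for the stated values: it assumes $\sigma_{W}=\sqrt{2}$ exactly, drops the $O(1/\sqrt{d/k})$ term, and uses a plain grid search whose step size and range are illustrative choices. The function names are hypothetical.

```python
import math


def lower_bound_objective(x, sigma_w=math.sqrt(2.0)):
    """sigma_W * x * (1 - erf(x / sqrt(2))), the expression maximized in (23)."""
    return sigma_w * x * (1.0 - math.erf(x / math.sqrt(2.0)))


def maximize_on_grid(step=1e-4, x_max=3.0):
    """Grid search over (0, x_max]; returns (approximate argmax, maximum value)."""
    best_x, best_val = 0.0, 0.0
    n_points = int(round(x_max / step))
    for i in range(1, n_points + 1):
        x = i * step
        val = lower_bound_objective(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


if __name__ == "__main__":
    x_star, value = maximize_on_grid()
    print(f"maximizer ~ {x_star:.4f}, maximum ~ {value:.4f}")
```

Because the objective is very flat near its maximum, the maximal value is insensitive to small changes in the maximizer; only the value enters (24).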

By (22) and (24), we have $\mu_{Z}\triangleq\mathbf{E}[\absolutevalue{Z}]=\Theta(1)$. Now, for any given constant $\iota>0$, we set $\eta=\min\left\{\iota\mu_{Z},\frac{1}{8c_{\eta}}\right\}$. Dividing (21) by $\mu_{Z}$ then gives

\absolutevalue{\sqrt{\frac{d}{k}}\frac{1}{\mu_{Z}}\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{1}}{D(\ket{\phi},\ket{\psi})}-1}\leq\iota.

B.1.1 Efficient Implementations of Random Measurements

Lastly, we address the space and time complexity of the sketching procedure in this proof. Specifically, we consider an efficient construction of $\mathcal{M}_{k}$. Note that the runtime of the original measurement construction based on pretty good measurements is polynomial in $d$, hence exponential in the number of qubits. This is because it generates $d$ kets $\ket{\gamma_{\ell}^{j}}$ according to the Haar measure, via complex Gaussian vectors in $\mathbb{C}^{d}$, and computes the inverse square root of the matrix $\Gamma$. In general, both the classical time and the circuit gate complexity of sampling from the Haar distribution are exponential in the number of qubits. In what follows, we present a construction for $\mathcal{M}_{k}$ with $\text{poly}\log(d)$ time and space.

Our approach is based on 2-design methods. A unitary $t$ -design is a concept in quantum information theory that generalizes the idea of random sampling over the unitary group $U(d)$ of $d\times d$ unitary matrices. It provides a way to approximate certain statistical properties of quantum states or operations without needing to sample from the entire group, which can be computationally expensive. Below we highlight basics of this concept.

Let $P_{t,t}(U)$ denote a polynomial which is homogeneous with degree at most $t$ in the matrix elements of $U$ , and at most degree $t$ in the complex conjugates of these elements.

Definition B.2.

A unitary $t$ -design is a finite set of unitary matrices $\{U^{(i)}\}_{i=1}^{N}$ such that for any homogeneous polynomial $P_{t,t}$

\frac{1}{N}\sum_{i=1}^{N}P_{t,t}(U^{(i)})=\int P_{t,t}\,d\mu_{\text{Haar}}(U)

where $\mu_{\text{Haar}}$ is the Haar measure on the space of $d\times d$ unitary matrices.

Intuitively, this definition implies that a unitary $t$ -design is indistinguishable from Haar measure when only polynomials of degree at most $t$ are used. We show that a 2-design is sufficient to obtain Theorem 4.3.

Note that the analysis of sample complexity in our proof relies on the generic Chernoff bound which, per (18), depends on the first two moments of

W_{\ell}^{j}=d\left(\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma_{\ell}^{j}}}^{2}\right).

We show that $W_{\ell}^{j}$ can be written as a polynomial $P_{1,1}$ of degree $t=1$ in the elements of a unitary matrix $U$. Let $U$ be the matrix whose columns are the vectors $\ket{\gamma_{\ell}^{j}}$. That is, $\innerproduct{r}{\gamma_{\ell}^{j}}$ is the element of $U$, denoted by $U_{r,(\ell,j)}$, located at row $r$ and the column indexed by $(\ell,j)$. With this definition, we can write $\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{2}=U_{1,(\ell,j)}U_{1,(\ell,j)}^{*}$, implying that $W_{\ell}^{j}=P_{1,1}(U)$. Therefore, the first moment of $W_{\ell}^{j}$ is the average of a polynomial $P_{1,1}$ of degree $t=1$ in $U$. Moreover, the second moment of $W_{\ell}^{j}$ is the average of a polynomial $P_{2,2}$ of degree $t=2$ in $U$. This is because, by (11), the second moment is a function of $\absolutevalue{\innerproduct{1}{\gamma_{\ell}^{j}}}^{4}$, which is written as $U_{1,(\ell,j)}^{2}(U_{1,(\ell,j)}^{*})^{2}$. Hence, as long as the first and second moments of $W_{\ell}^{j}$ match those under the Haar measure, the proof remains valid, indicating that a 2-design is sufficient.

Based on the above argument, instead of using the pretty good measurement construction, we sample from $2$ -design unitaries. This can be implemented efficiently using the Clifford group, a specialized class of quantum operators (for more details see Section A.3). It is well-known that the Clifford group is a 2-design [14]. Furthermore, there exists an algorithm that samples uniformly from the Clifford group in classical time $O(n^{8})$ and outputs a circuit representing the measurement with a gate complexity of $O(n^{2})$ , where $n=\log d$ is the number of qubits [16]. To construct the desired measurement $\mathcal{M}_{k}$ , we first randomly generate a Clifford circuit, apply it to the input quantum state, and then measure in the computational basis. The outcome of the measurement is a binary string in $\{0,1\}^{n}$ . Lastly, we design a binning function $f$ that takes the $n$ -bit string and outputs an index in $[k]$ . This is achieved by partitioning the set $\{0,1\}^{n}$ into $k$ equal-size bins, which can be efficiently implemented using a decision tree that only reads the first $\log k$ bits of the binary measurement outcomes. There are $k$ possible outputs of the decision tree, meaning that it can be represented by a function $f:\{0,1\}^{n}\rightarrow[k]$ that determines the bin index for any binary string of length $n$ . This construction is summarized as Algorithm 1.

Input: $k,n,\ket{\phi}$

1 Sample from the Clifford group on $n$ qubits and construct the random circuit $U$

2 Apply $U$ on the input state $\ket{\phi}$

3 Measure the first $\lceil\log k\rceil$ qubits along the computational basis.

4 return mod-$k$ of the decimal representation of the resulting binary string.

Algorithm 1: $k$-sketching measurement

In conclusion, the above construction generates a random measurement $\mathcal{M}_{k}$ with $O(n^{2})$ quantum gates and $O(n^{8})$ classical time. Moreover, this measurement enjoys the same bound on the sample complexity as for the original construction based on the Haar measure.
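
The classical post-processing in steps 3 and 4 of Algorithm 1 (the binning function $f$) can be sketched as follows. This is a minimal illustration under stated assumptions: the measurement outcome is given as a bit string, bins are indexed from $0$ to $k-1$ rather than by $[k]$, and the name `bin_index` is hypothetical. Sampling and applying the Clifford circuit is not modeled here.

```python
def bin_index(outcome_bits: str, k: int) -> int:
    """Map an n-bit measurement outcome to a bin index in {0, ..., k-1}.

    Only the first ceil(log2 k) bits are read, mirroring steps 3-4 of
    Algorithm 1; the remaining bits are ignored.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    prefix_len = (k - 1).bit_length()  # equals ceil(log2 k) for k >= 1
    if prefix_len > len(outcome_bits):
        raise ValueError("outcome has fewer than ceil(log2 k) bits")
    if prefix_len == 0:  # k == 1: there is a single bin
        return 0
    # Interpret the prefix as a binary number and reduce it modulo k.
    return int(outcome_bits[:prefix_len], 2) % k


if __name__ == "__main__":
    print(bin_index("1011", 4))  # reads the prefix "10"
```

With this mod-$k$ rule the bins have exactly equal size when $k$ is a power of two; for other values of $k$ some bins receive one more prefix than others.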

B.2 Proof of Theorem 4.6

The proof for Theorem 4.6 (the $\ell_{2}$ -norm case) is similar to that for Theorem 4.3 (the $\ell_{1}$ -norm case), but the calculation will be different due to different distance functions. Let $\mathcal{M}_{k}=\{\Pi_{1},\ldots,\Pi_{k}\}$ be the same random POVM generated as that in the proof of Theorem 4.3.

The $\ell_{2}$ distance between the output probability vectors can be written as

\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}\outerproduct{\phi}{\phi}}-\tr{\Pi_{j}\outerproduct{\psi}{\psi}}}^{2}=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}^{2},

(25)

where $A=\outerproduct{\phi}{\phi}-\outerproduct{\psi}{\psi}$ , which can be rewritten as $A=|\lambda|(\outerproduct{1}{1}-\outerproduct{2}{2})$ , where $\ket{1}$ and $\ket{2}$ are two eigenstates of $A$ . Hence, we can rewrite (25) as

\begin{aligned}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}&=\sum_{j\in[k]}\absolutevalue{\tr{\Pi_{j}A}}^{2}\\ &=\absolutevalue{\lambda}^{2}\sum_{j\in[k]}\absolutevalue{\expectationvalue{\Pi_{j}}{1}-\expectationvalue{\Pi_{j}}{2}}^{2}\\ &=\absolutevalue{\lambda}^{2}\sum_{j\in[k]}\left(\sum_{\ell\in[d/k]}\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)\right)^{2}.\end{aligned}

(26)

Multiplying both sides of (26) by a factor of $d$ , and letting $W_{\ell}^{j}\triangleq d\left(\absolutevalue{\innerproduct{1}{\gamma^{j}_{\ell}}}^{2}-\absolutevalue{\innerproduct{2}{\gamma^{j}_{\ell}}}^{2}\right)$ , we have

d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}=\absolutevalue{\lambda}^{2}\frac{1}{k}\sum_{j\in[k]}\left(\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}\right)^{2}.

(27)

Let

Z^{j}\triangleq\frac{1}{\sqrt{d/k}}\sum_{\ell\in[d/k]}W_{\ell}^{j}.

(28)

By a similar analysis as that in the proof of Theorem 4.3 (particularly, recall (6) and (13)), the expectation of the RHS of (27) is calculated to be

\mathbf{E}\left[\absolutevalue{\lambda}^{2}\frac{1}{k}\sum_{j\in[k]}|Z^{j}|^{2}\right]=\absolutevalue{\lambda}^{2}\sigma_{W}^{2}=D^{2}(\ket{\phi},\ket{\psi})\cdot\frac{2d}{d+1}.

(29)

We next analyze the variance of the RHS of (27).

We write

V\triangleq\frac{1}{|\lambda|^{2}}\left(d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}^{2}-|\lambda|^{2}\sigma_{W}^{2}\right)=\frac{1}{k}\sum_{j\in[k]}\left(|Z^{j}|^{2}-\sigma_{W}^{2}\right).

(30)

By the generic Chernoff bound (Lemma A.3), we have

\mathbf{Pr}(V\geq\eta)\leq\inf_{t>0}M_{V}(t)e^{-t\eta},

(31)

where $M_{V}(t)=\mathbf{E}\left[e^{tV}\right]$. Set $t_{0}=c_{t}\eta k$ for a constant $c_{t}$ to be determined later. We use $t=t_{0}$ to upper bound the RHS of (31). Since the $Z^{j}$’s are i.i.d., we have

M_{V}(t_{0})=\prod_{j\in[k]}\mathbf{E}\left[\exp\left({\frac{t_{0}}{k}\left(|Z^{j}|^{2}-\sigma_{W}^{2}\right)}\right)\right]=\left(M_{\tilde{V}}\left(\frac{t_{0}}{k}\right)\right)^{k}=\left(M_{\tilde{V}}\left(c_{t}\eta\right)\right)^{k},

(32)

where $\tilde{V}=|Z^{j}|^{2}-\sigma_{W}^{2}$ .

Applying Taylor’s expansion to $M_{\tilde{V}}\left(c_{t}\eta\right)$ around $\eta=0$, we get

M_{\tilde{V}}\left(c_{t}\eta\right)=M_{\tilde{V}}(0)+M^{\prime}_{\tilde{V}}(0)\cdot c_{t}\eta+M^{{}^{\prime\prime}}_{\tilde{V}}(0)\cdot\frac{1}{2}\left(c_{t}\eta\right)^{2}+\zeta_{\eta},

(33)

where $\zeta_{\eta}\leq c_{\eta}\eta^{3}$ (for a universal constant $c_{\eta}>0$ ) is the remainder term. We again have $M_{\tilde{V}}(0)=1$ , $M^{\prime}_{\tilde{V}}(0)=\mathbf{E}\left[\tilde{V}\right]=0$ , and $M^{{}^{\prime\prime}}_{\tilde{V}}(0)=\mathbf{Var}\left[\tilde{V}\right]\leq c_{\tilde{V}}$ for a universal constant $c_{\tilde{V}}>0$ .

In the rest of the analysis, we will focus on parameter $\eta\leq\frac{c_{\tilde{V}}c_{t}^{2}}{2c_{\eta}}$ ; the actual value of $\eta$ will be determined later.

Under this restriction on $\eta$, (33) yields

M_{\tilde{V}}\left(c_{t}\eta\right)\leq 1+\frac{c_{\tilde{V}}}{2}\left(c_{t}\eta\right)^{2}+c_{\eta}\eta^{3}\leq 1+c_{\tilde{V}}c_{t}^{2}\eta^{2}.

(34)

Plugging (34) into (32), we have

M_{V}(t_{0})=\left(M_{\tilde{V}}\left(c_{t}\eta\right)\right)^{k}\leq\left(1+c_{\tilde{V}}c_{t}^{2}\eta^{2}\right)^{k}\leq\exp\left(c_{\tilde{V}}c_{t}^{2}k\eta^{2}\right).

(35)

Plugging (35) into (31), we have

\mathbf{Pr}(V\geq\eta)\leq M_{V}(t_{0})e^{-t_{0}\eta}\leq\exp\left(c_{t}^{2}c_{\tilde{V}}k\eta^{2}\right)\exp\left(-c_{t}\eta^{2}k\right)=\exp\left(-\Omega(\eta^{2}k)\right),

where for the last equality to hold, we set constant $c_{t}=\frac{1}{2c_{\tilde{V}}}$ .
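
For concreteness, the arithmetic behind the last equality can be written out as follows. This is only a sketch that substitutes $c_{t}=\frac{1}{2c_{\tilde{V}}}$ into the exponent above.

```latex
c_{\tilde{V}}c_{t}^{2}k\eta^{2}-c_{t}\eta^{2}k
=\frac{\eta^{2}k}{4c_{\tilde{V}}}-\frac{\eta^{2}k}{2c_{\tilde{V}}}
=-\frac{\eta^{2}k}{4c_{\tilde{V}}},
\qquad\text{so}\qquad
\mathbf{Pr}(V\geq\eta)\leq\exp\left(-\frac{\eta^{2}k}{4c_{\tilde{V}}}\right).
```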

For the other direction, we have

\mathbf{Pr}\left[V\leq-\eta\right]=\mathbf{Pr}\left[-V\geq\eta\right]\leq M_{V}(-t_{0})e^{-t_{0}\eta}\leq\exp\left(-\Omega(\eta^{2}k)\right).

Hence, for any $\delta\in(0,1)$, setting $k=c_{k}\left(\frac{1}{\eta^{2}}\log\frac{1}{\delta}\right)$ for a sufficiently large constant $c_{k}>0$, we have that $|V|\leq\eta$ with probability at least $(1-\delta)$. Combining this with (27), (28), (29), and (30), we obtain

\absolutevalue{\frac{d\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}^{2}_{2}}{D^{2}(\ket{\phi},\ket{\psi})}-\frac{2d}{d+1}}\leq\eta,

which implies

\sqrt{2-\eta-o(1)}\leq\frac{\sqrt{d}\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}}{D(\ket{\phi},\ket{\psi})}\leq\sqrt{2+\eta}.

(36)

Now, for any constant $\iota>0$, we set $\eta=\min\left\{\iota,\frac{c_{\tilde{V}}c_{t}^{2}}{2c_{\eta}}\right\}$. Then (36) gives

\absolutevalue{\sqrt{\frac{d}{2}}\cdot\frac{\norm{\mathcal{M}_{k}(\ket{\phi})-\mathcal{M}_{k}(\ket{\psi})}_{2}}{D(\ket{\phi},\ket{\psi})}-1}\leq\iota.

B.3 Proof of Theorem 4.10

For the first part, we use the same procedure as CST to create the $N\times n$ seed matrix $A({\phi})$, where each row $i\in[N]$ consists of the $n$ pairs $\{(b_{i,j},\mathtt{index}(U_{i,j}))\}_{j=1}^{n}$.

Next, for the query algorithm, we propose an encoding to turn the classical shadows into quantum states. This is a deviation from CST as we push the computations back to quantum via a QCQC approach. Let $Q$ be the index of the relevant qubits of a local observable $M$ in the query phase. For each $j\in Q$ , we create a random binary pair $(c_{i,j},w_{i,j})$ using the stored bit $b_{i,j}$ as follows:

(c_{i,j},w_{i,j})=\begin{cases}(b_{i,j},1)&\text{w.pr.\ }2/3\ ,\\ (1-b_{i,j},-1)&\text{w.pr.\ }1/3\ .\end{cases}

(37)

We then prepare a qubit $\ket{c_{i,j}}$ and apply the corresponding operator $U_{i,j}$ to create qubit $\ket{v_{i,j}}=U_{i,j}\ket{c_{i,j}}$ . Note that $\ket{v_{i,j}}$ takes one of the following states:

\ket{0},\ket{1},\ket{+},\ket{-},\ket{+i},\ket{-i},

that are easy to prepare (see Appendix A.2 for their vector representations). Then, we prepare $\ket{0}$ for each of the remaining qubits, i.e., those not indexed by $Q$. Let

\ket{v^{\prime}_{i,j}}=\begin{cases}\ket{v_{i,j}},&\text{if\ }j\in Q\\ \ket{0},&\text{otherwise}.\end{cases}

(38)

We construct the $i$-th shadow sample as (written in tensor-product form)

\ket{\tilde{\phi}_{i}}=\bigotimes_{j=1}^{n}\ket{v^{\prime}_{i,j}}.

(39)

We next measure each $\ket{\tilde{\phi}_{i}}$ using the observable $M$ , and obtain an outcome $x_{i}$ . Let

S_{i}=3^{k}x_{i}\prod_{j\in Q}w_{i,j}.

(40)

Then, our estimator is the empirical average

T=\frac{1}{N}\sum_{i}S_{i}.
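
The classical bookkeeping around the estimator can be sketched as follows. This is an illustration only: it covers the random encoding of Eq. (37) and the averaging of Eq. (40), while the quantum steps (state preparation and measurement of $M$) are assumed to be done elsewhere and to supply the outcomes $x_{i}$. The function names `encode_bit` and `shadow_estimate` and the list-based data layout are hypothetical.

```python
import random
from math import prod


def encode_bit(b, rng):
    """Sample the pair (c, w) from a stored bit b as in Eq. (37)."""
    if rng.random() < 2.0 / 3.0:
        return b, 1        # keep the bit, weight +1 (probability 2/3)
    return 1 - b, -1       # flip the bit, weight -1 (probability 1/3)


def shadow_estimate(outcomes, weights):
    """Empirical average T = (1/N) * sum_i 3^k * x_i * prod_{j in Q} w_{i,j}.

    outcomes[i] is the measurement outcome x_i; weights[i] lists the
    weights w_{i,j} for the k qubits j in Q.
    """
    total = 0.0
    for x, ws in zip(outcomes, weights):
        k = len(ws)
        total += (3 ** k) * x * prod(ws)
    return total / len(outcomes)


if __name__ == "__main__":
    rng = random.Random(0)
    print(encode_bit(1, rng))
    print(shadow_estimate([1.0, -1.0], [[1, -1], [-1, -1]]))
```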

We now show that when $N\geq 9^{k}\norm{M}_{\infty}^{2}\frac{\log(1/\delta)}{\varepsilon^{2}}$, the estimator $T$ approximates $\expectationvalue{M}{\phi}$ up to an additive error $\varepsilon$ with probability at least $(1-\delta)$, proving the correctness part of Theorem 4.10. We will also give the query time analysis. Recall that the space needed for storing the seed matrix is $O(Nn)$ classical bits.

In the rest of the proof, we will focus on the random variable $S\triangleq S_{i}$ for a particular $i\in[N]$. Recall that the final approximation $T$ is the average of $N$ i.i.d. copies of $S$.

Let $X_{i}$, $W_{i,j}$, $C_{i,j}$, $V_{i,j}$, $B_{i,j}$ be the random variables corresponding to $x_{i}$, $w_{i,j}$, $c_{i,j}$, $v_{i,j}$, $b_{i,j}$, respectively. Since we focus on a particular $i\in[N]$, we will omit all subscripts $i$ in those random variables and write them as $X$, $W_{j}$, $C_{j}$, $V_{j}$, $B_{j}$. We also write $\ket{\tilde{\phi}_{i}}$ as $\ket{\tilde{\phi}}$.

The following result can be inferred from [32]. We include a proof for completeness.

Lemma B.3.

For any $i\in[N]$ , let $\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}=\bigotimes_{j=1}^{n}\outerproduct{V_{j}}{V_{j}}$ . We have $\mathbf{E}\left[3^{n}W\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}\right]=\outerproduct{\phi}{\phi}$ , where $W=\prod_{j=1}^{n}W_{j}$ .

Proof B.4.

Let $\Gamma_{0}$ denote the shadow channel for Pauli measurements as defined in [37], which is given by

\Gamma_{0}[O]\triangleq\sum_{U\in\{I,H,S^{\dagger}H\}}\sum_{b\in\{0,1\}}\frac{1}{3}\matrixelement{b}{U^{\dagger}OU}{b}\leavevmode\nobreak\ U\outerproduct{b}{b}U^{\dagger},

for any single qubit operator $O$ . By direct calculation, we have for a generic state $\ket{\psi}=a_{0}\ket{0}+a_{1}\ket{1}$ ,

\Gamma_{0}^{-1}[\outerproduct{\psi}{\psi}]=\begin{bmatrix}2|a_{0}|^{2}-|a_{1}|^{2}&3a_{0}a_{1}^{*}\\ 3a_{0}^{*}a_{1}&2|a_{1}|^{2}-|a_{0}|^{2}\end{bmatrix}.

It is known that the linear map $\Gamma_{0}$ is invertible. Applying $\Gamma_{0}^{-1}$ to $\outerproduct{B_{j}}{B_{j}}$, we have, again by direct calculation, that

\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right]=2\outerproduct{B_{j}}{B_{j}}-\outerproduct{1-{B_{j}}}{1-{B_{j}}}.
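
The two direct calculations above can be checked numerically. The sketch below implements $\Gamma_{0}$ exactly as defined in this proof, using plain $2\times 2$ nested lists, and verifies that applying $\Gamma_{0}$ to the stated matrix for $\Gamma_{0}^{-1}[\outerproduct{\psi}{\psi}]$ returns $\outerproduct{\psi}{\psi}$. The helper names and the test amplitudes are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


R = 1.0 / math.sqrt(2.0)
IDENTITY = [[1, 0], [0, 1]]
HADAMARD = [[R, R], [R, -R]]
S_DAGGER = [[1, 0], [0, -1j]]
UNITARIES = [IDENTITY, HADAMARD, matmul(S_DAGGER, HADAMARD)]  # I, H, S^dag H


def shadow_channel(op):
    """Gamma_0[O] = sum_U sum_b (1/3) <b|U^dag O U|b> U|b><b|U^dag."""
    out = [[0j, 0j], [0j, 0j]]
    for u in UNITARIES:
        rotated = matmul(dagger(u), matmul(op, u))  # U^dagger O U
        for b in range(2):
            ket = [u[0][b], u[1][b]]  # U|b> is column b of U
            weight = rotated[b][b] / 3.0
            for r in range(2):
                for c in range(2):
                    out[r][c] += weight * ket[r] * complex(ket[c]).conjugate()
    return out


def claimed_inverse(a0, a1):
    """Matrix stated for Gamma_0^{-1}[|psi><psi|], |psi> = a0|0> + a1|1>."""
    a0, a1 = complex(a0), complex(a1)
    return [[2 * abs(a0) ** 2 - abs(a1) ** 2, 3 * a0 * a1.conjugate()],
            [3 * a0.conjugate() * a1, 2 * abs(a1) ** 2 - abs(a0) ** 2]]


def density(a0, a1):
    """Density matrix |psi><psi| for |psi> = a0|0> + a1|1>."""
    a0, a1 = complex(a0), complex(a1)
    return [[a0 * a0.conjugate(), a0 * a1.conjugate()],
            [a1 * a0.conjugate(), a1 * a1.conjugate()]]


def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    a0, a1 = 0.6, 0.8j
    print(max_abs_diff(shadow_channel(claimed_inverse(a0, a1)), density(a0, a1)))
```

Setting $(a_{0},a_{1})=(1,0)$ or $(0,1)$ in `claimed_inverse` recovers the basis-state formula $2\outerproduct{B_{j}}{B_{j}}-\outerproduct{1-B_{j}}{1-B_{j}}$.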

By (37), taking the expectation of $W_{j}\outerproduct{C_{j}}{C_{j}}$ gives

\mathbf{E}\Big{[}W_{j}\outerproduct{C_{j}}{C_{j}}\Big{]}=\frac{2}{3}\outerproduct{B_{j}}{B_{j}}-\frac{1}{3}\outerproduct{1-B_{j}}{1-B_{j}}=\frac{1}{3}\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right].

Since $\ket{V_{j}}=U_{j}\ket{C_{j}}$ , we get

\mathbf{E}\left[3W_{j}\outerproduct{V_{j}}{V_{j}}\right]=U_{j}\Gamma_{0}^{-1}\left[\outerproduct{B_{j}}{B_{j}}\right]U_{j}^{\dagger}=\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right].

(41)

Since $(U_{j},B_{j})$ ’s are independent for different $j\in[n]$ ,

\begin{aligned}\mathbf{E}\left[3^{n}W\outerproduct{\tilde{\varphi}_{i}}{\tilde{\varphi}_{i}}\right]&=\mathbf{E}\left[\bigotimes_{j=1}^{n}\left(3W_{j}\outerproduct{V_{j}}{V_{j}}\right)\right]\\ &=\mathbf{E}\left[\bigotimes_{j=1}^{n}\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\\ &=\bigotimes_{j=1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\\ &=\outerproduct{\phi}{\phi},\end{aligned}

(43)

where the last equality follows from [32, Lemma 6].

The next lemma shows that $S$ is an unbiased estimator of the quantity $\expectationvalue{M}{\phi}$ .

Lemma B.5.

$\operatorname{{\mathop{\mathbf{E}}}}[S]=\expectationvalue{M}{\phi}$ .

Proof B.6.

Without loss of generality, assume that $Q=\{q_{1},\ldots,q_{k}\}$ with $q_{\ell}=\ell$ for all $\ell\in[k]$; that is, $M$ acts on the first $k$ qubits. We have

\begin{aligned}\mathbf{E}[S]&=3^{k}\mathbf{E}\left[X\prod_{j\in[k]}W_{j}\right]\\ &=3^{k}\mathbf{E}\left[\expectationvalue{M}{\tilde{\phi}}\prod_{j\in[k]}W_{j}\right]\\ &=3^{k}\mathbf{E}\left[\tr\left(M\outerproduct{\tilde{\phi}}{\tilde{\phi}}\right)\prod_{j\in[k]}W_{j}\right]\\ &=\tr\left\{M\mathbf{E}\left[\bigotimes_{j\in[k]}\left(3W_{j}\outerproduct{V_{j}}{V_{j}}\right)\bigotimes\outerproduct{0^{n-k}}{0^{n-k}}\right]\right\}.\end{aligned}

(44)

Since $M$ only depends on the first $k$ qubits, the expectation in (44) does not change if $\outerproduct{0^{n-k}}{0^{n-k}}$ is replaced by the following state

\bigotimes_{j=k+1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right].

Hence, we can write $\mathbf{E}[S]$ as

\begin{aligned}&\tr\left\{M\mathbf{E}\left[\bigotimes_{j=1}^{k}\left(3W_{j}\outerproduct{V_{j}}{V_{j}}\right)\right]\bigotimes_{j=k+1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\right\}\\ &\quad=\tr\left\{M\bigotimes_{j=1}^{n}\mathbf{E}\left[\Gamma_{0}^{-1}\left[U_{j}\outerproduct{B_{j}}{B_{j}}U_{j}^{\dagger}\right]\right]\right\}\\ &\quad=\tr{M\outerproduct{\phi}{\phi}}=\expectationvalue{M}{\phi},\end{aligned}

where the second equality follows from (41), and the third equality follows from (43).

The correctness part of Theorem 4.10 follows immediately from Lemma B.5, Hoeffding’s inequality (Lemma A.2), and the fact that for all $i\in[N]$ , we have $\absolutevalue{S_{i}}\leq 3^{k}\norm{M}_{\infty}$ .
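
To give a feel for how the sample bound scales, the sketch below evaluates $N=\lceil 9^{k}\norm{M}_{\infty}^{2}\log(1/\delta)/\varepsilon^{2}\rceil$ for given parameters. It assumes the natural logarithm and a leading constant of $1$, as in the bound stated above; the exact constant coming out of Hoeffding’s inequality may differ. The function name `required_samples` is hypothetical.

```python
import math


def required_samples(k, op_norm, eps, delta):
    """Evaluate ceil(9^k * ||M||_inf^2 * ln(1/delta) / eps^2).

    k: locality of the observable; op_norm: operator norm ||M||_inf;
    eps: additive error; delta: failure probability in (0, 1).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return math.ceil((9 ** k) * op_norm ** 2 * math.log(1.0 / delta) / eps ** 2)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, required_samples(k, 1.0, 0.1, 0.01))
```

Each additional qubit in the support of $M$ multiplies the count by $9$, which is why the bound is practical only for small $k$.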

Running time

We now analyze the running time of the query estimation procedure. First, the preparation of each quantum state $\ket{v^{\prime}_{i,j}}\ (i\in[N],j\in[n])$ takes quantum time $O(1)$. The construction of each shadow sample $\ket{\tilde{\phi}_{i}}$ (Eq. (39)) takes $O(k)$ quantum time; note that we do not actually need to prepare those $\ket{v^{\prime}_{i,j}}$’s with $j\not\in Q$, since the $k$-local observable does not depend on those qubits. The quantum time for measuring each $\ket{\tilde{\phi}_{i}}$ with a $k$-local observable $M$ is $O\left(\text{poly}(k)\right)$, as we have assumed that $M$ has a $\text{poly}(k)$ gate complexity. The computation of each $S_{i}$ (Eq. (40)) can be done in $O(k)$ classical time, since the product in (40) has $k$ factors, and that of $T=\frac{1}{N}\sum_{i}S_{i}$ can be bounded by $O(N)$ classical time. Summing up everything, the total running time can be bounded by $O\left(\text{poly}(k)N\right)$ quantum time plus $O(kN)$ classical time; the classical time can be ignored if we assume that a unit quantum time is at least a unit classical time.

Appendix C Distortions Between The Trace Distance and $\ell_{1}/\ell_{2}$ Distances of Quantum States

distance	$(\phi,\psi)$	$(\ket{0},\ket{1})$	distortion w.r.t. $D$
$D$	$\sqrt{\frac{3}{4}}$	$1$	–
$L_{1}$	$\sqrt{\frac{d}{2}}$	$2$	$\sqrt{\frac{d}{6}}$
$L_{2}$	$1$	$\sqrt{2}$	$\sqrt{\frac{3}{2}}$
$L^{\prime}_{1}$	$1$	$2$	$\sqrt{3}$
$L^{\prime}_{2}$	$\sqrt{\frac{2}{d}}$	$\sqrt{2}$	$\sqrt{\frac{3d}{4}}$

Table 1: Distances between the two example pairs of states under each distance function, and the resulting distortion with respect to the trace distance $D$.

Let $\alpha(\ket{0})=(1,0,0,\ldots,0)^{T}$ be the vector representation of a $d$ -dimensional quantum state $\ket{0}$ , where the first coordinate is $1$ and the others are $0$ . Let $\alpha(\ket{1})=(0,1,0,\ldots,0)^{T}$ be a $d$ -dimensional vector with the second coordinate being $1$ and the others being $0$ .

Let $\alpha(\phi)=\frac{1}{\sqrt{d/2}}(1,\ldots,1,1,\ldots,1,0,\ldots,0,0,\ldots,0)^{T}$ be the $d$-dimensional vector with the first $d/2$ coordinates being $\frac{1}{\sqrt{d/2}}$ and the second half being $0$, and

\alpha(\psi)=\frac{1}{\sqrt{d/2}}(0,\ldots,0,1,\ldots,1,1,\ldots,1,0,\ldots,0)^{T}

be the $d$-dimensional vector with the middle $d/2$ coordinates being $\frac{1}{\sqrt{d/2}}$ and the rest being $0$.

Let $D(\phi,\psi)$ denote the trace distance between two quantum states $\phi$ and $\psi$ . Let $L_{1}(\phi,\psi)$ and $L_{2}(\phi,\psi)$ denote the $\ell_{1}$ and $\ell_{2}$ distances between $\alpha(\phi)$ and $\alpha(\psi)$ , respectively. Let $L^{\prime}_{1}(\phi,\psi)$ and $L^{\prime}_{2}(\phi,\psi)$ denote the $\ell_{1}$ and $\ell_{2}$ distances between $p(\phi)$ and $p(\psi)$ , where $p(\phi)$ is $\alpha(\phi)$ after taking the coordinate-wise absolute square; that is, $p(\phi)=\frac{1}{d/2}(1,\ldots,1,1,\ldots,1,0,\ldots,0,0,\ldots,0)^{T}$ .

The distortion of a distance $d(\cdot,\cdot)$ with respect to $D$ is lower bounded by

\sqrt{\max\bigg{\{}\frac{D(\phi,\psi)}{d(\phi,\psi)}\bigg{/}\frac{D(\ket{0},\ket{1})}{d(\ket{0},\ket{1})},\frac{d(\phi,\psi)}{D(\phi,\psi)}\bigg{/}\frac{d(\ket{0},\ket{1})}{D(\ket{0},\ket{1})}\bigg{\}}}.

In Table 1, we have calculated the distortions between the trace distance of the quantum states and the $\ell_{1}$ and $\ell_{2}$ distances of their corresponding classical vector representations. It is easy to see that all distortions are larger than $1.1$ .
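
The distance entries of Table 1 can be reproduced numerically for a concrete dimension. The sketch below builds the four example vectors, computes the five distances, and evaluates the ratio that appears inside the maximum of the lower bound above (without the outer square root). It assumes $d$ is divisible by $4$ and uses the pure-state formula $D=\sqrt{1-|\langle\phi|\psi\rangle|^{2}}$ for real amplitude vectors; all function names are hypothetical.

```python
import math


def l1(u, v):
    """l1 distance between two real vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def l2(u, v):
    """l2 distance between two real vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def squared(u):
    """Coordinate-wise square, i.e. p(.) for a real amplitude vector."""
    return [a * a for a in u]


def trace_distance(u, v):
    """sqrt(1 - <u|v>^2) for real unit vectors (pure states)."""
    overlap = sum(a * b for a, b in zip(u, v))
    return math.sqrt(max(0.0, 1.0 - overlap ** 2))


def example_states(d):
    """Return alpha(phi), alpha(psi), alpha(|0>), alpha(|1>) in dimension d."""
    amp = 1.0 / math.sqrt(d / 2)
    phi = [amp if i < d // 2 else 0.0 for i in range(d)]
    psi = [amp if d // 4 <= i < 3 * d // 4 else 0.0 for i in range(d)]
    e0 = [1.0 if i == 0 else 0.0 for i in range(d)]
    e1 = [1.0 if i == 1 else 0.0 for i in range(d)]
    return phi, psi, e0, e1


DISTANCES = {
    "L1": l1,
    "L2": l2,
    "L1'": lambda u, v: l1(squared(u), squared(v)),
    "L2'": lambda u, v: l2(squared(u), squared(v)),
}


def ratio(dist, d):
    """Ratio inside the max of the bound, comparing the two state pairs."""
    phi, psi, e0, e1 = example_states(d)
    r_far = dist(phi, psi) / trace_distance(phi, psi)
    r_basis = dist(e0, e1) / trace_distance(e0, e1)
    return max(r_far / r_basis, r_basis / r_far)


if __name__ == "__main__":
    d = 16
    for name, dist in DISTANCES.items():
        print(name, round(ratio(dist, d), 4))
```

For the chosen $d$, these ratios can be compared directly against the last column of Table 1.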

Appendix D Search As A Special Case of Selection

We note that the selection operation can also be used for search. Given two quantum states ${\phi}$ and ${\psi}$ , letting $M=\psi\psi^{\dagger}$ , we have

\displaystyle D({\phi},{\psi})=\sqrt{1-\absolutevalue{{\psi^{\dagger}}{\phi}}^{2}}=\sqrt{1-\phi^{\dagger}M\phi}.

Therefore, if we can estimate $\phi^{\dagger}M\phi$ up to an additive error $c_{\varepsilon}\varepsilon^{2}$ (for a sufficiently small constant $c_{\varepsilon}$) for any database state ${\phi}$, we can also solve $(\varepsilon,\beta)$-search for any constant $\beta>1$. By Theorem 4.10 and the fact that $\norm{M}_{\infty}^{2}=1$ when $M=\psi\psi^{\dagger}$, for a database consisting of $m$ states, the query (quantum) time using expectation value estimations is bounded by $9^{n}m{\log m}\cdot\text{poly}(n)/{\varepsilon^{4}}$. This approach is certainly much more time-expensive than the one using state sketches presented in Section 4.1.
