
Optimal Universal Quantum Encoding for Statistical Inference

Farhad Farokhi [email protected] Department of Electrical and Electronic Engineering, The University of Melbourne, Parkville, VIC 3010, Australia
Abstract

Optimal encoding of classical data for statistical inference using quantum computing is investigated. A universal encoder is sought that is optimal for a wide array of statistical inference tasks. The accuracy of any statistical inference task is shown to be upper bounded by a term that increases with the maximal quantum leakage from the classical data, i.e., the input to the inference model, through its quantum encoding. This demonstrates that the maximal quantum leakage is a universal measure of the quality of the encoding strategy for statistical inference, as it depends only on the quantum encoding of the data and not on the inference task itself. The optimal universal encoding strategy, i.e., the encoding strategy that maximizes the maximal quantum leakage, is proved to be attained by pure states. When there are enough qubits, basis encoding is proved to be universally optimal. An iterative method for numerically computing the optimal universal encoding strategy is presented.


I Introduction

Encoding classical data, e.g., traditional datasets, into quantum systems often forms the first step in quantum computing and communication. In quantum machine learning [1], data needs to be encoded into quantum states so that it can be fed into parameterized quantum circuits for predictions. In quantum key distribution [2], random bits must be encoded in non-orthogonal quantum states for communication. In quantum communication [3], messages are encoded by appropriate code-words to ensure high fidelity over noisy channels.

Not all encoding strategies are equal. There are many methods for encoding classical data into quantum systems, such as basis encoding, angle encoding, quantum random access memory encoding, and amplitude encoding [4, 5]. There have been many attempts at systematically comparing encoding strategies and establishing optimal ones. The effect of commonly used data-encoding mechanisms on the expressivity of quantum machine learning models was explored in [1]. Optimal encoding to achieve maximal fidelity in communication over spin systems was derived in [3]. Optimal encoding and retrieval of classical data in the quantum domain was developed in [6]. Quantum information theory, especially relative entropy, was used to analyze quantum encoding over a family of communication channels [7]. Incompatibility of states was used to understand the security of various encoding strategies in quantum key sharing mechanisms [8]. This problem was revisited using information leakage with gentle measurements in [9]. However, there has been no information-theoretic approach to analyze quantum encoding strategies in statistical inference and to develop universal encoding strategies. Universality here refers to encoders that are optimal for a wide array of statistical inference tasks, as opposed to problem-specific encoders. This is the topic of the current letter.

In this letter, we focus on an information-theoretic analysis of statistical inference using quantum computing. We consider an inference model that guesses or estimates the output $Z$ based on access to the input $X$. Random variables $X$ and $Z$ are assumed to be jointly distributed. For instance, in image recognition, the input $X$ is an image or its pixelated representation while the output $Z$ is the category to which the image belongs (e.g., cat or dog) or the content of the image (e.g., location of obstacles). Note that the inference model does not have access to the output $Z$ and must guess the output based on the input $X$ and historical data (often called the training dataset). In this letter, we focus on the large-data regime, where we can assume that the probability distribution of the data is known, i.e., it is accurately estimated from many available data points. The outcome of the inference model is denoted by $\widehat{Z}$. In statistical inference, the quality of an inference model is synonymous with its accuracy $\mathbb{P}\{\widehat{Z}=Z\}$. For statistical inference using a quantum-computing procedure, as the first step, the classical data $X$ must be encoded into a quantum state. The quantum state is then processed by arbitrary quantum channels, which can model unitary gates used in quantum computing [10] as well as more general noisy transformations [11]. Subsequently, measurements are taken and post-processed to generate the output estimate $\widehat{Z}$. We prove that, irrespective of the objective of the inference task, the accuracy of the quantum computing procedure is constrained by a term that increases with the maximal quantum leakage [12] from the classical data $X$ through its quantum encoding. This demonstrates that the maximal quantum leakage is a universal measure of the quality of the encoding strategy.
Interestingly, the maximal quantum leakage is also independent of the distribution of the input $X$ (it is only a function of the quantum encoding of the classical data). Therefore, the optimal universal encoder, i.e., the encoder that maximizes the maximal quantum leakage, is independent of the inference problem at hand. This feature signifies the universality of the optimal universal encoder. These observations show that the number of qubits used for statistical inference must be larger than $\log_{2}(|\mathcal{X}|)/2$, where $\mathcal{X}$ is the support set of the discrete input random variable $X$, in order not to artificially constrain the performance of the inference model. We prove that the optimal universal encoding strategy is composed of pure states. This is an important revelation as quantum computing procedures often rely on pure states. Furthermore, when there are enough qubits, i.e., the number of qubits is at least $\log_{2}(|\mathcal{X}|)$, basis encoding is proved to be the optimal universal encoder. Finally, an iterative method for numerically computing the optimal universal encoding strategy in all other cases is presented. This procedure relies on subgradient ascent for maximizing the maximal quantum leakage.

II Statistical Inference via Quantum Computing

Consider a statistical inference problem with jointly distributed discrete random variables $X\in\mathcal{X}$, referred to as the input in what follows, and $Z\in\mathcal{Z}$, referred to as the output. The aim is to develop an inference model that guesses or estimates the output $Z$ based only on access to $X$ and the distribution of the underlying random variables. The outcome of the inference model is denoted by $\widehat{Z}\in\mathcal{Z}$. The accuracy of the statistical inference model is given by the probability $\mathbb{P}\{\widehat{Z}=Z\}$ of the event that $\widehat{Z}$ and $Z$ coincide. In this letter, we use a quantum computing framework for the statistical inference. This framework is composed of four parts: quantum encoding, processing, measurement, and classical post-processing. In what follows, we explain each part briefly.
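To make the accuracy criterion concrete, the following toy sketch (with made-up numbers, not taken from the letter) computes $\mathbb{P}\{\widehat{Z}=Z\}$ for a deterministic classical inference rule on a small joint distribution; the joint pmf `P` and the rule `f` are illustrative assumptions:

```python
import numpy as np

# Toy joint pmf P(X=x, Z=z); rows index x, columns index z (illustrative numbers).
P = np.array([[0.3, 0.1],
              [0.1, 0.5]])

# A deterministic inference rule Z_hat = f(X): here, the rule that guesses
# the most likely output z for each observed input x.
f = P.argmax(axis=1)

# Accuracy P{Z_hat = Z} = sum_x P(X=x, Z=f(x)).
accuracy = sum(P[x, f[x]] for x in range(P.shape[0]))
```

For this joint distribution the rule guesses $z=0$ when $x=0$ and $z=1$ when $x=1$, giving accuracy $0.3+0.5=0.8$.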

Encoding: Here, we convert or encode the raw classical data, i.e., the realization of the random variable $X$, into a quantum system. For the purpose of this letter, a quantum system is modelled by a finite-dimensional Hilbert space $\mathcal{H}$. The set of positive semi-definite linear operators on $\mathcal{H}$ is denoted by $\mathcal{P}(\mathcal{H})$ while the set of density operators, i.e., positive semi-definite linear operators on $\mathcal{H}$ with unit trace, is denoted by $\mathcal{S}(\mathcal{H})$. We use density operators to model the state of a quantum system. The quantum encoding entails preparing quantum system $A$ in state $\rho^{x}\in\mathcal{S}(\mathcal{H})$ if the input $X=x\in\mathcal{X}$ is realized. The ensemble $\mathcal{R}:=\{\rho^{x}\}_{x\in\mathcal{X}}$ denotes the quantum encoding of the classical random variable $X$.

Processing: After encoding the data, we must process it using a quantum circuit. We model this by an arbitrary quantum channel $\mathcal{N}:\mathcal{S}(\mathcal{H})\rightarrow\mathcal{S}(\mathcal{H}^{\prime})$. A quantum channel is a completely positive and trace-preserving mapping [11].

Measurement: After processing, we need to extract information from the quantum system $A$ using a positive operator-valued measure (POVM) $\mathcal{F}:=\{F_{y}\}_{y\in\mathcal{Y}}$, i.e., $F_{y}\in\mathcal{P}(\mathcal{H}^{\prime})$ and $\sum_{y\in\mathcal{Y}}F_{y}=I$. Let the random variable $Y\in\mathcal{Y}$ denote the outcome of the measurement. By Born's rule, the probability of observing $Y=y\in\mathcal{Y}$ when the state of the quantum system $A$ is $\rho$ is given by $\mathbb{P}\{Y=y\}=\tr(F_{y}\rho)$. Therefore, $\mathbb{P}\{Y=y\,|\,X=x\}=\tr(F_{y}\rho^{x})$ for all $x\in\mathcal{X}$ and $y\in\mathcal{Y}$.
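As a quick numerical illustration of Born's rule (a sketch with an assumed one-qubit encoding, not taken from the letter), the conditional probabilities $\mathbb{P}\{Y=y\,|\,X=x\}=\tr(F_{y}\rho^{x})$ can be tabulated directly:

```python
import numpy as np

# Assumed one-qubit encoding: rho^0 = |0><0| and rho^1 = |+><+|.
rho = [np.array([[1.0, 0.0], [0.0, 0.0]]),
       np.array([[0.5, 0.5], [0.5, 0.5]])]

# Computational-basis POVM: F_0 = |0><0|, F_1 = |1><1| (sums to the identity).
F = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# Born's rule: P{Y=y | X=x} = tr(F_y rho^x).
cond = np.array([[np.trace(Fy @ rx).real for Fy in F] for rx in rho])
# Each row cond[x] is a probability vector because the POVM resolves the identity.
```

Here $\rho^{0}$ yields outcome $0$ with certainty, while $\rho^{1}=\ket{+}\bra{+}$ yields both outcomes with probability $1/2$.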

Classical post-processing: The last step is to process the random variable $Y$ by a classical processing routine to generate the random variable $\widehat{Z}\in\mathcal{Z}$. This can be modelled by a conditional probability mass function $\mathbb{P}\{\widehat{Z}=z\,|\,Y=y\}=\gamma_{zy}$. Let $\gamma=(\gamma_{zy})_{z\in\mathcal{Z},y\in\mathcal{Y}}$.

We denote the quantum computing procedure by the tuple $(\mathcal{R},\mathcal{N},\mathcal{F},\gamma)$. Before analysing the accuracy of the statistical inference model, we need to present maximal quantum leakage. We use this notion of information to bound the accuracy of inference models.

Definition 1 (Maximal Quantum Leakage [12]).

Maximal quantum leakage from random variable $X$ through quantum system $A$ is

$$\mathcal{Q}(X\rightarrow A)_{\rho}=\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\log\left(\sum_{y\in\mathcal{Y}}\max_{\substack{x\in\mathcal{X}:\\ \mathbb{P}\{X=x\}>0}}\tr(\rho^{x}F_{y})\right), \qquad (1)$$

where the supremum is taken over all POVMs $\mathcal{F}=\{F_{y}\}_{y\in\mathcal{Y}}$ with an arbitrary finite outcome set $\mathcal{Y}$.

Maximal quantum leakage was defined in [12] as a measure of private or secure information leakage to an arbitrary adversary, for investigating the security of quantum encoding policies (in communication, storage, and data analysis). Interestingly, maximal quantum leakage is independent of the output of the inference problem. It also does not depend on the distribution of the input; it only depends on the support set of the input and not the exact probability values. In this letter, we show that maximal quantum leakage is useful for bounding the performance of quantum-based statistical inference mechanisms. This is established in the following theorem.
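Since Definition 1 takes a supremum over POVMs, evaluating the objective of (1) for any fixed POVM yields a lower bound on $\mathcal{Q}(X\rightarrow A)_{\rho}$. The following sketch (the helper name and the two-state example are our illustrative assumptions) does exactly that:

```python
import numpy as np

def leakage_objective(states, povm):
    """log2( sum_y max_x tr(rho^x F_y) ) for a *fixed* POVM. This lower-bounds
    the maximal quantum leakage of Definition 1, which takes the supremum over
    all POVMs. `states` lists rho^x for the inputs x with P{X=x} > 0."""
    total = sum(max(np.trace(rho @ Fy).real for rho in states) for Fy in povm)
    return float(np.log2(total))

# Example: two orthogonal pure states measured in the same basis.
states = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
bound = leakage_objective(states, povm)  # log2(1 + 1) = 1 bit
```

For orthogonal states this particular POVM is already optimal, so the lower bound of one bit coincides with $\mathcal{Q}(X\rightarrow A)_{\rho}=\log_{2}(|\mathcal{X}|)=1$.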

Theorem 1.

The accuracy of any quantum computing procedure $(\mathcal{R},\mathcal{N},\mathcal{F},\gamma)$ is constrained as

$$\mathbb{P}\{\widehat{Z}=Z\}\leq 2^{\mathcal{Q}(X\rightarrow A)_{\rho}}\max_{z\in\mathcal{Z}}\mathbb{P}\{Z=z\}.$$

Proof.

Note that

$$\frac{\mathbb{P}\{\widehat{Z}=Z\}}{\max_{z\in\mathcal{Z}}\mathbb{P}\{Z=z\}}\leq\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\sup_{Z,\widehat{Z}}\frac{\mathbb{P}\{\widehat{Z}=Z\}}{\max_{z\in\mathcal{Z}}\mathbb{P}\{Z=z\}}=2^{\mathcal{Q}(X\rightarrow A)_{\mathcal{N}(\rho)}},$$

where the equality follows from [12, Theorem 1]. Subsequently, we have $\mathcal{Q}(X\rightarrow A)_{\mathcal{N}(\rho)}\leq\mathcal{Q}(X\rightarrow A)_{\rho}$ [12, Proposition 3]. This concludes the proof. ∎

The upper bound in Theorem 1 shows that the accuracy of a quantum inference method is upper bounded by $2^{\mathcal{Q}(X\rightarrow A)_{\rho}}$, where the maximal quantum leakage $\mathcal{Q}(X\rightarrow A)_{\rho}$ is independent of the output and of the distribution of the input of the statistical inference model, multiplied by $\max_{z\in\mathcal{Z}}\mathbb{P}\{Z=z\}$, which captures the accuracy of the best statistical inference algorithm that ignores the realization of the input $X$ (namely, the maximum a priori estimator). The factor $2^{\mathcal{Q}(X\rightarrow A)_{\rho}}$ essentially captures the multiplicative increase in the accuracy of the statistical inference model gained by accessing the input $X$ via its quantum encoding $\mathcal{R}$.

Corollary 1.

The accuracy of any quantum computing procedure $(\mathcal{R},\mathcal{N},\mathcal{F},\gamma)$ is constrained as

$$\mathbb{P}\{\widehat{Z}=Z\}\leq\max_{z\in\mathcal{Z}}\mathbb{P}\{Z=z\}\times\min\{|\mathcal{X}|,\dim(\mathcal{H})^{2}\}.$$

Proof.

The proof follows from $\mathcal{Q}(X\rightarrow A)_{\rho}\leq\min\{\log_{2}(|\mathcal{X}|),\log_{2}(\dim(\mathcal{H})^{2})\}$ [12, Proposition 2], which gives $2^{\mathcal{Q}(X\rightarrow A)_{\rho}}\leq\min\{|\mathcal{X}|,\dim(\mathcal{H})^{2}\}$. ∎

Note that, in Corollary 1, $\dim(\mathcal{H})$ captures the dimension of the quantum system used for statistical inference. In fact, the quantum system is composed of $\log_{2}(\dim(\mathcal{H}))$ qubits. If $2\log_{2}(\dim(\mathcal{H}))<\log_{2}(|\mathcal{X}|)$, the upper bound in Corollary 1 is unnecessarily reduced by the dimension of the quantum system. This indicates that the number of qubits $\log_{2}(\dim(\mathcal{H}))$ required for accurately solving an inference problem must be above $\log_{2}(|\mathcal{X}|)/2$. Note that we are not asserting that $\log_{2}(|\mathcal{X}|)/2$ is the optimal number of required qubits, but that it is a lower bound on how many qubits are needed to solve an inference problem effectively. Furthermore, the only quantity that matters here is the size of the support set of the input $X$ (not its distribution, not the output $Z$, not the quantum computing method used, and not the classical post-processing procedure implemented). Therefore, this result is rather universal and of value to any statistical inference problem.
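For concreteness, the qubit-count lower bound above can be computed as follows (a trivial sketch; the function name is ours):

```python
import math

def min_qubits_lower_bound(alphabet_size):
    """Smallest integer number of qubits n with 2n >= log2(|X|), i.e., the
    lower bound log2(|X|)/2 from Corollary 1 rounded up."""
    return math.ceil(math.log2(alphabet_size) / 2)

# For |X| = 8 (the example used later in the letter), the bound is 2 qubits,
# while log2(|X|) = 3 qubits suffice for basis encoding.
```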

The upper bound in Theorem 1, which is a function of the maximal quantum leakage $\mathcal{Q}(X\rightarrow A)_{\rho}$, depends only on the quantum encoding $\mathcal{R}$ of the classical data. This hints that we can find a good universal encoding mechanism by maximizing $\mathcal{Q}(X\rightarrow A)_{\rho}$. Such an encoder removes an artificial barrier to achieving high accuracy in statistical inference by increasing the upper bound in Theorem 1. This is pursued in what follows.

Optimal Quantum Encoding: To compute the optimal universal encoder, we need to solve the following optimization problem:

$$\operatorname*{arg\,max}_{\substack{\rho^{x}\in\mathcal{S}(\mathcal{H}),\\ \forall x\in\mathcal{X}}}\left[\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\log\left(\sum_{y\in\mathcal{Y}}\max_{\substack{x\in\mathcal{X}:\\ \mathbb{P}\{X=x\}>0}}\tr(\rho^{x}F_{y})\right)\right]. \qquad (2)$$

In the next proposition, we prove that this optimization problem attains its maximum over pure states, i.e., ensembles $\{\rho^{x}\}_{x\in\mathcal{X}}$ such that $\rank(\rho^{x})=1$ for all $x\in\mathcal{X}$. This is an important revelation as quantum computing procedures often rely on pure states. Furthermore, this result is universal; it does not rely on the form of the statistical inference problem.

Proposition 1.

The maximum of (2) is attained by pure states.

Proof.

First, because $\log_{2}(\cdot)$ is strictly increasing, the optimization problem (2) is equivalent to

$$\operatorname*{arg\,max}_{\substack{\rho^{x}\in\mathcal{S}(\mathcal{H}),\\ \forall x\in\mathcal{X}}}\;g(\{\rho^{x}\}_{x\in\mathcal{X}}),$$

where

$$g(\{\rho^{x}\}_{x\in\mathcal{X}}):=\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\left(\sum_{y\in\mathcal{Y}}\max_{\substack{x\in\mathcal{X}:\\ \mathbb{P}\{X=x\}>0}}\tr(\rho^{x}F_{y})\right).$$

It is easy to see that $g:\mathcal{S}(\mathcal{H})^{|\mathcal{X}|}\rightarrow\mathbb{R}$ is convex because

$$\begin{aligned}
g(\{\alpha\rho^{x}+(1-\alpha)\sigma^{x}\}_{x\in\mathcal{X}})&=\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\sum_{y\in\mathcal{Y}}\max_{x\in\mathcal{X}}\tr((\alpha\rho^{x}+(1-\alpha)\sigma^{x})F_{y})\\
&\leq\alpha\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\sum_{y\in\mathcal{Y}}\max_{x\in\mathcal{X}}\tr(\rho^{x}F_{y})+(1-\alpha)\sup_{\{F_{y}\}_{y\in\mathcal{Y}}}\sum_{y\in\mathcal{Y}}\max_{x\in\mathcal{X}}\tr(\sigma^{x}F_{y})\\
&\leq\alpha g(\{\rho^{x}\}_{x\in\mathcal{X}})+(1-\alpha)g(\{\sigma^{x}\}_{x\in\mathcal{X}}).
\end{aligned}$$

According to Bauer's maximum principle [13, Theorem 3.5.29], originally proved in [14], $g(\cdot)$ must attain its maximum at an extreme point of $\mathcal{S}(\mathcal{H})^{|\mathcal{X}|}$. The extreme points of this set are pure states [15, Theorem 2.3]. ∎

Note that Proposition 1 does not claim that the solution is unique. The problem might admit many solutions, but at least one of them uses pure states for encoding the classical data. Note that all the optimal solutions attain the same maximal quantum leakage.

Proposition 2.

If $\dim(\mathcal{H})\geq|\mathcal{X}|$, the maximum of (2) is attained by basis encoding, i.e., $\rho^{x}=\ket{\tau(x)}\bra{\tau(x)}$, where $\{\ket{i}\}_{i=1,\dots,\dim(\mathcal{H})}$ is an orthonormal basis for $\mathcal{H}$ and $\tau:\mathcal{X}\rightarrow\{1,\dots,|\mathcal{X}|\}$ is any one-to-one mapping.

Proof.

Note that $\mathcal{Q}(X\rightarrow A)_{\rho}\leq\log_{2}(|\mathcal{X}|)$ irrespective of $\{\rho^{x}\}_{x\in\mathcal{X}}$ [12, Proposition 2]. Let $\rho^{x}=\ket{\tau(x)}\bra{\tau(x)}$ for all $x\in\mathcal{X}$. Fix $\mathcal{Y}=\mathcal{X}$ and $F_{y}=\rho^{y}$ for all $y\in\mathcal{Y}$. We get $\sum_{y\in\mathcal{Y}}\max_{x\in\mathcal{X}:\,\mathbb{P}\{X=x\}>0}\tr(\rho^{x}F_{y})=|\mathcal{X}|$, which attains $\mathcal{Q}(X\rightarrow A)_{\rho}=\log_{2}(|\mathcal{X}|)$. ∎

Basis encoding, also called index encoding, is one of the most common forms of quantum encoding [4]. Proposition 2 shows that this popular encoding policy is in fact universally optimal when there are enough qubits present.
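Proposition 2 can be checked numerically. The sketch below (assuming $|\mathcal{X}|=\dim(\mathcal{H})=4$ and all inputs having positive probability; the dimension is our illustrative choice) confirms that basis encoding with the POVM $F_{y}=\rho^{y}$ achieves the ceiling $\log_{2}(|\mathcal{X}|)$:

```python
import numpy as np

d = 4                               # dim(H) = |X| = 4 (illustrative choice)
basis = np.eye(d)
states = [np.outer(basis[x], basis[x]) for x in range(d)]   # rho^x = |x><x|
povm = [np.outer(basis[y], basis[y]) for y in range(d)]     # F_y = rho^y

# sum_y max_x tr(rho^x F_y): each F_y picks out its matching basis state,
# contributing 1 to the sum, so the total equals |X|.
total = sum(max(np.trace(rho @ Fy).real for rho in states) for Fy in povm)
Q = float(np.log2(total))           # attains the upper bound log2(|X|) = 2
```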

Figure 1: The maximal quantum leakage versus the number of iterations while implementing the projected subgradient ascent in (4) to find the best quantum encoding of the classical data.

Figure 2: The maximal quantum leakage for the optimal universal encoder versus the number of qubits.

Iterative Algorithm for Maximizing the Maximal Quantum Leakage: We make use of the iterative algorithm proposed in [12] to compute the maximal quantum leakage. Then, we update the quantum encoding via projected subgradient ascent (subgradients are generalizations of the gradient to convex functions that are not necessarily differentiable). We encourage interested readers to check [16, § 14.2-14.3] for more information on subgradients and non-smooth optimization. In [12], it was shown that

$$\begin{aligned}
2^{\mathcal{Q}(X\rightarrow A)_{\rho}}=\sup_{\{F_{y}\}}\;&\sum_{y\in\mathcal{Y}}\tr(\rho^{x^{*}(y)}F_{y}), &&(3\mathrm{a})\\
\mathrm{s.t.}\;\;&0\preceq F_{y},\quad y\in\mathcal{Y}, &&(3\mathrm{b})\\
&\sum_{y\in\mathcal{Y}}F_{y}=I, &&(3\mathrm{c})
\end{aligned}$$

where $\mathcal{Y}=\{1,\dots,\dim(\mathcal{H})^{2}\}$ and

$$x^{*}(y)\in\operatorname*{arg\,max}_{x\in\mathcal{X}}\tr(\rho^{x}F_{y}).$$

Therefore, the subgradient with respect to $\rho^{x}$, $x\in\mathcal{X}$, is given by

$$\partial_{\rho^{x}}2^{\mathcal{Q}(X\rightarrow A)_{\rho}}=\sum_{\substack{y\in\mathcal{Y}:\\ x^{*}(y)=x}}F_{y}^{*},$$

where $\partial_{\rho^{x}}$ denotes the subgradient with respect to $\rho^{x}$ and $\{F_{y}^{*}\}$ denotes the POVM that attains the maximum in (3), which can be computed using the iterative algorithm in [12]. We can therefore update the quantum encoding of the data by using the projected subgradient ascent:

$$\rho^{x}\leftarrow\Pi\!\left[\rho^{x}+\mu\,\partial_{\rho^{x}}2^{\mathcal{Q}(X\rightarrow A)_{\rho}}\right], \qquad (4)$$

where $\mu>0$ is the step size (selected small enough to ensure convergence) and $\Pi$ is the projection onto the set of rank-one matrices with unit trace. For any Hermitian operator $\sigma\in\mathcal{L}(\mathcal{H})$ with spectral decomposition $\sigma=\sum_{i}\lambda_{i}\ket{i}\bra{i}$, the projection $\Pi$ is defined as $\Pi[\sigma]=\ket{i^{*}}\bra{i^{*}}$, where $i^{*}\in\operatorname*{arg\,max}_{1\leq i\leq\dim(\mathcal{H})}|\lambda_{i}|$. Note that $\sigma$ can always be diagonalized in this manner because it is Hermitian.
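A minimal sketch of one update of (4) is given below. It assumes the maximizing POVM $\{F_{y}^{*}\}$ of (3) has already been computed (in the letter this is done with the iterative algorithm of [12], which we do not reproduce); the function name is ours, and ties in the inner maximization are broken arbitrarily:

```python
import numpy as np

def projected_subgradient_step(states, povm, mu):
    """One projected subgradient-ascent update (4) applied to every rho^x.
    `povm` is assumed to be (an approximation of) the maximizer {F_y*} of (3)."""
    new_states = []
    for x, rho in enumerate(states):
        # Subgradient of 2^Q w.r.t. rho^x: sum of F_y* over outcomes y
        # whose best-responding input x*(y) equals x.
        grad = np.zeros_like(rho)
        for Fy in povm:
            scores = [np.trace(s @ Fy).real for s in states]
            if int(np.argmax(scores)) == x:
                grad = grad + Fy
        # Projection Pi onto rank-one unit-trace matrices: keep the
        # eigenvector of rho^x + mu * grad with the largest |eigenvalue|.
        vals, vecs = np.linalg.eigh(rho + mu * grad)
        v = vecs[:, int(np.argmax(np.abs(vals)))]
        new_states.append(np.outer(v, v.conj()))
    return new_states

# Basis encoding measured in the same basis is a fixed point of the update.
states = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
updated = projected_subgradient_step(states, povm=states, mu=0.1)
```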

In what follows, we demonstrate the convergence of the subgradient ascent in the case where $\mathcal{X}:=\{1,\dots,8\}$ and $\dim(\mathcal{H})=8$. In this case, Proposition 2 shows that basis encoding is the optimal universal encoding strategy; therefore, the optimal universal encoding strategy attains $\mathcal{Q}(X\rightarrow A)_{\rho}=3$. Figure 1 illustrates the maximal quantum leakage versus the number of iterations while implementing the projected subgradient ascent in (4) to find the best quantum encoding. The algorithm is initialized at a random encoding strategy. The gray area demonstrates the maximum and minimum in each iteration over 100 runs of the algorithm (note the randomness in the initialization) and the solid black line shows the median in each iteration. As we can see, the algorithm rapidly converges to the optimal quantum encoding. Note that the optimal encoder is not unique, but $\mathcal{Q}(X\rightarrow A)_{\rho}=3$ irrespectively.

Figure 2 shows the maximal quantum leakage for the optimal universal encoder versus the number of qubits for the case where $\mathcal{X}=\{1,\dots,8\}$. Clearly, the maximal quantum leakage increases until the number of qubits hits $\log_{2}(|\mathcal{X}|)=3$. Based on Corollary 1, we expected that the minimum number of required qubits must be above $\log_{2}(|\mathcal{X}|)/2=1.5$. This shows that there is a multiplicative gap of two between the lower bound for the number of qubits, discussed earlier, and the exact number of qubits needed. This can be attributed to the looseness of the bound in Corollary 1 for pure states. Pursuing a tighter bound is a viable direction for future research. Irrespective of this, the observation from Corollary 1 seems to be optimal up to a constant scaling factor.

Discussions: We presented a framework for developing optimal universal strategies for quantum encoding of classical data. The framework is universal in that it is optimal for a wide array of statistical inference tasks; it does not rely on the output of the statistical inference problem or on the underlying probability distributions. The optimal encoding strategy only takes into account the support set of the input variable. We proved that the optimal universal encoding strategy is attained by pure states. Furthermore, when there are enough qubits, basis encoding was proved to be universally optimal. For all other cases, an iterative method for numerically computing the optimal universal encoding strategy was presented. Future work can focus on using maximal quantum leakage for bounding the accuracy of inference models in the small-data regime and in quantum machine learning.

References

  • Schuld et al. [2021] M. Schuld, R. Sweke, and J. J. Meyer, Effect of data encoding on the expressive power of variational quantum-machine-learning models, Phys. Rev. A 103, 032430 (2021).
  • Bennett and Brassard [2014] C. H. Bennett and G. Brassard, Quantum cryptography: Public key distribution and coin tossing, Theoretical Computer Science 560, 7 (2014).
  • Haselgrove [2005] H. L. Haselgrove, Optimal state encoding for quantum walks and quantum communication over spin systems, Phys. Rev. A 72, 062326 (2005).
  • Weigold et al. [2022] M. Weigold, J. Barzen, F. Leymann, and M. Salm, Data encoding patterns for quantum computing, in Proceedings of the 27th Conference on Pattern Languages of Programs, PLoP ’20 (The Hillside Group, USA, 2022).
  • Weigold et al. [2021] M. Weigold, J. Barzen, F. Leymann, and M. Salm, Encoding patterns for quantum algorithms, IET Quantum Communication 2, 141 (2021).
  • Elron and Eldar [2007] N. Elron and Y. C. Eldar, Optimal encoding of classical information in a quantum medium, IEEE Transactions on Information Theory 53, 1900 (2007).
  • Korzekwa et al. [2022] K. Korzekwa, Z. Puchała, M. Tomamichel, and K. Życzkowski, Encoding classical information into quantum resources, IEEE Transactions on Information Theory 68, 4518 (2022).
  • Mitra and Mandayam [2021] A. Mitra and P. Mandayam, On optimal cloning and incompatibility, Journal of Physics A: Mathematical and Theoretical 54, 405303 (2021).
  • Farokhi and Kim [2024] F. Farokhi and S. Kim, Measuring quantum information leakage under detection threat, arXiv preprint arXiv:2403.11433  (2024).
  • Nielsen and Chuang [2000] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 2000).
  • Wilde [2013] M. Wilde, Quantum Information Theory (Cambridge University Press, 2013).
  • Farokhi [2024] F. Farokhi, Maximal information leakage from quantum encoding of classical data, Phys. Rev. A 109, 022608 (2024).
  • Denkowski et al. [2003] Z. Denkowski, S. Migórski, and N. S. Papageorgiou, An Introduction to Nonlinear Analysis: Theory (Kluwer Academic Publishers, 2003).
  • Bauer [1958] H. Bauer, Minimalstellen von funktionen und extremalpunkte, Archiv der Mathematik 9, 389 (1958).
  • Holevo [2013] A. S. Holevo, Quantum Systems, Channels, Information: A Mathematical Introduction (De Gruyter, Berlin, Boston, 2013).
  • Sun and Yuan [2006] W. Sun and Y. X. Yuan, Optimization Theory and Methods: Nonlinear Programming, Springer Optimization and Its Applications (Springer US, 2006).