
Optimized quantum $f$-divergences

Mark M. Wilde Hearne Institute for Theoretical Physics, Department of Physics and Astronomy, Center for Computation and Technology,
Louisiana State University, Baton Rouge, Louisiana 70803, USA, Email: [email protected]
Abstract

The quantum relative entropy is a measure of the distinguishability of two quantum states, and it is a unifying concept in quantum information theory: many information measures such as entropy, conditional entropy, mutual information, and entanglement measures can be realized from it. As such, there has been broad interest in generalizing the notion to further understand its most basic properties, one of which is the data processing inequality. The quantum $f$-divergence of Petz is one generalization of the quantum relative entropy, and it also leads to other relative entropies, such as the Petz–Rényi relative entropies. In this contribution, I introduce the optimized quantum $f$-divergence as a related generalization of quantum relative entropy. I prove that it satisfies the data processing inequality, and the method of proof relies upon the operator Jensen inequality, similar to Petz’s original approach. Interestingly, the sandwiched Rényi relative entropies are particular examples of the optimized $f$-divergence. Thus, one benefit of this approach is that there is now a single, unified approach for establishing the data processing inequality for both the Petz–Rényi and sandwiched Rényi relative entropies, for the full range of parameters for which it is known to hold.

Full version of this paper is accessible at arXiv:1710.10252

I Introduction

The quantum relative entropy [1] is a foundational distinguishability measure in quantum information theory (QIT). It is a function of two quantum states and measures how well one can tell the two states apart by a quantum-mechanical experiment. One important reason why it has found such widespread application is that it satisfies a data-processing inequality [2, 3]: it does not increase under the action of a quantum channel on the two states. This can be interpreted as saying that two quantum states do not become more distinguishable if the same quantum channel is applied to them, and a precise interpretation of this statement in terms of quantum hypothesis testing is available in [4, 5, 6]. Quantum relative entropy generalizes its classical counterpart [7].

The wide interest in relative entropy prompted various researchers to generalize and study it further, in an attempt to elucidate the fundamental properties that govern its behavior. One notable generalization is Rényi’s relative entropy [8], but this was subsequently generalized even further in the form of the $f$-divergence [9, 10, 11]. For probability distributions $\{p(x)\}_{x}$ and $\{q(x)\}_{x}$ and a convex function $f$, the $f$-divergence is defined as $\sum_{x}q(x)f(p(x)/q(x))$, in the case that $p(x)=0$ for all $x$ such that $q(x)=0$. The resulting quantity is then non-increasing under the action of a classical channel $r(y|x)$ that produces the output distributions $\sum_{x}r(y|x)p(x)$ and $\sum_{x}r(y|x)q(x)$. Some years after these developments, a quantum generalization of $f$-divergence appeared in [12, 13]. In [12, 13] and a later development [14], the quantum data-processing inequality was proved in full generality for arbitrary quantum channels, whenever the underlying function $f$ is operator convex.
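The classical definition above can be illustrated with a minimal sketch. The distributions `p` and `q`, the channel `r`, and the helper names are assumptions chosen only for illustration, and the convex function $f(t)=t\log t$ is used because it yields the classical relative entropy.

```python
import math

def f_divergence(p, q, f):
    """Classical f-divergence sum_x q(x) f(p(x)/q(x)), assuming q(x) > 0 for all x."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def apply_channel(r, p):
    """Output distribution sum_x r(y|x) p(x), with r[y][x] = r(y|x)."""
    return [sum(r[y][x] * p[x] for x in range(len(p))) for y in range(len(r))]

def kl_generator(t):
    # Convex function f(t) = t log t; it generates the relative entropy.
    return t * math.log(t) if t > 0 else 0.0

# Illustrative inputs (assumptions, not taken from the paper).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
# Each column of r sums to one, so r is a valid classical channel.
r = [[0.9, 0.2, 0.1],
     [0.1, 0.8, 0.9]]

before = f_divergence(p, q, kl_generator)
after = f_divergence(apply_channel(r, p), apply_channel(r, q), kl_generator)
print(before, after)  # the second value should not exceed the first
```

For this particular choice the divergence shrinks after the channel is applied, in line with the classical data-processing statement.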

Interestingly, when generalizing a notion from classical information theory to QIT, there is often more than one way to do so, and sometimes there could even be an infinite number of ways to do so. This has to do with the non-commutativity of quantum states. For example, there are several different ways that one could generalize the relative entropy to the quantum case, and two prominent formulas were put forward in [1] and [15]. This added complexity for the quantum case could potentially be problematic, but the typical way of determining on which generalizations we should focus is to show that a given formula is the answer to a meaningful operational task. The papers [4, 5] accomplished this for the formula from [1], and since then, researchers have realized more and more just how foundational the formula of [1] is. As a consequence, the formula of [1] is now known as quantum relative entropy.

The situation becomes more intricate when it comes to quantum generalizations of Rényi relative entropy. For many years, the Petz–Rényi relative entropy of [12, 13] has been widely studied and given an operational interpretation [16, 17], again in the context of quantum hypothesis testing. However, in recent years, the sandwiched Rényi relative entropy of [18, 19] has gained prominence, due to its role in establishing strong converses for communication tasks (see, e.g., [19, 20]). The result of [21] solidified its fundamental meaning in QIT, proving that it has an operational interpretation in the strong converse exponent of quantum hypothesis testing. As such, the situation is that there are two generalizations of Rényi relative entropy that should be considered in QIT, due to their operational role mentioned above.

The same work that introduced the Petz–Rényi relative entropy also introduced a quantum generalization of the notion of $f$-divergence [12, 13] (see also [22]), with the Petz–Rényi relative entropy being a particular example. Since then, other quantum $f$-divergences have appeared [23, 24], now known as minimal and maximal $f$-divergences [25, 24]. However, it has not been known how the sandwiched Rényi relative entropy fits into the paradigm of quantum $f$-divergences.

In this paper, I modify Petz’s definition of quantum $f$-divergence [12, 13, 22], by allowing for a particular optimization (see Definition 1 for details of the modification). As such, I call the resulting quantity the optimized quantum $f$-divergence. I prove that it obeys a quantum data processing inequality, and as such, my perspective is that it deserves to be considered as another variant of the quantum $f$-divergence, in addition to the original, the minimal, and the maximal. Interestingly, the sandwiched Rényi relative entropy is directly related to the optimized quantum $f$-divergence, thus bringing the sandwiched quantity into the $f$-divergence formalism.

One benefit of the results of this paper is that there is now a single, unified approach for establishing the data-processing inequality for both the Petz–Rényi relative entropy and the sandwiched Rényi relative entropy, for the full Rényi parameter ranges for which it is known to hold. This unified approach is based on Petz’s original approach that employed the operator Jensen inequality [26], and it is useful for presenting a succinct proof of the data processing inequality for both quantum Rényi relative entropy families.

In the rest of the paper, I begin by defining the optimized quantum $f$-divergence in the next section. In Section III, I prove that the optimized $f$-divergence satisfies the quantum data processing inequality under partial trace whenever the underlying function $f$ is operator anti-monotone with domain $(0,\infty)$ and range $\mathbb{R}$. The core tool underlying this proof is the operator Jensen inequality [26]. In Section IV, I show how the quantum relative entropy and the sandwiched Rényi relative entropies are directly related to the optimized quantum $f$-divergence. Section V then discusses the relation between Petz’s $f$-divergence and the optimized one. I finally conclude in Section VI with a summary.

II Optimized quantum $f$-divergence

Let us begin by defining the optimized quantum $f$-divergence. Here I focus exclusively on the case of positive definite operators, and the full version provides details for the more general case of positive semi-definite operators.

Definition 1 (Optimized quantum $f$-divergence)

Let $f$ be a function with domain $(0,\infty)$ and range $\mathbb{R}$. For positive definite operators $X$ and $Y$ acting on a Hilbert space $\mathcal{H}_{S}$, we define the optimized quantum $f$-divergence as

$$\widetilde{Q}_{f}(X\|Y)\equiv\sup_{\tau>0,\ \operatorname{Tr}\{\tau\}\leq 1}\widetilde{Q}_{f}(X\|Y;\tau), \tag{1}$$

where $\widetilde{Q}_{f}(X\|Y;\tau)$ is defined for positive definite $Y$ and $\tau$ acting on $\mathcal{H}_{S}$ as

\begin{align}
\widetilde{Q}_{f}(X\|Y;\tau) &\equiv\langle\varphi^{X}|_{S\hat{S}}f(\tau_{S}^{-1}\otimes Y_{\hat{S}}^{T})|\varphi^{X}\rangle_{S\hat{S}}, \tag{2}\\
|\varphi^{X}\rangle_{S\hat{S}} &\equiv(X_{S}^{1/2}\otimes I_{\hat{S}})|\Gamma\rangle_{S\hat{S}}. \tag{3}
\end{align}

In the above, $\mathcal{H}_{\hat{S}}$ is an auxiliary Hilbert space isomorphic to $\mathcal{H}_{S}$, $|\Gamma\rangle_{S\hat{S}}\equiv\sum_{i=1}^{|S|}|i\rangle_{S}|i\rangle_{\hat{S}}$ for orthonormal bases $\{|i\rangle_{S}\}_{i=1}^{|S|}$ and $\{|i\rangle_{\hat{S}}\}_{i=1}^{|\hat{S}|}$, and the $T$ superscript indicates transpose with respect to the basis $\{|i\rangle_{\hat{S}}\}_{i}$.
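To make (2) and (3) concrete, the following is a minimal numerical sketch using NumPy. The helper names `herm_fun`, `q_tilde`, and `rand_pd` are hypothetical, the matrices are random test inputs, and the check at the end uses $f=-\log$, for which the expression should factorize into $\operatorname{Tr}\{X\log\tau\}-\operatorname{Tr}\{X\log Y\}$.

```python
import numpy as np

def herm_fun(A, g):
    """Apply the scalar function g to a Hermitian matrix A through its eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * g(w)) @ U.conj().T

def q_tilde(X, Y, tau, f):
    """Evaluate <phi^X| f(tau^{-1} (x) Y^T) |phi^X> as in (2) and (3)."""
    d = X.shape[0]
    gamma = np.eye(d).reshape(d * d)            # |Gamma> = sum_i |i>_S |i>_S^
    phi = np.kron(herm_fun(X, np.sqrt), np.eye(d)) @ gamma
    arg = np.kron(np.linalg.inv(tau), Y.T)      # tau^{-1} on S, Y^T on S^
    return float(np.real(phi.conj() @ herm_fun(arg, f) @ phi))

def rand_pd(rng, d):
    """Hypothetical helper: a random positive definite d x d matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(0)
X, Y, tau = rand_pd(rng, 3), rand_pd(rng, 3), rand_pd(rng, 3)
tau = tau / np.trace(tau).real                  # normalize so that Tr{tau} = 1

value = q_tilde(X, Y, tau, lambda x: -np.log(x))
# For f = -log the expression splits into Tr{X log tau} - Tr{X log Y}.
closed_form = np.trace(X @ (herm_fun(tau, np.log) - herm_fun(Y, np.log))).real
print(value, closed_form)
```

The sketch evaluates the quantity for one fixed $\tau$ only; the supremum in (1) is not attempted here.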

The case of greatest interest for us here is when the underlying function $f$ is operator anti-monotone; i.e., for Hermitian operators $A$ and $B$, the function $f$ is such that $A\leq B\Rightarrow f(B)\leq f(A)$ (see, e.g., [27]). This property is rather strong, but there are several functions of interest in quantum physical applications that obey it (see Section IV). One critical property of an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$ is that it is also operator convex and continuous (see, e.g., [28]). In this case, we have the following simple proposition, proved in the full version:

Proposition 2

Let $f$ be an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$. For positive definite operators $X$ and $Y$ acting on a Hilbert space $\mathcal{H}_{S}$,

$$\widetilde{Q}_{f}(X\|Y)=\sup_{\tau>0,\,\operatorname{Tr}\{\tau\}=1}\widetilde{Q}_{f}(X\|Y;\tau), \tag{4}$$

and the function $\widetilde{Q}_{f}(X\|Y;\tau)$ is concave in $\tau$.

III Quantum data processing

Our first main objective is to prove that $\widetilde{Q}_{f}(X\|Y)$ deserves the name “$f$-divergence” or “$f$-relative entropy,” i.e., that it is monotone non-increasing under the action of a completely positive trace-preserving map $\mathcal{N}$:

$$\widetilde{Q}_{f}(X\|Y)\geq\widetilde{Q}_{f}(\mathcal{N}(X)\|\mathcal{N}(Y)). \tag{5}$$

Such a map $\mathcal{N}$ is also called a quantum channel, due to its purpose in quantum physics as modeling the physical evolution of the state of a quantum system. In QIT contexts, the inequality in (5) is known as the quantum data processing inequality. According to the Stinespring dilation theorem [29], for every quantum channel $\mathcal{N}_{S\rightarrow B}$, there exists an isometry $U_{S\rightarrow BE}^{\mathcal{N}}$ such that

$$\mathcal{N}_{S\rightarrow B}(X_{S})=\operatorname{Tr}_{E}\{U_{S\rightarrow BE}^{\mathcal{N}}X_{S}\left(U_{S\rightarrow BE}^{\mathcal{N}}\right)^{\dagger}\}. \tag{6}$$

As such, we can prove the inequality in (5) in two steps.

Isometric invariance: First show that

$$\widetilde{Q}_{f}(X\|Y)=\widetilde{Q}_{f}(UXU^{\dagger}\|UYU^{\dagger}) \tag{7}$$

for any isometry $U$ and any positive semi-definite $X$ and $Y$. This is done in the full version of this work, using the general definition given there.

Monotonicity under partial trace: Then show that

$$\widetilde{Q}_{f}(X_{AB}\|Y_{AB})\geq\widetilde{Q}_{f}(X_{A}\|Y_{A}) \tag{8}$$

for positive semi-definite operators $X_{AB}$ and $Y_{AB}$ acting on the tensor-product Hilbert space $\mathcal{H}_{A}\otimes\mathcal{H}_{B}$, with $X_{A}=\operatorname{Tr}_{B}\{X_{AB}\}$ and $Y_{A}=\operatorname{Tr}_{B}\{Y_{AB}\}$.

We now discuss the second step toward quantum data processing, mentioned above, and here we focus exclusively on positive definite operators:

Theorem 3 (Monotonicity under partial trace)

Let $f$ be an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$. Given positive definite operators $X_{AB}$ and $Y_{AB}$ acting on the tensor-product Hilbert space $\mathcal{H}_{A}\otimes\mathcal{H}_{B}$, the optimized quantum $f$-divergence does not increase under the action of a partial trace, in the sense that

$$\widetilde{Q}_{f}(X_{AB}\|Y_{AB})\geq\widetilde{Q}_{f}(X_{A}\|Y_{A}), \tag{9}$$

where $X_{A}=\operatorname{Tr}_{B}\{X_{AB}\}$ and $Y_{A}=\operatorname{Tr}_{B}\{Y_{AB}\}$.

Proof:

The quantities of interest are as follows:

\begin{align}
\widetilde{Q}_{f}(X_{AB}\|Y_{AB};\tau_{AB}) &=\langle\varphi^{X_{AB}}|_{AB\hat{A}\hat{B}}f(\tau_{AB}^{-1}\otimes Y_{\hat{A}\hat{B}}^{T})|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}}, \tag{10}\\
\widetilde{Q}_{f}(X_{A}\|Y_{A};\omega_{A}) &=\langle\varphi^{X_{A}}|_{A\hat{A}}f(\omega_{A}^{-1}\otimes Y_{\hat{A}}^{T})|\varphi^{X_{A}}\rangle_{A\hat{A}}, \tag{11}
\end{align}

where $\tau_{AB}$ and $\omega_{A}$ are invertible density operators and, by definition,

$$|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}}=\left(X_{AB}^{1/2}\otimes I_{\hat{A}\hat{B}}\right)|\Gamma\rangle_{A\hat{A}}\otimes|\Gamma\rangle_{B\hat{B}}. \tag{12}$$

The following map, acting on an operator $Z_{A}$, is a quantum channel known as the Petz recovery channel [30, 31]:

$$Z_{A}\rightarrow X_{AB}^{1/2}\left(\left[X_{A}^{-1/2}Z_{A}X_{A}^{-1/2}\right]\otimes I_{B}\right)X_{AB}^{1/2}. \tag{13}$$

It is completely positive because it consists of the serial concatenation of three completely positive maps: sandwiching by $X_{A}^{-1/2}$, tensoring in the identity $I_{B}$, and sandwiching by $X_{AB}^{1/2}$. It is also trace preserving. The Petz recovery channel has the property that it perfectly recovers $X_{AB}$ if $X_{A}$ is input because

$$X_{A}\rightarrow X_{AB}^{1/2}\left(\left[X_{A}^{-1/2}X_{A}X_{A}^{-1/2}\right]\otimes I_{B}\right)X_{AB}^{1/2}=X_{AB}. \tag{14}$$

Every completely positive and trace preserving map $\mathcal{N}$ has a Kraus decomposition, which is a set $\{K_{i}\}_{i}$ of operators such that $\mathcal{N}(\cdot)=\sum_{i}K_{i}(\cdot)K_{i}^{\dagger}$ and $\sum_{i}K_{i}^{\dagger}K_{i}=I$. A standard construction for an isometric extension of a channel is then to pick an orthonormal basis $\{|i\rangle_{E}\}_{i}$ for an auxiliary Hilbert space $\mathcal{H}_{E}$ and define

$$V=\sum_{i}K_{i}\otimes|i\rangle_{E}. \tag{15}$$

One can then readily check that $\mathcal{N}(\cdot)=\operatorname{Tr}_{E}\{V(\cdot)V^{\dagger}\}$ and $V^{\dagger}V=I$. For the Petz recovery channel, we can figure out a Kraus decomposition by expanding the identity operator $I_{B}=\sum_{j=1}^{|B|}|j\rangle\langle j|_{B}$, with respect to some orthonormal basis $\{|j\rangle_{B}\}_{j}$, so that

\begin{align*}
&X_{AB}^{1/2}\left(\left[X_{A}^{-1/2}\omega_{A}X_{A}^{-1/2}\right]\otimes I_{B}\right)X_{AB}^{1/2}\\
&\quad=\sum_{j=1}^{|B|}X_{AB}^{1/2}\left(\left[X_{A}^{-1/2}\omega_{A}X_{A}^{-1/2}\right]\otimes|j\rangle\langle j|_{B}\right)X_{AB}^{1/2}\\
&\quad=\sum_{j=1}^{|B|}X_{AB}^{1/2}\left[X_{A}^{-1/2}\otimes|j\rangle_{B}\right]\omega_{A}\left[X_{A}^{-1/2}\otimes\langle j|_{B}\right]X_{AB}^{1/2}.
\end{align*}

Thus, Kraus operators for the Petz recovery channel are $\left\{X_{AB}^{1/2}\left[X_{A}^{-1/2}\otimes|j\rangle_{B}\right]\right\}_{j=1}^{|B|}$. According to the standard recipe in (15), we can construct an isometric extension of the Petz recovery channel as

\begin{align}
\sum_{j=1}^{|B|}X_{AB}^{1/2}\left[X_{A}^{-1/2}\otimes|j\rangle_{B}\right]|j\rangle_{\hat{B}} &=X_{AB}^{1/2}X_{A}^{-1/2}\sum_{j=1}^{|B|}|j\rangle_{B}|j\rangle_{\hat{B}} \notag\\
&=X_{AB}^{1/2}X_{A}^{-1/2}|\Gamma\rangle_{B\hat{B}}. \tag{16}
\end{align}

We can then extend this isometry to act as an isometry on a larger space by tensoring it with the identity operator $I_{\hat{A}}$, and so we define

$$V_{A\hat{A}\rightarrow A\hat{A}B\hat{B}}\equiv X_{AB}^{1/2}\left[X_{A}^{-1/2}\otimes I_{\hat{A}}\right]|\Gamma\rangle_{B\hat{B}}. \tag{17}$$

We can also see that $V_{A\hat{A}\rightarrow A\hat{A}B\hat{B}}$ acting on $|\varphi^{X_{A}}\rangle_{A\hat{A}}$ generates $|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}}$: $|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}}=V_{A\hat{A}\rightarrow A\hat{A}B\hat{B}}|\varphi^{X_{A}}\rangle_{A\hat{A}}$. This can be interpreted as a generalization of (14) in the language of QIT: an isometric extension of the Petz recovery channel perfectly recovers a purification $|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}}$ of $X_{AB}$ from a purification $|\varphi^{X_{A}}\rangle_{A\hat{A}}$ of $X_{A}$. Since the Petz recovery channel is indeed a channel, we can pick $\tau_{AB}$ as the output state of the Petz recovery channel acting on an invertible state $\omega_{A}$:

$$\tau_{AB}=X_{AB}^{1/2}\left(\left[X_{A}^{-1/2}\omega_{A}X_{A}^{-1/2}\right]\otimes I_{B}\right)X_{AB}^{1/2}. \tag{18}$$

Observe that $\tau_{AB}$ is invertible. Then consider that

\begin{align}
&V^{\dagger}\left(\tau_{AB}^{-1}\otimes Y_{\hat{A}\hat{B}}^{T}\right)V \notag\\
&\quad=\langle\Gamma|_{B\hat{B}}\Big(X_{A}^{-1/2}X_{AB}^{1/2}\tau_{AB}^{-1}X_{AB}^{1/2}X_{A}^{-1/2}\otimes Y_{\hat{A}\hat{B}}^{T}\Big)|\Gamma\rangle_{B\hat{B}} \tag{19}\\
&\quad=\langle\Gamma|_{B\hat{B}}\left(\omega_{A}^{-1}\otimes I_{B}\otimes Y_{\hat{A}\hat{B}}^{T}\right)|\Gamma\rangle_{B\hat{B}} \tag{20}\\
&\quad=\omega_{A}^{-1}\otimes\langle\Gamma|_{B\hat{B}}Y_{\hat{A}\hat{B}}^{T}|\Gamma\rangle_{B\hat{B}} \tag{21}\\
&\quad=\omega_{A}^{-1}\otimes Y_{\hat{A}}^{T}. \tag{22}
\end{align}

For the second equality, we used the fact that $X_{A}^{-1/2}X_{AB}^{1/2}\tau_{AB}^{-1}X_{AB}^{1/2}X_{A}^{-1/2}=\omega_{A}^{-1}\otimes I_{B}$ for the choice of $\tau_{AB}$ in (18). With this setup, we can now readily establish the desired inequality by employing the operator Jensen inequality [26] and operator convexity of the function $f$:

\begin{align}
&\widetilde{Q}_{f}(X_{AB}\|Y_{AB};\tau_{AB}) \notag\\
&\quad=\langle\varphi^{X_{AB}}|_{AB\hat{A}\hat{B}}f(\tau_{AB}^{-1}\otimes Y_{\hat{A}\hat{B}}^{T})|\varphi^{X_{AB}}\rangle_{AB\hat{A}\hat{B}} \tag{23}\\
&\quad=\langle\varphi^{X_{A}}|_{A\hat{A}}V^{\dagger}f(\tau_{AB}^{-1}\otimes Y_{\hat{A}\hat{B}}^{T})V|\varphi^{X_{A}}\rangle_{A\hat{A}} \tag{24}\\
&\quad\geq\langle\varphi^{X_{A}}|_{A\hat{A}}f(V^{\dagger}[\tau_{AB}^{-1}\otimes Y_{\hat{A}\hat{B}}^{T}]V)|\varphi^{X_{A}}\rangle_{A\hat{A}} \tag{25}\\
&\quad=\langle\varphi^{X_{A}}|_{A\hat{A}}f(\omega_{A}^{-1}\otimes Y_{\hat{A}}^{T})|\varphi^{X_{A}}\rangle_{A\hat{A}} \tag{26}\\
&\quad=\widetilde{Q}_{f}(X_{A}\|Y_{A};\omega_{A}). \tag{27}
\end{align}

Taking a supremum over $\tau_{AB}$ such that $\tau_{AB}>0$ and $\operatorname{Tr}\{\tau_{AB}\}=1$, we conclude that the following inequality holds for all invertible states $\omega_{A}$:

$$\widetilde{Q}_{f}(X_{AB}\|Y_{AB})\geq\widetilde{Q}_{f}(X_{A}\|Y_{A};\omega_{A}). \tag{28}$$

After taking a supremum over invertible states $\omega_{A}$, we find that the inequality in (9) holds when $X_{AB}$ is invertible. ∎
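The algebraic steps used in the proof can be checked numerically on small random instances. The sketch below is an illustration under stated assumptions rather than part of the argument: the helper names, the dimensions, the random seed, and the ordering of tensor factors as $A,B,\hat{A},\hat{B}$ are all choices made here. It builds $V$ from (17) and $\tau_{AB}$ from (18), and then compares both sides of (14), of $V|\varphi^{X_{A}}\rangle=|\varphi^{X_{AB}}\rangle$, and of (19)–(22).

```python
import numpy as np

def herm_pow(A, p):
    """A**p for a positive definite Hermitian matrix A."""
    w, U = np.linalg.eigh(A)
    return (U * w**p) @ U.conj().T

def rand_pd(rng, d):
    """Hypothetical helper: a random positive definite d x d matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

dA, dB = 2, 2
rng = np.random.default_rng(1)
X_AB, Y_AB = rand_pd(rng, dA * dB), rand_pd(rng, dA * dB)
omega_A = rand_pd(rng, dA)
omega_A = omega_A / np.trace(omega_A).real      # invertible state on A

def tr_B(Z):
    """Partial trace over B for an operator on A (x) B."""
    return np.trace(Z.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

X_A, Y_A = tr_B(X_AB), tr_B(Y_AB)
root_XAB, inv_root_XA = herm_pow(X_AB, 0.5), herm_pow(X_A, -0.5)

def petz(Z_A):
    """Petz recovery channel from (13)."""
    return root_XAB @ np.kron(inv_root_XA @ Z_A @ inv_root_XA, np.eye(dB)) @ root_XAB

tau_AB = petz(omega_A)                          # choice made in (18)

# K = X_AB^{1/2} (X_A^{-1/2} (x) I_B), viewed as a tensor K[a', b', a, j].
K = (root_XAB @ np.kron(inv_root_XA, np.eye(dB))).reshape(dA, dB, dA, dB)
# V maps (A, A^) to (A, B, A^, B^): index j goes to B^, and A^ passes through.
V = np.einsum('abcd,ef->abedcf', K, np.eye(dA)).reshape(dA * dB * dA * dB, dA * dA)

phi_A = np.kron(herm_pow(X_A, 0.5), np.eye(dA)) @ np.eye(dA).reshape(-1)
phi_AB = np.kron(root_XAB, np.eye(dA * dB)) @ np.eye(dA * dB).reshape(-1)

lhs = V.conj().T @ np.kron(np.linalg.inv(tau_AB), Y_AB.T) @ V   # left side of (19)
rhs = np.kron(np.linalg.inv(omega_A), Y_A.T)                    # right side of (22)

print(np.allclose(petz(X_A), X_AB))             # recovery property (14)
print(np.allclose(V.conj().T @ V, np.eye(dA * dA)))
print(np.allclose(V @ phi_A, phi_AB))
print(np.allclose(lhs, rhs))
```

The sketch does not test the operator Jensen step (25) itself, which depends on the choice of $f$; it only confirms the identities that reduce the problem to that step.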

IV Examples of optimized quantum $f$-divergences

I now show how several known quantum divergences are particular examples of an optimized quantum $f$-divergence, including the quantum relative entropy [1] and the sandwiched Rényi relative quasi-entropies [18, 19]. The result will be that Theorem 3 recovers quantum data processing for the sandwiched Rényi relative entropies for the full range of parameters for which it is known to hold. Thus, one benefit of Theorem 3 and earlier work of [12, 13, 14] is a single, unified approach, based on the operator Jensen inequality [26], for establishing quantum data processing for all of the Petz–Rényi and sandwiched Rényi relative entropies for the full parameter ranges for which data processing is known to hold.

IV-A Quantum relative entropy as optimized quantum $f$-divergence

Let $\tau$ be an invertible state and $X$ and $Y$ positive definite. Let $\overline{X}=X/\operatorname{Tr}\{X\}$. Pick the function $f(x)=-\log x$, which is an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$, and we find that

\begin{align}
&\frac{1}{\operatorname{Tr}\{X\}}\langle\varphi^{X}|_{S\hat{S}}\left[-\log(\tau_{S}^{-1}\otimes Y_{\hat{S}}^{T})\right]|\varphi^{X}\rangle_{S\hat{S}} \notag\\
&\quad=\langle\varphi^{\overline{X}}|_{S\hat{S}}\left[\log(\tau_{S})\otimes I_{\hat{S}}-I_{S}\otimes\log Y_{\hat{S}}^{T}\right]|\varphi^{\overline{X}}\rangle_{S\hat{S}} \tag{29}\\
&\quad=\langle\varphi^{\overline{X}}|_{S\hat{S}}\log(\tau_{S})\otimes I_{\hat{S}}|\varphi^{\overline{X}}\rangle_{S\hat{S}}-\langle\varphi^{\overline{X}}|_{S\hat{S}}I_{S}\otimes\log\left(Y_{\hat{S}}^{T}\right)|\varphi^{\overline{X}}\rangle_{S\hat{S}} \tag{30}\\
&\quad=\operatorname{Tr}\{\overline{X}\log\tau\}-\operatorname{Tr}\{\overline{X}\log Y\} \tag{31}\\
&\quad\leq\operatorname{Tr}\{\overline{X}\log\overline{X}\}-\operatorname{Tr}\{\overline{X}\log Y\}=D(\overline{X}\|Y). \tag{32}
\end{align}

The inequality is a consequence of Klein’s inequality [32] (see also [33]), establishing that the optimal $\tau$ is set to $\overline{X}$. So we find that $\widetilde{Q}_{-\log(\cdot)}(X\|Y)=\operatorname{Tr}\{X\}D(\overline{X}\|Y)$, where the quantum relative entropy $D(\overline{X}\|Y)$ is defined as [1] $D(\overline{X}\|Y)=\operatorname{Tr}\{\overline{X}\left[\log\overline{X}-\log Y\right]\}$.
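As a hedged numerical illustration of this subsection, the sketch below evaluates the defining expression at $\tau=\overline{X}$ and at a handful of random states $\tau$. The helper names, the dimension, and the random seed are assumptions made for illustration, and sampling random states is only a spot check of the supremum, not a computation of it.

```python
import numpy as np

def herm_fun(A, g):
    """Apply the scalar function g to a Hermitian matrix A through its eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * g(w)) @ U.conj().T

def q_tilde(X, Y, tau, f):
    """<phi^X| f(tau^{-1} (x) Y^T) |phi^X> for positive definite X, Y, tau."""
    d = X.shape[0]
    phi = np.kron(herm_fun(X, np.sqrt), np.eye(d)) @ np.eye(d).reshape(d * d)
    arg = np.kron(np.linalg.inv(tau), Y.T)
    return float(np.real(phi.conj() @ herm_fun(arg, f) @ phi))

def rel_entropy(rho, sigma):
    """Tr{rho (log rho - log sigma)} for positive definite rho and sigma."""
    return float(np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real)

def rand_pd(rng, d):
    """Hypothetical helper: a random positive definite d x d matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(2)
X, Y = rand_pd(rng, 2), rand_pd(rng, 2)
tr_X = np.trace(X).real
X_bar = X / tr_X
neg_log = lambda x: -np.log(x)

at_x_bar = q_tilde(X, Y, X_bar, neg_log)        # value at the claimed optimizer
target = tr_X * rel_entropy(X_bar, Y)           # Tr{X} D(X_bar || Y)

trial_values = []
for _ in range(20):
    tau = rand_pd(rng, 2)
    trial_values.append(q_tilde(X, Y, tau / np.trace(tau).real, neg_log))

print(at_x_bar, target, max(trial_values))
```

In this sketch none of the sampled states exceeds the value at $\tau=\overline{X}$, which is what Klein’s inequality predicts.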

IV-B Sandwiched Rényi relative quasi-entropy as optimized quantum $f$-divergence

Take $\tau$, $X$, and $Y$ as defined in Section IV-A. For $\alpha\in[1/2,1)$, pick the function $f(x)=-x^{(1-\alpha)/\alpha}$, which is an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$. Note that this is a reparametrization of $-x^{\beta}$ for $\beta\in(0,1]$. I now show that

$$\widetilde{Q}_{-(\cdot)^{(1-\alpha)/\alpha}}(X\|Y)=-\left\|Y^{(1-\alpha)/2\alpha}XY^{(1-\alpha)/2\alpha}\right\|_{\alpha}, \tag{33}$$

which is the known expression for the sandwiched Rényi relative quasi-entropy for $\alpha\in[1/2,1)$ [18, 19]. To see this, consider that

\begin{align}
&-\langle\varphi^{X}|_{S\hat{S}}\left[\tau_{S}^{-1}\otimes Y_{\hat{S}}^{T}\right]^{(1-\alpha)/\alpha}|\varphi^{X}\rangle_{S\hat{S}} \notag\\
&\quad=-\langle\varphi^{X}|_{S\hat{S}}\tau_{S}^{(\alpha-1)/\alpha}\otimes\left(Y_{\hat{S}}^{T}\right)^{(1-\alpha)/\alpha}|\varphi^{X}\rangle_{S\hat{S}} \notag\\
&\quad=-\langle\Gamma|_{S\hat{S}}X_{S}^{1/2}\tau_{S}^{(\alpha-1)/\alpha}X_{S}^{1/2}\otimes\left(Y_{\hat{S}}^{T}\right)^{(1-\alpha)/\alpha}|\Gamma\rangle_{S\hat{S}} \notag\\
&\quad=-\operatorname{Tr}\left\{X^{1/2}\tau^{(\alpha-1)/\alpha}X^{1/2}Y^{(1-\alpha)/\alpha}\right\} \notag\\
&\quad=-\operatorname{Tr}\left\{X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\tau^{(\alpha-1)/\alpha}\right\}. \tag{34}
\end{align}

Now optimizing over invertible states $\tau$ and employing Hölder duality, in the form of the reverse Hölder inequality and as observed in [18], we find that

$$\sup_{\tau>0,\,\operatorname{Tr}\{\tau\}=1}\left[-\operatorname{Tr}\left\{X^{1/2}Y^{\frac{1-\alpha}{\alpha}}X^{1/2}\tau^{\frac{\alpha-1}{\alpha}}\right\}\right]=-\left\|X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\right\|_{\alpha}, \tag{35}$$

where for positive semi-definite $Z$, we define $\left\|Z\right\|_{\alpha}=\left[\operatorname{Tr}\{Z^{\alpha}\}\right]^{1/\alpha}$. We then get that

\begin{align}
\widetilde{Q}_{-(\cdot)^{(1-\alpha)/\alpha}}(X\|Y) &=-\left\|X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\right\|_{\alpha} \tag{36}\\
&=-\left\|Y^{(1-\alpha)/2\alpha}XY^{(1-\alpha)/2\alpha}\right\|_{\alpha}, \tag{37}
\end{align}

which is the sandwiched Rényi relative quasi-entropy for the range $\alpha\in[1/2,1)$. The sandwiched Rényi relative entropy itself is defined up to a normalization factor as [18, 19]

$$\widetilde{D}_{\alpha}(X\|Y)=\frac{\alpha}{\alpha-1}\log\left\|Y^{(1-\alpha)/2\alpha}XY^{(1-\alpha)/2\alpha}\right\|_{\alpha}. \tag{38}$$

Thus, Theorem 3 implies quantum data processing for the sandwiched Rényi relative entropy, $\widetilde{D}_{\alpha}(X_{AB}\|Y_{AB})\geq\widetilde{D}_{\alpha}(X_{A}\|Y_{A})$, for the parameter range $\alpha\in[1/2,1)$, which is a result previously established in [34].

For $\alpha\in(1,\infty]$, pick the function $f(x)=x^{(1-\alpha)/\alpha}$, which is an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$. Note that this is a reparametrization of $x^{\beta}$ for $\beta\in[-1,0)$. I now show that

$$\widetilde{Q}_{(\cdot)^{(1-\alpha)/\alpha}}(X\|Y)=\left\|Y^{(1-\alpha)/2\alpha}XY^{(1-\alpha)/2\alpha}\right\|_{\alpha}, \tag{39}$$

which is the known expression for sandwiched Rényi relative quasi-entropy for $\alpha\in(1,\infty]$ [18, 19]. To see this, consider that the same development as above gives that

$$\langle\varphi^{X}|_{S\hat{S}}(\tau_{S}^{-1}\otimes Y_{\hat{S}}^{T})^{(1-\alpha)/\alpha}|\varphi^{X}\rangle_{S\hat{S}}=\operatorname{Tr}\left\{X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\tau^{(\alpha-1)/\alpha}\right\}. \tag{40}$$

Again employing Hölder duality, as observed in [18], we find

$$\sup_{\tau>0,\,\operatorname{Tr}\{\tau\}=1}\operatorname{Tr}\left\{X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\tau^{(\alpha-1)/\alpha}\right\}=\left\|X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\right\|_{\alpha}. \tag{41}$$

We then get that

\begin{align}
\widetilde{Q}_{(\cdot)^{(1-\alpha)/\alpha}}(X\|Y) &=\left\|X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}\right\|_{\alpha} \tag{42}\\
&=\left\|Y^{(1-\alpha)/2\alpha}XY^{(1-\alpha)/2\alpha}\right\|_{\alpha}, \tag{43}
\end{align}

where the equalities hold as observed in [18]. The sandwiched Rényi relative entropy itself is defined up to a normalization factor as in (38) [18, 19]. Thus, Theorem 3 implies quantum data processing for the sandwiched Rényi relative entropy, $\widetilde{D}_{\alpha}(X_{AB}\|Y_{AB})\geq\widetilde{D}_{\alpha}(X_{A}\|Y_{A})$, for the parameter range $\alpha\in(1,\infty]$, which is a result previously established in full by [34, 35, 21] and for $\alpha\in(1,2]$ by [18, 19].
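The two Hölder-duality statements (35) and (41) can be spot-checked numerically. The sketch below is illustrative only and rests on stated assumptions: the helper names, dimension, seed, and the values $\alpha=0.7$ and $\alpha=2$ are arbitrary, and the candidate optimizer $\tau^{\star}=Z^{\alpha}/\operatorname{Tr}\{Z^{\alpha}\}$ with $Z=X^{1/2}Y^{(1-\alpha)/\alpha}X^{1/2}$ is not stated in the text; it is the choice that saturates the Hölder inequality. The endpoint $\alpha=\infty$ is not covered.

```python
import numpy as np

def herm_pow(A, p):
    """A**p for a positive definite Hermitian matrix A."""
    w, U = np.linalg.eigh(A)
    return (U * w**p) @ U.conj().T

def schatten(Z, alpha):
    """(Tr{Z^alpha})^(1/alpha) for positive definite Z."""
    Z = (Z + Z.conj().T) / 2                    # remove rounding asymmetry
    return float(np.sum(np.linalg.eigvalsh(Z) ** alpha) ** (1.0 / alpha))

def rand_pd(rng, d):
    """Hypothetical helper: a random positive definite d x d matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

def compare(alpha, X, Y, rng, trials=20):
    """Return the pairing at tau_star, both norm expressions, and random-state pairings."""
    g = (1 - alpha) / alpha
    root_X = herm_pow(X, 0.5)
    Z = root_X @ herm_pow(Y, g) @ root_X
    Z = (Z + Z.conj().T) / 2
    # Tr{Z tau^{(alpha-1)/alpha}}, the trace expression in (34) and (40).
    pairing = lambda tau: float(np.trace(Z @ herm_pow(tau, -g)).real)
    tau_star = herm_pow(Z, alpha)
    tau_star = tau_star / np.trace(tau_star).real
    random_vals = []
    for _ in range(trials):
        tau = rand_pd(rng, X.shape[0])
        random_vals.append(pairing(tau / np.trace(tau).real))
    Y_half = herm_pow(Y, g / 2)
    return pairing(tau_star), schatten(Z, alpha), schatten(Y_half @ X @ Y_half, alpha), random_vals

rng = np.random.default_rng(3)
X, Y = rand_pd(rng, 3), rand_pd(rng, 3)
low = compare(0.7, X, Y, rng)    # alpha in [1/2, 1): the trace is minimized
high = compare(2.0, X, Y, rng)   # alpha in (1, infinity): the trace is maximized
print(low[:3], min(low[3]))
print(high[:3], max(high[3]))
```

For $\alpha<1$ the sampled traces stay above the norm, so their negatives stay below $-\|Z\|_{\alpha}$ as in (35); for $\alpha>1$ they stay below it as in (41). The second and third returned values agree, matching the equalities (36)–(37) and (42)–(43).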

V On Petz’s quantum $f$-divergence

I now discuss in more detail the relation between the optimized quantum $f$-divergence and Petz’s $f$-divergence from [12, 13]. In brief, we find that the Petz $f$-divergence can be recovered by replacing $\tau$ in Definition 1 with $X$.

Definition 4 (Petz quantum $f$-divergence)

Let $f$ be a continuous function with domain $(0,\infty)$ and range $\mathbb{R}$. For positive definite operators $X$ and $Y$ acting on a Hilbert space $\mathcal{H}_{S}$, the Petz quantum $f$-divergence is defined as

$$Q_{f}(X\|Y)\equiv\langle\varphi^{X}|_{S\hat{S}}f\left(X_{S}^{-1}\otimes Y_{\hat{S}}^{T}\right)|\varphi^{X}\rangle_{S\hat{S}}, \tag{44}$$

where the notation is the same as in Definition 1.

One main concern is whether quantum data processing holds for the Petz $f$-divergence. To show that it does, we take $f$ to be an operator anti-monotone function with domain $(0,\infty)$ and range $\mathbb{R}$. As discussed in Section III, one can establish data processing by showing isometric invariance and monotonicity under partial trace. Isometric invariance of $Q_{f}(X\|Y)$ follows from the same proof as given in the full version and was also shown in [14]. Monotonicity of $Q_{f}(X_{AB}\|Y_{AB})$ under partial trace for positive definite $X_{AB}$ and $Y_{AB}$ follows from the operator Jensen inequality [12, 13].

Special and interesting cases of the Petz $f$-divergence are found by taking $f(x)=-\log x$, $f(x)=-x^{\beta}$ for $\beta\in(0,1]$, and $f(x)=x^{\beta}$ for $\beta\in[-1,0)$. Each of these functions is operator anti-monotone with domain $(0,\infty)$ and range $\mathbb{R}$. As shown in [12, 13], all of the following quantities obey the data processing inequality:

\begin{align}
Q_{-\log(\cdot)}(X\|Y) &=\operatorname{Tr}\{X\}D(\overline{X}\|Y), \tag{45}\\
Q_{-(\cdot)^{\beta}}(X\|Y) &=-\operatorname{Tr}\{X^{1-\beta}Y^{\beta}\},\ \text{for }\beta\in(0,1], \tag{46}\\
Q_{(\cdot)^{\beta}}(X\|Y) &=\operatorname{Tr}\{X^{1-\beta}Y^{\beta}\},\ \text{for }\beta\in[-1,0). \tag{47}
\end{align}

By a reparametrization $\alpha=1-\beta$, the latter two quantities are directly related to the Petz–Rényi relative entropy $D_{\alpha}(X\|Y)\equiv\frac{1}{\alpha-1}\log\operatorname{Tr}\{X^{\alpha}Y^{1-\alpha}\}$. Thus, the data processing inequality holds for $D_{\alpha}(X\|Y)$ for $\alpha\in[0,1)\cup(1,2]$ [13, 14].
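The closed forms (46) and (47) can be compared against the definition in (44) on random inputs. This is a small sketch under assumptions: the helper names, the dimension, the seed, and the values $\beta=1/2$ and $\beta=-1$ are arbitrary illustrative choices.

```python
import numpy as np

def herm_fun(A, g):
    """Apply the scalar function g to a Hermitian matrix A through its eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * g(w)) @ U.conj().T

def petz_q(X, Y, f):
    """Petz f-divergence (44): <phi^X| f(X^{-1} (x) Y^T) |phi^X>."""
    d = X.shape[0]
    phi = np.kron(herm_fun(X, np.sqrt), np.eye(d)) @ np.eye(d).reshape(d * d)
    arg = np.kron(np.linalg.inv(X), Y.T)
    return float(np.real(phi.conj() @ herm_fun(arg, f) @ phi))

def power_trace(X, Y, beta):
    """Tr{X^(1-beta) Y^beta} for positive definite X and Y."""
    return float(np.trace(herm_fun(X, lambda w: w ** (1 - beta))
                          @ herm_fun(Y, lambda w: w ** beta)).real)

def rand_pd(rng, d):
    """Hypothetical helper: a random positive definite d x d matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(4)
X, Y = rand_pd(rng, 3), rand_pd(rng, 3)

lhs_46 = petz_q(X, Y, lambda x: -x ** 0.5)      # f(x) = -x^beta with beta = 1/2
rhs_46 = -power_trace(X, Y, 0.5)
lhs_47 = petz_q(X, Y, lambda x: x ** -1.0)      # f(x) = x^beta with beta = -1
rhs_47 = power_trace(X, Y, -1.0)
print(lhs_46, rhs_46)
print(lhs_47, rhs_47)
```

With $\alpha=1-\beta$, the two test values correspond to $\alpha=1/2$ and $\alpha=2$ in the Petz–Rényi relative entropy.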

VI Conclusion

The main contribution of the present work is the definition of the optimized quantum $f$-divergence and the proof that the data processing inequality holds for it whenever the function $f$ is operator anti-monotone with domain $(0,\infty)$ and range $\mathbb{R}$. The proof of the data processing inequality relies on the operator Jensen inequality [26], and it bears some similarities to the original approach from [12, 13, 14]. Furthermore, I showed how the sandwiched Rényi relative entropies are particular examples of the optimized quantum $f$-divergence. As such, one benefit of this paper is that there is now a single, unified approach, based on the operator Jensen inequality [26], for establishing the data processing inequality for the Petz–Rényi and sandwiched Rényi relative entropies, for the full range of parameters for which it is known to hold.

Acknowledgements. I thank Anna Vershynina for discussions related to the topic of this paper, and I acknowledge support from the NSF under grant no. 1714215.

References

  • [1] H. Umegaki, “Conditional expectations in an operator algebra IV,” Kodai Math. Sem. Rep., vol. 14, no. 2, pp. 59–85, 1962.
  • [2] G. Lindblad, “Completely positive maps and entropy inequalities,” Comm. Math. Phys., vol. 40, no. 2, pp. 147–151, June 1975.
  • [3] A. Uhlmann, “Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory,” Communications in Mathematical Physics, vol. 54, no. 1, pp. 21–32, 1977.
  • [4] F. Hiai and D. Petz, “The proper formula for relative entropy and its asymptotics in quantum probability,” Communications in Mathematical Physics, vol. 143, no. 1, pp. 99–114, December 1991.
  • [5] T. Ogawa and H. Nagaoka, “Strong converse and Stein’s lemma in quantum hypothesis testing,” IEEE Transactions on Information Theory, vol. 46, no. 7, pp. 2428–2433, November 2000.
  • [6] I. Bjelakovic and R. Siegmund-Schultze, “Quantum Stein’s lemma revisited, inequalities for quantum entropies, and a concavity theorem of Lieb,” July 2012, arXiv:quant-ph/0307170.
  • [7] S. Kullback and R. A. Leibler, “On information and sufficiency,” Ann. Math. Stat., vol. 22, no. 1, pp. 79–86, March 1951.
  • [8] A. Rényi, “On measures of entropy and information,” Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, vol. 1, pp. 547–561, 1961.
  • [9] I. Csiszár, “Information type measure of difference of probability distributions and indirect observations,” Studia Scientiarum Mathematicarum Hungarica, vol. 2, pp. 299–318, 1967.
  • [10] S. M. Ali and S. D. Silvey, “A general class of coefficients of divergence of one distribution from another,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 28, no. 1, pp. 131–142, 1966.
  • [11] T. Morimoto, “Markov processes and the $H$-theorem,” Journal of the Physical Society of Japan, vol. 18, no. 3, pp. 328–331, 1963.
  • [12] D. Petz, “Quasi-entropies for states of a von Neumann algebra,” Publ. RIMS, Kyoto University, vol. 21, pp. 787–800, 1985.
  • [13] ——, “Quasi-entropies for finite quantum systems,” Reports in Mathematical Physics, vol. 23, pp. 57–65, 1986.
  • [14] M. Tomamichel, R. Colbeck, and R. Renner, “A fully quantum asymptotic equipartition property,” IEEE Transactions on Information Theory, vol. 55, no. 12, pp. 5840–5847, December 2009.
  • [15] V. P. Belavkin and P. Staszewski, “C*-algebraic generalization of relative entropy and entropy,” Annales de l’I.H.P. Physique théorique, vol. 37, no. 1, pp. 51–58, 1982.
  • [16] H. Nagaoka, “The converse part of the theorem for quantum Hoeffding bound,” November 2006, arXiv:quant-ph/0611289.
  • [17] M. Hayashi, “Error exponent in asymmetric quantum hypothesis testing and its application to classical-quantum channel coding,” Physical Review A, vol. 76, no. 6, p. 062301, December 2007.
  • [18] M. Müller-Lennert, F. Dupuis, O. Szehr, S. Fehr, and M. Tomamichel, “On quantum Rényi entropies: a new definition and some properties,” J. Math. Phys., vol. 54, no. 12, p. 122203, December 2013.
  • [19] M. M. Wilde, A. Winter, and D. Yang, “Strong converse for the classical capacity of entanglement-breaking and Hadamard channels via a sandwiched Rényi relative entropy,” Communications in Mathematical Physics, vol. 331, no. 2, pp. 593–622, October 2014.
  • [20] M. M. Wilde, M. Tomamichel, and M. Berta, “Converse bounds for private communication over quantum channels,” IEEE Transactions on Information Theory, vol. 63, no. 3, pp. 1792–1817, March 2017.
  • [21] M. Mosonyi and T. Ogawa, “Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies,” Comm. Math. Phys., vol. 334, no. 3, pp. 1617–1648, March 2015.
  • [22] F. Hiai, M. Mosonyi, D. Petz, and C. Beny, “Quantum $f$-divergences and error correction,” Reviews in Mathematical Physics, vol. 23, no. 7, pp. 691–747, August 2011.
  • [23] D. Petz and M. B. Ruskai, “Contraction of generalized relative entropy under stochastic mappings on matrices,” Inf. Dim. Ana., Quantum Prob. and Related Topics, vol. 1, no. 1, pp. 83–89, January 1998.
  • [24] F. Hiai and M. Mosonyi, “Different quantum $f$-divergences and the reversibility of quantum operations,” Reviews in Mathematical Physics, vol. 29, no. 7, p. 1750023, August 2017.
  • [25] K. Matsumoto, “A new quantum version of $f$-divergence,” 2013, arXiv:1311.4722.
  • [26] F. Hansen and G. K. Pedersen, “Jensen’s operator inequality,” Bulletin London Math. Soc., vol. 35, no. 4, pp. 553–564, July 2003.
  • [27] R. Bhatia, Matrix Analysis. Springer, 1997.
  • [28] F. Hansen, “The fast track to Löwner’s theorem,” Linear Algebra and its Applications, vol. 438, no. 11, pp. 4557–4571, June 2013.
  • [29] W. F. Stinespring, “Positive functions on C*-algebras,” Proceedings of the American Mathematical Society, vol. 6, pp. 211–216, 1955.
  • [30] D. Petz, “Sufficient subalgebras and the relative entropy of states of a von Neumann algebra,” Communications in Mathematical Physics, vol. 105, no. 1, pp. 123–131, March 1986.
  • [31] ——, “Sufficiency of channels over von Neumann algebras,” Quarterly Journal of Mathematics, vol. 39, no. 1, pp. 97–108, 1988.
  • [32] O. Klein, “Zur Quantenmechanischen Begründung des zweiten Hauptsatzes der Wärmelehre,” Z. Physik, vol. 72, pp. 767–775, 1931.
  • [33] M. B. Ruskai, “Inequalities for quantum entropy: A review with conditions for equality,” Journal of Mathematical Physics, vol. 43, no. 9, pp. 4358–4375, September 2002.
  • [34] R. L. Frank and E. H. Lieb, “Monotonicity of a relative Rényi entropy,” J. Math. Phys., vol. 54, no. 12, p. 122201, December 2013.
  • [35] S. Beigi, “Sandwiched Rényi divergence satisfies data processing inequality,” J. Math. Phys., vol. 54, no. 12, p. 122202, December 2013.