
Fluctuation theorems from Bayesian retrodiction

Francesco Buscemi, Graduate School of Informatics, Nagoya University, Chikusa-ku, 464-8601 Nagoya, Japan

Valerio Scarani, Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore; Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117542, Singapore
Abstract

Quantitative studies of irreversibility in statistical mechanics often involve the consideration of a reverse process, whose definition has been the object of many discussions, in particular for quantum mechanical systems. Here we show that the reverse channel very naturally arises from Bayesian retrodiction, both in classical and quantum theories. Previous paradigmatic results, such as Jarzynski’s equality, Crooks’ fluctuation theorem, and Tasaki’s two-measurement fluctuation theorem for closed driven quantum systems, are all shown to be consistent with retrodictive arguments. Also, various corrections that were introduced to deal with nonequilibrium steady states or open quantum systems are justified on general grounds as remnants of Bayesian retrodiction. More generally, with the reverse process constructed on consistent logical inference, fluctuation relations acquire a much broader form and scope.

I Introduction

In modern statistical mechanics, it has become customary to capture irreversibility by a suitable comparison between a forward and a backward (or reverse) process. Such a comparison is formulated in terms of fluctuation relations. Initially limited to linear response, such relations have progressively been extended to encompass a much larger class of processes BK (77). After the widely noticed works of Jarzynski Jar (97) and Crooks Cro (98), the literature has grown at such a fast pace that we can only point the reader to some reviews on the matter Jar (11); CHT (11); Gaw (13); FUS (18).

Forward and backward processes include a prior (initial state), which can be chosen arbitrarily, and a transition rule, viz. a channel. The forward process is the “physical process,” i.e., the propagation of the prior through the physical channel: as such, its definition is unproblematic. The main focus of this paper is the construction of the reverse process. For some processes, the identification is clear: for instance, in the case of classical Hamiltonian dynamics, the reverse transition can be easily identified with the dynamics that generates the time-reversed trajectory. For some other processes, however, physical intuition may not be sufficient. Notably, when the transition rule is governed by an underlying quantum channel, there is currently a widespread belief (see e.g. ALMZ (13); RZ (14); FUS (18)) that Crooks’ fluctuation theorem and Jarzynski’s equality hold without modifications or other corrections only when the evolution is unitary (i.e., closed) or at least unital (i.e., preserving the uniform distribution). However, non-unital channels such as partial swaps ZSB+ (02); LMC+ (15); SSBE (17) or thermal processes HO (13) are much more obvious paradigms of thermodynamic irreversibility than unital channels. Besides, classical fluctuation theorems are known to hold also for processes that are surely not unital: now, quantum theory should be a generalisation of classical probability theory, not a restriction over it.

In this paper, we propose that the reverse transition can be systematically constructed as a form of Bayesian retrodiction Wat (55); Jef (65); Pea (88); Jay (03); CD (05); Jac (19). This retrodictive structure can be recognized in some of the most famous classical fluctuation relations BK (77); Jar (97); Cro (98); Jar (00); HS (01), even if their original derivations were based on different arguments. We then prove that, when the forward process is realised as a quantum process (preparation, evolution, and measurement), the Bayesian-reversed process is always realised as a valid quantum process as well. This process coincides with the “quantum retrodiction” independently defined in several works BPJ (00); Fuc (02); LS (13); FSB (20), and justifies on general grounds the use of Petz’s reverse map Pet (88) in the context of fluctuation relations. Our construction applies to any choice of preparation states, quantum channels, and final measurements, leading to exact fluctuation relations that do not require any modification.

The paper is structured as follows. In Section II, as a preparation, we show how fluctuation relations constitute information-theoretic divergence measures between the forward and the reverse process, and we derive a whole new family of fluctuation relations that originates from Csiszár’s $f$-divergences Csi (67). In Section III we introduce the theory of Bayesian retrodiction and derive the general expression for the classical Bayesian-reversed process. In Section IV, we consider processes with a quantum realization and show that the Bayesian-reversed process precisely coincides with the quantum retrodiction of the quantum forward process. In Section V, we recast several known classical fluctuation relations in terms of Bayesian retrodiction. Finally, in Section VI, we consider the most general case of processes arising from arbitrary quantum channels. In this case we show how our approach very naturally accounts for, and thus justifies as an implicit Bayesian inversion, all the corrections that were previously introduced to deal with situations such as nonequilibrium steady states and nonunital quantum channels.

II Generalized fluctuation relations from information divergences

II.1 Irreversibility as divergence between the forward and the reverse processes

As mentioned, studies of irreversibility involve a comparison between a forward ($F$) and a reverse ($R$) process Sei (05). Consider the evolution of a system from an initial time $t=0$ to a final time $t=\tau>0$. Let us assume, for simplicity, that the system possesses a finite state space $\mathcal{A}$. Suppose now that the forward and reverse processes are given to us in terms of two suitable joint probability distributions, $P_F(x,y)$ and $P_R(x,y)$, respectively, where $x\in\mathcal{A}$ labels the state of the system at time $t=0$, while $y\in\mathcal{A}$ labels the state of the system at time $t=\tau$. Concretely, $P_F(x,y)$ denotes the probability that the system starts in state $x$ at time $t=0$ and ends in state $y$ at time $t=\tau$ under the forward process, while $P_R(x,y)$ denotes the probability that the system starts in state $y$ at time $t=\tau$ and ends in state $x$ at time $t=0$ under the reverse process. Notice that at this point we are not yet concerned with the problem of determining what makes the two processes the “reverse” of one another. This will be the main issue in the rest of the paper, but for the time being we simply take $P_F(x,y)$ and $P_R(x,y)$ as given.

If $P_F(x,y)$ and $P_R(x,y)$ really are the reverse of one another, then irreversibility may be quantified in terms of how much the two distributions differ, in agreement with the idea that a reversible situation should correspond to the case in which $P_F(x,y)=P_R(x,y)$ for all $x,y\in\mathcal{A}$. In mathematical statistics, the degree of “disagreement” between two distributions is captured by information divergences. Here we focus on the family of $f$-divergences Csi (67), defined as [footnote 1]

$$
\begin{aligned}
D_f(P_F\|P_R) &:= \sum_{x,y} P_F(x,y)\, f\!\left(\frac{P_F(x,y)}{P_R(x,y)}\right) \\
&= \left\langle f\!\left(\frac{P_F(x,y)}{P_R(x,y)}\right)\right\rangle_F \;,
\end{aligned}
\tag{1}
$$

where $f:\mathbb{R}^{+}\to\mathbb{R}$ is a function to be specialized further in what follows. For the time being, it suffices to notice that for $f(r)=\ln r$ one recovers the Kullback-Leibler divergence KL (51), i.e., the usual relative entropy, whereas for $f(r)=r^{\alpha}$ one recovers the family of Hellinger-Rényi divergences LM (08). We will come back to these particular choices at the end of this section.

[Footnote 1: In fact, $f$-divergences are usually defined in terms of $1/r$, see e.g. LM (08). However, for the sake of the present discussion, formulas are more easily recognizable with the alternative definition.]
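As an illustration of definition (1), the following minimal sketch evaluates $D_f(P_F\|P_R)$ for two small joint distributions. The numerical values of `P_F` and `P_R`, the helper name `f_divergence`, and the choice $\alpha=1/2$ are assumptions made only for this example; the sketch also assumes that both distributions have the same support, so that the ratio is always well defined.

```python
import math

def f_divergence(p_forward, p_reverse, f):
    """D_f(P_F || P_R) = sum_{x,y} P_F(x,y) * f(P_F(x,y) / P_R(x,y)), as in Eq. (1)."""
    total = 0.0
    for pair, pf in p_forward.items():
        if pf == 0.0:
            continue  # pairs with zero forward weight do not contribute to the average
        total += pf * f(pf / p_reverse[pair])
    return total

# Hypothetical joint distributions over pairs (x, y) with x, y in {0, 1}.
P_F = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
P_R = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

kl_divergence = f_divergence(P_F, P_R, math.log)            # f(r) = ln r
power_average = f_divergence(P_F, P_R, lambda r: r ** 0.5)  # f(r) = r^alpha, alpha = 1/2
kl_reversible = f_divergence(P_F, P_F, math.log)            # P_F = P_R: reversible case
```

In the reversible case the ratio equals one for every pair, so the logarithmic choice returns zero, in line with the interpretation of (1) as a measure of irreversibility.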

The forward-reverse ratio will be henceforth denoted

$$
r(x,y) = \frac{P_F(x,y)}{P_R(x,y)}\,. \tag{2}
$$

A priori, one should pay attention to the pairs $(x,y)$ such that this ratio is not well-defined. If necessary, this problem can be dealt with on a case-by-case basis, but there is no need to burden the notation already at this point. Besides, common sense demands (and our later definition will vindicate it) that if a pair $(x,y)$ is assigned a null probability under the forward process, it should be so also under the reverse process, and vice versa. Thus, ultimately, $r(x,y)$ will be ill-defined only for pairs $(x,y)$ that don’t contribute to any average because $P_F(x,y)=P_R(x,y)=0$.

II.2 $f$-fluctuation relations

Eq. (1) suggests interpreting $f(r(x,y))$ itself as a random variable, whose realization is denoted $\omega_F(x,y)$ for the forward process. The same random variable, when evaluated for the reverse process, is denoted $\omega_R(x,y)$. For consistency, these two must be the same function, though evaluated on different arguments: more explicitly,

$$
\begin{aligned}
\omega_F(x,y) &:= f(r(x,y)) \\
&\;\;\Downarrow \\
\omega_R(x,y) &:= f\!\left(\frac{1}{r(x,y)}\right)\;,
\end{aligned}
\tag{3}
$$

simply due to the fact that when considering the reverse variable $\omega_R$, the roles of forward and reverse processes are exchanged and the ratio is thus inverted.

A Crooks-type fluctuation relation is a relation between the probability density functions $\mu$ of $\omega$ for the forward and the backward processes:

$$
\mu_F(\omega) = \sum_{x,y}\delta(\omega_F(x,y)-\omega)\,P_F(x,y) \tag{4}
$$

$$
\begin{aligned}
\mu_R(\omega) &= \sum_{x,y}\delta(\omega_R(x,y)-\omega)\,P_R(x,y) \\
&= \sum_{x,y}\delta(\omega_R(x,y)-\omega)\,\frac{1}{r(x,y)}\,P_F(x,y)\,.
\end{aligned}
\tag{5}
$$

If $f:\mathbb{R}^{+}\to\mathbb{R}$ is invertible, everything is well defined, and moreover there exists another function $g$ such that $f(1/r)=g(f(r))$, i.e., $\omega_R(x,y)=g(\omega_F(x,y))$. More explicitly, $g(u)=f\!\left(\frac{1}{f^{-1}(u)}\right)$. Plugging these definitions into (5), and using $1/r(x,y)=f^{-1}(\omega_R(x,y))$, we obtain

$$
\begin{aligned}
\mu_R(u) &= \sum_{x,y}\delta(\omega_R(x,y)-u)\,f^{-1}(\omega_R(x,y))\,P_F(x,y) \\
&= f^{-1}(u)\sum_{x,y}\delta(g(\omega_F(x,y))-u)\,P_F(x,y) \\
&= \frac{f^{-1}(u)}{|g'(g^{-1}(u))|}\sum_{x,y}\delta(\omega_F(x,y)-g^{-1}(u))\,P_F(x,y)
\end{aligned}
$$

and thus, with $u=g(\omega)$, we have a Crooks-type relation of the form

$$
\mu_R(g(\omega)) = \frac{f^{-1}(g(\omega))}{|g'(\omega)|}\,\mu_F(\omega)\,. \tag{6}
$$

Moreover, since $\int_{\mathbb{R}^{+}}\mu_R(g(\omega))\,|g'(\omega)|\,d\omega=\int_{\mathbb{R}^{+}}\mu_R(u)\,du=1$, there follows the corresponding Jarzynski-like relation

$$
\left\langle f^{-1}(g(\omega))\right\rangle_F = 1\,. \tag{7}
$$

The above relation can be verified by unraveling our notations: since $\omega$ here is $\omega_F$, we have $g(\omega)=\omega_R$, and $f^{-1}(\omega_R)=1/r$. Thus $\left\langle f^{-1}(g(\omega))\right\rangle_F=\left\langle 1/r\right\rangle_F=\sum_{x,y}\frac{1}{r(x,y)}P_F(x,y)=\sum_{x,y}P_R(x,y)=1$. Both this direct proof and the derivation from (6) show that, from our perspective, the Jarzynski-like relation (7) expresses the normalization of the reverse process.
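The direct proof above can be checked numerically. The sketch below computes the forward average of $f^{-1}(g(\omega_F))$ for two invertible choices of $f$. The distributions, the helper name `jarzynski_average`, and the parameter values `z` and `alpha` are assumptions chosen only for illustration.

```python
import math

# Hypothetical forward and reverse joint distributions on pairs (x, y).
P_F = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
P_R = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def jarzynski_average(p_forward, p_reverse, f, f_inverse):
    """Forward average of f^{-1}(g(omega_F)), where g(omega_F) = omega_R = f(1/r)."""
    total = 0.0
    for pair, pf in p_forward.items():
        r = pf / p_reverse[pair]        # forward-reverse ratio, Eq. (2)
        omega_reverse = f(1.0 / r)      # omega_R, Eq. (3)
        total += pf * f_inverse(omega_reverse)
    return total

z, alpha = 2.0, 0.5
log_case = jarzynski_average(
    P_F, P_R, lambda r: math.log(r) / z, lambda w: math.exp(z * w))
power_case = jarzynski_average(
    P_F, P_R, lambda r: r ** alpha, lambda w: w ** (1.0 / alpha))
```

Both averages reduce to $\sum_{x,y}P_R(x,y)$ and therefore equal one up to rounding, whatever invertible $f$ is used.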

II.3 Examples

The usual choice made in the literature is

$$
\omega=f(r)=\frac{1}{z}\ln r\;,
$$

with $z\neq 0$. In this case, we have $f^{-1}(\omega)=e^{z\omega}$, $f(1/r)=-\frac{1}{z}\ln r$, and thus $g(\omega)=-\omega$. The relations (6) and (7) become the familiar ones, namely,

$$
\mu_R(-\omega)=e^{-z\omega}\mu_F(\omega)\;, \tag{8}
$$

and

$$
\left\langle e^{-z\omega}\right\rangle_F=1\,.
$$

Notice that one could have imposed the condition $\omega_R(x,y)=-\omega_F(x,y)$ from the start, in a way somewhat reminiscent of an “arrow-of-time” variable. In our notation, this condition is equivalent to imposing that the function $f$ satisfies $f(1/r)=-f(r)$, which leads to $f(r)=\frac{1}{z}\ln r$ for an arbitrary $z$.
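For a finite state space the densities in (8) are sums of point masses, and the relation can be tested mass by mass. The following sketch does so for the logarithmic choice of $f$. The distributions are hypothetical and were picked so that two pairs share the same ratio. The helper names and the rounding of $\omega$ to nine decimals, used only to group equal values, are likewise assumptions of this example.

```python
import math
from collections import defaultdict

# Hypothetical distributions; the pairs (0, 0) and (0, 1) share the ratio r = 2.
P_F = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}
P_R = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.5}

def omega_distributions(p_forward, p_reverse, z=1.0, digits=9):
    """Point masses of omega_F = ln(r)/z under P_F and of omega_R = ln(1/r)/z under P_R."""
    mu_f, mu_r = defaultdict(float), defaultdict(float)
    for pair, pf in p_forward.items():
        pr = p_reverse[pair]
        r = pf / pr
        mu_f[round(math.log(r) / z, digits)] += pf
        mu_r[round(math.log(1.0 / r) / z, digits)] += pr
    return dict(mu_f), dict(mu_r)

def crooks_holds(mu_f, mu_r, z=1.0, digits=9, tol=1e-8):
    """Check mu_R(-omega) = exp(-z * omega) * mu_F(omega) for every observed omega."""
    return all(
        abs(mu_r.get(round(-omega, digits), 0.0) - math.exp(-z * omega) * mass) < tol
        for omega, mass in mu_f.items()
    )

mu_F, mu_R = omega_distributions(P_F, P_R)
relation_ok = crooks_holds(mu_F, mu_R)
```

Note that the check is done on probability masses. For choices of $f$ with $|g'(\omega)|\neq 1$, the Jacobian factor in (6) refers to densities and would need separate treatment.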

As a second example, consider

$$
\omega=f(r)=r^{\alpha}\;,
$$

with $\alpha\neq 0$. Then $f^{-1}(\omega)=\omega^{1/\alpha}$, $f(1/r)=r^{-\alpha}$, and therefore $g(\omega)=1/\omega$. The fluctuation relations (6) and (7) for random variables that satisfy $\omega_R=1/\omega_F$ are therefore

$$
\mu_R(1/\omega)=\omega^{2-1/\alpha}\mu_F(\omega) \;\implies\; \left\langle\omega^{-1/\alpha}\right\rangle_F=1\,.
$$

Other examples may look exotic, but are nonetheless possible: for instance, by choosing $\omega=f(r)=e^{\kappa r}$, with $\kappa\neq 0$, one obtains $\omega_R=e^{\kappa^{2}/\ln\omega_F}$, that is, $\ln\omega_R\,\ln\omega_F=\kappa^{2}$.

III Reverse process from Bayesian retrodiction

In the previous section we assumed that forward and reverse processes were both given. However, when only the forward process is given, the reverse process should be derived from it. In this section we explain how the reverse process can be derived from the forward process by applying the formalism of Bayesian retrodiction.

III.1 Basics of Bayesian update

We begin this section by reviewing the theory of Bayesian retrodiction. For simplicity, let us consider two random variables $X$ and $Y$ with states labeled by the indices $x$ and $y$, both taken from a finite set $\mathcal{A}$. Let $P_{XY}(x,y)$ denote their joint distribution. If one’s knowledge on $Y$ is updated to a definite value $y=y^{*}$, the Bayes–Laplace rule prescribes that agents update their belief on $X$ according to $P'_X(x)=P_{X|Y}(x|y^{*})$. In other words, the joint probability is updated as

$$
P'_{XY}(x,y) = P_{X|Y}(x|y)\,\delta_{y,y^{*}}\,. \tag{9}
$$

The above formula, which constitutes the standard formulation of the Bayes–Laplace rule, is silent however about those situations, most common in real scenarios, in which the update does not result in a definite value $y^{*}$, but is itself described in terms of another probability distribution $P'_Y(y)$. In the face of such “soft evidence”, the update should follow Jeffrey’s conditioning Jef (65)

$$
P'_{XY}(x,y) = P_{X|Y}(x|y)\,P'_Y(y)\,. \tag{10}
$$

Jeffrey based this update on his “rule of probability kinematics”. The coefficients $P_{X|Y}(x|y)$ are seen as defining a channel that propagates the soft belief acquired about $y$ back onto $x$. It was later noticed that Jeffrey’s update can also be obtained from the Bayes–Laplace rule using Pearl’s “method of virtual evidence” Pea (88); Jay (03); CD (05); Jac (19). From this viewpoint, the soft evidence on $Y$ is the consequence of definite evidence on a “virtual” variable $Z$ that has no direct influence on $X$ (i.e. $X\to Y\to Z$ forms a Markov chain). When $Z$ is updated to a definite value $z^{*}$, one updates the belief on $Y$ to $P'_Y(y)\equiv P_{Y|Z}(y|z)\,\delta_{z,z^{*}}$ and Eq. (10) is recovered.
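The two update rules (9) and (10) can be sketched in a few lines. The joint distribution, the soft evidence, and the helper names `jeffrey_update` and `marginal_x` below are hypothetical and serve only to illustrate the rules. Hard evidence is treated as the special case in which the new distribution on $Y$ is concentrated on a single value.

```python
def jeffrey_update(joint, soft_evidence):
    """Return P'_{XY}(x, y) = P_{X|Y}(x|y) * P'_Y(y), i.e. Jeffrey's conditioning, Eq. (10)."""
    marginal_y = {}
    for (x, y), p in joint.items():
        marginal_y[y] = marginal_y.get(y, 0.0) + p
    return {
        (x, y): (p / marginal_y[y]) * soft_evidence.get(y, 0.0)
        for (x, y), p in joint.items()
        if marginal_y[y] > 0.0
    }

def marginal_x(joint):
    """Updated belief on X, obtained by summing the joint distribution over y."""
    result = {}
    for (x, _), p in joint.items():
        result[x] = result.get(x, 0.0) + p
    return result

# Hypothetical joint distribution P_XY on pairs (x, y).
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

belief_soft = marginal_x(jeffrey_update(joint, {0: 0.9, 1: 0.1}))  # soft evidence on Y
belief_hard = marginal_x(jeffrey_update(joint, {1: 1.0}))          # hard evidence y* = 1, Eq. (9)
```

With hard evidence the updated belief on $X$ is simply the conditional $P_{X|Y}(x|y^{*})$, while soft evidence gives the corresponding mixture of conditionals.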

III.2 From update to retrodiction

The Bayesian update we just described becomes retrodiction when the variables $X$ and $Y$ are given a diachronic meaning: $X$ represents the system’s state at the initial time $t=0$, while $Y$ represents the system’s state at the final time $\tau>0$.

As already noticed, if we know the forward transition probability $P_{Y|X}(y|x)$, henceforth denoted as $\varphi(y|x)$, any prior knowledge $p(x)$ on $X$ can be forward-propagated using it. Hence, the forward process will be

$$
P_F(x,y) = p(x)\,\varphi(y|x) \tag{11}
$$

where $p(x)$ can be chosen arbitrarily. Now we want to define the reverse transition probability $\hat{\varphi}(x|y)$, with the same functionality: it should be possible to use it to back-propagate onto $X$ any prior knowledge $q(y)$ that one obtains about $Y$. In other words, the reverse process will be

$$
P_R(x,y) = q(y)\,\hat{\varphi}(x|y) \tag{12}
$$

where $q(y)$ can be chosen arbitrarily.

The reverse transition is obtained from Jeffrey’s conditioning (10), but the knowledge of the forward transition alone is not enough Wat (65). One needs to define a reference prior $P_X(x)$. In the context of irreversibility, a natural choice is to set the reference prior equal to a steady (viz. invariant) state $\gamma(x)$, that is, a distribution such that

$$
\gamma(y)=\sum_{x}\gamma(x)\,\varphi(y|x)\;,
$$

for all $y\in\mathcal{A}$. This choice coincides with what is customarily done in the theory of Markov chains when defining the reverse chain Nor (97). Notice that, while every transition matrix possesses at least one steady state, the steady state may not be unique: in such a case, to any choice of a steady state there will correspond a different reverse transition, hence a different fluctuation relation.

Starting from a steady state, the reverse transition $\hat{\varphi}(x|y)$ is defined by the relation $\gamma(y)\hat{\varphi}(x|y)=\gamma(x)\varphi(y|x)$. We see that $\gamma$ is an invariant distribution for the reverse process too (in other words: by choosing the steady state as prior, we define a reference process that does not distinguish between forward and reverse evolution). Implicitly restricting the analysis to the pairs $(x,y)$ with strictly positive weight (i.e., $\gamma(x)\varphi(y|x)>0$), we obtain

$$
\frac{\hat{\varphi}(x|y)}{\varphi(y|x)}=\frac{\gamma(x)}{\gamma(y)}\;. \tag{13}
$$

Plugging this together with Eqs. (11) and (12) into (2), we find

$$
r(x,y)=\frac{p(x)\,\gamma(y)}{q(y)\,\gamma(x)}\;, \tag{14}
$$

that is, the forward-reverse ratio $r(x,y)$, which is the crucial quantity in the study of irreversibility as noticed in Section II, depends on the stochastic transition $\varphi(y|x)$ only through its steady state $\gamma$.
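The construction of this subsection can be sketched numerically as follows. The transition matrix `phi`, the priors `p` and `q`, and the helper names are hypothetical. The steady state is obtained by plain power iteration, which is an assumption suited to this strictly positive example and not a general-purpose method (it may fail to converge for periodic chains).

```python
def steady_state(phi, n_iter=1000):
    """Approximate gamma with gamma(y) = sum_x gamma(x) * phi[x][y] by power iteration."""
    n = len(phi)
    gamma = [1.0 / n] * n
    for _ in range(n_iter):
        gamma = [sum(gamma[x] * phi[x][y] for x in range(n)) for y in range(n)]
    return gamma

def bayesian_reverse(phi, gamma):
    """phi_hat[y][x] = gamma(x) * phi(y|x) / gamma(y), i.e. Eq. (13)."""
    n = len(phi)
    return [[gamma[x] * phi[x][y] / gamma[y] for x in range(n)] for y in range(n)]

# Hypothetical forward transition, stored as phi[x][y] = phi(y|x); each row sums to one.
phi = [[0.1, 0.6, 0.3],
       [0.4, 0.2, 0.4],
       [0.5, 0.3, 0.2]]
gamma = steady_state(phi)
phi_hat = bayesian_reverse(phi, gamma)  # phi_hat[y][x] = phi_hat(x|y)

# Arbitrary (hypothetical) priors for the forward and the reverse process.
p = [0.7, 0.2, 0.1]
q = [0.2, 0.3, 0.5]
ratio = [[p[x] * phi[x][y] / (q[y] * phi_hat[y][x]) for y in range(3)] for x in range(3)]
ratio_eq14 = [[p[x] * gamma[y] / (q[y] * gamma[x]) for y in range(3)] for x in range(3)]
```

Here `ratio` is computed from definitions (2), (11) and (12), while `ratio_eq14` uses only the priors and the steady state; the two agree entry by entry, as stated by (14).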

IV Quantum inside: recovering quantum retrodiction

We now show that, when the channel is realised by a quantum process, the formalism of Jeffrey’s conditioning is automatically compatible with the formalism of quantum retrodiction BPJ (00); Fuc (02); LS (13); FSB (20).

Here quantum mechanics enters the picture by assuming that the stochastic transition $\varphi(y|x)$ involves an inner quantum “mechanism”: to each input $x\in\mathcal{A}$ is associated an input state (density matrix) $\rho_{0}^{x}$, which is later propagated to $\rho_{\tau}^{x}=\mathcal{E}(\rho_{0}^{x})$ via a completely positive trace-preserving (CPTP) linear map $\mathcal{E}$, and finally measured using a positive operator-valued measure (POVM) with outcomes $y\in\mathcal{A}$ and elements $\Pi_{\tau}^{y}$. The subscripts $0$ and $\tau$ are used to denote, respectively, an initial time $t=0$ and a final time $t=\tau>0$. With these notations,

$$
\varphi(y|x) = \operatorname{Tr}[\Pi_{\tau}^{y}\ \mathcal{E}(\rho_{0}^{x})]\,. \tag{15}
$$

In the above equation, the density matrix $\rho_{0}^{x}$ is meant to encode all the relevant degrees of freedom needed to represent the $x$-th experimental setup. Hence, it can account for processes in which, for example, there is no clear distinction between system and environment due to the presence of initial correlations Pec (94); Ali (95); SB (01); JSS (04); Bus (14); DSL (16). In all such cases, $\rho_{0}^{x}$ will include not only the degrees of freedom typically associated with the system, but also those associated with the environment.

The expression of $\hat{\varphi}$ in the quantum formalism is immediately obtained from (15). By introducing the state $\gamma_{0}=\sum_{x}\gamma(x)\rho^{x}_{0}$, which we assume invertible (otherwise we can restrict the analysis to the subspace where $\gamma_{0}>0$), one gets the much more evocative expression

$$
\hat{\varphi}(x|y) = \operatorname{Tr}[\Theta^{x}_{0}\ \hat{\mathcal{E}}(\sigma^{y}_{\tau})] \tag{16}
$$

which is the Born rule for the POVM elements

$$
\Theta^{x}_{0}:=\gamma(x)\,\frac{1}{\sqrt{\gamma_{0}}}\,\rho^{x}_{0}\,\frac{1}{\sqrt{\gamma_{0}}}\;, \tag{17}
$$

the normalized states

$$
\sigma^{y}_{\tau}:=\frac{1}{\gamma(y)}\sqrt{\mathcal{E}(\gamma_{0})}\ \Pi^{y}_{\tau}\ \sqrt{\mathcal{E}(\gamma_{0})}\;, \tag{18}
$$

and the reverse quantum channel Pet (88); BK (02); Cro (08)

$$
\hat{\mathcal{E}}(\cdot):=\sqrt{\gamma_{0}}\ \mathcal{E}^{\dagger}\!\left[\frac{1}{\sqrt{\mathcal{E}(\gamma_{0})}}(\cdot)\frac{1}{\sqrt{\mathcal{E}(\gamma_{0})}}\right]\ \sqrt{\gamma_{0}}\;, \tag{19}
$$

$\mathcal{E}^{\dagger}$ being the trace-dual of $\mathcal{E}$, defined by the relation $\operatorname{Tr}[\mathcal{E}^{\dagger}(X)\ Y]=\operatorname{Tr}[X\ \mathcal{E}(Y)]$ for all operators $X$ and $Y$. We assume $\mathcal{E}(\gamma_{0})>0$, so that $\hat{\mathcal{E}}$ is a CPTP linear map defined everywhere, and thus physically realizable [footnote 2].

[Footnote 2: Even if $\mathcal{E}(\gamma_{0})$ is not invertible, $\hat{\mathcal{E}}$ can always be extended to a CPTP map defined everywhere and, thus, physically realizable Wil (13).]

Just as Eq. (13) is the classical retrodiction for $\varphi(y|x)$, its quantum description given in Eq. (16) constitutes the quantum retrodiction of Eq. (15), in perfect agreement with previous literature on quantum retrodiction BPJ (00); Fuc (02); LS (13); FSB (20).
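For completeness, the agreement between (16) and (13) can be checked in a few lines. The following is a sketch that uses only definitions (15) and (17) to (19), the steady-state condition for $\gamma$, and the invertibility assumptions on $\gamma_{0}$ and $\mathcal{E}(\gamma_{0})$ stated in the text. Inserting (18) into (19), the factors $\mathcal{E}(\gamma_{0})^{\pm 1/2}$ cancel, and the cyclicity of the trace removes the factors $\gamma_{0}^{\pm 1/2}$:

$$
\begin{aligned}
\hat{\mathcal{E}}(\sigma^{y}_{\tau}) &= \frac{1}{\gamma(y)}\,\sqrt{\gamma_{0}}\ \mathcal{E}^{\dagger}(\Pi^{y}_{\tau})\ \sqrt{\gamma_{0}}\;, \\
\operatorname{Tr}[\Theta^{x}_{0}\ \hat{\mathcal{E}}(\sigma^{y}_{\tau})] &= \frac{\gamma(x)}{\gamma(y)}\,\operatorname{Tr}[\rho^{x}_{0}\ \mathcal{E}^{\dagger}(\Pi^{y}_{\tau})] = \frac{\gamma(x)}{\gamma(y)}\,\operatorname{Tr}[\Pi^{y}_{\tau}\ \mathcal{E}(\rho^{x}_{0})] = \frac{\gamma(x)}{\gamma(y)}\,\varphi(y|x)\;, \\
\sum_{x}\Theta^{x}_{0} &= \frac{1}{\sqrt{\gamma_{0}}}\Big(\sum_{x}\gamma(x)\rho^{x}_{0}\Big)\frac{1}{\sqrt{\gamma_{0}}} = \mathbb{1}\;, \\
\operatorname{Tr}[\sigma^{y}_{\tau}] &= \frac{1}{\gamma(y)}\,\operatorname{Tr}[\Pi^{y}_{\tau}\ \mathcal{E}(\gamma_{0})] = \frac{1}{\gamma(y)}\sum_{x}\gamma(x)\varphi(y|x) = 1\;.
\end{aligned}
$$

The second line reproduces (13), while the last two lines confirm that the operators $\Theta^{x}_{0}$ sum to the identity and that the states $\sigma^{y}_{\tau}$ have unit trace.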

V Classical Thermodynamics of Retrodiction

In this Section, we show how several important classical fluctuation relations can be recovered using the retrodictive approach.

V.1 Doubly-stochastic transitions, and classical Hamiltonian dynamics

We start by noticing that the condition

$$
\hat{\varphi}(x|y)=\varphi(y|x) \tag{20}
$$

holds if and only if the transition is doubly-stochastic, viz. if in addition to the compulsory normalisation $\sum_{y}\varphi(y|x)=1$, it also holds that $\sum_{x}\varphi(y|x)=1$. Indeed, comparing with (13), we see that (20) holds if and only if the steady state can be chosen as the uniform distribution $\gamma(x)\propto 1$. It is then trivial to check that doubly stochastic channels always admit such a steady state, while channels that are not doubly stochastic do not have a uniform steady state.

A particularly important case of doubly stochastic transition is classical Hamiltonian dynamics, which is deterministic, viz. there is a one-to-one correspondence between the initial state $x$ and the final state $y\equiv x'$. The evolution $x\to x'$ is then represented by the transition $\varphi(x'|x)$, with $\varphi(x'|x)=1$ if $x'$ is the final state corresponding to the initial state $x$, and $\varphi(x'|x)=0$ otherwise. Thus, for any classical Hamiltonian process one can choose $\gamma$ as uniform.
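The role of the uniform reference prior can be illustrated with two toy transitions. In the sketch below, the deterministic bijection `successor` stands in for a Hamiltonian evolution on three states and `phi_biased` is a transition that is not doubly stochastic; both, like the helper names, are assumptions made for this example.

```python
def uniform_reverse(phi):
    """Candidate reverse transition from Eq. (13) with the uniform prior gamma(x) = 1/n."""
    n = len(phi)
    gamma = [1.0 / n] * n
    return [[gamma[x] * phi[x][y] / gamma[y] for x in range(n)] for y in range(n)]

def is_stochastic(matrix, tol=1e-12):
    """True if every row of the matrix sums to one."""
    return all(abs(sum(row) - 1.0) < tol for row in matrix)

# Hypothetical deterministic bijection 0 -> 1 -> 2 -> 0, stored as phi[x][y] = phi(y|x).
successor = [1, 2, 0]
phi_perm = [[1.0 if y == successor[x] else 0.0 for y in range(3)] for x in range(3)]
phi_hat_perm = uniform_reverse(phi_perm)

# Hypothetical transition that is stochastic but not doubly stochastic.
phi_biased = [[0.9, 0.1],
              [0.5, 0.5]]
phi_hat_biased = uniform_reverse(phi_biased)
```

For the bijection, the reverse transition is the transposed matrix, as in (20). For the biased transition, the same recipe does not return a normalized transition, which reflects the fact that the uniform distribution is not a steady state there.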

For instance, the scenario considered by Bochkov and Kuzovlev BK (77) is of this type. They specify a class of driven Hamiltonians such that, on any given forward trajectory $x\rightarrow x'$, the non-driven term $H_{0}$ satisfies $H_{0}(x')=H_{0}(x)+E(x,x')$, with $E$ determined by the driving protocol. By choosing thermal priors $p(x)\propto e^{-\beta H_{0}(x)}$ and $q(x')\propto e^{-\beta H_{0}(x')}$, with $\beta=1/k_{B}T$, they obtain

$$
r(x,x') = \frac{\varphi(x'|x)\,p(x)}{\hat{\varphi}(x|x')\,q(x')}=\frac{p(x)}{q(x')}=e^{\beta[H_{0}(x')-H_{0}(x)]}=e^{\beta E(x,x')}\,. \tag{21}
$$

Since the process is deterministic, the statistics (and in particular the fluctuation relations) carry over from the initial conditions to the whole trajectory. From this observation, rich physical consequences follow: one can derive many-point relations, relations for the currents (e.g. Onsager), linear response results (in the limit when the driving forces go to zero), etc. BK (77). As we presented it, all this can be seen as starting with retrodiction.

Another example that fits in this subsection is that of a generic time-dependent Hamiltonian, but with microcanonical priors instead of the most frequently used thermal ones THM (08). Staying with a discrete alphabet again (the generalisation follows immediately), such priors read: $p(x)=N(E)^{-1}$ if $E_{x}=E$ (and zero otherwise) and $q(x')=N(E')^{-1}$ if $E_{x'}=E'$ (and zero otherwise), where $N(E)$ and $N(E')$ are the degeneracies of the two levels. Thus, for the processes with non-zero probability, the forward-reverse ratio reads

$$
\begin{aligned}
r(x,x') &= \frac{p(x)}{q(x')}=\frac{N(E')}{N(E)} \\
&= e^{(S(E')-S(E))/k_{B}}\;,
\end{aligned}
\tag{22}
$$

where $S(E):=k_{B}\ln N(E)$ coincides with Boltzmann’s entropy formula.

V.2 Classical Hamiltonian system-reservoir interactions

We now consider a system composed of two subsystems. For notational purposes, let us denote the microstates of the first system at initial and final time by the labels $x$ and $x'$, respectively; and those of the second system analogously by the labels $w$ and $w'$. For the purpose of the present example, all labels belong to discrete sets. If the joint evolution is Hamiltonian, we can borrow from the previous discussion and conclude that

$$
\hat{\varphi}(x,w|x',w')=\varphi(x',w'|x,w)\;. \tag{23}
$$

To construct the forward and reverse processes (11) and (12), one should specify the prior distributions $p(x,w)$ and $q(x',w')$.

This notation opens the possibility of coarse-graining over one of the subsystems. For definiteness, we describe the particular case studied in Jar (00). In this narrative, the second system consists of one or several heat reservoirs. It is rather natural therefore to stipulate that:

  1. Both priors are in product form, i.e., $p(x,w)=p(x)P(w)$ and $q(x',w')=q(x')Q(w')$;

  2. The reservoirs’ priors $P(w)$ and $Q(w')$ are thermal distributions at inverse temperature $\beta=1/(k_{B}T)$, i.e., $P(w)\propto e^{-\beta E_{w}}$ and $Q(w')\propto e^{-\beta E_{w'}}$, where $E_{w}$ and $E_{w'}$ are the energies of the reservoir’s microstates $w$ and $w'$. This condition implies

    $$
    \frac{P(w)}{Q(w')}=e^{\beta(E_{w'}-E_{w})}=:e^{\Delta S/k_{B}} \tag{24}
    $$

    where $\Delta S$ is the entropy generated.

With these two assumptions on the prior distributions, one can compute the following marginal conditional probability:

$$
\varphi(x',\Delta S|x)=\sum_{w,w':\,E_{w'}-E_{w}=T\Delta S}\varphi(x',w'|x,w)\,P(w)\;. \tag{25}
$$

The above represents the forward transition probability that the system, if starting in microstate $x$, will end up in microstate $x'$, generating in the process an amount of entropy equal to $\Delta S$. Notice that the coarse-grained transition $\varphi(x',\Delta S|x)$ computed in (25) only depends on the reservoir’s prior $P(w)$. In other words, while the reservoir’s prior distribution $P(w)$ is fixed, the system’s prior remains arbitrary.

Analogously, but starting from the retrodicted transition (23), we obtain

\begin{align}
\hat{\varphi}(x,-\Delta S|x') &= \sum_{w,w':\,E_{w}-E_{w'}=-T\Delta S}\hat{\varphi}(x,w|x',w')\,Q(w') \notag\\
&= \sum_{w,w':\,E_{w}-E_{w'}=-T\Delta S}\varphi(x',w'|x,w)\,Q(w') \tag{26}\\
&= \sum_{w,w':\,E_{w'}-E_{w}=T\Delta S}\varphi(x',w'|x,w)\,P(w)\,\frac{Q(w')}{P(w)} \notag\\
&= \sum_{w,w':\,E_{w'}-E_{w}=T\Delta S}\varphi(x',w'|x,w)\,P(w)\,e^{-\Delta S/k_{B}} \tag{27}\\
&= e^{-\Delta S/k_{B}}\,\varphi(x',\Delta S|x)\;, \notag
\end{align}

where in (26) we used relation (23), while in (27) we used relation (24). The above represents the reverse transition probability that the system, if starting in microstate $x'$, will end up in microstate $x$, generating in the process an amount of entropy equal to $-\Delta S$. Summarizing, we have recovered

$$
\frac{\varphi(x',\Delta S|x)}{\hat{\varphi}(x,-\Delta S|x')}=e^{\Delta S/k_{B}}\;, \tag{28}
$$

which is the main result of Jar (00). It is worth stressing that, as written, $\Delta S$ plays the role of an additional random variable coarse-graining the reservoir’s microstates: that is, $\varphi(x',\Delta S|x)$ describes a transition from input $x$ to output $(x',\Delta S)$. Thus, the object at the denominator in (28) is not the complete Bayesian reverse of the numerator, but rather a partial (viz. “hybrid” Jar (00)) reversal.
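The coarse-graining leading to (28) can be reproduced on a toy model. In the sketch below, the two-state system, the three reservoir levels, the bijection `step` standing in for the joint Hamiltonian evolution, and the units $k_{B}=1$, $\beta=1$ are all assumptions made for illustration.

```python
import math
from collections import defaultdict

beta = 1.0                           # inverse temperature, in units with k_B = 1
reservoir_energy = [0.0, 1.0, 2.0]   # hypothetical reservoir levels E_w
weights = [math.exp(-beta * e) for e in reservoir_energy]
thermal = [w / sum(weights) for w in weights]  # thermal prior, used as both P(w) and Q(w')

# Hypothetical deterministic bijection (x, w) -> (x', w') playing the role of the joint dynamics.
step = {
    (0, 0): (0, 1), (0, 1): (1, 0), (0, 2): (1, 1),
    (1, 0): (0, 0), (1, 1): (1, 2), (1, 2): (0, 2),
}
is_bijection = sorted(step.values()) == sorted(step)

forward = defaultdict(float)  # forward[(x, x', dS)] = phi(x', dS | x), Eq. (25)
reverse = defaultdict(float)  # reverse[(x, x', dS)] = phi_hat(x, -dS | x'), Eq. (26)
for (x, w), (x_new, w_new) in step.items():
    entropy = beta * (reservoir_energy[w_new] - reservoir_energy[w])  # Delta S / k_B, Eq. (24)
    forward[(x, x_new, entropy)] += thermal[w]
    reverse[(x, x_new, entropy)] += thermal[w_new]

ratios = {key: forward[key] / reverse[key] for key in forward}
```

Each entry of `ratios` equals $e^{\Delta S/k_{B}}$ for the corresponding value of $\Delta S$, in agreement with (28), irrespective of the system labels $x$ and $x'$.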

V.3 Retrodiction in stochastic thermodynamics

However desirable it would be to derive everything from the Hamiltonian dynamics of a closed (possibly composite) system, information about the latter is often lacking. This is when the approach based on Bayesian retrodiction really shows its power: it allows one to make inferences based on whatever partial information is available.

As a first example of this type, we consider the stochastic thermodynamics setting that led to Crooks’ theorem Cro (98). A general process in discrete-time stochastic thermodynamics is modeled as a sequence of external driving protocols (the work steps) alternating with periods during which the system is allowed to equilibrate with an ideal heat bath (the relaxation steps) Cro (98); SSBC (12). The changes in the system’s internal energy during each work step are counted as work done on the system, while the changes happening during each relaxation step are counted as heat absorbed by the system. For simplicity, we consider here just a two-step process: one work step followed by one relaxation step.

Let us denote the system’s initial state by xx, and let ExE_{x} be the system’s initial energy. The system is (deterministically) driven to another energy ExE^{\prime}_{x}. This constitutes the work step. The relaxation step, which is not deterministic, is modeled using a transition conditional probability φ(y|x)\varphi(y|x), where yy labels the system’s state after the relaxation step. Assuming that during the relaxation step the system’s Hamiltonian does not change, the system’s final energy, after the relaxation step is over, is EyE^{\prime}_{y}.

By definition, an invariant distribution for $\varphi(y|x)$ is the thermal distribution $\gamma(x)\propto e^{-\beta E'_x}$. This, via (13), leads to the retrodicted transition

$$
\begin{aligned}
\hat{\varphi}(x|y) &= \frac{\gamma(x)}{\gamma(y)}\,\varphi(y|x) \\
&= e^{\beta(E'_y-E'_x)}\,\varphi(y|x)\;.
\end{aligned}
$$

Choosing as priors the thermal distributions, that is, $p(x)=e^{\beta(F-E_x)}$ and $q(y)=e^{\beta(F'-E'_y)}$, with $F,F'$ the corresponding Helmholtz free energies and $\Delta F:=F'-F$, the ratio (14) becomes

$$
\begin{aligned}
r(x,y) &= \frac{\varphi(y|x)\,p(x)}{\hat{\varphi}(x|y)\,q(y)} \\
&= e^{-\beta(E'_y-E'_x)}\,e^{\beta(E'_y-E_x-\Delta F)} \\
&= e^{\beta(E'_x-E_x-\Delta F)} = e^{\beta(W-\Delta F)}\;,
\end{aligned}
$$

where in the last step we used the assumption that the system’s internal energy change during the work step, $E'_x-E_x$, is the work $W$ done on the system. The fluctuation relation (8) then immediately gives

$$
\frac{\mu_F(W)}{\mu_R(-W)} = e^{\beta(W-\Delta F)}\;,
$$

in accordance with Cro (98). Notice that in our retrodictive derivation we did not require any particular condition (such as the condition of detailed balance) for the relaxation step $\varphi(y|x)$, apart from it preserving the thermal distribution, which is implicit in the definition of “relaxation”.
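The derivation above can be checked numerically. The following is a minimal sketch, not taken from the original work: the energy values, the inverse temperature, and the way the relaxation kernel is built (from a joint distribution whose two marginals both equal $\gamma$, with an antisymmetric cyclic term added on purpose to break detailed balance) are all illustrative assumptions. It only uses the Python standard library.

```python
import math

def gibbs(energies, beta):
    """Return the Gibbs probabilities and the Helmholtz free energy."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights], -math.log(Z) / beta

beta = 1.3
E = [0.0, 0.7, 1.9]        # E_x: energies before the work step (illustrative)
E_prime = [0.4, 0.1, 2.5]  # E'_x: energies after the work step (illustrative)
n = len(E)

p, F = gibbs(E, beta)                  # predictive prior p(x)
gamma, F_prime = gibbs(E_prime, beta)  # invariant distribution of the relaxation
q = gamma                              # retrodictive prior q(y)
dF = F_prime - F

def cyc(y, x):
    """Antisymmetric cyclic pattern whose rows and columns sum to zero."""
    return int(y == (x + 1) % n) - int(x == (y + 1) % n)

# Joint J[y][x] with both marginals equal to gamma (an assumed construction);
# the cyclic term makes J asymmetric, so phi below violates detailed balance.
strength = 0.5 * min(gamma) ** 2
J = [[gamma[y] * gamma[x] + strength * cyc(y, x) for x in range(n)]
     for y in range(n)]
phi = [[J[y][x] / gamma[x] for x in range(n)] for y in range(n)]  # phi(y|x)

# Retrodicted transition, Eq. (13): phi_hat(x|y) = gamma(x) / gamma(y) * phi(y|x)
phi_hat = [[gamma[x] / gamma[y] * phi[y][x] for y in range(n)] for x in range(n)]

# Largest deviation of r(x, y) from exp(beta * (W - dF)), with W = E'_x - E_x
max_dev = max(
    abs(phi[y][x] * p[x] / (phi_hat[x][y] * q[y])
        - math.exp(beta * (E_prime[x] - E[x] - dF)))
    for x in range(n) for y in range(n)
)

# Forward average of exp(-beta * (W - dF)); Jarzynski's equality predicts 1
jarzynski = sum(
    phi[y][x] * p[x] * math.exp(-beta * (E_prime[x] - E[x] - dF))
    for x in range(n) for y in range(n)
)

print(max_dev, jarzynski)
```

Under these assumptions the deviation is at the level of floating-point rounding and the forward average equals 1 up to the same precision, even though the kernel does not satisfy detailed balance.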

VI Quantum Thermodynamics of Retrodiction

Our definition of the reverse quantum channel based on Bayesian inversion, presented in Section IV, accommodates any state preparation $\{\rho_0^x : x\in\mathcal{A}\}$, any quantum channel $\mathcal{E}$, and any final measurement $\{\Pi^y_\tau : y\in\mathcal{A}\}$. It hence contains, as a very special case, the conventional setup of Tasaki’s two-measurement thermodynamics of closed driven quantum systems Tas (00). But it also contains, and resolves, some quantum setups that have been considered problematic in the past due to the supposed lack of a well-defined reverse process (see e.g. ALMZ (13); RZ (14); FUS (18)). Where physical intuition fails to envisage a “natural” inversion of the physical process, consistent logical inference (viz. Bayesian inversion) comes to the rescue.

VI.1 Two-measurement setup for closed driven quantum systems

The paradigm for quantum fluctuation relations is provided by Tasaki’s two-measurement setup Tas (00). Here, a $d$-level quantum system is prepared in the state $\rho_0$ but immediately subjected to a von Neumann projective measurement of the initial Hamiltonian $H_0=\sum_x \epsilon_x|\epsilon_x\rangle\langle\epsilon_x|$. The system, now collapsed onto $|\epsilon_x\rangle$ after observation of $\epsilon_x$, next undergoes a perfectly adiabatic work protocol: its Hamiltonian is driven from $H_0$ to $H_\tau=\sum_y \eta_y|\eta_y\rangle\langle\eta_y|$, but the system is otherwise perfectly isolated from the surrounding environment. At the end of the driving protocol, during which only mechanical work has been exchanged with the system, the system is subjected to a second energy measurement, this time of the final Hamiltonian $H_\tau$.

Let us analyse Tasaki’s setup with our tools. Denoting the unitary evolution resulting from the driving protocol by $U_{0\to\tau}$, the forward (predictive) transition probability is given by

$$
\varphi(y|x)=\operatorname{Tr}\!\left[U_{0\to\tau}\,|\epsilon_x\rangle\langle\epsilon_x|\,U_{0\to\tau}^{\dagger}\;|\eta_y\rangle\langle\eta_y|\right]\;. \tag{29}
$$

It is easy to verify that $\varphi(y|x)$ is doubly stochastic (see Subsection V.1). This is a consequence of the following three facts: (i) that the underlying quantum process is unitary; (ii) that $\operatorname{Tr}[|\epsilon_x\rangle\langle\epsilon_x|]=\operatorname{Tr}[|\eta_y\rangle\langle\eta_y|]=1$; and (iii) that $\sum_x|\epsilon_x\rangle\langle\epsilon_x|=\sum_y|\eta_y\rangle\langle\eta_y|=\mathbb{1}$. As noted in Subsection V.1, the reverse (retrodictive) transition is given by

$$
\varphi(y|x)=\hat{\varphi}(x|y)\;. \tag{30}
$$

At the same time, the corresponding quantum retrodiction, given in Eqs. (17)-(19), is in perfect agreement with a narrative involving time-reversals, as in e.g. Sag (12): one first prepares the eigenstates of $H_\tau$, then evolves them “backwards in time” using $U_{0\to\tau}^{\dagger}$, and finally measures $H_0$. While such a narrative seems simple and appealing to intuition, it is not directly operational, as the actual implementation of time-reversals is far from straightforward QDS+ (19); GLM+ (20). Moreover, as we will see in what follows, when the evolution between the two energy measurements is not Hamiltonian, a naive argument using time-reversal can lead to inconsistencies.

In any case, with Eq. (30) at hand, fluctuation relations can be easily derived. One only needs to specify the two priors $p(x)$ and $q(y)$, which in the two-measurement setup is tantamount to specifying two quantum states $\rho_0$ and $\sigma_\tau$, since from them we have $p(x)=\langle\epsilon_x|\rho_0|\epsilon_x\rangle$ and $q(y)=\langle\eta_y|\sigma_\tau|\eta_y\rangle$. Simply by choosing $\rho_0$ as the thermal state for $H_0$ and $\sigma_\tau$ as the thermal state for $H_\tau$ (assuming the same temperature for both), with free energies $F_0$ and $F_\tau$ and $\Delta F:=F_\tau-F_0$, we find that

$$
\begin{aligned}
r(x,y) &= \frac{\varphi(y|x)\,p(x)}{\hat{\varphi}(x|y)\,q(y)} \\
&= \frac{p(x)}{q(y)} \\
&= e^{\beta(F_0-\epsilon_x)}\,e^{-\beta(F_\tau-\eta_y)} \\
&= e^{\beta(\eta_y-\epsilon_x-\Delta F)} \\
&= e^{\beta(W-\Delta F)}\;,
\end{aligned}
$$

where in the final step we identified the system’s energy difference $\eta_y-\epsilon_x$ as the work $W$ done on the system, due to the assumption of adiabaticity during the driving protocol. From here, fluctuation relations analogous to Crooks’ theorem and Jarzynski’s equality quickly follow.
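As an illustration of the two-measurement argument, the sketch below evaluates Eq. (29) for a single qubit and checks double stochasticity, the ratio $r(x,y)$, and the resulting Jarzynski-type equality. The energy levels, the inverse temperature, and the particular unitary (written directly through its matrix elements $\langle\eta_y|U_{0\to\tau}|\epsilon_x\rangle$) are hypothetical choices made only for this example.

```python
import cmath
import math

beta = 0.8
eps = [0.0, 1.0]   # eigenvalues of H_0 (illustrative)
eta = [0.3, 2.1]   # eigenvalues of H_tau (illustrative)

# Hypothetical driving unitary, stored as U[y][x] = <eta_y| U |eps_x>
theta, chi = 0.9, 0.4
c, s = math.cos(theta), math.sin(theta)
U = [[c, -cmath.exp(1j * chi) * s],
     [cmath.exp(-1j * chi) * s, c]]

# Forward transition, Eq. (29): phi(y|x) = |<eta_y| U |eps_x>|^2
phi = [[abs(U[y][x]) ** 2 for x in range(2)] for y in range(2)]

# Double stochasticity: both the sums over y and the sums over x equal 1
col_sums = [sum(phi[y][x] for y in range(2)) for x in range(2)]
row_sums = [sum(phi[y][x] for x in range(2)) for y in range(2)]

# Thermal priors for H_0 and H_tau at the same inverse temperature
Z0 = sum(math.exp(-beta * e) for e in eps)
Ztau = sum(math.exp(-beta * e) for e in eta)
p = [math.exp(-beta * e) / Z0 for e in eps]
q = [math.exp(-beta * e) / Ztau for e in eta]
dF = -(math.log(Ztau) - math.log(Z0)) / beta   # F_tau - F_0

# With Eq. (30) the ratio reduces to p(x)/q(y); compare with exp(beta (W - dF))
max_dev = max(
    abs(p[x] / q[y] - math.exp(beta * (eta[y] - eps[x] - dF)))
    for x in range(2) for y in range(2)
)

# Jarzynski-type equality: <exp(-beta W)>_F should equal exp(-beta dF)
lhs = sum(
    phi[y][x] * p[x] * math.exp(-beta * (eta[y] - eps[x]))
    for x in range(2) for y in range(2)
)
rhs = math.exp(-beta * dF)

print(col_sums, row_sums, max_dev, lhs, rhs)
```

The equality between `lhs` and `rhs` relies on the sums over $x$ being 1, which is exactly where double stochasticity enters.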

VI.2 Two-measurement setup for general quantum channels, and nonequilibrium potentials

Tasaki’s setup invites various generalizations. Several references ALMZ (13); Ras (13); RZ (14); GPM (15) have considered the situation in which the initial and final measurements are as in Tasaki’s arrangement but the unitary evolution is replaced by a general CPTP linear map $\mathcal{E}$, leading to the forward transition

$$
\varphi(y|x)=\operatorname{Tr}\!\left[\mathcal{E}(|\epsilon_x\rangle\langle\epsilon_x|)\;|\eta_y\rangle\langle\eta_y|\right]\,. \tag{31}
$$

If the linear map $\mathcal{E}$ is not unital, that is, if it does not preserve the unit matrix (viz. $\mathcal{E}(\mathbb{1})\neq\mathbb{1}$), then the above transition probability is not doubly stochastic in general. For the retrodictive transition, this means that $\hat{\varphi}(x|y)\neq\varphi(y|x)$, and indeed Eq. (13) contains the extra factor $\gamma(x)/\gamma(y)$ coming from the Bayes–Laplace rule.

Nonetheless, a common approach is to retain Eq. (30) and work with the non-normalized conditional distribution $\tilde{\varphi}(x|y):=\varphi(y|x)$ instead. Obviously, $\tilde{\varphi}(x|y)$ does not admit a realization as in (16), simply because it is not a well-formed conditional distribution. Even so, it is still possible to construct a “ratio”

$$
\tilde{r}(x,y)=\frac{\varphi(y|x)\,p(x)}{\tilde{\varphi}(x|y)\,q(y)}\equiv\frac{p(x)}{q(y)}
$$

and formally go through all the calculations as in the normalized case. However, because the reverse “process” is not properly normalized, the resulting Jarzynski-like relation does not average to 1 as one would like (and as happens in (7)), but to $\sum_{x,y}\tilde{\varphi}(x|y)q(y)$. This quantity is known as the efficacy ALMZ (13), but we recognize it here as a mathematical artifact arising from an ill-defined reverse transition.
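To make the origin of the efficacy concrete, the following sketch evaluates it for a two-level transition matrix that is stochastic but not doubly stochastic, of the kind a non-unital channel would produce in (31). The excitation and decay probabilities `a` and `b`, the energy levels, and the inverse temperature are hypothetical values chosen for illustration.

```python
import math

beta = 1.0
eps = [0.0, 1.0]   # initial energy levels (illustrative)
eta = [0.0, 1.0]   # final energy levels (illustrative)

# Hypothetical forward transition phi[y][x] = phi(y|x): excitation with
# probability a, decay with probability b. It is normalized over y for each x,
# but the sums over x differ from 1 whenever a != b.
a, b = 0.1, 0.6
phi = [[1 - a, b],
       [a, 1 - b]]

def gibbs(levels):
    """Gibbs probabilities at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in levels]
    Z = sum(weights)
    return [w / Z for w in weights]

p = gibbs(eps)   # predictive prior p(x)
q = gibbs(eta)   # retrodictive prior q(y)

# Sums over x of phi(y|x); these would all be 1 for a doubly stochastic matrix
row_sums = [sum(phi[y][x] for x in range(2)) for y in range(2)]

# Efficacy: sum over x, y of phi_tilde(x|y) q(y), with phi_tilde(x|y) := phi(y|x)
efficacy = sum(phi[y][x] * q[y] for x in range(2) for y in range(2))

# Forward average of 1 / r_tilde(x, y) = q(y) / p(x)
avg_inverse_ratio = sum(
    phi[y][x] * p[x] * (q[y] / p[x]) for x in range(2) for y in range(2)
)

print(row_sums, efficacy, avg_inverse_ratio)
```

In this example the forward average coincides with the efficacy and differs from 1; the deviation is carried entirely by the sums over $x$ of $\varphi(y|x)$, that is, by the failure of $\tilde{\varphi}(x|y)$ to be normalized.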

The formalism presented in this work allows a simple treatment of the two-measurement process also for arbitrary quantum channels. The forward transition (31) coincides with Eq. (15) for $\rho_0^x:=|\epsilon_x\rangle\langle\epsilon_x|$ and $\Pi^y_\tau:=|\eta_y\rangle\langle\eta_y|$. Let $\gamma(x)$ be an invariant distribution for $\varphi(y|x)$. Further, let predictive and retrodictive prior distributions be thermal distributions for some energy levels $\epsilon_x$ and $\eta_y$, so that $p(x)\propto e^{-\beta\epsilon_x}$ and $q(y)\propto e^{-\beta\eta_y}$. The corresponding initial and final Hamiltonians are defined as $\sum_x\epsilon_x|\epsilon_x\rangle\langle\epsilon_x|$ and $\sum_y\eta_y|\eta_y\rangle\langle\eta_y|$, respectively.

The ratio is defined as usual, that is,

$$
\begin{aligned}
r(x,y) &= \frac{\varphi(y|x)\,p(x)}{\hat{\varphi}(x|y)\,q(y)} = \frac{p(x)}{q(y)}\frac{\gamma(y)}{\gamma(x)} \\
&= e^{\beta\left(F_0-\epsilon_x-\frac{1}{\beta}\ln\gamma(x)\right)}\,e^{-\beta\left(F_\tau-\eta_y-\frac{1}{\beta}\ln\gamma(y)\right)} \\
&= e^{\beta(\Delta E-\Delta\Phi-\Delta F)}\;,
\end{aligned}
$$

where $\Delta E:=\eta_y-\epsilon_x$ and we now find an extra term $\Delta\Phi:=\Phi_y-\Phi_x:=\frac{1}{\beta}\ln\gamma(x)-\frac{1}{\beta}\ln\gamma(y)$. This term is understood as the difference of a nonequilibrium potential, which adds to the equilibrium free-energy difference $\Delta F:=F_\tau-F_0$. Finally, introducing a total stochastic entropy production $\Sigma:=\beta(\Delta E-\Delta\Phi-\Delta F)$, one easily obtains the detailed relation

$$
\frac{\mu_F(\Sigma)}{\mu_R(-\Sigma)}=e^{\Sigma}\;,
$$

and the corresponding integral relation $\left\langle e^{-\Sigma}\right\rangle_F=1$. In this way, we can see how a correct application of the Bayes–Laplace inversion formula automatically takes into account nonequilibrium potentials. These, usually introduced as corrections HS (01); MHP (15), are now recognizable as the remnants of Bayesian inversion.
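The integral relation can be verified numerically for a transition matrix that is not doubly stochastic. The sketch below is only illustrative: the 3-state matrix, the energy levels, and the inverse temperature are hypothetical, and the invariant distribution $\gamma$ is obtained by plain power iteration, which is an implementation choice made here rather than a prescription from the text.

```python
import math

beta = 1.0
eps = [0.0, 0.5, 1.2]   # initial energy levels (illustrative)
eta = [0.1, 0.9, 1.5]   # final energy levels (illustrative)
n = 3

# Hypothetical forward transition phi[y][x] = phi(y|x); each column (fixed x)
# sums to 1, while the sums over x are 1.1, 1.0 and 0.9.
phi = [[0.7, 0.3, 0.1],
       [0.2, 0.5, 0.3],
       [0.1, 0.2, 0.6]]

# Invariant distribution gamma by power iteration (assumed numerical method)
gamma = [1.0 / n] * n
for _ in range(500):
    gamma = [sum(phi[y][x] * gamma[x] for x in range(n)) for y in range(n)]
total = sum(gamma)
gamma = [g / total for g in gamma]

# Thermal priors and free energies
Z0 = sum(math.exp(-beta * e) for e in eps)
Ztau = sum(math.exp(-beta * e) for e in eta)
p = [math.exp(-beta * e) / Z0 for e in eps]
q = [math.exp(-beta * e) / Ztau for e in eta]
dF = -(math.log(Ztau) - math.log(Z0)) / beta   # F_tau - F_0

# Bayesian reverse, Eq. (13): phi_hat(x|y) = gamma(x) / gamma(y) * phi(y|x)
phi_hat = [[gamma[x] / gamma[y] * phi[y][x] for y in range(n)] for x in range(n)]
reverse_norms = [sum(phi_hat[x][y] for x in range(n)) for y in range(n)]

def entropy_production(x, y):
    """Sigma = beta * (dE - dPhi - dF) for the pair (x, y)."""
    dE = eta[y] - eps[x]
    dPhi = (math.log(gamma[x]) - math.log(gamma[y])) / beta
    return beta * (dE - dPhi - dF)

# r(x, y) should equal exp(Sigma) for every pair
max_dev = max(
    abs(phi[y][x] * p[x] / (phi_hat[x][y] * q[y])
        - math.exp(entropy_production(x, y)))
    for x in range(n) for y in range(n)
)

# Integral relation: the forward average of exp(-Sigma) should equal 1
integral = sum(
    phi[y][x] * p[x] * math.exp(-entropy_production(x, y))
    for x in range(n) for y in range(n)
)

print(reverse_norms, max_dev, integral)
```

Here the reverse transition is normalized for every $y$ and the average equals 1 to numerical precision, with no efficacy-type correction; the factor $\gamma(y)/\gamma(x)$ is what the nonequilibrium potential encodes.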

VII Conclusion

Studies of irreversibility rely on the comparison between a forward physical process and its reverse. We have proposed to define the latter as a form of Bayesian retrodiction (13). We showed that, on the one hand, this definition matches the one used in the derivation of canonical results like Jarzynski’s equality, Crooks’ theorem, and Tasaki’s two-measurement fluctuation theorem for closed driven quantum systems. On the other hand, it also applies to situations in which a reverse process was supposed to be lacking. As a by-product, various modifications, like non-unit efficacies or nonequilibrium potentials, are given a simple explanation as shadows of Bayes–Laplace inversion.

Logical inference thus emerges as a powerful tool to supplement or replace physical intuition, whenever this seems hard to obtain. Clearly, the present approach opens up the possibility for various developments in statistical mechanics and beyond. In particular, an important development is to prove quantum retrodiction as the logical foundation of a fully quantum fluctuation theorem, such as those posited in AMOP (16); Å (18); MHP (18); KK (19), but we leave this for future research.

Acknowledgments

The authors thank Paul Riechers for insightful comments. F.B. acknowledges support from the Japan Society for the Promotion of Science (JSPS) KAKENHI, Grant Nos. 19H04066 and 20K03746, and from MEXT Quantum Leap Flagship Program (MEXT Q-LEAP), Grant No. JPMXS0120319794. V.S. acknowledges support from the National Research Foundation and the Ministry of Education, Singapore, under the Research Centres of Excellence programme.

References

  • Å (18) Johan Åberg. Fully quantum fluctuation theorems. Phys. Rev. X, 8:011019, Feb 2018. doi:10.1103/PhysRevX.8.011019.
  • Ali (95) Robert Alicki. Comment on “Reduced dynamics need not be completely positive”. Phys. Rev. Lett., 75:3020–3020, Oct 1995. doi:10.1103/PhysRevLett.75.3020.
  • ALMZ (13) Tameem Albash, Daniel A. Lidar, Milad Marvian, and Paolo Zanardi. Fluctuation theorems for quantum processes. Phys. Rev. E, 88:032146, Sep 2013. doi:10.1103/PhysRevE.88.032146.
  • AMOP (16) Álvaro M. Alhambra, Lluis Masanes, Jonathan Oppenheim, and Christopher Perry. Fluctuating work: From quantum thermodynamical identities to a second law equality. Phys. Rev. X, 6:041017, Oct 2016. doi:10.1103/PhysRevX.6.041017.
  • BK (77) G. N. Bochkov and Y. E. Kuzovlev. General theory of thermal fluctuations in nonlinear systems. Sov. Phys. JETP, 45:125–130, 1977.
  • BK (02) H. Barnum and E. Knill. Reversing quantum dynamics with near-optimal quantum and classical fidelity. Journal of Mathematical Physics, 43(5):2097–2106, 2002. doi:10.1063/1.1459754.
  • BPJ (00) Stephen M. Barnett, David T. Pegg, and John Jeffers. Bayes’ theorem and quantum retrodiction. Journal of Modern Optics, 47(11):1779–1789, 2000. doi:10.1080/09500340008232431.
  • Bus (14) Francesco Buscemi. Complete positivity, markovianity, and the quantum data-processing inequality, in the presence of initial system-environment correlations. Phys. Rev. Lett., 113:140502, Oct 2014. doi:10.1103/PhysRevLett.113.140502.
  • CD (05) Hei Chan and Adnan Darwiche. On the revision of probabilistic beliefs using uncertain evidence. Artificial Intelligence, 163(1):67–90, 2005. doi:10.1016/j.artint.2004.09.005.
  • CHT (11) Michele Campisi, Peter Hänggi, and Peter Talkner. Colloquium: Quantum fluctuation relations: Foundations and applications. Rev. Mod. Phys., 83:771–791, Jul 2011. doi:10.1103/RevModPhys.83.771.
  • Cro (98) Gavin E. Crooks. Nonequilibrium Measurements of Free Energy Differences for Microscopically Reversible Markovian Systems. Journal of Statistical Physics, 90:1481–1487, 1998. doi:10.1023/A:1023208217925.
  • Cro (08) Gavin E. Crooks. Quantum operation time reversal. Phys. Rev. A, 77:034101, Mar 2008. doi:10.1103/PhysRevA.77.034101.
  • Csi (67) I. Csiszár. Information type measures of difference of probability distribution and indirect observations. Studia Scient. Math. Hungar., 2:299–381, 1967.
  • DSL (16) J.M. Dominy, A. Shabani, and D.A. Lidar. A general framework for complete positivity. Quantum Inf. Process., 15:465–494, 2016. doi:10.1007/s11128-015-1148-0.
  • FSB (20) Dov Fields, Abdelali Sajia, and János A. Bergou. Quantum retrodiction made fully symmetric, 2020. arXiv:2006.15692.
  • Fuc (02) Christopher A. Fuchs. Quantum mechanics as quantum information (and only a little more), 2002. arXiv:quant-ph/0205039.
  • FUS (18) Ken Funo, Masahito Ueda, and Takahiro Sagawa. Quantum Fluctuation Theorems, pages 249–273. Springer International Publishing, Cham, 2018. doi:10.1007/978-3-319-99046-0_10.
  • Gaw (13) Krzysztof Gawedzki. Fluctuation relations in stochastic thermodynamics, 2013. arXiv:1308.1518.
  • GLM+ (20) András Gilyén, Seth Lloyd, Iman Marvian, Yihui Quek, and Mark M. Wilde. Quantum algorithm for Petz recovery channels and pretty good measurements, 2020. arXiv:2006.16924.
  • GPM (15) J. Goold, M. Paternostro, and K. Modi. Nonequilibrium quantum Landauer principle. Phys. Rev. Lett., 114:060602, Feb 2015. doi:10.1103/PhysRevLett.114.060602.
  • HO (13) M. Horodecki and J. Oppenheim. Fundamental limitations for quantum and nanoscale thermodynamics. Nature Communications, 4:2059, 2013. doi:10.1038/ncomms3059.
  • HS (01) Takahiro Hatano and Shin-ichi Sasa. Steady-state thermodynamics of Langevin systems. Phys. Rev. Lett., 86:3463–3466, Apr 2001. doi:10.1103/PhysRevLett.86.3463.
  • Jac (19) Bart Jacobs. The mathematics of changing one’s mind, via Jeffrey’s or via Pearl’s update rule. Journal of Artificial Intelligence Research, 65:783–806, Aug 2019. doi:10.1613/jair.1.11349.
  • Jar (97) C. Jarzynski. Nonequilibrium equality for free energy differences. Phys. Rev. Lett., 78:2690–2693, Apr 1997. doi:10.1103/PhysRevLett.78.2690.
  • Jar (00) C. Jarzynski. Hamiltonian derivation of a detailed fluctuation theorem. Journal of Statistical Physics, 98:77–102, 2000. doi:10.1023/A:1018670721277.
  • Jar (11) Christopher Jarzynski. Equalities and inequalities: Irreversibility and the second law of thermodynamics at the nanoscale. Annual Review of Condensed Matter Physics, 2(1):329–351, 2011. doi:10.1146/annurev-conmatphys-062910-140506.
  • Jay (03) E. T. Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2003. doi:10.1017/CBO9780511790423.
  • Jef (65) R.C. Jeffrey. The logic of decision. McGraw-Hill, 1965.
  • JSS (04) Thomas F. Jordan, Anil Shaji, and E. C. G. Sudarshan. Dynamics of initially entangled open quantum systems. Phys. Rev. A, 70:052110, Nov 2004. doi:10.1103/PhysRevA.70.052110.
  • KK (19) Hyukjoon Kwon and M. S. Kim. Fluctuation theorems for a quantum channel. Phys. Rev. X, 9:031029, Aug 2019. doi:10.1103/PhysRevX.9.031029.
  • KL (51) S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statist., 22(1):79–86, 03 1951. doi:10.1214/aoms/1177729694.
  • LM (08) F. Liese and K.-J. Miescke. Statistical Decision Theory. Springer, 2008.
  • LMC+ (15) S. Lorenzo, R. McCloskey, F. Ciccarello, M. Paternostro, and G. M. Palma. Landauer’s principle in multipartite open quantum system dynamics. Phys. Rev. Lett., 115:120403, Sep 2015. doi:10.1103/PhysRevLett.115.120403.
  • LS (13) M. S. Leifer and Robert W. Spekkens. Towards a formulation of quantum theory as a causally neutral theory of Bayesian inference. Phys. Rev. A, 88:052130, Nov 2013. doi:10.1103/PhysRevA.88.052130.
  • MHP (15) Gonzalo Manzano, Jordan M. Horowitz, and Juan M. R. Parrondo. Nonequilibrium potential and fluctuation theorems for quantum maps. Phys. Rev. E, 92:032129, Sep 2015. doi:10.1103/PhysRevE.92.032129.
  • MHP (18) Gonzalo Manzano, Jordan M. Horowitz, and Juan M. R. Parrondo. Quantum fluctuation theorems for arbitrary environments: Adiabatic and nonadiabatic entropy production. Phys. Rev. X, 8:031037, Aug 2018. doi:10.1103/PhysRevX.8.031037.
  • Nor (97) J. R. Norris. Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1997. doi:10.1017/CBO9780511810633.
  • Pea (88) J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988.
  • Pec (94) Philip Pechukas. Reduced dynamics need not be completely positive. Phys. Rev. Lett., 73:1060–1062, Aug 1994. doi:10.1103/PhysRevLett.73.1060.
  • Pet (88) Denes Petz. Sufficiency of channels over von Neumann algebras. The Quarterly Journal of Mathematics, 39(1):97–108, 03 1988. doi:10.1093/qmath/39.1.97.
  • QDS+ (19) Marco Túlio Quintino, Qingxiuxiong Dong, Atsushi Shimbo, Akihito Soeda, and Mio Murao. Reversing unknown quantum transformations: Universal quantum circuit for inverting general unitary operations. Phys. Rev. Lett., 123:210502, Nov 2019. doi:10.1103/PhysRevLett.123.210502.
  • Ras (13) Alexey E Rastegin. Non-equilibrium equalities with unital quantum channels. Journal of Statistical Mechanics: Theory and Experiment, 2013(06):P06016, jun 2013. doi:10.1088/1742-5468/2013/06/p06016.
  • RZ (14) Alexey E. Rastegin and Karol Zyczkowski. Jarzynski equality for quantum stochastic maps. Phys. Rev. E, 89:012127, Jan 2014. doi:10.1103/PhysRevE.89.012127.
  • Sag (12) Takahiro Sagawa. Second law-like inequalities with quantum relative entropy: An introduction, 2012. arXiv:1202.0983.
  • SB (01) Peter Stelmachovic and Vladimir Bužek. Dynamics of open quantum systems initially entangled with environment: Beyond the Kraus representation. Phys. Rev. A, 64:062106, Nov 2001. doi:10.1103/PhysRevA.64.062106.
  • Sei (05) Udo Seifert. Entropy production along a stochastic trajectory and an integral fluctuation theorem. Phys. Rev. Lett., 95:040602, Jul 2005. doi:10.1103/PhysRevLett.95.040602.
  • SSBC (12) Susanne Still, David A. Sivak, Anthony J. Bell, and Gavin E. Crooks. Thermodynamics of prediction. Phys. Rev. Lett., 109:120604, Sep 2012. doi:10.1103/PhysRevLett.109.120604.
  • SSBE (17) Philipp Strasberg, Gernot Schaller, Tobias Brandes, and Massimiliano Esposito. Quantum and information thermodynamics: A unifying framework based on repeated interactions. Phys. Rev. X, 7:021003, Apr 2017. doi:10.1103/PhysRevX.7.021003.
  • Tas (00) Hal Tasaki. Jarzynski relations for quantum systems and some applications, 2000. arXiv:cond-mat/0009244.
  • THM (08) Peter Talkner, Peter Hänggi, and Manuel Morillo. Microcanonical quantum fluctuation theorems. Phys. Rev. E, 77:051131, May 2008. doi:10.1103/PhysRevE.77.051131.
  • Wat (55) Satosi Watanabe. Symmetry of physical laws. Part III. Prediction and retrodiction. Rev. Mod. Phys., 27:179–186, Apr 1955. doi:10.1103/RevModPhys.27.179.
  • Wat (65) Satosi Watanabe. Conditional probabilities in physics. Progr. Theor. Phys. Suppl., E65:135–160, Jan 1965. doi:10.1143/PTPS.E65.135.
  • Wil (13) Mark M. Wilde. Quantum Information Theory. Cambridge University Press, 2013. doi:10.1017/CBO9781139525343.
  • ZSB+ (02) M. Ziman, P. Stelmachovič, V. Bužek, M. Hillery, V. Scarani, and N. Gisin. Diluting quantum information: An analysis of information transfer in system-reservoir interactions. Phys. Rev. A, 65:042105, Mar 2002. doi:10.1103/PhysRevA.65.042105.