
Quantum Covariance and Filtering

John E. Gough, Aberystwyth University, Wales, SY23 3BZ, UK
Abstract

We give a tutorial exposition of the analogue of the filtering equation for quantum systems, focusing on the quantum probabilistic framework and developing the ideas from the classical theory. Quantum covariances and conditional expectations on von Neumann algebras play an essential part in the presentation.

Keywords: quantum probability, quantum filtering, quantum Markovian systems

1 Introduction

Nonlinear filtering theory is a well-developed field of engineering which is used to estimate unknown quantities in the presence of noise. One of the founders of the field was the Soviet mathematician Ruslan Stratonovich, who encouraged his student Viacheslav Belavkin to extend the problem to the quantum domain [1]. Classically, estimation works by measuring one or more variables which depend on the variables to be estimated, and Bayes' Theorem plays an essential role in inferring the unknown variables from what we measure. Belavkin's approach uses the theory of quantum stochastic calculus for continuous-in-time homodyne and photon counting measurements. There are several approaches: in the paper of Barchielli and Belavkin [2], the characteristic functional method is used to derive the photon-counting case, with the diffusive case obtained as an appropriate limit. Further details of the many approaches and applications may be found in the books by Barchielli and Gregoratti [3] and by Wiseman and Milburn [4].

However, the proof of Bayes' Theorem requires a joint probability distribution for the unknown variables and the measured ones. Once we go to quantum theory, we have to be very careful, as incompatible observables do not possess a joint probability distribution - in such cases, applying Bayes' Theorem leads to erroneous results and is the root of many of the paradoxes in the theory.

We will derive the simplest quantum filter. The filter equation itself was originally postulated by Gisin on the different grounds of continuous collapse of the wavefunction, but was subsequently given a standard filtering interpretation [5]. It also appeared as a way of simulating quantum open systems due to Carmichael [6] and Dalibard, Castin and Mølmer [7]: while this appears as a trick for simulating just the quantum master equation (the analogue of the Fokker-Planck equation) by stochastic processes, it is clear that the authors had in mind an underlying interpretation based on continual measurements. The discrete-time version of the filter also featured in the famous Paris Photon-Box experiment [8].

2 Quantum Probabilistic Setting

We start from the traditional formulation of quantum theory in terms of operators on a separable Hilbert space $\mathfrak{h}$. The norm of a linear operator $X$ is $\|X\|=\sup\{\|X\phi\|:\phi\in\mathfrak{h},\,\|\phi\|=1\}$, and the collection of bounded operators will be denoted by $B(\mathfrak{h})$. We will denote the identity operator by $\mathbb{1}$. The adjoint of $X\in B(\mathfrak{h})$ will be denoted by $X^{\ast}$.

Our interest will be in von Neumann algebras. These are unital $\ast$-algebras of bounded operators that are closed in the weak operator topology. Here we say that a sequence of operators $(X_{n})$ in $B(\mathfrak{h})$ converges weakly to $X$ if the matrix elements converge, that is, $\langle\phi,X_{n}\psi\rangle\to\langle\phi,X\psi\rangle$ for all $\phi,\psi\in\mathfrak{h}$.

A pair $(\mathfrak{A},\langle\cdot\rangle)$ consisting of a von Neumann algebra and a state is referred to as a quantum probability (QP) space [9].

Commutative = Classical

Kolmogorov's setting for classical probability is in terms of probability spaces $(\Omega,\mathcal{A},\mathbb{P})$ where $\Omega$ is a space of outcomes (the sample space), $\mathcal{A}$ is a $\sigma$-algebra of subsets of $\Omega$, and $\mathbb{P}$ is a probability measure on the elements of $\mathcal{A}$. The collection of functions $\mathfrak{A}=L^{\infty}(\Omega,\mathcal{A},\mathbb{P})$ forms a commutative von Neumann algebra and, moreover, a state is given by $\langle A\rangle=\int_{\Omega}A(\omega)\,\mathbb{P}[d\omega]$. (Conversely, every commutative von Neumann algebra with a state that is continuous in the normal topology, see below, is isomorphic to this framework.)
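To make the dictionary "commutative = classical" concrete, here is a minimal numerical sketch (our own illustration, with arbitrarily chosen numbers): on a finite sample space, bounded random variables become diagonal matrices, the probability measure becomes a diagonal density matrix $\varrho$, and the classical expectation coincides with $\mathrm{tr}\{\varrho X\}$.

```python
import numpy as np

# Finite sample space with three outcomes and probability measure P.
P = np.array([0.5, 0.3, 0.2])
A_values = np.array([1.0, -2.0, 4.0])   # a bounded random variable A

# Embed into operators: diagonal matrices form a commutative algebra,
# and the measure P becomes the density matrix rho.
A = np.diag(A_values)
rho = np.diag(P)

classical = np.sum(P * A_values)        # E[A] as an integral over Omega
quantum = np.trace(rho @ A)             # <A> = tr(rho A)
assert np.isclose(classical, quantum)   # the two states agree (0.7)
```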

Commutants

There is an alternative characterization of von Neumann algebras which, surprisingly, is purely algebraic. For a subset of operators $\mathfrak{A}$, we define its commutant in $B(\mathfrak{h})$ to be

$\mathfrak{A}^{\prime}=\{X\in B(\mathfrak{h}):[A,X]=0,\ \forall A\in\mathfrak{A}\}.$  (1)

The commutant of the commutant of $\mathfrak{A}$ is called the bicommutant and is denoted $\mathfrak{A}^{\prime\prime}$. Von Neumann's Bicommutant Theorem states that a collection of operators $\mathfrak{A}$ is a von Neumann algebra if and only if it is closed under taking adjoints and $\mathfrak{A}=\mathfrak{A}^{\prime\prime}$.

$B(\mathfrak{h})$ itself is a von Neumann algebra. If $\mathfrak{A}$ and $\mathfrak{B}$ are von Neumann algebras, then $\mathfrak{B}$ is said to be coarser than $\mathfrak{A}$ if $\mathfrak{B}\subset\mathfrak{A}$. A collection of operators $K$ generates the von Neumann algebra $\mathrm{vN}(K)=(K\cup K^{\ast})^{\prime\prime}$.

States

A state on a von Neumann algebra is a $\ast$-linear functional $\langle\cdot\rangle:\mathfrak{A}\to\mathbb{C}$ which is positive ($\langle X\rangle\geq 0$ whenever $X\geq 0$) and normalized ($\langle\mathbb{1}\rangle=1$). We will assume that the state is continuous in the normal topology, that is, $\sup_{n}\langle X_{n}\rangle=\langle\sup_{n}X_{n}\rangle$ for any increasing sequence $(X_{n})$ of positive elements of $\mathfrak{A}$. The main point of interest is that a normal state takes the form $\langle X\rangle=\mathrm{tr}\{\varrho X\}$ for some density matrix $\varrho$.

The state satisfies the Cauchy-Schwarz inequality $|\langle X^{\ast}Y\rangle|^{2}\leq\langle X^{\ast}X\rangle\,\langle Y^{\ast}Y\rangle$.

Morphisms between QP Spaces

A morphism $\phi:(\mathfrak{A}_{1},\langle\cdot\rangle_{1})\to(\mathfrak{A}_{2},\langle\cdot\rangle_{2})$ between QP spaces is a normal, completely positive, $\ast$-linear map which preserves the identity, $\phi(\mathbb{1}_{1})=\mathbb{1}_{2}$, and the probabilities, $\langle\phi(X)\rangle_{2}=\langle X\rangle_{1}$ for all $X\in\mathfrak{A}_{1}$. If a morphism is a homomorphism, that is, $\phi(X)\phi(Y)=\phi(XY)$ for all $X,Y\in\mathfrak{A}_{1}$, then we say that $\mathfrak{A}_{1}$ is embedded into $\mathfrak{A}_{2}$.

Tomita-Takesaki Theory

As operators do not necessarily commute, we may have $\langle X^{\ast}Y\rangle$ different from $\langle YX^{\ast}\rangle$. Nevertheless, it is possible to write

$\langle YX^{\ast}\rangle=\langle X^{\ast}\Delta Y\rangle,$  (2)

where $\Delta$ is a positive (possibly unbounded) operator known as the modular operator. This plays a central role in the Tomita-Takesaki theory of von Neumann algebras. A one-parameter group $\{\sigma_{t}:t\in\mathbb{R}\}$ of maps on $\mathfrak{A}$ is defined by $\sigma_{t}(X)=\Delta^{-it}X\Delta^{it}$ and is known as the modular group associated with the QP space $(\mathfrak{A},\langle\cdot\rangle)$.

Theorem 1 (Takesaki, [10])

Let $(\mathfrak{A},\langle\cdot\rangle)$ be a QP space and let $\mathfrak{B}$ be a von Neumann subalgebra of $\mathfrak{A}$. There exists a morphism $\mathfrak{E}$ from $\mathfrak{A}$ down to $\mathfrak{B}$ which is projective ($\mathfrak{E}\circ\mathfrak{E}=\mathfrak{E}$) if and only if $\mathfrak{B}$ is invariant under the modular group of $(\mathfrak{A},\langle\cdot\rangle)$.

2.1 Quantum Conditioning

We fix a QP space $(\mathfrak{A},\langle\cdot\rangle)$, and define the covariance of two elements $X,Y\in\mathfrak{A}$ to be

$\mathrm{Cov}(X,Y)\triangleq\langle X^{\ast}Y\rangle-\langle X\rangle^{\ast}\langle Y\rangle.$  (3)

Likewise, the variance is defined as $\mathrm{Var}(X)\triangleq\mathrm{Cov}(X,X)$.

The idea is that we have a subset $\mathfrak{B}\subset\mathfrak{A}$, and we want to associate an element $\mathfrak{E}[A]\in\mathfrak{B}$ with each $A\in\mathfrak{A}$, see Figure 1. As $\mathfrak{B}$ is smaller than $\mathfrak{A}$, we think of $\mathfrak{E}[A]$ as a coarse-grained version of $A$ based on less information. The map $\mathfrak{E}$ therefore compresses the model $(\mathfrak{A},\langle\cdot\rangle)$ into a coarser one on $\mathfrak{B}$: we would like to do this in a way that preserves averages.

Figure 1: A conditional expectation $\mathfrak{E}$ is a projection from an algebra $\mathfrak{A}$ of random objects down into a smaller algebra $\mathfrak{B}$ such that $\langle\mathfrak{E}[A]\rangle=\langle A\rangle$.

We now list some desirable features for $\mathfrak{E}$, familiar from the classical case: for any $X,Y,A\in\mathfrak{A}$, $\alpha,\beta\in\mathbb{C}$ and $B_{1},B_{2}\in\mathfrak{B}$,

(CE1) linearity: $\mathfrak{E}[\alpha X+\beta Y]=\alpha\,\mathfrak{E}[X]+\beta\,\mathfrak{E}[Y]$;

(CE2) $\ast$-map: $\mathfrak{E}[X^{\ast}]=\mathfrak{E}[X]^{\ast}$;

(CE3) conservativity: $\mathfrak{E}[\mathbb{1}]=\mathbb{1}$;

(CE4) compatibility: $\langle\mathfrak{E}[A]\rangle=\langle A\rangle$;

(CE5) projectivity: $\mathfrak{E}[\mathfrak{E}[A]]=\mathfrak{E}[A]$;

(CE6) peelability: $\mathfrak{E}[B_{1}AB_{2}]=B_{1}\,\mathfrak{E}[A]\,B_{2}$;

(CE7) positivity: $\mathfrak{E}[A]\geq 0$ whenever $A\geq 0$.

We call property (CE6) "peelability" for lack of a better name, and we emphasize that the order of the operators is important. Property (CE7) is known to be insufficient to deal with quantum theory and must be strengthened as follows:

(CE7$^{\prime}$) complete positivity: for each integer $n\geq 1$,

$\left[\begin{array}{ccc}\mathfrak{E}[A_{11}]&\cdots&\mathfrak{E}[A_{1n}]\\ \vdots&\ddots&\vdots\\ \mathfrak{E}[A_{n1}]&\cdots&\mathfrak{E}[A_{nn}]\end{array}\right]\geq 0\quad\text{whenever}\quad\left[\begin{array}{ccc}A_{11}&\cdots&A_{1n}\\ \vdots&\ddots&\vdots\\ A_{n1}&\cdots&A_{nn}\end{array}\right]\geq 0.$  (10)
Definition 2

Let $\mathfrak{A}$ and $\mathfrak{B}$ be unital $\ast$-algebras with $\mathfrak{B}$ a subalgebra of $\mathfrak{A}$; then a mapping $\mathfrak{E}:\mathfrak{A}\to\mathfrak{B}$ satisfying properties (CE1)-(CE6) and (CE7$^{\prime}$) is a quantum conditional expectation.

Proposition 3

A quantum conditional expectation $\mathfrak{E}$ acts as the identity map on $\mathfrak{B}$.

Proof. Set $A=B_{1}=\mathbb{1}$ and $B_{2}=B\in\mathfrak{B}$; then peelability implies that $\mathfrak{E}[B]=\mathfrak{E}[\mathbb{1}]\,B$. The result then follows from conservativity.

Existence

We observe that the conditional expectation always exists in the classical world. Here $\mathfrak{A}$ can be identified with some $L^{\infty}(\Omega,\mathcal{A},\mathbb{P})$, and the subalgebra $\mathfrak{B}$ will then take the form $L^{\infty}(\Omega,\mathcal{B},\mathbb{P})$ where $\mathcal{B}$ is a coarser $\sigma$-algebra. The conditional expectation is then well defined: for $A\in L^{\infty}(\Omega,\mathcal{A},\mathbb{P})$ one sets $\mu_{A}[I]=\int_{I}A(\omega)\,\mathbb{P}[d\omega]$ for each $I\in\mathcal{B}$; then $\mu_{A}$ is absolutely continuous with respect to $\mathbb{P}|_{\mathcal{B}}$, and its Radon-Nikodym derivative is the conditional expectation, which we denote as $\mathbb{E}[A|\mathcal{B}]$. This is explicit in Kolmogorov's original paper.

In contrast, quantum conditional expectations need not exist. By definition, they satisfy the requirements of the Takesaki Theorem above (and additionally the peelability condition), so we require the invariance of the subalgebra $\mathfrak{B}$ under the modular group.
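When the state is tracial, however, a conditional expectation onto the algebra of operators commuting with a family of orthogonal projections $\{P_{a}\}$ (with $\sum_{a}P_{a}=\mathbb{1}$) always exists: it is given by $\mathfrak{E}[A]=\sum_{a}P_{a}AP_{a}$. The following sketch (our own illustration, with arbitrarily chosen dimension and projections) checks conservativity, compatibility, projectivity and peelability numerically for this map.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
P0 = np.diag([1.0, 1.0, 0.0, 0.0])       # orthogonal projections with
P1 = np.eye(d) - P0                       # P0 + P1 = identity
projs = [P0, P1]

def E(A):
    # Conditional expectation onto operators commuting with P0, P1.
    return sum(P @ A @ P for P in projs)

def rand_op():
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

A = rand_op()
B1, B2 = E(rand_op()), E(rand_op())       # elements of the subalgebra
tau = lambda X: np.trace(X) / d           # the tracial state

assert np.allclose(E(np.eye(d)), np.eye(d))          # (CE3)
assert np.isclose(tau(E(A)), tau(A))                 # (CE4)
assert np.allclose(E(E(A)), E(A))                    # (CE5)
assert np.allclose(E(B1 @ A @ B2), B1 @ E(A) @ B2)   # (CE6)
```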

2.2 Quantum Covariance

Definition 4

Let $\mathfrak{E}$ be a quantum conditional expectation from $\mathfrak{A}$ onto a subalgebra $\mathfrak{B}$. For each $A\in\mathfrak{A}$, we define $\delta A\triangleq A-\mathfrak{E}[A]$. The conditional covariance of $X,Y\in\mathfrak{A}$ is defined to be

$\mathrm{Cov}_{\mathfrak{B}}(X,Y)\triangleq\mathfrak{E}[\delta X^{\ast}\,\delta Y].$  (11)

The conditional variance is

$\mathrm{Var}_{\mathfrak{B}}(X)\triangleq\mathrm{Cov}_{\mathfrak{B}}(X,X).$  (12)

Note that

$\mathfrak{E}[\delta A]=0\text{ and }\langle\delta A\rangle=0,$  (13)

for every $A\in\mathfrak{A}$. It is worth emphasizing that the conditional covariance defined here is an operator in $\mathfrak{B}$, not a scalar.

Lemma 5

We have $\mathfrak{E}[B_{1}\,\delta A\,B_{2}]=0$ whenever $A\in\mathfrak{A}$ and $B_{1},B_{2}\in\mathfrak{B}$. In particular, $\mathfrak{E}[B\,\delta A]=0$ whenever $A\in\mathfrak{A}$ and $B\in\mathfrak{B}$.

The proof depends crucially on peelability: $\mathfrak{E}[B_{1}\,\delta A\,B_{2}]=B_{1}\,\mathfrak{E}[\delta A]\,B_{2}=0$. The following result is trivial classically, but again requires peelability in the non-commutative setting.

Proposition 6

The conditional covariance may alternatively be written as

$\mathrm{Cov}_{\mathfrak{B}}(X,Y)=\mathfrak{E}[X^{\ast}Y]-\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y].$  (14)

Proof. From Lemma 5 we have

$\mathrm{Cov}_{\mathfrak{B}}(X,Y)=\mathfrak{E}\big[X^{\ast}Y-\mathfrak{E}[X]^{\ast}Y-X^{\ast}\mathfrak{E}[Y]+\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y]\big]=\mathfrak{E}[X^{\ast}Y]-\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y]-\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y]+\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y],$

and the result follows.

Proposition 7

The conditional covariance has the invariance property

$\mathrm{Cov}_{\mathfrak{B}}(X+B_{1},Y+B_{2})=\mathrm{Cov}_{\mathfrak{B}}(X,Y),$  (15)

for all $B_{1},B_{2}\in\mathfrak{B}$.

Proof. From $\ast$-linearity and (14), we see that the left-hand side of (15) equals

$\mathfrak{E}\big[X^{\ast}Y+X^{\ast}B_{2}+B_{1}^{\ast}Y+B_{1}^{\ast}B_{2}\big]-\big(\mathfrak{E}[X]+B_{1}\big)^{\ast}\big(\mathfrak{E}[Y]+B_{2}\big),$

and the result follows using peelability.

Lemma 8

The covariance and conditional covariance are related by

$\mathrm{Cov}(X,Y)=\langle\mathrm{Cov}_{\mathfrak{B}}(X,Y)\rangle+\big\langle(\mathfrak{E}[X]-\langle X\rangle)^{\ast}(\mathfrak{E}[Y]-\langle Y\rangle)\big\rangle.$  (16)

Proof. This follows from repeated application of the compatibility property:

$\langle\mathrm{Cov}_{\mathfrak{B}}(X,Y)\rangle=\langle X^{\ast}Y\rangle-\langle\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y]\rangle=\langle X^{\ast}Y\rangle-\langle X\rangle^{\ast}\langle Y\rangle-\big(\langle\mathfrak{E}[X]^{\ast}\mathfrak{E}[Y]\rangle-\langle X\rangle^{\ast}\langle Y\rangle\big),$

which is readily rearranged to give the result.

As a consequence, we have

$\mathrm{Var}(X)=\langle\mathrm{Var}_{\mathfrak{B}}(X)\rangle+\big\langle(\mathfrak{E}[X]-\langle X\rangle)^{\ast}(\mathfrak{E}[X]-\langle X\rangle)\big\rangle.$  (17)

2.3 Least Squares Property

Proposition 9

The conditional expectation has the least squares property; that is, $\mathfrak{E}[(X-B)^{\ast}(X-B)]$ is minimized over $B\in\mathfrak{B}$ by $B=\mathfrak{E}[X]$.

Proof. Let $B\in\mathfrak{B}$ and set $B^{\prime}=B-\mathfrak{E}[X]$, which is again in $\mathfrak{B}$. Then

$\mathfrak{E}[(X-B)^{\ast}(X-B)]=\mathfrak{E}[(\delta X-B^{\prime})^{\ast}(\delta X-B^{\prime})]=\mathfrak{E}[\delta X^{\ast}\,\delta X]-\mathfrak{E}[B^{\prime\ast}\,\delta X]-\mathfrak{E}[\delta X^{\ast}\,B^{\prime}]+B^{\prime\ast}B^{\prime}=\mathrm{Var}_{\mathfrak{B}}(X)+B^{\prime\ast}B^{\prime}\geq\mathrm{Var}_{\mathfrak{B}}(X),$

where the middle terms vanish by Lemma 5 and we use the positivity property. The minimum is attained at $B^{\prime}=0$, that is, at $B=\mathfrak{E}[X]$.

Corollary 10

The variance $\langle(X-B)^{\ast}(X-B)\rangle$ is also minimized over $B\in\mathfrak{B}$ by $B=\mathfrak{E}[X]$.

Proof. Using the same notation as in the proof of Proposition 9, we have

$\langle(X-B)^{\ast}(X-B)\rangle=\langle(\delta X-B^{\prime})^{\ast}(\delta X-B^{\prime})\rangle=\langle\delta X^{\ast}\delta X\rangle-\langle B^{\prime\ast}\delta X\rangle-\langle\delta X^{\ast}B^{\prime}\rangle+\langle B^{\prime\ast}B^{\prime}\rangle=\langle\delta X^{\ast}\delta X\rangle+\langle B^{\prime\ast}B^{\prime}\rangle,$

since $\langle B^{\prime\ast}\delta X\rangle=\langle\delta X^{\ast}B^{\prime}\rangle=0$ by Lemma 5. Therefore $\langle(X-B)^{\ast}(X-B)\rangle$ is also minimized over $B\in\mathfrak{B}$ by $B=\mathfrak{E}[X]$.

3 Classical Filtering

In this section we recall in detail Kolmogorov's theory of probability. In the process, we will see the commutative analogues that motivated our more general definitions in Section 2.

3.1 Kolmogorov’s Theory

Kolmogorov's axiomatic formulation of probability theory is based on the mathematical formalism of measure theory. The main concept is that of a probability space. This is a triple $(\Omega,\mathcal{F},\mathbb{P})$ where:

1. $\Omega$, called the sample space, is the collection of all possible outcomes (typically a topological space);

2. $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$, the elements of which are known as events;

3. $\mathbb{P}$ is a probability measure on $\mathcal{F}$.

In detail, $\mathcal{F}$ forms a $\sigma$-algebra if it contains the empty set $\emptyset$, is closed under complementation (that is, if $A\in\mathcal{F}$ then so too is its complement $A^{\prime}=\{\omega\in\Omega:\omega\notin A\}$), and, whenever $\{A_{n}\}$ is a countable collection of events in $\mathcal{F}$, both their intersection $\cap_{n}A_{n}$ and union $\cup_{n}A_{n}$ are in $\mathcal{F}$. Note that $\Omega$ is an event since it is the complement of the empty set.

A probability measure $\mathbb{P}$ on $\mathcal{F}$ is an assignment of a probability $\mathbb{P}[A]\geq 0$ to each event $A\in\mathcal{F}$ with the rules that $\mathbb{P}[\Omega]=1$ and $\mathbb{P}[\cup_{n}A_{n}]=\sum_{n}\mathbb{P}[A_{n}]$ for any countable collection of events $\{A_{n}\}$ that are non-overlapping (i.e., $A_{n}\cap A_{m}=\emptyset$ if $n\neq m$).

The pair $(\Omega,\mathcal{F})$ comprises a measurable space, in other words, a space where we are able to assign measures of size to selected subsets in a consistent manner: this is the branch of mathematics known as measure theory, which was set up to resolve the pathological problems that arise when one tries to assign a measure to all subsets. It follows that probability theory is formally just a special case of measure theory where the measure $\mathbb{P}$ has maximum value $\mathbb{P}[\Omega]=1$.

More exactly, the setting is measure theory, but probability theory brings its own additional concepts with it. An example is conditional probability: the probability of event $A$ given that $B$ has occurred is defined by

$\mathbb{P}[A|B]=\frac{\mathbb{P}[A\cap B]}{\mathbb{P}[B]},$  (18)

which is the joint probability $\mathbb{P}[A\cap B]$ for both $A$ and $B$ to occur, divided by the marginal probability $\mathbb{P}[B]$.

The choice of $\mathcal{F}$ in a given problem is part of the modeling process. Essentially, we have to ask which events we want to assign a probability to. Let $\mathcal{G}$ be a $\sigma$-algebra contained in $\mathcal{F}$ (that is, every event in $\mathcal{G}$ is also an event in $\mathcal{F}$); then we say that $\mathcal{G}$ is coarser, or smaller, than $\mathcal{F}$. The probability space $(\Omega,\mathcal{G},\mathbb{Q})$ is then a coarse-graining of $(\Omega,\mathcal{F},\mathbb{P})$, where we take $\mathbb{Q}$ to be the restriction of $\mathbb{P}$ to the smaller $\sigma$-algebra $\mathcal{G}$.

Just as we do not consider all subsets of $\Omega$, we do not consider all functions on $\Omega$ either. Let $X:\Omega\to\mathbb{R}$; we say $X$ is measurable with respect to a $\sigma$-algebra $\mathcal{F}$ if the sets

$X^{-1}[I]\triangleq\{\omega\in\Omega:X(\omega)\in I\}$  (19)

belong to $\mathcal{F}$ for each interval $I$. A measurable function $X$ on a probability space is called a random variable, and the probability that it takes a value in the interval $I$, denoted $\mathrm{Prob}\{X\in I\}$, is just the value $\mathbb{P}$ assigns to the event $X^{-1}[I]$. We will use the term random vector for a vector-valued function whose components are all random variables.

Let $X_{1},\cdots,X_{n}$ be random variables; then there is a coarsest $\sigma$-algebra which contains all the events of the form $X_{j}^{-1}[I]$ for all $j$ and all intervals $I$: we refer to this as the $\sigma$-algebra generated by the random variables.

The correct way to think of an ensemble is as a probability space where $\Omega$ is the collection $\Gamma$ of all possible microstates, $\mathcal{F}$ is some suitable $\sigma$-algebra, and $\mathbb{P}$ is a suitable probability measure. The Hamiltonian must, at the very least, be a measurable function with respect to whatever $\sigma$-algebra we choose. No philosophical interpretation is needed beyond this point.

3.2 Conditioning in Classical Probability

We will now restrict attention to continuous random variables with well-defined probability densities. A random variable $X$ has probability density function (pdf) $\rho_{X}$ so that

$\Pr\{x\leq X<x+dx\}=\rho_{X}(x)\,dx.$  (20)

Normalization requires $\int_{-\infty}^{\infty}\rho_{X}(x)\,dx=1$. If we have several random variables, then we need to specify their joint probability. For instance, if we have a pair $X$ and $Y$, then their joint pdf will be $\rho_{X,Y}(x,y)$ with

$\rho_{X}(x)=\int\rho_{X,Y}(x,y)\,dy$ (the $x$-marginal),  (21)

$\rho_{Y}(y)=\int\rho_{X,Y}(x,y)\,dx$ (the $y$-marginal),  (22)

and $1=\int\!\int\rho_{X,Y}(x,y)\,dx\,dy$.

We say that $X$ and $Y$ are statistically independent if their joint pdf factors into the marginals,

$\rho_{X,Y}(x,y)\equiv\rho_{X}(x)\times\rho_{Y}(y)$ (independence).  (23)

This is equivalent to pairs of events of the form $X^{-1}[I]$ and $Y^{-1}[J]$ being statistically independent for all intervals $I,J$.

More generally, we can work out the conditional probabilities from a joint probability. The pdf for $X$ given that $Y=y$ is defined to be

$\rho_{X|Y}(x|y)\triangleq\frac{\rho_{X,Y}(x,y)}{\rho_{Y}(y)}.$  (24)

In the special case where $X$ and $Y$ are independent, we have

$\rho_{X|Y}(x|y)=\rho_{X}(x).$  (25)

In other words, conditioning on the fact that $Y=y$ makes no change to our knowledge of $X$.

Definition 11

Let $A=a(X,Y)$ be a random variable for some function $a:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$; then its conditional expectation given $Y=y$ is defined to be

$\mathbb{E}[A|Y=y]\triangleq\int_{\mathbb{R}}a(x,y)\,\rho_{X|Y}(x|y)\,dx.$  (26)

More generally, let $\mathcal{Y}$ be the $\sigma$-algebra generated by $Y$; then $\mathbb{E}[A|\mathcal{Y}]$ is the $\mathcal{Y}$-measurable random variable taking the value $\mathbb{E}[A|Y=y]$ at each $\omega$, where $y=Y(\omega)$.

As $\int\rho_{X|Y}(x|y)\,dx=1$, we have

$\mathbb{E}[1|\mathcal{Y}]\equiv 1.$  (27)

We note that for any random variable $A=a(X,Y)$,

$\mathbb{E}\big[\mathbb{E}[A|\mathcal{Y}]\big]=\int_{\mathbb{R}}\left(\int_{\mathbb{R}}a(x,y)\,\rho_{X|Y}(x|y)\,dx\right)\rho_{Y}(y)\,dy=\int_{\mathbb{R}}\!\int_{\mathbb{R}}a(x,y)\,\rho_{X,Y}(x,y)\,dx\,dy=\mathbb{E}[A].$  (28)

Also, for any $A=a(X,Y)$ and $B=b(Y)$, we have

$\mathbb{E}[AB|\mathcal{Y}](\omega)=\int_{\mathbb{R}}a(x,Y(\omega))\,b(Y(\omega))\,\rho_{X|Y}(x|Y(\omega))\,dx=\left(\int_{\mathbb{R}}a(x,Y(\omega))\,\rho_{X|Y}(x|Y(\omega))\,dx\right)b(Y(\omega))=\mathbb{E}[A|\mathcal{Y}](\omega)\,B(\omega).$  (29)
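Both identities (28) and (29) are easy to confirm by simulation. The following sketch (our own illustration; the choice of a fair coin $Y$ with conditionally Gaussian $X$ is arbitrary) estimates both sides by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Y is a fair coin; given Y = y, X is Normal with mean y, variance 1.
Y = rng.integers(0, 2, size=n)
X = rng.normal(loc=Y, scale=1.0)
A = X**2 + Y                               # the random variable a(X, Y)

# Exact conditional expectation: E[A | Y=y] = (1 + y^2) + y.
cond_exp = 1.0 + Y**2 + Y

# Tower property (28): E[E[A|Y]] = E[A]; both are close to 2.0.
print(cond_exp.mean(), A.mean())

# Module property (29) with B = b(Y): check on the event {Y = 1}.
b = lambda y: 2.0 * y - 1.0
print((A * b(Y))[Y == 1].mean(), (cond_exp * b(Y))[Y == 1].mean())
```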

This construction was specific to random variables with pdfs. However, it extends to the general setting as follows.

Theorem 12

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and let $\mathcal{Y}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. Then for each random variable $A$ there exists a $\mathbb{P}$-almost surely unique $\mathcal{Y}$-measurable random variable $\mathbb{E}[A|\mathcal{Y}]$ such that $\mathbb{E}[1|\mathcal{Y}]=1$, $\mathbb{E}\big[\mathbb{E}[A|\mathcal{Y}]\big]=\mathbb{E}[A]$ and $\mathbb{E}[AB|\mathcal{Y}]=\mathbb{E}[A|\mathcal{Y}]\,B$ whenever $B$ is $\mathcal{Y}$-measurable.

Proposition 13

If $B$ is $\mathcal{Y}$-measurable, then

$\mathbb{E}[B|\mathcal{Y}]=B.$  (30)

Proof. Setting $A=1$ in the identity $\mathbb{E}[AB|\mathcal{Y}]=\mathbb{E}[A|\mathcal{Y}]\,B$, valid whenever $B$ is $\mathcal{Y}$-measurable, we see that $\mathbb{E}[B|\mathcal{Y}]=\mathbb{E}[1|\mathcal{Y}]\,B$, which in turn equals $B$.

Proposition 14

Conditional expectations are projections.

Proof. For $A$ arbitrary, we set $B=\mathbb{E}[A|\mathcal{Y}]$, which is $\mathcal{Y}$-measurable, and so

$\mathbb{E}\big[\mathbb{E}[A|\mathcal{Y}]\,\big|\,\mathcal{Y}\big]=\mathbb{E}[A|\mathcal{Y}].$  (31)

 

3.3 Classical Measurement

We now suppose that we have a system with phase space $\Gamma$ and a measuring apparatus with parameter space $M$. We let $x$ denote the phase points of $\Gamma$ as before, and write $y$ for the variables of the apparatus. The components of $y$ are sometimes referred to as pointer variables. The total space will be $\Omega=\Gamma\times M$ with coordinates $\omega=(x,y)$. We take $\mathbb{P}$ to be a probability measure on $\Omega$ and consider the random vectors $X(\omega)=x$ and $Y(\omega)=y$.

In an experiment, we do not measure the system directly but instead record the value of one or more pointer variables. Let $\mathcal{Y}$ be the $\sigma$-algebra generated by $Y$. We therefore refer to $Y$ as the data.

We shall assume that the system variables and the pointer variables are statistically dependent under our probability measure $\mathbb{P}$, otherwise we learn nothing about our system from the data. As before, we assume a joint pdf $\rho(x,y)$ with marginals $\rho_{\Gamma}(x)$ for the system and $\rho_{M}(y)$ for the measuring apparatus. We will write $\rho(x|y)$ for the conditional pdf of our system given the data, and write $\lambda(y|x)$ for the conditional pdf of the data given the system. This implies that

$\rho(x,y)=\rho(x|y)\,\rho_{M}(y)=\lambda(y|x)\,\rho_{\Gamma}(x).$  (32)

In practice, we may not know $\mathbb{P}$; however, we will assume that we know $\lambda(y|x)$. That is, we assume that we know the probability distribution of the pointer variables if we prepared our system precisely in state $x$, for each possible $x\in\Gamma$. Statisticians refer to $\lambda(y|x)$ as the likelihood function of the data $y$ given $x$.

Note that

$\int_{M}\lambda(y|x)\,dy=1=\int_{\Gamma}\rho(x|y)\,dx.$  (33)

Now every random variable may be written as $A=a(X,Y)$ for some function $a:\Omega=\Gamma\times M\to\mathbb{R}$. Its conditional expectation given the data is

$\mathbb{E}[A|\mathcal{Y}]\equiv\int_{\Gamma}a(x,Y)\,\rho(x|Y)\,dx.$  (34)

Indeed, for $\omega=(x,y)$ we have

$\mathbb{E}[A|\mathcal{Y}](\omega)=\frac{1}{\rho_{M}(y)}\int_{\Gamma}a(x^{\prime},y)\,\rho(x^{\prime},y)\,dx^{\prime}=\frac{\int_{\Gamma}a(x^{\prime},y)\,\rho(x^{\prime},y)\,dx^{\prime}}{\int_{\Gamma}\rho(x^{\prime\prime},y)\,dx^{\prime\prime}}.$  (35)

This is an average over the hypersurface $\Omega_{y}=\{\omega\in\Omega:Y(\omega)=y\}$. Indeed, the decomposition $\omega=(x,y)$ can be thought of as a split into the constraint coordinates $y$ and the hypersurface coordinates $x$.


From a practical standpoint, we will have access only to the data - that is, to variables measurable with respect to $\mathcal{Y}$. We are assuming that we know $\lambda$, the conditional probability for the data given the system. However, the system is unknown and what we are given is, of course, the data. Therefore, we need to solve the inverse problem, namely to give the conditional probability for the unknown $X$ given the measured values of $Y$. As it stands, the problem is not well-posed: we do not yet have enough information to write down the joint probability.

To remedy this, we introduce a pdf for $X$ which is our a priori guess:

$\rho_{X}(x)\stackrel{\mathrm{guess!}}{=}\rho_{\mathrm{prior}}(x).$  (36)

We then have the corresponding joint probability for $X$ and $Y$:

$\rho_{\mathrm{prior}}(x,y)=\lambda(y|x)\times\rho_{\mathrm{prior}}(x).$  (37)

If we subsequently measure $Y=y$, then we obtain the a posteriori probability

$\rho_{\mathrm{post}}(x|y)=\frac{\rho_{X,Y}(x,y)}{\rho_{Y}(y)}=\frac{\lambda(y|x)\,\rho_{\mathrm{prior}}(x)}{\int\lambda(y|x^{\prime})\,\rho_{\mathrm{prior}}(x^{\prime})\,dx^{\prime}}.$  (38)

The conditional expectation in (35) can then be written as

$\mathbb{E}[A|\mathcal{Y}](\omega)=\frac{\int_{\Gamma}a(x^{\prime},y)\,\lambda(y|x^{\prime})\,\rho_{\mathrm{prior}}(x^{\prime})\,dx^{\prime}}{\int_{\Gamma}\lambda(y|x^{\prime\prime})\,\rho_{\mathrm{prior}}(x^{\prime\prime})\,dx^{\prime\prime}}.$  (39)
Example 15

Let $X$ be the position of a particle. We measure

$Y=X+\sigma Z,$  (40)

where $Z$ is a standard normal variable independent of $X$. We may refer to $X$ as the signal and $Z$ as the noise.

Now if $X$ were known to be exactly $x$, then $Y$ would be normal with mean $x$ and variance $\sigma^{2}$. Therefore, we can immediately write down the likelihood function:

$\lambda(y|x)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(y-x)^{2}/2\sigma^{2}},$  (41)

$\rho_{\mathrm{post}}(x|y)=\frac{\rho_{\mathrm{prior}}(x)\,e^{-(y-x)^{2}/2\sigma^{2}}}{\int\rho_{\mathrm{prior}}(x^{\prime})\,e^{-(y-x^{\prime})^{2}/2\sigma^{2}}\,dx^{\prime}}.$  (42)

In the special case where $X$ is assumed to be Gaussian, say with mean $\mu_{0}$ and variance $\sigma_{0}^{2}$, we can give the explicit form of the posterior: it is Gaussian with mean $\mu_{1}$ and variance $\sigma_{1}^{2}$, where

$\mu_{1}=\frac{\sigma_{1}^{2}}{\sigma_{0}^{2}}\,\mu_{0}+\frac{\sigma_{1}^{2}}{\sigma^{2}}\,y,$  (43)

$\frac{1}{\sigma_{1}^{2}}=\frac{1}{\sigma_{0}^{2}}+\frac{1}{\sigma^{2}}.$  (44)

There are two desirable features here. First, the new mean $\mu_{1}$ uses the data $y$. Second, the new variance $\sigma_{1}^{2}$ is smaller than the prior variance $\sigma_{0}^{2}$. In other words, the measurement is informative and decreases our uncertainty in the state.
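The update rules (43)-(44) are straightforward to code and to validate against brute-force conditioning; the sketch below (our own illustration, with arbitrarily chosen prior and noise parameters) compares the closed-form posterior with a Monte Carlo estimate obtained by keeping only the samples whose observation lands near $y$.

```python
import numpy as np

def gaussian_posterior(mu0, var0, var_noise, y):
    """Posterior mean and variance for Y = X + sigma Z, eqs. (43)-(44)."""
    var1 = 1.0 / (1.0 / var0 + 1.0 / var_noise)
    mu1 = var1 * (mu0 / var0 + y / var_noise)
    return mu1, var1

rng = np.random.default_rng(2)
mu0, var0, var_noise, y = 1.0, 2.0, 0.5, 3.0
mu1, var1 = gaussian_posterior(mu0, var0, var_noise, y)

# Brute force: sample (X, Y) jointly, keep samples with Y close to y.
X = rng.normal(mu0, np.sqrt(var0), size=2_000_000)
Ys = X + rng.normal(0.0, np.sqrt(var_noise), size=X.size)
sel = np.abs(Ys - y) < 0.02
print(mu1, X[sel].mean())    # both close to 2.6
print(var1, X[sel].var())    # both close to 0.4
```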

3.4 Classical Filtering

It is possible to extend the conditioning problem to estimating the state of a dynamical system as it evolves in time, based on continual monitoring. This involves the theory of stochastic processes, and we will use the informal language of path integrals rather than the Ito calculus.

3.4.1 Stochastic Process

A stochastic process is a family $\{X(t):t\geq 0\}$ of random variables labeled by time. The process is determined by specifying all the multi-time distributions

$\rho(x_{n},t_{n};\cdots;x_{1},t_{1})$  (45)

for $X(t_{1})=x_{1},\cdots,X(t_{n})=x_{n}$, for each $n\geq 1$.

A stochastic process is said to be Markov if the multi-time distributions take the form

$\rho(x_{n},t_{n};\cdots;x_{1},t_{1})=T(x_{n},t_{n}|x_{n-1},t_{n-1})\cdots T(x_{2},t_{2}|x_{1},t_{1})\,\rho(x_{1},t_{1}),$  (46)

whenever $t_{n}>t_{n-1}>\cdots>t_{1}$.

Here $T(x,t|x_{0},t_{0})$ is the probability density for $X(t)=x$ given that $X(t_{0})=x_{0}$ (with $t>t_{0}$):

$\mathrm{Prob}\big\{x\leq X(t)\leq x+dx\,\big|\,X(t_{0})=x_{0}\big\}=T(x,t|x_{0},t_{0})\,dx,$  (47)

for $t>t_{0}$. It is called the transition mechanism of the Markov process. For consistency, we should have the following propagation rule, known in probability theory as the Chapman-Kolmogorov equation:

$\int T(x,t|x_{1},t_{1})\,T(x_{1},t_{1}|x_{0},t_{0})\,dx_{1}=T(x,t|x_{0},t_{0}),$  (48)

for all $t>t_{1}>t_{0}$.


Example 16

The Wiener process (Brownian motion) is determined by

$T(x,t|x_{0},t_{0})=\frac{1}{\sqrt{2\pi(t-t_{0})}}\,e^{-\frac{(x-x_{0})^{2}}{2(t-t_{0})}},$  (49)

$\rho(x,0)=\delta_{0}(x).$  (50)

The transition mechanism here is the Green's function for the heat equation

$\frac{\partial}{\partial t}\rho=\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}\rho.$  (51)

(In other words, given the data $\rho(\cdot,t_{0})=f(\cdot)$ at time $t_{0}$, the solution for later times is $\rho(x,t)=\int T(x,t|x_{0},t_{0})\,f(x_{0})\,dx_{0}$.)
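For the heat kernel (49), the Chapman-Kolmogorov equation (48) amounts to the fact that convolving two Gaussians adds their variances; the following sketch (our own, on an arbitrary grid) checks the propagation rule numerically.

```python
import numpy as np

def T(x, x0, t):
    # Wiener transition density T(x, t | x0, 0), eq. (49)
    return np.exp(-(x - x0) ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

x1 = np.linspace(-15.0, 15.0, 3001)      # integration grid for x1
dx1 = x1[1] - x1[0]
x0, t1, t = 0.3, 0.7, 1.9                # arbitrary times t > t1 > 0

xs = np.linspace(-5.0, 5.0, 101)
lhs = np.array([np.sum(T(x, x1, t - t1) * T(x1, x0, t1)) * dx1 for x in xs])
rhs = T(xs, x0, t)
print(np.max(np.abs(lhs - rhs)))         # tiny, up to quadrature error
```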

Norbert Wiener gave an explicit construction, known as the canonical version of Brownian motion, in which the sample space is the space of continuous paths $\mathbf{w}=\{w(t):t\geq 0\}$ starting at the origin, equipped with a suitable $\sigma$-algebra of subsets and a well-defined measure $\mathbb{P}_{\text{Wiener}}^{t}$.

The corresponding stochastic process is denoted $W(t)$. Ito constructed a stochastic differential calculus around the Wiener process, and more generally around diffusions, and we have the following Ito table:

$dt\,dt=0,\qquad dt\,dW=dW\,dt=0,\qquad dW\,dW=dt.$  (55)
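The entry $dW\,dW=dt$ encodes the fact that the Wiener process accumulates quadratic variation at unit rate; a quick simulation (our own illustration) makes this concrete.

```python
import numpy as np

rng = np.random.default_rng(3)
t, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(t / n), size=n)   # Wiener increments

print(np.sum(dW ** 2))   # close to t = 1.0, i.e. dW dW = dt
```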

3.4.2 Path Integral Formulation

Indeed, for the Wiener process we have

$\rho(x_{n},t_{n};\cdots;x_{1},t_{1})\,dx_{n}\cdots dx_{1}\propto e^{-\sum_{k}\frac{(x_{k}-x_{k-1})^{2}}{2(t_{k}-t_{k-1})}}\,dx_{n}\cdots dx_{1}.$  (56)

Formally, we may introduce a limiting "path integral" probability measure on the space of paths,

$\mathbb{P}_{\text{Wiener}}^{t}[d\mathbf{w}]=e^{-S_{\text{Wiener}}[\mathbf{w}]}\,\mathcal{D}\mathbf{w},$  (57)

where we have the action

$S_{\text{Wiener}}[\mathbf{w}]=\int_{0}^{t}\frac{1}{2}\,\dot{w}(\tau)^{2}\,d\tau.$  (58)

For a diffusion $X(t)$ satisfying the Ito stochastic differential equation

$dX=v(X)\,dt+\sigma(X)\,dW,$  (59)

we have the corresponding measure

$\mathbb{P}_{X}^{t}[d\mathbf{x}]=e^{-S_{X}[\mathbf{x}]}\,\mathcal{D}\mathbf{x},$  (60)

where we have the action (substitute $\dot{w}=\frac{\dot{x}-v(x)}{\sigma(x)}$ into $S_{\text{Wiener}}[\mathbf{w}]$, and allow for a Jacobian correction)

$S_{X}[\mathbf{x}]=\int_{0}^{t}\frac{1}{2}\,\frac{[\dot{x}-v(x)]^{2}}{\sigma(x)^{2}}\,d\tau+\frac{1}{2}\int_{0}^{t}\nabla\cdot v(x)\,d\tau.$  (61)

3.4.3 The Classical Filtering Problem

Suppose that we have a system described by a process $\{X(t):t\geq 0\}$. We obtain information by observing a related process $\{Y(t):t\geq 0\}$:

$dX=v(X)\,dt+\sigma(X)\,dW$ (stochastic dynamics),  (62)

$dY=h(X)\,dt+dZ$ (noisy observations).  (63)

Here we assume that the dynamical noise $W$ and the observational noise $Z$ are independent Wiener processes.

The joint probability of both $X$ and $Y$ up to time $t$ is

$\mathbb{P}_{X,Y}^{t}[d\mathbf{x},d\mathbf{y}]=e^{-S_{X,Y}[\mathbf{x},\mathbf{y}]}\,\mathcal{D}\mathbf{x}\,\mathcal{D}\mathbf{y},$  (64)

where

$S_{X,Y}[\mathbf{x},\mathbf{y}]=S_{X}[\mathbf{x}]+\int_{0}^{t}\frac{1}{2}\,[\dot{y}-h(x)]^{2}\,d\tau=S_{X}[\mathbf{x}]+S_{\text{Wiener}}[\mathbf{y}]-\int_{0}^{t}\left[h(x)\,\dot{y}-\frac{1}{2}h(x)^{2}\right]d\tau,$  (65)-(66)

or

$\mathbb{P}_{X,Y}^{t}[d\mathbf{x},d\mathbf{y}]=\mathbb{P}_{X}^{t}[d\mathbf{x}]\,\mathbb{P}_{\mathrm{Wiener}}^{t}[d\mathbf{y}]\,\lambda_{t}(\mathbf{y}|\mathbf{x}),$  (67)

where the Kallianpur-Striebel likelihood (readers with a background in stochastic processes will recognize this as the Radon-Nikodym derivative associated with a Girsanov transformation) is

$\lambda_{t}(\mathbf{y}|\mathbf{x})=e^{\int_{0}^{t}\left[h(x)\,dy(\tau)-\frac{1}{2}h(x)^{2}\,d\tau\right]}.$  (68)

The distribution for $X(t)$ given the observations $\mathbf{y}=\{y(\tau):0\leq\tau\leq t\}$ is then

$\rho_{t}(x|\mathbf{y})=\frac{\int_{x(0)=x_{0}}^{x(t)=x}\lambda_{t}(\mathbf{y}|\mathbf{x})\,\mathbb{P}_{X}^{t}[d\mathbf{x}]}{\int_{x(0)=x_{0}}\lambda_{t}(\mathbf{y}|\mathbf{x}^{\prime})\,\mathbb{P}_{X}^{t}[d\mathbf{x}^{\prime}]}.$  (69)

Let us write $\mathcal{Y}_{t}$ for the $\sigma$-algebra generated by the observations $\{Y(\tau):0\leq\tau\leq t\}$. The estimate for $f(X(t))$, for any function $f$, conditioned on the observations up to time $t$ is called the filter and, generalizing (39) to continuous time, we may write it as

$\mathfrak{E}_{t}(f)=\mathbb{E}[f(X(t))|\mathcal{Y}_{t}]=\int\rho_{t}(x|\mathbf{y})\,f(x)\,dx=\frac{\int\sigma_{t}(x|\mathbf{y})\,f(x)\,dx}{\int\sigma_{t}(x^{\prime}|\mathbf{y})\,dx^{\prime}},$  (70)

where $\sigma_{t}(x|\mathbf{y})=\int_{x(0)=x_{0}}^{x(t)=x}\lambda_{t}(\mathbf{y}|\mathbf{x})\,\mathbb{P}_{X}^{t}[d\mathbf{x}]$ is a non-normalized density. We introduce the stochastic process $\sigma_{t}(x):\omega\mapsto\sigma_{t}(x|\mathbf{y})$, which can be shown to satisfy the Duncan-Mortensen-Zakai equation

$d\sigma_{t}(x)=\mathcal{L}^{\ast}\sigma_{t}(x)\,dt+h(x)\,\sigma_{t}(x)\,dY(t).$  (71)

This implies the filtering equation

$d\mathfrak{E}_{t}(f)=\mathfrak{E}_{t}(\mathcal{L}f)\,dt+\big\{\mathfrak{E}_{t}(fh)-\mathfrak{E}_{t}(f)\,\mathfrak{E}_{t}(h)\big\}\,dI(t),$  (72)

where the innovations process is defined by

$dI(t)=dY(t)-\mathfrak{E}_{t}(h)\,dt.$  (73)
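To get a concrete feel for (72)-(73), consider the linear example $v(x)=-x$, $\sigma=1$, $h(x)=x$. Taking $f(x)=x$ and $f(x)=x^{2}$ in (72), the filter closes on the conditional mean $m_{t}$ and conditional variance $v_{t}$, giving the Kalman-Bucy equations $dm=-m\,dt+v\,dI$ and $\dot{v}=-2v+1-v^{2}$. The sketch below (our own illustration, not part of the original development) simulates a trajectory and runs this filter on the simulated observation record.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n = 10.0, 10_000
dt = T / n
sqdt = np.sqrt(dt)

x, m, v = 1.5, 0.0, 1.0   # true state; filter mean and variance
sq_err = 0.0

for k in range(n):
    # Signal and observation increments, eqs. (62)-(63).
    x += -x * dt + sqdt * rng.normal()
    dY = x * dt + sqdt * rng.normal()

    # Innovations (73) and filtering equation (72) with f(x) = x.
    dI = dY - m * dt
    m += -m * dt + v * dI
    v += (-2.0 * v + 1.0 - v * v) * dt   # closed equation for the variance

    if k >= n // 2:
        sq_err += (x - m) ** 2 / (n - n // 2)

print(sq_err, v)   # both settle near sqrt(2) - 1 ~ 0.414
```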

4 Quantum Filtering

We now consider the quantum analogue of filtering. See also [11]-[15].

4.1 Quantum Measurement

The Basic Concepts

The Born interpretation of the wave function $\psi(x)$ in quantum mechanics is that $|\psi(x)|^{2}$ gives the probability density of finding the particle at position $x$. More generally, in quantum theory, observables are represented by self-adjoint operators on a Hilbert space. The basic postulate of quantum theory is that the pure states of a system correspond to normalized wave functions $\Psi$, and we will follow Dirac and denote these as kets $|\Psi\rangle$. When we measure an observable, the physical value we record will be an eigenvalue. If the state is $|\Psi\rangle$, then the average value of the observable represented by $\hat{A}$ is $\langle\hat{A}\rangle=\langle\Psi|\hat{A}|\Psi\rangle$.

Let us recall that a Hermitian operator $\hat{P}$ is called an orthogonal projection if it satisfies $\hat{P}^{2}=\hat{P}$. If we have a Hermitian operator $\hat{A}$ with a discrete set of eigenvalues, then there exists a collection of orthogonal projections $\hat{P}_{a}$, labeled by the eigenvalues $a$, satisfying $\hat{P}_{a}\hat{P}_{a^{\prime}}=0$ if $a\neq a^{\prime}$ and $\sum_{a}\hat{P}_{a}=\hat{I}$, such that

$\hat{A}=\sum_{a}a\,\hat{P}_{a}.$  (74)

This is the spectral decomposition of $\hat{A}$. The operator $\hat{P}_{a}$ projects onto $\mathcal{E}_{a}$, the eigenspace of $\hat{A}$ for eigenvalue $a$. In other words, $\mathcal{E}_{a}$ is the space of all eigenvectors of $\hat{A}$ having eigenvalue $a$. The eigenspaces are orthogonal, that is, $\langle\psi|\phi\rangle=0$ whenever $\psi$ and $\phi$ lie in different eigenspaces (this is equivalent to $\hat{P}_{a}\hat{P}_{a^{\prime}}=0$ for $a\neq a^{\prime}$), and every vector $|\psi\rangle$ can be written as a superposition $\sum_{a}|\psi_{a}\rangle$ where $|\psi_{a}\rangle$ lies in the eigenspace $\mathcal{E}_{a}$. (In fact, $|\psi_{a}\rangle=\hat{P}_{a}|\psi\rangle$.)

We note that, for any integer $n$,

$\hat{A}^{n}=\sum_{a}a^{n}\,\hat{P}_{a},$  (75)

and for any real $t$,

$e^{it\hat{A}}=\sum_{a}e^{ita}\,\hat{P}_{a}.$  (76)

Suppose we prepare a quantum system in a state $|\Psi\rangle$ and perform a measurement of an observable $\hat{A}$. We know that we may only measure an eigenvalue $a$, and quantum mechanics predicts the probability $p_{a}$. In fact, using the spectral decomposition,

$\langle\hat{A}^{n}\rangle=\Big\langle\sum_{a}a^{n}\,\hat{P}_{a}\Big\rangle=\sum_{a}a^{n}\,\langle\hat{P}_{a}\rangle=\sum_{a}a^{n}\,p_{a},$  (77)

and so

$p_{a}=\langle\hat{P}_{a}\rangle\equiv\langle\Psi|\hat{P}_{a}|\Psi\rangle.$  (78)

For the special case of a non-degenerate eigenvalue $a$, the eigenspace $\mathcal{E}_{a}$ is spanned by a single eigenvector $|a\rangle$, which we take to be normalized. In this case we have $\hat{P}_{a}=|a\rangle\langle a|$ and

$p_{a}=\langle\Psi|\hat{P}_{a}|\Psi\rangle=\langle\Psi|a\rangle\langle a|\Psi\rangle\equiv|\langle a|\Psi\rangle|^{2}.$  (79)

We see that if an observable $\hat{A}$ has a non-degenerate eigenvalue $a$ with normalized eigenvector $|a\rangle$, and the system is prepared in state $|\Psi\rangle$, then the probability of measuring $a$ in an experiment is $|\langle a|\Psi\rangle|^{2}$. The modulus squared of an overlap may therefore be interpreted as a probability.

The degenerate case needs some more attention. Here the eigenspace $\mathcal{E}_{a}$ can be spanned by a set of orthonormal vectors $|a1\rangle,|a2\rangle,\cdots$ so that $\hat{P}_{a}=\sum_{n}|an\rangle\langle an|$, and so $p_{a}=\sum_{n}|\langle an|\Psi\rangle|^{2}$. The choice of the orthonormal basis for $\mathcal{E}_{a}$ is not important!

The probability $p_{a}$ is equal to the length-squared of $\hat{P}_{a}|\Psi\rangle$, that is,

$p_{a}=\|\hat{P}_{a}\Psi\|^{2}.$  (80)

To see this, note that $\|\hat{P}_{a}\Psi\|^{2}$ is the overlap of the ket $\hat{P}_{a}|\Psi\rangle$ with its own bra $\langle\Psi|\hat{P}_{a}^{\dagger}$, so

$\|\hat{P}_{a}\Psi\|^{2}=\langle\Psi|\hat{P}_{a}^{\dagger}\hat{P}_{a}|\Psi\rangle=\langle\Psi|\hat{P}_{a}^{2}|\Psi\rangle=\langle\Psi|\hat{P}_{a}|\Psi\rangle=p_{a},$  (81)

where we used the fact that $\hat{P}_{a}=\hat{P}_{a}^{\dagger}=\hat{P}_{a}^{2}$.

In Figure 2, we project $|\Psi\rangle$ into the eigenspace $\mathcal{E}_{a}$ to get $\hat{P}_{a}|\Psi\rangle$. In the special case where $|\Psi\rangle$ was already in the eigenspace, it equals its own projection ($\hat{P}_{a}|\Psi\rangle=|\Psi\rangle$) and so $p_{a}=1$, since the state $|\Psi\rangle$ is normalized. If the state $|\Psi\rangle$ is orthogonal to the eigenspace, then its projection is zero ($\hat{P}_{a}|\Psi\rangle=0$) and so $p_{a}=0$.

In general, we get something in between: $|\Psi\rangle$ has a component in the eigenspace and a component orthogonal to it. The projected vector $\hat{P}_{a}|\Psi\rangle$ then has length less than the original $|\Psi\rangle$, and so $p_{a}<1$.

Figure 2: The state $|\Psi\rangle$ is projected into the eigenspace $\mathcal{E}_{a}$ corresponding to the eigenvalue $a$ of $\hat{A}$.
Von Neumann’s Projection Postulate

Suppose the initial state is $|\Psi\rangle$ and we measure the eigenvalue $a$ of observable $\hat{A}$ in a given experiment. A second measurement of $\hat{A}$ performed straight away ought to yield the same value $a$ again, this time with certainty.

The only way, however, to ensure that we measure a given eigenvalue with certainty is for the state to lie in the eigenspace for that eigenvalue. We therefore require that the state of the system immediately after the result $a$ is measured jumps from $|\Psi\rangle$ to something lying in the eigenspace $\mathcal{E}_{a}$. This leads us directly to the von Neumann projection postulate.

The von Neumann projection postulate: If the state of a system is given by a ket $|\Psi\rangle$, and a measurement of observable $\hat{A}$ yields the eigenvalue $a$, then the state immediately after measurement becomes $|\Psi_{a}\rangle=\frac{1}{\sqrt{p_{a}}}\,\hat{P}_{a}|\Psi\rangle$.

We note that the projected vector $\hat{P}_{a}|\Psi\rangle$ has length $\sqrt{p_{a}}$, so we need to divide by this to ensure that $|\Psi_{a}\rangle$ is properly normalized. The von Neumann postulate is essentially the simplest geometric way to get the vector $|\Psi\rangle$ into the eigenspace: project down and then normalize!
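The Born probabilities (78) and the projection postulate combine into a simple simulation recipe; the sketch below (our own illustration) performs a projective measurement of a Hermitian matrix and confirms that an immediate repetition reproduces the outcome.

```python
import numpy as np

rng = np.random.default_rng(5)

def measure(psi, A):
    """Projective measurement of Hermitian A in the normalized state psi.

    Returns (eigenvalue, post-measurement state), following eq. (78)
    and the von Neumann projection postulate.
    """
    eigvals, eigvecs = np.linalg.eigh(A)
    outcomes = np.unique(np.round(eigvals, 10))
    probs, projs = [], []
    for a in outcomes:
        V = eigvecs[:, np.isclose(eigvals, a)]
        P = V @ V.conj().T                     # spectral projection P_a
        projs.append(P)
        probs.append(max(np.vdot(psi, P @ psi).real, 0.0))  # p_a
    p = np.array(probs) / np.sum(probs)
    k = rng.choice(len(outcomes), p=p)
    post = projs[k] @ psi / np.sqrt(probs[k])  # project, then normalize
    return outcomes[k], post

# Measure sigma_x on |0>: outcomes +1/-1, each with probability 1/2.
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
a, psi1 = measure(np.array([1.0, 0.0], dtype=complex), sigma_x)
a2, _ = measure(psi1, sigma_x)
assert a2 == a  # an immediate repetition reproduces the same eigenvalue
```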

Compatible Measurements

Suppose we measure a pair of observables $\hat{A}$ and $\hat{B}$, in that sequence. The $\hat{A}$-measurement leaves the state in the eigenspace of the measured value $a$; the subsequent $\hat{B}$-measurement then leaves the state in the eigenspace of the measured value $b$. If we then went back and remeasured $\hat{A}$, would we find $a$ again with certainty? The state after the second measurement will be an eigenvector of $\hat{B}$ with eigenvalue $b$, but this need not be an eigenvector of $\hat{A}$.

Let $\hat{A}$ and $\hat{B}$ be a pair of observables with spectral decompositions $\sum_{a}a\,\hat{P}_{a}$ and $\sum_{b}b\,\hat{Q}_{b}$ respectively. Let us measure $\hat{A}$ and then $\hat{B}$, recording values $a$ and $b$ respectively. If the initial state was $|\Psi_{\text{in}}\rangle$, then after both measurements the final state will be

$|\Psi_{\text{out}}\rangle\propto\hat{Q}_{b}\hat{P}_{a}\,|\Psi_{\text{in}}\rangle.$  (82)

In particular, $|\Psi_{\text{out}}\rangle$ is an eigenstate of $\hat{B}$ with eigenvalue $b$. However, suppose we also wanted $|\Psi_{\text{out}}\rangle$ to be an eigenstate of $\hat{A}$ with the original eigenvalue $a$; then we must have $\hat{P}_{a}|\Psi_{\text{out}}\rangle=|\Psi_{\text{out}}\rangle$, or equivalently

$\hat{P}_{a}\hat{Q}_{b}\hat{P}_{a}\,|\Psi_{\text{in}}\rangle=\hat{Q}_{b}\hat{P}_{a}\,|\Psi_{\text{in}}\rangle.$  (83)

If we want this to be true irrespective of the actual initial state $|\Psi_{\text{in}}\rangle$, then we arrive at the operator equation

$\hat{P}_{a}\hat{Q}_{b}\hat{P}_{a}=\hat{Q}_{b}\hat{P}_{a}.$  (84)
Proposition 17

Let $\hat{P}$ and $\hat{Q}$ be a pair of orthogonal projections satisfying $\hat{P}\hat{Q}\hat{P}=\hat{Q}\hat{P}$; then $\hat{P}\hat{Q}=\hat{Q}\hat{P}$.

Proof. We first observe that $\hat{R}=\hat{Q}\hat{P}\hat{Q}$ is again an orthogonal projection. To this end, we must show that $\hat{R}^{\dagger}=\hat{R}$ and $\hat{R}^{2}=\hat{R}$. However, $\hat{R}^{\dagger}=(\hat{Q}\hat{P}\hat{Q})^{\dagger}=\hat{Q}^{\dagger}\hat{P}^{\dagger}\hat{Q}^{\dagger}=\hat{Q}\hat{P}\hat{Q}=\hat{R}$ and

$\hat{R}^{2}=(\hat{Q}\hat{P}\hat{Q})(\hat{Q}\hat{P}\hat{Q})=\hat{Q}\hat{P}\hat{Q}^{2}\hat{P}\hat{Q}=\hat{Q}\hat{P}\hat{Q}\hat{P}\hat{Q}=\hat{Q}(\hat{P}\hat{Q}\hat{P})\hat{Q}=\hat{Q}(\hat{Q}\hat{P})\hat{Q}=\hat{Q}^{2}\hat{P}\hat{Q}=\hat{Q}\hat{P}\hat{Q}=\hat{R}.$

Moreover, taking the adjoint of the hypothesis gives $\hat{P}\hat{Q}=\hat{P}\hat{Q}\hat{P}$, so that $\hat{R}=\hat{Q}(\hat{P}\hat{Q})=\hat{Q}(\hat{P}\hat{Q}\hat{P})=\hat{Q}(\hat{Q}\hat{P})=\hat{Q}\hat{P}$, using the hypothesis in the last step. The relation $\hat{R}=\hat{R}^{\dagger}$ then implies that $\hat{Q}\hat{P}=\hat{P}^{\dagger}\hat{Q}^{\dagger}=\hat{P}\hat{Q}$.

We see that the operator identity above means that $\hat{P}_{a}$ and $\hat{Q}_{b}$ need to commute! If we want the $\hat{B}$-measurement not to disturb the $\hat{A}$-measurement for any possible outcomes $a$ and $b$, then we require that all the eigen-projections of $\hat{A}$ commute with all the eigen-projections of $\hat{B}$, and this implies that $[\hat{A},\hat{B}]=0$.

Definition 18

A collection of observables is compatible if they commute. We define the commutator of two operators as

$[\hat{A},\hat{B}]=\hat{A}\hat{B}-\hat{B}\hat{A},$  (85)

so $\hat{A}$ and $\hat{B}$ are compatible if $[\hat{A},\hat{B}]=0$.

Von Neumann’s Model of Measurement

The postulates of quantum mechanics outlined above assume that all measurements are idealized, but one might expect the actual process of extracting information from quantum systems to be more involved. Von Neumann modeled the measurement process as follows. We wish to get information about an observable $\hat{X}$, say the position of a quantum system. Rather than measure $\hat{X}$ directly, we measure an observable $\hat{Y}$ giving the pointer position of a second system (called the measurement apparatus).

We will reformulate the von Neumann measurement problem in the language of estimation theory. First, we assume that the apparatus is described by a wave function $\phi$. The initial state of the system and apparatus is $|\Psi_{0}\rangle=|\Psi_{\mathrm{prior}}\rangle\otimes|\phi\rangle$, i.e.,

$\langle x,y|\Psi_{0}\rangle=\Psi_{\mathrm{prior}}(x)\,\phi(y).$  (86)

(Note that we are already falling in line with the estimation way of thinking by referring to the initial wave function of the particle as an a priori wave function - it is something we have to fix at the outset, even if we recognize it as only a guess for the correct physical state.) The system and apparatus are taken to interact by means of the unitary

$\hat{U}=e^{i\mu\hat{X}\otimes\hat{P}_{\mathrm{app}}/\hbar},$  (87)

where $\hat{P}_{\mathrm{app}}=-i\hbar\frac{\partial}{\partial y}$ is the momentum operator of the pointer conjugate to $\hat{Y}$. After the coupling, the joint state is

$\langle x,y|\hat{U}\Psi_{0}\rangle=\Psi_{\mathrm{prior}}(x)\,\phi(y-\mu x).$  (88)

If the measured value of $\hat{Y}$ is $y$, then the a posteriori wave function must be

$\psi_{\mathrm{post}}(x|y)=\frac{1}{\sqrt{\rho_{Y}(y)}}\,\psi_{\mathrm{prior}}(x)\,\phi(y-\mu x),$  (89)

where

$\rho_{Y}(y)=\int|\psi_{\mathrm{prior}}(x)\,\phi(y-\mu x)|^{2}\,dx.$  (90)

Basically, the pointer position will be a random variable with pdf given by $\rho_{Y}$: the a posteriori wave function may then be thought of as a random wave function on the system Hilbert space:

$\psi_{\mathrm{prior}}(x)\longrightarrow\psi_{\mathrm{post}}(x|Y).$  (91)

In the parlance of quantum theorists, the wave function of the apparatus collapses to $|y\rangle$, while we update the a priori wave function to get the a posteriori one.

We have been describing events in the Schrödinger picture, where states evolve while observables remain fixed. In this picture, we measure the observable $\hat{Y}^{\mathrm{in}}=I\otimes\hat{Y}$, but the state is changing in time. It is instructive to describe events in the Heisenberg picture. Here the state is fixed as $|\Psi_{0}\rangle=|\Psi_{\mathrm{prior}}\rangle\otimes|\phi\rangle$, while the observables evolve. In fact, the observable that we actually measure is

$\hat{Y}^{\text{out}}=\hat{U}^{\ast}\big(I\otimes\hat{Y}\big)\hat{U}=\underbrace{\mu\,\hat{U}^{\ast}\big(\hat{X}\otimes I\big)\hat{U}}_{\mathrm{signal}}+\underbrace{\hat{Y}^{\mathrm{in}}}_{\mathrm{noise}},$  (92)

from which it is clear that we are obtaining some information about $\hat{X}$. Note that the measured observable $\hat{Y}^{\text{out}}$ is explicitly of the form signal plus noise, as in Example 15. The noise term $\hat{Y}^{\mathrm{in}}$ is independent of the signal and has the prescribed pdf $|\phi(y)|^{2}$.
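A grid-based sketch (our own, with an assumed Gaussian pointer state $\phi$ and arbitrary values for the coupling $\mu$ and the widths) shows the update (89)-(90) in action: sampling the pointer reading and conditioning narrows the position distribution of the system.

```python
import numpy as np

rng = np.random.default_rng(6)
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
mu, w = 1.0, 0.3   # coupling strength and pointer width (assumed values)

# A priori system wave function (Gaussian) and pointer state phi.
psi_prior = np.exp(-(x - 1.0) ** 2 / 4.0)
psi_prior /= np.sqrt(np.sum(psi_prior ** 2) * dx)
phi = lambda y: np.exp(-y ** 2 / (4.0 * w ** 2))

# Pointer pdf rho_Y(y), eq. (90), evaluated on a grid of readings y.
y_grid = x.copy()
rho_Y = np.array([np.sum((psi_prior * phi(y - mu * x)) ** 2) * dx
                  for y in y_grid])

# Sample a reading y and form the a posteriori wave function, eq. (89).
y = rng.choice(y_grid, p=rho_Y / rho_Y.sum())
psi_post = psi_prior * phi(y - mu * x)
psi_post /= np.sqrt(np.sum(psi_post ** 2) * dx)

var = lambda p: np.sum(x**2 * p**2) * dx - (np.sum(x * p**2) * dx) ** 2
print(var(psi_prior), var(psi_post))   # conditioning reduces the variance
```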

4.2 Quantum Markovian Systems

Quantum Systems with Classical Noise

We consider a quantum system driven by Wiener noise. For HH and RR self-adjoint, we set

U(t)=eiHtiRW(t),\displaystyle U(t)=e^{-iHt-iRW(t)}, (93)

which clearly defines a unitary process. From the Ito calculus we can quickly deduce the corresponding Schrödinger equation

dU(t)=[iH12R2]U(t)dtiRU(t)dW(t).\displaystyle dU(t)=\big{[}-iH-\frac{1}{2}R^{2}\big{]}U(t)\,dt-iRU(t)\,dW(t). (94)

If we set jt(X)=U(t)XU(t)j_{t}(X)=U(t)^{\ast}XU(t), which we may think of as an embedding of the system observable XX into a noisy environment, then we similarly obtain

djt(X)=jt((X))dtijt([X,R])dW(t).\displaystyle dj_{t}(X)=j_{t}\big{(}\mathcal{L}(X)\big{)}\,dt-ij_{t}\big{(}[X,R]\big{)}\,dW(t). (95)

where

(X)=i[X,H]12[[X,R],R].\displaystyle\mathcal{L}(X)=-i[X,H]-\frac{1}{2}\big{[}[X,R],R\big{]}. (96)
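As a quick sanity check on (94), one can propagate a finite-dimensional example with the Euler-Maruyama scheme and confirm that the solution stays (approximately) unitary along each noise path. The qubit choice H = sigma_z, R = sigma_x below is our own illustration:

import numpy as np

# Illustrative qubit example (our own choice): H = sigma_z, R = sigma_x
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
H, R = sz, sx
I2 = np.eye(2, dtype=complex)

dt, nsteps = 1e-4, 10_000
rng = np.random.default_rng(0)

U = I2.copy()
for _ in range(nsteps):
    dW = np.sqrt(dt) * rng.standard_normal()
    # Euler-Maruyama step for dU = [-iH - R^2/2] U dt - i R U dW, cf. (94)
    U = U + ((-1j * H - 0.5 * R @ R) @ U) * dt - 1j * (R @ U) * dW

print("||U*U - I|| =", np.linalg.norm(U.conj().T @ U - I2))   # close to zero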

An alternative is to use Poissonian noise. Here we apply a unitary kick, SS, at times distributed as a Poisson process with rate ν>0\nu>0. Let N(t)N(t) count the number of kicks up to time tt; then {N(t):t0}\{N(t):t\geq 0\} is a stochastic process with independent stationary increments (like the Wiener process) and we have the Ito rules

dN(t)dN(t)=dN(t),dN(t)=νdt.\displaystyle dN(t)\,dN(t)=dN(t),\qquad\langle dN(t)\rangle=\nu\,dt. (97)

The Schrödinger equation is dU(t)=(SI)U(t)dN(t)dU(t)=(S-I)U(t)\,dN(t) and for the evolution of observables we now have

djt(X)=jt((X))dN(t),(X)=SXSX.\displaystyle dj_{t}(X)=j_{t}\big{(}\mathcal{L}(X)\big{)}dN(t),\qquad\mathcal{L}(X)=S^{\ast}XS-X. (98)
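This dynamics is equally easy to sample: draw N(T) for a rate-ν Poisson process and apply the kick S that many times. A short sketch (the rotation kick S and the rate are our own illustrative choices):

import numpy as np

# Illustrative kick (our own choice): a real rotation by angle theta
theta = 0.3
S = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

nu, T = 2.0, 5.0                      # kick rate and time horizon
rng = np.random.default_rng(1)

# dU = (S - I) U dN integrates to U(T) = S^{N(T)} with N(T) ~ Poisson(nu T)
U = np.linalg.matrix_power(S, rng.poisson(nu * T))

print(U.conj().T @ sz @ U)            # one sample of j_T(sigma_z)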
Lindblad Generators

A quantum dynamical semigroup is a family of completely positive maps, {Φt:t0}\{\Phi_{t}:t\geq 0\}, such that ΦtΦs=Φt+s\Phi_{t}\circ\Phi_{s}=\Phi_{t+s} and Φt(I)=I\Phi_{t}(I)=I. Under various continuity conditions one can show that the general form of the generator is

(X)=k12Lk[X,Lk]+k12[Lk,X]Lki[X,H].\displaystyle\mathcal{L}(X)=\sum_{k}\frac{1}{2}L_{k}^{\ast}[X,L_{k}]+\sum_{k}\frac{1}{2}[L_{k}^{\ast},X]L_{k}-i[X,H]. (99)

These include the examples emerging from classical noise above - in fact, combinations of the Wiener and Poissonian cases give the general classical case. But the class of Lindblad generators is strictly larger than this, meaning that we need quantum noise! This is typically what we consider when modeling quantum optics situations.
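For concreteness, here is a small helper that evaluates the generator (99) for matrix inputs and checks the unital property L(I) = 0; the qubit operators (H = sigma_z, a single collapse operator sigma_-) are our own illustrative choices:

import numpy as np

def lindblad(X, H, Ls):
    """Evaluate (99): sum_k ( L_k* [X, L_k] + [L_k*, X] L_k ) / 2 - i [X, H]."""
    comm = lambda A, B: A @ B - B @ A
    out = -1j * comm(X, H)
    for L in Ls:
        out = out + 0.5 * L.conj().T @ comm(X, L) + 0.5 * comm(L.conj().T, X) @ L
    return out

# Illustrative qubit choices (our own): H = sigma_z, one collapse operator sigma_-
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

print(np.allclose(lindblad(I2, sz, [sm]), 0))   # L(I) = 0, as required
print(lindblad(sz, sz, [sm]))                   # action on sigma_z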

4.3 Quantum Noise Models

Fock Space

We recall how to model bosonic fields. We wish to describe a typical pure state |Ψ|\Psi\rangle of the field. If we look at the field we expect to see a certain number, nn, of particles at locations x1,x2,,xnx_{1},x_{2},\cdots,x_{n} and to this situation we assign a complex number (the probability amplitude) ψn(x1,x2,xn)\psi_{n}(x_{1},x_{2},\cdots x_{n}). As the particles are indistinguishable bosons, the amplitude should be completely symmetric under interchange of particle identities.

The field however can have an indefinite number of particles - that is, it can be written as a superposition of fixed number states. The general form of a pure state for the field will be

|Ψ=(ψ0,ψ1,ψ2,ψ3,).\displaystyle|\Psi\rangle=\big{(}\psi_{0},\psi_{1},\psi_{2},\psi_{3},\cdots\big{)}. (100)

Note that the case n=0n=0 is included and is understood as the vacuum state. Here ψ0\psi_{0} is a complex number, with p0=|ψ0|2p_{0}=|\psi_{0}|^{2} giving the probability for finding no particles in the field.

The probability that we have exactly nn particles is

pn=|ψn(x1,x2,,xn)|2𝑑x1𝑑x2𝑑xn,\displaystyle p_{n}=\int\left|\psi_{n}\left(x_{1},x_{2},\cdots,x_{n}\right)\right|^{2}dx_{1}dx_{2}\cdots dx_{n}, (101)

and the normalization of the state is therefore n=0pn=1\sum_{n=0}^{\infty}p_{n}=1.

In particular, we take the vacuum state to be

|Ω=(1,0,0,0,).\displaystyle|\Omega\rangle=\big{(}1,0,0,0,\cdots\big{)}. (102)

The Hilbert space spanned by such states, with an indefinite number of indistinguishable bosons, is called Fock space.

A convenient spanning set is given by the exponential vectors

x1,x2,,xn|exp(α)=1n!α(x1)α(x2)α(xn).\displaystyle\langle x_{1},x_{2},\cdots,x_{n}|\exp\left(\alpha\right)\rangle=\frac{1}{\sqrt{n!}}\alpha\left(x_{1}\right)\alpha\left(x_{2}\right)\cdots\alpha\left(x_{n}\right). (103)

They are, in fact, over-complete and we have the inner products

exp(α)|exp(β)\displaystyle\langle\exp\left(\alpha\right)|\exp\left(\beta\right)\rangle (104)
=\displaystyle= n1n!α(x1)α(xn)β(x1)β(xn)𝑑x1𝑑xn\displaystyle\sum_{n}\frac{1}{n!}\int\alpha\left(x_{1}\right)^{\ast}\cdots\alpha\left(x_{n}\right)^{\ast}\beta\left(x_{1}\right)\cdots\beta\left(x_{n}\right)\,dx_{1}\cdots dx_{n}
=\displaystyle= eα(x)β(x)𝑑x\displaystyle e^{\int\alpha\left(x\right)^{\ast}\beta\left(x\right)dx}
=\displaystyle= eα|β.\displaystyle e^{\langle\alpha|\beta\rangle}.

The exponential vectors, when normalized, give the analogues to the coherent states for a single mode.

We note that the vacuum is an example: |Ω=|exp(0)|\Omega\rangle=|\exp(0)\rangle.
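The inner product formula (104) reduces, sector by sector, to powers of ⟨α|β⟩, so it is easy to verify numerically by truncating the sum over n. A small sketch (the test functions on a grid are our own choice):

import math
import numpy as np

# Illustrative square-integrable test functions on [0, 1] (our own choice)
x = np.linspace(0.0, 1.0, 1001)
dx = x[1] - x[0]
alpha = np.exp(2j * np.pi * x)
beta = 0.5 * np.ones_like(x)

ip = (alpha.conj() * beta).sum() * dx           # <alpha|beta> = int alpha* beta

# Sum over the n-particle sectors: each contributes <alpha|beta>^n / n!, cf. (104)
total = sum(ip ** n / math.factorial(n) for n in range(30))

print(np.allclose(total, np.exp(ip)))           # True: matches e^{<alpha|beta>}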

Quanta on a Wire

We now take our space to be 1-dimensional - a wire. Let’s parametrize the position on the wire by variable τ\tau, and denote by 𝔉[s,t]\mathfrak{F}_{[s,t]} the Fock space over a segment of the wire sτts\leq\tau\leq t. We have the following tensor product decomposition

𝔉AB=𝔉A𝔉B,ifAB=.\displaystyle\mathfrak{F}_{A\cup B}=\mathfrak{F}_{A}\otimes\mathfrak{F}_{B},\qquad\qquad\text{if}\quad A\cap B=\emptyset. (105)

It is convenient to introduce quantum white noises b(t)b(t) and b(t)b(t)^{\ast} satisfying the singular commutation relations

[b(t),b(s)]\displaystyle[b(t),b(s)^{\ast}] =\displaystyle= δ(ts).\displaystyle\delta(t-s). (106)

Here b(t)b(t) annihilates a quantum of the field at location tt. In keeping with the usual theory of the quantized harmonic oscillator, we take it that b(t)b(t) annihilates the vacuum: b(t)|Ω=0b(t)\,|\Omega\rangle=0. More generally, this implies that

b(t)|exp(β)=β(t)|exp(β).\displaystyle b(t)\,|\exp(\beta)\rangle=\beta(t)\,|\exp(\beta)\rangle. (107)

The adjoint b(t)b(t)^{\ast} creates a quantum at position tt.

The quantum white noises are operator densities and are singular, but their integrated forms do correspond to well defined operators which we call the annihilation and creation processes, respectively,

B(t)=0tb(τ)𝑑τ,B(t)=0tb(τ)𝑑τ.\displaystyle B(t)=\int_{0}^{t}b(\tau)d\tau,\qquad B(t)^{\ast}=\int_{0}^{t}b(\tau)^{\ast}d\tau. (108)

We see that

[B(t),B(s)]=0t𝑑τ0s𝑑σδ(τσ)=min(t,s).\displaystyle[B(t),B(s)^{\ast}]=\int_{0}^{t}d\tau\int_{0}^{s}d\sigma\,\delta(\tau-\sigma)=\text{min}(t,s). (109)

In addition we introduce a further process, called the number process, according to

Λ(t)=0tb(τ)b(τ)𝑑τ.\displaystyle\Lambda(t)=\int_{0}^{t}b(\tau)^{\ast}b(\tau)d\tau. (110)
Quantum Stochastic Models

We now think of our system as lying at the origin τ=0\tau=0 of a quantum wire. The quanta move along the wire at the speed of light, cc, and the parameter τ\tau can be thought of as x/cx/c which is the time for quanta at a distance xx away to reach the system. Better still τ\tau is the time at which this part of the field passes through the system. The process B(t)=0tb(τ)𝑑τB(t)=\int_{0}^{t}b(\tau)d\tau is the operator describing the annihilation of quanta passing through the system at some stage over the time-interval [0,t][0,t].

Fix a system Hilbert space, 𝔥0\mathfrak{h}_{0}, called the initial space. A quantum stochastic process is a family of operators, {X(t):t0}\{X(t):t\geq 0\}, acting on 𝔥0𝔉[0,)\mathfrak{h}_{0}\otimes\mathfrak{F}_{[0,\infty)}.

The process is adapted if, for each tt, the operator X(t)X(t) acts trivially on the future environment factor 𝔉[t,)\mathfrak{F}_{[t,\infty)}.

QSDEs with adapted coefficients were originally introduced by Hudson & Parthasarathy in 1984. Let {Xαβ(t):t0}\{X_{\alpha\beta}(t):t\geq 0\} be four adapted quantum stochastic processes defined for α,β{0,1}\alpha,\beta\in\{0,1\}. We then consider the QSDE

\dot{X}(t)=b(t)^{\ast}X_{11}(t)b(t)+b(t)^{\ast}X_{10}(t)+X_{01}(t)b(t)+X_{00}(t), (111)

with initial condition X(0)=X0IX(0)=X_{0}\otimes I. To understand this we take matrix elements between states of the form |ϕexp(α)|\phi\otimes\exp(\alpha)\rangle and use the eigen-relation (107) to get the integrated form

\langle\phi\otimes\exp(\alpha)|X(t)|\psi\otimes\exp(\beta)\rangle=\langle\phi|X_{0}|\psi\rangle\,\langle\exp(\alpha)|\exp(\beta)\rangle (112)
+\int_{0}^{t}\alpha(\tau)^{\ast}\,\langle\phi\otimes\exp(\alpha)|X_{11}(\tau)|\psi\otimes\exp(\beta)\rangle\,\beta(\tau)\,d\tau
+\int_{0}^{t}\alpha(\tau)^{\ast}\,\langle\phi\otimes\exp(\alpha)|X_{10}(\tau)|\psi\otimes\exp(\beta)\rangle\,d\tau
+\int_{0}^{t}\langle\phi\otimes\exp(\alpha)|X_{01}(\tau)|\psi\otimes\exp(\beta)\rangle\,\beta(\tau)\,d\tau
+\int_{0}^{t}\langle\phi\otimes\exp(\alpha)|X_{00}(\tau)|\psi\otimes\exp(\beta)\rangle\,d\tau. (113)

Processes obtained this way are called quantum stochastic integrals.

The approach of Hudson and Parthasarathy is actually different [16, 17]. They arrive at the process defined by (111) by building the analogue of the Ito theory for stochastic integration: that is, they show conditions under which

dX(t)=X_{11}(t)\otimes d\Lambda(t)+X_{10}(t)\otimes dB(t)^{\ast}+X_{01}(t)\otimes dB(t)+X_{00}(t)\otimes dt, (114)

makes sense as a limit process where all the increments are future pointing. That is ΔΛΛ(t+Δt)Λ(t)\Delta\Lambda\equiv\Lambda(t+\Delta t)-\Lambda(t) with Δt>0\Delta t>0, etc.

One has, for instance,

ϕexp(α)|X00(t)ΔB(t)|ψexp(β)\displaystyle\langle\phi\otimes\exp(\alpha)|X_{00}(t)\otimes\Delta B(t)|\psi\otimes\exp(\beta)\rangle
=(tt+Δtβ(τ)𝑑τ)×ϕexp(α)|X00(t)I|ψexp(β),\displaystyle\quad=\bigg{(}\int_{t}^{t+\Delta t}\beta(\tau)d\tau\bigg{)}\times\langle\phi\otimes\exp(\alpha)|X_{00}(t)\otimes I|\psi\otimes\exp(\beta)\rangle, (115)

etc., so the two approaches coincide.

Quantum Ito Rules

It is clear from (111) that this calculus is Wick ordered - note that the creators b(t)b(t)^{\ast} all appear to the left, and all the annihilators, b(t)b(t), to the right, of the coefficients. The product of two Wick ordered expressions is not immediately Wick ordered, and one must use the singular commutation relations to achieve this. This results in an additional term which corresponds to a quantum Ito correction.

We have

dB(t)dB(t)=dB(t)^{\ast}dB(t)=dB(t)^{\ast}dB(t)^{\ast}=0. (116)

To see this, let XtX_{t} be adapted; then

exp(α)|XtdB(t)dB(t)|exp(β)=α(t)exp(α)|Xtexp(β)β(t)(dt)2\displaystyle\langle\exp(\alpha)|X_{t}dB(t)^{\ast}dB(t)|\exp(\beta)\rangle=\alpha(t)^{\ast}\langle\exp(\alpha)|X_{t}\exp(\beta)\rangle\beta(t)\,(dt)^{2} (117)

As we have a square of dtdt we can neglect such terms.

However, we have

[B(t)B(s),B(t)B(s)]=ts,(t>s)\displaystyle[B(t)-B(s),B(t)^{\ast}-B(s)^{\ast}]=t-s,\qquad(t>s) (118)

and so ΔBΔB=ΔBΔB+Δt\Delta B\,\Delta B^{\ast}=\Delta B^{\ast}\Delta B+\Delta t. The infinitesimal form of this is then

dB(t)dB(t)=dt.\displaystyle dB(t)dB(t)^{\ast}=dt. (119)

This is strikingly similar to the classical rule for increments of the Wiener process!

In fact, we have the following quantum Ito table

×      dt    dB    dB∗    dΛ
dt     0     0     0      0
dB     0     0     dt     dB
dB∗    0     0     0      0
dΛ     0     0     dB∗    dΛ
(125)

Each of the non-zero terms arises from multiplying two processes that are not in Wick order.

For a pair of quantum stochastic integrals, we have the following quantum Ito product formula

d(XY)=(dX)\,Y+X\,(dY)+(dX)(dY). (126)

Unlike the classical version, the order of XX and YY here is crucial.

Some Classical Processes On Fock Space

The process Q(t)=B(t)+B(t)Q(t)=B(t)+B(t)^{\ast} is self-commuting, that is [Q(t),Q(s)]=0[Q(t),Q(s)]=0 for all t,st,s, and has the distribution of a Wiener process in the vacuum state:

Q˙(t)\displaystyle\langle\dot{Q}(t)\rangle =\displaystyle= Ω|[b(t)+b(t)]Ω=0,\displaystyle\langle\Omega|[b(t)+b(t)^{\ast}]\Omega\rangle=0, (127)
Q˙(t)Q˙(s)\displaystyle\langle\dot{Q}(t)\dot{Q}(s)\rangle =\displaystyle= Ω|b(t)b(s)Ω=δ(ts).\displaystyle\langle\Omega|b(t)b^{\ast}(s)\Omega\rangle=\delta(t-s). (128)

The same applies to P(t)=1i[B(t)B(t)]P(t)=\frac{1}{i}[B(t)-B(t)^{\ast}], but

[Q(t),P(s)]=2imin(t,s).\displaystyle[Q(t),P(s)]=2i\,\text{min}(t,s). (129)

So we have two non-commuting Wiener processes in Fock space. We refer to QQ and PP as canonically conjugate quadrature processes.

One sees that, for instance,

dQdQ=dBdB=dt.\displaystyle dQdQ=dBdB^{\ast}=dt. (130)

We also obtain a Poisson process by the prescription

N(t)=Λ(t)+νB(t)+νB(t)+νt.\displaystyle N(t)=\Lambda(t)+\sqrt{\nu}B^{\ast}(t)+\sqrt{\nu}B(t)+\nu t. (131)

One readily checks from the quantum Ito table that dN\,dN=dN.
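The table is small enough to implement symbolically: write a generic increment as a combination a dt + b dB + c dB∗ + d dΛ and multiply componentwise through the table. The sketch below (our own construction) confirms dQ dQ = dt and the identity dN dN = dN just quoted:

import numpy as np

# Basis order: (dt, dB, dB*, dL).  TABLE[(u, v)] expresses the product u v
# in the same basis, read off from the quantum Ito table (125).
Z = (0, 0, 0, 0)
TABLE = {
    ('dt', 'dt'): Z, ('dt', 'dB'): Z, ('dt', 'dB*'): Z, ('dt', 'dL'): Z,
    ('dB', 'dt'): Z, ('dB', 'dB'): Z,
    ('dB', 'dB*'): (1, 0, 0, 0),             # dB dB* = dt
    ('dB', 'dL'):  (0, 1, 0, 0),             # dB dLambda = dB
    ('dB*', 'dt'): Z, ('dB*', 'dB'): Z, ('dB*', 'dB*'): Z, ('dB*', 'dL'): Z,
    ('dL', 'dt'): Z, ('dL', 'dB'): Z,
    ('dL', 'dB*'): (0, 0, 1, 0),             # dLambda dB* = dB*
    ('dL', 'dL'):  (0, 0, 0, 1),             # dLambda dLambda = dLambda
}
NAMES = ['dt', 'dB', 'dB*', 'dL']

def ito_product(u, v):
    """u = (a, b, c, d) stands for a dt + b dB + c dB* + d dLambda."""
    out = np.zeros(4)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out += ui * vj * np.array(TABLE[(NAMES[i], NAMES[j])], float)
    return out

dQ = (0, 1, 1, 0)                             # dQ = dB + dB*
print(ito_product(dQ, dQ))                    # [1, 0, 0, 0], i.e. dt

nu = 2.0                                      # dN = dLambda + sqrt(nu)(dB + dB*) + nu dt
dN = (nu, np.sqrt(nu), np.sqrt(nu), 1)
print(np.allclose(ito_product(dN, dN), dN))   # True: dN dN = dN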

Emission-Absorption Interactions

Let us consider a singular Hamiltonian of the form

Υ(t)=HI+iLb(t)iLb(t).\displaystyle\Upsilon(t)=H\otimes I+iL\otimes b(t)^{\ast}-iL^{\ast}\otimes b(t). (132)

We will try to realize the solution to the Schrödinger equation

\dot{U}(t)=-i\Upsilon(t)\,U(t),\qquad U(0)=I, (133)

as a unitary quantum stochastic integral process.

Let us first remark that the annihilator part of (132) will appear out of Wick order when we consider (133). The standard approach in quantum field theory is to develop the unitary U(t)U(t) as a Dyson series expansion - often re-interpreted as a time order-exponential:

U(t) = I-i\int_{0}^{t}\Upsilon(\tau)U(\tau)d\tau (134)
= I-i\int_{0}^{t}d\tau_{1}\,\Upsilon(\tau_{1})+(-i)^{2}\int_{0}^{t}d\tau_{2}\int_{0}^{\tau_{2}}d\tau_{1}\,\Upsilon(\tau_{2})\Upsilon(\tau_{1})+\cdots
= \vec{T}e^{-i\int_{0}^{t}\Upsilon(\tau)d\tau}.

In our case the field terms - the quantum white noises - are linear; however, we have the problem that they come multiplied by the system operators LL and LL^{\ast}, which do not commute with each other, and do not necessarily commute with HH either.

Fortunately we can do the Wick ordering in one fell swoop rather than having to go down each term of the Dyson series. We have

[b(t),U(t)]\displaystyle\left[b\left(t\right),U\left(t\right)\right] =\displaystyle= [b(t),Ii0tΥ(τ)U(τ)𝑑τ]=i0t[b(t),Υ(τ)]U(τ)𝑑τ\displaystyle\left[b\left(t\right),I-i\int_{0}^{t}\Upsilon\left(\tau\right)U\left(\tau\right)d\tau\right]=-i\int_{0}^{t}\left[b\left(t\right),\Upsilon\left(\tau\right)\right]U\left(\tau\right)d\tau (135)
=\displaystyle= 0t[b(t),Lb(τ)]U(τ)𝑑τ\displaystyle\int_{0}^{t}\left[b\left(t\right),Lb\left(\tau\right)^{\ast}\right]U\left(\tau\right)d\tau
=\displaystyle= L0tδ(tτ)U(τ)𝑑τ=12LU(t),\displaystyle L\int_{0}^{t}\delta\left(t-\tau\right)U\left(\tau\right)d\tau=\frac{1}{2}LU\left(t\right),

where we dropped the [b(t),U(τ)][b(t),U(\tau)] term as this should vanish for t>τt>\tau, and took half the weight of the δ\delta-function due to the upper limit tt of the integration. Hence, we get

b(t)U(t)=U(t)b(t)+12LU(t).\displaystyle b\left(t\right)U\left(t\right)=U\left(t\right)b\left(t\right)+\frac{1}{2}LU\left(t\right). (136)

Plugging this into the equation (133), we get

\dot{U}\left(t\right) = b\left(t\right)^{\ast}LU\left(t\right)-L^{\ast}b\left(t\right)U\left(t\right)-iHU\left(t\right) (137)
=\displaystyle= b(t)LU(t)LU(t)b(t)(12LL+iH)U(t).\displaystyle b\left(t\right)^{\ast}LU\left(t\right)-L^{\ast}U\left(t\right)b\left(t\right)-\left(\frac{1}{2}L^{\ast}L+iH\right)U\left(t\right).

which is now Wick ordered. We can interpret this as the Hudson-Parthasarathy equation

dU(t)={LdB(t)LdB(t)(12LL+iH)dt}U(t).\displaystyle dU\left(t\right)=\left\{L\otimes dB\left(t\right)^{\ast}-L^{\ast}\otimes dB\left(t\right)-\left(\frac{1}{2}L^{\ast}L+iH\right)\otimes dt\right\}U\left(t\right). (138)

The corresponding Heisenberg equation for jt(X)=U(t)[XI]U(t)j_{t}(X)=U(t)^{\ast}[X\otimes I]U(t) will be

djt(X)\displaystyle dj_{t}\left(X\right) =\displaystyle= dU(t)[XI]U(t)+U(t)[XI]dU(t)\displaystyle dU\left(t\right)^{\ast}\left[X\otimes I\right]U\left(t\right)+U\left(t\right)^{\ast}\left[X\otimes I\right]dU\left(t\right) (139)
+dU(t)[XI]dU(t)\displaystyle+dU\left(t\right)^{\ast}\left[X\otimes I\right]dU\left(t\right)
=\displaystyle= jt(X)dt+jt([X,L])dB(t)+jt([L,X])dB(t)\displaystyle j_{t}\left(\mathcal{L}X\right)\otimes dt+j_{t}\left(\left[X,L\right]\right)\otimes dB\left(t\right)^{\ast}+j_{t}\left(\left[L^{\ast},X\right]\right)\otimes dB\left(t\right)

where

X\displaystyle\mathcal{L}X =\displaystyle= X(12LL+iH)(12LLiH)X+LXL\displaystyle-X\left(\frac{1}{2}L^{\ast}L+iH\right)-\left(\frac{1}{2}L^{\ast}L-iH\right)X+L^{\ast}XL (140)
=\displaystyle= 12[L,X]L+12L[X,L]i[X,H].\displaystyle\frac{1}{2}\left[L^{\ast},X\right]L+\frac{1}{2}L^{\ast}\left[X,L\right]-i\left[X,H\right].

We note that we obtain the typical Lindblad form for the generator.

Scattering Interactions

We mention that we could also treat a Hamiltonian with only scattering terms. Let us set Υ(t)=Eb(t)b(t)\Upsilon\left(t\right)=E\otimes b\left(t\right)^{\ast}b\left(t\right), with EE self-adjoint. The same sort of argument leads to

[b(t),U(t)]=iE0t[b(t),b(τ)]b(τ)U(τ)𝑑τ=i2Eb(t)U(t),\displaystyle\left[b\left(t\right),U\left(t\right)\right]=-iE\int_{0}^{t}\left[b\left(t\right),b\left(\tau\right)^{\ast}\right]b\left(\tau\right)U\left(\tau\right)d\tau=-\frac{i}{2}Eb\left(t\right)U\left(t\right), (141)

which can be rearranged to give

b\left(t\right)U\left(t\right)=\frac{1}{I+\frac{i}{2}E}U\left(t\right)b\left(t\right). (142)

So the Wick ordered form is

\dot{U}\left(t\right)=-iE\,b\left(t\right)^{\ast}b\left(t\right)U\left(t\right)=\frac{-iE}{I+\frac{i}{2}E}\,b\left(t\right)^{\ast}U\left(t\right)b\left(t\right) (143)

or in quantum Ito form

dU\left(t\right)=\left(S-I\right)\otimes d\Lambda\left(t\right)\,U\left(t\right),\qquad\left(S=\frac{I-\frac{i}{2}E}{I+\frac{i}{2}E}\text{, unitary!}\right). (144)

The Heisenberg equation here is djt(X)=jt(SXSX)dΛ(t)dj_{t}\left(X\right)=j_{t}\left(S^{\ast}XS-X\right)\otimes d\Lambda\left(t\right).

This is all comparable to the classical Poisson process driven evolution involving unitary kicks.
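Unitarity of the Cayley transform S in (144) is immediate to confirm numerically for any self-adjoint E (the random 3×3 example below is our own):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
E = A + A.conj().T                          # an arbitrary self-adjoint E
I3 = np.eye(3)

# Cayley transform from (144): S = (I - iE/2)(I + iE/2)^{-1}
S = (I3 - 0.5j * E) @ np.linalg.inv(I3 + 0.5j * E)

print(np.allclose(S.conj().T @ S, I3))      # True: S is unitary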

The SLH Formalism

We now outline the so-called SLH Formalism - named after the scattering matrix operator SS, the coupling vector operator LL and Hamiltonian HH appearing in these Markov models [18]-[20]. The examples considered up to now used only one species of quanta. We could in fact have nn channels, based on nn quantum white noises:

[bj(t),bk(s)]=δjkδ(ts).\displaystyle[b_{j}(t),b^{\ast}_{k}(s)]=\delta_{jk}\,\delta(t-s). (145)

The most general form of a unitary process with fixed coefficients may be described as follows: we have a Hamiltonian H=HH=H^{\ast}, a column vector of coupling/collapse operators

L=[L1Ln],\displaystyle L=\left[\begin{array}[]{c}L_{1}\\ \vdots\\ L_{n}\end{array}\right], (149)

and a matrix of operators

S=[S11S1nSn1Snn],S1=S.\displaystyle S=\left[\begin{array}[]{ccc}S_{11}&\cdots&S_{1n}\\ \vdots&\ddots&\vdots\\ S_{n1}&\cdots&S_{nn}\end{array}\right],\qquad S^{-1}=S^{\ast}. (153)

For each such triple (S,L,H)(S,L,H) we have the QSDE

dU(t)\displaystyle dU(t) =\displaystyle= {jk(SjkδjkI)dΛjk(t)+jLjdBj(t)\displaystyle\bigg{\{}\sum_{jk}(S_{jk}-\delta_{jk}I)\otimes d\Lambda_{jk}(t)+\sum_{j}L_{j}\otimes dB_{j}^{\ast}(t) (154)
jkLjSjkdBk(t)(12kLkLk+iH)dt}U(t)\displaystyle-\sum_{jk}L_{j}^{\ast}S_{jk}\otimes dB_{k}(t)-(\frac{1}{2}\sum_{k}L_{k}^{\ast}L_{k}+iH)\otimes dt\bigg{\}}\,U(t)

which has, for initial condition U(0)=IU(0)=I, a solution which is a unitary adapted quantum stochastic process. The emission-absorption case is the n=1n=1 model with no scattering (S=IS=I). Likewise, pure scattering corresponds to H=0H=0 and L=0L=0.

Heisenberg-Langevin Dynamics

System observables evolve according to the Heisenberg-Langevin equation

dj_{t}(X) = \sum_{jk}j_{t}\Big(\sum_{l}S^{\ast}_{lj}XS_{lk}-\delta_{jk}X\Big)\otimes d\Lambda_{jk}(t)+\sum_{jl}j_{t}(S_{lj}^{\ast}[L_{l},X])\otimes dB_{j}(t)^{\ast} (155)
+\sum_{lk}j_{t}([X,L^{\ast}_{l}]S_{lk})\otimes dB_{k}(t)+j_{t}(\mathscr{L}X)\otimes dt,

where the generator is the traditional Lindblad form

X=12kLk[X,Lk]+12k[Lk,X]Lki[X,H].\displaystyle\mathscr{L}X=\frac{1}{2}\sum_{k}L^{\ast}_{k}[X,L_{k}]+\frac{1}{2}\sum_{k}[L^{\ast}_{k},X]L_{k}-i[X,H]. (156)
Quantum Outputs

The output fields are defined by

Bkout(t)=U(t)[IBk(t)]U(t).\displaystyle B^{\text{out}}_{k}(t)=U(t)^{\ast}[I\otimes B_{k}(t)]U(t). (157)

From the quantum Ito calculus we find that

dB^{\text{out}}_{j}(t)=\sum_{k}j_{t}(S_{jk})\otimes dB_{k}(t)+j_{t}(L_{j})\otimes dt, (158)

or, perhaps more suggestively, in quantum white noise language [21],

b^{\text{out}}_{j}(t)=\sum_{k}j_{t}(S_{jk})\otimes b_{k}(t)+j_{t}(L_{j})\otimes I. (159)

4.4 Quantum Filtering

We now set up the quantum filtering problem. For simplicity, we will take n=1n=1 and set S=IS=I so that we have a simple emission-absorption interaction. We will also consider the situation where we measure the QQ-quadrature of the output.

The initial state is taken to be |ψ0|Ω|\psi_{0}\rangle\otimes|\Omega\rangle, and in the Heisenberg picture this is fixed for all time.

The analogue of the stochastic dynamical equation considered in the classical filtering problem is the Heisenberg-Langevin equation

djt(X)=jt(X)dt+jt([X,L])dB(t)+jt([L,X])dB(t)\displaystyle dj_{t}\left(X\right)=j_{t}\left(\mathcal{L}X\right)\otimes dt+j_{t}\left(\left[X,L\right]\right)\otimes dB\left(t\right)^{\ast}+j_{t}\left(\left[L^{\ast},X\right]\right)\otimes dB\left(t\right) (160)

where X=12[L,X]L+12L[X,L]i[X,H]\mathcal{L}X=\frac{1}{2}\left[L^{\ast},X\right]L+\frac{1}{2}L^{\ast}\left[X,L\right]-i\left[X,H\right].

Some care is needed in specifying what exactly we measure: we should really work in the Heisenberg picture for clarity. The QQ-quadrature of the input field is Q(t)=B(t)+B(t)Q\left(t\right)=B\left(t\right)+B\left(t\right)^{\ast} which we have already seen is a Wiener process for the vacuum state of the field. Of course this is not what we measure - we measure the output quadrature!

Set

Yin(t)=IQ(t).\displaystyle Y^{\text{in}}\left(t\right)=I\otimes Q\left(t\right). (161)

As indicated in our discussion on von Neumann’s measurement model, what we actually measure is

Yout(t)=U(t)Yin(t)U(t)=Bout(t)+Bout(t).\displaystyle Y^{\text{out}}(t)=U(t)^{\ast}Y^{\text{in}}(t)U(t)=B^{\text{out}}(t)+B^{\text{out}}(t)^{\ast}. (162)

The differential form of this is

dYout(t)=dYin(t)+jt(L+L)dt.\displaystyle dY^{\text{out}}(t)=dY^{\text{in}}(t)+j_{t}(L+L^{\ast})dt. (163)

Note that

dYin(t)dYin(t)=dt=dYout(t)dYout(t).\displaystyle dY^{\text{in}}\left(t\right)dY^{\text{in}}\left(t\right)=dt=dY^{\text{out}}\left(t\right)dY^{\text{out}}\left(t\right). (164)

The dynamical noise is generally a quantum noise and can only be considered classical in very special circumstances, while the observational noise is just its QQ-quadrature which can hardly be treated as independent!

In complete contrast to the classical filtering problem we considered earlier, we have no paths for the system - just evolving observables of the system. What is more, these observables do not typically commute amongst themselves, or indeed with the measured process.

We can only apply Bayes Theorem in the situation where the quantities involved have a joint probability distribution, and in the quantum world this requires them to be compatible. At this stage it may seem like a miracle that we have any theory of filtering in the quantum world. However, let us take stock of what we have.

What Commutes With What?

For fixed s0s\geq 0, let U(t,s)U(t,s) be the solution to the QSDE (154) in time variable tst\geq s with U(s,s)=IU(s,s)=I. Formally, we have

U(t,s)=TeistΥ(τ)𝑑τ\displaystyle U\left(t,s\right)=\vec{T}e^{-i\int_{s}^{t}\Upsilon\left(\tau\right)d\tau} (165)

which is the unitary which couples the system to the part of the field that enters over the time sτts\leq\tau\leq t. In terms of our previous definition, we have U(t)=U(t,0)U(t)=U(t,0) and we have the property

U(t)=U(t,s)U(s),(t>s>0).\displaystyle U\left(t\right)=U\left(t,s\right)U\left(s\right),\qquad\left(t>s>0\right). (166)

In the Heisenberg picture, the observables evolve

jt(X)\displaystyle j_{t}\left(X\right) =\displaystyle= U(t)[XI]U(t),\displaystyle U\left(t\right)^{\ast}\left[X\otimes I\right]U\left(t\right), (167)
Yout(t)\displaystyle Y^{\text{out}}\left(t\right) =\displaystyle= U(t)[IQ(t)]U(t).\displaystyle U\left(t\right)^{\ast}\left[I\otimes Q\left(t\right)\right]U\left(t\right). (168)

We know that the input quadrature is self-commuting, but what about the output one? A key identity here is that

Y^{\text{out}}\left(s\right)=U\left(t\right)^{\ast}Y^{\text{in}}\left(s\right)U\left(t\right),\qquad\left(t>s\right), (169)

which follows from the fact that [Yin(s),U(t,s)]=0\left[Y^{\text{in}}\left(s\right),U\left(t,s\right)\right]=0.


From this, we see that the process YoutY^{\text{out}} is also commutative since

[Yout(t),Yout(s)]=U(t)[Yin(t),Yin(s)]U(t)=0,(t>s).\displaystyle\left[Y^{\text{out}}\left(t\right),Y^{\text{out}}\left(s\right)\right]=U\left(t\right)^{\ast}\left[Y^{\text{in}}\left(t\right),Y^{\text{in}}\left(s\right)\right]U\left(t\right)=0,\quad\left(t>s\right). (170)

If this were not the case then subsequent measurements of the process YoutY^{\text{out}} would invalidate (disturb?) earlier ones. In fancier parlance, we say that the process is not self-demolishing - that is, all parts are compatible with each other.

A similar line of argument shows that

\left[j_{t}\left(X\right),Y^{\text{out}}\left(s\right)\right]=U\left(t\right)^{\ast}\left[X\otimes I,I\otimes Q\left(s\right)\right]U\left(t\right)=0,\quad\left(t>s\right). (171)

Therefore, we have a joint probability for jt(X)j_{t}\left(X\right) and the continuous collection of observables {Yout(τ):0τt}\left\{Y^{\text{out}}\left(\tau\right):0\leq\tau\leq t\right\}, and so we can use Bayes Theorem to estimate jt(X)j_{t}(X) for any XX using the past observations. Following V.P. Belavkin, we refer to this as the non-demolition principle.

The Conditioned State

In the Schrödinger picture, the state at time t0t\geq 0 is |Ψt=U(t)|ϕΩ|\Psi_{t}\rangle=U\left(t\right)|\phi\otimes\Omega\rangle, so

d|Ψt\displaystyle d|\Psi_{t}\rangle =\displaystyle= (12LL+iH)|Ψtdt+LdB(t)|ΨtLdB(t)|Ψt\displaystyle-\left(\frac{1}{2}L^{\ast}L+iH\right)|\Psi_{t}\rangle dt+LdB\left(t\right)^{\ast}|\Psi_{t}\rangle-L^{\ast}dB\left(t\right)|\Psi_{t}\rangle (172)
=\displaystyle= (12LL+iH)|Ψtdt+LdB(t)|Ψt+LdB(t)|Ψt\displaystyle-\left(\frac{1}{2}L^{\ast}L+iH\right)|\Psi_{t}\rangle dt+LdB\left(t\right)^{\ast}|\Psi_{t}\rangle+LdB\left(t\right)|\Psi_{t}\rangle
=\displaystyle= (12LL+iH)|Ψtdt+LdYin(t)|Ψt.\displaystyle-\left(\frac{1}{2}L^{\ast}L+iH\right)|\Psi_{t}\rangle dt+LdY^{\text{in}}(t)|\Psi_{t}\rangle.

Here we have used a profound trick due to A.S. Holevo. The differential dB(t)dB(t) acting on |Ψt|\Psi_{t}\rangle yields zero since it is future pointing, and so only affects the future part of the field which, by adaptedness, is in the vacuum state. To get from the first line to the second line, we remove and add a term that is technically zero. In its reconstituted form, we obtain the QQ-quadrature of the input. The result is that we obtain an expression for the state |Ψt|\Psi_{t}\rangle which is “diagonal”  in the input quadrature - our terminology here is poor (we are talking about a state, not an observable!) but hopefully it wakes physicists up to what is going on.

The above equation is equivalent to the SDE in the system Hilbert space

d|χt=(12LL+iH)|χtdt+L|χtdyt\displaystyle d|\chi_{t}\rangle=-\left(\frac{1}{2}L^{\ast}L+iH\right)|\chi_{t}\rangle dt+L|\chi_{t}\rangle dy_{t} (173)

where 𝐲\mathbf{y} is a sample path - or better still, eigen-path - of the quantum stochastic process YinY^{\text{in}}.

We refer to (173) as the Belavkin-Zakai equation.
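Being an ordinary vector-valued SDE, (173) can be integrated by the Euler-Maruyama scheme, with the driving path y sampled as a Wiener process under the reference measure. A sketch for a two-level system follows (the choices H = sigma_z, L = sqrt(gamma) sigma_- are our own); note how the norm of |χt⟩ wanders away from unity, a point we return to below.

import numpy as np

# Illustrative qubit (our own choices): H = sigma_z, L = sqrt(gamma) sigma_-
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
gamma = 1.0
H, L = sz, np.sqrt(gamma) * sm
K = -(0.5 * L.conj().T @ L + 1j * H)             # drift operator in (173)

dt, nsteps = 1e-4, 20_000
rng = np.random.default_rng(3)

chi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
for _ in range(nsteps):
    dy = np.sqrt(dt) * rng.standard_normal()     # Wiener increment (reference measure)
    chi = chi + (K @ chi) * dt + (L @ chi) * dy  # Euler-Maruyama step for (173)

print("||chi_t||^2 =", np.vdot(chi, chi).real)   # not equal to 1 in general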

The Quantum Filter

Let us begin with a useful computation:

ϕΩ|jt(X)F[Y[0,t]out]|ϕΩ\displaystyle\langle\phi\otimes\Omega|j_{t}\left(X\right)F\left[Y_{\left[0,t\right]}^{\text{out}}\right]|\phi\otimes\Omega\rangle =\displaystyle= ϕΩ|U(t)(XF[Y[0,t]in])U(t)|ϕΩ\displaystyle\langle\phi\otimes\Omega|U(t)^{\ast}\big{(}X\otimes F\left[Y_{\left[0,t\right]}^{\text{in}}\right]\big{)}U(t)|\phi\otimes\Omega\rangle (174)
=\displaystyle= Ψt|XF[Y[0,t]in]|Ψt\displaystyle\langle\Psi_{t}|X\otimes F\left[Y_{\left[0,t\right]}^{\text{in}}\right]|\Psi_{t}\rangle
=\int\langle\chi_{t}(\mathbf{y})|X|\chi_{t}(\mathbf{y})\rangle\,F\left[\mathbf{y}\right]\,\mathbb{P}_{\text{Wiener}}[d\mathbf{y}].

A few comments are in order here. The operator jt(X)j_{t}\left(X\right) will commute with any functional of the past measurements - here F[Y[0,t]out]F\left[Y_{\left[0,t\right]}^{\text{out}}\right]. The first equality pulls things back in terms of the unitary U(t)U(t). The second is just the equivalence between the Schrödinger and Heisenberg pictures. The final one just uses the equivalent form (173): note that the paths of the input quadrature get their correct weighting as Wiener processes.

Setting X=IX=I in (174), we get

\langle\phi\otimes\Omega|F\left[Y_{\left[0,t\right]}^{\text{out}}\right]|\phi\otimes\Omega\rangle = \int\langle\chi_{t}(\mathbf{y})|\chi_{t}(\mathbf{y})\rangle\,F\left[\mathbf{y}\right]\,\mathbb{P}_{\text{Wiener}}[d\mathbf{y}]. (175)

So the probability of the measured paths is

[d𝐲]=χt(𝐲)|χt(𝐲)Wiener[d𝐲].\displaystyle\mathbb{Q}[d\mathbf{y}]=\langle\chi_{t}(\mathbf{y})|\chi_{t}(\mathbf{y})\rangle\,\mathbb{P}_{\text{Wiener}}[d\mathbf{y}]. (176)

Now this last equation deserves some comment! The vector |Ψt|\Psi_{t}\rangle, which lives in the system tensor Fock space, is properly normalized, but its corresponding form |χt|\chi_{t}\rangle is not! The latter is a stochastic process taking values in the system Hilbert space and is adapted to the input quadrature. However, we never said that |χt|\chi_{t}\rangle had to be normalized too, and indeed its lack of normalization follows from our “diagonalization”  procedure. In fact, if |χt|\chi_{t}\rangle were normalized then the output measure would follow a Wiener distribution and so we would be measuring white noise!

From (174) again, we can deduce the filter: we get (using the arbitrariness of the functional FF)

𝔈t(X)=χt(𝐲)|X|χt(𝐲)χt(𝐲)|χt(𝐲).\displaystyle\mathfrak{E}_{t}(X)=\frac{\langle\chi_{t}(\mathbf{y})|X|\chi_{t}(\mathbf{y})\rangle}{\langle\chi_{t}(\mathbf{y})|\chi_{t}(\mathbf{y})\rangle}. (177)

This has a remarkable similarity to (70). Moreover, using the Ito calculus we see that

dχt(𝐲)|X|χt(𝐲)\displaystyle d\langle\chi_{t}(\mathbf{y})|X|\chi_{t}(\mathbf{y})\rangle =\displaystyle= χt(𝐲)|X|χt(𝐲)dt\displaystyle\langle\chi_{t}(\mathbf{y})|\mathcal{L}X|\chi_{t}(\mathbf{y})\rangle dt (178)
+χt(𝐲)|(XL+LX)|χt(𝐲)dy(t).\displaystyle+\langle\chi_{t}(\mathbf{y})|\big{(}XL+L^{\ast}X\big{)}|\chi_{t}(\mathbf{y})\rangle\,dy(t).

This is the quantum analogue of the Duncan-Mortensen-Zakai equation.

Only a small amount of work is left in order to derive the filter equation. We first observe that the normalization (set X=IX=I) gives

dχt(𝐲)|χt(𝐲)=χt(𝐲)|(L+L)|χt(𝐲)dy(t).\displaystyle d\langle\chi_{t}(\mathbf{y})|\chi_{t}(\mathbf{y})\rangle=\langle\chi_{t}(\mathbf{y})|\big{(}L+L^{\ast}\big{)}|\chi_{t}(\mathbf{y})\rangle\,dy(t). (179)

Using the Ito calculus, it is then routine to show that the quantum filter is

d𝔈t(X)=𝔈t(X)dt+{𝔈t(XL+LX)𝔈t(X)𝔈t(L+L)}dI(t)\displaystyle d\mathfrak{E}_{t}(X)=\mathfrak{E}_{t}(\mathcal{L}X)\,dt+\big{\{}\mathfrak{E}_{t}(XL+L^{\ast}X)-\mathfrak{E}_{t}(X)\mathfrak{E}_{t}(L+L^{\ast})\big{\}}dI(t) (180)

where the innovations are defined by

dI(t)=dYout(t)𝔈t(L+L)dt.\displaystyle dI(t)=dY^{\text{out}}(t)-\mathfrak{E}_{t}(L+L^{\ast})\,dt. (181)

Again, the innovations have the statistics of a Wiener process. As in the classical case, the innovations give the difference between what we observe next, dYout(t)dY^{\text{out}}(t), and what we would have expected based on our observations up to that point, 𝔈t(L+L)dt\mathfrak{E}_{t}(L+L^{\ast})\,dt. The fact that the innovations are a Wiener process is a reflection of the efficiency of the filter - after extracting as much information as we can out of the observations, we are left with just white noise.
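To close, the filter (180)-(181) can be simulated directly. Passing to the Schrödinger picture (a standard rewriting, not spelled out in the text), we track a conditional density matrix πt with 𝔈t(X) = tr(πtX); the adjoint of (180) then reads dπ = ℒ∗π dt + (Lπ + πL∗ − tr((L+L∗)π)π) dI. Since the innovations are a Wiener process, we may drive the filter directly with Gaussian increments. The qubit operators below are our own illustrative choices:

import numpy as np

# Illustrative qubit (our own choices): H = sigma_z, L = sqrt(gamma) sigma_-
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
gamma = 1.0
H, L = sz, np.sqrt(gamma) * sm

def lstar(rho):
    """Adjoint of the generator appearing in (180): the master-equation RHS."""
    return (-1j * (H @ rho - rho @ H) + L @ rho @ L.conj().T
            - 0.5 * (L.conj().T @ L @ rho + rho @ L.conj().T @ L))

dt, nsteps = 1e-4, 20_000
rng = np.random.default_rng(4)

rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # conditional state pi_t
for _ in range(nsteps):
    dI = np.sqrt(dt) * rng.standard_normal()     # innovations: a Wiener process
    m = np.trace((L + L.conj().T) @ rho).real    # E_t(L + L*)
    rho = rho + lstar(rho) * dt + (L @ rho + rho @ L.conj().T - m * rho) * dI
    rho = 0.5 * (rho + rho.conj().T)             # guard Hermiticity against round-off

print("E_t(sigma_z) =", np.trace(sz @ rho).real)
# The measurement record itself is recovered as dY_out = m dt + dI, cf. (181).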

Acknowledgements

I would like to thank the staff at CIRM, Luminy (Marseille), and at the Institut Henri Poincaré (Paris) for their kind support during the 2018 Trimester on Measurement and Control of Quantum Systems where this work was begun. I am also grateful to the other organizers, Pierre Rouchon and Denis Bernard, for valuable comments during the writing of these notes.

References

  • [1] V.P. Belavkin, (1989), Non-Demolition Measurements, Nonlinear Filtering and Dynamic Programming of Quantum Stochastic Processes, Lecture Notes in Control and Information Sciences 121, 245-265, Springer-Verlag, Berlin.
  • [2] A. Barchielli and V. P. Belavkin, (1991), Measurements continuous in time and a posteriori states in quantum mechanics, J. Phys. A: Math. Gen. 24, 1495.
  • [3] A. Barchielli and M. Gregoratti, (2009), Quantum Trajectories and Measurements in Continuous Time - the diffusive case, Springer Berlin Heidelberg.
  • [4] H.M. Wiseman and G.J. Milburn, (2009), Quantum Measurement and Control, Cambridge University Press.
  • [5] D. Gatarek, N. Gisin, (1991), Continuous quantum jumps and infinite-dimensional stochastic equations, Journal of Mathematical Physics 32 (8), 2152-2157.
  • [6] H.J. Carmichael, (1993), Phys. Rev. Lett. 70(15) p.2273.
  • [7] J. Dalibard, Y. Castin, and K. Mølmer, (1992), Wave-function approach to dissipative processes in quantum optics, Phys. Rev. Lett. 68 (5), 580-583.
  • [8] C. Sayrin, I. Dotsenko, et al., (1 September 2011), Real-time quantum feedback prepares and stabilizes photon number states, Nature 477, 73-77.
  • [9] H. Maassen, (1988), Theoretical concepts in quantum probability: quantum Markov processes. Fractals, quasicrystals, chaos, knots and algebraic quantum mechanics (Maratea, 1987), 287-302, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 235, Kluwer Acad. Publ., Dordrecht
  • [10] M. Takesaki, (1972) Conditional Expectations in von Neumann Algebras, J. Func. Anal., 9, 306-321.
  • [11] L. Bouten, R. van Handel and M.R. James, (2007), An introduction to quantum filtering, SIAM Journal on Control and Optimization 46, 2199.
  • [12] L. Bouten, R. van Handel, Quantum filtering: a reference probability approach, arXiv:math-ph/0508006.
  • [13] R. van Handel, Ph.D. Thesis, Filtering, Stability, and Robustness, CalTech, 2006, http://www.princeton.edu/~rvan/thesisf070108.pdf
  • [14] H. Wiseman, (1994), Quantum theory of continuous feedback, Phys. Rev. A, 49(3):2133-2150.
  • [15] P. Rouchon, (August 13 - 21, 2014), Models and Feedback Stabilization of Open Quantum Systems Extended version of the paper attached to an invited conference for the International Congress of Mathematicians in Seoul, arXiv:1407.7810
  • [16] R.L. Hudson and K.R. Parthasarathy, (1984), Quantum Ito’s formula and stochastic evolutions, Commun. Math. Phys. 93, 301.
  • [17] K.R. Parthasarathy, (1992) An Introduction to Quantum Stochastic Calculus, Birkhauser.
  • [18] J. Gough, M.R. James, (2009), Quantum Feedback Networks: Hamiltonian Formulation, Commun. Math. Phys. 287, 1109.
  • [19] J. Gough, M.R. James, (2009), The series product and its application to quantum feedforward and feedback networks, IEEE Trans. on Automatic Control 54, 2530.
  • [20] J. Combes, J. Kerckhoff, M. Sarovar, (2017), The SLH framework for modeling quantum input-output networks, Advances in Physics: X, 2:3, 784-888.
  • [21] C.W. Gardiner and M.J. Collett, (1985), Input and output in damped quantum systems: Quantum stochastic differential equations and the master equation. Phys. Rev. A, 31(6):3761-3774.