Quantum Covariance and Filtering
Abstract
We give a tutorial exposition of the analogue of the filtering equation for quantum systems focusing on the quantum probabilistic framework and developing the ideas from the classical theory. Quantum covariances and conditional expectations on von Neumann algebras play an essential part in the presentation.
keywords:
Quantum probability, quantum filtering, quantum Markovian systems

1 Introduction
Nonlinear filtering theory is a well-developed field of engineering which is used to estimate unknown quantities in the presence of noise. One of the founders of the field was the Soviet mathematician Ruslan Stratonovich who encouraged his student Viacheslav Belavkin to extend the problem to the quantum domain [1]. Classically, estimation works by measuring one or more variables which are dependent on the variables to be estimated, and Bayes Theorem plays an essential role in inferring the unknown variables based on what we measure. Belavkin’s approach uses the theory of quantum stochastic calculus for continuous-in-time homodyne and photon counting measurements. There are several approaches: in the paper of Barchielli and Belavkin [2], the characteristic functional method is used to derive the photon-counting case, with the diffusive case obtained as an appropriate limit. Further details of the many approaches and applications may be found in the books by Barchielli and Gregoratti [3] and by Wiseman and Milburn [4].
However, the proof of Bayes Theorem requires a joint probability distribution for the unknown variables and the measured ones. Once we go to quantum theory, we have to be very careful as incompatible observables do not possess a joint probability distribution - in such cases, applying Bayes Theorem will lead to erroneous results and is the root of many of the paradoxes in the theory.
We will derive the simplest quantum filter. The filter equation itself was originally postulated by Gisin on different grounds of continuous collapse of the wavefunction, but subsequently given a standard filtering interpretation [5]. It also appeared as a way of simulating quantum open systems due to Carmichael [6] and Dalibard, Castin and Mølmer [7]: while this appears as a trick for simulating just the quantum master equation (analogue of the Fokker-Planck equation) by stochastic processes, it is clear that the authors consider an underlying interpretation based on continual measurements. The discrete-time version of the filter also featured in the famous Paris Photon-Box experiment [8].
2 Quantum Probabilistic Setting
We start from the traditional formulation of quantum theory in terms of operators on a separable Hilbert space, . The norm of a linear operator is , and the collection of bounded operators will be denoted by . We will denote the identity operator by 1. The adjoint of will be denoted by .
Our interest will be in von Neumann algebras. These are unital *-algebras of bounded operators that are closed in the weak operator topology. Here we say that a sequence of operators converges weakly in to if their matrix elements converge, that is for all .
A pair consisting of a von Neumann algebra and a state is referred to as a quantum probability (QP) space [9].
Commutative = Classical
Kolmogorov’s setting for classical probability is in terms of probability spaces where is a space of outcomes (the sample space), is a -algebra of subsets of , and is a probability measure on the elements in . The collection of functions will form a commutative von Neumann algebra and, moreover, a state is given by . (Conversely, every commutative von Neumann algebra with a state that is continuous in the normal topology, see below, will be isomorphic to this framework.)
Commutants
There is an alternative definition of von Neumann algebras which, surprisingly, is purely algebraic. For a subset of operators , we define its commutant in to be
$\mathscr{A}' = \{ B \in \mathscr{B}(\mathfrak{h}) : AB = BA \text{ for all } A \in \mathscr{A} \}$ (1)
The commutant of the commutant of is called the bicommutant and is denoted . Von Neumann’s Bicommutant Theorem states that a collection of operators is a von Neumann algebra if and only if it is closed under taking adjoints and .
itself is a von Neumann algebra. If and are von Neumann algebras then is said to be coarser than if . A collection of operators generates a von Neumann algebra .
States
A state on a von Neumann algebra is a *-linear functional which is positive ( whenever ) and normalized (). We will assume that the state is continuous in the normal topology, that is for any increasing sequence of positive elements of . The main point of interest is that a normal state takes the form of a trace against a density matrix, $\langle A \rangle = \operatorname{tr}(\rho A)$.
The state satisfies the Cauchy-Schwarz inequality .
Morphisms between QP Spaces
A morphism between QP spaces is a normal, completely positive, *-linear map which preserves the identity, , and the probabilities, for all . If a morphism is a homomorphism, that is, for all , then we say that is embedded into .
Tomita-Takesaki Theory
As operators do not necessarily commute we may have different from . Nevertheless, it is possible to write
(2) |
where is a positive (possibly unbounded) operator on known as the modular operator. This plays a central role in the Tomita-Takesaki theory of von Neumann algebras. A one-parameter group of maps on is defined by and is known as the modular group associated with the QP space .
Theorem 1 (Takesaki, [10])
Let be a QP space and let be a von Neumann subalgebra of . There will exist a morphism from down to which is projective () if and only if is invariant under the modular group of .
2.1 Quantum Conditioning
We fix a QP space , and define the covariance of two elements to be
(3) |
Likewise the variance is defined as .
The idea is that we have a subset , and we want to associate an element with each , see Figure 1. As is smaller than we think of as a coarse-grained version of based on less information. The map therefore compresses the model into a coarser one on : we would like to do this in a way that preserves averages.

We now list some desirable features for which we have already encountered in the classical case: for any , and ,
-
(CE1)
linearity: ;
-
(CE2)
*-map: ;
-
(CE3)
conservativity: ;
-
(CE4)
compatibility: ;
-
(CE5)
projectivity: ;
-
(CE6)
peelability: ;
-
(CE7)
positivity: whenever .
We call property (CE6) “peelability” for the lack of a better name and we emphasize that the order of the operators is important. Property (CE7) is known to be insufficient to deal with quantum theory and must be strengthened as follows:
-
(CE7′)
complete positivity: for each integer
(10)
Definition 2
Let and be unital *-algebras with a subalgebra of ; then a mapping satisfying properties (CE1)-(CE6) and (CE7′) is a quantum conditional expectation.
Proposition 3
A quantum conditional expectation acts as the identity map on .
Proof. Set and , then peelability implies that . So the result follows from conservativity.
Existence
We observe that the conditional expectation always exists in the classical world. Here can be identified as some and then the subalgebra will take the form where is a coarser -algebra. The conditional expectation is then well defined: for one sets for each ; then is absolutely continuous with respect to , and its Radon-Nikodym derivative is the conditional expectation, which we denote as . This is explicit in Kolmogorov’s original paper.
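The classical construction is easy to realize concretely on a finite sample space, where a coarser σ-algebra is generated by a partition and conditioning simply averages over each cell. A minimal sketch in Python (the finite example and all names are ours):

```python
import numpy as np

# Sketch (finite example, names ours): the coarser sigma-algebra is generated
# by a partition of the sample space, and conditioning averages over each cell
# with weights taken from the probability vector p.
def cond_exp(f, partition, p):
    """E[f | partition]: on each cell, the p-weighted average of f."""
    g = np.empty_like(f, dtype=float)
    for cell in partition:
        g[cell] = (p[cell] * f[cell]).sum() / p[cell].sum()
    return g

p = np.array([0.1, 0.2, 0.3, 0.4])                 # probabilities of outcomes
f = np.array([1.0, 3.0, 2.0, 6.0])                 # a random variable
partition = [np.array([0, 1]), np.array([2, 3])]   # cells of the coarser algebra

g = cond_exp(f, partition, p)
# Conservativity: averaging the conditional expectation recovers E[f].
assert abs((p * g).sum() - (p * f).sum()) < 1e-12
```

The resulting g is constant on each cell, i.e. it is measurable with respect to the coarser algebra, and it reproduces averages, which are exactly the properties (CE3) and (CE4) ask for.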
In contrast, quantum conditional expectations need not exist. By definition, they satisfy the requirements of the Takesaki Theorem above (and additionally the peelability condition) so we need further invariance of the subalgebra under the modular group.
2.2 Quantum Covariance
Definition 4
Let be a quantum conditional expectation from onto a subalgebra . For each , we define . The conditional covariance of is defined to be
(11) |
The conditional variance is
(12) |
Note that
(13) |
for every . It is worth emphasizing that the conditional covariance defined here is an operator on , not a scalar.
Lemma 5
We have whenever and . In particular, whenever and .
The proof depends crucially on peelability: . The following result is trivial classically, but again requires peelability in the non-commutative setting.
Proposition 6
The conditional covariance may alternatively be written as
(14) |
Proposition 7
The conditional covariance has the invariance property
(15) |
for all .
Proof. From *-linearity and (14), we see that the left hand side of (15) equals
and the result follows using peelability.
Lemma 8
The covariance and conditional covariance are related by
(16) |
Proof. This follows from repeated application of the compatibility property.
which is readily rearranged to give the result.
As a consequence we have
(17) |
2.3 Least Squares Property
Proposition 9
The conditional covariance has the least squares property, that is, is minimized over by .
Proof. Let ; then , which is again in . Then
where we use the positivity property.
Corollary 10
The variance is also minimized over by .
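The least squares property can be checked numerically in the classical setting. Below is a sketch (the toy model and numbers are ours): for jointly Gaussian X and Y = X + noise, the conditional expectation is E[X|Y] = βY with β = Cov(X,Y)/Var(Y), and any other coefficient gives a larger mean-square error.

```python
import numpy as np

# Toy check of the least-squares property (model and numbers are ours):
# X standard normal, Y = X + independent standard noise, so E[X|Y] = Y/2.
rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=n)
Y = X + rng.normal(size=n)

beta = np.cov(X, Y)[0, 1] / Y.var()       # empirical Cov(X,Y)/Var(Y), ~ 0.5
mse_best = ((X - beta * Y) ** 2).mean()   # error of the conditional mean
mse_other = ((X - 0.8 * Y) ** 2).mean()   # any other estimator does worse

assert mse_best < mse_other
```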
3 Classical Filtering
In this section we recall in detail Kolmogorov’s Theory of Probability. In the process we will see the commutative analogues that motivated our more general definitions above.
3.1 Kolmogorov’s Theory
Kolmogorov’s axiomatic formulation of probability theory is based on the mathematical formalism of measure theory. The main concept is that of a probability space. This is a triple where:
-
1.
, called the sample space, is the collection of all possible outcomes (typically a topological space);
-
2.
is a -algebra of subsets of , the elements of which are known as events;
-
3.
is a probability measure on .
In detail, will form a -algebra if it contains the empty set , if it is closed under complementation (that is, if then so too will be its complement ), and finally if whenever is a countable number of events in then their intersection and union will be in . Note that will be an event since it is the complement of the empty set.
A probability measure on is an assignment of a probability to each event with the rule that and for any countable number of events, , that are non-overlapping (i.e., if )
The pair comprises a measurable space. In other words, a space where we are able to assign measures of size to selected subsets in a consistent manner: this is the branch of mathematics known as measure theory, which was set up to resolve the pathological problems that arise when one tries to assign a measure to all subsets. It follows that probability theory is formally just a special case of measure theory where the measure has maximum value .
More exactly, the setting is measure theory but probability theory brings its own additional concepts with it. An example is conditional probability: the probability of event given that has occurred is defined by
$\mathbb{P}(A \mid B) = \dfrac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$ (18)
which is the joint probability, , for both and to occur divided by the marginal probability .
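As a worked numeric instance of this formula (all numbers below are ours, chosen for illustration), consider a diagnostic test for a rare condition:

```python
# Conditional probability worked through: P(D|+) = P(+|D) P(D) / P(+),
# with assumed numbers (ours): 1% base rate, 95% hit rate, 5% false alarms.
p_d = 0.01                    # P(D)
p_pos_d = 0.95                # P(+|D)
p_pos_not_d = 0.05            # P(+|not D)

p_pos = p_pos_d * p_d + p_pos_not_d * (1 - p_d)   # marginal P(+)
p_d_pos = p_pos_d * p_d / p_pos                   # P(D|+)

assert abs(p_pos - 0.059) < 1e-12
assert 0.16 < p_d_pos < 0.162   # about 16%: far smaller than 95%
```

The point of the example is that the conditional probability in one direction (16%) can differ dramatically from the likelihood in the other direction (95%); confusing the two is the classic misuse of conditioning.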
The choice of in a given problem is part of the modeling process. Essentially, we have to ask what are the events that we want to assign a probability to. Let be a -algebra that is contained in (that is, every event in the former is also an event in the latter); then we say that is coarser, or smaller, than . The probability space is then a coarse-graining of where we take to be the restriction of to the smaller -algebra .
Just as we do not consider all subsets of , we do not consider all functions on either. Let then we say is measurable with respect to a -algebra if the sets
(19) |
belong to for each interval . A measurable function on a probability space is called a random variable and the probability that it takes a value in the interval , denoted is just the value assigns to the event . We will use the term random vector for a vector-valued function whose components are all random variables.
Let be random variables, then there is a coarsest -algebra which contains all the events of the form for all and all intervals : we refer to this as the -algebra generated by the random variables.
The correct way to think of an ensemble is as a probability space where the sample space is the collection of all possible microstates, is some suitable -algebra, and is a suitable probability measure. The Hamiltonian must, at the very least, be a measurable function with respect to whatever -algebra we choose. No philosophical interpretations needed beyond this point.
3.2 Conditioning in Classical Probability
We will now restrict attention to continuous random variables with well-defined probability densities. A random variable has probability density function (pdf) so that
(20) |
Normalization requires . If we have several random variables, then we need to specify their joint probability. For instance, if we have a pair and then their joint pdf will be with
(21) | |||||
(22) |
and .
We say that and are statistically independent if their joint probability factors into the marginals
(23) |
This is equivalent to pairs of events of the form and being statistically independent for all intervals .
More generally, we can work out the conditional probabilities from a joint probability. The pdf for given that is defined to be
$\rho(x \mid y) = \dfrac{\rho(x, y)}{\rho(y)}$ (24)
In the special case where and are independent we have
(25) |
In other words, conditioning on the fact that makes no change to our knowledge of .
Definition 11
Let be a random variable for some function , then its conditional expectation given is defined to be
(26) |
More generally, let be the -algebra generated by , then is the -measurable random variable taking the value for each where is the value of .
As , we have
(27) |
We note that for any random variable
(28) | |||||
Also, for any and we have
(29) | |||||
This construction was specific to random variables with pdfs. However, it extends to the general setting as follows.
Theorem 12
Let be a probability space and let be a sub--algebra of . Then there exists a -almost surely unique -measurable random variable such that , and whenever is -measurable.
Proposition 13
If is -measurable, then
(30) |
Proof. Setting in the identity whenever is -measurable, we see that which in turn equals .
Proposition 14
Conditional expectations are projections.
Proof. For arbitrary, we set which is -measurable and so
(31) |
3.3 Classical Measurement
We now suppose that we have a system with phase space and a measuring apparatus with parameter space . We let denote the phase points of as before, and write for the variables of the apparatus. The components of are sometimes referred to as pointer variables. The total space will be with coordinates . We take to be a probability measure on and consider the random vectors and .
In an experiment, we will not measure the system directly but instead record the value of one or more pointer variables. Let be the -algebra generated by . We therefore refer to as the data.
We shall assume that the system variables and the pointer variables are statistically dependent for our probability measure , otherwise we learn nothing about our system from the data. As before we assume a joint pdf with marginals for the system and for the measuring apparatus. We will write for the conditional pdf for our system given the data but write for the conditional pdf of the data given the system. This implies that
(32) |
In practice, we may not know ; however, we will assume that we know . That is, we assume that we know the probability distribution of the pointer variables if we prepared our system precisely in state , for each possible . Statisticians refer to as the likelihood function of the data given .
Note that
(33) |
Now every random variable may be written as for some function . Its conditional expectation given the data is
(34) |
Indeed, for we have
(35) | |||||
This is an average over the hypersurface . Indeed, the decomposition can be thought of as split into the constraint coordinates and the hypersurface coordinates .
From a practical standpoint, we will have access only to the data - that is, variables measurable with respect to only. We are assuming that we know , which is the conditional probability for the data given the system. However, the system is unknown and what we are given is, of course, the data. Therefore, we need to solve the inverse problem, namely to give the conditional probability for the unknown given the measured values for . The problem however is not well-posed. We do not have enough information in the problem yet to write down the joint probability.
To remedy this, we introduce a pdf for which is our a priori guess:
(36) |
We then have the corresponding joint probability for and :
(37) |
If we subsequently measure then we obtain the a posteriori probability
(38) | |||||
The conditional expectation in (35) can be then written as
(39) |
Example 15
Let be the position of a particle. We measure
(40) |
where is a standard normal variable independent of . We may refer to as the signal and as the noise.
Now if was known to be exactly then will be normal with mean and variance . Therefore, we can immediately write down the likelihood function: it is
(41) |
(42) |
In the special case where is assumed to be Gaussian, say mean and variance , we can give the explicit form of the posterior as Gaussian with mean and variance where
(43) | |||||
(44) |
There are two desirable features here. First, the new mean uses the data . Second, the new variance is smaller than the prior variance . In other words, the measurement is informative and decreases uncertainty in the state.
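The Gaussian update can be checked by brute force: draw many (X, Y) pairs from the model of Example 15, keep those whose Y lands near the recorded value, and compare the empirical mean and variance of the surviving X samples with the standard conjugate-Gaussian formulas (the parameter values below are ours):

```python
import numpy as np

# Monte Carlo check of the Gaussian update (parameters are ours):
# prior X ~ N(x0, s^2), observation Y = X + N with N standard normal.
# Conjugate-Gaussian formulas: posterior mean (x0 + s^2 y) / (1 + s^2),
# posterior variance s^2 / (1 + s^2).
rng = np.random.default_rng(1)
x0, s, y = 2.0, 1.5, 3.0
X = rng.normal(x0, s, size=2_000_000)
Y = X + rng.normal(size=X.size)

sel = np.abs(Y - y) < 0.05            # crude conditioning on Y ~ y
mean_mc, var_mc = X[sel].mean(), X[sel].var()

mean_th = (x0 + s**2 * y) / (1 + s**2)
var_th = s**2 / (1 + s**2)
assert abs(mean_mc - mean_th) < 0.02
assert abs(var_mc - var_th) < 0.02
```

Both desirable features show up numerically: the posterior mean has moved toward the measured value, and the posterior variance is strictly below the prior variance s².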
3.4 Classical Filtering
It is possible to extend the conditioning problem to estimate the state of a dynamical system as it evolves in time based on continual monitoring. This involves the theory of stochastic processes and we will use the informal language of path integrals rather than the Ito calculus.
3.4.1 Stochastic Process
A stochastic process is a family, , of random variables labeled by time. The process is determined by specifying all the multi-time distributions
(45) |
for for each .
A stochastic process is said to be Markov if the multi-time distributions take the form
(46) |
where whenever .
Here is the probability density for given that , ().
(47) |
for . It is called the transition mechanism of the Markov process. For consistency we should have the following propagation rule, known as the Chapman-Kolmogorov equation in probability theory,
$T(x, t \mid x_0, t_0) = \displaystyle\int T(x, t \mid x', t')\, T(x', t' \mid x_0, t_0)\, dx'$ (48)
for all .
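The Chapman-Kolmogorov rule can be verified directly for the Gaussian heat kernel of the Wiener process; here is a sketch by numerical quadrature (the grid, times and points are our choices):

```python
import numpy as np

# Chapman-Kolmogorov for the heat kernel: composing the Gaussian transition
# density over [t0, t'] and [t', t] reproduces the density over [t0, t].
def T(x, t, x0, t0):
    return np.exp(-(x - x0) ** 2 / (2 * (t - t0))) / np.sqrt(2 * np.pi * (t - t0))

xs = np.linspace(-30.0, 30.0, 20001)   # quadrature grid for the x' integral
dx = xs[1] - xs[0]
lhs = T(1.3, 2.0, 0.0, 0.0)
rhs = (T(1.3, 2.0, xs, 0.7) * T(xs, 0.7, 0.0, 0.0)).sum() * dx

assert abs(lhs - rhs) < 1e-6
```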
Example 16
The Wiener process (Brownian motion) is determined by
(49) | |||||
(50) |
The transition mechanism here is the Green’s function for the heat equation
(51) |
(In other words, given the data at time , the solution for later times is .)
Norbert Wiener gave an explicit construction - known as the canonical version of Brownian motion - where the sample space is the space of continuous paths, , starting at the origin, with a suitable -algebra of subsets and a well-defined measure .
The corresponding stochastic process is denoted . Ito was able to construct a stochastic differential calculus around the Wiener process, and more generally diffusions, and we have the following Ito table
(55) |
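The first entry of the Ito table, (dW)² = dt, can be seen numerically: the quadratic variation of a simulated Wiener path over [0, T] concentrates at T as the mesh shrinks (a sketch; the step count is our choice):

```python
import numpy as np

# (dW)^2 = dt seen numerically: sum the squared increments of a Wiener path.
rng = np.random.default_rng(2)
T, n = 1.0, 1_000_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)   # independent Wiener increments

quad_var = (dW ** 2).sum()                  # quadratic variation over [0, T]
assert abs(quad_var - T) < 0.01             # ~ T, fluctuations of order sqrt(2/n)
```

Note the contrast: the path value W_T = sum(dW) remains random of order sqrt(T), while the quadratic variation is essentially deterministic, which is what gives the Ito rule its force.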
3.4.2 Path Integral Formulation
Indeed, we have
(56) |
Formally, we may introduce a limit “path integral” with probability measure on the space of paths
(57) |
where we have the action
(58) |
For a diffusion satisfying the Ito stochastic differential equation
(59) |
we have the corresponding measure
(60) |
where we have the action (substitute into , and allow for a Jacobian correction)
(61) |
3.4.3 The Classical Filtering Problem
Suppose that we have a system described by a process . We obtain information by observing a related process .
(62) | |||||
(63) |
Here we assume that the dynamical noise and the observational noise are independent Wiener processes.
The joint probability of both and up to time is
(64) |
where
(65) | |||||
(66) |
or
(67) |
where the Kallianpur-Streibel likelihood (readers with a background in stochastic processes will recognize this as a Radon-Nikodym derivative associated with a Girsanov transformation) is
(68) |
The distribution for given observations is then
(69) |
Let us write for the -algebra generated by the observations . The estimate for for any function conditioned on the observations up to time is called the filter and, generalizing (39) to continuous time, we may write this as
(70) | |||||
where is a non-normalized density. We introduce the stochastic process and it can be shown to satisfy the Duncan-Mortensen-Zakai equation
(71) |
This implies the filtering equation
(72) |
where the innovations process is defined as
(73) |
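For the linear-Gaussian special case the filtering equation reduces to the well-known Kalman-Bucy filter. The following sketch (Euler-Maruyama discretization; model and parameters are ours) integrates a signal dX = -aX dt + dB, an observation dY = X dt + dW, and the filter driven by the innovations increment dY - m dt:

```python
import numpy as np

# Kalman-Bucy sketch (our parameters): the conditional mean m and variance v
# satisfy  dm = -a m dt + v (dY - m dt),  dv/dt = 1 - 2 a v - v^2.
rng = np.random.default_rng(3)
a, dt, n = 1.0, 1e-3, 200_000
dB = rng.normal(0, np.sqrt(dt), size=n)     # dynamical noise increments
dW = rng.normal(0, np.sqrt(dt), size=n)     # observational noise increments

x, m, v = 0.0, 0.0, 1.0
for k in range(n):
    dY = x * dt + dW[k]                     # observed increment
    m += -a * m * dt + v * (dY - m * dt)    # filter driven by the innovations
    v += (1 - 2 * a * v - v * v) * dt       # Riccati equation
    x += -a * x * dt + dB[k]                # (hidden) true signal

v_stat = np.sqrt(a * a + 1) - a             # stationary Riccati solution
assert abs(v - v_stat) < 1e-6               # variance settles at the fixed point
```

The variance equation is deterministic (a Riccati equation), so only the mean is data-driven; this decoupling is special to the linear-Gaussian case.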
4 Quantum Filtering
4.1 Quantum Measurement
The Basic Concepts
The Born interpretation of the wave function, , in quantum mechanics is that gives the probability density of finding the particle at position . More generally, in quantum theory, observables are represented by self-adjoint operators on a Hilbert space. The basic postulate of quantum theory is that the pure states of a system correspond to normalized wave functions, , and we will follow Dirac and denote these as kets . When we measure an observable, the physical value we record will be an eigenvalue. If the state is then the average value of the observable represented by is .
Let us recall that a Hermitian operator is called an orthogonal projection if it satisfies . Then if we have a Hermitian operator with a discrete set of eigenvalues, then there exists a collection of orthogonal projections labeled by the eigenvalues , satisfying if and , such that
(74) |
This is the spectral decomposition of . The operators project onto which is the eigenspace of for eigenvalue . In other words, is the space of all eigenvectors of having eigenvalue . The eigenspaces are orthogonal, that is whenever and lie in different eigenspaces (this is equivalent to if ), and every vector can be written as a superposition of vectors where lies in eigenspace . (In fact, .)
We note that, for any integer ,
(75) |
and any real
(76) |
Suppose we prepare a quantum system in a state and perform a measurement of an observable . We know that we may only measure an eigenvalue and quantum mechanics predicts the probability . In fact, using the spectral decomposition
(77) |
and so
(78) |
For the special case of a non-degenerate eigenvalue , we have that the eigenspace is spanned by a single eigenvector , which we take to be normalized. In this case we have
(79) |
We see that if an observable has a non-degenerate eigenvalue with normalized eigenvector , then if the system is prepared in state , the probability of measuring in an experiment is . The modulus squared of an overlap in this way may therefore have the interpretation as a probability.
The degenerate case needs some more attention. Here the eigenspace can be spanned by a set of orthonormal vectors so that , and so . The choice of the orthonormal basis for is not important!
The probability is equal to the length-squared of , that is,
(80) |
To see this, note that is the overlap of the ket with its own bra so
(81) |
where we used the fact that .
In the picture below, we project into the eigenspace to get . In the special case where was already in the eigenspace, it equals its own projection () and so since the state is normalized. If the state is however orthogonal to the eigenspace then its projection is zero () and so .
In general, we get something in between. In the picture below we see that has a component in the eigenspace and a component orthogonal to it. The projected vector will then have length less than the original , and so .

Von Neumann’s Projection Postulate
Suppose the initial state is and we measure the eigenvalue of observable in a given experiment. A second measurement of performed straight away ought to yield the same value again, this time with certainty.
The only way however to ensure that we measure a given eigenvalue with certainty is if the state lies in the eigenspace for that eigenvalue. We therefore require that the state of the system immediately after the result is measured will jump from to something lying in the eigenspace . This leads us directly to the von Neumann projection postulate.
The von Neumann projection postulate: If the state of a system is given by a ket , and a measurement of observable yields the eigenvalue , then the state immediately after measurement becomes
We note that the projected vector has length so we need to divide by this to ensure that is properly normalized. The von Neumann postulate is essentially the simplest geometric way to get the vector into the eigenspace: project down and then normalize!
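The postulate is easy to set up concretely. A sketch for a qubit (the state and observable are our choices): compute the Born probabilities from the eigenprojections, then project and renormalize:

```python
import numpy as np

# Projection postulate for a qubit observable with eigenprojections
# P_+ = |0><0| and P_- = |1><1| (our labels and our initial state).
P = {+1: np.diag([1.0, 0.0]), -1: np.diag([0.0, 1.0])}
psi = np.array([np.sqrt(0.3), np.sqrt(0.7)])     # assumed initial state

# Born probabilities: <psi| P_a |psi>.
probs = {a: psi.conj() @ Pa @ psi for a, Pa in P.items()}
assert abs(probs[+1] - 0.3) < 1e-12 and abs(probs[-1] - 0.7) < 1e-12

# Suppose the result a = -1 is recorded: project, then renormalize.
post = P[-1] @ psi
post = post / np.linalg.norm(post)
assert np.allclose(post, [0.0, 1.0])
```

A second measurement on `post` now yields -1 with probability one, which is exactly the repeatability requirement that motivated the postulate.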
Compatible Measurements
Suppose we measure a pair of observables and in that sequence. The -measurement leaves the state in the eigenspace of the measured value , the subsequent -measurement then leaves the state in the eigenspace of the measured value . If we then went back and remeasured , would we find again with certainty? The state after the second measurement will be an eigenvector of with eigenvalue , but this need not necessarily be an eigenvector of .
Let and be a pair of observables with spectral decompositions and respectively. Let us measure and then , recording values and respectively. If the initial state was then after both measurements the final state will be
(82) |
In particular is an eigenstate of with eigenvalue . However, suppose we also wanted to be an eigenstate of with the original eigenvalue ; then we must have or equivalently
(83) |
If we want this to be true irrespective of the actual initial state then we arrive at the operator equation
(84) |
Proposition 17
Let and be a pair of orthogonal projections satisfying then .
Proof. We first observe that will again be an orthogonal projection. To this end we must show that and . However, and
However we also have , so the relation implies that .
We see that our operator identity above means that and need to commute! If we wanted the -measurement not to disturb the -measurement for any possible outcome and , then we require that all the eigen-projections of commute with all the eigen-projections of , and this implies that .
Definition 18
A collection of observables are compatible if they commute. We define the commutator of two operators as
$[A, B] = AB - BA$ (85)
So and are compatible if .
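A quick concrete check with the Pauli matrices (our example): σx and σz fail to commute, so they are incompatible, while σz is compatible with any function of itself:

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA."""
    return A @ B - B @ A

sx = np.array([[0, 1], [1, 0]], dtype=complex)    # sigma_x
sz = np.array([[1, 0], [0, -1]], dtype=complex)   # sigma_z

# sigma_x and sigma_z are incompatible: [sx, sz] = -2i sigma_y, nonzero.
assert np.linalg.norm(commutator(sx, sz)) > 1.0
# sigma_z is compatible with sigma_z^2 (indeed with any polynomial in itself).
assert np.allclose(commutator(sz, sz @ sz), 0)
```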
Von Neumann’s Model of Measurement
The postulates of quantum mechanics outlined above assume that all measurements are idealized, but one might expect the actual process of extracting information from quantum systems to be more involved. Von Neumann modeled the measurement process as follows. We wish to get information about an observable, , say the position of a quantum system. Rather than measure directly, we measure an observable giving the pointer position of a second system (called the measurement apparatus).
We will reformulate the von Neumann measurement problem in the language of estimation theory. First we assume that the apparatus is described by a wave-function . The initial state of the system and apparatus is , i.e.,
(86) |
(Note that we are already falling in line with the estimation way of thinking by referring to the initial wave function of the particle as an a priori wave function - it is something we have to fix at the outset, even if we recognize it as only a guess for the correct physical state.) The system and apparatus are taken to interact by means of the unitary
(87) |
where is the momentum operator of the pointer conjugate to . After coupling, the joint state is
(88) |
If the measured value of is , then the a posteriori wave-function must be
(89) |
where
(90) |
Basically, the pointer position will be a random variable with pdf given by : the a posteriori wave-function may then be thought of as a random wave-function on the system Hilbert space:
(91) |
In the parlance of quantum theorists, the wave function of the apparatus collapses to , while we update the a priori wave function to get the a posteriori one.
We have been describing events in the Schrödinger picture where states evolve while observables remain fixed. In this picture, we measure the observable , but the state is changing in time. It is instructive to describe events in the Heisenberg picture. Here the state is fixed as , while the observables evolve. In fact, the observable that we actually measure is
(92) |
from which it is clear that we are obtaining some information about . Note that the measured observable is explicitly of the form signal plus noise as in Example 15. The noise term, , is independent of the signal and has the prescribed pdf .
4.2 Quantum Markovian Systems
Quantum Systems with Classical Noise
We consider a quantum system driven by Wiener noise. For and self-adjoint, we set
(93) |
which clearly defines a unitary process. From the Ito calculus we can quickly deduce the corresponding Schrödinger equation
(94) |
If we set , which we may think of as an embedding of the system observable into a noisy environment, then we similarly obtain
(95) |
where
(96) |
An alternative is to use Poissonian noise. Here we apply a unitary kick, , at times distributed as a Poisson process with rate . Let count the number of kicks up to time , then is a stochastic process with independent stationary increments (like the Wiener process) and we have the Ito rules
(97) |
The Schrödinger equation is and for the evolution of observables we now have
(98) |
Lindblad Generators
A quantum dynamical semigroup is a family of CP maps, , such that and . Under various continuity conditions one can show that the general form of the generator is
(99) |
These include the examples emerging from classical noise above - in fact, combinations of the Wiener and Poissonian cases give the general classical case. But the class of Lindblad generators is strictly larger than this, meaning that we need quantum noise! This is typically what we consider when modeling quantum optics situations.
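A minimal sketch of a Lindblad evolution (the amplitude-damping example, rate and step sizes are ours): a single collapse operator L = |0⟩⟨1|, zero Hamiltonian, integrated with a plain Euler step. The map preserves the trace and the excited-state population decays at rate γ:

```python
import numpy as np

# Amplitude damping of a qubit: collapse operator L = |0><1|, H = 0.
# Master equation: d(rho)/dt = gamma (L rho L† - {L†L, rho}/2).
L = np.array([[0, 1], [0, 0]], dtype=complex)
gamma, dt, n = 1.0, 1e-4, 20_000
rho = np.array([[0, 0], [0, 1]], dtype=complex)   # start in the excited state

for _ in range(n):
    D = L @ rho @ L.conj().T \
        - 0.5 * (L.conj().T @ L @ rho + rho @ L.conj().T @ L)
    rho = rho + gamma * dt * D                    # Euler step

t = n * dt
assert abs(np.trace(rho).real - 1) < 1e-9               # trace preserving
assert abs(rho[1, 1].real - np.exp(-gamma * t)) < 1e-3  # population ~ e^{-t}
```

For anything beyond a sketch one would use a dedicated solver (e.g. QuTiP's `mesolve`), but the structure of the generator is exactly the dissipative part of (99).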
4.3 Quantum Noise Models
Fock Space
We recall how to model bosonic fields. We wish to describe a typical pure state of the field. If we look at the field we expect to see a certain number, , of particles at locations and to this situation we assign a complex number (the probability amplitude) . As the particles are indistinguishable bosons, the amplitude should be completely symmetric under interchange of particle identities.
The field however can have an indefinite number of particles - that is, it can be written as a superposition of fixed number states. The general form of a pure state for the field will be
(100) |
Note that the case is included and is understood as the vacuum state. Here is a complex number, with giving the probability for finding no particles in the field.
The probability that we have exactly particles is
(101) |
and the normalization of the state is therefore .
In particular, we take the vacuum state to be
(102) |
The Hilbert space spanned by such indefinite number of indistinguishable boson states is called Fock Space.
A convenient spanning set is given by the exponential vectors
(103) |
They are, in fact, over-complete and we have the inner products
(104) | |||||
The exponential vectors, when normalized, give the analogues to the coherent states for a single mode.
We note that the vacuum is an example: .
Quanta on a Wire
We now take our space to be 1-dimensional - a wire. Let’s parametrize the position on the wire by variable , and denote by the Fock space over a segment of the wire . We have the following tensor product decomposition
(105) |
It is convenient to introduce quantum white noises and satisfying the singular commutation relations
(106) |
Here annihilates a quantum of the field at location . In keeping with the usual theory of the quantized harmonic oscillator, we take it that annihilates the vacuum: . More generally, this implies that
(107) |
The adjoint creates a quantum at position .
The quantum white noises are operator densities and are singular, but their integrated forms do correspond to well defined operators which we call the annihilation and creation processes, respectively,
(108) |
We see that
(109) |
In addition we introduce a further process, called the number process, according to
$$\Lambda(t) = \int_0^t b^*(x)\, b(x)\, dx \tag{110}$$
Quantum Stochastic Models
We now think of our system as lying at the origin of a quantum wire. The quanta move along the wire at the speed of light, $c$, and the parameter $x$ can be thought of as $t = x/c$, which is the time for quanta a distance $x$ away to reach the system. Better still, $t$ is the time at which this part of the field passes through the system. The process $B(t)$ is the operator describing the annihilation of quanta passing through the system at some stage over the time-interval $[0, t]$.
Fix a system Hilbert space, $\mathfrak{h}_0$, called the initial space. A quantum stochastic process is a family of operators, $X(t)$, $t \ge 0$, acting on $\mathfrak{h}_0 \otimes \mathfrak{F}$.
The process is adapted if, for each $t$, the operator $X(t)$ acts trivially on the future environment factor $\mathfrak{F}_{(t, \infty)}$.
QSDEs with adapted coefficients were originally introduced by Hudson & Parthasarathy in 1984. Let $x_{11}, x_{10}, x_{01}, x_{00}$ be four adapted quantum stochastic processes defined for $t \ge 0$. We then consider the QSDE
$$\frac{dX}{dt} = b^*(t)\, x_{11}(t)\, b(t) + b^*(t)\, x_{10}(t) + x_{01}(t)\, b(t) + x_{00}(t) \tag{111}$$
with initial condition $X(0) = X_0 \otimes 1$. To understand this we take matrix elements between states of the form $\phi \otimes \varepsilon(f)$ and use the eigen-relation (107) to get the integrated form
$$\langle \phi \otimes \varepsilon(f), X(t)\, \psi \otimes \varepsilon(g) \rangle = \langle \phi \otimes \varepsilon(f), X(0)\, \psi \otimes \varepsilon(g) \rangle \tag{112}$$
$$+ \int_0^t \langle \phi \otimes \varepsilon(f), \big\{ f(s)^*\, x_{11}(s)\, g(s) + f(s)^*\, x_{10}(s) + x_{01}(s)\, g(s) + x_{00}(s) \big\}\, \psi \otimes \varepsilon(g) \rangle\, ds \tag{113}$$
Processes obtained this way are called quantum stochastic integrals.
The approach of Hudson and Parthasarathy is actually different [16, 17]. They arrive at the process defined by (111) by building the analogue of the Ito theory for stochastic integration: that is, they show conditions under which
$$X(t) = X(0) + \int_0^t x_{11}(s)\, d\Lambda(s) + \int_0^t x_{10}(s)\, dB^*(s) + \int_0^t x_{01}(s)\, dB(s) + \int_0^t x_{00}(s)\, ds \tag{114}$$
makes sense as a limit process where all the increments are future pointing. That is, $dB(t) = B(t + dt) - B(t)$, etc.
One has, for instance,
$$\Big\langle \phi \otimes \varepsilon(f), \int_0^t x_{10}(s)\, dB^*(s)\; \psi \otimes \varepsilon(g) \Big\rangle = \int_0^t f(s)^*\, \langle \phi \otimes \varepsilon(f), x_{10}(s)\, \psi \otimes \varepsilon(g) \rangle\, ds \tag{115}$$
etc., so the two approaches coincide.
Quantum Ito Rules
It is clear from (111) that this calculus is Wick ordered - note that the creators $b^*(t)$ all appear to the left and the annihilators $b(t)$ all appear to the right of the coefficients. The product of two Wick ordered expressions is not immediately Wick ordered, and one must use the singular commutation relations to achieve this. This results in an additional term which corresponds to a quantum Ito correction.
We have
$$(dB(t))^2 = (dB^*(t))^2 = dB^*(t)\, dB(t) = 0, \qquad dB(t)\, dB^*(t) = dt \tag{116}$$
To see this, let $x(t)$ be adapted; then, for instance,
$$dB^*(t)\, x(t)\, dB(t) = \int_t^{t+dt} \int_t^{t+dt} b^*(s)\, x(t)\, b(s')\, ds\, ds' \tag{117}$$
which is already Wick ordered and produces no $\delta$-function. As we have a square of $dt$, we can neglect such terms. However, we have
$$dB(t)\, x(t)\, dB^*(t) = \int_t^{t+dt} \int_t^{t+dt} b(s)\, x(t)\, b^*(s')\, ds\, ds' = x(t) \int_t^{t+dt} \int_t^{t+dt} \delta(s - s')\, ds\, ds' + \text{Wick ordered terms} \tag{118}$$
and so $dB(t)\, x(t)\, dB^*(t) = x(t)\, dt$ to leading order. The infinitesimal form of this is then
$$dB(t)\, dB^*(t) = dt \tag{119}$$
This is strikingly similar to the classical rule for increments of the Wiener process!
In fact, we have the following quantum Ito table (left factor down the side, right factor along the top):
$$\begin{array}{c|cccc}
\times & dt & dB & dB^* & d\Lambda \\ \hline
dt & 0 & 0 & 0 & 0 \\
dB & 0 & 0 & dt & dB \\
dB^* & 0 & 0 & 0 & 0 \\
d\Lambda & 0 & 0 & dB^* & d\Lambda
\end{array} \tag{125}$$
Each of the non-zero terms arises from multiplying two processes that are not in Wick order.
For a pair of quantum stochastic integrals, we have the following quantum Ito product formula
$$d(XY) = (dX)\, Y + X\, (dY) + (dX)(dY) \tag{126}$$
Unlike the classical version, the order of $X$ and $Y$ here is crucial.
Some Classical Processes On Fock Space
The process $Q(t) = B(t) + B^*(t)$ is self-commuting, that is $[Q(t), Q(s)] = 0$, and has the distribution of a Wiener process in the vacuum state:
$$\langle \Omega, Q(t)\, Q(s)\, \Omega \rangle = \min(t, s) \tag{127}$$
$$\langle \Omega, e^{i k Q(t)}\, \Omega \rangle = e^{-k^2 t / 2} \tag{128}$$
The same applies to $P(t) = \frac{1}{i}\big( B(t) - B^*(t) \big)$, but
$$[Q(t), P(s)] = 2i \min(t, s) \tag{129}$$
So we have two non-commuting Wiener processes in Fock space. We refer to $Q$ and $P$ as canonically conjugate quadrature processes.
One sees that, for instance,
$$dQ\, dQ = (dB + dB^*)(dB + dB^*) = dB\, dB^* = dt \tag{130}$$
We also obtain a Poisson process by the prescription
$$N(t) = \Lambda(t) + B^*(t) + B(t) + t \tag{131}$$
One readily checks that $dN\, dN = dN$ from the quantum Ito table.
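The checks above are mechanical enough to automate. As an illustrative sketch (the dictionary encoding of increments is my own device, not from the text), the quantum Ito table can be stored as a lookup and used to multiply linear combinations of the fundamental increments:

```python
# Quantum Ito table: only products of increments out of Wick order survive.
# 'dL' stands for the number-process increment d(Lambda).
TABLE = {('dB', 'dB*'): 'dt', ('dB', 'dL'): 'dB',
         ('dL', 'dB*'): 'dB*', ('dL', 'dL'): 'dL'}

def ito_product(x, y):
    """Product of two increments given as {symbol: coefficient} combinations."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            key = TABLE.get((a, b))       # all other products vanish
            if key is not None:
                out[key] = out.get(key, 0) + ca * cb
    return out

dQ = {'dB': 1, 'dB*': 1}                        # quadrature increment
dN = {'dL': 1, 'dB': 1, 'dB*': 1, 'dt': 1}      # Poisson increment
print(ito_product(dQ, dQ))                      # (dQ)^2 = dt
print(ito_product(dN, dN))                      # (dN)^2 = dN
```

The two printed products confirm the Wiener rule $(dQ)^2 = dt$ and the Poisson rule $(dN)^2 = dN$.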
Emission-Absorption Interactions
Let us consider a singular Hamiltonian of the form
$$\Upsilon(t) = i L\, b^*(t) - i L^*\, b(t) + H \tag{132}$$
where $L$ and $H = H^*$ are operators on the system space.
We will try and realize the solution to the Schrödinger equation
$$\frac{dU(t)}{dt} = -i\, \Upsilon(t)\, U(t), \qquad U(0) = 1 \tag{133}$$
as a unitary quantum stochastic integral process.
Let us first remark that the annihilator part of (132) will appear out of Wick order when we consider (133). The standard approach in quantum field theory is to develop the unitary as a Dyson series expansion - often re-interpreted as a time-ordered exponential:
$$U(t) = 1 + \sum_{n \ge 1} (-i)^n \int_0^t \! dt_n \int_0^{t_n} \! dt_{n-1} \cdots \int_0^{t_2} \! dt_1\; \Upsilon(t_n) \cdots \Upsilon(t_1) \equiv \vec{T} e^{-i \int_0^t \Upsilon(s)\, ds} \tag{134}$$
In our case the field terms - the quantum white noises - are linear; however, we have the problem that they come multiplied by the system operators $L$ and $L^*$, which do not commute, and which don't necessarily commute with $H$ either.
Fortunately we can do the Wick ordering in one fell swoop rather than having to go down each term of the Dyson series. We have
$$[b(t), U(t)] = \int_0^t \delta(t - s)\, L\, U(s)\, ds = \frac{1}{2}\, L\, U(t) \tag{135}$$
where we dropped the term $[b(t), U(s)]$ as this should vanish for $s < t$, and took half the weight of the $\delta$-function due to the upper limit of the integration. However, we get
$$b(t)\, U(t) = U(t)\, b(t) + \frac{1}{2}\, L\, U(t) \tag{136}$$
Plugging this into the equation (133), we get
$$\frac{dU(t)}{dt} = \Big\{ L\, b^*(t) - \frac{1}{2} L^* L - iH \Big\}\, U(t) - L^*\, U(t)\, b(t) \tag{137}$$
which is now Wick ordered. We can interpret this as the Hudson-Parthasarathy equation
$$dU(t) = \Big\{ L\, dB^*(t) - L^*\, dB(t) - \big( \tfrac{1}{2} L^* L + iH \big)\, dt \Big\}\, U(t) \tag{138}$$
The corresponding Heisenberg equation for $j_t(X) = U(t)^* (X \otimes 1) U(t)$ will be
$$d j_t(X) = j_t(\mathcal{L} X)\, dt + j_t([X, L])\, dB^*(t) + j_t([L^*, X])\, dB(t) \tag{139}$$
where
$$\mathcal{L} X = -i [X, H] + L^* X L - \tfrac{1}{2} L^* L X - \tfrac{1}{2} X L^* L \tag{140}$$
We note that we obtain the typical Lindblad form for the generator.
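For the vacuum state of the field, averaging the Heisenberg equation gives the quantum master equation with the Lindblad generator above. A minimal Euler-integration sketch for a decaying qubit illustrates this; the model ($H = \frac{\omega}{2}\sigma_z$, $L = \sqrt{\gamma}\,\sigma_-$) and all numerical values are assumed for illustration only.

```python
import numpy as np

# Assumed example: qubit with H = (omega/2) sigma_z and L = sqrt(gamma) sigma_-
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)   # lowering operator |0><1|
omega, gamma = 1.0, 0.5
H = 0.5 * omega * sz
L = np.sqrt(gamma) * sm
Ld = L.conj().T

def master_rhs(rho):
    """d(rho)/dt = -i[H, rho] + L rho L* - (1/2){L*L, rho} (adjoint of (140))."""
    return (-1j * (H @ rho - rho @ H)
            + L @ rho @ Ld - 0.5 * (Ld @ L @ rho + rho @ Ld @ L))

rho = np.array([[0, 0], [0, 1]], dtype=complex)  # start in the excited state
dt, steps = 1e-3, 5000
for _ in range(steps):
    rho = rho + dt * master_rhs(rho)

print(np.trace(rho).real)   # trace is preserved by the Lindblad structure
print(rho[1, 1].real)       # excited population decays like exp(-gamma t)
```

The trace stays at one because the generator is trace-free, and the excited-state population relaxes exponentially at rate $\gamma$.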
Scattering Interactions
We mention that we could also treat a Hamiltonian with only scattering terms. Let us set $\Upsilon(t) = E\, b^*(t)\, b(t)$, with $E = E^*$. The same sort of argument leads to
$$[b(t), U(t)] = -\frac{i}{2}\, E\, b(t)\, U(t) \tag{141}$$
which can be rearranged to give
$$b(t)\, U(t) = \Big( 1 + \frac{i}{2} E \Big)^{-1} U(t)\, b(t) \tag{142}$$
So the Wick ordered form is
$$\frac{dU(t)}{dt} = -i E\, b^*(t)\, \Big( 1 + \frac{i}{2} E \Big)^{-1} U(t)\, b(t) \tag{143}$$
or in quantum Ito form
$$dU(t) = (S - 1)\, d\Lambda(t)\, U(t), \qquad S = \Big( 1 - \frac{i}{2} E \Big) \Big( 1 + \frac{i}{2} E \Big)^{-1} \tag{144}$$
where $S$, the Cayley transform of the self-adjoint operator $E$, is unitary.
The Heisenberg equation here is $d j_t(X) = j_t( S^* X S - X )\, d\Lambda(t)$.
This is all comparable to the classical Poisson process driven evolution involving unitary kicks.
The SLH Formalism
We now outline the so-called SLH Formalism - named after the scattering matrix operator $S$, the coupling vector operator $L$, and the Hamiltonian $H$ appearing in these Markov models [18]-[20]. The examples considered up to now used only one species of quanta. We could in fact have $n$ channels, based on quantum white noises $b_1(t), \dots, b_n(t)$:
$$[b_j(t), b_k^*(s)] = \delta_{jk}\, \delta(t - s) \tag{145}$$
The most general form of a unitary process with fixed coefficients may be described as follows: we have a Hamiltonian $H = H^*$, a column vector of coupling/collapse operators
$$L = \begin{bmatrix} L_1 \\ \vdots \\ L_n \end{bmatrix} \tag{149}$$
and a unitary matrix of operators
$$S = \begin{bmatrix} S_{11} & \cdots & S_{1n} \\ \vdots & & \vdots \\ S_{n1} & \cdots & S_{nn} \end{bmatrix} \tag{153}$$
For each such triple $(S, L, H)$ we have the QSDE
$$dU(t) = \Big\{ \sum_{j,k} (S_{jk} - \delta_{jk})\, d\Lambda_{jk}(t) + \sum_j L_j\, dB_j^*(t) - \sum_{j,k} L_j^* S_{jk}\, dB_k(t) - \Big( \frac{1}{2} \sum_j L_j^* L_j + iH \Big)\, dt \Big\}\, U(t) \tag{154}$$
where $\Lambda_{jk}(t) = \int_0^t b_j^*(x)\, b_k(x)\, dx$. This has, for initial condition $U(0) = 1$, a solution which is a unitary adapted quantum stochastic process. The emission-absorption case is the model with no scattering ($S = 1$). Likewise, pure scattering corresponds to $L = 0$ and $H = 0$.
Heisenberg-Langevin Dynamics
System observables evolve according to the Heisenberg-Langevin equation
$$d j_t(X) = j_t(\mathcal{L} X)\, dt + \sum_{j,k} j_t\Big( \sum_l S_{lj}^*\, X\, S_{lk} - \delta_{jk} X \Big)\, d\Lambda_{jk}(t) + \sum_{j,k} j_t\big( S_{jk}^*\, [X, L_j] \big)\, dB_k^*(t) + \sum_{j,k} j_t\big( [L_j^*, X]\, S_{jk} \big)\, dB_k(t) \tag{155}$$
where the generator is the traditional Lindblad form
$$\mathcal{L} X = -i [X, H] + \sum_j \Big( L_j^* X L_j - \tfrac{1}{2} L_j^* L_j X - \tfrac{1}{2} X L_j^* L_j \Big) \tag{156}$$
Quantum Outputs
The output fields are defined by
$$B_j^{\rm out}(t) = U(t)^*\, B_j(t)\, U(t) \tag{157}$$
From the quantum Ito calculus we find that
$$dB_j^{\rm out}(t) = \sum_k j_t(S_{jk})\, dB_k(t) + j_t(L_j)\, dt \tag{158}$$
Or, maybe more suggestively, in quantum white noise language [21],
$$b_j^{\rm out}(t) = \sum_k j_t(S_{jk})\, b_k(t) + j_t(L_j) \tag{159}$$
Quantum Filtering
We now set up the quantum filtering problem. For simplicity, we will take $n = 1$ and set $S = 1$ so that we have a simple emission-absorption interaction. We will also consider the situation where we measure the $Q$-quadrature of the output.
The initial state is taken to be $\psi_0 \otimes \Omega$, and in the Heisenberg picture this is fixed for all time.
The analogue of the stochastic dynamical equation considered in the classical filtering problem is the Heisenberg-Langevin equation
$$d j_t(X) = j_t(\mathcal{L} X)\, dt + j_t([X, L])\, dB^*(t) + j_t([L^*, X])\, dB(t) \tag{160}$$
where $j_t(X) = U(t)^* (X \otimes 1)\, U(t)$.
Some care is needed in specifying what exactly we measure: we should really work in the Heisenberg picture for clarity. The $Q$-quadrature of the input field is $B(t) + B^*(t)$, which we have already seen is a Wiener process for the vacuum state of the field. Of course this is not what we measure - we measure the output quadrature!
Set
$$Y^{\rm in}(t) = B(t) + B^*(t) \tag{161}$$
As indicated in our discussion on von Neumann's measurement model, what we actually measure is
$$Y^{\rm out}(t) = U(t)^*\, Y^{\rm in}(t)\, U(t) \tag{162}$$
The differential form of this is
$$dY^{\rm out}(t) = dY^{\rm in}(t) + j_t(L + L^*)\, dt \tag{163}$$
Note that
$$dY^{\rm out}(t)\, dY^{\rm out}(t) = dt \tag{164}$$
The dynamical noise $B(t)$ is generally a quantum noise and can only be considered classical in very special circumstances, while the observational noise $Y^{\rm in}(t)$ is just its $Q$-quadrature, which can hardly be treated as independent!
In complete contrast to the classical filtering problem we considered earlier, we have no paths for the system - just evolving observables of the system. What is more, these observables do not typically commute amongst themselves, or indeed with the measured process.
We can only apply Bayes Theorem in the situation where the quantities involved have a joint probability distribution, and in the quantum world this requires them to be compatible. At this stage it may seem like a miracle that we have any theory of filtering in the quantum world. However, let us take stock of what we have.
What Commutes With What?
For fixed $t \ge s \ge 0$, let $U(t, s)$ be the solution to the QSDE (154) in the time variable $t$ with $U(s, s) = 1$. Formally, we have
$$U(t, s) = \vec{T} \exp \Big\{ -i \int_s^t \Upsilon(\tau)\, d\tau \Big\} \tag{165}$$
which is the unitary which couples the system to the part of the field that enters over the time $[s, t]$. In terms of our previous definition, we have $U(t) = U(t, 0)$ and we have the property
$$U(t, r)\, U(r, s) = U(t, s), \qquad t \ge r \ge s \tag{166}$$
In the Heisenberg picture, the observables evolve as
$$j_t(X) = U(t, 0)^*\, (X \otimes 1)\, U(t, 0) \tag{167}$$
$$Y^{\rm out}(t) = U(t, 0)^*\, Y^{\rm in}(t)\, U(t, 0) \tag{168}$$
We know that the input quadrature is self-commuting, but what about the output one? A key identity here is that
$$Y^{\rm out}(t) = U(T, 0)^*\, Y^{\rm in}(t)\, U(T, 0), \qquad \text{for all } T \ge t \tag{169}$$
which follows from the fact that $[Y^{\rm in}(t), U(T, t)] = 0$.
From this, we see that the output process is also commutative since, for $T \ge t, s$,
$$[Y^{\rm out}(t), Y^{\rm out}(s)] = U(T, 0)^*\, [Y^{\rm in}(t), Y^{\rm in}(s)]\, U(T, 0) = 0 \tag{170}$$
If this were not the case then subsequent measurements of the process would invalidate (disturb?) earlier ones. In fancier parlance, we say that the process is not self-demolishing - that is, all parts are compatible with each other.
A similar line of argument shows that
$$[j_t(X), Y^{\rm out}(s)] = 0, \qquad t \ge s \tag{171}$$
Therefore, we have a joint probability for $j_t(X)$ and the continuous collection of observables $\{ Y^{\rm out}(s) : 0 \le s \le t \}$, so we can use Bayes Theorem to estimate $j_t(X)$ for any system operator $X$ using the past observations. Following V.P. Belavkin, we refer to this as the non-demolition principle.
The Conditioned State
In the Schrödinger picture, the state at time $t$ is $\Psi(t) = U(t)\, (\psi_0 \otimes \Omega)$, so
$$d\Psi(t) = \Big\{ L\, dB^*(t) - L^*\, dB(t) - \big( \tfrac{1}{2} L^* L + iH \big)\, dt \Big\}\, \Psi(t) = \Big\{ L\, dY^{\rm in}(t) - \big( \tfrac{1}{2} L^* L + iH \big)\, dt \Big\}\, \Psi(t) \tag{172}$$
Here we have used a profound trick due to A.S. Holevo. The differential $dB(t)$ acting on $\Psi(t)$ yields zero since it is future pointing and so only affects the future part which, by adaptedness, is the vacuum state of the future part of the field. To get from the first expression to the second, we remove one such term and add another, each of which is technically zero. In its reconstituted form, we obtain the $Q$-quadrature of the input. The result is that we obtain an expression for the state which is “diagonal” in the input quadrature - our terminology here is poor (we are talking about a state, not an observable!) but hopefully wakes up physicists to see what's going on.
The above equation is equivalent to the SDE in the system Hilbert space
$$d\chi_t(y) = \Big\{ L\, dy(t) - \big( \tfrac{1}{2} L^* L + iH \big)\, dt \Big\}\, \chi_t(y) \tag{173}$$
where $y(t)$ is a sample path - or better still, eigen-path - of the quantum stochastic process $Y^{\rm in}$.
We refer to (173) as the Belavkin-Zakai equation.
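A direct Euler-Maruyama sketch of the Belavkin-Zakai equation makes its character vivid: driven by a reference Wiener path, the solution stays in the system Hilbert space but does not stay normalized. The qubit model and numerical values here are assumptions for illustration, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)
H = 0.5 * sz                     # assumed example Hamiltonian
L = np.sqrt(0.5) * sm            # assumed example coupling
K = -(0.5 * L.conj().T @ L + 1j * H)   # drift generator -(L*L/2 + iH)

chi = np.array([0.0, 1.0], dtype=complex)  # start in the excited state
dt, steps = 1e-4, 10000
norms = []
for _ in range(steps):
    dy = np.sqrt(dt) * rng.standard_normal()   # reference Wiener increment
    chi = chi + (K @ chi) * dt + (L @ chi) * dy
    norms.append(np.linalg.norm(chi))

# ||chi_t||^2 is the likelihood of the observed path, not unity:
print(norms[-1])
```

The point of the run is that the norm wanders away from one: it carries the likelihood of the driving path, which is exactly what the filter will exploit.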
The Quantum Filter
Let us begin with a useful computation:
$$\mathbb{E}\big[ j_t(X)\, F_t[Y^{\rm out}] \big] = \langle \psi_0 \otimes \Omega,\; U(t)^*\, (X \otimes F_t[Y^{\rm in}])\, U(t)\; \psi_0 \otimes \Omega \rangle = \langle \Psi(t), (X \otimes F_t[Y^{\rm in}])\, \Psi(t) \rangle = \int \langle \chi_t(y), X\, \chi_t(y) \rangle\, F_t[y]\; \mathbb{P}_{\rm Wiener}[dy] \tag{174}$$
A few comments are in order here. The operator $X$ will commute with any functional of the past measurements - here $F_t[Y^{\rm out}]$, a functional of $\{ Y^{\rm out}(s) : 0 \le s \le t \}$. The first equality is pulling things back in terms of the unitary $U(t)$. The second is just the equivalence between the Schrödinger and Heisenberg pictures. The final one just uses the equivalent form (173): note that the paths of the input quadrature get their correct weighting as Wiener processes.
Setting $X = 1$ in (174), we get
$$\mathbb{E}\big[ F_t[Y^{\rm out}] \big] = \int \| \chi_t(y) \|^2\, F_t[y]\; \mathbb{P}_{\rm Wiener}[dy] \tag{175}$$
So the probability of the measured paths is
$$\mathbb{P}_{\rm out}[dy] = \| \chi_t(y) \|^2\; \mathbb{P}_{\rm Wiener}[dy] \tag{176}$$
Now this last equation deserves some comment! The vector $\Psi(t)$, which lives in the system tensor Fock space, is properly normalized, but its corresponding form $\chi_t$ is not! The latter is a stochastic process taking values in the system Hilbert space and is adapted to the input quadrature. However, we never said that $\chi_t$ had to be normalized too, and indeed its lack of normalization follows from our “diagonalization” procedure. In fact, if $\chi_t$ were normalized then the output measure would follow a Wiener distribution and so we would be measuring white noise!
From (174) again, we can deduce the filter: we get (using the arbitrariness of the functional $F_t$)
$$\pi_t(X) = \frac{ \langle \chi_t(y), X\, \chi_t(y) \rangle }{ \langle \chi_t(y), \chi_t(y) \rangle } \bigg|_{y = Y^{\rm out}} \tag{177}$$
This has a remarkable similarity to (70). Moreover, using the Ito calculus, we see that the unnormalized expectation $\sigma_t(X) = \langle \chi_t, X\, \chi_t \rangle$ satisfies
$$d\sigma_t(X) = \sigma_t(\mathcal{L} X)\, dt + \sigma_t( X L + L^* X )\, dY^{\rm out}(t) \tag{178}$$
This is the quantum analogue of the Duncan-Mortensen-Zakai equation.
Only a small amount of work is left in order to derive the filter equation. We first observe that the normalization (set $X = 1$ in (178)) satisfies
$$d\sigma_t(1) = \sigma_t( L + L^* )\, dY^{\rm out}(t) \tag{179}$$
Using the Ito calculus, it is then routine to show that the quantum filter is
$$d\pi_t(X) = \pi_t(\mathcal{L} X)\, dt + \big\{ \pi_t( X L + L^* X ) - \pi_t( L + L^* )\, \pi_t(X) \big\}\, dI(t) \tag{180}$$
where the innovations are defined by
$$dI(t) = dY^{\rm out}(t) - \pi_t( L + L^* )\, dt \tag{181}$$
Again, the innovations have the statistics of a Wiener process. As in the classical case, the innovations give the difference between what we observe next, $dY^{\rm out}(t)$, and what we would have expected based on our observations up to that point, $\pi_t(L + L^*)\, dt$. The fact that the innovations are a Wiener process is a reflection of the efficiency of the filter - after extracting as much information as we can out of the observations, we are left with just white noise.
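To see the filter (180)-(181) in action, here is a minimal Euler-Maruyama sketch written in the equivalent density-matrix (adjoint) form; the qubit model, the coupling, and all numerical values are illustrative assumptions, not from the text. The measurement record is generated self-consistently by drawing the innovations as Wiener increments.

```python
import numpy as np

rng = np.random.default_rng(1)
sm = np.array([[0, 1], [0, 0]], dtype=complex)
H = np.zeros((2, 2), dtype=complex)   # assumed: no Hamiltonian drive
L = sm                                # assumed coupling L = sigma_-
Ld = L.conj().T

def lindblad(rho):
    """Adjoint generator: -i[H,rho] + L rho L* - (1/2){L*L, rho}."""
    return (-1j * (H @ rho - rho @ H)
            + L @ rho @ Ld - 0.5 * (Ld @ L @ rho + rho @ Ld @ L))

rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # |+> state
dt, steps = 1e-4, 20000
for _ in range(steps):
    m = np.trace((L + Ld) @ rho).real           # predicted signal pi_t(L + L*)
    dI = np.sqrt(dt) * rng.standard_normal()    # innovations: Wiener increment
    dY = m * dt + dI                            # simulated measurement record
    # Filter update: d(rho) = L(rho) dt + (L rho + rho L* - m rho) dI
    rho = rho + lindblad(rho) * dt + (L @ rho + rho @ Ld - m * rho) * (dY - m * dt)
    rho = 0.5 * (rho + rho.conj().T)            # enforce Hermiticity numerically

print(np.trace(rho).real)   # the filter is trace preserving
```

Because the coefficient of the stochastic term is itself traceless when $\mathrm{tr}\,\rho = 1$, the conditioned state stays normalized, which is exactly the content of the normalization step (179).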
Acknowledgements
I would like to thank the staff at CIRM, Luminy (Marseille), and at the Institut Henri Poincaré (Paris) for their kind support during the 2018 Trimester on Measurement and Control of Quantum Systems where this work was begun. I am also grateful to the other organizers Pierre Rouchon and Denis Bernard for valuable comments during the writing of these notes.
References
- [1] V.P. Belavkin, (1989), Non-Demolition Measurements, Nonlinear Filtering and Dynamic Programming of Quantum Stochastic Processes, Lecture Notes in Control and Inform Sciences 121 245–265, Springer–Verlag, Berlin.
- [2] A. Barchielli and V. P. Belavkin, (1991), Measurements continuous in time and a posteriori states in quantum mechanics, J. Phys. A: Math. Gen. 24, 1495.
- [3] A. Barchielli and M. Gregoratti, (2009), Quantum Trajectories and Measurements in Continuous Time - the diffusive case, Springer Berlin Heidelberg.
- [4] H.M. Wiseman and G.J. Milburn, (2009), Quantum Measurement and Control, Cambridge University Press.
- [5] D. Gatarek and N. Gisin, (1991), Continuous quantum jumps and infinite-dimensional stochastic equations, Journal of Mathematical Physics 32 (8), 2152-2157.
- [6] H.J. Carmichael, (1993), Phys. Rev. Lett. 70(15) p.2273.
- [7] J. Dalibard, Y. Castin, and K. Mølmer, (Feb 1992), Wave-function approach to dissipative processes in quantum optics, Phys. Rev. Lett. 68 (5), 580-583.
- [8] C. Sayrin, I. Dotsenko, et al., (1 September 2011), Real-time quantum feedback prepares and stabilizes photon number states, Nature 477, 73-77.
- [9] H. Maassen, (1988), Theoretical concepts in quantum probability: quantum Markov processes. Fractals, quasicrystals, chaos, knots and algebraic quantum mechanics (Maratea, 1987), 287-302, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 235, Kluwer Acad. Publ., Dordrecht
- [10] M. Takesaki, (1972) Conditional Expectations in von Neumann Algebras, J. Func. Anal., 9, 306-321.
- [11] L. Bouten, R. van Handel and M.R. James, (2007), An introduction to quantum filtering, SIAM Journal on Control and Optimization 46, 2199.
- [12] L. Bouten, R. van Handel, Quantum filtering: a reference probability approach, arXiv:math-ph/0508006.
- [13] R. van Handel, Ph.D. Thesis, Filtering, Stability, and Robustness, CalTech, 2006, http://www.princeton.edu/~rvan/thesisf070108.pdf
- [14] H. Wiseman, (1994), Quantum theory of continuous feedback, Phys. Rev. A, 49(3):2133-2150.
- [15] P. Rouchon, (August 13 - 21, 2014), Models and Feedback Stabilization of Open Quantum Systems Extended version of the paper attached to an invited conference for the International Congress of Mathematicians in Seoul, arXiv:1407.7810
- [16] R.L. Hudson and K.R. Parthasarathy, (1984), Quantum Ito’s formula and stochastic evolutions, Commun. Math. Phys. 93, 301.
- [17] K.R. Parthasarathy, (1992) An Introduction to Quantum Stochastic Calculus, Birkhauser.
- [18] J. Gough, M.R. James, (2009), Quantum Feedback Networks: Hamiltonian Formulation, Commun. Math. Phys. 287, 1109.
- [19] J. Gough, M.R. James, (2009), The series product and its application to quantum feedforward and feedback networks, IEEE Trans. on Automatic Control 54, 2530.
- [20] J. Combes, J. Kerckhoff, M. Sarovar, (2017), The SLH framework for modeling quantum input-output networks, Advances in Physics: X, 2:3, 784-888.
- [21] C.W. Gardiner and M.J. Collett, (1985), Input and output in damped quantum systems: Quantum stochastic differential equations and the master equation. Phys. Rev. A, 31(6):3761-3774.