Duality for Nonlinear Filtering II: Optimal Control
Abstract
This paper is concerned with the development and use of duality theory for a nonlinear filtering model with white noise observations. The main contribution of this paper is to introduce a stochastic optimal control problem as a dual to the nonlinear filtering problem. The mathematical statement of the dual relationship between the two problems is given in the form of a duality principle. The constraint for the optimal control problem is the backward stochastic differential equation (BSDE) introduced in the companion paper. The optimal control solution is obtained from an application of the maximum principle, and subsequently used to derive the equation of the nonlinear filter. The proposed duality is shown to be an exact extension of the classical Kalman-Bucy duality, and different from other types of optimal control and variational formulations given in the literature.
Stochastic systems; Optimal control; Nonlinear filtering.
1 Introduction
In this paper, we continue the development of duality theory for nonlinear filtering. While the companion paper (part I) was concerned with a (dual) controllability counterpart of stochastic observability, the purpose of the present paper (part II) is to express the nonlinear filtering problem as a (dual) optimal control problem. The proposed duality is shown to be an exact extension of the original Kalman-Bucy duality [1, 2], in the sense that the dual optimal control problem has the same minimum variance structure for both linear and nonlinear filtering problems. Because of its historical importance, we begin by introducing and reviewing the classical duality for the linear Gaussian model.
1.1 Background and literature review
The linear Gaussian filtering model is as follows:

(1a)  $dX_t = A X_t\,dt + \sigma\,dB_t$
(1b)  $dZ_t = H X_t\,dt + dW_t$

where $X = \{X_t : 0 \le t \le T\}$ is the state process, the prior of $X_0$ is a Gaussian density with mean $m_0$ and variance $\Sigma_0$, $Z = \{Z_t : 0 \le t \le T\}$ is the observation process, and both $B = \{B_t\}$ and $W = \{W_t\}$ are Brownian motion (B.M.). It is assumed that $X_0, B, W$ are mutually independent. The model parameters are matrices $A$, $H$, and $\sigma$ of appropriate dimensions.
For this problem, the dual optimal control formulations are well-understood. These are of the following two types:

• Minimum variance optimal control problem:

(2a)  $\min_{u}\; J(u) = y_0^\top \Sigma_0\, y_0 + \int_0^T \big( y_t^\top \sigma\sigma^\top y_t + |u_t|^2 \big)\,dt$
(2b)  Subject to:  $-\frac{dy_t}{dt} = A^\top y_t + H^\top u_t, \qquad y_T = \tilde f$

where $\tilde f \in \mathbb{R}^d$ specifies the linear functional $\tilde f^\top X_T$ of the state to be estimated.

• Minimum energy optimal control problem:

(3a)  $\min_{x_0,\,u}\; J(x_0, u; z) = \tfrac{1}{2}\,|x_0 - m_0|^2_{\Sigma_0^{-1}} + \tfrac{1}{2}\int_0^T \big( |u_t|^2 + |\dot z_t - H x_t|^2 \big)\,dt$
(3b)  Subject to:  $\frac{dx_t}{dt} = A x_t + \sigma u_t$

where $z = \{z_t : 0 \le t \le T\}$ is a given sample path of observations (formally, $\dot z_t$ denotes its time derivative).
These two types of linear quadratic (LQ) optimal control problems have been known since the 1960s and are described in [3, Sec. 7.3.1 and 7.3.2]. Because it is discussed in the seminal paper [2] of Kalman and Bucy, the minimum variance duality (2) is also referred to as the Kalman-Bucy duality [4]. The relationship of the two problems to the model (1) is as follows:

• The minimum variance problem (2) is the dual to the problem of computing the least-squares estimate of the linear functional $\tilde f^\top X_T$ from the observations.

• The minimum energy problem (3) is the dual to the problem of computing the most likely state trajectory given the observation sample path $z$.
Their respective solutions are related to (1) as follows:
• The solution of the minimum variance duality (2) is useful to derive the Kalman filter for (1) [5, Ch. 7.6]. The derivation helps explain why the covariance equation of the Kalman filter is the same as the differential Riccati equation (DRE) of the LQ optimal control. Note however that the time arrow is reversed: the DRE is solved in forward time for the Kalman filter. This is because the constraint (2b) is a backward (in time) ordinary differential equation (ODE).
• The solution of the minimum energy duality (3) is a favorite technique to derive the forward-backward equations of smoothing for the model (1). The Hamilton's equation for (3) is referred to as the Bryson-Frazier formula [6, Eq. (13.3.4)]. By introducing a DRE, other forms of the solution, e.g., the Fraser-Potter smoother [7, Eq. (16)-(17)], are possible and useful in practice. (A numerical illustration of the minimum variance duality appears after this list.)
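To make the minimum variance duality (2) concrete, the following minimal numerical sketch (our own illustration, not taken from the cited references; the scalar model parameters are assumptions) discretizes the dual LQ problem and checks that its optimal value matches the estimation error variance computed from the Kalman-Bucy filter's forward DRE:

```python
import numpy as np

# Scalar linear Gaussian model (assumed example parameters):
#   dX_t = a X_t dt + s dB_t,   dZ_t = h X_t dt + dW_t
a, h, s = -0.5, 1.0, 0.8
Sigma0, T, N = 0.4, 2.0, 4000
dt = T / N

# Filter side: forward DRE for the error variance Sigma_t
#   dSigma/dt = 2 a Sigma + s^2 - (h Sigma)^2
Sigma = Sigma0
for _ in range(N):
    Sigma += dt * (2 * a * Sigma + s**2 - (h * Sigma)**2)

# Control side: dual LQ problem (2) in reversed time tau = T - t, so that
# the backward ODE (2b) becomes y' = a y + h u (scalar case), with running
# cost s^2 y^2 + u^2 and terminal cost Sigma0 y(T)^2.  Solved by the
# standard discrete-time backward Riccati recursion.
P = Sigma0                          # value coefficient at the far end
F, G = 1.0 + a * dt, h * dt         # discretized reversed-time dynamics
Q, R = s**2 * dt, dt                # discretized running-cost weights
for _ in range(N):
    P = Q + F * P * F - (F * P * G)**2 / (R + G * P * G)

f = 1.0                             # estimate the functional f * X_T
print("filter variance f^2 Sigma_T:", f**2 * Sigma)
print("dual LQ optimal value      :", f**2 * P)   # should (nearly) agree
```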
Given this background for the linear Gaussian model (1), there has been extensive work spanning decades on extending duality to the problems of nonlinear filtering and smoothing. The prominent duality-type solution approaches in the literature include the following:
• Mortensen's maximum likelihood estimator (MLE) [8].

• Minimum energy estimator (MEE) in the model predictive control (MPC) literature [9, Ch. 4].

• Log transformation relationship between the Zakai equation of nonlinear filtering and the Hamilton-Jacobi-Bellman (HJB) equation of optimal control [10].

• Mitter and Newton's variational formulation of the nonlinear smoothing problem [11].
In an early work [8], Mortensen considered a slightly more general version of the linear Gaussian model (1) where the drift terms in both (1a) and (1b) are nonlinear. Both the optimal control problem and its forward-backward solution are straightforward extensions of (3). Since the 1960s, closely related extensions have appeared under different names in different communities, e.g., maximum likelihood estimation (MLE), maximum a posteriori (MAP) estimation, and minimum energy estimation (MEE), which is discussed next.
Based on the use of duality, the theory and algorithms developed in the MPC literature are readily adapted to solve state estimation problems. The resulting class of estimators is referred to as the minimum energy estimator (MEE) [9, Ch. 4]. The MEE algorithms are broadly of two types: (i) Full information estimator (FIE) where the entire history of observations is used; and (ii) Moving horizon estimator (MHE) where only the most recent fixed window of observations is used. An important motivation is to also incorporate additional constraints in estimator design. Early papers include [12, 13, 14] and more recent extensions have appeared in [15, 16, 17, 18]. A historical survey is given in [9, Sec. 4.7] where Rawlings et al. write "establishing duality [of optimal estimator] with the optimal regulator is a favorite technique for establishing estimator stability". Although the specific comment is made for the Kalman filter, the remainder of the chapter amply demonstrates the utility of dual constructions for both algorithm design and convergence analysis (as the time-horizon $T \to \infty$). Convergence analysis typically requires additional assumptions on the model, which in turn has motivated work on nonlinear observability and detectability definitions. A literature survey of these definitions, including the connections to duality theory, appears in the introduction of the companion paper [19].
While the focus of MEE is on deterministic models, duality is also an important theme in the study of nonlinear stochastic systems (hidden Markov models). A key concept is the log transformation [20]. In [10], the log transformation was used to transform the Zakai equation into a Hamilton-Jacobi-Bellman (HJB) equation. Because of this, the negative log of a posterior density is a value function of some stochastic optimal control problem (this is how duality is understood in stochastic settings [21, Sec. 4.8]). While the problem itself was not clarified in [10] (see however [22]), Mitter and Newton introduced a dual optimal control problem in [11] based on a variational interpretation of the Bayes' formula. This work continues to influence algorithm design, which remains an important area of research [23, 24, 25, 26, 27]. A notable ensuing contribution appeared in the PhD thesis [28] where the Mitter-Newton duality is used to obtain results on nonlinear filter stability.
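To illustrate the mechanism of the log transformation in the simplest possible setting (a heat equation rather than the Zakai equation; this simplified example is ours), note that the transform $V = -\log q$ converts a linear equation into an HJB-type equation:

\[
\partial_t q = \tfrac{1}{2}\Delta q
\qquad \Longrightarrow \qquad
\partial_t V = \tfrac{1}{2}\Delta V - \tfrac{1}{2}|\nabla V|^2, \qquad V := -\log q
\]

The right-hand side is recognized as the HJB equation of a stochastic control problem with controlled drift and quadratic running cost $\frac{1}{2}|u_t|^2$. In [10], an analogous transformation applied to the Zakai equation identifies the negative log of the (unnormalized) posterior with a value function.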
Given the importance of duality for the purposes of stability analysis in both deterministic and stochastic settings of the problem, it is useful to return to the linear Gaussian model (1) and compare the two types of duality (2) and (3). An important point, that has perhaps not been stressed in the literature, is that the minimum variance duality (2) is more compatible with the classical duality between controllability and observability in linear systems theory. This is because of the following reasons:

• Inputs and outputs. The control input in (2) has the same dimension as the observation (output) process, so the input of the dual problem pairs naturally with the output of the model. In contrast, the control input in (3) has the dimension of the process noise.
• Constraint. If we ignore the noise terms in (1) then the resulting deterministic state-output system ($\dot x_t = A x_t$ and $y_t = H x_t$) shares a dual relationship with the deterministic state-input system (2b). (It is shown in part I [19, Sec. III-F] that (2b) is also the dual for the stochastic system (1).) In contrast, the ODE (3b) is a modified copy of the model (1a).
• Stability condition. The condition for asymptotic analysis of (2) is stabilizability of (2b) and, by duality, detectability of $(A, H)$. The latter is known to be also the appropriate condition for stability of the Kalman filter. In contrast, for (3), asymptotic convergence of the optimal trajectory is possible without detectability playing any apparent role. The important condition again is detectability of $(A, H)$, but it is not at all easy to see this from (3).
• Arrow of time. Because the respective DREs are solved forward (resp. backward) in time for optimal filtering (resp. control), the arrow of time flips between optimal control and optimal filtering. Evidently, this is the case for the minimum variance duality (2) but not so for the minimum energy duality (3): the constraint (2b) is a backward-in-time ODE while the constraint (3b) is a modified copy of the signal model which proceeds forward in time.
All of this suggests that a fruitful approach – for both defining observability and for using the definition for asymptotic stability analysis – is to consider the minimum variance duality, which naturally begets the following questions:
• What is the extension of the minimum variance duality (2) to the nonlinear filtering problem?

• What type of duality is implicit in Mitter-Newton's work? It is already evident that MEE is an extension of (3).
Both of these questions are answered in the present paper (for the white noise observation model). Before discussing the original contributions, it is noted that past work on minimum variance duality has concerned refinements and extensions of the linear model with additional constraints. In [29], it is used to obtain the solution to a class of singular regulator problems, and in [30], the Lagrangian dual of an MEE problem with truncated measurement noise is considered. Numerical algorithms for (2) and its extensions appear in [31, 32, 33, 34]. Prior to our work, it was widely believed that a nonlinear extension of the minimum variance duality is not possible [4].
1.2 Summary of original contributions
The main contribution of this paper is to present a minimum variance dual to the nonlinear filtering problem. As in the companion paper (part I), the nonlinear filtering problem is for the HMM with the white noise observation model. The mathematical statement of the dual relationship between optimal filtering and optimal control is given in the form of a duality principle (Thm. 1). The principle relates the value of the control problem to the variance of the filtering problem. The classical Kalman-Bucy duality (2) is recovered as a special case for the linear-Gaussian model (1).
Two approaches are described to solve the optimal control problem: (i) based on the use of the stochastic maximum principle to derive the Hamilton's equations (Thm. 4.9); and (ii) based on a martingale characterization (Thm. 5.14). A formula for the optimal control as a feedback control law is obtained and used to derive the equation of the optimal nonlinear filter. Our duality is also related to the Mitter-Newton duality, with a side-by-side comparison given in Table 1.
This paper is drawn from the PhD thesis of the first author [35]. A prior conference version appeared in [36]. While the duality principle was already stated in the conference paper, it relied on a certain assumption [36, Assumption A1] which has now been proved. Various formulae are stated more simply, e.g., through the use of the carré du champ operator to specify the running cost. Issues related to function spaces have been clarified to a large extent. While the conference version relied on the innovation process, the present version directly works with the observation process. Such a choice is more natural for the problem at hand. As a result, most of the results and certainly their proofs are novel. The comparison with the Mitter-Newton duality is also novel.
1.3 Paper outline
The outline of the remainder of this paper is as follows: The mathematical model and necessary background appears in Sec. 2. The dual optimal control problem together with the duality principle and its relation to the linear-Gaussian case is described in Sec. 3. Its solution using the maximum principle and the martingale characterization appears in Sec. 4 and Sec. 5, respectively. Duality-based derivation of the equation of the nonlinear filter appears in Sec. 6. A comparison with the Mitter-Newton duality is contained in Sec. 7. The paper closes with some conclusions and directions for future work in Sec. 8. All the proofs are contained in the Appendix.
2 Background
We briefly review the model and the notation as presented in [19]. Although the presentation is self-contained, it is in an abbreviated form with a focus on the additional concepts that are necessary for this paper.
On the probability space $(\Omega, \mathcal{F}, \mathsf{P})$, we consider a pair of continuous-time stochastic processes $(X, Z)$ as follows:

• The state process $X = \{X_t : 0 \le t \le T\}$ is a Feller-Markov process taking values in the state-space $\mathbb{S}$. The prior is denoted by $\mu \in \mathcal{P}(\mathbb{S})$ (the space of probability measures on $\mathbb{S}$), i.e., $X_0 \sim \mu$. The infinitesimal generator is denoted by $\mathcal{A}$.

• The observation process $Z = \{Z_t : 0 \le t \le T\}$ satisfies the stochastic differential equation (SDE):

(4)  $dZ_t = h(X_t)\,dt + dW_t, \qquad Z_0 = 0$

where $h : \mathbb{S} \to \mathbb{R}^m$ is referred to as the observation function and $W = \{W_t\}$ is an $m$-dimensional Brownian motion (B.M.). We write $W$ is $\mathsf{P}$-B.M. It is assumed that $W$ is independent of $X$.
The above is referred to as the white noise observation model of nonlinear filtering. The model is denoted by $(\mathcal{A}, h)$.

An important additional concept in this paper is the carré du champ operator $\Gamma$ defined as follows (see [37]):

$(\Gamma f)(x) = (\mathcal{A} f^2)(x) - 2 f(x)\,(\mathcal{A} f)(x), \qquad x \in \mathbb{S}$

where $f$ is a test function. Explicit formulae for the most important examples are described next.
2.1 Guiding examples
Example 1 (Finite state-space)

$\mathbb{S} = \{1, 2, \ldots, d\}$. A real-valued function $f$ is identified with a vector in $\mathbb{R}^d$ where the $x$-th element of the vector is $f(x)$. In this manner, the generator of the Markov process is identified with a rate matrix $A$ (the non-diagonal elements of $A$ are non-negative and each row sums to zero). The carré du champ operator is as follows:

(5)  $(\Gamma f)(x) = \sum_{y \in \mathbb{S}} A(x, y)\,(f(y) - f(x))^2, \qquad x \in \mathbb{S}$
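As a quick sanity check of (5) (our own illustration; the rate matrix is randomly generated), the explicit formula can be verified numerically against the definition $\Gamma f = \mathcal{A}(f^2) - 2 f\,\mathcal{A} f$:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
# Random rate matrix: off-diagonal entries non-negative, rows sum to zero
A = rng.random((d, d))
np.fill_diagonal(A, 0.0)
np.fill_diagonal(A, -A.sum(axis=1))

f = rng.standard_normal(d)

# Definition: Gamma f = A(f^2) - 2 f (A f)
gamma_def = A @ f**2 - 2 * f * (A @ f)
# Explicit formula (5): sum_y A(x,y) (f(y) - f(x))^2
gamma_formula = np.array([A[x] @ (f - f[x])**2 for x in range(d)])

assert np.allclose(gamma_def, gamma_formula)
print(gamma_def)
```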
Example 2 (Euclidean state-space)

$\mathbb{S} = \mathbb{R}^d$. The Markov process is an Itô diffusion modeled using a stochastic differential equation (SDE):

$dX_t = a(X_t)\,dt + \sigma(X_t)\,dB_t$

where $a(\cdot)$ and $\sigma(\cdot)$ satisfy appropriate technical conditions such that a strong solution exists for $0 \le t \le T$, and $B = \{B_t\}$ is a standard B.M. assumed to be independent of $X_0$ and $W$. In the Euclidean case, all the measures are identified with their density. In particular, we use the notation $p_0$ to denote the probability density function of the prior.

The infinitesimal generator acts on functions in its domain according to [38, Thm. 7.3.3]

$(\mathcal{A} f)(x) = a^\top(x)\,\nabla f(x) + \tfrac{1}{2}\operatorname{tr}\big( \sigma\sigma^\top(x)\,\nabla^2 f(x) \big)$

where $\nabla f$ is the gradient vector and $\nabla^2 f$ is the Hessian matrix. For such $f$, the carré du champ operator is given by

(6)  $(\Gamma f)(x) = |\sigma^\top(x)\,\nabla f(x)|^2$
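Formula (6) can likewise be verified symbolically from the definition; a minimal sketch for $d = 1$ (our own, with arbitrary assumed coefficients $a$, $\sigma$ and test function $f$):

```python
import sympy as sp

x = sp.symbols('x')
a = sp.sin(x)          # arbitrary drift (assumption for the example)
s = 1 + x**2 / 2       # arbitrary diffusion coefficient
f = sp.exp(-x**2)      # arbitrary test function

def gen(g):            # generator of the scalar Ito diffusion
    return a * sp.diff(g, x) + sp.Rational(1, 2) * s**2 * sp.diff(g, x, 2)

gamma = sp.simplify(gen(f**2) - 2 * f * gen(f))      # definition
formula = sp.simplify((s * sp.diff(f, x))**2)        # formula (6), d = 1
assert sp.simplify(gamma - formula) == 0
```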
Example 3 (Linear Gaussian model)

The model (1) introduced in Sec. 1 is a special case of the Itô diffusion where the drift terms are linear, $a(x) = Ax$ and $h(x) = Hx$, the coefficient of the process noise $\sigma$ is a constant matrix, and the prior is a Gaussian density. A real-valued linear function is expressed as

$f(x) = \tilde f^\top x$

where $\tilde f \in \mathbb{R}^d$. Then $\mathcal{A} f$ is also a linear function given by

$(\mathcal{A} f)(x) = (A^\top \tilde f)^\top x$

and $\Gamma f$ is a constant function given by

(7)  $(\Gamma f)(x) = |\sigma^\top \tilde f|^2$
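In anticipation of Sec. 3.3, we note (this observation is ours, included to connect the example to the duality) that (7) is precisely the source of the running cost term $y_t^\top \sigma\sigma^\top y_t$ in the classical dual problem (2a): evaluated on a linear function with coefficient vector $y_t$, the carré du champ gives $|\sigma^\top y_t|^2 = y_t^\top \sigma\sigma^\top y_t$.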
2.2 Background on nonlinear filtering
The canonical filtration is denoted $\mathcal{F} = \{\mathcal{F}_t : 0 \le t \le T\}$ where $\mathcal{F}_t := \sigma(\{(X_s, Z_s) : 0 \le s \le t\})$. The filtration generated by the observation is denoted by $\mathcal{Z} = \{\mathcal{Z}_t : 0 \le t \le T\}$ where $\mathcal{Z}_t := \sigma(\{Z_s : 0 \le s \le t\})$. A standard approach is based upon the Girsanov change of measure. Suppose the model satisfies Novikov's condition: $\mathsf{E}\big[\exp\big(\tfrac{1}{2}\int_0^T |h(X_t)|^2\,dt\big)\big] < \infty$. Define a new measure $\tilde{\mathsf{P}}$ on $(\Omega, \mathcal{F}_T)$ as follows:

$\frac{d\tilde{\mathsf{P}}}{d\mathsf{P}} = D_T^{-1} \quad\text{where}\quad D_t := \exp\Big( \int_0^t h^\top(X_s)\,dZ_s - \tfrac{1}{2}\int_0^t |h(X_s)|^2\,ds \Big)$

Then it is shown that the probability law for $X$ is unchanged but $Z$ is a $\tilde{\mathsf{P}}$-B.M. that is independent of $X$ [28, Lem. 1.1.5]. The expectation with respect to $\tilde{\mathsf{P}}$ is denoted by $\tilde{\mathsf{E}}$.

The two probability measures are used to define the un-normalized and the normalized (or nonlinear) filter as follows: For a function $f$ and $0 \le t \le T$,

$\sigma_t(f) := \tilde{\mathsf{E}}\big[ D_t\, f(X_t) \mid \mathcal{Z}_t \big], \qquad \pi_t(f) := \frac{\sigma_t(f)}{\sigma_t(1)}$

As the name suggests, $\pi_t(f) = \mathsf{E}[f(X_t) \mid \mathcal{Z}_t]$, which is referred to as the Kallianpur-Striebel formula [39, Thm. 5.3] (here $1$ is the constant function $1(x) = 1$ for all $x \in \mathbb{S}$). Combining the tower property of conditional expectation with the change of measure gives

(8)  $\mathsf{E}\big[ f(X_t) \big] = \tilde{\mathsf{E}}\big[ D_t\, f(X_t) \big] = \tilde{\mathsf{E}}\big[ \sigma_t(f) \big]$
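The Kallianpur-Striebel formula suggests an immediate Monte-Carlo approximation: sample independent copies of $X$ from its prior law and weight them by $D_t$. The following sketch (our own illustration; the model, with drift $-x$, $h(x) = \sin x$, and a point-mass prior, is an assumption made for the example) approximates $\pi_T(f)$ in this way:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, M = 1.0, 200, 20000          # horizon, time steps, particles
dt = T / N

def h(x):                          # observation function (assumed example)
    return np.sin(x)

# Simulate one "true" state path and its observation increments dZ
x = 0.5
Z_inc = np.empty(N)
for k in range(N):
    Z_inc[k] = h(x) * dt + np.sqrt(dt) * rng.standard_normal()
    x += -x * dt + np.sqrt(dt) * rng.standard_normal()

# Under the change of measure, X-paths are sampled from the prior law and
# weighted by D_T = exp( int h dZ - 1/2 int h^2 dt )
X = np.full(M, 0.5)                # point-mass prior, for simplicity
logD = np.zeros(M)
for k in range(N):
    logD += h(X) * Z_inc[k] - 0.5 * h(X)**2 * dt
    X += -X * dt + np.sqrt(dt) * rng.standard_normal(M)

w = np.exp(logD - logD.max())
f = X                                            # estimate f(x) = x
print("pi_T(f) ~", np.sum(w * f) / np.sum(w))    # sigma_T(f)/sigma_T(1)
```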
2.3 Function spaces
The notation $L^2_{\mathcal{Z}_T}(\mathbb{R}^m)$ and $L^2_{\mathcal{Z}}([0,T]; \mathbb{R}^m)$ is used to denote the Hilbert space of $\mathcal{Z}_T$-measurable random vectors and $\mathcal{Z}$-adapted stochastic processes, respectively. These Hilbert spaces suffice if the state-space is finite. In general settings, let $\mathcal{Y}$ denote a suitable Banach space of real-valued functions on $\mathbb{S}$, equipped with the norm $\|\cdot\|_{\mathcal{Y}}$. Then

• For a random function, the Banach space is $L^2_{\mathcal{Z}_T}(\mathcal{Y})$, the space of $\mathcal{Z}_T$-measurable $\mathcal{Y}$-valued random variables with finite second moment.

• For a function-valued stochastic process, the Banach space is $L^2_{\mathcal{Z}}([0,T]; \mathcal{Y})$, the space of $\mathcal{Z}$-adapted $\mathcal{Y}$-valued processes with finite second moment.

In the remainder of this paper, we set $\mathcal{Y} = C_b(\mathbb{S})$ (the space of continuous and bounded functions) equipped with the sup-norm. The dual space (the space of rba measures) is denoted by $\mathcal{Y}^\dagger$, where the duality pairing is $\langle \varrho, f \rangle := \varrho(f) = \int_{\mathbb{S}} f\, d\varrho$ for $\varrho \in \mathcal{Y}^\dagger$ and $f \in \mathcal{Y}$.
3 Main result: The duality principle
3.1 Problem statement
For a function $f$, the nonlinear filter is the minimum variance estimate of $f(X_T)$ [3, Sec. 6.1.2]:

(9)  $\pi_T(f) = \operatorname*{arg\,min}_{S_T \in L^2_{\mathcal{Z}_T}(\mathbb{R})} \; \mathsf{E}\big[ |f(X_T) - S_T|^2 \big]$

Our goal is to express the above minimum variance optimization problem as a dual optimal control problem.

The conditional variance is denoted by

$\operatorname{var}_T(f) := \pi_T\big( (f - \pi_T(f))^2 \big)$

For notational ease, the expected value of the conditional variance is denoted by

$\mathsf{V}_T(f) := \mathsf{E}\big[ \operatorname{var}_T(f) \big]$

Strictly speaking, the above is a variance only at time $t = 0$ (when the filtration is trivial). However, the verbiage is consistent with the "minimum variance" interpretation of the nonlinear filter.
3.2 Dual optimal control problem
The function space of admissible control inputs is denoted by $\mathcal{U} := L^2_{\mathcal{Z}}([0,T]; \mathbb{R}^m)$. An element of $\mathcal{U}$ is denoted $U = \{U_t : 0 \le t \le T\}$. It is referred to as the control input. The main contribution of this paper is the following problem.

• Minimum variance optimal control problem:

(10a)  $\min_{U \in \mathcal{U}} \; J_T(U) = \mathsf{E}\Big[ \mu\big( |Y_0 - \mu(Y_0)|^2 \big) + \int_0^T l(Y_t, V_t, U_t; X_t)\, dt \Big]$

Subject to (BSDE constraint):

(10b)  $-dY_t(x) = \big( (\mathcal{A} Y_t)(x) + h^\top(x)\,(U_t + V_t(x)) \big)\,dt - V_t^\top(x)\,dZ_t, \qquad Y_T(x) = f(x), \quad x \in \mathbb{S}$

where the running cost

$l(y, v, u; x) := (\Gamma y)(x) + |u + v(x)|^2$

defined for a function $y$, an $\mathbb{R}^m$-valued function $v$, a vector $u \in \mathbb{R}^m$, and a point $x \in \mathbb{S}$.
Remark 1

The BSDE (10b) is introduced in the companion paper (part I) as the dual control system. The data for the BSDE are the given terminal condition $f$ and the control input $U$. The solution of the BSDE is the pair $(Y, V)$ which is (forward) adapted to the filtration $\mathcal{Z}$. Existence, uniqueness, and regularity theory for linear BSDEs is standard, and throughout the paper we assume that the solution of the BSDE is uniquely determined for each given $f$ and $U$. The well-posedness results for the finite state-space can be found in [40, Ch. 7] and for the Euclidean state-space in [41].
Theorem 1 (Duality principle)

For any admissible control $U \in \mathcal{U}$, consider an estimator

(11)  $S_T := \mu(Y_0) - \int_0^T U_t^\top\, dZ_t$

Then

(12)  $\mathsf{E}\big[ |f(X_T) - S_T|^2 \big] = \mathsf{E}\Big[ \mu\big( |Y_0 - \mu(Y_0)|^2 \big) + \int_0^T l(Y_t, V_t, U_t; X_t)\,dt \Big] = J_T(U)$
Proof 3.2.
See Appendix A.1.
The problem (10) is a stochastic linear quadratic optimal control problem for which there is a well-established existence-uniqueness theory for the optimal control solution. Application of this theory is the subject of the following section. For now, we assume that the optimal control is well-defined and denote it as $U^{\text{opt}}$. Because the right-hand side of the identity (12) is bounded below by the minimum mean-squared error $\mathsf{V}_T(f)$ in (9), the duality gap

$J_T(U^{\text{opt}}) - \mathsf{V}_T(f) \ge 0$

In order to conclude that the duality gap is zero, it is both necessary and sufficient to show that there exists a $U \in \mathcal{U}$ such that the estimator $S_T$, as given by (11), equals $\pi_T(f)$. Since $Z$ is a $\tilde{\mathsf{P}}$-B.M., the following lemma is a consequence of the Itô representation theorem for Brownian motion [38, Thm. 4.3.3].
Lemma 3.3.

For any $S_T \in L^2_{\mathcal{Z}_T}(\mathbb{R})$, there exists a unique $U \in \mathcal{U}$ such that

$S_T = \tilde{\mathsf{E}}[S_T] - \int_0^T U_t^\top\, dZ_t$
Proof 3.4.
See Appendix A.2.
Because the duality gap is zero, the following implications are obtained:

• The optimal control gives the conditional mean:

$S_T^{\text{opt}} = \pi_T(f)$

where $S_T^{\text{opt}}$ denotes the estimator (11) with $U = U^{\text{opt}}$.

• The optimal value is the expected value of the conditional variance:

$J_T(U^{\text{opt}}) = \mathsf{V}_T(f)$

where $(Y^{\text{opt}}, V^{\text{opt}})$ is the optimally controlled stochastic process obtained with $U = U^{\text{opt}}$ in (10b).
In fact, these two implications carry over to the entire optimal trajectory.
Proposition 3.5.

Suppose $U^{\text{opt}}$ is the optimal control input and that $(Y^{\text{opt}}, V^{\text{opt}})$ is the associated solution of the BSDE (10b). Then for almost every $t \in [0, T]$,

(13)  $\pi_t\big(Y_t^{\text{opt}}\big) = \mu\big(Y_0^{\text{opt}}\big) - \int_0^t (U_s^{\text{opt}})^\top\, dZ_s$

(14)  $\mathsf{E}\Big[ \pi_t\big( (Y_t^{\text{opt}} - \pi_t(Y_t^{\text{opt}}))^2 \big) \Big] + \mathsf{E}\Big[ \int_t^T l\big(Y_s^{\text{opt}}, V_s^{\text{opt}}, U_s^{\text{opt}}; X_s\big)\, ds \Big] = \mathsf{V}_T(f)$
Proof 3.6.
See Appendix A.3.
Consequently, the expected value of the conditional variance is the optimal cost-to-go (for a.e. $t$). We do not yet have a formula for the optimal control $U^{\text{opt}}$. The difficulty arises because there is no HJB equation for a BSDE-constrained optimal control problem. Instead, the literature on such problems utilizes the stochastic maximum principle for BSDEs, which is the subject of the next section. Before that, we discuss the linear Gaussian case.
3.3 Linear Gaussian case
The goal is to show that the classical Kalman-Bucy duality (2) described in Sec. 1 for the linear Gaussian model (1) is a special case. Consider a linear function $f(x) = \tilde f^\top x$ where $\tilde f \in \mathbb{R}^d$ is a given deterministic vector. The problem is to compute a minimum variance estimate of the scalar random variable $\tilde f^\top X_T$. It is given by $\pi_T(f)$. Now, it is a standard result in the theory of Gaussian processes that the conditional expectation can be evaluated in the form of a linear predictor [42, Cor. 1.10]. For this reason, it suffices to consider an estimator of the form

$S_T = c - \int_0^T u_t^\top\, dZ_t$

where $c \in \mathbb{R}$ and $u = \{u_t : 0 \le t \le T\}$ are both deterministic (the lower-case notation is used to stress this). Consequently, for linear Gaussian estimation, we can restrict the admissible space of control inputs to the deterministic functions $L^2([0,T]; \mathbb{R}^m)$, which is a much smaller subspace of $\mathcal{U}$. Using a deterministic control $u$, and the terminal condition $f(x) = \tilde f^\top x$, the solution of the BSDE is given by

$Y_t(x) = y_t^\top x, \qquad V_t \equiv 0$

where $y = \{y_t\}$ is a solution of the backward ODE:

$-\frac{dy_t}{dt} = A^\top y_t + H^\top u_t, \qquad y_T = \tilde f$

Using the formula (7) for the carré du champ, the running cost

$l(Y_t, V_t, u_t; x) = |\sigma^\top y_t|^2 + |u_t|^2 = y_t^\top \sigma\sigma^\top y_t + |u_t|^2$

With the Gaussian prior, the initial cost $\mu\big( |Y_0 - \mu(Y_0)|^2 \big) = y_0^\top \Sigma_0\, y_0$. Combining all of the above, the optimal control problem (10) reduces to (2) for the linear Gaussian model (1).
Remark 3.7.

The solution of the optimal control problem yields the optimal control input $u^{\text{opt}} = \{u_t^{\text{opt}}\}$, along with the vector $y_0$ that determines the minimum-variance estimator:

$\pi_T(f) = y_0^\top m_0 - \int_0^T (u_t^{\text{opt}})^\top\, dZ_t$

The Kalman filter is obtained by expressing $\pi_T(f)$ as the solution to a linear SDE [5, Ch. 7.6].
4 Solution of the optimal control problem
The BSDE-constrained optimal control problem (10) is not in its standard form [43, Eq. 5.10]. There are two issues:

• The driving martingale: Under the measure $\mathsf{P}$, the process $Z$ driving the BSDE (10b) is not a Brownian motion (it is a B.M. only under $\tilde{\mathsf{P}}$).

• The filtration: The 'state' $(Y_t, V_t)$ of the optimal control problem is adapted to the filtration $\mathcal{Z}$. However, the cost function (10a) also depends upon the non-adapted exogenous process $X$.

The second problem is easily fixed by using the tower property of conditional expectation. To resolve the first problem, we have two choices:

1. Use the change of measure to evaluate expectations with respect to the measure $\tilde{\mathsf{P}}$, or

2. Express the BSDE using a driving martingale that is a $\mathsf{P}$-B.M. A convenient such process is the innovation process.

In this paper, the standard form of the dual optimal control problem is presented based on the first choice. For an analysis based on the second choice, see [36] and [35, Sec. 5.5].
In order to express the expectation in the control objective (10a) with respect to $\tilde{\mathsf{P}}$, we use the change of measure (see Appendix A.4 for the calculation) to obtain

$\mathsf{E}\Big[ \int_0^T l(Y_t, V_t, U_t; X_t)\,dt \Big] = \tilde{\mathsf{E}}\Big[ \int_0^T \mathcal{L}(Y_t, V_t, U_t; \sigma_t)\,dt \Big]$

where the Lagrangian is defined by

(15)  $\mathcal{L}(y, v, u; \sigma_t) := \sigma_t\big( \Gamma y + |u + v|^2 \big)$

The dual optimal control problem (standard form) is now expressed as follows:

(16a)  $\min_{U \in \mathcal{U}} \; J_T(U) = \tilde{\mathsf{E}}\Big[ \mu\big( |Y_0 - \mu(Y_0)|^2 \big) + \int_0^T \mathcal{L}(Y_t, V_t, U_t; \sigma_t)\,dt \Big]$

Subject to:

(16b)  $-dY_t(x) = \big( (\mathcal{A} Y_t)(x) + h^\top(x)\,(U_t + V_t(x)) \big)\,dt - V_t^\top(x)\,dZ_t, \qquad Y_T(x) = f(x)$

where now $Z$ is a $\tilde{\mathsf{P}}$-B.M.
Remark 4.8.

The Lagrangian $\mathcal{L}$ is a time-dependent random functional of the dual state $(y, v)$ and the control $u$. The randomness and time-dependency come only from the last argument $\sigma_t$.
4.1 Solution using the maximum principle
Because the dual state is a function, the co-state is a measure. The Hamiltonian is defined in terms of the duality pairing between the co-state and the drift of the constraint (16b), together with the Lagrangian (15). In the following, the Hamilton's equations for the optimal trajectory are derived by an application of the maximum principle for BSDE-constrained optimal control problems [44, Thm. 4.4].

The Hamilton's equations are expressed in terms of the derivatives of the Hamiltonian. In order to take derivatives with respect to functions and measures, we adopt the notion of Gâteaux differentiability. Given a nonlinear functional $F$ on $\mathcal{Y}$, the Gâteaux derivative $\frac{\partial F}{\partial y} \in \mathcal{Y}^\dagger$ is obtained from the defining relation [3, Sec. 10.1.3]:

$\lim_{\varepsilon \to 0} \frac{F(y + \varepsilon \phi) - F(y)}{\varepsilon} = \Big\langle \frac{\partial F}{\partial y}(y),\, \phi \Big\rangle \qquad \forall\, \phi \in \mathcal{Y}$

For the problem at hand, the derivatives of the Hamiltonian are computed explicitly in terms of $\mathcal{A}^\dagger$, the adjoint of $\mathcal{A}$ (whereby $\langle \mathcal{A}^\dagger \varrho, f \rangle = \langle \varrho, \mathcal{A} f \rangle$ for all $f \in \mathcal{Y}$). Using this notation, the Hamilton's equations are as follows:
Theorem 4.9.

Consider the optimal control problem (16). Suppose $U^{\text{opt}}$ is the optimal control input and $(Y^{\text{opt}}, V^{\text{opt}})$ is the associated solution of the BSDE (16b). Then there exists a $\mathcal{Z}$-adapted measure-valued stochastic process $P = \{P_t : 0 \le t \le T\}$ (the co-state) such that the Hamilton's equations hold: the state equation (17a) (the BSDE (16b) evaluated at the optimal control), the co-state equation (17b) (a forward equation for $P$), and the boundary condition (17c), where the optimal control is given by the first-order optimality condition

(18)  $\frac{\partial H}{\partial u}\big( Y_t^{\text{opt}}, V_t^{\text{opt}}, U_t^{\text{opt}}, P_t \big) = 0$

(In (17c), the boundary condition is expressed in terms of the Radon-Nikodym (R-N) derivative of the co-state measure with respect to the measure $\sigma$.)
Proof 4.10.
See Appendix A.5.
Remark 4.11.

From linear optimal control theory, it is known that $P_t$ is related to $Y_t$ by a ($\mathcal{Z}_t$-measurable) linear transformation [40, Sec. 6.6]. The boundary condition suggests that the R-N derivative

(19)  $\frac{dP_t}{d\sigma_t} = Y_t - \pi_t(Y_t)$

This is indeed the case, as we show in Appendix A.6 by verifying that (19) solves the Hamilton's equations. Combining this formula with (18), we have a formula for the optimal control input as a feedback control law:

$U_t^{\text{opt}} = -\big( \pi_t(h\, Y_t) - \pi_t(h)\,\pi_t(Y_t) + \pi_t(V_t) \big)$
4.2 Explicit formulae for the guiding examples
Example 4.12 (Finite state-space).

(Continued from Example 1). A real-valued function $f$ (resp. a measure $\varrho$) is identified with a column vector in $\mathbb{R}^d$ where the $x$-th element of the vector represents $f(x)$ (resp. $\varrho(x)$), and the duality pairing is $\langle \varrho, f \rangle = \varrho^\top f$. In this manner, the generator $\mathcal{A}$ is identified with a rate matrix $A$ and the observation function $h$ is identified with a matrix $H$. Let $e_1, \ldots, e_d$ denote the canonical basis in $\mathbb{R}^d$. For any vector $v$, $\operatorname{diag}(v)$ is a diagonal matrix whose diagonal entries are defined as $(\operatorname{diag}(v))_{xx} = v_x$ for $x = 1, \ldots, d$. A matrix $M$ is mapped to the $d$-dimensional vector of its diagonal entries $M_{xx}$ for $x = 1, \ldots, d$.

With these identifications, the Lagrangian and the Hamiltonian become functions of vector-valued arguments, the functional derivatives are now ordinary partial derivatives, and the Hamilton's equations (17) reduce to equations for vector-valued processes driven by the observation.
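With these identifications, the running cost of the dual problem (10) is straightforward to evaluate; the following sketch (our own, for an assumed two-state example with scalar observation, and using the running cost $l(y, v, u; x) = (\Gamma y)(x) + |u + v(x)|^2$ from Sec. 3.2) computes it for all $x$ at once:

```python
import numpy as np

# Two-state example (assumed numbers), scalar observation (m = 1)
A = np.array([[-1.0, 1.0],
              [ 2.0, -2.0]])       # rate matrix: rows sum to zero
h = np.array([0.0, 1.0])           # observation function as a vector

def running_cost(y, v, u):
    """l(y, v, u; x) = (Gamma y)(x) + |u + v(x)|^2, for all x at once."""
    gamma_y = A @ y**2 - 2 * y * (A @ y)
    return gamma_y + (u + v)**2

y = np.array([1.0, -1.0])          # dual state (a function on S)
v = np.array([0.3, -0.2])          # second component of the BSDE solution
u = 0.5                            # control input (scalar since m = 1)
print(running_cost(y, v, u))       # vector of l(y, v, u; x), x = 1, 2
```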
Example 4.13 (Euclidean state-space).

(Continued from Example 2). We consider the Itô diffusion of Example 2 in $\mathbb{R}^d$ with a prior density denoted as $p_0$. Likewise, the measures $\sigma_t$ and $P_t$ are identified with their respective densities. Doing so, the Lagrangian and the Hamiltonian are expressed as integrals over $\mathbb{R}^d$.

The functional derivatives are computed by evaluating the first variation. In the resulting formulae, the adjoint $\mathcal{A}^\dagger$ is the formal PDE adjoint of the generator:

$(\mathcal{A}^\dagger p)(x) = -\nabla \cdot \big( a(x)\, p(x) \big) + \tfrac{1}{2} \sum_{i,j=1}^d \frac{\partial^2}{\partial x_i \partial x_j} \big( (\sigma\sigma^\top)_{ij}(x)\, p(x) \big)$

Therefore, the Hamilton's equations become (stochastic) partial differential equations, where note that the co-state is now a (random) density function (in the same way as the dual state $Y$ is a random function).
5 Martingale characterization
Although we do not have an HJB equation, a martingale characterization of the optimal solution is possible, as described in the following theorem:

Theorem 5.14.

Fix $f \in \mathcal{Y}$. Consider the $\mathcal{Z}$-adapted real-valued stochastic process

$S_t := \pi_t\big( (Y_t - \pi_t(Y_t))^2 \big) - \int_0^t \pi_s\big( \Gamma Y_s + |U_s + V_s|^2 \big)\, ds, \qquad 0 \le t \le T$

where $(Y, V)$ is the solution to the BSDE (10b) and $\pi = \{\pi_t\}$ is the nonlinear filter. Then $S = \{S_t\}$ is a $\mathsf{P}$-supermartingale, and $S$ is a $\mathsf{P}$-martingale if and only if

(20)  $U_t = -\big( \pi_t(h\, Y_t) - \pi_t(h)\,\pi_t(Y_t) + \pi_t(V_t) \big)$

for a.e. $t \in [0, T]$, $\mathsf{P}$-a.s.
Proof 5.15.
See Appendix A.7.
A direct consequence of Thm. 5.14 is the optimality of the control (20), because the supermartingale property gives

$\mathsf{E}[S_T] \le \mathsf{E}[S_0]$

which means

$J_T(U) = \mathsf{E}[S_0] + \mathsf{V}_T(f) - \mathsf{E}[S_T] \ge \mathsf{V}_T(f)$

with equality if and only if $U$ is given by (20). Therefore, the expected value of the conditional variance is the optimal value functional for the optimal control problem.
Remark 5.16.

We now have a complete solution of the optimal control problem (10). Remarkably, the solution admits a meaningful interpretation not only at the terminal time $T$ but also for intermediate times $t \in [0, T]$. At time $t$,

• The optimal value functional is the expected value of the conditional variance (formula (14)).

• The optimal control is given by the feedback control law (20).

• The optimal estimate is $\pi_t(Y_t^{\text{opt}})$, obtained from the optimal control by formula (13).

Formula (13) for the optimal estimate explicitly connects the optimal control to the optimal filter. In particular, the optimal control up to time $t$ yields an optimal estimate of $Y_t^{\text{opt}}(X_t)$.

Because of the BSDE-constrained nature of the optimal control problem (10), an explicit characterization of the optimal value functional and the feedback form of the optimal control are both welcome surprises. It is noted that the feedback formula (20) for the optimal control has been derived using two approaches: using the maximum principle (Rem. 4.11) and using the martingale characterization (Thm. 5.14).
6 Derivation of the nonlinear filter
From Prop. 3.5, using the formula (20) for the optimal control,

(21)  $\pi_t(Y_t) = \mu(Y_0) + \int_0^t \big( \pi_s(h\, Y_s) - \pi_s(h)\,\pi_s(Y_s) + \pi_s(V_s) \big)^\top dZ_s$

for $t \in [0, T]$. Because the equation for $(Y, V)$ is known, a natural question is whether (21) can be used to obtain the equation for the nonlinear filter (akin to the derivation of the Kalman filter described in Rem. 3.7). A formal derivation of the nonlinear filter along these lines is given in Appendix A.8.
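For the finite state-space model of Example 1, the resulting filter equation is the Wonham filter. The following minimal Euler-discretized simulation (our own illustration, with assumed parameters) shows the equation in action:

```python
import numpy as np

rng = np.random.default_rng(2)
# Two-state Markov chain (assumed example), scalar observation
#   dZ_t = h(X_t) dt + dW_t
A = np.array([[-1.0, 1.0],
              [ 2.0, -2.0]])           # rate matrix (generator)
h = np.array([0.0, 1.0])
T, N = 5.0, 5000
dt = T / N

x = 0                                   # true state (index)
pi = np.array([0.5, 0.5])               # filter initialized at the prior
for _ in range(N):
    dZ = h[x] * dt + np.sqrt(dt) * rng.standard_normal()
    # Wonham filter: d pi = A^T pi dt + (diag(h) - pi^T h) pi (dZ - pi^T h dt)
    hbar = pi @ h
    pi = pi + A.T @ pi * dt + (h - hbar) * pi * (dZ - hbar * dt)
    pi = np.clip(pi, 1e-10, None); pi /= pi.sum()   # guard against round-off
    # propagate the true chain
    if rng.random() < -A[x, x] * dt:
        x = 1 - x
print("terminal filter estimate:", pi, " true state:", x)
```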
Table 1: Side-by-side comparison of the two types of duality.

| | Mitter-Newton duality | Duality proposed in this paper |
|---|---|---|
| Filtering/smoothing objective | Minimize relative entropy (Eq. (22)) | Minimize variance (Eq. (9)) |
| Observation (output) process | Pathwise ($z$ is a sample path) | $Z$ is a stochastic process |
| Control (input) process | $u_t$ has the dimension of the process noise | $U_t$ has the dimension of the observation |
| Dual optimal control problem | Eq. (23) | Eq. (10) |
| Arrow of time | Forward in time | Backward in time |
| Dual state-space | $\mathbb{S}$: same as the state-space for $X$ | $\mathcal{Y}$: the space of functions on $\mathbb{S}$ |
| Constraint | Controlled copy of the state process SDE (23b) | Dual control system BSDE (10b) |
| Running cost (Lagrangian) | Quadratic in the control (Eq. (23a)) | $l(y,v,u;x) = (\Gamma y)(x) + \lvert u + v(x)\rvert^2$ |
| Value function (its interpretation) | Minus log of the posterior density | Expected value of the conditional variance |
| Asymptotic analysis (condition) | Unclear | Stabilizability of BSDE ⇔ detectability of HMM |
| Optimal solution gives | Forward-backward equations of smoothing | Equation of nonlinear filtering |
| Linear-Gaussian special case | Minimum energy duality (3) | Minimum variance duality (2) |
7 Comparison with Mitter-Newton Duality
7.1 Review of Mitter-Newton duality
In [11], Mitter and Newton introduced a modified version of the Markov process $X$. The modified process is denoted by $\tilde{X}$. The problem is to pick (i) the initial prior $\tilde{\mu}$; and (ii) the state transition, such that the probability law of $\tilde{X}$ equals the conditional law of $X$ given the observations.

This is accomplished by setting up an optimization problem on the space of probability laws. Let $\mathsf{Q}$ denote the law for $\tilde{X}$, $\mathsf{P}$ denote the law for $X$, and $\mathsf{P}^z$ denote the conditional law for $X$ given an observation sample path $z$. Assuming $\mathsf{Q}$ is absolutely continuous with respect to $\mathsf{P}$, the objective function is the relative entropy between $\mathsf{Q}$ and $\mathsf{P}^z$:

(22)  $\min_{\mathsf{Q}} \; \mathrm{D}\big( \mathsf{Q} \,\|\, \mathsf{P}^z \big) := \mathsf{E}^{\mathsf{Q}}\Big[ \log \frac{d\mathsf{Q}}{d\mathsf{P}^z} \Big]$
In [28], (22) is referred to as the variational Kallianpur-Striebel formula. For Example 2 (Itô diffusion), this procedure yields the following stochastic optimal control problem:

(23a)  $\min_{\tilde{\mu},\, u} \; J(\tilde{\mu}, u; z) = \mathrm{D}\big( \tilde{\mu} \,\|\, \mu \big) + \mathsf{E}\Big[ \int_0^T \big( \tfrac{1}{2}|u_t|^2 + \tfrac{1}{2}|h(\tilde{X}_t)|^2 \big)\, dt - \int_0^T h^\top(\tilde{X}_t)\, dz_t \Big]$

(23b)  $d\tilde{X}_t = a(\tilde{X}_t)\,dt + \sigma(\tilde{X}_t)\big( u_t\,dt + d\tilde{B}_t \big), \qquad \tilde{X}_0 \sim \tilde{\mu}$

where, upon a pathwise (integration by parts) interpretation of the $dz$ integral, the running cost is expressed in terms of the generator of the controlled Markov process $\tilde{X}$. A similar construction is also possible for Example 1 (finite state-space) [28, Sec. 2.2.2], [45, Sec. 3.3].
The problem (23) is a standard stochastic optimal control problem whose solution is obtained by writing the HJB equation for the value function $V_t(x)$ (see [45]), with the optimal control given in the gradient form

$u_t^{\text{opt}} = -\sigma^\top(\tilde{X}_t)\, \nabla V_t(\tilde{X}_t)$

By expressing the value function as

$V_t(x) = -\log q_t(x)$

a direct calculation shows that the process $q = \{q_t\}$ satisfies the backward Zakai equation of the smoothing problem [46], [47, Thm. 3.8]. This shows the connection to both the log transformation and to the smoothing problem. In fact, the above can be used to derive the forward-backward equations of nonlinear smoothing (see [45] and [35, Appdx. B]).
7.2 Linear Gaussian case
The goal is to relate (23) to the minimum energy duality (3) described in Sec. 1 for the linear Gaussian model (1). In the linear Gaussian case, the controlled process (23b) becomes

(24)  $d\tilde{X}_t = A \tilde{X}_t\,dt + \sigma\big( u_t\,dt + d\tilde{B}_t \big), \qquad \tilde{X}_0 \sim N(\tilde{m}_0, \tilde{\Sigma}_0)$

where $\tilde{m}_0, \tilde{\Sigma}_0$ and $u = \{u_t\}$ are decision variables. Because the problem is linear Gaussian, it suffices to consider a linear control law of the form

(25)  $u_t = v_t + K_t\big( \tilde{X}_t - m_t \big)$

where $m_t := \mathsf{E}[\tilde{X}_t]$ and the two deterministic processes

$v = \{v_t : 0 \le t \le T\}, \qquad K = \{K_t : 0 \le t \le T\}$

are the new decision variables. With a linear control law (25), the state $\tilde{X}_t$ is a Gaussian random variable with mean $m_t$ and variance $\tilde{\Sigma}_t$. It is possible to equivalently express (23) as two un-coupled deterministic optimal control problems, for the mean and for the variance, respectively. Detailed calculations showing this are contained in Appendix A.9. In particular, it is shown that the optimal control problem for the mean is the classical minimum energy duality (3).
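The minimum energy duality for the linear Gaussian model is easy to exercise numerically. The following sketch (our own illustration, with assumed scalar parameters) solves the discretized minimum energy problem (3) as a least-squares problem over the state trajectory and compares the terminal point of the minimizing trajectory with the Kalman filter estimate; the two (approximately) coincide for the linear Gaussian model:

```python
import numpy as np

rng = np.random.default_rng(3)
a, h, s = -0.5, 1.0, 0.7            # scalar model (assumed parameters)
m0, S0 = 0.0, 1.0
T, N = 2.0, 200
dt = T / N

# simulate truth and observation increments dZ = h x dt + dW
x = m0 + np.sqrt(S0) * rng.standard_normal()
dZ = np.empty(N)
for k in range(N):
    dZ[k] = h * x * dt + np.sqrt(dt) * rng.standard_normal()
    x += a * x * dt + s * np.sqrt(dt) * rng.standard_normal()

# Minimum energy estimation: least squares over (x_0, ..., x_N)
rows, rhs = [], []
e = lambda k: np.eye(N + 1)[k]
rows.append(e(0) / np.sqrt(S0)); rhs.append(m0 / np.sqrt(S0))       # prior
for k in range(N):
    rows.append((e(k + 1) - (1 + a * dt) * e(k)) / (s * np.sqrt(dt)))
    rhs.append(0.0)                                  # process-noise energy
    rows.append(h * dt * e(k) / np.sqrt(dt))
    rhs.append(dZ[k] / np.sqrt(dt))                  # observation mismatch
xs, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

# Kalman filter (Euler discretization) for comparison
m, S = m0, S0
for k in range(N):
    m += a * m * dt + S * h * (dZ[k] - h * m * dt)
    S += (2 * a * S + s**2 - (h * S)**2) * dt
print("minimum energy x_N:", xs[-1], "  Kalman filter m_T:", m)
```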
7.3 Comparison
Table 1 provides a side-by-side comparison of the two types of duality:
• Mitter-Newton duality (23) on the left-hand side; and

• the duality (10) proposed in this paper on the right-hand side.

In Sec. 7.2 and Sec. 3.3, the two are shown to be generalizations of the classical minimum energy duality (3) and the minimum variance duality (2), respectively. All of this conclusively answers the two questions raised in Sec. 1.
We make a note of some important distinctions (compare with the bulleted list in Sec. 1):
• Inputs and outputs. In the proposed duality (10), inputs and outputs are dual processes that have the same dimension. These are elements of the same Hilbert space $L^2_{\mathcal{Z}}([0,T]; \mathbb{R}^m)$.

• Constraint. The constraint is the dual control system (10b) studied in the companion paper (part I).

• Stability condition. For asymptotic analysis of (10), stabilizability of the constraint is the most natural condition. The main result of part I was to establish that stabilizability of the dual control system is equivalent to the detectability of the HMM. The latter condition is, of course, central to filter stability.

• Arrow of time. The dual control system is backward in time. However, it is important to note that the information structure (filtration) is forward in time. In particular, all the processes are forward-adapted to the filtration defined by the observation process.
A major drawback of the proposed duality is that the problem (for the Euclidean state-space $\mathbb{R}^d$) is infinite-dimensional. This is to be expected because the nonlinear filter is infinite-dimensional. In contrast, the state-space in the minimum energy duality is $\mathbb{R}^d$, which is important for algorithm design, as in MEE. Having said that, the linear quadratic nature of the infinite-dimensional problem may prove to be useful in practical applications of this work.
8 Conclusions and directions of future work
In this paper, we presented the minimum variance dual optimal control problem for the nonlinear filtering problem. The mathematical relationship between the two problems is given by a duality principle. Two approaches are described to solve the problem, one based on the maximum principle and the other based on a martingale characterization. A formula for the optimal control as a feedback control law is obtained, and used to derive the equation of the nonlinear filter. A detailed comparison with the Mitter-Newton duality is given.
There are several possible directions of future research: An important next step is to use the controllability and stabilizability definitions of the dual control system to recover the known results in filter stability. Research on this has already begun, with preliminary results appearing in [35, Chapters 7-8] and [48, 49]. Although some sufficient conditions have been obtained and compared with the literature, a complete resolution still remains open.
Both the stability analysis and the optimal control formulation suggest natural connections to dissipativity theory. Because the dual control system is linear, one might consider supply rate functions given by quadratic forms of the dual state $(Y_t, V_t)$ and the control input $U_t$ (compare with the formula for the running cost $l$), where the weights in the quadratic form may themselves be suitable stochastic processes (which can be picked). Establishing conditions for the existence of a storage function, and relating these conditions to the properties of the HMM, may be useful for stability and robustness analysis.
Another avenue is numerical approximation of the nonlinear filter by considering sub-optimal solutions of the dual optimal control problem. The simplest choice is to consider deterministic control inputs $U$. Some preliminary work on algorithm design along these lines appears in [36, Rem. 1], [35, Sec. 9.2] and [50, Ch. 4]. In particular, for the finite state-space case, this approach provides a derivation and justification of the Kalman filter for Markov chains [51]. In this regard, it is useful to relate duality to both the feedback particle filter (FPF) [52] and to the special cases (apart from the linear Gaussian case) where the optimal filter is known to be finite-dimensional, e.g., [53].
9 Acknowledgement
It is a pleasure to acknowledge Sean Meyn and Amirhossein Taghvaei for many useful technical discussions over the years on the topic of duality. The authors also acknowledge Alain Bensoussan for his early encouragement of this work.
Appendix A Proofs of the statements
A.1 Proof of Thm. 1
For a Markov process, the following process is a martingale:

$M_t^g := g(X_t) - g(X_0) - \int_0^t (\mathcal{A} g)(X_s)\,ds$

with quadratic variation given by the carré du champ: $d[M^g]_t = (\Gamma g)(X_t)\,dt$. Upon applying the Itô-Wentzell theorem [54, Thm. 1.17] on $\{Y_t(X_t) : 0 \le t \le T\}$ (note here that all stochastic processes are forward adapted), the $h^\top V$ terms cancel and

$dY_t(X_t) = -h^\top(X_t)\, U_t\,dt + V_t^\top(X_t)\,dW_t + dM_t$

where $M$ denotes the martingale part associated with the Markov process. Integrating both sides from $0$ to $T$,

$f(X_T) = Y_0(X_0) - \int_0^T h^\top(X_t)\, U_t\,dt + \int_0^T V_t^\top(X_t)\,dW_t + M_T$

Consider now an estimator

$S_T = c - \int_0^T U_t^\top\,dZ_t$

where $c$ is a deterministic constant. Then

$f(X_T) - S_T = \big( Y_0(X_0) - c \big) + \int_0^T \big( U_t + V_t(X_t) \big)^\top dW_t + M_T$

The left-hand side is the error of the estimator. The three terms on the right-hand side are mutually independent. Therefore, upon squaring and taking an expectation,

$\mathsf{E}\big[ |f(X_T) - S_T|^2 \big] = \mathsf{E}\big[ |Y_0(X_0) - c|^2 \big] + \mathsf{E}\Big[ \int_0^T |U_t + V_t(X_t)|^2\,dt \Big] + \mathsf{E}\Big[ \int_0^T (\Gamma Y_t)(X_t)\,dt \Big]$

The proof is completed by setting $c = \mu(Y_0)$.
A.2 Proof of Lemma 3.3
Because $Z$ is a $\tilde{\mathsf{P}}$-B.M., the formula holds by the Brownian motion representation theorem [42, Thm. 5.18]. The requisite square-integrability of the integrand is preserved because $\|\cdot\|_{\mathcal{Y}}$ is the sup norm. Therefore if $S_T$ is square-integrable then so is $U$. The conclusion follows.
A.3 Proof of Prop. 3.5
Using the optimal control $U^{\text{opt}}$, $(Y^{\text{opt}}, V^{\text{opt}})$ is the solution of the BSDE (10b) with $U = U^{\text{opt}}$. Fix $t \in [0, T]$ and let

$S_t := \mu(Y_0^{\text{opt}}) - \int_0^t (U_s^{\text{opt}})^\top\,dZ_s$

Then by repeating the proof of Thm. 1, now over the time-horizon $[t, T]$, the counterpart of the identity (12) holds with $S_t$ in the role of the estimate of $Y_t^{\text{opt}}(X_t)$. If $S_t = \pi_t(Y_t^{\text{opt}})$ ($\mathsf{P}$-a.s.) then there is nothing to prove, by the uniqueness of the conditional expectation. Therefore, suppose

$\mathsf{E}\big[ |\pi_t(Y_t^{\text{opt}}) - S_t|^2 \big] > 0$

In this case, we show that there exists a $\tilde{U} \in \mathcal{U}$ such that $J_T(\tilde{U}) < J_T(U^{\text{opt}})$. Because $U^{\text{opt}}$ is the optimal control, this provides the necessary contradiction.

Set $\Delta := \pi_t(Y_t^{\text{opt}}) - S_t$ and we have $\mathsf{E}[|\Delta|^2] > 0$. Because $\Delta$ is $\mathcal{Z}_t$-measurable, by Lemma 3.3 there exists a $\bar{U}$ supported on $[0, t]$ such that

$\Delta = \tilde{\mathsf{E}}[\Delta] - \int_0^t \bar{U}_s^\top\,dZ_s$

Consider an admissible control as follows:

$\tilde{U}_s := U_s^{\text{opt}} + \bar{U}_s \ \text{ for } s \in [0, t], \qquad \tilde{U}_s := U_s^{\text{opt}} \ \text{ for } s \in (t, T]$

and denote by $(\tilde{Y}, \tilde{V})$ the solution of the BSDE with the control $\tilde{U}$. Because of the uniqueness of the solution, $\tilde{Y}_s = Y_s^{\text{opt}}$ for all $s \in [t, T]$, and therefore the cost of $\tilde{U}$ is strictly smaller than the cost of $U^{\text{opt}}$. This supplies the necessary contradiction and completes the proof.
A.4 Derivation of the Lagrangian
A.5 Proof of Thm. 4.9
A.6 Justification of the formula (19)
For notational ease, we drop the superscript and denote the optimal control input simply as $U$. In this proof, $\langle \cdot, \cdot \rangle$ is used to denote the duality pairing between functions and measures (e.g., $\langle \varrho, f \rangle = \varrho(f)$).

Let $\phi$ be an arbitrary test function. We show that

$\langle P_t, \phi \rangle = \big\langle \sigma_t, \big( Y_t - \pi_t(Y_t) \big)\,\phi \big\rangle$

This is known to be true at the initial time because of the boundary condition (17c). Therefore, the proof is carried out by taking a derivative of both sides and showing these to be identical.
A.7 Proof of Thm. 5.14
The proof uses the equation of the nonlinear filter, where $dI_t := dZ_t - \pi_t(h)\,dt$ is the innovation increment. We evaluate the differential of $\pi_t\big( (Y_t - \pi_t(Y_t))^2 \big)$ by combining the filter equation with the BSDE (10b) for $Y_t$. Similarly, the differentials of $\pi_t(Y_t)$ and $\pi_t(Y_t^2)$ are evaluated. Collecting terms, the drift of the process $S$ is the negative of a complete square in $U_t - U_t^{\text{opt}}$, plus a $\mathsf{P}$-martingale increment. Since the drift term is non-positive and the remaining part is a $\mathsf{P}$-martingale, $S$ is a $\mathsf{P}$-supermartingale, and it is a martingale if and only if $U_t = U_t^{\text{opt}}$, given by (20), for all $t$.
A.8 Formal derivation of the nonlinear filter
We begin with an ansatz

(26)  $d\pi_t(f) = \alpha_t\,dt + \beta_t^\top\,dZ_t$

where the goal is to obtain formulae for $\alpha_t$ and $\beta_t$. Because we have an equation (21) for $\pi_t(Y_t)$, let us express $d\pi_t(Y_t)$ in terms of the unknown $\alpha_t$ and $\beta_t$. Using the SDE (26) for $\pi_t$ and the BSDE (10b) for $Y_t$, apply the Itô-Wentzell formula to obtain the differential of $\pi_t(Y_t)$. Comparing with (21), matching the drift terms and the $dZ_t$ terms gives two equations for $\alpha_t$ and $\beta_t$, for $t \in [0, T]$. Because $f$, and therefore $Y_t$, is arbitrary, the second of these equations suggests setting

$\beta_t = \pi_t(h f) - \pi_t(h)\,\pi_t(f)$

using which the first equation is manipulated to show

$\alpha_t = \pi_t(\mathcal{A} f) - \beta_t^\top\, \pi_t(h)$

Substituting the expressions for $\alpha_t$ and $\beta_t$ into the ansatz (26),

$d\pi_t(f) = \pi_t(\mathcal{A} f)\,dt + \big( \pi_t(h f) - \pi_t(h)\,\pi_t(f) \big)^\top \big( dZ_t - \pi_t(h)\,dt \big)$

This is the well-known SDE of the nonlinear filter.
A.9 Mitter-Newton duality for the linear Gaussian model
Consider (24) with the linear control law (25). Then $\tilde{X}_t$ is a Gaussian random variable whose mean and variance evolve as follows:

(27a)  $\frac{dm_t}{dt} = A m_t + \sigma v_t, \qquad m_0 = \tilde{m}_0$

(27b)  $\frac{d\tilde{\Sigma}_t}{dt} = (A + \sigma K_t)\,\tilde{\Sigma}_t + \tilde{\Sigma}_t\,(A + \sigma K_t)^\top + \sigma\sigma^\top, \qquad \tilde{\Sigma}_0 \ \text{given}$

Note that the two equations are entirely un-coupled: $v$ affects only the equation for the mean and $K$ affects only the equation for the variance. We now turn to explicitly computing the running cost. For the linear Gaussian model, $h(x) = Hx$, and the running cost becomes a quadratic function of $\tilde{X}_t$ and $u_t$. Because $\tilde{X}_t$ is Gaussian, its expectation is expressed in terms of $(m_t, \tilde{\Sigma}_t)$ and $(v_t, K_t)$; because the priors are both Gaussian, the divergence $\mathrm{D}(\tilde{\mu} \,\|\, \mu)$ has an explicit formula; and because $h$ is linear, the terminal condition term is also explicit. Combining all of the above, upon a formal integration by parts, the objective is expressed as a sum of two un-coupled costs, one involving only $(m, v)$ and the other involving only $(\tilde{\Sigma}, K)$, plus a few constant terms that are not affected by the decision variables. The first of these costs, subject to the ODE constraint (27a) for the mean, is the classical minimum energy duality.
References
- [1] R. E. Kalman, “On the general theory of control systems,” in Proceedings First International Conference on Automatic Control, Moscow, USSR, 1960, pp. 481–492.
- [2] R. E. Kalman and R. S. Bucy, “New results in linear filtering and prediction theory,” Journal of Basic Engineering, vol. 83, no. 1, pp. 95–108, 1961.
- [3] A. Bensoussan, Estimation and Control of Dynamical Systems. Springer, 2018, vol. 48.
- [4] E. Todorov, “General duality between optimal control and estimation,” in 2008 IEEE 47th Conference on Decision and Control (CDC), 12 2008, pp. 4286–4292.
- [5] K. J. Åström, Introduction to Stochastic Control Theory. Academic Press, 1970.
- [6] A. E. Bryson and Y.-C. Ho, Applied optimal control: optimization, estimation, and control. Routledge, 2018.
- [7] D. Fraser and J. Potter, “The optimum linear smoother as a combination of two optimum linear filters,” IEEE Transactions on Automatic Control, vol. 14, no. 4, pp. 387–390, 1969.
- [8] R. E. Mortensen, “Maximum-likelihood recursive nonlinear filtering,” Journal of Optimization Theory and Applications, vol. 2, no. 6, pp. 386–394, 1968.
- [9] J. B. Rawlings, D. Q. Mayne, and M. Diehl, Model predictive control: theory, computation, and design. Nob Hill Publishing Madison, WI, 2017, vol. 2.
- [10] W. H. Fleming and S. K. Mitter, “Optimal control and nonlinear filtering for nondegenerate diffusion processes,” Stochastics: An International Journal of Probability and Stochastic Processes, vol. 8, no. 1, pp. 63–77, 1982.
- [11] S. K. Mitter and N. J. Newton, “A variational approach to nonlinear estimation,” SIAM Journal on Control and Optimization, vol. 42, no. 5, pp. 1813–1833, 2003.
- [12] H. Michalska and D. Q. Mayne, “Moving horizon observers and observer-based control,” IEEE Transactions on Automatic Control, vol. 40, no. 6, pp. 995–1006, 1995.
- [13] C. V. Rao, J. B. Rawlings, and J. H. Lee, “Constrained linear state estimation—a moving horizon approach,” Automatica, vol. 37, no. 10, pp. 1619–1628, 2001.
- [14] A. J. Krener, “The convergence of the minimum energy estimator,” in New Trends in Nonlinear Dynamics and Control and their Applications. Springer, 2003, pp. 187–208.
- [15] D. A. Copp and J. P. Hespanha, “Simultaneous nonlinear model predictive control and state estimation,” Automatica, vol. 77, pp. 143–154, 2017.
- [16] M. Farina, G. Ferrari-Trecate, and R. Scattolini, “Distributed moving horizon estimation for linear constrained systems,” IEEE Trans. on Auto. Control, vol. 55, no. 11, pp. 2462–2475, 2010.
- [17] R. Schneider, R. Hannemann-Tamás, and W. Marquardt, “An iterative partition-based moving horizon estimator with coupled inequality constraints,” Automatica, vol. 61, pp. 302–307, 2015.
- [18] A. Alessandri, M. Baglietto, and G. Battistelli, “A maximum-likelihood Kalman filter for switching discrete-time linear systems,” Automatica, vol. 46, no. 11, pp. 1870–1876, 2010.
- [19] J. W. Kim and P. G. Mehta, “Duality for nonlinear filtering I: Observability,” unpublished.
- [20] W. H. Fleming, “Exit probabilities and optimal stochastic control,” Applied Mathematics and Optimization, vol. 4, no. 1, pp. 329–346, 1978.
- [21] A. Bensoussan, Stochastic control of partially observable systems. Cambridge University Press, 1992.
- [22] W. H. Fleming and E. De Giorgi, “Deterministic nonlinear filtering,” Annali della Scuola Normale Superiore di Pisa-Classe di Scienze-Serie IV, vol. 25, no. 3, pp. 435–454, 1997.
- [23] Y. Chen, T. T. Georgiou, and M. Pavon, “On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint,” Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 671–691, 2016.
- [24] H. J. Kappen and H. C. Ruiz, “Adaptive importance sampling for control and inference,” Journal of Statistical Physics, vol. 162, no. 5, pp. 1244–1266, 2016.
- [25] S. Reich, “Data assimilation: the Schrödinger perspective,” Acta Numerica, vol. 28, pp. 635–711, 2019.
- [26] H. Ruiz and H. J. Kappen, “Particle smoothing for hidden diffusion processes: Adaptive path integral smoother,” IEEE Transactions on Signal Processing, vol. 65, no. 12, pp. 3191–3203, 2017.
- [27] T. Sutter, A. Ganguly, and H. Koeppl, “A variational approach to path estimation and parameter inference of hidden diffusion processes,” Journal of Machine Learning Research, vol. 17, pp. 6544–80, 2016.
- [28] R. van Handel, “Filtering, stability, and robustness,” Ph.D. dissertation, California Institute of Technology, Pasadena, 12 2006.
- [29] K. W. Simon and A. R. Stubberud, “Duality of linear estimation and control,” Journal of Optimization Theory and Applications, vol. 6, no. 1, pp. 55–67, 1970.
- [30] G. C. Goodwin, J. A. de Doná, M. M. Seron, and X. W. Zhuo, “Lagrangian duality between constrained estimation and control,” Automatica, vol. 41, no. 6, pp. 935–944, 2005.
- [31] P. K. Mishra, G. Chowdhary, and P. G. Mehta, “Minimum variance constrained estimator,” Automatica, vol. 137, p. 110106, 2022.
- [32] B. K. Kwon, S. Han, O. K. Kwon, and W. H. Kwon, “Minimum variance FIR smoothers for discrete-time state space models,” IEEE Signal Processing Letters, vol. 14, no. 8, pp. 557–560, 2007.
- [33] S. Zhao, Y. S. Shmaliy, B. Huang, and F. Liu, “Minimum variance unbiased FIR filter for discrete time-variant systems,” Automatica, vol. 53, pp. 355–361, 2015.
- [34] M. Darouach, M. Zasadzinski, and M. Boutayeb, “Extension of minimum variance estimation for systems with unknown inputs,” Automatica, vol. 39, no. 5, pp. 867–876, 2003.
- [35] J. W. Kim, “Duality for nonlinear filtering,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, Urbana, 06 2022.
- [36] J. W. Kim, P. G. Mehta, and S. Meyn, “What is the Lagrangian for nonlinear filtering?” in 2019 IEEE 58th Conference on Decision and Control (CDC). Nice, France: IEEE, 12 2019, pp. 1607–1614.
- [37] D. Bakry, I. Gentil, and M. Ledoux, Analysis and geometry of Markov diffusion operators. Springer Science & Business Media, 2013, vol. 348.
- [38] B. Øksendal, Stochastic differential equations: an introduction with applications. Springer Science & Business Media, 2013.
- [39] J. Xiong, An Introduction to Stochastic Filtering Theory. Oxford University Press on Demand, 2008, vol. 18.
- [40] J. Yong and X. Y. Zhou, Stochastic controls: Hamiltonian systems and HJB equations. Springer Science & Business Media, 1999, vol. 43.
- [41] J. Ma and J. Yong, “On linear, degenerate backward stochastic partial differential equations,” Probability Theory and Related Fields, vol. 113, no. 2, pp. 135–170, 1999.
- [42] J. F. Le Gall, Brownian Motion, Martingales, and Stochastic Calculus. Springer, 2016, vol. 274.
- [43] E. Pardoux and A. Răşcanu, Stochastic Differential Equations, Backward SDEs, Partial Differential Equations. Springer, 2014.
- [44] S. Peng, “Backward stochastic differential equations and applications to optimal control,” Applied Mathematics and Optimization, vol. 27, no. 2, pp. 125–144, 1993.
- [45] J. W. Kim and P. G. Mehta, “An optimal control derivation of nonlinear smoothing equations,” in Proceedings of the Workshop on Dynamics, Optimization and Computation held in honor of the 60th birthday of Michael Dellnitz. Springer, 2020, pp. 295–311.
- [46] E. Pardoux, “Backward and forward stochastic partial differential equations associated with a non linear filtering problem,” in 1979 18th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 2. IEEE, 1979, pp. 166–171.
- [47] ——, “Non-linear filtering, prediction and smoothing,” in Stochastic systems: the mathematics of filtering and identification and applications. Springer, 1981, pp. 529–557.
- [48] J. W. Kim, P. G. Mehta, and S. Meyn, “The conditional Poincaré inequality for filter stability,” in 2021 IEEE 60th Conference on Decision and Control (CDC), 12 2021, pp. 1629–1636.
- [49] J. W. Kim and P. G. Mehta, “A dual characterization of the stability of the Wonham filter,” in 2021 IEEE 60th Conference on Decision and Control (CDC), 12 2021, pp. 1621–1628.
- [50] J. Szalankiewicz, “Duality in nonlinear filtering,” Master’s thesis, Technische Universität Berlin, Institut für Mathematik, Berlin, 2021.
- [51] N. V. Krylov, R. S. Lipster, and A. A. Novikov, “Kalman filter for Markov processes,” in Statistics and Control of Stochastic Processes. New York: Optimization Software, Inc., 1984, pp. 197–213.
- [52] T. Yang, P. G. Mehta, and S. Meyn, “Feedback particle filter,” IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2465–2480, 10 2013.
- [53] V. E. Beneš, “Exact finite-dimensional filters for certain diffusions with nonlinear drift,” Stochastics, vol. 5, no. 1-2, pp. 65–92, 1981.
- [54] B. L. Rozovsky and S. V. Lototsky, Stochastic Evolution Systems: Linear Theory and Applications to Non-Linear Filtering. Springer, 2018, vol. 89.
- [55] N. V. Krylov, “On the Itô–Wentzell formula for distribution-valued processes and related topics,” Probability Theory and Related Fields, vol. 150, no. 1-2, pp. 295–319, 2011.
[]Jin Won Kim received the Ph.D. degree in Mechanical Engineering from University of Illinois at Urbana-Champaign, Urbana, IL, in 2022.
He is now a postdoctoral research scientist in the Institute of Mathematics at the University of Potsdam.
His current research interests are in nonlinear filtering and stochastic optimal control.
He received the Best Student Paper Award at the IEEE Conference on Decision and Control 2019.
[]Prashant G. Mehta received the Ph.D. degree in Applied Mathematics from Cornell University, Ithaca, NY, in 2004.
He is a Professor of Mechanical Science and Engineering at the University of Illinois at Urbana-Champaign.
Prior to joining Illinois, he was a Research Engineer at the United Technologies Research Center (UTRC). His current research interests are in nonlinear filtering. He received the Outstanding Achievement Award at UTRC for his contributions to the modeling and control of combustion instabilities in jet-engines. His students received the Best Student Paper Awards at the IEEE Conference on Decision and Control 2007, 2009 and 2019, and were finalists for these awards in 2010 and 2012. In the past, he has served on the editorial boards of the ASME Journal of Dynamic Systems, Measurement, and Control and the Systems and Control Letters. He currently serves on the editorial board of the IEEE Transactions on Automatic Control.