Convex operator-theoretic methods in stochastic control
Abstract
This paper is about operator-theoretic methods for solving nonlinear stochastic optimal control problems to global optimality. These methods leverage the convex duality between optimally controlled diffusion processes and Hamilton-Jacobi-Bellman (HJB) equations for nonlinear systems in an ergodic Hilbert-Sobolev space. In detail, a generalized Bakry-Emery condition is introduced under which one can establish the global exponential stabilizability of a large class of nonlinear systems. It is shown that this condition is sufficient to ensure the existence of solutions of the ergodic HJB for stochastic optimal control problems on infinite time horizons. Moreover, a novel dynamic programming recursion for bounded linear operators is introduced, which can be used to numerically solve HJB equations by a Galerkin projection.
1 Introduction
Bellman’s principle of optimality [12] and its dual counterpart, the Chapman-Kolmogorov equation [20, 30] for controlled diffusion processes [33], form the foundation of stochastic control theory. They describe the semigroup structure of the evolution of optimally operated stochastic processes. Their associated partial differential equations (PDEs) for continuous-time models are the well-known HJB equation [23] and the controlled Fokker-Planck-Kolmogorov (FPK) equation [15].
While parabolic FPKs and HJBs are related to stochastic optimal control problems on finite time horizons, their elliptic counterparts model the evolution of optimally operated diffusion processes on an infinite horizon. One distinguishes between ergodic optimal control and infinite horizon optimal control.
• Ergodic optimal control is about the analysis of the limit behavior of optimally operated dynamic systems as well as the long-term average control performance. For instance, in the deterministic case, a control system might have optimal steady states or optimal periodic orbits, but it could as well be that it is optimal to stay on a homoclinic orbit or even to allow a chaotic limit behavior. Similarly, the optimal ergodic distribution of nonlinear stochastic control systems often has a highly complex multi-modal structure.
• Infinite horizon optimal control is about optimizing the system’s transient behavior. Here, the hope is that this transient eventually converges to an optimal ergodic limit. For general nonlinear stochastic systems, however, such a convergence analysis often turns out to be difficult. This is because, in practice, one is often interested in optimizing the system’s economic performance, which leads to indefinite stage cost functions. Unfortunately, it is often not even clear whether the integral over such indefinite costs has an expected value on an infinite time horizon.
The goal of the present paper is to address the above challenges in ergodic and infinite-horizon optimal control by building upon recent results in elliptic diffusion operator theory and functional analysis, reviewed below.
1.1 Literature Review
The systematic mathematical study of stochastic systems goes back to the work of Kolmogorov [30], although related continuous-time electro-dynamic, molecular, quantum-theoretic, and thermal diffusions had been studied even earlier by Fokker [25], von Smoluchowski [45], Planck [37], and Chapman [20]. The relevance of such diffusions in the context of analyzing the ergodic behavior of dynamical systems was observed in 1960 by Has’minskiǐ [27], who found an elegant way of using Harnack’s inequality in combination with a weak Lyapunov function in order to analyze the existence of steady states of FPK equations. And, in fact, it was only six years later that Wonham realized the great potential of Has’minskiǐ’s work in the context of ergodic control theory [46]. By now, these historical developments are well-understood and have been extended by many authors, as reviewed in great detail in [15] focusing on FPK equations and in [33] focusing on controlled diffusion processes.
In contrast to linear FPKs, HJBs are nonlinear PDEs, which were introduced by Bellman [12]. They model the backward-in-time evolution of the value function of either deterministic or stochastic optimal control problems. Here, the state of the HJB equation can—at least formally—be interpreted as the co-state of the FPK-constraint of certain optimal diffusion control problems. As such, HJBs and FPKs are closely related via the concept of duality in convex optimization, as originally pointed out in [13] and [24]. HJBs for optimal control of deterministic systems are typically analyzed by introducing an artificial stochastic perturbation and then passing to the so-called viscosity limit by making this perturbation infinitesimally small. The corresponding viscosity methods go back to Crandall, Ishii and Lions [22]. They can be considered the main tool for analyzing parabolic HJBs on finite time horizons. The textbooks [9] and [23] offer a relatively complete overview of the associated literature on HJBs between 1957 and 1997.
Regarding the existence and properties of solutions to elliptic HJBs for ergodic and infinite-horizon optimal control, however, surprisingly little is known. In [9], a whole chapter is dedicated to infinite horizon optimal control problems with exponentially discounted stage costs. The analysis methods in this chapter are, however, inapplicable to problems with a general indefinite stage cost. In the context of ergodic optimal control, this problem has partly been resolved by Arisawa [1, 2], who has proposed methods to analyze the limit of vanishing discounts, which, under certain assumptions, turns out to be a successful path towards analyzing ergodic HJBs. Moreover, Barles [10] as well as Namah and Roquejoffre [36] have suggested methods for analyzing the long-term behavior of HJBs with periodic Hamiltonians. These methods have in common that they focus on special classes of stochastic systems under periodicity assumptions. And, finally, yet another option for analyzing infinite horizon optimal control problems is to work with turnpikes [5, 41], which can be used to analyze the long-term local behavior of systems near optimal steady states or periodic orbits.
In contrast to ergodic HJBs, much more is known about the limit behavior of FPKs. This is not only due to the above-mentioned work by Has’minskiǐ, but also due to relatively recent breakthroughs in functional analysis that can be traced back to the seminal work of Bakry and Émery [6] on carré-du-champ operators, whose far-reaching consequences have only been understood to their full extent in the last two decades. Based on earlier work by Gross on hypercontractivity and logarithmic Sobolev inequalities [26], Arnold, Markowich, Toscani, and Unterreiter [3] were among the first to come up with a comprehensive and general analysis of the convergence of FPKs to ergodic limit states by featuring Bakry-Emery conditions. Even more practical conditions for deriving logarithmic Sobolev- and general entropy-type inequalities for diffusion semigroups and FPKs have appeared in an article by Bolley and Gentil [16]. The most recent developments regarding the trend to equilibrium of FPKs in the degenerate case are summarized in [7, 8].
Last but not least, concerning recent developments in PDE-constrained optimization [28], optimal control of FPKs has received significant attention during the last five years [4, 17, 18]. These articles are, however, more concerned with the well-posedness of FPK-constrained optimization problems, typically with least-squares or other objectives. These particular classes of FPK-constrained optimization problems are not directly related to HJBs. The only exception is the work of Bobkov, Gentil, and Ledoux [14], who were the first to attempt an analysis of Hamilton-Jacobi equations based on logarithmic Sobolev inequalities—even if these authors focus only on a very special class of HJBs that can be solved explicitly via the Hopf-Lax formula.
1.2 Contributions
This paper presents novel results about the existence and uniqueness of solutions to ergodic and infinite horizon stochastic optimal control problems. They are summarized, respectively, in Theorem 1 and Theorem 2.
In detail, Theorem 1 is about the existence of ergodic limits of optimally controlled nonlinear stochastic McKean-Vlasov systems on a general convex domain with reflecting boundary, assuming that a weak Has’minskiǐ-Lyapunov function is available. Its proof is based on an energy dissipation argument for the FPK with mixed boundary conditions that models the evolution of the probability density function of the underlying nonlinear stochastic control system. The theorem is applicable to a general class of ergodic optimal control problems with indefinite stage cost and potentially multi-modal optimal limit distributions.
Similarly, Theorem 2 is about the existence of weak solutions to ergodic HJBs with Neumann boundary condition in a suitably weighted Hilbert-Sobolev space and their relation to infinite horizon stochastic optimal control. It is based on a generalized Bakry-Emery condition for optimally controlled FPKs under which global exponential stabilizability can be established. This condition turns out to be sufficient for ensuring that the value function of the stochastic infinite horizon optimal control problem is a bounded linear operator, whose unique Riesz’ representation can be used to construct a weak solution of the ergodic HJB.
A particularly relevant aspect of Theorem 2 is that it succeeds in analyzing a class of nonlinear PDEs—in this case HJBs—by relying on convex functional analysis methods that have originally been designed for analyzing linear PDEs. This implies in particular that mature numerical methods for linear parabolic PDEs, such as finite element Galerkin methods, can be used to approximately solve finite- and infinite horizon stochastic optimal control problems via a discretized operator-theoretic dynamic programming recursion. Because such methods for discretizing linear parabolic PDEs are, in general, more accurate, easier to analyze, and easier to implement than methods for discretizing nonlinear PDEs, such as HJBs, operator-theoretic dynamic programming methods offer clear advantages over classical dynamic programming methods for optimal control.
1.3 Overview
In order to put the above contributions into context, this paper is structured as follows.
• Section 2 reviews the relations between controlled McKean-Vlasov systems and FPKs in the presence of reflecting boundary conditions.
• Section 3 analyzes ergodic optimal control problems, including the existence of optimal steady states and ergodic limits (Theorem 1).
• Section 4 proposes a novel approach to analyzing parabolic HJBs with Neumann boundary conditions in a suitably weighted Hilbert-Sobolev space.
• Section 5 introduces Lyapunov diffusion operators in order to analyze the stabilizability of controlled diffusions via a generalized Bakry-Emery condition.
• Section 6 analyzes infinite horizon optimal control problems and establishes the existence of weak solutions of the associated ergodic HJBs (Theorem 2).
• Section 7 presents a finite element Galerkin projection based operator-theoretic dynamic programming method for solving parabolic and ergodic HJBs. A case study illustrates its performance for an infinite horizon stochastic optimal control problem with indefinite stage costs and non-trivial ergodic behavior.
And, finally, Section 8 concludes the paper.
1.4 Notation
Let denote an open set, its closure and its boundary. The sets
denote—as usual—the set of functions that are, respectively, locally -integrable, -integrable, smooth with compact support, -times continuously differentiable with continuous extensions to the boundary, and -integrable with -integrable weak derivatives on . The inner products
are defined such that and are Hilbert spaces. Here, denotes the gradient operator. Additionally, if with is a strictly positive weight function, the shorthands
are used. The associated inner products are then, of course, weighted with , too. For given time intervals and any Banach space , we use the symbol to denote the set of strongly measurable functions with
such that is a Bochner space. Weak derivatives with respect to time are indicated by a “dot”; and denotes the dual of . The space
is then again a Hilbert space with respect to its natural inner product [28, Thm. 1.31]. Finally, the natural duality pairing between a Hilbert space and its dual is denoted by assuming that it is clear from the context which Hilbert space is meant.
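For concreteness, one standard instantiation of this notation, consistent with the weighted-space setting of [28], reads as follows; the symbol choices below (a domain $\Omega \subseteq \mathbb{R}^n$ and a weight $\mu$) are illustrative assumptions and not fixed by the text above:

\[
\begin{aligned}
L^2_\mu(\Omega) &:= \Big\{ f \,:\, \int_\Omega f^2\, \mu \,\mathrm{d}x < \infty \Big\}, &
\langle f, g \rangle_{L^2_\mu} &:= \int_\Omega f\, g\, \mu \,\mathrm{d}x, \\[1mm]
H^1_\mu(\Omega) &:= \big\{ f \in L^2_\mu(\Omega) \,:\, \nabla f \in L^2_\mu(\Omega)^n \big\}, &
\langle f, g \rangle_{H^1_\mu} &:= \langle f, g \rangle_{L^2_\mu} + \langle \nabla f, \nabla g \rangle_{L^2_\mu}, \\[1mm]
W(0,T) &:= \big\{ v \in L^2(0,T;H^1_\mu(\Omega)) \,:\, \dot v \in L^2(0,T;H^1_\mu(\Omega)^*) \big\}. &&
\end{aligned}
\]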
2 Stochastic Control Systems
This section is about the equivalence of closed-loop controlled stochastic nonlinear systems in a finite dimensional space and a special class of open-loop controlled deterministic linear systems in an infinite dimensional space, namely, controlled FPKs.
2.1 Nonlinear Control Systems
Deterministic control-affine systems have the form
(1)
Here, denotes an open time horizon. It is finite if and infinite if . The right-hand side functions and are assumed to be defined on the closure of an open domain . Moreover, denotes the state and the control at time .
The goal is to design a feedback that can be used to control (1) in closed-loop mode,
(2)
with . Because traditional existence theorems for ordinary differential equations are inapplicable in the context of such general feedback functions , solutions of (2) are defined in the viscosity sense, by introducing a small random process noise.
2.2 Stochastic Differential Equations
Let us temporarily assume that . In this case, we can add a process noise to the right-hand side of (2), arriving at a stochastic differential equation (SDE) of the form
(3)
where , , denotes a Wiener process, a diffusion constant, and an initial state. The state of this SDE at time is denoted by . It is a random variable, whose probability density function is denoted by . This means that for any given Borel set , the probability of the event is given by
Conditions under which (3) admits a martingale solution that has such an -integrable probability density function are discussed below.
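As a side remark on simulation, sample paths of an SDE of the form (3) can be approximated with the standard Euler-Maruyama scheme. The following minimal sketch assumes a drift of the control-affine form f(x) + G(x) u(x), a scalar diffusion constant eps, and a feedback u; all of these names, as well as the noise scaling sqrt(2*eps), are illustrative assumptions and not fixed by the text above.

```python
import numpy as np

def euler_maruyama(f, G, u, eps, x0, dt, n_steps, seed=0):
    """Simulate one sample path of dx = (f(x) + G(x) u(x)) dt + sqrt(2 eps) dW.

    The scaling sqrt(2*eps) matches the convention in which eps multiplies
    the Laplacian of the associated FPK equation (an assumption made here).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt), size=x.shape)  # Wiener increment
        x = x + (f(x) + G(x) @ u(x)) * dt + np.sqrt(2.0 * eps) * dW
        path.append(x.copy())
    return np.array(path)
```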
Remark 1
The assumption that is a scalar constant is introduced for simplicity of presentation. The results below can be generalized to the case that the constant scalar factor in (3) is replaced by a potentially state-dependent invertible matrix.
2.3 Reflections at the Boundary
The states of many nonlinear systems have no physical interpretation outside of their natural domain . For instance, the concentrations of the reactants in a controlled chemical process are usually required to be non-negative, implying that . In such a case, the stochastic process noise model in (3) makes no sense, as the Wiener process is unbounded. Instead, at least for the case that is convex, a physically more meaningful stochastic model is given by
(5)
This is a McKean-Vlasov SDE with reflection [21, 40], where the additional drift term,
acts as a reflector. We have for , which means that (3) and (5) coincide on . For , however, is equal to the normal cone of at . Consequently, whenever (5) admits a solution that has an -integrable probability density , we expect that
(6) |
which is the case under very mild assumptions, as discussed for instance in [21]; sufficient conditions under which this is the case are also given below. For the special case that , (5) reduces to (3).
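On a convex domain, the reflection term in (5) can be mimicked numerically by projecting each Euler-Maruyama step back onto the closure of the domain. The sketch below does this for a box domain, which is a simplifying assumption made only for illustration; for a general convex domain, the clipping step would be replaced by a Euclidean projection.

```python
import numpy as np

def euler_maruyama_reflected(f, G, u, eps, x0, dt, n_steps, lb, ub, seed=0):
    """Projected Euler-Maruyama scheme for a reflected SDE on the box [lb, ub].

    Projecting onto the convex domain after each step is a standard numerical
    surrogate for the normal-cone reflection term in (5).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x = x + (f(x) + G(x) @ u(x)) * dt + np.sqrt(2.0 * eps) * dW
        x = np.clip(x, lb, ub)  # projection acts as the reflection surrogate
        path.append(x.copy())
    return np.array(path)
```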
2.4 Fokker-Planck-Kolmogorov Equations
Let denote the -diffusion operator of and the linear transport operator of , formally defined by
where “” denotes the divergence and “” the Laplace operator. The pair is associated with an infinite-dimensional linear control system, called the controlled FPK equation. In the presence of reflecting boundary conditions, it has the form
(10) |
Here, denotes a given initial probability distribution on , while
denote, respectively, the FPK’s open domain and its spatial boundary. Moreover, denotes an outer normal, such that
The open-loop linear control system (10) is closely related to the closed-loop nonlinear control system (5) after identifying with the probability distribution of the random process and after substituting
(11) |
In order to elaborate on this relation between (5), (10) and (11), a couple of definitions are needed. For this aim, the shorthand
is introduced, where , , is a strictly positive function. Moreover, the linear operators and are associated with the bilinear forms
which are defined for all and all .
Definition 1
Let with on and be given. A function with given initial value is called a weak solution of (10) if
(12) |
for all test functions .
For the special case that is convex and bounded and , we may set the weight function to . In this case, the existence of a unique weak solution of (10) for arbitrary follows from a standard result about parabolic PDEs, see [28, Thm. 1.33], recalling that and are assumed to be continuously differentiable on and . For unbounded domains , however, additional assumptions on , , and are needed in order to ensure the existence of weak solutions, as discussed in Section 3 below.
2.5 Nonlinear Control via Linear Systems Theory
From a control theoretic perspective, it is interesting to point out that for the purpose of analyzing the finite-dimensional closed-loop controlled nonlinear system , as defined by (5), it is sufficient to analyze the infinite dimensional open-loop controlled linear system , as defined in (10). This is because the existence of a unique weak solution of (10) for and an initial probability density function with of the initial random variable is sufficient to ensure that (5) has a martingale solution with probability density function that satisfies (6). This is a standard result from the theory of diffusion processes [42], which has recently been extended to general FPKs with reflecting boundary conditions [7]; see also [21]. Of course, here we have to substitute (11), but this substitution is justified, too. Namely, it follows from the minimum principle for parabolic FPKs [15] that if , is a weak solution of (10) for an initial probability density , then
(13) |
recalling that is an open set, , and . Thus, the closed-loop control law, , can be recovered from the infinite-dimensional open-loop input and the strictly positive function .
Remark 2
There are many ways to reformulate finite dimensional nonlinear control systems as equivalent infinite dimensional linear control systems. This includes linear embedding methods based on occupation measures [34, 44] as well as lifting methods based on first order differential operators such as Liouville-Koopman or Perron-Frobenius operators [31, 43]. In contrast to these first order transport operators and their adjoints, the Kolmogorov forward operator is, however, an elliptic second order operator. Such second order operators have many advantages regarding both their mathematical properties and their numerical discretization [28].
3 Ergodic Optimal Control
The idea to analyze optimal control of diffusion processes with reflection can be traced back to the work of Puterman [38], who originally introduced such optimal control problems on a finite time horizon and on a bounded domain . This section presents methods for analyzing such optimal control problems on potentially unbounded domains, with a focus on ergodic limits.
3.1 Problem Formulation
Let denote a control constraint set, let
denote the associated set of feasible control laws, and let be a cost function. The current section is about ergodic optimal control problems of the form
(14) |
Here, denotes a martingale solution of the SDE (5) that depends implicitly on the optimization variable .
Assumption 1
Throughout this paper, we assume that
1. the set is convex, open, and non-empty;
2. the system satisfies ;
3. the set is convex, compact, ; and
4. the stage cost is bounded from below, radially unbounded in , and strongly convex in , on .
Assumption 1 neither excludes that is unbounded nor that . On such unbounded domains, however, a weak Has’minskiǐ-Lyapunov function is needed.
Assumption 2
There exists a function and positive constants such that
1. the Hessian of is positive definite and uniformly bounded, such that (15);
2. the function satisfies on , and
3. satisfies the weak Lyapunov condition (16)–(17) for at least one with on .
Assumption 2 can be interpreted as a weak stabilizability condition of the system relative to its cost .
Example 1
Example 2
Let , , and the cost be defined on the unbounded domain . In this case, Assumption 2 is satisfied for ,
The associated nominal nonlinear system, however, is neither controllable nor stabilizable and, consequently, does not admit a strong control Lyapunov function.
3.2 Optimal Steady States
This section discusses methods for analyzing the expected value of the stage cost at the optimal steady state of the FPK. As a first step towards this goal, the following lemma exploits a variant of Has’minskiǐ’s theorem in order to guarantee the existence of a potentially suboptimal steady state and to derive an associated upper bound on the expected value of .
Lemma 1
Proof. Let be defined as in Assumption 2 and let
denote the associated open -sublevel sets of . Because is strongly convex and differentiable and because is assumed to be sufficiently large, the sets are open, non-empty, convex, bounded, and have a Lipschitz boundary. Thus, it makes sense to introduce the convex PDE-constrained auxiliary optimization problem
(24)–(29)
with optimization variable . Here, denotes an outer normal of . Note that the generalized version of Has’minskiǐ’s theorem on the existence of unique probability solutions of the stationary FPK on convex domains [29, Thm. A], see also [15], implies that (24) is feasible. Since we assume that is bounded from below, it follows that there exists a with . Moreover, a Lagrangian relaxation of (24) yields
(30) |
for sufficiently large constants . The latter inequality follows from (15), (16), and the fact that
holds due to the above construction of . More precisely, (15) and (16) imply that
Consequently, (30) holds for
But this particular choice for does not depend on and, consequently, the bounds hold uniformly. The statement of the lemma then follows by taking the limit in (24) and dividing the upper bound on by .
The stage cost of (14) can be represented by the convex perspective function
(31) |
It is defined for any strictly positive and measurable function and any measurable function for which the above integral exists. Similarly, control constraints are represented by the convex cone
Thus, optimal steady states can be found by solving the convex optimization problem
(32)–(36)
with denoting optimization variables.
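To make the convexity mechanism behind this reformulation explicit, recall the substitution (11). Since the precise formulas are not restated here, the following display is an assumed, standard instance of such a perspective reformulation, with $\ell$ denoting the stage cost, $\rho$ the density, and $c = u\rho$ the control variable:

\[
L(\rho, c) \;=\; \int_\Omega \ell\Big(x, \tfrac{c(x)}{\rho(x)}\Big)\, \rho(x)\,\mathrm{d}x .
\]

This integrand is jointly convex in $(\rho, c)$ whenever $\ell(x,\cdot)$ is convex, because $(\rho, c) \mapsto \ell(x, c/\rho)\,\rho$ is the perspective of $\ell(x, \cdot)$; this is what renders the steady-state problem above convex despite the bilinear coupling between density and control.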
Proof. The construction of and in Lemma 1 satisfies
As such, the closed and convex cone is compatible with Slater’s constraint qualification, recalling that follows from the minimum principle for elliptic operators [15]. Consequently, feasibility and boundedness of (32) follow immediately from Lemma 1, as is a feasible point with bounded objective value. In summary, (32) has the following properties:
1. Slater’s (weak) constraint qualification holds,
2. the objective function is a convex perspective function that is—due to Assumption 1—strongly convex in the control , and
3. the operator is uniformly elliptic.
Consequently, the statement of this corollary follows from a well-known standard result on the existence and uniqueness of minimizers of elliptic PDE-constrained convex optimization problems in Hilbert-Sobolev spaces; see, for example, [28, Thm. 1.45].
3.3 Dissipation of Energy
Let denote the optimal steady state, as defined in Corollary 1, and let denote an associated optimal control law. It can be used to study the controlled parabolic FPK
(40) |
on a possibly infinite time horizon . Here, denotes an initial probability density that is assumed to be an element of the set
In this setting, the key idea for analyzing (40) is to introduce the energy functional
Its weak time derivative is given by
(41) |
Note that this equation is well-known in the functional analysis and FPK literature [3, 15], where is often interpreted as energy (or entropy) that is dissipated during the evolution of the FPK. Alternatively, by translating this to control engineering language, could also be called a Lyapunov function for (40).
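One concrete instance of such an energy functional, consistent with the weighted Hilbert-space setting of this paper, is the weighted $L^2$ distance to the steady state; since the exact weight is not restated here, the following display is an assumed, standard choice rather than a quotation:

\[
E(t) \;=\; \frac{1}{2} \int_\Omega \frac{\big(\rho(t,x) - \rho_\infty(x)\big)^2}{\rho_\infty(x)} \,\mathrm{d}x ,
\]

whose dissipation along weak solutions of the FPK is then expressed by an identity of the form (41).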
Lemma 2
Proof. FPKs have been analyzed by so many authors that it is impossible to list them all. As such, [3] and [15] should be regarded as selected examples for two out of a long list of articles in which proofs of variants of the current lemma can be found. Nevertheless, because this is important for the developments in this paper, it appears appropriate to recall the following three main arguments why the above statement holds.
1. First, if is bounded, a unique weak solution of (40) exists, since it is a uniformly parabolic linear PDE. Moreover, it follows from Harnack’s inequality that (42) holds. The details of this argument can be found in [15, Chapter 7], where the square-integrability of logarithmic gradients of FPKs is discussed in detail.
2.
3. For the case that and are unbounded, one can proceed as in the proof of Lemma 1. This means that the above two steps are repeated in order to construct a sequence of solutions of (40) on the bounded domains with increasing sequences and . Because the energy estimate (44) yields a uniform upper bound, the sequence has a weakly convergent subsequence in the Hilbert space , whose limit must be a weak solution of (40) and, by continuity, (41) holds, too.
In addition to the above three arguments, it is also recalled that FPKs model the evolution of probability densities and, as such, the conservation law
holds on ; see [15] for a formal discussion of this property. This completes the proof of the lemma.
Remark 3
Note that more general strategies for constructing Lyapunov functions for FPKs can be found in [3, 16], where so-called -entropies are proposed as Lyapunov functions. These functions are useful for analyzing the properties of FPKs in general -spaces, typically for . Here, the case is particularly relevant, as the -norm is a natural choice for measuring the distance between probability densities. The methods in this paper are nevertheless presented in Hilbert-Sobolev spaces, as many of the constructions below rely on the concept of duality in convex PDE-constrained optimization. Extensions to other reflexive and non-reflexive -spaces are beyond the scope of this article.
3.4 Ergodic Optimal Control
In order to analyze the ergodic optimal control problem (14), an additional assumption is introduced.
Assumption 3
We have .
This assumption is equivalent to requiring that has a variance at the steady state ; that is,
If Assumptions 1 and 2 hold and if is bounded, this is clearly the case, and Assumption 3 automatically holds. Otherwise, if is unbounded, such a variance exists under very mild growth conditions on , as discussed in detail in [15, Chapter 3], where upper bound estimates for can be found. Next, let
(45)–(49)
denote the value function of the stochastic control problem on the finite time horizon with parameters and . Here, the optimization variables and are elements of the convex cone
By construction, Problem (14) is equivalent to computing the limit
(50) |
The following theorem proves that , which, in turn, implies that does not depend on .
Theorem 1
Proof. This proof is divided into two parts, where the first part verifies the claim that (45) has a unique minimizer while the second part is about the ergodic limit.
Part I. First note that an upper bound on the stage cost of (45) can be found by applying the triangle and Cauchy-Schwarz inequalities,
(51)
which holds along the trajectory of (40). Due to Assumption 3 and the energy estimate (44) from the proof of Lemma 2, this upper bound is sufficient to ensure that . Thus, the first part of this proof is analogous to the argument in the proof of Corollary 1: the conic constraint is strictly feasible, which implies that Slater’s weak constraint qualification holds, and is strictly convex in . Consequently, the existence of a unique minimizer of (45) for follows from the standard existence theorem for PDE-constrained optimization problems [28, Thm. 1.45].
Part II. Lemma 2 implies first that the solution of (40) is such that the function
converges to a constant . But we also have
since and are both probability density functions. Consequently, this constant is given by and converges to in for . By substituting in (45), it follows that there exists for every a , with for , such that
(52) |
Next, let denote the minimizer of (45) on the time horizon and let
denote their time averages. Since is a convex cone, it follows that and
which implies that
(53) |
almost everywhere on , since is bounded from below and radially unbounded. In other words, converges to a feasible steady-state in . Moreover, Jensen’s inequality yields
(54) |
But then, on the one hand, (52) implies
and, on the other hand, (53) and (54) and the fact that is uniformly elliptic imply, by continuity,
Consequently, the ergodic limit for , or, equivalently, , exists, and . This completes the proof of the theorem.
4 Hamilton-Jacobi-Bellman Equations
Many classical methods for analyzing the existence of solutions to HJBs [9, 23] start by establishing (or sometimes even imposing) certain properties—for example, Hölder continuity or differentiability—of the value function of an optimal control problem and then show that this function actually satisfies its associated HJB, be it in a classical, weak, or viscosity sense. An impressive exception is the work by Krylov [32, 33], which uses and develops methods for general nonlinear parabolic PDEs that can be applied to analyze the solutions of controlled diffusions and nonlinear HJBs. Nevertheless, it appears that the convex duality of HJBs and FPKs, which has, in principle, been known for a long time [13, 24], has never been fully exploited for the purpose of analyzing HJBs. Therefore, rather than relying on general nonlinear analysis methods, this section proposes to analyze HJBs by using modern PDE-constrained convex optimization methods [28], which can exploit the above-mentioned duality of HJBs and FPKs.
4.1 Hamiltonians
4.2 Parabolic HJBs
On finite time horizons the parabolic HJB with Neumann boundary condition is given by
(59) |
Definition 2
Different from classical methods for analyzing HJBs, the following lemma uses Riesz’ representation theorem to establish the existence and uniqueness of weak solutions in the Hilbert space .
Lemma 3
Proof. Let us temporarily assume that is bounded. In this case, the auxiliary optimization problem
(60)–(65)
satisfies Slater’s weak constraint qualification for uniformly parabolic PDE-constrained convex optimization problems on a bounded domain, recalling that is strongly convex in . Consequently, it follows from a standard existence result for PDE-constrained optimization problems [28, Chapter 1.5] that (60) has for all parameters and given a unique minimizer . Next, let denote the unique weak solution of the uniformly parabolic linear PDE
(69) |
at , where denotes—as in Theorem 1—the minimizer of (45). This notation is such that the relations and hold. Let
denote an associated parametric Lagrangian at . Because is a weak solution of (69), the function is constant on and satisfies
(70) |
Moreover, the Lagrangian is closely related to the Hamiltonian . In order to elaborate on this relation, the auxiliary control law
is introduced, recalling that is Lipschitz continuous. Thus, it satisfies and
for all with . Moreover, by regarding as a test function, it follows that
(72)
for all , where the second equation in (72) follows by substituting the definitions of , and and an integration by parts with respect to time. Next, an important observation is that
(73)
Since is—due to Theorem 1—the unique minimizer of the optimization problem
it follows from (73) that . Moreover, by substituting the equation in (69), we find that satisfies (59); and, finally, (70) and (72) yield
for all . This implies that is a bounded linear operator. Next, if HJB (59) had a second solution , we could repeat the above construction by using the control law
in order to find yet another representation of the operator , namely,
for any . But this is impossible, as the Riesz representation of the bounded linear operator must be unique. Thus, HJB (59) has a unique solution and we have proven both statements of the lemma under the additional assumption that is bounded.
Last but not least, if is unbounded, the same strategy as in the proof of Lemma 2 can be applied. This means that we repeat the above argument by replacing the domain in (45) with an increasing sequence of bounded domains in order to construct an associated sequence of bounded linear operators, , that converges to for . Since Theorem 1 has already established the fact that is bounded for finite time horizons but potentially unbounded domains , it follows that is a bounded linear operator. Note that this is equivalent to the convergence of the corresponding sequence of Riesz’ representations of the operators
since in Hilbert spaces one can rely on the natural isometric isomorphism between linear operators and their Riesz’ representations. Because is Lipschitz continuous, the limit of the sequence for is the unique weak solution of (59). This completes the proof of the lemma.
Remark 4
As already mentioned above, parabolic HJBs are among the most well-studied nonlinear PDEs in the literature [9, 23, 33]. Obviously, Lemma 3 adds yet another set of assumptions under which such HJBs admit a unique solution. The relevance of this result, however, is that—different from all other methods for analyzing the existence of solutions of HJBs—the proof of Lemma 3 starts from the convex optimization problem (45) and then derives the HJB (59) by constructing an associated dual problem. As discussed in the following sections, this convex optimization based derivation of HJBs turns out to have many advantages, as it enables us to analyze HJBs by leveraging modern functional analysis methods for linear FPKs.
5 Exponential Stabilizability
Carré du champ (CDC) operators were originally introduced in the French literature on functional analysis [6]. They can be used to analyze general diffusions. This section introduces a variant of such CDC operators in order to establish global exponential stabilizability conditions for controlled FPKs.
5.1 Weighted Carré Du Champ Operators
The weighted CDC operator is a bilinear form,
(74) |
Here, denotes a weight that satisfies for given positive lower- and upper bounds, . For the special case that is constant and equal to the scaled unit matrix , coincides with the traditional CDC operator of the diffusion semigroup that is generated by the linear operator
It is defined for all control laws , although the right-hand side of the CDC inequality
(75)
is an invariant, in the sense that it does not depend on .
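For orientation, the classical unweighted carré du champ of a diffusion generator $L$, as introduced in [6], is given by

\[
\Gamma(f,g) \;=\; \tfrac{1}{2}\big( L(fg) - f\,Lg - g\,Lf \big),
\]

so that, for a generator of the form $L = \epsilon \Delta + b^\top \nabla$ (with the drift symbol $b$ introduced here only for illustration), one obtains $\Gamma(f,f) = \epsilon\,\|\nabla f\|^2$. The drift cancels, which is precisely the invariance property referred to above.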
5.2 Lyapunov Diffusion Operators
The design of potentially non-trivial weight functions turns out to be important for the construction of practical stabilizability conditions. In order to explain why this is so, the Lyapunov diffusion operator
(76) |
is introduced for arbitrary . Here, is applied to componentwise and denotes the distributional Jacobian of . This means that we define
for all components , where denotes the distributional derivative with respect to . In the following, we use the convention that a formal symmetric matrix-valued distribution on is positive semi-definite, writing , if
for all symmetric and positive semi-definite test functions with compatible dimension. Linear matrix inequalities (LMIs) are then understood in an analogous manner.
Assumption 4
Let denote the optimal ergodic control law as defined in Corollary 1. We assume that there exists a such that the LMI
(81) |
admits a positive definite solution with on . Here, denotes the Kronecker product.
Examples for linear and nonlinear control systems for which Assumption 4 holds can be found below.
Example 3
Let and be a linear control system on with being asymptotically stabilizable and . In this case, is the well-known Linear-Quadratic-Gaussian (LQG) regulator and Assumption 4 holds for a constant . This is because for a constant , the equations and hold such that LMI (81) reduces to
This LMI has a strictly positive solution for sufficiently small , since we assume that is asymptotically stabilizable, or, equivalently, that the eigenvalues of have strictly negative real parts.
Example 4
Let us consider a stochastic system with , , and on . It leads to an SDE with multi-modal limit distribution,
where the constant can be chosen such that the integral over is equal to . In this example, (81) cannot be satisfied for a constant function , but it is possible to find other classes of functions which satisfy (81). For instance, Assumption 4 can be satisfied by setting
Apart from the above examples, another reason why Assumption 4 appears appropriate for our purposes is elaborated in the following section. Namely, it turns out that LMI (81) is satisfied for an even broader class of problems than the original Bakry-Emery condition for FPKs, which is—at least in the FPK literature—well-accepted as a useful and practical condition; see, for instance, [3, 6, 16, 35], where many examples are collected.
5.3 Exponential Convergence Rate
The following lemma interprets LMI (81) as a generalized Bakry-Emery condition.
Lemma 4
Proof. In order to interpret LMI (81) as a generalized Bakry-Emery condition, we work out the -operator
(93)
for arbitrary test functions . Note that this equation is understood by interpreting all first order derivatives of as distributional derivatives. Next, (81) implies that the integral version of the Bakry-Emery condition [6, 35],
(94) |
is satisfied for all test functions and all weight functions with . This integral inequality remains valid even if we work with distributional Jacobians , since these distributional derivatives merely enter via the integrand of the left-hand integral in (94) and, consequently, can be resolved by integration by parts. It is well-known [16, Thm. 2 & Prop. 5] that (75) and (94) imply that the entropy inequality
(95) |
holds for all , recalling that is, by construction, an ergodic measure of the diffusion operator . Of course, by continuity, it follows that (95) then also holds for all functions . Consequently, it further follows from Lemma 2 that
(96)
where we have set . An application of Gronwall’s lemma to (96) yields the energy estimate (86), which, in turn, implies that the unique weak solution of (85) also remains well-defined on infinite horizons.
6 Infinite Horizon Optimal Control
The technical result of Lemma 4 on the exponential convergence of the state of the optimally controlled FPK can be used to bound the value function of infinite horizon optimal control problems. Here, the main idea is to exploit the strong duality between FPKs and HJBs that has been established by Lemma 3.
6.1 Problem Formulation
This section concerns infinite horizon optimal control problems of the form
(97) |
recalling that the cost function has been defined in (45). Here, denotes the ergodic average cost that has been defined in (32) and interpreted in Theorem 1. Compared to the ergodic optimal control problem from Theorem 1, the cost of the infinite horizon optimal control problem (97) is more difficult to analyze, as it depends on the initial probability density function . Conditions under which the limit on the right-hand side of (97) exists are discussed below.
6.2 Ergodic HJB
Problem (97) is closely related to the ergodic HJB with Neumann boundary condition, given by
(100) |
Weak solutions of (100) are defined as follows, recalling once more that is Lipschitz continuous.
Definition 3
The relation between this ergodic HJB and (97) is explained in the following section.
6.3 Solution of the Infinite Horizon Control Problem
The following theorem summarizes a novel result about the existence of weak solutions of ergodic HJBs in the presence of general indefinite stage cost functions.
Theorem 2
Proof. The proof is divided into three parts:
Part I. Let denote the solution of (85) and let denote its associated input function such that
The above upper bound can be used to bound the value function of the auxiliary optimization problem
(102)–(107)
Namely, by integrating the above inequality, we find that
(108) |
for all and all . Moreover, by using an analogous argument as in the proof of Lemma 3, we can use the weak solution of the linear parabolic PDE
(112) |
to represent in the form
Next, because is a uniformly bounded linear operator, its limit for admits a unique Riesz representation,
where must be a weak solution of the linear elliptic PDE
(115) |
at .
Part II. Let us use the function as a terminal cost of the finite horizon optimal control problem
(119)
By using an analogous argument as in Lemma 3, it follows that is a bounded linear operator, whose unique Riesz representation
is given by the weak solution of the parabolic HJB
(124) |
Since is a feasible time-invariant control law for (119), we must have
Next, let us assume that we can find a and a for which . Then the Lebesgue measure of the set
must be strictly positive. But this implies that there exists a such that
(125) |
where does not depend on . In order to show that this inequality leads to a contradiction, we denote the minimizer of (119) at by . Due to the above inequality, the ergodic average values
exist in , since the optimal trajectories remain bounded and must satisfy
where the limits are understood in . The latter claim follows by using an analogous argument as in the proof of Theorem 1, recalling that the ergodic limit is strictly optimal on infinite horizons. This implies in particular that as well as
due to (125) and Jensen’s inequality, recalling that is convex. But this contradicts the definition of in (32). Thus, we must have
where the latter equation follows from the fact that the first equation holds uniformly and the fact that the contribution of the terminal cost,
vanishes for .
Part III. An immediate consequence of the constructions in Part II is that the unique weak solution of the parabolic HJB (124),
does not depend on . Thus, is a weak solution of the ergodic HJB (100) as claimed by the statement of this theorem. Finally, the energy estimate follows from (108) by using that
(126)
This completes the proof.
Remark 5
The above framework can, in principle, be extended to systems with state constraints of the form
for a constraint function . This can be done by adding the above linear inequality constraint to (45), as long as one restricts the analysis to initial density functions for which this linear inequality constraint is strictly feasible, such that Slater’s constraint qualification still holds. Because (45) is then still a convex PDE-constrained optimization problem, it still admits a unique solution. One major difficulty, however, is that, in this case, is no longer a bounded linear operator. This renders Lemma 3 and Theorem 2 inapplicable to systems with stochastic state constraints. Instead, the KKT optimality conditions of (45) have to be analyzed by using the Zowe-Kurcyusz theorem [47], see also [28, Thm. 1.56]. In combination with Rockafellar’s theorem [11, Thm. 22.14] it then follows that the Fréchet subdifferential of (and also ) is, under suitable regularity assumptions, a maximal monotone operator. The practical relevance of such analysis techniques is, however, limited, as the whole point of the HJBs (59) and (100) is that one can compute a value function that is independent of . Because this desirable property of the HJB fails to hold in the presence of explicit state constraints, a much more practical heuristic is to choose a function that takes large values in regions of the state space that one wishes to constrain. The optimal controller can then be expected to automatically avoid these regions with high probability whenever possible.
7 Global Optimal Control Algorithm
The results of Lemma 3 and Theorem 2 motivate a completely new numerical method for solving HJBs. This method relies on discretizing the dynamic programming recursion for the bounded linear operator rather than on discretizing the value function directly.
7.1 Galerkin Projection
The bounded linear operator , as defined in (45), can be approximated by a finite-dimensional linear function via a simple Galerkin projection. This means that we choose basis functions
in order to discretize the states and controls . For simplicity of presentation, we assume that this construction is based on given grid points
with step-size and time index and that Lagrange’s equation holds; that is, , where if and . (The methods in this paper can easily be extended to all kinds of finite element, model order reduction, and other related numerical methods for linear parabolic PDEs, or even to tailored methods for FPKs, which might use basis functions that do not satisfy Lagrange’s equation—this equation is merely introduced in order to avoid unnecessary technical notation overhead.) In this case, the discrete approximation of the bounded linear operator at has the form
(127)
s.t. (131)
Here, and are discrete-time system matrices that are found by a Galerkin projection of and in combination with a suitable implicit differential equation solver that discretizes the system in time as well. In this context, is an invertible mass matrix. The coefficients of the associated truncated series expansions
are optimized subject to a conic constraint, given by
Moreover, the quadrature weights are needed in order to discretize the integral in the objective of (45).
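As a minimal illustration of such a Galerkin projection, the following sketch assembles the mass matrix and a diffusion (stiffness) matrix for piecewise-linear Lagrange basis functions on an equidistant 1D grid. All names (n, h, eps) are illustrative, and the drift, control, and time-discretization terms of the full scheme are omitted.

```python
import numpy as np

def assemble_1d_lagrange(n, h, eps):
    """Mass and diffusion matrices for n hat functions on an equidistant 1D grid.

    Uses the exact element integrals of piecewise-linear Lagrange bases:
    M[i, j] = integral of phi_i * phi_j,
    S[i, j] = eps * integral of phi_i' * phi_j'.
    The transport and control terms of the FPK would contribute further matrices.
    """
    M = np.zeros((n, n))
    S = np.zeros((n, n))
    for i in range(n - 1):  # loop over the elements [x_i, x_{i+1}]
        M[i:i+2, i:i+2] += (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
        S[i:i+2, i:i+2] += (eps / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return M, S
```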
Remark 6
Methods for bounding the discretization error of the above Galerkin projection of (45) in dependence on and can be found in [19]. Moreover, we recall that a desirable property of Galerkin methods is that—at least for appropriate choices of the basis functions—Markov’s conservation laws
hold. Similarly, accurate discretization schemes either satisfy (or simply enforce) the condition
componentwise for all vectors , with , such that implies for all feasible inputs.
7.2 Dual perspective function
For the following developments it will be convenient to introduce the Fenchel conjugate of the stage cost,
Fortunately, in practical applications, one can often find an explicit expression for . For instance, if is a diagonal quadratic form in and is a simple interval box, a single evaluation of can be carried out efficiently on modern processors, usually in a couple of nanoseconds. In this case, the dual perspective function
can be evaluated with reasonable effort, even if the number of basis functions, , is very large.
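As an example of such an explicit conjugate, consider a stage cost that is quadratic in the control and a box constraint set; both choices are illustrative assumptions. The conjugate maximization then decouples componentwise and reduces to a clipping operation:

```python
import numpy as np

def fenchel_conjugate_quadratic_box(g, r, u_lo, u_hi):
    """Evaluate max over u in [u_lo, u_hi] of g^T u - 0.5 * r * ||u||^2.

    The unconstrained maximizer is u = g / r; the box constraint is handled
    exactly by componentwise clipping, since the objective is separable.
    Returns both the conjugate value and the maximizing control.
    """
    u_star = np.clip(g / r, u_lo, u_hi)
    return g @ u_star - 0.5 * r * np.dot(u_star, u_star), u_star
```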
7.3 Operator-Theoretic Dynamic Programming
Assuming that the stochastic optimal control problem is discretized as above, it can be solved by starting with and solving the recursion
(132) |
backwards in time, for . This procedure can be justified directly by observing that the linear cost-to-go function of the discrete optimal control problem (127) is given by . It satisfies the discretized operator-theoretic dynamic programming recursion,
A comparison of coefficients yields (132). Moreover, the associated separable minimizers
can be used to recover an approximation of the optimal control law at the minimizer of (45),
The computational complexity for computing this approximation of is given by . Finally, regarding the solution of the infinite horizon control problem, one option is to keep on iterating (132) backward in time until a suitable termination criterion is satisfied, for example, until
for a user specified tolerance . This method converges for as long as the conditions of Theorem 2 are satisfied—assuming that the above finite element discretization is reasonably accurate. Alternatively, one can use a large-scale convex optimization problem solver to solve (127) for but subject to the additional stationarity constraint .
Remark 7
As reviewed in the introduction, the duality between HJBs and optimally controlled FPKs has been known for a long time and, in fact, a method for solving HJBs via FPK-constrained convex optimization can be found in [39]. The operator-theoretic dynamic programming recursion (132) is, however, an original contribution of this paper. It is entirely explicit and—at least if the Fenchel conjugate admits an explicit expression—does not require an optimization problem solver at all.
7.4 Numerical Illustration
This section is about a nonlinear stochastic system with two states and one control input, given by
on the domain and control constraint set . An associated non-convex stage cost function is given by
This example trivially satisfies Assumptions 1, 2, and 3. Assumption 4 can be verified numerically, yielding the exponential decay rate . Figure 1 visualizes the optimal infinite horizon control law , the optimal steady state , and the associated dual solution that solves the ergodic HJB at
This solution has been found by using a Galerkin discretization with Lagrange basis functions on a simple equidistant grid with basis functions and time iterations with . The table below lists the CPU run-time of a JULIA-prototype implementation of (132) for different choices of . The third column of this table lists the -norm between the control law that is found by using basis functions and the control law that is found by using basis functions.
Note that this example should be regarded as a “toy problem”. More advanced implementations of the above method and case studies are left for future work. Nevertheless, the above example is non-trivial in the sense that it has a multi-modal steady state . Moreover, the associated solution of the ergodic HJB turns out to be indefinite. And, finally, the closed-loop system has a non-trivial limit behavior: the plot in the center of Figure 1 shows closed-loop simulations without noise in order to illustrate that the nominal trajectories converge to a non-trivial periodic attractor.
8 Conclusions
This paper has presented methods for analyzing ergodic and infinite-horizon stochastic optimal control problems. In detail, Theorem 1 has established the existence and uniqueness of ergodic optimal controls under Assumptions 1 and 2. These assumptions are satisfied for most nonlinear stochastic control problems of practical relevance, because nonlinear McKean-Vlasov systems on bounded domains with reflecting boundaries always admit a Has’minskiǐ-Lyapunov function. But even if the domain is unbounded, such functions are—in contrast to traditional Lyapunov functions—relatively easy to find.
Concerning the analysis of infinite horizon optimal control problems, this paper has proposed to introduce a novel class of Lyapunov diffusion operators that can be used to formulate a generalized Bakry-Emery condition, as summarized in Assumption 4. The relevance of this Lyapunov diffusion operator based LMI has been clarified in Theorem 2, which has used this condition to establish the existence of solutions to ergodic HJB equations for nonlinear optimal control problems with indefinite stage costs.
For the first time, this paper has presented methods for analyzing a very general class of nonlinear HJBs by using functional analysis methods in Hilbert-Sobolev spaces that would usually find applications in the context of linear parabolic PDEs. This has far-reaching consequences for the development of global robust optimal control algorithms. This is because the proposed operator-theoretic dynamic programming method renders existing numerical methods for discretizing linear parabolic PDEs fully applicable to solving nonlinear HJBs. The exploitation of the implications of this observation appears to be a fruitful direction for future research on stochastic optimal control.
References
- [1] M. Arisawa. Ergodic problem for the Hamilton-Jacobi-Bellman equation (II). Ann. Inst. H. Poincaré Anal. Non Linéaire, 14:415–438, 1998.
- [2] M. Arisawa and P.-L. Lions. On ergodic stochastic control. Communications in Partial Differential Equations, 23(11–12):333–358, 1998.
- [3] A. Arnold, P. Markowich, G. Toscani, and A. Unterreiter. On convex Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential Equations, 26(1–2):43–100, 2001.
- [4] M.S. Aronna and F. Tröltzsch. First and second order optimality conditions for the control of Fokker-Planck equations. ESAIM - Control, Optimisation and Calculus of Variations, 27(15), 2021.
- [5] V. Aškovic, E. Trélat, and H. Zidani. On the asymptotic behavior of the value function in large time optimal control problems. IFAC-PapersOnLine, 55(16):38–43, 2022.
- [6] D. Bakry and M. Émery. Diffusions hypercontractives. Séminaire de probabilités, XIX, Lecture Notes in Math. 1123, pages 177–206, 1985.
- [7] V. Barbu. The controllability of Fokker-Planck equations with reflecting boundary conditions and controllers in diffusion term. SIAM Journal on Control and Optimization, 59(1), 2021.
- [8] V. Barbu and M. Röckner. The evolution to equilibrium of solutions to nonlinear Fokker-Planck equation. Indiana University Mathematics Journal, 72(1):89–131, 2023.
- [9] M. Bardi and I. Capuzzo-Dolcetta. Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, 1997.
- [10] G. Barles and P.E. Souganidis. On the large-time behavior of solutions of Hamilton-Jacobi equations. SIAM Journal on Mathematical Analysis, 31(4):925–939, 2000.
- [11] H.H. Bauschke and P.L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics, Springer, 2011.
- [12] R.E. Bellman. Dynamic Programming. Princeton University Press, 1957.
- [13] J.-M. Bismut. Conjugate convex functions in optimal stochastic control. Journal of Mathematical Analysis and Applications, 44:383–404, 1973.
- [14] S.G. Bobkov, I. Gentil, and M. Ledoux. Hypercontractivity of Hamilton-Jacobi equations. Journal de Mathématiques Pures et Appliquées, 80(7):669–696, 2001.
- [15] V.I. Bogachev, N.V. Krylov, M. Röckner, and S.V. Shaposhnikov. Fokker-Planck-Kolmogorov equations. American Mathematical Society, 2015.
- [16] F. Bolley and I. Gentil. Phi-entropy inequalities for diffusion semigroups. Journal de Mathématiques Pures et Appliquées, 93:449–473, 2010.
- [17] T. Breiten, K. Kunisch, and L. Pfeiffer. Control strategies for the Fokker-Planck equation. ESAIM: Control, Optimisation and Calculus of Variations, 24(2):741–763, 2018.
- [18] T. Breiten, K. Kunisch, and L. Pfeiffer. Infinite-horizon bilinear optimal control problems: sensitivity analysis and polynomial feedback laws. SIAM Journal on Control and Optimization, 56(5):3184–3214, 2018.
- [19] S. Brenner and R.L. Scott. The mathematical theory of finite element methods. Springer, 2005.
- [20] S. Chapman. On the Brownian displacements and thermal diffusion of grains suspended in a non-uniform fluid. Proceedings of the Royal Society (A), 119:34–54, 1928.
- [21] I. Ciotir and R. Fayad. Nonlinear Fokker-Planck equation with reflecting boundary conditions. Journal of Differential Equations, 321(1):296–317, 2022.
- [22] M.G. Crandall, H. Ishii, and P.-L. Lions. User’s guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1):1–67, 1992.
- [23] W.H. Fleming and H.M. Soner. Controlled Markov Processes and Viscosity Solutions. Springer, 1993.
- [24] W.H. Fleming and D. Vermes. Convex duality approach to the control of diffusions. SIAM Journal on Control and Optimization, 27(5):1136–1155, 1989.
- [25] A.D. Fokker. Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld. Annalen der Physik, 4(3):810–820, 1914.
- [26] L. Gross. Logarithmic Sobolev inequalities. Amer. J. Math., 97(4):1061–1083, 1975.
- [27] R.Z. Has’minskiǐ. Ergodic properties of recurrent diffusion processes and stabilization of the solution of the Cauchy problem for parabolic equations. Theory of Probability and its Applications, 5:179–196, 1960.
- [28] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. Optimization with PDE Constraints. Springer, 2009.
- [29] W. Huang, M. Ji, Z. Liu, and Y. Yi. Steady states of Fokker-Planck equations: I. Existence. Journal of Dynamics and Differential Equations, 27:721–742, 2015.
- [30] A.N. Kolmogoroff. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen, 104:415–458, 1931.
- [31] M. Korda and I. Mezić. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica, 93:149–160, 2018.
- [32] N.V. Krylov. Nonlinear Elliptic and Parabolic Equations of the Second Order. Reidel, Dordrecht, 1987.
- [33] N.V. Krylov. Controlled diffusion processes. Springer, 2008.
- [34] J.B. Lasserre, D. Henrion, C. Prieur, and E. Trelat. Nonlinear optimal control via occupation measures and LMI relaxations. SIAM Journal on Control and Optimization, 47:1649–1666, 2008.
- [35] M. Ledoux. On an integral criterion for hypercontractivity of diffusion semigroups and extremal functions. Journal of Functional Analysis, 105(2):444–465, 1992.
- [36] G. Namah and J.M. Roquejoffre. Remarks on the long time behavior of the solutions of Hamilton-Jacobi equations. Communications on Partial Differential Equations, 24:883–893, 1999.
- [37] M. Planck. Über einen Satz der statistischen Dynamik und seine Erweiterung in der Quantentheorie. Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften, pages 324–341, 1917.
- [38] M.L. Puterman. Optimal control of diffusion processes with reflection. Journal of Optimization Theory and Applications, 22(1), 1977.
- [39] P. Rutquist, T. Wik, and C. Breitholtz. State constrained optimal control via the Fokker-Planck equation. IFAC-PapersOnLine, 50(1):6303–6307, 2017.
- [40] A.-S. Sznitman. Nonlinear reflecting diffusion process, and the propagation of chaos and fluctuations associated. Journal of Functional Analysis, 56(3):311–336, 1984.
- [41] E. Trélat and E. Zuazua. The turnpike property in finite-dimensional nonlinear optimal control. Journal of Differential Equations, 258(1):81–114, 2015.
- [42] D. Trevisan. Well-posedness of multidimensional diffusion processes. Electronic Journal of Probability, 21:1–41, 2016.
- [43] M.E. Villanueva, C.N. Jones, and B. Houska. Towards global optimal control via Koopman lifts. Automatica, 132:109610, 2021.
- [44] R. Vinter. Convex duality and nonlinear optimal control. SIAM Journal on Control and Optimization, 31(2):518–538, 1993.
- [45] M. von Smoluchowski. Über Brownsche Molekularbewegung unter Einwirkung äußerer Kräfte und den Zusammenhang mit der verallgemeinerten Diffusionsgleichung. Annalen der Physik, 353(24):1103–1112, 1915.
- [46] W.M. Wonham. Liapunov criteria for weak stochastic stability. Journal of Differential Equations, 2:195–207, 1966.
- [47] J. Zowe and S. Kurcyusz. Regularity and stability for the mathematical programming problem in Banach spaces. Applied Mathematics and Optimization, 5:49–62, 1979.