
Sparse Optimal Stochastic Control

Kaito Ito [email protected], Takuya Ikeda [email protected], Kenji Kashima [email protected]
Graduate School of Informatics, Kyoto University, Kyoto, Japan; Faculty of Environmental Engineering, The University of Kitakyushu, Kitakyushu, Japan
Abstract

In this paper, we investigate sparse optimal control of continuous-time stochastic systems. We adopt the dynamic programming approach and analyze the optimal control via the value function. Due to the non-smoothness of the $L^0$ cost functional, in general, the value function is not differentiable on its domain. We therefore characterize the value function as a viscosity solution to the associated Hamilton-Jacobi-Bellman (HJB) equation. Based on this result, we derive a necessary and sufficient condition for $L^0$ optimality, which immediately gives the optimal feedback map. Especially for control-affine systems, we consider the relationship with the $L^1$ optimal control problem and show an equivalence theorem.

keywords:
sparsity, non-smooth optimal control, bang-off-bang control, dynamic programming, viscosity solution
thanks: This paper was not presented at any IFAC meeting. Corresponding author K. Kashima. Tel. +81-75-753-5512.
© 2021. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/


1 Introduction

This work investigates an optimal control problem for non-linear stochastic systems with an $L^0$ control cost. This cost functional penalizes the measure of the support of the control variable, and optimization based on this criterion tends to make the control input identically zero on a set of positive measure. Consequently, the optimal control is switched off completely on parts of the time domain. Hence, this type of control is also referred to as sparse optimal control. For example, this optimal control framework has been applied to actuator placement [22, 8, 16], networked control systems [18, 9, 20], and discrete-valued control [12], to name a few. Sparse optimal control involves a discontinuous and non-convex cost functional. To deal with this difficulty of analysis, relaxed problems with $L^p$ cost functionals have often been investigated, akin to methods used in compressed sensing [4].

Literature review:  For deterministic control-affine systems, the $L^1$ cost functional is analyzed with the aim of relating $L^0$ optimality and $L^1$ optimality, and an equivalence theorem is derived in [17]. In [10], the result is extended to deterministic general linear systems, including infinite-dimensional ones. The $L^1$ control cost is also considered in [1, 23, 2]. In [10], the sparsity properties of optimal controls for the $L^p$ cost with $p\in(0,1)$ are discussed. The authors investigated this problem from a dynamic programming viewpoint [11]. When it comes to stochastic systems, [5] deals with a finite-horizon optimal control problem with the $L^1$ cost functional and proposes a sampling-based algorithm that solves it via forward and backward stochastic differential equations. However, it is not obvious that $L^1$ optimal control achieves the desired sparsity. To the best of the authors’ knowledge, our preliminary work [13] (Theorem 1 below) on the continuity of the value function is the only theoretical result on $L^0$ optimal control of stochastic systems.

Contribution:  The goal of this work is to obtain the sparse optimal feedback map (Theorem 4), under which the optimal control input has the bang-off-bang property, and to reveal the equivalence between $L^0$ optimality and $L^1$ optimality for control-affine stochastic systems (Theorem 5). To this end, we utilize dynamic programming. In the present paper, we first characterize our value function as a viscosity solution to the Hamilton-Jacobi-Bellman (HJB) equation [7, 24]. Based on this result, we show a necessary and sufficient condition for $L^0$ optimality (Theorem 3), which immediately gives an optimal feedback map. In addition, a sufficient condition for the value function to be a classical solution to the HJB equation (i.e., a solution that satisfies the HJB equation in the usual sense) is given via the equivalence, while, in the deterministic case, we cannot in general ensure the differentiability of the value function.

In the stochastic case, the HJB equation becomes a second-order equation, in contrast to the deterministic case, and hence the results for deterministic systems [11] cannot be directly applied. Indeed, several difficulties arise due to the stochasticity. For example, the analysis of deterministic $L^0$ optimality in [11] relies heavily on the local Lipschitz continuity of the value function, which implies almost everywhere differentiability. On the other hand, the value function for stochastic $L^0$ optimal control is at most locally $1/2$-Hölder continuous [13]. This calls for a quite different approach. Even in the problem formulation, we must be careful about the probability space we work on in order to correctly apply the dynamic programming principle.

In order to demonstrate the practical usefulness of our theoretical results, an example is exhibited; see Example 2 for more details.

Example 1.

Consider the following stochastic system:

$$dx_s = cx_s\,ds + u_s\,ds + \sigma\,dw_s, \quad 0\le s\le T \quad (1)$$

where $\{x_s\}$ is a real-valued state process, $\{u_s\}$ is a control process, and $\{w_s\}$ is a Wiener process. We take $c=1$, $\sigma=0.1$, $T=1$, and $x_0=0.5$. Then, the black lines in Fig. 1 show sample paths of the optimal control input and the corresponding state trajectories that minimize $\mathbb{E}\left[\int_0^1 |u_s|^2\,ds + x_1^2\right]$. It is well known that this minimum-energy control is given by linear state feedback, and hence it takes non-zero values almost everywhere. On the contrary, our problem can deal with the sparse optimal control that minimizes $\mathbb{E}\left[\int_0^1 |u_s|^0\,ds + x_1^2\right]$ under the constraint $|u_s|\le 1$, where $0^0=0$. The first term represents the length of time during which the control takes non-zero values. Theorem 4 reveals that the optimal control input takes only the three values $\{-1,0,1\}$, and enables us to numerically compute the state-feedback map from $x_s$ to $u_s\in\{-1,0,1\}$. The colored lines show the result of $L^0$ optimal control, whose input trajectories are sparse while the variance of the state remains small. Note that the purple dotted lines show the boundary of the bang-off-bang regions. $\lhd$
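A simulation in the spirit of Example 1 takes only a few lines of Euler–Maruyama integration. The threshold feedback below is a hypothetical stand-in: the constant `thr` is our own illustrative choice, whereas the true switching boundary in Fig. 1 has to be computed numerically from the HJB equation discussed in Section 4.

```python
import numpy as np

# Euler-Maruyama simulation of dx = (c*x + u) ds + sigma dw from Example 1.
# u_l0 is a hypothetical bang-off-bang feedback with an illustrative threshold,
# not the optimal switching boundary of the paper.
rng = np.random.default_rng(0)
c, sigma, T, x0 = 1.0, 0.1, 1.0, 0.5
n_steps = 1000
dt = T / n_steps

def u_l0(x, thr=0.3):
    # stay off inside the band |x| <= thr, else push at full authority
    # toward the origin (control constrained to {-1, 0, 1})
    return 0.0 if abs(x) <= thr else -np.sign(x)

x = x0
l0_cost = 0.0  # accumulates the time the input is non-zero (the L0 term)
for _ in range(n_steps):
    u = u_l0(x)
    l0_cost += (u != 0.0) * dt
    x += (c * x + u) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

print(f"terminal state {x:.3f}, time with u != 0: {l0_cost:.3f} of {T}")
```

Replacing `u_l0` with a linear state feedback reproduces the qualitative behavior of the black ($L^2$ optimal) curves, which are non-zero almost everywhere.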

Figure 1: The colored lines except black are the sample paths of the $L^0$ optimal state process (top, solid) and control process (bottom, solid), and the switching boundary (top, dotted). The same color indicates the correspondence between the sample paths of the state process and the control process. The black lines are the sample paths of the $L^2$ optimal state process (top) and control process (bottom).

Organization:  The remainder of this paper is organized as follows. In Section 2, we give mathematical preliminaries for the subsequent discussion. In Section 3, we describe the system model and formulate the sparse optimal control problem for stochastic systems. Section 4 is devoted to the general analysis of stochastic optimal control with the discontinuous $L^0$ cost. We first characterize the value function as a viscosity solution to the associated HJB equation, and then show a necessary and sufficient condition for $L^0$ optimality. Section 5 characterizes the sparse optimal stochastic control. We show the relationship with the $L^1$ optimization problem and some basic properties of the sparse optimal stochastic control for control-affine systems with box constraints. In Section 6 we offer concluding remarks.

2 Mathematical preliminaries

This section reviews notation that will be used throughout the paper.

Let $N$, $N_1$, and $N_2$ be positive integers. For a matrix $M\in\mathbb{R}^{N_1\times N_2}$, $M^\top$ denotes the transpose of $M$. For a matrix $M\in\mathbb{R}^{N\times N}$, $\mathrm{tr}(M)$ denotes the trace of $M$. Denote by $\mathcal{S}^N$ the set of all symmetric $N\times N$ matrices and by $\mathcal{S}_+^N$ the set of all positive semidefinite matrices. Denote the Frobenius norm of $M\in\mathbb{R}^{N_1\times N_2}$ by $\|M\|$, i.e., $\|M\|\triangleq\sqrt{\mathrm{tr}(M^\top M)}$. For a vector $a=[a^{(1)},a^{(2)},\dots,a^{(N)}]^\top\in\mathbb{R}^N$, we denote the Euclidean norm by $\|a\|\triangleq(\sum_{i=1}^N (a^{(i)})^2)^{1/2}$ and the open ball with center $a$ and radius $r>0$ by $B(a,r)$, i.e., $B(a,r)\triangleq\{x\in\mathbb{R}^N:\|x-a\|<r\}$. We denote the inner product of $a\in\mathbb{R}^N$ and $b\in\mathbb{R}^N$ by $a\cdot b$.

For $p\in\{0,1\}$ and a continuous-time signal $u_s=[u_s^{(1)},u_s^{(2)},\dots,u_s^{(N)}]^\top\in\mathbb{R}^N$ over a time interval $[t,T]$, the $L^p$ norm of $u=\{u_s\}_{t\le s\le T}$ is defined by

$$\|u\|_0 \triangleq \sum_{j=1}^N \mu_L(\{s\in[t,T] : u_s^{(j)}\neq 0\}),$$
$$\|u\|_1 \triangleq \sum_{j=1}^N \int_t^T |u_s^{(j)}|\,ds,$$

with the Lebesgue measure $\mu_L$ on $\mathbb{R}$. The $L^0$ norm is also expressed as $\|u\|_0 = \int_t^T \psi_0(u_s)\,ds$, where $\psi_0:\mathbb{R}^N\to\mathbb{R}$ is the function that returns the number of non-zero components, i.e.,

$$\psi_0(a) \triangleq \sum_{j=1}^N |a^{(j)}|^0, \quad a\in\mathbb{R}^N,$$

with the convention $0^0=0$.
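In discrete time, these quantities have straightforward Riemann-sum approximations on a uniform grid. The helper names below are our own, for illustration only:

```python
import numpy as np

# Discrete-time approximations of the L0 and L1 norms defined above, for a
# sampled signal u of shape (n_samples, N) on a uniform grid with step dt.
def psi0(a, tol=0.0):
    # number of non-zero components of a vector a (with the convention 0^0 = 0)
    return int(np.sum(np.abs(np.asarray(a)) > tol))

def l0_norm(u, dt):
    # sum over components of the measure of {s : u_s^{(j)} != 0}
    return float(np.sum(u != 0.0) * dt)

def l1_norm(u, dt):
    # sum over components of the integral of |u_s^{(j)}|
    return float(np.sum(np.abs(u)) * dt)

u = np.array([[0.0, 1.0], [0.0, 0.0], [-2.0, 0.5]])  # 3 samples, N = 2
dt = 0.1
print(l0_norm(u, dt), l1_norm(u, dt), psi0(u[2]))  # → 0.3 0.35 2
```

Note that `l0_norm` is exactly the discretization of $\int_t^T \psi_0(u_s)\,ds$, consistent with the identity above.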

For a given set $\Omega\subset\mathbb{R}^N$, $C(\Omega)$ denotes the set of all continuous functions on $\Omega$. For $T>0$, $C^{1,2}((0,T)\times\mathbb{R}^N)$ denotes the set of all functions $\phi$ on $(0,T)\times\mathbb{R}^N$ whose partial derivatives $\frac{\partial\phi}{\partial s}, \frac{\partial\phi}{\partial x^{(i)}}, \frac{\partial^2\phi}{\partial x^{(i)}\partial x^{(j)}}$, $i,j=1,\dots,N$, exist and are continuous on $(0,T)\times\mathbb{R}^N$. Denote by $C^{1,2}([0,T)\times\mathbb{R}^N)$ the set of all $\phi\in C^{1,2}((0,T)\times\mathbb{R}^N)\cap C([0,T)\times\mathbb{R}^N)$ such that $\frac{\partial\phi}{\partial s}, \frac{\partial\phi}{\partial x^{(i)}}, \frac{\partial^2\phi}{\partial x^{(i)}\partial x^{(j)}}$, $i,j=1,\dots,N$, can be extended to continuous functions on $[0,T)\times\mathbb{R}^N$. For $\phi\in C^{1,2}([0,T)\times\mathbb{R}^N)$, $\phi_t$ denotes the partial derivative with respect to the first variable, $D_x\phi$ denotes the gradient with respect to the last $N$ variables, and $D_x^2\phi$ denotes the Hessian matrix with respect to the last $N$ variables. For $p\ge 2$, denote by $C_p^{1,2}([0,T]\times\mathbb{R}^N)$ the set of all $\phi\in C^{1,2}([0,T)\times\mathbb{R}^N)\cap C([0,T]\times\mathbb{R}^N)$ satisfying

$$\|\rho(t,x)\| \le K(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^N \quad (2)$$

for some constant $K>0$ and every $\rho\in\{\phi, D_x\phi, D_x^2\phi, \phi_t\}$. A function $\rho:[0,T]\times\mathbb{R}^N\to\mathbb{R}$ is said to satisfy a polynomial growth condition, or to be at most polynomially growing, if there exist constants $K>0$ and $p\ge 2$ such that (2) holds.

Let $\alpha\in(0,1]$. A function $f:\mathbb{R}^{N_1}\to\mathbb{R}^{N_2}$ is called $\alpha$-Hölder continuous if there exists a constant $L>0$ such that $\|f(x)-f(y)\|\le L\|x-y\|^\alpha$ for all $x,y\in\mathbb{R}^{N_1}$. Especially when $\alpha=1$, $f$ is called Lipschitz continuous. A function $f$ is called locally $\alpha$-Hölder continuous if for any $x\in\mathbb{R}^{N_1}$ there exists a neighborhood $U_x$ of $x$ such that $f$ restricted to $U_x$ is $\alpha$-Hölder continuous.

The notation $o(s)$ denotes a real-valued function $f$ defined on some subset of $\mathbb{R}$ such that $\lim_{s\to 0} f(s)/s = 0$.

For $0\le t\le T$, let $(\Omega,\mathcal{F},\{\mathcal{F}_s\}_{s\ge t},\mathbb{P})$ be a filtered probability space, and let $\mathbb{E}$ be the expectation with respect to $\mathbb{P}$. For $S=\mathbb{R}^N$ or $\mathcal{S}^N$, denote by $\mathcal{L}_{\mathcal{F}}^2(t,T;S)$ the set of all $\{\mathcal{F}_s\}_{s\ge t}$-adapted $S$-valued processes $\{X_s\}_{s\ge t}$ such that $\mathbb{E}\left[\int_t^T \|X_s\|^2\,ds\right]<+\infty$. In what follows, we omit the subscript of stochastic processes when no confusion occurs, e.g., $\{X_s\}=\{X_s\}_{s\ge t}$.

3 Problem formulation

This paper considers sparse optimal control for stochastic systems. This section provides the system description and formulates the main problem.

We consider the following stochastic system, whose state is governed by a stochastic differential equation valued in $\mathbb{R}^n$:

$$dx_s = f(x_s,u_s)\,ds + \sigma(x_s,u_s)\,dw_s, \quad s>t, \qquad x_t = x. \quad (3)$$

The initial value $x\in\mathbb{R}^n$ is deterministic, and $\{w_s\}$ is a $d$-dimensional Wiener process. The range of the control $\mathbb{U}\subset\mathbb{R}^m$ is a compact set that contains $0\in\mathbb{R}^m$, and we fix a finite horizon $0<T<\infty$.

We are interested in the optimal control that minimizes the cost functional

$$J^{\sf s}(t,x,u) \triangleq \mathbb{E}\left[\int_t^T \psi_0(u_s)\,ds + g(x_T)\right]. \quad (4)$$

We assume the following conditions on the functions $f$, $\sigma$, and $g$:

  1. $(A_1)$

    The functions $f$ and $\sigma$ are globally Lipschitz, namely, there exist positive constants $L$, $\bar{M}$ and a nondecreasing function $\bar{m}\in C([0,+\infty))$ such that $f:\mathbb{R}^n\times\mathbb{U}\to\mathbb{R}^n$ and $\sigma:\mathbb{R}^n\times\mathbb{U}\to\mathbb{R}^{n\times d}$ satisfy

    $$\|f(x,u)-f(y,v)\| + \|\sigma(x,u)-\sigma(y,v)\| \le L\|x-y\| + \bar{m}(\|u-v\|) \quad (5)$$

    for all $x,y\in\mathbb{R}^n$, $u,v\in\mathbb{U}$, where $\bar{m}(\cdot)\le\bar{M}$ and $\bar{m}(0)=0$;

  2. $(A_2)$

    There exist constants $\hat{C}>0$ and $p\ge 2$ such that $g:\mathbb{R}^n\to\mathbb{R}$ satisfies the growth condition

    $$|g(x)| \le \hat{C}(1+\|x\|^p) \quad (6)$$

    for all $x\in\mathbb{R}^n$;

  3. $(A_3)$

    $g:\mathbb{R}^n\to\mathbb{R}$ is continuous.

Given a probability space with the filtration $\{\mathcal{F}_s\}_{s\ge t}$ generated by a Wiener process, Assumption $(A_1)$ ensures the existence and uniqueness of a strong solution to the stochastic differential equation (3) for any initial condition $x_t=x$, $(t,x)\in[0,T]\times\mathbb{R}^n$, and any $\{\mathcal{F}_s\}_{s\ge t}$-progressively measurable, $\mathbb{U}$-valued control process $\{u_s\}$. In addition, under assumptions $(A_1)$ and $(A_2)$, the cost functional $J^{\sf s}(t,x,u)$ is finite; see Appendix A. Assumption $(A_3)$ is introduced to show the continuity of the value function defined later in (7).

For our analysis, we utilize the method of dynamic programming. In order to establish the dynamic programming principle (Lemma 1), we need to consider a family of optimal control problems with different initial times and states $(t,x)\in[0,T]\times\mathbb{R}^n$ along a state trajectory. Let us consider a state trajectory starting from $x_0=x$ on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_s\}_{s\ge 0},\mathbb{P})$. For any $s>0$, $x_s$ is a random variable. However, an $\{\mathcal{F}_s\}_{s\ge 0}$-progressively measurable control $\{u_s\}$ knows the information of the system up to the current time. In particular, the current state $x_s$ is deterministic under the conditional probability measure $\mathbb{P}(\cdot|\mathcal{F}_s)$. This observation naturally leads us to vary the probability spaces as well as the control processes; for details see, e.g., [24, 19, 6]. For this reason, we adopt the so-called weak formulation of the stochastic optimal control problem; see also Remark 1.

For each fixed $t\in[0,T)$, we denote by $\mathcal{U}^{\sf s}[t,T]$ the set of all 5-tuples $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})$ satisfying the following conditions:

  • (i)

    $(\Omega,\mathcal{F},\mathbb{P})$ is a complete probability space;

  • (ii)

    $\{w_s\}$ is a $d$-dimensional Wiener process on $(\Omega,\mathcal{F},\mathbb{P})$ over $[t,T]$ (with $w_t=0$ almost surely);

  • (iii)

    the control $\{u_s\}$ is an $\{\mathcal{F}_s\}_{s\ge t}$-progressively measurable, $\mathbb{U}$-valued process on $(\Omega,\mathcal{F},\mathbb{P})$, where $\mathcal{F}_s$ is the $\sigma$-field generated by $\{w_r\}_{t\le r\le s}$.

For $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})\in\mathcal{U}^{\sf s}[t,T]$, we call $\{u_s\}$ an admissible control process and $(\Omega,\mathcal{F},\mathbb{P},\{w_s\})$ a reference probability space. For notational simplicity, we sometimes write $u\in\mathcal{U}^{\sf s}[t,T]$ instead of $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})\in\mathcal{U}^{\sf s}[t,T]$. Note that in (4) the expectation $\mathbb{E}$ is with respect to $\mathbb{P}$. For given $(t,x)\in[0,T]\times\mathbb{R}^n$ and $u\in\mathcal{U}^{\sf s}[t,T]$, we denote by $\{x_s^{t,x,u}\}_{t\le s\le T}$ the unique solution of (3). When there is no confusion, we omit the superscript $t,x,u$.

Then, we are ready to formulate the main problem as follows:

Problem 1.

Given $x\in\mathbb{R}^n$, $T>0$, and $t\in[0,T]$, find a 5-tuple $u\in\mathcal{U}^{\sf s}[t,T]$ that solves

$$\begin{aligned}
&\underset{u}{\text{minimize}} && J^{\sf s}(t,x,u)\\
&\text{subject to} && dx_s = f(x_s,u_s)\,ds + \sigma(x_s,u_s)\,dw_s,\\
& && x_t = x,\\
& && u\in\mathcal{U}^{\sf s}[t,T].
\end{aligned}$$

\lhd

The value function for Problem 1 is defined by

$$V^{\sf s}(t,x) \triangleq \inf_{u\in\mathcal{U}^{\sf s}[t,T]} J^{\sf s}(t,x,u), \quad (t,x)\in[0,T]\times\mathbb{R}^n. \quad (7)$$
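For any fixed admissible control, the cost (4) can be estimated by Monte Carlo averaging over sample paths, which yields an upper bound on $V^{\sf s}(t,x)$ since the infimum over controls is not taken. The sketch below does this for the scalar dynamics of Example 1 under a hypothetical threshold feedback (the threshold is our own illustrative choice):

```python
import numpy as np

# Monte Carlo estimate of the cost J^s(t, x, u) in (4) for the scalar dynamics
# of Example 1 (g(x) = x^2), under a fixed, hypothetical threshold feedback.
# The average over paths approximates the expectation; V^s(t, x) is the infimum
# of such costs over all admissible controls, which this sketch does not compute.
rng = np.random.default_rng(1)
c, sigma, T, x0 = 1.0, 0.1, 1.0, 0.5
n_paths, n_steps = 2000, 200
dt = T / n_steps

def feedback(x, thr=0.3):
    # vectorized bang-off-bang feedback with illustrative threshold thr
    return np.where(np.abs(x) <= thr, 0.0, -np.sign(x))

x = np.full(n_paths, x0)
running = np.zeros(n_paths)  # integral of psi0(u_s) along each path
for _ in range(n_steps):
    u = feedback(x)
    running += (u != 0.0) * dt
    x += (c * x + u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

J_hat = np.mean(running + x**2)  # terminal cost g(x) = x^2
print(f"estimated cost: {J_hat:.3f}")
```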
Remark 1.

In Problem 1, we vary probability spaces. This problem formulation is called a weak formulation. On the other hand, the problem where we fix a probability space for each initial time and state $(t,x)\in[0,T]\times\mathbb{R}^n$ and vary only the control processes is referred to as a strong formulation, which is natural from the practical point of view. Despite the difference in the settings, it is known that, under some conditions, the value function of the weak formulation coincides with that of the strong formulation; see [7]. In this paper, under some assumptions, we will show that, for any given reference probability space, we can design an optimal state-feedback controller in Corollary 1. This result bridges the gap between the weak formulation and the strong formulation. Lastly, we would like to emphasize that the term "weak" refers only to the fact that the probability spaces vary, not to the concept of solution of the stochastic differential equation (3). In fact, once we fix $u\in\mathcal{U}^{\sf s}[t,T]$, the solution is defined on the same probability space. $\lhd$

4 General analysis of stochastic optimal control with discontinuous input cost functional

This section is devoted to the preliminary analysis of the stochastic $L^0$ optimal control problem. We first characterize the value function as a viscosity solution to the associated HJB equation. Then, we derive a necessary and sufficient condition for $L^0$ optimality.

4.1 Characterization of the value function

In what follows, we show that the value function $V^{\sf s}$ is a viscosity solution to the associated HJB equation. The definition of a viscosity solution appears in Appendix C. The HJB equation [24] corresponding to the stochastic system (3) is given by

$$-v_t(t,x) + H^{\sf s}(x, D_x v(t,x), D_x^2 v(t,x)) = 0, \quad (t,x)\in[0,T)\times\mathbb{R}^n, \quad (8)$$
$$v(T,x) = g(x), \quad x\in\mathbb{R}^n, \quad (9)$$

where $H^{\sf s}:\mathbb{R}^n\times\mathbb{R}^n\times\mathcal{S}^n\to\mathbb{R}$ is defined by

$$H^{\sf s}(x,p,M) \triangleq \sup_{u\in\mathbb{U}} \Bigl\{ -f(x,u)\cdot p - \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,u)M\bigr) - \psi_0(u) \Bigr\}. \quad (10)$$
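For intuition, the supremum in (10) can be evaluated by hand in the scalar setting of Example 1, where $f(x,u)=cx+u$, $\sigma$ is constant, and $\mathbb{U}=[-1,1]$. The following sketch isolates only the $u$-dependent terms; the general control-affine case is treated in Section 5.

```latex
% Only the terms involving u matter inside the supremum in (10):
\sup_{|u|\le 1}\bigl\{-u\,p-|u|^{0}\bigr\}
  =\max\Bigl\{0,\ \sup_{0<|u|\le 1}(-u\,p)-1\Bigr\}
  =\max\bigl\{0,\ |p|-1\bigr\},
% attained by u^{*}=0 when |p|<1 and by u^{*}=-\operatorname{sgn}(p) when |p|>1
% (both are optimal at |p|=1). With p = D_x v, this is the bang-off-bang
% structure observed in Example 1.
```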

We first introduce the result for the continuity of the value function [13]. The main difficulty in the analysis is that the state of the system (3) is unbounded due to the stochastic noise.

Theorem 1.

Fix $T>0$. Under assumptions $(A_1)$, $(A_2)$, and $(A_3)$, the value function $V^{\sf s}$ defined by (7) is continuous on $[0,T]\times\mathbb{R}^n$. If in addition the terminal cost $g$ is Lipschitz continuous, then $V^{\sf s}(t,x)$ is Lipschitz continuous in $x$ uniformly in $t$, and locally $1/2$-Hölder continuous in $t$ for each $x$. $\lhd$

Remark 2.

Note that the Lipschitz continuity of $g$ yields the local Lipschitz continuity of the value function for deterministic systems [11, Theorem 1], which ensures that the value function is differentiable almost everywhere. On the other hand, we cannot expect local Lipschitz continuity of the value function $V^{\sf s}$ in the stochastic case, even when $g$ is Lipschitz continuous. This is essentially because $\int_0^t \sigma\,dw$ is only of order $t^{1/2}$. $\lhd$

The dynamic programming principle plays an important role in proving that the value function is a viscosity solution to the HJB equation. Since the proof is similar to [24, Chapter 4, Theorem 3.3], it is omitted.

Lemma 1.

Fix any $T>0$ and any $\tau\in[0,T]$. Assume $(A_1)$ and $(A_2)$. Then, the value function (7) satisfies

$$V^{\sf s}(t,x) = \inf_{u\in\mathcal{U}^{\sf s}[t,T]} \mathbb{E}\left[\int_t^\tau \psi_0(u_s)\,ds + V^{\sf s}(\tau, x_\tau^{t,x,u})\right]$$

for all $(t,x)\in[0,\tau]\times\mathbb{R}^n$. $\lhd$
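The recursion in Lemma 1 underlies the standard backward-in-time computation of $V^{\sf s}$. As an illustration only, the following explicit finite-difference sketch applies one Euler step of the recursion per time slice to the scalar setting of Example 1; the grid sizes, the `np.gradient` derivatives, and the boundary handling are our own crude choices, not from the paper.

```python
import numpy as np

# Crude finite-difference dynamic programming for Example 1: c = 1, sigma = 0.1,
# g(x) = x^2, and control values restricted to {-1, 0, 1}. Each backward step is
# V(t - dt, x) ~ V(t, x) + dt * inf_u { (c x + u) V_x + 0.5 sigma^2 V_xx + psi0(u) },
# i.e. one explicit Euler step of the recursion behind Lemma 1 / the HJB equation.
c, sigma, T = 1.0, 0.1, 1.0
n_t, n_x = 200, 201
dt = T / n_t
xs = np.linspace(-2.0, 2.0, n_x)
dx = xs[1] - xs[0]

V = xs**2  # terminal condition V(T, x) = g(x)
for _ in range(n_t):
    Vx = np.gradient(V, dx)
    Vxx = np.gradient(Vx, dx)
    ham = np.min(
        [(c * xs + u) * Vx + 0.5 * sigma**2 * Vxx + (1.0 if u != 0.0 else 0.0)
         for u in (-1.0, 0.0, 1.0)],
        axis=0,
    )
    V = V + dt * ham  # step backward from t to t - dt

print(f"V(0, 0) approx {V[n_x // 2]:.3f}")
```

The minimizing `u` at each grid point gives an approximate feedback map; consistent with the bang-off-bang structure, it is `0.0` wherever `abs(Vx) <= 1` and full authority elsewhere.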

According to the definition of a viscosity solution in Appendix C, we have to check the inequalities (46) and (47) for every smooth function $\phi$. However, this requirement is too strong for our analysis. Fortunately, the following lemma makes it possible to restrict the class of functions $\phi$ to be considered.

Lemma 2.

Assume $(A_1)$, $(A_2)$, and $(A_3)$. Then, the value function (7) satisfies the polynomial growth condition, i.e., for some constant $\hat{C}_p>0$,

$$|V^{\sf s}(t,x)| \le \hat{C}_p(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n \quad (11)$$

holds, where $p\ge 2$ satisfies (6). In addition, if (46) and (47), with $v$ and $H$ replaced by $V^{\sf s}$ and $H^{\sf s}$, respectively, are satisfied for every $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, then $V^{\sf s}$ is a viscosity solution to the HJB equation (8) with terminal condition (9).

Proof.

First, we derive the polynomial growth condition of $V^{\sf s}$. By Assumption $(A_2)$,

$$|V^{\sf s}(t,x)| \le \mathbb{E}[|g(\bar{x}_T)|] \le \mathbb{E}[\hat{C}(1+\|\bar{x}_T\|^p)] \quad (12)$$

holds, where $\hat{C}>0$ and $p\ge 2$ are constants satisfying (6), and $\{\bar{x}_s\}$ is the solution of the uncontrolled system

$$d\bar{x}_s = f(\bar{x}_s,0)\,ds + \sigma(\bar{x}_s,0)\,dw_s, \quad \bar{x}_t = x.$$

Combining inequality (12) with inequality (42) of Lemma 4 in Appendix A, we obtain (11).

Next, note that by the definition (7) of the value function, $V^{\sf s}$ satisfies the terminal condition (9). Moreover, thanks to the continuity of $V^{\sf s}$ and the derived growth condition (11), we can apply Theorem 3.1 of [19]: $V^{\sf s}$ is a viscosity subsolution (resp. supersolution) of (8) if (46) (resp. (47)) holds for every $\phi\in C^{1,2}([0,T)\times\mathbb{R}^n)\cap C([0,T]\times\mathbb{R}^n)$ satisfying, for some $R>0$,

$$\phi(t,x) = c_p(1+\|x\|^p) \quad \text{for } t\in[0,T],\ \|x\|\ge R, \quad (13)$$

where $c_p=\hat{C}_p$ (resp. $c_p=-\hat{C}_p$). This implies that for some large $K^{(1)}>0$,

$$|\phi(t,x)| \le K^{(1)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n,$$

noting that $\phi$ is continuous, and therefore $|\phi|$ attains a maximum on $[0,T]\times\{x\in\mathbb{R}^n:\|x\|\le R\}$. Moreover, (13) gives

$$D_x\phi(t,x) = p\,c_p\|x\|^{p-2}x \quad \text{for } t\in[0,T],\ \|x\|\ge R,$$

and hence, for some constant $K^{(2)}>0$, it holds that

$$\|D_x\phi(t,x)\| \le K^{(2)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n,$$

noting that $D_x\phi$ is continuous. Likewise, for some constants $K^{(3)}, K^{(4)}>0$, it holds that

$$\|D_x^2\phi(t,x)\| \le K^{(3)}(1+\|x\|^p),$$
$$|\phi_t(t,x)| \le K^{(4)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n.$$

This completes the proof. $\Box$

Then, we are ready to prove that our value function is a viscosity solution to the associated HJB equation.

Theorem 2.

Fix $T>0$. Assume $(A_1)$, $(A_2)$, and $(A_3)$. Then, the value function (7) is a viscosity solution to the HJB equation (8) with terminal condition (9).

Proof.

Note that $H^{\sf s}$ is continuous (see Lemma 5 in Appendix B), and the condition (44) in Appendix C is obviously satisfied since the matrix $\sigma\sigma^\top$ is positive semidefinite. We first show that the value function $V^{\sf s}$ is a viscosity subsolution of (8). For $p\ge 2$ satisfying (6), fix any $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, and let $(t,x)$ be a global maximum point of $V^{\sf s}-\phi$. Let us consider the constant control $u_s=\bar{u}$ for $s\in[t,T]$, with $\bar{u}\in\mathbb{U}$. Denote the corresponding state process $x_s^{t,x,u}$ by $\bar{x}_s$. Then, for $\tau\in(t,T)$, we have

$$\mathbb{E}\left[\phi(t,x)-\phi(\tau,\bar{x}_\tau)\right] \le \mathbb{E}\left[V^{\sf s}(t,x)-V^{\sf s}(\tau,\bar{x}_\tau)\right]. \quad (14)$$

By using Lemma 1, we obtain

$$V^{\sf s}(t,x) \le \mathbb{E}\left[\int_t^\tau \psi_0(u_s)\,ds + V^{\sf s}(\tau,\bar{x}_\tau)\right] = (\tau-t)\psi_0(\bar{u}) + \mathbb{E}\left[V^{\sf s}(\tau,\bar{x}_\tau)\right].$$

Therefore,

$$\mathbb{E}\left[\phi(t,x)-\phi(\tau,\bar{x}_\tau)\right] \le (\tau-t)\psi_0(\bar{u}).$$

Note that under the growth condition (2), it holds that

$$\lim_{\tau\searrow t}\frac{\mathbb{E}[\phi(\tau,\bar{x}_\tau)]-\phi(t,x)}{\tau-t} = D_x\phi(t,x)\cdot f(x,\bar{u}) + \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,\bar{u})D_x^2\phi(t,x)\bigr) + \phi_t(t,x),$$

where Itô’s formula is applied [7]. Therefore, we get

$$-D_x\phi(t,x)\cdot f(x,\bar{u}) - \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,\bar{u})D_x^2\phi(t,x)\bigr) - \phi_t(t,x) \le \psi_0(\bar{u}).$$

This inequality holds for all $\bar{u}\in\mathbb{U}$. This means

$$-\phi_t(t,x) + H^{\sf s}(x, D_x\phi(t,x), D_x^2\phi(t,x)) \le 0.$$

We next show that $V^{\sf s}$ is a viscosity supersolution of (8). Fix any $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, and let $(t,x)$ be a global minimum point of $V^{\sf s}-\phi$. Then, for any $\varepsilon>0$ and $\tau\in(t,T)$, by Lemma 1, there exists $\tilde{u}\in\mathcal{U}^{\sf s}[t,T]$, depending on $\varepsilon$ and $\tau$, such that

$$V^{\sf s}(t,x) + (\tau-t)\varepsilon \ge \mathbb{E}\left[\int_t^\tau \psi_0(\tilde{u}_s)\,ds + V^{\sf s}(\tau,\tilde{x}_\tau)\right], \quad (15)$$

where we denote $x_s^{t,x,\tilde{u}}$ by $\tilde{x}_s$. Therefore, it holds that

$$\begin{aligned}
0 &\ge \mathbb{E}\left[V^{\sf s}(t,x)-\phi(t,x)-V^{\sf s}(\tau,\tilde{x}_\tau)+\phi(\tau,\tilde{x}_\tau)\right]\\
&\ge -(\tau-t)\varepsilon + \mathbb{E}\left[\int_t^\tau \psi_0(\tilde{u}_s)\,ds + \phi(\tau,\tilde{x}_\tau) - \phi(t,x)\right]. \quad (16)
\end{aligned}$$

By applying Itô’s formula, we obtain

$$\begin{aligned}
\mathbb{E}[\phi(t,x)-\phi(\tau,\tilde{x}_\tau)] = \mathbb{E}\Biggl[&-\int_t^\tau D_x\phi(s,\tilde{x}_s)\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\\
&-\int_t^\tau \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(\tilde{x}_s,\tilde{u}_s)D_x^2\phi(s,\tilde{x}_s)\bigr)\,ds\\
&-\int_t^\tau \phi_t(s,\tilde{x}_s)\,ds\Biggr]. \quad (17)
\end{aligned}$$

Here, note that

$$\mathbb{E}\left[\int_t^\tau D_x\phi(s,\tilde{x}_s)\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\right] = \mathbb{E}\left[\int_t^\tau D_x\phi(t,x)\cdot f(x,\tilde{u}_s)\,ds\right] + o(\tau-t). \quad (18)$$

To see this, rewrite (18) as

$$\underbrace{\mathbb{E}\biggl[\int_t^\tau \bigl\{D_x\phi(s,\tilde{x}_s)-D_x\phi(t,x)\bigr\}\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\biggr]}_{\triangleq I_1(\tau)} + \underbrace{\mathbb{E}\biggl[\int_t^\tau D_x\phi(t,x)\cdot\bigl\{f(\tilde{x}_s,\tilde{u}_s)-f(x,\tilde{u}_s)\bigr\}\,ds\biggr]}_{\triangleq I_2(\tau)} = o(\tau-t). \quad (19)$$

The first term $I_{1}$ is bounded above as follows:

$$\begin{aligned}
I_{1}(\tau)&\leq\mathbb{E}\left[\int_{t}^{\tau}\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\cdot\|f(\tilde{x}_{s},\tilde{u}_{s})\|\,ds\right]\\
&\leq\left\{LK_{1}(1+\|x\|)+K_{f}\right\}\,\mathbb{E}\left[\int_{t}^{\tau}\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\,ds\right]\\
&\leq\left\{LK_{1}(1+\|x\|)+K_{f}\right\}\,(\tau-t)\sup_{s\in[t,\tau]}\mathbb{E}\left[\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\right],
\end{aligned}$$

where $L$ and $K_{1}$ satisfy (5) and (42), respectively, and $K_{f}$ is some constant satisfying $\|f(0,u)\|\leq K_{f}$ for all $u\in\mathbb{U}$. If it holds that

$$\lim_{s\searrow t}\mathbb{E}\left[\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\right]=0, \quad (20)$$

then we obtain $\lim_{\tau\searrow t}I_{1}(\tau)/(\tau-t)=0$. Indeed, we can show (20) under the condition $\phi\in C_{p}^{1,2}([0,T]\times\mathbb{R}^{n})$ along the same lines as the proof of Theorem 2 in [13]. Likewise, we get $\lim_{\tau\searrow t}I_{2}(\tau)/(\tau-t)=0$ under Assumption $(A_{1})$, and therefore (19) holds.

By the same argument, we see that

$$\mathbb{E}\left[\int_{t}^{\tau}\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(\tilde{x}_{s},\tilde{u}_{s})D_{x}^{2}\phi(s,\tilde{x}_{s})\right)ds\right]=\mathbb{E}\left[\int_{t}^{\tau}\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,\tilde{u}_{s})D_{x}^{2}\phi(t,x)\right)ds\right]+o(\tau-t),$$

$$\mathbb{E}\left[\int_{t}^{\tau}\phi_{t}(s,\tilde{x}_{s})\,ds\right]=(\tau-t)\phi_{t}(t,x)+o(\tau-t).$$

Then, it follows from (16) and (17) that

$$\begin{aligned}
-(\tau-t)\varepsilon&\leq\mathbb{E}\Biggl[\int_{t}^{\tau}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,\tilde{u}_{s})-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,\tilde{u}_{s})D_{x}^{2}\phi(t,x)\right)-\psi_{0}(\tilde{u}_{s})\Bigr\}ds\Biggr]-(\tau-t)\phi_{t}(t,x)+o(\tau-t)\\
&\leq(\tau-t)\sup_{u\in\mathbb{U}}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,u)-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)D_{x}^{2}\phi(t,x)\right)-\psi_{0}(u)\Bigr\}-(\tau-t)\phi_{t}(t,x)+o(\tau-t).
\end{aligned}$$

Divide both sides by $(\tau-t)$ and let $\tau\searrow t$; then

$$-\varepsilon\leq\sup_{u\in\mathbb{U}}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,u)-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)D_{x}^{2}\phi(t,x)\right)-\psi_{0}(u)\Bigr\}-\phi_{t}(t,x).$$

The arbitrariness of $\varepsilon$ shows that $V^{\sf s}$ is a viscosity supersolution of (8). Combining the above arguments with Lemma 2 completes the proof. $\Box$

4.2 Optimality of a control

Next, we provide a necessary condition and a sufficient condition for the $L^{0}$ optimality. The second-order right parabolic superdifferential $D_{t+,x}^{1,2,+}$ and subdifferential $D_{t+,x}^{1,2,-}$ are defined in Appendix C. The proof is the same as that of [24, Chapter 5, Theorems 5.3 and 5.7], noting that under Assumptions $(A_{1})$, $(A_{2})$, and $(A_{3})$, Theorems 1 and 2 and Lemma 1 hold.

Lemma 3.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1})$, $(A_{2})$, and $(A_{3})$.

(Necessary condition) Let $(\Omega^{*},\mathcal{F}^{*},\mathbb{P}^{*},\{w_{s}^{*}\},\{u_{s}^{*}\})\in\mathcal{U}^{\sf s}[t,T]$ be an optimal solution for Problem 1, and $\{x_{s}^{*}\}$ be the corresponding optimal state trajectory. Then, for any

$$(q^{*},p^{*},M^{*})\in\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathcal{S}^{n})$$

satisfying

$$(q_{s}^{*},p_{s}^{*},M_{s}^{*})\in D_{t+,x}^{1,2,-}V^{\sf s}(s,x_{s}^{*}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}^{*}\text{-a.s.}, \quad (21)$$

it must hold that

$$\mathbb{E}[q_{s}^{*}]\leq\mathbb{E}\left[G(x_{s}^{*},u_{s}^{*},p_{s}^{*},M_{s}^{*})\right],\ \text{a.e.}\ s\in[t,T], \quad (22)$$

where we define

$$G(x,u,p,M)\triangleq-f(x,u)\cdot p-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)M\right)-\psi_{0}(u).$$

(Sufficient condition) Let $(\bar{\Omega},\bar{\mathcal{F}},\bar{\mathbb{P}},\{\bar{w}_{s}\},\{\bar{u}_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ and $\{\bar{x}_{s}\}$ be the corresponding state trajectory. If there exists

$$(\bar{q},\bar{p},\bar{M})\in\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathcal{S}^{n})$$

satisfying

$$(\bar{q}_{s},\bar{p}_{s},\bar{M}_{s})\in D_{t+,x}^{1,2,+}V^{\sf s}(s,\bar{x}_{s}),\ \text{a.e.}\ s\in[t,T],\ \bar{\mathbb{P}}\text{-a.s.}, \quad (23)$$

and

$$\bar{q}_{s}=G(\bar{x}_{s},\bar{u}_{s},\bar{p}_{s},\bar{M}_{s})=\max_{u\in\mathbb{U}}G(\bar{x}_{s},u,\bar{p}_{s},\bar{M}_{s}),\ \text{a.e.}\ s\in[t,T],\ \bar{\mathbb{P}}\text{-a.s.}, \quad (24)$$

then $\{\bar{u}_{s}\}$ is an optimal control process. $\lhd$

Compared to the verification theorem [7], which is well known as an optimality condition for the case when the value function is smooth, the above conditions are quite complicated and do not explicitly relate the optimal control value to the current state value via the value function. In view of this, we derive a novel necessary and sufficient condition that is similar to the verification theorem and therefore much clearer. We now introduce some assumptions:

  • $(B_{1})$: For any $u\in\mathcal{U}^{\sf s}[t,T]$, the value function $V^{\sf s}$ defined by (7) admits $V_{t}^{\sf s}$, $D_{x}V^{\sf s}$, and $D_{x}^{2}V^{\sf s}$ at $(s,x_{s})$ for almost every $s\in[t,T]$ and almost surely;

  • $(B_{2})$: For any $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$, there exists a function $\varphi:[t,T]\rightarrow S$ ($S=\mathbb{R},\mathbb{R}^{n},\mathcal{S}^{n}$) such that, for any $s\in[t,T]$,

$$\rho_{\varphi,s}(x)\triangleq\begin{cases}\rho(s,x),&\text{if }\rho\text{ exists at }(s,x),\\ \varphi(s),&\text{otherwise},\end{cases}\quad x\in\mathbb{R}^{n} \quad (25)$$

  is Borel measurable;

  • $(B_{3})$: For any $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$, there exist constants $K>0$ and $p\geq 2$ such that

$$\|\rho(s,x)\|\leq K(1+\|x\|^{p})$$

  holds at any $(s,x)\in[t,T]\times\mathbb{R}^{n}$ where $\rho(s,x)$ exists.

The validity of the above assumptions is discussed in Remark 3.

Theorem 3.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. Then, $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution for Problem 1 if and only if

$$u_{s}\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (26)$$

where $\{x_{s}\}$ is the corresponding state trajectory.

Proof. For a given $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ and the corresponding state trajectory $\{x_{s}\}$, define a stochastic process

$$q_{s}\triangleq\begin{cases}V_{t}^{\sf s}(s,x_{s}),&\text{if }V_{t}^{\sf s}\text{ exists at }(s,x_{s}),\\ \varphi(s),&\text{otherwise},\end{cases} \quad (27)$$

where $\varphi:[t,T]\rightarrow\mathbb{R}$ satisfies Assumption $(B_{2})$, which ensures that $\{q_{s}\}$ is an $\{\mathcal{F}_{s}\}_{s\geq t}$-adapted process. By Assumption $(B_{1})$, it holds that $q_{s}=V_{t}^{\sf s}(s,x_{s})$ for almost every $s\in[t,T]$, $\mathbb{P}$-almost surely, and by a slight abuse of notation, we write $q_{s}=V_{t}^{\sf s}(s,x_{s})$. In addition, Assumption $(B_{3})$ and Lemma 4 imply that $\mathbb{E}[\int_{t}^{T}\|q_{s}\|^{2}ds]<+\infty$. To sum up, we get $\{q_{s}\}\in\mathcal{L}_{\mathcal{F}}^{2}(t,T;\mathbb{R}^{n})$. Similarly, take $p_{s}=D_{x}V^{\sf s}(s,x_{s})$ and $M_{s}=D_{x}^{2}V^{\sf s}(s,x_{s})$. Note that, by (48) in Appendix C, it holds that

$$(q_{s},p_{s},M_{s})\in D_{t+,x}^{1,2,-}V^{\sf s}(s,x_{s})\cap D_{t+,x}^{1,2,+}V^{\sf s}(s,x_{s}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}$$

If $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})$ is an optimal solution, then by Lemma 3, (22) holds for $q_{s}^{*}=q_{s},p_{s}^{*}=p_{s},M_{s}^{*}=M_{s}$, that is,

$$\mathbb{E}[V_{t}^{\sf s}(s,x_{s})]\leq\mathbb{E}\left[G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\right],\ \text{a.e.}\ s\in[t,T]. \quad (28)$$

On the other hand, since $V^{\sf s}$ is a viscosity solution to the HJB equation (8) under Assumptions $(A_{1})$, $(A_{2})$, and $(A_{3})$ by Theorem 2, $V^{\sf s}$ satisfies (8) at any point where $V_{t}^{\sf s}$, $D_{x}V^{\sf s}$, and $D_{x}^{2}V^{\sf s}$ exist. Together with Assumption $(B_{1})$,

$$\begin{aligned}
V_{t}^{\sf s}(s,x_{s})&=H^{\sf s}(x_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s}))\\
&=\sup_{u\in\mathbb{U}}G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\\
&\geq G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}
\end{aligned} \quad (29)$$

Combining (28) and (29), we obtain

$$\begin{aligned}
V_{t}^{\sf s}(s,x_{s})&=\max_{u\in\mathbb{U}}G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\\
&=G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}
\end{aligned} \quad (30)$$

Hence, the optimal control process $\{u_{s}\}$ satisfies (26).

Conversely, if an admissible control process $\{u_{s}\}$ satisfies (26), it follows from (29) that

$$q_{s}=\sup_{u\in\mathbb{U}}G(x_{s},u,p_{s},M_{s})=G(x_{s},u_{s},p_{s},M_{s}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}$$

In other words, the sufficient condition (24) in Lemma 3 holds for $\bar{q}_{s}=q_{s},\bar{p}_{s}=p_{s},\bar{M}_{s}=M_{s},\bar{x}_{s}=x_{s},\bar{u}_{s}=u_{s}$. This completes the proof. $\Box$

Theorem 3 can be seen as a generalization of the verification theorem to value functions that are differentiable only almost everywhere; see Remark 3 below. It should be emphasized that the above result can easily be generalized to cost functionals with a uniformly continuous state cost and control cost, since Theorems 5.3 and 5.7 in [24, Chapter 5] still hold.

Remark 3.

If $V^{\sf s}$ admits every $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$ at almost every $(s,x)\in[t,T]\times\mathbb{R}^{n}$, and $\rho$ is continuous almost everywhere, then, by Lusin’s theorem, there exists a Borel measurable function that coincides with $\rho$ almost everywhere. Hence, in this case, we can remove Assumption $(B_{2})$. In addition, if for any $u\in\mathcal{U}^{\sf s}[t,T]$ the distribution of $\{x_{s}\}$ admits a density, Assumption $(B_{1})$ holds. This is a sufficient condition, but it is not necessary. See [3] for the existence of densities for solutions to stochastic differential equations. As for Assumption $(B_{3})$, we expect that the condition can be removed or relaxed along the lines of [7, Theorem IV.3.1], but this is not our focus in the present paper. $\lhd$

Theorem 3 immediately characterizes the $L^{0}$ optimal control in terms of a feedback map. In fact, as a straightforward consequence of Theorem 3, we obtain the following result.

Corollary 1.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. Let a Borel measurable function $\underline{u}:[t,T]\times\mathbb{R}^{n}\rightarrow\mathbb{U}$ satisfy

$$\underline{u}(s,x^{\prime})\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{G\left(x^{\prime},u,D_{x}V^{\sf s}(s,x^{\prime}),D_{x}^{2}V^{\sf s}(s,x^{\prime})\right)\right\},\ \text{a.e.}\ s\in[t,T],\ x^{\prime}\in\mathbb{R}^{n}.$$

Fix any reference probability space $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\})$. If the stochastic differential equation

$$dx_{s}=f(x_{s},\underline{u}(s,x_{s}))ds+\sigma(x_{s},\underline{u}(s,x_{s}))dw_{s},\ s>t,\quad x_{t}=x$$

has a unique strong solution, then $u_{s}^{*}\triangleq\underline{u}(s,x_{s})$ is an optimal control process, namely, $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}^{*}\})\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution for Problem 1. $\lhd$

Here, we emphasize that in the above result, any reference probability space can be fixed. Thus, for a state-feedback controller, we need not distinguish which reference probability space is optimal and can concentrate only on control processes.
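To make the feedback map in Corollary 1 concrete, the following minimal sketch (our own illustration, not an algorithm from the paper) approximates $\underline{u}(s,x)$ for a scalar system by maximizing $G$ over a finite candidate grid of $\mathbb{U}$; the callables `f`, `sigma`, `psi0` and the derivative values `p`, `M` of the value function are hypothetical inputs supplied by the user.

```python
# Minimal sketch (not from the paper): pointwise feedback map of Corollary 1
# for a scalar system, obtained by maximizing G over a finite grid of U.

def G(x, u, p, M, f, sigma, psi0):
    """G(x,u,p,M) = -f(x,u)*p - 0.5*sigma(x,u)^2*M - psi0(u) (scalar case)."""
    return -f(x, u) * p - 0.5 * sigma(x, u) ** 2 * M - psi0(u)

def feedback(x, p, M, f, sigma, psi0, grid):
    """Return a maximizer of G over the control grid (approximate argmax in (26))."""
    return max(grid, key=lambda u: G(x, u, p, M, f, sigma, psi0))

if __name__ == "__main__":
    # Toy data: dx = (x + u) ds + 0.1 dw, L0-like cost psi0(u) = 1 if u != 0.
    f = lambda x, u: x + u
    sigma = lambda x, u: 0.1
    psi0 = lambda u: 0.0 if u == 0 else 1.0
    grid = [-1.0, 0.0, 1.0]  # bang-off-bang candidates
    print(feedback(1.0, 2.0, 0.0, f, sigma, psi0, grid))  # large gradient -> -1.0
```

In practice the derivatives $D_xV^{\sf s}$ and $D_x^2V^{\sf s}$ would come from a numerical solution of the HJB equation (8).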

5 Characterization of sparse optimal stochastic control

In this section, we focus on control-affine systems satisfying

$$f(x,u)=f_{0}(x)+\sum_{j=1}^{m}f_{j}(x)u^{(j)},\quad\sigma(x,u)=\sigma(x) \quad (31)$$

for some $f_{j}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$, $j=0,1,2,\ldots,m$, where $u^{(j)}$ is the $j$-th component of $u\in\mathbb{R}^{m}$. First, we reveal the discreteness of the stochastic $L^{0}$ optimal control. Next, we show an equivalence between the $L^{0}$ optimality and the $L^{1}$ optimality. Thanks to the equivalence, we ensure that our value function is a classical solution of the associated HJB equation under some assumptions.

5.1 Discreteness of the optimal control

We explain the discreteness of the stochastic $L^{0}$ optimal control based on Theorem 3.

Theorem 4.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. If the system (3) is control-affine, i.e., (31) holds, and $\mathbb{U}=\{u\in\mathbb{R}^{m}:U_{j}^{-}\leq u^{(j)}\leq U_{j}^{+},\ \forall j\}$ for some $U_{j}^{-}<0$ and $U_{j}^{+}>0$, then $u\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution to Problem 1 if and only if

$$u_{s}^{(j)}\in\mathop{\rm arg\,max}\limits_{u^{(j)}\in\mathbb{U}_{j}}\left\{-(f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s}))u^{(j)}-|u^{(j)}|^{0}\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.} \quad (32)$$

for all $j=1,2,\ldots,m$, where $\mathbb{U}_{j}\triangleq\{a\in\mathbb{R}:U_{j}^{-}\leq a\leq U_{j}^{+}\}$. Furthermore, if an optimal control process $\{u_{s}^{*}\}$ exists, then the $j$-th component of $u_{s}^{*}$ takes only the three values $\{U_{j}^{-},0,U_{j}^{+}\}$ for almost every $s\in[t,T]$ and almost surely.

Proof. By Theorem 3, a necessary and sufficient condition for the $L^{0}$ optimality of $\{u_{s}\}$ is given by

$$\begin{aligned}
u_{s}&\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}G(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s}))\\
&=\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{-\sum_{j=1}^{m}f_{j}(x_{s})u^{(j)}\cdot D_{x}V^{\sf s}(s,x_{s})-\sum_{j=1}^{m}|u^{(j)}|^{0}\right\}\\
&=\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\sum_{j=1}^{m}\left\{-(f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s}))u^{(j)}-|u^{(j)}|^{0}\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (33)
\end{aligned}$$

noting that $\sigma$ does not depend on the control variable. Since the box constraint decouples across components, (33) is equivalent to (32).

Next, it follows from (32) and an elementary calculation that

$$u_{s}^{(j)}\in\begin{cases}\{U_{j}^{-}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}<-1,\\ \{U_{j}^{-},0\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}=-1,\\ \{0\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}>-1\ \text{and}\ b_{j}(s,x_{s})U_{j}^{+}>-1,\\ \{0,U_{j}^{+}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{+}=-1,\\ \{U_{j}^{+}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{+}<-1,\end{cases} \quad (34)$$

where we define $b_{j}(s,x)\triangleq D_{x}V^{\sf s}(s,x)\cdot f_{j}(x)$. Therefore, the $j$-th component of an optimal control process $\{u_{s}^{*}\}$ must take only the three values $\{U_{j}^{-},0,U_{j}^{+}\}$ for almost every $s\in[t,T]$ and almost surely. $\Box$
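The case analysis (34) can be transcribed directly as a selector for the $j$-th control component. The sketch below (ours, for illustration only) evaluates $-bu-|u|^{0}$ at the three candidates $U_j^-$, $0$, $U_j^+$ — by the proof above, no other value can be optimal — and returns all maximizers, so tie cases return two values.

```python
# Transcription of the case analysis (34): the j-th component of an L0 optimal
# control, given b = b_j(s, x_s) and bounds U_minus < 0 < U_plus.

def optimal_component_set(b, U_minus, U_plus):
    """Return the set of maximizers of -b*u - |u|^0 over {U_minus, 0, U_plus}."""
    scores = {}
    for u in (U_minus, 0.0, U_plus):
        scores[u] = -b * u - (0.0 if u == 0.0 else 1.0)  # |u|^0 = 1 for u != 0
    best = max(scores.values())
    return {u for u, val in scores.items() if val == best}

if __name__ == "__main__":
    print(optimal_component_set(2.0, -1.0, 1.0))  # b*U_minus = -2 < -1  -> only U_minus
    print(optimal_component_set(0.5, -1.0, 1.0))  # both products > -1   -> only 0
    print(optimal_component_set(1.0, -1.0, 1.0))  # b*U_minus = -1 (tie) -> U_minus and 0
```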

5.2 Equivalence between $L^{0}$ optimality and $L^{1}$ optimality

Let us consider the stochastic $L^{1}$ optimal control problem where the cost functional $J^{\sf s}$ in Problem 1 is replaced by the following one:

$$J_{1}^{\sf s}(t,x,u)\triangleq\mathbb{E}\left[\sum_{j=1}^{m}\int_{t}^{T}|u_{s}^{(j)}|ds+g(x_{T}^{t,x,u})\right]. \quad (35)$$

The corresponding value function is defined by

$$V_{1}^{\sf s}(t,x)\triangleq\inf_{u\in\mathcal{U}^{\sf s}[t,T]}J_{1}^{\sf s}(t,x,u),\ (t,x)\in[0,T]\times\mathbb{R}^{n}. \quad (36)$$

We here show the coincidence of the value functions of the $L^{0}$ optimal control and the $L^{1}$ optimal control for control-affine systems.

Theorem 5.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1})$, $(A_{2})$, and $(A_{3})$. If the system (3) is control-affine, i.e., (31) holds, and $\mathbb{U}=\{u\in\mathbb{R}^{m}:|u^{(j)}|\leq 1,\forall j\}$, then for the value functions $V^{\sf s}$ and $V_{1}^{\sf s}$ defined by (7) and (36), respectively, it holds that

$$V^{\sf s}(t,x)=V_{1}^{\sf s}(t,x),\ \forall(t,x)\in[0,T]\times\mathbb{R}^{n}.$$

In addition, $V^{\sf s}$ is a unique, at most polynomially growing viscosity solution to the HJB equation (8) with the terminal condition (9).

Proof. In this setting, for any $x,p\in\mathbb{R}^{n}$ and $M\in\mathcal{S}^{n}$,

$$\begin{aligned}
H^{\sf s}(x,p,M)&=\sup_{u\in\mathbb{U}}\left\{-\sum_{j=1}^{m}f_{j}(x)u^{(j)}\cdot p-\sum_{j=1}^{m}|u^{(j)}|^{0}\right\}-f_{0}(x)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x)M)\\
&=\sum_{j=1}^{m}\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-(f_{j}(x)\cdot p)u^{(j)}-|u^{(j)}|^{0}\right\}-f_{0}(x)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x)M),
\end{aligned}$$

where $\mathbb{U}_{j}=\{a\in\mathbb{R}:|a|\leq 1\}$. Here, it follows from an elementary calculation that

$$\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-a_{x,p}^{j}u^{(j)}-|u^{(j)}|^{0}\right\}=\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-a_{x,p}^{j}u^{(j)}-|u^{(j)}|\right\}$$

for all $x,p\in\mathbb{R}^{n}$ and $j=1,2,\dots,m$, where $a_{x,p}^{j}\triangleq f_{j}(x)\cdot p$. Indeed, the supremum on both sides is given by

$$\begin{cases}a_{x,p}^{j}-1,&\text{if}\quad a_{x,p}^{j}>1,\\ 0,&\text{if}\quad|a_{x,p}^{j}|\leq 1,\\ -a_{x,p}^{j}-1,&\text{if}\quad a_{x,p}^{j}<-1.\end{cases}$$

Hence, the HJB equation (8) is equivalent to

$$-v_{t}(t,x)+H_{1}(x,D_{x}v(t,x),D_{x}^{2}v(t,x))=0, \quad (37)$$

where

$$H_{1}(x,p,M)\triangleq\sup_{u\in\mathbb{U}}\Bigl\{-f(x,u)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,u)M)-\psi_{1}(u)\Bigr\},\quad x,p\in\mathbb{R}^{n},\ M\in\mathcal{S}^{n},\qquad\psi_{1}(a)\triangleq\sum_{j=1}^{m}|a_{j}|,\quad a\in\mathbb{R}^{m}.$$

Note that equation (37) is the HJB equation for the $L^{1}$ optimal control problem. Moreover, it is known that $V_{1}^{\sf s}$ defined via the $L^{1}$ optimal control problem is a unique, at most polynomially growing viscosity solution to the associated HJB equation (37), or equivalently (8), with the terminal condition (9) [21].

Now, $V^{\sf s}$ is also a viscosity solution to the HJB equation (8) with the terminal condition (9) by Theorem 2. Note also that $V^{\sf s}$ satisfies the polynomial growth condition by Lemma 2. Therefore, by the aforementioned uniqueness of the viscosity solution to the HJB equation (8), we conclude that $V^{\sf s}=V_{1}^{\sf s}$. $\Box$
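The elementary calculation used in the proof — the coincidence of the pointwise $L^0$ and $L^1$ suprema over $|u^{(j)}|\leq 1$ — is easy to check numerically. The following sketch (ours, illustration only) compares both suprema on a fine control grid against the closed-form expression $\max(|a|-1,0)$.

```python
# Numerical check (illustration only) that, over |u| <= 1,
#   sup { -a*u - |u|^0 }  =  sup { -a*u - |u| }  =  max(|a| - 1, 0).

def sup_L0(a, n=2001):
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]  # contains -1, 0, 1 exactly
    return max(-a * u - (0.0 if u == 0.0 else 1.0) for u in grid)

def sup_L1(a, n=2001):
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(-a * u - abs(u) for u in grid)

if __name__ == "__main__":
    for a in (-2.5, -1.0, -0.3, 0.0, 0.7, 1.0, 3.0):
        closed_form = max(abs(a) - 1.0, 0.0)
        assert abs(sup_L0(a) - closed_form) < 1e-9
        assert abs(sup_L1(a) - closed_form) < 1e-9
    print("L0 and L1 suprema coincide on the grid")
```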

Theorem 5 justifies using the value function of the $L^{1}$ optimal control problem to obtain the $L^{0}$ optimal control. For example, we can use the sampling-based algorithm recently proposed in [5] to calculate the value function.

In contrast to the deterministic case, where the corresponding HJB equation is of first order, if the second-order HJB equation is uniformly elliptic, then we can expect that the HJB equation with a terminal condition has a unique classical solution. Using this property and Theorem 5, we show that the value function $V^{\sf s}$ is a unique classical solution to the HJB equation under some assumptions. Define

$$C_{b}^{k}(\mathbb{R}^{n})\triangleq\{\rho\in C^{k}(\mathbb{R}^{n}):\rho\text{ and all partial derivatives of }\rho\text{ of order}\leq k\text{ are bounded}\}.$$
Corollary 2.

Suppose the assumptions in Theorem 5 and the following:

  • $(a)$: For any $\rho\in\{f_{0},f_{1},\ldots,f_{m},\sigma\sigma^{\top}\}$, $\rho\in C_{b}^{2}(\mathbb{R}^{n})$;

  • $(b)$: $g\in C_{b}^{3}(\mathbb{R}^{n})$;

  • $(c)$: (Uniform ellipticity) There exists $c>0$ such that, for all $x\in\mathbb{R}^{n}$ and $\xi\in\mathbb{R}^{n}$,

$$\xi^{\top}\sigma\sigma^{\top}(x)\xi\geq c\|\xi\|^{2}.$$

Then, the value function $V^{\sf s}$ is a unique classical solution to the HJB equation (8) with the terminal condition (9).

Proof. By [7, Theorem IV.4.2], the HJB equation (37) with the terminal condition (9) for the $L^{1}$ optimal control problem has a unique bounded classical solution under Assumptions $(a)$, $(b)$, and $(c)$. In other words, the HJB equation (8) with (9) has a unique bounded classical solution. Note that any classical solution of (8) is also a viscosity solution, and that the value function $V^{\sf s}$ is the unique viscosity solution satisfying a polynomial growth condition by Theorem 5. Hence, $V^{\sf s}$ must be the unique classical solution to (8) with (9). $\Box$

Thanks to the above result, we need not consider the non-differentiability of the value function, and we can apply standard numerical methods to solve the HJB equation under Conditions $(a)$, $(b)$, and $(c)$.

In Theorem 5, we have shown the equivalence of the value functions of the $L^{0}$ and $L^{1}$ optimal control problems. Combining this with the discreteness of the $L^{0}$ optimal control, we obtain an equivalence for the optimal controls themselves.

Corollary 3.

Suppose the assumptions in Theorem 4 and let $U_{j}^{-}=-1,U_{j}^{+}=1,\ j=1,\ldots,m$. If an $L^{0}$ optimal control process exists, then it is also an $L^{1}$ optimal control process. Conversely, if an $L^{1}$ optimal control process $\{u_{s}^{1*}\}$ exists, and it holds that

$$|f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s})|\neq 1,\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (38)$$

where $\{x_{s}\}$ is the corresponding optimal state trajectory, then $\{u_{s}^{1*}\}$ is also an $L^{0}$ optimal control process.

Proof. By Theorem 4, each component of an $L^{0}$ optimal control $\{u_{s}^{*}\}$ takes only the three values $\{-1,0,1\}$, and therefore it holds that $|u_{s}^{*(j)}|=|u_{s}^{*(j)}|^{0}$. In addition, the optimal values of (4) and (35) coincide by Theorem 5. This implies that $\{u_{s}^{*}\}$ is an $L^{1}$ optimal control process. Next, by the same arguments as in the proofs of Theorems 3 and 4, a control process $\{u_{s}\}$ is an $L^{1}$ optimal control process if and only if

$$u_{s}^{(j)}\in\begin{cases}\{-1\},&\text{if}\quad b_{j}(s,x_{s})>1,\\ [-1,0],&\text{if}\quad b_{j}(s,x_{s})=1,\\ \{0\},&\text{if}\quad|b_{j}(s,x_{s})|<1,\\ [0,1],&\text{if}\quad b_{j}(s,x_{s})=-1,\\ \{1\},&\text{if}\quad b_{j}(s,x_{s})<-1,\end{cases}\ s\in[t,T],\ \mathbb{P}\text{-a.s.},$$

where $b_{j}(s,x)=f_{j}(x)\cdot D_{x}V^{\sf s}(s,x)=f_{j}(x)\cdot D_{x}V_{1}^{\sf s}(s,x)$. Therefore, if (38) holds, then each component of the $L^{1}$ optimal control also takes only the three values $\{-1,0,1\}$. Then we obtain the desired result as in the first part of the proof. $\Box$

The condition (38) corresponds to the normality of the $L^{1}$ optimal control problem [17].

Remark 4.

Finally, we would like to point out that all the results obtained in this paper can be extended to the case where a continuous state transition cost $\ell:\mathbb{R}^{n}\rightarrow\mathbb{R}$ is added to our cost functional (4), i.e.,

$$J_{\ell}^{\sf s}(t,x,u)\triangleq\mathbb{E}\left[\int_{t}^{T}\left(\ell(x_{s})+\psi_{0}(u_{s})\right)ds+g(x_{T})\right].$$

Indeed, since $\ell$ does not depend on $u$, the only difference in the associated HJB equation is the additional term $\ell(x)$. Moreover, the continuity of $\ell$ can be used to prove the continuity of the corresponding value function. $\lhd$

Example 2 (Revisited).

Throughout the following examples, we fix a reference probability space and consider a state-feedback controller. We explain the result for (1) in more detail. First, we consider the deterministic case, i.e., $\sigma=0$. We can show that a smooth function having the polynomial growth property satisfies the associated HJB equation. By the uniqueness of the viscosity solution, this is the value function; see Theorem 5. Note also that it is possible to apply [11, Theorem 4] without the Lipschitz continuity of $g$ due to the smoothness of $V^{\sf s}$. Therefore, it can be verified that the $L^{0}$ optimal feedback control $u^{*}(s,x)$ is given by

$$u^{*}(s,x)=\begin{cases}-1,&\text{if }\frac{1}{2}e^{-2c(T-s)}<x,\ 0\leq s\leq T,\\ 0,&\text{if }|x|\leq\frac{1}{2}e^{-2c(T-s)},\ 0\leq s\leq T,\\ 1,&\text{if }x<-\frac{1}{2}e^{-2c(T-s)},\ 0\leq s\leq T.\end{cases} \quad (39)$$

This analysis implies that the Lipschitz continuity of $g$ is not necessary for the value function to be differentiable almost everywhere; see Remark 2.
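Since (39) is explicit, it is straightforward to evaluate; the following sketch (our transcription of (39), with the example's constants $c=1$ and $T=1$ as defaults) implements the dead-zone feedback.

```python
import math

# Deterministic L0 optimal feedback (39): a dead-zone law with the
# time-varying threshold (1/2) * exp(-2c(T - s)).

def u_star(s, x, c=1.0, T=1.0):
    threshold = 0.5 * math.exp(-2.0 * c * (T - s))
    if x > threshold:
        return -1
    if x < -threshold:
        return 1
    return 0  # control switched off inside the dead zone

if __name__ == "__main__":
    # At s = T the threshold is 1/2, so the control is off for |x| <= 1/2.
    print(u_star(1.0, 0.4))   # -> 0
    print(u_star(1.0, 0.6))   # -> -1
    print(u_star(0.0, -0.2))  # threshold = e^{-2}/2 ~ 0.068 -> 1
```

Note that the dead zone shrinks as $s$ decreases: early in the horizon, even small deviations are pushed back toward the origin.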

Next, we consider the stochastic case, i.e., $\sigma>0$. The associated HJB equation is

$$\begin{cases}-v_{t}(t,x)-cxD_{x}v(t,x)-\frac{\sigma^{2}}{2}D_{x}^{2}v(t,x)+\alpha(D_{x}v(t,x))=0,&(t,x)\in[0,T)\times\mathbb{R},\\ v(T,x)=x^{2},&x\in\mathbb{R},\end{cases}$$

where

$$\alpha(p)\triangleq\begin{cases}p-1,&\text{if }p\geq 1,\\ 0,&\text{if }|p|<1,\\ -p-1,&\text{if }p\leq-1.\end{cases}$$

We solve the above HJB equation numerically using a finite difference scheme; see [7, 5] for numerical methods to compute the viscosity solution of the HJB equation. We take $c=1$, $\sigma=0.1$, and $T=1$. The switching boundary $\{(s,x):|D_{x}V^{\sf s}(s,x)|=1\}$ is depicted in Fig. 2. For comparison, we also plot the deterministic optimal switching boundary $\{(s,x):|x|=\frac{1}{2}e^{-2c(T-s)}\}$ obtained in (39). As shown in Fig. 2, the region where the stochastic $L^{0}$ optimal control takes the value $0$ is larger than in the deterministic case. This implies that the stochastic $L^{0}$ optimal control is sparser than the deterministic one, at the price of a larger variance of the terminal state.
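As an illustration of such a computation (our own sketch, not necessarily the scheme used for Fig. 2), one can march this HJB equation backward from $v(T,x)=x^2$ on a truncated grid with an explicit finite-difference discretization, using central differences for $D_xv$ and $D_x^2v$ and evaluating $\alpha$ pointwise; the step sizes below are chosen small enough for stability of the explicit scheme, and the boundary values are crudely frozen at their terminal values.

```python
# Explicit backward finite-difference sketch (illustration only) for
#   -v_t - c*x*v_x - (sigma^2/2)*v_xx + alpha(v_x) = 0,  v(T,x) = x^2.

def alpha(p):
    if p >= 1.0:
        return p - 1.0
    if p <= -1.0:
        return -p - 1.0
    return 0.0

def solve_hjb(c=1.0, sigma=0.1, T=1.0, x_max=2.0, nx=81, nt=4000):
    dx = 2.0 * x_max / (nx - 1)
    dt = T / nt
    xs = [-x_max + i * dx for i in range(nx)]
    v = [x * x for x in xs]              # terminal condition v(T, x) = x^2
    for _ in range(nt):                  # march backward in time
        new = v[:]                       # boundary nodes stay frozen
        for i in range(1, nx - 1):
            vx = (v[i + 1] - v[i - 1]) / (2.0 * dx)
            vxx = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / dx ** 2
            # v_t = -c*x*v_x - (sigma^2/2)*v_xx + alpha(v_x)
            vt = -c * xs[i] * vx - 0.5 * sigma ** 2 * vxx + alpha(vx)
            new[i] = v[i] - dt * vt      # step from t to t - dt
        v = new
    return xs, v

if __name__ == "__main__":
    xs, v = solve_hjb()
    mid = len(xs) // 2                   # grid point x = 0
    print(v[mid] < v[0])                 # cheapest states near the origin -> True
```

The switching boundary in Fig. 2 would then be extracted as the level set $|D_xv|=1$ of the computed solution.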

Figure 2: Stochastic optimal switching boundary obtained by the numerical solution V𝗌V^{\sf s} (blue) and the deterministic optimal switching boundary ((39), red).

\lhd

Example 3.

Next, we consider a simplified load frequency control (LFC) model depicted in Fig. 3; see [15, 14] for more details. The physical meanings of xs(1)x_{s}^{(1)} and xs(2)x_{s}^{(2)} are frequency deviation and its compensation by a thermal plant, respectively. The feedback loop with 1/s1/s and the saturation function

satd(x){d,x<d,x,|x|d,d,x>d,x,{\rm sat}_{d}(x)\triangleq\begin{cases}-d,&x<-d,\\ x,&|x|\leq d,\\ d,&x>d,\end{cases}\hskip 14.22636ptx\in{\mathbb{R}}, (40)

represents the rate limiter, where d>0d>0 characterizes the limited responsiveness of the adjustment of the thermal power generation. An extra compensation, which should not be activated for a long time, is denoted by usu_{s}. The dynamics in Fig. 3 are given by

{dxs(1)=(pxs(1)kxs(2))ds+kusds+kσdws,dxs(2)=satd(xs(1)xs(2))ds,\begin{cases}d{{x}}_{s}^{(1)}&=(-p{{x}}_{s}^{(1)}-k{{x}}_{s}^{(2)})ds+ku_{s}ds+k\sigma dw_{s},\\ d{{x}}_{s}^{(2)}&={\rm sat}_{d}({{x}}_{s}^{(1)}-{{x}}_{s}^{(2)})ds,\end{cases} (41)

where p>0,k>0,σ>0p>0,k>0,\sigma>0. We take p=1/3,k=2,σ=0.5,𝕌=[1,1],T=0.5p=1/3,k=2,\sigma=0.5,{\mathbb{U}}=[-1,1],T=0.5, and g(x)=x2g(x)=\|x\|^{2}. Based on the equivalence result in Theorem 5, we employ the sampling-based method proposed in [5] with radial basis functions to solve the associated HJB equation. Figure 4 compares the obtained switching boundaries at time s=0s=0, i.e., {x:k|(DxV𝗌)(1)(0,x)|=1}\{x:k|(D_{x}V^{\sf s})^{(1)}(0,{{x}})|=1\}, for d=0.4d=0.4 and the linear case (d=+)(d=+\infty). To interpret the result, consider the case x(1)>0{{x}}^{(1)}>0 and x(2)0{{x}}^{(2)}\simeq 0. In this case, x(2)x^{(2)} is expected to increase in order to suppress x(1)x^{(1)}. When the rate limiter prevents the quick adjustment of x(2)x^{(2)}, we need to activate usu_{s}. This is why the region on which the optimal control takes the value 0 is larger for d=+d=+\infty than for d=0.4d=0.4. A similar interpretation applies to the case with x(1)0x^{(1)}\simeq 0 and x(2)>0x^{(2)}>0. \lhd
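As a complement, the closed-loop dynamics (41) can be simulated by the Euler–Maruyama method. The sketch below uses the example's parameters, but the feedback law, initial state, and activation threshold are hypothetical placeholders: the true L0L^{0}-optimal bang-off-bang law requires the numerically computed value function V𝗌V^{\sf s}.

```python
import numpy as np

# Euler-Maruyama simulation of the LFC dynamics (41). The threshold
# feedback below only mimics the bang-off-bang structure; the actual
# optimal law depends on the numerical value function (not computed here).

p, k, sigma, d = 1/3, 2.0, 0.5, 0.4
T, n_steps = 0.5, 500
dt = T / n_steps
rng = np.random.default_rng(0)        # fixed seed for reproducibility

def sat(z, d):
    # rate limiter (40)
    return np.clip(z, -d, d)

def simulate(feedback, x0=(1.0, 0.0)):
    x1, x2 = x0
    for _ in range(n_steps):
        u = feedback(x1, x2)          # u in U = [-1, 1]
        dw = rng.normal(0.0, np.sqrt(dt))
        x1, x2 = (x1 + (-p*x1 - k*x2 + k*u)*dt + k*sigma*dw,
                  x2 + sat(x1 - x2, d)*dt)
    return x1, x2

# hypothetical threshold feedback: act only when the frequency deviation
# is large (the threshold 0.5 is an assumption made for illustration)
bang_off_bang = lambda x1, x2: -np.sign(x1) if abs(x1) > 0.5 else 0.0
x1T, x2T = simulate(bang_off_bang)
```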

Figure 3: Block diagram of the load frequency control system.


Figure 4: Optimal switching boundaries at time s=0s=0 for d=0.4d=0.4 (blue) and the linear case (red), and the optimal control value u0u_{0}^{*}.

6 Conclusions

We have investigated a finite horizon stochastic optimal control problem with the L0L^{0} control cost functional. We have characterized the value function as a viscosity solution to the associated HJB equation and shown an equivalence theorem between the L0L^{0} optimality and the L1L^{1} optimality via the uniqueness of a viscosity solution. Thanks to this equivalence, we have ensured that the value function is a classical solution of the associated HJB equation under some conditions. Moreover, we have derived a necessary and sufficient condition for the L0L^{0} optimality that connects the current state and the current optimal control value. Furthermore, we have revealed the discreteness property of the sparse optimal stochastic control for control-affine systems.

{ack}

This work was supported in part by JSPS KAKENHI under Grant Number JP18H01461 and by JST, ACT-X under Grant Number JPMJAX1902.


Appendix A Moment estimate for the state

Here, we introduce an estimate for the pp-th order moment of the state governed by the stochastic system (3) [19, Theorem 1.2].

Lemma 4.

Fix T>0T>0. Assume (A1)(A_{1}) and let p2p\geq 2 be given. Then there exists a positive constant KpK_{p} such that, for any (t,x)[0,T]×n(t,x)\in[0,T]\times{\mathbb{R}}^{n},

𝔼[suptsTxst,x,up]Kp(1+xp),u𝒰𝗌[t,T].{\mathbb{E}}\left[\sup_{t\leq s\leq T}\|{{x}}_{s}^{t,x,u}\|^{p}\right]\leq K_{p}(1+\|x\|^{p}),\ \forall u\in{\mathcal{U}}^{\sf s}[t,T]. (42)

\lhd

By applying Hölder’s inequality, we obtain the estimate for the first-order moment; that is, (42) also holds for p=1p=1 (with a possibly different constant).
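In more detail, the case p=1p=1 follows from (42) with p=2p=2 and the Cauchy–Schwarz (Hölder) inequality:

```latex
\mathbb{E}\Bigl[\sup_{t\le s\le T}\|x_s^{t,x,u}\|\Bigr]
\le \Bigl(\mathbb{E}\Bigl[\sup_{t\le s\le T}\|x_s^{t,x,u}\|^{2}\Bigr]\Bigr)^{1/2}
\le \bigl(K_2(1+\|x\|^{2})\bigr)^{1/2}
\le \sqrt{K_2}\,(1+\|x\|),
```

where the last inequality uses (1+a^2)^{1/2} \le 1+a for a \ge 0.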

The estimate (42) implies 𝔼[xTt,x,up]<+{\mathbb{E}}[\|{{x}}_{T}^{t,x,u}\|^{p}]<+\infty for any p2p\geq 2. Note that

𝔼[tTψ0(us)𝑑s]m(Tt).{\mathbb{E}}\left[\int_{t}^{T}\psi_{0}(u_{s})ds\right]\leq m(T-t).

Hence, the growth condition in (A2)(A_{2}) ensures that the cost functional J𝗌(t,x,u)J^{\sf s}(t,x,u) has a finite value for any (t,x,u)[0,T]×n×𝒰𝗌[t,T](t,x,u)\in[0,T]\times{\mathbb{R}}^{n}\times{\mathcal{U}}^{\sf s}[t,T].

Appendix B Continuity of H𝗌H^{\sf s}

Lemma 5.

If ff and σ\sigma satisfy (5), then H𝗌H^{\sf s} defined by (10) is continuous on n×n×𝒮n{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}.

{pf}

Fix ε>0\varepsilon>0 and (x,p,M)n×n×𝒮n(x,p,M)\in{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}. By definition of H𝗌H^{\sf s}, there exists u¯𝕌\bar{u}\in{\mathbb{U}} such that

H𝗌(x,p,M)ε<f(x,u¯)p12tr(σσ(x,u¯)M)ψ0(u¯).H^{\sf s}(x,p,M)-\varepsilon<-f(x,\bar{u})\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,\bar{u})M)-\psi_{0}(\bar{u}). (43)

Therefore, for any (y,q,N)n×n×𝒮n(y,q,N)\in{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n},

H𝗌(x,p,M)H𝗌(y,q,N)\displaystyle H^{\sf s}(x,p,M)-H^{\sf s}(y,q,N)
f(x,u¯)p12tr(σσ(x,u¯)M)ψ0(u¯)+ε\displaystyle\leq-f(x,\bar{u})\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,\bar{u})M)-\psi_{0}(\bar{u})+\varepsilon
+f(y,u¯)q+12tr(σσ(y,u¯)N)+ψ0(u¯)\displaystyle\quad+f(y,\bar{u})\cdot q+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N)+\psi_{0}(\bar{u})
=f(y,u¯)qf(x,u¯)p\displaystyle=f(y,\bar{u})\cdot q-f(x,\bar{u})\cdot p
+12tr(σσ(y,u¯)Nσσ(x,u¯)M)+ε.\displaystyle\quad+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N-\sigma\sigma^{\top}(x,\bar{u})M)+\varepsilon.

Note that ff and σ\sigma are continuous by (5), and thus there exists δ>0\delta>0 such that, for any (y,q,N)B((x,p,M),δ)(y,q,N)\in B((x,p,M),\delta),

f(y,u¯)qf(x,u¯)p+12tr(σσ(y,u¯)Nσσ(x,u¯)M)<ε.f(y,\bar{u})\cdot q-f(x,\bar{u})\cdot p+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N-\sigma\sigma^{\top}(x,\bar{u})M)<\varepsilon.

Hence, for any (y,q,N)B((x,p,M),δ)(y,q,N)\in B((x,p,M),\delta),

H𝗌(x,p,M)H𝗌(y,q,N)<2ε.H^{\sf s}(x,p,M)-H^{\sf s}(y,q,N)<2\varepsilon.

Similarly, H𝗌(y,q,N)H𝗌(x,p,M)<2εH^{\sf s}(y,q,N)-H^{\sf s}(x,p,M)<2\varepsilon also holds. This shows the continuity of H𝗌H^{\sf s}. \Box

Appendix C Viscosity solution

Here, we briefly introduce the notion of a viscosity solution [19]. Let H:n×n×𝒮nH:{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}\to{\mathbb{R}} be a continuous function that satisfies the following condition:

H(x,p,M)H(x,p,N),ifMN𝒮+n.H(x,p,M)\leq H(x,p,N),\ \textrm{if}\ M-N\in{\mathcal{S}}_{+}^{n}. (44)

Consider a second-order partial differential equation

{vt(t,x)+H(x,Dxv(t,x),Dx2v(t,x))=0,(t,x)[0,T)×n,v(T,x)=g(x),xn.\displaystyle\begin{cases}-v_{t}({t,x})+H(x,D_{x}v({t,x}),D_{x}^{2}v({t,x}))=0,\\ \hskip 85.35826pt({t,x})\in[0,T)\times{\mathbb{R}}^{n},\\ v(T,x)=g(x),\ x\in{\mathbb{R}}^{n}.\end{cases} (45)

A function vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) is said to be a viscosity subsolution of (45) if

v(T,x)g(x),xnv(T,x)\leq g(x),\ \forall x\in{\mathbb{R}}^{n}

and, for any ϕC1,2([0,T)×n)C([0,T]×n)\phi\in C^{1,2}([0,T)\times{\mathbb{R}}^{n})\cap C([0,T]\times{\mathbb{R}}^{n}),

ϕt(t0,x0)+H(x0,Dxϕ(t0,x0),Dx2ϕ(t0,x0))0-\phi_{t}(t_{0},x_{0})+H(x_{0},D_{x}\phi(t_{0},x_{0}),D_{x}^{2}\phi(t_{0},x_{0}))\leq 0 (46)

at any global maximum point (t0,x0)[0,T)×n(t_{0},x_{0})\in[0,T)\times{\mathbb{R}}^{n} of vϕv-\phi. Similarly, a function vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) is said to be a viscosity supersolution of (45) if

v(T,x)g(x),xnv(T,x)\geq g(x),\ \forall x\in{\mathbb{R}}^{n}

and, for any ϕC1,2([0,T)×n)C([0,T]×n)\phi\in C^{1,2}([0,T)\times{\mathbb{R}}^{n})\cap C([0,T]\times{\mathbb{R}}^{n}),

ϕt(t0,x0)+H(x0,Dxϕ(t0,x0),Dx2ϕ(t0,x0))0-\phi_{t}(t_{0},x_{0})+H(x_{0},D_{x}\phi(t_{0},x_{0}),D_{x}^{2}\phi(t_{0},x_{0}))\geq 0 (47)

at any global minimum point (t0,x0)[0,T)×n(t_{0},x_{0})\in[0,T)\times{\mathbb{R}}^{n} of vϕv-\phi. Finally, vv is said to be a viscosity solution of (45) if it is both a viscosity subsolution and a viscosity supersolution.
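A standard one-dimensional illustration of these definitions (a stationary example, not taken from this paper) shows how the sub- and supersolution tests single out one non-smooth candidate. Consider the eikonal equation

```latex
|v'(x)| - 1 = 0 \quad \text{on } (-1,1), \qquad v(-1) = v(1) = 0.
```

The function v(x) = 1 - |x| is a viscosity solution: any smooth \phi touching vv from above at the kink x=0 has \phi'(0) \in [-1,1], so the subsolution inequality |\phi'(0)| - 1 \le 0 holds, while no smooth \phi touches vv from below at x=0, so the supersolution test holds vacuously there. In contrast, v(x) = |x| - 1 is not a viscosity solution: the constant \phi \equiv -1 touches it from below at x=0 with \phi'(0) = 0, violating |\phi'(0)| - 1 \ge 0.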

Next, we define the second-order right parabolic superdifferential and subdifferential, which are used in Lemma 3 and Theorem 3. For vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) with T>0T>0, the second-order right parabolic superdifferential of vv at (t,x)[0,T)×n({t,x})\in[0,T)\times{\mathbb{R}}^{n} is defined by

Dt+,x1,2,+v(t,x){(q,p,M)×n×𝒮n:\displaystyle D_{t+,x}^{1,2,+}v({t,x})\triangleq\Bigl{\{}(q,p,M)\in{\mathbb{R}}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}:
lim supst,s[0,T)yx1|st|+yx2(v(s,y)v(t,x)\displaystyle\quad\limsup_{\begin{subarray}{c}s\searrow t,s\in[0,T)\\ y\rightarrow x\end{subarray}}\frac{1}{|s-t|+\|y-x\|^{2}}\bigl{(}v(s,y)-v(t,x)
q(st)p(yx)12(yx)M(yx))0}.\displaystyle\quad-q(s-t)-p\cdot(y-x)-\frac{1}{2}(y-x)^{\top}M(y-x)\bigr{)}\leq 0\Bigr{\}}.

Similarly, the second-order right parabolic subdifferential of vv at (t,x)[0,T)×n({t,x})\in[0,T)\times{\mathbb{R}}^{n} is defined by

Dt+,x1,2,v(t,x){(q,p,M)×n×𝒮n:\displaystyle D_{t+,x}^{1,2,-}v({t,x})\triangleq\Bigl{\{}(q,p,M)\in{\mathbb{R}}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}:
lim infst,s[0,T)yx1|st|+yx2(v(s,y)v(t,x)\displaystyle\quad\liminf_{\begin{subarray}{c}s\searrow t,s\in[0,T)\\ y\rightarrow x\end{subarray}}\frac{1}{|s-t|+\|y-x\|^{2}}\bigl{(}v(s,y)-v(t,x)
q(st)p(yx)12(yx)M(yx))0}.\displaystyle\quad-q(s-t)-p\cdot(y-x)-\frac{1}{2}(y-x)^{\top}M(y-x)\bigr{)}\geq 0\Bigr{\}}.

If vv admits vt,Dxvv_{t},D_{x}v, and Dx2vD_{x}^{2}v at (t0,x0)(0,T)×n(t_{0},x_{0})\in(0,T)\times{\mathbb{R}}^{n}, it holds that

(vt(t0,x0),Dxv(t0,x0),Dx2v(t0,x0))\displaystyle\left(v_{t}(t_{0},x_{0}),D_{x}v(t_{0},x_{0}),D_{x}^{2}v(t_{0},x_{0})\right)
Dt+,x1,2,+v(t0,x0)Dt+,x1,2,v(t0,x0).\displaystyle\qquad\in D_{t+,x}^{1,2,+}v(t_{0},x_{0})\cap D_{t+,x}^{1,2,-}v(t_{0},x_{0}). (48)

References

  • [1] Walter Alt and Christopher Schneider. Linear-quadratic control problems with L1{L}^{1}-control cost. Optimal Control Applications and Methods, 36(4):512–534, 2015.
  • [2] Michael Athans and Peter L Falb. Optimal Control: An Introduction to the Theory and Its Applications. Dover Publications, 1966.
  • [3] Nicolas Bouleau and Francis Hirsch. Dirichlet Forms and Analysis on Wiener Space, volume 14. Walter de Gruyter, 2010.
  • [4] David L Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.
  • [5] Ioannis Exarchos, Evangelos A Theodorou, and Panagiotis Tsiotras. Stochastic L1L^{1}-optimal control via forward and backward sampling. Systems & Control Letters, 118:101–108, 2018.
  • [6] Giorgio Fabbri, Fausto Gozzi, and Andrzej Swiech. Stochastic Optimal Control in Infinite Dimension. Springer, 2017.
  • [7] Wendell H Fleming and Halil Mete Soner. Controlled Markov Processes and Viscosity Solutions, volume 25. Springer Science & Business Media, 2006.
  • [8] Roland Herzog, Georg Stadler, and Gerd Wachsmuth. Directional sparsity in optimal control of partial differential equations. SIAM Journal on Control and Optimization, 50(2):943–963, 2012.
  • [9] Takuya Ikeda and Kenji Kashima. Sparsity-constrained controllability maximization with application to time-varying control node selection. IEEE Control Systems Letters, 2(3):321–326, 2018.
  • [10] Takuya Ikeda and Kenji Kashima. On sparse optimal control for general linear systems. IEEE Transactions on Automatic Control, 64(5):2077–2083, 2019.
  • [11] Takuya Ikeda and Kenji Kashima. Sparse optimal feedback control for continuous-time systems. 2019 European Control Conference (ECC), pages 3728–3733, 2019.
  • [12] Takuya Ikeda, Masaaki Nagahara, and Shunsuke Ono. Discrete-valued control of linear time-invariant systems by sum-of-absolute-values optimization. IEEE Transactions on Automatic Control, 62(6):2750–2763, 2016.
  • [13] Kaito Ito, Takuya Ikeda, and Kenji Kashima. Continuity of the value function for stochastic sparse optimal control. IFAC-PapersOnLine, 53(2):7179–7184, 2020.
  • [14] Kenji Kashima, Hiroki Aoyama, and Yoshito Ohta. Stable process approach to analysis of systems under heavy-tailed noise: Modeling and stochastic linearization. IEEE Transactions on Automatic Control, 64(4):1344–1357, 2019.
  • [15] Kenji Kashima, Masakazu Kato, Jun-ichi Imura, and Kazuyuki Aihara. Probabilistic evaluation of interconnectable capacity for wind power generation: Stochastic linearization approach. European Physical Journal: Special Topics, 223(12):2493–2501, 2014.
  • [16] Karl Kunisch, Konstantin Pieper, and Boris Vexler. Measure valued directional sparsity for parabolic optimal control problems. SIAM Journal on Control and Optimization, 52(5):3078–3108, 2014.
  • [17] Masaaki Nagahara, Daniel E Quevedo, and Dragan Nešić. Maximum hands-off control: a paradigm of control effort minimization. IEEE Transactions on Automatic Control, 61(3):735–747, 2016.
  • [18] Masaaki Nagahara, Daniel E Quevedo, and Jan Østergaard. Sparse packetized predictive control for networked control over erasure channels. IEEE Transactions on Automatic Control, 59(7):1899–1905, 2014.
  • [19] Makiko Nisio. Stochastic Control Theory: Dynamic Programming Principle, volume 72. Springer, 2014.
  • [20] Alex Olshevsky. On a relaxation of time-varying actuator placement. IEEE Control Systems Letters, 4(3):656–661, 2020.
  • [21] Huyên Pham. Continuous-time Stochastic Control and Optimization with Financial Applications, volume 61. Springer Science & Business Media, 2009.
  • [22] Georg Stadler. Elliptic optimal control problems with L1L^{1}-control cost and applications for the placement of control devices. Computational Optimization and Applications, 44(2):159–181, 2009.
  • [23] Georg Vossen and Helmut Maurer. On L1{L}^{1}-minimization in optimal control and applications to robotics. Optimal Control Applications and Methods, 27(6):301–321, 2006.
  • [24] Jiongmin Yong and Xun Yu Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations, volume 43. Springer Science & Business Media, 1999.