
Sparse Optimal Stochastic Control

Kaito Ito [email protected], Takuya Ikeda [email protected], Kenji Kashima [email protected]
Graduate School of Informatics, Kyoto University, Kyoto, Japan; Faculty of Environmental Engineering, The University of Kitakyushu, Kitakyushu, Japan
Abstract

In this paper, we investigate sparse optimal control of continuous-time stochastic systems. We adopt the dynamic programming approach and analyze the optimal control via the value function. Due to the non-smoothness of the $L^0$ cost functional, in general, the value function is not differentiable on its domain. We therefore characterize the value function as a viscosity solution to the associated Hamilton-Jacobi-Bellman (HJB) equation. Based on this result, we derive a necessary and sufficient condition for $L^0$ optimality, which immediately gives the optimal feedback map. Especially for control-affine systems, we consider the relationship with the $L^1$ optimal control problem and show an equivalence theorem.

keywords:
sparsity, non-smooth optimal control, bang-off-bang control, dynamic programming, viscosity solution
thanks: This paper was not presented at any IFAC meeting. Corresponding author K. Kashima. Tel. +81-75-753-5512.
© 2021. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/


1 Introduction

This work investigates an optimal control problem for non-linear stochastic systems with an $L^0$ control cost. This cost functional penalizes the measure of the support of the control variable, and optimization based on this criterion tends to make the control input identically zero on a set of positive measure. Consequently, the optimal control is switched off completely on parts of the time domain. Hence, this type of control is also referred to as sparse optimal control. For example, this optimal control framework has been applied to actuator placement [22, 8, 16], networked control systems [18, 9, 20], and discrete-valued control [12], to name a few. Sparse optimal control involves a discontinuous and non-convex cost functional. To deal with this difficulty of analysis, relaxed problems with $L^p$ cost functionals have often been investigated, akin to methods used in compressed sensing [4].

Literature review:  For deterministic control-affine systems, the $L^1$ cost functional is analyzed with the aim of relating $L^0$ optimality and $L^1$ optimality, and an equivalence theorem is derived in [17]. In [10], the result is extended to deterministic general linear systems, including infinite-dimensional ones. The $L^1$ control cost is also considered in [1, 23, 2]. In [10], the sparsity properties of optimal controls for the $L^p$ cost with $p\in(0,1)$ are discussed. The authors investigated this problem from a dynamic programming viewpoint [11]. When it comes to stochastic systems, [5] deals with a finite-horizon optimal control problem with the $L^1$ cost functional and proposes a sampling-based algorithm that solves it via forward and backward stochastic differential equations. However, it is not obvious that $L^1$ optimal control achieves the desired sparsity. To the best of the authors’ knowledge, our preliminary work [13] (Theorem 1 below) on the continuity of the value function is the only theoretical result on $L^0$ optimal control of stochastic systems.

Contribution:  The goal of this work is to obtain the sparse optimal feedback map (Theorem 4), under which the optimal control input has the bang-off-bang property, and to reveal the equivalence between $L^0$ optimality and $L^1$ optimality for control-affine stochastic systems (Theorem 5). To this end, we utilize dynamic programming. In the present paper, we first characterize our value function as a viscosity solution to the Hamilton-Jacobi-Bellman (HJB) equation [7, 24]. Based on this result, we show a necessary and sufficient condition for $L^0$ optimality (Theorem 3), which immediately gives an optimal feedback map. In addition, a sufficient condition for the value function to be a classical solution to the HJB equation (i.e., a solution that satisfies the HJB equation in the usual sense) is given via the equivalence, while, in the deterministic case, we cannot in general ensure the differentiability of the value function.

In the stochastic case, the HJB equation becomes a second-order equation, in contrast to the deterministic case, and hence the results for deterministic systems [11] cannot be directly applied. Indeed, several difficulties arise due to the stochasticity. For example, the analysis of deterministic $L^0$ optimality in [11] relies heavily on the local Lipschitz continuity of the value function, which implies almost everywhere differentiability. On the other hand, the value function for stochastic $L^0$ optimal control is at most locally $1/2$-Hölder continuous [13]. This calls for a quite different approach. Even in the problem formulation, we must be careful about the probability space we work on in order to correctly apply the dynamic programming principle.

In order to demonstrate the practical usefulness of our theoretical results, an example is exhibited; see Example 2 for more details.

Example 1.

Consider the following stochastic system:

$$dx_s = cx_s\,ds + u_s\,ds + \sigma\,dw_s, \quad 0\le s\le T \quad (1)$$

where $\{x_s\}$ is a real-valued state process, $\{u_s\}$ is a control process, and $\{w_s\}$ is a Wiener process. We take $c=1$, $\sigma=0.1$, $T=1$, and $x_0=0.5$. Then, the black lines in Fig. 1 show sample paths of the optimal control input and the corresponding state trajectories that minimize $\mathbb{E}\left[\int_0^1 |u_s|^2\,ds + x_1^2\right]$. It is well known that this minimum-energy control is given by linear state feedback, and hence it takes non-zero values almost everywhere. On the contrary, our problem can deal with the sparse optimal control that minimizes $\mathbb{E}\left[\int_0^1 |u_s|^0\,ds + x_1^2\right]$ under the constraint $|u_s|\le 1$, where $0^0=0$. The first term represents the length of time during which the control takes non-zero values. Theorem 4 reveals that the optimal control input takes only the three values $\{-1,0,1\}$, and enables us to numerically compute the state-feedback map from $x_s$ to $u_s\in\{-1,0,1\}$. The colored lines show the result of $L^0$ optimal control, whose input trajectories are sparse while the variance of the state remains small. Note that the purple dotted lines show the boundary of the bang-off-bang regions. $\lhd$
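A simulation in the spirit of Example 1 takes only a few lines of Euler–Maruyama integration. The threshold feedback below is a hypothetical stand-in: the constant `thr` is our own illustrative choice, whereas the true switching boundary in Fig. 1 has to be computed numerically from the HJB equation discussed in Section 4.

```python
import numpy as np

# Euler-Maruyama simulation of dx = (c*x + u) ds + sigma dw from Example 1.
# u_l0 is a hypothetical bang-off-bang feedback with an illustrative threshold,
# not the optimal switching boundary of the paper.
rng = np.random.default_rng(0)
c, sigma, T, x0 = 1.0, 0.1, 1.0, 0.5
n_steps = 1000
dt = T / n_steps

def u_l0(x, thr=0.3):
    # stay off inside the band |x| <= thr, else push at full authority
    # toward the origin (control constrained to {-1, 0, 1})
    return 0.0 if abs(x) <= thr else -np.sign(x)

x = x0
l0_cost = 0.0  # accumulates the time the input is non-zero (the L0 term)
for _ in range(n_steps):
    u = u_l0(x)
    l0_cost += (u != 0.0) * dt
    x += (c * x + u) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

print(f"terminal state {x:.3f}, time with u != 0: {l0_cost:.3f} of {T}")
```

Replacing `u_l0` with a linear state feedback reproduces the qualitative behavior of the black ($L^2$ optimal) curves, which are non-zero almost everywhere.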

Figure 1: The colored lines except black are the sample paths of the $L^0$ optimal state process (top, solid) and control process (bottom, solid), and the switching boundary (top, dotted). The same color indicates the correspondence between the sample paths of the state process and the control process. The black lines are the sample paths of the $L^2$ optimal state process (top) and control process (bottom).

Organization:  The remainder of this paper is organized as follows. In Section 2, we give mathematical preliminaries for the subsequent discussion. In Section 3, we describe the system model and formulate the sparse optimal control problem for stochastic systems. Section 4 is devoted to the general analysis of stochastic optimal control with the discontinuous $L^0$ cost. We first characterize the value function as a viscosity solution to the associated HJB equation, and then show a necessary and sufficient condition for $L^0$ optimality. Section 5 characterizes the sparse optimal stochastic control. We show the relationship with the $L^1$ optimization problem and some basic properties of the sparse optimal stochastic control for control-affine systems with box constraints. In Section 6 we offer concluding remarks.

2 Mathematical preliminaries

This section reviews notation that will be used throughout the paper.

Let $N$, $N_1$, and $N_2$ be positive integers. For a matrix $M\in\mathbb{R}^{N_1\times N_2}$, $M^\top$ denotes the transpose of $M$. For a matrix $M\in\mathbb{R}^{N\times N}$, $\mathrm{tr}(M)$ denotes the trace of $M$. Denote by $\mathcal{S}^N$ the set of all symmetric $N\times N$ matrices and by $\mathcal{S}_+^N$ the set of all positive semidefinite matrices. Denote the Frobenius norm of $M\in\mathbb{R}^{N_1\times N_2}$ by $\|M\|$, i.e., $\|M\|\triangleq\sqrt{\mathrm{tr}(M^\top M)}$. For a vector $a=[a^{(1)},a^{(2)},\dots,a^{(N)}]^\top\in\mathbb{R}^N$, we denote the Euclidean norm by $\|a\|\triangleq(\sum_{i=1}^N (a^{(i)})^2)^{1/2}$ and the open ball with center $a$ and radius $r>0$ by $B(a,r)$, i.e., $B(a,r)\triangleq\{x\in\mathbb{R}^N:\|x-a\|<r\}$. We denote the inner product of $a\in\mathbb{R}^N$ and $b\in\mathbb{R}^N$ by $a\cdot b$.

For $p\in\{0,1\}$ and a continuous-time signal $u_s=[u_s^{(1)},u_s^{(2)},\dots,u_s^{(N)}]^\top\in\mathbb{R}^N$ over a time interval $[t,T]$, the $L^p$ norm of $u=\{u_s\}_{t\le s\le T}$ is defined by

$$\|u\|_0 \triangleq \sum_{j=1}^N \mu_L(\{s\in[t,T] : u_s^{(j)}\neq 0\}),$$
$$\|u\|_1 \triangleq \sum_{j=1}^N \int_t^T |u_s^{(j)}|\,ds,$$

with the Lebesgue measure $\mu_L$ on $\mathbb{R}$. The $L^0$ norm is also expressed as $\|u\|_0 = \int_t^T \psi_0(u_s)\,ds$, where $\psi_0:\mathbb{R}^N\to\mathbb{R}$ is the function that returns the number of non-zero components, i.e.,

$$\psi_0(a) \triangleq \sum_{j=1}^N |a^{(j)}|^0, \quad a\in\mathbb{R}^N,$$

with the convention $0^0=0$.
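In discrete time, these quantities have straightforward Riemann-sum approximations on a uniform grid. The helper names below are our own, for illustration only:

```python
import numpy as np

# Discrete-time approximations of the L0 and L1 norms defined above, for a
# sampled signal u of shape (n_samples, N) on a uniform grid with step dt.
def psi0(a, tol=0.0):
    # number of non-zero components of a vector a (with the convention 0^0 = 0)
    return int(np.sum(np.abs(np.asarray(a)) > tol))

def l0_norm(u, dt):
    # sum over components of the measure of {s : u_s^{(j)} != 0}
    return float(np.sum(u != 0.0) * dt)

def l1_norm(u, dt):
    # sum over components of the integral of |u_s^{(j)}|
    return float(np.sum(np.abs(u)) * dt)

u = np.array([[0.0, 1.0], [0.0, 0.0], [-2.0, 0.5]])  # 3 samples, N = 2
dt = 0.1
print(l0_norm(u, dt), l1_norm(u, dt), psi0(u[2]))  # → 0.3 0.35 2
```

Note that `l0_norm` is exactly the discretization of $\int_t^T \psi_0(u_s)\,ds$, consistent with the identity above.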

For a given set $\Omega\subset\mathbb{R}^N$, $C(\Omega)$ denotes the set of all continuous functions on $\Omega$. For $T>0$, $C^{1,2}((0,T)\times\mathbb{R}^N)$ denotes the set of all functions $\phi$ on $(0,T)\times\mathbb{R}^N$ whose partial derivatives $\frac{\partial\phi}{\partial s}, \frac{\partial\phi}{\partial x^{(i)}}, \frac{\partial^2\phi}{\partial x^{(i)}\partial x^{(j)}}$, $i,j=1,\dots,N$, exist and are continuous on $(0,T)\times\mathbb{R}^N$. Denote by $C^{1,2}([0,T)\times\mathbb{R}^N)$ the set of all $\phi\in C^{1,2}((0,T)\times\mathbb{R}^N)\cap C([0,T)\times\mathbb{R}^N)$ such that $\frac{\partial\phi}{\partial s}, \frac{\partial\phi}{\partial x^{(i)}}, \frac{\partial^2\phi}{\partial x^{(i)}\partial x^{(j)}}$, $i,j=1,\dots,N$, can be extended to continuous functions on $[0,T)\times\mathbb{R}^N$. For $\phi\in C^{1,2}([0,T)\times\mathbb{R}^N)$, $\phi_t$ denotes the partial derivative with respect to the first variable, $D_x\phi$ denotes the gradient with respect to the last $N$ variables, and $D_x^2\phi$ denotes the Hessian matrix with respect to the last $N$ variables. For $p\ge 2$, denote by $C_p^{1,2}([0,T]\times\mathbb{R}^N)$ the set of all $\phi\in C^{1,2}([0,T)\times\mathbb{R}^N)\cap C([0,T]\times\mathbb{R}^N)$ satisfying

$$\|\rho(t,x)\| \le K(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^N \quad (2)$$

for some constant $K>0$ and every $\rho\in\{\phi, D_x\phi, D_x^2\phi, \phi_t\}$. A function $\rho:[0,T]\times\mathbb{R}^N\to\mathbb{R}$ is said to satisfy a polynomial growth condition, or to be at most polynomially growing, if there exist constants $K>0$ and $p\ge 2$ such that (2) holds.

Let $\alpha\in(0,1]$. A function $f:\mathbb{R}^{N_1}\to\mathbb{R}^{N_2}$ is called $\alpha$-Hölder continuous if there exists a constant $L>0$ such that $\|f(x)-f(y)\|\le L\|x-y\|^\alpha$ for all $x,y\in\mathbb{R}^{N_1}$. Especially when $\alpha=1$, $f$ is called Lipschitz continuous. A function $f$ is called locally $\alpha$-Hölder continuous if for any $x\in\mathbb{R}^{N_1}$ there exists a neighborhood $U_x$ of $x$ such that $f$ restricted to $U_x$ is $\alpha$-Hölder continuous.

The notation $o(s)$ denotes a real-valued function $f$ defined on some subset of $\mathbb{R}$ such that $\lim_{s\to 0} f(s)/s = 0$.

For $0\le t\le T$, let $(\Omega,\mathcal{F},\{\mathcal{F}_s\}_{s\ge t},\mathbb{P})$ be a filtered probability space, and let $\mathbb{E}$ be the expectation with respect to $\mathbb{P}$. For $S=\mathbb{R}^N$ or $\mathcal{S}^N$, denote by $\mathcal{L}_{\mathcal{F}}^2(t,T;S)$ the set of all $\{\mathcal{F}_s\}_{s\ge t}$-adapted $S$-valued processes $\{X_s\}_{s\ge t}$ such that $\mathbb{E}\left[\int_t^T \|X_s\|^2\,ds\right]<+\infty$. In what follows, we omit the subscript of stochastic processes when no confusion occurs, e.g., $\{X_s\}=\{X_s\}_{s\ge t}$.

3 Problem formulation

This paper considers sparse optimal control for stochastic systems. This section provides the system description and formulates the main problem.

We consider the following stochastic system, whose state is governed by a stochastic differential equation valued in $\mathbb{R}^n$:

$$dx_s = f(x_s,u_s)\,ds + \sigma(x_s,u_s)\,dw_s, \quad s>t, \qquad x_t = x. \quad (3)$$

The initial value $x\in\mathbb{R}^n$ is deterministic, and $\{w_s\}$ is a $d$-dimensional Wiener process. The range of the control $\mathbb{U}\subset\mathbb{R}^m$ is a compact set that contains $0\in\mathbb{R}^m$, and we fix a finite horizon $0<T<\infty$.

We are interested in the optimal control that minimizes the cost functional

$$J^{\sf s}(t,x,u) \triangleq \mathbb{E}\left[\int_t^T \psi_0(u_s)\,ds + g(x_T)\right]. \quad (4)$$

We assume the following conditions on the functions $f$, $\sigma$, and $g$:

  1. $(A_1)$

    The functions $f$ and $\sigma$ are globally Lipschitz, namely, there exist positive constants $L$, $\bar{M}$ and a nondecreasing function $\bar{m}\in C([0,+\infty))$ such that $f:\mathbb{R}^n\times\mathbb{U}\to\mathbb{R}^n$ and $\sigma:\mathbb{R}^n\times\mathbb{U}\to\mathbb{R}^{n\times d}$ satisfy

    $$\|f(x,u)-f(y,v)\| + \|\sigma(x,u)-\sigma(y,v)\| \le L\|x-y\| + \bar{m}(\|u-v\|) \quad (5)$$

    for all $x,y\in\mathbb{R}^n$, $u,v\in\mathbb{U}$, where $\bar{m}(\cdot)\le\bar{M}$ and $\bar{m}(0)=0$;

  2. $(A_2)$

    There exist constants $\hat{C}>0$ and $p\ge 2$ such that $g:\mathbb{R}^n\to\mathbb{R}$ satisfies the growth condition

    $$|g(x)| \le \hat{C}(1+\|x\|^p) \quad (6)$$

    for all $x\in\mathbb{R}^n$;

  3. $(A_3)$

    $g:\mathbb{R}^n\to\mathbb{R}$ is continuous.

Given a probability space with the filtration $\{\mathcal{F}_s\}_{s\ge t}$ generated by a Wiener process, Assumption $(A_1)$ ensures the existence and uniqueness of a strong solution to the stochastic differential equation (3) for any initial condition $x_t=x$, $(t,x)\in[0,T]\times\mathbb{R}^n$, and any $\{\mathcal{F}_s\}_{s\ge t}$-progressively measurable, $\mathbb{U}$-valued control process $\{u_s\}$. In addition, under assumptions $(A_1)$ and $(A_2)$, the cost functional $J^{\sf s}(t,x,u)$ is finite; see Appendix A. Assumption $(A_3)$ is introduced to show the continuity of the value function defined later in (7).

For our analysis, we utilize the method of dynamic programming. In order to establish the dynamic programming principle (Lemma 1), we need to consider a family of optimal control problems with different initial times and states $(t,x)\in[0,T]\times\mathbb{R}^n$ along a state trajectory. Let us consider a state trajectory starting from $x_0=x$ on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_s\}_{s\ge 0},\mathbb{P})$. For any $s>0$, $x_s$ is a random variable. However, an $\{\mathcal{F}_s\}_{s\ge 0}$-progressively measurable control $\{u_s\}$ knows the information of the system up to the current time. In particular, the current state $x_s$ is deterministic under the conditional probability measure $\mathbb{P}(\cdot|\mathcal{F}_s)$. This observation naturally leads us to vary the probability spaces as well as the control processes; for details see, e.g., [24, 19, 6]. For this reason, we adopt the so-called weak formulation of the stochastic optimal control problem; see also Remark 1.

For each fixed $t\in[0,T)$, we denote by $\mathcal{U}^{\sf s}[t,T]$ the set of all 5-tuples $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})$ satisfying the following conditions:

  • (i)

    $(\Omega,\mathcal{F},\mathbb{P})$ is a complete probability space;

  • (ii)

    $\{w_s\}$ is a $d$-dimensional Wiener process on $(\Omega,\mathcal{F},\mathbb{P})$ over $[t,T]$ (with $w_t=0$ almost surely);

  • (iii)

    the control $\{u_s\}$ is an $\{\mathcal{F}_s\}_{s\ge t}$-progressively measurable, $\mathbb{U}$-valued process on $(\Omega,\mathcal{F},\mathbb{P})$, where $\mathcal{F}_s$ is the $\sigma$-field generated by $\{w_r\}_{t\le r\le s}$.

For $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})\in\mathcal{U}^{\sf s}[t,T]$, we call $\{u_s\}$ an admissible control process and $(\Omega,\mathcal{F},\mathbb{P},\{w_s\})$ a reference probability space. For notational simplicity, we sometimes write $u\in\mathcal{U}^{\sf s}[t,T]$ instead of $(\Omega,\mathcal{F},\mathbb{P},\{w_s\},\{u_s\})\in\mathcal{U}^{\sf s}[t,T]$. Note that in (4) the expectation $\mathbb{E}$ is with respect to $\mathbb{P}$. For given $(t,x)\in[0,T]\times\mathbb{R}^n$ and $u\in\mathcal{U}^{\sf s}[t,T]$, we denote by $\{x_s^{t,x,u}\}_{t\le s\le T}$ the unique solution of (3). When there is no confusion, we omit the superscript $t,x,u$.

Then, we are ready to formulate the main problem as follows:

Problem 1.

Given $x\in\mathbb{R}^n$, $T>0$, and $t\in[0,T]$, find a 5-tuple $u\in\mathcal{U}^{\sf s}[t,T]$ that solves

$$\begin{aligned}
&\underset{u}{\text{minimize}} && J^{\sf s}(t,x,u)\\
&\text{subject to} && dx_s = f(x_s,u_s)\,ds + \sigma(x_s,u_s)\,dw_s,\\
& && x_t = x,\\
& && u\in\mathcal{U}^{\sf s}[t,T].
\end{aligned}$$

\lhd

The value function for Problem 1 is defined by

$$V^{\sf s}(t,x) \triangleq \inf_{u\in\mathcal{U}^{\sf s}[t,T]} J^{\sf s}(t,x,u), \quad (t,x)\in[0,T]\times\mathbb{R}^n. \quad (7)$$
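For any fixed admissible control, the cost (4) can be estimated by Monte Carlo averaging over sample paths, which yields an upper bound on $V^{\sf s}(t,x)$ since the infimum over controls is not taken. The sketch below does this for the scalar dynamics of Example 1 under a hypothetical threshold feedback (the threshold is our own illustrative choice):

```python
import numpy as np

# Monte Carlo estimate of the cost J^s(t, x, u) in (4) for the scalar dynamics
# of Example 1 (g(x) = x^2), under a fixed, hypothetical threshold feedback.
# The average over paths approximates the expectation; V^s(t, x) is the infimum
# of such costs over all admissible controls, which this sketch does not compute.
rng = np.random.default_rng(1)
c, sigma, T, x0 = 1.0, 0.1, 1.0, 0.5
n_paths, n_steps = 2000, 200
dt = T / n_steps

def feedback(x, thr=0.3):
    # vectorized bang-off-bang feedback with illustrative threshold thr
    return np.where(np.abs(x) <= thr, 0.0, -np.sign(x))

x = np.full(n_paths, x0)
running = np.zeros(n_paths)  # integral of psi0(u_s) along each path
for _ in range(n_steps):
    u = feedback(x)
    running += (u != 0.0) * dt
    x += (c * x + u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

J_hat = np.mean(running + x**2)  # terminal cost g(x) = x^2
print(f"estimated cost: {J_hat:.3f}")
```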
Remark 1.

In Problem 1, we vary probability spaces. This problem formulation is called a weak formulation. On the other hand, the problem where we fix a probability space for each initial time and state $(t,x)\in[0,T]\times\mathbb{R}^n$ and vary only the control processes is referred to as a strong formulation, which is natural from the practical point of view. Despite the difference in the settings, it is known that, under some conditions, the value function of the weak formulation coincides with that of the strong formulation; see [7]. In this paper, under some assumptions, we will show that, for any given reference probability space, we can design an optimal state-feedback controller in Corollary 1. This result bridges the gap between the weak formulation and the strong formulation. Lastly, we would like to emphasize that the term "weak" refers only to the fact that the probability spaces vary, not to the concept of solution of the stochastic differential equation (3). In fact, once we fix $u\in\mathcal{U}^{\sf s}[t,T]$, the solution is defined on the same probability space. $\lhd$

4 General analysis of stochastic optimal control with discontinuous input cost functional

This section is devoted to the preliminary analysis of the stochastic $L^0$ optimal control problem. We first characterize the value function as a viscosity solution to the associated HJB equation. Then, we derive a necessary and sufficient condition for $L^0$ optimality.

4.1 Characterization of the value function

In what follows, we show that the value function $V^{\sf s}$ is a viscosity solution to the associated HJB equation. The definition of a viscosity solution appears in Appendix C. The HJB equation [24] corresponding to the stochastic system (3) is given by

$$-v_t(t,x) + H^{\sf s}(x, D_x v(t,x), D_x^2 v(t,x)) = 0, \quad (t,x)\in[0,T)\times\mathbb{R}^n, \quad (8)$$
$$v(T,x) = g(x), \quad x\in\mathbb{R}^n, \quad (9)$$

where $H^{\sf s}:\mathbb{R}^n\times\mathbb{R}^n\times\mathcal{S}^n\to\mathbb{R}$ is defined by

$$H^{\sf s}(x,p,M) \triangleq \sup_{u\in\mathbb{U}} \Bigl\{ -f(x,u)\cdot p - \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,u)M\bigr) - \psi_0(u) \Bigr\}. \quad (10)$$
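For intuition, the supremum in (10) can be evaluated by hand in the scalar setting of Example 1, where $f(x,u)=cx+u$, $\sigma$ is constant, and $\mathbb{U}=[-1,1]$. The following sketch isolates only the $u$-dependent terms; the general control-affine case is treated in Section 5.

```latex
% Only the terms involving u matter inside the supremum in (10):
\sup_{|u|\le 1}\bigl\{-u\,p-|u|^{0}\bigr\}
  =\max\Bigl\{0,\ \sup_{0<|u|\le 1}(-u\,p)-1\Bigr\}
  =\max\bigl\{0,\ |p|-1\bigr\},
% attained by u^{*}=0 when |p|<1 and by u^{*}=-\operatorname{sgn}(p) when |p|>1
% (both are optimal at |p|=1). With p = D_x v, this is the bang-off-bang
% structure observed in Example 1.
```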

We first introduce the result for the continuity of the value function [13]. The main difficulty in the analysis is that the state of the system (3) is unbounded due to the stochastic noise.

Theorem 1.

Fix $T>0$. Under assumptions $(A_1)$, $(A_2)$, and $(A_3)$, the value function $V^{\sf s}$ defined by (7) is continuous on $[0,T]\times\mathbb{R}^n$. If in addition the terminal cost $g$ is Lipschitz continuous, then $V^{\sf s}(t,x)$ is Lipschitz continuous in $x$ uniformly in $t$, and locally $1/2$-Hölder continuous in $t$ for each $x$. $\lhd$

Remark 2.

Note that the Lipschitz continuity of $g$ yields the local Lipschitz continuity of the value function for deterministic systems [11, Theorem 1], which ensures that the value function is differentiable almost everywhere. On the other hand, we cannot expect local Lipschitz continuity of the value function $V^{\sf s}$ in the stochastic case, even when $g$ is Lipschitz continuous. This is essentially because $\int_0^t \sigma\,dw$ is only of order $t^{1/2}$. $\lhd$

The dynamic programming principle plays an important role in proving that the value function is a viscosity solution to the HJB equation. Since the proof is similar to [24, Chapter 4, Theorem 3.3], it is omitted.

Lemma 1.

Fix any $T>0$ and any $\tau\in[0,T]$. Assume $(A_1)$ and $(A_2)$. Then, the value function (7) satisfies

$$V^{\sf s}(t,x) = \inf_{u\in\mathcal{U}^{\sf s}[t,T]} \mathbb{E}\left[\int_t^\tau \psi_0(u_s)\,ds + V^{\sf s}(\tau, x_\tau^{t,x,u})\right]$$

for all $(t,x)\in[0,\tau]\times\mathbb{R}^n$. $\lhd$
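The recursion in Lemma 1 underlies the standard backward-in-time computation of $V^{\sf s}$. As an illustration only, the following explicit finite-difference sketch applies one Euler step of the recursion per time slice to the scalar setting of Example 1; the grid sizes, the `np.gradient` derivatives, and the boundary handling are our own crude choices, not from the paper.

```python
import numpy as np

# Crude finite-difference dynamic programming for Example 1: c = 1, sigma = 0.1,
# g(x) = x^2, and control values restricted to {-1, 0, 1}. Each backward step is
# V(t - dt, x) ~ V(t, x) + dt * inf_u { (c x + u) V_x + 0.5 sigma^2 V_xx + psi0(u) },
# i.e. one explicit Euler step of the recursion behind Lemma 1 / the HJB equation.
c, sigma, T = 1.0, 0.1, 1.0
n_t, n_x = 200, 201
dt = T / n_t
xs = np.linspace(-2.0, 2.0, n_x)
dx = xs[1] - xs[0]

V = xs**2  # terminal condition V(T, x) = g(x)
for _ in range(n_t):
    Vx = np.gradient(V, dx)
    Vxx = np.gradient(Vx, dx)
    ham = np.min(
        [(c * xs + u) * Vx + 0.5 * sigma**2 * Vxx + (1.0 if u != 0.0 else 0.0)
         for u in (-1.0, 0.0, 1.0)],
        axis=0,
    )
    V = V + dt * ham  # step backward from t to t - dt

print(f"V(0, 0) approx {V[n_x // 2]:.3f}")
```

The minimizing `u` at each grid point gives an approximate feedback map; consistent with the bang-off-bang structure, it is `0.0` wherever `abs(Vx) <= 1` and full authority elsewhere.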

According to the definition of a viscosity solution in Appendix C, we have to check the inequalities (46) and (47) for every smooth function $\phi$. However, this requirement is too strong for our analysis. Fortunately, the following lemma makes it possible to restrict the class of functions $\phi$ to be considered.

Lemma 2.

Assume $(A_1)$, $(A_2)$, and $(A_3)$. Then, the value function (7) satisfies the polynomial growth condition, i.e., for some constant $\hat{C}_p>0$,

$$|V^{\sf s}(t,x)| \le \hat{C}_p(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n \quad (11)$$

holds, where $p\ge 2$ satisfies (6). In addition, if (46) and (47), with $v$ and $H$ replaced by $V^{\sf s}$ and $H^{\sf s}$, respectively, are satisfied for every $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, then $V^{\sf s}$ is a viscosity solution to the HJB equation (8) with terminal condition (9).

Proof.

First, we derive the polynomial growth condition of $V^{\sf s}$. By Assumption $(A_2)$,

$$|V^{\sf s}(t,x)| \le \mathbb{E}[|g(\bar{x}_T)|] \le \mathbb{E}[\hat{C}(1+\|\bar{x}_T\|^p)] \quad (12)$$

holds, where $\hat{C}>0$ and $p\ge 2$ are constants satisfying (6), and $\{\bar{x}_s\}$ is the solution of the uncontrolled system

$$d\bar{x}_s = f(\bar{x}_s,0)\,ds + \sigma(\bar{x}_s,0)\,dw_s, \quad \bar{x}_t = x.$$

Combining inequality (12) with inequality (42) of Lemma 4 in Appendix A, we obtain (11).

Next, note that by the definition (7) of the value function, $V^{\sf s}$ satisfies the terminal condition (9). Moreover, thanks to the continuity of $V^{\sf s}$ and the derived growth condition (11), we can apply Theorem 3.1 of [19]: $V^{\sf s}$ is a viscosity subsolution (resp. supersolution) of (8) if (46) (resp. (47)) holds for every $\phi\in C^{1,2}([0,T)\times\mathbb{R}^n)\cap C([0,T]\times\mathbb{R}^n)$ satisfying, for some $R>0$,

$$\phi(t,x) = c_p(1+\|x\|^p) \quad \text{for } t\in[0,T],\ \|x\|\ge R, \quad (13)$$

where $c_p=\hat{C}_p$ (resp. $c_p=-\hat{C}_p$). This implies that for some large $K^{(1)}>0$,

$$|\phi(t,x)| \le K^{(1)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n,$$

noting that $\phi$ is continuous, and therefore $|\phi|$ attains a maximum on $[0,T]\times\{x\in\mathbb{R}^n:\|x\|\le R\}$. Moreover, (13) gives

$$D_x\phi(t,x) = p\,c_p\|x\|^{p-2}x \quad \text{for } t\in[0,T],\ \|x\|\ge R,$$

and hence, for some constant $K^{(2)}>0$, it holds that

$$\|D_x\phi(t,x)\| \le K^{(2)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n,$$

noting that $D_x\phi$ is continuous. Likewise, for some constants $K^{(3)}, K^{(4)}>0$, it holds that

$$\|D_x^2\phi(t,x)\| \le K^{(3)}(1+\|x\|^p),$$
$$|\phi_t(t,x)| \le K^{(4)}(1+\|x\|^p), \quad (t,x)\in[0,T]\times\mathbb{R}^n.$$

This completes the proof. $\Box$

Then, we are ready to prove that our value function is a viscosity solution to the associated HJB equation.

Theorem 2.

Fix $T>0$. Assume $(A_1)$, $(A_2)$, and $(A_3)$. Then, the value function (7) is a viscosity solution to the HJB equation (8) with terminal condition (9).

Proof.

Note that $H^{\sf s}$ is continuous (see Lemma 5 in Appendix B), and the condition (44) in Appendix C is obviously satisfied since the matrix $\sigma\sigma^\top$ is positive semidefinite. We first show that the value function $V^{\sf s}$ is a viscosity subsolution of (8). For $p\ge 2$ satisfying (6), fix any $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, and let $(t,x)$ be a global maximum point of $V^{\sf s}-\phi$. Let us consider the constant control $u_s=\bar{u}$ for $s\in[t,T]$, with $\bar{u}\in\mathbb{U}$. Denote the corresponding state process $x_s^{t,x,u}$ by $\bar{x}_s$. Then, for $\tau\in(t,T)$, we have

$$\mathbb{E}\left[\phi(t,x)-\phi(\tau,\bar{x}_\tau)\right] \le \mathbb{E}\left[V^{\sf s}(t,x)-V^{\sf s}(\tau,\bar{x}_\tau)\right]. \quad (14)$$

By using Lemma 1, we obtain

$$V^{\sf s}(t,x) \le \mathbb{E}\left[\int_t^\tau \psi_0(u_s)\,ds + V^{\sf s}(\tau,\bar{x}_\tau)\right] = (\tau-t)\psi_0(\bar{u}) + \mathbb{E}\left[V^{\sf s}(\tau,\bar{x}_\tau)\right].$$

Therefore,

$$\mathbb{E}\left[\phi(t,x)-\phi(\tau,\bar{x}_\tau)\right] \le (\tau-t)\psi_0(\bar{u}).$$

Note that under the growth condition (2), it holds that

$$\lim_{\tau\searrow t}\frac{\mathbb{E}[\phi(\tau,\bar{x}_\tau)]-\phi(t,x)}{\tau-t} = D_x\phi(t,x)\cdot f(x,\bar{u}) + \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,\bar{u})D_x^2\phi(t,x)\bigr) + \phi_t(t,x),$$

where Itô’s formula is applied [7]. Therefore, we get

$$-D_x\phi(t,x)\cdot f(x,\bar{u}) - \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(x,\bar{u})D_x^2\phi(t,x)\bigr) - \phi_t(t,x) \le \psi_0(\bar{u}).$$

This inequality holds for all $\bar{u}\in\mathbb{U}$. This means

$$-\phi_t(t,x) + H^{\sf s}(x, D_x\phi(t,x), D_x^2\phi(t,x)) \le 0.$$

We next show that $V^{\sf s}$ is a viscosity supersolution of (8). Fix any $\phi\in C_p^{1,2}([0,T]\times\mathbb{R}^n)$, and let $(t,x)$ be a global minimum point of $V^{\sf s}-\phi$. Then, for any $\varepsilon>0$ and $\tau\in(t,T)$, by Lemma 1, there exists $\tilde{u}\in\mathcal{U}^{\sf s}[t,T]$, depending on $\varepsilon$ and $\tau$, such that

$$V^{\sf s}(t,x) + (\tau-t)\varepsilon \ge \mathbb{E}\left[\int_t^\tau \psi_0(\tilde{u}_s)\,ds + V^{\sf s}(\tau,\tilde{x}_\tau)\right], \quad (15)$$

where we denote $x_s^{t,x,\tilde{u}}$ by $\tilde{x}_s$. Therefore, it holds that

$$\begin{aligned}
0 &\ge \mathbb{E}\left[V^{\sf s}(t,x)-\phi(t,x)-V^{\sf s}(\tau,\tilde{x}_\tau)+\phi(\tau,\tilde{x}_\tau)\right]\\
&\ge -(\tau-t)\varepsilon + \mathbb{E}\left[\int_t^\tau \psi_0(\tilde{u}_s)\,ds + \phi(\tau,\tilde{x}_\tau) - \phi(t,x)\right]. \quad (16)
\end{aligned}$$

By applying Itô’s formula, we obtain

$$\begin{aligned}
\mathbb{E}[\phi(t,x)-\phi(\tau,\tilde{x}_\tau)] = \mathbb{E}\Biggl[&-\int_t^\tau D_x\phi(s,\tilde{x}_s)\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\\
&-\int_t^\tau \frac{1}{2}\mathrm{tr}\bigl(\sigma\sigma^\top(\tilde{x}_s,\tilde{u}_s)D_x^2\phi(s,\tilde{x}_s)\bigr)\,ds\\
&-\int_t^\tau \phi_t(s,\tilde{x}_s)\,ds\Biggr]. \quad (17)
\end{aligned}$$

Here, note that

$$\mathbb{E}\left[\int_t^\tau D_x\phi(s,\tilde{x}_s)\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\right] = \mathbb{E}\left[\int_t^\tau D_x\phi(t,x)\cdot f(x,\tilde{u}_s)\,ds\right] + o(\tau-t). \quad (18)$$

To see this, rewrite (18) as

$$\underbrace{\mathbb{E}\biggl[\int_t^\tau \bigl\{D_x\phi(s,\tilde{x}_s)-D_x\phi(t,x)\bigr\}\cdot f(\tilde{x}_s,\tilde{u}_s)\,ds\biggr]}_{\triangleq I_1(\tau)} + \underbrace{\mathbb{E}\biggl[\int_t^\tau D_x\phi(t,x)\cdot\bigl\{f(\tilde{x}_s,\tilde{u}_s)-f(x,\tilde{u}_s)\bigr\}\,ds\biggr]}_{\triangleq I_2(\tau)} = o(\tau-t). \quad (19)$$

The first term $I_{1}$ is bounded above as follows:

$$\begin{aligned}
I_{1}(\tau)&\leq\mathbb{E}\left[\int_{t}^{\tau}\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\cdot\|f(\tilde{x}_{s},\tilde{u}_{s})\|\,ds\right]\\
&\leq\left\{LK_{1}(1+\|x\|)+K_{f}\right\}\,\mathbb{E}\left[\int_{t}^{\tau}\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\,ds\right]\\
&\leq\left\{LK_{1}(1+\|x\|)+K_{f}\right\}\,(\tau-t)\sup_{s\in[t,\tau]}\mathbb{E}\left[\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\right],
\end{aligned}$$

where $L$ and $K_{1}$ satisfy (5) and (42), respectively, and $K_{f}$ is some constant satisfying $\|f(0,u)\|\leq K_{f}$ for all $u\in\mathbb{U}$. If it holds that

$$\lim_{s\searrow t}\mathbb{E}\left[\|D_{x}\phi(s,\tilde{x}_{s})-D_{x}\phi(t,x)\|\right]=0, \quad (20)$$

then we obtain $\lim_{\tau\searrow t}I_{1}(\tau)/(\tau-t)=0$. Indeed, we can show (20) under the condition $\phi\in C_{p}^{1,2}([0,T]\times\mathbb{R}^{n})$ along the same lines as the proof of Theorem 2 in [13]. Likewise, we get $\lim_{\tau\searrow t}I_{2}(\tau)/(\tau-t)=0$ under Assumption $(A_{1})$, and therefore (19) holds.

By the same argument, we see that

$$\mathbb{E}\left[\int_{t}^{\tau}\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(\tilde{x}_{s},\tilde{u}_{s})D_{x}^{2}\phi(s,\tilde{x}_{s})\right)ds\right]=\mathbb{E}\left[\int_{t}^{\tau}\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,\tilde{u}_{s})D_{x}^{2}\phi(t,x)\right)ds\right]+o(\tau-t),$$

$$\mathbb{E}\left[\int_{t}^{\tau}\phi_{t}(s,\tilde{x}_{s})\,ds\right]=(\tau-t)\phi_{t}(t,x)+o(\tau-t).$$

Then, it follows from (16) and (17) that

$$\begin{aligned}
-(\tau-t)\varepsilon&\leq\mathbb{E}\Biggl[\int_{t}^{\tau}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,\tilde{u}_{s})-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,\tilde{u}_{s})D_{x}^{2}\phi(t,x)\right)-\psi_{0}(\tilde{u}_{s})\Bigr\}ds\Biggr]-(\tau-t)\phi_{t}(t,x)+o(\tau-t)\\
&\leq(\tau-t)\sup_{u\in\mathbb{U}}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,u)-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)D_{x}^{2}\phi(t,x)\right)-\psi_{0}(u)\Bigr\}-(\tau-t)\phi_{t}(t,x)+o(\tau-t).
\end{aligned}$$

Divide both sides by $(\tau-t)$ and let $\tau\searrow t$; then

$$-\varepsilon\leq\sup_{u\in\mathbb{U}}\Bigl\{-D_{x}\phi(t,x)\cdot f(x,u)-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)D_{x}^{2}\phi(t,x)\right)-\psi_{0}(u)\Bigr\}-\phi_{t}(t,x).$$

The arbitrariness of $\varepsilon$ shows that $V^{\sf s}$ is a viscosity supersolution of (8). Combining the above arguments with Lemma 2 completes the proof. $\Box$

4.2 Optimality of a control

Next, we provide a necessary condition and a sufficient condition for the $L^{0}$ optimality. The second-order right parabolic superdifferential $D_{t+,x}^{1,2,+}$ and subdifferential $D_{t+,x}^{1,2,-}$ are defined in Appendix C. The proof is the same as that of [24, Chapter 5, Theorems 5.3 and 5.7], noting that under Assumptions $(A_{1})$, $(A_{2})$, and $(A_{3})$, Theorems 1 and 2 and Lemma 1 hold.

Lemma 3.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1})$, $(A_{2})$, and $(A_{3})$.

(Necessary condition) Let $(\Omega^{*},\mathcal{F}^{*},\mathbb{P}^{*},\{w_{s}^{*}\},\{u_{s}^{*}\})\in\mathcal{U}^{\sf s}[t,T]$ be an optimal solution for Problem 1, and $\{x_{s}^{*}\}$ be the corresponding optimal state trajectory. Then, for any

$$(q^{*},p^{*},M^{*})\in\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\mathcal{F}^{*}}^{2}(t,T;\mathcal{S}^{n})$$

satisfying

$$(q_{s}^{*},p_{s}^{*},M_{s}^{*})\in D_{t+,x}^{1,2,-}V^{\sf s}(s,x_{s}^{*}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}^{*}\text{-a.s.}, \quad (21)$$

it must hold that

$$\mathbb{E}[q_{s}^{*}]\leq\mathbb{E}\left[G(x_{s}^{*},u_{s}^{*},p_{s}^{*},M_{s}^{*})\right],\ \text{a.e.}\ s\in[t,T], \quad (22)$$

where we define

$$G(x,u,p,M)\triangleq-f(x,u)\cdot p-\frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{\top}(x,u)M\right)-\psi_{0}(u).$$

(Sufficient condition) Let $(\bar{\Omega},\bar{\mathcal{F}},\bar{\mathbb{P}},\{\bar{w}_{s}\},\{\bar{u}_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ and $\{\bar{x}_{s}\}$ be the corresponding state trajectory. If there exists

$$(\bar{q},\bar{p},\bar{M})\in\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathbb{R}^{n})\times\mathcal{L}_{\bar{\mathcal{F}}}^{2}(t,T;\mathcal{S}^{n})$$

satisfying

$$(\bar{q}_{s},\bar{p}_{s},\bar{M}_{s})\in D_{t+,x}^{1,2,+}V^{\sf s}(s,\bar{x}_{s}),\ \text{a.e.}\ s\in[t,T],\ \bar{\mathbb{P}}\text{-a.s.}, \quad (23)$$

and

$$\bar{q}_{s}=G(\bar{x}_{s},\bar{u}_{s},\bar{p}_{s},\bar{M}_{s})=\max_{u\in\mathbb{U}}G(\bar{x}_{s},u,\bar{p}_{s},\bar{M}_{s}),\ \text{a.e.}\ s\in[t,T],\ \bar{\mathbb{P}}\text{-a.s.}, \quad (24)$$

then $\{\bar{u}_{s}\}$ is an optimal control process. $\lhd$

Compared to the verification theorem [7], which is well known as an optimality condition for the case when the value function is smooth, the above conditions are quite complicated and do not explicitly relate the optimal control value to the current state value via the value function. In view of this, we derive a novel necessary and sufficient condition that is similar to the verification theorem and therefore much clearer. We now introduce some assumptions:

  • $(B_{1})$: For any $u\in\mathcal{U}^{\sf s}[t,T]$, the value function $V^{\sf s}$ defined by (7) admits $V_{t}^{\sf s}$, $D_{x}V^{\sf s}$, and $D_{x}^{2}V^{\sf s}$ at $(s,x_{s})$ for almost every $s\in[t,T]$ and almost surely;

  • $(B_{2})$: For any $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$, there exists a function $\varphi:[t,T]\rightarrow S$ ($S=\mathbb{R},\mathbb{R}^{n},\mathcal{S}^{n}$) such that, for any $s\in[t,T]$,

$$\rho_{\varphi,s}(x)\triangleq\begin{cases}\rho(s,x),&\text{if }\rho\text{ exists at }(s,x),\\ \varphi(s),&\text{otherwise},\end{cases}\quad x\in\mathbb{R}^{n} \quad (25)$$

  is Borel measurable;

  • $(B_{3})$: For any $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$, there exist constants $K>0$ and $p\geq 2$ such that

$$\|\rho(s,x)\|\leq K(1+\|x\|^{p})$$

  holds at any $(s,x)\in[t,T]\times\mathbb{R}^{n}$ where $\rho(s,x)$ exists.

The validity of the above assumptions is discussed in Remark 3.

Theorem 3.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. Then, $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution for Problem 1 if and only if

$$u_{s}\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (26)$$

where $\{x_{s}\}$ is the corresponding state trajectory.

Proof. For a given $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})\in\mathcal{U}^{\sf s}[t,T]$ and the corresponding state trajectory $\{x_{s}\}$, define a stochastic process

$$q_{s}\triangleq\begin{cases}V_{t}^{\sf s}(s,x_{s}),&\text{if }V_{t}^{\sf s}\text{ exists at }(s,x_{s}),\\ \varphi(s),&\text{otherwise},\end{cases} \quad (27)$$

where $\varphi:[t,T]\rightarrow\mathbb{R}$ satisfies Assumption $(B_{2})$, which ensures that $\{q_{s}\}$ is an $\{\mathcal{F}_{s}\}_{s\geq t}$-adapted process. By Assumption $(B_{1})$, it holds that $q_{s}=V_{t}^{\sf s}(s,x_{s})$ for almost every $s\in[t,T]$, $\mathbb{P}$-almost surely, and by a slight abuse of notation, we write $q_{s}=V_{t}^{\sf s}(s,x_{s})$. In addition, Assumption $(B_{3})$ and Lemma 4 imply that $\mathbb{E}[\int_{t}^{T}\|q_{s}\|^{2}ds]<+\infty$. To sum up, we get $\{q_{s}\}\in\mathcal{L}_{\mathcal{F}}^{2}(t,T;\mathbb{R}^{n})$. Similarly, take $p_{s}=D_{x}V^{\sf s}(s,x_{s})$ and $M_{s}=D_{x}^{2}V^{\sf s}(s,x_{s})$. Note that, by (48) in Appendix C, it holds that

$$(q_{s},p_{s},M_{s})\in D_{t+,x}^{1,2,-}V^{\sf s}(s,x_{s})\cap D_{t+,x}^{1,2,+}V^{\sf s}(s,x_{s}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}$$

If $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}\})$ is an optimal solution, then by Lemma 3, (22) holds for $q_{s}^{*}=q_{s},p_{s}^{*}=p_{s},M_{s}^{*}=M_{s}$, that is,

$$\mathbb{E}[V_{t}^{\sf s}(s,x_{s})]\leq\mathbb{E}\left[G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\right],\ \text{a.e.}\ s\in[t,T]. \quad (28)$$

On the other hand, since $V^{\sf s}$ is a viscosity solution to the HJB equation (8) under Assumptions $(A_{1})$, $(A_{2})$, and $(A_{3})$ by Theorem 2, $V^{\sf s}$ satisfies (8) at any point where $V_{t}^{\sf s}$, $D_{x}V^{\sf s}$, and $D_{x}^{2}V^{\sf s}$ exist. Together with Assumption $(B_{1})$,

$$\begin{aligned}
V_{t}^{\sf s}(s,x_{s})&=H^{\sf s}(x_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s}))\\
&=\sup_{u\in\mathbb{U}}G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\\
&\geq G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}
\end{aligned} \quad (29)$$

Combining (28) and (29), we obtain

$$\begin{aligned}
V_{t}^{\sf s}(s,x_{s})&=\max_{u\in\mathbb{U}}G\left(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right)\\
&=G\left(x_{s},u_{s},D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s})\right),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}
\end{aligned} \quad (30)$$

Hence, the optimal control process $\{u_{s}\}$ satisfies (26).

Conversely, if an admissible control process $\{u_{s}\}$ satisfies (26), it follows from (29) that

$$q_{s}=\sup_{u\in\mathbb{U}}G(x_{s},u,p_{s},M_{s})=G(x_{s},u_{s},p_{s},M_{s}),\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}$$

In other words, the sufficient condition (24) in Lemma 3 holds for $\bar{q}_{s}=q_{s},\bar{p}_{s}=p_{s},\bar{M}_{s}=M_{s},\bar{x}_{s}=x_{s},\bar{u}_{s}=u_{s}$. This completes the proof. $\Box$

Theorem 3 can be seen as a generalization of the verification theorem to value functions that are differentiable only almost everywhere; see Remark 3 below. It should be emphasized that the above result can easily be generalized to cost functionals with a uniformly continuous state cost and control cost, since Theorems 5.3 and 5.7 in [24, Chapter 5] still hold.

Remark 3.

If $V^{\sf s}$ admits every $\rho\in\{V_{t}^{\sf s},D_{x}V^{\sf s},D_{x}^{2}V^{\sf s}\}$ at almost every $(s,x)\in[t,T]\times\mathbb{R}^{n}$, and $\rho$ is continuous almost everywhere, then, by Lusin’s theorem, there exists a Borel measurable function that coincides with $\rho$ almost everywhere. Hence, in this case, we can remove Assumption $(B_{2})$. In addition, if for any $u\in\mathcal{U}^{\sf s}[t,T]$ the distribution of $\{x_{s}\}$ admits a density, Assumption $(B_{1})$ holds. This is a sufficient condition, but it is not necessary. See [3] for the existence of densities for solutions to stochastic differential equations. As for Assumption $(B_{3})$, we expect that the condition can be removed or relaxed along the lines of [7, Theorem IV.3.1], but this is not our focus in the present paper. $\lhd$

Theorem 3 immediately characterizes the $L^{0}$ optimal control in terms of a feedback map. In fact, as a straightforward consequence of Theorem 3, we obtain the following result.

Corollary 1.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. Let a Borel measurable function $\underline{u}:[t,T]\times\mathbb{R}^{n}\rightarrow\mathbb{U}$ satisfy

$$\underline{u}(s,x^{\prime})\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{G\left(x^{\prime},u,D_{x}V^{\sf s}(s,x^{\prime}),D_{x}^{2}V^{\sf s}(s,x^{\prime})\right)\right\},\ \text{a.e.}\ s\in[t,T],\ x^{\prime}\in\mathbb{R}^{n}.$$

Fix any reference probability space $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\})$. If the stochastic differential equation

$$dx_{s}=f(x_{s},\underline{u}(s,x_{s}))ds+\sigma(x_{s},\underline{u}(s,x_{s}))dw_{s},\ s>t,\quad x_{t}=x$$

has a unique strong solution, then $u_{s}^{*}\triangleq\underline{u}(s,x_{s})$ is an optimal control process, namely, $(\Omega,\mathcal{F},\mathbb{P},\{w_{s}\},\{u_{s}^{*}\})\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution for Problem 1. $\lhd$

Here, we emphasize that in the above result, any reference probability space can be fixed. Thus, for a state-feedback controller, we need not distinguish which reference probability space is optimal and can concentrate only on control processes.
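To make the feedback map in Corollary 1 concrete, the following minimal sketch (our own illustration, not an algorithm from the paper) approximates $\underline{u}(s,x)$ for a scalar system by maximizing $G$ over a finite candidate grid of $\mathbb{U}$; the callables `f`, `sigma`, `psi0` and the derivative values `p`, `M` of the value function are hypothetical inputs supplied by the user.

```python
# Minimal sketch (not from the paper): pointwise feedback map of Corollary 1
# for a scalar system, obtained by maximizing G over a finite grid of U.

def G(x, u, p, M, f, sigma, psi0):
    """G(x,u,p,M) = -f(x,u)*p - 0.5*sigma(x,u)^2*M - psi0(u) (scalar case)."""
    return -f(x, u) * p - 0.5 * sigma(x, u) ** 2 * M - psi0(u)

def feedback(x, p, M, f, sigma, psi0, grid):
    """Return a maximizer of G over the control grid (approximate argmax in (26))."""
    return max(grid, key=lambda u: G(x, u, p, M, f, sigma, psi0))

if __name__ == "__main__":
    # Toy data: dx = (x + u) ds + 0.1 dw, L0-like cost psi0(u) = 1 if u != 0.
    f = lambda x, u: x + u
    sigma = lambda x, u: 0.1
    psi0 = lambda u: 0.0 if u == 0 else 1.0
    grid = [-1.0, 0.0, 1.0]  # bang-off-bang candidates
    print(feedback(1.0, 2.0, 0.0, f, sigma, psi0, grid))  # large gradient -> -1.0
```

In practice the derivatives $D_xV^{\sf s}$ and $D_x^2V^{\sf s}$ would come from a numerical solution of the HJB equation (8).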

5 Characterization of sparse optimal stochastic control

In this section, we focus on control-affine systems satisfying

$$f(x,u)=f_{0}(x)+\sum_{j=1}^{m}f_{j}(x)u^{(j)},\quad\sigma(x,u)=\sigma(x) \quad (31)$$

for some $f_{j}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$, $j=0,1,2,\ldots,m$, where $u^{(j)}$ is the $j$-th component of $u\in\mathbb{R}^{m}$. First, we reveal the discreteness of the stochastic $L^{0}$ optimal control. Next, we show an equivalence between the $L^{0}$ optimality and the $L^{1}$ optimality. Thanks to the equivalence, we ensure that our value function is a classical solution of the associated HJB equation under some assumptions.

5.1 Discreteness of the optimal control

We explain the discreteness of the stochastic $L^{0}$ optimal control based on Theorem 3.

Theorem 4.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1}),(A_{2}),(A_{3})$ and $(B_{1}),(B_{2}),(B_{3})$. If the system (3) is control-affine, i.e., (31) holds, and $\mathbb{U}=\{u\in\mathbb{R}^{m}:U_{j}^{-}\leq u^{(j)}\leq U_{j}^{+},\ \forall j\}$ for some $U_{j}^{-}<0$ and $U_{j}^{+}>0$, then $u\in\mathcal{U}^{\sf s}[t,T]$ is an optimal solution to Problem 1 if and only if

$$u_{s}^{(j)}\in\mathop{\rm arg\,max}\limits_{u^{(j)}\in\mathbb{U}_{j}}\left\{-(f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s}))u^{(j)}-|u^{(j)}|^{0}\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.} \quad (32)$$

for all $j=1,2,\ldots,m$, where $\mathbb{U}_{j}\triangleq\{a\in\mathbb{R}:U_{j}^{-}\leq a\leq U_{j}^{+}\}$. Furthermore, if an optimal control process $\{u_{s}^{*}\}$ exists, then the $j$-th component of $u_{s}^{*}$ takes only the three values $\{U_{j}^{-},0,U_{j}^{+}\}$ for almost every $s\in[t,T]$ and almost surely.

Proof. By Theorem 3, a necessary and sufficient condition for the $L^{0}$ optimality of $\{u_{s}\}$ is given by

$$\begin{aligned}
u_{s}&\in\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}G(x_{s},u,D_{x}V^{\sf s}(s,x_{s}),D_{x}^{2}V^{\sf s}(s,x_{s}))\\
&=\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\left\{-\sum_{j=1}^{m}f_{j}(x_{s})u^{(j)}\cdot D_{x}V^{\sf s}(s,x_{s})-\sum_{j=1}^{m}|u^{(j)}|^{0}\right\}\\
&=\mathop{\rm arg\,max}\limits_{u\in\mathbb{U}}\sum_{j=1}^{m}\left\{-(f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s}))u^{(j)}-|u^{(j)}|^{0}\right\},\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (33)
\end{aligned}$$

noting that $\sigma$ does not depend on the control variable. Since the box constraint decouples across components, (33) is equivalent to (32).

Next, it follows from (32) and an elementary calculation that

$$u_{s}^{(j)}\in\begin{cases}\{U_{j}^{-}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}<-1,\\ \{U_{j}^{-},0\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}=-1,\\ \{0\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{-}>-1\ \text{and}\ b_{j}(s,x_{s})U_{j}^{+}>-1,\\ \{0,U_{j}^{+}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{+}=-1,\\ \{U_{j}^{+}\},&\text{if}\quad b_{j}(s,x_{s})U_{j}^{+}<-1,\end{cases} \quad (34)$$

where we define $b_{j}(s,x)\triangleq D_{x}V^{\sf s}(s,x)\cdot f_{j}(x)$. Therefore, the $j$-th component of an optimal control process $\{u_{s}^{*}\}$ must take only the three values $\{U_{j}^{-},0,U_{j}^{+}\}$ for almost every $s\in[t,T]$ and almost surely. $\Box$
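The case analysis (34) can be transcribed directly as a selector for the $j$-th control component. The sketch below (ours, for illustration only) evaluates $-bu-|u|^{0}$ at the three candidates $U_j^-$, $0$, $U_j^+$ — by the proof above, no other value can be optimal — and returns all maximizers, so tie cases return two values.

```python
# Transcription of the case analysis (34): the j-th component of an L0 optimal
# control, given b = b_j(s, x_s) and bounds U_minus < 0 < U_plus.

def optimal_component_set(b, U_minus, U_plus):
    """Return the set of maximizers of -b*u - |u|^0 over {U_minus, 0, U_plus}."""
    scores = {}
    for u in (U_minus, 0.0, U_plus):
        scores[u] = -b * u - (0.0 if u == 0.0 else 1.0)  # |u|^0 = 1 for u != 0
    best = max(scores.values())
    return {u for u, val in scores.items() if val == best}

if __name__ == "__main__":
    print(optimal_component_set(2.0, -1.0, 1.0))  # b*U_minus = -2 < -1  -> only U_minus
    print(optimal_component_set(0.5, -1.0, 1.0))  # both products > -1   -> only 0
    print(optimal_component_set(1.0, -1.0, 1.0))  # b*U_minus = -1 (tie) -> U_minus and 0
```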

5.2 Equivalence between $L^{0}$ optimality and $L^{1}$ optimality

Let us consider the stochastic $L^{1}$ optimal control problem where the cost functional $J^{\sf s}$ in Problem 1 is replaced by the following one:

$$J_{1}^{\sf s}(t,x,u)\triangleq\mathbb{E}\left[\sum_{j=1}^{m}\int_{t}^{T}|u_{s}^{(j)}|ds+g(x_{T}^{t,x,u})\right]. \quad (35)$$

The corresponding value function is defined by

$$V_{1}^{\sf s}(t,x)\triangleq\inf_{u\in\mathcal{U}^{\sf s}[t,T]}J_{1}^{\sf s}(t,x,u),\ (t,x)\in[0,T]\times\mathbb{R}^{n}. \quad (36)$$

We here show the coincidence of the value functions of the $L^{0}$ optimal control and the $L^{1}$ optimal control for control-affine systems.

Theorem 5.

Fix $T>0$ and $(t,x)\in[0,T)\times\mathbb{R}^{n}$. Assume $(A_{1})$, $(A_{2})$, and $(A_{3})$. If the system (3) is control-affine, i.e., (31) holds, and $\mathbb{U}=\{u\in\mathbb{R}^{m}:|u^{(j)}|\leq 1,\forall j\}$, then for the value functions $V^{\sf s}$ and $V_{1}^{\sf s}$ defined by (7) and (36), respectively, it holds that

$$V^{\sf s}(t,x)=V_{1}^{\sf s}(t,x),\ \forall(t,x)\in[0,T]\times\mathbb{R}^{n}.$$

In addition, $V^{\sf s}$ is a unique, at most polynomially growing viscosity solution to the HJB equation (8) with the terminal condition (9).

Proof. In this setting, for any $x,p\in\mathbb{R}^{n}$ and $M\in\mathcal{S}^{n}$,

$$\begin{aligned}
H^{\sf s}(x,p,M)&=\sup_{u\in\mathbb{U}}\left\{-\sum_{j=1}^{m}f_{j}(x)u^{(j)}\cdot p-\sum_{j=1}^{m}|u^{(j)}|^{0}\right\}-f_{0}(x)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x)M)\\
&=\sum_{j=1}^{m}\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-(f_{j}(x)\cdot p)u^{(j)}-|u^{(j)}|^{0}\right\}-f_{0}(x)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x)M),
\end{aligned}$$

where $\mathbb{U}_{j}=\{a\in\mathbb{R}:|a|\leq 1\}$. Here, it follows from an elementary calculation that

$$\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-a_{x,p}^{j}u^{(j)}-|u^{(j)}|^{0}\right\}=\sup_{u^{(j)}\in\mathbb{U}_{j}}\left\{-a_{x,p}^{j}u^{(j)}-|u^{(j)}|\right\}$$

for all $x,p\in\mathbb{R}^{n}$ and $j=1,2,\dots,m$, where $a_{x,p}^{j}\triangleq f_{j}(x)\cdot p$. Indeed, the supremum on both sides is given by

$$\begin{cases}a_{x,p}^{j}-1,&\text{if}\quad a_{x,p}^{j}>1,\\ 0,&\text{if}\quad|a_{x,p}^{j}|\leq 1,\\ -a_{x,p}^{j}-1,&\text{if}\quad a_{x,p}^{j}<-1.\end{cases}$$

Hence, the HJB equation (8) is equivalent to

$$-v_{t}(t,x)+H_{1}(x,D_{x}v(t,x),D_{x}^{2}v(t,x))=0, \quad (37)$$

where

$$H_{1}(x,p,M)\triangleq\sup_{u\in\mathbb{U}}\Bigl\{-f(x,u)\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,u)M)-\psi_{1}(u)\Bigr\},\quad x,p\in\mathbb{R}^{n},\ M\in\mathcal{S}^{n},\qquad\psi_{1}(a)\triangleq\sum_{j=1}^{m}|a_{j}|,\quad a\in\mathbb{R}^{m}.$$

Note that equation (37) is the HJB equation for the $L^{1}$ optimal control problem. Moreover, it is known that $V_{1}^{\sf s}$ defined via the $L^{1}$ optimal control problem is a unique, at most polynomially growing viscosity solution to the associated HJB equation (37), or equivalently (8), with the terminal condition (9) [21].

Now, $V^{\sf s}$ is also a viscosity solution to the HJB equation (8) with the terminal condition (9) by Theorem 2. Note also that $V^{\sf s}$ satisfies the polynomial growth condition by Lemma 2. Therefore, by the aforementioned uniqueness of the viscosity solution to the HJB equation (8), we conclude that $V^{\sf s}=V_{1}^{\sf s}$. $\Box$
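The elementary calculation used in the proof — the coincidence of the pointwise $L^0$ and $L^1$ suprema over $|u^{(j)}|\leq 1$ — is easy to check numerically. The following sketch (ours, illustration only) compares both suprema on a fine control grid against the closed-form expression $\max(|a|-1,0)$.

```python
# Numerical check (illustration only) that, over |u| <= 1,
#   sup { -a*u - |u|^0 }  =  sup { -a*u - |u| }  =  max(|a| - 1, 0).

def sup_L0(a, n=2001):
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]  # contains -1, 0, 1 exactly
    return max(-a * u - (0.0 if u == 0.0 else 1.0) for u in grid)

def sup_L1(a, n=2001):
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(-a * u - abs(u) for u in grid)

if __name__ == "__main__":
    for a in (-2.5, -1.0, -0.3, 0.0, 0.7, 1.0, 3.0):
        closed_form = max(abs(a) - 1.0, 0.0)
        assert abs(sup_L0(a) - closed_form) < 1e-9
        assert abs(sup_L1(a) - closed_form) < 1e-9
    print("L0 and L1 suprema coincide on the grid")
```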

Theorem 5 justifies using the value function of the $L^{1}$ optimal control problem to obtain the $L^{0}$ optimal control. For example, we can use the sampling-based algorithm recently proposed in [5] to calculate the value function.

In contrast to the deterministic case, where the corresponding HJB equation is of first order, if the second-order HJB equation is uniformly elliptic, then we can expect that the HJB equation with a terminal condition has a unique classical solution. Using this property and Theorem 5, we show that the value function $V^{\sf s}$ is a unique classical solution to the HJB equation under some assumptions. Define

$$C_{b}^{k}(\mathbb{R}^{n})\triangleq\{\rho\in C^{k}(\mathbb{R}^{n}):\rho\text{ and all partial derivatives of }\rho\text{ of order}\leq k\text{ are bounded}\}.$$
Corollary 2.

Suppose the assumptions in Theorem 5 and the following:

  • $(a)$: For any $\rho\in\{f_{0},f_{1},\ldots,f_{m},\sigma\sigma^{\top}\}$, $\rho\in C_{b}^{2}(\mathbb{R}^{n})$;

  • $(b)$: $g\in C_{b}^{3}(\mathbb{R}^{n})$;

  • $(c)$: (Uniform ellipticity) There exists $c>0$ such that, for all $x\in\mathbb{R}^{n}$ and $\xi\in\mathbb{R}^{n}$,

$$\xi^{\top}\sigma\sigma^{\top}(x)\xi\geq c\|\xi\|^{2}.$$

Then, the value function $V^{\sf s}$ is a unique classical solution to the HJB equation (8) with the terminal condition (9).

Proof. By [7, Theorem IV.4.2], the HJB equation (37) with the terminal condition (9) for the $L^{1}$ optimal control problem has a unique bounded classical solution under Assumptions $(a)$, $(b)$, and $(c)$. In other words, the HJB equation (8) with (9) has a unique bounded classical solution. Note that any classical solution of (8) is also a viscosity solution, and that the value function $V^{\sf s}$ is the unique viscosity solution satisfying a polynomial growth condition by Theorem 5. Hence, $V^{\sf s}$ must be the unique classical solution to (8) with (9). $\Box$

Thanks to the above result, we need not consider the non-differentiability of the value function, and we can apply standard numerical methods to solve the HJB equation under Conditions $(a)$, $(b)$, and $(c)$.

In Theorem 5, we have shown the equivalence of the value functions of the $L^{0}$ and $L^{1}$ optimal control problems. Combining this with the discreteness of the $L^{0}$ optimal control, we obtain an equivalence for the optimal controls themselves.

Corollary 3.

Suppose the assumptions in Theorem 4 and let $U_{j}^{-}=-1,U_{j}^{+}=1,\ j=1,\ldots,m$. If an $L^{0}$ optimal control process exists, then it is also an $L^{1}$ optimal control process. Conversely, if an $L^{1}$ optimal control process $\{u_{s}^{1*}\}$ exists, and it holds that

$$|f_{j}(x_{s})\cdot D_{x}V^{\sf s}(s,x_{s})|\neq 1,\ \text{a.e.}\ s\in[t,T],\ \mathbb{P}\text{-a.s.}, \quad (38)$$

where $\{x_{s}\}$ is the corresponding optimal state trajectory, then $\{u_{s}^{1*}\}$ is also an $L^{0}$ optimal control process.

Proof. By Theorem 4, each component of an $L^{0}$ optimal control $\{u_{s}^{*}\}$ takes only the three values $\{-1,0,1\}$, and therefore it holds that $|u_{s}^{*(j)}|=|u_{s}^{*(j)}|^{0}$. In addition, the optimal values of (4) and (35) coincide by Theorem 5. This implies that $\{u_{s}^{*}\}$ is an $L^{1}$ optimal control process. Next, by the same arguments as in the proofs of Theorems 3 and 4, a control process $\{u_{s}\}$ is an $L^{1}$ optimal control process if and only if

$$u_{s}^{(j)}\in\begin{cases}\{-1\},&\text{if}\quad b_{j}(s,x_{s})>1,\\ [-1,0],&\text{if}\quad b_{j}(s,x_{s})=1,\\ \{0\},&\text{if}\quad|b_{j}(s,x_{s})|<1,\\ [0,1],&\text{if}\quad b_{j}(s,x_{s})=-1,\\ \{1\},&\text{if}\quad b_{j}(s,x_{s})<-1,\end{cases}\ s\in[t,T],\ \mathbb{P}\text{-a.s.},$$

where $b_{j}(s,x)=f_{j}(x)\cdot D_{x}V^{\sf s}(s,x)=f_{j}(x)\cdot D_{x}V_{1}^{\sf s}(s,x)$. Therefore, if (38) holds, then each component of the $L^{1}$ optimal control also takes only the three values $\{-1,0,1\}$. Then we obtain the desired result as in the first part of the proof. $\Box$

The condition (38) corresponds to the normality of the $L^{1}$ optimal control problem [17].

Remark 4.

Finally, we would like to point out that all the results obtained in this paper can be extended to the case where a continuous state transition cost $\ell:\mathbb{R}^{n}\rightarrow\mathbb{R}$ is added to our cost functional (4), i.e.,

$$J_{\ell}^{\sf s}(t,x,u)\triangleq\mathbb{E}\left[\int_{t}^{T}\left(\ell(x_{s})+\psi_{0}(u_{s})\right)ds+g(x_{T})\right].$$

Indeed, since $\ell$ does not depend on $u$, the only difference in the associated HJB equation is the additional term $\ell(x)$. Moreover, the continuity of $\ell$ can be used to prove the continuity of the corresponding value function. $\lhd$

Example 2 (Revisited).

Throughout the following examples, we fix a reference probability space and consider a state-feedback controller. We explain the result for (1) in more detail. First, we consider the deterministic case, i.e., $\sigma=0$. We can show that a smooth function having the polynomial growth property satisfies the associated HJB equation. By the uniqueness of the viscosity solution, this is the value function; see Theorem 5. Note also that it is possible to apply [11, Theorem 4] without the Lipschitz continuity of $g$ due to the smoothness of $V^{\sf s}$. Therefore, it can be verified that the $L^{0}$ optimal feedback control $u^{*}(s,x)$ is given by

$$u^{*}(s,x)=\begin{cases}-1,&\text{if }\frac{1}{2}e^{-2c(T-s)}<x,\ 0\leq s\leq T,\\ 0,&\text{if }|x|\leq\frac{1}{2}e^{-2c(T-s)},\ 0\leq s\leq T,\\ 1,&\text{if }x<-\frac{1}{2}e^{-2c(T-s)},\ 0\leq s\leq T.\end{cases} \quad (39)$$

This analysis implies that the Lipschitz continuity of $g$ is not necessary for the value function to be differentiable almost everywhere; see Remark 2.
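Since (39) is explicit, it is straightforward to evaluate; the following sketch (our transcription of (39), with the example's constants $c=1$ and $T=1$ as defaults) implements the dead-zone feedback.

```python
import math

# Deterministic L0 optimal feedback (39): a dead-zone law with the
# time-varying threshold (1/2) * exp(-2c(T - s)).

def u_star(s, x, c=1.0, T=1.0):
    threshold = 0.5 * math.exp(-2.0 * c * (T - s))
    if x > threshold:
        return -1
    if x < -threshold:
        return 1
    return 0  # control switched off inside the dead zone

if __name__ == "__main__":
    # At s = T the threshold is 1/2, so the control is off for |x| <= 1/2.
    print(u_star(1.0, 0.4))   # -> 0
    print(u_star(1.0, 0.6))   # -> -1
    print(u_star(0.0, -0.2))  # threshold = e^{-2}/2 ~ 0.068 -> 1
```

Note that the dead zone shrinks as $s$ decreases: early in the horizon, even small deviations are pushed back toward the origin.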

Next, we consider the stochastic case, i.e., $\sigma>0$. The associated HJB equation is

$$\begin{cases}-v_{t}(t,x)-cxD_{x}v(t,x)-\frac{\sigma^{2}}{2}D_{x}^{2}v(t,x)+\alpha(D_{x}v(t,x))=0,&(t,x)\in[0,T)\times\mathbb{R},\\ v(T,x)=x^{2},&x\in\mathbb{R},\end{cases}$$

where

$$\alpha(p)\triangleq\begin{cases}p-1,&\text{if }p\geq 1,\\ 0,&\text{if }|p|<1,\\ -p-1,&\text{if }p\leq-1.\end{cases}$$

We solve the above HJB equation numerically using a finite difference scheme; see [7, 5] for numerical methods to compute the viscosity solution of the HJB equation. We take $c=1$, $\sigma=0.1$, and $T=1$. The switching boundary $\{(s,x):|D_{x}V^{\sf s}(s,x)|=1\}$ is depicted in Fig. 2. For comparison, we also plot the deterministic optimal switching boundary $\{(s,x):|x|=\frac{1}{2}e^{-2c(T-s)}\}$ obtained in (39). As shown in Fig. 2, the region where the stochastic $L^{0}$ optimal control takes the value $0$ is larger than in the deterministic case. This implies that the stochastic $L^{0}$ optimal control is sparser than the deterministic one, at the price of a larger variance of the terminal state.
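As an illustration of such a computation (our own sketch, not necessarily the scheme used for Fig. 2), one can march this HJB equation backward from $v(T,x)=x^2$ on a truncated grid with an explicit finite-difference discretization, using central differences for $D_xv$ and $D_x^2v$ and evaluating $\alpha$ pointwise; the step sizes below are chosen small enough for stability of the explicit scheme, and the boundary values are crudely frozen at their terminal values.

```python
# Explicit backward finite-difference sketch (illustration only) for
#   -v_t - c*x*v_x - (sigma^2/2)*v_xx + alpha(v_x) = 0,  v(T,x) = x^2.

def alpha(p):
    if p >= 1.0:
        return p - 1.0
    if p <= -1.0:
        return -p - 1.0
    return 0.0

def solve_hjb(c=1.0, sigma=0.1, T=1.0, x_max=2.0, nx=81, nt=4000):
    dx = 2.0 * x_max / (nx - 1)
    dt = T / nt
    xs = [-x_max + i * dx for i in range(nx)]
    v = [x * x for x in xs]              # terminal condition v(T, x) = x^2
    for _ in range(nt):                  # march backward in time
        new = v[:]                       # boundary nodes stay frozen
        for i in range(1, nx - 1):
            vx = (v[i + 1] - v[i - 1]) / (2.0 * dx)
            vxx = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / dx ** 2
            # v_t = -c*x*v_x - (sigma^2/2)*v_xx + alpha(v_x)
            vt = -c * xs[i] * vx - 0.5 * sigma ** 2 * vxx + alpha(vx)
            new[i] = v[i] - dt * vt      # step from t to t - dt
        v = new
    return xs, v

if __name__ == "__main__":
    xs, v = solve_hjb()
    mid = len(xs) // 2                   # grid point x = 0
    print(v[mid] < v[0])                 # cheapest states near the origin -> True
```

The switching boundary in Fig. 2 would then be extracted as the level set $|D_xv|=1$ of the computed solution.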

Figure 2: Stochastic optimal switching boundary obtained by the numerical solution V𝗌V^{\sf s} (blue) and the deterministic optimal switching boundary ((39), red).

\lhd

Example 3.

Next, we consider a simplified load frequency control (LFC) model depicted in Fig. 3; see [15, 14] for more details. The physical meanings of xs(1)x_{s}^{(1)} and xs(2)x_{s}^{(2)} are frequency deviation and its compensation by a thermal plant, respectively. The feedback loop with 1/s1/s and the saturation function

satd(x){d,x<d,x,|x|d,d,x>d,x,{\rm sat}_{d}(x)\triangleq\begin{cases}-d,&x<-d,\\ x,&|x|\leq d,\\ d,&x>d,\end{cases}\hskip 14.22636ptx\in{\mathbb{R}}, (40)

represents the rate limiter, where d>0d>0 characterizes the limited responsiveness of the adjustment of the thermal power generation. An extra compensation, which should not be activated for a long time, is denoted by usu_{s}. The dynamics in Fig. 3 are given by

{dxs(1)=(pxs(1)kxs(2))ds+kusds+kσdws,dxs(2)=satd(xs(1)xs(2))ds,\begin{cases}d{{x}}_{s}^{(1)}&=(-p{{x}}_{s}^{(1)}-k{{x}}_{s}^{(2)})ds+ku_{s}ds+k\sigma dw_{s},\\ d{{x}}_{s}^{(2)}&={\rm sat}_{d}({{x}}_{s}^{(1)}-{{x}}_{s}^{(2)})ds,\end{cases} (41)

where p>0,k>0,σ>0p>0,k>0,\sigma>0. We take p=1/3,k=2,σ=0.5,𝕌=[1,1],T=0.5p=1/3,k=2,\sigma=0.5,{\mathbb{U}}=[-1,1],T=0.5, and g(x)=x2g(x)=\|x\|^{2}. Based on the equivalence result in Theorem 5, we employ the sampling-based method proposed in [5] with radial basis functions to solve the associated HJB equation. Figure 4 compares the obtained switching boundaries at time s=0s=0, i.e., {x:k|(DxV𝗌)(1)(0,x)|=1}\{x:k|(D_{x}V^{\sf s})^{(1)}(0,{{x}})|=1\}, for d=0.4d=0.4 and the linear case (d=+)(d=+\infty). To interpret the result, consider the case x(1)>0{{x}}^{(1)}>0 and x(2)0{{x}}^{(2)}\simeq 0. In this case, x(2)x^{(2)} is expected to increase in order to suppress x(1)x^{(1)}. When the rate limiter prevents the quick adjustment of x(2)x^{(2)}, we need to activate usu_{s}. This is why the region on which the optimal control takes the value 0 is larger for d=+d=+\infty than for d=0.4d=0.4. A similar interpretation applies to the case with x(1)0x^{(1)}\simeq 0 and x(2)>0x^{(2)}>0. \lhd
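As a complement, the closed-loop dynamics (41) can be simulated by the Euler–Maruyama method. The sketch below uses the example's parameters, but the feedback law, initial state, and activation threshold are hypothetical placeholders: the true L0L^{0}-optimal bang-off-bang law requires the numerically computed value function V𝗌V^{\sf s}.

```python
import numpy as np

# Euler-Maruyama simulation of the LFC dynamics (41). The threshold
# feedback below only mimics the bang-off-bang structure; the actual
# optimal law depends on the numerical value function (not computed here).

p, k, sigma, d = 1/3, 2.0, 0.5, 0.4
T, n_steps = 0.5, 500
dt = T / n_steps
rng = np.random.default_rng(0)        # fixed seed for reproducibility

def sat(z, d):
    # rate limiter (40)
    return np.clip(z, -d, d)

def simulate(feedback, x0=(1.0, 0.0)):
    x1, x2 = x0
    for _ in range(n_steps):
        u = feedback(x1, x2)          # u in U = [-1, 1]
        dw = rng.normal(0.0, np.sqrt(dt))
        x1, x2 = (x1 + (-p*x1 - k*x2 + k*u)*dt + k*sigma*dw,
                  x2 + sat(x1 - x2, d)*dt)
    return x1, x2

# hypothetical threshold feedback: act only when the frequency deviation
# is large (the threshold 0.5 is an assumption made for illustration)
bang_off_bang = lambda x1, x2: -np.sign(x1) if abs(x1) > 0.5 else 0.0
x1T, x2T = simulate(bang_off_bang)
```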

Figure 3: Block diagram of the load frequency control system.


Figure 4: Optimal switching boundaries at time s=0s=0 for d=0.4d=0.4 (blue) and the linear case (red), and the optimal control value u0u_{0}^{*}.

6 Conclusions

We have investigated a finite horizon stochastic optimal control problem with the L0L^{0} control cost functional. We have characterized the value function as a viscosity solution to the associated HJB equation and shown an equivalence theorem between the L0L^{0} optimality and the L1L^{1} optimality via the uniqueness of a viscosity solution. Thanks to this equivalence, we have ensured that the value function is a classical solution of the associated HJB equation under some conditions. Moreover, we have derived a necessary and sufficient condition for the L0L^{0} optimality that connects the current state and the current optimal control value. Furthermore, we have revealed the discreteness property of the sparse optimal stochastic control for control-affine systems.

{ack}

This work was supported in part by JSPS KAKENHI under Grant Number JP18H01461 and by JST, ACT-X under Grant Number JPMJAX1902.


Appendix A Moment estimate for the state

Here, we introduce an estimate for the pp-th order moment of the state governed by the stochastic system (3) [19, Theorem 1.2].

Lemma 4.

Fix T>0T>0. Assume (A1)(A_{1}) and let p2p\geq 2 be given. Then there exists a positive constant KpK_{p} such that, for any (t,x)[0,T]×n(t,x)\in[0,T]\times{\mathbb{R}}^{n},

𝔼[suptsTxst,x,up]Kp(1+xp),u𝒰𝗌[t,T].{\mathbb{E}}\left[\sup_{t\leq s\leq T}\|{{x}}_{s}^{t,x,u}\|^{p}\right]\leq K_{p}(1+\|x\|^{p}),\ \forall u\in{\mathcal{U}}^{\sf s}[t,T]. (42)

\lhd

By applying Hölder’s inequality, we obtain the estimate for the first-order moment; that is, (42) also holds for p=1p=1 (with a possibly different constant).
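In more detail, the case p=1p=1 follows from (42) with p=2p=2 and the Cauchy–Schwarz (Hölder) inequality:

```latex
\mathbb{E}\Bigl[\sup_{t\le s\le T}\|x_s^{t,x,u}\|\Bigr]
\le \Bigl(\mathbb{E}\Bigl[\sup_{t\le s\le T}\|x_s^{t,x,u}\|^{2}\Bigr]\Bigr)^{1/2}
\le \bigl(K_2(1+\|x\|^{2})\bigr)^{1/2}
\le \sqrt{K_2}\,(1+\|x\|),
```

where the last inequality uses (1+a^2)^{1/2} \le 1+a for a \ge 0.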

The estimate (42) implies 𝔼[xTt,x,up]<+{\mathbb{E}}[\|{{x}}_{T}^{t,x,u}\|^{p}]<+\infty for any p2p\geq 2. Note that

𝔼[tTψ0(us)𝑑s]m(Tt).{\mathbb{E}}\left[\int_{t}^{T}\psi_{0}(u_{s})ds\right]\leq m(T-t).

Hence, the growth condition in (A2)(A_{2}) ensures that the cost functional J𝗌(t,x,u)J^{\sf s}(t,x,u) has a finite value for any (t,x,u)[0,T]×n×𝒰𝗌[t,T](t,x,u)\in[0,T]\times{\mathbb{R}}^{n}\times{\mathcal{U}}^{\sf s}[t,T].

Appendix B Continuity of H𝗌H^{\sf s}

Lemma 5.

If ff and σ\sigma satisfy (5), then H𝗌H^{\sf s} defined by (10) is continuous on n×n×𝒮n{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}.

{pf}

Fix ε>0\varepsilon>0 and (x,p,M)n×n×𝒮n(x,p,M)\in{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}. By definition of H𝗌H^{\sf s}, there exists u¯𝕌\bar{u}\in{\mathbb{U}} such that

H𝗌(x,p,M)ε<f(x,u¯)p12tr(σσ(x,u¯)M)ψ0(u¯).H^{\sf s}(x,p,M)-\varepsilon<-f(x,\bar{u})\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,\bar{u})M)-\psi_{0}(\bar{u}). (43)

Therefore, for any (y,q,N)n×n×𝒮n(y,q,N)\in{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n},

H𝗌(x,p,M)H𝗌(y,q,N)\displaystyle H^{\sf s}(x,p,M)-H^{\sf s}(y,q,N)
f(x,u¯)p12tr(σσ(x,u¯)M)ψ0(u¯)+ε\displaystyle\leq-f(x,\bar{u})\cdot p-\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(x,\bar{u})M)-\psi_{0}(\bar{u})+\varepsilon
+f(y,u¯)q+12tr(σσ(y,u¯)N)+ψ0(u¯)\displaystyle\quad+f(y,\bar{u})\cdot q+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N)+\psi_{0}(\bar{u})
=f(y,u¯)qf(x,u¯)p\displaystyle=f(y,\bar{u})\cdot q-f(x,\bar{u})\cdot p
+12tr(σσ(y,u¯)Nσσ(x,u¯)M)+ε.\displaystyle\quad+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N-\sigma\sigma^{\top}(x,\bar{u})M)+\varepsilon.

Note that ff and σ\sigma are continuous by (5), and thus there exists δ>0\delta>0 such that, for any (y,q,N)B((x,p,M),δ)(y,q,N)\in B((x,p,M),\delta),

f(y,u¯)qf(x,u¯)p+12tr(σσ(y,u¯)Nσσ(x,u¯)M)<ε.f(y,\bar{u})\cdot q-f(x,\bar{u})\cdot p+\frac{1}{2}\mathrm{tr}(\sigma\sigma^{\top}(y,\bar{u})N-\sigma\sigma^{\top}(x,\bar{u})M)<\varepsilon.

Hence, for any (y,q,N)B((x,p,M),δ)(y,q,N)\in B((x,p,M),\delta),

H𝗌(x,p,M)H𝗌(y,q,N)<2ε.H^{\sf s}(x,p,M)-H^{\sf s}(y,q,N)<2\varepsilon.

Similarly, H𝗌(y,q,N)H𝗌(x,p,M)<2εH^{\sf s}(y,q,N)-H^{\sf s}(x,p,M)<2\varepsilon also holds. This shows the continuity of H𝗌H^{\sf s}. \Box

Appendix C Viscosity solution

Here, we briefly introduce the notion of a viscosity solution [19]. Let H:n×n×𝒮nH:{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}\to{\mathbb{R}} be a continuous function that satisfies the following condition:

H(x,p,M)H(x,p,N),ifMN𝒮+n.H(x,p,M)\leq H(x,p,N),\ \textrm{if}\ M-N\in{\mathcal{S}}_{+}^{n}. (44)

Consider a second-order partial differential equation

{vt(t,x)+H(x,Dxv(t,x),Dx2v(t,x))=0,(t,x)[0,T)×n,v(T,x)=g(x),xn.\displaystyle\begin{cases}-v_{t}({t,x})+H(x,D_{x}v({t,x}),D_{x}^{2}v({t,x}))=0,\\ \hskip 85.35826pt({t,x})\in[0,T)\times{\mathbb{R}}^{n},\\ v(T,x)=g(x),\ x\in{\mathbb{R}}^{n}.\end{cases} (45)

A function vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) is said to be a viscosity subsolution of (45) if

v(T,x)g(x),xnv(T,x)\leq g(x),\ \forall x\in{\mathbb{R}}^{n}

and, for any ϕC1,2([0,T)×n)C([0,T]×n)\phi\in C^{1,2}([0,T)\times{\mathbb{R}}^{n})\cap C([0,T]\times{\mathbb{R}}^{n}),

ϕt(t0,x0)+H(x0,Dxϕ(t0,x0),Dx2ϕ(t0,x0))0-\phi_{t}(t_{0},x_{0})+H(x_{0},D_{x}\phi(t_{0},x_{0}),D_{x}^{2}\phi(t_{0},x_{0}))\leq 0 (46)

at any global maximum point (t0,x0)[0,T)×n(t_{0},x_{0})\in[0,T)\times{\mathbb{R}}^{n} of vϕv-\phi. Similarly, a function vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) is said to be a viscosity supersolution of (45) if

v(T,x)g(x),xnv(T,x)\geq g(x),\ \forall x\in{\mathbb{R}}^{n}

and, for any ϕC1,2([0,T)×n)C([0,T]×n)\phi\in C^{1,2}([0,T)\times{\mathbb{R}}^{n})\cap C([0,T]\times{\mathbb{R}}^{n}),

ϕt(t0,x0)+H(x0,Dxϕ(t0,x0),Dx2ϕ(t0,x0))0-\phi_{t}(t_{0},x_{0})+H(x_{0},D_{x}\phi(t_{0},x_{0}),D_{x}^{2}\phi(t_{0},x_{0}))\geq 0 (47)

at any global minimum point (t0,x0)[0,T)×n(t_{0},x_{0})\in[0,T)\times{\mathbb{R}}^{n} of vϕv-\phi. Finally, vv is said to be a viscosity solution of (45) if it is both a viscosity subsolution and a viscosity supersolution.
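A standard one-dimensional illustration of these definitions (a stationary example, not taken from this paper) shows how the sub- and supersolution tests single out one non-smooth candidate. Consider the eikonal equation

```latex
|v'(x)| - 1 = 0 \quad \text{on } (-1,1), \qquad v(-1) = v(1) = 0.
```

The function v(x) = 1 - |x| is a viscosity solution: any smooth \phi touching vv from above at the kink x=0 has \phi'(0) \in [-1,1], so the subsolution inequality |\phi'(0)| - 1 \le 0 holds, while no smooth \phi touches vv from below at x=0, so the supersolution test holds vacuously there. In contrast, v(x) = |x| - 1 is not a viscosity solution: the constant \phi \equiv -1 touches it from below at x=0 with \phi'(0) = 0, violating |\phi'(0)| - 1 \ge 0.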

Next, we define the second-order right parabolic superdifferential and subdifferential, which are used in Lemma 3 and Theorem 3. For vC([0,T]×n)v\in C([0,T]\times{\mathbb{R}}^{n}) with T>0T>0, the second-order right parabolic superdifferential of vv at (t,x)[0,T)×n({t,x})\in[0,T)\times{\mathbb{R}}^{n} is defined by

Dt+,x1,2,+v(t,x){(q,p,M)×n×𝒮n:\displaystyle D_{t+,x}^{1,2,+}v({t,x})\triangleq\Bigl{\{}(q,p,M)\in{\mathbb{R}}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}:
lim supst,s[0,T)yx1|st|+yx2(v(s,y)v(t,x)\displaystyle\quad\limsup_{\begin{subarray}{c}s\searrow t,s\in[0,T)\\ y\rightarrow x\end{subarray}}\frac{1}{|s-t|+\|y-x\|^{2}}\bigl{(}v(s,y)-v(t,x)
q(st)p(yx)12(yx)M(yx))0}.\displaystyle\quad-q(s-t)-p\cdot(y-x)-\frac{1}{2}(y-x)^{\top}M(y-x)\bigr{)}\leq 0\Bigr{\}}.

Similarly, the second-order right parabolic subdifferential of vv at (t,x)[0,T)×n({t,x})\in[0,T)\times{\mathbb{R}}^{n} is defined by

Dt+,x1,2,v(t,x){(q,p,M)×n×𝒮n:\displaystyle D_{t+,x}^{1,2,-}v({t,x})\triangleq\Bigl{\{}(q,p,M)\in{\mathbb{R}}\times{\mathbb{R}}^{n}\times{\mathcal{S}}^{n}:
lim infst,s[0,T)yx1|st|+yx2(v(s,y)v(t,x)\displaystyle\quad\liminf_{\begin{subarray}{c}s\searrow t,s\in[0,T)\\ y\rightarrow x\end{subarray}}\frac{1}{|s-t|+\|y-x\|^{2}}\bigl{(}v(s,y)-v(t,x)
q(st)p(yx)12(yx)M(yx))0}.\displaystyle\quad-q(s-t)-p\cdot(y-x)-\frac{1}{2}(y-x)^{\top}M(y-x)\bigr{)}\geq 0\Bigr{\}}.

If vv admits vt,Dxvv_{t},D_{x}v, and Dx2vD_{x}^{2}v at (t0,x0)(0,T)×n(t_{0},x_{0})\in(0,T)\times{\mathbb{R}}^{n}, it holds that

(vt(t0,x0),Dxv(t0,x0),Dx2v(t0,x0))\displaystyle\left(v_{t}(t_{0},x_{0}),D_{x}v(t_{0},x_{0}),D_{x}^{2}v(t_{0},x_{0})\right)
Dt+,x1,2,+v(t0,x0)Dt+,x1,2,v(t0,x0).\displaystyle\qquad\in D_{t+,x}^{1,2,+}v(t_{0},x_{0})\cap D_{t+,x}^{1,2,-}v(t_{0},x_{0}). (48)

References

  • [1] Walter Alt and Christopher Schneider. Linear-quadratic control problems with L1{L}^{1}-control cost. Optimal Control Applications and Methods, 36(4):512–534, 2015.
  • [2] Michael Athans and Peter L Falb. Optimal Control: An Introduction to the Theory and Its Applications. Dover Publications, 1966.
  • [3] Nicolas Bouleau and Francis Hirsch. Dirichlet Forms and Analysis on Wiener Space, volume 14. Walter de Gruyter, 2010.
  • [4] David L Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.
  • [5] Ioannis Exarchos, Evangelos A Theodorou, and Panagiotis Tsiotras. Stochastic L1L^{1}-optimal control via forward and backward sampling. Systems & Control Letters, 118:101–108, 2018.
  • [6] Giorgio Fabbri, Fausto Gozzi, and Andrzej Swiech. Stochastic Optimal Control in Infinite Dimension. Springer, 2017.
  • [7] Wendell H Fleming and Halil Mete Soner. Controlled Markov Processes and Viscosity Solutions, volume 25. Springer Science & Business Media, 2006.
  • [8] Roland Herzog, Georg Stadler, and Gerd Wachsmuth. Directional sparsity in optimal control of partial differential equations. SIAM Journal on Control and Optimization, 50(2):943–963, 2012.
  • [9] Takuya Ikeda and Kenji Kashima. Sparsity-constrained controllability maximization with application to time-varying control node selection. IEEE Control Systems Letters, 2(3):321–326, 2018.
  • [10] Takuya Ikeda and Kenji Kashima. On sparse optimal control for general linear systems. IEEE Transactions on Automatic Control, 64(5):2077–2083, 2019.
  • [11] Takuya Ikeda and Kenji Kashima. Sparse optimal feedback control for continuous-time systems. 2019 European Control Conference (ECC), pages 3728–3733, 2019.
  • [12] Takuya Ikeda, Masaaki Nagahara, and Shunsuke Ono. Discrete-valued control of linear time-invariant systems by sum-of-absolute-values optimization. IEEE Transactions on Automatic Control, 62(6):2750–2763, 2016.
  • [13] Kaito Ito, Takuya Ikeda, and Kenji Kashima. Continuity of the value function for stochastic sparse optimal control. IFAC-PapersOnLine, 53(2):7179–7184, 2020.
  • [14] Kenji Kashima, Hiroki Aoyama, and Yoshito Ohta. Stable process approach to analysis of systems under heavy-tailed noise: Modeling and stochastic linearization. IEEE Transactions on Automatic Control, 64(4):1344–1357, 2019.
  • [15] Kenji Kashima, Masakazu Kato, Jun-ichi Imura, and Kazuyuki Aihara. Probabilistic evaluation of interconnectable capacity for wind power generation: Stochastic linearization approach. European Physical Journal: Special Topics, 223(12):2493–2501, 2014.
  • [16] Karl Kunisch, Konstantin Pieper, and Boris Vexler. Measure valued directional sparsity for parabolic optimal control problems. SIAM Journal on Control and Optimization, 52(5):3078–3108, 2014.
  • [17] Masaaki Nagahara, Daniel E Quevedo, and Dragan Nešić. Maximum hands-off control: a paradigm of control effort minimization. IEEE Transactions on Automatic Control, 61(3):735–747, 2016.
  • [18] Masaaki Nagahara, Daniel E Quevedo, and Jan Østergaard. Sparse packetized predictive control for networked control over erasure channels. IEEE Transactions on Automatic Control, 59(7):1899–1905, 2014.
  • [19] Makiko Nisio. Stochastic Control Theory: Dynamic Programming Principle, volume 72. Springer, 2014.
  • [20] Alex Olshevsky. On a relaxation of time-varying actuator placement. IEEE Control Systems Letters, 4(3):656–661, 2020.
  • [21] Huyên Pham. Continuous-time Stochastic Control and Optimization with Financial Applications, volume 61. Springer Science & Business Media, 2009.
  • [22] Georg Stadler. Elliptic optimal control problems with L1L^{1}-control cost and applications for the placement of control devices. Computational Optimization and Applications, 44(2):159–181, 2009.
  • [23] Georg Vossen and Helmut Maurer. On L1{L}^{1}-minimization in optimal control and applications to robotics. Optimal Control Applications and Methods, 27(6):301–321, 2006.
  • [24] Jiongmin Yong and Xun Yu Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations, volume 43. Springer Science & Business Media, 1999.