
thanks: This work was supported in part by JSPS KAKENHI under Grant Numbers JP18H01461 and JP21H04875.

Matrix Pontryagin principle approach to controllability metrics maximization under sparsity constraints

Tomofumi Ohtsuka (Kyoto University)       Takuya Ikeda (The University of Kitakyushu)       Kenji Kashima (Kyoto University)

Graduate School of Informatics, Kyoto University, Kyoto, Japan (e-mail: [email protected], [email protected]).
Faculty of Environmental Engineering, The University of Kitakyushu, Fukuoka, Japan (e-mail: [email protected]).
Abstract

The controllability maximization problem under sparsity constraints is a node selection problem that selects inputs effective for control so as to minimize the energy required to reach a desired state. In this paper, we discuss the equivalence between sparsity-constrained controllability metrics maximization problems and their convex relaxations. The proof is based on the matrix-valued Pontryagin maximum principle applied to the controllability Lyapunov differential equation.

keywords:
Sparse optimal control, node selection problem, controllability maximization

1 Introduction

Sparse optimal control problems have attracted a lot of attention in the field of optimal control. Such an approach is useful for finding a small amount of essential information that is closely related to the control performance of interest, and it has been applied widely; see, for example, Ikeda et al. (2021). This paper investigates the application of sparse optimization to the controllability maximization problem, one of the control node selection problems, which seeks a node selection that minimizes the energy required to drive the system to a desired state.

These problems are generally formulated as the maximization of some metric of the controllability Gramian under $L^{0}/l^{0}$ constraints, but it is known that such problems have a combinatorial structure. To circumvent this, relaxed problems, where the $L^{0}/l^{0}$ norms are replaced by the $L^{1}/l^{1}$ norms, are considered for computational tractability. The question is then how to prove the equivalence between the main problem and its relaxation. The paper Ikeda and Kashima (2018) proved the equivalence when the trace of the controllability Gramian is adopted as the metric, but the usefulness of the trace as a metric is questionable: the designed Gramian may have a zero eigenvalue, so the trace metric does not automatically ensure controllability. The paper Ikeda and Kashima (2022) considered the minimum eigenvalue and the determinant of the controllability Gramian, which are more useful as metrics, but it avoided the equivalence proof because of its difficulty and instead treated approximation problems for which equivalence is easy to prove. In view of this, this paper newly proposes a method to prove the equivalence for general metrics of controllability. Specifically, we adopt the controllability Lyapunov differential equation, a matrix-valued differential equation whose solution is the controllability Gramian. By considering the optimal control problem for this Lyapunov differential equation, we can rigorously treat useful metrics related to the controllability Gramian.

The remainder of this paper is organized as follows. Section 2 provides mathematical preliminaries. Section 3 formulates our node scheduling problem using the controllability Lyapunov differential equation, and gives a sufficient condition for the main problem to boil down to the relaxed problem. Section 4 offers concluding remarks.

Notation

We denote the set of all positive integers by $\mathbb{N}$ and the set of all real numbers by $\mathbb{R}$. Let $n,m\in\mathbb{N}$. We denote the zero matrix of size $n\times m$ by $O_{n\times m}$ and the identity matrix of size $n$ by $I_{n}$. For any $A,B\in\mathbb{R}^{n\times m}$, we denote the Frobenius norm of $A$ by $\|A\|\triangleq\left(\sum_{i=1}^{n}\sum_{j=1}^{m}A_{i,j}^{2}\right)^{1/2}$, and the inner product of $A$ and $B$ by $(A,B)\triangleq\sum_{i=1}^{n}\sum_{j=1}^{m}A_{i,j}B_{i,j}$. Let $C$ be a closed subset of $\mathbb{R}^{n\times m}$ and $A\in C$. A matrix $\Delta\in\mathbb{R}^{n\times m}$ is a proximal normal to the set $C$ at the point $A$ if and only if there exists a constant $\sigma\geq 0$ such that $(\Delta,B-A)\leq\sigma\|B-A\|^{2}$ for all $B\in C$. The proximal normal cone to $C$ at $A$, denoted by $N_{C}^{P}(A)$, is defined as the set of all such $\Delta$. We denote the limiting normal cone to $C$ at $A$ by $N_{C}^{L}(A)$, i.e., $N_{C}^{L}(A)\triangleq\{\Delta=\lim_{i\rightarrow\infty}\Delta_{i}:\Delta_{i}\in N_{C}^{P}(A_{i}),\ A_{i}\rightarrow A,\ A_{i}\in C\}$. For other notations, see (Ikeda and Kashima, 2022, Section II).

2 Preliminary

Let us consider the following continuous-time linear system

\begin{split}&\dot{x}(t)=Ax(t)+BV(t)u(t),\quad t\in[0,T],\\ &V(t)=\mathrm{diag}(v(t)),\quad v(t)\in\{0,1\}^{p},\end{split} (1)

where $x(t)\in\mathbb{R}^{n}$ is the state vector consisting of $n$ nodes, $x_{i}(t)$ is the state of the $i$-th node at time $t$, and $u(\cdot)\in\mathbb{R}^{p}$ is the exogenous control input that influences the network dynamics. Then the controllability Gramian for the system is defined by

G_{c}=\int_{0}^{T}e^{A(T-\tau)}BV(\tau)V(\tau)^{\top}B^{\top}e^{A^{\top}(T-\tau)}\,d\tau. (2)

We next explain why the controllability Gramian is used as a metric of the ease of control. We first recall the minimum-energy control problem:

\begin{split}\min_{u}\quad&\int_{0}^{T}\|u(t)\|^{2}\,dt\\ \mathrm{s.t.}\quad&\dot{x}(t)=Ax(t)+BV(t)u(t),\\ &x(0)=0,\quad x(T)=x_{f}.\end{split} (3)

The minimum control energy is then given by $x_{f}^{\top}G_{c}^{-1}x_{f}$ (Verriest and Kailath (1983)). Based on this, recent works have sought to make $G_{c}$ as large as possible. In this paper, we design $BV(t)$ in order to maximize some metric of the controllability Gramian. As constraints, we impose $L^{0}$ and $l^{0}$ constraints on $v(t)$ to account for an upper bound on the total activation time of each node and on the number of activated nodes at each time. We consider the following optimization problem, which maximizes a metric of $G_{c}$ under sparsity constraints:

\begin{split}\max_{v}\quad&J(v)=K(G_{c})\\ \mathrm{s.t.}\quad&v(t)\in\{0,1\}^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{0}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{0}}\leq\beta\quad\forall t\in[0,T],\end{split} (4)

where $K(G_{c})$ is a metric of the controllability Gramian, and $\alpha_{j}>0$ and $\beta>0$ are constants.
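To make the minimum-energy interpretation of the Gramian concrete, the following Python sketch (purely illustrative; the matrices $A$, $B$, the horizon $T$, and the target $x_{f}$ are arbitrary choices, and all nodes are kept active, i.e., $V(t)=I$) steers a small system with the classical minimum-energy input $u^{*}(t)=B^{\top}e^{A^{\top}(T-t)}G_{c}^{-1}x_{f}$ and checks that the spent energy equals $x_{f}^{\top}G_{c}^{-1}x_{f}$, as claimed after (3).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, trapezoid

# Hypothetical data for illustration only: a 2-node system with all
# nodes active (V(t) = I), horizon T, and target state x_f.
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.eye(2)
T = 1.0
xf = np.array([1.0, -1.0])

# Controllability Gramian (2) by quadrature, with V(t) = I.
s = np.linspace(0.0, T, 4001)
M = np.array([expm(A * (T - si)) @ B for si in s])
Gc = trapezoid(np.einsum("kij,klj->kil", M, M), s, axis=0)
Gc_inv_xf = np.linalg.solve(Gc, xf)

# Minimum-energy input u*(t) = B^T e^{A^T (T-t)} Gc^{-1} x_f.
def u_star(t):
    return B.T @ expm(A.T * (T - t)) @ Gc_inv_xf

# Simulate x' = A x + B u*(t) from x(0) = 0 and accumulate the energy
# integral as an extra state component.
def rhs(t, z):
    u = u_star(t)
    return np.concatenate([A @ z[:2] + B @ u, [u @ u]])

sol = solve_ivp(rhs, (0.0, T), np.zeros(3), rtol=1e-9, atol=1e-12)
x_T, energy = sol.y[:2, -1], sol.y[2, -1]

print(np.allclose(x_T, xf, atol=1e-4))                 # steered to x_f
print(np.isclose(energy, xf @ Gc_inv_xf, rtol=1e-3))   # energy = x_f^T Gc^{-1} x_f
```

The larger $G_{c}$ (e.g., the larger its smallest eigenvalue), the smaller this minimum energy can be, which motivates maximizing a metric $K(G_{c})$.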

Since the maximization problem in (4) is a combinatorial optimization problem, we consider the following relaxation problem:

\begin{split}\max_{v}\quad&J(v)=K(G_{c})\\ \mathrm{s.t.}\quad&v(t)\in[0,1]^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{1}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{1}}\leq\beta\quad\forall t\in[0,T].\end{split} (5)

This problem is easier to handle than the main problem; in particular, if $K$ is concave, problem (5) is a convex optimization problem. We, however, have to establish the equivalence between the main problem and the corresponding relaxed problem. Ikeda and Kashima (2022) formulated alternative approximation problems instead of proving the equivalence. This paper proves the equivalence between the main problem and the relaxed one by using the controllability Lyapunov differential equation.

3 Proposed method

In this section, we formulate the controllability Lyapunov differential equation, whose solution is the controllability Gramian, and then formulate an optimization problem for the system whose state-space representation is given by this differential equation. We provide an equivalence theorem between the main problem and the corresponding relaxed problem.

3.1 Problem formulation and relaxation

The controllability Lyapunov differential equation is given as follows:

\begin{split}&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)V(t)^{\top}B^{\top},\\ &G_{c}(0)=O_{n\times n}.\end{split} (6)

Then the controllability Gramian $G_{c}$ defined by (2) corresponds to the solution $G_{c}(T)$ of (6) at $t=T$. Here we consider the following optimal control problem.

Problem 1 (Main problem).
\begin{split}\max_{v}\quad&J(v)=K(G_{c}(T))\\ \mathrm{s.t.}\quad&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)B^{\top},\\ &G_{c}(0)=O_{n\times n},\\ &v(t)\in\{0,1\}^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{0}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{0}}\leq\beta\quad\forall t\in[0,T].\end{split} (7)

Note that $V(\cdot)V(\cdot)^{\top}=V(\cdot)$ since $v(\cdot)\in\{0,1\}^{p}$, which is why the controllability Lyapunov differential equation is rewritten as in (7). Problem 1 is a combinatorial optimization problem, so we consider the following relaxed problem, where the $L^{0}/l^{0}$ norms are replaced by the $L^{1}/l^{1}$ norms, respectively.

Problem 2 (Relaxed problem).
\begin{split}\max_{v}\quad&J(v)=K(G_{c}(T))\\ \mathrm{s.t.}\quad&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)B^{\top},\\ &G_{c}(0)=O_{n\times n},\\ &v(t)\in[0,1]^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{1}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{1}}\leq\beta\quad\forall t\in[0,T].\end{split} (8)

In what follows, we suppose that $K$ is continuously differentiable.
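The correspondence between the integral (2) and the Lyapunov differential equation (6) is easy to verify numerically. The sketch below (illustrative values; a constant activation pattern $v$ is used only to keep the example short, whereas Problems 1 and 2 allow time-varying $v$) integrates (6) forward and compares its terminal value with the Gramian computed by quadrature of (2).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, trapezoid

# Illustrative data: 2 nodes, only node 1 active over the whole horizon.
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.eye(2)
V = np.diag([1.0, 0.0])   # binary activation, so V V^T = V
T = 1.0

# Right-hand side of the Lyapunov differential equation (6),
# with the matrix state flattened into a vector for the ODE solver.
def lyap_rhs(t, g):
    G = g.reshape(2, 2)
    return (A @ G + G @ A.T + B @ V @ B.T).ravel()

G_ode = solve_ivp(lyap_rhs, (0.0, T), np.zeros(4),
                  rtol=1e-10, atol=1e-12).y[:, -1].reshape(2, 2)

# Gramian by direct quadrature of the integral (2).
s = np.linspace(0.0, T, 2001)
M = np.array([expm(A * (T - si)) @ B @ V for si in s])
G_int = trapezoid(np.einsum("kij,klj->kil", M, M), s, axis=0)

print(np.allclose(G_ode, G_int, atol=1e-5))   # solution of (6) at t=T equals (2)
```

This is the reformulation that lets the Gramian metric $K(G_{c}(T))$ be treated as a terminal cost of a matrix-valued control system.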

3.2 Discreteness and equivalence

We define the sets of feasible solutions of Problem 1 and Problem 2 by $\mathcal{V}_{0}$ and $\mathcal{V}_{1}$, respectively, i.e.,

\mathcal{V}_{0}\triangleq\{v:v(t)\in\{0,1\}^{p}\ \forall t,\ \|v_{j}\|_{L^{0}}\leq\alpha_{j}\ \forall j,\ \|v(t)\|_{l^{0}}\leq\beta\ \forall t\},
\mathcal{V}_{1}\triangleq\{v:v(t)\in[0,1]^{p}\ \forall t,\ \|v_{j}\|_{L^{1}}\leq\alpha_{j}\ \forall j,\ \|v(t)\|_{l^{1}}\leq\beta\ \forall t\}.

Note that $\mathcal{V}_{0}\subset\mathcal{V}_{1}$, since $\|v_{j}\|_{L^{1}}=\|v_{j}\|_{L^{0}}$ for all $j$ and $\|v(t)\|_{l^{1}}=\|v(t)\|_{l^{0}}$ on $[0,T]$ for any measurable function $v$ with $v(t)\in\{0,1\}^{p}$ on $[0,T]$. The inclusion is proper in general, since the $L^{1}/l^{1}$ constraints do not imply the $L^{0}/l^{0}$ constraints and $\mathcal{V}_{1}$ contains functions that are not binary-valued. We first show the discreteness of solutions of Problem 2, which guarantees that every optimal solution of Problem 2 belongs to the set $\mathcal{V}_{0}$. For this purpose, we prepare two lemmas.

Lemma 1 (Matrix Pontryagin principle).

Let us consider the following optimization problem

\begin{split}\min_{U}\quad&J=L_{f}(X(T))\\ \mathrm{s.t.}\quad&\dot{X}(t)=F(X(t),U(t)),\\ &X(0)=X_{0},\quad X(T)\in E,\quad U(t)\in\Omega,\end{split} (9)

where $L_{f}$ is continuously differentiable, $F$ is continuous, $D_{X}F(X(t),U(t))$ is continuous with respect to $t,X,U$, $X(t)\in\mathbb{R}^{n\times m}$, $U(t)\in\mathbb{R}^{p\times q}$, $X_{0}\in\mathbb{R}^{n\times m}$, $T>0$, $E\subset\mathbb{R}^{n\times m}$, and $\Omega\subset\mathbb{R}^{p\times q}$. Note that $(L_{f},F,X_{0},T,E,\Omega)$ is given. We define the Hamiltonian function $H:\mathbb{R}^{n\times m}\times\mathbb{R}^{n\times m}\times\mathbb{R}^{p\times q}\to\mathbb{R}$ associated with problem (9) by

H(X(t),P(t),U(t))=\mathrm{Tr}\left(P(t)^{\top}F(X(t),U(t))\right). (10)

Let the process $(X^{*}(t),U^{*}(t))$ be a local minimizer for problem (9). Then there exist a matrix-valued function $P:[0,T]\rightarrow\mathbb{R}^{n\times m}$ and a scalar $\eta$ equal to 0 or 1 satisfying the following conditions:

  • the nontriviality condition:

    $(\eta,P(t))\neq 0\quad\forall t\in[0,T]$, (11)
  • the transversality condition:

    $-P(T)\in\eta\nabla L_{f}(X^{*}(T))+N_{E}^{L}(X^{*}(T))$, (12)
  • the adjoint equation for almost every $t\in[0,T]$:

    $-\dot{P}(t)=D_{X}H(X^{*}(t),P(t),U^{*}(t))$, (13)
  • the maximum condition for almost every $t\in[0,T]$:

    $H(X^{*}(t),P(t),U^{*}(t))=\sup_{U\in\Omega}H(X^{*}(t),P(t),U)$. (14)
Proof.

We define a mapping $\psi_{nm}:\mathbb{R}^{n\times m}\to\mathbb{R}^{nm}$ by

\psi_{nm}(X)=\begin{bmatrix}X_{1}^{\top}&\dots&X_{m}^{\top}\end{bmatrix}^{\top}, (15)

where $X_{i}\in\mathbb{R}^{n}$ denotes the $i$-th column of a matrix $X\in\mathbb{R}^{n\times m}$. From Athans (1967), $\psi_{nm}$ is a regular linear mapping (hence $\psi_{nm}^{-1}$ exists) and preserves the inner product. Then problem (9) is equivalent to

\begin{split}\min_{u}\quad&J=l_{f}(x(T))\\ \mathrm{s.t.}\quad&\dot{x}(t)=f(x(t),u(t)),\\ &x(0)=x_{0},\quad x(T)\in e,\quad u(t)\in\omega,\end{split} (16)

where $x=\psi_{nm}(X)$, $u=\psi_{pq}(U)$, and $l_{f},f,x_{0},e,\omega$ correspond to $L_{f},F,X_{0},E,\Omega$, respectively. We define the Hamiltonian function $h:\mathbb{R}^{nm}\times\mathbb{R}^{nm}\times\mathbb{R}^{pq}\to\mathbb{R}$ associated with problem (16) by $h(x(t),p(t),u(t))=p(t)^{\top}f(x(t),u(t))$ and denote the local minimizer for problem (16) by $(x^{*}(t),u^{*}(t))$. Then there exist an arc $p:[0,T]\rightarrow\mathbb{R}^{nm}$ and a scalar $\eta$ equal to 0 or 1 satisfying the following conditions (Pontryagin's maximum principle; Clarke (2013)):

  • the nontriviality condition:

    $(\eta,p(t))\neq 0\quad\forall t\in[0,T]$, (17)
  • the transversality condition:

    $-p(T)\in\eta\nabla l_{f}(x^{*}(T))+N_{e}^{L}(x^{*}(T))$, (18)
  • the adjoint equation for almost every $t\in[0,T]$:

    $-\dot{p}(t)=D_{x}h(x^{*}(t),p(t),u^{*}(t))$, (19)
  • the maximum condition for almost every $t\in[0,T]$:

    $h(x^{*}(t),p(t),u^{*}(t))=\sup_{u\in\omega}h(x^{*}(t),p(t),u)$. (20)

Since $\psi_{nm}^{-1}$ exists, we obtain the Hamiltonian function associated with $h(x^{*}(t),p(t),u^{*}(t))$ as follows:

H(X^{*}(t),P(t),U^{*}(t))=\mathrm{Tr}\left(P^{\top}(t)F(X^{*}(t),U^{*}(t))\right), (21)

which satisfies (11), (12), (13), and (14), where $X^{*}=\psi_{nm}^{-1}(x^{*})$, $U^{*}=\psi_{pq}^{-1}(u^{*})$, and $P=\psi_{nm}^{-1}(p)$. This completes the proof. ∎
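The key identity behind this proof, namely that the vectorization $\psi_{nm}$ preserves inner products so the matrix Hamiltonian $\mathrm{Tr}(P^{\top}F)$ in (10) coincides with the vector Hamiltonian $p^{\top}f$, can be checked directly; the sketch below uses random matrices as stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2

# psi_nm: stack the columns of X into a vector (column-major vectorization).
def psi(X):
    return X.flatten(order="F")

P = rng.normal(size=(n, m))
F = rng.normal(size=(n, m))

# Inner-product preservation: Tr(P^T F) = psi(P)^T psi(F),
# so the matrix and vector Hamiltonians agree.
lhs = np.trace(P.T @ F)
rhs = psi(P) @ psi(F)
print(np.isclose(lhs, rhs))  # True

# psi is linear and invertible: reshaping recovers the matrix.
print(np.array_equal(psi(P).reshape((n, m), order="F"), P))  # True
```

Because of this isometry, every conclusion of the classical (vector) maximum principle transfers verbatim to the matrix-valued setting.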

Lemma 2.

Define a set

E\triangleq\{A\in\mathbb{R}^{n\times n}:A_{i,j}\leq\alpha_{i,j},\ (i,j)\in\mathcal{I}\}, (22)

and fix any $\gamma\in E$, where $\mathcal{I}\subset\{1,\dots,n\}\times\{1,\dots,n\}$ is the set of positions $(i,j)$ of entries of $A$ subject to inequality constraints. Then any $\delta\in N_{E}^{L}(\gamma)$ satisfies

\delta_{i,j}(\gamma_{i,j}-\alpha_{i,j})=0\quad\forall(i,j)\in\mathcal{I}, (23)
\delta_{i,j}\geq 0\quad\forall(i,j)\in\mathcal{I}, (24)
\delta_{i,j}=0\quad\forall(i,j)\notin\mathcal{I}. (25)
Proof.

Fix any $\hat{A}\in E$ and let $\hat{a}=\psi_{nn}(\hat{A})$, where $\psi_{nn}$ is the mapping from (15). Then we obtain a set $e$ satisfying $\hat{a}\in e$ as follows:

e\triangleq\{a\in\mathbb{R}^{n^{2}}:a_{j}\leq\alpha^{\prime}_{j},\ j\in\mathcal{I}^{\prime}\}, (26)

where $\alpha^{\prime}=\psi_{nn}(\alpha)$ and $\mathcal{I}^{\prime}\subset\{1,\dots,n^{2}\}$ is the index set corresponding to $\mathcal{I}$. Take any $\gamma^{\prime}\in e$; then we have

\delta^{\prime}_{i}(\gamma^{\prime}_{i}-\alpha^{\prime}_{i})=0\quad\forall i\in\mathcal{I}^{\prime}, (27)
\delta^{\prime}_{i}\geq 0\quad\forall i\in\mathcal{I}^{\prime}, (28)
\delta^{\prime}_{i}=0\quad\forall i\notin\mathcal{I}^{\prime} (29)

for all $\delta^{\prime}\in N_{e}^{L}(\gamma^{\prime})$ (Ikeda and Kashima (2022)). Finally, we obtain (23), (24), and (25) with $\delta=\psi_{nn}^{-1}(\delta^{\prime})$ and $\gamma=\psi_{nn}^{-1}(\gamma^{\prime})$. ∎
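Lemma 2 matches the intuition that normals to the box-like set $E$ point outward along the active constraints. A small numerical sanity check (hypothetical $2\times 2$ data, not from the paper) uses the fact that projection onto $E$ is an entrywise clip: a direction that is nonnegative on active constrained entries and zero elsewhere projects back to the point, as a proximal normal must, while a direction violating (24) does not.

```python
import numpy as np

# E = {A in R^{2x2} : A_ij <= alpha_ij for (i,j) in I}; projection is a clip.
alpha = np.array([[1.0, 2.0], [3.0, 4.0]])
I_mask = np.array([[True, False], [True, True]])   # constrained positions I

def proj_E(A):
    out = A.copy()
    out[I_mask] = np.minimum(out[I_mask], alpha[I_mask])
    return out

A = alpha.copy()     # all constrained entries active (equal to their bounds)
A[0, 1] = -5.0       # the unconstrained entry may take any value

# delta satisfying (23)-(25): nonnegative on active constrained entries,
# zero at unconstrained positions.
delta_ok = np.array([[2.0, 0.0], [0.0, 1.0]])
# delta violating (24): negative component on a constrained position.
delta_bad = np.array([[-1.0, 0.0], [0.0, 0.0]])

t = 0.5
print(np.allclose(proj_E(A + t * delta_ok), A))    # projects back: a normal
print(np.allclose(proj_E(A + t * delta_bad), A))   # does not: not a normal
```

Stepping along a valid normal leaves the projection at $A$, whereas an inward-pointing direction moves the projected point, illustrating conditions (23)-(25).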

Theorem 1.

Let $G_{c}^{*}(t)$ and $V^{*}(t)$ be a local optimal solution of Problem 2. Define

q_{j}(t)\triangleq b_{j}^{\top}e^{A^{\top}(T-t)}\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}e^{A(T-t)}b_{j},

where $b_{j}$ denotes the $j$-th column of $B$. Assume that $q_{j}(t)$ and $q_{i}(t)-q_{j}(t)$ are not constant on $[0,T]$ for all $i,j\in\{1,2,\dots,p\}$. Then any solution to Problem 2 takes only values in the binary set $\{0,1\}$ almost everywhere.

Proof.

We first reformulate Problem 2 into a form to which Lemma 1 is applicable. The value $\|v_{j}\|_{L^{1}}$ equals the final state $y_{j}(T)$ of the system $\dot{y_{j}}(t)=v_{j}(t)$ with $y_{j}(0)=0$. Define $Y(t)\triangleq\mathrm{diag}(y(t))$ and matrices $X(t)$, $\bar{V}(t)$, $\bar{A}$, $\bar{B}$ by

\begin{split}X(t)&\triangleq\begin{bmatrix}G_{c}(t)&O_{n\times p}\\ O_{p\times n}&Y(t)\end{bmatrix},\quad\bar{V}(t)\triangleq\begin{bmatrix}V(t)&O_{p\times p}\\ O_{p\times p}&V(t)\end{bmatrix},\\ \bar{A}&\triangleq\begin{bmatrix}A&O_{n\times p}\\ O_{p\times n}&O_{p\times p}\end{bmatrix},\quad\bar{B}\triangleq\begin{bmatrix}B&O_{n\times p}\\ O_{p\times p}&I_{p}\end{bmatrix}.\end{split}

Then, Problem 2 is equivalently expressed as follows:

\begin{split}\min_{v}\quad&J(V)=-L_{f}(X(T))\\ \mathrm{s.t.}\quad&\dot{X}(t)=\bar{A}X(t)+X(t)\bar{A}^{\top}+\bar{B}\bar{V}(t)\bar{B}^{\top},\\ &X(0)=O_{(n+p)\times(n+p)},\quad X(T)\in E,\\ &v(t)\in\Omega\quad\forall t\in[0,T],\end{split} (30)

where $L_{f}(X(T))=K(G_{c}(T))$, $E=\{X(T):y_{j}(T)\leq\alpha_{j}\ \forall j\in\{1,2,\dots,p\}\}$, and $\Omega=\{v:v\in[0,1]^{p},\ \|v\|_{l^{1}}\leq\beta\}$. This is an optimal control problem to which Lemma 1 is applicable. We define the Hamiltonian function $H$ associated with problem (30) by

H(X,P,V)=\mathrm{Tr}\left(P^{\top}(\bar{A}X(t)+X(t)\bar{A}^{\top}+\bar{B}\bar{V}(t)\bar{B}^{\top})\right).

We define two matrices as follows:

X^{*}(t)\triangleq\begin{bmatrix}G_{c}^{*}(t)&O_{n\times p}\\ O_{p\times n}&Y^{*}(t)\end{bmatrix},\quad\bar{V}^{*}(t)\triangleq\begin{bmatrix}V^{*}(t)&O_{p\times p}\\ O_{p\times p}&V^{*}(t)\end{bmatrix}. (31)

Then $(X^{*}(t),\bar{V}^{*}(t))$ is a local minimizer of problem (30) because of the equivalence between Problem 2 and problem (30), and there exist a scalar $\eta$ equal to 0 or 1 and a matrix-valued function $P:[0,T]\rightarrow\mathbb{R}^{(n+p)\times(n+p)}$ satisfying conditions (11), (12), (13), and (14). It follows from (13) that

P˙(t)=A¯P(t)+P(t)A¯,-\dot{P}(t)=\bar{A}^{\top}P(t)+P(t)\bar{A},

which leads to

\begin{split}P(t)&=e^{\bar{A}^{\top}(T-t)}P(T)e^{\bar{A}(T-t)}\\ &=\begin{bmatrix}e^{A^{\top}(T-t)}P^{(11)}(T)e^{A(T-t)}&e^{A^{\top}(T-t)}P^{(12)}(T)\\ P^{(21)}(T)e^{A(T-t)}&P^{(22)}(T)\end{bmatrix},\end{split} (32)

where

P(t)=\begin{bmatrix}P^{(11)}(t)&P^{(12)}(t)\\ P^{(21)}(t)&P^{(22)}(t)\end{bmatrix} (33)

with $P^{(11)}(t)\in\mathbb{R}^{n\times n}$ and $P^{(22)}(t)\in\mathbb{R}^{p\times p}$. Note that

\begin{bmatrix}-P^{(11)}(T)+\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}&-P^{(12)}(T)\\ -P^{(21)}(T)&-P^{(22)}(T)\end{bmatrix}\in N_{E}^{L}(X^{*}(T))

by (12), then we have

P_{j,j}^{(22)}(T)(y_{j}^{*}(T)-\alpha_{j})=0\quad\forall j\in\{1,2,\dots,p\}, (34)
P_{j,j}^{(22)}(T)\leq 0\quad\forall j\in\{1,2,\dots,p\}, (35)
P_{i,j}^{(22)}(T)=0\quad\forall(i,j)\ \text{with}\ i\neq j, (36)
-P^{(11)}(T)+\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}=O_{n\times n}, (37)
P^{(12)}(T)=O_{n\times p},\quad P^{(21)}(T)=O_{p\times n}, (38)

from Lemma 2. Substituting these into (32), we get

P(t)=\begin{bmatrix}e^{A^{\top}(T-t)}\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}e^{A(T-t)}&O_{n\times p}\\ O_{p\times n}&P^{(22)}(T)\end{bmatrix}. (39)

Then, we have

\begin{split}\mathrm{Tr}\left(P^{\top}(t)\bar{B}\bar{V}(t)\bar{B}^{\top}\right)&=\mathrm{Tr}\left(\bar{B}^{\top}P^{\top}(t)\bar{B}\bar{V}(t)\right)\\ &=\mathrm{Tr}\left(\begin{bmatrix}B^{\top}{P^{(11)}}^{\top}(t)B&O_{p\times p}\\ O_{p\times p}&P^{(22)}(t)\end{bmatrix}\bar{V}(t)\right)\\ &=\sum_{j=1}^{p}\left(\eta q_{j}(t)+P_{j,j}^{(22)}(T)\right)v_{j}(t).\end{split}

It follows from (14) that

v^{*}(t)=\operatorname*{arg\,max}_{v\in\Omega}\sum_{j=1}^{p}\left(\eta q_{j}(t)+P_{j,j}^{(22)}(T)\right)v_{j}. (40)

We now claim that $\eta=1$. Indeed, if $\eta=0$, then $P^{(22)}(T)\neq O_{p\times p}$ follows from (11), i.e., there exists some $j$ that satisfies

P_{j,j}^{(22)}(T)<0,\quad y_{j}^{*}(T)=\alpha_{j}. (41)

Hence, from (40) and (41), we have $v_{j}^{*}(t)=0$ for all $t\in[0,T]$, i.e., $y_{j}^{*}(T)=\|v_{j}^{*}\|_{L^{1}}=0$. Since $\alpha_{j}>0$, this contradicts (41). Thus, $\eta=1$. From the assumption, it is easy to verify that

  1) $q_{j}(t)+P_{j,j}^{(22)}(T)\neq 0$ almost everywhere for all $j\in\{1,2,\dots,p\}$;

  2) there exist functions $j_{k}:[0,T]\rightarrow\{1,2,\dots,p\}$, $k=1,2,\dots,p$, such that

    $q_{j_{1}(t)}(t)+P_{j_{1}(t),j_{1}(t)}^{(22)}(T)>\cdots>q_{j_{p}(t)}(t)+P_{j_{p}(t),j_{p}(t)}^{(22)}(T)$

    almost everywhere.

Hence, we find

v_{j}^{*}(t)=\begin{cases}1&\text{if}\ j\in\Xi_{1}(t)\cap\Xi_{2}(t),\\ 0&\text{otherwise}\end{cases} (42)

for almost every $t\in[0,T]$, where

\Xi_{1}(t)\triangleq\{j_{1}(t),j_{2}(t),\dots,j_{\beta}(t)\},
\Xi_{2}(t)\triangleq\{j_{k}(t):q_{j_{k}(t)}(t)+P_{j_{k}(t),j_{k}(t)}^{(22)}(T)>0,\ k\in\{1,2,\dots,p\}\}.

This completes the proof. ∎
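The selection rule (42) is exactly the maximizer of the linear Hamiltonian in (40) over $\Omega$: at each time, activate at most $\beta$ nodes, namely those with the largest strictly positive coefficients $q_{j}(t)+P_{j,j}^{(22)}(T)$. The sketch below (random numbers stand in for these coefficients at one fixed time instant) confirms the rule against a direct linear-programming solution of (40); the LP optimum lands on a binary vertex, which is the source of the discreteness.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
p, beta = 6, 3

# Random stand-ins for the coefficients q_j(t) + P_{j,j}^{(22)}(T)
# at a fixed time t (eta = 1 by the argument above).
c = rng.normal(size=p)

# Selection rule (42): activate the (at most beta) indices with the
# largest strictly positive coefficients.
order = np.argsort(-c)
v_rule = np.zeros(p)
for j in order[:beta]:
    if c[j] > 0:
        v_rule[j] = 1.0

# Direct maximization of (40) over Omega = {v in [0,1]^p : sum_j v_j <= beta}
# as a linear program (maximize c^T v = minimize -c^T v).
res = linprog(-c, A_ub=np.ones((1, p)), b_ub=[beta], bounds=[(0.0, 1.0)] * p)

print(np.allclose(res.x, v_rule, atol=1e-6))
```

When the coefficients have no ties (the generic case, guaranteed a.e. by assumptions 1) and 2)), the LP optimum is unique and binary, matching (42).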

The following theorem is the main result, which shows the equivalence between Problem 1 and Problem 2.

Theorem 2 (equivalence).

Assume that $q_{j}(t)$ and $q_{i}(t)-q_{j}(t)$ are not constant on $[0,T]$ for all $i,j\in\{1,2,\dots,p\}$. Denote the sets of all solutions of Problem 1 and Problem 2 by $\mathcal{V}_{0}^{*}$ and $\mathcal{V}_{1}^{*}$, respectively. If the set $\mathcal{V}_{1}^{*}$ is not empty, then we have $\mathcal{V}_{0}^{*}=\mathcal{V}_{1}^{*}$.

Proof.

Take any solution $\hat{v}\in\mathcal{V}_{1}^{*}$ of Problem 2. It follows from Theorem 1 that $\hat{v}(t)\in\{0,1\}^{p}$ almost everywhere. Note that the null set $\cup_{j=1}^{p}\{t\in[0,T]:\hat{v}_{j}(t)\notin\{0,1\}\}$ does not affect the cost; hence we can adjust $\hat{v}$ on this set so that $\hat{v}(t)\in\{0,1\}^{p}$ on all of $[0,T]$, without loss of optimality. We have

\|\hat{v}(t)\|_{l^{1}}=\|\hat{v}(t)\|_{l^{0}},\quad\|\hat{v}_{j}\|_{L^{1}}=\|\hat{v}_{j}\|_{L^{0}}

for all $j$. Since $\hat{v}\in\mathcal{V}_{1}$, we have $\|\hat{v}(t)\|_{l^{0}}\leq\beta$ and $\|\hat{v}_{j}\|_{L^{0}}\leq\alpha_{j}$ for all $t$ and $j$. Thus, $\hat{v}\in\mathcal{V}_{0}$. Then,

J(\hat{v})\leq\max_{v\in\mathcal{V}_{0}}J(v)\leq\max_{v\in\mathcal{V}_{1}}J(v)=J(\hat{v}), (43)

where the first relation follows from $\hat{v}\in\mathcal{V}_{0}$, the second from $\mathcal{V}_{0}\subset\mathcal{V}_{1}$, and the last from $\hat{v}\in\mathcal{V}_{1}^{*}$. Hence, we have

J(\hat{v})=\max_{v\in\mathcal{V}_{0}}J(v), (44)

which implies $\hat{v}\in\mathcal{V}_{0}^{*}$. Hence, $\mathcal{V}_{1}^{*}\subset\mathcal{V}_{0}^{*}$ and $\mathcal{V}_{0}^{*}$ is not empty.

Next, take any $\tilde{v}\in\mathcal{V}_{0}^{*}$. Note that $\tilde{v}\in\mathcal{V}_{1}$, since $\mathcal{V}_{0}^{*}\subset\mathcal{V}_{0}\subset\mathcal{V}_{1}$. In addition, it follows from (44) that $J(\tilde{v})=J(\hat{v})$. Therefore, $\tilde{v}\in\mathcal{V}_{1}^{*}$, which implies $\mathcal{V}_{0}^{*}\subset\mathcal{V}_{1}^{*}$. This gives $\mathcal{V}_{0}^{*}=\mathcal{V}_{1}^{*}$. ∎

4 Conclusion

In this paper, we discussed the equivalence between sparsity-constrained controllability metrics maximization problems and their convex relaxations. The proof is based on the matrix-valued Pontryagin maximum principle applied to the controllability Lyapunov differential equation. The existence of optimal solutions and the computational cost are currently under investigation.

References

  • Athans, M. (1967). The matrix minimum principle. Information and Control, 11, 592–606.
  • Clarke, F. (2013). Functional Analysis, Calculus of Variations and Optimal Control, volume 264. Springer Science & Business Media.
  • Ikeda, T. and Kashima, K. (2018). Sparsity-constrained controllability maximization with application to time-varying control node selection. IEEE Control Systems Letters, 2(3), 321–326.
  • Ikeda, T. and Kashima, K. (2022). Sparse control node scheduling in networked systems based on approximate controllability metrics. IEEE Transactions on Control of Network Systems.
  • Ikeda, T., Sakurama, K., and Kashima, K. (2021). Multiple sparsity constrained control node scheduling with application to rebalancing of mobility networks. IEEE Transactions on Automatic Control.
  • Verriest, E. and Kailath, T. (1983). On generalized balanced realizations. IEEE Transactions on Automatic Control, 28(8), 833–844.