
thanks: This work was supported in part by JSPS KAKENHI under Grant Numbers JP18H01461 and JP21H04875.

Matrix Pontryagin principle approach to controllability metrics maximization under sparsity constraints

Tomofumi Ohtsuka (Kyoto University)       Takuya Ikeda (The University of Kitakyushu)       Kenji Kashima (Kyoto University)

Graduate School of Informatics, Kyoto University, Kyoto, Japan (e-mail: [email protected], [email protected]).
Faculty of Environmental Engineering, The University of Kitakyushu, Fukuoka, Japan (e-mail: [email protected]).
Abstract

The controllability maximization problem under sparsity constraints is a node selection problem that selects inputs effective for control so as to minimize the energy required to reach a desired state. In this paper, we discuss the equivalence between sparsity-constrained controllability metrics maximization problems and their convex relaxations. The proof is based on the matrix-valued Pontryagin maximum principle applied to the controllability Lyapunov differential equation.

keywords:
Sparse optimal control, node selection problem, controllability maximization

1 Introduction

Sparse optimal control problems have attracted a lot of attention in the field of optimal control. Such an approach is useful for finding a small amount of essential information that is closely related to the control performance of interest, and it has been applied widely; see, for example, Ikeda et al. (2021). This paper investigates the application of sparse optimization to the controllability maximization problem, one of the control node selection problems, which seeks a node selection that minimizes the energy required to drive the system to a desired state.

These problems are generally formulated as the maximization of some metric of the controllability Gramian under $L^{0}/l^{0}$ constraints, but it is known that such problems have a combinatorial structure. To circumvent this, relaxed problems, where the $L^{0}/l^{0}$ norms are replaced by the $L^{1}/l^{1}$ norms, are considered for computational tractability. The question is then how to prove the equivalence between the main problem and its relaxation. The paper Ikeda and Kashima (2018) proved the equivalence when the trace of the controllability Gramian is adopted as the metric, but the usefulness of the trace as a metric is questionable: the designed Gramian may have a zero eigenvalue, so the trace metric does not automatically ensure controllability. The paper Ikeda and Kashima (2022) considered the minimum eigenvalue and the determinant of the controllability Gramian, which are more useful as metrics, but it avoided the equivalence proof because of its difficulty and instead treated approximation problems for which equivalence is easy to prove. In view of this, this paper newly proposes a method to prove the equivalence for general metrics of controllability. Specifically, we adopt the controllability Lyapunov differential equation, a matrix-valued differential equation whose solution is the controllability Gramian. By considering the optimal control problem for this Lyapunov differential equation, we can rigorously treat useful metrics related to the controllability Gramian.

The remainder of this paper is organized as follows. Section 2 provides mathematical preliminaries. Section 3 formulates our node scheduling problem using the controllability Lyapunov differential equation, and gives a sufficient condition for the main problem to boil down to the relaxed problem. Section 4 offers concluding remarks.

Notation

We denote the set of all positive integers by $\mathbb{N}$ and the set of all real numbers by $\mathbb{R}$. Let $n,m\in\mathbb{N}$. We denote the zero matrix of size $n\times m$ by $O_{n\times m}$ and the identity matrix of size $n$ by $I_{n}$. For any $A,B\in\mathbb{R}^{n\times m}$, we denote the Frobenius norm of $A$ by $\|A\|\triangleq\left(\sum_{i=1}^{n}\sum_{j=1}^{m}A_{i,j}^{2}\right)^{1/2}$, and the inner product of $A$ and $B$ by $(A,B)\triangleq\sum_{i=1}^{n}\sum_{j=1}^{m}A_{i,j}B_{i,j}$. Let $C$ be a closed subset of $\mathbb{R}^{n\times m}$ and $A\in C$. A matrix $\Delta\in\mathbb{R}^{n\times m}$ is a proximal normal to the set $C$ at the point $A$ if and only if there exists a constant $\sigma\geq 0$ such that $(\Delta,B-A)\leq\sigma\|B-A\|^{2}$ for all $B\in C$. The proximal normal cone to $C$ at $A$, denoted by $N_{C}^{P}(A)$, is defined as the set of all such $\Delta$. We denote the limiting normal cone to $C$ at $A$ by $N_{C}^{L}(A)$, i.e., $N_{C}^{L}(A)\triangleq\{\Delta=\lim_{i\rightarrow\infty}\Delta_{i}:\Delta_{i}\in N_{C}^{P}(A_{i}),\ A_{i}\rightarrow A,\ A_{i}\in C\}$. For other notations, see (Ikeda and Kashima, 2022, Section II).

2 Preliminary

Let us consider the following continuous-time linear system

\begin{split}&\dot{x}(t)=Ax(t)+BV(t)u(t),\quad t\in[0,T],\\ &V(t)=\mathrm{diag}(v(t)),\quad v(t)\in\{0,1\}^{p},\end{split} (1)

where $x(t)\in\mathbb{R}^{n}$ is the state vector consisting of $n$ nodes, $x_{i}(t)$ is the state of the $i$-th node at time $t$, and $u(\cdot)\in\mathbb{R}^{p}$ is the exogenous control input that influences the network dynamics. Then the controllability Gramian for the system is defined by

G_{c}=\int_{0}^{T}e^{A(T-\tau)}BV(\tau)V(\tau)^{\top}B^{\top}e^{A^{\top}(T-\tau)}\,d\tau. (2)

We next explain why the controllability Gramian is used as a metric of the ease of control. We first recall the minimum-energy control problem:

\begin{split}\min_{u}\quad&\int_{0}^{T}\|u(t)\|^{2}\,dt\\ \mathrm{s.t.}\quad&\dot{x}(t)=Ax(t)+BV(t)u(t),\\ &x(0)=0,\quad x(T)=x_{f}.\end{split} (3)

The minimum control energy is then given by $x_{f}^{\top}G_{c}^{-1}x_{f}$ (Verriest and Kailath (1983)). Based on this, recent works have sought to make $G_{c}$ as large as possible. In this paper, we design $BV(t)$ in order to maximize some metric of the controllability Gramian. As constraints, we impose $L^{0}$ and $l^{0}$ constraints on $v(t)$ to account for an upper bound on the total activation time of each node and on the number of activated nodes at each time. We consider the following optimization problem, which maximizes a metric of $G_{c}$ under sparsity constraints:

\begin{split}\max_{v}\quad&J(v)=K(G_{c})\\ \mathrm{s.t.}\quad&v(t)\in\{0,1\}^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{0}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{0}}\leq\beta\quad\forall t\in[0,T],\end{split} (4)

where $K(G_{c})$ is a metric of the controllability Gramian, and $\alpha_{j}>0$ and $\beta>0$ are constants.
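To make the minimum-energy interpretation of the Gramian concrete, the following Python sketch (purely illustrative; the matrices $A$, $B$, the horizon $T$, and the target $x_{f}$ are arbitrary choices, and all nodes are kept active, i.e., $V(t)=I$) steers a small system with the classical minimum-energy input $u^{*}(t)=B^{\top}e^{A^{\top}(T-t)}G_{c}^{-1}x_{f}$ and checks that the spent energy equals $x_{f}^{\top}G_{c}^{-1}x_{f}$, as claimed after (3).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, trapezoid

# Hypothetical data for illustration only: a 2-node system with all
# nodes active (V(t) = I), horizon T, and target state x_f.
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.eye(2)
T = 1.0
xf = np.array([1.0, -1.0])

# Controllability Gramian (2) by quadrature, with V(t) = I.
s = np.linspace(0.0, T, 4001)
M = np.array([expm(A * (T - si)) @ B for si in s])
Gc = trapezoid(np.einsum("kij,klj->kil", M, M), s, axis=0)
Gc_inv_xf = np.linalg.solve(Gc, xf)

# Minimum-energy input u*(t) = B^T e^{A^T (T-t)} Gc^{-1} x_f.
def u_star(t):
    return B.T @ expm(A.T * (T - t)) @ Gc_inv_xf

# Simulate x' = A x + B u*(t) from x(0) = 0 and accumulate the energy
# integral as an extra state component.
def rhs(t, z):
    u = u_star(t)
    return np.concatenate([A @ z[:2] + B @ u, [u @ u]])

sol = solve_ivp(rhs, (0.0, T), np.zeros(3), rtol=1e-9, atol=1e-12)
x_T, energy = sol.y[:2, -1], sol.y[2, -1]

print(np.allclose(x_T, xf, atol=1e-4))                 # steered to x_f
print(np.isclose(energy, xf @ Gc_inv_xf, rtol=1e-3))   # energy = x_f^T Gc^{-1} x_f
```

The larger $G_{c}$ (e.g., the larger its smallest eigenvalue), the smaller this minimum energy can be, which motivates maximizing a metric $K(G_{c})$.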

Since the maximization problem in (4) is a combinatorial optimization problem, we consider the following relaxation problem:

\begin{split}\max_{v}\quad&J(v)=K(G_{c})\\ \mathrm{s.t.}\quad&v(t)\in[0,1]^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{1}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{1}}\leq\beta\quad\forall t\in[0,T].\end{split} (5)

This problem is easier to handle than the main problem; in particular, if $K$ is concave, problem (5) is a convex optimization problem. We, however, have to establish the equivalence between the main problem and the corresponding relaxed problem. Ikeda and Kashima (2022) formulated alternative approximation problems instead of proving the equivalence. This paper proves the equivalence between the main problem and the relaxed one by using the controllability Lyapunov differential equation.

3 Proposed method

In this section, we formulate the controllability Lyapunov differential equation, whose solution is the controllability Gramian, and then formulate an optimization problem for the system whose state-space representation is given by this differential equation. We provide an equivalence theorem between the main problem and the corresponding relaxed problem.

3.1 Problem formulation and relaxation

The controllability Lyapunov differential equation is given as follows:

\begin{split}&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)V(t)^{\top}B^{\top},\\ &G_{c}(0)=O_{n\times n}.\end{split} (6)

Then the controllability Gramian $G_{c}$ defined by (2) corresponds to the solution $G_{c}(T)$ of (6) at $t=T$. Here we consider the following optimal control problem.

Problem 1 (Main problem).
\begin{split}\max_{v}\quad&J(v)=K(G_{c}(T))\\ \mathrm{s.t.}\quad&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)B^{\top},\\ &G_{c}(0)=O_{n\times n},\\ &v(t)\in\{0,1\}^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{0}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{0}}\leq\beta\quad\forall t\in[0,T].\end{split} (7)

Note that $V(\cdot)V(\cdot)^{\top}=V(\cdot)$ since $v(\cdot)\in\{0,1\}^{p}$, which is why the controllability Lyapunov differential equation is rewritten as in (7). Problem 1 is a combinatorial optimization problem, so we consider the following relaxed problem, where the $L^{0}/l^{0}$ norms are replaced by the $L^{1}/l^{1}$ norms, respectively.

Problem 2 (Relaxed problem).
\begin{split}\max_{v}\quad&J(v)=K(G_{c}(T))\\ \mathrm{s.t.}\quad&\dot{G_{c}}(t)=AG_{c}(t)+G_{c}(t)A^{\top}+BV(t)B^{\top},\\ &G_{c}(0)=O_{n\times n},\\ &v(t)\in[0,1]^{p}\quad\forall t\in[0,T],\\ &\|v_{j}\|_{L^{1}}\leq\alpha_{j}\quad\forall j\in\{1,2,\dots,p\},\\ &\|v(t)\|_{l^{1}}\leq\beta\quad\forall t\in[0,T].\end{split} (8)

In what follows, we suppose that $K$ is continuously differentiable.
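The correspondence between the integral (2) and the Lyapunov differential equation (6) is easy to verify numerically. The sketch below (illustrative values; a constant activation pattern $v$ is used only to keep the example short, whereas Problems 1 and 2 allow time-varying $v$) integrates (6) forward and compares its terminal value with the Gramian computed by quadrature of (2).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, trapezoid

# Illustrative data: 2 nodes, only node 1 active over the whole horizon.
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.eye(2)
V = np.diag([1.0, 0.0])   # binary activation, so V V^T = V
T = 1.0

# Right-hand side of the Lyapunov differential equation (6),
# with the matrix state flattened into a vector for the ODE solver.
def lyap_rhs(t, g):
    G = g.reshape(2, 2)
    return (A @ G + G @ A.T + B @ V @ B.T).ravel()

G_ode = solve_ivp(lyap_rhs, (0.0, T), np.zeros(4),
                  rtol=1e-10, atol=1e-12).y[:, -1].reshape(2, 2)

# Gramian by direct quadrature of the integral (2).
s = np.linspace(0.0, T, 2001)
M = np.array([expm(A * (T - si)) @ B @ V for si in s])
G_int = trapezoid(np.einsum("kij,klj->kil", M, M), s, axis=0)

print(np.allclose(G_ode, G_int, atol=1e-5))   # solution of (6) at t=T equals (2)
```

This is the reformulation that lets the Gramian metric $K(G_{c}(T))$ be treated as a terminal cost of a matrix-valued control system.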

3.2 Discreteness and equivalence

We define the sets of feasible solutions of Problem 1 and Problem 2 by $\mathcal{V}_{0}$ and $\mathcal{V}_{1}$, respectively, i.e.,

\mathcal{V}_{0}\triangleq\{v:v(t)\in\{0,1\}^{p}\ \forall t,\ \|v_{j}\|_{L^{0}}\leq\alpha_{j}\ \forall j,\ \|v(t)\|_{l^{0}}\leq\beta\ \forall t\},
\mathcal{V}_{1}\triangleq\{v:v(t)\in[0,1]^{p}\ \forall t,\ \|v_{j}\|_{L^{1}}\leq\alpha_{j}\ \forall j,\ \|v(t)\|_{l^{1}}\leq\beta\ \forall t\}.

Note that $\mathcal{V}_{0}\subset\mathcal{V}_{1}$, since $\|v_{j}\|_{L^{1}}=\|v_{j}\|_{L^{0}}$ for all $j$ and $\|v(t)\|_{l^{1}}=\|v(t)\|_{l^{0}}$ on $[0,T]$ for any measurable function $v$ with $v(t)\in\{0,1\}^{p}$ on $[0,T]$. The inclusion is proper in general, since the $L^{1}/l^{1}$ constraints do not imply the $L^{0}/l^{0}$ constraints and $\mathcal{V}_{1}$ contains functions that are not binary-valued. We first show the discreteness of solutions of Problem 2, which guarantees that every optimal solution of Problem 2 belongs to the set $\mathcal{V}_{0}$. For this purpose, we prepare two lemmas.

Lemma 1 (Matrix Pontryagin principle).

Let us consider the following optimization problem

\begin{split}\min_{U}\quad&J=L_{f}(X(T))\\ \mathrm{s.t.}\quad&\dot{X}(t)=F(X(t),U(t)),\\ &X(0)=X_{0},\quad X(T)\in E,\quad U(t)\in\Omega,\end{split} (9)

where $L_{f}$ is continuously differentiable, $F$ is continuous, $D_{X}F(X(t),U(t))$ is continuous with respect to $t,X,U$, $X(t)\in\mathbb{R}^{n\times m}$, $U(t)\in\mathbb{R}^{p\times q}$, $X_{0}\in\mathbb{R}^{n\times m}$, $T>0$, $E\subset\mathbb{R}^{n\times m}$, and $\Omega\subset\mathbb{R}^{p\times q}$. Note that $(L_{f},F,X_{0},T,E,\Omega)$ is given. We define the Hamiltonian function $H:\mathbb{R}^{n\times m}\times\mathbb{R}^{n\times m}\times\mathbb{R}^{p\times q}\to\mathbb{R}$ associated with problem (9) by

H(X(t),P(t),U(t))=\mathrm{Tr}\left(P(t)^{\top}F(X(t),U(t))\right). (10)

Let the process $(X^{*}(t),U^{*}(t))$ be a local minimizer for problem (9). Then there exist a matrix-valued function $P:[0,T]\rightarrow\mathbb{R}^{n\times m}$ and a scalar $\eta$ equal to 0 or 1 satisfying the following conditions:

  • the nontriviality condition:

    $(\eta,P(t))\neq 0\quad\forall t\in[0,T]$, (11)
  • the transversality condition:

    $-P(T)\in\eta\nabla L_{f}(X^{*}(T))+N_{E}^{L}(X^{*}(T))$, (12)
  • the adjoint equation for almost every $t\in[0,T]$:

    $-\dot{P}(t)=D_{X}H(X^{*}(t),P(t),U^{*}(t))$, (13)
  • the maximum condition for almost every $t\in[0,T]$:

    $H(X^{*}(t),P(t),U^{*}(t))=\sup_{U\in\Omega}H(X^{*}(t),P(t),U)$. (14)
Proof.

We define a mapping $\psi_{nm}:\mathbb{R}^{n\times m}\to\mathbb{R}^{nm}$ by

\psi_{nm}(X)=\begin{bmatrix}X_{1}^{\top}&\dots&X_{m}^{\top}\end{bmatrix}^{\top}, (15)

where $X_{i}\in\mathbb{R}^{n}$ denotes the $i$-th column of a matrix $X\in\mathbb{R}^{n\times m}$. From Athans (1967), $\psi_{nm}$ is a regular linear mapping (hence $\psi_{nm}^{-1}$ exists) and preserves the inner product. Then problem (9) is equivalent to

\begin{split}\min_{u}\quad&J=l_{f}(x(T))\\ \mathrm{s.t.}\quad&\dot{x}(t)=f(x(t),u(t)),\\ &x(0)=x_{0},\quad x(T)\in e,\quad u(t)\in\omega,\end{split} (16)

where $x=\psi_{nm}(X)$, $u=\psi_{pq}(U)$, and $l_{f},f,x_{0},e,\omega$ correspond to $L_{f},F,X_{0},E,\Omega$, respectively. We define the Hamiltonian function $h:\mathbb{R}^{nm}\times\mathbb{R}^{nm}\times\mathbb{R}^{pq}\to\mathbb{R}$ associated with problem (16) by $h(x(t),p(t),u(t))=p(t)^{\top}f(x(t),u(t))$ and denote the local minimizer for problem (16) by $(x^{*}(t),u^{*}(t))$. Then there exist an arc $p:[0,T]\rightarrow\mathbb{R}^{nm}$ and a scalar $\eta$ equal to 0 or 1 satisfying the following conditions (Pontryagin's maximum principle; Clarke (2013)):

  • the nontriviality condition:

    $(\eta,p(t))\neq 0\quad\forall t\in[0,T]$, (17)
  • the transversality condition:

    $-p(T)\in\eta\nabla l_{f}(x^{*}(T))+N_{e}^{L}(x^{*}(T))$, (18)
  • the adjoint equation for almost every $t\in[0,T]$:

    $-\dot{p}(t)=D_{x}h(x^{*}(t),p(t),u^{*}(t))$, (19)
  • the maximum condition for almost every $t\in[0,T]$:

    $h(x^{*}(t),p(t),u^{*}(t))=\sup_{u\in\omega}h(x^{*}(t),p(t),u)$. (20)

Since $\psi_{nm}^{-1}$ exists, we obtain the Hamiltonian function associated with $h(x^{*}(t),p(t),u^{*}(t))$ as follows:

H(X^{*}(t),P(t),U^{*}(t))=\mathrm{Tr}\left(P^{\top}(t)F(X^{*}(t),U^{*}(t))\right), (21)

which satisfies (11), (12), (13), and (14), where $X^{*}=\psi_{nm}^{-1}(x^{*})$, $U^{*}=\psi_{pq}^{-1}(u^{*})$, and $P=\psi_{nm}^{-1}(p)$. This completes the proof. ∎
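The key identity behind this proof, namely that the vectorization $\psi_{nm}$ preserves inner products so the matrix Hamiltonian $\mathrm{Tr}(P^{\top}F)$ in (10) coincides with the vector Hamiltonian $p^{\top}f$, can be checked directly; the sketch below uses random matrices as stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2

# psi_nm: stack the columns of X into a vector (column-major vectorization).
def psi(X):
    return X.flatten(order="F")

P = rng.normal(size=(n, m))
F = rng.normal(size=(n, m))

# Inner-product preservation: Tr(P^T F) = psi(P)^T psi(F),
# so the matrix and vector Hamiltonians agree.
lhs = np.trace(P.T @ F)
rhs = psi(P) @ psi(F)
print(np.isclose(lhs, rhs))  # True

# psi is linear and invertible: reshaping recovers the matrix.
print(np.array_equal(psi(P).reshape((n, m), order="F"), P))  # True
```

Because of this isometry, every conclusion of the classical (vector) maximum principle transfers verbatim to the matrix-valued setting.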

Lemma 2.

Define a set

E\triangleq\{A\in\mathbb{R}^{n\times n}:A_{i,j}\leq\alpha_{i,j},\ (i,j)\in\mathcal{I}\}, (22)

and fix any $\gamma\in E$, where $\mathcal{I}\subset\{1,\dots,n\}\times\{1,\dots,n\}$ is the set of positions $(i,j)$ of entries of $A$ subject to inequality constraints. Then any $\delta\in N_{E}^{L}(\gamma)$ satisfies

\delta_{i,j}(\gamma_{i,j}-\alpha_{i,j})=0\quad\forall(i,j)\in\mathcal{I}, (23)
\delta_{i,j}\geq 0\quad\forall(i,j)\in\mathcal{I}, (24)
\delta_{i,j}=0\quad\forall(i,j)\notin\mathcal{I}. (25)
Proof.

Fix any $\hat{A}\in E$ and let $\hat{a}=\psi_{nn}(\hat{A})$, where $\psi_{nn}$ is the mapping from (15). Then we obtain a set $e$ satisfying $\hat{a}\in e$ as follows:

e\triangleq\{a\in\mathbb{R}^{n^{2}}:a_{j}\leq\alpha^{\prime}_{j},\ j\in\mathcal{I}^{\prime}\}, (26)

where $\alpha^{\prime}=\psi_{nn}(\alpha)$ and $\mathcal{I}^{\prime}\subset\{1,\dots,n^{2}\}$ is the index set corresponding to $\mathcal{I}$. Take any $\gamma^{\prime}\in e$; then we have

\delta^{\prime}_{i}(\gamma^{\prime}_{i}-\alpha^{\prime}_{i})=0\quad\forall i\in\mathcal{I}^{\prime}, (27)
\delta^{\prime}_{i}\geq 0\quad\forall i\in\mathcal{I}^{\prime}, (28)
\delta^{\prime}_{i}=0\quad\forall i\notin\mathcal{I}^{\prime} (29)

for all $\delta^{\prime}\in N_{e}^{L}(\gamma^{\prime})$ (Ikeda and Kashima (2022)). Finally, we obtain (23), (24), and (25) with $\delta=\psi_{nn}^{-1}(\delta^{\prime})$ and $\gamma=\psi_{nn}^{-1}(\gamma^{\prime})$. ∎
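Lemma 2 matches the intuition that normals to the box-like set $E$ point outward along the active constraints. A small numerical sanity check (hypothetical $2\times 2$ data, not from the paper) uses the fact that projection onto $E$ is an entrywise clip: a direction that is nonnegative on active constrained entries and zero elsewhere projects back to the point, as a proximal normal must, while a direction violating (24) does not.

```python
import numpy as np

# E = {A in R^{2x2} : A_ij <= alpha_ij for (i,j) in I}; projection is a clip.
alpha = np.array([[1.0, 2.0], [3.0, 4.0]])
I_mask = np.array([[True, False], [True, True]])   # constrained positions I

def proj_E(A):
    out = A.copy()
    out[I_mask] = np.minimum(out[I_mask], alpha[I_mask])
    return out

A = alpha.copy()     # all constrained entries active (equal to their bounds)
A[0, 1] = -5.0       # the unconstrained entry may take any value

# delta satisfying (23)-(25): nonnegative on active constrained entries,
# zero at unconstrained positions.
delta_ok = np.array([[2.0, 0.0], [0.0, 1.0]])
# delta violating (24): negative component on a constrained position.
delta_bad = np.array([[-1.0, 0.0], [0.0, 0.0]])

t = 0.5
print(np.allclose(proj_E(A + t * delta_ok), A))    # projects back: a normal
print(np.allclose(proj_E(A + t * delta_bad), A))   # does not: not a normal
```

Stepping along a valid normal leaves the projection at $A$, whereas an inward-pointing direction moves the projected point, illustrating conditions (23)-(25).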

Theorem 1.

Let $G_{c}^{*}(t)$ and $V^{*}(t)$ be a local optimal solution of Problem 2. Define

q_{j}(t)\triangleq b_{j}^{\top}e^{A^{\top}(T-t)}\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}e^{A(T-t)}b_{j},

where $b_{j}$ denotes the $j$-th column of $B$. Assume that $q_{j}(t)$ and $q_{i}(t)-q_{j}(t)$ are not constant on $[0,T]$ for all $i,j\in\{1,2,\dots,p\}$. Then any solution to Problem 2 takes only values in the binary set $\{0,1\}$ almost everywhere.

Proof.

We first reformulate Problem 2 into a form to which Lemma 1 is applicable. The value $\|v_{j}\|_{L^{1}}$ equals the final state $y_{j}(T)$ of the system $\dot{y_{j}}(t)=v_{j}(t)$ with $y_{j}(0)=0$. Define $Y(t)\triangleq\mathrm{diag}(y(t))$ and matrices $X(t)$, $\bar{V}(t)$, $\bar{A}$, $\bar{B}$ by

\begin{split}X(t)&\triangleq\begin{bmatrix}G_{c}(t)&O_{n\times p}\\ O_{p\times n}&Y(t)\end{bmatrix},\quad\bar{V}(t)\triangleq\begin{bmatrix}V(t)&O_{p\times p}\\ O_{p\times p}&V(t)\end{bmatrix},\\ \bar{A}&\triangleq\begin{bmatrix}A&O_{n\times p}\\ O_{p\times n}&O_{p\times p}\end{bmatrix},\quad\bar{B}\triangleq\begin{bmatrix}B&O_{n\times p}\\ O_{p\times p}&I_{p}\end{bmatrix}.\end{split}

Then, Problem 2 is equivalently expressed as follows:

\begin{split}\min_{v}\quad&J(V)=-L_{f}(X(T))\\ \mathrm{s.t.}\quad&\dot{X}(t)=\bar{A}X(t)+X(t)\bar{A}^{\top}+\bar{B}\bar{V}(t)\bar{B}^{\top},\\ &X(0)=O_{(n+p)\times(n+p)},\quad X(T)\in E,\\ &v(t)\in\Omega\quad\forall t\in[0,T],\end{split} (30)

where $L_{f}(X(T))=K(G_{c}(T))$, $E=\{X(T):y_{j}(T)\leq\alpha_{j}\ \forall j\in\{1,2,\dots,p\}\}$, and $\Omega=\{v:v\in[0,1]^{p},\ \|v\|_{l^{1}}\leq\beta\}$. This is an optimal control problem to which Lemma 1 is applicable. We define the Hamiltonian function $H$ associated with problem (30) by

H(X,P,V)=\mathrm{Tr}\left(P^{\top}(\bar{A}X(t)+X(t)\bar{A}^{\top}+\bar{B}\bar{V}(t)\bar{B}^{\top})\right).

We define two matrices as follows:

X^{*}(t)\triangleq\begin{bmatrix}G_{c}^{*}(t)&O_{n\times p}\\ O_{p\times n}&Y^{*}(t)\end{bmatrix},\quad\bar{V}^{*}(t)\triangleq\begin{bmatrix}V^{*}(t)&O_{p\times p}\\ O_{p\times p}&V^{*}(t)\end{bmatrix}. (31)

Then $(X^{*}(t),\bar{V}^{*}(t))$ is a local minimizer of problem (30) because of the equivalence between Problem 2 and problem (30), and there exist a scalar $\eta$ equal to 0 or 1 and a matrix-valued function $P:[0,T]\rightarrow\mathbb{R}^{(n+p)\times(n+p)}$ satisfying conditions (11), (12), (13), and (14). It follows from (13) that

P˙(t)=A¯P(t)+P(t)A¯,-\dot{P}(t)=\bar{A}^{\top}P(t)+P(t)\bar{A},

which leads to

\begin{split}P(t)&=e^{\bar{A}^{\top}(T-t)}P(T)e^{\bar{A}(T-t)}\\ &=\begin{bmatrix}e^{A^{\top}(T-t)}P^{(11)}(T)e^{A(T-t)}&e^{A^{\top}(T-t)}P^{(12)}(T)\\ P^{(21)}(T)e^{A(T-t)}&P^{(22)}(T)\end{bmatrix},\end{split} (32)

where

P(t)=\begin{bmatrix}P^{(11)}(t)&P^{(12)}(t)\\ P^{(21)}(t)&P^{(22)}(t)\end{bmatrix} (33)

with $P^{(11)}(t)\in\mathbb{R}^{n\times n}$ and $P^{(22)}(t)\in\mathbb{R}^{p\times p}$. Note that

\begin{bmatrix}-P^{(11)}(T)+\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}&-P^{(12)}(T)\\ -P^{(21)}(T)&-P^{(22)}(T)\end{bmatrix}\in N_{E}^{L}(X^{*}(T))

by (12), then we have

P_{j,j}^{(22)}(T)(y_{j}^{*}(T)-\alpha_{j})=0\quad\forall j\in\{1,2,\dots,p\}, (34)
P_{j,j}^{(22)}(T)\leq 0\quad\forall j\in\{1,2,\dots,p\}, (35)
P_{i,j}^{(22)}(T)=0\quad\forall(i,j)\ \text{with}\ i\neq j, (36)
-P^{(11)}(T)+\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}=O_{n\times n}, (37)
P^{(12)}(T)=O_{n\times p},\quad P^{(21)}(T)=O_{p\times n}, (38)

from Lemma 2. Substituting these into (32), we get

P(t)=\begin{bmatrix}e^{A^{\top}(T-t)}\eta\frac{\partial K(G_{c}^{*}(T))}{\partial G_{c}^{*}(T)}e^{A(T-t)}&O_{n\times p}\\ O_{p\times n}&P^{(22)}(T)\end{bmatrix}. (39)

Then, we have

\begin{split}\mathrm{Tr}\left(P^{\top}(t)\bar{B}\bar{V}(t)\bar{B}^{\top}\right)&=\mathrm{Tr}\left(\bar{B}^{\top}P^{\top}(t)\bar{B}\bar{V}(t)\right)\\ &=\mathrm{Tr}\left(\begin{bmatrix}B^{\top}{P^{(11)}}^{\top}(t)B&O_{p\times p}\\ O_{p\times p}&P^{(22)}(t)\end{bmatrix}\bar{V}(t)\right)\\ &=\sum_{j=1}^{p}\left(\eta q_{j}(t)+P_{j,j}^{(22)}(T)\right)v_{j}(t).\end{split}

It follows from (14) that

v^{*}(t)=\operatorname*{arg\,max}_{v\in\Omega}\sum_{j=1}^{p}\left(\eta q_{j}(t)+P_{j,j}^{(22)}(T)\right)v_{j}. (40)

We now claim that $\eta=1$. Indeed, if $\eta=0$, then $P^{(22)}(T)\neq O_{p\times p}$ follows from (11), i.e., there exists some $j$ that satisfies

P_{j,j}^{(22)}(T)<0,\quad y_{j}^{*}(T)=\alpha_{j}. (41)

Hence, from (40) and (41), we have $v_{j}^{*}(t)=0$ for all $t\in[0,T]$, i.e., $y_{j}^{*}(T)=\|v_{j}^{*}\|_{L^{1}}=0$. Since $\alpha_{j}>0$, this contradicts (41). Thus, $\eta=1$. From the assumption, it is easy to verify that

  1) $q_{j}(t)+P_{j,j}^{(22)}(T)\neq 0$ almost everywhere for all $j\in\{1,2,\dots,p\}$;

  2) there exist functions $j_{k}:[0,T]\rightarrow\{1,2,\dots,p\}$, $k=1,2,\dots,p$, such that

    $q_{j_{1}(t)}(t)+P_{j_{1}(t),j_{1}(t)}^{(22)}(T)>\cdots>q_{j_{p}(t)}(t)+P_{j_{p}(t),j_{p}(t)}^{(22)}(T)$

    almost everywhere.

Hence, we find

v_{j}^{*}(t)=\begin{cases}1&\text{if}\ j\in\Xi_{1}(t)\cap\Xi_{2}(t),\\ 0&\text{otherwise}\end{cases} (42)

for almost every $t\in[0,T]$, where

\Xi_{1}(t)\triangleq\{j_{1}(t),j_{2}(t),\dots,j_{\beta}(t)\},
\Xi_{2}(t)\triangleq\{j_{k}(t):q_{j_{k}(t)}(t)+P_{j_{k}(t),j_{k}(t)}^{(22)}(T)>0,\ k\in\{1,2,\dots,p\}\}.

This completes the proof. ∎
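The selection rule (42) is exactly the maximizer of the linear Hamiltonian in (40) over $\Omega$: at each time, activate at most $\beta$ nodes, namely those with the largest strictly positive coefficients $q_{j}(t)+P_{j,j}^{(22)}(T)$. The sketch below (random numbers stand in for these coefficients at one fixed time instant) confirms the rule against a direct linear-programming solution of (40); the LP optimum lands on a binary vertex, which is the source of the discreteness.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
p, beta = 6, 3

# Random stand-ins for the coefficients q_j(t) + P_{j,j}^{(22)}(T)
# at a fixed time t (eta = 1 by the argument above).
c = rng.normal(size=p)

# Selection rule (42): activate the (at most beta) indices with the
# largest strictly positive coefficients.
order = np.argsort(-c)
v_rule = np.zeros(p)
for j in order[:beta]:
    if c[j] > 0:
        v_rule[j] = 1.0

# Direct maximization of (40) over Omega = {v in [0,1]^p : sum_j v_j <= beta}
# as a linear program (maximize c^T v = minimize -c^T v).
res = linprog(-c, A_ub=np.ones((1, p)), b_ub=[beta], bounds=[(0.0, 1.0)] * p)

print(np.allclose(res.x, v_rule, atol=1e-6))
```

When the coefficients have no ties (the generic case, guaranteed a.e. by assumptions 1) and 2)), the LP optimum is unique and binary, matching (42).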

The following theorem is the main result, which shows the equivalence between Problem 1 and Problem 2.

Theorem 2 (equivalence).

Assume that $q_{j}(t)$ and $q_{i}(t)-q_{j}(t)$ are not constant on $[0,T]$ for all $i,j\in\{1,2,\dots,p\}$. Denote the sets of all solutions of Problem 1 and Problem 2 by $\mathcal{V}_{0}^{*}$ and $\mathcal{V}_{1}^{*}$, respectively. If the set $\mathcal{V}_{1}^{*}$ is not empty, then we have $\mathcal{V}_{0}^{*}=\mathcal{V}_{1}^{*}$.

Proof.

Take any solution $\hat{v}\in\mathcal{V}_{1}^{*}$ of Problem 2. It follows from Theorem 1 that $\hat{v}(t)\in\{0,1\}^{p}$ almost everywhere. Note that the null set $\cup_{j=1}^{p}\{t\in[0,T]:\hat{v}_{j}(t)\notin\{0,1\}\}$ does not affect the cost; hence we can adjust $\hat{v}$ on this set so that $\hat{v}(t)\in\{0,1\}^{p}$ on all of $[0,T]$, without loss of optimality. We have

\|\hat{v}(t)\|_{l^{1}}=\|\hat{v}(t)\|_{l^{0}},\quad\|\hat{v}_{j}\|_{L^{1}}=\|\hat{v}_{j}\|_{L^{0}}

for all $j$. Since $\hat{v}\in\mathcal{V}_{1}$, we have $\|\hat{v}(t)\|_{l^{0}}\leq\beta$ and $\|\hat{v}_{j}\|_{L^{0}}\leq\alpha_{j}$ for all $t$ and $j$. Thus, $\hat{v}\in\mathcal{V}_{0}$. Then,

J(\hat{v})\leq\max_{v\in\mathcal{V}_{0}}J(v)\leq\max_{v\in\mathcal{V}_{1}}J(v)=J(\hat{v}), (43)

where the first relation follows from $\hat{v}\in\mathcal{V}_{0}$, the second from $\mathcal{V}_{0}\subset\mathcal{V}_{1}$, and the last from $\hat{v}\in\mathcal{V}_{1}^{*}$. Hence, we have

J(\hat{v})=\max_{v\in\mathcal{V}_{0}}J(v), (44)

which implies $\hat{v}\in\mathcal{V}_{0}^{*}$. Hence, $\mathcal{V}_{1}^{*}\subset\mathcal{V}_{0}^{*}$ and $\mathcal{V}_{0}^{*}$ is not empty.

Next, take any $\tilde{v}\in\mathcal{V}_{0}^{*}$. Note that $\tilde{v}\in\mathcal{V}_{1}$, since $\mathcal{V}_{0}^{*}\subset\mathcal{V}_{0}\subset\mathcal{V}_{1}$. In addition, it follows from (44) that $J(\tilde{v})=J(\hat{v})$. Therefore, $\tilde{v}\in\mathcal{V}_{1}^{*}$, which implies $\mathcal{V}_{0}^{*}\subset\mathcal{V}_{1}^{*}$. This gives $\mathcal{V}_{0}^{*}=\mathcal{V}_{1}^{*}$. ∎

4 Conclusion

In this paper, we discussed the equivalence between sparsity-constrained controllability metrics maximization problems and their convex relaxations. The proof is based on the matrix-valued Pontryagin maximum principle applied to the controllability Lyapunov differential equation. The existence of optimal solutions and the computational cost are currently under investigation.

References

  • Athans, M. (1967). The matrix minimum principle. Information and Control, 11, 592–606.
  • Clarke, F. (2013). Functional Analysis, Calculus of Variations and Optimal Control, volume 264. Springer Science & Business Media.
  • Ikeda, T. and Kashima, K. (2018). Sparsity-constrained controllability maximization with application to time-varying control node selection. IEEE Control Systems Letters, 2(3), 321–326.
  • Ikeda, T. and Kashima, K. (2022). Sparse control node scheduling in networked systems based on approximate controllability metrics. IEEE Transactions on Control of Network Systems.
  • Ikeda, T., Sakurama, K., and Kashima, K. (2021). Multiple sparsity constrained control node scheduling with application to rebalancing of mobility networks. IEEE Transactions on Automatic Control.
  • Verriest, E. and Kailath, T. (1983). On generalized balanced realizations. IEEE Transactions on Automatic Control, 28(8), 833–844.