Sinkhorn MPC: Model predictive optimal transport over
dynamical systems*

Kaito Ito¹, Student Member, IEEE, and Kenji Kashima¹, Senior Member, IEEE

*This work was supported in part by the joint project of Kyoto University and Toyota Motor Corporation, titled “Advanced Mathematical Science for Mobility Society”, and by JSPS KAKENHI Grant Numbers JP21J14577, JP21H04875.

¹K. Ito and K. Kashima are with the Graduate School of Informatics, Kyoto University, Kyoto, Japan [email protected]; [email protected]

© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

We consider the optimal control problem of steering an agent population to a desired distribution over an infinite horizon. This is an optimal transport problem over a dynamical system, which is challenging due to its high computational cost. In this paper, we propose Sinkhorn MPC, which is a dynamical transport algorithm combining model predictive control and the so-called Sinkhorn algorithm. The notable feature of the proposed method is that it achieves cost-effective transport in real time by performing control and transport planning simultaneously. In particular, for linear systems with an energy cost, we reveal the fundamental properties of Sinkhorn MPC such as ultimate boundedness and asymptotic stability.

I Introduction

The problem of controlling a large number of agents has become an increasingly important area in control theory, with applications in sensor networks, smart grids, intelligent transportation systems, and systems biology, to name a few. One of the most fundamental tasks in this problem is to stabilize a collection of agents to a desired distribution shape with minimum cost. This can be formulated as an optimal transport (OT) problem [1] between the empirical distribution based on the state of the agents and the target distribution over a dynamical system.

In [2, 3, 4, 5], infinitely many agents are represented as a probability density of the state of a single system, and the problem of steering the state from an initial density to a target one with minimum energy is considered. Although this approach avoids the difficulty due to the large scale of the collective dynamics, the agents must have identical dynamics and are considered to be indistinguishable. In addition, even for linear systems, the density control requires us to solve a nonlinear partial differential equation such as the Monge-Ampère equation or the Hamilton-Jacobi-Bellman equation. Furthermore, it is difficult to incorporate state constraints arising from practical requirements, such as safety, into the density control.

With this in mind, we rather deal with the collective dynamics directly without taking the number of agents to infinity. This straightforward approach is investigated in multi-agent assignment problems; see e.g., [6, 7] and references therein. In the literature, homogeneous agents following single integrator dynamics and easily computable assignment costs, e.g., distance-based cost, are considered in general. On the other hand, for more general dynamics, the main challenge of the assignment problem is the large computation time for obtaining the minimum cost of stabilizing each agent to each target state (i.e., point-to-point optimal control (OC)) and optimizing the destination of each agent. This is especially problematic for OT in dynamical environments where control inputs need to be determined immediately for given initial and target distributions.

For the point-to-point OC, the concept of model predictive control (MPC) solving a finite horizon OC problem instead of an infinite horizon OC problem is effective in reducing the computational cost while incorporating constraints [8]. On the other hand, in [9], several favorable computational properties of an entropy-regularized version of OT are highlighted. In particular, entropy-regularized OT can be solved efficiently via the Sinkhorn algorithm. Based on these ideas, we propose a dynamical transport algorithm combining MPC and the Sinkhorn algorithm, which we call Sinkhorn MPC. Consequently, the computational effort can be reduced substantially. Moreover, for linear systems with an energy cost, we reveal the fundamental properties of Sinkhorn MPC such as ultimate boundedness and asymptotic stability.

Organization: The remainder of this paper is organized as follows. In Section II, we introduce OT between discrete distributions. In Section III, we provide the problem formulation. In Section IV, we describe the idea of Sinkhorn MPC. In Section V, numerical examples illustrate the utility of the proposed method. In Section VI, for linear systems with an energy cost, we investigate the fundamental properties of Sinkhorn MPC. Some concluding remarks are given in Section VII.

Notation: Let ${\mathbb{R}}$ denote the set of real numbers. The set of all positive (resp. nonnegative) vectors in ${\mathbb{R}}^{n}$ is denoted by ${\mathbb{R}}_{>0}^{n}$ (resp. ${\mathbb{R}}_{\geq 0}^{n}$ ). We use similar notations for the set of all real matrices ${\mathbb{R}}^{m\times n}$ and integers ${\mathbb{Z}}$ , respectively. The set of integers $\{1,\ldots,N\}$ is denoted by $[\![N]\!]$ . The Euclidean norm is denoted by $\|\cdot\|$ . For a positive semidefinite matrix $A$ , denote $\|x\|_{A}:=(x^{\top}Ax)^{1/2}$ . The identity matrix of size $n$ is denoted by $I_{n}$ . The matrix norm induced by the Euclidean norm is denoted by $\|\cdot\|_{2}$ . For vectors $x_{1},\ldots,x_{m}\in{\mathbb{R}}^{n}$ , a collective vector $[x_{1}^{\top}\ \cdots\ x_{m}^{\top}]^{\top}\in{\mathbb{R}}^{nm}$ is denoted by $[x_{1};\ \cdots\ ;x_{m}]$ . For $A=[a_{1}\ \cdots\ a_{n}]\in{\mathbb{R}}^{m\times n}$ , we write ${\rm vec}(A):={\color[rgb]{0,0,0}{[a_{1};\ \cdots\ ;a_{n}]}}$ . For $\alpha=[\alpha_{1}\ \cdots\ \alpha_{N}]^{\top}\in{\mathbb{R}}^{N}$ , the diagonal matrix with diagonal entries $\{\alpha_{i}\}_{i=1}^{N}$ is denoted by $\alpha^{\boxbslash}$ . The block diagonal matrix with diagonal entries $\{A_{i}\}_{i=1}^{N},A_{i}\in{\mathbb{R}}^{m\times n}$ is denoted by $\{A_{i}\}_{i}^{\boxbslash}$ . Especially when $A_{i}=A,\forall i$ , $\{A_{i}\}_{i}^{\boxbslash}$ is also denoted by $A^{\boxbslash,N}$ . Let $(M,d)$ be a metric space. The open ball of radius $r>0$ centered at $x\in M$ is denoted by $B_{r}(x):=\{y\in M:d(x,y)<r\}$ . The element-wise division of $a,b\in{\mathbb{R}}_{>0}^{n}$ is denoted by $a\oslash b:=[a_{1}/b_{1}\ \cdots\ a_{n}/b_{n}]^{\top}$ . The $N$ -dimensional vector of ones is denoted by $1_{N}$ . The gradient of a function $f$ with respect to the variable $x$ is denoted by $\nabla_{x}f$ . 
For $x,x^{\prime}\in{\mathbb{R}}_{>0}^{n}$ , define an equivalence relation $\sim$ on ${\mathbb{R}}_{>0}^{n}$ by $x\sim x^{\prime}$ if and only if $\exists r>0,x=rx^{\prime}$ .

II Background on optimal transport

Here, we briefly review OT between discrete distributions $\mu:=\sum_{i=1}^{N}a_{i}\delta_{x_{i}},\nu:=\sum_{j=1}^{M}b_{j}\delta_{y_{j}}$ where $a\in\Sigma_{N}:=\{p\in{\mathbb{R}}_{\geq 0}^{N}:\sum_{i=1}^{N}p_{i}=1\},b\in\Sigma_{M}$ , $x_{i},y_{j}\in{\mathbb{R}}^{n}$ , and $\delta_{x}$ is the Dirac delta at $x$ . Given a cost function $c:{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}(\ni(x,y))\rightarrow{\mathbb{R}}$ , which represents the cost of transporting a unit of mass from $x$ to $y$ , the original formulation of OT due to Monge seeks a map $T:\{x_{1},\ldots,x_{N}\}\rightarrow\{y_{1},\ldots,y_{M}\}$ that solves

		$\displaystyle\underset{T}{\rm minimize}\ \sum_{i\in[\![N]\!]}c(x_{i},T(x_{i}))$		(1)
		$\displaystyle\text{subject to}\ b_{j}=\sum_{i:T(x_{i})=y_{j}}a_{i},\ \forall j\in[\![M]\!].$		(1)

Especially when $M=N$ and $a=b=1_{N}/N$ , the optimal map $T$ gives the optimal assignment for transporting agents with the initial states $\{x_{i}\}_{i}$ to the desired states $\{y_{j}\}_{j}$ , and the Hungarian algorithm [10] can be used to solve (1). However, this method can be applied only to small problems because it has $O(N^{3})$ complexity.
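As a small illustration of the assignment problem (1) with $M=N$ and $a=b=1_{N}/N$ , the following Python sketch enumerates all permutations of a tiny instance. It is a brute-force check with factorial complexity, not the Hungarian algorithm of [10]; the sample points, the squared-distance cost, and the function names are assumptions made only for this example.

```python
from itertools import permutations


def brute_force_assignment(xs, ys, cost):
    """Return (best_cost, sigma) minimising sum_i cost(xs[i], ys[sigma[i]]).

    Exhaustive search over all N! permutations: only usable for very small N.
    """
    n = len(xs)
    best_cost, best_sigma = float("inf"), None
    for sigma in permutations(range(n)):
        total = sum(cost(xs[i], ys[sigma[i]]) for i in range(n))
        if total < best_cost:
            best_cost, best_sigma = total, sigma
    return best_cost, best_sigma


def sq_dist(x, y):
    """Illustrative transport cost c(x, y) = (x - y)^2 for scalar states."""
    return (x - y) ** 2


# Hypothetical initial states and desired states (not taken from the paper).
xs = [0.0, 1.0, 2.0]
ys = [2.1, 0.1, 0.9]
best_cost, sigma = brute_force_assignment(xs, ys, sq_dist)
print(best_cost, sigma)
```

Here `sigma[i]` is the index of the desired state assigned to agent `i`. The factorial growth of the search space is the reason dedicated solvers, or the relaxations discussed next, are needed for larger $N$ .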

On the other hand, the Kantorovich formulation of OT is a linear program (LP):

\underset{P\in{\mathcal{T}}(a,b)}{\rm minimize}\ \sum_{i\in[\![N]\!],j\in[\![M]\!]}C_{ij}P_{ij}

(2)

where $C_{ij}:=c(x_{i},y_{j})$ and

{\mathcal{T}}(a,b):=\{P\in{\mathbb{R}}_{\geq 0}^{N\times M}:P1_{M}=a,\ P^{\top}1_{N}=b\}.

A matrix $P\in{\mathcal{T}}(a,b)$ , which is called a coupling matrix, represents a transport plan where $P_{ij}$ describes the amount of mass flowing from $x_{i}$ towards $y_{j}$ . In particular, when $M=N$ and $a=b=1_{N}/N$ , there exists an optimal solution from which we can reconstruct an optimal map for (1) [11, Proposition 2.1]. However, similarly to (1), for a large number of agents and destinations, the problem (2) with $NM$ variables is challenging to solve.

In view of this, [9] employed an entropic regularization to (2):

\underset{P\in{\mathcal{T}}(a,b)}{\rm minimize}\ \sum_{i\in[\![N]\!],j\in[\![M]\!]}C_{ij}P_{ij}-\varepsilon H(P),

(3)

where $\varepsilon>0$ and the entropy of $P$ is defined by $H(P):=-\sum_{i,j}P_{ij}(\log(P_{ij})-1)$ . Define the Gibbs kernel $K$ associated to the cost matrix $C=(C_{ij})$ as

K=(K_{ij})\in{\mathbb{R}}_{>0}^{N\times M},\ K_{ij}:=\exp\left(-C_{ij}/\varepsilon\right).

Then, a unique solution of the entropic OT (3) has the form $P^{*}=(\alpha^{*})^{\boxbslash}K(\beta^{*})^{\boxbslash}$ for two (unknown) scaling variables $(\alpha^{*},\beta^{*})\in{\mathbb{R}}_{>0}^{N}\times{\mathbb{R}}_{>0}^{M}$ . The variables $(\alpha^{*},\beta^{*})$ can be efficiently computed by the Sinkhorn algorithm:

\alpha(k+1)=a\oslash[K\beta(k)],\ \beta(k)=b\oslash[K^{\top}\alpha(k)]

(4)

where

\lim_{k\rightarrow\infty}\alpha(k+1)^{\boxbslash}K\beta(k)^{\boxbslash}=P^{*},\ \forall\alpha(0)=\alpha_{0}\in{\mathbb{R}}_{>0}^{N}.
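The iteration (4) can be sketched in a few lines of plain Python. This is a minimal, unoptimized illustration under assumed data (three scalar points on each side, squared-distance cost, $\varepsilon=1$ ); the function name `sinkhorn` and the iteration count are choices made for this example only.

```python
import math


def sinkhorn(C, a, b, eps, iters=500):
    """Run the Sinkhorn iteration (4) and return (P, alpha, beta)."""
    N, M = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-C[i][j] / eps) for j in range(M)] for i in range(N)]
    alpha = [1.0] * N  # arbitrary positive initial value alpha(0)
    for _ in range(iters):
        # beta(k) = b ./ (K^T alpha(k))
        beta = [b[j] / sum(K[i][j] * alpha[i] for i in range(N)) for j in range(M)]
        # alpha(k+1) = a ./ (K beta(k))
        alpha = [a[i] / sum(K[i][j] * beta[j] for j in range(M)) for i in range(N)]
    # P = diag(alpha(k+1)) K diag(beta(k))
    P = [[alpha[i] * K[i][j] * beta[j] for j in range(M)] for i in range(N)]
    return P, alpha, beta


# Hypothetical data, chosen only for illustration.
xs = [0.0, 1.0, 2.0]
ys = [2.1, 0.1, 0.9]
C = [[(x - y) ** 2 for y in ys] for x in xs]
a = [1.0 / 3] * 3
b = [1.0 / 3] * 3

P, alpha, beta = sinkhorn(C, a, b, eps=1.0)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

Because the last update in each pass is the one for $\alpha$ , the row marginals of the returned matrix match $a$ up to rounding after any number of iterations, while the column marginals approach $b$ only in the limit.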

Now, let us introduce Hilbert’s projective metric

{d_{\mathcal{H}}}(\beta,\beta^{\prime}):=\log\max_{i,j\in[\![M]\!]}\frac{\beta_{i}\beta^{\prime}_{j}}{\beta_{j}\beta^{\prime}_{i}},\ \beta,\beta^{\prime}\in{\mathbb{R}}_{>0}^{M},

(5)

which is a distance on the projective cone ${\mathbb{R}}_{>0}^{M}/{\sim}$ (see the Notation in Section I for $\sim$ ) and is useful for the convergence analysis of the Sinkhorn algorithm; see [11, Remarks 4.12 and 4.14]. Indeed, for any $(\beta,\beta^{\prime})\in({\mathbb{R}}_{>0}^{M})^{2}$ and any $\bar{K}\in{\mathbb{R}}_{>0}^{N\times M}$ , it holds that

{d_{\mathcal{H}}}(\bar{K}\beta,\bar{K}\beta^{\prime})\leq\lambda(\bar{K}){d_{\mathcal{H}}}(\beta,\beta^{\prime})

(6)

where

\lambda(\bar{K}):=\frac{\sqrt{\eta(\bar{K})}-1}{\sqrt{\eta(\bar{K})}+1}<1,\ \eta(\bar{K}):=\max_{i,j,k,l}\frac{\bar{K}_{ik}\bar{K}_{jl}}{\bar{K}_{jk}\bar{K}_{il}}.

Then it follows from (6) that

	$\displaystyle{d_{\mathcal{H}}}$	$\displaystyle(\beta(k+1),\beta^{*})={d_{\mathcal{H}}}(b\oslash[K^{\top}\alpha(k+1)],b\oslash[K^{\top}\alpha^{*}])$
		$\displaystyle={d_{\mathcal{H}}}(K^{\top}\alpha(k+1),K^{\top}\alpha^{*})$
		$\displaystyle\leq\lambda(K){d_{\mathcal{H}}}(\alpha(k+1),\alpha^{*})\leq\lambda^{2}(K){d_{\mathcal{H}}}(\beta(k),\beta^{*})$

which implies $V_{P}(\beta):={d_{\mathcal{H}}}(\beta,\beta^{*})$ is a Lyapunov function of (4), and $\lim_{k\rightarrow\infty}\beta(k)=\beta^{*}\in{\mathbb{R}}_{>0}^{M}/{\sim}$ .
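The contraction property (6) can be checked numerically on a small example. The sketch below evaluates Hilbert's projective metric (5) and the coefficient $\lambda(\bar{K})$ for a hypothetical $2\times 2$ positive matrix; the matrix, the two test vectors, and the helper names are assumptions for illustration.

```python
import math


def hilbert_metric(u, v):
    """Hilbert's projective metric (5): log of max_i(u_i/v_i) / min_j(u_j/v_j)."""
    ratios = [ui / vi for ui, vi in zip(u, v)]
    return math.log(max(ratios) / min(ratios))


def contraction_coefficient(K):
    """lambda(K) = (sqrt(eta) - 1) / (sqrt(eta) + 1) with eta(K) as defined after (6)."""
    n, m = len(K), len(K[0])
    eta = max(
        K[i][k] * K[j][l] / (K[j][k] * K[i][l])
        for i in range(n) for j in range(n)
        for k in range(m) for l in range(m)
    )
    s = math.sqrt(eta)
    return (s - 1.0) / (s + 1.0)


def matvec(K, v):
    """Matrix-vector product for lists of lists."""
    return [sum(K[i][j] * v[j] for j in range(len(v))) for i in range(len(K))]


# Hypothetical positive matrix and positive vectors.
K_bar = [[1.0, 0.5], [0.2, 1.0]]
beta_1 = [1.0, 2.0]
beta_2 = [3.0, 1.0]

lhs = hilbert_metric(matvec(K_bar, beta_1), matvec(K_bar, beta_2))
rhs = contraction_coefficient(K_bar) * hilbert_metric(beta_1, beta_2)
print(lhs, rhs)  # lhs should not exceed rhs, as stated in (6)
```

The metric is invariant under positive scaling of either argument, which is why convergence of $\beta(k)$ is stated on the quotient ${\mathbb{R}}_{>0}^{M}/{\sim}$ .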

III Problem formulation

In this paper, we consider the problem of stabilizing agents efficiently to a given discrete distribution over dynamical systems. This can be formulated as Monge’s OT problem.

Problem 1

Given initial and desired states $\{x_{i}^{0}\}_{i=1}^{N},\{x_{j}^{\sf d}\}_{j=1}^{N}\in(\mathbb{R}^{n})^{N}$ , find control inputs $\{u_{i}\}_{i=1}^{N}$ and a permutation $\sigma:[\![N]\!]\rightarrow[\![N]\!]$ that solve

\displaystyle\underset{\sigma}{\rm minimize}~{}~{}\sum_{i\in[\![N]\!]}c_{\infty}^{i}(x_{i}^{0},x_{\sigma(i)}^{\sf d}).

(7)

Here, the cost function $c_{\infty}^{i}$ is defined by

	$\displaystyle c_{\infty}^{i}(x_{i}^{0},x_{j}^{\sf d}):=\min_{u_{i}}\ \sum_{k=0}^{\infty}\ell_{i}(x_{i}(k),u_{i}(k);x_{j}^{\sf d})$	(8)
subject to	$\displaystyle x_{i}(k+1)=A_{i}x_{i}(k)+B_{i}u_{i}(k),$	(9)
	$\displaystyle x_{i}(k)\in{\mathcal{X}}\subseteq{\mathbb{R}}^{n},\ \forall k\in{\mathbb{Z}}_{\geq 0},$	(10)
	$\displaystyle x_{i}(0)=x_{i}^{0},$	(11)
	$\displaystyle\lim_{k\rightarrow\infty}x_{i}(k)=x_{j}^{\sf d},$	(12)

where $x_{i}(k)\in{\mathbb{R}}^{n},u_{i}(k)\in{\mathbb{R}}^{m},A_{i}\in{\mathbb{R}}^{n\times n},B_{i}\in{\mathbb{R}}^{n\times m}$ . $\triangle$

Note that the running cost $\ell_{i}$ depends not only on the state $x_{i}$ and the control input $u_{i}$ , but also on the destination $x_{j}^{\sf d}$ . Throughout this paper, we assume the existence of an optimal solution of OC problems. In addition, we assume that there exists a constant input $\bar{u}_{i}$ under which $x_{i}=x_{j}^{\sf d}$ is an equilibrium of (9). A necessary condition for the infinite horizon cost $c_{\infty}^{i}(x_{i}^{0},x_{j}^{\sf d})$ to be finite is that no cost is incurred at $x_{i}=x_{j}^{\sf d}$ with $u_{i}=\bar{u}_{i}$ , i.e., $\ell_{i}(x_{j}^{\sf d},\bar{u}_{i};x_{j}^{\sf d})=0$ . If $B_{i}$ is square and invertible, $\bar{u}_{i}=B_{i}^{-1}(x_{j}^{\sf d}-A_{i}x_{j}^{\sf d})$ makes $x_{i}=x_{j}^{\sf d}$ an equilibrium.
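The equilibrium input $\bar{u}_{i}=B_{i}^{-1}(x_{j}^{\sf d}-A_{i}x_{j}^{\sf d})$ can be verified directly. The following sketch uses illustrative values for $A_{i}$ , a scalar multiple of the identity for $B_{i}$ (so that its inverse is trivial), and a hypothetical destination; none of these numbers are prescribed by the problem formulation.

```python
def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


# Illustrative system: B = b_scale * I_2, so B^{-1} = (1 / b_scale) * I_2.
A = [[1.2, 0.13], [-0.05, 1.1]]
b_scale = 0.1
x_d = [0.5, -0.3]  # hypothetical destination

Ax = matvec(A, x_d)
# u_bar = B^{-1} (x_d - A x_d)
u_bar = [(x_d[r] - Ax[r]) / b_scale for r in range(2)]
# One step of (9) from x = x_d with u = u_bar should return x_d.
x_next = [Ax[r] + b_scale * u_bar[r] for r in range(2)]
print(u_bar, x_next)
```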

IV Combining MPC and Sinkhorn algorithm

The main difficulties of Problem 1 are as follows:

1. In most cases, the infinite horizon OC problem $c_{\infty}^{i}(x_{i}^{0},x_{j}^{\sf d})$ is computationally intractable.
2. In addition, given $c_{\infty}^{i}(x_{i}^{0},x_{j}^{\sf d}),\forall i,j\in[\![N]\!]$ , the assignment problem needs to be solved, which leads to a high computational burden when $N$ is large.

To overcome these issues, we utilize the concept of MPC, which solves tractable finite horizon OC while satisfying the state constraint (10). Specifically, at each time, we address the OT problem whose cost function with the current state $\check{x}_{i}$ and the finite horizon $T_{h}\in{\mathbb{Z}}_{>0}$ is given by

	$\displaystyle c_{T_{h}}^{i}(\check{x}_{i},x_{j}^{\sf d}):=$	$\displaystyle\min_{u_{i}}\ \sum_{k=0}^{T_{h}-1}\ell_{i}(x_{i}(k),u_{i}(k);x_{j}^{\sf d})$
		$\displaystyle\text{subj. to (9), (10),}~{}~{}x_{i}(0)=\check{x}_{i},\ x_{i}(T_{h})=x_{j}^{\sf d}.$

Denote the first control in the optimal sequence by $u_{i}^{\rm MPC}(\check{x}_{i},x_{j}^{\sf d})$ . To reduce the computation time for solving the OT problem, we relax Problem 1 by entropic regularization. It should be noted that, in challenging situations in which the number of agents is large and the sampling time is small, only a few Sinkhorn iterations are allowed. In what follows, we deal only with the case where just one Sinkhorn iteration is performed at each time. Nevertheless, by similar arguments, all of the results of this paper are still valid when more iterations are performed.

Based on the approximate solution $P(k)$ obtained by the Sinkhorn algorithm at time $k$ , we need to determine a target state for each agent. Then, we introduce a set ${\mathbb{X}}\subset{\mathbb{R}}^{n}$ and a map $x_{\rm tmp}^{{\sf d},i}:{\mathbb{R}}_{\geq 0}^{N\times N}\rightarrow{\mathbb{X}}$ as a policy to determine a temporary target $x_{\rm tmp}^{{\sf d},i}(P(k))$ at time $k$ for agent $i$ . Hereafter, assume that there exists a constant $r_{\rm upp}>0$ such that

\|x\|\leq r_{\rm upp},\ \forall x\in{\mathbb{X}}.

(13)

For example, if ${\mathbb{X}}$ is the convex hull of $\{x_{j}^{\sf d}\}_{j}$ , we can take $r_{\rm upp}=\max_{j}\|x_{j}^{\sf d}\|$ . A typical policy to approximate Monge’s OT map from a coupling matrix $P$ is the so-called barycentric projection $x_{\rm tmp}^{{\sf d},i}(P)=N\sum_{j=1}^{N}P_{ij}x_{j}^{\sf d}$ [11, Remark 4.11].
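The barycentric projection is a weighted average of the desired states. As a minimal sketch, the code below evaluates it for two hand-made couplings with marginals $1_{N}/N$ : a scaled permutation matrix, for which each agent is sent to exactly one destination, and the uniform coupling, for which every agent is sent to the mean of the destinations. The targets and the function name are assumptions for illustration.

```python
def barycentric_targets(P, targets):
    """Return x_tmp_i = N * sum_j P_ij * x_j^d for each agent i."""
    N = len(P)
    dim = len(targets[0])
    return [
        tuple(N * sum(P[i][j] * targets[j][d] for j in range(N)) for d in range(dim))
        for i in range(N)
    ]


# Hypothetical desired states in R^2.
targets = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
N = len(targets)

# Scaled permutation coupling: agent 0 -> target 1, agent 1 -> target 2, agent 2 -> target 0.
P_perm = [[0.0, 1.0 / N, 0.0], [0.0, 0.0, 1.0 / N], [1.0 / N, 0.0, 0.0]]
# Uniform coupling: every entry equals 1 / N^2.
P_unif = [[1.0 / N ** 2] * N for _ in range(N)]

bary_perm = barycentric_targets(P_perm, targets)
bary_unif = barycentric_targets(P_unif, targets)
print(bary_perm)
print(bary_unif)
```

Since each row of $NP$ sums to one and is nonnegative, every temporary target lies in the convex hull of $\{x_{j}^{\sf d}\}_{j}$ , which is what makes the bound (13) available for this policy.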

For any given policy $x_{\rm tmp}^{{\sf d},i}$ , we summarize the above strategy as the following dynamics where the Sinkhorn algorithm behaves as a dynamic controller.
Sinkhorn MPC:

	$\displaystyle x_{i}(k+1)=A_{i}x_{i}(k)+B_{i}u_{i}^{\rm MPC}\bigl{(}x_{i}(k),x_{\rm tmp}^{{\sf d},i}\left(P(k)\right)\bigr{)},\ \forall i\in[\![N]\!],$		(14)
	$\displaystyle P(k)=\alpha(k+1)^{\boxbslash}K(x(k))\beta(k)^{\boxbslash},$		(15)
	$\displaystyle\alpha(k+1)=1_{N}/N\oslash\left[K(x(k))\beta(k)\right],$		(16)
	$\displaystyle\beta(k)=1_{N}/N\oslash\left[K(x(k))^{\top}\alpha(k)\right],$		(17)
	$\displaystyle x_{i}(0)=x_{i}^{0},\ \alpha(0)=\alpha_{0},$

where

\displaystyle K_{ij}(x):=\exp\left(-\frac{c_{T_{h}}^{i}(x_{i},x_{j}^{\sf d})}{\varepsilon}\right),\ x=[x_{1};\cdots;x_{N}]\in{\mathbb{R}}^{nN},

and the initial value $\alpha_{0}$ is arbitrary. $\triangle$
Note that, under the assumption that for all $i\in[\![N]\!]$ , $B_{i}$ is square and invertible, Sinkhorn MPC is obviously recursively feasible, and the dynamics (14) satisfies the state constraint (10) for all $i\in[\![N]\!]$ .

V Numerical examples

This section gives examples for Sinkhorn MPC with an energy cost

\ell_{i}(x_{i},u_{i};x_{j}^{\sf d})=\|u_{i}-B_{i}^{-1}(x_{j}^{\sf d}-A_{i}x_{j}^{\sf d})\|^{2},

(18)

and ${\mathcal{X}}={\mathbb{R}}^{n}$ . Then, the dynamics under Sinkhorn MPC can be written as follows [12, Section 2.2, pp. 37-39]:

	$\displaystyle x_{i}(k+1)=\bar{A}_{i}x_{i}(k)+(I-\bar{A}_{i})x_{\rm tmp}^{{\sf d},i}(P(k)),$		(19)
	$\displaystyle u_{i}^{\rm MPC}(x_{i},\hat{x})=-B_{i}^{\top}(A_{i}^{\top})^{T_{h}-1}G_{i,T_{h}}^{-1}A_{i}^{T_{h}}(x_{i}-\hat{x})+B_{i}^{-1}(\hat{x}-A_{i}\hat{x}),\ \forall i\in[\![N]\!],\ \forall\hat{x}\in{\mathbb{R}}^{n}$

with (15)–(17) where

	$\displaystyle K_{ij}(x)=\exp\left(-\frac{\|x_{i}-x_{j}^{\sf d}\|_{{\mathcal{G}}_{i}}^{2}}{\varepsilon}\right),$
	$\displaystyle{\mathcal{G}}_{i}:=(A_{i}^{T_{h}})^{\top}G_{i,T_{h}}^{-1}A_{i}^{T_{h}},\ G_{i,T_{h}}:=\sum_{k=0}^{T_{h}-1}A_{i}^{k}B_{i}B_{i}^{\top}(A_{i}^{\top})^{k},$
	$\displaystyle\bar{A}_{i}:=A_{i}-B_{i}B_{i}^{\top}(A_{i}^{\top})^{T_{h}-1}G_{i,T_{h}}^{-1}A_{i}^{T_{h}}.$

In the examples below, we use the barycentric target $x_{\rm tmp}^{{\sf d},i}(P)=N\sum_{j=1}^{N}P_{ij}x_{j}^{\sf d}$ .
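To make the closed-loop recursion concrete, the following sketch simulates (19) with (15)–(17) for a scalar system with $A_{i}=1$ , $B_{i}=0.1$ , and $T_{h}=10$ , performing one Sinkhorn iteration per time step and using the barycentric target. The initial states, the desired states, the number of agents, and the number of steps are hypothetical values chosen for this illustration, and the function names are not from the paper.

```python
import math

# Scalar system parameters; eps is the regularization parameter.
A, B, T_h, eps = 1.0, 0.1, 10, 1.0

# G = sum_{k=0}^{T_h-1} A^k B B^T (A^T)^k, calG = (A^{T_h})^T G^{-1} A^{T_h},
# A_bar = A - B B^T (A^T)^{T_h-1} G^{-1} A^{T_h}, all specialised to scalars.
G = sum((A ** k) * B * B * (A ** k) for k in range(T_h))
calG = (A ** T_h) * (1.0 / G) * (A ** T_h)
A_bar = A - B * B * (A ** (T_h - 1)) * (1.0 / G) * (A ** T_h)


def kernel(x, xd):
    """K_ij(x) = exp(-|x_i - x_j^d|^2_calG / eps) for scalar states."""
    return [[math.exp(-calG * (xi - xj) ** 2 / eps) for xj in xd] for xi in x]


def sinkhorn_mpc_step(x, alpha, xd):
    """One step of (19) with (15)-(17): a single Sinkhorn update, then the state update."""
    N = len(x)
    K = kernel(x, xd)
    # (17): beta(k) = (1/N) ./ (K^T alpha(k))
    beta = [(1.0 / N) / sum(K[i][j] * alpha[i] for i in range(N)) for j in range(N)]
    # (16): alpha(k+1) = (1/N) ./ (K beta(k))
    alpha_next = [(1.0 / N) / sum(K[i][j] * beta[j] for j in range(N)) for i in range(N)]
    x_next = []
    for i in range(N):
        # Barycentric target N * sum_j P_ij x_j^d with P from (15)
        target = N * sum(alpha_next[i] * K[i][j] * beta[j] * xd[j] for j in range(N))
        # (19): x_i(k+1) = A_bar x_i(k) + (1 - A_bar) * target
        x_next.append(A_bar * x[i] + (1.0 - A_bar) * target)
    return x_next, alpha_next


# Hypothetical initial and desired states.
x0 = [-1.0, -0.5, 0.5, 1.0]
xd = [-0.6, -0.2, 0.2, 0.6]

x, alpha = list(x0), [1.0] * len(x0)  # alpha(0) = 1_N
for _ in range(300):
    x, alpha = sinkhorn_mpc_step(x, alpha, xd)
print(x)
```

With these parameters, $G_{i,T_{h}}=0.1$ , ${\mathcal{G}}_{i}=10$ , and $\bar{A}_{i}=0.9$ . Each update is a convex combination of the current state and a point in the convex hull of the desired states, so the states remain between the smallest and largest of the initial and desired values; the sketch makes no claim about how close the final states are to the desired ones.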

First, consider the case where

A_{i}=\begin{bmatrix}1.2&0.13\\ -0.05&1.1\end{bmatrix},\ B_{i}=0.1I_{2},\ \forall i\in[\![N]\!],

(20)

and set $N=150,\ \varepsilon=1.0,\ T_{h}=10,\ \alpha_{0}=1_{N}$ . For given initial and desired states, the trajectories of the agents governed by (19) with (15)–(17) are illustrated in Fig. 1. It can be seen that the agents converge sufficiently close to the target states. The computation time for one Sinkhorn iteration is about $0.0063$ ms, $0.030$ ms, and $0.08$ ms for $N=150,500,800$ , respectively, on a MacBook Pro with an Intel Core i5. On the other hand, solving the linear program (2) with MATLAB linprog takes about $0.12$ s, $6.4$ s, and $66$ s for $N=150,500,800$ , respectively, and is thus not scalable. Hence, Sinkhorn MPC contributes to reducing the computational burden.

Figure 1: Trajectories $x_{i}(k)=[x_{i}^{(1)}(k)\ x_{i}^{(2)}(k)]^{\top}$ of 150 agents for (20) (colored filled circles) and desired states (black circles). Panels: (a) $k=0$ , (b) $k=50$ , (c) $k=200$ , (d) $k=500$ , (e) $k=1000$ , (f) $k=3000$ .

Next, we investigate the effect of the regularization parameter $\varepsilon$ on the behavior of Sinkhorn MPC. To this end, consider a simple case where $N=10,\ T_{h}=10$ , and

A_{i}=1,\ B_{i}=0.1,\ \forall i\in[\![N]\!].

(21)

Then the trajectories of the agents with $\varepsilon=0.5,1.0$ are shown in Fig. 2. As can be seen, the overshoot/undershoot is reduced for larger $\varepsilon$ while the limiting values of the states deviate from the desired states. In other words, the parameter $\varepsilon$ reflects the trade-off between the stationary and transient behaviors of the dynamics under Sinkhorn MPC.

Figure 2: Trajectories of 10 agents for (21) with $\varepsilon=0.5$ (solid) and $\varepsilon=1.0$ (chain), respectively, and desired states (black circles).

VI Fundamental properties of Sinkhorn MPC with an energy cost

In this section, we elucidate ultimate boundedness and asymptotic stability of Sinkhorn MPC with the energy cost (18) and ${\mathcal{X}}={\mathbb{R}}^{n}$ . Hereafter, we assume the invertibility of $B_{i}$ .

VI-A Ultimate boundedness for Sinkhorn MPC

It is known that, under the assumption that $B_{i}$ is invertible, $\bar{A}_{i}$ is stable, i.e., the spectral radius $\rho_{i}$ of $\bar{A}_{i}$ satisfies $\rho_{i}<1$ [13, Corollary 1]. Using this fact, we derive the ultimate boundedness of (19) with (15)–(17).

Proposition 1

For any $\delta>0,\{x_{i}^{0}\}_{i}$ , and $\{\nu_{i}\}_{i}$ satisfying $\nu_{i}>0,\rho_{i}+\nu_{i}<1,\forall i$ , there exists $\tau(\delta,\{x_{i}^{0}\},\{\nu_{i}\})\in{\mathbb{Z}}_{>0}$ such that the solution $\{x_{i}\}_{i}$ of (19) with (15)–(17) satisfies

\|x_{i}(k)\|<\delta+\frac{r_{\rm upp}\|I-\bar{A}_{i}\|_{2}}{1-(\rho_{i}+\nu_{i})},\ \forall k\geq\tau,\ \forall i\in[\![N]\!],

(22)

where $r_{\rm upp}$ satisfies (13).

Proof:

Let $\tilde{u}_{i}(k):=(I-\bar{A}_{i})x_{\rm tmp}^{{\sf d},i}(P(k))$ . Then, it follows from (13) that

\displaystyle\|\tilde{u}_{i}(k)\|\leq r_{\rm upp}\|I-\bar{A}_{i}\|_{2},\ \forall k\in{\mathbb{Z}}_{\geq 0}.

Note that for any $\nu_{i}>0$ , there exists $\tau_{i}(\nu_{i})\in{\mathbb{Z}}_{>0}$ such that

\|\bar{A}_{i}^{k}\|_{2}<(\rho_{i}+\nu_{i})^{k},\ \forall k\geq\tau_{i}.

Hence, the desired result is straightforward from

\displaystyle\|x_{i}(k)\|\leq\|\bar{A}_{i}^{k}\|_{2}\|x_{i}^{0}\|+\sum_{s=1}^{k}\|\bar{A}_{i}^{s-1}\|_{2}\|\tilde{u}_{i}(k-s)\|.

∎

We emphasize that Proposition 1 holds for any policy $x_{\rm tmp}^{{\sf d},i}$ whose range ${\mathbb{X}}$ satisfies (13).
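The bound (22) can be illustrated in the scalar case, where $\|\bar{A}_{i}^{k}\|_{2}=\rho_{i}^{k}$ . The sketch below takes $\bar{A}_{i}=0.9$ , which is the value obtained from $A_{i}=1$ , $B_{i}=0.1$ , $T_{h}=10$ , and drives the state with an arbitrary bounded sequence of temporary targets. The values of $r_{\rm upp}$ , $\nu_{i}$ , $\delta$ , the initial state, and the alternating target sequence are assumptions for this example.

```python
A_bar = 0.9                      # scalar closed-loop matrix, so rho = |A_bar|
rho, nu, delta = abs(A_bar), 0.05, 0.01
r_upp = 0.6                      # assumed bound on the temporary targets, cf. (13)

# Right-hand side of (22)
bound = delta + r_upp * abs(1.0 - A_bar) / (1.0 - (rho + nu))

x = 50.0                         # hypothetical initial state, far from the targets
history = []
for k in range(300):
    # Any target with |target| <= r_upp is admissible; here it alternates in sign.
    target = r_upp if k % 2 == 0 else -r_upp
    x = A_bar * x + (1.0 - A_bar) * target
    history.append(abs(x))

tail_max = max(history[200:])
print(bound, tail_max)           # the tail of the trajectory stays below the bound
```

The bound is conservative here because it is built from the estimate $(\rho_{i}+\nu_{i})^{k}$ rather than from $\rho_{i}^{k}$ itself.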

VI-B Existence of the equilibrium points

In the remainder of this section, we focus on the barycentric target $x_{\rm tmp}^{{\sf d},i}(P)=N\sum_{j=1}^{N}P_{ij}x_{j}^{\sf d}$ . For $(x,\beta)\in{\mathbb{R}}^{nN}\times{\mathbb{R}}_{>0}^{N}$ and $X^{\sf d}:=[x_{1}^{\sf d}\ \cdots\ x_{N}^{\sf d}]\in{\mathbb{R}}^{n\times N}$ , define

	$\displaystyle f_{1}(x,\beta):=\{\bar{A}_{i}\}_{i}^{\boxbslash}x+N\{I_{n}-\bar{A}_{i}\}_{i}^{\boxbslash}(X^{\sf d})^{\boxbslash,N}{\rm vec}(\tilde{P}(x,\beta)),$		(23)
	$\displaystyle f_{2}(x,\beta):=1_{N}/N\oslash\left[K(f_{1}(x,\beta))^{\top}(1_{N}/N\oslash[K(x)\beta])\right],$		(24)
	$\displaystyle\tilde{P}(x,\beta):=\left(1_{N}/N\oslash[K(x)\beta]\right)^{\boxbslash}K(x)\beta^{\boxbslash}.$		(25)

Then, the collective dynamics (19) with (15)–(17) is

	$\displaystyle x(k+1)=f_{1}(x(k),\beta(k)),$		(26)
	$\displaystyle\beta(k+1)=f_{2}(x(k),\beta(k)),$		(27)

where $x(k):=[x_{1}(k);\ \cdots;x_{N}(k)]$ .

Here, we characterize equilibria of (26), (27). Note that the existence of the equilibria is not trivial due to the nonlinearity of the dynamics. A point $x^{{\sf e}}={\color[rgb]{0,0,0}{[x_{1}^{\sf e};\ \cdots\ ;x_{N}^{\sf e}]}}\in{\mathbb{R}}^{nN}$ is an equilibrium if and only if

	$\displaystyle(I_{n}-\bar{A}_{i})\biggl{(}x_{i}^{{\sf e}}-N\sum_{j=1}^{N}P_{ij}^{*}(x^{\sf e})x_{j}^{\sf d}\biggr{)}=0,\ \forall i\in[\![N]\!],$
	$\displaystyle P_{ij}^{*}(x):=\alpha_{i}^{*}K_{ij}(x)\beta_{j}^{*},\ \alpha^{*},\beta^{*}\in{\mathbb{R}}_{>0}^{N},$		(28)
	$\displaystyle\alpha^{*}=1_{N}/N\oslash\left[K(x)\beta^{*}\right],\beta^{*}=1_{N}/N\oslash\left[K(x)^{\top}\alpha^{*}\right].$		(29)

The stability of $\bar{A}_{i}$ implies that it has no eigenvalue equal to $1$ , and therefore $I_{n}-\bar{A}_{i}$ is invertible. Thus, the necessary and sufficient condition for the equilibria is given by

x_{i}^{{\sf e}}-N\sum_{j=1}^{N}P_{ij}^{*}(x^{\sf e})x_{j}^{{\sf d}}=0,\ \forall i\in[\![N]\!].

(30)

Proposition 2

The dynamics (26), (27) has at least one equilibrium point $(x^{\sf e},\beta^{\sf e})\in{\mathbb{R}}^{nN}\times({\mathbb{R}}_{>0}^{N}/{\sim})$ .

Proof:

Note that if a point $x^{{\sf e}}\in{\mathbb{R}}^{nN}$ satisfies (30), the corresponding $\beta^{\sf e}\in{\mathbb{R}}_{>0}^{N}/{\sim}$ is uniquely determined by $\beta^{\sf e}=\beta^{*}$ in (29) with $x=x^{\sf e}$ [11, Theorem 4.2]. Define a continuous map $h:{\mathbb{R}}^{nN}\rightarrow{\mathbb{R}}^{nN}$ as

h(x):=\begin{bmatrix}N\sum_{j=1}^{N}P_{1j}^{*}(x)x_{j}^{\sf d}\\ \vdots\\ N\sum_{j=1}^{N}P_{Nj}^{*}(x)x_{j}^{\sf d}\end{bmatrix},\ x\in{\mathbb{R}}^{nN}.

(31)

From (30), fixed points of $h$ are equilibria of (26), (27). Note that for any $i\in[\![N]\!]$ and any $x\in{\mathbb{R}}^{nN}$ , $N\sum_{j=1}^{N}P_{ij}^{*}(x)x_{j}^{\sf d}$ belongs to the convex hull ${\mathbb{X}}$ of $\{x_{j}^{\sf d}\}_{j}$ . For brevity, we abuse notation and regard ${\mathbb{X}}^{N}$ as a subset of ${\mathbb{R}}^{nN}$ . Let $h_{{\mathbb{X}}}:{\mathbb{X}}^{N}\rightarrow{\mathbb{X}}^{N}$ be the restriction of $h$ in (31) to ${\mathbb{X}}^{N}$ . Now we can utilize Brouwer’s fixed point theorem. That is, since $h_{{\mathbb{X}}}$ is a continuous map from a compact convex set ${\mathbb{X}}^{N}$ into itself, there exists a point $x^{{\sf e}}\in{\mathbb{X}}^{N}$ such that $x^{{\sf e}}=h(x^{\sf e})$ . ∎
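The key step of the proof, that $h$ maps ${\mathbb{X}}^{N}$ into itself, can be observed numerically in the scalar case. The sketch below approximates $P^{*}(x)$ by running many Sinkhorn iterations for a fixed $x$ and then evaluates $h(x)$ from (31). The value ${\mathcal{G}}_{i}=10$ corresponds to $A_{i}=1$ , $B_{i}=0.1$ , $T_{h}=10$ ; the desired states, the test point, the iteration count, and the function names are assumptions. The code only evaluates $h$ ; it does not compute a fixed point, whose existence is what Brouwer's theorem provides.

```python
import math


def optimal_coupling(x, xd, calG, eps, iters=2000):
    """Approximate P*(x) in (28), (29) by iterating Sinkhorn updates at a fixed x."""
    N = len(x)
    K = [[math.exp(-calG * (xi - xj) ** 2 / eps) for xj in xd] for xi in x]
    alpha = [1.0] * N
    for _ in range(iters):
        beta = [(1.0 / N) / sum(K[i][j] * alpha[i] for i in range(N)) for j in range(N)]
        alpha = [(1.0 / N) / sum(K[i][j] * beta[j] for j in range(N)) for i in range(N)]
    return [[alpha[i] * K[i][j] * beta[j] for j in range(N)] for i in range(N)]


def h(x, xd, calG, eps):
    """The map (31): stacked barycentric targets N * sum_j P*_ij(x) x_j^d."""
    N = len(x)
    P = optimal_coupling(x, xd, calG, eps)
    return [N * sum(P[i][j] * xd[j] for j in range(N)) for i in range(N)]


# Hypothetical desired states and a test point inside their convex hull.
xd = [-0.6, -0.2, 0.2, 0.6]
x_test = [0.6, -0.6, 0.0, 0.3]

hx = h(x_test, xd, calG=10.0, eps=1.0)
residual = max(abs(a - b) for a, b in zip(x_test, hx))  # zero only at an equilibrium
print(hx, residual)
```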

Sometimes, in order to emphasize the dependence of $(x^{\sf e},\beta^{\sf e})$ on $\varepsilon$ , we write $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ .

VI-C Asymptotic stability for Sinkhorn MPC

Next, we analyze the stability of the equilibrium points. For this purpose, the following lemma is crucial when $\varepsilon$ is small. Due to the limited space, we omit the proof.

Lemma 1

Assume that $x_{i}^{\sf d}\neq x_{j}^{\sf d}$ for all $(i,j),\ i\neq j$ . For a permutation $\sigma:[\![N]\!]\rightarrow[\![N]\!]$ , define $x^{{\sf d}}(\sigma):={\color[rgb]{0,0,0}{[x_{\sigma(1)}^{{\sf d}};\ \cdots\ ;x_{\sigma(N)}^{{\sf d}}]}}$ and a permutation matrix $P^{\sigma}=(P_{ij}^{\sigma})$ as $P_{ij}^{\sigma}:=1/N$ if $j=\sigma(i)$ , and $0$ , otherwise. Then, there exists an equilibrium $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ of (26), (27) such that $x^{\sf e}(\varepsilon)$ and $P^{*}(x^{\sf e}(\varepsilon))$ converge exponentially to $x^{{\sf d}}(\sigma)$ and $P^{\sigma}$ , respectively, as $\varepsilon\rightarrow+0$ , i.e., there exists $\zeta>0$ such that

\lim_{\varepsilon\rightarrow+0}\frac{\|\eta(\varepsilon)\|_{2}}{\exp(-\zeta/\varepsilon)}=0

for $\eta(\varepsilon)=x^{\sf e}(\varepsilon)-x^{\sf d}(\sigma)$ and $\eta(\varepsilon)=P^{*}(x^{\sf e}(\varepsilon))-P^{\sigma}$ . $\triangle$

Denote by ${\rm Exp}(\sigma)$ the set of all equilibria $(x^{\sf e}(\cdot),\beta^{\sf e}(\cdot))$ of (26), (27) having the property in Lemma 1 for a permutation $\sigma$ .

For $\bar{P}\in{\mathbb{R}}^{N\times N}$ and $x=[x_{1};\ \cdots\ ;x_{N}]\in{\mathbb{R}}^{nN}$ , define

\displaystyle V_{\rm x}(x):=\sum_{i=1}^{N}\Bigl{\|}x_{i}-N\sum_{j=1}^{N}\bar{P}_{ij}x_{j}^{\sf d}\Bigr{\|}_{{\mathcal{G}}_{i}}^{2}.

Then, $V_{\rm x}$ is a Lyapunov function of (26) where $\tilde{P}(x,\beta)$ is fixed by $\bar{P}$ [14]. Indeed, we have

	$\displaystyle V_{\rm x}(x(k+1))-V_{\rm x}(x(k))\leq-\sum_{i=1}^{N}W_{1,i}(x_{i}(k),\bar{P})$
	$\displaystyle W_{1,i}(x_{i},\bar{P}):=\Bigl{\|}B_{i}^{\top}(A_{i}^{\top})^{T_{h}-1}G_{i,T_{h}}^{-1}A_{i}^{T_{h}}\bigl{(}x_{i}-N\sum_{j=1}^{N}\bar{P}_{ij}x_{j}^{\sf d}\bigr{)}\Bigr{\|}^{2}.$

Given an equilibrium $(x^{\sf e},\beta^{\sf e})$ , let us take the optimal coupling $P^{\sf e}:=P^{*}(x^{{\sf e}})$ as $\bar{P}$ , and for $\gamma>0$ , define

V(x,\beta):=V_{\rm x}(x)+\gamma{d_{\mathcal{H}}}(\beta,\beta^{\sf e}),\ (x,\beta)\in{\mathbb{R}}^{nN}\times({\mathbb{R}}_{>0}^{N}/{\sim}).

(32)

The following theorem follows from the fact that, for sufficiently small or large $\varepsilon>0$ and large $\gamma>0$ , $V$ behaves as a Lyapunov function of (26), (27) with respect to $(x^{\sf e},\beta^{\sf e})$ .

Theorem 1

Assume that for all $i\in[\![N]\!]$ , $A_{i}$ is invertible. Then the following hold:

(i) Assume that $(x^{\sf e},\beta^{\sf e})$ is an isolated equilibrium of (26), (27), where an equilibrium is said to be isolated if it has a neighborhood which does not contain any other equilibria. Then, for a sufficiently large $\varepsilon>0$ , $(x^{\sf e},\beta^{\sf e})$ is locally asymptotically stable.

(ii) Assume that $x_{i}^{\sf d}\neq x_{j}^{\sf d}$ for all $(i,j),\ i\neq j$ . Assume further that for some $\varepsilon^{\prime}>0$ , $(x^{\sf e}(\varepsilon^{\prime}),\beta^{\sf e}(\varepsilon^{\prime}))$ is an isolated equilibrium of (26), (27) and for some permutation $\sigma$ , $(x^{\sf e}(\cdot),\beta^{\sf e}(\cdot))\in{\rm Exp}(\sigma)$ . Then, for sufficiently small $\varepsilon>0$ , $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ is locally asymptotically stable.

Proof:

We prove only (ii) as the proof is similar for (i). In this proof, we regard $(x(\cdot),\beta(\cdot))$ as a trajectory in a metric space ${\mathbb{R}}^{nN}\times({\mathbb{R}}_{>0}^{N}/{\sim})$ with metric $d((x,\beta),(x^{\prime},\beta^{\prime})):=\|x-x^{\prime}\|+{d_{\mathcal{H}}}(\beta,\beta^{\prime})$ . Fix any $(x^{\sf e},\beta^{\sf e})\in{\rm Exp}(\sigma)$ satisfying the assumption in (ii). By definition, it is trivial that $V$ is positive definite on a neighborhood of $(x^{\sf e},\beta^{\sf e})$ . Moreover, for any $(x,\beta)\in{\mathbb{R}}^{nN}\times({\mathbb{R}}_{>0}^{N}/{\sim})$ , we have

	$\displaystyle V(f_{1}(x,\beta),f_{2}(x,\beta))-V(x,\beta)$
	$\displaystyle\leq\sum_{i=1}^{N}\biggl{\{}\Bigl{\|}\bar{A}_{i}\Bigl{(}x_{i}-N\sum_{j}\tilde{P}_{ij}(x,\beta)x_{j}^{{\sf d}}\Bigr{)}$
	$\displaystyle+N\sum_{j}(\tilde{P}_{ij}(x,\beta)-P_{ij}^{\sf e})x_{j}^{{\sf d}}\Bigr{\|}_{{\mathcal{G}}_{i}}^{2}-\Bigl{\|}x_{i}-N\sum_{j}P_{ij}^{\sf e}x_{j}^{\sf d}\Bigr{\|}_{{\mathcal{G}}_{i}}^{2}\biggr{\}}$
	$\displaystyle\quad+\gamma(-W_{3}(x,\beta)+W_{4}(x,\beta)+W_{5}(x,\beta))$
	$\displaystyle\leq\sum_{i=1}^{N}\left(-W_{1,i}(x_{i},\tilde{P}(x,\beta))+W_{2,i}(x,\beta)\right)$
	$\displaystyle\qquad+\gamma(-W_{3}(x,\beta)+W_{4}(x,\beta)+W_{5}(x,\beta))=:W(x,\beta),$

where we used the triangle inequality for ${d_{\mathcal{H}}}$ , and

	$\displaystyle W_{2,i}(x,\beta):=2\Bigl{(}x_{i}-N\sum_{j\in[\![N]\!]}\tilde{P}_{ij}(x,\beta)x_{j}^{\sf d}\Bigr{)}^{\top}(\bar{A}_{i}-I_{n})^{\top}{\mathcal{G}}_{i}$
	$\displaystyle\qquad\qquad\qquad\times N\sum_{j\in[\![N]\!]}(\tilde{P}_{ij}(x,\beta)-P_{ij}^{\sf e})x_{j}^{\sf d},$
	$\displaystyle W_{3}(x,\beta):=\left[1-\lambda(K(x))\lambda\left(K(f_{1}(x,\beta))\right)\right]{d_{\mathcal{H}}}(\beta,\beta^{\sf e}),$
	$\displaystyle W_{4}(x,\beta):={d_{\mathcal{H}}}(K(f_{1}(x,\beta))^{\top}\alpha^{\sf e},(K^{\sf e})^{\top}\alpha^{\sf e}),$
	$\displaystyle W_{5}(x,\beta):=\lambda\left(K(f_{1}(x,\beta))\right){d_{\mathcal{H}}}(K(x)\beta^{\sf e},K^{\sf e}\beta^{\sf e}),$
	$\displaystyle K^{\sf e}:=K(x^{\sf e}),\ \alpha^{\sf e}:=1_{N}/N\oslash[K^{\sf e}\beta^{\sf e}].$

In the sequel, we explain that sufficiently small $\varepsilon$ and large $\gamma$ enable us to take a neighborhood $B_{r}(x^{\sf e},\beta^{\sf e})$ where

W(x,\beta)<0,\ \forall(x,\beta)\in B_{r}(x^{\sf e},\beta^{\sf e})\backslash\{(x^{\sf e},\beta^{\sf e})\},

(33)

which means the asymptotic stability of $(x^{\sf e},\beta^{\sf e})$ [15, Theorem 1.3].

First, a straightforward calculation yields, for any $i,j\in[\![N]\!],l\in[\![n]\!]$ and any $(x,\beta)\in{\mathbb{R}}^{nN}\times({\mathbb{R}}_{>0}^{N}/{\sim})$ ,

	$\displaystyle\left\|\frac{\partial}{\partial x_{i,l}}\tilde{P}_{ij}(x,\beta)\right\|\leq\frac{2N\bar{g}_{i,j,l}}{\varepsilon}\tilde{P}_{ij}(x,\beta)\left(\frac{1}{N}-\tilde{P}_{ij}(x,\beta)\right),$
	$\displaystyle x_{i}=[x_{i,1}\ \cdots\ x_{i,n}]^{\top},$
	$\displaystyle\bar{g}_{i,j,l}:=\max_{k\neq j}\|g_{i,l}^{\top}(x_{j}^{\sf d}-x_{k}^{\sf d})\|,\ {\mathcal{G}}_{i}=[g_{i,1}\ \cdots\ g_{i,n}]^{\top}.$
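
The bound above can be sanity-checked numerically. The sketch below assumes $K_{ij}=\exp(-\|x_{i}-x_{j}^{\sf d}\|_{{\mathcal{G}}_{i}}^{2}/\varepsilon)$ and $\tilde{P}_{ij}(x,\beta)=K_{ij}\beta_{j}/(N\sum_{k}K_{ik}\beta_{k})$, forms inferred from the surrounding definitions rather than restated here, together with a symmetric positive definite ${\mathcal{G}}_{i}$. The helper names `coupling_row` and `max_bound_violation` are hypothetical. It compares a central finite difference of $\tilde{P}_{ij}$ with the right-hand side of the bound for one agent $i$.

```python
import numpy as np


def coupling_row(x_i, xd, G, beta, eps):
    """Row i of P~, assuming P~_ij = K_ij beta_j / (N sum_k K_ik beta_k)."""
    diff = x_i - xd  # shape (N, n): x_i - x_j^d for every j
    logits = -np.einsum("ja,ab,jb->j", diff, G, diff) / eps + np.log(beta)
    w = np.exp(logits - logits.max())  # shift for numerical stability
    return w / (w.sum() * len(xd))


def max_bound_violation(x_i, xd, G, beta, eps, h=1e-6):
    """Largest value of |dP~_ij/dx_{i,l}| minus the bound, over all j and l."""
    N, n = xd.shape
    P = coupling_row(x_i, xd, G, beta, eps)
    proj = xd @ G.T  # proj[j, l] = g_l^T x_j^d, with g_l^T the l-th row of G
    # gbar[j, l] = max_k |g_l^T (x_j^d - x_k^d)|; the k = j term is 0 and harmless
    gbar = np.abs(proj[:, None, :] - proj[None, :, :]).max(axis=1)
    bound = (2.0 * N * gbar / eps) * (P * (1.0 / N - P))[:, None]
    worst = -np.inf
    for l in range(n):
        e = np.zeros(n)
        e[l] = h
        # central finite difference of P~_ij with respect to x_{i,l}
        dP = (coupling_row(x_i + e, xd, G, beta, eps)
              - coupling_row(x_i - e, xd, G, beta, eps)) / (2.0 * h)
        worst = max(worst, float((np.abs(dP) - bound[:, l]).max()))
    return worst
```

A returned value that is non-positive up to finite-difference error indicates that the bound holds at the tested point. The factor $\tilde{P}_{ij}(1/N-\tilde{P}_{ij})$ vanishes when $\tilde{P}_{ij}$ equals $0$ or $1/N$, which is what the argument below exploits.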

From Lemma 1, under the assumption $x_{i}^{\sf d}\neq x_{j}^{\sf d},\ i\neq j$, $\tilde{P}_{ij}(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ converges exponentially to $0$ or $1/N$ as $\varepsilon\rightarrow+0$. The factor $\tilde{P}_{ij}(1/N-\tilde{P}_{ij})$ in the bound above therefore decays exponentially and dominates the factor $1/\varepsilon$. Hence, the variation of $W_{2,i}$ with respect to $x$ around $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ can be made arbitrarily small by choosing a sufficiently small $\varepsilon=\varepsilon_{1}$. In addition, since $\gamma>0$ can be chosen independently of $\varepsilon$, a sufficiently large $\gamma=\bar{\gamma}$ enables us to take a neighborhood $B_{r_{1}}(x^{\sf e},\beta^{\sf e})$ where

	$\displaystyle\sum_{i=1}^{N}\left(-\frac{1}{2}W_{1,i}\left(x_{i},\tilde{P}(x,\beta)\right)+W_{2,i}(x,\beta)\right)$
	$\displaystyle+\gamma\left(-\frac{1}{2}W_{3}(x,\beta)\right)<0,\ \forall(x,\beta)\in B_{r_{1}}(x^{\sf e},\beta^{\sf e})\backslash\{(x^{\sf e},\beta^{\sf e})\}.$		(34)

Next, it follows from $(x^{\sf e},\beta^{\sf e})\in{\rm Exp}(\sigma)$ that

	$\displaystyle\nabla_{x_{i}}K_{ij}\big|_{x=x^{\sf e}(\varepsilon)}$	$\displaystyle=-\frac{2}{\varepsilon}\exp\left(-\frac{\|x_{i}^{\sf e}(\varepsilon)-x_{j}^{\sf d}\|_{{\mathcal{G}}_{i}}^{2}}{\varepsilon}\right)$
		$\displaystyle\qquad\times{\mathcal{G}}_{i}(x_{i}^{\sf e}(\varepsilon)-x_{j}^{\sf d})\rightarrow 0,\ {\rm as}\ \varepsilon\rightarrow+0.$

Since $W_{4}$ and $W_{5}$ depend on $(x,\beta)$ only via $K$, their variation around $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ can be made arbitrarily small by taking a sufficiently small $\varepsilon>0$. Therefore, under the assumption that $(x^{\sf e}(\varepsilon),\beta^{\sf e}(\varepsilon))$ is isolated, for any given $\gamma>0$, we can take $\varepsilon=\varepsilon_{2}(\gamma)$ such that there exists a neighborhood $B_{r_{2}}(x^{\sf e},\beta^{\sf e})$ where

	$\displaystyle\sum_{i=1}^{N}\left(-\frac{1}{2}W_{1,i}(x_{i},\tilde{P}(x,\beta))\right)+\gamma\biggl{(}-\frac{1}{2}W_{3}(x,\beta)+W_{4}(x,\beta)$
	$\displaystyle+W_{5}(x,\beta)\biggr{)}<0,\forall(x,\beta)\in B_{r_{2}}(x^{\sf e},\beta^{\sf e})\backslash\{(x^{\sf e},\beta^{\sf e})\}.$		(35)

The left-hand sides of (34) and (35) sum exactly to $W(x,\beta)$. Hence, by combining (34) and (35), we obtain (33) for $r=\min\{r_{1},r_{2}\}$, $\gamma=\bar{\gamma}$, and $\varepsilon=\min\{\varepsilon_{1},\varepsilon_{2}(\bar{\gamma})\}$, which completes the proof. ∎

VII Conclusion

In this paper, we presented the concept of Sinkhorn MPC, which combines MPC and the Sinkhorn algorithm to achieve scalable, cost-effective transport over dynamical systems. For linear systems with an energy cost, we analyzed the ultimate boundedness and the asymptotic stability for Sinkhorn MPC based on the stability of the constrained MPC and the conventional Sinkhorn algorithm.

In the numerical example, we observed that the regularization parameter plays a key role in the trade-off between the stationary and transient behaviors of Sinkhorn MPC. Hence, an important direction for future work is to design a time-varying regularization parameter that balances this trade-off. Another direction is to extend our results to nonlinear systems with state constraints. In addition, the computational complexity can still be a problem for nonlinear systems. We will address these problems in future work.

References

[1] C. Villani, Topics in Optimal Transportation. American Mathematical Soc., 2003, no. 58.
[2] Y. Chen, T. T. Georgiou, and M. Pavon, “Optimal transport over a linear dynamical system,” IEEE Transactions on Automatic Control, vol. 62, no. 5, pp. 2137–2152, 2017.
[3] ——, “Steering the distribution of agents in mean-field games system,” Journal of Optimization Theory and Applications, vol. 179, no. 1, pp. 332–357, 2018.
[4] K. Bakshi, D. D. Fan, and E. A. Theodorou, “Schrödinger approach to optimal control of large-size populations,” IEEE Transactions on Automatic Control, vol. 66, no. 5, pp. 2372–2378, 2020.
[5] M. H. de Badyn, E. Miehling, D. Janak, B. Açıkmeşe, M. Mesbahi, T. Başar, J. Lygeros, and R. S. Smith, “Discrete-time linear-quadratic regulation via optimal transport,” arXiv preprint arXiv:2109.02347, 2021.
[6] J. Yu, S.-J. Chung, and P. G. Voulgaris, “Target assignment in robotic networks: Distance optimality guarantees and hierarchical strategies,” IEEE Transactions on Automatic Control, vol. 60, no. 2, pp. 327–341, 2014.
[7] A. R. Mosteo, E. Montijano, and D. Tardioli, “Optimal role and position assignment in multi-robot freely reachable formations,” Automatica, vol. 81, pp. 305–313, 2017.
[8] D. Q. Mayne, “Model predictive control: Recent developments and future promise,” Automatica, vol. 50, no. 12, pp. 2967–2986, 2014.
[9] M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” Advances in Neural Information Processing Systems, vol. 26, pp. 2292–2300, 2013.
[10] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly, vol. 2, no. 1-2, pp. 83–97, 1955.
[11] G. Peyré and M. Cuturi, “Computational optimal transport: With applications to data science,” Foundations and Trends® in Machine Learning, vol. 11, no. 5-6, pp. 355–607, 2019.
[12] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control. John Wiley & Sons, 2012.
[13] W. Kwon and A. Pearson, “On the stabilization of a discrete constant linear system,” IEEE Transactions on Automatic Control, vol. 20, no. 6, pp. 800–801, 1975.
[14] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, no. 6, pp. 789–814, 2000.
[15] W. Krabs and S. Pickl, Dynamical Systems: Stability, Controllability and Chaotic Behavior. Springer-Verlag, 2010.
