Sinkhorn MPC: Model predictive optimal transport over
dynamical systems*
Abstract
We consider the optimal control problem of steering an agent population to a desired distribution over an infinite horizon. This is an optimal transport problem over a dynamical system, which is challenging due to its high computational cost. In this paper, we propose Sinkhorn MPC, which is a dynamical transport algorithm combining model predictive control and the so-called Sinkhorn algorithm. The notable feature of the proposed method is that it achieves cost-effective transport in real time by performing control and transport planning simultaneously. In particular, for linear systems with an energy cost, we reveal the fundamental properties of Sinkhorn MPC such as ultimate boundedness and asymptotic stability.
I Introduction
The problem of controlling a large number of agents has become an increasingly important area of control theory, with a view to applications in sensor networks, smart grids, intelligent transportation systems, and systems biology, to name a few. One of the most fundamental tasks in this problem is to stabilize a collection of agents to a desired distribution shape with minimum cost. This can be formulated as an optimal transport (OT) problem [1] between the empirical distribution of the agents' states and the target distribution over a dynamical system.
In [2, 3, 4, 5], infinitely many agents are represented as a probability density of the state of a single system, and the problem of steering the state from an initial density to a target one with minimum energy is considered. Although this approach avoids the difficulty due to the large scale of the collective dynamics, in this framework the agents must have identical dynamics and are treated as indistinguishable. In addition, even for linear systems, density control requires solving a nonlinear partial differential equation such as the Monge-Ampère equation or the Hamilton-Jacobi-Bellman equation. Furthermore, it is difficult to incorporate state constraints arising from practical requirements, such as safety, into density control.
With this in mind, we rather deal with the collective dynamics directly without taking the number of agents to infinity. This straightforward approach is investigated in multi-agent assignment problems; see e.g., [6, 7] and references therein. In the literature, homogeneous agents following single integrator dynamics and easily computable assignment costs, e.g., distance-based cost, are considered in general. On the other hand, for more general dynamics, the main challenge of the assignment problem is the large computation time for obtaining the minimum cost of stabilizing each agent to each target state (i.e., point-to-point optimal control (OC)) and optimizing the destination of each agent. This is especially problematic for OT in dynamical environments where control inputs need to be determined immediately for given initial and target distributions.
For the point-to-point OC, the concept of model predictive control (MPC), which solves a finite horizon OC problem instead of an infinite horizon one, is effective in reducing the computational cost while incorporating constraints [8]. On the other hand, in [9], several favorable computational properties of an entropy-regularized version of OT are highlighted. In particular, entropy-regularized OT can be solved efficiently via the Sinkhorn algorithm. Based on these ideas, we propose a dynamical transport algorithm combining MPC and the Sinkhorn algorithm, which we call Sinkhorn MPC. Consequently, the computational effort can be reduced substantially. Moreover, for linear systems with an energy cost, we reveal fundamental properties of Sinkhorn MPC such as ultimate boundedness and asymptotic stability.
Organization: The remainder of this paper is organized as follows. In Section II, we introduce OT between discrete distributions. In Section III, we provide the problem formulation. In Section IV, we describe the idea of Sinkhorn MPC. In Section V, numerical examples illustrate the utility of the proposed method. In Section VI, for linear systems with an energy cost, we investigate the fundamental properties of Sinkhorn MPC. Some concluding remarks are given in Section VII.
Notation: Let $\mathbb{R}$ denote the set of real numbers. The set of all positive (resp. nonnegative) vectors in $\mathbb{R}^n$ is denoted by $\mathbb{R}_{>0}^n$ (resp. $\mathbb{R}_{\geq 0}^n$). We use similar notations for the set of all real matrices $\mathbb{R}^{n \times m}$ and for the integers $\mathbb{Z}$, respectively. The set of integers $\{k, k+1, \ldots, l\}$ is denoted by $[\![k, l]\!]$. The Euclidean norm is denoted by $\|\cdot\|$. For a positive semidefinite matrix $S$, denote $\|x\|_S := (x^\top S x)^{1/2}$. The identity matrix of size $n$ is denoted by $I_n$. The matrix norm induced by the Euclidean norm is also denoted by $\|\cdot\|$. For vectors $x_1, \ldots, x_N \in \mathbb{R}^n$, the collective vector $[x_1^\top \ \cdots \ x_N^\top]^\top \in \mathbb{R}^{Nn}$ is denoted by $x$. For $v \in \mathbb{R}^N$, the diagonal matrix with diagonal entries $v_1, \ldots, v_N$ is denoted by $\mathrm{diag}(v)$. The block diagonal matrix with diagonal blocks $M_1, \ldots, M_N$ is denoted by $\mathrm{diag}(M_1, \ldots, M_N)$. Especially when $M_1 = \cdots = M_N = M$, it is also denoted by $I_N \otimes M$. Let $(\mathcal{X}, d)$ be a metric space. The open ball of radius $\eta > 0$ centered at $x \in \mathcal{X}$ is denoted by $B_\eta(x)$. The element-wise division of $u, v \in \mathbb{R}_{>0}^N$ is denoted by $u \oslash v$. The $N$-dimensional vector of ones is denoted by $\mathbb{1}_N$. The gradient of a function $f$ with respect to the variable $x$ is denoted by $\nabla_x f$. For $u, u' \in \mathbb{R}_{>0}^n$, define an equivalence relation $\sim$ on $\mathbb{R}_{>0}^n$ by $u \sim u'$ if and only if $u = s u'$ for some $s > 0$.
II Background on optimal transport
Here, we briefly review OT between discrete distributions $\alpha = \sum_{i=1}^{N} a_i \delta_{x_i}$ and $\beta = \sum_{j=1}^{M} b_j \delta_{y_j}$, where $a \in \mathbb{R}_{>0}^N$, $b \in \mathbb{R}_{>0}^M$ satisfy $\sum_{i=1}^{N} a_i = \sum_{j=1}^{M} b_j = 1$, and $\delta_x$ is the Dirac delta at $x \in \mathbb{R}^n$. Given a cost function $c : \mathbb{R}^n \times \mathbb{R}^n \rightarrow \mathbb{R}_{\geq 0}$, which represents the cost of transporting a unit of mass from $x$ to $y$, the original formulation of OT due to Monge seeks a map $T : \{x_i\}_i \rightarrow \{y_j\}_j$ that solves

$$\min_{T} \ \sum_{i=1}^{N} c(x_i, T(x_i)) \quad \mathrm{s.t.} \ \sum_{i : T(x_i) = y_j} a_i = b_j, \ \forall j. \tag{1}$$
Especially when $N = M$ and $a = b = \mathbb{1}_N / N$, the optimal map $T$ gives the optimal assignment for transporting agents with the initial states $\{x_i\}$ to the desired states $\{y_j\}$, and the Hungarian algorithm [10] can be used to solve (1). However, this method can be applied only to small problems because it has $O(N^3)$ complexity.
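For intuition, the assignment case of (1) with uniform weights can be solved by brute force for very small $N$. The sketch below is for illustration only (the function name is ours; the Hungarian algorithm [10] is the practical $O(N^3)$ choice):

```python
import numpy as np
from itertools import permutations

def monge_brute_force(C):
    """Solve min over sigma of sum_i C[i, sigma(i)] by enumerating all N! permutations.

    C: (N, N) cost matrix with C[i, j] = c(x_i, y_j).
    Returns the optimal permutation and its cost. Only viable for tiny N.
    """
    N = C.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in permutations(range(N)):
        cost = sum(C[i, perm[i]] for i in range(N))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost
```

The factorial enumeration makes plain why (1) does not scale, motivating the relaxations below.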
On the other hand, the Kantorovich formulation of OT is a linear program (LP):

$$\min_{P \in \mathcal{U}(a, b)} \ \sum_{i=1}^{N} \sum_{j=1}^{M} C_{ij} P_{ij}, \tag{2}$$

where $C_{ij} := c(x_i, y_j)$ and
$$\mathcal{U}(a, b) := \{P \in \mathbb{R}_{\geq 0}^{N \times M} : P \mathbb{1}_M = a, \ P^\top \mathbb{1}_N = b\}.$$
A matrix $P \in \mathcal{U}(a, b)$, which is called a coupling matrix, represents a transport plan where $P_{ij}$ describes the amount of mass flowing from $x_i$ towards $y_j$. In particular, when $N = M$ and $a = b = \mathbb{1}_N / N$, there exists an optimal solution of (2) from which we can reconstruct an optimal map for (1) [11, Proposition 2.1]. However, similarly to (1), for a large number of agents and destinations, the problem (2) with $NM$ variables is challenging to solve.
In view of this, [9] employed an entropic regularization to (2):

$$\min_{P \in \mathcal{U}(a, b)} \ \sum_{i,j} C_{ij} P_{ij} - \varepsilon H(P), \tag{3}$$

where $\varepsilon > 0$ and the entropy of $P$ is defined by $H(P) := -\sum_{i,j} P_{ij} (\log P_{ij} - 1)$. Define the Gibbs kernel $K$ associated to the cost matrix $C$ as
$$K_{ij} := \exp(-C_{ij} / \varepsilon).$$
Then, a unique solution $P^\star$ of the entropic OT (3) has the form $P^\star_{ij} = u^\star_i K_{ij} v^\star_j$ for two (unknown) scaling variables $(u^\star, v^\star) \in \mathbb{R}_{>0}^N \times \mathbb{R}_{>0}^M$. The variables can be efficiently computed by the Sinkhorn algorithm:

$$u^{(k+1)} = a \oslash (K v^{(k)}), \quad v^{(k+1)} = b \oslash (K^\top u^{(k+1)}), \quad k = 0, 1, \ldots, \tag{4}$$

where the initialization $v^{(0)} \in \mathbb{R}_{>0}^M$ is arbitrary.
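The scaling iterations (4) can be sketched in a few lines of NumPy (a minimal illustration; the function name is ours):

```python
import numpy as np

def sinkhorn(C, a, b, eps, n_iter=500):
    """Sinkhorn scaling iterations (4) for the entropic OT problem (3).

    C: (N, M) cost matrix; a, b: positive marginals; eps: regularization > 0.
    Returns the scalings (u, v) and the coupling P = diag(u) K diag(v).
    """
    K = np.exp(-C / eps)              # Gibbs kernel K_ij = exp(-C_ij / eps)
    v = np.ones(len(b))               # arbitrary positive initialization v^(0)
    for _ in range(n_iter):
        u = a / (K @ v)               # enforce row marginals:    P 1 = a
        v = b / (K.T @ u)             # enforce column marginals: P^T 1 = b
    return u, v, u[:, None] * K * v[None, :]
```

After each $v$-update the column marginals of $P$ hold exactly, while the row marginals are satisfied up to the convergence of the iteration.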
Now, let us introduce Hilbert's projective metric

$$d_H(u, u') := \log \max_{i, j} \frac{u_i u'_j}{u_j u'_i}, \quad u, u' \in \mathbb{R}_{>0}^n, \tag{5}$$

which is a distance on the projective cone $\mathbb{R}_{>0}^n / \sim$ (see the Notation in Section I for $\sim$) and is useful for the convergence analysis of the Sinkhorn algorithm; see [11, Remarks 4.12 and 4.14]. Indeed, for any positive matrix $K \in \mathbb{R}_{>0}^{N \times M}$ and any $v, v' \in \mathbb{R}_{>0}^M$, it holds that

$$d_H(K v, K v') \leq \lambda(K)\, d_H(v, v'), \tag{6}$$

where
$$\lambda(K) := \frac{\sqrt{\eta(K)} - 1}{\sqrt{\eta(K)} + 1} < 1, \quad \eta(K) := \max_{i, j, k, l} \frac{K_{ik} K_{jl}}{K_{jk} K_{il}}.$$
Then it follows from (6) that
$$d_H(u^{(k+1)}, u^\star) \leq \lambda(K)^2\, d_H(u^{(k)}, u^\star),$$
which implies that $d_H(u^{(k)}, u^\star)$ is a Lyapunov function of (4), and $(u^{(k)}, v^{(k)})$ converges to $(u^\star, v^\star)$ in the projective sense.
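To make (5) and the contraction bound (6) concrete, the following sketch (helper names are ours) computes $d_H$ and the contraction rate $\lambda(K)$ and checks (6) numerically:

```python
import numpy as np

def hilbert_metric(u, up):
    """Hilbert's projective metric (5): d_H(u, u') = log max_{i,j} (u_i u'_j)/(u_j u'_i)."""
    r = np.log(u) - np.log(up)
    return r.max() - r.min()          # equivalent closed form of (5)

def contraction_rate(K):
    """Contraction rate lambda(K) in (6) for an entrywise-positive matrix K."""
    # eta(K) = max over (i, j, k, l) of K_ik K_jl / (K_jk K_il)
    R = (K[:, None, :, None] * K[None, :, None, :]
         / (K[None, :, :, None] * K[:, None, None, :]))
    eta = R.max()
    return (np.sqrt(eta) - 1) / (np.sqrt(eta) + 1)
```

Note that $d_H$ is invariant under positive rescaling of either argument, which is exactly why it is the natural metric on the projective cone of Sinkhorn scalings.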
III Problem formulation
In this paper, we consider the problem of stabilizing agents efficiently to a given discrete distribution over dynamical systems. This can be formulated as Monge’s OT problem.
Problem 1
Given initial and desired states $\{x_i^0\}_{i=1}^{N}, \{x_j^d\}_{j=1}^{N} \subset \mathbb{R}^n$, find control inputs and a permutation $\sigma$ on $[\![1, N]\!]$ that solve

$$\min_{\sigma} \ \sum_{i=1}^{N} c(x_i^0, x_{\sigma(i)}^d). \tag{7}$$
Here, the cost function $c$ is defined by the infinite horizon OC problem

$$c(x^0, x^d) := \min_{\{u(t)\}_{t \geq 0}} \ \sum_{t=0}^{\infty} \ell(x(t), u(t); x^d) \tag{8}$$
$$\text{subject to} \quad x(t+1) = f(x(t), u(t)), \tag{9}$$
$$x(t) \in \mathbb{X}, \ \forall t, \tag{10}$$
$$x(0) = x^0, \tag{11}$$
$$\lim_{t \rightarrow \infty} x(t) = x^d, \tag{12}$$

where $\ell$ is a given running cost and $\mathbb{X} \subseteq \mathbb{R}^n$ is a state constraint set.
Note that the running cost $\ell$ depends not only on the state $x$ and the control input $u$, but also on the destination $x^d$. Throughout this paper, we assume the existence of an optimal solution of OC problems. In addition, we assume that there exists a constant input $u^d$ under which $x^d$ is an equilibrium of (9). A necessary condition for the infinite horizon cost (8) to be finite is that at $x = x^d$ with $u = u^d$, no cost is incurred, i.e., $\ell(x^d, u^d; x^d) = 0$. If $f(x, u) = Ax + Bu$ and $B$ is square and invertible, $u^d = B^{-1}(x^d - A x^d)$ makes $x^d$ an equilibrium.
IV Combining MPC and Sinkhorn algorithm
The main difficulties of Problem 1 are as follows:
1. In most cases, the infinite horizon OC problem (8) is computationally intractable.
2. In addition, given the costs $\{c(x_i^0, x_j^d)\}_{i,j}$, the assignment problem (7) needs to be solved, which leads to a high computational burden when $N$ is large.
To overcome these issues, we utilize the concept of MPC, which solves a tractable finite horizon OC problem while satisfying the state constraint (10). Specifically, at each time $t$, we address the OT problem whose cost function, with the current states $x_i(t)$ and a finite horizon $T_h$, is given by
$$c_{\mathrm{fh}}(x_i(t), x_j^d) := \min_{\{u(k)\}} \ \sum_{k=t}^{t+T_h-1} \ell(x(k), u(k); x_j^d) \quad \text{s.t. (9), (10)}, \ x(t) = x_i(t), \ x(t+T_h) = x_j^d.$$
Denote the first control input in the optimal sequence for $c_{\mathrm{fh}}(x_i(t), x_j^d)$ by $u_i^\star(x_i(t), x_j^d)$. From the viewpoint of the computation time for solving the OT problem, we relax Problem 1 by the entropic regularization. It should be noted that, in challenging situations in which the number of agents is large and the sampling time is small, only a few Sinkhorn iterations are allowed at each time step. In what follows, we deal only with the case where just one Sinkhorn iteration is performed at each time. Nevertheless, by similar arguments, all of the results of this paper remain valid when more iterations are performed.
Based on the approximate solution $P(t)$ obtained by the Sinkhorn algorithm at time $t$, we need to determine a target state for each agent. Then, we introduce a set $\mathcal{T} \subset \mathbb{R}^n$ and a map $\pi = (\pi_1, \ldots, \pi_N)$ as a policy to determine a temporary target $r_i(t) = \pi_i(P(t)) \in \mathcal{T}$ at time $t$ for agent $i$. Hereafter, assume that there exists a constant $R > 0$ such that

$$\|\pi_i(P)\| \leq R, \quad \forall i \in [\![1, N]\!], \ \forall P \in \mathcal{U}(a, b). \tag{13}$$

For example, if $\mathcal{T}$ is the convex hull of $\{x_j^d\}_{j=1}^{N}$, we can take $R = \max_j \|x_j^d\|$. A typical policy to approximate a Monge OT map from a coupling matrix $P$ is the so-called barycentric projection [11, Remark 4.11]:
$$r_i = \frac{\sum_{j=1}^{N} P_{ij} x_j^d}{\sum_{j=1}^{N} P_{ij}}.$$
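The barycentric projection is a one-line computation once the coupling matrix is available (a sketch; the function name is ours):

```python
import numpy as np

def barycentric_targets(P, Xd):
    """Barycentric projection [11, Remark 4.11]: the temporary target of agent i
    is the coupling-weighted average r_i = sum_j P_ij x_j^d / sum_j P_ij.

    P: (N, N) coupling matrix; Xd: (N, n) array whose rows are the x_j^d.
    """
    return (P @ Xd) / P.sum(axis=1, keepdims=True)
```

When $P$ is (a scaled) permutation matrix, the projection returns the assigned destinations exactly; for a diffuse coupling it returns convex combinations of the destinations, which is the source of the regularization bias observed later.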
For any given policy $\pi$, we summarize the above strategy as the following dynamics, where the Sinkhorn algorithm behaves as a dynamic controller.

Sinkhorn MPC:

$$x_i(t+1) = f(x_i(t), u_i^\star(x_i(t), r_i(t))), \quad i \in [\![1, N]\!], \tag{14}$$
$$r_i(t) = \pi_i\big(\mathrm{diag}(\alpha(t+1))\, K(t)\, \mathrm{diag}(\beta(t))\big), \tag{15}$$
$$\alpha(t+1) = a \oslash (K(t) \beta(t)), \tag{16}$$
$$\beta(t+1) = b \oslash (K(t)^\top \alpha(t+1)), \tag{17}$$

where $u_i^\star(x_i(t), r_i(t))$ is the first input of the optimal sequence for the finite horizon cost with destination $r_i(t)$, $K(t)$ is the Gibbs kernel associated to the current cost matrix, i.e., $K_{ij}(t) := \exp(-c_{\mathrm{fh}}(x_i(t), x_j^d)/\varepsilon)$ with $c_{\mathrm{fh}}$ the finite horizon cost above, and the initial value $\beta(0) \in \mathbb{R}_{>0}^N$ is arbitrary.
Note that, when $f(x, u) = Ax + Bu$ with $B$ square and invertible, Sinkhorn MPC is obviously recursively feasible, and the dynamics (14) satisfies the state constraint (10) for all $t$.
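The closed loop (14)-(17) can be sketched end to end. The snippet below is a minimal illustration, not the paper's implementation: it assumes scalar single-integrator agents $x_i(t+1) = x_i(t) + u_i(t)$, for which the finite-horizon minimum-energy transport cost is $|x_j^d - x_i|^2 / T_h$ and the first optimal input toward a target $r_i$ is simply $(r_i - x_i)/T_h$; it uses the barycentric target and uniform marginals (all names are ours):

```python
import numpy as np

def sinkhorn_mpc_1d(x0, xd, eps=0.05, Th=5, n_steps=300):
    """One Sinkhorn iteration per time step, interleaved with an MPC input.

    Assumes single-integrator agents x_i(t+1) = x_i(t) + u_i(t) (illustration only).
    x0: initial states, xd: desired states (1-D arrays of equal length N).
    """
    N = len(x0)
    a = b = np.full(N, 1 / N)
    x = np.array(x0, dtype=float)
    beta = np.ones(N)                             # arbitrary positive initialization
    for _ in range(n_steps):
        C = (xd[None, :] - x[:, None]) ** 2 / Th  # state-dependent transport cost
        K = np.exp(-C / eps)                      # Gibbs kernel used in (16)-(17)
        alpha = a / (K @ beta)                    # Sinkhorn half-step (16)
        P = alpha[:, None] * K * beta[None, :]    # current coupling estimate
        r = (P @ xd) / P.sum(axis=1)              # barycentric targets (15)
        x = x + (r - x) / Th                      # apply first MPC input (14)
        beta = b / (K.T @ alpha)                  # Sinkhorn half-step (17)
    return x
```

Because the cost matrix is recomputed from the current states, control and transport planning proceed simultaneously, which is the defining feature of Sinkhorn MPC; the agents settle near (not exactly at) the assigned targets, reflecting the regularization bias.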
V Numerical examples
This section gives examples of Sinkhorn MPC with the linear dynamics $x_i(t+1) = A x_i(t) + B u_i(t)$, an energy cost

$$\ell(x, u; x^d) = \|u\|^2, \tag{18}$$

and no state constraints, i.e., $\mathbb{X} = \mathbb{R}^n$. Then, the dynamics under Sinkhorn MPC can be written as follows [12, Section 2.2, pp. 37-39]:

$$x_i(t+1) = F x_i(t) + H r_i(t), \quad F := A - H A^{T_h}, \quad H := B B^\top (A^\top)^{T_h - 1} G_{T_h}^{-1}, \tag{19}$$

where $G_{T_h} := \sum_{k=0}^{T_h - 1} A^k B B^\top (A^\top)^k$ is the controllability Gramian, assumed to be invertible. In the examples below, we use the barycentric target $r_i(t) = \sum_j P_{ij}(t) x_j^d / \sum_j P_{ij}(t)$.
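For concreteness, the first input of the finite-horizon minimum-energy control underlying (19) can be computed from the controllability Gramian. This is the standard construction (cf. [12]), sketched here under the assumption that the Gramian is invertible (symbol names are ours):

```python
import numpy as np

def min_energy_first_input(A, B, x, xd, Th):
    """First input u(0) of the minimum-energy control steering x to xd in Th steps:
        u(0) = B^T (A^T)^(Th-1) G^(-1) (xd - A^Th x),
    with controllability Gramian G = sum_{k=0}^{Th-1} A^k B B^T (A^T)^k.
    """
    G = sum(np.linalg.matrix_power(A, k) @ B @ B.T @ np.linalg.matrix_power(A, k).T
            for k in range(Th))
    w = np.linalg.solve(G, xd - np.linalg.matrix_power(A, Th) @ x)
    return B.T @ np.linalg.matrix_power(A, Th - 1).T @ w
```

Applying the full open-loop sequence $u(k) = B^\top (A^\top)^{T_h-1-k} G_{T_h}^{-1}(x^d - A^{T_h}x(0))$ reaches $x^d$ exactly after $T_h$ steps; Sinkhorn MPC instead applies only the first input with a fixed receding horizon and a time-varying target.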
First, consider the case where
(20) |
For given initial and desired states, the trajectories of the agents governed by (19) with (15)-(17) are illustrated in Fig. 1. It can be seen that the agents converge sufficiently close to the target states. The computation time for one Sinkhorn iteration is on the order of milliseconds for each problem size tested, on a MacBook Pro with an Intel Core i5. On the other hand, solving the linear program (2) with MATLAB linprog takes on the order of seconds for the same problem sizes, and is thus not scalable. Hence, Sinkhorn MPC contributes to reducing the computational burden.
Next, we investigate the effect of the regularization parameter $\varepsilon$ on the behavior of Sinkhorn MPC. To this end, consider a simple case where the system matrices are given by

(21)

Then the trajectories of the agents for several values of $\varepsilon$ are shown in Fig. 2. As can be seen, the overshoot/undershoot is reduced for larger $\varepsilon$, while the limiting values of the states deviate more from the desired states. In other words, the parameter $\varepsilon$ reflects the trade-off between the stationary and transient behaviors of the dynamics under Sinkhorn MPC.
VI Fundamental properties of Sinkhorn MPC with an energy cost
In this section, we elucidate the ultimate boundedness and asymptotic stability of Sinkhorn MPC with the energy cost (18) and no state constraints. Hereafter, we assume the invertibility of $A$.
VI-A Ultimate boundedness for Sinkhorn MPC
It is known that, under the assumption that $A$ is invertible, the closed-loop matrix of (19), $F = A - B B^\top (A^\top)^{T_h - 1} G_{T_h}^{-1} A^{T_h}$ with $G_{T_h} := \sum_{k=0}^{T_h - 1} A^k B B^\top (A^\top)^k$, is stable, i.e., the spectral radius of $F$ satisfies $\rho(F) < 1$ [13, Corollary 1]. Using this fact, we derive the ultimate boundedness of (19) with (15)-(17).
Proposition 1
Assume that $A$ is invertible. Then, the trajectories of (19) with (15)-(17) are uniformly ultimately bounded: there exist $c > 0$ and $\gamma \in (0, 1)$ such that

$$\|x_i(t)\| \leq c \gamma^t \|x_i(0)\| + \frac{c R \|H\|}{1 - \gamma}, \quad \forall t \geq 0, \ \forall i \in [\![1, N]\!], \tag{22}$$

where $F$ and $H$ are the matrices in (19).
Proof:
Let $\gamma \in (\rho(F), 1)$. Then, it follows from (13) that $\|r_i(t)\| \leq R$ for all $t$. Note that for any such $\gamma$, there exists $c > 0$ such that $\|F^t\| \leq c \gamma^t$ for all $t \geq 0$. Hence, the desired result is straightforward from
$$\|x_i(t)\| = \Big\| F^t x_i(0) + \sum_{s=0}^{t-1} F^{t-1-s} H r_i(s) \Big\| \leq c \gamma^t \|x_i(0)\| + \frac{c \|H\| R}{1 - \gamma}.$$
∎
VI-B Existence of the equilibrium points
In the remainder of this section, we focus on the barycentric target $r_i(t) = \sum_j P_{ij}(t) x_j^d / \sum_j P_{ij}(t)$. For $x = (x_1, \ldots, x_N) \in \mathbb{R}^{Nn}$ and $\beta \in \mathbb{R}_{>0}^N$, define

$$K_{ij}(x) := \exp\big(-\|x_j^d - A^{T_h} x_i\|_{G_{T_h}^{-1}}^2 / \varepsilon\big), \tag{23}$$
$$\alpha(x, \beta) := a \oslash (K(x) \beta), \tag{24}$$
$$r_i(x, \beta) := \frac{\sum_j K_{ij}(x) \beta_j x_j^d}{\sum_j K_{ij}(x) \beta_j}. \tag{25}$$

Then, the collective dynamics (19) with (15)-(17) is

$$x_i(t+1) = F x_i(t) + H r_i(x(t), \beta(t)), \quad i \in [\![1, N]\!], \tag{26}$$
$$\beta(t+1) = b \oslash \big(K(x(t))^\top \alpha(x(t), \beta(t))\big), \tag{27}$$

where $F := A - H A^{T_h}$ and $H := B B^\top (A^\top)^{T_h - 1} G_{T_h}^{-1}$.

Here, we characterize equilibria of (26), (27). Note that the existence of the equilibria is not trivial due to the nonlinearity of the dynamics. A point $(\bar{x}, \bar{\beta})$ is an equilibrium if and only if

$$\bar{x}_i = F \bar{x}_i + H r_i(\bar{x}, \bar{\beta}), \quad \forall i, \tag{28}$$
$$\bar{\beta} = b \oslash \big(K(\bar{x})^\top \alpha(\bar{x}, \bar{\beta})\big). \tag{29}$$

The stability of $F$ implies that it has no eigenvalue equal to $1$, and therefore $I_n - F$ is invertible. Thus, the necessary and sufficient condition for the equilibria is given by

$$\bar{x}_i = (I_n - F)^{-1} H r_i(\bar{x}, \bar{\beta}), \quad \forall i, \tag{30}$$

together with (29).
Proposition 2
Assume that $A$ is invertible. Then, there exists at least one equilibrium of (26), (27).
Proof:
Note that if a point $\bar{x}$ satisfies (30), the corresponding $\bar{\beta}$ is uniquely determined by $\bar{x}$ in (29) up to the equivalence $\sim$ [11, Theorem 4.2]. Define a continuous map $\Phi = (\Phi_1, \ldots, \Phi_N) : \mathbb{R}^{Nn} \rightarrow \mathbb{R}^{Nn}$ as

$$\Phi_i(x) := (I_n - F)^{-1} H r_i(x, \beta(x)), \tag{31}$$

where $\beta(x)$ denotes the solution of (29) for fixed $x$. From (30), fixed points of $\Phi$ are equilibria of (26), (27). Note that for any $x$ and any $i$, $r_i(x, \beta(x))$ belongs to the convex hull $\mathcal{C}$ of $\{x_j^d\}_{j=1}^{N}$, and hence $\Phi$ maps $\mathbb{R}^{Nn}$ into the compact convex set $\mathcal{D} := ((I_n - F)^{-1} H \mathcal{C})^N$. Let $\bar{\Phi}$ be the restriction of $\Phi$ in (31) to $\mathcal{D}$. Now we can utilize Brouwer's fixed point theorem. That is, since $\bar{\Phi}$ is a continuous map from a compact convex set into itself, there exists a point $\bar{x} \in \mathcal{D}$ such that $\bar{\Phi}(\bar{x}) = \bar{x}$. ∎
Sometimes, in order to emphasize the dependence of an equilibrium $(\bar{x}, \bar{\beta})$ on $\varepsilon$, we write $(\bar{x}^\varepsilon, \bar{\beta}^\varepsilon)$.
VI-C Asymptotic stability for Sinkhorn MPC
Next, we analyze the stability of the equilibrium points. For this purpose, the following lemma is crucial when $\varepsilon$ is small. Due to the limited space, we omit the proof.
Lemma 1
Assume that $A$ is invertible and $a = b = \mathbb{1}_N / N$. Then, for any equilibrium $(\bar{x}^\varepsilon, \bar{\beta}^\varepsilon)$ of (26), (27), there exists a permutation $\sigma$ such that each entry of the associated coupling matrix $N \bar{P}^\varepsilon$ converges exponentially to the corresponding entry of the permutation matrix of $\sigma$ as $\varepsilon \rightarrow 0$.
Denote by $\mathcal{E}_\sigma$ the set of all equilibria of (26), (27) having the property in Lemma 1 for a permutation $\sigma$.
For $x \in \mathbb{R}^{Nn}$ and fixed targets $r = (r_1, \ldots, r_N)$, define
$$V(x, r) := \sum_{i=1}^{N} \|r_i - A^{T_h} x_i\|_{G_{T_h}^{-1}}^2,$$
i.e., the optimal value of the finite horizon OC problems with the energy cost. Then, $V(\cdot, r)$ is a Lyapunov function of (26) where the target $r$ is fixed [14]. Indeed, we have
$$V(x(t+1), r) \leq V(x(t), r) - \sum_{i=1}^{N} \|u_i^\star(x_i(t), r_i)\|^2.$$
Given an equilibrium $(\bar{x}, \bar{\beta}) \in \mathcal{E}_\sigma$, let us take the optimal coupling as $\bar{P} := \mathrm{diag}(\alpha(\bar{x}, \bar{\beta})) K(\bar{x}) \mathrm{diag}(\bar{\beta})$, and for $(x, \beta)$, define

$$W(x, \beta) := V(x, r(x, \beta)) + d_H(\beta, \bar{\beta}). \tag{32}$$

The following theorem follows from the fact that, for sufficiently small $\varepsilon$ and sufficiently large $T_h$, (32) behaves as a Lyapunov function of (26), (27) with respect to the equilibrium.
Theorem 1
Assume that $A$ is invertible. Then the following hold:
- (i)
- (ii)
Proof:
We prove only (ii) as the proof of (i) is similar. In this proof, we regard $(x(t), \beta(t))$ as a trajectory in a metric space with the metric $\|x - x'\| + d_H(\beta, \beta')$. Fix any equilibrium satisfying the assumption in (ii). By definition, it is trivial that $W$ is positive definite on a neighborhood of the equilibrium. Moreover, the difference $W(x(t+1), \beta(t+1)) - W(x(t), \beta(t))$ can be bounded by using the triangle inequality for $d_H$.
In the sequel, we explain that sufficiently small $\varepsilon$ and sufficiently large $T_h$ enable us to take a neighborhood of the equilibrium where

(33)

holds, which means the asymptotic stability of the equilibrium [15, Theorem 1.3].
First, a straightforward calculation yields a bound on the variation of the target map $r$ in (25) on a neighborhood of the equilibrium. From Lemma 1, under the assumption of (ii), each entry of the coupling matrix $N \bar{P}$ converges exponentially to $0$ or $1$ as $\varepsilon \rightarrow 0$. Hence, the variation of $r$ around the equilibrium can be made arbitrarily small by using sufficiently small $\varepsilon$. In addition, since the corresponding constants can be chosen independently of the horizon, sufficiently large $T_h$ enables us to take a neighborhood where

(34)
Next, it follows from (27) and the contraction property (6) that $d_H(\beta(t+1), \bar{\beta})$ contracts when the Gibbs kernel is frozen at the equilibrium. Since $K$ and $\alpha$ depend on the state continuously, their variation around the equilibrium can be made arbitrarily small by taking sufficiently small $\varepsilon$. Therefore, under the assumption that the equilibrium is isolated, for any given tolerance, we can take $\varepsilon$ such that there exists a neighborhood where

(35)

Combining (34) and (35) then yields (33), which completes the proof. ∎
VII Conclusion
In this paper, we presented the concept of Sinkhorn MPC, which combines MPC and the Sinkhorn algorithm to achieve scalable, cost-effective transport over dynamical systems. For linear systems with an energy cost, we analyzed the ultimate boundedness and the asymptotic stability of Sinkhorn MPC based on the stability of the constrained MPC and of the conventional Sinkhorn algorithm.
On the other hand, in the numerical examples, we observed that the regularization parameter $\varepsilon$ plays a key role in the trade-off between the stationary and transient behaviors of Sinkhorn MPC. Hence, an important direction for future work is to investigate the design of a time-varying regularization parameter to balance this trade-off. Another direction is to extend our results to nonlinear systems with state constraints. In addition, the computational complexity can still be a problem for nonlinear systems. These problems will be addressed in future work.
References
- [1] C. Villani, Topics in Optimal Transportation. American Mathematical Soc., 2003, no. 58.
- [2] Y. Chen, T. T. Georgiou, and M. Pavon, “Optimal transport over a linear dynamical system,” IEEE Transactions on Automatic Control, vol. 62, no. 5, pp. 2137–2152, 2017.
- [3] ——, “Steering the distribution of agents in mean-field games system,” Journal of Optimization Theory and Applications, vol. 179, no. 1, pp. 332–357, 2018.
- [4] K. Bakshi, D. D. Fan, and E. A. Theodorou, “Schrödinger approach to optimal control of large-size populations,” IEEE Transactions on Automatic Control, vol. 66, no. 5, pp. 2372–2378, 2020.
- [5] M. H. de Badyn, E. Miehling, D. Janak, B. Açıkmeşe, M. Mesbahi, T. Başar, J. Lygeros, and R. S. Smith, “Discrete-time linear-quadratic regulation via optimal transport,” arXiv preprint arXiv:2109.02347, 2021.
- [6] J. Yu, S.-J. Chung, and P. G. Voulgaris, “Target assignment in robotic networks: Distance optimality guarantees and hierarchical strategies,” IEEE Transactions on Automatic Control, vol. 60, no. 2, pp. 327–341, 2014.
- [7] A. R. Mosteo, E. Montijano, and D. Tardioli, “Optimal role and position assignment in multi-robot freely reachable formations,” Automatica, vol. 81, pp. 305–313, 2017.
- [8] D. Q. Mayne, “Model predictive control: Recent developments and future promise,” Automatica, vol. 50, no. 12, pp. 2967–2986, 2014.
- [9] M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” Advances in Neural Information Processing Systems, vol. 26, pp. 2292–2300, 2013.
- [10] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly, vol. 2, no. 1-2, pp. 83–97, 1955.
- [11] G. Peyré and M. Cuturi, “Computational optimal transport: With applications to data science,” Foundations and Trends® in Machine Learning, vol. 11, no. 5-6, pp. 355–607, 2019.
- [12] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control. John Wiley & Sons, 2012.
- [13] W. Kwon and A. Pearson, “On the stabilization of a discrete constant linear system,” IEEE Transactions on Automatic Control, vol. 20, no. 6, pp. 800–801, 1975.
- [14] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, no. 6, pp. 789–814, 2000.
- [15] W. Krabs and S. Pickl, Dynamical Systems: Stability, Controllability and Chaotic Behavior. Springer-Verlag, 2010.