
Decentralized State-Dependent Markov Chain Synthesis
with an Application to Swarm Guidance

Samet Uzun, Nazım Kemal Üre, and Behçet Açıkmeşe

Samet Uzun and Behçet Açıkmeşe are with the William E. Boeing Department of Aeronautics and Astronautics, University of Washington, Seattle, WA 98125, USA (e-mail: [email protected]; [email protected]). Nazım Kemal Üre is with the Department of Artificial Intelligence and Data Engineering, Istanbul Technical University, Istanbul, 34469, Turkey (e-mail: [email protected]).

©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract

This paper introduces a decentralized state-dependent Markov chain synthesis (DSMC) algorithm for finite-state Markov chains. We present a state-dependent consensus protocol that achieves exponential convergence under mild technical conditions, without relying on any connectivity assumptions regarding the dynamic network topology. Utilizing the proposed consensus protocol, we develop the DSMC algorithm, which updates the Markov matrix based on the current state while satisfying the convergence conditions of the consensus protocol. This result establishes the desired steady-state distribution for the resulting Markov chain, ensuring exponential convergence from all initial distributions while adhering to transition constraints and minimizing state transitions. The DSMC's performance is demonstrated through a probabilistic swarm guidance example, which interprets the spatial distribution of a swarm comprising a large number of mobile agents as a probability distribution and utilizes the Markov chain to compute transition probabilities between states. Simulation results demonstrate faster convergence for the DSMC-based algorithm when compared to previous Markov chain-based swarm guidance algorithms.

Index Terms:
Markov Chains, Consensus Protocol, Decentralized Control, Probabilistic Swarm Guidance.

I Introduction

Markov chain synthesis has garnered attention from various disciplines, including physics, systems theory, computer science, and numerous other fields of science and engineering. This attention is particularly notable within the context of Markov chain Monte Carlo (MCMC) algorithms [1, 2, 3]. The fundamental idea underlying MCMC algorithms is to synthesize a Markov chain that converges to a specified steady-state distribution. MCMC algorithms are predominantly used to randomly sample a large state space in accordance with a predefined probability distribution. The current literature covers a broad spectrum of methodologies for Markov chain synthesis, incorporating both heuristic approaches and optimization-based techniques [4, 5, 6]. Each method provides specialized algorithms tailored to the synthesis of Markov chains in alignment with specific objectives or constraints. Markov chain synthesis plays a central role in probabilistic swarm guidance, which has led to the development of various algorithms incorporating additional transition and safety constraints [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. The probabilistic swarm guidance algorithm models the spatial distribution of the swarm agents as a probability distribution and employs the synthesized Markov chain to guide the spatial distribution of the swarm.

Consensus protocols form an important field of research that has a strong connection with Markov chains [18]. Consensus protocols are a set of rules used in distributed systems to achieve agreement among a group of agents on the value of a variable [19, 20, 21, 22]. Markov chains are often employed to model and analyze the dynamics and convergence properties of consensus protocols, providing insights into their behavior and performance [23, 24, 25, 26].

The current paper presents a consensus protocol specifically tailored to operate on a dynamic graph topology. We establish that the protocol provides exponential convergence guarantees under mild technical conditions. Consequently, our approach eliminates the reliance on the conventional connectivity assumptions commonly found in the existing literature [27, 28, 29, 30, 31, 32]. Building on this new consensus protocol, the paper introduces a decentralized state-dependent Markov chain synthesis (DSMC) algorithm. It is demonstrated that the synthesized Markov chain, formulated using the proposed consensus algorithm, satisfies the aforementioned mild conditions. This, in turn, ensures the exponential convergence of the Markov chain to the desired steady-state distribution. Subsequently, the DSMC algorithm's performance is demonstrated on the probabilistic swarm guidance problem, with the objective of controlling the spatial distribution of swarm agents. Through simulations, it is shown that the DSMC algorithm achieves considerably faster convergence compared to other existing Markov chain synthesis algorithms.

I-A Related Works

The Metropolis-Hastings algorithm is widely recognized as one of the most prominent MCMC techniques [4]. In [5], the synthesis of the fastest mixing Markov chain is formulated as a convex optimization problem, under the assumption that the Markov chain is symmetric. That paper also presents an extension of the method that synthesizes the fastest mixing reversible Markov chain with a given desired distribution. Furthermore, the number of variables in the optimization problem is reduced in [6] by exploiting the symmetries in the graph.

The probabilistic swarm guidance problem has been a crucial domain for the application and enhancement of Markov chain synthesis algorithms. A comprehensive review of the broader category of multi-agent algorithms is presented in [33], while a survey specifically focusing on aerial swarm robotics is provided in [34]. Additionally, [35] offers an overview of existing swarm robotic applications. For swarm guidance purposes, certain deterministic algorithms have been developed in [36, 37, 38, 39, 40, 41]. However, these algorithms may become computationally infeasible when dealing with swarms that comprise hundreds to thousands of agents. In the context of addressing the guidance problem for a large number of agents, considering the spatial distribution of swarm agents and directing it towards a desired steady-state distribution offers a computationally efficient approach. In this regard, both probabilistic and deterministic swarm guidance algorithms are presented in [42, 43, 44, 45, 46, 47, 48] for continuous state spaces. For discrete state spaces, a probabilistic guidance algorithm is introduced in [7]. This algorithm treats the spatial distribution of swarm agents, called the density distribution, as a probability distribution and employs the Metropolis-Hastings (M-H) algorithm to synthesize a Markov chain that guides the density distribution toward a desired state. The probabilistic guidance algorithm led to the development of numerous Markov chain synthesis algorithms involving specific objectives and constraints [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. In [8], the Metropolis-Hastings algorithm is extended to incorporate safety upper bound constraints on the probability vector. This paper includes numerical simulations that demonstrate the application of the extension in a probabilistic swarm guidance problem. In order to enhance convergence rates, [9] introduces a convex optimization-based technique for Markov chain synthesis. This technique formulates the objective function and constraints using linear matrix inequalities (LMIs) to synthesize a Markov chain capable of achieving a desired distribution while adhering to specified transition constraints. Notably, this study does not impose any assumptions on the Markov chain, rendering the problem inherently non-convex. The problem is convexified for practical purposes but the optimal convergence rate of the original non-convex problem cannot be attained. In [10], the approach presented in [9] is enhanced by incorporating state-feedback to further improve the convergence rate. These works are also extended to impose density upper bounds and density rate constraints in [11] and density flow and density diffusivity constraints in [12]. In [13], a more general approach is proposed for cases where the environment is stochastic, and the resulting Markov chain is shown to be a linear function of the stochastic environment and decision policy. These convex optimization problems can be solved using the interior-point algorithm but the computation time of the algorithm increases rapidly with increasing dimensionality of the state space [49]. Furthermore, neither the M-H algorithm nor convex optimization-based approaches attempt to minimize the number of state transitions. The swarm guidance problem is formulated as an optimal transport problem in [50] to minimize the number of agent transitions. 
However, besides similar computational complexity issues, the performance of the algorithm drops significantly if the current density distribution of the swarm cannot be estimated accurately. The time-inhomogeneous Markov chain approach to the probabilistic swarm guidance problem (the PSG-IMC algorithm) is developed in [14] to minimize the number of state transitions. This algorithm is computationally efficient and yields reasonable results when estimation errors are low. However, all feedback-based algorithms mentioned above require global feedback on the state of the density distribution: communication between all agents has to be established to estimate the density distribution in the probabilistic swarm guidance problem. Generating perfect feedback of the density distribution is often impractical, so estimation errors occur routinely. An alternative approach that requires only local information is developed in [15]. In this work, the time-inhomogeneous Markov chain approach presented in [14] is modified to work with local information, and the algorithm is used to minimize the number of state transitions. Nevertheless, in both the global and local information based time-inhomogeneous Markov chain approaches, the convergence rate diminishes significantly when the state transition capabilities become more restricted. As can be seen in Corollary 1 of [14] or Corollary 3 of [15], the transition probability from a state to any directly connected state cannot be higher than the desired density value of the corresponding directly connected state. In situations where there are sparsely connected regions in the state space, the sum of desired density values among directly connected states is often relatively low. Consequently, there is a higher probability of remaining in the same state rather than transitioning to other states. In terms of the convergence rate, these algorithms are therefore only effective in cases with high transition capabilities. Additionally, the performance of these algorithms is highly sensitive to hyperparameters, which require careful selection for optimal results in each experiment.

Graph temporal logic (GTL) is introduced in [16] to impose high-level task specifications as constraints on the Markov chain synthesis. The synthesis is formulated as a mixed-integer nonlinear programming (MINLP) feasibility problem, which is solved using a coordinate descent algorithm. In addition, an equivalence is proven between the feasibility of the MINLP and the feasibility of a mixed-integer linear program (MILP) for the particular case where the agents move along the nodes of a complete graph. While this study assumes homogeneous swarms for Markov chain synthesis subject to finite-horizon GTL formulas, an improved version of the formulation is presented in [17] to enable probabilistic control of heterogeneous swarms subject to infinite-horizon GTL formulas. Instead of solving the resulting MINLP using a coordinate descent algorithm, a sequential scheme, which is faster, more accurate, and robust to the choice of the starting point, is developed in the latter paper.

Markov chains and consensus protocols share a close relationship. The rich theory of Markov chains has proven valuable in analyzing specific consensus protocols; notable works such as [23, 24, 25, 26] have leveraged Markov chain theory to provide insights and analysis for consensus protocols. Consensus protocols, in contrast to Markov chains, operate without the limitations of non-negative nodes and edges or the requirement that the nodes sum to one [18]. This broader scope enables consensus protocols to address a significantly wider range of problems. Consequently, there is significant interest in consensus protocols across a broad range of multi-agent networked systems research, including distributed coordination of mobile autonomous agents [27, 28, 29, 30, 31, 51], distributed optimization [52, 53, 54, 55, 56], distributed state estimation [57, 58], and dynamic load-balancing for parallel processors [59, 60]. Comprehensive survey papers review the research on consensus protocols [19, 20, 21, 22]. In many scenarios, the network topology of the consensus protocol is a switching topology due to failures, formation reconfiguration, or state-dependence. A large number of papers propose consensus protocols with switching network topologies, and convergence proofs of these algorithms are provided under various assumptions [27, 28, 29, 30, 31, 32]. In [27], a consensus protocol is proposed to solve the alignment problem of mobile agents, where the switching topology is assumed to be periodically connected, meaning that the union of networks over a finite time interval is strongly connected. Another algorithm is proposed in [28] that assumes the underlying switching network topology is ultimately connected, meaning that the union of graphs over an infinite interval is strongly connected. In [29], previous works are extended to solve the consensus problem on networks with limited and unreliable information exchange and dynamically changing interaction topologies; the convergence of the algorithm is established under the ultimately connected assumption. Another consensus protocol is introduced in [30] for the cooperation of vehicles performing a shared task using inter-vehicle communication. Based on this work, a theoretical framework is presented in [31] to solve consensus problems under a variety of assumptions on the network topology, such as a strongly connected switching topology. In [32], a consensus protocol with state-dependent weights is proposed, and it is assumed that the corresponding graphs are weakly connected, which means that the network is assumed to be connected over the iterations. Additionally, optimization-based algorithms are proposed in [23] and [61] to obtain a high convergence rate for the consensus protocol.

I-B Main Contributions

In this paper, we first propose a consensus protocol with state-dependent weights. The proposed consensus protocol does not require any connectivity assumption on the dynamic network topology, unlike the existing methods in the literature [27, 28, 29, 30, 31, 32]. We provide an exponential convergence guarantee for the consensus protocol under some mild technical conditions, which can be verified straightforwardly. We then present a decentralized state-dependent Markov chain synthesis (DSMC) algorithm based on the proposed consensus protocol, and we prove that the resulting DSMC algorithm satisfies these mild conditions. This result is employed to prove that the resulting Markov chain has the desired steady-state distribution and that all initial distributions converge exponentially to this steady-state. Unlike the homogeneous Markov chain synthesis algorithms in [4, 7, 5, 6, 8, 9], the Markov matrix synthesized by our algorithm approaches the identity matrix as the probability distribution converges to the desired steady-state distribution. Hence, the proposed algorithm attempts to minimize the number of state transitions, which eventually converges to zero as the probability distribution converges to the desired steady-state distribution. Whereas the previous time-inhomogeneous Markov chain synthesis algorithms in [14, 15] only provide asymptotic convergence, the DSMC algorithm provides an exponential convergence rate guarantee. Furthermore, unlike the previous algorithms in [14, 15], the convergence rate of the DSMC algorithm does not decrease rapidly in scenarios where the state space contains sparsely connected regions. Due to the decentralized nature of the consensus protocol, the Markov chain synthesis relies on local information, similar to the approach presented in [15], and a complex communication architecture is not required for the estimation of the distribution. By presenting numerical evidence within the context of the probabilistic swarm guidance problem, we demonstrate that the convergence rate of the swarm distribution to the desired steady-state distribution is substantially faster when compared to previous methodologies. In summary:

  • We propose a consensus protocol with state-dependent weights and prove its exponential convergence under mild technical conditions, without relying on the typical connectivity assumptions associated with the underlying graph topology.

  • Based on the proposed consensus protocol, we introduce the DSMC algorithm, which is shown to meet the convergence conditions outlined by the consensus protocol.

  • Simulation results demonstrate that the DSMC algorithm achieves faster convergence, characterized by an exponential convergence guarantee, compared to existing homogeneous and time-inhomogeneous Markov chain synthesis algorithms presented in [7] and [14].

I-C Organization and Notation

The paper is organized as follows. Section II presents the consensus protocol with state-dependent weights. The decentralized state-dependent Markov chain synthesis (DSMC) algorithm is introduced in Section III. Section IV introduces the probabilistic swarm guidance problem formulation and presents numerical simulations of swarms converging to desired distributions. The paper is concluded in Section V.

Notation: $\mathbb{R}$ and $\mathbb{C}$ represent the set of real numbers and complex numbers, respectively. $\mathbb{R}_{+}$ and $\mathbb{Z}_{+}$ represent the set of non-negative real numbers and non-negative integers, respectively. $\mathbb{R}^{n}$ is the $n$-dimensional real vector space. $\bm{0}$ is a zero matrix and $\bm{1}$ is a matrix of ones of appropriate dimensions. $x[i](k)$ represents the $i^{\text{th}}$ element of the vector $x\in\mathbb{R}^{n}$ at time $k$. $\|x\|_{p}$ denotes the $\ell_{p}$ vector norm. $(v_{1},v_{2},\dots,v_{n})$ represents a vector such that $(v_{1},v_{2},\dots,v_{n})\equiv[v_{1}^{T},v_{2}^{T},\dots,v_{n}^{T}]^{T}$, where the $v_{i}$ have arbitrary dimensions. $M>(\geq)\;\bm{0}$ implies that $M\in\mathbb{R}^{m\times n}$ is a positive (non-negative) matrix. $\mathcal{I}_{1}^{m}$ denotes the set of integers $\{1,2,\dots,m\}$. $\lambda_{i}(A)$ is the $i^{\text{th}}$ eigenvalue of a matrix $A\in\mathbb{R}^{m\times m}$ such that $\lambda_{i}(A)\leq\lambda_{i+1}(A)$ for $i\in\mathcal{I}_{1}^{m-1}$. $\sigma(A)$ is the set of eigenvalues of $A\in\mathbb{R}^{m\times m}$ such that $\sigma(A)=\{\lambda_{1},\lambda_{2},\dots,\lambda_{m}\}$. $\emptyset$ denotes the empty set. $V\setminus W$ represents the elements in set $V$ that are not in set $W$. $\mathcal{P}$ denotes the probability of a random variable.

II A Consensus Protocol with State-Dependent Weights

The consensus protocol is a process used for achieving agreement among distributed agents. In this section, we introduce a consensus protocol with state-dependent weights to reach consensus on time-varying weighted graphs. Unlike other consensus protocols proposed in the literature, the protocol we introduce does not require any connectivity assumption on the dynamic network topology. We provide a theoretical analysis proving exponential convergence under some mild technical conditions. We then use the proposed consensus protocol for the synthesis of a column-stochastic Markov matrix in Section III, where it is proven that these mild conditions are satisfied by the Markov matrix synthesis algorithm. We first define the graph for our consensus approach.

Definition 1.

(Time-varying weighted graph) A time-varying weighted graph is a tuple $\mathcal{G}_{w}(k)=(\mathcal{V},\mathcal{E},w(k))$, where $\mathcal{V}=\{\mathrm{v}_{1},\dots,\mathrm{v}_{m}\}$ is the set of vertices, $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ is the set of undirected edges, and $w(k)$ is a time-varying weight function $w:\mathcal{E}\to\mathbb{R}_{+}$, where $w_{i,j}(k)$ represents the weight of the edge $\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}$ at time $k$.

  • (Values of vertices) A time-varying vector $e(k)\in\mathbb{R}^{m}$ such that $\bm{1}^{T}e(k)=0$ for $k\in\mathbb{Z}_{+}$ defines the vector of values of the vertices, i.e., $e[i](k)$ represents the value of the vertex $\mathrm{v}_{i}$ at time $k$.

  • (Uniform graph induced by a time-varying weighted graph) $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is the uniform graph induced by $\mathcal{G}_{w}(k)$; it is obtained by setting $w_{i,j}(k)=1$ for all $\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}$ and all $k\in\mathbb{Z}_{+}$.

  • (Adjacency matrices) Two vertices $\mathrm{v}_{i}$ and $\mathrm{v}_{j}$ are called adjacent if $\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}$. The adjacency matrix of the uniform graph $\mathcal{G}$ is denoted by $A_{a}(\mathcal{G})$, where $A_{a}(\mathcal{G})[i,j]=1$ if $\mathrm{v}_{i}$ and $\mathrm{v}_{j}$ are adjacent, and $0$ otherwise. The adjacency matrix of the time-varying weighted graph $\mathcal{G}_{w}(k)$ is denoted by $A_{a}(\mathcal{G}_{w}(k))$, where $A_{a}(\mathcal{G}_{w}(k))[i,j]=A_{a}(\mathcal{G})[i,j]\,w_{i,j}(k)$.

  • (Degree matrices) $\mathcal{D}(\mathcal{G})$ and $\mathcal{D}(\mathcal{G}_{w}(k))$ are the diagonal degree matrices of the graphs $\mathcal{G}$ and $\mathcal{G}_{w}(k)$ such that $\mathcal{D}(\mathcal{G})[i,i]=\sum_{j}A_{a}(\mathcal{G})[i,j]$ and $\mathcal{D}(\mathcal{G}_{w}(k))[i,i]=\sum_{j\in\mathcal{I}_{1}^{m}}A_{a}(\mathcal{G}_{w}(k))[i,j]$.

  • (Laplacian matrices) $\mathcal{L}(\mathcal{G})$ and $\mathcal{L}(\mathcal{G}_{w}(k))$ are the Laplacian matrices of the graphs $\mathcal{G}$ and $\mathcal{G}_{w}(k)$ such that $\mathcal{L}(\mathcal{G})=\mathcal{D}(\mathcal{G})-A_{a}(\mathcal{G})$ and $\mathcal{L}(\mathcal{G}_{w}(k))=\mathcal{D}(\mathcal{G}_{w}(k))-A_{a}(\mathcal{G}_{w}(k))$.

  • (Activated and deactivated edges) If $w_{i,j}(k)=0$ for $\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}$, the edge $\{\mathrm{v}_{i},\mathrm{v}_{j}\}$ is deactivated. $\mathcal{E}_{A}(k)$ and $\mathcal{E}_{D}(k)$ represent the activated and deactivated edges at time $k$, so that $\mathcal{E}=\mathcal{E}_{A}(k)\cup\mathcal{E}_{D}(k)$.

For convenience, we write $A_{a}$, $A_{a_{w}}(k)$, $\mathcal{D}$, $\mathcal{D}_{w}(k)$, $\mathcal{L}$, and $\mathcal{L}_{w}(k)$ instead of $A_{a}(\mathcal{G})$, $A_{a}(\mathcal{G}_{w}(k))$, $\mathcal{D}(\mathcal{G})$, $\mathcal{D}(\mathcal{G}_{w}(k))$, $\mathcal{L}(\mathcal{G})$, and $\mathcal{L}(\mathcal{G}_{w}(k))$, respectively.
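As a concrete illustration, the objects in Definition 1 can be assembled as follows. This is a minimal sketch in which the edge list and weight values are assumptions made for the example (vertices are 0-indexed in code):

```python
# A minimal sketch (assumed edge list and weights, for illustration only) of the
# objects in Definition 1; vertices are 0-indexed in code.
import numpy as np

m = 4
edges = [(0, 1), (1, 2), (2, 3)]                 # undirected edges of the uniform graph G
w = {(0, 1): 0.3, (1, 2): 0.0, (2, 3): 0.2}      # weights w_{i,j}(k) at some fixed time k

A_a = np.zeros((m, m))                           # adjacency matrix of the uniform graph G
A_aw = np.zeros((m, m))                          # adjacency matrix of G_w(k)
for (i, j) in edges:
    A_a[i, j] = A_a[j, i] = 1.0
    A_aw[i, j] = A_aw[j, i] = w[(i, j)]          # A_aw[i,j] = A_a[i,j] * w_{i,j}(k)

D = np.diag(A_a.sum(axis=1))                     # degree matrix of G
D_w = np.diag(A_aw.sum(axis=1))                  # degree matrix of G_w(k)
L = D - A_a                                      # Laplacian of G
L_w = D_w - A_aw                                 # Laplacian of G_w(k); edge (1,2) is deactivated
```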

The following assumption on the topology of the uniform graph $\mathcal{G}$ is needed for the convergence analysis.

Assumption 1.

The uniform graph $\mathcal{G}$ is a connected graph, which means there exists a path between every pair of vertices, or equivalently, $(I+A_{a})^{m-1}>\bm{0}$ for $A_{a}=A_{a}^{T}\in\mathbb{R}^{m\times m}$ [18, Section 2.1].

We will consider time-varying weighted graphs where the weights of the edges depend on the values of vertices. The following definition is needed to present the relation between the weights of the edges and the values of vertices.

Definition 2.

(Index sets with respect to values of vertices) The index sets $I_{p}(k)$ and $I_{n}(k)$ contain the indices of the non-negative and negative valued vertices at time $k$, i.e.,

$$I_{p}(k)=\{i: e[i](k)\geq 0\},\qquad I_{n}(k)=\{i: e[i](k)<0\}.$$

The index set $I_{n_{p}}(k)\subseteq I_{n}(k)$ consists of the indices of the negative valued vertices that are adjacent to non-negative valued vertices, i.e.,

$$I_{n_{p}}(k)=\{i: i\in I_{n}(k),\ j\in I_{p}(k)\text{ and }A_{a}[i,j]=1\}.$$

The set that contains the edges between $\mathrm{v}_{i}$, $i\in I_{n}(k)$, and $\mathrm{v}_{j}$, $j\in I_{n_{p}}(k)$, is defined as

$$I_{n_{p},n}(k)=\{\{i,j\}: i\in I_{n}(k),\ j\in I_{n_{p}}(k)\text{ and }A_{a}[i,j]=1\}.$$
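These index sets are straightforward to compute. The following sketch (an illustration; the function name and data layout are our assumptions, not the paper's) derives them from a value vector $e(k)$ and the uniform adjacency matrix $A_{a}$:

```python
# An illustrative helper computing the index sets of Definition 2 from the
# value vector e(k) and the uniform adjacency matrix A_a.
import numpy as np

def index_sets(e, A_a):
    m = len(e)
    I_p = {i for i in range(m) if e[i] >= 0}     # non-negative valued vertices
    I_n = {i for i in range(m) if e[i] < 0}      # negative valued vertices
    # negative valued vertices adjacent to at least one non-negative valued vertex
    I_np = {i for i in I_n if any(A_a[i, j] == 1 for j in I_p)}
    # edges between a negative valued vertex and a vertex in I_np
    I_np_n = {frozenset((i, j)) for i in I_n for j in I_np
              if i != j and A_a[i, j] == 1}
    return I_p, I_n, I_np, I_np_n
```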

We provide a condition that relates the values of the vertices to the weights of the edges of the graph $\mathcal{G}_{w}(k)$. This condition is the key to the convergence proof of the consensus protocol with state-dependent weights.

Condition 1.

  (a) If $i\in I_{p}(k)$, then there exists a $c_{1}$ such that $0<c_{1}\leq w_{i,j}(k)$ for all $\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}$.

  (b) If $\{i,j\}\in I_{n_{p},n}(k)$, then $w_{i,j}(k)=0$.

In other words, Condition 1a implies that the weight of an edge connecting a non-negative valued vertex to any other vertex cannot be arbitrarily close to $0$. Therefore, if $A_{a_{w}}[i,j](k)=0$ where $e[i](k)\geq 0$, then $A_{a}[i,j]=0$. Condition 1b implies that no transition is allowed between two negative valued vertices if at least one of them is adjacent to a non-negative valued vertex. Edges that have zero weights are called deactivated edges. Note that although $\mathcal{G}$ is assumed to be a connected graph in Assumption 1, $\mathcal{G}_{w}(k)$ may not be a connected graph for some $k$ due to these deactivated edges.

The following example is provided to help the reader interpret the implication of Condition 1.

Example 1

The values of the vertices and the weights of the edges of a graph at a time $k$ are represented in Figure 1. According to Definition 2, $I_{p}(k)=\{1\}$, $I_{n}(k)=\{2,3,4,5,6\}$, $I_{n_{p}}(k)=\{2,4\}$, and $I_{n_{p},n}(k)=\{\{2,3\},\{2,5\},\{4,5\}\}$. Condition 1a implies that $0<c_{1}\leq w_{1,2}(k)$ and $0<c_{1}\leq w_{1,4}(k)$ since $0\leq e[1](k)$. Condition 1b implies that the weights of all other edges of the nodes $\mathrm{v}_{2}$ and $\mathrm{v}_{4}$ are zero, i.e., $w_{2,3}(k)=w_{2,5}(k)=w_{4,5}(k)=0$.


Figure 1: Vertices of the graph and corresponding values are represented by circles. The indices of the vertices are shown in their corners. Edges of the vertices are represented by lines and weights of the edges are represented on the sides of the edges. Green and red lines represent the edges that are implied by Condition 1a and Condition 1b, respectively. Black lines represent the remaining edges of the graph.

The following lemma shows that, under Condition 1, each non-negative valued vertex is connected to a negative valued vertex in the graph $\mathcal{G}_{w}(k)$.

Lemma 1.

Assume that Condition 1 holds. If $e(k)\neq\bm{0}$, then for each $i$ such that $e[i](k)\geq 0$, there exists $j$ such that $e[j](k)<0$ and $A_{a_{w}}^{n}[i,j](k)\neq 0$ for some exponent $n\in\mathbb{Z}_{+}$.

Proof.

Since $\bm{1}^{T}e(k)=0$ by Definition 1 and $e(k)\neq\bm{0}$, there exist $i,j$ such that $e[i](k)>0$ and $e[j](k)<0$.

The lemma is proven by contradiction. Suppose that there exists a non-negative valued vertex $\mathrm{v}_{i}$ such that $e[i](k)\geq 0$ and, for all $j$ with $e[j](k)<0$, $A_{a_{w}}^{n}[i,j](k)=0$ for all $n\in\mathbb{Z}_{+}$. In other words, there exists a non-negative valued vertex that is not connected to any negative valued vertex. Denote the set of such non-negative valued vertices as $\mathcal{V}^{+}(k)$, and the set of remaining vertices as $\mathcal{V}^{-}(k)=\mathcal{V}\setminus\mathcal{V}^{+}(k)$. Note that all negative valued vertices, and all non-negative valued vertices that are connected to negative valued vertices, are in $\mathcal{V}^{-}(k)$. Since, by the supposition above, vertices in $\mathcal{V}^{+}(k)$ cannot be connected to vertices in $\mathcal{V}^{-}(k)$, the adjacency matrix of the time-varying weighted graph can be permuted as

$$\widehat{A}_{a_{w}}(k)=P(k)A_{a_{w}}(k)P(k)^{T}=\begin{bmatrix}A_{a_{w}}^{+}(k)&\bm{0}\\ \bm{0}&A_{a_{w}}^{-}(k)\end{bmatrix},$$

where $P(k)$ is a permutation matrix, and $A_{a_{w}}^{+}(k)$ and $A_{a_{w}}^{-}(k)$ are the adjacency matrices of the sub-graphs that consist of the vertices in $\mathcal{V}^{+}(k)$ and $\mathcal{V}^{-}(k)$, respectively. Note that, due to Condition 1a, if $A_{a_{w}}[i,j](k)=0$ where $e[i](k)\geq 0$, then $A_{a}[i,j]=0$. Therefore, $A_{a}$ can also be permuted as

$$\widehat{A}_{a}(k)=P(k)A_{a}P(k)^{T}=\begin{bmatrix}A_{a}^{+}(k)&\bm{0}\\ \bm{0}&A_{a}^{-}(k)\end{bmatrix}. \quad (1)$$

Due to Assumption 1, $\mathcal{G}$ is a connected graph, so $A_{a}$ is an irreducible matrix, and hence the matrix $\widehat{A}_{a}(k)$ is also irreducible. However, the matrix in Eq. (1) is reducible, which contradicts the supposition made at the beginning of the proof. Therefore, all non-negative valued vertices are connected to some negative valued vertices. ∎

Therefore, even if $\mathcal{G}_{w}(k)$ may not be a connected graph for some $k$, there always exists at least one connected sub-graph that consists of both non-negative and negative valued vertices. It is important to note that, in these connected sub-graphs, each edge connects a non-negative valued vertex to another vertex; there is no edge between two negative valued vertices, due to Condition 1b. Therefore, the weights of all their edges are at least $c_{1}$, due to Condition 1a.

The following lemma provides a bound for the eigenvalues of the Laplacian matrix $\mathcal{L}_{w}(k)$ of a connected time-varying weighted graph when the weights lie within a certain range.

Lemma 2.

Let $\mathcal{G}_{w}(k)$ be a connected time-varying weighted graph with $m$ vertices. Assume that the weights of the edges of the graph $\mathcal{G}_{w}(k)$ satisfy

$$c_{1}\leq w_{i,j}(k)\leq\frac{1-c_{2}}{\max(\mathcal{D}[i,i],\mathcal{D}[j,j])}, \quad (2)$$

where $0<c_{1}$ and $0<c_{2}<1$. Then the eigenvalues of the Laplacian matrix $\mathcal{L}_{w}(k)$ satisfy

$$\sigma(\mathcal{L}_{w}(k))\subseteq[c_{3},\,(2-c_{3})]\cup\{0\},$$

where $0<c_{3}=\min\big(c_{1}\tfrac{4}{m(m-1)},\,2c_{2}\big)\leq 1$.

Proof.

According to Gershgorin's circle theorem [62], the eigenvalues of $\mathcal{L}_{w}(k)$ satisfy

$$\sigma(\mathcal{L}_{w}(k))\subseteq\bigcup_{i}\Big\{\lambda\in\mathbb{C}:\big|\lambda-\mathcal{L}_{w}[i,i](k)\big|\leq\sum_{j\in\mathcal{I}_{1}^{m}\setminus\{i\}}\big|\mathcal{L}_{w}[i,j](k)\big|\Big\}. \quad (3)$$

Since $\mathcal{L}_{w}[i,i](k)=\sum_{\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}}w_{i,j}(k)$ for the weighted Laplacian matrix, where $0<c_{1}\leq w_{i,j}(k)$, Eq. (3) implies that

$$\sigma(\mathcal{L}_{w}(k))\subseteq\bigcup_{i}\Big\{\lambda\in\mathbb{C}:\Big|\lambda-\sum_{\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}}w_{i,j}(k)\Big|\leq\sum_{\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}}w_{i,j}(k)\Big\}=\bigcup_{i}\Big\{\lambda\in\mathbb{C}:0\leq\lambda\leq 2\sum_{\{\mathrm{v}_{i},\mathrm{v}_{j}\}\in\mathcal{E}}w_{i,j}(k)\Big\}.$$

According to Eq. (2), $w_{i,j}(k)\leq(1-c_{2})/\mathcal{D}[i,i]$ and $w_{i,j}(k)\leq(1-c_{2})/\mathcal{D}[j,j]$. Then the eigenvalues of the Laplacian matrix $\mathcal{L}_{w}(k)$ satisfy

$$0\leq\lambda_{i}(k)\leq(2-2c_{2})\ \text{ for }\ \lambda_{i}(k)\in\sigma(\mathcal{L}_{w}(k)). \quad (4)$$

We now show that the second smallest eigenvalue of the Laplacian matrix $\mathcal{L}_{w}(k)$ is also lower bounded. In [63, Theorem 4.2], the following lower bound is provided for the second smallest eigenvalue of the Laplacian matrix of a connected unweighted graph $\mathcal{G}$:

$$\frac{4}{\text{diam}(\mathcal{G})\,m}\leq\lambda_{2}(\mathcal{L}),$$

where $\text{diam}(\mathcal{G})$ is the diameter of the graph, i.e., the length of the longest shortest path between any two vertices, and $m$ is the number of vertices of the graph. Define the Laplacian matrix $\mathcal{L}_{c_{1}}$ for the case in which the weights of all edges of the graph $\mathcal{G}$ are $c_{1}$ instead of $1$. Let $\lambda_{i}(\mathcal{L})$ and $\lambda_{i}(\mathcal{L}_{c_{1}})$ be the $i^{\text{th}}$ eigenvalues of the matrices $\mathcal{L}$ and $\mathcal{L}_{c_{1}}$ such that $\lambda_{i}(\mathcal{L})\leq\lambda_{i+1}(\mathcal{L})$ and $\lambda_{i}(\mathcal{L}_{c_{1}})\leq\lambda_{i+1}(\mathcal{L}_{c_{1}})$ for $i\in\mathcal{I}_{1}^{m-1}$. Then $c_{1}\lambda_{i}(\mathcal{L})=\lambda_{i}(\mathcal{L}_{c_{1}})$ for $i\in\mathcal{I}_{1}^{m}$. Since $c_{1}\leq w_{i,j}(k)$ for the graph $\mathcal{G}_{w}(k)$, $\mathcal{L}_{w}(k)-\mathcal{L}_{c_{1}}$ is also a positive semi-definite Laplacian matrix, i.e., $u^{T}(\mathcal{L}_{w}(k)-\mathcal{L}_{c_{1}})u\geq 0$ for all $u\in\mathbb{R}^{m}$. Let $u_{2}\in\mathbb{R}^{m}$ denote a unit eigenvector such that $\mathcal{L}_{w}(k)u_{2}=\lambda_{2}(\mathcal{L}_{w}(k))u_{2}$, and let $z_{i}\in\mathbb{R}^{m}$ denote unit eigenvectors such that $\mathcal{L}_{c_{1}}z_{i}=\lambda_{i}(\mathcal{L}_{c_{1}})z_{i}$ for all $i\in\mathcal{I}_{1}^{m}$. Note that $\bm{1}$ is an eigenvector of these Laplacian matrices with associated eigenvalue $0$. Since these Laplacian matrices are symmetric, all their other eigenvectors are orthogonal to $\bm{1}$. Then $u_{2}^{T}z_{1}=u_{2}^{T}\bm{1}=0$ and $u_{2}\in\text{span}(z_{2},z_{3},\dots,z_{m})$, which means $u_{2}=\sum_{i\in\mathcal{I}_{2}^{m}}\alpha_{i}z_{i}$, where $\sum_{i\in\mathcal{I}_{2}^{m}}\alpha_{i}^{2}=1$. Thus,

$$\begin{aligned}0&\leq u_{2}^{T}(\mathcal{L}_{w}(k)-\mathcal{L}_{c_{1}})u_{2}\\&=\lambda_{2}(\mathcal{L}_{w}(k))-u_{2}^{T}\mathcal{L}_{c_{1}}u_{2}\\&=\lambda_{2}(\mathcal{L}_{w}(k))-\sum_{i\in\mathcal{I}_{2}^{m}}\alpha_{i}z_{i}^{T}\lambda_{i}(\mathcal{L}_{c_{1}})\alpha_{i}z_{i}\\&=\lambda_{2}(\mathcal{L}_{w}(k))-\sum_{i\in\mathcal{I}_{2}^{m}}\alpha_{i}^{2}\lambda_{i}(\mathcal{L}_{c_{1}})\\&\leq\lambda_{2}(\mathcal{L}_{w}(k))-\lambda_{2}(\mathcal{L}_{c_{1}}).\end{aligned}$$

Hence, the following bound holds for the second smallest eigenvalue of $\mathcal{L}_{w}(k)$:

$$c_{1}\frac{4}{\text{diam}(\mathcal{G})\,m}\leq\lambda_{2}(\mathcal{L}_{w}(k)). \quad (5)$$

Since $0<\text{diam}(\mathcal{G})\leq m-1$ for a connected graph, and due to Eq. (4), the eigenvalues of the matrix $\mathcal{L}_{w}(k)$ satisfy

$$\sigma(\mathcal{L}_{w}(k))\subseteq[c_{3},\,(2-c_{3})]\cup\{0\},$$

where $0<c_{3}=\min\big(c_{1}\tfrac{4}{m(m-1)},\,2c_{2}\big)\leq\min(2c_{1},2c_{2})\leq\min(2-2c_{2},\,2c_{2})<1$. ∎
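The bound can be spot-checked numerically. The following sketch (an assumed toy graph, not part of the proof) builds a connected weighted graph whose weights satisfy Eq. (2) and verifies that the nonzero spectrum of $\mathcal{L}_{w}(k)$ lies in $[c_{3},\,2-c_{3}]$:

```python
# A numerical spot-check (assumed toy graph, for illustration only) of Lemma 2:
# with weights satisfying Eq. (2), the nonzero spectrum of L_w lies in [c3, 2-c3].
import numpy as np

m, c1, c2 = 4, 0.1, 0.4
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]       # a connected graph
A_a = np.zeros((m, m))
for i, j in edges:
    A_a[i, j] = A_a[j, i] = 1.0
D = A_a.sum(axis=1)

W = np.zeros((m, m))
for i, j in edges:
    hi = (1.0 - c2) / max(D[i], D[j])          # upper bound of Eq. (2)
    W[i, j] = W[j, i] = 0.5 * (c1 + hi)        # any value in [c1, hi] works
L_w = np.diag(W.sum(axis=1)) - W               # weighted Laplacian

eigs = np.sort(np.linalg.eigvalsh(L_w))
c3 = min(c1 * 4.0 / (m * (m - 1)), 2.0 * c2)
assert abs(eigs[0]) < 1e-10                    # the single zero eigenvalue
assert np.all(eigs[1:] >= c3) and np.all(eigs[1:] <= 2.0 - c3)
```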

The following theorem shows that the values of the vertices of $\mathcal{G}_{w}(k)$ converge to $0$ under specific value transition dynamics.

Theorem 1.

Suppose that Assumption 1 and Condition 1 are satisfied, and that the vector $e(k)$ in Definition 1 evolves according to

$$e[i](k+1)=e[i](k)+\sum_{j\in\mathcal{I}_{1}^{m}}\big(e[j](k)-e[i](k)\big)A_{a_{w}}[i,j](k), \quad (6)$$

for all $i\in\mathcal{I}_{1}^{m}$, where $A_{a_{w}}[i,j](k)=A_{a}[i,j]\,w_{i,j}(k)$, $0\leq w_{i,j}(k)\leq(1-c_{2})/\max(\mathcal{D}[i,i],\mathcal{D}[j,j])$, and $0<c_{2}<1$. Then $e(k)$ converges exponentially to $\bm{0}$, i.e., $\|e(k+1)\|/\|e(k)\|\leq\gamma$, where $0\leq\gamma<1$.

Proof.

Eq. (6) is equivalent to the following equation:

$$e(k+1)=F(k)e(k),\quad\text{where}\quad F(k)=I-\mathcal{L}_{w}(k).$$

Note that $F(k)$ is a doubly stochastic symmetric matrix, i.e., $\bm{1}^{T}F(k)=\bm{1}^{T}$, $F(k)\bm{1}=\bm{1}$, and $F(k)^{T}=F(k)$, and the non-zero, non-diagonal elements of $F(k)$ represent the weights of the activated edges.

Condition 1a forces the weights of the edges connecting non-negative valued vertices to other vertices to be at least $c_{1}$. However, the weights of some other edges may be $0$. Thus, the weighted graph may not be connected, which means the $\mathcal{L}_{w}(k)$ and $F(k)$ matrices may be reducible. Hence, $\mathcal{L}_{w}(k)$ may have repeated eigenvalues at $0$, and $F(k)$ may have repeated eigenvalues at $1$.

Even if $\mathcal{G}_{w}(k)$ may not be connected, due to Lemma 1 all non-negative valued vertices belong to connected sub-graphs of $\mathcal{G}_{w}(k)$ that consist of both non-negative and negative valued vertices. Note that in these connected sub-graphs there is no edge between two negative valued vertices, due to Condition 1b. Therefore, the weights of all their edges are at least $c_{1}$ due to Condition 1a, which enables us to apply Lemma 2 in what follows. Denote the number of these connected sub-graphs by $s(k)\geq 1$, and denote the reordered $F(k)$ matrix and $e(k)$ vector as

$$\widehat{F}(k)=P(k)F(k)P(k)^{T}=\begin{bmatrix}\widehat{F}_{0}(k)&&&\\ &\widehat{F}_{1}(k)&&\\ &&\ddots&\\ &&&\widehat{F}_{s(k)}(k)\end{bmatrix},\qquad \hat{e}(k)=P(k)e(k)=(\hat{e}_{0}(k),\hat{e}_{1}(k),\dots,\hat{e}_{s(k)}(k)),$$

where $P(k)$ is a permutation matrix, $\widehat{F}_{i}(k)$, $i\in I_{s}(k)=\{1,\dots,s(k)\}$, are irreducible matrices that correspond to the connected sub-graphs, and $\widehat{F}_{0}(k)$ is the matrix that corresponds to the remaining sub-graph. Note that there are both non-negative and negative elements in $\hat{e}_{i}(k)$, $i\in I_{s}(k)$, and all elements of $\hat{e}_{0}(k)$ are negative.

Recall that, in Example 1, $s(k)=1$: the values of the vertices $\mathrm{v}_{1}$, $\mathrm{v}_{2}$, and $\mathrm{v}_{4}$ belong to the vector $\hat{e}_{1}(k)$, and the values of the vertices $\mathrm{v}_{3}$, $\mathrm{v}_{5}$, and $\mathrm{v}_{6}$ belong to the vector $\hat{e}_{0}(k)$.

Let $\alpha_{i}(k)$ be the average of the elements in $\hat{e}_{i}(k)$, i.e., $\alpha_{i}(k)=\bm{1}^{T}\hat{e}_{i}(k)/m_{i}(k)$, where $m_{i}(k)$ is the number of elements in $\hat{e}_{i}(k)$, and let $\bm{\alpha}_{i}(k)=\bm{1}\alpha_{i}(k)$, $\bm{\alpha}_{i}(k)\in\mathbb{R}^{m_{i}(k)}$. Lemma 2 implies that exactly one eigenvalue of $\widehat{F}_{i}(k)$, $i\in I_{s}(k)$, equals $1$, with associated eigenvector $\bm{1}$, and the remaining eigenvalues lie in $[-(1-c_{3_{i}}),\,(1-c_{3_{i}})]$, $0<c_{3_{i}}\leq 1$, since the matrices $\widehat{F}_{i}(k)$, $i\in I_{s}(k)$, correspond to connected sub-graphs in which the weights of all edges are at least $c_{1}>0$. Since the values of all vertices are negative in the remaining sub-graph corresponding to the matrix $\widehat{F}_{0}(k)$, there is no lower bound for the weights of its edges, while the upper bound in Eq. (6) still applies. Hence, $\sigma(\widehat{F}_{0}(k))\subseteq[-(1-c_{3_{0}}),\,1]$, $0<c_{3_{0}}\leq 1$. Therefore, the following holds:

$$\begin{aligned}\|\widehat{F}_{i}(k)\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2}&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)-\widehat{F}_{i}(k)\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))\|^{2}\\&=(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))^{T}\widehat{F}_{i}(k)^{T}\widehat{F}_{i}(k)(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))\\&\leq c_{4}^{2}(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))^{T}(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))\\&=c_{4}^{2}\|\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2},\end{aligned} \quad (7)$$

where $c_{4}^{2}=(1-\min_{i}c_{3_{i}})^{2}<1$ for all $i\in I_{s}(k)$.

We now show that $\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}\leq c_{5}^{2}\|\hat{e}_{i}(k)\|^{2}$, where $c_{5}^{2}<1$, for all $i\in I_{s}(k)$. Note that since $\bm{1}^{T}\bm{\alpha}_{i}(k)=\bm{1}^{T}\hat{e}_{i}(k)$, the vectors $\bm{\alpha}_{i}(k)$ and $(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))$ are orthogonal to each other:

$$\begin{aligned}\bm{\alpha}_{i}(k)^{T}(\hat{e}_{i}(k)-\bm{\alpha}_{i}(k))&=\bm{\alpha}_{i}(k)^{T}\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)\\&=\alpha_{i}(k)\bm{1}^{T}\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)\\&=\alpha_{i}(k)\bm{1}^{T}\bm{\alpha}_{i}(k)-\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)\\&=\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)-\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)\\&=0.\end{aligned}$$

Hence, it can be shown that

$$\begin{aligned}\|\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2}&=\|\hat{e}_{i}(k)\|^{2}-2\bm{\alpha}_{i}(k)^{T}\hat{e}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\hat{e}_{i}(k)\|^{2}-2\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\hat{e}_{i}(k)\|^{2}-\|\bm{\alpha}_{i}(k)\|^{2},\end{aligned}$$

and

$$\begin{aligned}\|\widehat{F}_{i}(k)\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2}&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-2\bm{\alpha}_{i}(k)^{T}\widehat{F}_{i}(k)\hat{e}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-2\alpha_{i}(k)\bm{1}^{T}\widehat{F}_{i}(k)\hat{e}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-2\alpha_{i}(k)\bm{1}^{T}\hat{e}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-2\alpha_{i}(k)\bm{1}^{T}\bm{\alpha}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-2\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)+\|\bm{\alpha}_{i}(k)\|^{2}\\&=\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-\|\bm{\alpha}_{i}(k)\|^{2}.\end{aligned}$$

If $i\in I_{s}(k)$, then $\hat{e}_{i}(k)\neq\bm{0}$ because $\hat{e}_{i}(k)$ contains both non-negative and negative values. Therefore, Eq. (7) implies that

$$\begin{aligned}\|\widehat{F}_{i}(k)\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2}&\leq c_{4}^{2}\|\hat{e}_{i}(k)-\bm{\alpha}_{i}(k)\|^{2}\\ \|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}-\|\bm{\alpha}_{i}(k)\|^{2}&\leq c_{4}^{2}\big(\|\hat{e}_{i}(k)\|^{2}-\|\bm{\alpha}_{i}(k)\|^{2}\big)\\ \|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}&\leq c_{4}^{2}\|\hat{e}_{i}(k)\|^{2}+(1-c_{4}^{2})\|\bm{\alpha}_{i}(k)\|^{2}\\ \frac{\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}}{\|\hat{e}_{i}(k)\|^{2}}&\leq c_{4}^{2}+(1-c_{4}^{2})\frac{\|\bm{\alpha}_{i}(k)\|^{2}}{\|\hat{e}_{i}(k)\|^{2}}.\end{aligned} \quad (8)$$

We now show that $\frac{\|\bm{\alpha}_{i}(k)\|^{2}}{\|\hat{e}_{i}(k)\|^{2}}<\frac{m-1}{m}$ for all $i\in I_{s}(k)$. Suppose that $\hat{e}_{i}(k)=P_{\hat{e}_{i}}(k)(\hat{e}_{i_{p}}(k),\hat{e}_{i_{n}}(k))$, where $P_{\hat{e}_{i}}(k)$ is a permutation matrix such that $\hat{e}_{i_{p}}(k)\geq 0$ and $\hat{e}_{i_{n}}(k)<0$. Let $m_{i_{p}}(k)$ and $m_{i_{n}}(k)$ denote the numbers of elements in $\hat{e}_{i_{p}}(k)$ and $\hat{e}_{i_{n}}(k)$, where $m_{i_{p}}(k)+m_{i_{n}}(k)=m_{i}(k)$. Since there are both non-negative and negative values in $\hat{e}_{i}(k)$, $m_{i_{p}}(k)\geq 1$, $m_{i_{n}}(k)\geq 1$, and $\bm{1}^{T}\hat{e}_{i}(k)<\|\hat{e}_{i}(k)\|_{1}$. Then,

$$\begin{aligned}\frac{\|\bm{\alpha}_{i}(k)\|^{2}}{\|\hat{e}_{i}(k)\|^{2}}&=\frac{\bm{\alpha}_{i}(k)^{T}\bm{\alpha}_{i}(k)}{\|\hat{e}_{i}(k)\|^{2}}=\frac{\bm{1}^{T}\bm{1}\,(\bm{1}^{T}\hat{e}_{i}(k))^{2}}{m_{i}^{2}(k)\|\hat{e}_{i}(k)\|^{2}}=\frac{(\bm{1}^{T}\hat{e}_{i}(k))^{2}}{m_{i}(k)\|\hat{e}_{i}(k)\|^{2}}\\&<\frac{(\bm{1}^{T}\hat{e}_{i_{p}}(k))^{2}+(\bm{1}^{T}\hat{e}_{i_{n}}(k))^{2}}{m_{i}(k)\|\hat{e}_{i}(k)\|^{2}}=\frac{\|\hat{e}_{i_{p}}(k)\|_{1}^{2}+\|\hat{e}_{i_{n}}(k)\|_{1}^{2}}{m_{i}(k)\|\hat{e}_{i}(k)\|^{2}}\\&\leq\frac{m_{i_{p}}(k)\|\hat{e}_{i_{p}}(k)\|^{2}+m_{i_{n}}(k)\|\hat{e}_{i_{n}}(k)\|^{2}}{m_{i}(k)\|\hat{e}_{i}(k)\|^{2}}\leq\frac{\max(m_{i_{p}}(k),m_{i_{n}}(k))\big(\|\hat{e}_{i_{p}}(k)\|^{2}+\|\hat{e}_{i_{n}}(k)\|^{2}\big)}{m_{i}(k)\|\hat{e}_{i}(k)\|^{2}}\\&=\frac{\max(m_{i_{p}}(k),m_{i_{n}}(k))}{m_{i}(k)}\leq\frac{m_{i}(k)-1}{m_{i}(k)}\leq\frac{m-1}{m}\end{aligned}$$

for all $i\in I_{s}(k)$. It then follows that Eq. (8) implies

$$\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}\leq c_{5}^{2}\|\hat{e}_{i}(k)\|^{2}\ \text{ for all }\ i\in I_{s}(k),$$

where $c_{5}^{2}<c_{4}^{2}+(1-c_{4}^{2})\frac{m-1}{m}<1$.

We now show that $\|e(k+1)\|\leq c_{6}\|e(k)\|$, where $0<c_{6}<1$. Since $e(k)\neq\bm{0}$,

$$\frac{\|e(k+1)\|^{2}}{\|e(k)\|^{2}}=\frac{\|\widehat{F}(k)\hat{e}(k)\|^{2}}{\|\hat{e}(k)\|^{2}}=\frac{\sum_{i=0}^{s(k)}\|\widehat{F}_{i}(k)\hat{e}_{i}(k)\|^{2}}{\sum_{i=0}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}.$$

Since $\sigma(\widehat{F}_{0}(k))\subseteq[-(1-c_{3_{0}}),\,1]$, $0<c_{3_{0}}\leq 1$, it follows that

$$\frac{\|e(k+1)\|^{2}}{\|e(k)\|^{2}}\leq\frac{\|\hat{e}_{0}(k)\|^{2}+c_{5}^{2}\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}{\|\hat{e}_{0}(k)\|^{2}+\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}. \quad (9)$$

We now derive an upper bound for $\|\hat{e}_{0}(k)\|^{2}$ in terms of the other terms, to bound the right-hand side of Eq. (9). Since $\bm{1}^{T}e(k)=0$ and $\hat{e}_{0}(k)<\bm{0}$, we have $\|\hat{e}_{0}(k)\|_{1}+\sum_{i=1}^{s(k)}\|\hat{e}_{i_{n}}(k)\|_{1}=\sum_{i=1}^{s(k)}\|\hat{e}_{i_{p}}(k)\|_{1}$. Note that $\|\hat{e}_{i_{n}}(k)\|_{1}>0$ for all $i\in I_{s}(k)$ due to Lemma 1. Then,

$$\|\hat{e}_{0}(k)\|_{1}<\sum_{i=1}^{s(k)}\|\hat{e}_{i_{p}}(k)\|_{1}+\sum_{i=1}^{s(k)}\|\hat{e}_{i_{n}}(k)\|_{1}=\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|_{1}. \quad (10)$$

The following inequalities can be provided using the equivalence of norms, where $m_{0}$ denotes the number of elements in $\hat{e}_{0}(k)$:

$$\|\hat{e}_{0}(k)\|^{2}\leq\|\hat{e}_{0}(k)\|_{1}^{2}\leq m_{0}\|\hat{e}_{0}(k)\|^{2},\qquad \sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}\leq\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|_{1}^{2}\leq(m-m_{0})\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}.$$

Thus, Eq. (10) implies that

$$\|\hat{e}_{0}(k)\|^{2}<(m-m_{0})\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}.$$

Inserting the strict upper bound for $\|\hat{e}_{0}(k)\|^{2}$ into Eq. (9) gives

$$\begin{aligned}\frac{\|e(k+1)\|^{2}}{\|e(k)\|^{2}}&\leq 1-\frac{(1-c_{5}^{2})\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}{\|\hat{e}_{0}(k)\|^{2}+\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}\\&<1-\frac{(1-c_{5}^{2})\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}{(m-m_{0}+1)\sum_{i=1}^{s(k)}\|\hat{e}_{i}(k)\|^{2}}\\&=\frac{m-m_{0}+c_{5}^{2}}{m-m_{0}+1}.\end{aligned}$$

Since $0\leq c_{5}^{2}<1$ and $0\leq m_{0}\leq m-2$,

$$\frac{\|e(k+1)\|}{\|e(k)\|}\leq\gamma,$$

where $\gamma<\sqrt{\max_{0\leq m_{0}\leq m-2}\frac{m-m_{0}+c_{5}^{2}}{m-m_{0}+1}}=\sqrt{\frac{m+c_{5}^{2}}{m+1}}<1$. Therefore, $\|e(k)\|$ converges exponentially to $0$. ∎
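To make the contraction concrete, the following sketch (illustrative; the path graph, weights, and initial values are assumptions, and here the weighted graph stays connected, so Condition 1b is not binding) iterates Eq. (6) on a small example and prints the per-step ratio $\|e(k+1)\|/\|e(k)\|$, which stays below $1$:

```python
# A small simulation (assumed path graph, weights, and initial values) of the
# update in Eq. (6): e(k+1) = (I - L_w(k)) e(k), written element-wise below.
import numpy as np

def consensus_step(e, A_aw):
    """One step of Eq. (6); A_aw is the weighted adjacency matrix A_aw(k)."""
    m = len(e)
    e_next = e.copy()
    for i in range(m):
        for j in range(m):
            e_next[i] += (e[j] - e[i]) * A_aw[i, j]
    return e_next

# Path graph 0-1-2-3 with constant admissible weights: with c2 = 0.5 and maximum
# degree 2, w = 0.25 meets the bound (1 - c2)/max(D[i,i], D[j,j]) of Theorem 1.
m = 4
A_a = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
A_aw = 0.25 * A_a
e = np.array([0.3, -0.1, 0.2, -0.4])           # 1^T e = 0, as in Definition 1
for k in range(5):
    e_new = consensus_step(e, A_aw)
    print(np.linalg.norm(e_new) / np.linalg.norm(e))   # contraction ratio < 1
    e = e_new
```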

III Decentralized State-Dependent Markov Chain Synthesis

Based on the consensus protocol developed in Section II, we propose the decentralized state-dependent Markov chain synthesis (DSMC) algorithm, which achieves convergence to the desired distribution at an exponential rate with minimal state transitions. Additionally, we present a shortest path algorithm that can be integrated with the DSMC algorithm, as utilized in [7, 14, 15], to further enhance the convergence rate.

III-A The DSMC Algorithm

We define a finite-state discrete-time Markov chain evolving on the vertices of the uniform graph $\mathcal{G}$ in Definition 1.

Definition 3.

(Markov chain) Let $\mathcal{V}=\{\mathrm{v}_{1},\dots,\mathrm{v}_{m}\}$, the vertex set of the graph in Definition 1, be the finite set of states of the Markov chain, and let $\mathcal{X}(k)\in\mathcal{V}$ be the random variable of the Markov chain at time $k\in\mathbb{Z}_{+}$. The connectivity of the states is determined by the adjacency matrix $A_{a}$ of the uniform graph $\mathcal{G}$. Additionally, we define:

  • (Probability distribution) Let $x(k)\in\mathbb{R}^{m}$ be the probability distribution at time $k$: $x[i](k)=\mathcal{P}(\mathcal{X}(k)=\mathrm{v}_{i})$.

  • (Markov matrix) The Markov matrix $M(k)\in\mathbb{R}^{m\times m}$ determines the state transitions of the Markov chain, where $M[i,j](k)=\mathcal{P}(\mathcal{X}(k+1)=\mathrm{v}_{i}\,|\,\mathcal{X}(k)=\mathrm{v}_{j})$ for all $i,j\in\mathcal{I}_{1}^{m}$. Hence, $x(k+1)=M(k)x(k)$ for $k\in\mathbb{Z}_{+}$.

  • (Desired steady-state distribution) There exists a prescribed desired steady-state probability distribution $v\in\mathbb{R}^{m}$ such that $\lim_{k\to\infty}x(k)=v$.

Let the error vector $e(k)$ be the difference between the probability distribution at time $k$ and the desired steady-state distribution, $e(k)=x(k)-v$. The DSMC algorithm is designed to ensure that the dynamics of the error vector are identical to the dynamics of the value vector described in Theorem 1. This design guarantees consistency between the two, ensuring the desired convergence properties and performance of the DSMC algorithm. As in the consensus protocol of Section II, transitions in the DSMC algorithm occur from states with higher error values to adjacent states with lower error values in order to equalize the error values across all states of the Markov chain. Since the sum of these error values is equal to $0$, the error values at all states will eventually become balanced at $0$, resulting in the convergence of the probability distribution to the desired distribution.
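In code, this feedback loop can be sketched as follows (an assumed setup for illustration; `synthesize_M` is a placeholder for Algorithm 1, given below):

```python
# An illustrative closed loop (assumed setup): re-synthesize M(k) from the
# current distribution x(k), then propagate x(k+1) = M(k) x(k).
import numpy as np

def propagate(x, v, synthesize_M, num_steps=50):
    """synthesize_M(x, v) -> column-stochastic M(k); returns x and the error norms."""
    errors = []
    for _ in range(num_steps):
        M = synthesize_M(x, v)
        x = M @ x                              # x(k+1) = M(k) x(k)
        errors.append(np.linalg.norm(x - v))   # ||e(k)||, expected to decay exponentially
    return x, errors
```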

Input: $x[l](k)$, $v[l]$ for all $l\in I_{ij}$; $A_{a}[\ell,\cdot]$ for all $\ell\in\{i,j\}$; $c_{2}\in(0,1)$
Output: $M[i,j](k)$

1: Compute the error values at the states associated with the set $I_{ij}$:
    $e[l](k):=x[l](k)-v[l]$ for all $l\in I_{ij}$
2: Calculate the non-negative error differences between adjacent states:
    $E[i,j](k):=\max\Big(0,\,\big(e[j](k)-e[i](k)\big)A_{a}[i,j]\Big)$
3: Determine the total number of adjacent states:
    $d[\ell]:=\sum_{l\in\mathcal{I}_{1}^{m}\setminus\ell}A_{a}[\ell,l]$ for all $\ell\in\{i,j\}$
4: Calculate the scaling factor that ensures algorithm stability:
    $s_{1}[i,j]:=\frac{1-c_{2}}{\max(d[i],d[j])}$
5: Set the scaling factor based on the probability of state $\mathrm{v}_{j}$:
    $s_{2}[j](k):=\begin{cases}\frac{x[j](k)}{\sum_{l\in\mathcal{I}_{1}^{m}}E[l,j](k)}&\text{if }\sum_{l\in\mathcal{I}_{1}^{m}}E[l,j](k)\neq 0\\1&\text{otherwise}\end{cases}$
6: Determine the scaling factor that enforces Condition 1b:
    $s_{3}[i,j](k):=\begin{cases}0&\text{if }e[\ell](k)<0\text{ for all }\ell\in\{i,j\}\text{ and }e[l](k)\geq 0\text{ for some }l\in I_{ij}\\1&\text{otherwise}\end{cases}$
7: Obtain the resulting scaling factor:
    $W[i,j](k):=\min\big(s_{1}[i,j],\,s_{2}[j](k),\,s_{3}[i,j](k)\big)$
8: Calculate the scaled non-negative error differences between adjacent states:
    $T[i,j](k):=E[i,j](k)\,W[i,j](k)$
9: Compute the transition probability from $\mathrm{v}_{j}$ to $\mathrm{v}_{i}$:
    $F[i,j](k):=\begin{cases}\frac{T[i,j](k)}{x[j](k)}&\text{if }x[j](k)\neq 0\\0&\text{otherwise}\end{cases}$
10: Synthesize the Markov matrix:
    $M[i,j](k):=\begin{cases}F[i,j](k)&\text{if }i\neq j\\1-\sum_{l\in\mathcal{I}_{1}^{m}}F[l,j](k)&\text{if }i=j\end{cases}$

Algorithm 1: Decentralized state-dependent Markov chain synthesis algorithm

We present the synthesis of the Markov matrix in Algorithm 1; its convergence proof is presented in Theorem 2. To this end, let $I_{ij}$ be the set of indices of the states adjacent to either $\mathrm{v}_{i}$ or $\mathrm{v}_{j}$, i.e., $I_{ij}=\{l:A_{a}[l,i]=1\text{ or }A_{a}[l,j]=1\}$. It should be noted that $e[\ell](k)<0$ for all $\ell\in\{i,j\}$ and $e[l](k)\geq 0$ for some $l\in I_{ij}$ if and only if $\{i,j\}\in I_{n_{p},n}(k)$, which is introduced in Definition 2. To determine the transition probability from state $\mathrm{v}_{j}$ to state $\mathrm{v}_{i}$ at time $k$, the required inputs of the algorithm are $x[l](k)$ and $v[l]$ for all $l\in I_{ij}$, the rows of the adjacency matrix corresponding to the states $\mathrm{v}_{i}$ and $\mathrm{v}_{j}$, and a hyper-parameter $c_{2}\in(0,1)$ that scales the transition probabilities. In Line 1 of Algorithm 1, the algorithm computes the error values at the states associated with the set $I_{ij}$. From the state $\mathrm{v}_{j}$, there will be a transition to an adjacent state $\mathrm{v}_{i}$ only if $e[j](k)>e[i](k)$. Also, the transition probabilities vary proportionally with the differences in error values. Therefore, the difference of the error value $e[j](k)$ from the error value $e[i](k)$ is calculated in Line 2 if $e[j](k)>e[i](k)$; otherwise it is set to $0$. As in Theorem 1, appropriately scaling the differences between error values relative to the number of adjacent states is crucial for the stability of the convergence. It is also necessary to scale these differences with respect to the probabilities of the corresponding states. For this purpose, a scaling factor ensuring the stability of the convergence is determined in Line 4, while another scaling factor determined by the probability of the state $\mathrm{v}_{j}$ is computed in Line 5. Furthermore, if $\sum_{l\in\mathcal{I}_{1}^{m}}E[l,j](k)=0$, then there is no transition from state $\mathrm{v}_{j}$ to any adjacent state $\mathrm{v}_{i}$; in this case, the value of $s_{2}[j](k)$ becomes irrelevant and is set to $1$. If $e[i](k)<0$ and $e[j](k)<0$, and either $\mathrm{v}_{i}$ or $\mathrm{v}_{j}$ has an adjacent state with a non-negative error value, then another scaling factor, $s_{3}[i,j](k)$, is set to $0$ in Line 6 to satisfy Condition 1b. The minimum of these three scaling factors is chosen in Line 7 to satisfy all conditions simultaneously. Hence, the scaled difference of the error value $e[j](k)$ from the error value $e[i](k)$ is calculated in Line 8. Note that $T[i,j](k)$ represents the amount of increase in the error value at state $\mathrm{v}_{i}$ due to the transition from state $\mathrm{v}_{j}$; equivalently, it represents the amount of decrease in the error value at state $\mathrm{v}_{j}$ due to the transition to the state $\mathrm{v}_{i}$. This amount is divided by the probability of the state $\mathrm{v}_{j}$ in Line 9 to determine the transition probability from state $\mathrm{v}_{j}$ to state $\mathrm{v}_{i}$. In Line 10, the remaining probabilities are added to the diagonal elements, finalizing the synthesis of the Markov matrix. The complexity of the DSMC algorithm is $O(m^{2})$, where $m$ is the number of states.
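For concreteness, the following is one possible NumPy rendering of Algorithm 1 (a sketch, not the authors' implementation). It computes all entries of $M(k)$ at once; note that each entry still depends only on the local quantities listed as inputs of Algorithm 1:

```python
# A sketch of Algorithm 1 (one possible implementation, not the authors' code),
# synthesizing the full Markov matrix M(k) from x(k), v, and A_a.
import numpy as np

def dsmc_markov_matrix(x, v, A_a, c2=0.5):
    m = len(x)
    e = x - v                                             # Line 1: error values
    # Line 2: E[i,j] = max(0, (e[j] - e[i]) * A_a[i,j])
    E = np.maximum(0.0, (e[None, :] - e[:, None]) * A_a)
    d = A_a.sum(axis=1)                                   # Line 3: numbers of neighbors
    s1 = (1.0 - c2) / np.maximum(d[:, None], d[None, :])  # Line 4: stability scaling
    col = E.sum(axis=0)                                   # Line 5: probability scaling
    s2 = np.ones(m)
    s2[col > 0] = x[col > 0] / col[col > 0]
    # Line 6: s3[i,j] = 0 iff {i,j} is in I_{np,n}(k) (Condition 1b); only adjacent
    # pairs matter, since E[i,j] = 0 for non-adjacent pairs anyway.
    neg = e < 0
    near_nonneg = (A_a @ (~neg).astype(float)) > 0        # adjacent to a non-negative state
    s3 = np.ones((m, m))
    for i in range(m):
        for j in range(m):
            if A_a[i, j] and neg[i] and neg[j] and (near_nonneg[i] or near_nonneg[j]):
                s3[i, j] = 0.0
    W = np.minimum(np.minimum(s1, s2[None, :]), s3)       # Line 7
    T = E * W                                             # Line 8: scaled differences
    F = np.zeros((m, m))                                  # Line 9: transition probabilities
    F[:, x > 0] = T[:, x > 0] / x[x > 0]
    M = F.copy()                                          # Line 10: diagonal completion
    np.fill_diagonal(M, 1.0 - F.sum(axis=0))
    return M
```

Consistent with the discussion in Section I-B, $E(k)$, and hence the off-diagonal entries of $M(k)$, vanish as $x(k)$ approaches $v$, so the synthesized matrix approaches the identity.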

The following proposition shows that the matrix synthesized by Algorithm 1 is a column stochastic Markov matrix.

Proposition 1.

The matrix synthesized by Algorithm 1 satisfies the constraints $M(k)\geq\bm{0}$ and $\bm{1}^{T}M(k)=\bm{1}^{T}$.

Proof.

(Proof of $M(k)\geq\bm{0}$) $W(k)\geq\bm{0}$ since $s_{1}[i,j]>0$, $s_{2}[j](k)\geq 0$, and $s_{3}[i,j](k)\geq 0$ for all $i,j\in\mathcal{I}_{1}^{m}$. Since $E(k)\geq\bm{0}$ and $W(k)\geq\bm{0}$, we have $T(k)\geq\bm{0}$. Then $x(k)\geq\bm{0}$ and $T(k)\geq\bm{0}$ imply that $M[i,j](k)=F[i,j](k)\geq 0$ for all $i,j\in\mathcal{I}_{1}^{m}$ with $i\neq j$.

If $x[j](k)=0$ or $\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)=0$, then $\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)=0$ and $M[j,j](k)=1$. If $x[j](k)\neq 0$ and $\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)\neq 0$, then

\begin{align*}
\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)&=\frac{\sum_{i\in\mathcal{I}_{1}^{m}}T[i,j](k)}{x[j](k)}\\
&=\frac{\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)\,W[i,j](k)}{x[j](k)}\\
&\leq\frac{\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)}{x[j](k)}\,s_{2}[j](k)\\
&\leq\frac{\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)}{x[j](k)}\,\frac{x[j](k)}{\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)}=1.
\end{align*}

Therefore, $\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)\leq 1$ and $M[j,j](k)\geq 0$.

(Proof of $\bm{1}^{T}M(k)=\bm{1}^{T}$) The identity $\bm{1}^{T}M(k)=\bm{1}^{T}$ requires that $\sum_{i\in\mathcal{I}_{1}^{m}}M[i,j](k)=1$ for all $j\in\mathcal{I}_{1}^{m}$. Note that $\sum_{i\in\mathcal{I}_{1}^{m}\setminus\{j\}}F[i,j](k)=\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)$ since $F[i,i](k)=0$. Then, $\sum_{i\in\mathcal{I}_{1}^{m}}M[i,j](k)=\sum_{i\in\mathcal{I}_{1}^{m}\setminus\{j\}}F[i,j](k)+1-\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)=1$. This shows that $\bm{1}^{T}M(k)=\bm{1}^{T}$. ∎
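As a quick numerical sanity check (separate from the proof), both constraints can be verified on a random instance using the sketch above; the instance parameters are arbitrary.

```python
rng = np.random.default_rng(0)
m = 12
A = (rng.random((m, m)) < 0.4).astype(float)
A = np.maximum(A, A.T)                     # symmetric transition constraints
np.fill_diagonal(A, 1.0)                   # agents may always stay put
x = rng.random(m); x /= x.sum()            # current density distribution x(k)
v = rng.random(m); v /= v.sum()            # desired distribution v
M = dsmc_markov_matrix(x, v, A, c2=0.5)
assert np.all(M >= -1e-12)                 # M(k) >= 0 (up to rounding)
assert np.allclose(M.sum(axis=0), 1.0)     # 1^T M(k) = 1^T (column stochastic)
```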

The following proposition shows that the error dynamics in Algorithm 1 are identical to the evolution of the value vector in the consensus protocol presented in Theorem 1.

Proposition 2.

The error dynamics of Algorithm 1 satisfy Eq. (6).

Proof.

In Algorithm 1, the increase in the error value at state $\mathrm{v}_{i}$ caused by state $\mathrm{v}_{j}$ is denoted by $T[i,j](k)$ in Line 8. Equivalently, $T[i,j](k)$ represents the decrease in the error value at state $\mathrm{v}_{j}$ caused by state $\mathrm{v}_{i}$. Since $\sum_{i\in\mathcal{I}_{1}^{m}}F[i,j](k)\leq 1$ in Line 9 of Algorithm 1, the amount of change determined in Line 8 is not suppressed by $x(k)$. Thus, the error value at the state $\mathrm{v}_{i}$ changes as

\begin{align}
e[i](k+1)&=e[i](k)+\sum_{j\in\mathcal{I}_{1}^{m}}T[i,j](k)-\sum_{j\in\mathcal{I}_{1}^{m}}T[j,i](k)\tag{11}\\
&=e[i](k)+\sum_{j\in\mathcal{I}_{1}^{m}}E[i,j](k)W[i,j](k)-\sum_{j\in\mathcal{I}_{1}^{m}}E[j,i](k)W[j,i](k)\nonumber\\
&=e[i](k)+\sum_{\substack{j\in\mathcal{I}_{1}^{m}\\ e[j](k)\geq e[i](k)}}\big(e[j](k)-e[i](k)\big)A_{a}[i,j]W[i,j](k)\nonumber\\
&\hphantom{=e[i](k)}-\sum_{\substack{j\in\mathcal{I}_{1}^{m}\\ e[j](k)<e[i](k)}}\big(e[i](k)-e[j](k)\big)A_{a}[j,i]W[j,i](k).\nonumber
\end{align}

Let us define $w_{i,j}(k)$ as

w_{i,j}(k)=W[i,j](k)\text{ if }e[j](k)\geq e[i](k).\tag{12}

Note that $0\leq W[i,j](k)\leq s_{1}[i,j]=(1-c_{2})/\max(d[i],d[j])$ for all $i,j\in\mathcal{I}_{1}^{m}$, where $0<c_{2}<1$, due to Lines 4 and 7 of Algorithm 1. Since $d[i]$ in Line 3 of Algorithm 1 represents the degree, analogous to $\mathcal{D}[i,i]$ in Definition 1, $w_{i,j}(k)$ satisfies the constraints given in Theorem 1. Thus, Eq. (11) implies that

\begin{align}
e[i](k+1)=e[i](k)+\sum_{j\in\mathcal{I}_{1}^{m}}\big(e[j](k)-e[i](k)\big)A_{a_{w}}[i,j](k)\tag{13}
\end{align}

for all $i\in\mathcal{I}_{1}^{m}$, where $A_{a_{w}}[i,j](k)=A_{a}[i,j]\,w_{i,j}(k)$ with $0\leq w_{i,j}(k)<(1-c_{2})/\max(\mathcal{D}[i,i],\mathcal{D}[j,j])$ and $0<c_{2}<1$. Since Eq. (13) is identical to Eq. (6), the error dynamics of Algorithm 1 are identical to the evolution of the value vector in Theorem 1. ∎
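Continuing the same numerical instance, the equivalence in Proposition 2 can be illustrated directly: propagating the error through the chain, $e(k+1)=M(k)x(k)-v$, matches the transition-amount form of Eq. (11) with $T[i,j](k)=F[i,j](k)\,x[j](k)$. This is an illustration under the assumptions of the earlier sketch, not part of the proof.

```python
e = x - v                                    # error vector e(k)
e_chain = M @ x - v                          # e(k+1) via the Markov chain
T = M * x[None, :]                           # T[i,j] = F[i,j] x[j] for i != j
np.fill_diagonal(T, 0.0)                     # diagonal carries no transitions
e_consensus = e + T.sum(axis=1) - T.sum(axis=0)   # Eq. (11)
assert np.allclose(e_chain, e_consensus)
```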

Note that Assumption 1 is already satisfied for the DSMC algorithm due to Assumption 3. We now show that Condition 1, which is crucial for the convergence of the proposed consensus protocol, is satisfied by Algorithm 1.

Proposition 3.

Condition 1 is satisfied by Algorithm 1.

Proof.

To ensure Condition 1a, it must be shown that there exists a $c_{1}$ such that $0<c_{1}\leq w_{i,j}(k)$ whenever $0\leq e[i](k)$. Then, due to Eq. (12), it is sufficient to show that there exists a $c_{1}$ such that $0<c_{1}\leq W[i,j](k)$ whenever $0\leq e[j](k)$.

(Proof of Condition 1a) Since $s_{1}[i,j]>0$ and it is not time-varying, there exists a $c_{1}^{\prime}$ such that $0<c_{1}^{\prime}\leq s_{1}[i,j]$. The inequality $0\leq e[j](k)$ implies that $s_{3}[i,j]=1$ for all $i\in\mathcal{I}_{1}^{m}$ due to Line 6 of Algorithm 1. Note that

\begin{align*}
\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)&=\sum_{i\in\mathcal{I}_{1}^{m}}\max\Big(0,\big(e[j](k)-e[i](k)\big)A_{a}[i,j]\Big)\\
&\leq\sum_{i\in\mathcal{I}_{1}^{m}}|e[j](k)-e[i](k)|\\
&\leq(m-1)|e[j](k)|+\sum_{i\in\mathcal{I}_{1}^{m}\setminus\{j\}}|e[i](k)|.
\end{align*}

Since $\|e(k)\|_{\infty}\leq 1$, $\sum_{i\in\mathcal{I}_{1}^{m}}|e[i](k)|\leq 2$, and $m\geq 2$, it follows that

\begin{align*}
\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)&\leq(m-1)|e[j](k)|+\big(2-|e[j](k)|\big)\\
&=m|e[j](k)|+2-2|e[j](k)|\\
&\leq m.
\end{align*}

It then follows from Line 5 of Algorithm 1 that $\frac{x[j](k)}{m}=\min\big(\frac{x[j](k)}{m},1\big)\leq\min\big(\frac{x[j](k)}{\sum_{i\in\mathcal{I}_{1}^{m}}E[i,j](k)},1\big)\leq s_{2}[j](k)$. Let $v_{\text{min}}=\min_{i\in\mathcal{I}_{1}^{m}}v[i]$. Then, the inequality $0\leq e[j](k)$ implies that $v_{\text{min}}\leq x[j](k)$. Thus, $v_{\text{min}}/m\leq x[j](k)/m\leq s_{2}[j](k)$. Since $v_{\text{min}}/m$ is not time-varying, there exists a $c_{1}^{\prime\prime}$ such that $0<c_{1}^{\prime\prime}\leq v_{\text{min}}/m\leq s_{2}[j](k)$. Hence, if $0\leq e[j](k)$, then there exists a $c_{1}$ such that $0<c_{1}\leq\min\big(s_{1}[i,j],s_{2}[j](k),s_{3}[i,j]\big)=W[i,j](k)$, where $c_{1}=\min(c_{1}^{\prime},c_{1}^{\prime\prime},1)$.

(Proof of Condition 1b) Condition 1b requires that $w_{i,j}=0$ if $\{i,j\}\in I_{n_{p},n}$, where $I_{n_{p},n}$ is the set defined in Definition 2. Note that $\{i,j\}\in I_{n_{p},n}$ if and only if $e[\ell]<0$ for all $\ell\in\{i,j\}$ and $e[l]\geq 0$ for some $l\in I_{ij}$. Therefore, if $\{i,j\}\in I_{n_{p},n}$, both $s_{3}[i,j](k)$ and $s_{3}[j,i](k)$ are set to 0, which implies that both $W[i,j](k)$ and $W[j,i](k)$ are 0. Hence, $w_{i,j}(k)=0$ if $\{i,j\}\in I_{n_{p},n}$. ∎

The following theorem shows that Algorithm 1 achieves the desired convergence.

Theorem 2.

The probability distribution $x(k)$ exponentially converges to the desired distribution $v$ if the corresponding Markov matrix is synthesized by Algorithm 1.

Proof.

In Proposition 2, it is proven that the dynamics of the error vector in Algorithm 1 are identical to the dynamics of the value vector in Theorem 1. Condition 1 is used in Theorem 1 to prove that the value vector exponentially converges to $\bm{0}$. In Proposition 3, it is proven that Algorithm 1 satisfies Condition 1, which means that the probability distribution $x(k)$ exponentially converges to the desired distribution $v$. ∎

III-B The Modified DSMC Algorithm

In this section, we present a shortest-path algorithm, originally proposed as a modification to the Metropolis-Hastings algorithm in [7, Section V-E] and integrated with the Markov chain synthesis methods described in [14] and [15]. This algorithm can also be integrated with the DSMC algorithm to further increase the convergence rate. Our approach begins by categorizing the states of the desired distribution.

Definition 4.

(Recurrent and transient states) The states with nonzero elements in the desired distribution $v$ are called recurrent states. The states with zero elements in the desired distribution $v$ are called transient states.

The shortest-path algorithm is used to synthesize the Markov chain for the transient states, while the Markov chain for the recurrent states is synthesized by the DSMC algorithm. Let us split the desired distribution and the Markov matrix in order to synthesize the Markov matrix for the recurrent and transient states separately. Assume that the desired distribution has $m_{r}$ recurrent states and $m_{t}=m-m_{r}$ transient states. Then the Markov matrix and the desired distribution are split as

v=(v_{t},v_{r}),\quad M(k)=\begin{bmatrix}M_{1}&\bm{0}\\ M_{2}&M_{3}(k)\end{bmatrix}_{m\times m},\tag{14}

where $v_{t}=\bm{0}$, $v_{t}\in\mathbb{R}^{m_{t}}$, $v_{r}>\bm{0}$, $v_{r}\in\mathbb{R}^{m_{r}}$, $M_{1}\in\mathbb{R}^{m_{t}\times m_{t}}$, and $M_{3}(k)\in\mathbb{R}^{m_{r}\times m_{r}}$. Since the desired distribution and the Markov matrix are partitioned by renumbering the elements, the probability distribution and the adjacency matrix are partitioned using the same renumbering as

x(k)=(x_{t}(k),x_{r}(k)),\quad A_{a}=\begin{bmatrix}A_{a_{1}}&\bm{0}\\ A_{a_{2}}&A_{a_{3}}\end{bmatrix}_{m\times m}.

Note that while the time-invariant matrices $M_{1}$ and $M_{2}$ in Eq. (14) are synthesized by the shortest-path algorithm, the time-varying Markov matrix $M_{3}(k)$ is synthesized by the DSMC algorithm. Whenever the shortest-path algorithm is combined with another Markov chain synthesis algorithm, the recurrent states must be connected among themselves, which leads to the following assumption.

Assumption 2.

The graph defined by the recurrent states is connected, meaning that there exists a path between all recurrent states or, equivalently, $(I+A_{a_{3}})^{m_{r}-1}>\bm{0}$ for $A_{a_{3}}\in\mathbb{R}^{m_{r}\times m_{r}}$ [18, Section 2.1].

Furthermore, we demonstrate that the condition $\bm{1}^{T}x_{r}(k)=1$, which is crucial for the Markov chain synthesis algorithm of the recurrent states, is satisfied for all $k\geq K$ for some finite $K\in\mathbb{Z}_{+}$ under the shortest-path algorithm.

Let us first define the following index sets to present the algorithm.

Definition 5.

(Index sets with respect to recurrent states) The index sets $I_{r}$ and $I_{t}$ contain the indices of the recurrent and transient states, respectively. The indices of the transient states are split into subsets $I_{s}$, $I_{s-1}$, and so on: the index set $I_{s}$ consists of the indices of the states that are adjacent to the states corresponding to $I_{r}$, with $I_{s}\cap I_{r}=\emptyset$; the index set $I_{s-1}$ consists of the indices of the states that are adjacent to the states corresponding to $I_{s}$, with $I_{s-1}\cap(I_{r}\cup I_{s})=\emptyset$; and so on.

The synthesis of the Markov matrix using the DSMC algorithm and the shortest-path algorithm for the recurrent and transient states of the desired distribution is given as

M[i,j]=\begin{cases}\text{as in Algorithm 1}&\text{if }i\in I_{r},\ j\in I_{r}\\ 0&\text{if }i\in I_{s-n},\ j\in I_{r}\text{ for }n\in\mathbb{Z}_{+}\\ 1\big/\textstyle\sum_{l\in I_{r}}A_{a}[j,l]&\text{if }i\in I_{r},\ j\in I_{s},\ A_{a}[j,i]=1\\ 1\big/\textstyle\sum_{l\in I_{s-n}}A_{a}[j,l]&\text{if }i\in I_{s-n},\ j\in I_{s-n-1},\ A_{a}[j,i]=1\text{ for }n\in\mathbb{Z}_{+}\end{cases}

In the proposed shortest-path algorithm, the total probability of transitioning from any state in set $I_{s-k-1}$ to the states in set $I_{s-k}$, $k\in\mathbb{Z}_{+}$, and from any state in set $I_{s}$ to the states in set $I_{r}$, is $1$. Therefore, the total probability of the transient states becomes zero in finite time. In [7], it is shown that the condition $\rho(M_{1})<1$ is satisfied using the properties of M-matrices, which are given in Theorem 2.5.3 (parts 2.5.3.2 and 2.5.3.12) of [64].
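The layered construction above amounts to a breadth-first search from the recurrent set. The following sketch, our own construction from the case-by-case definition (it assumes a symmetric adjacency matrix and relies on Assumption 3 so that every transient state is reachable), computes the layers $I_{s},I_{s-1},\dots$ and fills the transient columns of the Markov matrix:

```python
import numpy as np
from collections import deque

def shortest_path_columns(A, recurrent):
    """Fill the transient-state columns of M: each transient state sends its
    full probability, split uniformly, to the neighbors one BFS layer closer
    to the recurrent set. Recurrent columns are left to the DSMC algorithm."""
    m = A.shape[0]
    layer = np.full(m, -1)                   # BFS distance to the recurrent set
    seeds = np.flatnonzero(recurrent)
    layer[seeds] = 0
    q = deque(seeds)
    while q:
        j = q.popleft()
        for i in np.flatnonzero(A[:, j]):
            if layer[i] < 0:
                layer[i] = layer[j] + 1
                q.append(i)
    M = np.zeros((m, m))
    for j in range(m):
        if layer[j] <= 0:                    # recurrent column: use Algorithm 1
            continue
        nxt = np.flatnonzero((A[j, :] > 0) & (layer == layer[j] - 1))
        M[nxt, j] = 1.0 / nxt.size           # uniform over the next layer
    return M, layer
```

Here `layer[j] == 1` corresponds to $I_{s}$, `layer[j] == 2` to $I_{s-1}$, and so on; stacking these columns with the DSMC columns for the recurrent states yields the partitioned matrix in Eq. (14).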

IV An Application of the DSMC Algorithm to Probabilistic Swarm Guidance

In this section, we apply the DSMC algorithm to the probabilistic swarm guidance problem and provide numerical simulations showing that the convergence rate of the DSMC algorithm is considerably faster than that of the previous Markov chain synthesis algorithms in [7] and [14].

IV-A Probabilistic Swarm Guidance

Most of the underlying definitions and baseline algorithms in this section are based on [7]. We briefly review this material for completeness.

Definition 6.

(Bins) The operational region is denoted as $\mathcal{R}$. The region is assumed to be partitioned as the union of $m$ disjoint regions, referred to as bins $R_{i}$, $i=1,\dots,m$, such that $\mathcal{R}=\cup_{i=1}^{m}R_{i}$ and $R_{i}\cap R_{j}=\varnothing$ for $i\neq j$. Each bin can contain multiple agents.

Bins in the operational region, like the states of a Markov chain, are represented as vertices of a graph, and the allowed transitions between bins are specified by the adjacency matrix of the graph.

Definition 7.

(Transition constraints) An adjacency matrix $A_{a}\in\mathbb{R}^{m\times m}$ is used to restrict the allowed transitions between bins: $A_{a}[i,j]=1$ if the transition from $R_{i}$ to $R_{j}$ is allowed, and $A_{a}[i,j]=0$ otherwise. The bins $R_{i}$ and $R_{j}$ are called neighboring bins if $A_{a}[i,j]=1$.

Assumption 3.

The graph describing the bin connections is connected, that is, $(I+A_{a})^{m-1}>\bm{0}$ for $A_{a}\in\mathbb{R}^{m\times m}$ [18, Section 2.1].

Consider a swarm comprised of $N$ agents. Let $n[i](k)$ be the number of agents in bin $R_{i}$ at time $k$. Then $x(k)=n(k)/N$ represents the swarm density distribution at time $k$, where $x(k)\geq\bm{0}$ and $\bm{1}^{T}x(k)=1$. It is desired to guide the swarm density distribution to a desired steady-state distribution $v\in\mathbb{R}^{m}$, where $v\geq\bm{0}$ and $\bm{1}^{T}v=1$. The distance between the swarm density distribution and the desired distribution is monitored via the total variation between the two distributions at time $k$,

T_{x,v}(k)=\frac{1}{2}\|x(k)-v\|_{1}.
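In code, the density distribution and this metric follow directly from the bin counts; a minimal sketch (array names are ours):

```python
import numpy as np

def total_variation(n, N, v):
    """T_{x,v}(k) between the swarm density x(k) = n(k)/N and the desired v."""
    return 0.5 * np.abs(np.asarray(n) / N - v).sum()
```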
Assumption 4.

All agents can determine their current bins, and they can use this information to move between bins.

Each swarm agent changes its bin at time $k$ by using a column stochastic matrix $M(k)\in\mathbb{R}^{m\times m}$, called a Markov matrix [65]. The entries of $M(k)$ define the transition probabilities between bins, that is, for any $k\in\mathbb{Z}_{+}$ and $i,j\in\mathcal{I}_{1}^{m}$,

M[i,j](k)=\mathcal{P}\big(r(k+1)\in R_{i}\,|\,r(k)\in R_{j}\big),

i.e., an agent in $R_{j}$ moves to $R_{i}$ with probability $M[i,j](k)$ at time $k$. Algorithm 2 is an implementation of this probabilistic guidance algorithm. The first step of the algorithm is to determine the agent's current bin. In the following steps, each agent samples from a uniform distribution on $[0,1]$ and moves to another bin according to the column of the Markov matrix determined by the agent's current bin.

1:Each agent determines its current bin, e.g., $r_{l}(k)\in R_{i}$.
2:Each agent generates a random number $z$ that is uniformly distributed in $[0,1]$.
3:Each agent goes to bin $j$, i.e., $r_{l}(k+1)\in R_{j}$, if $\sum_{s=1}^{j-1}M[s,i](k)\leq z\leq\sum_{s=1}^{j}M[s,i](k)$.
Algorithm 2 Probabilistic Guidance Algorithm [7]
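A minimal sketch of Algorithm 2 for one time-step is given below: each agent draws $z$ uniformly from $[0,1]$ and selects its next bin from the cumulative sums of the column of $M(k)$ indexed by its current bin, with np.searchsorted implementing the inequality test of Step 3. The matrix `M` and density `x` are taken from the earlier sketches (any column stochastic matrix works), and the ensemble usage at the end previews the law-of-large-numbers argument below.

```python
import numpy as np

def agent_step(M, i, rng):
    """Next bin for an agent currently in bin i (Steps 2-3 of Algorithm 2)."""
    z = rng.random()                             # Step 2: z ~ Uniform[0, 1]
    cum = np.cumsum(M[:, i])                     # cumulative column sums
    j = int(np.searchsorted(cum, z))             # smallest j with z <= cum[j]
    return min(j, M.shape[0] - 1)                # guard against rounding at 1.0

rng = np.random.default_rng(1)
N = 5000
agents = rng.choice(M.shape[0], size=N, p=x)     # initial bins sampled from x(k)
agents = np.array([agent_step(M, i, rng) for i in agents])
x_emp = np.bincount(agents, minlength=M.shape[0]) / N
# As N grows, the empirical density x_emp approaches M(k) x(k).
```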

The main idea of probabilistic swarm guidance is to drive the propagation of the density distribution vector $x(k)$ instead of the individual agent positions $\{r_{l}(k)\}_{l=1}^{N}$. Swarms are viewed as a statistical ensemble of agents to facilitate the swarm guidance problem. The density distribution of the swarm $x(k)$ satisfies the properties of a probability distribution, namely $x(k)\geq\bm{0}$ and $\bm{1}^{T}x(k)=1$. Although the number of agents is finite, by the law of large numbers [66, Section V], Algorithm 2 ensures that $x(k+1)=M(k)x(k)$ is satisfied as $N\to\infty$. Furthermore, a given desired distribution $v\in\mathbb{R}^{m}$ cannot always be represented exactly by a finite number of agents. For example, if $v=[0.5,0.5]^{T}$ and $N=1$, then the best quantized representation of the distribution is either $[1,0]^{T}$ or $[0,1]^{T}$; in this case, the total variation between the density distribution of the swarm and the desired distribution cannot be less than $0.5$.

Definition 8.

(Quantization error) The quantization error is the minimum total variation value that can be achieved by the density distribution of the swarm, $x(k)=n(k)/N$, to represent a given desired distribution $v$. It is denoted by $q_{N}(v)$ and is the solution of the following problem:

q_{N}(v)=\min_{n}\bigg\{\frac{1}{2}\|n/N-v\|_{1}:n\in\mathbb{Z}_{+}^{m},\ \bm{1}^{T}n=N\bigg\}.

Let $Q_{N}$ be the maximum value of the quantization error over the worst possible desired distribution $v$ for a given number of agents $N$, such that

Q_{N}=\max_{v}\{q_{N}(v):v\in\mathbb{R}_{+}^{m},\ \bm{1}^{T}v=1\}.

The following lemma provides an upper bound on the quantization error for a given number of agents $N$.

Lemma 3.

For any given desired distribution $v\in\mathbb{R}_{+}^{m}$ with $\bm{1}^{T}v=1$, the quantization error between $x(k)=n(k)/N$ and $v$ is not greater than $m/(4N)$, where $N$ is the total number of agents.

Proof.

Let us split $Nv$ into its integer and fractional parts such that $Nv=t+f$, where $t\in\mathbb{Z}_{+}^{m}$ and $\bm{0}\leq f<\bm{1}$. Let $s=\bm{1}^{T}f$. Note that $s\in\mathbb{Z}_{+}$ and $s=N-\bm{1}^{T}t$. Let $n^{*}[i]$ be the optimal value of $n[i]$ that minimizes $|n[i]-Nv[i]|$; then $n^{*}[i]=t[i]$ or $n^{*}[i]=t[i]+1$. Without loss of generality, suppose that $f[1]\geq f[2]\geq\dots\geq f[m]$. Then $n^{*}[i]=\begin{cases}t[i]+1&\text{if }i\leq s\\ t[i]&\text{otherwise},\end{cases}$ where $\bm{1}^{T}n^{*}=\bm{1}^{T}t+s=N$. Therefore, $\|n^{*}-Nv\|_{1}=(1-f[1])+\dots+(1-f[s])+f[s+1]+\dots+f[m]$. Since $s=\bm{1}^{T}f$, $\|n^{*}-Nv\|_{1}=2(f[s+1]+\dots+f[m])$. Also, $f[s+1]\leq s/m$ since $f[i]\geq f[i+1]$ for all $i\in\mathcal{I}_{1}^{m-1}$ and $\bm{1}^{T}f=s$. Therefore, $\|n^{*}-Nv\|_{1}\leq 2(m-s)(s/m)$. The maximum value of $2(m-s)(s/m)$ over $s$ is $m/2$, attained at $s=m/2$. Hence, $\|n^{*}-Nv\|_{1}\leq m/2$ and $\frac{1}{2}\|n^{*}/N-v\|_{1}\leq m/(4N)$. ∎
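The rounding construction in this proof doubles as a procedure for computing $q_{N}(v)$: allocate $\lfloor Nv[i]\rfloor$ agents to each bin, then give one extra agent to the $s$ bins with the largest fractional parts. A sketch (function name ours):

```python
import numpy as np

def quantization_error(v, N):
    """q_N(v) via the largest-remainder rounding in the proof of Lemma 3."""
    t = np.floor(N * v).astype(int)          # integer parts t of N v
    f = N * v - t                            # fractional parts f, 0 <= f < 1
    s = N - t.sum()                          # agents left to place, s = 1^T f
    n = t.copy()
    n[np.argsort(-f)[:s]] += 1               # round up the s largest remainders
    return 0.5 * np.abs(n / N - v).sum()     # total variation

v = np.array([0.5, 0.5])
print(quantization_error(v, 1))              # 0.5, the N = 1 example above
print(quantization_error(v, 1) <= v.size / (4 * 1))   # Lemma 3 bound: True
```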

IV-B Synthesis of the Markov Matrix

The DSMC algorithm presented in Section III-A is employed for synthesizing the Markov chain, which serves as the stochastic policy for the swarm agents. It is worth noting that the bins comprising the operational region, as defined in Definition 6, determine the vertices of the uniform graph in Definition 1. Consequently, these vertices correspond to the states of the Markov chain defined in Definition 3. Similarly, the transition constraints of the swarm, defined by an adjacency matrix in Definition 7, determine the adjacency matrix of the uniform graph in Definition 1, which subsequently corresponds to the adjacency matrix of the Markov chain in Definition 3. The connectivity requirement for the vertices of the consensus protocol, as outlined in Assumption 1, is satisfied by Assumption 3. Furthermore, if the recurrent states of the desired distribution are also connected among themselves, then Assumption 2 is satisfied, allowing the utilization of the Modified DSMC algorithm presented in Section III-B for Markov chain synthesis. Since the DSMC algorithm is state-dependent, we use the density distribution of the swarm for feedback, which requires the following assumption.

Assumption 5.

All agents know the density values of their own and neighboring bins.

A complex communication architecture is not required, since communication with only the neighboring bins is sufficient for an agent to determine its transition probabilities. If agents only have access to the number of agents in their own and neighboring bins, then they also need to know the total number of agents in the swarm, which is global information, to determine their density values. Distributed consensus algorithms, which work under the assumption of a strongly connected communication network topology, can be used to estimate the total number of agents of the swarm. The consensus protocol used to estimate the density distribution of the swarm in [14, Remark 13] can also be used to estimate the total number of agents of the swarm.

IV-C Numerical Simulations

The convergence performance of the DSMC algorithm is demonstrated on the probabilistic swarm guidance application with the numerical simulations shown in Figures 2 and 5. We monitor the total variation between the density distribution of the swarm and the desired distribution. The finite number of agents causes the quantization error described in Definition 8, and Lemma 3 bounds this error by $m/(4N)$, where $m$ is the number of bins and $N$ is the number of agents.

In the first simulation, which is the same numerical example presented in [7], agents converge to the letter "E" in 250 time-steps. Then approximately 1/3 of the agents are removed, and the remaining agents converge to the desired distribution again in 500 time-steps, demonstrating the "self-repair" property of the proposed algorithm for swarm guidance. Comparisons of the total variation and the total number of transitions for the M-H, PSG-IMC, and DSMC algorithms are given in Figures 3 and 4 and Tables I and II for two different cases. In the first case, the adjacency matrix only allows transitions to the bins above, below, right, and left of a given bin, i.e., the bins that are 1 step away. In the second case, the adjacency matrix allows transitions to bins up to 10 steps away. When compared to the M-H and PSG-IMC algorithms, in terms of the total variation at the 750th time-step, the DSMC algorithm provides approximately 1.26 and 60.67 times improvement in the speed of convergence (the ratios of the total change in total variation over the given time interval) in the first case, and 1.51 and 1.25 times improvement in the second case. The PSG-IMC algorithm does not perform as well in the first case because of the issues caused by having a sparse adjacency matrix, as discussed in Section I-A. In the second case, which has a much denser adjacency matrix, the PSG-IMC algorithm provides approximately 1.21 times improvement in total variation compared to the M-H algorithm, but the DSMC algorithm still provides faster convergence than the PSG-IMC algorithm. Furthermore, the total variation value of the DSMC algorithm converges to a value below the maximum quantization error given by Lemma 3. In the DSMC algorithm, as the distribution converges, the Markov matrix approaches the identity matrix, and unnecessary transitions are avoided, as in the PSG-IMC algorithm. Since agent transitions rarely occur in the PSG-IMC algorithm in the first case due to the sparse adjacency matrix, the total number of transitions of the DSMC algorithm is approximately 49.56 times lower than that of the M-H algorithm and 4.93 times higher than that of the PSG-IMC algorithm. For the second case, the total number of transitions of the DSMC algorithm is approximately 74.96 times lower than that of the M-H algorithm and 1.17 times lower than that of the PSG-IMC algorithm.

In the second simulation, a more comprehensive comparison is provided. The swarm distribution converges to various multimodal Gaussian distributions (i.e., mixtures of multiple Gaussian distributions). The desired distribution is changed to another multimodal Gaussian distribution every 40 time-steps, except for the time-steps between 160 and 240. Up to the 160th time-step, agents converge to 4 different multimodal Gaussian distributions. To show the self-repair property of the swarm, approximately 1/3 of the agents are removed at the 161st time-step, and the remaining agents converge to the same desired distribution in 40 time-steps. Moreover, 7500 agents are uniformly added to the operational region at the 201st time-step, and all agents converge to the same desired distribution again in 40 time-steps. After the 240th time-step, agents converge to 3 new multimodal Gaussian distributions up to the 360th time-step. Comparisons of the total variation and the total number of transitions for the M-H, PSG-IMC, and DSMC algorithms are given in Figure 6 and Table III. In this case, the adjacency matrix allows transitions to bins up to 10 steps away. When compared to the M-H and PSG-IMC algorithms, the DSMC algorithm provides a significant improvement in the speed of convergence for each desired multimodal Gaussian distribution, while also causing fewer transitions than both algorithms. Similar to the first simulation, the total variation value of the DSMC algorithm reaches the maximum quantization error given in Lemma 3. Furthermore, when some agents are removed or new agents are added to the operational region, the DSMC algorithm recovers the desired distribution faster than the previous algorithms.

Figure 2: Representation of the distribution of the swarm at time-steps $t=0$, $250$, $251$, and $750$ (panels (a)-(d)). There are 400 ($20\times 20$) bins and 5000 agents in the operational region at the beginning of the simulation. The agents converge to the letter "E" in 250 time-steps, and approximately 1/3 of the agents are then removed from the operational region. The remaining agents converge to the desired distribution again in 500 time-steps.
Figure 3: Comparison of the change of the total variation and the number of transitions with time for the algorithms. In this case, the adjacency matrix only allows agents to transition to bins 1 step away.
TABLE I: Comparison of the change of the total variation and the total number of transitions with time for the algorithms. In this case, the adjacency matrix only allows agents to transition to bins 1 step away.

| Algorithm | TV (t=0) | TV (t=250) | TV (t=251) | TV (t=750) | TV-change ratio (0-250) | (250-750) | (0-750) | Transitions (t=750) | Transition ratio |
|---|---|---|---|---|---|---|---|---|---|
| M-H | 0.268 | 0.107 | 0.273 | 0.123 | 1.27 | 1.16 | 1.26 | 2282394 | 49.56 |
| PSG-IMC | 0.260 | 0.169 | 0.282 | 0.257 | 2.25 | 6.96 | 60.67 | 9348 | 1/4.93 |
| DSMC | 0.261 | 0.056 | 0.253 | 0.079 | 1.00 | 1.00 | 1.00 | 46051 | 1.00 |
Figure 4: Comparison of the change of the total variation and the number of transitions with time for the algorithms. In this case, the adjacency matrix allows agents to transition to bins up to 10 steps away.
TABLE II: Comparison of the change of the total variation and the total number of transitions with time for the algorithms. In this case, the adjacency matrix allows agents to transition to bins up to 10 steps away.

| Algorithm | TV (t=0) | TV (t=250) | TV (t=251) | TV (t=750) | TV-change ratio (0-250) | (250-750) | (0-750) | Transitions (t=750) | Transition ratio |
|---|---|---|---|---|---|---|---|---|---|
| M-H | 0.271 | 0.101 | 0.269 | 0.113 | 1.44 | 1.48 | 1.51 | 2663105 | 74.96 |
| PSG-IMC | 0.263 | 0.039 | 0.253 | 0.072 | 1.09 | 1.28 | 1.25 | 41483 | 1.17 |
| DSMC | 0.261 | 0.017 | 0.253 | 0.022 | 1.00 | 1.00 | 1.00 | 35529 | 1.00 |
Figure 5: Representation of the distribution of the swarm at time-steps $t=0$, $40$, $80$, $120$, $160$, $161$, $200$, $201$, $240$, $280$, $320$, and $360$ (panels (a)-(l)). There are 600 ($20\times 30$) bins and 20000 agents in the operational region at the beginning of the simulation. The swarm distribution converges to a different multimodal Gaussian distribution every 40 time-steps, except for the time-steps between 160 and 240. Agents converge to 4 different multimodal Gaussian distributions up to the 160th time-step. At the 161st time-step, approximately 1/3 of the agents are removed, and the remaining agents converge to the same desired distribution in 40 time-steps. Then, 7500 agents are uniformly added to the operational region at the 201st time-step, and all agents converge to the same desired distribution again in 40 time-steps. After the 240th time-step, agents converge to 3 new multimodal Gaussian distributions up to the 360th time-step.
Figure 6: Comparison of the change of the total variation and the number of transitions with time for the algorithms.
TABLE III: Comparison of the change of the total variation and the total number of transitions with time for the algorithms. Entries under each time interval are ratios of the change of the total variation.

| Algorithm | 0-40 | 40-80 | 80-120 | 120-160 | 161-200 | 201-240 | 240-280 | 280-320 | 320-360 | Transitions (t=360) | Transition ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M-H | 2.57 | 1.47 | 1.34 | 2.00 | 1.29 | 7.06 | 1.77 | 1.24 | 1.34 | 6058502 | 77.09 |
| PSG-IMC | 1.37 | 1.24 | 1.36 | 1.36 | 1.38 | 0.94 | 1.26 | 1.25 | 1.89 | 95277 | 1.21 |
| DSMC | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 78592 | 1.00 |

V Conclusion and Future Work

This paper introduces a decentralized state-dependent Markov chain synthesis (DSMC) algorithm with an application to the probabilistic swarm guidance problem. The proposed algorithm is theoretically proven to exhibit exponential convergence, and numerical experiments confirm faster convergence compared to existing homogeneous and time-inhomogeneous Markov chain synthesis algorithms, which guarantee exponential and asymptotic convergence, respectively. Furthermore, the number of state transitions is observed to remain relatively low given the fast convergence rates the algorithm provides, when compared with existing algorithms.

For the probabilistic swarm guidance application, removing the assumption that agents have access to the density values of their own and neighboring bins will be the subject of future studies. A useful extension of this research may involve imposing safety constraints on the density distribution of the swarm, such as density upper bounds or density rate bounds. Additional future work may include the swarm engagement problem, which considers matching the distribution of a non-collaborative swarm whose density distribution evolves according to a known Markov chain. In that case, a fast-converging algorithm such as DSMC would be advantageous for reacting quickly to such changing desired distributions.

VI Acknowledgement

The authors would like to thank the anonymous reviewers for their helpful and constructive comments that greatly contributed to improving the final version of the paper.

References

  • [1] D. A. Levin and Y. Peres, Markov chains and mixing times, vol. 107. American Mathematical Soc., 2017.
  • [2] G. Fishman, Monte Carlo: concepts, algorithms, and applications. Springer Science & Business Media, 2013.
  • [3] W. R. Gilks, S. Richardson, and D. Spiegelhalter, Markov chain Monte Carlo in practice. CRC press, 1995.
  • [4] W. K. Hastings, “Monte carlo sampling methods using markov chains and their applications,” Biometrika, vol. 57, pp. 97–109, 1970.
  • [5] S. Boyd, P. Diaconis, and L. Xiao, “Fastest mixing markov chain on a graph,” SIAM review, vol. 46, no. 4, pp. 667–689, 2004.
  • [6] S. Boyd, P. Diaconis, P. Parrilo, and L. Xiao, “Fastest mixing markov chain on graphs with symmetries,” SIAM Journal on Optimization, vol. 20, no. 2, pp. 792–819, 2009.
  • [7] B. Açıkmeşe and D. S. Bayard, “A markov chain approach to probabilistic swarm guidance,” in 2012 American Control Conference (ACC), pp. 6300–6307, IEEE, 2012.
  • [8] M. El Chamie and B. Açıkmeşe, “Safe metropolis–hastings algorithm and its application to swarm control,” Systems & Control Letters, vol. 111, pp. 40–48, 2018.
  • [9] B. Açıkmeşe and D. S. Bayard, “Markov chain approach to probabilistic guidance for swarms of autonomous agents,” Asian Journal of Control, vol. 17, no. 4, pp. 1105–1124, 2015.
  • [10] N. Demir, B. Açıkmeşe, and C. Pehlivantürk, “Density control for decentralized autonomous agents with conflict avoidance,” IFAC Proceedings Volumes, vol. 47, no. 3, pp. 11715–11721, 2014.
  • [11] B. Açıkmeşe, N. Demir, and M. W. Harris, “Convex necessary and sufficient conditions for density safety constraints in markov chain synthesis,” IEEE Transactions on Automatic Control, vol. 60, no. 10, pp. 2813–2818, 2015.
  • [12] N. Demir, U. Eren, and B. Açıkmeşe, “Decentralized probabilistic density control of autonomous swarms with safety constraints,” Autonomous Robots, vol. 39, no. 4, pp. 537–554, 2015.
  • [13] N. Demirer, M. El Chamie, and B. Açıkmeşe, “Safe markov chains for on/off density control with observed transitions,” IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1442–1449, 2017.
  • [14] S. Bandyopadhyay, S.-J. Chung, and F. Y. Hadaegh, “Probabilistic and distributed control of a large-scale swarm of autonomous agents,” IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1103–1123, 2017.
  • [15] I. Jang, H.-S. Shin, and A. Tsourdos, “Local information-based control for probabilistic swarm distribution guidance,” Swarm Intelligence, vol. 12, no. 4, pp. 327–359, 2018.
  • [16] F. Djeumou, Z. Xu, and U. Topcu, “Probabilistic swarm guidance subject to graph temporal logic specifications.,” in Robotics: Science and Systems, 2020.
  • [17] F. Djeumou, Z. Xu, M. Cubuktepe, and U. Topcu, “Probabilistic control of heterogeneous swarms subject to graph temporal logic specifications: A decentralized and scalable approach,” IEEE Transactions on Automatic Control, 2022.
  • [18] M. Mesbahi and M. Egerstedt, Graph theoretic methods in multiagent networks. Princeton University Press, 2010.
  • [19] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in networked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, 2007.
  • [20] Y. Cao, W. Yu, W. Ren, and G. Chen, “An overview of recent progress in the study of distributed multi-agent coordination,” IEEE Transactions on Industrial informatics, vol. 9, no. 1, pp. 427–438, 2012.
  • [21] J. Qin, Q. Ma, Y. Shi, and L. Wang, “Recent advances in consensus of multi-agent systems: A brief survey,” IEEE Transactions on Industrial Electronics, vol. 64, no. 6, pp. 4972–4983, 2016.
  • [22] S. S. Kia, B. Van Scoy, J. Cortes, R. A. Freeman, K. M. Lynch, and S. Martinez, “Tutorial on dynamic average consensus: The problem, its applications, and the algorithms,” IEEE Control Systems Magazine, vol. 39, no. 3, pp. 40–72, 2019.
  • [23] L. Xiao and S. Boyd, “Fast linear iterations for distributed averaging,” Systems & Control Letters, vol. 53, no. 1, pp. 65–78, 2004.
  • [24] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, “Randomized gossip algorithms,” IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2508–2530, 2006.
  • [25] A. Tahbaz-Salehi and A. Jadbabaie, “Small world phenomenon, rapidly mixing markov chains, and average consensus algorithms,” in 2007 46th IEEE Conference on Decision and Control, pp. 276–281, IEEE, 2007.
  • [26] A. Olshevsky and J. N. Tsitsiklis, “Degree fluctuations and the convergence time of consensus algorithms,” IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2626–2631, 2013.
  • [27] A. Jadbabaie, J. Lin, and A. S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
  • [28] L. Moreau, “Stability of multiagent systems with time-dependent communication links,” IEEE Transactions on Automatic Control, vol. 50, no. 2, pp. 169–182, 2005.
  • [29] W. Ren and R. W. Beard, “Consensus seeking in multiagent systems under dynamically changing interaction topologies,” IEEE Transactions on Automatic Control, vol. 50, no. 5, pp. 655–661, 2005.
  • [30] J. A. Fax and R. M. Murray, “Information flow and cooperative control of vehicle formations,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1465–1476, 2004.
  • [31] R. Olfati-Saber and R. M. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.
  • [32] O. Slučiak and M. Rupp, “Consensus algorithms with state-dependent weights,” IEEE Transactions on Signal Processing, vol. 64, no. 8, pp. 1972–1985, 2016.
  • [33] F. Rossi, S. Bandyopadhyay, M. Wolf, and M. Pavone, “Review of multi-agent algorithms for collective behavior: a structural taxonomy,” IFAC-PapersOnLine, vol. 51, no. 12, pp. 112–117, 2018.
  • [34] S.-J. Chung, A. A. Paranjape, P. Dames, S. Shen, and V. Kumar, “A survey on aerial swarm robotics,” IEEE Transactions on Robotics, vol. 34, no. 4, pp. 837–855, 2018.
  • [35] M. Schranz, M. Umlauft, M. Sende, and W. Elmenreich, “Swarm robotic behaviors and current applications,” Frontiers in Robotics and AI, vol. 7, p. 36, 2020.
  • [36] V. J. Lumelsky and K. Harinarayan, “Decentralized motion planning for multiple mobile robots: The cocktail party model,” Autonomous Robots, vol. 4, no. 1, pp. 121–135, 1997.
  • [37] A. Richards, T. Schouwenaars, J. P. How, and E. Feron, “Spacecraft trajectory planning with avoidance constraints using mixed-integer linear programming,” Journal of Guidance, Control, and Dynamics, vol. 25, no. 4, pp. 755–764, 2002.
  • [38] M. Tillerson, G. Inalhan, and J. P. How, “Co-ordination and control of distributed spacecraft systems using convex optimization techniques,” International Journal of Robust and Nonlinear Control: IFAC-Affiliated Journal, vol. 12, no. 2-3, pp. 207–242, 2002.
  • [39] D. P. Scharf, F. Y. Hadaegh, and S. R. Ploen, “A survey of spacecraft formation flying guidance and control (part 1): guidance,” in Proceedings of the 2003 American Control Conference, 2003., vol. 2, pp. 1733–1739, 2003.
  • [40] Y. Kim, M. Mesbahi, and F. Y. Hadaegh, “Multiple-spacecraft reconfiguration through collision avoidance, bouncing, and stalemate,” Journal of Optimization Theory and Applications, vol. 122, no. 2, pp. 323–343, 2004.
  • [41] J. L. Ramirez-Riberos, M. Pavone, E. Frazzoli, and D. W. Miller, “Distributed control of spacecraft formations via cyclic pursuit: Theory and experiments,” Journal of Guidance, Control, and Dynamics, vol. 33, no. 5, pp. 1655–1669, 2010.
  • [42] S. Zhao, S. Ramakrishnan, and M. Kumar, “Density-based control of multiple robots,” in Proceedings of the 2011 American control conference, pp. 481–486, IEEE, 2011.
  • [43] U. Eren and B. Açıkmeşe, “Velocity field generation for density control of swarms using heat equation and smoothing kernels,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 9405–9411, 2017.
  • [44] K. Elamvazhuthi, Z. Kakish, A. Shirsat, and S. Berman, “Controllability and stabilization for herding a robotic swarm using a leader: A mean-field approach,” IEEE Transactions on Robotics, vol. 37, no. 2, pp. 418–432, 2020.
  • [45] K. Elamvazhuthi and S. Berman, “Mean-field models in swarm robotics: A survey,” Bioinspiration & Biomimetics, vol. 15, no. 1, p. 015001, 2019.
  • [46] S. Biswal, K. Elamvazhuthi, and S. Berman, “Decentralized control of multiagent systems using local density feedback,” IEEE Transactions on Automatic Control, vol. 67, no. 8, pp. 3920–3932, 2021.
  • [47] V. Krishnan and S. Martínez, “Distributed optimal transport for the deployment of swarms,” in 2018 IEEE Conference on Decision and Control (CDC), pp. 4583–4588, IEEE, 2018.
  • [48] V. Krishnan and S. Martinez, “Distributed control for spatial self-organization of multi-agent swarms,” SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3642–3667, 2018.
  • [49] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear matrix inequalities in system and control theory, vol. 15. Siam, 1994.
  • [50] S. Bandyopadhyay, S.-J. Chung, and F. Y. Hadaegh, “Probabilistic swarm guidance using optimal transport,” in 2014 IEEE Conference on Control Applications (CCA), pp. 498–505, IEEE, 2014.
  • [51] K.-K. Oh, M.-C. Park, and H.-S. Ahn, “A survey of multi-agent formation control,” Automatica, vol. 53, pp. 424–440, 2015.
  • [52] A. Nedic and A. Ozdaglar, “Distributed subgradient methods for multi-agent optimization,” IEEE Transactions on Automatic Control, vol. 54, no. 1, pp. 48–61, 2009.
  • [53] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine learning, vol. 3, no. 1, pp. 1–122, 2011.
  • [54] W. Shi, Q. Ling, G. Wu, and W. Yin, “Extra: An exact first-order algorithm for decentralized consensus optimization,” SIAM Journal on Optimization, vol. 25, no. 2, pp. 944–966, 2015.
  • [55] S. S. Kia, J. Cortés, and S. Martínez, “Distributed convex optimization via continuous-time coordination algorithms with discrete-time communication,” Automatica, vol. 55, pp. 254–264, 2015.
  • [56] T. Yang, X. Yi, J. Wu, Y. Yuan, D. Wu, Z. Meng, Y. Hong, H. Wang, Z. Lin, and K. H. Johansson, “A survey of distributed optimization,” Annual Reviews in Control, vol. 47, pp. 278–305, 2019.
  • [57] R. Olfati-Saber, “Distributed kalman filtering for sensor networks,” in 2007 46th IEEE Conference on Decision and Control, pp. 5492–5498, IEEE, 2007.
  • [58] A. T. Kamal, J. A. Farrell, and A. K. Roy-Chowdhury, “Information weighted consensus filters and their application in distributed camera networks,” IEEE Transactions on Automatic Control, vol. 58, no. 12, pp. 3112–3125, 2013.
  • [59] G. Cybenko, “Dynamic load balancing for distributed memory multiprocessors,” Journal of parallel and distributed computing, vol. 7, no. 2, pp. 279–301, 1989.
  • [60] L. Xiao, S. Boyd, and S.-J. Kim, “Distributed average consensus with least-mean-square deviation,” Journal of parallel and distributed computing, vol. 67, no. 1, pp. 33–46, 2007.
  • [61] Y. Kim and M. Mesbahi, “On maximizing the second smallest eigenvalue of a state-dependent graph laplacian,” in Proceedings of the 2005, American Control Conference, 2005., pp. 99–103, IEEE, 2005.
  • [62] H. E. Bell, “Gershgorin’s theorem and the zeros of polynomials,” The American Mathematical Monthly, vol. 72, no. 3, pp. 292–295, 1965.
  • [63] B. Mohar, “Eigenvalues, diameter, and mean distance in graphs,” Graphs and combinatorics, vol. 7, no. 1, pp. 53–64, 1991.
  • [64] R. A. Horn and C. R. Johnson, Topics in matrix analysis. Cambridge university press, 1994.
  • [65] R. A. Horn and C. R. Johnson, Matrix analysis. Cambridge university press, 2012.
  • [66] D. Bertsekas and J. N. Tsitsiklis, Introduction to probability, vol. 1. Athena Scientific, 2008.
\includegraphics[width=1in,height=1.25in,clip,keepaspectratio]Figs/Bio/SU.pdf Samet Uzun (Student Member, IEEE) received the B.Sc. and M.Sc. degrees in Department of Aeronautics and Astronautics from Istanbul Technical University, Istanbul, Turkey, in 2018 and 2020, respectively. He is currently pursuing a Ph.D. degree in the Department of Aeronautics and Astronautics from the University of Washington, Seattle, WA, USA. His research interests include optimization, control, and machine learning.
\includegraphics[width=1in,height=1.25in,clip,keepaspectratio]Figs/Bio/NKU.PNG N. Kemal Üre (Member, IEEE) received the B.Sc. and M.Sc. degrees in Department of Aeronautics and Astronautics from Istanbul Technical University, Istanbul, Türkiye, in 2008 and 2010, respectively, and the Ph.D. degree in Department of Aeronautics and Astronautics from the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2015. He is currently an Associate Professor with the Department of Artificial Intelligence and Data Engineering, Istanbul Technical University. His main research interests include applications of deep learning and deep reinforcement learning for autonomous systems, large scale optimization, and the development of high-performance guidance navigation and control algorithms.
\includegraphics[width=1in,height=1.25in,clip,keepaspectratio]Figs/Bio/BA.PNG Behçet Açıkmeşe (Fellow, IEEE) received the Ph.D. degree in aerospace engineering from Purdue University, West Lafayette, IN, USA, in 2002. Previously, he was a Senior Technologist with JPL and a Faculty Member with the University of Texas at Austin, Austin, TX, USA. At JPL, he developed flyaway control algorithms that were successfully used in the landing of the Curiosity and Perseverance rovers on Mars. He is currently a Professor with the University of Washington, Seattle, WA, USA. His research interests include robust and nonlinear control, convex optimization and its applications, control theory and its aerospace applications, and Markov decision processes. Dr. Açıkmeşe was the recipient of many NASA and JPL achievement awards for his contributions to spacecraft control in planetary landing, formation flying, and asteroid and comet sample return missions, as well as the NSF CAREER Award, the IEEE CSS Award for Technical Excellence in Aerospace Control, and the IEEE CSM Outstanding Paper Award in 2023. He is a Fellow of AIAA and IEEE.