Decentralized State-Dependent Markov Chain Synthesis
with an Application to Swarm Guidance
Abstract
This paper introduces a decentralized state-dependent Markov chain synthesis (DSMC) algorithm for finite-state Markov chains. We present a state-dependent consensus protocol that achieves exponential convergence under mild technical conditions, without relying on any connectivity assumptions regarding the dynamic network topology. Utilizing the proposed consensus protocol, we develop the DSMC algorithm, which updates the Markov matrix based on the current state while satisfying the convergence conditions of the consensus protocol. This result establishes that the resulting Markov chain attains the desired steady-state distribution, ensuring exponential convergence from all initial distributions while adhering to transition constraints and minimizing state transitions. The DSMC's performance is demonstrated through a probabilistic swarm guidance example, which interprets the spatial distribution of a swarm comprising a large number of mobile agents as a probability distribution and utilizes the Markov chain to compute transition probabilities between states. Simulation results demonstrate faster convergence for the DSMC-based algorithm when compared to previous Markov chain based swarm guidance algorithms.
Index Terms:
Markov Chains, Consensus Protocol, Decentralized Control, Probabilistic Swarm Guidance.

I Introduction
Markov chain synthesis has garnered attention from various disciplines, including physics, systems theory, computer science, and numerous other fields of science and engineering. This attention is particularly notable within the context of Markov chain Monte Carlo (MCMC) algorithms [1, 2, 3]. The fundamental idea underlying MCMC algorithms is to synthesize a Markov chain that converges to a specified steady-state distribution. The predominant use of MCMC algorithms is to randomly sample a large state space while adhering to a predefined probability distribution. The current literature covers a broad spectrum of methodologies for Markov chain synthesis, incorporating both heuristic approaches and optimization-based techniques [4, 5, 6]. Each method provides specialized algorithms tailored to the synthesis of Markov chains in alignment with specific objectives or constraints. Markov chain synthesis plays a central role in probabilistic swarm guidance, which has led to the development of various algorithms incorporating additional transition and safety constraints [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. The probabilistic swarm guidance algorithm models the spatial distribution of the swarm agents as a probability distribution and employs the synthesized Markov chain to guide the spatial distribution of the swarm.
Consensus protocols form an important field of research that has a strong connection with Markov chains [18]. Consensus protocols are a set of rules used in distributed systems to achieve agreement among a group of agents on the value of a variable [19, 20, 21, 22]. Markov chains are often employed to model and analyze the dynamics and convergence properties of consensus protocols, providing insights into their behavior and performance [23, 24, 25, 26].
The current paper presents a consensus protocol specifically tailored to operate on a dynamic graph topology. We establish that the protocol provides exponential convergence guarantees under mild technical conditions. Consequently, our approach eliminates the reliance on conventional connectivity assumptions commonly found in the existing literature [27, 28, 29, 30, 31, 32]. Building on this new consensus protocol, the paper introduces a decentralized state-dependent Markov chain (DSMC) synthesis algorithm. It is demonstrated that the synthesized Markov chain, formulated using the proposed consensus algorithm, satisfies the aforementioned mild conditions. This, in turn, ensures the exponential convergence of the Markov chain to the desired steady-state distribution. Subsequently, the DSMC algorithm's performance is demonstrated on the probabilistic swarm guidance problem, with the objective of controlling the spatial distribution of swarm agents. Through simulations, it is shown that the DSMC algorithm achieves considerably faster convergence compared to other existing Markov chain synthesis algorithms.
I-A Related Works
The Metropolis-Hastings algorithm is widely recognized as one of the most prominent techniques for MCMC algorithms [4]. For the fastest mixing Markov chain synthesis, the problem is formulated as a convex optimization problem in [5], assuming that the Markov chain is symmetric. This paper also presents an extension to the method that involves synthesizing the fastest mixing reversible Markov chain with a given desired distribution. Furthermore, the number of variables in the optimization problem is reduced in [6] by exploiting the symmetries in the graph.
The probabilistic swarm guidance problem has been a crucial domain for the application and enhancement of Markov chain synthesis algorithms. A comprehensive review of the broader category of multi-agent algorithms is presented in [33], while a survey specifically focusing on aerial swarm robotics is provided in [34]. Additionally, [35] offers an overview of existing swarm robotic applications. For swarm guidance purposes, certain deterministic algorithms have been developed in [36, 37, 38, 39, 40, 41]. However, these algorithms may become computationally infeasible when dealing with swarms that comprise hundreds to thousands of agents. In the context of addressing the guidance problem for a large number of agents, considering the spatial distribution of swarm agents and directing it towards a desired steady-state distribution offers a computationally efficient approach. In this regard, both probabilistic and deterministic swarm guidance algorithms are presented in [42, 43, 44, 45, 46, 47, 48] for continuous state spaces. For discrete state spaces, a probabilistic guidance algorithm is introduced in [7]. This algorithm treats the spatial distribution of swarm agents, called the density distribution, as a probability distribution and employs the Metropolis-Hastings (M-H) algorithm to synthesize a Markov chain that guides the density distribution toward a desired state. The probabilistic guidance algorithm led to the development of numerous Markov chain synthesis algorithms involving specific objectives and constraints [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. In [8], the Metropolis-Hastings algorithm is extended to incorporate safety upper bound constraints on the probability vector. This paper includes numerical simulations that demonstrate the application of the extension in a probabilistic swarm guidance problem. In order to enhance convergence rates, [9] introduces a convex optimization-based technique for Markov chain synthesis. This technique formulates the objective function and constraints using linear matrix inequalities (LMIs) to synthesize a Markov chain capable of achieving a desired distribution while adhering to specified transition constraints. Notably, this study does not impose any assumptions on the Markov chain, rendering the problem inherently non-convex. The problem is convexified for practical purposes but the optimal convergence rate of the original non-convex problem cannot be attained. In [10], the approach presented in [9] is enhanced by incorporating state-feedback to further improve the convergence rate. These works are also extended to impose density upper bounds and density rate constraints in [11] and density flow and density diffusivity constraints in [12]. In [13], a more general approach is proposed for cases where the environment is stochastic, and the resulting Markov chain is shown to be a linear function of the stochastic environment and decision policy. These convex optimization problems can be solved using the interior-point algorithm but the computation time of the algorithm increases rapidly with increasing dimensionality of the state space [49]. Furthermore, neither the M-H algorithm nor convex optimization-based approaches attempt to minimize the number of state transitions. The swarm guidance problem is formulated as an optimal transport problem in [50] to minimize the number of agent transitions. 
However, besides similar computational complexity issues, the performance of the algorithm drops significantly if the current density distribution of the swarm cannot be estimated accurately. The time-inhomogeneous Markov chain approach to the probabilistic swarm guidance problem (PSG-IMC algorithm) is developed in [14] to minimize the number of state transitions. This algorithm is computationally efficient and yields reasonable results with low estimation errors. However, all feedback-based algorithms mentioned above require global feedback on the state of the density distribution. Communication between all agents has to be established to estimate the density distribution in the probabilistic swarm guidance problem. Generating perfect feedback of the density distribution is often impractical, leading to the routine occurrence of estimation errors. An alternative approach that requires only local information is developed in [15]. In this work, the time-inhomogeneous Markov chain approach presented in [14] is modified to work with local information, and the algorithm is used for minimizing the number of state transitions. Nevertheless, in both global and local information based time-inhomogeneous Markov chain approaches, it is observed that the convergence rate diminishes significantly when the state transition capabilities become more restricted. As can be seen in Corollary 1 in [14] or Corollary 3 in [15], the transition probability from a state to any directly connected state cannot be higher than the desired density value of the corresponding directly connected state. In situations where there are sparsely connected regions in the state space, it is common to observe a relatively low sum of desired density values among directly connected states. Consequently, there is a higher probability of remaining in the same state rather than transitioning to other states. In terms of the convergence rate, these algorithms are only effective in cases with high transition capabilities. Additionally, the performance of these algorithms is highly sensitive to hyperparameters, which require careful selection for optimum results in each experiment.
Graph temporal logic (GTL) is introduced in [16] to impose high-level task specifications as a constraint to the Markov chain synthesis. Markov chain synthesis is formulated as mixed-integer nonlinear programming (MINLP) feasibility problem and the problem is solved using a coordinate descent algorithm. In addition, an equivalence is proven between the feasibility of the MINLP and the feasibility of a mixed-integer linear program (MILP) for a particular case where the agents move along the nodes of a complete graph. While this study assumes homogeneous swarms for Markov chain synthesis subject to finite-horizon GTL formulas, an improved version of the formulation is presented in [17] to enable probabilistic control of heterogeneous swarms subject to infinite-horizon GTL formulas. Instead of solving the resulting MINLP using a coordinate descent algorithm, a sequential scheme, which is faster, more accurate, and robust to the choice of the starting point, is developed in the aforementioned paper.
Markov chains and consensus protocols share a close relationship. The rich theory of Markov chains has proven to be valuable in analyzing specific consensus protocols. Notable works such as [23, 24, 25, 26] have leveraged Markov chain theory to provide insights and analysis for consensus protocols. Consensus protocols, in contrast to Markov chains, operate without the limitations of non-negative nodes and edges or the requirement for the sum of nodes to equal one [18]. This broader scope enables consensus protocols to address a significantly wider range of problem spaces. Therefore, there is a significant interest in consensus protocols in a broad range of multi-agent networked systems research, including distributed coordination of mobile autonomous agents [27, 28, 29, 30, 31, 51], distributed optimization [52, 53, 54, 55, 56], distributed state estimation [57, 58], or dynamic load-balancing for parallel processors [59, 60]. There are comprehensive survey papers that review the research on consensus protocols [19, 20, 21, 22]. In many scenarios, the network topology of the consensus protocol is a switching topology due to failures, formation reconfiguration, or state-dependence. There is a large number of papers that propose consensus protocols with switching network topologies and convergence proofs of these algorithms are provided under various assumptions [27, 28, 29, 30, 31, 32]. In [27], a consensus protocol is proposed to solve the alignment problem of mobile agents, where the switching topology is assumed to be periodically connected. This assumption means that the union of networks over a finite time interval is strongly connected. Another algorithm is proposed in [28] that assumes the underlying switching network topology is ultimately connected. This assumption means that the union of graphs over an infinite interval is strongly connected. In [29], previous works are extended to solve the consensus problem on networks under limited and unreliable information exchange with dynamically changing interaction topologies. The convergence of the algorithm is provided under the ultimately connected assumption. Another consensus protocol is introduced in [30] for the cooperation of vehicles performing a shared task using inter-vehicle communication. Based on this work, a theoretical framework is presented in [31] to solve consensus problems under a variety of assumptions on the network topology such as strongly connected switching topology. In [32], a consensus protocol with state-dependent weights is proposed and it is assumed that corresponding graphs are weakly connected, which means that the network is assumed to be connected over the iterations. Additionally, optimization-based algorithms are proposed to obtain a high convergence rate for the consensus protocol in [23] and [61].
I-B Main Contributions
In this paper, we first propose a consensus protocol with state-dependent weights. The proposed consensus protocol does not require any connectivity assumption on the dynamic network topology, unlike the existing methods in the literature [27, 28, 29, 30, 31, 32]. We provide an exponential convergence guarantee for the consensus protocol under some mild technical conditions, which can be verified straightforwardly. We then present a decentralized Markov-chain synthesis (DSMC) algorithm based on the proposed consensus protocol and we prove that the resulting DSMC algorithm satisfies these mild conditions. This result is employed to prove that the resulting Markov chain has a desired steady-state distribution and that all initial distributions exponentially converge to this steady-state. Unlike the homogeneous Markov chain synthesis algorithms in [4, 7, 5, 6, 8, 9], the Markov matrix synthesized by our algorithm approaches the identity matrix as the probability distribution converges to the desired steady-state distribution. Hence, the proposed algorithm attempts to minimize the number of state transitions, which eventually converges to zero as the probability distribution converges to the desired steady-state distribution. Whereas previous time-inhomogeneous Markov chain synthesis algorithms in [14, 15] only provide asymptotic convergence, the DSMC algorithm provides an exponential convergence rate guarantee. Furthermore, unlike previous algorithms in [14, 15], the convergence rate of the DSMC algorithm does not rapidly decrease in scenarios where the state space contains sparsely connected regions. Due to the decentralized nature of the consensus protocol, the Markov chain synthesis relies on local information, similar to the approach presented in [15], and a complex communication architecture is not required for the estimation of the distribution. By presenting numerical evidence within the context of the probabilistic swarm guidance problem, we demonstrate that the convergence rate of the swarm distribution to the desired steady-state distribution is substantially faster when compared to previous methodologies. In summary:
• We propose a consensus protocol with state-dependent weights and prove its exponential convergence under mild technical conditions, without relying on the typical connectivity assumptions associated with the underlying graph topology.
• Based on the proposed consensus protocol, we introduce the DSMC algorithm, which is shown to meet the convergence conditions outlined by the consensus protocol.
• We demonstrate, on the probabilistic swarm guidance problem, that the DSMC algorithm converges to the desired distribution substantially faster than previous Markov chain synthesis algorithms while relying only on local information and requiring a low number of state transitions.
I-C Organization and Notation
The paper is organized as follows. Section II presents the consensus protocol with state-dependent weights. The decentralized state-dependent Markov matrix synthesis (DSMC) algorithm is introduced in Section III. Section IV introduces the probabilistic swarm guidance problem formulation, and presents numerical simulations of swarms converging to desired distributions. The paper is concluded in Section V.
Notation: $\mathbb{R}$ and $\mathbb{C}$ represent the set of real numbers and complex numbers, respectively. $\mathbb{R}_{\geq 0}$ and $\mathbb{Z}_{\geq 0}$ represent the set of non-negative real numbers and non-negative integers, respectively. $\mathbb{R}^{n}$ is the $n$-dimensional real vector space. $\mathbf{0}$ is a zero matrix and $\mathbf{1}$ is a matrix of ones in appropriate dimensions. $x_{i}(t)$ represents the $i$-th element of the vector $x$ at time $t$. $\|\cdot\|$ denotes the vector norm. $[v_{1};\ldots;v_{m}]$ represents the vector obtained by stacking the vectors $v_{1},\ldots,v_{m}$, which may have arbitrary dimensions. $A>0$ ($A\geq 0$) implies that $A$ is a positive (non-negative) matrix. $[n]$ denotes the set of integers $\{1,\ldots,n\}$. $\lambda_{i}(A)$ is the $i$-th eigenvalue of a matrix $A$, and $\lambda(A)$ is the set of eigenvalues of $A$. $\emptyset$ denotes the empty set. $\mathcal{A}\setminus\mathcal{B}$ represents the elements in set $\mathcal{A}$ that are not in set $\mathcal{B}$. $\mathbb{P}(\cdot)$ denotes the probability of a random variable.
II A Consensus Protocol with State-Dependent Weights
The consensus protocol is a process used for achieving an agreement among distributed agents. In this section, we introduce a consensus protocol with state-dependent weights to reach a consensus on time-varying weighted graphs. Unlike other proposed consensus protocols in the literature, the consensus protocol we introduce does not require any connectivity assumption on the dynamic network topology. We provide a theoretical analysis proving exponential convergence under some mild technical conditions. We then use the proposed consensus protocol for the synthesis of the column stochastic Markov matrix in Section III. It is proven that these mild assumptions required by the proposed consensus protocol are satisfied by the Markov matrix synthesis algorithm. We first define the graph for our consensus approach.
Definition 1.
(Time-varying weighted graph) A time-varying weighted graph is a tuple, , where is the set of vertices, is the set of undirected edges, and is a time-varying function such that , where represents the value of weight for edge at time .
• (Values of vertices) A time-varying vector such that for can define a vector of values of vertices, i.e., represents the value of the vertex at time .
• (Uniform graph induced by time-varying weighted graph) is the uniform graph induced by and it is obtained by setting , where for all .
• (Adjacency matrices) Two vertices and are called adjacent if . The adjacency matrix of the uniform graph is represented by , where , if and are adjacent, and is zero otherwise. The adjacency matrix of the time-varying weighted graph is represented by , where .
• (Degree matrices) and are the diagonal degree matrices of the graphs and such that and .
• (Laplacian matrices) and are the Laplacian matrices of the graphs and such that and .
• (Activated and Deactivated Edges) If , where , the edge is deactivated. and represent the activated and deactivated edges for time such that .
For convenience, we use , , , , and instead of using , , , , and , respectively.
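To make the objects in Definition 1 concrete, the following sketch (in Python, with a hypothetical edge set and hypothetical weight values) constructs the adjacency, degree, and Laplacian matrices of a small uniform graph and of a time-varying weighted graph induced on it.

```python
import numpy as np

# Hypothetical uniform graph on 4 vertices: a path 0-1-2-3 plus the edge 1-3.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
n = 4

# Adjacency matrix of the uniform graph (all edge weights set to 1).
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Hypothetical state-dependent weights for the same edge set at some time t.
w = {(0, 1): 0.3, (1, 2): 0.2, (2, 3): 0.0, (1, 3): 0.4}  # zero weight deactivates edge (2, 3)

# Adjacency matrix of the time-varying weighted graph.
Aw = np.zeros((n, n))
for (i, j), wij in w.items():
    Aw[i, j] = Aw[j, i] = wij

# Diagonal degree matrices and Laplacians: D = diag(A @ 1), L = D - A.
D, Dw = np.diag(A.sum(axis=1)), np.diag(Aw.sum(axis=1))
L, Lw = D - A, Dw - Aw

print(Lw)              # weighted Laplacian; each row sums to zero
print(L @ np.ones(n))  # Laplacian times the all-ones vector is the zero vector
```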
The following assumption on the topology of the uniform graph is needed for convergence analysis.
Assumption 1.
The uniform graph is a connected graph, which means there exists a path between all vertices, or equivalently, for [18, section 2.1].
We will consider time-varying weighted graphs where the weights of the edges depend on the values of vertices. The following definition is needed to present the relation between the weights of the edges and the values of vertices.
Definition 2.
(Index sets with respect to values of vertices) The index sets and contain the indices of the non-negative and negative valued vertices for time , i.e.,
The index set consists of the indices of the negative valued vertices that are adjacent to the non-negative valued vertices, i.e.,
The set that contains the edges between , and , is defined as
We provide a condition that represents a relation between the values of the vertices and the weights of the edges of the graph . This condition is the key to providing the convergence proof of the consensus protocol with state-dependent weights.
Condition 1.
(a) If , then there exists a such that for all .
(b) If , then .
In other words, Condition 1a implies that the weight of an edge that connects a non-negative valued vertex to any other vertex cannot be arbitrarily close to zero. Therefore, if where , then . Condition 1b implies that no transition is allowed between two negative valued vertices if at least one of them is adjacent to a non-negative valued vertex. Edges that have zero weights are called deactivated edges. Note that although is assumed to be a connected graph in Assumption 1, may not be a connected graph for some , due to these deactivated edges.
The following example is provided to help the reader interpret the implication of Condition 1.
Example 1
The values of vertices and weights of the edges are represented on a graph for a time in Figure 1. According to Definition 2, , , and . Condition 1a implies that and since . Condition 1b implies that weights of all other edges of the nodes and are zero, which means that .
[Figure 1: A graph illustrating the vertex values and edge weights used in Example 1.]
The following lemma shows that each non-negative valued vertex is connected to a negative valued vertex in the graph under Condition 1.
Lemma 1.
Assume that Condition 1 holds. If , then for each such that , such that , where for some exponent .
Proof.
Since by Definition 1 and , such that and .
The lemma is proven by contradiction. Suppose that there exists one non-negative valued vertex such that and for all such that , , . In other words, there exists one non-negative valued vertex that is not connected to any negative valued vertices. Denote the set of such non-negative valued vertices as . The set of remaining vertices is denoted as such that . Note that all negative valued vertices and all non-negative valued vertices that are connected to negative valued vertices are in . According to the assumption provided at the beginning of the proof, vertices in cannot be connected to the vertices in , then adjacency matrix of the time-varying weighted graph can be permuted as
where is a permutation matrix, and are the adjacency matrices of sub-graphs that consist of the vertices in and , respectively. Note that due to Condition 1a, if , where , then . Therefore, can also be permuted as
(1)
Due to Assumption 1, is a connected graph, the matrix is an irreducible matrix, and then matrix is also an irreducible matrix. However, the matrix in Eq. (1) is a reducible matrix, which means it contradicts the assumption provided at the beginning of the proof. Therefore, all non-negative valued vertices are connected to some negative valued vertices. ∎
Therefore, even if may not be a connected graph for some , there always exists at least one connected sub-graph that consists of both non-negative and negative valued vertices. It is important to note that in these connected sub-graphs, each edge is between a non-negative valued vertex and any other vertex and there is no edge between two negative valued vertices due to Condition 1b. Therefore, weights of all edges are at least due to Condition 1a.
The following lemma provides a bound for the eigenvalues of the Laplacian matrix of a connected time-varying weighted graph when the weights are within a certain range.
Lemma 2.
Let be a connected time-varying weighted graph with vertices. Assume that the following inequality is satisfied for the weights of the edges of the graph ,
(2)
where and . Then, the following inequality holds for the eigenvalues of the Laplacian matrix ,
where .
Proof.
According to Gershgorin’s Circle Theorem [62], eigenvalues of a matrix satisfy,
(3)
Since for weighted Laplacian matrix, where , Eq. (3) implies that,
According to Eq. (2), and . Then, eigenvalues of the Laplacian matrix satisfy that,
(4)
We now show that the second smallest eigenvalue of the Laplacian matrix is also lower bounded. In [63, Theorem 4.2], the following lower bound is provided for the second smallest eigenvalue of the Laplacian matrix of a connected unweighted graph ,
where is the diameter of the graph, which is the length of the longest shortest path between any two vertices, and is the number of vertices of the graph. Let us define the Laplacian matrix for the case that weights of all edges of the graph are instead of . Suppose that, and are the eigenvalues of and matrices such that and for . Then, for . Since for the graph , is also a positive semi-definite Laplacian matrix, which means for all . Let us denote unit eigenvector such that and unit eigenvector such that for all . Note that is the eigenvector of these Laplacian matrices with the associated eigenvalue of . Since these Laplacian matrices are symmetric, all other eigenvectors of them are orthogonal to . Then, and , which means , where . Thus,
Hence, Eq. (5) can be provided for the second smallest eigenvalue of ,
(5)
Since for a connected graph and due to Eq. (4), the following equation represents the eigenvalues of the matrix ,
where . ∎
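The statement of Lemma 2 can be checked numerically on a small example. The sketch below, with a hypothetical connected weighted graph, verifies the generic facts used in the proof: the weighted Laplacian is positive semi-definite, its second smallest (Fiedler) eigenvalue is strictly positive for a connected graph, and Gershgorin's circle theorem bounds every eigenvalue by twice the maximum weighted degree.

```python
import numpy as np

# Hypothetical connected weighted graph: a 5-vertex cycle with weights in [0.1, 0.5].
n = 5
rng = np.random.default_rng(0)
Aw = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    Aw[i, j] = Aw[j, i] = rng.uniform(0.1, 0.5)

Lw = np.diag(Aw.sum(axis=1)) - Aw          # weighted Laplacian
eig = np.sort(np.linalg.eigvalsh(Lw))      # symmetric matrix, so real eigenvalues

# Gershgorin: every eigenvalue lies within (max weighted degree) of a diagonal entry,
# hence below twice the maximum weighted degree.
upper = 2 * Aw.sum(axis=1).max()

assert eig[0] > -1e-12                     # smallest eigenvalue is 0
assert eig[1] > 0                          # Fiedler eigenvalue strictly positive (connected graph)
assert eig[-1] <= upper + 1e-12            # Gershgorin upper bound
print(eig, upper)
```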
The following theorem shows that values of vertices of converge to under specific value transition dynamics.
Theorem 1.
Proof.
Eq. (6) is equivalent to the following equation,
Note that is a doubly stochastic symmetric matrix, where , , and the non-zero, non-diagonal elements of represent the weights of activated edges.
Condition 1a forces the weights of the edges connecting non-negative valued vertices to other vertices to be at least . However, the weights of some other edges may be zero. Thus, the weighted graph may not be connected, which means the and matrices may be reducible. Hence, may have repeated eigenvalues at , and may have repeated eigenvalues at .
Even if may not be connected, due to Lemma 1, all non-negative valued vertices belong to a connected sub-graph in , which consists of both non-negative and negative valued vertices. Note that in these connected sub-graphs, there is no edge between two negative valued vertices due to Condition 1b. Therefore, weights of all edges are at least due to Condition 1a, which enables us to apply Lemma 2 in later parts. Let us denote the number of these connected sub-graphs as and denote the reordered matrix and vector as
where is a permutation matrix, , are irreducible matrices that correspond to connected sub-graphs, and is a matrix that corresponds to the remaining sub-graph. Note that there are both non-negative and negative elements in , and all elements of are negative.
Recall that, in Example 1, , values of , and vertices belong to the vector and values of , and vertices belong to the vector.
Suppose that, is the average of elements in such that , where is the number of elements in and , . Lemma 2 implies that only one eigenvalue of , is with associated eigenvector , and other eigenvalues are in , since the matrices correspond to connected sub-graphs in which weights of all edges are at least . Since the values of all vertices are negative in the remaining sub-graph that corresponds to the matrix, there is no lower-bound for the weights of the edges, while there is an upper bound provided in Eq. (6). Hence, , . Therefore the following equation holds,
(7)
where for all .
Here, we will show that is also smaller than or equal to for all , where . Note that if , then and are orthogonal to each other,
Hence, it can be shown that,
and
If , then because there are both non-negative and negative values in . Therefore, Eq. (7) implies that
(8)
Here, we will show that for all . Suppose that, , where is a permutation matrix such that and . Let and represent the number of elements in and , where . Since there are both non-negative and negative values in , , and , then
for all . It then follows that Eq. (8) implies
where .
Here, we will show that , where . Since ,
Since , , the equation is derived as
(9)
Here, we derive an upper bound for with respect to other terms to show that there is an upper bound for the right side of Eq. (9). Since and , = . Note that for any due to Lemma 1. Then,
(10)
The following inequalities follow from the equivalence of norms,
Thus, Eq. (10) implies that
The strict upper-bound for can be inserted into Eq. (9) as
Since and ,
where . Therefore, exponentially converges to . ∎
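The exact update equation (Eq. (6)) did not survive in the text above, but, consistent with the proof, the protocol can be read as an update of the form x(t+1) = (I - L_w(t)) x(t), where L_w(t) is the weighted Laplacian built from state-dependent weights satisfying Condition 1. The following is a minimal sketch under that assumption; the particular weight rule (a constant weight on every edge touching a non-negative valued vertex, and zero weight between negative valued vertices) is only one hypothetical choice that respects Condition 1.

```python
import numpy as np

def consensus_step(x, A):
    """One update of the form x(t+1) = (I - L_w(t)) x(t), with a hypothetical
    state-dependent weight rule chosen to respect Condition 1 on a fixed
    connected graph with adjacency matrix A."""
    n = len(x)
    w_val = 1.0 / (2.0 * A.sum(axis=1).max())   # constant weight; plays the role of the lower
                                                # bound in Condition 1a and keeps I - L_w
                                                # doubly stochastic and non-negative
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] == 0:
                continue
            if x[i] >= 0 or x[j] >= 0:
                W[i, j] = W[j, i] = w_val       # edge touches a non-negative valued vertex
            # else: edge between two negative valued vertices stays deactivated,
            # a conservative choice that satisfies Condition 1b
    Lw = np.diag(W.sum(axis=1)) - W             # weighted Laplacian of the activated edges
    return x - Lw @ x                           # equals (I - L_w) x; the sum of x is preserved

# Hypothetical example: path graph on 4 vertices, vertex values summing to zero.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
x = np.array([0.6, 0.1, -0.3, -0.4])
for _ in range(50):
    x = consensus_step(x, A)
print(x)   # all entries approach 0, the average of the initial values
```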
III Decentralized State-Dependent Markov Chain Synthesis
Based on the consensus protocol developed in Section II, we propose the decentralized state-dependent Markov chain synthesis (DSMC) algorithm that achieves convergence to the desired distribution with an exponential rate and minimal state transitions. Additionally, we present a shortest path algorithm that can be integrated with the DSMC algorithm, as utilized in [7, 14, 15], to further enhance the convergence rate.
III-A The DSMC Algorithm
We define a finite-state discrete-time Markov chain evolving on the vertices of the uniform graph in Definition 1.
Definition 3.
(Markov chain) Let , which is the set of vertices of the graph in Definition 1, be the finite set of states of the Markov chain, and let be the random variable of the Markov chain for time . The connectivity of the states is determined by the adjacency matrix of the uniform graph . Additionally, we define:
• (Probability distribution) Let be the probability distribution at time .
• (Markov matrix) Markov matrix determines the state transitions of the Markov chain , where for all . Hence for .
• (Desired steady-state distribution) There exists a prescribed desired steady-state probability distribution such that
Let the error vector be the difference between the probability distribution at time and the desired steady-state distribution . The DSMC algorithm is designed to ensure that the dynamics of the error vector are identical to the dynamics of the value vector described in Theorem 1. This design guarantees consistency between the two, ensuring desirable convergence properties and performance in the DSMC algorithm. Similar to the consensus protocol in Section II, transitions in the DSMC algorithm occur from the states with higher error values to their adjacent states with lower error values to equalize the error values across all states of the Markov chain. Since the sum of these error values is equal to zero, the error values at all states will eventually become balanced at zero, resulting in the convergence of the probability distribution to the desired distribution.
[Algorithm 1: DSMC synthesis of the Markov matrix. Inputs: the current and desired probability values of the adjacent states, the corresponding rows of the adjacency matrix, and a scaling hyper-parameter. Output: the transition probabilities of the Markov matrix.]
We present the synthesis of the Markov matrix in Algorithm 1, and its convergence proof is presented in Theorem 2. To this end, let be the set of indices of states adjacent to either or , such that . It should be noted that for all and for some if and only if , which is introduced in Definition 2. To determine the transition probability from state to state for time , the required inputs of the algorithm are and for all , the corresponding rows of the adjacency matrix for states and , and a hyper-parameter that scales the transition probabilities. In Line 1 of Algorithm 1, the algorithm computes the error values at the states associated with the set . From the state , there will be a transition to any adjacent state only if . Also, the transition probabilities vary proportionally with the differences in error values. Therefore, the difference of the error value from the error value is calculated in Line 2 if , otherwise it is set to . As in Theorem 1, appropriately scaling the differences between error values relative to the number of adjacent states is crucial for the stability of the convergence. Also, it is required to scale these differences with respect to the probabilities of the corresponding states. For this purpose, a scaling factor is determined in Line 4 for the stability of the convergence, while another scaling factor is determined by the probability of the state in Line 5. Furthermore, if , then there will be no transition from state to any adjacent state . In this case, the value of becomes irrelevant and is set to . If and , and either or has an adjacent state with a positive error value, then another scaling factor is set to in Line 6 to satisfy Condition 1b. The minimum of these three scaling factors is chosen in Line 7 to satisfy all conditions simultaneously. Hence, the scaled difference of the error value from the error value is calculated in Line 8. Note that represents the amount of increase in the error value at state due to the transition from state . Equivalently, it represents the amount of decrease of the error value at state due to the transition to the state . This amount is divided by the probability of the state in Line 9 to decide the transition probability from state to state . In Line 10, the remaining probabilities are added to the diagonal elements, finalizing the synthesis of the Markov matrix. The complexity of the DSMC algorithm is , where is the number of states.
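The individual lines of Algorithm 1 are not reproduced above, so the following Python sketch is only a reconstruction of the walkthrough: it builds one column of the column-stochastic Markov matrix from the local error values, with assumed forms for the stability and probability scaling factors and for the hyper-parameter alpha; it is not the paper's exact pseudocode.

```python
import numpy as np

def dsmc_column(i, x, v, A, alpha=1.0):
    """Sketch of one column of the column-stochastic Markov matrix for state i.
    x: current distribution, v: desired distribution, A: adjacency matrix,
    alpha: a scaling hyper-parameter. The scaling rules below are assumptions
    that mirror the walkthrough, not the paper's exact line-by-line expressions."""
    n = len(x)
    e = x - v                                       # error vector
    nbrs = np.flatnonzero(A[i])                     # states adjacent to state i
    col = np.zeros(n)
    if x[i] <= 0:                                   # no probability mass to move
        col[i] = 1.0
        return col

    # positive differences: transitions only toward adjacent states with lower error
    d = np.array([max(e[i] - e[j], 0.0) for j in nbrs])

    s_stab = alpha / (2.0 * max(len(nbrs), 1))      # assumed stability scaling (cf. Line 4)
    s_prob = x[i] / d.sum() if d.sum() > 0 else np.inf   # keeps total outflow <= x_i (cf. Line 5)

    flow = np.zeros(len(nbrs))
    for k, j in enumerate(nbrs):
        s_cond = 1.0
        if e[i] < 0 and e[j] < 0:
            # cf. Line 6 / Condition 1b: block flow between two negative-error states
            # when either of them has an adjacent state with a positive error value
            if (e[np.flatnonzero(A[i])] > 0).any() or (e[np.flatnonzero(A[j])] > 0).any():
                s_cond = 0.0
        s = min(s_stab, s_prob, s_cond)
        flow[k] = s * d[k]                          # probability mass moved from i to j (cf. Line 8)

    col[nbrs] = flow / x[i]                         # transition probabilities M[j, i] (cf. Line 9)
    col[i] = 1.0 - col[nbrs].sum()                  # remaining probability on the diagonal (cf. Line 10)
    return col

# Hypothetical example on a 3-state path graph.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x = np.array([0.6, 0.3, 0.1])
v = np.array([0.2, 0.3, 0.5])
M = np.column_stack([dsmc_column(i, x, v, A) for i in range(3)])
assert np.allclose(M.sum(axis=0), 1.0) and (M >= 0).all()   # column stochastic
print(M @ x)                                        # one step of the distribution toward v
```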
The following proposition shows that the matrix synthesized by Algorithm 1 is a column stochastic Markov matrix.
Proposition 1.
The matrix synthesized by Algorithm 1 is non-negative and each of its columns sums to one, i.e., it is a column stochastic Markov matrix.
Proof.
(Proof of ) since , and , . Since and , . and imply that , if .
If or , then and . If and , then
Therefore, and .
(Proof of ) requires that , . Note that since . Then, . This shows that . ∎
The following proposition shows that the error dynamics in Algorithm 1 are identical to the evolution of the value vector in the consensus protocol presented in Theorem 1.
Proposition 2.
The dynamics of the error vector under Algorithm 1 are identical to the dynamics of the value vector in Theorem 1.
Proof.
In Algorithm 1, the increase in the error value at state caused by state is denoted by in Line 8. Equivalently, represents the decrease in the error value at state caused by state . Since in Line 9 of Algorithm 1, the amount of change determined in Line 8 is not suppressed by . Thus, the error value at the state changes as
(11)
Let us denote as
(12) |
Note that , for all , where due to Line 4 and Line 7 of Algorithm 1. Since in Line 3 of Algorithm 1 represents the degree similar to in Definition 1, satisfies the constraints given in Theorem 1. Thus, Eq. (11) implies that
(13) |
for all , where . Since Eq. (13) is identical to Eq. (6), the error dynamics of Algorithm 1 are identical to the evolution of the value vector in Theorem 1. ∎
Note that Assumption 1 is already satisfied for the DSMC algorithm due to Assumption 3. We now show that Condition 1, which is crucial for the convergence of the proposed consensus protocol, is satisfied by Algorithm 1.
Proposition 3.
Algorithm 1 satisfies Condition 1.
Proof.
To ensure Condition 1a, it must be shown that there exists a such that if . Then, due to Eq. (12), it is sufficient to show that there exists a such that if .
(Proof of Condition 1a) Since and it is not time-varying, there exists a such that . The inequality implies that for all due to Line 6 of Algorithm 1. Note that
Since , , and , then
It then follows that Line 5 of Algorithm 1 implies . Let . Then, the inequality implies that . Thus, . Since is not time-varying, there exists a such that . Hence, if , then there exists a such that , where .
The following theorem shows that Algorithm 1 achieves the desired convergence.
Theorem 2.
The probability distribution exponentially converges to the desired distribution if the corresponding Markov matrix is synthesized by Algorithm 1.
Proof.
In Proposition 2, it is proven that the dynamics of the error vector in Algorithm 1 are identical to the dynamics of the value vector in Theorem 1. Condition 1 is used in Theorem 1 to prove that the value vector exponentially converges to . In Proposition 3, it is proven that Algorithm 1 satisfies Condition 1, which means that the probability distribution exponentially converges to the desired distribution . ∎
III-B The Modified DSMC Algorithm
In this section, we introduce a shortest-path algorithm that is proposed as a modification to the Metropolis-Hastings algorithm in [7, Section V-E] and integrated with the Markov chain synthesis methods described in [14] and [15]. This algorithm can also be integrated with the DSMC algorithm to further increase the convergence rate. Our approach begins by categorizing the states of the desired distribution.
Definition 4.
(Recurrent and transient states) The states with non-zero elements in the desired distribution are called recurrent states. The other states with zero elements in the desired distribution are called transient states.
The shortest-path algorithm is used for the synthesis of the Markov chain for the transient states while the Markov chain for the recurrent states is synthesized by the DSMC algorithm. Let us split the desired distribution and the Markov matrix to synthesize the Markov matrix for recurrent and transient states separately. Assume that there are recurrent states and transient states of the desired distribution. Then the Markov matrix and the desired distribution are split as
(14) |
where , , , , and . Since the desired distribution and the Markov matrix are partitioned by renumbering the elements, the probability distribution and the adjacency matrix are also partitioned using the same renumbering as
Note that while the time-invariant matrices and in Eq. (14) are synthesized by the shortest-path algorithm, the time-varying Markov matrix is synthesized by the DSMC algorithm. If the shortest-path algorithm is to be combined with another Markov chain synthesis algorithm, it is also necessary to assume that the recurrent states are connected among themselves, which leads to the following assumption.
Assumption 2.
The graph defined by the recurrent states is connected, which means there exists a path between all recurrent states, or equivalently, for [18, section 2.1].
Furthermore, we demonstrate that the condition , which is crucial for the Markov chain synthesis algorithm of the recurrent states, is satisfied for all , in the shortest-path algorithm.
Let us first define the following index sets to present the algorithm.
Definition 5.
(Index sets with respect to recurrent states) The index sets and contain the indices of the recurrent and transient states, respectively. Indices of transient states can be split into subsets as , , and so on. The index set consists of the indices of the states that are adjacent to the states corresponding to and , the index set consists of the indices of states that are adjacent to the states corresponding to and , and so on.
The synthesis of the Markov matrix using the DSMC algorithm and the shortest-path algorithm for the recurrent and transient states of the desired distribution is given as
In the proposed shortest-path algorithm, the transition probability from any state in set to any state in set , , and from any state in set to any state in set is . Therefore, the total probability of the transient states becomes zero in a finite time. In [7], it is shown that the condition is satisfied using the properties of M-matrices, which are shown in Theorem 2.5.3 (parts 2.5.3.2 and 2.5.3.12) of [64].
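A minimal sketch of the shortest-path treatment of the transient states, as described above: each transient state forwards its probability mass, with total probability one, to neighboring states that are one breadth-first-search layer closer to the recurrent set. The uniform tie-breaking among such neighbors is an assumption made for illustration.

```python
from collections import deque
import numpy as np

def transient_columns(A, recurrent):
    """Sketch: route all probability in transient states toward the recurrent set
    along shortest paths. A is the adjacency matrix; `recurrent` is the set of
    recurrent-state indices. Uniform tie-breaking over closer neighbors is assumed."""
    n = A.shape[0]
    dist = np.full(n, np.inf)
    for r in recurrent:
        dist[r] = 0
    queue = deque(sorted(recurrent))
    while queue:                                 # BFS layers T1, T2, ... as in Definition 5
        u = queue.popleft()
        for w in np.flatnonzero(A[u]):
            if dist[w] == np.inf:
                dist[w] = dist[u] + 1
                queue.append(w)

    M = np.zeros((n, n))
    for i in range(n):
        if i in recurrent:
            continue                             # recurrent columns are synthesized by DSMC
        closer = [j for j in np.flatnonzero(A[i]) if dist[j] == dist[i] - 1]
        for j in closer:
            M[j, i] = 1.0 / len(closer)          # leave state i with total probability 1
    return M

# Hypothetical example: path 0-1-2-3 with recurrent states {0, 1}.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
print(transient_columns(A, {0, 1}))
```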
IV An Application of the DSMC Algorithm to Probabilistic Swarm Guidance
In this section, we apply the DSMC algorithm to the probabilistic swarm guidance problem and provide numerical simulations that show that the convergence rate of the DSMC algorithm is considerably faster than the previous Markov chain synthesis algorithms in [7] and [14].
IV-A Probabilistic Swarm Guidance
Most of the underlying definitions and baseline algorithms in this section are based on [7]. We briefly review this material for completeness.
Definition 6.
(Bins) The operational region is denoted as . The region is assumed to be partitioned as the union of disjoint regions, which are referred to as bins , , such that , and for . Each bin can contain multiple agents.
Bins in the operational region, like the states of a Markov chain, are represented as vertices of a graph. The allowed transitions between bins are specified by the adjacency matrix of the graph.
Definition 7.
(Transition constraints) An adjacency matrix is used to restrict the allowed transitions between bins. if the transition from to is allowable, and is zero otherwise. The bins and are called neighboring bins if .
Assumption 3.
The graph describing the bin connections is connected, that is, for [18, section 2.1].
Consider a swarm comprising agents. Let be the number of agents in bin at time . Then represents the swarm density distribution at time where and . It is desired to guide the swarm density distribution to a desired steady-state distribution , where and . The distance between the swarm density distribution and the desired distribution is monitored via the total variation between the swarm density distribution at time and the desired distribution.
Assumption 4.
All agents can determine their current bins, and they can use this information to move between bins.
Each swarm agent changes its bin at time by using a column stochastic matrix, called a Markov matrix [65]. The entries of matrix define the transition probabilities between bins, that is, for any and ,
i.e., an agent in moves to with probability at time . Algorithm 2 is an implementation of this probabilistic guidance algorithm. The first step of the algorithm is to determine the agent’s current bin. In the following steps, each agent samples from a uniform discrete distribution and goes to another bin depending on the column of the Markov matrix, which is determined by the agent’s current bin.
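A minimal sketch of the guidance step each agent performs (in the spirit of Algorithm 2), assuming a column-stochastic Markov matrix M in which M[j, i] is the probability of moving from bin i to bin j; the matrix values and function names are illustrative.

```python
import numpy as np

def guidance_step(bins, M, rng):
    """One step of probabilistic guidance: every agent samples its next bin from the
    column of the column-stochastic Markov matrix M selected by its current bin."""
    next_bins = bins.copy()
    for a, i in enumerate(bins):                 # i: agent a's current bin
        z = rng.uniform()                        # uniform sample in [0, 1)
        next_bins[a] = np.searchsorted(np.cumsum(M[:, i]), z)   # inverse-CDF sampling
    return next_bins

# Hypothetical example: 1000 agents, 3 bins, a fixed column-stochastic matrix.
rng = np.random.default_rng(1)
M = np.array([[0.7, 0.2, 0.0],
              [0.3, 0.6, 0.1],
              [0.0, 0.2, 0.9]])
bins = rng.integers(0, 3, size=1000)
for _ in range(20):
    bins = guidance_step(bins, M, rng)
density = np.bincount(bins, minlength=3) / len(bins)
print(density)      # empirical density approaches the stationary distribution of M
```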
The main idea of the probabilistic swarm guidance is to drive the propagation of the density distribution vector , instead of individual agent positions . Swarms are viewed as a statistical ensemble of agents to facilitate the swarm guidance problem. The density distribution of the swarm satisfies the properties of a probability distribution, which are , and . However, the number of agents is finite and, by the law of large numbers [66, Section V], Algorithm 2 ensures that is satisfied as . Furthermore, any given desired distribution cannot always be accurately represented by the finite number of agents of the swarm. For example, if and , then the best-quantized representation of the distribution that can be obtained is either or . In this case, the total variation between the density distribution of the swarm and the desired distribution cannot be less than .
Definition 8.
(Quantization error) The quantization error is the minimum total variation value that can be achieved by the density distribution of the swarm to represent a given desired distribution . It is denoted as and is the solution to the following problem
Let be the maximum value of the quantization error that considers the worst possible desired distribution for a given number of agents such that
The following lemma provides the maximum quantization error value for a given number of agents .
Lemma 3.
For any given desired distribution , , the maximum quantization error value between and is not greater than , where is the total number of agents.
Proof.
Let us split the integer and fractional part of such that , where and . Let . Note that and . Let be the optimal value of that minimizes . Then, or . Without loss of generality, suppose that . Then where . Therefore . Since , . Also, since for all and . Therefore, . The maximum value of is , where . Hence, and . ∎
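The best quantized representation in Definition 8 can be computed with largest-remainder rounding, which mirrors the floor-and-fractional-part argument in the proof above: allocate the integer parts first and give the remaining agents to the bins with the largest fractional parts. A sketch with illustrative values:

```python
import numpy as np

def best_quantization(v, N):
    """Closest achievable swarm density with N agents to the desired distribution v,
    using largest-remainder rounding (a standard choice that minimizes the L1 gap)."""
    target = N * np.asarray(v, dtype=float)
    counts = np.floor(target).astype(int)
    remaining = int(N - counts.sum())           # agents still to be placed
    # give the remaining agents to the bins with the largest fractional parts
    order = np.argsort(target - counts)[::-1]
    counts[order[:remaining]] += 1
    density = counts / N
    tv = 0.5 * np.abs(density - np.asarray(v)).sum()   # total variation (quantization error)
    return counts, tv

# Illustrative values: 7 agents, desired distribution over 3 bins.
counts, tv = best_quantization([0.5, 0.3, 0.2], 7)
print(counts, tv)    # e.g. [4 2 1] with total variation 0.5*(|4/7-0.5|+|2/7-0.3|+|1/7-0.2|)
```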
IV-B Synthesis of the Markov Matrix
The DSMC algorithm presented in Section III-A is employed for synthesizing the Markov chain, which serves as the stochastic policy for the swarm agents. It is worth noting that the bins comprising the operational region, as defined in Definition 6, determine the vertices of the uniform graph in Definition 1. Consequently, these vertices correspond to the states of the Markov chain defined in Definition 3. Similarly, the transition constraints of the swarm, defined by an adjacency matrix in Definition 7, determine the adjacency matrix of the uniform graph in Definition 1, which subsequently corresponds to the adjacency matrix of the Markov chain in Definition 3. The connectivity requirement for the vertices of the consensus protocol, as outlined in Assumption 1, is satisfied by Assumption 3. Furthermore, if the recurrent states of the desired distribution are also connected among themselves, then Assumption 2 is satisfied, allowing the utilization of the Modified DSMC algorithm presented in Section III-B for Markov chain synthesis. Since the DSMC algorithm is state-dependent, we use the density distribution of the swarm for feedback, which requires the following assumption.
Assumption 5.
All agents know the density values of their own and neighboring bins.
A complex communication architecture is not required since communication only with neighboring bins is sufficient for an agent to determine its transition probabilities. If agents only have access to the number of agents in their own and neighboring bins, then they also need to know the total number of agents in the swarm, which is global information, to determine their density values. Distributed consensus algorithms, which work under the strongly-connected communication network topology assumption, can be used to estimate the total number of agents of the swarm. The consensus protocol that is used to estimate the density distribution of the swarm in [14, Remark 13] can also be used to estimate the total number of agents of the swarm.
IV-C Numerical Simulations
The convergence performance of the DSMC algorithm is demonstrated on the probabilistic swarm guidance application with numerical simulations shown in Figures 2 and 5. We monitor the total variation between the density distribution of the swarm and the desired distribution. The finite number of agents causes the quantization error outlined in Definition 8. Lemma 3 limits the quantization error to , where is the number of bins and is the number of agents.
In the first simulation, which is the same numerical example presented in [7], agents converge to the letter “E” in time-steps. Then approximately agents are removed and the remaining agents converge to the desired distribution again in time-steps, demonstrating the “self-repair” property of the proposed algorithm for swarm guidance. Comparisons of the total variation and the total number of transitions for the M-H, PSG-IMC, and the DSMC algorithms are given in Figures 3, 4, and Tables I, II for two different cases. In the first case, the adjacency matrix only allows transitions to the bins which are above, below, right, and left of a given bin, i.e., the bins that are 1-step away. In the second case, the adjacency matrix allows transitions to 10-step away bins. When compared to the M-H and PSG-IMC algorithms, in total variation at the time-step, the DSMC algorithm provides approximately and times improvement in the speed of convergence (the ratios of the total change in total variation over the given time interval) in the first case and , and times improvement in the second case. The PSG-IMC algorithm does not perform as well in the first case because of the issues caused by having a sparse adjacency matrix as discussed in Section I-A. In the second case, which has a much denser adjacency matrix, the PSG-IMC algorithm provides approximately times improvement in total variation compared to the M-H algorithm but the DSMC algorithm still provides faster convergence than the PSG-IMC algorithm. Furthermore, the total variation value of the DSMC algorithm converges to a value below the maximum value of the quantization error provided by Lemma 3. In the DSMC algorithm, as the distribution converges, the Markov matrix turns into an identity matrix, and unnecessary transitions are avoided as in the PSG-IMC algorithm. Since the transition of agents rarely occurred in the PSG-IMC algorithm in the first case due to the sparse adjacency matrix, the total number of transitions of the DSMC algorithm is approximately times less than that of the M-H algorithm, and times more than that of the PSG-IMC algorithm. For the second case, the total number of transitions of the DSMC algorithm is approximately times less than that of the M-H algorithm, and times less than that of the PSG-IMC algorithm.
In the second simulation, a more comprehensive comparison is provided. The swarm distribution converges to various multimodal Gaussian distributions (i.e., a mixture of multiple Gaussian distributions). The desired distribution is changed to another multimodal Gaussian distribution every time-steps, except for the time-steps between and . Up to the time-step, agents converge to the different multimodal Gaussian distributions. To show the self-repair property of the swarm, approximately of the agents are removed at the time-step and the remaining agents converge to the same desired distribution in time-steps. Moreover, agents are uniformly added to the operational region at time-step and all agents converge to the same desired distribution again in time-steps. After the time-step, agents converge to the new multimodal Gaussian distributions up to the time-step. Comparisons of the total variation and the total number of transitions for the M-H, PSG-IMC, and DSMC algorithms are given in Figure 6 and Table III. In this case, the adjacency matrix allows transitions to 10-step away bins. When compared to the M-H and PSG-IMC algorithms, the DSMC algorithm provides a significant improvement in the speed of convergence for each desired multimodal Gaussian distribution, while also causing fewer transitions than both algorithms. Similar to the first simulation, the total variation value of the DSMC algorithm reaches the maximum value of the quantization error given in Lemma 3. Furthermore, when some agents are removed or new agents are added to the operational region, the DSMC algorithm recovers the desired distribution faster than the previous algorithms.
[Figure 2: Snapshots of the swarm distribution converging to the letter “E” at time-steps 0, 250, 251, and 750.]
[Figure 3: Total variation and total number of transitions for the 1-step transition case.]
Table I: Total variations, ratios of change of total variation, and total number of transitions for the 1-step transition case.

| Algorithm | TV, t=0 | TV, t=250 | TV, t=251 | TV, t=750 | TV change ratio, 0-250 | TV change ratio, 250-750 | Total transitions by t=750 |
|---|---|---|---|---|---|---|---|
| M-H Algorithm | 0.268 | 0.107 | 0.273 | 0.123 | 1.27 | 1.16 | 2282394 |
| PSG-IMC Algorithm | 0.260 | 0.169 | 0.282 | 0.257 | 2.25 | 6.96 | 9348 |
| DSMC Algorithm | 0.261 | 0.056 | 0.253 | 0.079 | 1.00 | 1.00 | 46051 |
[Figure 4: Total variation and total number of transitions for the 10-step transition case.]
Table II: Total variations, ratios of change of total variation, and total number of transitions for the 10-step transition case.

| Algorithm | TV, t=0 | TV, t=250 | TV, t=251 | TV, t=750 | TV change ratio, 0-250 | TV change ratio, 250-750 | Total transitions by t=750 |
|---|---|---|---|---|---|---|---|
| M-H Algorithm | 0.271 | 0.101 | 0.269 | 0.113 | 1.44 | 1.48 | 2663105 |
| PSG-IMC Algorithm | 0.263 | 0.039 | 0.253 | 0.072 | 1.09 | 1.28 | 41483 |
| DSMC Algorithm | 0.261 | 0.017 | 0.253 | 0.022 | 1.00 | 1.00 | 35529 |
[Figure 5: Snapshots of the swarm distribution converging to a sequence of multimodal Gaussian distributions over time-steps 0 to 360.]
[Figure 6: Total variation and total number of transitions for the multimodal Gaussian simulation.]
Table III: Ratios of change of total variation over each time interval, and total number of transitions, for the multimodal Gaussian simulation.

| Algorithm | 0-40 | 40-80 | 80-120 | 120-160 | 161-200 | 201-240 | 240-280 | 280-320 | 320-360 | Total transitions by t=360 | Transitions ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M-H Algorithm | 2.57 | 1.47 | 1.34 | 2.00 | 1.29 | 7.06 | 1.77 | 1.24 | 1.34 | 6058502 | 77.09 |
| PSG-IMC Algorithm | 1.37 | 1.24 | 1.36 | 1.36 | 1.38 | 0.94 | 1.26 | 1.25 | 1.89 | 95277 | 1.21 |
| DSMC Algorithm | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 78592 | 1.00 |
V Conclusion and Future Works
This paper introduces a decentralized state-dependent Markov chain synthesis (DSMC) algorithm with an application to the probabilistic swarm guidance problem. The proposed algorithm is theoretically proven to exhibit exponential convergence, and numerical experiments confirm faster convergence compared to existing homogeneous and time-inhomogeneous Markov chain synthesis algorithms, which respectively guarantee exponential and asymptotic convergence. Furthermore, it is observed that the algorithm keeps the number of state transitions relatively low given the fast convergence rates it provides, when compared with existing algorithms.
For the probabilistic swarm guidance application, removing the assumption that agents have access to density values of their own and neighboring bins will be the subject of future studies. A useful extension of this research may involve imposing safety constraints on the density distribution of the swarm, such as density upper bounds or density rate bounds. Additional future work may include the swarm engagement problem, which considers matching the distribution of a non-collaborative swarm whose density distribution evolves with a known Markov chain. In that case, having a fast converging algorithm, such as DSMC, would possibly be advantageous to quickly react to such changing desired distributions.
VI Acknowledgement
The authors would like to thank the anonymous reviewers for their helpful and constructive comments that greatly contributed to improving the final version of the paper.
Samet Uzun (Student Member, IEEE) received the B.Sc. and M.Sc. degrees in aeronautics and astronautics from Istanbul Technical University, Istanbul, Turkey, in 2018 and 2020, respectively. He is currently pursuing the Ph.D. degree in the Department of Aeronautics and Astronautics at the University of Washington, Seattle, WA, USA. His research interests include optimization, control, and machine learning.
N. Kemal Üre (Member, IEEE) received the B.Sc. and M.Sc. degrees in aeronautics and astronautics from Istanbul Technical University, Istanbul, Türkiye, in 2008 and 2010, respectively, and the Ph.D. degree in aeronautics and astronautics from the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2015. He is currently an Associate Professor with the Department of Artificial Intelligence and Data Engineering, Istanbul Technical University. His main research interests include applications of deep learning and deep reinforcement learning to autonomous systems, large-scale optimization, and the development of high-performance guidance, navigation, and control algorithms.
Behçet Açıkmeşe (Fellow, IEEE) received the Ph.D. degree in aerospace engineering from Purdue University, West Lafayette, IN, USA, in 2002. Previously, he was a Senior Technologist with JPL and a faculty member with the University of Texas at Austin, Austin, TX, USA. At JPL, he developed flyaway control algorithms that were successfully used in the landing of the Curiosity and Perseverance rovers on Mars. He is currently a Professor with the University of Washington, Seattle, WA, USA. His research interests include robust and nonlinear control, convex optimization and its applications, control theory and its aerospace applications, and Markov decision processes. Dr. Açıkmeşe received numerous NASA and JPL achievement awards for his contributions to spacecraft control in planetary landing, formation flying, and asteroid and comet sample-return missions, as well as the NSF CAREER Award, the IEEE CSS Award for Technical Excellence in Aerospace Control, and the IEEE CSM Outstanding Paper Award in 2023. He is a Fellow of AIAA and IEEE.