
Mode Consensus Algorithms With Finite Convergence Time

Chao Huang, Hyungbo Shim, Siliang Yu, Brian D. O. Anderson

This work was supported in part by the National Natural Science Foundation of China under Grants 62373282, 62350003, and 62150026, and by the National Research Foundation of Korea grant funded by MSIT (No. RS-2022-00165417). Chao Huang and Siliang Yu are with the College of Electronic and Information Engineering, Tongji University, Shanghai 200092, China (e-mail: [email protected]). Hyungbo Shim is with ASRI, Department of Electrical and Computer Engineering, Seoul National University, Seoul, Korea (e-mail: [email protected]). Brian D. O. Anderson is with the School of Engineering, Australian National University, Acton, ACT 2601, Australia (e-mail: [email protected]). Manuscript received XX XX, 20XX; revised XX XX, 20XX.
Abstract

This paper studies the distributed mode consensus problem in a multi-agent system, in which each agent possesses a certain attribute and all agents aim to agree upon the mode (the most frequent attribute owned by the agents) via distributed computation. Three algorithms are proposed. The first directly calculates the frequency of every attribute at every agent, with protocols based on blended dynamics, and then returns the most frequent attribute as the mode. Assuming knowledge at each agent of a lower bound of the mode frequency as a priori information, the second algorithm reduces the number of frequencies to be computed at every agent when the lower bound is large. The third algorithm eliminates the need for this information by introducing an adaptive updating mechanism. All three algorithms find the mode in finite time, and estimates of convergence time are provided. The first and second algorithms enjoy the plug-and-play property with a dwell time.

Index Terms:
Consensus, Mode computing, Blended dynamics, Plug-and-play

I Introduction

Distributed mode consensus, also known as majority voting or multiple voting, allows for the identification of the most frequent choice when dealing with categorical data like movies, car brands, or political candidates. Since it is not possible to directly calculate average or median values for such inherently non-numerical data, distributed mode consensus provides a way to determine the central tendency. In the existing literature, achieving consensus on functions of interest, known as the $f$-consensus problem [9, 5], has been successful for specific types of functions, typically real-valued, such as the average, max (min), median, or $k$-smallest element. While distributed convex optimization based protocols can handle consensus on these functions directly, the mode consensus problem seems an exception. In addition, the mode function cannot be represented as a composition of the functions mentioned above, presenting a non-trivial challenge for mode consensus.

Achieving mode consensus is not an entirely new problem of course. In the literature, Ref. [1] introduces a distributed method for computing the mode. In this method, the frequency of each element is aggregated from the leaves to the root along a spanning tree, and only the root node performs the mode calculation. By incorporating hash functions, the algorithm is able to find the mode with high probability and low time complexity. The binary majority voting problem, where there are only two distinct elements in the population, is addressed using the "interval consensus gossip" approach described in [2]. The state space used for this problem is $\{0, 0.5^-, 0.5^+, 1\}$. Initially, nodes vote for either "0" or "1" with corresponding states of $0$ or $1$. When neighboring nodes come into contact, they exchange their states and update them based on a predefined transition rule. When the algorithm reaches convergence, all nodes are expected to have states within the set $\{0, 0.5^-\}$ if "0" is the majority choice. Conversely, if "1" is the majority choice, all node states will belong to the set $\{0.5^+, 1\}$. Subsequently, in [3], a Pairwise Asynchronous Graph Automata (PAGA) has been used to extend the above idea to the multiple choice voting problem, and sufficient conditions for convergence are stated. In [4], a distributed algorithm for multi-choice voting/ranking is proposed. The interaction between a pair of agents is based solely on intersection and union operations. The optimality of the number of states per node is proven for the ranking problem. Ref. [7] explores distributed mode consensus in an open multi-agent system. Each agent utilizes an average consensus protocol to estimate the frequency of each element and then selects the one with the highest frequency as the mode. Agents are free to join or leave the network, and the mode may vary during the process, but an agent that leaves the network needs to signal this intention to its neighbors beforehand.

In this paper, we present distributed mode consensus algorithms based on the concept of blended dynamics [6, 8]. Blended dynamics have the characteristic that the collective behavior of a multi-agent system can be constructed from the individual vector fields of each agent when there is strong coupling among them. As an example, [6] has demonstrated that individual agents can estimate the number of agents in the network in a distributed manner. The proposed mode consensus algorithms provide two key benefits, over and beyond the inherent contribution. First, the algorithms can be easily implemented in a plug-and-play manner. This means that the system can maintain its mode consensus task without requiring a reset of all agents whenever a new agent joins or leaves the network. Second, we can demonstrate the intuitively satisfying conclusion that the frequency of the mode has an impact on the convergence rate of the mode consensus algorithm, in the sense that a higher mode frequency results in faster convergence of the algorithm.

The paper is organized as follows. In Section II, preliminaries on $f$-consensus are introduced, and the mode consensus problem is then described. The direct mode consensus algorithm is described in Section III, along with a characterization of its convergence rate. Section IV combines the direct algorithm with the $k$-th smallest element consensus, resulting in two mode consensus algorithms that are applicable when the mode appears frequently. The performance of the proposed algorithms is evaluated in Section V. Finally, Section VI concludes the paper.

II Notation and Preliminaries

II-A Underlying network

Consider a group of $N$ agents labeled as $\mathcal{V}=\{1,\cdots,N\}$. Every agent has an attribute, which can be thought of as a label. The attribute could be a positive integer, a real vector, a color, a gender, an age group, a brand of car, etc. Two or more agents may have the same attribute (and indeed, in many situations one might expect the number of distinct attributes to be much less than the number of agents). The attribute of agent $i$ will be denoted by $a_i$. The vertex set is part of an undirected graph $\mathcal{G}$ with which is also associated a set $\mathcal{E}$ of edges (i.e., vertex pairs), in the usual way. The neighbor set $\mathcal{N}_i$ of agent $i$ is the set of vertices which share an edge with agent $i$. The vertex set $\mathcal{V}$, the edge set $\mathcal{E}$, the attributes $a_i$, and $\mathcal{N}_i$ are assumed to be time-invariant in the bulk of the paper, but at times, to accommodate a "plug and play" capability, we open up the possibility that, following certain rules, they are piecewise constant. The state $x_i$ is updated by an out-distributed control algorithm, that is, at every time $t$, the quantity $\dot{x}_i$ is computed as some function of $a_i$, $x_i(t)$, and $x_j(t)$ for all $j\in\mathcal{N}_i(t)$. (While we choose a continuous-time setting in this paper, it seems very probable that a discrete-time setting could be used as an alternative, with very little change to the main assumptions, arguments and conclusions.)

Assumption 1

The graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is undirected and connected, with $|\mathcal{V}|=N$.

The need for connectivity is intuitively obvious. The need for $\mathcal{G}$ to be undirected is less so; note though that virtually all blended dynamics developments rest on an assumption of undirectedness of the underlying graph.

II-B $f$-consensus and broad problem setting

To explain the problem setting, we define first the particular generalization of consensus, viz. $f$-consensus, with which we are working, and then indicate an explicit type of $f$-consensus problem of interest in this paper for which we are seeking an update algorithm. Following this, we provide two illustrative examples of $f$-consensus which are in some way relevant to the problem considered here. For an introduction and the development of $f$-consensus, see e.g. [10, 9, 5].

Definition 1 ($f$-consensus)

With $\Omega$ being the set of all possible distinct attributes, consider a collection of $N$ agents whose attributes take values in $\Omega$. Suppose $f:\Omega^N\to\Omega$ is a given function. An algorithm is said to achieve $f$-consensus asymptotically if it is out-distributed and the solution of the overall system exists and satisfies, for every $i\in\mathcal{V}$, $\lim_{t\to\infty}x_i(t)=f(a_1,\cdots,a_N)$.

Average consensus, obtained by setting $\Omega=\mathbb{R}$, $a_i=x_i(0)$ and $f(a_1,\cdots,a_N)=\sum_{i=1}^N a_i/N$, is a very well known example. Some others are detailed further below, starting with the problem of interest in this paper. In the meantime however, we shall make the following assumption:

Assumption 2

The set $\Omega$ is finite, and there is a bijective mapping $l:\Omega\to\mathcal{D}:=\{1,2,\dots,|\Omega|\}$.

To illustrate the practical effect of this assumption, suppose that we are considering an attribute which is a 2-vector of real numbers, being the height and weight of a group of individuals. Such data is always quantized in the real world, with height, for example, usually being expressed to the nearest number of centimeters. So the vector entry corresponding to height might be an integer somewhere between 25 and 250, and a value of, say, 180 would indicate that the individual's height lies in the interval $[179.5,180.5)$. In effect $\Omega$ becomes a finite subset of $\mathbb{R}^2$. Any such finite set $\Omega$ can always be bijectively mapped to $\{1,2,\cdots,|\Omega|\}=\mathcal{D}$. While the possibly unordered set $\Omega$ is mapped to an ordered set $\mathcal{D}$ by the mapping $l$, the order can be arbitrary for our purpose of computing the mode of the attributes.
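As a concrete illustration of Assumption 2, the following minimal sketch (with hypothetical quantized height-weight attributes) builds one such bijection $l$ by enumerating the distinct attributes in an arbitrary order:

```python
# Sketch: build a bijective label map l : Omega -> {1, ..., |Omega|}.
# The (height, weight) tuples are hypothetical quantized attributes.
attributes = [(180, 75), (165, 60), (180, 75), (172, 68)]  # a_i for each agent

omega = sorted(set(attributes))              # the finite set Omega; order is arbitrary
l = {a: k + 1 for k, a in enumerate(omega)}  # l(a) in {1, ..., |Omega|}
l_inv = {k: a for a, k in l.items()}         # inverse map, used later to recover the mode

print(l[(180, 75)])         # an integer label in 1..|Omega|
print(l_inv[l[(180, 75)]])  # round trip returns (180, 75)
```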

II-C Two relevant examples of $f$-consensus

There are two related problems treatable by $f$-consensus which are similar to the problem just posed, and which have provided the authors with insights used in formulating the solution of the mode consensus problem.

II-C1 Distributed computation of network size

In the literature, there exist consensus protocols that accomplish the task of distributed computation of the network size based on blended dynamics, see e.g. [6]. Inspired by [6], the following simple protocol estimates $N$ in finite time under the assumption that $N\leq\bar{N}$, where $\bar{N}$ is a known upper bound of $N$:

$\dot{x}_1 = h_x\big[-x_1 + 1 + \gamma_x\sum_{j\in\mathcal{N}_1}(x_j-x_1)\big],$
$\dot{x}_i = h_x\big[1 + \gamma_x\sum_{j\in\mathcal{N}_i}(x_j-x_i)\big],\quad \forall i\neq 1,$   (1)

where $\gamma_x>0$ is the coupling gain, and $h_x>0$ is the gain to control the speed of the algorithm. As can be shown via Theorem 1 of the next section, if $x_i(0)\in\mathcal{K}_x:=[0.5,\bar{N}+0.5]$ and $\gamma_x\geq\bar{N}^3$, the solution of the system (1) satisfies

$\langle x_i(t)\rangle = N,\quad \forall t>\mathcal{T}_x$

where $\langle\cdot\rangle$ is the rounding function, and

$\mathcal{T}_x = \dfrac{4\bar{N}}{h_x}\ln\dfrac{4M_{\mathcal{K}_x}\sqrt{\bar{N}}}{2-\sqrt{2}}$

where $M_{\mathcal{K}_x}=\bar{N}$ is the size of the interval $\mathcal{K}_x$.

Remark 1

The upper bound on the estimation time $\mathcal{T}_x$ (which may be quite conservative) depends on $\bar{N}$ in the order of $O(\bar{N}\ln\bar{N})$. If $h_x$ grows linearly with $\bar{N}$, one may even have $O(\ln\bar{N})$. However, a large gain could also undermine the robustness of the protocol against high-frequency noise.
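To make the protocol concrete, the following sketch integrates (1) with a forward-Euler step on a small hypothetical ring network; the step size, horizon, and network are illustrative choices, not values prescribed by the analysis.

```python
import numpy as np

# Forward-Euler sketch of the network-size protocol (1) on a ring of N agents.
N, N_bar = 5, 8                        # true size and known upper bound (hypothetical)
h_x, gamma_x = 1.0, float(N_bar**3)    # speed and coupling gains (Theorem 1 scaling)
A = np.zeros((N, N))
for i in range(N):                     # ring: agent i is linked to i-1 and i+1
    A[i, (i - 1) % N] = A[i, (i + 1) % N] = 1.0
L = np.diag(A.sum(axis=1)) - A         # Laplacian, so (L x)_i = -sum_j (x_j - x_i)

rng = np.random.default_rng(0)
x = rng.uniform(0.5, N_bar + 0.5, N)   # local initialization x_i(0) in K_x
e1 = np.zeros(N); e1[0] = 1.0
dt = 2e-4                              # small enough for the stiff coupling term
for _ in range(250_000):
    x = x + dt * h_x * (-e1 * x[0] + 1.0 - gamma_x * (L @ x))
print(np.round(x))                     # all entries round to N = 5
```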

II-C2 $k$-th smallest element consensus

Since $\mathcal{D}$ is a totally ordered set, suppose without loss of generality that $l(a_1)\leq l(a_2)\leq\cdots\leq l(a_N)$; then the $k$-th smallest element is defined as $a_k$. The $k$-th smallest element (or $k$-th order statistic) consensus problem is then an $f$-consensus problem with $f(a_1,\cdots,a_N)=a_k$.

Ref. [5] proposed a method to solve the $k$-th smallest element consensus problem with distributed convex optimization algorithms. An example used in the following is

$\dot{z}_i = -\phi_k(z_i,a_i,N) + \gamma_z\sum_{j\in\mathcal{N}_i}\mathrm{sgn}(z_j-z_i),$   (2)

where $\phi_k:\mathbb{R}\times\Omega\times\mathbb{N}\to\mathbb{R}$ is defined by

$\phi_k(z_i,a_i,N) = \begin{cases}\beta(z_i-l(a_i))-gk,& z_i<l(a_i),\\ 0,& z_i=l(a_i),\\ \beta(z_i-l(a_i))+g(N+1-k),& z_i>l(a_i),\end{cases}$

with $\beta>0$ and $g>\beta\bar{N}|\Omega|$. (In [5] the bound was given by $g>\beta N\big|l(a_k)-\frac{1}{N}\sum_{i=1}^N l(a_i)\big|$, which can be loosened to $g>\beta N|l(a_N)-l(a_1)|=\beta N|\Omega|$. Also, system (2) admits a unique solution in the sense of Filippov; for more details, refer to, e.g., [11, Proposition S2].) Initial conditions for the differential equations can be arbitrary. It can be shown following [12] that if

$\gamma_z > \bar{N}\max_{1\leq i\leq N}\,\sup_{\tau\in\mathcal{K}_z}|\phi_k(\tau,a_i,N)|,$   (3)

the convergence rate of the protocol (2)–(3) satisfies $V(t)\leq e^{-\beta t}V(0)$, where $V(t)=\frac{1}{2}\sum_{i=1}^N|z_i(t)-l(a_k)|^2$. When $\mathcal{K}_z=[0.5,\bar{N}+0.5]$, it can be obtained that the solution of (2)–(3) satisfies

$\langle z_i(t)\rangle = l(a_k),\quad \forall t>\mathcal{T}_z$

where

$\mathcal{T}_z = \dfrac{1}{\beta}\ln\big(2\bar{N}|\Omega|\big).$

By inverting the map $l$, every agent can figure out the $k$-th smallest element.

Remark 2

Although increasing $\beta$ renders a smaller $\mathcal{T}_z$, it also amplifies $\gamma_z$ to a very large number for large $\bar{N}$. To reach a balance between $\mathcal{T}_z$ and $\gamma_z$, we select $\beta=O(1/\bar{N})$, so that $\mathcal{T}_z=O(\bar{N}\ln\bar{N})$ and $\gamma_z=O(\bar{N}^2)$.
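As a sketch of how the protocol (2)–(3) behaves, the snippet below integrates the sign-coupled dynamics with forward Euler on a hypothetical ring; the labels, gains, step size, and horizon are illustrative assumptions, with $\beta$ chosen as in Remark 2.

```python
import numpy as np

# Forward-Euler sketch of the k-th smallest element protocol (2) on a ring.
labels = np.array([1, 3, 3, 2, 5, 4])  # l(a_i) for hypothetical attributes
N, N_bar, k = len(labels), 8, 3        # estimate the 3rd smallest label
beta, g = 1.0 / N_bar, 10.0            # beta = O(1/N_bar), cf. Remark 2
gamma_z = g * N_bar**2                 # coupling gain in the spirit of (3)

def phi(z):
    # the piecewise map phi_k(z_i, a_i, N), vectorized over all agents
    below = beta * (z - labels) - g * k
    above = beta * (z - labels) + g * (N + 1 - k)
    return np.where(z < labels, below, np.where(z > labels, above, 0.0))

rng = np.random.default_rng(0)
z = rng.uniform(0.5, N_bar + 0.5, N)   # arbitrary (local) initialization
dt = 2e-5
for _ in range(500_000):
    coupling = np.sign(np.roll(z, 1) - z) + np.sign(np.roll(z, -1) - z)
    z = z + dt * (-phi(z) + gamma_z * coupling)
print(np.round(z))                     # all entries round to sorted(labels)[k-1] = 3
```

The rounding step mirrors the use of $\langle\cdot\rangle$ in the analysis: the discretized sign coupling leaves a small chattering band around the consensus value, which rounding removes.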

Remark 3

Both of the examples above, the distributed computation of network size and the $k$-th smallest element consensus, exhibit algorithms with a finite convergence time. Likewise, for mode consensus, we seek algorithms which also yield a finite consensus time.

II-D The problem of interest: Mode consensus

Mode consensus is a special class of $f$-consensus problem. Suppose the function $f_{\rm m}:\Omega^N\to\Omega$ returns the attribute $a^*\in\Omega$ that appears most often among $a_1,\cdots,a_N$ (when multiple distinct values appear equally most often, $f_{\rm m}$ returns any one of these values, as specified by the user). The mode consensus problem as studied in this paper is an $f$-consensus problem with $f=f_{\rm m}$.
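Centrally, $f_{\rm m}$ is just a frequency count together with a user-specified tie rule; the short sketch below (with a hypothetical "smallest label wins" tie rule) clarifies the object that the distributed algorithms must reproduce:

```python
from collections import Counter

# Centralized f_m: return a most frequent attribute; ties are broken by the
# smallest value here, but the tie rule is a user choice per the definition above.
def f_m(attrs):
    counts = Counter(attrs)
    best = max(counts.values())
    return min(a for a, c in counts.items() if c == best)

print(f_m([4, 4, 2, 2, 7]))  # prints 2 under this tie rule
```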

II-E Plug-and-play

In many circumstances, it may be advantageous to have a plug-and-play capability. Specifically, we are interested in a network that can change over time during its operation. Changes cannot be arbitrary, but rather must be admissible in accordance with certain rules.

First, while the potential vertex set $\bar{\mathcal{V}}$ is taken as time-invariant, that is, $\bar{\mathcal{V}}=\{1,\cdots,\bar{N}\}$ is fixed over time for some $\bar{N}$, the actual vertex set $\mathcal{V}$, which is an arbitrary subset of $\bar{\mathcal{V}}$, can be time-varying but piecewise constant; with $N(t)$ denoting its cardinality, $\bar{N}$ is an upper bound for $N(t)$. In this setting, our admissible changes are as follows:

  • The set of edges, written as $\mathcal{E}(t)$, is time-varying but piecewise constant. Therefore, at certain time instants, a new edge or edges can be created, and some existing edge or edges can be deleted.

  • There can be orphan nodes (meaning a node that has no incident edge). When all the edges that are incident to a node are deleted, we say the node leaves the network.

  • We assume that there is only one connected component in the network. In fact, even if there is more than one connected component, our concern is with just one of them. If a connected component of interest splits into two connected components (with some edges being deleted), one of these components still receives our attention, and all the nodes belonging to the other component are regarded as leaving the network.

  • The attribute of a node is permitted to be time-varying but must be piecewise constant.

  • The node dynamics stop integrating whenever the node is an orphan. One example of the node dynamics is:

    $\dot{x}_i = \mathrm{sgn}(|\mathcal{N}_i(t)|)\cdot\big(g_i(x_i,t) + \sum_{j\in\mathcal{N}_i(t)}(x_j-x_i)\big)$   (4)

    where $|\cdot|$ for a set denotes the cardinality of the set. In this way, malfunctioning agents can also be represented by considering them as orphans.

The control algorithm for updating the states of agents is said to be plug-and-play ready if, whenever an abrupt admissible change of the network occurs, (a) the $f$-consensus is recovered (to a new value reflecting the new situation) after a transient, and (b) the recovery is achieved by passive manipulation, together with local initialization of the newly joining agent if necessary. By passive manipulation we mean that, when an individual agent detects the changes in its incident edge set, or equivalently its neighbor set, associated with immediate neighbors leaving or joining the network, the agent can perform some actions that do not require reactions of neighboring agents. An example is (4) because, if $\mathcal{N}_i(t^-)\neq\mathcal{N}_i(t^+)$ at time $t$, the manipulation simply resets the incident edge set, or equivalently the neighbor set, and this can be done by the individual agent. (Of course, neighbor agents may have to also reset in order to carry out their own updates.) By local initialization we mean that any initialization following a change must be local only, i.e., it must be global initialization-free. This implies that the algorithm should not depend on a particular initial condition constraining two or more agents in some linked way. A constraint such as $\sum_{i=1}^N x_i(0)=0$ is an example of global initialization, while the constraint that $x_i(0)\in\mathcal{K}$, $\forall i\in\mathcal{V}$, where $\mathcal{K}$ is a compact set known to every agent, is considered as an example of local initialization (or equivalently, it is global initialization-free); such a local initialization is required for the newly joining agent (or, in (4), when $\mathcal{N}_i(t)$ becomes non-empty so that the sign function changes from 0 to 1).

The plug-and-play ready property basically requires that, when a change occurs, no further action is required except by the agents whose incident edge sets are changed. In particular, the requirement of passive manipulation is useful in the case when some agent suddenly stops working (without any prior notification) and cannot function properly anymore.
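A minimal sketch of the gated update (4) is given below (hypothetical names and network); it illustrates the passive manipulation: each agent edits only its own neighbor list, and an orphan simply freezes its state.

```python
import numpy as np

# One forward-Euler step of the gated dynamics (4): sgn(|N_i(t)|) = 0 freezes orphans.
def step(x, neighbors, g, dt):
    """x: state vector; neighbors[i]: current neighbor list of agent i;
    g(i, xi): local vector field g_i(x_i, t) (time dependence dropped for brevity)."""
    x_next = x.copy()
    for i, nbrs in enumerate(neighbors):
        if not nbrs:                   # orphan node: dynamics stop integrating
            continue
        coupling = sum(x[j] - x[i] for j in nbrs)
        x_next[i] = x[i] + dt * (g(i, x[i]) + coupling)
    return x_next

x = np.array([1.0, 2.0, 5.0])
nbrs = [[1], [0, 2], [1]]              # path graph 0 - 1 - 2
x = step(x, nbrs, lambda i, xi: 0.0, 0.1)
nbrs = [[1], [0], []]                  # node 2 leaves: agents 0 and 1 reset locally
x = step(x, nbrs, lambda i, xi: 0.0, 0.1)
print(x)                               # node 2's state stays frozen from now on
```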

III Direct mode consensus algorithm

The following protocol, which is motivated by the algorithm for distributed computation of network size discussed above, lays the foundation of the mode consensus algorithm. It is also inspired by the notion of blended dynamics, and calculates the number of agents with an arbitrary particular attribute $a\in\Omega$, denoted by $\mathcal{F}(a)$, in a distributed manner:

$\dot{y}_1 = h_y\big[-y_1 + I(a,a_1) + \gamma_y\sum_{j\in\mathcal{N}_1}(y_j-y_1)\big],$
$\dot{y}_i = h_y\big[I(a,a_i) + \gamma_y\sum_{j\in\mathcal{N}_i}(y_j-y_i)\big],\qquad \forall i\neq 1$   (5)

where

$I(a,a_i) = \begin{cases}1,& a_i=a,\\ 0,& a_i\neq a,\end{cases}$   (6)

and, analogous to (1), here $\gamma_y>0$ is the coupling gain, and $h_y>0$ is the gain to control the speed of the algorithm.

Theorem 1

Suppose Assumptions 1 and 2 hold. If $\gamma_y\geq\bar{N}^3$, then for any initial condition $y_i(0)\in\mathcal{K}_y:=[-0.5,\bar{N}+0.5]$, the solution of the consensus protocol (5)–(6) satisfies

$\langle y_i(t)\rangle = \mathcal{F}(a),\quad \forall t>\mathcal{T}_y,$   (7)

where, with $M_{\mathcal{K}_y}=\bar{N}+1$ (the size of $\mathcal{K}_y$),

$\mathcal{T}_y = \dfrac{4\bar{N}}{h_y}\ln\dfrac{4M_{\mathcal{K}_y}\sqrt{\bar{N}}}{2-\sqrt{2}}.$

The proof of Theorem 1 is found in the Appendix.

Remark 4

If $a=a_1=\cdots=a_N$, there is no essential difference between (1) and (5). Thus the proof of Theorem 1 is also applicable to the network size estimation protocol (1).

Remark 5

In the literature, $\mathcal{F}(a)/N$ can be estimated using average consensus protocols, see [7]. However, in order to run the protocol in an open multi-agent system, an agent that leaves the network must signal its intention to its neighbors beforehand. This is not required by the proposed Algorithm 1.

Remark 6

As is usual for algorithms based on the blended dynamics approach, grounded in singular perturbation ideas, there are broadly speaking two convergence rates, the fast one being associated with the achieving of consensus between the agents (adjusted by $\gamma_y$), and the slower one being associated with the blended dynamics (adjusted by $h_y$). This behavior is apparent in the simulations discussed later.

Algorithm 1 Distributed mode consensus algorithm run at every agent $i$

  1. Run the distributed consensus protocol (5)–(6) with $y_i(0)\in\mathcal{K}_y$ to estimate $\mathcal{F}(a)$ for every $a\in\Omega$;

  2. Return the attribute defined by the mode:

     $a^* = \arg\max_{a\in\Omega}\mathcal{F}(a).$   (8)

Based on Theorem 1, Algorithm 1 for distributed computation of the mode can be formulated directly. Evidently, one can execute the first step of Algorithm 1 using parallel computations, one for each attribute. Consider the following modification to (5)–(6):

$\dot{\xi}_1 = h_\xi\big[-\xi_1 + e_{l(a_1)} + \gamma_\xi\sum_{j\in\mathcal{N}_1}(\xi_j-\xi_1)\big],$
$\dot{\xi}_i = h_\xi\big[e_{l(a_i)} + \gamma_\xi\sum_{j\in\mathcal{N}_i}(\xi_j-\xi_i)\big],\qquad \forall i\neq 1,$   (9)

where $\xi_i\in\mathbb{R}^{|\Omega|}$ is the state vector of agent $i$, and $e_k\in\mathbb{R}^{|\Omega|}$ is the $k$-th standard unit vector. Likewise, $\gamma_\xi$ and $h_\xi$ are positive scalar gains. If each agent holds the set $\Omega$ and the map $l$, then the execution of the second step of the algorithm is straightforward (and achievable by a single agent, or all agents). Any agent $i$ for which $a_i=a^*$ by definition has an attribute which is the mode.
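A sketch of Algorithm 1 via the stacked protocol (9) follows; the ring network, gains, attribute labels, and Euler discretization are all illustrative assumptions.

```python
import numpy as np

# Sketch of Algorithm 1 using (9): each agent carries xi_i in R^{|Omega|}, whose
# entries jointly estimate F(a) for all attributes; the mode is the argmax entry.
labels = np.array([2, 1, 2, 2, 3, 1])      # l(a_i), hypothetical
N, n_omega, N_bar = len(labels), 3, 8
h_xi, gamma_xi = 1.0, float(N_bar**3)
A = np.zeros((N, N))
for i in range(N):                         # ring network again, for illustration
    A[i, (i - 1) % N] = A[i, (i + 1) % N] = 1.0
L = np.diag(A.sum(axis=1)) - A

E = np.eye(n_omega)[labels - 1]            # row i is the unit vector e_{l(a_i)}
rng = np.random.default_rng(0)
Xi = rng.uniform(-0.5, N_bar + 0.5, (N, n_omega))  # local initialization in K_y
dt = 2e-4
for _ in range(250_000):
    grounding = np.zeros_like(Xi)
    grounding[0] = -Xi[0]                  # only agent 1 carries the -xi_1 term
    Xi = Xi + dt * h_xi * (grounding + E - gamma_xi * (L @ Xi))
freqs = np.round(Xi[0])                    # agent 1's estimates of F(1), F(2), F(3)
print(freqs, "mode label:", np.argmax(freqs) + 1)  # [2. 3. 1.] mode label: 2
```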

Remark 7

Indeed, if there is more than one attribute with the highest frequency of occurrence, Algorithm 1 is able to find all of these attributes. In that case the user has the option to return any one or several of them. This issue will not be examined any further in the remainder of the paper.

Algorithm 1 is attractive on several grounds. It offers finite convergence time with a bound available for that time, and it is plug-and-play ready. In particular, if the admissible changes discussed in Section II-E occur with the dwell time $\mathcal{T}_y$ (i.e., any two consecutive changes do not occur within $\mathcal{T}_y$), then the mode is obtained after the time $\mathcal{T}_y$ from the time of change. This is because, whenever $y_i(t_j)\in\mathcal{K}_y$, where $t_j$ is the $j$-th time of change, it holds that $-0.5+\mathcal{F}(a)<y_i(t)<\mathcal{F}(a)+0.5$ for all $t\geq t_j+\mathcal{T}_y$ by Theorem 1. Thus, $y_i(t_{j+1})\in\mathcal{K}_y$ because $\mathcal{F}(a)\in[0,\bar{N}]$, and this repeats. On the other hand, Algorithm 1 has the potential disadvantage that every agent has to run the consensus protocol (5)–(6) multiple (in fact $|\Omega|$) times, which could be computationally burdensome (but may not be, even with large $N$). For small $|\Omega|$, there is in fact no substantive disadvantage.

IV Mode consensus algorithm considering the frequency of the mode

In this section, we consider two alternative algorithms based on knowledge, available a priori or acquired early in the algorithm, of a lower bound on the frequency of the mode.

Both algorithms of this second style use the following result.

Lemma 1

Let $a\in\Omega$, and let $K$ be a positive integer. If $\mathcal{F}(a)\geq\lceil N/K\rceil$, then there is an integer $j\in\{1,2,\cdots,K\}$ such that $l(a)$ equals the $j\lceil N/K\rceil$-th smallest element of $\{l(a_1),\cdots,l(a_N)\}$.

Proof:

The proof uses a typical pigeonhole principle argument. For notational convenience, let $l_i:=l(a_i)$ and $l_a:=l(a)$. Suppose without loss of generality that $l_1\leq\cdots\leq l_N$. Moreover, let $l_{N+1},\cdots,l_{K\lceil N/K\rceil}$ be additional attributes introduced temporarily just for the proof and such that

$l_1\leq\cdots\leq l_N\leq l_{N+1}\leq\cdots\leq l_{K\lceil N/K\rceil}.$   (10)

Partition the above sequence into subsequences

$\mathcal{D}_j = \big\{l_{(j-1)\lceil N/K\rceil+1},\, l_{(j-1)\lceil N/K\rceil+2},\,\cdots,\, l_{j\lceil N/K\rceil}\big\}$

for $j=1,2,\cdots,K$. It follows that $|\mathcal{D}_j|=\lceil N/K\rceil$.

The result is then proved with a contradiction argument. Suppose, to obtain a contradiction, that $l_a\neq l_{j\lceil N/K\rceil}$ for all $j\in\{1,2,\cdots,K\}$. Then it must follow that $l_a=l_{(j-1)\lceil N/K\rceil+s}$ for some $j\in\{1,2,\cdots,K\}$ and $s\in\{1,\cdots,\lceil N/K\rceil-1\}$. Combined with the fact that the values are ordered (see (10)), every copy of $l_a$ then lies strictly inside a single block $\mathcal{D}_j$, excluding its last entry, so the number of agents with attribute equal to $a$ is less than $\lceil N/K\rceil$, which yields a contradiction. ∎

Remark 8

In Lemma 1, if in addition we have $\lceil N/K\rceil>N/K$, the result of Lemma 1 can be mildly strengthened to requiring $j\in\{1,2,\cdots,K-1\}$, because the $j\lceil N/K\rceil$-th smallest element with $j=K$ is then outside the set $\{l(a_1),\cdots,l(a_N)\}$.
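The statement of Lemma 1 is easy to check numerically; the sketch below verifies, over random label multisets, that every attribute with frequency at least $\lceil N/K\rceil$ appears among the $j\lceil N/K\rceil$-th order statistics:

```python
import math
import random
from collections import Counter

# Numerical check of Lemma 1: any attribute occurring at least ceil(N/K) times
# must equal one of the j*ceil(N/K)-th smallest labels, j = 1, ..., K.
random.seed(0)
for _ in range(1000):
    N, K = random.randint(5, 40), random.randint(1, 6)
    labels = sorted(random.randint(1, 10) for _ in range(N))
    step = math.ceil(N / K)
    # indices beyond N would hit the padded entries of the proof (cf. (10)),
    # so only the order statistics that fall inside the actual list are kept
    order_stats = {labels[j * step - 1] for j in range(1, K + 1) if j * step <= N}
    for a, freq in Counter(labels).items():
        if freq >= step:
            assert a in order_stats, (labels, K)  # would contradict Lemma 1
print("Lemma 1 verified on 1000 random instances")
```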

IV-A Algorithm with a priori knowledge of the least frequency of the mode

The message of Lemma 1 is that, if a lower bound of the frequency of the mode is known, that is, $\mathcal{F}(a^*)\geq f^*$ with a known $f^*$, then, with $K$ such that $f^*\geq\lceil N/K\rceil$, the integer $l(a^*)$ for the mode $a^*$ must be one of the $j\lceil N/K\rceil$-th smallest elements, $j=1,\dots,K$, of $\{l(a_1),\dots,l(a_N)\}$. This message yields Algorithm 2, in which Step 2) identifies the $j\lceil N/K\rceil$-th smallest elements, $j=1,\dots,K$, and then Step 4) finds the mode by comparing the frequencies among the candidates.

Algorithm 2 Distributed mode consensus algorithm run at every agent $i$

  1. Estimate the network size, $N$, with the distributed consensus protocol (1) with $x_i(0)\in\mathcal{K}_x$;

  2. For each $j\in\{1,2,\cdots,K\}$ (or $j\in\{1,2,\cdots,K-1\}$ if $\lceil N/K\rceil>N/K$), run the consensus protocol (2)–(3) with $z_i(0)\in\mathcal{K}_z$ to estimate the $j\lceil N/K\rceil$-th smallest element $\alpha_j\in\Omega$. Collect them as $\mathcal{A}:=\{\alpha_1,\dots,\alpha_{|\mathcal{A}|}\}$, where $|\mathcal{A}|\leq K$ or $K-1$;

  3. Run the distributed consensus protocol (5)–(6) with $y_i(0)\in\mathcal{K}_y$ to estimate $\mathcal{F}(\alpha)$ for every $\alpha\in\mathcal{A}$;

  4. Return the mode

     $a^* = \arg\max_{\alpha\in\mathcal{A}}\{\mathcal{F}(\alpha)\}.$   (11)

While Algorithm 2 outlines the distributed mode consensus algorithm, it also reveals the significance of $K$ in reducing the computational load. Depending on $K$, sometimes Algorithm 2 may involve manipulating or storing fewer variables than Algorithm 1, but the reverse may hold. In fact, Step 2) of Algorithm 2 can be considered as a selection procedure to find the attributes to be inspected. By this, it is expected that the number of inspections in (11) (effectively, the cardinality of $\mathcal{A}$) is smaller than that in (8) (the cardinality of $\Omega$). This is indeed the case when, for example, $N=10$, $[l(a_1),\dots,l(a_N)]=[1,1,1,1,1,2,3,4,5,6]$, and $|\Omega|=6$. That is, for (8), at least six attributes are inspected in Step 1) of Algorithm 1. For the case of Algorithm 2, assuming that $f^*=5$ is known, $K=2$ guarantees that $f^*=5\geq\lceil 10/2\rceil=5$, and so two attributes are inspected by Step 3) of Algorithm 2. However, in order to identify the attributes to be inspected, Steps 1)–2) of Algorithm 2 need to be performed, as an overhead. So the total number of variables to be manipulated is $2K+1=5$, which is still less than $|\Omega|=6$. As a second example, if $[l(a_1),\dots,l(a_N)]=[1,2,3,4,4,5,6,7,8,8]$ with $N=10$ and $|\Omega|=8$, then $f^*=2$ and we have to choose $K=5$. In this case, Algorithm 2 inspects five candidates while Algorithm 1 inspects eight candidates, but the count of the overhead is five, so that the total count becomes $2K+1=11$, which is more than $|\Omega|=8$.

A convergence time bound can be established as follows. Suppose Algorithm 2 is executed with parallel computations at Steps 2) and 3); for example, at Step 2), each agent runs the $j\lceil N/K\rceil$-th smallest element consensus protocol for every $j=1,2,\cdots,K$ in parallel. Then, the mode consensus is reached within the time $\mathcal{T}_x+\mathcal{T}_y+\mathcal{T}_z$.

It appears from Algorithm 2 that Step $i+1$) can only be implemented after Step $i$) has converged. This is actually not the case. All steps may start simultaneously, provided that the parameters used in any step are replaced with the estimated values generated in the previous steps. Indeed, the following equations are one alternative implementation of Algorithm 2 for agent $i$:

$\dot{x}_i = h_x\big[-c_ix_i + 1 + \gamma_x\sum_{j\in\mathcal{N}_i}(x_j-x_i)\big],$
$\dot{z}_k^i = -\phi_k^K(z_k^i,a_i,x_i) + \gamma_z\sum_{j\in\mathcal{N}_i}\mathrm{sgn}(z_k^j-z_k^i),\qquad k=1,\dots,K,$
$\dot{y}_k^i = h_y\big[-c_iy_k^i + \mathcal{I}(z_k^i,a_i) + \gamma_y\sum_{j\in\mathcal{N}_i}(y_k^j-y_k^i)\big],\qquad k=1,\dots,K,$
$\widehat{m}^i = \operatorname{argmax}_{1\leq j\leq K}\{y_j^i\}$   (12)

where $c_1=1$, $c_i=0$ for all $i\neq 1$, and

$\phi_k^K(z,a,x) = \begin{cases}\beta(z-l(a))-gk\left\lceil\frac{\langle x\rangle}{K}\right\rceil,& z<l(a),\\ 0,& z=l(a),\\ \beta(z-l(a))+g\left(\langle x\rangle+1-k\left\lceil\frac{\langle x\rangle}{K}\right\rceil\right),& z>l(a),\end{cases}$

$\mathcal{I}(z,a) = \begin{cases}1,& \langle z\rangle=l(a),\\ 0,& \langle z\rangle\neq l(a),\end{cases}$

in which $\beta$, $g$, $\gamma_x$, $\gamma_y$, and $\gamma_z$ are predetermined. (When $\lceil N/K\rceil>N/K$, the $K$ in the above is interpreted as $K-1$.) Here, $x_i$ is the estimate of $N$ by agent $i$, $z_k^i$ is the estimate of the $k\lceil N/K\rceil$-th smallest element, $y_k^i$ is the estimate of the frequency of that element, and $\widehat{m}^i$ is the estimated integer for the mode, i.e., the estimated mode is $a\in\Omega$ such that $l(a)=\widehat{m}^i$. In fact, although all the dynamics (12) run together, the estimates are sequentially obtained. That is, $z_k^i$ starts converging to its proper value after $\langle x_i\rangle$ becomes $N$, and $y_k^i$ starts converging to its proper value after $\langle z_k^i\rangle$ becomes the $k\lceil N/K\rceil$-th smallest element. Therefore, the required time for getting the mode is still the same as $\mathcal{T}_x+\mathcal{T}_y+\mathcal{T}_z$.

Algorithm 2, repeated forever, and the alternative algorithm (12) are plug-and-play ready as long as the admissible changes occur with the dwell time $\mathcal{T}_x+\mathcal{T}_y+\mathcal{T}_z$. This is because, with $x_i(t_j)\in\mathcal{K}_x$, $z_k^i(t_j)\in\mathcal{K}_z$, and $y_k^i(t_j)\in\mathcal{K}_y$ at the time of change $t_j$, the same inclusions hold at the next time of change $t_{j+1}$; i.e., $x_i(t_{j+1})\in\mathcal{K}_x$, $z_k^i(t_{j+1})\in\mathcal{K}_z$, and $y_k^i(t_{j+1})\in\mathcal{K}_y$.

From (12) we see that the number of state variables needed in Algorithm 2 equals $2K+1$ (or $2K-1$ if $\lceil N/K\rceil>N/K$). Thus Algorithm 2 uses fewer state variables than Algorithm 1 if $2K+1<|\Omega|$ (or $2K-1<|\Omega|$ when $\lceil N/K\rceil>N/K$). In particular, if $K=1$, the mode consensus problem reduces to the max consensus of $\{l(a_1),\dots,l(a_N)\}$, since $l(a^*)$ is the $N$-th smallest element; if $K=2$, the problem reduces to finding the attribute having the larger frequency between the median and the maximum of $\{l(a_1),\dots,l(a_N)\}$ (if $N$ is odd, it is simply the median); if $K=|\Omega|$, Algorithm 2 probably has to do the $k$-th smallest element consensus $|\Omega|$ times, and then it offers no advantage over Algorithm 1. Since $f^*\geq\lceil N/|\Omega|\rceil$, it is not necessary to consider $K>|\Omega|$ because the condition $f^*\geq\lceil N/K\rceil$ is then automatically fulfilled. Thus to choose Algorithm 2 over Algorithm 1 it is preferable to have $2K+1\ll|\Omega|$.
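Abstracting away the dynamics, the logic of Algorithm 2 can be emulated centrally as in the sketch below, where each consensus sub-protocol is replaced by the exact quantity it converges to; the label list and the a priori bound `f_star` come from the first example above.

```python
import math
from collections import Counter

# Centralized emulation of Algorithm 2 (consensus protocols replaced by exact values).
labels = [1, 1, 1, 1, 1, 2, 3, 4, 5, 6]  # l(a_i), first example in the text
f_star = 5                               # a priori lower bound on the mode frequency

N = len(labels)                          # Step 1: network size, protocol (1)
K = math.ceil(N / f_star)                # smallest K with f_star >= ceil(N/K)
step = math.ceil(N / K)
J = range(1, K) if step > N / K else range(1, K + 1)  # Remark 8 refinement
ordered = sorted(labels)
candidates = {ordered[j * step - 1]      # Step 2: j*ceil(N/K)-th smallest, protocol (2)
              for j in J if j * step <= N}
freq = Counter(labels)                   # Step 3: frequencies, protocol (5)-(6)
mode = max(candidates, key=lambda a: freq[a])         # Step 4: (11)
print(sorted(candidates), "->", mode)    # [1, 6] -> 1: two candidates, not |Omega| = 6
```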

IV-B Algorithm with learned knowledge of the least frequency of the mode

When $f^*$ is unknown, we are not able to employ Algorithm 2. However, once we determine the value of $\mathcal{F}(a)$ for some attribute $a$, we can immediately infer that $f^*\geq\mathcal{F}(a)$. Based on this simple fact, we begin with an estimate $F$ of $f^*$ with $F=1$, and update $F$ by $\mathcal{F}(a)$ whenever an attribute $a$ such that $\mathcal{F}(a)>F$ is found. At the same time, we begin with $K=1$ and repeatedly increase $K$ until it holds that $F\geq\lceil N/K\rceil$. Once we have $F\geq\lceil N/K\rceil$, it means that the integer corresponding to the mode $a^*$ is among the $j\lceil N/K\rceil$-th smallest elements, $j=1,\dots,K$, of $\{l(a_1),\dots,l(a_N)\}$, so that the previous algorithm can be applied. Putting all this together, we obtain Algorithm 3.

Algorithm 3 Distributed mode consensus algorithm run at every agent $i$

  1. Estimate the network size, $N$, with the distributed consensus protocol (1) with $x_i(0)\in\mathcal{K}_x$. (If $N=1$, stop, because the mode is the same as the attribute.);

  2. Set $K=F=1$;

  3. While $F<\lceil N/K\rceil$ do

  4. $K\leftarrow K+1$;

  5. For each $j\in\{1,2,\cdots,K\}$ (or $j\in\{1,2,\cdots,K-1\}$ if $\lceil N/K\rceil>N/K$), run the consensus protocol (2)–(3) with $z_i(0)\in\mathcal{K}_z$ to estimate the $j\lceil N/K\rceil$-th smallest element $\alpha_j\in\Omega$. Collect them as $\mathcal{A}:=\{\alpha_1,\dots,\alpha_{|\mathcal{A}|}\}$, where $|\mathcal{A}|\leq K$ or $K-1$;

  6. Run the distributed consensus protocol (5)–(6) with $y_i(0)\in\mathcal{K}_y$ to estimate $\mathcal{F}(\alpha)$ for every $\alpha\in\mathcal{A}$;

  7. $F\leftarrow\max\big\{F,\;\max_{\alpha\in\mathcal{A}}\{\mathcal{F}(\alpha)\}\big\}$;

  8. End While

  9. Return the mode

     $a^* = \arg\max_{\alpha\in\mathcal{A}}\{\mathcal{F}(\alpha)\}.$

In the algorithm, Steps 5) and 6) can be made more efficient by storing data obtained in the previous loop, but in the worst case scenario, the "While" loop has to be run $K^*$ times, where $K^*$ is the smallest positive integer such that $f^*\geq\lceil N/K^*\rceil$. Analogously to the analysis for Algorithm 2, the number of state variables needed in Algorithm 3 equals $1+\sum_{i=1}^{K^*}2i=K^*(K^*+1)+1$. To choose Algorithm 3 over Algorithm 1 it is preferable to have $K^*(K^*+1)+1\ll|\Omega|$. If, in addition, Algorithm 3 is executed with parallel computations, the time to reach mode consensus is less than $\mathcal{T}_x+K^*(\mathcal{T}_y+\mathcal{T}_z)$.
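For completeness, the adaptive loop of Algorithm 3 admits the same kind of centralized emulation (a sketch, with the consensus sub-protocols again replaced by the exact values they estimate):

```python
import math
from collections import Counter

# Centralized emulation of Algorithm 3: K grows until F >= ceil(N/K), with F the
# largest frequency discovered so far; no a priori bound f* is needed.
labels = [1, 2, 3, 4, 4, 5, 6, 7, 8, 8]  # l(a_i), second example of Section IV-A
N = len(labels)                          # Step 1: network size via protocol (1)
ordered, freq = sorted(labels), Counter(labels)
K = F = 1
while F < math.ceil(N / K):              # Steps 3)-8)
    K += 1
    step = math.ceil(N / K)
    J = range(1, K) if step > N / K else range(1, K + 1)
    candidates = {ordered[j * step - 1] for j in J if j * step <= N}
    F = max(F, max(freq[a] for a in candidates))      # Step 7)
mode = max(candidates, key=lambda a: freq[a])         # Step 9)
print("K* =", K, " mode =", mode)        # K* = 5; mode is 4 or 8 (a tie)
```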

V Simulations

Consider an undirected loop composed of $N=40$ agents, where agent $i$ is connected to agent $i+1$ for every $i=1,2,\cdots,N-1$, and to agent $i-1$ for every $i=2,3,\cdots,N$; agents $1$ and $N$ are also connected. Suppose that $\Omega=\{1,2,\cdots,10\}$, and

$[\mathcal{F}(1),\mathcal{F}(2),\dots,\mathcal{F}(10)]=[5,6,7,16,1,1,1,1,1,1].$

Thus $a^*=4$ is the mode, which is unique.

Algorithm 1 is examined first. For each $a\in\Omega$, the initial condition of (5)–(6) at each agent (i.e., the initial guess of $\mathcal{F}(a)$ at each agent) is randomly and independently selected from $1$ to $40$. According to Theorem 1, the coupling gain is selected as $\gamma_y=N^3=6.4\times 10^4$. Set $h_y=10^3$. The simulation results are shown in Figs. 1–2. From Fig. 2 it is observed that consensus is reached rapidly, while the trajectories converge to the frequency of the mode at a relatively slow rate. This is in accord with expectations, given the use of blended dynamics ideas.

Second, Algorithm 2 is examined. Suppose that $\bar{N}=50$. With protocol (1), where $h_x=10^3$ and $\gamma_x=\bar{N}^3=1.25\times 10^5$, and the initial condition at each agent (i.e., the initial guess of $N$ at each agent) randomly and independently selected from $1$ to $\bar{N}$, each agent asymptotically estimates the network size, as shown in Fig. 3. Next, assume that $K=3$ is used in Algorithm 2 (this is valid since $\mathcal{F}(a^*)=16>\lceil N/K\rceil=14$). Thus only the $14$-th and $28$-th smallest elements need to be estimated. Fig. 4 shows the corresponding estimation result by protocol (2), where $\beta=1/\bar{N}=0.02$, $g=|\Omega|=10$, $\gamma_z=g\bar{N}^2=2.5\times 10^4$, and the initial condition at each agent (i.e., the initial guess of the mode at each agent) is randomly and independently selected from $\Omega$. It is shown that the $14$-th smallest element estimate converges to $3$ while the $28$-th converges to $4$. Then, Fig. 5 shows that the estimated frequency of the $14$-th smallest element converges to $7$, and that of the $28$-th smallest element converges to $16$, by running protocol (5)–(6), where the gains and initial conditions are selected following the rules of the first simulation. Finally, Fig. 6 shows the mode estimated at each agent.

Lastly, Algorithm 3 is examined. Parameter settings follow Algorithms 1 and 2, and the iteration starts at $K=1$ and $F=1$. It is supposed that the criterion "$F<\lceil N/K\rceil$" is verified every $0.6$ s. Since $F<\lceil N/K\rceil=40$, $K$ is switched to $2$ immediately, and $\lceil N/K\rceil=20$. Thus the $20$-th and $40$-th smallest elements need to be estimated, and the corresponding results are shown in Figs. 7–8, in which the attribute estimates converge to $4$ and $10$ and the frequency estimates converge to $16$ and $1$. Then, since $F=16<\lceil N/K\rceil=20$, $K$ is further switched to $3$ in Fig. 9. As the situation at $K=3$ is the same as in the second simulation, simulation details are omitted. Note that the termination condition is satisfied since $F=16>\lceil N/K\rceil=14$. Finally, Fig. 10 shows the mode estimated at each agent during the iterations.

Figure 1: The mode $a^*$ estimated at each agent with Algorithm 1, converging to $4$.
Figure 2: The estimated frequency of the mode, i.e., $\mathcal{F}(a^*)$, at each agent with Algorithm 1, converging to $16$.
Figure 3: The network size, i.e., $N$, estimated at each agent with protocol (1).
Figure 4: Top: the estimated $14$-th smallest element at each agent with protocol (2), converging to $3$; bottom: the estimated $28$-th smallest element at each agent with protocol (2), converging to $4$.
Figure 5: Top: the estimated frequency of the $14$-th smallest element at each agent with protocol (5)–(6), converging to $7$; bottom: the estimated frequency of the $28$-th smallest element at each agent with protocol (5)–(6), converging to $16$.
Figure 6: The mode $a^*$ estimated at each agent with Algorithm 2, converging to $4$.
Figure 7: Top: the estimated $20$-th smallest element at each agent with protocol (2), converging to $4$; bottom: the estimated $40$-th smallest element at each agent with protocol (2), converging to $10$.

Figure 8: Top: the estimated frequency of the $20$-th smallest element at each agent with protocol (5)–(6), converging to $16$; bottom: the estimated frequency of the $40$-th smallest element at each agent with protocol (5)–(6), converging to $1$.
Figure 9: The time evolution of $K$.
Figure 10: The mode $a^*$ estimated at each agent with Algorithm 3, converging to $4$.

VI Conclusion

It is shown in this paper that, by employing a blended-dynamics-based protocol, distributed mode consensus can be reached in an undirected network. It is also shown that, if the frequency of the mode is no less than $\lceil N/K\rceil$ for some positive integer $K$ satisfying $2K+1<|\Omega|$, the number of distinct frequencies computed at every agent can be reduced.

To design the mode consensus algorithms, the upper bound of the network size, $\bar{N}$, is required as a priori information. The convergence time is always finite.

A nice feature of the proposed algorithms is the plug-and-play property. When a new agent plugs in or leaves the network, only passive manipulations are needed for the agents in order to restore consensus to a new mode.

Future work along this direction includes extending the present results to the case of directed graphs, developing second (or higher) order algorithms to reach faster convergence, and adaptively adjusting the gains so that the need for knowledge of $\bar{N}$ can be removed. While a piecewise-constant interaction graph is considered in this paper, gossip or other stochastic sorts of interactions are also of interest as a future direction of research.

Appendix: Proof of Theorem 1

We shall prove Theorem 1 following the idea of [6]. By letting $y=[y_1,y_2,\cdots,y_N]^T$ and $b=[I(a,a_1),I(a,a_2),\cdots,I(a,a_N)]^T$, the overall system (5)–(6) is rewritten as

$\dot{y}(t) = h\big[-(\gamma L+e_1e_1^T)y(t)+b\big]$   (13)

where $e_1=[1,0,\cdots,0]^T\in\mathbb{R}^N$, and $h_y$ and $\gamma_y$ in (5) and $\bar{N}$ are written as $h$, $\gamma$, and $N$, respectively, for simplicity.

Let $\lambda_2(L)$ denote the second smallest eigenvalue of the symmetric Laplacian matrix $L$, which is nonzero under Assumption 1. Then, the following lemma is a key to the proof.

Lemma 2 (Lemma 1 in [6])

Suppose that Assumption 1 holds. If $\gamma>0$, then the matrix $(\gamma L+e_1e_1^T)$ is positive definite. Moreover, if $\gamma\geq N/\lambda_2(L)$, then $\lambda_{\min}(\gamma L+e_1e_1^T)\geq 1/(4N)$, where $\lambda_{\min}(A)$ is the smallest eigenvalue of a symmetric matrix $A$.

Under Lemma 2, system (13) has the exponentially stable equilibrium

$y^* = (\gamma L+e_1e_1^T)^{-1}b.$
Lemma 3

Let $y_i^*$ be the $i$-th element of $y^*$. Under Assumption 1,

  1. $y_1^* = 1_N^Tb = \mathcal{F}(a)$,

  2. for $i=2,\dots,N$,

     $|y_i^*-y_1^*| < \dfrac{1}{\gamma}\dfrac{\sqrt{2}N^3}{4}.$   (14)
Proof:

Observe that $1_N^T(\gamma L+e_1e_1^T)=e_1^T$ because $1_N^TL=0$ and $1_N^Te_1=1$. Then,

$y_1^* = e_1^Ty^* = 1_N^T(\gamma L+e_1e_1^T)(\gamma L+e_1e_1^T)^{-1}b = 1_N^Tb,$

which proves the first claim. Under Assumption 1, the matrix $L$ has exactly one zero eigenvalue and there exists an orthogonal matrix $U\in\mathbb{R}^{N\times N}$ such that

$U = \begin{bmatrix}\frac{1}{\sqrt{N}}1_N & Q\end{bmatrix}\quad\text{and}\quad LU = U\begin{bmatrix}0 & 0\\ 0 & \Lambda\end{bmatrix}$

where $Q\in\mathbb{R}^{N\times(N-1)}$ has orthonormal columns, and $\Lambda\in\mathbb{R}^{(N-1)\times(N-1)}$ is real and diagonal. Then, with the coordinate change

$\begin{bmatrix}\eta_1\\ \tilde{\eta}\end{bmatrix} := U^Ty = \begin{bmatrix}\frac{1}{\sqrt{N}}1_N^T\\ Q^T\end{bmatrix}y,$

the equilibrium $y^*$ can be expressed as

$y^* = \frac{1}{\sqrt{N}}1_N\eta_1^* + Q\tilde{\eta}^*$   (15)

where $\eta_1^*=\lim_{t\to\infty}\eta_1(t)$ and $\tilde{\eta}^*=\lim_{t\to\infty}\tilde{\eta}(t)$, whose existence follows from the fact that $\lim_{t\to\infty}y(t)=y^*$. In fact, we observe that

$\dot{\tilde{\eta}} = Q^T\dot{y} = hQ^T(-\gamma Ly - e_1e_1^Ty + b) = h\big(-\gamma Q^TLQ\tilde{\eta} - Q^Te_1y_1 + Q^Tb\big)$

where $Q^TLQ=\Lambda$, which is positive definite. Therefore, we see that $\lim_{t\to\infty}\tilde{\eta}(t)=\tilde{\eta}^*=(1/\gamma)\Lambda^{-1}Q^T(b-e_1y_1^*)$. Now, by (15) and by $y_1^*=1_N^Tb$, we have that

$e_1^Ty^* = y_1^* = \frac{1}{\sqrt{N}}\eta_1^* + \frac{1}{\gamma}e_1^TQ\Lambda^{-1}Q^T(I-e_11_N^T)b,$

from which $\eta_1^*$ is obtained. With the expressions for $\eta_1^*$ and $\tilde{\eta}^*$, equation (15) yields

$y^* = 1_Ny_1^* + \frac{1}{\gamma}(I-1_Ne_1^T)Q\Lambda^{-1}Q^T(I-e_11_N^T)b.$

Therefore,

$y_i^*-y_1^* = \frac{1}{\gamma}e_i^T(I-1_Ne_1^T)Q\Lambda^{-1}Q^T(I-e_11_N^T)b = \frac{1}{\gamma}(e_i^T-e_1^T)Q\Lambda^{-1}Q^T(I-e_11_N^T)b.$

Noting that, for $N\geq 2$,

$\big|(I-e_11_N^T)b\big|^2 = \Big(\sum_{i=2}^N b_i\Big)^2 + \sum_{i=2}^N b_i^2 \leq (N-1)\sum_{i=2}^N b_i^2 + \sum_{i=2}^N b_i^2 = N\sum_{i=2}^N b_i^2 < N^2,$

we finally obtain

$|y_i^*-y_1^*| < \frac{1}{\gamma}\cdot\sqrt{2}\cdot\frac{1}{\lambda_2(L)}\cdot N.$

Recalling that $\lambda_2(L)\geq 4/N^2$ [13], (14) follows. ∎

Now, it follows from (13) that

$y(t)-y^* = e^{-h(\gamma L+e_1e_1^T)t}(y(0)-y^*).$

Here we note that, since the matrix $(\gamma L+e_1e_1^T)$ is symmetric,

$\|e^{-h(\gamma L+e_1e_1^T)t}\| \leq ke^{-\lambda_{\min}(\gamma L+e_1e_1^T)ht}\quad\text{with } k=1.$

The assumption $\gamma\geq N^3$ of Theorem 1 implies $\gamma\geq N^3/4\geq N/\lambda_2(L)$ by the fact that $\lambda_2(L)\geq 4/N^2$ [13], so that Lemma 2 guarantees $\lambda_{\min}(\gamma L+e_1e_1^T)\geq 1/(4N)$. The assumption $\gamma\geq N^3$ also implies that $|y_i^*-y_1^*|<\sqrt{2}/4$ by (14), with which we note that $-\sqrt{2}/4<y_i^*<N+\sqrt{2}/4$ for all $1\leq i\leq N$ because $0\leq y_1^*=\mathcal{F}(a)\leq N$. This implies that $\|y(0)-y^*\|<M_{\mathcal{K}_y}\sqrt{N}$, where $M_{\mathcal{K}_y}=N+1$ is the size of the interval $\mathcal{K}_y=[-0.5,N+0.5]$, since $y_i(0)\in\mathcal{K}_y$ by assumption. Putting this together, we obtain

$|y_i(t)-y_1^*| \leq |y_i(t)-y_i^*| + |y_i^*-y_1^*| \leq \|y(t)-y^*\| + |y_i^*-y_1^*| < e^{-\frac{h}{4N}t}\cdot M_{\mathcal{K}_y}\sqrt{N} + \frac{\sqrt{2}}{4}.$

Therefore, when

$t > \frac{4N}{h}\ln\frac{4M_{\mathcal{K}_y}\sqrt{N}}{2-\sqrt{2}} =: \mathcal{T}_y(N),$

we have that $|y_i(t)-\mathcal{F}(a)|<1/2$ for all $i$.
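Both claims of Lemma 3, on which the finite-time guarantee just derived rests, are easy to confirm numerically; the sketch below checks $y_1^*=1_N^Tb$ and the bound (14) on a hypothetical undirected graph (a ring plus random chords, so that connectivity is guaranteed):

```python
import numpy as np

# Numerical check of Lemma 3 for y* = (gamma L + e1 e1^T)^{-1} b.
rng = np.random.default_rng(1)
N = 12
A = np.zeros((N, N))
for i in range(N):                       # ring backbone guarantees connectivity
    A[i, (i - 1) % N] = A[i, (i + 1) % N] = 1.0
extra = np.triu((rng.random((N, N)) < 0.2).astype(float), 2)
A = np.maximum(A, extra + extra.T)       # add random chords
L = np.diag(A.sum(axis=1)) - A

gamma = float(N**3)                      # the gain assumed in Theorem 1
b = (rng.random(N) < 0.3).astype(float)  # b_i = I(a, a_i) for some attribute a
e1 = np.zeros(N); e1[0] = 1.0
y_star = np.linalg.solve(gamma * L + np.outer(e1, e1), b)

print(abs(y_star[0] - b.sum()) < 1e-9)   # claim 1: y1* = F(a)
print(np.abs(y_star - y_star[0]).max()
      < np.sqrt(2) * N**3 / (4 * gamma))  # claim 2: the gap bound (14)
```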

References

  [1] Kuhn, F., Locher, T., and Schmid, S. Distributed computation of the mode. Proceedings of the Twenty-Seventh ACM Symposium on Principles of Distributed Computing, 2008, 15–24.
  [2] Bénézit, F., Thiran, P., and Vetterli, M. Interval consensus: From quantized gossip to voting. Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP'09), 2009, 3661–3664.
  [3] Bénézit, F., Thiran, P., and Vetterli, M. The distributed multiple voting problem. IEEE Journal of Selected Topics in Signal Processing, 2011, 5: 791–804.
  [4] Salehkaleybar, S., Sharif-Nassab, A., and Golestani, S. J. Distributed voting/ranking with optimal number of states per node. IEEE Transactions on Signal and Information Processing over Networks, 2015, 1: 259–267.
  [5] Huang, C., Anderson, B. D. O., Zhang, H., and Yan, H. Distributed convex optimization as a tool for solving f-consensus problems. Automatica, 2023, 115: 111087.
  [6] Lee, D., Lee, S., Kim, T., and Shim, H. Distributed algorithm for the network size estimation: Blended dynamics approach. 2018 IEEE Conference on Decision and Control (CDC), 2018, 4577–4582.
  [7] Dashti, Z. A. Z. S., Oliva, G., Seatzu, C., Gasparri, A., and Franceschelli, M. Distributed mode computation in open multi-agent systems. IEEE Control Systems Letters, 2022, 6: 3481–3486.
  [8] Lee, J. G. and Shim, H. Design of heterogeneous multi-agent system for distributed computation. Trends in Nonlinear and Adaptive Control: A Tribute to Laurent Praly for his 65th Birthday, Springer, 2021, 83–108.
  [9] Cortés, J. Distributed algorithms for reaching consensus on general functions. Automatica, 2008, 44: 726–737.
  [10] Olfati-Saber, R., Fax, J., and Murray, R. Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE, 2007, 95: 215–233.
  [11] Cortés, J. Discontinuous dynamical systems. IEEE Control Systems Magazine, 2008, 28: 36–73.
  [12] Li, W., Zeng, X., Liang, S., and Hong, Y. Exponentially convergent algorithm design for constrained distributed optimization via nonsmooth approach. IEEE Transactions on Automatic Control, 2022, 67: 934–940.
  [13] Mohar, B. Eigenvalues, diameter, and mean distance in graphs. Graphs and Combinatorics, 1991, 7: 53–64.