
Distributed State Estimation over Time-Varying Graphs: Exploiting the Age-of-Information

Aritra Mitra, John A. Richards, Saurabh Bagchi and Shreyas Sundaram A. Mitra, S. Bagchi and S. Sundaram are with the School of Electrical and Computer Engineering at Purdue University. J. A. Richards is with Sandia National Laboratories. Email: {mitra14, sbagchi, sundara2}@purdue.edu, [email protected]. This work was supported in part by NSF CAREER award 1653648, and by the Laboratory Directed Research and Development program at Sandia National Laboratories. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. The views expressed in the article do not necessarily represent the views of the U.S. Department of Energy or the United States Government.
Abstract

We study the problem of designing a distributed observer for an LTI system over a time-varying communication graph. The limited existing work on this topic imposes various restrictions either on the observation model or on the sequence of communication graphs. In contrast, we propose a single-time-scale distributed observer that works under mild assumptions. Specifically, our communication model only requires strong-connectivity to be preserved over non-overlapping, contiguous intervals that are even allowed to grow unbounded over time. We show that under suitable conditions that bound the growth of such intervals, joint observability is sufficient to track the state of any discrete-time LTI system exponentially fast, at any desired rate. In fact, we also establish finite-time convergence based on our approach. Finally, we develop a variant of our algorithm that is provably robust to worst-case adversarial attacks, provided the sequence of graphs is sufficiently connected over time. The key to our approach is the notion of a “freshness-index” that keeps track of the age-of-information being diffused across the network. Such indices enable nodes to reject stale estimates of the state, and, in turn, contribute to stability of the error dynamics.

I Introduction

Given a discrete-time LTI system $\mathbf{x}[k+1]=\mathbf{A}\mathbf{x}[k]$ and a linear measurement model $\mathbf{y}[k]=\mathbf{C}\mathbf{x}[k]$, a classical result in control theory states that one can design an observer that generates an asymptotically correct estimate $\hat{\mathbf{x}}[k]$ of the state $\mathbf{x}[k]$ if and only if the pair $(\mathbf{A},\mathbf{C})$ is detectable [1]. Additionally, if the pair $(\mathbf{A},\mathbf{C})$ is observable, then one can achieve exponential convergence at any desired convergence rate. Over the last couple of decades, significant effort has been directed towards studying the distributed counterpart of the above problem, wherein observations of the process are distributed among a set of sensors modeled as nodes of a communication graph [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]. A fundamental question that arises in this context is as follows: What are the minimal requirements on the measurement structure of the nodes and the underlying communication graph that guarantee the existence of a distributed observer? Here, by a distributed observer, we mean a set of state-estimate update and information exchange rules that enable each node to track the entire state asymptotically. The question posed above was answered only recently in [5, 6, 7, 8, 9, 10] for static graphs. However, when the underlying network changes with time, very little is known about the distributed state estimation problem, a gap that we intend to bridge in this paper. Our study is motivated by applications in environmental monitoring tasks using mobile robot teams [13, 14], where time-varying communication graphs arise naturally either as a consequence of robot mobility, or due to packet drops and link failures typical in wireless sensor networks.

Related Work: The existing literature on the design of distributed observers can be broadly classified in terms of the assumptions made on the system model, the observation model, and the network structure. Recent works on this topic [5, 6, 7, 8, 9, 10] that make minimal system- and graph-theoretic assumptions can be further classified based on a finer set of attributes, as follows. (i) Does the approach require multiple consensus iterations between two consecutive time-steps of the dynamics? (Such approaches, referred to as two-time-scale approaches, incur a high communication cost, and might hence be prohibitive for real-time applications.) (ii) What is the dimension of the estimator maintained by each node? (iii) Is the convergence asymptotic, or in finite time? (iv) If the convergence is asymptotic, can the convergence rate be controlled? The techniques proposed in [5, 6, 7, 8, 9, 10] operate on a single time-scale, and those in [6, 8, 9, 10] require observers of dimension no more than that of the state of the system. The notion of convergence in each of the papers [5, 6, 7, 8, 9, 10] is asymptotic; the ones in [7, 8, 10] can achieve exponential convergence at any desired rate. For dynamic networks, there is much less work. Two recent papers, [11] and [15], provide theoretical guarantees for certain specific classes of time-varying graphs; while the former proposes a two-time-scale approach, the latter develops a single-time-scale algorithm. However, the extent to which such results can be generalized has remained an open question.

Contributions: The main contribution of this paper is the development of a single-time-scale distributed state estimation algorithm in Section IV that requires each node to maintain an estimator of dimension equal to that of the state (along with some simple counters), and works under assumptions that are remarkably mild in comparison with those in the existing literature. Specifically, in terms of the observation model, we only require joint observability, i.e., the state is observable w.r.t. the collective observations of the nodes. This assumption is quite standard, and can be relaxed to joint detectability, with appropriate implications for the convergence rate.

However, the key departure from existing literature comes from the generality of our communication model, introduced formally in Section II. Based on this model, we require strong-connectivity to be preserved over non-overlapping, contiguous intervals that can even grow linearly with time at a certain rate. In other words, we allow the inter-communication intervals between the nodes to potentially grow unbounded. Even under the regime of such sparse communications, we establish that our proposed approach can track the state of any discrete-time LTI system exponentially fast, at any desired convergence rate. While all the works on distributed state estimation that we are aware of provide an asymptotic analysis, estimating the state of the system in finite-time might be crucial in certain safety-critical applications. To this end, we show that under a suitable selection of the observer gains, one can achieve finite-time convergence based on our approach. To put our results into context, it is instructive to compare them with the work closest to ours, namely [11]. In [11], the authors study a continuous-time analog of the problem under consideration, and develop a solution that leverages an elegant connection to the problem of distributed linear-equation solving [16]. In contrast to our technique, the one in [11] is inherently a two-time-scale approach, requires each node to maintain and update auxiliary state estimates, and works under the assumption that the communication graph is strongly-connected at every instant.

Our work is also related to the vast literature on consensus [17] and distributed optimization [18] over time-varying graphs, where one typically assumes strong-connectivity to be preserved over uniformly bounded intervals, a special case of our communication model. It is important to recognize that, in contrast to these settings, the problem at hand requires tracking potentially unstable external dynamics, making the stability analysis considerably more challenging. In particular, one can no longer directly leverage convergence properties of products of stochastic matrices. From a system-theoretic point of view, the error dynamics under our communication model evolve as a switched linear system in which certain modes are potentially unstable. Thus, one way to analyze such dynamics is by drawing on techniques from the switched-system stability literature [19]. However, such an analysis can potentially obscure the role played by the network structure. Instead, we directly exploit the interplay between the structure of the changing graph patterns on the one hand, and the evolution of the estimation error dynamics on the other. In doing so, we provide a comprehensive stability analysis of our estimation algorithm employing simple, self-contained arguments from linear system theory and graph theory.

Finally, in Section VI, we extend our study of distributed state estimation over dynamic graphs to settings where certain nodes are under attack, and can act arbitrarily. For the problem under consideration, accounting for adversarial behavior is highly non-trivial even for static graphs, with only a few recent results that provide any theoretical guarantees [20, 21, 22, 23]. Accounting for worst-case adversarial behavior over time-varying networks is considerably harder, since the compromised nodes can not only transmit incorrect, inconsistent state estimates, but also lie about the time-stamps of their data. Nonetheless, we develop a novel algorithm to handle such scenarios, and establish its correctness when the graph-sequence is sufficiently connected over time.

The key idea behind our approach is the use of a suitably defined “freshness-index” that keeps track of the age-of-information being diffused across the network. Loosely speaking, the “freshness-index” of a node quantifies the extent to which its estimate of the state is outdated. Exchanging such indices enables a node to assess, in real-time, the quality of the estimate of a neighbor. Accordingly, it can reject estimates that are relatively stale, a fact that contributes to the stability of the error dynamics. We point out that while this is perhaps the first use of the notion of age-of-information (AoI) in a networked control/estimation setting, such a concept has been widely employed in the study of various queuing-theoretic problems arising in wireless networks [24, 25, 26]. (The notion of AoI was first introduced in [24] as a performance metric to keep track of real-time status updates in a communication system; in a wireless network, it measures the time elapsed since the generation of the packet most recently delivered to the destination.) In Section IV, we will see how such a concept applies to the present setting.

To sum up, we propose the first single-time-scale distributed state estimation algorithm that provides both finite-time and exponentially fast convergence guarantees, under significantly milder assumptions on the time-varying graph sequences than those in the existing literature. In addition, we show how our algorithm can be extended to allow for worst-case adversarial attacks as well.

A preliminary version of this paper appeared as [27]. Here, we significantly expand upon the content in [27] in two main directions. First, we generalize the results in [27] to allow for unbounded inter-communication intervals. Second, the extension to adversarial scenarios in Section VI is a new addition entirely. We also provide full proofs of all our results.

II Problem Formulation and Background

System and Measurement Model: We are interested in collaborative state estimation of a discrete-time LTI system of the form:

$\mathbf{x}[k+1]=\mathbf{A}\mathbf{x}[k],$ (1)

where $k\in\mathbb{N}$ is the discrete-time index, $\mathbf{A}\in\mathbb{R}^{n\times n}$ is the system matrix, and $\mathbf{x}[k]\in\mathbb{R}^{n}$ is the state of the system. (We use $\mathbb{N}$ and $\mathbb{N}_{+}$ to denote the set of non-negative integers and the set of positive integers, respectively.) A network of sensors, modeled as nodes of a communication graph, obtain partial measurements of the state of the above process as follows:

$\mathbf{y}_{i}[k]=\mathbf{C}_{i}\mathbf{x}[k],$ (2)

where $\mathbf{y}_{i}[k]\in\mathbb{R}^{r_{i}}$ represents the measurement vector of the $i$-th node at time-step $k$, and $\mathbf{C}_{i}\in\mathbb{R}^{r_{i}\times n}$ represents the corresponding observation matrix. Let $\mathbf{y}[k]={\begin{bmatrix}\mathbf{y}^{T}_{1}[k]&\cdots&\mathbf{y}^{T}_{N}[k]\end{bmatrix}}^{T}$ and $\mathbf{C}={\begin{bmatrix}\mathbf{C}^{T}_{1}&\cdots&\mathbf{C}^{T}_{N}\end{bmatrix}}^{T}$ represent the collective measurement vector at time-step $k$, and the collective observation matrix, respectively. The goal of each node $i$ in the network is to generate an asymptotically correct estimate $\hat{\mathbf{x}}_{i}[k]$ of the true state $\mathbf{x}[k]$. It may not be possible for any node $i$ in the network to accomplish this task in isolation, since the pair $(\mathbf{A},\mathbf{C}_{i})$ may not be detectable in general. Throughout the paper, we will only assume that the pair $(\mathbf{A},\mathbf{C})$ is observable; the subsequent developments can be readily generalized to the case when $(\mathbf{A},\mathbf{C})$ is detectable.
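To make the joint-observability assumption concrete, here is a minimal numerical sketch (the matrices below are our own illustrative assumptions, not from the paper): a two-state system split between two sensors, where neither pair $(\mathbf{A},\mathbf{C}_{i})$ is observable on its own, yet the stacked pair $(\mathbf{A},\mathbf{C})$ is.

```python
# Minimal sketch of joint observability: each sensor alone fails the
# Kalman rank test, but the stacked observation matrix passes it.
# The matrices are illustrative assumptions, not from the paper.
import numpy as np

def observability_matrix(A, C):
    """Stack [C; CA; ...; C A^(n-1)] for the Kalman rank test."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

A = np.diag([2.0, 3.0])        # unstable, decoupled modes
C1 = np.array([[1.0, 0.0]])    # node 1 sees only the first state
C2 = np.array([[0.0, 1.0]])    # node 2 sees only the second state
C = np.vstack([C1, C2])        # collective observation matrix

for Ci in (C1, C2, C):
    print(np.linalg.matrix_rank(observability_matrix(A, Ci)))
# ranks: 1, 1, 2 -- only the collective pair is observable
```

The same rank test underlies the detectability discussion: with $\mathbf{A}$ unstable in both modes, no single node can estimate the full state in isolation.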

Communication Network Model: As is evident from the above discussion, information exchange among nodes is necessary for all nodes to estimate the full state. At each time-step $k\in\mathbb{N}$, the available communication channels are modeled by a directed communication graph $\mathcal{G}[k]=(\mathcal{V},\mathcal{E}[k])$, where $\mathcal{V}=\{1,\ldots,N\}$ represents the set of nodes, and $\mathcal{E}[k]$ represents the edge set of $\mathcal{G}[k]$ at time-step $k$. Specifically, if $(i,j)\in\mathcal{E}[k]$, then node $i$ can send information directly to node $j$ at time-step $k$; in such a case, node $i$ will be called a neighbor of node $j$ at time-step $k$. We will use $\mathcal{N}_{i}[k]$ to represent the set of all neighbors (excluding node $i$) of node $i$ at time-step $k$. When $\mathcal{G}[k]=\mathcal{G},\forall k\in\mathbb{N}$, where $\mathcal{G}$ is a static, directed communication graph, the necessary and sufficient condition (on the system and network) to solve the distributed state estimation problem is the joint detectability of each source component of $\mathcal{G}$ [5]. (A source component of a static, directed graph is a strongly connected component with no incoming edges.) Our goal in this paper is to extend the above result to scenarios where the underlying communication graph is allowed to change over time. To this end, let the union graph over an interval $[k_{1},k_{2}]$, $0\leq k_{1}\leq k_{2}$, denoted $\bigcup_{\tau=k_{1}}^{k_{2}}\mathcal{G}[\tau]$, indicate a graph with vertex set equal to $\mathcal{V}$, and edge set equal to the union of the edge sets of the individual graphs appearing over the interval $[k_{1},k_{2}]$. Based on this convention, we now formally describe the communication patterns (induced by the sequence $\{\mathcal{G}[k]\}^{\infty}_{k=0}$) that are considered in this paper.
We assume that there exists a sequence $\mathbb{I}=\{t_{0},t_{1},\ldots\}$ of increasing time-steps with $t_{0}=0$ and each $t_{i}\in\mathbb{N}$, satisfying the following conditions.

  1. (C1)

    Define the mapping $f:\mathbb{I}\rightarrow\mathbb{N}_{+}$ as $f(t_{q})=t_{q+1}-t_{q},\forall t_{q}\in\mathbb{I}$. We assume that $f(t_{q})$ is a non-decreasing function of its argument.

  2. (C2)

    For each $k\in\mathbb{N}$, let $m(k)\triangleq\max\{t_{q}\in\mathbb{I}:t_{q}\leq k\}$, and $M(k)\triangleq\min\{t_{q}\in\mathbb{I}:t_{q}>k\}$. Define $g:\mathbb{N}\rightarrow\mathbb{N}_{+}$ as $g(k)=M(k)-m(k)=f(m(k))$. Then, we assume that the following holds:

    $\limsup_{k\to\infty}\frac{2(N-1)g(k)}{k}=\delta<1.$ (3)
  3. (C3)

    For each $t_{q}\in\mathbb{I}$, we assume that $\bigcup_{\tau=t_{q}}^{t_{q+1}-1}\mathcal{G}[\tau]$ is strongly-connected.

Let us discuss what the above conditions mean. Condition (C1), in conjunction with condition (C3), tells us that the intervals over which strong-connectivity is preserved are non-decreasing in length (evident from the monotonicity of the function $f(\cdot)$). (Our results can be generalized to account for the case when $f(\cdot)$ is non-monotonic by suitably modifying condition (C2).) Essentially, our aim here is to come up with distributed estimators that function correctly despite sparse communication; hence, we allow for potentially growing inter-communication intervals. Since we place no assumptions at all on the spectrum of the $\mathbf{A}$ matrix, stability of the estimation error process imposes natural restrictions on how fast the inter-communication intervals can be allowed to grow. Condition (C2) formalizes this intuition by constraining such intervals to grow at most linearly, at a certain rate. Notably, conditions (C1)-(C3) cover a very general class of time-varying graph sequences. In particular, we are unaware of any other work that allows the inter-communication intervals to grow unbounded for the problem under consideration.

Consider the following two examples. (i) The mapping $f$ satisfies $f(t_{q})=T,\forall t_{q}\in\mathbb{I}$, where $T$ is some positive integer. (ii) The mapping $f$ satisfies $f(t_{q})=\lfloor\sqrt{t_{q}+1}\rfloor,\forall t_{q}\in\mathbb{I}$. It is easy to verify that (C2) is satisfied in each case. While example (i) represents the standard “joint strong-connectivity” setting where the inter-communication intervals remain bounded, example (ii) deviates from existing literature by allowing the inter-communication intervals to grow unbounded.
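Example (ii) can be checked numerically. The following sketch (our own; the network size $N=5$ is an arbitrary illustrative choice) builds the sequence $\mathbb{I}$ from $f(t_{q})=\lfloor\sqrt{t_{q}+1}\rfloor$ and evaluates the ratio in (3), which vanishes as $k$ grows since $g(k)$ scales like $\sqrt{k}$.

```python
# Numerical check of condition (C2) for example (ii): with
# f(t_q) = floor(sqrt(t_q + 1)), the ratio 2(N-1) g(k) / k tends to 0 < 1.
# N = 5 is an arbitrary illustrative network size.
import bisect
import math

N = 5
t = [0]
while t[-1] < 10**6:                          # build t_0, t_1, ...
    t.append(t[-1] + math.isqrt(t[-1] + 1))   # t_{q+1} = t_q + f(t_q)

def g(k):
    """g(k) = f(m(k)): length of the interval [t_q, t_{q+1}) containing k."""
    q = bisect.bisect_right(t, k) - 1         # index of m(k)
    return t[q + 1] - t[q]

for k in (10**3, 10**4, 10**5):
    print(k, 2 * (N - 1) * g(k) / k)          # ratio shrinks as k grows
```

Replacing the update with `t.append(t[-1] + T)` for a fixed `T` reproduces example (i), where the ratio likewise tends to zero since `g(k) = T` is bounded.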

Background: For communication graphs satisfying conditions (C1)-(C3) as described above, our objective will be to design a distributed algorithm that ensures $\lim_{k\to\infty}\left\|\hat{\mathbf{x}}_{i}[k]-\mathbf{x}[k]\right\|=0,\forall i\in\mathcal{V}$, with $\hat{\mathbf{x}}_{i}[k]$ representing the estimate of the state $\mathbf{x}[k]$ maintained by node $i\in\mathcal{V}$. To this end, we recall the following result from [6].

Lemma 1.

Given a system matrix $\mathbf{A}$, and a set of $N$ sensor observation matrices $\mathbf{C}_{1},\mathbf{C}_{2},\ldots,\mathbf{C}_{N}$, define $\mathbf{C}\triangleq{\begin{bmatrix}\mathbf{C}^{T}_{1}&\cdots&\mathbf{C}^{T}_{N}\end{bmatrix}}^{T}$. Suppose $(\mathbf{A},\mathbf{C})$ is observable. Then, there exists a similarity transformation matrix $\mathbf{T}$ that transforms the pair $(\mathbf{A},\mathbf{C})$ to $(\bar{\mathbf{A}},\bar{\mathbf{C}})$, such that

$\bar{\mathbf{A}}=\begin{bmatrix}\mathbf{A}_{11}&\mathbf{0}&\cdots&\mathbf{0}\\ \mathbf{A}_{21}&\mathbf{A}_{22}&\cdots&\mathbf{0}\\ \vdots&\vdots&\ddots&\vdots\\ \mathbf{A}_{N1}&\mathbf{A}_{N2}&\cdots&\mathbf{A}_{NN}\end{bmatrix},\qquad \bar{\mathbf{C}}=\begin{bmatrix}\bar{\mathbf{C}}_{1}\\ \bar{\mathbf{C}}_{2}\\ \vdots\\ \bar{\mathbf{C}}_{N}\end{bmatrix}=\begin{bmatrix}\mathbf{C}_{11}&\mathbf{0}&\cdots&\mathbf{0}\\ \mathbf{C}_{21}&\mathbf{C}_{22}&\cdots&\mathbf{0}\\ \vdots&\vdots&\ddots&\vdots\\ \mathbf{C}_{N1}&\mathbf{C}_{N2}&\cdots&\mathbf{C}_{NN}\end{bmatrix},$ (4)

and the pair $(\mathbf{A}_{ii},\mathbf{C}_{ii})$ is observable $\forall i\in\{1,2,\ldots,N\}$.

We use the matrix $\mathbf{T}$ given by Lemma 1 to perform the coordinate transformation $\mathbf{z}[k]=\mathbf{T}^{-1}\mathbf{x}[k]$, yielding:

$\mathbf{z}[k+1]=\bar{\mathbf{A}}\mathbf{z}[k],\qquad \mathbf{y}_{i}[k]=\bar{\mathbf{C}}_{i}\mathbf{z}[k],\quad\forall i\in\{1,\ldots,N\},$ (5)

where $\bar{\mathbf{A}}={\mathbf{T}}^{-1}\mathbf{A}\mathbf{T}$ and $\bar{\mathbf{C}}_{i}=\mathbf{C}_{i}\mathbf{T}$ are given by (4). Commensurate with the structure of $\bar{\mathbf{A}}$, the vector $\mathbf{z}[k]$ is of the following form:

$\mathbf{z}[k]={\begin{bmatrix}{\mathbf{z}^{(1)}[k]}^{T}&\cdots&{\mathbf{z}^{(N)}[k]}^{T}\end{bmatrix}}^{T},$ (6)

where $\mathbf{z}^{(j)}[k]$ will be referred to as the $j$-th sub-state. By construction, since the pair $(\mathbf{A}_{jj},\mathbf{C}_{jj})$ is locally observable w.r.t. the measurements of node $j$, node $j$ will be viewed as the unique source node for sub-state $j$. In this sense, the role of node $j$ will be to ensure that each non-source node $i\in\mathcal{V}\setminus\{j\}$ maintains an asymptotically correct estimate of sub-state $j$. For a time-invariant strongly-connected graph, this is achieved in [6] by first constructing a spanning tree rooted at node $j$, and then requiring nodes to only listen to their parents in such a tree for estimating sub-state $j$. The resulting unidirectional flow of information from the source $j$ to the rest of the network guarantees stability of the error process for sub-state $j$ [6].

However, the above strategy is no longer applicable when the underlying communication graph is time-varying, for the following reasons. (i) For a given sub-state $j$, there may not exist a common spanning tree rooted at node $j$ in each graph $\mathcal{G}[k],k\in\mathbb{N}$. (ii) Assuming that a specific spanning tree rooted at node $j$ is guaranteed to repeat at various points in time (not necessarily periodically) is restrictive, and qualifies as only a special case of conditions (C1)-(C3). (iii) Suppose for simplicity that $\mathcal{G}[k]$ is strongly-connected at each time-step (as in [11]), and hence, there exists a spanning tree $\mathcal{T}_{j}[k]$ rooted at node $j$ in each such graph. For estimating sub-state $j$, suppose consensus at time-step $k$ is performed along the spanning tree $\mathcal{T}_{j}[k]$. As we demonstrate in the next section, switching between such spanning trees can lead to unstable error processes over time. Thus, if one makes no further assumptions on the system model beyond joint observability, or on the sequence of communication graphs beyond conditions (C1)-(C3), ensuring stability of the estimation error dynamics becomes a challenging proposition. Nonetheless, we develop a simple algorithm in Section IV that bypasses the above problems. In the next section, we provide an example that helps to build the intuition behind this algorithm.

III An Illustrative Example

Figure 1: An LTI system is monitored by a network of 3 nodes, where the communication graph 𝒢[k]\mathcal{G}[k] switches between the two graphs shown above.
Figure 2: Estimation error plots of the nodes for the model in Figure 1. Simulations are performed for a model where $a=2$. The figure on the left corresponds to the case where consensus weights are distributed uniformly among neighbors, while the one on the right is the case where weights are placed along a tree rooted at node 1.

Consider a network of 3 nodes monitoring a scalar unstable process $x[k+1]=ax[k]$, as shown in Figure 1. The communication graph $\mathcal{G}[k]$ switches between the two topologies shown in Figure 1. Specifically, $\mathcal{G}[k]$ is the graph in Figure 1(a) at all even time-steps, and the one in Figure 1(b) at all odd time-steps. Node 1 is the only node with non-zero measurements, and thus acts as the source node for this network. Suppose for simplicity that it perfectly measures the state at all time-steps, i.e., its state estimate is $\hat{x}_{1}[k]=x[k],\forall k\in\mathbb{N}$. Given this setup, a standard consensus-based state estimate update rule would take the form (see, for example, [5, 6, 11]):

$\hat{x}_{i}[k+1]=a\left(\sum_{j\in\mathcal{N}_{i}[k]\cup\{i\}}w_{ij}[k]\hat{x}_{j}[k]\right),\quad i\in\{2,3\},$ (7)

where the weights $w_{ij}[k]$ are non-negative, and satisfy $\sum_{j\in\mathcal{N}_{i}[k]\cup\{i\}}w_{ij}[k]=1,\forall k\in\mathbb{N}$. The key question is: how should the consensus weights be chosen to guarantee stability of the estimation errors of nodes 2 and 3? Even for this simple example, if such weights are chosen naively, then the errors may grow unbounded over time. To see this, consider the following two choices: (1) consensus weights are distributed evenly over the set $\mathcal{N}_{i}[k]\cup\{i\}$, and (2) consensus weights are placed along the tree rooted at node 1. In each case, the error dynamics are unstable, as depicted in Figure 2. To overcome this problem, suppose nodes 2 and 3 are aware of the fact that node 1 has perfect information of the state. Since nodes 2 and 3 have no measurements of their own, intuitively, it makes sense that they should place their consensus weights entirely on node 1 whenever possible. The trickier question for node 2 (resp., node 3) is to decide when it should listen to node 3 (resp., node 2). Let us consider the situation from the perspective of node 2. At time-step 0, it adopts the information of node 1, and hence, the error of node 2 is zero at time-step 1. However, the error of node 3 is not necessarily zero at time-step 1. Consequently, if node 2 places a non-zero consensus weight on the estimate of node 3 at time-step 1, its error at time-step 2 might assume a non-zero value. Clearly, at time-step 1, node 2 is better off rejecting the information from node 3, and simply running open-loop. The main take-away here is that adoption or rejection of information from a neighbor should be based on the quality of information that such a neighbor has to offer. In particular, a node that has come in contact with node 1 more recently is expected to have better information about the state than the other.
Thus, to dynamically evaluate the quality of an estimate, the above reasoning suggests the need to introduce a metric that keeps track of how recent that estimate is with respect to (w.r.t.) the estimate of the source node 1. In the following section, we formalize this idea by introducing such a metric.
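The contrast described above can be reproduced in a few lines. The sketch below is our own: since the exact edge sets of Figure 1 are not spelled out in the text, we assume a representative alternation in which node 2 can hear nodes 1 and 3 at even steps, and node 3 can hear nodes 1 and 2 at odd steps; $e_i$ denotes the estimation error of node $i$, with $e_1=0$ since node 1 tracks the state exactly.

```python
# Sketch of the 3-node example with a = 2. Edge sets are assumed (they stand
# in for Figure 1): at even k, nodes 1 and 3 send to node 2; at odd k,
# nodes 1 and 2 send to node 3. We track errors e_i[k] directly (e_1 = 0).
a = 2.0

def step_uniform(e2, e3, k):
    # Weights spread evenly over in-neighbors and self, as in (7).
    if k % 2 == 0:                    # node 2 hears {1, 3}; node 3 open-loop
        return a * (0.0 + e2 + e3) / 3, a * e3
    else:                             # node 3 hears {1, 2}; node 2 open-loop
        return a * e2, a * (0.0 + e2 + e3) / 3

def step_freshest(e2, e3, k):
    # Adopt the neighbor with the freshest information; here the source
    # (zero error) is an in-neighbor whenever the node can hear anyone.
    if k % 2 == 0:
        return a * 0.0, a * e3        # node 2 copies node 1
    else:
        return a * e2, a * 0.0        # node 3 copies node 1

e_u, e_f = (1.0, 1.0), (1.0, 1.0)
for k in range(20):
    e_u = step_uniform(*e_u, k)
    e_f = step_freshest(*e_f, k)
print(abs(e_u[0]) + abs(e_u[1]))      # grows without bound
print(abs(e_f[0]) + abs(e_f[1]))      # driven to zero after two steps
```

Under the uniform rule the two-step error map has spectral radius greater than one for $a=2$, so the errors diverge; under the freshest-neighbor rule both errors are annihilated within two steps and stay at zero thereafter.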

Algorithm 1
1: Initialization: $\tau^{(j)}_{j}[0]=0$, $\tau^{(j)}_{i}[0]=\omega,\forall i\in\mathcal{V}\setminus\{j\}$.
2: Update Rules for the Source Node: Node $j$ maintains $\tau^{(j)}_{j}[k]=0,\forall k\in\mathbb{N}$. It updates $\hat{\mathbf{z}}^{(j)}_{j}[k]$ as:
$\hat{\mathbf{z}}^{(j)}_{j}[k+1]=\mathbf{F}_{jj}\hat{\mathbf{z}}^{(j)}_{j}[k]+\sum_{q=1}^{(j-1)}\mathbf{G}_{jq}\hat{\mathbf{z}}^{(q)}_{j}[k]+\mathbf{L}_{j}\mathbf{y}_{j}[k],$ (8)
where $\mathbf{F}_{jj}=(\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj})$, $\mathbf{G}_{jq}=(\mathbf{A}_{jq}-\mathbf{L}_{j}\mathbf{C}_{jq})$, and $\mathbf{L}_{j}$ is an observer gain to be designed later.
3: Update Rules for the Non-Source Nodes: Each non-source node $i\in\mathcal{V}\setminus\{j\}$ operates as follows.
4: Case 1: $\tau^{(j)}_{i}[k]=\omega$. Define $\mathcal{M}^{(j)}_{i}[k]\triangleq\{l\in\mathcal{N}_{i}[k]:\tau^{(j)}_{l}[k]\neq\omega\}$. If $\mathcal{M}^{(j)}_{i}[k]\neq\emptyset$, let $u\in\operatorname*{\arg\!\min}_{l\in\mathcal{M}^{(j)}_{i}[k]}\tau^{(j)}_{l}[k]$. Node $i$ updates $\tau^{(j)}_{i}[k]$ and $\hat{\mathbf{z}}^{(j)}_{i}[k]$ as:
$\tau^{(j)}_{i}[k+1]=\tau^{(j)}_{u}[k]+1,$ (9)
$\hat{\mathbf{z}}^{(j)}_{i}[k+1]=\mathbf{A}_{jj}\hat{\mathbf{z}}^{(j)}_{u}[k]+\sum_{q=1}^{(j-1)}\mathbf{A}_{jq}\hat{\mathbf{z}}^{(q)}_{i}[k].$ (10)
If $\mathcal{M}^{(j)}_{i}[k]=\emptyset$, then
$\tau^{(j)}_{i}[k+1]=\omega,$ (11)
$\hat{\mathbf{z}}^{(j)}_{i}[k+1]=\mathbf{A}_{jj}\hat{\mathbf{z}}^{(j)}_{i}[k]+\sum_{q=1}^{(j-1)}\mathbf{A}_{jq}\hat{\mathbf{z}}^{(q)}_{i}[k].$ (12)
5: Case 2: $\tau^{(j)}_{i}[k]\neq\omega$. Define $\mathcal{F}^{(j)}_{i}[k]\triangleq\{l\in\mathcal{M}^{(j)}_{i}[k]:\tau^{(j)}_{l}[k]<\tau^{(j)}_{i}[k]\}$, where $\mathcal{M}^{(j)}_{i}[k]$ is as defined in line 4. If $\mathcal{F}^{(j)}_{i}[k]\neq\emptyset$, let $u\in\operatorname*{\arg\!\min}_{l\in\mathcal{F}^{(j)}_{i}[k]}\tau^{(j)}_{l}[k]$. Node $i$ then updates $\tau^{(j)}_{i}[k]$ as per (9), and $\hat{\mathbf{z}}^{(j)}_{i}[k]$ as per (10). If $\mathcal{F}^{(j)}_{i}[k]=\emptyset$, then $\tau^{(j)}_{i}[k]$ is updated as
$\tau^{(j)}_{i}[k+1]=\tau^{(j)}_{i}[k]+1,$ (13)
and $\hat{\mathbf{z}}^{(j)}_{i}[k]$ is updated as per (12).

IV Algorithm

Building on the intuition developed in the previous section, we introduce a new approach to designing distributed observers for a general class of time-varying networks. The main idea is the use of a “freshness-index” that keeps track of how delayed the estimates of a node are w.r.t. the estimates of a source node. Specifically, for updating its estimate of $\mathbf{z}^{(j)}[k]$, each node $i\in\mathcal{V}$ maintains and updates at every time-step a freshness-index $\tau^{(j)}_{i}[k]$. At each time-step $k\in\mathbb{N}$, the index $\tau^{(j)}_{i}[k]$ plays the following role: it determines whether node $i$ should adopt the information received from one of its neighbors in $\mathcal{N}_{i}[k]$, or run open-loop, for updating $\hat{\mathbf{z}}^{(j)}_{i}[k]$, where $\hat{\mathbf{z}}^{(j)}_{i}[k]$ represents the estimate of $\mathbf{z}^{(j)}[k]$ maintained by node $i$. In the former case, it also indicates which specific neighbor in $\mathcal{N}_{i}[k]$ node $i$ should listen to at time-step $k$; this piece of information is particularly important for the problem under consideration, and ensures stability of the error process. The rules that govern the updates of the freshness indices $\tau^{(j)}_{i}[k]$, and the estimates of the $j$-th sub-state $\mathbf{z}^{(j)}[k]$, are formally stated in Algorithm 1. In what follows, we describe each of these rules.

Discussion of Algorithm 1: Consider any sub-state $j\in\{1,\ldots,N\}$. Each node $i\in\mathcal{V}$ maintains an index $\tau^{(j)}_{i}[k]\in\{\omega\}\cup\mathbb{N}$, where $\omega$ is a dummy value. Specifically, $\tau^{(j)}_{i}[k]=\omega$ represents an “infinite delay” w.r.t. the estimate of the source node for sub-state $j$, namely node $j$; i.e., it indicates that node $i$ has not received any information (either directly or indirectly) from node $j$ regarding sub-state $j$ up to time-step $k$. For estimation of sub-state $j$, since delays are measured w.r.t. the source node $j$, node $j$ maintains its freshness-index $\tau^{(j)}_{j}[k]$ at zero for all time, to indicate a zero delay w.r.t. itself. For updating its estimate of $\mathbf{z}^{(j)}[k]$, it uses only its own information, as is evident from (8).

Every other node starts out with an "infinite delay" $\omega$ w.r.t. the source (line 1 of Algo. 1). The freshness-index of a node $i\in\mathcal{V}\setminus\{j\}$ changes from $\omega$ to a finite value when it comes in contact with a neighbor with a finite delay, i.e., with a freshness-index other than $\omega$ (line 4 of Algo. 1). At this point, we say that $\tau^{(j)}_{i}[k]$ has been "triggered". Once triggered, at each time-step $k$, a non-source node $i$ will adopt the information of a neighbor $l\in\mathcal{N}_{i}[k]$ only if node $l$'s estimate of $\mathbf{z}^{(j)}[k]$ is "fresher" than its own, i.e., only if $\tau^{(j)}_{l}[k]<\tau^{(j)}_{i}[k]$.⁶ Among the set of neighbors in $\mathcal{M}^{(j)}_{i}[k]$ (if $\tau^{(j)}_{i}[k]$ has not yet been triggered), or in $\mathcal{F}^{(j)}_{i}[k]$ (if $\tau^{(j)}_{i}[k]$ has been triggered), node $i$ only adopts the information (based on (10)) of the neighbor $u$ with the least delay. At this point, the delay of node $i$ matches that of node $u$, a fact captured by the update rule (9). In case node $i$ has no neighbor with fresher information than its own w.r.t. sub-state $j$ (where freshness is quantified by $\tau^{(j)}_{i}[k]$), it increments its own freshness-index by $1$ (as per (13)) to capture the effect of its own information getting older, and runs open-loop based on (12). Based on the above rules, at any given time-step $k$, $\tau^{(j)}_{i}[k]$ measures the age-of-information of $\hat{\mathbf{z}}^{(j)}_{i}[k]$, relative to the source node $j$. This fact is established later in Lemma 23.

⁶Under Case 1 or Case 2 in Algo. 1, when a node $i\in\mathcal{V}\setminus\{j\}$ updates $\tau^{(j)}_{i}[k]$ via (9), and $\hat{\mathbf{z}}^{(j)}_{i}[k]$ via (10), we say that "$i$ adopts the information of $u$ at time $k$ for sub-state $j$"; else, if it runs open-loop, we say it adopts its own information.
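To make the adopt-or-run-open-loop rule concrete, the following Python sketch implements one time-step of a non-source node for a single sub-state. This is our own illustration, not the authors' code: the names (`OMEGA`, `update_non_source`, `A_j`) are ours, we collapse the two neighbor sets $\mathcal{M}^{(j)}_i[k]$ and $\mathcal{F}^{(j)}_i[k]$ into a single list of (index, estimate) pairs, and we abstract the sub-state dynamics of (10) and (12) into a matrix `A_j`.

```python
import numpy as np

OMEGA = float('inf')  # dummy value "omega": infinite delay w.r.t. the source

def update_non_source(tau_i, z_i, neighbor_info, A_j):
    """One time-step of a non-source node i for sub-state j.

    tau_i:  current freshness-index of node i (OMEGA if never triggered)
    z_i:    current estimate of z^(j)[k]
    neighbor_info: list of (tau_l, z_l) pairs from instantaneous neighbors
    A_j:    dynamics matrix of the j-th sub-state
    Returns the new (freshness-index, estimate) pair.
    """
    # Neighbors with strictly fresher information than node i
    fresher = [(t, z) for (t, z) in neighbor_info if t < tau_i]
    if fresher:
        # Adopt the estimate of the neighbor u with the least delay;
        # node i's delay then matches node u's, aged by one step (cf. (9)-(10))
        t_u, z_u = min(fresher, key=lambda p: p[0])
        return t_u + 1, A_j @ z_u
    # No fresher neighbor: run open-loop, and age own information (cf. (12)-(13));
    # an untriggered node keeps its index at OMEGA (cf. (18) of Algo. 2's analogue)
    return (OMEGA if tau_i == OMEGA else tau_i + 1), A_j @ z_i
```

Note the single asymmetry in the rule: an untriggered node keeps its index at `OMEGA` while running open-loop, whereas a triggered node ages its index by one.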
Finally, note that Algorithm 1 describes an approach for estimating $\mathbf{z}[k]$, and hence $\mathbf{x}[k]$, since $\mathbf{x}[k]=\mathbf{T}\mathbf{z}[k]$.

V Performance Guarantees For Algorithm 1

V-A Statement of the Results

In this section, we first state the theoretical guarantees afforded by Algorithm 1, and then discuss their implications; proofs of these statements are deferred to Appendices A and B. We begin with one of the main results of this paper.

Theorem 1.

Given an LTI system (1) and a measurement model (2), suppose $(\mathbf{A},\mathbf{C})$ is observable. Let the sequence of communication graphs $\{\mathcal{G}[k]\}_{k=0}^{\infty}$ satisfy conditions (C1)-(C3) in Section II. Then, given any desired convergence rate $\rho\in(0,1)$, the observer gains $\mathbf{L}_{1},\ldots,\mathbf{L}_{N}$ can be designed in a manner such that the estimation error of each node converges to zero exponentially fast at rate $\rho$, based on Algorithm 1.

The next result states that under the conditions in Theorem 1, one can in fact achieve finite-time convergence.

Proposition 1.

(Finite-Time Convergence) Suppose the conditions stated in Theorem 1 hold. Then, the observer gains $\mathbf{L}_{1},\ldots,\mathbf{L}_{N}$ can be designed in a manner such that the estimation error of each node converges to zero in finite time.

The next result follows directly from Prop. 1 and indicates that, when the sequence of communication graphs exhibits certain structure, one can derive a closed-form expression for the maximum number of time-steps required for convergence.

Corollary 1.

Suppose the conditions stated in Theorem 1 hold. Additionally, suppose $f(t_{q})\leq T,\forall t_{q}\in\mathbb{I}$, where $T\in\mathbb{N}_{+}$. Then, the observer gains $\mathbf{L}_{1},\ldots,\mathbf{L}_{N}$ can be designed in a manner such that the estimation error of each node converges to zero in at most $n+2N(N-1)T$ time-steps.

Let us now discuss the implications of our results, and comment on certain aspects of our approach.

Remark 1.

The fact that a network of partially informed nodes can track the state of a dynamical system with arbitrarily large eigenvalues, over inter-communication intervals that can potentially grow unbounded, is non-obvious a priori. Our results in Thm. 1 and Prop. 1 indicate that not only can this be done exponentially fast at any desired rate, it can also be done in finite time. In contrast, the result closest to ours [11] assumes strong-connectivity at each time-step, an assumption that is significantly stronger than ours.

Remark 2.

Notice that given a desired convergence rate $\rho$, the general design approach described in the proof of Thm. 1 (in Appendix A) offers a considerable degree of freedom in choosing the parameters $\lambda_{1},\ldots,\lambda_{N}$; the resulting flexibility in choosing the observer gains can be exploited to optimize transient performance, or performance against noise. In contrast, the proof of Prop. 1 highlights a specific approach to obtaining finite-time convergence. Such an approach, however, may lead to undesirable transient spikes in the estimation errors, owing to large observer gains.

Remark 3.

Presently, our approach requires a centralized design phase where the agents implement the multi-sensor decomposition in Section II, and design their observer gains to achieve the desired convergence rate as outlined in the proof of Thm. 1. This is of course a limitation, but one that is common to all existing approaches [2, 3, 4, 5, 7, 12, 8, 6, 9, 11, 10] that we are aware of, i.e., each such approach involves a centralized design phase. Our approach also requires the nodes to know an upper bound on the parameter $\delta$ in Eq. (3) while designing their observer gains. If, however, we constrain the inter-communication intervals to grow at worst sub-linearly, i.e., if $\delta=0$, then such gains can be designed with no knowledge of the nature of the graph sequences. Thus, there exists a trade-off between the generality of the graph sequences that can be tolerated, and the information required to do so.

Remark 4.

When strong-connectivity is preserved over uniformly bounded intervals, i.e., when $f(t_{q})\leq T,\forall t_{q}\in\mathbb{I}$, for some $T\in\mathbb{N}_{+}$, our approach leads to bounded estimation errors under bounded disturbances. However, as we will see in Section V-B, this may no longer be the case if the inter-communication intervals grow unbounded, no matter how slowly.

V-B Implications of growing inter-communication intervals under bounded disturbances

While Theorem 1 shows that estimation is possible under growing inter-communication intervals, the goal of this section is to demonstrate via a simple example that this may no longer be true in the presence of disturbances. To this end, consider a scalar, unstable LTI system $x[k+1]=ax[k]+d$, where $a>1$, and $d>0$ is a disturbance input to the system. The network comprises just two nodes: node 1 with measurement model $y_{1}[k]=c_{1}x[k]$, $c_{1}\neq 0$, and node 2 with no measurements. Now consider an increasing sequence of time-steps $\mathbb{I}=\{t_{0},t_{1},\ldots\}$ with $t_{0}=0$, and let $f(t_{q})=t_{q+1}-t_{q},\forall t_{q}\in\mathbb{I}$, be a non-decreasing function of its argument, as in Section II. Suppose the communication pattern comprises an edge from node 1 to node 2 precisely at the time-steps given by $\mathbb{I}$. Node 1 maintains a standard Luenberger observer given by $\hat{x}_{1}[k+1]=a\hat{x}_{1}[k]+l_{1}(y_{1}[k]-c_{1}\hat{x}_{1}[k])$, where $l_{1}$ is the observer gain. Node 2 applies Algorithm 1, which, in this case, translates to node 2 adopting the estimate of node 1 at each time-step $t_{q}$, and running open-loop at all other time-steps. Accordingly, we have $\hat{x}_{2}[t_{q+1}]=a^{f(t_{q})}\hat{x}_{1}[t_{q}],\forall t_{q}\in\mathbb{I}$. With $e_{i}[k]=x[k]-\hat{x}_{i}[k],i\in\{1,2\}$, one can then easily verify:

$$e_{1}[k+1]=\gamma e_{1}[k]+d,\quad\forall k\in\mathbb{N},\qquad(14)$$
$$e_{2}[t_{q+1}]=a^{f(t_{q})}e_{1}[t_{q}]+d\,\frac{a^{f(t_{q})}-1}{a-1},\quad\forall t_{q}\in\mathbb{I},$$

where $\gamma=(a-l_{1}c_{1})$. Now consider a scenario where the inter-communication intervals grow unbounded, i.e., $f(t_{q})\rightarrow\infty$ as $t_{q}\rightarrow\infty$. Since $a>1$ and $d>0$, it is clear from (14) that the error subsequence $e_{2}[t_{q}],t_{q}\in\mathbb{I}$, will grow unbounded even if node 1 chooses $l_{1}$ such that $\gamma=0$. For the specific example under consideration, although the above arguments were constructed w.r.t. our algorithm, it seems unlikely that the final conclusion would change if one were to resort to some other approach.⁷ The discussions in Section V can thus be summarized as follows.

⁷Note that we are only considering single-time-scale algorithms where nodes are not allowed to exchange their measurements. We also assume here that the nodes have no knowledge of the nature of the disturbance $d$, thereby precluding the use of any disturbance-rejection technique.
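This example is easy to verify numerically. The following sketch simulates the two-node system under our own illustrative choices $a=2$, $c_1=1$, $d=0.1$, $f(t_q)=q+1$, with $l_1$ chosen so that $\gamma=0$:

```python
# Two-node example: node 1 runs a deadbeat Luenberger observer (gamma = 0),
# node 2 copies node 1's estimate only at the communication times t_q and
# runs open-loop in between. All parameter values are our own choices.
a, c1, d = 2.0, 1.0, 0.1
l1 = a / c1                           # gamma = a - l1*c1 = 0

comm_times, t, gap = [], 0, 1
while t <= 40:                        # I = {0, 1, 3, 6, 10, 15, 21, 28, 36}
    comm_times.append(t)
    t, gap = t + gap, gap + 1         # f(t_q) = q + 1 grows unbounded

x, x1, x2 = 1.0, 0.0, 0.0             # true state and the two estimates
sampled_errors = []                   # e_2[t_q], recorded just before adoption
for k in range(comm_times[-1] + 1):
    if k in comm_times:
        sampled_errors.append(abs(x - x2))
        x2 = x1                       # node 2 adopts node 1's estimate
    x1 = a * x1 + l1 * (c1 * x - c1 * x1)   # node 1: Luenberger update
    x2 = a * x2                             # node 2: open-loop
    x = a * x + d                           # true dynamics with disturbance

# Despite gamma = 0 (so e_1[k] = d for all k >= 1), the sampled error
# e_2[t_{q+1}] = a^{f(t_q)} e_1[t_q] + d (a^{f(t_q)} - 1)/(a - 1) keeps growing
print(sampled_errors)
```

Even though node 1's error settles at $e_1[k]=d$ after one step, the sampled errors of node 2 roughly double from one communication time to the next, in line with (14).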

  • For a noiseless, disturbance-free LTI system of the form (1), one can achieve exponential convergence at any desired rate, and even finite-time convergence, based on Algorithm 1, under remarkably mild assumptions: joint observability, and joint strong-connectivity over intervals that can potentially grow unbounded.

  • For an unstable system, any non-zero persistent disturbance, however small, can lead to unbounded estimation errors when the inter-communication intervals grow unbounded, no matter how slowly. Note, however, from Eq. (14) that our approach leads to bounded estimation errors under bounded disturbances if the sequence $\{f(t_{q})\}_{t_{q}\in\mathbb{I}}$ is uniformly bounded above (see Remark 4).

In light of the above points, the reasons for stating our results in full generality, i.e., for unbounded communication intervals, are as follows. First, we do so for theoretical interest, since we believe our work is the first to establish that the distributed state estimation problem can be solved with growing inter-communication intervals. Second, we essentially get this result for "free", i.e., accounting for such general scenarios incurs no additional steps in terms of the design of Algorithm 1. Finally, we emphasize that, while no existing approach can even handle the case where strong-connectivity is preserved over uniformly bounded time intervals (i.e., $\exists\,T\in\mathbb{N}_{+}$ such that $f(t_{q})\leq T,\forall t_{q}\in\mathbb{I}$), the analysis for this scenario is simply a special case of that in Appendix A.

VI Resilient Distributed State Estimation over Time-Varying Networks

We now consider a scenario where a subset of the agents in the network is adversarial, and can deviate from the prescribed algorithm. We will show how one can employ the notion of freshness-indices to account for such adversarial agents over a time-varying network. To avoid cumbersome notation and to present the key ideas clearly, we will consider a scalar LTI system of the form $x[k+1]=ax[k]$. Later, we will discuss how our approach extends naturally to more general system models. We consider a worst-case Byzantine adversary model, where an adversarial node is assumed to be omniscient and can act essentially arbitrarily: it can transmit incorrect, potentially inconsistent information to its instantaneous out-neighbors, or choose not to transmit anything at all, while colluding with other adversaries in the process [28, 29, 30]. We will focus on an $f$-total adversarial model, where the total number of adversaries in the network is bounded above by $f$, with $f\in\mathbb{N}$. The adversarial set will be denoted by $\mathcal{A}$, and the set of regular nodes by $\mathcal{R}=\mathcal{V}\setminus\mathcal{A}$. Finally, for the scalar model under consideration, we define the set of source nodes as $\mathcal{S}=\{i\in\mathcal{V}:c_{i}\neq 0,\ \textrm{where}\ y_{i}[k]=c_{i}x[k]\}$, i.e., $i\in\mathcal{S}$ if and only if it can measure the state on its own. Note that we will allow $\mathcal{S}\cap\mathcal{A}\neq\emptyset$.

VI-A Description of Algorithm 2

Algorithm 2
1: Initialization: For each $i\in\mathcal{V}\setminus\mathcal{S}$, $\tau_{i}[0]=\omega$; each entry of $\mathbf{v}_{i}[0],\mathbf{d}_{i}[0]$, and $\boldsymbol{\phi}_{i}[0]$ is empty; $\mathcal{M}_{i}[0]=\emptyset$; and $q_{i}[0]=0$. For each $i\in\mathcal{S}$, $\tau_{i}[0]=0$.
2: Update Rules for Source Nodes: Each $i\in\mathcal{S}$ maintains $\tau_{i}[k]=0,\forall k\in\mathbb{N}$. It updates $\hat{x}_{i}[k]$ based on the following Luenberger observer, where $l_{i}$ is the observer gain:
$$\hat{x}_{i}[k+1]=a\hat{x}_{i}[k]+l_{i}(y_{i}[k]-c_{i}\hat{x}_{i}[k]).\qquad(15)$$
3: Update Rules for Non-Source Nodes: At each $k\in\mathbb{N}$, every non-source node $i\in\mathcal{V}\setminus\mathcal{S}$ operates as follows (lines 4-18).
4: Case 1: $\tau_{i}[k]=\omega$. Define $\mathcal{J}_{i}[k]\triangleq\{j\in\mathcal{N}_{i}[k]:\tau_{j}[k]\neq\omega,\tau_{j}[k]\in\mathbb{N},\tau_{j}[k]\leq k\}$.
5: if $|\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]|<(2f+1)-q_{i}[k]$ then
6:     Append each $l\in\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$ to $\mathcal{M}_{i}[k]$ ▷ This involves adding $l$ to $\mathcal{M}_{i}[k]$, and setting $v_{i,l}[k]=\hat{x}_{l}[k]$, $d_{i,l}[k]=\tau_{l}[k]$, and $\phi_{i,l}[k]=k$.
7:     Perform the following updates.
$$q_{i}[k+1]=q_{i}[k]+|\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]|.\qquad(16)$$
$$d_{i,l}[k+1]=d_{i,l}[k]+1,\ l\in\mathcal{M}_{i}[k].\qquad(17)$$
$$\tau_{i}[k+1]=\omega.\qquad(18)$$
$$\hat{x}_{i}[k+1]=a\hat{x}_{i}[k].\qquad(19)$$
8: else
9:     Sort the nodes in $\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$ in ascending order of their freshness-indices $\tau_{l}[k],l\in\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$. Append the first $(2f+1)-q_{i}[k]$ nodes in the resulting list to $\mathcal{M}_{i}[k]$.
10:     Set $q_{i}[\tau]=2f+1,\forall\tau\geq k+1$; update $d_{i,l}[k],l\in\mathcal{M}_{i}[k]$, as per (17); update $\tau_{i}[k]$ as follows:
$$\tau_{i}[k+1]=\max_{l\in\mathcal{M}_{i}[k]}d_{i,l}[k]+1.\qquad(20)$$
11:     For each $l\in\mathcal{M}_{i}[k]$, form the following quantity:
$$\bar{x}_{i,l}[k]=a^{(k-\phi_{i,l}[k])}\hat{x}_{l}[\phi_{i,l}[k]].\qquad(21)$$
12:     Sort $\bar{x}_{i,l}[k],l\in\mathcal{M}_{i}[k]$, from highest to lowest, and reject the highest $f$ and the lowest $f$ of such quantities. Let the quantity that remains after such trimming be denoted $\bar{x}_{i}[k]$. Update $\hat{x}_{i}[k]$ as follows:
$$\hat{x}_{i}[k+1]=a\bar{x}_{i}[k].\qquad(22)$$
13: end if
14: Set $\mathcal{M}_{i}[k+1]=\mathcal{M}_{i}[k]$, $\mathbf{v}_{i}[k+1]=\mathbf{v}_{i}[k]$, and $\boldsymbol{\phi}_{i}[k+1]=\boldsymbol{\phi}_{i}[k]$.
15: Case 2: $\tau_{i}[k]\neq\omega$. For each $l\in\mathcal{J}_{i}[k]\cap\mathcal{M}_{i}[k]$, if $\tau_{l}[k]<d_{i,l}[k]$, then set $v_{i,l}[k]=\hat{x}_{l}[k]$, $d_{i,l}[k]=\tau_{l}[k]$, and $\phi_{i,l}[k]=k$ (we will classify this as an append operation).
16: Sort the nodes in $\mathcal{M}_{i}[k]\cup\{\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]\}$ in ascending order of their freshness-indices, using $d_{i,l}[k]$ as the index for $l\in\mathcal{M}_{i}[k]$, and $\tau_{l}[k]$ as the index for $l\in\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$. Append the first $2f+1$ nodes in the resulting list to $\mathcal{M}_{i}[k]$.
17: For each $l\in\mathcal{M}_{i}[k]$, update $d_{i,l}[k]$ as per (17). Update $\tau_{i}[k]$ as per (20), and $\hat{x}_{i}[k]$ via (22) based on the filtering operation described in lines 11-12.
18: Execute the operations in line 14.

Following the same line of reasoning as in Section IV, we are once again interested in answering the following question: when should a regular non-source node $i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R}$ use the information of a neighbor to update its estimate of the state? Unlike before, however, the presence of adversaries introduces certain complications: not only can an adversary transmit an arbitrary estimate of the state, it can also lie about its freshness-index. In particular, an adversary can follow the strategy of always reporting a freshness-index of $0$ so as to prompt its instantaneous out-neighbors to use its estimate. To carefully account for such misbehavior, we devise a novel protocol, namely Algo. 2. In what follows, we explain the main idea behind Algo. 2 in a nutshell.

High-level idea: Our goal is to ensure that the state estimate $\hat{x}_{i}[k]$ of a regular non-source node $i$ is "close" to those of the regular source set $\mathcal{S}\cap\mathcal{R}$. To achieve this, we would like $\hat{x}_{i}[k]$ to be updated based on information that is neither too outdated, nor corrupted by the adversarial set. To meet these two requirements, our main idea is to have node $i$ store and process information received from a dynamic list of neighbors. At each time-step, the list is first updated so as to retain only those nodes that have the most recent state estimates w.r.t. the source set. Subsequently, $\hat{x}_{i}[k]$ is updated by filtering out extreme estimates in the list in an appropriate manner. We now describe in detail how the above steps are implemented.

Detailed Description: Like Algo. 1, Algo. 2 requires each node $i$ to maintain a freshness-index $\tau_{i}[k]\in\mathbb{N}\cup\{\omega\}$. Each source node $i\in\mathcal{S}$ maintains $\tau_{i}[k]=0$ for all time, and updates $\hat{x}_{i}[k]$ based on a standard Luenberger observer (see Eq. (15)). Given the presence of adversarial nodes in the network, each regular non-source node $i$ relies on information redundancy for updating its estimate of the state. To this end, it maintains four additional vectors $\mathbf{v}_{i}[k],\mathbf{d}_{i}[k],\boldsymbol{\phi}_{i}[k]$, and $\mathcal{M}_{i}[k]$, each of dimension $2f+1$. The vector $\mathbf{v}_{i}[k]$ consists of state estimates received over time from $2f+1$ distinct nodes. The labels of such nodes are stored in a list $\mathcal{M}_{i}[k]$, their freshness-indices in $\mathbf{d}_{i}[k]$, and the time-stamps associated with their estimates (i.e., the time-steps at which their estimates were received) are recorded in $\boldsymbol{\phi}_{i}[k]$. Algo. 2 requires each non-source node $i$ to execute two key steps: (1) maintaining a dynamic list $\mathcal{M}_{i}[k]$ of the $2f+1$ neighbors that have the lowest freshness-indices based on all the information node $i$ has acquired up to time $k$; and (2) performing a filtering operation to update $\hat{x}_{i}[k]$ based on the latest state estimates of the nodes in the current list $\mathcal{M}_{i}[k]$. The second step, however, requires the cardinality of the set $\mathcal{M}_{i}[k]$ to be $2f+1$. Thus, Algo. 2 involves an initial pre-filtering phase (lines 4-7) where a non-source node simply gathers enough estimates to later act on.

$\bullet$ Discussion of Case 1: We first describe the rules associated with the pre-filtering phase. Initially, each entry of $\mathbf{v}_{i}[0],\mathbf{d}_{i}[0]$, and $\boldsymbol{\phi}_{i}[0]$ is empty, and $\mathcal{M}_{i}[k]$ is an empty list (line 1). As time progresses, node $i$ adds distinct nodes to the list $\mathcal{M}_{i}[k]$ based on rules that we will discuss shortly. Until the time when $|\mathcal{M}_{i}[k]|=2f+1$, node $i$ maintains $\tau_{i}[k]=\omega$, and uses a counter $q_{i}[k]$ to keep track of the number of entries in $\mathcal{M}_{i}[k]$. When $\tau_{i}[k]=\omega$, node $i$ operates as follows. It first considers the subset of neighbors $\mathcal{J}_{i}[k]$ at time $k$ that have freshness-indices other than $\omega$, belonging to $\mathbb{N}$, and at most $k$ (line 4).⁸ In line 5 of Algo. 2, node $i$ checks whether there are enough new nodes (i.e., nodes different from those already in $\mathcal{M}_{i}[k]$) in $\mathcal{J}_{i}[k]$ to bring the number of distinct entries in $\mathcal{M}_{i}[k]$ up to $2f+1$. If not, it "appends" each node $l\in\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$ to $\mathcal{M}_{i}[k]$. By this, we mean that it adds $l$ to $\mathcal{M}_{i}[k]$, sets $v_{i,l}[k]=\hat{x}_{l}[k]$, $d_{i,l}[k]=\tau_{l}[k]$, and $\phi_{i,l}[k]=k$ (see line 6). Here, we use the double subscript $i,l$ to indicate the $i$-th node's record of the various quantities associated with a node $l\in\mathcal{M}_{i}[k]$. Node $i$ keeps track of the number of new nodes appended via (16).

⁸Note that in the absence of adversaries, any node $j$ following Algo. 1 would satisfy $\tau_{j}[k]\in\mathbb{N}$ and $\tau_{j}[k]\leq k$ whenever $\tau_{j}[k]\neq\omega$. We would like the same to hold for any regular node following Algo. 2. Thus, any neighbor reporting otherwise need not be considered for inclusion in $\mathcal{M}_{i}[k]$.
The entry $d_{i,l}[k]$ is the $i$-th node's internal copy of the freshness-index of node $l$, which it updates via (17) for future comparisons (such as those in lines 15 and 16). As an indicator of the fact that it has not yet acquired $2f+1$ state estimates to perform a filtering operation, node $i$ sets $\tau_{i}[k+1]=\omega$ (see Eq. (18)), and runs open-loop via (19).

Let us now discuss the case when $\tau_{i}[k]$ transitions from $\omega$ to some value other than $\omega$ (lines 8-12). At the transition time-step, node $i$ appends those nodes from $\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]$ to $\mathcal{M}_{i}[k]$ that have the lowest freshness-indices (see line 9). It does so in a way such that $|\mathcal{M}_{i}[k]|$ is precisely $2f+1$. Since node $i$ has now acquired enough estimates to perform the filtering step, it maintains $q_{i}[\tau]=2f+1$ for all $\tau\geq k+1$ (see line 10). As in the non-adversarial setting, the freshness-index of a node $i$ is a measure of how delayed its estimate is w.r.t. that of the source set $\mathcal{S}$. Since node $i$'s estimate is in turn updated based on the estimates of nodes in $\mathcal{M}_{i}[k]$, its freshness-index is essentially dictated by the largest entry in $\mathbf{d}_{i}[k]$, i.e., the entry corresponding to the most delayed estimate. This facet is captured by (20). Lines 11-12 of Algo. 2 constitute the filtering step that a node $i\in\mathcal{V}\setminus\mathcal{S}$ employs to update $\hat{x}_{i}[k]$. Since the latest estimates of nodes in $\mathcal{M}_{i}[k]$ might have different time-stamps, we need a way to make meaningful comparisons between them. To this end, for each $l\in\mathcal{M}_{i}[k]$, node $i$ constructs an intermediate quantity $\bar{x}_{i,l}[k]$ by propagating forward the latest estimate it has obtained from node $l$, namely $\hat{x}_{l}[\phi_{i,l}[k]]$, from time $\phi_{i,l}[k]$ to time $k$ (see Eq. (21)).⁹ Note here that $\phi_{i,l}[k]$ is the latest time-step at which node $i$ appended node $l$ to $\mathcal{M}_{i}[k]$.

⁹One can interpret $\bar{x}_{i,l}[k]$ as node $i$'s prediction of node $l$'s state estimate at time $k$, based on its current information.
Having constructed the quantities $\bar{x}_{i,l}[k],l\in\mathcal{M}_{i}[k]$, node $i$ then rejects the highest $f$ and the lowest $f$ of them (i.e., it filters out extreme values), and uses the one that remains, denoted $\bar{x}_{i}[k]$, to update $\hat{x}_{i}[k]$ via (22).
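A sketch of the filtering step in lines 11-12, under our own naming (`records` holds, for each node in $\mathcal{M}_i[k]$, the stored time-stamp $\phi_{i,l}[k]$ and estimate $\hat{x}_l[\phi_{i,l}[k]]$):

```python
def filter_and_update(a, k, records, f):
    """Trimmed update of a non-source node, cf. (21)-(22).

    a:       scalar system matrix
    k:       current time-step
    records: list of (phi_l, xhat_l) pairs, one per node in M_i[k]
    f:       upper bound on the number of adversaries
    Returns the new estimate x_i[k+1].
    """
    assert len(records) == 2 * f + 1
    # (21): propagate each stored estimate forward from its time-stamp to k
    predictions = [a ** (k - phi) * xhat for (phi, xhat) in records]
    # (22): reject the f highest and f lowest predictions; with 2f+1 values,
    # exactly one remains -- the median
    predictions.sort()
    x_bar = predictions[f]
    return a * x_bar
```

Since $|\mathcal{M}_i[k]|=2f+1$, discarding the $f$ largest and $f$ smallest predictions leaves exactly one value, the median; with at most $f$ adversaries in the list, this value is always bracketed by predictions stored from regular nodes.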

$\bullet$ Discussion of Case 2: The rules in lines 15-18 are essentially the same as those just discussed for lines 8-12. The key difference stems from the manner in which $\mathcal{M}_{i}[k]$ is updated. In particular, when $\tau_{i}[k]=\omega$, or when $\tau_{i}[k]$ transitions from $\omega$ to a value other than $\omega$, node $i$ is less selective about which nodes to include in $\mathcal{M}_{i}[k]$; at that point, its main concern is gathering enough estimates for implementing the filtering step. In contrast, when $\tau_{i}[k]\neq\omega$, node $i$ carefully examines the freshness-indices of its instantaneous neighbors prior to a potential inclusion in $\mathcal{M}_{i}[k]$. This is done in two steps. First, if it comes in contact with a node $l$ that already exists in $\mathcal{M}_{i}[k]$, then it checks whether node $l$ has fresher information to offer than when it was last appended to $\mathcal{M}_{i}[k]$ (see line 15). If so, node $i$ replaces the entries corresponding to node $l$ with their more recent versions. Next, in line 16, node $i$ considers the set $\mathcal{M}_{i}[k]\cup\{\mathcal{J}_{i}[k]\setminus\mathcal{M}_{i}[k]\}$, and appends/retains only the $2f+1$ nodes that have the lowest freshness-indices.¹⁰ The rationale behind breaking up the "appending" operation into two steps (lines 15-16) is to avoid the possibility of a single node appearing multiple times in $\mathcal{M}_{i}[k]$.

¹⁰In implementing this step, suppose a node $l$ already existing in $\mathcal{M}_{i}[k]$ gets retained in $\mathcal{M}_{i}[k]$, and suppose the information concerning node $l$ was stored in the $p$-th components of $\mathbf{d}_{i}[k],\mathbf{v}_{i}[k]$, and $\boldsymbol{\phi}_{i}[k]$. Then, node $i$ continues to store node $l$'s information in the $p$-th components of the above vectors. This is done only to simplify some of the arguments (see proof of Lemma 5).
Note that for both Case 1 and Case 2, the values of $\mathcal{M}_{i},\mathbf{v}_{i}$, and $\boldsymbol{\phi}_{i}$ at the beginning of time-step $k+1$ are initialized with their values at the end of time-step $k$ (lines 14 and 18); their values at the end of time-step $k+1$ will naturally depend on the new information available to node $i$ at time $k+1$. Finally, we emphasize that the rules of Algo. 2 apply only to the regular nodes; an adversary can deviate from them in arbitrary ways.
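The two-step list maintenance of Case 2 (lines 15-16) can be sketched as follows. This is an illustration with our own data layout: `M` maps a node label to its stored record $(d_{i,l}, v_{i,l}, \phi_{i,l})$ rather than using the paper's fixed $(2f+1)$-dimensional vectors, so the component-retention convention of footnote 10 is not modeled.

```python
def update_case2_list(M, J, f, k):
    """Case 2 list maintenance (lines 15-16), sketched with dicts.

    M: {node: (d, v, phi)}  -- current list of 2f+1 stored records
    J: {node: (tau, xhat)}  -- valid instantaneous neighbors at time k
    Returns the updated list of size 2f+1.
    """
    # Line 15: refresh entries already in the list whenever the neighbor
    # now reports a strictly smaller (i.e., fresher) index
    for l, (tau_l, xhat_l) in J.items():
        if l in M and tau_l < M[l][0]:
            M[l] = (tau_l, xhat_l, k)
    # Line 16: among current members and new neighbors, keep exactly the
    # 2f+1 nodes with the lowest freshness-indices (no node appears twice)
    candidates = dict(M)
    for l, (tau_l, xhat_l) in J.items():
        if l not in candidates:
            candidates[l] = (tau_l, xhat_l, k)
    keep = sorted(candidates, key=lambda l: candidates[l][0])[: 2 * f + 1]
    return {l: candidates[l] for l in keep}
```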

VII Performance Guarantees for Algorithm 2

In order to state our main result concerning the performance of Algorithm 2, we need to recall the following concepts.

Definition 1.

($r$-reachable set) [29] Given a graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, a set $\mathcal{C}\subseteq\mathcal{V}$, and an integer $r\in\mathbb{N}_{+}$, $\mathcal{C}$ is said to be an $r$-reachable set if $\exists i\in\mathcal{C}$ such that $|\mathcal{N}_{i}\setminus\mathcal{C}|\geq r$.

Definition 2.

(Strongly $r$-robust graph w.r.t. $\mathcal{S}$) [23] Given a graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, a set $\mathcal{S}\subset\mathcal{V}$, and an integer $r\in\mathbb{N}_{+}$, $\mathcal{G}$ is said to be strongly $r$-robust w.r.t. $\mathcal{S}$ if every non-empty subset $\mathcal{C}\subseteq\mathcal{V}\setminus\mathcal{S}$ is $r$-reachable.
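For small graphs, Definition 2 can be verified directly by brute force over all non-empty subsets of $\mathcal{V}\setminus\mathcal{S}$; a sketch (our own helper, exponential in $|\mathcal{V}\setminus\mathcal{S}|$):

```python
from itertools import combinations

def is_strongly_r_robust(V, in_neighbors, S, r):
    """Check strong r-robustness of a graph w.r.t. S (Definition 2):
    every non-empty subset C of V \\ S must be r-reachable, i.e., contain
    a node i with at least r in-neighbors outside C.

    V: set of nodes; in_neighbors: dict node -> set of in-neighbors.
    """
    rest = list(set(V) - set(S))
    for size in range(1, len(rest) + 1):
        for C in combinations(rest, size):
            Cset = set(C)
            if not any(len(in_neighbors[i] - Cset) >= r for i in C):
                return False   # found a subset C that is not r-reachable
    return True
```

For instance, a graph whose single non-source node receives edges from all of $2f+1$ source nodes is strongly $(2f+1)$-robust w.r.t. that source set.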

Based on the above concepts, we now introduce the key graph property that will be of primary importance to us.

Definition 3.

(Joint strong $r$-robustness w.r.t. $\mathcal{S}$) Given an integer $r\in\mathbb{N}_{+}$ and a set $\mathcal{S}\subset\mathcal{V}$, a sequence of graphs $\{\mathcal{G}[k]\}_{k=0}^{\infty}$ is said to be jointly strongly $r$-robust w.r.t. $\mathcal{S}$ if there exists $T\in\mathbb{N}_{+}$ such that $\bigcup_{\tau=kT}^{(k+1)T-1}\mathcal{G}[\tau]$ is strongly $r$-robust w.r.t. $\mathcal{S}$, $\forall k\in\mathbb{N}$.

The main result of this section is as follows.

Theorem 2.

Consider a scalar LTI system of the form (1), and a measurement model of the form (2). Let the sequence of communication graphs $\{\mathcal{G}[k]\}_{k=0}^{\infty}$ be jointly strongly $(3f+1)$-robust w.r.t. the source set $\mathcal{S}$. Then, based on Algorithm 2, the estimation error of each node $i\in\mathcal{R}$ can be made to converge to zero exponentially fast at any desired rate $\rho$, despite the actions of any $f$-total Byzantine adversarial set.

The proof of the above theorem is deferred to Appendix C. Prior to that, a few remarks are in order.

Remark 5.

We briefly point out how the above development can be generalized to vector dynamical systems. When the system matrix $\mathbf{A}$ has real, distinct eigenvalues, one can first diagonalize the system. In the new coordinate frame, each state component is a scalar that can be treated analogously to Algorithm 2. To deal with the case when $\mathbf{A}$ has arbitrary spectrum, we can first transform $\mathbf{A}$ to its real Jordan canonical form, and then combine ideas from [23] and Section VI. In each of the above cases, if (i) every unstable mode (eigenvalue) has its own set of source nodes that can observe it, and (ii) joint strong $(3f+1)$-robustness holds w.r.t. each such source set, our results will go through.

Remark 6.

While in Def. 3 we required the strong-robustness property to be preserved over intervals of constant length $T$, we can easily allow for such intervals to grow linearly as well, in the spirit of condition (C2) in Section II.

Remark 7.

Following the proof of Theorem 2 in Appendix C, it is easy to see that one can obtain finite-time convergence in the adversarial setting as well. To do so, each regular source node $i$ simply needs to choose its observer gain $l_{i}$ in (15) so that $a-l_{i}c_{i}=0$. One can then verify that it takes at most $2(N-|\mathcal{S}|)T+1$ time-steps for the errors of all regular nodes to converge to $0$.

Remark 8.

In our prior work [15], we studied a restrictive class of time-varying networks, where the restrictiveness arose from the fact that each regular non-source node was constrained ahead of time to only ever listen to certain specific nodes in the network. The main challenge that we address in this paper is to get rid of this constraint, thereby allowing for far more general graph sequences. This, in turn, requires a node to decide in real-time which neighbors to pay attention to (i.e., append to its list) as in Algorithm 2.

Remark 9.

The reason we require joint strong $(3f+1)$-robustness as opposed to joint strong $(2f+1)$-robustness is as follows. Consider a scenario where there are precisely $2f+1$ source nodes, $f$ of whom are adversarial, and there is exactly one non-source node $i$. Suppose the graph sequence is jointly strongly $(2f+1)$-robust w.r.t. $\mathcal{S}$, i.e., each source node will be in a position to transmit information to node $i$ over every interval of the form $[kT,(k+1)T),k\in\mathbb{N}$. Suppose the $f$ adversaries do not transmit at all. Node $i$ will never be able to attribute such a phenomenon to adversarial behavior (since the lack of information from the adversarial nodes could be due to absence of links in the time-varying graph). Thus, $\tau_{i}[k]$ will equal $\omega$ for all time, and node $i$ will keep running open-loop forever, thereby causing Algorithm 2 to fail.

VIII Conclusion

We proposed a novel approach to the design of distributed observers for LTI systems over time-varying networks. We proved that our algorithm guarantees exponential convergence at any desired rate (including finite-time convergence) under graph-theoretic conditions that are far milder than existing ones. In particular, we showed that these results hold even when the inter-communication intervals grow unbounded over time. We then extended our framework to account for the possibility of worst-case adversarial attacks. In terms of future research directions, it would be interesting to explore the performance of our algorithm when the underlying network changes stochastically, as in [31]. We believe that the notion of a "freshness-index", as employed in this paper, should be applicable to other related classes of problems that involve some form of collaborative estimation or inference; investigations along these lines also merit attention.

Appendix A Proof of Theorem 1

The goal of this section is to prove Theorem 1. Before delving into the technical details, we first provide an informal discussion of the main ideas underlying the proof. To this end, let us fix a sub-state j\in\{1,\ldots,N\}. The starting point of our analysis is Lemma 3, which establishes that for any non-source node i\in\mathcal{V}\setminus\{j\}, its error in estimating sub-state j at time-step k can be expressed as a delayed version of the corresponding error of the source node j, where the delay is precisely the freshness-index \tau^{(j)}_{i}[k]. Given this result, we focus on bounding the delay \tau^{(j)}_{i}[k] by exploiting the graph connectivity condition (C3). This is achieved in Lemma 4, where we first establish that \tau^{(j)}_{i}[k] gets triggered after a finite period of time, and then show that it can be bounded above by the function \tilde{g}(k)=2(N-1)g(k), where g(k) is as defined in Section II. At this point, we appeal to condition (C2) (which caps the rate of growth of \tilde{g}(k)) in designing the observer gain \mathbf{L}_{j} at node j. Specifically, in the proof of Theorem 1, we carefully design \mathbf{L}_{j} so that, despite a potentially growing delay, every non-source node i\in\mathcal{V}\setminus\{j\} inherits the same exponential convergence to the true dynamics \mathbf{z}^{(j)}[k] as that achieved by the corresponding source node j. With these ideas in place, we first prove a simple result that will be helpful later on; it states that a non-source node for a certain sub-state will always adopt the information of the corresponding source node whenever it is in a position to do so.

Lemma 2.

Consider any sub-state j, and suppose that at some time-step k, we have j\in\mathcal{N}_{i}[k] for some i\in\mathcal{V}\setminus\{j\}. Then, based on Algorithm 1, we have:

  1. (i) If \tau^{(j)}_{i}[k]=\omega, then j=\arg\min_{l\in\mathcal{M}^{(j)}_{i}[k]}\tau^{(j)}_{l}[k].

  2. (ii) If \tau^{(j)}_{i}[k]\neq\omega, then j=\arg\min_{l\in\mathcal{F}^{(j)}_{i}[k]}\tau^{(j)}_{l}[k].

Proof.

The result follows from two simple observations that are direct consequences of the rules of Algorithm 1: (i) \tau^{(j)}_{j}[k]=0,\forall k\in\mathbb{N}, and (ii) for any i\in\mathcal{V}\setminus\{j\}, \tau^{(j)}_{i}[k]\geq 1 whenever \tau^{(j)}_{i}[k]\neq\omega. In other words, the source node for a given sub-state has the lowest freshness-index for that sub-state at all time-steps. ∎

Lemma 3.

Suppose all nodes employ Algorithm 1. Consider any sub-state j, and suppose that at some time-step k, we have \tau^{(j)}_{i}[k]=m, where i\in\mathcal{V}\setminus\{j\} and m\in\mathbb{N}_{+}. Then, there exist nodes v(\tau)\in\mathcal{V}\setminus\{j\}, \tau\in\{k-m+1,\ldots,k\}, such that the following is true:

\hat{\mathbf{z}}^{(j)}_{i}[k]=\mathbf{A}^{m}_{jj}\hat{\mathbf{z}}^{(j)}_{j}[k-m]+\sum_{q=1}^{(j-1)}\sum_{\tau=(k-m)}^{(k-1)}\mathbf{A}_{jj}^{(k-\tau-1)}\mathbf{A}_{jq}\hat{\mathbf{z}}^{(q)}_{v(\tau+1)}[\tau]. (23)
Proof.

Consider any sub-state j, and suppose that at some time-step k, we have \tau^{(j)}_{i}[k]=m, where i\in\mathcal{V}\setminus\{j\} and m\in\mathbb{N}_{+}. Given this scenario, we claim that there exist nodes v(\tau)\in\mathcal{V}\setminus\{j\}, \tau\in\{k-m+1,\ldots,k\}, such that v(\tau) adopts the information of v(\tau-1) at time \tau-1 for sub-state j, \forall\tau\in\{k-m+1,\ldots,k\}, with v(k-m)=j and v(k)=i. As we shall see, establishing this claim readily establishes (23); thus, we first focus on proving the former via induction on m. For the base case of induction, suppose \tau^{(j)}_{i}[k]=1 for some i\in\mathcal{V}\setminus\{j\} at some time-step k. Based on Algorithm 1 and Lemma 2, note that this is possible if and only if j\in\mathcal{N}_{i}[k-1]. In particular, v(k)=i would then adopt the information of v(k-1)=j at time k-1 for sub-state j. This establishes the claim when m=1. Now fix an integer r\geq 2, and suppose the claim is true for all m\in\{1,\ldots,r-1\}. Suppose \tau^{(j)}_{i}[k]=r for some i\in\mathcal{V}\setminus\{j\} at some time-step k. From Algorithm 1, observe that this is true if and only if i adopts the information of some node l\in\mathcal{N}_{i}[k-1]\cup\{i\} at time k-1 for sub-state j, such that \tau^{(j)}_{l}[k-1]=r-1. Since r-1\geq 1, it must be that l\in\mathcal{V}\setminus\{j\}; the induction hypothesis thus applies to node l. Using this fact, and setting v(k-1)=l, completes our inductive proof of the claim. Finally, observe that for any \tau\in\{k-m+1,\ldots,k\}, whenever v(\tau) adopts the information of v(\tau-1) at \tau-1, the following identity holds based on (10) and (12):

\hat{\mathbf{z}}^{(j)}_{v(\tau)}[\tau]=\mathbf{A}_{jj}\hat{\mathbf{z}}^{(j)}_{v(\tau-1)}[\tau-1]+\sum_{q=1}^{(j-1)}\mathbf{A}_{jq}\hat{\mathbf{z}}^{(q)}_{v(\tau)}[\tau-1]. (24)

Using the above identity repeatedly for all \tau\in\{k-m+1,\ldots,k\}, with v(k-m)=j and v(k)=i, immediately leads to (23). This completes the proof. ∎
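The rollout step from (24) to (23) can be checked numerically. This is a sketch with made-up dimensions and random matrices standing in for \mathbf{A}_{jj}, \mathbf{A}_{jq}, and the lower sub-state estimates held along the adoption chain:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, j_minus_1, k = 2, 4, 2, 10  # sub-state dim, delay m, # lower sub-states, time k

A_jj = rng.standard_normal((n, n))
A_jq = [rng.standard_normal((n, n)) for _ in range(j_minus_1)]
# z_lower[(q, tau)]: estimate of lower sub-state q held by node v(tau+1) at time tau
z_lower = {(q, t): rng.standard_normal(n) for q in range(j_minus_1) for t in range(k)}
z_src = rng.standard_normal(n)    # source estimate at time k - m

# Roll out the one-step adoption identity (24) m times
z = z_src.copy()
for tau in range(k - m, k):
    z = A_jj @ z + sum(A_jq[q] @ z_lower[(q, tau)] for q in range(j_minus_1))

# Closed-form expression (23)
z_closed = np.linalg.matrix_power(A_jj, m) @ z_src + sum(
    np.linalg.matrix_power(A_jj, k - tau - 1) @ A_jq[q] @ z_lower[(q, tau)]
    for q in range(j_minus_1) for tau in range(k - m, k))

print(np.allclose(z, z_closed))  # -> True
```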

Lemma 4.

Suppose the sequence \{\mathcal{G}[k]\}^{\infty}_{k=0} satisfies condition (C3) in Section II. Then, for each sub-state j, Algorithm 1 guarantees the following:

\tau^{(j)}_{i}[k]\neq\omega,\forall k\geq\sum_{q=0}^{N-2}f(t_{q}),\forall i\in\mathcal{V}, and (25)
\tau^{(j)}_{i}[t_{p(N-1)}]\leq\sum_{q=(p-1)(N-1)}^{p(N-1)-1}f(t_{q}),\forall p\in\mathbb{N}_{+},\forall i\in\mathcal{V}. (26)
Proof.

Fix a sub-state j, and notice that both (25) and (26) hold for the corresponding source node j, since \tau^{(j)}_{j}[k]=0,\forall k\in\mathbb{N}. To establish these claims for the remaining nodes, we begin by making the following simple observation, which follows directly from (9) and (13) and applies to every node i\in\mathcal{V}\setminus\{j\}:

\tau^{(j)}_{i}[k+1]\leq\tau^{(j)}_{i}[k]+1, whenever \tau^{(j)}_{i}[k]\neq\omega. (27)

Our immediate goal is to establish (26) when p=1 and, in the process, establish (25). Let \mathcal{C}^{(j)}_{0}=\{j\}, and define:

\mathcal{C}^{(j)}_{1}\triangleq\{i\in\mathcal{V}\setminus\mathcal{C}^{(j)}_{0}:\{\bigcup_{\tau=t_{0}}^{t_{1}-1}\mathcal{N}_{i}[\tau]\}\cap\mathcal{C}^{(j)}_{0}\neq\emptyset\}. (28)

In words, \mathcal{C}^{(j)}_{1} represents the set of non-source nodes (for sub-state j) that have a direct edge from node j at least once over the interval [t_{0},t_{1}). Based on condition (C3), \mathcal{C}^{(j)}_{1} is non-empty (barring the trivial case when \mathcal{V}=\{j\}). For each i\in\mathcal{C}^{(j)}_{1}, it must be that j\in\mathcal{M}^{(j)}_{i}[\bar{k}] for some \bar{k}\in[t_{0},t_{1}). Thus, based on (9) and (13), it must be that \tau^{(j)}_{i}[k]\neq\omega,\forall k\geq t_{1}=f(t_{0}),\forall i\in\mathcal{C}^{(j)}_{1}. In particular, we note based on (27) that \tau^{(j)}_{i}[t_{1}]\leq t_{1}, and hence \tau^{(j)}_{i}[t_{N-1}]\leq t_{N-1}=\sum_{q=0}^{N-2}f(t_{q}),\forall i\in\mathcal{C}^{(j)}_{1}. We can keep repeating the above argument by recursively defining the sets \mathcal{C}^{(j)}_{r}, 1\leq r\leq(N-1), as follows:

\mathcal{C}^{(j)}_{r}\triangleq\{i\in\mathcal{V}\setminus\bigcup_{q=0}^{(r-1)}\mathcal{C}^{(j)}_{q}:\{\bigcup_{\tau=t_{r-1}}^{t_{r}-1}\mathcal{N}_{i}[\tau]\}\cap\{\bigcup_{q=0}^{(r-1)}\mathcal{C}^{(j)}_{q}\}\neq\emptyset\}. (29)

We proceed via induction on r. Suppose the following is true for all r\in\{1,\ldots,m-1\}, where m\in\{2,\ldots,N-1\}: \tau^{(j)}_{i}[t_{r}]\neq\omega and \tau^{(j)}_{i}[t_{r}]\leq t_{r},\forall i\in\bigcup_{q=0}^{r}\mathcal{C}^{(j)}_{q}. Now suppose r=m. If \mathcal{V}\setminus\bigcup_{q=0}^{(m-1)}\mathcal{C}^{(j)}_{q} is empty, then we are done establishing (25), and (26) for the case when p=1. Else, based on condition (C3), it must be that \mathcal{C}^{(j)}_{m} is non-empty. Consider a node i\in\mathcal{C}^{(j)}_{m}. Based on the way \mathcal{C}^{(j)}_{m} is defined, note that at some time-step \bar{k}\in[t_{m-1},t_{m}), node i has some neighbor v (say) from the set \bigcup_{q=0}^{(m-1)}\mathcal{C}^{(j)}_{q}. Based on the induction hypothesis and (27), it must be that \tau^{(j)}_{v}[\bar{k}]\neq\omega and \tau^{(j)}_{v}[\bar{k}]\leq\bar{k}. At this point, if \tau^{(j)}_{i}[\bar{k}]=\omega, then since v\in\mathcal{M}^{(j)}_{i}[\bar{k}], node i would update \tau^{(j)}_{i}[\bar{k}] based on (9). Else, if \tau^{(j)}_{i}[\bar{k}]\neq\omega, there are two possibilities: (i) v\in\mathcal{F}^{(j)}_{i}[\bar{k}], implying \mathcal{F}^{(j)}_{i}[\bar{k}]\neq\emptyset; or (ii) v\notin\mathcal{F}^{(j)}_{i}[\bar{k}], implying \tau^{(j)}_{i}[\bar{k}]\leq\tau^{(j)}_{v}[\bar{k}]\leq\bar{k}. The above discussion, coupled with the freshness-index update rules for Case 2 of the algorithm (line 5 of Algo. 1) and (27), implies \tau^{(j)}_{i}[t_{m}]\neq\omega and \tau^{(j)}_{i}[t_{m}]\leq t_{m}. This completes the induction step.
Appealing to (27) once again, and noting that \bigcup_{q=0}^{N-1}\mathcal{C}^{(j)}_{q}=\mathcal{V} and t_{N-1}=\sum_{q=0}^{N-2}f(t_{q}), establishes (25), and (26) when p=1.

In order to establish (26) for any p\in\mathbb{N}_{+}, one can follow a similar line of argument to analyze the evolution of the freshness-indices over the interval [t_{(p-1)(N-1)},t_{p(N-1)}]. In particular, for any p>1, we can set \mathcal{D}^{(j)}_{0}=\{j\}, and define the sets \mathcal{D}^{(j)}_{r}, 1\leq r\leq(N-1), recursively as follows:

\mathcal{D}^{(j)}_{r}\triangleq\{i\in\mathcal{V}\setminus\bigcup_{q=0}^{(r-1)}\mathcal{D}^{(j)}_{q}:\{\bigcup_{\tau=t_{h(p,N,r)-1}}^{t_{h(p,N,r)}-1}\mathcal{N}_{i}[\tau]\}\cap\{\bigcup_{q=0}^{(r-1)}\mathcal{D}^{(j)}_{q}\}\neq\emptyset\}, (30)

where h(p,N,r)=(p-1)(N-1)+r. One can then establish via induction that \tau^{(j)}_{i}[t_{h(p,N,r)}]\leq\sum_{q=(p-1)(N-1)}^{h(p,N,r)-1}f(t_{q}),\forall i\in\mathcal{D}^{(j)}_{r},\forall r\in\{1,\ldots,N-1\}. ∎

We are now in a position to prove Theorem 1.

Proof.

(Theorem 1) The proof is divided into two parts. In the first part, we describe a procedure for designing the observer gains \{\mathbf{L}_{i}\}^{N}_{i=1}. In the second part, we establish that our choice of observer gains indeed leads to the desired exponential convergence rate \rho.

Design of the observer gains: We begin by noting that for each sub-state j, one can always find scalars \beta_{j},\gamma_{j}\geq 1 such that \|(\mathbf{A}_{jj})^{k}\|\leq\beta_{j}\gamma^{k}_{j},\forall k\in\mathbb{N} [32] (here, \|\mathbf{A}\| refers to the induced 2-norm of a matrix \mathbf{A}). Define \gamma\triangleq\max_{1\leq j\leq N}\gamma_{j}. Next, fix a \bar{\delta}\in(\delta,1), where \delta is as in (3). Given a desired rate of convergence \rho\in(0,1), we now recursively define two sets of positive scalars, namely \{\rho_{j}\}^{N}_{j=1} and \{\lambda_{j}\}^{N}_{j=1}, starting with j=N. With \lambda_{N}=\rho, let \rho_{j}, j=N, be chosen to satisfy:

\gamma^{\bar{\delta}}\rho^{1-\bar{\delta}}_{j}\leq\lambda_{j}. (31)

Having picked \rho_{j}\in(0,1) to meet the above condition, we set \lambda_{j-1} to be any number in (0,\rho_{j}), pick \rho_{j-1} to satisfy (31), and then repeat this process till we reach j=1. Observe that the sets \{\rho_{j}\}^{N}_{j=1} and \{\lambda_{j}\}^{N}_{j=1} as defined above always exist, and satisfy: \rho_{1}<\lambda_{1}<\rho_{2}<\lambda_{2}<\cdots<\lambda_{N-1}<\rho_{N}<\lambda_{N}=\rho. For each sub-state j\in\{1,\ldots,N\}, let the corresponding source node j design the observer gain \mathbf{L}_{j} (featuring in equation (8)) so that the matrix (\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj}) has distinct real eigenvalues with spectral radius equal to \rho_{j}. Such a choice of \mathbf{L}_{j} exists because the pair (\mathbf{A}_{jj},\mathbf{C}_{jj}) is observable by construction. This completes our design procedure.
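The backward recursion for the rate sequences can be sketched in a few lines. Assumptions: we take equality in (31), i.e., \rho_{j}=(\lambda_{j}/\gamma^{\bar{\delta}})^{1/(1-\bar{\delta})}, and choose \lambda_{j-1}=\rho_{j}/2 (any value in (0,\rho_{j}) would do); the numeric values of \gamma and \bar{\delta} are made up.

```python
def design_rates(N, rho, gamma, dbar):
    """Pick {rho_j}, {lambda_j} backwards from j = N, per the text's recursion."""
    lam = [0.0] * (N + 1)   # 1-indexed
    rho_ = [0.0] * (N + 1)
    lam[N] = rho
    for j in range(N, 0, -1):
        # equality in (31): gamma^dbar * rho_j^{1-dbar} = lambda_j
        rho_[j] = (lam[j] / gamma ** dbar) ** (1.0 / (1.0 - dbar))
        if j > 1:
            lam[j - 1] = 0.5 * rho_[j]   # any value in (0, rho_j) works
    return rho_[1:], lam[1:]

gamma, dbar = 1.5, 0.3   # gamma >= 1, dbar in (delta, 1); illustrative values
rhos, lams = design_rates(N=4, rho=0.9, gamma=gamma, dbar=dbar)
# Interlacing rho_1 < lambda_1 < rho_2 < ... < rho_N < lambda_N = rho
seq = [x for pair in zip(rhos, lams) for x in pair]
print(all(a < b for a, b in zip(seq, seq[1:])))  # -> True
```

Since \lambda_{j}<1 and \gamma\geq 1, the closed-form choice automatically lands in (0,1) and below \lambda_{j}, so the interlacing chain claimed in the text holds.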

Convergence analysis: We first note that there exists a set of positive scalars \{\alpha_{1},\ldots,\alpha_{N}\} such that [32]:

\|(\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj})^{k}\|\leq\alpha_{j}\rho_{j}^{k},\forall k\in\mathbb{N}. (32)

For a particular sub-state j, let \mathbf{e}^{(j)}_{i}[k]=\hat{\mathbf{z}}^{(j)}_{i}[k]-\mathbf{z}^{(j)}[k]. Consider the first sub-state j=1, and observe that based on (4), (5), and (8), the following is true: \mathbf{e}^{(1)}_{1}[k+1]=(\mathbf{A}_{11}-\mathbf{L}_{1}\mathbf{C}_{11})\mathbf{e}^{(1)}_{1}[k]. Thus, we obtain:

\mathbf{e}^{(1)}_{1}[k]=(\mathbf{A}_{11}-\mathbf{L}_{1}\mathbf{C}_{11})^{k}\mathbf{e}^{(1)}_{1}[0]. (33)

Based on (32) and (33), we then have:

\|\mathbf{e}^{(1)}_{1}[k]\|\leq c_{1}\rho_{1}^{k},\forall k\in\mathbb{N}, (34)

where c_{1}\triangleq\alpha_{1}\|\mathbf{e}^{(1)}_{1}[0]\|. Given that node 1's error for sub-state 1 decays exponentially as per (34), we now want to relate the errors \mathbf{e}^{(1)}_{i}[k], i\in\mathcal{V}\setminus\{1\}, of the non-source nodes (for sub-state 1) to \mathbf{e}^{(1)}_{1}[k]. To this end, consider any i\in\mathcal{V}\setminus\{1\}, and note that for any k\geq t_{N-1}, Eq. (25) in Lemma 4 implies that \tau^{(1)}_{i}[k]\neq\omega, and hence \tau^{(1)}_{i}[k]\in\mathbb{N}_{+}. Invoking Lemma 3, and using the fact that \mathbf{z}^{(1)}[k]=(\mathbf{A}_{11})^{m}\mathbf{z}^{(1)}[k-m],\forall m\in\mathbb{N}, we then obtain the following \forall i\in\mathcal{V}\setminus\{1\}:

\mathbf{e}^{(1)}_{i}[k]=(\mathbf{A}_{11})^{\tau^{(1)}_{i}[k]}\mathbf{e}^{(1)}_{1}[k-\tau^{(1)}_{i}[k]],\forall k\geq t_{N-1}. (35)

Our next goal is to bound the delay term \tau^{(1)}_{i}[k] in the above relation. For this purpose, consider any time-step k\geq t_{N-1}, and let p(k) be the largest integer such that t_{p(k)(N-1)}\leq k. Then, for any sub-state j, and any i\in\mathcal{V}\setminus\{j\}, we observe:

\tau^{(j)}_{i}[k]\overset{(a)}{\leq}\tau^{(j)}_{i}[t_{p(k)(N-1)}]+(k-t_{p(k)(N-1)})\overset{(b)}{\leq}\sum_{q=(p(k)-1)(N-1)}^{p(k)(N-1)-1}f(t_{q})+(k-t_{p(k)(N-1)})\overset{(c)}{\leq}2(N-1)f(m(k))\overset{(d)}{=}2(N-1)g(k). (36)

In the above inequalities, (a) follows from (25) in Lemma 4 and (27); (b) follows from (26) in Lemma 4; and (c) follows from the monotonicity of f(\cdot) in condition (C1), and by recalling that m(k)\triangleq\max\{t_{q}\in\mathbb{I}:t_{q}\leq k\}. Finally, (d) follows by recalling that g(k)=f(m(k)). Recalling that \|(\mathbf{A}_{11})^{k}\|\leq\beta_{1}\gamma^{k}_{1}, using the bounds in (34) and (36), the facts that \gamma_{1}\geq 1 and \rho_{1}<1, and the sub-multiplicative property of the 2-norm, we obtain the following by taking norms on both sides of (35):

\|\mathbf{e}^{(1)}_{i}[k]\|\leq\bar{c}_{1}\left(\frac{\gamma_{1}}{\rho_{1}}\right)^{\tilde{g}(k)}\rho_{1}^{k},\forall k\geq t_{N-1},\forall i\in\mathcal{V}\setminus\{1\}, (37)

where \tilde{g}(k)=2(N-1)g(k) and \bar{c}_{1}\triangleq c_{1}\beta_{1}. Based on condition (C3), and our choice of \bar{\delta}, observe that there exists \bar{k}(\bar{\delta}) such that \tilde{g}(k)\leq\bar{\delta}k,\forall k\geq\bar{k}(\bar{\delta}). With k_{1}\triangleq\max\{t_{N-1},\bar{k}(\bar{\delta})\}, we then obtain the following based on (31) and (37), for all k\geq k_{1} and for all i\in\mathcal{V}\setminus\{1\}:

\|\mathbf{e}^{(1)}_{i}[k]\|\leq\bar{c}_{1}\left(\gamma^{\bar{\delta}}_{1}\rho^{1-\bar{\delta}}_{1}\right)^{k}\leq\bar{c}_{1}\left(\gamma^{\bar{\delta}}\rho^{1-\bar{\delta}}_{1}\right)^{k}\leq\bar{c}_{1}\lambda^{k}_{1}. (38)

Note that since \bar{c}_{1}\geq c_{1} and \lambda_{1}>\rho_{1}, the above bound applies to node 1 as well (see equation (34)). We have thus established that exponential convergence at rate \lambda_{1} for sub-state 1 holds for each node in the network.

Our aim is now to obtain a bound similar to that in (38) for each sub-state j\in\{2,\ldots,N\}. To this end, with g_{jq}=\|\mathbf{A}_{jq}-\mathbf{L}_{j}\mathbf{C}_{jq}\| and h_{jq}=\|\mathbf{A}_{jq}\|, let us define the following quantities recursively for j\in\{2,\ldots,N\}:

k_{j}\triangleq\frac{k_{j-1}}{(1-\bar{\delta})}, (39)
c_{j}\triangleq\frac{\alpha_{j}}{\rho_{j}^{k_{j-1}}}\left(\|\mathbf{e}^{(j)}_{j}[k_{j-1}]\|+\sum_{q=1}^{(j-1)}\frac{g_{jq}\bar{c}_{q}}{(\rho_{j}-\lambda_{q})}\lambda_{q}^{k_{j-1}}\right),
\bar{c}_{j}\triangleq\beta_{j}\left(c_{j}+\sum_{q=1}^{(j-1)}\frac{h_{jq}\bar{c}_{q}}{(\gamma_{j}-\lambda_{q})}\right),

where k_{1}\triangleq\max\{t_{N-1},\bar{k}(\bar{\delta})\}, c_{1}\triangleq\alpha_{1}\|\mathbf{e}^{(1)}_{1}[0]\|, and \bar{c}_{1}=c_{1}\beta_{1}. Based on the above definitions, we claim that for each sub-state j\in\{1,\ldots,N\}, the following is true:

\|\mathbf{e}^{(j)}_{i}[k]\|\leq\bar{c}_{j}\lambda^{k}_{j},\forall k\geq k_{j},\forall i\in\mathcal{V}. (40)

We will prove the above claim via induction on the sub-state number j. We have already established (40) for the base case j=1. For j\geq 2, our strategy will be to first analyze the evolution of \mathbf{e}^{(j)}_{j}[k] at the source node j. From (4) and (5), we note that the dynamics of the j-th sub-state are coupled with those of the first j-1 sub-states. Thus, \mathbf{e}^{(j)}_{j}[k] will exhibit exponential decay only when the errors for the first j-1 sub-states have already started decaying exponentially, with k_{j-1} (as defined in (39)) representing the instant when exponential decay for the (j-1)-th sub-state kicks in. Let us now prove that as soon as this happens, the following holds:

\|\mathbf{e}^{(j)}_{j}[k]\|\leq c_{j}\rho^{k}_{j},\forall k\geq k_{j-1}. (41)

To do so, suppose (40) holds for all j\in\{1,\ldots,l-1\}, where l\in\{2,\ldots,N\}. Now let j=l and observe that equations (4) and (5) yield:

\begin{aligned}\mathbf{z}^{(l)}[k+1]&=\mathbf{A}_{ll}\mathbf{z}^{(l)}[k]+\sum_{q=1}^{(l-1)}\mathbf{A}_{lq}\mathbf{z}^{(q)}[k]\\&=(\mathbf{A}_{ll}-\mathbf{L}_{l}\mathbf{C}_{ll})\mathbf{z}^{(l)}[k]+\sum_{q=1}^{(l-1)}(\mathbf{A}_{lq}-\mathbf{L}_{l}\mathbf{C}_{lq})\mathbf{z}^{(q)}[k]+\mathbf{L}_{l}\mathbf{y}_{l}[k].\end{aligned} (42)

Based on the above equation and (8), we obtain:

\mathbf{e}^{(l)}_{l}[k+1]=(\mathbf{A}_{ll}-\mathbf{L}_{l}\mathbf{C}_{ll})\mathbf{e}^{(l)}_{l}[k]+\sum_{q=1}^{(l-1)}(\mathbf{A}_{lq}-\mathbf{L}_{l}\mathbf{C}_{lq})\mathbf{e}^{(q)}_{l}[k].

Rolling out the above equation starting from kl1k_{l-1} yields:

\mathbf{e}^{(l)}_{l}[k]=F_{ll}^{(k-k_{l-1})}\mathbf{e}^{(l)}_{l}[k_{l-1}]+\sum_{q=1}^{(l-1)}\sum_{\tau=k_{l-1}}^{(k-1)}F_{ll}^{(k-\tau-1)}G_{lq}\mathbf{e}^{(q)}_{l}[\tau], (43)

where F_{ll}=(\mathbf{A}_{ll}-\mathbf{L}_{l}\mathbf{C}_{ll}) and G_{lq}=(\mathbf{A}_{lq}-\mathbf{L}_{l}\mathbf{C}_{lq}). Taking norms on both sides of the above equation, and using the triangle inequality and the sub-multiplicative property of the 2-norm, we obtain:

\begin{aligned}\|\mathbf{e}^{(l)}_{l}[k]\|&\overset{(a)}{\leq}\alpha_{l}\rho^{k}_{l}\left(\frac{\|\mathbf{e}^{(l)}_{l}[k_{l-1}]\|}{\rho_{l}^{k_{l-1}}}+\frac{1}{\rho_{l}}\sum_{q=1}^{(l-1)}g_{lq}\sum_{\tau=k_{l-1}}^{(k-1)}\rho^{-\tau}_{l}\|\mathbf{e}^{(q)}_{l}[\tau]\|\right)\\&\overset{(b)}{\leq}\alpha_{l}\rho^{k}_{l}\left(\frac{\|\mathbf{e}^{(l)}_{l}[k_{l-1}]\|}{\rho_{l}^{k_{l-1}}}+\frac{1}{\rho_{l}}\sum_{q=1}^{(l-1)}g_{lq}\bar{c}_{q}\sum_{\tau=k_{l-1}}^{\infty}\left(\frac{\lambda_{q}}{\rho_{l}}\right)^{\tau}\right)\\&\overset{(c)}{\leq}c_{l}\rho^{k}_{l},\forall k\geq k_{l-1}.\end{aligned} (44)

In the above inequalities, (a) follows from (32) and by recalling that g_{lq}=\|G_{lq}\|; (b) follows by first applying the induction hypothesis (noting that q\leq(l-1) and \tau\geq k_{l-1}), and then changing the upper limit of the inner summation (over time); (c) follows by simplifying the preceding inequality using the fact that \lambda_{q}<\rho_{l},\forall q\in\{1,\ldots,l-1\}, and using the definition of c_{l} in (39). We have thus obtained a bound on the estimation error of sub-state l at node l. To obtain a similar bound for each i\in\mathcal{V}\setminus\{l\}, note that equation (42) can be rolled out over time to yield the following for each m\in\mathbb{N}:
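The bound (44) can be sanity-checked in a scalar toy model. Assumptions: all "matrices" are positive scalars (so \alpha_{l}=1 and norms are absolute values), the coupling error is taken at its worst-case envelope \bar{c}_{q}\lambda_{q}^{\tau}, and the numeric constants are made up.

```python
rho_l, lam_q = 0.8, 0.5   # lam_q < rho_l, as the recursion requires
g_lq, cbar_q = 0.7, 2.0   # coupling gain and inherited error constant
k_prev = 3                # plays the role of k_{l-1}
e0 = 1.5                  # |e_l^{(l)}[k_prev]|

# c_l from (39), with alpha_l = 1
c_l = (1.0 / rho_l ** k_prev) * (e0 + g_lq * cbar_q / (rho_l - lam_q) * lam_q ** k_prev)

# Roll out e[k+1] = rho_l * e[k] + g_lq * cbar_q * lam_q^k, the scalar form of (43)
e = e0
for k in range(k_prev, 60):
    assert abs(e) <= c_l * rho_l ** k + 1e-12   # the claimed bound (44)
    e = rho_l * e + g_lq * cbar_q * lam_q ** k
print("bound holds")
```

The geometric-series step (b)-(c) is exactly what makes c_l independent of k, which is why the decay rate \rho_{l} survives the coupling with the slower-indexed sub-states.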

\mathbf{z}^{(l)}[k]=\mathbf{A}_{ll}^{m}\mathbf{z}^{(l)}[k-m]+\sum_{q=1}^{(l-1)}\sum_{\tau=(k-m)}^{(k-1)}\mathbf{A}_{ll}^{(k-\tau-1)}\mathbf{A}_{lq}\mathbf{z}^{(q)}[\tau].

Leveraging Lemma 3, we can then obtain the following error dynamics for a node i\in\mathcal{V}\setminus\{l\},\forall k\geq t_{N-1}:

\mathbf{e}^{(l)}_{i}[k]=(\mathbf{A}_{ll})^{\tau^{(l)}_{i}[k]}\mathbf{e}^{(l)}_{l}[k-\tau^{(l)}_{i}[k]]+\sum_{q=1}^{(l-1)}\sum_{\tau=(k-\tau^{(l)}_{i}[k])}^{(k-1)}\mathbf{A}_{ll}^{(k-\tau-1)}\mathbf{A}_{lq}\mathbf{e}^{(q)}_{v(\tau+1)}[\tau]. (45)

Based on the above equation, we note that since \mathbf{A}_{ll} can contain unstable eigenvalues, and since \tau^{(l)}_{i}[k] may grow over time (owing to potentially growing inter-communication intervals), we need the decay in \mathbf{e}^{(l)}_{l}[k-\tau^{(l)}_{i}[k]] to dominate the growth due to (\mathbf{A}_{ll})^{\tau^{(l)}_{i}[k]} in order for \mathbf{e}^{(l)}_{i}[k] to eventually remain bounded. To show that this is indeed the case, we begin by noting the following inequalities that hold when k\geq k_{l}:

\frac{k_{l-1}}{k}\overset{(a)}{\leq}1-\bar{\delta}\overset{(b)}{\leq}1-\frac{\tilde{g}(k)}{k}\overset{(c)}{\leq}1-\frac{\tau^{(l)}_{i}[k]}{k}, (46)

where \tilde{g}(k)=2(N-1)g(k). In the above inequalities, (a) follows directly from (39); (b) follows by noting that k\geq k_{l}\implies k\geq\bar{k}(\bar{\delta}); and (c) follows from (36) and by noting that k\geq k_{l}\implies k\geq t_{N-1}. We conclude that if k\geq k_{l}, then k-\tau^{(l)}_{i}[k]\geq k_{l-1}. Thus, when k\geq k_{l}, at any time-step \tau\geq k-\tau^{(l)}_{i}[k], the errors of the first l-1 sub-states exhibit exponential decay based on the induction hypothesis. With this in mind, we fix i\in\mathcal{V}\setminus\{l\} and k\geq k_{l}, and bound \mathbf{e}^{(l)}_{i}[k] by taking norms on both sides of (45), as follows:

\begin{aligned}\|\mathbf{e}^{(l)}_{i}[k]\|&\overset{(a)}{\leq}\beta_{l}\left(c_{l}\left(\frac{\gamma_{l}}{\rho_{l}}\right)^{\tilde{g}(k)}\rho^{k}_{l}+\gamma^{(k-1)}_{l}\sum_{q=1}^{(l-1)}h_{lq}\bar{c}_{q}\sum_{\tau=(k-\tau^{(l)}_{i}[k])}^{(k-1)}\left(\frac{\lambda_{q}}{\gamma_{l}}\right)^{\tau}\right)\\&\overset{(b)}{\leq}\beta_{l}\left(c_{l}\left(\frac{\gamma_{l}}{\rho_{l}}\right)^{\tilde{g}(k)}\rho^{k}_{l}+\gamma^{(k-1)}_{l}\sum_{q=1}^{(l-1)}h_{lq}\bar{c}_{q}\sum_{\tau=(k-\tilde{g}(k))}^{\infty}\left(\frac{\lambda_{q}}{\gamma_{l}}\right)^{\tau}\right)\\&\overset{(c)}{=}\beta_{l}\left(c_{l}\left(\frac{\gamma_{l}}{\rho_{l}}\right)^{\tilde{g}(k)}\rho^{k}_{l}+\sum_{q=1}^{(l-1)}\frac{h_{lq}\bar{c}_{q}}{(\gamma_{l}-\lambda_{q})}\left(\frac{\gamma_{l}}{\lambda_{q}}\right)^{\tilde{g}(k)}\lambda^{k}_{q}\right)\\&\overset{(d)}{\leq}\bar{c}_{l}\left(\gamma^{\bar{\delta}}\rho^{1-\bar{\delta}}_{l}\right)^{k}\overset{(e)}{\leq}\bar{c}_{l}\lambda^{k}_{l}.\end{aligned} (47)

In the above steps, (a) follows by first recalling that \|(\mathbf{A}_{ll})^{k}\|\leq\beta_{l}\gamma^{k}_{l},\forall k\in\mathbb{N}, h_{lq}=\|\mathbf{A}_{lq}\|, and \tilde{g}(k)=2(N-1)g(k), and then using the induction hypothesis, equations (36), (40), and (41), and the facts that \rho_{l}<1,\gamma_{l}\geq 1; (b) follows by suitably changing the upper and lower limits of the inner summation (over time), a change that is warranted since each summand is non-negative; (c) follows by simplifying the preceding inequality; (d) follows by noting that \lambda_{q}<\rho_{l},\forall q\in\{1,\ldots,l-1\}, using the definition of \bar{c}_{l} in (39), and the fact that \tilde{g}(k)\leq\bar{\delta}k,\forall k\geq k_{l}; and finally (e) follows from (31). This completes the induction step. Let \mathbf{e}_{i}[k]=\hat{\mathbf{z}}_{i}[k]-\mathbf{z}[k]. Recalling that \lambda_{j}\leq\rho,\forall j\in\{1,\ldots,N\}, we obtain, as desired:

\|\mathbf{e}_{i}[k]\|=\sqrt{\sum_{j=1}^{N}\|\mathbf{e}^{(j)}_{i}[k]\|^{2}}\leq\left(\sqrt{\sum_{j=1}^{N}\bar{c}^{2}_{j}}\right)\rho^{k},\forall k\geq k_{N},\forall i\in\mathcal{V}. ∎

Appendix B Proof of Proposition 1

Proof.

(Proposition 1) For each sub-state j\in\{1,\ldots,N\}, let the corresponding source node j design the observer gain \mathbf{L}_{j} (featuring in equation (8)) so that the matrix (\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj}) has all its eigenvalues at 0. Such a choice of \mathbf{L}_{j} exists because the pair (\mathbf{A}_{jj},\mathbf{C}_{jj}) is observable by construction. Let n_{j}=\dim(\mathbf{A}_{jj}). Given the above choice of observer gains, we will prove the result by providing an upper bound on the number of time-steps it takes the error of each node to converge to \mathbf{0}. To this end, we first define a sequence \{\bar{\tau}_{j}\}_{j=1}^{N} of time-steps as follows:

\bar{\tau}_{1}=\inf\{\tau\in\mathbb{N}_{+},\tau\geq t_{N-1}:k-\tilde{g}(k)\geq n_{1},\forall k\geq\tau\}, (48)

where \tilde{g}(k)=2(N-1)g(k), and

\bar{\tau}_{j}=\inf\{\tau\in\mathbb{N}_{+}:k-\tilde{g}(k)\geq\bar{\tau}_{j-1}+n_{j},\forall k\geq\tau\}. (49)

Based on condition (C3), namely \limsup_{k\to\infty}g(k)/k=\delta<1, observe that \bar{\tau}_{j} as defined above is finite \forall j\in\{1,\ldots,N\}. Next, note that by construction, (\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj}) is a nilpotent matrix of index at most n_{j}. Thus, it is easy to see that \mathbf{e}^{(1)}_{1}[k]=\mathbf{0},\forall k\geq n_{1}, based on (33). Recall from (36) that for each sub-state j, \tau^{(j)}_{i}[k]\leq\tilde{g}(k),\forall k\geq t_{N-1},\forall i\in\mathcal{V}. From the definition of \bar{\tau}_{1} in (48), and equation (35), we immediately obtain that \mathbf{e}^{(1)}_{i}[k]=\mathbf{0},\forall k\geq\bar{\tau}_{1},\forall i\in\mathcal{V}. One can easily generalize this argument to the remaining sub-states by using inductive reasoning akin to that in the proof of Theorem 1. In particular, for any sub-state j\in\{2,\ldots,N\}, one can roll out the error dynamics for node j as in (43), starting from time-step \bar{\tau}_{j-1}. By this time, the induction hypothesis would imply that the estimation errors of all nodes on all sub-states q\in\{1,\ldots,j-1\} have converged to zero. The nilpotency of (\mathbf{A}_{jj}-\mathbf{L}_{j}\mathbf{C}_{jj}) would then imply that \mathbf{e}^{(j)}_{j}[k]=\mathbf{0},\forall k\geq\bar{\tau}_{j-1}+n_{j}. From the definition of \bar{\tau}_{j} in (49), and (36), we note that k\geq\bar{\tau}_{j}\implies k-\tau^{(j)}_{i}[k]\geq\bar{\tau}_{j-1}+n_{j},\forall i\in\mathcal{V}. Referring to (45), we conclude that \mathbf{e}^{(j)}_{i}[k]=\mathbf{0},\forall k\geq\bar{\tau}_{j},\forall i\in\mathcal{V}. Based on the above reasoning, the overall error for each node converges to \mathbf{0} in at most \bar{\tau}_{N} time-steps. ∎
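The nilpotency argument is the classical deadbeat-observer construction, and a minimal sketch makes it concrete. Assumptions: the specific (A, C) pair and the gain L below are hand-picked for illustration (not a general eigenvalue-placement routine); what matters is that A - LC is nilpotent, so the error dead-beats to zero in at most n steps.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # marginally unstable (double eigenvalue at 1)
C = np.array([[1.0, 0.0]])
L = np.array([[2.0],
              [1.0]])        # chosen so that A - L @ C is nilpotent

F = A - L @ C                # [[-1, 1], [-1, 1]]; F @ F = 0
x = np.array([0.3, -0.7])    # true state
xhat = np.zeros(2)           # observer estimate
for k in range(3):
    y = C @ x                            # measurement at time k
    x = A @ x                            # plant update
    xhat = A @ xhat + L @ (y - C @ xhat) # Luenberger-style correction

print(np.allclose(x, xhat))  # -> True: error vanishes after n = 2 steps
```

The error obeys e[k+1] = (A - LC) e[k], so nilpotency of index 2 forces e[2] = 0 regardless of the initial error, mirroring the role of n_j in (48)-(49).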

Appendix C Proof of Theorem 2

In this section, we develop the proof of Theorem 2. We begin with the following lemma, which characterizes the evolution of the vector of freshness-indices \mathbf{d}_{i}[k].

Lemma 5.

Suppose at any time-step k\in\mathbb{N}_{+}, \tau_{i}[k]\neq\omega for some i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R}. Then, Algorithm 2 implies the following:

\mathbf{d}_{i}[k+1]\leq\mathbf{d}_{i}[k]+\mathbf{1}_{2f+1}, (50)
\tau_{i}[k+1]\leq\tau_{i}[k]+1, (51)

where the first inequality holds component-wise.

Proof.

Consider a node i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R}, and suppose \tau_{i}[k]\neq\omega. This implies that |\mathcal{M}_{i}[k]|=2f+1, and hence, all entries of the vector \mathbf{d}_{i}[k] are populated with non-negative integers. Now fix any component p\in\{1,\ldots,2f+1\} of \mathbf{d}_{i}[k], and suppose that it corresponds to node l\in\mathcal{M}_{i}[k], i.e., focus on d_{i,l}[k]. This entry undergoes the following three operations (in order) prior to the end of time-step k+1: (i) it gets incremented by 1 as per (17); (ii) it gets potentially replaced by a value strictly smaller than d_{i,l}[k]+1 if node i hears from node l at time k+1 (line 15 of Algo. 2); and (iii) it gets subjected to the operation in line 16 of Algo. 2. Clearly, after operations (i) and (ii), the entry corresponding to node l can increase by at most 1. At the end of operation (iii), node l either gets retained in \mathcal{M}_{i}[k+1], or gets replaced. If it is retained, we clearly have d_{i,l}[k+1]\leq d_{i,l}[k]+1, and d_{i,l}[k+1] gets stored in the p-th component of \mathbf{d}_{i}[k+1] (see footnote 10). If node l gets removed from \mathcal{M}_{i}[k+1], then given that the 2f+1 nodes with the lowest freshness-indices populate \mathcal{M}_{i}[k+1] (line 16 of Algo. 2), it must be that the node replacing l in \mathcal{M}_{i}[k+1] has freshness-index at most d_{i,l}[k]+1 at time k+1. Thus, regardless of whether node l gets retained or removed, the p-th component of \mathbf{d}_{i}[k] can increase by at most 1 by the end of time-step k+1. Noting that the above argument holds for any component p of \mathbf{d}_{i}[k] establishes (50); Eq. (51) then follows readily from (20) and (50). ∎
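The three operations in this proof can be exercised on random data. This is a sketch under assumptions: the "hear from" pattern and index values are randomized stand-ins (not the paper's Algorithm 2), and we check the component-wise inequality (50) on sorted order statistics, matching the list-maintenance rule of keeping the 2f+1 smallest indices.

```python
import random

random.seed(1)
f = 2
m = 2 * f + 1  # size of the list M_i

for trial in range(200):
    d = sorted(random.randrange(20) for _ in range(m))    # current d_i[k]
    # (i) every entry is incremented by one, as per (17)
    stepped = [x + 1 for x in d]
    # (ii) entries node i hears from may be replaced by something strictly smaller
    heard = [x if random.random() < 0.5 else random.randrange(x)
             for x in stepped]
    # fresh candidates reported by other neighbors at time k+1
    candidates = heard + [random.randrange(25) for _ in range(3)]
    # (iii) keep the 2f+1 smallest indices (line 16 of Algo. 2)
    d_next = sorted(candidates)[:m]
    # Component-wise version of (50): d_i[k+1] <= d_i[k] + 1
    assert all(a <= b + 1 for a, b in zip(d_next, d))
print("inequality (50) holds on all trials")
```

The assertion is guaranteed here because adding candidates can only lower order statistics, and each surviving entry is at most its predecessor plus one, which is exactly the replacement argument in the proof.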

The next lemma is the analogue of Lemma 26 for the adversarial case. It tells us that the freshness-index of a regular node gets “triggered” (i.e., assumes a value other than ω\omega) after a finite period of time, and that it is eventually uniformly bounded above. Here, uniformity is w.r.t. all graph sequences that are jointly strongly (3f+1)(3f+1)-robust w.r.t. 𝒮\mathcal{S}.

Lemma 6.

Suppose the conditions in the statement of Theorem 2 hold. Then, Algorithm 2 guarantees the following.

τi[k]ω,k(N|𝒮|)T,i,and\tau_{i}[k]\neq\omega,\forall k\geq(N-|\mathcal{S}|)T,\forall i\in\mathcal{R},\hskip 5.69054pt\textrm{and} (52)
τi[k]2(N|𝒮|)T,k(N|𝒮|)T,i,\tau_{i}[k]\leq 2(N-|\mathcal{S}|)T,\forall k\geq(N-|\mathcal{S}|)T,\forall i\in\mathcal{R}, (53)

where TT has the same meaning as in Definition 3.

Proof.

The proof of this result mirrors that of Lemma 26; hence, we only sketch the essential details. First, observe that the claims made in the statement of the lemma hold trivially for any i𝒮i\in\mathcal{S}\cap\mathcal{R}, since each such node maintains τi[k]=0\tau_{i}[k]=0 for all time. Let us now focus on establishing (52) for all regular non-source nodes. To this end, let 𝒞0=𝒮\mathcal{C}_{0}=\mathcal{S}, and define:

𝒞1{i𝒱𝒞0:|{τ=0T1𝒩i[τ]}𝒞0|3f+1}.\mathcal{C}_{1}\triangleq\{i\in\mathcal{V}\setminus\mathcal{C}_{0}:|\{\bigcup\limits_{\tau=0}^{T-1}\mathcal{N}_{i}[\tau]\}\cap\mathcal{C}_{0}|\geq 3f+1\}. (54)

In words, i𝒞1i\in\mathcal{C}_{1} if and only if node ii has at least 3f+13f+1 neighbors from the source set 𝒞0=𝒮\mathcal{C}_{0}=\mathcal{S} over the interval [0,T)[0,T). Suppose 𝒱𝒮\mathcal{V}\setminus\mathcal{S}\neq\emptyset (else, there is nothing to prove). Then, 𝒞1\mathcal{C}_{1} must also be non-empty based on the definition of joint strong (3f+1)(3f+1)-robustness w.r.t. 𝒮\mathcal{S}. Now consider any i𝒞1i\in\mathcal{C}_{1}\cap\mathcal{R}. Since at most ff nodes are adversarial, node ii must have heard from at least 2f+12f+1 regular source nodes (not necessarily all at the same time-step) over the interval [0,T)[0,T), with each such node reporting a freshness-index of 0. Regardless of whether or not node ii appends all such nodes to i\mathcal{M}_{i}, the fact that it has the opportunity to do so implies that τi[T]ω\tau_{i}[T]\neq\omega. Moreover, since node ii appends a node ll to i[k]\mathcal{M}_{i}[k] only if τl[k]k\tau_{l}[k]\leq k, and appealing to (51), we see that τi[T]T\tau_{i}[T]\leq T. Using (51) again, we have τi[(N|𝒮|)T](N|𝒮|)T.\tau_{i}[(N-|\mathcal{S}|)T]\leq(N-|\mathcal{S}|)T. Now define the sets 𝒞r,1r(N|𝒮|)\mathcal{C}_{r},1\leq r\leq(N-|\mathcal{S}|), recursively as follows:

𝒞r{i𝒱q=0(r1)𝒞q:|{τ=(r1)TrT1𝒩i[τ]}{q=0(r1)𝒞q}|3f+1}.\mathcal{C}_{r}\triangleq\{i\in\mathcal{V}\setminus\bigcup\limits_{q=0}^{(r-1)}\mathcal{C}_{q}:|\{\bigcup\limits_{\tau=(r-1)T}^{rT-1}\mathcal{N}_{i}[\tau]\}\cap\{\bigcup\limits_{q=0}^{(r-1)}\mathcal{C}_{q}\}|\geq 3f+1\}. (55)

We can now employ an inductive argument similar to that in Lemma 26 to establish (52). In particular, proceeding as in the base case when r=1r=1, joint strong (3f+1)(3f+1)-robustness w.r.t. 𝒮\mathcal{S} implies that 𝒞r\mathcal{C}_{r}\neq\emptyset, whenever 𝒱q=0(r1)𝒞q\mathcal{V}\setminus\bigcup\limits_{q=0}^{(r-1)}\mathcal{C}_{q}\neq\emptyset. Since |𝒜|f|\mathcal{A}|\leq f, any node i𝒞ri\in\mathcal{C}_{r}\cap\mathcal{R} must then hear from at least 2f+12f+1 regular nodes in q=0(r1)𝒞q\bigcup\limits_{q=0}^{(r-1)}\mathcal{C}_{q} over the interval [(r1)T,rT)[(r-1)T,rT). Moreover, by inductive reasoning, each such node is a candidate for being appended when it makes contact with node ii. It is then easy to see that for each i𝒞ri\in\mathcal{C}_{r}\cap\mathcal{R}, τi[rT]ω\tau_{i}[rT]\neq\omega and, in particular, τi[rT]rT\tau_{i}[rT]\leq rT based on (51). This establishes the claim in equation (52).

We now turn to establishing the correctness of (53). Let T¯=(N|𝒮|)T.\bar{T}=(N-|\mathcal{S}|)T. We claim τi[pT¯]T¯,p+,i\tau_{i}[p\bar{T}]\leq\bar{T},\forall p\in\mathbb{N}_{+},\forall i\in\mathcal{R}. Note that the argument for the case when p=1p=1 follows from the above discussion together with (51). To prove the claim, it suffices to prove it for the case when p=2p=2, since identical reasoning applies for all p2p\geq 2. To analyze the case p=2p=2, let 𝒟0=𝒮\mathcal{D}_{0}=\mathcal{S}, and define the sets 𝒟r\mathcal{D}_{r}, 1r(N|𝒮|)1\leq r\leq(N-|\mathcal{S}|), recursively as follows:

𝒟r{i𝒱q=0(r1)𝒟q:|{τ=T¯+(r1)TT¯+rT1𝒩i[τ]}{q=0(r1)𝒟q}|3f+1}.\mathcal{D}_{r}\triangleq\{i\in\mathcal{V}\setminus\bigcup\limits_{q=0}^{(r-1)}\mathcal{D}_{q}:|\{\bigcup\limits_{\tau=\bar{T}+(r-1)T}^{\bar{T}+rT-1}\mathcal{N}_{i}[\tau]\}\cap\{\bigcup\limits_{q=0}^{(r-1)}\mathcal{D}_{q}\}|\geq 3f+1\}. (56)

Let us first consider the case when r=1r=1, and accordingly focus on the interval [T¯,T¯+T)[\bar{T},\bar{T}+T). Fix τ[0,T1]\tau\in[0,T-1], and suppose a regular non-source node i𝒟1i\in\mathcal{D}_{1}\cap\mathcal{R} hears from mm regular source nodes over the interval [T¯,T¯+τ][\bar{T},\bar{T}+\tau]. We then claim that at least mm components of 𝐝i[T¯+τ]\mathbf{d}_{i}[\bar{T}+\tau] are at most τ\tau. This is immediate when m=1m=1 based on (50). In particular, whenever a regular source node transmits to node ii, given that it always reports a freshness-index of 0, at least one of the entries of 𝐝i\mathbf{d}_{i} will take on a value of 0 at that instant. Now suppose m=2m=2, and let node ii hear from regular source nodes v1v_{1} and v2v_{2} at time-steps T¯+τ1\bar{T}+\tau_{1} and T¯+τ2\bar{T}+\tau_{2}, respectively, where 0τ1τ2τ0\leq\tau_{1}\leq\tau_{2}\leq\tau. Given that node ii hears from v1v_{1} at T¯+τ1\bar{T}+\tau_{1}, based on (50), it must be that at least one component of 𝐝i[T¯+τ2]\mathbf{d}_{i}[\bar{T}+\tau_{2}] is at most τ2τ1τ2\tau_{2}-\tau_{1}\leq\tau_{2}. Given this, since node ii hears from v2v_{2} at T¯+τ2\bar{T}+\tau_{2} and appends/retains the 2f+12f+1 nodes in i[T¯+τ2]\mathcal{M}_{i}[\bar{T}+\tau_{2}] with the lowest freshness-indices (see line 16 of Algo. 2), at the end of time-step T¯+τ2\bar{T}+\tau_{2}, at least two components of 𝐝i[T¯+τ2]\mathbf{d}_{i}[\bar{T}+\tau_{2}] are at most τ2\tau_{2}; from (50), it follows that each of these components in 𝐝i[T¯+τ]\mathbf{d}_{i}[\bar{T}+\tau] is at most τ\tau. The above argument readily generalizes to any m2m\geq 2. Now based on joint strong (3f+1)(3f+1)-robustness w.r.t. 𝒮\mathcal{S}, 𝒟1\mathcal{D}_{1}\neq\emptyset whenever 𝒱𝒟0\mathcal{V}\setminus\mathcal{D}_{0}\neq\emptyset.
Since |𝒜|f|\mathcal{A}|\leq f, any node i𝒟1i\in\mathcal{D}_{1}\cap\mathcal{R} is guaranteed to hear from at least m=2f+1m=2f+1 regular source nodes over the interval [T¯,T¯+T1][\bar{T},\bar{T}+T-1]. The above discussion then implies that each component of 𝐝i[T¯+T1]\mathbf{d}_{i}[\bar{T}+T-1] is at most T1T-1. It then follows from (20) that for each i𝒟1i\in\mathcal{D}_{1}\cap\mathcal{R}, τi[T¯+T]T\tau_{i}[\bar{T}+T]\leq T, and hence τi[2T¯]T¯\tau_{i}[2\bar{T}]\leq\bar{T} based on (51). To establish that τi[2T¯]T¯,i\tau_{i}[2\bar{T}]\leq\bar{T},\forall i\in\mathcal{R}, one can show via induction that τi[T¯+rT]rT,i𝒟r\tau_{i}[\bar{T}+rT]\leq rT,\forall i\in\mathcal{D}_{r}\cap\mathcal{R}. This can be done using arguments similar to those for r=1r=1; hence, we omit them. The rest of the proof can be completed as in Lemma 26. ∎
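The layered propagation argument above can be illustrated with a toy simulation. The sketch below is our own hypothetical instantiation, not the paper's code: it takes the fault-free case f=0f=0 (so each node stores a single freshness copy and 2f+1=12f+1=1), a directed chain with node 0 as the sole source, and edges that appear once every TT steps, and evolves τ\tau via the update rules (17) and (20). On this instance the conclusions (52) and (53) can be checked directly.

```python
# Toy simulation (hypothetical) of the freshness-index dynamics with f = 0.
# Nodes 0,...,N-1 form a directed chain; node 0 is the source (tau = 0),
# and edge (i-1 -> i) is active at times k with k % T == 1.
N, T = 5, 3
tau = [0] + [None] * (N - 1)   # None plays the role of omega
d = [None] * N                 # single stored copy d_i (since 2f+1 = 1)
history = []
for k in range(1, 8 * N * T):
    # tau_i[k] = (max entry of d_i at time k-1) + 1, cf. (20)
    tau = [0] + [d[i] + 1 if d[i] is not None else None
                 for i in range(1, N)]
    # increment the stored copies, cf. (17) ...
    d = [None] + [d[i] + 1 if d[i] is not None else None
                  for i in range(1, N)]
    # ... then overwrite with a strictly smaller reported index on contact
    if k % T == 1:
        for i in range(1, N):
            rep = tau[i - 1]
            if rep is not None and (d[i] is None or rep < d[i]):
                d[i] = rep
    history.append((k, list(tau)))
```

On this chain, node rr is triggered within rTrT steps, and every τi[k]\tau_{i}[k] stays below 2(N1)T2(N-1)T for k(N1)Tk\geq(N-1)T, mirroring (52) and (53).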

The next lemma reveals the implication of τi[k]\tau_{i}[k] taking on a finite value, based on the rules of Algorithm 2. To state the result, let us define Ω(r)[k]{arx^s[kr]:s𝒮}\Omega^{(r)}[k]\triangleq\{a^{r}\hat{x}_{s}[k-r]:s\in\mathcal{S}\cap\mathcal{R}\}, where rr\in\mathbb{N}. We will use Conv(Ω)Conv(\Omega) to denote the convex hull of a finite set of points Ω\Omega.

Lemma 7.

Suppose that at some time-step k+k\in\mathbb{N}_{+}, τi[k]=m\tau_{i}[k]=m for some i{𝒱𝒮}i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R}, where m+m\in\mathbb{N}_{+}. Then, Algorithm 2 implies the following.

x^i[k]Conv(r=1mΩ(r)[k]).\hat{x}_{i}[k]\in Conv\left(\bigcup\limits_{r=1}^{m}\Omega^{(r)}[k]\right). (57)
Proof.

We will prove the result via induction on mm. For the base case m=1m=1, suppose that τi[k]=1\tau_{i}[k]=1 for some node i{𝒱𝒮}i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R} at some time-step kk. Based on (20), it must then be the case that maxli[k1]di,l[k1]=0\max_{l\in\mathcal{M}_{i}[k-1]}d_{i,l}[k-1]=0. Since each entry of 𝐝i[k1]\mathbf{d}_{i}[k-1] is non-negative by definition, we have di,l[k1]=0,li[k1]d_{i,l}[k-1]=0,\forall l\in\mathcal{M}_{i}[k-1]. Since i[k1]\mathcal{M}_{i}[k-1] consists of 2f+12f+1 distinct nodes, and |𝒜|f|\mathcal{A}|\leq f, at least f+1f+1 entries in i[k1]\mathcal{M}_{i}[k-1] correspond to regular nodes, each of which is a source node (every regular non-source node has a freshness-index strictly larger than 11). Thus, node ii must have appended at least f+1f+1 regular source nodes at time k1k-1. For each li[k1]{𝒮}l\in\mathcal{M}_{i}[k-1]\cap\{\mathcal{S}\cap\mathcal{R}\}, observe that x¯i,l[k1]=x^l[k1]\bar{x}_{i,l}[k-1]=\hat{x}_{l}[k-1]. Since |𝒜|f|\mathcal{A}|\leq f, the filtering operation in line 12 of Algo. 2 guarantees:

x¯i[k1]Conv({x¯i,l[k1]:li[k1]}).\bar{x}_{i}[k-1]\in Conv(\{\bar{x}_{i,l}[k-1]:l\in\mathcal{M}_{i}[k-1]\cap\mathcal{R}\}). (58)

Based on the above discussion, we then have that x¯i[k1]Conv(Ω(0)[k1])\bar{x}_{i}[k-1]\in Conv(\Omega^{(0)}[k-1]), and hence x^i[k]Conv(Ω(1)[k])\hat{x}_{i}[k]\in Conv(\Omega^{(1)}[k]), since x^i[k]=ax¯i[k1]\hat{x}_{i}[k]=a\bar{x}_{i}[k-1]. This completes the proof of the base case.
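For concreteness, the convex-hull property (58) is satisfied by the standard scalar trimmed rule used in resilient consensus: from the 2f+12f+1 stored values, discard the ff largest and the ff smallest and keep what remains. We note this as a plausible instantiation of the filtering in line 12 of Algo. 2, not necessarily the exact rule used there.

```python
def filtered_value(stored, f):
    """Trimmed rule on 2f+1 scalars: discard the f largest and the f
    smallest, leaving the median. If at most f of the inputs are
    adversarial, at least one of the f+1 smallest and one of the f+1
    largest values is regular, so the median lies between the smallest
    and largest regular values, i.e. in their convex hull, cf. (58)."""
    assert len(stored) == 2 * f + 1
    return sorted(stored)[f]
```

For example, with f=2f=2, regular values {0.9, 1.0, 1.2} and adversarial values {−100, 100}, the rule returns 1.0, which lies inside the regular hull [0.9, 1.2].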

To generalize the above result, let us first observe the following identity, which holds for any li[k]l\in\mathcal{M}_{i}[k]\cap\mathcal{R} and follows directly from the rules of Algo. 2. Essentially, (59) states that the difference between node ii’s internal copy of node ll’s freshness-index at time kk, namely di,l[k]d_{i,l}[k], and the actual freshness-index τl[ϕi,l[k]]\tau_{l}[\phi_{i,l}[k]] of node ll when it was last appended by node ii, equals the time that has elapsed since that last append, namely (kϕi,l[k])(k-\phi_{i,l}[k]); this observation is immediate from (17):

τl[ϕi,l[k]]+(kϕi,l[k])=di,l[k].\tau_{l}[\phi_{i,l}[k]]+(k-\phi_{i,l}[k])=d_{i,l}[k]. (59)

Now fix an integer q2q\geq 2, and suppose the inclusion in (57) holds for all m{1,,q1}m\in\{1,\ldots,q-1\}. Suppose τi[k]=q\tau_{i}[k]=q for some i{𝒱𝒮}i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R} at some time-step kk. Then, based on (20), we have maxli[k1]di,l[k1]=q1\max_{l\in\mathcal{M}_{i}[k-1]}d_{i,l}[k-1]=q-1. Combining this with (59), we obtain the following for each li[k1]l\in\mathcal{M}_{i}[k-1]\cap\mathcal{R}:

τl[ϕi,l[k1]]+((k1)ϕi,l[k1])q1.\tau_{l}[\phi_{i,l}[k-1]]+((k-1)-\phi_{i,l}[k-1])\leq q-1. (60)

Fix li[k1]l\in\mathcal{M}_{i}[k-1]\cap\mathcal{R}. The above relation can now be exploited to make certain inferences about the estimate of node ll at the time-step ϕi,l[k1]\phi_{i,l}[k-1] when it was last appended by node ii. To see this, consider the case when ll is a non-source node. Since ((k1)ϕi,l[k1])((k-1)-\phi_{i,l}[k-1]) is non-negative, τl[ϕi,l[k1]]q1\tau_{l}[\phi_{i,l}[k-1]]\leq q-1 based on (60). The induction hypothesis thus applies to node ll, and we have:

x^l[ϕi,l[k1]]Conv(r=1τl[ϕi,l[k1]]Ω(r)[ϕi,l[k1]]).\hat{x}_{l}[\phi_{i,l}[k-1]]\in Conv\left(\bigcup\limits_{r=1}^{\tau_{l}[\phi_{i,l}[k-1]]}\Omega^{(r)}[\phi_{i,l}[k-1]]\right). (61)

On the other hand, if ll is a source node, then clearly x^l[ϕi,l[k1]]Conv(Ω(0)[ϕi,l[k1]]).\hat{x}_{l}[\phi_{i,l}[k-1]]\in Conv(\Omega^{(0)}[\phi_{i,l}[k-1]]). Combining the two cases above, and noting that x¯i,l[k1]=agi,l[k1]x^l[ϕi,l[k1]]\bar{x}_{i,l}[k-1]=a^{g_{i,l}[k-1]}\hat{x}_{l}[\phi_{i,l}[k-1]] based on (21), where gi,l[k1]=((k1)ϕi,l[k1])g_{i,l}[k-1]=((k-1)-\phi_{i,l}[k-1]), we have:

x¯i,l[k1]\displaystyle\bar{x}_{i,l}[k-1] Conv(r=gi,l[k1]gi,l[k1]+τl[ϕi,l[k1]]Ω(r)[k1])\displaystyle\in Conv\left(\bigcup\limits_{r=g_{i,l}[k-1]}^{g_{i,l}[k-1]+\tau_{l}[\phi_{i,l}[k-1]]}\Omega^{(r)}[k-1]\right) (62)
Conv(r=0q1Ω(r)[k1]),\displaystyle\in Conv\left(\bigcup\limits_{r=0}^{q-1}\Omega^{(r)}[k-1]\right),

where the last inclusion follows by noting that gi,l[k1]0g_{i,l}[k-1]\geq 0, and that gi,l[k1]+τl[ϕi,l[k1]]q1g_{i,l}[k-1]+\tau_{l}[\phi_{i,l}[k-1]]\leq q-1 based on (60). Appealing to (58), and using (62), we conclude that

x¯i[k1]Conv(r=0q1Ω(r)[k1]).\bar{x}_{i}[k-1]\in Conv\left(\bigcup\limits_{r=0}^{q-1}\Omega^{(r)}[k-1]\right). (63)

Since x^i[k]=ax¯i[k1]\hat{x}_{i}[k]=a\bar{x}_{i}[k-1], we then have

x^i[k]Conv(r=1qΩ(r)[k]),\hat{x}_{i}[k]\in Conv\left(\bigcup\limits_{r=1}^{q}\Omega^{(r)}[k]\right), (64)

which is precisely the desired conclusion. ∎

We are now in a position to prove Theorem 2.

Proof.

(Theorem 2) Let us begin by noting that given any desired convergence rate ρ(0,1)\rho\in(0,1), each node i𝒮i\in\mathcal{S}\cap\mathcal{R} can choose its observer gain lil_{i} to guarantee |ei[k]|αρk,k|e_{i}[k]|\leq\alpha\rho^{k},\forall k\in\mathbb{N}, based on the observer (15). Here, ei[k]=x^i[k]x[k]e_{i}[k]=\hat{x}_{i}[k]-x[k], and α\alpha is some suitable constant.

Now consider any i{𝒱𝒮}i\in\{\mathcal{V}\setminus\mathcal{S}\}\cap\mathcal{R}, and suppose k(N|𝒮|)Tk\geq(N-|\mathcal{S}|)T. From Lemma 6, we know that τi[k]ω\tau_{i}[k]\neq\omega, and hence τi[k]+\tau_{i}[k]\in\mathbb{N}_{+}. Based on Lemma 7, we conclude that there exist non-negative weights wis(r)[k]w^{(r)}_{is}[k], such that s𝒮r=1τi[k]wis(r)[k]=1\sum_{s\in\mathcal{S}\cap\mathcal{R}}\sum_{r=1}^{\tau_{i}[k]}w^{(r)}_{is}[k]=1, and

x^i[k]=s𝒮r=1τi[k]wis(r)[k]arx^s[kr].\hat{x}_{i}[k]=\sum_{s\in\mathcal{S}\cap\mathcal{R}}\sum_{r=1}^{\tau_{i}[k]}w^{(r)}_{is}[k]a^{r}\hat{x}_{s}[k-r]. (65)

Based on the convexity of the weights wis(r)[k]w^{(r)}_{is}[k], note that x[k]=s𝒮r=1τi[k]wis(r)[k]arx[kr]{x}[k]=\sum_{s\in\mathcal{S}\cap\mathcal{R}}\sum_{r=1}^{\tau_{i}[k]}w^{(r)}_{is}[k]a^{r}{x}[k-r]. Using this and (65) yields:

ei[k]=s𝒮r=1τi[k]wis(r)[k]ares[kr].{e}_{i}[k]=\sum_{s\in\mathcal{S}\cap\mathcal{R}}\sum_{r=1}^{\tau_{i}[k]}w^{(r)}_{is}[k]a^{r}{e}_{s}[k-r]. (66)

Taking norms on both sides of the above equation, we obtain:

|ei[k]|\displaystyle|e_{i}[k]| αs𝒮r=1τi[k]wis(r)[k](|a|ρ)rρk\displaystyle\leq\alpha\sum_{s\in\mathcal{S}\cap\mathcal{R}}\sum_{r=1}^{\tau_{i}[k]}w^{(r)}_{is}[k]{\left(\frac{|a|}{\rho}\right)}^{r}\rho^{k} (67)
α(|a|ρ)τi[k]ρk\displaystyle\leq\alpha{\left(\frac{|a|}{\rho}\right)}^{\tau_{i}[k]}\rho^{k} (68)
α(|a|ρ)2(N|𝒮|)Tρk.\displaystyle\leq\alpha{\left(\frac{|a|}{\rho}\right)}^{2(N-|\mathcal{S}|)T}\rho^{k}. (69)

In the above steps, the second inequality uses the assumption |a|1|a|\geq 1 (made to avoid trivialities) together with the convexity of the weights wis(r)[k]w^{(r)}_{is}[k], and the final inequality uses (53). This concludes the proof. ∎
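The chain of inequalities (67)–(69) can also be checked numerically. The sketch below is purely illustrative: it draws random convex weights, assumes the worst-case source errors |es[kr]|=αρkr|e_{s}[k-r]|=\alpha\rho^{k-r}, and verifies the resulting bound; the constants are arbitrary stand-ins (in particular, `tau_max` plays the role of the uniform bound 2(N|𝒮|)T2(N-|\mathcal{S}|)T from (53)).

```python
import random

a, rho, alpha = 1.5, 0.9, 2.0   # |a| >= 1 > rho, as in the proof
tau_max = 6                      # stand-in for the uniform bound in (53)
num_sources = 3
random.seed(0)

def error_bound_holds(k):
    # random convex weights w_{is}^{(r)}[k] over sources and delays 1..tau_max
    raw = [[random.random() for _ in range(tau_max)]
           for _ in range(num_sources)]
    tot = sum(sum(row) for row in raw)
    # worst case: every source error attains |e_s[k-r]| = alpha * rho**(k-r);
    # e_i[k] is the convex combination of a**r * e_s[k-r], cf. (66)
    e_i = sum((w / tot) * a ** (r + 1) * alpha * rho ** (k - (r + 1))
              for row in raw for r, w in enumerate(row))
    # the bound (69): alpha * (|a|/rho)**tau_max * rho**k
    return abs(e_i) <= alpha * (a / rho) ** tau_max * rho ** k + 1e-12

checks = all(error_bound_holds(k) for k in range(tau_max, 40))
```

Each term a(r+1)αρk(r+1)=α(a/ρ)r+1ρka^{(r+1)}\alpha\rho^{k-(r+1)}=\alpha(a/\rho)^{r+1}\rho^{k} is at most α(a/ρ)τmaxρk\alpha(a/\rho)^{\tau_{\max}}\rho^{k} since a/ρ>1a/\rho>1, so the convex combination obeys the same bound, exactly as in the step from (67) to (69).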
