
Distributed Filtering Design with Enhanced Resilience to Coordinated Byzantine Attacks

Ashkan Moradi, Vinay Chakravarthi Gogineni, Naveen K. D. Venkategowda, and Stefan Werner

This work was supported by the Research Council of Norway. Ashkan Moradi, Vinay Chakravarthi Gogineni, and Stefan Werner are with the Department of Electronic Systems, Norwegian University of Science and Technology, Norway. E-mail: {ashkan.moradi, vinay.gogineni, stefan.werner}@ntnu.no. Stefan Werner is also with the Department of Signal Processing and Acoustics, Aalto University, Finland. Naveen K. D. Venkategowda is with the Department of Science and Technology, Linköping University, Sweden. E-mail: [email protected].
Abstract

This paper proposes a Byzantine-resilient consensus-based distributed filter (BR-CDF) wherein network agents employ partial sharing of state parameters. We characterize the performance and convergence of the BR-CDF and study the impact of a coordinated data falsification attack. Our analysis shows that sharing merely a fraction of the states improves robustness against coordinated Byzantine attacks. In addition, we model the optimal attack strategy as an optimization problem where Byzantine agents design their attack covariance or the sequence of shared fractions to maximize the network-wide mean squared error (MSE). Numerical results demonstrate the accuracy of the proposed BR-CDF and its robustness against Byzantine attacks. Furthermore, the simulation results show that the influence of the covariance design is more pronounced when agents exchange larger portions of their states with neighbors. In contrast, the performance is more sensitive to the sequence of shared fractions when smaller portions are exchanged.

Index Terms:
Cyber-physical systems, distributed learning, consensus-based filtering, data falsification attacks, multiagent systems, Byzantine agent.

I Introduction

In recent years, the development of the internet of things (IoT) and machine learning techniques has led to the widespread use of cyber-physical systems (CPS) in various infrastructures, such as smart grids, environmental monitoring, and signal processing [1, 2, 3, 4]. However, the heavy reliance of CPS on sensor cooperation makes them vulnerable to various security threats. As a result of malicious attacks, false information spreads throughout the network and threatens the integrity of the entire system [5]. Therefore, the study of security threats in CPS has gained considerable attention from academia and industry in the past few years [6, 7, 8, 9, 10, 11]. In CPSs, distributed filtering and secure estimation are becoming more prevalent due to their resilience to node failures and their scalability [12, 13].

Attack strategies influence the performance of CPSs and create complications in developing protection methods against malicious behavior [14, 15]. In CPSs, attacks can be divided into two groups: denial-of-service (DoS) attacks and integrity attacks. DoS attacks occur when the communication links between agents are blocked, and agents cannot exchange information [16]. In contrast, integrity attacks occur when adversaries or malicious agents inject false information into the network [17, 18]. A stealthy attack is also categorized as an integrity attack in which an adversary injects false information into a network without being detected. Various studies have examined the impact of stealthy attacks on state estimation scenarios and investigated situations where adversaries design attacks to degrade the performance of the network [19, 20, 21].

An optimal attack design from the attacker's perspective can aid in developing protection methods that operate in worst-case scenarios. To this end, [21] designs a false data-injection strategy that maximizes the trace of the error covariance in a remote state estimation scenario. An event-based stealthy attack strategy that degrades estimation accuracy is proposed in [22], while [23] develops a stealthy attack strategy and its optimal defense mechanism by exploiting power grid vulnerabilities. Furthermore, [24] proposes a linear stealthy attack strategy and its feasibility constraints. Moreover, [4] proposes an optimal attack strategy and sufficient conditions to limit the resulting estimation error. In [25], the authors investigate the impact of reset attacks on cyber-physical systems, while [26] designs a stealthy attack that maximizes the network-wide estimation error by jointly selecting the optimal subset of Byzantine agents and the perturbation covariance matrices. In [27], a false data-injection attack on a distributed CPS is proposed that forces the local state estimates to remain within a pre-specified range.

It is also vital to analyze how countermeasures taken by agents can reduce the impact of the attack. One approach is to detect adversaries and then implement corrective measures [28, 29, 30]. In [31], for example, attack detection is achieved through trusted agents that raise a flag when an adversary is detected. Alternatively, agents can safeguard information and guarantee system performance by running attack-resilient algorithms [32, 33, 34, 31, 35, 36, 37]. In [32, 33], attack-resilient remote state estimators are studied, where the former proposes a stochastic detector with a random threshold to determine whether to fuse the received data, and the latter detects malicious agents using the statistical correlation between trusted agents. Using a probabilistic protector for each sensor, [34] proposes a robust distributed state estimator that decides whether to use data from neighbors based on their innovation signals. Moreover, a Byzantine-resilient distributed state estimation algorithm is proposed in [38] that employs an internal iteration loop within the local aggregation process to compute the trimmed-mean among neighbors.

Providing an extra procedure to protect the system from adversaries [28, 29, 30, 31] can increase the computational load of the agents and make the algorithm undesirable for resource-constrained scenarios. To resolve this issue, the works in [35, 39] provide resilience against measurement attacks by reducing the weight assigned to measurements whose norm exceeds a certain threshold, thereby limiting the impact of falsified observations. Furthermore, the secure state estimation problem in [40] is solved by a local observer that achieves robustness against malicious agents by employing the median of its local estimates. Our primary motivation for this study is to develop an algorithm that provides robustness against malicious adversaries without imposing an extra computational burden on the agents.

Even though multi-agent distributed systems are robust to dynamic changes in the network, they are reliant on local interactions that consume power and bandwidth [41]. Partial-sharing-based approaches, originally proposed in [42, 43], reduce local communication overhead by sharing only a fraction of information during each inter-agent interaction. The simplicity of implementation and efficiency of computation make partial-sharing strategies prevalent in distributed processing scenarios [44, 45]. To the best of our knowledge, partial-sharing-based approaches have not been investigated in an adversarial environment. Additionally, the lack of computationally light distributed algorithms that are robust to coordinated attacks inspired us to conduct this research.

This paper proposes a Byzantine-resilient consensus-based distributed filter (BR-CDF) where agents exchange a fraction of their state estimates at each time instant. We study the convergence of the BR-CDF and characterize its performance under a coordinated data falsification attack. In addition, we design the optimal attack by solving an optimization problem in which Byzantine agents cooperate on designing the covariance of their falsification data or the order of the information fractions they share. A Byzantine agent is a legitimate network agent that shares false information with its neighbors to degrade the overall network performance. The numerical results validate the theoretical findings and illustrate the robustness of the BR-CDF algorithm.

The remainder of the article is organized as follows. Section II investigates the system model and the attack strategy, while Section III proposes the Byzantine-resilient consensus-based distributed filter. Section IV analyzes the stability and performance of the BR-CDF algorithm and derives its convergence conditions. The performance of the BR-CDF algorithm is investigated under data falsification attacks in Section V, and Section VI develops an optimal coordinated attack strategy. Simulation results are presented in Section VII, and Section VIII concludes the article.

Mathematical Notations: Scalars are denoted by lowercase letters, column vectors by bold lowercase, and matrices by bold uppercase. The transpose and inverse operators are denoted by $(\cdot)^{\text{T}}$ and $(\cdot)^{-1}$, respectively. The trace operator is denoted by $\text{tr}(\cdot)$, whereas $\otimes$ indicates the Kronecker product and $\odot$ the Hadamard product. The $ij$th element of the matrix $\mathbf{A}$ is denoted by $[\mathbf{A}]_{ij}$. The symbol $\boldsymbol{1}_{L}$ represents the $L\times 1$ column vector with all entries equal to one, and $\mathbf{I}_{L}$ is the $L\times L$ identity matrix. Matrices $\textsf{diag}(\mathbf{a})$ and $\textsf{diag}(\{\mathbf{A}_{i}\}_{i=1}^{L})$ denote diagonal and block-diagonal matrices whose respective diagonals are the elements of the vector $\mathbf{a}$ and the matrices $\mathbf{A}_{1},\mathbf{A}_{2},\ldots,\mathbf{A}_{L}$. The set of all real numbers is denoted as $\mathbb{R}$ and $\mathbb{E}\{\cdot\}$ is the statistical expectation operator. A positive semidefinite matrix $\mathbf{A}$ is denoted by $\mathbf{A}\succcurlyeq 0$, and $\mathbf{A}\succcurlyeq\mathbf{B}$ indicates that $\mathbf{A}-\mathbf{B}$ is positive semidefinite. The inequality $\mathbf{A}\leq\mathbf{B}$ denotes an element-wise inequality between corresponding elements of $\mathbf{A}$ and $\mathbf{B}$. The maximum and minimum eigenvalues of a square matrix $\mathbf{A}$ are denoted by $\lambda_{\max}(\mathbf{A})$ and $\lambda_{\min}(\mathbf{A})$, respectively. The vector $\text{vec}(\mathbf{A})$ is a column vector consisting of the elements of the matrix $\mathbf{A}$, and $\text{vec}^{-1}(\cdot)$ denotes the inverse of the $\text{vec}(\cdot)$ operator.

II System Model and Byzantine Attack Strategy

Consider a multi-agent network consisting of $L$ agents attempting to estimate a dynamic system state through their local observations. The network is modeled as an undirected graph $\mathcal{G}(\mathcal{V},\mathcal{E})$, where $\mathcal{V}$ is the set of all agents and the pairs in the set $\mathcal{E}$ represent communication links between agents. The network adjacency matrix is denoted by $\mathbf{E}$, and $\mathbf{D}$ is a diagonal matrix whose diagonal entries are the degrees of the corresponding agents. The neighbor set $\mathcal{N}_{i}$ contains the agents connected to agent $i$ within a single hop, excluding agent $i$ itself.

The state-space model, characterizing the dynamics of the state vector and observation sequences at each agent ii and time instant kk, is given by

\begin{align}
\mathbf{x}(k+1) &= \mathbf{A}\mathbf{x}(k)+\mathbf{w}(k) \tag{1}\\
\mathbf{y}_{i}(k) &= \mathbf{H}_{i}\mathbf{x}(k)+\mathbf{v}_{i}(k) \notag
\end{align}

where $\mathbf{x}(k)\in\mathbb{R}^{m}$ is the state, $\mathbf{y}_{i}(k)\in\mathbb{R}^{n}$ is the local observation, $\mathbf{A}\in\mathbb{R}^{m\times m}$ is the state matrix, and $\mathbf{H}_{i}\in\mathbb{R}^{n\times m}$ is the observation matrix. The state noise $\mathbf{w}(k)$ and observation noise $\mathbf{v}_{i}(k)$ are mutually independent zero-mean Gaussian processes with covariance matrices $\mathbf{Q}\in\mathbb{R}^{m\times m}$ and $\mathbf{R}_{i}\in\mathbb{R}^{n\times n}$, respectively. Network agents can employ a consensus-based distributed Kalman filter (CDF) to estimate $\mathbf{x}(k)$ collaboratively [46]. Accordingly, the state estimate at agent $i$ is given by

\begin{align}
\mathbf{\hat{x}}_{i}(k+1) ={}& \mathbf{A}\mathbf{\hat{x}}_{i}(k)+\mathbf{K}_{i}(k)\big(\mathbf{y}_{i}(k)-\mathbf{H}_{i}\mathbf{\hat{x}}_{i}(k)\big) \tag{2}\\
&+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\big(\mathbf{\bar{x}}_{j}(k)-\mathbf{\hat{x}}_{i}(k)\big) \notag
\end{align}

where $\mathbf{C}_{i}\in\mathbb{R}^{m\times m}$ denotes the consensus gain, $\mathbf{K}_{i}(k)\in\mathbb{R}^{m\times n}$ is the Kalman gain, and $\mathbf{\bar{x}}_{j}(k)$ is the state estimate received from neighboring agent $j$.
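As an illustration, the linear state-space model in (1) can be simulated directly; a minimal sketch follows, in which the dimensions and the matrices standing in for $\mathbf{A}$, $\mathbf{H}_{i}$, $\mathbf{Q}$, and $\mathbf{R}_{i}$ are arbitrary illustrative values, not the paper's simulation setup.

```python
import numpy as np

# Simulate x(k+1) = A x(k) + w(k) and y_i(k) = H_i x(k) + v_i(k)
# for one agent, with illustrative dimensions m = 2 and n = 1.
rng = np.random.default_rng(0)

m, n, T = 2, 1, 50
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])        # state matrix (stable: eigenvalues inside unit circle)
H = np.array([[1.0, 0.0]])        # observation matrix of a single agent
Q = 0.01 * np.eye(m)              # state-noise covariance
R = 0.10 * np.eye(n)              # observation-noise covariance

x = np.zeros(m)
states, observations = [], []
for _ in range(T):
    # state recursion: w(k) ~ N(0, Q)
    x = A @ x + rng.multivariate_normal(np.zeros(m), Q)
    # local observation: v_i(k) ~ N(0, R_i)
    y = H @ x + rng.multivariate_normal(np.zeros(n), R)
    states.append(x)
    observations.append(y)

states = np.array(states)          # shape (T, m)
observations = np.array(observations)  # shape (T, n)
```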

To analyze the impact of data falsification attacks on network performance, the attack model needs to be specified. For this purpose, we assume a subset of agents to be Byzantine, i.e., $\mathcal{B}\subseteq\mathcal{V}$, that intend to disrupt the performance of the entire network [17]. Fig. 1 shows the dynamics of the information exchange in a network with Byzantine agents. As seen in Fig. 1, a regular agent shares the actual value of its state estimate $\mathbf{\hat{x}}_{j}(k)$ with its neighbors. In contrast, a Byzantine agent $j\in\mathcal{B}$ shares a perturbed version of its state estimate with its neighbors; in particular, the information shared by each agent $j$ is

\begin{equation}
\mathbf{\bar{x}}_{j}(k)=\begin{cases}\mathbf{\hat{x}}_{j}(k)+\boldsymbol{\delta}_{j}(k) & j\in\mathcal{B}\\ \mathbf{\hat{x}}_{j}(k) & j\notin\mathcal{B}\end{cases} \tag{3}
\end{equation}

where $\boldsymbol{\delta}_{j}(k)$ denotes the perturbation sequence. To maximize the attack stealthiness, i.e., the ability to evade detection, the perturbation sequence is drawn from a zero-mean Gaussian distribution with covariance $\boldsymbol{\Sigma}_{j}=\mathbb{E}\{\boldsymbol{\delta}_{j}(k)\boldsymbol{\delta}_{j}^{\text{T}}(k)\}$ [47, 48]. Moreover, to further degrade the network performance, Byzantines can cooperate in designing the attack strategy. The network-wide coordinated attack covariance is denoted by $\boldsymbol{\Sigma}=\mathbb{E}\{\boldsymbol{\delta}(k)\boldsymbol{\delta}^{\text{T}}(k)\}$, where $\boldsymbol{\delta}(k)=[\boldsymbol{\delta}_{1}^{\text{T}}(k),\ldots,\boldsymbol{\delta}_{L}^{\text{T}}(k)]^{\text{T}}$ is the network-wide attack sequence with $\boldsymbol{\delta}_{j}(k)=\mathbf{0}$ if $j\notin\mathcal{B}$.
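The falsification rule in (3) can be sketched as follows; the covariance value, state estimate, and helper name `shared_estimate` are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

# Sketch of (3): a Byzantine agent adds a zero-mean Gaussian perturbation
# delta_j(k) with covariance Sigma_j before sharing; a regular agent
# shares its estimate unmodified. All numerical values are illustrative.
rng = np.random.default_rng(1)

m = 3
Sigma_j = 0.5 * np.eye(m)             # attack covariance of agent j (assumed)
x_hat_j = np.array([1.0, -2.0, 0.5])  # local state estimate of agent j (assumed)

def shared_estimate(x_hat, byzantine, Sigma, rng):
    """Return x_bar_j(k) as in (3)."""
    if byzantine:
        delta = rng.multivariate_normal(np.zeros(len(x_hat)), Sigma)
        return x_hat + delta
    return x_hat

honest_share = shared_estimate(x_hat_j, False, Sigma_j, rng)
byz_share = shared_estimate(x_hat_j, True, Sigma_j, rng)
```

A regular agent's share equals its estimate exactly, while the Byzantine share deviates by the Gaussian perturbation.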

Figure 1: Network topology in the presence of Byzantine agents.

III Byzantine-Resilient Consensus-Based Distributed Kalman Filter

By applying partial sharing of information to the state estimates in (2), we reduce the information flow between agents at a given instant while maintaining the advantages of cooperation [42, 43]. In particular, each agent shares only a fraction of its state estimate with neighbors rather than the entire vector (i.e., $l$ entries of $\mathbf{\hat{x}}_{j}(k)\in\mathbb{R}^{m}$, with $l\leq m$). Although partial sharing was originally introduced to reduce inter-agent communication overhead, we show that adopting this idea in the current setting improves robustness to Byzantine attacks.

The state estimate entry selection is performed at each agent $j$ using a selection matrix $\mathbf{S}_{j}(k)$ of size $m\times m$, whose main diagonal contains $l$ ones and $m-l$ zeros. The ones on the main diagonal of $\mathbf{S}_{j}(k)$ specify the entries of the state estimate $\mathbf{\hat{x}}_{j}(k)$ to be shared with neighbors. Selecting $l$ entries out of $m$ can be done either stochastically or sequentially, as in [42] and [43]. In this paper, we use uncoordinated partial sharing, a special case of stochastic partial sharing [42], in which each agent $j$ is initialized with a random selection matrix. The selection matrix at the current time instant, i.e., $\mathbf{S}_{j}(k)$, is obtained by performing $\tau$ right-circular shift operations on the main diagonal of the selection matrix used at the previous instant. In other words, if $\mathbf{s}_{j}(k)\in\mathbb{R}^{m}$ contains the main diagonal elements of $\mathbf{S}_{j}(k)$, then $\mathbf{s}_{j}(k)=\text{right-circular shift}\{\mathbf{s}_{j}(k-1),\tau\}$, and the selection matrix at the current instant is constructed as $\mathbf{S}_{j}(k)=\textsf{diag}\{\mathbf{s}_{j}(k)\}$. This allows each agent $j$ to share only the initial selection matrix $\mathbf{S}_{j}(0)$ with its neighbors and still keep track of which parameters are shared, without needing any additional mechanism. As a result, the frequency with which each entry of the state estimate is shared equals $p_{e}=\frac{l}{m}$.
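The shift-based selection rule above can be sketched as follows; the values of $m$, $l$, and $\tau$ are illustrative (with $\tau=1$, so that the shifts cycle through all $m$ positions), and the sketch checks that each entry is shared with frequency $p_e = l/m$ over one period.

```python
import numpy as np

# Uncoordinated partial sharing: agent j shares l of m entries, selected
# by a diagonal 0/1 matrix S_j(k) whose diagonal is right-circularly
# shifted by tau at each instant. Values below are illustrative.
m, l, tau = 5, 2, 1

s = np.zeros(m)
s[:l] = 1                           # initial selection: first l entries
S_history = []
for k in range(m):                  # one full period of the shift (tau = 1)
    S_history.append(np.diag(s))
    s = np.roll(s, tau)             # right-circular shift by tau positions

# With tau = 1, every entry is selected equally often over one period,
# so the empirical sharing frequency of each entry equals p_e = l / m.
freq = sum(np.diag(S) for S in S_history) / m
```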

Due to partial sharing, every agent receives only a fraction of the perturbed state estimate vectors from its neighbors, i.e., $\mathbf{S}_{j}(k)\mathbf{\bar{x}}_{j}(k)$. Thus, unlike (3), the received information here must be compensated to fill in the missing elements. At each agent $i$, the missing values from neighbor $j$, i.e., $(\mathbf{I}-\mathbf{S}_{j}(k))\mathbf{\hat{x}}_{j}(k)$, are replaced by $(\mathbf{I}-\mathbf{S}_{j}(k))\mathbf{\hat{x}}_{i}(k)$. Subsequently, the state update at agent $i$ in (2) is modified as follows:

\begin{align}
\mathbf{\hat{x}}_{i}(k+1) ={}& \mathbf{A}\mathbf{\hat{x}}_{i}(k)+\mathbf{K}_{i}(k)\big(\mathbf{y}_{i}(k)-\mathbf{H}_{i}\mathbf{\hat{x}}_{i}(k)\big) \notag\\
&+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\big(\mathbf{\hat{x}}_{j}(k)-\mathbf{\hat{x}}_{i}(k)\big) \tag{4}\\
&+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\boldsymbol{\delta}_{j}(k) \notag
\end{align}

At each agent $i$, the Kalman gain is obtained by minimizing the trace of the estimation error covariance $\mathbf{P}_{i}(k)\triangleq\mathbb{E}\{\mathbf{e}_{i}(k)\mathbf{e}^{\text{T}}_{i}(k)\}$, with the estimation error evolving as

\begin{align}
\mathbf{e}_{i}(k+1) \triangleq{}& \mathbf{\hat{x}}_{i}(k+1)-\mathbf{x}(k+1) \notag\\
={}& \mathbf{F}_{i}(k)\mathbf{e}_{i}(k)+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\mathbf{e}_{j}(k) \tag{5}\\
&+\mathbf{K}_{i}(k)\mathbf{v}_{i}(k)-\mathbf{w}(k)+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\boldsymbol{\delta}_{j}(k) \notag
\end{align}

where

\begin{equation}
\mathbf{F}_{i}(k)=\mathbf{A}-\mathbf{K}_{i}(k)\mathbf{H}_{i}-\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k) \tag{6}
\end{equation}

Accordingly, using (5), we can obtain the evolution of the estimation error covariance at agent $i$ as follows:

\begin{align}
\mathbf{P}_{i}(k+1) ={}& \mathbf{F}_{i}(k)\mathbf{P}_{i}(k)\mathbf{F}_{i}^{\text{T}}(k)+\mathbf{K}_{i}(k)\mathbf{R}_{i}\mathbf{K}_{i}^{\text{T}}(k)+\mathbf{Q} \tag{7}\\
&+\Delta\mathbf{P}_{i}(k)+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\sum_{p\in\mathcal{N}_{i}}\mathbf{S}_{s}(k)\boldsymbol{\Sigma}_{sp}\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}} \notag
\end{align}

where $\boldsymbol{\Sigma}_{sp}=\mathbb{E}\{\boldsymbol{\delta}_{s}(k)\boldsymbol{\delta}_{p}^{\text{T}}(k)\}$ and

\begin{align}
\Delta\mathbf{P}_{i}(k) ={}& \mathbf{F}_{i}(k)\sum_{j\in\mathcal{N}_{i}}\mathbf{P}_{ij}(k)\mathbf{S}_{j}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}} \notag\\
&+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\mathbf{P}_{ji}(k)\mathbf{F}_{i}^{\text{T}}(k) \notag\\
&+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\sum_{p\in\mathcal{N}_{i}}\mathbf{S}_{s}(k)\mathbf{P}_{sp}(k)\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}} \notag
\end{align}

Similarly, the cross-terms of the error covariance, i.e., $\mathbf{P}_{ij}(k)\triangleq\mathbb{E}\{\mathbf{e}_{i}(k)\mathbf{e}^{\text{T}}_{j}(k)\}$, evolve as

\begin{align}
\mathbf{P}_{ij}(k+1) ={}& \mathbf{F}_{i}(k)\mathbf{P}_{ij}(k)\mathbf{F}_{j}^{\text{T}}(k)+\mathbf{Q}+\Delta\mathbf{P}_{ij}(k) \notag\\
&+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\sum_{p\in\mathcal{N}_{j}}\mathbf{S}_{s}(k)\boldsymbol{\Sigma}_{sp}\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{j}^{\text{T}} \notag
\end{align}

with

\begin{align}
\Delta\mathbf{P}_{ij}(k) ={}& \mathbf{F}_{i}(k)\sum_{p\in\mathcal{N}_{j}}\mathbf{P}_{ip}(k)\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{j}^{\text{T}} \notag\\
&+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\mathbf{S}_{s}(k)\mathbf{P}_{sj}(k)\mathbf{F}_{j}^{\text{T}}(k) \notag\\
&+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\sum_{p\in\mathcal{N}_{j}}\mathbf{S}_{s}(k)\mathbf{P}_{sp}(k)\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{j}^{\text{T}} \notag
\end{align}

Differentiating the trace of (7) with respect to $\mathbf{K}_{i}(k)$ gives

\begin{align}
\mathbf{K}_{i}^{*}(k) ={}& \Big(\big(\mathbf{A}-\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\big)\mathbf{P}_{i}(k) \notag\\
&+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\mathbf{P}_{ji}(k)\Big)\mathbf{H}_{i}^{\text{T}}\mathbf{M}_{i}^{-1}(k) \notag
\end{align}

with $\mathbf{M}_{i}(k)=\mathbf{R}_{i}+\mathbf{H}_{i}\mathbf{P}_{i}(k)\mathbf{H}_{i}^{\text{T}}$.

The distributed Kalman filter based on partial sharing is summarized by (4)–(7). We see that the local covariance update in (7) requires access to the cross-term covariance matrices of the neighbors, resulting in considerable communication overhead. To reduce this overhead, for sufficiently small consensus gains $\mathbf{C}_{i}$, we can ignore the term $\Delta\mathbf{P}_{i}(k)$ in (7) and the last term of $\mathbf{F}_{i}(k)$ in (6) [46], i.e., we have

\begin{align}
\mathbf{P}_{i}(k+1) ={}& \mathbf{\hat{F}}_{i}(k)\mathbf{P}_{i}(k)\mathbf{\hat{F}}_{i}^{\text{T}}(k)+\mathbf{K}_{i}(k)\mathbf{R}_{i}\mathbf{K}_{i}^{\text{T}}(k)+\mathbf{Q} \tag{8}\\
&+\mathbf{C}_{i}\sum_{s\in\mathcal{N}_{i}}\sum_{p\in\mathcal{N}_{i}}\mathbf{S}_{s}(k)\boldsymbol{\Sigma}_{sp}\mathbf{S}_{p}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}} \notag
\end{align}

with $\mathbf{\hat{F}}_{i}(k)=\mathbf{A}-\mathbf{K}_{i}(k)\mathbf{H}_{i}$. Accordingly, the optimal Kalman gain reduces to

\begin{equation}
\mathbf{K}_{i}(k)=\mathbf{A}\mathbf{P}_{i}(k)\mathbf{H}_{i}^{\text{T}}\big(\mathbf{R}_{i}+\mathbf{H}_{i}\mathbf{P}_{i}(k)\mathbf{H}_{i}^{\text{T}}\big)^{-1} \tag{9}
\end{equation}

With the above approximations, we obtain a distributed consensus-based Kalman filter, albeit suboptimal [46, 49], that requires only local variables in the error covariance update at each agent. It is worth noting that the last term in (8) is only used to characterize the impact of the perturbation covariances; since the attack is stealthy from the perspective of an agent, this term is excluded from the filtering algorithm. As a result, in addition to the initial selection matrix $\mathbf{S}_{j}(0)$, each agent $j$ shares a fraction of its perturbed state estimate, i.e., $\mathbf{S}_{j}(k)\mathbf{\bar{x}}_{j}(k)$, with its neighbors at each instant. The proposed BR-CDF algorithm with reduced communication is summarized in Algorithm 1. We shall see that the BR-CDF in Algorithm 1 performs close to the solution that shares all necessary variables.
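As a numeric sanity check on this reduced filter, substituting the gain (9) into the covariance update (without the perturbation term) and applying the matrix inversion lemma yields the compact information-form recursion used later in the stability analysis. A sketch under arbitrary illustrative matrices:

```python
import numpy as np

# Verify numerically that, with K from (9),
#   F P F^T + K R K^T + Q,  F = A - K H,
# equals  A (P^{-1} + H^T R^{-1} H)^{-1} A^T + Q
# (a standard Kalman/Riccati identity). Matrices are illustrative.
rng = np.random.default_rng(3)

m, n = 3, 2
A = rng.standard_normal((m, m))
H = rng.standard_normal((n, m))
B = rng.standard_normal((m, m))
P = B @ B.T + np.eye(m)            # a positive definite covariance P_i(k)
Q = 0.1 * np.eye(m)
R = 0.5 * np.eye(n)

# Covariance update with the reduced optimal gain (9)
K = A @ P @ H.T @ np.linalg.inv(R + H @ P @ H.T)
F = A - K @ H
P_update = F @ P @ F.T + K @ R @ K.T + Q

# Closed form via the matrix inversion lemma
Mbar = np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H
P_closed = A @ np.linalg.inv(Mbar) @ A.T + Q
```

The two expressions agree to numerical precision, confirming the algebraic reduction.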

Algorithm 1 BR-CDF Algorithm
Require: each agent $i\in\mathcal{V}$
Initialize: $\mathbf{\hat{x}}_{i}(0)=\mathbf{x}_{0}$, $\mathbf{P}_{i}(0)=\mathbf{P}_{0}$, $\mathbf{S}_{i}(0)=\textsf{diag}(\mathbf{s}_{i}(0))$, and share $\mathbf{S}_{i}(0)$ with $j\in\mathcal{N}_{i}$
for all $k>0$ do
   Receive $\mathbf{S}_{j}(k)\mathbf{\bar{x}}_{j}(k)$ from all $j\in\mathcal{N}_{i}$
   $\mathbf{K}_{i}(k)=\mathbf{A}\mathbf{P}_{i}(k)\mathbf{H}_{i}^{\text{T}}\big(\mathbf{R}_{i}+\mathbf{H}_{i}\mathbf{P}_{i}(k)\mathbf{H}_{i}^{\text{T}}\big)^{-1}$
   Update the state estimate
   \begin{align}
   \mathbf{\hat{x}}_{i}(k+1) ={}& \mathbf{A}\mathbf{\hat{x}}_{i}(k)+\mathbf{K}_{i}(k)\big(\mathbf{y}_{i}(k)-\mathbf{H}_{i}\mathbf{\hat{x}}_{i}(k)\big) \tag{10}\\
   &+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\big(\mathbf{S}_{j}(k)\mathbf{\bar{x}}_{j}(k)-\mathbf{S}_{j}(k)\mathbf{\hat{x}}_{i}(k)\big) \notag
   \end{align}
   $\mathbf{\hat{F}}_{i}(k)=\mathbf{A}-\mathbf{K}_{i}(k)\mathbf{H}_{i}$
   Update the local covariance
   \begin{equation}
   \mathbf{P}_{i}(k+1)=\mathbf{\hat{F}}_{i}(k)\mathbf{P}_{i}(k)\mathbf{\hat{F}}_{i}^{\text{T}}(k)+\mathbf{K}_{i}(k)\mathbf{R}_{i}\mathbf{K}_{i}^{\text{T}}(k)+\mathbf{Q} \tag{11}
   \end{equation}
   $\mathbf{s}_{i}(k+1)=\text{right-circular shift}\{\mathbf{s}_{i}(k),\tau\}$
   $\mathbf{S}_{i}(k+1)=\textsf{diag}\big(\mathbf{s}_{i}(k+1)\big)$
   Share $\mathbf{S}_{i}(k+1)\mathbf{\bar{x}}_{i}(k+1)$ with all $j\in\mathcal{N}_{i}$
end for
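A minimal sketch of one time step of Algorithm 1 on a small network follows, assuming honest agents and the reduced updates (9)–(11); the graph, dimensions, and gain values are illustrative assumptions, not the paper's simulation setup.

```python
import numpy as np

# One BR-CDF iteration on a 3-agent path graph with m = 2, n = 1.
# All matrices and the consensus gains C_i are illustrative stand-ins.
rng = np.random.default_rng(2)

L_agents, m, n = 3, 2, 1
A = np.array([[0.9, 0.1], [0.0, 0.8]])
Q = 0.01 * np.eye(m)
H = [np.array([[1.0, 0.0]]) for _ in range(L_agents)]
R = [0.1 * np.eye(n) for _ in range(L_agents)]
C = [0.05 * np.eye(m) for _ in range(L_agents)]     # small consensus gains
neighbors = {0: [1], 1: [0, 2], 2: [1]}             # path graph 0-1-2

x = np.ones(m)                                      # true state
x_hat = [np.zeros(m) for _ in range(L_agents)]
P = [np.eye(m) for _ in range(L_agents)]
S = [np.diag([1.0, 0.0]) for _ in range(L_agents)]  # share l = 1 of m = 2 entries

y = [H[i] @ x + rng.multivariate_normal(np.zeros(n), R[i])
     for i in range(L_agents)]
x_bar = [x_hat[j].copy() for j in range(L_agents)]  # honest shares here

for i in range(L_agents):
    # Kalman gain (9)
    K = A @ P[i] @ H[i].T @ np.linalg.inv(R[i] + H[i] @ P[i] @ H[i].T)
    # state update (10) with partial shares S_j(k) x_bar_j(k)
    consensus = sum(S[j] @ (x_bar[j] - x_hat[i]) for j in neighbors[i])
    x_hat[i] = A @ x_hat[i] + K @ (y[i] - H[i] @ x_hat[i]) + C[i] @ consensus
    # local covariance update (11)
    F = A - K @ H[i]
    P[i] = F @ P[i] @ F.T + K @ R[i] @ K.T + Q
```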

IV Stability and Performance Analysis

This section provides a detailed stability analysis of the BR-CDF algorithm. For this purpose, we make the following assumption:
Assumption 1: The selection matrix $\mathbf{S}_{i}(k)$ for all $i\in\mathcal{V}$ is independent of any other data, and the selection matrices $\mathbf{S}_{i}(k)$ and $\mathbf{S}_{j}(s)$ are independent for all $i\neq j$ and $k\neq s$.

Our main result on the stability of the proposed BR-CDF algorithm is summarized by the following theorem.

Theorem 1.

Consider the BR-CDF in Algorithm 1 with consensus gain $\mathbf{C}_{i}=\gamma\mathbf{A}\mathbf{\bar{M}}_{i}^{-1}(k)$, where $\mathbf{\bar{M}}_{i}(k)=\mathbf{P}_{i}^{-1}(k)+\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i}$. Then, for a sufficiently small $\gamma$, the error dynamics of the BR-CDF are globally asymptotically stable and all local estimators asymptotically reach a consensus on the state estimates, i.e., $\mathbf{\hat{x}}_{1}(k)=\mathbf{\hat{x}}_{2}(k)=\cdots=\mathbf{\hat{x}}_{L}(k)=\mathbf{x}(k)$.

Proof.

The proof begins by analyzing the dynamics of the estimation error in the absence of noise [46]. Given the consensus-based Kalman update in (10), the noise-free estimation error dynamics at each agent $i$ can be written as

\begin{align}
\mathbf{\hat{e}}_{i}(k+1) &= \mathbf{\hat{F}}_{i}(k)\mathbf{\hat{e}}_{i}(k)+\mathbf{C}_{i}\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\big(\mathbf{\hat{e}}_{j}(k)-\mathbf{\hat{e}}_{i}(k)\big) \notag\\
&= \mathbf{\hat{F}}_{i}(k)\mathbf{\hat{e}}_{i}(k)+\mathbf{C}_{i}\mathbf{u}_{i}(k) \tag{12}
\end{align}

where $\mathbf{u}_{i}(k)=\sum_{j\in\mathcal{N}_{i}}\mathbf{S}_{j}(k)\big(\mathbf{\hat{e}}_{j}(k)-\mathbf{\hat{e}}_{i}(k)\big)$. Our goal is to determine $\mathbf{C}_{i}$ such that the estimation error dynamics in (12) are stable. Following the approach in [46], we use

\begin{equation}
\mathbf{V}(\mathbf{\hat{e}}(k))=\sum_{i=1}^{L}\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k) \tag{13}
\end{equation}

as a candidate Lyapunov function for (12), where the network-wide stacked error is $\mathbf{\hat{e}}(k)\triangleq[\mathbf{\hat{e}}_{1}^{\text{T}}(k),\ldots,\mathbf{\hat{e}}_{L}^{\text{T}}(k)]^{\text{T}}$. We can then express $\delta\mathbf{V}(\mathbf{\hat{e}}(k))\triangleq\mathbf{V}(\mathbf{\hat{e}}(k+1))-\mathbf{V}(\mathbf{\hat{e}}(k))$ as

\begin{align}
\delta\mathbf{V}(\mathbf{\hat{e}}(k)) = \sum_{i=1}^{L}\Big(&\mathbf{\hat{e}}_{i}^{\text{T}}(k+1)\mathbf{P}_{i}^{-1}(k+1)\mathbf{\hat{e}}_{i}(k+1) \notag\\
&-\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k)\Big) \tag{14}
\end{align}

By substituting (9) into (11) and employing the matrix inversion lemma, we have

\begin{equation}
\mathbf{P}_{i}(k+1)=\mathbf{A}\mathbf{\bar{M}}_{i}^{-1}(k)\mathbf{A}^{\text{T}}+\mathbf{Q} \tag{15}
\end{equation}

where $\mathbf{\bar{M}}_{i}(k)=\mathbf{P}_{i}^{-1}(k)+\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i}$. Subsequently, substituting the noise-free form of (15) and the error dynamics (12) into (14) yields

\begin{align}
\delta\mathbf{V}(\mathbf{\hat{e}}(k)) = \sum_{i=1}^{L}\Big(&\big(\mathbf{\hat{F}}_{i}(k)\mathbf{\hat{e}}_{i}(k)+\mathbf{C}_{i}\mathbf{u}_{i}(k)\big)^{\text{T}}\big(\mathbf{A}\mathbf{\bar{M}}_{i}^{-1}(k)\mathbf{A}^{\text{T}}\big)^{-1} \notag\\
&\big(\mathbf{\hat{F}}_{i}(k)\mathbf{\hat{e}}_{i}(k)+\mathbf{C}_{i}\mathbf{u}_{i}(k)\big)-\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k)\Big) \tag{16}
\end{align}

Furthermore, by substituting (9) into $\mathbf{\hat{F}}_{i}(k)=\mathbf{A}-\mathbf{K}_{i}(k)\mathbf{H}_{i}$ and employing the matrix inversion lemma, we obtain $\mathbf{\hat{F}}_{i}(k)=\mathbf{A}\mathbf{\bar{M}}_{i}^{-1}(k)\mathbf{P}_{i}^{-1}(k)$. Consequently, substituting this expression for $\mathbf{\hat{F}}_{i}(k)$ into (16) and performing some algebraic manipulations yields

\begin{align}
\delta\mathbf{V}(\mathbf{\hat{e}}(k)) ={}& \sum_{i=1}^{L}\Big(\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\bar{M}}_{i}^{-1}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k) \notag\\
&-\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k)+\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{A}^{-1}\mathbf{C}_{i}\mathbf{u}_{i}(k) \notag\\
&+\mathbf{u}_{i}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}}\mathbf{A}^{-\text{T}}\mathbf{P}_{i}^{-1}(k)\mathbf{\hat{e}}_{i}(k) \notag\\
&+\mathbf{u}_{i}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}}\mathbf{A}^{-\text{T}}\mathbf{\bar{M}}_{i}(k)\mathbf{A}^{-1}\mathbf{C}_{i}\mathbf{u}_{i}(k)\Big) \notag\\
={}& \sum_{i=1}^{L}\Big(-\mathbf{\hat{e}}_{i}^{\text{T}}(k)\big(\mathbf{P}_{i}(k)+(\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i})^{-1}\big)^{-1}\mathbf{\hat{e}}_{i}(k) \notag\\
&+2\mathbf{\hat{e}}_{i}^{\text{T}}(k)\mathbf{P}_{i}^{-1}(k)\mathbf{A}^{-1}\mathbf{C}_{i}\mathbf{u}_{i}(k) \tag{17}\\
&+\mathbf{u}_{i}^{\text{T}}(k)\mathbf{C}_{i}^{\text{T}}\mathbf{A}^{-\text{T}}\mathbf{\bar{M}}_{i}(k)\mathbf{A}^{-1}\mathbf{C}_{i}\mathbf{u}_{i}(k)\Big) \notag
\end{align}

With the choice of consensus gain $\mathbf{C}_{i}=\gamma\mathbf{A}\mathbf{\bar{M}}_{i}^{-1}(k)$ and a proper selection of $\gamma>0$, all terms of (17) become negative semidefinite. Subsequently, we have

δ𝐕(𝐞^\displaystyle\delta\mathbf{V}(\mathbf{\hat{e}} (k))=i=1L(𝐞^iT(k)(𝐏i(k)+(𝐇iT𝐑i1𝐇i)1)1𝐞^i(k)\displaystyle(k))=\textstyle\sum_{i=1}^{L}\bigg{(}-\mathbf{\hat{e}}_{i}^{\text{T}}(k)\big{(}\mathbf{P}_{i}(k)+(\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i})^{-1}\big{)}^{-1}\mathbf{\hat{e}}_{i}(k)
+2γ𝐞^iT(k)(𝐈+𝐇iT𝐑i1𝐇i𝐏i(k))1𝐮i(k)\displaystyle\quad\quad+2\gamma\mathbf{\hat{e}}_{i}^{\text{T}}(k)\big{(}\mathbf{I}+\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i}\mathbf{P}_{i}(k)\big{)}^{-1}\mathbf{u}_{i}(k) (18)
+γ2𝐮iT(k)(𝐏i1(k)+𝐇iT𝐑i1𝐇i)1𝐮i(k))\displaystyle\quad\quad+\gamma^{2}\mathbf{u}_{i}^{\text{T}}(k)\big{(}\mathbf{P}_{i}^{-1}(k)+\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i}\big{)}^{-1}\mathbf{u}_{i}(k)\bigg{)}

By defining 𝐒(k)diag({𝐒i(k)}i=1L)\mathbf{S}(k)\triangleq\textsf{diag}(\{\mathbf{S}_{i}(k)\}_{i=1}^{L}), (18) becomes

δ\displaystyle\delta 𝐕(𝐞^(k))=2γ𝐞^T(k)𝚲\@slowromancapiii@(k)(𝐋𝐈)𝐒(k)𝐞^(k)\displaystyle\mathbf{V}(\mathbf{\hat{e}}(k))=-2\gamma\mathbf{\hat{e}}^{\text{T}}(k)\boldsymbol{\Lambda}_{\text{\@slowromancap iii@}}(k)(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\mathbf{\hat{e}}(k) (19)
𝐞^T(k)(𝚲\@slowromancapi@(k)γ2𝐒(k)(𝐋𝐈)𝚲\@slowromancapii@(k)(𝐋𝐈)𝐒(k))𝐞^(k)\displaystyle-\mathbf{\hat{e}}^{\text{T}}(k)\bigg{(}\boldsymbol{\Lambda}_{\text{\@slowromancap i@}}(k)-\gamma^{2}\mathbf{S}(k)(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(k)(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\bigg{)}\mathbf{\hat{e}}(k)

where 𝐋=𝐃𝐄\mathbf{L}=\mathbf{D}-\mathbf{E} is the network Laplacian and

𝚲\@slowromancapi@(k)\displaystyle\boldsymbol{\Lambda}_{\text{\@slowromancap i@}}(k) =diag({(𝐏i(k)+(𝐇iT𝐑i1𝐇i)1)1}i=1L)\displaystyle=\textsf{diag}\left(\left\{\big{(}\mathbf{P}_{i}(k)+(\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i})^{-1}\big{)}^{-1}\right\}_{i=1}^{L}\right)
𝚲\@slowromancapii@(k)\displaystyle\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(k) =diag({𝐌¯i1(k)}i=1L)\displaystyle=\textsf{diag}\left(\left\{\mathbf{\bar{M}}_{i}^{-1}(k)\right\}_{i=1}^{L}\right)
𝚲\@slowromancapiii@(k)\displaystyle\boldsymbol{\Lambda}_{\text{\@slowromancap iii@}}(k) =diag({(𝐈+𝐇iT𝐑i1𝐇i𝐏i(k))1}i=1L)\displaystyle=\textsf{diag}\left(\left\{\big{(}\mathbf{I}+\mathbf{H}_{i}^{\text{T}}\mathbf{R}_{i}^{-1}\mathbf{H}_{i}\mathbf{P}_{i}(k)\big{)}^{-1}\right\}_{i=1}^{L}\right)

For an appropriate choice of γ\gamma, we have δ𝐕(𝐞^(k))<0\delta\mathbf{V}(\mathbf{\hat{e}}(k))<0, implying that 𝐞^(k)𝟎\mathbf{\hat{e}}(k)\rightarrow\mathbf{0}. Consequently, 𝐞^(k)=𝟎\mathbf{\hat{e}}(k)=\mathbf{0} is asymptotically stable. Furthermore, since 𝐞^i(k)=𝐞^j(k)=𝟎\mathbf{\hat{e}}_{i}(k)=\mathbf{\hat{e}}_{j}(k)=\mathbf{0} for all iji\neq j, all estimators asymptotically reach a consensus on state estimates as 𝐱^1(k)==𝐱^L(k)=𝐱(k)\mathbf{\hat{x}}_{1}(k)=~{}\cdots~{}=\mathbf{\hat{x}}_{L}(k)=\mathbf{x}(k).

In steady-state, i.e., kk\rightarrow\infty, we have 𝚲\@slowromancapi@=limk𝚲\@slowromancapi@(k)\boldsymbol{\Lambda}_{\text{\@slowromancap i@}}=\lim_{k\rightarrow\infty}\boldsymbol{\Lambda}_{\text{\@slowromancap i@}}(k), 𝚲\@slowromancapii@=limk𝚲\@slowromancapii@(k)\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}=\lim_{k\rightarrow\infty}\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(k). By applying statistical expectation 𝔼{}\mathbb{E}\{\cdot\} with respect to 𝐒(k)\mathbf{S}(k), we will have the following condition for the stability of the algorithm:

𝚲\@slowromancapi@γ2𝔼{𝐒(k)(𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)𝐒(k)}0\boldsymbol{\Lambda}_{\text{\@slowromancap i@}}-\gamma^{2}\mathbb{E}\{\mathbf{S}(k)(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\}\succcurlyeq 0 (20)

The expectation term can be simplified as

𝔼\displaystyle\mathbb{E} {𝐒(k)(𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)𝐒(k)}\{\mathbf{S}(k)(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\} (21)
=𝔼{vec1(vec(𝐒(k)(𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)𝐒(k)))}\displaystyle=\mathbb{E}\{\text{vec}^{-1}\left(\text{vec}\left(\mathbf{S}(k)(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\right)\right)\}
=𝔼{vec1((𝐒T(k)𝐒(k))vec((𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)))}\displaystyle=\mathbb{E}\{\text{vec}^{-1}\left(\left(\mathbf{S}^{\text{T}}(k)\otimes\mathbf{S}(k)\right)\text{vec}\left((\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\right)\right)\}
=vec1(𝔼{𝐒T(k)𝐒(k)}vec((𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)))\displaystyle=\text{vec}^{-1}\left(\mathbb{E}\{\mathbf{S}^{\text{T}}(k)\otimes\mathbf{S}(k)\}\text{vec}\left((\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\right)\right)

Following the approach in [42, Appendix B] and [45], we can show that 𝔼{𝐒T(k)𝐒(k)}pe(𝐈𝐈)\mathbb{E}\{\mathbf{S}^{\text{T}}(k)\otimes\mathbf{S}(k)\}\leq p_{e}(\mathbf{I}\otimes\mathbf{I}) with 0<pe10<p_{e}\leq 1, and we have

𝔼{𝐒(k)\displaystyle\mathbb{E}\{\mathbf{S}(k) (𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)𝐒(k)}\displaystyle(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\mathbf{S}(k)\}
pevec1(vec((𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)))\displaystyle\leq p_{e}\,\text{vec}^{-1}\left(\text{vec}\left((\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\right)\right)
pe(𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)\displaystyle\leq p_{e}\,(\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I})\cdot

Subsequently, to satisfy (20), the bound for γ\gamma is determined as

γγ=1pe(λmin(𝚲\@slowromancapi@)λmax((𝐋𝐈)𝚲\@slowromancapii@(𝐋𝐈)))12\gamma\leq\gamma^{*}=\frac{1}{\sqrt{p_{e}}}\left(\frac{\lambda_{\min}(\boldsymbol{\Lambda}_{\text{\@slowromancap i@}})}{\lambda_{\max}((\mathbf{L}\otimes\mathbf{I})\boldsymbol{\Lambda}_{\text{\@slowromancap ii@}}(\mathbf{L}\otimes\mathbf{I}))}\right)^{\frac{1}{2}} (22)

Thus, if γ\gamma is chosen as in (22), we can ensure that all agents reach a consensus on state estimates asymptotically, which completes the proof. ∎
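To make the bound in (22) concrete, the following minimal Python sketch evaluates γ* for a toy network of identical scalar-state agents on a complete graph, where the Laplacian eigenvalues are known in closed form. All numerical values (P, H, R, p_e) are hypothetical and are not taken from the paper's simulations.

```python
import math

# Toy evaluation of the step-size bound in (22) for scalar agent states on a
# complete graph K_L, whose Laplacian eigenvalues are {0, L, ..., L}.
# All per-agent quantities below are hypothetical scalars.
L = 25          # number of agents
P = 0.5         # steady-state error covariance P_i (identical agents)
H, R = 1.0, 0.8 # scalar observation matrix and noise covariance
p_e = 0.5       # fraction-sharing parameter, 0 < p_e <= 1

lam_1 = 1.0 / (P + 1.0 / (H**2 / R))   # diagonal entry of Lambda_I
lam_2 = 1.0 / (1.0 / P + H**2 / R)     # diagonal entry of Lambda_II = M_bar^{-1}

# lambda_max((L (x) I) Lambda_II (L (x) I)) = lam_2 * lambda_max(Laplacian)^2,
# and lambda_max of the K_L Laplacian equals L.
gamma_star = (1.0 / math.sqrt(p_e)) * math.sqrt(lam_1 / (lam_2 * L**2))
print(gamma_star)
```

Note how the 1/√p_e factor enlarges the admissible range of γ as the shared fraction shrinks, consistent with the role of p_e in (22).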

V Resilience of the BR-CDF to Byzantine attacks

This section investigates the robustness of the BR-CDF in Algorithm 1 to data falsification attacks. We assume that Byzantine agents start perturbing the information once the network reaches steady-state, i.e., k=k0>0k=k_{0}>0. We further assume that the attack remains stealthy from the perspective of agents; thus, the consensus gain 𝐂i\mathbf{C}_{i} remains fixed for kk0k~{}\geq~{}k_{0}.

In steady-state, the error covariance matrix in (8) satisfies

𝐏i=\displaystyle\mathbf{P}_{i}= 𝐅^i𝐏i𝐅^iT+𝐊i𝐑i𝐊iT+𝐐\displaystyle\mathbf{\hat{F}}_{i}\mathbf{P}_{i}\mathbf{\hat{F}}_{i}^{\text{T}}+\mathbf{K}_{i}\mathbf{R}_{i}\mathbf{K}_{i}^{\text{T}}+\mathbf{Q} (23)
+𝐂i𝔼{s𝒩ip𝒩i𝐒s(k)𝚺sp𝐒pT(k)}𝐂iT\displaystyle+\mathbf{C}_{i}\mathbb{E}\left\{\textstyle\sum_{s\in\mathcal{N}_{i}}\textstyle\sum_{p\in\mathcal{N}_{i}}\mathbf{S}_{s}(k)\boldsymbol{\Sigma}_{sp}\mathbf{S}_{p}^{\text{T}}(k)\right\}\mathbf{C}_{i}^{\text{T}}

where 𝐏i=limk𝐏i(k)\mathbf{P}_{i}=\lim_{k\rightarrow\infty}\mathbf{P}_{i}(k). Defining 𝐏diag({𝐏i}i=1L)\mathbf{P}\triangleq\textsf{diag}(\{\mathbf{P}_{i}\}_{i=1}^{L}), 𝐅^diag({𝐅^i}i=1L)\mathbf{\hat{F}}\triangleq\textsf{diag}(\{\mathbf{\hat{F}}_{i}\}_{i=1}^{L}), 𝐊diag({𝐊i}i=1L)\mathbf{K}\triangleq\textsf{diag}(\{\mathbf{K}_{i}\}_{i=1}^{L}), 𝐂diag({𝐂i}i=1L)\mathbf{C}\triangleq\textsf{diag}(\{\mathbf{C}_{i}\}_{i=1}^{L}), and 𝐑diag({𝐑i}i=1L)\mathbf{R}\triangleq\textsf{diag}(\{\mathbf{R}_{i}\}_{i=1}^{L}), the network-wide version of (23) is given by

𝐏=𝐅^𝐏𝐅^T+𝐊𝐑𝐊T+𝐈L𝐐\displaystyle\mathbf{P}=\mathbf{\hat{F}}\mathbf{P}\mathbf{\hat{F}}^{\text{T}}+\mathbf{K}\mathbf{R}\mathbf{K}^{\text{T}}+\mathbf{I}_{L}\otimes\mathbf{Q} (24)
+𝐂𝔼{(𝐈L𝐈)((𝐄𝐈)𝐒(k)𝚺𝐒T(k)(𝐄𝐈))}𝐂T\displaystyle+\mathbf{C}\mathbb{E}\left\{(\mathbf{I}_{L}\otimes\mathbf{I})\odot\left((\mathbf{E}\otimes\mathbf{I})\mathbf{S}(k)\mathbf{\Sigma}\mathbf{S}^{\text{T}}(k)(\mathbf{E}\otimes\mathbf{I})\right)\right\}\mathbf{C}^{\text{T}}

where 𝚺\mathbf{\Sigma} is the network-wide coordinated attack covariance. Under Assumption 1, we have

𝐏=𝐅^𝐏𝐅^T+𝐊𝐑𝐊T+𝐈L𝐐\displaystyle\mathbf{P}=\mathbf{\hat{F}}\mathbf{P}\mathbf{\hat{F}}^{\text{T}}+\mathbf{K}\mathbf{R}\mathbf{K}^{\text{T}}+\mathbf{I}_{L}\otimes\mathbf{Q} (25)
+𝐂((𝐈L𝐈)((𝐄𝐈)𝔼{𝐒(k)𝚺𝐒T(k)}(𝐄𝐈)))𝐂T\displaystyle+\mathbf{C}\bigg{(}(\mathbf{I}_{L}\otimes\mathbf{I})\odot\left((\mathbf{E}\otimes\mathbf{I})\mathbb{E}\left\{\mathbf{S}(k)\mathbf{\Sigma}\mathbf{S}^{\text{T}}(k)\right\}(\mathbf{E}\otimes\mathbf{I})\right)\bigg{)}\mathbf{C}^{\text{T}}

and using the result of (21), we finally have

𝐏=\displaystyle\mathbf{P}= 𝐅^𝐏𝐅^T+𝐊𝐑𝐊T+𝐈L𝐐\displaystyle\,\mathbf{\hat{F}}\mathbf{P}\mathbf{\hat{F}}^{\text{T}}+\mathbf{K}\mathbf{R}\mathbf{K}^{\text{T}}+\mathbf{I}_{L}\otimes\mathbf{Q} (26)
+pe𝐂((𝐈L𝐈)((𝐄𝐈)𝚺(𝐄𝐈)))𝐂T\displaystyle+p_{e}\mathbf{C}\bigg{(}(\mathbf{I}_{L}\otimes\mathbf{I})\odot\big{(}(\mathbf{E}\otimes\mathbf{I})\mathbf{\Sigma}(\mathbf{E}\otimes\mathbf{I})\big{)}\bigg{)}\mathbf{C}^{\text{T}}

The last term in (26) captures the impact of the coordinated Byzantine attack on the error covariance matrix, scaled by pep_{e}. Thus, defining the steady-state network-wide MSE (NMSE) as

NMSElimktr(𝔼{𝐏(k)}),\text{NMSE}\triangleq\lim_{k\rightarrow\infty}\text{tr}(\mathbb{E}\{\mathbf{P}(k)\}), (27)

we see that partial sharing, i.e., pe<1p_{e}<1, results in lower steady-state NMSE compared to the case when the full state is shared, i.e., pe=1p_{e}=1, which gives enhanced robustness against coordinated Byzantine attacks.
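The effect of p_e on the steady-state error can be illustrated with a scalar analogue of (26), where all matrices collapse to scalars and the fixed point is available in closed form. The sketch below uses hypothetical values and only illustrates the trend that a smaller shared fraction (smaller p_e) yields a lower steady-state MSE under attack.

```python
# Scalar analogue of the steady-state recursion (26): with scalar F_hat = f,
# K = kk, C = c, and attack variance sigma, the fixed point of
#   p = f^2 p + kk^2 r + q + p_e c^2 sigma
# is p = (kk^2 r + q + p_e c^2 sigma) / (1 - f^2).
# All numeric values are hypothetical and serve only to illustrate the trend.
def steady_state_mse(p_e, f=0.6, kk=0.4, r=0.8, q=0.1, c=0.2, sigma=5.0):
    assert abs(f) < 1.0, "stability requires |f| < 1"
    return (kk**2 * r + q + p_e * c**2 * sigma) / (1.0 - f**2)

full_sharing = steady_state_mse(p_e=1.0)   # entire state shared
partial      = steady_state_mse(p_e=0.25)  # quarter of the state shared
print(full_sharing, partial)               # partial sharing yields the lower MSE
```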

VI Coordinated Byzantine Attack Design

To analyze the worst-case performance of the BR-CDF algorithm, we consider a scenario where Byzantine agents design a coordinated attack to maximize the NMSE. Based on the attack model in Section II and the error covariance of the BR-CDF algorithm in (7), Byzantine agents have the following two levers to design their coordinated attack:

  • The design of perturbation covariance matrix 𝚺\mathbf{\Sigma}, modeled as the covariance of zero-mean Gaussian sequences.

  • The choice of selection matrices, which determines the sequence of information fractions that Byzantine agents share at the beginning of the attack, i.e., 𝐒i(k0)\mathbf{S}_{i}(k_{0}) for ii\in\mathcal{B}.

We ensure that the attack remains stealthy from the perspective of regular agents by setting an upper bound on the energy of the perturbation sequences, i.e., tr(𝚺)η\text{tr}(\boldsymbol{\Sigma})\leq\eta. Assuming Byzantines start perturbing information once agents reach steady-state, i.e., k=k0k=k_{0}, we derive an expression for the NMSE pertaining to the estimator in (4). The network-wide evolution of the estimation error of the BR-CDF algorithm, derived in Section III, is given by

𝐞(k+1)=𝐀~(k)𝐞(k)+𝐛~(k)+𝚪(k)𝜹(k)\mathbf{e}(k+1)=\mathbf{\tilde{A}}(k)\mathbf{e}(k)+\mathbf{\tilde{b}}(k)+\boldsymbol{\Gamma}(k){\boldsymbol{\delta}}(k) (28)

where 𝐞(k)[𝐞1T(k),,𝐞LT(k)]T\mathbf{e}(k)\triangleq[\mathbf{e}_{1}^{\text{T}}(k),\cdots,\mathbf{e}_{L}^{\text{T}}(k)]^{\text{T}},

𝐀~(k)\displaystyle\mathbf{\tilde{A}}(k) =diag({𝐅i(k)}i=1L)+𝐂(𝐄𝐈)𝐒(k)\displaystyle=\textsf{diag}(\{\mathbf{F}_{i}(k)\}_{i=1}^{L})+\mathbf{C}\big{(}\mathbf{E}\otimes\mathbf{I}\big{)}\mathbf{S}(k)
𝐛~(k)\displaystyle\mathbf{\tilde{b}}(k) =diag({𝐊i(k)𝐯i(k)}i=1L)𝟏L𝐰(k)\displaystyle=\textsf{diag}(\{\mathbf{{K}}_{i}(k)\mathbf{v}_{i}(k)\}_{i=1}^{L})-\mathbf{1}_{L}\otimes\mathbf{w}(k)
𝚪(k)\displaystyle\boldsymbol{\Gamma}(k) =𝐂(𝐄𝐈)𝐒(k)\displaystyle=\mathbf{C}\big{(}\mathbf{E}\otimes\mathbf{I}\big{)}\mathbf{S}(k)

As a result, the network-wide error covariance matrix 𝐏(k)𝔼{𝐞(k)𝐞T(k)}\mathbf{P}(k)\triangleq\mathbb{E}\{\mathbf{e}(k)\mathbf{e}^{\text{T}}(k)\}, including cross-terms of the error covariance, is given by

𝐏(k+1)=𝐀~(k)𝐏(k)𝐀~T(k)+𝐐~(k)+𝚪(k)𝚺𝚪T(k)\mathbf{P}(k+1)=\mathbf{\tilde{A}}(k)\mathbf{P}(k)\mathbf{\tilde{A}}^{\text{T}}(k)+\mathbf{\tilde{Q}}(k)+\boldsymbol{\Gamma}(k)\boldsymbol{\Sigma}\boldsymbol{\Gamma}^{\text{T}}(k) (29)

where 𝐐~(k)=diag({𝐊i(k)𝐑i𝐊iT(k)}i=1L)+𝟏L𝟏LT𝐐\mathbf{\tilde{Q}}(k)=\textsf{diag}(\{{\mathbf{K}}_{i}(k)\mathbf{R}_{i}{\mathbf{K}}_{i}^{\text{T}}(k)\}_{i=1}^{L})+\mathbf{1}_{L}\mathbf{1}_{L}^{\text{T}}\otimes\mathbf{Q}. In (29), the last term is due to the injected noise and is given by

𝚪(k)𝚺𝚪T(k)=𝐂(𝐄𝐈)𝐒(k)𝚺𝐒T(k)(𝐄𝐈)𝐂T\displaystyle\boldsymbol{\Gamma}(k)\boldsymbol{\Sigma}\boldsymbol{\Gamma}^{\text{T}}(k)=\mathbf{C}\big{(}\mathbf{E}\otimes\mathbf{I}\big{)}\mathbf{S}(k)\boldsymbol{\Sigma}\mathbf{S}^{\text{T}}(k)\big{(}\mathbf{E}\otimes\mathbf{I}\big{)}\mathbf{C}^{\text{T}} (30)

which, compared to the Byzantine-free case, degrades the NMSE. Considering the NMSE in (27), we define two optimization problems to find the optimal coordinated Byzantine attacks by designing the partial-sharing selection matrices at k=k0k=k_{0} and attack covariance matrices of Byzantine agents.

The last term of the estimation error covariance 𝐏(k)\mathbf{P}(k), given in (30), is the only term of (29) that depends on the attack; thus, maximizing the trace of the estimation error covariance is equivalent to maximizing the trace of this term [50]. Since it also depends on the selection matrix 𝐒(k)\mathbf{S}(k), for a given attack covariance 𝚺\boldsymbol{\Sigma} we can show that

tr(𝚪(k)𝚺𝚪T(k))\displaystyle\text{tr}(\boldsymbol{\Gamma}(k)\boldsymbol{\Sigma}\boldsymbol{\Gamma}^{\text{T}}(k)) =tr(𝐂(𝐄𝐈)𝐒(k)𝚺𝐒T(k)(𝐄𝐈)𝐂T)\displaystyle=\text{tr}\big{(}\mathbf{C}(\mathbf{E}\otimes\mathbf{I})\mathbf{S}(k)\boldsymbol{\Sigma}\mathbf{S}^{\text{T}}(k)(\mathbf{E}\otimes\mathbf{I})\mathbf{C}^{\text{T}}\big{)}
=tr((𝐄𝐈)𝐂T𝐂(𝐄𝐈)𝐒(k)𝚺𝐒T(k))\displaystyle=\text{tr}\big{(}(\mathbf{E}\otimes\mathbf{I})\mathbf{C}^{\text{T}}\mathbf{C}(\mathbf{E}\otimes\mathbf{I})\mathbf{S}(k)\boldsymbol{\Sigma}\mathbf{S}^{\text{T}}(k)\big{)}
=ijtr(𝐔ij𝐒j(k)𝚺ji𝐒i(k))\displaystyle=\textstyle\sum_{i\in\mathcal{B}}\textstyle\sum_{j\in\mathcal{B}}\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}(k)\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}(k)\right) (31)

where 𝐔ij=q𝒩i𝒩j𝐂qT𝐂q\mathbf{U}_{ij}=\textstyle\sum_{q\in\mathcal{N}_{i}\cap\mathcal{N}_{j}}\mathbf{C}_{q}^{\text{T}}\mathbf{C}_{q}. Thus, the optimization problem that maximizes the steady-state NMSE can be stated as

max{𝐒i,i}\displaystyle\underset{\{\mathbf{S}_{i},\,i\in\mathcal{B}\}}{\max} ijtr(𝐔ij𝐒j𝚺ji𝐒i)\displaystyle\sum_{i\in\mathcal{B}}\sum_{j\in\mathcal{B}}\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}\right) (32)
s. t. 𝟎𝐒i𝐈i\displaystyle\mathbf{0}\leq\mathbf{S}_{i}\leq\mathbf{I}\quad\forall i\in\mathcal{B}
[𝐒i]rs{0,1}\displaystyle[\mathbf{S}_{i}]_{rs}\in\{0,1\}
tr(𝐒i)li\displaystyle\text{tr}(\mathbf{S}_{i})\leq l\quad\forall i\in\mathcal{B}

where the resulting solution for 𝐒i\mathbf{S}_{i} determines 𝐒i(k0)\mathbf{S}_{i}(k_{0}), and the first two constraints restrict the selection matrix to be diagonal with 0 or 11 elements on the main diagonal. The last constraint enforces that only ll elements of the state vector are shared with neighbors at any given instant. We relax the non-convex Boolean constraint on the elements of 𝐒i\mathbf{S}_{i} and rewrite the optimization problem as

max{𝐒i,i}\displaystyle\underset{\{\mathbf{S}_{i},\,i\in\mathcal{B}\}}{\max} ijtr(𝐔ij𝐒j𝚺ji𝐒i)\displaystyle\sum_{i\in\mathcal{B}}\sum_{j\in\mathcal{B}}\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}\right) (33)
s. t. 𝟎𝐒i𝐈i\displaystyle\mathbf{0}\leq\mathbf{S}_{i}\leq\mathbf{I}\quad\forall i\in\mathcal{B}
tr(𝐒i)li\displaystyle\text{tr}(\mathbf{S}_{i})\leq l\quad\forall i\in\mathcal{B}

The objective function in (33) can be further simplified as

ij\displaystyle\sum_{i\in\mathcal{B}}\sum_{j\in\mathcal{B}} tr(𝐔ij𝐒j𝚺ji𝐒i)=i(tr(𝐔ii𝐒i𝚺i𝐒i)\displaystyle\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}\right)=\sum_{i\in\mathcal{B}}\bigg{(}\text{tr}\left(\mathbf{U}_{ii}\mathbf{S}_{i}\boldsymbol{\Sigma}_{i}\mathbf{S}_{i}\right) (34)
+j/{i}12tr(𝐔ij𝐒j𝚺ji𝐒i+𝐔ji𝐒i𝚺ij𝐒j))\displaystyle+\sum_{j\in\mathcal{B}/\{i\}}\frac{1}{2}\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}+\mathbf{U}_{ji}\mathbf{S}_{i}\boldsymbol{\Sigma}_{ij}\mathbf{S}_{j}\right)\bigg{)}

which still contains non-convex quadratic terms. To address this, we employ a block coordinate descent (BCD) algorithm in which each Byzantine agent ii, given the selection matrices of the other Byzantines, optimizes its own selection matrix. The BCD algorithm runs for TT iterations; at each iteration t+1t+1, agent ii uses the selection matrices of the other Byzantine agents from the previous iteration, i.e., {𝐒jt}j\{i}\{\mathbf{S}_{j}^{t}\}_{j\in\mathcal{B}\backslash\{i\}}.

Hence, the optimization problem in (33) can be solved by employing the BCD method, where at each agent ii\in\mathcal{B} and BCD iteration t+1t+1, the optimization problem is modeled as

𝒫:\displaystyle\mathcal{P}: max𝐒i\displaystyle\underset{\mathbf{S}_{i}}{\max} F(𝐒i,{𝐒jt}j\{i})\displaystyle F(\mathbf{S}_{i},\{\mathbf{S}_{j}^{t}\}_{j\in\mathcal{B}\backslash\{i\}}) (35)
s. t. 𝟎𝐒i𝐈\displaystyle\mathbf{0}\leq\mathbf{S}_{i}\leq\mathbf{I}
tr(𝐒i)l\displaystyle\text{tr}(\mathbf{S}_{i})\leq l

with the objective function

F(𝐒i,\displaystyle F(\mathbf{S}_{i}, {𝐒jt}j\{i})=tr(𝐔ii𝐒i𝚺i𝐒i)\displaystyle\{\mathbf{S}_{j}^{t}\}_{j\in\mathcal{B}\backslash\{i\}})=\text{tr}\left(\mathbf{U}_{ii}\mathbf{S}_{i}\boldsymbol{\Sigma}_{i}\mathbf{S}_{i}\right) (36)
+j/{i}12tr(𝐔ij𝐒jt𝚺ji𝐒i+𝐔ji𝐒i𝚺ij𝐒jt)\displaystyle+\sum_{j\in\mathcal{B}/\{i\}}\frac{1}{2}\text{tr}\left(\mathbf{U}_{ij}\mathbf{S}_{j}^{t}\boldsymbol{\Sigma}_{ji}\mathbf{S}_{i}+\mathbf{U}_{ji}\mathbf{S}_{i}\boldsymbol{\Sigma}_{ij}\mathbf{S}_{j}^{t}\right)

and 𝐒jt\mathbf{S}_{j}^{t} as the selection matrix of Byzantine agent jj at the previous BCD iteration.

Algorithm 2 BCD-based attack design
Require: each agent ii\in\mathcal{B}
Initialize: 𝐒i0=𝐒i(k0)\mathbf{S}_{i}^{0}=\mathbf{S}_{i}(k_{0}) and receive 𝐒j0=𝐒j(k0)\mathbf{S}_{j}^{0}=\mathbf{S}_{j}(k_{0}) from j\{i}j\in\mathcal{B}\backslash\{i\}
  for t=1t=1 to TT do
     Find 𝐒i\mathbf{S}_{i} by solving 𝒫\mathcal{P} in (35)
     Set 𝐒it=𝐒i\mathbf{S}_{i}^{t}=\mathbf{S}_{i} and share it with j\{i}j\in\mathcal{B}\backslash\{i\}
     Receive {𝐒jt}j\{i}\{\mathbf{S}_{j}^{t}\}_{j\in\mathcal{B}\backslash\{i\}}
  end for
  On the main diagonal of 𝐒iT\mathbf{S}_{i}^{T}, set the ll largest elements to 11 and the others to 0.
  Set 𝐒i(k0)=𝐒iT\mathbf{S}_{i}(k_{0})=\mathbf{S}_{i}^{T}

Algorithm 2 summarizes the BCD algorithm used to solve the optimization problem in (35). Next, we investigate how optimizing the perturbation covariance matrix impacts the NMSE.
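As an illustration of this procedure, the following Python sketch runs the BCD iterations for the special case where all 𝐔ij and 𝚺ij blocks are diagonal, so the objective (36) separates across state entries and each subproblem 𝒫 reduces to a top-l selection. The matrices U and Sigma below are hypothetical stand-ins, not quantities from the paper's setup.

```python
# Sketch of the BCD attack design (Algorithm 2) for the special case where
# all U_ij and Sigma_ij blocks are diagonal (stored as lists of diagonal
# entries). The objective (36) then separates across state entries, so each
# BCD step reduces to picking the l highest-scoring entries.
m, l, T = 4, 2, 10                      # state length, shared fraction, BCD iters
byz = [0, 1]                            # indices of the Byzantine agents
U = {(0, 0): [1.0, 0.2, 0.5, 0.1], (1, 1): [0.3, 0.9, 0.2, 0.4],
     (0, 1): [0.2, 0.1, 0.4, 0.1], (1, 0): [0.2, 0.1, 0.4, 0.1]}
Sigma = {(0, 0): [2.0, 1.0, 0.5, 1.5], (1, 1): [0.5, 2.0, 1.0, 0.3],
         (0, 1): [0.4, 0.3, 0.8, 0.2], (1, 0): [0.4, 0.3, 0.8, 0.2]}

S = {i: [1] * l + [0] * (m - l) for i in byz}   # initial selection patterns

def top_l(scores, l):
    """Boolean vector with ones at the l largest scores."""
    keep = sorted(range(len(scores)), key=lambda r: scores[r], reverse=True)[:l]
    return [1 if r in keep else 0 for r in range(len(scores))]

for _ in range(T):
    prev = {i: S[i][:] for i in byz}    # selections from the previous iteration
    for i in byz:                       # each Byzantine optimizes its own S_i
        scores = []
        for r in range(m):
            s = U[(i, i)][r] * Sigma[(i, i)][r]   # tr(U_ii S_i Sigma_i S_i)
            for j in byz:
                if j != i:                        # symmetric cross terms of (36)
                    s += U[(i, j)][r] * prev[j][r] * Sigma[(j, i)][r]
            scores.append(s)
        S[i] = top_l(scores, l)
print(S)
```

With these toy numbers the iterations settle within a few rounds, mirroring the final rounding step of Algorithm 2 where the l largest diagonal entries are set to one.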

Given the selection matrices at the beginning of the attack, i.e., 𝐒i(k0)\mathbf{S}_{i}(k_{0}) for ii\in\mathcal{B}, Byzantine agents can maximize the steady-state NMSE by cooperatively designing their attack covariances via the following optimization problem

max𝚺\displaystyle\underset{\boldsymbol{\Sigma}}{\max} tr(𝚪(k0)𝚺𝚪T(k0))\displaystyle\quad\text{tr}(\boldsymbol{\Gamma}(k_{0})\boldsymbol{\Sigma}\boldsymbol{\Gamma}^{\text{T}}(k_{0})) (37)
s. t. 𝚺0\displaystyle\quad\boldsymbol{\Sigma}\succcurlyeq 0
tr(𝚺)η\displaystyle\quad\text{tr}(\boldsymbol{\Sigma})\leq\eta

where 𝚪(k0)=𝐂(𝐄𝐈)𝐒(k0)(diag(𝐳)𝐈)\small{\boldsymbol{\Gamma}(k_{0})=\mathbf{C}\big{(}\mathbf{E}\otimes\mathbf{I}\big{)}\mathbf{S}(k_{0})(\,\textsf{diag}(\mathbf{z})\otimes\mathbf{I})} and 𝐳L\mathbf{z}\in\mathbb{R}^{L} is a Boolean vector, 𝐳=[z1,z2,,zL]T\mathbf{z}=[z_{1},z_{2},\cdots,z_{L}]^{\text{T}} with zi=1z_{i}=1 if ii\in\mathcal{B} and zi=0z_{i}=0 otherwise, that preserves the structure of the perturbation covariance: employing 𝐳\mathbf{z} sets to zero all blocks of 𝚺\boldsymbol{\Sigma} corresponding to regular agents. The first constraint in (37) guarantees that the designed attack covariance is positive semidefinite, while the last constraint limits the energy of the Byzantine noise sequences as tr(𝚺)η\text{tr}(\boldsymbol{\Sigma})\leq\eta to maintain the attack stealthiness.

Remark 1.

The optimization problem in (37) is a semidefinite program (SDP) that can be solved efficiently by interior-point methods.
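For intuition, note that tr(𝚪𝚺𝚪ᵀ) = tr((𝚪ᵀ𝚪)𝚺), so absent the sparsity structure imposed by 𝐳, a maximizer of the trace-constrained problem is the rank-one matrix 𝚺* = η v vᵀ with v the dominant eigenvector of 𝚪ᵀ𝚪. The sketch below computes this candidate by power iteration for a small hypothetical 𝚪; it illustrates the solution structure of the SDP, not the interior-point solver itself.

```python
# Rank-one maximizer of tr(Gamma Sigma Gamma^T) subject to tr(Sigma) <= eta,
# Sigma >= 0: Sigma* = eta * v v^T, with v the dominant eigenvector of
# G = Gamma^T Gamma. Power iteration on a small hypothetical Gamma (plain
# nested lists, no linear-algebra library).
def matvec(M, x):
    return [sum(M[r][c] * x[c] for c in range(len(x))) for r in range(len(M))]

def power_iteration(M, iters=200):
    v = [1.0] * len(M)
    for _ in range(iters):
        w = matvec(M, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

Gamma = [[0.3, 0.1, 0.0],
         [0.0, 0.5, 0.2],
         [0.1, 0.0, 0.4]]
# G = Gamma^T Gamma (symmetric positive semidefinite)
G = [[sum(Gamma[k][r] * Gamma[k][c] for k in range(3)) for c in range(3)]
     for r in range(3)]
eta = 1.0
v = power_iteration(G)
Sigma_opt = [[eta * v[r] * v[c] for c in range(3)] for r in range(3)]
# attained objective: Rayleigh quotient at the dominant eigenvector
value = sum(v[r] * G[r][c] * v[c] for r in range(3) for c in range(3))
print(value)
```

The rank-one candidate meets the trace budget with equality and beats, for instance, the isotropic choice 𝚺 = (η/3)𝐈, consistent with the attacker concentrating energy along the most amplified direction.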

VII Simulation Results

In this section, we demonstrate the robustness of the BR-CDF algorithm to Byzantine attacks. For this purpose, we consider a target tracking problem with a state vector of length m=8m=8, described by the linear model

𝐱(k+1)=([0.60.0050.250.6]𝐈4)𝐱(k)+𝐰(k)\mathbf{x}(k+1)=\left(\begin{bmatrix}0.6&0.005\\ 0.25&0.6\end{bmatrix}\otimes\mathbf{I}_{4}\right)\,\mathbf{x}(k)+\mathbf{w}(k)
Refer to caption
Figure 2: Randomly generated network topology.

We consider a randomly generated undirected connected network of L=25L=25 agents, as shown in Fig. 2. At each agent ii, the state noise covariance is 𝐐=0.1𝐈\mathbf{Q}=0.1\mathbf{I} and the local observation is given by

𝐲i(k)=([1100100000100011]𝐈2)𝐱(k)+𝐯i(k)\mathbf{y}_{i}(k)=\left(\begin{bmatrix}1&1&0&0\\ 1&0&0&0\\ 0&0&1&0\\ 0&0&1&1\end{bmatrix}\otimes\mathbf{I}_{2}\right)\,\mathbf{x}(k)+\mathbf{v}_{i}(k)

In addition, the observation noise covariance at each agent ii is 𝐑i=μi𝐈\mathbf{R}_{i}=\mu_{i}\mathbf{I}, where μi𝒰(0,1)\mu_{i}\sim\mathcal{U}(0,1). As the performance metric, we use the average MSE across agents, i.e.,

MSE1Li=1Ltr(𝐏i)\text{MSE}\triangleq\frac{1}{L}\textstyle\sum_{i=1}^{L}\text{tr}(\mathbf{P}_{i}) (38)

with 𝐏i\mathbf{P}_{i} being the steady-state error covariance matrix of agent ii in (23). The simulation results presented in the following are obtained by averaging over 100 independent experiments.

Refer to caption
Figure 3: MSE of the BR-CDF algorithm versus time index kk without attack.

We simulated the proposed BR-CDF algorithm for different values of ll, namely 2, 4, 6, and 8 (i.e., 25%25\%, 50%50\%, 75%75\%, and 100%100\% information sharing). Fig. 3 shows the corresponding learning curves, i.e., MSE versus time instant kk, when no attacks occur in the network. The MSE increases as the shared fraction decreases; however, the difference is negligible in this experiment.

Refer to caption
Figure 4: MSE of the BR-CDF and its suboptimal solution versus time index kk.

Next, we examined the robustness of the BR-CDF algorithm to Byzantine attacks. After the network has converged, the Byzantine agents launch an attack at k0=30k_{0}=30. The Byzantine agents are chosen as the B=5B=5 nodes with the highest degree in the network graph, and the energy of the attack sequences is restricted by the parameter η=L\eta=L. We then compared the accuracy of the proposed suboptimal BR-CDF in Algorithm 1 with the solution of the BR-CDF that shares all necessary variables. Fig. 4 illustrates the corresponding learning curves for different values of ll. We observe that the suboptimal solution performs comparably to the solution that shares all necessary variables. Furthermore, the proposed algorithms provide robustness against Byzantine attacks, since sharing less information results in lower MSE.

In Fig. 5, to observe the fluctuations caused by the selection matrices, we plot the MSE in (38) and MSE=1Llimki=1Ltr(𝐏i(k))\text{MSE}^{\prime}=\frac{1}{L}\lim_{k\rightarrow\infty}\sum_{i=1}^{L}\text{tr}(\mathbf{P}_{i}(k)) with 𝐏i(k)\mathbf{P}_{i}(k) in (8).111The difference between MSE\text{MSE}^{\prime} and MSE is that MSE\text{MSE}^{\prime} does not include the statistical expectation with respect to the selection matrices. This allows us to examine the accuracy of our theoretical expression for the expected error covariance with respect to the selection matrices in (23). The close agreement between MSE and MSE\text{MSE}^{\prime} in Fig. 5 shows that the simulation results match the theoretical findings.

Refer to caption
Figure 5: MSE and MSE\text{MSE}^{\prime} versus time index kk.
Refer to caption
Figure 6: MSE versus time index kk for cases of optimized selection matrix 𝐒(k)\mathbf{S}^{*}(k) and random selection matrix 𝐒(k)\mathbf{S}(k).

To solve the optimization problem 𝒫\mathcal{P} in (35), we ran the BCD algorithm with T=10T=10 iterations and designed the selection matrices {𝐒j(k0)}j\{\mathbf{S}_{j}(k_{0})\}_{j\in\mathcal{B}} at k0=30k_{0}=30. Fig. 6 shows that the designed selection matrices increase the network MSE, and that their design has a greater impact on network performance when a smaller fraction is shared.

Refer to caption
Figure 7: MSE versus time index kk for cases of optimized attack covariance 𝚺\mathbf{\Sigma}^{*} and random attack covariance 𝚺\mathbf{\Sigma}.
Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Figure 8: MSE\text{MSE}^{\prime} versus time index kk for cases of optimized attack covariance 𝚺\mathbf{\Sigma}^{*} and random attack covariance 𝚺\mathbf{\Sigma}.

By solving the optimization problem in (37), we examine the impact of an optimized attack covariance compared to a random one. To this end, we fixed the constraint on the energy of the perturbation sequences, i.e., η\eta. Fig. 7 shows that the optimized perturbation covariance 𝚺\mathbf{\Sigma}^{*} increases the MSE, while partial sharing of information enhances robustness to Byzantine attacks by restricting the growth of the MSE. In other words, the more information is shared with neighbors, the greater the impact of optimizing the perturbation covariance matrix.

Refer to caption
Figure 9: MSE versus percentage of the Byzantine agents in the network.

For different values of ll, Fig. 8 plots the MSE\text{MSE}^{\prime} versus time index kk for optimized and random selection of the attack covariance. It can be seen that when less information is shared, the sensitivity to perturbation sequences with optimized covariance increases, resulting in high levels of fluctuation in the MSE\text{MSE}^{\prime}. In addition, Figs. 6 and 7 show that the optimized selection matrices have a greater impact when less information is shared, e.g., 25%25\% and 50%50\%-sharing, while optimal attack covariance has a higher impact when larger fractions of information are shared, e.g., 75%75\% and full-sharing.

Refer to caption
Figure 10: MSE versus trace of the attack covariance, i.e., tr(𝚺)/L\text{tr}(\mathbf{\Sigma})/L.

To analyze the robustness of the proposed BR-CDF algorithm to the number of Byzantine agents, Fig. 9 plots the MSE versus the percentage of Byzantine agents in the network. As expected, the MSE grows as the percentage of Byzantine agents increases; however, partial sharing of information significantly improves the resilience to Byzantine attacks, as evidenced by the lower MSE. In addition, Fig. 10 illustrates the MSE versus the trace of the attack covariance in order to assess the robustness of the BR-CDF algorithm to perturbation sequences. Partial sharing of information again improves robustness to the injected noise, as indicated by the lower MSE.

VIII Conclusion

This paper proposed a Byzantine-resilient consensus-based distributed filter (BR-CDF) that allows agents to exchange a fraction of their information at each time instant. We characterized the performance and convergence of the BR-CDF and investigated the impact of coordinated data falsification attacks. We showed that partial sharing of information provides robustness against Byzantine attacks and also reduces the communication load among agents by sharing a smaller fraction of the states at each time instant. Furthermore, we analyzed the worst-case scenario of a data falsification attack where Byzantine agents cooperate on designing the covariance of their falsification data or the sequence of their shared fractions. Finally, the numerical results verified the robustness of the proposed BR-CDF against Byzantine attacks and corroborated the theoretical findings.

References

  • [1] A. Humayed, J. Lin, F. Li, and B. Luo, “Cyber-physical systems security—a survey,” IEEE Internet Things J., vol. 4, no. 6, pp. 1802–1831, Dec. 2017.
  • [2] D. Ding, Q.-L. Han, X. Ge, and J. Wang, “Secure state estimation and control of cyber-physical systems: A survey,” IEEE Trans. Syst., Man, Cybern. Syst., vol. 51, no. 1, pp. 176–190, Jan. 2021.
  • [3] A. Farraj, E. Hammad, and D. Kundur, “On the impact of cyber attacks on data integrity in storage-based transient stability control,” IEEE Trans. Ind. Informat., vol. 13, no. 6, pp. 3322–3333, Dec. 2017.
  • [4] L. Hu, Z. Wang, Q.-L. Han, and X. Liu, “State estimation under false data injection attacks: Security analysis and system protection,” Elsevier Automatica, vol. 87, pp. 176–183, Jan. 2018.
  • [5] L. J. Rodriguez, N. H. Tran, T. Q. Duong, T. Le-Ngoc, M. Elkashlan, and S. Shetty, “Physical layer security in wireless cooperative relay networks: State of the art and beyond,” IEEE Commun. Mag., vol. 53, no. 12, pp. 32–39, Dec. 2015.
  • [6] N. Forti, G. Battistelli, L. Chisci, S. Li, B. Wang, and B. Sinopoli, “Distributed joint attack detection and secure state estimation,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 1, pp. 96–110, Mar. 2018.
  • [7] M. S. Rahman, M. A. Mahmud, A. M. T. Oo, and H. R. Pota, “Multi-agent approach for enhancing security of protection schemes in cyber-physical energy systems,” IEEE Trans. Ind. Informat., vol. 13, no. 2, pp. 436–447, Apr. 2017.
  • [8] H. Fawzi, P. Tabuada, and S. Diggavi, “Secure estimation and control for cyber-physical systems under adversarial attacks,” IEEE Trans. Autom. Control, vol. 59, no. 6, pp. 1454–1467, Jun. 2014.
  • [9] A. Moradi, N. K. D. Venkategowda, S. P. Talebi, and S. Werner, “Privacy-preserving distributed Kalman filtering,” IEEE Trans. Signal Process., pp. 1–16, 2022.
  • [10] A. Moradi, N. K. Venkategowda, S. P. Talebi, and S. Werner, “Distributed Kalman filtering with privacy against honest-but-curious adversaries,” in Proc. 55th IEEE Asilomar Conf. Signals, Syst., Comput., 2021, pp. 790–794.
  • [11] A. Moradi, N. K. D. Venkategowda, S. P. Talebi, and S. Werner, “Securing the distributed Kalman filter against curious agents,” in Proc. 24th IEEE Int. Conf. Inf. Fusion, 2021, pp. 1–7.
  • [12] S. Liang, J. Lam, and H. Lin, “Secure estimation with privacy protection,” IEEE Trans. Cybern., pp. 1–15, 2022.
  • [13] D. Ding, Q.-L. Han, Z. Wang, and X. Ge, “A survey on model-based distributed control and filtering for industrial cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 15, no. 5, pp. 2483–2499, May 2019.
  • [14] C. Zhao, J. He, and J. Chen, “Resilient consensus with mobile detectors against malicious attacks,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 1, pp. 60–69, Mar. 2018.
  • [15] Y. Guan and X. Ge, “Distributed attack detection and secure estimation of networked cyber-physical systems against false data injection attacks and jamming attacks,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 1, pp. 48–59, Mar. 2018.
  • [16] R. K. Chang, “Defending against flooding-based distributed denial-of-service attacks: A tutorial,” IEEE Commun. Mag., vol. 40, no. 10, pp. 42–51, Oct. 2002.
  • [17] A. Vempaty, L. Tong, and P. K. Varshney, “Distributed inference with byzantine data: State-of-the-art review on data falsification attacks,” IEEE Signal Process. Mag., vol. 30, no. 5, pp. 65–75, Aug. 2013.
  • [18] A. Moradi, N. K. Venkategowda, and S. Werner, “Total variation-based distributed kalman filtering for resiliency against byzantines,” IEEE Sensors J., vol. 23, no. 4, pp. 4228–4238, Feb. 2023.
  • [19] Y. Mo and B. Sinopoli, “On the performance degradation of cyber-physical systems under stealthy integrity attacks,” IEEE Trans. Autom. Control, vol. 61, no. 9, pp. 2618–2624, Sept. 2016.
  • [20] R. Deng, G. Xiao, R. Lu, H. Liang, and A. V. Vasilakos, “False data injection on state estimation in power systems—attacks, impacts, and defense: A survey,” IEEE Trans. Ind. Informat., vol. 13, no. 2, pp. 411–423, Apr. 2017.
  • [21] F. Li and Y. Tang, “False data injection attack for cyber-physical systems with resource constraint,” IEEE Trans. Cybern., vol. 50, no. 2, pp. 729–738, Feb. 2020.
  • [22] P. Cheng, Z. Yang, J. Chen, Y. Qi, and L. Shi, “An event-based stealthy attack on remote state estimation,” IEEE Trans. Autom. Control, vol. 65, no. 10, pp. 4348–4355, Oct. 2020.
  • [23] P. Srikantha, J. Liu, and J. Samarabandu, “A novel distributed and stealthy attack on active distribution networks and a mitigation strategy,” IEEE Trans. Ind. Informat., vol. 16, no. 2, pp. 823–831, Feb. 2020.
  • [24] Z. Guo, D. Shi, K. H. Johansson, and L. Shi, “Optimal linear cyber-attack on remote state estimation,” IEEE Trans. Control Netw. Syst., vol. 4, no. 1, pp. 4–13, Mar. 2017.
  • [25] Y. Ni, Z. Guo, Y. Mo, and L. Shi, “On the performance analysis of reset attack in cyber-physical systems,” IEEE Trans. Autom. Control, vol. 65, no. 1, pp. 419–425, Jan. 2020.
  • [26] A. Moradi, N. K. D. Venkategowda, and S. Werner, “Coordinated data-falsification attacks in consensus-based distributed Kalman filtering,” in Proc. 8th IEEE Int. Workshop Comput. Advances Multi-Sensor Adaptive Process., 2019, pp. 495–499.
  • [27] M. Choraria, A. Chattopadhyay, U. Mitra, and E. G. Ström, “Design of false data injection attack on distributed process estimation,” IEEE Trans. Inf. Forensics Security, vol. 17, pp. 670–683, Jan. 2022.
  • [28] M. N. Kurt, Y. Yılmaz, and X. Wang, “Distributed quickest detection of cyber-attacks in smart grid,” IEEE Trans. Inf. Forensics Security, vol. 13, no. 8, pp. 2015–2030, Aug. 2018.
  • [29] M. N. Kurt, Y. Yilmaz, and X. Wang, “Real-time detection of hybrid and stealthy cyber-attacks in smart grid,” IEEE Trans. Inf. Forensics Security, vol. 14, no. 2, pp. 498–513, Feb. 2019.
  • [30] M. Aktukmak, Y. Yilmaz, and I. Uysal, “Sequential attack detection in recommender systems,” IEEE Trans. Inf. Forensics Security, vol. 16, pp. 3285–3298, Apr. 2021.
  • [31] Y. Chen, S. Kar, and J. M. F. Moura, “Resilient distributed estimation through adversary detection,” IEEE Trans. Signal Process., vol. 66, no. 9, pp. 2455–2469, May 2018.
  • [32] Y. Li and T. Chen, “Stochastic detector against linear deception attacks on remote state estimation,” in Proc. 55th IEEE Conf. Decis. Control, 2016, pp. 6291–6296.
  • [33] Y. Li, L. Shi, and T. Chen, “Detection against linear deception attacks on multi-sensor remote state estimation,” IEEE Trans. Control Netw. Syst., vol. 5, no. 3, pp. 846–856, Sept. 2018.
  • [34] W. Yang, Y. Zhang, G. Chen, C. Yang, and L. Shi, “Distributed filtering under false data injection attacks,” Automatica, vol. 102, pp. 34–44, 2019.
  • [35] Y. Chen, S. Kar, and J. M. F. Moura, “Resilient distributed estimation: Sensor attacks,” IEEE Trans. Autom. Control, vol. 64, no. 9, pp. 3772–3779, Sept. 2019.
  • [36] A. Barboni, H. Rezaee, F. Boem, and T. Parisini, “Detection of covert cyber-attacks in interconnected systems: A distributed model-based approach,” IEEE Trans. Autom. Control, vol. 65, no. 9, pp. 3728–3741, Sept. 2020.
  • [37] J. Shang, M. Chen, and T. Chen, “Optimal linear encryption against stealthy attacks on remote state estimation,” IEEE Trans. Autom. Control, vol. 66, no. 8, pp. 3592–3607, Aug. 2021.
  • [38] L. Su and S. Shahrampour, “Finite-time guarantees for byzantine-resilient distributed state estimation with noisy measurements,” IEEE Trans. Autom. Control, vol. 65, no. 9, pp. 3758–3771, Sept. 2020.
  • [39] Y. Chen, S. Kar, and J. M. F. Moura, “Resilient distributed parameter estimation with heterogeneous data,” IEEE Trans. Signal Process., vol. 67, no. 19, pp. 4918–4933, Oct. 2019.
  • [40] J. G. Lee, J. Kim, and H. Shim, “Fully distributed resilient state estimation based on distributed median solver,” IEEE Trans. Autom. Control, vol. 65, no. 9, pp. 3935–3942, Sept. 2020.
  • [41] D. Feng, C. Jiang, G. Lim, L. J. Cimini, G. Feng, and G. Y. Li, “A survey of energy-efficient wireless communications,” IEEE Commun. Surveys Tuts., vol. 15, no. 1, pp. 167–178, First Quarter 2013.
  • [42] R. Arablouei, S. Werner, Y.-F. Huang, and K. Doğançay, “Distributed least mean-square estimation with partial diffusion,” IEEE Trans. Signal Process., vol. 62, no. 2, pp. 472–484, Jan. 2014.
  • [43] R. Arablouei, K. Doğançay, S. Werner, and Y.-F. Huang, “Adaptive distributed estimation based on recursive least-squares and partial diffusion,” IEEE Trans. Signal Process., vol. 62, no. 14, pp. 3510–3522, Jul. 2014.
  • [44] V. C. Gogineni, A. Moradi, N. K. D. Venkategowda, and S. Werner, “Communication-efficient and privacy-aware distributed LMS algorithm,” in Proc. 25th IEEE Int. Conf. Inf. Fusion, 2022, pp. 1–6.
  • [45] V. C. Gogineni, S. Werner, Y.-F. Huang, and A. Kuh, “Communication-efficient online federated learning strategies for kernel regression,” IEEE Internet Things J., pp. 1–16, 2022.
  • [46] R. Olfati-Saber, “Kalman-consensus filter: Optimality, stability, and performance,” in Proc. 48th IEEE Conf. Decis. Control, 2009, pp. 7036–7042.
  • [47] Y. Chen, S. Kar, and J. M. Moura, “Optimal attack strategies subject to detection constraints against cyber-physical systems,” IEEE Trans. Control Netw. Syst., vol. 5, no. 3, pp. 1157–1168, Sept. 2018.
  • [48] C.-Z. Bai, V. Gupta, and F. Pasqualetti, “On Kalman filtering with compromised sensors: Attack stealthiness and performance bounds,” IEEE Trans. Autom. Control, vol. 62, no. 12, pp. 6641–6648, Dec. 2017.
  • [49] R. Olfati-Saber, “Distributed Kalman filtering for sensor networks,” in Proc. 46th IEEE Conf. Decis. Control, 2007, pp. 5492–5498.
  • [50] Z. Guo, D. Shi, K. H. Johansson, and L. Shi, “Worst-case stealthy innovation-based linear attack on remote state estimation,” Automatica, vol. 89, pp. 117–124, Mar. 2018.