This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Covariance-Robust Dynamic Watermarking

Matt Olfat, Stephen Sloan, Pedro Hespanhol, Matt Porter, Ram Vasudevan, and Anil Aswani This material is based upon work partially supported by the National Science Foundation under Grant CMMI-1847666, by the UC Berkeley Center for Long-Term Cybersecurity, and by a grant from Ford Motor Company via the Ford-UM Alliance under award N022977.Matt Olfat, Steve Sloan, Pedro Hespanhol, and Anil Aswani are with the Department of Industrial Engineering and Operations Research, University of California, Berkeley 94720 {molfat,stephen_sloan,pedrohespanhol,aaswani}@berkeley.eduMatt Porter and Ram Vasudevan are with the Department of Mechanical Engineering, University of Michigan, Ann Arbor 48109 {matthepo,ramv}@umich.edu
Abstract

Attack detection and mitigation strategies for cyberphysical systems (CPS) are an active area of research, and researchers have developed a variety of attack-detection tools such as dynamic watermarking. However, such methods often make assumptions that are difficult to guarantee, such as exact knowledge of the distribution of measurement noise. Here, we develop a new dynamic watermarking method that we call covariance-robust dynamic watermarking, which is able to handle uncertainties in the covariance of measurement noise. Specifically, we consider two cases. In the first this covariance is fixed but unknown, and in the second this covariance is slowly-varying. For our tests, we only require knowledge of a set within which the covariance lies. Furthermore, we connect this problem to that of algorithmic fairness and the nascent field of fair hypothesis testing, and we show that our tests satisfy some notions of fairness. Finally, we exhibit the efficacy of our tests on empirical examples chosen to reflect values observed in a standard simulation model of autonomous vehicles.

I INTRODUCTION

The development of 5G, the “fifth generation” of wireless technology, brings with it increased bandwidth, massive-scale device-to-device (D2D) connections, lower latency, and high reliability. The latency reductions with 5G open the door to further growth in cyberphysical systems (CPS), which involve the intercommunication and real-time management of large numbers of physical sensors and actuators, often in shifting environments [1]. System vulnerabilities to malicious agents abound in all of these technologies [2, 3, 4], and 5G in particular necessitates more robust cyber-security measures for the relevant control systems [1].

Much of the existing work on security for CPS assumes that the system is fixed and all required distributions are exactly known [5, 6, 7]. However, the system description is often time-varying or partially uncertain for many CPS [8, 9]. Given this real-world motivation of security for time-varying or partially unknown CPS, we focus in this paper on designing robust security schemes to test for adversarial attacks on LTI systems. Recent work has established dynamic watermarking as a key active method for detecting sensor attacks [5, 6, 10, 11, 12, 7, 13, 14], and here we build on this work by designing covariance-robust dynamic watermarking.

We design robust watermarking for two sub-cases: The first is where the covariance of measurement noise is fixed but unknown, and the second is where the covariance of measurement noise is unknown and slowly-varying. The first reflects a scenario of many nearly-identical systems with variation between copies of the system. The second sub-case reflects a scenario where a sensor has different accuracy in varying regimes, such as lidar on an autonomous vehicle in changing weather. Attack detection is critical in all of these such cases, and we need statistical tests that retain their power in the face of system changes or uncertainty.

I-A Fairness

Robust data-driven decision-making has gained attention in the literature on algorithmic fairness. Motivated by machine learning tasks with societal applications, the fairness literature has sought to design learning methods that refrain from considering certain variables. To that extent, this body of work defines rigorous, mathematical notions of fairness for supervised learning [15, 16, 17, 18, 19, 20, 21, 22], which have recently been extended to unsupervised learning by [23, 24].

The work in [25] outlines a general framework: Consider (X,Y,Z)(X,Y,Z) with a joint distribution \mathbb{P}, where XX are exogenous inputs, YY are endogenous “targets”, and ZZ is a “protected attribute”. The goal is to choose a decision rule δ(x)\delta(x) that makes a decision dd using inputs XX, in order to minimize some risk function (δ,Y)\mathcal{R}_{\mathbb{P}}(\delta,Y). In dynamic watermarking: XX are measurements, YY is a binary variable that denotes if the system is under attack, and ZZ is the true system characterization; our decision rule δ\delta for if the system is under attack is made without YY and ZZ, which are not observed. We then define a decision rule to be without disparate impact if

δargminδ{R(δ,Y)|δ(X)Z},\delta^{*}\in\arg\min_{\delta}\big{\{}R_{\mathbb{P}}(\delta,Y)\ \big{|}\ \delta(X)\perp\!\!\!\perp Z\big{\}}, (1)

where δ(X)Z\delta(X)\perp\!\!\!\perp Z means δ(X)\delta(X) is independent of ZZ. This increases fairness because it removes any impact of ZZ on the decision by imposing independence as a constraint. However, some [18, 16] have argued that this above definition of fairness can be too restrictive in some cases and that equalized odds is a better definition of fairness. Its only difference is that in (1) we replace δ(X)Z\delta(X)\perp\!\!\!\perp Z with (δ(X)Z)|Y(\delta(X)\perp\!\!\!\perp Z)|Y. That is, equalized odds ask for independence of δ(X)\delta(X) and ZZ when conditioned on YY. We can interpret equalized odds as requiring error rates to be similar across protected groups. Finally, a notion associated equalized odds is that of equal opportunity, which amounts to enforcing (δ(X)Z)|Y=y(\delta(X)\perp\!\!\!\perp Z)|Y=y, for some value yy. This is relevant when one particular type of error is of more interest than another.

I-B Relevance of Fairness to Watermarking

Fairness is relevant to the design of robust tests for two reasons. First, it provides a well-established technical language with which to discuss our requirement of robustness. Past dynamic watermarking techniques require exact system knowledge, and as such the corresponding watermarking tests will have error rates that are biased over inevitable system perturbations or uncertainties. Fairness notions such as equalized odds and equal opportunity allow for more specific framing of the problem and thus give a framework to design more robust methods for dynamic watermarking.

Second, robust cyber-security methods will have improved social impacts, which is the most general way of interpreting “fairness”. For example, smart homes can have many sensors. Changes in the distribution of sensor noise can correlate with factors such as climate, which correlates with geography and thus attributes like race, ethnicity, or class. A systemic bias in the ability to detect threats thus yields, and possibly perpetuates, systemic bias in outcomes among these groups. Robustness of cyber-security methods thus have the potential to improve societal fairness of the corresponding methods.

I-C Outline

In Sect. II, we outline key terminology and results in dynamic watermarking. In Sect. III-A, we present our covariance-robust dynamic watermarking scheme for the case of fixed, but unknown, measurement noise covariance. This is then extended in Sect. III-B to the case where measurement noise covariance is allowed to slowly vary. Sect. IV presents empirical results that demonstrate efficacy of our approach.

II Preliminaries

We describe the LTI system and attack models, and then review existing results about dynamic watermarking.

II-A LTI System Model

Consider a partially-observed MIMO LTI system

xn+1\displaystyle x_{n+1} =Axn+Bun+wn\displaystyle=Ax_{n}+Bu_{n}+w_{n} (2)
yn\displaystyle y_{n} =Cxn+zn+vn\displaystyle=Cx_{n}+z_{n}+v_{n}

for xn,wnp,unqx_{n},w_{n}\in\mathbb{R}^{p},u_{n}\in\mathbb{R}^{q} and yn,zn,vnmy_{n},z_{n},v_{n}\in\mathbb{R}^{m}. Here wnw_{n} is mean-zero i.i.d. multivariate Gaussian process noise with covariance matrix ΣW\Sigma_{W}, and this is independent of znz_{n} that is i.i.d. Gaussian measurement noise with mean-zero; but we assume that the covariance matrix for znz_{n} is a linear function ΣZ(θ)\Sigma_{Z}(\theta) of a set of parameters θ𝒫d\theta\in\mathcal{P}\subset\mathbb{R}^{d} taking values in polyhedron 𝒫\mathcal{P}. For now, θ\theta is assumed constant but unknown for any fixed system. The vnv_{n} is an additive signal chosen by an attacker who seeks to corrupt sensor measurements.

Stabilizability of (A,B)(A,B) and detectability of (A,C)(A,C) imply the existence of a controller KK and observer LL such that A+BKA+BK and A+LCA+LC are Schur stable. The closed-loop system can be stabilized using the control input un=Kx^nu_{n}=K\hat{x}_{n}, where x^n\hat{x}_{n} is the observer-estimated state. Define x~n=[xnTx^nT]T\tilde{x}_{n}^{\vphantom{\textsf{T}}}=\begin{bmatrix}x_{n}^{\textsf{T}}&\hat{x}_{n}^{\textsf{T}}\end{bmatrix}^{\textsf{T}}, D¯=[I0]T\underline{D}^{\vphantom{\textsf{T}}}=\begin{bmatrix}I^{\vphantom{\textsf{T}}}&0^{\vphantom{\textsf{T}}}\end{bmatrix}^{\textsf{T}}, L¯=[0LT]T\underline{L}^{\vphantom{\textsf{T}}}=\begin{bmatrix}0^{\vphantom{\textsf{T}}}&-L^{\textsf{T}}\end{bmatrix}^{\textsf{T}}, and

A¯=[ABKLCA+BK+LC].\underline{A}=\begin{bmatrix}A&BK\\ -LC&A+BK+LC\end{bmatrix}. (3)

We can write the closed-loop evolution of the state and estimated state when vn0v_{n}\equiv 0 as x~n+1=A¯x~n+D¯wn+L¯zn\tilde{x}_{n+1}=\underline{A}\tilde{x}_{n}+\underline{D}w_{n}+\underline{L}z_{n}. Alternatively, we may define the observation error δn=x^nxn\delta_{n}=\hat{x}_{n}-x_{n}. Let x˘n=[xnTδnT]T\breve{x}_{n}^{\vphantom{\textsf{T}}}=\begin{bmatrix}x_{n}^{\textsf{T}}&\delta_{n}^{\textsf{T}}\end{bmatrix}^{\textsf{T}}, D¯¯=[II]T\underline{\underline{D}}^{\vphantom{\textsf{T}}}=\begin{bmatrix}I^{\vphantom{\textsf{T}}}&-I^{\vphantom{\textsf{T}}}\end{bmatrix}^{\textsf{T}}, L¯¯=L¯\underline{\underline{L}}^{\vphantom{\textsf{T}}}=\underline{L}^{\vphantom{\textsf{T}}}, and

A¯¯=[A+BKBK0A+LC].\underline{\underline{A}}=\begin{bmatrix}A+BK&BK\\ 0&A+LC\end{bmatrix}. (4)

The closed-loop system for this change of variables is x˘n+1=A¯¯x˘n+D¯¯wn+L¯¯zn\breve{x}_{n+1}=\underline{\underline{A}}\breve{x}_{n}+\underline{\underline{D}}w_{n}+\underline{\underline{L}}z_{n}. Note that A¯¯\underline{\underline{A}} is Schur stable since both A+BKA+BK and A+LCA+LC are Schur stable.

II-B Attack Model

Following [26], we consider attacks where vn=α(Cxn+zn)+Cηn+ζnv_{n}=\alpha(Cx_{n}+z_{n})+C\eta_{n}+\zeta_{n} for a fixed α\alpha\in\mathbb{R} and i.i.d. Gaussian ζn\zeta_{n} with mean-zero and covariance matrix ΣS\Sigma_{S}. Here, the ηn\eta_{n} are chosen to follow the process ηn+1=(A+BK)ηn+ωn\eta_{n+1}=(A+BK)\eta_{n}+\omega_{n}, where ωn\omega_{n} are similarly i.i.d. Gaussian with mean-zero and covariance matrix ΣO\Sigma_{O}. The implication is that the attacker minimizes or mutes the true output Cxn+znCx_{n}+z_{n}, and instead replaces it with a simulated output that follows the system dynamics and is thus not easily distinguishable as false. Furthermore, the attacker has access to process wnw_{n} and measurement noise znz_{n}. With this attack, the closed-loop systems above become x~n+1=A¯x~n+D¯wn+L¯(zn+vn)\tilde{x}_{n+1}=\underline{A}\tilde{x}_{n}+\underline{D}w_{n}+\underline{L}(z_{n}+v_{n}) and x˘n+1=A¯¯x˘n+D¯¯wn+L¯¯(zn+vn)\breve{x}_{n+1}=\underline{\underline{A}}\breve{x}_{n}+\underline{\underline{D}}w_{n}+\underline{\underline{L}}(z_{n}+v_{n}).

II-C (Nonrobust) Dynamic Watermarking

The steady-state distribution of δn\delta_{n} in an unattacked system will be Gaussian with mean-zero and a covariance matrix of

ΣΔ=(A+LC)ΣΔ(A+LC)T+ΣW+LΣZ(θ)LT.\Sigma_{\Delta}=(A+LC)^{\vphantom{\textsf{T}}}\Sigma_{\Delta}^{\vphantom{\textsf{T}}}(A+LC)^{\textsf{T}}+\Sigma_{W}+L^{\vphantom{\textsf{T}}}\Sigma_{Z}(\theta)^{\vphantom{\textsf{T}}}L^{\textsf{T}}. (5)

Dynamic watermarking adds a small amount of Gaussian noise ene_{n}, the values unknown to the attacker, into the control input un=Kx^n+enu_{n}=K\hat{x}_{n}+e_{n}. This private excitation has mean-zero and covariance matrix ΣE\Sigma_{E}. Defining B¯=[BTBT]T\underline{B}^{\vphantom{\textsf{T}}}=\begin{bmatrix}B^{\textsf{T}}&B^{\textsf{T}}\end{bmatrix}^{\textsf{T}} and B¯¯=[BT0]T\underline{\underline{B}}^{\vphantom{\textsf{T}}}=\begin{bmatrix}B^{\textsf{T}}&0^{\vphantom{\textsf{T}}}\end{bmatrix}^{\textsf{T}}, the closed-loop systems with watermarking are given by x~n+1=A¯x~n+B¯en+D¯wn+L¯(zn+vn)\tilde{x}_{n+1}=\underline{A}\tilde{x}_{n}+\underline{B}e_{n}+\underline{D}w_{n}+\underline{L}(z_{n}+v_{n}) and x˘n+1=A¯¯x˘t+B¯¯en+D¯¯wn+L¯¯(zn+vn)\breve{x}_{n+1}=\underline{\underline{A}}\breve{x}_{t}+\underline{\underline{B}}e_{n}+\underline{\underline{D}}w_{n}+\underline{\underline{L}}(z_{n}+v_{n}), respectively.

The watermarking noise ene_{n} leaves a detectable signal in the measurements yny_{n}, which can detect the presence of an attack vnv_{n} by comparing the observer error Cx^nynC\hat{x}_{n}-y_{n} to previous values of the watermark enke_{n-k} for some integer k>0k>0. Specifically, the work in [26] proposes the tests

aslimN1Nn=0N1(Cx^nyn)(Cx^nyn)T=\displaystyle\textstyle\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}(C\hat{x}_{n}-y_{n})^{\vphantom{\textsf{T}}}(C\hat{x}_{n}-y_{n})^{\textsf{T}}=
CΣΔCT+ΣZ\displaystyle\hskip 156.49014ptC^{\vphantom{\textsf{T}}}\Sigma_{\Delta}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z}^{\vphantom{\textsf{T}}} (6)
aslimN1Nn=0N1(Cx^nyn)enk1T=0,\displaystyle\textstyle\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}(C\hat{x}_{n}-y_{n})^{\vphantom{\textsf{T}}}e_{n-k^{\prime}-1}^{\textsf{T}}=0, (7)

where k=mink1{C(A+BK)kBT0}k^{\prime}=\min_{k\geq 1}\{C^{\vphantom{\textsf{T}}}(A+BK)^{k}B^{\textsf{T}}\neq 0\}. Any modeled attack passing these tests can be shown to asymptotically have zero power aslimN1Nn=0N1vnTvn=0\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}v_{n}^{\textsf{T}}v_{n}^{\vphantom{\textsf{T}}}=0 [26].

Finally, [26] also provides a test statistic for implementing the above test. Define ψn=[(Cx^nyn)Tenk1T]T\psi_{n}=\begin{bmatrix}(C\hat{x}_{n}-y_{n})^{\textsf{T}}&e_{n-k^{\prime}-1}^{\textsf{T}}\end{bmatrix}^{\textsf{T}} and Sn=i=n+1n+ψnψnTS_{n}=\sum_{i=n+1}^{n+\ell}\psi_{n}^{\vphantom{\textsf{T}}}\psi_{n}^{\textsf{T}}. Then the negative log-likelihood of a Wishart distribution is

=\displaystyle\mathcal{L}= (m+q+1)logdetSn\displaystyle(m+q+1-\ell)\log\det{S_{n}} (DW)
+trace{[(CΣΔCT+ΣZ)100ΣE1]×Sn}.\displaystyle+\textrm{trace}\Bigg{\{}\begin{bmatrix}\big{(}C^{\vphantom{\textsf{T}}}\Sigma_{\Delta}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z}\big{)}^{-1}&0\\ 0&\Sigma_{E}^{-1}\end{bmatrix}\times S_{n}\Bigg{\}}.

This can be used to perform a statistical hypothesis test to detect attacks when using dynamic watermarking.

III Covariance-Robust Dynamic Watermarking

We develop covariance-robust dynamic watermarking methods for two different cases. The first is where θ\theta is fixed but unknown, and the second is where θ\theta is slowly varying.

III-A Fixed But Unknown Noise Covariance

We begin by stating our assumptions for this case. First, we assume that we have knowledge of a set of positive semidefinite matrices Σz,1,,Σz,d\Sigma_{z,1},\dots,\Sigma_{z,d} such that these matrices are affinely independent and ΣZ(θ)int(ΩZ)\Sigma_{Z}(\theta)\in\mathrm{int}(\Omega^{Z}) for the set

ΩZ={θ1Σz,1++θdΣz,d:𝟏Tθ=1,θ𝟎}.\Omega^{Z}=\{\theta_{1}\Sigma_{z,1}+\cdots+\theta_{d}{\Sigma}_{z,d}:\mathbf{1}^{T}\theta=1,\theta\geq\mathbf{0}\}. (8)

Note that ΩZ\Omega^{Z} is a polyhedron, and that this set is defined to be the convex combination of Σz,1,,Σz,d{\Sigma}_{z,1},\dots,{\Sigma}_{z,d}. Our first result characterizes ΩΔ\Omega^{\Delta}, which is the set of possible ΣΔ(θ)\Sigma_{\Delta}(\theta).

Lemma 1

Let Σ¯δ,k\bar{\Sigma}_{\delta,k} satisfy Σ¯δ,k=(A+LC)Σ¯δ,k(A+LC)T+ΣW+LΣz,kLT\bar{\Sigma}_{\delta,k}=(A+LC)\bar{\Sigma}_{\delta,k}(A+LC)^{T}+\Sigma_{W}+L{\Sigma}_{z,k}L^{T}. For ΣZ(θ)=θ1Σz,1++θdΣz,d\Sigma_{Z}(\theta)=\theta_{1}{\Sigma}_{z,1}+\cdots+\theta_{d}{\Sigma}_{z,d}, the solution to (5) is ΣΔ(θ)=θ1Σ¯δ,1++θdΣ¯δ,d\Sigma_{\Delta}(\theta)=\theta_{1}\bar{\Sigma}_{\delta,1}+\cdots+\theta_{d}\bar{\Sigma}_{\delta,d}.

Proof:

This immediately follows by noting that both sides of (5) are linear in the matrices ΣΔ\Sigma_{\Delta} and ΣZ(θ)\Sigma_{Z}(\theta). ∎

Since E[ψnψnT]=blkdiag{CΣΔCT+ΣZ,ΣE}E[\psi_{n}^{\vphantom{\textsf{T}}}\psi_{n}^{\textsf{T}}]=\mathrm{blkdiag}\{C^{\vphantom{\textsf{T}}}\Sigma_{\Delta}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z},\Sigma_{E}\}, we need to characterize the set Ω\Omega of feasible matrices in terms of θ\theta.

Lemma 2

Let Σ¯k=blkdiag{CΣ¯δ,kCT+Σz,k,ΣE}\bar{\Sigma}_{k}=\mathrm{blkdiag}\{C^{\vphantom{\textsf{T}}}\bar{\Sigma}_{\delta,k}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{z,k},\Sigma_{E}\}. Then Ω={θ1Σ¯k++θdΣ¯d:𝟏Tθ=1,θ𝟎}\Omega=\{\theta_{1}\bar{\Sigma}_{k}+\cdots+\theta_{d}\bar{\Sigma}_{d}:\mathbf{1}^{T}\theta=1,\theta\geq\mathbf{0}\}.

Proof:

This follows by the linearity in ΣΔ\Sigma_{\Delta} and ΣZ\Sigma_{Z}. ∎

The set Ω\Omega represents covariance matrices of ψn\psi_{n} that are “acceptable”, according to the original set ΩZ\Omega^{Z} of observation noise covariances that we should not mistake for attacks.

Lemma 3

The set Ω\Omega is of dimension d1d-1.

Proof:

This follows from Lemma 1, the fact that LL is of full column-rank, and the observability of (A+LC,C)(A+LC,C), which in turn follows from the observability of (A,C)(A,C). ∎

Finally, consider a modification of (DW) given by

(Sn,V)=(m+q+1)logdetSn+trace{VSn}logdetV.\mathcal{L}(S_{n},V)=(m+q+1-\ell)\log\det{S_{n}}+\\ \textrm{trace}\big{\{}VS_{n}\big{\}}-\ell\log\det{V}. (9)

Note (9) is the negative log-likelihood of an (m+q)×(m+q)(m+q)\times(m+q) Wishart distribution with scale matrix V1V^{-1} and \ell degrees of freedom. Now, we may present our test statistic. Let Ω1={V:V1Ω}\Omega^{-1}=\{V:V^{-1}\in\Omega\} and define the test statistic

T(Sn)=minVΩ1(Sn,V)T(S_{n})=\min_{V\in\Omega^{-1}}\mathcal{L}(S_{n},V) (10)

for the composite null hypothesis H0:E[ψnψnT]int(Ω)H_{0}:E[\psi_{n}^{\vphantom{\textsf{T}}}\psi_{n}^{\textsf{T}}]\in\textrm{int}(\Omega). For some 0ν0\leq\nu, consider the test

{reject H0if T(Sn)>νaccept H0if T(Sn)ν.\begin{cases}\textrm{reject }H_{0}&\textrm{if }T(S_{n})>\nu\\ \textrm{accept }H_{0}&\textrm{if }T(S_{n})\leq\nu.\end{cases} (11)

Since argminVΩ1(Sn,V)=Sn1\arg\min_{V\in\Omega^{-1}}\mathcal{L}(S_{n},V)=S_{n}^{-1}, this proposed test is equivalent to the generalized likelihood ratio test.

Theorem 1

For large enough \ell, the decision rule (11) using test statistic T(Sn)T(S_{n}) satisfies equal opportunity with respect to the null hypothesis and where the protected attribute is the true measurement noise covariance ΣZ(θ)int(ΩZ)\Sigma_{Z}(\theta)\in\mathrm{int}(\Omega^{Z}).

Proof:

Due to Lemma 3 and our assumption that ΣZ(θ)int(ΩZ)\Sigma_{Z}(\theta)\in\mathrm{int}(\Omega^{Z}), T(Sn)T(S_{n}) satisfies the Le Cam regularity conditions required for the application of Wilk’s Theorem [27]. This means 2T(Sn)-2T(S_{n}) will be asymptotically distributed as a χ2(m+qp)\chi^{2}(m+q-p) random variable plus a fixed constant regardless of the true value of ΣΔ\Sigma_{\Delta}, and thus implies that the event of a Type I error is independent of ΣΔ\Sigma_{\Delta}. ∎

This is a useful result because it implies that, in the proper regime, our test can come arbitrarily close to satisfying the initial goal of remaining robust to some uncertainty in the distribution of the measurement noise. However, Ω1\Omega^{-1} is a non-convex set, and so the computation of T(Sn)T(S_{n}) is difficult. To this end, we propose the approximate test statistic

T¯(Sn)=min(Sn,V)s.t.k=1pθkΣ¯k1V,[VIIk=1pθkΣ¯k]0,𝟏Tθ=1,θ𝟎.\begin{array}[]{rrcl}\bar{T}(S_{n})=\min&\lx@intercol\mathcal{L}(S_{n},V)\hfil\lx@intercol\\ \mathrm{s.t.}&\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}^{-1}_{k}&\succeq&V,\\ &\begin{bmatrix}V&I\\ I&\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k}\end{bmatrix}&\succeq&0,\\ &\mathbf{1}^{\textsf{T}}\theta^{\vphantom{\textsf{T}}}&=&1,\\ &\theta&\geq&\mathbf{0}.\end{array} (CRDW)
Lemma 4

For any VΩ1V\in\Omega^{-1}, there exists a θp\theta\in\mathbb{R}^{p} such that (V,θ)(V,\theta) is a feasible solution to the optimization problem defining test (CRDW).

Proof:

First observe that any VΩ1V\in\Omega^{-1} can be written as V=(k=1pθkΣ¯k)1V=(\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k})^{-1} for some nonzero θ\theta such that 𝟏Tθ=1\mathbf{1}^{\textsf{T}}\theta^{\vphantom{\textsf{T}}}=1. Thus, it holds trivially that

(k=1pθkΣ¯k)1V(k=1pθkΣ¯k)1\textstyle(\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k})^{-1}\succeq V\succeq(\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k})^{-1} (12)

The right-most constraint in (12) can be restated using the Schur complement, and this reformulation is exact. Since k=1pθkΣ¯k0\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k}\succeq 0, the Schur complement implies the second constraint in (CRDW) is equivalent to V(k=1pθkΣ¯k)10V-(\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k})^{-1}\succeq 0.

The first constraint in (CRDW) follows from the convexity of the matrix inverse for positive semidefinite matrices: Letting X(τ)=(1τ)X1+τX2X(\tau)=(1-\tau)X_{1}+\tau X_{2} for positive definite n×nn\times n matrices X1,X2X_{1},X_{2} and 0τ10\leq\tau\leq 1, we have 2τ2X(τ)1=2X1(τ)X(τ)X1(τ)X(τ)X1(τ)\frac{\nabla^{2}}{\nabla\tau^{2}}X(\tau)^{-1}=2X^{-1}(\tau)X^{\prime}(\tau)X^{-1}(\tau)X^{\prime}(\tau)X^{-1}(\tau). For any ana\in\mathbb{R}^{n}, the function ϕa(τ)=aTX1(τ)a\phi_{a}(\tau)=a^{\textsf{T}}X^{-1}(\tau)a^{\vphantom{\textsf{T}}} will have second derivative ϕa′′(τ)=2aTX1(τ)X(τ)X1(τ)X(τ)X1(τ)a0\phi_{a}^{\prime\prime}(\tau)=2a^{\textsf{T}}X^{-1}(\tau)X^{\prime}(\tau)X^{-1}(\tau)X^{\prime}(\tau)X^{-1}(\tau)a^{\vphantom{\textsf{T}}}\geq 0 due to the positive-semidefiniteness of X(τ)1X(\tau)^{-1}, so (1τ)ϕa(0)+τϕa(1)ϕa(τ)(1-\tau)\phi_{a}(0)+\tau\phi_{a}(1)\geq\phi_{a}(\tau). Since this holds for any aa, we have that

k=1pθkΣ¯k1(k=1pθkΣ¯k)1.\textstyle\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k}^{-1}\succeq(\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k})^{-1}. (13)

The first constraint in (CRDW) follows from (12) and (13). ∎

Remark 1

It was shown in [26] that test (7) ensures α=0\alpha=0 in any attack such that it holds true. In that case, we have

aslimN1Nn=0N1(Cx^nyn)(Cx^nyn)T=CΣΔ(θ)CT+ΣZ(θ)+ΣS+aslimN1Nn=0N1CηnηnTCT,\textstyle\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}(C\hat{x}_{n}-y_{n})^{\vphantom{\textsf{T}}}(C\hat{x}_{n}-y_{n})^{\textsf{T}}\\ \textstyle=C^{\vphantom{\textsf{T}}}\Sigma^{\vphantom{\textsf{T}}}_{\Delta}(\theta)C^{\textsf{T}}+\Sigma_{Z}(\theta)+\\ \textstyle\Sigma_{S}+\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}C^{\vphantom{\textsf{T}}}\eta_{n}^{\vphantom{\textsf{T}}}\eta_{n}^{\textsf{T}}C^{\textsf{T}}, (14)

since the Schur stability of A+BKA+BK implies that any effect of x0x_{0} and η0\eta_{0} are reduced to zero asymptotically. Since ΣS\Sigma_{S} and aslimN1Nn=0N1CηnηnTCT\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}C^{\vphantom{\textsf{T}}}\eta_{n}^{\vphantom{\textsf{T}}}\eta_{n}^{\textsf{T}}C^{\textsf{T}} are both positive semidefinite, meaning that

aslim1Nn=0N1(Cx^nyn)(Cx^nyn)TCΣΔ(θ)CT+ΣZ(θ).\textstyle\operatorname{as-lim}\frac{1}{N}\sum_{n=0}^{N-1}(C\hat{x}_{n}-y_{n})^{\vphantom{\textsf{T}}}(C\hat{x}_{n}-y_{n})^{\textsf{T}}\succeq\\ \textstyle C^{\vphantom{\textsf{T}}}\Sigma^{\vphantom{\textsf{T}}}_{\Delta}(\theta)C^{\textsf{T}}+\Sigma_{Z}(\theta). (15)

Inverting both sides of this implies that, in the case that ΣS+aslimN1Nn=0N1CηnηnTCT0\Sigma_{S}+\operatorname{as-lim}_{N\rightarrow\infty}\frac{1}{N}\sum_{n=0}^{N-1}C^{\vphantom{\textsf{T}}}\eta_{n}^{\vphantom{\textsf{T}}}\eta_{n}^{\textsf{T}}C^{\textsf{T}}\neq 0, we can generally expect that Sn1(CΣΔ(θ)CT+ΣZ(θ))1Ω1S_{n}^{-1}\preceq(C^{\vphantom{\textsf{T}}}\Sigma^{\vphantom{\textsf{T}}}_{\Delta}(\theta)C^{\textsf{T}}+\Sigma_{Z}(\theta))^{-1}\in\Omega^{-1}. The takeaway is that the looseness of the upper bound (13) should not greatly decrease the power of the modified test in the presence of test (7), as the tight lower bound is more germane to situations where the system is actually being attacked.

Remark 2

If the dimension m+qm+q is large, then the optimization (CRDW) may be expensive to solve from scratch each time. Furthermore, SnS_{n} will likely not change drastically between runs when \ell is large. So, lighter-weight first-order methods such as ADMM can be used instead [28]. These generally take longer to converge to high levels of accuracy, but have the advantage of being able to be readily warm-started.

III-B Slowly Varying Unknown Noise Covariance

A key difference between this setting and that of the static distribution is that a shift in the observer noise covariance in one period can have impacts on ΣΔ\Sigma_{\Delta} over the next few periods that do not easily fit into our previous representation of the Ω\Omega. This is because it will take many steps before the covariance of δn\delta_{n} approaches its asymptotic limit in Ω\Omega. Thus, to accommodate a dynamically changing distribution of znz_{n}, we must use an expansion of the set Ω\Omega.

We modify our setup for this subsection. The true covariance of δn\delta_{n} and znz_{n} are ΣΔn\Sigma_{\Delta_{n}} and ΣZn\Sigma_{Z_{n}}, respectively. Let Ψn=ΣZnΣZn1\Psi_{n}=\Sigma_{Z_{n}}-\Sigma_{Z_{n-1}} and Φnj=(A+LC)jLΨnLT(A+LC)jT\Phi_{n}^{j}=(A+LC)^{j}L\Psi_{n}L^{\textsf{T}}{(A+LC)^{j}}^{\textsf{T}}. Note that all ΣZn\Sigma_{Z_{n}} are still assumed to be in ΩZ\Omega^{Z}. Finally, we make some additional assumptions. Since the spectral radius of A+LCA+LC is less than one, there exists some induced norm (denote this \|\cdot\|) such that A+LC<1\|A+LC\|<1 [29]. We assume θ\theta changes every step but ΣZ0Ω\Sigma_{Z_{0}}\in\Omega and all Ψn\Psi_{n} satisfy Ψnξ\|\Psi_{n}\|\leq\xi for some known value of ξ>0\xi>0. We also assume the system starts at steady state in the sense ΣΔ0=(A+LC)ΣΔ0(A+LC)T+ΣW+LΣZ0LT\Sigma_{\Delta_{0}}=(A+LC)\Sigma_{\Delta_{0}}(A+LC)^{\textsf{T}}+\Sigma_{W}+L\Sigma_{Z_{0}}L^{\textsf{T}}. Under these assumptions we have:

Lemma 5

Let ε\varepsilon\in\mathbb{R} be defined as

ε=ξC2L2A+LC2m(1A+LC2)2\varepsilon=\frac{\xi\|C\|^{2}\|L\|^{2}\|A+LC\|^{2}\sqrt{m}}{\left(1-\|A+LC\|^{2}\right)^{2}} (16)

Then CΣΔnCT+ΣZnΩ{E:εIEεI}C^{\vphantom{\textsf{T}}}\Sigma_{\Delta_{n}}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z_{n}}^{\vphantom{\textsf{T}}}\in\Omega\oplus\{E:-\varepsilon I\preceq E\preceq\varepsilon I\}, where \oplus is the Minkowski sum for all nn.

Proof:

Let Ωm×m\Omega_{m\times m} be the set of m×mm\times m upper-left submatrices of elements of Ω\Omega, associated with CΣΔ(θ)CT+ΣZ(θ)C^{\vphantom{\textsf{T}}}\Sigma_{\Delta}(\theta)^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z}(\theta) terms. We start by noting that

ΣΔ1=\displaystyle\Sigma_{\Delta_{1}}= (A+LC)ΣΔ0(A+LC)T+ΣW+LΣZ1LT\displaystyle(A+LC)\Sigma_{\Delta_{0}}(A+LC)^{\textsf{T}}+\Sigma_{W}+L\Sigma_{Z_{1}}L^{\textsf{T}} (17)
=\displaystyle= ΣΔ0+Φ10.\displaystyle\Sigma_{\Delta_{0}}+\Phi_{1}^{0}.

Similarly, we can see that ΣΔ2=ΣΔ0+L(Ψ0+Ψ1)LT+Φ01=ΣΔ0+Φ20+Φ10+Φ11\Sigma_{\Delta_{2}}=\Sigma_{\Delta_{0}}+L\left(\Psi_{0}+\Psi_{1}\right)L^{\textsf{T}}+\Phi_{0}^{1}=\Sigma_{\Delta_{0}}+\Phi_{2}^{0}+\Phi_{1}^{0}+\Phi_{1}^{1}. Continuing this recursion relation leads to the fact that

ΣΔn=ΣΔ0+i=0n1j=0iΦnij.\textstyle\Sigma_{\Delta_{n}}=\Sigma_{\Delta_{0}}+\sum_{i=0}^{n-1}\sum_{j=0}^{i}\Phi_{n-i}^{j}. (18)

Due to the Schur stability of A+LCA+LC, the following limit exists, and can be represented as in Lemma 1.

ΣΔk=limk(ΣΔ0+i=kkk1j=0iΦkij)\textstyle\Sigma_{\Delta_{\infty}^{k^{\prime}}}=\lim_{k\rightarrow\infty}\big{(}\Sigma_{\Delta_{0}}+\sum_{i=k-k^{\prime}}^{k-1}\sum_{j=0}^{i}\Phi_{k-i}^{j}\big{)} (19)

Note that ΣΔk\Sigma_{\Delta_{\infty}^{k^{\prime}}} is the steady state that ΣΔn\Sigma_{\Delta_{n}} would ultimately reach if θ\theta (and therefore ΣZn\Sigma_{Z_{n}} does not shift after step kk^{\prime}; thus, it solves (5) for ΣZk\Sigma_{Z_{k^{\prime}}} and exists in Ωm×m\Omega_{m\times m}. Denote Υi=ΣΔiΣΔi1\Upsilon_{i}=\Sigma_{\Delta_{\infty}^{i}}-\Sigma_{\Delta_{\infty}^{i-1}}. Then,

ΣΔn\displaystyle\textstyle\Sigma_{\Delta_{n}} =limk(ΣΔ0+i=knk1j=0ik+nΦkij)\displaystyle=\textstyle\lim_{k\rightarrow\infty}\big{(}\Sigma_{\Delta_{0}}+\sum_{i=k-n}^{k-1}\sum_{j=0}^{i-k+n}\Phi_{k-i}^{j}\big{)} (20)
=ΣΔnlimk(i=knk1(j=ik+n+1iΦkij))\displaystyle\textstyle=\Sigma_{\Delta_{\infty}^{n}}-\lim_{k\rightarrow\infty}\big{(}\sum_{i=k-n}^{k-1}\big{(}\sum_{j=i-k+n+1}^{i}\Phi_{k-i}^{j}\big{)}\big{)}
=ΣΔni=1n(A+LC)ni+1Υi(A+LC)ni+1T\displaystyle\textstyle=\Sigma_{\Delta_{\infty}^{n}}-\sum_{i=1}^{n}(A+LC)^{n-i+1}\Upsilon_{i}{(A+LC)^{n-i+1}}^{\textsf{T}}

Note that the term in the limit in the first equality is a constant in kk due to a simple re-indexing of (18). This is convenient because we can now break ΣΔn\Sigma_{\Delta_{n}} into an element known to be in Ωm×m\Omega_{m\times m} and an error term. Our goal is now to choose ε\varepsilon large enough to bound

minPΩm×mCΣΔnCT+ΣZnP2,\min_{P\in\Omega_{m\times m}}\big{\|}C^{\vphantom{\textsf{T}}}\Sigma_{\Delta_{n}}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z_{n}}-P\big{\|}_{2}, (21)

over all paths that ΣZn\Sigma_{Z_{n}} can take. An easy bound on the minimization is to simply set P=CΣΔnCT+ΣZnP=C^{\vphantom{\textsf{T}}}\Sigma_{\Delta_{\infty}^{n}}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z_{n}}. Then, ε\varepsilon only needs to exceed

i=1nC(A+LC)ni+1Υi(A+LC)ni+1TCT2\textstyle\big{\|}\sum_{i=1}^{n}C(A+LC)^{n-i+1}\Upsilon_{i}{(A+LC)^{n-i+1}}^{\textsf{T}}C^{\textsf{T}}\big{\|}_{2} (22)

By sub-multiplicativity of induced norms,

Υi\displaystyle\|\Upsilon_{i}\| =j=0Φijj=0(A+LC)2jLΨn+ki\displaystyle=\textstyle\big{\|}\sum_{j=0}^{\infty}\Phi_{i}^{j}\big{\|}\leq\sum_{j=0}^{\infty}\|(A+LC)\|^{2j}\|L\|\|\Psi_{n^{\prime}+k_{i}}\| (23)
=ξL2(1A+LC2)1\displaystyle=\xi\|L\|^{2}\big{(}1-\|A+LC\|^{2}\big{)}^{-1}

Finally, using the fact that 2m\|\cdot\|_{2}\leq\sqrt{m}\|\cdot\| [30] and applying (23) to the error term from (21) yields the desired result. ∎

Remark 3

Due to the topological equivalence of induced norms, the dependence of our choice of norm \|\cdot\| on A+LCA+LC can only affect the value of ξ\xi required by a constant m\sqrt{m}.

Corollary 1

If A+LC2<1\|A+LC\|_{2}<1, then the statement in Lemma 5 holds for =2\|\cdot\|=\|\cdot\|_{2} and

ε=ξC22L22A+LC22(1A+LC22)2\varepsilon=\frac{\xi\|C\|_{2}^{2}\|L\|_{2}^{2}\|A+LC\|_{2}^{2}}{\left(1-\|A+LC\|_{2}^{2}\right)^{2}} (24)
Proof:

The proof of this result is almost identical to the proof of the previous lemma with the only changes that =2\|\cdot\|=\|\cdot\|_{2} and that we stop after applying (23) to (21). ∎

0\displaystyle 010\displaystyle 1020\displaystyle 2030\displaystyle 3040\displaystyle 4050\displaystyle 50Time (s)1000\displaystyle 10001500\displaystyle 1500L(Sn)\displaystyle L(S_{n})unattackedattackedτ(0.05)\displaystyle\tau(0.05)

(a) Test statistic (DW)

0\displaystyle 010\displaystyle 1020\displaystyle 2030\displaystyle 3040\displaystyle 4050\displaystyle 50Time (s)0\displaystyle 0500\displaystyle 500T¯(Sn)\displaystyle\bar{T}(S_{n})unattackedattackedτ(0.05)\displaystyle\tau(0.05)

(b) Test statistic (CRDW)
Figure 1: The evolution and histogram of test statistics (DW) and (CRDW) on the attacked and unattacked systems where ΣZ\Sigma_{Z} is fixed, but unknown to the tester. In this case, the nonrobust test statistic (DW) is unable to clearly distinguish the attacked from the unattacked system, whereas the new test statistic (CRDW) can.

With this ε\varepsilon, it is straightforward to extend the previous test statistic (CRDW) to this new expansion of Ω\Omega as long as Σ¯kεI\bar{\Sigma}_{k}-\varepsilon I remains positive definite for all kk. In this case, we may define our new test statistic as

T¯(Sn)=min(Sn,V)s.t.k=1pθk(Σ¯kεI)1V,[VIIεI+k=1pθkΣ¯k]0,𝟏Tθ=1,θ𝟎.\begin{array}[]{rrcl}\underline{T}(S_{n})=\min&\lx@intercol\mathcal{L}(S_{n},V)\hfil\lx@intercol\\ \mathrm{s.t.}&\sum_{k=1}^{p}\theta_{k}\left(\bar{\Sigma}_{k}-\varepsilon I\right)^{-1}&\succeq&V,\\ &\begin{bmatrix}V&I\\ I&\varepsilon I+\sum_{k=1}^{p}\theta_{k}\bar{\Sigma}_{k}\end{bmatrix}&\succeq&0,\\ &\mathbf{1}^{\textsf{T}}\theta^{\vphantom{\textsf{T}}}&=&1,\\ &\theta&\geq&\mathbf{0}.\end{array} (CRDW*)
Remark 4

If there is some kk so Σ¯kεI\bar{\Sigma}_{k}-\varepsilon I is not positive definite, then the first constraint above is not well-defined. Recalling that VV is a surrogate for (CΣΔnCT+ΣZn)1\left(C^{\vphantom{\textsf{T}}}\Sigma_{\Delta_{n}}^{\vphantom{\textsf{T}}}C^{\textsf{T}}+\Sigma_{Z_{n}}\right)^{-1}, we note VV trivially satisfies ΣZn1V\Sigma_{Z_{n}}^{-1}\succeq V. Thus in this problematic case, we may replace the (Σ¯kεI)\left(\bar{\Sigma}_{k}-\varepsilon I\right) in the first constraint with Σz,k\Sigma_{z,k}, for all kk. This issue is unlikely to be of practical concern for the same reasons discussed in Remark 1 regarding the relaxation of the set Ω\Omega. Specifically, the structure of the attacks makes it unlikely that the first constraint in (CRDW*) would be binding in any case.

0\displaystyle 010\displaystyle 1020\displaystyle 2030\displaystyle 3040\displaystyle 4050\displaystyle 50Time (s)1000\displaystyle 10001500\displaystyle 1500L(Sn)\displaystyle L(S_{n})unattackedattackedτ(0.05)\displaystyle\tau(0.05)

(a) Test statistic (DW)

0\displaystyle 010\displaystyle 1020\displaystyle 2030\displaystyle 3040\displaystyle 4050\displaystyle 50Time (s)200\displaystyle-2000\displaystyle 0200\displaystyle 200T¯(Sn)\displaystyle\bar{T}(S_{n})unattackedattackedτ(0.05)\displaystyle\tau(0.05)

(b) Test statistic (CRDW*)
Figure 2: The evolution and histogram of test statistics (DW) and (CRDW*) on the attacked and unattacked systems where ΣZ\Sigma_{Z} varies as described, again unknown to the tester. Note the robust statistic (CRDW*) takes distinctly higher for the attacked values over almost the entire 1000 iterations than in the unattacked system, while the nonrobust statistic (DW) is again unable to clearly distinguish the two.

IV Empirical Results

In this section, we present simulation results that showcase the strength of our method when compared with the original test statistic (DW). We present results for both the case where the noise distribution is fixed but unknown, and for the case where the noise covariance is unknown and slowly-varying.

We use the standard model for simulation of an autonomous vehicle in [31], where the error kinematics of lane keeping and speed control is given by xT=[ψysγv]x^{\textsf{T}}=\begin{bmatrix}\psi&y&s&\gamma&v\end{bmatrix} and uT=[ra]u^{\textsf{T}}=\begin{bmatrix}r&a\end{bmatrix}. Here, ψ\psi is heading error, yy is lateral error, ss is trajectory distance, γ\gamma is vehicle angle, vv is vehicle velocity, rr is steering, and aa is acceleration. We linearize and initialize with a straight trajectory and constant velocity v0=10v_{0}=10. We then performed exact discretization with sampling period ts=0.05t_{s}=0.05. This yields the system dynamics

A=[1001100121014000010120001000001],B=[140001240000180012000120]A=\begin{bmatrix}1&0&0&\frac{1}{10}&0\\ \frac{1}{2}&1&0&\frac{1}{40}&0\\ 0&0&1&0&\frac{1}{2}\\ 0&0&0&1&0\\ 0&0&0&0&1\end{bmatrix},\;\;B=\begin{bmatrix}\frac{1}{400}&0\\ \frac{1}{2400}&0\\ 0&\frac{1}{800}\\ \frac{1}{20}&0\\ 0&\frac{1}{20}\end{bmatrix} (25)

with C=[I0]3×5C=\begin{bmatrix}I&0\end{bmatrix}\in\mathbb{R}^{3\times 5}. We use process noise covariance ΣW=108×I\Sigma_{W}=10^{-8}\times I.

All tests use dynamic watermarking with variance ΣE=12I\Sigma_{E}=\frac{1}{2}I, and KK and LL were chosen to stabilize the system without an attack. We conduct four simulations: attacked and non-attacked systems where the measurement noise covariance is fixed, and attacked and non-attacked systems where the measurement noise covariance is allowed to vary. We ran all four simulations for 1000 iterations, or 50 seconds. In all cases, we compare the test metrics using the hypothesis test described in (11), where the measurement noise covariance is assumed to be 105×I10^{-5}\times I. When simulating the attacked system, we choose an attacker with α=1\alpha=-1, η0=0\eta_{0}=0, ΣO=108×I\Sigma_{O}=10^{-8}\times I, and ΣS=108×I\Sigma_{S}=10^{-8}\times I.

IV-A Fixed Covariance

We first show our test outperforms in the case where the true measurement noise covariance matrix is fixed but unknown to the tester. In our simulations, the true noise covariance is ΣZ=105×diag{0.18,30,0.18}\Sigma_{Z}=10^{-5}\times\textrm{diag}\{0.18,30,0.18\}. In all tests, ΩZ\Omega^{Z} is described by the p=4p=4 extreme points: ΣZ,1=106×diag{300,1.8,1.8}\Sigma_{Z,1}=10^{-6}\times\textrm{diag}\{300,1.8,1.8\}, ΣZ,2=106×diag{1.8,300,1.8}\Sigma_{Z,2}=10^{-6}\times\textrm{diag}\{1.8,300,1.8\}, ΣZ,3=106×diag{9,9,12}\Sigma_{Z,3}=10^{-6}\times\textrm{diag}\{9,9,12\}, ΣZ,4=106×diag{9,9,9}\Sigma_{Z,4}=10^{-6}\times\textrm{diag}\{9,9,9\}. Both the true measurement noise covariance and that incorrectly assumed in test statistic (DW) are in the resulting set. The simulation is run for 1000 steps.

Figure 1 shows the efficacy of our method under this new uncertainty. If test detection is consistent, the negative log likelihood values should be lower under regular conditions, and higher when the model is attacked. In particular, the nonrobust test statistic (DW) is shown in Fig. 1(a) to be wholly unable to distinguish an attacked system from an unattacked system when its assumption on the measurement noise covariation is violated, while Fig. 1(b) shows the robust test statistic (CRDW) to be able to do so.

IV-B Varying Covariance

Unattacked and attacked simulations were also conducted with a measurement noise distribution that was allowed to vary. We set τ=1\tau=1 and ξ=0.00002\xi=0.00002, implying ε=7.205×106\varepsilon=7.205\times 10^{-6}. The true measurement noise is initialized at ΣZ0=105×diag{0.9,0.9,1.2}\Sigma_{Z_{0}}=10^{-5}\times\textrm{diag}\{0.9,0.9,1.2\}. This shifts linearly over the course of 250 iterations to a new value of ΣZ250=105×diag{15,15,0.18}\Sigma_{Z_{250}}=10^{-5}\times\textrm{diag}\{15,15,0.18\}, at which point it changes direction to shift linearly over 250 iterations to a value of ΣZ500=105×diag{30,0.18,0.18}\Sigma_{Z_{500}}=10^{-5}\times\textrm{diag}\{30,0.18,0.18\}. The measurement noise covariance stays at this value for 150 iterations. It then shifts linearly over 200 iterations to a terminal value of ΣZ850=105×diag{0.18,30,0.18}\Sigma_{Z_{850}}=10^{-5}\times\textrm{diag}\{0.18,30,0.18\}, which it takes for another 150 iterations before the simulation is terminated. The results for both the nonrobust and robust tests are shown in Fig. 2. As in the fixed covariance case, our test is able to distinguish between the attacked and unattacked systems better and more consistently than the nonrobust test that requires unsatisfied assumptions.

V Conclusion

We developed covariance-robust dynamic watermarking tests for detecting sensor attacks on LTI systems in the presence of uncertainty about the measurement noise covariance. We considered cases where the covariance of measurement noise is unknown and either fixed or slowly-varying, and we required our test to be “fair” with respect to all possible values of the covariance in that it not be more or less powerful for some covariances over others. These reflect real-world needs that will increase as 5G is deployed, because there will be an increase in the deployment of smart CPS systems. In such systems, an “unfair” test can translate to disparate impact across different users in different environments, which is a problem of algorithmic bias. Future research includes studying how dynamic watermarking can be adapted to other system uncertainties.

References

  • [1] M. A. Ferrag, L. Maglaras, A. Argyriou, D. Kosmanos, and H. Janicke, “Security for 4g and 5g cellular networks: A survey of existing authentication and privacy-preserving schemes,” Journal of Network and Computer Applications, vol. 101, pp. 55–82, 2018.
  • [2] M. Abrams and J. Weiss, “Malicious control system cyber security attack case study–Maroochy water services, australia,” MITRE, 2008.
  • [3] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
  • [4] A. A. Cárdenas, S. Amin, and S. Sastry, “Research challenges for the security of control systems.” in HotSec, 2008.
  • [5] B. Satchidanandan and P. Kumar, “Dynamic watermarking: Active defense of networked cyber-physical systems,” Proc. of IEEE, 2016.
  • [6] S. Weerakkody, Y. Mo, and B. Sinopoli, “Detecting integrity attacks on control systems using robust physical watermarking,” in Proc. of IEEE CDC, 2014, pp. 3757–3764.
  • [7] Y. Mo, R. Chabukswar, and B. Sinopoli, “Detecting integrity attacks on scada systems,” IEEE CST, vol. 22, no. 4, pp. 1396–1407, 2014.
  • [8] N. Zhang, J. Wang, G. Kang, and Y. Liu, “Uplink nonorthogonal multiple access in 5g systems,” IEEE Communications Letters, vol. 20, no. 3, pp. 458–461, 2016.
  • [9] I. Ahmad, Z. Kaleem, R. Narmeen, L. D. Nguyen, and D.-B. Ha, “Quality-of-service aware game theory-based uplink power control for 5g heterogeneous networks,” Mobile Networks and Applications, vol. 24, no. 2, pp. 556–563, 2019.
  • [10] B. Satchidanandan and P. Kumar, “On minimal tests of sensor veracity for dynamic watermarking-based defense of cyber-physical systems,” in IEEE COMSNETS, 2017, pp. 23–30.
  • [11] Y. Mo and B. Sinopoli, “Secure control against replay attacks,” in Allerton Conference.   IEEE, 2009, pp. 911–918.
  • [12] Y. Mo, E. Garone, A. Casavola, and B. Sinopoli, “False data injection attacks against state estimation in wireless sensor networks,” in Proc. of IEEE CDC, 2010, pp. 5967–5972.
  • [13] Y. Mo, S. Weerakkody, and B. Sinopoli, “Physical authentication of control systems: Designing watermarked control inputs to detect counterfeit sensor outputs,” IEEE Control Systems, vol. 35, no. 1, pp. 93–109, 2015.
  • [14] M. Porter, S. Dey, A. Joshi, P. Hespanhol, A. Aswani, M. Johnson-Roberson, and R. Vasudevan, “Detecting deception attacks on autonomous vehicles via linear time-varying dynamic watermarking,” arXiv preprint arXiv:2001.09859, 2020.
  • [15] R. Berk, H. Heidari, S. Jabbari, M. Kearns, and A. Roth, “Fairness in criminal justice risk assessments: the state of the art,” arXiv preprint arXiv:1703.09207, 2017.
  • [16] M. Hardt, E. Price, and N. Srebro, “Equality of opportunity in supervised learning,” in NeurIPS, 2016, pp. 3315–3323.
  • [17] T. Calders, F. Kamiran, and M. Pechenizkiy, “Building classifiers with independency constraints,” in IEEE ICDMW, 2009, pp. 13–18.
  • [18] C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel, “Fairness through awareness,” in Proceedings of the 3rd Innovations in Theoretical Computer Science Conference.   ACM, 2012, pp. 214–226.
  • [19] I. Zliobaite, “On the relation between accuracy and fairness in binary classification,” arXiv preprint arXiv:1505.05723, 2015.
  • [20] M. Olfat and A. Aswani, “Spectral algorithms for computing fair support vector machines,” in AISTATS, 2018, pp. 1933–1942.
  • [21] A. Chouldechova, “Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,” arXiv preprint arXiv:1703.00056, 2017.
  • [22] M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi, “Fairness constraints: Mechanisms for fair classification,” in AISTATS, 2017.
  • [23] F. Chierichetti, R. Kumar, S. Lattanzi, and S. Vassilvitskii, “Fair clustering through fairlets,” in NeurIPS, 2017, pp. 5036–5044.
  • [24] M. Olfat and A. Aswani, “Convex formulations for fair principal component analysis,” in AAAI, vol. 33, 2019, pp. 663–670.
  • [25] A. Aswani and M. Olfat, “Optimization hierarchy for fair statistical decision problems,” arXiv preprint arXiv:1910.08520, 2019.
  • [26] P. Hespanhol, M. Porter, R. Vasudevan, and A. Aswani, “Dynamic watermarking for general LTI systems,” in IEEE CDC, 2017, pp. 1834–1839.
  • [27] S. S. Wilks, “The large-sample distribution of the likelihood ratio for testing composite hypotheses,” The annals of mathematical statistics, vol. 9, no. 1, pp. 60–62, 1938.
  • [28] Z. Wen, D. Goldfarb, and W. Yin, “Alternating direction augmented lagrangian methods for semidefinite programming,” Mathematical Programming Computation, vol. 2, no. 3-4, pp. 203–230, 2010.
  • [29] R. A. Horn and C. R. Johnson, Matrix analysis.   Cambridge university press, 2012.
  • [30] B. Q. Feng, “Equivalence constants for certain matrix norms,” Linear algebra and its applications, vol. 374, pp. 247–253, 2003.
  • [31] V. Turri, A. Carvalho, H. Tseng, K. Johansson, and F. Borrelli, “Linear model predictive control for lane keeping and obstacle avoidance on low curvature roads,” in Proc. of IEEE ITSC, 2013, pp. 378–383.