This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Model-free False Data Injection Attack in Networked Control Systems: A Feedback Optimization Approach

Xiaoyu Luo, , Chongrong Fang, , Jianping He, , Chengcheng Zhao, , and Dario Paccagnan§ : Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China. E-mail: [email protected], [email protected], [email protected].: State Key Laboratory of Industrial Control Technology and Institute of Cyberspace Research, Zhejiang University, China. E-mail: [email protected].§: Department of Computing, Imperial College London, London SW7 2AZ, UK. E-mail: [email protected].
Abstract

Security issues have gathered growing interest within the control systems community, as physical components and communication networks are increasingly vulnerable to cyber attacks. In this context, recent literature has studied increasingly sophisticated false data injection attacks, with the aim to design mitigative measures that improve the systems’ security. Notably, data-driven attack strategies – whereby the system dynamics is oblivious to the adversary – have received increasing attention. However, many of the existing works on the topic rely on the implicit assumption of linear system dynamics, significantly limiting their scope. Contrary to that, in this work we design and analyze truly model-free false data injection attack that applies to general linear and nonlinear systems. More specifically, we aim at designing an injected signal that steers the output of the system toward a (maliciously chosen) trajectory. We do so by designing a zeroth-order feedback optimization policy and jointly use probing signals for real-time measurements. We then characterize the quality of the proposed model-free attack through its optimality gap, which is affected by the dimensions of the attack signal, the number of iterations performed, and the convergence rate of the system. Finally, we extend the proposed attack scheme to the systems with internal noise. Extensive simulations show the effectiveness of the proposed attack scheme.

Index Terms:
Data-driven, false data injection attacks, zeroth-order feedback optimization.

I Introduction

Networked control systems have seen a surge of interest in recent years, largely owing to their widespread applicability to commonly encountered problems including mobile robots coordination, smart grids operation, unmanned vehicles control, remote diagnosis, to mention but a few [1, 2, 3]. As physical components and communication networks are increasingly vulnerable to cyber attacks, security issues have gathered growing traction in the community.

In this context, false data injection (FDI) – whereby an attacker injects false data by compromising sensor readings or communication channels – is a commonly encountered form of attack [4]. Crucially, through an FDI attack, an adversary can cause significant damage to the infrastructure while remaining undetected.

I-A Motivation

Against this backdrop, recent literature has proposed increasingly sophisticated FDI attacks, with the hope that understanding their workings would lead to the design of mitigative measures that improve the systems’ security [5, 6, 7, 8]. Among them, data-driven attack strategies – whereby an attack signal is designed solely relying on the available system’s measurements – have received growing attention. However, three critical issues deserve further consideration. First, many of the existing works tacitly assume that, while unknown, the underlying system follows a linear time invariant dynamics [9, 10, 11]. This significantly restricts the power of the adversary. Second, without any prior information about the system model, the ability that the adversary achieves its attack objective needs to be explored. It is conducive to analyzing the system’s vulnerability. Third, when the capability of the adversary is limited, e.g., the attack energy is limited [12], it is practical and promising to excavate the potential attack’s impact while achieving the malicious objective.

In this work, we tackle the aforementioned issues by proposing a novel data-driven FDI attack that, crucially, does not rely on prior information regarding the system’s model, and applies to general non-linear dynamics. Towards this goal, we leverage zeroth-order optimization methods, a class of optimization algorithms that do not necessitate the availability of the cost function’s gradient, but simply exploit function evaluations [13]. More specifically, we leverage zeroth-order optimization methods in the context of feedback optimization, where optimization algorithms are utilized as feedback controllers for dynamical systems [14]. These tools provide a new approach to design data-driven attacks.

I-B Contributions

In this work, we design a model-free FDI attack that does not rely on knowledge of the underlying dynamics and that applies to general linear and non-linear systems. Specifically, we aim at steering the output of an unknown dynamical system to a (maliciously chosen) trajectory, through the sole use of real-time measurements. We do so in the bounded attack model, where an upper bound on the energy of the injected signal is given. A comparison between the existing FDI attack strategies and our work is shown in Table I.

Compared to our conference version [15], we extend the proposed model-free attack strategy to systems with internal noise and explore its effects on the optimality of the obtained solutions. Moreover, we significantly expand upon the related work, motivation, performance analysis, and simulation results. The main contributions are summarized as follows.

TABLE I: Comparison of FDI attack designs
Model-based
[5, 8], etc.
Data-driven
[10, 6], etc.
Our work
Prior
Information
Knowledge
of dynamics
Linear
Dyanamics
None
Objective
Steer state
to desired point
Remain
undetected
Steer state to
desired trajectory
Approach
Dynamic
programming
Subspace
method
Zeroth-order
feedback-optimization
  • We construct a zeroth-order feedback optimization framework for the design of an FDI attack strategy, where the adversary has limited capability and no prior information about the system model.

  • We propose a model-free attack scheme that drives the output of the system to a maliciously chosen trajectory. From a methodological standpoint, its novelty lies in directly updating the attack signal based on the objective function evaluations.

  • We theoretically characterize the solution’s optimality gap. Further, we analyze the impact of the attack signal’s dimension, of the iteration numbers, of the variance of the objective function, and of the convergence rate of the dynamical system on the optimality gap. Finally, we extend the proposed model-free attack scheme to noisy systems and derive an upper bound of the optimality gap.

I-C Paper organization

The rest of the paper is organized as follows. Section II reviews the related works. Section III introduces the system and adversary model and formulates the FDI attack design problem. In Section IV, we design the proposed model-free attack strategy and analyze the optimality gap. Section V extends the model-free attack scheme to noisy systems. In Section VI, we analyze the design of stealthy attacks. Simulation results are presented in Section VII. Finally, we conclude our work in Section VIII.

II Related works

Existing FDI attack strategies can be divided into two streams depending on whether they rely on a model-based or data-driven approach.

The literature on model-based FDI attack strategy is vast, and includes [5, 16, 7, 17, 8]. In the following, we review only the works that are most relevant to ours. When the adversary is aware of the system dynamics and other critical information (e.g., statistical properties of noise and the controller’s feedback matrix), Chen et al. [5] formulate a linear quadratic cost function to steer the system state to a desired value, while satisfying a detection-avoidance constraint. In similar settings, i.e., when knowledge of the system dynamics is available, Guo et al. [16] propose an innovation-based linear attack strategy and formulates a two-stage optimization problem to obtain the most-damaging attack policy. In [7], Wang proposes an optimal attack strategy to deteriorate the performance of fault detectors by solving coupled backward recursive Riccati difference equations (RDEs). In [17], the authors design an FDI attack strategy against a remote state estimation algorithm with sensor-to-estimator communication rate constraint. With the knowledge of all system parameters except for the filter gain, Zhang et al. [8] design stealthy attacks based on the Fisher information matrix to maximize the estimation error. Note that the design of the above FDI attack strategies is mostly based on the full knowledge of the system model. However, when the system model changes or its exact knowledge is unavailable, the previous approaches can not be applied.

On the other hand, data-driven attack strategies have recently gained momentum [18, 9, 10, 6]. Two approaches are typically pursued. The first approach consists in exploiting offline observation of the system’s dynamics to identify the matrices of the linear system model. Naturally, this approach does not apply to genuinely non-linear dynamics. The second approach consists in directly utilizing input-output data to design an attack strategy. For example, Esmalifalak et al. [18] apply linear independent component analysis (ICA) to estimate the system Jacobian matrix and design unobservable attacks based on the inferred structure. Kim et al. [9] extend the work in [4] and present two data-driven attack strategies based on subspace methods. An et al. [10] formulate the attack goal as a data-based 2\mathcal{L}_{2}-gain composite optimization problem and propose a new multiobjective adaptive dynamic programming (ADP) method for designing the attack policy. Zhao et al. [6] propose an undetected FDI attack strategy based on a subspace identification technique to maximize the state estimation error. Note that the linearity of the system dynamics is still a crucial and implicit assumption necessary for all the aforementioned works. Our work relaxes this assumption and provides a new perspective to construct a completely model-free attack strategy based on the zeroth-order feedback optimization framework.

III Problem formulation

III-A System dynamic model &\& adversary model

Consider a discrete-time dynamical system

xk+1=f(xk,uk),yk=g(xk),\displaystyle\begin{split}x_{k+1}=&f(x_{k},u_{k}),\\ y_{k}=&g(x_{k}),\end{split} (1)

where xknx_{k}\in\mathbb{R}^{n} is the system state at iteration kk, ukmu_{k}\in\mathbb{R}^{m} is the system input, ykqy_{k}\in\mathbb{R}^{q} is the system output.

Assumption 1.

The system (1) is stable under the control of system input uk,ku_{k},\forall k\in\mathbb{N}.

Consider that the adversary can compromise the stable system and manipulate the state xkx_{k} arbitrarily and aims to steer the output value ykay^{a}_{k} to its expected trajectory. The dynamical system under attack can be rewritten as

xk+1a=f(xka,uk)+Γθk,yka=g(xka),\displaystyle\begin{split}x_{k+1}^{a}=&f(x_{k}^{a},u_{k})+\Gamma\theta_{k},\\ y_{k}^{a}=&g(x_{k}^{a}),\end{split} (2)

where the attack selection matrix Γn×p\Gamma\in\mathbb{R}^{n\times p} is defined as the non-zero columns of diag(γ1,,γn)\mathrm{diag}(\gamma_{1},\ldots,\gamma_{n}) with the binary variable γi=1\gamma_{i}=1 if the ii-th dimensional state is compromised, and θkp\theta_{k}\in\mathbb{R}^{p} is the injected false data. Then, we make the following assumption about the ability of the adversary.

Assumption 2.

The capability of the adversary is limited, i.e., θkTθkR\theta_{k}^{\mathrm{T}}\theta_{k}\leq R, where RR is the upper bound of attack energy.

Assumption 2 is common for energy-constrained adversaries [12], which means that the injected false data is bounded. With Assumptions 1 and 2, it is easy to obtain the following lemma to show that the compromised system (2) is still stable.

Lemma 1.

For the compromised system (2), there exists a unique steady-state map xssa:m×pnx^{a}_{ss}:\mathbb{R}^{m}\times\mathbb{R}^{p}\rightarrow\mathbb{R}^{n} such that θ,f(xssa(u,θ),u,θ)f(xssa(u,θ),u)+Γθ=xssa(u,θ)\forall\theta,f^{\prime}(x^{a}_{ss}(u,\theta),u,\theta)\triangleq f(x^{a}_{ss}(u,\theta),u)+\Gamma\theta=x^{a}_{ss}(u,\theta). The map xssa(u,θ)x^{a}_{ss}(u,\theta) is MxM_{x}-Lipschitz with respect to θ\theta, and the function g(xa)g(x^{a}) is MgM_{g}-Lipschitz with respect to xax^{a}.

Remark 1.

Lemma 1 is similar to [19] for guaranteeing the stability of the system. If the system under the bounded FDI attacks has no unique steady-state map xssax^{a}_{ss}, it is obvious that the system will diverse and even the original system (1) is unstable. The properties of the map xssa(u,θ)x^{a}_{ss}(u,\theta) can be ensured by the implicit function theorem [20, Theorem 1B.1]. With Lemma 1, in the steady state we have

ya=g(xssa(u,θ))h(u,θ).\displaystyle y^{a}=g(x^{a}_{ss}(u,\theta))\triangleq h(u,\theta).

Additionally, the Lyapunov theorem presented in [21, Theorem 2.7] guarantees that there exist a Lyapunov function V:n×m×pV:\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{p}\rightarrow\mathbb{R} and parameters α1,α2,α3>0\alpha_{1},\alpha_{2},\alpha_{3}>0 such that

α1xaxssa(u,θ)2V(xa,u,θ)α2xaxssa(u,θ)2,\displaystyle\alpha_{1}\|x^{a}-x^{a}_{ss}(u,\theta)\|^{2}\leq V(x^{a},u,\theta)\leq\alpha_{2}\|x^{a}-x^{a}_{ss}(u,\theta)\|^{2}, (3)
V(f(xssa(u,θ),u,θ))V(xa,u,θ)α3xaxssa(u,θ)2.\displaystyle V(f^{\prime}(x^{a}_{ss}(u,\theta),u,\theta))-V(x^{a},u,\theta)\leq-\alpha_{3}\|x^{a}-x^{a}_{ss}(u,\theta)\|^{2}. (4)

Based on (3) and (4), the rate of the change in one step of the function value V(xa,u,θ)V(x^{a},u,\theta) is denoted by

μ2α2α1(1α3α2).\displaystyle\mu\triangleq\frac{2\alpha_{2}}{\alpha_{1}}(1-\frac{\alpha_{3}}{\alpha_{2}}). (5)
Assumption 3.

The convergence rate μ\mu satisfies μ<1\mu<1.

The smaller μ\mu is, the faster the system converges to the steady state [22]. The formal interpretation of μ\mu will be presented later in Lemma 4.

III-B Problem formulation

In this paper, we aim to design a completely model-free attack strategy, which is independent of the characteristics and parameters of the system model itself.

Herein, we consider that the adversary’s objective is to steer the output value ykay^{a}_{k} to follow its expected malicious trajectory y¯k\bar{y}_{k} as closely as possible. We also consider that the adversary has limited attack energy. Therefore, the total goal of adversaries is to reduce both the error between the true system output and expected trajectory and the consumed attack energy as much as possible. In addition, since our proposed attack strategy performs the optimization with the same objective function at each iteration kk, we omit the subscript kk and formally formulate the problem as

𝒫𝟏:\displaystyle\mathbf{\mathcal{P}_{1}}:\quad minθΦ(θ,ya)=yay¯+θTQθ\displaystyle\mathrm{min}_{\theta}~{}\Phi(\theta,y^{a})=\|y^{a}-\bar{y}\|+\theta^{\mathrm{T}}Q\theta (6)
s.t.ya=h(u,θ),\displaystyle\mathrm{s.t.}~{}y^{a}=h(u,\theta),
θTθR,\displaystyle\quad~{}~{}\theta^{\mathrm{T}}\theta\leq R,

where ya=h(u,θ)y^{a}=h(u,\theta) is the steady-state map under attacks in (2) to guarantee the stability of the compromised system (2), y¯\bar{y} is the expected trajectory and Qp×pQ\in\mathbb{R}^{p\times p} is the positive definite weight matrix chosen by the adversary according to the tradeoff between the limited attack energy and tracking deviation yay¯\|y^{a}-\bar{y}\|. We also make a common assumption for the optimized objective function as follows.

Assumption 4.

The function Φ(θ,ya)\Phi(\theta,y^{a}) is MM-Lipschitz with respect to θ\theta, MyM_{y}-Lipschitz with respect to yay^{a}, and infθ,yaΦ(θ,ya)>\inf_{\theta,y^{a}}\Phi(\theta,y^{a})>-\infty.

The challenges of solving problem 𝒫1\mathcal{P}_{1} come from two aspects. One is the nonlinearity of the system model. For the unknown nonlinear system model (2), it is hard to regress its critical system parameters. The other is how to use the compromised measurements to guide the output value to move along the desired trajectory while reducing the consumed attack energy as much as possible. Since h(u,θ)h(u,\theta) is unknown, it is difficult to directly obtain the gradients of the objective function with respect to the independent variable θ\theta to solve problem 𝒫1\mathcal{P}_{1}.

The key idea of the zeroth-order optimization is to utilize the objective function evaluations to construct gradient estimates, thus avoiding using the gradients directly. We aim to construct the gradient estimates of the objective function to solve problem 𝒫1\mathcal{P}_{1}. Different from the traditional zeroth-order optimization framework for the design of the controller with non-manipulated measurements, our design focuses on utilizing the compromised measurements to design the attack signal in the original control systems with designed controllers. Herein, we mainly explore the model-free attack strategy without detector constraints and the attack design under detector constraints will be analyzed in Section VI.

IV Model-free attack strategy design

In this section, we first introduce the zeroth-order optimization framework, which is the basis of our attack strategy design. Then, we utilize real-time measurements to design the attack signal. Finally, we analyze the optimality of the proposed attack strategy.

IV-A Preliminaries of zeroth-order optimization

The attack strategy design in this paper is inspired by the gradient estimates based on the residual feedback in [13].

For an objective function Φ(w):p\Phi(w):\mathbb{R}^{p}\rightarrow\mathbb{R}, the gradient estimate proposed in [13] is

^Φ(wk)=vkδ(Φ(wk+δvk)Φ(wk1+δvk1)),\displaystyle\hat{\nabla}\Phi(w_{k})=\frac{v_{k}}{\delta}(\Phi(w_{k}+\delta v_{k})-\Phi(w_{k-1}+\delta v_{k-1})), (7)

where vkv_{k} and vk1v_{k-1} are independent random vectors selected uniformly from the unit sphere 𝒮p{vkp:vk=1}\mathcal{S}_{p}\triangleq\{v_{k}\in\mathbb{R}^{p}:\|v_{k}\|=1\}, i.e., vkU(𝒮p)v_{k}\sim U(\mathcal{S}_{p}) and δ>0\delta>0 is the smoothing parameter. Note that only a new objective function evaluation needs to be computed at each iteration in (7) because the objective value evaluated at the previous iteration k1k-1 is reused at the current iteration kk.

According to [13, Lemma 5], ^Φ(wk)\hat{\nabla}\Phi(w_{k}) in (7) is an unbiased estimate of the gradient of the smooth approximation Φδ(w)\Phi_{\delta}(w) for Φ(w)\Phi(w) at wkw_{k}, where

Φδ(w)=𝔼vU(𝒮p)[Φ(w+δv)].\displaystyle\Phi_{\delta}(w)=\mathbb{E}_{v\sim U(\mathcal{S}_{p})}[\Phi(w+\delta v)]. (8)

The properties of Φδ(w)\Phi_{\delta}(w) are shown as follows.

Lemma 2 ([19]).

If Φδ(w):p\Phi_{\delta}(w):\mathbb{R}^{p}\rightarrow\mathbb{R} is MM-Lipschitz, then for any wp,δ>0w\in\mathbb{R}^{p},\delta>0 and Φδ(w)\Phi_{\delta}(w) defined in (8), we have

𝔼vU(𝕊p)[pδΦ(w+δv)v]=\displaystyle\mathbb{E}_{v\sim U(\mathbb{S}_{p})}[\frac{p}{\delta}\Phi(w+\delta v)v]= Φδ(w),\displaystyle\nabla\Phi_{\delta}(w), (9a)
|Φδ(w)Φ(w)|\displaystyle|\Phi_{\delta}(w)-\Phi(w)|\leq Mδ,\displaystyle M\delta, (9b)
Φδ(w)Φ(w)\displaystyle\|\nabla\Phi_{\delta}(w)-\nabla\Phi(w)\|\leq Mpδ.\displaystyle\frac{Mp}{\delta}. (9c)

From (9c), we know that Φδ(w)\Phi_{\delta}(w) is Mpδ\frac{Mp}{\delta}-smooth, i.e., its gradient Φδ(w)\nabla\Phi_{\delta}(w) is Mpδ\frac{Mp}{\delta}-Lipschitz continuous.

Refer to caption
Figure 1: The schematic of model-free attack strategy design.

IV-B Attack strategy design

The proposed attack strategy iteratively updates attack inputs along the composite direction of the negative gradient estimates of the objective function and the projected gradients. Such a design only utilizes real-time measurements and thus makes the attack strategy intrinsically model-free.

We denote 𝒰\mathcal{U} as the constraint set in problem 𝒫1\mathcal{P}_{1}. With the zeroth-order optimization framework, the proposed model-free attack strategy can be divided into three steps and the schematic of the attack strategy design is shown in Fig.1.
Step 11: Compute the gradient estimate ϕ~k\tilde{\phi}_{k}

ϕ~k=pvkδ[Φ(θk,yk+1a)Φ(θk1,yka)],\displaystyle\tilde{\phi}_{k}=\frac{pv_{k}}{\delta}[\Phi(\theta_{k},y_{k+1}^{a})-\Phi(\theta_{k-1},y_{k}^{a})], (10)

where vkv_{k} and vk1v_{k-1} are independent probing signals and follow the uniform distribution from the Euclidean unit sphere 𝒮p\mathcal{S}_{p}, i.e., vkU(𝒮p)v_{k}\sim U(\mathcal{S}_{p}). Since only the real-time measurements are available for the adversary and it is hard to directly compute the gradients of the objective function in problem 𝒫1\mathcal{P}_{1}, we first utilize the probing signal vkv_{k} for measurements, which can be used to construct the objective function evaluations Φ(θk,yk+1a)\Phi(\theta_{k},y_{k+1}^{a}) and Φ(θk1,yka)\Phi(\theta_{k-1},y_{k}^{a}) at the current and previous iteration. Herein, the historic function evaluation Φ(θk1,yka)\Phi(\theta_{k-1},y^{a}_{k}) is reused at iteration k+1k+1. Then we compute the gradient estimates ϕ~k\tilde{\phi}_{k} of the objective function by these evaluations with (10).
Step 22: Update the obtained solution wk+1w_{k+1}

wk+1=Π𝒰[wkηϕ~k],\displaystyle w_{k+1}=\Pi_{\mathcal{U}}[w_{k}-\eta\tilde{\phi}_{k}], (11)

where Π𝒰[]\Pi_{\mathcal{U}}[\cdot] is the projection onto constrained set 𝒰\mathcal{U}, i.e., Π𝒰[l1]argminl2𝒰l1l2\Pi_{\mathcal{U}}[l_{1}]\equiv\mathrm{arg}\min_{l_{2}\in\mathcal{U}}\|l_{1}-l_{2}\|, and step-size 0<η<10<\eta<1. To constrain the obtained solutions in the feasible region set by 𝒰\mathcal{U}, we turn to the projected gradient descent method for updating the solution wk+1w_{k+1} at iteration k+1k+1 and solving the optimization problem 𝒫1\mathcal{P}_{1} with constraints.
Step 33: Update the attack signal θk+1\theta_{k+1}

θk+1=wk+1+δvk+1.\displaystyle\theta_{k+1}=w_{k+1}+\delta v_{k+1}. (12)

Finally, the attack signal θk+1\theta_{k+1} can be obtained by perturbing the solution wk+1w_{k+1} with the probing signal δvk+1\delta v_{k+1}.

IV-C Performance analysis

Let Φ(θk)Φ(θk,h(uk,θk))\Phi(\theta_{k})\triangleq\Phi(\theta_{k},h(u_{k},\theta_{k})). We use the optimality gap, i.e.,

1Tk=1T𝔼v[T][Φ(θk)Φ(θk)]\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\Phi(\theta_{k})-\Phi(\theta_{k}^{*})] (13)

to measure the optimality of the proposed attack strategy at θk\theta_{k} where θk\theta^{*}_{k} is the optimal solution at iteration kk and 𝔼v[k]\mathbb{E}_{v_{[k]}} is the expectation with respect to v[k]v_{[k]} where v[k](v1,,vk)v_{[k]}\triangleq(v_{1},\ldots,v_{k}).

Before we characterize (13), we provide the upper bounds of wk+1wk+12\|w_{k+1}-w_{k+1}^{*}\|^{2} and wk+1wk2\|w_{k+1}-w_{k}\|^{2}, and some supporting lemmas for auxiliary analysis. We have

wk+1wk+12=\displaystyle\|w_{k+1}-w_{k+1}^{*}\|^{2}= Π𝒰[wkηϕ~k]wk+12\displaystyle\|\Pi_{\mathcal{U}}[w_{k}-\eta\tilde{\phi}_{k}]-w_{k+1}^{*}\|^{2}
(s.1)\displaystyle\overset{(s.1)}{\leq} wkηϕ~kwk+12\displaystyle\|w_{k}-\eta\tilde{\phi}_{k}-w_{k+1}^{*}\|^{2}
(s.2)\displaystyle\overset{(s.2)}{\leq} 2wkwk+12+2η2ϕ~k2,\displaystyle 2\|w_{k}-w_{k+1}^{*}\|^{2}+2\eta^{2}\|\tilde{\phi}_{k}\|^{2}, (14)

where (s.1)(s.1) follows from the projection property [23, Lemma 2.4] and [24], i.e., for any l1pl_{1}\in\mathbb{R}^{p} and all l2𝒰l_{2}\in\mathcal{U}, we have Π𝒰[l1]l2l1l2\|\Pi_{\mathcal{U}}[l_{1}]-l_{2}\|\leq\|l_{1}-l_{2}\|, and (s.2)(s.2) follows the fact that ab22(a2+b2)\|a-b\|^{2}\leq 2(\|a\|^{2}+\|b\|^{2}). Similarly, we have

wk+1wk2=\displaystyle\|w_{k+1}-w_{k}\|^{2}= Π𝒰[wkηϕ~k]wk2\displaystyle\|\Pi_{\mathcal{U}}[w_{k}-\eta\tilde{\phi}_{k}]-w_{k}\|^{2}
\displaystyle\leq wkηϕ~kwk2\displaystyle\|w_{k}-\eta\tilde{\phi}_{k}-w_{k}\|^{2}
\displaystyle\leq η2ϕ~k2.\displaystyle\eta^{2}\|\tilde{\phi}_{k}\|^{2}. (15)

Note that we replace the steady output value h(uk,θk)h(u_{k},\theta_{k}) with the real-time output value yk+1ay^{a}_{k+1} to enter the closed-loop zeroth-order feedback optimization framework. It is unavoidable for the system to produce the error eΦ(xka,θk)e_{\Phi}(x^{a}_{k},\theta_{k}), which is shown as

eΦ(xka,θk)=Φ(θk,yk+1a)Φ(θk,h(uk,θk)).\displaystyle e_{\Phi}(x^{a}_{k},\theta_{k})=\Phi(\theta_{k},y^{a}_{k+1})-\Phi(\theta_{k},h(u_{k},\theta_{k})). (16)

To derive the optimality gap (13), we first analyze the upper bound of the error eΦ(xka,θk)e_{\Phi}(x^{a}_{k},\theta_{k}) and recursive inequalities of two critical variables, i.e., 𝔼v[k][V(xka,uk,θk)]\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})] and 𝔼v[k][ϕ~k2]\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}].

Lemma 3.

If Assumptions 1, 2, 3, and 4 hold, then we have

|eΦ(xka,θk)|2μMy2Mg22α2V(xka,uk,θk).\displaystyle|e_{\Phi}(x^{a}_{k},\theta_{k})|^{2}\leq\frac{\mu M_{y}^{2}M_{g}^{2}}{2\alpha_{2}}V(x^{a}_{k},u_{k},\theta_{k}). (17)
Lemma 4.

If Assumptions 1, 2, 3, and 4 hold, with (10), (11) and (12), then we have

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [V(xka,uk,θk)]\displaystyle[V(x^{a}_{k},u_{k},\theta_{k})]
μ𝔼v[k][V(xk1a,uk1,θk1)]\displaystyle\leq\mu\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]
+4α2η2Mx2𝔼v[k][ϕ~k12]+16α2δ2Mx2.\displaystyle+4\alpha_{2}\eta^{2}M_{x}^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+16\alpha_{2}\delta^{2}M_{x}^{2}. (18)
Lemma 5.

If Assumptions 1, 2, 3, and 4 hold, with (10), (11), and (12), then we have

𝔼v[k][ϕ~k2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}] 6η2p2M2δ2𝔼v[k][ϕ~k12]+24p2M2\displaystyle\leq\frac{6\eta^{2}p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+24p^{2}M^{2}
+3μp2My2Mg22α2δ2(𝔼v[k][V(xka,uk,θk)]\displaystyle+\frac{3\mu p^{2}M_{y}^{2}M_{g}^{2}}{2\alpha_{2}\delta^{2}}(\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})]
+𝔼v[k][V(xk1a,uk1,θk1)]).\displaystyle+\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]). (19)

The proofs of Lemmas 3, 4 and 5 are shown in Appendix IX-A, IX-B and IX-C, respectively. Lemma 3 quantifies the close relationship between Φ(θk,yk+1a)\Phi(\theta_{k},y^{a}_{k+1}) and Φ(θk,h(uk,θk))\Phi(\theta_{k},h(u_{k},\theta_{k})). Lemma 4 measures the proximity of the current state xkax^{a}_{k} compared with the steady state xssa(uk,θk)x^{a}_{ss}(u_{k},\theta_{k}). Lemma 5 reflects the first order smoothness of the objective function evaluation Φ(θk,yk+1a)\Phi(\theta_{k},y^{a}_{k+1}) at the solution wkw_{k}.

Next, we provide the following theorem to characterize the optimality of the obtained solutions. Note that Φ(θk)Φ(θk,h(uk,θk))\Phi(\theta_{k})\triangleq\Phi(\theta_{k},h(u_{k},\theta_{k})). For simplicity, all the complexity results in this paper are presented in 𝒪\mathcal{O} notations.

Theorem 1.

Supposing that Assumptions 1, 2, 3, and 4 hold, for any given precision ϵ>0\epsilon>0 such that |Φδ(θ)Φ(θ)|ϵ|\Phi_{\delta}(\theta)-\Phi(\theta)|\leq\epsilon, let δ=ϵM\delta=\frac{\epsilon}{M} and η=κϵpT\eta=\frac{\kappa\epsilon}{pT} with 0<κ<κ0<\kappa<\kappa^{*}, where

κ=𝒪(min{Tμ(1+μ)μ,(1μ)Tμ(1+μ)}),\displaystyle\kappa^{*}=\mathcal{O}\left(\min\left\{\frac{T\sqrt{\mu(1+\mu)}}{\mu},\frac{(1-\mu)T}{\sqrt{\mu(1+\mu)}}\right\}\right),

then we have

1Tk=1T\displaystyle\frac{1}{T}\sum_{k=1}^{T} 𝔼v[T][Φ(θk)Φ(θk)]\displaystyle\mathbb{E}_{v_{[T]}}[\Phi(\theta_{k})-\Phi(\theta_{k}^{*})]
=𝒪(p2(1+μ)(1+1+μ)(1ρ)T2+μp2T),\displaystyle=\mathcal{O}\left(\frac{p^{2}(1+\mu)(1+\sqrt{1+\mu})}{(1-\rho)T^{2}}+\frac{\mu p^{2}}{T}\right), (20)

where ρ(0,1)\rho\in(0,1) is the maximum eigenvalue of matrix PP given by

P=[p11p12p21p12p21p22]\displaystyle P=\left[\begin{array}[]{cc}p_{11}&\sqrt{p_{12}p_{21}}\\ \sqrt{p_{12}p_{21}}&p_{22}\end{array}\right]

with

p11=\displaystyle p_{11}= 6p2η2δ2(M2+μMx2My2Mg2),p22=μ,\displaystyle\frac{6p^{2}\eta^{2}}{\delta^{2}}(M^{2}+\mu M_{x}^{2}M_{y}^{2}M_{g}^{2}),\quad p_{22}=\mu,
p12=\displaystyle p_{12}= 3μp2My2Mg22α2δ2(1+μ),p21=4α2η2Mx2,\displaystyle\frac{3\mu p^{2}M_{y}^{2}M_{g}^{2}}{2\alpha_{2}\delta^{2}}(1+\mu),\quad p_{21}=4\alpha_{2}\eta^{2}M_{x}^{2},
d1=\displaystyle d_{1}= 24p2(M2+μMx2My2Mg2),d2=16α2δ2Mx2.\displaystyle 24p^{2}(M^{2}+\mu M_{x}^{2}M_{y}^{2}M_{g}^{2}),\quad d_{2}=16\alpha_{2}\delta^{2}M_{x}^{2}.

Moreover, it also holds that

ρ=𝒪(max{(1μ)21+μ,μ}+1μ).\displaystyle\rho=\mathcal{O}\left(\max\left\{\frac{(1-\mu)^{2}}{1+\mu},\mu\right\}+1-\mu\right).
Proof.

Please see Appendix IX-D. ∎

Theorem 1 shows that the optimality gap is related to the dimension pp of the attack signal, the convergence rate μ\mu of the system, and the iterations TT. As the iterations TT increase gradually, the optimality gap decreases and it can even decay to zero as long as TT is large enough.

V Noise effects on model-free attack design

In this part, we further explore the effects of internal inherent noises on the proposed attack strategy and derive the optimality of solutions.

V-A Problem reformulation

With noise dkd_{k}, the original system (2) can be rewritten as

xk+1a=f(xka,uk,dk)+Γθk,yka,d=g(xka,dk),\displaystyle\begin{split}x_{k+1}^{a}=&f(x_{k}^{a},u_{k},d_{k})+\Gamma\theta_{k},\\ y_{k}^{a,d}=&g(x_{k}^{a},d_{k}),\end{split} (21)

where the injected false data θk\theta_{k} satisfies Assumption 2 and the internal inherent noise dkrd_{k}\in\mathbb{R}^{r} is independent of the state xkx_{k} and θk\theta_{k} statistically. Herein, we consider the additive noise dkd_{k}, such as f(xka,uk,dk)=f(xka,uk)+dkf(x_{k}^{a},u_{k},d_{k})=f(x_{k}^{a},u_{k})+d_{k}. Similar to Assumption 1, the above discrete-time system is stable with noise dkd_{k} before the invasion of attacks, which can be guaranteed by [25, Theorem 2.2] if dkd_{k} follows a standard Wiener process, i.e., the stochastic noise has zero mean and time-varying covariance. Let Φ(θ,ya,d)=ya,dy¯+θTQθ\Phi(\theta,y^{a},d)=\|y^{a,d}-\bar{y}\|+\theta^{\mathrm{T}}Q\theta. In this case, the optimization problem becomes

𝒫𝟐:\displaystyle\mathbf{\mathcal{P}_{2}}:\quad minθ𝔼d[Φ(θ,ya,d)]\displaystyle\mathrm{min}_{\theta}~{}\mathbb{E}_{d}[\Phi(\theta,y^{a},d)] (22)
s.t.ya,d=h(u,θ,d),\displaystyle\mathrm{s.t.}~{}y^{a,d}=h(u,\theta,d),
θTθR,\displaystyle\quad~{}~{}\theta^{\mathrm{T}}\theta\leq R,

where ya,d=h(u,θ,d)y^{a,d}=h(u,\theta,d) is the steady-state map under attacks in (21) to guarantee the stability of the compromised system. Let Φ~(θ)𝔼d[Φ(θ,ya,d)]\tilde{\Phi}(\theta)\triangleq\mathbb{E}_{d}[\Phi(\theta,y^{a},d)]. Then, we provide the following assumptions for the objective function Φ(θ,ya,d)\Phi(\theta,y^{a},d).

Assumption 5.

For any θp\theta\in\mathbb{R}^{p}, there exists σ>0\sigma>0 such that

𝔼d[(Φ(θ,ya,d)Φ~(θ))2]σ2.\displaystyle\mathbb{E}_{d}[(\Phi(\theta,y^{a},d)-\tilde{\Phi}(\theta))^{2}]\leq\sigma^{2}. (23)
Assumption 6.

The function Φ(θ,ya,d)\Phi(\theta,y^{a},d) is M(d)M(d)-Lipschitz with respect to θ\theta, My(d)M_{y}(d)-Lipschitz with respect to ya,dy^{a,d}, and infθ,ya,dΦ(θ,ya,d)>\inf_{\theta,y^{a},d}\Phi(\theta,y^{a},d)>-\infty. Moreover, we have M(d)MM(d)\leq M and My(d)MyM_{y}(d)\leq M_{y}.

Assumption 5 provides a bounded variance of the objective function in the stochastic setting, which also implies that 𝔼d[(Φ(θ,ya,d01)Φ(θ,ya,d02))2]4σ2\mathbb{E}_{d}[(\Phi(\theta,y^{a},d_{01})-\Phi(\theta,y^{a},d_{02}))^{2}]\leq 4\sigma^{2} [13]. In Assumption 6, the Lipschitz constants in the noisy system are constrained to be not larger than that in the noiseless system.

Moreover, the following lemma reveals that the compromised system (21) can still be stable in spite of the process and measurement noises and the noises do not influence the Lipschitz constant of the steady-state map.

Lemma 6.

For the compromised system (21), there exists a unique steady-state map xssa:m×p×rnx^{a}_{ss}:\mathbb{R}^{m}\times\mathbb{R}^{p}\times\mathbb{R}^{r}\rightarrow\mathbb{R}^{n} such that f(xssa(u,θ,d),u,θ,d)f(xssa(u,θ,d),u,d)+Γθ=xssa(u,θ,d)f^{\prime}(x^{a}_{ss}(u,\theta,d),u,\theta,d)\triangleq f(x^{a}_{ss}(u,\theta,d),u,d)+\Gamma\theta=x^{a}_{ss}(u,\theta,d) for any θ\theta. In addition, xssa(u,θ,d)x_{ss}^{a}(u,\theta,d) is MxM_{x}-Lipschitz with respect to θ\theta, and the function g(xa,d)g(x^{a},d) is MgM_{g}-Lipschitz with respect to xax^{a}.

Proof.

The proof can be divided into two parts. One is to find a Lyapunov function for guaranteeing the existence of the steady-state map. The other is to show the continuation property of the steady-state map based on the implicit function theorem [20, Theorem 1B.1].

Existence of the steady-state. In the steady-state, we have

ya,d=g(xssa(u,θ,d))h(u,θ,d).\displaystyle y^{a,d}=g(x^{a}_{ss}(u,\theta,d))\triangleq h(u,\theta,d).

Similarly, there exists the following Lyapunov function V:n×m×pV:\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{p}\rightarrow\mathbb{R} and parameters α1,α2,α3>0\alpha_{1},\alpha_{2},\alpha_{3}>0 such that

c1xaxssa(u,θ,d)2V(xa,u,θ,d)c2xaxssa(u,θ,d)2,\displaystyle c_{1}\|x^{a}-x^{a}_{ss}(u,\theta,d)\|^{2}\leq V(x^{a},u,\theta,d)\leq c_{2}\|x^{a}-x^{a}_{ss}(u,\theta,d)\|^{2}, (24)
V(f(xssa(u,θ,d)))V(xa,u,θ,d)c3xaxssa(u,θ,d)2.\displaystyle V(f^{\prime}(x^{a}_{ss}(u,\theta,d)))-V(x^{a},u,\theta,d)\leq-c_{3}\|x^{a}-x^{a}_{ss}(u,\theta,d)\|^{2}. (25)

Based on (24) and (25), the rate of the change in one step of the function value V(xa,u,θ,d)V(x^{a},u,\theta,d) is denoted as

μ2c2c1(1c3c2).\displaystyle\mu^{\prime}\triangleq\frac{2c_{2}}{c_{1}}(1-\frac{c_{3}}{c_{2}}). (26)

The stability of the compromised system can be guaranteed if μ<1\mu^{\prime}<1.

Continuity of the steady-state. Let F(x,u,θ,d)=f(xssa(u,θ,d),u,d)+Γθxssa(u,θ,d)=0F(x,u,\theta,d)=f(x^{a}_{ss}(u,\theta,d),u,d)+\Gamma\theta-x^{a}_{ss}(u,\theta,d)=0. Differentiating both sides of the above equation with respect to θ\theta gives that

Fxssaxssaθ+Fuuxssaxssaθ+Fθ+Fd0=0.\displaystyle\frac{\partial{F}}{\partial{x_{ss}^{a}}}\frac{\partial{x_{ss}^{a}}}{\partial{\theta}}+\frac{\partial{F}}{\partial{u}}\frac{\partial{u}}{\partial{x_{ss}^{a}}}\frac{\partial{x_{ss}^{a}}}{\partial{\theta}}+\frac{\partial{F}}{\partial{\theta}}+\frac{\partial{F}}{\partial{d}}\cdot 0=0.

When F(x,u,θ,d)F(x,u,\theta,d) is continuously differentiable with respect to θ\theta in the neighborhood of (xssa,u,θ,d)(x_{ss}^{a},u,\theta,d) and Fxssa+Fuuxssa\frac{\partial{F}}{\partial{x_{ss}^{a}}}+\frac{\partial{F}}{\partial{u}}\frac{\partial{u}}{\partial{x_{ss}^{a}}} is nonsingular, xssax_{ss}^{a} is the Lipschitz function with respect to θ\theta where the Lipschitz constant MxM_{x} satisfies

Mx=sup|FθFxssa+Fuuxssa|.\displaystyle M_{x}=\mathrm{sup}\left|-\frac{\frac{\partial{F}}{\partial{\theta}}}{\frac{\partial{F}}{\partial{x_{ss}^{a}}}+\frac{\partial{F}}{\partial{u}}\frac{\partial{u}}{\partial{x_{ss}^{a}}}}\right|.

Since we consider the additive noise from (21), it can be followed that

F(x,u,θ)θ\displaystyle\frac{\partial{F(x,u,\theta)}}{\partial{\theta}} =F(x,u,θ,d)θ,\displaystyle=\frac{\partial{F(x,u,\theta,d)}}{\partial{\theta}},
F(x,u,θ)xssa\displaystyle\frac{\partial{F(x,u,\theta)}}{\partial{x_{ss}^{a}}} =F(x,u,θ,d)xssa,\displaystyle=\frac{\partial{F(x,u,\theta,d)}}{\partial{x_{ss}^{a}}},

where F(x,u,θ)=f(xssa(u,θ),u)+Γθxssa(u,θ)=0F(x,u,\theta)=f(x^{a}_{ss}(u,\theta),u)+\Gamma\theta-x^{a}_{ss}(u,\theta)=0. Thus, it is inferred that the existence of noise does not influence the Lipschitz continuous property of the steady-state with respect to θ\theta and the Lipschitz constant is the same as that without noise. Similarly, for Fy=g(xssa(u,θ,d))ya,d=0F_{y}=g(x_{ss}^{a}(u,\theta,d))-y^{a,d}=0, we have the same result. Hence, the proof is completed. ∎

Remark 2.

Lemma 6 is similar to Lemma 3 where the noise dd is independent of the state xx and the injected false data θ\theta. From Lemma 6, we know that the process and measurement noises affect the convergence rate μ\mu^{\prime} but not the Lipschitz constant of the steady-state map. Apparently, μμ\mu\neq\mu^{\prime}. If μ>μ\mu^{\prime}>\mu, the rate that the system converges to the steady state becomes slow (i.e., the noise reduces the convergence rate), which is shown in Fig. 5 in Section VII.

V-B Attack strategy design with noise

With the zeroth-order optimization framework, the model-free attack strategy under the discrete-time system with noise dkd_{k} is designed as

{ϕ~k=pvkδ[Φ(θk,yk+1a,dk)Φ(θk1,yka,dk1)],wk+1=Π𝒰[wkηϕ~k],θk+1=wk+1+δvk+1,\displaystyle\left\{\begin{aligned} &\tilde{\phi}_{k}=\frac{pv_{k}}{\delta}[\Phi(\theta_{k},y_{k+1}^{a},d_{k})-\Phi(\theta_{k-1},y_{k}^{a},d_{k-1})],\\ &w_{k+1}=\Pi_{\mathcal{U}}[w_{k}-\eta\tilde{\phi}_{k}],\\ &\theta_{k+1}=w_{k+1}+\delta v_{k+1},\end{aligned}\right. (27)

where dkd_{k} and dk1d_{k-1} are independent random noises that are sampled at iterations kk and k1k-1, respectively. Different from (10), the existence of noise will also affect the objective function value. Moreover, the function value is not repeatable at different iterations and it is hard to store the noise value at each iteration for computing the function value. Thus, at iteration kk, only one evaluation is possible. In other words, compared to (10), it takes the residual of objective function evaluations between two consecutive stochastic feedback points.

V-C Optimality with the general noise

With the noise dkd_{k}, the following lemma provides the upper bound of 𝔼v[k][V(xka,uk,θk,dk)]\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k},d_{k})] and 𝔼v[k][ϕ~k2]\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}] in this stochastic setting.

Lemma 7.

If Assumptions 2, 5 and 6 hold, with (27), we have

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [V(xka,uk,θk,dk)]\displaystyle[V(x^{a}_{k},u_{k},\theta_{k},d_{k})]
μ𝔼v[k][V(xk1a,uk1,θk1,dk1)]+32c2δ2Mx2\displaystyle\leq\mu^{\prime}\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1},d_{k-1})]+32c_{2}\delta^{2}M^{2}_{x}
+8c2η2Mx2𝔼v[k][ϕ~k12]+16c2σ2.\displaystyle+8c_{2}\eta^{2}M^{2}_{x}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+16c_{2}\sigma^{2}. (28)
Lemma 8.

If Assumptions 2, 5 and 6 hold, with (27), we have

𝔼v[k][ϕ~k2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}] 12η2p2M2δ2𝔼v[k][ϕ~k12]+48p2M2+24p2σ2δ2\displaystyle\leq\frac{12\eta^{2}p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+48p^{2}M^{2}+\frac{24p^{2}\sigma^{2}}{\delta^{2}}
+3μp2My2Mg22c2δ2(𝔼v[k][V(xka,uk,θk,dk)]\displaystyle+\frac{3\mu^{\prime}p^{2}M^{\prime^{2}}_{y}M^{\prime^{2}}_{g}}{2c_{2}\delta^{2}}(\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k},d_{k})]
+𝔼v[k][V(xk1a,uk1,θk1,dk1)]).\displaystyle+\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1},d_{k-1})]). (29)

The proofs of Lemmas 7 and 8 are shown in Appendix IX-E and IX-F, respectively. Different from Lemmas 4 and 5, the internal inherent noise leads to an additional term 16c2σ216c_{2}\sigma^{2} and 24p2σ2δ2\frac{24p^{2}\sigma^{2}}{\delta^{2}}, respectively.

Next, we show the following theorem to characterize the effects of noise dkd_{k} on the optimality of the obtained solutions.

Theorem 2.

Supposing that Assumptions 2, 5 and 6 hold, for any given precision ϵ>0\epsilon>0 such that |Φ~δ(θ)Φ~(θ)|ϵ|\tilde{\Phi}_{\delta}(\theta)-\tilde{\Phi}(\theta)|\leq\epsilon, let δ=ϵM\delta=\frac{\epsilon}{M} and η=κσϵp2T\eta=\frac{\kappa\sigma\epsilon}{p^{2}\sqrt{T}} with 0<κ<p2Tσϵ0<\kappa<\frac{p^{2}\sqrt{T}}{\sigma\epsilon}, then we have

1Tk=1T\displaystyle\frac{1}{T}\sum_{k=1}^{T} 𝔼v[T][Φ~(θk)Φ~(θk)]\displaystyle\mathbb{E}_{v_{[T]}}[\tilde{\Phi}(\theta_{k})-\tilde{\Phi}(\theta_{k}^{*})]
=𝒪(μ(1+μ)σ3p3(1ρ)Tϵ2+μσ2(1ρ)p4),\displaystyle=\mathcal{O}\left(\frac{\sqrt{\mu^{{}^{\prime}}(1+\mu^{\prime)}}\sigma^{3}p^{3}}{(1-\rho^{\prime})\sqrt{T}\epsilon^{2}}+\frac{\mu^{\prime}\sigma^{2}}{(1-\rho^{\prime})p^{4}}\right), (30)

where ρ(0,1)\rho^{\prime}\in(0,1) is the maximum eigenvalue of matrix PP^{\prime} given by (IX-G).

Remark 3.

The proof is shown in Appendix IX-G. As TT\rightarrow\infty, the right side of (2) approaches μσ2(1ρ)p4\frac{\mu^{\prime}\sigma^{2}}{(1-\rho^{\prime})p^{4}}. The nonzero upper bound is related to the dimension pp of the injected false data, the variance of the objective function originating from noise and the convergence rate μ\mu^{\prime}. Compared with (1) in Theorem 1, we also reveal that the existence of noise increases the optimality gap.

VI Discussion

In this part, we show the detailed comparisons among the existing works on the design of the FDI attack strategy in Table II and Table III, and introduce the feasible stealthy attack design. Since the design of the stealthy attack depends on the existence of the original attack detector, the produced stealthy attack strategy could be different due to distinct detection criteria. Moreover, the general assumption on the stealthy attack is that the knowledge of the existing detector is known. Herein, we discuss the following three detection criteria.

TABLE II: The Comparisons among the Existing Works on Model-based attack strategy
Works [16] [17] [5] [8]
System
Model
xk+1=Axk+wkx_{k+1}=Ax_{k}+w_{k}
yk=Cxk+vky_{k}=Cx_{k}+v_{k}
xk+1=Axk+wkx_{k+1}=Ax_{k}+w_{k}
yk=Cxk+vky_{k}=Cx_{k}+v_{k}
xk+1=Axk+Buk+Dζk+wkx_{k+1}=Ax_{k}+Bu_{k}+D{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\zeta_{k}}+w_{k}
yk=Cxk+Eβk+vky_{k}=Cx_{k}+E{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\beta_{k}}+v_{k}
Noise Gaussian distribution with zero mean (i.i.d.)
Stealthy
Metric
Kullback-Leibler
Divergence
Follow sensor-estimator
communicate rate
χ2\chi^{2} detector
Kullback-Leibler
Divergence
The
Objective
Max: Estimation error
by stealthy linear attacks
Degrade estimation quality
by stealthy attacks
Track the desired state
while keeping stealthy
Max: Estimation error
by stealthy attacks
The
Methods
Innovation-based Random theory
Dynamic
programming
Fisher
information matrix
The
Requirements
All system model knowledge
System parameters
without filter gain

If the detection criterion satisfies

yka,dykdyth,\displaystyle\|y^{a,d}_{k}-y^{d}_{k}\|\leq y_{\mathrm{th}}, (31)

the optimality of the obtained solutions in the proposed strategy remains as long as the actual output trajectory yka,dy^{a,d}_{k} meets yka,dy¯k2yth\|y^{a,d}_{k}-\bar{y}_{k}\|\leq 2y_{\mathrm{th}}. Since it is a crude and inaccurate detection for a nonlinear/linear system, it is easy to deal with the stealthy constraint.

If the detection criterion depends on the distribution gap between the normal output value and the compromised output value. For example, Kullback-Leibler divergence [26] is a good tool to measure how well two probability distributions match. Let zk=ykdykz_{k}=y^{d}_{k}-y_{k} and zkz_{k} follows a known distribution. For example, in the linear system with Gaussian noise, the Kalman filter error zkz_{k} is an independent and identically distributed (i.i.d) Gaussian variable with zk𝒩(0,Σ)z_{k}\sim\mathcal{N}(0,\Sigma). Let zka=yka,dykz^{a}_{k}=y^{a,d}_{k}-y_{k} and then the stealthy attacks should meet

D(zkazk)={ξ|f(zka;ξ)>0}f(zka;ξ)logf(zka;ξ)f(zk;ξ)dξϵ,\displaystyle D(z^{a}_{k}\|z_{k})=\int_{\{\xi|f(z^{a}_{k};\xi)>0\}}f(z^{a}_{k};\xi)\mathrm{log}\frac{f(z^{a}_{k};\xi)}{f(z_{k};\xi)}\mathrm{d}\xi\leq\epsilon^{\prime}, (32)

where ϵ>0\epsilon^{\prime}>0 is a given stealthy parameter. In this case, the stealthy constraint can be further simplified when zkaz^{a}_{k} has the same statistical property as zkz_{k}.

If the system adopts the data-driven detector, such as the machine-learning-based detection mechanisms [27, 28, 29] or the behavior-based data-driven detection methods [30], the anomalies can be detected based on the characteristic of the chosen methods. Specifically, the study [27] develops a One-Class Support Vector Machine (OCSVM) algorithm to classify the outlier class. The work [28] proposes the cumulative sum (CUSUM) method to detect the deviations that correspond to anomalies. In [30], a behavior-based χ2\chi^{2} detector was constructed based on a sequence of inputs and outputs and their covariance. When the stealthy attack is familiar with the existing learning-based/behavior-based detectors, the stealthy constraints can be derived and the obtained solutions are restricted in a new constraint set. Thus, the analysis of the updated constraint set is critical to the design of the FDI attack strategy with detectors.

TABLE III: The Comparisons among the Existing Works on Data-driven attack strategy
Works [18] [9] [10] [6]
System
Model
z=Hx+e+az=Hx+e+{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}a}
xk+1=Axk+Buk+Dζk+wkx_{k+1}=Ax_{k}+Bu_{k}+D{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\zeta_{k}}+w_{k}
yk=Cxk+Eβk+wky_{k}=Cx_{k}+E{\color[rgb]{1,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0,0}\beta_{k}}+w_{k}
Noise Gaussian distribution k=0Twk2<\sum_{k=0}^{T}\|w_{k}\|^{2}<\infty Gaussian distribution
Stealthy
Metric
z=H(x+a)+ez^{\prime}=H(x+a)+e Subsapce (H)\mathcal{R}(H)
α\alpha-probability
L2L_{2}-stealthiness
χ2\chi^{2} detector
The
Objective
Design stealthy
FDI attack
Design feasible
unobservable attack
Max: Stealthy
attack’s effects
Design data-driven
undetected attacks
The
Methods
Independent component
analysis (ICA)
Subspace method
Adaptive dynamic
programming (ADP)
Subspace method
The
Requirements
Independent, non-Gaussian
and the full sensors observations
Linear
measurement model
The attack
strategy is linear
Linear
system model

VII Simulation results

In this section, we evaluate the performance of the proposed attack scheme, i.e., the tracking performance and the optimality of solutions without/with noise.

Consider the following system

xk+1=Axk+Buk+dk,yk=Cxk+dk.\displaystyle\begin{split}x_{k+1}=&Ax_{k}+Bu_{k}+d_{k},\\ y_{k}=&Cx_{k}+d_{k}.\end{split} (33)

where uk=Kxku_{k}=-Kx_{k} with K=[1.51.5;0.20.1],A=[01;21],B=[00;10],C=[11]K=[1.5~{}-1.5;0.2~{}0.1],A=[0~{}1;2~{}-1],B=[0~{}0;1~{}0],C=[1~{}1]. It is stable, controllable, and observable. We consider two kinds of noise, including dk𝒩(0,0.02)d_{k}\sim\mathcal{N}(0,0.02) and dkU(0,0.02)d_{k}\sim U(0,0.02). We set the initial state x1=[1;3]x_{1}=[1;-3], the probing signal vk=[cos(k);sin(k)]/2v_{k}=[\mathrm{cos}(k);\mathrm{sin}(k)]/\sqrt{2} to satisfy vk=1\|v_{k}\|=1, and the initial solution w1w_{1} is random and follows the standard uniform distribution. We also set the smoothing parameter δ=103\delta=10^{-3}, the step-size η=7.5×105\eta=7.5\times 10^{-5}, the attack selection matrix Γ=I2\Gamma=I_{2} and the weight matrix Q=3I2Q=3I_{2} where I2I_{2} is a two-dimensional diagonal unit matrix. We define two types of the expected output trajectories, including the static trajectory y¯1=1.5\bar{y}_{1}=-1.5 and dynamic output trajectory y¯2=104k\bar{y}_{2}=10^{-4}k with respect to iteration kk. Each data point in the following figures represents an ensemble average of 5050 trials.

Without noise dkd_{k}, we first analyze the tracking performance with different desired output trajectories. As shown in Fig. 2, the output value of the system under the proposed attack strategy has the ability to track the expected output trajectory whether the trajectory is static or dynamic. Especially, Fig. 2 and Fig. 2 illustrate that the output values fluctuate along the desired trajectory. Note that the phenomenon of fluctuation is normal since the output values are constantly perturbed by the time-varying probing signal vkv_{k}.

Then, we illustrate the optimality of solutions via the optimality gap Φ(θk)Φ(θ)\Phi(\theta_{k})-\Phi(\theta^{*}), which is shown in Fig. 3. When the expected trajectory is static, i.e., y¯1=1.5\bar{y}_{1}=-1.5, we find that the obtained solution is close to the optimal solution and the optimality gap converges to about 0.020.02, as shown in Fig. 3. When the expected trajectory is time-varying, i.e., y¯2=104k\bar{y}_{2}=10^{-4}k, in Fig. 3, the obtained solutions also approach the optimal one and the upper bound of the optimality gap does not exceed 0.110.11. To sum up, the proposed model-free attack strategy can obtain the suboptimal attack signals that drive the output values to the desired output trajectory by only utilizing the real-time compromised measurements.

Refer to caption
Refer to caption
Figure 2: Tracking performance under different expected output trajectories. (a) Static trajectory y¯1=1.5\bar{y}_{1}=-1.5 (b) Dynamic trajectory y¯2=104k\bar{y}_{2}=10^{-4}k
Refer to caption
Refer to caption
Figure 3: The optimality gap under different expected output trajectories. (a) Static trajectory y¯1=1.5\bar{y}_{1}=-1.5 (b) Dynamic trajectory y¯2=104k\bar{y}_{2}=10^{-4}k

With noise dkd_{k}, we analyze its effects on the tracking performance and optimality. The output y2\mathrm{y2} and the optimality gap Φ~(θk)Φ~(θ)\tilde{\Phi}(\theta_{k})-\tilde{\Phi}(\theta^{*}) under uniform distribution noise and normal distribution noise are denoted as y2U\mathrm{y2-U}, y2N\mathrm{y2-N}, PhiU\mathrm{Phi-U} and PhiN\mathrm{Phi-N}, respectively. From boxplot Fig. 4 with iterations k=40000k=40000, we know that the final value of the actual output y2y2 is 44 and the median is 22. In other words, the slope of the dynamic trajectory of the output value is 10410^{-4}, which follows the expected one, and the noise does not influence the tracking trend while adding lots of outliers. In addition, combined with Fig. 5, the average optimality gap (red line in Fig. 4 / blue line in Fig 5) of 5050 trials approaches zero although there are some outliers (red plus in Fig. 4 / pink shadow in Fig. 5). Moreover, the optimality gap is larger than that without noise dkd_{k} and the normal distribution noise has smaller effects than the uniform distribution noise on the optimality gap.

Refer to caption
Figure 4: Comparison of the output and the optimality gap under noise.
Refer to caption
Figure 5: The bound of the optimality gap under noise.

VIII Conclusion

We considered the problem of designing a model-free attack scheme where the adversary with limited capability aims to make the output value follow the desired trajectory without any prior system model information. The designed attack scheme is model-free since only real-time measurements are required. These measurements are used to compute objective function evaluations and gradient estimates are constructed to update the attack signal based on these objective function evaluations at the previous and current time. Moreover, considering the adversary has limited capability, we constrained the obtained solutions within the feasible region by the projected gradient descent method. Finally, we analyzed the optimality of solutions and established its dependence on the dimensions of the attack signal, the iterations, the variance of the objective function, and the convergence rate of the system. Future works include the design of attack strategies with partial observations and specific detector constraints.

IX APPENDIX

IX-A Proof of Lemma 3

Based on Assumptions 11-44, then we have

|eΦ(xk,θk)|2=\displaystyle|e_{\Phi}(x_{k},\theta_{k})|^{2}= |Φ(θk,yk+1a)Φ(θk,h(uk,θk))|2\displaystyle|\Phi(\theta_{k},y^{a}_{k+1})-\Phi(\theta_{k},h(u_{k},\theta_{k}))|^{2}
\displaystyle\leq My2yk+1ah(uk,θk)2\displaystyle M_{y}^{2}\|y^{a}_{k+1}-h(u_{k},\theta_{k})\|^{2}
\displaystyle\leq My2Mg2xk+1axssa(uk,θk)2.\displaystyle M_{y}^{2}M_{g}^{2}\|x^{a}_{k+1}-x^{a}_{ss}(u_{k},\theta_{k})\|^{2}. (34)

Combing (3) with (4), it can be inferred that

xk+1a\displaystyle\|x^{a}_{k+1} xssa(uk,θk)2\displaystyle-x^{a}_{ss}(u_{k},\theta_{k})\|^{2}
1α1V(xk+1a,uk,θk)=1α1V(f(xka,uk,θk),uk,θk)\displaystyle\leq\frac{1}{\alpha_{1}}V(x^{a}_{k+1},u_{k},\theta_{k})=\frac{1}{\alpha_{1}}V(f^{\prime}(x^{a}_{k},u_{k},\theta_{k}),u_{k},\theta_{k})
1α1(V(xka,uk,θk)α3xkaxssa(uk,θk)2)\displaystyle\leq\frac{1}{\alpha_{1}}(V(x^{a}_{k},u_{k},\theta_{k})-\alpha_{3}\|x^{a}_{k}-x^{a}_{ss}(u_{k},\theta_{k})\|^{2})
1α1(1α3α2)V(xka,uk,θk).\displaystyle\leq\frac{1}{\alpha_{1}}(1-\frac{\alpha_{3}}{\alpha_{2}})V(x^{a}_{k},u_{k},\theta_{k}). (35)

Substituting (IX-A) into (IX-A), it is easy to obtain (17).

IX-B Proof of Lemma 4

Based on (3), we have

V(xka,uk,θk)α2xkaxssa(uk,θk)2\displaystyle V(x^{a}_{k},u_{k},\theta_{k})\leq\alpha_{2}\|x^{a}_{k}-x^{a}_{ss}(u_{k},\theta_{k})\|^{2}
=α2xkaxssa(uk1,θk1)+xssa(uk1,θk1)xssa(uk,θk)2\displaystyle=\alpha_{2}\|x^{a}_{k}-x^{a}_{ss}(u_{k-1},\theta_{k-1})+x^{a}_{ss}(u_{k-1},\theta_{k-1})-x^{a}_{ss}(u_{k},\theta_{k})\|^{2}
(s.1)2α2(xkaxssa(uk1,θk1)2+xssa(uk1,θk1)xssa(uk,θk)2)\displaystyle\overset{(s.1)}{\leq}2\alpha_{2}(\|x^{a}_{k}-x^{a}_{ss}(u_{k-1},\theta_{k-1})\|^{2}+\|x^{a}_{ss}(u_{k-1},\theta_{k-1})-x^{a}_{ss}(u_{k},\theta_{k})\|^{2})
(s.2)2α2(1α1(1α3α2V(xk1a,uk1,θk1))+Mx2θkθk12),\displaystyle\overset{(s.2)}{\leq}2\alpha_{2}(\frac{1}{\alpha_{1}}(1-\frac{\alpha_{3}}{\alpha_{2}}V(x^{a}_{k-1},u_{k-1},\theta_{k-1}))+M_{x}^{2}\|\theta_{k}-\theta_{k-1}\|^{2}),

where (s.1)(s.1) follows the fact that a+b22(a2+b2)\|a+b\|^{2}\leq 2(\|a\|^{2}+\|b\|^{2}) and (s.2)(s.2) follows from (2), (IX-A), and the Lipschitz continuity of xssa(uk,θk)x^{a}_{ss}(u_{k},\theta_{k}). The upper bound of 𝔼v[k][θkθk12]\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k-1}\|^{2}] is given as

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [θkθk12]\displaystyle[\|\theta_{k}-\theta_{k-1}\|^{2}]
=𝔼v[k][wkwk1+δvkδvk12]\displaystyle=\mathbb{E}_{v_{[k]}}[\|w_{k}-w_{k-1}+\delta v_{k}-\delta v_{k-1}\|^{2}]
(s.1)𝔼v[k][2wkwk12+2δ2vkvk12]\displaystyle\overset{(s.1)}{\leq}\mathbb{E}_{v_{[k]}}[2\|w_{k}-w_{k-1}\|^{2}+2\delta^{2}\|v_{k}-v_{k-1}\|^{2}]
(s.2)2η2𝔼v[k][ϕ~k12]+2δ2𝔼v[k][2vk2+2vk12]\displaystyle\overset{(s.2)}{\leq}2\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+2\delta^{2}\mathbb{E}_{v_{[k]}}[2\|v_{k}\|^{2}+2\|v_{k-1}\|^{2}]
(s.3)2η2𝔼v[k][ϕ~k12]+8δ2,\displaystyle\overset{(s.3)}{\leq}2\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+8\delta^{2},

where (s.1)(s.1) follows that 𝔼[(a+b)2]2𝔼[a2+b2]\mathbb{E}[(a+b)^{2}]\leq 2\mathbb{E}[a^{2}+b^{2}], (s.2)(s.2) follows from (IV-C) and ab22(a2+b2)\|a-b\|^{2}\leq 2(\|a\|^{2}+\|b\|^{2}), and (s.3)(s.3) follows the fact that vk=1\|v_{k}\|=1 since vkv_{k} is selected uniformly at random from the unit sphere.

Combining the above results, we can infer that (4) holds.

IX-C Proof of Lemma 5

Let Φ(θk)Φ(θk,h(uk,θk))\Phi(\theta_{k})\triangleq\Phi(\theta_{k},h(u_{k},\theta_{k})) and Φ(θk1)Φ(θk1,h(uk1,θk1))\Phi(\theta_{k-1})\triangleq\Phi(\theta_{k-1},h(u_{k-1},\theta_{k-1})). With (10) and (16), then we have

𝔼v[k][ϕ~k2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}]
=p2δ2𝔼v[k][vk(Φ(θk,yk+1a)Φ(θk1,yka))2]\displaystyle=\frac{p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},y^{a}_{k+1})-\Phi(\theta_{k-1},y^{a}_{k}))\|^{2}]
=p2δ2𝔼v[k][vk(Φ(θk)Φ(θk1)\displaystyle=\frac{p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k})-\Phi(\theta_{k-1})
+eΦ(xka,θk)eΦ(xk1a,θk1))2]\displaystyle+e_{\Phi}(x^{a}_{k},\theta_{k})-e_{\Phi}(x^{a}_{k-1},\theta_{k-1}))\|^{2}]
(s.1)3p2δ2𝔼v[k][vk(Φ(θk)Φ(θk1))2]\displaystyle\overset{(s.1)}{\leq}\underbrace{\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k})-\Phi(\theta_{k-1}))\|^{2}]}_{①}
+3p2δ2𝔼v[k][vkeΦ(xka,θk)2]+3p2δ2𝔼v[k][vkeΦ(xk1a,θk1)2],\displaystyle+\underbrace{\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}e_{\Phi}(x^{a}_{k},\theta_{k})\|^{2}]+\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}e_{\Phi}(x^{a}_{k-1},\theta_{k-1})\|^{2}]}_{②},

where (s.1)(s.1) follows the fact that 𝔼[(a+b+c)2]3𝔼[a2+b2+c2]\mathbb{E}[(a+b+c)^{2}]\leq 3\mathbb{E}[a^{2}+b^{2}+c^{2}]. Next, we provide the upper bound of the item and , respectively.

(s.1)\displaystyle①\overset{(s.1)}{\leq} 3p2δ2𝔼v[k][vk(Φ(wk+δvk)Φ(wk1+δvk))\displaystyle\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(w_{k}+\delta v_{k})-\Phi(w_{k-1}+\delta v_{k}))
+Φ(wk1+δvk)Φ(wk1+δvk1)2]\displaystyle+\Phi(w_{k-1}+\delta v_{k})-\Phi(w_{k-1}+\delta v_{k-1})\|^{2}]
=\displaystyle= 6p2δ2𝔼v[k][vk(Φ(wk+δvk)Φ(wk1+δvk))2\displaystyle\frac{6p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(w_{k}+\delta v_{k})-\Phi(w_{k-1}+\delta v_{k}))\|^{2}
+Φ(wk1+δvk)Φ(wk1+δvk1)2]\displaystyle+\|\Phi(w_{k-1}+\delta v_{k})-\Phi(w_{k-1}+\delta v_{k-1})\|^{2}]
(s.2)\displaystyle\overset{(s.2)}{\leq} 6p2M2δ2𝔼v[k][vk2wkwk12\displaystyle\frac{6p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{2}\|w_{k}-w_{k-1}\|^{2}
+vk2δ2vkvk12]\displaystyle+\|v_{k}\|^{2}\delta^{2}\|v_{k}-v_{k-1}\|^{2}]
(s.3)\displaystyle\overset{(s.3)}{\leq} 6p2M2δ2(η2𝔼v[k][ϕ~k12]\displaystyle\frac{6p^{2}M^{2}}{\delta^{2}}(\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]
+δ2𝔼v[k][vk2(2vk2+2vk12)])\displaystyle+\delta^{2}\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{2}(2\|v_{k}\|^{2}+2\|v_{k-1}\|^{2})])
=\displaystyle= 6η2p2M2δ2𝔼v[k][ϕ~k12]+24p2M2,\displaystyle\frac{6\eta^{2}p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+24p^{2}M^{2}, (36)

where (s.1)(s.1) holds by adding and minus Φ(wk1+δvk)\Phi(w_{k-1}+\delta v_{k}), (s.2)(s.2) holds due to Assumption 4 and the dependency of vkv_{k} with respect to wkw_{k}, and (s.3)(s.3) follows the fact that (IV-C) holds and vk=1\|v_{k}\|=1.

=\displaystyle②= 3p2δ2(𝔼v[k][vk2|eΦ(xka,θk)|2]\displaystyle\frac{3p^{2}}{\delta^{2}}(\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{2}|e_{\Phi}(x^{a}_{k},\theta_{k})|^{2}]
+𝔼v[k][vk2]𝔼v[k][|eΦ(xk1a,θk1)|2])\displaystyle+\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{2}]\mathbb{E}_{v_{[k]}}[|e_{\Phi}(x^{a}_{k-1},\theta_{k-1})|^{2}])
(s.1)\displaystyle\overset{(s.1)}{\leq} 3p2δ2(𝔼v[k][vk4])12(𝔼v[k][|eΦ(xka,θk)|4])12\displaystyle\frac{3p^{2}}{\delta^{2}}(\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{4}])^{\frac{1}{2}}(\mathbb{E}_{v_{[k]}}[|e_{\Phi}(x^{a}_{k},\theta_{k})|^{4}])^{\frac{1}{2}}
+3p2δ2𝔼v[k][|eΦ(xk1a,θk1)|2]\displaystyle+\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[|e_{\Phi}(x^{a}_{k-1},\theta_{k-1})|^{2}]
(s.2)\displaystyle\overset{(s.2)}{\leq} 3μp2My2Mg22α2δ2(𝔼v[k][V(xka,uk,θk)]\displaystyle\frac{3\mu p^{2}M_{y}^{2}M_{g}^{2}}{2\alpha_{2}\delta^{2}}(\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})]
+𝔼v[k][V(xk1a,uk1,θk1)]),\displaystyle+\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]), (37)

where (s.1)(s.1) holds based on the Cauchy-Schwarz inequality capable of splitting the product of two correlated random variables vkv_{k} and eΦ(xk,θk)e_{\Phi}(x_{k},\theta_{k}) and (s.2)(s.2) holds based on (17). Combined with the above results, the proof is completed.

IX-D Proof of Theorem 1

Since the objective function Φ(θk)\Phi(\theta_{k}) is convex, the Gaussian smooth approximation of Φ(θk)\Phi(\theta_{k}) is also convex[31]. With (9b), then we have

Φ(θk)Φ(θk)Φδ(θk)Φδ(θk)+2Mδ.\displaystyle\Phi(\theta_{k})-\Phi(\theta_{k}^{*})\leq\Phi_{\delta}(\theta_{k})-\Phi_{\delta}(\theta_{k}^{*})+2M\delta. (38)

With (9c), the Taylor expansion of Φδ(θk)\Phi_{\delta}(\theta_{k}) at solution θk\theta_{k}^{*} is shown as

Φδ(θk)\displaystyle\Phi_{\delta}(\theta_{k})\leq Φδ(θk)+Φδ(θk)T(θkθk)\displaystyle\Phi_{\delta}(\theta_{k}^{*})+\nabla\Phi_{\delta}(\theta_{k}^{*})^{\mathrm{T}}(\theta_{k}-\theta_{k}^{*})
+\displaystyle+ M2p22δ2θkθk2,\displaystyle\frac{M^{2}p^{2}}{2\delta^{2}}\|\theta_{k}-\theta_{k}^{*}\|^{2}, (39)

where θk\theta_{k}^{*} is the optimal solution of the problem 𝒫𝟏\mathbf{\mathcal{P}_{1}} at iteration kk. Taking the expectation of v[k]v_{[k]} at both ends of the inequality (IX-D), then we have

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [Φδ(θk)]𝔼v[k][Φδ(θk)]\displaystyle[\Phi_{\delta}(\theta_{k})]-\mathbb{E}_{v_{[k]}}[\Phi_{\delta}(\theta_{k}^{*})]\leq
𝔼v[k][Φδ(θk)T(θkθk)]+M2p22δ2𝔼v[k][θkθk2].\displaystyle\mathbb{E}_{v_{[k]}}[\nabla\Phi_{\delta}(\theta_{k}^{*})^{\mathrm{T}}(\theta_{k}-\theta_{k}^{*})]+\frac{M^{2}p^{2}}{2\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}].

Since

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [Φδ(θk)T(θkθk)]\displaystyle[\nabla\Phi_{\delta}(\theta_{k}^{*})^{\mathrm{T}}(\theta_{k}-\theta_{k}^{*})]\leq
12(𝔼v[k][Φδ(θk)2]+𝔼v[k][θkθk2])\displaystyle\frac{1}{2}(\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(\theta_{k}^{*})\|^{2}]+\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}])

where the inequality follows the fact that for a1,a2\forall a_{1},a_{2},

𝔼[a1Ta2](𝔼[a12]𝔼[a22])1212(𝔼[a12]+𝔼[a22]),\displaystyle\mathbb{E}[a_{1}^{\mathrm{T}}a_{2}]\leq(\mathbb{E}[\|a_{1}\|^{2}]\mathbb{E}[\|a_{2}\|^{2}])^{\frac{1}{2}}\leq\frac{1}{2}(\mathbb{E}[\|a_{1}\|^{2}]+\mathbb{E}[\|a_{2}\|^{2}]),

then it can be inferred that

𝔼v[k][Φδ(θk)]\displaystyle\mathbb{E}_{v_{[k]}}[\Phi_{\delta}(\theta_{k})] 𝔼v[k][Φδ(θk)]+12𝔼v[k][Φδ(θk)2]\displaystyle\leq\mathbb{E}_{v_{[k]}}[\Phi_{\delta}(\theta_{k}^{*})]+\underbrace{\frac{1}{2}\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(\theta_{k}^{*})\|^{2}]}_{①}
+(12+M2p22δ2)𝔼v[k][θkθk2].\displaystyle+\underbrace{(\frac{1}{2}+\frac{M^{2}p^{2}}{2\delta^{2}})\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}]}_{②}.

Next, we analyze the upper bound of the item and .

=\displaystyle①= 12𝔼v[k][Φδ(wk+δvk)2],\displaystyle\frac{1}{2}\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k}^{*}+\delta v_{k})\|^{2}],
=\displaystyle= 12𝔼v[k][Φδ(wk+δvk)(Φδ(wk+δvk)\displaystyle\frac{1}{2}\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k}+\delta v_{k})-(\nabla\Phi_{\delta}(w_{k}+\delta v_{k})
\displaystyle- Φδ(wk+δvk))2],\displaystyle\nabla\Phi_{\delta}(w_{k}^{*}+\delta v_{k}))\|^{2}],
(s.1)\displaystyle\overset{(s.1)}{\leq} 𝔼v[k][Φδ(wk+δvk)2]+𝔼v[k][Φδ(wk+δvk)\displaystyle\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k}+\delta v_{k})\|^{2}]+\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k}+\delta v_{k})
\displaystyle- Φδ(wk+δvk)2],\displaystyle\nabla\Phi_{\delta}(w_{k}^{*}+\delta v_{k})\|^{2}],
(s.2)\displaystyle\overset{(s.2)}{\leq} 𝔼v[k][Φδ(θk)2]+M2p2δ2𝔼v[k][θkθk2],\displaystyle\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(\theta_{k})\|^{2}]+\frac{M^{2}p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}],
=\displaystyle= 𝔼v[k][Φδ(wk)(Φδ(wk)Φδ(θk))2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k})-(\nabla\Phi_{\delta}(w_{k})-\nabla\Phi_{\delta}(\theta_{k}))\|^{2}]
+\displaystyle+ M2p2δ2𝔼v[k][θkθk2],\displaystyle\frac{M^{2}p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}],
(s.3)\displaystyle\overset{(s.3)}{\leq} 2𝔼v[k][Φδ(wk)2]+2M2p2𝔼v[k][vk2]\displaystyle 2\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k})\|^{2}]+2M^{2}p^{2}\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{2}]
+\displaystyle+ M2p2δ2𝔼v[k][θkθk2],\displaystyle\frac{M^{2}p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}],
=\displaystyle= 2𝔼v[k][Φδ(wk)2]+2M2p2+M2p2δ2𝔼v[k][θkθk2],\displaystyle 2\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k})\|^{2}]+2M^{2}p^{2}+\frac{M^{2}p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\theta_{k}-\theta_{k}^{*}\|^{2}],

where (s.1)(s.1) follows the fact that b2=a(ab)22a2+2ab2\|b\|^{2}=\|a-(a-b)\|^{2}\leq 2\|a\|^{2}+2\|a-b\|^{2}, (s.2)(s.2) follows from (9c), i.e., Φδ(θk)\Phi_{\delta}(\theta_{k}) is Mpδ\frac{Mp}{\delta}- smoothness, and (s.3)(s.3) follows from (9c) and δvk2=δ2vk2\|\delta v_{k}\|^{2}=\delta^{2}\|v_{k}\|^{2}.

=\displaystyle②= (12+M2p22δ2)𝔼v[k][wkwk2],\displaystyle(\frac{1}{2}+\frac{M^{2}p^{2}}{2\delta^{2}})\mathbb{E}_{v_{[k]}}[\|w_{k}-w_{k}^{*}\|^{2}],
(s.1)\displaystyle\overset{(s.1)}{\leq} (12+M2p22δ2)(2𝔼v[k][wk1wk2]+2η2𝔼v[k][ϕ~k12]),\displaystyle(\frac{1}{2}+\frac{M^{2}p^{2}}{2\delta^{2}})(2\mathbb{E}_{v_{[k]}}[\|w_{k-1}-w_{k}^{*}\|^{2}]+2\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]),
(s.2)\displaystyle\overset{(s.2)}{\leq} (12+M2p22δ2)(2𝔼v[k][wk1wk2]+2η2𝔼v[k][ϕ~k12]),\displaystyle(\frac{1}{2}+\frac{M^{2}p^{2}}{2\delta^{2}})(2\mathbb{E}_{v_{[k]}}[\|w_{k-1}-w_{k}\|^{2}]+2\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]),
(s.3)\displaystyle\overset{(s.3)}{\leq} (12+M2p22δ2)4η2𝔼v[k][ϕ~k12],\displaystyle(\frac{1}{2}+\frac{M^{2}p^{2}}{2\delta^{2}})4\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}],
=\displaystyle= (2δ2+2M2p2δ2)η2𝔼v[k][ϕ~k12],\displaystyle(\frac{2\delta^{2}+2M^{2}p^{2}}{\delta^{2}})\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}],

where (s.1)(s.1) follows from (IV-C), (s.2)(s.2) follows that wk1wk2wk1wk2\|w_{k-1}-w_{k}^{*}\|^{2}\leq\|w_{k-1}-w_{k}\|^{2}, and (s.3)(s.3) follows from (IV-C).

The second moment of the gradient of Φδ(wk)\Phi_{\delta}(w_{k}) at solution wkw_{k} is Φδ(wk)2\|\nabla\Phi_{\delta}(w_{k})\|^{2} and we have

𝔼v[k][Φδ(wk)2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k})\|^{2}]
=𝔼v[k][ϕ~k(ϕ~kΦδ(wk))2],\displaystyle=\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}-(\tilde{\phi}_{k}-\nabla\Phi_{\delta}(w_{k}))\|^{2}],
2𝔼v[k][ϕ~k2]+2𝔼v[k][ϕ~kΦδ(wk)2],\displaystyle\leq 2\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}]+2\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}-\nabla\Phi_{\delta}(w_{k})\|^{2}],

where the inequality follows the fact that 𝔼[(ab)2]2(𝔼[a2]+𝔼[b2])\mathbb{E}[(a-b)^{2}]\leq 2(\mathbb{E}[a^{2}]+\mathbb{E}[b^{2}]). Since

𝔼v[k][ϕ~kΦδ(wk)2]𝔼v[k][ϕ~k2],\displaystyle\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}-\nabla\Phi_{\delta}(w_{k})\|^{2}]\leq\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}],

which follows from (58)(58) in [19, Theorem 8], with (5),

𝔼v[k][Φδ(wk)2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\nabla\Phi_{\delta}(w_{k})\|^{2}] 24η2p2M2δ2𝔼v[k][ϕ~k12]+96p2M2\displaystyle\leq\frac{24\eta^{2}p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+96p^{2}M^{2}
+6μp2My2Mg2α2δ2(𝔼v[k][V(xka,uk,θk)]\displaystyle+\frac{6\mu p^{2}M_{y}^{2}M_{g}^{2}}{\alpha_{2}\delta^{2}}(\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})]
+𝔼v[k][V(xk1a,uk1,θk1)]).\displaystyle+\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]).

Rearranging the above items, thus we have

𝔼v[k]\displaystyle\mathbb{E}_{v_{[k]}} [Φδ(θk)]𝔼v[k][Φδ(θk)]\displaystyle[\Phi_{\delta}(\theta_{k})]-\mathbb{E}_{v_{[k]}}[\Phi_{\delta}(\theta_{k}^{*})]
(2+54M2p2δ2)η2𝔼v[k][ϕ~k12]+194M2p2\displaystyle\leq(2+\frac{54M^{2}p^{2}}{\delta^{2}})\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+194M^{2}p^{2}
+12μp2My2Mg2α2δ2(𝔼v[k][V(xka,uk,θk)]\displaystyle+\frac{12\mu p^{2}M_{y}^{2}M_{g}^{2}}{\alpha_{2}\delta^{2}}(\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})]
+𝔼v[k][V(xk1a,uk1,θk1)]).\displaystyle+\mathbb{E}_{v_{[k]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]).

Then, it follows that

1Tk=1T𝔼v[T][Φδ(θk)Φδ(θk)]\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\Phi_{\delta}(\theta_{k})-\Phi_{\delta}(\theta_{k}^{*})]
(2T+54M2p2δ2T)η2k=1T𝔼v[T][ϕ~k12]+194M2p2\displaystyle\leq(\frac{2}{T}+\frac{54M^{2}p^{2}}{\delta^{2}T})\eta^{2}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\|\tilde{\phi}_{k-1}\|^{2}]+194M^{2}p^{2}
+12μp2My2Mg2α2δ2Tk=1T(𝔼v[T][V(xka,uk,θk)]\displaystyle+\frac{12\mu p^{2}M_{y}^{2}M_{g}^{2}}{\alpha_{2}\delta^{2}T}\sum_{k=1}^{T}(\mathbb{E}_{v_{[T]}}[V(x^{a}_{k},u_{k},\theta_{k})]
+𝔼v[T][V(xk1a,uk1,θk1)])\displaystyle+\mathbb{E}_{v_{[T]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]) (40)

To guarantee |Φδ(w)Φ(w)|ϵ|\Phi_{\delta}(w)-\Phi(w)|\leq\epsilon, we set δ=ϵM\delta=\frac{\epsilon}{M}. Combined (4), (38) and (IX-D), we obtain

1Tk=1T𝔼v[T][Φ(θk)Φ(θk)]\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\Phi(\theta_{k})-\Phi(\theta_{k}^{*})]
η2p2δ2T(2δ2p2+54M2+48μMx2My2Mg2)k=1T𝔼v[T][ϕ~k12]\displaystyle\leq\frac{\eta^{2}p^{2}}{\delta^{2}T}(\frac{2\delta^{2}}{p^{2}}+54M^{2}+48\mu M_{x}^{2}M_{y}^{2}M_{g}^{2})\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\|\tilde{\phi}_{k-1}\|^{2}]
+12μ(μ+1)p2My2Mg2α2δ2Tk=1T𝔼v[T][V(xk1a,uk1,θk1)]\displaystyle+\frac{12\mu(\mu+1)p^{2}M_{y}^{2}M_{g}^{2}}{\alpha_{2}\delta^{2}T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]
+192μp2Mx2My2Mg2T+194M2p2+2Mδ.\displaystyle+\frac{192\mu p^{2}M_{x}^{2}M_{y}^{2}M_{g}^{2}}{T}+194M^{2}p^{2}+2M\delta. (41)

Since 𝔼v[k][ϕ~k2]\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}] and 𝔼v[k][V(xka,uk,θk)]\mathbb{E}_{v_{[k]}}[V(x^{a}_{k},u_{k},\theta_{k})] are coupled variables, we rely on [19, Lemma 11], which shows the upper bound of the partial sum of non-negative coupled series, to analyze (IX-D).

Combining (4) and (5), we can obtain a compacted form, which is shown as

[𝔼v[k][ϕ~k2]𝔼v[k][p12p21V(xka,uk,θk)]]\displaystyle\left[\begin{array}[]{c}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}]\\ \mathbb{E}_{v_{[k]}}[\sqrt{\frac{p_{12}}{p_{21}}}V(x^{a}_{k},u_{k},\theta_{k})]\end{array}\right]\preceq
P[𝔼v[k][ϕ~k12]𝔼v[k][p12p21V(xk1a,uk1,θk1)]]+[d1p12p21d2],\displaystyle P\left[\begin{array}[]{c}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]\\ \mathbb{E}_{v_{[k]}}[\sqrt{\frac{p_{12}}{p_{21}}}V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]\end{array}\right]+\left[\begin{array}[]{c}d_{1}\\ \sqrt{\frac{p_{12}}{p_{21}}}d_{2}\end{array}\right],

where P=[p11p12p21p12p21p22]P=\left[\begin{array}[]{cc}p_{11}&\sqrt{p_{12}p_{21}}\\ \sqrt{p_{12}p_{21}}&p_{22}\end{array}\right] with

p11=\displaystyle p_{11}= 6p2η2δ2(M2+μMx2My2Mg2),\displaystyle\frac{6p^{2}\eta^{2}}{\delta^{2}}(M^{2}+\mu M_{x}^{2}M_{y}^{2}M_{g}^{2}),
p12=\displaystyle p_{12}= 3μp2My2Mg22α2δ2(1+μ),\displaystyle\frac{3\mu p^{2}M_{y}^{2}M_{g}^{2}}{2\alpha_{2}\delta^{2}}(1+\mu),
p21=\displaystyle p_{21}= 4α2η2Mx2,\displaystyle 4\alpha_{2}\eta^{2}M_{x}^{2},
p22=\displaystyle p_{22}= μ,\displaystyle\mu,
d1=\displaystyle d_{1}= 24p2(M2+μMx2My2Mg2),\displaystyle 24p^{2}(M^{2}+\mu M_{x}^{2}M_{y}^{2}M_{g}^{2}),
d2=\displaystyle d_{2}= 16α2δ2Mx2.\displaystyle 16\alpha_{2}\delta^{2}M_{x}^{2}. (42)

Then, we have

max{k=1T\displaystyle\max\{\sum_{k=1}^{T} 𝔼v[T][ϕ~k12],k=1T𝔼v[T][p12p21V(xk1a,uk1,θk1)]}\displaystyle\mathbb{E}_{v_{[T]}}[\|\tilde{\phi}_{k-1}\|^{2}],\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\sqrt{\frac{p_{12}}{p_{21}}}V(x^{a}_{k-1},u_{k-1},\theta_{k-1})]\}
\displaystyle\leq (ρT+11ρ)B1+T1ρ(d1+p12p21d2),\displaystyle(\rho^{T}+\frac{1}{1-\rho})B_{1}+\frac{T}{1-\rho}(d_{1}+\sqrt{\frac{p_{12}}{p_{21}}}d_{2}), (43)
=\displaystyle= 𝒪(T1ρ(p2+μp2+pδη))\displaystyle\mathcal{O}\left(\frac{T}{1-\rho}(p^{2}+\mu p^{2}+\frac{p\delta}{\eta})\right) (44)

where B1=𝔼[ϕ~(w1)2]+𝔼[p12p21V(x1a,u1,θ1)]B_{1}=\mathbb{E}[\|\tilde{\phi}({w_{1}})\|^{2}]+\mathbb{E}[\sqrt{\frac{p_{12}}{p_{21}}}V(x^{a}_{1},u_{1},\theta_{1})] and ρ<1\rho<1 is the maximum singular value of the matrix PP.

By solving the characteristic equation |λIP|=0|\lambda I-P|=0 with eigenvalues λ\lambda, then

ρ=\displaystyle\rho= p11+p222+(p11p222)2+p12p21\displaystyle\frac{p_{11}+p_{22}}{2}+\sqrt{(\frac{p_{11}-p_{22}}{2})^{2}+p_{12}p_{21}}
\displaystyle\leq p11+p222+|p11p222|+p12p21\displaystyle\frac{p_{11}+p_{22}}{2}+|\frac{p_{11}-p_{22}}{2}|+\sqrt{p_{12}p_{21}}
=\displaystyle= max{p11,p22}+p12p21.\displaystyle\max\{p_{11},p_{22}\}+\sqrt{p_{12}p_{21}}. (45)

To guarantee ρ<1\rho<1, we need to set δ\delta and η\eta such that

p11+p12p21<1,p22+p12p21<1.\displaystyle p_{11}+\sqrt{p_{12}p_{21}}<1,\quad p_{22}+\sqrt{p_{12}p_{21}}<1. (46)

Then, combined (IX-D) and (IX-D), it follows that

1Tk=1T𝔼v[T][Φ(θk)Φ(θk)]l3+\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\Phi(\theta_{k})-\Phi(\theta_{k}^{*})]\leq l_{3}+
(l1+p21p12l2){(ρT+11ρ)B1+T1ρ(d1+p12p21d2)}\displaystyle(l_{1}+\sqrt{\frac{p_{21}}{p_{12}}}l_{2})\left\{(\rho^{T}+\frac{1}{1-\rho})B_{1}+\frac{T}{1-\rho}(d_{1}+\sqrt{\frac{p_{12}}{p_{21}}}d_{2})\right\} (47)

where

l1=\displaystyle l_{1}= 2η2T+p2δ2T(54M2η2+48μMx2My2Mg2η2),\displaystyle\frac{2\eta^{2}}{T}+\frac{p^{2}}{\delta^{2}T}(54M^{2}\eta^{2}+48\mu M_{x}^{2}M_{y}^{2}M_{g}^{2}\eta^{2}),
l2=\displaystyle l_{2}= 12μ(μ+1)p2My2Mg2α2δ2T,\displaystyle\frac{12\mu(\mu+1)p^{2}M_{y}^{2}M_{g}^{2}}{\alpha_{2}\delta^{2}T},
l3=\displaystyle l_{3}= 192μp2Mx2My2Mg2T+194M2p2+2Mδ.\displaystyle\frac{192\mu p^{2}M_{x}^{2}M_{y}^{2}M_{g}^{2}}{T}+194M^{2}p^{2}+2M\delta.

Due to δ=ϵM\delta=\frac{\epsilon}{M}, we set η=κϵpT\eta=\frac{\kappa\epsilon}{pT} such that p4η2ϵ2\frac{p^{4}\eta^{2}}{\epsilon^{2}} and p2T\frac{p^{2}}{T} have the same order. Then, the order of (IX-D) is shown as (1). The parameter κ\kappa is set to satisfy (46), i.e.,

ξ1κ2+ξ2κ<1,ξ3+ξ2κ<1,\displaystyle\begin{split}\xi_{1}\kappa^{2}+\xi_{2}\kappa&<1,\\ \xi_{3}+\xi_{2}\kappa&<1,\end{split} (48)

where

ξ1\displaystyle\xi_{1} =6M2(M2+μMx2My2Mg2)T2,\displaystyle=\frac{6M^{2}(M^{2}+\mu M_{x}^{2}M_{y}^{2}M_{g}^{2})}{T^{2}},
ξ2\displaystyle\xi_{2} =MMxMyMgT6μ(1+μ),\displaystyle=\frac{MM_{x}M_{y}M_{g}}{T}\sqrt{6\mu(1+\mu)},
ξ3\displaystyle\xi_{3} =μ.\displaystyle=\mu.

The feasible range is denoted by (0,κ)(0,\kappa^{*}). Based on (48), we have

κ=\displaystyle\kappa^{*}= min{ξ2+ξ22+4ξ12ξ1,1ξ3ξ2},\displaystyle\min\left\{\frac{-\xi_{2}+\sqrt{\xi^{2}_{2}+4\xi_{1}}}{2\xi_{1}},\frac{1-\xi_{3}}{\xi_{2}}\right\},
=\displaystyle= 𝒪(min{Tμ(1+μ)μ,(1μ)Tμ(1+μ)}),\displaystyle\mathcal{O}\left(\min\left\{\frac{T\sqrt{\mu(1+\mu)}}{\mu},\frac{(1-\mu)T}{\sqrt{\mu(1+\mu)}}\right\}\right),
ρ=\displaystyle\rho= max{ξ1κ2,ξ3}+ξ2κ,\displaystyle\max\left\{\xi_{1}\kappa^{2},\xi_{3}\right\}+\xi_{2}\kappa,
=\displaystyle= 𝒪(max{(1μ)21+μ,μ}+1μ).\displaystyle\mathcal{O}\left(\max\left\{\frac{(1-\mu)^{2}}{1+\mu},\mu\right\}+1-\mu\right).

IX-E Proof of Lemma 7

The analysis is similar to the proof of Appendix IX-B. First, similar to Lemma 3, we have

|eΦ(xka,θk,dk)|2μMy2Mg22c2V(xka,uk,θk,dk).\displaystyle|e_{\Phi}(x^{a}_{k},\theta_{k},d_{k})|^{2}\leq\frac{\mu^{\prime}M^{2}_{y}M^{2}_{g}}{2c_{2}}V(x^{a}_{k},u_{k},\theta_{k},d_{k}). (49)

Then, it follows that

V(xka,uk,θk,dk)\displaystyle V(x^{a}_{k},u_{k},\theta_{k},d_{k})
2c2(xkaxssa(uk1,θk1,dk1)2\displaystyle\leq 2c_{2}(\|x^{a}_{k}-x^{a}_{ss}(u_{k-1},\theta_{k-1},d_{k-1})\|^{2}
+xssa(uk1,θk1,dk1)xssa(uk,θk,dk)2)\displaystyle+\|x^{a}_{ss}(u_{k-1},\theta_{k-1},d_{k-1})-x^{a}_{ss}(u_{k},\theta_{k},d_{k})\|^{2})
μV(xk1a,uk1,θk1)\displaystyle\leq\mu^{\prime}V(x^{a}_{k-1},u_{k-1},\theta_{k-1})
+2c2(xssa(uk1,θk1,dk1)xssa(uk1,θk1,dk)2\displaystyle+2c_{2}(\|x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k-1})-x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k})\|^{2}
+xssa(uk1,θk1,dk)xssa(uk,θk,dk)2)\displaystyle+\|x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k})-x_{ss}^{a}(u_{k},\theta_{k},d_{k})\|^{2})
μV(xk1a,uk1,θk1)\displaystyle\leq\mu^{\prime}V(x^{a}_{k-1},u_{k-1},\theta_{k-1})
+4c2(xssa(uk1,θk1,dk1)xssa(uk1,θk1,dk)2)\displaystyle+4c_{2}(\|x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k-1})-x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k})\|^{2})
+4c2(xssa(uk1,θk1,dk)xssa(uk,θk,dk)2)\displaystyle+4c_{2}(\|x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k})-x_{ss}^{a}(u_{k},\theta_{k},d_{k})\|^{2})
μV(xk1a,uk1,θk1))+4c2Mx2θkθk12\displaystyle\leq\mu^{\prime}V(x^{a}_{k-1},u_{k-1},\theta_{k-1}))+4c_{2}M_{x}^{2}\|\theta_{k}-\theta_{k-1}\|^{2}
+4c2(xssa(uk1,θk1,dk)xssa(uk,θk,dk)2).\displaystyle+4c_{2}(\|x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k})-x_{ss}^{a}(u_{k},\theta_{k},d_{k})\|^{2}).

Based on Assumption 5, it can be inferred that

𝔼d\displaystyle\mathbb{E}_{d} [(xssa(uk1,θk1,dk1)xssa(uk1,θk1,dk))2]4σ2.\displaystyle[(x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k-1})-x_{ss}^{a}(u_{k-1},\theta_{k-1},d_{k}))^{2}]\leq 4\sigma^{2}.

The upper bound of 𝔼v[k][θkθk12]\mathbb{E}_{v[k]}[\|\theta_{k}-\theta_{k-1}\|^{2}] is given as

𝔼v[k][θkθk12]2η2𝔼v[k][ϕ~k12]+8δ2.\displaystyle\mathbb{E}_{v[k]}[\|\theta_{k}-\theta_{k-1}\|^{2}]\leq 2\eta^{2}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+8\delta^{2}.

Then, 𝔼v[k][V(xka,uk,θk,dk)]\mathbb{E}_{v[k]}[V(x^{a}_{k},u_{k},\theta_{k},d_{k})] can be rewritten as (7).

IX-F Proof of Lemma 8

The analysis is similar to the proof of Appendix IX-C. Let Φ(θk,dk)Φ(θk,h(uk,θk,dk))\Phi(\theta_{k},d_{k})\triangleq\Phi(\theta_{k},h(u_{k},\theta_{k},d_{k})). With (27), then we have

𝔼v[k][ϕ~k2]\displaystyle\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k}\|^{2}]
=p2δ2𝔼v[k][vk(Φ(θk,yk+1a,dk)Φ(θk1,yka,dk1))2]\displaystyle=\frac{p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},y^{a}_{k+1},d_{k})-\Phi(\theta_{k-1},y^{a}_{k},d_{k-1}))\|^{2}]
=p2δ2𝔼v[k][vk(Φ(θk,dk)Φ(θk1,dk1)\displaystyle=\frac{p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k})-\Phi(\theta_{k-1},d_{k-1})
+eΦ(xka,θk,dk)eΦ(xk1a,θk1,dk1))2]\displaystyle+e_{\Phi}(x^{a}_{k},\theta_{k},d_{k})-e_{\Phi}(x^{a}_{k-1},\theta_{k-1},d_{k-1}))\|^{2}]
(s.1)3p2δ2𝔼v[k][vk(Φ(θk,dk)Φ(θk1,dk1))2]\displaystyle\overset{(s.1)}{\leq}\underbrace{\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k})-\Phi(\theta_{k-1},d_{k-1}))\|^{2}]}_{①}
+3p2δ2𝔼v[k][vkeΦ(xka,θk,dk)2+vkeΦ(xk1a,θk1,dk1)2].\displaystyle+\underbrace{\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}e_{\Phi}(x^{a}_{k},\theta_{k},d_{k})\|^{2}+\|v_{k}e_{\Phi}(x^{a}_{k-1},\theta_{k-1},d_{k-1})\|^{2}]}_{②}.

Next, we provide the upper bound of the item and , respectively.

=(s.1)\displaystyle①\overset{(s.1)}{=} 3p2δ2𝔼v[k][vk(Φ(θk,dk)Φ(θk,dk1)\displaystyle\frac{3p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k})-\Phi(\theta_{k},d_{k-1})
+Φ(θk,dk1)Φ(θk1,dk1))2]\displaystyle+\Phi(\theta_{k},d_{k-1})-\Phi(\theta_{k-1},d_{k-1}))\|^{2}]
\displaystyle\leq 6p2δ2𝔼v[k][vk(Φ(θk,dk)Φ(θk,dk1))2]\displaystyle\frac{6p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k})-\Phi(\theta_{k},d_{k-1}))\|^{2}]
+6p2δ2𝔼v[k][vk(Φ(θk,dk1)Φ(θk1,dk1))2]\displaystyle+\frac{6p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k-1})-\Phi(\theta_{k-1},d_{k-1}))\|^{2}]
(s.2)\displaystyle\overset{(s.2)}{\leq} 6p2δ2(𝔼v[k][|Φ(θk,dk1)Φ(θk1,dk1)|4])12(𝔼v[k][vk4])12\displaystyle\frac{6p^{2}}{\delta^{2}}(\mathbb{E}_{v_{[k]}}[|\Phi(\theta_{k},d_{k-1})-\Phi(\theta_{k-1},d_{k-1})|^{4}])^{\frac{1}{2}}(\mathbb{E}_{v_{[k]}}[\|v_{k}\|^{4}])^{\frac{1}{2}}
+6p2δ2𝔼v[k][vk(Φ(θk,dk1)Φ(θk1,dk1))2]\displaystyle+\frac{6p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k-1})-\Phi(\theta_{k-1},d_{k-1}))\|^{2}]
(s.3)\displaystyle\overset{(s.3)}{\leq} 24p2σ2δ2+6p2δ2𝔼v[k][vk(Φ(θk,dk1)Φ(θk1,dk1))2]\displaystyle\frac{24p^{2}\sigma^{2}}{\delta^{2}}+\frac{6p^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|v_{k}(\Phi(\theta_{k},d_{k-1})-\Phi(\theta_{k-1},d_{k-1}))\|^{2}]
(s.4)\displaystyle\overset{(s.4)}{\leq} 24p2σ2δ2+12η2p2M2δ2𝔼v[k][ϕ~k12]+48p2M2,\displaystyle\frac{24p^{2}\sigma^{2}}{\delta^{2}}+\frac{12\eta^{2}p^{2}M^{2}}{\delta^{2}}\mathbb{E}_{v_{[k]}}[\|\tilde{\phi}_{k-1}\|^{2}]+48p^{2}M^{2},

where (s.1)(s.1) holds by adding and minus Φ(θk,dk1)\Phi(\theta_{k},d_{k-1}), (s.2)(s.2) follows from the Cauchy-Schwarz inequality, (s.3)(s.3) follows from Assumption 5 and vk=1\|v_{k}\|=1, (s.4) holds due to the same procedure as (IX-C) of the proof in Appendix IX-C. Similarly, the term follows from (IX-C). Based on the above inequalities, (8) can be obtained.

IX-G Proof of Theorem 2

Following from the same procedure in Appendix IX-D, we have that

1Tk=1T𝔼v[T][Φ~(θk)Φ~(θk)]\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\tilde{\Phi}(\theta_{k})-\tilde{\Phi}(\theta_{k}^{*})]
η2p2δ2T(2δ2p2+98M2+96μMx2My2Mg2)k=1T𝔼v[T][ϕ~k12]\displaystyle\leq\frac{\eta^{2}p^{2}}{\delta^{2}T}(\frac{2\delta^{2}}{p^{2}}+98M^{2}+96\mu^{\prime}M_{x}^{2}M_{y}^{2}M_{g}^{2})\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\|\tilde{\phi}_{k-1}\|^{2}]
+12μ(μ+1)p2My2Mg2c2δ2Tk=1T𝔼v[T][V(xk1a,uk1,θk1,dk1)]\displaystyle+\frac{12\mu^{\prime}(\mu^{\prime}+1)p^{2}M_{y}^{2}M_{g}^{2}}{c_{2}\delta^{2}T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[V(x^{a}_{k-1},u_{k-1},\theta_{k-1},d_{k-1})]
+(192σ2+384δ2Mx2)μp2My2Mg2δ2T+192p2σ2δ2+386p2M2+2Mδ.\displaystyle+\frac{(192\sigma^{2}+384\delta^{2}M_{x}^{2})\mu^{\prime}p^{2}M_{y}^{2}M_{g}^{2}}{\delta^{2}T}+\frac{192p^{2}\sigma^{2}}{\delta^{2}}+386p^{2}M^{2}+2M\delta. (50)

Combining (7) and (8), (IX-D) can be reconstructed as

P=[p11p12p21p12p21p22]\displaystyle P^{\prime}=\left[\begin{array}[]{cc}p_{11}^{\prime}&\sqrt{p_{12}^{\prime}p_{21}^{\prime}}\\ \sqrt{p_{12}p_{21}^{\prime}}&p_{22}^{\prime}\end{array}\right]

with

p11=\displaystyle p_{11}^{\prime}= 12p2η2δ2(M2+μMx2My2Mg2),\displaystyle\frac{12p^{2}\eta^{2}}{\delta^{2}}(M^{2}+\mu^{\prime}M_{x}^{2}M_{y}^{2}M_{g}^{2}),
p12=\displaystyle p_{12}^{\prime}= 3μp2My2Mg22c2δ2(1+μ),\displaystyle\frac{3\mu^{\prime}p^{2}M_{y}^{2}M_{g}^{2}}{2c_{2}\delta^{2}}(1+\mu^{\prime}),
p21=\displaystyle p_{21}^{\prime}= 8c2η2Mx2,\displaystyle 8c_{2}\eta^{2}M_{x}^{2},
p22=\displaystyle p_{22}^{\prime}= μ,\displaystyle\mu^{\prime},
d1=\displaystyle d_{1}^{\prime}= 48p2(M2+μMx2My2Mg2)\displaystyle 48p^{2}(M^{2}+\mu^{\prime}M_{x}^{2}M_{y}^{2}M_{g}^{2})
+24p2σ2(1+μMy2Mg2)+12p2η2μMx2My2Mg2δ2,\displaystyle+\frac{24p^{2}\sigma^{2}(1+\mu^{\prime}M_{y}^{2}M_{g}^{2})+12p^{2}\eta^{2}\mu^{\prime}M_{x}^{2}M_{y}^{2}M_{g}^{2}}{\delta^{2}},
d2=\displaystyle d_{2}^{\prime}= 32c2δ2Mx2+16c2σ2.\displaystyle 32c_{2}\delta^{2}M_{x}^{2}+16c_{2}\sigma^{2}. (51)

Then, we have

max{k=1T\displaystyle\max\{\sum_{k=1}^{T} 𝔼v[T][ϕ~k2],k=1T𝔼v[T][p12p21V(xka,uk,θk,dk)]}\displaystyle\mathbb{E}_{v_{[T]}}[\|\tilde{\phi}_{k}\|^{2}],\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\sqrt{\frac{p_{12}^{\prime}}{p_{21}^{\prime}}}V(x^{a}_{k},u_{k},\theta_{k},d_{k})]\}
\displaystyle\leq (ρT+11ρ)B1+T1ρ(d1+p12p21d2),\displaystyle(\rho^{\prime^{T}}+\frac{1}{1-\rho^{\prime}})B_{1}^{\prime}+\frac{T}{1-\rho^{\prime}}(d_{1}^{\prime}+\sqrt{\frac{p_{12}^{\prime}}{p_{21}^{\prime}}}d_{2}^{\prime}),
=\displaystyle= 𝒪(T1ρ(p2σ2+p2η2δ2+pδ2+pσ2δη)),\displaystyle\mathcal{O}\left(\frac{T}{1-\rho^{\prime}}(\frac{p^{2}\sigma^{2}+p^{2}\eta^{2}}{\delta^{2}}+\frac{p\delta^{2}+p\sigma^{2}}{\delta\eta})\right),

where B1=𝔼[ϕ~(w1)2]+𝔼[p12p21V(x1a,u1,θ1,d1)]B_{1}^{\prime}=\mathbb{E}[\|\tilde{\phi}({w_{1}})\|^{2}]+\mathbb{E}[\sqrt{\frac{p_{12}}{p_{21}}}V(x^{a}_{1},u_{1},\theta_{1},d_{1})] and ρ<1\rho^{\prime}<1 is the maximum singular value of the matrix PP^{\prime}.

With (IX-G) and (IX-G), it follows that

1Tk=1T𝔼v[T][Φ~(θk)Φ~(θk)]l3+\displaystyle\frac{1}{T}\sum_{k=1}^{T}\mathbb{E}_{v_{[T]}}[\tilde{\Phi}(\theta_{k})-\tilde{\Phi}(\theta_{k}^{*})]\leq l_{3}^{\prime}+
(l1+p21p12l2){(ρT+11ρ)B1+T1ρ(d1+p12p21d2)}\displaystyle(l_{1}^{\prime}+\sqrt{\frac{p_{21}^{\prime}}{p_{12}^{\prime}}}l_{2}^{\prime})\left\{(\rho^{\prime^{T}}+\frac{1}{1-\rho^{\prime}})B_{1}^{\prime}+\frac{T}{1-\rho^{\prime}}(d_{1}^{\prime}+\sqrt{\frac{p_{12}^{\prime}}{p_{21}^{\prime}}}d_{2}^{\prime})\right\} (52)

where

l1=\displaystyle l_{1}^{\prime}= 2η2T+p2η2δ2T(98M2+96μMx2My2Mg2),\displaystyle\frac{2\eta^{2}}{T}+\frac{p^{2}\eta^{2}}{\delta^{2}T}(98M^{2}+96\mu^{\prime}M_{x}^{2}M_{y}^{2}M_{g}^{2}),
l2=\displaystyle l_{2}^{\prime}= 12μ(μ+1)p2My2Mg2c2δ2T,\displaystyle\frac{12\mu^{\prime}(\mu^{\prime}+1)p^{2}M_{y}^{2}M_{g}^{2}}{c_{2}\delta^{2}T},
l3=\displaystyle l_{3}^{\prime}= (192σ2+384δ2Mx2)μp2My2Mg2δ2T\displaystyle\frac{(192\sigma^{2}+384\delta^{2}M_{x}^{2})\mu^{\prime}p^{2}M_{y}^{2}M_{g}^{2}}{\delta^{2}T}
+192p2σ2δ2+386p2M2+2Mδ.\displaystyle+\frac{192p^{2}\sigma^{2}}{\delta^{2}}+386p^{2}M^{2}+2M\delta.

Due to δ=ϵM\delta=\frac{\epsilon}{M}, we set η=κσϵp2T\eta=\frac{\kappa\sigma\epsilon}{p^{2}\sqrt{T}} such that p4η2ϵ4\frac{p^{4}\eta^{2}}{\epsilon^{4}} and σ2ϵ2T\frac{\sigma^{2}}{\epsilon^{2}T} have the same order and κ(0,p2Tσϵ)\kappa\in(0,\frac{p^{2}\sqrt{T}}{\sigma\epsilon}) such that η(0,1)\eta\in(0,1). Then, the order of (IX-G) is shown as (2).

Acknowledgments

The authors would like to thank Zhiyu He (now pursuing Ph.D in ETH ) for early inspiring discussions and valuable comments on this topic.

References

  • [1] P. Antsaklis and J. Baillieul, “Special issue on technology of networked control systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 5–8, 2007.
  • [2] R. A. Gupta and M.-Y. Chow, “Networked control system: Overview and research trends,” IEEE Transactions on Industrial Electronics, vol. 57, no. 7, pp. 2527–2535, 2009.
  • [3] X.-M. Zhang, Q.-L. Han, X. Ge, D. Ding, L. Ding, D. Yue, and C. Peng, “Networked control systems: A survey of trends and techniques,” IEEE/CAA Journal of Automatica Sinica, vol. 7, no. 1, pp. 1–17, 2019.
  • [4] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against state estimation in electric power grids,” ACM Transactions on Information and System Security (TISSEC), vol. 14, no. 1, pp. 1–33, 2011.
  • [5] Y. Chen, S. Kar, and J. M. Moura, “Optimal attack strategies subject to detection constraints against cyber-physical systems,” IEEE Transactions on Control of Network Systems, vol. 5, no. 3, pp. 1157–1168, 2017.
  • [6] Z. Zhao, Y. Huang, Z. Zhen, and Y. Li, “Data-driven false data-injection attack design and detection in cyber-physical systems,” IEEE Transactions on Cybernetics, vol. 51, no. 12, pp. 6179–6187, 2020.
  • [7] X.-L. Wang, “Optimal attack strategy against fault detectors for linear cyber-physical systems,” Information Sciences, vol. 581, pp. 390–402, 2021.
  • [8] Q. Zhang, K. Liu, D. Han, G. Su, and Y. Xia, “Design of stealthy deception attacks with partial system knowledge,” IEEE Transactions on Automatic Control, 2022.
  • [9] J. Kim, L. Tong, and R. J. Thomas, “Subspace methods for data attack on state estimation: A data driven approach,” IEEE Transactions on Signal Processing, vol. 63, no. 5, pp. 1102–1114, 2014.
  • [10] L. An and G.-H. Yang, “Data-driven coordinated attack policy design based on adaptive 2\mathcal{L}_{2}-gain optimal theory,” IEEE Transactions on Automatic Control, vol. 63, no. 6, pp. 1850–1857, 2017.
  • [11] R. Alisic, J. Kim, and H. Sandberg, “Model-free undetectable attacks on linear systems using lwe-based encryption,” IEEE Control Systems Letters, vol. 7, pp. 1249–1254, 2023.
  • [12] H. Zhang, P. Cheng, L. Shi, and J. Chen, “Optimal denial-of-service attack scheduling with energy constraint,” IEEE Transactions on Automatic Control, vol. 60, no. 11, pp. 3023–3028, 2015.
  • [13] Y. Zhang, Y. Zhou, K. Ji, and M. M. Zavlanos, “A new one-point residual-feedback oracle for black-box learning and control,” Automatica, vol. 136, p. 110006, 2022.
  • [14] M. Colombino, E. Dall’Anese, and A. Bernstein, “Online optimization as a feedback controller: Stability and tracking,” IEEE Transactions on Control of Network Systems, vol. 7, no. 1, pp. 422–432, 2020.
  • [15] X. Luo, C. Fang, C. Zhao, and J. He, “A model-free false data injection attack strategy in networked control systems,” in IEEE Conference on Decision and Control (CDC), accepted, 2022.
  • [16] Z. Guo, D. Shi, K. H. Johansson, and L. Shi, “Worst-case stealthy innovation-based linear attack on remote state estimation,” Automatica, vol. 89, pp. 117–124, 2018.
  • [17] H. Zhang, P. Cheng, J. Wu, L. Shi, and J. Chen, “Online deception attack against remote state estimation,” IFAC Proceedings Volumes, vol. 47, no. 3, pp. 128–133, 2014.
  • [18] M. Esmalifalak, H. Nguyen, R. Zheng, and Z. Han, “Stealth false data injection using independent component analysis in smart grid,” in 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm).   IEEE, 2011, pp. 244–248.
  • [19] Z. He, S. Bolognani, J. He, F. Dörfler, and X. Guan, “Model-free nonlinear feedback optimization,” arXiv preprint arXiv:2201.02395, 2022.
  • [20] A. L. Dontchev and R. T. Rockafellar, Implicit functions and solution mappings.   Springer, 2009, vol. 543.
  • [21] N. Bof, R. Carli, and L. Schenato, “Lyapunov theory for discrete time systems,” arXiv preprint arXiv:1809.05289, 2018.
  • [22] G. Belgioioso, D. Liao-McPherson, M. H. de Badyn, S. Bolognani, J. Lygeros, and F. Dörfler, “Sampled-data online feedback equilibrium seeking: Stability and tracking,” arXiv preprint arXiv:2103.13988, 2021.
  • [23] P. Jain, P. Kar et al., “Non-convex optimization for machine learning,” Foundations and Trends® in Machine Learning, vol. 10, no. 3-4, pp. 142–363, 2017.
  • [24] A. Nedić and J. Liu, “Distributed optimization for control,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 1, pp. 77–103, 2018.
  • [25] H. Deng, M. Krstic, and R. J. Williams, “Stabilization of stochastic nonlinear systems driven by noise of unknown covariance,” IEEE Transactions on Automatic Control, vol. 46, no. 8, pp. 1237–1253, 2001.
  • [26] S. Kullback, Information theory and statistics.   Courier Corporation, 1997.
  • [27] L. A. Maglaras and J. Jiang, “Intrusion detection in SCADA systems using machine learning techniques,” in Science and Information Conference, 2014, pp. 626–631.
  • [28] J. Goh, S. Adepu, M. Tan, and Z. S. Lee, “Anomaly detection in cyber physical systems using recurrent neural networks,” in IEEE International Symposium on High Assurance Systems Engineering (HASE), 2017, pp. 140–145.
  • [29] E. Anthi, L. Williams, M. Rhode, P. Burnap, and A. Wedgbury, “Adversarial attacks on machine learning cybersecurity defences in industrial control systems,” Journal of Information Security and Applications, vol. 58, p. 102717, 2021.
  • [30] D. Gadginmath, V. Krishnan, and F. Pasqualetti, “Direct vs indirect methods for behavior-based attack detection,” arXiv preprint arXiv:2209.07564, 2022.
  • [31] S. Liu, X. Li, P.-Y. Chen, J. Haupt, and L. Amini, “Zeroth-order stochastic projected gradient descent for nonconvex optimization,” in IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2018, pp. 1179–1183.

Xiaoyu Luo (S’19) received B.E. degree in the Department of Automation from Tianjin University, Tianjin, China, in 2019. She is currently pursuing the Ph.D. degree with the Department of Automation, Shanghai Jiao Tong University, Shanghai, China. She is a member of Intelligent of Wireless Networking and Cooperative Control group. Her research interests include fault-tolerant control in multi-agent systems, cooperative charging in energy storage system and security of cyber-physical systems.

Chrongrong Fang received the B.Sc. degree in automation and the Ph.D. degree in control science and engineering from Zhejiang University, Hangzhou, China, in 2015 and 2020, respectively. He is currently an Assistant Professor with the Department of Automation, Shanghai Jiao Tong University, Shanghai, China. His research interests include anomaly detection and diagnosis in cyber-physical systems and cloud networks.

Jianping He (M’15-SM’19) is currently an associate professor in the Department of Automation at Shanghai Jiao Tong University. He received the Ph.D. degree in control science and engineering from Zhejiang University, Hangzhou, China, in 2013, and had been a research fellow in the Department of Electrical and Computer Engineering at University of Victoria, Canada, from Dec. 2013 to Mar. 2017. His research interests mainly include the distributed learning, control and optimization, security and privacy in network systems. Dr. He serves as an Associate Editor for IEEE Trans. on Control of Network Systems, IEEE Open Journal of Vehicular Technology and KSII Trans. Internet and Information Systems. He was also a Guest Editor of IEEE TAC, IEEE TII, International Journal of Robust and Nonlinear Control, etc. He was the winner of Outstanding Thesis Award, Chinese Association of Automation, 2015. He received the best paper award from IEEE WCSP’17, the best conference paper award from IEEE PESGM’17, the finalist best student paper award from IEEE ICCA’17, and the finalist best conference paper award from IEEE VTC’20-Fall.

Chengcheng Zhao received the PhD degree in control science and engineering from Zhejiang University, Hangzhou, China, in 2018. She is currently a research fellow in the Department of Electrical and Computer Engineering, University of Victoria. Her research interests include consensus and distributed optimization, distributed energy management in smart grids, vehicle platoon, and security and privacy in network systems. She received IEEE PESGM 2017 best conference papers award, and one of her paper was shortlisted in IEEE ICCA 2017 best student paper award finalist. She is a peer reviewer for Automatica, IEEE Transactions on Information Forensics and Security, IEEE Transactions on Industrial Electronics and etc. She was the TPC member for IEEE GLOBECOM 2017, 2018, and IEEE ICC 2018.

Dario Paccagnan is an Assistant Professor and a member of the Computational Optimization Group in the Department of Computing, Imperial College London. Previously, he was a Postdoctoral Fellow with the Mechanical Engineering Department and the Center for Control, Dynamical Systems and Computation, University of California, Santa Barbara. In 2018, Dario obtained a Ph.D. degree from the Information Technology and Electrical Engineering Department, ETH Zurich, Switzerland. He received a B.Sc. and M.Sc. in Aerospace Engineering in 2011 and 2014 from the University of Padova, Italy, and a M.Sc. in Mathematical Modelling and Computation from the Technical University of Denmark in 2014; all with Honors. Dario was a visiting scholar at the University of California, Santa Barbara in 2017, and at Imperial College of London, in 2014. His interests are at the interface between control theory and game theory, with a focus on the design of behavior-influencing mechanisms for socio-technical systems. Applications include multiagent systems and smart cities. Dr. Paccagnan was awarded the ETH medal, and is recipient of the SNSF fellowship for his work in Distributed Optimization and Game Design.