
Distributed Least-Squares Optimization Solvers with Differential Privacy

Weijia Liu, Lei Wang, Fanghong Guo, Zhengguang Wu, Hongye Su This work was supported in part by the National Key R&D Program of China under Grant 2018YFA0703800, in part by Zhejiang Provincial Natural Science Foundation of China under Grant No. LZ23F030008 and in part by the National Natural Science Foundation of China under Grant No. 62203386 and 62373328. W. Liu, L. Wang, Z. Wu and H. Su are with Institute of Cyber-Systems and Control, College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, China (Emails: [email protected]; [email protected]; [email protected]; [email protected]). F. Guo is with Department of Automation, Zhejiang University of Technology, Hangzhou, China (Emails: [email protected]).
Abstract

This paper studies the distributed least-squares optimization problem with a differential privacy requirement on the local cost functions, for which two differentially private distributed solvers are proposed. The first is built on the distributed gradient tracking algorithm, by appropriately perturbing the initial values and the parameters that contain the privacy-sensitive data with Gaussian and truncated Laplacian noises, respectively. Rigorous proofs establish the achievable trade-off between the $(\epsilon,\delta)$-differential privacy and the computation accuracy. The second solver combines the distributed shuffling mechanism with the average consensus algorithm, which enables each agent to obtain a noisy version of the parameters characterizing the global gradient. As a result, the least-squares problem can eventually be solved by each agent locally, in such a way that any given $(\epsilon,\delta)$-differential privacy requirement can be preserved while the solution is computed with an accuracy independent of the network size, which makes the latter solver more suitable for large-scale distributed least-squares problems. Numerical simulations are presented to show the effectiveness of both solvers.

I Introduction

Distributed optimization, in which networked agents seek to minimize a global cost function in a distributed and cooperative fashion, has gained significant attention in diverse fields such as machine learning [1], unmanned aerial vehicles (UAVs) [2], and sensor networks [3]. Since the seminal work on the distributed gradient-descent algorithm [4], numerous algorithms have been developed for this problem, differing in linear or sub-linear convergence rates [5, 6, 7, 8], deterministic or stochastic communication graphs [9, 10], and equality or inequality constraints [11, 12], etc. In particular, the gradient-tracking algorithm [13, 5] has gained increasing attention due to its fast convergence rate, yielding several variants such as [14, 15, 16, 17, 18]. As a special case, distributed least-squares optimization considers computation problems where the cost functions are quadratic, with wide applications across many fields (e.g., versatile LiDAR SLAM [19]). See [20, 21] for a thorough review of recent advances in distributed optimization.

In distributed optimization, each network agent holds a local dataset (defining a local cost function) that may contain information that is privacy-sensitive for the agent. For example, in digital contact tracing for computational epidemiology, users' privacy-sensitive data such as location is encoded in the corresponding optimization problem [22]. In existing distributed optimization algorithms, during the computation each agent needs to communicate to neighboring agents its local states or intermediate variables, which may directly contain the local sensitive data or can be used to infer it. In view of this, new risks of privacy breach arise when eavesdroppers have access to network communication messages.

To address privacy concerns in distributed optimization, a natural idea is to apply homomorphic encryption to transmitted messages [10, 23]. However, this method may incur a heavy computation and communication burden. Other approaches have also been proposed, injecting uncertainties into the exchanged states [24, 25] or into asynchronous heterogeneous stepsizes [26]. To describe privacy preservation rigorously, differential privacy was proposed in [27] and has been widely applied in various fields such as signal processing [28] and distributed computation [25, 29, 30, 31, 32, 33, 34, 35], due to its resilience to side information and post-processing [36]. For a distributed optimization problem with a trusted cloud, a differentially private computation framework is proposed in [35], where the cloud sends a perturbed global gradient to the agents. For the scenario without a trusted cloud, [31] studies distributed optimization problems with bounded gradients and sensitive constraints by injecting noises into communication messages, where the desired privacy is guaranteed by splitting the privacy budget across the iteration rounds. In [33], to preserve differential privacy under an arbitrary number of iterations, the idea of perturbing the objective functions is introduced for a class of distributed convex optimization problems with square-integrable cost functions. Note that, as shown in [33], in these results the introduction of randomness for privacy inevitably leads to a certain computation error, i.e., there is an accuracy-privacy trade-off. When an exact optimal solution is sought, [25, 32] instead slightly modify the adjacency notion of differential privacy and develop novel algorithms by perturbing communication messages with noises of increasing variances and employing an appropriately designed vanishing stepsize.

In this paper, we focus on the distributed least-squares optimization problem with differential privacy requirements on the local cost functions under a common data adjacency notion. For such a problem, a common approach is to perturb the communication messages; however, as in [31], changing the iteration number then changes the privacy budget, yielding an iteration-number-dependent differential privacy. We also note that the idea of functional perturbation [33] is not applicable here, as the cost functions are not square-integrable, though it would remove the dependence on the iteration number.

In view of this analysis, inspired by the truncated Laplace mechanism [37] and the Gaussian mechanism [38], we first propose a Differentially Private Gradient Tracking-based (DP-GT-based) solver, obtained by perturbing the gradient tracking algorithm, i.e., appropriately adding Gaussian and truncated Laplacian noises to the initial values and the parameters that contain the sensitive data, respectively. A trade-off between the achievable $(\epsilon,\delta)$-differential privacy and the computation accuracy is rigorously proved, and the iteration round is allowed to be arbitrary with no influence on the privacy guarantee. For this method, it is noted that the intrinsic property of the truncated Laplacian mechanism [37] limits the achievable differential-privacy level, and the computation accuracy degrades linearly with the network size, which makes it inapplicable to large-scale distributed least-squares problems. In view of this, we further propose a Differentially Private DiShuf Average Consensus-based (DP-DiShuf-AC-based) solver, where agents run the Distributed Shuffling (DiShuf)-based differentially private average consensus algorithm [39] to obtain a noisy estimate of the global gradient function, enabling each agent to solve for the optimal solution independently with differential privacy guarantees. Benefiting from the distributed shuffling mechanism, the least-squares problem can be solved under any desired differential privacy level, with an accuracy independent of the network size, which makes this solver more suitable for large-scale least-squares problems. Our contribution mainly lies in proposing two differentially private least-squares optimization solvers: the DP-GT-based solver and the DP-DiShuf-AC-based solver. In particular, the latter achieves a privacy-accuracy trade-off that is independent of the network size, as shown by numerical simulations.

The remainder of this paper is organized as follows. Section II formulates the differentially private distributed least-squares problem, for which two solvers are presented in Sections III and IV. Simulations are conducted in Section V to verify the effectiveness of the proposed solvers. Finally, Section VI concludes the paper.

Notation. Denote by $\mathbb{R}$ the set of real numbers and by $\mathbb{R}^{n}$ the $n$-dimensional real space for any positive integer $n$. Denote by $\mathbf{1}_{n}\in\mathbb{R}^{n}$ the vector with all entries equal to $1$ and by $\mathbf{0}_{n}\in\mathbb{R}^{n}$ the vector with all entries equal to $0$. For a vector $x\in\mathbb{R}^{n}$, denote by $\|x\|_{0}$, $\|x\|_{1}$, and $\|x\|$ the 0-, 1-, and 2-norms of $x$. For a matrix $A$, denote by $(A)_{ij}$ its $i$-th row, $j$-th column element, by $\|A\|$ its 2-norm, by $\|A\|_{F}$ its Frobenius norm, and by $\lambda_{i}(A)$ its $i$-th eigenvalue. Write $\eta\sim\mathcal{N}(\mu,\sigma^{2})^{r}$ if the entries of $\eta\in\mathbb{R}^{r}$ are i.i.d. and drawn from the Gaussian distribution with mean $\mu$ and variance $\sigma^{2}$, for a positive integer $r$. Define $\Phi(s)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{s}e^{-\tau^{2}/2}\,d\tau$ and $\kappa_{\epsilon}(s)=\Phi(\frac{s}{2}-\frac{\epsilon}{s})-e^{\epsilon}\Phi(-\frac{s}{2}-\frac{\epsilon}{s})$ for any $\epsilon>0$. Denote by $\bar{\kappa}_{\epsilon}(\delta)$ the inverse function of $\kappa_{\epsilon}(s)$ for $s>0$. It can be verified that both $\kappa_{\epsilon}(s)$ and $\bar{\kappa}_{\epsilon}(\delta)$ are strictly increasing.
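For readers wishing to reproduce the noise calibration, a minimal numerical sketch follows (the helper names `Phi`, `kappa`, and `kappa_inv` are ours, not the paper's): it evaluates $\kappa_{\epsilon}(s)$ through the error function and inverts it by bisection, which is valid because $\kappa_{\epsilon}(s)$ is strictly increasing.

```python
# Sketch of the calibration functions: Phi is the standard normal CDF,
# kappa(eps, s) implements kappa_eps(s), and kappa_inv inverts it by
# bisection (kappa_eps increases from 0 to 1 as s goes from 0 to infinity).
import math

def Phi(s: float) -> float:
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def kappa(eps: float, s: float) -> float:
    return Phi(s / 2 - eps / s) - math.exp(eps) * Phi(-s / 2 - eps / s)

def kappa_inv(eps: float, delta: float, lo: float = 1e-9, hi: float = 1e3) -> float:
    while hi - lo > 1e-12:                      # bisection on a monotone map
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if kappa(eps, mid) < delta else (lo, mid)
    return 0.5 * (lo + hi)
```

For instance, the Gaussian-mechanism scale $\sigma_{\eta}\geq\mu/\bar{\kappa}_{\epsilon}(\delta)$ used later can be evaluated as `mu / kappa_inv(eps, delta)`.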

II Problem Statement

Consider a multi-agent system with an undirected and connected communication network $\mathrm{G}=(\mathrm{V},\mathrm{E})$, where the agent set $\mathrm{V}=\{1,2,\dots,n\}$, the edge set $\mathrm{E}\subseteq\mathrm{V}\times\mathrm{V}$, and each agent $i\in\mathrm{V}$ holds a local and privacy-sensitive cost function of the form

$f_{i}(x)=\frac{1}{2}x^{\top}A_{i}x+B_{i}^{\top}x+C_{i},$ (1)

where $A_{i}\in\mathbb{R}^{m\times m}$ is symmetric, $B_{i}\in\mathbb{R}^{m}$, and $C_{i}$ is a scalar. The networked system aims to solve the following least-squares optimization problem

$\min_{x\in\mathbb{R}^{m}}f(x)=\sum_{i=1}^{n}f_{i}(x)$ (2)

in a distributed and privacy-preserving manner. It is well known that the value of $C_{i}$ does not affect the solution of the optimization problem and is generally not used. Thus, in the sequel we only consider the privacy of the matrix pair $(A_{i},B_{i})$ while developing algorithms to solve the problem (2), though the privacy of $f_{i}$ is encoded in the triplet $(A_{i},B_{i},C_{i})$. Denote by $\mathcal{F}$ the mapping that rearranges the elements of the matrix $A_{i}$ into a vector $\theta_{i}^{A}\in\mathbb{R}^{\frac{m(m+1)}{2}}$, i.e., $\theta_{i}^{A}=\mathcal{F}(A_{i})$. Specifically, $\theta_{i}^{A}$ is obtained by placing the element in the $p$-th row and $q$-th column of $A_{i}$, with $q\geq p$, as the $[\frac{(2m-p+2)(p-1)}{2}+q-(p-1)]$-th element. Note that since $A_{i}$ is symmetric, we only put the upper-triangular elements of $A_{i}$ into the mapping $\mathcal{F}$. On the other hand, it is clear that $\mathcal{F}$ is invertible. For convenience, we denote its inverse by $\mathcal{F}^{-1}$, i.e., $A_{i}=\mathcal{F}^{-1}(\theta_{i}^{A})$, and define the sensitive-data vector $\theta_{i}:=[\theta_{i}^{A};B_{i}]\in\mathbb{R}^{\frac{m(m+3)}{2}}$.
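As an illustration, the following small sketch of $\mathcal{F}$ and $\mathcal{F}^{-1}$ (with our assumed helper names `F` and `F_inv`) uses NumPy's row-major upper-triangular indexing, which matches the index formula above.

```python
# Sketch of the vectorization map F and its inverse: F stacks the
# upper-triangular entries of a symmetric m x m matrix row by row into a
# vector of length m(m+1)/2; F_inv mirrors the triangle back.
import numpy as np

def F(A: np.ndarray) -> np.ndarray:
    return A[np.triu_indices(A.shape[0])]      # row-major upper triangle

def F_inv(theta_A: np.ndarray, m: int) -> np.ndarray:
    A = np.zeros((m, m))
    A[np.triu_indices(m)] = theta_A            # fill the upper triangle
    return A + np.triu(A, 1).T                 # mirror to restore symmetry

A = np.array([[2.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 4.0]])
assert np.allclose(F_inv(F(A), 3), A)          # F is invertible on symmetric matrices
```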

To ensure the solvability of the problem (2), we make the following assumption throughout the paper.

Assumption 1.

The matrix $A:=\sum_{i=1}^{n}A_{i}$ is positive definite, i.e., there exists $\underline{\lambda}_{A}>0$ such that $A\geq\underline{\lambda}_{A}\mathbf{I}_{m}$.

The above assumption guarantees strict convexity of the global cost function $f(x)$, admitting the unique least-squares solution

$x^{\ast}=-A^{-1}B\,,$ (3)

where we define $B:=\sum_{i=1}^{n}B_{i}$.
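As a quick centralized baseline on toy random data (our assumption, chosen so that Assumption 1 below holds), (3) can be evaluated with one linear solve rather than an explicit inverse.

```python
# Centralized computation of x* = -A^{-1}B in (3); each A_i = M M^T + I is
# positive definite, hence so is their sum A.
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 3
A_i = [M @ M.T + np.eye(m) for M in rng.standard_normal((n, m, m))]
B_i = [rng.standard_normal(m) for _ in range(n)]
A, B = sum(A_i), sum(B_i)
x_star = np.linalg.solve(A, -B)   # unique solution under Assumption 1
```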

In solving the distributed least-squares problem (2) over the network $\mathrm{G}$, agents necessarily communicate local information with neighboring agents, which may contain the sensitive data $(A_{i},B_{i})$ (i.e., the vector $\theta_{i}$) directly or indirectly. If there exist adversaries with access to the communication messages, the sensitive data $(A_{i},B_{i})$ may be inferred, leading to privacy leakage risks. This motivates the development of distributed algorithms that solve the problem (2) with privacy guarantees on the sensitive data $(A_{i},B_{i})$ against adversaries eavesdropping on all communication messages.

For the privacy concern, this paper focuses on the notion of differential privacy, which originated in [27] and is widely used in the field of distributed computation. Differential privacy describes rigorous privacy preservation in the sense that eavesdroppers cannot infer the sensitive data by way of “differentiating”. Any pair of data $\theta_{i},\theta_{i}^{\prime}\in\mathbb{R}^{\frac{m(m+3)}{2}}$ is said to be $\mu$-adjacent with $\mu>0$, denoted by $(\theta_{i},\theta_{i}^{\prime})\in Adj(\mu)$, if $\|\theta_{i}-\theta_{i}^{\prime}\|_{0}=1$ and $\|\theta_{i}-\theta_{i}^{\prime}\|_{1}\leq\mu$. We denote by $\mathcal{M}$ a mapping (or mechanism) from the sensitive data $\theta_{i}$ to the eavesdropped communication messages, and give the following definition of $(\epsilon,\delta)$-differential privacy [27].

Definition 1.

(Differential Privacy). Given $\epsilon\geq 0$, $\delta\in(0,1)$, the distributed solver preserves $(\epsilon,\delta)$-differential privacy of $\theta_{i}$ under $\mu$-adjacency if, for all $\mathbb{M}\subseteq range(\mathcal{M})$, there holds

$\mathbb{P}(\mathcal{M}(\theta_{i})\in\mathbb{M})\leq e^{\epsilon}\mathbb{P}(\mathcal{M}(\theta_{i}^{\prime})\in\mathbb{M})+\delta$ (4)

for any $(\theta_{i},\theta_{i}^{\prime})\in Adj(\mu)$.

Before closing this section, we make the following claims on the communication network. Denote by $\mathrm{N}_{i}=\{j\,|\,(i,j)\in\mathrm{E}\}$ the neighbor set of agent $i$, and by $w_{ij}$ the weight of edge $(i,j)$, satisfying $w_{ij}=w_{ji}\in(0,1)$ for $j\in\mathrm{N}_{i}$ and $w_{ji}=0$ for $j\notin\mathrm{N}_{i}$. Moreover, denote by $L$ the Laplacian matrix of the graph $\mathrm{G}$, satisfying $(L)_{ij}=-w_{ij}$ for $i\neq j$ and $(L)_{ii}=\sum_{k=1}^{n}w_{ik}\in(0,1)$ for $i\in\mathrm{V}$. Since the graph $\mathrm{G}$ is assumed to be connected, by [40, Theorem 2.8] and the Gershgorin circle theorem [41] there holds $0=\lambda_{1}(L)<\lambda_{2}(L)\leq\dots\leq\lambda_{n}(L)<2$.
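This spectral claim is easy to check numerically; the sketch below (assuming the cycle graph with uniform weight $w_{ij}=0.3$ later used in Section V) builds the weighted Laplacian $L$ and verifies $0=\lambda_{1}(L)<\lambda_{2}(L)\leq\dots\leq\lambda_{n}(L)<2$.

```python
# Weighted Laplacian of a cycle graph and a check of its eigenvalue range.
import numpy as np

n, w = 10, 0.3
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = w   # cycle edges, w_ij = w_ji
L = np.diag(W.sum(axis=1)) - W                  # (L)_ii = sum_k w_ik, (L)_ij = -w_ij
eig = np.sort(np.linalg.eigvalsh(L))
assert abs(eig[0]) < 1e-10 and eig[1] > 0 and eig[-1] < 2
```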

III Gradient-Tracking-based Solver with Differential Privacy

In this section, we modify the distributed gradient tracking algorithm [5] by injecting random noises into the sensitive data, for the purpose of solving the distributed least-squares problem (2) while preserving differential privacy of $\theta_{i}$, $i\in\mathrm{V}$.

Let $\bar{\gamma}=\frac{d}{\sqrt{n}m}\underline{\lambda}_{A}$ for $d\in(0,1)$. For any $\epsilon>0$, we let

$\mu=c\bar{\gamma}\,,\quad 1/2>\delta\geq\frac{e^{\epsilon}-1}{2(e^{\epsilon/c}-1)}$ (5)

for $c\in(0,1)$, and choose $\sigma_{\eta}\geq\mu/\bar{\kappa}_{\epsilon}(\delta)$. With these parameters, we propose the Differentially Private Gradient Tracking-based (DP-GT-based) solver in Algorithm 1.

Algorithm 1: Differentially Private Gradient Tracking-based solver (DP-GT-based solver)

 

Input: Data $\theta_{i}$, privacy budgets $\mu,\epsilon,\delta$, parameters $\bar{\gamma},\sigma_{\eta}$.

  • 1.

    Each agent $i\in\mathrm{V}$ independently generates a vector of i.i.d. truncated Laplacian noises $\gamma_{i}\in\mathbb{R}^{\frac{m(m+1)}{2}}$, with each entry independently generated according to the probability density function

    $p(\gamma)=\begin{cases}\frac{\epsilon}{2\mu(1-e^{-\epsilon\bar{\gamma}/\mu})}e^{-\epsilon|\gamma|/\mu}, & \gamma\in[-\bar{\gamma},\bar{\gamma}]\\ 0, & \text{otherwise},\end{cases}$ (6)

    and i.i.d. Gaussian noises $\eta_{i}\sim\mathcal{N}(0,\sigma_{\eta}^{2})^{m}$.

  • 2.

    Each agent $i\in\mathrm{V}$ computes $G_{i}=\mathcal{F}^{-1}(\theta_{i}^{A}+\gamma_{i})$ and $H_{i}=B_{i}+\eta_{i}$, and initializes the local states as

    $x_{i}(0)=\mathbf{0}_{m}\,,\quad s_{i}(0)=H_{i}.$ (7)
  • 3.

    For $t=0,1,2,\dots$, each agent $i\in\mathrm{V}$ sends $(x_{i}(t),s_{i}(t))$ to neighboring agents $j\in\mathrm{N}_{i}$ and iterates the local states as

    $\begin{aligned} x_{i}(t+1)&=x_{i}(t)+\sum_{j\in\mathrm{N}_{i}}w_{ij}(x_{j}(t)-x_{i}(t))-\beta s_{i}(t)\\ s_{i}(t+1)&=s_{i}(t)+\sum_{j\in\mathrm{N}_{i}}w_{ij}(s_{j}(t)-s_{i}(t))+G_{i}(x_{i}(t+1)-x_{i}(t)). \end{aligned}$ (8)

 

In Algorithm 1, two types of random noises (i.e., truncated Laplacian noises $\gamma_{i}$ and Gaussian noises $\eta_{i}$) are added to the gradient tracking algorithm by directly perturbing the parameters $A_{i}$ (or $\theta_{i}^{A}$) and $B_{i}$, respectively, for the purpose of preserving $(\epsilon,\delta)$-differential privacy of these parameters. In particular, the use of the truncated Laplacian distribution (6) is motivated by [37]. It can be easily verified that noise generated according to (6) has zero mean and variance $\sigma_{\gamma}^{2}$ satisfying

$\sigma_{\gamma}^{2}=\frac{\epsilon}{2\mu(1-e^{-\epsilon\bar{\gamma}/\mu})}\int_{-\bar{\gamma}}^{\bar{\gamma}}\gamma^{2}e^{-\frac{\epsilon|\gamma|}{\mu}}d\gamma=\frac{1}{1-e^{-\epsilon\bar{\gamma}/\mu}}\Big[\frac{2\mu^{2}}{\epsilon^{2}}-e^{-\frac{\epsilon\bar{\gamma}}{\mu}}\Big(\bar{\gamma}^{2}+\frac{2\mu}{\epsilon}\bar{\gamma}+\frac{2\mu^{2}}{\epsilon^{2}}\Big)\Big].$ (9)

Denote $\Omega_{A}=\mathcal{F}^{-1}(\sum_{i=1}^{n}\gamma_{i})$ and $\Omega_{B}=\sum_{i=1}^{n}\eta_{i}$. The following result demonstrates the differential privacy and computation accuracy achieved by the above algorithm.

Theorem 1.

Given any privacy budgets $(\epsilon,\delta,\mu)$ satisfying (5), by letting $\sigma_{\eta}\geq\mu/\bar{\kappa}_{\epsilon}(\delta)$, there exists a $\beta^{\ast}>0$ such that for all $\beta\in(0,\beta^{\ast})$, Algorithm 1 achieves the following properties.

  • i).

    (Privacy) The $(\epsilon,\delta)$-differential privacy of $\theta_{i}$, $i\in\mathrm{V}$, is preserved under $\mu$-adjacency.

  • ii).

    (Convergence) There exists $\alpha_{1}\in(0,1)$ such that $\|x_{i}(t)-x(\infty)\|=\mathcal{O}(\alpha_{1}^{t})$ for all $i\in\mathrm{V}$, with $x(\infty):=-(A+\Omega_{A})^{-1}(B+\Omega_{B})$.

  • iii).

    (Accuracy) The mean-square error between $x(\infty)$ and $x^{\ast}$ satisfies

    $\mathbb{E}\|x(\infty)-x^{\ast}\|^{2}\leq\frac{2nm^{2}\sigma_{\gamma}^{2}\|x^{\ast}\|^{2}+2nm\sigma_{\eta}^{2}}{(1-d)^{2}\underline{\lambda}_{A}^{2}}\,.$
Remark 1.

The iteration (8) in fact solves the least-squares problem $\min_{x\in\mathbb{R}^{m}}\bar{f}(x)=\sum_{i=1}^{n}\frac{1}{2}x^{\top}G_{i}x+H_{i}^{\top}x$. To preserve the convexity of $\bar{f}(x)$ while achieving differential privacy, the truncated Laplacian noises $\gamma_{i}$ are injected with truncation level $\bar{\gamma}<\underline{\lambda}_{A}/(\sqrt{n}m)$. However, due to this constraint on the truncation level, it can be seen from (5) that the achievable privacy budgets $(\epsilon,\delta,\mu)$ cannot be arbitrarily prescribed. Moreover, one can see from statement iii) that the mean-square error $\mathbb{E}\|x(\infty)-x^{\ast}\|^{2}$ grows linearly with the network size $n$, indicating worse computation accuracy on larger networks. These issues motivate us to develop a new solver that can guarantee arbitrarily prescribed privacy budgets $(\epsilon,\delta,\mu)$ while allowing better computation accuracy for large-scale networked least-squares problems.
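For concreteness, the following compact simulation sketch of Algorithm 1 (a toy instance, not the paper's code; it reuses `F`, `F_inv`, `A_i`, `B_i`, `A`, `L`, `m`, `n`, and `kappa_inv` from the earlier snippets, with $d=c=0.9$ as our illustrative choices) draws the truncated Laplacian noise by rejection sampling, which realizes the density (6).

```python
# Simulation sketch of Algorithm 1: noisy data (Step 2) followed by the
# gradient-tracking iteration (8) in stacked matrix form (rows = agents).
import numpy as np

def trunc_laplace(rng, scale, bound, size):
    out = rng.laplace(0.0, scale, size)
    while (bad := np.abs(out) > bound).any():   # resample out-of-range entries
        out[bad] = rng.laplace(0.0, scale, bad.sum())
    return out

rng = np.random.default_rng(1)
eps, delta, beta = 10.0, 0.2, 0.005
lam_A = np.linalg.eigvalsh(A).min()             # lambda_A in Assumption 1
gamma_bar = 0.9 * lam_A / (np.sqrt(n) * m)      # truncation level, d = 0.9
mu = 0.9 * gamma_bar                            # mu = c * gamma_bar, c = 0.9
sigma_eta = mu / kappa_inv(eps, delta)
dim = m * (m + 1) // 2
G = [F_inv(F(Ai) + trunc_laplace(rng, mu / eps, gamma_bar, dim), m) for Ai in A_i]
H = np.stack([Bi + rng.normal(0.0, sigma_eta, m) for Bi in B_i])

x, s = np.zeros((n, m)), H.copy()               # initialization (7)
for _ in range(20000):                          # update (8): x <- x - L x - beta s
    x_new = x - L @ x - beta * s
    s = s - L @ s + np.stack([G[i] @ (x_new[i] - x[i]) for i in range(n)])
    x = x_new
# every row of x is now close to x(infinity) = -(A + Omega_A)^{-1}(B + Omega_B)
```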

IV Average-Consensus-Based Solver with Differential Privacy

In this section, a new distributed solver with differential privacy is proposed to overcome the issues discussed in Remark 1.

We first observe that the least-squares solution $x^{\ast}=-(\sum_{i=1}^{n}A_{i})^{-1}(\sum_{i=1}^{n}B_{i})$ implies that the least-squares problem (2) is solved once each agent obtains the global information $A=\sum_{i=1}^{n}A_{i}$ and $B=\sum_{i=1}^{n}B_{i}$. This intuition immediately motivates us to employ differentially private average consensus algorithms (e.g., [29, 42, 39]) to compute this global information in a distributed manner while preserving the desired differential privacy. Further, we note that the differentially private average consensus algorithm proposed in [42, 39] yields a computation accuracy that almost recovers that of the centralized averaging algorithm, in which the accuracy of computing the sum is independent of the network size.

In view of the previous analysis, we propose a new solver by employing the differentially private average consensus algorithm of [42, 39], whose implementation relies on the following distributed shuffling (DiShuf) mechanism, in which the Paillier cryptosystem [43] is adopted with $\mathrm{E}_{i}(\cdot)$ and $\mathrm{D}_{i}(\cdot)$ the encryption and decryption operations, respectively, and $(k_{pi},k_{si})$ the public and private key pair.

Algorithm 2: Distributed Shuffling (DiShuf) Mechanism

 

Input: Data $\theta_{i}\in\mathbb{R}^{m(m+3)/2}$, variance $\sigma_{\eta}^{2}$, key pairs $(k_{pi},k_{si})$, a large positive integer $\bar{a}\gg 1$.

  • 1.

    Each agent $i$ generates a vector of i.i.d. Gaussian noises $\eta_{i}\sim\mathcal{N}(0,\sigma_{\eta}^{2})^{m(m+3)/2}$ and adds it to $\theta_{i}$, i.e., $\bar{\theta}_{i}:=\theta_{i}+\eta_{i}$.

  • 2.

    Each agent $i$ encrypts each entry of $-\bar{\theta}_{i}$ with the local public key $k_{pi}$, yielding a ciphertext vector $\mathrm{E}_{i}(-\bar{\theta}_{i})$, then sends $\mathrm{E}_{i}(-\bar{\theta}_{i})$ and the public key $k_{pi}$ to neighboring agents $j\in\mathrm{N}_{i}$.

  • 3.

    Each agent $i$ encrypts each entry of $\bar{\theta}_{i}$ with the received neighboring public keys $k_{pj}$, yielding ciphertext vectors $\mathrm{E}_{j}(\bar{\theta}_{i})$, and computes entry-wise the vector $c_{ij}=\mathrm{E}_{j}(\bar{\theta}_{i})\mathrm{E}_{j}(-\bar{\theta}_{j})$ for $j\in\mathrm{N}_{i}$.

  • 4.

    Each agent $i$ generates a positive integer $a_{i\rightarrow j}\in[\frac{\bar{a}}{\sqrt{2}},\bar{a}]$ and computes entry-wise $(c_{ij})^{a_{i\rightarrow j}}$ for each $j\in\mathrm{N}_{i}$.

  • 5.

    Each agent $i$ sends $(c_{ij})^{a_{i\rightarrow j}}$ to $j\in\mathrm{N}_{i}$ and decrypts the received $(c_{ji})^{a_{j\rightarrow i}}$ with the local private key $k_{si}$, yielding $\mathrm{D}_{i}((c_{ji})^{a_{j\rightarrow i}})$ for $j\in\mathrm{N}_{i}$.

  • 6.

    Each agent $i$ multiplies each $\mathrm{D}_{i}((c_{ji})^{a_{j\rightarrow i}})$ by $a_{i\rightarrow j}$, $j\in\mathrm{N}_{i}$, and computes the sum

    $\Delta_{i}=\sum_{j\in\mathrm{N}_{i}}a_{i\rightarrow j}\mathrm{D}_{i}((c_{ji})^{a_{j\rightarrow i}}).$ (10)

Output: $\Delta_{i}$, $i\in\mathrm{V}$

 

With some lengthy but straightforward computations, it can be seen that the output of the DiShuf mechanism (Algorithm 2) satisfies

$\Delta_{i}=\sum_{j\in\mathrm{N}_{i}}a_{i\rightarrow j}a_{j\rightarrow i}(\theta_{j}-\theta_{i}+\eta_{j}-\eta_{i})\,,$

which implies $\sum_{i\in\mathrm{V}}\Delta_{i}=\mathbf{0}_{\frac{m(m+3)}{2}}$. We then introduce the differentially private DiShuf-based average consensus algorithm below.

Algorithm 3: Differentially Private DiShuf Average Consensus-based solver (DP-DiShuf-AC-based solver)

 

Input: Data $\theta_{i}\in\mathbb{R}^{m(m+3)/2}$, variance $\sigma_{\gamma}^{2}$, $\zeta>0$

  • 1.

    Each agent $i$ runs the DiShuf mechanism with input data $\theta_{i}$ to obtain $\Delta_{i}$, then generates a vector of i.i.d. Gaussian noises $\gamma_{i}\sim\mathcal{N}(0,\sigma_{\gamma}^{2})^{m(m+3)/2}$ and initializes its state as

    $y_{i}(0)=\theta_{i}+\zeta\Delta_{i}+\gamma_{i}.$ (11)
  • 2.

    For $t=0,1,2,\dots$, each agent $i$ sends its state $y_{i}(t)$ to its neighboring agents $j\in\mathrm{N}_{i}$ and updates its state following

    $y_{i}(t+1)=y_{i}(t)+\sum_{j\in\mathrm{V}}w_{ij}(y_{j}(t)-y_{i}(t)).$ (12)

 

As $\sum_{i=1}^{n}\Delta_{i}=\mathbf{0}_{\frac{m(m+3)}{2}}$, it can be easily verified that

$\sum_{i=1}^{n}y_{i}(0)=\sum_{i=1}^{n}\theta_{i}+\sum_{i=1}^{n}\gamma_{i}\,,$

implying that the noises $\eta_{i}$ do not affect the average of $y_{i}(0)$, $i\in\mathrm{V}$. Thus, there are two noise variances to design (i.e., $\sigma_{\eta}^{2}$ and $\sigma_{\gamma}^{2}$), with the former affecting only the achievable differential privacy and having no influence on the average. As a result, by appropriately designing $\sigma_{\eta}$ and $\sigma_{\gamma}$, a better trade-off between privacy and accuracy can be achieved.
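This cancellation can be checked in plaintext; the sketch below (our toy check with encryption omitted, so the $a_{i\rightarrow j}a_{j\rightarrow i}$-weighted differences that Algorithm 2 computes under Paillier are formed directly, and with a small $\bar{a}$ for numerical convenience) verifies that $\sum_{i}\Delta_{i}=0$ and hence that the $\eta_{i}$ drop out of $\sum_{i}y_{i}(0)$.

```python
# Plaintext check of the DiShuf arithmetic on a cycle graph: the eta_i
# noises cancel in the sum of the initial states y_i(0) from (11).
import numpy as np

rng = np.random.default_rng(2)
n, dim, a_bar = 10, 9, 100                      # dim = m(m+3)/2 with m = 3
theta = rng.standard_normal((n, dim))
eta = rng.normal(0.0, 1.0, (n, dim))            # sigma_eta^2 = 1 for illustration
a = rng.integers(int(a_bar / np.sqrt(2)) + 1, a_bar + 1, (n, n))   # a_{i->j}
nbrs = [[(i - 1) % n, (i + 1) % n] for i in range(n)]

Delta = np.stack([
    sum(a[i, j] * a[j, i] * (theta[j] - theta[i] + eta[j] - eta[i]) for j in nbrs[i])
    for i in range(n)
])
assert np.allclose(Delta.sum(axis=0), 0.0)      # pairwise terms cancel edge by edge

zeta = 1.0 / (n * a_bar**2 + 1)
gamma = rng.normal(0.0, 0.1, (n, dim))
y0 = theta + zeta * Delta + gamma               # initialization (11)
assert np.allclose(y0.sum(axis=0), (theta + gamma).sum(axis=0))
```

The following result from [39] demonstrates the corresponding privacy and computation properties.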

Lemma 1.

[39, Theorem 3] Suppose Assumption 1 holds, and let $\zeta=\frac{1}{n\bar{a}^{2}+1}$ and $g>0$. Given any privacy budgets $\epsilon>0$, $\delta\in(0,1)$, $\mu>0$, let

$\sigma_{\gamma}=\frac{(1+g)\mu}{\sqrt{n}\bar{\kappa}_{\epsilon}(\delta)}\,,\quad \sigma_{\eta}=\frac{(n-1)\alpha^{2}}{(1-\alpha)^{2}(\bar{\kappa}_{\epsilon}(\delta))^{2}}\Big[\frac{(1+g)^{2}\mu^{2}}{(1+g)^{2}-1}-\frac{(1+g)^{2}\mu^{2}}{n(n-1)\alpha^{2}}\Big]$ (13)

where $\alpha=(1-\frac{1}{(2(n+\bar{a}^{-2}))^{n-1}})^{1/(n-1)}$. Then Algorithm 3 achieves the following properties.

  • i).

    (Privacy) The $(\epsilon,\delta)$-differential privacy of $\theta_{i}$, $i\in\mathrm{V}$, is preserved under $\mu$-adjacency.

  • ii).

    (Convergence) There holds $\|y_{i}(t)-y(\infty)\|=\mathcal{O}(\alpha_{2}^{t})$ for all $i\in\mathrm{V}$, with $y(\infty):=\sum_{i=1}^{n}\theta_{i}/n+\sum_{i=1}^{n}\gamma_{i}/n$ and $\alpha_{2}=\max\{|1-\lambda_{2}(L)|,|1-\lambda_{n}(L)|\}<1$.

  • iii).

    (Accuracy) The mean-square error between $y(\infty)$ and $y^{\ast}:=\sum_{i=1}^{n}\theta_{i}/n$ satisfies

    $\mathbb{E}\|y(\infty)-y^{\ast}\|^{2}=\frac{(1+g)^{2}\mu^{2}}{n^{2}(\bar{\kappa}_{\epsilon}(\delta))^{2}}\,.$

To this end, by employing Algorithm 3 to compute the average of $\theta_{i}:=[\theta_{i}^{A};B_{i}]$, $i\in\mathrm{V}$, in a distributed and differentially private manner, each agent $i\in\mathrm{V}$ can eventually obtain $\hat{\theta}:=ny(\infty)=\sum_{i=1}^{n}\theta_{i}+\sum_{i=1}^{n}\gamma_{i}$ and thus an estimate of the global information $(A,B)$ as $\hat{A}=A+\Omega_{A}$ and $\hat{B}=B+\Omega_{B}$, with $\Omega_{A}=\mathcal{F}^{-1}(\omega_{A})$, where $\omega_{A}$ denotes the vector formed by the first $m(m+1)/2$ entries of $\sum_{i=1}^{n}\gamma_{i}$, and $\Omega_{B}$ the vector formed by the last $m$ entries of $\sum_{i=1}^{n}\gamma_{i}$. Therefore, each agent $i$ can compute the solution $x^{\ast}$ of the least-squares problem (2) by solving for $\hat{x}^{\ast}$ from

$\hat{A}\hat{x}^{\ast}=-\hat{B}.$ (14)

It is noted that, with $A$ invertible, its noisy version $\hat{A}=A+\Omega_{A}$ is invertible in a generic sense; if the resulting $\hat{A}$ is nonetheless singular, one can simply run Algorithm 3 again. It is also noted that $\mathbb{E}\|\hat{\theta}-\sum_{i=1}^{n}\theta_{i}\|^{2}=\frac{(1+g)^{2}\mu^{2}}{(\bar{\kappa}_{\epsilon}(\delta))^{2}}$, which is independent of the network size $n$. A natural conjecture is that the mean-square error between the computed $\hat{x}^{\ast}$ and the actual solution $x^{\ast}$ is also independent of $n$, though deriving an explicit expression of $\mathbb{E}\|\hat{x}^{\ast}-x^{\ast}\|^{2}$ remains open due to the inverse operation on the random matrix $\hat{A}$.
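The final local step then amounts to a single linear solve; a short sketch (reusing `F_inv` from the earlier snippet, with `theta_hat` standing for $n\,y(\infty)$ as obtained from Algorithm 3) is given below.

```python
# Each agent splits theta_hat = n*y(infinity) into (A_hat, B_hat) and solves
# (14); if A_hat happens to be singular, Algorithm 3 is simply rerun.
import numpy as np

def local_solve(theta_hat: np.ndarray, m: int) -> np.ndarray:
    split = m * (m + 1) // 2
    A_hat = F_inv(theta_hat[:split], m)        # noisy estimate of A
    B_hat = theta_hat[split:]                  # noisy estimate of B
    return np.linalg.solve(A_hat, -B_hat)      # x_hat with A_hat x_hat = -B_hat
```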

V Case Studies

In this section, we implement numerical simulations to show the effectiveness of the proposed Algorithms 1 and 3 (i.e., the DP-GT-based and DP-DiShuf-AC-based solvers, respectively) and compare their computation performance. We consider a cycle communication graph with each edge having the same weight $w_{ij}=0.3$ for $(i,j)\in\mathrm{E}$. Throughout this section, $A_{i}\in\mathbb{R}^{3\times 3}$ and $B_{i}\in\mathbb{R}^{3}$ are randomly generated such that $A=\sum_{i=1}^{n}A_{i}>0$.

Let the network size $n=10$ and choose $\bar{\gamma}=3.1$. Given the privacy budgets $\mu=3$, $\epsilon=10$, and $\delta=0.2$, we let $\sigma_{\eta}=\frac{\mu}{\bar{\kappa}_{\epsilon}(\delta)}$ and $\beta=0.005$. Figure 1 shows that the mean-square error $\mathbb{E}[\frac{1}{n}\sum_{i=1}^{n}\|x_{i}(t)-x^{\ast}\|^{2}]$ decreases and converges exponentially as $t$ increases, demonstrating the effectiveness of Algorithm 1.

Figure 1: Trajectories of the mean-square computation error $\mathbb{E}[\frac{1}{n}\sum_{i=1}^{n}\|x_{i}(t)-x^{\ast}\|^{2}]$ of Algorithm 1.

We next show the effectiveness of the DP-DiShuf-AC-based solver. Taking the same $m$, $n$, $\mu$, $\delta$ and choosing $g=0.01$, we vary $\epsilon$ and compute the corresponding $\sigma_{\gamma}$ and $\sigma_{\eta}$. Figure 2 shows the distribution of the computation error $\frac{1}{n}\sum_{i=1}^{n}\|\hat{x}^{\ast}-x^{\ast}\|^{2}$ under different $\epsilon$ with 100 samples. It is observed that Algorithm 3 is resilient to a higher privacy-preservation level (i.e., a smaller $\epsilon$) in terms of computation accuracy.

Figure 2: Box chart of the computation errors $\frac{1}{n}\sum_{i=1}^{n}\|\hat{x}^{\ast}-x^{\ast}\|^{2}$ of Algorithm 3 under different $\epsilon$ with 100 samples.

Further, we study the computation error $\frac{1}{n}\sum_{i=1}^{n}\|\hat{x}^{\ast}-x^{\ast}\|^{2}$ of the DP-DiShuf-AC-based solver (Algorithm 3) and the DP-GT-based solver (Algorithm 1) under varying network sizes $n=10,50,250$. For comparison, we also consider a differentially private average consensus-based solver (DP-AC-based solver), obtained by removing the DiShuf mechanism from Algorithm 3, where Gaussian noises are directly added to the privacy-sensitive data to preserve differential privacy [38]. It can be observed in Figure 3 that Algorithm 3 (i.e., the DP-DiShuf-AC-based solver) shows the best computation accuracy at all network scales.

Figure 3: Box chart of the computation errors $\frac{1}{n}\sum_{i=1}^{n}\|\hat{x}^{\ast}-x^{\ast}\|^{2}$ of the three algorithms under $n=10,50,250$ with 100 samples.

VI Conclusion

In this paper, two solvers were proposed for the distributed least-squares optimization problem with the requirement of preserving differential privacy of the local cost functions. The first solver was built on the distributed gradient-tracking algorithm, perturbing the parameters encoding the sensitive data with truncated Laplacian and Gaussian noises. It was shown that this method limits the achievable differential-privacy level and that the resulting computation error grows linearly with the network size. To overcome these issues, the second solver was developed by employing the differentially private DiShuf-based average consensus algorithm, such that each agent obtains a noisy estimate of the global gradient function and then computes the solution directly. As a result, arbitrary differential privacy requirements can be achieved with a computation accuracy independent of the network size. Simulations showed that the computation error of the former solver strictly increases with the network size, while no such trend was observed for the latter.

-A Proof of Theorem 1

i). For each agent $i\in\mathrm{V}$, consider the following two mechanisms:

$\mathcal{M}_{i}^{A}=\theta_{i}^{A}+\gamma_{i}\,,\quad\mathcal{M}_{i}^{B}=B_{i}+\eta_{i}.$ (15)

By [37, Theorem 1], the truncated Laplacian mechanism $\mathcal{M}_{i}^{A}$ is easily verified to be $(\epsilon,\delta)$-differentially private under $\mu$-adjacency, with $(\epsilon,\delta,\mu)$ satisfying (5) and each entry of $\gamma_{i}$ generated according to (6). Similarly, by [38, Theorem 8], the Gaussian mechanism $\mathcal{M}_{i}^{B}$ is $(\epsilon,\delta)$-differentially private under $\mu$-adjacency, with $\eta_{i}\sim\mathcal{N}(0,\sigma_{\eta}^{2})^{m}$ and $\sigma_{\eta}\geq\mu/\bar{\kappa}_{\epsilon}(\delta)$. With these properties in mind, we turn to the communication messages eavesdropped during the computation, and notice that they are in fact deterministic mappings of the mechanisms $\mathcal{M}_{i}^{A},\mathcal{M}_{i}^{B}$, $i\in\mathrm{V}$. Thus, by the parallel composition [44, Theorem 4] and post-processing [36, Proposition 2.1] properties of differential privacy, we conclude that Algorithm 1 preserves the $(\epsilon,\delta)$-differential privacy of $\theta_{i}$, $i\in\mathrm{V}$.

ii). It is noted from [5] that Algorithm 1 in fact solves the least-squares optimization problem of the form

$\min_{x\in\mathbb{R}^{m}}\bar{f}(x)=\sum_{i=1}^{n}\frac{1}{2}x^{\top}G_{i}x+H_{i}^{\top}x\,.$

More explicitly, by [5, Theorem 1] there exists $\beta^{\ast}>0$ such that each state $x_{i}(t)$ converges to the unique solution, if it exists, with some linear convergence rate $\alpha_{1}\in(0,1)$. With this in mind, we let $G=\sum_{i=1}^{n}G_{i}$ and $H=\sum_{i=1}^{n}H_{i}$, which implies $G=A+\Omega_{A}$ and $H=B+\Omega_{B}$. Noting that each element of $\Omega_{A}$ is a truncated Laplacian noise generated according to (6), it can be seen that $\|\Omega_{A}\|\leq\|\Omega_{A}\|_{F}<\sqrt{n}m\bar{\gamma}$. Then, with Assumption 1, there holds $G=A+\Omega_{A}\geq(1-d)\underline{\lambda}_{A}\mathbf{I}_{m}>0$, implying that $\bar{f}$ is strongly convex and admits the unique solution $x(\infty)=-G^{-1}H$. The proof of statement ii) is thus completed.

iii). We note that

$\begin{aligned} \|x(\infty)-x^{\ast}\|^{2} &= \|(A+\Omega_{A})^{-1}B+(A+\Omega_{A})^{-1}\Omega_{B}-A^{-1}B\|^{2}\\ &\leq 2\|(A+\Omega_{A})^{-1}B-A^{-1}B\|^{2}+2\|(A+\Omega_{A})^{-1}\Omega_{B}\|^{2}\\ &\leq 2\|((A+\Omega_{A})^{-1}-A^{-1})B\|^{2}+2\|(A+\Omega_{A})^{-1}\Omega_{B}\|^{2}\\ &= 2\|(A+\Omega_{A})^{-1}[I-(A+\Omega_{A})A^{-1}]B\|^{2}+2\|(A+\Omega_{A})^{-1}\Omega_{B}\|^{2}\\ &= 2\|(A+\Omega_{A})^{-1}\Omega_{A}A^{-1}B\|^{2}+2\|(A+\Omega_{A})^{-1}\Omega_{B}\|^{2}\\ &= 2\|(A+\Omega_{A})^{-1}\Omega_{A}x^{\ast}\|^{2}+2\|(A+\Omega_{A})^{-1}\Omega_{B}\|^{2}\\ &\leq 2\|(A+\Omega_{A})^{-1}\|^{2}(\|\Omega_{A}x^{\ast}\|^{2}+\|\Omega_{B}\|^{2})\,. \end{aligned}$ (16)

Recalling that $A+\Omega_{A}\geq(1-d)\underline{\lambda}_{A}\mathbf{I}_{m}$, we have $\|(A+\Omega_{A})^{-1}\|\leq\frac{1}{(1-d)\underline{\lambda}_{A}}$. This yields

$\begin{aligned}\|x(\infty)-x^{\ast}\|^{2} &\leq \frac{2}{(1-d)^{2}\underline{\lambda}_{A}^{2}}(\|\Omega_{A}x^{\ast}\|^{2}+\|\Omega_{B}\|^{2})\\ &\leq \frac{2}{(1-d)^{2}\underline{\lambda}_{A}^{2}}(\|x^{\ast}\|^{2}\|\Omega_{A}\|_{F}^{2}+\|\Omega_{B}\|^{2})\,.\end{aligned}$

We proceed to bound $\mathbb{E}\|\Omega_{A}\|_{F}^{2}$ and $\mathbb{E}\|\Omega_{B}\|^{2}$, with the following observations.

  • Since $\Omega_{A}=\mathcal{F}^{-1}(\sum_{i=1}^{n}\gamma_{i})$, each element of the matrix $\Omega_{A}$ is the sum of $n$ i.i.d. truncated Laplacian noises with zero mean and variance $\sigma_{\gamma}^{2}$. Thus, $\mathbb{E}\|\Omega_{A}\|^{2}\leq\mathbb{E}\|\Omega_{A}\|_{F}^{2}=nm^{2}\sigma_{\gamma}^{2}$.

  • Similarly, each element of $\Omega_{B}$ is the sum of $n$ i.i.d. noises drawn from $\mathcal{N}(0,\sigma_{\eta}^{2})$. Then $\mathbb{E}\|\Omega_{B}\|^{2}=nm\sigma_{\eta}^{2}$.

Thus, we have

$\mathbb{E}\|x(\infty)-x^{\ast}\|^{2}\leq\frac{2}{(1-d)^{2}\underline{\lambda}_{A}^{2}}(\|x^{\ast}\|^{2}\mathbb{E}\|\Omega_{A}\|_{F}^{2}+\mathbb{E}\|\Omega_{B}\|^{2})\leq\frac{2}{(1-d)^{2}\underline{\lambda}_{A}^{2}}(nm^{2}\sigma_{\gamma}^{2}\|x^{\ast}\|^{2}+nm\sigma_{\eta}^{2}).$

The proof is thus completed.

References

  • [1] A. Nedic, “Distributed gradient methods for convex machine learning problems in networks: Distributed optimization,” IEEE Signal Processing Magazine, vol. 37, no. 3, pp. 92–101, 2020.
  • [2] L. Yan, J. L. Webber, A. Mehbodniya, B. Moorthy, S. Sivamani, S. Nazir, and M. Shabaz, “Distributed optimization of heterogeneous UAV cluster PID controller based on machine learning,” Computers and Electrical Engineering, vol. 101, p. 108059, 2022.
  • [3] P. Wan and M. D. Lemmon, “Event-triggered distributed optimization in sensor networks,” in 2009 International Conference on Information Processing in Sensor Networks, 2009, pp. 49–60.
  • [4] I. Lobel and A. Ozdaglar, “Distributed subgradient methods for convex optimization over random networks,” IEEE Transactions on Automatic Control, vol. 56, no. 6, pp. 1291–1306, 2011.
  • [5] I. Notarnicola, M. Bin, L. Marconi, and G. Notarstefano, “The gradient tracking is a distributed integral action,” IEEE Transactions on Automatic Control, pp. 1–8, 2023.
  • [6] D. Jakovetić, J. Xavier, and J. M. F. Moura, “Fast distributed gradient methods,” IEEE Transactions on Automatic Control, vol. 59, no. 5, pp. 1131–1146, 2014.
  • [7] A. Nedić and A. Olshevsky, “Distributed optimization over time-varying directed graphs,” IEEE Transactions on Automatic Control, vol. 60, no. 3, pp. 601–615, 2015.
  • [8] S. Yang, Q. Liu, and J. Wang, “Distributed optimization based on a multiagent system in the presence of communication delays,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 47, no. 5, pp. 717–728, 2017.
  • [9] K. Srivastava and A. Nedic, “Distributed asynchronous constrained stochastic optimization,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 4, pp. 772–790, 2011.
  • [10] C. Zhang, M. Ahmad, and Y. Wang, “ADMM based privacy-preserving decentralized optimization,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 3, pp. 565–580, 2019.
  • [11] A. Nedic, A. Ozdaglar, and P. A. Parrilo, “Constrained consensus and optimization in multi-agent networks,” IEEE Transactions on Automatic Control, vol. 55, no. 4, pp. 922–938, 2010.
  • [12] D. Jakovetić, “A unification and generalization of exact distributed first-order methods,” IEEE Transactions on Signal and Information Processing over Networks, vol. 5, no. 1, pp. 31–46, 2019.
  • [13] D. Varagnolo, F. Zanella, A. Cenedese, G. Pillonetto, and L. Schenato, “Newton-Raphson consensus for distributed convex optimization,” IEEE Transactions on Automatic Control, vol. 61, no. 4, pp. 994–1009, 2015.
  • [14] A. Nedic, A. Olshevsky, and W. Shi, “Achieving geometric convergence for distributed optimization over time-varying graphs,” SIAM Journal on Optimization, vol. 27, no. 4, pp. 2597–2633, 2017.
  • [15] J. Xu, S. Zhu, Y. C. Soh, and L. Xie, “Convergence of asynchronous distributed gradient methods over stochastic networks,” IEEE Transactions on Automatic Control, vol. 63, no. 2, pp. 434–448, 2017.
  • [16] G. Scutari and Y. Sun, “Distributed nonconvex constrained optimization over time-varying digraphs,” Mathematical Programming, vol. 176, pp. 497–544, 2019.
  • [17] S. Pu, W. Shi, J. Xu, and A. Nedić, “Push–pull gradient methods for distributed optimization in networks,” IEEE Transactions on Automatic Control, vol. 66, no. 1, pp. 1–16, 2020.
  • [18] S. Pu and A. Nedić, “Distributed stochastic gradient tracking methods,” Mathematical Programming, vol. 187, pp. 409–457, 2021.
  • [19] Y. Pan, P. Xiao, Y. He, Z. Shao, and Z. Li, “MULLS: Versatile LiDAR SLAM via multi-metric linear least square,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 11633–11640.
  • [20] Y. Zheng and Q. Liu, “A review of distributed optimization: Problems, models and algorithms,” Neurocomputing, vol. 483, pp. 446–459, 2022.
  • [21] T. Yang, X. Yi, J. Wu, Y. Yuan, D. Wu, Z. Meng, Y. Hong, H. Wang, Z. Lin, and K. H. Johansson, “A survey of distributed optimization,” Annual Reviews in Control, vol. 47, pp. 278–305, 2019.
  • [22] C.-N. Hang, Y.-Z. Tsai, P.-D. Yu, J. Chen, and C.-W. Tan, “Privacy-enhancing digital contact tracing with machine learning for pandemic response: A comprehensive review,” Big Data and Cognitive Computing, vol. 7, no. 2, 2023.
  • [23] Y. Lu and M. Zhu, “Privacy preserving distributed optimization using homomorphic encryption,” Automatica, vol. 96, pp. 314–325, 2018.
  • [24] S. Mao, Y. Tang, Z. Dong, K. Meng, Z. Y. Dong, and F. Qian, “A privacy preserving distributed optimization algorithm for economic dispatch over time-varying directed networks,” IEEE Transactions on Industrial Informatics, vol. 17, no. 3, pp. 1689–1701, 2021.
  • [25] Y. Wang and A. Nedić, “Tailoring gradient methods for differentially-private distributed optimization,” IEEE Transactions on Automatic Control, pp. 1–16, 2023.
  • [26] Y. Lou, L. Yu, S. Wang, and P. Yi, “Privacy preservation in distributed subgradient optimization algorithms,” IEEE Transactions on Cybernetics, vol. 48, no. 7, pp. 2154–2165, 2018.
  • [27] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography, S. Halevi and T. Rabin, Eds.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 265–284.
  • [28] J. Le Ny and G. J. Pappas, “Differentially private filtering,” IEEE Transactions on Automatic Control, vol. 59, no. 2, pp. 341–354, 2013.
  • [29] E. Nozari, P. Tallapragada, and J. Cortés, “Differentially private average consensus: Obstructions, trade-offs, and optimal algorithm design,” Automatica, vol. 81, pp. 221–231, 2017.
  • [30] M. Ruan, H. Gao, and Y. Wang, “Secure and privacy-preserving consensus,” IEEE Transactions on Automatic Control, vol. 64, no. 10, pp. 4035–4049, 2019.
  • [31] Z. Huang, S. Mitra, and N. Vaidya, “Differentially private distributed optimization,” in Proceedings of the 16th International Conference on Distributed Computing and Networking, ser. ICDCN ’15.   New York, NY, USA: Association for Computing Machinery, 2015.
  • [32] Y. Wang and T. Başar, “Decentralized nonconvex optimization with guaranteed privacy and accuracy,” Automatica, vol. 150, p. 110858, 2023.
  • [33] E. Nozari, P. Tallapragada, and J. Cortés, “Differentially private distributed convex optimization via functional perturbation,” IEEE Transactions on Control of Network Systems, vol. 5, no. 1, pp. 395–408, 2018.
  • [34] S. Han, U. Topcu, and G. J. Pappas, “Differentially private distributed constrained optimization,” IEEE Transactions on Automatic Control, vol. 62, no. 1, pp. 50–64, 2017.
  • [35] M. T. Hale and M. Egerstedt, “Cloud-enabled differentially private multiagent optimization with constraints,” IEEE Transactions on Control of Network Systems, vol. 5, no. 4, pp. 1693–1706, 2018.
  • [36] C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy.” Found. Trends Theor. Comput. Sci., vol. 9, no. 3-4, pp. 211–407, 2014.
  • [37] Q. Geng, W. Ding, R. Guo, and S. Kumar, “Tight analysis of privacy and utility tradeoff in approximate differential privacy,” in Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, ser. Proceedings of Machine Learning Research, S. Chiappa and R. Calandra, Eds., vol. 108.   PMLR, 26–28 Aug 2020, pp. 89–99.
  • [38] B. Balle and Y.-X. Wang, “Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising,” in International Conference on Machine Learning. PMLR, 2018, pp. 394–403.
  • [39] L. Wang, W. Liu, F. Guo, Z. Qiao, and Z. Wu, “Differentially private average consensus with improved accuracy-privacy trade-off,” arXiv:2309.08464, 2023.
  • [40] M. Mesbahi and M. Egerstedt, Graph theoretic methods in multiagent networks.   Princeton University Press, 2010.
  • [41] R. A. Horn and C. R. Johnson, Matrix Analysis.   Cambridge University Press, 1985.
  • [42] Z. Qiao, F. Guo, X. Pan, Y. Sun, and L. Wang, “Distributed load shedding via differentially private average consensus algorithm,” in 2022 34th Chinese Control and Decision Conference (CCDC), 2022, pp. 1503–1508.
  • [43] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in International conference on the theory and applications of cryptographic techniques.   Springer, 1999, pp. 223–238.
  • [44] F. McSherry, “Privacy integrated queries: An extensible platform for privacy-preserving data analysis,” Commun. ACM, vol. 53, no. 9, pp. 89–97, Sep. 2010.