
Locally Differentially Private Analysis of Graph Statistics

Jacob Imola
UC San Diego
Takao Murakami
AIST
Kamalika Chaudhuri
UC San Diego

The first and second authors made equal contributions.
Abstract

Differentially private analysis of graphs is widely used for releasing statistics from sensitive graphs while still preserving user privacy. Most existing algorithms, however, are in a centralized privacy model, where a trusted data curator holds the entire graph. As this model raises a number of privacy and security issues – such as the trustworthiness of the curator and the possibility of data breaches – it is desirable to consider algorithms in a more decentralized local model where no server holds the entire graph.

In this work, we consider a local model, and present algorithms for counting subgraphs – a fundamental task for analyzing the connection patterns in a graph – with LDP (Local Differential Privacy). For triangle counts, we present algorithms that use one and two rounds of interaction, and show that an additional round can significantly improve the utility. For $k$-star counts, we present an algorithm that achieves an order optimal estimation error in the non-interactive local model. We provide new lower-bounds on the estimation error for general graph statistics including triangle counts and $k$-star counts. Finally, we perform extensive experiments on two real datasets, and show that it is indeed possible to accurately estimate subgraph counts in the local differential privacy model.

1 Introduction

Analysis of network statistics is a useful tool for finding meaningful patterns in graph data, such as social, e-mail, citation and epidemiological networks. For example, the average degree (i.e., number of edges connected to a node) in a social graph can reveal the average connectivity. Subgraph counts (e.g., the number of triangles, stars, or cliques) can be used to measure centrality properties such as the clustering coefficient, which represents the probability that two friends of an individual will also be friends of one another [41]. However, the vast majority of graph analytics is carried out on sensitive data, which could be leaked through the results of graph analysis. Thus, there is a need to develop solutions that can analyze these graph properties while still preserving the privacy of individuals in the network.

The standard way to analyze graphs with privacy is through differentially private graph analysis [49, 23, 22]. Differential privacy provides individual privacy against adversaries with arbitrary background knowledge, and has currently emerged as the gold standard for private analytics. However, a vast majority of differentially private graph analysis algorithms are in the centralized (or global) model [13, 15, 16, 27, 33, 35, 42, 48, 49, 52, 59, 58], where a single trusted data curator holds the entire graph and releases sanitized versions of the statistics. By assuming a trusted party that can access the entire graph, it is possible to release accurate graph statistics (e.g., subgraph counts [33, 35, 52], degree distribution [16, 27, 48], spectra [59]) and synthetic graphs [15, 58].

In many applications, however, a single trusted curator may not be practicable due to security or logistical reasons. A centralized data holder is vulnerable to security issues such as data breaches and leaks – a growing threat in recent years [38, 51]. Additionally, decentralized social networks [43, 50] (e.g., Diaspora [5]) have no central server that contains an entire social graph, and instead use many servers all over the world, each containing the data of users who have chosen to register there. Finally, a centralized solution is also not applicable to fully decentralized applications, where the server does not automatically hold information connecting users. An example of this is a mobile application that asks each user how many of her friends she has seen today, and sends noisy counts to a central server. In this application, the server does not hold any individual edge, but can still aggregate the responses to determine the average mobility in an area.

The standard privacy solution that does not assume a trusted third party is LDP (Local Differential Privacy) [20, 34]. This is a special case of DP (Differential Privacy) in the local model, where each user obfuscates her personal data by herself and sends the obfuscated data to a (possibly malicious) data collector. Since the data collector does not hold the original personal data, it does not suffer from data leakage issues. Therefore, LDP has recently attracted attention from both academia [8, 11, 10, 24, 31, 32, 39, 45, 57, 62] and industry [56, 17, 55]. However, the use of LDP has mostly been in the context of tabular data where each row corresponds to an individual, and little attention has been paid to LDP for more complex data such as graphs (see Section 2 for details).

Figure 1: Example of subgraph counts.

In this paper, we consider LDP for graph data, and provide algorithms and theoretical performance guarantees for calculating graph statistics in this model. In particular, we focus on counting triangles and $k$-stars – the most basic and useful subgraphs. A triangle is a set of three nodes with three edges (we exclude automorphisms; i.e., #closed triplets $=3\times$ #triangles). A $k$-star consists of a central node connected to $k$ other nodes. Figure 1 shows an example of triangles and $k$-stars. Counting them is a fundamental task of analyzing the connection patterns in a graph, as the clustering coefficient can be calculated from triangle and $2$-star counts as $\frac{3\times\#\text{triangles}}{\#2\text{-stars}}$ (in Figure 1, $\frac{3\times 5}{20}=0.75$).

When we look to protect privacy of relationship information modeled by edges in a graph, we need to pay attention to the fact that some relationship information could be leaked from subgraph counts. For example, suppose that user (node) $v_2$ in Figure 1 knows all edges connected to $v_2$ and all edges between $v_3,\ldots,v_7$ as background knowledge, and that $v_2$ wants to know who are friends with $v_1$. Then “#2-stars = 20” reveals the fact that $v_1$ has three friends, and “#triangles = 5” reveals the fact that the three friends of $v_1$ are $v_3$, $v_4$, and $v_6$. Moreover, a central server that holds all friendship information (i.e., all edges) may face data breaches, as explained above. Therefore, a private algorithm for counting subgraphs in the local model is highly beneficial to individual privacy.

The main challenge in counting subgraphs in the local model is that existing techniques and their analysis do not directly apply. The existing work on LDP for tabular data assumes that each person’s data is independently and identically drawn from an underlying distribution. In graphs, this is no longer the case; e.g., triangles are not independent of one another, because multiple triangles can involve the same edge; $k$-stars are not independent for the same reason. Moreover, complex inter-dependencies involving multiple people are possible in graphs. For example, each user cannot count triangles involving herself, because she cannot see edges between other users; e.g., user $v_1$ cannot see an edge between $v_3$ and $v_4$ in Figure 1.

We show that although this complex dependency among users introduces challenges, it also presents opportunities. Specifically, the interdependency implies that extra interaction between users and a data collector, in which later queries depend on prior responses, may be helpful. In this work, we investigate this issue and provide algorithms for accurately calculating subgraph counts under LDP.

Our contributions.  In this paper, we provide algorithms and corresponding performance guarantees for counting triangles and $k$-stars in graphs under edge Local Differential Privacy. Specifically, our contributions are as follows:

  • For triangles, we present two algorithms. The first is based on Warner’s RR (Randomized Response) [60] and empirical estimation [31, 39, 57]. We then present a more sophisticated algorithm that uses an additional round of interaction between users and the data collector. We provide upper-bounds on the estimation error for each algorithm, and show that the latter can significantly reduce the estimation error.

  • For $k$-stars, we present a simple algorithm using the Laplacian mechanism. We analyze the upper-bound on the estimation error for this algorithm, and show that it is order optimal in terms of the number of users among all LDP mechanisms that do not use additional interaction.

  • We provide lower-bounds on the estimation error for general graph functions including triangle counts and $k$-star counts in the local model. These lower-bounds exceed known upper-bounds in the centralized model, and illustrate the limitations of the local model compared to the centralized one.

  • Finally, we evaluate our algorithms on two real datasets, and show that it is indeed possible to accurately estimate subgraph counts in the local model. In particular, we show that the interactive algorithm for triangle counts and the Laplacian algorithm for $k$-star counts provide small estimation errors when the number of users is large.

We implemented our algorithms with C/C++, and published them as open-source software [1].

2 Related Work

Graph DP.  DP on graphs has been widely studied, with most prior work being in the centralized model [13, 15, 16, 27, 33, 35, 42, 48, 49, 52, 59, 58]. In this model, a number of algorithms have been proposed for releasing subgraph counts [33, 35, 52], degree distributions [16, 27, 48], eigenvalues and eigenvectors [59], and synthetic graphs [15, 58].

There has also been a handful of work on graph algorithms in the local DP model [46, 53, 63, 64, 65]. For example, Qin et al. [46] propose an algorithm for generating synthetic graphs. Zhang et al. [65] propose an algorithm for software usage analysis under LDP, where a node represents a software component (e.g., a function in code) and an edge represents a control-flow between components. Neither of these works focuses on subgraph counts.

Sun et al. [53] propose an algorithm for counting subgraphs in the local model under the assumption that each user allows her friends to see all her connections. However, this assumption does not hold in many practical scenarios; e.g., a Facebook user can change her setting so that friends cannot see her connections. Therefore, we assume that each user knows only her friends rather than all of her friends’ friends. The algorithms in [53] cannot be applied to this setting.

Ye et al. [63] propose a one-round algorithm for estimating graph metrics including the clustering coefficient. They apply Warner’s RR (Randomized Response) to an adjacency matrix. However, this approach introduces a very large bias into triangle counts. In [64], they propose a method for reducing the bias in the estimate of triangle counts. However, the method in [64] introduces some approximation, and it is unclear whether their estimate is unbiased. In this paper, we propose a one-round algorithm for triangles that uses empirical estimation as a post-processing step, and prove that our estimate is unbiased. We also show in Appendix A that our one-round algorithm significantly outperforms the one-round algorithm in [63]. Moreover, we show in Section 5 that our two-rounds algorithm significantly outperforms our one-round algorithm.

Our work also differs from [53, 63, 64] in that we provide lower-bounds on the estimation error.

LDP.  Apart from graphs, a number of works have looked at analyzing statistics (e.g., discrete distribution estimation [8, 24, 31, 32, 39, 57, 62], heavy hitters [11, 10, 45]) under LDP.

However, they use LDP in the context of tabular data, and do not consider the kind of complex interdependency in graph data (as described in Section 1). For example, the RR with empirical estimation is optimal in the low privacy regimes for estimating a distribution for tabular data [31, 32]. We apply the RR and empirical estimation to counting triangles, and show that it is suboptimal and significantly outperformed by a more sophisticated two-rounds algorithm.

Upper/lower-bounds.  Finally, we note that existing work on upper-bounds and lower-bounds cannot be directly applied to our setting. For example, there are upper-bounds [8, 31, 32, 62, 29, 28] and lower-bounds [7, 21, 19, 29, 18, 28, 30] on the estimation error (or sample complexity) in distribution estimation of tabular data. However, they assume that each original data value is independently sampled from an underlying distribution. They cannot be directly applied to our graph setting, because each triangle and each $k$-star involve multiple edges and are not independent (as described in Section 1). Rashtchian et al. [47] provide lower-bounds on the communication complexity (i.e., number of queries) of vector-matrix-vector queries for estimating subgraph counts. However, their lower-bounds are not on the estimation error, and cannot be applied to our problem.

3 Preliminaries

3.1 Graphs and Differential Privacy

Graphs.  Let $\mathbb{N}$, $\mathbb{Z}_{\geq 0}$, $\mathbb{R}$, and $\mathbb{R}_{\geq 0}$ be the sets of natural numbers, non-negative integers, real numbers, and non-negative real numbers, respectively. For $a\in\mathbb{N}$, let $[a]=\{1,2,\ldots,a\}$.

We consider an undirected graph $G=(V,E)$, where $V$ is a set of nodes (i.e., users) and $E$ is a set of edges. Let $n\in\mathbb{N}$ be the number of users in $V$, and let $v_i\in V$ be the $i$-th user; i.e., $V=\{v_1,\ldots,v_n\}$. An edge $(v_i,v_j)\in E$ represents a relationship between users $v_i\in V$ and $v_j\in V$. The number of edges connected to a node is called the degree of the node. Let $d_{max}\in\mathbb{N}$ be the maximum degree (i.e., maximum number of edges connected to a node) in graph $G$. Let $\mathcal{G}$ be the set of possible graphs $G$ on $n$ users. A graph $G\in\mathcal{G}$ can be represented as a symmetric adjacency matrix $\mathbf{A}=(a_{i,j})\in\{0,1\}^{n\times n}$, where $a_{i,j}=1$ if $(v_i,v_j)\in E$ and $a_{i,j}=0$ otherwise.

Types of DP.   DP (Differential Privacy) [23, 22] is known as a gold standard for data privacy. Depending on the underlying architecture, DP can be divided into two types: centralized DP and LDP (Local DP). Centralized DP assumes the centralized model, where a “trusted” data collector collects the original personal data from all users and obfuscates the answer to a query (e.g., a counting query, a histogram query) over the set of personal data. LDP assumes the local model, where each user does not trust even the data collector. In this model, each user obfuscates her personal data by herself and sends the obfuscated data to the data collector.

If the data are represented as a graph, we can consider two types of DP: edge DP and node DP [27, 49]. Edge DP considers two neighboring graphs $G,G'\in\mathcal{G}$ that differ in one edge. In contrast, node DP considers two neighboring graphs $G,G'\in\mathcal{G}$ in which $G'$ is obtained from $G$ by adding or removing one node along with its adjacent edges.

Although Zhang et al. [65] consider node DP in the local model where each node represents a software component, we consider a totally different problem where each node represents a user. In the latter case, node DP requires us to hide the existence of each user along with all her edges. However, many applications in the local model send the identity of each user to a server. For example, we can consider a mobile application that sends to a server how many friends a user met today along with her user ID. In this case, the user may not mind sending her user ID, but may want to hide her edge information (i.e., who she met today). Although we cannot use node DP in such applications, we can use edge DP to deny the presence/absence of each edge (friend). Thus we focus on edge DP in the same way as [46, 53, 63, 64].

Below we explain edge DP in the centralized model.

Centralized DP.  We call edge DP in the centralized model edge centralized DP. Formally, it is defined as follows:

Definition 1 ($\varepsilon$-edge centralized DP).

Let $\varepsilon\in\mathbb{R}_{\geq 0}$. A randomized algorithm $\mathcal{M}$ with domain $\mathcal{G}$ provides $\varepsilon$-edge centralized DP if for any two neighboring graphs $G,G'\in\mathcal{G}$ that differ in one edge and any $S\subseteq\mathrm{Range}(\mathcal{M})$,

$$\Pr[\mathcal{M}(G)\in S]\leq e^{\varepsilon}\Pr[\mathcal{M}(G')\in S]. \qquad (1)$$

Edge centralized DP guarantees that an adversary who has observed the output of $\mathcal{M}$ cannot determine whether it came from $G$ or $G'$ with a certain degree of confidence. The parameter $\varepsilon$ is called the privacy budget. If $\varepsilon$ is close to zero, then $G$ and $G'$ are almost equally likely, which means that an edge in $G$ is strongly protected.

We also note that edge DP can be used to protect $k\in\mathbb{N}$ edges by using the notion of group privacy [23]. Specifically, if $\mathcal{M}$ provides $\varepsilon$-edge centralized DP, then for any two graphs $G,G'\in\mathcal{G}$ that differ in $k$ edges and any $S\subseteq\mathrm{Range}(\mathcal{M})$, we obtain: $\Pr[\mathcal{M}(G)\in S]\leq e^{k\varepsilon}\Pr[\mathcal{M}(G')\in S]$; i.e., $k$ edges are protected with privacy budget $k\varepsilon$.

3.2 Local Differential Privacy

LDP (Local Differential Privacy) [34, 20] is a privacy metric to protect the personal data of each user in the local model. LDP was originally introduced to protect each user’s data record, which is independent of the other records. However, in a graph, each edge is shared by two users. Thus, when we define edge DP in the local model, we should consider what we want to protect. In this paper, we consider two definitions of edge DP in the local model: edge LDP in [46] and relationship DP introduced in this paper. Below, we explain these two definitions in detail.

Edge LDP.  Qin et al. [46] defined edge LDP based on a user’s neighbor list. Specifically, let $\mathbf{a}_i=(a_{i,1},\ldots,a_{i,n})\in\{0,1\}^n$ be the neighbor list of user $v_i$. Note that $\mathbf{a}_i$ is the $i$-th row of the adjacency matrix $\mathbf{A}$ of graph $G$. In other words, graph $G$ can be represented as neighbor lists $\mathbf{a}_1,\ldots,\mathbf{a}_n$.

Then edge LDP is defined as follows:

Definition 2 ($\varepsilon$-edge LDP [46]).

Let $\varepsilon\in\mathbb{R}_{\geq 0}$. For any $i\in[n]$, let $\mathcal{R}_i$ with domain $\{0,1\}^n$ be a randomized algorithm of user $v_i$. $\mathcal{R}_i$ provides $\varepsilon$-edge LDP if for any two neighbor lists $\mathbf{a}_i,\mathbf{a}'_i\in\{0,1\}^n$ that differ in one bit and any $S\subseteq\mathrm{Range}(\mathcal{R}_i)$,

$$\Pr[\mathcal{R}_i(\mathbf{a}_i)\in S]\leq e^{\varepsilon}\Pr[\mathcal{R}_i(\mathbf{a}'_i)\in S]. \qquad (2)$$

Edge LDP in Definition 2 protects a single bit in a neighbor list with privacy budget $\varepsilon$. As with edge centralized DP, edge LDP can also be used to protect $k\in\mathbb{N}$ bits in a neighbor list by using group privacy; i.e., $k$ bits in a neighbor list are protected with privacy budget $k\varepsilon$.

RR (Randomized Response).  As a simple example of a randomized algorithm $\mathcal{R}_i$ providing $\varepsilon$-edge LDP, we explain Warner’s RR (Randomized Response) [60] applied to a neighbor list, which is called the randomized neighbor list in [46].

Given a neighbor list $\mathbf{a}_i\in\{0,1\}^n$, this algorithm outputs a noisy neighbor list $\mathbf{b}=(b_1,\ldots,b_n)\in\{0,1\}^n$ by flipping each bit in $\mathbf{a}_i$ with probability $p=\frac{1}{e^{\varepsilon}+1}$; i.e., for each $j\in[n]$, $b_j\neq a_{i,j}$ with probability $p$ and $b_j=a_{i,j}$ with probability $1-p$. Since $\Pr[\mathcal{R}_i(\mathbf{a}_i)\in S]$ and $\Pr[\mathcal{R}_i(\mathbf{a}'_i)\in S]$ in (2) differ by a factor of at most $e^{\varepsilon}$ for $\mathbf{a}_i$ and $\mathbf{a}'_i$ that differ in one bit, this algorithm provides $\varepsilon$-edge LDP.
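As a concrete illustration, the following is a minimal Python sketch of the randomized neighbor list (not the paper's implementation); the function name and the representation of a neighbor list as a Python list of bits are our own assumptions.

```python
import math
import random

def randomized_neighbor_list(a_i, epsilon):
    """Warner's randomized response applied to a neighbor list a_i
    (a list of 0/1 bits), providing epsilon-edge LDP: each bit is
    flipped independently with probability p = 1 / (e^epsilon + 1)."""
    p = 1.0 / (math.exp(epsilon) + 1.0)
    return [1 - bit if random.random() < p else bit for bit in a_i]

# Example: obfuscate a neighbor list over 5 potential neighbors.
# noisy = randomized_neighbor_list([0, 1, 1, 0, 0], epsilon=1.0)
```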

Relationship DP.  In graphs such as social networks, it is usually the case that two users share knowledge of the presence of an edge between them. To hide their mutual edge, we must consider that both users’ outputs can leak information. We introduce a DP definition called relationship DP that hides one entire edge in graph $G$ during the whole process:

Definition 3 ($\varepsilon$-relationship DP).

Let $\varepsilon\in\mathbb{R}_{\geq 0}$. A tuple of randomized algorithms $(\mathcal{R}_1,\ldots,\mathcal{R}_n)$, each of which has domain $\{0,1\}^n$, provides $\varepsilon$-relationship DP if for any two neighboring graphs $G,G'\in\mathcal{G}$ that differ in one edge and any $S\subseteq\mathrm{Range}(\mathcal{R}_1)\times\ldots\times\mathrm{Range}(\mathcal{R}_n)$,

$$\Pr[(\mathcal{R}_1(\mathbf{a}_1),\ldots,\mathcal{R}_n(\mathbf{a}_n))\in S]\leq e^{\varepsilon}\Pr[(\mathcal{R}_1(\mathbf{a}'_1),\ldots,\mathcal{R}_n(\mathbf{a}'_n))\in S], \qquad (3)$$

where $\mathbf{a}_i$ (resp. $\mathbf{a}'_i$) $\in\{0,1\}^n$ is the $i$-th row of the adjacency matrix of graph $G$ (resp. $G'$).

Relationship DP is the same as decentralized DP in [53] except that the former (resp. latter) assumes that each user knows only her friends (resp. all of her friends’ friends).

Edge LDP assumes that user $v_i$’s edge connected to user $v_j$ and user $v_j$’s edge connected to user $v_i$ are different secrets, with user $v_i$ knowing the former and user $v_j$ knowing the latter. Relationship DP assumes that the two secrets are the same.

Note that the threat model of relationship DP is different from that of LDP – some amount of trust must be given to the other users in relationship DP. Specifically, user $v_i$ must trust user $v_j$ to not leak information about their shared edge. If $k\in\mathbb{N}$ users decide not to follow their protocols, then up to $k$ edges incident to user $v_i$ may be compromised. This trust model is stronger than LDP, which assumes nothing about what other users do, but is much weaker than centralized DP, in which all edges are in the hands of the central party.

Other than the differing threat models, relationship DP and edge LDP are quite closely related:

Proposition 1.

If randomized algorithms $\mathcal{R}_1,\ldots,\mathcal{R}_n$ provide $\varepsilon$-edge LDP, then $(\mathcal{R}_1,\ldots,\mathcal{R}_n)$ provides $2\varepsilon$-relationship DP.

Proof.

The existence of edge $(v_i,v_j)\in E$ affects two elements $a_{i,j},a_{j,i}\in\{0,1\}$ of the adjacency matrix $\mathbf{A}$. Then by group privacy [23], Proposition 1 holds. ∎

Proposition 1 states that when we want to protect one edge as a whole, the privacy budget is at most doubled. Note, however, that some randomized algorithms do not have this doubling issue. For example, we can apply the RR to the $i$-th neighbor list $\mathbf{a}_i$ so that $\mathcal{R}_i$ outputs noisy bits $(b_1,\ldots,b_{i-1})\in\{0,1\}^{i-1}$ for only users $v_1,\ldots,v_{i-1}$ with smaller user IDs; i.e., for each $j\in\{1,\ldots,i-1\}$, $b_j\neq a_{i,j}$ with probability $p=\frac{1}{e^{\varepsilon}+1}$ and $b_j=a_{i,j}$ with probability $1-p$. In other words, we can extend the RR for a neighbor list so that $(\mathcal{R}_1,\ldots,\mathcal{R}_n)$ outputs only the lower triangular part of the noisy adjacency matrix. Then all of $\mathcal{R}_1,\ldots,\mathcal{R}_n$ provide $\varepsilon$-edge LDP. In addition, the existence of edge $(v_i,v_j)\in E$ $(i>j)$ affects only one element $a_{i,j}$ in the lower triangular part of $\mathbf{A}$. Thus, $(\mathcal{R}_1,\ldots,\mathcal{R}_n)$ provides $\varepsilon$-relationship DP (not $2\varepsilon$).

Our proposed algorithm in Section 4.3 also has this property; i.e., it provides both $\varepsilon$-edge LDP and $\varepsilon$-relationship DP.

3.3 Global Sensitivity

In this paper, we use the notion of global sensitivity [23] to provide edge centralized DP or edge LDP.

Let $\mathcal{D}$ be the set of possible input data of a randomized algorithm. In edge centralized DP, $\mathcal{D}=\mathcal{G}$. In edge LDP, $\mathcal{D}=\{0,1\}^n$. Let $f:\mathcal{D}\rightarrow\mathbb{R}$ be a function that takes data $D\in\mathcal{D}$ as input and outputs some statistic $f(D)\in\mathbb{R}$ about the data. The most basic method for providing DP is to add Laplacian noise proportional to the global sensitivity [23].

Definition 4 (Global sensitivity).

The global sensitivity of a function $f:\mathcal{D}\rightarrow\mathbb{R}$ is given by:

$$GS_f=\max_{D,D'\in\mathcal{D}:D\sim D'}|f(D)-f(D')|,$$

where $D\sim D'$ represents that $D$ and $D'$ are neighbors; i.e., they differ in one edge in edge centralized DP, and differ in one bit in edge LDP.

In graphs, the global sensitivity $GS_f$ can be very large. For example, adding one edge may result in an increase of the triangle (resp. $k$-star) count by $n-2$ (resp. $\binom{n}{k-1}$).

One way to significantly reduce the global sensitivity is to use graph projection [16, 35, 48], which removes some neighbors from a neighbor list so that the maximum degree $d_{max}$ is upper-bounded by a predetermined value $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$. By using graph projection with $\tilde{d}_{max}\ll n$, we can enforce a small global sensitivity; e.g., the global sensitivity of triangle (resp. $k$-star) counts is at most $\tilde{d}_{max}$ (resp. $\binom{\tilde{d}_{max}}{k-1}$) after the projection.
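The sketch below illustrates this projection step in Python; it is a simplified illustration under our own assumptions (a neighbor list given as a Python list of bits), not the exact implementation used in the paper.

```python
import random

def graph_projection(a_i, d_tilde_max):
    """If the user's degree exceeds d_tilde_max, keep a random subset of
    d_tilde_max neighbors so that the degree (and hence the global
    sensitivity of subgraph counts) is bounded by d_tilde_max."""
    neighbors = [j for j, bit in enumerate(a_i) if bit == 1]
    if len(neighbors) <= d_tilde_max:
        return list(a_i)  # degree already small enough; keep the list as is
    kept = set(random.sample(neighbors, d_tilde_max))
    return [1 if j in kept else 0 for j in range(len(a_i))]
```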

Ideally, we would like to set $\tilde{d}_{max}=d_{max}$ to avoid removing neighbors from a neighbor list (i.e., to avoid the loss of utility). However, the maximum degree $d_{max}$ can leak some information about the original graph $G$. In this paper, we address this issue by privately estimating $d_{max}$ with edge LDP and then using the private estimate of $d_{max}$ as $\tilde{d}_{max}$. This technique is also known as adaptive clipping in differentially private stochastic gradient descent (SGD) [44, 54].

3.4 Graph Statistics and Utility Metrics

Graph statistics.  We consider a graph function that takes a graph $G\in\mathcal{G}$ as input and outputs some graph statistics. Specifically, let $f_{\triangle}:\mathcal{G}\rightarrow\mathbb{Z}_{\geq 0}$ be a graph function that outputs the number of triangles in $G$. For $k\in\mathbb{N}$, let $f_{k\star}:\mathcal{G}\rightarrow\mathbb{Z}_{\geq 0}$ be a graph function that outputs the number of $k$-stars in $G$. For example, if a graph $G$ is as shown in Figure 1, then $f_{\triangle}(G)=5$, $f_{2\star}(G)=20$, and $f_{3\star}(G)=8$. The clustering coefficient can also be calculated from $f_{\triangle}(G)$ and $f_{2\star}(G)$ as $\frac{3f_{\triangle}(G)}{f_{2\star}(G)}=0.75$.

Table 1: Basic notations in this paper.

Symbol | Description
$n$ | Number of users.
$G=(V,E)$ | Graph with $n$ nodes (users) $V$ and edges $E$.
$v_i$ | $i$-th user in $V$.
$d_{max}$ | Maximum degree of $G$.
$\tilde{d}_{max}$ | Upper-bound on $d_{max}$ (used for projection).
$\mathcal{G}$ | Set of possible graphs on $n$ users.
$\mathbf{A}=(a_{i,j})$ | Adjacency matrix.
$\mathbf{a}_i$ | $i$-th row of $\mathbf{A}$ (i.e., neighbor list of $v_i$).
$\mathcal{R}_i$ | Randomized algorithm on $\mathbf{a}_i$.
$f_{\triangle}(G)$ | Number of triangles in $G$.
$f_{k\star}(G)$ | Number of $k$-stars in $G$.

Table 1 shows the basic notations used in this paper.

Utility metrics.  We use the $l_2$ loss (i.e., squared error) [31, 57, 39] and the relative error [12, 14, 61] as utility metrics.

Specifically, let $\hat{f}(G)\in\mathbb{R}$ be an estimate of the graph statistic $f(G)\in\mathbb{R}$. Here $f$ can be instantiated by $f_{\triangle}$ or $f_{k\star}$; i.e., $\hat{f}_{\triangle}(G)$ and $\hat{f}_{k\star}(G)$ are the estimates of $f_{\triangle}(G)$ and $f_{k\star}(G)$, respectively. Let $l_2^2$ be the $l_2$ loss function, which maps the estimate $\hat{f}(G)$ and the true value $f(G)$ to the $l_2$ loss; i.e., $l_2^2(\hat{f}(G),f(G))=(\hat{f}(G)-f(G))^2$. Note that when we use a randomized algorithm providing edge LDP (or edge centralized DP), $\hat{f}(G)$ depends on the randomness in the algorithm. In our theoretical analysis, we analyze the expectation of the $l_2$ loss over this randomness, as in [31, 57, 39].

When $f(G)$ is large, the $l_2$ loss can also be large. Thus in our experiments, we also evaluate the relative error, along with the $l_2$ loss. The relative error is defined as $\frac{|\hat{f}(G)-f(G)|}{\max\{f(G),\eta\}}$, where $\eta\in\mathbb{R}_{\geq 0}$ is a very small positive value. Following the convention [12, 14, 61], we set $\eta=0.001n$ for $f_{\triangle}$ and $f_{k\star}$.
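For concreteness, these two utility metrics can be computed as in the following minimal sketch (function names are ours):

```python
def l2_loss(estimate, true_value):
    """Squared error between the private estimate and the true count."""
    return (estimate - true_value) ** 2

def relative_error(estimate, true_value, n):
    """Relative error with the floor eta = 0.001 * n used for triangle
    and k-star counts."""
    eta = 0.001 * n
    return abs(estimate - true_value) / max(true_value, eta)
```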

4 Algorithms

In the local model, there are several ways to model how the data collector interacts with the users [20, 30, 46]. The simplest model would be to assume that the data collector sends a query $\mathcal{R}_i$ to each user $v_i$ once, and then each user $v_i$ independently sends an answer $\mathcal{R}_i(\mathbf{a}_i)$ to the data collector. In this model, there is one round of interaction between each user and the data collector. We call this the one-round LDP model. For example, the RR for a neighbor list in Section 3.2 assumes this model.

However, in certain cases it may be possible for the data collector to send a query to each user multiple times. This could allow for more powerful queries that result in more accurate subgraph counts [53] or more accurate synthetic graphs [46]. We call this the multiple-rounds LDP model.

In Sections 4.1 and 4.2, we consider the problems of computing $f_{k\star}(G)$ and $f_{\triangle}(G)$ for a graph $G\in\mathcal{G}$ in the one-round LDP model. Our algorithms and bounds highlight limitations of the one-round LDP model. Compared to the centralized graph DP model, the one-round LDP model cannot compute $f_{k\star}(G)$ as accurately. Furthermore, the algorithm for $f_{\triangle}(G)$ does not perform well. In Section 4.3, we propose a more sophisticated algorithm for computing $f_{\triangle}(G)$ in the two-rounds LDP model, and show that it provides a much smaller expected $l_2$ loss than the algorithm in the one-round LDP model. In Section 4.4, we show a general result about lower bounds on the expected $l_2$ loss of graph statistics in LDP. The proofs of all statements in Section 4 are given in Appendix D.

4.1 One-Round Algorithms for $k$-Stars

Algorithm.  We begin with the problem of computing $f_{k\star}(G)$ in the one-round LDP model. For this model, we introduce a simple algorithm using the Laplacian mechanism, and prove that this algorithm can achieve order optimal expected $l_2$ loss among all one-round LDP algorithms.

Data: Graph $G$ represented as neighbor lists $\mathbf{a}_1,\ldots,\mathbf{a}_n\in\{0,1\}^n$, privacy budget $\varepsilon\in\mathbb{R}_{\geq 0}$, $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$.
Result: Private estimate of $f_{k\star}(G)$.
1  $\Delta\leftarrow\binom{\tilde{d}_{max}}{k-1}$;
2  for $i=1$ to $n$ do
3      $\mathbf{a}_i\leftarrow$ GraphProjection($\mathbf{a}_i,\tilde{d}_{max}$);
       /* $d_i$ is the degree of user $v_i$. */
4      $d_i\leftarrow\sum_{j=1}^{n}a_{i,j}$;
5      $r_i\leftarrow\binom{d_i}{k}$;
6      $\hat{r}_i\leftarrow r_i+\mathrm{Lap}\left(\frac{\Delta}{\varepsilon}\right)$;
7      release($\hat{r}_i$);
8  end for
9  return $\sum_{i=1}^{n}\hat{r}_i$
Algorithm 1 LocalLapk⋆

Algorithm 1 shows the one-round algorithm for $k$-stars. It takes as input a graph $G$ (represented as neighbor lists $\mathbf{a}_1,\ldots,\mathbf{a}_n\in\{0,1\}^n$), the privacy budget $\varepsilon$, and a non-negative integer $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$.

The parameter $\tilde{d}_{max}$ plays a role as an upper-bound on the maximum degree $d_{max}$ of $G$. Specifically, let $d_i\in\mathbb{Z}_{\geq 0}$ be the degree of user $v_i$; i.e., the number of “1”s in her neighbor list $\mathbf{a}_i$. In line 3, user $v_i$ uses a function (denoted by GraphProjection) that performs graph projection [16, 35, 48] for $\mathbf{a}_i$ as follows. If $d_i$ exceeds $\tilde{d}_{max}$, it randomly selects $\tilde{d}_{max}$ neighbors out of the $d_i$ neighbors; otherwise, it uses $\mathbf{a}_i$ as is. This guarantees that each user’s degree never exceeds $\tilde{d}_{max}$; i.e., $d_i\leq\tilde{d}_{max}$ after line 4.

After the graph projection, user $v_i$ counts the number of $k$-stars $r_i\in\mathbb{Z}_{\geq 0}$ of which she is a center (line 5), and adds the Laplacian noise to $r_i$ (line 6). Here, since adding one edge results in an increase of at most $\binom{\tilde{d}_{max}}{k-1}$ $k$-stars, the sensitivity of the $k$-star count for user $v_i$ is at most $\binom{\tilde{d}_{max}}{k-1}$ (after graph projection). Therefore, we add $\mathrm{Lap}(\frac{\Delta}{\varepsilon})$ to $r_i$, where $\Delta=\binom{\tilde{d}_{max}}{k-1}$ and, for $b\in\mathbb{R}_{\geq 0}$, $\mathrm{Lap}(b)$ is a random variable representing Laplacian noise with mean 0 and scale $b$. The final answer of Algorithm 1 is simply the sum of all the noisy $k$-star counts. We denote this algorithm by LocalLapk⋆.
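The following Python sketch mirrors this per-user computation and the final aggregation; it assumes the neighbor lists have already been projected (e.g., with the projection sketch in Section 3.3) and is only an illustration of Algorithm 1, not the released C/C++ implementation.

```python
import numpy as np
from math import comb

def local_lap_kstar(neighbor_lists, epsilon, d_tilde_max, k):
    """Sketch of LocalLapk*: each user counts the k-stars centered at her,
    adds Laplace noise with scale C(d_tilde_max, k-1) / epsilon, and the
    data collector sums the noisy counts."""
    delta = comb(d_tilde_max, k - 1)     # sensitivity of each user's k-star count
    noisy_counts = []
    for a_i in neighbor_lists:           # assumed already projected: sum(a_i) <= d_tilde_max
        d_i = sum(a_i)                   # user's degree
        r_i = comb(d_i, k)               # k-stars of which the user is the center
        noisy_counts.append(r_i + np.random.laplace(0.0, delta / epsilon))
    return sum(noisy_counts)             # estimate of f_{k*}(G); unbiased if projection removed no edges
```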

The value of $\tilde{d}_{max}$ greatly affects the utility. If $\tilde{d}_{max}$ is too large, a large amount of Laplacian noise would be added. If $\tilde{d}_{max}$ is too small, a great number of neighbors would be removed by graph projection. When we have some prior knowledge about the maximum degree $d_{max}$, we can set $\tilde{d}_{max}$ to an appropriate value. For example, the maximum number of connections allowed on Facebook is 5000 [3]. In this case, we can set $\tilde{d}_{max}=5000$, and then graph projection does nothing. Given that the number of Facebook monthly active users is over 2.7 billion [6], $\tilde{d}_{max}=5000$ is much smaller than $n$. For another example, if we know that the degree is smaller than 1000 for most users, then we can set $\tilde{d}_{max}=1000$ and perform graph projection for users whose degrees exceed $\tilde{d}_{max}$.

In some applications, the data collector may not have such prior knowledge about $d_{max}$. In this case, we can privately estimate $d_{max}$ by allowing an additional round of interaction between each user and the data collector, and use the private estimate of $d_{max}$ as $\tilde{d}_{max}$. We describe how to privately estimate $d_{max}$ with edge LDP at the end of Section 4.1.

Theoretical properties.  LocalLapk⋆ has the following guarantees:

Theorem 1.

LocalLapk⋆ provides $\varepsilon$-edge LDP.

Theorem 2.

Let $\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max})$ be the output of LocalLapk⋆. Then, for all $k\in\mathbb{N}$, $\varepsilon\in\mathbb{R}_{\geq 0}$, $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$, and $G\in\mathcal{G}$ such that the maximum degree $d_{max}$ of $G$ is at most $\tilde{d}_{max}$, $\mathbb{E}[l_2^2(\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}),f_{k\star}(G))]=O\left(\frac{n\tilde{d}_{max}^{2k-2}}{\varepsilon^2}\right)$.

The factor of $n$ in the expected $l_2$ loss of LocalLapk⋆ comes from the fact that we are adding the Laplacian noise $n$ times. In the centralized model, this factor of $n$ is not there, because the central data collector sees all $k$-stars; i.e., the data collector knows $f_{k\star}(G)$. The sensitivity of $f_{k\star}$ is at most $2\binom{\tilde{d}_{max}}{k-1}$ (after graph projection) under edge centralized DP. Therefore, we can consider an algorithm that simply adds the Laplacian noise $\mathrm{Lap}(2\binom{\tilde{d}_{max}}{k-1}/\varepsilon)$ to $f_{k\star}(G)$, and outputs $f_{k\star}(G)+\mathrm{Lap}(2\binom{\tilde{d}_{max}}{k-1}/\varepsilon)$. We denote this algorithm by CentralLapk⋆. Since the bias of the Laplacian noise is 0, CentralLapk⋆ attains an expected $l_2$ loss (= variance) of $O\left(\frac{\tilde{d}_{max}^{2k-2}}{\varepsilon^2}\right)$.

It seems impossible to avoid this factor of $n$ in the one-round LDP model, as the data collector will be dealing with $n$ independent answers to queries. Indeed, this is the case – we prove that the expected $l_2$ error of LocalLapk⋆ is order optimal among all one-round LDP algorithms, and the one-round LDP model cannot improve the factor of $n$.

Corollary 1.

Let $\hat{f}_{k\star}(G,\tilde{d}_{max},\varepsilon)$ be any one-round LDP algorithm that computes $f_{k\star}(G)$ satisfying $\varepsilon$-edge LDP. Then, for all $k,n,\tilde{d}_{max}\in\mathbb{N}$ and $\varepsilon\in\mathbb{R}_{\geq 0}$ such that $n$ is even, there exists a set of graphs $\mathcal{A}\subseteq\mathcal{G}$ on $n$ nodes such that the maximum degree of each $G\in\mathcal{A}$ is at most $\tilde{d}_{max}$, and $\frac{1}{|\mathcal{A}|}\sum_{G\in\mathcal{A}}\mathbb{E}[l_2^2(\hat{f}_{k\star}(G,\tilde{d}_{max},\varepsilon),f_{k\star}(G))]\geq\Omega\left(\frac{e^{2\varepsilon}}{(e^{2\varepsilon}+1)^2}\tilde{d}_{max}^{2k-2}n\right)$.

This is a corollary of a more general result in Section 4.4. Thus, any algorithm computing $k$-stars cannot avoid the factor of $n$ in its $l_2^2$ loss. The $k$-star count is an example where the non-interactive graph LDP model is strictly weaker than the centralized DP model.

Nevertheless, we note that LocalLapk⋆ can accurately calculate $f_{k\star}(G)$ for a large number of users $n$. Specifically, the relative error decreases as $n$ increases because LocalLapk⋆ has a factor of $n$ (not $n^2$) in the expected $l_2$ error, i.e., $\mathbb{E}[(\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max})-f_{k\star}(G))^2]=O(n)$ and $f_{k\star}(G)^2\geq\Omega(n^2)$ (when we ignore $\tilde{d}_{max}$ and $\varepsilon$). In our experiments, we show that the relative error of LocalLapk⋆ is small when $n$ is large.

Private calculation of $d_{max}$.  By allowing an additional round between each user and the data collector, we can privately estimate $d_{max}$ and use the private estimate of $d_{max}$ as $\tilde{d}_{max}$. Specifically, we divide the privacy budget $\varepsilon$ into $\varepsilon_0\in\mathbb{R}_{\geq 0}$ and $\varepsilon_1\in\mathbb{R}_{\geq 0}$; i.e., $\varepsilon=\varepsilon_0+\varepsilon_1$. We first estimate $d_{max}$ with $\varepsilon_0$-edge LDP and then run LocalLapk⋆ with the remaining privacy budget $\varepsilon_1$. Note that LocalLapk⋆ with the private calculation of $d_{max}$ results in a two-rounds LDP algorithm.

We consider the following simple algorithm. At the first round, each user $v_i$ adds the Laplacian noise $\mathrm{Lap}(\frac{1}{\varepsilon_0})$ to her degree $d_i$. Let $\hat{d}_i\in\mathbb{R}$ be the noisy degree of $v_i$; i.e., $\hat{d}_i=d_i+\mathrm{Lap}(\frac{1}{\varepsilon_0})$. Then user $v_i$ sends $\hat{d}_i$ to the data collector. Let $\hat{d}_{max}\in\mathbb{R}$ be the maximum value of the noisy degrees; i.e., $\hat{d}_{max}=\max\{\hat{d}_1,\ldots,\hat{d}_n\}$. We call $\hat{d}_{max}$ the noisy max degree. The data collector uses the noisy max degree $\hat{d}_{max}$ as an estimate of $d_{max}$, and sends $\hat{d}_{max}$ back to all users. At the second round, we run LocalLapk⋆ with input $G$, $\varepsilon_1$, and $\lfloor\hat{d}_{max}\rfloor$.
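A minimal sketch of this first-round estimate in Python (the function name is ours; in practice the noisy degrees are computed on the user side before being collected):

```python
import math
import numpy as np

def noisy_max_degree(degrees, epsilon0):
    """Each user adds Laplace(1/epsilon0) noise to her degree (sensitivity 1
    under edge LDP); the data collector takes the maximum of the noisy
    degrees and uses its floor as d_tilde_max."""
    noisy = [d + np.random.laplace(0.0, 1.0 / epsilon0) for d in degrees]
    return math.floor(max(noisy))
```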

At the first round, the calculation of $\hat{d}_{max}$ provides $\varepsilon_0$-edge LDP because each user’s degree has sensitivity 1 under edge LDP. At the second round, Theorem 1 guarantees that LocalLapk⋆ provides $\varepsilon_1$-edge LDP. Then by the composition theorem [23], this two-rounds algorithm provides $\varepsilon$-edge LDP in total ($\varepsilon=\varepsilon_0+\varepsilon_1$).

In our experiments, we show that this algorithm provides utility close to that of LocalLapk⋆ with the true max degree $d_{max}$.

4.2 One-Round Algorithms for Triangles

Algorithm.  Now, we focus our attention on the more challenging $f_{\triangle}$ query. This query is more challenging in the graph LDP model because no user is aware of any entire triangle; i.e., user $v_i$ is not aware of a triangle formed by $(v_i,v_j,v_k)$, because $v_i$ cannot see the edge $(v_j,v_k)\in E$ in graph $G$.

One way to count $f_{\triangle}(G)$ with edge LDP is to apply the RR (Randomized Response) to a neighbor list. For example, user $v_i$ applies the RR to $a_{i,1},\ldots,a_{i,i-1}$ (which correspond to users $v_1,\ldots,v_{i-1}$ with smaller user IDs) in her neighbor list $\mathbf{a}_i$; i.e., we apply the RR to the lower triangular part of the adjacency matrix $\mathbf{A}$, as described in Section 3.2. Then the data collector constructs a noisy graph $G'=(V,E')\in\mathcal{G}$ from the lower triangular part of the noisy adjacency matrix, and estimates the number of triangles from $G'$. However, simply counting the triangles in $G'$ can introduce a significant bias because $G'$ is denser than $G$, especially when $\varepsilon$ is small.

Figure 2: Four types of subgraphs with three nodes.

Through clever post-processing known as empirical estimation [31, 39, 57], we are able to obtain an unbiased estimate of $f_{\triangle}(G)$ from $G'$. Specifically, a subgraph with three nodes can be divided into four types depending on the number of edges. Three nodes with three edges form a triangle. We refer to three nodes with two edges, one edge, and no edges as 2-edges, 1-edge, and no-edges, respectively. Figure 2 shows their shapes. $f_{\triangle}(G)$ can be expressed using $m_3$, $m_2$, $m_1$, and $m_0$ as follows:

Proposition 2.

Let $G'=(V,E')$ be a noisy graph generated by applying the RR to the lower triangular part of $\mathbf{A}$. Let $m_3,m_2,m_1,m_0\in\mathbb{Z}_{\geq 0}$ be respectively the number of triangles, 2-edges, 1-edges, and no-edges in $G'$. Then

$$\mathbb{E}\left[\frac{e^{3\varepsilon}}{(e^{\varepsilon}-1)^3}m_3-\frac{e^{2\varepsilon}}{(e^{\varepsilon}-1)^3}m_2+\frac{e^{\varepsilon}}{(e^{\varepsilon}-1)^3}m_1-\frac{1}{(e^{\varepsilon}-1)^3}m_0\right]=f_{\triangle}(G). \qquad (4)$$

Therefore, the data collector can count $m_3$, $m_2$, $m_1$, and $m_0$ from $G'$, and calculate an unbiased estimate of $f_{\triangle}(G)$ by (4). In Appendix A, we show that the $l_2$ loss is significantly reduced by this empirical estimation.
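In code, this post-processing step of the data collector amounts to the following sketch (the counting of $m_3,\ldots,m_0$ from $G'$ is assumed to be done elsewhere):

```python
import math

def triangle_estimate_from_counts(m3, m2, m1, m0, epsilon):
    """Unbiased empirical estimate of the triangle count from the numbers of
    triangles (m3), 2-edges (m2), 1-edges (m1), and no-edges (m0) in the
    noisy graph G', following equation (4)."""
    e = math.exp(epsilon)
    return (e**3 * m3 - e**2 * m2 + e * m1 - m0) / (e - 1)**3
```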

Data: Graph $G$ represented as neighbor lists $\mathbf{a}_1,\ldots,\mathbf{a}_n\in\{0,1\}^n$, privacy budget $\varepsilon\in\mathbb{R}_{\geq 0}$.
Result: Private estimate of $f_{\triangle}(G)$.
1  for $i=1$ to $n$ do
2      $R_i\leftarrow(RR_{\varepsilon}(a_{i,1}),\ldots,RR_{\varepsilon}(a_{i,i-1}))$;
3      release($R_i$);
4  end for
5  $G'=(V,E')\leftarrow$ UndirectedGraph($R_1,\ldots,R_n$);
   /* Counts $m_3,m_2,m_1,m_0$ in $G'$. */
6  $(m_3,m_2,m_1,m_0)\leftarrow$ Count($G'$);
7  return $\frac{1}{(e^{\varepsilon}-1)^3}(e^{3\varepsilon}m_3-e^{2\varepsilon}m_2+e^{\varepsilon}m_1-m_0)$
Algorithm 2 LocalRR

Algorithm 2 shows this algorithm. In line 2, user $v_i$ applies the RR with privacy budget $\varepsilon$ (denoted by $RR_{\varepsilon}$) to $a_{i,1},\ldots,a_{i,i-1}$ in her neighbor list $\mathbf{a}_i$, and outputs $R_i=(RR_{\varepsilon}(a_{i,1}),\ldots,RR_{\varepsilon}(a_{i,i-1}))$. In other words, we apply the RR to the lower triangular part of $\mathbf{A}$, and there is no overlap between edges sent by users. In line 5, the data collector uses a function (denoted by UndirectedGraph) that converts the bits of $(R_1,\ldots,R_n)$ into an undirected graph $G'=(V,E')$ by adding edge $(v_i,v_j)$ with $i>j$ to $E'$ if and only if the $j$-th bit of $R_i$ is 1. Note that $G'$ is biased, as explained above. In line 6, the data collector uses a function (denoted by Count) that calculates $m_3$, $m_2$, $m_1$, and $m_0$ from $G'$. Finally, the data collector outputs the expression inside the expectation on the left-hand side of (4), which is an unbiased estimator for $f_{\triangle}(G)$ by Proposition 2. We denote this algorithm by LocalRR.

Theoretical properties.  LocalRR provides the following guarantee.

Theorem 3.

LocalRR provides $\varepsilon$-edge LDP and $\varepsilon$-relationship DP.

LocalRR does not have the doubling issue (i.e., it provides not $2\varepsilon$- but $\varepsilon$-relationship DP), because we apply the RR to the lower triangular part of $\mathbf{A}$, as explained in Section 3.2.

Unlike the RR and empirical estimation for tabular data [31], the expected $l_2$ loss of LocalRR is complicated. To simplify the utility analysis, we assume that $G$ is generated from the Erdös-Rényi graph distribution $\mathbf{G}(n,\alpha)$ with edge existence probability $\alpha$; i.e., each edge in $G$ with $n$ nodes is independently generated with probability $\alpha\in[0,1]$.

Theorem 4.

Let $\mathbf{G}(n,\alpha)$ be the Erdös-Rényi graph distribution with edge existence probability $\alpha\in[0,1]$. Let $p=\frac{1}{e^{\varepsilon}+1}$ and $\beta=\alpha(1-p)+(1-\alpha)p$. Let $\hat{f}_{\triangle}(G,\varepsilon)$ be the output of LocalRR. If $G\sim\mathbf{G}(n,\alpha)$, then for all $\varepsilon\in\mathbb{R}_{\geq 0}$, $\mathbb{E}[l_2^2(\hat{f}_{\triangle}(G,\varepsilon),f_{\triangle}(G))]=O\left(\frac{e^{6\varepsilon}}{(e^{\varepsilon}-1)^6}\beta n^4\right)$.

Note that we assume the Erdös-Rényi model only for the utility analysis of LocalRR, and do not assume this model for the other algorithms. The upper-bound of LocalRR in Theorem 4 is less ideal than the upper-bounds of the other algorithms in that it does not hold for all possible graphs $G\in\mathcal{G}$. Nevertheless, we also show that the $l_2$ loss of LocalRR is roughly consistent with Theorem 4 in our experiments using two real datasets (Section 5) and Barabási-Albert graphs [9], which have a power-law degree distribution (Appendix B).

The parameters $\alpha$ and $\beta$ are the edge existence probabilities in the original graph $G$ and noisy graph $G'$, respectively. Although $\alpha$ is very small in a sparse graph, $\beta$ can be large for small $\varepsilon$. For example, if $\alpha\approx 0$ and $\varepsilon=1$, then $\beta\approx\frac{1}{e+1}\approx 0.27$.

Theorem 4 states that for large $n$, the $l_2$ loss of LocalRR ($=O(n^4)$) is much larger than the $l_2$ loss of LocalLapk⋆ ($=O(n)$). This follows from the fact that user $v_i$ is not aware of any triangle formed by $(v_i,v_j,v_k)$, as explained above.

In contrast, counting $f_{\triangle}(G)$ in the centralized model is much easier because the data collector sees all triangles in $G$; i.e., the data collector knows $f_{\triangle}(G)$. The sensitivity of $f_{\triangle}$ is at most $\tilde{d}_{max}$ (after graph projection). Thus, we can consider a simple algorithm that outputs $f_{\triangle}(G)+\mathrm{Lap}(\tilde{d}_{max}/\varepsilon)$. We denote this algorithm by CentralLap△. CentralLap△ attains an expected $l_2$ loss (= variance) of $O\left(\frac{\tilde{d}_{max}^2}{\varepsilon^2}\right)$.

The large $l_2$ loss of LocalRR is caused by the fact that each edge is released independently with some probability of being flipped. In other words, there are three independent random variables that influence any triangle in $G'$. The next algorithm, using interaction, reduces the number of influencing variables from three to one by using the fact that a user knows the existence of two edges of any triangle that involves her.

4.3 Two-Rounds Algorithms for Triangles

Algorithm.  Allowing for two rounds of interaction, we are able to compute $f_{\triangle}$ with a significantly improved $l_2$ loss, albeit with a higher per-user communication overhead. As described in Section 4.2, it is impossible for user $v_i$ to see edge $(v_j,v_k)\in E$ in graph $G=(V,E)$ at the first round. However, if the data collector publishes a noisy graph $G'=(V,E')$ calculated by LocalRR at the first round, then user $v_i$ can see a noisy edge $(v_j,v_k)\in E'$ in the noisy graph $G'$ at the second round. Then user $v_i$ can count the number of noisy triangles formed by $(v_i,v_j,v_k)$ such that $(v_i,v_j)\in E$, $(v_i,v_k)\in E$, and $(v_j,v_k)\in E'$, and send the noisy triangle count with the Laplacian noise to the data collector in a manner analogous to LocalLapk⋆. Since user $v_i$ always knows that the two edges $(v_i,v_j)$ and $(v_i,v_k)$ exist in $G$, there is only one noisy edge in any noisy triangle (whereas all edges are noisy in LocalRR). This is the intuition behind our proposed two-rounds algorithm.

As with the RR in Section 4.2, simply counting the noisy triangles can introduce a bias. Therefore, we calculate an empirical estimate of $f_{\triangle}(G)$ from the noisy triangle counts. Specifically, the following gives the empirical estimate of $f_{\triangle}(G)$:

Proposition 3.

Let $G'=(V,E')$ be a noisy graph generated by applying the RR with privacy budget $\varepsilon_1\in\mathbb{R}_{\geq 0}$ to the lower triangular part of $\mathbf{A}$. Let $p_1=\frac{1}{e^{\varepsilon_1}+1}$. Let $t_i\in\mathbb{Z}_{\geq 0}$ be the number of triplets $(v_i,v_j,v_k)$ such that $j<k<i$, $(v_i,v_j)\in E$, $(v_i,v_k)\in E$, and $(v_j,v_k)\in E'$. Let $s_i\in\mathbb{Z}_{\geq 0}$ be the number of triplets $(v_i,v_j,v_k)$ such that $j<k<i$, $(v_i,v_j)\in E$, and $(v_i,v_k)\in E$. Let $w_i=t_i-p_1 s_i$. Then

$$\mathbb{E}\left[\frac{1}{1-2p_1}\sum_{i=1}^{n}w_i\right]=f_{\triangle}(G). \qquad (5)$$

Note that in Proposition 3, we count only triplets $(v_i,v_j,v_k)$ with $j<k<i$ to use only the lower triangular part of $\mathbf{A}$. $t_i$ is the number of noisy triangles user $v_i$ can see at the second round. $s_i$ is the number of 2-stars of which user $v_i$ is a center. Since $t_i$ and $s_i$ can reveal information about an edge in $G$, user $v_i$ adds the Laplacian noise to $w_i$ ($=t_i-p_1 s_i$) in (5), and sends it to the data collector. Then the data collector calculates an unbiased estimate of $f_{\triangle}(G)$ by (5).

Data: Graph $G$ represented as neighbor lists $\mathbf{a}_1,\ldots,\mathbf{a}_n\in\{0,1\}^n$, two privacy budgets $\varepsilon_1,\varepsilon_2>0$, $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$.
Result: Private estimate of $f_{\triangle}(G)$.
1   $p_1\leftarrow\frac{1}{e^{\varepsilon_1}+1}$;
    /* First round. */
2   for $i=1$ to $n$ do
3       $R_i\leftarrow(RR_{\varepsilon_1}(a_{i,1}),\ldots,RR_{\varepsilon_1}(a_{i,i-1}))$;
4       release($R_i$);
5   end for
6   $G'=(V,E')\leftarrow$ UndirectedGraph($R_1,\ldots,R_n$);
    /* Second round. */
7   for $i=1$ to $n$ do
8       $\mathbf{a}_i\leftarrow$ GraphProjection($\mathbf{a}_i,\tilde{d}_{max}$);
9       $t_i\leftarrow|\{(v_i,v_j,v_k):j<k<i,a_{i,j}=a_{i,k}=1,(v_j,v_k)\in E'\}|$;
10      $s_i\leftarrow|\{(v_i,v_j,v_k):j<k<i,a_{i,j}=a_{i,k}=1\}|$;
11      $w_i\leftarrow t_i-p_1 s_i$;
12      $\hat{w}_i\leftarrow w_i+\mathrm{Lap}(\frac{\tilde{d}_{max}}{\varepsilon_2})$;
13      release($\hat{w}_i$);
14  end for
15  return $\frac{1}{1-2p_1}\sum_{i=1}^{n}\hat{w}_i$
Algorithm 3 Local2Rounds

Algorithm 3 contains the formal description of this process. It takes as input a graph $G$, the privacy budgets $\varepsilon_1,\varepsilon_2\in\mathbb{R}_{\geq 0}$ at the first and second rounds, respectively, and a non-negative integer $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$. At the first round, we apply the RR to the lower triangular part of $\mathbf{A}$ (i.e., there is no overlap between edges sent by users) and use the UndirectedGraph function to obtain a noisy graph $G'=(V,E')$ in the same way as Algorithm 2. Note that $G'$ is biased. We calculate an unbiased estimate of $f_{\triangle}(G)$ from $G'$ at the second round.

At the second round, each user $v_i$ calculates $\hat{w}_i=w_i+\mathrm{Lap}(\frac{\tilde{d}_{max}}{\varepsilon_2})$ by adding the Laplacian noise to $w_i$ in Proposition 3, whose sensitivity is at most $\tilde{d}_{max}$ (as we will prove in Theorem 5). Finally, we output $\frac{1}{1-2p_1}\sum_{i=1}^{n}\hat{w}_i$, which is an unbiased estimate of $f_{\triangle}(G)$ by Proposition 3. We call this algorithm Local2Rounds.
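The per-user second-round computation can be sketched as follows; here the noisy graph $G'$ is assumed to be given as a set of edge tuples $(j,k)$ with $j>k$, and the function name and data layout are our own illustration rather than the paper's implementation.

```python
import math
import numpy as np

def second_round_response(i, a_i, noisy_edges, epsilon1, epsilon2, d_tilde_max):
    """User v_i's second-round message in Local2Rounds: count noisy
    triangles t_i and 2-stars s_i over pairs j < k < i of her (projected)
    neighbors, then release w_i = t_i - p1 * s_i with Laplace noise."""
    p1 = 1.0 / (math.exp(epsilon1) + 1.0)
    neighbors = [j for j, bit in enumerate(a_i) if bit == 1 and j < i]
    t_i, s_i = 0, 0
    for idx, k in enumerate(neighbors):
        for j in neighbors[:idx]:        # ensures j < k < i
            s_i += 1                     # (v_i, v_j, v_k) is a 2-star centered at v_i
            if (k, j) in noisy_edges:    # noisy edge (v_j, v_k) in G'
                t_i += 1                 # it also forms a noisy triangle
    w_i = t_i - p1 * s_i
    return w_i + np.random.laplace(0.0, d_tilde_max / epsilon2)

# The data collector then outputs sum of the noisy w_i over all users,
# divided by (1 - 2 * p1), an unbiased estimate of the triangle count by (5).
```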

Theoretical properties.  Local2Rounds has the following guarantee.

Theorem 5.

Local2Rounds provides $(\varepsilon_1+\varepsilon_2)$-edge LDP and $(\varepsilon_1+\varepsilon_2)$-relationship DP.

As with LocalRR, Local2Rounds does not have the doubling issue; i.e., it provides $(\varepsilon_1+\varepsilon_2)$-relationship DP rather than $2(\varepsilon_1+\varepsilon_2)$-relationship DP. This follows from the fact that we use only the lower triangular part of $\mathbf{A}$; i.e., we assume $j<k<i$ in counting $t_i$ and $s_i$.

Theorem 6.

Let $\hat{f}_{\triangle}(G,\varepsilon_1,\varepsilon_2,\tilde{d}_{max})$ be the output of Local2Rounds. Then, for all $\varepsilon_1,\varepsilon_2\in\mathbb{R}_{\geq 0}$, $\tilde{d}_{max}\in\mathbb{Z}_{\geq 0}$, and $G\in\mathcal{G}$ such that the maximum degree $d_{max}$ of $G$ is at most $\tilde{d}_{max}$, $\mathbb{E}[l_2^2(\hat{f}_{\triangle}(G,\varepsilon_1,\varepsilon_2,\tilde{d}_{max}),f_{\triangle}(G))]\leq O\left(\frac{e^{\varepsilon_1}}{(1-e^{\varepsilon_1})^2}\left(\tilde{d}_{max}^3 n+\frac{e^{\varepsilon_1}}{\varepsilon_2^2}\tilde{d}_{max}^2 n\right)\right)$.

Theorem 6 means that for triangles, the expected $l_2$ loss is reduced from $O(n^4)$ to $O(\tilde{d}_{max}^3 n)$ by introducing an additional round.

Private calculation of $d_{max}$.  As with $k$-stars, we can privately calculate $d_{max}$ by using the method described in Section 4.1. Furthermore, the private calculation of $d_{max}$ does not increase the number of rounds; i.e., we can run Local2Rounds with the private calculation of $d_{max}$ in two rounds.

Specifically, let $\varepsilon_0\in\mathbb{R}_{\geq 0}$ be the privacy budget for the private calculation of $d_{max}$. At the first round, each user $v_i$ adds $\mathrm{Lap}(\frac{1}{\varepsilon_0})$ to her degree $d_i$, and sends the noisy degree $\hat{d}_i$ ($=d_i+\mathrm{Lap}(\frac{1}{\varepsilon_0})$) to the data collector, along with the outputs $R_i=(RR_{\varepsilon_1}(a_{i,1}),\ldots,RR_{\varepsilon_1}(a_{i,i-1}))$ of the RR. The data collector calculates the noisy max degree $\hat{d}_{max}$ ($=\max\{\hat{d}_1,\ldots,\hat{d}_n\}$) as an estimate of $d_{max}$, and sends it back to all users. At the second round, we run Local2Rounds with input $G$ (represented as $\mathbf{a}_1,\ldots,\mathbf{a}_n$), $\varepsilon_1$, $\varepsilon_2$, and $\lfloor\hat{d}_{max}\rfloor$.

At the first round, the calculation of d^max provides ε0-edge LDP. Note that it provides 2ε0-relationship DP (i.e., it has the doubling issue) because one edge (vi,vj) ∈ E affects both of the degrees di and dj by 1. In addition, Local2Rounds provides (ε1+ε2)-edge LDP and (ε1+ε2)-relationship DP (Theorem 5). Then by the composition theorem [23], this two-rounds algorithm provides (ε0+ε1+ε2)-edge LDP and (2ε0+ε1+ε2)-relationship DP. Although the total privacy budget is larger for relationship DP, the difference (= ε0) can be very small. In fact, we set (ε0,ε1,ε2) = (0.1,0.45,0.45) or (0.2,0.9,0.9) in our experiments (i.e., the difference is 0.1 or 0.2), and show that this algorithm provides almost the same utility as Local2Rounds with the true max degree dmax.

Time complexity.  We also note that Local2Rounds has an advantage over LocalRR in terms of the time complexity.

Specifically, LocalRR is inefficient because the data collector has to count the number of triangles m3m_{3} in the noisy graph GG^{\prime}. Since the noisy graph GG^{\prime} is dense (especially when ε\varepsilon is small) and there are (n3)\binom{n}{3} subgraphs with three nodes in GG^{\prime}, the number of triangles is m3=O(n3)m_{3}=O(n^{3}). Then, the time complexity of LocalRR is also O(n3)O(n^{3}), which is not practical for a graph with a large number of users nn. In fact, we implemented LocalRR (ε=1\varepsilon=1) with C/C++ and measured its running time using one node of a supercomputer (ABCI: AI Bridging Cloud Infrastructure [4]). When n=5000n=5000, 1000010000, 2000020000, and 4000040000, the running time was 138138, 11071107, 93459345, and 9956199561 seconds, respectively; i.e., the running time was almost cubic in nn. We can also estimate the running time for larger nn. For example, when n=1000000n=1000000, LocalRR (ε=1\varepsilon=1) would require about 3535 years (=1107×1003/(3600×24×365))(=1107\times 100^{3}/(3600\times 24\times 365)).

In contrast, the time complexity of Local2Rounds is O(n^2 + n·dmax^2). (When we evaluate Local2Rounds in our experiments, we can apply the RR only to the edges that are required at the second round, i.e., (vj,vk) ∈ G′ in line 8 of Algorithm 3; the time complexity of Local2Rounds is then reduced to O(n·dmax^2) in total, and we confirmed that when n = 1000000, the running time of Local2Rounds was 311 seconds on one node of the ABCI. Note, however, that this shortcut does not protect individual privacy, because it reveals to the data collector the fact that users vj and vk are friends with vi.) The factor of n^2 comes from the fact that the size of the noisy graph G′ is O(n^2). This also causes a large communication overhead, as explained below.

Communication overhead.  In Local2Rounds, each user needs to see the noisy graph G′ of size O(n^2) to count ti and si. This results in a per-user communication overhead of O(n^2). Although we do not simulate the communication overhead in our experiments that use Local2Rounds, the O(n^2) overhead might limit its application to very large graphs. An interesting avenue of future work is how to compress the graph size (e.g., via graph projection or random projection) to reduce both the time complexity and the communication overhead.

  • f_k⋆: Centralized (upper bound) O(dmax^{2k-2}/ε^2); One-round local (lower bound) Ω((e^{2ε}/(e^{2ε}+1)^2)·dmax^{2k-2}·n); One-round local (upper bound) O((dmax^{2k-2}/ε^2)·n); Two-rounds local (upper bound) O((dmax^{2k-2}/ε^2)·n).
  • f_△: Centralized (upper bound) O(dmax^2/ε^2); One-round local (lower bound) Ω((e^{2ε}/(e^{2ε}+1)^2)·dmax^2·n); One-round local (upper bound) O((e^{6ε}/(e^ε-1)^6)·n^4) (when G ∼ G(n,α)); Two-rounds local (upper bound) O((e^ε/(e^ε-1)^2)·(dmax^3·n + (e^ε/ε^2)·dmax^2·n)).
Table 2: Bounds on l2 losses for privately estimating f_k⋆ and f_△ with ε-edge LDP. For upper bounds, we assume that d~max = dmax. For the centralized model, we use the Laplace mechanism. For the one-round f_△ algorithm, we apply Theorem 4 with constant α. For the two-rounds f_△ algorithm, we apply Theorem 6 with ε1 = ε2 = ε/2.

4.4 Lower Bounds

We show a general lower bound on the l2 loss of private estimators f^ of real-valued functions f in the one-round LDP model. Treating ε as a constant, we have shown that when d~max = dmax, the expected l2 loss of LocalLapk⋆ is O(n·dmax^{2k-2}) (Theorem 2). However, in the centralized model, we can use the Laplace mechanism with sensitivity 2\binom{dmax}{k-1} to obtain an l2^2 error of O(dmax^{2k-2}) for fk⋆. Thus, we ask if the factor of n is necessary in the one-round LDP model.

We answer this question affirmatively. We show that for many types of queries f, there is a lower bound on l2^2(f(G), f^(G)) for any private estimator f^ of the form

f^(G)=f~(1(𝐚1),,n(𝐚n)),\hat{f}(G)=\tilde{f}(\mathcal{R}_{1}(\mathbf{a}_{1}),\ldots,\mathcal{R}_{n}(\mathbf{a}_{n})), (6)

where ℛ1, …, ℛn satisfy ε-edge LDP or ε-relationship DP and f~ is an aggregate function that takes ℛ1(a1), …, ℛn(an) as input and outputs f^(G). Here we assume that ℛ1, …, ℛn are run independently, meaning that they are in the one-round setting. For our lower bound, we require that input edges to f be "independent" in the sense that adding an edge to an input graph G independently changes f by at least D ∈ ℝ. The specific structure of input graphs we require is as follows:

Definition 5.

[(n,D)-independent cube for f] Let D ∈ ℝ≥0. For κ ∈ ℕ, let G = (V,E) ∈ 𝒢 be a graph on n = 2κ nodes, and let M = {(v_{i_1}, v_{i_2}), (v_{i_3}, v_{i_4}), …, (v_{i_{2κ-1}}, v_{i_{2κ}})} for integers i_j ∈ [n] be a set of edges such that each of i_1, …, i_{2κ} is distinct (i.e., a perfect matching on the nodes). Suppose that M is disjoint from E; i.e., (v_{i_{2j-1}}, v_{i_{2j}}) ∉ E for any j ∈ [κ]. Let 𝒜 = {(V, E ∪ N) : N ⊆ M}. Note that 𝒜 is a set of 2^κ graphs. We say 𝒜 is an (n,D)-independent cube for f if for all G′ = (V,E′) ∈ 𝒜, we have

f(G)=f(G)+eEMCe,f(G^{\prime})=f(G)+\sum_{e\in E^{\prime}\cap M}C_{e},

where CeC_{e}\in\mathbb{R} satisfies |Ce|D|C_{e}|\geq D for any eMe\in M.

Figure 3: (4,2)(4,2)-independent cube 𝒜\mathcal{A} for ff. In this example, M={(v1,v2),(v3,v4)}M=\{(v_{1},v_{2}),(v_{3},v_{4})\}, G1=(V,E)G_{1}=(V,E), 𝒜={(V,EN):NM}\mathcal{A}=\{(V,E\cup N):N\subseteq M\}, C(v1,v2)=2C_{(v_{1},v_{2})}=2, and C(v3,v4)=3C_{(v_{3},v_{4})}=3. Adding (v1,v2)(v_{1},v_{2}) and (v3,v4)(v_{3},v_{4}) increase ff by 22 and 33, respectively.
Figure 4: Construction of an independent cube for a kk-star function (n=6n=6, dmax=4d_{max}=4). From a 33-regular graph G=(V,E)G=(V,E) and M={(v1,v3),(v2,v6),(v4,v5)}M=\{(v_{1},v_{3}),(v_{2},v_{6}),(v_{4},v_{5})\}, we make a graph G=(V,E)G^{\prime}=(V,E^{\prime}) such that E=EME^{\prime}=E\setminus M. Then 𝒜={(V,EN):NM}\mathcal{A}=\{(V,E^{\prime}\cup N):N\subseteq M\} forms an (n,2(dmax2k1))(n,2\binom{d_{max}-2}{k-1})-independent cube for fkf_{k\star}.

Such a set of inputs has an "independence" property because, regardless of which edges from M have been added before, adding edge e ∈ M always changes f by Ce. Figure 3 shows an example of a (4,2)-independent cube for f.

We can also construct an independent cube for a k-star function as follows. Assume that n is even. It is well known in graph theory that if n is even, then for any d ∈ [n−1] there exists a d-regular graph, i.e., a graph in which every node has degree d [25]. Therefore, there exists a (dmax−1)-regular graph G = (V,E) of size n. Pick an arbitrary perfect matching M on the nodes. Now, let G′ = (V,E′) such that E′ = E ∖ M. Every node in G′ has degree between dmax−2 and dmax−1. Adding an edge in M to G′ will produce at least 2\binom{dmax-2}{k-1} new k-stars. Thus, 𝒜 = {(V, E′ ∪ N) : N ⊆ M} forms an (n, 2\binom{dmax-2}{k-1})-independent cube for fk⋆. Note that the maximum degree of each graph in 𝒜 is at most dmax. Figure 4 shows how to construct an independent cube for a k-star function when n = 6 and dmax = 4.
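This construction can also be checked numerically. The following sketch (illustrative, using NetworkX's random_regular_graph) builds a (dmax−1)-regular graph, removes a perfect matching, and verifies that adding back any matching edge creates at least 2\binom{dmax-2}{k-1} new k-stars; the parameters follow the n = 6, dmax = 4 example of Figure 4.

```python
import math
import networkx as nx

def k_star_count(G, k):
    # f_k*(G): total number of k-stars, i.e., sum over nodes of C(deg, k)
    return sum(math.comb(d, k) for _, d in G.degree())

n, d_max, k = 6, 4, 2
G = nx.random_regular_graph(d_max - 1, n)            # (d_max - 1)-regular graph on n nodes
M = [(2 * i, 2 * i + 1) for i in range(n // 2)]      # an arbitrary perfect matching on the nodes
G_prime = G.copy()
G_prime.remove_edges_from(M)                         # E' = E \ M

base = k_star_count(G_prime, k)
for e in M:
    H = G_prime.copy()
    H.add_edge(*e)
    # each matching edge creates at least 2 * C(d_max - 2, k - 1) new k-stars
    assert k_star_count(H, k) - base >= 2 * math.comb(d_max - 2, k - 1)
```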

Using the structure that the (n,D)(n,D)-independent cube imposes on ff, we can prove a lower bound:

Theorem 7.

Let f^(G)\hat{f}(G) have the form of (6), where 1,,n\mathcal{R}_{1},\ldots,\mathcal{R}_{n} are independently run. Let 𝒜\cal{A} be an (n,D)(n,D)-independent cube for ff. If (1,,n)(\mathcal{R}_{1},\ldots,\mathcal{R}_{n}) provides ε\varepsilon-relationship DP, then we have

\frac{1}{|\mathcal{A}|}\sum_{G\in\mathcal{A}}\mathbb{E}[l_{2}^{2}(f(G),\hat{f}(G))]=\Omega\left(\frac{e^{\varepsilon}}{(e^{\varepsilon}+1)^{2}}nD^{2}\right).

A corollary of Theorem 7 is that if ℛ1, …, ℛn satisfy ε-edge LDP, then they satisfy 2ε-relationship DP, and thus for edge LDP we have a lower bound of Ω((e^{2ε}/(e^{2ε}+1)^2)·n·D^2).

Theorem 7, combined with the fact that there exists an (n, 2\binom{dmax-2}{k-1})-independent cube for a k-star function, implies Corollary 1. In Appendix C, we also construct an (n, dmax/2 − 2)-independent cube for f△ and establish a lower bound of Ω((e^{2ε}/(e^{2ε}+1)^2)·n·dmax^2) for f△.

The upper and lower bounds on the l2l_{2} losses shown in this section appear in Table 2.

5 Experiments

Based on our theoretical results in Section 4, we would like to pose the following questions:

  • For triangle counts, how much does the two-rounds interaction help over a single round in practice?

  • What is the privacy-utility trade-off of our LDP algorithms (i.e., how beneficial are our LDP algorithms)?

We conducted experiments to answer these questions.

5.1 Experimental Set-up

We used the following two large-scale datasets:

IMDB.  The Internet Movie Database (denoted by IMDB) [2] includes a bipartite graph between 896308 actors and 428440 movies. We regarded actors as users. From the bipartite graph, we extracted a graph G* with 896308 nodes (actors), where an edge between two actors indicates that they have appeared in the same movie. There are 57064358 edges in G*, and the average degree in G* is 63.7 (= 57064358/896308).

Orkut.  The Orkut online social network dataset (denoted by Orkut) [36] includes a graph G* with 3072441 users and 117185083 edges. The average degree in G* is 38.1 (= 117185083/3072441). Therefore, Orkut is sparser than IMDB (whose average degree in G* is 63.7).

For each dataset, we randomly selected nn users from the whole graph GG^{*}, and extracted a graph G=(V,E)G=(V,E) with nn users. Then we estimated the number of triangles f(G)f_{\triangle}(G), the number of kk-stars fk(G)f_{k\star}(G), and the clustering coefficient (=3f(G)f2(G)=\frac{3f_{\triangle}(G)}{f_{2\star}(G)}) using ε\varepsilon-edge LDP (or ε\varepsilon-edge centralized DP) algorithms in Section 4. Specifically, we used the following algorithms:

Algorithms for triangles.  For estimating f△(G), we used the following three algorithms: (1) the RR (Randomized Response) with the empirical estimation method in the local model (i.e., LocalRR in Section 4.2), (2) the two-rounds algorithm in the local model (i.e., Local2Rounds in Section 4.3), and (3) the Laplacian mechanism in the centralized model (i.e., CentralLap△ in Section 4.2).

Algorithms for k-stars.  For estimating fk⋆(G), we used the following two algorithms: (1) the Laplacian mechanism in the local model (i.e., LocalLapk⋆ in Section 4.1) and (2) the Laplacian mechanism in the centralized model (i.e., CentralLapk⋆ in Section 4.1).

For each algorithm, we evaluated the l2 loss and the relative error (as described in Section 3.4), while changing the values of n and ε. To stabilize the performance, we attempted γ ∈ ℕ ways of randomly selecting n users from G*, and averaged the utility value over the γ trials. When we changed n from 1000 to 10000, we set γ = 100 because the variance was large. For the other cases, we set γ = 10.
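As a concrete (non-private) reference point for this evaluation, the following sketch computes the ground-truth f△(G), fk⋆(G), and clustering coefficient of a sampled subgraph with NetworkX; it only illustrates the quantities being estimated, not our LDP algorithms, and the function name is illustrative.

```python
import math
import random
import networkx as nx

def ground_truth_stats(G_star, n, k=2, seed=0):
    """Sample n users from the whole graph G* and compute the non-private
    f_triangle(G), f_k*(G), and clustering coefficient 3 f_triangle(G) / f_2*(G)."""
    random.seed(seed)
    G = G_star.subgraph(random.sample(list(G_star.nodes()), n))
    f_tri = sum(nx.triangles(G).values()) // 3                 # each triangle is counted at 3 nodes
    f_kstar = sum(math.comb(d, k) for _, d in G.degree())
    f_2star = sum(math.comb(d, 2) for _, d in G.degree())
    cc = 3 * f_tri / f_2star if f_2star > 0 else 0.0
    return f_tri, f_kstar, cc
```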

In Appendix B, we also report experimental results using artificial graphs based on the Barabási-Albert model [9].

Figure 5: Relation between the number of users nn and the l2l_{2} loss in triangle counts when ε=1\varepsilon=1 (ε1=ε2=12\varepsilon_{1}=\varepsilon_{2}=\frac{1}{2}, d~max=dmax\tilde{d}_{max}=d_{max}). Here we do not evaluate LocalRR when n>10000n>10000, because it is inefficient (see Section 4.3 “Time complexity”).
Figure 6: Relation between the number of users nn and the l2l_{2} loss in kk-star counts when ε=1\varepsilon=1 (ε1=ε2=12\varepsilon_{1}=\varepsilon_{2}=\frac{1}{2}, d~max=dmax\tilde{d}_{max}=d_{max}).

5.2 Experimental Results

Relation between n and the l2 loss.  We first evaluated the l2 loss of the estimates of f△(G), f2⋆(G), and f3⋆(G) while changing the number of users n. Figures 5 and 6 show the results (ε = 1). Here we did not evaluate LocalRR when n was larger than 10000, because LocalRR was inefficient (as described in Section 4.3 "Time complexity"). In Local2Rounds, we set ε1 = ε2 = 1/2. As for d~max, we set d~max = dmax (i.e., we assumed that dmax is publicly available and did not perform graph projection) because we wanted to examine how well our theoretical results hold in our experiments. We also evaluate the effectiveness of the private calculation of dmax at the end of Section 5.2.

Figure 5 shows that Local2Rounds significantly outperforms LocalRR. Specifically, the l2l_{2} loss of Local2Rounds is smaller than that of LocalRR by a factor of about 10210^{2}. The difference between Local2Rounds and LocalRR is larger in Orkut. This is because Orkut is more sparse, as described in Section 5.1. For example, when n=10000n=10000, the maximum degree dmaxd_{max} in GG was 73.573.5 and 27.827.8 on average in IMDB and Orkut, respectively. Recall that for a fixed ε\varepsilon, the expected l2l_{2} loss of Local2Rounds and LocalRR can be expressed as O(ndmax3)O(nd_{max}^{3}) and O(n4)O(n^{4}), respectively. Thus Local2Rounds significantly outperforms LocalRR, especially in sparse graphs.

Figures 5 and 6 show that the l2l_{2} loss is roughly consistent with our upper-bounds in terms of nn. Specifically, LocalRR, Local2Rounds, CentralLap, LocalLapk⋆, and CentralLapk⋆ achieve the expected l2l_{2} loss of O(n4)O(n^{4}), O(ndmax3)O(nd_{max}^{3}), O(dmax2)O(d_{max}^{2}), O(ndmax2k2)O(nd_{max}^{2k-2}), and O(dmax2k2)O(d_{max}^{2k-2}), respectively. Here note that each user’s degree increases roughly in proportion to nn (though the degree is much smaller than nn), as we randomly select nn users from the whole graph GG^{*}. Assuming that dmax=O(n)d_{max}=O(n), Figures 5 and 6 are roughly consistent with the upper-bounds. The figures also show the limitations of the local model in terms of the utility when compared to the centralized model.

Relation between ε\varepsilon and the l2l_{2} loss.  Next we evaluated the l2l_{2} loss when we changed the privacy budget ε\varepsilon in edge LDP. Figure 7 shows the results for triangles and 22-stars (n=10000n=10000). Here we omit the result of 33-stars because it is similar to that of 22-stars. In Local2Rounds, we set ε1=ε2=ε2\varepsilon_{1}=\varepsilon_{2}=\frac{\varepsilon}{2}.

Figure 7 shows that the l2 loss is roughly consistent with our upper-bounds in terms of ε. For example, when we decrease ε from 0.4 to 0.1, the l2 loss increases by a factor of about 5000, 200, and 16 for both the datasets in LocalRR, Local2Rounds, and CentralLap△, respectively. They are roughly consistent with our theoretical results that for small ε, the expected l2 loss of LocalRR, Local2Rounds, and CentralLap△ is O(ε^{-6}), O(ε^{-4}), and O(ε^{-2}), respectively (we used e^ε ≈ ε + 1 to derive the upper-bound of LocalRR for small ε).

Figure 7: Relation between ε\varepsilon in edge LDP and the l2l_{2} loss when n=10000n=10000 (ε1=ε2=ε2\varepsilon_{1}=\varepsilon_{2}=\frac{\varepsilon}{2}, d~max=dmax\tilde{d}_{max}=d_{max}).
Figure 8: Relation between n and the relative error. In the local model, we used Local2Rounds (ε = 1 or 2) and LocalLapk⋆ (ε = 1 or 2) for estimating triangle counts f△(G) and k-star counts fk⋆(G), respectively (d~max = dmax).

Figure 7 also shows that Local2Rounds significantly outperforms LocalRR especially when ε is small, which is also consistent with our theoretical results. Conversely, the difference between LocalRR and Local2Rounds is small when ε is large. This is because when ε is large, the RR outputs the true value with high probability. For example, when ε ≥ 5, the RR outputs the true value with probability e^ε/(e^ε+1) > 0.993. However, LocalRR with such a large value of ε does not guarantee strong privacy, because it outputs the true value in most cases. Local2Rounds significantly outperforms LocalRR when we want to estimate f△(G) or fk⋆(G) with a strong privacy guarantee; e.g., ε ≤ 1 [37].

Relative error.  As the number of users n increases, the numbers of triangles f△(G) and k-stars fk⋆(G) increase, and so does the l2 loss. Therefore, we also evaluated the relative error, as described in Section 3.4.

Figure 8 shows the relation between n and the relative error (we omit the result of 3-stars because it is similar to that of 2-stars). In the local model, we used Local2Rounds and LocalLapk⋆ for estimating f△(G) and fk⋆(G), respectively (we did not use LocalRR, because it is both inaccurate and inefficient). For both algorithms, we set ε = 1 or 2 (ε1 = ε2 = ε/2 in Local2Rounds) and d~max = dmax. Then we estimated the clustering coefficient as 3f^△(G,ε1,ε2,dmax)/f^2⋆(G,ε,dmax), where f^△(G,ε1,ε2,dmax) and f^2⋆(G,ε,dmax) are the estimates of f△(G) and f2⋆(G), respectively. If the estimate of the clustering coefficient is smaller than 0 (resp. larger than 1), we set the estimate to 0 (resp. 1) because the clustering coefficient is always between 0 and 1. In the centralized model, we used CentralLap△ and CentralLapk⋆ (ε = 1 or 2, d~max = dmax) and calculated the clustering coefficient in the same way.

Figure 8 shows that for all cases, the relative error decreases as n increases. This is because both f△(G) and fk⋆(G) significantly increase as n increases. Specifically, let f△,vi(G) ∈ ℤ≥0 be the number of triangles that involve user vi, and fk⋆,vi(G) ∈ ℤ≥0 be the number of k-stars of which user vi is a center. Then f△(G) = (1/3) Σ_{i=1}^n f△,vi(G) and fk⋆(G) = Σ_{i=1}^n fk⋆,vi(G). Since both f△,vi(G) and fk⋆,vi(G) increase with n, both f△(G) and fk⋆(G) increase at least in proportion to n. Thus f△(G)^2 ≥ Ω(n^2) and fk⋆(G)^2 ≥ Ω(n^2). In contrast, Local2Rounds, LocalLapk⋆, CentralLap△, and CentralLapk⋆ achieve expected l2 losses of O(n), O(n), O(1), and O(1), respectively (when we ignore dmax and ε), all of which are smaller than O(n^2). Therefore, the relative error decreases as n increases.

This result demonstrates that we can accurately estimate graph statistics for large nn in the local model. In particular, the relative error is smaller in IMDB because IMDB is denser and includes a larger number of triangles and kk-stars; i.e., the denominator of the relative error is large. For example, when n=200000n=200000 and ε=1\varepsilon=1, the relative error is 0.300.30 and 0.00280.0028 for triangles and 22-stars, respectively. Note that the clustering coefficient requires 2ε2\varepsilon because we need to estimate both f(G)f_{\triangle}(G) and fk(G)f_{k\star}(G). Yet, we can still accurately calculate the clustering coefficient with a moderate privacy budget; e.g., the relative error of the clustering coefficient is 0.300.30 when the privacy budget is 22 (i.e., ε=1\varepsilon=1). If nn is larger, then ε\varepsilon would be smaller at the same value of the relative error.

Figure 9: Relative error when d~max = n (#users), dmax (max degree), or d^max (noisy max degree). We used Local2Rounds (ε = 1 or 2) and LocalLapk⋆ (ε = 1 or 2) for estimating triangle counts f△(G) and k-star counts fk⋆(G), respectively.

Private calculation of dmaxd_{max}.  We have so far assumed that d~max=dmax\tilde{d}_{max}=d_{max} (i.e., dmaxd_{max} is publicly available) in our experiments. We finally evaluate the methods to privately calculate dmaxd_{max} with ε0\varepsilon_{0}-edge LDP (described in Sections 4.1 and 4.3).

Specifically, we used Local2Rounds and LocalLapk⋆ for estimating f△(G) and fk⋆(G), respectively, and evaluated the following three methods for setting d~max: (i) d~max = n; (ii) d~max = dmax; (iii) d~max = d^max, where d^max is the private estimate of dmax (noisy max degree) in Sections 4.1 and 4.3.

We set n = 200000 in IMDB and n = 1600000 in Orkut. Regarding the total privacy budget ε in edge LDP for estimating f△(G) or fk⋆(G), we set ε = 1 or 2. We used ε/10 for privately calculating dmax (i.e., ε0 = ε/10), and the remaining privacy budget 9ε/10 as input to Local2Rounds or LocalLapk⋆. In Local2Rounds, we set ε1 = ε2; i.e., we set (ε0,ε1,ε2) = (0.1,0.45,0.45) or (0.2,0.9,0.9). Then we estimated the clustering coefficient in the same way as Figure 8.

Figure 9 shows the results. The algorithms with d~max = d^max (noisy max degree) achieve a relative error close to (sometimes almost the same as) that of the algorithms with d~max = dmax, and significantly outperform the algorithms with d~max = n. This means that we can privately estimate dmax without a significant loss of utility.

Summary of results.  In summary, our experimental results showed that the estimation error of triangle counts is significantly reduced by introducing the interaction between users and a data collector. The results also showed that we can achieve small relative errors (much smaller than 1) for subgraph counts with privacy budget ε=1\varepsilon=1 or 22 in edge LDP.

As described in Section 1, non-private subgraph counts may reveal some friendship information, and a central server may face data breaches. Our LDP algorithms are highly beneficial because they enable us to analyze the connection patterns in a graph (i.e., subgraph counts) and to understand how likely two friends of an individual are to also be friends of one another (i.e., the clustering coefficient), while strongly protecting individual privacy.

6 Conclusions

We presented a series of algorithms for counting triangles and kk-stars under LDP. We showed that an additional round can significantly reduce the estimation error in triangles, and the algorithm based on the Laplacian mechanism provides an order optimal error in the non-interactive local model. We also showed lower-bounds for general functions including triangles and kk-stars. We conducted experiments using two real datasets, and showed that our algorithms achieve small relative errors, especially when the number of users is large.

As future work, we would like to develop algorithms for other subgraph counts such as cliques and kk-triangles [33].

Acknowledgments

Kamalika Chaudhuri and Jacob Imola would like to thank ONR under N00014-20-1-2334 and UC Lab Fees under LFR 18-548554 for research support. Takao Murakami was supported in part by JSPS KAKENHI JP19H04113.

References

  • [1] Tool: LDP graph statistics. https://github.com/LDPGraphStatistics/LDPGraphStatistics.
  • [2] 12th Annual Graph Drawing Contest. http://mozart.diei.unipg.it/gdcontest/contest2005/index.html, 2005.
  • [3] What to Do When Your Facebook Profile is Maxed Out on Friends. https://authoritypublishing.com/social-media/what-to-do-when-your-facebook-profile-is-maxed-out-on-friends/, 2012.
  • [4] AI bridging cloud infrastructure (ABCI). https://abci.ai/, 2020.
  • [5] The diaspora* project. https://diasporafoundation.org/, 2020.
  • [6] Facebook Reports Third Quarter 2020 Results. https://investor.fb.com/investor-news/press-release-details/2020/Facebook-Reports-Third-Quarter-2020-Results/default.aspx, 2020.
  • [7] Jayadev Acharya, Clément L. Canonne, Yuhan Liu, Ziteng Sun, and Himanshu Tyagi. Interactive inference under information constraints. CoRR, 2007.10976, 2020.
  • [8] Jayadev Acharya, Ziteng Sun, and Huanyu Zhang. Hadamard response: Estimating distributions privately, efficiently, and with little communication. In Proc. AISTATS’19, pages 1120–1129, 2019.
  • [9] Albert-László Barabási. Network Science. Cambridge University Press, 2016.
  • [10] Raef Bassily, Kobbi Nissim, Uri Stemmer, and Abhradeep Thakurta. Practical locally private heavy hitters. In Proc. NIPS’17, pages 2285–2293, 2017.
  • [11] Raef Bassily and Adam Smith. Local, private, efficient protocols for succinct histograms. In Proc. STOC’15, pages 127–135, 2015.
  • [12] Vincent Bindschaedler and Reza Shokri. Synthesizing plausible privacy-preserving location traces. In Proc. S&P’16, pages 546–563, 2016.
  • [13] Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. The johnson-lindenstrauss transform itself preserves differential privacy. In Proc. FOCS’12, pages 410–419, 2012.
  • [14] Rui Chen, Gergely Acs, and Claude Castelluccia. Differentially private sequential data publication via variable-length n-grams. In Proc. CCS’12, pages 638–649, 2012.
  • [15] Xihui Chen, Sjouke Mauw, and Yunior Ramírez-Cruz. Publishing community-preserving attributed social graphs with a differential privacy guarantee. Proceedings on Privacy Enhancing Technologies (PoPETs), (4):131–152, 2020.
  • [16] Wei-Yen Day, Ninghui Li, and Min Lyu. Publishing graph degree distribution with node differential privacy. In Proc. SIGMOD’16, pages 123–138, 2016.
  • [17] Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately. In Proc. NIPS’17, pages 3574–3583, 2017.
  • [18] John Duchi and Ryan Rogers. Lower bounds for locally private estimation via communication complexity. arXiv:1902.00582, 2019.
  • [19] John Duchi, Martin Wainwright, and Michael Jordan. Minimax optimal procedures for locally private estimation. arXiv:1604.02390, 2017.
  • [20] John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy and statistical minimax rates. In Proc. FOCS’13, pages 429–438, 2013.
  • [21] John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy, data processing inequalities, and minimax rates. CoRR, 1302.3203, 2014.
  • [22] Cynthia Dwork. Differential privacy. In Proc. ICALP’06, pages 1–12, 2006.
  • [23] Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Now Publishers, 2014.
  • [24] Giulia Fanti, Vasyl Pihur, and Ulfar Erlingsson. Building a RAPPOR with the unknown: Privacy-preserving learning of associations and data dictionaries. Proceedings on Privacy Enhancing Technologies (PoPETs), 2016(3):1–21, 2016.
  • [25] Ghurumuruhan Ganesan. Existence of connected regular and nearly regular graphs. CoRR, 1801.08345, 2018.
  • [26] Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. Exploring network structure, dynamics, and function using networkx. In Proceedings of the 7th Python in Science Conference (SciPy’08), pages 11–15, 2008.
  • [27] Michael Hay, Chao Li, Gerome Miklau, and David Jensen. Accurate estimation of the degree distribution of private networks. In Proc. ICDM’09, pages 169–178, 2009.
  • [28] Matthew Joseph, Janardhan Kulkarni, Jieming Mao, and Zhiwei Steven Wu. Locally private Gaussian estimation. arXiv:1811.08382, 2019.
  • [29] Matthew Joseph, Jieming Mao, Seth Neel, and Aaron Roth. The role of interactivity in local differential privacy. arXiv:1904.03564, 2019.
  • [30] Matthew Joseph, Jieming Mao, and Aaron Roth. Exponential separations in local differential privacy. In Proc. SODA’20, pages 515–527, 2020.
  • [31] Peter Kairouz, Keith Bonawitz, and Daniel Ramage. Discrete distribution estimation under local privacy. In Proc. ICML’16, pages 2436–2444, 2016.
  • [32] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. Extremal mechanisms for local differential privacy. Journal of Machine Learning Research, 17(1):492–542, 2016.
  • [33] Vishesh Karwa, Sofya Raskhodnikova, Adam Smith, and Grigory Yaroslavtsev. Private analysis of graph structure. Proceedings of the VLDB Endowment, 4(11):1146–1157, 2011.
  • [34] Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, and Sofya Raskhodnikova. What can we learn privately? In Proc. FOCS’08, pages 531–540, 2008.
  • [35] Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Analyzing graphs with node differential privacy. In Proc. TCC’13, pages 457–476, 2013.
  • [36] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, 2014.
  • [37] Ninghui Li, Min Lyu, and Dong Su. Differential Privacy: From Theory to Practice. Morgan & Claypool Publishers, 2016.
  • [38] Chris Morris. Hackers had a banner year in 2019. https://fortune.com/2020/01/28/2019-data-breach-increases-hackers/, 2020.
  • [39] Takao Murakami and Yusuke Kawamoto. Utility-optimized local differential privacy mechanisms for distribution estimation. In Proc. USENIX Security’19, pages 1877–1894, 2019.
  • [40] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
  • [41] M. E. J. Newman. Random graphs with clustering. Physical Review Letters, 103(5):058701, 2009.
  • [42] Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. In Proc. STOC’07, pages 75–84, 2007.
  • [43] Thomas Paul, Antonino Famulari, and Thorsten Strufe. A survey on decentralized online social networks. Computer Networks, 75:437–452, 2014.
  • [44] Venkatadheeraj Pichapati, Ananda Theertha Suresh, Felix X. Yu, Sashank J. Reddi, and Sanjiv Kumar. AdaCliP: Adaptive clipping for private SGD. CoRR, 1908.07643, 2019.
  • [45] Zhan Qin, Yin Yang, Ting Yu, Issa Khalil, Xiaokui Xiao, and Kui Ren. Heavy hitter estimation over set-valued data with local differential privacy. In Proc. CCS’16, pages 192–203, 2016.
  • [46] Zhan Qin, Ting Yu, Yin Yang, Issa Khalil, Xiaokui Xiao, and Kui Ren. Generating synthetic decentralized social graphs with local differential privacy. In Proc. CCS’17, pages 425–438, 2017.
  • [47] Cyrus Rashtchian, David P. Woodruff, and Hanlin Zhu. Vector-matrix-vector queries for solving linear algebra, statistics, and graph problems. CoRR, 2006.14015, 2020.
  • [48] Sofya Raskhodnikova and Adam Smith. Efficient lipschitz extensions for high-dimensional graph statistics and node private degree distributions. CoRR, 1504.07912, 2015.
  • [49] Sofya Raskhodnikova and Adam Smith. Differentially Private Analysis of Graphs, pages 543–547. Springer, 2016.
  • [50] Andrea De Salve, Paolo Mori, and Laura Ricci. A survey on privacy in decentralized online social networks. Computer Science Review, 27:154–176, 2018.
  • [51] Tara Seals. Data breaches increase 40% in 2016. https://www.infosecurity-magazine.com/news/data-breaches-increase-40-in-2016/, 2017.
  • [52] Shuang Song, Susan Little, Sanjay Mehta, Staal Vinterbo, and Kamalika Chaudhuri. Differentially private continual release of graph statistics. CoRR, 1809.02575, 2018.
  • [53] Haipei Sun, Xiaokui Xiao, Issa Khalil, Yin Yang, Zhan Qin, Hui (Wendy) Wang, and Ting Yu. Analyzing subgraph statistics from extended local views with decentralized differential privacy. In Proc. CCS’19, pages 703–717, 2019.
  • [54] Om Thakkar, Galen Andrew, and H. Brendan McMahan. Differentially private learning with adaptive clipping. CoRR, 1905.03871, 2019.
  • [55] Abhradeep Guha Thakurta, Andrew H. Vyrros, Umesh S. Vaishampayan, Gaurav Kapoor, Julien Freudiger, Vivek Rangarajan Sridhar, and Doug Davidson. Learning New Words, US Patent 9,594,741, Mar. 14 2017.
  • [56] Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proc. CCS’14, pages 1054–1067, 2014.
  • [57] Tianhao Wang, Jeremiah Blocki, Ninghui Li, and Somesh Jha. Locally differentially private protocols for frequency estimation. In Proc. USENIX Security’17, pages 729–745, 2017.
  • [58] Yue Wang and Xintao Wu. Preserving differential privacy in degree-correlation based graph generation. Transactions on Data Privacy, 6(2), 2013.
  • [59] Yue Wang, Xintao Wu, and Leting Wu. Differential privacy preserving spectral graph analysis. In Proc. PAKDD’13, pages 329–340, 2013.
  • [60] Stanley L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63–69, 1965.
  • [61] Xiaokui Xiao, Gabriel Bender, Michael Hay, and Johannes Gehrke. ireduct: Differential privacy with reduced relative errors. In Proc. SIGMOD’11, pages 229–240, 2011.
  • [62] Min Ye and Alexander Barg. Optimal schemes for discrete distribution estimation under local differential privacy. In Proc. ISIT’17, pages 759–763, 2017.
  • [63] Qingqing Ye, Haibo Hu, Man Ho Au, Xiaofeng Meng, and Xiaokui Xiao. Towards locally differentially private generic graph metric estimation. In Proc. ICDE’20, pages 1922–1925, 2020.
  • [64] Qingqing Ye, Haibo Hu, Man Ho Au, Xiaofeng Meng, and Xiaokui Xiao. LF-GDPR: A framework for estimating graph metrics with local differential privacy. IEEE Transactions on Knowledge and Data Engineering (Early Access), pages 1–16, 2021.
  • [65] Hailong Zhang, Sufian Latif, Raef Bassily, and Atanas Rountev. Differentially-private control-flow node coverage for software usage analysis. In Proc. USENIX Security’20, pages 1021–1038, 2020.

Appendix A Effectiveness of empirical estimation in LocalRR

In Section 4.2, we presented LocalRR, which uses the empirical estimation method after the RR. Here we show the effectiveness of empirical estimation by comparing LocalRR with the RR without empirical estimation [46, 63].

As the RR without empirical estimation, we applied the RR to the lower triangular part of the adjacency matrix 𝐀\mathbf{A}; i.e., we ran lines 1 to 6 in Algorithm 2. Then we output the number of noisy triangles m3m_{3}. We denote this algorithm by RR w/o emp.
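To make the comparison concrete, the following sketch contrasts the two estimators given the subgraph counts (m3, m2, m1, m0) of the noisy graph G′; the empirical estimate follows expression (4) of Proposition 2 (derived in Appendix D.3), and the function name is illustrative.

```python
import numpy as np

def triangle_estimates(m3, m2, m1, m0, eps):
    """Naive estimate (RR w/o emp) vs. the empirical estimate used by LocalRR."""
    mu = np.exp(eps)
    naive = m3                                                           # just count noisy triangles
    empirical = (mu**3 * m3 - mu**2 * m2 + mu * m1 - m0) / (mu - 1)**3   # expression (4)
    return naive, empirical
```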

Figure 10 shows the l2l_{2} loss of LocalRR and RR w/o emp when we changed nn from 10001000 to 1000010000 or ε\varepsilon in edge LDP from 0.10.1 to 22. The experimental set-up is the same as Section 5.1. Figure 10 shows that LocalRR significantly outperforms RR w/o emp, which means that the l2l_{2} loss is significantly reduced by empirical estimation. As shown in Section 5, the l2l_{2} loss of LocalRR is also significantly reduced by an additional round of interaction.

Figure 10: l2l_{2} loss of LocalRR and the RR without empirical estimation (RR w/o emp).

Appendix B Experiments on Barabási-Albert Graphs

Experimental set-up.  In Section 5, we evaluated our algorithms using two real datasets: IMDB and Orkut. We also evaluated our algorithms using artificial graphs that have power-law degree distributions. We used the BA (Barabási-Albert) model [9] to generate such graphs.

In the BA model, an artificial graph (referred to as a BA graph) is grown by adding new nodes one at a time. Each new node is connected to λ\lambda\in\mathbb{N} existing nodes with probability proportional to the degree of the existing node. In our experiments, we used NetworkX [26], a Python package for graph analysis, to generate BA graphs.

We generated a BA graph G* with 1000000 nodes using NetworkX. For the attachment parameter λ, we set λ = 10 or 50. When λ = 10 (resp. 50), the average degree of G* was 10.0 (resp. 50.0). For each case, we randomly selected n users from the whole graph G*, and extracted a graph G = (V,E) with the n users. Then we estimated the number of triangles f△(G) and the number of 2-stars f2⋆(G). For triangles, we evaluated LocalRR, Local2Rounds, and CentralLap△. For 2-stars, we evaluated LocalLap2⋆ and CentralLap2⋆. In Local2Rounds, we set ε1 = ε2. For d~max, we set d~max = dmax.
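A minimal sketch of this graph generation with NetworkX (the seed is arbitrary and not taken from our experiments; a smaller number of nodes can be used for a quick test):

```python
import networkx as nx

lam = 10                                                     # attachment parameter; we also use 50
G_star = nx.barabasi_albert_graph(n=1000000, m=lam, seed=1)  # BA graph with 1000000 nodes
# the "average degree" reported above is #edges / #nodes, which is close to lam
print(G_star.number_of_edges() / G_star.number_of_nodes())
```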

We evaluated the l2l_{2} loss while changing nn and ε\varepsilon. We attempted γ\gamma\in\mathbb{N} ways to randomly select nn users from GG^{*}, and averaged the l2l_{2} loss over all the γ\gamma ways to randomly select nn users. As with Section 5, we set γ=100\gamma=100 and changed nn from 10001000 to 1000010000 while fixing ε=1\varepsilon=1. Then we set γ=10\gamma=10 and changed ε\varepsilon from 0.10.1 to 22 while fixing n=10000n=10000.

Experimental results.  Figure 11 shows the results. Overall, Figure 11 has a similar tendency to Figures 5, 6, and 7. For example, Local2Rounds significantly outperforms LocalRR, especially when the graph G is sparse; i.e., λ = 10. In Local2Rounds, CentralLap△, LocalLap2⋆, and CentralLap2⋆, the l2 loss increases with increase in λ. This is because the maximum degree dmax (= d~max) increases with increase in λ.

Figure 11: l2l_{2} loss in the Barabási-Albert graph datasets (left: ε=1\varepsilon=1, right: n=10000n=10000). We set the attachment parameter λ\lambda in the BA model to λ=10\lambda=10 or 5050, and d~max\tilde{d}_{max} to d~max=dmax\tilde{d}_{max}=d_{max}.

Figure 11 also shows that the l2l_{2} loss is roughly consistent with our upper-bounds in Section 4. For example, recall that LocalRR, Local2Rounds, CentralLap, LocalLap2⋆, and CentralLap2⋆ achieve the expected l2l_{2} loss of O(n4)O(n^{4}), O(ndmax3)O(nd_{max}^{3}), O(dmax2)O(d_{max}^{2}), O(ndmax2)O(nd_{max}^{2}), and O(dmax2)O(d_{max}^{2}), respectively. Assuming that dmax=O(n)d_{max}=O(n), the left panels of Figure 11 are roughly consistent with these upper-bounds. In addition, the right panels of Figure 11 show that when we set λ=10\lambda=10 and decrease ε\varepsilon from 0.40.4 to 0.10.1, the l2l_{2} loss increases by a factor of about 38003800, 250250, and 1616 in LocalRR, Local2Rounds, and CentralLap, respectively. They are roughly consistent with our upper-bounds – for small ε\varepsilon, the expected l2l_{2} loss of LocalRR, Local2Rounds, and CentralLap is O(ε6)O(\varepsilon^{-6}), O(ε4)O(\varepsilon^{-4}), and O(ε2)O(\varepsilon^{-2}), respectively.

In summary, for both the two real datasets and the BA graphs, our experimental results showed the following findings: (1) Local2Rounds significantly outperforms LocalRR, especially when the graph GG is sparse; (2) our experimental results are roughly consistent with our upper-bounds.

Appendix C Construction of an (n, dmax/2 − 2)-independent cube for f△

Suppose that n is even and dmax is divisible by 4. Since dmax < n, it is possible to write n = η1·(dmax/2) + η2 for integers η1, η2 such that η1 ≥ 1 and 1 ≤ η2 < dmax/2. Because η1·(dmax/2) and n are even, η2 must be even. Now, we can write n = (η1−1)·(dmax/2) + (η2 + dmax/2). Thus, we can define a graph G = (V,E) on n nodes consisting of (η1−1) cliques of even size dmax/2 and one final clique of even size η2 + dmax/2 ∈ (dmax/2, dmax), with all cliques disjoint.

Since G = (V,E) consists of even-sized cliques, it contains a perfect matching M. Figure 12 shows examples of G and M, where n = 14, dmax = 8, η1 = 3, and η2 = 2. Let G′ = (V,E′) such that E′ = E ∖ M. Let 𝒜 = {(V, E′ ∪ N) : N ⊆ M}. Each edge in G is part of at least dmax/2 − 2 triangles. For each pair of edges in M, the triangles of G of which they are part are disjoint. Thus, for any edge e ∈ M, removing e from a graph in 𝒜 will remove at least dmax/2 − 2 triangles. This implies that 𝒜 is an (n, dmax/2 − 2)-independent cube for f△.
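The following sketch checks this construction on the Figure 12 example (n = 14, dmax = 8, η1 = 3, η2 = 2): it builds the disjoint cliques, removes a perfect matching inside each clique, and verifies that adding back any matching edge creates at least dmax/2 − 2 triangles (illustrative code, not part of our algorithms).

```python
import networkx as nx

def triangle_count(G):
    return sum(nx.triangles(G).values()) // 3    # each triangle is counted at 3 nodes

clique_sizes = [4, 4, 6]     # two cliques of size d_max/2 = 4 and one of size eta_2 + d_max/2 = 6
d_max = 8
G, M, start = nx.Graph(), [], 0
for c in clique_sizes:
    nodes = list(range(start, start + c))
    G.add_edges_from(nx.complete_graph(nodes).edges())
    M += [(nodes[i], nodes[i + 1]) for i in range(0, c, 2)]   # a perfect matching inside the clique
    start += c

G_prime = G.copy()
G_prime.remove_edges_from(M)                     # E' = E \ M
base = triangle_count(G_prime)
for e in M:
    H = G_prime.copy()
    H.add_edge(*e)
    assert triangle_count(H) - base >= d_max // 2 - 2    # each matching edge adds >= d_max/2 - 2 triangles
```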

Figure 12: Examples of GG and MM for constructing an independent cube for ff_{\triangle} (n=14n=14, dmax=8d_{max}=8, η1=3\eta_{1}=3, η2=2\eta_{2}=2).

Appendix D Proof of Statements in Section 4

Here we prove the statements in Section 4. Our proofs will repeatedly use the well-known bias-variance decomposition [40], which we briefly explain below. We denote the variance of the random variable XX by 𝕍[X]\mathbb{V}[X]. If we are producing a private, randomized estimate f^(G)\hat{f}(G) of the graph function f(G)f(G), then the expected l2l_{2} loss (over the randomness in the algorithm) can be written as:

𝔼[l22(f^(G),f(G))]=(𝔼[f^(G)]f(G))2+𝕍[f^(G)].\operatorname{\mathbb{E}}[l_{2}^{2}(\hat{f}(G),f(G))]=\left(\operatorname{\mathbb{E}}[\hat{f}(G)]-f(G)\right)^{2}+\operatorname{\mathbb{V}}[\hat{f}(G)]. (7)

The first term is the bias, and the second term is the variance. If the estimate is unbiased (i.e., 𝔼[f^(G)]=f(G)\operatorname{\mathbb{E}}[\hat{f}(G)]=f(G)), then the expected l2l_{2} loss is equal to the variance.

D.1 Proof of Theorem 1

Let i\mathcal{R}_{i} be LocalLapk⋆. Let di,di0d_{i},d^{\prime}_{i}\in\mathbb{Z}_{\geq 0} be the number of “1”s in two neighbor lists 𝐚i,𝐚i{0,1}n\mathbf{a}_{i},\mathbf{a}^{\prime}_{i}\in\{0,1\}^{n} that differ in one bit. Let ri=(dik)r_{i}=\binom{d_{i}}{k} and ri=(dik)r^{\prime}_{i}=\binom{d^{\prime}_{i}}{k}. Below we consider two cases about did_{i}: when di<d~maxd_{i}<\tilde{d}_{max} and when did~maxd_{i}\geq\tilde{d}_{max}.

Case 1: di < d~max.  In this case, neither 𝐚i nor 𝐚′i changes after graph projection, since d′i ≤ di + 1 ≤ d~max. Then we obtain:

Pr[i(𝐚i)=r^i]\displaystyle\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{r}_{i}] =exp(ε|r^iri|Δ)\displaystyle=\exp\left(-\frac{\varepsilon|\hat{r}_{i}-r_{i}|}{\Delta}\right)
Pr[i(𝐚i)=r^i]\displaystyle\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{r}_{i}] =exp(ε|r^iri|Δ),\displaystyle=\exp\left(-\frac{\varepsilon|\hat{r}_{i}-r^{\prime}_{i}|}{\Delta}\right),

where Δ=(d~maxk1)\Delta=\binom{\tilde{d}_{max}}{k-1}. Therefore,

Pr[i(𝐚i)=r^i]Pr[i(𝐚i)=r^i]\displaystyle\frac{\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{r}_{i}]}{\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{r}_{i}]} =exp(ε|r^iri|Δε|r^iri|Δ)\displaystyle=\exp\left(\frac{\varepsilon|\hat{r}_{i}-r^{\prime}_{i}|}{\Delta}-\frac{\varepsilon|\hat{r}_{i}-r_{i}|}{\Delta}\right)
exp(ε|riri|Δ)\displaystyle\leq\exp\left(\frac{\varepsilon|r^{\prime}_{i}-r_{i}|}{\Delta}\right) (8)
(by the triangle inequality).\displaystyle\hskip 11.38109pt(\text{by the triangle inequality}).

If di=di+1d^{\prime}_{i}=d_{i}+1, then |riri||r^{\prime}_{i}-r_{i}| in (8) can be written as follows:

|riri|=(di+1k)(dik)=(dik1)<(d~maxk1)=Δ,\displaystyle|r^{\prime}_{i}-r_{i}|=\binom{d_{i}+1}{k}-\binom{d_{i}}{k}=\binom{d_{i}}{k-1}<\binom{\tilde{d}_{max}}{k-1}=\Delta,

Since we add Lap(Δε)\textrm{Lap}(\frac{\Delta}{\varepsilon}) to rir_{i}, we obtain:

Pr[i(𝐚i)=r^i]eεPr[i(𝐚i)=r^i].\displaystyle\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{r}_{i}]\leq e^{\varepsilon}\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{r}_{i}]. (9)

If di=di1d^{\prime}_{i}=d_{i}-1, then |riri|=(dik)(di1k)=(di1k1)<Δ|r^{\prime}_{i}-r_{i}|=\binom{d_{i}}{k}-\binom{d_{i}-1}{k}=\binom{d_{i}-1}{k-1}<\Delta and (9) holds. Therefore, LocalLapk⋆ provides ε\varepsilon-edge LDP.

Case 2: did~maxd_{i}\geq\tilde{d}_{max}.  Assume that di=di+1d^{\prime}_{i}=d_{i}+1. In this case, di>d~maxd^{\prime}_{i}>\tilde{d}_{max}. Therefore, did^{\prime}_{i} becomes d~max\tilde{d}_{max} after graph projection. In addition, did_{i} also becomes d~max\tilde{d}_{max} after graph projection. Therefore, we obtain di=di=d~maxd_{i}=d^{\prime}_{i}=\tilde{d}_{max} after graph projection. Thus Pr[i(𝐚i)=r^i]=Pr[i(𝐚i)=r^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{r}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{r}_{i}].

Assume that di=di1d^{\prime}_{i}=d_{i}-1. If di>d~maxd_{i}>\tilde{d}_{max}, then di=di=d~maxd_{i}=d^{\prime}_{i}=\tilde{d}_{max} after graph projection. Thus Pr[i(𝐚i)=r^i]=Pr[i(𝐚i)=r^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{r}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{r}_{i}]. If di=d~maxd_{i}=\tilde{d}_{max}, then (9) holds. Therefore, LocalLapk⋆ provides ε\varepsilon-edge LDP. ∎

D.2 Proof of Theorem 2

Assuming the maximum degree dmaxd_{max} of GG is at most d~max\tilde{d}_{max}, the only randomness in the algorithm will be the Laplace noise since graph projection will not occur. Since the Laplacian noise Lap(Δε)\textrm{Lap}(\frac{\Delta}{\varepsilon}) has mean 0, the estimate f^k(G,ε,d~max)\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}) is unbiased. Then by the bias-variance decomposition [40], the expected l2l_{2} loss 𝔼[l22(f^k(G,ε,d~max),fk(G))]\mathbb{E}[l_{2}^{2}(\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}),\allowbreak f_{k\star}(G))] is equal to the variance of f^k(G,ε,d~max)\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}). The variance of f^k(G,ε,d~max)\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}) can be written as follows:

𝕍[f^k(G,ε,d~max)]\displaystyle\mathbb{V}[\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max})] =𝕍[i=1nLap(Δε)]\displaystyle=\mathbb{V}\left[\sum_{i=1}^{n}\textrm{Lap}\left(\frac{\Delta}{\varepsilon}\right)\right]
=nΔ2ε2.\displaystyle=\frac{n\Delta^{2}}{\varepsilon^{2}}.

Since Δ=(d~maxk1)=O(d~maxk1)\Delta=\binom{\tilde{d}_{max}}{k-1}=O(\tilde{d}_{max}^{k-1}), we obtain:

𝔼[l22(f^k(G,ε,d~max),fk(G))]\displaystyle\mathbb{E}[l_{2}^{2}(\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max}),f_{k\star}(G))] =𝕍[f^k(G,ε,d~max)]\displaystyle=\mathbb{V}[\hat{f}_{k\star}(G,\varepsilon,\tilde{d}_{max})]
=O(nd~max2k2ε2).\displaystyle=O\left(\frac{n\tilde{d}_{max}^{2k-2}}{\varepsilon^{2}}\right).

D.3 Proof of Proposition 2

Let μ=eε\mu=e^{\varepsilon} and 𝐐[0,1]4×4\mathbf{Q}\in[0,1]^{4\times 4} be a 4×44\times 4 matrix such that:

𝐐=1(μ+1)3(μ33μ23μ1μ2μ3+2μ2μ2+1μμ2μ2+1μ3+2μμ213μ3μ2μ3).\displaystyle\mathbf{Q}=\frac{1}{(\mu+1)^{3}}\left(\begin{array}[]{cccc}\mu^{3}&3\mu^{2}&3\mu&1\\ \mu^{2}&\mu^{3}+2\mu&2\mu^{2}+1&\mu\\ \mu&2\mu^{2}+1&\mu^{3}+2\mu&\mu^{2}\\ 1&3\mu&3\mu^{2}&\mu^{3}\end{array}\right). (14)

Let c3, c2, c1, c0 ∈ ℤ≥0 be respectively the numbers of triangles, 2-edges, 1-edges, and no-edges in G. Then we obtain:

(𝔼[m3],𝔼[m2],𝔼[m1],𝔼[m0])=(c3,c2,c1,c0)𝐐.\displaystyle(\mathbb{E}[m_{3}],\mathbb{E}[m_{2}],\mathbb{E}[m_{1}],\mathbb{E}[m_{0}])=(c_{3},c_{2},c_{1},c_{0})\mathbf{Q}. (15)

In other words, 𝐐\mathbf{Q} is a transition matrix from a type of subgraph (i.e., triangle, 2-edges, 1-edge, or no-edge) in GG to a type of subgraph in GG^{\prime}.

Let c^3,c^2,c^1,c^0\hat{c}_{3},\hat{c}_{2},\hat{c}_{1},\hat{c}_{0}\in\mathbb{R} be the empirical estimate of (c3,c2,c1,c0)(c_{3},c_{2},c_{1},c_{0}). By (15), they can be written as follows:

(c^3,c^2,c^1,c^0)=(m3,m2,m1,m0)𝐐1.\displaystyle(\hat{c}_{3},\hat{c}_{2},\hat{c}_{1},\hat{c}_{0})=(m_{3},m_{2},m_{1},m_{0})\mathbf{Q}^{-1}. (16)

Let 𝐐i,j1\mathbf{Q}_{i,j}^{-1} be the (i,ji,j)-th element of 𝐐1\mathbf{Q}^{-1}. By using Cramer’s rule, we obtain:

𝐐1,11\displaystyle\mathbf{Q}_{1,1}^{-1} =μ3(μ1)3,𝐐2,11=μ2(μ1)3,\displaystyle=\textstyle{\frac{\mu^{3}}{(\mu-1)^{3}}},~{}\mathbf{Q}_{2,1}^{-1}=\textstyle{-\frac{\mu^{2}}{(\mu-1)^{3}}}, (17)
𝐐3,11\displaystyle\mathbf{Q}_{3,1}^{-1} =μ(μ1)3,𝐐4,11=1(μ1)3.\displaystyle=\textstyle{\frac{\mu}{(\mu-1)^{3}}},~{}\mathbf{Q}_{4,1}^{-1}=\textstyle{-\frac{1}{(\mu-1)^{3}}}. (18)

By (16), (17), and (18), we obtain:

c^3=μ3(μ1)3m3μ2(μ1)3m2+μ(μ1)3m11(μ1)3m0.\displaystyle\textstyle{\hat{c}_{3}=\frac{\mu^{3}}{(\mu-1)^{3}}m_{3}-\frac{\mu^{2}}{(\mu-1)^{3}}m_{2}+\frac{\mu}{(\mu-1)^{3}}m_{1}-\frac{1}{(\mu-1)^{3}}m_{0}.}

Since μ=eε\mu=e^{\varepsilon} and the empirical estimate is unbiased [31, 57], we obtain (4) in Proposition 2. ∎
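As a quick sanity check of (17) and (18), the following SymPy snippet verifies symbolically that (μ^3, −μ^2, μ, −1)/(μ−1)^3 is the first column of 𝐐^{−1}, i.e., that 𝐐 times this vector equals the first standard basis vector (this is only a verification aid, not part of the proof).

```python
import sympy as sp

mu = sp.symbols('mu', positive=True)
Q = sp.Matrix([
    [mu**3, 3*mu**2,      3*mu,         1],
    [mu**2, mu**3 + 2*mu, 2*mu**2 + 1,  mu],
    [mu,    2*mu**2 + 1,  mu**3 + 2*mu, mu**2],
    [1,     3*mu,         3*mu**2,      mu**3],
]) / (mu + 1)**3

first_col = sp.Matrix([mu**3, -mu**2, mu, -1]) / (mu - 1)**3   # claimed first column of Q^{-1}
e1 = sp.Matrix([1, 0, 0, 0])
assert all(sp.simplify(entry) == 0 for entry in (Q * first_col - e1))
```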

D.4 Proof of Theorem 3

Since LocalRR applies the RR to the lower triangular part of the adjacency matrix 𝐀\mathbf{A}, it provides ε\varepsilon-edge LDP for (R1,,Rn)(R_{1},\ldots,R_{n}). Lines 5 to 8 in Algorithm 2 are post-processing of (R1,,Rn)(R_{1},\ldots,R_{n}). Thus, by the immunity to post-processing [23], LocalRR provides ε\varepsilon-edge LDP for the output 1(μ1)3(μ3m3μ2m2+μm1m0)\frac{1}{(\mu-1)^{3}}(\mu^{3}m_{3}-\mu^{2}m_{2}+\mu m_{1}-m_{0}).

In addition, the existence of edge (vi,vj)E(v_{i},v_{j})\in E (i>j)(i>j) affects only one element ai,ja_{i,j} in the lower triangular part of 𝐀\mathbf{A}. Therefore, LocalRR provides ε\varepsilon-relationship DP.

D.5 Proof of Theorem 4

By Proposition 2, the estimate f^(G,ε)\hat{f}_{\triangle}(G,\varepsilon) by LocalRR is unbiased. Then by the bias-variance decomposition [40], the expected l2l_{2} loss 𝔼[l22(f^(G,ε),f(G))]\mathbb{E}[l_{2}^{2}(\hat{f}_{\triangle}(G,\varepsilon),f_{\triangle}(G))] is equal to the variance of f^(G,ε)\hat{f}_{\triangle}(G,\varepsilon). Let a3=μ3(μ1)3a_{3}=\frac{\mu^{3}}{(\mu-1)^{3}}, a2=μ2(μ1)3a_{2}=-\frac{\mu^{2}}{(\mu-1)^{3}}, a1=μ(μ1)3a_{1}=\frac{\mu}{(\mu-1)^{3}}, and a0=1(μ1)3a_{0}=-\frac{1}{(\mu-1)^{3}}. Then the variance of f^(G,ε)\hat{f}_{\triangle}(G,\varepsilon) can be written as follows:

𝕍[f^(G,ε)]\displaystyle\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)] =𝕍[a3m3+a2m2+a1m1+a0m0]\displaystyle=\operatorname{\mathbb{V}}[a_{3}m_{3}+a_{2}m_{2}+a_{1}m_{1}+a_{0}m_{0}]
= a_3^2 𝕍[m_3] + a_2^2 𝕍[m_2] + a_1^2 𝕍[m_1] + a_0^2 𝕍[m_0]
+i=03j=0,ji32aiajcov(mi,mj),\displaystyle\hskip 9.95845pt+\sum_{i=0}^{3}\sum_{j=0,j\neq i}^{3}2a_{i}a_{j}\text{cov}(m_{i},m_{j}), (19)

where cov(mi,mj)\text{cov}(m_{i},m_{j}) represents the covariance of mim_{i} and mjm_{j}. The covariance cov(mi,mj)\text{cov}(m_{i},m_{j}) can be written as follows:

cov(mi,mj)\displaystyle\text{cov}(m_{i},m_{j}) 𝕍[mi]𝕍[mj]\displaystyle\leq\sqrt{\operatorname{\mathbb{V}}[m_{i}]\operatorname{\mathbb{V}}[m_{j}]}
(by Cauchy-Schwarz inequality)\displaystyle\hskip 11.95013pt(\text{by Cauchy-Schwarz inequality})
max{𝕍[mi],𝕍[mj]}\displaystyle\leq\max\{\operatorname{\mathbb{V}}[m_{i}],\operatorname{\mathbb{V}}[m_{j}]\}
𝕍[mi]+𝕍[mj].\displaystyle\leq\operatorname{\mathbb{V}}[m_{i}]+\operatorname{\mathbb{V}}[m_{j}]. (20)

By (19) and (20), we obtain:

𝕍[f^(G,ε)]\displaystyle\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)]
(a32+4a3(a2+a1+a0))𝕍[m3]\displaystyle\leq(a_{3}^{2}+4a_{3}(a_{2}+a_{1}+a_{0}))\operatorname{\mathbb{V}}[m_{3}]
+(a22+4a2(a3+a1+a0))𝕍[m2]\displaystyle\hskip 12.80373pt+(a_{2}^{2}+4a_{2}(a_{3}+a_{1}+a_{0}))\operatorname{\mathbb{V}}[m_{2}]
+(a12+4a1(a3+a2+a0))𝕍[m1]\displaystyle\hskip 12.80373pt+(a_{1}^{2}+4a_{1}(a_{3}+a_{2}+a_{0}))\operatorname{\mathbb{V}}[m_{1}]
+(a02+4a0(a3+a2+a1))𝕍[m0]\displaystyle\hskip 12.80373pt+(a_{0}^{2}+4a_{0}(a_{3}+a_{2}+a_{1}))\operatorname{\mathbb{V}}[m_{0}]
=O(e6ε(eε1)6(𝕍[m3]+𝕍[m2]+𝕍[m1]+𝕍[m0])).\displaystyle=O\left(\frac{e^{6\varepsilon}}{(e^{\varepsilon}-1)^{6}}(\operatorname{\mathbb{V}}[m_{3}]+\operatorname{\mathbb{V}}[m_{2}]+\operatorname{\mathbb{V}}[m_{1}]+\operatorname{\mathbb{V}}[m_{0}])\right). (21)

Below we calculate 𝕍[m3]\operatorname{\mathbb{V}}[m_{3}], 𝕍[m2]\operatorname{\mathbb{V}}[m_{2}], 𝕍[m1]\operatorname{\mathbb{V}}[m_{1}], and 𝕍[m0]\operatorname{\mathbb{V}}[m_{0}] by assuming the Erdös-Rényi model 𝐆(n,α)\mathbf{G}(n,\alpha) for GG:

Lemma 1.

Let GG(n,α)G\sim\textbf{G}(n,\alpha). Let p=1eε+1p=\frac{1}{e^{\varepsilon}+1} and β=α(1p)+(1α)p\beta=\alpha(1-p)+(1-\alpha)p. Then 𝕍[m3]=O(β5n4+β3n3)\operatorname{\mathbb{V}}[m_{3}]=O(\beta^{5}n^{4}+\beta^{3}n^{3}), 𝕍[m2]=O(β3n4+β2n3)\operatorname{\mathbb{V}}[m_{2}]=O(\beta^{3}n^{4}+\beta^{2}n^{3}), and 𝕍[m1]=𝕍[m0]=O(βn4)\operatorname{\mathbb{V}}[m_{1}]=\operatorname{\mathbb{V}}[m_{0}]=O(\beta n^{4}).

Before going into the proof of Lemma 1, we prove Theorem 4 using Lemma 1. By (21) and Lemma 1, we obtain:

𝕍[f^(G,ε)]=O(e6ε(eε1)6βn4),\displaystyle\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)]=O\left(\frac{e^{6\varepsilon}}{(e^{\varepsilon}-1)^{6}}\beta n^{4}\right),

which proves Theorem 4. ∎

We now prove Lemma 1:

Proof of Lemma 1.

First we show the variance of m3 and m0. Then we show the variance of m2 and m1.

Variance of m3m_{3} and m0m_{0}.  Since each edge in the original graph GG is independently generated with probability α[0,1]\alpha\in[0,1], each edge in the noisy graph GG^{\prime} is independently generated with probability β=α(1p)+(1α)p[0,1]\beta=\alpha(1-p)+(1-\alpha)p\in[0,1], where p=1eε+1p=\frac{1}{e^{\varepsilon}+1}. Thus m3m_{3} is the number of triangles in graph GG(n,β)G^{\prime}\sim\textbf{G}(n,\beta).

For i,j,k[n]i,j,k\in[n], let yi,j,k{0,1}y_{i,j,k}\in\{0,1\} be a variable that takes 11 if and only if (vi,vj,vk)(v_{i},v_{j},v_{k}) forms a triangle in the noisy graph GG^{\prime}. Then 𝔼[m32]\mathbb{E}[m_{3}^{2}] can be written as follows:

𝔼[m32]=i<j<ki<j<k𝔼[yi,j,kyi,j,k]\displaystyle\mathbb{E}[m_{3}^{2}]=\sum_{i<j<k}~{}\sum_{i^{\prime}<j^{\prime}<k^{\prime}}\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}] (22)

The term 𝔼[yi,j,kyi,j,k]\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}] in (22) is the probability that both (vi,vj,vk)(v_{i},v_{j},v_{k}) and (vi,vj,vk)(v_{i^{\prime}},v_{j^{\prime}},v_{k^{\prime}}) form triangles in GG^{\prime}. This event can be divided into the following four types:

  1. (i,j,k)=(i,j,k)(i,j,k)=(i^{\prime},j^{\prime},k^{\prime}). There are (n3)\binom{n}{3} such terms in (22). For each term, 𝔼[yi,j,kyi,j,k]=β3\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}]=\beta^{3}.

  2. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have two elements in common. There are (n2)(n2)(n3)=12(n4)\binom{n}{2}(n-2)(n-3)=12\binom{n}{4} such terms in (22). For each term, 𝔼[yi,j,kyi,j,k]=β5\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}]=\beta^{5}.

  3. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have one element in common. There are n(n12)(n32)=30(n5)n\binom{n-1}{2}\binom{n-3}{2}=30\binom{n}{5} such terms in (22). For each term, 𝔼[yi,j,kyi,j,k]=β6\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}]=\beta^{6}.

  4. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have no common elements. There are (n3)(n33)=20(n6)\binom{n}{3}\binom{n-3}{3}=20\binom{n}{6} such terms in (22). For each term, 𝔼[yi,j,kyi,j,k]=β6\mathbb{E}[y_{i,j,k}y_{i^{\prime},j^{\prime},k^{\prime}}]=\beta^{6}.

Moreover, 𝔼[m3]2=(n3)2β6\mathbb{E}[m_{3}]^{2}=\binom{n}{3}^{2}\beta^{6}. Therefore, the variance of m3m_{3} can be written as follows:

𝕍[m3]\displaystyle\operatorname{\mathbb{V}}[m_{3}] =(n3)β3+12(n4)β5+30(n5)β6+20(n6)β6(n3)2β6\displaystyle=\textstyle{\binom{n}{3}\beta^{3}+12\binom{n}{4}\beta^{5}+30\binom{n}{5}\beta^{6}+20\binom{n}{6}\beta^{6}-\binom{n}{3}^{2}\beta^{6}}
=(n3)β3(1β3)+12(n4)β5(1β)\displaystyle=\textstyle{\binom{n}{3}\beta^{3}(1-\beta^{3})+12\binom{n}{4}\beta^{5}(1-\beta)}
=O(β5n4+β3n3).\displaystyle=O(\beta^{5}n^{4}+\beta^{3}n^{3}).
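The simplification in the last two lines uses the identity \binom{n}{3}^{2}=\binom{n}{3}+12\binom{n}{4}+30\binom{n}{5}+20\binom{n}{6}: every ordered pair of 3-subsets of [n][n] falls into exactly one of the four cases above. A quick numeric check (a minimal Python sketch; the range of nn is illustrative):

from math import comb

# Every ordered pair of 3-subsets of [n] shares exactly 3, 2, 1, or 0 elements,
# so the four case counts must sum to comb(n, 3)**2.
for n in range(3, 60):
    total = comb(n, 3) + 12 * comb(n, 4) + 30 * comb(n, 5) + 20 * comb(n, 6)
    assert total == comb(n, 3) ** 2, n
print("pair-counting identity holds for n = 3, ..., 59")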

By changing β\beta to 1β1-\beta and counting triangles, we get a random variable with the same distribution as m0m_{0}. Thus,

𝕍[m0]\displaystyle\operatorname{\mathbb{V}}[m_{0}] =(n3)(1β)3(1(1β)3)+12(n4)(1β)5β\displaystyle=\textstyle{\binom{n}{3}(1-\beta)^{3}(1-(1-\beta)^{3})+12\binom{n}{4}(1-\beta)^{5}\beta}
=O(βn4).\displaystyle=O(\beta n^{4}).

Variance of m2m_{2} and m1m_{1}.  For i,j,k[n]i,j,k\in[n], let zi,j,k{0,1}z_{i,j,k}\in\{0,1\} be a variable that takes 11 if and only if (vi,vj,vk)(v_{i},v_{j},v_{k}) forms 22-edges in GG^{\prime} (i.e., exactly one of the three possible edges is missing). Then 𝔼[m22]\mathbb{E}[m_{2}^{2}] can be written as follows:

𝔼[m22]=i<j<ki<j<k𝔼[zi,j,kzi,j,k]\displaystyle\mathbb{E}[m_{2}^{2}]=\sum_{i<j<k}\sum_{i^{\prime}<j^{\prime}<k^{\prime}}\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}] (23)

The term 𝔼[zi,j,kzi,j,k]\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}] in (23) is the probability that both (vi,vj,vk)(v_{i},v_{j},v_{k}) and (vi,vj,vk)(v_{i^{\prime}},v_{j^{\prime}},v_{k^{\prime}}) form 22-edges in GG^{\prime}. This event can be divided into the following four types:

  1. (i,j,k)=(i,j,k)(i,j,k)=(i^{\prime},j^{\prime},k^{\prime}). There are (n3)\binom{n}{3} such terms in (23). For each term, 𝔼[zi,j,kzi,j,k]=3β2(1β)\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}]=3\beta^{2}(1-\beta).

  2. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have two elements in common. There are (n2)(n2)(n3)=12(n4)\binom{n}{2}(n-2)(n-3)=12\binom{n}{4} such terms in (23). For example, consider a term in which i=i=1i=i^{\prime}=1, j=j=2j=j^{\prime}=2, k=3k=3, and k=4k^{\prime}=4. Both (v1,v2,v3)(v_{1},v_{2},v_{3}) and (v1,v2,v4)(v_{1},v_{2},v_{4}) form 2-edges if:
    (a) (v1,v2),(v1,v3),(v1,v4)E(v_{1},v_{2}),(v_{1},v_{3}),(v_{1},v_{4})\in E^{\prime}, (v2,v3),(v2,v4)E(v_{2},v_{3}),(v_{2},v_{4})\notin E^{\prime},
    (b) (v1,v2),(v1,v3),(v2,v4)E(v_{1},v_{2}),(v_{1},v_{3}),(v_{2},v_{4})\in E^{\prime}, (v2,v3),(v1,v4)E(v_{2},v_{3}),(v_{1},v_{4})\notin E^{\prime},
    (c) (v1,v2),(v2,v3),(v1,v4)E(v_{1},v_{2}),(v_{2},v_{3}),(v_{1},v_{4})\in E^{\prime}, (v1,v3),(v2,v4)E(v_{1},v_{3}),(v_{2},v_{4})\notin E^{\prime},
    (d) (v1,v2),(v2,v3),(v2,v4)E(v_{1},v_{2}),(v_{2},v_{3}),(v_{2},v_{4})\in E^{\prime}, (v1,v3),(v1,v4)E(v_{1},v_{3}),(v_{1},v_{4})\notin E^{\prime}, or
    (e) (v1,v3),(v1,v4),(v2,v3),(v2,v4)E(v_{1},v_{3}),(v_{1},v_{4}),(v_{2},v_{3}),(v_{2},v_{4})\in E^{\prime}, (v1,v2)E(v_{1},v_{2})\notin E^{\prime}.
    Thus, 𝔼[zi,j,kzi,j,k]=4β3(1β)2+β4(1β)\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}]=4\beta^{3}(1-\beta)^{2}+\beta^{4}(1-\beta) for this term. Similarly, 𝔼[zi,j,kzi,j,k]=4β3(1β)2+β4(1β)\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}]=4\beta^{3}(1-\beta)^{2}+\beta^{4}(1-\beta) for the other terms.

  3. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have one element in common. There are n(n12)(n32)=30(n5)n\binom{n-1}{2}\binom{n-3}{2}=30\binom{n}{5} such terms in (23). For each term, 𝔼[zi,j,kzi,j,k]=(3β2(1β))2=9β4(1β)2\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}]=(3\beta^{2}(1-\beta))^{2}=9\beta^{4}(1-\beta)^{2}.

  4. (i,j,k)(i,j,k) and (i,j,k)(i^{\prime},j^{\prime},k^{\prime}) have no common elements. There are (n3)(n33)=20(n6)\binom{n}{3}\binom{n-3}{3}=20\binom{n}{6} such terms in (23). For each term, 𝔼[zi,j,kzi,j,k]=(3β2(1β))2=9β4(1β)2\mathbb{E}[z_{i,j,k}z_{i^{\prime},j^{\prime},k^{\prime}}]=(3\beta^{2}(1-\beta))^{2}=9\beta^{4}(1-\beta)^{2}.

Moreover, 𝔼[m2]2=(3(n3)β2(1β))2=9(n3)2β4(1β)2\mathbb{E}[m_{2}]^{2}=(3\binom{n}{3}\beta^{2}(1-\beta))^{2}=9\binom{n}{3}^{2}\beta^{4}(1-\beta)^{2}. Therefore, the variance of m2m_{2} can be written as follows:

𝕍[m2]\displaystyle\mathbb{V}[m_{2}] =𝔼[m22]𝔼[m2]2\displaystyle=\mathbb{E}[m_{2}^{2}]-\mathbb{E}[m_{2}]^{2}
=3(n3)β2(1β)+12(n4)(4β3(1β)2+β4(1β))\displaystyle=\textstyle{3\binom{n}{3}\beta^{2}(1-\beta)+12\binom{n}{4}\left(4\beta^{3}(1-\beta)^{2}+\beta^{4}(1-\beta)\right)}
+270(n5)β4(1β)2+180(n6)β4(1β)2\displaystyle\hskip 9.95845pt\textstyle{+270\binom{n}{5}\beta^{4}(1-\beta)^{2}+180\binom{n}{6}\beta^{4}(1-\beta)^{2}}
9(n3)2β4(1β)2.\displaystyle\hskip 9.95845pt\textstyle{-9\binom{n}{3}^{2}\beta^{4}(1-\beta)^{2}.}

By simple calculations,

270(n5)+180(n6)9(n3)2=108(n4)9(n3).\displaystyle\textstyle{270\binom{n}{5}+180\binom{n}{6}-9\binom{n}{3}^{2}=-108\binom{n}{4}-9\binom{n}{3}.}

Thus we obtain:

𝕍[m2]\displaystyle\mathbb{V}[m_{2}] =3(n3)β2(1β)(13β2(1β))\displaystyle=\textstyle{3\binom{n}{3}\beta^{2}(1-\beta)\left(1-3\beta^{2}(1-\beta)\right)}
+12(n4)β3(1β)(4(1β)+β9β(1β))\displaystyle\hskip 9.95845pt\textstyle{+12\binom{n}{4}\beta^{3}(1-\beta)\left(4(1-\beta)+\beta-9\beta(1-\beta)\right)}
=O(β3n4+β2n3).\displaystyle=O(\beta^{3}n^{4}+\beta^{2}n^{3}).

Similarly, the variance of m1m_{1} can be written as follows:

𝕍[m1]\displaystyle\mathbb{V}[m_{1}] =3(n3)β(1β)2(13β(1β)2)\displaystyle=\textstyle{3\binom{n}{3}\beta(1-\beta)^{2}\left(1-3\beta(1-\beta)^{2}\right)}
+12(n4)β(1β)3(4β+(1β)9β(1β))\displaystyle\hskip 9.95845pt\textstyle{+12\binom{n}{4}\beta(1-\beta)^{3}\left(4\beta+(1-\beta)-9\beta(1-\beta)\right)}
=O(βn4).\displaystyle=O(\beta n^{4}).
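As a sanity check on these closed forms (not part of the proof), the following Python sketch samples G(n, β) repeatedly with small illustrative values of n, β, and the number of trials, counts the triples with exactly 3, 2, 1, and 0 edges, and compares the empirical variances with the expressions derived above.

import numpy as np
from itertools import combinations
from math import comb

def triple_counts(adj, n):
    # Return (m3, m2, m1, m0): numbers of triples with exactly 3, 2, 1, 0 edges.
    m = [0, 0, 0, 0]
    for i, j, k in combinations(range(n), 3):
        m[adj[i, j] + adj[i, k] + adj[j, k]] += 1
    return m[3], m[2], m[1], m[0]

rng = np.random.default_rng(0)
n, beta, trials = 20, 0.2, 2000          # illustrative sizes
samples = []
for _ in range(trials):
    upper = np.triu(rng.random((n, n)) < beta, 1).astype(int)
    samples.append(triple_counts(upper + upper.T, n))   # one sample of G(n, beta)
emp_var = np.var(np.array(samples), axis=0)

c3, c4, b = comb(n, 3), comb(n, 4), beta
exact = [
    c3*b**3*(1-b**3) + 12*c4*b**5*(1-b),                                  # V[m3]
    3*c3*b**2*(1-b)*(1-3*b**2*(1-b))
        + 12*c4*b**3*(1-b)*(4*(1-b)+b-9*b*(1-b)),                         # V[m2]
    3*c3*b*(1-b)**2*(1-3*b*(1-b)**2)
        + 12*c4*b*(1-b)**3*(4*b+(1-b)-9*b*(1-b)),                         # V[m1]
    c3*(1-b)**3*(1-(1-b)**3) + 12*c4*(1-b)**5*b,                          # V[m0]
]
print(np.round(emp_var, 1))    # empirical variances of (m3, m2, m1, m0)
print(np.round(exact, 1))      # closed forms; the two lines should roughly agree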

D.6 Proof of Proposition 3

Let t=i=1ntit_{*}=\sum_{i=1}^{n}t_{i} and s=i=1nsis_{*}=\sum_{i=1}^{n}s_{i}. Let ss_{*}^{\wedge} be the number of triplets (vi,vj,vk)(v_{i},v_{j},v_{k}) such that j<k<ij<k<i, ai,j=ai,k=1a_{i,j}=a_{i,k}=1, and aj,k=0a_{j,k}=0. Let ss_{*}^{\triangle} be the number of triplets (vi,vj,vk)(v_{i},v_{j},v_{k}) such that j<k<ij<k<i, ai,j=ai,k=aj,k=1a_{i,j}=a_{i,k}=a_{j,k}=1. Note that s=s+ss_{*}=s_{*}^{\wedge}+s_{*}^{\triangle} and s=f(G)s_{*}^{\triangle}=f_{\triangle}(G).

Consider a triangle (vi,vj,vk)G(v_{i},v_{j},v_{k})\in G with j<k<ij<k<i. Since ai,j=ai,k=1a_{i,j}=a_{i,k}=1, this triangle is counted in tt_{*} if and only if (vj,vk)E(v_{j},v_{k})\in E^{\prime}, which happens with probability 1p11-p_{1} (=eε1eε1+1=\frac{e^{\varepsilon_{1}}}{e^{\varepsilon_{1}}+1}). Similarly, each triplet counted in ss_{*}^{\wedge} (i.e., j<k<ij<k<i, ai,j=ai,k=1a_{i,j}=a_{i,k}=1, and aj,k=0a_{j,k}=0) is counted in tt_{*} with probability p1p_{1} (=1eε1+1=\frac{1}{e^{\varepsilon_{1}}+1}). No other triplets contribute to tt_{*}. Therefore, we obtain:

𝔼[t]=(1p1)s+p1s.\displaystyle\mathbb{E}[t_{*}]=(1-p_{1})s_{*}^{\triangle}+p_{1}s_{*}^{\wedge}.

By s=s+ss_{*}=s_{*}^{\wedge}+s_{*}^{\triangle} and s=f(G)s_{*}^{\triangle}=f_{\triangle}(G), we obtain:

𝔼[i=1nwi]\displaystyle\mathbb{E}\left[\sum_{i=1}^{n}w_{i}\right] =𝔼[i=1n(tip1si)]\displaystyle=\mathbb{E}\left[\sum_{i=1}^{n}(t_{i}-p_{1}s_{i})\right]
=𝔼[tp1s]\displaystyle=\mathbb{E}[t_{*}-p_{1}s_{*}]
=𝔼[t]p1𝔼[s+s]\displaystyle=\mathbb{E}[t_{*}]-p_{1}\mathbb{E}[s_{*}^{\wedge}+s_{*}^{\triangle}]
=(1p1)s+p1sp1(s+s)\displaystyle=(1-p_{1})s_{*}^{\triangle}+p_{1}s_{*}^{\wedge}-p_{1}(s_{*}^{\wedge}+s_{*}^{\triangle})
=(12p1)f(G),\displaystyle=(1-2p_{1})f_{\triangle}(G),

hence

𝔼[112p1i=1nwi]=f(G).\displaystyle\textstyle{\mathbb{E}\left[\frac{1}{1-2p_{1}}\sum_{i=1}^{n}w_{i}\right]=f_{\triangle}(G).}
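As an illustration (not part of the proof), the following Python sketch averages the estimate over many runs of the first-round randomized response on a fixed graph, ignoring the mean-zero Laplace noise of the second round and assuming no graph projection; the parameters n, α, and ε1 and all variable names are illustrative. The average should be close to the true triangle count, reflecting unbiasedness.

import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, alpha, eps1, trials = 20, 0.3, 1.0, 3000
p1 = 1.0 / (np.exp(eps1) + 1.0)

# A fixed original graph G, drawn once from G(n, alpha).
A = np.triu(rng.random((n, n)) < alpha, 1).astype(int)
A = A + A.T
true_triangles = sum(A[i, j] * A[i, k] * A[j, k]
                     for i, j, k in combinations(range(n), 3))

estimates = []
for _ in range(trials):
    # Round 1: apply randomized response to each edge bit once and symmetrize -> E'.
    flip = np.triu(rng.random((n, n)) < p1, 1).astype(int)
    Ap = np.triu(A, 1) ^ flip
    Ap = Ap + Ap.T
    # Round 2: user i accumulates w_i = t_i - p1 * s_i over pairs j < k < i.
    w_sum = 0.0
    for i in range(n):
        for j, k in combinations(range(i), 2):
            if A[i, j] and A[i, k]:
                w_sum += Ap[j, k] - p1
    estimates.append(w_sum / (1.0 - 2.0 * p1))
print(true_triangles, np.mean(estimates))   # the mean estimate should be close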

D.7 Proof of Theorem 5

Let i\mathcal{R}_{i} be Local2Rounds. Consider two neighbor lists 𝐚i,𝐚i{0,1}n\mathbf{a}_{i},\mathbf{a}^{\prime}_{i}\in\{0,1\}^{n} that differ in one bit. Let did_{i} (resp. did^{\prime}_{i}) 0\in\mathbb{Z}_{\geq 0} be the number of “1”s in 𝐚i\mathbf{a}_{i} (resp. 𝐚i\mathbf{a}^{\prime}_{i}). Let 𝐚¯i\bar{\mathbf{a}}_{i} (resp. 𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}) {0,1}n\in\{0,1\}^{n} be neighbor lists obtained by setting all of the ii-th to the nn-th elements in 𝐚i\mathbf{a}_{i} (resp. 𝐚i\mathbf{a}^{\prime}_{i}) to 0. Let d¯i\bar{d}_{i} (resp. d¯i\bar{d}^{\prime}_{i}) 0\in\mathbb{Z}_{\geq 0} be the number of “1”s in 𝐚¯i\bar{\mathbf{a}}_{i} (resp. 𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}). For example, if n=6n=6, 𝐚4=(1,0,1,0,1,1)\mathbf{a}_{4}=(1,0,1,0,1,1), and 𝐚4=(1,1,1,0,1,1)\mathbf{a}^{\prime}_{4}=(1,1,1,0,1,1), then d4=4d_{4}=4, d4=5d^{\prime}_{4}=5, 𝐚¯4=(1,0,1,0,0,0)\bar{\mathbf{a}}_{4}=(1,0,1,0,0,0), 𝐚¯4=(1,1,1,0,0,0)\bar{\mathbf{a}}^{\prime}_{4}=(1,1,1,0,0,0), d¯4=2\bar{d}_{4}=2, and d¯4=3\bar{d}^{\prime}_{4}=3.

Furthermore, let tit_{i} (resp. tit^{\prime}_{i}) 0\in\mathbb{Z}_{\geq 0} be the number of triplets (vi,vj,vk)(v_{i},v_{j},v_{k}) such that j<k<ij<k<i, (vi,vj)E(v_{i},v_{j})\in E, (vi,vk)E(v_{i},v_{k})\in E, and (vj,vk)E(v_{j},v_{k})\in E^{\prime} in 𝐚i\mathbf{a}_{i} (resp. 𝐚i\mathbf{a}^{\prime}_{i}). Let sis_{i} (resp. sis^{\prime}_{i}) 0\in\mathbb{Z}_{\geq 0} be the number of triplets (vi,vj,vk)(v_{i},v_{j},v_{k}) such that j<k<ij<k<i, (vi,vj)E(v_{i},v_{j})\in E, and (vi,vk)E(v_{i},v_{k})\in E in 𝐚i\mathbf{a}_{i} (resp. 𝐚i\mathbf{a}^{\prime}_{i}). Let wi=tip1siw_{i}=t_{i}-p_{1}s_{i} and wi=tip1siw^{\prime}_{i}=t^{\prime}_{i}-p_{1}s^{\prime}_{i}. Below we consider two cases about did_{i}: when di<d~maxd_{i}<\tilde{d}_{max} and when did~maxd_{i}\geq\tilde{d}_{max}.

Case 1: di<d~maxd_{i}<\tilde{d}_{max}.  Assume that di=di+1d^{\prime}_{i}=d_{i}+1. In this case, we have either 𝐚¯i=𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}=\bar{\mathbf{a}}_{i} or d¯i=d¯i+1\bar{d}^{\prime}_{i}=\bar{d}_{i}+1. If 𝐚¯i=𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}=\bar{\mathbf{a}}_{i}, then si=sis_{i}=s^{\prime}_{i}, ti=tit_{i}=t^{\prime}_{i}, and wi=wiw_{i}=w^{\prime}_{i}, hence Pr[i(𝐚i)=w^i]=Pr[i(𝐚i)=w^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{w}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{w}_{i}]. If d¯i=d¯i+1\bar{d}^{\prime}_{i}=\bar{d}_{i}+1, then sis_{i} and sis^{\prime}_{i} can be expressed as si=(d¯i2)s_{i}=\binom{\bar{d}_{i}}{2} and si=(d¯i2)=(d¯i+12)s^{\prime}_{i}=\binom{\bar{d}^{\prime}_{i}}{2}=\binom{\bar{d}_{i}+1}{2}, respectively. Then we obtain:

sisi=(d¯i+12)(d¯i2)=d¯i.\displaystyle s^{\prime}_{i}-s_{i}=\binom{\bar{d}_{i}+1}{2}-\binom{\bar{d}_{i}}{2}=\bar{d}_{i}.

In addition, since tit_{i} and tit^{\prime}_{i} count the same pairs as sis_{i} and sis^{\prime}_{i} under the additional constraint “(vj,vk)E(v_{j},v_{k})\in E^{\prime}”, we have 0titisisi0\leq t^{\prime}_{i}-t_{i}\leq s^{\prime}_{i}-s_{i}. Therefore,

|wiwi|\displaystyle|w^{\prime}_{i}-w_{i}| =|titip1(sisi)|\displaystyle=|t^{\prime}_{i}-t_{i}-p_{1}(s^{\prime}_{i}-s_{i})|
(1p1)d¯i\displaystyle\leq(1-p_{1})\bar{d}_{i}
(1p1)di\displaystyle\leq(1-p_{1})d_{i}
<d~max(by p1>0 and di<d~max).\displaystyle<\tilde{d}_{max}\hskip 14.22636pt\text{(by $p_{1}>0$ and $d_{i}<\tilde{d}_{max}$)}.

Since we add Lap(d~maxε2)\textrm{Lap}(\frac{\tilde{d}_{max}}{\varepsilon_{2}}) to wiw_{i}, we obtain:

Pr[i(𝐚i)=w^i]eε2Pr[i(𝐚i)=w^i].\displaystyle\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{w}_{i}]\leq e^{\varepsilon_{2}}\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{w}_{i}]. (24)

Assume that di=di1d^{\prime}_{i}=d_{i}-1. In this case, we have either 𝐚¯i=𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}=\bar{\mathbf{a}}_{i} or d¯i=d¯i1\bar{d}^{\prime}_{i}=\bar{d}_{i}-1. If 𝐚¯i=𝐚¯i\bar{\mathbf{a}}^{\prime}_{i}=\bar{\mathbf{a}}_{i}, then Pr[i(𝐚i)=w^i]=Pr[i(𝐚i)=w^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{w}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{w}_{i}]. If d¯i=d¯i1\bar{d}^{\prime}_{i}=\bar{d}_{i}-1, then we obtain sisi=d¯i1s_{i}-s^{\prime}_{i}=\bar{d}_{i}-1 and titisisit_{i}-t^{\prime}_{i}\leq s_{i}-s^{\prime}_{i}. Thus |wiwi|(1p1)(d~i1)<d~max|w^{\prime}_{i}-w_{i}|\leq(1-p_{1})(\tilde{d}_{i}-1)<\tilde{d}_{max} and (24) holds. Therefore, Local2Rounds provides ε2\varepsilon_{2}-edge LDP at the second round. Since Local2Rounds provides ε1\varepsilon_{1}-edge LDP at the first round (by Theorem 3), it provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-edge LDP in total by the composition theorem [23].

Case 2: did~maxd_{i}\geq\tilde{d}_{max}.  Assume that di=di+1d^{\prime}_{i}=d_{i}+1. In this case, we obtain di=di=d~maxd_{i}=d^{\prime}_{i}=\tilde{d}_{max} after graph projection.

Note that 𝐚i\mathbf{a}_{i} and 𝐚i\mathbf{a}^{\prime}_{i} can differ in zero or two bits after graph projection. For example, consider the case where n=8n=8, 𝐚5=(1,1,0,1,0,1,1,1)\mathbf{a}_{5}=(1,1,0,1,0,1,1,1), 𝐚5=(1,1,1,1,0,1,1,1)\mathbf{a}^{\prime}_{5}=(1,1,1,1,0,1,1,1), and d~max=4\tilde{d}_{max}=4. If the permutation is 1,4,6,8,2,7,5,3, then 𝐚5=𝐚5=(1,0,0,1,0,1,0,1)\mathbf{a}_{5}=\mathbf{a}^{\prime}_{5}=(1,0,0,1,0,1,0,1) after graph projection. However, if the permutation is 3,1,4,6,8,2,7,5, then 𝐚5\mathbf{a}_{5} and 𝐚5\mathbf{a}^{\prime}_{5} become 𝐚5=(1,0,0,1,0,1,0,1)\mathbf{a}_{5}=(1,0,0,1,0,1,0,1) and 𝐚5=(1,0,1,1,0,1,0,0)\mathbf{a}^{\prime}_{5}=(1,0,1,1,0,1,0,0), respectively; i.e., they differ in the third and eighth elements.
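To make the example concrete, the following Python sketch reproduces it, assuming (as the example suggests) that graph projection keeps, for each user, only the first d̃max neighbors in the order given by the random permutation; the helper name project is illustrative.

def project(neigh, perm, d_max):
    # Keep only the first d_max neighbors of `neigh`, scanned in the order of `perm`.
    kept, out = 0, [0] * len(neigh)
    for v in perm:                        # node labels are 1-based, as in the example
        if neigh[v - 1] == 1 and kept < d_max:
            out[v - 1] = 1
            kept += 1
    return out

a5  = [1, 1, 0, 1, 0, 1, 1, 1]            # a_5 in the example
a5p = [1, 1, 1, 1, 0, 1, 1, 1]            # a'_5, differing in the third bit
for perm in ([1, 4, 6, 8, 2, 7, 5, 3], [3, 1, 4, 6, 8, 2, 7, 5]):
    print(perm, project(a5, perm, 4), project(a5p, perm, 4))
# First permutation: the two projected lists coincide.
# Second permutation: they differ in the 3rd and 8th bits, as stated above.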

If 𝐚i=𝐚i\mathbf{a}_{i}=\mathbf{a}^{\prime}_{i}, then Pr[i(𝐚i)=w^i]=Pr[i(𝐚i)=w^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{w}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{w}_{i}]. If 𝐚i\mathbf{a}_{i} and 𝐚i\mathbf{a}^{\prime}_{i} differ in two bits, 𝐚¯i\bar{\mathbf{a}}_{i} and 𝐚¯i\bar{\mathbf{a}}^{\prime}_{i} differ in at most two bits (because we set all of the ii-th to the nn-th elements in 𝐚i\mathbf{a}_{i} and 𝐚i\mathbf{a}^{\prime}_{i} to 0). For example, we can consider the following three cases:

  • If 𝐚5=(1,0,0,1,0,1,0,1)\mathbf{a}_{5}=(1,0,0,1,0,1,0,1) and 𝐚5=(1,0,0,1,0,1,1,0)\mathbf{a}^{\prime}_{5}=(1,0,0,1,0,1,1,0), then 𝐚¯5=𝐚¯5=(1,0,0,1,0,0,0,0)\bar{\mathbf{a}}_{5}=\bar{\mathbf{a}}^{\prime}_{5}=(1,0,0,1,0,0,0,0).

  • If 𝐚5=(1,0,0,1,0,1,0,1)\mathbf{a}_{5}=(1,0,0,1,0,1,0,1) and 𝐚5=(1,0,1,1,0,1,0,0)\mathbf{a}^{\prime}_{5}=(1,0,1,1,0,1,0,0), then 𝐚¯5=(1,0,0,1,0,0,0,0)\bar{\mathbf{a}}_{5}=(1,0,0,1,0,0,0,0) and 𝐚¯5=(1,0,1,1,0,0,0,0)\bar{\mathbf{a}}^{\prime}_{5}=(1,0,1,1,0,0,\allowbreak 0,0); i.e., they differ in one bit.

  • If 𝐚5=(1,1,0,1,0,1,0,0)\mathbf{a}_{5}=(1,1,0,1,0,1,0,0) and 𝐚5=(1,0,1,1,0,1,0,0)\mathbf{a}^{\prime}_{5}=(1,0,1,1,0,1,0,0), then 𝐚¯5=(1,1,0,1,0,0,0,0)\bar{\mathbf{a}}_{5}=(1,1,0,1,0,0,0,0) and 𝐚¯5=(1,0,1,1,0,0,0,0)\bar{\mathbf{a}}^{\prime}_{5}=(1,0,1,1,0,0,\allowbreak 0,0); i.e., they differ in two bits.

If 𝐚¯i=𝐚¯i\bar{\mathbf{a}}_{i}=\bar{\mathbf{a}}^{\prime}_{i}, then Pr[i(𝐚i)=w^i]=Pr[i(𝐚i)=w^i]\Pr[\mathcal{R}_{i}(\mathbf{a}_{i})=\hat{w}_{i}]=\Pr[\mathcal{R}_{i}(\mathbf{a}^{\prime}_{i})=\hat{w}_{i}]. If 𝐚¯i\bar{\mathbf{a}}_{i} and 𝐚¯i\bar{\mathbf{a}}^{\prime}_{i} differ in one bit, then d¯i=d¯i+1\bar{d}^{\prime}_{i}=\bar{d}_{i}+1. In this case, we obtain (24) in the same way as Case 1.

We need to be careful when 𝐚¯i\bar{\mathbf{a}}_{i} and 𝐚¯i\bar{\mathbf{a}}^{\prime}_{i} differ in two bits. In this case, d¯i=d¯i\bar{d}^{\prime}_{i}=\bar{d}_{i} (because di=di=d~maxd_{i}=d^{\prime}_{i}=\tilde{d}_{max} after graph projection). Then we obtain si=si=(d~max2)s_{i}=s^{\prime}_{i}=\binom{\tilde{d}_{max}}{2}. Since the number of 22-stars that involve a particular user in 𝐚¯i\bar{\mathbf{a}}_{i} is d¯i1\bar{d}_{i}-1, we obtain titid¯i1t^{\prime}_{i}-t_{i}\leq\bar{d}_{i}-1. Therefore,

|wiwi|=|titi|d¯i1<d~max,\displaystyle|w^{\prime}_{i}-w_{i}|=|t^{\prime}_{i}-t_{i}|\leq\bar{d}_{i}-1<\tilde{d}_{max},

and (24) holds. Therefore, if di=di+1d^{\prime}_{i}=d_{i}+1, then Local2Rounds provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-edge LDP in total.

Assume that di=di1d^{\prime}_{i}=d_{i}-1. If di>d~maxd_{i}>\tilde{d}_{max}, then di=di=d~maxd_{i}=d^{\prime}_{i}=\tilde{d}_{max} after graph projection. Thus Local2Rounds provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-edge LDP in total in the same way as above. If di=d~maxd_{i}=\tilde{d}_{max}, then we obtain (24) in the same way as Case 1, and therefore Local2Rounds provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-edge LDP in total.

In summary, Local2Rounds provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-edge LDP in both Case 1 and Case 2. Local2Rounds also provides (ε1+ε2)(\varepsilon_{1}+\varepsilon_{2})-relationship DP because it uses only the lower triangular part of the adjacency matrix 𝐀\mathbf{A}. ∎

D.8 Proof of Theorem 6

When the maximum degree dmaxd_{max} of GG is at most d~max\tilde{d}_{max}, no graph projection occurs. By Proposition 3, the estimate f^(G,ε)\hat{f}_{\triangle}(G,\varepsilon) output by Local2Rounds is unbiased.

By bias-variance decomposition (7), the expected l2l_{2} loss 𝔼[l22(f^(G,ε),f(G))]\operatorname{\mathbb{E}}[l_{2}^{2}(\hat{f}_{\triangle}(G,\varepsilon),f_{\triangle}(G))] is equal to 𝕍[f^(G,ε)]\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)]. Recall that p1=11+eε1p_{1}=\frac{1}{1+e^{\varepsilon_{1}}}. 𝕍[f^(G,ε)]\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)] can be written as follows:

𝕍[f^(G,ε)]\displaystyle\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)] (25)
=1(12p1)2𝕍[i=1nw^i]\displaystyle=\textstyle{\frac{1}{(1-2p_{1})^{2}}\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}\hat{w}_{i}\right]}
=1(12p1)2𝕍[i=1ntip1si+Lap(d~max(1p1)ε2)]\displaystyle=\textstyle{\frac{1}{(1-2p_{1})^{2}}\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}-p_{1}s_{i}+\textrm{Lap}(\frac{\tilde{d}_{max}(1-p_{1})}{\varepsilon_{2}})\right]}
=1(12p1)2(𝕍[i=1ntip1si]+𝕍[i=1nLap(d~max(1p1)ε2)])\displaystyle=\textstyle{\frac{1}{(1-2p_{1})^{2}}\left(\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}-p_{1}s_{i}\right]+\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}\textrm{Lap}(\frac{\tilde{d}_{max}(1-p_{1})}{\varepsilon_{2}})\right]\right)}
=1(12p1)2𝕍[i=1nti]+n(12p1)22d~max2(1p1)2ε22.\displaystyle=\textstyle{\frac{1}{(1-2p_{1})^{2}}\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}\right]+\frac{n}{(1-2p_{1})^{2}}2\frac{\tilde{d}_{max}^{2}(1-p_{1})^{2}}{\varepsilon_{2}^{2}}}. (26)

In the last line, we can drop the sis_{i}’s because they are deterministic. We can also sum the variances of the Lap random variables because they are independent; the tit_{i}’s, in contrast, are not independent, so we keep 𝕍[i=1nti]\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}\right] as it is.

Recall the definition of EE^{\prime} computed by the first round of Local2Rounds—the noisy edges released by randomized response. Now,

ti\displaystyle t_{i} =ai,j=ai,k=1,j<k<i1((vj,vk)E).\displaystyle=\sum_{a_{i,j}=a_{i,k}=1,j<k<i}\textbf{1}((v_{j},v_{k})\in E^{\prime}).

This gives

i=1nti\displaystyle\sum_{i=1}^{n}t_{i} =i=1nai,j=ai,k=1j<k<i1((vj,vk)E)\displaystyle=\sum_{i=1}^{n}\sum_{\begin{subarray}{c}a_{i,j}=a_{i,k}=1\\ j<k<i\end{subarray}}\textbf{1}((v_{j},v_{k})\in E^{\prime})
=1j<kni>kai,j=ai,k=11((vj,vk)E)\displaystyle=\sum_{1\leq j<k\leq n}\sum_{\begin{subarray}{c}i>k\\ a_{i,j}=a_{i,k}=1\end{subarray}}\textbf{1}((v_{j},v_{k})\in E^{\prime})
=1j<kn|{i:i>k,ai,j=ai,k=1}|1((vj,vk)E).\displaystyle=\sum_{1\leq j<k\leq n}|\{i:i>k,a_{i,j}=a_{i,k}=1\}|\textbf{1}((v_{j},v_{k})\in E^{\prime}).

Let cjk=|{i:i>k,ai,j=ai,k=1}|c_{jk}=|\{i:i>k,a_{i,j}=a_{i,k}=1\}|. Notice that the indicators 1((vj,vk)E)\textbf{1}((v_{j},v_{k})\in E^{\prime}) for distinct pairs are independent, and each has variance p1(1p1)p_{1}(1-p_{1}) because randomized response flips the corresponding bit with probability p1p_{1}. Thus, the variance of the above expression is

𝕍[i=1nti]\displaystyle\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}\right] =𝕍[1j<kncjk1((vj,vk)E)]\displaystyle=\operatorname{\mathbb{V}}\left[\sum_{1\leq j<k\leq n}c_{jk}\textbf{1}((v_{j},v_{k})\in E^{\prime})\right]
=1j<kncjk2𝕍[1((vj,vk)E)]\displaystyle=\sum_{1\leq j<k\leq n}c_{jk}^{2}\operatorname{\mathbb{V}}[\textbf{1}((v_{j},v_{k})\in E^{\prime})]
=p1(1p1)1j<kncjk2.\displaystyle=p_{1}(1-p_{1})\sum_{1\leq j<k\leq n}c_{jk}^{2}. (27)
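As a quick check of the rearrangement of Σ ti into the cjk form (not part of the proof), the following Python sketch computes both sides on a small random graph; here E′ is an arbitrary random edge set standing in for the noisy edges, since only the algebraic identity is being checked.

import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 15
A  = np.triu(rng.random((n, n)) < 0.4, 1).astype(int); A  = A + A.T     # original graph G
Ep = np.triu(rng.random((n, n)) < 0.3, 1).astype(int); Ep = Ep + Ep.T   # stand-in for E'

# Left-hand side: sum over users i of t_i.
lhs = sum(Ep[j, k]
          for i in range(n)
          for j, k in combinations(range(i), 2)
          if A[i, j] and A[i, k])

# Right-hand side: sum over pairs j < k of c_jk * 1((v_j, v_k) in E').
rhs = sum(sum(1 for i in range(k + 1, n) if A[i, j] and A[i, k]) * Ep[j, k]
          for j, k in combinations(range(n), 2))

print(lhs, rhs)   # the two sums coincide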

cjkc_{jk} is the number of 2-paths from vjv_{j} to vkv_{k} in GG whose middle node viv_{i} satisfies i>ki>k. Because the degree of user vjv_{j} is at most d~max\tilde{d}_{max}, 0cjkd~max0\leq c_{jk}\leq\tilde{d}_{max}. Moreover, there are at most nd~max2n\tilde{d}_{max}^{2} such 2-paths in GG in total, since each of the nn possible middle nodes has at most d~max2\tilde{d}_{max}^{2} pairs of neighbors. Thus, 1j<kncjknd~max2\sum_{1\leq j<k\leq n}c_{jk}\leq n\tilde{d}_{max}^{2}. Since cjk2c_{jk}^{2} is convex in cjkc_{jk} (a Jensen-style argument), the sum in (27) is maximized under these constraints by pushing every cjkc_{jk} to 0 or d~max\tilde{d}_{max}. At most nd~maxn\tilde{d}_{max} of the cjkc_{jk} can equal d~max\tilde{d}_{max}, and the rest are zero. Thus,

𝕍[i=1nti]\displaystyle\operatorname{\mathbb{V}}\left[\sum_{i=1}^{n}t_{i}\right] =p1(1p1)1j<kncjk2\displaystyle=p_{1}(1-p_{1})\sum_{1\leq j<k\leq n}c_{jk}^{2}
p1(1p1)nd~max×d~max2.\displaystyle\leq p_{1}(1-p_{1})n\tilde{d}_{max}\times\tilde{d}_{max}^{2}.

Plugging this into (26), we obtain:

𝕍[f^(G,ε)]\displaystyle\operatorname{\mathbb{V}}[\hat{f}_{\triangle}(G,\varepsilon)] p1(1p1)nd~max3(12p1)2+2nd~max2(1p1)2(12p1)2ε22\displaystyle\leq\frac{p_{1}(1-p_{1})n\tilde{d}_{max}^{3}}{(1-2p_{1})^{2}}+\frac{2n\tilde{d}_{max}^{2}(1-p_{1})^{2}}{(1-2p_{1})^{2}\varepsilon_{2}^{2}}
O(p1nd~max3+nd~max2/ε22(12p1)2)\displaystyle\leq O\left(\frac{p_{1}n\tilde{d}_{max}^{3}+n\tilde{d}_{max}^{2}/\varepsilon_{2}^{2}}{(1-2p_{1})^{2}}\right)
O(eε1(1eε1)2(nd~max3+eε1ε22nd~max2)).\displaystyle\leq O\left(\frac{e^{\varepsilon_{1}}}{(1-e^{\varepsilon_{1}})^{2}}\left(n\tilde{d}_{max}^{3}+\frac{e^{\varepsilon_{1}}}{\varepsilon_{2}^{2}}n\tilde{d}_{max}^{2}\right)\right).

D.9 Proof of Theorem 7

Preliminaries.

We begin by defining a Boolean version of the independent cube in Definition 5, which we call the Boolean independent cube. The Boolean independent cube works for functions g:{0,1}κg:\{0,1\}^{\kappa}\rightarrow\mathbb{R} in the local DP model, where each of κ\kappa\in\mathbb{N} users has a single bit and obfuscates the bit to provide ε\varepsilon-DP. As shown later, there is a one-to-one correspondence between the independent cube in Definition 5 and the Boolean independent cube. Based on this, we show a lower-bound for the Boolean independent cube, and use the lower-bound to prove Theorem 7.

Below we define the Boolean independent cube. For i[κ]i\in[\kappa], let xi{0,1}x_{i}\in\{0,1\} be a bit of user viv_{i}. Let X=(x1,,xκ)X=(x_{1},\ldots,x_{\kappa}). We assume user viv_{i} obfuscates xix_{i} using a randomizer 𝒮i:{0,1}𝒵i\mathcal{S}_{i}:\{0,1\}\rightarrow\mathcal{Z}_{i}, where 𝒮i\mathcal{S}_{i} satisfies ε\varepsilon-DP and 𝒵i\mathcal{Z}_{i} is a range of 𝒮i\mathcal{S}_{i}. Examples of 𝒮i\mathcal{S}_{i} include Warner’s RR. Furthermore, we assume the one-round setting, where each 𝒮i\mathcal{S}_{i} is independent, and where the estimator g^\hat{g} for gg has the form

g^(X)=g~(𝒮1(x1),,𝒮κ(xκ)).\hat{g}(X)=\tilde{g}(\mathcal{S}_{1}(x_{1}),\ldots,\mathcal{S}_{\kappa}(x_{\kappa})). (28)

g~\tilde{g} is an aggregate function that takes 𝒮1(x1),,𝒮κ(xκ)\mathcal{S}_{1}(x_{1}),\ldots,\mathcal{S}_{\kappa}(x_{\kappa}) as input and outputs g^(X)\hat{g}(X).

We will prove a lower bound which uses the following stripped-down form of an independent cube (Definition 5).

Definition 6.

[Boolean (κ,D)(\kappa,D)-independent cube] Let g:{0,1}κg:\{0,1\}^{\kappa}\rightarrow\mathbb{R}, and DD\in\mathbb{R}. We say gg has a Boolean (κ,D)(\kappa,D)-independent cube if for all (x1,,xκ){0,1}κ(x_{1},\ldots,x_{\kappa})\in\{0,1\}^{\kappa} we have

g(x1,,xκ)=g(0,0,,0)+i=1κxiCi,g(x_{1},\ldots,x_{\kappa})=g(0,0,\ldots,0)+\sum_{i=1}^{\kappa}x_{i}C_{i},

where CiC_{i}\in\mathbb{R} satisfies |Ci|D|C_{i}|\geq D for any i[κ]i\in[\kappa].

The following theorem applies to the Boolean independent cube and will help us establish Theorem 7. We prove this theorem in Section D.10.

Theorem 8.

Let g:{0,1}κg:\{0,1\}^{\kappa}\rightarrow\mathbb{R} be a function that has a Boolean (κ,D)(\kappa,D)-independent cube. Let g^(X)\hat{g}(X) be an estimator having the form of (28), where the randomizers 𝒮i\mathcal{S}_{i} are mutually independent and each provides ε\varepsilon-DP. Let XX be drawn uniformly from {0,1}κ\{0,1\}^{\kappa}. Over the randomness both in selecting XX and in 𝒮1,,𝒮κ\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa}, 𝔼X,𝒮1,,𝒮κ[l22(g(X),g^(X))]=Ω(eε(eε+1)2κD2)\operatorname{\mathbb{E}}_{X,\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa}}[l_{2}^{2}(g(X),\hat{g}(X))]=\Omega\left(\frac{e^{\varepsilon}}{(e^{\varepsilon}+1)^{2}}\kappa D^{2}\right).

Proof of Theorem 7 using Theorem 8.

To prove Theorem 7, let 𝒜\mathcal{A} be the (n,D)(n,D)-independent cube (Definition 5) for ff given in the statement of Theorem 7. Let GG be the graph, and 𝐀\mathbf{A} be the corresponding symmetric adjacency matrix. Below we sometimes write ff as a function on neighbor lists 𝐚1,,𝐚n\mathbf{a}_{1},\ldots,\mathbf{a}_{n} (rather than GG) because there is a one-to-one correspondence between GG and 𝐚1,,𝐚n\mathbf{a}_{1},\ldots,\mathbf{a}_{n}. Let MM be the perfect matching that defines 𝒜\mathcal{A}. Let n=2κn=2\kappa.

The idea is to pair up users that MM matches to make a new function gg that has a Boolean (κ,D)(\kappa,D)-independent cube and new randomizers 𝒮1,,𝒮κ\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa} that satisfy ε\varepsilon-DP. In other words, we regard a pair of users in MM as a virtual user (since n=2κn=2\kappa, there are κ\kappa virtual users in total). Then we apply Theorem 8.

Assume that M={(v1,v2),(v3,v4),,(v2κ1,v2κ)}M=\{(v_{1},v_{2}),(v_{3},v_{4}),\ldots,(v_{2\kappa-1},v_{2\kappa})\} without loss of generality (we can construct gg and 𝒮1,,𝒮κ\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa} for arbitrary MM in the same way). For x1,,xκ{0,1}x_{1},\ldots,x_{\kappa}\in\{0,1\}, define

g(x1,,xκ)=\displaystyle g(x_{1},\ldots,x_{\kappa})= f(𝐚1+x1𝐞2,𝐚2+x1𝐞1,,\displaystyle f(\mathbf{a}_{1}+x_{1}\mathbf{e}_{2},~{}\mathbf{a}_{2}+x_{1}\mathbf{e}_{1},~{}\ldots,
𝐚2κ1+xκ𝐞2κ,𝐚2κ+xκ𝐞2κ1),\displaystyle\hskip 9.3894pt\mathbf{a}_{2\kappa-1}+x_{\kappa}\mathbf{e}_{2\kappa},~{}\mathbf{a}_{2\kappa}+x_{\kappa}\mathbf{e}_{2\kappa-1}),

where 𝐞i{0,1}n\mathbf{e}_{i}\in\{0,1\}^{n} is the ii-th standard basis vector that has 11 in the ii-th coordinate and 0 elsewhere. In other words, xi{0,1}x_{i}\in\{0,1\} indicates whether the ii-th edge in MM should be added to GG. Thus, gg has a Boolean (κ,D)(\kappa,D)-independent cube, and there is a one-to-one correspondence between an (n,D)(n,D)-independent cube 𝒜\mathcal{A} in Definition 5 and (κ,D)(\kappa,D)-Boolean independent cube {0,1}κ\{0,1\}^{\kappa} in Definition 6. Figure 13 shows a (2,2)(2,2)-Boolean independent cube for gg corresponding to the (4,2)(4,2)-independent cube for ff in Figure 3.

Figure 13: (2,2)(2,2)-Boolean independent cube for gg corresponding to the (4,2)(4,2)-independent cube for ff in Figure 3.

Now, for i[κ]i\in[\kappa], define 𝒮i(xi)\mathcal{S}_{i}(x_{i}) for xi{0,1}x_{i}\in\{0,1\} by

𝒮i(xi)=(2i1(𝐚2i1+xi𝐞2i),2i(𝐚2i+xi𝐞2i1)).\displaystyle\mathcal{S}_{i}(x_{i})=(\mathcal{R}_{2i-1}(\mathbf{a}_{2i-1}+x_{i}\mathbf{e}_{2i}),\mathcal{R}_{2i}(\mathbf{a}_{2i}+x_{i}\mathbf{e}_{2i-1})). (29)

In other words, 𝒮i(xi)\mathcal{S}_{i}(x_{i}) is simply the pair of the outputs of users (v2i1,v2i)(v_{2i-1},v_{2i}), with xix_{i} indicating whether to add the edge in MM between them.

Assume that each i\mathcal{R}_{i} is mutually independent and that (1,,n)(\mathcal{R}_{1},\ldots,\mathcal{R}_{n}) provides ε\varepsilon-relationship DP in Definition 3. Then by (3) and (29), each 𝒮i\mathcal{S}_{i} provides ε\varepsilon-DP and is mutually independent.

Define the estimator g^\hat{g} by

g^(x1,,xκ)\displaystyle\hat{g}(x_{1},\ldots,x_{\kappa}) =f~(𝒮1(x1),,𝒮κ(xκ)).\displaystyle=\tilde{f}(\mathcal{S}_{1}(x_{1}),\ldots,\mathcal{S}_{\kappa}(x_{\kappa})).

Then by Theorem 8, for X=(x1,,xκ)X=(x_{1},\ldots,x_{\kappa}),

𝔼X,𝒮1,,𝒮κ[l22(g(X),g^(X))]Ω(eε(eε+1)2κD2).\operatorname{\mathbb{E}}_{X,\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa}}[l_{2}^{2}(g(X),\hat{g}(X))]\geq\Omega\left(\frac{e^{\varepsilon}}{(e^{\varepsilon}+1)^{2}}\kappa D^{2}\right).

Since there is a one-to-one correspondence between the (n,Dn,D)-independent cube 𝒜\mathcal{A} and the (κ,D\kappa,D)-Boolean independent cube {0,1}κ\{0,1\}^{\kappa}, we also have

𝔼G,1,,n[l22(f(G),f^(G))]Ω(eε(eε+1)2nD2),\operatorname{\mathbb{E}}_{G,\mathcal{R}_{1},\ldots,\mathcal{R}_{n}}[l_{2}^{2}(f(G),\hat{f}(G))]\geq\Omega\left(\frac{e^{\varepsilon}}{(e^{\varepsilon}+1)^{2}}nD^{2}\right),

where GG is drawn uniformly from 𝒜\mathcal{A}, which proves Theorem 7. ∎

D.10 Proof of Theorem 8

Assume that 𝒮i:{0,1}𝒵i\mathcal{S}_{i}:\{0,1\}\rightarrow\mathcal{Z}_{i}. For X=(x1,,xκ){0,1}κX=(x_{1},\ldots,x_{\kappa})\in\{0,1\}^{\kappa}, let S(X)=(𝒮1(x1),,𝒮κ(xκ))S(X)=(\mathcal{S}_{1}(x_{1}),\ldots,\mathcal{S}_{\kappa}(x_{\kappa})) and Z=(z1,,zκ)Z=(z_{1},\ldots,z_{\kappa}) with zi𝒵iz_{i}\in\mathcal{Z}_{i}. We rewrite the quantity of interest as

𝔼X,S(X)[l22(g(X),g^(X))]=𝔼X,S(X)[(g(X)g~(S(X)))2].\operatorname{\mathbb{E}}_{X,S(X)}[l_{2}^{2}(g(X),\hat{g}(X))]=\operatorname{\mathbb{E}}_{X,S(X)}[(g(X)-\tilde{g}(S(X)))^{2}].

By the law of total expectation, this quantity is the same as the expected value of the conditional expected value of (g(X)g~(S(X)))2(g(X)-\tilde{g}(S(X)))^{2} given S(X)=ZS(X)=Z:

𝔼X,S(X)[(g(X)g~(S(X)))2]\displaystyle\operatorname{\mathbb{E}}_{X,S(X)}[(g(X)-\tilde{g}(S(X)))^{2}]
=𝔼S(X)𝔼X[(g(X)g~(Z))2|S(X)=Z].\displaystyle=\operatorname{\mathbb{E}}_{S(X)}\operatorname{\mathbb{E}}_{X}[(g(X)-\tilde{g}(Z))^{2}|S(X)=Z]. (30)

Let μZ=𝔼X[g(X)|S(X)=Z]\mu_{Z}=\operatorname{\mathbb{E}}_{X}[g(X)|S(X)=Z]. Then the inner expectation in (30) can be written as follows:

𝔼X[(g(X)g~(Z))2|S(X)=Z]\displaystyle\;\operatorname{\mathbb{E}}_{X}[(g(X)-\tilde{g}(Z))^{2}|S(X)=Z]
=𝔼X[((g(X)μZ)+(μZg~(Z)))2|S(X)=Z]\displaystyle=\operatorname{\mathbb{E}}_{X}[((g(X)-\mu_{Z})+(\mu_{Z}-\tilde{g}(Z)))^{2}|S(X)=Z]
=𝔼X[(g(X)μZ)2|S(X)=Z]\displaystyle=\operatorname{\mathbb{E}}_{X}[(g(X)-\mu_{Z})^{2}|S(X)=Z]
+2(μZg~(Z))𝔼X[(g(X)μZ)|S(X)=Z]\displaystyle\hskip 9.95845pt+2(\mu_{Z}-\tilde{g}(Z))\operatorname{\mathbb{E}}_{X}[(g(X)-\mu_{Z})|S(X)=Z]
+(μZg~(Z))2\displaystyle\hskip 9.95845pt+(\mu_{Z}-\tilde{g}(Z))^{2}
=𝔼X[(g(X)μZ)2|S(X)=Z]+(μZg~(Z))2\displaystyle=\operatorname{\mathbb{E}}_{X}[(g(X)-\mu_{Z})^{2}|S(X)=Z]+(\mu_{Z}-\tilde{g}(Z))^{2}
=𝕍X[g(X)|S(X)=Z]+(μZg~(Z))2.\displaystyle=\operatorname{\mathbb{V}}_{X}[g(X)|S(X)=Z]+(\mu_{Z}-\tilde{g}(Z))^{2}.

Thus, it suffices to show that 𝕍X[g(X)|S(X)=Z]Ω(eε(1+eε)2κD2)\operatorname{\mathbb{V}}_{X}[g(X)|S(X)=Z]\geq\Omega\left(\frac{e^{\varepsilon}}{(1+e^{\varepsilon})^{2}}\kappa D^{2}\right). For B=(b1,,bκ){0,1}κB=(b_{1},\ldots,b_{\kappa})\in\{0,1\}^{\kappa}, we have

Pr[X=B|S(X)=Z]=Pr[X=B]Pr[S(X)=Z|X=B]Pr[S(X)=Z].\displaystyle\Pr[X=B|S(X)=Z]=\frac{\Pr[X=B]\Pr[S(X)=Z|X=B]}{\Pr[S(X)=Z]}.

Since Pr[S(X)=Z]\Pr[S(X)=Z] does not depend on BB and Pr[X=B]=12κ\Pr[X=B]=\frac{1}{2^{\kappa}}, Pr[X=B|S(X)=Z]\Pr[X=B|S(X)=Z] can also be expressed as

Pr[X=B|S(X)=Z]\displaystyle\Pr[X=B|S(X)=Z] Pr[S(X)=Z|X=B].\displaystyle\propto\Pr[S(X)=Z|X=B]. (31)

Since 𝒮1,,𝒮κ\mathcal{S}_{1},\ldots,\mathcal{S}_{\kappa} are run independently, we have

Pr[S(X)=Z|X=B]\displaystyle\Pr[S(X)=Z|X=B] =Pr[𝒮1(b1)=z1,,𝒮κ(bκ)=zκ]\displaystyle=\Pr[\mathcal{S}_{1}(b_{1})=z_{1},\ldots,\mathcal{S}_{\kappa}(b_{\kappa})=z_{\kappa}]
=i=1κPr[𝒮i(bi)=zi].\displaystyle=\prod_{i=1}^{\kappa}\Pr[\mathcal{S}_{i}(b_{i})=z_{i}].

Define

pi=Pr[𝒮i(1)=zi]Pr[𝒮i(0)=zi]+Pr[𝒮i(1)=zi].\displaystyle p_{i}=\frac{\Pr[\mathcal{S}_{i}(1)=z_{i}]}{\Pr[\mathcal{S}_{i}(0)=z_{i}]+\Pr[\mathcal{S}_{i}(1)=z_{i}]}.

Because each 𝒮i\mathcal{S}_{i} satisfies ε\varepsilon-DP, we have 11+eεpieε1+eε\frac{1}{1+e^{\varepsilon}}\leq p_{i}\leq\frac{e^{\varepsilon}}{1+e^{\varepsilon}}. By (31) and B{0,1}κPr[X=B|S(X)=Z]=1\sum_{B\in\{0,1\}^{\kappa}}\Pr[X=B|S(X)=Z]=1, we have

Pr[X=B|S(X)=Z]=i=1κ(pi)bi(1pi)1bi.\displaystyle\Pr[X=B|S(X)=Z]=\prod_{i=1}^{\kappa}(p_{i})^{b_{i}}(1-p_{i})^{1-b_{i}}. (32)

This means that, conditioned on S(X)=ZS(X)=Z, the bits x1,,xκx_{1},\ldots,x_{\kappa} are independent, with xix_{i} distributed as Bernoulli(pi)Bernoulli(p_{i}) for i[κ]i\in[\kappa].

Now, because gg has a Boolean (κ,D)(\kappa,D)-independent cube, there are C1,,CκC_{1},\ldots,C_{\kappa}\in\mathbb{R} with |Ci|D|C_{i}|\geq D such that

g(X)\displaystyle g(X) =g(0,,0)+i=1κxiCi.\displaystyle=g(0,\ldots,0)+\sum_{i=1}^{\kappa}x_{i}C_{i}.

By (32), xix_{i} is an independent draw from Bernoulli(pi)Bernoulli(p_{i}) given S(X)=ZS(X)=Z. Thus, the variance of g(X)g(X) given S(X)=ZS(X)=Z is

𝕍X[g(X)|S(X)=Z]\displaystyle\operatorname{\mathbb{V}}_{X}[g(X)|S(X)=Z] =i=1κ𝕍[xi|S(X)=Z]Ci2\displaystyle=\sum_{i=1}^{\kappa}\operatorname{\mathbb{V}}[x_{i}|S(X)=Z]C_{i}^{2}
i=1κpi(1pi)D2\displaystyle\geq\sum_{i=1}^{\kappa}p_{i}(1-p_{i})D^{2}
i=1κeε(1+eε)2D2\displaystyle\geq\sum_{i=1}^{\kappa}\frac{e^{\varepsilon}}{(1+e^{\varepsilon})^{2}}D^{2}
κeε(1+eε)2D2.\displaystyle\geq\kappa\frac{e^{\varepsilon}}{(1+e^{\varepsilon})^{2}}D^{2}.
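The last two steps use the fact that p(1−p) is minimized over the interval [1/(1+eε), eε/(1+eε)] at its endpoints, where it equals eε/(1+eε)². A tiny numeric check (a minimal Python sketch, not part of the proof):

import numpy as np

for eps in (0.1, 0.5, 1.0, 2.0):
    lo, hi = 1 / (1 + np.exp(eps)), np.exp(eps) / (1 + np.exp(eps))
    p = np.linspace(lo, hi, 10_001)
    # p*(1-p) is concave and symmetric about 1/2, so its minimum over [lo, hi]
    # is attained at the endpoints, where it equals e^eps / (1 + e^eps)^2.
    assert np.all(p * (1 - p) >= np.exp(eps) / (1 + np.exp(eps)) ** 2 - 1e-12)
print("p(1-p) is minimized at the interval endpoints")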