Optimal Algorithms for Mean Estimation under Local Differential Privacy
Abstract
We study the problem of mean estimation of $\ell_2$-bounded vectors under the constraint of local differential privacy. While the literature has a variety of algorithms that achieve the asymptotically optimal rates for this problem, the performance of these algorithms in practice can vary significantly due to varying (and often large) hidden constants. In this work, we investigate the question of designing the protocol with the smallest variance. We show that PrivUnit [BDFKR18] with optimized parameters achieves the optimal variance among a large family of locally private randomizers. To prove this result, we establish some properties of local randomizers, and use symmetrization arguments that allow us to write the optimal randomizer as the optimizer of a certain linear program. These structural results, which should extend to other problems, then allow us to show that the optimal randomizer belongs to the PrivUnit family.
We also develop a new variant of PrivUnit based on the Gaussian distribution which is more amenable to mathematical analysis and enjoys the same optimality guarantees. This allows us to establish several useful properties on the exact constants of the optimal error as well as to numerically estimate these constants.
1 Introduction
Mean estimation is one of the most fundamental problems in machine learning and is the building block of countless algorithms and applications, including stochastic optimization [Duc18], federated learning [BIKMMPRSS17], and others. However, it is now evident that standard algorithms for this task may leak sensitive information about users' data and compromise their privacy. This has led to the development of numerous algorithms for estimating the mean while preserving the privacy of users. The most common models for privacy are the central model, where there exists a trusted curator, and the local model, where no such trusted curator exists.
In this work, we study the problem of mean estimation in the local privacy model. More specifically, we have $n$ users, each holding a vector $v_i$ in the Euclidean unit ball in $\mathbb{R}^d$. Each user applies a randomizer $\mathcal{R}$ to privatize their data, where $\mathcal{R}$ must satisfy $\varepsilon$-differential privacy, namely, for any inputs $v, v'$ and any output set $O$, $\Pr[\mathcal{R}(v) \in O] \le e^{\varepsilon} \Pr[\mathcal{R}(v') \in O]$. Then, we run an aggregation method $\mathcal{A}$ such that $\mathcal{A}(\mathcal{R}(v_1), \dots, \mathcal{R}(v_n))$ provides an estimate of the mean $\frac{1}{n}\sum_{i=1}^n v_i$. Our goal in this work is to characterize the optimal protocol (pair of randomizer $\mathcal{R}$ and aggregation method $\mathcal{A}$) for this problem and study the resulting optimal error.
Due to its importance and many applications, the problem of private mean estimation in the local model has been studied by numerous papers [BDFKR18, FT21, CKÖ20]. As a result, a clear understanding of the asymptotic optimal rates has emerged, showing that the optimal squared error is proportional to $\frac{d}{n\min(\varepsilon,\varepsilon^2)}$: [DJW18, BDFKR18] developed algorithms that obtain this rate and [DR19] proved corresponding lower bounds. Subsequent papers [FT21, CKÖ20] have developed several other algorithms that achieve the same rates.
However, these optimality results do not give a clear characterization of which algorithm will enjoy better performance in practice. Constant factors here matter more than they do in run time or memory, as $\varepsilon$ is typically limited by privacy constraints, and increasing the sample size by collecting data from more individuals is often infeasible or expensive. The question of finding the randomizer with the smallest error is therefore of great interest.
1.1 Our contributions
Motivated by these limitations, we investigate strict optimality for the problem of mean estimation with local privacy. We study the family of non-interactive and unbiased protocols: a protocol is a pair $(\mathcal{R}, \mathcal{A})$ of a private local randomizer $\mathcal{R}$ and an aggregation method $\mathcal{A}$, where the protocol outputs $\hat{\mu} = \mathcal{A}(\mathcal{R}(v_1), \dots, \mathcal{R}(v_n))$ such that $\mathbb{E}[\hat{\mu}] = \frac{1}{n}\sum_{i=1}^n v_i$. We measure the error of a private protocol in terms of its (worst case) mean squared error
$\mathrm{Err}_n(\mathcal{R}, \mathcal{A}) = \sup_{v_1,\dots,v_n} \mathbb{E}\big\|\hat{\mu} - \tfrac{1}{n}\textstyle\sum_{i=1}^n v_i\big\|_2^2.$
We obtain the following results.
First, we show that PrivUnit of [BDFKR18] with optimized parameters is optimal amongst a large family of protocols. Our strategy for proving optimality consists of two main steps: first, we show that for non-interactive protocols, additive aggregation with a certain randomizer attains the optimal error. Then, for protocols with additive aggregation, we show that PrivUnit obtains the optimal error. Our proof builds on establishing several new properties of the optimal local randomizer, which allow us to express the problem of designing the optimal randomizer as a linear program. This in turn helps characterize the structure of optimal randomizers and allows us to show that there is an optimal randomizer which is an instance of PrivUnit.
Finding the exact constants in the error of PrivUnit is mathematically challenging. Our second contribution is to develop a new algorithm PrivUnitG that builds on the Gaussian distribution and attains the same error as PrivUnit up to a $(1+o(1))$ multiplicative factor as $d \to \infty$. In contrast to PrivUnit, we show that the optimal parameters of PrivUnitG are independent of the dimension, hence enabling efficient calculation of the constants for high dimensional settings. Moreover, PrivUnitG is amenable to mathematical analysis, which yields several properties of the constants of the optimal error.
1.2 Related work
Local privacy is perhaps one of the oldest forms of privacy and dates back to [War65], who used it to encourage truthfulness in surveys. This definition resurfaced again in the context of modern data analysis by [EGS03] and was related to differential privacy in the seminal work of [DMNS06]. Local privacy has attracted a lot of interest, both in the academic community [BNO08, DJW18, BDFKR18] and in industry, where it has been deployed in several applications [EPK14, App17]. Recent work in the shuffle model of privacy [BEMMRLRKTS17, CSUZZ19, EFMRTT19, BBGN19, FMT20] has led to increased interest in the local model with moderate values of the local privacy parameter $\varepsilon$, as they can translate to small values of the central privacy parameter under shuffling.
The problem of locally private mean estimation has received a great deal of attention in the past decade [DJW18, BDFKR18, DR19, EFMRSTT20, ASYKM18, GDDKS20, CKÖ20, GKMM19, FT21]. [DJW18] developed asymptotically optimal procedures for estimating the mean when $\varepsilon \le 1$, achieving expected squared error $O(\frac{d}{n\varepsilon^2})$. [BDFKR18] proposed a new algorithm that is optimal for $\varepsilon \ge 1$ as well, achieving error $O(\frac{d}{n\min(\varepsilon,\varepsilon^2)})$. These rates are optimal as [DR19] show tight lower bounds which hold for interactive protocols. There has been more work on locally private mean estimation that studies the problem with additional constraints such as communication cost [EFMRSTT20, FT21, CKÖ20].
[YB17, YB18] study (non-interactive) locally private estimation problems with discrete domains and design algorithms that achieve optimal rates. These optimality results are not restricted to the family of unbiased private mechanisms. However, in contrast to our work, these results are only asymptotic, hence their upper and lower bounds match only as the number of samples goes to infinity.
While there are several results in differential privacy that establish asymptotically matching lower and upper bounds for various problems of interest, strict optimality results are few. While there are some results known for the one-dimensional problem [GRS09, GS10], some of which extend to a large class of utility functions, such universal mechanisms are known not to exist for multidimensional problems [BN10]. [GKOV15, KOV16] show that for certain loss functions, one can phrase the problem of designing optimal local randomizers as linear programs, whose size is exponential in the size of the input domain.
2 Problem setting and preliminaries
We begin in this section by defining local differential privacy. To this end, we say that two probability distributions $P$ and $Q$ are $(\varepsilon,\delta)$-close if for every event $O$,
$P(O) \le e^{\varepsilon} Q(O) + \delta \quad \text{and} \quad Q(O) \le e^{\varepsilon} P(O) + \delta.$
We say two random variables are $(\varepsilon,\delta)$-close if their distributions are $(\varepsilon,\delta)$-close.
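For intuition, when the two distributions have finite support this condition can be checked directly, since the worst-case event for the bound $P(O) \le e^{\varepsilon} Q(O) + \delta$ is the set of outcomes where the pointwise inequality is violated. A minimal numerical sketch (the function name is our own):

```python
import numpy as np

def are_close(p, q, eps, delta):
    """Check (eps, delta)-closeness of two finite distributions p and q
    (arrays of probabilities summing to 1). The worst event for
    P(O) <= e^eps Q(O) + delta is O = {x : p[x] > e^eps q[x]}, so it
    suffices to sum the pointwise violations, in both directions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    viol_pq = np.maximum(p - np.exp(eps) * q, 0.0).sum()
    viol_qp = np.maximum(q - np.exp(eps) * p, 0.0).sum()
    return max(viol_pq, viol_qp) <= delta

# Binary randomized response with bias e^eps / (1 + e^eps) is (eps, 0)-close;
# we allow a tiny delta slack for floating-point error.
eps = 1.0
rr0 = np.array([np.exp(eps), 1.0]) / (1.0 + np.exp(eps))
assert are_close(rr0, rr0[::-1], eps, delta=1e-12)
```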
We can now define local DP randomizers.
Definition 2.1.
A randomized algorithm $\mathcal{R}$ is a (replacement) $(\varepsilon,\delta)$-DP local randomizer if for all inputs $x$ and $x'$, $\mathcal{R}(x)$ and $\mathcal{R}(x')$ are $(\varepsilon,\delta)$-close.
In this work, we will primarily be interested in pure DP randomizers, i.e., those which satisfy $(\varepsilon, 0)$-DP. We abbreviate this as $\varepsilon$-DP. In the setting of local randomizers, the difference between $(\varepsilon,\delta)$-DP and pure DP is not significant; indeed, any $(\varepsilon,\delta)$-DP local randomizer can be converted [FMT20, CU21] to one that satisfies $\varepsilon$-DP while changing the distributions by a statistical distance of at most $O(\delta)$.
The main problem we study in this work is locally private mean estimation. Here, we have $n$ unit vectors $v_1, \dots, v_n \in \mathbb{R}^d$, i.e., $\|v_i\|_2 = 1$. The goal is to design (locally) private protocols that estimate the mean $\bar{v} = \frac{1}{n}\sum_{i=1}^n v_i$. We focus on the setting of non-interactive private protocols: such a protocol consists of a pair $(\mathcal{R}, \mathcal{A})$ of a private local randomizer $\mathcal{R}$ and an aggregation method $\mathcal{A}$, where the final output is $\hat{\mu} = \mathcal{A}(\mathcal{R}(v_1), \dots, \mathcal{R}(v_n))$. We require that the output is unbiased, that is, $\mathbb{E}[\hat{\mu}] = \bar{v}$, and wish to find private protocols that minimize the variance $\mathrm{Err}_n(\mathcal{R}, \mathcal{A}) = \sup_{v_1,\dots,v_n} \mathbb{E}\|\hat{\mu} - \bar{v}\|_2^2$.
Note that in the above formulation, the randomizer $\mathcal{R}$ can have an arbitrary output domain (not necessarily $\mathbb{R}^d$), and the aggregation method can be arbitrary as well. However, one important special family of private protocols, which we term canonical private protocols, are protocols where the local randomizer has outputs in $\mathbb{R}^d$ and the aggregation method is the simple additive aggregation $\mathcal{A}_{\mathrm{add}}(u_1, \dots, u_n) = \frac{1}{n}\sum_{i=1}^n u_i$. In addition to being a natural family of protocols, canonical protocols are simple and easy to implement, and achieve the smallest possible variance amongst the family of all possible unbiased private protocols, as we show in the subsequent sections.
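As a concrete illustration of this template, the sketch below implements a canonical protocol with a simple (and suboptimal) unbiased $\varepsilon$-DP randomizer that adds per-coordinate Laplace noise; the calibration to the $\ell_1$ sensitivity $2\sqrt{d}$ of unit vectors is a standard textbook choice and is not the optimal randomizer studied in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_randomizer(v, eps):
    """A simple (suboptimal) eps-DP local randomizer on the unit l2 ball:
    per-coordinate Laplace noise calibrated to the l1 sensitivity 2*sqrt(d)
    of replacing one unit vector by another. Unbiased since the noise has
    mean zero."""
    d = v.shape[0]
    return v + rng.laplace(scale=2.0 * np.sqrt(d) / eps, size=d)

def canonical_protocol(vs, randomizer, eps):
    """Canonical protocol: each user randomizes locally, and the server
    simply averages the messages (additive aggregation)."""
    return np.mean([randomizer(v, eps) for v in vs], axis=0)

# The estimate is unbiased for the empirical mean of the inputs.
d, n = 16, 2000
vs = rng.normal(size=(n, d))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)  # unit vectors
est = canonical_protocol(vs, laplace_randomizer, eps=4.0)
print(np.linalg.norm(est - vs.mean(axis=0)))
```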
Notation
We let $\mathbb{S}^{d-1}$ denote the unit sphere in $\mathbb{R}^d$, and $\mathbb{S}^{d-1}(r)$ denote the sphere of radius $r$. Whenever clear from context, we use the shorter notation $\mathbb{S}$. Given a random variable $X$, we let $p_X$ denote the probability density function of $X$. For a randomizer $\mathcal{R}$ and input $x$, $p_{\mathcal{R}(x)}$ denotes the probability density function of the random variable $\mathcal{R}(x)$. For a Gaussian random variable $X \sim \mathsf{N}(\mu, \sigma^2)$, we let $\phi_{\mu,\sigma}$ denote the probability density function of $X$ and $\Phi_{\mu,\sigma}$ denote its cumulative distribution function. For ease of notation, we write $\phi$ and $\Phi$ when $\mu = 0$ and $\sigma = 1$. Given two random variables $X$ and $Y$, we say that $X \stackrel{d}{=} Y$ if $X$ and $Y$ have the same distribution. Finally, we let $e_1, \dots, e_d$ denote the standard basis vectors and $\mathcal{O}_d$ denote the set of orthogonal matrices of dimension $d$.
3 Optimality of PrivUnit
In this section, we prove our main optimality results showing that PrivUnit with additive aggregation achieves the optimal error among the family of unbiased locally private procedures. More precisely, we show that for any $\varepsilon$-DP local randomizer $\mathcal{R}$ and any aggregation method $\mathcal{A}$ such that $(\mathcal{R}, \mathcal{A})$ is unbiased, there are parameters $p$ and $\gamma$ for which $\mathrm{PrivUnit}_{p,\gamma}$ is $\varepsilon$-DP and
$\mathrm{Err}_n(\mathrm{PrivUnit}_{p,\gamma}, \mathcal{A}_{\mathrm{add}}) \le \mathrm{Err}_n(\mathcal{R}, \mathcal{A}).$
We begin in Section 3.1 by introducing the algorithm PrivUnit and then state its optimality guarantees in Section 3.2. To prove the optimality result, we begin in Section 3.3 by showing that there exists a canonical private protocol that achieves the optimal error; then in Section 3.4 we show that PrivUnit is the optimal local randomizer in the family of canonical protocols.
3.1 PrivUnit
We begin by introducing PrivUnit, which was developed by [BDFKR18]. Given an input vector $x \in \mathbb{S}^{d-1}$, $\mathrm{PrivUnit}_{p,\gamma}(x)$ draws a direction $V$ that, with probability $p$, is uniform over the spherical cap $\{u \in \mathbb{S}^{d-1} : \langle u, x \rangle \ge \gamma\}$ and, with probability $1-p$, is uniform over its complement $\{u \in \mathbb{S}^{d-1} : \langle u, x \rangle < \gamma\}$.
A normalization factor $m$ is needed to obtain the correct expectation: the algorithm returns $V/m$, where $m$ is chosen so that $\mathbb{E}[V] = m\,x$. We provide full details in Algorithm 1.
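For concreteness, here is a minimal sketch of the sampling step (cf. Algorithm 1): with probability $p$, draw a uniform vector from the cap via rejection sampling, otherwise draw from its complement, and rescale by $m$. The paper gives $m$ in closed form; in this sketch it is simply estimated by Monte Carlo, and the rejection-sampling helper is our own simplification:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cap(x, gamma, inside=True, max_tries=100_000):
    """Uniform sample u on the unit sphere with <u, x> >= gamma if inside=True,
    and <u, x> < gamma otherwise, via simple rejection sampling."""
    d = x.shape[0]
    for _ in range(max_tries):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        if (u @ x >= gamma) == inside:
            return u
    raise RuntimeError("target cap has too little mass for rejection sampling")

def priv_unit(x, p, gamma, m):
    """PrivUnit sketch: pick the cap around x with probability p, its complement
    otherwise, then rescale by m so that the output is unbiased."""
    u = sample_cap(x, gamma, inside=(rng.random() < p))
    return u / m

# m = E[<V, x>] of the unscaled output; here estimated by Monte Carlo.
d, p, gamma = 32, 0.9, 0.2
x = np.eye(d)[0]
m_hat = np.mean([sample_cap(x, gamma, rng.random() < p) @ x for _ in range(3000)])
out = priv_unit(x, p, gamma, m_hat)
```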
The following theorem states the privacy guarantees of PrivUnit. Theorem 1 in [BDFKR18] provides privacy guarantees based on several mathematical approximations which may not be tight. For our optimality results, we require the following exact privacy guarantee of PrivUnit.
Theorem 1.
[BDFKR18, Theorem 1] Let $q = \Pr(\langle V, e_1 \rangle \ge \gamma)$ where $V$ is uniform over $\mathbb{S}^{d-1}$. If $\frac{p(1-q)}{(1-p)q} \le e^{\varepsilon}$ then $\mathrm{PrivUnit}_{p,\gamma}$ is an $\varepsilon$-DP local randomizer.
Throughout the paper, we will sometimes use the equivalent notation $\mathrm{PrivUnit}_{p,\gamma}$, which describes running PrivUnit with parameters $p$ and $\gamma$ as in Theorem 1.
3.2 Optimality
Asymptotic optimality of PrivUnit has already been established by prior work. [BDFKR18] show that the error of PrivUnit is upper bounded by $O(\frac{d}{n\min(\varepsilon,\varepsilon^2)})$ for certain parameters. Moreover, [DR19] show a lower bound of $\Omega(\frac{d}{n\min(\varepsilon,\varepsilon^2)})$, implying that PrivUnit is asymptotically optimal.
In this section, we prove that additive aggregation applied with PrivUnit with the best choice of parameters is truly optimal, that is, it outperforms any unbiased private algorithm. The following theorem states our optimality result for PrivUnit.
Theorem 2.
Let $\mathcal{R}$ be an $\varepsilon$-DP local randomizer, and $\mathcal{A}$ be an aggregation procedure such that $\mathbb{E}[\mathcal{A}(\mathcal{R}(v_1),\dots,\mathcal{R}(v_n))] = \frac{1}{n}\sum_{i=1}^n v_i$ for all $v_1,\dots,v_n \in \mathbb{S}^{d-1}$. Then there are $p$ and $\gamma$ such that $\mathrm{PrivUnit}_{p,\gamma}$ is an $\varepsilon$-DP local randomizer and
$\mathrm{Err}_n(\mathrm{PrivUnit}_{p,\gamma}, \mathcal{A}_{\mathrm{add}}) \le \mathrm{Err}_n(\mathcal{R}, \mathcal{A}).$
The proof of Theorem 2 will proceed in two steps: first, in Section 3.3 (Proposition 1), we show that there exists an optimal private procedure that is canonical; then in Section 3.4 (Proposition 2) we prove that PrivUnit is the optimal randomizer in this family. Theorem 2 is a direct corollary of these two propositions.
3.3 Optimality of canonical protocols
In this section, we show that there exists a canonical private protocol that achieves the optimal error. In particular, we have the following result.
Proposition 1.
Let $(\mathcal{R}, \mathcal{A})$ be such that $\mathcal{R}$ is an $\varepsilon$-DP local randomizer and $\mathbb{E}[\mathcal{A}(\mathcal{R}(v_1),\dots,\mathcal{R}(v_n))] = \frac{1}{n}\sum_{i=1}^n v_i$ for all $v_1,\dots,v_n \in \mathbb{S}^{d-1}$. Then there is a canonical randomizer $\mathcal{R}'$ that is an $\varepsilon$-DP local randomizer and
$\mathrm{Err}_n(\mathcal{R}', \mathcal{A}_{\mathrm{add}}) \le \mathrm{Err}_n(\mathcal{R}, \mathcal{A}).$
To prove Proposition 1, we begin with the following lemma.
Lemma 3.1.
Let $(\mathcal{R}, \mathcal{A})$ satisfy the conditions of Proposition 1. Let $\rho$ be a probability distribution over $\mathbb{S}^{d-1}$ such that $\mathbb{E}_{V\sim\rho}[V] = 0$. There is $\mathcal{R}'$ such that $\mathcal{R}'$ is an $\varepsilon$-DP local randomizer, $\mathbb{E}[\mathcal{R}'(v)] = v$ for all $v \in \mathbb{S}^{d-1}$, and
$\mathbb{E}_{v\sim\rho}\big[\mathbb{E}\|\mathcal{R}'(v) - v\|_2^2\big] \le n \cdot \mathrm{Err}_n(\mathcal{R}, \mathcal{A}).$
Before proving Lemma 3.1, we first complete the proof of Proposition 1.
Proof.
(Proposition 1) Let $\rho$ be the uniform distribution over the sphere $\mathbb{S}^{d-1}$. First, note that $\mathbb{E}_{V\sim\rho}[V] = 0$, so Lemma 3.1 yields a randomizer $\mathcal{R}'$ with the stated guarantees.
Now we define $\mathcal{R}''$ as follows. First, sample a random rotation matrix $U$ uniformly from $\mathcal{O}_d$, then set $\mathcal{R}''(v) = U^{\top}\mathcal{R}'(Uv)$.
Note that $\mathcal{R}''$ is an $\varepsilon$-DP local randomizer, $\mathbb{E}[\mathcal{R}''(v)] = v$, and for all $v \in \mathbb{S}^{d-1}$ the vector $Uv$ is distributed according to $\rho$.
Overall, we have
$\sup_{v \in \mathbb{S}^{d-1}} \mathbb{E}\|\mathcal{R}''(v) - v\|_2^2 = \mathbb{E}_{w\sim\rho}\big[\mathbb{E}\|\mathcal{R}'(w) - w\|_2^2\big] \le n \cdot \mathrm{Err}_n(\mathcal{R}, \mathcal{A}),$
which gives $\mathrm{Err}_n(\mathcal{R}'', \mathcal{A}_{\mathrm{add}}) \le \mathrm{Err}_n(\mathcal{R}, \mathcal{A})$. The claim follows. ∎
Now we prove Lemma 3.1.
Proof.
(Lemma 3.1) We define $\mathcal{R}'$ to be
$\mathcal{R}'(v) = n \cdot \mathbb{E}\big[\mathcal{A}(\mathcal{R}(v), \mathcal{R}(V_2), \dots, \mathcal{R}(V_n)) \mid \mathcal{R}(v)\big],$
where $V_2, \dots, V_n$ are drawn i.i.d. from $\rho$ and the conditional expectation averages over $V_2,\dots,V_n$ and the randomness of $\mathcal{R}(V_2),\dots,\mathcal{R}(V_n)$.
Note that $\mathcal{R}'$ is an $\varepsilon$-DP local randomizer as it requires a single application of $\mathcal{R}$ followed by post-processing. Moreover, $\mathbb{E}[\mathcal{R}'(v)] = v$ for all $v$, since the protocol is unbiased and $\mathbb{E}_{V\sim\rho}[V] = 0$. We define
and . We now have
where the steps follow, respectively, from the definition of $\mathcal{R}'$, induction, Jensen's inequality, and the facts that the protocol is unbiased and $\mathbb{E}_{V\sim\rho}[V] = 0$.
∎
3.4 Optimality of PrivUnit among canonical randomizers
In this section, we show that PrivUnit achieves the optimal error in the family of canonical randomizers. To this end, first note that for additive aggregation we have $\mathrm{Err}_n(\mathcal{R}, \mathcal{A}_{\mathrm{add}}) = \frac{1}{n}\sup_{v \in \mathbb{S}^{d-1}}\mathbb{E}\|\mathcal{R}(v) - v\|_2^2$. Denoting $\mathrm{Err}(\mathcal{R}) = \sup_{v \in \mathbb{S}^{d-1}}\mathbb{E}\|\mathcal{R}(v) - v\|_2^2$ for canonical randomizers, we have the following optimality result.
Proposition 2.
Let $\mathcal{R}$ be an $\varepsilon$-DP local randomizer such that $\mathbb{E}[\mathcal{R}(v)] = v$ for all $v \in \mathbb{S}^{d-1}$. Then there are $p$ and $\gamma$ such that $\mathrm{PrivUnit}_{p,\gamma}$ is an $\varepsilon$-DP local randomizer and
$\mathrm{Err}(\mathrm{PrivUnit}_{p,\gamma}) \le \mathrm{Err}(\mathcal{R}).$
The proof of Proposition 2 builds on a sequence of lemmas, each of which allows us to simplify the structure of an optimal algorithm. We begin with the following lemma, which shows that there exists an optimal algorithm that is invariant to rotations.
Lemma 3.2 (Rotation-Invariance Lemma).
Let $\mathcal{R}$ be an $\varepsilon$-DP local randomizer such that $\mathbb{E}[\mathcal{R}(v)] = v$ for all $v \in \mathbb{S}^{d-1}$. There exists an $\varepsilon$-DP local randomizer $\mathcal{R}'$ such that
1. $\mathbb{E}[\mathcal{R}'(v)] = v$ for all $v \in \mathbb{S}^{d-1}$;
2. $\mathrm{Err}(\mathcal{R}') \le \mathrm{Err}(\mathcal{R})$;
3. $\mathbb{E}\|\mathcal{R}'(v) - v\|_2^2$ is the same for all $v \in \mathbb{S}^{d-1}$;
4. For any $x, y \in \mathbb{S}^{d-1}$, there is an orthogonal matrix $V$ with $Vx = y$ such that $\mathcal{R}'(y) \stackrel{d}{=} V\mathcal{R}'(x)$;
5. $p_{\mathcal{R}'(x)}(z) \le e^{\varepsilon}\, p_{\mathcal{R}'(x)}(z')$ for any $x \in \mathbb{S}^{d-1}$ and $z, z'$ with $\|z\|_2 = \|z'\|_2$.
Proof.
Given $\mathcal{R}$, we define $\mathcal{R}'$ as follows. First, sample a random rotation matrix $U$ uniformly from $\mathcal{O}_d$, then set $\mathcal{R}'(x) = U^{\top}\mathcal{R}(Ux)$.
We now prove that $\mathcal{R}'$ satisfies the desired properties. First, note that the privacy of $\mathcal{R}$ immediately implies the same privacy bound for $\mathcal{R}'$. Moreover, we have that $\mathbb{E}[\mathcal{R}'(x)] = \mathbb{E}_U\big[U^{\top}\,\mathbb{E}[\mathcal{R}(Ux) \mid U]\big] = \mathbb{E}_U[U^{\top}Ux] = x$ as $\mathcal{R}$ is unbiased. For utility, note that
$\mathbb{E}\|\mathcal{R}'(x) - x\|_2^2 = \mathbb{E}_U\,\mathbb{E}\|\mathcal{R}(Ux) - Ux\|_2^2 \le \sup_{v \in \mathbb{S}^{d-1}} \mathbb{E}\|\mathcal{R}(v) - v\|_2^2 = \mathrm{Err}(\mathcal{R}),$
and the left-hand side does not depend on $x$.
Finally, we show that the distributions of $\mathcal{R}'(x)$ and $\mathcal{R}'(y)$ are the same up to rotations. Indeed, let $V$ be a rotation matrix such that $Vx = y$. We have that $\mathcal{R}'(y) = U^{\top}\mathcal{R}(UVx)$, which can also be written as $V(UV)^{\top}\mathcal{R}(UVx) \stackrel{d}{=} V\mathcal{R}'(x)$ as $UV$ is also a uniformly random rotation matrix.
Now we prove the final property. Assume towards a contradiction that there are $z$ and $z'$ with $\|z\|_2 = \|z'\|_2$ such that $p_{\mathcal{R}'(x)}(z) > e^{\varepsilon}\,p_{\mathcal{R}'(x)}(z')$. We will show that this implies that $\mathcal{R}'$ is not $\varepsilon$-DP. In the proof above we showed that $\mathcal{R}'(y) \stackrel{d}{=} V\mathcal{R}'(x)$ for $Vx = y$. Therefore, choosing a rotation $V$ such that $Vz' = z$ and setting $y = Vx$, we get that $p_{\mathcal{R}'(y)}(z) = p_{\mathcal{R}'(x)}(z')$, which implies that
$\frac{p_{\mathcal{R}'(x)}(z)}{p_{\mathcal{R}'(y)}(z)} > e^{\varepsilon},$
which is a contradiction. ∎
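The random rotation used in this construction is easy to realize in code: a Haar-distributed orthogonal matrix can be obtained from the QR decomposition of a Gaussian matrix. A minimal sketch (the helper names are our own):

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_orthogonal(d):
    """Haar-random orthogonal matrix via QR of a Gaussian matrix; multiplying
    each column by the sign of the corresponding diagonal entry of R makes
    the distribution exactly uniform over the orthogonal group."""
    z = rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    return q * np.sign(np.diag(r))

def symmetrized(randomizer, x):
    """The construction from the proof: R'(x) = U^T R(U x) for a uniformly
    random rotation U, which preserves privacy, unbiasedness, and error
    while making the output distribution rotation-equivariant."""
    u = haar_orthogonal(x.shape[0])
    return u.T @ randomizer(u @ x)
```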
Lemma 3.2 implies that we can restrict our attention to algorithms that have the same density for all inputs up to rotations, and hence allows us to study their behavior for a single input. Moreover, as we show in the following lemma, given a randomizer that works for a single input, we can extend it to achieve the same error for all inputs. To facilitate notation, we say that a density $p$ is $\varepsilon$-indistinguishable if $p(z) \le e^{\varepsilon}\,p(z')$ for all $z, z'$ such that $\|z\|_2 = \|z'\|_2$.
Lemma 3.3.
Fix $v \in \mathbb{S}^{d-1}$. Let $p$ be an $\varepsilon$-indistinguishable density function with corresponding random variable $W$ such that $\mathbb{E}[W] = v$. There exists an $\varepsilon$-DP local randomizer $\mathcal{R}$ such that $\mathcal{R}(v) \stackrel{d}{=} W$ and $\mathbb{E}[\mathcal{R}(u)] = u$ for all $u \in \mathbb{S}^{d-1}$.
Lemma 3.2 and Lemma 3.3 imply that we only need to study the behavior of the randomizer for a fixed input. Henceforth, we will fix the input to be $e_1$ and investigate properties of the density of the output given input $e_1$.
Given $z = (z_1, z_2, \dots, z_d)$, we define its reflection to be $\mathrm{Ref}(z) = (z_1, -z_2, \dots, -z_d)$. The next lemma shows that we can assume that the densities at $z$ and $\mathrm{Ref}(z)$ are equal for some optimal algorithm.
Lemma 3.4 (Reflection Symmetry).
Let $p$ be an $\varepsilon$-indistinguishable density function with corresponding random variable $W$ such that $\mathbb{E}[W] = e_1$. There is $p'$ with corresponding random variable $W'$ that satisfies the same properties such that $\mathbb{E}\|W' - e_1\|_2^2 \le \mathbb{E}\|W - e_1\|_2^2$ and $p'(z) = p'(\mathrm{Ref}(z))$ for all $z$.
Proof.
We define $p'(z) = \frac{p(z) + p(\mathrm{Ref}(z))}{2}$ for all $z$. First, it is immediate to see that $p'(z) = p'(\mathrm{Ref}(z))$ for all $z$. Moreover, $p'$ is $\varepsilon$-indistinguishable since for all $z, z'$ with $\|z\|_2 = \|z'\|_2$ we have
$p'(z) = \frac{p(z) + p(\mathrm{Ref}(z))}{2} \le e^{\varepsilon}\,\frac{p(z') + p(\mathrm{Ref}(z'))}{2} = e^{\varepsilon}\,p'(z').$
Note also that $\mathbb{E}[W'] = e_1$, since the marginal distribution of the first coordinate in the output did not change, and it is clear that for the other coordinates the expectation is zero as $p'(z) = p'(\mathrm{Ref}(z))$ for any $z$. Finally, note that $\mathbb{E}\|W' - e_1\|_2^2 = \mathbb{E}\|W - e_1\|_2^2$ since $\|z - e_1\|_2 = \|\mathrm{Ref}(z) - e_1\|_2$ for all $z$. ∎
We also have the following lemma which shows that the optimal density outputs vectors on a sphere with some fixed radius.
Lemma 3.5.
Let $p$ be an $\varepsilon$-indistinguishable density function with corresponding random variable $W$ such that $\mathbb{E}[W] = e_1$. For any $\alpha > 0$, there exists an $\varepsilon$-indistinguishable density $p'$ with corresponding random variable $W'$ such that $\|W'\|_2 = r$ for some $r > 0$, $\mathbb{E}[W'] = e_1$, and
$\mathbb{E}\|W' - e_1\|_2^2 \le \mathbb{E}\|W - e_1\|_2^2 + \alpha.$
Proof.
By Lemma 3.4, we can assume without loss of generality that $p$ satisfies reflection symmetry, that is, $p(z) = p(\mathrm{Ref}(z))$. We think of the density as first sampling a radius $r$, then sampling a vector of norm $r$. We also assume that $W$ has bounded radius; otherwise, as $\mathbb{E}\|W - e_1\|_2^2$ is bounded, we can project the output of $W$ onto a ball of some large radius while increasing the error by at most $\alpha/2$ for any $\alpha > 0$. Similarly, we can assume that the output has radius bounded away from zero while increasing the error by at most $\alpha/2$. Let $\mu$ denote the distribution of the radius, and $p_r$ be the conditional density of the output given that the radius is $r$. In this terminology, reflection symmetry implies that
For the purpose of finding the optimal algorithm, we need that minimizes . Denote and set
Noting that , we have
Now consider a radius $r^{\star}$ attaining the minimum; such $r^{\star}$ exists as $W$ has outputs in a bounded set. Let $p_{r^{\star}}$ denote the conditional distribution of $W$ given that the radius is $r^{\star}$, and let $W'$ denote the corresponding randomizer. We define the new randomizer as follows
with corresponding density $p'$. Note that $p'$ is $\varepsilon$-indistinguishable as $p$ is $\varepsilon$-indistinguishable and the conditional distributions given different radii have disjoint supports. Moreover, $\mathbb{E}[W'] = e_1$, which implies unbiasedness. Finally, note that $W'$ satisfies
The claim follows. ∎
Before we present our main proposition, which formulates the linear program that finds the optimal minimizer, we need the following key property, which allows us to describe the privacy guarantee as a linear constraint. We remark that such a lemma can easily be proven for deletion DP, so our results would extend to that definition.
Lemma 3.6.
Let $\mathcal{R}$ be an $\varepsilon$-DP local randomizer. There is a function $C(\cdot) \ge 0$ such that for all $x \in \mathbb{S}^{d-1}$ and outputs $z$,
$C(z) \le p_{\mathcal{R}(x)}(z) \le e^{\varepsilon}\,C(z).$
Moreover, if $\mathcal{R}$ satisfies the properties of Lemma 3.2 (invariance) then $C(z) = C(z')$ for $\|z\|_2 = \|z'\|_2$.
Proof.
Define $C(z) = \min_{x \in \mathbb{S}^{d-1}} p_{\mathcal{R}(x)}(z)$. Note that for all $x$,
$C(z) \le p_{\mathcal{R}(x)}(z) \le e^{\varepsilon} \min_{x' \in \mathbb{S}^{d-1}} p_{\mathcal{R}(x')}(z) = e^{\varepsilon}\,C(z),$
where the second inequality uses that $\mathcal{R}$ is $\varepsilon$-DP.
The second direction follows similarly. The second part of the claim follows as for any $z, z'$ such that $\|z\|_2 = \|z'\|_2$, if $\mathcal{R}$ satisfies the properties of Lemma 3.2 then for any $x$ there is $x'$ such that $p_{\mathcal{R}(x')}(z') = p_{\mathcal{R}(x)}(z)$. The definition of $C$ now implies that $C(z) = C(z')$. ∎
We are now ready to present our main step towards proving the optimality result. The following proposition formulates the problem of finding the optimal algorithm as a linear program. As a result, we show that there is an optimal algorithm whose density function has at most two different probabilities.
Proposition 3.
Let $\mathcal{R}$ be an $\varepsilon$-DP local randomizer such that $\mathbb{E}[\mathcal{R}(v)] = v$ for all $v \in \mathbb{S}^{d-1}$. For any $\alpha > 0$, there exist constants $r, C > 0$ and an $\varepsilon$-DP local randomizer $\mathcal{R}'$ such that for all $v \in \mathbb{S}^{d-1}$, $\mathbb{E}[\mathcal{R}'(v)] = v$, $\mathrm{Err}(\mathcal{R}') \le \mathrm{Err}(\mathcal{R}) + \alpha$, $\|\mathcal{R}'(v)\|_2 = r$, and $p_{\mathcal{R}'(v)}(z) \in \{C, e^{\varepsilon}C\}$ for (almost) all $z$.
Proof.
The proof will proceed by formulating a linear program which describes the problem of finding the optimal randomizer, and then arguing that the minimizer of this program must satisfy the desired conditions. To this end, we first use the properties of the optimal randomizer from the previous lemmas to simplify the linear program. Lemma 3.5 implies that there exists an optimal randomizer whose output norm is fixed to some $r > 0$ and that is also invariant under rotations (satisfies the conclusions of Lemma 3.2). Moreover, Lemma 3.6 implies that the density function satisfies $C \le p_{\mathcal{R}(v)}(z) \le e^{\varepsilon}C$ for some $C > 0$.
Adding the requirement of unbiasedness, and noticing that for such algorithms the error is $r^2 - 1$, this results in the following minimization problem, where the variables are $C$ and the density functions $p_{\mathcal{R}(v)}$ for all $v \in \mathbb{S}^{d-1}$:
(A)
Lemma 3.2 and Lemma 3.3 also show that the optimal algorithm is invariant under rotations, and that we only need to find the output distribution with respect to a fixed input $e_1$. Moreover, Lemma 3.4 says that we can assume that $p(z) = p(\mathrm{Ref}(z))$ for all $z$. We also work now with the normalized algorithm (that is, the output on the unit sphere). Denoting the resulting density by $p$, this results in the following linear program (LP):
(B)
We need to show that most of the inequality constraints must be tight at one of the two extremes. To this end, we approximate the LP (B) using a finite number of variables by discretizing the density function . We assume we have a -cover of . We assume without loss of generality that if then and we also write where and . Let , , and . Now we limit our linear program to density functions that are constant over each , resulting in the following LP
(C)
Let and denote the maximal values of (B) and (C), respectively. Each solution to (C) is also a solution to (B) hence . Moreover, given , let be a solution of (B) that obtains and let be the corresponding randomizer. We can now define a solution for the discrete program (C) by setting for ,
Equivalently, we can define as follows: first run to get and find such that . Then return a vector uniformly at random from . Note that clearly satisfies the first and third constraints in (C). As for the second constraint, it follows since which implies that for some . It remains to show that . The above representation of shows that and therefore we have .
To finish the proof, it remains to show that the discrete LP (C) has a solution that satisfies the desired properties. Note that as this is a linear program, at a basic optimal solution the number of linearly independent active constraints must equal the number of variables [BT97, Theorem 2.3], which shows that for all but a small number of the sets the density takes one of the two extreme values.
To finish the proof, we need to manipulate the probabilities for the remaining sets to satisfy our desired requirements. As these sets have small probability, this does not change the accuracy by much and we just need to do this manipulation carefully so as to preserve reflection symmetry and unbiasedness. The full details are tedious and we present them in Section A.1. ∎
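The two-valued structure of the optimal density can also be observed numerically. The toy sketch below discretizes the problem for $d = 2$ with the output radius fixed (bin masses over angles on the unit circle, with the floor $C$ of Lemma 3.6 as an extra variable) and maximizes $\mathbb{E}[\langle W, e_1\rangle]$, which for a fixed radius is equivalent to minimizing the error of the rescaled unbiased estimator. This is our own illustration, not the discretization used in the proof; the solver places mass $e^{\varepsilon}C$ on a cap around $e_1$ and $C$ elsewhere, i.e., an instance of PrivUnit:

```python
import numpy as np
from scipy.optimize import linprog

eps, K = 2.0, 200
theta = 2 * np.pi * np.arange(K) / K               # angle bins on the circle
# Variables: bin masses q_1..q_K and the floor c (= C times the bin width).
# Privacy (Lemma 3.6): c <= q_i <= e^eps * c for every bin i.
obj = np.concatenate([-np.cos(theta), [0.0]])      # maximize sum_i q_i cos(theta_i)
A_ub = np.block([[-np.eye(K), np.ones((K, 1))],                 # q_i >= c
                 [np.eye(K), -np.exp(eps) * np.ones((K, 1))]])  # q_i <= e^eps c
b_ub = np.zeros(2 * K)
A_eq = np.vstack([np.concatenate([np.ones(K), [0.0]]),          # total mass 1
                  np.concatenate([np.sin(theta), [0.0]])])      # unbiased in e_2
b_eq = np.array([1.0, 0.0])
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
q, c = res.x[:K], res.x[K]
# Up to at most one boundary bin, q/c takes only the two values 1 and e^eps.
print(np.unique(np.round(q / c, 4)))
```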
Given the previous lemmas, we are now ready to finish the proof of Proposition 2.
Proof.
Fix $\alpha > 0$ and an unbiased $\varepsilon$-DP local randomizer $\mathcal{R}$. Proposition 3 shows that there exists $\mathcal{R}'$ that is $\varepsilon$-DP, unbiased, reflection symmetric ($p_{\mathcal{R}'(e_1)}(z) = p_{\mathcal{R}'(e_1)}(\mathrm{Ref}(z))$ for all $z$), and satisfies $\mathrm{Err}(\mathcal{R}') \le \mathrm{Err}(\mathcal{R}) + \alpha$. Moreover, its density takes at most the two values $C$ and $e^{\varepsilon}C$. We will transform $\mathcal{R}'$ into an instance of PrivUnit while maintaining the same error as $\mathcal{R}'$.
To this end, if is an instance of PrivUnit then we are done. Otherwise let , and . Consider that solves the following minimization problem
Let be the value of the above minimization problem and the corresponding minimizer. Let and . Assume without loss of generality that (the other direction follows from identical arguments). Let be such that and . We define by swapping the probabilities on and , that is,
Clearly still satisfies all of our desired properties and has as we have that for and . Note also that for such that . Moreover, for such that , we have that only if . Let be such that the set has . We now define
Clearly, is an instance of PrivUnit. Now we prove that it satisfies all of our desired properties. First, note that we can write as
This implies that . Moreover, is $\varepsilon$-indistinguishable by definition. Finally, note that as for and . Let be the randomizer that corresponds to . We define . We have that and that
As is an instance of PrivUnit, the claim follows. ∎
4 PrivUnitG: an optimal algorithm based on Gaussian distribution
In this section, we develop a new variant of PrivUnit, namely PrivUnitG, based on the Gaussian distribution. PrivUnitG essentially provides an easy-to-analyze approximation of the optimal algorithm PrivUnit. This enables us to efficiently find accurate approximations of the optimal parameters $p$ and $\gamma$. In fact, we show that these parameters are independent of the dimension, which is computationally valuable. Moreover, building on PrivUnitG, we are able to analytically study the constants that characterize the optimal loss.
The main idea in PrivUnitG is to approximate the uniform distribution over the unit sphere using a Gaussian distribution. Roughly, for a Gaussian random variable $g \sim \mathsf{N}(0, \frac{1}{d} I_d)$ and input vector $v \in \mathbb{S}^{d-1}$, PrivUnitG outputs (up to normalization constants) a sample from $g$ conditioned on $\langle g, v \rangle \ge \gamma$ with probability $p$, and a sample from $g$ conditioned on $\langle g, v \rangle < \gamma$ with probability $1-p$.
We present the full details, including the normalization constants, in Algorithm 2. We usually use the notation $\mathrm{PrivUnitG}_{p,\gamma}$, which means applying PrivUnitG with parameters $p$ and $\gamma$.
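A sketch of the sampling step: writing $g = \langle g, v\rangle v + g_{\perp}$, only the coordinate along $v$ is affected by the conditioning, so it can be drawn as a truncated one-dimensional Gaussian while the orthogonal part is sampled freely. This decomposition is what makes the parameters of PrivUnitG dimension-independent. The normalization constant $m$ is taken as given here (Algorithm 2 specifies it exactly):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def priv_unit_g(v, p, gamma, m):
    """PrivUnitG sketch: decompose g ~ N(0, I_d/d) along v. The coordinate
    <g, v> is N(0, 1/d) truncated to [gamma, inf) with probability p and to
    (-inf, gamma) otherwise; the orthogonal component is untouched. Dividing
    by m (the mean of the along-v coordinate) makes the output unbiased."""
    d = v.shape[0]
    sigma = 1.0 / np.sqrt(d)
    a = gamma / sigma                      # truncation point in standard units
    if rng.random() < p:
        u = stats.truncnorm.rvs(a, np.inf, scale=sigma, random_state=rng)
    else:
        u = stats.truncnorm.rvs(-np.inf, a, scale=sigma, random_state=rng)
    g_perp = rng.normal(scale=sigma, size=d)
    g_perp -= (g_perp @ v) * v             # project out the v direction
    return (u * v + g_perp) / m
```

For unbiasedness, $m$ must equal $p\,\mathbb{E}[U \mid U \ge \gamma] + (1-p)\,\mathbb{E}[U \mid U < \gamma]$, which has a closed form in terms of $\phi$ and $\Phi$.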
The following proposition gives the privacy and utility guarantees for PrivUnitG. The r.v. $U$ is defined (see Algorithm 2) as $U = \langle g, v \rangle$ where $g$ is drawn from $\mathsf{N}(0, \frac{1}{d}I_d)$; note that $U \sim \mathsf{N}(0, \frac{1}{d})$. The normalization constant $m$ is defined accordingly in Algorithm 2. We defer the proof to Section B.1.
Proposition 4.
Let $p$ and $\gamma$ be such that $\frac{p(1-q)}{(1-p)q} \le e^{\varepsilon}$, where $q = \Pr(U \ge \gamma)$. The algorithm $\mathrm{PrivUnitG}_{p,\gamma}$ is an $\varepsilon$-DP local randomizer. Moreover, it is unbiased and has error
Moreover, we have
Now we proceed to analyze the utility guarantees of PrivUnitG as compared to PrivUnit. To this end, we first define the error obtained by PrivUnitG with optimized parameters,
$\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d) = \min_{p,\gamma \,:\, \mathrm{PrivUnitG}_{p,\gamma}\ \text{is}\ \varepsilon\text{-DP}} \mathrm{Err}(\mathrm{PrivUnitG}_{p,\gamma}).$
Similarly, we define this quantity for PrivUnit,
$\mathrm{Err}_{\mathrm{PU}}(\varepsilon, d) = \min_{p,\gamma \,:\, \mathrm{PrivUnit}_{p,\gamma}\ \text{is}\ \varepsilon\text{-DP}} \mathrm{Err}(\mathrm{PrivUnit}_{p,\gamma}).$
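These quantities can be estimated numerically. The sketch below grid-searches $(p, \gamma)$ under the density-ratio privacy condition sketched above for Proposition 4 and estimates the error of each candidate by Monte Carlo, reusing the priv_unit_g sketch from before; the helper names and the crude grids are our own choices:

```python
import numpy as np
from scipy import stats

def privacy_ok(p, gamma, eps, d):
    """Density-ratio condition (cf. Proposition 4): with q = P(U >= gamma)
    for U ~ N(0, 1/d), require p(1 - q) <= e^eps (1 - p) q."""
    q = 1.0 - stats.norm.cdf(gamma * np.sqrt(d))
    return p * (1.0 - q) <= np.exp(eps) * (1.0 - p) * q

def mc_error(p, gamma, d, trials=500):
    """Monte Carlo estimate of Err(PrivUnitG_{p,gamma}) = E||R(v) - v||^2."""
    v = np.eye(d)[0]
    sigma = 1.0 / np.sqrt(d)
    a = gamma / sigma
    m = (p * stats.truncnorm.mean(a, np.inf, scale=sigma)
         + (1.0 - p) * stats.truncnorm.mean(-np.inf, a, scale=sigma))
    outs = np.array([priv_unit_g(v, p, gamma, m) for _ in range(trials)])
    return np.sum((outs - v) ** 2, axis=1).mean()

eps, d = 4.0, 64
best = min((mc_error(p, g, d), p, g)
           for p in np.linspace(0.6, 0.999, 12)
           for g in np.linspace(0.0, 0.5, 12)
           if privacy_ok(p, g, eps, d))
```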
The following theorem shows that PrivUnitG enjoys the same error as PrivUnit up to small factors.
Theorem 3.
Let $p$ and $\gamma$ be such that $\mathrm{PrivUnit}_{p,\gamma}$ is an $\varepsilon$-DP local randomizer. Then $\mathrm{PrivUnitG}_{p,\gamma}$ is also an $\varepsilon$-DP local randomizer and has
$\mathrm{Err}(\mathrm{PrivUnitG}_{p,\gamma}) \le (1 + o(1))\,\mathrm{Err}(\mathrm{PrivUnit}_{p,\gamma}) \quad \text{as } d \to \infty.$
In particular, $\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d) \le (1 + o(1))\,\mathrm{Err}_{\mathrm{PU}}(\varepsilon, d)$.
We conduct several experiments that demonstrate that the error of both algorithms is nearly the same as we increase the dimension. We plot the ratio of the error of PrivUnitG to that of PrivUnit (for the same $p$ and $\gamma$) for different values of $\varepsilon$ and the dimension in Figure 1. These plots reaffirm the theoretical results of Theorem 3, that is, the ratio is smaller for large $d$ and small $\varepsilon$.
Figure 1: Ratio of the error of PrivUnitG to the error of PrivUnit as a function of the dimension, for (a) $\varepsilon = 4$, (b) $\varepsilon = 8$, and (c) $\varepsilon = 16$.
Proof.
The privacy proof is straightforward as both algorithms use the same parameters $p$ and $\gamma$ and therefore enjoy the same privacy parameter. Moreover, the second part of the claim follows immediately from the first part and the optimality of PrivUnit (Proposition 2).
Now we prove the first part of the claim. Let be such that where . Then we have
Similarly for PrivUnitG we have (see e.g. proof of Proposition 4)
Therefore we have
(1)
where the inequality follows from Lemma B.3. We now upper bound the first term. Note first that for the same $p$ and $\gamma$ we have
where the first step follows from Lemma B.2. We will need one more step to finish the proof, as the two normalization constants are not necessarily equal. We divide into two cases. If , then and therefore
Similarly, if then and therefore
Overall this proves that
Putting this back in inequality (1), we have
where the last inequality follows from Lemma B.4 as . The claim follows. ∎
4.1 Analytical expression for optimal error
We wish to understand the constants that characterize the optimal error. To this end, we build on the optimality of PrivUnitG and define the normalized quantity $C_{\varepsilon,d}$ by
$C_{\varepsilon, d} = \mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d) \cdot \frac{\min(\varepsilon, \varepsilon^2)}{d}.$
We show that $C_{\varepsilon,d}$ converges to a limit $C_{\varepsilon}$ as $d \to \infty$. Moreover, $C_{\varepsilon}$ converges to a constant $C^{\star}$ as $\varepsilon \to \infty$. We experimentally demonstrate the behavior of $C_{\varepsilon,d}$ and $C_{\varepsilon}$ in Figure 2; these experiments illustrate the convergence. We remark that, as shown in [FT21], if $C_{k\varepsilon}$ is close to $C_{\varepsilon}$, then one can get a near optimal algorithm for privacy parameter $k\varepsilon$ by repeating the algorithm for privacy parameter $\varepsilon$, $k$ times. The latter may be more efficient in terms of computation, and this motivates understanding how quickly $C_{\varepsilon}$ converges.
Figure 2: (a) The quantity $C_{\varepsilon,d}$ as a function of $\varepsilon$ for $d = 35$; (b) the limiting quantity $C_{\varepsilon}$ as a function of $\varepsilon$.
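The repetition baseline from the remark above is straightforward to realize: $k$ independent runs of an $\varepsilon$-DP local randomizer on the same input compose to $k\varepsilon$-DP, and averaging the unbiased outputs divides the variance by $k$. A minimal sketch:

```python
import numpy as np

def repeated_randomizer(randomizer, x, k):
    """k-fold repetition: each call is eps-DP, so by composition the tuple
    of outputs is k*eps-DP; averaging unbiased outputs keeps unbiasedness
    and reduces the variance by a factor of k."""
    return np.mean([randomizer(x) for _ in range(k)], axis=0)
```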
The following proposition shows that $C_{\varepsilon,d}$ converges as we increase the dimension $d$. We provide the proof in Section B.3.
Proposition 5.
Fix $\varepsilon > 0$. Then the sequence $(C_{\varepsilon,d})_d$ converges as $d \to \infty$; in particular, the limit $C_{\varepsilon} = \lim_{d \to \infty} C_{\varepsilon,d}$ exists.
The following proposition shows that $C_{\varepsilon}$ also converges as we increase $\varepsilon$. We present the proof in Section B.4.
Proposition 6.
There is a constant $C^{\star}$ such that $\lim_{\varepsilon \to \infty} C_{\varepsilon} = C^{\star}$.
References
- [App17] Apple Differential Privacy Team “Learning with Privacy at Scale” Available at https://machinelearning.apple.com/2017/12/06/learning-with-privacy-at-scale.html, 2017
- [ASYKM18] Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X Yu, Sanjiv Kumar and Brendan McMahan “cpSGD: Communication-efficient and differentially-private distributed SGD” In Advances in Neural Information Processing Systems 31 Curran Associates, Inc., 2018, pp. 7564–7575 URL: https://proceedings.neurips.cc/paper/2018/file/21ce689121e39821d07d04faab328370-Paper.pdf
- [BBGN19] Borja Balle, James Bell, Adrià Gascón and Kobbi Nissim “The Privacy Blanket of the Shuffle Model” In Advances in Cryptology – CRYPTO 2019 Cham: Springer International Publishing, 2019, pp. 638–667
- [BDFKR18] Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor and Ryan Rogers “Protection Against Reconstruction and Its Applications in Private Federated Learning” In arXiv:1812.00984 [stat.ML], 2018
- [BEMMRLRKTS17] Andrea Bittau, Úlfar Erlingsson, Petros Maniatis, Ilya Mironov, Ananth Raghunathan, David Lie, Mitch Rudominer, Ushasree Kode, Julien Tinnes and Bernhard Seefeld “Prochlo: Strong Privacy for Analytics in the Crowd” In Proceedings of the 26th Symposium on Operating Systems Principles, SOSP ’17 Shanghai, China: Association for Computing Machinery, 2017, pp. 441–459 DOI: 10.1145/3132747.3132769
- [BIKMMPRSS17] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal and Karn Seth “Practical Secure Aggregation for Privacy-Preserving Machine Learning” In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security New York, NY, USA: ACM, 2017, pp. 1175–1191
- [BN10] Hai Brenner and Kobbi Nissim “Impossibility of Differentially Private Universally Optimal Mechanisms” In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, 2010, pp. 71–80 DOI: 10.1109/FOCS.2010.13
- [BNO08] Amos Beimel, Kobbi Nissim and Eran Omri “Distributed private data analysis: Simultaneously solving how and what” In Advances in Cryptology 5157, Lecture Notes in Computer Science Springer, 2008, pp. 451–468
- [BT97] Dimitris Bertsimas and John N Tsitsiklis “Introduction to linear optimization” Athena Scientific Belmont, MA, 1997
- [CKÖ20] Wei-Ning Chen, Peter Kairouz and Ayfer Özgür “Breaking the communication-privacy-accuracy trilemma” In arXiv:2007.11707 [cs.LG], 2020
- [CSUZZ19] Albert Cheu, Adam Smith, Jonathan Ullman, David Zeber and Maxim Zhilyaev “Distributed Differential Privacy via Shuffling” In Advances in Cryptology – EUROCRYPT 2019 Cham: Springer International Publishing, 2019, pp. 375–403
- [CU21] Albert Cheu and Jonathan Ullman “The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation” In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing New York, NY, USA: Association for Computing Machinery, 2021, pp. 1081–1094 URL: https://doi.org/10.1145/3406325.3450995
- [DF87] Persi Diaconis and David Freedman “A dozen de Finetti-style results in search of a theory” In Annales de l’IHP Probabilités et statistiques 23, 1987, pp. 397–423
- [DJW18] John C. Duchi, Michael I. Jordan and Martin J. Wainwright “Minimax Optimal Procedures for Locally Private Estimation (with discussion)” In Journal of the American Statistical Association 113.521, 2018, pp. 182–215
- [DMNS06] C. Dwork, F. McSherry, K. Nissim and A. Smith “Calibrating noise to sensitivity in private data analysis” In TCC, 2006, pp. 265–284
- [DR19] John C. Duchi and Ryan Rogers “Lower Bounds for Locally Private Estimation via Communication Complexity” In Proceedings of the Thirty Second Annual Conference on Computational Learning Theory, 2019
- [Duc18] John C. Duchi “Introductory Lectures on Stochastic Convex Optimization” In The Mathematics of Data, IAS/Park City Mathematics Series American Mathematical Society, 2018
- [EFMRSTT20] Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Shuang Song, Kunal Talwar and Abhradeep Thakurta “Encode, shuffle, analyze privacy revisited: Formalizations and empirical evaluation” In arXiv:2001.03618 [cs.CR], 2020
- [EFMRTT19] Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar and Abhradeep Thakurta “Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity” In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’19 San Diego, California: Society for Industrial and Applied Mathematics, 2019, pp. 2468–2479
- [EGS03] Alexandre V. Evfimievski, Johannes Gehrke and Ramakrishnan Srikant “Limiting privacy breaches in privacy preserving data mining” In Proceedings of the Twenty-Second Symposium on Principles of Database Systems, 2003, pp. 211–222
- [EPK14] Ulfar Erlingsson, Vasyl Pihur and Aleksandra Korolova “RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response” In Proceedings of the 21st ACM Conference on Computer and Communications Security (CCS), 2014
- [FMT20] Vitaly Feldman, Audra McMillan and Kunal Talwar “Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling” In arXiv:2012.12803 [cs.LG], 2020
- [FT21] Vitaly Feldman and Kunal Talwar “Lossless Compression of Efficient Private Local Randomizers” In Proceedings of the 38th International Conference on Machine Learning 139 PMLR, 2021, pp. 3208–3219
- [GDDKS20] Antonious M. Girgis, Deepesh Data, Suhas Diggavi, Peter Kairouz and Ananda Theertha Suresh “Shuffled Model of Federated Learning: Privacy, Communication and Accuracy Trade-offs”, 2020 arXiv:2008.07180 [cs.LG]
- [GKMM19] Venkata Gandikota, Daniel Kane, Raj Kumar Maity and Arya Mazumdar “vqsgd: Vector quantized stochastic gradient descent” In arXiv preprint arXiv:1911.07971, 2019
- [GKOV15] Quan Geng, Peter Kairouz, Sewoong Oh and Pramod Viswanath “The Staircase Mechanism in Differential Privacy” In IEEE Journal of Selected Topics in Signal Processing 9.7, 2015, pp. 1176–1184 DOI: 10.1109/JSTSP.2015.2425831
- [GRS09] Arpita Ghosh, Tim Roughgarden and Mukund Sundararajan “Universally Utility-Maximizing Privacy Mechanisms” In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, STOC ’09 Bethesda, MD, USA: Association for Computing Machinery, 2009, pp. 351–360 DOI: 10.1145/1536414.1536464
- [GS10] Mangesh Gupte and Mukund Sundararajan “Universally Optimal Privacy Mechanisms for Minimax Agents” In Proceedings of the Twenty-Ninth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS ’10 Indianapolis, Indiana, USA: Association for Computing Machinery, 2010, pp. 135–146 DOI: 10.1145/1807085.1807105
- [KOV16] Peter Kairouz, Sewoong Oh and Pramod Viswanath “Extremal Mechanisms for Local Differential Privacy” In J. Mach. Learn. Res. 17.1 JMLR.org, 2016, pp. 492–542 URL: http://dl.acm.org/citation.cfm?id=2946645.2946662
- [War65] Stanley L Warner “Randomized response: A survey technique for eliminating evasive answer bias” In Journal of the American Statistical Association 60.309 Taylor & Francis Group, 1965, pp. 63–69
- [YB17] Min Ye and Alexander Barg “Asymptotically optimal private estimation under mean square loss” In arXiv:1708.00059 [math.ST], 2017
- [YB18] Min Ye and Alexander Barg “Optimal Schemes for Discrete Distribution Estimation Under Locally Differential Privacy” In IEEE Transactions on Information Theory 64.8, 2018, pp. 5662–5676
Appendix A Missing details for PrivUnit (Section 3)
A.1 Proof of Proposition 3 (missing details)
Here we complete the missing details from the proof of Proposition 3. We have sets such that for at least of the sets we have . We now show how to manipulate the probabilities on the other sets to satisfy our desired properties while not affecting the accuracy. Assume without loss of generality that do not satisfy this condition and let . We now show that we can manipulate the density on such that it will satisfy this condition. Moreover, we show that the probability of is small so that this does not affect the accuracy of the algorithm. Note that the probability is at most
where . Now we proceed to balance the density in . Let where and . We show how to balance the probabilities for such that the mass on is . Then we define the density for using the density of as . Let be the measure of the set . We divide into two sets and such that and . We define a new density function such that
First, note that by design we have . We now prove that the choice satisfies all of our conditions. First we show that . Indeed as for all , we get that the average density in is also in this range, that is, , which implies . Moreover, note that
This implies that . Finally, note that this does not affect too much as we have
Note that for sufficiently small , the error of this algorithm is now
where the last inequality follows by choosing small enough such that which gives the claim.
Appendix B Proofs and missing details for Section 4
B.1 Proof of Proposition 4
We will use the following helper lemma.
Lemma B.1.
Let . Then
Proof.
We have
∎
We begin by proving PrivUnitG is unbiased. Note that therefore we need to show that . To this end,
where the third inequality follows since , and the last inequality follows since Lemma B.1 gives that for . For the claim about utility, as PrivUnitG is unbiased we have
Now we prove the claim about the limit. First, note that hence we can write
Moreover, Lemma B.3 and Lemma B.4 show that . Taking the limit as $d \to \infty$, this yields that
Now we proceed to prove the privacy claim. We need to show that for every and
For every input vector $v$, we divide the output space into two sets: $O_1 = \{z : \langle z, v \rangle \ge \gamma\}$ and $O_2 = \{z : \langle z, v \rangle < \gamma\}$. The definition of PrivUnitG implies that for $z \in O_1$ we have
and for $z \in O_2$ we have
Using the notation $q = \Pr(U \ge \gamma)$, we now have that for any inputs $v, v'$ and output $z$
where the first inequality follows since, to maximize the ratio, we must have $z \in O_1$ for input $v$ and $z \in O_2$ for input $v'$.
B.2 Helper lemmas for Theorem 3
Lemma B.2.
We have
Proof.
We use the fact that for and , we have ([DF87, Inequality (1)]). We have
where the last inequality follows by setting . Inequality follows since has the same distribution as for . Indeed setting , we have for any
where the last inequality follows from
The claim now follows as . ∎
Lemma B.3.
We have
Proof.
It is enough to upper bound . Since
where the last inequality follows by setting . ∎
The following lemma is also useful for our analysis.
Lemma B.4.
Assume is -DP. Then .
Proof.
Proposition 4 implies that where for . As is increasing in , we have that
Gaussian concentration gives that for
This implies that and proves the claim.
∎
B.3 Proof of Proposition 5
Recall that the error of for dimension is
where
Note first that which immediately implies that
Lemma B.3 and Lemma B.4 show that . Finally, as for all and , this implies that . Letting denote the error for for inputs , we now have
This proves the right hand side of the inequality. The left side follows using the same arguments by noting that
The second part of the claim regarding the limit follows directly from the first part.
B.4 Proof of Proposition 6
We need the following lemma for this proof.
Lemma B.5.
We have
-
1.
for
-
2.
for any integer
Proof.
The first part follows as the optimal error $\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d)$ is non-increasing in $\varepsilon$, which implies that for $1 \le \varepsilon \le \varepsilon'$,
$C_{\varepsilon',d} = \varepsilon'\,\mathrm{Err}_{\mathrm{PUG}}(\varepsilon', d)/d \le \frac{\varepsilon'}{\varepsilon}\cdot\varepsilon\,\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d)/d = \frac{\varepsilon'}{\varepsilon}\,C_{\varepsilon,d}.$
This implies the claim for $C_{\varepsilon}$ by taking $d \to \infty$.
The second part follows from applying a repetition-based randomizer. Given an $\varepsilon$-DP local randomizer that achieves the optimal error $\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d)$, the randomizer that returns an average of $k$ applications of it is $k\varepsilon$-DP and has error $\mathrm{Err}_{\mathrm{PUG}}(\varepsilon, d)/k$. Thus we have
$C_{k\varepsilon} \le C_{\varepsilon}$. This proves the claim for $C_{\varepsilon}$ as it is true for all $d$. ∎
We are now ready to prove Proposition 6.
Proof.
(Proposition 6) First, note that $(C_{\varepsilon})_{\varepsilon \ge 1}$ is a sequence of real numbers that is bounded from below by zero. Thus $C^{\star} = \liminf_{\varepsilon \to \infty} C_{\varepsilon}$ exists. Now we will show that $\lim_{\varepsilon \to \infty} C_{\varepsilon}$ exists. It is enough to show that for all $\alpha > 0$ there is $\varepsilon_0$ such that for all $\varepsilon \ge \varepsilon_0$ we get $C_{\varepsilon} \le C^{\star} + \alpha$. The definition of $C^{\star}$ implies that there is $\varepsilon_1$ such that $C_{\varepsilon_1} \le C^{\star} + \alpha/2$. We will show that $C_{\varepsilon} \le C^{\star} + \alpha$ for all $\varepsilon \ge \varepsilon_0$ for some large $\varepsilon_0$. Let $\varepsilon = k\varepsilon_1 + r$ where $k$ is an integer and $0 \le r < \varepsilon_1$. We have that
where the two steps follow from the first and second items of Lemma B.5, and the last inequality follows by choosing $\varepsilon_0$ (hence $k$) large enough. The claim follows.
∎