How to Find a Point in the Convex Hull Privately
Abstract
We study the question of how to compute a point in the convex hull of an input set $S$ of $n$ points in $\mathbb{R}^d$ in a differentially private manner. This question, which is trivial without privacy requirements, turns out to be quite deep when imposing differential privacy. In particular, it is known that the input points must reside on a fixed finite subset $G \subseteq \mathbb{R}^d$, and furthermore, the size of $S$ must grow with the size of $G$. Previous works [1, 2, 3, 4, 5, 11] focused on understanding how $n$ needs to grow with $|G|$, and showed that $n = O\big(d^{2.5}\cdot 8^{\log^* |G|}\big)$ suffices (so $n$ does not have to grow significantly with $|G|$). However, the available constructions exhibit running time at least $|G|^{d^2}$, where typically $|G| = X^d$ for some (large) discretization parameter $X$, so the running time is in fact at least $X^{d^3}$.
In this paper we give a differentially private algorithm that runs in $\widetilde{O}(n^d)$ time, assuming that $n = \Omega(d^4 \log X)$. To get this result we study and exploit some structural properties of the Tukey levels (the regions $D_{\ge k}$ consisting of points whose Tukey depth is at least $k$, for $k = 0, 1, \ldots$). In particular, we derive lower bounds on their volumes for point sets $S$ in general position, and develop a rather subtle mechanism for handling point sets $S$ in degenerate position (where the deep Tukey regions have zero volume). A naive approach to the construction of the Tukey regions requires $n^{O(d^2)}$ time. To reduce the cost to $\widetilde{O}(n^d)$, we use an approximation scheme for estimating the volumes of the Tukey regions (within their affine spans in case of degeneracy), and for sampling a point from such a region, a scheme that is based on the volume estimation framework of Lovász and Vempala [13] and of Cousins and Vempala [7]. Making this framework differentially private raises a set of technical challenges that we address.
1 Introduction
We often would like to analyze data while protecting the privacy of the individuals that contributed to it. At first glance, one might hope to ensure privacy by simply deleting all names and ID numbers from the data. However, such anonymization schemes are proven time and again to violate privacy. This gave rise to a theoretically rigorous line of work that has placed private data analysis on firm foundations, centered around a mathematical definition of privacy known as differential privacy [8].
Consider a database containing personal information of $n$ individuals. Informally, an algorithm operating on such a database is said to preserve differential privacy if its outcome distribution is (almost) insensitive to any arbitrary change to the data of one individual in the database. Intuitively, this means that an observer looking at the outcome of the algorithm (almost) cannot distinguish whether Alice's information is $x$ or $y$ (or whether Alice's information is present in the database at all), because in any case it would have (almost) no effect on the outcome distribution of the algorithm.
Definition 1.1 (Dwork et al. [8]).
Two databases (multisets) $S$ and $S'$ are called neighboring if they differ in a single entry. That is, $S' = (S \setminus \{x\}) \cup \{y\}$ for some items $x$ and $y$. A randomized algorithm $A$ is $(\varepsilon,\delta)$-differentially private if for every two neighboring databases $S, S'$ and for any event $T$ we have
$$\Pr[A(S) \in T] \;\le\; e^{\varepsilon}\cdot\Pr[A(S') \in T] + \delta.$$
When $\delta = 0$ this notion is referred to as pure differential privacy, and when $\delta > 0$ it is referred to as approximate differential privacy.
Remark 1.2.
Typically, $\varepsilon$ is set to be a small constant, and $\delta$ is set to be a small function of the database size $n$ (much smaller than $1/n$). Note that to satisfy the definition (in any meaningful way) algorithm $A$ must be randomized.
Differential privacy is increasingly accepted as a standard for the rigorous treatment of privacy. However, even though the field has witnessed an explosion of research in recent years, much remains unknown and answers to fundamental questions are still missing. In this work we study one such fundamental question, already studied in [1, 2, 3, 4, 5, 11]: Given a database containing $n$ points in $\mathbb{R}^d$ (where every point is assumed to be the information of one individual), how can we privately identify a point in the convex hull of the input points? This question, which is trivial without privacy requirements, turns out to be quite deep when imposing differential privacy. In particular, Bun et al. [5] showed that in order to be able to solve it, we must assume that the input points reside on a fixed finite subset $G \subseteq \mathbb{R}^d$, and furthermore, the number of input points must grow with the size of $G$.
The Private Interior Point (PIP) Problem.
Let $\beta, \varepsilon, \delta$ be small positive parameters, and let $X$ be a large integer. Let $G \subseteq [0,1]^d$ be a finite uniform grid with side steps $1/X$ (so $|G| = (X+1)^d$). Design an algorithm $A$ such that for some $n$ (as small as possible as a function of $\beta, \varepsilon, \delta, d, X$) we have:
1.
Utility: For every database $S$ containing at least $n$ points from $G$, it holds that $A(S)$ returns a point in the convex hull of $S$ with probability at least $1-\beta$. (The outcome of $A$ does not have to be in $G$.)
2.
Privacy: For every pair of neighboring databases $S, S'$, each containing $n$ points from $G$, and for any event $T$, we have $\Pr[A(S)\in T] \le e^{\varepsilon}\cdot\Pr[A(S')\in T] + \delta$.
The parameter $n$ is referred to as the sample complexity of the algorithm. It is the smallest number of points on which we are guaranteed to succeed (not to be confused with the actual size of the input). The PIP problem is very natural on its own. Furthermore, as was observed in [2], an algorithm for solving the PIP problem can be used as a building block in other applications with differential privacy, such as learning halfspaces and linear regression. Previous works [1, 2, 3, 4, 5, 11] have focused on the task of minimizing the sample complexity $n$ while ignoring the runtime of the algorithm. In this work we seek an efficient algorithm for the PIP problem that still keeps the sample complexity "reasonably small" (where "reasonably small" will be made precise after we introduce some additional notation).
1.1 Previous Work
Several papers studied the PIP problem for $d = 1$. In particular, three different constructions with sample complexity $2^{O(\log^* X)}$ were presented in [3, 4, 5] (for $\delta > 0$). Recently, Kaplan et al. [11] presented a new construction with sample complexity $\widetilde{O}\big((\log^* X)^{1.5}\big)$ (again, for $\delta > 0$). Bun et al. [5] gave a lower bound showing that every differentially private algorithm for this task must have sample complexity $\Omega(\log^* X)$. Beimel et al. [2] incorporated a dependency on $d$ into this lower bound, and showed that the sample complexity of every differentially private algorithm for the PIP problem must grow with both $d$ and $\log^* X$.
For the case of pure differential privacy (i.e., $\delta = 0$), a lower bound of $\Omega(\log X)$ on the sample complexity follows from the results of Beimel et al. [1]. This lower bound is tight, as an algorithm with sample complexity $O(\log X)$ (for $d = 1$) can be obtained using a generic tool in the literature of differential privacy, called the exponential mechanism [15]. We sketch this application of the exponential mechanism here. (For a precise presentation of the exponential mechanism see Section 2.1.) Let $G$ be our (1-dimensional) grid within the interval $[0,1]$, and let $S$ be a multiset containing $n$ points from $G$. The algorithm is as follows.
1.
For every $x \in G$ define the score $q(x) = \min\big\{\,|\{y \in S : y \le x\}|,\ |\{y \in S : y \ge x\}|\,\big\}$.
2.
Output $x \in G$ with probability proportional to $e^{\varepsilon\, q(x)/2}$.
Intuitively, this algorithm satisfies differential privacy because changing one element of $S$ changes each score $q(x)$ by at most $1$, and thus changes the probabilities with which we sample elements by roughly an $e^{\varepsilon}$ factor. As for the utility analysis, observe that the median $x^*$ of $S$ satisfies $q(x^*) \ge n/2$, and the probability of picking this point is (at least) proportional to $e^{\varepsilon n/4}$. As this probability increases exponentially with $n$, by setting $n$ to be big enough we can ensure that points outside of the convex hull (those with $q(x) = 0$) get picked with very low probability.
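To make this concrete, here is a minimal sketch of the 1-dimensional mechanism in Python (the function and variable names are ours, for illustration only; it assumes the grid consists of the multiples of $1/X$ in $[0,1]$):

```python
import numpy as np

def pip_exponential_mechanism_1d(S, X, eps, rng):
    grid = np.arange(X + 1) / X
    S = np.asarray(S, dtype=float)
    # q(x) = min(#{y in S : y <= x}, #{y in S : y >= x}); q is 0 outside
    # the convex hull [min(S), max(S)] and about n/2 at the median.
    q = np.minimum((S[None, :] <= grid[:, None]).sum(axis=1),
                   (S[None, :] >= grid[:, None]).sum(axis=1))
    # Weight each grid point by exp(eps*q/2); subtracting max(q) before
    # exponentiating is a standard trick for numerical stability.
    w = np.exp(eps * (q - q.max()) / 2.0)
    return rng.choice(grid, p=w / w.sum())

rng = np.random.default_rng(0)
S = np.round(np.clip(rng.normal(0.3, 0.05, size=200), 0, 1) * 100) / 100
print(pip_exponential_mechanism_1d(S, X=100, eps=0.5, rng=rng))  # likely near 0.3
```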
Beimel et al. [2] observed that this algorithm extends to higher dimensions by replacing $q(x)$ with the Tukey depth of the point $x$ with respect to the input set $S$ (the Tukey depth of a point $x$ is the minimal number of points that need to be removed from $S$ to ensure that $x$ is not in the convex hull of the remaining input points). However, even though there must exist a point with high Tukey depth (at least $n/(d+1)$; see [14]), the finite grid $G$ might fail to contain such a point. Hence, Beimel et al. [2] first refined $G$ into a finer grid $\widehat{G}$ that does contain a point with high Tukey depth, and then randomly picked a point $x \in \widehat{G}$ with probability proportional to $e^{\varepsilon\cdot\mathrm{td}(x)/2}$. To compute the probabilities with which grid points are sampled, the algorithm in [2] explicitly computes the Tukey depth of every point in $\widehat{G}$, which, because of the way in which $\widehat{G}$ is defined, results in running time of at least $X^{d^3}$ and sample complexity roughly $\frac{\mathrm{poly}(d)}{\varepsilon}\log X$. Beimel et al. then presented an improvement of this algorithm with reduced sample complexity of $O\big(d^{2.5}\cdot 8^{\log^* X}\big)$, but the running time remained at least $X^{d^3}$.
1.2 Our Construction
We give an approximate differentially private algorithm for the private interior point problem that runs in $\widetilde{O}(n^d)$ time (when we use $\widetilde{O}$-notation for time complexity we hide logarithmic factors in $n$, $X$, $1/\beta$, and $1/\delta$, and polynomial factors in $d$; we also assume that each operation on real numbers takes constant time, in the so-called real RAM model), and succeeds with high probability when the size of its input is polynomial in $d$, $1/\varepsilon$, and $\log X$ (see Theorems 4.9 and 5.1 for the precise bounds). Our algorithm is obtained by carefully implementing the exponential mechanism and reducing its running time from $X^{O(d^3)}$ to $\widetilde{O}(n^d)$. We now give an informal overview of this result.
To avoid the need to extend the grid $G$ and to calculate the Tukey depth of each point in the extended grid, we sample our output directly from the cube $[0,1]^d$. To compute the probabilities with which we sample a point from $[0,1]^d$ we compute, for each $i$ in an appropriate range, the volume of the Tukey region of depth $i$, which we denote as $D_i$. (That is, $D_i$ is the region in $[0,1]^d$ containing all points with Tukey depth exactly $i$.) We then sample a value $i$ with probability proportional to $e^{\varepsilon i/2}\cdot\mathrm{volume}(D_i)$, and then sample a random point uniformly from $D_i$.
Observe that this strategy picks a point with Tukey depth $i$ with probability proportional to $e^{\varepsilon i/2}\cdot\mathrm{volume}(D_i)$. Hence, if for a "large enough" value of $i$ (say $i \ge \frac{n}{2(d+1)^2}$) we have that $\mathrm{volume}(D_i)$ is "large enough", then a point with Tukey depth at least $i$ is picked with high probability. However, if $\mathrm{volume}(D_i) = 0$ (or is too small) then a point with Tukey depth $i$ is picked with probability zero (or with too small a probability). Therefore, to apply this strategy, we derive a lower bound on the volume of every non-degenerate Tukey region, showing that if the volume is non-zero, then it is at least $X^{-O(d^3)}$.
There are two issues here. The first issue is that the best bound we know on the complexity of a Tukey region is $n^{O(d^2)}$, so we cannot compute these regions explicitly (in the worst case) in $\widetilde{O}(n^d)$ time (which is our target runtime complexity). We avoid the need to compute the Tukey regions explicitly by using an approximation scheme for estimating the volume of each region and for sampling a point from such a region, a scheme that is based on the volume estimation framework of Lovász and Vempala [13] and of Cousins and Vempala [7]. The second issue is that it might be the case that all Tukey regions for large values of $i$ are degenerate, i.e., have volume 0, in which case this strategy might fail to identify a point in the convex hull of the input points.
Handling degeneracies. We show that if the Tukey regions of high depth are degenerate, then many of the input points must lie in a lower-dimensional affine subspace. This can be used to handle degenerate inputs as follows. We first (privately) check whether there exists an affine proper subspace that contains many points of $S$. If we find such a subspace $f$, we recursively continue the procedure within $f$, with respect to $S \cap f$. Otherwise, if no such subspace exists, then it must be the case that the Tukey regions of high depth are full-dimensional (with respect to the subspace into which we have recursed so far), so we can apply our algorithm for the non-degenerate case and obtain a point that lies, with high probability, in the convex hull of the surviving subset of $S$, and thus of the full set $S$.
We remark that it is easy to construct algorithms with running time polynomial in the input size $n$ when $n$ grows exponentially in $d$. (In that case one can solve the problem using a reduction to the 1-dimensional case.) In this work we aim to reduce the running time while keeping the sample complexity at most polynomial in $d$ and in $\log X$.
2 Preliminaries
We assume that our input set $S$ consists of $n$ points that lie in the intersection of a grid $G$ with the cube $[0,1]^d$ in $\mathbb{R}^d$. We assume that $G$ has side length $1/X$, for a given accuracy integer parameter $X$, so it partitions the cube into $X^d$ cells.
2.1 The exponential mechanism
Let $G^*$ denote the set of all finite databases over a grid $G$, and let $F$ be a finite set. Given a database $S \in G^*$, a quality (or scoring) function $q : G^* \times F \to \mathbb{R}$ assigns a number $q(S,f)$ to each element $f \in F$, identified as the "quality" of $f$ with respect to $S$. We say that the function $q$ has sensitivity $\Delta$ if for all neighboring databases $S$ and $S'$ and for all $f \in F$ we have $|q(S,f) - q(S',f)| \le \Delta$.
The exponential mechanism of McSherry and Talwar [15] privately identifies an element $f \in F$ with large quality $q(S,f)$. Specifically, it chooses an element $f \in F$ randomly, with probability proportional to $e^{\varepsilon\, q(S,f)/(2\Delta)}$. The privacy and utility of the mechanism are:
Theorem 2.1 (McSherry and Talwar [15]).
The exponential mechanism is $\varepsilon$-differentially private. Let $q$ be a quality function with sensitivity $\Delta$. Fix a database $S$ and let $\mathrm{OPT} = \max_{f\in F} q(S,f)$. For any $\beta \in (0,1)$, with probability at least $1-\beta$, the exponential mechanism outputs a solution $f$ with quality $q(S,f) \ge \mathrm{OPT} - \frac{2\Delta}{\varepsilon}\ln\frac{|F|}{\beta}$.
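As an illustration of Theorem 2.1, the following sketch implements the generic mechanism for a finite candidate set and empirically checks the utility bound (names are ours; this is not code from [15]):

```python
import numpy as np

def exponential_mechanism(qualities, eps, sensitivity, rng):
    """Pick index f with probability proportional to exp(eps*q(f)/(2*sensitivity))."""
    q = np.asarray(qualities, dtype=float)
    w = np.exp(eps * (q - q.max()) / (2.0 * sensitivity))  # numerically stable
    return rng.choice(len(q), p=w / w.sum())

rng = np.random.default_rng(1)
q = rng.integers(0, 50, size=1000).astype(float)  # toy qualities, sensitivity 1
eps, beta = 0.5, 0.05
# Theorem 2.1: with probability >= 1 - beta, the output quality is at least
# OPT - (2*sensitivity/eps)*ln(|F|/beta).
bound = q.max() - (2.0 / eps) * np.log(len(q) / beta)
samples = [q[exponential_mechanism(q, eps, 1.0, rng)] for _ in range(2000)]
print(np.mean(np.array(samples) >= bound))  # should be well above 1 - beta
```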
2.2 Tukey depth
The Tukey depth [17] of a point $q \in \mathbb{R}^d$ with respect to $S$, denoted $\mathrm{td}(q, S)$, is the minimum number of points of $S$ we need to remove to make $q$ lie outside the convex hull of the remaining subset. Equivalently, $\mathrm{td}(q, S)$ is the smallest number of points of $S$ that lie in a closed halfspace containing $q$. We will write $\mathrm{td}(q)$ for $\mathrm{td}(q, S)$ when the set $S$ is clear from the context. It easily follows from Helly's theorem that there is always a point of Tukey depth at least $\frac{n}{d+1}$ (see, e.g., [14]). We denote the largest Tukey depth of a point by $\mathrm{td}_{\max} = \mathrm{td}_{\max}(S)$ (the maximum Tukey depth is always at most $\lceil n/2\rceil$).
We define the regions $D_{\ge k} = \{x \in \mathbb{R}^d : \mathrm{td}(x) \ge k\}$ and $D_k = D_{\ge k}\setminus D_{\ge k+1}$, for $k = 0, \ldots, \mathrm{td}_{\max}$. Note that $D_{\ge 1}$ is the convex hull of $S$ and that $D_{\ge 0} = \mathbb{R}^d$. It is easy to show that $D_{\ge k}$ is convex; it is in fact the intersection of all (closed) halfspaces containing at least $n-k+1$ points of $S$; see [16]. It is easy to show that all this is true also when $D_{\ge k}$ is degenerate. See Figure 1 for an illustration. The following lemma is easy to establish.
Lemma 2.2.
If $D_{\ge k}$ is of dimension $d$ (we refer to such a region as non-degenerate) then $D_{\ge k}$ is a convex polytope, each of whose facets is contained in a simplex $\sigma$ spanned by $d$ points of $S$, such that one of the open halfspaces bounded by the hyperplane supporting $\sigma$ contains exactly $k-1$ points of $S$.
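For intuition, the equivalent halfspace definition of the Tukey depth can be evaluated by brute force in the plane: some minimizing closed halfspace can be assumed to have a bounding line through $q$ and a point of $S$, so it suffices to scan those $O(n)$ candidate lines. A hypothetical $O(n^2)$-time sketch for $d = 2$ (names are ours):

```python
import numpy as np

def tukey_depth_2d(q, S):
    """Brute-force Tukey depth of q w.r.t. the rows of S (d = 2 only)."""
    q, S = np.asarray(q, float), np.asarray(S, float)
    best = len(S)
    for s in S:
        v = s - q
        if not np.any(v):              # s coincides with q; no line defined
            continue
        normal = np.array([-v[1], v[0]])   # normal of the line through q and s
        side = (S - q) @ normal
        # both closed halfplanes bounded by this line contain q
        best = min(best, int(np.sum(side >= 0)), int(np.sum(side <= 0)))
    return best

S = np.random.default_rng(2).random((100, 2))
print(tukey_depth_2d(S.mean(axis=0), S))  # depth of the centroid
```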
3 The case of general position
As already said, we apply the exponential mechanism for privately identifying a point in $[0,1]^d$ with (hopefully) large Tukey depth with respect to the input set $S$. This satisfies (pure) differential privacy since the sensitivity of the Tukey depth is $1$. In this section we show that when the input points are in general position, this application of the exponential mechanism succeeds (with high probability) in identifying a point that has positive Tukey depth, that is, a point inside the convex hull of $S$.
To implement the exponential mechanism (i.e., to sample a point from $[0,1]^d$ appropriately), we need to compute the Tukey regions $D_{\ge k}$ and their volumes. In this section we compute these regions explicitly, in $n^{O(d^2)}$ time. In Section 5 we will show that the cost can be reduced to $\widetilde{O}(n^d)$.
Computing $D_{\ge k}$.
We pass to the dual space, and construct the arrangement $\mathcal{A}(S^*)$ of the $n$ hyperplanes dual to the points of $S$. A point $h^*$ dual to a hyperplane $h$ supporting $D_{\ge k}$ has at least $k$ dual hyperplanes passing below or incident to it, or, alternatively, at least $k$ passing above or incident to it. Furthermore, if we move $h^*$ slightly down in the first case (resp., up in the second case), the number of hyperplanes below (resp., above) it becomes smaller than $k$.
When $h^*$ is a vertex of $\mathcal{A}(S^*)$ we refer to it as $k$-critical, or simply as critical. If $D_{\ge k}$ is non-degenerate then, by Lemma 2.2, each hyperplane that supports a facet of $D_{\ge k}$ must be spanned by $d$ affinely independent points of $S$. That is, all these hyperplanes are dual to $k$-critical vertices of $\mathcal{A}(S^*)$.
We compute all critical dual vertices that have at least $k$ dual hyperplanes passing below them or incident to them, or at least $k$ dual hyperplanes passing above them or incident to them. The intersection of the appropriate primal halfspaces that are bounded by the hyperplanes corresponding to these dual vertices is $D_{\ge k}$. This gives an algorithm for constructing $D_{\ge k}$ when it is non-degenerate. Otherwise $D_{\ge k}$ is degenerate and its volume is $0$, and we need additional techniques, detailed in Section 4, to handle such situations.
We compute the volume of each non-degenerate $D_{\ge k}$, for $k = 1, \ldots, \mathrm{td}_{\max}$. We do that by brute force, by computing and triangulating $D_{\ge k}$ and adding up the volumes of the simplices in this triangulation. Then we subtract the volume of $D_{\ge k+1}$ from the volume of $D_{\ge k}$ to get the volume of $D_k$.
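For a single non-degenerate region, this brute-force volume step can be sketched with off-the-shelf tools. The following illustration (not the arrangement-based construction described above) assumes the defining halfspaces and a strictly interior point are already at hand:

```python
import numpy as np
from scipy.spatial import ConvexHull, HalfspaceIntersection

def polytope_volume(A, b, interior_point):
    """Volume of {x : A @ x <= b}, given a strictly interior point.

    HalfspaceIntersection expects stacked rows [A | -b] encoding A@x - b <= 0;
    ConvexHull then triangulates the vertex set and sums simplex volumes,
    mirroring the brute-force scheme described in the text.
    """
    halfspaces = np.hstack([np.asarray(A, float), -np.asarray(b, float)[:, None]])
    hs = HalfspaceIntersection(halfspaces, np.asarray(interior_point, float))
    return ConvexHull(hs.intersections).volume

# Example: the unit square as an intersection of four halfspaces.
A = [[1., 0.], [-1., 0.], [0., 1.], [0., -1.]]
b = [1., 0., 1., 0.]
print(polytope_volume(A, b, [0.5, 0.5]))  # 1.0
```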
The sampling mechanism.
We assign to each region $D_k$ the weight $e^{\varepsilon k/2}$ and sample a region $D_k$, for $0 \le k \le \mathrm{td}_{\max}$, with probability
$$\Pr[D_k] \;=\; \frac{e^{\varepsilon k/2}\,\mathrm{volume}(D_k)}{\sum_{j=0}^{\mathrm{td}_{\max}} e^{\varepsilon j/2}\,\mathrm{volume}(D_j)},$$
where $\mathrm{volume}(D_k)$ denotes the volume of $D_k$ (clipped to $[0,1]^d$ in the case $k = 0$). Then we sample a point uniformly at random from the chosen $D_k$. We do this by brute force, by computing $D_k$, triangulating it, computing the volume of each simplex, drawing a simplex from this triangulation with probability proportional to its volume, and then drawing a point uniformly at random from the chosen simplex. (A simple way to implement the last step is to draw uniformly and independently $d$ points from $[0,1]$, compute the lengths $\lambda_0,\ldots,\lambda_d$ of the $d+1$ intervals into which they partition $[0,1]$, and return $\sum_i \lambda_i v_i$, where $v_0,\ldots,v_d$ are the vertices of the simplex.)
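The interval trick in the parenthetical note above generates uniform barycentric weights; a minimal sketch (our names):

```python
import numpy as np

def sample_uniform_from_simplex(vertices, rng):
    """Uniform sample from the simplex whose vertices are the rows of `vertices`.

    d i.i.d. uniform draws split [0,1] into d+1 intervals; the interval
    lengths serve as uniformly distributed barycentric coordinates.
    """
    V = np.asarray(vertices, float)            # shape (d+1, d)
    d = V.shape[0] - 1
    cuts = np.sort(rng.random(d))
    lam = np.diff(np.concatenate(([0.0], cuts, [1.0])))  # d+1 weights, sum 1
    return lam @ V

rng = np.random.default_rng(4)
triangle = np.array([[0., 0.], [1., 0.], [0., 1.]])
print(sample_uniform_from_simplex(triangle, rng))
```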
This is an instance of the exponential mechanism in which the score (namely the Tukey depth) has sensitivity $1$; i.e., $|\mathrm{td}(x,S) - \mathrm{td}(x,S')| \le 1$ for any point $x$, when $S$ and $S'$ differ by only one element. It thus follows from the properties of the exponential mechanism (Theorem 2.1) that this procedure is (purely) $\varepsilon$-differentially private.
Complexity.
Computing the dual arrangement takes $O(n^d)$ time [10]. Assume that $D_{\ge k}$ is non-degenerate and let $m_k$ denote the number of hyperplanes defining $D_{\ge k}$ (i.e., the hyperplanes supporting its facets). It takes $O\big(m_k^{\lfloor d/2\rfloor} + m_k\log m_k\big)$ time to construct $D_{\ge k}$, as the intersection of $m_k$ halfspaces, which is a dual version of constructing a convex hull (see [6]). Within the same asymptotic bound we can triangulate $D_{\ge k}$ and compute its volume. We obviously have $m_k = O(n^d)$, but this number can be somewhat reduced. The following lemma is known.
Lemma 3.1 (Proposition 3 in [12]).
The number of halfspaces needed to construct $D_{\ge k}$ is $O(n^{d-1})$.
Proof.
Fix a $(d-1)$-tuple $\tau$ of points of $S$, and consider all the relevant (closed) halfspaces, each of which is bounded by a hyperplane that is spanned by $\tau$ and another point of $S$, and contains at least $n-k+1$ points of $S$. It is easy to check that, as long as the intersection of these halfspaces is full-dimensional, it is equal to the intersection of just two of them. ∎
Summing up, we get that computing the volume of all the non-degenerate regions $D_{\ge k}$, for $k = 1,\ldots,\mathrm{td}_{\max}$, takes $n^{O(d^2)}$ time.
Utility.
We continue to assume that $D_{\ge k}$ is non-degenerate, and give a lower bound on its volume. By Lemma 2.2, such a $D_{\ge k}$ is the intersection of halfspaces, each bounded by a hyperplane that is spanned by $d$ points of $S$. Denote by $H$ the set of these hyperplanes. To obtain the lower bound, it suffices to consider the case where $D_{\ge k}$ is a simplex, each of whose vertices is the intersection point of $d$ hyperplanes of $H$.
The equation of a hyperplane $h$ that passes through $d$ points $a_1,\ldots,a_d$ of $S$ is
$$\det\begin{pmatrix} x_1 & x_2 & \cdots & x_d & 1\\ a_{1,1} & a_{1,2} & \cdots & a_{1,d} & 1\\ \vdots & \vdots & & \vdots & \vdots\\ a_{d,1} & a_{d,2} & \cdots & a_{d,d} & 1\end{pmatrix} \;=\; 0,$$
where $a_i = (a_{i,1},\ldots,a_{i,d})$, for $i = 1,\ldots,d$. The coefficients of the $x_j$'s in the equation of $h$ are $d\times d$ subdeterminants of this determinant, where each such determinant has one column of $1$'s, and $d-1$ other columns, each of whose entries is some $a_{i,j}$, which is a rational of the form $u/X$, with $u \in \{0,1,\ldots,X\}$. The value of such a determinant (coefficient) is a rational number with denominator $X^{d-1}$. By Hadamard's inequality, its absolute value is at most the product of the Euclidean norms of its rows, which is at most $d^{d/2}$. That is, the numerator of the determinant is an integer of absolute value at most $d^{d/2}X^{d-1}$. The free term is a similar sub-determinant, but all of its $d$ columns consist of the $a_{i,j}$'s, so it too is a rational with denominator $X^d$, and with numerator of absolute value at most $d^{d/2}X^d$. Multiplying the equation by $X^d$, all the coefficients become integers of absolute value at most $d^{d/2}X^d$.
Each vertex of any region of Tukey depth $k$, for any $k$, is a solution of a linear system of $d$ hyperplane equations of the above form. By Cramer's rule, its coordinates are rationals whose common denominator is the determinant of the $d\times d$ matrix of all the non-free coefficients of the $d$ hyperplanes. This is an integer whose absolute value, again by Hadamard's inequality, is at most
$$\big(\sqrt{d}\cdot d^{d/2}X^d\big)^d \;=\; d^{(d^2+d)/2}\,X^{d^2}.$$
Since the free terms are also integers, we conclude that the coordinates of the intersection point are rationals with a common integral denominator of absolute value at most $D := d^{(d^2+d)/2}X^{d^2}$.
We can finally obtain a lower bound for the nonzero volume of a simplex spanned by any $d+1$ affinely independent intersection points $v_0, v_1, \ldots, v_d$. This volume is
$$\frac{1}{d!}\,\left|\det\begin{pmatrix} v_{0,1} & \cdots & v_{0,d} & 1\\ \vdots & & \vdots & \vdots\\ v_{d,1} & \cdots & v_{d,d} & 1\end{pmatrix}\right|,$$
where again $v_i = (v_{i,1},\ldots,v_{i,d})$, for $i = 0,\ldots,d$. Note that all the entries in any fixed row have the same denominator. The volume is therefore a rational number whose denominator is $d!$ times the product of these $d+1$ denominators; this is an integer with absolute value at most
$$d!\,\big(d^{(d^2+d)/2}\,X^{d^2}\big)^{d+1} \;=\; X^{O(d^3)}$$
(for $X \ge d$, say). That is, we get the following lemma.
Lemma 3.2.
If the volume of $D_{\ge k}$ is not zero then it is at least $X^{-O(d^3)}$ (with the explicit bound derived above).
Assume that the volume of $D_{\ge k}$ is not zero for $k = \frac{n}{2(d+1)^2}$. Since the score of a point outside the convex hull is zero and the volume of $[0,1]^d$ is $1$, we get that the probability to sample a point outside of the convex hull is at most
$$\frac{1\cdot e^{0}}{e^{\varepsilon k/2}\,\mathrm{volume}(D_{\ge k})} \;\le\; X^{O(d^3)}\, e^{-\varepsilon n/(4(d+1)^2)}.$$
This inequality leads to the following theorem, which summarizes the utility that our instance of the exponential mechanism provides.
Theorem 3.3.
If $\varepsilon k = \Omega\big(d^3\log\frac{X}{\beta}\big)$, where $k = \frac{n}{2(d+1)^2}$ (equivalently, $n = \Omega\big(\frac{d^5}{\varepsilon}\log\frac{X}{\beta}\big)$), and $D_{\ge k}$ has non-zero volume, then the exponential mechanism, implemented as above, returns a point in the convex hull of $S$ with probability at least $1-\beta$, in $n^{O(d^2)}$ time.
4 Handling degeneracies
In general we have no guarantee that $D_{\ge k}$ has non-zero volume. In this section we show how to overcome this and compute (with high probability) a point in the convex hull of any input. We rely on the following lemma, which shows that if $D_{\ge k}$ has zero volume then many points of $S$ lie in a lower-dimensional affine subspace.
Lemma 4.1.
If $D_{\ge k}$ spans an affine subspace $f$ of dimension $j \le d-1$ then $|f \cap S| \ge n - (d-j)(k-1)$.
Proof.
Recall that $D_{\ge k}$ is the intersection of all closed halfspaces that contain at least $n-k+1$ points of $S$. Note that a halfspace in this collection whose bounding hyperplane properly crosses $f$ defines a proper halfspace within $f$, and, by assumption, the intersection of these halfspaces within $f$ has positive relative volume. This means that the intersection of these halfspaces in $\mathbb{R}^d$ has positive ($d$-dimensional) volume too, and thus cannot confine $D_{\ge k}$ to $f$. To get this confinement, there must exist (at least) $d-j$ halfspaces in the above collection whose intersection is $f$. Hence the union of their complements is $\mathbb{R}^d \setminus f$. Since this union contains at most $(d-j)(k-1)$ points of $S$, the claim follows. ∎
In what follows, to simplify the expressions that we manipulate, we use the weaker lower bound $|f\cap S| \ge n - dk$. In order for the lemma to be meaningful, we want $dk$ to be significantly smaller than the centerpoint bound $\frac{n}{d+1}$, so we set, as above, $k = \frac{n}{2(d+1)^2}$.
We use Lemma 4.1 to handle degenerate inputs $S$, using the following high-level approach. We first (privately) check whether there exists an affine proper subspace that contains many points of $S$. If we find such a subspace $f$, we recursively continue the procedure within $f$, with respect to $S \cap f$. Lemma 4.1 then implies that we do not lose too many points when we recurse within $f$ (that is, $|S \cap f|$ is large), using our choice of $k$. Otherwise, if no such subspace exists, Lemma 4.1 implies that $D_{\ge k}$ is full-dimensional (with respect to the subspace into which we have recursed so far), so we can apply the exponential mechanism, as implemented in Section 3, and obtain a point that lies, with high probability, in the convex hull of the surviving subset of $S$, and thus of the full set $S$. We refer to this application of the exponential mechanism in the appropriate affine subspace as the base case of the recursion.
The points of $S \cap f$ are not on a standard grid within $f$. (They lie of course in the standard uniform grid of side length $1/X$ within the full-dimensional cube, but $f \cap G$ is not a standard grid and in general has a different, coarser resolution.) We overcome this issue by noting that there always exist $j$ coordinates, which, without loss of generality, we assume to be $x_1,\ldots,x_j$, such that $f$ can be expressed in parametric form by these coordinates. We then project $f$ (and $S \cap f$) onto the $x_1\cdots x_j$-coordinate subspace $f_0$. We recurse within $f_0$, where the projected points of $S \cap f$ do lie in a standard grid (a cross-section of $G$), and then lift the output point $x \in f_0$, back to a point $y \in f$. It is straightforward to verify that if $x$ is in the convex hull of the projected points then $y$ is in the convex hull of $S \cap f$.
4.1 Finding an affine subspace with many points privately
For every affine subspace $f$, of dimension $0 \le j \le d-1$, spanned by some subset of (at least) $j+1$ points of $S$, we denote by $\mathrm{size}(f) = |f \cap S|$ the number of points of $S$ that $f$ contains, and refer to it as the size of $f$.
We start by computing $\mathrm{size}(f)$ for every subspace $f$ spanned by points of $S$, as follows. We construct the (potentially degenerate) arrangement $\mathcal{A}(S^*)$ of the set $S^*$ of the $n$ hyperplanes dual to the points of $S$. During this construction, we also compute the multiplicity of each flat in this arrangement, namely, the number of dual hyperplanes that contain it. Each intersection flat of the hyperplanes of $S^*$ is dual to an affine subspace $f$ spanned by the corresponding subset of the primal points of $S$ (that it contains), and $\mathrm{size}(f)$ is the number of dual hyperplanes containing the flat. In other words, as a standard byproduct of the construction of $\mathcal{A}(S^*)$, we can compute the sizes of all the affine subspaces that are spanned by points of $S$, in overall $O(n^d)$ time.
We define
$$\mathrm{size}_j \;=\; \max\{\mathrm{size}(f) \mid f \text{ is an affine subspace of dimension at most } j \text{ spanned by points of } S\},$$
and compute $\widehat{\mathrm{size}}_j = \mathrm{size}_j + Z_j$, where $Z_j$ is a random variable drawn (independently for each $j$) from a Laplace distribution with parameter $1/\varepsilon$ centered at the origin. (That is, the probability density function of $Z_j$ is $\frac{\varepsilon}{2}\,e^{-\varepsilon|z|}$.)
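This noisy-count step is the standard Laplace mechanism; a one-line sketch (names are ours):

```python
import numpy as np

def noisy_sizes(sizes, eps, rng):
    """Add independent Laplace(1/eps) noise to each size_j (the Laplace
    mechanism; each size_j has sensitivity 1, see Lemma 4.2)."""
    sizes = np.asarray(sizes, dtype=float)
    return sizes + rng.laplace(loc=0.0, scale=1.0 / eps, size=sizes.shape)

print(noisy_sizes([830, 875, 960], eps=0.5, rng=np.random.default_rng(5)))
```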
Our algorithm now uses a given confidence parameter $\beta$ and proceeds as follows. If for every $0 \le j \le d-1$ we have
$$\widehat{\mathrm{size}}_j \;<\; n - dk, \qquad (1)$$
we apply the base case. Otherwise, set $j$ to be the smallest index such that
$$\widehat{\mathrm{size}}_j \;\ge\; n - dk. \qquad (2)$$
Having fixed $j$, we find (privately) a subspace of dimension $j$ that contains a large number of points of $S$. To do so, let $F_j$ be the collection of all $j$-dimensional affine subspaces that are spanned by $j+1$ affinely independent points of the grid $G$ (not necessarily of $S$). We apply the exponential mechanism to pick an element of $F_j$, by assigning a score $s(f)$ to each subspace $f$ of $F_j$, which we set to be
$$s(f) \;=\; \mathrm{size}(f) - \max_{j' < j}\,\mathrm{size}_{j'}$$
if this difference is positive, and $s(f) = 0$ if it is not. Note that, by its definition, $s(f)$ is zero if $f$ is not spanned by points of $S$. Indeed, in this case the points of $S$ contained in $f$ span some subspace of dimension $j' < j$, and therefore $\mathrm{size}_{j'}$ must be at least as large as $\mathrm{size}(f)$. We will shortly argue that $s(\cdot)$ has sensitivity at most $2$ (Lemma 4.4), and thus conclude that this step preserves the differential privacy of the procedure.
We would like to apply the exponential mechanism as stated above in time proportional to the number of subspaces of non-zero score, because this number depends only on $n$ (albeit being exponential in $d$) and not on (the much larger) $|F_j|$. However, to normalize the scores to probabilities, we need to know the number of elements of $F_j$ with zero score, or alternatively to obtain the total number of subspaces spanned by $j+1$ points of the grid (that is, the size of $F_j$).
We do not have a simple expression for $|F_j|$ (although this is a quantity that can be computed, for each $j$, independently of $S$, once and for all), but clearly $|F_j| \le (X+1)^{d(j+1)}$. We thus augment $F_j$ (only for the purpose of analysis) with "dummy" subspaces of score $0$, and denote the augmented set by $\bar F_j$, whose cardinality is now exactly $(X+1)^{d(j+1)}$. We draw a subspace from $\bar F_j$ using the exponential mechanism. To do so we need to compute, for each score $s$, the number $N_s$ of elements of $\bar F_j$ that have score $s$, give each such element weight $e^{\varepsilon s/4}$, choose a score $s$ with probability proportional to $N_s\, e^{\varepsilon s/4}$, and then choose uniformly a subspace from those with score $s$. It is easy to check that this is indeed an implementation of the exponential mechanism as described in Section 2.1.
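The two-stage drawing procedure can be sketched as follows (hypothetical names; the weight $e^{\varepsilon s/4}$ reflects the sensitivity-$2$ score, and all unaccounted-for candidates, spanned or dummy, are lumped into the score-$0$ bucket):

```python
import numpy as np

def draw_score_bucket(score_counts, total_candidates, eps, rng):
    """Pick a score s with probability proportional to N_s * exp(eps*s/4).

    `score_counts` maps each positive score to the number of subspaces
    attaining it; every remaining candidate (spanned or dummy) has score 0.
    """
    scores = np.array([0.0] + list(score_counts.keys()), dtype=float)
    counts = np.array([total_candidates - sum(score_counts.values())]
                      + list(score_counts.values()), dtype=float)
    logw = np.log(counts) + eps * scores / 4.0   # log(N_s * e^{eps*s/4})
    w = np.exp(logw - logw.max())                # numerically stable
    return scores[rng.choice(len(scores), p=w / w.sum())]

# e.g. one subspace of score 40, three of score 5, ~X^{d(j+1)} candidates total
print(draw_score_bucket({40: 1, 5: 3}, total_candidates=10**12,
                        eps=1.0, rng=np.random.default_rng(6)))
```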
If the drawing procedure decides to pick a subspace that is not spanned by points of $S$, or more precisely decides to pick a subspace of score $0$, we stop the whole procedure, with a failure. If the selected score is positive, the subspace to be picked is spanned by points of $S$, and those subspaces are available (from the dual arrangement construction outlined above). We thus obtain a selected subspace $f$ (by randomly choosing an element from the available list of these subspaces), and apply the algorithm recursively within $f$, restricting the input to $S \cap f$. (Strictly speaking, as noted above, we apply the algorithm to a projection of $S \cap f$ onto a suitable $j$-dimensional coordinate subspace.)
It follows that we can implement the exponential mechanism on all subspaces in $\bar F_j$ in time proportional to the number of subspaces spanned by points of $S$, which is $O(n^d)$, and therefore the running time of this subspace-selection procedure (in each recursive call) is $\widetilde{O}(n^d)$.
4.1.1 Privacy analysis
Lemma 4.2.
Let $S$ and $S'$ be two neighboring data sets. Then, for every $j$, we have $|\mathrm{size}_j(S) - \mathrm{size}_j(S')| \le 1$.
Proof.
Write $S' = (S\setminus\{x\})\cup\{y\}$. Let $f$ be a subspace of dimension at most $j$ that is spanned by points of $S$ and contains $\mathrm{size}_j(S)$ such points. If $f$ does not contain $x$ then $f$ is also a candidate for $\mathrm{size}_j(S')$, so in this case $\mathrm{size}_j(S') \ge \mathrm{size}_j(S)$. If $f$ does contain $x$ then $(f\cap S)\setminus\{x\}$ spans a subspace of $f$ which is of dimension at most $j$, so it is a candidate for $\mathrm{size}_j(S')$. Since it does not contain $x$ (and may contain $y$) we have in this case that $\mathrm{size}_j(S') \ge \mathrm{size}_j(S) - 1$. A symmetric argument shows that $\mathrm{size}_j(S) \ge \mathrm{size}_j(S') - 1$. Combining these two inequalities, the lemma follows. ∎
Lemma 4.3.
The value of each $\widehat{\mathrm{size}}_j$, for $0 \le j \le d-1$, is $\varepsilon$-differentially private.
Proof.
By Lemma 4.2, each $\mathrm{size}_j$ has sensitivity $1$, so releasing $\widehat{\mathrm{size}}_j = \mathrm{size}_j + Z_j$, with $Z_j \sim \mathrm{Lap}(1/\varepsilon)$, is an instance of the standard Laplace mechanism of [8]. ∎
Since we choose the dimension $j$ by comparing each of the $\widehat{\mathrm{size}}_j$'s to a value ($n - dk$) which is the same for neighboring data sets $S$ and $S'$ (which have the same cardinality $n$), Lemma 4.3 implies that the choice of the dimension $j$ is differentially private.
The next step is to choose the actual subspace in which to recurse. The following lemma implies that this step too is differentially private.
Lemma 4.4.
Let $S$ and $S'$ be as in Lemma 4.2. Then, for each $j$ and for every subspace $f \in F_j$, we have $|s(f; S) - s(f; S')| \le 2$.
Proof.
Fix $j$ and $f$. Clearly, $\mathrm{size}(f)$ has sensitivity $1$, and, by Lemma 4.2, $\max_{j'<j}\mathrm{size}_{j'}$ is also of sensitivity $1$, and the claim follows. ∎
Lemma 4.5.
The subspace-selection procedure described in this section (with all its recursive calls) is $2d^2\varepsilon$-differentially private.
Proof.
Each recursive call computes the $d$ noisy values $\widehat{\mathrm{size}}_0,\ldots,\widehat{\mathrm{size}}_{d-1}$, each of which is $\varepsilon$-differentially private by Lemma 4.3, and applies one instance of the exponential mechanism with a score of sensitivity $2$, which is $\varepsilon$-differentially private by Lemma 4.4 and Theorem 2.1. Since there are at most $d$ recursive calls, the claim follows by (basic) composition [9]. ∎
4.1.2 Utility analysis
The following lemma is the key for our utility analysis.
Lemma 4.7.
Let $\beta > 0$ be a given parameter, and recall that $Z_j \sim \mathrm{Lap}(1/\varepsilon)$. For every $0 \le j \le d-1$ the following properties hold.
(i) If $\mathrm{size}_j \ge n - dk + \frac{1}{\varepsilon}\ln\frac{d}{\beta}$ then, with probability at least $1 - \frac{\beta}{d}$, we have $\widehat{\mathrm{size}}_j \ge n - dk$.
(ii) On the other hand, if $\mathrm{size}_j \le n - dk - \frac{1}{\varepsilon}\ln\frac{d}{\beta}$ then, with probability at least $1 - \frac{\beta}{d}$, we have $\widehat{\mathrm{size}}_j < n - dk$.
Proof.
(i) follows since the probability of the Laplace noise $Z_j$ to be smaller than $-\frac{1}{\varepsilon}\ln\frac{d}{\beta}$ is at most $\frac{\beta}{d}$, and (ii) follows since the probability of $Z_j$ to be larger than $\frac{1}{\varepsilon}\ln\frac{d}{\beta}$ is also at most $\frac{\beta}{d}$. ∎
We say that (the present recursive step of) our algorithm fails if one of the complements of the events specified in Lemma 4.7 happens; that is, the step fails if for some $j$, either (i) $\mathrm{size}_j \ge n - dk + \frac{1}{\varepsilon}\ln\frac{d}{\beta}$ and $\widehat{\mathrm{size}}_j < n - dk$, or (ii) $\mathrm{size}_j \le n - dk - \frac{1}{\varepsilon}\ln\frac{d}{\beta}$ and $\widehat{\mathrm{size}}_j \ge n - dk$. Otherwise we say that (this step of) our algorithm succeeds. By Lemma 4.7 and a union bound over the $d$ indices $j$, each recursive step fails with probability at most $\beta$. (Note that there is a "grey zone" of values of $\mathrm{size}_j$ between these two bounds, in which the step always succeeds.)
It follows from Lemma 4.1 that if the algorithm succeeds and applies the base case then $D_{\ge k}$ is full dimensional (here we use the fact that the noise margin $\frac{1}{\varepsilon}\ln\frac{d}{\beta}$ is much smaller than $k$, by our assumption on $n$). Furthermore, if the algorithm succeeds and decides to recurse on a subspace of dimension $j$ (according to the rule in (1) and (2)) then, for every $j' < j$, $\mathrm{size}_{j'} \le n - dk + \frac{1}{\varepsilon}\ln\frac{d}{\beta}$, and $\mathrm{size}_j \ge n - dk - \frac{1}{\varepsilon}\ln\frac{d}{\beta}$. The following lemma is an easy consequence of this reasoning.
Lemma 4.8.
If the algorithm succeeds, with dimension $j$, and applies the exponential mechanism to pick a $j$-dimensional subspace, then there exists a $j$-dimensional subspace $f^* \in F_j$ with $\mathrm{size}(f^*) \ge n - dk - \frac{1}{\varepsilon}\ln\frac{d}{\beta}$. Furthermore, putting $t = \frac{4}{\varepsilon}\ln\frac{|\bar F_j|}{\beta}$, if $s(f^*) > t$ then, with probability at least $1-\beta$, the exponential mechanism picks a subspace $f$ with $\mathrm{size}(f) \ge \mathrm{size}(f^*) - t$.
Proof.
The first part of the lemma follows from the definition of success, as just argued. For the second part notice that the probability of drawing a subspace of score smaller than $s(f^*) - t$ is at most
$$|\bar F_j|\cdot e^{-\varepsilon t/4} \;=\; |\bar F_j|\cdot\frac{\beta}{|\bar F_j|} \;=\; \beta.$$
Moreover, $s(f) = \mathrm{size}(f) - \max_{j'<j}\mathrm{size}_{j'}$ for every subspace $f$ of positive score, so any $f$ with $s(f) \ge s(f^*) - t$ satisfies $\mathrm{size}(f) \ge \mathrm{size}(f^*) - t$. ∎
It follows that if our algorithm succeeds and recurses in a subspace $f$ of dimension $j$ then, with probability at least $1-\beta$,
$$\mathrm{size}(f) \;\ge\; n - dk - \frac{1}{\varepsilon}\ln\frac{d}{\beta} - \frac{4}{\varepsilon}\ln\frac{|\bar F_j|}{\beta} \;\ge\; n - dk - O\Big(\frac{d^2}{\varepsilon}\log\frac{X}{\beta}\Big).$$
That is, when we recurse in $f$ of dimension $j$ we lose at most $dk + O\big(\frac{d^2}{\varepsilon}\log\frac{X}{\beta}\big)$ points. Denote by $d > d_1 > d_2 > \cdots > d_r$ the sequence of dimensions into which the procedure recurses (reaching the base case at dimension $d_r$). Hence, keeping $k$ fixed throughout the recursion, at the $i$-th recursive step we lose at most $dk + O\big(\frac{d^2}{\varepsilon}\log\frac{X}{\beta}\big)$ points. Summing up these losses over the at most $d$ recursive steps, the total loss is at most
$$d^2k + O\Big(\frac{d^3}{\varepsilon}\log\frac{X}{\beta}\Big).$$
Substituting $k = \frac{n}{2(d+1)^2}$, we get that the total number of points that we lose is at most $\frac{3n}{4}$ if $n \ge \frac{c\,d^3}{\varepsilon}\log\frac{X}{\beta}$, with a sufficiently large constant of proportionality $c$.
Notice that we keep $k$ fixed throughout the recursion while the number of surviving points may decrease. Consequently, if $n_j$ is the number of points in some recursive call in some dimension $d_j$, then $n_j \ge \frac{n}{4}$ and therefore $k = \frac{n}{2(d+1)^2} \le \frac{2n_j}{(d+1)^2}$, which is still smaller than the centerpoint guarantee of $\frac{n_j}{d_j+1}$.
As described, our subspace-selection procedure (with all its recursive calls) is $2d^2\varepsilon$-differentially private. Dividing $\varepsilon$ by $2d^2$ (that is, running each noisy comparison and each subspace selection with privacy parameter $\frac{\varepsilon}{2d^2}$) we get that our subspace-selection procedure is $\varepsilon$-differentially private, and that the total number of points we lose is still at most $\frac{3n}{4}$ if $n = \Omega\big(\frac{d^5}{\varepsilon}\log\frac{X}{\beta}\big)$.
Recall Section 3, where we showed that we need $\varepsilon k = \Omega\big(d^3\log\frac{X}{\beta}\big)$, i.e., $n = \Omega\big(\frac{d^5}{\varepsilon}\log\frac{X}{\beta}\big)$, for the ($\varepsilon$-differentially private) base case to work correctly. (Recall also that the base case is actually applied in a suitable projection of the terminal subspace onto some coordinate-frame subspace of the same dimension, and that the above lower bound on $n$ suffices for any such lower-dimensional instance too.)
The following theorem summarizes the result of this section.
Theorem 4.9.
Given $n = \Omega\big(\frac{d^5}{\varepsilon}\log\frac{X}{\beta}\big)$ points in $G \cap [0,1]^d$, our algorithm (including all recursive calls and the base case) is $2\varepsilon$-differentially private, runs in $n^{O(d^2)}$ time, and finds a point of Tukey depth at least $k = \frac{n}{2(d+1)^2}$ with probability at least $1 - O(d\beta)$.
Proof.
The privacy statement follows by composition, using Lemma 4.5 (with the rescaled privacy parameter) and the privacy properties of the exponential mechanism in the base case. The confidence bound follows since the probability of failure of the recursive call in a subspace of dimension $d_j$ is at most $O(\beta)$, and there are at most $d$ such calls. The running time of the algorithm is dominated by the running time of the exponential mechanism that we perform at the bottom (base case) of the recursion. ∎
5 An algorithm via volume estimation
The upper bound on the running time in Theorem 4.9 is dominated by the running time of the base case, in which we compute all the regions $D_{\ge k}$ explicitly, which takes $n^{O(d^2)}$ time. To reduce this cost, we use instead a mechanism that (a) estimates the volume of $D_{\ge k}$ to a sufficiently small relative error, and (b) samples a random point "almost" uniformly from $D_{\ge k}$. We show how to accomplish (a) and (b) using the volume estimation mechanisms of Lovász and Vempala [13] and later of Cousins and Vempala [7]. We also show how to use these procedures to implement approximately the exponential mechanism described in Section 3. These algorithms are Monte Carlo, so they fail with some probability, and when they fail we may lose our $\varepsilon$-differential privacy guarantee. As a result, the modified algorithm will not be purely differentially private, as the one in Section 3, and we will only be able to guarantee that it is $(\varepsilon,\delta)$-differentially private, for any prescribed $\delta > 0$. We obtain the following theorem, which we prove in the rest of this section.
Theorem 5.1.
Given $n = \Omega\big(\frac{d^5}{\varepsilon}\log\frac{X}{\beta\delta}\big)$ points in $G \cap [0,1]^d$, our algorithm (including all recursive calls and the base case) is $(2\varepsilon,\delta)$-differentially private, runs in $\widetilde{O}(n^d)$ time, and finds a point of Tukey depth at least $k = \frac{n}{2(d+1)^2}$ with probability at least $1 - O(d\beta)$.
5.1 Volume estimation
We use the following result of Cousins and Vempala for estimating the volumes of the convex polytopes $D_{\ge k}$.
Theorem 5.2 (Cousins and Vempala [7, Theorem 1.1]).
Let $K$ be a convex body in $\mathbb{R}^d$ that contains the unit ball around the origin and satisfies $\mathbb{E}\big[\|x\|^2\big] = O(d)$, where the expectation is with respect to the uniform measure within $K$. Then, for any $\xi, \eta > 0$, there is an algorithm that can access $K$ only via membership queries (each determining whether a query point lies in $K$), and, with probability at least $1-\eta$, approximates the volume of $K$ with relative error $\xi$ (i.e., the output is in the range $(1\pm\xi)$ times the true volume), and performs $\widetilde{O}\big(d^3/\xi^2\big)$ membership queries.
We do not know when the algorithm fails; failure simply means that the value that it returns is not within the asserted range.
Membership tests.
Given a query point $x$ and a parameter $k$, to test whether $x \in D_{\ge k}$ amounts to verifying that $x$ lies in each of the halfspaces whose intersection forms $D_{\ge k}$. Lemma 3.1 shows that there are $O(n^{d-1})$ such halfspaces, and we can prepare in advance a list of these halfspaces, as a by-product of the construction of the arrangement of the hyperplanes dual to the points in $S$. Thus we can implement a membership query in $O(n^{d-1})$ time. (Recall that we hide in the $\widetilde{O}$-notation factors that are polynomial in $d$.)
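The membership test is then just a scan over the stored halfspaces; a minimal sketch, assuming the halfspaces are given as a matrix-vector pair:

```python
import numpy as np

def make_membership_oracle(A, b):
    """Oracle for D = {x : A @ x <= b}, where the rows of (A, b) are the
    halfspaces supporting the facets of the region (O(n^{d-1}) of them,
    by Lemma 3.1); one query costs one matrix-vector product."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    return lambda x: bool(np.all(A @ np.asarray(x, float) <= b))

inside = make_membership_oracle([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]],
                                [1., 0., 1., 0.])
print(inside([0.5, 0.5]), inside([2.0, 0.0]))  # True False
```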
In order to bring $D_{\ge k}$ into the "rounded" form assumed in Theorem 5.2, we use an algorithm of Lovász and Vempala [13] to bring a convex body into an approximately isotropic position.
Definition 5.3.
A convex body $K$ in $\mathbb{R}^d$ is in isotropic position if, for every unit vector $v$, $\mathbb{E}_{x\sim K}\big[\langle x, v\rangle^2\big] = 1$. $K$ is in $C$-nearly isotropic position if, for every unit vector $v$, $\frac{1}{C} \le \mathbb{E}_{x\sim K}\big[\langle x, v\rangle^2\big] \le C$.
It is easy to verify that if $K$ is in $2$-nearly isotropic position then $\mathbb{E}\big[\|x\|^2\big] = \sum_{i=1}^d \mathbb{E}\big[\langle x, e_i\rangle^2\big] \le 2d$, so $K$ satisfies the requirement of Theorem 5.2.
Theorem 5.4 (Lovász and Vempala [13, Theorem 1.1]).
There is an algorithm that, given a convex body $K$ that contains the unit ball around the origin and is contained in a ball of radius $R$, and a confidence parameter $\eta$, computes an affine image of $K$ that is in $2$-nearly isotropic position. The algorithm succeeds with probability at least $1-\eta$, and performs $\widetilde{O}(d^4)$ membership queries.
We can use some of our observations in Section 3 to show:
Lemma 5.5.
Each of the polytopes $D_{\ge k}$ of nonzero volume contains a ball of radius at least $X^{-cd^3}$, for a suitable absolute constant $c$ (assuming, say, $X \ge d$).
Proof.
As we showed in Section 3, each vertex of $D_{\ge k}$ has rational coordinates with a common integral denominator of value at most $D = d^{(d^2+d)/2}X^{d^2}$. We take $d+1$ affinely independent vertices of $D_{\ge k}$, and denote by $z$ their center of mass, which is guaranteed to be in the interior of $D_{\ge k}$. This center of mass has rational coordinates with a common integer denominator of value at most $(d+1)D^{d+1}$.
We also showed in Section 3 that each hyperplane $h$ defining $D_{\ge k}$ has integer coefficients of absolute value at most $d^{d/2}X^d$.
The largest radius of a ball enclosed in $D_{\ge k}$ and centered at $z$ is the minimum distance of $z$ from the defining (supporting) hyperplanes of $D_{\ge k}$. The distance from any such hyperplane $h$ is positive (since $z$ lies in the interior of $D_{\ge k}$), and is a rational whose denominator is at most the common denominator of the coordinates of $z$ times the Euclidean norm of the vector of coefficients of $h$, excluding the free term, which is at most $\sqrt{d}\,d^{d/2}X^d$. That is, this distance is at least
$$\frac{1}{(d+1)\,D^{d+1}\cdot\sqrt{d}\,d^{d/2}X^d} \;=\; X^{-O(d^3)}.$$
This implies the lemma. ∎
$D_{\ge k}$ is contained in the unit cube, and therefore it is contained in a ball of radius $\sqrt{d}$. It follows that we can scale $D_{\ge k}$ up (about the center of its enclosed ball) so that it satisfies the conditions of Theorem 5.4, with $R = \sqrt{d}\,X^{cd^3}$. Hence we have $\log R = O(d^3\log X)$.
5.2 Implementing the exponential mechanism
We now show how to use the estimates of the volumes of the polytopes $D_{\ge k}$ to sample "almost" uniformly from a region $D_k$, which is picked with probability
$$\Pr[D_k] \;=\; \frac{e^{\varepsilon k/2}\,\mathrm{volume}(D_k)}{\sum_{j=0}^{\mathrm{td}_{\max}} e^{\varepsilon j/2}\,\mathrm{volume}(D_j)},$$
as defined in Section 3.
We carry out this modified sampling by defining new probabilities $p'_k$, for $0 \le k \le \mathrm{td}_{\max}$ (so that $\sum_k p'_k = 1$), choosing the region $D_{\ge k}$ (rather than $D_k$) with probability $p'_k$, and then sampling almost uniformly a point in the chosen region $D_{\ge k}$, using the sampling algorithm in [7], whose properties are stated in the following theorem. In the theorem, the variation distance between two measures $\mu_1, \mu_2$ is $\sup_A |\mu_1(A) - \mu_2(A)|$, over all possible events $A$.
Theorem 5.6 (Cousins and Vempala [7, Theorem 1.2]).
Let $K$ be a convex body in $\mathbb{R}^d$ that contains the unit ball around the origin and satisfies $\mathbb{E}\big[\|x\|^2\big] = O(d)$. Then, for any $\nu, \eta > 0$, there is an algorithm that can access $K$ only via membership queries, and, with probability at least $1-\eta$, generates random points from a distribution that is within total variation distance at most $\nu$ from the uniform distribution on $K$. The algorithm performs $\widetilde{O}(d^3)$ membership queries for each random point that it generates. The running time is larger by a factor of $O(d)$ than the number of membership queries.
Exact sampling.
Let us denote $v_k = \mathrm{volume}(D_k)$ and $V_k = \mathrm{volume}(D_{\ge k})$, so $v_k = V_k - V_{k+1}$. We first derive probabilities $p'_k$ such that if we pick $D_{\ge k}$ with probability $p'_k$, and then pick a point uniformly from $D_{\ge k}$, then we simulate the exponential mechanism (over the regions $D_k$) exactly.
The probability that the output point lies in $D_j$, conditioned on having chosen $D_{\ge k}$ to sample from, is $v_j/V_k$ for $j \ge k$; it is $0$ for $j < k$. Therefore, the unconditional probability that the output lies in $D_j$ is $v_j\sum_{k\le j} p'_k/V_k$, and we want this probability to be proportional to $e^{\varepsilon j/2}\,v_j$. That is, we want all the equalities $\sum_{k\le j}\frac{p'_k}{V_k} = \mu\,e^{\varepsilon j/2}$, for $j = 0,\ldots,\mathrm{td}_{\max}$, to hold, for a suitable normalizing parameter $\mu$. Subtracting each equation from its predecessor, we get the triangular system $\frac{p'_j}{V_j} = \mu\big(e^{\varepsilon j/2} - e^{\varepsilon(j-1)/2}\big)$, for $j \ge 1$, which solves to
$$p'_j \;=\; \mu\,V_j\big(e^{\varepsilon j/2} - e^{\varepsilon(j-1)/2}\big),$$
for $j \ge 1$. That is, to ensure that $\sum_j p'_j = 1$, we then put
$$p'_0 \;=\; \mu\,V_0,$$
and
$$\mu \;=\; \Big(V_0 + \sum_{j\ge 1} V_j\big(e^{\varepsilon j/2} - e^{\varepsilon(j-1)/2}\big)\Big)^{-1}.$$
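The triangular system above yields the $p'_j$ directly from the volumes $V_j$; a short sketch (our names), useful also as the template into which the estimates $\widetilde V_j$ of the next paragraph are plugged:

```python
import numpy as np

def region_probabilities(V, eps):
    """Given V[j] = volume(D_{>=j}) for j = 0..t, return p'[j] such that
    picking D_{>=j} with probability p'[j] and then sampling uniformly
    inside it simulates the exponential mechanism over the layers D_j."""
    V = np.asarray(V, dtype=float)
    j = np.arange(len(V), dtype=float)
    w = np.empty_like(V)
    w[0] = V[0]                                   # p'_0 = mu * V_0
    w[1:] = V[1:] * (np.exp(eps * j[1:] / 2.0)    # p'_j = mu * V_j *
                     - np.exp(eps * (j[1:] - 1.0) / 2.0))  # (e^{eps j/2}-e^{eps(j-1)/2})
    return w / w.sum()                            # division by the sum realizes mu

V = [1.0, 0.9, 0.5, 0.2, 0.05]  # toy sequence of nested-region volumes
print(region_probabilities(V, eps=0.5))
```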
Approximate sampling.
We approximate these probabilities by replacing the exact volumes $V_j$ by the approximate values $\widetilde V_j$ that we obtain from Theorems 5.4 and 5.2, and get the corresponding approximations $\tilde p'_j$ to $p'_j$ and $\tilde\mu$ to $\mu$. By construction, and by the probability union bound, we have, for the prescribed error parameter $\xi$ and with probability at least $1 - \mathrm{td}_{\max}\cdot\eta$, that $\widetilde V_j \in (1\pm\xi)V_j$ for every $j$, which is easily seen to imply that
$$\frac{1-\xi}{1+\xi}\,p'_j \;\le\; \tilde p'_j \;\le\; \frac{1+\xi}{1-\xi}\,p'_j \quad\text{for every } j. \qquad (3)$$
We sample a region $D_{\ge k}$ using the probabilities $\tilde p'_k$, and then sample a point from $D_{\ge k}$ using Theorem 5.6. Denote by $\mathcal{E}(A)$, for any measurable subset $A$ of $[0,1]^d$, the probability that the pure exponential mechanism, which uses the (costly to compute) exact values $p'_k$, as just reviewed, returns a point in $A$. It follows that the probability $\mathcal{E}'(A)$ of returning a point in $A$, conditioned on all the estimates $\widetilde V_j$ satisfying the inequalities (3), and conditioned on successful sampling from $D_{\ge k}$ using Theorem 5.6, satisfies the following inequality.
Lemma 5.7.
$$\mathcal{E}'(A) \;\le\; \frac{1+\xi}{1-\xi}\,\mathcal{E}(A) + \nu. \qquad (4)$$
Proof.
Indeed, by the definition of our algorithm we have that
$$\mathcal{E}'(A) \;=\; \sum_k \tilde p'_k\,\Pr[x \in A \mid C_k], \qquad (5)$$
where $C_k$ is a mnemonic for the event that the mechanism decides to sample from the region $D_{\ge k}$. By Theorem 5.6 we have that
$$\Pr[x \in A \mid C_k] \;\le\; u_k(A) + \nu, \qquad (6)$$
where $u_k(A)$ is the probability to get a point in $A$ if we sample uniformly from $D_{\ge k}$. Combining Equations (5) and (6) we get that
$$\mathcal{E}'(A) \;\le\; \sum_k \tilde p'_k\, u_k(A) + \nu. \qquad (7)$$
Now $\tilde p'_k$ in our approximation scheme is at most $\frac{1+\xi}{1-\xi}\,p'_k$, so we get, using (3), that
$$\mathcal{E}'(A) \;\le\; \frac{1+\xi}{1-\xi}\sum_k p'_k\, u_k(A) + \nu \;=\; \frac{1+\xi}{1-\xi}\,\mathcal{E}(A) + \nu,$$
which implies the lemma by our argument regarding the exact simulation of the exponential mechanism given above. ∎
If we set $\xi$ to be a suitable constant multiple of $\varepsilon$, $\eta$ to be a suitable constant multiple of $\frac{\delta}{\mathrm{td}_{\max}}$ (we divide by $\mathrm{td}_{\max}$ to account for the failure probability of the volume estimates over all the $k$'s), and $\nu$ to be a suitable constant multiple of $\delta$, then it follows from Equation (4) that our approximation of the exponential mechanism is $(O(\varepsilon), O(\delta))$-differentially private.
6 Conclusions
We gave an $\widetilde{O}(n^d)$-time algorithm for privately computing a point in the convex hull of $n$ points whose coordinates are multiples of $1/X$ in $[0,1]^d$. Even though this is a huge improvement over what was previously known, and requires nontrivial technical effort and sophisticated sampling and volume estimation tools, this running time is still not satisfactory for large values of $d$. The main hurdle in improving it further is the lack of efficient algorithms for computing Tukey depths and Tukey levels.
The main question that we leave open is whether there exists a differentially private algorithm for this task that runs in time polynomial in $d$ and $\log X$ (when the input size $n$ is still polynomial in $d$ and $\log X$).
Acknowledgments
We thank Santosh Vempala for many helpful discussions.
References
- [1] Amos Beimel, Shiva Prasad Kasiviswanathan, and Kobbi Nissim. Bounds on the sample complexity for private learning and private data release. In TCC, volume 5978 of LNCS, pages 437–454. Springer, 2010.
- [2] Amos Beimel, Shay Moran, Kobbi Nissim, and Uri Stemmer. Private center points and learning of halfspaces. In Conference on Learning Theory (COLT), pages 269–282, 2019.
- [3] Amos Beimel, Kobbi Nissim, and Uri Stemmer. Private learning and sanitization: Pure vs. approximate differential privacy. In APPROX-RANDOM, volume 8096 of LNCS, pages 363–378. Springer, 2013.
- [4] Mark Bun, Cynthia Dwork, Guy N. Rothblum, and Thomas Steinke. Composable and versatile privacy via truncated cdp. In 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 74–86, 2018.
- [5] Mark Bun, Kobbi Nissim, Uri Stemmer, and Salil P. Vadhan. Differentially private release and learning of threshold functions. In IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 634–649, 2015.
- [6] Bernard Chazelle. An optimal convex hull algorithm in any fixed dimension. Discrete Comput. Geom., 10:377–409, 1993.
- [7] Ben Cousins and Santosh S. Vempala. Gaussian cooling and $O^*(n^3)$ algorithms for volume and Gaussian volume. SIAM J. Comput., 47(3):1237–1273, 2018.
- [8] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In TCC, volume 3876 of LNCS, pages 265–284. Springer, 2006.
- [9] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4), 2014.
- [10] Herbert Edelsbrunner, Raimund Seidel, and Micha Sharir. On the zone theorem for hyperplane arrangements. SIAM J. Comput., 22(2):418–429, 1993.
- [11] Haim Kaplan, Katrina Ligett, Yishay Mansour, Moni Naor, and Uri Stemmer. Privately learning thresholds: Closing the exponential gap. CoRR, abs/1911.10137, 2019. URL: http://arxiv.org/abs/1911.10137.
- [12] Xiaohui Liu, Karl Mosler, and Pavlo Mozharovskyi. Fast computation of Tukey trimmed regions and median in dimension $p > 2$. J. of Comput. and Graph. Stat., 28(3):682–697, 2019.
- [13] László Lovász and Santosh S. Vempala. Simulated annealing in convex bodies and an $O^*(n^4)$ volume algorithm. J. Comput. Syst. Sci., 72(2):392–417, 2006.
- [14] Jiří Matoušek. Lectures on Discrete Geometry. Springer-Verlag, Berlin, Heidelberg, 2002.
- [15] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 94–103, 2007.
- [16] Peter J. Rousseeuw and Ida Ruts. Constructing the bivariate Tukey median. Statistica Sinica, 8(3):827–839, 1998.
- [17] John W. Tukey. Mathematics and the picturing of data. In Proc. of the International Congress of Mathematicians, volume 2, pages 523–531, 1975.
- [18] Salil Vadhan. The complexity of differential privacy. In Yehuda Lindell, editor, Tutorials on the Foundations of Cryptography: Dedicated to Oded Goldreich, pages 347–450. Springer, 2017.