Explaining epsilon in local differential privacy through the lens of quantitative information flow
Abstract
The study of leakage measures for privacy has been a subject of intensive research and is an important aspect of understanding how privacy leaks occur in computer systems. Differential privacy has been a focal point in the privacy community for some years, and yet its leakage characteristics are not completely understood. In this paper we bring together two areas of research, information theory and the g-leakage framework of quantitative information flow (QIF), to give an operational interpretation for the epsilon parameter of local differential privacy. We find that epsilon emerges as a capacity measure in both frameworks: via (log)-lift, a popular measure in information theory, and via max-case g-leakage, which we introduce to describe the leakage of any system to Bayesian adversaries modelled using "worst-case" assumptions under the QIF framework. Our characterisation resolves an important question about the interpretability of epsilon and consolidates a number of disparate results spanning the literature of both information theory and quantitative information flow.
Index Terms:
Differential privacy, log-lift, information leakage, g-leakage, quantitative information flow
I Introduction
Over the past two decades, characterising and limiting privacy leakage in data sharing systems has been the subject of intensive research. Information-theoretic privacy [1, 2], differential privacy [3] and quantitative information flow (QIF) [4] are three main frameworks that have been developed and studied for this purpose.
Information-theoretic privacy is concerned with measuring privacy leakage using well-known information-theoretic quantities, such as Shannon mutual information [5]:

$$I(X;Y) \;=\; \sum_{x,y} P_{XY}(x,y)\,\log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)},$$

where $X$ denotes a secret and $Y$ denotes the observable, with $P_{XY}$ and $P_X, P_Y$ denoting the joint and marginal distributions of $X$ and $Y$, respectively. The channel is characterised by the conditional probability distribution $P_{Y|X}$ and is assumed fixed by the data sharing system and publicly known. For any prior $P_X$, a joint distribution $P_{XY}$ and a marginal distribution $P_Y$ are induced by $P_{Y|X}$.
Papers such as [4, 6] discuss some of the limitations of Shannon mutual information in properly quantifying or differentiating various adversarial threats to privacy. Sibson and Arimoto mutual information of order $\alpha$ have recently been proposed to measure a spectrum of generalised threats to privacy [7]. It also turns out that the "nucleus" of Shannon mutual information, namely the information density variable
$$i(x;y) \;=\; \log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)},$$
can represent a much stronger notion of privacy leakage than its average. We call its exponential, $P_{XY}(x,y)/(P_X(x)P_Y(y))$, the lift variable. (Accordingly, we may refer to information density as log-lift.) Information density and lift have been studied, albeit under various names, in works such as [1, 8, 9, 10, 11].
Differential privacy (DP) is a well-known privacy framework [3] describing a worst-case attack scenario, in which an attacker with knowledge of all individuals but one in a dataset learns little information about the unknown individual upon an information release. Specifically, for a given $\epsilon \geq 0$, any mechanism $\mathcal{M}$, any two neighbouring datasets $D, D'$ differing in one individual and any observation $y$, DP requires

$$P[\mathcal{M}(D) = y] \;\leq\; e^{\epsilon}\, P[\mathcal{M}(D') = y]. \qquad (1)$$
This definition is sometimes called the central model for differential privacy, referring to its dependence on a trusted centralised data curator. Alternatively, in the local model for differential privacy [12, 13], each individual first applies noise to their data before sending it to an untrusted data curator, who releases some aggregate information about the (noisy) data. Local DP is more stringent than DP as it requires the inequality above to be satisfied for all pairs of input values, not just neighbouring datasets. In other words, local $\epsilon$-DP implies central $\epsilon$-DP, but not vice versa.
Finally, quantitative information flow (QIF) is an information-theoretic framework for measuring leakage from (probabilistic) systems modelled as information-theoretic channels. (The early work introducing quantitative information flow was based on Shannon entropy to measure flows of information [14]; the QIF used here refers to later work initiated by Smith [4].) In QIF, leakages are derived from a Bayesian adversary model incorporating a gain function describing the adversary's goal, and the adversary's prior knowledge. The resulting leakage, computed as the expected gain in the adversary's knowledge, thus provides a meaningful, interpretable measure of the security risk to a system from that adversary. For example, the Bayes vulnerability measure, widely used in QIF, corresponds to the maximum probability that an adversary can guess the secret in one try; the Bayes leakage measure is the difference between the adversary's expected probability of guessing the secret after making an observation from the system versus their probability before making an observation (i.e., using only their prior knowledge).
I-A Explaining epsilon: a motivating example
An important development in both the information theory and security communities has been an interest in operationally relevant leakage measures to provide information about the relative security of different probabilistic data sharing systems. In this context, operational measures are ones which provide interpretable guarantees in terms of an adversary's gain or loss upon interacting with the system. The value produced by such a leakage measure therefore has a correspondence with some measurable information gain to the adversary, justifying its use as a security measure. For both of the above communities, the $\epsilon$ of differential privacy still lacks a robust operational meaning, despite the privacy community's explanation in terms of the differential privacy attacker who knows all-but-one secret in the dataset. The following example illustrates this more clearly.
Consider a local differential privacy scenario in which a user wishes to obfuscate their own sensitive data before releasing it to an untrusted data curator. The user is given a choice of two mechanisms to use. The scenario is depicted by the two mechanisms below, in which, for example, the user is answering a question in a survey, and the values {y, m, n} correspond to the answers 'yes', 'maybe' and 'no', respectively. Note that we describe our mechanisms as 'channels', in which the row labels correspond to secrets and the column labels correspond to observations. The value in row $x$, column $y$ represents the conditional probability $C(y|x)$; that is, the probability of observing output $y$ given input $x$. For example, the value in the bottom left of a channel matrix gives the probability of observing the output 'y' given the input 'n'.
From the perspective of differential privacy, the choice of the user's noisy mechanism should be determined by the $\epsilon$ parameter described by the DP equation (1); in the DP literature, the smaller the $\epsilon$, the more privacy is said to be guaranteed to the user.
In the mechanisms described above, notice that each mechanism satisfies $\epsilon$-DP for a different value of $\epsilon$; in fact, $e^{\epsilon}$ corresponds to the maximum ratio between any two values within a column. (The two mechanisms are in fact instances of the well-known Geometric and Randomised-Response mechanisms from DP, respectively.) We would therefore expect that the mechanism with the smaller $\epsilon$ is more private, and so leaks less information to an adversary than the other.
We now ask: for each of these mechanisms, what would be the expected gain to an attacker who is trying to guess the secret in one try? We can model this attack by assuming a uniform prior for the adversary (say), and applying Bayes rule to compute the adversary's posterior knowledge after making an observation from each mechanism. (This is the model used in the QIF framework and is described in more detail in §II-C.) The posterior probability of success for the attacker is given by the maximum probability in each posterior, averaged by the marginal on that posterior (in QIF this is called the posterior Bayes vulnerability). The resulting computation yields that the leakage, computed using the adversary's expected posterior gain, is larger for the mechanism with the smaller $\epsilon$ (see Appendix A for complete details). So, in fact, our attacker learns more from the mechanism whose $\epsilon$ suggests it is safer.
But since $\epsilon$ represents a "max-case" measure and the adversarial scenario above uses the expected gain of the attacker, i.e., an "average-case" measure, we might think that the above anomaly occurs because the measures we are using are incompatible. Let us consider now a max-case scenario: an attacker whose advantage is measured by the maximum probability of guessing the secret in one try regardless of which output was observed. In this scenario, the leakage to the attacker is given by the maximum leakage of any of the posteriors of the channel, independent of the marginal probability on that posterior. In this instance, we find that, again under a uniform prior, the max-case leakage computed using the posterior max-case vulnerability is again larger for the mechanism with the smaller $\epsilon$ (see Appendix A for full details). And so that mechanism also leaks more information to this max-case adversary, even though its $\epsilon$ suggests it should be the safer of the two.
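To make the comparison concrete, the following minimal sketch (in Python with NumPy) computes, for two small channels, the $\epsilon$ of Defn 1 and the average-case and max-case Bayes leakages under a uniform prior. The matrices M1 and M2 and the helper names (ldp_epsilon, bayes_leakage_avg, bayes_leakage_max) are our own hypothetical stand-ins, not the paper's elided mechanisms, but they exhibit the same phenomenon: the channel with the smaller $\epsilon$ has the larger leakage under both measures.

```python
import numpy as np

def ldp_epsilon(C):
    """epsilon of a channel: log of the largest ratio between entries in a column."""
    C = np.asarray(C, dtype=float)
    return float(np.log((C.max(axis=0) / C.min(axis=0)).max()))

def bayes_leakage_avg(C, prior):
    """Multiplicative average-case Bayes leakage: E_y[max_x p(x|y)] / max_x pi_x."""
    joint = prior[:, None] * C            # J[x,y] = pi_x * C[x,y]
    post_vuln = joint.max(axis=0).sum()   # sum over y of max_x J[x,y]
    return post_vuln / prior.max()

def bayes_leakage_max(C, prior):
    """Multiplicative max-case Bayes leakage: max_y max_x p(x|y) / max_x pi_x."""
    joint = prior[:, None] * C
    posteriors = joint / joint.sum(axis=0)   # column y holds the posterior p(x|y)
    return posteriors.max() / prior.max()

# Hypothetical 3x3 channels over secrets {y, m, n}; rows are secrets, columns observations.
M1 = np.array([[0.6, 0.2, 0.2],           # smaller epsilon (ln 3)
               [0.2, 0.6, 0.2],
               [0.2, 0.2, 0.6]])
M2 = np.array([[0.4, 0.3, 0.3],           # larger epsilon (ln 4)
               [0.1, 0.45, 0.45],
               [0.4, 0.3, 0.3]])
uniform = np.full(3, 1 / 3)

for name, C in [("M1", M1), ("M2", M2)]:
    print(name, "eps =", round(ldp_epsilon(C), 3),
          "avg leakage =", round(bayes_leakage_avg(C, uniform), 3),
          "max leakage =", round(bayes_leakage_max(C, uniform), 3))
# M1 has the smaller epsilon yet the larger leakage under both measures.
```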
These examples motivate the question: what does $\epsilon$ tell us about other kinds of attacking threats apart from the one modelled in the DP literature? (The DP literature describes $\epsilon$ as a measure of indistinguishability against an attacker who knows every secret in the dataset except for one.) If $\epsilon$ is indeed a robust measure for privacy, it should be useful for making judgments about many different kinds of privacy threats. One way of assessing robustness from the QIF perspective would be to ask: does $\epsilon$ correspond to a leakage bound for some class of (Bayesian) adversaries? If so, this would give $\epsilon$ a robust operational meaning, allowing its guarantees to be explained in terms of a large number of attacking threats relevant to the security community. For the privacy community, this may assist in providing some explainability for $\epsilon$ in terms of its measurement of leakage. This could be used to guide privacy practitioners in determining a reasonable value for $\epsilon$, which is a problem that has been identified by, for example, the implementers of differential privacy for the US Census Bureau [15].
In this paper, we bring together the information-theoretic, differential privacy and QIF frameworks to show how the measures of lift from information theory and g-leakage from QIF can be used to bring a robust operational meaning, that is, an explanation in terms of a broad class of adversarial attacks, to the $\epsilon$ parameter in (local) differential privacy.
I-B Key open questions and contributions of this paper
Despite a large body of existing work, a number of questions remain open at the intersection of these frameworks.
While $\epsilon$ in both the local and central models of DP is agnostic to the adversarial gain function and prior knowledge, it is not clear whether $\epsilon$ has a robust interpretation in the QIF sense (i.e., in terms of protection against a class of adversaries as highlighted in the example earlier). Such a robust operational interpretation is missing from the DP literature.
In addition, it is still not clear whether the measure of log-lift (in this paper, lift) has operational significance as a measure of privacy leakage, or is capable of robustly quantifying maximum privacy leakage in strong adversarial threat models via the QIF framework.
Last, the current theory of max-case leakage in QIF leaves open the question of how best to model the max-case adversarial threats or how to characterise max-case capacity over all priors and a wide class of adversarial threats.
The following contributions of this paper address these questions.
Main contributions:
1. We propose a new set of measures: max-case g-leakages (Defn 3), which are a subset of the general max-case measures for QIF that have been extensively studied in [16]. The max-case g-leakages have a clear operational significance: they correspond to adversaries whose goal is to maximise their gain over any output. Note that g-leakages model information-theoretic Bayesian adversaries and therefore represent a very general threat model.
2. We prove the first robust and operationally meaningful interpretation of $\epsilon$ in local DP via the proposed max-case g-leakage framework. Specifically, we show that $e^{\epsilon}$ is exactly the max-case g-leakage capacity, which measures the maximum leakage of a system wrt any Bayesian attack modelled using either average-case or max-case g-leakage measures, and is robustly independent of adversarial gain functions and priors.
3. We introduce the lift capacity (Defn 4), which quantifies lift over all priors, and show that it is equivalent to $e^{\epsilon}$ in the local DP framework. This provides the missing connection between lift and $\epsilon$, which has been a subject of interest in the information theory literature.
We remark that although one conclusion we draw – that the $\epsilon$ of the DP framework is robustly independent of the adversary – may seem expected or unsurprising, the method developed in this paper of obtaining the result via the QIF framework is non-trivial.
I-C Detail of technical contributions
Technically speaking, the first contribution of this work is establishing a link between lift in information-theoretic privacy and a notion of max-case g-leakage, which we introduce in this paper via the QIF framework. Specifically, in Defn 3 we introduce a max-case multiplicative g-leakage, using the standard notion of g-vulnerability in the QIF framework. We then show via Thm 2 that lift upper bounds the max-case g-leakage with respect to any gain function for any given prior. In Thm 1, we show that lift is indeed realisable as a max-case g-leakage for an adversary who chooses a suitable prior-dependent gain function.
In addition, we establish an important result linking all three frameworks (information-theoretic privacy, local differential privacy and QIF) in an operationally robust manner. Specifically, in the information-theoretic privacy framework, we define in Defn 4 the supremum of lift over all priors for a channel as the lift capacity. We then show in Thm 3 and Cor 2 that the lift capacity is equivalent to $e^{\epsilon}$ in the local DP framework. Combining this result with Thm 1 and Thm 2, we conclude that the lift capacity, aka $e^{\epsilon}$, is also equal to the max-case g-leakage capacity in the QIF framework. This gives lift a robust operational interpretation in terms of strong max-case adversarial threats and explains $\epsilon$ in local DP as a capacity measure through the lens of information theory and QIF.
Finally, we address a question raised in [6] as to whether the g-leakage framework is sufficient to describe adversarial scenarios in which the attacker is trying to guess a randomised function of the secret. In fact the Dalenius leakage described in [16, Ch 10] already addresses this in the case of average-case g-leakage, demonstrating that g-leakage is unchanged when considering randomised functions of the secret. We extend these results to the cases of lift and lift capacity, showing that, as with the Bayes capacity, both measures are unaffected under such adversarial scenarios.
Table I visualises existing results in the literature and our contributions. As seen from the table, this paper bridges several results in the information-theoretic privacy, differential privacy and QIF literature and establishes new relations between lift and other measures of privacy. Our new results, together with our consolidated summary of existing results, depict a fuller picture of the deep connections between the information-theoretic, differential privacy and QIF frameworks.
I-D Operational measures and their significance in information theory and security
A key motivating example for the importance of operational security measures came from Geoffrey Smith [4], who observed that traditional information-theoretic measures such as mutual information can underestimate the threat to a system. This work led to the study of g-leakages, which directly model attacking threats and provide leakage measures which can be interpreted in terms of a concrete information gain to an attacker. The relevance of such leakage measures to the security community is that they are explainable in terms of explicit adversarial threats, and therefore give meaningful security assessments. The study of g-leakages now sits under the QIF umbrella [17, 16].
In parallel to the above, recent work from the information theory community has identified the importance of operational measures for information leakage [6]. Of particular interest is the problem of providing an operational interpretation to privacy measures such as $\epsilon$ (in differential privacy) and log-lift (from information theory). Key work in this area by Issa et al. [6] has independently discovered several results overlapping with QIF, which will be outlined later in this paper. Importantly, both communities use an information-theoretic and decision-theoretic (Bayesian) approach to modelling adversarial threats. Therefore, although in this paper we use the notation and framework of quantitative information flow, the results will be relevant and interpretable for the information theory community as well.
II Information-Theoretic Foundations for Privacy
II-A The channel model for quantitative information flow
We adopt notation from QIF [16], a mathematical framework for studying and quantifying information leaks with respect to adversarial attack scenarios.
A probabilistic channel $C$ maps inputs (secrets) in $\mathcal{X}$ to observations in $\mathcal{Y}$ according to a conditional probability distribution. In the discrete case, such channels are matrices whose row-$x$, column-$y$ element $C_{x,y}$ is the probability that input $x$ produces observation $y$. The $x$-th row is thus a discrete distribution in $\mathbb{D}\mathcal{Y}$. We write $C : \mathcal{X} \to \mathbb{D}\mathcal{Y}$ for the type of the channel $C$.
We can use Bayes rule to model an adversary who uses their observations from a channel to (optimally) update their knowledge about the secrets in $\mathcal{X}$. Given a prior distribution $\pi$ (representing an adversary's prior knowledge) and channel $C$, we can compute a joint distribution $J$ where $J_{x,y} = \pi_x C_{x,y}$. Marginalising down columns yields the $\mathcal{Y}$-marginals $p(y) = \sum_x \pi_x C_{x,y}$, each having a posterior over $\mathcal{X}$ corresponding to the posterior probabilities $p(x|y)$, computed as $\pi_x C_{x,y}/p(y)$ (when $p(y)$ is non-zero). We denote by $p_{X|y}$ the posterior distribution corresponding to the observation $y$. The set of posterior distributions and the corresponding marginals can be used to compute the adversary's posterior knowledge after making an observation from the channel.
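As a quick illustration of this construction, the following sketch computes the joint, the $\mathcal{Y}$-marginals and the posteriors from a prior and a channel; the prior and channel values below are arbitrary and chosen purely for illustration.

```python
import numpy as np

def posteriors(prior, C):
    """Return (marginal p(y), posterior columns p(x|y)) for prior pi and channel C."""
    C = np.asarray(C, dtype=float)
    joint = prior[:, None] * C        # J[x,y] = pi_x * C[x,y]
    p_y = joint.sum(axis=0)           # marginal on observations
    post = joint / p_y                # column y holds the posterior p(x|y)
    return p_y, post

# Arbitrary illustrative prior and channel (rows of C sum to 1).
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
p_y, post = posteriors(pi, C)
print("p(y)       =", p_y)            # [0.49, 0.51]
print("posteriors =\n", post)         # each column sums to 1
```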
II-B Local differential privacy and lift
Local differential privacy (LDP), as applied by an individual, can be defined as a property of a channel $C$ taking data in $\mathcal{X}$ to noisy outputs in $\mathcal{Y}$.
Definition 1.
We say that channel $C$ satisfies $\epsilon$-LDP if $C_{x,y} \;\leq\; e^{\epsilon}\, C_{x',y}$ for all $x, x' \in \mathcal{X}$ and all $y \in \mathcal{Y}$.
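For a concrete reading of Defn 1, the small sketch below computes the smallest $\epsilon$ for which a given channel matrix satisfies $\epsilon$-LDP, namely the log of the largest ratio between any two entries in the same column; the matrix is a hypothetical example.

```python
import numpy as np

def ldp_epsilon(C):
    """Smallest epsilon such that C[x,y] <= exp(eps) * C[x',y] for all x, x', y."""
    C = np.asarray(C, dtype=float)
    ratios = C.max(axis=0) / C.min(axis=0)   # worst-case ratio per column
    return float(np.log(ratios.max()))

C = np.array([[0.7, 0.3],     # hypothetical channel; rows are secrets
              [0.4, 0.6],
              [0.1, 0.9]])
print(ldp_epsilon(C))          # log(0.7/0.1) = log 7
```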
A central quantity in this paper, which we call lift, was defined in [6] as:
Definition 2.
Given a channel $C$ and full-support prior $\pi$, the lift is defined as

$$\ell(\pi, C) \;=\; \max_{x \in \mathcal{X},\, y \in \mathcal{Y}} \frac{C_{x,y}}{p(y)}, \qquad (2)$$

where $p(y) = \sum_{x'} \pi_{x'} C_{x',y}$.
Remark: Two alternative expressions for the lift are
$$\ell(\pi, C) \;=\; \max_{x,y} \frac{p(x|y)}{\pi_x}, \qquad (3)$$
$$\ell(\pi, C) \;=\; \max_{x,y} \frac{p(x,y)}{\pi_x\, p(y)}. \qquad (4)$$
Notably, the intuition behind lift as expressed in Eqn. (3) is that it measures the adversary’s change in knowledge, through (multiplicative) comparison of her prior and posterior beliefs for each secret and observation. The observation providing the biggest “knowledge gap” or lift thereby produces the most leakage.
Note that the argument of the maximisation in Eqn. (4) is indeed
$$\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)}$$
in plain probability notation as described in §I, and its logarithm is known as the information density. Here, we provide a brief account of some works which have used such quantities in defining privacy measures.
In [9], a mechanism is said to provide $\epsilon$-local information privacy (LIP) if

$$e^{-\epsilon} \;\leq\; \frac{p(x|y)}{\pi_x} \;\leq\; e^{\epsilon} \quad \text{for all } x \in \mathcal{X},\, y \in \mathcal{Y}. \qquad (5)$$
A similar definition was earlier studied in [1] for sensitive features of datasets. In [10, 11], $\epsilon$-lift was said to be satisfied if the logarithm of the above inequalities held true for a sensitive variable $S$ and useful variable $X$ according to the Markov chain $S - X - Y$.
An important distinction between our definition of lift and the above works is that they imposed an additional lower bound, $e^{-\epsilon}$, on the ratio of the posterior to prior beliefs. In this work, by contrast, we are only concerned with the largest lift ratio (i.e., the upper bound), which we simply call lift (removing the logarithm). Our definition captures the notion of maximum realisable leakage [6], which is proved in [6, Theorem 13] to be equal to the lift as stated in our Definition 2. This also coincides with the notion of almost-sure pointwise maximal leakage in [18, Definition 4].
In [9, 1, 10, 8], it is proven that Eqn (5) implies $2\epsilon$-LDP; this bound can be improved through optimisation as shown in [9]. However, we highlight that Eqn (5) and the aforementioned relations to LDP depend on the prior $\pi$. In §III-B, we will establish robust results strongly and directly linking a prior-independent notion of lift capacity to $\epsilon$-LDP. The reverse direction, linking $\epsilon$-LDP to Eqn (5), is already strong: it has been shown in works such as [1, 8, 9] that $\epsilon$-LDP implies Eqn (5).
Example 1 (Running Example).
We will use the following example throughout to show how to compute our various measures. Consider the following channel :
The rows represent the (secret) eye colour of an individual (blue, green or blue-green) and the column labels represent the eye colour that the individual reports (blue or green). The channel then tells us, for example, that with probability the individual will report eye colour blue when they in fact have eye colour blue-green; however they will report eye colour blue with probability when they have eye colour green.
Let us imagine that our adversary (trying to guess the secret eye colour) has some prior knowledge given by . We then compute the adversary’s posterior knowledge given the prior and the channel as the set of posterior distributions:
with corresponding marginals , .
The $\epsilon$ for this channel (Defn 1) is computed using the maximum ratio between elements in any column; we have
The lift, in contrast, requires a prior in order to be computed (Defn 2). Given the prior from above, we compute the lift as:
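Since the matrix and prior of Ex. 1 are not reproduced here, the sketch below uses a hypothetical 3-secret, 2-observation channel and prior of the same shape to show how the lift of Defn 2 would be computed, alongside the channel's $e^{\epsilon}$; the lift never exceeds $e^{\epsilon}$, a fact we return to in §III-B.

```python
import numpy as np

def lift(prior, C):
    """Lift of Defn 2: max over x, y of C[x,y] / p(y), with p(y) = sum_x pi_x C[x,y]."""
    C = np.asarray(C, dtype=float)
    p_y = prior @ C                          # marginal on observations
    return float((C / p_y).max())

def ldp_epsilon(C):
    C = np.asarray(C, dtype=float)
    return float(np.log((C.max(axis=0) / C.min(axis=0)).max()))

# Hypothetical stand-ins for the channel and prior of Ex. 1 (3 eye colours, 2 reports).
pi = np.array([0.4, 0.4, 0.2])
C = np.array([[0.8, 0.2],
              [0.3, 0.7],
              [0.5, 0.5]])
print("lift     =", lift(pi, C))
print("exp(eps) =", np.exp(ldp_epsilon(C)))   # an upper bound on the lift for any prior
```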
II-C Operational scenarios and the g-leakage framework
An important development over the past decade in security has been the use of operationally relevant leakage measures to provide information about the relative security of different probabilistic systems. Operational measures – those which have a direct correspondence with an adversarial scenario – are crucial to a proper understanding of the security properties of a system. Foundational work in this area by Geoffrey Smith [4] has led to the study of g-leakage under the umbrella of QIF [17, 16].
In QIF, adversaries are modelled as Bayesian: they are equipped with a prior $\pi$ over secrets $\mathcal{X}$ and a gain function $g$ (over actions $\mathcal{A}$ and secrets $\mathcal{X}$), which models the gain $g(a,x)$ to the adversary upon taking action $a$ when the true value of the secret is $x$. Before observing an output from the system, the adversary's prior expected gain, which we call the expected prior vulnerability of the secret, can be expressed as

$$V_g(\pi) \;=\; \max_{a \in \mathcal{A}} \sum_{x \in \mathcal{X}} \pi_x\, g(a,x). \qquad (6)$$
The adversary can use their knowledge of the system (modelled as a channel $C$) to maximise their expected gain after making an observation. This is called the expected posterior vulnerability and is expressed as

$$V_g(\pi, C) \;=\; \sum_y \max_{a \in \mathcal{A}} \sum_x \pi_x\, C_{x,y}\, g(a,x) \qquad (7)$$
$$\;=\; \sum_y p(y) \max_{a \in \mathcal{A}} \sum_x p(x|y)\, g(a,x) \qquad (8)$$
$$\;=\; \sum_y p(y)\, V_g(p_{X|y}), \qquad (9)$$
where in the last equality we have used the definition of vulnerability in Eqn. (6) for the posterior distribution, denoted $p_{X|y}$. Finally, the difference between the prior and posterior vulnerabilities gives a measure of the leakage of the system to this adversary; the greater the difference, the better the adversary is able to use the transmitted information to infer the value of the secret. This can be computed multiplicatively as

$$\mathcal{L}_g(\pi, C) \;=\; \frac{V_g(\pi, C)}{V_g(\pi)}. \qquad (10)$$
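A minimal sketch of Eqns (6)-(10), with the gain function represented as an action-by-secret matrix; the particular gain function, prior and channel below are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

def prior_vuln(g, prior):
    """V_g(pi) = max_a sum_x pi_x g(a,x); rows of g are actions, columns are secrets."""
    return float((g @ prior).max())

def posterior_vuln(g, prior, C):
    """V_g(pi, C) = sum_y max_a sum_x pi_x C[x,y] g(a,x)  (Eqn (7))."""
    joint = prior[:, None] * C               # J[x,y]
    return float((g @ joint).max(axis=0).sum())

def mult_leakage(g, prior, C):
    """Multiplicative g-leakage, Eqn (10)."""
    return posterior_vuln(g, prior, C) / prior_vuln(g, prior)

# Illustrative example: the one-try guessing adversary on three secrets.
gid = np.eye(3)                               # gain 1 for a correct guess, 0 otherwise
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
print(mult_leakage(gid, pi, C))               # (0.35 + 0.18) / 0.5 = 1.06
```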
The g-leakage framework models a wide variety of attack scenarios, including guessing the secret in $k$ tries, guessing a property of the secret or guessing a value close to the secret. (In fact, it has been shown that any convex vulnerability function is expressible as a g-vulnerability [19].) Moreover, attached to each leakage measure is an operational scenario, given by the gain function and prior, which describes a specific adversarial threat.
An important notion in QIF is that of capacity, which corresponds to the maximum leakage of a system quantified over all priors, or all gain functions, or both. Capacities thus provide robust, interpretable leakage bounds; robust because they quantify over large classes of adversaries and therefore are “robust” to variations in adversarial assumptions; interpretable because the bounds provide security guarantees which can be explained in terms of the particulars of adversarial attacks.
Of particular note is the Bayes capacity, defined as

$$\mathcal{ML}(C) \;=\; \sup_{\pi}\ \sup_{g}\ \mathcal{L}_g(\pi, C) \qquad (11)$$
$$\;=\; \sup_{\pi}\ \frac{\sum_y \max_x \pi_x\, C_{x,y}}{\max_x \pi_x} \qquad (12)$$
$$\;=\; \sum_y \max_x C_{x,y}, \qquad (13)$$
where the first equality is proved in [17] and the last equality is due to the fact that the capacity is realised under the uniform prior [17, “Miracle Theorem”]. The Bayes capacity has an important operational significance: it is a tight upper bound on the multiplicative leakage of any channel in the average sense, quantified over all priors and gain functions [16]. In other words, there is no adversarial scenario, modelled using the expected gain of an adversary with any prior knowledge, for which the channel leakage exceeds the amount given by the Bayes capacity. This provides a robust leakage bound on the security risk to a system against any average-case attacker.
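In channel-matrix terms, Eqn (13) makes the Bayes capacity straightforward to evaluate; a small sketch on a hypothetical matrix:

```python
import numpy as np

def bayes_capacity(C):
    """Multiplicative Bayes capacity, Eqn (13): sum of the column maxima."""
    return float(np.asarray(C, dtype=float).max(axis=0).sum())

C = np.array([[0.7, 0.3],     # hypothetical channel
              [0.4, 0.6],
              [0.1, 0.9]])
print(bayes_capacity(C))       # 0.7 + 0.9 = 1.6
```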
II-D Connecting g-leakage and lift
We conclude this section by showing the connection between Bayes capacity and the information-theoretic measure lift: it turns out that lift is an upper bound on the Bayes capacity. The following lemma has been expressed in [6, Corollary of Theorem 13]; our contribution here is an alternative proof of this result in a QIF formulation.
Lemma 1.
Given a channel $C$, for all full-support priors $\pi$ it holds that $\mathcal{ML}(C) \;\leq\; \ell(\pi, C)$.
Further, we find that the bound is strict, in that lift can be strictly greater than the Bayes capacity. Recalling that the Bayes capacity is an upper bound on average-case g-leakage measures, this result then shows that lift cannot be represented as an average-case g-leakage measure.
To illustrate, from our running example (Ex. 1), we can compute the Bayes capacity of the channel as the sum of its column maxima, yielding
This means that the maximum leakage of the channel wrt any adversary measured using an average-case gain, for any prior and gain function, is given by this Bayes capacity. However, recall that we computed the lift under the prior as 1.73, which exceeds it. This means that lift cannot be expressible as an average-case g-leakage measure.
Next, we study the relationship between lift and max-case measures of leakage.
III On Max-case g-leakage Measures, Lift and Local DP
The QIF measures we have reviewed thus far have been for average-case attacks; that is, they model the expected gain of an attacker. Local differential privacy and lift, however, are max-case notions; they describe the worst-case gain for an adversary after making an observation, regardless of its probability of occurring. Max-case measures provide an alternative perspective to average-case measures: the average-case can be seen as the perspective of a data curator whose interest is in protecting attacks against the entire dataset, whereas the max-case provides the perspective of an individual in the dataset whose concern is their particular data point.
The theory of QIF has been extended to include max-case measures which quantify the gain to an adversary interested in only the worst-case leakage [16, 20]. To model this, the max-case posterior vulnerability is defined as

$$V^{\max}(\pi, C) \;=\; \max_{y:\, p(y) > 0} \mathbb{V}(p_{X|y}). \qquad (14)$$

This says that the max-case posterior vulnerability is the maximum vulnerability computed over the posteriors, where $\mathbb{V}$ is a vulnerability function defined on distributions. The theory of max-case vulnerability leaves open the question of how best to define $\mathbb{V}$, aside from a technical result which says that $\mathbb{V}$ should be quasi-convex in order to satisfy basic axioms such as the data processing inequality and monotonicity [21].
In this paper, we introduce the notion of max-case g-leakage by choosing $\mathbb{V}$ in Eqn (14) to be the standard g-vulnerability function $V_g$ defined in Eqn (6). Note that $V_g$ is convex [16], and therefore also quasi-convex, and thus represents a valid choice of vulnerability function. This yields the following definition:
Definition 3 (Max-case g-leakage).
Given a channel $C$, prior $\pi$ and gain function $g$, the max-case g-leakage of $C$ is defined as

$$\mathcal{L}^{\max}_g(\pi, C) \;=\; \frac{\max_{y:\, p(y) > 0} V_g(p_{X|y})}{V_g(\pi)}, \qquad (15)$$

where $V_g$ is the prior vulnerability function given in Eqn (6).
Comparing Eqn (15) to Eqn (10), we see that the max-case g-leakage quantifies the (multiplicative) difference between the prior vulnerability and a posterior vulnerability computed as the maximum vulnerability over the posteriors, wrt the adversary's measure of gain and their prior knowledge.
We remark that the significance of restricting $\mathbb{V}$ to g-vulnerability functions is that the resulting measure carries with it an operational interpretation in terms of adversarial threats. (That is, our $V_g$ describes an adversary in terms of a gain function $g$ and a prior $\pi$; other (strictly quasi-convex) vulnerability functions cannot be written in this way.) We also note that max-case g-leakage has yet to be studied in the literature. We will leave further discussion on this choice of $\mathbb{V}$ to §IV.
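A direct transcription of Defn 3 (Eqn (15)) as code, again with the gain function given as an action-by-secret matrix; the values are illustrative only.

```python
import numpy as np

def max_case_g_leakage(g, prior, C):
    """Eqn (15): max over outputs y of V_g(p(.|y)), divided by the prior V_g(pi)."""
    joint = prior[:, None] * np.asarray(C, dtype=float)
    p_y = joint.sum(axis=0)
    posteriors = joint / p_y                       # columns are p(x|y)
    post_vulns = (g @ posteriors).max(axis=0)      # V_g of each posterior
    prior_vuln = (g @ prior).max()                 # V_g(pi), Eqn (6)
    return float(post_vulns.max() / prior_vuln)

gid = np.eye(3)                                    # one-try guessing adversary
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
print(max_case_g_leakage(gid, pi, C))
```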
III-A Lift and max-case g-leakage
Using Defn 3 above, we now show that lift can be expressed as a max-case g-leakage.
Theorem 1 (Lift as max-case g-leakage).
For discrete domain $\mathcal{X}$, define the set of actions $\mathcal{A} = \mathcal{X}$ and the gain function $g_\pi$ as:

$$g_\pi(a, x) \;=\; \begin{cases} 1/\pi_x & \text{if } a = x,\\ 0 & \text{otherwise,}\end{cases} \qquad (16)$$

where $\pi$ is the (full support) prior of the adversary using the gain function $g_\pi$. Then the max-case $g_\pi$-leakage of any channel $C$ given a prior $\pi$ is equal to its lift. That is, $\mathcal{L}^{\max}_{g_\pi}(\pi, C) = \ell(\pi, C)$.
The significance of Thm 1 is that it gives an operational meaning to lift in terms of the adversary realised by $g_\pi$. To give some intuition, for some distribution $\sigma$, $V_{g_\pi}(\sigma)$ is given by $\max_x \sigma_x/\pi_x$; i.e., it measures the maximum "surprise" to the adversary, achieved on the secret $x$ for which the probability $\sigma_x$ most differs (multiplicatively) from the adversary's prior $\pi_x$. (The gain function $g_\pi$ is also known as the distribution-reciprocal gain function [16, Ch 3].)
Continuing with our running example (Ex. 1), we can compute the gain function $g_\pi$ for the prior of Ex. 1 as $g_\pi(x,x) = 1/\pi_x$ for each eye colour $x$ (and 0 elsewhere).
The prior vulnerability is
The posterior vulnerability is then given by
where
and
And so we obtain 1.73, which is what we calculated for the lift in Ex. 1.
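Because the numbers of Ex. 1 are not reproduced here, the following sketch checks Thm 1 on a hypothetical channel and prior instead: the max-case leakage under the distribution-reciprocal gain function $g_\pi$ coincides with the lift.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

def max_case_g_leakage(g, prior, C):
    joint = prior[:, None] * C
    posteriors = joint / joint.sum(axis=0)
    return float((g @ posteriors).max() / (g @ prior).max())

pi = np.array([0.4, 0.4, 0.2])                 # hypothetical full-support prior
C = np.array([[0.8, 0.2],
              [0.3, 0.7],
              [0.5, 0.5]])
g_pi = np.diag(1.0 / pi)                       # Eqn (16): g_pi(a,x) = 1/pi_x if a = x, else 0
print(max_case_g_leakage(g_pi, pi, C))         # equals the lift
print(lift(pi, C))
```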
We find, however, that lift has a much stronger property: it is in fact an upper bound on any max-case g-leakage measure (that is, using any gain function), with the minor restriction that the gain function must be non-negative. (Note that this non-negativity restriction is also required in the case of the Bayes capacity as an upper bound on average-case leakage.)
Theorem 2 (Lift bounds max-case g-leakage).
Define the max-case g-leakage as per Defn 3. Then for any non-negative gain function $g$, any prior $\pi$ and any channel $C$, it holds that $\mathcal{L}^{\max}_g(\pi, C) \leq \ell(\pi, C)$.
Thm 2 tells us that lift is a surprisingly robust measure; for any prior, lift is a measure of the maximum g-leakage of a channel wrt any (non-negative) gain function. Moreover, this upper bound holds regardless of whether we consider average-case leakage or max-case leakage: the max-case we proved above (Thm 2); the average-case follows from the fact that lift upper bounds the Bayes capacity (Lem 1) and the Bayes capacity is, by definition, an upper bound on any average-case g-leakage, for any prior and any (non-negative) gain function.
To illustrate, consider again the running example (Ex. 1). For the gain function we will choose the gid function, which models the adversary who wishes to guess the secret in one try. This is defined: $g_{id}(a,x) = 1$ if $a = x$, and $0$ otherwise.
Then the max-case gid-leakage given the prior is given by:
And the lift we computed earlier was 1.73, which is larger than this gid-leakage, as we expected. Thm 2 tells us that in fact, for any gain function we choose and any prior, the lift will always be at least as large as the max-case g-leakage.
III-B Robustness results: lift capacity and epsilon
Typically, leakage measures (as defined e.g., in QIF) are not robust – that is, they depend on the specifics of the adversary (their prior and gain function), and therefore may not provide a reliable measure of the safety of a channel against different (Bayesian) adversaries having different prior knowledge and intent. A more robust way to characterise a channel's leakage properties is to measure its maximum leakage – that is, leakage quantified over all priors, or over all gain functions, or both. (Note the distinction between maximum leakage and max-case leakage: the former quantifies over all priors and gain functions; the latter uses a max-case posterior vulnerability measure.) Such quantities are termed capacities (see §II-C), and have been previously studied in the context of average-case leakage [22]; the Bayes capacity (recall Eqn (11)) is the most well-known and provides robust upper bounds on the average-case leakage of any channel.
There has, to date, been no study of max-case capacities. However, Thm 2 gives an immediate result in this direction – it shows that lift can be seen as an example of a max-case capacity, since it provides an upper bound on max-case leakages quantified over all gain functions. It follows also that, by quantifying over all priors as well, the resulting lift capacity is an upper bound on all max-case g-leakages for all priors.
Definition 4.
We define the lift capacity for a channel $C$ as
$$\ell(C) \;=\; \sup_{\pi} \ell(\pi, C),$$
where the supremum is taken over all full-support priors $\pi$.
Immediate from this definition and Thm 2 we get the following:
Corollary 1.
For any channel $C$, any full-support prior $\pi$ and any non-negative gain function $g$, it holds that $\mathcal{L}^{\max}_g(\pi, C) \leq \ell(C)$.
We now show that this so-called lift capacity also has an important relationship with the epsilon of local differential privacy.
Theorem 3.
Let $C$ be a channel and $\epsilon \geq 0$. Then $\ell(C) \leq e^{\epsilon}$ if and only if $C$ satisfies $\epsilon$-LDP.
Proof.
For the forward direction, we first note that the ratio $\max_{x,y} C_{x,y}/C_{x',y}$ arises as the limit of $\ell(\pi, C)$ when $\pi$ approaches the point prior on $x'$. However, recall from Definition 4 that $\ell(C)$ is a supremum over all full support priors. Let $u^{x'}$ denote the point prior on $x'$; now define a sequence of priors
$$\pi^{(n)} \;=\; (1 - \tfrac{1}{n})\, u^{x'} + \tfrac{1}{n}\, \upsilon,$$
where $\upsilon$ is the uniform prior. We see that each $\pi^{(n)}$ is full support, and also $\pi^{(n)} \to u^{x'}$. Therefore

$$\lim_{n \to \infty} \ell(\pi^{(n)}, C) \;=\; \max_{x,y} \frac{C_{x,y}}{C_{x',y}}. \qquad (17)$$

We also note that we can rewrite $p^{(n)}(y)$ as $\sum_{x''} \pi^{(n)}_{x''} C_{x'',y}$. Therefore we have:

$e^{\epsilon} \;\geq\; \ell(C)$   "assumption"
$\;\geq\; \ell(\pi^{(n)}, C)$   "$\ell(C)$ is a supremum over full-support priors"
$\;=\; \max_{x,y} C_{x,y}\,/\,p^{(n)}(y)$   "Eqn (2)"
$\;=\; \max_{x,y} C_{x,y}\,/\,\textstyle\sum_{x''} \pi^{(n)}_{x''} C_{x'',y}$   "Substituting for $p^{(n)}(y)$ as noted above"
$\;\to\; \max_{x,y} C_{x,y}\,/\,C_{x',y}$   "From Eqn (17) above, taking $n \to \infty$"

And so we have that $\ell(C) \leq e^{\epsilon}$ implies $C_{x,y} \leq e^{\epsilon}\, C_{x',y}$ for all $x, x', y$; that is, $C$ satisfies $\epsilon$-LDP. For the reverse direction, we refer to previous works in which this has been shown to hold [1, 8, 9].
∎
Corollary 2.
The lift capacity coincides with the $e^{\epsilon}$ value of an LDP channel. That is, $\ell(C) = e^{\epsilon}$, where $\epsilon$ is the smallest value for which $C$ satisfies $\epsilon$-LDP.
Proof.
From Thm 3 we deduce that $\ell(C) = e^{\epsilon}$. ∎
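The following sketch illustrates Thm 3 and Cor 2 numerically on a hypothetical channel: evaluating the lift on full-support priors that concentrate on a suitable secret approaches $e^{\epsilon}$ from below, while never exceeding it.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

C = np.array([[0.8, 0.2],          # hypothetical channel
              [0.3, 0.7],
              [0.5, 0.5]])
ratios = C.max(axis=0) / C.min(axis=0)
print("exp(eps) =", ratios.max())  # lift capacity according to Cor 2 (3.5 here)

# Full-support priors concentrating on the secret x' that exposes the worst column ratio.
y_star = ratios.argmax()                         # column achieving the worst ratio
x_prime = C[:, y_star].argmin()                  # secret x' realising the supremum
uniform = np.full(C.shape[0], 1.0 / C.shape[0])
point = np.eye(C.shape[0])[x_prime]
for n in [2, 10, 100, 10_000]:
    pi_n = (1 - 1/n) * point + (1/n) * uniform   # full support for every finite n
    print(n, lift(pi_n, C))                      # approaches exp(eps) from below
```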
Thm 3 and Cor 2 establish the equivalence between the lift capacity and the epsilon parameter of local differential privacy, and by Thm 1 and Thm 2 this gives that epsilon (or rather, $e^{\epsilon}$) is a tight upper bound on the max-case g-leakage of any channel. That is, $e^{\epsilon}$ is the max-case g-leakage capacity, quantified over all priors and gain functions. This result connecting $\epsilon$ with the g-leakage framework provides the first robust, operational interpretation for local differential privacy in terms of Bayesian adversarial threats.
We note that $\epsilon$ has previously been interpreted as a capacity under QIF [20]. However, in that work the vulnerability function chosen was not expressible as a g-vulnerability, and therefore did not carry with it the operational interpretation attached to the g-leakage framework. This operational interpretation was Smith's original motivation for a g-leakage based model for secure information leakage measurements.
III-C Interpreting the results connecting g-leakage, lift capacity and epsilon
Thm 2 above tells us that lift bounds max-case g-leakage. To give some intuition for what this means in practice, suppose that we are worried about the leakage of a system to some attacker, both wrt any single individual in our system (i.e., the max-case leakage to the attacker) and wrt the system as a whole (i.e., the attacker's expected gain). Let us also assume that we can estimate the prior knowledge of the attacker, but we do not know exactly what the attacker is trying to achieve (i.e., what her gain function is). What Thm 2 tells us is that without knowing anything about the attacker's goal, we can still compute an upper bound on the leakage of the system to this attacker (i.e., we can measure how much more the attacker can gain by having access to the system). This upper bound is given by the lift, a quantity independent of the gain function, and it is an upper bound on both average-case and max-case leakage to Bayesian adversaries.
Now let us assume that we are worried about the same attacker, however this time we know nothing about her prior knowledge nor her specific goal. What Cor 2 tells us is that we can still compute an upper bound on the leakage of the system to this attacker, and in this case the upper bound is given by the lift capacity, or alternatively by $e^{\epsilon}$ (computed in the local differential privacy sense). This upper bound is robust, in the sense that it makes no assumptions about the adversary's prior knowledge or goal; thus it is termed a capacity, consistent with the QIF literature. The $e^{\epsilon}$ capacity differs from the existing Bayes capacity in QIF, in that the Bayes capacity is an upper bound for adversaries whose leakage is computed in the average-case, whereas the $e^{\epsilon}$ upper bound is a capacity for adversaries computed using either average-case or max-case leakage measures. The significance of these results is that we can now justify $\epsilon$ as a robust security parameter with an operational interpretation in terms of a large class of Bayesian attackers.
Recall the motivating example presented in the introduction (§I-A), in which a user is given a choice of two local differential privacy mechanisms with different values of $\epsilon$. Remember that we found that the mechanism with the smaller $\epsilon$ had a larger leakage (wrt both an average-case and a max-case Bayesian attacker) than the other. The results of Thm 2 and Cor 2 do not tell us when one mechanism will be better than another against a particular attack; for this we would turn to the refinement orders of QIF (studied in [20]), which tell us when one mechanism is better than another against all attacks; or alternatively we would compute the leakage of each mechanism wrt an individual attack of concern (as we did in the introduction).
What Thm 2 and Cor 2 do tell us is that there cannot be any attack (max-case or average-case) which produces more leakage than what the lift tells us (if we know the attacker's prior), or what $e^{\epsilon}$ tells us (if we do not). In the case of the two mechanisms from §I-A, Cor 2 tells us that there does not exist an attack (max-case or average-case, and regardless of the adversary's prior) that produces more leakage than their respective $e^{\epsilon}$ values. Moreover these bounds are tight: there exists an attack which does induce an $e^{\epsilon}$ leakage (and, in fact, Thm 3 gives a construction for such an attacker). This means that we can conclude that the mechanism with the smaller $\epsilon$ is generally safer, because its leakage never exceeds its (smaller) bound $e^{\epsilon}$, even though the other mechanism may be better in some specific circumstances.
Notice that the leakages we computed in the example in §I-A are indeed lower than the corresponding $e^{\epsilon}$ bounds for each mechanism. Interestingly, if we compute the Bayes capacities of the two mechanisms, we find that they equal the average-case leakages computed in §I-A (because the Bayes capacity is realised on a uniform prior for exactly the adversarial scenario we computed). What this tells us is that if we are only concerned with attackers modelled in the average-case, then the mechanism with the larger $\epsilon$ is in fact generally better (but may be worse for particular attackers). Recall also that some of the leakages we measured in §I-A were for a max-case attacker; in fact the max-case leakage is always at least as large as the average-case leakage, for any prior and any gain function, as we will prove later (see Lem 3 in §IV).
III-D Max-case Dalenius leakage
Up until this point, we have been satisfied with computing leakage measures with respect to deterministic functions of the domain $\mathcal{X}$ (described by a prior $\pi$). As pointed out in [6], one may also be concerned about potentially randomised functions of $\mathcal{X}$. The QIF theory of Dalenius leakage [16, Ch 10] accounts for such functions by modelling the leakage that a channel $C$ induces on a secret $Z$ through an unknown correlation $J$ between $Z$ and $\mathcal{X}$. Given such a correlation $J$, we can factorise it into a prior $\rho$ on $Z$ and a channel $D$ from $Z$ to $\mathcal{X}$ so that $J_{z,x} = \rho_z D_{z,x}$.
In [16] it is shown that the multiplicative Bayes capacity of the combined system $DC$ (writing juxtaposition for matrix multiplication) is in fact bounded from above by the multiplicative Bayes capacity of $C$; and thus considering arbitrary randomised functions of the secret is not necessary for quantifying the maximum multiplicative leakage of a channel.
Here we confirm that this property also holds for the max-case leakage capacity (given by $e^{\epsilon}$), and also for lift itself. This means that the capacity results proven in this paper also hold for randomised functions of $\mathcal{X}$, and therefore there is no advantage in considering randomised functions of the secret space.
The max-case g-leakage of $Z$ caused by $D$ and $C$ can be written as $\mathcal{L}^{\max}_g(\rho, DC)$. We now have the following:
Lemma 2.
Let $C$ be a channel and let $J$ be a correlation that factorises as $J_{z,x} = \rho_z D_{z,x}$. Then
$$\ell(\rho, DC) \;\leq\; \ell(\pi, C),$$
where $\pi = \rho D$ and $DC$ denotes the matrix product of $D$ and $C$.
Proof.
Referring to Thm 1, write $g_\rho$ for the gain function that realises lift; i.e., $\mathcal{L}^{\max}_{g_\rho}(\rho, DC) = \ell(\rho, DC)$. The data processing inequality for max-case leakage gives the corresponding bound for any gain function $g$. Thus, from Thm 1, it remains to bound $\ell(\rho, DC)$ by $\ell(\pi, C)$. We next have that:
$\ell(\rho, DC) \;=\; \max_{z,y} \dfrac{(DC)_{z,y}}{q(y)}$   "Defn 2, with $q(y) = \sum_z \rho_z (DC)_{z,y}$"
$\;=\; \max_{z,y} \dfrac{\sum_x D_{z,x} C_{x,y}}{q(y)}$   "Rewriting the matrix product"
$\;\leq\; \max_{z,y} \dfrac{\sum_x D_{z,x} \max_{x'} C_{x',y}}{q(y)}$   "Taking the max of $C_{x,y}$ over $x$"
$\;=\; \max_{z,y} \dfrac{\max_{x'} C_{x',y}}{q(y)}$   "Factorise; use $\sum_x D_{z,x} = 1$"
$\;=\; \max_{y} \dfrac{\max_{x'} C_{x',y}}{q(y)}$   "No dependence on $z$"
$\;=\; \max_{x,y} \dfrac{C_{x,y}}{\sum_{x'} (\rho D)_{x'} C_{x',y}}$   "Factorise denominator"
$\;=\; \max_{x,y} \dfrac{C_{x,y}}{\sum_{x'} \pi_{x'} C_{x',y}}$   "Substituting $\pi = \rho D$"
$\;=\; \ell(\pi, C)$   "Defn 2"

And thus $\ell(\rho, DC) \leq \ell(\pi, C)$. ∎
As a corollary, we have that the lift capacity also respects Dalenius leakage, and therefore the max-case g-leakage of secrets $Z$ via a channel $DC$ and arbitrary correlation $J$ is bounded from above by the lift capacity of $C$.
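A numerical sanity check of these Dalenius bounds, using randomly generated (hypothetical) correlations: the lift of the composed channel $DC$ under a prior $\rho$ on $Z$ never exceeds the lift of $C$ under the pushed-forward prior $\rho D$, and hence never exceeds the lift capacity $e^{\epsilon}$ of $C$.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

rng = np.random.default_rng(0)

def random_channel(rows, cols):
    M = rng.random((rows, cols)) + 0.05           # strictly positive entries
    return M / M.sum(axis=1, keepdims=True)       # rows sum to 1

C = random_channel(3, 4)                          # channel X -> Y
lift_capacity = (C.max(axis=0) / C.min(axis=0)).max()   # = exp(eps) of C

for _ in range(1000):
    D = random_channel(5, 3)                      # correlation channel Z -> X
    rho = rng.dirichlet(np.ones(5))               # full-support prior on Z
    assert lift(rho, D @ C) <= lift(rho @ D, C) + 1e-9   # Lemma 2
    assert lift(rho @ D, C) <= lift_capacity + 1e-9      # Cor 2 / lift capacity
print("Dalenius bounds hold on all sampled cases")
```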
III-E Interpreting the Dalenius leakage results
To provide some intuition on how to interpret the above results, we return to our motivating example from the introduction, in which two mechanisms were introduced. Recall that these mechanisms were applied to survey results with answers 'yes', 'maybe' and 'no'. Now let us assume that the adversary knows a correlation between survey answers and disease; perhaps users who answered 'yes' or 'maybe' are likely to have some serious illness, whereas users who answered 'no' are unlikely to have one. We wish to know whether such a correlation will cause extra harm to the individual: can the adversary learn more information (via the correlation) than the $\epsilon$ parameter of the chosen mechanism suggests?
What Cor 3 tells us is that the max-case g-leakage of the entire system (including the correlation) is bounded above by the $e^{\epsilon}$ computed from the mechanism alone. This means that any (potentially public) correlations that the adversary may have access to cannot increase the amount of leakage caused by the mechanism; the upper bound on leakage remains intact, where here we measure leakage using either lift or the lift capacity ($e^{\epsilon}$). In other words, the designer of the system can focus on the $\epsilon$ parameter describing the mechanism, and so long as the leakage (represented by $e^{\epsilon}$) is small enough, the system is protected against an adversary who knows any arbitrary correlation $J$. Note that we did not prove a Dalenius leakage result on max-case g-leakage itself, which we leave to future work.
We remark that this result appears to be similar to the notion in differential privacy that $\epsilon$ is independent of the prior knowledge of an attacker. A difference here is that the notion of prior knowledge is typically represented as a distribution over secrets; in the Dalenius scenario described above, we are interested in arbitrary correlations between the secret and some other (potentially damaging) information, and the concern is what damaging information the attacker can learn as a result of the data release and the correlation. Dalenius leakage reassures us that the damage caused by any arbitrary correlation is always bounded by the leakage of the original data release.
Interestingly, it may be the case that these Dalenius leakage results do not hold in the general case of (central) differential privacy; a counter-example appears in the work of [24]. We leave further exploration of this idea to future work.
IV Additional Results on Max-case Leakage and Technical Discussion
In this section, we provide further technical details on the max-case leakage definition of Defn 3.
Firstly, the following result shows that the max-case leakage of a channel is at least as large as the average-case leakage. This result completes Table I.
Lemma 3.
Given a channel $C$, prior $\pi$ and gain function $g$, it holds that $\mathcal{L}_g(\pi, C) \;\leq\; \mathcal{L}^{\max}_g(\pi, C)$.
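Lemma 3 is easy to check numerically; the sketch below samples random (hypothetical) channels, priors and non-negative gain functions and confirms that the max-case g-leakage is never below the average-case g-leakage.

```python
import numpy as np

rng = np.random.default_rng(1)

def leakages(g, prior, C):
    """Return (average-case, max-case) multiplicative g-leakage (Eqns (10) and (15))."""
    joint = prior[:, None] * C
    p_y = joint.sum(axis=0)
    posteriors = joint / p_y
    post_vulns = (g @ posteriors).max(axis=0)     # V_g of each posterior
    prior_vuln = (g @ prior).max()
    avg = (p_y * post_vulns).sum() / prior_vuln   # expectation over outputs, Eqn (9)/(10)
    mx = post_vulns.max() / prior_vuln            # worst single output, Eqn (15)
    return avg, mx

for _ in range(1000):
    C = rng.random((4, 5)) + 0.01
    C /= C.sum(axis=1, keepdims=True)             # random channel, rows sum to 1
    prior = rng.dirichlet(np.ones(4))
    g = rng.random((6, 4))                        # non-negative gain, 6 actions
    avg, mx = leakages(g, prior, C)
    assert mx >= avg - 1e-9
print("max-case leakage >= average-case leakage on all sampled cases")
```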
Next, we recall that in Defn 3 we chose to model max-case leakage using the prior vulnerability function $V_g$, which models the adversary's expected gain before interacting with a system (i.e., using only their prior knowledge). However, in the max-case setting it might seem preferable to choose a prior vulnerability function which models the adversary's max-case gain; i.e., we could have defined:

$$V^{\max}_g(\pi) \;=\; \max_{a \in \mathcal{A}}\ \max_{x:\, \pi_x > 0} g(a, x). \qquad (18)$$
That is, the adversary’s prior gain is computed using the secret which maximises their gain. We now justify our original decision (choosing an average-case prior vulnerability) by demonstrating that these choices are, in fact, equivalent.
Lemma 4.
Let $C$ be a channel, $\pi$ be a prior and $g$ be a gain function. Then there exists a gain function $g'$ such that $V_{g'}(\pi) = V^{\max}_g(\pi)$ and the max-case leakage of Defn 3 computed with $g'$ coincides with the max-case leakage computed with $g$ using the prior vulnerability of Eqn (18).
Finally, we recall the result of [21] which shows that, for reasonable axioms to hold under a max-case definition of leakage, the prior vulnerability function should be quasi-convex (in the prior). However, as we have seen, both the average-case and max-case prior vulnerability functions are convex. An open question in the community has been: are there any strictly quasi-convex functions which produce leakage measures of interest? Previous work [20] showed that the $\epsilon$ of metric differential privacy ($d$-privacy) can be expressed as an additive capacity using a quasi-convex vulnerability function, suggesting that this was indeed one such example. In this paper we have resolved this question with a much stronger result, showing that $\epsilon$ in local DP can in fact be expressed using a convex vulnerability function. This justifies our restriction of max-case leakage to g-leakage measures, but leaves open the question of the usefulness of max-case leakage defined over the full scope of quasi-convex vulnerabilities.
V Prior Work
Perhaps not surprisingly and underpinned by the foundational theory of information, many linkages have already been established between information-theoretic, differential privacy and QIF frameworks [20, 25]. Here, we only review results that are most pertinent to this work.
First, the logarithm of the Bayes capacity is known as the Sibson mutual information of order $\infty$ in information theory [26] and was recently shown in [6] to measure the maximum leakage to adversaries wanting to guess arbitrary randomised functions of the secret $X$. It turns out that this identity is no coincidence, and a recent work [18] proves that there is no advantage in generalising the class of adversaries from those who use deterministic gain functions to those who guess randomised functions. We remark that this result is also known in the QIF community via Dalenius leakage [16, Ch 10].
Second, it is not difficult to prove that an upper bound on the information density bounds the Sibson mutual information of order infinity [6], aka the logarithm of Bayes capacity in QIF language. Moreover, works such as [1, 8, 9] link upper and lower bounds on lift to (local) differential privacy measures [3, 12, 13] and vice versa.
There have been a number of works connecting DP and QIF, in particular via the study of max-case leakage measures [16, 20]. A general notion of max-case leakage under QIF has been extensively explored in [16]; there it was found that max-case measures are required to be quasi-convex in order to satisfy certain axioms. Our work differs from theirs in that we restrict our attention to the set of max-case g-leakages – that is, those derived from g-vulnerability functions – which are a subset of the quasi-convex max-case vulnerabilities (in fact, we have shown that our max-case g-leakages are all convex). To our knowledge our work is the first to explore max-case g-leakages and their capacities. General max-case leakages were also explored in [20] and their connection to differential privacy established via a leakage ordering. In particular, [20] found that whenever there is a max-case leakage order between two mechanisms (meaning one is always safer than the other wrt every max-case adversarial scenario), then there is a corresponding ordering between their $\epsilon$ values. While [20] explored $\epsilon$ as inducing a refinement order, finding that it is surprisingly weak (i.e., compared with the refinement orders of average-case and max-case leakage found in QIF), in our work we explore $\epsilon$ as a capacity, finding that it does have a strong and meaningful interpretation (in terms of max-case attackers). In addition, [20] established that the $\epsilon$ of DP can be expressed as a max-case leakage capacity (via the QIF framework) derived from a specific quasi-convex vulnerability function. Such quasi-convex vulnerabilities, unfortunately, do not correspond to adversarial models in g-leakage, and therefore fall outside the desirable operational interpretability. Our work finds that $\epsilon$ (actually $e^{\epsilon}$) is a capacity which can in fact be represented by a max-case g-vulnerability, which brings with it a broad operational meaning in terms of max-case attackers.
Other relationships between Bayesian models and differential privacy have also been previously studied [27, 23], however none of these results give both a robust and an interpretable meaning to $\epsilon$. Andrés et al. [23] provided a Bayesian interpretation for metric differential privacy in terms of a single adversary, and Alvim et al. [27] showed that $\epsilon$-differential privacy implies a bound on leakage but that the converse does not hold.
VI Conclusion
In this paper we have investigated the relationships between traditional information theory and the g-leakage framework of quantitative information flow. The connections are summarised in Table I. Overall we observe that the two notions converge wrt robustness, namely via the lift capacity, which we find is equivalent to the $e^{\epsilon}$ of local differential privacy.
Significant also is that local differential privacy's $\epsilon$ parameter can now be understood through the lens of QIF. In particular, we see that it is also a measure of robustness in that it behaves as a capacity – that is, independent of any particular prior. Moreover, it represents a tight upper bound on all max-case g-leakages. This is the first time that $\epsilon$ in LDP has been explained as a robust measure of information leakage in the QIF framework.
From the perspective of information theory, lift is also explained as a leakage measure, but interestingly we discovered that it has capacity-like properties (§III-B). Table I clarifies how these measures relate to the leakages and capacities well known in QIF.
Differential privacy is often seen as a useful technique to protect the privacy of individuals' data, and has been used in several prominent applications including the 2020 US Census [15]. As noted however, in spite of differential privacy's theoretical properties, there remain a number of challenges for its successful application. One such challenge is how to choose an appropriate level of $\epsilon$ relative to a particular scenario. Whilst $\epsilon$ itself provides information about indistinguishability, it is difficult to reconcile it with the protection against other relevant attacks described here.
The work presented here is an important step towards a better understanding of how to choose $\epsilon$ where indistinguishability is not the only concern, but where there are differing adversarial assumptions. The results here give a clear theoretical account of $\epsilon$ and how to view it under differing adversarial conditions, which can then be included in the determination of an appropriate threat model that is relevant to the scenario.
While the results of this paper were derived for local DP, they provide some insights about central DP. For instance, based on Thm 3, Cor 2 and the fact that local DP implies central DP, we know that the lift capacity upper bounds $e^{\epsilon}$ in central DP. However, more work is required to better understand what type of worst-case threat models in the QIF framework have a one-to-one correspondence with $\epsilon$ in central DP. If such an equivalence were to be established, it would settle the question of the operational meaning of $\epsilon$ in central DP in terms of which adversarial threats it can protect against. There may also exist new connections or applications of lift and lift capacity in defining or bounding other privacy and leakage measures. We leave these interesting questions for future work.
References
- [1] F. du Pin Calmon and N. Fawaz, “Privacy against statistical inference,” in 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1401–1408, IEEE, 2012.
- [2] A. Makhdoumi, S. Salamatian, N. Fawaz, and M. Médard, “From the information bottleneck to the privacy funnel,” in 2014 IEEE Information Theory Workshop (ITW 2014), pp. 501–505, Nov 2014.
- [3] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference, pp. 265–284, Springer, 2006.
- [4] G. Smith, “On the Foundations of Quantitative Information Flow,” in FOSSACS, vol. 5504 of LNCS, pp. 288–302, Springer, 2009.
- [5] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley-Interscience, 2006.
- [6] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2020.
- [7] J. Liao, O. Kosut, L. Sankar, and F. d. P. Calmon, “Tunable measures for information leakage and applications to privacy-utility tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
- [8] S. Salamatian, F. P. Calmon, N. Fawaz, A. Makhdoumi, and M. Médard, “Privacy-utility tradeoff and privacy funnel,” Unpublished preprint, https://www.mit.edu/~salmansa/files/privacy_TIFS.pdf, 2020.
- [9] B. Jiang, M. Seif, R. Tandon, and M. Li, “Context-aware local information privacy,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 3694–3708, 2021.
- [10] H. Hsu, S. Asoodeh, and F. P. Calmon, “Information-theoretic privacy watchdogs,” in 2019 IEEE International Symposium on Information Theory (ISIT), pp. 552–556, IEEE, 2019.
- [11] P. Sadeghi, N. Ding, and T. Rakotoarivelo, “On properties and optimization of information-theoretic privacy watchdog,” in IEEE Inf. Theory Workshop (ITW), pp. 1–5, Apr. 2020.
- [12] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?,” SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011.
- [13] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Proc. IEEE 54th Annu. Symp. Found. Comput. Sci., pp. 429–438, 2013.
- [14] D. Clark, S. Hunt, and P. Malacaria, “Quantitative information flow, relations and polymorphic types.,” J. Log. Comput., vol. 15, pp. 181–199, 01 2005.
- [15] S. L. Garfinkel, J. M. Abowd, and S. Powazek, “Issues encountered deploying differential privacy,” in Proceedings of the 2018 Workshop on Privacy in the Electronic Society, pp. 133–137, 2018.
- [16] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, The Science of Quantitative Information Flow. Springer, 2020.
- [17] M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, “Measuring information leakage using generalized gain functions,” in IEEE CSF, pp. 265–279, June 2012.
- [18] S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage,” arXiv preprint arXiv:2205.04935, 2022.
- [19] A. McIver, C. Morgan, G. Smith, B. Espinoza, and L. Meinicke, “Abstract channels and their robust information-leakage ordering,” in Principles of Security and Trust - Third International Conference, POST 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014, Proceedings (M. Abadi and S. Kremer, eds.), vol. 8414 of Lecture Notes in Computer Science, pp. 83–102, Springer, 2014.
- [20] K. Chatzikokolakis, N. Fernandes, and C. Palamidessi, “Comparing systems: Max-case refinement orders and application to differential privacy,” in Proc. CSF, IEEE Press, 2019.
- [21] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Axioms for information leakage,” in 2016 IEEE 29th Computer Security Foundations Symposium (CSF), pp. 77–92, IEEE, 2016.
- [22] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Additive and Multiplicative Notions of Leakage, and Their Capacities,” in IEEE CSF, pp. 308–322, IEEE, 2014.
- [23] M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: Differential privacy for location-based systems,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pp. 901–914, 2013.
- [24] Y. Wang, Y. O. Basciftci, and P. Ishwar, “Privacy-utility tradeoffs under constrained data release mechanisms,” arXiv preprint arXiv:1710.09295, 2017.
- [25] P. Cuff and L. Yu, “Differential privacy as a mutual information constraint,” in Proc. ACM CCS, pp. 43–54, 2016.
- [26] R. Sibson, “Information radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969.
- [27] M. S. Alvim, M. E. Andrés, K. Chatzikokolakis, and C. Palamidessi, “On the relation between differential privacy and quantitative information flow,” in Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part II, pp. 60–76, 2011.
Appendix A Motivating Example from the Introduction
In the introduction we motivated our work with the example of the following two channels:
Using a uniform prior , we can compute the set of posteriors using Bayes rule by multiplying out the channel by the prior and then normalising down columns. This yields the following posteriors and their corresponding marginals:
We can now compute the expected posterior gain to an adversary who wishes to guess the secret in one try. This adversary can be modelled using the gain function gid defined as:
$$g_{id}(a, x) \;=\; \begin{cases} 1 & \text{if } a = x,\\ 0 & \text{otherwise.}\end{cases} \qquad (19)$$
We first compute the prior vulnerability for this adversary, which measures the adversary's expected gain before observing any output from the channel; for the uniform prior over the three secrets this is $1/3$.
The posterior vulnerability determines the expected gain to the adversary after making an observation, and therefore depends on the channel. The posterior gid-vulnerability for is then:
And for we have:
The multiplicative leakage is given by the ratio of the posterior to the prior vulnerabilities. For we have:
And for we compute:
And thus leaks more than to this adversary.
Next we compare the max-case leakage using the same gain function gid and the same uniform prior. Note that the prior gid-vulnerability is the same ($1/3$). For the posterior max-case vulnerability of the first mechanism we compute:
And for we have:
And so the corresponding multiplicative leakage for is:
And for it is:
And so again we find that leaks more than for this adversary, now modelled using a max-case vulnerability function.
Appendix B Max-case g-leakage using average-case g-vulnerability
In this section, we complete the proof of Lem 4, showing a gain function which produces the same leakage under the average-case prior vulnerability of Eqn (6) as a max-case leakage defined using the max-case prior vulnerability of Eqn (18). That is, we will prove that for any gain function $g$ it is possible to construct a gain function $g'$ such that $\mathcal{L}^{\max}_{g'}(\pi, C) = \widehat{\mathcal{L}}^{\max}_{g}(\pi, C)$, where $\mathcal{L}^{\max}_{g'}(\pi, C)$ is defined in the usual way as

$$\mathcal{L}^{\max}_{g'}(\pi, C) \;=\; \frac{\max_{y:\, p(y)>0} V_{g'}(p_{X|y})}{V_{g'}(\pi)} \qquad (20)$$

and $\widehat{\mathcal{L}}^{\max}_{g}(\pi, C)$ is defined as

$$\widehat{\mathcal{L}}^{\max}_{g}(\pi, C) \;=\; \frac{\max_{y:\, p(y)>0} V_{g}(p_{X|y})}{V^{\max}_{g}(\pi)}. \qquad (21)$$
Note that our construction assumes that the domain is discrete, although the proof does not rely on this assumption.
For we can define the set of actions such that for each we have a set , and a mapping satisfying
(22) |
This means that . And therefore:
And this gives that .