
Explaining $\epsilon$ in local differential privacy through the lens of quantitative information flow

Natasha Fernandes School of Computing
Macquarie University
Australia
[email protected]
   Annabelle McIver School of Computing
Macquarie University
Australia
[email protected]
   Parastoo Sadeghi School of Engineering and IT
UNSW Canberra
Australia
[email protected]
Abstract

The study of leakage measures for privacy has been a subject of intensive research and is an important aspect of understanding how privacy leaks occur in computer systems. Differential privacy has been a focal point in the privacy community for some years and yet its leakage characteristics are not completely understood. In this paper we bring together two areas of research, information theory and the $g$-leakage framework of quantitative information flow (QIF), to give an operational interpretation for the epsilon parameter of local differential privacy. We find that epsilon emerges as a capacity measure in both frameworks: via (log)-lift, a popular measure in information theory, and via max-case $g$-leakage, which we introduce to describe the leakage of any system to Bayesian adversaries modelled using "worst-case" assumptions under the QIF framework. Our characterisation resolves an important question of interpretability of epsilon and consolidates a number of disparate results from the literature of both information theory and quantitative information flow.

Index Terms:
Differential privacy, log-lift, information leakage, g-leakage, quantitative information flow

I Introduction

Over the past two decades, characterising and limiting privacy leakage in data sharing systems has been the subject of intensive research. Information-theoretic privacy [1, 2], differential privacy [3] and quantitative information flow (QIF) [4] are three main frameworks that have been developed and studied for this purpose.

Information-theoretic privacy is concerned with measuring privacy leakage using well-known information-theoretic quantities, such as Shannon mutual information [5]:

$$I(X;Y) \;:=\; \sum_{x\in\mathcal{X},\,y\in\mathcal{Y}} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,,$$

where $X$ denotes a secret and $Y$ denotes the observable, with $p(x,y)$, $p(x)$ and $p(y)$ denoting the joint and marginal distributions of $X$ and $Y$, respectively. The channel $C$ is characterised by the conditional probability distribution $p(y|x)$ and is assumed fixed by the data sharing system and publicly known. For any prior $p(x)$, a joint $p(x,y)$ and a marginal distribution $p(y)$ are induced by $C$.
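To make this concrete, the following minimal Python sketch (with an illustrative two-value secret and channel of our own choosing, not taken from the paper) computes the joint and marginal induced by a prior and a channel, and the resulting mutual information:

    import math

    # Illustrative prior p(x) and row-stochastic channel p(y|x).
    prior = [0.5, 0.5]
    channel = [[0.75, 0.25],   # p(y | x = 0)
               [0.25, 0.75]]   # p(y | x = 1)

    # Joint p(x,y) and marginal p(y) induced by the prior and the channel.
    joint = [[prior[x] * channel[x][y] for y in range(2)] for x in range(2)]
    p_y = [sum(joint[x][y] for x in range(2)) for y in range(2)]

    # Shannon mutual information I(X;Y) = sum_{x,y} p(x,y) log( p(x,y) / (p(x) p(y)) ).
    mi = sum(joint[x][y] * math.log(joint[x][y] / (prior[x] * p_y[y]))
             for x in range(2) for y in range(2) if joint[x][y] > 0)
    print(mi)   # about 0.131 nats for this illustrative channel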

Papers such as [4, 6] discuss some of the limitations of Shannon mutual information in properly quantifying or differentiating various adversarial threats to privacy. Sibson and Arimoto mutual information of order $\alpha$ have recently been proposed to measure a spectrum of generalised threats to privacy [7]. It also turns out that the "nucleus" of Shannon mutual information, namely the information density variable $i(x,y) := \log\frac{p(x,y)}{p(x)p(y)}$, can represent a much stronger notion of privacy leakage than its average. We call $\ell(x,y) := \frac{p(x,y)}{p(x)p(y)}$, the exponential of information density, the lift variable; we may therefore refer to information density as log-lift. Information density and lift have been studied, albeit under various names, in works such as [1, 8, 9, 10, 11].

Differential privacy (DP) is a well-known privacy framework [3] describing a worst-case attack scenario, in which an attacker with knowledge of all individuals but one in a dataset learns little information about the unknown individual upon an information release. Specifically, for a given $\epsilon$, any set of outputs $E\subseteq\mathcal{Y}$ and any two neighbouring datasets $x, x'$ differing in one individual, DP requires

$$p(Y\in E \mid x) \;\leq\; e^{\epsilon}\, p(Y\in E \mid x')\,. \qquad (1)$$

This definition is sometimes called the central model for differential privacy, referring to its dependence on a trusted centralised data curator. Alternatively, in the local model for differential privacy [12, 13], each individual first applies noise to their data before sending it to an untrusted data curator, who releases some aggregate information about the (noisy) data. Local DP is more stringent than DP as it requires the inequality above to be satisfied for all $x, x'\in\mathcal{X}$. In other words, local $\epsilon$-DP implies central $\epsilon$-DP, but not vice versa.

Finally, quantitative information flow (QIF) is an information-theoretic framework for measuring leakage from (probabilistic) systems modelled as information-theoretic channels. (The early work introducing quantitative information flow was based on Shannon entropy to measure flows of information [14]; the QIF used here refers to later work initiated by Smith [4].) In QIF, leakages are derived from a Bayesian adversary model incorporating a gain function describing the adversary's goal, and the adversary's prior knowledge. The resulting leakage, computed as the expected gain in the adversary's knowledge, thus provides a meaningful, interpretable measure of the security risk to a system from that adversary. For example, the Bayes vulnerability measure, widely used in QIF, corresponds to the maximum probability that an adversary can guess the secret in one try; the Bayes leakage measure is the difference between the adversary's expected probability of guessing the secret after making an observation from the system versus their probability before making an observation (i.e., using only their prior knowledge).

I-A Explaining epsilon: a motivating example

An important development in both the information theory and security communities has been an interest in operationally relevant leakage measures to provide information about the relative security of different probabilistic data sharing systems. In this context, operational measures are ones which provide interpretable guarantees in terms of an adversary's gain or loss upon interacting with the system. The value produced by such a leakage measure therefore has a correspondence with some measurable information gain to the adversary, justifying its use as a security measure. For both of the above communities, the $\epsilon$ of differential privacy still lacks a robust operational meaning, despite the privacy community's explanation in terms of the differential privacy attacker who knows all-but-one secret in the dataset. The following example illustrates this more clearly.

Consider a local differential privacy scenario in which a user wishes to obfuscate their own sensitive data before releasing it to an untrusted data curator. The user is given a choice of two mechanisms to use. The scenario is depicted by the two mechanisms below, in which, for example, the user is answering a question in a survey, and the values {y, m, n} correspond to the answers 'yes', 'maybe' and 'no', respectively. Note that we describe our mechanisms as 'channels', in which the row labels correspond to secrets and the column labels correspond to observations. The value in row $i$, column $j$ represents the conditional probability $P(j|i)$; that is, the probability of observing output $j$ given input $i$. In the example below, in the channel $G$ we have that the probability of observing $y$ given an input $n$ is $1/6$, given by the value in the bottom left of the channel matrix.

$$\begin{array}{|c|ccc|}\hline G & y & m & n\\ \hline y & 2/3 & 1/6 & 1/6\\ m & 1/3 & 1/3 & 1/3\\ n & 1/6 & 1/6 & 2/3\\ \hline\end{array}
\qquad
\begin{array}{|c|ccc|}\hline R & y & m & n\\ \hline y & 3/5 & 1/5 & 1/5\\ m & 1/5 & 3/5 & 1/5\\ n & 1/5 & 1/5 & 3/5\\ \hline\end{array}$$

From the perspective of differential privacy, the choice of the user's noisy mechanism should be determined by the $\epsilon$ parameter described by the DP equation (1); in the DP literature, the smaller the $\epsilon$, the more privacy is said to be guaranteed to the user.

In the mechanisms described above, notice that the mechanism $G$ satisfies $\log(4)$-DP and the mechanism $R$ satisfies $\log(3)$-DP; in fact, $e^{\epsilon}$ corresponds to the maximum ratio between any two values within a column. (The mechanisms $G$ and $R$ are instances of the well-known Geometric and Randomised-Response mechanisms from DP, respectively.) We would therefore expect that $R$ is more private and so leaks less information to an adversary than does $G$.

We now ask: for each of these mechanisms, what would be the expected gain to an attacker who is trying to guess the secret in one try? We can model this attack by assuming a uniform prior for the adversary (say), and applying Bayes rule to compute the adversary's posterior knowledge after making an observation from each mechanism. (This is the model used in the QIF framework and is described in more detail in §II-C.) The posterior probability of success for the attacker is given by the maximum probability in each posterior, averaged by the marginal on that posterior; in QIF this is called the posterior Bayes vulnerability. The resulting computation yields that the leakage of the system computed using the adversary's expected posterior gain from $G$ is $1.67$ whereas the equivalent leakage from $R$ is $1.80$ (see Appendix A for complete details). So, in fact, our attacker learns more from $R$ than she does from $G$.

But since $\epsilon$ represents a "max-case" measure and the adversarial scenario above uses the expected gain of the attacker, i.e., an "average-case" measure, we might think that the above anomaly occurs because the measures we are using are incompatible. Let us consider now a max-case scenario: an attacker whose advantage is measured by the maximum probability of guessing the secret in one try regardless of which output was observed. In this scenario, the leakage to the attacker is given by the maximum leakage of any of the posteriors of the channel, independent of the marginal probability on that posterior. In this instance, we find that, again under a uniform prior, the max-case leakage computed using the posterior max-case vulnerability for $G$ is $1.71$ whereas for $R$ it is $1.80$ (see Appendix A for full details). And so $R$ also leaks more information to this max-case adversary than does $G$, even though its $\epsilon$ suggests that $R$ should be safer than $G$.
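The following small Python sketch (the helper functions and their names are ours) reproduces the numbers quoted above: under a uniform prior it computes, for $G$ and $R$, the average-case Bayes leakage and the max-case leakage described in this and the previous paragraph.

    G = [[2/3, 1/6, 1/6],
         [1/3, 1/3, 1/3],
         [1/6, 1/6, 2/3]]
    R = [[3/5, 1/5, 1/5],
         [1/5, 3/5, 1/5],
         [1/5, 1/5, 3/5]]

    def posteriors(prior, C):
        # Returns (p(y), posterior over X) for each column y of the channel C.
        out = []
        for y in range(len(C[0])):
            p_y = sum(prior[x] * C[x][y] for x in range(len(C)))
            out.append((p_y, [prior[x] * C[x][y] / p_y for x in range(len(C))]))
        return out

    def avg_bayes_leakage(prior, C):
        # Expected posterior Bayes vulnerability divided by prior Bayes vulnerability.
        post = sum(p_y * max(d) for p_y, d in posteriors(prior, C))
        return post / max(prior)

    def max_bayes_leakage(prior, C):
        # Worst-case (over outputs) posterior Bayes vulnerability divided by prior vulnerability.
        post = max(max(d) for _, d in posteriors(prior, C))
        return post / max(prior)

    u = [1/3, 1/3, 1/3]
    print(avg_bayes_leakage(u, G), avg_bayes_leakage(u, R))   # ~1.67 vs ~1.80
    print(max_bayes_leakage(u, G), max_bayes_leakage(u, R))   # ~1.71 vs ~1.80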

These examples motivate the question: what does $\epsilon$ tell us about other kinds of attacking threats apart from the one modelled in the DP literature? (The DP literature describes $\epsilon$ as a measure of indistinguishability against an attacker who knows every secret in the dataset except for one.) If $\epsilon$ is indeed a robust measure for privacy, it should be useful for making judgments about many different kinds of privacy threats. One way of assessing robustness from the QIF perspective would be to ask: does $\epsilon$ correspond to a leakage bound for some class of (Bayesian) adversaries? If so, this would give $\epsilon$ a robust operational meaning, allowing its guarantees to be explained in terms of a large number of attacking threats relevant to the security community. For the privacy community, this may assist in providing some explainability for $\epsilon$ in terms of its measurement of leakage. This could be used to guide privacy practitioners in determining a reasonable value for $\epsilon$, which is a problem that has been identified by, for example, the implementers of differential privacy for the US Census Bureau [15].

In this paper, we bring together the information-theoretic, differential privacy and QIF frameworks to show how the measures of lift from information theory and $g$-leakage from QIF can be used to bring a robust operational meaning (an explanation in terms of a broad class of adversarial attacks) to the parameter $\epsilon$ in (local) differential privacy.

I-B Key open questions and contributions of this paper

Despite a large body of existing work, a number of questions remain open at the intersection of these frameworks.

While $\epsilon$ in both the local and central models of DP is agnostic to the adversarial gain function and prior knowledge, it is not clear whether $\epsilon$ has a robust interpretation in the QIF sense (i.e., in terms of protection against a class of adversaries as highlighted in the example earlier). Such a robust operational interpretation is missing from the DP literature.

In addition, it is still not clear whether the measure of log-lift (in this paper, lift) has operational significance as a measure of privacy leakage, or whether it is capable of robustly quantifying maximum privacy leakage against strong adversarial threat models via the QIF framework.

Last, the current theory of max-case leakage in QIF leaves open the questions of how best to model max-case adversarial threats and of how to characterise max-case capacity over all priors and a wide class of adversarial threats.

The following contributions of this paper address these questions.

Main contributions:

  1. We propose a new set of measures: max-case $g$-leakages (Defn 3), which are a subset of the set of general max-case measures for QIF, which have been extensively studied in [16]. The max-case $g$-leakages have a clear operational significance: they correspond to adversaries whose goal is to maximise their gain over any output. Note that $g$-leakages model information-theoretic Bayesian adversaries and therefore represent a very general threat model.

  2. We prove the first robust and operationally meaningful interpretation of $\epsilon$ in local DP via the proposed max-case $g$-leakage framework. Specifically, we show that $e^{\epsilon}$ is exactly the max-case $g$-leakage capacity, which measures the maximum leakage of a system wrt any Bayesian attack modelled using either average-case or max-case $g$-leakage measures, and is robustly independent of adversarial gain functions and priors.

  3. We introduce the lift capacity (Defn 4), which quantifies lift over all priors, and show that it is equivalent to $e^{\epsilon}$ in the local DP framework. This provides the missing connection between lift and $\epsilon$ which has been a subject of interest in the information theory literature.

We remark that although one conclusion we draw (that the $\epsilon$ of the DP framework is robustly independent of the adversary) may seem expected or unsurprising, the method developed in this paper of obtaining the result via the QIF framework is non-trivial.

I-C Detail of technical contributions

Technically speaking, the first contribution of this work is establishing a link between lift in information-theoretic privacy and a notion of max-case $g$-leakage, which we introduce in this paper via the QIF framework. Specifically, in Defn 3 we introduce a max-case multiplicative $g$-leakage, denoted by $\mathcal{L}^{\mathrm{max}}_{g}(\pi,C)$, using the standard notion of $g$-vulnerability in the QIF framework. We then show via Thm 2 that lift upper-bounds $\mathcal{L}^{\mathrm{max}}_{g}(\pi,C)$ with respect to any gain function $g$ for any given prior $\pi$. In Thm 1, we show that lift is indeed realisable as a max-case $g$-leakage for an adversary who chooses a suitable prior-dependent gain function.

In addition, we establish an important result, linking all three of the information-theoretic privacy, local differential privacy and QIF frameworks in an operationally robust manner. Specifically, in the information-theoretic privacy framework, we define in Defn 4 the supremum of lift over all priors for a channel $C$ as the lift capacity and denote it by $\mathcal{M}\mathrm{Lift}(C)$. We then show in Thm 3 and Cor 2 that the lift capacity is equivalent to $e^{\epsilon}$ in the local DP framework. Combining this result with Thm 1 and Thm 2 we conclude that the lift capacity, aka $e^{\epsilon}$, is also equal to the max-case $g$-leakage capacity in the QIF framework. This gives lift a robust operational interpretation in terms of strong max-case adversarial threats and explains $\epsilon$ in local DP as a capacity measure from the lens of information theory and QIF.

Finally, we address a question raised in [6] as to whether the $g$-leakage framework is sufficient to describe adversarial scenarios in which the attacker is trying to guess a randomised function of the secret. In fact the Dalenius leakage described in [16, Ch 10] already addresses this in the case of average-case $g$-leakage, demonstrating that $g$-leakage is unchanged when considering randomised functions of the secret. We extend these results to the cases of lift and lift capacity, showing that, as with the Bayes capacity, both measurements are unaffected under such adversarial scenarios.

Table I visualises existing results in the literature and our contributions. As seen from the table, this paper bridges several results in the information-theoretic privacy, differential privacy and QIF literature and establishes new relations between lift and other measures of privacy. Our new results, together with our consolidated summary of existing results, depict a fuller picture of the deep connections between the information-theoretic, differential privacy and QIF frameworks.

Leakages: $\mathcal{L}^{\times}_{g}(\pi,C) \;\underset{\textrm{Lem 3}}{\leq}\; \mathcal{L}^{\mathrm{max}}_{g}(\pi,C) \;\underset{\textrm{Thm 2}}{\leq}\; \mathrm{Lift}(\pi,C)$

Capacities: $\mathcal{ML}_{1}^{\times}(\mathbb{D},C) \;\underset{\textrm{Lem 1}}{\leq}\; \mathcal{M}\mathrm{Lift}(C) \;\underset{\textrm{Cor 2}}{=}\; e^{\epsilon}$

Each leakage in the top row is bounded above by the capacity beneath it: $\mathcal{L}^{\times}_{g}(\pi,C) \leq \mathcal{ML}_{1}^{\times}(\mathbb{D},C)$, $\mathcal{L}^{\mathrm{max}}_{g}(\pi,C) \leq \mathcal{M}\mathrm{Lift}(C)$ (Cor 1) and $\mathrm{Lift}(\pi,C) \leq \mathcal{M}\mathrm{Lift}(C)$ (Defn 4).

TABLE I: Relationship between leakages (top row) and capacities (bottom row). The coloured text indicates new contributions of this paper.

I-D Operational measures and their significance in information theory and security

A key motivating example for the importance of operational security measures came from Geoffrey Smith [4], who observed that traditional information-theoretic measures such as mutual information can underestimate the threat to a system. This work led to the study of $g$-leakages, which directly model attacking threats and provide leakage measures which can be interpreted in terms of a concrete information gain to an attacker. The relevance of such leakage measures to the security community is that they are explainable in terms of explicit adversarial threats, and therefore give meaningful security assessments. The study of $g$-leakages now sits under the QIF umbrella [17, 16].

In parallel to the above, recent work from the information theory community has identified the importance of operational measures for information leakage [6]. Of particular interest is the problem of providing an operational interpretation to privacy measures such as $\epsilon$ (in differential privacy) and log-lift (from information theory). Key work in this area by Issa et al. [6] has independently discovered several results overlapping with QIF, which will be outlined later in this paper. Importantly, both communities use an information-theoretic and decision-theoretic (Bayesian) approach to modelling adversarial threats. Therefore, although in this paper we use the notation and framework of quantitative information flow, the results will be relevant and interpretable for the information theory community as well.

II Information-Theoretic Foundations for Privacy

II-A The channel model for quantitative information flow

We adopt notation from QIF [16], a mathematical framework for studying and quantifying information leaks with respect to adversarial attack scenarios.

A probabilistic channel $C$ maps inputs (secrets) $x\in\mathcal{X}$ to observations $y\in\mathcal{Y}$ according to a distribution in $\mathbb{D}\mathcal{Y}$. In the discrete case, such channels are $\mathcal{X}{\times}\mathcal{Y}$ matrices $C$ whose row-$x$, column-$y$ element $C_{x,y}$ is the probability that input $x$ produces observation $y$. The $x$-th row $C_{x,-}$ is thus a discrete distribution in $\mathbb{D}\mathcal{Y}$. We write $\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$ for the type of the channel $C$.

We can use Bayes rule to model an adversary who uses their observations from a channel to (optimally) update their knowledge about the secrets $\mathcal{X}$. Given a prior distribution $\pi:\mathbb{D}\mathcal{X}$ (representing an adversary's prior knowledge) and channel $C$, we can compute a joint distribution $J:\mathbb{D}(\mathcal{X}{\times}\mathcal{Y})$ where $J_{x,y}=\pi_{x}C_{x,y}$. Marginalising down columns yields the $y$-marginals $p(y)=\sum_{x}\pi_{x}C_{x,y}$, each having a posterior over $\mathcal{X}$ corresponding to the posterior probabilities $P_{X|y}(x)$, computed as $J_{x,y}/p(y)$ (when $p(y)$ is non-zero). We denote by $\delta^{y}$ the posterior distribution $P(X|y)$ corresponding to the observation $y$. The set of posterior distributions and the corresponding marginals can be used to compute the adversary's posterior knowledge after making an observation from the channel.
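As a minimal sketch of this pipeline (the two-by-two prior and channel here are illustrative, not from the paper), the joint, marginals and posteriors can be computed directly:

    # From prior pi and channel C, compute the joint J, the y-marginals p(y),
    # and the posterior distributions delta^y.
    pi = [0.3, 0.7]
    C = [[0.8, 0.2],
         [0.4, 0.6]]

    J = [[pi[x] * C[x][y] for y in range(2)] for x in range(2)]        # J_{x,y} = pi_x * C_{x,y}
    p_y = [sum(J[x][y] for x in range(2)) for y in range(2)]           # marginal on observations
    delta = [[J[x][y] / p_y[y] for x in range(2)] for y in range(2)]   # one posterior per column
    print(p_y)     # [0.52, 0.48]
    print(delta)   # posteriors over X for observations y=0 and y=1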

II-B Local differential privacy and lift

Local differential privacy (LDP), as applied by an individual, can be defined as a property of a channel $C:\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$ taking data $\mathcal{X}$ to noisy outputs $\mathcal{Y}$.

Definition 1.

We say that channel $C$ satisfies $\epsilon$-LDP if

$$C_{x,y} \;\leq\; e^{\epsilon}\,C_{x',y} \qquad \forall x,x'\in\mathcal{X},\; y\in\mathcal{Y}\,.$$

A central quantity in this paper, which we call lift, was defined in [6] as:

Definition 2.

Given a channel $C$ and prior $\pi:\mathbb{D}\mathcal{X}$, lift is defined as

$$\mathrm{Lift}(\pi,C) \;:=\; \max_{\substack{x\in\mathcal{X},\,y\in\mathcal{Y}:\\ J_{x,y}>0}} \frac{C_{x,y}}{p(y)}\,. \qquad (2)$$

Remark: Two alternative expressions for the lift are

$$\mathrm{Lift}(\pi,C) \;=\; \max_{\substack{x\in\mathcal{X},\,y\in\mathcal{Y}:\\ J_{x,y}>0}} \frac{\delta^{y}_{x}}{\pi_{x}}\,, \qquad (3)$$

$$\mathrm{Lift}(\pi,C) \;=\; \max_{\substack{x\in\mathcal{X},\,y\in\mathcal{Y}:\\ J_{x,y}>0}} \frac{J_{x,y}}{\pi_{x}\,p(y)}\,. \qquad (4)$$

Notably, the intuition behind lift as expressed in Eqn. (3) is that it measures the adversary’s change in knowledge, through (multiplicative) comparison of her prior and posterior beliefs for each secret and observation. The observation providing the biggest “knowledge gap” or lift thereby produces the most leakage.

Note that the argument of the maximisation, $\frac{J_{x,y}}{\pi_{x}p(y)}$, in Eqn. (4) is indeed

$$\ell(x,y) \;=\; \frac{p(x,y)}{p(x)\,p(y)}\,,$$

in plain probability notation as described in §I, and its logarithm is known as the information density. Here, we provide a brief account of some works which have used such quantities in defining privacy measures.

In [9], a mechanism is said to provide $\epsilon$-local information privacy (LIP) if

$$e^{-\epsilon} \;\leq\; \frac{\delta^{y}_{x}}{\pi_{x}} \;\leq\; e^{\epsilon}, \qquad \forall x\in\mathcal{X},\, y\in\mathcal{Y}. \qquad (5)$$

A similar definition was earlier studied in [1] for sensitive features of datasets. In [10, 11], $\epsilon$-lift was said to be satisfied if the logarithm of the above inequalities held true for a sensitive variable $S$ and useful variable $X$ according to the Markov chain $S\to X\to Y$.

An important distinction between our definition of lift and the above works is that they imposed an additional lower bound, $e^{-\epsilon}$, on the ratio of the posterior to prior beliefs, whereas in this work we are only concerned with the largest lift ratio (i.e., the upper bound), which we simply call lift (removing the logarithm). Our definition captures the notion of maximum realisable leakage [6], which is proved in [6, Theorem 13] to be equal to the lift as stated in our Definition 2. This also coincides with the notion of almost-sure pointwise maximal leakage in [18, Definition 4].

In [9, 1, 10, 8], it is proven that Eqn (5) implies $2\epsilon$-LDP. This bound can be improved through optimisation as shown in [9]. However, we highlight that Eqn (5) and the aforementioned relations to LDP depend on the prior $\pi$. In §III-B, we will establish robust results strongly and directly linking a prior-independent notion of lift capacity to $\epsilon$-LDP. The reverse direction, linking $\epsilon$-LDP to Eqn (5), is already strong: it has been shown in works such as [1, 8, 9] that $\epsilon$-LDP implies Eqn (5).

Example 1 (Running Example).

We will use the following example throughout to show how to compute our various measures. Consider the following channel $C$:

$$\begin{array}{|c|cc|}\hline C & b & g\\ \hline b & 3/4 & 1/4\\ g & 1/4 & 3/4\\ bg & 19/20 & 1/20\\ \hline\end{array}$$

The rows represent the (secret) eye colour of an individual (blue, green or blue-green) and the column labels represent the eye colour that the individual reports (blue or green). The channel $C$ then tells us, for example, that with probability $19/20$ the individual will report eye colour blue when they in fact have eye colour blue-green; however they will report eye colour blue with probability $1/4$ when they have eye colour green.

Let us imagine that our adversary (trying to guess the secret eye colour) has some prior knowledge given by $\pi=(1/4,\, 1/2,\, 1/4)$. We then compute the adversary's posterior knowledge given the prior and the channel as the set of posterior distributions:

$$\begin{array}{|c|c|}\hline \delta^{b} & \delta^{g}\\ \hline 15/44 & 5/36\\ 5/22 & 5/6\\ 19/44 & 1/36\\ \hline\end{array}$$

with corresponding marginals $p(b)=11/20$, $p(g)=9/20$.

The $\epsilon$ for this channel (Defn 1) is computed using the maximum ratio between elements in any column; we have

$$\epsilon \;=\; \log\Big(\frac{C_{g,g}}{C_{bg,g}}\Big) \;=\; \log(15) \;=\; 2.71\,.$$

The lift, in contrast, requires a prior $\pi$ in order to be computed (Defn 2). Given the prior $\pi$ from above, we compute the lift as:

$$\mathrm{Lift}(\pi,C) \;=\; \max\{C_{bg,b}/p(b),\; C_{g,g}/p(g)\} \;=\; 19/11 \;=\; 1.73\,.$$
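A short Python sketch (the variable names are ours) reproducing these two computations for the running example:

    import math

    C = [[3/4, 1/4],
         [1/4, 3/4],
         [19/20, 1/20]]
    pi = [1/4, 1/2, 1/4]

    # e^epsilon is the maximum ratio between two entries in the same column (Defn 1).
    e_eps = max(C[x][y] / C[xp][y] for y in range(2) for x in range(3) for xp in range(3))
    print(math.log(e_eps))   # log(15) ~ 2.71

    # Lift is the maximum of C_{x,y} / p(y) over the support of the joint (Defn 2).
    p_y = [sum(pi[x] * C[x][y] for x in range(3)) for y in range(2)]
    lift = max(C[x][y] / p_y[y] for x in range(3) for y in range(2) if pi[x] * C[x][y] > 0)
    print(lift)              # 19/11 ~ 1.73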

II-C Operational scenarios and the $g$-leakage framework

An important development over the past decade in security has been the use of operationally relevant leakage measures to provide information about the relative security of different probabilistic systems. Operational measures, those which have a direct correspondence with an adversarial scenario, are crucial to a proper understanding of the security properties of a system. Foundational work in this area by Geoffrey Smith [4] has led to the study of $g$-leakage under the umbrella of QIF [17, 16].

In QIF, adversaries are modelled as Bayesian: they are equipped with a prior $\pi:\mathbb{D}\mathcal{X}$ over secrets $\mathcal{X}$ and a gain function $g:\mathcal{W}{\times}\mathcal{X}\rightarrow\mathbb{R}_{\geq 0}$ (over actions $\mathcal{W}$ and secrets $\mathcal{X}$), which models the gain to the adversary upon taking action $w\in\mathcal{W}$ when the true value of the secret is $x\in\mathcal{X}$. Before observing an output from the system, the adversary's prior expected gain, which we call the expected prior vulnerability of the secret, can be expressed as

$$V_{g}(\pi) \;:=\; \max_{w\in\mathcal{W}} \sum_{x\in\mathcal{X}} \pi_{x}\, g(w,x)\,. \qquad (6)$$

The adversary can use their knowledge of the system (modelled as a channel $C:\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$) to maximise their expected gain after making an observation. This is called the expected posterior vulnerability and is expressed as

$$\begin{aligned}
V_{g}[\pi{\triangleright}C] \;&:=\; \sum_{y\in\mathcal{Y}} \max_{w\in\mathcal{W}} \sum_{x\in\mathcal{X}} \pi_{x}\,C_{x,y}\,g(w,x) &\quad(7)\\
&=\; \sum_{y\in\mathcal{Y}} p(y) \max_{w\in\mathcal{W}} \sum_{x\in\mathcal{X}} \delta_{x}^{y}\, g(w,x) &\quad(8)\\
&=\; \sum_{y\in\mathcal{Y}} p(y)\, V_{g}(\delta^{y})\,, &\quad(9)
\end{aligned}$$

where in the last equality we have used the definition of vulnerability in Eqn. (6) for the posterior distribution, denoted $\delta^{y}$. Finally, the difference between the prior and posterior vulnerabilities gives a measure of the leakage of the system to this adversary; the greater the difference, the better the adversary is able to use the transmitted information to infer the value of the secret. This can be computed multiplicatively as

$$\mathcal{L}^{\times}_{g}(\pi,C) \;:=\; \frac{V_{g}[\pi{\triangleright}C]}{V_{g}(\pi)}\,. \qquad (10)$$

The $g$-leakage framework models a wide variety of attack scenarios, including guessing the secret in $n$ tries, guessing a property of the secret or guessing a value close to the secret. (In fact, it has been shown that any convex vulnerability function is expressible as a $g$-vulnerability [19].) Moreover, attached to each leakage measure is an operational scenario given by the gain function and prior which describes a specific adversarial threat.
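As a small illustration of Eqns (6)-(10), the following Python sketch (the function names are ours) computes the prior and posterior $g$-vulnerability and the multiplicative $g$-leakage; applied to the running-example channel of Ex. 1 with the one-try guessing gain, it gives a leakage of about 1.23.

    # A gain function is represented here as a matrix g[w][x] of gains.
    def V_prior(pi, g):
        # Prior g-vulnerability, Eqn (6).
        return max(sum(pi[x] * g[w][x] for x in range(len(pi))) for w in range(len(g)))

    def V_post(pi, C, g):
        # Expected posterior g-vulnerability, Eqn (7).
        return sum(max(sum(pi[x] * C[x][y] * g[w][x] for x in range(len(pi)))
                       for w in range(len(g)))
                   for y in range(len(C[0])))

    def mult_leakage(pi, C, g):
        # Multiplicative g-leakage, Eqn (10).
        return V_post(pi, C, g) / V_prior(pi, g)

    C = [[3/4, 1/4], [1/4, 3/4], [19/20, 1/20]]
    pi = [1/4, 1/2, 1/4]
    gid = [[1 if w == x else 0 for x in range(3)] for w in range(3)]  # one-try guessing gain
    print(mult_leakage(pi, C, gid))   # ~1.23: expected gain grows by about 23% after observing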

An important notion in QIF is that of capacity, which corresponds to the maximum leakage of a system quantified over all priors, or all gain functions, or both. Capacities thus provide robust, interpretable leakage bounds; robust because they quantify over large classes of adversaries and therefore are “robust” to variations in adversarial assumptions; interpretable because the bounds provide security guarantees which can be explained in terms of the particulars of adversarial attacks.

Of particular note is the Bayes capacity, defined as

$$\begin{aligned}
\mathcal{ML}_{1}^{\times}(\mathbb{D},C) \;&:=\; \sup_{\pi}\sup_{g}\, \mathcal{L}^{\times}_{g}(\pi,C) &\quad(11)\\
&=\; \sup_{\pi} \sum_{y} \max_{x}\, \pi_{x}C_{x,y} &\quad(12)\\
&=\; \sum_{y} \max_{x}\, C_{x,y}\,, &\quad(13)
\end{aligned}$$

where the first equality is proved in [17] and the last equality is due to the fact that the capacity is realised under the uniform prior [17, “Miracle Theorem”]. The Bayes capacity has an important operational significance: it is a tight upper bound on the multiplicative leakage of any channel in the average sense, quantified over all priors and gain functions [16]. In other words, there is no adversarial scenario, modelled using the expected gain of an adversary with any prior knowledge, for which the channel leakage exceeds the amount given by the Bayes capacity. This provides a robust leakage bound on the security risk to a system against any average-case attacker.

II-D Connecting $g$-leakage and lift

We conclude this section by showing the connection between Bayes capacity and the information-theoretic measure lift: it turns out that lift is an upper bound on the Bayes capacity. The following lemma has been expressed in [6, Corollary of Theorem 13]; our contribution here is an alternative proof of this result in a QIF formulation.

Lemma 1.

Given a channel $C:\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$, for all priors $\pi:\mathbb{D}\mathcal{X}$ it holds that

$$\mathcal{ML}_{1}^{\times}(\mathbb{D},C) \;\leq\; \mathrm{Lift}(\pi,C)\,.$$
Proof.

We reason as follows:

$$\begin{aligned}
\mathrm{Lift}(\pi,C) \;&=\; \max_{x,y}\frac{C_{x,y}}{p(y)} && \text{"Eqn (2)"}\\
&=\; \max_{x,y}\frac{C_{x,y}}{p(y)} \sum_{y'} p(y') && \text{"since } \textstyle\sum_{y'} p(y')=1\text{"}\\
&=\; \sum_{y'} p(y') \max_{x,y}\frac{C_{x,y}}{p(y)} && \text{"max independent of } y'\text{"}\\
&\geq\; \sum_{y} p(y) \max_{x}\frac{C_{x,y}}{p(y)} && \text{"taking max over each column instead of the whole channel"}\\
&=\; \sum_{y} \max_{x} C_{x,y} && \text{"rearranging"}\\
&=\; \mathcal{ML}_{1}^{\times}(\mathbb{D},C) && \text{"Eqn (13)"}
\end{aligned}$$
∎

Further, we find that the bound can be strict, in that lift can be strictly greater than the Bayes capacity. Recalling that the Bayes capacity is an upper bound on average-case $g$-leakage measures, this result then shows that lift cannot be represented as an average-case $g$-leakage measure.

To illustrate, from our running example (Ex. 1), we can compute the Bayes capacity of $C$ as the sum of the column maxima of $C$, yielding

$$\mathcal{ML}_{1}^{\times}(\mathbb{D},C) \;=\; 19/20 + 3/4 \;=\; 1.7\,.$$

This means that the maximum leakage of $C$ wrt any adversary measured using an average-case gain, and for any prior and gain function, is $1.7$. However, recall that we computed the lift under the prior $\pi=(1/4,\, 1/2,\, 1/4)$ as $1.73$. This means that lift cannot be expressible as an average-case $g$-leakage measure.
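In code, this capacity is a one-line computation (a sketch continuing the running example):

    # Bayes capacity of the running-example channel: the sum of the column maxima, Eqn (13).
    C = [[3/4, 1/4], [1/4, 3/4], [19/20, 1/20]]
    bayes_capacity = sum(max(C[x][y] for x in range(3)) for y in range(2))
    print(bayes_capacity)   # 19/20 + 3/4 = 1.7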

Next, we study the relationship between lift and max-case measures of leakage.

III On Max-case $g$-leakage Measures, Lift and Local DP

The QIF measures we have reviewed thus far have been for average-case attacks; that is, they model the expected gain of an attacker. Local differential privacy and lift, however, are max-case notions; they describe the worst-case gain for an adversary after making an observation, regardless of its probability of occurring. Max-case measures provide an alternative perspective to average-case measures: the average-case can be seen as the perspective of a data curator whose interest is in protecting attacks against the entire dataset, whereas the max-case provides the perspective of an individual in the dataset whose concern is their particular data point.

The theory of QIF has been extended to include max-case measures which quantify the gain to an adversary interested in only the worst-case leakage [16, 20]. To model this, the max-case posterior vulnerability is defined as

$$V^{\mathrm{max}}[\pi{\triangleright}C] \;:=\; \max_{y} V(\delta^{y})\,. \qquad (14)$$

This says that the max-case posterior vulnerability $V^{\mathrm{max}}$ is the maximum vulnerability $V$ computed over the posteriors, where $V$ is a vulnerability function defined on distributions. The theory of max-case vulnerability leaves open the question of how best to define $V$, aside from a technical result which says that $V$ should be quasi-convex in order to satisfy basic axioms such as the data processing inequality and monotonicity [21].

In this paper, we introduce the notion of max-case $g$-leakage by choosing $V$ in Eqn (14) to be the standard $g$-vulnerability function $V_{g}$ defined in Eqn (6). Note that $V_{g}$ is convex [16], and therefore also quasi-convex, and thus represents a valid choice of vulnerability function. This yields the following definition:

Definition 3 (Max-case $g$-leakage).

Given a channel $C$, prior $\pi$ and gain function $g$, the max-case $g$-leakage of $C$ is defined as

$$\mathcal{L}^{\mathrm{max}}_{g}(\pi,C) \;:=\; \frac{V^{\mathrm{max}}_{g}[\pi{\triangleright}C]}{V_{g}(\pi)} \;=\; \frac{\max_{y} V_{g}(\delta^{y})}{V_{g}(\pi)}\,, \qquad (15)$$

where $V_{g}$ is the prior vulnerability function given in Eqn (6).

Comparing Eqn (15) to Eqn (10), we see that the max-case $g$-leakage quantifies the difference between the prior vulnerability and the posterior vulnerability, here computed as the maximum vulnerability of any of the posteriors wrt the adversary's measure of gain and their prior knowledge.

We remark that the significance of our restriction of $V$ to $g$-vulnerability functions is that it carries with it an operational interpretation in terms of adversarial threats. (That is, our $V$ describes an adversary in terms of a gain function $g$ and a prior $\pi$; other (strictly quasi-convex) vulnerability functions cannot be written in this way.) We also note that max-case $g$-leakage has yet to be studied in the literature. We will leave further discussion on this choice of $V$ to §IV.

III-A Lift and max-case $g$-leakage

Using Defn 3 above, we now show that lift can be expressed as a max-case $g$-leakage.

Theorem 1 (Lift as max-case $g$-leakage).

For discrete domain $\mathcal{X}$, define the set of actions $\mathcal{W}=\mathcal{X}$ and the gain function $g_{\pi}:\mathcal{W}{\times}\mathcal{X}\rightarrow\mathbb{R}_{\geq 0}$ as:

$$g_{\pi}(w,x) \;=\; \begin{cases}\frac{1}{\pi_{x}}, & \text{if } w=x\\ 0, & \text{otherwise}\end{cases} \qquad (16)$$

where $\pi$ is the (full support) prior of the adversary using the gain function $g_{\pi}$. Then the max-case $g$-leakage of any channel $C$ given a prior $\pi$ is equal to its lift. That is, $\mathcal{L}^{\mathrm{max}}_{g_{\pi}}(\pi,C)=\mathrm{Lift}(\pi,C)$.

Proof.

For the gain function given above, the prior vulnerability $V_{g_{\pi}}(\pi)$ is:

$$V_{g_{\pi}}(\pi) \;=\; \max_{w}\sum_{x}\pi_{x}\,g_{\pi}(w,x) \;=\; 1 \qquad \text{(the sum equals 1 for every } w\text{)}$$

and the max-case posterior vulnerability is:

$$\begin{aligned}
V_{g_{\pi}}^{\mathrm{max}}[\pi{\triangleright}C] \;&=\; \max_{y} V_{g_{\pi}}(\delta^{y}) && \text{"Defn 3"}\\
&=\; \max_{y}\max_{w}\sum_{x}\frac{\pi_{x}C_{x,y}\,g_{\pi}(w,x)}{p(y)} && \text{"expanding } \delta^{y}\text{"}\\
&=\; \max_{y}\max_{x}\frac{C_{x,y}}{p(y)} && \text{"substituting Eqn (16)"}\\
&=\; \mathrm{Lift}(\pi,C) && \text{"Defn 2"}
\end{aligned}$$

Thus the max-case leakage is $\mathcal{L}^{\mathrm{max}}_{g_{\pi}}(\pi,C)=\frac{V_{g_{\pi}}^{\mathrm{max}}[\pi{\triangleright}C]}{V_{g_{\pi}}(\pi)}=\mathrm{Lift}(\pi,C)$. ∎

The significance of Thm 1 is that it gives an operational meaning to lift in terms of the adversary realised by $V_{g_{\pi}}$. To give some intuition, for some distribution $\sigma$, $V_{g_{\pi}}(\sigma)$ is given by $\max_{x}\frac{\sigma_{x}}{\pi_{x}}$; i.e., it measures the maximum "surprise" to the adversary, achieved on the secret $x$ for which the probability $\sigma_{x}$ most differs (multiplicatively) from the adversary's prior $\pi_{x}$. (We remark that the gain function $g_{\pi}$ is also known as the distribution-reciprocal gain function [16, Ch 3].)

Continuing with our running example (Ex. 1), we can compute the gain function $g_{\pi}$ for $\pi(b,g,bg)=(1/4,\, 1/2,\, 1/4)$ as $g_{\pi}(b,b)=4$, $g_{\pi}(g,g)=2$, $g_{\pi}(bg,bg)=4$.

The prior vulnerability is

$$V_{g_{\pi}}(\pi) \;=\; \max_{w}\sum_{x}\pi_{x}\,g_{\pi}(w,x) \;=\; 1\,.$$

The posterior vulnerability is then given by

$$V_{g_{\pi}}^{\mathrm{max}}[\pi{\triangleright}C] \;=\; \max_{y} V_{g_{\pi}}(\delta^{y})$$

where

$$V_{g_{\pi}}(\delta^{b}) \;=\; \max\{15/11,\; 5/11,\; 19/11\} \;=\; 19/11$$

and

$$V_{g_{\pi}}(\delta^{g}) \;=\; \max\{5/9,\; 5/3,\; 1/9\} \;=\; 5/3\,.$$

And so $\mathcal{L}^{\mathrm{max}}_{g_{\pi}}(\pi,C)=19/11=1.73$, which is what we calculated for lift in Ex. 1.

We find, however, that lift has a much stronger property: it is in fact an upper bound on any max-case $g$-leakage measure (that is, using any gain function) with the minor restriction that the gain function must be non-negative. (Note that this non-negativity restriction is also required in the case of the Bayes capacity as an upper bound on average-case leakage.)

Theorem 2 (Lift bounds max-case $g$-leakage).

Define the max-case $g$-leakage $\mathcal{L}^{\mathrm{max}}_{g}$ as per Defn 3. Then for any non-negative gain function $g$, any prior $\pi$ and any channel $C$, it holds that $\mathcal{L}^{\mathrm{max}}_{g}(\pi,C)\leq\mathrm{Lift}(\pi,C)$.

Proof.

We reason as follows:

$$\begin{aligned}
\max_{y} V_{g}(\delta^{y}) \;&=\; \max_{y}\Big(\max_{w}\sum_{x}\frac{\pi_{x}C_{x,y}\,g(w,x)}{p(y)}\Big) && \text{"Eqn (6), expand } \delta^{y}\text{"}\\
&\leq\; \max_{y}\Big(\max_{w}\sum_{x}\Big(\max_{x'}\frac{C_{x',y}}{p(y)}\Big)\pi_{x}\,g(w,x)\Big) && \text{"}g\geq 0\text{"}\\
&=\; \max_{y}\Big(\max_{x}\frac{C_{x,y}}{p(y)}\Big)\max_{w}\Big(\sum_{x}\pi_{x}\,g(w,x)\Big) && \text{"}\textstyle\sum_{i}cx_{i}=c\sum_{i}x_{i}\text{ and }\max_{i}cx_{i}=c\max_{i}x_{i}\text{"}\\
&=\; \mathrm{Lift}(\pi,C)\,V_{g}(\pi) && \text{"Eqn (6), Lift"}
\end{aligned}$$

The result follows using Defn 3. ∎

Thm 2 tells us that lift is a surprisingly robust measure; for any prior, lift is a measure of the maximum $g$-leakage of a channel wrt any (non-negative) gain function. Moreover, this upper bound holds regardless of whether we consider average-case leakage or max-case leakage: the max-case we proved above (Thm 2); the average-case follows from the fact that lift upper-bounds the Bayes capacity (Lem 1) and the Bayes capacity is, by definition, an upper bound on any average-case $g$-leakage, for any prior and any (non-negative) gain function.

To illustrate, consider again the running example (Ex. 1). For the gain function we will choose the gid function which models the adversary who wishes to guess the secret in one try. This is defined:

$$gid(w,x) \;=\; \begin{cases}1 & \text{if } w=x,\\ 0 & \text{otherwise}\end{cases}$$

Then the max-case $gid$-leakage given the prior $\pi=(1/4,\, 1/2,\, 1/4)$ is given by:

$$\mathcal{L}^{\mathrm{max}}_{gid}(\pi,C) \;=\; \frac{V^{\mathrm{max}}_{gid}[\pi{\triangleright}C]}{V_{gid}(\pi)} \;=\; \frac{5/6}{1/2} \;=\; 1.67\,.$$

The lift, which we computed earlier as $1.73$, is larger than the $gid$-leakage, as we expected. Thm 2 tells us that in fact for any gain function we choose and any prior, the lift will always be at least as large as the max-case $g$-leakage.
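The following sketch (the helper names are ours) puts the pieces of this subsection together on the running example, computing the max-case $g$-leakage for the one-try gain $gid$ and for the prior-reciprocal gain $g_{\pi}$ of Thm 1, and comparing both against the lift:

    # Max-case g-leakage (Defn 3) on the running example, for gid and g_pi, versus the lift.
    C = [[3/4, 1/4], [1/4, 3/4], [19/20, 1/20]]
    pi = [1/4, 1/2, 1/4]
    p_y = [sum(pi[x] * C[x][y] for x in range(3)) for y in range(2)]
    delta = [[pi[x] * C[x][y] / p_y[y] for x in range(3)] for y in range(2)]

    def V(dist, g):
        # g-vulnerability of a single distribution, Eqn (6).
        return max(sum(dist[x] * g[w][x] for x in range(3)) for w in range(3))

    def max_case_leakage(g):
        # Worst posterior g-vulnerability over prior g-vulnerability, Eqn (15).
        return max(V(d, g) for d in delta) / V(pi, g)

    gid = [[1 if w == x else 0 for x in range(3)] for w in range(3)]
    g_pi = [[1 / pi[x] if w == x else 0 for x in range(3)] for w in range(3)]
    lift = max(C[x][y] / p_y[y] for x in range(3) for y in range(2))
    print(max_case_leakage(gid), max_case_leakage(g_pi), lift)   # ~1.67, ~1.73, ~1.73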

III-B Robustness results: lift capacity and epsilon

Typically, leakage measures (as defined, e.g., in QIF) are not robust; that is, they depend on the specifics of the adversary (their prior and gain function), and therefore may not provide a reliable measure of the safety of a channel against different (Bayesian) adversaries having different prior knowledge and intent. A more robust way to characterise a channel's leakage properties is to measure its maximum leakage, that is, leakage quantified over all priors, or over all gain functions, or both. (Note the distinction between maximum leakage and max-case leakage: the former quantifies over all priors and gain functions; the latter uses a max-case posterior vulnerability measure.) Such quantities are termed capacities (see §II-C), and have been previously studied in the context of average-case leakage [22]; the Bayes capacity (recall Eqn (11)) is the most well-known and provides robust upper bounds on the average-case leakage of any channel.

There has, to date, been no study of max-case capacities. However, Thm 2 gives an immediate result in this direction: it shows that lift can be seen as an example of a max-case capacity, since it provides an upper bound on max-case leakages quantified over all gain functions. It follows also that, by quantifying over all priors as well, the resulting lift capacity is an upper bound on all max-case $g$-leakages for all priors.

Definition 4.

We define the lift capacity for a channel $C$ as

$$\mathcal{M}\mathrm{Lift}(C) \;:=\; \sup_{\substack{\pi:\mathbb{D}\mathcal{X}:\\ \pi_{x}>0}} \mathrm{Lift}(\pi,C) \;=\; \sup_{\substack{\pi:\mathbb{D}\mathcal{X}:\\ \pi_{x}>0}}\; \max_{\substack{x\in\mathcal{X},\,y\in\mathcal{Y}:\\ J_{x,y}>0}} \frac{C_{x,y}}{p(y)}\,.$$

Immediately from this definition and Thm 2, we get the following:

Corollary 1.

For any channel $C$ and prior $\pi$ it holds that

$$\mathcal{L}^{\mathrm{max}}_{g}(\pi,C) \;\leq\; \mathcal{M}\mathrm{Lift}(C)\,.$$

We now show that this so-called lift capacity also has an important relationship with the epsilon of local differential privacy.

Theorem 3.

Let $C:\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$ be a channel and $\epsilon\geq 0$. Then

$$\mathcal{M}\mathrm{Lift}(C) \;\leq\; e^{\epsilon} \quad\text{iff}\quad \frac{C_{x,y}}{C_{x',y}} \;\leq\; e^{\epsilon} \quad\text{for all } x,x'\in\mathcal{X},\; y\in\mathcal{Y}\,.$$
Proof.

For the forward direction, we first note that $\inf_{\pi}\sum_{x}\pi_{x}C_{x,y}$ occurs when $\pi$ is a point prior on $\operatorname{arg\,min}_{x}C_{x,y}$. However, recall from Definition 4 that $\mathcal{M}\mathrm{Lift}(C)$ is a supremum over all full support priors. Let $x^{*}=\operatorname{arg\,min}_{x'}C_{x',y}$; now define a sequence of priors

$$\pi^{n}_{x^{*}} \;=\; 1-\frac{1}{n} \qquad\text{and}\qquad \pi^{n}_{x} \;=\; \frac{1}{n(|\mathcal{X}|-1)} \quad\text{for } x\neq x^{*}\,.$$

We see that $\pi^{n}$ is full support, and also $\lim_{n\to\infty}\sum_{x}\pi^{n}_{x}C_{x,y}=\min_{x}C_{x,y}$. Therefore

$$\inf_{\pi:\,\pi_{x}>0}\sum_{x}\pi_{x}C_{x,y} \;=\; \min_{x}C_{x,y}\,. \qquad (17)$$

We also note that we can rewrite $p(y)$ as $\sum_{x'}\pi_{x'}C_{x',y}$. Therefore we have:

$$\begin{aligned}
\sup_{\pi:\,\pi_{x}>0}\mathrm{Lift}(\pi,C) \;&=\; \sup_{\pi:\,\pi_{x}>0}\max_{x,y}\frac{C_{x,y}}{p(y)} && \text{"Eqn (2)"}\\
&=\; \sup_{\pi:\,\pi_{x}>0}\max_{x,y}\frac{C_{x,y}}{\sum_{x'}\pi_{x'}C_{x',y}} && \text{"substituting for } p(y)\text{ as noted above"}\\
&=\; \max_{x,y}\sup_{\pi:\,\pi_{x}>0}\frac{C_{x,y}}{\sum_{x'}\pi_{x'}C_{x',y}} && \text{"rearranging"}\\
&=\; \max_{x,y}\frac{C_{x,y}}{\inf_{\pi:\,\pi_{x}>0}\big(\sum_{x'}\pi_{x'}C_{x',y}\big)} && \text{"}C_{x,y}\text{ independent of } \pi\text{"}\\
&=\; \max_{x,y}\frac{C_{x,y}}{\min_{x'}C_{x',y}} && \text{"from Eqn (17) above"}\\
&\geq\; \frac{C_{x,y}}{C_{x',y}} \quad\text{for all } x,x',y && \text{"}\min_{x''}C_{x'',y}\leq C_{x',y}\text{"}
\end{aligned}$$

And so we have that $\mathcal{M}\mathrm{Lift}(C)\leq e^{\epsilon}$ implies $\frac{C_{x,y}}{C_{x',y}}\leq e^{\epsilon}$ for all $x,x',y$. For the reverse direction, we refer to previous works in which this has been shown to hold [1, 8, 9]. ∎

Corollary 2.

The lift capacity coincides with the $\epsilon$ value of an LDP channel. That is, $\mathcal{M}\mathrm{Lift}(C)=e^{\epsilon}$.

Proof.

From Thm 3 we deduce that $\mathcal{M}\mathrm{Lift}(C)=\sup_{x,x',y}\frac{C_{x,y}}{C_{x',y}}$. ∎

Thm 3 and Cor 2 establish the equivalence between the lift capacity and the epsilon parameter of local differential privacy, and by Thm 1 and Thm 2 this gives that epsilon (or rather, $e^{\epsilon}$) is a tight upper bound on the max-case $g$-leakage of any channel. That is, $e^{\epsilon}$ is the max-case $g$-leakage capacity, quantified over all priors and gain functions. This result connecting $\epsilon$ with the $g$-leakage framework provides the first robust, operational interpretation for local differential privacy in terms of Bayesian adversarial threats.
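As an informal illustration of this capacity result (a sketch using our own random sampling of priors, not a proof), one can check on the running example that the lift never exceeds the maximum column ratio, i.e., $e^{\epsilon}=15$, which is approached as the prior concentrates on the column-minimising secret:

    import random

    C = [[3/4, 1/4], [1/4, 3/4], [19/20, 1/20]]
    # e^epsilon: the maximum ratio between two entries in the same column (Cor 2).
    e_eps = max(C[x][y] / C[xp][y] for y in range(2) for x in range(3) for xp in range(3))

    def lift(pi):
        p_y = [sum(pi[x] * C[x][y] for x in range(3)) for y in range(2)]
        return max(C[x][y] / p_y[y] for x in range(3) for y in range(2))

    worst = 0.0
    for _ in range(100_000):
        w = [random.random() for _ in range(3)]
        pi = [v / sum(w) for v in w]          # a random full-support prior
        worst = max(worst, lift(pi))
    print(worst, "<=", e_eps)   # the supremum over all full-support priors equals e^epsilon = 15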

We note that $\epsilon$ has previously been interpreted as a capacity under QIF [20]. However, in that work the vulnerability function chosen was not expressible as a $g$-vulnerability, and therefore did not carry with it the operational interpretation attached to the $g$-leakage framework. This was Smith's original motivation for a $g$-leakage based model for secure information leakage measurements.

We also remark that the relationship between lift and local differential privacy has been established previously (e.g., [6, 23]), but these results have not been tied back to the $g$-leakage framework, and did not establish the operational significance of $\epsilon$ in terms of max-case $g$-leakage measures.

III-C Interpreting the results connecting $g$-leakage, lift capacity and epsilon

Thm 2 above tells us that lift bounds max-case $g$-leakage. To give some intuition for what this means in practice, suppose that we are worried about the leakage of a system to some attacker, both wrt any single individual in our system (i.e., the max-case leakage to the attacker) and wrt the system as a whole (i.e., the attacker's expected gain). Let's also assume that we can estimate the prior knowledge of the attacker, but we do not know exactly what the attacker is trying to achieve (i.e., what her gain function is). What Thm 2 tells us is that without knowing anything about the attacker's goal, we can still compute an upper bound on the leakage of the system to this attacker (i.e., we can measure how much more the attacker can gain by having access to the system). This upper bound is given by the lift, a quantity independent of the gain function, and it is an upper bound on both average-case and max-case leakage to Bayesian adversaries.

Now let's assume that we are worried about the same attacker, however this time we know nothing about her prior knowledge nor her specific goal. What Cor 2 tells us is that we can still compute an upper bound on the leakage of the system to this attacker, and in this case the upper bound is given by the lift capacity, or alternatively by $e^{\epsilon}$ (computed in the local differential privacy sense). This upper bound is robust, in the sense that it makes no assumptions about the adversary's prior knowledge or goal; thus it is termed a capacity, consistent with the QIF literature. The $\epsilon$ capacity differs from the existing Bayes capacity in QIF, in that the Bayes capacity is an upper bound on adversaries whose leakage is computed in the average-case, whereas the $e^{\epsilon}$ upper bound is a capacity on adversaries computed using either average-case or max-case leakage measures. The significance of these results is that we can now justify $\epsilon$ as a robust security parameter with an operational interpretation in terms of a large class of Bayesian attackers.

Recall the motivating example presented in the introduction (§I-A) in which a user is given a choice of local differential privacy mechanisms $G$ and $R$ with $\epsilon$ values of $\log(4)$ and $\log(3)$, respectively. Remember that we found that $R$ had a larger leakage (wrt both an average-case and a max-case Bayesian attacker) than $G$, even though it has a smaller $\epsilon$. The results of Thm 2 and Cor 2 do not tell us when one mechanism will be better than another against a particular attack; for this we would turn to the refinement orders of QIF (studied in [20]) which tell us when one mechanism is better than another against all attacks; or alternatively we would compute the leakage of each mechanism wrt an individual attack of concern (as we did in the introduction).

What Thm 2 and Cor 2 do tell us is that there cannot be any attack (max-case or average-case) which produces more leakage than what the lift tells us (if we know the attacker's prior), or what $e^{\epsilon}$ tells us (if we do not). In the case of $R$ and $G$, Cor 2 tells us that there does not exist an attack (max-case or average-case, and regardless of the adversary's prior) that produces more leakage than $e^{\epsilon}=3$ or $e^{\epsilon}=4$, respectively. Moreover these bounds are tight: there exists an attack which does induce an $e^{\epsilon}$ leakage (and, in fact, Thm 3 gives a construction for such an attacker). This means that we can conclude that $R$ is generally safer than $G$, because its leakage never exceeds $3$, even though $G$ may be better than $R$ in some specific circumstances.

Notice that the leakages we computed in the example in §I-A for $G$ were $1.67$ and $1.71$, and for $R$ they were both $1.80$. These are indeed lower than the corresponding $e^{\epsilon}$ bounds of $4$ and $3$, respectively. Interestingly, if we compute the Bayes capacity for $G$ and $R$, we find that these are $1.67$ and $1.80$, respectively (because the Bayes capacity is realised on a uniform prior for exactly the adversarial scenario we computed). What this tells us is that if we are only concerned with attackers modelled in the average-case, then $G$ is in fact generally better than $R$ (but may be worse for particular attackers). The leakage of $1.71$ that we measured for $G$ was for a max-case attacker; and in fact the max-case leakage is always at least as large as the average-case leakage for any prior and any gain function, as we will prove later (see Lem 3 in §IV).

III-D Max-case Dalenius leakage

Up until this point, we have been satisfied with computing leakage measures with respect to deterministic functions of the domain $\mathcal{X}$ (described by a prior $\pi$). As pointed out in [6], one may also be concerned about potentially randomised functions of $\mathcal{X}$. The QIF theory of Dalenius leakage [16, Ch 10] accounts for such functions by modelling the leakage that a channel $C:\mathcal{X}\rightarrow\mathbb{D}\mathcal{Y}$ induces on a secret $\mathcal{Z}$ through an unknown correlation $J:\mathbb{D}(\mathcal{Z}{\times}\mathcal{X})$. Given such a correlation $J$, we can factorise it into a prior $\rho:\mathbb{D}\mathcal{Z}$ and a channel $D:\mathcal{Z}\rightarrow\mathbb{D}\mathcal{X}$ so that $J_{z,x}=\rho_{z}\times D_{z,x}$.

In [16] it is shown that the multiplicative Bayes capacity of the combined system $DC$ (writing $DC$ for matrix multiplication) is in fact bounded from above by the multiplicative Bayes capacity of $C$; thus considering arbitrary randomised functions of the secret $\mathcal{X}$ is not necessary for quantifying the maximum multiplicative leakage of a channel.

Here we confirm that this property also holds for the max-case leakage capacity (given by $\mathcal{M}\mathrm{Lift}$), and also for lift itself. This means that the capacity results proven in this paper also hold for randomised functions of $\mathcal{X}$, and therefore there is no advantage in considering randomised functions of the secret space.

The max-case $g$-leakage of $\mathcal{Z}$ caused by $J$ and $C$ can be written as $\mathcal{L}^{\mathrm{max}}_{g}(\rho,DC)$. We now have the following:

Lemma 2.

Let C:𝒳𝔻𝒴C{:}\mathcal{X}{\rightarrow}\mathbb{D}{\mathcal{Y}} be a channel and let J:𝔻(𝒵×𝒳)J{:}\mathbb{D}{({\mathcal{Z}}{\times}{\mathcal{X}})} be a correlation that factorises as ρD\rho\triangleright D. Then

\mathrm{Lift}(\rho,DC) ~\leq~ \min\{\mathrm{Lift}(\rho,D),\mathrm{Lift}(\pi,C)\}~,

where π:𝔻𝒳\pi{:}\mathbb{D}{\mathcal{X}} and πx=zρzDz,x\pi_{x}=\sum_{z}\rho_{z}D_{z,x}.

Proof.

Referring to Thm 1, write gρg_{\rho} for the gain function that realises lift; i.e., Lift(ρ,C)=gρmax(ρ,C)\mathrm{Lift}(\rho,C)=\mathcal{L}^{\mathrm{max}}_{g_{\rho}}(\rho,C). The data processing inequality for max-case leakage gives that

\mathcal{L}^{\mathrm{max}}_{g_{\rho}}(\rho,DC) ~\leq~ \mathcal{L}^{\mathrm{max}}_{g_{\rho}}(\rho,D)

for any ρ\rho. Thus, from Thm 1 we get

\mathrm{Lift}(\rho,DC) ~\leq~ \mathrm{Lift}(\rho,D)~.

We next have that:

\mathrm{Lift}(\rho,DC)
 = \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{(DC)_{z,y}}{p(y)}   “Defn 2”
 = \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{\sum_{x}D_{z,x}C_{x,y}}{\sum_{z'}\rho_{z'}\sum_{x'}D_{z',x'}C_{x',y}}   “Rewriting”
 \leq \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{\sum_{x}D_{z,x}\,(\max_{x''}C_{x'',y})}{\sum_{z'}\rho_{z'}\sum_{x'}D_{z',x'}C_{x',y}}   “Taking the max of C_{-,y} over \mathcal{X}”
 = \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{\max_{x''}C_{x'',y}}{\sum_{z'}\rho_{z'}\sum_{x'}D_{z',x'}C_{x',y}}   “Factorise; use \sum_{x}D_{z,x}=1”
 = \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{\max_{x''}C_{x'',y}}{\sum_{x'}\sum_{z'}(\rho_{z'}D_{z',x'})C_{x',y}}   “Factorise denominator”
 = \max_{z\in\mathcal{Z},y\in\mathcal{Y}:J_{z,y}>0}\,\frac{\max_{x''}C_{x'',y}}{\sum_{x'}\pi_{x'}C_{x',y}}   “Substituting \sum_{z}\rho_{z}D_{z,x}=\pi_{x}”
 = \max_{x\in\mathcal{X},y\in\mathcal{Y}:J_{x,y}>0}\,\frac{C_{x,y}}{\sum_{x'}\pi_{x'}C_{x',y}}   “No dependence on \mathcal{Z}”
 = \mathrm{Lift}(\pi,C)   “Defn 2”

And thus Lift(ρ,DC)min{Lift(ρ,D),Lift(π,C)}\mathrm{Lift}(\rho,DC)\leq\min\{\mathrm{Lift}(\rho,D),\mathrm{Lift}(\pi,C)\}. ∎
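Although not needed for the proof, the inequality of Lem 2 is easy to check numerically. The sketch below (Python with numpy, an illustration only) draws a randomly generated full-support prior ρ and strictly positive channels D and C, so that all support conditions are vacuous, and compares the three lift values.

import numpy as np
rng = np.random.default_rng(0)

def random_channel(rows, cols):
    M = rng.random((rows, cols)) + 0.1          # strictly positive entries
    return M / M.sum(axis=1, keepdims=True)     # each row is a distribution

def lift(prior, C):
    # Lift(prior, C) = max_{x,y} C[x,y] / p(y), with p(y) = sum_x prior[x] C[x,y];
    # all entries are strictly positive here, so the support condition is vacuous
    return (C / (prior @ C)).max()

nZ, nX, nY = 4, 3, 3
rho = rng.dirichlet(np.ones(nZ))   # full-support prior on Z
D = random_channel(nZ, nX)         # channel Z -> X
C = random_channel(nX, nY)         # channel X -> Y
pi = rho @ D                       # induced prior on X

lhs = lift(rho, D @ C)
rhs = min(lift(rho, D), lift(pi, C))
print(lhs, rhs, lhs <= rhs + 1e-9)   # the inequality of Lem 2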

As a corollary, we have that the lift capacity also respects Dalenius leakage, and therefore the max-case gg-leakage of secrets 𝒵\mathcal{Z} via a channel CC and arbitrary correlation JJ is bounded from above by the lift capacity of CC.

Corollary 3.

For any channel C:𝒳𝔻𝒴C{:}\mathcal{X}{\rightarrow}\mathbb{D}{\mathcal{Y}}, non-negative gain function gg and correlation JJ given by Jz,x=ρzDz,xJ_{z,x}=\rho_{z}D_{z,x} we have that

\mathcal{L}^{\mathrm{max}}_{g}(\rho,DC) ~\leq~ \mathcal{M}\mathrm{Lift}(DC) ~\leq~ \mathcal{M}\mathrm{Lift}(C)~.
Proof.

By Cor 1, gmax(ρ,DC)Lift(DC)\mathcal{L}^{\mathrm{max}}_{g}(\rho,DC)\leq\mathcal{M}\mathrm{Lift}(DC). From Lem 2 we have Lift(ρ,DC)Lift(π,C)\mathrm{Lift}(\rho,DC)\leq\mathrm{Lift}(\pi,C) and thus we deduce

\mathcal{M}\mathrm{Lift}(DC)
 = \sup_{\rho:\mathbb{D}\mathcal{Z}:\,\rho_{z}>0}\,\mathrm{Lift}(\rho,DC)   “Defn 4”
 \leq \sup_{\rho:\mathbb{D}\mathcal{Z}:\,\rho_{z}>0}\,\mathrm{Lift}(\pi,C)   “Lem 2, substituting \pi_{x}=\sum_{z}\rho_{z}D_{z,x}”
 = \sup_{\pi:\mathbb{D}\mathcal{X}:\,\pi_{x}>0}\,\mathrm{Lift}(\pi,C)   “Independence from \mathcal{Z}”
 = \mathcal{M}\mathrm{Lift}(C)   “Defn 4”
∎

III-E Interpreting the Dalenius leakage results

To provide some intuition on how to interpret the above results, we return to our motivating example from the introduction, in which the two mechanisms GG and RR were introduced. Recall that these mechanisms were applied to survey results with answers ‘yes’, ‘maybe’ and ‘no’. Now let us assume that the adversary knows a correlation between survey answers and disease; perhaps users who answered ‘yes’ or ‘maybe’ are likely to have some serious illness, whereas users who answered ‘no’ are unlikely to have one. We wish to know whether such a correlation will cause extra harm to the individual: can the adversary learn more information (via the correlation) than the ϵ\epsilon parameter of GG or RR suggests?

What Cor 3 tells us is that the max-case gg-leakage of the entire system (including the correlation) is bounded above by eϵe^{\epsilon} computed from the mechanism (GG or RR). This means that any (potentially public) correlation DD that the adversary may have access to cannot increase the amount of leakage caused by GG or RR – the upper bound on leakage remains intact, where leakage is measured using either lift or the lift capacity (eϵe^{\epsilon}). In other words, the designer of the system can focus on the ϵ\epsilon parameter describing the mechanism, and so long as the leakage (represented by eϵe^{\epsilon}) is small enough, the system is protected against an adversary who knows any arbitrary correlation DD. Note that we have not proved a Dalenius leakage result stated directly in terms of max-case gg-leakage capacities (our bound goes via the lift capacity); we leave this to future work.

We remark that this result appears to be similar to the notion in differential privacy that ϵ\epsilon is independent of the prior knowledge of an attacker. A difference here is that the notion of prior knowledge is typically represented as a distribution over secrets; in the Dalenius scenario described above, we are interested in arbitrary correlations between the secret and some other (potentially damaging) information, and the concern is what damaging information the attacker can learn as a result of the data release and the correlation. Dalenius leakage reassures us that the damage caused by any arbitrary correlation is always bounded by the leakage of the original data release.

Interestingly, it may be the case that these Dalenius leakage results do not hold in the general case of (central) differential privacy; a counter-example appears in the work of [24]. We leave further exploration of this idea to future work.

IV Additional Results on Max-case Leakage and Technical Discussion

In this section, we provide further technical details on the max-case leakage definition of Defn 3.

Firstly, the following result shows that the max-case leakage of a channel is at least as large as the average-case leakage. This result completes Table I.

Lemma 3.

Given a channel CC, prior π\pi and gain function gg, it holds that

\mathcal{L}_{g}^{\times}(\pi,C) ~\leq~ \mathcal{L}^{\mathrm{max}}_{g}(\pi,C)~.
Proof.

We reason as follows:

V_{g}[\pi\triangleright C]
 = \sum_{y}p(y)\,V_{g}(\delta^{y})   “Eqn (9)”
 \leq \sum_{y}p(y)\,\max_{j}V_{g}(\delta^{j})   “Max over posteriors”
 = \max_{j}V_{g}(\delta^{j})\,\sum_{y}p(y)   “Factorising”
 = V^{\mathrm{max}}_{g}[\pi\triangleright C]   “Simplify, Defn 3”

Thus the corresponding multiplicative leakages are ordered (since the prior vulnerability is common to both). ∎
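The ordering of Lem 3 can likewise be observed numerically. The following sketch (Python with numpy, an illustration only) computes both multiplicative leakages for a randomly generated channel, prior and non-negative gain function, evaluating the g-vulnerability as a maximum over actions of an expected gain.

import numpy as np
rng = np.random.default_rng(1)

nX, nY, nW = 3, 4, 5
C = rng.random((nX, nY))
C /= C.sum(axis=1, keepdims=True)    # channel X -> Y (rows are distributions)
pi = rng.dirichlet(np.ones(nX))      # prior on X
g = rng.random((nW, nX))             # gain function g(w, x) >= 0

def V(p):
    # g-vulnerability of a distribution p on X: max_w sum_x p[x] g(w, x)
    return (g @ p).max()

p_y = pi @ C
posteriors = (pi[:, None] * C) / p_y          # column y is the posterior delta^y
avg_leak = sum(p_y[y] * V(posteriors[:, y]) for y in range(nY)) / V(pi)
max_leak = max(V(posteriors[:, y]) for y in range(nY)) / V(pi)
print(avg_leak, max_leak, avg_leak <= max_leak + 1e-12)   # the ordering of Lem 3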

Next, we recall that in Defn 3 we chose to model max-case leakage using the prior vulnerability function VgV_{g}, which models the adversary’s expected gain before interacting with a system (i.e., using only their prior knowledge). However, in the max-case setting it might seem preferable to choose a prior vulnerability function which models the adversary’s max-case gain; i.e., we could have defined:

V^{\mathrm{max}}_{g}(\pi) ~:=~ \max_{w,x}\pi_{x}\,g(w,x). \qquad (18)

That is, the adversary’s prior gain is computed using the secret xx which maximises their gain. We now justify our original decision (choosing an average-case prior vulnerability) by demonstrating that these choices are, in fact, equivalent.

Lemma 4.

Let CC be a channel, π:𝔻𝒳\pi{:}\mathbb{D}{\cal{X}} be a prior and gg be a gain function. Then there exists a gain function gg^{*} such that Vgmax(π)=Vg(π)V^{\mathrm{max}}_{g}(\pi)=V_{g^{*}}(\pi) and maxyVgmax(δy)Vgmax(π)=maxyVg(δy)Vg(π)\frac{\max_{y}V^{\mathrm{max}}_{g}(\delta^{y})}{V^{\mathrm{max}}_{g}(\pi)}=\frac{\max_{y}V_{g^{*}}(\delta^{y})}{V_{g^{*}}(\pi)}.

Proof.

(Sketch) Observe that VgmaxV^{\mathrm{max}}_{g} (Eqn (18)) is convex in π\pi and so it can be expressed as a convex gg-vulnerability [16, Ch 11, Thm 11.5]; i.e., there exists a gain function gg^{*} such that Vg(π)=Vgmax(π)V_{g^{*}}(\pi)=V^{\mathrm{max}}_{g}(\pi). This also means that maxyVgmax(δy)=maxyVg(δy)\max_{y}V^{\mathrm{max}}_{g}(\delta^{y})=\max_{y}V_{g^{*}}(\delta^{y}) and thus

\frac{\max_{y}V^{\mathrm{max}}_{g}(\delta^{y})}{V^{\mathrm{max}}_{g}(\pi)} ~=~ \frac{\max_{y}V_{g^{*}}(\delta^{y})}{V_{g^{*}}(\pi)}.

In Appendix B we show the construction of such a gg^{*}. ∎

Finally, we recall the result of [21] which shows that, for reasonable axioms to hold under a max-case definition of leakage, the prior vulnerability function should be quasi-convex (in the prior). However, as we have seen, both the average-case and max-case prior vulnerability functions are convex. An open question in the community has been: are there any strictly quasi-convex functions which produce leakage measures of interest? Previous work [20] showed that the ϵ\epsilon of metric differential privacy (dd-privacy) can be expressed as an additive capacity using a quasi-convex vulnerability function, suggesting that this was indeed one such example. In this paper we have resolved this question with a much stronger result, showing that ϵ\epsilon in local DP can in fact be expressed using a convex vulnerability function. This justifies our restriction of max-case leakage to gg-leakage measures, but leaves open the question of the usefulness of max-case leakage defined over the full scope of quasi-convex vulnerabilities.

V Prior Work

Perhaps not surprisingly, and underpinned by the foundational theory of information, many links have already been established between the information-theoretic, differential privacy and QIF frameworks [20, 25]. Here, we review only the results that are most pertinent to this work.

First, the logarithm of the Bayes capacity is known as the Sibson mutual information of order α=\alpha=\infty in information theory [26] and was recently shown in [6] to measure the maximum leakage of adversaries wanting to guess arbitrary randomised functions of secret XX. It turns out that this identity is no coincidence: recent work [18] proves that there is no advantage in generalising the class of adversaries from those who use deterministic gain functions to those who guess randomised functions. We remark that this result is also known in the QIF community via Dalenius leakage [16, Ch 10].

Second, it is not difficult to prove that an upper bound on the information density i(x,y)i(x,y) bounds the Sibson mutual information of order infinity [6], that is, the logarithm of the Bayes capacity in QIF language. Moreover, works such as [1, 8, 9] link upper and lower bounds on lift to (local) differential privacy measures [3, 12, 13] and vice versa.

There have been a number of works connecting DP and QIF, in particular via the study of max-case leakage measures [16, 20]. A general notion of max-case leakage under QIF has been extensively explored in [16]; there it was found that max-case measures are required to be quasi-convex in order to satisfy certain axioms. Our work differs from theirs in that we restrict our attention to the set of max-case gg-leakages – that is, those derived from gg-vulnerability functions – which are a subset of the quasi-convex max-case vulnerabilities (in fact, we have shown that our max-case gg-leakages are all convex). To our knowledge, our work is the first to explore max-case gg-leakages and their capacities. General max-case leakages were also explored in [20] and their connection to differential privacy established via a leakage ordering. In particular, [20] found that whenever there is a max-case leakage order between two mechanisms (meaning one is always safer than the other with respect to every max-case adversarial scenario), then there is a corresponding ϵ\epsilon ordering between the two. While [20] explored ϵ\epsilon as inducing a refinement order, finding that it is surprisingly weak (i.e., compared with the refinement orders of average-case and max-case found in QIF), in our work we explore ϵ\epsilon as a capacity, finding that it does have a strong and meaningful interpretation (in terms of max-case attackers). In addition, [20] established that the ϵ\epsilon of DP can be expressed as a max-case leakage capacity (via the QIF framework) derived from a specific quasi-convex vulnerability function. Such quasi-convex vulnerabilities, unfortunately, do not correspond to adversarial models in gg-leakage, and therefore lack the desired operational interpretability. Our work finds that ϵ\epsilon (actually eϵe^{\epsilon}) is a capacity which can in fact be represented by a max-case gg-vulnerability, which brings with it a broad operational meaning in terms of max-case attackers.

Other relationships between Bayesian models and differential privacy have also been previously studied [27, 23]; however, none of these results gives both a robust and an interpretable meaning to ϵ\epsilon. Andrés et al. [23] provided a Bayesian interpretation for metric differential privacy in terms of a single adversary, and Alvim et al. [27] showed that ϵ\epsilon-differential privacy implies a bound on leakage but that the converse does not hold.

VI Conclusion

In this paper we have investigated the relationships between traditional information theory and the gg-leakage framework of quantitative information flow. The connections are summarised in Table I. Overall we observe that the two notions converge with respect to robustness, namely via the capacity Lift(C)\mathcal{M}\mathrm{Lift}(C), which we find is equivalent to eϵe^{\epsilon} of local differential privacy.

Significant also is that local differential privacy’s ϵ\epsilon parameter can now be understood through the lens of QIF. In particular, we see that it is also a measure of robustness in that it behaves as a capacity – that is, independent of any particular prior. Moreover, it represents a tight upper bound on all max-case gg-leakages. This is the first time that ϵ\epsilon in LDP has been explained as a robust measure of information leakage in the QIF framework.

From the perspective of information theory, lift is also explained as a leakage measure, but interestingly we discovered that it has “capacity-like” properties (§III-B). Table I clarifies how these measures relate to the leakages and capacities well-known in QIF.

Differential privacy is often seen as a useful technique to protect the privacy of individuals’ data, and has been used in several prominent applications including the 2020 US Census [15]. As noted, however, in spite of differential privacy’s theoretical properties, there remain a number of challenges for its successful application. One such challenge is how to choose an appropriate level of ϵ\epsilon for a particular scenario. Whilst ϵ\epsilon itself provides information about indistinguishability, it is difficult to reconcile it with protection against the other relevant attacks described here.

The work presented here is an important step towards a better understanding of how to choose ϵ\epsilon where indistinguishability is not the only concern, but where there are differing adversarial assumptions. The results give a clear theoretical account of ϵ\epsilon and how to view it under differing adversarial conditions, which can then inform the determination of an appropriate threat model for the scenario at hand.

While the results of this paper were derived for local DP, they provide some insights into central DP. For instance, based on Thm 3, Cor 2 and the fact that local DP implies central DP, we know that lift capacity upper bounds eϵe^{\epsilon} in central DP. However, more work is required to better understand what type of worst-case threat models in the QIF framework have a one-to-one correspondence with ϵ\epsilon in central DP. If such an equivalence were established, it would settle the question of the operational meaning of central DP in terms of which adversarial threats it can protect against. There may also exist new connections or applications of lift and lift capacity in defining or bounding other privacy and leakage measures. We leave these interesting questions for future work.

References

  • [1] F. du Pin Calmon and N. Fawaz, “Privacy against statistical inference,” in 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1401–1408, IEEE, 2012.
  • [2] A. Makhdoumi, S. Salamatian, N. Fawaz, and M. Médard, “From the information bottleneck to the privacy funnel,” in 2014 IEEE Information Theory Workshop (ITW 2014), pp. 501–505, Nov 2014.
  • [3] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference, pp. 265–284, Springer, 2006.
  • [4] G. Smith, “On the Foundations of Quantitative Information Flow,” in FOSSACS, vol. 5504 of LNCS, pp. 288–302, Springer, 2009.
  • [5] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley-Interscience, 2006.
  • [6] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2020.
  • [7] J. Liao, O. Kosut, L. Sankar, and F. d. P. Calmon, “Tunable measures for information leakage and applications to privacy-utility tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
  • [8] S. Salamatian, F. P. Calmon, N. Fawaz, A. Makhdoumi, and M. Médard, “Privacy-utility tradeoff and privacy funnel,” unpublished preprint, https://www.mit.edu/~salmansa/files/privacy_TIFS.pdf, 2020.
  • [9] B. Jiang, M. Seif, R. Tandon, and M. Li, “Context-aware local information privacy,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 3694–3708, 2021.
  • [10] H. Hsu, S. Asoodeh, and F. P. Calmon, “Information-theoretic privacy watchdogs,” in 2019 IEEE International Symposium on Information Theory (ISIT), pp. 552–556, IEEE, 2019.
  • [11] P. Sadeghi, N. Ding, and T. Rakotoarivelo, “On properties and optimization of information-theoretic privacy watchdog,” in IEEE Inf. Theory Workshop (ITW), pp. 1–5, Apr. 2020.
  • [12] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?,” SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011.
  • [13] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Proc. IEEE 54th Annu. Symp. Found. Comput. Sci., pp. 429–438, 2013.
  • [14] D. Clark, S. Hunt, and P. Malacaria, “Quantitative information flow, relations and polymorphic types.,” J. Log. Comput., vol. 15, pp. 181–199, 01 2005.
  • [15] S. L. Garfinkel, J. M. Abowd, and S. Powazek, “Issues encountered deploying differential privacy,” in Proceedings of the 2018 Workshop on Privacy in the Electronic Society, pp. 133–137, 2018.
  • [16] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, The Science of Quantitative Information Flow. Springer, 2020.
  • [17] M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, “Measuring information leakage using generalized gain functions,” in IEEE CSF, pp. 265–279, June 2012.
  • [18] S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage,” arXiv preprint arXiv:2205.04935, 2022.
  • [19] A. McIver, C. Morgan, G. Smith, B. Espinoza, and L. Meinicke, “Abstract channels and their robust information-leakage ordering,” in Principles of Security and Trust - Third International Conference, POST 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014, Proceedings (M. Abadi and S. Kremer, eds.), vol. 8414 of Lecture Notes in Computer Science, pp. 83–102, Springer, 2014.
  • [20] K. Chatzikokolakis, N. Fernandes, and C. Palamidessi, “Comparing systems: Max-case refinement orders and application to differential privacy,” in Proc. CSF, IEEE Press, 2019.
  • [21] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Axioms for information leakage,” in 2016 IEEE 29th Computer Security Foundations Symposium (CSF), pp. 77–92, IEEE, 2016.
  • [22] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Additive and Multiplicative Notions of Leakage, and Their Capacities,” in IEEE CSF, pp. 308–322, IEEE, 2014.
  • [23] M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: Differential privacy for location-based systems,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pp. 901–914, 2013.
  • [24] Y. Wang, Y. O. Basciftci, and P. Ishwar, “Privacy-utility tradeoffs under constrained data release mechanisms,” arXiv preprint arXiv:1710.09295, 2017.
  • [25] P. Cuff and L. Yu, “Differential privacy as a mutual information constraint,” in Proc. ACM CCS, pp. 43–54, 2016.
  • [26] R. Sibson, “Information radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969.
  • [27] M. S. Alvim, M. E. Andrés, K. Chatzikokolakis, and C. Palamidessi, “On the relation between differential privacy and quantitative information flow,” in Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part II, pp. 60–76, 2011.

Appendix A Motivating Example from the Introduction

In the introduction we motivated our work with the example of the following two channels:

\begin{array}{|c|ccc|}
\hline
G & y & m & n\\
\hline
y & \nicefrac{2}{3} & \nicefrac{1}{6} & \nicefrac{1}{6}\\
m & \nicefrac{1}{3} & \nicefrac{1}{3} & \nicefrac{1}{3}\\
n & \nicefrac{1}{6} & \nicefrac{1}{6} & \nicefrac{2}{3}\\
\hline
\end{array}
\quad
\begin{array}{|c|ccc|}
\hline
R & y & m & n\\
\hline
y & \nicefrac{3}{5} & \nicefrac{1}{5} & \nicefrac{1}{5}\\
m & \nicefrac{1}{5} & \nicefrac{3}{5} & \nicefrac{1}{5}\\
n & \nicefrac{1}{5} & \nicefrac{1}{5} & \nicefrac{3}{5}\\
\hline
\end{array}

Using a uniform prior π=[1/3,1/3,1/3]\pi=[\nicefrac{{1}}{{3}},\nicefrac{{1}}{{3}},\nicefrac{{1}}{{3}}], we can compute the set of posteriors using Bayes rule by multiplying out the channel by the prior and then normalising down columns. This yields the following posteriors and their corresponding marginals:

[\pi\triangleright G] = \begin{array}{|c|c|c|}
\hline
\delta^{y} & \delta^{m} & \delta^{n}\\
\hline
\nicefrac{4}{7} & \nicefrac{1}{4} & \nicefrac{1}{7}\\
\nicefrac{2}{7} & \nicefrac{1}{2} & \nicefrac{2}{7}\\
\nicefrac{1}{7} & \nicefrac{1}{4} & \nicefrac{4}{7}\\
\hline
\end{array},
\qquad
p(y) = \begin{bmatrix}\nicefrac{7}{18} & \nicefrac{2}{9} & \nicefrac{7}{18}\end{bmatrix}

[\pi\triangleright R] = \begin{array}{|c|c|c|}
\hline
\delta^{y} & \delta^{m} & \delta^{n}\\
\hline
\nicefrac{3}{5} & \nicefrac{1}{5} & \nicefrac{1}{5}\\
\nicefrac{1}{5} & \nicefrac{3}{5} & \nicefrac{1}{5}\\
\nicefrac{1}{5} & \nicefrac{1}{5} & \nicefrac{3}{5}\\
\hline
\end{array},
\qquad
p(y) = \begin{bmatrix}\nicefrac{1}{3} & \nicefrac{1}{3} & \nicefrac{1}{3}\end{bmatrix}
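These posteriors and marginals can be reproduced mechanically. The sketch below (Python with numpy, an illustration only) forms the joint matrix with entries \pi_{x}C_{x,y}, sums its columns to obtain the marginals p(y), and normalises each column to obtain the posteriors \delta^{y}.

import numpy as np

pi = np.array([1/3, 1/3, 1/3])
G = np.array([[2/3, 1/6, 1/6],
              [1/3, 1/3, 1/3],
              [1/6, 1/6, 2/3]])
R = np.array([[3/5, 1/5, 1/5],
              [1/5, 3/5, 1/5],
              [1/5, 1/5, 3/5]])

for name, C in [("G", G), ("R", R)]:
    joint = pi[:, None] * C        # joint distribution over (x, y)
    p_y = joint.sum(axis=0)        # marginal on observations
    posteriors = joint / p_y       # column y is the posterior delta^y
    print(name, p_y, posteriors, sep="\n")
# For G: p_y = [7/18, 2/9, 7/18], posterior columns [4/7,2/7,1/7], [1/4,1/2,1/4], [1/7,2/7,4/7]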

We can now compute the expected posterior gain to an adversary who wishes to guess the secret in one try. This adversary can be modelled using the gain function gid defined as:

\textit{gid}(w,x) = \begin{cases}1 & \textit{if }w=x,\\ 0 & \textit{otherwise}\end{cases} \qquad (19)

We first compute the prior vulnerability for this adversary, which measures the adversary’s expected gain before observing any output from the channel. This is computed as:

V_{\textit{gid}}(\pi) ~=~ \max_{w}\sum_{x}\pi_{x}\,\textit{gid}(w,x) ~=~ \max_{x}\pi_{x} ~=~ \nicefrac{1}{3}

The posterior vulnerability determines the expected gain to the adversary after making an observation, and therefore depends on the channel. The posterior gid-vulnerability for GG is then:

V_{\textit{gid}}[\pi\triangleright G] ~=~ \sum_{y}p(y)\,V_{\textit{gid}}(\delta^{y}) ~=~ \nicefrac{7}{18}\times\nicefrac{4}{7} + \nicefrac{2}{9}\times\nicefrac{1}{2} + \nicefrac{7}{18}\times\nicefrac{4}{7} ~=~ \nicefrac{5}{9} ~\approx~ 0.56

And for RR we have:

V_{\textit{gid}}[\pi\triangleright R] ~=~ \sum_{y}p(y)\,V_{\textit{gid}}(\delta^{y}) ~=~ \nicefrac{1}{3}\times\nicefrac{3}{5}\times 3 ~=~ \nicefrac{3}{5} ~=~ 0.6

The multiplicative leakage is given by the ratio of the posterior to the prior vulnerabilities. For GG we have:

\mathcal{L}^{\times}_{\textit{gid}}(\pi,G) = \frac{V_{\textit{gid}}[\pi\triangleright G]}{V_{\textit{gid}}(\pi)} = \frac{5/9}{1/3} = \frac{5}{3} \approx 1.67

And for RR we compute:

\mathcal{L}^{\times}_{\textit{gid}}(\pi,R) = \frac{V_{\textit{gid}}[\pi\triangleright R]}{V_{\textit{gid}}(\pi)} = \frac{9}{5} = 1.80

And thus RR leaks more than GG to this adversary.
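These calculations can be scripted directly. The following sketch (an illustration only) reproduces the multiplicative gid-leakages of 1.67 for G and 1.80 for R, using the fact that for gid the vulnerability of a distribution is simply its largest probability.

import numpy as np

pi = np.array([1/3, 1/3, 1/3])
channels = {
    "G": np.array([[2/3, 1/6, 1/6], [1/3, 1/3, 1/3], [1/6, 1/6, 2/3]]),
    "R": np.array([[3/5, 1/5, 1/5], [1/5, 3/5, 1/5], [1/5, 1/5, 3/5]]),
}

prior_vuln = pi.max()                                 # V_gid(pi) = 1/3
for name, C in channels.items():
    joint = pi[:, None] * C
    p_y = joint.sum(axis=0)
    posteriors = joint / p_y
    post_vuln = (p_y * posteriors.max(axis=0)).sum()  # V_gid[pi > C]
    print(name, post_vuln / prior_vuln)               # G: ~1.67, R: 1.80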

Next we compare the max-case leakage using the same gain function gid and the same uniform prior π\pi. Note that the prior max-case gidgid-vulnerability is the same (1/3\nicefrac{{1}}{{3}}). For the posterior vulnerability for GG we compute:

V^{\mathrm{max}}_{\textit{gid}}[\pi\triangleright G] ~=~ \max_{y}V^{\mathrm{max}}_{\textit{gid}}(\delta^{y}) ~=~ \max\{\nicefrac{4}{7},\,\nicefrac{1}{2},\,\nicefrac{4}{7}\} ~=~ \nicefrac{4}{7} ~\approx~ 0.57

And for RR we have:

V^{\mathrm{max}}_{\textit{gid}}[\pi\triangleright R] ~=~ \max_{y}V^{\mathrm{max}}_{\textit{gid}}(\delta^{y}) ~=~ \nicefrac{3}{5} ~=~ 0.6

And so the corresponding multiplicative leakage for GG is:

\mathcal{L}^{\mathrm{max}}_{\textit{gid}}(\pi,G) = \frac{V^{\mathrm{max}}_{\textit{gid}}[\pi\triangleright G]}{V_{\textit{gid}}(\pi)} = \frac{4/7}{1/3} = \frac{12}{7} \approx 1.71

And for RR it is:

\mathcal{L}^{\mathrm{max}}_{\textit{gid}}(\pi,R) = \frac{V^{\mathrm{max}}_{\textit{gid}}[\pi\triangleright R]}{V_{\textit{gid}}(\pi)} = \frac{9}{5} = 1.80

And so again we find that RR leaks more than GG for this adversary, now modelled using a max-case vulnerability function.
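The max-case figures are obtained by replacing the expectation over observations with a maximum over posteriors; a corresponding sketch (again an illustration only) follows.

import numpy as np

pi = np.array([1/3, 1/3, 1/3])
channels = {
    "G": np.array([[2/3, 1/6, 1/6], [1/3, 1/3, 1/3], [1/6, 1/6, 2/3]]),
    "R": np.array([[3/5, 1/5, 1/5], [1/5, 3/5, 1/5], [1/5, 1/5, 3/5]]),
}

for name, C in channels.items():
    joint = pi[:, None] * C
    posteriors = joint / joint.sum(axis=0)
    max_post_vuln = posteriors.max()          # max_y V_gid(delta^y)
    print(name, max_post_vuln / pi.max())     # G: 12/7 ~ 1.71, R: 9/5 = 1.80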

Appendix B Max-case gg-leakage using average-case gg-vulnerability

In this section, we complete the proof of Lem 4, showing the construction of a gain function gg^{*} which produces the same average-case leakage as the max-case leakage defined using gg. That is, we will prove that for any gain function gg it is possible to construct a gain function gg^{*} such that Vgmax(π)=Vg(π)V^{\mathrm{max}}_{g}(\pi)=V_{g^{*}}(\pi) where Vg(π)V_{g^{*}}(\pi) is defined in the usual way as:

V_{g^{*}}(\pi) ~:=~ \max_{w}\sum_{x}\pi_{x}\,g^{*}(w,x)~, \qquad (20)

and Vgmax(π)V^{\mathrm{max}}_{g}(\pi) is defined as

V^{\mathrm{max}}_{g}(\pi) ~:=~ \max_{w,x}\pi_{x}\,g(w,x)~. \qquad (21)

Note that our construction assumes that the domain 𝒳\mathcal{X} is discrete, although the proof does not rely on this assumption.

For g:𝒲×𝒳0g{:}\mathcal{W}{\times}\mathcal{X}\rightarrow\mathbb{R}_{\geq 0} we can define the set of actions 𝒲{\mathcal{W}}^{*} such that for each w𝒲w\in\mathcal{W} we have a set {wx𝒲:x𝒳}\{w_{x}\in{\mathcal{W}}^{*}:x\in\mathcal{X}\}, and a mapping g:𝒲×𝒳0g^{*}{:}{\mathcal{W}}^{*}{\times}\mathcal{X}\rightarrow\mathbb{R}_{\geq 0} satisfying

g^{*}(w_{x_{i}},x) = \begin{cases}g(w,x), & \textit{if }x=x_{i}\\ 0, & \textit{otherwise.}\end{cases} \qquad (22)

This means that \max_{x_{i}\in\mathcal{X}}\max_{x\in\mathcal{X}}g^{*}(w_{x_{i}},x)=\max_{x\in\mathcal{X}}g(w,x) for each w\in\mathcal{W}. And therefore:

V^{\mathrm{max}}_{g}(\pi)
 = \max_{w\in\mathcal{W},x\in\mathcal{X}}\pi_{x}\,g(w,x)   “From Eqn (21)”
 = \max_{w\in\mathcal{W}^{*},x\in\mathcal{X}}\pi_{x}\,g^{*}(w,x)   “From Eqn (22)”
 = \max_{w\in\mathcal{W}^{*}}\sum_{x}\pi_{x}\,g^{*}(w,x)   “Since g^{*}(w,x)=0 everywhere else”
 = V_{g^{*}}(\pi)   “Eqn (20)”

And this gives that gmax(π,C)=maxyVgmax(δy)Vgmax(π)=maxyVg(δy)Vg(π)\mathcal{L}^{\mathrm{max}}_{g}(\pi,C)=\frac{\max_{y}V^{\mathrm{max}}_{g}(\delta^{y})}{V^{\mathrm{max}}_{g}(\pi)}=\frac{\max_{y}V_{g^{*}}(\delta^{y})}{V_{g^{*}}(\pi)}.
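The construction of g^{*} is easy to mechanise. The following sketch (an illustration only, for finite \mathcal{W} and \mathcal{X} and a randomly generated non-negative gain function) builds g^{*} as in Eqn (22) and checks that V^{\mathrm{max}}_{g}(\pi)=V_{g^{*}}(\pi), as claimed by Lem 4.

import numpy as np
rng = np.random.default_rng(2)

nW, nX = 4, 3
g = rng.random((nW, nX))             # original gain function, g[w, x] >= 0
pi = rng.dirichlet(np.ones(nX))      # a prior on X

# g*: one action w_x for each pair (w, x); it gains g(w, x) on x and 0 elsewhere (Eqn (22))
g_star = np.zeros((nW * nX, nX))
for w in range(nW):
    for x in range(nX):
        g_star[w * nX + x, x] = g[w, x]

V_max = (pi * g).max()               # V_g^max(pi) = max_{w,x} pi_x g(w, x)    (Eqn (21))
V_star = (g_star @ pi).max()         # V_{g*}(pi)  = max_w sum_x pi_x g*(w, x) (Eqn (20))
print(V_max, V_star, np.isclose(V_max, V_star))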