Explaining epsilon in local differential privacy through the lens of quantitative information flow
Abstract
The study of leakage measures for privacy has been a subject of intensive research and is an important aspect of understanding how privacy leaks occur in computer systems. Differential privacy has been a focal point in the privacy community for some years, and yet its leakage characteristics are not completely understood. In this paper we bring together two areas of research, information theory and the g-leakage framework of quantitative information flow (QIF), to give an operational interpretation for the epsilon parameter of local differential privacy. We find that epsilon emerges as a capacity measure in both frameworks: via (log)-lift, a popular measure in information theory, and via max-case g-leakage, which we introduce to describe the leakage of any system to Bayesian adversaries modelled using "worst-case" assumptions under the QIF framework. Our characterisation resolves an important question about the interpretability of epsilon and consolidates a number of disparate results spanning the literature of both information theory and quantitative information flow.
Index Terms:
Differential privacy, log-lift, information leakage, g-leakage, quantitative information flow
I Introduction
Over the past two decades, characterising and limiting privacy leakage in data sharing systems has been the subject of intensive research. Information-theoretic privacy [1, 2], differential privacy [3] and quantitative information flow (QIF) [4] are three main frameworks that have been developed and studied for this purpose.
Information-theoretic privacy is concerned with measuring privacy leakage using well-known information-theoretic quantities, such as Shannon mutual information [5]:

$$I(X;Y) \;=\; \sum_{x,y} P_{XY}(x,y)\,\log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)},$$

where $X$ denotes a secret and $Y$ denotes the observable, with $P_{XY}$ and $P_X, P_Y$ denoting the joint and marginal distributions of $X$ and $Y$, respectively. The channel is characterised by the conditional probability distribution $P_{Y|X}$ and is assumed fixed by the data sharing system and publicly known. For any prior $P_X$, a joint distribution $P_{XY}$ and a marginal distribution $P_Y$ are induced by $P_{Y|X}$.
Papers such as [4, 6] discuss some of the limitations of Shannon mutual information in properly quantifying or differentiating various adversarial threats to privacy. Sibson and Arimoto mutual information of order $\alpha$ have recently been proposed to measure a spectrum of generalised threats to privacy [7]. It also turns out that the "nucleus" of Shannon mutual information, namely the information density variable
$$i(x;y) \;=\; \log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)},$$
can represent a much stronger notion of privacy leakage than its average. We call its exponential, $P_{XY}(x,y)/(P_X(x)P_Y(y))$, the lift variable. (Accordingly, we may refer to information density as log-lift.) Information density and lift have been studied, albeit under various names, in works such as [1, 8, 9, 10, 11].
Differential privacy (DP) is a well-known privacy framework [3] describing a worst-case attack scenario, in which an attacker with knowledge of all individuals but one in a dataset learns little information about the unknown individual upon an information release. Specifically, for a given $\epsilon \geq 0$, any mechanism $\mathcal{M}$, any two neighbouring datasets $D, D'$ differing in one individual and any observation $y$, DP requires

$$P[\mathcal{M}(D) = y] \;\leq\; e^{\epsilon}\, P[\mathcal{M}(D') = y]. \qquad (1)$$
This definition is sometimes called the central model for differential privacy, referring to its dependence on a trusted centralised data curator. Alternatively, in the local model for differential privacy [12, 13], each individual first applies noise to their data before sending it to an untrusted data curator, who releases some aggregate information about the (noisy) data. Local DP is more stringent than DP as it requires the inequality above to be satisfied for all pairs of input values, not just neighbouring datasets. In other words, local $\epsilon$-DP implies central $\epsilon$-DP, but not vice versa.
Finally, quantitative information flow (QIF) is an information-theoretic framework for measuring leakage from (probabilistic) systems modelled as information-theoretic channels. (The early work introducing quantitative information flow was based on Shannon entropy to measure flows of information [14]; the QIF used here refers to later work initiated by Smith [4].) In QIF, leakages are derived from a Bayesian adversary model incorporating a gain function describing the adversary's goal, and the adversary's prior knowledge. The resulting leakage, computed as the expected gain in the adversary's knowledge, thus provides a meaningful, interpretable measure of the security risk to a system from that adversary. For example, the Bayes vulnerability measure, widely used in QIF, corresponds to the maximum probability that an adversary can guess the secret in one try; the Bayes leakage measure is the difference between the adversary's expected probability of guessing the secret after making an observation from the system versus their probability before making an observation (i.e., using only their prior knowledge).
I-A Explaining epsilon: a motivating example
An important development in both the information theory and security communities has been an interest in operationally relevant leakage measures to provide information about the relative security of different probabilistic data sharing systems. In this context, operational measures are ones which provide interpretable guarantees in terms of an adversary's gain or loss upon interacting with the system. The value produced by such a leakage measure therefore has a correspondence with some measurable information gain to the adversary, justifying its use as a security measure. For both of the above communities, the $\epsilon$ of differential privacy still lacks a robust operational meaning, despite the privacy community's explanation in terms of the differential privacy attacker who knows all-but-one secret in the dataset. The following example illustrates this more clearly.
Consider a local differential privacy scenario in which a user wishes to obfuscate their own sensitive data before releasing it to an untrusted data curator. The user is given a choice of two mechanisms to use. The scenario is depicted by the two mechanisms below, in which, for example, the user is answering a question in a survey, and the values {y, m, n} correspond to the answers 'yes', 'maybe' and 'no', respectively. Note that we describe our mechanisms as 'channels', in which the row labels correspond to secrets and the column labels correspond to observations. The value in row $x$, column $y$ represents the conditional probability $C(y|x)$; that is, the probability of observing output $y$ given input $x$. For example, the value in the bottom left of a channel matrix gives the probability of observing the output 'y' given the input 'n'.
From the perspective of differential privacy, the choice of the user's noisy mechanism should be determined by the $\epsilon$ parameter described by the DP equation (1); in the DP literature, the smaller the $\epsilon$, the more privacy is said to be guaranteed to the user.
In the mechanisms described above, notice that each mechanism satisfies $\epsilon$-DP for a different value of $\epsilon$; in fact, $e^{\epsilon}$ corresponds to the maximum ratio between any two values within a column. (The two mechanisms are in fact instances of the well-known Geometric and Randomised-Response mechanisms from DP, respectively.) We would therefore expect that the mechanism with the smaller $\epsilon$ is more private, and so leaks less information to an adversary than the other.
We now ask: for each of these mechanisms, what would be the expected gain to an attacker who is trying to guess the secret in one try? We can model this attack by assuming a uniform prior for the adversary (say), and applying Bayes rule to compute the adversary's posterior knowledge after making an observation from each mechanism. (This is the model used in the QIF framework and is described in more detail in §II-C.) The posterior probability of success for the attacker is given by the maximum probability in each posterior, averaged by the marginal on that posterior (in QIF this is called the posterior Bayes vulnerability). The resulting computation yields that the leakage, computed using the adversary's expected posterior gain, is larger for the mechanism with the smaller $\epsilon$ (see Appendix A for complete details). So, in fact, our attacker learns more from the mechanism whose $\epsilon$ suggests it is safer.
But since $\epsilon$ represents a "max-case" measure and the adversarial scenario above uses the expected gain of the attacker, i.e., an "average-case" measure, we might think that the above anomaly occurs because the measures we are using are incompatible. Let us consider now a max-case scenario: an attacker whose advantage is measured by the maximum probability of guessing the secret in one try regardless of which output was observed. In this scenario, the leakage to the attacker is given by the maximum leakage of any of the posteriors of the channel, independent of the marginal probability on that posterior. In this instance, we find that, again under a uniform prior, the max-case leakage computed using the posterior max-case vulnerability is again larger for the mechanism with the smaller $\epsilon$ (see Appendix A for full details). And so that mechanism also leaks more information to this max-case adversary, even though its $\epsilon$ suggests it should be the safer of the two.
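To make the comparison concrete, the following minimal sketch (in Python with NumPy) computes, for two small channels, the $\epsilon$ of Defn 1 and the average-case and max-case Bayes leakages under a uniform prior. The matrices M1 and M2 and the helper names (ldp_epsilon, bayes_leakage_avg, bayes_leakage_max) are our own hypothetical stand-ins, not the paper's elided mechanisms, but they exhibit the same phenomenon: the channel with the smaller $\epsilon$ has the larger leakage under both measures.

```python
import numpy as np

def ldp_epsilon(C):
    """epsilon of a channel: log of the largest ratio between entries in a column."""
    C = np.asarray(C, dtype=float)
    return float(np.log((C.max(axis=0) / C.min(axis=0)).max()))

def bayes_leakage_avg(C, prior):
    """Multiplicative average-case Bayes leakage: E_y[max_x p(x|y)] / max_x pi_x."""
    joint = prior[:, None] * C            # J[x,y] = pi_x * C[x,y]
    post_vuln = joint.max(axis=0).sum()   # sum over y of max_x J[x,y]
    return post_vuln / prior.max()

def bayes_leakage_max(C, prior):
    """Multiplicative max-case Bayes leakage: max_y max_x p(x|y) / max_x pi_x."""
    joint = prior[:, None] * C
    posteriors = joint / joint.sum(axis=0)   # column y holds the posterior p(x|y)
    return posteriors.max() / prior.max()

# Hypothetical 3x3 channels over secrets {y, m, n}; rows are secrets, columns observations.
M1 = np.array([[0.6, 0.2, 0.2],           # smaller epsilon (ln 3)
               [0.2, 0.6, 0.2],
               [0.2, 0.2, 0.6]])
M2 = np.array([[0.4, 0.3, 0.3],           # larger epsilon (ln 4)
               [0.1, 0.45, 0.45],
               [0.4, 0.3, 0.3]])
uniform = np.full(3, 1 / 3)

for name, C in [("M1", M1), ("M2", M2)]:
    print(name, "eps =", round(ldp_epsilon(C), 3),
          "avg leakage =", round(bayes_leakage_avg(C, uniform), 3),
          "max leakage =", round(bayes_leakage_max(C, uniform), 3))
# M1 has the smaller epsilon yet the larger leakage under both measures.
```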
These examples motivate the question: what does $\epsilon$ tell us about other kinds of attacking threats apart from the one modelled in the DP literature? (The DP literature describes $\epsilon$ as a measure of indistinguishability against an attacker who knows every secret in the dataset except for one.) If $\epsilon$ is indeed a robust measure for privacy, it should be useful for making judgments about many different kinds of privacy threats. One way of assessing robustness from the QIF perspective would be to ask: does $\epsilon$ correspond to a leakage bound for some class of (Bayesian) adversaries? If so, this would give $\epsilon$ a robust operational meaning, allowing its guarantees to be explained in terms of a large number of attacking threats relevant to the security community. For the privacy community, this may assist in providing some explainability for $\epsilon$ in terms of its measurement of leakage. This could be used to guide privacy practitioners in determining a reasonable value for $\epsilon$, which is a problem that has been identified by, for example, the implementers of differential privacy for the US Census Bureau [15].
In this paper, we bring together the information-theoretic, differential privacy and QIF frameworks to show how the measures of lift from information theory and g-leakage from QIF can be used to bring a robust operational meaning, that is, an explanation in terms of a broad class of adversarial attacks, to the $\epsilon$ parameter in (local) differential privacy.
I-B Key open questions and contributions of this paper
Despite a large body of existing work, a number of questions remain open at the intersection of these frameworks.
While $\epsilon$ in both the local and central models of DP is agnostic to the adversarial gain function and prior knowledge, it is not clear whether $\epsilon$ has a robust interpretation in the QIF sense (i.e., in terms of protection against a class of adversaries as highlighted in the example earlier). Such a robust operational interpretation is missing from the DP literature.
In addition, it is still not clear whether the measure of log-lift (in this paper, lift) has operational significance as a measure of privacy leakage, or is capable of robustly quantifying maximum privacy leakage in strong adversarial threat models via the QIF framework.
Last, the current theory of max-case leakage in QIF leaves open the question of how best to model the max-case adversarial threats or how to characterise max-case capacity over all priors and a wide class of adversarial threats.
The following contributions of this paper address these questions.
Main contributions:
1. We propose a new set of measures: max-case g-leakages (Defn 3), which are a subset of the general max-case measures for QIF that have been extensively studied in [16]. The max-case g-leakages have a clear operational significance: they correspond to adversaries whose goal is to maximise their gain over any output. Note that g-leakages model information-theoretic Bayesian adversaries and therefore represent a very general threat model.
2. We prove the first robust and operationally meaningful interpretation of $\epsilon$ in local DP via the proposed max-case g-leakage framework. Specifically, we show that $e^{\epsilon}$ is exactly the max-case g-leakage capacity, which measures the maximum leakage of a system wrt any Bayesian attack modelled using either average-case or max-case g-leakage measures, and is robustly independent of adversarial gain functions and priors.
3. We introduce the lift capacity (Defn 4), which quantifies lift over all priors, and show that it is equivalent to $e^{\epsilon}$ in the local DP framework. This provides the missing connection between lift and $\epsilon$, which has been a subject of interest in the information theory literature.
We remark that although one conclusion we draw – that the $\epsilon$ of the DP framework is robustly independent of the adversary – may seem expected or unsurprising, the method developed in this paper of obtaining the result via the QIF framework is non-trivial.
I-C Detail of technical contributions
Technically speaking, the first contribution of this work is establishing a link between lift in information-theoretic privacy and a notion of max-case g-leakage, which we introduce in this paper via the QIF framework. Specifically, in Defn 3 we introduce a max-case multiplicative g-leakage, using the standard notion of g-vulnerability in the QIF framework. We then show via Thm 2 that lift upper bounds the max-case g-leakage with respect to any gain function for any given prior. In Thm 1, we show that lift is indeed realisable as a max-case g-leakage for an adversary who chooses a suitable prior-dependent gain function.
In addition, we establish an important result linking all three frameworks (information-theoretic privacy, local differential privacy and QIF) in an operationally robust manner. Specifically, in the information-theoretic privacy framework, we define in Defn 4 the supremum of lift over all priors for a channel as the lift capacity. We then show in Thm 3 and Cor 2 that the lift capacity is equivalent to $e^{\epsilon}$ in the local DP framework. Combining this result with Thm 1 and Thm 2, we conclude that the lift capacity, aka $e^{\epsilon}$, is also equal to the max-case g-leakage capacity in the QIF framework. This gives lift a robust operational interpretation in terms of strong max-case adversarial threats and explains $\epsilon$ in local DP as a capacity measure through the lens of information theory and QIF.
Finally, we address a question raised in [6] as to whether the g-leakage framework is sufficient to describe adversarial scenarios in which the attacker is trying to guess a randomised function of the secret. In fact the Dalenius leakage described in [16, Ch 10] already addresses this in the case of average-case g-leakage, demonstrating that g-leakage is unchanged when considering randomised functions of the secret. We extend these results to the cases of lift and lift capacity, showing that, as with the Bayes capacity, both measures are unaffected under such adversarial scenarios.
Table I visualises existing results in the literature and our contributions. As seen from the table, this paper bridges several results in the information-theoretic privacy, differential privacy and QIF literature and establishes new relations between lift and other measures of privacy. Our new results, together with our consolidated summary of existing results, depict a fuller picture of the deep connections between the information-theoretic, differential privacy and QIF frameworks.
I-D Operational measures and their significance in information theory and security
A key motivating example for the importance of operational security measures came from Geoffrey Smith [4], who observed that traditional information-theoretic measures such as mutual information can underestimate the threat to a system. This work led to the study of g-leakages, which directly model attacking threats and provide leakage measures which can be interpreted in terms of a concrete information gain to an attacker. The relevance of such leakage measures to the security community is that they are explainable in terms of explicit adversarial threats, and therefore give meaningful security assessments. The study of g-leakages now sits under the QIF umbrella [17, 16].
In parallel to the above, recent work from the information theory community has identified the importance of operational measures for information leakage [6]. Of particular interest is the problem of providing an operational interpretation to privacy measures such as $\epsilon$ (in differential privacy) and log-lift (from information theory). Key work in this area by Issa et al. [6] has independently discovered several results overlapping with QIF, which will be outlined later in this paper. Importantly, both communities use an information-theoretic and decision-theoretic (Bayesian) approach to modelling adversarial threats. Therefore, although in this paper we use the notation and framework of quantitative information flow, the results will be relevant and interpretable for the information theory community as well.
II Information-Theoretic Foundations for Privacy
II-A The channel model for quantitative information flow
We adopt notation from QIF [16], a mathematical framework for studying and quantifying information leaks with respect to adversarial attack scenarios.
A probabilistic channel $C$ maps inputs (secrets) in $\mathcal{X}$ to observations in $\mathcal{Y}$ according to a conditional probability distribution. In the discrete case, such channels are matrices whose row-$x$, column-$y$ element $C_{x,y}$ is the probability that input $x$ produces observation $y$. The $x$-th row is thus a discrete distribution in $\mathbb{D}\mathcal{Y}$. We write $C : \mathcal{X} \to \mathbb{D}\mathcal{Y}$ for the type of the channel $C$.
We can use Bayes rule to model an adversary who uses their observations from a channel to (optimally) update their knowledge about the secrets in $\mathcal{X}$. Given a prior distribution $\pi$ (representing an adversary's prior knowledge) and channel $C$, we can compute a joint distribution $J$ where $J_{x,y} = \pi_x C_{x,y}$. Marginalising down columns yields the $\mathcal{Y}$-marginals $p(y) = \sum_x \pi_x C_{x,y}$, each having a posterior over $\mathcal{X}$ corresponding to the posterior probabilities $p(x|y)$, computed as $\pi_x C_{x,y}/p(y)$ (when $p(y)$ is non-zero). We denote by $p_{X|y}$ the posterior distribution corresponding to the observation $y$. The set of posterior distributions and the corresponding marginals can be used to compute the adversary's posterior knowledge after making an observation from the channel.
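As a quick illustration of this construction, the following sketch computes the joint, the $\mathcal{Y}$-marginals and the posteriors from a prior and a channel; the prior and channel values below are arbitrary and chosen purely for illustration.

```python
import numpy as np

def posteriors(prior, C):
    """Return (marginal p(y), posterior columns p(x|y)) for prior pi and channel C."""
    C = np.asarray(C, dtype=float)
    joint = prior[:, None] * C        # J[x,y] = pi_x * C[x,y]
    p_y = joint.sum(axis=0)           # marginal on observations
    post = joint / p_y                # column y holds the posterior p(x|y)
    return p_y, post

# Arbitrary illustrative prior and channel (rows of C sum to 1).
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
p_y, post = posteriors(pi, C)
print("p(y)       =", p_y)            # [0.49, 0.51]
print("posteriors =\n", post)         # each column sums to 1
```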
II-B Local differential privacy and lift
Local differential privacy (LDP), as applied by an individual, can be defined as a property of a channel $C$ taking data in $\mathcal{X}$ to noisy outputs in $\mathcal{Y}$.
Definition 1.
We say that channel $C$ satisfies $\epsilon$-LDP if $C_{x,y} \;\leq\; e^{\epsilon}\, C_{x',y}$ for all $x, x' \in \mathcal{X}$ and all $y \in \mathcal{Y}$.
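For a concrete reading of Defn 1, the small sketch below computes the smallest $\epsilon$ for which a given channel matrix satisfies $\epsilon$-LDP, namely the log of the largest ratio between any two entries in the same column; the matrix is a hypothetical example.

```python
import numpy as np

def ldp_epsilon(C):
    """Smallest epsilon such that C[x,y] <= exp(eps) * C[x',y] for all x, x', y."""
    C = np.asarray(C, dtype=float)
    ratios = C.max(axis=0) / C.min(axis=0)   # worst-case ratio per column
    return float(np.log(ratios.max()))

C = np.array([[0.7, 0.3],     # hypothetical channel; rows are secrets
              [0.4, 0.6],
              [0.1, 0.9]])
print(ldp_epsilon(C))          # log(0.7/0.1) = log 7
```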
A central quantity in this paper, which we call lift, was defined in [6] as:
Definition 2.
Given a channel $C$ and full-support prior $\pi$, the lift is defined as

$$\ell(\pi, C) \;=\; \max_{x \in \mathcal{X},\, y \in \mathcal{Y}} \frac{C_{x,y}}{p(y)}, \qquad (2)$$

where $p(y) = \sum_{x'} \pi_{x'} C_{x',y}$.
Remark: Two alternative expressions for the lift are
$$\ell(\pi, C) \;=\; \max_{x,y} \frac{p(x|y)}{\pi_x}, \qquad (3)$$
$$\ell(\pi, C) \;=\; \max_{x,y} \frac{p(x,y)}{\pi_x\, p(y)}. \qquad (4)$$
Notably, the intuition behind lift as expressed in Eqn. (3) is that it measures the adversary’s change in knowledge, through (multiplicative) comparison of her prior and posterior beliefs for each secret and observation. The observation providing the biggest “knowledge gap” or lift thereby produces the most leakage.
Note that the argument of the maximisation in Eqn. (4) is indeed
$$\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)}$$
in plain probability notation as described in §I, and its logarithm is known as the information density. Here, we provide a brief account of some works which have used such quantities in defining privacy measures.
In [9], a mechanism is said to provide $\epsilon$-local information privacy (LIP) if

$$e^{-\epsilon} \;\leq\; \frac{p(x|y)}{\pi_x} \;\leq\; e^{\epsilon} \quad \text{for all } x \in \mathcal{X},\, y \in \mathcal{Y}. \qquad (5)$$
A similar definition was earlier studied in [1] for sensitive features of datasets. In [10, 11], $\epsilon$-lift was said to be satisfied if the logarithm of the above inequalities held true for a sensitive variable $S$ and useful variable $X$ according to the Markov chain $S - X - Y$.
An important distinction between our definition of lift and the above works is that they imposed an additional lower bound, $e^{-\epsilon}$, on the ratio of the posterior to prior beliefs. In this work, by contrast, we are only concerned with the largest lift ratio (i.e., the upper bound), which we simply call lift (removing the logarithm). Our definition captures the notion of maximum realisable leakage [6], which is proved in [6, Theorem 13] to be equal to the lift as stated in our Definition 2. This also coincides with the notion of almost-sure pointwise maximal leakage in [18, Definition 4].
In [9, 1, 10, 8], it is proven that Eqn (5) implies $2\epsilon$-LDP; this bound can be improved through optimisation as shown in [9]. However, we highlight that Eqn (5) and the aforementioned relations to LDP depend on the prior $\pi$. In §III-B, we will establish robust results strongly and directly linking a prior-independent notion of lift capacity to $\epsilon$-LDP. The reverse direction, linking $\epsilon$-LDP to Eqn (5), is already strong: it has been shown in works such as [1, 8, 9] that $\epsilon$-LDP implies Eqn (5).
Example 1 (Running Example).
We will use the following example throughout to show how to compute our various measures. Consider the following channel :
The rows represent the (secret) eye colour of an individual (blue, green or blue-green) and the column labels represent the eye colour that the individual reports (blue or green). The channel then tells us, for example, that with probability the individual will report eye colour blue when they in fact have eye colour blue-green; however they will report eye colour blue with probability when they have eye colour green.
Let us imagine that our adversary (trying to guess the secret eye colour) has some prior knowledge given by . We then compute the adversary’s posterior knowledge given the prior and the channel as the set of posterior distributions:
with corresponding marginals , .
The $\epsilon$ for this channel (Defn 1) is computed using the maximum ratio between elements in any column; we have
The lift, in contrast, requires a prior in order to be computed (Defn 2). Given the prior from above, we compute the lift as:
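Since the matrix and prior of Ex. 1 are not reproduced here, the sketch below uses a hypothetical 3-secret, 2-observation channel and prior of the same shape to show how the lift of Defn 2 would be computed, alongside the channel's $e^{\epsilon}$; the lift never exceeds $e^{\epsilon}$, a fact we return to in §III-B.

```python
import numpy as np

def lift(prior, C):
    """Lift of Defn 2: max over x, y of C[x,y] / p(y), with p(y) = sum_x pi_x C[x,y]."""
    C = np.asarray(C, dtype=float)
    p_y = prior @ C                          # marginal on observations
    return float((C / p_y).max())

def ldp_epsilon(C):
    C = np.asarray(C, dtype=float)
    return float(np.log((C.max(axis=0) / C.min(axis=0)).max()))

# Hypothetical stand-ins for the channel and prior of Ex. 1 (3 eye colours, 2 reports).
pi = np.array([0.4, 0.4, 0.2])
C = np.array([[0.8, 0.2],
              [0.3, 0.7],
              [0.5, 0.5]])
print("lift     =", lift(pi, C))
print("exp(eps) =", np.exp(ldp_epsilon(C)))   # an upper bound on the lift for any prior
```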
II-C Operational scenarios and the g-leakage framework
An important development over the past decade in security has been the use of operationally relevant leakage measures to provide information about the relative security of different probabilistic systems. Operational measures – those which have a direct correspondence with an adversarial scenario – are crucial to a proper understanding of the security properties of a system. Foundational work in this area by Geoffrey Smith [4] has led to the study of g-leakage under the umbrella of QIF [17, 16].
In QIF, adversaries are modelled as Bayesian: they are equipped with a prior $\pi$ over secrets $\mathcal{X}$ and a gain function $g$ (over actions $\mathcal{A}$ and secrets $\mathcal{X}$), which models the gain $g(a,x)$ to the adversary upon taking action $a$ when the true value of the secret is $x$. Before observing an output from the system, the adversary's prior expected gain, which we call the expected prior vulnerability of the secret, can be expressed as

$$V_g(\pi) \;=\; \max_{a \in \mathcal{A}} \sum_{x \in \mathcal{X}} \pi_x\, g(a,x). \qquad (6)$$
The adversary can use their knowledge of the system (modelled as a channel $C$) to maximise their expected gain after making an observation. This is called the expected posterior vulnerability and is expressed as

$$V_g(\pi, C) \;=\; \sum_y \max_{a \in \mathcal{A}} \sum_x \pi_x\, C_{x,y}\, g(a,x) \qquad (7)$$
$$\;=\; \sum_y p(y) \max_{a \in \mathcal{A}} \sum_x p(x|y)\, g(a,x) \qquad (8)$$
$$\;=\; \sum_y p(y)\, V_g(p_{X|y}), \qquad (9)$$
where in the last equality we have used the definition of vulnerability in Eqn. (6) for the posterior distribution, denoted $p_{X|y}$. Finally, the difference between the prior and posterior vulnerabilities gives a measure of the leakage of the system to this adversary; the greater the difference, the better the adversary is able to use the transmitted information to infer the value of the secret. This can be computed multiplicatively as

$$\mathcal{L}_g(\pi, C) \;=\; \frac{V_g(\pi, C)}{V_g(\pi)}. \qquad (10)$$
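A minimal sketch of Eqns (6)-(10), with the gain function represented as an action-by-secret matrix; the particular gain function, prior and channel below are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

def prior_vuln(g, prior):
    """V_g(pi) = max_a sum_x pi_x g(a,x); rows of g are actions, columns are secrets."""
    return float((g @ prior).max())

def posterior_vuln(g, prior, C):
    """V_g(pi, C) = sum_y max_a sum_x pi_x C[x,y] g(a,x)  (Eqn (7))."""
    joint = prior[:, None] * C               # J[x,y]
    return float((g @ joint).max(axis=0).sum())

def mult_leakage(g, prior, C):
    """Multiplicative g-leakage, Eqn (10)."""
    return posterior_vuln(g, prior, C) / prior_vuln(g, prior)

# Illustrative example: the one-try guessing adversary on three secrets.
gid = np.eye(3)                               # gain 1 for a correct guess, 0 otherwise
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
print(mult_leakage(gid, pi, C))               # (0.35 + 0.18) / 0.5 = 1.06
```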
The g-leakage framework models a wide variety of attack scenarios, including guessing the secret in $k$ tries, guessing a property of the secret or guessing a value close to the secret. (In fact, it has been shown that any convex vulnerability function is expressible as a g-vulnerability [19].) Moreover, attached to each leakage measure is an operational scenario, given by the gain function and prior, which describes a specific adversarial threat.
An important notion in QIF is that of capacity, which corresponds to the maximum leakage of a system quantified over all priors, or all gain functions, or both. Capacities thus provide robust, interpretable leakage bounds; robust because they quantify over large classes of adversaries and therefore are “robust” to variations in adversarial assumptions; interpretable because the bounds provide security guarantees which can be explained in terms of the particulars of adversarial attacks.
Of particular note is the Bayes capacity, defined as

$$\mathcal{ML}(C) \;=\; \sup_{\pi}\ \sup_{g}\ \mathcal{L}_g(\pi, C) \qquad (11)$$
$$\;=\; \sup_{\pi}\ \frac{\sum_y \max_x \pi_x\, C_{x,y}}{\max_x \pi_x} \qquad (12)$$
$$\;=\; \sum_y \max_x C_{x,y}, \qquad (13)$$
where the first equality is proved in [17] and the last equality is due to the fact that the capacity is realised under the uniform prior [17, “Miracle Theorem”]. The Bayes capacity has an important operational significance: it is a tight upper bound on the multiplicative leakage of any channel in the average sense, quantified over all priors and gain functions [16]. In other words, there is no adversarial scenario, modelled using the expected gain of an adversary with any prior knowledge, for which the channel leakage exceeds the amount given by the Bayes capacity. This provides a robust leakage bound on the security risk to a system against any average-case attacker.
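In channel-matrix terms, Eqn (13) makes the Bayes capacity straightforward to evaluate; a small sketch on a hypothetical matrix:

```python
import numpy as np

def bayes_capacity(C):
    """Multiplicative Bayes capacity, Eqn (13): sum of the column maxima."""
    return float(np.asarray(C, dtype=float).max(axis=0).sum())

C = np.array([[0.7, 0.3],     # hypothetical channel
              [0.4, 0.6],
              [0.1, 0.9]])
print(bayes_capacity(C))       # 0.7 + 0.9 = 1.6
```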
II-D Connecting g-leakage and lift
We conclude this section by showing the connection between Bayes capacity and the information-theoretic measure lift: it turns out that lift is an upper bound on the Bayes capacity. The following lemma has been expressed in [6, Corollary of Theorem 13]; our contribution here is an alternative proof of this result in a QIF formulation.
Lemma 1.
Given a channel $C$, for all full-support priors $\pi$ it holds that $\mathcal{ML}(C) \;\leq\; \ell(\pi, C)$.
Further, we find that the bound is strict, in that lift can be strictly greater than the Bayes capacity. Recalling that the Bayes capacity is an upper bound on average-case g-leakage measures, this result then shows that lift cannot be represented as an average-case g-leakage measure.
To illustrate, from our running example (Ex. 1), we can compute the Bayes capacity of the channel as the sum of its column maxima, yielding
This means that the maximum leakage of the channel wrt any adversary measured using an average-case gain, for any prior and gain function, is given by this Bayes capacity. However, recall that we computed the lift under the prior as 1.73, which exceeds it. This means that lift cannot be expressible as an average-case g-leakage measure.
Next, we study the relationship between lift and max-case measures of leakage.
III On Max-case g-leakage Measures, Lift and Local DP
The QIF measures we have reviewed thus far have been for average-case attacks; that is, they model the expected gain of an attacker. Local differential privacy and lift, however, are max-case notions; they describe the worst-case gain for an adversary after making an observation, regardless of its probability of occurring. Max-case measures provide an alternative perspective to average-case measures: the average-case can be seen as the perspective of a data curator whose interest is in protecting attacks against the entire dataset, whereas the max-case provides the perspective of an individual in the dataset whose concern is their particular data point.
The theory of QIF has been extended to include max-case measures which quantify the gain to an adversary interested in only the worst-case leakage [16, 20]. To model this, the max-case posterior vulnerability is defined as

$$V^{\max}(\pi, C) \;=\; \max_{y:\, p(y) > 0} \mathbb{V}(p_{X|y}). \qquad (14)$$

This says that the max-case posterior vulnerability is the maximum vulnerability computed over the posteriors, where $\mathbb{V}$ is a vulnerability function defined on distributions. The theory of max-case vulnerability leaves open the question of how best to define $\mathbb{V}$, aside from a technical result which says that $\mathbb{V}$ should be quasi-convex in order to satisfy basic axioms such as the data processing inequality and monotonicity [21].
In this paper, we introduce the notion of max-case g-leakage by choosing $\mathbb{V}$ in Eqn (14) to be the standard g-vulnerability function $V_g$ defined in Eqn (6). Note that $V_g$ is convex [16], and therefore also quasi-convex, and thus represents a valid choice of vulnerability function. This yields the following definition:
Definition 3 (Max-case g-leakage).
Given a channel $C$, prior $\pi$ and gain function $g$, the max-case g-leakage of $C$ is defined as

$$\mathcal{L}^{\max}_g(\pi, C) \;=\; \frac{\max_{y:\, p(y) > 0} V_g(p_{X|y})}{V_g(\pi)}, \qquad (15)$$

where $V_g$ is the prior vulnerability function given in Eqn (6).
Comparing Eqn (15) to Eqn (10), we see that the max-case g-leakage quantifies the (multiplicative) difference between the prior vulnerability and a posterior vulnerability computed as the maximum vulnerability over the posteriors, wrt the adversary's measure of gain and their prior knowledge.
We remark that the significance of restricting $\mathbb{V}$ to g-vulnerability functions is that the resulting measure carries with it an operational interpretation in terms of adversarial threats. (That is, our $V_g$ describes an adversary in terms of a gain function $g$ and a prior $\pi$; other (strictly quasi-convex) vulnerability functions cannot be written in this way.) We also note that max-case g-leakage has yet to be studied in the literature. We will leave further discussion on this choice of $\mathbb{V}$ to §IV.
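A direct transcription of Defn 3 (Eqn (15)) as code, again with the gain function given as an action-by-secret matrix; the values are illustrative only.

```python
import numpy as np

def max_case_g_leakage(g, prior, C):
    """Eqn (15): max over outputs y of V_g(p(.|y)), divided by the prior V_g(pi)."""
    joint = prior[:, None] * np.asarray(C, dtype=float)
    p_y = joint.sum(axis=0)
    posteriors = joint / p_y                       # columns are p(x|y)
    post_vulns = (g @ posteriors).max(axis=0)      # V_g of each posterior
    prior_vuln = (g @ prior).max()                 # V_g(pi), Eqn (6)
    return float(post_vulns.max() / prior_vuln)

gid = np.eye(3)                                    # one-try guessing adversary
pi = np.array([0.5, 0.3, 0.2])
C = np.array([[0.7, 0.3],
              [0.4, 0.6],
              [0.1, 0.9]])
print(max_case_g_leakage(gid, pi, C))
```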
III-A Lift and max-case g-leakage
Using Defn 3 above, we now show that lift can be expressed as a max-case g-leakage.
Theorem 1 (Lift as max-case g-leakage).
For discrete domain $\mathcal{X}$, define the set of actions $\mathcal{A} = \mathcal{X}$ and the gain function $g_\pi$ as:

$$g_\pi(a, x) \;=\; \begin{cases} 1/\pi_x & \text{if } a = x,\\ 0 & \text{otherwise,}\end{cases} \qquad (16)$$

where $\pi$ is the (full support) prior of the adversary using the gain function $g_\pi$. Then the max-case $g_\pi$-leakage of any channel $C$ given a prior $\pi$ is equal to its lift. That is, $\mathcal{L}^{\max}_{g_\pi}(\pi, C) = \ell(\pi, C)$.
The significance of Thm 1 is that it gives an operational meaning to lift in terms of the adversary realised by $g_\pi$. To give some intuition, for some distribution $\sigma$, $V_{g_\pi}(\sigma)$ is given by $\max_x \sigma_x/\pi_x$; i.e., it measures the maximum "surprise" to the adversary, achieved on the secret $x$ for which the probability $\sigma_x$ most differs (multiplicatively) from the adversary's prior $\pi_x$. (The gain function $g_\pi$ is also known as the distribution-reciprocal gain function [16, Ch 3].)
Continuing with our running example (Ex. 1), we can compute the gain function $g_\pi$ for the prior of Ex. 1 as $g_\pi(x,x) = 1/\pi_x$ for each eye colour $x$ (and 0 elsewhere).
The prior vulnerability is
The posterior vulnerability is then given by
where
and
And so we obtain 1.73, which is what we calculated for the lift in Ex. 1.
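Because the numbers of Ex. 1 are not reproduced here, the following sketch checks Thm 1 on a hypothetical channel and prior instead: the max-case leakage under the distribution-reciprocal gain function $g_\pi$ coincides with the lift.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

def max_case_g_leakage(g, prior, C):
    joint = prior[:, None] * C
    posteriors = joint / joint.sum(axis=0)
    return float((g @ posteriors).max() / (g @ prior).max())

pi = np.array([0.4, 0.4, 0.2])                 # hypothetical full-support prior
C = np.array([[0.8, 0.2],
              [0.3, 0.7],
              [0.5, 0.5]])
g_pi = np.diag(1.0 / pi)                       # Eqn (16): g_pi(a,x) = 1/pi_x if a = x, else 0
print(max_case_g_leakage(g_pi, pi, C))         # equals the lift
print(lift(pi, C))
```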
We find, however, that lift has a much stronger property: it is in fact an upper bound on any max-case g-leakage measure (that is, using any gain function), with the minor restriction that the gain function must be non-negative. (Note that this non-negativity restriction is also required in the case of the Bayes capacity as an upper bound on average-case leakage.)
Theorem 2 (Lift bounds max-case g-leakage).
Define the max-case g-leakage as per Defn 3. Then for any non-negative gain function $g$, any prior $\pi$ and any channel $C$, it holds that $\mathcal{L}^{\max}_g(\pi, C) \leq \ell(\pi, C)$.
Thm 2 tells us that lift is a surprisingly robust measure; for any prior, lift is a measure of the maximum g-leakage of a channel wrt any (non-negative) gain function. Moreover, this upper bound holds regardless of whether we consider average-case leakage or max-case leakage: the max-case we proved above (Thm 2); the average-case follows from the fact that lift upper bounds the Bayes capacity (Lem 1) and the Bayes capacity is, by definition, an upper bound on any average-case g-leakage, for any prior and any (non-negative) gain function.
To illustrate, consider again the running example (Ex. 1). For the gain function we will choose the gid function, which models the adversary who wishes to guess the secret in one try. This is defined: $g_{id}(a,x) = 1$ if $a = x$, and $0$ otherwise.
Then the max-case gid-leakage given the prior is given by:
And the lift we computed earlier was 1.73, which is larger than this gid-leakage, as we expected. Thm 2 tells us that in fact, for any gain function we choose and any prior, the lift will always be at least as large as the max-case g-leakage.
III-B Robustness results: lift capacity and epsilon
Typically, leakage measures (as defined e.g., in QIF) are not robust – that is, they depend on the specifics of the adversary (their prior and gain function), and therefore may not provide a reliable measure of the safety of a channel against different (Bayesian) adversaries having different prior knowledge and intent. A more robust way to characterise a channel's leakage properties is to measure its maximum leakage – that is, leakage quantified over all priors, or over all gain functions, or both. (Note the distinction between maximum leakage and max-case leakage: the former quantifies over all priors and gain functions; the latter uses a max-case posterior vulnerability measure.) Such quantities are termed capacities (see §II-C), and have been previously studied in the context of average-case leakage [22]; the Bayes capacity (recall Eqn (11)) is the most well-known and provides robust upper bounds on the average-case leakage of any channel.
There has, to date, been no study of max-case capacities. However, Thm 2 gives an immediate result in this direction – it shows that lift can be seen as an example of a max-case capacity, since it provides an upper bound on max-case leakages quantified over all gain functions. It follows also that, by quantifying over all priors as well, the resulting lift capacity is an upper bound on all max-case g-leakages for all priors.
Definition 4.
We define the lift capacity for a channel $C$ as
$$\ell(C) \;=\; \sup_{\pi} \ell(\pi, C),$$
where the supremum is taken over all full-support priors $\pi$.
Immediate from this definition and Thm 2 we get the following:
Corollary 1.
For any channel $C$, any full-support prior $\pi$ and any non-negative gain function $g$, it holds that $\mathcal{L}^{\max}_g(\pi, C) \leq \ell(C)$.
We now show that this so-called lift capacity also has an important relationship with the epsilon of local differential privacy.
Theorem 3.
Let $C$ be a channel and $\epsilon \geq 0$. Then $\ell(C) \leq e^{\epsilon}$ if and only if $C$ satisfies $\epsilon$-LDP.
Proof.
For the forward direction, we first note that the ratio $\max_{x,y} C_{x,y}/C_{x',y}$ arises as the limit of $\ell(\pi, C)$ when $\pi$ approaches the point prior on $x'$. However, recall from Definition 4 that $\ell(C)$ is a supremum over all full support priors. Let $u^{x'}$ denote the point prior on $x'$; now define a sequence of priors
$$\pi^{(n)} \;=\; (1 - \tfrac{1}{n})\, u^{x'} + \tfrac{1}{n}\, \upsilon,$$
where $\upsilon$ is the uniform prior. We see that each $\pi^{(n)}$ is full support, and also $\pi^{(n)} \to u^{x'}$. Therefore

$$\lim_{n \to \infty} \ell(\pi^{(n)}, C) \;=\; \max_{x,y} \frac{C_{x,y}}{C_{x',y}}. \qquad (17)$$

We also note that we can rewrite $p^{(n)}(y)$ as $\sum_{x''} \pi^{(n)}_{x''} C_{x'',y}$. Therefore we have:

$e^{\epsilon} \;\geq\; \ell(C)$   "assumption"
$\;\geq\; \ell(\pi^{(n)}, C)$   "$\ell(C)$ is a supremum over full-support priors"
$\;=\; \max_{x,y} C_{x,y}\,/\,p^{(n)}(y)$   "Eqn (2)"
$\;=\; \max_{x,y} C_{x,y}\,/\,\textstyle\sum_{x''} \pi^{(n)}_{x''} C_{x'',y}$   "Substituting for $p^{(n)}(y)$ as noted above"
$\;\to\; \max_{x,y} C_{x,y}\,/\,C_{x',y}$   "From Eqn (17) above, taking $n \to \infty$"

And so we have that $\ell(C) \leq e^{\epsilon}$ implies $C_{x,y} \leq e^{\epsilon}\, C_{x',y}$ for all $x, x', y$; that is, $C$ satisfies $\epsilon$-LDP. For the reverse direction, we refer to previous works in which this has been shown to hold [1, 8, 9].
∎
Corollary 2.
The lift capacity coincides with the $e^{\epsilon}$ value of an LDP channel. That is, $\ell(C) = e^{\epsilon}$, where $\epsilon$ is the smallest value for which $C$ satisfies $\epsilon$-LDP.
Proof.
From Thm 3 we deduce that $\ell(C) = e^{\epsilon}$. ∎
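The following sketch illustrates Thm 3 and Cor 2 numerically on a hypothetical channel: evaluating the lift on full-support priors that concentrate on a suitable secret approaches $e^{\epsilon}$ from below, while never exceeding it.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

C = np.array([[0.8, 0.2],          # hypothetical channel
              [0.3, 0.7],
              [0.5, 0.5]])
ratios = C.max(axis=0) / C.min(axis=0)
print("exp(eps) =", ratios.max())  # lift capacity according to Cor 2 (3.5 here)

# Full-support priors concentrating on the secret x' that exposes the worst column ratio.
y_star = ratios.argmax()                         # column achieving the worst ratio
x_prime = C[:, y_star].argmin()                  # secret x' realising the supremum
uniform = np.full(C.shape[0], 1.0 / C.shape[0])
point = np.eye(C.shape[0])[x_prime]
for n in [2, 10, 100, 10_000]:
    pi_n = (1 - 1/n) * point + (1/n) * uniform   # full support for every finite n
    print(n, lift(pi_n, C))                      # approaches exp(eps) from below
```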
Thm 3 and Cor 2 establish the equivalence between the lift capacity and the epsilon parameter of local differential privacy, and by Thm 1 and Thm 2 this gives that epsilon (or rather, $e^{\epsilon}$) is a tight upper bound on the max-case g-leakage of any channel. That is, $e^{\epsilon}$ is the max-case g-leakage capacity, quantified over all priors and gain functions. This result connecting $\epsilon$ with the g-leakage framework provides the first robust, operational interpretation for local differential privacy in terms of Bayesian adversarial threats.
We note that $\epsilon$ has previously been interpreted as a capacity under QIF [20]. However, in that work the vulnerability function chosen was not expressible as a g-vulnerability, and therefore did not carry with it the operational interpretation attached to the g-leakage framework. This operational interpretation was Smith's original motivation for a g-leakage based model for secure information leakage measurements.
III-C Interpreting the results connecting g-leakage, lift capacity and epsilon
Thm 2 above tells us that lift bounds max-case g-leakage. To give some intuition for what this means in practice, suppose that we are worried about the leakage of a system to some attacker, both wrt any single individual in our system (i.e., the max-case leakage to the attacker) and wrt the system as a whole (i.e., the attacker's expected gain). Let us also assume that we can estimate the prior knowledge of the attacker, but we do not know exactly what the attacker is trying to achieve (i.e., what her gain function is). What Thm 2 tells us is that without knowing anything about the attacker's goal, we can still compute an upper bound on the leakage of the system to this attacker (i.e., we can measure how much more the attacker can gain by having access to the system). This upper bound is given by the lift, a quantity independent of the gain function, and it is an upper bound on both average-case and max-case leakage to Bayesian adversaries.
Now let us assume that we are worried about the same attacker, however this time we know nothing about her prior knowledge nor her specific goal. What Cor 2 tells us is that we can still compute an upper bound on the leakage of the system to this attacker, and in this case the upper bound is given by the lift capacity, or alternatively by $e^{\epsilon}$ (computed in the local differential privacy sense). This upper bound is robust, in the sense that it makes no assumptions about the adversary's prior knowledge or goal; thus it is termed a capacity, consistent with the QIF literature. The $e^{\epsilon}$ capacity differs from the existing Bayes capacity in QIF, in that the Bayes capacity is an upper bound for adversaries whose leakage is computed in the average-case, whereas the $e^{\epsilon}$ upper bound is a capacity for adversaries computed using either average-case or max-case leakage measures. The significance of these results is that we can now justify $\epsilon$ as a robust security parameter with an operational interpretation in terms of a large class of Bayesian attackers.
Recall the motivating example presented in the introduction (§I-A), in which a user is given a choice of two local differential privacy mechanisms with different values of $\epsilon$. Remember that we found that the mechanism with the smaller $\epsilon$ had a larger leakage (wrt both an average-case and a max-case Bayesian attacker) than the other. The results of Thm 2 and Cor 2 do not tell us when one mechanism will be better than another against a particular attack; for this we would turn to the refinement orders of QIF (studied in [20]), which tell us when one mechanism is better than another against all attacks; or alternatively we would compute the leakage of each mechanism wrt an individual attack of concern (as we did in the introduction).
What Thm 2 and Cor 2 do tell us is that there cannot be any attack (max-case or average-case) which produces more leakage than what the lift tells us (if we know the attacker's prior), or what $e^{\epsilon}$ tells us (if we do not). In the case of the two mechanisms from §I-A, Cor 2 tells us that there does not exist an attack (max-case or average-case, and regardless of the adversary's prior) that produces more leakage than their respective $e^{\epsilon}$ values. Moreover these bounds are tight: there exists an attack which does induce an $e^{\epsilon}$ leakage (and, in fact, Thm 3 gives a construction for such an attacker). This means that we can conclude that the mechanism with the smaller $\epsilon$ is generally safer, because its leakage never exceeds its (smaller) bound $e^{\epsilon}$, even though the other mechanism may be better in some specific circumstances.
Notice that the leakages we computed in the example in §I-A are indeed lower than the corresponding $e^{\epsilon}$ bounds for each mechanism. Interestingly, if we compute the Bayes capacities of the two mechanisms, we find that they equal the average-case leakages computed in §I-A (because the Bayes capacity is realised on a uniform prior for exactly the adversarial scenario we computed). What this tells us is that if we are only concerned with attackers modelled in the average-case, then the mechanism with the larger $\epsilon$ is in fact generally better (but may be worse for particular attackers). Recall also that some of the leakages we measured in §I-A were for a max-case attacker; in fact the max-case leakage is always at least as large as the average-case leakage, for any prior and any gain function, as we will prove later (see Lem 3 in §IV).
III-D Max-case Dalenius leakage
Up until this point, we have been satisfied with computing leakage measures with respect to deterministic functions of the domain $\mathcal{X}$ (described by a prior $\pi$). As pointed out in [6], one may also be concerned about potentially randomised functions of $\mathcal{X}$. The QIF theory of Dalenius leakage [16, Ch 10] accounts for such functions by modelling the leakage that a channel $C$ induces on a secret $Z$ through an unknown correlation $J$ between $Z$ and $\mathcal{X}$. Given such a correlation $J$, we can factorise it into a prior $\rho$ on $Z$ and a channel $D$ from $Z$ to $\mathcal{X}$ so that $J_{z,x} = \rho_z D_{z,x}$.
In [16] it is shown that the multiplicative Bayes capacity of the combined system $DC$ (writing juxtaposition for matrix multiplication) is in fact bounded from above by the multiplicative Bayes capacity of $C$; and thus considering arbitrary randomised functions of the secret is not necessary for quantifying the maximum multiplicative leakage of a channel.
Here we confirm that this property also holds for the max-case leakage capacity (given by $e^{\epsilon}$), and also for lift itself. This means that the capacity results proven in this paper also hold for randomised functions of $\mathcal{X}$, and therefore there is no advantage in considering randomised functions of the secret space.
The max-case g-leakage of $Z$ caused by $D$ and $C$ can be written as $\mathcal{L}^{\max}_g(\rho, DC)$. We now have the following:
Lemma 2.
Let $C$ be a channel and let $J$ be a correlation that factorises as $J_{z,x} = \rho_z D_{z,x}$. Then
$$\ell(\rho, DC) \;\leq\; \ell(\pi, C),$$
where $\pi = \rho D$ and $DC$ denotes the matrix product of $D$ and $C$.
Proof.
Referring to Thm 1, write $g_\rho$ for the gain function that realises lift; i.e., $\mathcal{L}^{\max}_{g_\rho}(\rho, DC) = \ell(\rho, DC)$. The data processing inequality for max-case leakage gives the corresponding bound for any gain function $g$. Thus, from Thm 1, it remains to bound $\ell(\rho, DC)$ by $\ell(\pi, C)$. We next have that:
$\ell(\rho, DC) \;=\; \max_{z,y} \dfrac{(DC)_{z,y}}{q(y)}$   "Defn 2, with $q(y) = \sum_z \rho_z (DC)_{z,y}$"
$\;=\; \max_{z,y} \dfrac{\sum_x D_{z,x} C_{x,y}}{q(y)}$   "Rewriting the matrix product"
$\;\leq\; \max_{z,y} \dfrac{\sum_x D_{z,x} \max_{x'} C_{x',y}}{q(y)}$   "Taking the max of $C_{x,y}$ over $x$"
$\;=\; \max_{z,y} \dfrac{\max_{x'} C_{x',y}}{q(y)}$   "Factorise; use $\sum_x D_{z,x} = 1$"
$\;=\; \max_{y} \dfrac{\max_{x'} C_{x',y}}{q(y)}$   "No dependence on $z$"
$\;=\; \max_{x,y} \dfrac{C_{x,y}}{\sum_{x'} (\rho D)_{x'} C_{x',y}}$   "Factorise denominator"
$\;=\; \max_{x,y} \dfrac{C_{x,y}}{\sum_{x'} \pi_{x'} C_{x',y}}$   "Substituting $\pi = \rho D$"
$\;=\; \ell(\pi, C)$   "Defn 2"

And thus $\ell(\rho, DC) \leq \ell(\pi, C)$. ∎
As a corollary, we have that the lift capacity also respects Dalenius leakage, and therefore the max-case g-leakage of secrets $Z$ via a channel $DC$ and arbitrary correlation $J$ is bounded from above by the lift capacity of $C$.
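A numerical sanity check of these Dalenius bounds, using randomly generated (hypothetical) correlations: the lift of the composed channel $DC$ under a prior $\rho$ on $Z$ never exceeds the lift of $C$ under the pushed-forward prior $\rho D$, and hence never exceeds the lift capacity $e^{\epsilon}$ of $C$.

```python
import numpy as np

def lift(prior, C):
    p_y = prior @ C
    return float((C / p_y).max())

rng = np.random.default_rng(0)

def random_channel(rows, cols):
    M = rng.random((rows, cols)) + 0.05           # strictly positive entries
    return M / M.sum(axis=1, keepdims=True)       # rows sum to 1

C = random_channel(3, 4)                          # channel X -> Y
lift_capacity = (C.max(axis=0) / C.min(axis=0)).max()   # = exp(eps) of C

for _ in range(1000):
    D = random_channel(5, 3)                      # correlation channel Z -> X
    rho = rng.dirichlet(np.ones(5))               # full-support prior on Z
    assert lift(rho, D @ C) <= lift(rho @ D, C) + 1e-9   # Lemma 2
    assert lift(rho @ D, C) <= lift_capacity + 1e-9      # Cor 2 / lift capacity
print("Dalenius bounds hold on all sampled cases")
```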
III-E Interpreting the Dalenius leakage results
To provide some intuition on how to interpret the above results, we return to our motivating example from the introduction, in which two mechanisms were introduced. Recall that these mechanisms were applied to survey results with answers 'yes', 'maybe' and 'no'. Now let us assume that the adversary knows a correlation between survey answers and disease; perhaps users who answered 'yes' or 'maybe' are likely to have some serious illness, whereas users who answered 'no' are unlikely to have one. We wish to know whether such a correlation will cause extra harm to the individual: can the adversary learn more information (via the correlation) than the $\epsilon$ parameter of the chosen mechanism suggests?
What Cor 3 tells us is that the max-case g-leakage of the entire system (including the correlation) is bounded above by the $e^{\epsilon}$ computed from the mechanism alone. This means that any (potentially public) correlations that the adversary may have access to cannot increase the amount of leakage caused by the mechanism; the upper bound on leakage remains intact, where here we measure leakage using either lift or the lift capacity ($e^{\epsilon}$). In other words, the designer of the system can focus on the $\epsilon$ parameter describing the mechanism, and so long as the leakage (represented by $e^{\epsilon}$) is small enough, the system is protected against an adversary who knows any arbitrary correlation $J$. Note that we did not prove a Dalenius leakage result on max-case g-leakage itself, which we leave to future work.
We remark that this result appears to be similar to the notion in differential privacy that $\epsilon$ is independent of the prior knowledge of an attacker. A difference here is that the notion of prior knowledge is typically represented as a distribution over secrets; in the Dalenius scenario described above, we are interested in arbitrary correlations between the secret and some other (potentially damaging) information, and the concern is what damaging information the attacker can learn as a result of the data release and the correlation. Dalenius leakage reassures us that the damage caused by any arbitrary correlation is always bounded by the leakage of the original data release.
Interestingly, it may be the case that these Dalenius leakage results do not hold in the general case of (central) differential privacy; a counter-example appears in the work of [24]. We leave further exploration of this idea to future work.
IV Additional Results on Max-case Leakage and Technical Discussion
In this section, we provide further technical details on the max-case leakage definition of Defn 3.
Firstly, the following result shows that the max-case leakage of a channel is at least as large as the average-case leakage. This result completes Table I.
Lemma 3.
Given a channel $C$, prior $\pi$ and gain function $g$, it holds that $\mathcal{L}_g(\pi, C) \;\leq\; \mathcal{L}^{\max}_g(\pi, C)$.
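Lemma 3 is easy to check numerically; the sketch below samples random (hypothetical) channels, priors and non-negative gain functions and confirms that the max-case g-leakage is never below the average-case g-leakage.

```python
import numpy as np

rng = np.random.default_rng(1)

def leakages(g, prior, C):
    """Return (average-case, max-case) multiplicative g-leakage (Eqns (10) and (15))."""
    joint = prior[:, None] * C
    p_y = joint.sum(axis=0)
    posteriors = joint / p_y
    post_vulns = (g @ posteriors).max(axis=0)     # V_g of each posterior
    prior_vuln = (g @ prior).max()
    avg = (p_y * post_vulns).sum() / prior_vuln   # expectation over outputs, Eqn (9)/(10)
    mx = post_vulns.max() / prior_vuln            # worst single output, Eqn (15)
    return avg, mx

for _ in range(1000):
    C = rng.random((4, 5)) + 0.01
    C /= C.sum(axis=1, keepdims=True)             # random channel, rows sum to 1
    prior = rng.dirichlet(np.ones(4))
    g = rng.random((6, 4))                        # non-negative gain, 6 actions
    avg, mx = leakages(g, prior, C)
    assert mx >= avg - 1e-9
print("max-case leakage >= average-case leakage on all sampled cases")
```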
Next, we recall that in Defn 3 we chose to model max-case leakage using the prior vulnerability function $V_g$, which models the adversary's expected gain before interacting with a system (i.e., using only their prior knowledge). However, in the max-case setting it might seem preferable to choose a prior vulnerability function which models the adversary's max-case gain; i.e., we could have defined:

$$V^{\max}_g(\pi) \;=\; \max_{a \in \mathcal{A}}\ \max_{x:\, \pi_x > 0} g(a, x). \qquad (18)$$
That is, the adversary’s prior gain is computed using the secret which maximises their gain. We now justify our original decision (choosing an average-case prior vulnerability) by demonstrating that these choices are, in fact, equivalent.
Lemma 4.
Let $C$ be a channel, $\pi$ be a prior and $g$ be a gain function. Then there exists a gain function $g'$ such that $V_{g'}(\pi) = V^{\max}_g(\pi)$ and the max-case leakage of Defn 3 computed with $g'$ coincides with the max-case leakage computed with $g$ using the prior vulnerability of Eqn (18).
Finally, we recall the result of [21] which shows that, for reasonable axioms to hold under a max-case definition of leakage, the prior vulnerability function should be quasi-convex (in the prior). However, as we have seen, both the average-case and max-case prior vulnerability functions are convex. An open question in the community has been: are there any strictly quasi-convex functions which produce leakage measures of interest? Previous work [20] showed that the $\epsilon$ of metric differential privacy ($d$-privacy) can be expressed as an additive capacity using a quasi-convex vulnerability function, suggesting that this was indeed one such example. In this paper we have resolved this question with a much stronger result, showing that $\epsilon$ in local DP can in fact be expressed using a convex vulnerability function. This justifies our restriction of max-case leakage to g-leakage measures, but leaves open the question of the usefulness of max-case leakage defined over the full scope of quasi-convex vulnerabilities.
V Prior Work
Perhaps not surprisingly and underpinned by the foundational theory of information, many linkages have already been established between information-theoretic, differential privacy and QIF frameworks [20, 25]. Here, we only review results that are most pertinent to this work.
First, the logarithm of the Bayes capacity is known as the Sibson mutual information of order $\infty$ in information theory [26] and was recently shown in [6] to measure the maximum leakage to adversaries wanting to guess arbitrary randomised functions of the secret $X$. It turns out that this identity is no coincidence, and a recent work [18] proves that there is no advantage in generalising the class of adversaries from those who use deterministic gain functions to those who guess randomised functions. We remark that this result is also known in the QIF community via Dalenius leakage [16, Ch 10].
Second, it is not difficult to prove that an upper bound on the information density bounds the Sibson mutual information of order infinity [6], aka the logarithm of Bayes capacity in QIF language. Moreover, works such as [1, 8, 9] link upper and lower bounds on lift to (local) differential privacy measures [3, 12, 13] and vice versa.
There have been a number of works connecting DP and QIF, in particular via the study of max-case leakage measures [16, 20]. A general notion of max-case leakage under QIF has been extensively explored in [16]; there it was found that max-case measures are required to be quasi-convex in order to satisfy certain axioms. Our work differs from theirs in that we restrict our attention to the set of max-case g-leakages – that is, those derived from g-vulnerability functions – which are a subset of the quasi-convex max-case vulnerabilities (in fact, we have shown that our max-case g-leakages are all convex). To our knowledge our work is the first to explore max-case g-leakages and their capacities. General max-case leakages were also explored in [20] and their connection to differential privacy established via a leakage ordering. In particular, [20] found that whenever there is a max-case leakage order between two mechanisms (meaning one is always safer than the other wrt every max-case adversarial scenario), then there is a corresponding ordering between their $\epsilon$ values. While [20] explored $\epsilon$ as inducing a refinement order, finding that it is surprisingly weak (i.e., compared with the refinement orders of average-case and max-case leakage found in QIF), in our work we explore $\epsilon$ as a capacity, finding that it does have a strong and meaningful interpretation (in terms of max-case attackers). In addition, [20] established that the $\epsilon$ of DP can be expressed as a max-case leakage capacity (via the QIF framework) derived from a specific quasi-convex vulnerability function. Such quasi-convex vulnerabilities, unfortunately, do not correspond to adversarial models in g-leakage, and therefore fall outside the desirable operational interpretability. Our work finds that $\epsilon$ (actually $e^{\epsilon}$) is a capacity which can in fact be represented by a max-case g-vulnerability, which brings with it a broad operational meaning in terms of max-case attackers.
Other relationships between Bayesian models and differential privacy have also been previously studied [27, 23], however none of these results give both a robust and an interpretable meaning to $\epsilon$. Andrés et al. [23] provided a Bayesian interpretation for metric differential privacy in terms of a single adversary, and Alvim et al. [27] showed that $\epsilon$-differential privacy implies a bound on leakage but that the converse does not hold.
VI Conclusion
In this paper we have investigated the relationships between traditional information theory and the g-leakage framework of quantitative information flow. The connections are summarised in Table I. Overall we observe that the two notions converge wrt robustness, namely via the lift capacity, which we find is equivalent to the $e^{\epsilon}$ of local differential privacy.
Significant also is that local differential privacy's $\epsilon$ parameter can now be understood through the lens of QIF. In particular, we see that it is also a measure of robustness in that it behaves as a capacity – that is, independent of any particular prior. Moreover, it represents a tight upper bound on all max-case g-leakages. This is the first time that $\epsilon$ in LDP has been explained as a robust measure of information leakage in the QIF framework.
From the perspective of information theory, lift is also explained as a leakage measure, but interestingly we discovered that it has capacity-like properties (§III-B). Table I clarifies how these measures relate to the leakages and capacities well known in QIF.
Differential privacy is often seen as a useful technique to protect the privacy of individuals' data, and has been used in several prominent applications including the 2020 US Census [15]. As noted however, in spite of differential privacy's theoretical properties, there remain a number of challenges for its successful application. One such challenge is how to choose an appropriate level of $\epsilon$ relative to a particular scenario. Whilst $\epsilon$ itself provides information about indistinguishability, it is difficult to reconcile it with the protection against other relevant attacks described here.
The work presented here is an important step towards a better understanding of how to choose $\epsilon$ where indistinguishability is not the only concern, but where there are differing adversarial assumptions. The results here give a clear theoretical account of $\epsilon$ and how to view it under differing adversarial conditions, which can then be included in the determination of an appropriate threat model that is relevant to the scenario.
While the results of this paper were derived for local DP, they provide some insights about central DP. For instance, based on Thm 3, Cor 2 and the fact that local DP implies central DP, we know that the lift capacity upper bounds $e^{\epsilon}$ in central DP. However, more work is required to better understand what type of worst-case threat models in the QIF framework have a one-to-one correspondence with $\epsilon$ in central DP. If such an equivalence were to be established, it would settle the question of the operational meaning of $\epsilon$ in central DP in terms of which adversarial threats it can protect against. There may also exist new connections or applications of lift and lift capacity in defining or bounding other privacy and leakage measures. We leave these interesting questions for future work.
References
- [1] F. du Pin Calmon and N. Fawaz, “Privacy against statistical inference,” in 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1401–1408, IEEE, 2012.
- [2] A. Makhdoumi, S. Salamatian, N. Fawaz, and M. Médard, “From the information bottleneck to the privacy funnel,” in 2014 IEEE Information Theory Workshop (ITW 2014), pp. 501–505, Nov 2014.
- [3] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference, pp. 265–284, Springer, 2006.
- [4] G. Smith, “On the Foundations of Quantitative Information Flow,” in FOSSACS, vol. 5504 of LNCS, pp. 288–302, Springer, 2009.
- [5] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley-Interscience, 2006.
- [6] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2020.
- [7] J. Liao, O. Kosut, L. Sankar, and F. d. P. Calmon, “Tunable measures for information leakage and applications to privacy-utility tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
- [8] S. Salamatian, F. P. Calmon, N. Fawaz, A. Makhdoumi, and M. Médard, “Privacy-utility tradeoff and privacy funnel,” Unpublished preprint, https://www.mit.edu/~salmansa/files/privacy_TIFS.pdf, 2020.
- [9] B. Jiang, M. Seif, R. Tandon, and M. Li, “Context-aware local information privacy,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 3694–3708, 2021.
- [10] H. Hsu, S. Asoodeh, and F. P. Calmon, “Information-theoretic privacy watchdogs,” in 2019 IEEE International Symposium on Information Theory (ISIT), pp. 552–556, IEEE, 2019.
- [11] P. Sadeghi, N. Ding, and T. Rakotoarivelo, “On properties and optimization of information-theoretic privacy watchdog,” in IEEE Inf. Theory Workshop (ITW), pp. 1–5, Apr. 2020.
- [12] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?,” SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011.
- [13] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Proc. IEEE 54th Annu. Symp. Found. Comput. Sci., pp. 429–438, 2013.
- [14] D. Clark, S. Hunt, and P. Malacaria, “Quantitative information flow, relations and polymorphic types.,” J. Log. Comput., vol. 15, pp. 181–199, 01 2005.
- [15] S. L. Garfinkel, J. M. Abowd, and S. Powazek, “Issues encountered deploying differential privacy,” in Proceedings of the 2018 Workshop on Privacy in the Electronic Society, pp. 133–137, 2018.
- [16] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, The Science of Quantitative Information Flow. Springer, 2020.
- [17] M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, “Measuring information leakage using generalized gain functions,” in IEEE CSF, pp. 265–279, June 2012.
- [18] S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage,” arXiv preprint arXiv:2205.04935, 2022.
- [19] A. McIver, C. Morgan, G. Smith, B. Espinoza, and L. Meinicke, “Abstract channels and their robust information-leakage ordering,” in Principles of Security and Trust - Third International Conference, POST 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014, Proceedings (M. Abadi and S. Kremer, eds.), vol. 8414 of Lecture Notes in Computer Science, pp. 83–102, Springer, 2014.
- [20] K. Chatzikokolakis, N. Fernandes, and C. Palamidessi, “Comparing systems: Max-case refinement orders and application to differential privacy,” in Proc. CSF, IEEE Press, 2019.
- [21] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Axioms for information leakage,” in 2016 IEEE 29th Computer Security Foundations Symposium (CSF), pp. 77–92, IEEE, 2016.
- [22] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, “Additive and Multiplicative Notions of Leakage, and Their Capacities,” in IEEE CSF, pp. 308–322, IEEE, 2014.
- [23] M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: Differential privacy for location-based systems,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pp. 901–914, 2013.
- [24] Y. Wang, Y. O. Basciftci, and P. Ishwar, “Privacy-utility tradeoffs under constrained data release mechanisms,” arXiv preprint arXiv:1710.09295, 2017.
- [25] P. Cuff and L. Yu, “Differential privacy as a mutual information constraint,” in Proc. ACM CCS, pp. 43–54, 2016.
- [26] R. Sibson, “Information radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969.
- [27] M. S. Alvim, M. E. Andrés, K. Chatzikokolakis, and C. Palamidessi, “On the relation between differential privacy and quantitative information flow,” in Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part II, pp. 60–76, 2011.
Appendix A Motivating Example from the Introduction
In the introduction we motivated our work with the example of the following two channels:
Using a uniform prior , we can compute the set of posteriors using Bayes rule by multiplying out the channel by the prior and then normalising down columns. This yields the following posteriors and their corresponding marginals:
We can now compute the expected posterior gain to an adversary who wishes to guess the secret in one try. This adversary can be modelled using the gain function gid defined as:
$$g_{id}(a, x) \;=\; \begin{cases} 1 & \text{if } a = x,\\ 0 & \text{otherwise.}\end{cases} \qquad (19)$$
We first compute the prior vulnerability for this adversary, which measures the adversary's expected gain before observing any output from the channel; for the uniform prior over the three secrets this is $1/3$.
The posterior vulnerability determines the expected gain to the adversary after making an observation, and therefore depends on the channel. The posterior gid-vulnerability for is then:
And for we have:
The multiplicative leakage is given by the ratio of the posterior to the prior vulnerabilities. For we have:
And for we compute:
And thus leaks more than to this adversary.
Next we compare the max-case leakage using the same gain function gid and the same uniform prior. Note that the prior gid-vulnerability is the same ($1/3$). For the posterior max-case vulnerability of the first mechanism we compute:
And for we have:
And so the corresponding multiplicative leakage for is:
And for it is:
And so again we find that leaks more than for this adversary, now modelled using a max-case vulnerability function.
Appendix B Max-case g-leakage using average-case g-vulnerability
In this section, we complete the proof of Lem 4, showing a gain function which produces the same leakage under the average-case prior vulnerability of Eqn (6) as a max-case leakage defined using the max-case prior vulnerability of Eqn (18). That is, we will prove that for any gain function $g$ it is possible to construct a gain function $g'$ such that $\mathcal{L}^{\max}_{g'}(\pi, C) = \widehat{\mathcal{L}}^{\max}_{g}(\pi, C)$, where $\mathcal{L}^{\max}_{g'}(\pi, C)$ is defined in the usual way as

$$\mathcal{L}^{\max}_{g'}(\pi, C) \;=\; \frac{\max_{y:\, p(y)>0} V_{g'}(p_{X|y})}{V_{g'}(\pi)} \qquad (20)$$

and $\widehat{\mathcal{L}}^{\max}_{g}(\pi, C)$ is defined as

$$\widehat{\mathcal{L}}^{\max}_{g}(\pi, C) \;=\; \frac{\max_{y:\, p(y)>0} V_{g}(p_{X|y})}{V^{\max}_{g}(\pi)}. \qquad (21)$$
Note that our construction assumes that the domain is discrete, although the proof does not rely on this assumption.
For we can define the set of actions such that for each we have a set , and a mapping satisfying
(22) |
This means that . And therefore:
And this gives that .