
Enhancing Utility in the Watchdog Privacy Mechanism

Abstract

This paper is concerned with enhancing data utility in the privacy watchdog method for attaining information-theoretic privacy. For a given privacy constraint, the watchdog method filters out the high-risk data symbols by applying a uniform data regulation scheme, e.g., merging all high-risk symbols together. While this method entirely trades off symbol resolution for privacy, we show that the data utility can be greatly improved by partitioning the high-risk symbol set and privatizing each subset individually. We further propose an agglomerative merging algorithm that finds a suitable partition of high-risk symbols: it starts with a singleton high-risk symbol, which is iteratively fused with others until the resulting subsets are private. Numerical simulations demonstrate the efficacy of this algorithm in privately achieving higher utilities in the watchdog scheme.

Index Terms—  Information-theoretic privacy; Watchdog privacy mechanism; Privacy-utility trade-off.

1 Introduction

Industries and governments are increasingly sharing data to unlock economic and societal benefits through advances in data analytics and machine learning. However, such data also contains sensitive information about individuals or businesses, which makes privacy regulators, users, and data providers concerned about the leakage of confidential information, either explicitly or implicitly. In signal processing and information theory, data privacy is underpinned by a measure called the information lift [1, 2].

To evaluate how informative the public data $X$ is about the confidential data $S$, the lift measures the change in the posterior belief $p(s|x)$ from the prior belief $p(s)$ for each instance of $s$ and $x$ by

$l(s,x)=\frac{p(s|x)}{p(s)}.$ (1)

It is clear that a small lift indicates limited private information gain by the adversary, and hence a more private $X$. The lift is the elementary measure underlying almost all information leakage measures, e.g., mutual information [1, 3], Sibson mutual information [4, 5, 6], $\alpha$-leakage [7] and local differential privacy [8, 9]: as proved in [2], if the lift is bounded, all these leakage measures are also bounded.
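As a concrete illustration of (1), the following minimal Python sketch computes the lift and log-lift from a joint distribution; the joint pmf used here is a hypothetical toy example, not taken from the paper.

```python
import numpy as np

# Hypothetical toy joint distribution p(s, x): rows index s, columns index x.
p_sx = np.array([[0.10, 0.05, 0.15],
                 [0.05, 0.30, 0.05],
                 [0.10, 0.05, 0.15]])

p_s = p_sx.sum(axis=1, keepdims=True)   # marginal p(s)
p_x = p_sx.sum(axis=0, keepdims=True)   # marginal p(x)

# Lift l(s, x) = p(s|x) / p(s) = p(s, x) / (p(s) p(x)), eq. (1).
lift = p_sx / (p_s * p_x)
log_lift = np.log(lift)                 # i(s, x), used by the watchdog
```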

                     utility   privacy leakage
complete merging     0.5913    0.8037
two-subset merging   0.7335    0.8488

Table 1: An example of how subset merging can enhance utility in the watchdog mechanism ($\varepsilon=1$).

The existing approach to attaining lift-based privacy is the watchdog method proposed in [2]. For a given privacy constraint, i.e., a threshold $\varepsilon$ on the lift, the watchdog method filters out and privatizes the high-risk symbols of $X$. The authors in [2] adopted a uniform approach: merging all high-risk symbols into a ‘super’ symbol, which is proved in [10, 6] to be the optimal privatization scheme for attaining data privacy. However, this uniform approach neglects an important issue in data privacy: to achieve the benefits of data sharing, the privatized data should provide a satisfactory level of usefulness in $X$. (The data utility is usually quantified by average performance measures such as mutual information [3], $f$-divergence [11] and Hamming distortion [12]; we use mutual information in this paper.) Despite the relaxation attempts in [10, 6], the complete merging method minimizes the resolution of high-risk symbols and significantly deteriorates data utility, which is at odds with the purpose of data sharing.

In fact, even a small alteration can greatly enhance the data utility. In Table 1, we arbitrarily cut the high-risk symbol set (of size 7) into two subsets, each of which is then privatized individually. The utility (measured by mutual information) is increased significantly without sacrificing too much data privacy, which remains below the design constraint $\varepsilon$. This not only shows that the complete merging approach is an ‘overkill’ in terms of data utility, but also suggests a partitioned privatization approach.

In this paper, we propose the subset merging method for the watchdog mechanism to enhance data utility. Finding the best partition of the high-risk symbol set to achieve optimal utility is in general a combinatorial problem. Accordingly, we propose a heuristic greedy algorithm that finds good subsets to merge while guaranteeing data privacy. This greedy algorithm searches for the finest partition of the original high-risk symbol set that ensures the resulting lift of the whole dataset does not exceed $\varepsilon$. It starts with the singleton partition of high-risk symbols (highest resolution) and iteratively merges symbols until the lift values of the resulting subsets are all below $\varepsilon$. Numerical simulations show that our proposed algorithm enhances utility significantly while maintaining the privacy leakage constraint.

2 System Model

Let the random variables $S$ and $X$ denote the sensitive and public data, respectively. The joint distribution $p(s,x)$ describes the statistical correlation between $S$ and $X$. To protect the privacy of $S$, we sanitize $X$ into $Y$ via the transition probability $p(y|x)$. Here, for each $x,y$, $p(y|x)=p(y|x,s)$ for all $s$, and therefore the Markov chain $S\rightarrow X\rightarrow Y$ is formed.

The watchdog method is based on the logarithm of the lift measure

$i(s,x)=\log l(s,x).$

For each $x\in\mathcal{X}$, denote the maximum symbol-wise information leakage by $\omega(x)$, defined as

$\omega(x)=\max_{s\in\mathcal{S}}|i(s,x)|.$ (2)

Applying an upper bound $\varepsilon$ to $\omega(x)$ for all symbols $x\in\mathcal{X}$ divides the whole alphabet $\mathcal{X}$ into two subsets: the low-risk subset $\mathcal{X}_{\varepsilon}\triangleq\{x\in\mathcal{X}:\omega(x)\leq\varepsilon\}$, which is safe to publish, and the high-risk symbol set

$\mathcal{X}_{\varepsilon}^{c}=\mathcal{X}\setminus\mathcal{X}_{\varepsilon}\triangleq\{x\in\mathcal{X}:\omega(x)>\varepsilon\},$

which requires treatment before the data is published.
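To make this split concrete, here is a minimal sketch that computes $\omega(x)$ and partitions the alphabet; the joint pmf and the threshold are hypothetical toy values.

```python
import numpy as np

# Hypothetical joint distribution p(s, x) and privacy threshold.
p_sx = np.array([[0.10, 0.05, 0.15],
                 [0.05, 0.30, 0.05],
                 [0.10, 0.05, 0.15]])
eps = 0.5

p_s = p_sx.sum(axis=1, keepdims=True)
p_x = p_sx.sum(axis=0, keepdims=True)
log_lift = np.log(p_sx / (p_s * p_x))    # i(s, x)

omega = np.abs(log_lift).max(axis=0)     # omega(x), eq. (2)
low_risk = np.where(omega <= eps)[0]     # X_eps: safe to publish
high_risk = np.where(omega > eps)[0]     # X_eps^c: needs treatment
```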

The authors in [10, 6] adopt a uniform randomization scheme

$p(y|x)=\begin{cases}1_{\{x=y\}} & x,y\in\mathcal{X}_{\varepsilon},\\ R(y) & x,y\in\mathcal{X}_{\varepsilon}^{c},\\ 0 & \text{otherwise},\end{cases}$ (3)

where $R(y)$ is the complete merging solution: there is a single super symbol $y^{*}\in\mathcal{X}_{\varepsilon}^{c}$ with $R(y^{*})=1$ and $R(y)=0$ otherwise, so that every $x\in\mathcal{X}_{\varepsilon}^{c}$ is mapped to $y^{*}$.
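A minimal sketch of the complete-merging channel in (3), assuming symbols are indexed $0,\dots,n-1$; the alphabet size and high-risk indices are hypothetical:

```python
import numpy as np

n = 5                                # |X|, hypothetical
high_risk = np.array([0, 1, 3])      # indices of X_eps^c, hypothetical
y_star = high_risk[0]                # the 'super' symbol

# Channel matrix P[x, y] = p(y|x), eq. (3): identity on low-risk
# symbols; every high-risk x maps deterministically to y_star.
P = np.eye(n)
P[high_risk, :] = 0.0
P[high_risk, y_star] = 1.0           # R(y_star) = 1, R(y) = 0 otherwise
```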

1 Input: $\mathcal{X},\varepsilon,p(s,x)$
2 Output: $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}_{1},\mathcal{X}_{2},\cdots,\mathcal{X}_{p}\}$
3 Initialize: obtain $\{\mathcal{X}_{\varepsilon},\mathcal{X}_{\varepsilon}^{c}\}$, $\mathcal{X}_{Q}\leftarrow\mathcal{X}_{\varepsilon}^{c}$, and $p=1$; while $|\mathcal{X}_{Q}|>0$ do
4   $\mathcal{X}_{p}=\{\operatorname{arg\,max}_{x\in\mathcal{X}_{Q}}\omega(x)\}$, and $\mathcal{X}_{Q}\leftarrow\mathcal{X}_{Q}\setminus\mathcal{X}_{p}$;
5   while $\omega(\mathcal{X}_{p})>\varepsilon$ and $|\mathcal{X}_{Q}|>0$ do
6     $x^{*}=\operatorname{arg\,min}_{x\in\mathcal{X}_{Q}}\omega(\mathcal{X}_{p}\cup\{x\})$;
7     $\mathcal{X}_{p}\leftarrow\mathcal{X}_{p}\cup\{x^{*}\}$ and $\mathcal{X}_{Q}\leftarrow\mathcal{X}_{Q}\setminus\{x^{*}\}$;
8   end while
9   $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}_{1},\mathcal{X}_{2},\cdots,\mathcal{X}_{p}\}$, and $p\leftarrow p+1$
10 end while
11
12 while $\omega(\mathcal{X}_{p})>\varepsilon$ and $|\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}|>1$ do
13   $\mathcal{X}_{k}=\operatorname{arg\,min}_{1\leq i<p}\omega(\mathcal{X}_{p}\cup\mathcal{X}_{i})$;
14   $\mathcal{X}_{p}\leftarrow\mathcal{X}_{p}\cup\mathcal{X}_{k}$; for $k+1\leq i\leq p$, relabel $\mathcal{X}_{i}$ as $\mathcal{X}_{i-1}$;
15   $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}_{1},\mathcal{X}_{2},\cdots,\mathcal{X}_{p-1}\}$, and $p\leftarrow p-1$
16 end while

Algorithm 1: Construct a refinement of $\mathcal{X}_{\varepsilon}^{c}$

After randomization, the log-lift is given by $i(s,y)=\log\frac{p(y|s)}{p(y)}$, where $p(y|s)=\sum_{x\in\mathcal{X}}p(y|x)p(x|s)$ due to the Markov property and $p(y)=\sum_{x\in\mathcal{X}}p(y|x)p(x)$. We can extend the notions of the log-lift and $\omega(x)$ from a single $x\in\mathcal{X}$ to a subset $\mathcal{X}_{Q}\subseteq\mathcal{X}$ [10]:

$i(s,\mathcal{X}_{Q})=\log\frac{p(\mathcal{X}_{Q}|s)}{p(\mathcal{X}_{Q})},\quad\omega(\mathcal{X}_{Q})=\max_{s\in\mathcal{S}}|i(s,\mathcal{X}_{Q})|,$ (4)

where $p(\mathcal{X}_{Q}|s)=\sum_{x\in\mathcal{X}_{Q}}p(x|s)$ and $p(\mathcal{X}_{Q})=\sum_{x\in\mathcal{X}_{Q}}p(x)$.
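A minimal helper implementing (4), assuming a strictly positive joint pmf so the logarithms are well defined:

```python
import numpy as np

def omega_subset(p_sx, subset):
    """omega(X_Q) of eq. (4); `subset` is a list of x-indices.
    Assumes a strictly positive joint pmf p_sx[s, x]."""
    p_s = p_sx.sum(axis=1)                       # p(s)
    p_xq_s = p_sx[:, subset].sum(axis=1) / p_s   # p(X_Q | s)
    p_xq = p_sx[:, subset].sum()                 # p(X_Q)
    return np.abs(np.log(p_xq_s / p_xq)).max()   # max_s |i(s, X_Q)|
```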

Applying $p(y|x)$, the upper bound on the privacy leakage after randomization, $\max_{y\in\mathcal{Y}}\omega(y)$, is given by [10]

$\max_{y\in\mathcal{Y}}\omega(y)=\max\{\max_{y\in\mathcal{X}_{\varepsilon}}\omega(y),\,\omega(\mathcal{X}_{\varepsilon}^{c})\}.$ (5)

It is obvious that $\max_{y\in\mathcal{X}_{\varepsilon}}\omega(y)\leq\varepsilon$, so the privacy constraint is attained on the low-risk subset. However, the value of $\omega(\mathcal{X}_{\varepsilon}^{c})$ is variable and depends on the joint probability distribution.
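The leakage in (5) can be checked numerically by pushing the joint pmf through the channel; a sketch under the same positivity assumption, with `P[x, y]` denoting $p(y|x)$:

```python
import numpy as np

def max_leakage(p_sx, P):
    """max_y omega(y) after applying the channel P[x, y] = p(y|x)."""
    p_s = p_sx.sum(axis=1, keepdims=True)
    p_y_s = (p_sx / p_s) @ P            # p(y|s), via the Markov chain
    p_y = p_sx.sum(axis=0) @ P          # p(y)
    keep = p_y > 0                      # skip merged-away output symbols
    log_lift = np.log(p_y_s[:, keep] / p_y[keep])
    return np.abs(log_lift).max()
```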

Example 1

Let $\mathcal{X}=\{x_{1},x_{2},\cdots,x_{5}\}$, $\mathcal{S}=\{s_{1},s_{2},s_{3}\}$, and $\varepsilon=0.8$. We randomly generate a joint distribution $p(s,x)$; the resulting maximum symbol-wise leakage and utility are $\omega(x)=[1.3515, 1.6458, 0.9295, 0.8161, 0.2608]$ and $H(X)=1.6034$, respectively. For the given $\varepsilon$, the low-risk and high-risk subsets are $\mathcal{X}_{\varepsilon}=\{x_{5}\}$ and $\mathcal{X}_{\varepsilon}^{c}=\{x_{1},x_{2},x_{3},x_{4}\}$, respectively. After randomization, assume the high-risk symbols are mapped to $y^{*}$, where $y^{*}=y_{1}=x_{1}$ and $y_{2}=x_{5}$; then the maximum symbol-wise leakage and utility are $\omega(y)=[0.0627, 0.2608]$ and $I(X;Y)=0.5269$, respectively.

In Example 1, the leakage of the high-risk subset after randomization is $\omega(\mathcal{X}_{\varepsilon}^{c})=0.0627$, which is an order of magnitude smaller than the original privacy constraint $\varepsilon=0.8$, based on which $\mathcal{X}_{\varepsilon}^{c}$ was obtained. Although this small leakage guarantees a very high level of privacy, it damages utility drastically: the utility decreases from $H(X)=1.6034$ to $I(X;Y)=0.5269$. On the other hand, when a threshold $\varepsilon$ is given as the privacy constraint, it is acceptable to just keep the privacy leakage below $\varepsilon$, even if it becomes very close to this threshold. Thus, in the next section, we propose a subset merging method to enhance utility, where the privacy leakage is allowed to increase towards $\varepsilon$ but never exceeds it.

3 Enhancing Utility

Fig. 1: Privacy leakage for different values of $\varepsilon$: the mean values of $\max_{y\in\mathcal{X}_{\varepsilon}^{c}}\omega(y)$ are shown with standard deviations. Algorithm 1 increases the privacy leakage in $\mathcal{X}_{\varepsilon}^{c}$ in comparison with complete merging; however, in all cases it remains below the constraint $\varepsilon$.

In this section, we introduce an approach to enhance utility while maintaining a given privacy constraint. We measure utility by mutual information, which for the bi-partition $\{\mathcal{X}_{\varepsilon},\mathcal{X}_{\varepsilon}^{c}\}$ and the complete merging randomization of $\mathcal{X}_{\varepsilon}^{c}$ is given by [10]

$I(X;Y)=H(X)+\sum_{x\in\mathcal{X}_{\varepsilon}^{c}}p(x)\log\frac{p(x)}{p(\mathcal{X}_{\varepsilon}^{c})}.$ (6)

Clearly, the utility depends on $p(x)$ for $x\in\mathcal{X}_{\varepsilon}^{c}$ and on the overall $p(\mathcal{X}_{\varepsilon}^{c})$. Our proposed approach to enhancing data utility is to increase the data resolution: we randomize subsets of $\mathcal{X}_{\varepsilon}^{c}$ separately rather than completely merging the whole set $\mathcal{X}_{\varepsilon}^{c}$.

Let $[p]=\{1,2,\cdots,p\}$ and consider a bi-partition $\{\mathcal{X}_{\varepsilon},\mathcal{X}_{\varepsilon}^{c}\}$, a further partitioning of $\mathcal{X}_{\varepsilon}^{c}$ denoted by $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}_{1},\cdots,\mathcal{X}_{p}\}$, and complete merging randomizations $R_{i}(y)$, $i\in[p]$, where $\sum_{y\in\mathcal{X}_{i}}R_{i}(y)=1$. In other words, we partition $\mathcal{X}_{\varepsilon}^{c}$ into subsets $\mathcal{X}_{i}$, so that $\mathcal{X}_{\varepsilon}^{c}=\cup_{i=1}^{p}\mathcal{X}_{i}$, and each subset $\mathcal{X}_{i}$ is randomized by the corresponding randomization $R_{i}(y)$. Note that for $y\in\mathcal{X}_{i}$ we have $p(y)=\sum_{x\in\mathcal{X}_{i}}p(y|x)p(x)=p(\mathcal{X}_{i})R_{i}(y)$. The resulting mutual information $I(X;Y)$ is

$I(X;Y)=H(X)+\sum_{i=1}^{p}\sum_{x\in\mathcal{X}_{i}}p(x)\log\frac{p(x)}{p(\mathcal{X}_{i})}.$ (7)
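For concreteness, a per-subset merging channel can be built as below; the alphabet size and partition are hypothetical, and each $R_i$ is taken to be deterministic on a representative symbol (a valid special case of $R_i$):

```python
import numpy as np

n = 6                                  # |X|, hypothetical
partition = [[0, 1], [2, 3]]           # hypothetical partition of X_eps^c

# P[x, y] = p(y|x): identity on the low-risk symbols {4, 5}; within
# each high-risk subset X_i, all symbols map to one representative
# symbol, i.e., R_i puts probability 1 on that symbol.
P = np.eye(n)
for subset in partition:
    rep = subset[0]                    # representative 'super' symbol
    P[subset, :] = 0.0
    P[subset, rep] = 1.0
```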

Then the normalized mutual information loss (NMIL) for the partition $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$ of $\mathcal{X}_{\varepsilon}^{c}$ is defined as

$\text{NMIL}(\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}})=\frac{H(X)-I(X;Y)}{H(X)}.$ (8)
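A small helper evaluating (7) and (8) from the marginal $p(x)$ and a partition of the high-risk indices (a sketch, assuming a strictly positive marginal):

```python
import numpy as np

def utility_and_nmil(p_x, partition):
    """Return I(X;Y) of eq. (7) and NMIL of eq. (8).
    `partition` is a list of index lists covering X_eps^c."""
    H_X = -np.sum(p_x * np.log(p_x))            # H(X)
    loss = 0.0                                  # H(X) - I(X;Y)
    for subset in partition:
        p_sub = p_x[subset].sum()               # p(X_i)
        loss += np.sum(p_x[subset] * np.log(p_sub / p_x[subset]))
    return H_X - loss, loss / H_X               # I(X;Y), NMIL
```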

Since $p(\mathcal{X}_{i})\leq p(\mathcal{X}_{\varepsilon}^{c})$ for $i\in[p]$, the data resolution increases for each $x\in\mathcal{X}_{i}$, which can result in a larger mutual information and hence a lower utility loss. The following definition and derivation formalize this observation.

Definition 1

Assume two partitions $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}_{1},\cdots,\mathcal{X}_{p}\}$ and $\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}}=\{\mathcal{X}'_{1},\cdots,\mathcal{X}'_{p'}\}$. We say $\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}}$ is a refinement of $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$, and $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$ is an aggregation of $\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}}$ [13], if for every $i\in[p]$, $\mathcal{X}_{i}=\cup_{j\in J_{i}}\mathcal{X}'_{j}$ where $J_{i}\subseteq[p']$; consequently, $p(\mathcal{X}_{i})=\sum_{j\in J_{i}}p(\mathcal{X}'_{j})$.

If $\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}}$ is a refinement of $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$, then $\text{NMIL}(\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}})\leq\text{NMIL}(\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}})$. This is because

$H(X)\times\text{NMIL}(\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}})=\sum_{i=1}^{p}\sum_{x\in\mathcal{X}_{i}}p(x)\log\frac{p(\mathcal{X}_{i})}{p(x)}\geq\sum_{j=1}^{p'}\sum_{x\in\mathcal{X}'_{j}}p(x)\log\frac{p(\mathcal{X}'_{j})}{p(x)}=H(X)\times\text{NMIL}(\mathcal{P}'_{\mathcal{X}_{\varepsilon}^{c}}),$ (9)

where the inequality holds because every $x\in\mathcal{X}'_{j}\subseteq\mathcal{X}_{i}$ satisfies $p(\mathcal{X}'_{j})\leq p(\mathcal{X}_{i})$.

Generally, finding an optimal partition $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$ that maximizes utility while maintaining the privacy constraint is a combinatorial problem, since it depends on the high-risk symbol probabilities $p(x)$, $x\in\mathcal{X}_{\varepsilon}^{c}$, and on the joint probability distribution. However, we can use $\varepsilon$ as a stopping criterion in a heuristic agglomerative algorithm that obtains a refinement of $\mathcal{X}_{\varepsilon}^{c}$ and enhances utility as much as possible.

3.1 A greedy algorithm to refine the high-risk subset $\mathcal{X}_{\varepsilon}^{c}$

Knowing that aggregation of a partition of $\mathcal{X}_{\varepsilon}^{c}$ reduces the symbol resolution and data utility, we propose a heuristic greedy algorithm that determines the most refined partition of $\mathcal{X}_{\varepsilon}^{c}$ satisfying the data privacy constraint specified by $\varepsilon$. This is a bottom-up algorithm that starts from the most refined (singleton-element) partition of $\mathcal{X}_{\varepsilon}^{c}$. This starting point provides the highest resolution/utility but the lowest data privacy level. We let the subsets in this partition merge with each other to reduce the log-lift measure $\omega(\mathcal{X}_{i})$, $\mathcal{X}_{i}\in\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$, until the log-lift of every subset is below $\varepsilon$. To achieve a low-complexity but effective procedure that results in a finer partition of $\mathcal{X}_{\varepsilon}^{c}$, we perform the subset merging in order; a good candidate to begin with is the most challenging symbol, i.e., the one with the highest log-lift leakage.

The pseudo-code of our method is shown in Algorithm 1. To find each subset $\mathcal{X}_{i}\subseteq\mathcal{X}_{\varepsilon}^{c}$, we start with $\operatorname{arg\,max}_{x\in\mathcal{X}_{\varepsilon}^{c}}\omega(x)$, the symbol with the highest risk in $\mathcal{X}_{\varepsilon}^{c}$ (line 4), and then, to bring the leakage of the subset $\omega(\mathcal{X}_{i})$ below $\varepsilon$, we merge into it the symbol that minimizes $\omega(\mathcal{X}_{i})$ in each iteration (lines 5-8). Each symbol added to $\mathcal{X}_{i}$ is removed from $\mathcal{X}_{\varepsilon}^{c}$ (line 7). Once a subset $\mathcal{X}_{i}$ is formed, we add it to the partition $\mathcal{P}_{\mathcal{X}_{\varepsilon}^{c}}$ and repeat the same process for the remaining $x\in\mathcal{X}_{\varepsilon}^{c}$. After all subsets are formed, the leakage of the last subset $\mathcal{X}_{p}$ may still exceed $\varepsilon$, i.e., $\omega(\mathcal{X}_{p})>\varepsilon$. If this happens, while $\omega(\mathcal{X}_{p})>\varepsilon$ we agglomerate $\mathcal{X}_{p}$ by merging into it the subset that minimizes $\omega(\mathcal{X}_{p})$ (lines 11-15). Complete merging is a special output of our algorithm when no finer partition can be found that maintains the privacy. A minimal Python sketch of this procedure is given below.
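The following sketch implements the greedy refinement under the assumption of a strictly positive joint pmf; it follows the structure of Algorithm 1, but it is an illustrative reimplementation rather than the authors' code.

```python
import numpy as np

def refine_high_risk(p_sx, eps):
    """A sketch of Algorithm 1: greedily partition the high-risk set.
    Returns a list of index lists, one per subset X_i of X_eps^c."""
    p_s = p_sx.sum(axis=1)

    def omega(subset):
        # omega(X_Q) of eq. (4) for a list of x-indices.
        p_q_s = p_sx[:, subset].sum(axis=1) / p_s   # p(X_Q | s)
        p_q = p_sx[:, subset].sum()                 # p(X_Q)
        return np.abs(np.log(p_q_s / p_q)).max()

    w = {x: omega([x]) for x in range(p_sx.shape[1])}
    queue = [x for x in w if w[x] > eps]            # X_eps^c
    parts = []
    while queue:
        # Seed a new subset with the riskiest remaining symbol (line 4).
        seed = max(queue, key=w.get)
        queue.remove(seed)
        part = [seed]
        # Grow it by the symbol minimizing omega (lines 5-8).
        while omega(part) > eps and queue:
            best = min(queue, key=lambda x: omega(part + [x]))
            part.append(best)
            queue.remove(best)
        parts.append(part)
    # If the last subset still leaks, merge whole subsets into it
    # until it is private or one subset remains (lines 11-15).
    while len(parts) > 1 and omega(parts[-1]) > eps:
        last = parts.pop()
        k = min(range(len(parts)), key=lambda i: omega(parts[i] + last))
        parts.append(parts.pop(k) + last)
    return parts
```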

Fig. 2: Normalized mutual information loss (NMIL) for different values of $\varepsilon$: the mean values of NMIL are shown with standard deviations. Algorithm 1 reduces the NMIL in comparison with complete merging.

4 Experiments

To compare Algorithm 1 with complete merging, we randomly generated 1000 joint distributions $p(s,x)$ with $|\mathcal{X}|=20$ and $|\mathcal{S}|=13$. For each distribution, after randomization we obtained the maximum privacy leakage of the high-risk symbols $\max_{y\in\mathcal{X}_{\varepsilon}^{c}}\omega(y)$, the maximum overall privacy leakage $\max_{y\in\mathcal{Y}}\omega(y)$, and the utility loss NMIL under Algorithm 1 and complete merging for $\varepsilon\in\{0.25,0.5,0.75,\cdots,2.25,2.5\}$. Then, for each $\varepsilon$, we derived the mean value and standard deviation of $\max_{y\in\mathcal{X}_{\varepsilon}^{c}}\omega(y)$, $\max_{y}\omega(y)$, and NMIL across these 1000 observations. A condensed sketch of this simulation loop is given below.
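This sketch reuses `omega_subset` and `refine_high_risk` from the earlier snippets; the seed and the Dirichlet sampler for the random joint pmfs are our assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily

def random_joint(n_s, n_x, rng):
    """A joint pmf drawn uniformly from the probability simplex."""
    return rng.dirichlet(np.ones(n_s * n_x)).reshape(n_s, n_x)

# |X| = 20 and |S| = 13 as in the paper's experiments.
for eps in np.arange(0.25, 2.51, 0.25):
    leaks = []
    for _ in range(1000):
        p_sx = random_joint(13, 20, rng)
        parts = refine_high_risk(p_sx, eps)
        if parts:  # leakage of the merged high-risk subsets, cf. eq. (5)
            leaks.append(max(omega_subset(p_sx, part) for part in parts))
    if leaks:
        print(f"eps={eps:.2f}: mean={np.mean(leaks):.3f}, "
              f"std={np.std(leaks):.3f}")
```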

In Fig. 1, the mean value of $\max_{y\in\mathcal{X}_{\varepsilon}^{c}}\omega(y)$ is depicted for each $\varepsilon$, together with its standard deviation (shown as error bars). As expected, complete merging provides a strong guarantee on the privacy leakage and keeps both the mean and the standard deviation much smaller than $\varepsilon$ in all cases. In contrast, Algorithm 1 increases the privacy leakage and lets it move closer to $\varepsilon$ than complete merging does, but crucially it still keeps the mean value and the corresponding deviation below $\varepsilon$ in all cases. As $\varepsilon$ increases, the standard deviation also increases: a larger $\varepsilon$ makes the privacy constraint less strict, so the size of $\mathcal{X}_{\varepsilon}^{c}$ decreases; as a result, the sample size used to calculate the mean privacy leakage shrinks, which causes a larger deviation.

Fig. 2 shows the normalized mutual information loss under Algorithm 1 and complete merging. It demonstrates that Algorithm 1 enhances utility substantially for every value of $\varepsilon$.

Fig. 3: Privacy-utility trade-off: for each value of $\varepsilon$, Algorithm 1 reduces the NMIL substantially while maintaining the privacy leakage below $\varepsilon$.

Finally, to provide a clear privacy-utility trade-off (PUT) comparison, we present PUT curves for both Algorithm 1 and complete merging in Fig. 3. For each $\varepsilon$, the mean value of the overall privacy leakage $\max_{y\in\mathcal{Y}}\omega(y)$ is shown versus the corresponding mean value of NMIL. Clearly, Algorithm 1 enhances utility significantly while satisfying the privacy leakage constraint. For a very strict privacy constraint ($\varepsilon\leq 0.5$), complete merging results in perfect privacy with total utility loss, i.e., $\text{NMIL}=1$. Algorithm 1, however, keeps the average utility loss below $1$.

5 Conclusion

In this paper, we introduced a method to enhance utility in the watchdog privacy mechanism. We showed that it is possible to maintain the privacy constraint and still improve utility. In our approach, instead of randomizing the whole high-risk set, we randomized subsets of high-risk symbols separately. We then proposed a heuristic greedy algorithm that finds subsets of high-risk elements while keeping the leakage of each subset below the privacy constraint $\varepsilon$. The simulation results showed substantial utility enhancement and preservation of the privacy constraint.

In future research, it would be interesting to develop more efficient subset merging algorithms. Investigating effective parameters, such as the subset size, and different privacy and utility measures could be helpful. It would also be beneficial to apply the proposed algorithm to real-world datasets and measure the obtained practical utility.

References

  • [1] F. d. P. Calmon and N. Fawaz, “Privacy against statistical inference,” in Proc. Annu. Allerton Conf. Commun., Control, and Comput., Monticello, IL, 2012, pp. 1401–1408.
  • [2] H. Hsu, S. Asoodeh, and F. d. P. Calmon, “Information-theoretic privacy watchdogs,” in Proc. IEEE Int. Symp. Inf. Theory, Paris, France, 2019, pp. 552–556.
  • [3] A. Makhdoumi, S. Salamatian, N. Fawaz, and M. Médard, “From the information bottleneck to the privacy funnel,” in Proc. IEEE Inf. Theory Workshop, Hobart, TAS, 2014, pp. 501–505.
  • [4] R. Sibson, “Information radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969.
  • [5] S. Verdú, “$\alpha$-mutual information,” in Proc. Information Theory and Applications Workshop (ITA), San Diego, CA, 2015, pp. 1–6.
  • [6] N. Ding, M. A. Zarrabian, and P. Sadeghi, “$\alpha$-information-theoretic privacy watchdog and optimal privatization scheme,” in Proc. IEEE Int. Symp. Inf. Theory, 2021, pp. 2584–2589.
  • [7] J. Liao, O. Kosut, L. Sankar, and F. d. P. Calmon, “Tunable measures for information leakage and applications to privacy-utility tradeoffs,” IEEE Trans. Inf. Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
  • [8] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Proc. IEEE 54th Annu. Symp. Found. Comput. Sci., 2013, pp. 429–438.
  • [9] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?,” SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011.
  • [10] P. Sadeghi, N. Ding, and T. Rakotoarivelo, “On properties and optimization of information-theoretic privacy watchdog,” in Proc. IEEE Inf. Theory Workshop, 2020.
  • [11] P. Kairouz, S. Oh, and P. Viswanath, “Extremal mechanisms for local differential privacy,” in Adv. Neural Inf. Process. Syst., 2014, vol. 4, pp. 2879–2887.
  • [12] A. D. Sarwate and L. Sankar, “A rate-distortion perspective on local differential privacy,” in Proc. Annu. Allerton Conf. Commun., Control, and Comput., 2014, pp. 903–908.
  • [13] Y. Liu, P. Sadeghi, F. Arbabjolfaei, and Y. H. Kim, “Capacity theorems for distributed index coding,” IEEE Trans. Inf. Theory, vol. 66, no. 8, pp. 4653–4680, 2020.