
Measuring Adversarial Robustness using a Voronoi-Epsilon Adversary

Hyeongji Kim (Department of Informatics, University of Bergen, Norway; Institute of Marine Research, Bergen, Norway), Pekka Parviainen (Department of Informatics, University of Bergen, Norway), Ketil Malde (Department of Informatics, University of Bergen, Norway; Institute of Marine Research, Bergen, Norway)
Abstract

Previous studies on robustness have argued that there is a tradeoff between accuracy and adversarial accuracy, and that this tradeoff can be inevitable even when we neglect generalization. We argue that the tradeoff is inherent to the commonly used definition of adversarial accuracy, which uses an adversary that can construct adversarial points constrained by $\epsilon$-balls around data points. As $\epsilon$ gets large, the adversary may use real data points from other classes as adversarial examples. We propose a Voronoi-epsilon adversary that is constrained both by Voronoi cells and by $\epsilon$-balls. This adversary balances between these two notions of perturbation constraint. As a result, adversarial accuracy based on this adversary avoids a tradeoff between accuracy and adversarial accuracy on training data even when $\epsilon$ is large. Finally, we show that a nearest neighbor classifier is the maximally robust classifier against the proposed adversary on the training data.

1 Introduction

By applying a carefully crafted but imperceptible perturbation to input images, so-called adversarial examples can be constructed that cause classifiers to misclassify the perturbed inputs (Szegedy et al., 2013). Defense methods against adversarial examples, such as adversarial training (Madry et al., 2017) and certified defenses (Wong & Kolter, 2018), have often resulted in decreased accuracy on clean samples (Tsipras et al., 2018). Previous studies have argued that the tradeoff between accuracy and adversarial accuracy may be inevitable for classifiers (Tsipras et al., 2018; Dohmatob, 2018; Zhang et al., 2019).

1.1 Problem Setting

Let $\mathcal{X}\subset\mathbb{R}^{\dim}$ be a nonempty input space and $\mathcal{Y}$ be a set of possible classes. Data points $x\in\mathcal{X}$ and corresponding classes $c_{x}\in\mathcal{Y}$ are sampled from a joint distribution $\mathcal{D}$. The distribution $\mathcal{D}$ must satisfy the condition that $c_{x}$ is unique for each $x$. The set of data points is denoted $X$; it is a nonempty finite set. A classifier $f$ assigns a class label from $\mathcal{Y}$ to each point $x\in\mathcal{X}$. $L(x,y)$ is the classification loss of the classifier $f$ given an input $x\in\mathcal{X}$ and a label $y\in\mathcal{Y}$.

Additional notation is summarized in Section A.1 and abbreviations in Section A.2. To simplify the analysis, we focus on settings where we neglect generalization.

1.2 Adversarial Accuracy (AA)

Adversarial accuracy is a commonly used measure of the adversarial robustness of classifiers (Madry et al., 2017; Tsipras et al., 2018). It is defined by an adversary region $R(x)\subset\mathcal{X}$, the allowed region of perturbations for a data point $x$.

Definition 1 (Adversarial accuracy).

Given an adversary that is constrained to an adversary region $R(x)$, adversarial accuracy $a$ is defined as follows.

$a=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}\left[\mathds{1}\left(f(x^{*})=c_{x}\right)\right] \text{ where } x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R(x)} L(x^{\prime},c_{x})$

The choice of $R(x)$ determines the adversarial accuracy that we measure. A commonly considered adversary region is $\mathbb{B}(x,\epsilon)$, the $\epsilon$-ball around a data point $x$ based on a distance metric $d$ (Biggio et al., 2013; Madry et al., 2017; Tsipras et al., 2018; Zhang et al., 2019).

Definition 2 (Standard adversarial accuracy).

When the adversary region is $\mathbb{B}(x,\epsilon)$, we refer to the adversarial accuracy $a$ as standard adversarial accuracy (SAA) $a_{std}(\epsilon)$. For SAA, we denote $R(x)$ as $R_{std}(\epsilon;x)$.

$a_{std}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}\left[\mathds{1}\left(f(x^{*})=c_{x}\right)\right] \text{ where } x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R_{std}(\epsilon;x)} L(x^{\prime},c_{x})$
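
To make the definition concrete, the following Python sketch estimates SAA with a random-sampling adversary under the $l_2$ metric; the classifier interface `f`, the sample count, and the sampling scheme are illustrative assumptions, and since random sampling is weaker than the worst-case arg max adversary, the estimate can only overestimate the true SAA.

import numpy as np

def standard_adversarial_accuracy(f, X, y, eps, n_samples=1000, seed=0):
    # Monte Carlo sketch of SAA (Definition 2): a data point counts as robustly
    # correct only if f predicts its true class on every sampled point of the
    # l2 ball B(x, eps). Random sampling is weaker than the worst-case arg max
    # adversary, so this estimate can only overestimate the true SAA.
    rng = np.random.default_rng(seed)
    correct = 0
    for x, c in zip(np.asarray(X, dtype=float), y):
        dim = x.shape[0]
        u = rng.normal(size=(n_samples, dim))
        u /= np.linalg.norm(u, axis=1, keepdims=True)              # directions
        r = eps * rng.uniform(size=(n_samples, 1)) ** (1.0 / dim)  # radii, uniform in the ball
        candidates = np.vstack([x[None, :], x + r * u])            # include x itself
        correct += int(all(f(xp) == c for xp in candidates))
    return correct / len(X)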

This adversary region $\mathbb{B}(x,\epsilon)$ rests on the implicit assumption that there is a single adequate epsilon $\epsilon$ such that perturbed samples do not change their classes. However, this assumption has limitations, which we explain in the next section.

1.3 The Tradeoff Between Accuracy and Standard Adversarial Accuracy

The usage of an $\epsilon$-ball-based adversary can cause a tradeoff between accuracy and adversarial accuracy. When two clean samples $x_{1}$ and $x_{2}$ with $d(x_{1},x_{2})\leq\epsilon$ have different classes, increasing standard adversarial accuracy requires misclassification. We illustrate this with a toy example.

1.3.1 Toy Example

Let us consider the example visualized in Figure 1(a). The input space is $\mathbb{R}^{2}$. There are only two classes $A$ and $B$, i.e., $\mathcal{Y}=\{A,B\}$. We use the $l_{2}$ norm as the distance metric in this example.

Let us consider the situation where $\epsilon=1.0$ (see Figure 1(c)). In this case, clean samples can themselves be considered adversarial examples. For example, the point $(2,1)$ can be considered an adversarial example originating from the point $(1,1)$. If we choose a robust model based on SAA, we might choose a model with excessive invariance: for example, a model that predicts that all points in $\mathbb{B}((1,1),1)$ (including the point $(2,1)$) have class $A$, or a model that predicts that all points in $\mathbb{B}((2,1),1)$ (including the point $(1,1)$) have class $B$. In either case, the accuracy of the chosen model is smaller than $1$. This situation illustrates the tradeoff between accuracy and standard adversarial accuracy when a large $\epsilon$ is used. It originates from the overlapping adversary regions of samples with different classes.
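
The overlap is easy to verify numerically; a minimal sketch using the two points and classes named above:

import numpy as np

# Two toy points from Figure 1(a): (1, 1) has class A and (2, 1) has class B.
x1, x2 = np.array([1.0, 1.0]), np.array([2.0, 1.0])
eps = 1.0
# True: each point lies in the other's eps-ball, so no classifier can be
# robustly correct on both adversary regions at once.
print(np.linalg.norm(x1 - x2) <= eps)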

To avoid the tradeoff between accuracy and adversarial accuracy, one can use small $\epsilon$ values. Indeed, a previous study has argued that commonly used $\epsilon$ values are small enough to avoid the tradeoff (Yang et al., 2020b). However, when small $\epsilon$ values are used, we can only analyze local robustness and must ignore robustness beyond the chosen $\epsilon$. For instance, consider our example with $\epsilon=0.5$ (see Figure 1(b)). In this case, we ignore robustness on $\mathbb{B}((-2,1),1.0)-\mathbb{B}((-2,1),0.5)$. Models with local but not global robustness allow attackers to fool them with large $\epsilon$ values. Ghiasi et al. (2019) have experimentally shown that even models with certified local robustness can be attacked using large $\epsilon$ values. Note that their attack applies little semantic perturbation even though the perturbation norms measured in $l_{p}$ norms are large.

Figure 1: (a): Plot of the two-dimensional toy example. Data points are colored by class (class $A$: red; class $B$: blue). (b): Visualization of the adversary regions for SAA when $\epsilon=0.5$. The regions are colored by class (class $A$: magenta; class $B$: cyan). The decision boundary of a single nearest neighbor classifier is shown as a dashed black curve. (c): Visualization of the adversary regions for SAA when $\epsilon=1.0$. The overlapping adversary regions of samples with different classes are colored purple.

These limitations motivate us to find an alternative way to measure robustness. The contributions of this paper are as follows.

  • We propose Voronoi-epsilon adversarial accuracy (VAA), which avoids the tradeoff between accuracy and adversarial accuracy. It allows the adversary regions to scale to cover most of the input space without incurring a tradeoff. To the best of our knowledge, this is the first work to achieve this without an external classifier. (In Section A.3, we introduce formulas for adversary regions that can be used to estimate VAA.)

  • We explain the connection between SAA and VAA. We define global Voronoi-epsilon robustness as the limit of Voronoi-epsilon adversarial accuracy. We show that a single nearest neighbor (1-NN) classifier maximizes global Voronoi-epsilon robustness.

2 Voronoi-Epsilon Adversarial Accuracy (VAA)

Our approach restricts the allowed region of perturbations to avoid the tradeoff that originates from the definition of standard adversarial accuracy. This is achieved without limiting the magnitude of $\epsilon$ and without using an external model. To avoid the tradeoff, we want the following property to hold.

$\forall x_{i},x_{j}\in X,\ x_{i}\neq x_{j}\Longrightarrow R(x_{i})\cap R(x_{j})=\varnothing$  (1)

When Property (1) holds for the adversary region, we no longer have the tradeoff, as $x_{i}\notin R(x_{j})$ for $x_{i}\neq x_{j}$. In other words, a clean sample cannot be an adversarial example originating from another clean sample. We propose a new adversary, called a Voronoi-epsilon adversary, that combines the Voronoi adversary introduced by Khoury & Hadfield-Menell (2019) with an $\epsilon$-ball-based adversary. This adversary is constrained to the adversary region $Vor(x)\cap\mathbb{B}(x,\epsilon)$, where $Vor(x)$ is the (open) Voronoi cell around a data point $x\in X$: it consists of every point in $\mathcal{X}$ that is closer to $x$ than to any $x_{clean}\in X-\{x\}$. Property (1) then holds, as $Vor(x_{i})\cap Vor(x_{j})=\varnothing$ for $x_{i}\neq x_{j}$.

Based on the Voronoi-epsilon adversary, we define Voronoi-epsilon adversarial accuracy (VAA).

Definition 3 (Voronoi-epsilon adversarial accuracy).

When the Voronoi-epsilon adversary is used, we refer to the adversarial accuracy as Voronoi-epsilon adversarial accuracy (VAA) $a_{Vor}(\epsilon)$. For VAA, we denote $R(x)$ as $R_{Vor}(\epsilon;x)$.

$a_{Vor}(\epsilon)=\mathbb{E}_{x\in X}\left[\mathds{1}\left(f(x^{*})=c_{x}\right)\right] \text{ where } x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon;x)} L(x^{\prime},c_{x})$

Note that VAA is only defined on the fixed set of data points $X$. As we do not know the distribution $\mathcal{D}$ in practice, the fact that VAA is not defined on the whole input space does not matter.
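
Since membership in $R_{Vor}(\epsilon;x)$ reduces to distance comparisons against the other data points, VAA can be estimated by rejection sampling inside the $\epsilon$-ball. The sketch below assumes the $l_2$ metric and reuses the weak random-sampling adversary from the SAA sketch above; it again overestimates the true worst-case quantity.

import numpy as np

def in_voronoi_eps_region(x_prime, x, X, eps):
    # Membership test for R_Vor(eps; x) = Vor(x) ∩ B(x, eps): x_prime must be
    # within eps of x and strictly closer to x than to every other data point.
    d = np.linalg.norm(x_prime - x)
    if d > eps:
        return False
    return all(d < np.linalg.norm(x_prime - xc)
               for xc in X if not np.array_equal(xc, x))

def voronoi_eps_adversarial_accuracy(f, X, y, eps, n_samples=1000, seed=0):
    # Monte Carlo sketch of VAA (Definition 3): sample B(x, eps) and keep only
    # the points that also lie in Vor(x). As with the SAA sketch, a sampling
    # adversary can only overestimate the true worst-case quantity.
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    correct = 0
    for x, c in zip(X, y):
        dim = x.shape[0]
        u = rng.normal(size=(n_samples, dim))
        u /= np.linalg.norm(u, axis=1, keepdims=True)
        r = eps * rng.uniform(size=(n_samples, 1)) ** (1.0 / dim)
        pts = np.vstack([x[None, :], x + r * u])
        cands = [xp for xp in pts if in_voronoi_eps_region(xp, x, X, eps)]
        correct += int(all(f(xp) == c for xp in cands))
    return correct / len(X)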

Figure 2 shows the adversary regions for VAA with varying $\epsilon$ values. When $\epsilon=0.5$, the regions are the same as for SAA except for the points $(1.5,1)$, $(1.5,-1)$, and $(2,-1.5)$. Even when $\epsilon$ is large ($\epsilon>0.5$), there is no overlapping adversary region, which was the source of the tradeoff in SAA. Therefore, when we choose a robust model based on VAA, we can obtain a model that is both accurate and robust. Figure 2(c) shows that the single nearest neighbor (1-NN) classifier would maximize VAA. The adversary regions cover most of the points in $\mathbb{R}^{2}$ for large $\epsilon$.

Figure 2: Visualization of the adversary regions for VAA with varying $\epsilon$ values. The data points and regions are colored as in Figure 1. (a): $\epsilon=0.5$. (b): $\epsilon=1.0$. (c): $\epsilon=3.5$.
Observation 1.

Let $d_{min}$ be the smallest distance between pairs of data points, i.e., $d_{min}=\min_{x_{i},x_{j}\in X, x_{i}\neq x_{j}} d(x_{i},x_{j})$. Then the following equivalence holds.

$a_{Vor}(\epsilon)=a_{std}(\epsilon)$ when $\epsilon<\frac{1}{2}d_{min}$  (2)

Observation 1 shows that VAA is equivalent to SAA for sufficiently small $\epsilon$ values. This indicates that VAA is an extension of SAA that avoids the tradeoff when $\epsilon$ is large. The proof of the observation is in Section A.5. We point out that equivalent findings were also mentioned in Yang et al. (2020a; b) and Khoury & Hadfield-Menell (2019).
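
The threshold in Observation 1 is cheap to compute from the data alone; a minimal numpy sketch, assuming the $l_2$ metric:

import numpy as np

def saa_vaa_equivalence_threshold(X):
    # Returns d_min / 2: by Observation 1, VAA equals SAA for any eps below it.
    X = np.asarray(X, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_min = dists[~np.eye(len(X), dtype=bool)].min()  # ignore the zero diagonal
    return d_min / 2.0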

As explained in Section 1.3.1, studying only the local robustness of classifiers has a limitation: attackers can attack models with only local robustness by using large $\epsilon$ values. The absence of a tradeoff between accuracy and VAA enables us to increase $\epsilon$ and to study global robustness. We define a measure of global robustness using VAA.

Definition 4 (Global Voronoi-epsilon robustness).

Global Voronoi-epsilon robustness $a_{global}$ is defined as

$a_{global}=\lim_{\epsilon\rightarrow\infty} a_{Vor}(\epsilon).$
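
Since $R_{Vor}(\epsilon;x)$ grows toward the whole Voronoi cell $Vor(x)$ as $\epsilon$ increases, $a_{global}$ can be approximated by sampling the input space, assigning each sample to the Voronoi cell of its nearest data point, and requiring $f$ to predict that point's class there. A Monte Carlo sketch; the $l_2$ metric and the bounding box for sampling are illustrative assumptions:

import numpy as np

def global_voronoi_robustness(f, X, y, n_samples=20_000, box=5.0, seed=0):
    # Monte Carlo sketch of a_global: sample an assumed bounding box
    # [-box, box]^dim uniformly, assign each sample to the Voronoi cell of its
    # nearest data point, and require f to match that cell's class everywhere.
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    robust = np.ones(len(X), dtype=bool)
    for s in rng.uniform(-box, box, size=(n_samples, X.shape[1])):
        i = np.argmin(np.linalg.norm(X - s, axis=1))  # s lies in Vor(X[i])
        if f(s) != y[i]:
            robust[i] = False
    return robust.mean()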

Global Voronoi-epsilon robustness considers the robustness of classifiers at most points of $\mathcal{X}$ (all points except the Voronoi boundary $VB(X)$, which is the complement of the union of the Voronoi cells). We derive the following theorem from global Voronoi-epsilon robustness.

Theorem 1.

A single nearest neighbor (1-NN) classifier maximizes global Voronoi-epsilon robustness $a_{global}$ on training data. Moreover, the 1-NN classifier is the unique classifier that achieves this, up to the Voronoi boundary $VB(X)$.

Note that Theorem 1 only holds for exactly the same data under the exclusive class condition stated in Section 1.1; it does not take generalization into account. The proof of the theorem is in Section A.6.
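
A 1-NN classifier predicts the cell label on every Voronoi cell by construction, so the $a_{global}$ sketch above returns $1$ for it, in line with Theorem 1. A minimal sketch (the two data points come from the toy example; encoding the classes as strings is our own choice):

import numpy as np

def make_1nn_classifier(X, y):
    # 1-NN: predict the class of the nearest data point. Every query inside
    # Vor(x) is closest to x, so the prediction equals c_x on the whole cell.
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    return lambda q: y[np.argmin(np.linalg.norm(X - np.asarray(q, dtype=float), axis=1))]

f_1nn = make_1nn_classifier([[1.0, 1.0], [2.0, 1.0]], ["A", "B"])
print(f_1nn([1.1, 1.0]))  # "A": the query lies in the Voronoi cell of (1, 1)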

3 Discussion

In this work, we address the tradeoff between accuracy and adversarial robustness by introducing the Voronoi-epsilon adversary. Another way to address this tradeoff is to use a Bayes optimal classifier (Suggala et al., 2019; Kim & Wang, 2020). Since this is not available in practice, a reference model must be used as an approximation; in that case, the meaning of adversarial robustness depends on the choice of the reference model. VAA removes the need for a reference model by using only the data point set $X$ and the distance metric $d$ to construct the adversary. This is in contrast to Khoury & Hadfield-Menell (2019), who used Voronoi cell-based constraints (without $\epsilon$-balls) for adversarial training, not for measuring adversarial robustness.

By avoiding the tradeoff with VAA, we can extend the study of local robustness to global robustness. Moreover, Theorem 1 implies that VAA is a measure of agreement with the 1-NN classifier. For sufficiently small $\epsilon$ values, SAA is also a measure of agreement with the 1-NN classifier, because SAA is equivalent to VAA by Observation 1. This implies that many defenses (Goodfellow et al., 2014; Madry et al., 2017; Zhang et al., 2019; Wong & Kolter, 2018; Cohen et al., 2019) with small $\epsilon$ values unknowingly try to make locally the same predictions as a 1-NN classifier.

In our analysis, we do not consider generalization, and robust models are known to often generalize poorly (Raghunathan et al., 2020). The close relationship between adversarially robust models and the 1-NN classifier revealed by Theorem 1 highlights a possible avenue to explore this phenomenon.

Acknowledgments

We thank Dr. Nils Olav Handegard, Dr. Yi Liu, and Jungeum Kim for the helpful feedback. We also thank Dr. Wieland Brendel for the helpful discussions.

References

  • Biggio et al. (2013) Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Šrndić, Pavel Laskov, Giorgio Giacinto, and Fabio Roli. Evasion attacks against machine learning at test time. In Joint European conference on machine learning and knowledge discovery in databases, pp.  387–402. Springer, 2013.
  • Cohen et al. (2019) Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pp. 1310–1320. PMLR, 2019.
  • Dohmatob (2018) Elvis Dohmatob. Limitations of adversarial robustness: strong No Free Lunch Theorem. arXiv preprint arXiv:1810.04065, 2018.
  • Ghiasi et al. (2019) Amin Ghiasi, Ali Shafahi, and Tom Goldstein. Breaking certified defenses: Semantic adversarial examples with spoofed robustness certificates. In International Conference on Learning Representations, 2019.
  • Goodfellow et al. (2014) Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
  • Khoury & Hadfield-Menell (2019) Marc Khoury and Dylan Hadfield-Menell. Adversarial training with Voronoi constraints. arXiv preprint arXiv:1905.01019, 2019.
  • Kim & Wang (2020) Jungeum Kim and Xiao Wang. Sensible adversarial learning, 2020. URL https://openreview.net/forum?id=rJlf_RVKwr.
  • Madry et al. (2017) Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
  • Raghunathan et al. (2020) Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John Duchi, and Percy Liang. Understanding and mitigating the tradeoff between robustness and accuracy. arXiv preprint arXiv:2002.10716, 2020.
  • Suggala et al. (2019) Arun Sai Suggala, Adarsh Prasad, Vaishnavh Nagarajan, and Pradeep Ravikumar. Revisiting adversarial risk. In The 22nd International Conference on Artificial Intelligence and Statistics, pp.  2331–2339. PMLR, 2019.
  • Szegedy et al. (2013) Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
  • Tsipras et al. (2018) Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. arXiv preprint arXiv:1805.12152, 2018.
  • Wong & Kolter (2018) Eric Wong and Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, pp. 5286–5295. PMLR, 2018.
  • Yang et al. (2020a) Yao-Yuan Yang, Cyrus Rashtchian, Yizhen Wang, and Kamalika Chaudhuri. Robustness for non-parametric classification: A generic attack and defense. In International Conference on Artificial Intelligence and Statistics, pp.  941–951. PMLR, 2020a.
  • Yang et al. (2020b) Yao-Yuan Yang, Cyrus Rashtchian, Hongyang Zhang, Russ R Salakhutdinov, and Kamalika Chaudhuri. A closer look at accuracy vs. robustness. Advances in Neural Information Processing Systems, 33, 2020b.
  • Zhang et al. (2019) Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P Xing, Laurent El Ghaoui, and Michael I Jordan. Theoretically principled trade-off between robustness and accuracy. arXiv preprint arXiv:1901.08573, 2019.

Appendix A Appendix

A.1 List of Notation

$\epsilon$ : A perturbation budget.
$\dim$ : The dimension of the input space.
$\mathcal{X}$ : The nonempty input space; $\mathcal{X}\subset\mathbb{R}^{\dim}$.
$\mathcal{Y}$ : The set of possible classes.
$c_{x}$ : The corresponding class of a clean data point $x\in\mathcal{X}$.
$\mathcal{D}$ : The joint distribution; $\mathcal{D}\subset\mathcal{X}\times\mathcal{Y}$.
$X$ : The set of data points. We assume it is a nonempty finite set.
$f$ : The classifier that we want to analyze; $f:\mathcal{X}\rightarrow\mathcal{Y}$.
$L(x,y)$ : The classification loss of the classifier $f$ given an input $x\in\mathcal{X}$ and a label $y\in\mathcal{Y}$.
$R(x)$ : An adversary region, i.e., the allowed region of perturbations for a data point $x$. It can depend on a perturbation budget $\epsilon$.
$\mathds{1}(\cdot)$ : The indicator function; $\mathds{1}(True)=1$ and $\mathds{1}(False)=0$.
$a$ : Adversarial accuracy.
$d$ : The distance metric used for measuring adversarial robustness. It is not limited to $l_{p}$ norms; it can be a learned metric or a more complex distance.
$\mathbb{B}(x,\epsilon)$ : The $\epsilon$-ball around a sample $x$; $\mathbb{B}(x,\epsilon)=\{x^{\prime}\in\mathcal{X} \mid d(x,x^{\prime})\leq\epsilon\}$.
$R_{std}(\epsilon;x)$ : The allowed region of perturbations for standard adversarial accuracy around a data point $x$; $R_{std}(\epsilon;x)=\mathbb{B}(x,\epsilon)$.
$a_{std}(\epsilon)$ : Standard adversarial accuracy with perturbation budget $\epsilon$, i.e., the adversarial accuracy when the adversary region is $R_{std}(\epsilon;x)=\mathbb{B}(x,\epsilon)$.
$HS(x,x_{clean})$ : The (open) half-space closer to $x\in X$ than to $x_{clean}\in X-\{x\}$; $HS(x,x_{clean})=\{x^{\prime}\in\mathcal{X} \mid d(x,x^{\prime})<d(x_{clean},x^{\prime})\}$.
$Vor(x)$ : The (open) Voronoi cell of a sample $x\in X$; $Vor(x)=\{x^{\prime}\in\mathcal{X} \mid d(x,x^{\prime})<d(x_{clean},x^{\prime}),\ \forall x_{clean}\in X-\{x\}\}=\bigcap_{x_{clean}\in X-\{x\}} HS(x,x_{clean})$.
$R_{Vor}(\epsilon;x)$ : The allowed region of perturbations for Voronoi-epsilon adversarial accuracy around a data point $x$; $R_{Vor}(\epsilon;x)=Vor(x)\cap\mathbb{B}(x,\epsilon)$.
$a_{Vor}(\epsilon)$ : Voronoi-epsilon adversarial accuracy with perturbation budget $\epsilon$, i.e., the adversarial accuracy when the adversary region is $R_{Vor}(\epsilon;x)=Vor(x)\cap\mathbb{B}(x,\epsilon)$.
$S^{\mathsf{c}}$ : The complement of a set $S$. For $S\subset\mathcal{X}$, $S^{\mathsf{c}}=\mathcal{X}-S$.
$VB(X)$ : The Voronoi boundary based on $X$, i.e., the complement of the union of the Voronoi cells; $VB(X)=\left(\bigcup_{x\in X} Vor(x)\right)^{\mathsf{c}}=\bigcap_{x\in X} Vor(x)^{\mathsf{c}}$.
$a_{global}$ : Global Voronoi-epsilon robustness.
$N$ : The number of data points.
$R_{Vor;LB}(\epsilon;x)$ : The allowed region of perturbations for the lower bound of Voronoi-epsilon adversarial accuracy around a data point $x$. When $\epsilon<\frac{1}{2}d(x,x_{m+2})$, $R_{Vor;LB}(\epsilon;x)=R_{Vor}(\epsilon;x)$. When $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$, $R_{Vor;LB}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)$.
$R_{Vor;UB}(\epsilon;x)$ : The allowed region of perturbations for the upper bound of Voronoi-epsilon adversarial accuracy around a sample $x$. When $\epsilon<\frac{1}{2}d(x,x_{m+2})$, $R_{Vor;UB}(\epsilon;x)=R_{Vor}(\epsilon;x)$. When $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$, $R_{Vor;UB}(\epsilon;x)=\mathbb{B}\left(x,\frac{1}{2}d(x,x_{m+2})-\tau\right)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)$.
$a_{Vor;\,LB}(\epsilon)$ : The lower bound of Voronoi-epsilon adversarial accuracy with perturbation budget $\epsilon$, defined as the adversarial accuracy when the adversary region for a data point $x$ is $R_{Vor;LB}(\epsilon;x)$.
$a_{Vor;\,UB}(\epsilon)$ : The upper bound of Voronoi-epsilon adversarial accuracy with perturbation budget $\epsilon$, defined as the adversarial accuracy when the adversary region for a data point $x$ is $R_{Vor;UB}(\epsilon;x)$.

A.2 List of Abbreviations

AA Adversarial accuracy.
SAA Standard adversarial accuracy.
VAA Voronoi-epsilon adversarial accuracy.
1-NN Single nearest neighbor.
LB Lower bound.
UB Upper bound.

A.3 Adversary Region $R_{Vor}(\epsilon;x)$

Voronoi-epsilon adversarial accuracy (VAA) uses $R_{Vor}(\epsilon;x)=Vor(x)\cap\mathbb{B}(x,\epsilon)$. We introduce upper and lower bounds on $R_{Vor}(\epsilon;x)$ using the $m+1$ nearest neighbors of a data point $x$. These bounds enable us to calculate approximate upper and lower bounds of VAA.

Lemma 1.

Let $N$ be the number of data points, and let $x_{2},\cdots,x_{N}\in X-\{x\}$ be the sorted neighbors of a data point $x\in X$; that is, $d(x,x_{2})\leq d(x,x_{3})\leq\cdots\leq d(x,x_{N})$. Then the following relations hold for a fixed number $m\leq N-2$.

$R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)$ when $\epsilon<\frac{1}{2}d(x,x_{2})$  (3)
$R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{j} HS(x,x_{i})\right)$ when $\frac{1}{2}d(x,x_{j})\leq\epsilon<\frac{1}{2}d(x,x_{j+1})$, $j=2,\cdots,m+1$  (4)
$\mathbb{B}\left(x,\frac{1}{2}d(x,x_{m+2})-\tau\right)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)\subset R_{Vor}(\epsilon;x)\subset\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)$ when $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$ and $\tau>0$  (5)

When $\epsilon<\frac{1}{2}d(x,x_{m+2})$, we can calculate VAA exactly using relations (3) and (4). Relation (5) of Lemma 1 enables us to calculate lower and upper bounds of VAA when $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$. When $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$, we denote the leftmost set in relation (5) as $R_{Vor;UB}(\epsilon;x)$ and the rightmost set as $R_{Vor;LB}(\epsilon;x)$. (When $\epsilon<\frac{1}{2}d(x,x_{m+2})$, we set $R_{Vor;LB}(\epsilon;x)=R_{Vor;UB}(\epsilon;x)=R_{Vor}(\epsilon;x)$.) Figure 3 visualizes the relationship $R_{Vor;UB}(\epsilon;x)\subset R_{Vor}(\epsilon;x)\subset R_{Vor;LB}(\epsilon;x)\subset R_{std}(\epsilon;x)$. The proof of the lemma is in Section A.4.
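
For a candidate point $x^{\prime}$, membership in the two bounding regions reduces to the distance comparisons of Lemma 1; a sketch under the $l_2$ metric, where `neighbors` is the list $x_{2},\cdots,x_{N}$ sorted by distance to `x` (so `neighbors[0]` is $x_{2}$ and `neighbors[m]` is $x_{m+2}$), for the case $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$:

import numpy as np

def in_bound_regions(x_prime, x, neighbors, eps, m, tau=1e-9):
    # Membership tests for R_Vor;LB and R_Vor;UB when eps >= d(x, x_{m+2}) / 2.
    # Both regions intersect the half-spaces HS(x, x_2), ..., HS(x, x_{m+1});
    # per relation (5), the LB (outer) region uses B(x, eps) and the UB (inner)
    # region uses the smaller ball B(x, d(x, x_{m+2}) / 2 - tau).
    d = np.linalg.norm(x_prime - x)
    in_halfspaces = all(d < np.linalg.norm(x_prime - xi) for xi in neighbors[:m])
    d_m2 = np.linalg.norm(x - neighbors[m])  # d(x, x_{m+2})
    in_lb = d <= eps and in_halfspaces
    in_ub = d <= d_m2 / 2.0 - tau and in_halfspaces
    return in_lb, in_ub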

Figure 3: Visualization of the adversary regions for the point $(1,-1)$ when $m=3$ and $\epsilon=1.5$ in the example of Section 1.3.1. (a): $R_{Vor;UB}(1.5;(1,-1))$. (b): $R_{Vor}(1.5;(1,-1))$. (c): $R_{Vor;LB}(1.5;(1,-1))$. (d): $R_{std}(1.5;(1,-1))$.
Proposition 1.

$a_{Vor;\,LB}(\epsilon)$ is defined as the adversarial accuracy when the allowed region of perturbations is $R_{Vor;LB}(\epsilon;x)$, and $a_{Vor;\,UB}(\epsilon)$ as the adversarial accuracy when the allowed region of perturbations is $R_{Vor;UB}(\epsilon;x)$. Then the following relation holds.

$a_{std}(\epsilon)\leq a_{Vor;\,LB}(\epsilon)\leq a_{Vor}(\epsilon)\leq a_{Vor;\,UB}(\epsilon)$  (6)

The proof of Proposition 1 is in Section A.5.

A.4 Proof of Lemma 1

Proof.

Relation (3).
First, consider the case $\epsilon<\frac{1}{2}d(x,x_{2})$.
Let $x^{\prime}\in\mathbb{B}(x,\epsilon)$. Then $d(x,x^{\prime})\leq\epsilon$.
Also, $\frac{1}{2}d(x,x_{2})\leq\frac{1}{2}d(x,x_{clean})$ for all $x_{clean}\in X-\{x\}$.
By the triangle inequality, $\frac{1}{2}d(x,x_{clean})\leq\frac{1}{2}d(x,x^{\prime})+\frac{1}{2}d(x^{\prime},x_{clean})$.
Combining these inequalities, $d(x,x^{\prime})\leq\epsilon<\frac{1}{2}d(x,x_{2})\leq\frac{1}{2}d(x,x_{clean})\leq\frac{1}{2}d(x,x^{\prime})+\frac{1}{2}d(x^{\prime},x_{clean})$ for all $x_{clean}\in X-\{x\}$.
Hence $\frac{1}{2}d(x,x^{\prime})<\frac{1}{2}d(x^{\prime},x_{clean})=\frac{1}{2}d(x_{clean},x^{\prime})$ for all $x_{clean}\in X-\{x\}$, so $x^{\prime}\in Vor(x)$.
Therefore $\mathbb{B}(x,\epsilon)\subset Vor(x)$ and $R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap Vor(x)=\mathbb{B}(x,\epsilon)$.
Relation (4).
Next, consider the case $\frac{1}{2}d(x,x_{j})\leq\epsilon<\frac{1}{2}d(x,x_{j+1})$, $j=2,\cdots,m+1$.
$R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{N} HS(x,x_{i})\right)\subset\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{j} HS(x,x_{i})\right)$ holds immediately, as $j\leq N$.
It remains to prove $\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{j} HS(x,x_{i})\right)\subset R_{Vor}(\epsilon;x)$.
Let $x^{\prime}\in\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{j} HS(x,x_{i})\right)$. Then $d(x,x^{\prime})\leq\epsilon$ and $d(x,x^{\prime})<d(x_{i},x^{\prime})$ for $i=2,\cdots,j$.
Also, $\frac{1}{2}d(x,x_{j+1})\leq\frac{1}{2}d(x,x_{k})$ for $k=j+1,\cdots,N$.
By the triangle inequality, $\frac{1}{2}d(x,x_{k})\leq\frac{1}{2}d(x,x^{\prime})+\frac{1}{2}d(x^{\prime},x_{k})$.
Combining these inequalities, $d(x,x^{\prime})\leq\epsilon<\frac{1}{2}d(x,x_{j+1})\leq\frac{1}{2}d(x,x_{k})\leq\frac{1}{2}d(x,x^{\prime})+\frac{1}{2}d(x^{\prime},x_{k})$ for $k=j+1,\cdots,N$.
Hence $\frac{1}{2}d(x,x^{\prime})<\frac{1}{2}d(x^{\prime},x_{k})=\frac{1}{2}d(x_{k},x^{\prime})$ for $k=j+1,\cdots,N$.
We obtain $d(x,x^{\prime})\leq\epsilon$ and $d(x,x^{\prime})<d(x_{i},x^{\prime})$ for $i=2,\cdots,N$, which proves $\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{j} HS(x,x_{i})\right)\subset R_{Vor}(\epsilon;x)$.
Relation (5).
Finally, consider the case $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$.
(i) $\mathbb{B}\left(x,\frac{1}{2}d(x,x_{m+2})-\tau\right)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)\subset R_{Vor}(\epsilon;x)$ for $\tau>0$:
Let $x^{\prime}\in\mathbb{B}\left(x,\frac{1}{2}d(x,x_{m+2})-\tau\right)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)$. Then $d(x,x^{\prime})\leq\frac{1}{2}d(x,x_{m+2})-\tau<\frac{1}{2}d(x,x_{m+2})\leq\epsilon$ and $d(x,x^{\prime})<d(x_{i},x^{\prime})$ for $i=2,\cdots,m+1$.
By the same argument as in the proofs of relations (3) and (4), $d(x,x^{\prime})<\frac{1}{2}d(x,x_{m+2})\leq\frac{1}{2}d(x,x_{k})\leq\frac{1}{2}d(x,x^{\prime})+\frac{1}{2}d(x^{\prime},x_{k})$ for $k=m+2,\cdots,N$.
Hence $\frac{1}{2}d(x,x^{\prime})<\frac{1}{2}d(x^{\prime},x_{k})=\frac{1}{2}d(x_{k},x^{\prime})$ for $k=m+2,\cdots,N$.
We obtain $d(x,x^{\prime})<\epsilon$ and $d(x,x^{\prime})<d(x_{i},x^{\prime})$ for $i=2,\cdots,N$, which proves (i).
(ii) $R_{Vor}(\epsilon;x)\subset\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)$:
This is immediate, as $R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{N} HS(x,x_{i})\right)$ and $m+1\leq N$. ∎

A.5 Proof of Observation 1 and Proposition 1

Proof.

Observation 1.
$d_{min}\leq d(x,x_{i})$ for all $x,x_{i}\in X$ with $x\neq x_{i}$.
When $\epsilon<\frac{1}{2}d_{min}$, we have $\epsilon<\frac{1}{2}d_{min}\leq\frac{1}{2}d(x,x_{i})$ for all $x,x_{i}\in X$ with $x\neq x_{i}$. Thus $R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)$ for all $x\in X$ by relation (3) of Lemma 1.
Then $a_{Vor}(\epsilon)$ equals $a_{std}(\epsilon)$, as $R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)=R_{std}(\epsilon;x)$ for all $x\in X$.
Proposition 1.
Consider a data point $x\in X$ and let $x_{2},\cdots,x_{N}\in X-\{x\}$ be the sorted neighbors of $x$.
Let $x^{*1}=\operatorname*{arg\,max}_{x^{\prime}\in R_{std}(\epsilon;x)} L(x^{\prime},c_{x})$, $x^{*2}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor;LB}(\epsilon;x)} L(x^{\prime},c_{x})$, $x^{*3}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon;x)} L(x^{\prime},c_{x})$, and $x^{*4}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor;UB}(\epsilon;x)} L(x^{\prime},c_{x})$.
(i) When $\epsilon<\frac{1}{2}d(x,x_{m+2})$:
$R_{Vor;UB}(\epsilon;x)=R_{Vor}(\epsilon;x)=R_{Vor;LB}(\epsilon;x)$ by definition, and $R_{Vor;LB}(\epsilon;x)=R_{Vor}(\epsilon;x)\subset\mathbb{B}(x,\epsilon)=R_{std}(\epsilon;x)$ by relations (3) and (4).
A smaller adversary region can only weaken the adversary, so $\mathds{1}(f(x^{*1})=c_{x})\leq\mathds{1}(f(x^{*2})=c_{x})=\mathds{1}(f(x^{*3})=c_{x})=\mathds{1}(f(x^{*4})=c_{x})$, as $R_{Vor;UB}(\epsilon;x)=R_{Vor}(\epsilon;x)=R_{Vor;LB}(\epsilon;x)\subset R_{std}(\epsilon;x)$.
(ii) When $\epsilon\geq\frac{1}{2}d(x,x_{m+2})$:
$R_{Vor;UB}(\epsilon;x)\subset R_{Vor}(\epsilon;x)\subset R_{Vor;LB}(\epsilon;x)$ by relation (5), and $R_{Vor;LB}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap\left(\bigcap_{i=2}^{m+1} HS(x,x_{i})\right)\subset\mathbb{B}(x,\epsilon)=R_{std}(\epsilon;x)$ by definition.
Then $\mathds{1}(f(x^{*1})=c_{x})\leq\mathds{1}(f(x^{*2})=c_{x})\leq\mathds{1}(f(x^{*3})=c_{x})\leq\mathds{1}(f(x^{*4})=c_{x})$, as $R_{Vor;UB}(\epsilon;x)\subset R_{Vor}(\epsilon;x)\subset R_{Vor;LB}(\epsilon;x)\subset R_{std}(\epsilon;x)$.
From (i) and (ii), $\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*1})=c_{x})]\leq\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*2})=c_{x})]\leq\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*3})=c_{x})]\leq\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*4})=c_{x})]$.
This finishes the proof of relation (6), as $a_{std}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*1})=c_{x})]$, $a_{Vor;\,LB}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*2})=c_{x})]$, $a_{Vor}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*3})=c_{x})]$, and $a_{Vor;\,UB}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f(x^{*4})=c_{x})]$. ∎

A.6 Proof of Theorem 1

To prove Theorem 1, we introduce the following lemma.

Lemma 2.

By varying $\epsilon$ and $x\in X$, the points $x^{\prime}$ satisfying $x^{\prime}\in R_{Vor}(\epsilon;x)$ fill up $\mathcal{X}$ except for $VB(X)$. In other words, $VB(X)^{\mathsf{c}}=\mathcal{X}-VB(X)\subset\bigcup_{\epsilon\geq 0}\left(\bigcup_{x\in X} R_{Vor}(\epsilon;x)\right)$.

Proof.

Lemma 2.
Let $x^{\prime}\in VB(X)^{\mathsf{c}}$.
$VB(X)^{\mathsf{c}}=\mathcal{X}-VB(X)=\mathcal{X}-\left(\bigcup_{x\in X} Vor(x)\right)^{\mathsf{c}}=\mathcal{X}\cap\left(\bigcup_{x\in X} Vor(x)\right)=\bigcup_{x\in X} Vor(x)$.
Hence there exists $x\in X$ such that $x^{\prime}\in Vor(x)$.
Let $\epsilon^{*}=d(x,x^{\prime})$. Then $d(x,x^{\prime})\leq\epsilon^{*}$ and $x^{\prime}\in Vor(x)$.
Thus $x^{\prime}\in\mathbb{B}(x,\epsilon^{*})\cap Vor(x)=R_{Vor}(\epsilon^{*};x)\subset\bigcup_{\epsilon\geq 0}\left(\bigcup_{x\in X} R_{Vor}(\epsilon;x)\right)$.
We have proved $VB(X)^{\mathsf{c}}\subset\bigcup_{\epsilon\geq 0}\left(\bigcup_{x\in X} R_{Vor}(\epsilon;x)\right)$. ∎

Now we prove Theorem 1.

Proof.

Part 1.
First, we prove that a 1-NN classifier maximizes global Voronoi-epsilon robustness. We denote the 1-NN classifier by $f_{1-NN}$ and calculate its global Voronoi-epsilon robustness.
For a data point $x\in X$, let $x^{\prime}\in R_{Vor}(\epsilon;x)=\mathbb{B}(x,\epsilon)\cap Vor(x)$.
$x^{\prime}\in Vor(x)\Longleftrightarrow d(x,x^{\prime})<d(x_{clean},x^{\prime})$ for all $x_{clean}\in X-\{x\}$.
As $x^{\prime}\in R_{Vor}(\epsilon;x)\subset Vor(x)$, $x$ is the unique nearest data point in $X$ to $x^{\prime}$, and thus $f_{1-NN}(x^{\prime})=c_{x}$.
When $x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon;x)} L(x^{\prime},c_{x})$, $a_{Vor}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f_{1-NN}(x^{*})=c_{x})]=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[1]=1$.
$a_{global}=\lim_{\epsilon\rightarrow\infty} a_{Vor}(\epsilon)=\lim_{\epsilon\rightarrow\infty} 1=1$. Thus $f_{1-NN}$ attains the maximum global Voronoi-epsilon robustness of $1$.

Part 2.
Now we prove that if $f^{*}$ maximizes global Voronoi-epsilon robustness, then $f^{*}$ coincides with the 1-NN classifier except on the Voronoi boundary $VB(X)$.
Let $f^{*1}$ be a function that maximizes global Voronoi-epsilon robustness.
By the last part of Part 1, $f^{*1}$ must satisfy $a_{global}=1$.
For a data point $x\in X$ and $\epsilon_{1}<\epsilon_{2}$, $R_{Vor}(\epsilon_{1};x)=\mathbb{B}(x,\epsilon_{1})\cap Vor(x)\subset\mathbb{B}(x,\epsilon_{2})\cap Vor(x)=R_{Vor}(\epsilon_{2};x)$.
Thus, for a data point $x\in X$ and $\epsilon_{1}<\epsilon_{2}$, $\mathds{1}(f^{*1}(x^{*1})=c_{x})\geq\mathds{1}(f^{*1}(x^{*2})=c_{x})$, where $x^{*1}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon_{1};x)} L(x^{\prime},c_{x})$ and $x^{*2}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon_{2};x)} L(x^{\prime},c_{x})$.
Hence $a_{Vor}(\epsilon_{1})=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f^{*1}(x^{*1})=c_{x})]\geq\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f^{*1}(x^{*2})=c_{x})]=a_{Vor}(\epsilon_{2})$ for $\epsilon_{1}<\epsilon_{2}$. In other words, $a_{Vor}(\epsilon)$ is a decreasing function of $\epsilon$.
Therefore $a_{Vor}(\epsilon)=1$ for all $\epsilon\geq 0$ (if $a_{Vor}(\epsilon^{*})<1$ for some $\epsilon^{*}>0$, this would contradict $a_{global}=1$, as $a_{Vor}(\epsilon)$ is a decreasing function).
$1=a_{Vor}(\epsilon)=\mathbb{E}_{(x,c_{x})\sim\mathcal{D}}[\mathds{1}(f^{*1}(x^{*})=c_{x})]$ where $x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon;x)} L(x^{\prime},c_{x})$.
As the calculation is based on the finite set $X$, $f^{*1}(x^{*})=c_{x}$ ($\because\mathds{1}(f^{*1}(x^{*})=c_{x})=1$) where $x^{*}=\operatorname*{arg\,max}_{x^{\prime}\in R_{Vor}(\epsilon;x)} L(x^{\prime},c_{x})$.
As $x^{*}$ is the worst-case adversarially perturbed sample, i.e., the point in the region whose prediction deviates most from $c_{x}$, correctness at $x^{*}$ implies $f^{*1}(x^{\prime})=c_{x}=f_{1-NN}(x^{\prime})$ for all $x^{\prime}\in R_{Vor}(\epsilon;x)$.
By varying $\epsilon$ and $x\in X$, the points $x^{\prime}$ with $x^{\prime}\in R_{Vor}(\epsilon;x)$ fill up $\mathcal{X}$ except for $VB(X)$ ($\because$ Lemma 2). Hence $f^{*1}$ is equivalent to $f_{1-NN}$ except on the Voronoi boundary $VB(X)$. ∎