This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: School of Computer Science
University of Adelaide, Australia
{mingyu.guo, ali.babar}@adelaide.edu.au
22institutetext: Graduate School of Information Science
Nara Institute of Science Technology, Japan
[email protected]

Optimizing Affine Maximizer Auctions via Linear Programming: an Application to Revenue Maximizing Mechanism Design for Zero-Day Exploits Markets

Mingyu Guo 11    Hideaki Hata 22    Ali Babar 11
Abstract

Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. The AMA mechanisms are strategy-proof and individually rational (if the agents’ valuations for the outcomes are nonnegative). Every AMA mechanism is characterized by a list of parameters. By focusing on the AMA mechanisms, we turn mechanism design into a value optimization problem, where we only need to adjust the parameters. We propose a linear programming based heuristic for optimizing within the AMA family. We apply our technique to revenue maximizing mechanism design for zero-day exploit markets. We show that due to the nature of the zero-day exploit markets, if there are only two agents (one offender and one defender), then our technique generally produces a near optimal mechanism: the mechanism’s expected revenue is close to the optimal revenue achieved by the optimal strategy-proof and individually rational mechanism (not necessarily an AMA mechanism).

Keywords:
a

utomated mechanism design \cdot revenue maximization \cdot mechanism design \cdot security economics \cdot bug bounty

1 Introduction

Revenue maximizing mechanism design is a fundamental topic in algorithmic game theory. Myerson [17] solved for the revenue maximizing mechanism for selling a single item, subject to a technical condition called the monotone hazard rate condition. Myerson’s optimal auction is surprisingly elegant. For example, if every agent’s type is drawn from an identical and independent distribution, then the optimal mechanism is simply the Vickrey auction [20] with a reserve price. Unfortunately, Myerson’s technique does not generalize to more complex settings. For example, when it comes to combinatorial auctions (auctions where multiple items are for sale, and the agents bid on bundles of items), revenue maximizing mechanism design remains an open problem. Another notable application domain of revenue maximizing mechanism design is the sponsored search auctions [13], where the search engines sell advertisement slots to advertisers, aiming to maximize revenue besides other objectives. Even though no optimal mechanisms have been derived for general combinatorial auctions or sponsored search auctions, for restricted domains, well performing mechanisms have been obtained based on a variety of revenue-boosting techniques [6, 10, 11, 15, 16].

There are several general revenue-boosting techniques. For example, we may artificially increase the winning chance of lower bidders, in order to drive up the competition faced by the higher bidders. As another example, we may artificially discourage or outright ban certain outcomes, in order to prevent low-revenue outcomes or force the agents to pay more to achieve the discouraged outcomes. The above techniques form the basis of a family of mechanisms called the affine maximizer auctions (AMA). Lavi et al. [14] conjectured that a combinatorial auction is truthful if and only if it is an AMA mechanism, subject to technical conditions. Likhodedov and Sandholm [15, 16] studied revenue maximizing combinatorial auction design by optimizing within the family of AMA mechanisms. The idea of optimizing within the AMA family is a general approach that can be applied to many different mechanism design settings, because generally the AMA mechanisms are well defined and the family contains a large number of mechanisms. By optimizing within the AMA family, there is a good chance of reaching a well-performing mechanism in terms of revenue. However, the issue with optimizing within the AMA family is that every AMA mechanism is characterized by |O|+n|O|+n parameters, where |O||O| is the size of the outcome space OO, and nn is the number of agents. For combinatorial auctions, |O||O| is exponential in the number of items, which makes it computationally impractical to optimize within the AMA family for this setting. Due to this, Likhodedov and Sandholm only studied the AMA family for the case of selling two items. When there are only two items, the number of parameters is small enough for the authors to conduct optimization via grid-based gradient descent.111The authors also proposed a restricted version of AMA called the VVCA mechanisms. A VVCA mechanism is only characterized by 2n2n parameters, which makes it much easier to optimize over. On the other hand, due to the fact that the VVCA family is only a tiny subset of the whole AMA family, we lose revenue by focusing only on it.

In this paper, we propose a linear programming based technique for optimizing within the AMA family. Every outcome corresponds to one variable in our LP model. As a result, our technique can handle reasonably large number of outcomes. For example, let us consider the case where |O||O| is a few hundred. Running a LP with a few hundred variables is computationally tractable. On the other hand, methods such as grid-based gradient descent are impractical.

We apply our new technique to a specific mechanism design problem. Our paper focuses on revenue maximizing mechanism design for zero-day exploit markets [12]. Zero-day exploits refer to software vulnerabilities that are not known to the software vendor. Trading zero-day exploits as legitimate business is a recent trend in the security industry [5]. According to a price list collected by Greenberg [9], the price of a zero-day exploit is between $5000 to $250,000. There are venture capital backed security consulting companies whose business model is selling zero-day exploits [7]. One of the companies mentioned in [7] even offered one point five million US dollars for one new iOS exploit. The reason an exploit can be priced so high is that generally it can stay alive for a long period of time [2]. Unless the software vendor is informed about an exploit, there is very low chance for an exploit to be discovered independently. To remedy this, software vendors often run bug bounty programs, which are markets where the software vendors buy exploits from security researchers [1, 19].

Guo et al. [12] proposed a formal mechanism design model for zero-day exploit markets. In the authors’ model, one exploit is being sold to multiple buyers over a period of time [0,1][0,1]. The model is different from the classic single-item auction for the following reasons:

  • There are two categories of buyers. The defenders buy exploits to fix them. Typically there is only one defender, which is the software vendor. The offenders buy exploits to utilize them. National security agencies and police are example offenders. For example, Zerodium [7] is a consulting company that buys zero-day exploits and resells them to mostly government agencies. The offenders wish to utilize an exploit for as long as possible. Once an exploit is obtained by a defender, the exploit becomes worthless. Using mechanism design terminologies, the buyers have externalities.

  • The item being sold is an informational item, which means that we can sell the same item multiple times (e.g., to multiple offenders). Of course, once an exploit is sold to a defender, we cannot sell it to any offenders afterwards, because it has become worthless.

  • Because the item being sold is a piece of information, we cannot simply describe it in full details to the buyers without some kind of payment enforcing mechanism, because otherwise the buyers can walk away with the exploit for free. Furthermore, we cannot ask the buyers to bid on an exploit that carries no description, because the buyers cannot come up with their private valuations if no description is given.

Guo et al. [12] proposed a mechanism property called straight-forwardness: a mechanism is straight-forward if it describes the exploit in full details to the offenders, before they submit their bids. This is required because typically offenders already have many exploits in their arsenals. With full description, they can evaluate whether the exploit being sold is original, and to what extent the exploit helps them. Straight-forwardness does not require the mechanism to describe the exploit to the defenders before they bid (otherwise, the exploit gets fixed). Straight-forwardness only describes to the defenders how severe the exploit is: e.g., this exploit allows anyone to remotely control an iOS device. From the perspective of a defender, every exploit is new (otherwise, it wouldn’t be an exploit). We ask the defenders to come up with their private valuations based on the exploit’s severity, which is exactly how bug bounty markets operate (in bug bounty markets, bugs are priced according to their severity levels [19]).

Guo et al. [12] showed that if straightforwardness is required together with strategy-proofness and individual rationality, then one revenue-maximizing mechanism must work as follows: we describe the exploit in full details to all offenders at time 0. We also describe the exploit’s severity level to the defenders at time 0. The offenders and defenders submit their bids. The offenders bid to keep the exploit alive for as long as possible. The defenders bid to kill off the exploit as early as possible. In some sense, the model is similar to the cake-cutting problem [3, 4] and the single facility location problem [8, 18].222In our model, we allow payments. After all, the objective is to maximize revenue. For this model, the authors proposed one heuristic-based randomized AMA mechanism.

In this paper, for the above model, we use our new technique to optimize within the AMA family. We also show that if there are only two agents (one offender and one defender), and if the defender’s valuation is much lower than the offender’s (typically true for zero-day exploit markets), the optimal AMA mechanism’s revenue is close to the optimal revenue. For demonstrating this result, we propose and study a family of mechanisms called the posted-price mechanisms for our model.

2 Model Description

We use OO to denote the outcome space. We use Θi\Theta_{i} to denote agent ii’s type space. We use vi(θi,o)v_{i}(\theta_{i},o) to denote agent ii’s valuation for outcome oOo\in O when her type is θiΘi\theta_{i}\in\Theta_{i}.

For the zero-day exploit mechanism design model proposed in Guo et al. [12], the outcome space is [0,1][0,1]. An outcome o[0,1]o\in[0,1] represents when the exploit is killed off (revealed to the defenders).333If we allow randomized mechanisms, then an outcome is a nonincreasing function o(t)o(t), with o(0)=1o(0)=1 and o(1)=0o(1)=0. o(t)o(t) represents the probability for the exploit to be alive at time tt. In order to run the technique proposed in this paper, we require the outcome space to be finite. So for this technical reason, we set the outcome space to be {0,1k,2k,,1}\{0,\frac{1}{k},\frac{2}{k},\ldots,1\}. That is, we will only reveal the exploit at these discrete moments. The size of the outcome space |O|=k+1|O|=k+1.

The family of AMA mechanisms is defined as follows:

  • Given a type profile θ\theta, the outcome picked is the following:

    o=argmaxoO(i=1nuivi(θi,o)+ao)o^{*}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}v_{i}(\theta_{i},o)+a_{o}\right)
  • Agent ii’s payment equals:

    maxoO(jiujvj(θj,o)+ao)jiujvj(θj,o)aoui\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}

In the above description, the uiu_{i} and the aoa_{o} are constant parameters. ui1u_{i}\geq 1 for all ii. The aoa_{o} are unrestricted. In total, there are n+|O|n+|O| parameters. Every AMA mechanism is characterized by these many parameters. For any assignments of the parameters, the corresponding AMA mechanism is strategy-proof. However, not every AMA mechanism is individually rational. If we further assume that i,θi,o\forall i,\theta_{i},o, vi(θi,o)0v_{i}(\theta_{i},o)\geq 0, then every AMA mechanism is individually rational. To show this, we only need to show that an agent’s valuation is always at least her payment. That is,

vi(θi,o)maxoO(jiujvj(θj,o)+ao)jiujvj(θj,o)aouiv_{i}(\theta_{i},o^{*})\geq\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}
jujvj(θj,o)+aomaxoO(jiujvj(θj,o)+ao)\iff\sum_{j}u_{j}v_{j}(\theta_{j},o^{*})+a_{o^{*}}\geq\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)

The right-hand side is less than or equal to the left-hand side if every agent’s valuation for every outcome is nonnegative.

In our model, an outcome represents when the exploit is killed off. For presentation purpose, we sometimes use tt to refer to an outcome.

An offender’s valuation is defined as:

vi(θi,t)=0tfθi(x)𝑑xv_{i}(\theta_{i},t)=\int_{0}^{t}f_{\theta_{i}}(x)dx

A defender’s valuation is defined as:

vi(θi,t)=t1fθi(x)𝑑xv_{i}(\theta_{i},t)=\int_{t}^{1}f_{\theta_{i}}(x)dx

An offender “enjoys” the exploit from time 0 to tt, and a defender values the safe period from time tt to 11. fθi(x)f_{\theta_{i}}(x) represents agent ii’s instantaneous value (nonnegative) at time xx, when her type is θi\theta_{i}. Based on the above definitions of the agents’ valuations, we have that every AMA mechanism is individually rational for our model.

3 Optimizing Affine Maximizer Auctions

We recall that an AMA mechanism is characterized by n+|O|n+|O| parameters (uiu_{i} for every agent ii, and aoa_{o} for every outcome oo). For presentation purpose, we define Z=n+|O|Z=n+|O| and use p1,p2,,pZp_{1},p_{2},\ldots,p_{Z} to refer to the parameters. Let M(p1,p2,,pZ)M(p_{1},p_{2},\ldots,p_{Z}) be the AMA mechanism characterized by p1p_{1} to pZp_{Z}. The task of optimizing within the AMA family is simply to optimize over the parameters:

maxp1,p2,,pZER(M(p1,p2,,pZ))\max_{p_{1},p_{2},\ldots,p_{Z}}ER(M(p_{1},p_{2},\ldots,p_{Z}))

Here, ER(M)ER(M) represents mechanism MM’s expected revenue. We have analytical characterization of the AMA payments, so the revenue of MM given a specific type profile can be calculated accordingly. Unfortunately, there is no known short-cut for calculating the expected revenue. Given a prior distribution of the θi\theta_{i}, we need to draw large amount of sample profiles to calculate the expected revenue. For example, if for every agent ii, we draw 100100 samples for θi\theta_{i}, then altogether the number of type profiles is 100n100^{n}. For this reason, in this paper, we focus on cases where nn is small.444We have to emphasize that this is not an uncommon constraint when it comes to using numerical methods for maximizing mechanism revenue.

Likhodedov and Sandholm [15, 16] used a grid-based gradient descent approach for optimizing the parameters. Under this approach, suppose we start from a grid point (p1,p2,,pZ)(p_{1},p_{2},\ldots,p_{Z}), we have to examine all neighbouring points (p1+δ1h,p2+δ2h,,pZ+δZh)(p_{1}+\delta_{1}h,p_{2}+\delta_{2}h,\ldots,p_{Z}+\delta_{Z}h), where δi{1,0,1}\delta_{i}\in\{-1,0,1\} and hh is the grid size. We need to examine 3Z3^{Z} points. So this approach requires that both nn and |O||O| be tiny. For example, if Z=100Z=100, then the approach is impractical. For our technique, ZZ is allowed to be large: our technique involves a LP model with ZZ variables, which takes polynomial time in ZZ.

A high-level description of our optimizing technique is as follows:

  • We initialize the algorithm with an AMA mechanism: e.g., one based on random parameters, or the VCG mechanism.

  • Given M0M_{0} characterized by p10,p20,,pZ0p_{1}^{0},p_{2}^{0},\ldots,p_{Z}^{0}, we use a heuristic to approximate the optimal AMA mechanism near this starting point, using a linear program. A mechanism MM (characterized by p1,p2,,pZp_{1},p_{2},\ldots,p_{Z}) is near M0M_{0} if maxi|pipi0|ϵ\max_{i}|p_{i}-p_{i}^{0}|\leq\epsilon for a threshold ϵ\epsilon. We repeat this step using the new mechanism as the starting point.

The above algorithm may end with a locally optimal mechanism, which means that we may need to repeat the algorithm using different initial points.

Now we present the details of the linear program. We index the outcomes using 0,1,,k0,1,\ldots,k. We denote the initial mechanism as M(u10,u20,,un0,a00,a10,,ak0)M(u_{1}^{0},u_{2}^{0},\ldots,u_{n}^{0},a_{0}^{0},a_{1}^{0},\ldots,a_{k}^{0}).

The following optimization model solves for the optimal AMA mechanism near this starting point:

Model 1
Variables:
u1,u2,,un,a0,a1,,aku_{1},u_{2},\ldots,u_{n},a_{0},a_{1},\ldots,a_{k}
Maximize: ER(M(u1,u2,,un,a0,a1,,ak))ER(M(u_{1},u_{2},\ldots,u_{n},a_{0},a_{1},\ldots,a_{k})) Subject to:
For all ii, ui1u_{i}\geq 1 and ui0ϵuiui0+ϵu_{i}^{0}-\epsilon\leq u_{i}\leq u_{i}^{0}+\epsilon
For all tt, ai0ϵaiai0+ϵa_{i}^{0}-\epsilon\leq a_{i}\leq a_{i}^{0}+\epsilon

Of course, the above model is not a linear program, as ER(M)ER(M) is not a linear combination of the variables. We will approximate ER(M)ER(M) using a linear combination of the variables.

Let SS be a large set of type profiles, we will approximate ER(M)ER(M) as follows:

ER(M)θSP(θ)iCi(M,θ)ER(M)\approx\sum_{\theta\in S}P(\theta)\sum_{i}C_{i}(M,\theta)

Here, Ci(M,θ)C_{i}(M,\theta) is agent ii’s payment under MM when the type profile is θ\theta. One way to pick SS is to discretize the type space and let SS be the set of all grid points. Now what remains to be done is to approximate Ci(M,θ)C_{i}(M,\theta) using a linear combination of the variables.

Ci(M,θ)=maxoO(jiujvj(θj,o)+ao)jiujvj(θj,o)aouiC_{i}(M,\theta)=\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}

Here, oo^{*} is defined as

o=argmaxoO(i=1nuivi(θi,o)+ao)o^{*}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}v_{i}(\theta_{i},o)+a_{o}\right)

We use the following heuristic to approximate Ci(M,θ)C_{i}(M,\theta): because the uiu_{i} and the aoa_{o} are close to the ui0u_{i}^{0} and the ao0a_{o}^{0}, we will use the ui0u_{i}^{0} and the ao0a_{o}^{0} to calculate the outcomes mentioned in the above expressions. That is, we assume that for most type profiles, small perturbation in the parameters will not change the mechanism outcomes.

o0=argmaxoO(i=1nui0vi(θi,o)+ao0)o^{*0}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}^{0}v_{i}(\theta_{i},o)+a_{o}^{0}\right)
o0=argmaxoO(jiui0vi(θi,o)+ao0)o^{0}=\arg\max_{o\in O}\left(\sum_{j\neq i}u_{i}^{0}v_{i}(\theta_{i},o)+a_{o}^{0}\right)

We replace oo^{*} and oo using o0o^{*0} and o0o^{0}, we have that

Ci(M,θ)jiujvj(θj,o0)+ao0jiujvj(θj,o0)ao0uiC_{i}(M,\theta)\approx\frac{\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{0})+a_{o^{0}}-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*0})-a_{o^{*0}}}{u_{i}}

We use cjc_{j} to denote vj(θj,o0)v_{j}(\theta_{j},o^{0}) and cjc_{j}^{*} to denote vj(θj,o0)v_{j}(\theta_{j},o^{*0}). Both the cjc_{j} and the cjc_{j}^{*} are constants.

Ci(M,θ)jicjuj+ao0jicjujao0uiC_{i}(M,\theta)\approx\frac{\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}}}{u_{i}}

We then observe that for any xx, xui=xui0x(uiui0)uiui0\frac{x}{u_{i}}=\frac{x}{u_{i}^{0}}-\frac{x(u_{i}-u_{i}^{0})}{u_{i}u_{i}^{0}}.

We can then rewrite Ci(M,θ)C_{i}(M,\theta) into:

jicjuj+ao0jicjujao0ui0(jicjuj+ao0jicjujao0)(uiui0)uiui0\frac{\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}}}{u_{i}^{0}}-\frac{(\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}})(u_{i}-u_{i}^{0})}{u_{i}u_{i}^{0}}

The first term is a linear combination of the variables, as ui0u_{i}^{0}, the cjc_{j}, and the cjc_{j}^{*} are all constants.

The second term can be approximated as follows:

(jicjuj0+ao00jicjuj0ao00)(uiui0)ui0ui0\frac{(\sum_{j\neq i}c_{j}u_{j}^{0}+a_{o^{0}}^{0}-\sum_{j\neq i}c_{j}^{*}u_{j}^{0}-a_{o^{*0}}^{0})(u_{i}-u_{i}^{0})}{u_{i}^{0}u_{i}^{0}}

The above is a linear function involving one variable (uiu_{i}).

Using the above heuristic method, we are able to turn Model 1 into a linear program involving n+|O|n+|O| variables. The number of constraints is n+|O|+|S|n+|O|+|S|. Therefore, we can afford reasonable large nn and |O||O|, as long as |S||S| is not too large.

4 Zero-Day Exploit Mechanism Design Model

In this section, we focus on a specific mechanism design setting for the zero-day exploit model: there are only two agents: one offender and one defender.

We use EPO(M)EPO(M) to denote the offender’s expected payment under mechanism MM. We use EPD(M)EPD(M) to denote the defender’s expected payment under mechanism MM.

Let FF be the set of all strategy-proof and individually rational mechanisms.

Let MM^{*} be the optimal mechanism that maximizes the expected revenue. We recall that ER(M)ER(M) denotes MM’s expected revenue.

M=argmaxMF(EPO(M)+EPD(M))=argmaxMFER(M)M^{*}=\arg\max_{M\in F}\left(EPO(M)+EPD(M)\right)=\arg\max_{M\in F}ER(M)

Let MOMO^{*} be the optimal mechanism that maximizes the expected payment collected from the offender.

MO=argmaxMFEPO(M)MO^{*}=\arg\max_{M\in F}EPO(M)

Let MDMD^{*} be the optimal mechanism that maximizes the expected payment collected from the defender.

MD=argmaxMFEPD(M)MD^{*}=\arg\max_{M\in F}EPD(M)

Obviously, we have

ER(M)=EPO(M)+EPD(M)EPO(MO)+EPD(MD)ER(M^{*})=EPO(M^{*})+EPD(M^{*})\leq EPO(MO^{*})+EPD(MD^{*})

We introduce the following posted-price mechanisms. These mechanisms allow only one agent to make decisions.

  • Every outcome oo is associated with a price aoa_{o}.

  • One agent picks the outcome that maximizes her own utility.

  • The other agent makes no decisions and pays 0.

It is without loss of generality to assume that both MOMO^{*} and MDMD^{*} are posted price mechanisms. Let MOMO^{*} be the mechanism that maximizes the offender’s expected payment. Let EPO(M,θD)EPO(M,\theta_{D}) be the offender’s expected payment under MM when the defender bids θD\theta_{D}. Because EPO(MO)=θDP(θD)EPO(MO,θD)EPO(MO^{*})=\sum_{\theta_{D}}P(\theta_{D})EPO(MO^{*},\theta_{D}), there must exist one θD\theta_{D} so that EPO(MO,θD)EPO(MO)EPO(MO^{*},\theta_{D})\geq EPO(MO^{*}). If we fix the defender’s type to be the said θD\theta_{D}, then the mechanism faced by the offender is exactly a posted-price mechanism, and the expected payment of the offender is at least EPO(MO)EPO(MO^{*}).

For zero-day exploit market, typically the defender has much lower valuation than the offender. For example, according to [9], an exploit that attacks the Chrome browser sells between 80k and 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced between 0.5k and 15k. Therefore, EPD(MD)EPD(MD^{*}) is generally much smaller than EPO(MO)EPO(MO^{*}).

We use PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) to denote the posted-price mechanism with the parameters a0a_{0} to aka_{k}. We use M(uO,uD,a0,a1,,ak)M(u_{O},u_{D},a_{0},a_{1},\ldots,a_{k}) to denote the AMA mechanism with the parameters uOu_{O} (for offender), uDu_{D} (for defender), and the ata_{t}. If the deciding agent under PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) is the offender, then PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) approaches M(uO,1,0,a1uO,,akuO)M(u_{O},1,0,-a_{1}u_{O},\ldots,-a_{k}u_{O}), when uOu_{O} approaches infinity. If the deciding agent under PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) is the defender, then PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) approaches M(1,uD,a0uD,,ak1uD,0)M(1,u_{D},-a_{0}u_{D},\ldots,-a_{k-1}u_{D},0), when uDu_{D} approaches infinity. For any posted-price mechanism, there exists an AMA mechanism whose expected performance is arbitrary close to it. That is, the optimal AMA mechanism’s expected revenue is at least the optimal expected revenue of posted-price mechanisms.

We solve for the optimal posted-price mechanism for the offender, denoted as PPPP^{*}.

ER(PP)=EPO(M)ER(M)EPD(MD)ER(PP^{*})=EPO(M^{*})\geq ER(M^{*})-EPD(MD^{*})

Because the optimal AMA mechanism outperforms PPPP^{*} (or they have the same expected revenue), when EPD(MD)EPD(MD^{*}) is small, we have that the optimal AMA mechanism’s expected revenue is close to the expected revenue of the optimal mechanism MM^{*}.

4.1 Optimal Posted-Price Mechanism

In this subsection, we discuss how to solve for the optimal posted-price mechanism.

First of all, we focus on the single-parameter setting [12]. For presentation purpose, we focus on solving for the optimal posted-price mechanism where the offender makes decisions.

In a single-parameter setting, an offender’s valuation is defined as follows, assuming her type is θO\theta_{O}:

v(θO,t)=0tθOc(x)𝑑xv(\theta_{O},t)=\int_{0}^{t}\theta_{O}c(x)dx

c(x)c(x) is a fixed function that characterizes the instantaneous valuation of the exploit by the agent. At time xx, the instantaneous valuation is θOc(x)\theta_{O}c(x).

We use Myerson’s standard technique. We assume θO[0,H]\theta_{O}\in[0,H], where HH is a fixed upper bound. We assume θO\theta_{O}’s pdf and cdf are ff and FF, respectively. We also assume the following expression is monotone nondecreasing. This is called the monotone hazard rate condition, which is satisfied by many common distributions.

ϕ(θO)=θO1F(θO)f(θO)\phi(\theta_{O})=\theta_{O}-\frac{1-F(\theta_{O})}{f(\theta_{O})}

It turns out that if the above all hold, then the optimal post-price mechanism simply sells the whole time interval [0,1][0,1] as a bundle, with a fixed take-it-or-leave-it price pp. If the agent is willing to afford pp to buy the whole interval, then she gets it. Otherwise, she gets nothing and pays nothing. The optimal mechanism PP=PP(0,,,,,p)PP^{*}=PP(0,\infty,\infty,\ldots,\infty,p).

When kk is small, we have another algorithm for solving for the optimal posted-price mechanism, and under this algorithm, we can drop the single-parameter assumption.

Let PP(a0,a1,,ak)PP(a_{0},a_{1},\ldots,a_{k}) be the optimal posted-price mechanism. It is without loss of generality to assume that for any i<ji<j, if both aia_{i} and aja_{j} are finite, then ai<aja_{i}<a_{j}. Otherwise, the outcome ii is never chosen, and we can set aia_{i} to be infinite. It is also without loss of generality to assume that for any aia_{i}, either it is infinite (meaning that this outcome is not allowed), or it must satisfy the following condition:

θO,v(i,θO)ai=maxj<iv(j,θO)aj\exists\theta_{O},v(i,\theta_{O})-a_{i}=\max_{j<i}v(j,\theta_{O})-a_{j}

Here, v(t,θO)v(t,\theta_{O}) is the offender’s valuation for outcome tt when her type is θO\theta_{O}. The above condition basically says that there exists a type for the offender, if we increase aia_{i} just a bit, then it would force the offender to choose an earlier outcome than ii. If the condition is not true, then we can safely increase aja_{j}. By doing so, we can charge more for those types that choose jj. We can also charge more if under some types, the offender chooses a later outcome, which also means that more payment will be collected.

Based on the above condition, if we know the aja_{j} for j<ij<i and θO\theta_{O}, then we can calculate aia_{i}. We already know that a0=0a_{0}=0. To calculate a1a_{1}, we can go over all θO\theta_{O}, possibly by discretizing the offender’s type space. Then, to calculate a2a_{2}, we can go over all θO\theta_{O} again. We do this for every ii. Let NN be the number of types in ΘO\Theta_{O} after the discretization. The total number of iterations is then NkN^{k}.

5 Evaluation

As mentioned earlier, according to [9], an exploit that attacks the Chrome browser sells for at most 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced for at most 15k.

We start with the following setting, which is based on the numbers above. There are two agents. The offender’s valuation function is

v(θO,t)=0tθO(1x)𝑑xv(\theta_{O},t)=\int_{0}^{t}\theta_{O}(1-x)dx

θO\theta_{O} is drawn uniformly at random from U(0,400)U(0,400). That is, the offender’s valuation for the whole time interval [0,1][0,1] is at most 200200.

The defender’s valuation function is

v(θD,t)=t1θDx𝑑xv(\theta_{D},t)=\int_{t}^{1}\theta_{D}xdx

θD\theta_{D} is drawn uniformly at random from U(0,15)U(0,15). That is, the defender’s valuation for the whole time interval [0,1][0,1] is at most 1515.

The above valuation functions satisfy all the conditions needed for the single-parameter model. So MOMO^{*} simply sells the whole interval to the offender for a fixed price pOp_{O}, and MDMD^{*} simply sells the whole interval to the defender for a fixed price pDp_{D}.

pO=argmaxp200pP(v(θO,1)p)=argmaxp200p200p200=100p_{O}=\arg\max_{p\leq 200}pP(v(\theta_{O},1)\geq p)=\arg\max_{p\leq 200}p\frac{200-p}{200}=100
EPO(MO)=50EPO(MO^{*})=50
pD=argmaxp15pP(v(θD,1)p)=argmaxp15p15p15=7.5p_{D}=\arg\max_{p\leq 15}pP(v(\theta_{D},1)\geq p)=\arg\max_{p\leq 15}p\frac{15-p}{15}=7.5
EPD(MD)=3.75EPD(MD^{*})=3.75

Therefore, ER(M)ER(M^{*}) is at most 53.7553.75. We pick k=10k=10 and ϵ=0.01\epsilon=0.01. We use the VCG mechanism as the initial solution. The VCG mechanism’s expected revenue is 6.96.9, which is very far away from the upper bound. Our technique starts from the VCG mechanism, and at the end produces a mechanism whose expected revenue equals 50.650.6, which is very close to the upper bound 53.7553.75. That is, in this case, the optimal AMA mechanism’s expected revenue is close to the optimal mechanism’s expected revenue.

As demonstrated in our analysis, we have the above phenomenon if the defender’s valuation is insignificant compared to the valuation of the offender. We then investigate an example where the defender’s valuation is much higher. We change it so that the defender’s type is drawn from U(0,150)U(0,150) instead of U(0,15)U(0,15). Now the upper bound for ER(M)ER(M^{*}) is 87.587.5. Our technique produces a mechanism whose expected revenue equals 57.957.9. This time, the achieved value is not close to the upper bound.

6 Conclusion

Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. We proposed a linear programming based heuristic for optimizing within the AMA family. We applied our technique to revenue maximizing mechanism design for zero-day exploit markets. We showed that due to the nature of the zero-day exploit markets, with one offender and one defender, our technique generally produces a near optimal mechanism.

References

  • [1] Algarni, A.M., Malaiya, Y.K.: Software vulnerability markets: Discoverers and buyers. Int. J. of Computer, Electrical, Automation, Control and Inf. Eng. 8(3), 71–81 (2014)
  • [2] Bilge, L., Dumitras, T.: Before we knew it: An empirical study of zero-day attacks in the real world. In: Proc. of 2012 ACM Conf. on Computer and Communications Security. pp. 833–844. CCS ’12, ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2382196.2382284
  • [3] Brams, S.J., Jones, M.A., Klamler, C.: Better ways to cut a cake - revisited. In: Brams, S., Pruhs, K., Woeginger, G. (eds.) Fair Division. No. 07261 in Dagstuhl Seminar Proceedings, Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany, Dagstuhl, Germany (2007)
  • [4] Chen, Y., Lai, J., Parkes, D., Procaccia, A.: Truth, justice, and cake cutting. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Atlanta, GA, USA (2010)
  • [5] Egelman, S., Herley, C., van Oorschot, P.C.: Markets for zero-day exploits: Ethics and implications. In: Proc. of 2013 Workshop on New Security Paradigms Workshop. pp. 41–46. NSPW ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2535813.2535818
  • [6] Emek, Y., Feldman, M., Gamzu, I., Paes Leme, R., Tennenholtz, M.: Signaling schemes for revenue maximization. In: Proceedings of the ACM Conference on Electronic Commerce (EC). Valencia, Spain (2012)
  • [7] Fisher, D.: Vupen founder launches new zero-day acquisition firm zerodium (2015), july 24, 2015 online: https://threatpost.com/vupen-launches-new-zero-day-acquisition-firm-zerodium/113933/
  • [8] Goemans, M., Skutella, M.: Cooperative facility location games. Journal of Algorithms 50, 194–214 (2004), early version: SODA 2000, 76–85
  • [9] Greenberg, A.: Shopping for zero-days: A price list for hackers’ secret software exploits (2012), march 23, 2012 online: http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/
  • [10] Guo, M., Deligkas, A.: Revenue maximization via hiding item attributes. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI). Beijing, China (2013)
  • [11] Guo, M., Deligkas, A., Savani, R.: Increasing VCG revenue by decreasing the quality of items. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Quebec, Canada (2014)
  • [12] Guo, M., Hata, H., Babar, A.: Revenue maximizing markets for zero-day exploits. In: PRIMA 2016: Princiles and Practice of Multi-Agent Systems - 19th International Conference, Phuket, Thailand, August 22-26, 2016, Proceedings. pp. 247–260 (2016)
  • [13] Lahaie, S., Pennock, D.M., Saberi, A., Vohra, R.V.: Sponsored search auctions. In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V. (eds.) Algorithmic Game Theory, chap. 28. Cambridge University Press (2007)
  • [14] Lavi, R., Mu’alem, A., Nisan, N.: Towards a characterization of truthful combinatorial auctions. In: Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS). pp. 574–583 (2003)
  • [15] Likhodedov, A., Sandholm, T.: Methods for boosting revenue in combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). pp. 232–237. San Jose, CA, USA (2004)
  • [16] Likhodedov, A., Sandholm, T.: Approximating revenue-maximizing combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Pittsburgh, PA, USA (2005)
  • [17] Myerson, R.: Optimal auction design. Mathematics of Operations Research 6, 58–73 (1981)
  • [18] Procaccia, A.D., Tennenholtz, M.: Approximate mechanism design without money. In: Proceedings of the ACM Conference on Electronic Commerce (EC). pp. 177–186. Stanford, CA, USA (2009)
  • [19] Projects, T.C.: Severity guidelines for security issues (2015), accessed September 15, 2015 online: https://www.chromium.org/developers/severity-guidelines
  • [20] Vickrey, W.: Counterspeculation, auctions, and competitive sealed tenders. Journal of Finance 16, 8–37 (1961)