¹¹institutetext: School of Computer Science
University of Adelaide, Australia
{mingyu.guo, ali.babar}@adelaide.edu.au ²²institutetext: Graduate School of Information Science
Nara Institute of Science Technology, Japan
[email protected]

Optimizing Affine Maximizer Auctions via Linear Programming: an Application to Revenue Maximizing Mechanism Design for Zero-Day Exploits Markets

Mingyu Guo 11 Hideaki Hata 22 Ali Babar 11

Abstract

Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. The AMA mechanisms are strategy-proof and individually rational (if the agents’ valuations for the outcomes are nonnegative). Every AMA mechanism is characterized by a list of parameters. By focusing on the AMA mechanisms, we turn mechanism design into a value optimization problem, where we only need to adjust the parameters. We propose a linear programming based heuristic for optimizing within the AMA family. We apply our technique to revenue maximizing mechanism design for zero-day exploit markets. We show that due to the nature of the zero-day exploit markets, if there are only two agents (one offender and one defender), then our technique generally produces a near optimal mechanism: the mechanism’s expected revenue is close to the optimal revenue achieved by the optimal strategy-proof and individually rational mechanism (not necessarily an AMA mechanism).

Keywords:

utomated mechanism design $\cdot$ revenue maximization $\cdot$ mechanism design $\cdot$ security economics $\cdot$ bug bounty

1 Introduction

Revenue maximizing mechanism design is a fundamental topic in algorithmic game theory. Myerson [17] solved for the revenue maximizing mechanism for selling a single item, subject to a technical condition called the monotone hazard rate condition. Myerson’s optimal auction is surprisingly elegant. For example, if every agent’s type is drawn from an identical and independent distribution, then the optimal mechanism is simply the Vickrey auction [20] with a reserve price. Unfortunately, Myerson’s technique does not generalize to more complex settings. For example, when it comes to combinatorial auctions (auctions where multiple items are for sale, and the agents bid on bundles of items), revenue maximizing mechanism design remains an open problem. Another notable application domain of revenue maximizing mechanism design is the sponsored search auctions [13], where the search engines sell advertisement slots to advertisers, aiming to maximize revenue besides other objectives. Even though no optimal mechanisms have been derived for general combinatorial auctions or sponsored search auctions, for restricted domains, well performing mechanisms have been obtained based on a variety of revenue-boosting techniques [6, 10, 11, 15, 16].

There are several general revenue-boosting techniques. For example, we may artificially increase the winning chance of lower bidders, in order to drive up the competition faced by the higher bidders. As another example, we may artificially discourage or outright ban certain outcomes, in order to prevent low-revenue outcomes or force the agents to pay more to achieve the discouraged outcomes. The above techniques form the basis of a family of mechanisms called the affine maximizer auctions (AMA). Lavi et al. [14] conjectured that a combinatorial auction is truthful if and only if it is an AMA mechanism, subject to technical conditions. Likhodedov and Sandholm [15, 16] studied revenue maximizing combinatorial auction design by optimizing within the family of AMA mechanisms. The idea of optimizing within the AMA family is a general approach that can be applied to many different mechanism design settings, because generally the AMA mechanisms are well defined and the family contains a large number of mechanisms. By optimizing within the AMA family, there is a good chance of reaching a well-performing mechanism in terms of revenue. However, the issue with optimizing within the AMA family is that every AMA mechanism is characterized by $|O|+n$ parameters, where $|O|$ is the size of the outcome space $O$ , and $n$ is the number of agents. For combinatorial auctions, $|O|$ is exponential in the number of items, which makes it computationally impractical to optimize within the AMA family for this setting. Due to this, Likhodedov and Sandholm only studied the AMA family for the case of selling two items. When there are only two items, the number of parameters is small enough for the authors to conduct optimization via grid-based gradient descent.¹¹1The authors also proposed a restricted version of AMA called the VVCA mechanisms. A VVCA mechanism is only characterized by $2n$ parameters, which makes it much easier to optimize over. On the other hand, due to the fact that the VVCA family is only a tiny subset of the whole AMA family, we lose revenue by focusing only on it.

In this paper, we propose a linear programming based technique for optimizing within the AMA family. Every outcome corresponds to one variable in our LP model. As a result, our technique can handle reasonably large number of outcomes. For example, let us consider the case where $|O|$ is a few hundred. Running a LP with a few hundred variables is computationally tractable. On the other hand, methods such as grid-based gradient descent are impractical.

We apply our new technique to a specific mechanism design problem. Our paper focuses on revenue maximizing mechanism design for zero-day exploit markets [12]. Zero-day exploits refer to software vulnerabilities that are not known to the software vendor. Trading zero-day exploits as legitimate business is a recent trend in the security industry [5]. According to a price list collected by Greenberg [9], the price of a zero-day exploit is between $5000 to $250,000. There are venture capital backed security consulting companies whose business model is selling zero-day exploits [7]. One of the companies mentioned in [7] even offered one point five million US dollars for one new iOS exploit. The reason an exploit can be priced so high is that generally it can stay alive for a long period of time [2]. Unless the software vendor is informed about an exploit, there is very low chance for an exploit to be discovered independently. To remedy this, software vendors often run bug bounty programs, which are markets where the software vendors buy exploits from security researchers [1, 19].

Guo et al. [12] proposed a formal mechanism design model for zero-day exploit markets. In the authors’ model, one exploit is being sold to multiple buyers over a period of time $[0,1]$ . The model is different from the classic single-item auction for the following reasons:

•

There are two categories of buyers. The defenders buy exploits to fix them. Typically there is only one defender, which is the software vendor. The offenders buy exploits to utilize them. National security agencies and police are example offenders. For example, Zerodium [7] is a consulting company that buys zero-day exploits and resells them to mostly government agencies. The offenders wish to utilize an exploit for as long as possible. Once an exploit is obtained by a defender, the exploit becomes worthless. Using mechanism design terminologies, the buyers have externalities.
•

The item being sold is an informational item, which means that we can sell the same item multiple times (e.g., to multiple offenders). Of course, once an exploit is sold to a defender, we cannot sell it to any offenders afterwards, because it has become worthless.
•

Because the item being sold is a piece of information, we cannot simply describe it in full details to the buyers without some kind of payment enforcing mechanism, because otherwise the buyers can walk away with the exploit for free. Furthermore, we cannot ask the buyers to bid on an exploit that carries no description, because the buyers cannot come up with their private valuations if no description is given.

Guo et al. [12] proposed a mechanism property called straight-forwardness: a mechanism is straight-forward if it describes the exploit in full details to the offenders, before they submit their bids. This is required because typically offenders already have many exploits in their arsenals. With full description, they can evaluate whether the exploit being sold is original, and to what extent the exploit helps them. Straight-forwardness does not require the mechanism to describe the exploit to the defenders before they bid (otherwise, the exploit gets fixed). Straight-forwardness only describes to the defenders how severe the exploit is: e.g., this exploit allows anyone to remotely control an iOS device. From the perspective of a defender, every exploit is new (otherwise, it wouldn’t be an exploit). We ask the defenders to come up with their private valuations based on the exploit’s severity, which is exactly how bug bounty markets operate (in bug bounty markets, bugs are priced according to their severity levels [19]).

Guo et al. [12] showed that if straightforwardness is required together with strategy-proofness and individual rationality, then one revenue-maximizing mechanism must work as follows: we describe the exploit in full details to all offenders at time $0$ . We also describe the exploit’s severity level to the defenders at time $0$ . The offenders and defenders submit their bids. The offenders bid to keep the exploit alive for as long as possible. The defenders bid to kill off the exploit as early as possible. In some sense, the model is similar to the cake-cutting problem [3, 4] and the single facility location problem [8, 18].²²2In our model, we allow payments. After all, the objective is to maximize revenue. For this model, the authors proposed one heuristic-based randomized AMA mechanism.

In this paper, for the above model, we use our new technique to optimize within the AMA family. We also show that if there are only two agents (one offender and one defender), and if the defender’s valuation is much lower than the offender’s (typically true for zero-day exploit markets), the optimal AMA mechanism’s revenue is close to the optimal revenue. For demonstrating this result, we propose and study a family of mechanisms called the posted-price mechanisms for our model.

2 Model Description

We use $O$ to denote the outcome space. We use $\Theta_{i}$ to denote agent $i$ ’s type space. We use $v_{i}(\theta_{i},o)$ to denote agent $i$ ’s valuation for outcome $o\in O$ when her type is $\theta_{i}\in\Theta_{i}$ .

For the zero-day exploit mechanism design model proposed in Guo et al. [12], the outcome space is $[0,1]$ . An outcome $o\in[0,1]$ represents when the exploit is killed off (revealed to the defenders).³³3If we allow randomized mechanisms, then an outcome is a nonincreasing function $o(t)$ , with $o(0)=1$ and $o(1)=0$ . $o(t)$ represents the probability for the exploit to be alive at time $t$ . In order to run the technique proposed in this paper, we require the outcome space to be finite. So for this technical reason, we set the outcome space to be $\{0,\frac{1}{k},\frac{2}{k},\ldots,1\}$ . That is, we will only reveal the exploit at these discrete moments. The size of the outcome space $|O|=k+1$ .

The family of AMA mechanisms is defined as follows:

•

Given a type profile $\theta$ , the outcome picked is the following:

$o^{*}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}v_{i}(\theta_{i},o)+a_{o}\right)$

•

Agent $i$ ’s payment equals:

\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}

In the above description, the $u_{i}$ and the $a_{o}$ are constant parameters. $u_{i}\geq 1$ for all $i$ . The $a_{o}$ are unrestricted. In total, there are $n+|O|$ parameters. Every AMA mechanism is characterized by these many parameters. For any assignments of the parameters, the corresponding AMA mechanism is strategy-proof. However, not every AMA mechanism is individually rational. If we further assume that $\forall i,\theta_{i},o$ , $v_{i}(\theta_{i},o)\geq 0$ , then every AMA mechanism is individually rational. To show this, we only need to show that an agent’s valuation is always at least her payment. That is,

v_{i}(\theta_{i},o^{*})\geq\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}

\iff\sum_{j}u_{j}v_{j}(\theta_{j},o^{*})+a_{o^{*}}\geq\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)

The right-hand side is less than or equal to the left-hand side if every agent’s valuation for every outcome is nonnegative.

In our model, an outcome represents when the exploit is killed off. For presentation purpose, we sometimes use $t$ to refer to an outcome.

An offender’s valuation is defined as:

v_{i}(\theta_{i},t)=\int_{0}^{t}f_{\theta_{i}}(x)dx

A defender’s valuation is defined as:

v_{i}(\theta_{i},t)=\int_{t}^{1}f_{\theta_{i}}(x)dx

An offender “enjoys” the exploit from time $0$ to $t$ , and a defender values the safe period from time $t$ to $1$ . $f_{\theta_{i}}(x)$ represents agent $i$ ’s instantaneous value (nonnegative) at time $x$ , when her type is $\theta_{i}$ . Based on the above definitions of the agents’ valuations, we have that every AMA mechanism is individually rational for our model.

3 Optimizing Affine Maximizer Auctions

We recall that an AMA mechanism is characterized by $n+|O|$ parameters ( $u_{i}$ for every agent $i$ , and $a_{o}$ for every outcome $o$ ). For presentation purpose, we define $Z=n+|O|$ and use $p_{1},p_{2},\ldots,p_{Z}$ to refer to the parameters. Let $M(p_{1},p_{2},\ldots,p_{Z})$ be the AMA mechanism characterized by $p_{1}$ to $p_{Z}$ . The task of optimizing within the AMA family is simply to optimize over the parameters:

\max_{p_{1},p_{2},\ldots,p_{Z}}ER(M(p_{1},p_{2},\ldots,p_{Z}))

Here, $ER(M)$ represents mechanism $M$ ’s expected revenue. We have analytical characterization of the AMA payments, so the revenue of $M$ given a specific type profile can be calculated accordingly. Unfortunately, there is no known short-cut for calculating the expected revenue. Given a prior distribution of the $\theta_{i}$ , we need to draw large amount of sample profiles to calculate the expected revenue. For example, if for every agent $i$ , we draw $100$ samples for $\theta_{i}$ , then altogether the number of type profiles is $100^{n}$ . For this reason, in this paper, we focus on cases where $n$ is small.⁴⁴4We have to emphasize that this is not an uncommon constraint when it comes to using numerical methods for maximizing mechanism revenue.

Likhodedov and Sandholm [15, 16] used a grid-based gradient descent approach for optimizing the parameters. Under this approach, suppose we start from a grid point $(p_{1},p_{2},\ldots,p_{Z})$ , we have to examine all neighbouring points $(p_{1}+\delta_{1}h,p_{2}+\delta_{2}h,\ldots,p_{Z}+\delta_{Z}h)$ , where $\delta_{i}\in\{-1,0,1\}$ and $h$ is the grid size. We need to examine $3^{Z}$ points. So this approach requires that both $n$ and $|O|$ be tiny. For example, if $Z=100$ , then the approach is impractical. For our technique, $Z$ is allowed to be large: our technique involves a LP model with $Z$ variables, which takes polynomial time in $Z$ .

A high-level description of our optimizing technique is as follows:

•

We initialize the algorithm with an AMA mechanism: e.g., one based on random parameters, or the VCG mechanism.
•

Given $M_{0}$ characterized by $p_{1}^{0},p_{2}^{0},\ldots,p_{Z}^{0}$ , we use a heuristic to approximate the optimal AMA mechanism near this starting point, using a linear program. A mechanism $M$ (characterized by $p_{1},p_{2},\ldots,p_{Z}$ ) is near $M_{0}$ if $\max_{i}|p_{i}-p_{i}^{0}|\leq\epsilon$ for a threshold $\epsilon$ . We repeat this step using the new mechanism as the starting point.

The above algorithm may end with a locally optimal mechanism, which means that we may need to repeat the algorithm using different initial points.

Now we present the details of the linear program. We index the outcomes using $0,1,\ldots,k$ . We denote the initial mechanism as $M(u_{1}^{0},u_{2}^{0},\ldots,u_{n}^{0},a_{0}^{0},a_{1}^{0},\ldots,a_{k}^{0})$ .

The following optimization model solves for the optimal AMA mechanism near this starting point:

Model 1
Variables: $u_{1},u_{2},\ldots,u_{n},a_{0},a_{1},\ldots,a_{k}$ Maximize: $ER(M(u_{1},u_{2},\ldots,u_{n},a_{0},a_{1},\ldots,a_{k}))$ Subject to:
For all $i$ , $u_{i}\geq 1$ and $u_{i}^{0}-\epsilon\leq u_{i}\leq u_{i}^{0}+\epsilon$ For all $t$ , $a_{i}^{0}-\epsilon\leq a_{i}\leq a_{i}^{0}+\epsilon$

Of course, the above model is not a linear program, as $ER(M)$ is not a linear combination of the variables. We will approximate $ER(M)$ using a linear combination of the variables.

Let $S$ be a large set of type profiles, we will approximate $ER(M)$ as follows:

ER(M)\approx\sum_{\theta\in S}P(\theta)\sum_{i}C_{i}(M,\theta)

Here, $C_{i}(M,\theta)$ is agent $i$ ’s payment under $M$ when the type profile is $\theta$ . One way to pick $S$ is to discretize the type space and let $S$ be the set of all grid points. Now what remains to be done is to approximate $C_{i}(M,\theta)$ using a linear combination of the variables.

C_{i}(M,\theta)=\frac{\max_{o\in O}\left(\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o)+a_{o}\right)-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*})-a_{o^{*}}}{u_{i}}

Here, $o^{*}$ is defined as

o^{*}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}v_{i}(\theta_{i},o)+a_{o}\right)

We use the following heuristic to approximate $C_{i}(M,\theta)$ : because the $u_{i}$ and the $a_{o}$ are close to the $u_{i}^{0}$ and the $a_{o}^{0}$ , we will use the $u_{i}^{0}$ and the $a_{o}^{0}$ to calculate the outcomes mentioned in the above expressions. That is, we assume that for most type profiles, small perturbation in the parameters will not change the mechanism outcomes.

o^{*0}=\arg\max_{o\in O}\left(\sum_{i=1}^{n}u_{i}^{0}v_{i}(\theta_{i},o)+a_{o}^{0}\right)

o^{0}=\arg\max_{o\in O}\left(\sum_{j\neq i}u_{i}^{0}v_{i}(\theta_{i},o)+a_{o}^{0}\right)

We replace $o^{*}$ and $o$ using $o^{*0}$ and $o^{0}$ , we have that

C_{i}(M,\theta)\approx\frac{\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{0})+a_{o^{0}}-\sum_{j\neq i}u_{j}v_{j}(\theta_{j},o^{*0})-a_{o^{*0}}}{u_{i}}

We use $c_{j}$ to denote $v_{j}(\theta_{j},o^{0})$ and $c_{j}^{*}$ to denote $v_{j}(\theta_{j},o^{*0})$ . Both the $c_{j}$ and the $c_{j}^{*}$ are constants.

C_{i}(M,\theta)\approx\frac{\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}}}{u_{i}}

We then observe that for any $x$ , $\frac{x}{u_{i}}=\frac{x}{u_{i}^{0}}-\frac{x(u_{i}-u_{i}^{0})}{u_{i}u_{i}^{0}}$ .

We can then rewrite $C_{i}(M,\theta)$ into:

\frac{\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}}}{u_{i}^{0}}-\frac{(\sum_{j\neq i}c_{j}u_{j}+a_{o^{0}}-\sum_{j\neq i}c_{j}^{*}u_{j}-a_{o^{*0}})(u_{i}-u_{i}^{0})}{u_{i}u_{i}^{0}}

The first term is a linear combination of the variables, as $u_{i}^{0}$ , the $c_{j}$ , and the $c_{j}^{*}$ are all constants.

The second term can be approximated as follows:

\frac{(\sum_{j\neq i}c_{j}u_{j}^{0}+a_{o^{0}}^{0}-\sum_{j\neq i}c_{j}^{*}u_{j}^{0}-a_{o^{*0}}^{0})(u_{i}-u_{i}^{0})}{u_{i}^{0}u_{i}^{0}}

The above is a linear function involving one variable ( $u_{i}$ ).

Using the above heuristic method, we are able to turn Model 1 into a linear program involving $n+|O|$ variables. The number of constraints is $n+|O|+|S|$ . Therefore, we can afford reasonable large $n$ and $|O|$ , as long as $|S|$ is not too large.

4 Zero-Day Exploit Mechanism Design Model

In this section, we focus on a specific mechanism design setting for the zero-day exploit model: there are only two agents: one offender and one defender.

We use $EPO(M)$ to denote the offender’s expected payment under mechanism $M$ . We use $EPD(M)$ to denote the defender’s expected payment under mechanism $M$ .

Let $F$ be the set of all strategy-proof and individually rational mechanisms.

Let $M^{*}$ be the optimal mechanism that maximizes the expected revenue. We recall that $ER(M)$ denotes $M$ ’s expected revenue.

M^{*}=\arg\max_{M\in F}\left(EPO(M)+EPD(M)\right)=\arg\max_{M\in F}ER(M)

Let $MO^{*}$ be the optimal mechanism that maximizes the expected payment collected from the offender.

MO^{*}=\arg\max_{M\in F}EPO(M)

Let $MD^{*}$ be the optimal mechanism that maximizes the expected payment collected from the defender.

MD^{*}=\arg\max_{M\in F}EPD(M)

Obviously, we have

ER(M^{*})=EPO(M^{*})+EPD(M^{*})\leq EPO(MO^{*})+EPD(MD^{*})

We introduce the following posted-price mechanisms. These mechanisms allow only one agent to make decisions.

•

Every outcome $o$ is associated with a price $a_{o}$ .
•

One agent picks the outcome that maximizes her own utility.
•

The other agent makes no decisions and pays $0$ .

It is without loss of generality to assume that both $MO^{*}$ and $MD^{*}$ are posted price mechanisms. Let $MO^{*}$ be the mechanism that maximizes the offender’s expected payment. Let $EPO(M,\theta_{D})$ be the offender’s expected payment under $M$ when the defender bids $\theta_{D}$ . Because $EPO(MO^{*})=\sum_{\theta_{D}}P(\theta_{D})EPO(MO^{*},\theta_{D})$ , there must exist one $\theta_{D}$ so that $EPO(MO^{*},\theta_{D})\geq EPO(MO^{*})$ . If we fix the defender’s type to be the said $\theta_{D}$ , then the mechanism faced by the offender is exactly a posted-price mechanism, and the expected payment of the offender is at least $EPO(MO^{*})$ .

For zero-day exploit market, typically the defender has much lower valuation than the offender. For example, according to [9], an exploit that attacks the Chrome browser sells between 80k and 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced between 0.5k and 15k. Therefore, $EPD(MD^{*})$ is generally much smaller than $EPO(MO^{*})$ .

We use $PP(a_{0},a_{1},\ldots,a_{k})$ to denote the posted-price mechanism with the parameters $a_{0}$ to $a_{k}$ . We use $M(u_{O},u_{D},a_{0},a_{1},\ldots,a_{k})$ to denote the AMA mechanism with the parameters $u_{O}$ (for offender), $u_{D}$ (for defender), and the $a_{t}$ . If the deciding agent under $PP(a_{0},a_{1},\ldots,a_{k})$ is the offender, then $PP(a_{0},a_{1},\ldots,a_{k})$ approaches $M(u_{O},1,0,-a_{1}u_{O},\ldots,-a_{k}u_{O})$ , when $u_{O}$ approaches infinity. If the deciding agent under $PP(a_{0},a_{1},\ldots,a_{k})$ is the defender, then $PP(a_{0},a_{1},\ldots,a_{k})$ approaches $M(1,u_{D},-a_{0}u_{D},\ldots,-a_{k-1}u_{D},0)$ , when $u_{D}$ approaches infinity. For any posted-price mechanism, there exists an AMA mechanism whose expected performance is arbitrary close to it. That is, the optimal AMA mechanism’s expected revenue is at least the optimal expected revenue of posted-price mechanisms.

We solve for the optimal posted-price mechanism for the offender, denoted as $PP^{*}$ .

ER(PP^{*})=EPO(M^{*})\geq ER(M^{*})-EPD(MD^{*})

Because the optimal AMA mechanism outperforms $PP^{*}$ (or they have the same expected revenue), when $EPD(MD^{*})$ is small, we have that the optimal AMA mechanism’s expected revenue is close to the expected revenue of the optimal mechanism $M^{*}$ .

4.1 Optimal Posted-Price Mechanism

In this subsection, we discuss how to solve for the optimal posted-price mechanism.

First of all, we focus on the single-parameter setting [12]. For presentation purpose, we focus on solving for the optimal posted-price mechanism where the offender makes decisions.

In a single-parameter setting, an offender’s valuation is defined as follows, assuming her type is $\theta_{O}$ :

v(\theta_{O},t)=\int_{0}^{t}\theta_{O}c(x)dx

$c(x)$ is a fixed function that characterizes the instantaneous valuation of the exploit by the agent. At time $x$ , the instantaneous valuation is $\theta_{O}c(x)$ .

We use Myerson’s standard technique. We assume $\theta_{O}\in[0,H]$ , where $H$ is a fixed upper bound. We assume $\theta_{O}$ ’s pdf and cdf are $f$ and $F$ , respectively. We also assume the following expression is monotone nondecreasing. This is called the monotone hazard rate condition, which is satisfied by many common distributions.

\phi(\theta_{O})=\theta_{O}-\frac{1-F(\theta_{O})}{f(\theta_{O})}

It turns out that if the above all hold, then the optimal post-price mechanism simply sells the whole time interval $[0,1]$ as a bundle, with a fixed take-it-or-leave-it price $p$ . If the agent is willing to afford $p$ to buy the whole interval, then she gets it. Otherwise, she gets nothing and pays nothing. The optimal mechanism $PP^{*}=PP(0,\infty,\infty,\ldots,\infty,p)$ .

When $k$ is small, we have another algorithm for solving for the optimal posted-price mechanism, and under this algorithm, we can drop the single-parameter assumption.

Let $PP(a_{0},a_{1},\ldots,a_{k})$ be the optimal posted-price mechanism. It is without loss of generality to assume that for any $i<j$ , if both $a_{i}$ and $a_{j}$ are finite, then $a_{i}<a_{j}$ . Otherwise, the outcome $i$ is never chosen, and we can set $a_{i}$ to be infinite. It is also without loss of generality to assume that for any $a_{i}$ , either it is infinite (meaning that this outcome is not allowed), or it must satisfy the following condition:

\exists\theta_{O},v(i,\theta_{O})-a_{i}=\max_{j<i}v(j,\theta_{O})-a_{j}

Here, $v(t,\theta_{O})$ is the offender’s valuation for outcome $t$ when her type is $\theta_{O}$ . The above condition basically says that there exists a type for the offender, if we increase $a_{i}$ just a bit, then it would force the offender to choose an earlier outcome than $i$ . If the condition is not true, then we can safely increase $a_{j}$ . By doing so, we can charge more for those types that choose $j$ . We can also charge more if under some types, the offender chooses a later outcome, which also means that more payment will be collected.

Based on the above condition, if we know the $a_{j}$ for $j<i$ and $\theta_{O}$ , then we can calculate $a_{i}$ . We already know that $a_{0}=0$ . To calculate $a_{1}$ , we can go over all $\theta_{O}$ , possibly by discretizing the offender’s type space. Then, to calculate $a_{2}$ , we can go over all $\theta_{O}$ again. We do this for every $i$ . Let $N$ be the number of types in $\Theta_{O}$ after the discretization. The total number of iterations is then $N^{k}$ .

5 Evaluation

As mentioned earlier, according to [9], an exploit that attacks the Chrome browser sells for at most 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced for at most 15k.

We start with the following setting, which is based on the numbers above. There are two agents. The offender’s valuation function is

v(\theta_{O},t)=\int_{0}^{t}\theta_{O}(1-x)dx

$\theta_{O}$ is drawn uniformly at random from $U(0,400)$ . That is, the offender’s valuation for the whole time interval $[0,1]$ is at most $200$ .

The defender’s valuation function is

v(\theta_{D},t)=\int_{t}^{1}\theta_{D}xdx

$\theta_{D}$ is drawn uniformly at random from $U(0,15)$ . That is, the defender’s valuation for the whole time interval $[0,1]$ is at most $15$ .

The above valuation functions satisfy all the conditions needed for the single-parameter model. So $MO^{*}$ simply sells the whole interval to the offender for a fixed price $p_{O}$ , and $MD^{*}$ simply sells the whole interval to the defender for a fixed price $p_{D}$ .

p_{O}=\arg\max_{p\leq 200}pP(v(\theta_{O},1)\geq p)=\arg\max_{p\leq 200}p\frac{200-p}{200}=100

EPO(MO^{*})=50

p_{D}=\arg\max_{p\leq 15}pP(v(\theta_{D},1)\geq p)=\arg\max_{p\leq 15}p\frac{15-p}{15}=7.5

EPD(MD^{*})=3.75

Therefore, $ER(M^{*})$ is at most $53.75$ . We pick $k=10$ and $\epsilon=0.01$ . We use the VCG mechanism as the initial solution. The VCG mechanism’s expected revenue is $6.9$ , which is very far away from the upper bound. Our technique starts from the VCG mechanism, and at the end produces a mechanism whose expected revenue equals $50.6$ , which is very close to the upper bound $53.75$ . That is, in this case, the optimal AMA mechanism’s expected revenue is close to the optimal mechanism’s expected revenue.

As demonstrated in our analysis, we have the above phenomenon if the defender’s valuation is insignificant compared to the valuation of the offender. We then investigate an example where the defender’s valuation is much higher. We change it so that the defender’s type is drawn from $U(0,150)$ instead of $U(0,15)$ . Now the upper bound for $ER(M^{*})$ is $87.5$ . Our technique produces a mechanism whose expected revenue equals $57.9$ . This time, the achieved value is not close to the upper bound.

6 Conclusion

Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. We proposed a linear programming based heuristic for optimizing within the AMA family. We applied our technique to revenue maximizing mechanism design for zero-day exploit markets. We showed that due to the nature of the zero-day exploit markets, with one offender and one defender, our technique generally produces a near optimal mechanism.

References

[1] Algarni, A.M., Malaiya, Y.K.: Software vulnerability markets: Discoverers and buyers. Int. J. of Computer, Electrical, Automation, Control and Inf. Eng. 8(3), 71–81 (2014)
[2] Bilge, L., Dumitras, T.: Before we knew it: An empirical study of zero-day attacks in the real world. In: Proc. of 2012 ACM Conf. on Computer and Communications Security. pp. 833–844. CCS ’12, ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2382196.2382284
[3] Brams, S.J., Jones, M.A., Klamler, C.: Better ways to cut a cake - revisited. In: Brams, S., Pruhs, K., Woeginger, G. (eds.) Fair Division. No. 07261 in Dagstuhl Seminar Proceedings, Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany, Dagstuhl, Germany (2007)
[4] Chen, Y., Lai, J., Parkes, D., Procaccia, A.: Truth, justice, and cake cutting. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Atlanta, GA, USA (2010)
[5] Egelman, S., Herley, C., van Oorschot, P.C.: Markets for zero-day exploits: Ethics and implications. In: Proc. of 2013 Workshop on New Security Paradigms Workshop. pp. 41–46. NSPW ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2535813.2535818
[6] Emek, Y., Feldman, M., Gamzu, I., Paes Leme, R., Tennenholtz, M.: Signaling schemes for revenue maximization. In: Proceedings of the ACM Conference on Electronic Commerce (EC). Valencia, Spain (2012)
[7] Fisher, D.: Vupen founder launches new zero-day acquisition firm zerodium (2015), july 24, 2015 online: https://threatpost.com/vupen-launches-new-zero-day-acquisition-firm-zerodium/113933/
[8] Goemans, M., Skutella, M.: Cooperative facility location games. Journal of Algorithms 50, 194–214 (2004), early version: SODA 2000, 76–85
[9] Greenberg, A.: Shopping for zero-days: A price list for hackers’ secret software exploits (2012), march 23, 2012 online: http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/
[10] Guo, M., Deligkas, A.: Revenue maximization via hiding item attributes. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI). Beijing, China (2013)
[11] Guo, M., Deligkas, A., Savani, R.: Increasing VCG revenue by decreasing the quality of items. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Quebec, Canada (2014)
[12] Guo, M., Hata, H., Babar, A.: Revenue maximizing markets for zero-day exploits. In: PRIMA 2016: Princiles and Practice of Multi-Agent Systems - 19th International Conference, Phuket, Thailand, August 22-26, 2016, Proceedings. pp. 247–260 (2016)
[13] Lahaie, S., Pennock, D.M., Saberi, A., Vohra, R.V.: Sponsored search auctions. In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V. (eds.) Algorithmic Game Theory, chap. 28. Cambridge University Press (2007)
[14] Lavi, R., Mu’alem, A., Nisan, N.: Towards a characterization of truthful combinatorial auctions. In: Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS). pp. 574–583 (2003)
[15] Likhodedov, A., Sandholm, T.: Methods for boosting revenue in combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). pp. 232–237. San Jose, CA, USA (2004)
[16] Likhodedov, A., Sandholm, T.: Approximating revenue-maximizing combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Pittsburgh, PA, USA (2005)
[17] Myerson, R.: Optimal auction design. Mathematics of Operations Research 6, 58–73 (1981)
[18] Procaccia, A.D., Tennenholtz, M.: Approximate mechanism design without money. In: Proceedings of the ACM Conference on Electronic Commerce (EC). pp. 177–186. Stanford, CA, USA (2009)
[19] Projects, T.C.: Severity guidelines for security issues (2015), accessed September 15, 2015 online: https://www.chromium.org/developers/severity-guidelines
[20] Vickrey, W.: Counterspeculation, auctions, and competitive sealed tenders. Journal of Finance 16, 8–37 (1961)