University of Adelaide, Australia
{mingyu.guo, ali.babar}@adelaide.edu.au 22institutetext: Graduate School of Information Science
Nara Institute of Science Technology, Japan
[email protected]
Optimizing Affine Maximizer Auctions via Linear Programming: an Application to Revenue Maximizing Mechanism Design for Zero-Day Exploits Markets
Abstract
Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. The AMA mechanisms are strategy-proof and individually rational (if the agents’ valuations for the outcomes are nonnegative). Every AMA mechanism is characterized by a list of parameters. By focusing on the AMA mechanisms, we turn mechanism design into a value optimization problem, where we only need to adjust the parameters. We propose a linear programming based heuristic for optimizing within the AMA family. We apply our technique to revenue maximizing mechanism design for zero-day exploit markets. We show that due to the nature of the zero-day exploit markets, if there are only two agents (one offender and one defender), then our technique generally produces a near optimal mechanism: the mechanism’s expected revenue is close to the optimal revenue achieved by the optimal strategy-proof and individually rational mechanism (not necessarily an AMA mechanism).
Keywords:
automated mechanism design revenue maximization mechanism design security economics bug bounty
1 Introduction
Revenue maximizing mechanism design is a fundamental topic in algorithmic game theory. Myerson [17] solved for the revenue maximizing mechanism for selling a single item, subject to a technical condition called the monotone hazard rate condition. Myerson’s optimal auction is surprisingly elegant. For example, if every agent’s type is drawn from an identical and independent distribution, then the optimal mechanism is simply the Vickrey auction [20] with a reserve price. Unfortunately, Myerson’s technique does not generalize to more complex settings. For example, when it comes to combinatorial auctions (auctions where multiple items are for sale, and the agents bid on bundles of items), revenue maximizing mechanism design remains an open problem. Another notable application domain of revenue maximizing mechanism design is the sponsored search auctions [13], where the search engines sell advertisement slots to advertisers, aiming to maximize revenue besides other objectives. Even though no optimal mechanisms have been derived for general combinatorial auctions or sponsored search auctions, for restricted domains, well performing mechanisms have been obtained based on a variety of revenue-boosting techniques [6, 10, 11, 15, 16].
There are several general revenue-boosting techniques. For example, we may artificially increase the winning chance of lower bidders, in order to drive up the competition faced by the higher bidders. As another example, we may artificially discourage or outright ban certain outcomes, in order to prevent low-revenue outcomes or force the agents to pay more to achieve the discouraged outcomes. The above techniques form the basis of a family of mechanisms called the affine maximizer auctions (AMA). Lavi et al. [14] conjectured that a combinatorial auction is truthful if and only if it is an AMA mechanism, subject to technical conditions. Likhodedov and Sandholm [15, 16] studied revenue maximizing combinatorial auction design by optimizing within the family of AMA mechanisms. The idea of optimizing within the AMA family is a general approach that can be applied to many different mechanism design settings, because generally the AMA mechanisms are well defined and the family contains a large number of mechanisms. By optimizing within the AMA family, there is a good chance of reaching a well-performing mechanism in terms of revenue. However, the issue with optimizing within the AMA family is that every AMA mechanism is characterized by parameters, where is the size of the outcome space , and is the number of agents. For combinatorial auctions, is exponential in the number of items, which makes it computationally impractical to optimize within the AMA family for this setting. Due to this, Likhodedov and Sandholm only studied the AMA family for the case of selling two items. When there are only two items, the number of parameters is small enough for the authors to conduct optimization via grid-based gradient descent.111The authors also proposed a restricted version of AMA called the VVCA mechanisms. A VVCA mechanism is only characterized by parameters, which makes it much easier to optimize over. On the other hand, due to the fact that the VVCA family is only a tiny subset of the whole AMA family, we lose revenue by focusing only on it.
In this paper, we propose a linear programming based technique for optimizing within the AMA family. Every outcome corresponds to one variable in our LP model. As a result, our technique can handle reasonably large number of outcomes. For example, let us consider the case where is a few hundred. Running a LP with a few hundred variables is computationally tractable. On the other hand, methods such as grid-based gradient descent are impractical.
We apply our new technique to a specific mechanism design problem. Our paper focuses on revenue maximizing mechanism design for zero-day exploit markets [12]. Zero-day exploits refer to software vulnerabilities that are not known to the software vendor. Trading zero-day exploits as legitimate business is a recent trend in the security industry [5]. According to a price list collected by Greenberg [9], the price of a zero-day exploit is between $5000 to $250,000. There are venture capital backed security consulting companies whose business model is selling zero-day exploits [7]. One of the companies mentioned in [7] even offered one point five million US dollars for one new iOS exploit. The reason an exploit can be priced so high is that generally it can stay alive for a long period of time [2]. Unless the software vendor is informed about an exploit, there is very low chance for an exploit to be discovered independently. To remedy this, software vendors often run bug bounty programs, which are markets where the software vendors buy exploits from security researchers [1, 19].
Guo et al. [12] proposed a formal mechanism design model for zero-day exploit markets. In the authors’ model, one exploit is being sold to multiple buyers over a period of time . The model is different from the classic single-item auction for the following reasons:
-
•
There are two categories of buyers. The defenders buy exploits to fix them. Typically there is only one defender, which is the software vendor. The offenders buy exploits to utilize them. National security agencies and police are example offenders. For example, Zerodium [7] is a consulting company that buys zero-day exploits and resells them to mostly government agencies. The offenders wish to utilize an exploit for as long as possible. Once an exploit is obtained by a defender, the exploit becomes worthless. Using mechanism design terminologies, the buyers have externalities.
-
•
The item being sold is an informational item, which means that we can sell the same item multiple times (e.g., to multiple offenders). Of course, once an exploit is sold to a defender, we cannot sell it to any offenders afterwards, because it has become worthless.
-
•
Because the item being sold is a piece of information, we cannot simply describe it in full details to the buyers without some kind of payment enforcing mechanism, because otherwise the buyers can walk away with the exploit for free. Furthermore, we cannot ask the buyers to bid on an exploit that carries no description, because the buyers cannot come up with their private valuations if no description is given.
Guo et al. [12] proposed a mechanism property called straight-forwardness: a mechanism is straight-forward if it describes the exploit in full details to the offenders, before they submit their bids. This is required because typically offenders already have many exploits in their arsenals. With full description, they can evaluate whether the exploit being sold is original, and to what extent the exploit helps them. Straight-forwardness does not require the mechanism to describe the exploit to the defenders before they bid (otherwise, the exploit gets fixed). Straight-forwardness only describes to the defenders how severe the exploit is: e.g., this exploit allows anyone to remotely control an iOS device. From the perspective of a defender, every exploit is new (otherwise, it wouldn’t be an exploit). We ask the defenders to come up with their private valuations based on the exploit’s severity, which is exactly how bug bounty markets operate (in bug bounty markets, bugs are priced according to their severity levels [19]).
Guo et al. [12] showed that if straightforwardness is required together with strategy-proofness and individual rationality, then one revenue-maximizing mechanism must work as follows: we describe the exploit in full details to all offenders at time . We also describe the exploit’s severity level to the defenders at time . The offenders and defenders submit their bids. The offenders bid to keep the exploit alive for as long as possible. The defenders bid to kill off the exploit as early as possible. In some sense, the model is similar to the cake-cutting problem [3, 4] and the single facility location problem [8, 18].222In our model, we allow payments. After all, the objective is to maximize revenue. For this model, the authors proposed one heuristic-based randomized AMA mechanism.
In this paper, for the above model, we use our new technique to optimize within the AMA family. We also show that if there are only two agents (one offender and one defender), and if the defender’s valuation is much lower than the offender’s (typically true for zero-day exploit markets), the optimal AMA mechanism’s revenue is close to the optimal revenue. For demonstrating this result, we propose and study a family of mechanisms called the posted-price mechanisms for our model.
2 Model Description
We use to denote the outcome space. We use to denote agent ’s type space. We use to denote agent ’s valuation for outcome when her type is .
For the zero-day exploit mechanism design model proposed in Guo et al. [12], the outcome space is . An outcome represents when the exploit is killed off (revealed to the defenders).333If we allow randomized mechanisms, then an outcome is a nonincreasing function , with and . represents the probability for the exploit to be alive at time . In order to run the technique proposed in this paper, we require the outcome space to be finite. So for this technical reason, we set the outcome space to be . That is, we will only reveal the exploit at these discrete moments. The size of the outcome space .
The family of AMA mechanisms is defined as follows:
-
•
Given a type profile , the outcome picked is the following:
-
•
Agent ’s payment equals:
In the above description, the and the are constant parameters. for all . The are unrestricted. In total, there are parameters. Every AMA mechanism is characterized by these many parameters. For any assignments of the parameters, the corresponding AMA mechanism is strategy-proof. However, not every AMA mechanism is individually rational. If we further assume that , , then every AMA mechanism is individually rational. To show this, we only need to show that an agent’s valuation is always at least her payment. That is,
The right-hand side is less than or equal to the left-hand side if every agent’s valuation for every outcome is nonnegative.
In our model, an outcome represents when the exploit is killed off. For presentation purpose, we sometimes use to refer to an outcome.
An offender’s valuation is defined as:
A defender’s valuation is defined as:
An offender “enjoys” the exploit from time to , and a defender values the safe period from time to . represents agent ’s instantaneous value (nonnegative) at time , when her type is . Based on the above definitions of the agents’ valuations, we have that every AMA mechanism is individually rational for our model.
3 Optimizing Affine Maximizer Auctions
We recall that an AMA mechanism is characterized by parameters ( for every agent , and for every outcome ). For presentation purpose, we define and use to refer to the parameters. Let be the AMA mechanism characterized by to . The task of optimizing within the AMA family is simply to optimize over the parameters:
Here, represents mechanism ’s expected revenue. We have analytical characterization of the AMA payments, so the revenue of given a specific type profile can be calculated accordingly. Unfortunately, there is no known short-cut for calculating the expected revenue. Given a prior distribution of the , we need to draw large amount of sample profiles to calculate the expected revenue. For example, if for every agent , we draw samples for , then altogether the number of type profiles is . For this reason, in this paper, we focus on cases where is small.444We have to emphasize that this is not an uncommon constraint when it comes to using numerical methods for maximizing mechanism revenue.
Likhodedov and Sandholm [15, 16] used a grid-based gradient descent approach for optimizing the parameters. Under this approach, suppose we start from a grid point , we have to examine all neighbouring points , where and is the grid size. We need to examine points. So this approach requires that both and be tiny. For example, if , then the approach is impractical. For our technique, is allowed to be large: our technique involves a LP model with variables, which takes polynomial time in .
A high-level description of our optimizing technique is as follows:
-
•
We initialize the algorithm with an AMA mechanism: e.g., one based on random parameters, or the VCG mechanism.
-
•
Given characterized by , we use a heuristic to approximate the optimal AMA mechanism near this starting point, using a linear program. A mechanism (characterized by ) is near if for a threshold . We repeat this step using the new mechanism as the starting point.
The above algorithm may end with a locally optimal mechanism, which means that we may need to repeat the algorithm using different initial points.
Now we present the details of the linear program. We index the outcomes using . We denote the initial mechanism as .
The following optimization model solves for the optimal AMA mechanism near this starting point:
Model 1
Variables:
Maximize:
Subject to:
For all , and
For all ,
Of course, the above model is not a linear program, as is not a linear combination of the variables. We will approximate using a linear combination of the variables.
Let be a large set of type profiles, we will approximate as follows:
Here, is agent ’s payment under when the type profile is . One way to pick is to discretize the type space and let be the set of all grid points. Now what remains to be done is to approximate using a linear combination of the variables.
Here, is defined as
We use the following heuristic to approximate : because the and the are close to the and the , we will use the and the to calculate the outcomes mentioned in the above expressions. That is, we assume that for most type profiles, small perturbation in the parameters will not change the mechanism outcomes.
We replace and using and , we have that
We use to denote and to denote . Both the and the are constants.
We then observe that for any , .
We can then rewrite into:
The first term is a linear combination of the variables, as , the , and the are all constants.
The second term can be approximated as follows:
The above is a linear function involving one variable ().
Using the above heuristic method, we are able to turn Model 1 into a linear program involving variables. The number of constraints is . Therefore, we can afford reasonable large and , as long as is not too large.
4 Zero-Day Exploit Mechanism Design Model
In this section, we focus on a specific mechanism design setting for the zero-day exploit model: there are only two agents: one offender and one defender.
We use to denote the offender’s expected payment under mechanism . We use to denote the defender’s expected payment under mechanism .
Let be the set of all strategy-proof and individually rational mechanisms.
Let be the optimal mechanism that maximizes the expected revenue. We recall that denotes ’s expected revenue.
Let be the optimal mechanism that maximizes the expected payment collected from the offender.
Let be the optimal mechanism that maximizes the expected payment collected from the defender.
Obviously, we have
We introduce the following posted-price mechanisms. These mechanisms allow only one agent to make decisions.
-
•
Every outcome is associated with a price .
-
•
One agent picks the outcome that maximizes her own utility.
-
•
The other agent makes no decisions and pays .
It is without loss of generality to assume that both and are posted price mechanisms. Let be the mechanism that maximizes the offender’s expected payment. Let be the offender’s expected payment under when the defender bids . Because , there must exist one so that . If we fix the defender’s type to be the said , then the mechanism faced by the offender is exactly a posted-price mechanism, and the expected payment of the offender is at least .
For zero-day exploit market, typically the defender has much lower valuation than the offender. For example, according to [9], an exploit that attacks the Chrome browser sells between 80k and 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced between 0.5k and 15k. Therefore, is generally much smaller than .
We use to denote the posted-price mechanism with the parameters to . We use to denote the AMA mechanism with the parameters (for offender), (for defender), and the . If the deciding agent under is the offender, then approaches , when approaches infinity. If the deciding agent under is the defender, then approaches , when approaches infinity. For any posted-price mechanism, there exists an AMA mechanism whose expected performance is arbitrary close to it. That is, the optimal AMA mechanism’s expected revenue is at least the optimal expected revenue of posted-price mechanisms.
We solve for the optimal posted-price mechanism for the offender, denoted as .
Because the optimal AMA mechanism outperforms (or they have the same expected revenue), when is small, we have that the optimal AMA mechanism’s expected revenue is close to the expected revenue of the optimal mechanism .
4.1 Optimal Posted-Price Mechanism
In this subsection, we discuss how to solve for the optimal posted-price mechanism.
First of all, we focus on the single-parameter setting [12]. For presentation purpose, we focus on solving for the optimal posted-price mechanism where the offender makes decisions.
In a single-parameter setting, an offender’s valuation is defined as follows, assuming her type is :
is a fixed function that characterizes the instantaneous valuation of the exploit by the agent. At time , the instantaneous valuation is .
We use Myerson’s standard technique. We assume , where is a fixed upper bound. We assume ’s pdf and cdf are and , respectively. We also assume the following expression is monotone nondecreasing. This is called the monotone hazard rate condition, which is satisfied by many common distributions.
It turns out that if the above all hold, then the optimal post-price mechanism simply sells the whole time interval as a bundle, with a fixed take-it-or-leave-it price . If the agent is willing to afford to buy the whole interval, then she gets it. Otherwise, she gets nothing and pays nothing. The optimal mechanism .
When is small, we have another algorithm for solving for the optimal posted-price mechanism, and under this algorithm, we can drop the single-parameter assumption.
Let be the optimal posted-price mechanism. It is without loss of generality to assume that for any , if both and are finite, then . Otherwise, the outcome is never chosen, and we can set to be infinite. It is also without loss of generality to assume that for any , either it is infinite (meaning that this outcome is not allowed), or it must satisfy the following condition:
Here, is the offender’s valuation for outcome when her type is . The above condition basically says that there exists a type for the offender, if we increase just a bit, then it would force the offender to choose an earlier outcome than . If the condition is not true, then we can safely increase . By doing so, we can charge more for those types that choose . We can also charge more if under some types, the offender chooses a later outcome, which also means that more payment will be collected.
Based on the above condition, if we know the for and , then we can calculate . We already know that . To calculate , we can go over all , possibly by discretizing the offender’s type space. Then, to calculate , we can go over all again. We do this for every . Let be the number of types in after the discretization. The total number of iterations is then .
5 Evaluation
As mentioned earlier, according to [9], an exploit that attacks the Chrome browser sells for at most 200k for offensive clients (USD). According to Google’s official bug bounty reward program for the Chrome browser [19], a serious exploit is priced for at most 15k.
We start with the following setting, which is based on the numbers above. There are two agents. The offender’s valuation function is
is drawn uniformly at random from . That is, the offender’s valuation for the whole time interval is at most .
The defender’s valuation function is
is drawn uniformly at random from . That is, the defender’s valuation for the whole time interval is at most .
The above valuation functions satisfy all the conditions needed for the single-parameter model. So simply sells the whole interval to the offender for a fixed price , and simply sells the whole interval to the defender for a fixed price .
Therefore, is at most . We pick and . We use the VCG mechanism as the initial solution. The VCG mechanism’s expected revenue is , which is very far away from the upper bound. Our technique starts from the VCG mechanism, and at the end produces a mechanism whose expected revenue equals , which is very close to the upper bound . That is, in this case, the optimal AMA mechanism’s expected revenue is close to the optimal mechanism’s expected revenue.
As demonstrated in our analysis, we have the above phenomenon if the defender’s valuation is insignificant compared to the valuation of the offender. We then investigate an example where the defender’s valuation is much higher. We change it so that the defender’s type is drawn from instead of . Now the upper bound for is . Our technique produces a mechanism whose expected revenue equals . This time, the achieved value is not close to the upper bound.
6 Conclusion
Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. We proposed a linear programming based heuristic for optimizing within the AMA family. We applied our technique to revenue maximizing mechanism design for zero-day exploit markets. We showed that due to the nature of the zero-day exploit markets, with one offender and one defender, our technique generally produces a near optimal mechanism.
References
- [1] Algarni, A.M., Malaiya, Y.K.: Software vulnerability markets: Discoverers and buyers. Int. J. of Computer, Electrical, Automation, Control and Inf. Eng. 8(3), 71–81 (2014)
- [2] Bilge, L., Dumitras, T.: Before we knew it: An empirical study of zero-day attacks in the real world. In: Proc. of 2012 ACM Conf. on Computer and Communications Security. pp. 833–844. CCS ’12, ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2382196.2382284
- [3] Brams, S.J., Jones, M.A., Klamler, C.: Better ways to cut a cake - revisited. In: Brams, S., Pruhs, K., Woeginger, G. (eds.) Fair Division. No. 07261 in Dagstuhl Seminar Proceedings, Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany, Dagstuhl, Germany (2007)
- [4] Chen, Y., Lai, J., Parkes, D., Procaccia, A.: Truth, justice, and cake cutting. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Atlanta, GA, USA (2010)
- [5] Egelman, S., Herley, C., van Oorschot, P.C.: Markets for zero-day exploits: Ethics and implications. In: Proc. of 2013 Workshop on New Security Paradigms Workshop. pp. 41–46. NSPW ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2535813.2535818
- [6] Emek, Y., Feldman, M., Gamzu, I., Paes Leme, R., Tennenholtz, M.: Signaling schemes for revenue maximization. In: Proceedings of the ACM Conference on Electronic Commerce (EC). Valencia, Spain (2012)
- [7] Fisher, D.: Vupen founder launches new zero-day acquisition firm zerodium (2015), july 24, 2015 online: https://threatpost.com/vupen-launches-new-zero-day-acquisition-firm-zerodium/113933/
- [8] Goemans, M., Skutella, M.: Cooperative facility location games. Journal of Algorithms 50, 194–214 (2004), early version: SODA 2000, 76–85
- [9] Greenberg, A.: Shopping for zero-days: A price list for hackers’ secret software exploits (2012), march 23, 2012 online: http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/
- [10] Guo, M., Deligkas, A.: Revenue maximization via hiding item attributes. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI). Beijing, China (2013)
- [11] Guo, M., Deligkas, A., Savani, R.: Increasing VCG revenue by decreasing the quality of items. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Quebec, Canada (2014)
- [12] Guo, M., Hata, H., Babar, A.: Revenue maximizing markets for zero-day exploits. In: PRIMA 2016: Princiles and Practice of Multi-Agent Systems - 19th International Conference, Phuket, Thailand, August 22-26, 2016, Proceedings. pp. 247–260 (2016)
- [13] Lahaie, S., Pennock, D.M., Saberi, A., Vohra, R.V.: Sponsored search auctions. In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V. (eds.) Algorithmic Game Theory, chap. 28. Cambridge University Press (2007)
- [14] Lavi, R., Mu’alem, A., Nisan, N.: Towards a characterization of truthful combinatorial auctions. In: Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS). pp. 574–583 (2003)
- [15] Likhodedov, A., Sandholm, T.: Methods for boosting revenue in combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). pp. 232–237. San Jose, CA, USA (2004)
- [16] Likhodedov, A., Sandholm, T.: Approximating revenue-maximizing combinatorial auctions. In: Proceedings of the National Conference on Artificial Intelligence (AAAI). Pittsburgh, PA, USA (2005)
- [17] Myerson, R.: Optimal auction design. Mathematics of Operations Research 6, 58–73 (1981)
- [18] Procaccia, A.D., Tennenholtz, M.: Approximate mechanism design without money. In: Proceedings of the ACM Conference on Electronic Commerce (EC). pp. 177–186. Stanford, CA, USA (2009)
- [19] Projects, T.C.: Severity guidelines for security issues (2015), accessed September 15, 2015 online: https://www.chromium.org/developers/severity-guidelines
- [20] Vickrey, W.: Counterspeculation, auctions, and competitive sealed tenders. Journal of Finance 16, 8–37 (1961)