Recommender Systems meet Mechanism Design
Abstract
Machine learning has developed a variety of tools for learning and representing high-dimensional distributions with structure. Recent years have also seen big advances in designing multi-item mechanisms. Akin to overfitting, however, these mechanisms can be extremely sensitive to the Bayesian prior that they target, which becomes problematic when that prior is only approximately known. At the same time, even if access to the exact Bayesian prior is given, it is known that optimal or even approximately optimal multi-item mechanisms run into sample, computational, representation and communication intractability barriers.
We consider a natural class of multi-item mechanism design problems with very large numbers of items, but where the bidders’ value distributions can be well-approximated by a topic model akin to those used in recommendation systems with very large numbers of possible recommendations. We propose a mechanism design framework for this setting, building on a recent robustification framework by Brustle et al., which disentangles the statistical challenge of estimating a multi-dimensional prior from the task of designing a good mechanism for it, and robustifies the performance of the latter against the estimation error of the former. We provide an extension of this framework appropriate for our setting, which allows us to exploit the expressive power of topic models to reduce the effective dimensionality of the mechanism design problem and remove the dependence of its computational, communication and representation complexity on the number of items.
1 Introduction
Mechanism Design has found important applications in the design of offline and online markets. One of its main applications is the design of auctions, where a common goal is to maximize the seller’s revenue from the sale of one or multiple items to one or multiple bidders. This is challenging because bidders are strategic and interact with the auction in a way that benefits themselves rather than the seller. It is well understood that, without any information about the bidders’ willingness to pay for different bundles of items, there is no meaningful way to optimize revenue. As such, a classical approach in Economics is to assume that bidders’ types – which determine their values for different bundles and thus their willingness to pay for different bundles – are not arbitrary but randomly drawn from a joint distribution $\mathcal{D}$ that is common knowledge, i.e. known to all bidders and the auctioneer. With such a Bayesian prior, the revenue of different mechanisms is compared on the basis of what revenue they achieve in expectation with respect to bidder type vectors drawn from $\mathcal{D}$, and assuming that bidders play according to some (Bayesian) Nash equilibrium strategies, or some other type of (boundedly) rational behavior, e.g. no-regret learning.
Even with a Bayesian prior, however, revenue maximization is quite a challenging task. While Myerson’s celebrated work showed that a relatively simple mechanism is optimal in single-item settings [38], characterizing the structure of optimal multi-item mechanisms has been notoriously difficult both analytically and computationally. Indeed, it is known that (even approximately) optimal multi-item mechanisms may require description complexity that scales exponentially in the number of items, even when there is a single buyer [34, 27, 24, 3], they might be computationally intractable, even in simple settings [10, 23, 20], and they may exhibit several counter-intuitive properties which do not arise in single-item settings; see survey [21]. Nevertheless, recent years have seen substantial progress on various fronts: analytical characterizations of optimal multi-item mechanisms [22, 30, 35, 24]; computational frameworks for computing near-optimal multi-item mechanisms [2, 9, 10, 11, 12]; approximate multi-item revenue optimization via simple mechanisms [17, 18, 1, 33, 36, 14, 4, 46, 40, 13, 19, 15, 25]; and (approximate) multi-item revenue optimization using sample access to the type distribution [37, 31, 8, 44, 32, 7], including via the use of deep learning [29, 42, 28].
The afore-described progress on multi-item revenue optimization provides a diversity of tools that can be combined to alleviate the analytical and computational intractability of optimal mechanisms. Yet, there still remains an important challenge in applying those tools: they typically require that the type distribution $\mathcal{D}$ is either known or can be sampled. However, this is too strong an assumption. It is common that $\mathcal{D}$ is estimated through market research or econometric analysis in related settings, involving similar items or a subset of the items. In this case, we can only hope to know some approximate distribution $\widehat{\mathcal{D}}$ that is close to $\mathcal{D}$. In other settings, we may have sample access to the true distribution $\mathcal{D}$, but there might be errors in measuring or recording those samples. Again, we might hope to estimate an approximate distribution $\widehat{\mathcal{D}}$ that is close to $\mathcal{D}$. Unfortunately, it is well understood that designing a mechanism for $\widehat{\mathcal{D}}$ and using it for $\mathcal{D}$ might be a bad idea, as optimal mechanisms tend to overfit to the details of the type distribution. This has motivated a strand of recent literature studying how to robustify mechanisms to errors in the distribution [5, 8, 6].
There is, in fact, another important reason why one might want to design mechanisms for some approximate type distribution. Multi-dimensional data is complex, and one would want to leverage the extensive statistical and machine learning toolkit for approximating high-dimensional distributions with more structured models. Indeed, while the true type distribution $\mathcal{D}$ might not conform to a simple model, it might be close to a distribution $\widehat{\mathcal{D}}$ that does. We would like to leverage the simple structure of $\widehat{\mathcal{D}}$ to (i) alleviate the computational intractability of multi-item mechanisms, and (ii) reduce the amount of communication that the bidders and the auctioneer need to exchange. While the structured model $\widehat{\mathcal{D}}$ might allow (i) and (ii), we need the guarantee that the revenue of our mechanism is robust when we apply it to the true distribution $\mathcal{D}$.
Motivated by the discussion above, in this work we build a multi-item mechanism design framework that combines matrix factorization models developed for recommendation systems with mechanism design, targeting two issues: (1) the intractability of Mechanism Design with respect to the number of items (arising from the exponential dependence of the number of types on the number of items if no further assumptions are placed); (2) the lack of exact access to the Bayesian priors. In particular, we assume that each bidder $i$ draws their type – specifying their values for a very large universe of $n$ items (think all restaurants in a city or all items on Amazon) – from a distribution $\mathcal{D}_i$ that is close to a Matrix Factorized model $\widehat{\mathcal{D}}_i$, whose latent dimension is $k \ll n$. Targeting these approximate distributions allows us to reduce the effective dimensionality of bidder types from $n$ to $k$, which has huge advantages in terms of the computational/representation/communication/sample complexity of mechanism design. We develop tools that allow us to (a) use the mechanism constructed for the approximate $\widehat{\mathcal{D}}_i$’s under the true $\mathcal{D}_i$’s without sacrificing much revenue; and (b) interact with bidders who are unaware of the latent codes (they only understand their values for the items and are oblivious to the matrix factorized model), yet exploit the factorized model to communicate with them efficiently, without the impractical burden of having them communicate their $n$-dimensional type to the mechanism. In sum, our results are as follows:
• Given a query protocol that learns an approximate latent representation of a bidder’s type, Theorem 1 shows how to combine it with any mechanism $\mathcal{M}$ that is designed only for the Matrix Factorization model to produce a mechanism that generates comparable revenue with respect to the true distribution. The result is obtained via a refinement of the robustification result in [7], where the loss in revenue, as well as the violation of incentive compatibility, now only depend on the effective dimension $k$ of the Matrix Factorization model, but not on the total number of items $n$ (Lemma 2).
• We show that if the valuations are constrained-additive (Definition 5), we can obtain communication-efficient query protocols in several natural settings (Theorem 2). The queries we consider ask a bidder whether they are willing to purchase an item at a given price. In the first setting, the design matrix of the Matrix Factorization model contains a diagonally dominant submatrix – a generalization of the well-known separability assumption of Donoho and Stodden [26]. In two other settings, we assume that the design matrix is generated from a probabilistic model and show that a simple query protocol succeeds with high probability.
• Combining Theorems 1 and 2, we show that, given any mechanism $\mathcal{M}$ designed only for the Matrix Factorization model, we can design a mechanism that achieves comparable revenue and only requires the bidders to answer a small number of simple queries. In particular, for several natural settings, we show that the number of queries scales quasi-linearly in the effective dimension $k$ of the Matrix Factorization model and is independent of the total number of items $n$ (Proposition 1).
2 Preliminaries
2.1 Brief Introduction to Mechanism Design
We provide a brief introduction to mechanism design. To avoid a very long introduction, we only define the concepts in the context of multi-item auctions, which will be the focus of this paper. See Chapter 9 of [39] and the references therein for a more detailed introduction to mechanism design.
Multi-item Auctions.
The seller is selling $n$ heterogeneous items to $m$ bidders. Each bidder $i$ is assumed to have a private type $t_i$ that encodes their preference over the items and bundles of items. We assume that $t_i$ lives in the $n$-dimensional Euclidean space. For each bidder $i$, there is a publicly known valuation function $v_i(\cdot, \cdot)$, where $v_i(t_i, S)$ is bidder $i$’s value for bundle $S$ when $i$’s private type is $t_i$. In this paper, we consider the Bayesian setting with private types, that is, each bidder $i$’s type is drawn privately and independently from a publicly known distribution $\mathcal{D}_i$.
Mechanism.
The seller designs a mechanism to sell the items to bidders. A mechanism consists of an allocation rule and a payment rule, where the allocation rule decides a way to allocate the items to the bidders, and the payment rule decides how much to charge each bidder.
Direct Mechanism:
In a direct mechanism, the mechanism directly solicits types from the bidders and applies the allocation and payment rules to the reported types. More specifically, for any reported type profile $b = (b_1, \ldots, b_m)$, a direct mechanism $\mathcal{M} = (x, p)$ selects $x(b)$ as the allocation and charges each bidder $i$ payment $p_i(b)$. We slightly abuse notation to allow the allocation rule to be randomized. We assume that bidders have quasi-linear utilities: if bidder $i$’s private type is $t_i$, her utility under reported bid profile $b$ is $u_i(t_i, b) = \mathbb{E}\left[v_i(t_i, x_i(b)) - p_i(b)\right]$, where the expectation is over the randomness of the allocation and payment rules.
Expected Revenue:
In this paper, our goal is to design mechanisms with high expected revenue. For a direct mechanism $\mathcal{M}$, we use $\textsc{Rev}(\mathcal{M}, \mathcal{D})$ to denote $\mathbb{E}_{t \sim \mathcal{D}}\left[\sum_i p_i(t)\right]$, where $t = (t_1, \ldots, t_m)$ is the type profile and is drawn from $\mathcal{D} = \times_i \mathcal{D}_i$.
Incentive Compatibility and Individual Rationality
Since the bidders’ types are private, unless the mechanism incentivizes the bidders to report truthfully, there is no reason to expect that the reported types correspond to the true types. The notion of incentive compatibility is defined to capture this.
• $\varepsilon$-Bayesian Incentive Compatible ($\varepsilon$-BIC): if bidders draw their types from some distribution $\mathcal{D} = \times_i \mathcal{D}_i$, then a direct mechanism is $\varepsilon$-BIC with respect to $\mathcal{D}$ if for each bidder $i$ and type $t_i$,
$\mathbb{E}_{t_{-i} \sim \mathcal{D}_{-i}}\left[u_i\left(t_i, (t_i, t_{-i})\right)\right] \geq \mathbb{E}_{t_{-i} \sim \mathcal{D}_{-i}}\left[u_i\left(t_i, (t_i', t_{-i})\right)\right] - \varepsilon$
for all potential misreports $t_i'$, in expectation over all other bidders’ bids $t_{-i}$. A mechanism is BIC if it is $0$-BIC.
• $(\varepsilon, \delta)$-Bayesian Incentive Compatible ($(\varepsilon, \delta)$-BIC): if bidders draw their types from some distribution $\mathcal{D}$, then a direct mechanism is $(\varepsilon, \delta)$-BIC with respect to $\mathcal{D}$ if for each bidder $i$, with probability at least $1 - \delta$ over the draw of $t_i \sim \mathcal{D}_i$, no misreport $t_i'$ can increase bidder $i$’s expected utility by more than $\varepsilon$.
• Individually Rational (IR): A direct mechanism is IR if for all type profiles $t$ and all bidders $i$, $u_i(t_i, t) \geq 0$.
Indirect Mechanism:
An indirect mechanism does not directly solicit the bidders’ types. After interacting with the bidders, the mechanism selects an allocation and payments. The notions of $\varepsilon$-Bayesian Incentive Compatibility and Individual Rationality can be extended to indirect mechanisms using the solution concept of $\varepsilon$-Bayes Nash equilibrium. The notion of $(\varepsilon, \delta)$-Bayesian Incentive Compatibility can be extended to indirect mechanisms using a new solution concept, which we call $(\varepsilon, \delta)$-weak approximate Bayes Nash equilibrium. In an incomplete information game, a strategy profile is an $(\varepsilon, \delta)$-weak approximate Bayes Nash equilibrium if, for every bidder, only with probability at most $\delta$ (over the randomness of their own type) can unilateral deviation from the Bayes Nash strategy increase the deviating bidder’s expected utility (with respect to the randomness of the other bidders’ types, and assuming those follow their Bayes Nash equilibrium strategies) by more than $\varepsilon$.
Remark 1.
For an $(\varepsilon, \delta)$-weak approximate Bayes Nash equilibrium, its expected revenue is computed in this paper using the convention that all bidders follow their weak approximate Bayes Nash equilibrium strategies. At a cost of an additive loss in revenue that scales with $\delta$ and $v_{\max}$ (the highest possible value of any bidder), we can instead assume that only the $(1 - \delta)$-fraction of types of each bidder who have no more than $\varepsilon$ incentive to deviate from the weak approximate Bayes Nash equilibrium strategies follow these strategies, while the remaining $\delta$-fraction use arbitrary strategies. Similarly, we can interpret the $(\varepsilon, \delta)$-weak approximate Bayes Nash equilibrium definition as requiring that at least a $(1 - \delta)$-fraction of the types of each bidder have at most $\varepsilon$ incentive to deviate from the Bayes Nash strategies, assuming that for every other bidder at most a $\delta$-fraction of their types deviate from their Bayes Nash strategies.
2.2 Further Preliminaries
Definition 1.
Let $(U, d)$ be a metric space and $\Sigma$ be a $\sigma$-algebra on $U$. For $A \in \Sigma$, let $A^{\varepsilon} = \{x : \exists y \in A,\ d(x, y) < \varepsilon\}$. Two probability measures $P$ and $Q$ on $(U, \Sigma)$ have Prokhorov distance
$\inf\left\{\varepsilon > 0 : P(A) \leq Q(A^{\varepsilon}) + \varepsilon \text{ and } Q(A) \leq P(A^{\varepsilon}) + \varepsilon,\ \forall A \in \Sigma\right\}.$
We consider distributions supported on some Euclidean space, and we choose $d$ to be the $\ell_2$-distance. We denote the $\ell_2$-Prokhorov distance between distributions $P$ and $Q$ by $\pi(P, Q)$.
We will also make use of the following characterization of the Prokhorov metric by [43].
Lemma 1 (Characterization of the Prokhorov Metric [43]).
Let $P$ and $Q$ be two distributions supported on $\mathbb{R}^d$. Then $\pi(P, Q) \leq \varepsilon$ if and only if there exists a coupling $\gamma$ of $P$ and $Q$ such that $\Pr_{(x, y) \sim \gamma}\left[d(x, y) > \varepsilon\right] \leq \varepsilon$.
Definition 2 (Influence Matrix and Weak Dependence).
For any $k$-dimensional random vector $X = (X_1, \ldots, X_k)$, we define the influence of variable $X_j$ on variable $X_i$ as
$\mathcal{I}_{ij} = \sup\, d_{TV}\!\left(\mathbb{P}_{X_i \mid X_{-i} = x},\ \mathbb{P}_{X_i \mid X_{-i} = y}\right),$
where the supremum is over all pairs of conditioning vectors $x, y$ that differ only in coordinate $j$, $\mathbb{P}_{X_i \mid X_{-i} = x}$ denotes the conditional distribution of $X_i$ given $X_{-i} = x$, and $d_{TV}(P, Q)$ denotes the total variation distance between distributions $P$ and $Q$. Also, let $\mathcal{I}_{ii} = 0$ for each $i$, and we use $\mathcal{I}$ to denote the matrix $(\mathcal{I}_{ij})_{i, j \in [k]}$. In this paper, we consider the coordinates of $X$ to be weakly dependent if $\|\mathcal{I}\|_2 < 1$.
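As a concrete illustration of the influence in Definition 2, for a pair of binary random variables the influence of $X_1$ on $X_2$ is the worst-case total variation distance between the conditional laws of $X_2$ under different conditionings of $X_1$. The sketch below uses made-up conditional probabilities, not numbers from the paper:

```python
# Illustrative sketch (hypothetical numbers): influence of one binary
# coordinate on another, as the worst-case TV distance between conditional laws.

def tv_distance(p, q):
    """Total variation distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Hypothetical conditional law of X2 given X1:
cond_x2_given_x1 = {
    0: {0: 0.8, 1: 0.2},  # X1 = 0  ->  X2 ~ Bernoulli(0.2)
    1: {0: 0.3, 1: 0.7},  # X1 = 1  ->  X2 ~ Bernoulli(0.7)
}

# Influence of X1 on X2: worst-case TV distance over pairs of conditionings.
influence = max(
    tv_distance(cond_x2_given_x1[a], cond_x2_given_x1[b])
    for a in cond_x2_given_x1 for b in cond_x2_given_x1
)
print(influence)  # 0.5 = |0.7 - 0.2|
```

If $X_2$ were independent of $X_1$, the two conditional laws would coincide and the influence would be $0$; weak dependence asks that the matrix of all such pairwise influences has small norm.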
3 Our Model and Main Results
Setting and Goal:
We consider a classical mechanism design problem, wherein a seller is selling $n$ items to $m$ buyers, and buyer $i$’s type is drawn independently from a distribution $\mathcal{D}_i$ over $\mathbb{R}^n$. The goal is to design a mechanism that maximizes the seller’s revenue. In this paper, we operate in a setting where $\mathcal{D} = \times_i \mathcal{D}_i$ is unknown, but we are given access to the following components: (I) For each bidder $i$, we are given a machine learning model $\widehat{\mathcal{D}}_i$ — of the matrix factorization type as described below — which approximates $\mathcal{D}_i$. (II) We are given a good mechanism for the approximate type distributions; in its design this mechanism can exploit the low effective dimensionality, $k$, of types in the approximate model. Our goal is (III) to use (I) and (II) to obtain a good mechanism for the true type distributions.
(I) The Machine Learning Component:
We assume that each bidder $i$’s type distribution can be well-approximated by a known Matrix Factorization (MF) model $\widehat{\mathcal{D}}_i$. In particular:
• We use $A \in \mathbb{R}^{n \times k}$ to denote the design matrix of the model, where each column can be viewed as the type (over the $n$ items) of an “archetype.” As described in the following two bullets, types are sampled by each $\widehat{\mathcal{D}}_i$ as linear combinations of archetypes.
• We use $\widehat{\mathcal{D}}^z_i$ to denote a distribution over $\mathbb{R}^k$. The superscript $z$ is not a parameter of the distribution — it serves to remind us that this distribution samples in the latent space, and distinguishes it from the distribution defined next.
• If $P$ is a distribution over $\mathbb{R}^k$, we use $A \circ P$ to denote the distribution of the random variable $Az$, where $z \sim P$. With this notation, we use $\widehat{\mathcal{D}}_i$ to denote $A \circ \widehat{\mathcal{D}}^z_i$.
• We assume that, for each bidder $i$, the matrix factorization model is not far away from the true type distribution; that is, for some $\delta > 0$ we have that $\pi(\mathcal{D}_i, \widehat{\mathcal{D}}_i) \leq \delta$ for all $i$.
Remark 2.
In the above description we assumed that all $\widehat{\mathcal{D}}_i$’s share the same design matrix $A$. This is done to avoid overloading notation, but all our results would hold if each $\widehat{\mathcal{D}}_i$ had its own design matrix $A^{(i)}$.
(II) The Mechanism Design Component:
We assume that we are given a direct mechanism $\mathcal{M}$ for types drawn from the Machine Learning model. In particular, we assume that this mechanism makes use of the effective dimension $k$ of the Machine Learning model, accepting “latent types” (of dimension $k$) as input from the bidders. Specifically:
• Recall that, for each bidder $i$, their valuation function $v_i$ is common knowledge. (Recall that $v_i$ takes as input the bidder’s type $t_i$ and a subset of items $S$, so how the bidder values different subsets of items depends on their private type.)
• The designer is given $A$ and $\widehat{\mathcal{D}}^z_i$ for each bidder $i$, and treats bidder $i$’s type as drawn from $\widehat{\mathcal{D}}^z_i$, i.e. as living in the latent space $\mathbb{R}^k$. With respect to such “latent types,” there is an induced valuation function. In particular, for each bidder $i$, we use $\hat{v}_i$ to denote the valuation function defined as follows: $\hat{v}_i(z, S) = v_i(Az, S)$, where $z \in \mathbb{R}^k$.
• With the above as setup, we assume that the designer designs a mechanism $\mathcal{M}$ that is BIC and IR w.r.t. $\times_i \widehat{\mathcal{D}}^z_i$ and valuation functions $(\hat{v}_i)_i$.
(III) The New Component:
We consider the regime where $k \ll n$, and our goal is to combine the Machine Learning component with the Mechanism Design component to produce a mechanism $\mathcal{M}'$ which generates revenue comparable to $\mathcal{M}$ when used for bidders whose types are drawn from $\mathcal{D} = \times_i \mathcal{D}_i$. There are two challenges: (i) $\mathcal{M}$ takes as input the latent representation of a bidder’s type under $\widehat{\mathcal{D}}_i$; however, under $\mathcal{D}_i$ a bidder is simply ignorant of any latent representation of their type, so they cannot be asked about it; (ii) $\mathcal{M}$’s revenue is evaluated with respect to $\times_i \widehat{\mathcal{D}}^z_i$ and valuation functions $(\hat{v}_i)_i$, and our goal is to obtain a mechanism whose revenue is similar under $\mathcal{D}$ and valuation functions $(v_i)_i$. We show how to use a communication-efficient query protocol together with a robustification procedure to combine the Machine Learning and Mechanism Design components.
To state our results, we first need to formally define query protocols and some of their properties.
Definition 3 ($(\delta_1, \delta_2)$-query protocol).
Let $\mathcal{Q}$ be a query protocol, i.e., some communication protocol that exchanges messages with a bidder over possibly several rounds and outputs a vector in $\mathbb{R}^k$. We say that a bidder interacts with the query protocol truthfully if, whenever the protocol asks the bidder to evaluate some function on their type, the bidder evaluates the function and returns the result truthfully. We use $\mathcal{Q}(t)$ to denote the output of $\mathcal{Q}$ when interacting with a truthful bidder whose type is $t$. $\mathcal{Q}$ is called a $(\delta_1, \delta_2)$-query protocol if, for any $t \in \mathbb{R}^n$ and $z \in \mathbb{R}^k$ satisfying $\|t - Az\|_2 \leq \delta_1$, we have that $\|\mathcal{Q}(t) - z\|_2 \leq \delta_2$.
We also need the notion of Lipschitz valuations to formally state our result.
Definition 4 (Lipschitz Valuations).
$v$ is an $\mathcal{L}$-Lipschitz valuation if, for any two types $t, t'$ and any bundle $S$, $|v(t, S) - v(t', S)| \leq \mathcal{L} \cdot \|t - t'\|_2$.
This includes familiar settings; for example, if the bidder is $c$-demand, the Lipschitz constant is $\mathcal{L} = \sqrt{c}$. (A bidder is $c$-demand if, for any set of items, the bidder picks their favorite bundle of size no more than $c$, evaluating the value of each such bundle additively, with values as determined by the bidder’s type $t$. Formally, $v(t, S) = \max_{T \subseteq S, |T| \leq c} \sum_{j \in T} t_j$.)
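The $c$-demand valuation in the footnote can be evaluated directly: the value of a bundle is the sum of its $c$ highest per-item values. A small sketch with made-up numbers (not from the paper):

```python
# Sketch of a c-demand valuation: v(t, S) = max over subsets T of S with
# |T| <= c of the sum of t_j for j in T. For nonnegative per-item values this
# is just the sum of the c largest values inside S. Numbers are illustrative.

def c_demand_value(t, S, c):
    """Value of bundle S for a c-demand bidder with per-item values t."""
    vals = sorted((t[j] for j in S), reverse=True)
    # Keep at most c values; negative values would never be chosen, so skip them.
    return sum(v for v in vals[:c] if v > 0)

t = [5.0, 3.0, 8.0, 1.0]                # per-item values (hypothetical type)
print(c_demand_value(t, {0, 1, 2}, 2))  # 13.0 = 8 + 5 (best 2 items of {0,1,2})
```

The $\sqrt{c}$ Lipschitz constant then follows from Cauchy–Schwarz: changing the type by $t - t'$ changes any sum of at most $c$ coordinates by at most $\sqrt{c} \cdot \|t - t'\|_2$.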
We are now ready to state our first main result.
Theorem 1.
Let $\mathcal{D} = \times_i \mathcal{D}_i$ be the bidders’ type distributions and $v_i$ be an $\mathcal{L}$-Lipschitz valuation for each bidder $i$. Also, let $A \in \mathbb{R}^{n \times k}$ be a design matrix and $\widehat{\mathcal{D}}^z_i$ be a distribution over $\mathbb{R}^k$ for each $i$.
Suppose we are given query access to a mechanism $\mathcal{M}$ that is BIC and IR w.r.t. $\times_i \widehat{\mathcal{D}}^z_i$ and valuations $(\hat{v}_i)_i$ (as defined in the second bullet of the Mechanism Design component above), and that there exists $\delta > 0$ such that $\pi(\mathcal{D}_i, A \circ \widehat{\mathcal{D}}^z_i) \leq \delta$ for all $i$. Given any $(\delta_1, \delta_2)$-query protocol $\mathcal{Q}$ with $\delta_1 \geq \delta$, we can construct a mechanism $\mathcal{M}'$ using only query access to $\mathcal{M}$ and obliviously with respect to $\mathcal{D}$, such that for any possible $\mathcal{D}$ that satisfies the above conditions of Prokhorov distance closeness the following hold:
1. $\mathcal{M}'$ only interacts with every bidder using $\mathcal{Q}$ once;
2. $\mathcal{M}'$ is $\varepsilon$-BIC w.r.t. $\mathcal{D}$ and IR, where $\varepsilon$ depends on $\mathcal{L}$, $\delta$, $\delta_2$ and the effective dimension $k$, but not on the number of items $n$;
3. The expected revenue of $\mathcal{M}'$ is at least that of $\mathcal{M}$ under $\times_i \widehat{\mathcal{D}}^z_i$, up to an additive loss that again depends on $\mathcal{L}$, $\delta$, $\delta_2$ and $k$, but not on $n$.
Remark 3.
The mechanism $\mathcal{M}'$ will be an indirect mechanism, so we are slightly imprecise in calling it $\varepsilon$-BIC. Formally, what we mean is that interacting with $\mathcal{M}'$ truthfully is a weak approximate Bayes Nash equilibrium. We compute the expected revenue assuming all bidders interact with $\mathcal{M}'$ truthfully. As discussed in Remark 1, with an additional additive loss in revenue, we can instead assume that only the fraction of types of each bidder who have no more than $\varepsilon$ incentive to deviate from the Bayes Nash strategies interact with $\mathcal{M}'$ truthfully, while the remaining fraction use arbitrary strategies.
Why isn’t [7] sufficient?
One may be tempted to prove Theorem 1 directly using [7]. However, there are two subtle issues with this approach: (i) the violation of the incentive compatibility constraints and the revenue loss of the robustification process in [7] scale with the ambient dimension $n$, rather than with the effective dimension as in Theorem 1; the latter dependence involves only $k$ and the largest value an archetype can have for a single item, and thus could be significantly smaller. (ii) The robustification process in [7] involves sampling from the conditional distribution of $\widehat{\mathcal{D}}_i$ on an $n$-dimensional cube, which is equivalent to sampling from the conditional distribution of $\widehat{\mathcal{D}}^z_i$ on a set whose image under the linear transformation $A$ is the $n$-dimensional cube. However, this set may be difficult to sample from if $A$ is not well-conditioned.
In the following lemma, we refine the robustification result in [7] (Theorem 3 in that paper) and show that, given approximate distributions $\widehat{\mathcal{D}}^z_i$ in the latent space and a BIC and IR mechanism $\mathcal{M}$ w.r.t. $\times_i \widehat{\mathcal{D}}^z_i$, we can robustify $\mathcal{M}$ with negligible revenue loss so that it is an approximately BIC and exactly IR mechanism w.r.t. any distribution $\mathcal{D}$ that is within the $\delta$-Prokhorov ball around $\times_i (A \circ \widehat{\mathcal{D}}^z_i)$. Importantly, we exploit the effective dimension of the matrix factorization model to replace the dependence on $n$ with a dependence on $k$, in both the violation of the incentive compatibility constraints and the revenue loss. Additionally, we only need to be able to sample from the conditional distribution of $\widehat{\mathcal{D}}^z_i$ on a $k$-dimensional cube. We postpone the proof of Lemma 2 to Appendix A.
Lemma 2.
Let $A \in \mathbb{R}^{n \times k}$ be the design matrix. Suppose we are given a collection of distributions over latent types $(\widehat{\mathcal{D}}^z_i)_i$, where the support of each $\widehat{\mathcal{D}}^z_i$ is bounded, and a BIC and IR mechanism $\mathcal{M}$ w.r.t. $\times_i \widehat{\mathcal{D}}^z_i$ and valuations $(\hat{v}_i)_i$, where each $v_i$ is an $\mathcal{L}$-Lipschitz valuation. Let $\mathcal{D} = \times_i \mathcal{D}_i$ be any distribution such that $\pi(\mathcal{D}_i, A \circ \widehat{\mathcal{D}}^z_i) \leq \delta$ for all $i$. Given access to a sampling algorithm for each $\widehat{\mathcal{D}}^z_i$ that draws a sample from the conditional distribution of $\widehat{\mathcal{D}}^z_i$ on a given $k$-dimensional cube, we can construct a randomized mechanism $\mathcal{M}'$ using only query access to $\mathcal{M}$ and obliviously with respect to $\mathcal{D}$, such that for any $\mathcal{D}$ satisfying the above conditions of Prokhorov distance closeness the following hold:
1. $\mathcal{M}'$ is $\varepsilon$-BIC and IR w.r.t. $\mathcal{D}$ and valuations $(v_i)_i$, where $\varepsilon$ depends only on $\mathcal{L}$, $\delta$ and $k$;
2. The expected revenue of $\mathcal{M}'$ is at least that of $\mathcal{M}$ under $\times_i \widehat{\mathcal{D}}^z_i$, up to a small additive loss.
Proof of Theorem 1: Consider the following mechanism $\mathcal{M}'$:
Let $t_i$ be bidder $i$’s type and $z_i$ be a random variable distributed according to $\widehat{\mathcal{D}}^z_i$. Since $\pi(\mathcal{D}_i, A \circ \widehat{\mathcal{D}}^z_i) \leq \delta$, Lemma 1 guarantees a coupling between $t_i$ and $Az_i$ such that their distance is more than $\delta$ with probability no more than $\delta$. As $\mathcal{Q}$ is a $(\delta_1, \delta_2)$-query protocol, when $t_i$ and $Az_i$ are not more than $\delta_1$ apart, we have $\|\mathcal{Q}(t_i) - z_i\|_2 \leq \delta_2$. Hence, there exists a coupling between $\mathcal{Q}(t_i)$ and $z_i$ so that their distance is more than $\delta_2$ with probability no more than $\delta$ (recall $\delta_1 \geq \delta$). Applying Lemma 2 with the distribution of $\mathcal{Q}(t_i)$ in place of the true latent distribution, we obtain that $\mathcal{M}'$ is an approximately BIC mechanism when bidder $i$ has valuation $v_i$ and type $t_i$. Consider two cases: (a) when $\|\mathcal{Q}(t_i) - z_i\|_2 \leq \delta_2$, then, since $v_i$ is $\mathcal{L}$-Lipschitz, deviating from interacting with $\mathcal{M}'$ truthfully can increase the expected utility only by a small amount; (b) when $\|\mathcal{Q}(t_i) - z_i\|_2 > \delta_2$, the bidder may substantially improve their expected utility by deviating. Luckily, this case happens with probability no more than $\delta$.
In Theorem 2, we show how to obtain $(\delta_1, \delta_2)$-query protocols in various settings. We further assume that the bidders’ valuations are all constrained-additive.
Definition 5 (Constrained-Additive valuation).
A valuation function $v$ is constrained-additive if $v(t, S) = \max_{T \in \mathcal{I}, T \subseteq S} \sum_{j \in T} (t_j + c_j)$, where $\mathcal{I}$ is a downward-closed set system over the items and $c$ is a fixed vector. (One can interpret $c$ as the common base values for the items that are shared among all types.) For example, unit-demand valuation is the case where $\mathcal{I}$ includes all subsets of size no more than $1$. If all elements of $\mathcal{I}$ have size no more than $d$, then $v$ is a $\sqrt{d}$-Lipschitz valuation.
Theorem 2.
Let all bidders’ valuations be constrained-additive. We consider queries of the form “is $t \cdot e_j \geq p$?”, where $e_j$ is the $j$-th standard unit vector in $\mathbb{R}^n$. The query simply asks whether the bidder is willing to pay at least $p$ for winning item $j$, and the bidder provides a Yes/No answer. We obtain communication-efficient protocols in the following settings:
• Deterministic Structure: $A$ can be expressed as $\Pi \binom{M}{B}$, where $\Pi$ is a permutation matrix, $B$ is an arbitrary matrix, and $M \in \mathbb{R}^{k \times k}$ is diagonally dominant both by rows and by columns. This is a relaxation of the well-known separability assumption of Donoho and Stodden [26], under which $A$ can be expressed as $\Pi \binom{I_k}{B}$, where $I_k$ is the $k$-dimensional identity matrix. In this setting we have a $(\delta_1, \delta_2)$-query protocol, for any $\delta_2 > 0$, whose number of queries is quasi-linear in $k$ and independent of $n$.
• Ex-ante Analysis: $A$ is generated from a probabilistic model, where each archetype (column of $A$) is an independent copy of an $n$-dimensional random vector $u$.
– Multivariate Gaussian Distributions: $u$ is distributed according to a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$. If there exists a subset $S$ of items such that the covariance matrix $\Sigma_S$ of the items in $S$ is large enough relative to its largest eigenvalue (here $u_S$ denotes the $|S|$-dimensional vector that contains all $u_j$ with $j \in S$), then with probability at least a constant (over the draw of $A$) we have a $(\delta_1, \delta_2)$-query protocol for any $\delta_2 > 0$. Note that when the entries of $u$ are i.i.d., any $S$ of sufficiently large size satisfies the condition.
– Bounded Distributions with Weak Dependence: $u$ is supported on a bounded set and has mean $c_j$ in each coordinate $j$. If there exists a subset $S$ of items such that the coordinates of $u_S$ are weakly dependent and have sufficient variance, then with probability at least a constant (over the draw of $A$) we have a $(\delta_1, \delta_2)$-query protocol for any $\delta_2 > 0$. Note that when the entries of $u$ are independent, the weak dependence condition holds for any set $S$; if, in addition, each coordinate has variance bounded away from zero, then any set $S$ of sufficiently large size suffices.
Remark 4.
In the ex-ante analysis, the success probabilities depend on the parameters of the distributions, but note that they are both bounded below by an absolute constant.
Before we prove Theorem 2, we combine it with Theorem 1 to derive results for a few concrete settings.
Proposition 1.
Under the same setting as in Theorem 1, with the extra assumption that every valuation is constrained-additive, we can construct a mechanism $\mathcal{M}'$ using only query access to the given mechanism $\mathcal{M}$ and obliviously with respect to the true type distribution $\mathcal{D}$, such that for any possible $\mathcal{D}$, $\mathcal{M}'$ is $\varepsilon$-BIC and IR, and has revenue comparable to that of $\mathcal{M}$. Recall that $\delta$ satisfies $\pi(\mathcal{D}_i, A \circ \widehat{\mathcal{D}}^z_i) \leq \delta$ for all $i$. We compute the guarantee and the number of queries for the following three concrete settings (one for each of the three assumptions in Theorem 2).
1. Deterministic Structure: Separability. The design matrix $A$ satisfies the separability assumption of Donoho and Stodden [26]; that is, $A$ can be expressed as $\Pi \binom{I_k}{B}$, where $\Pi$ is a permutation matrix. The number of queries each bidder needs to answer is quasi-linear in $k$.
2. Multivariate Gaussian Distributions: Well-Conditioned Covariance Matrix. Let $A$ be generated from a distribution where each archetype is an independent draw from an $n$-dimensional normal distribution $\mathcal{N}(\mu, \Sigma)$. Let $\kappa$ be the condition number of $\Sigma$ ($\Sigma$ is well-conditioned if $\kappa$ is small). For any set $S$ of sufficiently large size (depending on $\kappa$), if we query each bidder about the items in $S$, then the protocol succeeds with probability at least a constant, and each bidder needs to answer a number of queries that is quasi-linear in $|S|$.
3. Weak Dependence: Sufficient Variance per Item. Let $A$ be generated from a distribution where each archetype is an independent copy of an $n$-dimensional random vector $u$. Assuming (i) the coordinates of $u$ are weakly dependent, (ii) $u$ lies in a bounded set, and (iii) each coordinate of $u$ has sufficient variance, then for any set $S$ of sufficiently large size, if we query each bidder about the items in $S$, the protocol succeeds with probability at least a constant, and each bidder needs to answer a number of queries that is quasi-linear in $|S|$. (Clearly, conditions (i), (ii) and (iii) can be weakened: the result still holds if we can find a subset of coordinates on which they hold and which is sufficiently large.)
Proof of Theorem 2: Instead of directly studying the query complexity under our query model, we first consider the query complexity under a seemingly stronger query model, where we directly query the bidder about their value for an item, and their answer is within $\pm \eta$ of the true value for some $\eta > 0$. We refer to this type of query as a noisy value query. Since each item’s value lies in a known bounded range, and we only care about types that are close to the image of the latent space, we can use our Yes/No price queries to perform binary search on the price to simulate noisy value queries. In particular, we only need logarithmically many price queries (in the ratio of the value range to $\eta$) to simulate one noisy value query. From now on, the plan is to first investigate the query complexity for noisy value queries, and then convert the result to the query complexity in the original model.
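The binary-search simulation described above can be sketched directly. The value bound `v_max` and accuracy `eta` below are assumed parameters; the bidder is modeled as an oracle answering the Yes/No price queries of Theorem 2:

```python
# Sketch: simulating one "noisy value query" with Yes/No threshold queries via
# binary search. Assumes item values lie in [0, v_max]; returns an estimate
# within eta of the true value using about log2(v_max / eta) threshold queries.
import math

def noisy_value_query(willing_to_pay, v_max, eta):
    """willing_to_pay(p) answers the threshold query 'is your value >= p?'."""
    lo, hi = 0.0, v_max
    queries = 0
    while hi - lo > eta:
        mid = (lo + hi) / 2.0
        queries += 1
        if willing_to_pay(mid):
            lo = mid  # value is at least mid
        else:
            hi = mid  # value is below mid
    return (lo + hi) / 2.0, queries

true_value = 7.3  # hypothetical per-item value
estimate, q = noisy_value_query(lambda p: true_value >= p, v_max=10.0, eta=0.01)
print(abs(estimate - true_value) <= 0.01)       # True
print(q == math.ceil(math.log2(10.0 / 0.01)))   # True: 10 queries
```

The invariant is that the true value stays inside $[lo, hi)$, so after the interval shrinks below `eta` the midpoint is within `eta` of the truth; this is the logarithmic overhead claimed in the proof.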
We first fix the notation. Let $q$ be the number of noisy value queries, and $Q \in \{0, 1\}^{q \times n}$ be the query matrix, where each row of $Q$ is a standard unit vector. We use $\tilde{b}$ to denote the bidder’s answers to the queries and $b = QAz$ the true answers to the queries. Note that $\|\tilde{b} - b\|_{\infty} \leq \eta$. Given $\tilde{b}$, we solve the following least squares problem: $\min_{w \in \mathbb{R}^k} \|QAw - \tilde{b}\|_2$.
The problem has a closed-form solution: $\hat{z} = (QA)^{+} \tilde{b}$. Let $E = \tilde{b} - b$, and let $z$ be a vector that satisfies $b = QAz$. We are interested in upper bounding $\|\hat{z} - z\|_2$.
Since the rows of $Q$ are all standard unit vectors, $\|E\|_2 \leq \sqrt{q}\,\eta$.
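The recovery step above (answers to queried items, then least squares for the latent vector) can be illustrated on a tiny noiseless example. The matrix and the true latent vector are made-up numbers; with $q = k$ queries and a non-singular $QA$, least squares reduces to an exact linear solve:

```python
# Toy sketch of latent recovery: queried item values give b = (QA) z, and
# solving the least squares problem min_w ||(QA) w - b|| recovers z.
# QA and z_true are illustrative numbers; with q = k = 2 the solve is exact.

def solve_2x2(M, b):
    """Solve M w = b for a 2x2 matrix via the closed-form inverse."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    assert abs(det) > 1e-12, "QA must be non-singular for exact recovery"
    return [( M[1][1] * b[0] - M[0][1] * b[1]) / det,
            (-M[1][0] * b[0] + M[0][0] * b[1]) / det]

QA = [[1.0, 1.0],   # rows of A restricted to the two queried items
      [2.0, 1.0]]
z_true = [2.0, 3.0]

# Noiseless query answers b = QA z.
b = [QA[i][0] * z_true[0] + QA[i][1] * z_true[1] for i in range(2)]
z_hat = solve_2x2(QA, b)
print(z_hat)  # [2.0, 3.0]
```

With noisy answers $\tilde{b} = b + E$, the same solve returns $\hat{z} = z + (QA)^{-1} E$, which is exactly why the analysis below bounds $\|(QA)^{-1}\|$ (via diagonal dominance or singular values).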
Next, we bound under the different assumptions.
Deterministic Structure:
We choose $q = k$ and $Q$ so that $QA = M$. Since $M$ is diagonally dominant, $M$ is non-singular, and $\hat{z} - z = M^{-1} E$.
Lemma 3 (Adapted from Theorem 1 and Corollary 1 of [45]).
If a matrix $M$ is diagonally dominant both by rows and by columns, with $\alpha = \min_i \left(|M_{ii}| - \sum_{j \neq i} |M_{ij}|\right) > 0$ and $\beta = \min_j \left(|M_{jj}| - \sum_{i \neq j} |M_{ij}|\right) > 0$, then $\|M^{-1}\|_{\infty} \leq 1/\alpha$ and $\|M^{-1}\|_{1} \leq 1/\beta$.
By Lemma 3, $\|M^{-1}\|_2 \leq \sqrt{\|M^{-1}\|_1 \|M^{-1}\|_{\infty}} \leq 1/\sqrt{\alpha \beta}$. Note that $\|\hat{z} - z\|_2 \leq \|M^{-1}\|_2 \|E\|_2 \leq \sqrt{k}\,\eta / \sqrt{\alpha \beta}$; the last inequality uses that $M$ is diagonally dominant by columns as well as by rows. To sum up, if we choose $\eta$ so that this bound is at most $\delta_2$, the protocol achieves the claimed accuracy.
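The row-wise half of the Varah-style bound in Lemma 3 can be checked numerically on a toy matrix (the 2x2 example below is illustrative, not from the paper):

```python
# Numerical check of the diagonal-dominance bound:
# for M diagonally dominant by rows,
#   ||M^{-1}||_inf <= 1 / min_i ( |M_ii| - sum_{j != i} |M_ij| ).
# The 2x2 matrix is a made-up example.

M = [[3.0, 1.0],
     [1.0, 4.0]]

# Row-dominance gap alpha.
alpha = min(abs(M[i][i]) - sum(abs(M[i][j]) for j in range(2) if j != i)
            for i in range(2))

# Exact inverse of the 2x2 matrix.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[ M[1][1] / det, -M[0][1] / det],
        [-M[1][0] / det,  M[0][0] / det]]
inf_norm = max(sum(abs(x) for x in row) for row in Minv)

print(inf_norm <= 1.0 / alpha)  # True: 5/11 <= 1/2
```

Here $\alpha = \min(3 - 1, 4 - 1) = 2$, so the bound guarantees $\|M^{-1}\|_\infty \le 1/2$, while the exact value is $5/11$; a larger dominance gap gives a smaller bound on the recovery error.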
Ex-ante Analysis:
Since $\hat{z} = (QA)^{+}\tilde{b}$ and $b = QAz$,
$\|\hat{z} - z\|_2 \leq \|E\|_2 / \sigma_{\min}(QA) \leq \sqrt{q}\,\eta / \sigma_{\min}(QA),$
where $\sigma_{\max}(QA)$ (or $\sigma_{\min}(QA)$) is $QA$’s largest (or smallest) singular value.
Multivariate Gaussian distribution:
When $u$ is distributed according to a multivariate Gaussian distribution, we choose $q = |S|$ and $Q$ so that each row corresponds to an item $j \in S$. Now, $QA$ is a random matrix where each column is an independent copy of $u_S$. We use Lemma 4 to bound $QA$’s largest singular value $\sigma_{\max}$ and smallest singular value $\sigma_{\min}$. The proof of Lemma 4 is postponed to Section 4.1.
Lemma 4 (Concentration of Singular Values under Multivariate Gaussian Distributions).
Let $B$ be a random matrix, where each column of $B$ is an independent copy of a $d$-dimensional random vector distributed according to a multivariate Gaussian distribution $\mathcal{N}(\mu, V \Lambda V^{\top})$, where $V$ is an orthonormal matrix and $\Lambda$ is a diagonal matrix whose largest entry we denote $\lambda_{\max}$. Then, with probability at least a constant, both $\sigma_{\max}(B)$ and $\sigma_{\min}(B)$ concentrate around their expectations, with bounds governed by $\lambda_{\max}$.
By Lemma 4, the condition on $S$ guarantees that $\sigma_{\min}(QA)$ is sufficiently large with probability at least a constant; hence $\|\hat{z} - z\|_2 \leq \delta_2$ with probability at least a constant.
Weakly Dependent Distributions:
When the coordinates of $u_S$ are weakly dependent, we choose $q = |S|$ and $Q$ so that each row corresponds to an item $j \in S$. Now, $QA$ is a random matrix where each column is an independent copy of $u_S$. We use Lemma 5 to bound $QA$’s largest singular value $\sigma_{\max}$ and smallest singular value $\sigma_{\min}$. The proof of Lemma 5 is postponed to Section 4.2.
Lemma 5 (Concentration of Singular Values under Weak Dependence).
Let $B$ be a random matrix, where each column of $B$ is an independent copy of a $d$-dimensional random vector $u$. We assume that the coordinates of $u$ are weakly dependent, and that each coordinate of $u$ is bounded, with mean $c_j$ and variance bounded below. Then, with probability at least a constant, both $\sigma_{\max}(B)$ and $\sigma_{\min}(B)$ concentrate around their expectations.
By Lemma 5, the conditions on $S$ guarantee that $\sigma_{\min}(QA)$ is sufficiently large with probability at least a constant; therefore $\|\hat{z} - z\|_2 \leq \delta_2$ with probability at least a constant.
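The concentration phenomenon behind Lemmas 4 and 5 is easy to observe empirically. The sketch below (pure Python, with assumed i.i.d. centered bounded entries rather than the paper's exact conditions) checks that both singular values of a tall random matrix land near $\sqrt{d \cdot \mathrm{Var}}$:

```python
# Empirical sanity check (a sketch, not the paper's argument): for a d x k
# random matrix with i.i.d. bounded centered entries, both singular values
# concentrate around sqrt(d * variance). Entries are uniform on [-1, 1],
# so the variance is 1/3.
import math
import random

random.seed(1)

d, k = 2000, 2
B = [[random.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(d)]

# Singular values of B = square roots of eigenvalues of the 2x2 Gram matrix B^T B.
G = [[sum(B[i][a] * B[i][b] for i in range(d)) for b in range(k)] for a in range(k)]
tr = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
sigmas = [math.sqrt((tr + disc) / 2.0), math.sqrt((tr - disc) / 2.0)]

target = math.sqrt(d / 3.0)  # sqrt(d * Var) with Var = 1/3
print(all(0.9 * target <= s <= 1.1 * target for s in sigmas))  # True for this seed
```

Because $\sigma_{\min}$ is bounded away from zero with high probability, the recovery error $\sqrt{q}\,\eta / \sigma_{\min}(QA)$ stays controlled, which is exactly what the two lemmas supply.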
Query Complexity in Different Models:
We set $\eta$ so that the error bounds derived above are at most $\delta_2$.
• Deterministic structure: we have a $(\delta_1, \delta_2)$-query protocol whose number of queries is quasi-linear in $k$.
-
•
Multivariate Gaussian distributions: with probability at least (no less than by our choice of ), we have a -query protocol using queries.
-
•
Weakly dependent distributions: with probability at least (no less than by our choice of ), we have a -query protocol using queries.
4 Bounding the Largest and Smallest Singular Values
We prove both Lemmas 4 and 5 using an -net argument. We first state a lemma saying that, for any matrix , if we can bound the maximum and minimum values of over all points in the -net, then we can also bound the largest and smallest singular values of .
Lemma 6 (Adapted from [41]).
For any , there exists an -net , i.e., , such that . For any matrix , let and , then and .
Proof of Lemma 6: Let be a vector that satisfies . Let be a vector in such that . Then , which implies that . On the other hand, for any , let satisfy ; then .
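In two dimensions the net argument can be checked numerically. The sketch below (numpy; the matrix and the value of eps are illustrative choices of ours) builds a net of the unit circle from equally spaced angles and verifies the standard bracketing: the largest singular value is at most a/(1-eps), and the smallest is at least b - eps times the largest.

```python
import numpy as np

# A 2-d illustration: an eps-net of the unit circle via equally spaced angles.
eps = 0.1
n_pts = int(np.ceil(2 * np.pi / eps)) + 1
angles = np.linspace(0.0, 2 * np.pi, n_pts)
net = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows are unit vectors

B = np.array([[3.0, 1.0],
              [0.0, 2.0]])
norms = np.linalg.norm(net @ B.T, axis=1)  # ||B x|| over the net
a, b = norms.max(), norms.min()

smax, smin = np.linalg.svd(B, compute_uv=False)
# Net values bracket the true singular values up to eps corrections.
assert a <= smax <= a / (1 - eps)
assert b >= smin >= b - eps * smax
```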
4.1 Multivariate Gaussian Distributions
In this section, we prove the case where the columns of the random matrix are drawn from a multivariate Gaussian distribution. The key is to prove that, for every unit vector, lies between with high probability for some absolute constants and (Lemma 7). Lemma 4 then follows by combining Lemmas 7 and 6 with the union bound.
Proof of Lemma 4: Let be i.i.d. samples from the distribution , and .
Proposition 2.
and .
Proof.
. ∎
Since is an orthonormal matrix, and . We will proceed to show that both and concentrate around their means. We do so via an -net argument.
Lemma 7.
For any fixed , . Moreover,
and
Proof of Lemma 7: Let be i.i.d. samples from . It is not hard to see that , so we need to prove that concentrates around its mean .
Since is distributed according to a chi-square distribution, its moment generating function
If we choose to be no more than , since for any , , we have that
Putting everything together, we have that
When we choose and , the RHS of the inequality becomes .
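For reference, the standard chi-square moment generating function identity behind this step (stated in our own notation, which may differ from the paper's): for $Z \sim \mathcal{N}(0,1)$ and $\lambda < \tfrac{1}{2}$,

```latex
\mathbb{E}\left[e^{\lambda Z^2}\right]
  = \int_{-\infty}^{\infty} e^{\lambda z^2}\,\frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz
  = \frac{1}{\sqrt{1-2\lambda}} .
```

The Chernoff step then bounds the tail of a sum of independent such variables by taking the product of their moment generating functions and optimizing over $\lambda$.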
Next, we upper bound via a similar approach.
Note that .
Proposition 3.
For any , .
Proof of Proposition 3: We first state a few inequalities that are not hard to verify. First, for all , . Second, if . Finally, if . Combining all three inequalities, we have that
If we choose to be no more than , then by Proposition 3, , which is upper bounded by . Putting everything together, we have that
When we choose and , the RHS of the inequality becomes .
4.2 Bounded Distributions with Weak Dependence
In this section, we prove the case where the columns of the random matrix are drawn from a -dimensional distribution that satisfies weak dependence. The overall plan is similar to the one for multivariate Gaussian distributions. The key is again to prove that, for every unit vector, lies between with high probability for some absolute constants and (Lemma 8). Lemma 5 then follows by combining Lemmas 8 and 6 with the union bound.
Proof of Lemma 5:
We first show that, for each fixed , concentrates around its mean. Then, we apply Lemma 6 to bound and .
Lemma 8.
Let be a random matrix, where each column of is an independent copy of a -dimensional random vector . We assume that the coordinates of are weakly dependent, i.e., , and each coordinate of lies in and has mean and variance . Let . For any fixed , and
Proof of Lemma 8: We first expand .
Therefore, . To prove that concentrates, we first need a result by Chatterjee [16].
Lemma 9 (Adapted from Theorem 4.3 in [16]).
Let be a -dimensional random vector. Suppose function satisfies the following generalized Lipschitz condition:
for any and in the support of . If , we have
The function we care about is , where the variables are . If and differ only in the entry, then
We denote by . Clearly, for any and , . Also, notice that , and therefore . (Here denotes the Kronecker product of the two matrices.) We apply Lemma 9 to and derive the following inequality:
References
- [1] Saeed Alaei. Bayesian Combinatorial Auctions: Expanding Single Buyer Mechanisms to Many Buyers. In the 52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2011.
- [2] Saeed Alaei, Hu Fu, Nima Haghpanah, Jason Hartline, and Azarakhsh Malekian. Bayesian Optimal Auctions via Multi- to Single-agent Reduction. In the 13th ACM Conference on Electronic Commerce (EC), 2012.
- [3] Moshe Babaioff, Yannai A Gonczarowski, and Noam Nisan. The menu-size complexity of revenue approximation. Games and Economic Behavior, 2021.
- [4] Moshe Babaioff, Nicole Immorlica, Brendan Lucier, and S. Matthew Weinberg. A Simple and Approximately Optimal Mechanism for an Additive Buyer. In the 55th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2014.
- [5] Dirk Bergemann and Karl Schlag. Robust monopoly pricing. Journal of Economic Theory, 146(6):2527–2543, 2011.
- [6] Johannes Brustle, Yang Cai, and Constantinos Daskalakis. Multi-item mechanisms without item-independence: Learnability via robustness. CoRR, abs/1911.02146, 2019.
- [7] Johannes Brustle, Yang Cai, and Constantinos Daskalakis. Multi-item mechanisms without item-independence: Learnability via robustness. In EC, 2020.
- [8] Yang Cai and Constantinos Daskalakis. Learning multi-item auctions with (or without) samples. In FOCS, 2017.
- [9] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. An Algorithmic Characterization of Multi-Dimensional Mechanisms. In the 44th Annual ACM Symposium on Theory of Computing (STOC), 2012.
- [10] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. Optimal Multi-Dimensional Mechanism Design: Reducing Revenue to Welfare Maximization. In the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2012.
- [11] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. Reducing Revenue to Welfare Maximization : Approximation Algorithms and other Generalizations. In the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013.
- [12] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. Understanding Incentives: Mechanism Design becomes Algorithm Design. In the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2013.
- [13] Yang Cai, Nikhil R. Devanur, and S. Matthew Weinberg. A duality based unified approach to bayesian mechanism design. In STOC, 2016.
- [14] Yang Cai and Zhiyi Huang. Simple and Nearly Optimal Multi-Item Auctions. In the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013.
- [15] Yang Cai and Mingfei Zhao. Simple mechanisms for subadditive buyers via duality. In STOC, 2017.
- [16] Sourav Chatterjee. Concentration inequalities with exchangeable pairs. PhD thesis, Citeseer, 2005.
- [17] Shuchi Chawla, Jason D. Hartline, and Robert D. Kleinberg. Algorithmic Pricing via Virtual Valuations. In the 8th ACM Conference on Electronic Commerce (EC), 2007.
- [18] Shuchi Chawla, Jason D. Hartline, David L. Malec, and Balasubramanian Sivan. Multi-Parameter Mechanism Design and Sequential Posted Pricing. In the 42nd ACM Symposium on Theory of Computing (STOC), 2010.
- [19] Shuchi Chawla and J. Benjamin Miller. Mechanism design for subadditive agents via an ex-ante relaxation. In Proceedings of the ACM Conference on Economics and Computation (EC), 2016.
- [20] Xi Chen, Ilias Diakonikolas, Anthi Orfanou, Dimitris Paparas, Xiaorui Sun, and Mihalis Yannakakis. On the complexity of optimal lottery pricing and randomized mechanisms. In Proceedings of the 56th Annual Symposium on Foundations of Computer Science (FOCS), 2015.
- [21] Constantinos Daskalakis. Multi-item auctions defying intuition? ACM SIGecom Exchanges, 14(1):41–75, 2015.
- [22] Constantinos Daskalakis, Alan Deckelbaum, and Christos Tzamos. Mechanism design via optimal transport. In Michael J. Kearns, R. Preston McAfee, and Éva Tardos, editors, ACM Conference on Electronic Commerce, EC ’13, Philadelphia, PA, USA, June 16-20, 2013, pages 269–286. ACM, 2013.
- [23] Constantinos Daskalakis, Alan Deckelbaum, and Christos Tzamos. The complexity of optimal mechanism design. In the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2014.
- [24] Constantinos Daskalakis, Alan Deckelbaum, and Christos Tzamos. Strong duality for a multiple-good monopolist. Econometrica, 85(3):735–767, 2017.
- [25] Constantinos Daskalakis, Maxwell Fishelson, Brendan Lucier, Vasilis Syrgkanis, and Santhoshini Velusamy. Simple, credible, and approximately-optimal auctions. In EC, 2020.
- [26] David Donoho and Victoria Stodden. When does non-negative matrix factorization give a correct decomposition into parts? In Proceedings of the 16th International Conference on Neural Information Processing Systems, NIPS’03, page 1141–1148, Cambridge, MA, USA, 2003. MIT Press.
- [27] Shaddin Dughmi, Li Han, and Noam Nisan. Sampling and representation complexity of revenue maximization. In WINE, 2014.
- [28] Paul Dütting, Zhe Feng, Harikrishna Narasimhan, David Parkes, and Sai Srivatsa Ravindranath. Optimal auctions through deep learning. In International Conference on Machine Learning, pages 1706–1715. PMLR, 2019.
- [29] Zhe Feng, Harikrishna Narasimhan, and David C Parkes. Deep learning for revenue-optimal auctions with budgets. In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems, 2018.
- [30] Yiannis Giannakopoulos and Elias Koutsoupias. Duality and optimality of auctions for uniform distributions. In Proceedings of the 15th ACM conference on Economics and Computation (EC), 2014.
- [31] Kira Goldner and Anna R Karlin. A prior-independent revenue-maximizing auction for multiple additive bidders. In International Conference on Web and Internet Economics, pages 160–173. Springer, 2016.
- [32] Yannai A Gonczarowski and S Matthew Weinberg. The sample complexity of up-to- multi-dimensional revenue maximization. In FOCS, 2018.
- [33] Sergiu Hart and Noam Nisan. Approximate Revenue Maximization with Multiple Items. In EC, 2012.
- [34] Sergiu Hart, Noam Nisan, et al. The menu-size complexity of auctions. Center for the Study of Rationality, 2013.
- [35] Ian A Kash and Rafael M Frongillo. Optimal auctions with restricted allocations. In Proceedings of the 2016 ACM Conference on Economics and Computation, pages 215–232, 2016.
- [36] Robert Kleinberg and S. Matthew Weinberg. Matroid Prophet Inequalities. In the 44th Annual ACM Symposium on Theory of Computing (STOC), 2012.
- [37] Jamie Morgenstern and Tim Roughgarden. Learning simple auctions. In Conference on Learning Theory, pages 1298–1318. PMLR, 2016.
- [38] Roger B. Myerson. Optimal Auction Design. Mathematics of Operations Research, 6(1):58–73, 1981.
- [39] Noam Nisan, Tim Roughgarden, E. Tardos, and V. V. Vazirani, editors. Algorithmic Game Theory. Cambridge University Press, 2007.
- [40] Aviad Rubinstein and S. Matthew Weinberg. Simple mechanisms for a subadditive buyer and applications to revenue monotonicity. In EC, 2015.
- [41] Mark Rudelson. Recent developments in non-asymptotic theory of random matrices. Modern aspects of random matrix theory, 72:83, 2014.
- [42] Weiran Shen, Pingzhong Tang, and Song Zuo. Automated mechanism design via neural networks. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems, 2019.
- [43] Volker Strassen. The existence of probability measures with given marginals. The Annals of Mathematical Statistics, 36(2):423–439, 1965.
- [44] Vasilis Syrgkanis. A sample complexity measure with applications to learning optimal auctions. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
- [45] James M Varah. A lower bound for the smallest singular value of a matrix. Linear Algebra and its applications, 11(1):3–5, 1975.
- [46] Andrew Chi-Chih Yao. An n-to-1 bidder reduction for multi-item auctions and its applications. In SODA, 2015.
Appendix A Missing Proof of Lemma 2
Proof of Lemma 2: The proof essentially follows the same analysis as Theorem 3 in [7]; we only provide a sketch here. Since we are working with the matrix factorization model and can directly exploit the low dimensionality of the latent representation, we manage to replace the dependence on with in both the revenue loss and the violation of the truthfulness constraints. Our proof relies on the idea of “simultaneous coupling” by Brustle et al. [7]. More specifically, it couples with every distribution in the -Prokhorov-ball around . If we round both and any to a random grid with size , we can argue that the expected total variation distance (over the randomness of the grid) between the two rounded distributions is (using Theorem 2 in [7]). Now consider the following mechanism: choose a random grid , round the bids to the random grid, and apply the mechanism that we designed for the rounded distribution of . More specifically, is the following mechanism: for each bid , use to sample a bid and run on the bid profile . Since the expected total variation distance (over the randomness of the grid) between the two rounded distributions is , we only need to argue that, when the given distribution and the true distribution are close in total variation distance, we can robustify the mechanism designed for one distribution so that it performs well under the other. This is a much easier task, and we again use an argument similar to the one in [7] to prove it. Combining everything, we can show that the randomized mechanism we constructed is approximately truthful and loses only negligible revenue compared to under any distribution within the -Prokhorov-ball around the given distribution.
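The random-grid rounding step in the sketch above can be illustrated as follows (Python; the function name, grid spacing, and bid values are hypothetical choices of ours). Rounding to a randomly shifted grid, rather than a fixed one, is what makes the expected total variation distance between the rounded versions of two nearby distributions small on average over the shift.

```python
import numpy as np

def round_to_random_grid(bids, delta, rng):
    """Round each bid down to a grid of spacing delta with a uniform random offset."""
    shift = rng.uniform(0.0, delta)
    return np.floor((bids - shift) / delta) * delta + shift

rng = np.random.default_rng(0)
bids = np.array([0.73, 1.28, 2.51])
rounded = round_to_random_grid(bids, delta=0.1, rng=rng)

# Rounding moves each bid down by less than the grid spacing delta.
assert np.all(bids - rounded >= 0)
assert np.all(bids - rounded < 0.1)
```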