[email protected]
Drexel University: [email protected]
Oberlin College: [email protected]
Simple Delegated Choice
Abstract
This paper studies delegation in a model of discrete choice. In the delegation problem, an uninformed principal must consult an informed agent to make a decision. Both the agent and principal have preferences over the decided-upon action which vary based on the state of the world, and which may not be aligned. The principal may commit to a mechanism, which maps reports of the agent to actions. When this mechanism is deterministic, it can take the form of a menu of actions, from which the agent simply chooses upon observing the state. In this case, the principal is said to have delegated the choice of action to the agent.
We consider a setting where the decision being delegated is a choice of a utility-maximizing action from a set of several options. We assume the shared portion of the agent’s and principal’s utilities is drawn from a distribution known to the principal, and that utility misalignment takes the form of a known bias for or against each action. We provide tight approximation analyses for simple threshold policies under three increasingly general sets of assumptions. With independently-distributed utilities, we prove a -approximation. When the agent has an outside option the principal cannot rule out, the constant-approximation fails, but we prove a -approximation, where is the ratio of the maximum value to the optimal utility. We also give a weaker but tight bound that holds for correlated values, and complement our upper bounds with hardness results. One special case of our model is utility-based assortment optimization, for which our results are new.

Emmanouil Pountourakis was partially supported by NSF CCF 2218813. Samuel Taggart was partially supported by NSF CCF 2218814.
1 Introduction
This paper considers a model of delegated stochastic probing. A decisionmaker (the principal) must pick one of actions, each of which yields randomly distributed reward. Rather than observe rewards directly, the decisionmaker chooses a subset of the actions to allow an agent to consider. The agent observes the realized rewards exactly, but may be biased towards certain actions and away from others. The principal’s goal is to select a set of actions that will maximize their expected reward from the agent’s biased choice. This template captures a range of economic and managerial dilemmas. As examples, a firm might seek to replace a piece of expensive equipment, or a national health service must choose which treatment to provide to a patient who might display a range of symptoms. The equipment operators know their needs better than managers, and the health service relies on doctors to observe patients. In such arrangements, the agent and principal tend not to have preferences which are perfectly aligned: the firm must pay for new equipment (while the operator does not), and specialist doctors might peddle lucrative optional procedures.
The algorithmic problem above can be couched as mechanism design. In a revelation mechanism, the agent would observe the actions’ rewards and report these to the mechanism, which would choose a possibly randomized action. The taxation principle states that every deterministic mechanism is equivalent to a menu: the principal selects the set of allowable actions, and the agent simply chooses their preferred action upon observing the rewards. Such mechanisms eliminate the need for communication between the agent and principal, and are therefore so common in practice that they are often taken for granted as a managerial tool. In economics, delegation refers exactly to this problem of menu design for a better-informed agent, coined by Holmstrom (1978).
In the examples above, the alignment of the agent’s and principal’s preferences is well-structured. The principal’s main uncertainty in the choice problem is payoff-relevant for both parties: in replacing equipment, both the operator and firm want to purchase the right tool for the job. Meanwhile, misalignment of preferences is predictable – the firm will pay for the new purchase, and prices are likely known in advance. Under these conditions, a particularly salient family of mechanisms is threshold mechanisms, which restrict the agent to actions where the misalignment of preferences is not too great. For our firm and operator, this would take the form of a budget.
Our Contributions.
This work gives a model for delegated choice scenarios like those discussed above. In our model, the agent’s and principal’s preferences for a particular action are captured by two quantities. First, each action has a shared value , which is unknown to the principal (though distributed according to a known prior) but observable to the agent. Second, each action has a commonly-known and fixed bias , which captures the amount by which the agent’s utility differs from that of the principal. The agent may also have outside options which the principal cannot prohibit; we extend our model to capture this issue as well.
We study three increasingly general regimes, distinguished by the correlation or independence of the value distributions and the absence or presence of an outside option. For each, we give computational hardness results, then take a simple-versus-optimal perspective by completely characterizing the performance of threshold mechanisms. In more detail:
-
•
With independently distributed values and no outside option, we show that threshold mechanisms are a -approximation to the optimal mechanism. (Our results also hold with an outside option if that action has a fixed value, which we make precise subsequently.) We show that this problem is NP-hard.
-
•
With independently distributed values and an outside option, threshold mechanisms cannot obtain any nontrivial approximation in general. However, we show a parametrized -approximation, where is the ratio of the largest possible value to OPT. This problem generalizes the previous problem, and is thus also NP-hard.
-
•
With correlation, we give a -approximation, where is the probability of the least likely value profile. We show this problem is NP-hard to approximate below a constant factor.
We match all three approximation analyses of thresholds with tight examples. A special case of our model is utility-based assortment optimization, a canonical model from revenue management (discussed in Section 3). All our results are new to that literature.
Roadmap
In Section 2, we give our formal model. We then survey existing work on delegation in Section 3, and make specific comparisons to existing work on delegated search and assortment optimization. Section 4 contains our hardness result and constant approximation under independence, and lays the groundwork for our parametrized analysis with an outside option in Section 5. Finally, we analyze delegation with correlated values in Section 6.
2 Model
We now give our model of delegated choice. The principal seeks to choose from a discrete set of actions. The principal’s utility for action is given by a random value , which the principal is unable to observe. To assist in selecting an action, the principal may consult an agent, who observes all actions’ values, and may communicate with the principal after observation. We decompose the agent’s utility for action into its value, shared with the principal, and an unshared bias term. That is, the agent’s utility is given by . Throughout the paper, we assume each bias is constant and known to the principal.
We assume the principal has the power to commit ex ante to a mechanism for communicating with the agent and selecting an action, and study deterministic mechanisms. By the taxation principle, it suffices to consider mechanisms described by menus over actions. The agent observes all actions’ values and selects their utility-maximizing action from the menu — which may differ from the principal’s preferred action. Taking this perspective, we consider the algorithmic problem of selecting a menu to maximize the principal’s expected utility when the agent selects their preferred action according to the observed values. We further assume the existence of an outside option for the agent, denoted action , with value and bias . We assume that regardless of the principal’s choice of , the agent may always select this action.
Formally, when presented with action set and after observing the vector of values , denote the agent’s preferred choice by . That is, . The principal is faced with a set function optimization problem. We assume the principal has a prior distribution over the values , and must select a menu for the agent which maximizes their own expected utility. (We assume the agent breaks ties in the principal’s favor, then lexicographically.) That is, the principal solves:
The model above captures applications such as those described in Section 1. Note that we allow the agent’s utility to be negative, and that the model is invariant to additive shifts in the agent’s bias for every action. We will study a particularly simple set of mechanisms, namely threshold mechanisms. The threshold mechanism with bias is given by . Note that since the number of threshold policies is at most the number of actions, the principal may compute an optimal threshold efficiently. We analyze the approximation ratio between the best threshold menu and the optimal menu overall.
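To make the menu and threshold definitions concrete, the following sketch (ours; the function names and the sampler interface are illustrative, not from the paper) estimates the expected principal utility of each candidate threshold menu by Monte Carlo, with the agent maximizing value plus bias over the offered menu together with the outside option, and ties broken in the principal’s favor and then by index.

```python
def agent_choice(menu, values, biases):
    # Agent picks the action maximizing v_i + b_i; ties go to the principal
    # (higher v_i), then to the lexicographically smallest index.
    return max(menu, key=lambda i: (values[i] + biases[i], values[i], -i))

def expected_principal_utility(menu, sample_values, biases, trials=100_000):
    # Monte Carlo estimate of E[v of the agent's choice] for a fixed menu.
    # `sample_values()` draws one value profile (a list indexed by action);
    # action 0 is the outside option and is always available to the agent.
    total = 0.0
    for _ in range(trials):
        values = sample_values()
        total += values[agent_choice(menu | {0}, values, biases)]
    return total / trials

def best_threshold_menu(actions, sample_values, biases):
    # Only thresholds equal to some action's bias matter, so there are at
    # most as many candidate threshold menus as actions.
    best = (float("-inf"), None)
    for t in sorted({biases[i] for i in actions}):
        menu = {i for i in actions if biases[i] <= t}
        best = max(best, (expected_principal_utility(menu, sample_values, biases), t))
    return best  # (estimated principal utility, threshold)
```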
Example 1
The equipment purchase example described in the introduction can be formulated as follows. The firm (principal) needs to buy a piece of equipment, which will be used by a specialist (agent) with knowledge of the quality of different brands. Each brand has quality , and price . Qualities are unknown to the principal, and prices are known. We may write values as and biases . Note that the values are random, while biases are known, as required. A threshold policy restricts to actions with bias —and hence price— at most .
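One instantiation consistent with this description (our reading; the exact symbols are not reproduced above) is to give the operator the quality alone while the firm also pays the price:
\[
v_i \;=\; q_i - p_i, \qquad b_i \;=\; p_i, \qquad v_i + b_i \;=\; q_i .
\]
Under this reading the bias is exactly the (known) price, so a threshold menu is a budget: the set of brands whose price is at most the threshold.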
Example 2
The health services example from the introduction may be heavily stylized as follows. A national health service (principal) needs to select a treatment for a patient with the help of a doctor (agent) who has expertise and observes the patient’s condition. Each potential treatment has cost (known to the doctor and the health service), and given the patient’s condition an efficacy (known to the doctor but not the health service). The health service seeks to maximize the patient’s health less costs, . The doctor is paid a portion of the costs, and shares some concern for the patient’s health. For some , we may therefore write . To cast this in our model, note that scaling agent utilities by will not change their decision, so we may normalize . After normalization, we have and . As required, the value depends on and is hence unknown to the principal, and the bias depends only on , and is hence known to the principal. Further note that a threshold set corresponds to a price cap, restricting the doctor away from the highest-cost procedures.
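One reading consistent with this description (our reconstruction; the constants are not reproduced above): the doctor’s payoff is \(\gamma e_i + \alpha c_i\) for efficacy \(e_i\), cost \(c_i\), and some \(\alpha, \gamma > 0\). Dividing by \(\gamma\) leaves the doctor’s choices unchanged, and
\[
v_i \;=\; e_i - c_i, \qquad v_i + b_i \;=\; e_i + \tfrac{\alpha}{\gamma}\, c_i \quad\Longrightarrow\quad b_i \;=\; \Bigl(1 + \tfrac{\alpha}{\gamma}\Bigr)\, c_i ,
\]
so the bias depends only on the known cost, and a bias threshold is a cap on the cost of the allowed treatments.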
3 Related Work
Simple Versus Optimal Mechanisms.
A primary contribution of computer science to the study of mechanism design is the use of approximation to explain the prevalence of simple mechanisms. For example, Hartline and Roughgarden (2009) prove that the simple auctions often observed in practice can obtain a constant factor of the sometimes-complicated, rarely-used optimal mechanisms. Hartline (2013) surveys similar results for auctions. Recently, Dütting et al. (2019) and Castiglioni et al. (2021) make similar forays into contract theory, characterizing the power of simple linear contracts. Our work initiates the study of delegated choice through a similar lens.
Real-Valued Delegation.
Delegated decisionmaking is a canonical problem in microeconomic theory and managerial science. Much of the literature subsequent to Holmstrom (1978) has focused on the special case where the state and action space are continuous and real-valued, and where the preferences of both the agent and principal are single-peaked, but differ by a known bias. Notable examples include Melumad and Shibano (1991), Martimort and Semenov (2006), Alonso and Matouschek (2008), and Amador and Bagwell (2010), who characterize the structure of optimal mechanisms under increasingly general variants of the single-peaked model. The main conclusions from these papers are necessary and sufficient conditions for the optimal delegation set to be an interval on the real line. Our work makes a similar known bias assumption, but in a model more amenable to algorithmic analysis. We obtain similar conclusions: the principal can secure high utility by restricting the agent away from extreme actions.
Additional work on similar models includes Kovác and Mylovanov (2009), who study the gap in performance between randomized and deterministic mechanisms, and Ambrus and Egorov (2017), who study a principal who can add additional nonmonetary costs to incentivize more preferred decisions. Aghion and Tirole (1997) and Szalay (2005) consider models in which one or more of the principal and the agent may expend effort to observe a signal about the state.
For multiple decisions, Frankel (2014) considers maxmin robust delegation and Kleiner (2023) studies Bayesian optimal mechanisms.
With the exception of Armstrong and Vickers (2010) and followup works, though, the economics literature has focused on the real-valued model for decisions. Our work considers the mathematically incomparable but similarly common setting of discrete choice. In the latter setting, the structure of the problem renders exact characterization of optimal mechanisms difficult, and motivates the use of a simple-versus-optimal approach.
Delegated Search.
The model of delegated project choice from Armstrong and Vickers (2010) is perhaps closest to ours. The authors consider an agent who chooses between discrete actions. The principal is able to verify the utilities provided by the selected action, and restrict the agent’s choice based on this information. Subsequent followups by Kleinberg and Kleinberg (2018), Bechtel and Dughmi (2021), and Bechtel et al. (2022) note a strong connection between the Armstrong and Vickers (2010) model and well-studied online stochastic optimization problems. They upperbound the delegation gap: they show that even when the agent must pay a search cost to discover each action’s utility, the principal can obtain utility within a constant factor of the first-best solution, where they solve the search problem themselves. More recently, Braun et al. (2022) give a version where the agent searches online, and make similar comparisons to first-best, and Hajiaghayi et al. (2023) study a multi-agent version of the model.
Our model differs from the delegated search literature in two notable ways. First is the absence of search. Our agent can perfectly observe the values of all actions. More significantly, our principal is unable to verify the utilities provided by the agent’s selected action; they may only rule actions in or out completely. In our model, the first-best solution is . The following example shows that no delegation set may approximate the first-best to a factor better than . This contrasts with the constant-approximation results from the work cited above.
Example 3
Consider an instance with independently-distributed actions. Action has a value which is with probability and otherwise. Each action has bias . The first-best expected utility is constant, while in any delegation set, the agent will always pick the highest-indexed action, yielding expected utility .
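The order of this gap can be checked numerically. The sketch below uses our own illustrative parameters (not necessarily those of Example 3, whose numbers are not reproduced above): each of n actions is worth n with probability 1/n and 0 otherwise, and biases grow so quickly that the agent’s ranking ignores values entirely, so any menu yields constant expected value while the first-best grows linearly in n.

```python
import random

def simulate(n=50, trials=100_000):
    # Illustrative parameters (ours): action i is worth n w.p. 1/n (else 0)
    # and has bias i * n**2, so bias gaps exceed any value difference.
    biases = [i * n ** 2 for i in range(n)]
    first_best = delegated = 0.0
    for _ in range(trials):
        values = [n if random.random() < 1 / n else 0.0 for _ in range(n)]
        first_best += max(values)  # principal choosing with full information
        choice = max(range(n), key=lambda i: values[i] + biases[i])
        delegated += values[choice]  # agent's pick from the full menu
    print(f"first-best ~ {first_best / trials:.2f}, delegation ~ {delegated / trials:.2f}")

simulate()
```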
Stochastic Probing.
There is by now an extensive literature on stochastic probing beyond the economically-inspired settings of this paper and those discussed above. Rather than survey the literature, we offer a few key recent papers, and refer the reader to these for deeper references: Chen et al. (2016); Goel et al. (2006); Mehta et al. (2020); Segev and Singla (2021).
Despite similarity of motivation, we employ techniques that largely differ from this literature.
Assortment Optimization.
Our model captures special cases of the well-studied assortment optimization problem. In assortment optimization, a seller must decide which among a set of fixed-price items to offer. A variety of models are common for the buyer’s purchase choice, including nested logit models (Davis et al., 2014; Li et al., 2015) and Markov chain-based choice (Feldman and Topaloglu, 2017), along with equivalent models based on random buyer utility (Berbeglia, 2016; Aouad et al., 2018, 2023), which includes the especially prevalent multinomial logit (MNL) model as a special case. Our model subsumes assortment optimization with utility-based choice. To see this, consider items, where the buyer utility for each item is random, and the revenue for item is known to the seller. Taking and for sufficiently small yields an equivalent delegation problem. Under this transformation, an outside option with corresponds to the no-buy option, and the option to buy elsewhere with positive utility can be captured with a randomized outside option.
Threshold mechanisms in our model correspond to revenue-ordered assortments, a well-studied class of solutions for assortment optimization. A series of papers analyze the approximation ratio of revenue-ordered assortments under increasingly general models: Talluri and Van Ryzin (2004) show that revenue-ordered assortments are optimal for several choice models including MNL; Rusmevichientong et al. (2014) analyze revenue-ordered assortments for mixtures of MNL models, and further prove NP-hardness of computing the optimal assortment; Berbeglia and Joret (2020) give parametrized analyses under a general choice model. Our approximation analyses for independently-distributed values with a random outside option (Section 5) apply to utility-based assortment optimization, and are new to this literature. Our logarithmic approximation for correlated values (Section 6) resembles that of Berbeglia and Joret (2020); it is less finely parametrized, but extends to more general forms of delegation.
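For concreteness, here is one version of the reduction consistent with the description above (an assumption on our part, since the exact formulas are not reproduced): item i gets shared value r_i + eps*u_i and known bias -r_i, so the agent’s utility eps*u_i ranks items exactly as the buyer does, the principal’s utility is essentially the revenue, and a threshold menu is precisely a revenue-ordered assortment.

```python
def assortment_to_delegation(revenues, sample_buyer_utils, eps=1e-6):
    # Hypothetical mapping (our reading of Section 3, not verbatim from the paper):
    # item i with known revenue r_i and random buyer utility u_i becomes an action
    # with shared value v_i = r_i + eps * u_i and known bias b_i = -r_i.
    biases = [-r for r in revenues]

    def sample_values():
        utils = sample_buyer_utils()  # one draw of the random buyer utilities
        return [r + eps * u for r, u in zip(revenues, utils)]

    return sample_values, biases

# Under this mapping the agent's utility is v_i + b_i = eps * u_i, so the buyer's
# ranking of items is preserved, while the principal's utility is r_i up to eps.
# The threshold menu {i : b_i <= t} equals {i : r_i >= -t}, i.e., the items whose
# revenue clears a cutoff: a revenue-ordered assortment.
```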
Other work on computational hardness or approximation in assortment optimization includes Désir et al. (2020), who prove hardness of approximation under a knapsack-constrained version of the problem, and Immorlica et al. (2018), who study a version where the buyer has combinatorial preferences over bundles.
4 Threshold Delegation with Independent Values
We now consider the simplest case of the model, where the principal’s prior over values is a product distribution, and hence actions’ values are independent. We further assume that the outside option’s value, , is deterministic, which subsumes the no-outside-option case, as we could have . We present our approximation result first, and defer hardness to Section 4.4.
Theorem 4.1
Under independent values and a deterministic outside option, there always exists a threshold mechanism with expected utility that is a -approximation to the optimal deterministic mechanism.
Theorem 4.1 holds regardless of the choice of the outside option’s fixed value and bias. Before giving the details of the proof, we derive two technical results which will facilitate the analysis. In Section 4.1, for any delegation set, we give a decomposition of the principal’s utility into two quantities, one aligned with the agent’s utility and one not. Then, in Section 4.2, we use independence to obtain a lower bound on the value from threshold sets which will prove useful for both this section’s and the next section’s analyses.
4.1 Utility Decomposition
The principal’s task is to balance two sources of utility. On the one hand, when some action has very high value, preferences are aligned: the principal benefits from giving the agent the flexibility to select this action. On the other hand, when actions have smaller values, the principal must control misalignment: they may benefit from restricting the agent away from actions with higher bias, inducing the agent to take actions that provide better value. We now decompose the principal’s utility for the optimal delegation set into two quantities, Sur and BDif, which roughly correspond to the value from each of these two cases.
To make the decomposition precise, note that for the optimal delegation set , there are two lower bounds imposed on the agent utility from any selection: first, the chosen action must be preferred to the outside option, action , which gives utility at least . Second, the agent’s utility is at least the bias of the most-biased action in . Denote the better of these bounds by . We can therefore think of the contribution of any action as decomposing into a bias difference and a surplus . Intuitively, the surplus captures the principal’s utility from giving the agent latitude to pick high-valued actions, and the bias difference captures the misaligned portion of the principal’s utility. Formally, the decomposition is the following.
Lemma 1
Let denote the optimal delegation set, and let . Define Sur and BDif as follows:
Sur
BDif
Then we can write .
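In symbols (our rendering of the definitions, writing \(i^\ast = i^\ast(S^\ast, \mathbf{v})\) for the agent’s choice from \(S^\ast\) and \(\beta\) for the better of the two lower bounds on the agent’s utility):
\[
\mathrm{Sur} \;=\; \mathbb{E}_{\mathbf{v}}\bigl[\, v_{i^\ast} + b_{i^\ast} - \beta \,\bigr],
\qquad
\mathrm{BDif} \;=\; \mathbb{E}_{\mathbf{v}}\bigl[\, \beta - b_{i^\ast} \,\bigr],
\]
so that the two terms sum, realization by realization, to \(v_{i^\ast}\), the principal’s utility under the optimal set.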
To verify the intuition that Sur captures the aligned portion of the principal’s utility, note that choosing the smallest threshold set containing all of secures Sur for the principal. Formally:
Lemma 2
Let . Then .
Proof
We will argue pointwise for each value profile . The action chosen by the agent under is , which has . Since is the agent’s favorite, we have . Hence,
Taking expectation over yields the lemma.
Lemma 2 implies that the main difficulty for obtaining approximately-optimal delegation sets is managing the misaligned portion of the principal’s utility. Section 4.3 gives this analysis for the case with fixed, yielding a -approximation. Note Lemmas 1 and 2 hold even when the outside option’s value is randomized. We will therefore make further use of them in our analysis of that case in Section 5.
4.2 Lower Bounds via Partial Derandomization
To compare the performance of a threshold set to the optimal set , we will show that threshold sets can retain sufficient value from without introducing actions in which overly distort the agent’s choices. Independence allows us to summarize the interference of with a single deterministic action. This will greatly simplify subsequent analyses. This section focuses on the case of fixed outside options, but we state our lemma for possibly randomized outside options. We will reuse the result in Section 5.
Lemma 3
Assume values are independently distributed. Then for any threshold set , there exists a single action with bias and deterministic value such that .
Note that need not be an action from the original delegation instance. The proof will follow from picking the worst realization of actions in for the principal. Note further that may differ for every threshold : hence our lower bounds correspond not to one derandomized delegation instance, but to one per threshold.
Proof
For brevity, denote by . Actions in may have randomized values. The principal’s expected utility can be computed by first realizing for all , then computing the principal’s expected utility over the values of actions in . Hence, there must exist a joint realization of values for each for which this latter expectation is at most . Let denote a new set of actions consisting of the actions with fixed as . We have . Any actions which are not selected in any realization of the values for may be removed from without consequence. However, since values are fixed for each , the agent consistently prefers some particular action over the others in . Hence, we may remove all actions but from without changing the principal’s utility.
We finally use this remaining action to construct . Let and denote the value and bias of . Define to have bias and value . Note that . Hence, the agent will choose from if and only if they would choose from . Moreover, . Hence, .
4.3 Proof of Theorem 4.1
We now show how to obtain a -approximation to the optimal delegation utility using threshold mechanisms, assuming is fixed. Lemma 1 decomposes the optimal utility into an aligned portion, Sur, and a misaligned portion, BDif. Furthermore, Lemma 2 states that Sur can be -approximated using a threshold set. Hence, it will suffice to obtain a -approximation to BDif using thresholds. To do so, we use the derandomization of Lemma 3 to derive an even stronger lower bound which holds when is fixed. We then select a threshold for which this lower bound is guaranteed to be large.
Lemma 4
For any threshold set :
To understand our lower bound, note two pitfalls a threshold set could face. First, a too-expansive threshold could include high-bias actions which attract the agent while providing little value. Second, a too-restrictive threshold could leave the agent with too few options. Lemma 4 states that these are the only two problems: if a threshold is sufficiently low and includes enough of the actions providing BDif for , will perform well.
Proof (Proof of Lemma 4)
We argue with respect to the derandomized action . For brevity, write . Depending on , we have two cases, each of which produces a lower bound on :
-
•
Case 1: . In this case, any time , we have . Since every choice from gives the agent utility at least , we have and hence . Integrating over all yields
-
•
Case 2: . Then regardless of , we have . Since the agent breaks ties in the principal’s favor, we also have that , so . We may conclude that for all , , and hence
Hence the lemma holds in both cases.
The lower bound in Lemma 4 is a minimum of two terms. We will now study the threshold , and observe that both terms in the minimum are at least . In particular, we can lower bound the second term as follows:
Lemma 5
Let . Then we have:
Proof
Let denote the event that . The following sequence of inequalities, explained below, implies the lemma:
BDif
The first equality is the definition of BDif. The third equality follows from the fact that under , and from the fact that this occurs with probability at most . The last equality follows from the definition of .
Proof (Proof of Theorem 4.1)
The proof of Theorem 4.1 used independence once, in the derandomization step of Lemma 3. Nevertheless, we show in Section 6 that independence is critical to guaranteeing the performance of threshold mechanisms by giving a super-constant lower bound in its absence. With independence, the following example matches the upper bound exactly:
Example 4
Our example will have five actions, with biases and value distributions given below. The outside option will have , and therefore can be ignored. Take two small numbers, and , with much smaller than . Actions will be as follows:
-
•
. is with probability , and otherwise.
-
•
. .
-
•
. .
-
•
. .
-
•
. is with probability , and otherwise.
We may analyze the instance neglecting terms, which only serve to break ties for the agent. The optimal delegation set is , with principal utility , where the first term comes from the event that either actions or realize their high values (in which case they are chosen), and the second term comes from the event that and are low-valued, in which case the agent prefers action . As , the optimal value goes to as . Meanwhile, no threshold obtains expected value better than . This yields an approximation ratio of in the limit.
4.4 Computational Hardness
We conclude the section by discussing the complexity of the delegation problem with independent values. For the discrete version of the problem, where every action is specified by a bias and a list of realizations , we prove:
Theorem 4.2
Delegation with independent values is NP-complete, even with no outside option.
The challenge in proving NP-hardness is managing the rigid structure of the joint value distribution imposed by independence. We adopt a similar strategy to Chen et al. (2014), who show that pricing to a unit-demand buyer is hard. We reduce from Integer Partition: given integers , the goal is to find a subset such that . We associate each integer with an action . Each such action impacts the principal’s utility via two low-probability realizations: a bad realization which harms the principal’s utility and a good realization which improves the principal’s utility. These low probabilities are tuned in such a way that only first- and second-order terms in the probability calculation are relevant. Furthermore, the tuning is such that the bad events scale linearly with the s, while the good events scale in a concave way, with the principal’s utility being maximized when actions taken correspond to an even split of the integers. Full details can be found in Appendix 0.A.1. Note that Theorem 4.2 also implies hardness of the model in the next section, with a random outside option.
5 Randomized Outside Options
In Section 4, we showed that with a fixed (or non-existent) outside option, a simple delegation set secures a constant fraction of the utility from the optimal delegation set. We now consider the case where the outside option’s value is randomized. This may be more realistic in scenarios such as assortment optimization, where the agent’s outside option is taking an action (i.e. buying a good) somewhere else. In this regime, we again give tight bounds. In Section 5.1, we show that no nontrivial multiplicative approximation is possible with threshold sets: there are examples where thresholds give no better than an -approximation, which can be matched trivially. However, in Section 5.2 we show that such lower bound examples are necessarily unnatural. In particular, we parametrize our analysis by the ratio , where OPT is the optimal principal utility and the highest value in any action’s support. We prove that the worst-case approximation is : hence, whenever thresholds perform poorly, it is because the optimal solution relies on exponentially large, exponentially rare values.
5.1 Unparametrized Analysis: Impossibility
This section gives an unparametrized analysis of threshold delegation with a randomized outside option. We show that it is not possible to guarantee a nontrivial approximation factor which holds across all instances.
Our constant-approximation in Section 4 relied on our ability to separate the optimal utility into two parts, BDif and Sur. Approximating the bias difference BDif was the crux of the analysis. The following example shows that with a random outside option value , this analysis — and in particular the approximation of BDif — fails. We will choose our distribution over to streamline exposition, but the example that follows could be adjusted so that the distribution over satisfies nearly any desired regularity condition.
Example 5
Our example will feature two sets of actions: good actions, which are taken by the optimal delegation set, and bad actions, which are not. We will index the actions so that the th good action is , and the bad action between good actions and is . For will have:
-
•
bias :
-
•
value with probability , and otherwise. (It is equivalent for this example to use a distribution with probability , and otherwise. Under this distribution, the example becomes an instance of assortment optimization, as described in Section 3. This makes our parametrized analysis in Section 5.2 tight even for that special case.)
The bad actions will be indexed by for . Bad action will have
-
•
bias .
-
•
value , for .
The outside option will have bias and value distributed according to a discrete distribution. We will set . For , we will choose probability mass function . Note that we have picked these probabilities so that . The values of all actions described above are independent.
A solution to the delegation instance we just described is to take only good actions. The probability that at least one good action takes its high value is . Assume this event has occurred, and that the agent’s preferred good action is . Then is preferred to the outside option with probability . Hence, the principal’s expected utility from choosing only good actions is at least:
Now consider a threshold set . It is without loss of generality to consider for some , which implies that the highest-bias actions in are and . For any good action , with , the agent’s utility for on a high-valued realization is . Hence, the agent ignores all actions other than , , and the outside option. If draws its high value, the principal gets utility if and only if survives the outside option, which happens with probability . Otherwise, the agent looks to action , and takes it over the outside option with probability . Ignoring value from the outside option, which goes to as , we can account for the utility from as follows:
where the latter approximation holds for and sufficiently small. This implies that every threshold incurs a loss which is .
An upper bound of for threshold mechanisms is trivial, by the following lemma. Hence, up to a constant, the lower bound in Example 5 is tight.
Lemma 6
For any set and , let denote the contribution to from action . Then .
Proof
Consider any where . The action chosen by the agent under is , which has . Since is the agent’s favorite, we have . Hence, . Taking expectation over we obtain:
where the first inequality follows from the nonnegativity of .
An -approximation then follows from noting that for any set , .
Corollary 5.1
With independent values (and possibly randomized outside option), the best threshold is an -approximation to the optimal delegation set.
5.2 Parametrized Approximation
In the previous section, we gave an example where no threshold set was better than an -approximation. However, this example was extreme, in the sense that while the optimal solution obtained utility, some actions had values which were as large as . We now show that this is no coincidence: any example where threshold mechanisms perform poorly must be unnatural in this way.
Theorem 5.1
Let , where OPT is the optimal principal utility and the highest value in any action’s support. Then with independent values (and a possibly randomized outside option), the best threshold is a -approximation to OPT.
Theorem 5.1 is of particular interest for the application of assortment optimization. For an instance of the latter problem, each item yields revenue for the seller, and value for the buyer. Framed as a delegation problem, we have , for sufficiently small . Hence, Theorem 5.1 implies that the prices must be extreme whenever revenue-ordered assortments perform poorly. Another consequence of Theorem 5.1 is a bicriteria approximation when values lie in : either some threshold obtains a small multiplicative approximation, or it is trivial to obtain an additive approximation.
Proof (Proof of Theorem 5.1)
We will argue the contrapositive.
That is, we will argue with respect to some integer , and assume that no threshold obtains a -approximation for any .
Under this assumption, we show that there must be an action with value at least with positive probability.
The analysis will roughly proceed in three steps.
First, we partition the optimal solution into subsets with roughly equal contribution to OPT.
We then consider the thresholds based on each of these subsets, and compare their utility to that from the sets themselves; by assumption, no such threshold will outperform its respective subset.
Finally, we combine the resulting inequalities to show that the only way all can hold simultaneously is if the bias of one of these thresholds is extreme.
This will imply the existence of a comparably high value.
Throughout, we will make use of our decomposition and derandomization from Lemmas 1 and 3, respectively.
Decomposing OPT
Before defining our thresholds, we note that when no threshold approximates OPT well, we may draw several simplifying conclusions about the structure of OPT. Let be an optimal subset of actions, and assume every action in is selected with positive probability. Following Lemma 1, write , and write . By Lemma 2, it must be that , or else the grand threshold would be an -approximation, contradicting the nonexistence of any -approximation. We may therefore focus our analysis on . It must again be that no threshold obtains better than an -approximation to .
Next, note that no one action can comprise a large fraction of BDif. More precisely, let
denote the contribution of action to OPT and BDif, respectively. Since only if , we must have .
Lemma 6 implies that we may obtain from a threshold set for any .
We must therefore have that , and therefore that .
The remainder of the proof will focus on approximating , assuming no approximation better than is possible.
Lemma 6 implies that for all .
Constructing Thresholds
We now obtain a sequence of candidate thresholds by partitioning the actions in based on their contribution to .
Let , and relabel the actions in as , with for all .
Actions not in will be indexed (with keeping its label).
We will partition greedily to produce subsets of roughly equal contribution to .
More precisely, define the first breakpoint between subsets as , and for , define subsequent breakpoints recursively as the smallest such that .
Define the partition as for all .
We can upper- and lowerbound the distortion of each bin:
since for all , it must be that is nonempty for .
Further define .
We must also have for all , and hence no threshold can attain better than a -approximation to for any .
We will define candidate thresholds based on each bin: let (with ), and define the threshold sets for all .
Lower Bounding Threshold Utilities
To lower bound the principal utility from , we apply Lemma 3 and consider for some suitably constructed interfering action with bias . We write . Note that the interfering action must give the agent high utility, or else would perform as well as its respective bin . Formally, let be the event that the agent’s preferred action from is in . (Note that the agent may still ultimately choose action under either or , and that .) If , then the threshold performance can be written as
The righthand expression is the bias difference conditioned on , which is at least . This contradicts our assumption that no threshold approximates to a factor better than .
We will now lowerbound by decomposing it into two terms, depending on the event :
(1)
To lowerbound the first term the right of (1), define the event to be the event that , and to be the event that the agent’s favorite action in is also their favorite in . In , the agent may still ultimately take the outside option. We may now write:
where the inequality comes from the fact that under , the agent chooses the same thing from as they would from , and hence that action gives the agent utility at least .
To lowerbound the second term on the right of (1), let be the event that the agent prefers to the outside option, i.e. . We may then lowerbound the second term as:
The first inequality follows from the fact that the action chosen in yields utility at least and has bias at most . The second inequality follows from the facts that and and are independent. Taking the two terms together, we have the lower bound
(2)
-
•
: agent’s favorite non- action in is in .
-
•
: .
-
•
: the agent’s favorite action in is also their favorite in .
-
•
: .
-
•
:
Upper Bounding
Next, we upperbound the contribution of to BDif. We will split based on the event that the agent’s favorite action is the same between and . Let denote the event that . We have:
(3)
We can now upperbound each term in (3), starting by rewriting the leftmost. In the intersection event , the favorite non- action from and are the same. Hence conditioned on this event, and . Therefore:
To upperbound the second term in (3), note first that in the event , the chosen action’s bias is at least and hence the bias difference is at most . Second, note that conditioned on , it must be that the agent prefers to all actions in . Hence, conditioned on , any time the agent prefers an action in to the outside option, they also prefer . Hence, . Finally, note that and are independent. These facts imply:
Combining the two terms, we have:
(4)
Lowerbounding Bias Differences
Inequality (2) lowerbounds in terms of , and inequality (4) upperbounds in terms of . Since no threshold is better than a -approximation to , it must hold that our upper bound on exceeds our lower bound on . That is:
We may rearrange this as
Taking the product over all and canceling yields:
Note that the righthand side is a convex, symmetric function of the s. Moreover, we have . Hence, minimizing the righthand side as a function of the s yields a minimum at for all , and hence . Note also that . Since , we obtain:
Since every action in is selected with positive probability, and since , it must be that some action in has value at least with positive probability. Since , we obtain the desired lower bound on .
6 Threshold Delegation with Correlated Values
In the previous sections, we showed that under independently-distributed values, simple threshold rules obtain a close approximation to the optimal principal utility. We now allow arbitrarily correlated values and show that the situation worsens considerably. Assuming the value distribution is discrete, we prove a tight approximation guarantee for the principal’s best threshold policy, showing that it is a -approximation, where denotes the lowest probability mass of any value profile realization. Hence, absent independence, threshold policies still perform well under low levels of uncertainty, but their performance gradually degrades as the uncertainty grows more extreme. We state our results formally below, starting with our upper bound.
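Since a correlated instance in this section is simply an explicit table of value profiles with probabilities, the gap between the best threshold menu and the best menu overall can be computed exactly by brute force on small instances. A sketch (ours; the instance encoding is illustrative):

```python
from itertools import combinations

def expected_utility(menu, profiles, biases):
    # profiles: list of (probability, values) pairs over the joint value space.
    # Action 0 (the outside option) is always available; ties are broken in the
    # principal's favor, then by index.
    total = 0.0
    for prob, values in profiles:
        choice = max(menu | {0},
                     key=lambda i: (values[i] + biases[i], values[i], -i))
        total += prob * values[choice]
    return total

def best_threshold_vs_optimal(actions, profiles, biases):
    best_thr = max(expected_utility({i for i in actions if biases[i] <= t},
                                    profiles, biases)
                   for t in sorted({biases[i] for i in actions}))
    best_opt = max(expected_utility(set(menu), profiles, biases)
                   for r in range(len(actions) + 1)
                   for menu in combinations(actions, r))
    return best_thr, best_opt
```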
Theorem 6.1
There always exists a threshold policy which is a -approximation, where is the mass of the least likely value profile.
Proof
Let be the optimal delegation set, and let be the maximum bias across actions in . Let be the random variable that corresponds to the bias of the action chosen in . Let be the bias threshold such that and . Let be the principal utility generated conditioned on . We claim that using the best threshold out of or achieves principal utility at least .
First consider the threshold . Consider any realization where picks an action with bias at least . Let be the value and bias of the action chosen by , and let be the value and bias of the action chosen by , respectively. Since the action chosen by is available in , it must be that
Note that, from our assumptions, and ; therefore, the pointwise loss of compared to is at most in this event. As a result, we can lower bound the principal utility of as follows:
Second, consider the threshold . Every time chooses an action with bias less than or equal to , this action is also available to . Note that if such an action is chosen, its agent utility must be at least ; otherwise, the action with the maximum bias would have been chosen instead. The action chosen in must therefore have agent utility at least . Since the bias is at most , this means that the principal utility from that action is at least . We conclude that
As a result,
Hence, the best of these two sets provides at least principal utility. Note that gets at most utility from this event; therefore, the best of these thresholds provides a -approximation to the event’s contribution to ’s principal utility.
We have shown that there exists a threshold that approximates the utility of conditioned on the event that , that is,
Let us focus on the remaining principal utility obtained by in the event that , where the bias of the most-biased action is smaller than . One key observation is that this event happens with probability at most , by our choice of , since .
If we consider the conditional distributions on this event, we can repeat the same analysis to prove that there exists a bias threshold such that and also
which corresponds to a -approximation to the contribution to the solution’s principal utility in this interval. Note that
Repeating this process shrinks the probability mass of the remaining probability space by half each time. Let be the maximum number of times we can repeat this process. There are two ways this process can stop: either we are left with a single action, or the probability that ( bias) is strictly below the last used threshold is . Since the minimum probability of any realization is , and each repetition halves the remaining probability mass, cannot be larger than .
This process generates disjoint events for such that
and in addition
If we combine these two equations together, we get that
Since , we get that the best possible threshold is at least a -approximation.
Matching lower bound.
We show in Appendix 0.A.2 that our analysis in Theorem 6.1 is tight up to a constant factor. We do so by providing instances where no threshold policy can outperform the logarithmic approximation ratio.
Theorem 6.2
There exists a family of instances where no threshold policy is better than a -approximation.
In Appendix 0.A.3, we also prove the following supplementary hardness result for the case with discretely-distributed, correlated values:
Theorem 6.3
With correlated, discrete values, there exists a constant such that it is NP-hard to compute a mechanism with approximation factor better than .
To prove this result, we reduce from bounded-degree vertex cover, which is similarly hard to approximate.
Acknowledgments
The authors thank Bobby Kleinberg, Hamsa Bastani, Rediet Abebe, and Shaddin Dughmi for helpful discussions. The authors further thank Yuanzhe Liu, Yichi Zhang, and Pucheng Xiong for their help as undergraduate research assistants on this and related projects. Part of this work was done while Ali Khodabakhsh and Emmanouil Pountourakis were visiting the Simons Institute for the Theory of Computing.
References
- Aghion and Tirole (1997) Philippe Aghion and Jean Tirole. 1997. Formal and real authority in organizations. Journal of political economy 105, 1 (1997), 1–29.
- Alonso and Matouschek (2008) Ricardo Alonso and Niko Matouschek. 2008. Optimal delegation. The Review of Economic Studies 75, 1 (2008), 259–293.
- Amador and Bagwell (2010) Manuel Amador and Kyle Bagwell. 2010. On the optimality of tariff caps. Unpublished working paper (2010).
- Ambrus and Egorov (2017) Attila Ambrus and Georgy Egorov. 2017. Delegation and nonmonetary incentives. Journal of Economic Theory 171 (2017), 101–135.
- Aouad et al. (2018) Ali Aouad, Vivek Farias, Retsef Levi, and Danny Segev. 2018. The approximability of assortment optimization under ranking preferences. Operations Research 66, 6 (2018), 1661–1669.
- Aouad et al. (2023) Ali Aouad, Jacob Feldman, and Danny Segev. 2023. The exponomial choice model for assortment optimization: an alternative to the MNL model? Management Science 69, 5 (2023), 2814–2832.
- Armstrong and Vickers (2010) Mark Armstrong and John Vickers. 2010. A model of delegated project choice. Econometrica 78, 1 (2010), 213–244.
- Bechtel and Dughmi (2021) Curtis Bechtel and Shaddin Dughmi. 2021. Delegated Stochastic Probing. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
- Bechtel et al. (2022) Curtis Bechtel, Shaddin Dughmi, and Neel Patel. 2022. Delegated Pandora’s box. arXiv preprint arXiv:2202.10382 (2022).
- Berbeglia (2016) Gerardo Berbeglia. 2016. Discrete choice models based on random walks. Operations Research Letters 44, 2 (2016), 234–237.
- Berbeglia and Joret (2020) Gerardo Berbeglia and Gwenaël Joret. 2020. Assortment optimisation under a general discrete choice model: A tight analysis of revenue-ordered assortments. Algorithmica 82, 4 (2020), 681–720.
- Braun et al. (2022) Pirmin Braun, Niklas Hahn, Martin Hoefer, and Conrad Schecker. 2022. Delegated Online Search. arXiv preprint arXiv:2203.01084 (2022).
- Castiglioni et al. (2021) Matteo Castiglioni, Alberto Marchesi, and Nicola Gatti. 2021. Bayesian Agency: Linear versus Tractable Contracts. arXiv preprint arXiv:2106.00319 (2021).
- Chen et al. (2016) Wei Chen, Wei Hu, Fu Li, Jian Li, Yu Liu, and Pinyan Lu. 2016. Combinatorial multi-armed bandit with general reward functions. Advances in Neural Information Processing Systems 29 (2016).
- Chen et al. (2014) Xi Chen, Ilias Diakonikolas, Dimitris Paparas, Xiaorui Sun, and Mihalis Yannakakis. 2014. The complexity of optimal multidimensional pricing. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms. SIAM, 1319–1328.
- Clementi and Trevisan (1999) Andrea EF Clementi and Luca Trevisan. 1999. Improved non-approximability results for minimum vertex cover with density constraints. Theoretical Computer Science 225, 1-2 (1999), 113–128.
- Davis et al. (2014) James M Davis, Guillermo Gallego, and Huseyin Topaloglu. 2014. Assortment optimization under variants of the nested logit model. Operations Research 62, 2 (2014), 250–273.
- Désir et al. (2020) Antoine Désir, Vineet Goyal, Danny Segev, and Chun Ye. 2020. Constrained assortment optimization under the Markov chain–based choice model. Management Science 66, 2 (2020), 698–721.
- Dütting et al. (2019) Paul Dütting, Tim Roughgarden, and Inbal Talgam-Cohen. 2019. Simple versus optimal contracts. In Proceedings of the 2019 ACM Conference on Economics and Computation. 369–387.
- Feldman and Topaloglu (2017) Jacob B Feldman and Huseyin Topaloglu. 2017. Revenue management under the Markov chain choice model. Operations Research 65, 5 (2017), 1322–1342.
- Frankel (2014) Alexander Frankel. 2014. Aligned delegation. American Economic Review 104, 1 (2014), 66–83.
- Goel et al. (2006) Ashish Goel, Sudipto Guha, and Kamesh Munagala. 2006. Asking the right questions: Model-driven optimization using probes. In Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 203–212.
- Hajiaghayi et al. (2023) MohammadTaghi Hajiaghayi, Keivan Rezaei, and Suho Shin. 2023. Multi-agent Delegated Search. In Proceedings of the 2023 ACM Conference on Economics and Computation. ACM.
- Hartline (2013) Jason D Hartline. 2013. Mechanism design and approximation. Book draft. October 122 (2013), 1.
- Hartline and Roughgarden (2009) Jason D Hartline and Tim Roughgarden. 2009. Simple versus optimal mechanisms. In Proceedings of the 10th ACM conference on Electronic commerce. 225–234.
- Holmstrom (1978) Bengt Robert Holmstrom. 1978. On Incentives and Control in Organizations. Ph. D. Dissertation.
- Immorlica et al. (2018) Nicole Immorlica, Brendan Lucier, Jieming Mao, Vasilis Syrgkanis, and Christos Tzamos. 2018. Combinatorial Assortment Optimization. In International Conference on Web and Internet Economics. Springer, 218–231.
- Kleinberg and Kleinberg (2018) Jon Kleinberg and Robert Kleinberg. 2018. Delegated Search Approximates Efficient Search. In Proceedings of the 2018 ACM Conference on Economics and Computation. ACM, 287–302.
- Kleiner (2023) Andreas Kleiner. 2023. Optimal Delegation in a Multidimensional World. In Proceedings of the 2023 ACM Conference on Economics and Computation. ACM.
- Kovác and Mylovanov (2009) Eugen Kovác and Tymofiy Mylovanov. 2009. Stochastic mechanisms in settings without monetary transfers: The regular case. Journal of Economic Theory 144, 4 (2009), 1373–1395.
- Li et al. (2015) Guang Li, Paat Rusmevichientong, and Huseyin Topaloglu. 2015. The d-level nested logit model: Assortment and price optimization problems. Operations Research 63, 2 (2015), 325–342.
- Martimort and Semenov (2006) David Martimort and Aggey Semenov. 2006. Continuity in mechanism design without transfers. Economics Letters 93, 2 (2006), 182–189.
- Mehta et al. (2020) Aranyak Mehta, Uri Nadav, Alexandros Psomas, and Aviad Rubinstein. 2020. Hitting the high notes: Subset selection for maximizing expected order statistics. Advances in Neural Information Processing Systems 33 (2020), 15800–15810.
- Melumad and Shibano (1991) Nahum D Melumad and Toshiyuki Shibano. 1991. Communication in settings with no transfers. The RAND Journal of Economics (1991), 173–198.
- Rusmevichientong et al. (2014) Paat Rusmevichientong, David Shmoys, Chaoxu Tong, and Huseyin Topaloglu. 2014. Assortment optimization under the multinomial logit model with random choice parameters. Production and Operations Management 23, 11 (2014), 2023–2039.
- Segev and Singla (2021) Danny Segev and Sahil Singla. 2021. Efficient approximation schemes for stochastic probing and prophet problems. In Proceedings of the 22nd ACM Conference on Economics and Computation. 793–794.
- Szalay (2005) Dezsö Szalay. 2005. The economics of clear advice and extreme options. The Review of Economic Studies 72, 4 (2005), 1173–1198.
- Talluri and Van Ryzin (2004) Kalyan Talluri and Garrett Van Ryzin. 2004. Revenue management under a general discrete choice model of consumer behavior. Management Science 50, 1 (2004), 15–33.
Appendix 0.A Supplementary materials
0.A.1 Proof of Theorem 4.2
To see that the problem is in NP, note that given an action set , we may compute the principal’s expected utility in polynomial time. Specifically, let be the first elements of , let and . Both can be computed for all by a simple dynamic program, considering . The utility of can then be computed using the law of total expectation.
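The membership argument needs only that a fixed menu can be evaluated exactly in polynomial time under independence. The sketch below is our own rendering of such a dynamic program (not necessarily the one used in the proof): it scans the menu in index order, maintaining a distribution over the pair (best agent utility seen so far, principal value of the incumbent choice).

```python
from collections import defaultdict

def exact_menu_utility(menu, supports, biases, outside=(0.0, 0.0)):
    # supports[i]: list of (prob, value) pairs for action i; values are independent.
    # Ties in agent utility are broken in the principal's favor, then by index.
    v0, b0 = outside
    # State: (best agent utility so far, principal value of the incumbent) -> prob.
    states = {(v0 + b0, v0): 1.0}
    for i in sorted(menu):
        nxt = defaultdict(float)
        for (best_a, best_v), p in states.items():
            for q, v in supports[i]:
                a = v + biases[i]
                if a > best_a or (a == best_a and v > best_v):
                    nxt[(a, v)] += p * q
                else:
                    nxt[(best_a, best_v)] += p * q
        states = dict(nxt)
    # Expected principal value of the agent's final choice.
    return sum(p * v for (_, v), p in states.items())
```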
To show hardness, we will reduce from Integer Partition. An instance of this problem is integers . The goal is to find a subset such that . Let , and . Consider the following delegation instance, with actions.
-
•
Actions have bias for some (large, to be chosen). Let be a number small enough to only matter for agent tiebreaking (and which we will omit from all principal utility computations). The value of action will be
-
*
(high realization) with probability
-
*
(low realization) with probability
-
*
-
•
Action has value with probability and with probability takes value . Action will have bias .
First observe that any set of actions not containing action is suboptimal. By a union bound, the utility from such a set is at most . The utility from taking action alone, meanwhile, is , which is at least as long as . It follows that the principal’s problem is to choose which of the actions to include alongside . Now consider the principal utility from a set for some . To compute the principal’s utility, consider the following two events:
-
•
Let be the event that at least one action in has a high realization. In this event, the agent will choose such an action over , no matter the value of action .
We can approximate the probability of this event using only first-order terms. In more detail, this event has probability
Call the second term on the right . We can show that :
As long as , we have the desired upper bound.
-
•
Let be the event that no action in has a high realization, and at least one has a low realization. Then this event has probability:
Call the last term . A similar argument to the one for shows that . We include it below for completeness.
This implies the desired bound as long as .
The third term above can also be simplified:
Call the first two terms above . Since for any , and , we have . We therefore have .
Analyzing the Principal’s Utility
Given actions , we can write the principal’s utility as:
Taking , we obtain:
The first term does not depend on , and the last term will turn out to be negligibly small. We next analyze the middle term, leaving out , , and for the moment.
This latter expression takes value if the integers can be exactly partitioned, and value at most otherwise. Now we can take , , , and into account. Specifically, we can write:
The first inequality follows from applying the triangle inequality, along with our existing bounds on , , and and the fact that by a union bound. As long as , we will have that . We can therefore solve our Integer Partition instance by asking if our constructed instance of delegation has value at least . Any solution that exactly partitions the integers will obtain at least this value, and any solution that fails to do so will have objective value at most .
0.A.2 Proof of Theorem 6.2
In this appendix, we show that the logarithmic approximation upper bound of Theorem 6.1 is tight, up to a constant factor. That is, no threshold algorithm can perform better than . To prove the tightness of our analysis in Section 6, we construct an infinite family of instances.
For any , consider an instance with actions. The correlated distribution has value profile realizations. We construct a value matrix where each row corresponds to an action and each column corresponds to a realization. Therefore, the value at cell gives the value of action at value profile realization . The distribution over value profiles simply selects a value profile realization uniformly at random.
(5)
(15) |
Note that all the empty entries in the above matrix are zero, and are removed to make the structure of the matrix more apparent. Also, every solution corresponds to eliminating a set of rows. For each realization (column), the row with maximum agent utility is selected. The optimal solution is to select the set of odd actions (with size ). The colored entries indicate (realization, action) pairs that contribute to the optimal principal’s utility (). The optimal utility is equally divided between the odd rows, leaving for each one: the first row has in the first column, the third row has in columns 2 and 3, the fifth row has over the next columns and so on. In the example, the even actions are constructed to lower the principal’s utility whenever they are included in a threshold solution. In every state (column), we divide the colored utility by 2 to find the utility of the next row, and keep dividing by 2 to complete the subsequent even rows. The terms are added to break the ties and are of little importance.
Next, we define the bias: we set , and the rest of the actions have the following bias:
(16) |
Now that all the parameters are set, it is easy to verify that given the set of odd actions (rows), the agent will indeed pick the colored entries. This generates the optimal utility, since it is optimal in every single realization. Since each value profile is realized with probability , the optimal expected utility is equal to:
However, the best threshold solution in the constructed instance is to allow the entire set of actions (). To see this, assume that the principal allows actions with bias less than or equal to for some . (Thresholds set at even-indexed actions can be easily shown to be suboptimal.) Note that every even action is preferred by the agent to any other action with less bias. Therefore, the only actions chosen by the agent are or . In this case, the principal will get utility of from the first states, and zero from the remaining states.
Observe that the overall utility is an increasing function in , meaning that the best strategy for the principal is to not limit the agent. In this case, the agent will pick the penultimate action in the first half of columns, and the last action for the second half, generating utility of (almost) 2 for the principal in every state. More precisely, we have:
We get the desired lower bound by dividing the above objectives:
Example 6
In order to make sure that the above construction is clear, here we present the full matrices for the case of , which translates into actions and realizations. The value matrix in this case is
Calculating the bias in (16) results in:
It is clear that the value matrix is non-negative, and the agent’s utility will be:
Observe that by the set of odd actions , while from the entire set of actions .
0.A.3 Proof of Theorem 6.3
Proof
We give a reduction from the bounded-degree vertex cover problem, i.e., the vertex cover problem on graphs with degree at most (constant). This problem is known to be APX-hard (Clementi and Trevisan, 1999). Consider an instance of the bounded-degree vertex cover problem with nodes and edges (where ). (To distinguish between the parameters of the vertex cover instance and the delegation instance, we use a tilde ( ) for the graph instance.)
We construct an instance of the delegation problem with actions, with action corresponding to node , and an additional “default” action . All actions have bias , apart from , which has bias . The correlated distribution of the actions’ values is defined as follows: we pick an edge or some node uniformly at random, i.e., each element with probability
If we picked some edge , then we assign value to actions and , to the default action , and for all other actions. If we picked a node , we assign value to and (the default action), and for all other actions.
We claim that the optimal solution of the delegation problem produces a utility of for the principal, where is the size of the smallest vertex cover of . To see this, first note that any solution can be improved by including , since has a negative bias. Any time the agent would choose , it is the optimal choice for the principal as well. We therefore only consider solutions containing .
Now, if is a vertex cover of with , consider the corresponding delegation set where the principal allows actions . If we generate the values by picking an edge, the agent will pick the action corresponding to one end of that edge (one is guaranteed to be in the cover ) to get a utility of , compared to achievable from the default action. This choice will also generate utility of for the principal, which makes in total. If the utility is generated by picking node , the agent will pick action , which generates utility of for both the principal and the agent. This will make in total. Finally, if the utilities are generated using some node , the agent picks the default action, which generates a utility of for the agent but for the principal. This will give in total. As a result, the principal utility in expectation is .
For the converse, consider an optimal solution to the delegation problem. We show that the nodes corresponding to the actions in (excluding the default action) induce a vertex cover; otherwise the solution can be improved. Assume that there exists an edge where neither nor is allowed in . If we add action to , the principal gets a utility of if the utilities are generated from picking edge , compared to the current utility of from the default action. On the other hand, the utility of the principal decreases from to if the values are generated by action . So the total utility of is more than , which contradicts the optimality of . Therefore, should be a vertex cover (plus the default action). This in turn implies that the utility is at most , where is the size of the minimum vertex cover.
Since and the minimum vertex cover has size at least , a constant factor gap in the bounded degree vertex cover problem translates into a constant factor gap in the optimal solution of the delegation problem, which yields the desired hardness result.