This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\AtAppendix\newdate

draftdate07042023 \usdate

The Simple Economics of Optimal Bundlingthanks: I thank Matthew Gentzkow, Soheil Ghili, Ravi Jagadeesan, Paul Milgrom, Mike Ostrovsky, Ilya Segal, Andy Skrzypacz, Takuo Sugaya, Bob Wilson, and Weijie Zhong for helpful comments and suggestions.

Frank Yang Graduate School of Business, Stanford University. Email: [email protected].
(\displaydatedraftdate)
Abstract

We study optimal bundling when consumers differ in one dimension. We introduce a partial order on the set of bundles defined by (i) set inclusion and (ii) sales volumes (if sold alone and priced optimally). We show that if the undominated bundles with respect to this partial order are nested, then nested bundling (tiered pricing) is optimal. We characterize which nested menu is optimal: Selling a given menu of nested bundles is optimal if a smaller bundle in (out of) the menu sells more (less) than a bigger bundle in the menu. We present three applications of these insights: the first two connect optimal bundling and quality design to price elasticities and cost structures; the last one establishes a necessary and sufficient condition for costly screening to be optimal when a principal can use both price and nonprice screening instruments.

Keywords: Bundling, tiered pricing, multidimensional screening, mechanism design.

1 Introduction

A common bundling strategy is to create nested bundles: A higher-tier bundle includes all the items in a lower-tier bundle. This strategy, also called tiered pricing, is widely adopted across various industries, including software companies (e.g. Slack), streaming services (e.g. Netflix), and e-commerce platforms (e.g. Shopify).111For instance, Slack offers four service tiers — Free, Pro, Business+, and Enterprise Grid — each of which includes all the features of the previous tier (Slack 2022). Several questions arise: When is such a strategy profit-maximizing? Which items to package in one tier versus another tier? How many tiers are optimal? Even though these questions seem to be fundamental, relatively little is known because characterizations for optimal bundling are generally intractable (Rochet and Stole 2003).222For instance, the optimal mechanism for selling two goods with additive, independent values remains unknown except for some special cases (Carroll 2017).

In this paper, we answer these questions when consumers differ in one dimension. This dimension could represent, for instance, income levels in retail pricing or enterprise complexity in enterprise pricing. We allow the consumers to have general non-additive values (in particular, heterogeneous preferences over different items, and complementary or substitutable preferences across different items). We allow the seller to have arbitrary costs for producing different bundles. Even though we assume one-dimensional heterogeneity, our bundling problem is a multidimensional screening problem because there are multiple instruments (allocations of different bundles).

We introduce a partial order on the set of bundles: A bundle b1b_{1} is dominated ()(\preceq) by another bundle b2b_{2} if (i) b1b_{1} is a subset of b2b_{2} and (ii) b1b_{1} has a lower sales volume than b2b_{2} when both are sold alone at their respective monopoly prices. This partial order can be readily determined by examining the demand curve for each bundle separately. However, it turns out that this simple partial order, under quasi-concavity assumptions, characterizes the optimal bundling strategy.

Our first main result (Theorem 1) shows that if the undominated bundles are nested (the nesting condition), then nested bundling is optimal. Our second main result (Theorem 2) characterizes which nested menu is optimal: Selling a given menu of nested bundles is optimal if a smaller bundle in (out of) the menu sells more (less) than a bigger bundle in the menu, where sales volumes are measured when bundles are sold alone. We also present a partial converse (Corollary 2): If nested bundling is optimal, then the minimal optimal menu must include (i) the best-selling bundle as the smallest bundle in the menu, and (ii) the grand bundle as the least-selling bundle in the menu.333Both “best-selling” and “least-selling” refer to sales volumes when bundles are sold alone. Under the optimal menu of bundles, the grand bundle may or may not be bought by the smallest fraction of consumers.

We present three applications of our main result. The first two applications connect the key demand parameters (price elasticities) and supply parameters (cost structures) to the structure of optimal bundling and quality discrimination. The third application shows how our bundling result also delivers insights into other multidimensional mechanism design problems, in particular screening with price and nonprice instruments.

In the first application, we provide a sufficient condition for our nesting condition in terms of price elasticities and further characterize the optimal menu (see Section 5.1). Specifically, we introduce the union elasticity condition which states that if the demand curves for two different bundles are both elastic at a certain quantity, then the demand curve for their union is also elastic at that quantity. Assuming zero marginal costs, we show that this condition implies the nesting condition, which then implies the optimality of nested bundling by our main result. We also show that, in a precise sense, the optimal menu can be iteratively constructed by using items with a more elastic demand curve as the “basic items” and items with a more inelastic demand curve as the “upgrade items” (1). In this case, if sold alone, a large bundle has a sales volume lower than its elastic items but higher than its inelastic items. The full characterization of optimal mechanisms enables comparative statics analysis. We find that as the dispersion of values for one item increases, the monopolist switches the tiers of different items and adopts a menu size that is UU-shaped in the dispersion parameter (2). These comparative statics results differ significantly from those in the standard quality-differentiated goods model, such as in Johnson and Myatt (2006), as our model allows for a much richer set of preferences (see Remark 9).

In fact, the quality-differentiated goods model (a la Mussa and Rosen 1978) is a special case of our model where there are no heterogeneous relative preferences (see Section 5.2). Our second application shows that, even in this well-studied special case, our result yields new insights by providing a new characterization of the optimal menu (3). Using this characterization, we can hold the price elasticities of different qualities constant and study the effects of cost structures on the optimal quality design. We show that the optimal menu of qualities can be characterized by the lower increasing envelope of the average cost curve (4). This result turns out to (i) generalize a finding by Johnson and Myatt (2003) and (ii) refine their intuition about when segmenting markets is profitable (see Remark 10). On a technical level, the reason why our results are new even for one-dimensional screening problems is that we impose much weaker “regularity” assumptions compared to the textbook treatment. This generality allows for various forms of “bunching”, which are ruled out by standard assumptions but can be characterized by our notion of dominance (see Remark 11).

Our bundling result also provides insights into other multidimensional mechanism design problems. In the third application, building on a connection between bundling and costly screening from Yang (2022), we use our main result to characterize when costly screening is optimal for a principal who can use both price and nonprice instruments such as wait time (see Section 5.3). We obtain (see 5) a necessary and sufficient condition for the optimality of costly screening when the agent has “negatively correlated preferences” (higher types have higher disutilities), complementing Yang (2022), which shows that costly screening is always suboptimal when the agent has “positively correlated preferences” (higher types have lower disutilities). Our result shows that, in a precise sense, costly screening is optimal if and only if there exists a costly action such that the elasticity of disutility with respect to consumers’ types exceeds a threshold (see Remark 13).

Discussion of Intuition.

We now present the key intuition behind our main result (see Section 3.4 for a detailed discussion).

Suppose that we have two items {1,2}\{1,2\} and that costs are zero. Suppose further that there are only two possible bundles {1}\{1\} and {1,2}\{1,2\} in the market, whose marginal revenue curves (when sold alone) are given in Figure 1(a). Because consumers differ in one dimension, we can arrange them along a common quantity axis, as in Figure 1(a), with consumers positioned toward the right end having lower values for the bundles. Starting from the right end of the quantity axis, we see that selling to these low-type consumers results in a negative marginal revenue for both bundles {1}\{1\} and {1,2}\{1,2\}. This is the case until we reach quantity D({1,2})D^{*}(\{1,2\}), the point at which the MR curve of bundle {1,2}\{1,2\} crosses zero (i.e. the sales volume when sold alone). From this point on, there is a positive marginal revenue for selling bundle {1,2}\{1,2\} to consumers. At the same time, from this point on, the marginal revenue for selling the smaller bundle {1}\{1\} to these consumers is always lower. Noting that the total revenue from any implementable assignments will be the sum of the integrals of the respective MR curves,444This fact is a consequence of the revenue equivalence theorem (see Section 4). we can see that the optimal strategy is to always sell the bigger bundle {1,2}\{1,2\} to consumers located to the left of the quantity D({1,2})D^{*}(\{1,2\}), which can be implemented by simply posting the usual monopoly price for bundle {1,2}\{1,2\}.

Now, suppose that bundle {2}\{2\} can be also sold in the market. If the MR curve for the smaller bundle {2}\{2\} also relates to the MR curve for the bigger bundle {1,2}\{1,2\} in this way, then the same argument implies that selling only the bundle {1,2}\{1,2\} is optimal even when all three bundles can be sold in the market. Of course, that need not be the case, as Figure 1(b) illustrates. But we can still start with the right end of the quantity axis, “climb up” the MR curves, and attempt to sell the bundle with the highest marginal revenue at that quantity. This strategy, if implementable, will allow us to attain the upper envelope of the MR curves as depicted by the dotted curve in Figure 1(b). This strategy may not be implementable, though, as there may not exist prices such that all consumers self-select into exactly the bundles that we want to assign to them. Nevertheless, if the upper envelope of the MR curves happens to be attained by an assignment rule that is monotone in the set-inclusion order, then we know that there exist appropriate prices to implement the assignments (similar to the case of selling a single good in which we know that monotone allocations can be implemented by prices). This happens to be the case for Figure 1(b), as shown by the depicted bundle assignments. Of course, for the assignment rule to be monotone, the assigned bundles must necessarily be nested.

Refer to caption
(a) MR curves under {1}{1,2}\{1\}\preceq\{1,2\}
Refer to caption
(b) MR curves under {1}{1,2}\{1\}\preceq\{1,2\}, {2}{1,2}\{2\}\not\preceq\{1,2\}
Figure 1: Illustration of the marginal revenue curves

A key contribution of this paper is to show that our nesting condition, under quasi-concavity assumptions, is sufficient for this strategy to work, which automatically gives us (i) the optimality of nested bundling and (ii) the optimal menu (by “climbing up” the MR curves). We briefly explain why our nesting condition is sufficient in the two-item case. Suppose that D({1})<D({1,2})<D({2})D^{*}(\{1\})<D^{*}(\{1,2\})<D^{*}(\{2\}) (in which case the undominated bundles {2}\{2\} and {1,2}\{1,2\} are nested). We claim that the configuration of MR curves must resemble Figure 1(b). Recall that when the revenue functions are quasi-concave, the MR curves cross zero once from above. It turns out that, under the standard quasi-concavity assumption on the revenue function for the incremental bundle, the MR curve of a bigger bundle also crosses the MR curve of a smaller bundle at most once from above.555This quasi-concavity assumption is stronger than what we actually assume in the model, which is a local quasi-concavity condition and allows for multiple crossings of the MR curves (see Section 2.1). This implies that, as Figure 1(a) shows, the two quantities D({1})<D({1,2})D^{*}(\{1\})<D^{*}(\{1,2\}) must be located in the region where the marginal revenue of selling the incremental bundle is positive (i.e. to the left of the vertical dashed line in Figure 1(a)). At the same time, because the ordering is reversed, as Figure 1(b) shows, the two quantities D({1,2})<D({2})D^{*}(\{1,2\})<D^{*}(\{2\}) must be located in the region where the marginal revenue of selling the incremental bundle is negative (i.e. to the right of the vertical dashed line in Figure 1(b)). Thus, the upper envelope of the MR curves must be attained by an implementable allocation rule assigning only nested bundles {2}\{2\} and {1,2}\{1,2\}. Of course, in the general many-item case, it is impossible to exhaustively list all possible configurations of the MR curves. We provide the key intuition for the general case in Section 3.4.

1.1 Related Literature

There is a substantial literature on multidimensional screening and optimal bundling (beginning with Stigler 1963, Adams and Yellen 1976, McAfee et al. 1989). A general lesson is that some form of bundling is generically profitable but characterizing optimal bundling strategies turns out to be very difficult (Armstrong 1996, Rochet and Chone 1998, Carroll 2017). Because of this difficulty, relatively little is known about how optimal bundling strategies depend on economic primitives such as price elasticities and cost structures. This paper departs from most of the bundling literature which assumes additive values and multidimensional heterogeneity (McAfee and McMillan 1988, Manelli and Vincent 2007, Pavlov 2011, Daskalakis et al. 2017).666A few papers also study non-additive values, including Long (1984), Armstrong (2013), Haghpanah and Hartline (2021), Ghili (forthcoming). By doing so, we are able to connect the empirically relevant economic primitives to the structure of optimal bundling strategies and even perform comparative statics analysis.

There is a nascent literature on nested bundling (Bergemann et al. 2022, Yang 2022). Bergemann et al. (2022) study the additive case and obtain conditions different from ours. An earlier work by this author introduces a multidimensional screening model with both price and costly nonprice instruments, and studies nested bundling as one application (Yang 2022).777For the literature on costly screening and money burning, see for example Banerjee (1997), Hartline and Roughgarden (2008), Condorelli (2012), Amador and Bagwell (2013). Building on this connection, the present paper proceeds in the reverse direction, by applying the main bundling result to study costly screening. These two papers also differ by imposing non-overlapping sets of assumptions, studying exactly the opposite cases in terms of the preferences over costly actions (see Section 5.3).

There is a long-standing literature on the profitability of price discrimination and more generally the optimality of pure bundling (Stokey 1979, Deneckere and McAfee 1996, Anderson and Dana Jr 2009, Haghpanah and Hartline 2021, Ghili forthcoming).888Note that pure bundling can be thought of as selling only the highest quality version of a product (i.e. no price discrimination). Ghili (forthcoming) introduces the use of single-bundle sales volumes to study the optimality of pure bundling. Our result nests Ghili (forthcoming) as a special case (see Remark 4), which in turn generalizes a long line of inquiry on the profitability of price discrimination (Salant 1989, Deneckere and McAfee 1996, Anderson and Dana Jr 2009). Importantly, our result pins down the optimal way of price discrimination, beyond the question of whether price discrimination is profitable.

The remainder of the paper proceeds as follows. Section 2 presents the model. Section 3 presents the main result. Section 4 sketches the proof of the main result. Section 5 presents the applications. Section 6 concludes. Appendix A provides omitted proofs.

2 Model

A monopolist sells nn different goods {1,,n}\{1,\dots,n\} to a unit mass of consumers.

Consumers have types tT:=[t¯,t¯]t\in T:=[\underline{t},\overline{t}]. Types are drawn from a distribution FF with a continuous, positive density ff. Type tt has value v(b,t)v(b,t) for bundle b:=2{1,,n}b\in\mathcal{B}:=2^{\{1,\dots,n\}} with v(,t)=0v(\varnothing,t)=0. The value function v(b,t)v(b,t) is (i) nondecreasing in bb (in the set-inclusion order), (ii) continuously differentiable in tt, and (iii) strictly increasing in tt whenever v(b,t)>0v(b,t)>0. For any stochastic assignment aΔ()a\in\Delta(\mathcal{B}), we define v(a,t):=𝔼ba[v(b,t)]v(a,t):=\operatorname{\mathds{E}}_{b\sim a}[v(b,t)]. The monopolist incurs cost C(b)C(b) to produce bundle bb with C()=0C(\varnothing)=0. We assume that it is efficient for the highest type t¯\overline{t} to consume all the items: argmaxb{v(b,t¯)C(b)}=b¯\operatornamewithlimits{argmax}_{b}\{v(b,\overline{t})-C(b)\}=\overline{b} where b¯:={1,,n}\overline{b}:=\{1,\dots,n\}.

The seller wants to maximize expected profits over all stochastic mechanisms. By the revelation principle, it is without loss of generality to restrict attention to direct mechanisms. Specifically, a (stochastic, direct) mechanism is a measurable map (a,p):TΔ()×(a,p):T\rightarrow\Delta(\mathcal{B})\times\mathds{R} that satisfies the usual incentive compatibility (IC) and individual rationality (IR) conditions:

v(a(t),t)p(t)\displaystyle v(a(t),t)-p(t) v(a(t^),t)p(t^)\displaystyle\geq v(a(\hat{t}),t)-p(\hat{t}) for all t,t^t,\hat{t} in TT;
v(a(t),t)p(t)\displaystyle v(a(t),t)-p(t) 0\displaystyle\geq 0 for all t in T.\displaystyle\text{ for all $t$ in $T$}\,.

Two mechanisms are equivalent if they differ on a zero-measure set of types.

A menu is a set of bundles (which we may assume include \varnothing).999To simplify notation, we omit the inclusion of \varnothing in a menu whenever it is clear from the context. We say that selling menu BB is optimal if there exists an optimal mechanism (a,p)(a,p) such that a(t)Ba(t)\in B for all tt.101010When an assignment a(t)Δ()a(t)\in\Delta(\mathcal{B}) is deterministic, we also let a(t)a(t) denote the assigned bundle. A menu BB is minimally optimal if selling BB is optimal and selling any BBB^{\prime}\subset B is not optimal. A menu BB is nested if the bundles in BB can be ordered by set inclusion. We say that nested bundling is optimal if there exists a nested menu BB such that selling BB is optimal.

For any bundle bb, let FbF_{b} be the distribution of v(b,t)v(b,t). Let P(b,q)P(b,q) be the demand curve:

P(b,q):=Fb1(1q).P(b,q):=F^{-1}_{b}(1-q)\,.

We assume that the profit function

π(b,q):=(P(b,q)C(b))q\pi(b,q):=(P(b,q)-C(b))q

is strictly quasi-concave in q[0,1]q\in[0,1].111111For expositional simplicity, whenever we impose strict quasi-concavity of a function gg on [x1,x2][x_{1},x_{2}], we assume in addition that g()=0\nabla g(\,\cdot\,)=0 at x[x1,x2]x\in[x_{1},x_{2}] implies g(x)g(x)g(x)\geq g(x^{\prime}) for all x[x1,x2]x^{\prime}\in[x_{1},x_{2}] (i.e. we assume that the FOC is satisfied only at the maximum). Define D(b)D^{*}(b) as the unique sales volume that maximizes the profit function, assumed to lie within (0,1)(0,1) for any bundle bb with |b|>1|b|>1.

For any pair of nested bundles b1b2b_{1}\subset b_{2}, we assume that (i) the incremental value

v(b2,t)v(b1,t)v(b_{2},t)-v(b_{1},t)

is strictly increasing in tt whenever it is positive, and (ii) the incremental profit function

π(b2,q)π(b1,q)\pi(b_{2},q)-\pi(b_{1},q)

is strictly quasi-concave in q[0,min(D(b1),D(b2))]q\in[0,\min(D^{*}(b_{1}),D^{*}(b_{2}))] (which is the interval where both profit functions, π(b1,q)\pi(b_{1},q) and π(b2,q)\pi(b_{2},q), are increasing).

2.1 Discussion of Assumptions

Complements and Substitutes.

The assumptions made here are orthogonal to whether the items are complements or substitutes. To illustrate, consider a simple example where the value for a bundle bb is given by v(b,t)=v(b)tv(b,t)=v(b)\cdot t. Note that all the above assumptions hold if (i) types tt follow a regular distribution in the sense of Myerson (1981) and (ii) v(b)v(b) is monotone in the set-inclusion order, regardless of whether the value function v(b)v(b) or the monopolist’s cost function C(b)C(b) exhibit supermodularity or submodularity.

One-dimensional Types.

At this level of generality, even with one-dimensional types, our model allows for different consumers to have different ordinal rankings over items (see Example 1). In fact, every item can be the most preferred item for at least some consumers. The main restriction of one-dimensional types in our model is that such horizontal preferences are fixed for a given one-dimensional type tt. Thus, our model is best suited for capturing settings in which some vertical attribute (such as income) is a good predictor of horizontal preferences for different items.

Refer to caption
(a) k=0k=0
Refer to caption
(b) k=0.1k=0.1
Figure 2: Revenue curves given v(b1,t)=tv(b_{1},t)=t, v(b2,t)=t+t2.5+kv(b_{2},t)=t+t^{2.5}+k, and tU[0,1]t\sim U[0,1]
Incremental Profits.

Consider any pair of nested bundles b1b2b_{1}\subset b_{2}. Since the incremental value v(b2,t)v(b1,t)v(b_{2},t)-v(b_{1},t) is monotone, note that the incremental profit function π(b2,q)π(b1,q)\pi(b_{2},q)-\pi(b_{1},q) is equivalent to the profit function of a monopolist optimizing the quantity of the incremental bundle b2\b1b_{2}\backslash b_{1}, given the plan of selling every consumer bundle b1b_{1}.

Local Quasi-concavity.

We impose only a local quasi-concavity condition on the incremental profit function, which states that, within the interval [0,min(D(b1),D(b2))][0,\min(D^{*}(b_{1}),D^{*}(b_{2}))], the function π(b2,q)π(b1,q)\pi(b_{2},q)-\pi(b_{1},q) has at most one peak. This interval is where both profit functions, π(b1,q)\pi(b_{1},q) and π(b2,q)\pi(b_{2},q), are increasing (i.e. before both of their peaks). This condition assumes that, within this interval, the sum of an increasing function π(b2,q)\pi(b_{2},q) and a decreasing function π(b1,q)-\pi(b_{1},q) is single-peaked. This local quasi-concavity condition is weaker than global quasi-concavity, which in turn is weaker than global concavity. The latter holds when the incremental values follow a regular distribution in the sense of Myerson (1981). To illustrate, suppose that v(b1,t)=tv(b_{1},t)=t, v(b2,t)=t+t2.5+kv(b_{2},t)=t+t^{2.5}+k, C(b1)=C(b2)=0C(b_{1})=C(b_{2})=0, and types tt follow a uniform distribution U[0,1]U[0,1]. Consider two cases: (i) k=0k=0 and (ii) k=0.1k=0.1. As shown in Figure 2, the incremental profit function in the first case is globally quasi-concave, whereas the incremental profit function in the second case has two peaks. However, both cases satisfy our local quasi-concavity assumption.

3 Main Result

Our main result characterizes (i) when nested bundling is optimal and (ii) which nested menu is optimal. In Section 3.1, we introduce a partial order on the set of bundles and show how this partial order characterizes the optimality of nested bundling. In Section 3.2, we provide conditions that further characterize the optimality of a given nested menu. In Section 3.3, we present a parameterized example to illustrate. In Section 3.4, we discuss the key intuition behind our results.

3.1 Optimality of Nested Bundling

We define a partial order on the set of bundles \mathcal{B} as follows:

b1b2: b1b2 and D(b1)D(b2) .b_{1}\preceq b_{2}:\text{ $b_{1}\subseteq b_{2}$ and $D^{*}(b_{1})\leq D^{*}(b_{2})$\,.}

A bundle bb is dominated if there exists bbb^{\prime}\neq b such that bbb\preceq b^{\prime} and undominated otherwise. We say that the nesting condition holds if the undominated bundles can be ordered by set inclusion: that is, for any two bundles bb and bb^{\prime},

both b and b are undominatedeither bb or bb.\text{both $b$ and $b^{\prime}$ are undominated}\implies\text{either $b\subseteq b^{\prime}$ or $b^{\prime}\subseteq b$}\,.

Figure 3 illustrates this condition for a three-item example using a diagram, where an upward arrow from b1b_{1} to b2b_{2} represents b1b2b_{1}\preceq b_{2}.

Refer to caption
Figure 3: Illustration of the nesting condition for a three-item example
Theorem 1.

Under the nesting condition, nested bundling is optimal and every optimal mechanism is equivalent to nested bundling.

Proof of Theorem 1.

See Section A.1. ∎

We sketch the proof in Section 4.1. The proof is constructive. It shows that, under the nesting condition, simply selling the set of undominated bundles is optimal. It also provides a simple algorithm to determine the minimal optimal menu and associated prices (see Algorithm 1 in Section A.1). In the case of two items, the construction shows that the minimal optimal menu is exactly the set of undominated bundles, which we record below:

Corollary 1.

Suppose that there are two items. Under the nesting condition, the set of undominated bundles is the minimal optimal menu.

Remark 1.

Note that Theorem 1 holds even when the socially efficient allocations require bundles that are not nested. That is, the nesting condition implies the optimality of nested bundling even when the efficient allocations require a far richer set of bundles. Note also that Theorem 1 implies that the optimal mechanism is deterministic. This is in general not true when the nesting condition is not satisfied.

Remark 2.

The nesting condition is implied by suitable conditions on price elasticities (when marginal costs are zero). Let η(b,q)\eta(b,q) be the usual price elasticity for bundle bb at quantity qq.121212That is, η(b,q):=[dlogP(b,q)dlogq]1\eta(b,q):=\big{[}\frac{\mathop{}\!\mathrm{d}\log P(b,q)}{\mathop{}\!\mathrm{d}\log q}\big{]}^{-1}. In Section 5.1, we show (see Lemma 1) that a sufficient condition for the nesting condition is the union elasticity condition which says that if the demand curves for two bundles are both elastic at a certain quantity qq, then the demand curve for their union is also elastic at quantity qq:

η(b1,q)<1 and η(b2,q)<1η(b1b2,q)<1.\eta(b_{1},q)<-1\,\,\text{ and }\,\,\eta(b_{2},q)<-1\implies\eta(b_{1}\cup b_{2},q)<-1\,.

Therefore, as a corollary of Theorem 1, nested bundling is optimal under the union elasticity condition and zero costs.131313When costs are present, we can simply modify the price elasticity to be η~(b,q):=[dlog(P(b,q)C(b))dlogq]1\tilde{\eta}(b,q):=\big{[}\frac{\mathop{}\!\mathrm{d}\log(P(b,q)-C(b))}{\mathop{}\!\mathrm{d}\log q}\big{]}^{-1}. That is, nested bundling is optimal when bundling results in a demand curve that has a larger elastic region than at least one of the individual demand curves. In Section 5.1, we also characterize (see 1) the undominated bundles under the union elasticity condition, and show that the optimal menu can be constructed iteratively by using items with a more elastic demand curve as the “basic items” and items with a more inelastic demand curve as the “upgrade items”, with both measured by the size of their elastic regions.

Remark 3.

Both the nesting condition and the union elasticity condition are agnostic to whether the items are complements or substitutes. To illustrate, consider two items and zero costs. Suppose that v({1,2},t)=κ(v({1},t)+v({2},t))v(\{1,2\},t)=\kappa\cdot\big{(}v(\{1\},t)+v(\{2\},t)\big{)} where κ\kappa is a positive constant. Depending on the value of κ\kappa, the two items can be complements (κ>1\kappa>1), substitutes (κ<1\kappa<1), or additive (κ=1\kappa=1). However, one can verify that the union elasticity condition always holds in this case, regardless of the value of κ\kappa.

3.2 Characterization of Optimal Menu

Our second main result provides conditions that further characterize the optimality of a given nested menu.

Theorem 2.

Selling a nested menu BB is optimal if:

  • (i)

    For all b1Bb_{1}\in B, D(b1)>D(b2)D^{*}(b_{1})>D^{*}(b_{2}) for all b2Bb_{2}\in B such that b1b2b_{1}\subset b_{2} ;

  • (ii)

    For all b1Bb_{1}\not\in B, D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}) for some b2Bb_{2}\in B such that b1b2b_{1}\subset b_{2} .

Conversely, selling a nested menu BB is minimally optimal only if:

  • (i)

    For all b1Bb_{1}\in B, D(b1)>D(b2)D^{*}(b_{1})>D^{*}(b_{2}) for all b2Bb_{2}\in B such that b1b2b_{1}\subset b_{2} ;

  • (ii)

    For all b1Bb_{1}\not\in B, D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}) for some non-empty b2Bb_{2}\in B .

Proof of Theorem 2.

See Section A.2. ∎

We sketch the proof in Section 4.2. The first statement follows from the proof of Theorem 1 by unpacking the definition of undominated bundles.141414Additionally, we also show that to characterize the optimality of a given nested menu BB, we can relax our monotone incremental values assumption and locally quasi-concave incremental profits assumption to only apply to pairs of nested bundles b1b2b_{1}\subset b_{2} where b2Bb_{2}\in B. The second statement is proved by constructing a smaller optimal menu when part (i) fails, and a strict improvement when part (ii) fails. The strict improvement when part (ii) fails is not a nested-menu mechanism; rather, it perturbs the original mechanism by adding a new option involving probabilistic assignments.

Let the best-selling bundle bb^{*} be the bundle with the highest sales volume when sold alone:

b:=argmaxb{D(b)},b^{*}:=\operatornamewithlimits{argmax}_{b\neq\varnothing}\{D^{*}(b)\}\,,

which we assume to be unique. A consequence of Theorem 2 is the following result:

Corollary 2.

Every minimal optimal, nested menu BB includes (i) the best-selling bundle bb^{*} as the smallest non-empty bundle in the menu, and (ii) the grand bundle b¯\overline{b} as the least-selling bundle in the menu.

In Corollary 2, the term “least-selling” also refers to sales volumes when bundles are sold alone. Under the optimal menu of bundles, the grand bundle b¯\overline{b} may or may not be bought by the smallest fraction of consumers. The best-selling bundle bb^{*} and the grand bundle b¯\overline{b} are the two “extremal” bundles with respect to our partial order. Corollary 2 says that if nested bundling is optimal, then the minimal optimal menu must include these two “extremal” bundles and must exclude any bundles dominated by either of the two “extremal” bundles.

Theorem 2 can be used to provide sufficient conditions for nested bundling to be strictly suboptimal. For example, a consequence of Corollary 2 is the following result:

Corollary 3.

Suppose that there are two items and that bundle {2}\{2\} is the best-selling bundle. If the optimal profit under menu {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} is strictly less than the optimal profit under menu {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}}, then nested bundling is strictly suboptimal.

Remark 4.

The second statement of Theorem 2 is a partial converse because (i) it requires the menu BB to be minimally optimal and (ii) it does not show b1b2b_{1}\subset b_{2} in the second part. However, in the case where B={b¯}B=\{\overline{b}\} (the menu consists only of the grand bundle), neither (i) nor (ii) has bite, and hence Theorem 2 becomes an “if-and-only-if” characterization of pure bundling. Thus, our result nests the full characterization of pure bundling by Ghili (forthcoming) as a special case.

3.3 An Example

Example 1.

Suppose that there are two items {1,2}\{1,2\} and zero costs. The valuations for each bundle are given by:

v({1},t)=tα,v({2},t)=tβ,v({1,2},t)=tα+tβ+tγ.v(\{1\},t)=t^{\alpha},\quad v(\{2\},t)=t^{\beta},\quad v(\{1,2\},t)=t^{\alpha}+t^{\beta}+t^{\gamma}\,.

Types tt follow a uniform distribution on [0,2][0,2]. We fix parameter α=1\alpha=1 and consider two cases: (i) γ=0.5\gamma=0.5 and (ii) γ=4.5\gamma=4.5. In both cases, we vary parameter β\beta from 0 to 22.151515Note that types t<1t<1 and types t>1t>1 have different ordinal rankings for items 11 and 22 whenever β1\beta\neq 1.

Refer to caption
(a) Optimal mechanisms vs. β\beta when γ=0.5\gamma=0.5
Refer to caption
(b) Sales volumes D(b)D^{*}(b) vs. β\beta when γ=0.5\gamma=0.5
Refer to caption
(c) Optimal mechanisms vs. β\beta when γ=4.5\gamma=4.5
Refer to caption
(d) Sales volumes D(b)D^{*}(b) vs. β\beta when γ=4.5\gamma=4.5
Figure 4: Optimal mechanisms and sales volumes D(b)D^{*}(b) as β\beta varies for Example 1

First, consider the case of γ=0.5\gamma=0.5. Figure 4(a) plots the numerically computed optimal mechanism in terms of prices, as parameter β\beta varies in 0.10.1 increments. As Figure 4(a) shows, the optimal mechanism takes different forms as parameter β\beta varies. Specifically, the optimal menu is given by:

  • {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} when β[0,0.74)\beta\in[0,0.74) ;

  • {{1,2}}\big{\{}\{1,2\}\big{\}} when β[0.74,1.5]\beta\in[0.74,1.5] ;

  • {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}} when β(1.5,2]\beta\in(1.5,2] .

The critical parameter values β=0.74\beta=0.74 and β=1.5\beta=1.5 are highlighted by the two vertical dashed lines in Figure 4(a). These transitions are characterized by our results. One can verify the union elasticity condition holds for all values of parameter β\beta. Figure 4(b) plots the sales volumes D(b)D^{*}(b) for each of the three bundles as parameter β\beta varies. As Figure 4(b) shows, the undominated bundles are nested for all values of parameter β\beta. Specifically, the plot can be partitioned into three regions [0,0.74)[0,0.74), [0.74,1.5][0.74,1.5], and (1.5,2](1.5,2]. The set of undominated bundles is {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} in the first region, {{1,2}}\big{\{}\{1,2\}\big{\}} in the second region, and {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}} in the third region, coinciding with the optimal menu.

Now, consider the case of γ=4.5\gamma=4.5. In contrast to the previous case, as Figure 4(c) shows, nested bundling is never optimal except at the degenerate parameter value β=1\beta=1 where the two items are indistinguishable. As our results show, this is only possible if the undominated bundles are not nested for any value of parameter β\beta, which can be seen from Figure 4(d).

3.4 Discussion of Intuition for Theorem 1 and Theorem 2

3.4.1 Intuition Based on Marginal Revenue Curves

In this section, we provide the key intuition behind our results, building on the earlier discussion in the introduction.

Suppose that there are two items {1,2}\{1,2\} and zero costs. Consider again the case of D({1})<D({1,2})<D({2})D^{*}(\{1\})<D^{*}(\{1,2\})<D^{*}(\{2\}). In this case, bundle {1}\{1\} is dominated by bundle {1,2}\{1,2\}, while bundle {2}\{2\} is not dominated by bundle {1,2}\{1,2\}. Suppose that the revenue function for any incremental bundle is strictly quasi-concave.161616For the sake of this example we assume that the incremental profit functions are globally quasi-concave. As discussed in Section 2.1, we need a weaker local quasi-concavity assumption. This implies that the MR curve of a bigger bundle crosses the MR curve of a smaller bundle at most once from above. Figure 5(a) illustrates the MR curves under this case as in the introduction.

There are three key observations. First, because of the ordering D({1})<D({1,2})D^{*}(\{1\})<D^{*}(\{1,2\}), these two quantities must be located in the region where the marginal revenue of upgrading consumers from bundle {1}\{1\} to bundle {1,2}\{1,2\} is positive. This then implies that if it is profitable to sell a consumer the smaller bundle {1}\{1\}, it is even more profitable to upgrade the consumer to the bigger bundle {1,2}\{1,2\}.

Second, because of the opposite ordering D({2})>D({1,2})D^{*}(\{2\})>D^{*}(\{1,2\}), these two quantities must be located in the region where the marginal revenue of upgrading consumers from bundle {2}\{2\} to bundle {1,2}\{1,2\} is negative. This then implies that, after a certain quantity threshold, it is always more profitable to downgrade a consumer from the bigger bundle {1,2}\{1,2\} to the smaller bundle {2}\{2\}.

Third, with these two operations, we can attain the upper envelope of the MR curves by allocating bundles to consumers in a monotone fashion such that a higher type consumer receives a bigger bundle in the set-inclusion order. As explained in the introduction, this step is the key to guaranteeing that we can in fact “climb up” the MR curves. In the case of two items, this step is self-evident once we recognize that the configuration of MR curves must resemble Figure 5(a) (as shown in the introduction). However, in the general case, it is impossible to exhaustively list all possible configurations of the MR curves. We will shortly discuss the key intuition behind why our nesting condition is sufficient in the general case.

Before that, let us consider a case where nested bundling is strictly suboptimal (see Corollary 3). By the above arguments, this must be the case where all three bundles are undominated. Without loss of generality, suppose D({1,2})<D({1})<D({2})D^{*}(\{1,2\})<D^{*}(\{1\})<D^{*}(\{2\}). Now, suppose further that the revenue under menu {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} is less than the revenue under menu {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}}. Figure 5(b) illustrates the MR curves under this case. Note that, in this case, the upper envelope of the MR curves cannot be attained by a nested menu, so we cannot use the argument of “climbing up” the MR curves to find the optimal mechanism.

There are two opposing forces in this case. On the one hand, it is more profitable to attract the “medium-type” consumers than attract the “low-type” consumers since the marginal revenue of selling bundle {1}\{1\} to the “medium-type” consumers is high enough. On the other hand, it is always possible to attract a small fraction of the “low-type” consumers using bundle {2}\{2\} which can bring in a positive marginal revenue. It turns out that the second force always wins if the monopolist can ration and sell bundle {2}\{2\} with a small probability ε\varepsilon. This is because, roughly speaking, the gain from expanding the market this way is on the order of O(ε)O(\varepsilon), whereas the loss from the consumers who no longer purchase bundle {1}\{1\} is on the order of O(ε2)O(\varepsilon^{2}). Intuitively, the reason why the loss is on the higher order is that before introducing bundle {2}\{2\}, the monopolist would have already optimized the prices for the menu {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}}, and hence suffers only a second-order loss for a small perturbation. Thus, nested bundling is strictly suboptimal in this case.

Refer to caption
(a) MR curves under {1}{1,2}\{1\}\preceq\{1,2\}, {2}{1,2}\{2\}\not\preceq\{1,2\}
Refer to caption
(b) MR curves under {1}{1,2}\{1\}\not\preceq\{1,2\}, {2}{1,2}\{2\}\not\preceq\{1,2\}
Figure 5: Further illustration of the marginal revenue curves
Beyond Two-item Cases: Nesting Condition.

We now explain the key insight that helps understand our results beyond the two-item cases. The intuition as discussed earlier still holds when there are more than two items, but we may run into issues with both the upgrade and downgrade improvements, because these improvements may not be implementable in the price space. To illustrate, suppose that there are three items and that bundle {1}\{1\} is dominated by bundle {1,2}\{1,2\}. Suppose that we are given an initial allocation rule in the quantity space as depicted in Figure 6(a). By the previous discussion, we know that if we can upgrade the consumers who are currently consuming bundle {1}\{1\} to bundle {1,2}\{1,2\}, then we would achieve an improvement. However, this upgrade may not be feasible, because there may not be prices that can support this change in allocations, given that there are higher types who are currently purchasing bundle {2,3}\{2,3\}, as depicted in Figure 6(b) (highlighted by the double-headed arrow). This is the key difference between our bundling problem and the standard one-dimensional screening problem — the set of implementable allocation rules is both much richer and much more complex.171717Implementability in multidimensional settings is characterized by cyclic monotonicity (see Rochet 1987) which is much more complex than standard monotonicity conditions.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Figure 6: Illustration of the improvement argument for a three-item example

The key insight that resolves this problem is the following: Such a potential conflict arises when bundles bb and bb^{\prime} are not nested, but then the nesting condition implies that at least one of them must be dominated (recall that the nesting condition requires the undominated bundles to be nested). In our example, this means that either bundle {1,2}\{1,2\} or bundle {2,3}\{2,3\} must be dominated. This gives us a way out because we can apply this argument again by going one layer up and further upgrading the consumers from either bb or bb^{\prime} to the bundle that dominates one of them. Repeating this process would always result in a pair of nested bundles.

For our running example, suppose that bundle {1,2}\{1,2\} is dominated by bundle {1,2,3}\{1,2,3\} and bundle {2,3}\{2,3\} is undominated, as depicted in Figure 6(c); so the process in this example terminates in one iteration. Of course, the resulting pair of bundles can be in the “wrong” order in the sense that the higher types are assigned the smaller bundle, which we know cannot be implemented by prices. However, when that happens, since the smaller bundle is undominated, we know that if it is ever profitable to downgrade from the bigger bundle to the smaller bundle at some quantity then it is always profitable to downgrade after that quantity. For our running example, suppose that we can further profitably downgrade bundle {1,2,3}\{1,2,3\} to bundle {2,3}\{2,3\}, as depicted in Figure 6(d). The allocation rule is now monotone and hence implementable. Moreover, it is more profitable than the initial allocation rule by construction. Because these arguments can be applied to any initial allocation rule, the upper envelope of the MR curves, under the nesting condition, must be attained by an implementable allocation rule.

Remark 5.

As the above discussion shows, our nesting condition is essential in two ways: (i) it facilities the comparison of marginal revenues, and (ii) it provides a way out from complex implementability constraints by guiding us toward an even more profitable allocation rule that we know is implementable. The actual proof follows these intuitions closely. In addition, the proof considers stochastic mechanisms and shows that, under the nesting condition, randomization cannot increase profit.

3.4.2 Alternative Intuition Based on Price Elasticities

In this section, we provide an alternative, price-theoretical intuition based on price elasticities (recall from Remark 2 that the nesting condition is implied by suitable conditions on price elasticities). Suppose that there are two items and zero costs, and that η({2},q)<η({1,2},q)<η({1},q)\eta(\{2\},q)<\eta(\{1,2\},q)<\eta(\{1\},q) for all quantities qq. That is, bundle {2}\{2\} has a pointwise more elastic demand curve than bundle {1,2}\{1,2\}, which in turn has a pointwise more elastic demand curve than bundle {1}\{1\}. This assumption implies that D({2})>D({1,2})>D({1})D^{*}(\{2\})>D^{*}(\{1,2\})>D^{*}(\{1\}), but it is much stronger than our union elasticity condition stated in Remark 2.

Let η({1,2},q{1})\eta(\{1,2\},q\mid\{1\}) be the price elasticity of the demand curve for the incremental values for bundle {1,2}\{1,2\} given bundle {1}\{1\}. We can write

η({1,2},q)=P({1,2},q)qddqP({1,2},q)=(P({1,2},q)P({1},q))+P({1},q)qddq(P({1,2},q)P({1},q))+qddqP({1},q).\eta(\{1,2\},q)=\frac{P(\{1,2\},q)}{q\cdot\frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}q}P(\{1,2\},q)}=\frac{\big{(}P(\{1,2\},q)-P(\{1\},q)\big{)}+P(\{1\},q)}{q\cdot\frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}q}\big{(}P(\{1,2\},q)-P(\{1\},q)\big{)}+q\cdot\frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}q}P(\{1\},q)}\,.

By the mediant inequality,181818That is, for any ac<bd\frac{a}{c}<\frac{b}{d} such that cd>0c\cdot d>0, we have ac<a+bc+d<bd\frac{a}{c}<\frac{a+b}{c+d}<\frac{b}{d}. this implies that

η({1,2},q{1})<η({1,2},q)<η({1},q).\eta(\{1,2\},q\mid\{1\})<\eta(\{1,2\},q)<\eta(\{1\},q)\,.

That is, the demand curve for upgrading from bundle {1}\{1\} to bundle {1,2}\{1,2\} is even more elastic. In particular, it is profitable to charge a low enough price to sell the upgrade option to all consumers in the elastic region of the demand curve for bundle {1}\{1\}. Thus, it is profitable to exclude bundle {1}\{1\}, which is a dominated bundle, as an option from the menu.

Symmetrically, for bundle {2}\{2\}, we have

η({1,2},q{2})>η({1,2},q)>η({2},q).\eta(\{1,2\},q\mid\{2\})>\eta(\{1,2\},q)>\eta(\{2\},q)\,.

That is, the demand curve for upgrading from bundle {2}\{2\} to bundle {1,2}\{1,2\} is even more inelastic. In particular, it is profitable to charge a high enough price that leaves at least some consumers unserved in terms of the upgrade option. Thus, it is profitable to include bundle {2}\{2\}, which is an undominated bundle, as an option in the menu.

Remark 6.

Note however that, even in this two-item case, these price-theoretical arguments are incomplete. This is because they do not take into account the presence of the other item in the market when pricing the upgrade from one item to the bundle. In addition, the arguments assume that the elasticities can be pointwise ranked, which is much stronger than our nesting condition. The actual joint pricing problem is much more complex and cannot be simply reduced to two separate pricing problems. We emphasize that it is a consequence of our results that, under the nesting condition, one can pairwise compare the bundles. As explained in Section 3.4.1, the proof deals with the joint pricing problem by working in the quantity space rather than in the price space.

4 Proof Sketch for the Main Result

In this section, we sketch the proofs for Theorem 1 and Theorem 2. Following Myerson (1981), let

ϕ(b,t):=v(b,t)C(b)1F(t)f(t)vt(b,t)\phi(b,t):=v(b,t)-C(b)-\frac{1-F(t)}{f(t)}v_{t}(b,t)

be the virtual surplus function for bundle bb. As observed by Bulow and Roberts (1989), this can be interpreted as the marginal profit for bundle bb evaluated at the quantity such that the marginal consumer is of type tt.

A key difference between our problem and the standard one-dimensional mechanism design problem is that we do not have access to a simple characterization of implementable allocation rules. However, the following revenue equivalence theorem continues to hold in our setting: For every implementable allocation rule a:TΔ()a:T\rightarrow\Delta(\mathcal{B}), the expected profit to the seller is given by

𝔼[bab(t)ϕ(b,t)].\operatorname{\mathds{E}}\Big{[}\sum_{b\in\mathcal{B}}a_{b}(t)\phi(b,t)\Big{]}\,.

The proofs for both Theorem 1 and Theorem 2 involve (i) a series of observations about the collection of functions {ϕ(b,t)}b\big{\{}\phi(b,t)\big{\}}_{b\in\mathcal{B}} and (ii) addressing the implementability issues. In the proof of Theorem 1, we handle implementability by verifying it ex post. In the proof of Theorem 2, we handle implementability by explicitly constructing an improvement.

4.1 Proof Sketch for Theorem 1

The proof sketch involves three steps. In the first step, we establish a key claim about the crossings of ϕ(b1,t)\phi(b_{1},t) and ϕ(b2,t)\phi(b_{2},t) for any pair of nested bundles b1b2b_{1}\subset b_{2}. For any b1b2b_{1}\subset b_{2}, we define the last crossing type of the virtual surplus functions for b1b_{1} and b2b_{2} as

s(b1,b2):=infs{sT:ϕ(b2,t)>ϕ(b1,t) for all t>s}.s(b_{1},b_{2}):=\inf_{s}\Big{\{}s\in T:\phi(b_{2},t)>\phi(b_{1},t)\text{ for all $t>s$}\Big{\}}\,.

We also define the last crossing virtual surplus of b1b_{1} and b2b_{2} as

χ(b1,b2):=ϕ(b1,s(b1,b2)).\chi(b_{1},b_{2}):=\phi(b_{1},s(b_{1},b_{2}))\,.

The key claim is that for every b1b2b_{1}\subset b_{2}, we have

sign[χ(b1,b2)]=sign[D(b1)D(b2)].\operatorname{sign}\Big{[}\chi(b_{1},b_{2})\Big{]}=\operatorname{sign}\Big{[}D^{*}(b_{1})-D^{*}(b_{2})\Big{]}\,.

To prove this claim, we first make two basic observations. First, we have

D(b1)D(b2)t(b1)t(b2)D^{*}(b_{1})\geq D^{*}(b_{2})\iff t^{*}(b_{1})\leq t^{*}(b_{2})

where t(b)t^{*}(b) is the cutoff type at which ϕ(b,t)\phi(b,t) single-crosses 0 from below (which exists by our quasi-concavity assumption on the profit function). Second, we have that ϕ(b2,t)\phi(b_{2},t) single-crosses ϕ(b1,t)\phi(b_{1},t) from below on the interval [max(t(b1),t(b2)),t¯][\max(t^{*}(b_{1}),t^{*}(b_{2})),\overline{t}]. This is implied by our local quasi-concavity condition, which assumes that the incremental profit function is quasi-concave on the interval [0,min(D(b1),D(b2))][0,\min(D^{*}(b_{1}),D^{*}(b_{2}))].

Using these two observations, we can prove 4.1 by making a deeper geometric observation. To illustrate, consider Figure 7, which depicts the relative positions of ϕ(b1,t)\phi(b_{1},t) and ϕ(b2,t)\phi(b_{2},t). The key idea is to vary the horizontal axis and observe that the sign of χ(b1,b2)\chi(b_{1},b_{2}) is fully determined by the ordering of t(b1)t^{*}(b_{1}) and t(b2)t^{*}(b_{2}), given that ϕ(b2,t)\phi(b_{2},t) single-crosses ϕ(b1,t)\phi(b_{1},t) from below on the interval [max(t(b1),t(b2)),t¯][\max(t^{*}(b_{1}),t^{*}(b_{2})),\overline{t}].

Refer to caption
Figure 7: Illustration of possible crossings of ϕ\phi when b1b2b_{1}\preceq b_{2}

In the second step, we establish the following claim:

supa:TΔ()𝔼[bab(t)ϕ(b,t)]=supa:TU𝔼[bab(t)ϕ(b,t)],\sup_{a:\,T\rightarrow\Delta(\mathcal{B})}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}=\sup_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}\,,

where UU is the set of undominated bundles (including \varnothing). Because the objective of the left-hand side is linear in probabilities, we only need to consider deterministic allocation rules a:Ta:T\rightarrow\mathcal{B} for the left-hand side problem.

Now, to prove this claim, we exploit two key facts. First, because (,)(\mathcal{B},\preceq) is a finite partially ordered set, every dominated bundle must be dominated by an undominated bundle. Second, for any b1b2b_{1}\subset b_{2}, if χ(b1,b2)0\chi(b_{1},b_{2})\leq 0, then we must have

ϕ(b1,t)max{0,ϕ(b2,t)},\phi(b_{1},t)\leq\max\big{\{}0,\phi(b_{2},t)\big{\}}\,,

because by definition χ(b1,b2)0\chi(b_{1},b_{2})\leq 0 means that the last time ϕ(b2,t)\phi(b_{2},t) crosses ϕ(b1,t)\phi(b_{1},t) from below happens when ϕ(b1,t)\phi(b_{1},t) is non-positive.

We then combine these two facts as follows. For every dominated bundle b1b_{1}, we can find an undominated bundle b2Ub_{2}\in U such that b1b2b_{1}\subset b_{2} and D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}), which then implies that χ(b1,b2)0\chi(b_{1},b_{2})\leq 0 by 4.1. Thus, we have ϕ(b1,t)max{0,ϕ(b2,t)}\phi(b_{1},t)\leq\max\big{\{}0,\phi(b_{2},t)\big{\}}. Since this argument holds for all dominated bundles, 4.1 follows.

Refer to caption
Figure 8: Illustration of possible crossings of ϕ\phi in the iterative construction

In the third step, we define aa^{\dagger} to be the allocation rule induced by the menu BUB\subseteq U and prices given by Algorithm 1 in Section A.1. Our goal in this step is to prove the following claim:

aargmaxa:TU𝔼[bab(t)ϕ(b,t)].a^{\dagger}\in\operatornamewithlimits{argmax}_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}\,.

To do so, we use an induction argument. By assumption, the set of undominated bundles UU is a nested menu, so we can write U={,b1,,bm}U=\{\varnothing,b_{1},\dots,b_{m}\} such that b1bmb_{1}\subset\cdots\subset b_{m}. We then construct a sequence of monotone allocation rules {a(j)}j=1,,m\big{\{}a^{(j)}\big{\}}_{j=1,\dots,m} using induction such that: (i) the allocation rule a(j)a^{(j)} attains the envelope of the virtual surplus functions for the jj smallest non-empty bundles in UU (and the empty bundle)

ϕ(a(j)(t),t)=max{0,ϕ(b1,t),,ϕ(bj,t)},\phi\big{(}a^{(j)}(t),t\big{)}=\max\Big{\{}0,\phi(b_{1},t),\dots,\phi(b_{j},t)\Big{\}}\,,

and (ii) the final allocation rule a(m)a^{(m)} coincides with the proposed allocation rule aa^{\dagger}.

The key to this argument is the following geometric observation, which we also prove by induction: For any k>jk>j, the function ϕ(bk,t)\phi(b_{k},t) single-crosses the envelope Vj(t):=max{0,ϕ(b1,t),,ϕ(bj,t)}V_{j}(t):=\max\big{\{}0,\phi(b_{1},t),\dots,\phi(b_{j},t)\big{\}} from below. To illustrate, consider Figure 8, which depicts three curves: ϕ(bk,t)\phi(b_{k},t), Vj1(t)V_{j-1}(t), and max{0,ϕ(bj,t)}\max\{0,\phi(b_{j},t)\}. Note that Vj(t)V_{j}(t) can be decomposed into the envelope of Vj1(t)V_{j-1}(t) and max{0,ϕ(bj,t)}\max\{0,\phi(b_{j},t)\}. By varying ϕ(bk,t)\phi(b_{k},t) and observing that these three curves single-cross each other, which holds by the inductive hypothesis (on j1j-1), we can establish this single-crossing property. This property allows us to “paste” the virtual surplus functions ϕ(bj,t)\phi(b_{j},t) together one at a time, from the smallest bundle b1b_{1} to the largest bundle bmb_{m}, and trace out the successive envelopes in this process.

Now, recall that the proposed allocation rule aa^{\dagger} is implementable by construction. But then it must be optimal since we have from 4.1 and 4.1 that:

𝔼[bab(t)ϕ(b,t)]=supa:TU𝔼[bab(t)ϕ(b,t)]=supa:TΔ()𝔼[bab(t)ϕ(b,t)].\operatorname{\mathds{E}}\Big{[}\sum_{b}a^{\dagger}_{b}(t)\phi(b,t)\Big{]}=\sup_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}=\sup_{a:\,T\rightarrow\Delta(\mathcal{B})}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}\,.

Thus, under the nesting condition, selling the set of undominated bundles is optimal, and hence nested bundling is optimal. Moreover, as the allocation rule aa^{\dagger} attains the envelope of the virtual surplus functions, the menu BB given by Algorithm 1 must be minimally optimal. The uniqueness part of Theorem 1 follows from a strengthening of 4.1.

4.2 Proof Sketch for Theorem 2

The proof sketch for the first statement is similar to Section 4.1 and is hence omitted. The proof sketch for the second statement involves three steps, which we turn to next.

Suppose that selling a nested menu BB (including \varnothing) is optimal and selling any proper subset of BB is not optimal. Suppose, for contradiction, that either

  • (i)

    there exist b1Bb_{1}\in B and b2Bb_{2}\in B such that b1b2b_{1}\subset b_{2} and D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}), or

  • (ii)

    there exists b1Bb_{1}\not\in B such that D(b1)>D(b2)D^{*}(b_{1})>D^{*}(b_{2}) for all non-empty b2Bb_{2}\in B.

In the first step, we define

UB:={bB:b is not dominated by another bundle bB}.U_{B}:=\Big{\{}b\in B:\text{$b$ is not dominated by another bundle $b^{\prime}\in B$}\Big{\}}\,.

Using similar arguments as in 4.1 and 4.1, we show that there exists an optimal mechanism (a,p)(a,p) such that a:TUBa:T\rightarrow U_{B}.

In the second step, we consider Case (i). In this case, the menu UBU_{B} must be a strict subset of the menu BB. Then, since (a,p)(a,p) is an optimal mechanism, we have that selling menu UBU_{B} is also optimal, which contradicts the minimal optimality of menu BB. Hence, this case cannot occur.

In the third step, we consider Case (ii). We construct a strict improvement in this case (which would then contradict the optimality of menu BB). Specifically, we propose adding the following option to the menu given by (a,p)(a,p): A lottery offering bundle b1b_{1} with probability ε\varepsilon, for a price of ε×v(b1,t(b1))\varepsilon\times v(b_{1},t^{*}(b_{1})).

Let us compute the payoff to the monopolist after the addition of this new option. The profit change to the monopolist after adding this new option can be computed as

𝔼[bab(t)ϕ(b,t)]𝔼[bab(t)ϕ(b,t)]\operatorname{\mathds{E}}\Big{[}\sum_{b}a^{\prime}_{b}(t)\phi(b,t)\Big{]}-\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}

where aa^{\prime} is the induced allocation rule after the types re-adjust their optimal choices given the new option.

Let b¯\underline{b} be the smallest non-empty bundle in BB. Note that, by the construction of (a,p)(a,p), the following holds:

  • (i)

    all types t[t¯,t(b1))t\in[\underline{t},t^{*}(b_{1})) will not take this option ;

  • (ii)

    all types t[t(b1),t(b¯))t\in[t^{*}(b_{1}),t^{*}(\underline{b})) will switch from \varnothing to this option ;

  • (iii)

    all types t[t(b¯),tε)t\in[t^{*}(\underline{b}),t^{\dagger}_{\varepsilon}) will switch from b¯\underline{b} to this option, for some threshold tεt^{\dagger}_{\varepsilon} that depends on ε\varepsilon (the subscript emphasizes this dependence) .

Refer to caption
Figure 9: Illustration of the perturbation argument

The monopolist makes a gain from the types t[t(b1),t(b¯))t\in[t^{*}(b_{1}),t^{*}(\underline{b})) and suffers a loss from the types t[t(b¯),tε)t\in[t^{*}(\underline{b}),t^{\dagger}_{\varepsilon}). It is crucial to compute the gain and the loss in terms of the virtual surplus. Denote them by Gain(ε)\text{Gain}(\varepsilon) and Loss(ε)\text{Loss}(\varepsilon). The key claim is the following: For any ε>0\varepsilon>0 small enough, we have

Gain(ε)>Loss(ε).\text{Gain}(\varepsilon)>\text{Loss}(\varepsilon)\,.

Roughly speaking, this is because Gain(ε)\text{Gain}(\varepsilon) is on the order of O(ε)O(\varepsilon) but Loss(ε)\text{Loss}(\varepsilon) is on the order of O(ε2)O(\varepsilon^{2}). The argument for 4.2 relies on a key geometric observation, illustrated in Figure 9 for a two-item example. In particular, observe that the total gain from the types t[t(b1),t(b¯))t\in[t^{*}(b_{1}),t^{*}(\underline{b})) forms a rectangle whose area varies in ε\varepsilon linearly whereas the total loss from the types t[t(b¯),tε)t\in[t^{*}(\underline{b}),t^{\dagger}_{\varepsilon}) forms a triangle whose area varies in ε\varepsilon quadratically.

5 Applications

In this section, we present three applications of our main result. In the first two applications, we connect empirically relevant demand parameters (price elasticities) and supply parameters (cost structures) to optimal bundling and quality design. In Section 5.1, we study the effect of price elasticities on optimal bundling strategies. In Section 5.2, we study the effect of cost structures on optimal quality design. In the last application, in Section 5.3, we connect costly nonprice screening to optimal bundling. Using our main result, we obtain a necessary and sufficient condition for costly screening to be optimal.

5.1 Optimal Bundling Strategies and Price Elasticities

We first introduce a sufficient condition for the nesting condition in Theorem 1, in terms of price elasticities. Let

η(b,q):=[dlogP(b,q)dlogq]1\eta(b,q):=\Big{[}\frac{\mathop{}\!\mathrm{d}\log P(b,q)}{\mathop{}\!\mathrm{d}\log q}\Big{]}^{-1}

be the usual price elasticity for bundle bb evaluated at quantity qq. We say that the union elasticity condition holds if for any bundles b1b_{1} and b2b_{2}, we have

η(b1,q)<1 and η(b2,q)<1η(b1b2,q)<1.\eta(b_{1},q)<-1\,\,\text{ and }\,\,\eta(b_{2},q)<-1\implies\eta(b_{1}\cup b_{2},q)<-1\,.

That is, if the demand curves for two bundles are both elastic at a certain quantity qq, then the demand curve for their union is also elastic at quantity qq.

Lemma 1.

Under zero costs, the union elasticity condition implies the nesting condition.

Proof of Lemma 1.

Suppose for contradiction that there exist undominated bundles b1b_{1} and b2b_{2} that are not nested. Then b1b1b2b_{1}\subset b_{1}\cup b_{2} and b2b1b2b_{2}\subset b_{1}\cup b_{2}. Since costs are zero, the price elasticity for any bundle bb at quantity D(b)D^{*}(b) must satisfy

η(b,D(b))=1.\eta(b,D^{*}(b))=-1\,.

Note that the elasticity curve η(b,)\eta(b,\,\cdot\,) single-crosses 1-1 from below because the MR curve MR(b,)\text{MR}(b,\,\cdot\,) single-crosses 0 from above.191919To see this, recall that MR(b,q)=P(b,q)[1+1η(b,q)]\text{MR}(b,q)=P(b,q)\big{[}1+\frac{1}{\eta(b,q)}\big{]}. Then, since b1b_{1} and b2b_{2} are undominated, we have

η(b1,D(b1b2))<1 and η(b2,D(b1b2))<1,\eta\big{(}b_{1},D^{*}(b_{1}\cup b_{2})\big{)}<-1\,\,\,\text{ and }\,\,\,\eta\big{(}b_{2},D^{*}(b_{1}\cup b_{2})\big{)}<-1\,,

which, by the union elasticity condition, implies that

η(b1b2,D(b1b2))<1.\eta\big{(}b_{1}\cup b_{2},D^{*}(b_{1}\cup b_{2})\big{)}<-1\,.

A contradiction. ∎

Remark 7.

When costs are present, we can simply modify the price elasticity η(b,q)\eta(b,q) to be η~(b,q):=[dlog(P(b,q)C(b))dlogq]1\tilde{\eta}(b,q):=\big{[}\frac{\mathop{}\!\mathrm{d}\log(P(b,q)-C(b))}{\mathop{}\!\mathrm{d}\log q}\big{]}^{-1} to incorporate arbitrary costs.

Now, applying our main result, we can fully characterize the optimal menu under the union elasticity condition. To do so, we order the bundles according to their sales volumes D(b)D^{*}(b) when sold alone and define bib^{*}_{i} as the ii-th best-selling bundle, with ties broken arbitrarily. Then, we have the following result:

Proposition 1.

Under the union elasticity condition and zero costs, selling the following nested menu is optimal:

{b1,b1b2,b1b2b3,,b¯}.\Big{\{}b_{1}^{*},\,\,\,b_{1}^{*}\cup b_{2}^{*},\,\,\,b_{1}^{*}\cup b_{2}^{*}\cup b_{3}^{*},\,\,\,\dots,\,\,\,\overline{b}\Big{\}}\,.
Proof of 1.

By Lemma 1 and Theorem 1, nested bundling is optimal. Let BB be the proposed menu. By Theorem 2, it suffices to show that any (non-empty) bundle bBb\not\in B is dominated. We start by showing that for all ii, we have

D(b1bi)D(bi).D^{*}(b^{*}_{1}\cup\dots\cup b^{*}_{i})\geq D^{*}(b^{*}_{i})\,.

We prove this by induction on ii. The base case i=1i=1 is trivial. For the inductive step, suppose that the claim holds for i1i-1. Now, observe that

D(b1bi)min{D(b1bi1),D(bi)}min{D(bi1),D(bi)}=D(bi),D^{*}(b^{*}_{1}\cup\dots\cup b^{*}_{i})\geq\min\Big{\{}D^{*}(b^{*}_{1}\cup\dots\cup b^{*}_{i-1}),\,D^{*}(b^{*}_{i})\Big{\}}\geq\min\Big{\{}D^{*}(b^{*}_{i-1}),\,D^{*}(b^{*}_{i})\Big{\}}=D^{*}(b^{*}_{i})\,,

where (i) the first inequality follows from the union elasticity condition and the argument used in the proof of Lemma 1, (ii) the second inequality follows from the inductive hypothesis, and (iii) the last equality follows from the definition of bib^{*}_{i} and bi1b^{*}_{i-1}. This proves the inductive step.

Now, fix any bBb\notin B. There exists some index jj such that b=bjb=b^{*}_{j}. Since bBb\notin B, we have

b=bjb1bj.b=b^{*}_{j}\subset b^{*}_{1}\cup\dots\cup b^{*}_{j}\,.

But we also have

D(b)=D(bj)D(b1bj).D^{*}(b)=D^{*}(b^{*}_{j})\leq D^{*}(b^{*}_{1}\cup\dots\cup b^{*}_{j})\,.

Thus, bundle bb is dominated, completing the proof. ∎

Remark 8.

Under the union elasticity condition, 1 provides a simple recipe for constructing the optimal menu: (i) arrange all bundles in descending order based on their sales volumes, and (ii) successively merge them, excluding duplicates. The optimal mechanism iteratively creates nests such that items with a more elastic demand curve become the “basic items” and items with a more inelastic demand curve become the “upgrade items”, with both measured by the size of their elastic regions. Note also that this mechanism sorts the bundles, rather than the items, by their sales volumes. This construction fully accounts for the complementarity or substitutability patterns across different items.

Comparative Statics.

Price elasticities can be affected by advertising and marketing, which can act as a demand rotation in the sense of Johnson and Myatt (2006). Using 1, we can analyze the comparative statics of optimal bundling given a sequence of demand rotations. Suppose that there are two items and zero costs. Consider a family of demand systems indexed by parameter ss\in\mathds{R}, with η(b,q;s)\eta(b,q;s) denoting the price elasticities and D(b;s)D^{*}(b;s) denoting the sales volumes. We use the following notion of demand rotations: There is a sequence of (clockwise, sales-ordered) demand rotations for item ii if for all s<ss<s^{\prime}

D({i};s)D({i};s),D({j};s)=D({j};s),D({1,2};s)D({1,2};s),D^{*}(\{i\};s^{\prime})\leq D^{*}(\{i\};s),\qquad D^{*}(\{j\};s^{\prime})=D^{*}(\{j\};s),\qquad D^{*}(\{1,2\};s^{\prime})\leq D^{*}(\{1,2\};s),

and

D({i};s)D({1,2};s)D({i};s)D({1,2};s).D^{*}(\{i\};s)\leq D^{*}(\{1,2\};s)\implies D^{*}(\{i\};s^{\prime})\leq D^{*}(\{1,2\};s^{\prime})\,.

That is, as parameter ss increases, the demand curve for item ii and the demand curve for bundle {1,2}\{1,2\} become more inelastic in the sense of a smaller elastic region.202020A sufficient (but far from necessary) condition is that, as parameter ss increases, the demand curves become pointwise more inelastic. The last condition ensures that the indirect change in the demand curve for bundle {1,2}\{1,2\} is smaller than the direct change in the demand curve for item ii. To state our result, we define the tier of item ii in a nested menu B:={b1,,bm}B:=\{b_{1},\dots,b_{m}\}, where b1bmb_{1}\subset\cdots\subset b_{m}, as the index of the smallest bundle in BB that includes item ii, denoted by ri(B)r_{i}(B).

Proposition 2.

Suppose that there are two items and zero costs and that the union elasticity condition holds for all ss. Let B(s)B^{*}(s) be the minimal optimal menu. Then, in a sequence of demand rotations for item ii, we have that:

  • (i)

    the tier of item ii in the optimal menu ri(B(s))r_{i}(B^{*}(s)) is nondecreasing in ss ;

  • (ii)

    the tier of item jij\neq i in the optimal menu rj(B(s))r_{j}(B^{*}(s)) is nonincreasing in ss ;

  • (iii)

    the size of the optimal menu |B(s)||B^{*}(s)| is quasi-convex in ss .

Proof of 2.

Without loss of generality, suppose that there is a sequence of demand rotations for item 22. By 1, nested bundling is always optimal at any parameter ss. By Corollary 1, the minimal optimal menu B(s)B^{*}(s) equals the set of undominated bundles. To prove claim (i), observe that it suffices to show that if B(s)={{1},{1,2}}B^{*}(s)=\big{\{}\{1\},\{1,2\}\big{\}}, then B(s)B^{*}(s^{\prime}) must be {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}} for any s<ss<s^{\prime}. Suppose not. Then, for some s<ss<s^{\prime}, we have

D({1,2};s)D({1};s)=D({1};s)>D({1,2};s),D^{*}(\{1,2\};s^{\prime})\geq D^{*}(\{1\};s^{\prime})=D^{*}(\{1\};s)>D^{*}(\{1,2\};s)\,\,,

which is impossible by our notion of demand rotations.

To prove claim (ii), observe that it suffices to show that if B(s)={{2},{1,2}}B^{*}(s^{\prime})=\big{\{}\{2\},\{1,2\}\big{\}}, then B(s)B^{*}(s) must be {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} for any s<ss<s^{\prime}. Suppose not. Then, for some s<ss<s^{\prime}, we have

D({2};s)D({1,2};s),D({2};s)>D({1,2};s),D^{*}(\{2\};s)\leq D^{*}(\{1,2\};s)\,,\qquad D^{*}(\{2\};s^{\prime})>D^{*}(\{1,2\};s^{\prime})\,,

which is impossible by our notion of demand rotations.

To prove claim (iii), observe that it suffices to show that it cannot be |B(s)|=1,|B(s)|=2,|B(s′′)|=1|B^{*}(s)|=1,|B^{*}(s^{\prime})|=2,|B^{*}(s^{\prime\prime})|=1 for any s<s<s′′s<s^{\prime}<s^{\prime\prime}. To see why this is impossible, note that: if B(s)={{1},{1,2}}B^{*}(s^{\prime})=\big{\{}\{1\},\{1,2\}\big{\}}, then r2(B())r_{2}(B^{*}(\,\cdot\,)) cannot be nondecreasing, contradicting claim (i); if B(s)={{2},{1,2}}B^{*}(s^{\prime})=\big{\{}\{2\},\{1,2\}\big{\}}, then r1(B())r_{1}(B^{*}(\,\cdot\,)) cannot be nonincreasing, contradicting claim (ii). ∎

Remark 9.

2 says that if there is a sequence of demand rotations for item ii, i.e. an increase in the dispersion of consumers’ values for item ii, then the item gets promoted to be the “upgrade item” while the other item gets demoted to be the “basic item”, and the optimal menu first gets coarser and then gets finer. This result complements Johnson and Myatt (2006), who study the effect of value dispersion on a monopolist’s quality design. They show that a demand rotation always leads to an expansion of the product line. In contrast, our bundling setting involves the monopolist switching the tiers of different items and adopting a menu size that is UU-shaped in the dispersion parameter. As an example, consider case (i) of Example 1, where γ=0.5\gamma=0.5 and β\beta varies from 0 to 22. As parameter β\beta increases, there is a sequence of demand rotations for item 22. The optimal menu changes in a way that is consistent with 2: It shifts from {{2},{1,2}}\big{\{}\{2\},\{1,2\}\big{\}} to {{1,2}}\big{\{}\{1,2\}\big{\}}, and then to {{1},{1,2}}\big{\{}\{1\},\{1,2\}\big{\}}, as parameter β\beta increases.

5.2 Optimal Quality Design and Cost Structures

A special case of our model is the quality discrimination model a la Mussa and Rosen (1978). Our result provides new insights even in this well-studied setting. Let 𝒳:={x1,,xn}++\mathcal{X}:=\{x_{1},\dots,x_{n}\}\subset\mathds{R}_{++} be a set of qualities, with x1<<xnx_{1}<\cdots<x_{n}. In this model, a type-tt consumer has value vQ(x,t)v_{Q}(x,t) for a good of quality xx; the monopolist incurs cost CQ(x)C_{Q}(x) to supply a good of quality xx. This can be viewed as a special case of our model, where we define the values and costs for the bundles as follows: For all k=1,,nk=1,\dots,n, let

v({1,,k},t):=vQ(xk,t),C({1,,k}):=CQ(xk).v(\{1,\dots,k\},t):=v_{Q}(x_{k},t)\,,\quad C(\{1,\dots,k\}):=C_{Q}(x_{k})\,.

Let v(b,t)=C(b)=0v(b,t)=C(b)=0 for all bundles bb that are not of the form {1,,k}\{1,\dots,k\} (i.e. these bundles can be simply ignored when applying our results). Note that, in this case, the nesting condition is always satisfied.

With a slight abuse of notation, we drop the subscript QQ whenever it is clear. Let D(x)D^{*}(x) be the unique sales volume of the good of quality xx when it is sold alone; thus, D:𝒳[0,1]D^{*}:\mathcal{X}\rightarrow[0,1]. For simplicity of exposition, assume that D(x)(0,1)D^{*}(x)\in(0,1) for all x𝒳x\in\mathcal{X}. Our next result provides a new characterization of optimal quality discrimination:

Proposition 3.

Consider the upper decreasing envelope of DD^{*}:

D^(x):=inf{g(x): g is nonincreasing and gD}.\widehat{D}^{*}(x):=\inf\Big{\{}g(x):\text{ $g$ is nonincreasing and $g\geq D^{*}$}\Big{\}}\,.

Let

X:={x:D^(x)=D(x)}.X^{*}:=\Big{\{}x:\widehat{D}^{*}(x)=D^{*}(x)\Big{\}}\,.

Then selling menu XX^{*} is optimal.

Proof of 3.

By Theorem 1, it suffices to show that if xXx^{\dagger}\not\in X^{*}, then xx^{\dagger} is dominated by another quality level xxx\neq x^{\dagger}. Fix any xXx^{\dagger}\not\in X^{*}. Then D^(x)>D(x)\widehat{D}^{*}(x^{\dagger})>D^{*}(x^{\dagger}). Suppose, for contradiction, that there does not exist any x>xx>x^{\dagger} such that D(x)D(x)D^{*}(x)\geq D^{*}(x^{\dagger}). Then, we have

maxx>x{D^(x)}=maxx>x{D(x)}<D(x).\max_{x>x^{\dagger}}\big{\{}\widehat{D}^{*}(x)\big{\}}=\max_{x>x^{\dagger}}\big{\{}D^{*}(x)\big{\}}<D^{*}(x^{\dagger})\,.

Let

D~(x):={D^(x) if xx ;D(x) otherwise .\widetilde{D}^{*}(x):=\begin{cases}\widehat{D}^{*}(x)&\text{ if $x\neq x^{\dagger}$\,;}\\ D^{*}(x)&\text{ otherwise\,.}\end{cases}

Then note that D~(x)\widetilde{D}^{*}(x) is also nonincreasing and everywhere above D(x)D^{*}(x). Moreover, D~(x)\widetilde{D}^{*}(x) is everywhere below D^(x)\widehat{D}^{*}(x) with D~(x)<D^(x)\widetilde{D}^{*}(x^{\dagger})<\widehat{D}^{*}(x^{\dagger}). A contradiction. ∎

Cost Structures.

Using 3, we now characterize how optimal quality design depends on the cost structure. Suppose that the consumers’ preferences are multiplicative: v(x,t)=xtv(x,t)=x\cdot t.212121This is equivalent to assuming that v(x,t)=u(x)v(t)v(x,t)=u(x)\cdot v(t) for strictly increasing functions uu and vv, by a change of variables. Let Cavg(x):=C(x)/xC_{avg}(x):=C(x)\big{/}x be the average cost function.

Proposition 4.

Suppose that FF is regular.222222That is, the function t1F(t)f(t)t-\frac{1-F(t)}{f(t)} is strictly increasing (which is equivalent to that the quality-adjusted revenue curve is strictly concave). Consider the lower increasing envelope of CavgC_{avg}:

Cˇavg(x):=sup{g(x): g is nondecreasing and gCavg}.\widecheck{C}_{avg}(x):=\sup\Big{\{}g(x):\text{ $g$ is nondecreasing and $g\leq C_{avg}$}\Big{\}}\,.

Let

X:={x:Cˇavg(x)=Cavg(x)}.X^{*}:=\Big{\{}x:\widecheck{C}_{avg}(x)=C_{avg}(x)\Big{\}}\,.

Then selling menu XX^{*} is optimal.

Proof of 4.

Note that since D(x)(0,1)D^{*}(x)\in(0,1), we must have

MR(D(x))xC(x)=0,\text{MR}(D^{*}(x))\cdot x-C(x)=0\,,

where the quality-adjusted marginal revenue curve MR(q):=ddq(F1(1q)q)\text{MR}(q):=\frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d}q}(F^{-1}(1-q)\cdot q) is a strictly decreasing, continuous function. Therefore, we have

D(x)=MR1(Cavg(x)).D^{*}(x)=\text{MR}^{-1}\Big{(}C_{avg}(x)\Big{)}\,.

Now, observe that for any function h:𝒳h:\mathcal{X}\rightarrow\mathds{R} and any strictly decreasing function Φ:\Phi:\mathds{R}\rightarrow\mathds{R}, we have

U[Φh]=ΦL+[h],\textbf{U}^{-}\big{[}\Phi\circ h\big{]}=\Phi\circ\textbf{L}^{+}\big{[}h\big{]}\,,

where U[]\textbf{U}^{-}[\,\cdot\,] denotes the upper decreasing envelope operator and L+[]\textbf{L}^{+}[\,\cdot\,] denotes the lower increasing envelope operator. Thus, we have

D^(x)=MR1(Cˇavg(x))\widehat{D}^{*}(x)=\text{MR}^{-1}\Big{(}\widecheck{C}_{avg}(x)\Big{)}

because MR1()\text{MR}^{-1}(\,\cdot\,) is strictly decreasing. The claim follows from 3. ∎

Remark 10.

This result generalizes Proposition 1 of Johnson and Myatt (2003), where the average cost curve is assumed to be UU-shaped.232323They define a cost structure to be UU-shaped if there exists some quality threshold xkx_{k} below which the average cost is decreasing and above which the marginal and average costs are increasing. Note that in this case, the set of undominated qualities XX^{*} coincides with the region of increasing marginal and average costs {xk,,xn}\{x_{k},\dots,x_{n}\}. Moreover, by Algorithm 1, the menu XX^{*} in this case is the minimal optimal menu. They conclude that “It is optimal to segment the market with multiple products exactly in the region where average cost and marginal cost are increasing” (Johnson and Myatt 2003). However, 4 shows that this conclusion is incomplete when the cost structure is more complex.242424Average costs are generally not UU-shaped whenever there are kinks in the cost function due to a mix of production technologies, e.g. C(x)=min{k1+xα1,k2+xα2}C(x)=\min\big{\{}k_{1}+x^{\alpha_{1}},k_{2}+x^{\alpha_{2}}\big{\}} where k1<k2k_{1}<k_{2} and α1>α2\alpha_{1}>\alpha_{2}. The optimal mechanism need not segment the market even when average cost and marginal cost are increasing. Figure 10 illustrates. Specifically, the marginal and average costs can both increase within the blue region highlighted in Figure 10(b), yet the optimal mechanism does not segment the market using these qualities. Instead, as illustrated in Figure 10(a), optimal quality choices are characterized by our notion of dominance.

Refer to caption
(a) D(x)D^{*}(x) and D^(x)\widehat{D}^{*}(x)
Refer to caption
(b) Cavg(x)C_{avg}(x) and Cˇavg(x)\widecheck{C}_{avg}(x)
Figure 10: Illustration of the upper decreasing envelope D^\widehat{D}^{*} and the lower increasing envelope Cˇavg\widecheck{C}_{avg}
Remark 11.

At first glance, 3 and 4 may seem related to the “ironing procedures” in Mussa and Rosen (1978) and Myerson (1981). However, this connection is only superficial. Even though both 3 and 4 characterize various “bunching regions”, they operate in a setting where “ironing” is not needed. In the standard textbook treatment of one-dimensional screening problems, the regularity assumptions that rule out “ironing” also rule out the possibility of “bunching” (see pp. 262–268 of Fudenberg and Tirole 1991). Our assumptions are much weaker, allowing for various forms of “bunching.” This generality relies on our new characterization of the upper envelope of the MR curves.

5.3 Optimal Screening with Price and Nonprice Instruments

Consider a monopolist selling a set of quality-differentiated goods, as described in Section 5.2. In addition to setting prices for these goods, the monopolist can also use nonprice instruments by requiring customers to perform certain costly actions, such as waiting in line or collecting coupons, in order to qualify for certain offers. When is such nonprice screening optimal?

We consider a special case of the model introduced in Yang (2022). A consumer’s payoff is given by

u(x,t)utility of consuming quality xc(y,t)disutility of costly action ypprice\underbrace{u(x,t)}_{\text{utility of consuming quality $x$}}-\underbrace{c(y,t)}_{\text{disutility of costly action $y$}}-\quad\underbrace{p}_{\text{price}}

where x{x1,,xn}=:𝒳++x\in\{x_{1},\dots,x_{n}\}=:\mathcal{X}\subset\mathds{R}_{++} denotes the quality and y{y1,,ym}=:𝒴++y\in\{y_{1},\dots,y_{m}\}=:\mathcal{Y}\subset\mathds{R}_{++} denotes the costly action, with the normalization u(0,t)=c(0,t)=0u(0,t)=c(0,t)=0. The seller’s payoff is given by C(x)+p-C(x)+p, where C()C(\,\cdot\,) is the production cost function. From Yang (2022), we know that if c(y,t)c(y,t) is nonincreasing in tt, then the optimal deterministic mechanism does not use the costly instruments (i.e. y(t)=0y(t)=0 for all types tt).

In this section, we consider the opposite case where c(y,t)c(y,t) is strictly increasing in tt. We also restrict attention to deterministic mechanisms (x,y,p):T({0}𝒳)×({0}𝒴)×(x,y,p):T\rightarrow(\{0\}\cup\mathcal{X})\times(\{0\}\cup\mathcal{Y})\times\mathds{R}. We say that costly screening is optimal if every optimal mechanism requires a positive mass of consumers to perform some costly action y𝒴y\in\mathcal{Y}.

Let π(x,q)\pi(x,q) be the profit function of selling quality xx alone and D(x)D^{*}(x) the corresponding sales volume as in Section 5.2. We strengthen the quasi-concavity assumptions in Section 5.2 to strict concavity and assume that D(x)<1D^{*}(x)<1 for all x𝒳x\in\mathcal{X}. To state our result, for any costly action y𝒴y\in\mathcal{Y}, we introduce the following hypothetical profit function:

π(y,q):=c(y,F1(1q))q\pi(y,q):=c\big{(}y,F^{-1}(1-q)\big{)}\cdot q

which is the profit that the seller would obtain if she were to sell the right to opt out of the costly action yy to a mass qq of individuals. Suppose that π(y,q)\pi(y,q) is strictly concave in qq. Let D(y)D^{*}(y) be the unique sales volume that maximizes this hypothetical profit function.

Let 𝒮\mathcal{S} be the set of allocations that involve costly actions but generate a positive total surplus for at least some type tt:

𝒮:={(x,y)𝒳×𝒴:u(x,t)c(y,t)C(x)>0 for some type t}.\mathcal{S}:=\big{\{}(x,y)\in\mathcal{X}\times\mathcal{Y}:u(x,t)-c(y,t)-C(x)>0\text{ for some type $t$}\big{\}}\,.

Note that if there exists some (x,y)𝒮(x,y)\in\mathcal{S} such that the net value u(x,t)c(y,t)u(x,t)-c(y,t) is strictly decreasing in tt, then costly screening is optimal.

Now, to consider the more interesting case, suppose that (i) the net value u(x,t)c(y,t)u(x,t)-c(y,t) is strictly increasing in tt and (ii) the net profit function π(x,q)π(y,q)\pi(x,q)-\pi(y,q) is strictly quasi-concave in qq for all (x,y)𝒮{(x,y)}(x,y)\in\mathcal{S}\cup\{(x^{*},y^{*})\}, where x:=max{argmaxxD(x)}x^{*}:=\max\{\operatornamewithlimits{argmax}_{x}D^{*}(x)\} and y:=min{argminyD(y)}y^{*}:=\min\{\operatornamewithlimits{argmin}_{y}D^{*}(y)\}. Then, we have the following result:

Proposition 5.

Costly screening is optimal if and only if minyD(y)<maxxD(x)\min_{y}D^{*}(y)<\max_{x}D^{*}(x).

Proof of 5.

Without loss of generality, we may assume that the seller only uses allocations in 𝒮{(x,0):x𝒳}\mathcal{S}\cup\{(x,0):x\in\mathcal{X}\}. Indeed, for any option that involves an allocation (x,y)(x,y) such that the surplus u(x,t)C(x)c(y,t)0u(x,t)-C(x)-c(y,t)\leq 0 for all types tt, the seller must obtain a non-positive profit (because of the IR constraints). But the seller can always weakly improve by simply removing all options with non-positive profits from the menu.

Now, we map this problem into a bundling problem as follows. Consider a bundling problem with n+mn+m many items, where the first nn items represent quality upgrades exactly as in Section 5.2, and the remaining mm items represent opting out of each of the mm costly activities. Specifically, for any (xi,yj)𝒮(x_{i},y_{j})\in\mathcal{S}, we define

v({1,,i}{n+1,,n+m}\{n+j},t):=u(xi,t)c(yj,t),v\Big{(}\big{\{}1,\dots,i\big{\}}\cup\{n+1,\dots,n+m\big{\}}\backslash\big{\{}n+j\big{\}},t\Big{)}:=u(x_{i},t)-c(y_{j},t)\,,

with v({1,,i}{n+1,,n+m},t):=u(xi,t)v\big{(}\big{\{}1,\dots,i\big{\}}\cup\{n+1,\dots,n+m\big{\}},t\big{)}:=u(x_{i},t) being the value of quality xix_{i} without any costly action. We can map the production costs accordingly and let v(b,t)=C(b)=0v(b,t)=C(b)=0 for bundles bb that are not of the above form (i.e. ignore them when applying our results). One can verify that this bundling problem satisfies our assumptions. With a slight abuse of notation, we also write (x,y)(x,y) as the bundle of quality xx and costly action yy, and write D(x,y)D^{*}(x,y) as the corresponding sales volume of this ”damaged good” when sold alone.

The “if” part: Suppose that minyD(y)<maxxD(x)\min_{y}D^{*}(y)<\max_{x}D^{*}(x). Suppose for contradiction that there exists an optimal deterministic mechanism that does not use any costly instruments. Then, by 3, there exists an optimal, nested menu BB such that x=max{argmaxxD(x)}x^{*}=\max\big{\{}\operatornamewithlimits{argmax}_{x}D^{*}(x)\big{\}} is the smallest non-empty bundle in BB. Let y=min{argminyD(y)}y^{*}=\min\big{\{}\operatornamewithlimits{argmin}_{y}D^{*}(y)\big{\}}. By assumption, D(y)<D(x)D^{*}(y^{*})<D^{*}(x^{*}). Because π(x,q)\pi(x^{*},q), π(y,q)\pi(y^{*},q), and π(x,q)π(y,q)\pi(x^{*},q)-\pi(y^{*},q) are strictly quasi-concave, this implies that

D(y)<D(x)<D(x,y).D^{*}(y^{*})<D^{*}(x^{*})<D^{*}({x^{*},y^{*}})\,.

So we have that (x,y)𝒮(x^{*},y^{*})\in\mathcal{S} and that the bundle (x,y)(x^{*},y^{*}) sells more than the bundle xx^{*} does when both are sold alone. Since the bundle (x,y)(x^{*},y^{*}) is a subset of the bundle xx^{*}, this implies that the menu BB cannot be optimal by Theorem 2.252525The necessity part of Theorem 2 involves adding probabilistic assignments but the assignments can be made deterministic if the bundle b1Bb_{1}\not\in B is a subset of the smallest non-empty bundle in menu BB. A contradiction.

The “only if” part: Suppose that minyD(y)maxxD(x)\min_{y}D^{*}(y)\geq\max_{x}D^{*}(x). Then, for all (x,y)𝒮(x^{\prime},y^{\prime})\in\mathcal{S},

D(y)minyD(y)maxxD(x)D(x).D^{*}(y^{\prime})\geq\min_{y}D^{*}(y)\geq\max_{x}D^{*}(x)\geq D^{*}(x^{\prime})\,.

Because π(x,q)\pi(x^{\prime},q), π(y,q)\pi(y^{\prime},q), and π(x,q)π(y,q)\pi(x^{\prime},q)-\pi(y^{\prime},q) are strictly quasi-concave, this implies that

D(y)D(x)D(x,y).D^{*}(y^{\prime})\geq D^{*}(x^{\prime})\geq D^{*}(x^{\prime},y^{\prime})\,.

So the bundle (x,y)(x^{\prime},y^{\prime}) is dominated by the bundle xx^{\prime}. Eliminating these dominated bundles leaves a nested menu consisting of only bundles that involve no costly action. By Theorem 1, the resulting menu is optimal (among all stochastic mechanisms and hence also optimal among all deterministic mechanisms). Thus, costly screening is not optimal. ∎

Remark 12.

Note that our model does not require c(y,t)c(y,t) to be monotone in yy, nor does it impose an increasing differences condition on c(y,t)c(y,t). Thus, this model allows different consumers to have different ordinal rankings for the costly actions. In addition, as the proof shows, for the “if” part of 5, it suffices to assume the monotonicity of u(x,t)c(y,t)u(x,t)-c(y,t) and the quasi-concavity of π(x,q)π(y,q)\pi(x,q)-\pi(y,q) for a single pair (x,y)(x^{*},y^{*}).

Remark 13.

To interpret 5, recall that

D(y)=argmaxq{c(y,F1(1q))q}.D^{*}(y)=\operatornamewithlimits{argmax}_{q}\Big{\{}c\big{(}y,F^{-1}(1-q)\big{)}\cdot q\Big{\}}\,.

Note that D(y)D^{*}(y) is smaller if the elasticity of disutility c(y,t)c(y,t) with respect to type tt is pointwise higher (i.e. the costly action is differentially more costly for a higher type). That is, we have

dlogc(y1,t)dlogtdlogc(y2,t)dlogt for all t D(y1)D(y2).\frac{\mathop{}\!\mathrm{d}\log c(y_{1},t)}{\mathop{}\!\mathrm{d}\log t}\leq\frac{\mathop{}\!\mathrm{d}\log c(y_{2},t)}{\mathop{}\!\mathrm{d}\log t}\,\text{ for all $t$ }\implies D^{*}(y_{1})\geq D^{*}(y_{2})\,.

5 shows that, under our assumptions, costly screening is optimal if and only if there exists a costly action yy with sufficiently high elasticity of disutility in a precise sense: D(y)D^{*}(y) is smaller than maxxD(x)\max_{x}D^{*}(x). Figure 11 illustrates.

The intuition behind this characterization can be understood from our main results. In the absence of costly screening, by 3, we know that the best-selling quality x:=max{argmaxxD(x)}x^{*}:=\max\{\operatornamewithlimits{argmax}_{x}D^{*}(x)\} would be optimally offered as the base quality level in the menu. If there exists a costly action yy such that D(y)<D(x)D^{*}(y)<D^{*}(x^{*}), then the “damaged good” (x,y)(x^{*},y) — which requires action yy to purchase quality xx^{*} — has an even higher sales volume. Intuitively, this is because the costly action yy compresses the distribution of values for the “damaged good.” Thus, the “damaged good” (x,y)(x^{*},y) can be profitably included in the menu to expand the market by Theorem 2. On the other hand, if D(y)D(x)D^{*}(y)\geq D^{*}(x^{*}), then we have D(y)D(x)D(x)D^{*}(y)\geq D^{*}(x^{*})\geq D^{*}(x) for all qualities xx, since quality xx^{*} is the best-selling quality. This implies that any “damaged good” (x,y)(x,y) has a lower sales volume than its “undamaged version” xx. When mapped into our bundling problem, this means that every such bundle is dominated by some bundle involving no costly action. The bundles involving no costly action are nested by construction because they represent different quality levels, and thus, costly screening is suboptimal by Theorem 1.

Refer to caption
Figure 11: Illustration of 5

6 Conclusion

This paper studies optimal bundling when consumers differ in one dimension. We introduce a partial order on the set of bundles defined by (i) set inclusion and (ii) sales volumes. This simple partial order turns out to characterize optimal bundling strategies. We show that if the set of undominated bundles is nested, then nested bundling is optimal. In particular, selling the set of undominated bundles is optimal. The proof uses a novel characterization of the upper envelope of the MR curves.

We apply these insights to connect optimal bundling and quality design to price elasticities and cost structures. We also show how these insights can be applied to other multidimensional mechanism design problems. In particular, we obtain a necessary and sufficient condition for the optimality of costly screening when the principal has access to both price and nonprice screening instruments.

References

  • Adams and Yellen (1976) Adams, W. J. and J. L. Yellen (1976): “Commodity Bundling and the Burden of Monopoly,” Quarterly Journal of Economics, 90(3), 475–498.
  • Amador and Bagwell (2013) Amador, M. and K. Bagwell (2013): “The Theory of Optimal Delegation With an Application to Tariff Caps,” Econometrica, 81(4), 1541–1599.
  • Anderson and Dana Jr (2009) Anderson, E. T. and J. D. Dana Jr (2009): “When is Price Discrimination Profitable?” Management Science, 55(6), 980–989.
  • Armstrong (1996) Armstrong, M. (1996): “Multiproduct Nonlinear Pricing,” Econometrica, 64(1), 51–75.
  • Armstrong (2013) ——— (2013): “A More General Theory of Commodity Bundling,” Journal of Economic Theory, 148(2), 448–472.
  • Banerjee (1997) Banerjee, A. V. (1997): “A Theory of Misgovernance,” Quarterly Journal of Economics, 112(4), 1289–1332.
  • Bergemann et al. (2022) Bergemann, D., A. Bonatti, A. Haupt, and A. Smolin (2022): “The Optimality of Upgrade Pricing,” arXiv:2107.10323 [econ.TH].
  • Bulow and Roberts (1989) Bulow, J. and J. Roberts (1989): “The Simple Economics of Optimal Auctions,” Journal of Political Economy, 97(5), 1060–1090.
  • Carroll (2017) Carroll, G. (2017): “Robustness and Separation in Multidimensional Screening,” Econometrica, 85(2), 453–488.
  • Condorelli (2012) Condorelli, D. (2012): “What Money Can’t Buy: Efficient Mechanism Design with Costly Signals,” Games and Economic Behavior, 75(2), 613–624.
  • Daskalakis et al. (2017) Daskalakis, C., A. Deckelbaum, and C. Tzamos (2017): “Strong Duality for a Multiple-Good Monopolist,” Econometrica, 85(3), 735–767.
  • Deneckere and McAfee (1996) Deneckere, R. J. and P. McAfee (1996): “Damaged Goods,” Journal of Economics & Management Strategy, 5(2), 149–174.
  • Fudenberg and Tirole (1991) Fudenberg, D. and J. Tirole (1991): Game Theory, MIT Press.
  • Ghili (forthcoming) Ghili, S. (forthcoming): “A Characterization for Optimal Bundling of Products with Non-Additive Values,” American Economic Review: Insights.
  • Haghpanah and Hartline (2021) Haghpanah, N. and J. Hartline (2021): “When is Pure Bundling Optimal?” Review of Economic Studies, 88(3), 1127–1156.
  • Hartline and Roughgarden (2008) Hartline, J. D. and T. Roughgarden (2008): “Optimal Mechanism Design and Money Burning,” in Proceedings of the 40th Annual ACM SIGACT Symposium on Theory of Computing, STOC ’2008, 75–84.
  • Johnson and Myatt (2003) Johnson, J. P. and D. P. Myatt (2003): “Multiproduct Quality Competition: Fighting Brands and Product Line Pruning,” American Economic Review, 93(3), 748–774.
  • Johnson and Myatt (2006) ——— (2006): “On the Simple Economics of Advertising, Marketing, and Product Design,” American Economic Review, 96(3), 756–784.
  • Long (1984) Long, J. B. (1984): “Comments on “Gaussian Demand and Commodity Bundling”,” Journal of Business, 57(1), S235–S246.
  • Manelli and Vincent (2007) Manelli, A. M. and D. R. Vincent (2007): “Multidimensional Mechanism Design: Revenue Maximization and the Mmultiple-good Monopoly,” Journal of Economic Theory, 137(1), 153–185.
  • McAfee and McMillan (1988) McAfee, R. P. and J. McMillan (1988): “Multidimensional Incentive Compatibility and Mechanism Design,” Journal of Economic Theory, 46(2), 335–354.
  • McAfee et al. (1989) McAfee, R. P., J. McMillan, and M. D. Whinston (1989): “Multiproduct Monopoly, Commodity Bundling, and Correlation of Values,” Quarterly Journal of Economics, 104.
  • Mussa and Rosen (1978) Mussa, M. and S. Rosen (1978): “Monopoly and Product Quality,” Journal of Economic Theory, 18(2), 301–317.
  • Myerson (1981) Myerson, R. B. (1981): “Optimal Auction Design,” Mathematics of Operations Research, 6(1), 58–73.
  • Pavlov (2011) Pavlov, G. (2011): “Optimal Mechanism for Selling Two Goods,” B.E. Journal of Theoretical Economics, 11(1).
  • Rochet (1987) Rochet, J.-C. (1987): “A Necessary and Sufficient Condition for Rationalizability in a Quasi-linear Context,” Journal of Mathematical Economics, 16(2), 191–200.
  • Rochet and Chone (1998) Rochet, J.-C. and P. Chone (1998): “Ironing, Sweeping, and Multidimensional Screening,” Econometrica, 66(4), 783–826.
  • Rochet and Stole (2003) Rochet, J. C. and L. A. Stole (2003): “The Economics of Multidimensional Screening,” in Advances in Economics and Econometrics: Theory and Applications, Eighth World Congress, Volume 1, Cambridge: Cambridge University Press.
  • Salant (1989) Salant, S. W. (1989): “When is Inducing Self-selection Suboptimal for a Monopolist?” Quarterly Journal of Economics, 104(2), 391–397.
  • Slack (2022) Slack (2022): “Pricing,” Accessed at https://slack.com/pricing on November 7, 2022.
  • Stigler (1963) Stigler, G. J. (1963): “United States v. Loew’s Inc.: A Note on Block-booking,” Supreme Court Review, 1963, 152–157.
  • Stokey (1979) Stokey, N. L. (1979): “Intertemporal Price Discrimination,” Quarterly Journal of Economics, 93(3), 355–371.
  • Yang (2022) Yang, F. (2022): “Costly Multidimensional Screening,” Available at SSRN 3915700.

Appendix A Proof of the Main Result

This appendix provides the proof of the main result, following the proof sketch outlined in Section 4.

A.1 Proof of Theorem 1

To begin, we explicitly construct a menu and its associated prices as follows:

Algorithm 1 (Minimal optimal menu).
1. Let UU be the set of undominated bundles (which by assumption are nested). 2. Order the bundles in UU such that U={,b1,,bm}U=\{\varnothing,b_{1},\dots,b_{m}\} where bibjb_{i}\subset b_{j} for all i<ji<j. 3. Initialize B={}B=\{\varnothing\}, q(b)=1q(b)=1 for all bUb\in U, and i=1i=1. 4. While imi\leq m: (i) Let b^\hat{b} be the largest bundle in BB. (ii) Optimize quantity q[0,q(b^)]q^{*}\in[0,q(\hat{b})] for incremental profit π(bi,q)π(b^,q)\pi(b_{i},q)-\pi(\hat{b},q) (iii) If q=q(b^)q^{*}=q(\hat{b}): Update B=B\{b^}B=B\backslash\{\hat{b}\}. (iv) If q=0q^{*}=0: Update i=i+1i=i+1. (v) If 0<q<q(b^)0<q^{*}<q(\hat{b}) or B=B=\varnothing: Update B=B{bi}B=B\cup\{b_{i}\}, q(bi)=qq(b_{i})=q^{*}, and i=i+1i=i+1. 5. Return BB and {q(b)}bB\{q(b)\}_{b\in B}. Prices: Order the bundles in menu BB such that B={,b1,,bl}B=\{\varnothing,b_{1},\dots,b_{l}\} where b1blb_{1}\subset\cdots\subset b_{l}. We construct the corresponding prices using upgrade prices. Specifically, the upgrade price from bundle bib_{i} to bundle bi+1b_{i+1} is set such that, when all consumers have bundle bib_{i} and no other bundles are available in the market, the quantity of the incremental bundle bi+1\bib_{i+1}\backslash b_{i} sold equals q(bi+1)q(b_{i+1}).

In the remainder of the proof, we show that: (i) selling the constructed menu BB with the associated prices is optimal among all stochastic mechanisms, and (ii) the menu BB is minimally optimal. We defer the proof of the uniqueness part of the statement to the end.

Following Myerson (1981), let

ϕ(b,t):=v(b,t)C(b)1F(t)f(t)vt(b,t)\phi(b,t):=v(b,t)-C(b)-\frac{1-F(t)}{f(t)}v_{t}(b,t)

be the virtual surplus function for bundle bb. As observed by Bulow and Roberts (1989), this can be interpreted as the marginal profit for bundle bb evaluated at the quantity such that the marginal consumer is of type tt.

Lemma 2.

Let aa^{*} be a solution to

supa:TΔ()𝔼[bab(t)ϕ(b,t)].\sup_{a:\,T\rightarrow\Delta(\mathcal{B})}\operatorname{\mathds{E}}\Bigg{[}\sum_{b}a_{b}(t)\phi(b,t)\Bigg{]}\,.

If there exists a pricing rule pp^{*} such that (i) (a,p)(a^{*},p^{*}) is a mechanism and (ii) type t¯\underline{t} gets zero payoff under (a,p)(a^{*},p^{*}), then (a,p)(a^{*},p^{*}) is optimal.

Proof of Lemma 2.

By the envelope theorem, under any mechanism (a,p)(a,p) that gives type t¯\underline{t} zero payoff, the profit to the seller must be 𝔼[bab(t)ϕ(b,t)]\operatorname{\mathds{E}}[\sum_{b}a_{b}(t)\phi(b,t)] (see Myerson 1981). The claim follows. ∎

Step 1.

Let t(b):=F1(1D(b))t^{*}(b):=F^{-1}(1-D^{*}(b)) be the unique cutoff type for bundle bb, which is where ϕ(b,t)\phi(b,t) crosses 0 from below. Clearly,

D(b1)D(b2)t(b1)t(b2).D^{*}(b_{1})\geq D^{*}(b_{2})\iff t^{*}(b_{1})\leq t^{*}(b_{2})\,.

A function g(t)g(t) strictly single-crosses (from below) another function h(t)h(t) on an interval [t1,t2][t_{1},t_{2}] if g(t)h(t)g(t)>h(t)g(t)\geq h(t)\implies g(t^{\prime})>h(t^{\prime}) for all t<t[t1,t2]t<t^{\prime}\in[t_{1},t_{2}].

Lemma 3.

For any b1b2b_{1}\subset b_{2}, ϕ(b2,t)\phi(b_{2},t) strictly single-crosses ϕ(b1,t)\phi(b_{1},t) on [max(t(b1),t(b2)),t¯][\max(t^{*}(b_{1}),t^{*}(b_{2})),\overline{t}].

Proof of Lemma 3.

Observe that

ϕ(b,t)=dπ(b,q)dq|q=1F(t).\phi(b,t)=\frac{\mathop{}\!\mathrm{d}\pi(b,q)}{\mathop{}\!\mathrm{d}q}\bigg{|}_{\,\,q=1-F(t)}\,.

By assumption, π(b2,q)π(b1,q)\pi(b_{2},q)-\pi(b_{1},q) is strictly quasi-concave on [0,min(D(b1),D(b2))][0,\min(D^{*}(b_{1}),D^{*}(b_{2}))] and has at most one stationary point in [0,min(D(b1),D(b2))][0,\min(D^{*}(b_{1}),D^{*}(b_{2}))] which if exists is the maximum point in [0,min(D(b1),D(b2))][0,\min(D^{*}(b_{1}),D^{*}(b_{2}))]. The claim follows immediately. ∎

For any b1b2b_{1}\subset b_{2}, let

s(b1,b2):=infs{sT:ϕ(b2,t)>ϕ(b1,t) for all t>s}s(b_{1},b_{2}):=\inf_{s}\Big{\{}s\in T:\phi(b_{2},t)>\phi(b_{1},t)\text{ for all $t>s$}\Big{\}}

be the last crossing type of the virtual surplus functions for b1b_{1} and b2b_{2}. Note that s(b1,b2)s(b_{1},b_{2}) is always well-defined. Let

χ(b1,b2):=ϕ(b1,s(b1,b2))\chi(b_{1},b_{2}):=\phi(b_{1},s(b_{1},b_{2}))

be the last crossing virtual surplus of b1b_{1} and b2b_{2}. The next lemma shows that the sign of the last crossing virtual surplus is determined by the ranking of the sales volumes:

Lemma 4 (Virtual surplus and sales volumes).

For any b1b2b_{1}\subset b_{2},

sign[χ(b1,b2)]=sign[D(b1)D(b2)].\operatorname{sign}\Big{[}\chi(b_{1},b_{2})\Big{]}=\operatorname{sign}\Big{[}D^{*}(b_{1})-D^{*}(b_{2})\Big{]}\,.
Proof of Lemma 4.

Observe that the claim holds if either s(b1,b2)=t¯s(b_{1},b_{2})=\underline{t} or s(b1,b2)=t¯s(b_{1},b_{2})=\overline{t}. Now suppose that s(b1,b2)(t¯,t¯)s(b_{1},b_{2})\in(\underline{t},\overline{t}). Observe that by Lemma 3, only the following three cases can happen:

  • Case (1): s(b1,b2)>max(t(b1),t(b2))s(b_{1},b_{2})>\max(t^{*}(b_{1}),t^{*}(b_{2})) and t(b1)<t(b2)t^{*}(b_{1})<t^{*}(b_{2}) ;

  • Case (2): s(b1,b2)<min(t(b1),t(b2))s(b_{1},b_{2})<\min(t^{*}(b_{1}),t^{*}(b_{2})) and t(b1)>t(b2)t^{*}(b_{1})>t^{*}(b_{2}) ;

  • Case (3): s(b1,b2)=min(t(b1),t(b2))=max(t(b1),t(b2))s(b_{1},b_{2})=\min(t^{*}(b_{1}),t^{*}(b_{2}))=\max(t^{*}(b_{1}),t^{*}(b_{2})) .

This argument is depicted in Figure 7 (in Section 4.1) in which we hold the relative position of ϕ(b1,t)\phi(b_{1},t) and ϕ(b2,t)\phi(b_{2},t) fixed and vary the xx-axis. The claim follows immediately. ∎

Step 2.

Let UU be the set of undominated bundles. We now show that any dominated bundle can be eliminated in the following sense:

Lemma 5 (Elimination of dominated bundles).

We have

supa:TΔ()𝔼[bab(t)ϕ(b,t)]=supa:TU𝔼[bab(t)ϕ(b,t)].\sup_{a:\,T\rightarrow\Delta(\mathcal{B})}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}=\sup_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}\,.
Proof of Lemma 5.

Note that by compactness and linearity, a solution to the left-hand side exists and uses only deterministic assignments. Let aa be any such a solution. Suppose that there exists a dominated bundle bb such that a(t)=ba(t)=b for some tTt\in T (otherwise the claim holds). First, observe that because the dominance relation \preceq is a partial order on a finite set, any dominated bundle must also be dominated by some undominated bundle. So there exists bUb^{\prime}\in U that dominates bb.

Second, observe that by Lemma 4, we have χ(b,b)0\chi(b,b^{\prime})\leq 0, and therefore ϕ(b,t)ϕ(b,t)\phi(b^{\prime},t)\geq\phi(b,t) for all tt(b)t\geq t^{*}(b), which implies that

max{0,ϕ(b,t)}ϕ(b,t)\max\big{\{}0,\phi(b^{\prime},t)\big{\}}\geq\phi(b,t)

for all tTt\in T. Therefore, by replacing a(t)a(t) from bb to either \varnothing or bb^{\prime} for all types that are previously assigned bb, we can weakly improve the objective value. Since this argument holds for all dominated bundles bb, the claim follows. ∎

Step 3.

Now, we construct a solution to

supa:TU𝔼[bab(t)(v(b,t)C(b)1F(t)f(t)vt(b,t))]\sup_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Bigg{[}\sum_{b}a_{b}(t)\Big{(}v(b,t)-C(b)-\frac{1-F(t)}{f(t)}v_{t}(b,t)\Big{)}\Bigg{]}

and show that it coincides with the allocation rule induced by the menu BB and associated prices constructed by Algorithm 1. Because the set of undominated bundles is nested, we can write U={,b1,,bm}U=\{\varnothing,b_{1},\dots,b_{m}\} with b1bmb_{1}\subset\cdots\subset b_{m}. Moreover, we must also have

D(b1)>D(b2)>>D(bm).D^{*}(b_{1})>D^{*}(b_{2})>\cdots>D^{*}(b_{m})\,.

For all j=1,,mj=1,\dots,m, let

Vj(t):=max{0,ϕ(b1,t),,ϕ(bj,t)}V_{j}(t):=\max\Big{\{}0,\phi(b_{1},t),\dots,\phi(b_{j},t)\Big{\}}

be the envelope of the virtual surplus functions of the jj smallest bundles (and the empty bundle).

The next lemma shows that taking envelopes preserves single-crossing properties:

Lemma 6 (Single-crossing envelopes).

ϕ(bk,t)\phi(b_{k},t) strictly single-crosses Vj(t)V_{j}(t) for all j<kj<k.

Proof of Lemma 6.

We prove this by induction on jj.

Base case: Let j=1j=1. Fix any k>1k>1. Note that b1bkb_{1}\subset b_{k} and D(b1)>D(bk)D^{*}(b_{1})>D^{*}(b_{k}). Therefore, by Lemma 3, we have that ϕ(bk,t)\phi(b_{k},t) strictly single-crosses ϕ(b1,t)\phi(b_{1},t) on [t(bk),t¯][t^{*}(b_{k}),\overline{t}], which implies that ϕ(bk,t)\phi(b_{k},t) strictly single-crosses max{0,ϕ(b1,t)}=V1(t)\max\{0,\phi(b_{1},t)\}=V_{1}(t) on [t¯,t¯][\underline{t},\overline{t}].

Inductive step: Suppose that for all k>j1k>j-1, ϕ(bk,t)\phi(b_{k},t) strictly single-crosses Vj1(t)V_{j-1}(t). Fix any k>jk>j. We can write

Vj(t)=max{Vj1(t),max{0,ϕ(bj,t)}}.V_{j}(t)=\max\Big{\{}V_{j-1}(t),\max\big{\{}0,\phi(b_{j},t)\big{\}}\Big{\}}\,.

Note that

  • (i) ϕ(bk,t)\phi(b_{k},t) strictly single-crosses max{0,ϕ(bj,t)}\max\{0,\phi(b_{j},t)\} (by the same argument as before);

  • (ii) ϕ(bj,t)\phi(b_{j},t) strictly single-crosses Vj1(t)V_{j-1}(t) (by the inductive hypothesis);

  • (iii) ϕ(bk,t)\phi(b_{k},t) strictly single-crosses Vj1(t)V_{j-1}(t) (by the inductive hypothesis).

Note that if any of the above crossings does not happen, then the claim for the inductive step clearly holds. Otherwise, let t(k,j)t(k,j), t(j,j1)t(j,j-1), and t(k,j1)t(k,j-1) be the crossing points for (i), (ii), and (iii), respectively.

Now, because of the above single-crossing properties, observe that only the following three cases can happen:

  • Case (1): t(k,j1)<t(j,j1)t(k,j-1)<t(j,j-1) in which case we must have t(k,j)<t(k,j1)t(k,j)<t(k,j-1).

  • Case (2): t(k,j1)>t(j,j1)t(k,j-1)>t(j,j-1) in which case we must have t(k,j)>t(k,j1)t(k,j)>t(k,j-1).

  • Case (3): t(k,j1)=t(j,j1)t(k,j-1)=t(j,j-1) in which case we must have t(k,j)=t(k,j1)t(k,j)=t(k,j-1).

This argument is depicted in Figure 8 (in Section 4.1) in which we hold the positions of Vj1(t)V_{j-1}(t) and ϕ(bj,t)\phi(b_{j},t) fixed and vary ϕ(bk,t)\phi(b_{k},t). The claim for the inductive step follows. ∎

We now iteratively construct the following allocation rule:

Lemma 7 (Iterative construction).

There exists a mapping a:TUa:T\rightarrow U such that

  • (i)

    ϕ(a(t),t)=Vm(t)\phi(a(t),t)=V_{m}(t) for all tt.

  • (ii)

    a(t)a(t)a(t)\subseteq a(t^{\prime}) for all t<tt<t^{\prime}.

Proof of Lemma 7.

Let a(0)(t)=a^{(0)}(t)=\varnothing. For each i=1,,mi=1,\dots,m:

  • 1.

    Let t(i,i1)t(i,i-1) be the type at which ϕ(bi,t)\phi(b_{i},t) strictly single-crosses Vi1(t)V_{i-1}(t) (by Lemma 6). Put t(i,i1)=t¯t(i,i-1)=\overline{t} if the crossing does not exist.

  • 2.

    Let

    a(i)(t)={a(i1)(t) if tt(i,i1) ;bi otherwise.a^{(i)}(t)=\begin{cases}a^{(i-1)}(t)&\text{ if $t\leq t(i,i-1)$\,;}\\ b_{i}&\text{ otherwise}\,.\end{cases}

Let a(t)=a(m)(t)a(t)=a^{(m)}(t). Note that by induction, at each iteration ii, we have for all tt

ϕ(a(i)(t),t)=max{Vi1(t),ϕ(bi,t)}=Vi(t);\phi(a^{(i)}(t),t)=\max\Big{\{}V_{i-1}(t),\,\phi(b_{i},t)\Big{\}}=V_{i}(t)\,;

and therefore for all tt

ϕ(a(t),t)=Vm(t),\phi(a(t),t)=V_{m}(t)\,,

proving part (i). Note also that, by induction, at each iteration ii, we have for all t<tt<t^{\prime}

a(i)(t)a(i)(t)a^{(i)}(t)\subseteq a^{(i)}(t^{\prime})

because bbib\subset b_{i} for any b{a(i1)(t)}tTb\in\{a^{(i-1)}(t)\}_{t\in T}. So a(t)a(t)a(t)\subseteq a(t^{\prime}) for all t<tt<t^{\prime}, proving part (ii). ∎

Let aa^{\dagger} be the allocation rule constructed in Lemma 7. Order the bundles such that {a(t)}tT={,b1,,bl}\big{\{}a^{\dagger}(t)\big{\}}_{t\in T}=\{\varnothing,b_{1},\dots,b_{l}\}, where b1blb_{1}\subset\dots\subset b_{l}, for some number ll. For all j=1,,lj=1,\dots,l, let s(bj)=inf{t:a(t)=bj}s(b_{j})=\inf\big{\{}t:a^{\dagger}(t)=b_{j}\big{\}}. For all j=1,,lj=1,\dots,l, let

p(bj)=v(bj,s(bj))i<j(v(bi,s(bi+1))v(bi,s(bi))).p^{\dagger}(b_{j})=v(b_{j},s(b_{j}))-\sum_{i<j}\big{(}v(b_{i},s(b_{i+1}))-v(b_{i},s(b_{i}))\big{)}\,.

Because (i) the allocation rule aa^{\dagger} is monotone in the set-inclusion order, and (ii) the incremental value is monotone for any pair of nested bundles, observe that the prices pp^{\dagger} implement the allocation rule aa^{\dagger}.

Now, by the construction of aa^{\dagger} in Lemma 7, observe that (i) the set of assigned bundles {a(t)}tT\big{\{}a^{\dagger}(t)\big{\}}_{t\in T} is exactly equal to the menu BB constructed by Algorithm 1, (ii) we have that 1F(s(b))=q(b)1-F(s(b))=q(b) for all bBb\in B, where the quantity q(b)q(b) is constructed in Algorithm 1, and (iii) the price of upgrading from bib_{i} to bi+1b_{i+1} is given by

p(bi+1)p(bi)=v(bi+1,s(bi+1))v(bi,s(bi+1)),p^{\dagger}(b_{i+1})-p^{\dagger}(b_{i})=v(b_{i+1},s(b_{i+1}))-v(b_{i},s(b_{i+1}))\,,

coinciding with the upgrade price from bib_{i} to bi+1b_{i+1} constructed by Algorithm 1. Therefore, aa^{\dagger} is also the allocation rule induced by the menu and prices given by Algorithm 1.

Completion of the Proof of the Optimality Part.

We have

𝔼[ϕ(a(t),t)]\displaystyle\operatorname{\mathds{E}}\Big{[}\phi(a^{\dagger}(t),t)\Big{]} =supa:TU𝔼[bab(t)ϕ(b,t)]\displaystyle=\sup_{a:\,T\rightarrow U}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]} (by Lemma 7)
=supa:TΔ()𝔼[bab(t)ϕ(b,t)].\displaystyle=\sup_{a:\,T\rightarrow\Delta(\mathcal{B})}\operatorname{\mathds{E}}\Big{[}\sum_{b}a_{b}(t)\phi(b,t)\Big{]}\,. (by Lemma 5)

Therefore, by Lemma 2, selling the nested menu BUB\subseteq U given by Algorithm 1 is optimal among all stochastic mechanisms. In addition, dropping any bundle bb from the menu BB would result in a strictly suboptimal menu because, by the proof of Lemma 7, for any bBb\in B, we have

𝔼[maxbB\{b}ϕ(b,t)]<𝔼[maxbUϕ(b,t)]=𝔼[ϕ(a(t),t)].\displaystyle\operatorname{\mathds{E}}\Big{[}\max_{b^{\prime}\in B\backslash\{b\}}\phi(b^{\prime},t)\Big{]}<\operatorname{\mathds{E}}\Big{[}\max_{b^{\prime}\in U}\phi(b^{\prime},t)\Big{]}=\operatorname{\mathds{E}}\Big{[}\phi(a^{\dagger}(t),t)\Big{]}\,.

Therefore, the menu BB is minimally optimal.

Proof of the Uniqueness Part.

We now show that any optimal mechanism (a,p)(a^{\prime},p^{\prime}) must be equivalent to the nested bundling mechanism constructed by Algorithm 1. Suppose, for contradiction, that (a,p)(a^{\prime},p^{\prime}) differs from our construction on a positive-measure set of types. By the payoff equivalence theorem as in Lemma 2, it must be that the allocation rule aa^{\prime} differs from the allocation rule aa^{\dagger} (constructed in Lemma 7) on a positive-measure set of types. Let TT^{*} be that set. By the proofs of Lemma 5 and Lemma 7, we have that for all types tt,

bab(t)ϕ(b,t)maxbUϕ(b,t)=bab(t)ϕ(b,t).\sum_{b}a^{\prime}_{b}(t)\phi(b,t)\leq\max_{b\in U}\phi(b,t)=\sum_{b}a^{\dagger}_{b}(t)\phi(b,t)\,.

Let B:={a(t)}tTB:=\{a^{\dagger}(t)\}_{t\in T}. Order the bundles in BB such that B={,b1,,bl}B=\{\varnothing,b_{1},\dots,b_{l}\} where b1blb_{1}\subset\dots\subset b_{l}. For all bBb\in B, let Tb:={t:a(t)=b}T_{b}:=\{t:a^{\dagger}(t)=b\}. Note that there must exist some bBb^{\dagger}\in B such that

(tTTb)>0.\mathds{P}\big{(}t\in T^{*}\cap T_{b^{\dagger}}\big{)}>0\,.

Note also that by the construction of the allocation rule aa^{\dagger}, for all types t<sup{t:tT}=t(b1)t<\sup\{t:t\in T_{\varnothing}\}=t^{*}(b_{1}) and all bundles bb\neq\varnothing, we have ϕ(b,t)<0\phi(b,t)<0. Thus, if b=b^{\dagger}=\varnothing, then (a,p)(a^{\prime},p^{\prime}) must be strictly suboptimal, which is impossible. Now suppose bb^{\dagger}\neq\varnothing. By the same argument as in the proof of Lemma 5, for any dominated bundle bb^{\prime}, there exists some undominated b′′b^{\prime\prime} such that

ϕ(b,t)<max{0,ϕ(b′′,t)}\phi(b^{\prime},t)<\max\big{\{}0,\phi(b^{\prime\prime},t)\big{\}}

for almost all tt. Now, note that by the construction of aa^{\dagger}, for any undominated bundle b′′bb^{\prime\prime}\neq b^{\dagger}, we have

max{0,ϕ(b′′,t)}<ϕ(b,t)\max\big{\{}0,\phi(b^{\prime\prime},t)\big{\}}<\phi(b^{\dagger},t)

for almost all tTbt\in T_{b^{\dagger}}. These two observations imply that for any bundle bbb\neq b^{\dagger}, we have

ϕ(b,t)<ϕ(b,t)\phi(b,t)<\phi(b^{\dagger},t)

for almost all tTbt\in T_{b^{\dagger}}. Then (a,p)(a^{\prime},p^{\prime}) must be strictly suboptimal. A contradiction.

A.2 Proof of Theorem 2

Proof of the Sufficiency Part.

Note that this part follows directly from the proof of Theorem 1. Now, we further show that this sufficiency part holds even if we only impose monotone incremental values assumption and locally quasi-concave incremental profits assumption on the set of pairs b1b2b_{1}\subset b_{2} where b2Bb_{2}\in B. First, both Lemma 3 and Lemma 4 still hold for every pair b1b2b_{1}\subset b_{2} where b2Bb_{2}\in B. Second, note that for every b1Bb_{1}\not\in B, there exists some b2Bb_{2}\in B such that (i) b1b2b_{1}\subset b_{2} and (ii) D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}). Therefore, by the same proof as in Lemma 5, we have that it suffices to consider the allocation rules a:TBa:T\rightarrow B for the purpose of solving the relaxed problem. Third, note that for all b1,b2Bb_{1},b_{2}\in B such that b1b2b_{1}\subset b_{2}, we have that χ(b1,b2)>0\chi(b_{1},b_{2})>0 (by Lemma 4). Therefore, by the same proof as in Step 3 of the previous section (Lemma 6 and Lemma 7), we have that there exists an allocation rule a:TBa:T\rightarrow B such that (i) aa solves the relaxed problem and (ii) aa is monotone (in the set-inclusion order). Because the increasing differences property between v(b1,t)v(b_{1},t) and v(b2,t)v(b_{2},t) still holds for all b1b2b_{1}\subset b_{2} such that b1,b2Bb_{1},b_{2}\in B, we have that the allocation rule aa is implementable, proving the result by Lemma 2. The prices can be constructed in the same way as before.

Proof of the Necessity Part.

Now, we prove the second statement of Theorem 2 by contradiction. Suppose that selling menu BB is optimal and that selling any proper subset of menu BB is not optimal.

Suppose, for contradiction, that either

  • (i)

    there exist b1Bb_{1}\in B and b2Bb_{2}\in B such that b1b2b_{1}\subset b_{2} and D(b1)D(b2)D^{*}(b_{1})\leq D^{*}(b_{2}), or

  • (ii)

    there exists b1Bb_{1}\not\in B such that D(b1)>D(b2)D^{*}(b_{1})>D^{*}(b_{2}) for all non-empty b2Bb_{2}\in B.

Step 1.

Let

UB:={bB:b is not dominated by another bundle bB}.U_{B}:=\Big{\{}b\in B:\text{$b$ is not dominated by another bundle $b^{\prime}\in B$}\Big{\}}\,.

Note that for all b1B\UBb_{1}\in B\backslash U_{B}, there exists b2UBb_{2}\in U_{B} such that for all tt,

ϕ(b1,t)max{0,ϕ(b2,t)},\phi(b_{1},t)\leq\max\big{\{}0,\phi(b_{2},t)\big{\}}\,,

by the same argument in Step 2 of the previous section. Therefore, by the same arguments in Step 2 and Step 3 of the previous section, there exists a mechanism (a,p)(a,p) such that a:TUBa:T\rightarrow U_{B} and (a,p)(a,p) attains a profit of

𝔼[maxbB{ϕ(b,t)}],\operatorname{\mathds{E}}\Big{[}\max_{b\in B}\big{\{}\phi(b,t)\big{\}}\Big{]}\,,

which is an upper bound on the profit of selling menu BB. Because selling BB is optimal, this implies that the mechanism (a,p)(a,p) must be an optimal mechanism.

Step 2.

First, consider Case (i). Note that under Case (i), UBU_{B} is a strict subset of BB. Then, because (a,p)(a,p) is an optimal mechanism, we have that selling the menu UBU_{B} is also optimal, violating the minimal optimality of the menu BB. Hence, this case cannot occur.

Step 3.

Now, consider Case (ii). We construct a strict improvement. Consider offering the following new option in addition to the menu given by (a,p)(a,p): A lottery of getting bundle b1b_{1} with probability ε\varepsilon, for a price of ε×v(b1,t(b1))\varepsilon\times v(b_{1},t^{*}(b_{1})).

We compute the payoff to the monopolist after the addition of this new option. By the payoff equivalence theorem as in Lemma 2, the profit change to the monopolist after adding this new option can be computed as

𝔼[bab(t)ϕ(b,t)]𝔼[bab(t)ϕ(b,t)],\operatorname{\mathds{E}}\Bigg{[}\sum_{b}a^{\prime}_{b}(t)\phi(b,t)\Bigg{]}-\operatorname{\mathds{E}}\Bigg{[}\sum_{b}a_{b}(t)\phi(b,t)\Bigg{]}\,,

where aa^{\prime} is the induced allocation rule after the types re-adjust their optimal choices given the new option.

Let b¯\underline{b} be the smallest non-empty bundle in BB. By the construction of (a,p)(a,p), note that:

  • (i)

    all types t[t¯,t(b1))t\in[\underline{t},t^{*}(b_{1})) will not take this option ;

  • (ii)

    all types t[t(b1),t(b¯))t\in[t^{*}(b_{1}),t^{*}(\underline{b})) will switch from \varnothing to this option ;

  • (iii)

    all types t[t(b¯),t)t\in[t^{*}(\underline{b}),t^{\dagger}) will switch from b¯\underline{b} to this option, for some threshold tt^{\dagger} .

First, we provide a lower bound on the gain in the virtual surplus. Because types in [t(b1),t(b¯))[t^{*}(b_{1}),t^{*}(\underline{b})) will take this new option, the gain in the virtual surplus is at least

Gain(ε):=ε×t(b1)t(b¯)ϕ(b1,t)dF(t)=:K=εK>0,\text{Gain}(\varepsilon):=\varepsilon\times\underbrace{\int_{t^{*}(b_{1})}^{t^{*}(\underline{b})}\phi(b_{1},t)\mathop{}\!\mathrm{d}F(t)}_{=:K}=\varepsilon K>0\,,

where the inequality K>0K>0 uses the single-crossing property of ϕ(b1,t)\phi(b_{1},t).

Now, we provide an upper bound on the loss in the virtual surplus. Note that any type tt who takes this option obtains a payoff that is at most

h(ε):=ε×(v(b1,t¯)v(b1,t(b1)))=:Z=εZ.h(\varepsilon):=\varepsilon\times\underbrace{\big{(}v(b_{1},\overline{t})-v(b_{1},t^{*}(b_{1}))\big{)}}_{=:Z}=\varepsilon Z\,.

Let bb^{\prime} be the second smallest non-empty bundle in BB (if it does not exist, put t(b)=1t^{*}(b^{\prime})=1 in what follows). Note that t(b1)<t(b¯)<t(b)t^{*}(b_{1})<t^{*}(\underline{b})<t^{*}(b^{\prime}). By the construction of (a,p)(a,p), for any δ[0,t(b)t(b¯)]\delta\in[0,t^{*}(b^{\prime})-t^{*}(\underline{b})], we have

U(t(b¯)+δ)=v(b¯,t(b¯)+δ)v(b¯,t(b¯)),U(t^{*}(\underline{b})+\delta)=v(\underline{b},t^{*}(\underline{b})+\delta)-v(\underline{b},t^{*}(\underline{b}))\,,

where UU denotes the indirect utility function under (a,p)(a,p). Let g(δ):=U(t(b¯)+δ)g(\delta):=U(t^{*}(\underline{b})+\delta). Note that v(b¯,t(b¯))>0v(\underline{b},t^{*}(\underline{b}))>0 and hence vt(b¯,t(b¯))>0v_{t}(\underline{b},t^{*}(\underline{b}))>0 by assumption. Thus, +g(0)>0\partial_{+}g(0)>0. Since gg^{\prime} is continuous on [0,δ][0,\delta], there exist some constants δ¯(0,t(b)t(b¯))\overline{\delta}\in(0,t^{*}(b^{\prime})-t^{*}(\underline{b})) and M>0M>0 such that g(δ)Mg^{\prime}(\delta)\geq M for all δ[0,δ¯]\delta\in[0,\overline{\delta}]. Let ε¯:=g(δ¯)>0\overline{\varepsilon}:=g(\overline{\delta})>0. Note that for all ε(0,ε¯)\varepsilon\in(0,\overline{\varepsilon}), we have

g1(ε)=0ε(g1)(s)ds=0ε1g(g1(s))ds1Mε.g^{-1}(\varepsilon)=\int_{0}^{\varepsilon}(g^{-1})^{\prime}(s)\mathop{}\!\mathrm{d}s=\int_{0}^{\varepsilon}\frac{1}{g^{\prime}(g^{-1}(s))}\mathop{}\!\mathrm{d}s\leq\frac{1}{M}\varepsilon\,.

Note also that any type t[t(b¯),t¯]t\in[t^{*}(\underline{b}),\overline{t}] switches to this new option only if

U(t)h(ε).U(t)\leq h(\varepsilon)\,.

Let

δ(ε):=g1(h(ε)).\delta(\varepsilon):=g^{-1}(h(\varepsilon))\,.

Then, observe that for all ε(0,1Zε¯)\varepsilon\in(0,\frac{1}{Z}\overline{\varepsilon}), the loss in the virtual surplus is at most

Loss(ε)\displaystyle\text{Loss}(\varepsilon) :=t(b¯)t(b¯)+δ(ε)ϕ(b¯,t)f(t)dt\displaystyle:=\int_{t^{*}(\underline{b})}^{t^{*}(\underline{b})+\delta(\varepsilon)}\phi(\underline{b},t)f(t)\mathop{}\!\mathrm{d}t
δ(ε)×maxt[t(b¯),t(b¯)+δ(ε)]{f(t)ϕ(b¯,t)}=:Φ(ε)=δ(ε)×Φ(ε)ZMε×Φ(ε).\displaystyle\leq\delta(\varepsilon)\times\underbrace{\max_{t\in[t^{*}(\underline{b}),t^{*}(\underline{b})+\delta(\varepsilon)]}\Big{\{}f(t)\phi(\underline{b},t)\Big{\}}}_{=:\Phi(\varepsilon)}=\delta(\varepsilon)\times\Phi(\varepsilon)\leq\frac{Z}{M}\varepsilon\times\Phi(\varepsilon)\,.

Observe that (i) Φ()\Phi(\,\cdot\,) is a continuous function (by Berge’s theorem), and (ii) Φ(0)=0\Phi(0)=0 since

ϕ(b¯,t(b¯))=0.\phi(\underline{b},t^{*}(\underline{b}))=0\,.

Therefore, there exists ε¯>0\overline{\varepsilon}^{\prime}>0 such that for all ε(0,ε¯)\varepsilon\in(0,\overline{\varepsilon}^{\prime}), we have

Φ(ε)<MKZ.\Phi(\varepsilon)<\frac{MK}{Z}\,.

Now, pick any ε(0,min{1Zε¯,ε¯})\varepsilon\in\big{(}0,\,\min\{\frac{1}{Z}\overline{\varepsilon},\,\overline{\varepsilon}^{\prime}\}\big{)}. We must have

Loss(ε)ZMεΦ(ε)<εK=Gain(ε).\text{Loss}(\varepsilon)\leq\frac{Z}{M}\varepsilon\Phi(\varepsilon)<\varepsilon K=\text{Gain}(\varepsilon)\,.

So (a,p)(a,p) is strictly suboptimal. A contradiction. Figure 9 (in Section 4.2) illustrates this argument with a two-item example. Note that this proof holds even if we only impose monotone incremental values assumption and locally quasi-concave incremental profits assumption on the pairs of nested bundles where both bundles are in the menu BB.