Recovering utility
Abstract.
We provide sufficient conditions under which a utility function may be recovered from a finite choice experiment. Identification, as is commonly understood in decision theory, is not enough. We provide a general recoverability result that is widely applicable to modern theories of choice under uncertainty. Key is to allow for a monetary environment, in which an objective notion of monotonicity is meaningful. In such environments, we show that subjective expected utility, as well as variational preferences, and other parametrizations of utilities over uncertain acts are recoverable. We also consider utility recovery in a statistical model with noise and random deviations from utility maximization.
1. Introduction
Economists are often interested in recovering preferences and utility functions from data on agents’ choices. If we are able to recover a utility function, then a preference relation is obviously implied, but the inverse procedure is more delicate. In this paper, we presume access to data on an agent’s choices, and that these describe the agent’s preferences (or that preferences have been obtained as the outcome of a statistical estimation procedure). Our results describe sufficient conditions under which one can recover, or learn, a utility function from the agents’ choices.
At a high level, the problem is that preferences essentially are choices, because they encode the choice that would be made from each binary choice problem. When we write $x \succeq y$ we really mean that $x$ would be chosen from the set $\{x, y\}$. Utility functions are much richer objects, and a given choice behavior may be described by many different utilities. For example, one utility can be used to discuss an agent’s risk preferences: they could have a “constant relative risk aversion” utility, for which a single parameter describes attitudes towards risk. But the same preferences can be represented by a utility that does not have such a convenient parametrization. So recovering, or learning, utilities presents important challenges that go beyond the problem of recovering a preference. In the paper, we describe some simple examples that illustrate the challenges. Our main results describe when one may (non-parametrically) recover a utility representation from choice data.
We first consider choice under uncertainty. We adopt the standard (Anscombe-Aumann) setting of choice under uncertainty, and focus attention on a class of utility representations that has been extensively studied in the literature. Special cases include subjective expected utility, the max-min expected utility model of Gilboa and Schmeidler (1989), Choquet expected utility (Schmeidler, 1989), the variational preferences of Maccheroni, Marinacci, and Rustichini (2006), and many other popular models. Decision theorists usually place significance on the uniqueness of their utility representations, arguing that uniqueness provides an identification argument that allows for utility to be recovered from choice data. We argue, in contrast, that uniqueness of a utility representation is not enough to recover a utility from finite choice data.
Counterexamples are not hard to find. Indeed, even when a utility representation is unique, one may find a convergent sequence of utilities that is consistent with larger and larger finite datasets, but that does not converge to the utility function that generated the choices in the data, or to any utility to which it is equivalent. So uniqueness is necessary but not sufficient for a utility representation to be empirically tractable, in the sense of ensuring that a utility is recovered from large, but finite, choice experiments.
Our main results are positive, and exhibit sufficient conditions for utility recovery. Key to our results is the availability of an objective direction of improvements in utility: we focus our attention on models of monotone preferences. Our paper considers choices among monetary acts, meaning state-contingent monetary payoffs. For such acts, there is a natural notion of monotonicity. Between two acts, if one pays more in every state of the world, the agent should prefer it. As a discipline on the recovery exercise, this essential notion of monotonicity suffices to ensure that a sequence of utilities that explains the choices in the data converges to the utility function that generated the choices.
We proceed by first discussing the continuity of a utility function in its dependence on the underlying preference relation. If $U(\succeq, x)$ is a function of a preference $\succeq$ and of choice objects $x$, then we say that it is a utility function if $x \mapsto U(\succeq, x)$ represents $\succeq$. We draw on the existing literature (Theorem 1) to argue that such continuous utilities exist in very general circumstances. Continuity of this mapping in the preference ensures that if the choice data allow for preference recovery, they also allow a utility to be recovered. The drawback, however, of such general utility representation results is that they do not cover the special theories of utility in which economists generally take interest. There is no reason to expect that the utility coincides with the standard parametrizations of, for example, subjective expected utility or variational preferences.
We then go on to our main exercise, which constrains the environment to the Anscombe-Aumann setting, and considers utility representations that have received special attention in the theory of choice under uncertainty. We consider a setup that is flexible enough to accommodate most theories of choice under uncertainty that have been studied in the literature. Our main result (Theorem 2) says that, whenever a choice experiment succeeds in recovering agents’ underlying preferences, it also serves to recover a utility in the class of utilities of interest. For example, if an agent has subjective expected utility preferences, and these can be recovered from a choice experiment, then so can the parameters of the subjective expected utility representation: the agents’ beliefs and Bernoulli utility index. Or, if the agent has variational preferences that can be inferred from choice data, then so can the different components of the variational utility representation.
Actual data on choices may be subject to sampling noise, and may come from agents who randomly deviate from their preferences. The results we have just mentioned are useful in such settings, once the randomness in preference estimates is taken into account. As a complement to our main findings, we proceed with a model that explicitly takes noisy choice, and randomness, into account. Specifically, we consider choice problems that are sampled at random, and an agent who may deviate from their preferences by making mistakes. In such a setting, we present sufficient conditions for the consistency of utility function estimates (Theorem 3).
In the last part of the paper we take a step back and revisit the problem of preference recovery, with the goal of showing how data from a finite choice experiment can approximate a preference relation, and, in consequence, a utility function. Our model considers a large, but finite, number of binary choices. We show that when preferences are monotone, then preference recovery is possible (Theorem 5). In such environments, utility recovery follows for the models of choice under uncertainty that we have been interested in (Corollary 1).
Related literature.
The literature on revealed preference theory in economics is primarily devoted to tests for consistency with rational choice. The main result in the literature, Afriat’s theorem (Afriat, 1967a; Diewert, 1973; Varian, 1982), is in the context of standard demand theory (assuming linear budgets and a finite dataset). Versions of Afriat’s result have been obtained in a model with infinite data (Reny, 2015), nonlinear budget sets (e.g., Matzkin, 1991; Forges and Minelli, 2009), general choice problems (e.g., Chavas and Cox, 1993; Nishimura, Ok, and Quah, 2017), and multiperson equilibrium models (e.g., Brown and Matzkin, 1996; Carvajal, Deb, Fenske, and Quah, 2013). Algorithmic questions related to revealed preference are discussed by Echenique, Golovin, and Wierman (2011) and Camara (2022). The monograph by Chambers and Echenique (2016) presents an overview of results.
The revealed preference literature is primarily concerned with describing the datasets that are consistent with the theory, not with recovering or learning a preference, or a utility. In the context of demand theory and choice from linear budgets, Mas-Colell (1978) introduces sufficient conditions under which a preference relation is recovered, in the limit, from a sequence of ever richer demand data observations. More recently, Forges and Minelli (2009) derive the analog of Mas-Colell’s results for nonlinear budget sets. An important strand of literature focuses on non-parametric econometric estimation methods applied to demand theory data: Blundell, Browning, and Crawford (2003, 2008) propose statistical tests for revealed preference data, and consider counterfactual bounds on demand changes.
The problem of preference and utility recovery has been studied from the perspective of statistical learning theory. Beigman and Vohra (2006) consider the problem of learning a demand function within the PAC paradigm, which is closely related to the exercise we perform in Section 4. A key difference is that we work with data on pairwise choices, which are common in experimental settings (including in many recent large-scale online experiments). Zadimoghaddam and Roth (2012) look at the utility recovery problem, as in Beigman and Vohra (2006), but instead of learning a demand function they want to understand when a utility can be learned efficiently. Balcan, Daniely, Mehta, Urner, and Vazirani (2014) follow up on this important work by providing sample complexity guarantees, while Ugarte (2022) considers the problem of recovery of preferences under noisy choice data, as in our paper, but within the demand theory framework. Similarly, the early work of Balcan, Constantin, Iwata, and Wang (2012) considers a PAC learning question, focusing on important sub-classes of valuations in economics. Bei, Chen, Garg, Hoefer, and Sun (2016) pursue the problem assuming that a seller proposes budgets with the objective of learning an agent’s utility (they focus on quasilinear utility, and a seller that obtains aggregate demand data). Zhang and Conitzer (2020) consider this problem under an active-learning paradigm, and contrast it with the PAC sample complexity.
In all, these works are important precedents for our paper, but they are all within the demand theory setting. The results do not port to other environments, such as, for example, binary choice under risk or uncertainty. The closest paper to ours is Chambers, Echenique, and Lambert (2021), which looks at a host of questions related to our paper, but focuses on preference, not utility, recovery. The work by Chambers, Echenique, and Lambert considers choices from binary choice problems, but does not address the question of recovering, or learning, a utility function. As we explain below in the paper, the problem for utilities is more delicate than the problem for preferences. In this line of work, Chase and Prasad (2019) obtain important results on learning a utility, but restricted to settings of intertemporal choice. The work by Basu and Echenique (2020) looks at learnability of utility functions (within the PAC learning paradigm), but focusing on particular models of choice under uncertainty. Some of our results rely on measures of the richness of a theory, or of a family of preferences, which are discussed by Basu and Echenique (2020) and Fudenberg, Gao, and Liang (2021): the former by estimating the VC dimension of theories of choice under uncertainty, and the latter by proposing and analyzing new measures of richness that are well-suited for economics, as well as implementing them on economic datasets.
2. The Question
We want to understand when utilities can be recovered from data on an agent’s choices. Consider an agent with a utility function $u^*$. We want to know when, given enough data on the agent’s choices, we can “estimate” or “recover” a utility function that is guaranteed to be close to $u^*$.
In statistical terminology, recovery is analogous to the consistency of an estimator, and approximation guarantees are analogous to learnability. Imagine a dataset of size $k$, obtained from an incentivized experiment with $k$ different choice problems. (Such datasets are common in experimental economics, including cases with very large $k$. See, for example, von Gaudecker, van Soest, and Wengstrom (2011), Chapman, Dean, Ortoleva, Snowberg, and Camerer (2017), Chapman, Dean, Ortoleva, Snowberg, and Camerer (2022) and Falk, Becker, Dohmen, Enke, Huffman, and Sunde (2018). One can also apply our results to roll call data from congress, as in Poole and Rosenthal (1985) or Clinton, Jackman, and Rivers (2004). Large-scale A/B testing by tech firms may provide further examples, albeit involving proprietary datasets.) The observed choice behavior in the data may be described by a preference $\succeq^k$, which is associated with a utility function $u^k$. The preference $\succeq^k$ could be a rationalizing preference, or a preference estimate. So we choose a utility representation $u^k$ for $\succeq^k$. The recovery, or consistency, property is that $u^k \to u^*$ as $k \to \infty$.
Suppose that the utility $u^*$ represents preferences $\succeq^*$, which summarize the agent’s full choice behavior. Clearly, unless $\succeq^k \to \succeq^*$, the exercise is hopeless. So our first order of business is to understand when $\succeq^k \to \succeq^*$ is enough to ensure that $u^k \to u^*$. In other words, we want to understand when recovering preferences is sufficient for recovering utilities. To this end, our main results are in Section 3.4. In recovering a utility, we are interested in particular parametric representations. In choice over uncertainty, for example, one may be interested in measures of risk-attitudes, or uncertainty aversion. It is key then that the utility recovery exercise preserves the aspects of utility that allow such measures to have meaning. If, say, preferences have the “constant relative risk aversion” (CRRA) form, then we want to recover the Arrow-Pratt measure of risk aversion.
Our data is presumably obtained in an experimental setting, where an agent’s behavior may be recorded with errors, or in which the agent may randomly deviate from their underlying preference $\succeq^*$. Despite such errors, with high probability, “on the sample path,” we should obtain that $\succeq^k \to \succeq^*$. In our paper we uncover situations where this convergence leads to utility recovery. Indeed, the results in Sections 3.4 and 3.5 may be applied to say that, in many popular models in decision theory, when $\succeq^k \to \succeq^*$ (with high probability), then the resulting utility representations enable utility recovery (with high probability).
The next step is to discuss learning and sample complexity. Here we need to explicitly account for randomness and errors. We lay out a model of random choice, with random sampling of choice problems and errors in agents’ choices. The errors may take a very general form, as long as random choices are more likely to go in the direction of preferences than against them (if $x \succ y$ then $x$ is the more likely choice from the choice problem $\{x, y\}$), and as long as this likelihood ratio remains bounded away from one. Contrast with the standard theory of discrete choice, where the randomness usually is taken to be additive, and independent of the particular pair of alternatives that are being compared.
Here we consider a formal statistical consistency problem, and exhibit situations where utility recovery is feasible. We use ideas from the literature on PAC learning to provide formal finite sample-size bounds for each desired approximation guarantee. See Section 4.
3. The Model
3.1. Basic definitions and notational conventions
Let $X$ be a set. Given a binary relation $R \subseteq X \times X$, we write $x \mathrel{R} y$ when $(x, y) \in R$. A binary relation that is complete and transitive is called a weak order. If $X$ is a topological space, then we say that $R$ is continuous if $R$ is closed as a subset of $X \times X$ (see, for example, Bergstrom, Parks, and Rader, 1976). A preference relation is a weak order that is also continuous.
A preference relation $\succeq$ is locally strict if, for all $x, y \in X$, $x \succeq y$ implies that for each neighborhood $U$ of $(x, y)$, there is $(x', y') \in U$ with $x' \succ y'$. The notion of local strictness was first introduced by Border and Segal (1994) as a generalization of the property of being locally non-satiated from consumer theory.
If $\succeq$ is a preference on $X$ and $u : X \to \mathbb{R}$ is a function for which $x \succeq y$ if and only if $u(x) \geq u(y)$ then we say that $u$ is a representation of $\succeq$, or that $u$ is a utility function for $\succeq$.
If $A$ is a Borel set, we write $\Delta(A)$ for the set of all Borel probability measures on $A$. We endow $\Delta(A)$ with the weak* topology. If $S$ is a finite set, then we topologize $\Delta(A)^S$ with the product topology.
For $\mu, \nu \in \Delta(A)$, we say that $\mu$ is larger than $\nu$ in the sense of first-order stochastic dominance if $\int f \, d\mu \geq \int f \, d\nu$ for all monotone increasing, continuous and bounded functions $f$ on $A$.
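For lotteries with finitely many monetary payoffs, the integral definition of first-order stochastic dominance is equivalent to a pointwise comparison of cumulative distribution functions: the dominating lottery has a CDF that is everywhere weakly below the other. The following Python sketch checks this condition. The representation of a lottery as a dictionary from payoffs to probabilities, and the function names `cdf` and `fosd`, are illustrative assumptions and not part of the paper.

```python
def cdf(lottery, t):
    """P(payoff <= t) for a finite-support lottery {payoff: probability}."""
    return sum(p for x, p in lottery.items() if x <= t)


def fosd(mu, nu, tol=1e-12):
    """True if mu weakly first-order stochastically dominates nu.

    Both CDFs are step functions that jump only at support points,
    so it suffices to compare them on the union of the two supports.
    """
    support = sorted(set(mu) | set(nu))
    return all(cdf(mu, t) <= cdf(nu, t) + tol for t in support)


# Hypothetical lotteries over the payoffs 0, 5 and 10.
mu = {0: 0.2, 10: 0.8}
nu = {0: 0.5, 10: 0.5}
sure_five = {5: 1.0}

print(fosd(mu, nu))         # True: mu shifts mass towards the higher payoff
print(fosd(nu, mu))         # False
print(fosd(sure_five, nu))  # False: the two lotteries are not ranked
print(fosd(nu, sure_five))  # False
```

The last two lines illustrate that first-order stochastic dominance is only a partial order: a sure payoff and a mean-preserving spread of it are not comparable.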
3.2. Topologies on preferences and utilities.
The set of preferences over $X$, when $X$ is a topological space, is endowed with the topology of closed convergence. The space of corresponding utility representations is endowed with the compact-open topology. These are the standard topologies for preferences and utilities, used in prior work in mathematical economics. See, for example, Hildenbrand (1970), Kannai (1970), and Mas-Colell (1974). Here we offer definitions and a brief discussion of our choice of topology.
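Let $X$ be a topological space, and $\{F_n\}$ be a sequence of closed sets in $X \times X$ (with the product topology). We define $\mathrm{Li}(F_n)$ and $\mathrm{Ls}(F_n)$ to be closed subsets of $X \times X$ as follows:
• $(x, y) \in \mathrm{Li}(F_n)$ if and only if, for all neighborhoods $V$ of $(x, y)$, there exists $N \in \mathbb{N}$ such that $F_n \cap V \neq \emptyset$ for all $n \geq N$.
• $(x, y) \in \mathrm{Ls}(F_n)$ if and only if, for all neighborhoods $V$ of $(x, y)$, and all $N \in \mathbb{N}$, there is $n \geq N$ such that $F_n \cap V \neq \emptyset$.
Observe that $\mathrm{Li}(F_n) \subseteq \mathrm{Ls}(F_n)$. The definition of closed convergence is as follows.
Definition 1.
$F_n$ converges to $F$ in the topology of closed convergence if $\mathrm{Li}(F_n) = F = \mathrm{Ls}(F_n)$.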
Closed convergence captures the property that agents with similar preferences should have similar choice behavior, a property that is necessary to be able to learn the preference from finite data. Specifically, if $X \subseteq \mathbb{R}^d$, and $\mathcal{P}$ is the set of all locally strict and continuous preferences on $X$, then the topology of closed convergence is the smallest topology on $\mathcal{P}$ for which the sets
$\{(x, y, \succeq) : x \in X,\ y \in X,\ \succeq \in \mathcal{P},\ x \succ y\}$
are open. (See Kannai (1970) and Hildenbrand (1970) for a discussion; a proof of this claim is available from the authors upon request.) In words: suppose that $x \succ y$, then for $x'$ close to $x$, $y'$ close to $y$, and $\succeq'$ close to $\succeq$, we obtain that $x' \succ' y'$.
For utility functions, we adopt the compact-open topology, which we also claim is a natural choice of topology. The compact-open topology is characterized by the convergence criterion of uniform convergence on compact sets. The reason it is natural for utility functions is that a utility usually has two arguments: one is the object being “consumed” (a lottery, for example) and the other is the ordinal preference that utility is meant to represent. (The preference argument is usually implicit, but of course it remains a key aspect of the exercise.) Now an analyst wants the utility to be “jointly continuous,” or continuous in both of its arguments. For such a purpose, the natural topology on the set of utilities, when they are viewed solely as functions of consumption, is indeed the compact-open topology. More formally, consider the following result, originally due to Mas-Colell (1977). (Levin (1983) provides a generalization to incomplete preferences.)
Theorem 1.
Let $X$ be a locally compact Polish space, and $\mathcal{P}$ the space of all continuous preferences on $X$ endowed with the topology of closed convergence. Then there exists a continuous function $U : \mathcal{P} \times X \to [0, 1]$ so that $x \mapsto U(\succeq, x)$ represents $\succeq$.
3.3. The model
As laid out in Section 2, we want to understand when we may conclude that $u^k \to u^*$ from knowing that $\succeq^k \to \succeq^*$. Mas-Colell’s theorem (Theorem 1) provides general conditions under which there exists one utility representation that has the requisite convergence property, but he is clear about the practical limitations of his result: “There is probably not a simple constructive (“canonical”) method to find a function.” In contrast, economists are generally interested in specific parameterizations of utility.
For example, if an agent has subjective expected-utility preferences, economists want to estimate beliefs and a von-Neumann-Morgenstern index; not some arbitrary representation of the agent’s preferences. Or, if the data involve intertemporal choices, and the agent discounts utility exponentially, then an economist will want to estimate their discount factor. Such specific parameterizations of utility are not meaningful in the context of Theorem 1.
The following (trivial) example shows that there is indeed a problem to be studied. Convergence of arbitrary utility representations to the correct limit is not guaranteed, even when recovered utilities form a convergent sequence, and recovered preferences converge to the correct limit.
Example 1.
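Consider expected-utility preferences on $\Delta(X)^S$, where $X$ is a compact space, $S$ a finite set of states, and $\Delta(X)^S$ is the set of Anscombe-Aumann acts. Fix an affine function $v : \Delta(X) \to \mathbb{R}$, a prior $\mu \in \Delta(S)$, and consider the preference $\succeq$ with representation $\int_S v(f(s)) \, d\mu(s)$.
Now if we set $\succeq^k = \succeq$ then $\succeq^k \to \succeq$ holds trivially. However, it is possible to choose an expected utility representation $\int_S v^k(f(s)) \, d\mu^k(s)$ that does not converge to a utility representation (of any kind) for $\succeq$. In fact one could choose a $\mu^k = \mu$ and a “normalization” for $v^k$, for example $\|v^k\| = 1$ (imagine for concreteness that $X$ is finite, and use the Euclidean norm for $v^k$). Specifically, choose scalars $\beta^k$ with $\|\tfrac{1}{k} v + \beta^k\| = 1$. Then the utility $\int_S [\tfrac{1}{k} v(f(s)) + \beta^k] \, d\mu(s)$ represents $\succeq$ and converges to a constant function.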
The punchline is that the limiting utility represents the preference that exhibits complete indifference among all acts. This is true, no matter what the original preference was.
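The construction in Example 1 can be checked numerically. The sketch below takes a hypothetical utility vector `v` over three outcomes, rescales it by $1/k$, and adds the constant that restores unit Euclidean norm. The specific numbers, the choice of the larger root for the constant, and the function name `normalized_rep` are assumptions made for illustration; only the qualitative behavior (same ranking for every $k$, constant limit) is taken from the example.

```python
import math


def normalized_rep(v, k):
    """Return w = v/k + beta, with beta chosen so that ||w||_2 = 1.

    w is a positive affine transformation of v, so it ranks outcomes
    (and, for a fixed prior, acts) exactly as v does.
    """
    m = len(v)
    a = [x / k for x in v]
    # ||a + beta||^2 = 1 is the quadratic m*beta^2 + 2*s1*beta + (s2 - 1) = 0.
    s1 = sum(a)
    s2 = sum(x * x for x in a)
    disc = s1 * s1 - m * (s2 - 1.0)
    if disc < 0:
        raise ValueError("no unit-norm translate exists for this k")
    beta = (-s1 + math.sqrt(disc)) / m  # larger root, an arbitrary choice
    return [x + beta for x in a]


v = [0.0, 1.0, 3.0]  # hypothetical utilities of three outcomes
for k in (10, 100, 1000, 10**6):
    w = normalized_rep(v, k)
    norm = math.sqrt(sum(x * x for x in w))
    spread = max(w) - min(w)
    print(k, [round(x, 6) for x in w], round(norm, 6), spread)
```

Every `w` has norm one and orders the outcomes as `v` does, yet the spread between the best and worst outcome is $3/k$, so the sequence approaches the constant vector with entries $1/\sqrt{3}$. The limit represents total indifference, as the example claims.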
In the example, we have imposed some discipline on the representation. Given that the utility converges to a constant, the discipline we have chosen is a particular normalization of the utility representations (their norm is constant). The normalization just makes the construction of the example slightly more challenging, and reflects perhaps the most basic care that an analyst could impose on the recovery exercise.
3.4. Anscombe-Aumann acts
We present our first main result in the context of Anscombe-Aumann acts, the workhorse model of the modern theory of decisions under uncertainty. Let $S$ be a finite set of states of the world, and fix a closed interval of the real line $[a, b] \subseteq \mathbb{R}$. An act is a function $f : S \to \Delta([a, b])$. We interpret the elements of $\Delta([a, b])$ as monetary lotteries, so that acts are state-contingent monetary lotteries. The set of all acts is $\Delta([a, b])^S$. When $p \in \Delta([a, b])$, we denote the constant act that is identically equal to $p$ by $(p, \ldots, p)$; or sometimes by $p$ for short.
Note that we do not work with abstract, general, Anscombe-Aumann acts, but in assuming monetary lotteries we impose a particular structure on the objective lotteries in our Anscombe-Aumann framework. The reason is that our theory necessitates a certain known and objective direction of preference. Certain preference comparisons must be known a priori: monotonicity of preference will do the job, but for monotonicity to be objective we need the structure of monetary lotteries.
An act $f$ dominates an act $g$ if, for all $s \in S$, $f(s)$ first-order stochastically dominates $g(s)$. And $f$ strictly dominates $g$ if, for all $s \in S$, $f(s)$ strictly first-order stochastically dominates $g(s)$. A preference $\succeq$ over acts is weakly monotone if $f \succeq g$ whenever $f$ first-order stochastically dominates $g$.
Let $\mathcal{U}$ be the set of all continuous and monotone weakly increasing functions $u : [a, b] \to \mathbb{R}$ with $u(a) = 0$ and $u(b) = 1$. A pair $(V, u)$ is a standard representation if $V : \Delta([a, b])^S \to \mathbb{R}$ and $u \in \mathcal{U}$ are continuous functions such that $V(p, \ldots, p) = \int u \, dp$, for all constant acts $(p, \ldots, p)$. Moreover, we say that a standard representation is aggregative if there is an aggregator $H : [0, 1]^S \to \mathbb{R}$ with $V(f) = H\big((\int u \, df(s))_{s \in S}\big)$ for $f \in \Delta([a, b])^S$. An aggregative representation with aggregator $H$ is denoted by $(V, u, H)$. Observe that a standard representation rules out total indifference.
A preference $\succeq$ on $\Delta([a, b])^S$ is standard if it is weakly monotone, and there is a standard representation $(V, u)$ in which $V$ represents $\succeq$. Roughly, standard preferences will be those that satisfy the expected utility axioms across constant acts, and are monotone with respect to the (statewise) first order stochastic dominance relation. Aggregative preferences will additionally satisfy an analogue of Savage’s P3 or the Anscombe-Aumann notion of monotonicity.
Example 2.
Variational preferences (Maccheroni, Marinacci, and Rustichini, 2006) are standard and aggregative. (Variational preferences are widely used in macroeconomics and finance to capture decision makers’ concerns for using a misspecified model. Here it is important to recover the different components of a representation, $v$ and $c$, because they quantify key features of the environment. See for example Hansen and Sargent (2001); Hansen, Sargent, Turmuhambetova, and Williams (2006); Hansen and Sargent (2022).) Let
$V(f) = \inf\left\{ \int_S v(f(s)) \, d\pi(s) + c(\pi) : \pi \in \Delta(S) \right\}$
where
(1) $v : \Delta([a, b]) \to \mathbb{R}$ is continuous and affine.
(2) $c : \Delta(S) \to [0, \infty]$ is lower semicontinuous, convex and grounded (meaning that $\inf\{c(\pi) : \pi \in \Delta(S)\} = 0$).
Note that $V(p, \ldots, p) = v(p) = \int u \, dp$, by the assumption that $c$ is grounded, and where the existence of $u$ so that $v(p) = \int u \, dp$ is an instance of the Riesz representation theorem. It is clear that we may choose $u \in \mathcal{U}$. So $(V, u)$ is a standard representation.
Letting $H : [0, 1]^S \to \mathbb{R}$ be defined by $H(x) = \inf\left\{ \sum_{s \in S} x(s) \pi(s) + c(\pi) : \pi \in \Delta(S) \right\}$, we see that indeed $(V, u, H)$ is also an aggregative representation of these preferences.
Some other examples of aggregative preferences include special cases of the variational model (Gilboa and Schmeidler, 1989), as well as generalizations of it (Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio, 2011; Chandrasekher, Frick, Iijima, and Le Yaouanq, 2021), and others which are not comparable (Schmeidler, 1989; Chateauneuf, Grabisch, and Rico, 2008; Chateauneuf and Faro, 2009). (A class of variational preferences that are of particular interest to computer scientists are preferences with a max-min representation (Gilboa and Schmeidler, 1989). These evaluate acts by $V(f) = \inf\left\{ \int_S v(f(s)) \, d\pi(s) : \pi \in \Pi \right\}$ with $\Pi \subseteq \Delta(S)$ a closed and convex set. Here $c$ is the indicator function of $\Pi$, as defined in convex analysis.)
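To make the variational and max-min evaluations concrete, the following Python sketch computes the value of an act when the minimization runs over a finite list of priors. Restricting to finitely many priors is an assumption made for illustration: it is exact for the max-min model when the list contains the extreme points of a polytope of priors, and only an approximation for a general convex cost. The names `expected_utility` and `variational_value`, the linear index `u`, and the numbers are hypothetical.

```python
def expected_utility(act, prior, u):
    """Expected utility of an act under one prior.

    act:   list over states of lotteries {payoff: probability}
    prior: list of state probabilities, same length as act
    u:     Bernoulli utility index on monetary payoffs
    """
    return sum(
        pi * sum(p * u(x) for x, p in lottery.items())
        for pi, lottery in zip(prior, act)
    )


def variational_value(act, priors, cost, u):
    """Minimum over the listed priors of expected utility plus cost."""
    return min(expected_utility(act, pi, u) + cost(pi) for pi in priors)


# Payoffs in [0, 10] with u(0) = 0 and u(10) = 1, matching the normalization.
u = lambda x: x / 10.0
priors = [(0.3, 0.7), (0.6, 0.4)]   # hypothetical extreme points of the prior set
zero_cost = lambda pi: 0.0          # indicator cost restricted to the prior set

bet_on_state_1 = [{10: 1.0}, {0: 1.0}]   # pays 10 in state 1, 0 in state 2
constant_five = [{5: 1.0}, {5: 1.0}]     # pays 5 in every state

print(variational_value(bet_on_state_1, priors, zero_cost, u))  # 0.3, the worst-case prior
print(variational_value(constant_five, priors, zero_cost, u))   # 0.5 = u(5)
```

The second line of output illustrates the defining property of a standard representation: on constant acts the value coincides with the expected utility of the lottery, whatever the set of priors.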
Theorem 2.
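Let $\succeq$ be a standard preference with standard representation $(V, u)$, and $\{\succeq^k\}$ a sequence of standard preferences, each with a standard representation $(V^k, u^k)$.
(1) If $\succeq^k \to \succeq$, then $(V^k, u^k) \to (V, u)$.
(2) If, in addition, these preferences are aggregative with representations $(V^k, u^k, H^k)$ and $(V, u, H)$, then $H^k \to H$.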
In terms of interpretation, Theorem 2 suggests that, as preferences converge, risk-attitudes, or von Neumann-Morgenstern utility indices, also converge in a pointwise sense. The aggregative part claims that we can study the convergence of risk attitudes and the convergence of the aggregator controlling for risk separately. So, for example, in the multiple priors case, two decision makers whose preferences are close will have similar sets of priors.
3.5. Preferences over lotteries and certainty equivalents
In this section, we focus on a canonical representation for preferences over lotteries: the certainty equivalent. There are many models of preferences over lotteries, but we have in mind in particular Cerreia-Vioglio, Dillenberger, and Ortoleva (2015), whereby a preference representation over lotteries is given by $U(p) = \inf_{v \in \mathcal{W}} v^{-1}\left( \int v \, dp \right)$; a minimum over a set of certainty equivalents for expected utility maximizers. Key is that for this representation, and any degenerate lottery $\delta_x$, $U(\delta_x) = x$.
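Let $[a, b]$, where $a < b$, be an interval in the real line and consider $\Delta([a, b])$. Say that $\succeq$ on $\Delta([a, b])$ is certainty monotone if whenever $p$ first order stochastically dominates $q$, then $p \succeq q$, and for all $x, y \in [a, b]$ for which $x > y$, $\delta_x \succ \delta_y$. Any certainty monotone continuous preference $\succeq$ and any lottery $p$ then possesses a unique certainty equivalent $x \in [a, b]$, satisfying $\delta_x \sim p$. To this end, we define $\mathrm{ce}(\succeq, p)$ to be the certainty equivalent of $p$ for $\succeq$. It is clear that, fixing $\succeq$, $\mathrm{ce}(\succeq, \cdot)$ is a continuous utility representation of $\succeq$.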
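As an illustration of the certainty equivalent representation, the sketch below computes the certainty equivalent of a finite-support lottery under several expected-utility indices and then takes the minimum, in the spirit of the representation of Cerreia-Vioglio, Dillenberger, and Ortoleva (2015). The CRRA family, the three risk-aversion parameters, the restriction to strictly positive payoffs, and all function names are assumptions chosen for the example.

```python
def crra(gamma):
    """Return (v, v_inverse) for a CRRA index with parameter gamma != 1.

    Assumes strictly positive payoffs.
    """
    def v(x):
        return x ** (1.0 - gamma) / (1.0 - gamma)

    def v_inv(y):
        return ((1.0 - gamma) * y) ** (1.0 / (1.0 - gamma))

    return v, v_inv


def certainty_equivalent(lottery, v, v_inv):
    """Sure payoff that an expected-utility agent with index v finds
    indifferent to the lottery {payoff: probability}."""
    return v_inv(sum(p * v(x) for x, p in lottery.items()))


def cautious_ce(lottery, family):
    """Minimum certainty equivalent over a finite family of indices."""
    return min(certainty_equivalent(lottery, v, v_inv) for v, v_inv in family)


family = [crra(0.0), crra(0.5), crra(2.0)]  # risk neutral to fairly risk averse

coin_flip = {1.0: 0.5, 9.0: 0.5}
sure_four = {4.0: 1.0}

print(cautious_ce(coin_flip, family))  # 9/5, attained by the most risk-averse index
print(cautious_ce(sure_four, family))  # 4, up to floating-point rounding
```

The degenerate lottery is mapped to its own payoff, which is the property highlighted in the text: the representation is pinned down on sure amounts, so it cannot be rescaled the way the utilities in Example 1 were.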
Proposition 1.
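Let $\succeq$ be a certainty monotone preference and let $p \in \Delta([a, b])$. Let $\{\succeq^k\}$ be a sequence of certainty monotone preferences and let $\{p^k\}$ be a sequence in $\Delta([a, b])$. If $(\succeq^k, p^k) \to (\succeq, p)$, then $\mathrm{ce}(\succeq^k, p^k) \to \mathrm{ce}(\succeq, p)$.
In other words, the map carrying each preference to its certainty equivalent representation is a continuous map in the topology of closed convergence.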
4. Utility recovery with noisy choice data
We develop a model of noisy choice data, and consider when utility may be recovered from a traditional estimation procedure. Recovery here takes the form of an explicit consistency result, together with sample complexity bounds in a PAC learning framework.
The focus is on the Wald representation, analogous to the certainty equivalent we considered in Section 3.5. When choosing among vectors in $\mathbb{R}^d$, the Wald representation is $u : X \to \mathbb{R}$ so that
$x \sim (u(x), \ldots, u(x))$.
If the choice space is well behaved, a Wald representation exists for any monotone and continuous preference relation. To this end, we move beyond the Anscombe-Aumann setting that we considered above, but it should be clear that some versions of Anscombe-Aumann can be accommodated within the assumptions of this section.
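The Wald representation can be computed from a preference alone, without knowing any utility in closed form. A minimal Python sketch, assuming access to a weak-preference oracle that is monotone and continuous, finds the constant bundle indifferent to a given bundle by bisection. The oracle built from a Cobb-Douglas function, the bracketing interval, and the name `wald_utility` are illustrative assumptions.

```python
def wald_utility(x, weakly_prefers, lo, hi, tol=1e-9):
    """Approximate the alpha with x ~ (alpha, ..., alpha) by bisection.

    weakly_prefers(y, z) returns True when y is weakly preferred to z.
    Assumes the preference is monotone and continuous, and that the
    constant bundles at lo and hi bracket x in the preference order.
    """
    d = len(x)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weakly_prefers(x, (mid,) * d):
            lo = mid   # x is at least as good as the constant bundle: go up
        else:
            hi = mid   # the constant bundle is strictly better: go down
    return 0.5 * (lo + hi)


# Hypothetical homothetic preference, used only through the oracle.
def cobb_douglas(x):
    return x[0] ** 0.3 * x[1] ** 0.7


def oracle(y, z):
    return cobb_douglas(y) >= cobb_douglas(z)


x = (1.0, 4.0)
u_x = wald_utility(x, oracle, lo=0.0, hi=10.0)
u_2x = wald_utility((2.0, 8.0), oracle, lo=0.0, hi=20.0)
print(u_x, u_2x)  # u_2x is twice u_x up to the tolerance
```

Because the preference in this example is homothetic, the computed Wald utility doubles when the bundle is doubled, which is the homogeneity property used later for homothetic environments.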
Our main results for the model that explicitly accounts for noisy choice data assume Wald representations that are either Lipschitz or homogeneous (meaning that preferences are homothetic).
4.1. Noisy choice data
The primitives of our noisy choice model are collected in the tuple $(X, \mathcal{P}, \lambda, q)$, where:
• $X \subseteq \mathbb{R}^d$ is the ambient choice, or consumption, space. The set $X$ is endowed with the (relative) topology inherited from $\mathbb{R}^d$.
• $\mathcal{P}$ is a class of continuous and locally strict preferences on $X$. The class $\mathcal{P}$ comes with a set of utility functions $\mathcal{U}$, so that each element of $\mathcal{P}$ has a utility representation in the set $\mathcal{U}$.
• $\lambda$ is a probability measure on $X$, assumed to be absolutely continuous with respect to Lebesgue measure. We also assume that $\lambda \geq c \, \mathrm{Leb}$, where $c > 0$ is a constant and Leb denotes Lebesgue measure.
• $q$ is a random choice function, so $q(\succeq; x, y)$ is the probability that an agent with preferences $\succeq$ chooses $x$ over $y$. Assume that if $x \succ y$, then $x$ is chosen with probability $q(\succeq; x, y) > 1/2$ and $y$ with probability $1 - q(\succeq; x, y)$. If $x \sim y$ then $x$ and $y$ are chosen with equal probability.
• We shall assume that the error probability satisfies $\inf\{q(\succeq; x, y) : x \succ y,\ \succeq \in \mathcal{P}\} > 1/2$.
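The tuple $(X, \mathcal{P}, \lambda, q)$ describes a data-generating process for noisy choice data. Fix a sample size $n$ and consider an agent with preference $\succeq^* \in \mathcal{P}$. A sequence of choice problems $\{x_i, y_i\}$, $1 \leq i \leq n$, is obtained by drawing $x_i$ and $y_i$ from $X$, independently, according to the law $\lambda$. Then a choice is made from each problem $\{x_i, y_i\}$ according to $q(\succeq^*; x_i, y_i)$.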
Observe that our assumptions on are mild. We allow errors to depend on the pair under consideration, almost arbitrarily. The only requirement is that one is more likely to choose according to one’s preference than to go against them, as well as the more technical assumptions of measurability and a control on how large the deviation from - choice may get.
To keep track of the chosen alternative, we order the elements of each problem so that $(x, y)$ means that $x$ was chosen from the choice problem $\{x, y\}$. So a sample of size $n$ is $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, consisting of $n$ iid draws from $X \times X$ according to our stochastic choice model: in the $i$th draw, the choice problem was $\{x_i, y_i\}$ and $x_i$ was chosen.
A utility function $u_n \in \mathcal{U}$ is chosen to maximize the number of rationalized choices in the data. So $u_n$ maximizes $\sum_{i=1}^{n} \mathbf{1}\{u(x_i) > u(y_i)\}$. The space of utility functions is endowed with a metric, $\rho$. In this section, all we ask of $\rho$ is that, for any $u, u' \in \mathcal{U}$, there is $x \in X$ with $|u(x) - u'(x)| \geq \rho(u, u')$. For example, we could use the sup norm for the purposes of any of the results in this section.
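The data-generating process and the estimator just described can be simulated in a few lines. The Python sketch below is a simplified instance under several stated assumptions that are not in the paper: the choice space is the unit square with the uniform law, utilities are linear with weights summing to one (so each is a Wald representation), the error probability is a constant 0.1 for every pair, and the estimator searches a grid of eleven candidate weights.

```python
import random


def linear_utility(w):
    """u_w(x) = w*x1 + (1-w)*x2; note that u_w(a, a) = a, as for a Wald representation."""
    return lambda x: w * x[0] + (1.0 - w) * x[1]


def sample_choices(u_true, n, error, rng):
    """Draw n binary problems uniformly from [0,1]^2 and record (chosen, rejected).

    With probability `error` the agent picks the worse alternative.
    """
    data = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = (rng.random(), rng.random())
        chosen, rejected = (x, y) if u_true(x) >= u_true(y) else (y, x)
        if rng.random() < error:
            chosen, rejected = rejected, chosen
        data.append((chosen, rejected))
    return data


def estimate(data, candidates):
    """Return the candidate weight that rationalizes the most observed choices."""
    def score(w):
        u = linear_utility(w)
        return sum(1 for chosen, rejected in data if u(chosen) > u(rejected))
    return max(candidates, key=score)


rng = random.Random(0)                      # fixed seed for reproducibility
true_w = 0.3
data = sample_choices(linear_utility(true_w), n=4000, error=0.1, rng=rng)
grid = [i / 10 for i in range(11)]
w_hat = estimate(data, grid)
print(w_hat)
```

In this setup a candidate that disagrees with the true preference on a non-negligible set of pairs loses rationalized choices at a rate proportional to the size of that set, which is why the maximizer should land on or next to the true weight once the sample is moderately large.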
4.1.1. Lipschitz utilities
One set of sufficient conditions will need the family of relevant utility representations to satisfy a Lipschitz property with a common Lipschitz bound. The representations are of the Wald kind, as in Section 3.5. We now add the requirement of having the Lipschitz property, which allows us to connect differences in utility functions to quantifiable observable (but noisy) choice behavior. The main idea is expressed in Lemma 4 of Section 6.
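We say that $(X, \mathcal{P}, \lambda, q)$ is a Lipschitz environment if:
(1) $X$ is convex, compact, and has nonempty interior.
(2) Each preference $\succeq \in \mathcal{P}$ has a Wald utility representation $u \in \mathcal{U}$ so that $x \sim (u(x), \ldots, u(x))$.
(3) All utilities in $\mathcal{U}$ are Lipschitz, and admit a common Lipschitz constant $\kappa$. So, for any $x, y \in X$ and $u \in \mathcal{U}$, $|u(x) - u(y)| \leq \kappa \|x - y\|$.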
4.1.2. Homothetic preferences
The second set of sufficient conditions involves homothetic preferences. It turns out, in this case, that the Wald representations have a homogeneity property, and this allows us to connect differences in utilities to a probability of detecting such differences. The key insight is contained in Lemma 5 of Section 6.
We employ the following auxiliary notation. and .
We say that is a homothetic environment if:
(1) for some (small) and (large) .
(2) is a class of continuous, monotone, homothetic, and complete preferences on .
(3) is a class of Wald representations, so that for each there is a utility function with .
Remark: if is the Wald representation of , then is homogeneous of degree one because iff , so .
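The remark can be checked numerically. The sketch below assumes that the Wald representation of a bundle is the scalar c for which the bundle is indifferent to the constant bundle (c, ..., c), and approximates it by bisection. The preference is a hypothetical homothetic example, described through a logarithmic Cobb-Douglas function that is itself not homogeneous of degree one; all function names are illustrative.

```python
import math

def wald_utility(weakly_prefers, x, tol=1e-12):
    """Approximate the c with x indifferent to (c, ..., c), by bisection.

    Assumes the preference is continuous and monotone, so that c lies
    between the smallest and the largest coordinate of x.
    """
    d = len(x)
    lo, hi = min(x), max(x)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weakly_prefers(x, (mid,) * d):
            lo = mid  # x is at least as good as the constant bundle
        else:
            hi = mid
    return (lo + hi) / 2

def log_cobb_douglas(x):
    """A representation of a homothetic preference that is NOT homogeneous."""
    return 0.3 * math.log(x[0]) + 0.7 * math.log(x[1])

def weakly_prefers(x, y):
    return log_cobb_douglas(x) >= log_cobb_douglas(y)

x = (1.0, 4.0)
u_x = wald_utility(weakly_prefers, x)
u_2x = wald_utility(weakly_prefers, tuple(2 * xi for xi in x))
print(u_x, u_2x / u_x)
```

Doubling the bundle doubles the Wald utility (up to the bisection tolerance), even though the underlying logarithmic representation does not scale this way.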
4.1.3. VC dimension
The Vapnik-Chervonenkis (VC) dimension of a set of preferences is the largest sample size for which there exists a utility that perfectly rationalizes all the choices in the data, no matter what those are. That is so that for any dataset of size .
VC dimension is a basic ingredient in the standard PAC learning paradigm. It is a measure of the complexity of a theory used in machine learning, and lies behind standard results on uniform laws of large numbers (see, for example, Boucheron, Bousquet, and Lugosi (2005)). Applications of VC to decision theory can be found in Basu and Echenique (2020) and Chambers, Echenique, and Lambert (2021).
It is worth noting that VC dimension is used in classification tasks. It may not be obvious, but when it comes to preferences, our exercise may be thought of as classification. For each pair of alternatives and , a preference “classifies” the pair as or . Then we can think of preference recovery as a problem of learning a classifier within the class .
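To make the classification view concrete, here is a small brute-force sketch. It treats each utility as a classifier that labels an ordered pair by whether the first alternative is weakly preferred, and checks whether a finite family realizes every possible labeling of a given list of pairs. The linear family and the particular pairs are assumptions for illustration; because both are finite, the computation only gives a lower bound on the VC dimension relative to the supplied pairs.

```python
from itertools import combinations

def linear_utility(a):
    """Hypothetical one-parameter family: u_a(x) = a*x[0] + (1-a)*x[1]."""
    return lambda x: a * x[0] + (1 - a) * x[1]

def classify(u, pairs):
    """Label each pair (x, y) by whether x is weakly preferred to y under u."""
    return tuple(u(x) >= u(y) for x, y in pairs)

def is_shattered(utilities, pairs):
    """True if every one of the 2^n labelings is realized by some utility."""
    realized = {classify(u, pairs) for u in utilities}
    return len(realized) == 2 ** len(pairs)

def largest_shattered_size(utilities, pairs):
    """Size of the largest sub-collection of pairs shattered by the family."""
    for k in range(len(pairs), 0, -1):
        if any(is_shattered(utilities, list(sub)) for sub in combinations(pairs, k)):
            return k
    return 0

utilities = [linear_utility(a) for a in (0.0, 0.25, 0.5, 0.75, 1.0)]
pairs = [((1.0, 0.0), (0.0, 1.0)), ((0.0, 1.0), (2.0, 0.0))]
print(is_shattered(utilities, pairs[:1]), is_shattered(utilities, pairs))
print(largest_shattered_size(utilities, pairs))
```

In this example each pair flips its label at a single threshold of the weight `a`, so a one-parameter family cannot produce all four labelings of the two pairs.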
4.2. Consistency and sample complexity
Theorem 3.
Consider a noisy choice environment that is either a homothetic or a Lipschitz environment. Suppose that is the Wald utility representation of .
(1) The estimates converge to in probability.
(2) There are constants and so that, for any and , with probability at least ,
where is the VC dimension of , when the environment is Lipschitz and when it is homothetic.
Of course, the second statement in the theorem is only meaningful when the VC dimension of is finite. The constants and depend on the primitives in the environment, but not on preferences, utilities, or sample sizes.
5. Recovering preferences and utilities
The discussion in Section 3.4 focused on utility recovery, taking convergence of preferences as given. Here we take a step back, provide some conditions for preference recovery that are particularly relevant for the setting of Section 3.4, and then connect these back to utility recovery in Corollary 1. First we describe an experimental setting in which preferences may be elicited: an agent, or subject, faces a sequence of (incentivized) choice problems, and the choices made produce data on his preferences. The specific model and description below are borrowed from Chambers, Echenique, and Lambert (2021), but the setting is completely standard in choice theory.
Let be the set of acts over monetary lotteries, as discussed in Section 3.4. A choice function is a pair with a collection of nonempty subsets of , and with for all . When , the domain of , is implied, we refer to as a choice function.
A choice function is generated by a preference relation over if
for all .
The notation means that the choice function is generated by the preference relation on .
Our model features an experimenter (a female) and a subject (a male). The subject chooses among alternatives in a way described by a preference over , which we refer to as data-generating preference. The experimenter seeks to infer from the subject’s choices in a finite experiment.
In a finite experiment, the subject is presented with finitely many unordered pairs of alternatives in . For every pair , the subject is asked to choose one of the two alternatives: or .
A sequence of experiments is a collection of pairs of possible choices presented to the subject. Let collect the first elements of a sequence of experiments, and be the set of all alternatives that are used over all the experiments in a sequence. Here is a finite experiment of size .
We make two assumptions on . The first is that is dense in . The second is that, for any there is for which . The first assumption is obviously needed to obtain any general preference recovery result. The second assumption means that the experimenter is able to elicit the subject’s choices over all pairs used in her experiment. (Footnote 6: If there is a countable dense , then one can always construct such a sequence of experiments via a standard diagonalization argument.)
For each , the subject’s preference generates a choice function by letting, for each , be a maximal element of according to . Thus the choice behavior observed by the experimenter is always consistent with .
We introduce two notions of rationalization: weak and strong. A preference weakly rationalizes if, for all , . A preference weakly rationalizes a choice sequence if it rationalizes the choice function of order , for all .
A preference strongly rationalizes if, for all , . A preference strongly rationalizes a choice sequence if it rationalizes the choice function of order , for all .
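The distinction between the two notions can be sketched in code. The sketch assumes the usual formulation in which weak rationalization requires every chosen alternative to be maximal in its choice problem, while strong rationalization requires the chosen set to coincide with the set of maximal alternatives. Preferences are represented by a utility function, and the choice function by a dictionary from choice problems to chosen sets; both representations are conveniences of the example.

```python
def maximal_elements(u, problem):
    """Alternatives in the choice problem with the highest utility under u."""
    best = max(u(x) for x in problem)
    return {x for x in problem if u(x) == best}

def weakly_rationalizes(u, choice):
    """Every chosen alternative is maximal in its choice problem."""
    return all(chosen <= maximal_elements(u, problem)
               for problem, chosen in choice.items())

def strongly_rationalizes(u, choice):
    """The chosen set equals the set of maximal alternatives in each problem."""
    return all(chosen == maximal_elements(u, problem)
               for problem, chosen in choice.items())

# One binary choice problem {a, b} from which a was chosen.
choice = {frozenset({"a", "b"}): {"a"}}

strict = {"a": 1.0, "b": 0.0}.get       # a strictly better than b
indifferent = {"a": 1.0, "b": 1.0}.get  # a and b indifferent
reversed_ = {"a": 0.0, "b": 1.0}.get    # b strictly better than a

for name, u in [("strict", strict), ("indifferent", indifferent), ("reversed", reversed_)]:
    print(name, weakly_rationalizes(u, choice), strongly_rationalizes(u, choice))
```

The indifferent preference rationalizes the observation weakly but not strongly, since under it the unchosen alternative is also optimal.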
In the history of revealed preference theory in consumer theory, strong rationalizability came first. It is essentially the notion in Samuelson (1938) and Richter (1966). Strong rationalizability is the appropriate notion when it is known that all potentially chosen alternatives are actually chosen, or when we want to impose, as an added discipline, that the observed choices are uniquely optimal in each choice problem. This makes sense when studying demand functions, as Samuelson did. Weak rationalizability was one of the innovations in Afriat (1967b), who was interested in demand correspondences. (Footnote 7: As an illustration of the difference between these two notions of rationalizability, note that, in the setting of consumer theory, one leads to the Strong Axiom of Revealed Preference while the other leads to the Generalized Axiom of Revealed Preference. Of course, Afriat’s approach is also distinct in assuming a finite dataset. See Chambers and Echenique (2016) for a detailed discussion.)
5.1. A general “limiting” result
Our next result serves to contrast what can be achieved with the “limiting” (countably infinite) data with the limit of preferences recovered from finite choice experiments.
Theorem 4.
Suppose that and are two continuous preference relations (complete and transitive). If , then .
Indeed, as the proof makes clear, Theorem 4 would hold more generally for any which is a connected topological space, but it may not hold in the absence of connectedness. There is a sense in which the limiting case with an infinite amount of data offers no problems for preference recovery. The structure we impose is needed for the limit of rationalizations drawn from finite data.
5.2. Recovery from finite data in the AA model
Here we adopt the same structural assumptions as in Section 3.4, namely that , endowed with the weak topology and the first order stochastic dominance relation. However, the result easily extends to broader environments, as the proof makes clear.
Theorem 5.
There is a sequence of finite experiments so that if the subject’s preference is continuous and weakly monotone, and for each , is a continuous and weakly monotone preference that strongly rationalizes a choice function generated by ; then .
Corollary 1.
Let and be as in the statement of Theorem 5. If, in addition, and have standard representations and then .
Note that Theorem 5 requires the existence of the data-generating preference .
A “dual” result to Theorem 5 was established in Chambers, Echenique, and Lambert (2021). There, the focus was on weak rationalization via , which is a weaker notion than the strong rationalization hypothesized here. To achieve a weak rationalization result, we assumed instead that preferences were strictly monotone.
6. Proofs
In this section, unless we say otherwise, we denote by the set of acts , and the elements of by etc. Note that is compact Polish when is endowed with the topology of weak convergence of probability measures. Let be the set of all complete and continuous binary relations on .
6.1. Lemmas
The lemmas stated here will be used in the proofs of our results.
Lemma 1.
Let . If is an increasing sequence in , and is a decreasing sequence, such that , then
Proof.
This is obviously true for . For , convergence and sups and infs are obtained component-by-component, so the result follows. ∎
Lemma 2.
Let . Let be a convergent sequence in , with . Then there is an increasing sequence and a decreasing sequence such that , and .
Proof.
The set ordered by first order stochastic dominance is a complete lattice (see, for example, Lemma 3.1 in Kertz and Rösler (2000)). Suppose that . Define and by and . Clearly, is an increasing sequence, is decreasing, and .
Let denote the cdf associated with . Note that while is the right-continuous modification of . For any point of continuity of , , so
by Lemma 1.
Moreover, . Let . Then
Then , as is a point of continuity of . ∎
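The lattice construction in this proof can be illustrated on a finite grid of monetary prizes. The sketch assumes that the two sequences are the infima and suprema, in the first order stochastic dominance order, of the tails of the original sequence. On a finite grid the infimum of a family of distributions has the pointwise maximum of their cdfs and the supremum has the pointwise minimum, so no right-continuous modification is needed. A finite list stands in for the infinite sequence, and all names are illustrative.

```python
def dominates(F, G):
    """F first-order stochastically dominates G: F's cdf lies weakly below G's."""
    return all(f <= g for f, g in zip(F, G))

def fosd_inf(cdfs):
    """Greatest lower bound in FOSD: pointwise maximum of the cdfs."""
    return [max(vals) for vals in zip(*cdfs)]

def fosd_sup(cdfs):
    """Least upper bound in FOSD: pointwise minimum of the cdfs."""
    return [min(vals) for vals in zip(*cdfs)]

def squeeze(cdfs):
    """Tail infima (increasing in FOSD) and tail suprema (decreasing in FOSD)."""
    lower = [fosd_inf(cdfs[n:]) for n in range(len(cdfs))]
    upper = [fosd_sup(cdfs[n:]) for n in range(len(cdfs))]
    return lower, upper

# cdfs on a common three-point grid of prizes (the last entry is always 1).
sequence = [
    [0.5, 0.8, 1.0],
    [0.2, 0.9, 1.0],
    [0.3, 0.7, 1.0],
    [0.3, 0.8, 1.0],
]
lower, upper = squeeze(sequence)
for lo, mid, hi in zip(lower, sequence, upper):
    print(lo, mid, hi)
```

Each term of the original sequence is sandwiched between the corresponding tail infimum and tail supremum, which is the squeezing used in the lemma.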
The results we have obtained motivate two definitions that will prove useful. Say that the set , together with the collection of finite experiments , has the countable order property if for each and each neighborhood of in there is with . We say that has the squeezing property if for any convergent sequence in , if then there is an increasing sequence , and a decreasing sequence , such that , and .
Lemma 3.
If , then has the squeezing property, and there is such that has the countable order property.
Proof.
The squeezing property follows from Lemma 2, and the countable order property from Theorem 15.11 of Aliprantis and Border (2006): Indeed, let be the set of probability distributions with finite support on , where for all , . Then we may choose a sequence of pairs , and let to be with so that the countable order property is satisfied. ∎
6.2. Proof of Theorem 2
Without loss of generality, we may set . First we show that in the compact-open topology. To this end, let . We want to show that . Suppose then that this is not the case and, by selecting a subsequence, that (without loss). Note that , where is the lottery that pays with probability , and with probability . Let be the lottery that pays with probability , and with probability (given that the range of is , we must have ). Now we have that and implies . This is a contradiction because is indifferent in to the lottery that pays with probability , and with probability . The latter is strictly first-order stochastically dominated by the lottery .
To finish the proof, we show that . This is the same as proving that when . For each , continuity and weak monotonicity imply that there is so that
Similarly, there is with .
Now we argue that . Indeed is a sequence in . If there is a subsequence that converges to, say, then we may choose and eventually
using weak monotonicity. This is impossible because and imply that .
Finally, using what we know about the convergence of to , .
We now turn to the second statement in the theorem. Observe that is a continuous function from onto . Let be an arbitrary convergent sequence, and say that . We claim that . Without loss we may assume that , by taking a subsequence if necessary. For each and , choose for which . Again, without loss, we may assume that by taking a subsequence if necessary, and using the finiteness of . Observe also that as we have shown that in the compact-open topology.
Now, we may also choose so that
and further we may again, without loss (by taking a subsequence), assume that converges to . Thus , again using what we have shown regarding . Then so that, by taking limits, . This implies that .
6.3. Proof of Proposition 1
Take as in the statement of the Proposition, and observe that for every , . Suppose, by way of contradiction, that is false. Then there is some and a subsequence for which ; by taking a further subsequence, we assume without loss that . Now, , and and . So by definition of closed convergence, it follows that ; but this violates certainty monotonicity as .
7. Proof of Theorem 3
First some notation. Let , and be represented by . By definition of , we have that for all . And we use to denote the volume of a set in , when this is well defined (see Schneider (2014)).
Consider the measure on defined as
In particular
is the probability that a choice with error made at a randomly-drawn choice problem by an agent with preference will coincide with .
The key identification result shown in Chambers, Echenique, and Lambert (2021) is that, if , then
Lemma 4.
Consider a Lipschitz noisy choice environment . There is a constant with the following property. If and are two preferences in with representations and (respectively) in , then
Proof.
The ball in with center and radius is denoted by . First we show that the map
defined for , is nonincreasing as a function of .
Indeed, let , and let . Then and . By convexity of , , and . Observe further by properties of Lebesgue measure in that . Therefore, . Since , it follows that
as we wanted to show.
Now observe that there exists large enough that for all and . Hence, for any and
as has nonempty interior and the volume of a ball in is independent of its center.
Now we proceed with the proof of the statement in the lemma. Let and fix with (wlog) . Set
We may assume that as defined above, as otherwise we can use a larger upper bound on the Lipschitz constants for the functions in .
Consider the interval
with volume
Consider . If then for any .
Now, if and then
by monotonicity. Similarly,
Thus for any , and
Here the first identity is shown in Chambers, Echenique, and Lambert (2021). The second inequality follows because on . The third inequality is because on .
By the assumptions we have placed on , and the calculations above, we know that
So there is a constant (that only depends on and ) so that is bounded below by
Here is a constant that only depends on , and .
By the assumption that , we get that
for some constant that depends on and . ∎
Lemma 5.
Consider a homothetic noisy choice environment . There is a constant with the following property. If and are two preferences in with representations and (respectively) in , then
Proof.
Let be such that
Choose so that . Let
and
Note that because by homotheticity, and hence . Then we must have as would mean that , contradicting monotonicity and .
Observe that if and then we have that
as and ; while
Hence .
First we estimate . Write for . Define the function and note that when then because . Note also that is a convex combination of and , so as the latter is a convex set. This shows that
and hence that .
Now, since is star shaped we have
where is the surface area of and is the radial function of the set (see Schneider (2014) page 57). The inequality results from as and for any .
Now,
as . Thus we have that
with , a constant.
Moreover, we have as . Then we obtain, again using a formula derived in Chambers, Echenique, and Lambert (2021), and that on :
where . ∎
7.1. Proof of Theorem 3
For the rest of this proof, we denote by .
The rest of the proof uses routine ideas from statistical learning theory. By standard results (see, for example, Theorem 3.1 in Boucheron, Bousquet, and Lugosi (2005)), there exists an event with probability at least on which:
Moreover, again by standard arguments (see Theorem 3.2 in Boucheron, Bousquet, and Lugosi (2005)), we also have
where
is the Rademacher average of .
Now, by the Vapnik-Chervonenkis inequality (see Theorem 3.4 in Boucheron, Bousquet, and Lugosi (2005)), we have that
where is the VC dimension of , and is a universal constant.
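For readers less familiar with Rademacher averages, the following sketch estimates an empirical Rademacher average by Monte Carlo for a finite class of functions evaluated on a fixed sample. It uses one common convention (the expected supremum of the signed sample average, without an absolute value), which may differ in detail from the definition used in the cited survey. Each function in the class is represented by the vector of its values on the sample, which is an assumption of the example.

```python
import random
from itertools import product

def empirical_rademacher(value_vectors, n_draws, rng):
    """Monte Carlo estimate of E_sigma[ max_f (1/n) * sum_i sigma_i * f(X_i) ].

    value_vectors: one vector (f(X_1), ..., f(X_n)) per function f in the class.
    """
    n = len(value_vectors[0])
    total = 0.0
    for _ in range(n_draws):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]  # Rademacher signs
        total += max(sum(s * v for s, v in zip(sigma, vec)) / n
                     for vec in value_vectors)
    return total / n_draws

rng = random.Random(0)
n = 3
rich_class = [list(v) for v in product((-1, 1), repeat=n)]  # all sign patterns
single = [[1, 1, 1]]                                        # one fixed function
print(empirical_rademacher(rich_class, 200, rng))
print(empirical_rademacher(single, 200, rng))
```

A class rich enough to match every sign pattern attains the largest possible value, while a single function averages out against the random signs; this is the sense in which the quantity measures the complexity of the class.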
So on the event , we have that
We now combine these statements with Lemmas 4 and 5. In particular, we let or depending on which of the lemmas we invoke. Let represent and represent . Let , a magnitude that depends on the sample. Then, on the event , by Lemma 4 or 5, we have that
where we have used that by definition of . This proves the second statement in the theorem.
7.2. Proof of Theorem 5
By standard results (see Hildenbrand (1970)), since is locally compact Polish, the topology of closed convergence is compact metric.
We will show that for any subsequence of , there is a subsubsequence converging to , which will establish that .
So choose a convergent subsubsequence of the given subsequence. To simplify notation and with a slight abuse of notation, let us also refer to this subsubsequence as . Call its limit ; is complete as the set of complete relations is closed in the closed convergence topology. It is therefore sufficient to establish that and .
First we show that implies that . So let . Let and be neighborhoods of and , respectively, such that for all and . Such neighborhoods exist by the continuity of . We prove first that if , then there exists such that for all . Recall that . By hypothesis, there exist and such that and . Each is a strong rationalization of the finite experiment of order , so if then implies that for all . Since , there is such that . Thus implies that for all . So, for , , as is weakly monotone.
Now we establish that . Let be an arbitrary sequence with . By hypothesis, there is an increasing sequence , and a decreasing sequence , such that and while .
Let be large enough that and . Let be such that for all (we established the existence of such above). Then, for any we have that
By the weak monotonicity of , then, . The sequence was arbitrary, so . Thus . Completeness of implies that .
Second, we show that if then , thus completing the proof. So let . We recursively construct sequences such that and , .
So, for any , choose with , and with ; so that , as is weakly monotone. Recall that strongly rationalizes for . So and imply that for all large enough. Let (where we can take ) such that ; and let and .
Then we have and . Thus .
7.3. Proof of Theorem 4
First, it is straightforward to show that implies . For otherwise there are for which and . Take an open neighborhood about and a pair for which and , a contradiction. Symmetrically, we also have implies .
Now, without loss, suppose that there is a pair for which and . By connectedness and continuity, is nonempty. Indeed, if we assume, towards a contradiction, that , then and are nonempty open sets. Further, for any , either or (because if then by completeness , which implies that ). Conclude that and each of the sets are nonempty and open (by continuity of the preference ); these sets are disjoint, violating connectedness of . So we conclude that is nonempty. By continuity of the preference , is open.
We claim that there is a pair for which . For otherwise, for all , . Conclude then by continuity that for all , . Observe that this implies that, for any , the set , as if , we also have that , from which we conclude , so that and hence , a contradiction. Observe that contradicts the continuity of and the connectedness of (same argument as nonemptiness of ; see our discussion above).
We have shown that there is for which , so that . Further, we have hypothesized that . By the first paragraph, we know that . If, by means of contradiction, we have , then , a contradiction. So and , a contradiction to .
References
- Afriat (1967a) Afriat, S. N. (1967a): “The Construction of Utility Functions from Expenditure Data,” International Economic Review, 8(1), 67–77.
- Afriat (1967b) Afriat, S. N. (1967b): “The Construction of Utility Functions from Expenditure Data,” International Economic Review, 8(1), 67–77.
- Aliprantis and Border (2006) Aliprantis, C. D., and K. Border (2006): Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, 3 edn.
- Balcan, Constantin, Iwata, and Wang (2012) Balcan, M. F., F. Constantin, S. Iwata, and L. Wang (2012): “Learning valuation functions,” in Conference on Learning Theory, pp. 4–1. JMLR Workshop and Conference Proceedings.
- Balcan, Daniely, Mehta, Urner, and Vazirani (2014) Balcan, M.-F., A. Daniely, R. Mehta, R. Urner, and V. V. Vazirani (2014): “Learning economic parameters from revealed preferences,” in International Conference on Web and Internet Economics, pp. 338–353. Springer.
- Basu and Echenique (2020) Basu, P., and F. Echenique (2020): “On the falsifiability and learnability of decision theories,” Theoretical Economics, 15(4), 1279–1305.
- Bei, Chen, Garg, Hoefer, and Sun (2016) Bei, X., W. Chen, J. Garg, M. Hoefer, and X. Sun (2016): “Learning Market Parameters Using Aggregate Demand Queries,” in AAAI.
- Beigman and Vohra (2006) Beigman, E., and R. Vohra (2006): “Learning from revealed preference,” in Proceedings of the 7th ACM Conference on Electronic Commerce, pp. 36–42.
- Bergstrom, Parks, and Rader (1976) Bergstrom, T. C., R. P. Parks, and T. Rader (1976): “Preferences which Have Open Graphs,” Journal of Mathematical Economics, 3(3), 265–268.
- Blundell, Browning, and Crawford (2008) Blundell, R., M. Browning, and I. Crawford (2008): “Best Nonparametric Bounds on Demand Responses,” Econometrica, 76(6), 1227–1262.
- Blundell, Browning, and Crawford (2003) Blundell, R. W., M. Browning, and I. A. Crawford (2003): “Nonparametric Engel Curves and Revealed Preference,” Econometrica, 71(1), 205–240.
- Border and Segal (1994) Border, K. C., and U. Segal (1994): “Dynamic Consistency Implies Approximately Expected Utility Preferences,” Journal of Economic Theory, 63(2), 170–188.
- Boucheron, Bousquet, and Lugosi (2005) Boucheron, S., O. Bousquet, and G. Lugosi (2005): “Theory of classification: A survey of some recent advances,” ESAIM: probability and statistics, 9, 323–375.
- Brown and Matzkin (1996) Brown, D. J., and R. L. Matzkin (1996): “Testable Restrictions on the Equilibrium Manifold,” Econometrica, 64(6), 1249–1262.
- Camara (2022) Camara, M. K. (2022): “Computationally Tractable Choice,” in Proceedings of the 23rd ACM Conference on Economics and Computation, EC ’22, p. 28, New York, NY, USA. Association for Computing Machinery.
- Carvajal, Deb, Fenske, and Quah (2013) Carvajal, A., R. Deb, J. Fenske, and J. K.-H. Quah (2013): “Revealed Preference Tests of the Cournot Model,” Econometrica, 81(6), 2351–2379.
- Cerreia-Vioglio, Dillenberger, and Ortoleva (2015) Cerreia-Vioglio, S., D. Dillenberger, and P. Ortoleva (2015): “Cautious expected utility and the certainty effect,” Econometrica, 83(2), 693–728.
- Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio (2011) Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci, and L. Montrucchio (2011): “Uncertainty averse preferences,” Journal of Economic Theory, 146(4), 1275–1330.
- Chambers and Echenique (2016) Chambers, C. P., and F. Echenique (2016): Revealed preference theory, vol. 56. Cambridge University Press.
- Chambers, Echenique, and Lambert (2021) Chambers, C. P., F. Echenique, and N. S. Lambert (2021): “Recovering preferences from finite data,” Econometrica, 89(4), 1633–1664.
- Chandrasekher, Frick, Iijima, and Le Yaouanq (2021) Chandrasekher, M., M. Frick, R. Iijima, and Y. Le Yaouanq (2021): “Dual-self representations of ambiguity preferences,” Econometrica, forthcoming.
- Chapman, Dean, Ortoleva, Snowberg, and Camerer (2017) Chapman, J., M. Dean, P. Ortoleva, E. Snowberg, and C. Camerer (2017): “Willingness to Pay and Willingness to Accept are Probably Less Correlated Than You Think,” NBER working paper No. 23954.
- Chapman, Dean, Ortoleva, Snowberg, and Camerer (2022) Chapman, J., M. Dean, P. Ortoleva, E. Snowberg, and C. Camerer (2022): “Econographics,” Forthcoming, Journal of Political Economy: Microeconomics.
- Chase and Prasad (2019) Chase, Z., and S. Prasad (2019): “Learning Time Dependent Choice,” in 10th Innovations in Theoretical Computer Science Conference (ITCS).
- Chateauneuf and Faro (2009) Chateauneuf, A., and J. H. Faro (2009): “Ambiguity through confidence functions,” Journal of Mathematical Economics, 45(9-10), 535–558.
- Chateauneuf, Grabisch, and Rico (2008) Chateauneuf, A., M. Grabisch, and A. Rico (2008): “Modeling attitudes toward uncertainty through the use of the Sugeno integral,” Journal of Mathematical Economics, 44(11), 1084–1099.
- Chavas and Cox (1993) Chavas, J.-P., and T. L. Cox (1993): “On Generalized Revealed Preference Analysis,” The Quarterly Journal of Economics, 108(2), 493–506.
- Clinton, Jackman, and Rivers (2004) Clinton, J., S. Jackman, and D. Rivers (2004): “The Statistical Analysis of Roll Call Data,” The American Political Science Review, 98(2), 355–370.
- Diewert (1973) Diewert, W. E. (1973): “Afriat and Revealed Preference Theory,” The Review of Economic Studies, 40(3), 419–425.
- Dong, Roth, Schutzman, Waggoner, and Wu (2018) Dong, J., A. Roth, Z. Schutzman, B. Waggoner, and Z. S. Wu (2018): “Strategic classification from revealed preferences,” in Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 55–70.
- Echenique, Golovin, and Wierman (2011) Echenique, F., D. Golovin, and A. Wierman (2011): “A revealed preference approach to computational complexity in economics,” in Proceedings of the 12th ACM conference on Electronic commerce, pp. 101–110.
- Echenique and Prasad (2020) Echenique, F., and S. Prasad (2020): “Incentive Compatible Active Learning,” in 11th Innovations in Theoretical Computer Science Conference (ITCS).
- Falk, Becker, Dohmen, Enke, Huffman, and Sunde (2018) Falk, A., A. Becker, T. Dohmen, B. Enke, D. Huffman, and U. Sunde (2018): “Global Evidence on Economic Preferences,” The Quarterly Journal of Economics, 133(4), 1645–1692.
- Forges and Minelli (2009) Forges, F., and E. Minelli (2009): “Afriat’s Theorem for General Budget Sets,” Journal of Economic Theory, 144(1), 135–145.
- Fox (1945) Fox, R. H. (1945): “On topologies for function spaces,” Bull. Amer. Math. Soc., 51, 429–432.
- Fudenberg, Gao, and Liang (2021) Fudenberg, D., W. Gao, and A. Liang (2021): “How Flexible is that Functional Form? Measuring the Restrictiveness of Theories,” in Proceedings of the 22nd ACM Conference on Economics and Computation, pp. 497–498.
- Gilboa and Schmeidler (1989) Gilboa, I., and D. Schmeidler (1989): “Maxmin expected utility with non-unique prior,” Journal of mathematical economics, 18(2), 141–153.
- Hansen and Sargent (2001) Hansen, L. P., and T. J. Sargent (2001): “Robust control and model uncertainty,” American Economic Review, 91(2), 60–66.
- Hansen and Sargent (2022) Hansen, L. P., and T. J. Sargent (2022): “Risk, Ambiguity, and Misspecification: Decision Theory, Robust Control, and Statistics,” Mimeo: NYU.
- Hansen, Sargent, Turmuhambetova, and Williams (2006) Hansen, L. P., T. J. Sargent, G. Turmuhambetova, and N. Williams (2006): “Robust control and model misspecification,” Journal of Economic Theory, 128(1), 45–90.
- Hildenbrand (1970) Hildenbrand, W. (1970): “On Economies with Many Agents,” Journal of Economic Theory, 2(2), 161–188.
- Kannai (1970) Kannai, Y. (1970): “Continuity Properties of the Core of a Market,” Econometrica, 38(6), 791–815.
- Kertz and Rösler (2000) Kertz, R. P., and U. Rösler (2000): “Complete Lattices of Probability Measures with Applications to Martingale Theory,” Lecture Notes-Monograph Series, 35, 153–177.
- Levin (1983) Levin, V. L. (1983): “A continuous utility theorem for closed preorders on a metrizable -compact space,” Doklady Akademii Nauk, 273(4), 800–804.
- Maccheroni, Marinacci, and Rustichini (2006) Maccheroni, F., M. Marinacci, and A. Rustichini (2006): “Ambiguity aversion, robustness, and the variational representation of preferences,” Econometrica, 74(6), 1447–1498.
- Mas-Colell (1974) Mas-Colell, A. (1974): “Continuous and Smooth Consumers: Approximation Theorems,” Journal of Economic Theory, 8(3), 305–336.
- Mas-Colell (1977) Mas-Colell, A. (1977): “On the Continuous Representation of Preorders,” International Economic Review, 18(2), 509–513.
- Mas-Colell (1978) Mas-Colell, A. (1978): “On Revealed Preference Analysis,” The Review of Economic Studies, 45(1), 121–131.
- Matzkin (1991) Matzkin, R. L. (1991): “Axioms of Revealed Preference for Nonlinear Choice Sets,” Econometrica, 59(6), 1779–1786.
- Nishimura, Ok, and Quah (2017) Nishimura, H., E. A. Ok, and J. K.-H. Quah (2017): “A Comprehensive Approach to Revealed Preference Theory,” American Economic Review, 107(4), 1239–1263.
- Poole and Rosenthal (1985) Poole, K. T., and H. Rosenthal (1985): “A Spatial Model for Legislative Roll Call Analysis,” American Journal of Political Science, 29(2), 357–384.
- Reny (2015) Reny, P. J. (2015): “A Characterization of Rationalizable Consumer Behavior,” Econometrica, 83(1), 175–192.
- Richter (1966) Richter, M. K. (1966): “Revealed Preference Theory,” Econometrica, 34(3), 635–645.
- Samuelson (1938) Samuelson, P. A. (1938): “A note on the pure theory of consumer’s behaviour,” Economica, 5(17), 61–71.
- Schmeidler (1989) Schmeidler, D. (1989): “Subjective probability and expected utility without additivity,” Econometrica, 57(3), 571–587.
- Schneider (2014) Schneider, R. (2014): Convex Bodies: the Brunn-Minkowski Theory. Cambridge University Press, 2 edn.
- Ugarte (2022) Ugarte, C. (2022): “Preference Recoverability from Inconsistent Choices,” Mimeo, UC Berkeley.
- Varian (1982) Varian, H. R. (1982): “The Nonparametric Approach to Demand Analysis,” Econometrica, 50(4), 945–973.
- von Gaudecker, van Soest, and Wengstrom (2011) von Gaudecker, H.-M., A. van Soest, and E. Wengstrom (2011): “Heterogeneity in Risky Choice Behavior in a Broad Population,” The American Economic Review, 101(2), 664–94.
- Zadimoghaddam and Roth (2012) Zadimoghaddam, M., and A. Roth (2012): “Efficiently learning from revealed preference,” in International Workshop on Internet and Network Economics, pp. 114–127. Springer.
- Zhang and Conitzer (2020) Zhang, H., and V. Conitzer (2020): “Learning the Valuations of a k-demand Agent,” in International Conference on Machine Learning.