
Recovering utility

Christopher P. Chambers Department of Economics, Georgetown University Federico Echenique Department of Economics, UC Berkeley  and  Nicolas S. Lambert Department of Economics, University of Southern California
Abstract.

We provide sufficient conditions under which a utility function may be recovered from a finite choice experiment. Identification, as is commonly understood in decision theory, is not enough. We provide a general recoverability result that is widely applicable to modern theories of choice under uncertainty. Key is to allow for a monetary environment, in which an objective notion of monotonicity is meaningful. In such environments, we show that subjective expected utility, as well as variational preferences, and other parametrizations of utilities over uncertain acts are recoverable. We also consider utility recovery in a statistical model with noise and random deviations from utility maximization.

Echenique thanks the National Science Foundation for its support through grant SES 1558757. Lambert gratefully acknowledges the financial support and hospitality of Microsoft Research New York and the Yale University Cowles Foundation.

1. Introduction

Economists are often interested in recovering preferences and utility functions from data on agents’ choices. If we are able to recover a utility function, then a preference relation is obviously implied, but the inverse procedure is more delicate. In this paper, we presume access to data on an agent’s choices, and that these describe the agent’s preferences (or that preferences have been obtained as the outcome of a statistical estimation procedure). Our results describe sufficient conditions under which one can recover, or learn, a utility function from the agents’ choices.

At a high level, the problem is that preferences essentially are choices, because they encode the choice that would be made from each binary choice problem. When we write $x\succ y$ we really mean that $x$ would be chosen from the set $\{x,y\}$. Utility functions are much richer objects, and a given choice behavior may be described by many different utilities. For example, one utility can be used to discuss an agent’s risk preferences: they could have a “constant relative risk aversion” utility, for which a single parameter describes attitudes towards risk. But the same preferences can be represented by a utility that does not have such a convenient parametrization. So recovering, or learning, utilities presents important challenges that go beyond the problem of recovering a preference. In the paper, we describe some simple examples that illustrate the challenges. Our main results describe when one may (non-parametrically) recover a utility representation from choice data.

We first consider choice under uncertainty. We adopt the standard (Anscombe-Aumann) setting of choice under uncertainty, and focus attention on a class of utility representations that has been extensively studied in the literature. Special cases include subjective expected utility, the max-min expected utility model of Gilboa and Schmeidler (1989), Choquet expected utility (Schmeidler, 1989), the variational preferences of Maccheroni, Marinacci, and Rustichini (2006), and many other popular models. Decision theorists usually place significance on the uniqueness of their utility representations, arguing that uniqueness provides an identification argument that allows for utility to be recovered from choice data. We argue, in contrast, that uniqueness of a utility representation is not enough to recover a utility from finite choice data.

Counterexamples are not hard to find. Indeed, even when a utility representation is unique, one may find a convergent sequence of utilities that is consistent with larger and larger finite datasets, but that does not converge to the utility function that generated the choices in the data, or to any utility to which it is equivalent. So uniqueness is necessary but not sufficient for a utility representation to be empirically tractable, in the sense of ensuring that a utility is recovered from large, but finite, choice experiments.

Our main results are positive, and exhibit sufficient conditions for utility recovery. Key to our results is the availability of an objective direction of improvements in utility: we focus our attention on models of monotone preferences. Our paper considers choices among monetary acts, meaning state-contingent monetary payoffs. For such acts, there is a natural notion of monotonicity. Between two acts, if one pays more in every state of the world, the agent should prefer it. As a discipline on the recovery exercise, this essential notion of monotonicity suffices to ensure that a sequence of utilities that explains the choices in the data converges to the utility function that generated the choices.

We proceed by first discussing the continuity of a utility function in its dependence on the underlying preference relation. If $U(\succeq,x)$ is a function of a preference $\succeq$ and of choice objects $x$, then we say that it is a utility function if $x\mapsto U(\succeq,x)$ represents $\succeq$. We draw on the existing literature (Theorem 1) to argue that such continuous utilities exist in very general circumstances. Continuity of this mapping in the preference ensures that if the choice data allow for preference recovery, they also allow a utility to be recovered. The drawback, however, of such general utility representation results is that they do not cover the special theories of utility in which economists generally take interest. There is no reason to expect that the utility $U(\succeq,x)$ coincides with the standard parametrizations of, for example, subjective expected utility or variational preferences.

We then go on to our main exercise, which constrains the environment to the Anscombe-Aumann setting, and considers utility representations that have received special attention in the theory of choice under uncertainty. We consider a setup that is flexible enough to accommodate most theories of choice under uncertainty that have been studied in the literature. Our main result (Theorem 2) says that, whenever a choice experiment succeeds in recovering agents’ underlying preferences, it also serves to recover a utility in the class of utilities of interest. For example, if an agent has subjective expected utility preferences, and these can be recovered from a choice experiment, then so can the parameters of the subjective expected utility representation: the agents’ beliefs and Bernoulli utility index. Or, if the agent has variational preferences that can be inferred from choice data, then so can the different components of the variational utility representation.

Actual data on choices may be subject to sampling noise, and agents who randomly deviate from their preferences. The results we have just mentioned are useful in such settings, once the randomness in preference estimates is taken into account. As a complement to our main findings, we proceed with a model that explicitly takes noisy choice, and randomness, into account. Specifically, we consider choice problems that are sampled at random, and an agent who may deviate from their preferences. They make mistakes. In such a setting, we present sufficient conditions for the consistency of utility function estimates (Theorem 3).

In the last part of the paper we take a step back and revisit the problem of preference recovery, with the goal of showing how data from a finite choice experiment can approximate a preference relation, and, in consequence, a utility function. Our model considers a large, but finite, number of binary choices. We show that when preferences are monotone, then preference recovery is possible (Theorem 5). In such environments, utility recovery follows for the models of choice under uncertainty that we have been interested in (Corollary 1).

Related literature.

The literature on revealed preference theory in economics is primarily devoted to tests for consistency with rational choice. The main result in the literature, Afriat’s theorem (Afriat, 1967a; Diewert, 1973; Varian, 1982), is in the context of standard demand theory (assuming linear budgets and a finite dataset). Versions of Afriat’s result have been obtained in a model with infinite data (Reny, 2015), nonlinear budget sets (e.g., Matzkin, 1991; Forges and Minelli, 2009), general choice problems (e.g., Chavas and Cox, 1993; Nishimura, Ok, and Quah, 2017), and multiperson equilibrium models (e.g., Brown and Matzkin, 1996; Carvajal, Deb, Fenske, and Quah, 2013). Algorithmic questions related to revealed preference are discussed by Echenique, Golovin, and Wierman (2011) and Camara (2022). The monograph by Chambers and Echenique (2016) presents an overview of results.

The revealed preference literature is primarily concerned with describing the datasets that are consistent with the theory, not with recovering or learning a preference, or a utility. In the context of demand theory and choice from linear budgets, Mas-Colell (1978) introduces sufficient conditions under which a preference relation is recovered, in the limit, from a sequence of ever richer demand data observations. More recently, Forges and Minelli (2009) derive the analog of Mas-Colell’s results for nonlinear budget sets. An important strand of literature focuses on non-parametric econometric estimation methods applied to demand theory data: Blundell, Browning, and Crawford (2003, 2008) propose statistical tests for revealed preference data, and consider counterfactual bounds on demand changes.

The problem of preference and utility recovery has been studied from the perspective of statistical learning theory. Beigman and Vohra (2006) considers the problem of learning a demand function within the PAC paradigm, which is closely related to the exercise we perform in Section 4. A key difference is that we work with data on pairwise choices, which are common in experimental settings (including in many recent large-scale online experiments). Zadimoghaddam and Roth (2012) look at the utility recovery problem, as in Beigman and Vohra (2006), but instead of learning a demand function they want to understand when a utility can be learned efficiently. Balcan, Daniely, Mehta, Urner, and Vazirani (2014) follow up on this important work by providing sample complexity guarantees, while Ugarte (2022) considers the problem of recovery of preferences under noisy choice data, as in our paper, but within the demand theory framework. Similarly, the early work of Balcan, Constantin, Iwata, and Wang (2012) considers a PAC learning question, focusing on important sub-classes of valuations in economics. Bei, Chen, Garg, Hoefer, and Sun (2016) pursues the problem assuming that a seller proposes budgets with the objective of learning an agent’s utility (they focus on quasilinear utility, and a seller that obtains aggregate demand data). Zhang and Conitzer (2020) considers this problem under an active-learning paradigm, and contrasts with the PAC sample complexity.

In all, these works are important precedents for our paper, but they are all within the demand theory setting. The results do not port to other environments, such as, for example, binary choice under risk or uncertainty. The closest paper to ours is Chambers, Echenique, and Lambert (2021), which looks at a host of questions related to our paper but focuses on preference, not utility, recovery. The work by Chambers, Echenique, and Lambert considers choices from binary choice problems, but does not address the question of recovering, or learning, a utility function. As we explain below in the paper, the problem for utilities is more delicate than the problem for preferences. In this line of work, Chase and Prasad (2019) obtains important results on learning a utility but restricted to settings of intertemporal choice. The work by Basu and Echenique (2020) looks at learnability of utility functions (within the PAC learning paradigm), but focusing on particular models of choice under uncertainty. Some of our results rely on measures of the richness of a theory, or of a family of preferences, which is discussed by Basu and Echenique (2020) and Fudenberg, Gao, and Liang (2021): the former by estimating the VC dimension of theories of choice under uncertainty, and the latter by proposing and analyzing new measures of richness that are well-suited for economics, as well as implementing them on economic datasets.

Finally, it is worth mentioning that preference and utility recovery is potentially subject to strategic manipulations, as emphasized by Dong, Roth, Schutzman, Waggoner, and Wu (2018) and Echenique and Prasad (2020). This possibility is ignored in our work.

2. The Question

We want to understand when utilities can be recovered from data on an agent’s choices. Consider an agent with a utility function $u$. We want to know when, given enough data on the agent’s choices, we can “estimate” or “recover” a utility function that is guaranteed to be close to $u$.

In statistical terminology, recovery is analogous to the consistency of an estimator, and approximation guarantees are analogous to learnability. Imagine a dataset of size $k$, obtained from an incentivized experiment with $k$ different choice problems. [Footnote 1: Such datasets are common in experimental economics, including cases with very large $k$. See, for example, von Gaudecker, van Soest, and Wengstrom (2011), Chapman, Dean, Ortoleva, Snowberg, and Camerer (2017), Chapman, Dean, Ortoleva, Snowberg, and Camerer (2022) and Falk, Becker, Dohmen, Enke, Huffman, and Sunde (2018). One can also apply our results to roll call data from congress, as in Poole and Rosenthal (1985) or Clinton, Jackman, and Rivers (2004). Large-scale A/B testing by tech firms may provide further examples (albeit involving proprietary datasets).] The observed choice behavior in the data may be described by a preference $\succeq^{k}$, which is associated with a utility function $u^{k}$. The preference $\succeq^{k}$ could be a rationalizing preference, or a preference estimate. So we choose a utility representation for $u^{k}$. The recovery, or consistency, property is that $u^{k}\to u$ as $k\to\infty$.

Suppose that the utility $u$ represents preferences $\succeq$, which summarize the agent’s full choice behavior. Clearly, unless $\succeq^{k}\to\succeq$, the exercise is hopeless. So our first order of business is to understand when $\succeq^{k}\to\succeq$ is enough to ensure that $u^{k}\to u$. In other words, we want to understand when recovering preferences is sufficient for recovering utilities. To this end, our main results are in Section 3.4. In recovering a utility, we are interested in particular parametric representations. In choice over uncertainty, for example, one may be interested in measures of risk attitudes, or uncertainty aversion. It is key then that the utility recovery exercise preserves the aspects of utility that allow such measures to have meaning. If, say, preferences have the “constant relative risk aversion” (CRRA) form, then we want to recover the Arrow-Pratt measure of risk aversion.

Our data is presumably obtained in an experimental setting, where an agent’s behavior may be recorded with errors, or in which the agent may randomly deviate from their underlying preference $\succeq$. Despite such errors, with high probability, “on the sample path,” we should obtain that $\succeq^{k}\to\succeq$. In our paper we uncover situations where this convergence leads to utility recovery. Indeed, the results in Sections 3.4 and 3.5 may be applied to say that, in many popular models in decision theory, when $\succeq^{k}\to\succeq$ (with high probability), then the resulting utility representations enable utility recovery (with high probability).

The next step is to discuss learning and sample complexity. Here we need to explicitly account for randomness and errors. We lay out a model of random choice, with random sampling of choice problems and errors in agents’ choices. The errors may take a very general form, as long as random choices are more likely to go in the direction of preferences than against it (if $x\succ y$ then $x$ is the more likely choice from the choice problem $\{x,y\}$), and that this likelihood ratio remains bounded away from one. Contrast with the standard theory of discrete choice, where the randomness usually is taken to be additive, and independent of the particular pair of alternatives that are being compared.

Here we consider a formal statistical consistency problem, and exhibit situations where utility recovery is feasible. We use ideas from the literature on PAC learning to provide formal finite sample-size bounds for each desired approximation guarantee. See Section 4.

3. The Model

3.1. Basic definitions and notational conventions

Let $X$ be a set. Given a binary relation $R\subseteq X\times X$, we write $x\mathrel{R}y$ when $(x,y)\in R$. A binary relation that is complete and transitive is called a weak order. If $X$ is a topological space, then we say that $R$ is continuous if $R$ is closed as a subset of $X\times X$ (see, for example, Bergstrom, Parks, and Rader, 1976). A preference relation is a weak order that is also continuous.

A preference relation $\succeq$ is locally strict if, for all $x,y\in X$, $x\succeq y$ implies that for each neighborhood $U$ of $(x,y)$, there is $(x^{\prime},y^{\prime})\in U$ with $x^{\prime}\succ y^{\prime}$. The notion of local strictness was first introduced by Border and Segal (1994) as a generalization of the property of being locally non-satiated from consumer theory.

If $\succeq$ is a preference on $X$ and $u:X\to\mathbf{R}$ is a function for which $x\succeq y$ if and only if $u(x)\geq u(y)$, then we say that $u$ is a representation of $\succeq$, or that $u$ is a utility function for $\succeq$.

If $A\subseteq\mathbf{R}^{d}$ is a Borel set, we write $\Delta(A)$ for the set of all Borel probability measures on $A$. We endow $\Delta(A)$ with the weak* topology. If $S$ is a finite set, then we topologize $\Delta(A)^{S}$ with the product topology.

For $p,q\in\Delta(A)$, we say that $p$ is larger than $q$ in the sense of first-order stochastic dominance if $\int_{A}f\,\mathrm{d}p\geq\int_{A}f\,\mathrm{d}q$ for all monotone increasing, continuous and bounded functions $f$ on $A$.
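A minimal sketch of this definition for lotteries with finite support on the real line. It relies on the standard fact that, in this case, the integral condition is equivalent to the cumulative distribution function of $p$ lying weakly below that of $q$ everywhere. The dictionary encoding of lotteries (outcome mapped to probability) and the function names are assumptions made only for illustration.

```python
def cdf_on_grid(lottery, grid):
    """Evaluate the CDF of a finite-support lottery at each point of the grid."""
    return [sum(prob for outcome, prob in lottery.items() if outcome <= t)
            for t in grid]


def fosd(p, q, tol=1e-12):
    """Return True if lottery p first-order stochastically dominates lottery q.

    Lotteries are dicts {outcome: probability}. For finite supports it is
    enough to compare the two CDFs on the union of the supports.
    """
    grid = sorted(set(p) | set(q))
    cdf_p = cdf_on_grid(p, grid)
    cdf_q = cdf_on_grid(q, grid)
    return all(fp <= fq + tol for fp, fq in zip(cdf_p, cdf_q))


# Hypothetical lotteries: p shifts mass from outcome 0 to outcome 1.
p = {1.0: 0.5, 2.0: 0.5}
q = {0.0: 0.5, 2.0: 0.5}
print(fosd(p, q), fosd(q, p))
```

Note that the relation is only a partial order: a sure payment of 1 and the lottery `q` above are not ranked in either direction.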

3.2. Topologies on preferences and utilities.

The set of preferences over $X$, when $X$ is a topological space, is endowed with the topology of closed convergence. The space of corresponding utility representations is endowed with the compact-open topology. These are the standard topologies for preferences and utilities, used in prior work in mathematical economics. See, for example, Hildenbrand (1970), Kannai (1970), and Mas-Colell (1974). Here we offer definitions and a brief discussion of our choice of topology.

Let $X$ be a topological space, and $\mathcal{F}=\{F^{n}\}_{n}$ be a sequence of closed sets in $X\times X$ (with the product topology). We define $\textrm{Li}(\mathcal{F})$ and $\textrm{Ls}(\mathcal{F})$ to be closed subsets of $X\times X$ as follows:

  • $(x,y)\in\textrm{Li}(\mathcal{F})$ if and only if, for all neighborhoods $V$ of $(x,y)$, there exists $N\in\mathbf{N}$ such that $F^{n}\cap V\neq\varnothing$ for all $n\geq N$.

  • $(x,y)\in\textrm{Ls}(\mathcal{F})$ if and only if, for all neighborhoods $V$ of $(x,y)$, and all $N\in\mathbf{N}$, there is $n\geq N$ such that $F^{n}\cap V\neq\varnothing$.

Observe that $\textrm{Li}(\mathcal{F})\subseteq\textrm{Ls}(\mathcal{F})$. The definition of closed convergence is as follows.

Definition 1.

$F^{n}$ converges to $F$ in the topology of closed convergence if $\textrm{Li}(\mathcal{F})=F=\textrm{Ls}(\mathcal{F})$.

Closed convergence captures the property that agents with similar preferences should have similar choice behavior, a property that is necessary to be able to learn the preference from finite data. Specifically, if $X\subseteq\mathbf{R}^{n}$, and $\mathcal{P}$ is the set of all locally strict and continuous preferences on $X$, then the topology of closed convergence is the smallest topology on $\mathcal{P}$ for which the sets

$$\{(x,y,\succeq):x\succ y\}\subseteq X\times X\times\mathcal{P}$$

are open. [Footnote 2: See Kannai (1970) and Hildenbrand (1970) for a discussion; a proof of this claim is available from the authors upon request.] In words: suppose that $x\succ y$, then for $x^{\prime}$ close to $x$, $y^{\prime}$ close to $y$, and $\succeq^{\prime}$ close to $\succeq$, we obtain that $x^{\prime}\succ^{\prime}y^{\prime}$.

For utility functions, we adopt the compact-open topology, which we also claim is a natural choice of topology. The compact-open topology is characterized by the convergence criterion of uniform convergence on compact sets. The reason it is natural for utility functions is that a utility usually has two arguments: one is the object being “consumed” (a lottery, for example) and the other is the ordinal preference that utility is meant to represent. (The preference argument is usually implicit, but of course it remains a key aspect of the exercise.) Now an analyst wants the utility to be “jointly continuous,” or continuous in both of its arguments. For such a purpose, the natural topology on the set of utilities, when they are viewed solely as functions of consumption, is indeed the compact-open topology. More formally, consider the following result, originally due to Mas-Colell (1977). [Footnote 3: Levin (1983) provides a generalization to incomplete preferences.]

Theorem 1.

Let $X$ be a locally compact Polish space, and $\mathcal{P}$ the space of all continuous preferences on $X$ endowed with the topology of closed convergence. Then there exists a continuous function $U:\mathcal{P}\times X\to[0,1]$ so that $x\mapsto U(\succeq,x)$ represents $\succeq$.

We may view the map $U$ as a mapping from $\succeq$ to the space of utility functions. Then continuity of this induced mapping is equivalent to the joint continuity result discussed in Theorem 1, as long as we impose the compact-open topology on the space of utility functions (see Fox (1945)).

3.3. The model

As laid out in Section 2, we want to understand when we may conclude that $u^{k}\to u$ from knowing that $\succeq^{k}\to\succeq$. Mas-Colell’s theorem (Theorem 1) provides general conditions under which there exists one utility representation that has the requisite convergence property, but he is clear about the practical limitations of his result: “There is probably not a simple constructive (“canonical”) method to find a $U$ function.” In contrast, economists are generally interested in specific parameterizations of utility.

For example, if an agent has subjective expected-utility preferences, economists want to estimate beliefs and a von-Neumann-Morgenstern index; not some arbitrary representation of the agent’s preferences. Or, if the data involve intertemporal choices, and the agent discounts utility exponentially, then an economist will want to estimate their discount factor. Such specific parameterizations of utility are not meaningful in the context of Theorem 1.

The following (trivial) example shows that there is indeed a problem to be studied. Convergence of arbitrary utility representations to the correct limit is not guaranteed, even when recovered utilities form a convergent sequence, and recovered preferences converge to the correct limit.

Example 1.

Consider expected-utility preferences on $\Delta(K)^{S}$, where $K$ is a compact space, $S$ a finite set of states, and $\Delta(K)^{S}$ is the set of Anscombe-Aumann acts. Fix an affine function $v:\Delta(K)\to\mathbf{R}$, a prior $\mu\in\Delta(S)$, and consider the preference $\succeq$ with representation $\int_{S}v(f(s))\,\mathrm{d}\mu(s)$.

Now if we set $\succeq^{k}=\succeq$ then $\succeq^{k}\to\succeq$ holds trivially. However, it is possible to choose an expected utility representation $\int_{S}v^{k}(f(s))\,\mathrm{d}\mu^{k}(s)$ that does not converge to a utility representation (of any kind) for $\succeq$. In fact one could choose $\mu^{k}=\mu$ and a “normalization” for $v^{k}$, for example $\|v^{k}\|=1$ (imagine for concreteness that $K$ is finite, and use the Euclidean norm for $v^{k}$). Specifically, choose scalars $\beta^{k}$ with $\|\beta^{k}+\frac{1}{k}v\|=1$, and set $v^{k}=\beta^{k}+\frac{1}{k}v$. Then the utility $f\mapsto\int_{S}v^{k}(f(s))\,\mathrm{d}\mu(s)$ represents $\succeq^{k}$ and converges to a constant function.

The punchline is that the limiting utility represents the preference that exhibits complete indifference among all acts. This is true, no matter what the original preference \succeq was.

In the example, we have imposed some discipline on the representation. Given that the utility converges to a constant, the discipline we have chosen is a particular normalization of the utility representations (their norm is constant). The normalization just makes the construction of the example slightly more challenging, and reflects perhaps the most basic care that an analyst could impose on the recovery exercise.
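The construction in Example 1 can be checked numerically. The sketch below assumes a finite prize set $K$ with three prizes, encodes $v$ by its values on the prizes, and takes $v^{k}=\beta^{k}+\frac{1}{k}v$ with $\beta^{k}$ the positive root of the quadratic equation $\|\beta\mathbf{1}+\frac{1}{k}v\|=1$. The particular vector `V` and the function name are hypothetical choices for illustration only.

```python
import math


def normalized_index(v, k):
    """Return v^k = beta^k + v/k, with beta^k chosen so that ||v^k|| = 1.

    beta^k solves n*beta^2 + 2*beta*sum(w) + ||w||^2 = 1 for w = v/k;
    the positive root is used.
    """
    n = len(v)
    w = [x / k for x in v]
    s = sum(w)
    sq = sum(x * x for x in w)
    disc = s * s - n * (sq - 1.0)  # nonnegative for this example
    beta = (-s + math.sqrt(disc)) / n
    return [beta + x for x in w]


V = [0.0, 0.5, 1.0]  # hypothetical utilities of three prizes

for k in (1, 10, 1000):
    vk = normalized_index(V, k)
    norm = math.sqrt(sum(x * x for x in vk))
    spread = max(vk) - min(vk)
    # The norm stays at one and the ranking of prizes is unchanged,
    # but the spread between best and worst prize shrinks like 1/k.
    print(k, round(norm, 6), round(spread, 6))
```

Each $v^{k}$ is a positive affine transformation of $v$, so it represents the same preference, yet the spread between the best and worst prize vanishes and the limit is the constant vector with entries $1/\sqrt{3}$.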

3.4. Anscombe-Aumann acts

We present our first main result in the context of Anscombe-Aumann acts, the workhorse model of the modern theory of decisions under uncertainty. Let $S$ be a finite set of states of the world, and fix a closed interval of the real line $[a,b]\subseteq\mathbf{R}$. An act is a function $f:S\to\Delta([a,b])$. We interpret the elements of $\Delta([a,b])$ as monetary lotteries, so that acts are state-contingent monetary lotteries. The set of all acts is $\Delta([a,b])^{S}$. When $p\in\Delta([a,b])$, we denote the constant act that is identically equal to $p$ by $(p,\ldots,p)$; or sometimes by $p$ for short.

Note that we do not work with abstract, general, Anscombe-Aumann acts, but in assuming monetary lotteries we impose a particular structure on the objective lotteries in our Anscombe-Aumann framework. The reason is that our theory necessitates a certain known and objective direction of preference. Certain preference comparisons must be known a priori: monotonicity of preference will do the job, but for monotonicity to be objective we need the structure of monetary lotteries.

An act $f$ dominates an act $g$ if, for all $s\in S$, $f(s)$ first-order stochastically dominates $g(s)$. And $f$ strictly dominates $g$ if, for all $s\in S$, $f(s)$ strictly first-order stochastically dominates $g(s)$. A preference $\succeq$ over acts is weakly monotone if $f\succeq g$ whenever $f$ dominates $g$.

Let $U$ be the set of all continuous and monotone weakly increasing functions $u:[a,b]\to\mathbf{R}$ with $u(a)=0$ and $u(b)=1$. A pair $(V,u)$ is a standard representation if $V:\Delta([a,b])^{S}\to\mathbf{R}$ and $u\in U$ are continuous functions such that $V(p,\ldots,p)=\int_{[a,b]}u\,\mathrm{d}p$, for all constant acts $(p,\ldots,p)$. Moreover, we say that a standard representation $(V,u)$ is aggregative if there is an aggregator $H:[0,1]^{S}\to\mathbf{R}$ with $V(f)=H((\int u\,\mathrm{d}f(s))_{s\in S})$ for $f\in\Delta([a,b])^{S}$. An aggregative representation with aggregator $H$ is denoted by $(V,u,H)$. Observe that a standard representation rules out total indifference.

A preference $\succeq$ on $\Delta([a,b])^{S}$ is standard if it is weakly monotone, and there is a standard representation $(V,u)$ in which $V$ represents $\succeq$. Roughly, standard preferences will be those that satisfy the expected utility axioms across constant acts, and are monotone with respect to the (statewise) first-order stochastic dominance relation. Aggregative preferences will additionally satisfy an analogue of Savage’s P3 or the Anscombe-Aumann notion of monotonicity.

Example 2.

Variational preferences (Maccheroni, Marinacci, and Rustichini, 2006) are standard and aggregative. [Footnote 4: Variational preferences are widely used in macroeconomics and finance to capture decision makers’ concerns for using a misspecified model. Here it is important to recover the different components of a representation, $v$ and $c$, because they quantify key features of the environment. See for example Hansen and Sargent (2001); Hansen, Sargent, Turmuhambetova, and Williams (2006); Hansen and Sargent (2022).] Let

$$V(f)=\inf\left\{\int v(f(s))\,\mathrm{d}\pi(s)+c(\pi):\pi\in\Delta(S)\right\}$$

where

  (1) $v:\Delta([a,b])\to\mathbf{R}$ is continuous and affine.

  (2) $c:\Delta(S)\to[0,\infty]$ is lower semicontinuous, convex and grounded (meaning that $\inf\{c(\pi):\pi\in\Delta(S)\}=0$).

Note that $V(p,\ldots,p)=v(p)+\inf\{c(\pi):\pi\in\Delta(S)\}=\int u\,\mathrm{d}p$, by the assumption that $c$ is grounded, and where the existence of $u:[a,b]\to\mathbf{R}$ so that $v(p)=\int u\,\mathrm{d}p$ is an instance of the Riesz representation theorem. It is clear that we may choose $u\in U$. So $(V,u)$ is a standard representation.

Letting $H:[0,1]^{S}\to\mathbf{R}$ be defined by $H(x)=\inf\{\sum_{s\in S}x(s)\pi(s)+c(\pi):\pi\in\Delta(S)\}$, we see that indeed $(V,u,H)$ is also an aggregative representation of these preferences.
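A minimal sketch of the aggregator $H$, under the simplifying assumption that the infimum is taken over a finite list of candidate priors. This is exact for max-min preferences whose set of priors is a polytope, if the list contains its extreme points, since the objective is linear in $\pi$. For a general convex cost $c$ it is only an approximation. The two-state priors and the cost functions below are hypothetical.

```python
def variational_aggregator(x, priors, cost):
    """Evaluate H(x) = min over the listed priors of sum_s x[s]*pi[s] + c(pi).

    x      : tuple of state-wise expected utilities, one entry per state
    priors : finite list of probability vectors over states (an assumption)
    cost   : function mapping a prior to its nonnegative cost c(pi)
    """
    return min(
        sum(xs * ps for xs, ps in zip(x, pi)) + cost(pi)
        for pi in priors
    )


# Hypothetical example with two states and two candidate priors.
PRIORS = [(0.3, 0.7), (0.6, 0.4)]


def maxmin_cost(pi):
    # Zero cost on every listed prior: H reduces to max-min expected utility.
    return 0.0


def grounded_cost(pi):
    # Hypothetical grounded cost: the first prior is free, the second is penalized.
    return 0.0 if pi == PRIORS[0] else 0.05


print(variational_aggregator((1.0, 0.0), PRIORS, maxmin_cost))
print(variational_aggregator((0.0, 1.0), PRIORS, grounded_cost))
```

Because the cost is grounded, a constant vector $(t,\ldots,t)$ is mapped to $t$, which mirrors the requirement that $V$ agrees with expected utility on constant acts.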

Some other examples of aggregative preferences include special cases of the variational model (Gilboa and Schmeidler, 1989), as well as generalizations of it (Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio, 2011; Chandrasekher, Frick, Iijima, and Le Yaouanq, 2021), and others which are not comparable (Schmeidler, 1989; Chateauneuf, Grabisch, and Rico, 2008; Chateauneuf and Faro, 2009). [Footnote 5: A class of variational preferences that are of particular interest to computer scientists are preferences with a max-min representation (Gilboa and Schmeidler, 1989). These evaluate acts by $V(f)=\inf\{\int v(f(s))\,\mathrm{d}\pi(s):\pi\in\Pi\}$, with $\Pi\subseteq\Delta(S)$ a closed and convex set. Here $c$ is the indicator function of $\Pi$ (as defined in convex analysis).]

Theorem 2.

Let $\succeq$ be a standard preference with standard representation $(V,u)$, and $\{\succeq^{k}\}$ a sequence of standard preferences, each with a standard representation $(V^{k},u^{k})$.

  (1) If $\succeq^{k}\to\succeq$, then $(V^{k},u^{k})\to(V,u)$.

  (2) If, in addition, these preferences are aggregative with representations $(V^{k},u^{k},H^{k})$ and $(V,u,H)$, then $H^{k}\to H$.

In terms of interpretation, Theorem 2 suggests that, as preferences converge, risk attitudes, or von Neumann-Morgenstern utility indices, also converge in a pointwise sense. The aggregative part claims that we can study the convergence of risk attitudes and the convergence of the aggregator controlling for risk separately. So, for example, in the multiple priors case, two decision makers whose preferences are close will have similar sets of priors.

3.5. Preferences over lotteries and certainty equivalents

In this section, we focus on a canonical representation for preferences over lotteries: the certainty equivalent. There are many models of preferences over lotteries, but we have in mind in particular Cerreia-Vioglio, Dillenberger, and Ortoleva (2015), whereby a preference representation over lotteries is given by $U(p)=\inf_{u\in\mathcal{U}}u^{-1}(\int u\,\mathrm{d}p)$; a minimum over a set of certainty equivalents for expected utility maximizers. Key is that for this representation, and any degenerate lottery $\delta_{x}$, $U(\delta_{x})=x$.

Let $[a,b]\subset\mathbf{R}$, where $a<b$, be an interval in the real line and consider $\Delta([a,b])$. Say that $\succeq$ on $\Delta([a,b])$ is certainty monotone if whenever $p$ first-order stochastically dominates $q$, then $p\succeq q$, and for all $x,y\in[a,b]$ for which $x>y$, $\delta_{x}\succ\delta_{y}$. Any lottery $p\in\Delta([a,b])$ then possesses, for any certainty monotone continuous preference $\succeq$, a unique certainty equivalent $x\in[a,b]$, satisfying $\delta_{x}\sim p$. Accordingly, we define $\mbox{ce}(\succeq,p)$ to be the certainty equivalent of $p$ for $\succeq$. It is clear that, fixing $\succeq$, $\mbox{ce}(\succeq,\cdot)$ is a continuous utility representation of $\succeq$.

Proposition 1.

Let $\succeq$ be a certainty monotone preference and let $p\in\Delta([a,b])$. Let $\{\succeq^{k}\}$ be a sequence of certainty monotone preferences and let $\{p^{k}\}$ be a sequence in $\Delta([a,b])$. If $(\succeq^{k},p^{k})\rightarrow(\succeq,p)$, then $\mbox{ce}(\succeq^{k},p^{k})\rightarrow\mbox{ce}(\succeq,p)$.

In other words, the map carrying each preference to its certainty equivalent representation is continuous in the topology of closed convergence.

4. Utility recovery with noisy choice data

We develop a model of noisy choice data, and consider when utility may be recovered from a traditional estimation procedure. Recovery here takes the form of an explicit consistency result, together with sample complexity bounds in a PAC learning framework.

The focus is on the Wald representation, analogous to the certainty equivalent we considered in Section 3.5. When choosing among vectors $x\in\mathbf{R}^{d}$, the Wald representation is the number $u(x)\in\mathbf{R}$ such that

$$x\sim(u(x),\ldots,u(x)).$$

If the choice space is well behaved, a Wald representation exists for any monotone and continuous preference relation. Here we move beyond the Anscombe-Aumann setting that we considered above, but it should be clear that some versions of Anscombe-Aumann can be accommodated within the assumptions of this section.
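To make the Wald representation concrete, here is a small Python sketch that recovers $u(x)$ by bisection along the diagonal, given any continuous and monotone utility for the same preference. The function name `wald_utility` and the example utility `v` are hypothetical. The search interval uses the fact that, for a monotone preference, $u(x)$ lies between the smallest and largest coordinates of $x$.

```python
def wald_utility(v, x, lo, hi, tol=1e-10):
    """Find c in [lo, hi] with v((c, ..., c)) = v(x) by bisection.

    v must be continuous and strictly increasing along the diagonal.
    """
    d = len(x)
    target = v(x)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v([mid] * d) < target:
            lo = mid          # the constant bundle is still worse than x
        else:
            hi = mid          # the constant bundle is at least as good as x
    return 0.5 * (lo + hi)

def v(x):
    # Hypothetical utility: a monotone transformation of Cobb-Douglas.
    return (x[0] * x[1]) ** 3

x = [1.0, 4.0]
u_x = wald_utility(v, x, lo=min(x), hi=max(x))
```

In this example the constant bundle $(2,2)$ is indifferent to $x=(1,4)$, so the sketch returns a value close to 2, whatever monotone transformation of the underlying utility is used.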

Our main results for the model that explicitly accounts for noisy choice data assume Wald representations that are either Lipschitz or homogeneous (the latter meaning that preferences are homothetic).

4.1. Noisy choice data

The primitives of our noisy choice model are collected in the tuple $(X,\mathcal{P},\lambda,q)$, where:

  • $X\subseteq\mathbf{R}^{d}$ is the ambient choice, or consumption, space. The set $X$ is endowed with the (relative) topology inherited from $\mathbf{R}^{d}$.

  • $\mathcal{P}$ is a class of continuous and locally strict preferences on $X$. The class comes with a set of utility functions $\mathcal{U}$, so that each element of $\mathcal{P}$ has a utility representation in the set $\mathcal{U}$.

  • $\lambda$ is a probability measure on $X$, assumed to be absolutely continuous with respect to Lebesgue measure. We also assume that $\lambda\geq c\,\mathrm{Leb}$, where $c>0$ is a constant and $\mathrm{Leb}$ denotes Lebesgue measure.

  • $q:X\times X\times\mathcal{P}\to[0,1]$ is a random choice function, so $q(x,y;\succeq)$ is the probability that an agent with preferences $\succeq$ chooses $x$ over $y$. Assume that if $x\succ y$, then $x$ is chosen with probability $q(x,y;\succeq)>1/2$ and $y$ with probability $q(y,x;\succeq)=1-q(x,y;\succeq)$. If $x\sim y$ then $x$ and $y$ are chosen with equal probability.

  • We shall assume that the random choice function $q$ satisfies

    $$\Theta\equiv\inf\{q(x,y;\succeq):x\succ y\text{ and }\succeq\in\mathcal{P}\}>\frac{1}{2}.$$

The tuple $(X,\mathcal{P},\lambda,q)$ describes a data-generating process for noisy choice data. Fix a sample size $n$ and consider an agent with preference $\succeq^{*}\in\mathcal{P}$. A sequence of choice problems $\{x_{i},y_{i}\}$, $1\leq i\leq n$, is obtained by drawing $x_{i}$ and $y_{i}$ from $X$, independently, according to the law $\lambda$. Then a choice is made from each problem $\{x_{i},y_{i}\}$ according to $q(\cdot,\cdot;\succeq^{*})$.

Observe that our assumptions on $q$ are mild. We allow errors to depend on the pair $\{x,y\}$ under consideration, almost arbitrarily. The only requirement is that one is more likely to choose according to one’s preference than to go against it, together with the more technical assumptions of measurability and a uniform bound ($\Theta>1/2$) that keeps choice probabilities away from 50-50 choice.

To keep track of the chosen alternative, we order the elements of each problem so that $(x_{i},y_{i})$ means that $x_{i}$ was chosen from the choice problem $\{x_{i},y_{i}\}$. So a sample of size $n$ is $\{(x_{1},y_{1}),\ldots,(x_{n},y_{n})\}$, consisting of $n$ iid draws from $X\times X$ (that is, $2n$ iid draws from $X$) together with the choices made according to our stochastic choice model: in the $i$th draw, the choice problem was $\{x_{i},y_{i}\}$ and $x_{i}$ was chosen.

A utility function $u_{n}\in\mathcal{U}$ is chosen to maximize the number of rationalized choices in the data. So $u_{n}$ maximizes $\sum_{i=1}^{n}\mathbf{1}_{u(x_{i})\geq u(y_{i})}$ over $u\in\mathcal{U}$. The space of utility functions is endowed with a metric, $\rho$. In this section, all we ask of $\rho$ is that, for any $u,u^{\prime}\in\mathcal{U}$, there is $x\in X$ with $\left|u(x)-u^{\prime}(x)\right|\geq\rho(u,u^{\prime})$. For example, we could use the sup norm for the purposes of any of the results in this section.
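The data-generating process and the estimator can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration: the choice space $[1,2]^{2}$ with uniform $\lambda$, a constant probability `p_correct` of choosing the preferred alternative, and a finite grid of Cobb-Douglas utilities $x_{1}^{a}x_{2}^{1-a}$ (each of which is its own Wald representation) standing in for $\mathcal{U}$.

```python
import random

def cobb_douglas(a):
    # Hypothetical candidate class: u_a(x) = x1^a * x2^(1-a), so u_a(c, c) = c.
    return lambda x: x[0] ** a * x[1] ** (1.0 - a)

def simulate(u_star, n, p_correct, rng):
    """Draw n choice problems and record noisy choices as (chosen, unchosen)."""
    data = []
    for _ in range(n):
        x = (rng.uniform(1.0, 2.0), rng.uniform(1.0, 2.0))
        y = (rng.uniform(1.0, 2.0), rng.uniform(1.0, 2.0))
        better, worse = (x, y) if u_star(x) >= u_star(y) else (y, x)
        if rng.random() < p_correct:
            data.append((better, worse))   # choice agrees with the preference
        else:
            data.append((worse, better))   # choice made in error
    return data

def estimate(data, grid):
    """Pick the parameter whose utility rationalizes the most observed choices."""
    def score(a):
        u = cobb_douglas(a)
        return sum(1 for chosen, other in data if u(chosen) >= u(other))
    return max(grid, key=score)

rng = random.Random(0)
grid = [i / 20 for i in range(1, 20)]
data = simulate(cobb_douglas(0.3), n=4000, p_correct=0.75, rng=rng)
a_hat = estimate(data, grid)
```

With a finite grid the maximization is a direct search; for richer classes $\mathcal{U}$ the maximization step would need a different method, which this sketch does not address.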

4.1.1. Lipschitz utilities

One set of sufficient conditions will need the family of relevant utility representations to satisfy a Lipschitz property with a common Lipschitz bound. The representations are of the Wald kind, as in Section 3.5. We now add the requirement of having the Lipschitz property, which allows us to connect differences in utility functions to quantifiable observable (but noisy) choice behavior. The main idea is expressed in Lemma 4 of Section 6.

We say that $(X,\mathcal{P},\lambda,q)$ is a Lipschitz environment if:

  1. (1)

    $X\subseteq\mathbf{R}^{d}$ is convex, compact, and has nonempty interior.

  2. (2)

    Each preference $\succeq\in\mathcal{P}$ has a Wald utility representation $u_{\succeq}:X\to\mathbf{R}$ so that $x\sim u_{\succeq}(x)\mathbf{1}$.

  3. (3)

    All utilities in $\mathcal{U}$ are Lipschitz, and admit a common Lipschitz constant $\kappa$. So, for any $x,x^{\prime}\in X$ and $u\in\mathcal{U}$, $|u(x)-u(x^{\prime})|\leq\kappa\|x-x^{\prime}\|$.

4.1.2. Homothetic preferences

The second set of sufficient conditions involves homothetic preferences. It turns out, in this case, that the Wald representations have a homogeneity property, and this allows us to connect differences in utilities to a probability of detecting such differences. The key insight is contained in Lemma 5 of Section 6.

We employ the following auxiliary notation: $S^{M}_{\alpha}=\{x\in\mathbf{R}^{d}:\|x\|=M\text{ and }x\geq\alpha\mathbf{1}\}$ and $D^{M}_{\alpha}=\{\theta x:x\in S^{M}_{\alpha}\text{ and }\theta\in[0,1]\}$.

We say that $(X,\mathcal{P},\lambda,q)$ is a homothetic environment if:

  1. (1)

    $X=D^{M}_{\alpha}$ for some (small) $\alpha>0$ and (large) $M>0$.

  2. (2)

    $\mathcal{P}$ is a class of continuous, monotone, homothetic, and complete preferences on $X\subseteq\mathbf{R}^{d}$.

  3. (3)

    $\mathcal{U}$ is a class of Wald representations, so that for each $\succeq\in\mathcal{P}$ there is a utility function $u\in\mathcal{U}$ with $x\sim u(x)\mathbf{1}$.

Remark: if $u\in\mathcal{U}$ is the Wald representation of $\succeq$, then $u$ is homogeneous of degree one. Indeed, by homotheticity, $x\sim u(x)\mathbf{1}$ iff $\lambda x\sim\lambda u(x)\mathbf{1}$ (here $\lambda$ denotes a positive scalar, not the measure above), so $u(\lambda x)=\lambda u(x)$.

4.1.3. VC dimension

The Vapnik-Chervonenkis (VC) dimension of a set $\mathcal{P}$ of preferences is the largest sample size $n$ for which there are $n$ choice problems $\{x_{i},y_{i}\}$ such that some utility $u\in\mathcal{U}$ perfectly rationalizes all the choices in the data, no matter what those choices are. That is, however the chosen alternative is designated in each problem, giving a dataset $(x_{i},y_{i})_{i=1}^{n}$ of size $n$, there is $u\in\mathcal{U}$ with $n=\sum_{i=1}^{n}\mathbf{1}_{u(x_{i})\geq u(y_{i})}$.

VC dimension is a basic ingredient in the standard PAC learning paradigm. It is a measure of the complexity of a theory used in machine learning, and lies behind standard results on uniform laws of large numbers (see, for example, Boucheron, Bousquet, and Lugosi (2005)). Applications of VC to decision theory can be found in Basu and Echenique (2020) and Chambers, Echenique, and Lambert (2021).

It is worth noting that VC dimension is used in classification tasks. It may not be obvious, but when it comes to preferences, our exercise may be thought of as classification. For each pair of alternatives $x$ and $y$, a preference $\succeq$ “classifies” the pair as $x\succeq y$ or $y\succ x$. Then we can think of preference recovery as a problem of learning a classifier within the class $\mathcal{P}$.
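The rationalization requirement behind VC dimension can be checked by brute force when the class of utilities is finite. The Python sketch below is only illustrative: the names `rationalizes` and `is_shattered`, the three linear utilities, and the example choice problems are all assumptions made for this example.

```python
from itertools import product

def rationalizes(u, dataset):
    """True if u rationalizes every observation (chosen, unchosen)."""
    return all(u(chosen) >= u(other) for chosen, other in dataset)

def is_shattered(pairs, utilities):
    """True if every pattern of choices from the pairs is rationalized by some u."""
    for flips in product([False, True], repeat=len(pairs)):
        dataset = [(y, x) if flip else (x, y)
                   for (x, y), flip in zip(pairs, flips)]
        if not any(rationalizes(u, dataset) for u in utilities):
            return False
    return True

# Hypothetical finite class: linear utilities a*x1 + (1-a)*x2 on R^2.
utilities = [lambda x, a=a: a * x[0] + (1 - a) * x[1] for a in (0.1, 0.5, 0.9)]

one_pair = [((2.0, 0.0), (0.0, 1.0))]
# The second problem is a scaled copy of the first, so a linear utility
# must rank both problems the same way.
two_pairs = [((2.0, 0.0), (0.0, 1.0)), ((4.0, 0.0), (0.0, 2.0))]

shatters_one = is_shattered(one_pair, utilities)
shatters_two = is_shattered(two_pairs, utilities)
```

The single problem is shattered by this class, while the two scaled problems are not. The VC dimension asks for the largest $n$ at which some collection of $n$ problems is shattered, so a full computation would also search over collections of problems.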

4.2. Consistency and sample complexity

Theorem 3.

Consider a noisy choice environment $(X,\mathcal{P},\lambda,q)$ that is either a homothetic or a Lipschitz environment. Suppose that $u^{*}\in\mathcal{U}$ is the Wald utility representation of $\succeq^{*}\in\mathcal{P}$.

  1. (1)

    The estimates $u_{n}$ converge to $u^{*}$ in probability.

  2. (2)

    There are constants $K$ and $\bar{C}$ so that, for any $\delta\in(0,1)$ and $n$, with probability at least $1-\delta$,

    $$\rho(u_{n},u^{*})\leq\bar{C}\left(K\sqrt{V/n}+\sqrt{2\ln(1/\delta)/n}\right)^{1/D},$$

    where $V$ is the VC dimension of $\mathcal{P}$, $D=d$ when the environment is Lipschitz and $D=2d$ when it is homothetic.

Of course, the second statement in the theorem is only meaningful when the VC dimension of $\mathcal{P}$ is finite. The constants $K$ and $\bar{C}$ depend on the primitives in the environment, but not on preferences, utilities, or sample sizes.
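To get a feel for the sample complexity bound, the following Python sketch evaluates the right-hand side of the bound in Theorem 3 and inverts it for the sample size. The theorem does not give values for $K$ and $\bar{C}$; the defaults of 1 below are placeholders, and the function names are hypothetical.

```python
import math

def recovery_bound(n, delta, V, D, K=1.0, C_bar=1.0):
    """Right-hand side of the bound: C_bar * (K*sqrt(V/n) + sqrt(2*ln(1/delta)/n))^(1/D)."""
    inner = K * math.sqrt(V / n) + math.sqrt(2.0 * math.log(1.0 / delta) / n)
    return C_bar * inner ** (1.0 / D)

def samples_needed(eps, delta, V, D, K=1.0, C_bar=1.0):
    """Smallest n for which the bound is at most eps.

    The inner term equals (K*sqrt(V) + sqrt(2*ln(1/delta))) / sqrt(n), so the
    requirement is sqrt(n) >= numerator / (eps / C_bar)^D.
    """
    numerator = K * math.sqrt(V) + math.sqrt(2.0 * math.log(1.0 / delta))
    return math.ceil((numerator / (eps / C_bar) ** D) ** 2)

# Placeholder inputs: VC dimension 3, exponent D = 2, constants set to 1.
n_required = samples_needed(eps=0.5, delta=0.05, V=3, D=2)
bound_at_n = recovery_bound(n_required, delta=0.05, V=3, D=2)
```

Since the required $n$ scales like $\varepsilon^{-2D}$, the sample size grows quickly with the dimension $d$ through the exponent $D$.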

5. Recovering preferences and utilities

The discussion in Section 3.4 focused on utility recovery, taking convergence of preferences as given. Here we take a step back, provide some conditions for preference recovery that are particularly relevant for the setting of Section 3.4, and then connect these back to utility recovery in Corollary 1. First we describe an experimental setting in which preferences may be elicited: an agent, or subject, faces a sequence of (incentivized) choice problems, and the choices made produce data on his preferences. The specific model and description below are borrowed from Chambers, Echenique, and Lambert (2021), but the setting is completely standard in choice theory.

Let $X=\Delta([a,b])^{S}$ be the set of acts over monetary lotteries, as discussed in Section 3.4. A choice function is a pair $(\Sigma,c)$ with $\Sigma\subseteq 2^{X}\setminus\{\varnothing\}$ a collection of nonempty subsets of $X$, and $c:\Sigma\to 2^{X}$ with $\varnothing\neq c(A)\subseteq A$ for all $A\in\Sigma$. When $\Sigma$, the domain of $c$, is implied, we refer to $c$ as a choice function.

A choice function $(\Sigma,c)$ is generated by a preference relation $\succeq$ over $X$ if

$$c(A)=\{x\in A:x\succeq y\text{ for all }y\in A\},$$

for all $A\in\Sigma$.

The notation $(\Sigma,c_{\succeq})$ means that the choice function $(\Sigma,c_{\succeq})$ is generated by the preference relation $\succeq$ on $X$.

Our model features an experimenter (a female) and a subject (a male). The subject chooses among alternatives in a way described by a preference $\succeq^{*}$ over $X$, which we refer to as the data-generating preference. The experimenter seeks to infer $\succeq^{*}$ from the subject’s choices in a finite experiment.

In a finite experiment, the subject is presented with finitely many unordered pairs of alternatives $B_{k}=\{x_{k},y_{k}\}$ in $X$. For every pair $B_{k}$, the subject is asked to choose one of the two alternatives: $x_{k}$ or $y_{k}$.

A sequence of experiments is a collection $\Sigma_{\infty}=\{B_{i}\}_{i\in\mathbf{N}}$ of pairs of possible choices presented to the subject. Let $\Sigma_{k}=\{B_{1},\dots,B_{k}\}$ collect the first $k$ elements of a sequence of experiments, and $B=\cup_{k=1}^{\infty}B_{k}$ be the set of all alternatives that are used over all the experiments in a sequence. Here $\Sigma_{k}$ is a finite experiment of size $k$.

We make two assumptions on $\Sigma_{\infty}$. The first is that $B$ is dense in $X$. The second is that, for any $x,y\in B$ there is $k$ for which $B_{k}=\{x,y\}$. The first assumption is obviously needed to obtain any general preference recovery result. The second assumption means that the experimenter is able to elicit the subject’s choices over all pairs used in her experiment. (Footnote 6: If there is a countable dense $A\subseteq X$, then one can always construct such a sequence of experiments via a standard diagonalization argument.)

For each $k$, the subject’s preference $\succeq^{*}$ generates a choice function $(\Sigma_{k},c)$ by letting, for each $B_{i}\in\Sigma_{k}$, $c(B_{i})$ be a maximal element of $B_{i}$ according to $\succeq^{*}$. Thus the choice behavior observed by the experimenter is always consistent with $(\Sigma_{k},c_{\succeq^{*}})$.

We introduce two notions of rationalization: weak and strong. A preference $\succeq_{k}$ weakly rationalizes $(\Sigma_{k},c)$ if, for all $B_{i}\in\Sigma_{k}$, $c(B_{i})\subseteq c_{\succeq_{k}}(B_{i})$. A preference $\succeq_{k}$ weakly rationalizes a choice sequence $(\Sigma_{\infty},c)$ if it weakly rationalizes the choice function of order $k$, $(\Sigma_{k},c)$, for all $k\geq 1$.

A preference $\succeq_{k}$ strongly rationalizes $(\Sigma_{k},c)$ if, for all $B_{i}\in\Sigma_{k}$, $c(B_{i})=c_{\succeq_{k}}(B_{i})$. A preference $\succeq_{k}$ strongly rationalizes a choice sequence $(\Sigma_{\infty},c)$ if it strongly rationalizes the choice function of order $k$, $(\Sigma_{k},c)$, for all $k\geq 1$.
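The difference between the two notions is easy to see in a toy example. The Python sketch below simplifies the setting: alternatives are plain numbers (sure amounts of money) rather than acts in $\Delta([a,b])^{S}$, preferences are given by utility functions, and all names are hypothetical.

```python
def generated_choice(u, problem):
    """The maximal elements of a choice problem for the preference represented by u."""
    best = max(u(x) for x in problem)
    return {x for x in problem if u(x) == best}

def weakly_rationalizes(u, experiment, choice):
    # Observed choices must be among the optimal alternatives.
    return all(choice[B] <= generated_choice(u, B) for B in experiment)

def strongly_rationalizes(u, experiment, choice):
    # Observed choices must be exactly the optimal alternatives.
    return all(choice[B] == generated_choice(u, B) for B in experiment)

# A finite experiment with two binary choice problems.
experiment = [frozenset({1, 2}), frozenset({2, 3})]
choice = {frozenset({1, 2}): {2}, frozenset({2, 3}): {3}}

more_is_better = lambda x: x      # strictly monotone preference
indifferent = lambda x: 0         # complete indifference

weak_monotone = weakly_rationalizes(more_is_better, experiment, choice)
strong_monotone = strongly_rationalizes(more_is_better, experiment, choice)
weak_indiff = weakly_rationalizes(indifferent, experiment, choice)
strong_indiff = strongly_rationalizes(indifferent, experiment, choice)
```

Complete indifference weakly rationalizes any data, since every alternative is optimal, but it fails strong rationalization here because the unchosen alternatives would also have to be chosen.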

In the history of revealed preference theory in consumer theory, strong rationalizability came first. It is essentially the notion in Samuelson (1938) and Richter (1966). Strong rationalizability is the appropriate notion when it is known that all potentially chosen alternatives are actually chosen, or when we want to impose, as an added discipline, that the observed choices are uniquely optimal in each choice problem. This makes sense when studying demand functions, as Samuelson did. Weak rationalizability was one of the innovations in Afriat (1967b), who was interested in demand correspondences. (Footnote 7: As an illustration of the difference between these two notions of rationalizability, note that, in the setting of consumer theory, one leads to the Strong Axiom of Revealed Preference while the other leads to the Generalized Axiom of Revealed Preference. Of course, Afriat’s approach is also distinct in assuming a finite dataset. See Chambers and Echenique (2016) for a detailed discussion.)

5.1. A general “limiting” result

Our next result serves to contrast what can be achieved with the “limiting” (countably infinite) data with the limit of preferences recovered from finite choice experiments.

Theorem 4.

Suppose that $\succeq$ and $\succeq^{*}$ are two continuous preference relations (complete and transitive). If $\succeq|_{B\times B}=\succeq^{*}|_{B\times B}$, then $\succeq=\succeq^{*}$.

Indeed, as the proof makes clear, Theorem 4 would hold more generally for any $X$ which is a connected topological space, but it may not hold in the absence of connectedness. There is a sense in which the limiting case with an infinite amount of data offers no problems for preference recovery. The structure we impose is needed for the limit of rationalizations drawn from finite data.

5.2. Recovery from finite data in the AA model

Here we adopt the same structural assumptions as in Section 3.4, namely that $X=\Delta([a,b])^{S}$, endowed with the weak topology and the first order stochastic dominance relation. However, the result easily extends to broader environments, as the proof makes clear.

Theorem 5.

There is a sequence of finite experiments $\Sigma_{\infty}$ so that if the subject’s preference $\succeq^{*}$ is continuous and weakly monotone, and for each $k\in\mathbf{N}$, $\succeq^{k}$ is a continuous and weakly monotone preference that strongly rationalizes a choice function $(\Sigma_{k},c)$ generated by $\succeq^{*}$, then $\succeq^{k}\rightarrow\succeq^{*}$.

Corollary 1.

Let $\succeq^{*}$ and $\succeq^{k}$ be as in the statement of Theorem 5. If, in addition, $\succeq^{*}$ and $\succeq^{k}$ have standard representations $(V,u)$ and $(V^{k},u^{k})$, then $(V,u)=\lim_{k\to\infty}(V^{k},u^{k})$.

Note that Theorem 5 requires the existence of the data-generating preference $\succeq^{*}$.

A “dual” result to Theorem 5 was established in Chambers, Echenique, and Lambert (2021). There, the focus was on weak rationalization via $\succeq^{k}$, which is a weaker notion than the strong rationalization hypothesized here. To achieve a weak rationalization result, we assumed instead that preferences were strictly monotone.

6. Proofs

In this section, unless we say otherwise, we denote by $X$ the set of acts $\Delta([a,b])^{S}$, and the elements of $X$ by $x,y,z$ etc. Note that $X$ is compact Polish when $\Delta([a,b])$ is endowed with the topology of weak convergence of probability measures. Let $\mathcal{P}$ be the set of all complete and continuous binary relations on $X$.

6.1. Lemmas

The lemmas stated here will be used in the proofs of our results.

Lemma 1.

Let $X\subseteq\mathbf{R}^{n}$. If $\{x^{\prime}_{n}\}$ is an increasing sequence in $X$ and $\{x^{\prime\prime}_{n}\}$ is a decreasing sequence such that $\sup\{x^{\prime}_{n}:n\geq 1\}=x^{*}=\inf\{x^{\prime\prime}_{n}:n\geq 1\}$, then

$$\lim_{n\rightarrow\infty}x^{\prime}_{n}=x^{*}=\lim_{n\rightarrow\infty}x^{\prime\prime}_{n}.$$
Proof.

This is obviously true when the dimension is $n=1$. For dimension $n>1$, convergence and sups and infs are obtained component-by-component, so the result follows. ∎

Lemma 2.

Let $X=\Delta([a,b])$. Let $\{x_{n}\}$ be a convergent sequence in $X$, with $x_{n}\rightarrow x^{*}$. Then there is an increasing sequence $\{x^{\prime}_{n}\}$ and a decreasing sequence $\{x^{\prime\prime}_{n}\}$ such that $x^{\prime}_{n}\leq x_{n}\leq x^{\prime\prime}_{n}$, and $\lim_{n\rightarrow\infty}x^{\prime}_{n}=x^{*}=\lim_{n\rightarrow\infty}x^{\prime\prime}_{n}$.

Proof.

The set $X$ ordered by first order stochastic dominance is a complete lattice (see, for example, Lemma 3.1 in Kertz and Rösler (2000)). Suppose that $x_{n}\rightarrow x^{*}$. Define $x^{\prime}_{n}$ and $x^{\prime\prime}_{n}$ by $x^{\prime}_{n}=\inf\{x_{m}:n\leq m\}$ and $x^{\prime\prime}_{n}=\sup\{x_{m}:n\leq m\}$. Clearly, $\{x^{\prime}_{n}\}$ is an increasing sequence, $\{x^{\prime\prime}_{n}\}$ is decreasing, and $x^{\prime}_{n}\leq x_{n}\leq x^{\prime\prime}_{n}$.

Let $F_{x}$ denote the cdf associated with $x$. Note that $F_{x^{\prime\prime}_{n}}(r)=\inf\{F_{x_{m}}(r):n\leq m\}$ while $F_{x^{\prime}_{n}}(r)$ is the right-continuous modification of $\sup\{F_{x_{m}}(r):n\leq m\}$. For any point of continuity $r$ of $F_{x^{*}}$, $F_{x_{m}}(r)\rightarrow F_{x^{*}}(r)$, so

$$F_{x^{*}}(r)=\sup\{\inf\{F_{x_{m}}(r):n\leq m\}:n\geq 1\}$$

by Lemma 1.

Moreover, $F_{x^{*}}(r)=\inf\{\sup\{F_{x_{m}}(r):n\leq m\}:n\geq 1\}$. Let $\varepsilon>0$. Then

$$F_{x^{*}}(r-\varepsilon)\leftarrow\sup\{F_{x_{m}}(r-\varepsilon):n\leq m\}\leq F_{x^{\prime}_{n}}(r)\leq\sup\{F_{x_{m}}(r+\varepsilon):n\leq m\}\rightarrow F_{x^{*}}(r+\varepsilon).$$

Then $F_{x^{\prime}_{n}}(r)\rightarrow F_{x^{*}}(r)$, as $r$ is a point of continuity of $F_{x^{*}}$. ∎
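The lattice operations used in the proof of Lemma 2 have a simple form in terms of cdfs: the supremum of a set of lotteries under first order stochastic dominance has the pointwise infimum of the cdfs, and the infimum has the pointwise supremum. The Python sketch below illustrates this on a finite grid of prizes, where right-continuity is automatic. The grid and the function names are assumptions made for this example.

```python
def cdf(pmf):
    """Cumulative distribution of a pmf listed in increasing order of prizes."""
    total, out = 0.0, []
    for p in pmf:
        total += p
        out.append(total)
    return out

def fosd_sup(pmfs):
    """cdf of the least upper bound: pointwise minimum of the cdfs."""
    return [min(vals) for vals in zip(*(cdf(p) for p in pmfs))]

def fosd_inf(pmfs):
    """cdf of the greatest lower bound: pointwise maximum of the cdfs."""
    return [max(vals) for vals in zip(*(cdf(p) for p in pmfs))]

def dominates(F, G):
    """F first order stochastically dominates G iff F <= G pointwise."""
    return all(f <= g + 1e-12 for f, g in zip(F, G))

# Two lotteries on the prize grid (0, 0.5, 1) that are not FOSD-comparable.
p = [0.5, 0.0, 0.5]
q = [0.0, 1.0, 0.0]
upper = fosd_sup([p, q])
lower = fosd_inf([p, q])
```

Here `upper` dominates both lotteries and `lower` is dominated by both, which is the finite analogue of the bounds $x^{\prime}_{n}\leq x_{n}\leq x^{\prime\prime}_{n}$ used in the squeezing argument.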

The results we have obtained motivate two definitions that will prove useful. Say that the set $X$, together with the collection of finite experiments $\Sigma_{\infty}$, has the countable order property if for each $x\in X$ and each neighborhood $V$ of $x$ in $X$ there are $x^{\prime},x^{\prime\prime}\in(\cup_{i}B_{i})\cap V$ with $x^{\prime}\leq x\leq x^{\prime\prime}$. We say that $X$ has the squeezing property if for any convergent sequence $\{x_{n}\}_{n}$ in $X$, if $x_{n}\rightarrow x^{*}$ then there is an increasing sequence $\{x^{\prime}_{n}\}_{n}$, and a decreasing sequence $\{x^{\prime\prime}_{n}\}_{n}$, such that $x^{\prime}_{n}\leq x_{n}\leq x^{\prime\prime}_{n}$, and $\lim_{n\rightarrow\infty}x^{\prime}_{n}=x^{*}=\lim_{n\rightarrow\infty}x^{\prime\prime}_{n}$.

Lemma 3.

If $X=\Delta([a,b])^{S}$, then $X$ has the squeezing property, and there is $\Sigma_{\infty}$ such that $(X,\Sigma_{\infty})$ has the countable order property.

Proof.

The squeezing property follows from Lemma 2, and the countable order property from Theorem 15.11 of Aliprantis and Border (2006). Indeed, let $B$ be the set of probability distributions $p$ with finite support on $\mathbf{Q}\cap[a,b]$, where for all $q\in\mathbf{Q}\cap[a,b]$, $p(q)\in\mathbf{Q}$. Then we may choose a sequence of pairs $B_{i}$, and let $\Sigma_{\infty}$ be the sequence $\{B_{i}\}$, with $B=\cup_{i}B_{i}$, so that the countable order property is satisfied. ∎

6.2. Proof of Theorem 2

Without loss of generality, we may set $[a,b]=[0,1]$. First we show that $u^{k}\to u$ in the compact-open topology. To this end, let $x^{k}\to x$. We want to show that $u^{k}(x^{k})\to u(x)$. Suppose then that this is not the case; by selecting a subsequence, we may assume (without loss) that $u^{k}(x^{k})\to Y>u(x)$. Note that $\delta_{x^{k}}\sim^{k}p^{k}$, where $p^{k}$ is the lottery that pays $1$ with probability $u^{k}(x^{k})\in[0,1]$, and $0$ with probability $1-u^{k}(x^{k})$. Let $p$ be the lottery that pays $1$ with probability $Y$, and $0$ with probability $1-Y$ (given that the range of $u^{k}$ is $[0,1]$, we must have $Y\in[0,1]$). Now we have that $(\delta_{x^{k}},p^{k})\to(\delta_{x},p)$ and $\delta_{x^{k}}\sim^{k}p^{k}$ implies $\delta_{x}\sim p$. This is a contradiction because $\delta_{x}$ is indifferent in $\succeq$ to the lottery that pays $1$ with probability $u(x)\in[0,1]$, and $0$ with probability $1-u(x)$. The latter is strictly first-order stochastically dominated by the lottery $p$.

To finish the proof, we show that $V^{k}\to V$. This is the same as proving that $V^{k}(f^{k})\to V(f)$ when $f^{k}\to f$. For each $k$, continuity and weak monotonicity imply that there is $x^{k}\in[0,1]$ so that

$$V^{k}(f^{k})=V^{k}(\delta_{x^{k}},\ldots,\delta_{x^{k}})=u^{k}(x^{k}).$$

Similarly, there is $x$ with $V(f)=V(\delta_{x},\ldots,\delta_{x})=u(x)$.

Now we argue that $x^{k}\to x$. Indeed $\{x^{k}\}$ is a sequence in $[0,1]$. If there is a subsequence that converges to, say, $x^{\prime}>x$ then we may choose $x^{\prime\prime}=\frac{x+x^{\prime}}{2}$ and eventually

$$f^{k}\succeq^{k}(\delta_{x^{\prime\prime}},\ldots,\delta_{x^{\prime\prime}})\succ(\delta_{x},\ldots,\delta_{x})\sim f,$$

using weak monotonicity. This is impossible because $(f^{k},(\delta_{x^{k}},\ldots,\delta_{x^{k}}))\to(f,(\delta_{x^{\prime}},\ldots,\delta_{x^{\prime}}))$ and $f^{k}\succeq^{k}(\delta_{x^{k}},\ldots,\delta_{x^{k}})$ imply that $f\succeq(\delta_{x^{\prime}},\ldots,\delta_{x^{\prime}})\succeq(\delta_{x^{\prime\prime}},\ldots,\delta_{x^{\prime\prime}})$.

Finally, using what we know about the convergence of $u^{k}$ to $u$, $V^{k}(f^{k})=u^{k}(x^{k})\to u(x)=V(f)$.

We now turn to the second statement in the theorem. Observe that $H^{k}$ is a continuous function from $[0,1]^{S}$ onto $[0,1]$. Let $z^{k}\in[0,1]^{S}$ be an arbitrary convergent sequence, and say that $z^{k}\rightarrow z^{*}$. We claim that $H^{k}(z^{k})\rightarrow H(z^{*})$. Without loss we may assume that $H^{k}(z^{k})\rightarrow Y$, by taking a subsequence if necessary. For each $k$ and $s$, choose $y^{k}(s)\in[0,1]$ for which $u^{k}(y^{k}(s))=z^{k}(s)$. Again, without loss, we may assume that $y^{k}\rightarrow y^{*}$ by taking a subsequence if necessary, and using the finiteness of $S$. Observe also that $u(y^{*}(s))=z^{*}(s)$ as we have shown that $u^{k}\to u$ in the compact-open topology.

Now, we may also choose $\hat{z}^{k}\in[0,1]$ so that

$$u^{k}(\hat{z}^{k})=H^{k}(z^{k})=H^{k}((u^{k}(y^{k}(s)))_{s\in S}),$$

and further may again without loss (by taking a subsequence) assume that $\hat{z}^{k}$ converges to $\hat{z}^{*}$. Thus $u(\hat{z}^{*})=\lim u^{k}(\hat{z}^{k})=\lim H^{k}(z^{k})=Y$, again using what we have shown regarding $u^{k}\to u$. Then $(\delta_{\hat{z}^{k}},\ldots,\delta_{\hat{z}^{k}})\sim^{k}(y^{k}(s))_{s\in S}$ so that, by taking limits, $(\delta_{\hat{z}^{*}},\ldots,\delta_{\hat{z}^{*}})\sim(y^{*}(s))_{s\in S}$. This implies that $Y=u(\hat{z}^{*})=H((u(y^{*}(s)))_{s\in S})=H(z^{*})$.

6.3. Proof of Proposition 1

Take $(\succeq^{k},p^{k})$ as in the statement of the Proposition, and observe that for every $k$, $\mbox{ce}(\succeq^{k},p^{k})\in[a,b]$. Suppose, by way of contradiction, that $\mbox{ce}(\succeq^{k},p^{k})\rightarrow\mbox{ce}(\succeq,p)$ is false. Then there is some $\epsilon>0$ and a subsequence for which $|\mbox{ce}(\succeq^{k_{m}},p^{k_{m}})-\mbox{ce}(\succeq,p)|>\epsilon$. By taking a further subsequence, we assume without loss that $\mbox{ce}(\succeq^{k_{m}},p^{k_{m}})\rightarrow\alpha\neq\mbox{ce}(\succeq,p)$. Now, $p^{k_{m}}\sim^{k_{m}}\delta_{\mbox{ce}(\succeq^{k_{m}},p^{k_{m}})}$, and $p^{k_{m}}\rightarrow p$ and $\delta_{\mbox{ce}(\succeq^{k_{m}},p^{k_{m}})}\rightarrow\delta_{\alpha}$. So by definition of closed convergence, it follows that $p\sim\delta_{\alpha}$; but this violates certainty monotonicity as $\alpha\neq\mbox{ce}(\succeq,p)$.

7. Proof of Theorem 3

First some notation. Let $\mu_{n}(\succeq)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{x_{i}\succeq y_{i}}$, and let $\succeq_{n}\in\mathcal{P}$ be represented by $u_{n}\in\mathcal{U}$. By definition of $u_{n}$, we have that $\mu_{n}(\succeq_{n})\geq\mu_{n}(\succeq)$ for all $\succeq\in\mathcal{P}$. We use $\mathrm{Vol}(A)$ to denote the volume of a set $A$ in $\mathbf{R}^{d}$, when this is well defined (see Schneider (2014)).

Consider the measure $\mu$ on $X\times X$ defined as

$$\mu(A,\succeq)=\int_{A}q(\succeq;x,y)\,\mathrm{d}\lambda(x,y).$$

In particular

$$\mu(\succeq^{\prime},\succeq)=\int_{X\times X}\mathbf{1}_{\succeq^{\prime}}(x,y)\,q(\succeq;x,y)\,\mathrm{d}\lambda(x,y)$$

is the probability that a choice with error made at a randomly-drawn choice problem by an agent with preference $\succeq$ will coincide with $\succeq^{\prime}$.

The key identification result shown in Chambers, Echenique, and Lambert (2021) is that, if $\succeq^{\prime}\neq\succeq$, then

$$\mu(\succeq^{\prime},\succeq)<\mu(\succeq,\succeq).$$
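The identification inequality can be illustrated by Monte Carlo. The Python sketch below estimates $\mu(\succeq^{\prime},\succeq)$ by averaging, over random pairs, the probability that $x$ is chosen whenever $x\succeq^{\prime}y$. The uniform distribution on $[1,2]^{2}$, the constant choice probability `theta`, and the Cobb-Douglas utilities are all assumptions made for this example.

```python
import random

def mu(u_prime, u, theta, pairs):
    """Monte Carlo estimate of mu(pref', pref) from sampled pairs (x, y)."""
    total = 0.0
    for x, y in pairs:
        # q(pref; x, y): probability that the agent with utility u picks x.
        if u(x) > u(y):
            q = theta
        elif u(x) < u(y):
            q = 1.0 - theta
        else:
            q = 0.5
        # Indicator that pref' ranks x at least as high as y.
        if u_prime(x) >= u_prime(y):
            total += q
    return total / len(pairs)

def cobb_douglas(a):
    return lambda x: x[0] ** a * x[1] ** (1.0 - a)

rng = random.Random(1)
pairs = [((rng.uniform(1, 2), rng.uniform(1, 2)),
          (rng.uniform(1, 2), rng.uniform(1, 2))) for _ in range(20000)]

u, u_other = cobb_douglas(0.3), cobb_douglas(0.7)
mu_same = mu(u, u, theta=0.75, pairs=pairs)         # estimate of mu(pref, pref)
mu_other = mu(u_other, u, theta=0.75, pairs=pairs)  # estimate of mu(pref', pref)
```

In this example the true preference attains the larger value, and the gap between the two estimates reflects how often the two preferences disagree on randomly drawn pairs.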
Lemma 4.

Consider a Lipschitz noisy choice environment $(X,\mathcal{P},\lambda,q)$. There is a constant $C$ with the following property: if $\succeq$ and $\succeq^{\prime}$ are two preferences in $\mathcal{P}$ with representations $u$ and $u^{\prime}$ (respectively) in $\mathcal{U}$, then

$$C\rho(u,u^{\prime})^{d}\leq\mu(\succeq,\succeq)-\mu(\succeq^{\prime},\succeq).$$
Proof.

The ball in $\mathbf{R}^{d}$ with center $x$ and radius $\varepsilon$ is denoted by $B_{\varepsilon}(x)$. First we show that the map

$$\varepsilon\mapsto\frac{\mathrm{Vol}(B_{\varepsilon}(x)\cap X)}{\mathrm{Vol}(B_{\varepsilon}(x))},$$

defined for $x\in X$, is nonincreasing as a function of $\varepsilon>0$.

Indeed, let $\varepsilon_{1}<\varepsilon_{2}$, and let $y\in B_{\varepsilon_{2}}(x)\cap X$. Then $y\in X$ and $\|y-x\|\leq\varepsilon_{2}$. By convexity of $X$, $y_{1}\equiv x+\frac{\varepsilon_{1}}{\varepsilon_{2}}(y-x)=(1-\frac{\varepsilon_{1}}{\varepsilon_{2}})x+\frac{\varepsilon_{1}}{\varepsilon_{2}}y\in X$, and $y_{1}\in B_{\varepsilon_{1}}(x)$. Observe further by properties of Lebesgue measure in $\mathbf{R}^{d}$ that $\mathrm{Vol}(\{x+\frac{\varepsilon_{1}}{\varepsilon_{2}}(y-x):y\in B_{\varepsilon_{2}}(x)\cap X\})=\left(\frac{\varepsilon_{1}}{\varepsilon_{2}}\right)^{d}\mathrm{Vol}(B_{\varepsilon_{2}}(x)\cap X)$. Therefore, $\mathrm{Vol}(B_{\varepsilon_{1}}(x)\cap X)\geq\left(\frac{\varepsilon_{1}}{\varepsilon_{2}}\right)^{d}\mathrm{Vol}(B_{\varepsilon_{2}}(x)\cap X)$. Since $\mathrm{Vol}(B_{\varepsilon_{1}}(x))=\left(\frac{\varepsilon_{1}}{\varepsilon_{2}}\right)^{d}\mathrm{Vol}(B_{\varepsilon_{2}}(x))$, it follows that

$$\frac{\mathrm{Vol}(B_{\varepsilon_{1}}(x)\cap X)}{\mathrm{Vol}(B_{\varepsilon_{1}}(x))}\geq\frac{\mathrm{Vol}(B_{\varepsilon_{2}}(x)\cap X)}{\mathrm{Vol}(B_{\varepsilon_{2}}(x))},$$

as we wanted to show.

Now observe that there exists $\bar{\varepsilon}>0$ large enough that $X\subseteq B_{\varepsilon}(x)$ for all $\varepsilon\geq\bar{\varepsilon}$ and $x\in X$. Hence, for any $x\in X$ and $\varepsilon\in(0,\bar{\varepsilon}]$

$$\frac{\mathrm{Vol}(B_{\varepsilon}(x)\cap X)}{\mathrm{Vol}(B_{\varepsilon}(x))}\geq\frac{\mathrm{Vol}(X)}{\mathrm{Vol}(B_{\bar{\varepsilon}}(x))}\equiv c^{\prime}>0,$$

as $X$ has nonempty interior and the volume of a ball in $\mathbf{R}^{d}$ is independent of its center.

Now we proceed with the proof of the statement in the lemma. Let $\Delta=\rho(u,u^{\prime})$ and fix $x\in X$ with (wlog) $u(x)-u^{\prime}(x)=\Delta>0$. Set

$$\varepsilon=\frac{\Delta}{4\kappa}.$$

We may assume that $\varepsilon\leq 2\bar{\varepsilon}$ as defined above, as otherwise we can use a larger upper bound on the Lipschitz constants for the functions in $\mathcal{U}$.

Consider the interval

$$I=[(u^{\prime}(x)+\kappa\varepsilon)\mathbf{1},(u(x)-\kappa\varepsilon)\mathbf{1}],$$

with volume

$$(u(x)-\kappa\varepsilon-(u^{\prime}(x)+\kappa\varepsilon))^{d}=(\Delta/2)^{d}.$$

Consider $B_{\varepsilon/2}(x)$. If $y\in B_{\varepsilon/2}(x)$ then $\left|\tilde{u}(y)-\tilde{u}(x)\right|<\kappa\varepsilon$ for any $\tilde{u}\in\mathcal{U}$.

Now, if $z\in I$ and $y\in B_{\varepsilon/2}(x)$ then

$$u(y)>u(x)-\kappa\varepsilon=u((u(x)-\kappa\varepsilon)\mathbf{1})\geq u(z)$$

by monotonicity. Similarly,

$$u^{\prime}(z)\geq u^{\prime}((u^{\prime}(x)+\kappa\varepsilon)\mathbf{1})=u^{\prime}(x)+\kappa\varepsilon>u^{\prime}(y).$$

Thus $(y,z)\in\succ\setminus\succeq^{\prime}$ for any $(y,z)\in B_{\varepsilon/2}(x)\times I$, and

$$\begin{aligned}\mu(\succeq,\succeq)-\mu(\succeq^{\prime},\succeq)&=\int 1_{\succ\setminus\succ^{\prime}}(y,z)[q(\succeq;(y,z))-q(\succeq;(z,y))]\,\mathrm{d}\lambda(y,z)\\&\geq\int_{B_{\varepsilon/2}(x)\times I}1_{\succ\setminus\succ^{\prime}}(y,z)[q(\succeq;(y,z))-q(\succeq;(z,y))]\,\mathrm{d}\lambda(y,z)\\&\geq\lambda(B_{\varepsilon/2}(x)\times I)\inf\{q(\succeq;(y,z))-q(\succeq;(z,y)):(y,z)\in B_{\varepsilon/2}(x)\times I\}.\end{aligned}$$

Here the first identity is shown in Chambers, Echenique, and Lambert (2021). The second relation (an inequality) follows because $q(\succeq;(x,y))>1/2>q(\succeq;(y,x))$ for $(x,y)\in\succ$, so the integrand is nonnegative. The third relation holds because $(y,z)\in\succ\setminus{\succeq}^{\prime}\subseteq\succ\setminus{\succ^{\prime}}$ on $B_{\varepsilon/2}(x)\times I$.

By the assumptions we have placed on λ\lambda, and the calculations above, we know that

λ(Bε(x)/2)c¯Vol(Bϵ¯(x)X)c¯cVol(Bϵ¯(x))=c¯c(ε/2)dπd/2Γ(1+d/2).\lambda(B_{\varepsilon(x)/2})\geq\bar{c}\;\mathrm{Vol}(B_{\bar{\epsilon}}(x)\cap X)\geq\bar{c}c^{\prime}\;\mathrm{Vol}(B_{\bar{\epsilon}}(x))=\bar{c}c^{\prime}\frac{(\varepsilon/2)^{d}\pi^{d/2}}{\Gamma(1+d/2)}.

So there is a constant $C''$ (that only depends on $X$ and $\bar{c}$) so that $\lambda(I\times B_{\varepsilon/2}(x))$ is bounded below by

$$(\Delta/2)^{d}\frac{C''(\varepsilon/2)^{d}\pi^{d/2}}{\Gamma(1+d/2)}=(\Delta/2)^{d}\frac{C''\Delta^{d}\pi^{d/2}}{(8\kappa)^{d}\Gamma(1+d/2)}=C'\Delta^{2d}.$$

Here $C'$ is a constant that only depends on $C''$, $\kappa$ and $d$.
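The constant $C'$ can be made explicit. The displayed equality is consistent with $\varepsilon=\Delta/(4\kappa)$, so that $\varepsilon/2=\Delta/(8\kappa)$, which gives $C'=C''\pi^{d/2}/((16\kappa)^{d}\Gamma(1+d/2))$. The following Python sketch checks this arithmetic numerically. The function names and the argument `c2` (standing in for $C''$) are illustrative assumptions, not part of the original argument.

```python
import math

def lower_bound_direct(delta, kappa, d, c2):
    """(Delta/2)^d * C'' * (eps/2)^d * pi^(d/2) / Gamma(1 + d/2)."""
    # Assumption: eps = Delta / (4 * kappa), the value implied by the displayed equality.
    eps = delta / (4.0 * kappa)
    ball_term = (eps / 2.0) ** d * math.pi ** (d / 2.0) / math.gamma(1.0 + d / 2.0)
    return (delta / 2.0) ** d * c2 * ball_term

def c_prime(kappa, d, c2):
    """Closed form C' = C'' * pi^(d/2) / ((16 * kappa)^d * Gamma(1 + d/2))."""
    return c2 * math.pi ** (d / 2.0) / ((16.0 * kappa) ** d * math.gamma(1.0 + d / 2.0))

def lower_bound_closed(delta, kappa, d, c2):
    """C' * Delta^(2d), the right-hand side of the display."""
    return c_prime(kappa, d, c2) * delta ** (2 * d)
```

Both functions should agree up to floating-point error for any positive `delta`, `kappa`, `c2` and any dimension `d`, which is one way to confirm that $C'$ depends only on $C''$, $\kappa$ and $d$.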

By the assumption that $\Theta>1/2$, we get that

$$\mu(\succeq,\succeq)-\mu(\succeq',\succeq)\geq C\Delta^{2d}$$

for some constant $C$ that depends on $C'$ and $\Theta$. ∎

Lemma 5.

Consider a homothetic noise choice environment $(X,\mathcal{P},\lambda,q)$. There is a constant $C$ with the following property: if $\succeq$ and $\succeq'$ are two preferences in $\mathcal{P}$ with representations $u$ and $u'$ (respectively) in $\mathcal{U}$, then

$$C\rho(u,u')^{2d}\leq\mu(\succeq,\succeq)-\mu(\succeq',\succeq).$$
Proof.

Let $x\in X$ be such that

$$\rho(u,u')\leq u(x)-u'(x)=\Delta>0.$$

Choose $\eta\in(0,1)$ so that $u(\eta x)-u'(x)=\Delta/2$. Let

$$I=(u'(x)\mathbf{1},u(\eta x)\mathbf{1})$$

and

$$Z_{\eta}=[\eta x,x]\cap D^{M}_{\alpha}.$$

Note that $I\subseteq X$ because, by homotheticity, $\|x\|=M$ and hence $x\geq\alpha\mathbf{1}$. Then we must have $\alpha\mathbf{1}\leq u'(x)\mathbf{1}$, as $\alpha\mathbf{1}\not\leq u'(x)\mathbf{1}$ would mean that $u'(x)\mathbf{1}\ll\alpha\mathbf{1}$, contradicting monotonicity and $x\sim' u'(x)\mathbf{1}$.

Observe that if $y\in I$ and $z\in Z_{\eta}$ then we have that

$$u(y)<u(u(\eta x)\mathbf{1})=u(\eta x)\leq u(z),$$

as $y<u(\eta x)\mathbf{1}$ and $\eta x\leq z$; while

$$u'(z)\leq u'(x)=u'(u'(x)\mathbf{1})<u'(y).$$

Hence $(z,y)\in{\succ}\setminus{\succeq'}$.

First we estimate $\mathrm{Vol}(Z_{\eta})$. Write $Z_{0}$ for $[0,x]\cap D^{M}_{\alpha}$. Define the function $f(z)=x+(1-\eta)(z-x)$ and note that when $z\in Z_{0}$ then $f(z)=\eta x+(1-\eta)z\in[\eta x,x]$ because $z\geq 0$. Note also that $f(z)$ is a convex combination of $x$ and $z$, so $f(z)\in D^{M}_{\alpha}$ as the latter is a convex set. This shows that

$$Z_{\eta}=\{x\}+(1-\eta)(Z_{0}-\{x\}),$$

and hence that $\mathrm{Vol}(Z_{\eta})=(1-\eta)^{d}\,\mathrm{Vol}(Z_{0})$.
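The scaling identity can be checked on a simple case. The sketch below ignores the intersection with $D^{M}_{\alpha}$ and treats $Z_{0}=[0,x]$ and $Z_{\eta}=[\eta x,x]$ as plain boxes, which is an assumption made only for illustration. The helper names and the sample values of `x` and `eta` are hypothetical.

```python
import math

def box_volume(lo, hi):
    """Volume of the axis-aligned box [lo, hi]."""
    return math.prod(h - l for l, h in zip(lo, hi))

def f(z, x, eta):
    """The affine map f(z) = x + (1 - eta) * (z - x), applied coordinate-wise."""
    return [xi + (1.0 - eta) * (zi - xi) for zi, xi in zip(z, x)]

x = [2.0, 3.0, 1.5]
eta = 0.25
d = len(x)

# f maps the corners 0 and x of [0, x] to the corners eta*x and x of [eta*x, x].
lower_corner = f([0.0] * d, x, eta)
upper_corner = f(x, x, eta)

vol_Z0 = box_volume([0.0] * d, x)
vol_Zeta = box_volume(lower_corner, upper_corner)

# The two volumes differ by the factor (1 - eta)^d.
assert math.isclose(vol_Zeta, (1.0 - eta) ** d * vol_Z0)
```

The same factor $(1-\eta)^{d}$ is the Jacobian determinant of $f$, which is why the identity carries over from boxes to the sets $Z_{0}$ and $Z_{\eta}$ in the text.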

Now, since $Z_{0}$ is star shaped we have

$$\mathrm{Vol}(Z_{0})=\frac{1}{d}\int_{y\in S^{M}_{\alpha}}\rho(y,[0,x])^{d}\,\mathrm{d}y\geq\left(\frac{\alpha}{M}\right)^{d}A^{M}_{\alpha},$$

where $A^{M}_{\alpha}$ is the surface area of $S^{M}_{\alpha}$ and $\rho(y,[0,x])=\max\{\theta>0:\theta y\in[0,x]\}$ is the radial function of the set $[0,x]$ (see Schneider (2014), page 57). The inequality results from $\rho(y,[0,x])\geq\alpha/M$, as $x_{i}\geq\alpha$ and $y_{i}\leq M$ for any $y\in S^{M}_{\alpha}$.

Now,

$$1-\eta=1-\frac{\Delta/2+u'(x)}{u(x)}=\frac{\Delta/2}{u(x)}\geq\frac{\Delta/2}{M},$$

as $u(x)\leq M$. Thus we have that

$$\mathrm{Vol}(Z_{\eta})\geq\Delta^{d}C',$$

with $C'=\mathrm{Vol}(Z_{0})/(2M)^{d}>0$, a constant.
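As a quick numerical illustration of the last two displays, the sketch below picks hypothetical values for $u(x)$, $u'(x)$ and $M$, uses homotheticity in the form $u(\eta x)=\eta u(x)$ (as in the display for $1-\eta$) to solve for $\eta$, and checks that $1-\eta=(\Delta/2)/u(x)\geq\Delta/(2M)$. All function and variable names are assumptions made for this example.

```python
import math

def eta_and_delta(u_x, u_prime_x):
    """Return (eta, Delta) with Delta = u(x) - u'(x) and eta * u(x) - u'(x) = Delta / 2."""
    if not u_x > u_prime_x > 0:
        raise ValueError("expected u(x) > u'(x) > 0")
    delta = u_x - u_prime_x
    # Homotheticity assumption: u(eta * x) = eta * u(x).
    eta = (delta / 2.0 + u_prime_x) / u_x
    return eta, delta

def volume_factor_lower_bound(delta, M, d):
    """Lower bound (Delta / (2M))^d on (1 - eta)^d, valid when u(x) <= M."""
    return (delta / (2.0 * M)) ** d

# Hypothetical values: u(x) = 4, u'(x) = 2, M = 5, d = 3.
eta, delta = eta_and_delta(u_x=4.0, u_prime_x=2.0)
assert math.isclose(1.0 - eta, (delta / 2.0) / 4.0)
assert (1.0 - eta) ** 3 >= volume_factor_lower_bound(delta, M=5.0, d=3)
```

Multiplying `volume_factor_lower_bound` by $\mathrm{Vol}(Z_{0})$ reproduces the bound $\Delta^{d}C'$ with $C'=\mathrm{Vol}(Z_{0})/(2M)^{d}$.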

Moreover, we have $\mathrm{Vol}(I)=(\Delta/2)^{d}$ as $I\subseteq X$. Then we obtain, again using a formula derived in Chambers, Echenique, and Lambert (2021), and that $q(\succeq;(x,y))>1/2>q(\succeq;(y,x))$ whenever $(x,y)\in{\succ}$:

$$\begin{aligned}
\mu(\succeq,\succeq)-\mu(\succeq',\succeq) &=\int 1_{\succ\setminus\succ'}(z,y)\,[q(\succeq;(z,y))-q(\succeq;(y,z))]\,\mathrm{d}\lambda(z,y)\\
&\geq\int_{Z_{\eta}\times I}1_{\succ\setminus\succ'}(z,y)\,[q(\succeq;(z,y))-q(\succeq;(y,z))]\,\mathrm{d}\lambda(z,y)\\
&\geq\lambda(Z_{\eta}\times I)\,\inf\{q(\succeq;(z,y))-q(\succeq;(y,z)):(z,y)\in Z_{\eta}\times I\}\\
&\geq(\Delta/2)^{d}C'\Delta^{d}\Theta,
\end{aligned}$$

where $\Theta=\inf\{q(\succeq;(z,y))-q(\succeq;(y,z)):(z,y)\in Z_{\eta}\times I\}>0$. ∎

7.1. Proof of Theorem 3

For the rest of this proof, we denote $\mu(\succeq,\succeq^{*})$ by $\mu(\succeq)$.

The rest of the proof uses routine ideas from statistical learning theory. By standard results (see, for example, Theorem 3.1 in Boucheron, Bousquet, and Lugosi (2005)), there exists an event $E$ with probability at least $1-\delta$ on which:

$$\sup\{|\mu_{n}(\succeq)-\mu(\succeq)|:\succeq\in\mathcal{P}\}\leq\mathbf{E}\sup\{|\mu_{n}(\succeq)-\mu(\succeq)|:\succeq\in\mathcal{P}\}+\sqrt{\frac{2\ln(1/\delta)}{n}}.$$

Moreover, again by standard arguments (see Theorem 3.2 in Boucheron, Bousquet, and Lugosi (2005)), we also have

$$\mathbf{E}\sup\{|\mu_{n}(\succeq)-\mu(\succeq)|:\succeq\in\mathcal{P}\}\leq 2\,\mathbf{E}\sup\Big\{\frac{1}{n}\Big|\sum_{i}\sigma_{i}\mathbf{1}_{\tilde{x}_{i}\succeq y_{i}}\Big|:\succeq\in\mathcal{P}\Big\},$$

where

$$R_{n}(\mathcal{P})=\mathbf{E}\sup\Big\{\frac{1}{n}\Big|\sum_{i}\sigma_{i}\mathbf{1}_{\tilde{x}_{i}\succeq y_{i}}\Big|:\succeq\in\mathcal{P}\Big\}$$

is the Rademacher average of $\mathcal{P}$.
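To make the definition concrete, the following sketch computes the Rademacher average exactly for a fixed sample by enumerating all sign vectors $\sigma\in\{-1,1\}^{n}$. It conditions on the sample (the expectation in $R_{n}(\mathcal{P})$ is also over the sample) and represents each preference in a finite class by the 0/1 vector of indicators $\mathbf{1}_{\tilde{x}_{i}\succeq y_{i}}$. Both are simplifying assumptions, and the function name is hypothetical.

```python
from itertools import product

def empirical_rademacher(indicator_rows):
    """Exact average over uniform signs of sup over rows of (1/n) * |sum_i sigma_i * row[i]|.

    Each row is the 0/1 indicator vector of one preference on the n sampled pairs.
    The enumeration has 2^n terms, so this is only practical for small n.
    """
    rows = [list(r) for r in indicator_rows]
    if not rows or not rows[0] or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("rows must be nonempty and of equal positive length")
    n = len(rows[0])
    total = 0.0
    for sigma in product((-1, 1), repeat=n):
        # Supremum over the (finite) class for this sign vector.
        total += max(abs(sum(s * v for s, v in zip(sigma, r))) for r in rows) / n
    return total / 2 ** n
```

For instance, a class containing the two rows `[1, 1]` and `[1, 0]` has value `0.75`: the supremum is 1 when the two signs agree and 1/2 when they differ. Richer classes can fit more sign patterns and so have larger averages, which is the sense in which $R_{n}(\mathcal{P})$ measures the complexity of $\mathcal{P}$.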

Now, by the Vapnik-Chervonenkis inequality (see Theorem 3.4 in Boucheron, Bousquet, and Lugosi (2005)), we have that

$$\mathbf{E}\sup\{|\mu_{n}(\succeq)-\mu(\succeq)|:\succeq\in\mathcal{P}\}\leq K\sqrt{\frac{V}{n}},$$

where $V$ is the VC dimension of $\mathcal{P}$, and $K$ is a universal constant.

So on the event $E$, we have that

$$\sup\{|\mu_{n}(\succeq)-\mu(\succeq)|:\succeq\in\mathcal{P}\}\leq K\sqrt{V/n}+\sqrt{\frac{2\ln(1/\delta)}{n}}.$$

We now combine these statements with Lemmas 4 and 5. In particular, we let $D=d$ or $D=2d$ depending on which of the lemmas we invoke. Let $u^{*}\in\mathcal{U}$ represent $\succeq^{*}$ and $u_{n}\in\mathcal{U}$ represent $\succeq_{n}$. Let $\Delta=\rho(u^{*},u_{n})$, a magnitude that depends on the sample. Then, on the event $E$, by Lemma 4 or 5, we have that

$$\begin{aligned}
C\Delta^{D} &\leq\mu(\succeq^{*})-\mu(\succeq_{n})\\
&=\mu(\succeq^{*})-\mu_{n}(\succeq^{*})+\mu_{n}(\succeq^{*})-\mu_{n}(\succeq_{n})+\mu_{n}(\succeq_{n})-\mu(\succeq_{n})\\
&\leq 2K\sqrt{\frac{V}{n}}+2\sqrt{\frac{2\ln(1/\delta)}{n}},
\end{aligned}$$

where we have used that $\mu_{n}(\succeq^{*})-\mu_{n}(\succeq_{n})\leq 0$ by definition of $\succeq_{n}$. This proves the second statement in the theorem.
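Rearranging the last display gives $\Delta\leq\big((2K\sqrt{V/n}+2\sqrt{2\ln(1/\delta)/n})/C\big)^{1/D}$. The sketch below evaluates this bound. Since the constants $C$ and $K$ are not given numerically, any values passed in are placeholders chosen purely for illustration, and the function and parameter names are assumptions.

```python
import math

def utility_error_bound(C, K, V, n, conf_delta, D):
    """Upper bound on Delta = rho(u*, u_n) implied by
    C * Delta^D <= 2K * sqrt(V/n) + 2 * sqrt(2 * ln(1/conf_delta) / n).

    C and K are unspecified constants in the text; callers supply placeholder values.
    """
    if not (C > 0 and K > 0 and V >= 0 and n > 0 and 0 < conf_delta < 1 and D > 0):
        raise ValueError("invalid parameters")
    rhs = 2.0 * K * math.sqrt(V / n) + 2.0 * math.sqrt(2.0 * math.log(1.0 / conf_delta) / n)
    return (rhs / C) ** (1.0 / D)
```

Because the right-hand side shrinks at rate $n^{-1/2}$, the bound on $\Delta$ shrinks at rate $n^{-1/(2D)}$: quadrupling the sample size halves the right-hand side and multiplies the bound by $2^{-1/D}$.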

To prove the first statement in the theorem, by Lemmas 4 and 5 again, and using that $\mu_{n}(\succeq_{n})\geq\mu_{n}(\succeq^{*})$, we have that, for any $\eta>0$,

$$\begin{aligned}
\mathrm{Pr}(\rho(u^{*},u_{n})>\eta) &\leq\mathrm{Pr}(\mu(\succeq^{*})-\mu(\succeq_{n})>C\eta^{D})\\
&\leq\mathrm{Pr}(\mu(\succeq^{*})-\mu_{n}(\succeq^{*})>C\eta^{D}/2)+\mathrm{Pr}(\mu_{n}(\succeq_{n})-\mu(\succeq_{n})>C\eta^{D}/2)\\
&\leq 2\,\mathrm{Pr}(\sup\{|\mu(\succeq')-\mu_{n}(\succeq')|:\succeq'\in\mathcal{P}\}>C\eta^{D}/2)\to 0
\end{aligned}$$

as $n\to\infty$ by the uniform convergence in probability result shown in Chambers, Echenique, and Lambert (2021).

7.2. Proof of Theorem 5

By standard results (see Hildenbrand (1970)), since $X$ is locally compact Polish, the topology of closed convergence is compact metric.

We will show that for any subsequence of $\succeq^{k}$, there is a subsubsequence converging to $\succeq^{*}$, which will establish that $\succeq^{k}\rightarrow\succeq^{*}$.

So choose a convergent subsubsequence of the given subsequence. To simplify notation and with a slight abuse of notation, let us also refer to this subsubsequence as $\succeq^{k}$. Call its limit $\succeq$; $\succeq$ is complete as the set of complete relations is closed in the closed convergence topology. It is therefore sufficient to establish that ${\succ^{*}}\subseteq{\succ}$ and ${\succeq^{*}}\subseteq{\succeq}$.

First we show that $x\succ^{*}y$ implies that $x\succ y$. So let $x\succ^{*}y$. Let $U$ and $V$ be neighborhoods of $x$ and $y$, respectively, such that $x'\succ^{*}y'$ for all $x'\in U$ and $y'\in V$. Such neighborhoods exist by the continuity of $\succeq^{*}$. We prove first that if $(x',y')\in U\times V$, then there exists $N$ such that $x'\succ_{n}y'$ for all $n\geq N$. Recall that $B=\cup\{B':B'\in\Sigma_{\infty}\}$. By hypothesis, there exist $x''\in U\cap B$ and $y''\in V\cap B$ such that $x''\leq x'$ and $y'\leq y''$. Each $\succeq_{n}$ is a strong rationalization of the finite experiment of order $n$, so if $\{\tilde{x},\tilde{y}\}\in\Sigma_{n}$ then $\tilde{x}\succ_{n}\tilde{y}$ implies that $\tilde{x}\succ_{m}\tilde{y}$ for all $m\geq n$. Since $x'',y''\in B$, there is $N$ such that $\{x'',y''\}\in\Sigma_{N}$. Thus $x''\succ^{*}y''$ implies that $x''\succ_{n}y''$ for all $n\geq N$. So, for $n\geq N$, $x'\succ_{n}y'$, as $\succeq_{n}$ is weakly monotone.

Now we establish that $x\succ y$. Let $\{(x_{n},y_{n})\}$ be an arbitrary sequence with $(x_{n},y_{n})\rightarrow(x,y)$. By hypothesis, there is an increasing sequence $\{x'_{n}\}$, and a decreasing sequence $\{y'_{n}\}$, such that $x'_{n}\leq x_{n}$ and $y_{n}\leq y'_{n}$ while $(x,y)=\lim_{n\rightarrow\infty}(x'_{n},y'_{n})$.

Let $N$ be large enough that $x'_{N}\in U$ and $y'_{N}\in V$. Let $N'\geq N$ be such that $x'_{N}\succ_{n}y'_{N}$ for all $n\geq N'$ (we established the existence of such $N'$ above). Then, for any $n\geq N'$ we have that

$$x_{n}\geq x'_{n}\geq x'_{N}\succ_{n}y'_{N}\geq y'_{n}\geq y_{n}.$$

By the weak monotonicity of $\succeq_{n}$, then, $x_{n}\succ_{n}y_{n}$. The sequence $\{(x_{n},y_{n})\}$ was arbitrary, so $(y,x)\notin{\succeq}=\lim_{n\rightarrow\infty}\succeq_{n}$. Thus $\neg(y\succeq x)$. Completeness of $\succeq$ implies that $x\succ y$.

Second, we show that if $x\succeq^{*}y$ then $x\succeq y$, thus completing the proof. So let $x\succeq^{*}y$. We recursively construct sequences $x^{n_{k}},y^{n_{k}}$ such that $x^{n_{k}}\succeq_{n_{k}}y^{n_{k}}$ and $x^{n_{k}}\rightarrow x$, $y^{n_{k}}\rightarrow y$.

So, for any $k\geq 1$, choose $x'\in N_{x}(1/k)\cap B$ with $x'\geq x$, and $y'\in N_{y}(1/k)\cap B$ with $y'\leq y$, so that $x'\succeq^{*}x\succeq^{*}y\succeq^{*}y'$, as $\succeq^{*}$ is weakly monotone. Recall that $\succeq_{n}$ strongly rationalizes $c_{\succeq^{*}}$ for $\Sigma_{n}$. So $x'\succeq^{*}y'$ and $x',y'\in B$ imply that $x'\succeq_{n}y'$ for all $n$ large enough. Let $n_{k}>n_{k-1}$ (where we can take $n_{0}=0$) be such that $x'\succeq_{n_{k}}y'$, and let $x^{n_{k}}=x'$ and $y^{n_{k}}=y'$.

Then we have $(x^{n_{k}},y^{n_{k}})\rightarrow(x,y)$ and $x^{n_{k}}\succeq_{n_{k}}y^{n_{k}}$. Thus $x\succeq y$.

7.3. Proof of Theorem 4

First, it is straightforward to show that $x\succ y$ implies $x\succeq' y$. Otherwise, there are $x,y$ for which $x\succ y$ and $y\succ' x$. Take an open neighborhood $U$ of $(x,y)$ and a pair $(z,w)\in U\cap(B\times B)$ for which $z\succ w$ and $w\succ' z$, a contradiction. Symmetrically, we also have that $x\succ' y$ implies $x\succeq y$.

Now, without loss of generality, suppose that there is a pair $x,y$ for which $x\succ y$ and $x\sim' y$. By connectedness and continuity, $V=\{z:x\succ z\succ y\}$ is nonempty. Indeed, if we assume, towards a contradiction, that $V=\varnothing$, then $\{z:x\succ z\}$ and $\{z:z\succ y\}$ are nonempty open sets. Further, for any $z\in X$, either $x\succ z$ or $z\succ y$ (because if $\neg(x\succ z)$ then by completeness $z\succeq x$, which implies that $z\succ y$). Conclude that $\{z:x\succ z\}\cup\{z:z\succ y\}=X$ and each of the sets is nonempty and open (by continuity of the preference $\succeq$); these sets are disjoint, violating connectedness of $X$. So we conclude that $V$ is nonempty. By continuity of the preference $\succeq$, $V$ is open.

We claim that there is a pair $(w,z)\in(V\times V)\cap(B\times B)$ for which $w\succ z$. For otherwise, for all $(w,z)\in(V\times V)\cap(B\times B)$, $w\sim z$. Conclude then by continuity that for all $(w,z)\in V\times V$, $w\sim z$. Observe that this implies that, for any $w\in V$, the set $\{z:w\succ z\succ y\}=\varnothing$, as if $w\succ z\succ y$, we also have that $x\succeq w\succ z$, from which we conclude $x\succ z$, so that $z\in V$ and hence $z\sim w$, a contradiction. Observe that $\{z:w\succ z\succ y\}=\varnothing$ contradicts the continuity of $\succeq$ and the connectedness of $X$ (by the same argument as for the nonemptiness of $V$; see our discussion above).

We have shown that there is $(w,z)\in(V\times V)\cap(B\times B)$ for which $w\succ z$, so that $x\succ w\succ z\succ y$. Further, we have hypothesized that $x\sim' y$. By the first paragraph, we know that $x\succeq' w\succeq' z\succeq' y$. If, by means of contradiction, we have $w\succ' z$, then $x\succ' y$, a contradiction. So $w\sim' z$ and $w\succ z$, a contradiction to ${\succeq}_{B\times B}={\succeq'}_{B\times B}$.

References

  • Afriat (1967a) Afriat, S. N. (1967a): “The Construction of Utility Functions from Expenditure Data,” International Economic Review, 8(1), 67–77.
  • Afriat (1967b) Afriat, S. N. (1967b): “The Construction of Utility Functions from Expenditure Data,” International Economic Review, 8(1), 67–77.
  • Aliprantis and Border (2006) Aliprantis, C. D., and K. Border (2006): Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, 3 edn.
  • Balcan, Constantin, Iwata, and Wang (2012) Balcan, M. F., F. Constantin, S. Iwata, and L. Wang (2012): “Learning valuation functions,” in Conference on Learning Theory, pp. 4–1. JMLR Workshop and Conference Proceedings.
  • Balcan, Daniely, Mehta, Urner, and Vazirani (2014) Balcan, M.-F., A. Daniely, R. Mehta, R. Urner, and V. V. Vazirani (2014): “Learning economic parameters from revealed preferences,” in International Conference on Web and Internet Economics, pp. 338–353. Springer.
  • Basu and Echenique (2020) Basu, P., and F. Echenique (2020): “On the falsifiability and learnability of decision theories,” Theoretical Economics, 15(4), 1279–1305.
  • Bei, Chen, Garg, Hoefer, and Sun (2016) Bei, X., W. Chen, J. Garg, M. Hoefer, and X. Sun (2016): “Learning Market Parameters Using Aggregate Demand Queries,” in AAAI.
  • Beigman and Vohra (2006) Beigman, E., and R. Vohra (2006): “Learning from revealed preference,” in Proceedings of the 7th ACM Conference on Electronic Commerce, pp. 36–42.
  • Bergstrom, Parks, and Rader (1976) Bergstrom, T. C., R. P. Parks, and T. Rader (1976): “Preferences which Have Open Graphs,” Journal of Mathematical Economics, 3(3), 265–268.
  • Blundell, Browning, and Crawford (2008) Blundell, R., M. Browning, and I. Crawford (2008): “Best Nonparametric Bounds on Demand Responses,” Econometrica, 76(6), 1227–1262.
  • Blundell, Browning, and Crawford (2003) Blundell, R. W., M. Browning, and I. A. Crawford (2003): “Nonparametric Engel Curves and Revealed Preference,” Econometrica, 71(1), 205–240.
  • Border and Segal (1994) Border, K. C., and U. Segal (1994): “Dynamic Consistency Implies Approximately Expected Utility Preferences,” Journal of Economic Theory, 63(2), 170–188.
  • Boucheron, Bousquet, and Lugosi (2005) Boucheron, S., O. Bousquet, and G. Lugosi (2005): “Theory of classification: A survey of some recent advances,” ESAIM: probability and statistics, 9, 323–375.
  • Brown and Matzkin (1996) Brown, D. J., and R. L. Matzkin (1996): “Testable Restrictions on the Equilibrium Manifold,” Econometrica, 64(6), 1249–1262.
  • Camara (2022) Camara, M. K. (2022): “Computationally Tractable Choice,” in Proceedings of the 23rd ACM Conference on Economics and Computation, EC ’22, p. 28, New York, NY, USA. Association for Computing Machinery.
  • Carvajal, Deb, Fenske, and Quah (2013) Carvajal, A., R. Deb, J. Fenske, and J. K.-H. Quah (2013): “Revealed Preference Tests of the Cournot Model,” Econometrica, 81(6), 2351–2379.
  • Cerreia-Vioglio, Dillenberger, and Ortoleva (2015) Cerreia-Vioglio, S., D. Dillenberger, and P. Ortoleva (2015): “Cautious expected utility and the certainty effect,” Econometrica, 83(2), 693–728.
  • Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio (2011) Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci, and L. Montrucchio (2011): “Uncertainty averse preferences,” Journal of Economic Theory, 146(4), 1275–1330.
  • Chambers and Echenique (2016) Chambers, C. P., and F. Echenique (2016): Revealed preference theory, vol. 56. Cambridge University Press.
  • Chambers, Echenique, and Lambert (2021) Chambers, C. P., F. Echenique, and N. S. Lambert (2021): “Recovering preferences from finite data,” Econometrica, 89(4), 1633–1664.
  • Chandrasekher, Frick, Iijima, and Le Yaouanq (2021) Chandrasekher, M., M. Frick, R. Iijima, and Y. Le Yaouanq (2021): “Dual-self representations of ambiguity preferences,” Econometrica, forthcoming.
  • Chapman, Dean, Ortoleva, Snowberg, and Camerer (2017) Chapman, J., M. Dean, P. Ortoleva, E. Snowberg, and C. Camerer (2017): “Willingness to Pay and Willingness to Accept are Probably Less Correlated Than You Think,” NBER working paper No. 23954.
  • Chapman, Dean, Ortoleva, Snowberg, and Camerer (2022) Chapman, J., M. Dean, P. Ortoleva, E. Snowberg, and C. Camerer (2022): “Econographics,” Forthcoming, Journal of Political Economy Microeconomics.
  • Chase and Prasad (2019) Chase, Z., and S. Prasad (2019): “Learning Time Dependent Choice,” in 10th Innovations in Theoretical Computer Science Conference (ITCS).
  • Chateauneuf and Faro (2009) Chateauneuf, A., and J. H. Faro (2009): “Ambiguity through confidence functions,” Journal of Mathematical Economics, 45(9-10), 535–558.
  • Chateauneuf, Grabisch, and Rico (2008) Chateauneuf, A., M. Grabisch, and A. Rico (2008): “Modeling attitudes toward uncertainty through the use of the Sugeno integral,” Journal of Mathematical Economics, 44(11), 1084–1099.
  • Chavas and Cox (1993) Chavas, J.-P., and T. L. Cox (1993): “On Generalized Revealed Preference Analysis,” The Quarterly Journal of Economics, 108(2), 493–506.
  • Clinton, Jackman, and Rivers (2004) Clinton, J., S. Jackman, and D. Rivers (2004): “The Statistical Analysis of Roll Call Data,” The American Political Science Review, 98(2), 355–370.
  • Diewert (1973) Diewert, W. E. (1973): “Afriat and Revealed Preference Theory,” The Review of Economic Studies, 40(3), 419–425.
  • Dong, Roth, Schutzman, Waggoner, and Wu (2018) Dong, J., A. Roth, Z. Schutzman, B. Waggoner, and Z. S. Wu (2018): “Strategic classification from revealed preferences,” in Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 55–70.
  • Echenique, Golovin, and Wierman (2011) Echenique, F., D. Golovin, and A. Wierman (2011): “A revealed preference approach to computational complexity in economics,” in Proceedings of the 12th ACM conference on Electronic commerce, pp. 101–110.
  • Echenique and Prasad (2020) Echenique, F., and S. Prasad (2020): “Incentive Compatible Active Learning,” in 11th Innovations in Theoretical Computer Science Conference (ITCS).
  • Falk, Becker, Dohmen, Enke, Huffman, and Sunde (2018) Falk, A., A. Becker, T. Dohmen, B. Enke, D. Huffman, and U. Sunde (2018): “Global Evidence on Economic Preferences,” The Quarterly Journal of Economics, 133(4), 1645–1692.
  • Forges and Minelli (2009) Forges, F., and E. Minelli (2009): “Afriat’s Theorem for General Budget Sets,” Journal of Economic Theory, 144(1), 135–145.
  • Fox (1945) Fox, R. H. (1945): “On topologies for function spaces,” Bull. Amer. Math. Soc., 51, 429–432.
  • Fudenberg, Gao, and Liang (2021) Fudenberg, D., W. Gao, and A. Liang (2021): “How Flexible is that Functional Form? Measuring the Restrictiveness of Theories,” in Proceedings of the 22nd ACM Conference on Economics and Computation, pp. 497–498.
  • Gilboa and Schmeidler (1989) Gilboa, I., and D. Schmeidler (1989): “Maxmin expected utility with non-unique prior,” Journal of mathematical economics, 18(2), 141–153.
  • Hansen and Sargent (2001) Hansen, L. P., and T. J. Sargent (2001): “Robust control and model uncertainty,” American Economic Review, 91(2), 60–66.
  • Hansen and Sargent (2022) Hansen, L. P., and T. J. Sargent (2022): “Risk, Ambiguity, and Misspecification: Decision Theory, Robust Control, and Statistics,” Mimeo: NYU.
  • Hansen, Sargent, Turmuhambetova, and Williams (2006) Hansen, L. P., T. J. Sargent, G. Turmuhambetova, and N. Williams (2006): “Robust control and model misspecification,” Journal of Economic Theory, 128(1), 45–90.
  • Hildenbrand (1970) Hildenbrand, W. (1970): “On Economies with Many Agents,” Journal of Economic Theory, 2(2), 161–188.
  • Kannai (1970) Kannai, Y. (1970): “Continuity Properties of the Core of a Market,” Econometrica, 38(6), 791–815.
  • Kertz and Rösler (2000) Kertz, R. P., and U. Rösler (2000): “Complete Lattices of Probability Measures with Applications to Martingale Theory,” Lecture Notes-Monograph Series, 35, 153–177.
  • Levin (1983) Levin, V. L. (1983): “A continuous utility theorem for closed preorders on a metrizable $\sigma$-compact space,” Doklady Akademii Nauk, 273(4), 800–804.
  • Maccheroni, Marinacci, and Rustichini (2006) Maccheroni, F., M. Marinacci, and A. Rustichini (2006): “Ambiguity aversion, robustness, and the variational representation of preferences,” Econometrica, 74(6), 1447–1498.
  • Mas-Colell (1974) Mas-Colell, A. (1974): “Continuous and Smooth Consumers: Approximation Theorems,” Journal of Economic Theory, 8(3), 305–336.
  • Mas-Colell (1977) Mas-Colell, A. (1977): “On the Continuous Representation of Preorders,” International Economic Review, 18(2), 509–513.
  • Mas-Colell (1978) Mas-Colell, A. (1978): “On Revealed Preference Analysis,” The Review of Economic Studies, 45(1), 121–131.
  • Matzkin (1991) Matzkin, R. L. (1991): “Axioms of Revealed Preference for Nonlinear Choice Sets,” Econometrica, 59(6), 1779–1786.
  • Nishimura, Ok, and Quah (2017) Nishimura, H., E. A. Ok, and J. K.-H. Quah (2017): “A Comprehensive Approach to Revealed Preference Theory,” American Economic Review, 107(4), 1239–1263.
  • Poole and Rosenthal (1985) Poole, K. T., and H. Rosenthal (1985): “A Spatial Model for Legislative Roll Call Analysis,” American Journal of Political Science, 29(2), 357–384.
  • Reny (2015) Reny, P. J. (2015): “A Characterization of Rationalizable Consumer Behavior,” Econometrica, 83(1), 175–192.
  • Richter (1966) Richter, M. K. (1966): “Revealed Preference Theory,” Econometrica, 34(3), 635–645.
  • Samuelson (1938) Samuelson, P. A. (1938): “A note on the pure theory of consumer’s behaviour,” Economica, 5(17), 61–71.
  • Schmeidler (1989) Schmeidler, D. (1989): “Subjective probability and expected utility without additivity,” Econometrica, 57(3), 571–587.
  • Schneider (2014) Schneider, R. (2014): Convex Bodies: the Brunn-Minkowski Theory. Cambridge University Press, 2 edn.
  • Ugarte (2022) Ugarte, C. (2022): “Preference Recoverability from Inconsistent Choices,” Mimeo, UC Berkeley.
  • Varian (1982) Varian, H. R. (1982): “The Nonparametric Approach to Demand Analysis,” Econometrica, 50(4), 945–973.
  • von Gaudecker, van Soest, and Wengstrom (2011) von Gaudecker, H.-M., A. van Soest, and E. Wengstrom (2011): “Heterogeneity in Risky Choice Behavior in a Broad Population,” The American Economic Review, 101(2), 664–94.
  • Zadimoghaddam and Roth (2012) Zadimoghaddam, M., and A. Roth (2012): “Efficiently learning from revealed preference,” in International Workshop on Internet and Network Economics, pp. 114–127. Springer.
  • Zhang and Conitzer (2020) Zhang, H., and V. Conitzer (2020): “Learning the Valuations of a k-demand Agent,” in International Conference on Machine Learning.