Aggregation of models,
choices, beliefs, and preferences
Abstract.
A natural notion of rationality/consistency for aggregating models is that, for all (possibly aggregated) models $x$ and $y$, if the output of model $x$ is $f(x)$ and the output of model $y$ is $f(y)$, then the output of the model obtained by aggregating $x$ and $y$ must be a weighted average of $f(x)$ and $f(y)$. Similarly, a natural notion of rationality for aggregating preferences of ensembles of experts is that, for all (possibly aggregated) experts $x$ and $y$, and all possible choices $a$ and $b$, if both $x$ and $y$ prefer $a$ over $b$, then the expert obtained by aggregating $x$ and $y$ must also prefer $a$ over $b$. Rational aggregation is an important element of uncertainty quantification, and it lies behind many seemingly different results in economic theory, spanning social choice, belief formation, and individual decision making. Three examples of rational aggregation rules are as follows. (1) Give each individual model (expert) a weight (a score) and use weighted averaging to aggregate individual or finite ensembles of models (experts). (2) Order/rank individual models (experts) and let the aggregation of a finite ensemble of individual models (experts) be the highest-ranked individual model (expert) in that ensemble. (3) Give each individual model (expert) a weight, introduce a weak order/ranking over the set of models/experts (two models may share the same rank), and aggregate a finite ensemble as the weighted average of its highest-ranked members. Note that (1) and (2) are particular cases of (3): in (1) all models/experts share the same rank, and in (2) the ranking is strict. In this paper, we show that all rational aggregation rules are of the form (3). This result unifies aggregation procedures across many different economic environments, showing that they all rely on the same basic result. Following the main representation, we show applications and extensions of our representation in several separate topics in economics, such as belief formation, choice theory, aggregation of optimal models, and social welfare economics.
1. Introduction
This paper presents a general framework characterizing rational/consistent aggregation (of models, choices, beliefs, and preferences, which we simply refer to as features) with applications to economic theory. In this framework, individual features have outcomes, and aggregation rules identify the outcome of groups of features. We focus on a recursive form of aggregation, as in the case-based decision theory developed by [18, 6], where the aggregate outcome for larger collections of features results from aggregating the outcomes of smaller subsets. Specifically, the aggregate outcome of the union of two disjoint collections of features is a weighted average of the outcome of each collection of features separately. We show that this form of recursive aggregation is a common structure that lies behind many seemingly unrelated results in economic theory.
Our central axiom, the weighted averaging axiom/property (we will use the terms axiom and property interchangeably), is a simple formalization of this recursivity. It imposes a structure on how the outcome of the union of two disjoint subsets of features relates to the outcome of each of the subsets separately. The axiom states that the outcome of a set of features can be recursively computed by first partitioning the set of features into two disjoint subsets; the aggregated outcome is then a weighted average of the outcome of each of the two smaller subsets.
Our contribution is three-fold: (1) We find all aggregation procedures that satisfy the weighted averaging axiom, which generalizes the result of [6]. Moreover, by enhancing the procedure with a continuity axiom, we connect the axiom to the path independence axiom studied in the choice literature. (2) With a simple geometrical duality argument, we connect the weighted averaging axiom to the combination axiom of [18] and the extended Pareto axiom of [35]. (3) We present applications and extensions to different domains of economics, notably in the context of belief formation, choice theory, and welfare economics.
Formally, we define an aggregation rule as a function on the set of subsets of features that maps each subset of features to an outcome. Our main result finds all aggregation rules that satisfy recursivity in the form of our weighted averaging axiom. We show that as long as, for any two disjoint subsets of features, the outcome of their union is a weighted average (with non-negative weights) of the outcome of each subset, the aggregation rule has a simple form (under a technical richness condition):
There exist a strictly positive weight function and a weak order (a transitive and complete order) over the set of features such that the outcome of any subset of features is the weighted average of the outcomes of the highest-ordered features of that subset.
The importance of the result is that the weight of each feature is independent of the group of features being aggregated. The role of the weak order in the main representation is to partition the set of features into different equivalence classes and rank them from the highest class to the lowest class. If all features of a subset of features are in the same class, then the outcome is the weighted average of the outcomes of each member of the set. However, if some features have a higher ranking than others, then the aggregation rule will ignore lower-ordered features.
Following the main result, we discuss two special cases of our main result. In the first case, we introduce the strict weighted averaging axiom to represent the case where the outcome of the union of two disjoint subsets of features is contained in the “relative” interior of the outcomes of each subset separately. We then show that the strict weighted averaging axiom is the necessary and sufficient condition for the weak order, in the main representation, to have only one equivalence class. Hence, the outcome of a subset of features is just the weighted average (with strictly positive weights) of the outcomes of each feature separately.
In the second case, we model the space of features as a subset of a vector space. By considering the distance between vectors, we capture the notion of similarity or closeness of features. In this context, we can consider the following notion of continuity of outcomes with respect to features: replacing a feature in a subset of features with another closely similar feature, the outcome of the new subset stays close to the outcome of the previous one. Under this property, which we define as the continuity property, we show that all sufficiently similar features attain the same ranking with respect to the weak order. Moreover, the weight function is a continuous function over the set of features; in other words, the weights of two close (or similar) features are close. In the special case where the space of features is a convex set, we show that all features attain the same ranking. In this case, there is no difference between the weighted averaging and the strict weighted averaging property.
Depending on the application, features and the aggregation rules may have different interpretations. A feature may represent a signal or an event containing some information about the true state of nature. In this case, the role of an aggregation rule is to form a belief about the true state of nature. In the context of choice theory, features may represent choice objects, where an aggregation rule behaves as a decision-maker that selects a lottery or a random choice out of a group of choice objects. Another interpretation is in the context of welfare economics, where each feature represents a preference of an individual over some alternatives. In this case, an aggregation rule represents a social welfare function that associates with each preference profile a single preference ordering over the set of alternatives.
To describe a natural interpretation of our result, consider the problem of modeling an agent who seeks to make a prediction about the true state of nature, conditional on observing a set of events. In this context, a feature represents an event, and the outcome of the model conditional on observing a set of events is a belief about the true state of nature. Our main result provides a necessary and sufficient condition for the belief formation process to behave as a Bayesian updater: under the averaging property of the belief formation process, there exists a conditional probability system associated with the set of events, and the belief formation process conditional on observing a set of events behaves like a conditional probability. The weak order of the main result captures the idea that, conditional on observing even a zero probability event, the belief formation process still behaves as a Bayesian updater.
To motivate the proposed framework, sections 4, 5, 6, and 7 present applications and extensions of our main representation results. We show that the weighted averaging axiom is closely related to many known axioms in different topics, from the Pareto axiom in Social Choice Theory to the path independence axiom in Choice Theory.
2. Model aggregation
Let $X$ be a nonempty set (we make no assumptions about the cardinality or topology of $X$). Write $X^*$ for the set of all nonempty finite subsets of $X$. One may interpret $X$ as a (possibly infinite) set of models, $A \in X^*$ as a finite set of models, and $X^*$ as the set of all nonempty finite sets of models. Let $H$ be a separable Hilbert space, with $\mathbb{R}^n$ as a prototypical example.
Definition 1.
An aggregation rule on $X$ is a function $f: X^* \to H$ that associates with every $A \in X^*$ a vector $f(A) \in H$. (All discussions of this paper continue to hold if $H$ is replaced by any general, possibly infinite-dimensional, normed vector space.)
For $x \in X$ and $A \in X^*$, one may interpret $f(\{x\})$ as the output of the model $x$, and $f(A)$ as the output of the aggregation of the models contained in $A$. The purpose of this section is to characterize aggregation rules satisfying the weighted averaging axiom/property defined below.
Definition 2.
We say that an aggregation rule $f$ satisfies the weighted averaging axiom/property if for all $A, B \in X^*$ such that $A \cap B = \emptyset$, it holds true that
(2.1) $f(A \cup B) = \lambda f(A) + (1 - \lambda) f(B)$
for some $\lambda \in [0, 1]$ (which may depend on $A$ and $B$). We say that $f$ satisfies the strict weighted averaging axiom/property if (2.1) holds true for $\lambda \in (0, 1)$. We say that $f$ satisfies the extreme weighted averaging axiom/property if (2.1) holds true for $\lambda \in \{0, 1\}$.
Two simple examples of aggregation rules satisfying the weighted averaging property are as follows.
Example 1.
Write $\mathbb{R}_{>0}$ for the set of strictly positive real numbers and let $w: X \to \mathbb{R}_{>0}$ be a weight function on $X$. For $x \in X$, let $f(x)$ be the output of model $x$ (abusing notation, we write $f(x)$ for $f(\{x\})$ for $x \in X$). For $A \in X^*$, define
(2.2) $f(A) = \dfrac{\sum_{x \in A} w(x) f(x)}{\sum_{x \in A} w(x)}.$
Then $f$ satisfies the strict weighted averaging property.
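A minimal numerical sketch of Example 1 (the model outputs and weights below are hypothetical): the weighted-average rule (2.2) satisfies the strict weighted averaging property, with the mixing coefficient $\lambda$ given by the relative weight mass of each part.

```python
def aggregate(A, w, out):
    """Weighted average of the outputs of the models in A, as in (2.2)."""
    total = sum(w[x] for x in A)
    dim = len(next(iter(out.values())))
    return tuple(sum(w[x] * out[x][i] for x in A) / total for i in range(dim))

w = {"a": 1.0, "b": 2.0, "c": 3.0}                         # strictly positive weights
out = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0)}  # hypothetical model outputs

# Strict weighted averaging: for disjoint A, B,
# f(A ∪ B) = λ f(A) + (1 − λ) f(B) with λ = Σ_{x∈A} w(x) / Σ_{x∈A∪B} w(x) ∈ (0, 1).
A, B = {"a"}, {"b", "c"}
lam = sum(w[x] for x in A) / sum(w[x] for x in A | B)
blend = tuple(lam * p + (1 - lam) * q
              for p, q in zip(aggregate(A, w, out), aggregate(B, w, out)))
assert all(abs(u - v) < 1e-12 for u, v in zip(aggregate(A | B, w, out), blend))
```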
Example 2.
Consider a complete strict order $\succ$ on $X$. Given any feature $x \in X$, let $f(x)$ be the output of the model $x$. For $A \in X^*$, write $x^*_A$ for the highest-ordered element in $A$ ($x^*_A \in A$ and $x^*_A \succ y$ for all $y \in A \setminus \{x^*_A\}$). For $A \in X^*$, define
(2.3) $f(A) = f(x^*_A).$
Then $f$ satisfies the extreme weighted averaging property.
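Example 2 can be sketched the same way (the ranking and outputs below are hypothetical): the rule returns the output of the highest-ordered model, so the aggregate of a disjoint union always coincides with the aggregate of one of the two parts.

```python
rank = {"a": 3, "b": 2, "c": 1}                            # hypothetical strict ranking
out = {"a": (1.0, 2.0), "b": (0.0, 0.0), "c": (5.0, 5.0)}  # hypothetical model outputs

def aggregate(A):
    """Output of the highest-ordered model in A, as in (2.3)."""
    return out[max(A, key=lambda x: rank[x])]

# Extreme weighted averaging: f(A ∪ B) always equals f(A) or f(B) (λ ∈ {0, 1}).
assert aggregate({"a", "b", "c"}) == out["a"]
assert aggregate({"b", "c"}) == out["b"]
```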
We will now show that all aggregation rules satisfying the strict weighted averaging property must be as in Example 1 whenever the range of $f$ contains at least three points that are not collinear.
Definition 3.
An aggregation rule $f$ is rich if the range of $f$ is not a subset of a line.
Theorem 1.
Let the aggregation rule $f$ be rich. The following are equivalent:
(1) The aggregation rule $f$ satisfies the strict weighted averaging property.
(2) There exists a weight function $w: X \to \mathbb{R}_{>0}$ such that for every $A \in X^*$,
(2.4) $f(A) = \dfrac{\sum_{x \in A} w(x) f(x)}{\sum_{x \in A} w(x)}.$
Moreover, the function $w$ is unique up to multiplication by a positive number.
We will now show that all aggregation rules satisfying the weighted averaging property must be of the form of a combination of Examples 1 and 2 whenever, for all $x, y \in X$, we can find $z \in X$ such that $f(x)$, $f(y)$, and $f(z)$ are not collinear and the pairwise aggregations with $z$ do not satisfy the extreme weighted averaging property.
Definition 4.
An aggregation rule $f$ is strongly rich if for any $x, y \in X$ there exists $z \in X$ such that:
(1) $f(x)$, $f(y)$, and $f(z)$ are not on the same line.
(2) $f(\{x, z\}) \notin \{f(x), f(z)\}$ and $f(\{y, z\}) \notin \{f(y), f(z)\}$. (In the proof of our main result, we show that as long as $f(x)$, $f(y)$, and $f(z)$ are not on the same line, condition (2) amounts to $f(\{x, z\})$ and $f(\{y, z\})$ lying strictly inside the segments connecting the corresponding outputs.)
Definition 5.
A binary relation $\succsim$ on $X$ is a weak order on $X$ if it is reflexive ($x \succsim x$), transitive ($x \succsim y$ and $y \succsim z$ imply $x \succsim z$), and complete (for all $x, y \in X$, $x \succsim y$ or $y \succsim x$). We say that $x$ is equivalent to $y$, and write $x \sim y$, if $x \succsim y$ and $y \succsim x$.
Theorem 2.
Let the aggregation rule $f$ be strongly rich. Then the following are equivalent:
(1) The aggregation rule $f$ satisfies the weighted averaging axiom.
(2) There exist a unique weak order $\succsim$ on $X$ and a weight function $w: X \to \mathbb{R}_{>0}$ such that for every $A \in X^*$:
(2.5) $f(A) = \dfrac{\sum_{x \in M_\succsim(A)} w(x) f(x)}{\sum_{x \in M_\succsim(A)} w(x)},$
where $M_\succsim(A) = \{x \in A : x \succsim y \text{ for all } y \in A\}$ is the set of highest-ordered elements of $A$.
Moreover, in this case, the function $w$ is unique up to multiplication by a positive number within each of the equivalence classes of the weak order $\succsim$.
The representation (2.5) has two components: one is captured by the weak order $\succsim$; the other is the weight function $w$.
The weak order $\succsim$ partitions the set of features into equivalence classes and ranks them from top to bottom. If all elements of $A$ have the same ranking, then the outcome of $A$ is the weighted average of the outcomes of each element of $A$. However, if some elements have a higher ranking than others, then the aggregation rule ignores the lower-ordered elements.
Hence, the assessment of the aggregation rule has two steps. First, it considers only the highest-ordered elements. Then, it uses the weight function $w$ to form the weighted average of those highest-ordered elements.
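The two-step assessment can be sketched as follows (ranking, weights, and scalar outputs are all illustrative):

```python
rank = {"a": 2, "b": 2, "c": 1}        # weak order: a ~ b, both ranked above c
w = {"a": 1.0, "b": 3.0, "c": 7.0}     # strictly positive weights
out = {"a": 0.0, "b": 4.0, "c": 100.0} # scalar outputs for simplicity

def aggregate(A):
    """Rule (2.5): weighted average over the highest-ordered elements of A."""
    top = max(rank[x] for x in A)
    M = [x for x in A if rank[x] == top]   # highest-ordered elements of A
    return sum(w[x] * out[x] for x in M) / sum(w[x] for x in M)

# c is ignored whenever a higher-ranked element is present ...
assert aggregate({"a", "b", "c"}) == aggregate({"a", "b"}) == 3.0
# ... but determines the outcome on its own:
assert aggregate({"c"}) == 100.0
```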
The richness condition is necessary for both Theorems 1 and 2. Example 3 shows that without this condition, an aggregation rule may satisfy the strict weighted averaging axiom without having a weighted average representation.
Example 3.
Let $X = \{x, y, z\}$ with collinear outputs $f(x) = 0$, $f(y) = 1$, and $f(z) = 2$, and with $f(\{x, y\}) = 1/2$, $f(\{y, z\}) = 3/2$, $f(\{x, z\}) = 1/2$, and $f(\{x, y, z\}) = 3/4$, so that $f$ satisfies the strict weighted averaging axiom. Assume that there exists a positive weight function $w$ on $X$ such that the aggregation rule over any coalition of $X$ has a representation as a weighted average of its elements. By considering the values of $f(x)$, $f(y)$, and $f(\{x, y\})$, we get $w(x) = w(y)$. Similarly, by considering the coalition $\{y, z\}$ we get $w(y) = w(z)$. By combining these two observations, we get $w(x) = w(z)$. However, considering the coalition $\{x, z\}$ and the representation $f(\{x, z\}) = \frac{w(x) f(x) + w(z) f(z)}{w(x) + w(z)} = 1/2$, we get $w(x) = 3 w(z)$, which is a contradiction. Hence, the representation does not work in this case.
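A counterexample of this kind can be checked numerically; the collinear outputs and aggregates below are one concrete choice of values consistent with the strict weighted averaging axiom, yet incompatible with any single positive weight function:

```python
# Outcomes of a rule on X = {x, y, z} whose range lies on a line (not rich).
f = {("x",): 0.0, ("y",): 1.0, ("z",): 2.0,
     ("x", "y"): 0.5, ("y", "z"): 1.5, ("x", "z"): 0.5,
     ("x", "y", "z"): 0.75}

def strictly_between(v, a, b):
    lo, hi = min(a, b), max(a, b)
    return lo < v < hi

# Strict weighted averaging holds for every disjoint split:
splits = [(("x",), ("y",)), (("y",), ("z",)), (("x",), ("z",)),
          (("x",), ("y", "z")), (("y",), ("x", "z")), (("z",), ("x", "y"))]
for A, B in splits:
    U = tuple(sorted(A + B))
    assert strictly_between(f[U], f[A], f[B])

# But the implied weight ratios are inconsistent:
# f({x,y}) = 1/2  ⇒  w(x) = w(y);   f({y,z}) = 3/2  ⇒  w(y) = w(z)
# f({x,z}) = 1/2  ⇒  (w(x)·0 + w(z)·2)/(w(x)+w(z)) = 1/2  ⇒  w(x) = 3·w(z)
# Together these force w(x) = w(z) and w(x) = 3·w(z): impossible for w > 0.
```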
Assume $X$ is a subset of a normed vector space. We will now show that the weight function $w$ in the representation (2.5) must be continuous if $f$ is continuous as defined below.
Definition 6.
An aggregation rule $f$ is continuous if, for any $A \in X^*$, any $x \in X$, and any sequence $(x_n)$ in $X$ with $x_n \to x$, it holds that $f(A \cup \{x_n\}) \to f(A \cup \{x\})$. (The convergence of $(x_n)$ is with respect to the norm on the vector space containing $X$, and the convergence in the range of the aggregation rule is with respect to the norm of $H$.)
Theorem 3.
Let $X$ be a subset of a normed vector space and let $f$ be a strongly rich continuous aggregation rule satisfying the weighted averaging property. Then the representation (2.5) holds true with a continuous weight function $w$. Furthermore, for any $x \in X$ there exists a neighborhood of $x$ in which every feature is equivalent to $x$ under the weak order $\succsim$.
We will now show that if $X$ is a convex subset of a normed vector space, then any continuous aggregation rule on $X$ satisfying the weighted averaging axiom can only have a single equivalence class; as a consequence, both the weighted averaging and the strict weighted averaging properties lead to the representation (2.4).
Theorem 4.
Let $X$ be a convex subset of a normed vector space, and let $f$ be a rich continuous aggregation rule satisfying the weighted averaging property. Then, there exists a continuous weight function $w: X \to \mathbb{R}_{>0}$ such that the representation (2.4) holds true.
3. Preference aggregation and duality
In many cases where the range of the aggregation rule is a set of linear functionals, a simple geometrical interpretation of the weighted averaging axiom results in a related but different consistency axiom. Let $H$ be a Hilbert space and write $\langle \cdot, \cdot \rangle$ for the associated inner product. Let $Z$ be a convex subset of $H$. Every $u \in H$ induces a weak order (reflexive, transitive, and complete binary relation) $\succsim_u$ over the set $Z$ by:
(3.1) $z_1 \succsim_u z_2 \iff \langle u, z_1 \rangle \ge \langle u, z_2 \rangle.$
Let $X$ be a nonempty set. In this section we define an aggregation rule as a function $f$ mapping $X^*$ to $H$. (By the Riesz representation theorem, $f$ can also be defined as a function mapping $X^*$ to the space of continuous linear functionals on $H$, in which case, for $A \in X^*$, $f(A)$ is identified with the unique element $u \in H$ such that $f(A)(z) = \langle u, z \rangle$ for $z \in H$.) Since we may interpret each $f(A)$ as a linear ranking of the elements of the set $Z$, the goal of an aggregator is to attach an aggregated linear ranking to every finite subset of $X$.
Example 4.
A simple example of interpretation of $X$ and $Z$ is as follows. Let $X$ be a set of experts and $Z$ a set of alternatives (models, decisions, choices). An expert $x \in X$ defines a ranking/preference $\succsim_{f(x)}$ over $Z$. An aggregation rule $f$ is a voting mechanism enabling the aggregation of the preferences of a finite set of experts. A rational notion of consistency (employed here and formally introduced below in Definition 7) is that if $A, B \in X^*$ are two disjoint sets of experts such that both $f(A)$ and $f(B)$ prefer $z_1$ over $z_2$, then their aggregate $f(A \cup B)$ must also prefer $z_1$ over $z_2$.
Observing that the order (3.1) is invariant under strictly positive scaling of $u$, we will restrict the range of aggregation rules to the set $\{u \in H : \langle u, u_0 \rangle = 1\}$ for some $u_0 \in H$. This restriction also rules out entirely opposite ranking directions by imposing a shared ranking direction $u_0$. The condition of existence of such a $u_0$ is what we call a minimal agreement condition.
Definition 7.
An aggregation rule $f$ is weakly consistent if for all disjoint sets $A, B \in X^*$ and for all $z_1, z_2 \in Z$,
(3.2) $z_1 \succsim_{f(A)} z_2$ and $z_1 \succsim_{f(B)} z_2$ imply $z_1 \succsim_{f(A \cup B)} z_2$.
Moreover, it is consistent if it also satisfies the following condition:
(3.3) if, in addition, $z_1 \succ_{f(A)} z_2$ or $z_1 \succ_{f(B)} z_2$, then $z_1 \succ_{f(A \cup B)} z_2$,
where $\succ_u$ denotes the strict part of $\succsim_u$.
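In coordinates, weak consistency is a statement about inner products: if $f(A)$ and $f(B)$ both score $z_1$ at least as high as $z_2$, then so does any non-negative mixture of the two vectors. A small sketch with illustrative vectors in $\mathbb{R}^2$:

```python
uA, uB = (2.0, 1.0), (1.0, 3.0)   # f(A), f(B): two aggregated ranking vectors
z1, z2 = (1.0, 1.0), (0.0, 0.5)   # two alternatives in Z

def score(u, z):
    """Inner product <u, z> defining the induced weak order (3.1)."""
    return sum(ui * zi for ui, zi in zip(u, z))

# Both groups rank z1 at least as high as z2 ...
assert score(uA, z1) >= score(uA, z2) and score(uB, z1) >= score(uB, z2)
# ... hence so does every weighted average of f(A) and f(B), as in (3.2):
for lam in (0.0, 0.3, 1.0):
    u = tuple(lam * a + (1 - lam) * b for a, b in zip(uA, uB))
    assert score(u, z1) >= score(u, z2)
```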
A simple duality argument (Farkas’s lemma) results in the following theorem.
Theorem 5.
Let $f$ be an aggregation rule. Then, the following are equivalent:
(1) $f$ is consistent.
(2) $f$ satisfies the strict weighted averaging property.
Moreover, the following are also equivalent:
(1) $f$ is weakly consistent.
(2) $f$ satisfies the weighted averaging property.
Using Theorem 1, we immediately obtain the representation of consistent aggregation rules.
Corollary 1.
Let $f$ be a consistent rich aggregation rule. Then, there exists a weight function $w: X \to \mathbb{R}_{>0}$ such that for every set of features $A \in X^*$,
(3.4) $f(A) = \dfrac{\sum_{x \in A} w(x) f(x)}{\sum_{x \in A} w(x)}.$
Moreover, the weight function $w$ is unique up to multiplication by a positive number.
Note that we can generalize the result to the case of weakly consistent rules.
4. Belief Formation
In this section, we interpret the set of features as signals. Each signal contains some information about the distribution of states of nature. An aggregation rule plays the role of an agent who makes a prediction about the true state of nature based on observing some signals. In this context, the range of an aggregation rule is the set of probability distributions over the states of nature. Following [6], an aggregation rule is a belief formation process that associates with each finite set of signals a belief over the states of nature.
The representation of the belief formation process under the weighted averaging axiom is a straightforward application of the main results. Using our representation, on the one hand, we propose an extension in which the timing of signals may matter. We consider the case where an agent receives signals at different times in the past. The agent forms a prediction at the present time and may perceive signals closer to the time of the prediction as more credible. To capture the representation, we introduce the stationarity axiom, under which the belief induced by a set of received signals and their timing is the same as the belief induced by shifting the timings of all signals to the past by a constant.
Under stationarity, any belief formation process satisfying the strict weighted averaging axiom has a weight function over the set of signals and an exponential discount factor over each time zone. The belief associated with a set of received signals is the discounted weighted average of the beliefs associated with each signal. In this case, the weight function captures the time-independent value of each signal.
On the other hand, we interpret the set of signals as the information structure of an agent who wants to predict the true state, with each subset of signals an event in her information structure. We show that as long as the information structure has finite cardinality, the strict weighted averaging axiom is the necessary and sufficient condition for a rich belief formation process to appear as a Bayesian updater. This result answers the question in [36] regarding a necessary and sufficient condition for a belief formation process to act as a Bayesian updating rule.
4.1. Belief Formation Processes
Let $S$ be a set of states of nature and let $\Delta(S)$ be the set of all probability distributions over $S$. We interpret the elements of the set $X$ as disjoint signals or events. The role of an aggregation rule over a finite subset of $X$ is to predict the true state of nature by assigning probabilities to the states in $S$. Therefore, following [6], aggregation rules can be interpreted as belief formation processes, which assign a belief over the states of nature to each observed finite set of signals.
Definition 8.
A belief formation process is a function $f: X^* \to \Delta(S)$ that associates with every finite set of signals $A \in X^*$ a belief $f(A)$ on the states of nature.
Theorem 2 shows that if the belief induced by the union of two disjoint finite sets of signals is on the line segment connecting the beliefs induced by each set of signals separately, then, under the strong richness condition, there exist a strictly positive weight function and a weak order over the set of signals such that the belief over any finite subset of signals is a weighted average of the beliefs induced by the highest-ordered signals of that subset.
By requiring the belief formation process to use both of the induced beliefs, i.e., requiring the belief induced by the union of two disjoint finite sets of signals to lie in the “interior” of the line segment connecting the beliefs induced by each set of signals separately, we can use Theorem 1 to find the representation. Formally, we have:
Corollary 2.
Let $f$ be a strongly rich belief formation process satisfying the weighted averaging property. Then, there exist a unique weak order $\succsim$ on $X$ and a weight function $w: X \to \mathbb{R}_{>0}$ such that for every $A \in X^*$:
(4.1) $f(A) = \dfrac{\sum_{x \in M_\succsim(A)} w(x) f(x)}{\sum_{x \in M_\succsim(A)} w(x)},$
where $M_\succsim(A)$ is the set of highest-ordered signals of $A$.
Moreover, if $f$ satisfies the strict weighted averaging property, then the weak order has only one equivalence class, and for every $A \in X^*$:
(4.2) $f(A) = \dfrac{\sum_{x \in A} w(x) f(x)}{\sum_{x \in A} w(x)}.$
Although representation (4.2) is, under the strict weighted averaging property, similar to the one in [6], their belief formation process is defined over sequences of signals, in which each sequence may contain multiple copies of the same signal. In contrast, we define the belief formation process over sets of signals, so each set contains at most one copy of a signal. Billot et al.'s main axiom, the concatenation axiom, is defined over any two sequences of signals and counts the number of occurrences of each signal in each sequence. However, our strict weighted averaging property, expressed for disjoint sets $A$ and $B$, does not allow $A$ and $B$ to contain the same signal.
4.2. Role of Timing
We now explore the role of the timing of signals by associating signals with time labels. In this setting, a signal closer to the time of the prediction may be perceived as more credible (have more weight) than the same signal received further in the past. Formally, let $X$ be the set of signals. The present time is denoted by $0$, and time $t \in \mathbb{Z}_{\ge 0}$ represents $t$ units of time before the present time. For a given finite subset of signals $A \in X^*$, let a function $\tau_A: A \to \mathbb{Z}_{\ge 0}$ represent the timing of each signal in the set $A$, i.e., for any signal $x \in A$, $\tau_A(x)$ is the time of receiving the signal $x$. Given $t \in \mathbb{Z}_{\ge 0}$, $\tau_A + t$ represents a time shift of size $t$ of the timing of a set of received signals $A$. Finally, the set $D = \{(A, \tau_A) : A \in X^*,\ \tau_A: A \to \mathbb{Z}_{\ge 0}\}$ represents all possible realizations of received signals. In this context, a belief formation process is a function $f: D \to \Delta(S)$.
Our main consistency property, in addition to the strict weighted averaging property, is the stationarity property. A belief formation process is stationary if the belief induced by a set of received signals and their timing is the same as the belief induced by a constant shift of the timings of the same received signals. More precisely:
Definition 9.
A stationary belief formation process is a function $f: D \to \Delta(S)$ such that
$f(A, \tau_A) = f(A, \tau_A + t)$
for all $A \in X^*$, all timings $\tau_A$, and all $t \in \mathbb{Z}_{\ge 0}$.
The next proposition characterizes stationary belief formation processes satisfying the strict weighted averaging property.
Proposition 1.
Let a rich and stationary belief formation process $f$ satisfy the strict weighted averaging property. Then, there exist a unique discount factor $\delta > 0$ and a unique (up to multiplication by a positive number) weight function $w: X \to \mathbb{R}_{>0}$ such that for all $(A, \tau_A) \in D$:
(4.3) $f(A, \tau_A) = \dfrac{\sum_{x \in A} \delta^{\tau_A(x)} w(x) f(x)}{\sum_{x \in A} \delta^{\tau_A(x)} w(x)},$
where $f(x)$ denotes the belief induced by the single signal $x$.
As a consequence of the representation, under the assumptions of the proposition, the weight of a received signal separates into two factors. One is the intrinsic value of the signal, captured by $w$. The other is the role of timing, captured by $\delta$. Moreover, the only discounting that captures the role of the timing is the exponential form. If $\delta = 1$, the timing is not important; the belief formation process only considers the intrinsic value of each signal. However, when $\delta \ne 1$, the belief formation process places relatively more ($\delta < 1$) or less ($\delta > 1$) weight on a signal received closer to the time of the prediction.
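A small sketch of the discounted rule (4.3), with hypothetical signals, weights, and single-signal beliefs; the final assertion checks stationarity, since shifting all timings by a constant multiplies every coefficient by the same factor $\delta^t$, which cancels in the normalization:

```python
delta = 0.9                                    # discount factor
w = {"s1": 1.0, "s2": 2.0}                     # time-independent signal weights
belief = {"s1": (0.8, 0.2), "s2": (0.3, 0.7)}  # f(x) for each single signal

def form_belief(timing):
    """Rule (4.3): timing maps each signal to periods before the prediction."""
    coef = {s: delta ** t * w[s] for s, t in timing.items()}
    total = sum(coef.values())
    return tuple(sum(coef[s] * belief[s][i] for s in coef) / total
                 for i in range(2))

b0 = form_belief({"s1": 0, "s2": 3})
b_shift = form_belief({"s1": 5, "s2": 8})      # all timings shifted by t = 5
assert all(abs(u - v) < 1e-12 for u, v in zip(b0, b_shift))
```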
4.3. Bayesian Updating
Let $(X, 2^X)$ be the measure space of events, where $X$ contains a finite number of disjoint events. The space of events captures the information structure of the belief formation process. Similarly, considering the set $S$, we denote by $(S, 2^S)$ the measure space of states of nature, where $2^S$ is the set of subsets of the set $S$. For any probability distribution $p \in \Delta(S)$ and any subset of states of nature $E \subseteq S$, let $p(E)$ denote the probability of $E$ induced by the distribution $p$. Hence, $p(E) = \sum_{s \in E} p(\{s\})$.
Definition 10.
A belief formation process $f$ is Bayesian if there exists a probability measure $\mu$ on the space $(X \times S, 2^{X \times S})$ such that for every $A \in X^*$ and $E \subseteq S$ we have:
(4.4) $f(A)(E) = \dfrac{\mu(A \times E)}{\mu_X(A)},$
where $\mu_X$ is the marginal probability distribution of $\mu$ over $X$.
The right-hand side of the previous equation is the conditional probability of $E$ given $A$. Therefore, a Bayesian belief formation process behaves as a Bayesian updater: upon observing an event $A$ in her information structure $2^X$, her prediction of the probability that the true state lies in a subset $E$ comes from the Bayes rule. To put it differently, $f(A)(E)$ is equal to the conditional probability $\mu(E \mid A)$.
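Conversely, any belief formation process built from a joint measure as in (4.4) is Bayesian by construction. A minimal sketch with a hypothetical joint measure $\mu$ on $X \times S$:

```python
# Hypothetical joint measure mu over two signals and two states.
mu = {("x1", "s1"): 0.10, ("x1", "s2"): 0.20,
      ("x2", "s1"): 0.30, ("x2", "s2"): 0.40}
states = {"s1", "s2"}

def form_belief(A):
    """Rule (4.4): f(A)(s) = mu(A × {s}) / mu_X(A), conditioning on event A."""
    marginal = sum(mu[(x, s)] for x in A for s in states)  # mu_X(A)
    return {s: sum(mu[(x, s)] for x in A) / marginal for s in states}

b = form_belief({"x1"})
assert abs(b["s1"] - 0.10 / 0.30) < 1e-12   # Bayes rule: P(s1 | x1) = 1/3
```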
Our next proposition shows that our strict weighted averaging axiom is the necessary and sufficient condition for a rich belief formation process to be Bayesian.
Proposition 2.
A rich belief formation process is Bayesian if and only if it satisfies the strict weighted averaging property.
Note that the richness condition is crucial: without it, as shown in Example 3, there are belief formation processes that satisfy the strict weighted averaging axiom but are not Bayesian updaters. We will now present a more general version of Proposition 2 by adding the strong richness condition and weakening the strict weighted averaging condition to the weighted averaging property. In this more general version, zero probability events are possible, and the belief formation process behaves as a Bayesian updater even conditional on observing a zero probability event. To capture this idea, we need the following definition.
Definition 11.
A class of functions $\{\mu(\cdot \mid A)\}_{A \in X^*}$ is a conditional probability system if it satisfies the following properties:
(1) For every $A \in X^*$, $\mu(\cdot \mid A)$ is a probability measure on $X \times S$ with $\mu(A \times S \mid A) = 1$.
(2) For every pair of disjoint events $A, B \in X^*$ and for every $C \subseteq X \times S$, we have:
$\mu(C \mid A \cup B) = \mu(C \mid A)\,\mu(A \times S \mid A \cup B) + \mu(C \mid B)\,\mu(B \times S \mid A \cup B).$
In this definition, the probability measure $\mu(\cdot \mid X)$ represents a prior probability measure, and $\mu(\cdot \mid A)$ represents a posterior (conditional) probability given the event $A$. Therefore, for any set $E \subseteq S$, $\mu(X \times E \mid A)$ is the conditional probability of $E$ given $A$. Moreover, for any two events $A, B$, $\mu(B \times S \mid A)$ is the conditional probability of the event $B$ given $A$. The first property of Definition 11 requires that the support of the posterior probability conditioned on an event $A$ be contained in $A$. The second property requires that, conditional on the event $A \cup B$, the Bayes updating rule be satisfied even if the prior probability of $A \cup B$ is zero.
Definition 12.
A belief formation process $f$ is rationalizable by a conditional probability system $\{\mu(\cdot \mid A)\}_{A \in X^*}$ if for every $A \in X^*$ and $E \subseteq S$ we have:
(4.5) $f(A)(E) = \mu(X \times E \mid A).$
By adding the strong richness condition, the next proposition shows that the weighted averaging axiom is the necessary and sufficient condition for rationalizing a belief formation process by a conditional probability system.
Proposition 3.
A strongly rich belief formation process is rationalizable by a conditional probability system if and only if it satisfies the weighted averaging axiom.
Remark 2.
[36] considers the problem of characterizing the updating rules (in our context, the belief formation processes) that appear Bayesian. By providing an example, they show that their soundness condition (our strict weighted averaging condition) is not sufficient for an updating rule to behave as a Bayesian updater. However, we show that the strict weighted averaging condition is necessary and sufficient as long as the belief formation process satisfies our richness condition.
5. Average Choice Functions
In this section, the set of features $X$ is a subset of $\mathbb{R}^n$. We interpret each feature as a choice object, and the aggregation rule as a decision-maker who selects a choice randomly from a menu of choice objects. We model the decision-maker as an average choice function that associates with any menu of choice objects an average choice (the mean of the distribution of choices) in the convex combination of the choice objects. The average choice is easier to report and obtain than the entire distribution of choices (see [1] for a complete discussion of the merits of average choice). However, except when the elements of a menu are affinely independent, the average choice does not uniquely reveal the underlying distribution of choices.
First, using our main representation, we show that it is possible to uniquely extract the underlying distribution of choices as long as the average choice function satisfies the weighted averaging axiom.
Then, we illustrate two applications of the result. In one application, we consider the class of average choice functions that can be rationalized by a Luce rule, i.e., a stochastic choice function satisfying the independence of irrelevant alternatives (IIA) axiom proposed by [29]. We show that the average choice functions satisfying the strict weighted averaging axiom are exactly the ones that can be rationalized by a Luce rule. More generally, we show that the class of average choice functions satisfying the weighted averaging axiom is the same as the class of average choice functions rationalizable by the two-stage Luce model proposed by [14].
In the second application, we consider continuous average choice functions. First, we show that any continuous average choice function satisfying the weighted averaging axiom is rationalizable by a Luce rule. This means that no continuous average choice function is rationalizable by a two-stage Luce rule but not by a Luce rule.
Then, we illustrate a connection between our result and that of [22] regarding the impossibility of an average choice function satisfying both the path independence axiom and continuity.
5.1. Set up
In this section, $X$ is a nonempty subset of $\mathbb{R}^n$ that is not a subset of a line. For any $A \in X^*$, we denote by $\mathrm{conv}(A)$ the set of all convex combinations of vectors in $A$.
Definition 13.
An aggregation rule $f$ is called an average choice function if, for any menu of choices $A \in X^*$, $f(A) \in \mathrm{conv}(A)$.
One of the goals of this section is to present a connection between our weighted averaging condition and the path independent, Luce, and two-stage Luce choice models. The following is a corollary of Theorems 2 and 4.
Corollary 3.
Let an average choice function be strongly rich. The following statements are equivalent:
-
(1)
The average choice function satisfies the weighted averaging condition.
-
(2)
There exists a unique weak order on and a unique weight function , up to multiplication over equivalence classes of the weak order such that for every :
(5.1)
Moreover, if the average choice function satisfies continuity and the weighted averaging condition, the weight function is continuous and the weak order is the equivalence order. In this case, for every :
(5.2)
5.2. Luce Rationalizable Average Choice Functions
The following definitions are standard definitions in the context of individual decision-making.
Definition 14.
A stochastic choice is a function , such that for any .
For an average choice function and a menu , . Therefore, there exists a stochastic choice (which may not be unique) that rationalizes the average choice function , i.e., , where is the probability of selecting the element from the menu .
One appealing form of a stochastic choice function is one that satisfies Luce’s IIA, i.e., the probability of selecting an element over another element is independent of any other element. [29] shows that the stochastic choices satisfying the IIA axiom are exactly the Luce rules.
Definition 15.
A stochastic choice is a Luce rule if there is a function , such that:
Furthermore, if is continuous, then is a continuous Luce rule.
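A minimal sketch of a Luce rule may clarify the definition. All names below (the weight table `w`, the point positions `pos`) are illustrative, not part of the paper's formalism: the probability of choosing `x` from a menu is its weight divided by the menu's total weight, and the induced average choice is the probability-weighted mean of the menu's points.

```python
# Hypothetical Luce rule: choice probability is proportional to a
# strictly positive weight; the average choice is the weighted mean.

def luce_probability(x, menu, w):
    """Choice probability of x from `menu` under the Luce rule with weights w."""
    return w[x] / sum(w[y] for y in menu)

def average_choice(menu, w, pos):
    """Luce-rationalized average choice: the probability-weighted mean point."""
    dim = len(next(iter(pos.values())))
    return tuple(
        sum(luce_probability(y, menu, w) * pos[y][i] for y in menu)
        for i in range(dim)
    )

w = {"a": 1.0, "b": 3.0}
pos = {"a": (0.0, 0.0), "b": (4.0, 0.0)}
menu = ["a", "b"]
# b is chosen with probability 3/4, so the average choice lies strictly
# inside the segment between the two points.
```

Since every weight is strictly positive, the average choice always lands in the relative interior of the menu's convex hull, which is the observation used later in Section 5.3.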
Definition 16.
An average choice function is rationalizable by a stochastic choice , if for all :
Furthermore, if there exists a Luce rule that rationalizes the average choice function , then is Luce rationalizable.
By Theorem 1 and Corollary 3, an average choice function has a Luce-form representation if and only if it satisfies the strict weighted averaging condition. As a result:
Corollary 4.
An average choice function is Luce rationalizable if and only if it satisfies the strict weighted averaging condition. Moreover, the Luce rule that rationalizes the average choice function is unique.
Furthermore, an average choice function is continuous Luce rationalizable if and only if it is continuous and satisfies the strict weighted averaging condition.
In the Luce model, the decision-maker selects each element of a given menu with a strictly positive probability. However, this is not a plausible assumption in many situations: the decision-maker may always select the better choice between two alternatives. We model this behavior with the two-stage Luce model, introduced by [14]. In this model, there exist a ranking order and a weight function over elements. A decision-maker choosing from a menu selects only the highest-ordered elements of the menu, and the probability of selecting each highest-ordered element is related to the weight associated with that element. Formally:
Definition 17.
A stochastic choice is a two-stage Luce rule if there are a function and a weak order over elements of , such that:
(5.3)
Given , the decision-maker selects only the elements of that are the highest-ordered elements of . She chooses each such element with a probability associated with its weight.
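The two-stage procedure can be sketched as follows; the ranking, weights, and alternatives here are illustrative, not taken from the paper's formalism. A weak order, encoded as an integer rank (higher means better), first restricts the menu to its highest-ordered elements, and a Luce rule with weights `w` then chooses among them.

```python
# Hypothetical two-stage Luce rule: rank filters the menu, weights
# allocate probability among the surviving (highest-ordered) elements.

def two_stage_luce(x, menu, rank, w):
    """Choice probability of x from `menu` under a two-stage Luce rule."""
    top = max(rank[y] for y in menu)
    best = [y for y in menu if rank[y] == top]   # highest-ordered elements of the menu
    if x not in best:
        return 0.0                               # dominated elements are never chosen
    return w[x] / sum(w[y] for y in best)

rank = {"a": 2, "b": 2, "c": 1}
w = {"a": 1.0, "b": 1.0, "c": 5.0}
menu = ["a", "b", "c"]
# c is strictly dominated, so it is never chosen despite its large
# weight; a and b split the choice probability equally.
```

Note how this allows zero choice probabilities, which a plain Luce rule rules out; when all elements share the same rank, the rule reduces to an ordinary Luce rule.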
By Theorem 2, any average choice function satisfying the weighted averaging axiom is rationalizable by a two-stage Luce rule.
Corollary 5.
A strongly rich average choice function is two-stage Luce rationalizable if and only if it satisfies the weighted averaging axiom. Moreover, the two-stage Luce rule that rationalizes the average choice function is unique.
Remark 3.
Under the continuity condition, Theorem 4 implies that the two-stage Luce model and the Luce model are equivalent. The next section discusses this observation.
5.3. Continuous Average Choice Functions
In this section, we consider the class of continuous average choice functions satisfying the weighted averaging condition. First, we reinterpret our Corollary 3 as an impossibility result: no continuous average choice function is rationalizable by a two-stage Luce model but not by a Luce model. Then, we show the connection with the impossibility result of [22], regarding the impossibility of a choice function satisfying both path independence and continuity.
[32] extensively studies choice functions under the path independence axiom. Plott’s notion of path independence requires the choice from the union of two disjoint menus , to be the choice between the choice from and the choice from . Under this axiom, the choice from any menu can be obtained recursively by partitioning the elements of the menu into disjoint sub-menus: the choice from the whole menu is then the choice from the choices of the sub-menus. In our setup, for an average choice function , we have:
Definition 18.
satisfies the path independence condition if
for all such that .
The path independence condition is stronger than our weighted averaging condition. In other words, any average choice function under Plott’s notion of path independence satisfies the weighted averaging condition. More precisely, given a choice function and two disjoint menus , under the path independence condition, . By the definition of average choice functions, , which shows that the choice function satisfies the weighted averaging axiom.
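A toy check of this implication may help; the ranking and points below are illustrative assumptions. Consider the degenerate average choice function that returns the single highest-ranked point of a menu. It is path independent, and its choice from a union of two disjoint menus is trivially a (degenerate) weighted average of the choices from the two menus.

```python
# Hypothetical path independent average choice: pick the single
# highest-ranked point of a menu (a degenerate average).

rank = {"a": 3, "b": 2, "c": 1}
pos = {"a": (0.0, 1.0), "b": (1.0, 0.0), "c": (1.0, 1.0)}

def best(menu):
    """Label of the highest-ranked element of the menu."""
    return max(menu, key=lambda y: rank[y])

def tau(menu):
    """The (degenerate) average choice: the position of the best element."""
    return pos[best(menu)]

A, B = ["a", "c"], ["b"]
# Path independence: the choice from A ∪ B equals the choice from the
# two-element menu {best(A), best(B)}; it is a degenerate convex
# combination (weight 1 on one of tau(A), tau(B)), so weighted
# averaging holds.
```

This is of course only one direction and one example: it illustrates why path independence forces weighted averaging, not the converse.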
Continuity is an appealing property of an average choice function. It specifies that by replacing an element of a menu with another element close to it, with respect to the norm of , the average choice of the new menu is close to the average choice of the previous menu. [22] shows that there is no average choice function that satisfies both path independence axiom and continuity. Here, we reinterpret the result of corollary 3 to show a more general result for average choice functions.
Corollary 3 states that, for a strongly rich continuous average choice function satisfying the weighted averaging condition, there exists a unique weight function such that for any :
There are two important observations regarding the representations above.
First, through discussions in Section 5.2, the representation shows that any continuous average choice function that is rationalizable by a two-stage Luce model is also rationalizable by a Luce model. Second, since the function is strictly positive, the average choice of any menu should be in the relative interior of the convex hull of members of the menu.
As a result, our impossibility result specifies that for an average choice function satisfying the weighted averaging condition, it is impossible both to satisfy the continuity condition and to have a choice from some menu lie on the relative boundary of the elements of the menu. We summarize the observation in the following corollary.9 To see the connection between our Corollary 6 and the result in [22], it is enough to consider a menu with three non-collinear members. [22, Thm. 1] shows that the choice of a path independent average choice function from any menu is its choice from a two-member sub-menu of the menu. Hence, the choice from a menu with three non-collinear members lies on the line segment connecting two of the members, and is therefore on the relative boundary of the menu; this is why it cannot satisfy continuity.
Corollary 6.
If is a nonempty convex subset of a vector space that contains at least three non-collinear points, then an average choice function that satisfies the weighted averaging condition cannot both be continuous and admit a menu , with .
6. Extended Pareto Aggregation Rules
This section demonstrates an application of Section 3 to social choice problems. In this domain, each feature represents an individual’s preference ordering over a set of alternatives. Each preference ordering satisfies the axioms of [38]. The role of an aggregation rule is to associate with each coalition of individuals another vN-M preference ordering over the set of alternatives.
An appealing property of an aggregation rule, in this context, is to satisfy the extended Pareto axiom, introduced by [35]. It specifies that if two disjoint coalitions of individuals each prefer an outcome over another outcome, then the union of the coalitions should also prefer that outcome over the other. Moreover, if one of them strictly prefers one outcome over the other, then the union of the coalitions should also strictly prefer it.
First, we show that under a normalization of cardinal utilities of individuals and a minor richness condition, aggregation rules under the strict weighted averaging (weighted averaging) axiom are exactly aggregation rules under the extended Pareto (extended weak Pareto) axiom.
Following the equivalence, we use our main representation result as a technical tool to pin down the representation of extended Pareto aggregation rules. We show that the only possible extended Pareto aggregation rule assigns a positive weight to each individual in the society; the aggregated preference ordering of a given group of individuals is then the weighted sum of their preference orderings.
The representation can be considered as a multi-profile version of the theorem by [20] on utilitarianism. Harsanyi considers a single profile of individuals and a variant of Pareto to obtain utilitarianism. In our approach, by contrast, we partition a profile into smaller groups and aggregate the preference orderings of these smaller groups using the extended Pareto axiom; utilitarianism then follows from this consistent form of aggregation. As a result, in our representation, the weight associated with each individual appears in all sub-profiles that contain her.10 Similar to the discussion of [39] regarding the Sen-Harsanyi debate, our result is better interpreted as a representation than as a justification of utilitarianism.
In Section 6.3, we extend our result on extended Pareto aggregation rules to the class of generalized social welfare functions. Unlike our previous model, individuals may have different preference orderings. Therefore, the domain of the generalized social welfare function is the set of all groups of individuals (of all possible sizes), with each individual allowed any possible preference ordering. Our definition of generalized social welfare function extends the standard definition used by [3], in which the domain is a set of fixed-length profiles of individuals.
For a technical reason, we restrict the set of vN-M preferences to those that all strictly prefer one fixed lottery to another fixed one. We show that the only possible extended Pareto generalized social welfare functions are the ones that associate a positive number with each individual’s preferences (unlike the previous section, in which each weight depends on both the individual and the whole profile), and that associate each coalition with the weighted sum of their cardinal utilities, using the weights associated with their preferences.
The important observation is that each positive weight in the representation is independent of the other individuals in any profile. The weight depends only on the individual and her own preference ordering.
Our representation above has a positive nature, compared to the claims by [23] and [21] that the negative conclusion of Arrow’s theorem holds even with vN-M preferences. Moreover, the representation provides an answer to the main concern of [7, 8] regarding the correctness of the main theorem of [12].
[12], using a set of axioms other than Arrow’s, provides one of the first axiomatizations of relative utilitarianism as a possibility result. However, [7] shows a counterexample to their representation. Our representation fixes the error using our variant of the extended Pareto axiom and our restricted domain of the generalized social welfare function.
Finally, adding the anonymity and weak IIA axioms of [12] yields relative utilitarianism as one possible choice of the weight function. However, the primary concern of our paper is to show that the weighted averaging of preferences is the only generalized social welfare function that respects extended Pareto; the possible choices of weights are not our focus in this paper.
6.1. Set up
Let the set and . A lottery associates the probability to the prospect and to the prospect .
A vN-M preference over the set is a preference relation that satisfies the axioms of [38] as defined below.11 If is a vN-M preference over the set , then, by the vN-M theorem, there exists an affine representation of the preference . For notational convenience, we normalize all affine representations to have the value over the prospect .
Definition 19.
We say that is a vN-M preference over the set if it is a weak order and if there exists a , known as a utility, such that for any , if and only if where “” represents the inner product in . Moreover, the (unique) ray contains all normalized affine utilities that represent the vN-M preference . We write for the set of all vN-M preferences over and for the strict part of the preference .
Let represent the set of all agents and be the set of all finite subsets of . Write for the X-Fold Cartesian product of . Every defines a preference profile of the set of agents over the set of lotteries.
Definition 20.
A group aggregation rule on X is a function , that associates with every coalition of agents a vN-M preference .
An appealing rationality property of group aggregation rules is that whenever two disjoint coalitions, e.g. , both prefer a lottery to another lottery , then their union, , also prefers the lottery to the lottery .
Definition 21.
A group aggregation rule satisfies the extended Pareto property if for all disjoint coalitions of agents , and for all lotteries ,
(6.1)
(6.2)
Our last condition requires the existence of two lotteries in the set of lotteries, in which all agents strictly prefer one over the other.
Definition 22.
A group aggregation rule satisfies the minimal agreement condition if there exist two lotteries such that for every agent , .
Remark 4.
Let a group aggregation rule satisfy both the minimal agreement and extended Pareto axiom. Given two agents , by applying the strict part of the definition of the extended Pareto axiom, we have . Similarly, for every coalition of agents , recursively using the strict part of the extended Pareto axiom, we deduce .
Remark 5.
Let the vector be , where are the two lotteries in the definition of the minimal agreement condition. Let represent the vN-M preference . Hence, if and only if . Therefore, the definition of the minimal agreement condition is equivalent to the existence of a direction such that for all , . This interpretation of is exactly the role of in Section 3.
6.2. The Representation of Extended Pareto Group Aggregation Rules
In this section, we assume that the group aggregation rule satisfies the minimal agreement condition. In particular, we assume that all agents strictly prefer the lottery over the lottery . Considering Remark 5, we define as the direction on which every agent agrees. For a coalition of agents , let the ray represent the vN-M preference . Let represent the normalization of utilities in which the difference between the utility values of the lottery and the lottery is exactly 1. For every coalition of agents , there is a unique cardinal utility , such that is in . For the rest of the section, for every coalition , we consider the unique cardinal utility to represent the vN-M preference . Using this representation, we can represent the group aggregation rule by a normalized group aggregation rule , where .
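The normalization described above can be sketched concretely; the symbols `u` (an affine utility on a preference's ray) and `d` (the difference of the two agreed-upon lotteries) are illustrative stand-ins for the paper's notation. Each vN-M preference is a ray of affine utilities, and we select the unique representative whose inner product with the agreed direction equals 1.

```python
# Hypothetical normalization: rescale a utility vector u on its ray so
# that u · d = 1, where d is the agreed-upon direction. Minimal
# agreement guarantees u · d > 0, so the representative exists and is
# unique on the ray.

def normalize(u, d):
    scale = sum(a * b for a, b in zip(u, d))   # the inner product u · d
    if scale <= 0:
        raise ValueError("utility does not strictly prefer along direction d")
    return tuple(a / scale for a in u)

d = (1.0, -1.0)   # difference of the two agreed lotteries (illustrative)
u = (4.0, 2.0)    # some affine utility on the preference's ray
# The normalized representative is (2.0, 1.0), and (2.0)·1 + (1.0)·(-1) = 1.
```

With this normalization fixed, each preference corresponds to exactly one cardinal utility, which is what lets the aggregation rule be written as a map between utility vectors.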
Remark 6.
Without loss of generality, we can assume that the lottery in the definition of the minimal agreement condition is just the lottery . In that case, the space is vN-M preferences with the value for the lottery and the value for the lottery .
The next proposition, which is the same as Theorem 5, shows that under the representation of the vN-M preference by the , the extended Pareto property is equivalent to the strict weighted averaging property. Formally, we have:
Corollary 7.
Let a group aggregation rule satisfy the minimal agreement condition with as the direction on which all agents agree. Then, the following are equivalent:
-
(1)
satisfies the extended Pareto property.
-
(2)
satisfies the strict weighted averaging property.
Using the result of Theorem 1, we deduce the representation of the extended Pareto group aggregation rules.
Corollary 8.
Let a rich group aggregation rule satisfy both the extended Pareto property and minimal agreement condition. Then, there exists a weight function such that for every coalition of agents ,
(6.3)
Moreover, the weight function is unique up to multiplication by a positive number.
As shown in Example 3, the richness condition is crucial. The richness here is equivalent to the existence of three non-collinear “normalized” cardinal utilities in the space (the range of the aggregation rule). We can interpret the theorem as a generalization of the main theorem of [20] on Utilitarianism. However, our result shows the connection between weights of individuals in different sub-coalitions of the main profile.
To see the connection with Harsanyi’s result, we rewrite the theorem in an additive form: let the group aggregation rule satisfy both the extended Pareto property and minimal agreement condition. Then, there exists a weight function such that for every coalition of agents , has the following representation:
(6.4)
Defining for , we can rewrite equation 6.4 in the additive form . Moreover, if we consider only the representations with the value for the lottery , this representation is unique up to multiplication by a positive number.
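The additive form can be sketched numerically; the weights `lam`, utilities `u`, and lotteries below are illustrative assumptions, not the paper's data. Each agent carries a positive weight and a normalized cardinal utility, the coalition's utility is the weighted sum, and positivity of the weights is what preserves unanimous strict rankings (the extended Pareto property).

```python
# Hypothetical utilitarian aggregation: coalition utility is the
# lam-weighted sum of individual cardinal utilities (vectors over
# prospects).

def aggregate(coalition, lam, u):
    dim = len(next(iter(u.values())))
    return tuple(sum(lam[i] * u[i][k] for i in coalition) for k in range(dim))

def eu(util, lottery):
    """Expected utility of a lottery (a probability vector) under util."""
    return sum(a * b for a, b in zip(util, lottery))

lam = {1: 1.0, 2: 2.0}
u = {1: (1.0, 0.0), 2: (1.0, 0.5)}
p, q = (1.0, 0.0), (0.0, 1.0)
# Both agents strictly prefer p to q, so the aggregate (with positive
# weights) must as well.
```

The same weights apply inside every sub-coalition, which is the multi-profile feature distinguishing this representation from a single-profile Harsanyi-style statement.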
6.3. The Representation of Extended Pareto Generalized Social Welfare Functions
The setup of this subsection is the same as the one in the previous section. Without loss of generality, we assume that the lottery , in the definition of the minimal agreement, is the vector . Let be any lottery other than . Define as the set of all vN-M preferences that strictly prefer to . Let be the X-fold Cartesian product of . Every defines a preference profile of the set of individuals. For any coalition and for any preference profile , let denote the restriction of the profile to the coalition .
As in Definition 19, we can represent each preference by a unique ray , where is a cardinal utility representing . Moreover, for any preference , there is a unique cardinal utility with . Write for the space of all cardinal utilities attaining the value at the lottery and the value at the lottery . Let the function associate each preference with the unique cardinal utility that represents it; this function is a bijection.
Write for the set of all profiles where the representation of individuals’ cardinal utilities in the space is not a subset of a single line. Formally, we define , where is the dimension of the smallest linear variety containing all .12 There must be at least four alternatives; otherwise, is the empty set.
Finally, write for all the profiles in and all sub-coalitions of those profiles. is the domain of our generalized social welfare functions. Formally, we have:
Definition 23.
A generalized social welfare function on is a function , that associates with any coalition and any profile a preference . Moreover, we assume that for any individual , and any profile , .
In our setup, the domain of generalized social welfare functions is a rich set of profiles of all sizes. Moreover, it satisfies the Individualism axiom, which means that any single individual’s preference is mapped to that same preference.
The connection between profiles of different sizes is the extended Pareto property: if the associated preference orderings of two disjoint coalitions of individuals, and , each prefer a lottery to , then the associated preference ordering of the union of the coalitions should also prefer to .
Definition 24.
A generalized social welfare function satisfies the extended Pareto property if for every preference profile and for any two disjoint coalitions , and for all lotteries ,
(6.5)
(6.6)
Our main result of this section characterizes the class of extended Pareto generalized social welfare functions.
Theorem 6.
Let be a set of individuals with . The generalized social welfare function satisfies the extended Pareto property if and only if there exists a weight function , such that for any coalition and any preference profile , has the following representation:
(6.7)
Moreover, the weight function is unique up to multiplication by a positive number.
Remark 7.
We can rewrite the theorem to specify that the generalized social welfare function satisfies the extended Pareto axiom if and only if there exists a weight function , such that for any coalition and any preference profile , has the following representation:
(6.8)
Note that each weight depends only on the associated individual’s preferences and not on the other individuals.
The weight function in the representation depends on each individual’s index. However, adding the classical Anonymity condition makes the weight function independent of individuals’ indexes.
Definition 25.
An extended Pareto aggregation rule satisfies the Anonymity condition if no permutation of the indexes of individuals changes the generalized social welfare function.
The Anonymity condition makes any extended Pareto generalized social welfare function independent of the individuals’ indexes. Hence, by the uniqueness of the weight function in Theorem 6, the weight function associated with an anonymous extended Pareto aggregation rule is independent of the indexes. Therefore, we have:
Corollary 9.
Let be a set of individuals with . The extended Pareto generalized social welfare function satisfies the Anonymity condition if and only if there exists a weight function , such that for any coalition and any preference profile , has the representation:
(6.9)
Or, equivalently, if and only if has the representation:
(6.10)
Moreover, the weight function is unique up to multiplication by a positive number.
The positive nature of our result appears to contradict the conjectures by [23] and [21] that the negative conclusion of the impossibility theorem by [3] holds even with vN-M preferences. However, other than the differences between our model and theirs, we only consider the restricted domain where all preferences prefer the lottery over the lottery . As discussed before, the definition of our restricted domain is crucial in corollary 9.
Remark 8.
Relative utilitarianism can be obtained by adding the weak IIA axiom of [12]: the weight function normalizes each preference such that the difference between the cardinal utility of the best alternative and the worst alternative becomes . In other words, for any preference , .
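The relative-utilitarian weight can be sketched as follows. The normalized range in Remark 8 is missing from the extracted text; the sketch assumes, as is standard for relative utilitarianism, that the best-minus-worst utility difference is normalized to 1, so the weight of a cardinal utility vector `u` is the reciprocal of its range.

```python
# Hypothetical relative-utilitarian weight: 1 over the utility range,
# so the rescaled utility spans an interval of length 1 (assumed
# normalization constant).

def relative_weight(u):
    return 1.0 / (max(u) - min(u))

u = (0.0, 2.0, 4.0)
# weight 1/4; the rescaled utility 0.25 * u then has range exactly 1.
```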
7. A Conditional Subjective Expected Utility Theory of State-Dependent Preferences
The choice-theoretic foundation of subjective expected utility was developed in the seminal works of [33], [34], and [2]. In the standard model, the decision-maker has a ranking over acts (state-contingent outcomes). The representation of this ranking consists of a subjective probability over the set of states, capturing the decision maker’s beliefs, and a cardinal utility representing the decision maker’s tastes over the set of outcomes, independent of the realization of the true state. However, in many applications, such as models of buying health insurance, the independence of the utility from the set of states is not a plausible assumption.13 See [4], [9], and [24] for more discussion.
In this section, we provide a simple theory of subjective expected utility with state-dependent utility by reinterpreting our representation of extended Pareto aggregation rules. We build our model using the framework of [2]. In our model, the decision-maker has a preference ordering over the set of conditional constant acts. This means that, given any fixed event, the decision-maker has a hypothetical conditional preference ordering over the set of lotteries, representing her preference conditional on learning that only that event is happening.14 In Section 7, we illustrate another interpretation of hypothetical conditional preferences by providing a preference ordering over the set of conditional constant acts. Each of these hypothetical conditional preferences satisfies the axioms of [38], and hence has an affine representation. We show that as long as the class of hypothetical conditional preferences satisfies the extended Pareto axiom, there exist a subjective probability measure over the set of states and a state-dependent utility over the set of alternatives such that the class of hypothetical conditional preferences has a representation in the form of conditional expectation with respect to this subjective probability and state-dependent utility.
The result shows that extended Pareto is the main force behind the separation of the belief and the state-dependent utility. However, the representation is not unique; the challenge, then, is to give meaning to a decision maker’s prior beliefs when utility is state-dependent. We obtain uniqueness by adding a stronger version of our minimal agreement condition. The strong minimal agreement condition specifies that there exist two lotteries, one strictly preferred to the other regardless of the state. Moreover, the decision maker’s conditional preference for each of them is independent of the set of states.
We show that under the strong minimal agreement, the belief is unique. Moreover, the state-dependent utility is unique up to affine transformation.
7.1. Set up and main result
In this section, we develop a simple theory of subjective expected utility with state-dependent utility by reinterpreting the results of Sections 6.2 and 4.3. Our model is built using the framework of [2]. Let be a finite set of states of nature. The finite set represents outcomes. The simplex represents the set of lotteries over the set . A lottery associates the probability to the outcome . Let the . In our setup, the objects of choice are conditional constant acts. For any lottery , and any event , the function such that for and for is termed a conditional constant act and denoted by . The interpretation is that if the event is realized, the decision maker faces the lottery . Otherwise, will be realized. We assume that the decision maker has a preference relation , not necessarily a complete relation, over the set of conditional constant acts.
Let represent the set of conditional constant acts. For any event , let be the set of all conditional constant acts attaining a lottery on the event and staying out on the event . We represent the conditional preference ordering of the decision maker over by . For any two lotteries , we write as a shorthand of .
Our interpretation of conditional preference ordering is related to the models developed by [30], [15], [37], and [26]. However, there is another interpretation of the conditional preference similar to the conditional decision model of [16]. In this interpretation, we assume that the decision-maker may receive some information that only can be realized. In this case, represents the decision maker’s ex-post preference over the set of lotteries. Similarly, represents her ex-ante preference over exactly the same set of lotteries.
Regardless of the interpretation, the goal is to provide a theory that connects the class of conditional preferences through Bayesian updating. Formally, our goal is to find sufficient conditions under which there exist a state-dependent utility function and a subjective probability measure , such that for every two lotteries , and any event , the following holds:
(7.1)
In the equation above, represents the expected utility of the state-dependent utility in state with respect to the lottery . The right-hand side of the equation compares the conditional expected utilities of the lotteries and , with respect to the subjective probability measure and the state-dependent utility . The importance of the result is that the probability measure depends on the event through the Bayes rule. We will obtain 7.1 from the following axioms/conditions.
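Before turning to the axioms, the target representation (7.1) can be sketched numerically. The names below (the prior `pi`, the state-dependent utilities `U`) are illustrative assumptions: conditional on an event, a lottery is weakly preferred to another iff its conditional expected utility, computed with the Bayes-updated probabilities, is at least as large.

```python
# Hypothetical representation (7.1): state-dependent utilities U[s]
# (vectors over outcomes) and a subjective prior pi over states;
# conditioning on an event E uses the Bayes-updated weights pi(s)/pi(E).

def ceu(p, event, pi, U):
    """Conditional expected utility of lottery p given the event."""
    pi_E = sum(pi[s] for s in event)
    return sum((pi[s] / pi_E) * sum(a * b for a, b in zip(p, U[s]))
               for s in event)

def prefers(p, q, event, pi, U):
    """Whether p is weakly preferred to q conditional on the event."""
    return ceu(p, event, pi, U) >= ceu(q, event, pi, U)

pi = {"s1": 0.5, "s2": 0.5}
U = {"s1": (1.0, 0.0), "s2": (0.0, 1.0)}   # tastes flip across states
```

Because tastes flip across the two illustrative states, the conditional preference over the same pair of lotteries reverses between events, which is exactly the state-dependence the standard model cannot accommodate.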
Axiom 7.1.
(Weak Order) For any event , the conditional preference is complete and transitive.
Axiom 7.2.
(vN-M Continuity) For any event and for every , if , there exist such that
Axiom 7.3.
(Independence) For any event , every , and every ,
Axiom 7.4.
(Extended Pareto) For any two disjoint events , and for every ,
(7.2)
(7.3)
Axiom 7.5.
(Minimal Agreement) There exist two lotteries such that for every , .
Axiom 7.6.
(Richness) There exist three states such that for any , there exist two lotteries where and for .
Under these six axioms, we can rationalize the behavior of the decision-maker as a subjective expected utility maximizer with a state-dependent utility.
Theorem 7.
Suppose that the decision maker’s conditional preferences satisfy Axioms 7.1-7.6. Then there exist a function and a probability measure , such that for every two lotteries , and any event , the following holds:
(7.4)
The proof is similar to the proof of Corollary 8. The probability measure is not unique: let be any probability measure on ; by defining a state-dependent utility , equation 7.4 continues to hold with and . However, if we strengthen the minimal agreement axiom, we attain uniqueness. In the stronger version of minimal agreement, we assume that the decision maker’s preference for each of the two lotteries is independent of the realization of the states. Formally:
Axiom 7.7.
(Strong Minimal Agreement) There exist two lotteries such that for every , . Moreover, and for all .
Conceptually, this axiom is closely related to A.0 axiom by [26]. However, unlike Karni’s axiom, we do not need these two lotteries to be the best and worst lotteries in the set of lotteries. Our model only needs two lotteries, with one strictly preferred to the other, regardless of states. Moreover, the decision maker’s conditional preference for each of them is independent of the set of states.
By replacing the minimal agreement axiom with the strong minimal agreement axiom, we can “uniquely” separate the belief from the state-dependent preference.
Theorem 8.
Suppose that the decision maker’s conditional preferences satisfy Axioms 7.1-7.7. Then there exist a function and a probability measure , such that for every two lotteries , and every event , the following holds:
(7.5)
Moreover, the probability measure is unique and the function is unique up to affine transformations.
Proof.
Based on Theorem 7, there exists a pair satisfying equation 7.5. To prove uniqueness, assume that and both represent the same class of conditional preferences. By the conditional preference and the vN-M Theorem, we know that . Using the strong minimal agreement axiom, we have , , , and for any two states . Therefore, and for all . Hence, for all .
We consider an event . Both and represent the conditional preference . Considering the pair , has the representation
Since is strictly positive, the last representation is the same as
. However, using the other pair, , we get the representation .
Therefore, for any event , and both represent the conditional preference .
Using the richness axiom, strong minimal agreement condition, and uniqueness of corollary 8, we have .
This completes the proof.
∎
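The non-uniqueness discussed before Axiom 7.7 can also be checked numerically. The prior `pi`, alternative prior `q`, and utilities `U` below are illustrative: rescaling each state's utility by the likelihood ratio `pi[s]/q[s]` and swapping the prior for `q` leaves every conditional preference comparison unchanged.

```python
# Numeric check of non-uniqueness: if (U, pi) represents the
# conditional preferences, so does (U2, q) with
# U2[s] = (pi[s]/q[s]) * U[s], for any strictly positive prior q.

def ceu(p, event, prior, U):
    """Conditional expected utility of lottery p given the event."""
    pe = sum(prior[s] for s in event)
    return sum((prior[s] / pe) * sum(a * b for a, b in zip(p, U[s]))
               for s in event)

pi = {"s1": 0.5, "s2": 0.5}
q = {"s1": 0.25, "s2": 0.75}
U = {"s1": (2.0, 1.0), "s2": (0.0, 3.0)}
U2 = {s: tuple(pi[s] / q[s] * x for x in U[s]) for s in pi}

def same_ranking(p1, p2, event):
    """Do (U, pi) and (U2, q) rank p1 vs. p2 the same way given the event?"""
    return (ceu(p1, event, pi, U) >= ceu(p2, event, pi, U)) == \
           (ceu(p1, event, q, U2) >= ceu(p2, event, q, U2))
```

Conditional on any event, the two pairs induce expected utilities differing only by a positive event-specific factor, so all comparisons agree; this is why strong minimal agreement is needed to pin down the prior.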
8. Related Literature
Our methods are applicable to different areas of economic theory, and generalize existing ideas in those areas. In particular, instances of our weighted averaging axiom appear in several different papers.
The theory of case-based prediction was developed in the seminal works of [17, 18, 19] and [6]. In this context, the concatenation axiom proposed by [6] is closely related to the strict case of our axiom. However, there are differences between the two axioms. As discussed in detail in Section 4.1, their belief formation process is defined over “sequences” of cases, in which each sequence can contain multiple copies of the same case; the role of the concatenation axiom is to count the number of each case. In our framework, by contrast, we define our axiom over “sets” of signals, in which each signal appears only once. Moreover, our axiom is defined over disjoint sets; if we weaken the definition to apply to any two general sets, our result no longer holds. By a duality argument, the consistency axiom of our paper corresponds to the combination axiom of [18]. Our goal, however, is to show that the combination axiom can be obtained from a weaker version of the concatenation axiom using a simple duality argument.
[36] provide an example, on a binary state space, showing that their soundness condition is not sufficient for an updating rule to behave as a Bayesian rule. However, we show that under our richness assumption, the strict weighted averaging axiom (which is the same as their soundness condition) is necessary and sufficient for an updating rule to behave as a Bayesian rule. We also generalize our result to the class of updating rules that can be rationalized by a conditional probability system.
In the context of choice theory, [1] introduces a model of continuous average choice over convex domains. In this application, we generalize their result in several ways. First, their result holds only for the strict case of our axiom. Moreover, continuity and convexity are the two important forces behind their result. We show that the strictness of an average choice function, continuity, and convexity of the domain are not the main forces behind extracting the underlying distribution of choices; the main force is our weighted averaging axiom. Moreover, we show that an average choice function can be rationalized by a two-stage Luce model as long as it satisfies our weighted averaging axiom.
Path-independent choice functions are extensively studied by [32]. Our representation of average choice functions under the weighted averaging axiom and continuity generalizes the results of [22] and [31] regarding the impossibility of a choice function satisfying both path independence and continuity.
[5] study the extended Pareto rule over vN-M preferences by relaxing the completeness axiom. Beyond the technical and conceptual differences between the two approaches and results, their model depends on their non-degeneracy condition. That condition is satisfied only when there is a spanning tree over the preferences and every three consecutive preferences in the spanning tree are linearly independent. By contrast, the richness condition of our theorem only requires three linearly independent vectors among the whole set of preferences. Moreover, our result applies even to the class of extended weak Pareto aggregation rules under our strong richness condition. Note that our primary goal in this paper is to show that extended Pareto and extended weak Pareto are special cases of our weighted averaging axiom (under the minimal agreement condition).
The papers by [12], [13], and [8], each by considering a different set of axioms other than Arrow’s, provide an axiomatization of relative utilitarianism as a positive result. The paper by [12] is the closest to ours. Dhillon considers a variant of extended Pareto to obtain a weighted averaging structure. However, [7] gives a counterexample to that representation. We restrict the domain and use our definition of extended Pareto to obtain the weighted averaging structure as a consequence of our main theorem. Again, the technique we developed can also be used to provide a representation of extended weak Pareto social welfare functions.
Finally, there are many papers and different approaches to address the shortcomings of subjective expected utility theory. Note that our goal, in this context, is to explain the basic underlying structure that lets us separate beliefs and state-dependent utilities.
[28] and [27] use hypothetical preferences on hypothetical lotteries to obtain the identification of the beliefs and state-dependent preferences. [10], [11], and [25] present different theories to identify state-dependent preferences in situations where moral hazard is present.
[30] and [15] use preferences on an enlarged choice space of all conditional acts to model subjective expected utility with state-dependent preferences. Our paper, by contrast, only considers hypothetical conditional preferences on the set of conditional “constant” acts. We find the necessary and sufficient condition under which our conditional preferences are related to each other through a subjective probability and a state-dependent utility.
The papers by [37], [16], and [26] are conceptually close to our main result of Section 7. However, there are many differences among these results. Moreover, our goal is to build a model in which extended Pareto alone delivers the separation of beliefs and state-dependent preferences.
[37] presents a nonexpected utility model by considering hypothetical preferences over the set of act-event pairs. His coherence axiom plays the same role as the extended Pareto axiom in our setup. However, he uses the solvability axiom in order to apply Debreu’s additive representation theorem. In our paper, we consider the class of conditional vN-M preferences; as a result, we only require extended Pareto for our representation.
[26] presents a general model with a preference ordering over the set of unconditional acts. Using the preference order, he defines the set of conditional preferences over the set of all conditional acts. Therefore, to connect the class of conditional preferences, the model needs the existence of the constant-valuation acts. Moreover, the cardinal and ordinal coherence axioms are the main forces behind obtaining the Bayesian updating in his representation. However, in our more restricted domain, we only need the extended Pareto to get our representation.
Finally, [16], by replacing Savage’s sure-thing principle with dynamic consistency, obtains a subjective expected utility theory in which the conditional preferences are connected through Bayes’ rule. However, his representation only holds for state-independent preferences.
Acknowledgments
The authors gratefully acknowledge support from Beyond Limits (Learning Optimal Models) through CAST (The Caltech Center for Autonomous Systems and Technologies) and partial support from the Air Force Office of Scientific Research under awards number FA9550-18-1-0271 (Games for Computation and Learning) and FA9550-20-1-0358 (Machine Learning and Physics-Based Modeling and Simulation).
The first version of the paper was written during the first author’s Ph.D. studies with many helpful comments from Federico Echenique and Kota Saito. The first author thanks his Ph.D. advisors Jaksa Cvitanic, Federico Echenique, Kota Saito, and Robert Sherman. For helpful discussions, the first author thanks Itai Ashlagi, Kim Border, Martin Cripps, David Dillenberger, Drew Fudenberg, Simone Galperti, Michihiro Kandori, Igor Kopylov, Jay Lu, Fabio Maccheroni, Thomas Palfrey, Charles Plott, Luciano Pomatto, Antonio Rangel, Pablo Schenone, Omer Tamuz, and Leeat Yariv.
References
- [1] D. Ahn, F. Echenique and K. Saito “On Path Independent Stochastic Choice” In Theoretical Economics 13, 2018, pp. 61–85
- [2] F. Anscombe and R. Aumann “A Definition of Subjective Probability” In The Annals of Mathematical Statistics 34, 1963, pp. 199–205
- [3] K. Arrow “Social Choice and Individual Values” Yale University Press, 1963
- [4] K. Arrow “Optimal Insurance and Generalized Deductibles” In Scandinavian Actuarial Journal 1, 1974, pp. 1–42
- [5] M. Baucells and L. Shapley “Multiperson utility” In Games and Economic Behavior 62, 2008, pp. 329–347
- [6] A. Billot, I. Gilboa, D. Samet and D. Schmeidler “Probabilities as Similarity-Weighted Frequencies” In Econometrica 73, 2005, pp. 1125–1136
- [7] T. Borgers and Y. Choo “A Counterexample to Dhillon” In Social Choice and Welfare 48, 2017, pp. 837–843
- [8] T. Borgers and Y. Choo “Revealed Relative Utilitarianism” Working Paper, 2017
- [9] P. Cook and D. Graham “The Demand for Insurance and Protection: The Case of Irreplaceable Commodities” In Quarterly Journal of Economics 91, 1977, pp. 143–156
- [10] J. Drèze “Decision Theory with Moral Hazard and State-dependent Preferences” In Essays on Economic Decisions Under Uncertainty, Cambridge University Press, 1987
- [11] J. Drèze and A. Rustichini “Moral Hazard and Conditional Preferences” In Journal of Mathematical Economics 31, 1999, pp. 159–181
- [12] A. Dhillon “Extended Pareto Rules and Relative Utilitarianism” In Social Choice and Welfare 15, 1998, pp. 521–542
- [13] A. Dhillon and J. Mertens “Relative Utilitarianism” In Econometrica 67, 1999, pp. 471–498
- [14] F. Echenique and K. Saito “General Luce Model” In Theoretical Economics 13, 2018, pp. 61–85
- [15] P. Fishburn “A Mixture-Set Axiomatization of Conditional Subjective Expected Utility” In Econometrica 41, 1973, pp. 1–25
- [16] P. Ghirardato “Revisiting Savage in a Conditional World” In Economic Theory 20, 2002, pp. 83–92
- [17] I. Gilboa and D. Schmeidler “Case-Based Decision Theory” In Quarterly Journal of Economics 110, 1995, pp. 605–639
- [18] I. Gilboa and D. Schmeidler “Inductive Inference: An Axiomatic Approach” In Econometrica 71, 2003, pp. 1–26
- [19] I. Gilboa and D. Schmeidler “Case-Based Predictions: An Axiomatic Approach to Prediction, Classification and Statistical Learning” World Scientific Publishing Co, Singapore., 2012
- [20] J. Harsanyi “Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparisons of Utility” In Journal of Political Economy 63, 1955, pp. 309–321
- [21] A. Hylland “Aggregation Procedure for Cardinal Preferences: A Comment” In Econometrica 48, 1980, pp. 539–542
- [22] E. Kalai and N. Megiddo “Path Independent Choices” In Econometrica 48, 1980, pp. 781–784
- [23] E. Kalai and N. Schmeidler “Aggregation Procedure for Cardinal Preferences: A Formulation and Proof of Samuelson’s Impossibility Conjecture” In Econometrica 45, 1977, pp. 1431–1438
- [24] E. Karni “Decision making under uncertainty: the case of state-dependent preferences” Cambridge: Harvard University Press, 1985
- [25] E. Karni “Subjective Expected Utility Theory Without States of the World” In Journal of Mathematical Economics 42, 2006, pp. 325–342
- [26] E. Karni and D. Schmeidler “Foundations of Bayesian theory” In Journal of Economic Theory 132, 2007, pp. 167–188
- [27] E. Karni and D. Schmeidler “An Expected Utility Theory for State-Dependent Preferences” In Theory and Decision 81, 2016, pp. 467–478
- [28] E. Karni, D. Schmeidler and K. Vind “On State Dependent Preferences and Subjective Probabilities” In Econometrica 51, 1983, pp. 1021–1031
- [29] R. Luce “Individual Choice Behavior: A Theoretical Analysis” John Wiley & Sons, 1959
- [30] R. Luce and D. Krantz “Conditional Expected Utility” In Econometrica 39, 1971, pp. 253–271
- [31] M. Machina and R. Parks “On Path Independent Randomized Choice” In Econometrica 49, 1981, pp. 1345–1347
- [32] C. Plott “Path Independence, Rationality, and Social Choice” In Econometrica 41, 1973, pp. 1075–1091
- [33] F. Ramsey “Truth and Probability” In The Foundations of Mathematics and Other Logical Essays, Routledge & Kegan Paul Ltd, 1931
- [34] L. Savage “The Foundations of Statistics” John Wiley & Sons, 1954
- [35] L. Shapley and M. Shubik “Preferences and Utility” In Game Theory in the Social Sciences: Concepts and Solutions, Cambridge, MA: MIT Press, 1982
- [36] E. Shmaya and L. Yariv “Foundations for Bayesian Updating” Working Paper, Caltech, 2007
- [37] C. Skiadas “Conditioning and Aggregation of Preferences” In Econometrica 65, 1997, pp. 347–367
- [38] J. von Neumann and O. Morgenstern “Theory of Games and Economic Behavior” Princeton University Press, 1944
- [39] J. Weymark “A Reconsideration of the Harsanyi-Sen Debate on Utilitarianism” In Interpersonal Comparisons of Well-Being, Cambridge University Press, 1991, pp. 255–320
9. Appendix
9.1. Proof of Theorem 1
The main part of the proof follows the steps of [6], with some twists.
The following two lemmas contain the central ideas behind the proof. They help us first to define the function and then to extend it from binary sets to sets of any finite cardinality.
Lemma 1.
Let X be any nonempty set, and let denote the set of all nonempty finite subsets of X. Consider two functions that satisfy the strict weighted averaging axiom. Select four points in the space such that and . If and not all are on the same line, then .
Proof.
Since satisfies the strict weighted averaging axiom, and and , is on the line connecting . Also, since , should be on the line connecting and . But are not collinear, so the line connecting and and the line connecting and can intersect in at most a single point. But is on both lines; hence this point must be their unique intersection.
Similarly, the same is true for . This means must be the unique intersection of the line passing through and the line passing through . But since , must be the unique intersection of the line passing through and the line passing through . We have already shown that is also the unique intersection of the line passing through and the line passing through . Thus, .
∎
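The geometric heart of Lemma 1 is that two non-parallel lines pin down a unique point. The sketch below is purely illustrative: the four aggregate points and the helper `line_intersection` are hypothetical, not notation from the paper, but they show how the "unique intersection" step of the argument can be checked numerically.

```python
import numpy as np

def line_intersection(p1, p2, q1, q2):
    """Unique intersection of the line through p1, p2 and the line through
    q1, q2.  Assumes the two direction vectors are linearly independent,
    i.e. the 'not collinear' hypothesis of Lemma 1."""
    d1, d2 = p2 - p1, q2 - q1
    # Solve p1 + s*d1 = q1 + t*d2 for (s, t).
    A = np.column_stack([d1, -d2])
    s, t = np.linalg.solve(A, q1 - p1)
    return p1 + s * d1

# Hypothetical aggregates of two different pairings of sub-ensembles.
fA, fB = np.array([0.0, 0.0]), np.array([2.0, 2.0])
fC, fD = np.array([0.0, 2.0]), np.array([2.0, 0.0])
# Any rule satisfying the strict weighted averaging axiom must place the
# aggregate of the union on BOTH chords, hence at their unique intersection.
x = line_intersection(fA, fB, fC, fD)
print(x)  # [1. 1.]
```

When the two chords are parallel (the collinear case excluded by the lemma), `np.linalg.solve` raises `LinAlgError`, mirroring the failure of uniqueness.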
Lemma 2.
Assume that are three points in X such that are not collinear. Let satisfy the strict weighted averaging axiom and . Then must be independent of the choice of z, as long as are not collinear. Moreover, if , then .
Proof.
Since are not collinear, they should be affinely independent. Hence, are uniquely defined.
By the strict weighted averaging axiom, there exists such that . Again, by the strict weighted averaging axiom, there exists such that . Hence, . By the affine independence of , we should have and . Hence , so is independent of the choice of z, as long as are not collinear. ∎
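Lemma 2 rests on the fact that three non-collinear points in the plane are affinely independent, so the coefficients expressing any point as an affine combination of them are unique. A small numerical illustration (the points and the helper `affine_coords` are hypothetical, not the paper's notation):

```python
import numpy as np

def affine_coords(x, y, z, p):
    """Coefficients (a, b, c) with a + b + c = 1 and a*x + b*y + c*z = p.
    Unique exactly when x, y, z are not collinear (affinely independent)."""
    M = np.array([[x[0], y[0], z[0]],
                  [x[1], y[1], z[1]],
                  [1.0,  1.0,  1.0]])
    return np.linalg.solve(M, np.array([p[0], p[1], 1.0]))

x, y, z = np.array([0., 0.]), np.array([1., 0.]), np.array([0., 1.])
p = 0.5 * x + 0.3 * y + 0.2 * z
print(affine_coords(x, y, z, p))  # [0.5 0.3 0.2]
```

If the three points were collinear, the matrix `M` would be singular and the coefficients would not be pinned down, which is exactly why the lemma excludes that case.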
9.1.1. Proving the necessity and the uniqueness part
Assume that the weight function exists. Therefore, . This shows that if , then . By defining , we have . Thus, the strict weighted averaging axiom is satisfied.
Regarding the uniqueness of , assume that there exist two such that . Since the range of is not a subset of a line, there exist at least three elements such that are not collinear. Thus, they are affinely independent. Hence, has a unique solution . Hence, there should be an such that . We will show that for every other point .
Select a point . Based on the assumption on , there should be at least two points in such that are not collinear. Without loss of generality, assume that . Since are affinely independent, where are unique. Therefore, there exists such that . But notice that . Hence, we should have , which is what we wanted to prove.
9.1.2. Proving the sufficiency part
First, in order to define the function , fix an element and put . Based on the strict weighted averaging axiom, for any such that , there is a unique such that . We define .
To define the weight for any other with , we fix another point such that . Since , we should have . By using the strict weighted averaging axiom, we know that there exists a unique such that . Since the weight on has already been defined, we define the weight of such that . Thus, .
In the rest of this section, we are going to prove that satisfies the representation of the theorem. It means that by defining , we should have .
First, in Step 1, we prove that the representation holds for any three points, as long as the three points under are not collinear. In Step 2, we prove that the representation holds for any two points. In Step 3 (which is not strictly necessary; we include it because it conveys the main ideas in a simple setting), we prove that the representation holds for three points. Finally, in Step 4, by induction on the cardinality of subsets of , we show that the representation holds for any subset of .
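The weight construction described above can be sketched numerically. In the toy example below, the aggregation rule `f` is hypothetical: it averages a set with hidden weights `true_w`, and the construction recovers those weights (up to the normalization at the base point) from pairwise aggregates alone, exactly as in the definition of the weight function.

```python
import numpy as np

# Hypothetical ground truth: hidden weights and locations of three models.
true_w = {'x0': 1.0, 'y': 3.0, 'z': 0.5}
pts = {'x0': np.array([0., 0.]), 'y': np.array([1., 0.]), 'z': np.array([0., 1.])}

def f(S):
    """A toy aggregation rule satisfying the strict weighted averaging axiom."""
    tot = sum(true_w[i] for i in S)
    return sum(true_w[i] * pts[i] for i in S) / tot

def recovered_weight(i):
    """Set w(x0) := 1.  For f({i}) != f({x0}), find the unique lam with
    f({x0, i}) = lam*f({x0}) + (1-lam)*f({i}) and set w(i) := (1-lam)/lam."""
    if i == 'x0':
        return 1.0
    g = f({'x0', i})
    a, b = f({'x0'}), f({i})
    k = np.argmax(np.abs(a - b))          # a coordinate where the two differ
    lam = (g[k] - b[k]) / (a[k] - b[k])
    return (1.0 - lam) / lam

for i in pts:
    print(i, recovered_weight(i))
```

The recovered weights match `true_w` because, on a two-element set, the mixing coefficient `lam` equals the relative weight of the base point, so the ratio `(1-lam)/lam` isolates the weight of the other point.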
Step 1: for any three points such that are not collinear, we have , where are unique. Note that it is enough to prove that , since in the same way we can also get . There are two cases:
Case 1: If are such that are not collinear, then . Based on Lemma 2, we know that . Again, using Lemma 2 and the way we defined , we know that , which means that . Hence, we have .
Case 2: If are such that are collinear, then both and are not collinear.
By the same technique as in the first case, we get and . Hence, , which is what we wanted to prove.
Step 2: Assume that . We want to show that . If , then it is true. If , then by the richness condition, there exists an element such that are not collinear. Based on Step 1, we know that ; we also have and . Notice that, based on the strict weighted averaging axiom, is on the line connecting and . Also, it is on the line connecting and : by the strict weighted averaging axiom, there exists such that , which means that is on the line connecting and . Similarly, everything holds for , which means that is on the line connecting and and also on the line connecting and . Since are not collinear, the two lines can intersect in at most one point, and since , , , and , by an argument similar to Lemma 1 there is a unique intersection, which satisfies . This is what we wanted to prove.
Step 3: (This part contains the key trick; we provide it to capture the main ideas, and we will use the same technique in Step 4.) We are going to prove that for any three points we have . There are two separate cases to consider.
Case 1: If are not collinear, then by Step 1, it is correct.
Case 2: Assume that are collinear. If all of them are the same, then by the strict weighted averaging axiom, . Hence, assume that they are not all the same.
Without loss of generality, assume that . Based on the richness condition of , there is a point such that not all , and are collinear. Note that are not collinear. Similarly, are not collinear. Based on Case 1, we know that and . Also, we know that , , , and . Using the strict weighted averaging axiom, we know that is on the intersection of the line passing through and and the line passing through and . Also, note that not all of , , , and are collinear, since otherwise would have to be on the line connecting and . Similarly, we have the same properties for . Based on the argument of Lemma 1, we have .
By using the strict weighted averaging axiom, we know that is on the line passing through and , since there exists such that . Again, by the strict weighted averaging axiom, we know that is on the line passing through and . Also, the same holds for . Moreover, we have , , , and . Also, not all , , , and are on the same line, since otherwise , , , and would be collinear, which is not the case. As a result, based on the argument of Lemma 1, we have . The latter is what we wanted to prove.
Step 4 (the main step): Up to here, we have proved that for any , if then . To complete the proof, we use induction on the cardinality of . Assume that for all with we have . We are going to show that for all with , we have .
Fix a subset with . Assume that . There are two separate cases to be considered.
Case 1: Assume that not all are collinear. Note that, by the induction hypothesis, we have and . Define as the line passing through and for the case where ; if , define it as the single point .
If there exists such that , then, based on the strict weighted averaging axiom, there exists such that . Similarly, . But we know that , which means that .
If , then there exist such that are not collinear. Otherwise, all would be on the , which contradicts the assumption that not all are collinear. Considering such that are not collinear, based on the strict weighted averaging axiom we know that is on . Also, it must be on . Similarly, by the strict weighted averaging axiom applied to , we know that is on . Moreover, it must be on the .
Since (1) , , , and (2) not all , and are collinear, based on Lemma 1, we have .
Hence, in the case where not all are collinear, we have shown that .
Case 2: Assume that are collinear. Without loss of generality, assume that are the two extreme points on the line that contains them, meaning that all other points lie between these two.
If , then all are the same. Using the strict weighted averaging axiom, it follows that .
If , based on the richness condition of the aggregation rule , we can select a point such that not all , and are collinear. Based on Case 1, we know that , since we have proved that and coincide for any non-collinear points. Similarly, we have .
Using the strict weighted averaging axiom, is on the . It is also on the . Also, not all , , , are collinear, since cannot be on the ; otherwise would have to be on that line, which is not the case. Similarly, everything holds for the .
Since , , , and , again by using Lemma 1 we get .
The point is on the . It is also on the , since by the strict weighted averaging axiom for some . Similarly, the same holds for . Finally, since (1) , , and (2) and , and are not collinear, using the same type of argument as in Lemma 1, we get . This is what we wanted to prove.
Hence, for all with cardinality , we have . By induction, for all . This completes the proof.
∎
9.2. Proof of Theorem 2
There are a couple of steps in the proof. Defining the weak order:
Step 1: First, we define a binary relation over every two different elements by:
Case 1: If , we define .
Case 2: If , then by the strong richness condition, we select another point , such that . Hence, we have . In this case, we define .
To obtain reflexivity, for any , we define .
Step 2: We prove that is a weak order. The reflexivity and the completeness are trivial. We only need to establish the transitivity. Assume that . We will show that .
The proof is by contradiction. Therefore, assume that .
Case 1: Assume that are non-collinear. Since , based on the way we defined , we have .
Consider the coalition . By using the weighted averaging axiom over the sub-coalitions and , the vector should be on the line joining and (which is the same as ). Similarly, by considering the sub-coalitions and , should be on the line passing through and . Since and are non-collinear, we have . However, by considering the sub-coalitions and the fact that , this cannot happen. Therefore, .
Case 2: Assume that are collinear. By using the strong richness condition, we can select a point such that is not on the line passing through , and also (this means that ). First, using Case 1, by considering the coalitions , , we have and . Since are non-collinear, by using Case 1, we have . But this is a contradiction. Therefore, .
The main part: proving .
Up to here, we have shown that is a weak order. Next, we show that for any coalition we have .
We use the letter for the highest-ordered elements of , and for the rest. In other words, . The proof is by a double-induction on the cardinality of and . In Step 1, we will show that if and are such that , then we should have .
In Step 2, we show that for a given coalition , where all elements of are in the same equivalence class, and for all , if for all , then we have . Using these two steps, we will finish the proof.
Step 1: Fix an element . By induction on the cardinality of , where , we prove that .
We have already proved the case where . Assume that for all the result is correct. We will show that for all with , the result is also correct.
Fix a coalition with and such that . Assume that .
If for all , then using the weighted averaging axiom , which is what we wanted to prove. Similarly, if , we have .
Therefore, consider the case that not all of them are the same and .
Using our definition of the in the proof of Theorem 1, for each we consider . Using the weighted averaging axiom, for all . By using the induction hypothesis . Therefore, .
Similar to the proof of Theorem 1, we consider two separate cases.
Case 1: Consider the case where there exist two elements such that are non-collinear. We use the same technique as in the proof of Theorem 1.
We know that as well as . Moreover, we know that are non-collinear. Therefore, the intersection of the two lines should be . This shows that .
Case 2: Assume all the vectors are collinear. In this case, the idea is to add a point such that and is not on the line containing all . This is possible because of the strong richness condition. By using the transitivity of the , .
Fix a point such that . This is possible since we already assumed that not all , with , are the same as .
Consider the coalition . By using the weighted averaging axiom and the sub-coalitions , we have . Using the induction hypothesis, and . Therefore, .
Next, we show that . Since and , we have . Moreover, based on the way we selected the point , is not on the line containing and . Consider the partition of into and . Based on the choice of the , at the beginning of Step 1, . Since , , and are non-collinear, we have .
Finally, by partitioning into and , the weighted averaging axiom results in . Therefore, is on the line joining and . However, we have already shown that is on the line passing through and . Thus, should be on the line joining and .
However, the only intersection of and the line containing all the points is the point . Thus, , which completes the proof.
Step 2: In this step, by using induction on the cardinality of the set , in which all elements have the same order, we show that for any coalition if all elements of the set have lower orderings compared to the elements of , then we should have .
Fix a set . Based on Step 1, we know that for any such that , we should have . This is the starting point of our induction. Assume that for all , we have . We will show that for any , we have .
For any , by the weighted averaging axiom over the sub-coalitions and , we have . Based on step 1, we know that . Therefore, . Similarly, by the weighted averaging axiom over the coalition and its sub-coalitions , we should have .
Consider two cases:
Case 1: Consider the case in which not all members of are collinear. Hence, there should be at least two elements such that and are not collinear. Therefore, the line joining and the line joining can intersect in at most one point. Since is on both lines, the unique intersection should be . But is also on both lines. Hence, we should have , which completes this case.
Case 2: Consider the case where all members of the set are on a line. By using the strong richness condition, there exists an element such that is not on that line. We consider the coalition . By using the weighted averaging axiom over the sub-coalitions and , we should have on the line joining and . By the induction hypothesis, . Hence, we should have .
Similarly, by partitioning the set into and , we have .
Select an element . By partitioning the coalition between the sub-coalitions and , using the weighted averaging axiom, we obtain .
Finally, using (1) , (2) , and (3) , we have . But the last line can intersect the line containing all the elements of in at most one point. Therefore, , which completes the proof.
Completing the proof:
Consider a coalition where all elements have the same order. We consider any two disjoint sub-coalitions , where . Using the same technique as in the previous part, we have .
By using the result of Theorem 1, we can get the appropriate representation in each equivalence class. Also by using the result of the previous part, . The combination of these two results completes the proof. ∎
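The structure delivered by Theorem 2 is exactly rule (3) of the abstract: a weak order over experts plus weights, with a coalition aggregating to the weighted average of its highest-ranked members. The sketch below is a hypothetical instance (the ranks, weights, and points are invented for illustration) showing why lower-ranked members never move the output.

```python
import numpy as np

# Hypothetical ranks (higher is better), weights, and outputs of three experts.
r = {'a': 2, 'b': 2, 'c': 1}
w = {'a': 1.0, 'b': 2.0, 'c': 5.0}
x = {'a': np.array([0., 0.]), 'b': np.array([3., 0.]), 'c': np.array([0., 3.])}

def f(S):
    """Rule (3): weighted average of the highest-ranked members of S only."""
    top = max(r[i] for i in S)
    best = [i for i in S if r[i] == top]
    tot = sum(w[i] for i in best)
    return sum(w[i] * x[i] for i in best) / tot

# f({a,b,c}) equals f({a,b}): expert c is outranked, so it receives zero
# effective weight -- the (non-strict) weighted averaging axiom with a
# degenerate mixing coefficient.
print(f({'a', 'b'}))        # [2. 0.]
print(f({'a', 'b', 'c'}))   # [2. 0.]
```

Note that this rule satisfies the weighted averaging axiom only in its non-strict form: when one sub-coalition strictly outranks the other, the mixture puts weight one on the higher-ranked aggregate.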
9.3. Proof of Theorem 3
The following two lemmas help us prove the theorem.
Lemma 3.
Given any two linearly independent vectors in , there exists a neighborhood of such that any vector in that neighborhood is linearly independent of . More generally, given any vectors such that is not in the linear space generated by the rest of the points, there exists a neighborhood of such that any point in that neighborhood is not in .
Proof.
Since is a closed set that is disjoint from the vector , the distance between and should be nonzero. Hence, there exists a neighborhood of (for example, the ball of radius around ) disjoint from . As a result, any point in that neighborhood is not in . ∎
Lemma 4.
Let be two linearly independent vectors, and let , for some , be a vector between them. If the vectors are such that , , and , then .
Proof.
We prove it by contradiction. If it is not the case, there exists a subsequence of and some such that . By compactness of , there exists a subsequence of that converges to some . Since , we have . By the assumption of the lemma, since the sequence converges to , the subsequence also converges to . Similarly, converges to . Hence, and . As a result, . However, since are linearly independent, and should be the same, which is a contradiction. The contradiction shows that . ∎
Using the lemmas mentioned above, we will complete the proof. Based on Theorem 2, there exist a unique weak order and a weight function such that for any
Let be any given point. We need to prove that the weight function is continuous around and that any point close enough to has the same order as with respect to the weak order .
To complete the proof, assume that and . We are going to prove that:
1) ,
2) such that for all .
Proving these two completes the proof.
Based on the strong richness condition, there should be a point such that (1) are linearly independent, and (2) , which means that . The reason is that by the strong richness condition, there should be at least two other points with the same order as , such that not all of and are collinear. This means that and at least one of or should be linearly independent. Without loss of generality, we assume that and are linearly independent.
Given any two points , we define the function as follows:
Consider the sequence of vectors . By Theorem 2, we have . Based on continuity of the aggregation rule , and . Since and are linearly independent, all conditions of Lemma 4 are satisfied. Hence, we have and . Since both and are strictly positive, we should have and similarly . This means that for large , . Since , for large we have . This completes part 2 of the proof.
For part 1, since we have already proved that for large , , the convergence becomes . This means that , which proves that is continuous at .
Proving parts 1 and 2 completes the proof.
∎
9.4. Proof of Proposition 1
There are a couple of steps to prove the result.
Step 1: Assume that all signals arrive at time . By using Corollary 2, there exists a unique (up to multiplication) weight function , such that for all , . By using the uniqueness of and the stationarity axiom, for any constant time shift and for all we have:
Consider two signals , where . Let the timing of be . Using the strict weighted averaging axiom, there exists a where . We define such that .
In the rest of the proof, we show that these choices of attain the representation of Proposition 1.
Step 2: We show that for any signal , the representation holds for the coalition and for the timing function .
Case 1: Consider any signal such that are not collinear. We form the coalition with the timing , . Using the strict weighted averaging axiom, considering the sub-coalitions and and the fact that and have the same timing, Lemma 2 in the proof of Theorem 1 shows that the representation holds for the coalition with the timing .
Case 2: Consider any signal , such that are collinear. By the richness condition, there exists a signal such that are not collinear. We consider the timing , . The representation holds for the sub-coalitions (by Case 1) and (since all have the same timing). Thus, by applying Lemma 2 first on and then on , we can show that the representation holds for the coalition with the timing .
Step 3: We show that the representation holds for any two signals with the timing function .
Case 1: If are non-collinear, then we consider the timing function . By applying Lemma 2 twice, on and with their corresponding timings, we can show that the representation holds for with the timing function .
Case 2: If are collinear, then by the richness condition, there exists a signal such that are not collinear. Consider the timing function . Applying Lemma 2 to the sub-coalition and its corresponding timing shows that the representation holds for and their timing . Then, by considering the coalition and their corresponding timing, Lemma 2 shows that the representation holds for and the timing function .
Step 4: In this step, we show that given any , the representation holds for any two signals and the timing function . The proof is by induction on . By Step 3, the representation holds for . Assume that the representation holds for all with . We will show that it also holds for .
Case 1: If , then we consider a signal such that are not collinear. Let the timing function be .
Consider any such that . By Lemma 2, the induction hypothesis, and the stationarity axiom, we have . Thus, the representation holds.
Case 2: If , then we consider two signals such that are not collinear (which is possible by the richness condition). Let the timing function be . By the uniqueness part of Theorem 1 and the induction hypothesis, the representation still holds in this case.
Step 5: Finally, for any coalition and any timing function , the uniqueness part of Theorem 1 and Step 4 establish that the representation holds with .
9.5. Proof of Theorem 5 and Corollary 7
Assume that the aggregation rule satisfies the minimal agreement condition and is the direction on which all agents agree.
Consider two disjoint coalitions with the corresponding cardinal utilities and . Assume that is a cardinal utility that represents the preference ordering of the union . If , then the result is trivial. Hence, consider the case .
First, using Farkas' Lemma, we show that the extended Pareto axiom is equivalent to (where denotes the interior of the cone generated by and ).
If , then there exist such that . Therefore, for any , if and , then . Similarly, if and , then . This proves that the preference ordering of satisfies the extended Pareto axiom.
For the converse, if the utility of the union , then such that . Farkas' Lemma guarantees that there exists a vector such that and .
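For reference, the alternative form of Farkas' Lemma used in this argument can be stated as follows (the vectors $a_1, \dots, a_m$, $b$, and $y$ are generic, not the paper's notation): for $a_1, \dots, a_m, b \in \mathbb{R}^n$, exactly one of the following holds:

```latex
\text{(i)}\quad \exists\, x \in \mathbb{R}^m,\; x \ge 0:\quad b = \sum_{j=1}^{m} x_j a_j;
\qquad
\text{(ii)}\quad \exists\, y \in \mathbb{R}^n:\quad y^\top a_j \ge 0 \;\; \forall j,
\;\text{ and }\; y^\top b < 0.
```

Alternative (ii) supplies the separating vector used in the contradiction below.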
Consider a vector that is in the interior of . We select such that . This is possible since we assume that is in the interior of .
By defining , we get . Since and , we have , and . But since , we have and .
However, this contradicts the extended Pareto axiom. Therefore, .
Now consider the intersection of and . Since and , there is a unique in both . It is immediate that .
Since both and , the intersection of the interior of the cone generated by them with the linear variety is the segment . Since , we have . Hence, there is a representing . Therefore, . This completes the proof.
∎
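The cone argument above can be illustrated numerically (a minimal sketch; the function name and the vectors are invented for this example, not taken from the paper): for two non-collinear generators, membership of a utility vector in the interior of the generated cone amounts to the combination coefficients being strictly positive.

```python
import numpy as np

def in_open_cone(u, u_A, u_B):
    """Check whether u lies in the interior of the cone generated by the
    non-collinear vectors u_A and u_B, i.e. u = a*u_A + b*u_B with a, b > 0."""
    G = np.column_stack([u_A, u_B])                      # n x 2 generator matrix
    coef, _, _, _ = np.linalg.lstsq(G, u, rcond=None)    # least-squares coefficients
    if not np.allclose(G @ coef, u):                     # u must lie in span(u_A, u_B)
        return False
    return bool(np.all(coef > 1e-12))                    # both coefficients strictly positive

# Illustrative generators (non-collinear):
u_A = np.array([1.0, 0.0, 1.0])
u_B = np.array([0.0, 1.0, 1.0])

print(in_open_cone(u_A + 2 * u_B, u_A, u_B))  # strict positive combination -> True
print(in_open_cone(u_A - u_B, u_A, u_B))      # negative coefficient -> False
```

Here the strictly positive coefficients play the role of the (unnormalized) weights of the two coalitions in the aggregated utility.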
9.6. Proof of Theorem 6
The proof proceeds in several steps. Note that for any profile and any coalition , denotes the restriction of the profile to the coalition .
Step 1: Fix a preference . Using Corollary 8, for any profile such that , we can uniquely define a weight function (which depends on the full profile ) with , such that for any coalition we have:
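In generic notation (the weight function $\lambda$ and the cardinal utilities $u_i$ below are illustrative placeholders, not the paper's symbols), the weighted-average representation referred to throughout this proof has the form

```latex
u_C \;\sim\; \sum_{i \in C} \lambda(i)\, u_i ,
```

i.e., the cardinal utility of a coalition $C$ is, up to positive scaling, the $\lambda$-weighted sum of its members' utilities.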
First, we show that for any individual and for any two profiles with and , we have . There are two separate cases:
Case 1: If , then using and the result of Corollary 8, we have .
Case 2: If , then by the definition of the domain , which requires the existence of three non-collinear preferences in each profile, there exist a profile and two individuals such that , , and . Using Case 1, we have and . Since and , Corollary 8 gives . Similarly, we have and . Therefore, . Hence, .
By considering profiles of the form with , we can define the weight function such that for all . By the preceding argument, this function is well defined. Moreover, we define .
At this point, for any preference profile with and for any coalition , we have:
Step 2: We need to define for all . We have already fixed the value . For any , let be a profile with and . By Corollary 8, there is a unique function with . We define .
Notice that for any two profiles with and , if we normalize the value of , then we have . Hence, the value is independent of the choice of the profile .
At this point, the function is fully defined. It remains to show that it yields the desired representation.
Step 3: Select any profile . We need to show that the representation holds with the weight function defined above.
If , then by Step 1 the representation holds. Hence, fix any with . In the rest of the proof, we show that the representation holds for any with .
As in Step 1, using Corollary 8, for any profile such that , we can uniquely define a weight function (depending on the full profile ) with , such that for any coalition we have:
In the same manner as Step 1, for any two profiles with and for every individual , we have . Hence, by considering profiles of the form with , we can define the weight function such that for all . By the argument of Step 1, this function is well defined. Moreover, we fix .
For every preference profile with and for every coalition , we have:
To complete the proof, since we have , it remains to show that for all and for all we have .
Case 1: Since and , Step 2 gives .
Case 2: Assume that and . Since , by the definition of , there exist such that and .
Since , , and, by Case 1, , we have .
Case 3: Assume that and . Since , we can select an individual . By the definition of , there exist such that and .
Since , , and, by Case 2, , we have .
Case 4: Finally, assume that and . Select an individual . We consider profiles such that and . By Case 3, we have . Hence, since and , we have .
The last observation completes the proof. ∎