
Dynamic Random Choice

Ricky Li [email protected]. I thank Victor Aguiar and Christopher Turansick for helpful comments. I especially thank Tomasz Strzalecki for his insightful comments, as well as his invaluable guidance throughout my research career.
(This version: June 20, 2022)
Abstract

I study dynamic random utility with finite choice sets and exogenous total menu variation, which I refer to as stochastic utility (SU). First, I characterize SU when each choice set has three elements. Next, I prove several mathematical identities for joint, marginal, and conditional Block–Marschak sums, which I use to obtain two characterizations of SU when each choice set but the last has three elements. As a corollary under the same cardinality restrictions, I sharpen an axiom to obtain a characterization of SU with full support over preference tuples. I conclude by characterizing SU without cardinality restrictions. All of my results hold over an arbitrary finite discrete time horizon.

1 Introduction and Related Literature

A classic result in decision theory is Sen (1971)'s characterization of deterministic choice functions that can be represented by strict preference relations. However, economic choice data is often nondeterministic. In such cases, the analogous primitive and representation are a stochastic choice function (SCF) and a random utility (RU) model. Block et al. (1959) define RU on arbitrary finite choice sets and show that their axiom requiring that the SCF's Block–Marschak sums be nonnegative is necessary; Falmagne (1978) and Barberá and Pattanaik (1986) complete the characterization of RU by showing that this axiom is sufficient.

In addition to nondeterministic choices, economic agents also often make choices across time, leading to a richer primitive than that of static random environments: a dynamic stochastic choice function (DSCF). The corresponding economic model of interest is dynamic random utility (DRU), some variants of which have been examined in recent work. Frick et al. (2019) examine dynamic random choice over the domain of Kreps and Porteus (1978)'s decision trees, where choices in each period are lotteries over consumption-continuation menu pairs. In this choice environment, choices are risky and future menu variation can be endogenously determined. Furthermore, since present choices constrain the set of possible future menus, the DSCF exhibits what Frick et al. (2019) deem limited observability. Frick et al. (2019) characterize a model of dynamic random expected utility over arbitrary finite time periods, as well as sharper Bayesian variants with nonmyopic agents. Kashaev and Aguiar (2022) axiomatize a model of DRU over a domain of consumption vectors in $\mathbb{R}_{+}^{K}$. In their setup, each period's menus are derived from exogenous budget planes, following the static framework of Kitamura and Stoye (2018). Hence, Kashaev and Aguiar (2022)'s axioms exploit partial exogenous menu variation by assuming that only a finite subset of the set of menus that can be derived from arbitrary subsets of $\mathbb{R}_{+}^{K}$ is observable. Unlike the aforementioned choice environments, I study a model of DRU with riskless finite choice sets and full exogenous menu variation. To distinguish this setting from the others, I will henceforth refer to this paper's variant of DRU as stochastic utility (SU), as originally named and defined in Strzalecki (2021).

The recent work most similar to mine is Chambers et al. (2021), who study a model of correlated random utility (CRU) with riskless finite choice sets and full exogenous menu variation. Their primitive is a (two-agent) correlated choice rule, whereas my primitive is a (multi-period) DSCF. In this paper, I show that under arbitrary extrapolation following zero-probability choices, each DSCF induces a stream of multi-agent correlated choice rules, each of which is marginally consistent in the last period. I also show that there is an equivalence between SU representations of a DSCF and CRU representations of the induced full-period correlated choice rule. One interpretation that bridges the gap between our models is to imagine the agent’s current and future selves in my setup as different agents in Chambers et al. (2021)’s setup.

Using a graph-theoretic approach, Chambers et al. (2021) characterize CRU with two agents when at least one choice set has three or fewer elements. I provide two characterizations of SU for arbitrary finite time periods where every choice set but the last has three elements. One of those characterizations leverages axioms whose two-agent analogs play a similar role in the aforementioned result of Chambers et al. (2021), but my approach uses a different proof strategy that does not involve graphs. I discuss the relationship between my axioms and theirs in greater detail later in this paper. Chambers et al. (2021) also find a counterexample to CRU under their axioms for larger choice environments, state an additional axiom which disciplines the correlated choice rule's capacity, and characterize CRU without cardinality restrictions using an analog of McFadden and Richter (1990)'s Axiom of Revealed Stochastic Preference; each of these results straightforwardly yields an analogous result about SU over two periods. I characterize SU without cardinality restrictions over arbitrary finite time periods using an analog of Clark (1996)'s Coherency axiom. Finally, Chambers et al. (2021) also study correlated random choice over lotteries over finite prize sets and characterize a model of correlated random expected utility.

The rest of the paper proceeds as follows. Section 2 defines the choice environment, primitive, and model. Section 3 states the axioms and main results. Section 4 contains some useful auxiliary results, including the joint, marginal, and conditional Block–Marschak identities, and all proofs.

2 Stochastic Utility

2.1 Primitive

There are $n$ time periods, indexed by $t=1,\ldots,n$. For each $t$, let $X_t$ be a finite choice set, $\mathcal{M}_t$ be the set of nonempty subsets of $X_t$, and $\Delta(X_t)$ be the set of probability distributions over $X_t$. Given a finite choice set $X$ and the set of its nonempty subsets $\mathcal{M}$, say that $\rho:\mathcal{M}\rightarrow\Delta(X)$ is a (static) stochastic choice function (SCF) if $\rho(\cdot,A)\in\Delta(A)$ for each $A\in\mathcal{M}$.

Definition 2.1.

A dynamic stochastic choice function (DSCF) is a tuple $\rho:=(\rho_1,\ldots,\rho_n)$ such that $\rho_1:\mathcal{M}_1\rightarrow\Delta(X_1)$ is an SCF and, for each $1<t\leq n$, $\rho_t:\mathcal{H}_{t-1}\times\mathcal{M}_t\rightarrow\Delta(X_t)$ maps each $(t-1)$-history $h_{t-1}$ to the SCF $\rho_t(\cdot|h_{t-1})$, where the sets of histories are defined recursively from the base case $\mathcal{H}_1:=\{(A_1,x_1)\in\mathcal{M}_1\times X_1:\rho_1(x_1,A_1)>0\}$ by

$$\mathcal{H}_{t-1}:=\{(A_{t-1},x_{t-1};h_{t-2})\in\mathcal{M}_{t-1}\times X_{t-1}\times\mathcal{H}_{t-2}:\rho_{t-1}(x_{t-1},A_{t-1}|h_{t-2})>0\}$$

Note that the domain of $\rho_t(\cdot|h_{t-1})$ is the full set of period-$t$ menus $\mathcal{M}_t$, independently of which history $h_{t-1}$ is observed. In Frick et al. (2019)'s setup, the domain of $\rho_t(\cdot|h_{t-1})$ is itself a function of $h_{t-1}$ and need not be the entire set $\mathcal{M}_t$. These features of their DSCF illustrate endogenous menu variation and limited observability, respectively.

For each $1\leq t\leq n$, let the poset $(L_t,\leq_t):=(2^{X_t},\subseteq)$. Define their product poset to be $(L,\leq)$, where $L=\times_{t=1}^{n}L_t$ and $A\leq B$ if and only if $A_t\subseteq B_t$ for each $1\leq t\leq n$. Given a set of times $T\subseteq N:=\{1,\ldots,n\}$, let subscript $T$ denote a vector indexed by $t\in T$, and let $-t:=N\backslash\{t\}$. For vectors indexed by $N$, I omit the subscript entirely. Say that $A_T<<B_T$ if $A_t\subsetneq B_t$ for all $t\in T$, and say that $y_T\neq\neq x_T\in X_T$ if $y_t\neq x_t$ for all $t\in T$.

A choice path is a tuple $(A,x)^t:=(A_t,x_t;\ldots;A_1,x_1)$. Let $(A,x)$ denote an $n$-period choice path. A zero-probability choice path is a choice path $(A,x)^t$ such that $\rho_1(x_1,A_1)=0$ or $\rho_s(x_s,A_s|(A,x)^{s-1})=0$ for some $1<s\leq t$. For each $1<t\leq n$, each zero-probability choice path $(A,x)^{t-1}$, and each $A_t\in\mathcal{M}_t$, define $\rho_t(\cdot,A_t|(A,x)^{t-1})$ to be any probability distribution over $A_t$. I take this augmented DSCF $\rho$ as primitive. Note that for any $A_{\leq t}:=(A_1,\ldots,A_t)$, $\rho$ admits a well-defined joint distribution $p_t(\cdot|A_{\leq t})\in\Delta(A_{\leq t})$, defined as $p_t(x_{\leq t},A_{\leq t}):=\rho_1(x_1,A_1)\prod_{s=2}^{t}\rho_s(x_s,A_s|(A,x)^{s-1})$. Let $p:=p_n$. Note that for any $1\leq r<t\leq n$ and any $(A_1,\ldots,A_t)$, $p_r$ is the marginal distribution of $p_t$ on $\times_{s=1}^{r}A_s$:

$$\begin{aligned}
\sum_{x_{>r}\in A_{>r}}p_t(x_{\leq r},A_{\leq r};x_{>r},A_{>r})&=\rho_1(x_1,A_1)\prod_{s=2}^{r}\rho_s(x_s,A_s|(A,x)^{s-1})\\
&\quad\times\sum_{x_{r+1}\in A_{r+1}}\bigg[\rho_{r+1}(x_{r+1},A_{r+1}|(A,x)^{r})\bigg[\cdots\bigg[\sum_{x_t\in A_t}\rho_t(x_t,A_t|(A,x)^{t-1})\bigg]\cdots\bigg]\bigg]\\
&=\rho_1(x_1,A_1)\prod_{s=2}^{r}\rho_s(x_s,A_s|(A,x)^{s-1})=p_r(x_{\leq r},A_{\leq r})
\end{aligned}$$
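To make the construction of the joint distribution $p$ concrete, the following Python sketch builds $p_2$ from per-period choice kernels in a two-period example and numerically verifies that $p_1$ is its marginal. The menu labels, kernels, and helper names are hypothetical illustrations, not objects from the paper.

```python
# Hypothetical two-period example: X1 = {a, b}, X2 = {x, y}.
X1, X2 = ("a", "b"), ("x", "y")

# rho1[(x1, A1)] = probability of choosing x1 from menu A1 (menus as frozensets).
rho1 = {("a", frozenset(X1)): 0.6, ("b", frozenset(X1)): 0.4,
        ("a", frozenset({"a"})): 1.0, ("b", frozenset({"b"})): 1.0}

def rho2(x2, A2, history):
    """Period-2 choice kernel given the period-1 history (A1, x1)."""
    A1, x1 = history
    base = 0.7 if x1 == "a" else 0.3          # history-dependent tilt toward "x"
    probs = {"x": base, "y": 1.0 - base}
    return probs[x2] if x2 in A2 else 0.0

def p2(x1, A1, x2, A2):
    """Joint distribution p_2(x1, A1; x2, A2) = rho_1 * rho_2, as in the text."""
    return rho1[(x1, A1)] * rho2(x2, A2, (A1, x1))

# Check: summing p_2 over x2 in A2 recovers p_1 = rho_1, for every A2.
A1, A2 = frozenset(X1), frozenset(X2)
for x1 in A1:
    marginal = sum(p2(x1, A1, x2, A2) for x2 in A2)
    assert abs(marginal - rho1[(x1, A1)]) < 1e-12
print("p_1 is the marginal of p_2, as claimed.")
```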
Definition 2.2.

Given $A<<X$ and $x\in A^{C}$, their joint Block–Marschak (BM) sum is

$$m(x,A):=\sum_{B\geq A^{C}}(-1)^{\sum_{t=1}^{n}(|B_t|-|A_t^{C}|)}p(x,B)$$

Joint BM sums are the multi-period analog of Block et al. (1959)’s BM sums. As in the static case, many of my forthcoming axioms will impose discipline on joint BM sums. Unlike the static case, these axioms alone are insufficient for SU.
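As a computational illustration (not part of the paper), the sketch below evaluates a joint BM sum $m(x,A)$ by direct inclusion–exclusion over the menu tuples $B\geq A^C$, exactly as in Definition 2.2. The function and argument names are hypothetical; `p` can be any joint distribution with the signature used in the previous sketch.

```python
from itertools import combinations, product

def supersets(base, universe):
    """All subsets of `universe` (a tuple of alternatives) that contain `base`."""
    rest = [e for e in universe if e not in base]
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            yield frozenset(base) | frozenset(extra)

def joint_bm_sum(p, x, A, X):
    """m(x, A) = sum over B >= A^C of (-1)^{sum_t (|B_t| - |A_t^C|)} p(x, B).

    x, A, X are tuples indexed by period (menus as frozensets); p takes (x, B)
    with B a tuple of menus. Written directly from Definition 2.2.
    """
    complements = [frozenset(Xt) - frozenset(At) for At, Xt in zip(A, X)]
    total = 0.0
    for B in product(*(list(supersets(c, Xt)) for c, Xt in zip(complements, X))):
        sign = (-1) ** sum(len(Bt) - len(c) for Bt, c in zip(B, complements))
        total += sign * p(x, B)
    return total
```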

2.2 Model

For each $1\leq t\leq n$, let $P_t$ be the set of strict preference relations on $X_t$ and let $P_T:=\times_{t\in T}P_t$. Given $x_t\notin A_t\in\mathcal{M}_t$, say that $x_t\succ_t A_t$ if $x_t\succ_t y_t$ for all $y_t\in A_t$. Given $x_t\in A_t\in\mathcal{M}_t$, let $C_t(x_t,A_t):=\{\succ_t\in P_t:x_t\succ_t A_t\backslash\{x_t\}\}$, and let $C(x_t,A_t):=C_t(x_t,A_t)\times P_{-t}$. Given history $h_t:=(A,x)^t$, let $C(h_t):=\bigcap_{s=1}^{t}C(x_s,A_s)$. Given $x_T\in A_T$, let $C(x,A)^T:=\bigcap_{t\in T}C(x_t,A_t)=\times_{t\in T}C_t(x_t,A_t)\times P_{-T}$. Given $T'\subseteq T\subseteq N$, let $C_T(x_{T'},A_{T'}):=\times_{t\in T'}C_t(x_t,A_t)\times P_{T\backslash T'}$.

Definition 2.3.

A stochastic utility (SU) representation of $\rho$ is a probability measure $\mu\in\Delta(P)$ such that $\rho_1(x_1,A_1)=\mu(C(x_1,A_1))$ for all $x_1\in A_1\in\mathcal{M}_1$ and

$$\rho_t(x_t,A_t|h_{t-1})=\mu(C(x_t,A_t)|C(h_{t-1}))$$

for all $1<t\leq n$, $h_{t-1}\in\mathcal{H}_{t-1}$, and $x_t\in A_t\in\mathcal{M}_t$.

Given $A_t\subsetneq X_t$ and $x_t\in A_t^{C}$, define their joint upper edge set to be $E_t(x_t,A_t):=\{\succ_t\in P_t:A_t\succ_t x_t\succ_t A_t^{C}\backslash\{x_t\}\}$. In words, this is the set of period-$t$ preferences that rank $x_t$ on the uppermost edge of $A_t^{C}$ and below the lowermost edge of $A_t$. Given $A_T<<X_T$ and $x_T\in A_T^{C}$, define $E(x_t,A_t):=E_t(x_t,A_t)\times P_{-t}$ and $E(x,A)^T:=\bigcap_{t\in T}E(x_t,A_t)$. Given $T'\subseteq T\subseteq N$, let $E_T(x_{T'},A_{T'}):=\times_{t\in T'}E_t(x_t,A_t)\times P_{T\backslash T'}$.

3 Results

The first result characterizes SU representations as probability measures that assign every joint upper edge set its corresponding joint Block–Marschak sum. It also establishes an equivalence between SU representations of $\rho$ and CRU representations of the induced correlated choice rule $p$, as defined in Chambers et al. (2021).

Proposition 3.1.

The following are equivalent.

1. $\mu$ is an SU representation of $\rho$.

2. $\mu(C(x,A))=p(x,A)$ for all $x\in A\in\mathcal{M}$.

3. $\mu(E(x,A))=m(x,A)$ for all $A<<X$ and $x\in A^{C}$.

The next result uses the following axiom to characterize SU when all choice sets have three elements. I abuse notation to let $\{x\}^{C}=(\{x_t\}^{C})_{t=1}^{n}$.

Axiom 3.2 (Joint Supermodularity).

For all $1\leq t\leq n$ and $y_t\neq x_t\in X_t$,

$$\sum_{B\geq\{x\}^{C}:\ \sum_{t=1}^{n}|B_t|\ \text{even}}p(y,B)\geq\sum_{B\geq\{x\}^{C}:\ \sum_{t=1}^{n}|B_t|\ \text{odd}}p(y,B)$$
Example 3.3.

Let $n=2$, $X_1=\{a,b,c\}$, and $X_2=\{x,y,z\}$. Joint Supermodularity requires that for any $b\neq a\in X_1$ and $y\neq x\in X_2$,

$$p(b,\{b,c\};y,\{y,z\})+p(b,X_1;y,X_2)\geq p(b,\{b,c\};y,X_2)+p(b,X_1;y,\{y,z\})$$
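For intuition, a small numerical check of the Example 3.3 inequality is sketched below. The joint choice probabilities are made-up numbers keyed by (choice, menu) pairs for each period; they are illustrative only and not data from the paper.

```python
# Hypothetical joint choice probabilities p[(x1, A1, x2, A2)] for the four menu
# pairs appearing in Example 3.3 (menus encoded as frozensets).
p = {
    ("b", frozenset({"b", "c"}), "y", frozenset({"y", "z"})): 0.30,
    ("b", frozenset({"a", "b", "c"}), "y", frozenset({"x", "y", "z"})): 0.20,
    ("b", frozenset({"b", "c"}), "y", frozenset({"x", "y", "z"})): 0.25,
    ("b", frozenset({"a", "b", "c"}), "y", frozenset({"y", "z"})): 0.22,
}

lhs = p[("b", frozenset({"b", "c"}), "y", frozenset({"y", "z"}))] \
    + p[("b", frozenset({"a", "b", "c"}), "y", frozenset({"x", "y", "z"}))]
rhs = p[("b", frozenset({"b", "c"}), "y", frozenset({"x", "y", "z"}))] \
    + p[("b", frozenset({"a", "b", "c"}), "y", frozenset({"y", "z"}))]
print("Joint Supermodularity holds for (b, y):", lhs >= rhs)
```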
Theorem 3.4.

Suppose $|X_t|=3$ for all $1\leq t\leq n$. $\rho$ has a unique SU representation if and only if it satisfies Joint Supermodularity and Marginal Consistency (stated below as Axiom 3.6).

The proof of this result proceeds from the observation that when all choice sets have three elements, the candidate SU representation is pinned down by joint BM sums of the form $m(y,\{x\})$, where I again abuse notation and let $\{x\}=(\{x_t\})_{t=1}^{n}$. The next result characterizes SU under a relaxation of the last period's cardinality restrictions with the following two axioms.

Axiom 3.5 (Joint BM Nonnegativity).

$m(x,A)\geq 0$ for all $A<<X$ and $x\in A^{C}$.

If $m(x,A)>0$ for all $A<<X$ and $x\in A^{C}$, say that $\rho$ satisfies Joint BM Positivity. The two-period version of this axiom is equivalent to Chambers et al. (2021)'s Axiom 3. Note that when all choice sets have three elements, Joint BM Nonnegativity implies Joint Supermodularity, since

$$\sum_{B\geq\{x\}^{C}:\ \sum_t|B_t|\ \text{even}}p(y,B)-\sum_{B\geq\{x\}^{C}:\ \sum_t|B_t|\ \text{odd}}p(y,B)=\sum_{B\geq\{x\}^{C}}(-1)^{\sum_{t=1}^{n}|B_t|-2n}p(y,B)=m(y,\{x\})\geq 0$$

As I show in the proof of Theorem 3.4, Joint Supermodularity and the following axiom imply Joint BM Nonnegativity under these cardinality restrictions.

Axiom 3.6 (Marginal Consistency).

For all $1\leq t<n$ and $x_{-t}\in A_{-t}\in\mathcal{M}_{-t}$,

$$\sum_{x_t\in A_t}p(x_t,A_t;x_{-t},A_{-t})$$

is constant in $A_t$. (For $x_{-n}\in A_{-n}\in\mathcal{M}_{-n}$ and any $A_n$, we have $\sum_{x_n\in A_n}p(x_{-n},A_{-n};x_n,A_n)=p_{n-1}(x_{-n},A_{-n})$, since $p_{n-1}$ is the marginal of $p$ on $A_{-n}$.)

If $p$ satisfies Marginal Consistency, define $p(x_{-t},A_{-t}):=p(x_t,\{x_t\};x_{-t},A_{-t})$ for any $x_t\in X_t$. For any $T\subsetneq\{1,\ldots,n\}$, $A_T=(A_t)_{t\in T}$, and $A_{-T}=(A_t)_{t\notin T}$, I show via induction on the size of $T$ that Marginal Consistency implies a well-defined marginal distribution over $A_{-T}$, defined by $p(x_{-T},A_{-T};y_T,\{y\}_T)$ for any $y_T\in X_T$. The base case ($|T|=1$) follows directly from the content of Marginal Consistency. Inductive step ($1<|T|<n$): fix $t\in T$, $y_t\in X_t$, and $y_{T\backslash\{t\}}\in X_{T\backslash\{t\}}$. Then

$$\begin{aligned}
\sum_{x_T\in A_T}p(x_T,A_T;x_{-T},A_{-T})&=\sum_{x_t\in A_t}\sum_{x_{T\backslash\{t\}}\in A_{T\backslash\{t\}}}p(x_t,A_t;x_{T\backslash\{t\}},A_{T\backslash\{t\}};x_{-T},A_{-T})\\
&=\sum_{x_t\in A_t}p(x_t,A_t;y_{T\backslash\{t\}},\{y\}_{T\backslash\{t\}};x_{-T},A_{-T})=p(y_T,\{y\}_T;x_{-T},A_{-T})
\end{aligned}$$

where the second equality follows from the inductive hypothesis and the third equality follows from Marginal Consistency. To compensate for the latent last-period marginality I assume in $p$, my axiom of Marginal Consistency is weaker than Chambers et al. (2021)'s Axiom 2 of Marginality.

Theorem 3.7.

Suppose $|X_t|=3$ for all $1\leq t<n$. $\rho$ has an SU representation if and only if it satisfies Joint BM Nonnegativity and Marginal Consistency. (A previous draft of this paper incorrectly stated this result without the cardinality restrictions.)

The proof of the backwards direction of Theorem 3.7 proceeds through a series of five claims. First, I recursively define a set function $\nu$, whose domain is not the entire power set $2^{P}$ but instead tuples of cylinders. The first two claims establish two necessary additive properties of $\nu$. The next two claims allow for the construction of a probability measure $\mu\in\Delta(P)$ that extends $\nu$. The last claim verifies that $\mu$ assigns each joint upper edge set its corresponding joint BM sum. By sharpening Joint BM Nonnegativity to Positivity, Theorem 3.7 admits a corollary that characterizes SU with full support.

Corollary 3.8.

Suppose $|X_t|=3$ for all $1\leq t<n$. $\rho$ has an SU representation with full support over $P$ if and only if it satisfies Joint BM Positivity and Marginal Consistency.

Next, I present an alternative characterization of SU under the same cardinality restrictions. For any $A_{-n}<<X_{-n}$ and $x_{-n}\in A_{-n}^{C}$, define their marginal Block–Marschak (BM) sum to be

$$m(x_{-n},A_{-n}):=\sum_{x_n\in X_n}m(x_{-n},A_{-n};x_n,\emptyset)=\sum_{B_{-n}\geq A_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}p_{-n}(x_{-n},B_{-n})$$

Recall that $p_{-n}:=p_{n-1}$ is well-defined without assuming Marginal Consistency of $p$.

Axiom 3.9 (Partial Marginal BM Nonnegativity).

For all $1\leq t<n$ and $y_t\neq x_t\in X_t$, $m(y_{-n},\{x\}_{-n})\geq 0$.

Note that Joint BM Nonnegativity implies Partial Marginal BM Nonnegativity, since each marginal BM sum is itself a sum of joint BM sums. Next, for any $A_{-n}<<X_{-n}$ and $x_{-n}\in A_{-n}^{C}$ with $m(x_{-n},A_{-n})>0$, define their conditional SCF as

$$\rho_n(x_n,A_n|(x,A)^{-n}):=\frac{\sum_{D_n\subseteq A_n^{C}}m(x_{-n},A_{-n};x_n,D_n)}{m(x_{-n},A_{-n})}$$

for any $x_n\in A_n\in\mathcal{M}_n$ (if $x_n\notin A_n$, define $\rho_n(x_n,A_n|(x,A)^{-n}):=0$) and their conditional Block–Marschak sum as

$$m(x_n,A_n|(x,A)^{-n}):=\sum_{B_n\supseteq A_n^{C}}(-1)^{|B_n|-|A_n^{C}|}\rho_n(x_n,B_n|(x,A)^{-n})$$

for any $A_n\subsetneq X_n$ and $x_n\in A_n^{C}$. Strictly speaking, $\rho_n(x_n,A_n|(x,A)^{-n})$ need not be nonnegative, even assuming Partial Marginal BM Nonnegativity: let $n=2$, $|X_1|=3$, $(x,A)^{-n}=(y_1,\{x_1\})$, and $x_2\in A_2=X_2$; then $\rho_2(x_2,A_2|(y_1,\{x_1\}))=\frac{p(y_1,\{y_1,z_1\};x_2,A_2)-p(y_1,X_1;x_2,A_2)}{\rho_1(y_1,\{y_1,z_1\})-\rho_1(y_1,X_1)}$, which is negative if, for example, $\rho_1(y_1,\{y_1,z_1\})=\frac{1}{2}$, $\rho_2(x_2,A_2|\{y_1,z_1\},y_1)=0$, $\rho_1(y_1,X_1)=\frac{1}{4}$, and $\rho_2(x_2,A_2|X_1,y_1)=1$. The following axiom enforces that conditional SCFs are indeed SCFs.
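The two displayed definitions translate directly into code. The sketch below is illustrative and uses hypothetical helper names; `joint_bm` stands for any implementation of $m(x_{-n},A_{-n};x_n,D_n)$, for instance the inclusion–exclusion routine sketched after Definition 2.2, and `marginal_bm` for $m(x_{-n},A_{-n})>0$.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    s = list(s)
    for r in range(len(s) + 1):
        for c in combinations(s, r):
            yield frozenset(c)

def conditional_scf(joint_bm, marginal_bm, x_head, A_head, x_n, A_n, X_n):
    """rho_n(x_n, A_n | (x, A)^{-n}): sum of joint BM sums over D_n contained in
    the complement of A_n, divided by the marginal BM sum (assumed positive)."""
    if x_n not in A_n:
        return 0.0
    complement = frozenset(X_n) - frozenset(A_n)
    num = sum(joint_bm(x_head, A_head, x_n, D_n) for D_n in subsets(complement))
    return num / marginal_bm

def conditional_bm(joint_bm, marginal_bm, x_head, A_head, x_n, A_n, X_n):
    """m(x_n, A_n | (x, A)^{-n}): inclusion-exclusion over menus B_n containing A_n^C."""
    complement = frozenset(X_n) - frozenset(A_n)
    total = 0.0
    for extra in subsets(frozenset(A_n)):          # B_n = A_n^C union extra
        B_n = complement | extra
        sign = (-1) ** (len(B_n) - len(complement))
        total += sign * conditional_scf(joint_bm, marginal_bm,
                                        x_head, A_head, x_n, B_n, X_n)
    return total
```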

Axiom 3.10 (Partial Conditional BM Nonnegativity).

For all $1\leq t<n$, $y_t\neq x_t\in X_t$ with $m(y_{-n},\{x\}_{-n})>0$, $A_n\subsetneq X_n$, and $x_n\in A_n^{C}$, $m(x_n,A_n|(y,\{x\})^{-n})\geq 0$.

Lemma 4.9 shows that $m(x_n,A_n|(x,A)^{-n})=\frac{m(x,A)}{m(x_{-n},A_{-n})}$, and hence Joint BM Nonnegativity implies Partial Conditional BM Nonnegativity.

Axiom 3.11 ($(-n)$-Marginal Consistency).

$p_{-n}$ satisfies Marginal Consistency.

By the argument following Axiom 3.6, Marginal Consistency of $p$ implies $(-n)$-Marginal Consistency. (In the context of that argument, take $T=\{t,n\}$ for any $1\leq t<n-1$. To see that the converse does not hold in general, let $n=2$: then $(-n)$-Marginal Consistency is satisfied by definition of $\rho_1$, but $\sum_{x_1\in A_1}p(x_1,A_1;x_2,A_2)$ need not be constant in $A_1$.) Finally, say that $(x_{-n}^{i},A_{-n}^{i})_{i\in I}$ partition history $h_{n-1}\in\mathcal{H}_{n-1}$ if $\{E(x_{-n}^{i},A_{-n}^{i})\}$ is a partition of $C(h_{n-1})$. Given partition $(x_{-n}^{i},A_{-n}^{i})_{i\in I}$ of $h_{n-1}$, let $I'=\{i\in I:m(x_{-n}^{i},A_{-n}^{i})>0\}$.

Axiom 3.12 ($(-n)$-Conditional Consistency).

For any $x_n\in A_n\in\mathcal{M}_n$ and $(A,x)^{-n}\in\mathcal{H}_{n-1}$ with partition $(y_{-n}^{i},\{x^{i}\}_{-n})_{i\in I}$,

$$\rho_n(x_n,A_n|(A,x)^{-n})=\sum_{i\in I'}\rho_n(x_n,A_n|(y^{i},\{x^{i}\})^{-n})\,\frac{m(y_{-n}^{i},\{x^{i}\}_{-n})}{p_n(x_{-n},A_{-n})}$$

This axiom is similar in spirit to Frick et al. (2019)'s Linear History Independence axiom. However, since my choice environment consists of riskless finite sets, the set of observable histories is coarser than that of Frick et al. (2019)'s setup. In particular, $\rho_n(\cdot|(y^{i},\{x^{i}\})^{-n})$ must be counterfactually extrapolated, whereas the analog of this SCF in Frick et al. (2019)'s environment can be directly observed.

Theorem 3.13.

Suppose $|X_t|=3$ for all $1\leq t<n$. $\rho$ has an SU representation if and only if it satisfies Axioms 3.9–3.12.

The proof of the backwards direction of Theorem 3.13 proceeds by constructing a marginal SU representation $\mu_{-n}\in\Delta(P_{-n})$ for the first $n-1$ periods, a conditional RU representation $\mu^{\succ_{-n}}\in\Delta(P_n)$ for each $\succ_{-n}\in\text{supp }\mu_{-n}$, and finally a candidate SU representation $\mu\in\Delta(P)$ that mixes the conditional RU representations $(\mu^{\succ_{-n}})$ according to $\mu_{-n}$. Finally, I characterize SU for arbitrary finite choice sets over an arbitrary finite time horizon, using the following axiom.

Axiom 3.14 (Joint Coherency).

For any $(x^{i},A^{i})_{i=1}^{k}$ with $x^{i}\in A^{i}\in\mathcal{M}$ for each $1\leq i\leq k$ and for any $(\lambda^{i})_{i=1}^{k}\subseteq\mathbb{R}$,

$$\sum_{i=1}^{k}\lambda^{i}\mathbb{1}_{C(x^{i},A^{i})}\geq 0\implies\sum_{i=1}^{k}\lambda^{i}p(x^{i},A^{i})\geq 0$$

Joint Coherency is the multiperiod analog of Clark (1996)’s Coherency axiom, which in turn is based on De Finetti (1937)’s characterization of finitely additive probability measures on an algebra of sets.

Theorem 3.15.

$\rho$ has an SU representation if and only if it satisfies Joint Coherency.
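Joint Coherency is a linear condition on $p$, so whether a given finite data set admits an SU representation can be checked numerically by searching for a probability vector $\mu$ over preference tuples with $\mu(C(x,A))=p(x,A)$ for all observed $(x,A)$, as in Proposition 3.1(2). The sketch below is an illustration rather than the paper's method; it sets this up as a linear-programming feasibility problem with `scipy`, and all names are hypothetical.

```python
from itertools import permutations, product
import numpy as np
from scipy.optimize import linprog

def su_feasible(X, p):
    """X: tuple of per-period alternative tuples; p: dict mapping
    ((x_1,...,x_n), (A_1,...,A_n)) with menus as frozensets to choice probabilities.
    Returns True iff some mu in Delta(P) satisfies mu(C(x,A)) = p(x,A) for all keys."""
    # Enumerate preference tuples: one strict ranking per period (best first).
    prefs = list(product(*(list(permutations(Xt)) for Xt in X)))

    def in_C(pref, xs, As):
        # pref lies in C(x,A) iff x_t is ranked above A_t \ {x_t} in every period t.
        return all(min(r.index(a) for a in At) == r.index(xt)
                   for r, xt, At in zip(pref, xs, As))

    A_eq = [[1.0 if in_C(pr, xs, As) else 0.0 for pr in prefs] for (xs, As) in p]
    b_eq = [p[key] for key in p]
    A_eq.append([1.0] * len(prefs))          # mu sums to one
    b_eq.append(1.0)
    res = linprog(c=np.zeros(len(prefs)), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0.0, None)] * len(prefs), method="highs")
    return res.status == 0                   # feasible => an SU representation exists
```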

4 Appendix

4.1 The Möbius Inversion

Let $(L,\leq)$ be a finite, partially ordered set (poset). The following definition and lemma are due to Van Lint and Wilson (2001), Equations 25.2 and 25.5.

Definition 4.1.

The Möbius function $m_L:L^{2}\rightarrow\mathbb{Z}$ is

$$m_L(a,b)=\begin{cases}1&a=b\\0&a\nleq b\\-\sum_{a\leq c<b}m_L(a,c)&a<b\end{cases}$$

Let the zeta function of $L$, denoted $\zeta_L$, be the indicator function for the set $\leq\ \subseteq L^{2}$. By Van Lint and Wilson (2001), Equation 25.1, $m_L$ is the $|L|\times|L|$ matrix inverse of $\zeta_L$.

Lemma 4.2.

Given a function $f:L\rightarrow\mathbb{R}$, define $F(a):=\sum_{b\geq a}f(b)$. Then

$$f(a)=\sum_{b\geq a}m_L(a,b)F(b)$$
Proof.

This proof is adapted from page 336 of Van Lint and Wilson (2001). Fix $f:L\rightarrow\mathbb{R}$ and define $F$ as above. Observe that

$$\sum_{a\leq c\leq b}m_L(a,c)=\sum_{c\in L}m_L(a,c)\zeta_L(c,b)=\begin{cases}1&a=b\\0&\text{else}\end{cases}$$

where the first equality holds because $m_L(a,c)=0$ if $a\nleq c$ and $\zeta_L(c,b)=1$ if $c\leq b$ and $0$ else, and the second equality holds because $m_L\zeta_L$ is the $L\times L$ identity matrix. Thus,

$$\sum_{b\geq a}m_L(a,b)F(b)=\sum_{b\geq a}m_L(a,b)\bigg(\sum_{c\geq b}f(c)\bigg)=\sum_{c\geq a}f(c)\bigg(\sum_{a\leq b\leq c}m_L(a,b)\bigg)=f(a)$$

∎

Lemma 4.2 shows how to recover any real-valued poset-defined function given its sums over upper contour sets and the poset’s Möbius function. This procedure is known as the Möbius inversion. The following lemma is due to Van Lint and Wilson (2001) Theorem 25.1(i). It states the Möbius function for the power set of a finite set under the inclusion partial order.
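As a quick numerical sanity check (not part of the paper), the sketch below performs the Möbius inversion of Lemma 4.2 on the subset lattice $(2^X,\subseteq)$, using the explicit power-set Möbius function $(-1)^{|B|-|A|}$ stated in the next lemma, and verifies that an arbitrary $f$ is recovered from its upper-contour sums $F$.

```python
from itertools import combinations
import random

X = frozenset({"a", "b", "c"})

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

L = subsets(X)
f = {A: random.random() for A in L}                 # arbitrary poset-defined function
F = {A: sum(f[B] for B in L if A <= B) for A in L}  # upper-contour sums

# Möbius inversion on (2^X, subset order): m_L(A, B) = (-1)^{|B|-|A|} for A a subset of B.
recovered = {A: sum((-1) ** (len(B) - len(A)) * F[B] for B in L if A <= B) for A in L}

assert all(abs(recovered[A] - f[A]) < 1e-9 for A in L)
print("Möbius inversion recovers f from F on the subset lattice.")
```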

Lemma 4.3.

Fix finite $X$ and let $L=2^{X}$ with $\leq\,=\,\subseteq$. Then

$$m_L(A,B)=\begin{cases}(-1)^{|B|-|A|}&A\subseteq B\\0&\text{else}\end{cases}$$
Proof.

See page 343 of Van Lint and Wilson (2001). ∎

Given posets $(L_i,\leq_i)_{i=1}^{n}$, define their product poset to be $(L,\leq)$, where $L=\times_{i=1}^{n}L_i$ and $a\leq b$ if and only if $a_i\leq_i b_i$ for each $i$. Let $m_i$ denote the Möbius function of $L_i$. The following lemma generalizes Godsil (2018), Lemma 3.1, to state the Möbius function for $n$-ary product posets for arbitrary finite $n$.

Lemma 4.4.

For all $a,b\in L$, the Möbius function $m_L:L^{2}\rightarrow\mathbb{Z}$ satisfies

$$m_L(a,b)=\prod_{i=1}^{n}m_{L_i}(a_i,b_i)$$
Proof.

Let $a,b\in L$. By definition of $\leq_L$, $\zeta_L(a,b)=1$ if and only if $\zeta_{L_i}(a_i,b_i)=1$ for each $i$. Hence, $\zeta_L(a,b)=\prod_{i=1}^{n}\zeta_{L_i}(a_i,b_i)$, which implies the $|L|\times|L|$ matrix $\zeta_L$ is the Kronecker product of the $|L_i|\times|L_i|$ matrices $\zeta_{L_i}$, denoted $\bigotimes_{i=1}^{n}\zeta_{L_i}$. (See Definition 2.1 of Schacke (2004) for a definition of the Kronecker product.) Then

$$m_L=(\zeta_L)^{-1}=\bigg(\bigotimes_{i=1}^{n}\zeta_{L_i}\bigg)^{-1}=\bigotimes_{i=1}^{n}(\zeta_{L_i})^{-1}=\bigotimes_{i=1}^{n}m_{L_i}$$

where the third equality follows from KRON 10 of Schacke (2004). (Formally, I can show this by inducting on $n$. The base case ($n=1$) follows immediately. Inductive step: $(\bigotimes_{i=1}^{n}\zeta_{L_i})^{-1}=((\bigotimes_{i=1}^{n-1}\zeta_{L_i})\otimes\zeta_{L_n})^{-1}=(\bigotimes_{i=1}^{n-1}\zeta_{L_i})^{-1}\otimes(\zeta_{L_n})^{-1}=\bigotimes_{i=1}^{n}(\zeta_{L_i})^{-1}$, where the second equality follows from KRON 10 of Schacke (2004) and the third equality follows from the inductive hypothesis.) By definition of the Kronecker product, we conclude

$$m_L(a,b)=\prod_{i=1}^{n}m_{L_i}(a_i,b_i)$$

∎
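A small numerical check of Lemma 4.4 (illustrative only, with hypothetical helper names): the zeta matrix of a product of two subset lattices is the Kronecker product of the factor zeta matrices, so inverting it reproduces the Kronecker product of the factor Möbius matrices.

```python
from itertools import combinations
import numpy as np

def subset_lattice(X):
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def zeta(L):
    return np.array([[1.0 if a <= b else 0.0 for b in L] for a in L])

L1, L2 = subset_lattice({"a", "b"}), subset_lattice({"x", "y", "z"})
z1, z2 = zeta(L1), zeta(L2)

# Product poset: pairs ordered componentwise; its zeta matrix is the Kronecker
# product of the factors' zeta matrices (with the matching element order).
z_prod = np.kron(z1, z2)
m_prod = np.linalg.inv(z_prod)                    # Möbius matrix of the product poset
m_factorized = np.kron(np.linalg.inv(z1), np.linalg.inv(z2))

assert np.allclose(m_prod, m_factorized)          # Lemma 4.4 / KRON 10
print("m_L equals the Kronecker product of the factor Möbius matrices.")
```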

4.2 BM Sum Identities

The following lemma shows how to recover $p$ from lower contour sums of joint BM sums.

Lemma 4.5.

For all $A<<X$ and $x\in A^{C}$,

$$p(x,A^{C})=\sum_{B\leq A}m(x,B)$$
Proof.

By Lemmas 4.3 and 4.4, I obtain

$$m_L(A,B)=\begin{cases}(-1)^{\sum_{t=1}^{n}(|B_t|-|A_t|)}&A\leq B\\0&\text{else}\end{cases}$$

For each $1\leq t\leq n$, fix any $A_t\subsetneq X_t$ and $x_t\in A_t^{C}$. Define $f:L\rightarrow\mathbb{R}$ as

$$f(B):=(-1)^{\sum_{t=1}^{n}(|B_t|-|A_t^{C}|)}p(x,B)$$

and $F:L\rightarrow\mathbb{R}$ as $F(D):=\sum_{B\geq D}f(B)$. Applying the Möbius inversion from Lemma 4.2,

$$f(D)=\sum_{B\geq D}(-1)^{\sum_{t=1}^{n}(|B_t|-|D_t|)}F(B)$$

which implies

$$p(x,A^{C})=f(A^{C})=\sum_{B\geq A^{C}}(-1)^{\sum_{t=1}^{n}(|B_t|-|A_t^{C}|)}F(B)=\sum_{D\leq A}m(x,D)$$

The last equality follows by matching terms across sums via the bijection $D^{C}\geq A^{C}\iff D\leq A$:

$$\begin{aligned}
(-1)^{\sum_{t=1}^{n}(|D_t^{C}|-|A_t^{C}|)}F(D^{C})&=\sum_{B\geq D^{C}}(-1)^{\sum_{t=1}^{n}(|D_t^{C}|-|A_t^{C}|)}f(B)\\
&=\sum_{B\geq D^{C}}(-1)^{\sum_{t=1}^{n}(|B_t|-|D_t^{C}|)}p(x,B)=m(x,D)
\end{aligned}$$

where the second equality follows by observing that $(-1)^{k}=(-1)^{-k}$. ∎

Lemma 4.5 is the dynamic analog of Chambers and Echenique (2016)'s Lemma 7.4(I), which is stated in that reference without proof. An analogous argument shows how to recover $\rho_n(\cdot|h_{n-1})$ from lower contour sums of (history-)conditional BM sums.

Lemma 4.6.

For all $(A,x)^{-n}\in\mathcal{H}_{n-1}$, $A_n\subsetneq X_n$, and $x_n\in A_n^{C}$,

$$\rho_n(x_n,A_n^{C}|(A,x)^{-n})=\sum_{B_n\subseteq A_n}m(x_n,B_n|(A,x)^{-n})$$

where

$$m(x_n,B_n|(A,x)^{-n}):=\sum_{D_n\supseteq B_n^{C}}(-1)^{|D_n|-|B_n^{C}|}\rho_n(x_n,D_n|(A,x)^{-n})$$
Proof.

By Lemma 4.3, the Möbius function for $L_n=(2^{X_n},\subseteq)$ is

$$m_{L_n}(A_n,B_n)=\begin{cases}(-1)^{|B_n|-|A_n|}&A_n\subseteq B_n\\0&\text{else}\end{cases}$$

Fix any $(A,x)^{-n}\in\mathcal{H}_{n-1}$, $A_n\subsetneq X_n$, and $x_n\in A_n^{C}$. Define $f:L_n\rightarrow\mathbb{R}$ as

$$f(B_n):=(-1)^{|B_n|-|A_n^{C}|}\rho_n(x_n,B_n|(A,x)^{-n})$$

and $F:L_n\rightarrow\mathbb{R}$ as $F(D_n):=\sum_{B_n\supseteq D_n}f(B_n)$. Applying the Möbius inversion from Lemma 4.2,

$$f(D_n)=\sum_{B_n\supseteq D_n}(-1)^{|B_n|-|D_n|}F(B_n)$$

which implies

$$\rho_n(x_n,A_n^{C}|(A,x)^{-n})=f(A_n^{C})=\sum_{B_n\supseteq A_n^{C}}(-1)^{|B_n|-|A_n^{C}|}F(B_n)=\sum_{D_n\subseteq A_n}m(x_n,D_n|(A,x)^{-n})$$

The last equality follows by matching terms across sums via the bijection $D_n^{C}\supseteq A_n^{C}\iff D_n\subseteq A_n$:

$$\begin{aligned}
(-1)^{|D_n^{C}|-|A_n^{C}|}F(D_n^{C})&=\sum_{B_n\supseteq D_n^{C}}(-1)^{|D_n^{C}|-|A_n^{C}|}f(B_n)\\
&=\sum_{B_n\supseteq D_n^{C}}(-1)^{|B_n|-|D_n^{C}|}\rho_n(x_n,B_n|(A,x)^{-n})=m(x_n,D_n|(A,x)^{-n})
\end{aligned}$$

where the second equality follows by observing that $(-1)^{k}=(-1)^{-k}$. ∎

Strictly speaking, Lemma 4.6 holds for SCFs and BM sums that are conditional on histories. However, the same argument implies that for any $A<<X$ and $x\notin A$ with $m(x_{-n},A_{-n})>0$,

$$\rho_n(x_n,A_n^{C}|(x,A)^{-n})=\sum_{B_n\subseteq A_n}m(x_n,B_n|(x,A)^{-n})$$

This follows by fixing $A<<X$ and $x\notin A$ with $m(x_{-n},A_{-n})>0$, defining $f(B_n):=(-1)^{|B_n|-|A_n^{C}|}\rho_n(x_n,B_n|(x,A)^{-n})$ and $F(D_n):=\sum_{B_n\supseteq D_n}f(B_n)$, applying the Möbius inversion as before, and concluding that $\rho_n(x_n,A_n^{C}|(x,A)^{-n})=\sum_{B_n\supseteq A_n^{C}}(-1)^{|B_n|-|A_n^{C}|}F(B_n)=\sum_{D_n\subseteq A_n}m(x_n,D_n|(x,A)^{-n})$ by matching terms: for any $D_n^{C}\supseteq A_n^{C}$,

$$\begin{aligned}
(-1)^{|D_n^{C}|-|A_n^{C}|}F(D_n^{C})&=\sum_{B_n\supseteq D_n^{C}}(-1)^{|D_n^{C}|-|A_n^{C}|}f(B_n)\\
&=\sum_{B_n\supseteq D_n^{C}}(-1)^{|B_n|-|D_n^{C}|}\rho_n(x_n,B_n|(x,A)^{-n})=m(x_n,D_n|(x,A)^{-n})
\end{aligned}$$

The following lemma equates two different (single-period) sums of joint BM sums under Marginal Consistency.

Lemma 4.7.

Assume $p$ satisfies Marginal Consistency. For any $1\leq t\leq n$, $\emptyset\subsetneq A_t\subsetneq X_t$, $A_{-t}<<X_{-t}$, and $x_{-t}\notin A_{-t}$,

$$\sum_{x_t\in A_t^{C}}m(x_t,A_t;x_{-t},A_{-t})=\sum_{y_t\in A_t}m(y_t,A_t\backslash\{y_t\};x_{-t},A_{-t})$$
Proof.

Expanding the left-hand side yields

$$\sum_{B_{-t}\geq A_{-t}^{C}}(-1)^{\sum_{i\neq t}|B_i|-|A_i^{C}|}\bigg[\sum_{B_t\supseteq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}p(A_t^{C},B_t;x_{-t},B_{-t})\bigg]$$

and expanding the right-hand side yields

$$\sum_{B_{-t}\geq A_{-t}^{C}}(-1)^{\sum_{i\neq t}|B_i|-|A_i^{C}|}\bigg[\sum_{y_t\in A_t}\sum_{B_t\supseteq A_t^{C}\cup\{y_t\}}(-1)^{|B_t|-|A_t^{C}|-1}p(y_t,B_t;x_{-t},B_{-t})\bigg]$$

I will show that the inner sums are equal by matching terms. Expanding the first inner sum yields

$$\begin{aligned}
&p(x_{-t},B_{-t})+\sum_{B_t\supsetneq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}\big[p(x_{-t},B_{-t})-p(B_t\backslash A_t^{C},B_t;x_{-t},B_{-t})\big]\\
&=\sum_{B_t\supseteq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}p(x_{-t},B_{-t})-\sum_{B_t\supsetneq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}p(B_t\backslash A_t^{C},B_t;x_{-t},B_{-t})\\
&=p(x_{-t},B_{-t})\sum_{B_t\supseteq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}+\sum_{B_t\supsetneq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|+1}p(B_t\backslash A_t^{C},B_t;x_{-t},B_{-t})\\
&=\sum_{B_t\supsetneq A_t^{C}}\sum_{x_t\in B_t\backslash A_t^{C}}(-1)^{|B_t|-|A_t^{C}|+1}p(x_t,B_t;x_{-t},B_{-t})
\end{aligned}$$

where the first line follows from Marginal Consistency. To see the last equality, observe that

$$\sum_{B_t\supseteq A_t^{C}}(-1)^{|B_t|-|A_t^{C}|}=\sum_{k=0}^{|A_t|}(-1)^{k}\binom{|A_t|}{k}=0$$

since, for each $0\leq k\leq|A_t|$, there are $\binom{|A_t|}{k}$ supersets of $A_t^{C}$ with $|A_t^{C}|+k$ elements. Since $(-1)^{n}=(-1)^{n+2}$, matching terms across inner sums via the bijection $y_t\in A_t,\,B_t\supseteq A_t^{C}\cup\{y_t\}\iff B_t\supsetneq A_t^{C},\,y_t\in B_t\backslash A_t^{C}$ completes the proof. ∎

Lemma 4.7 is the dynamic analog of Chambers and Echenique (2016)'s Lemma 7.4(II), which is stated in that reference without proof. Note that since $p_{n-1}$ is the marginal of $p$ over $A_{-n}$, the result of Lemma 4.7 for $t=n$ goes through without assuming Marginal Consistency. Lemma 4.7 admits the following corollary, which equates multi-period sums of Block–Marschak sums.

Corollary 4.8.

Assume $p$ satisfies Marginal Consistency. For any nonempty set of indices $T\subseteq\{1,\ldots,n\}$, any $\emptyset<<A_T<<X_T$, $A_{-T}<<X_{-T}$, and $x_{-T}\notin\notin A_{-T}$ (where $A_T:=(A_t)_{t\in T}$ and $x_T\notin\notin A_T$ means $x_t\notin A_t$ for all $t\in T$),

$$\sum_{x_T\in A_T^{C}}m(x_T,A_T;x_{-T},A_{-T})=\sum_{y_T\in A_T}m(y_T,(A\backslash\{y\})_T;x_{-T},A_{-T})$$

where $(A\backslash\{y\})_T:=(A_t\backslash\{y_t\})_{t\in T}$.

Proof.

I prove this by inducting on the cardinality of $T$. The base case ($|T|=1$) is Lemma 4.7. Inductive step: suppose the equality holds for all $T\subseteq\{1,\ldots,n\}$ with $|T|=k$. Fix $T=\{t_1,\ldots,t_{k+1}\}\subseteq\{1,\ldots,n\}$ and let $T_k=T\backslash\{t_{k+1}\}$. Then

$$\begin{aligned}
\sum_{x_T\in A_T^{C}}m(x_T,A_T;x_{-T},A_{-T})&=\sum_{x_{t_{k+1}}\in A_{t_{k+1}}^{C}}\bigg[\sum_{x_{T_k}\in A_{T_k}^{C}}m(x_{T_k},A_{T_k};x_{t_{k+1}},A_{t_{k+1}};x_{-T},A_{-T})\bigg]\\
&=\sum_{x_{t_{k+1}}\in A_{t_{k+1}}^{C}}\bigg[\sum_{y_{T_k}\in A_{T_k}}m(y_{T_k},(A\backslash\{y\})_{T_k};x_{t_{k+1}},A_{t_{k+1}};x_{-T},A_{-T})\bigg]\\
&=\sum_{y_{T_k}\in A_{T_k}}\bigg[\sum_{x_{t_{k+1}}\in A_{t_{k+1}}^{C}}m(y_{T_k},(A\backslash\{y\})_{T_k};x_{t_{k+1}},A_{t_{k+1}};x_{-T},A_{-T})\bigg]\\
&=\sum_{y_{T_k}\in A_{T_k}}\bigg[\sum_{y_{t_{k+1}}\in A_{t_{k+1}}}m(y_{T_k},(A\backslash\{y\})_{T_k};y_{t_{k+1}},A_{t_{k+1}}\backslash\{y_{t_{k+1}}\};x_{-T},A_{-T})\bigg]\\
&=\sum_{y_T\in A_T}m(y_T,(A\backslash\{y\})_T;x_{-T},A_{-T})
\end{aligned}$$

where the second equality follows from the inductive hypothesis and the fourth equality follows from Lemma 4.7. ∎

The next result verifies that joint BM sums can be decomposed into products of marginal and conditional BM sums.

Lemma 4.9.

For any $A_{-n}<<X_{-n}$ and $x_{-n}\in A_{-n}^{C}$ with $m(x_{-n},A_{-n})>0$ and any $A_n\in\mathcal{M}_n$,

$$\sum_{x_n\in A_n}\rho_n(x_n,A_n|(x,A)^{-n})=1\quad\text{and}\quad m(x_n,A_n|(x,A)^{-n})\,m(x_{-n},A_{-n})=m(x,A)$$
Proof.

Fix any $A_{-n}<<X_{-n}$ and $x_{-n}\in A_{-n}^{C}$ with $m(x_{-n},A_{-n})>0$ and any $A_n\in\mathcal{M}_n$. Note that for fixed $x_n\in A_n$,

$$\begin{aligned}
&\sum_{D_n\subseteq A_n^{C}}m(x_{-n},A_{-n};x_n,D_n)\\
&=\sum_{B_{-n}\geq A_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}p_{-n}(x_{-n},B_{-n})\sum_{D_n\subseteq A_n^{C}}\sum_{B_n\supseteq D_n^{C}}(-1)^{|B_n|-|D_n^{C}|}\rho_n(x_n,B_n|(B,x)^{-n})\\
&=\sum_{B_{-n}\geq A_{-n}^{C}:\,p_{-n}(x_{-n},B_{-n})>0}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}p_{-n}(x_{-n},B_{-n})\sum_{D_n\subseteq A_n^{C}}m(x_n,D_n|(B,x)^{-n})\\
&=\sum_{B_{-n}\geq A_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}p(x_{-n},B_{-n};x_n,A_n)
\end{aligned}$$

where the second equality holds because every sum is equal to itself excluding the terms equal to zero and by the definition of $m(x_n,D_n|(B,x)^{-n})$ given in Lemma 4.6, and the third equality follows from Lemma 4.6. Substituting the above yields

$$\begin{aligned}
\sum_{x_n\in A_n}\rho_n(x_n,A_n|(x,A)^{-n})&=\frac{\sum_{x_n\in A_n}\sum_{D_n\subseteq A_n^{C}}m(x_{-n},A_{-n};x_n,D_n)}{m(x_{-n},A_{-n})}\\
&=\frac{\sum_{B_{-n}\geq A_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}\sum_{x_n\in A_n}p(x_{-n},B_{-n};x_n,A_n)}{m(x_{-n},A_{-n})}=1
\end{aligned}$$

by definition of $m(x_{-n},A_{-n})$. Next,

$$\begin{aligned}
m(x_n,A_n|(x,A)^{-n})\,m(x_{-n},A_{-n})&=\sum_{B_n\supseteq A_n^{C}}(-1)^{|B_n|-|A_n^{C}|}\sum_{D_n\subseteq B_n^{C}}m(x_{-n},A_{-n};x_n,D_n)\\
&=\sum_{B_n\supseteq A_n^{C}}(-1)^{|B_n|-|A_n^{C}|}\sum_{B_{-n}\geq A_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_t|-|A_t^{C}|}p(x_{-n},B_{-n};x_n,B_n)=m(x,A)
\end{aligned}$$

where the second equality follows from above. ∎

4.3 Proof of Proposition 3.1

Proof.

(1) $\implies$ (2): Fix any $x\in A\in\mathcal{M}$. If $p(x,A)=\rho_1(x_1,A_1)\prod_{t=2}^{n}\rho_t(x_t,A_t|(A,x)^{t-1})=0$, then $\mu(C(x_1,A_1))=\rho_1(x_1,A_1)=0$ or $\mu(C(x_t,A_t)|C(A,x)^{t-1})=\rho_t(x_t,A_t|(A,x)^{t-1})=0$ for some $1<t\leq n$ with $(A,x)^{t-1}\in\mathcal{H}_{t-1}$, by definition of $p$. (If $\rho_1(x_1,A_1)>0$, then at least one term in the product $\prod_{t=2}^{n}\rho_t(x_t,A_t|(A,x)^{t-1})$ is zero. Let $t=\min\{1<s\leq n:\rho_s(x_s,A_s|(A,x)^{s-1})=0\}$: then for all $1<s<t$, $\rho_s(x_s,A_s|(A,x)^{s-1})>0$ and $(A,x)^{s-1}\in\mathcal{H}_{s-1}$, by definition of $\mathcal{H}_{s-1}$. Hence, $(A,x)^{t-1}\in\mathcal{H}_{t-1}$.)

If the former, then $C(x,A)\subseteq C(x_1,A_1)$ implies $\mu(C(x,A))\leq\mu(C(x_1,A_1))=0$. If the latter, then $C(x,A)\subseteq\bigcap_{s=1}^{t}C(x_s,A_s)$ implies $\mu(C(x,A))\leq\mu\big(\bigcap_{s=1}^{t}C(x_s,A_s)\big)=0$. (To see the last equality: if $(A,x)^{t-1}\in\mathcal{H}_{t-1}$, then $\rho_s(x_s,A_s|(A,x)^{s-1})>0$ for all $1<s<t$ and $\rho_1(x_1,A_1)>0$, by definition of $\mathcal{H}_{t-1}$. I claim that $\mu(C(A,x)^{s})>0$ for all $1<s\leq t-1$. Base case: $\mu(C(A,x)^{2})=\mu(C(x_1,A_1)\cap C(x_2,A_2))=\mu(C(x_2,A_2)|C(x_1,A_1))\mu(C(x_1,A_1))=\rho_2(x_2,A_2|A_1,x_1)\rho_1(x_1,A_1)>0$. Inductive step: suppose $\mu(C(A,x)^{s})>0$ for $1<s<t-1$. Since $(A,x)^{t-1}\in\mathcal{H}_{t-1}$, $(A,x)^{s}\in\mathcal{H}_{s}$. Hence, $\mu(C(A,x)^{s+1})=\mu(C(A,x)^{s}\cap C(x_{s+1},A_{s+1}))=\mu(C(x_{s+1},A_{s+1})|C(A,x)^{s})\mu(C(A,x)^{s})=\rho_{s+1}(x_{s+1},A_{s+1}|(A,x)^{s})\mu(C(A,x)^{s})>0$. Finally, since $\mu(C(A,x)^{t-1})>0$, we can write $\mu(\bigcap_{s=1}^{t}C(x_s,A_s))=\mu(C(x_t,A_t)|C(A,x)^{t-1})\mu(C(A,x)^{t-1})=\rho_t(x_t,A_t|(A,x)^{t-1})\mu(C(A,x)^{t-1})=0$.) In either case, $\mu(C(x,A))=0=p(x,A)$, as desired. If $\rho_1(x_1,A_1)\prod_{t=2}^{n}\rho_t(x_t,A_t|(A,x)^{t-1})=p(x,A)>0$, then every multiplicand of that product is positive, and $\mu(C(A,x)^{t-1})>0$ for all $1<t\leq n$ by the argument above. Hence,

$$\mu(C(x,A))=\mu(C(x_1,A_1))\prod_{t=2}^{n}\mu(C(x_t,A_t)|C(A,x)^{t-1})=\rho_1(x_1,A_1)\prod_{t=2}^{n}\rho_t(x_t,A_t|(A,x)^{t-1})=p(x,A)$$

(2) $\implies$ (3): Fix any $A<<X$ and $x\in A^{C}$. Since $x_t\succ_t A_t^{C}\backslash\{x_t\}$ if and only if $B_t^{C}\succ_t x_t\succ_t B_t\backslash\{x_t\}$ for some $B_t\supseteq A_t^{C}$,

$$C(x,A^{C})=\bigcup_{B\geq A^{C}}E(x,B^{C})$$

and this union is disjoint. Hence,

$$p(x,A^{C})=\sum_{B\geq A^{C}}\mu(E(x,B^{C}))$$

By Lemma 4.2 with $f(B)=\mu(E(x,B^{C}))$ and $F(A)=p(x,A)$,

$$\mu(E(x,A))=f(A^{C})=\sum_{B\geq A^{C}}m_L(A^{C},B)p(x,B)=m(x,A)$$

(3) $\implies$ (1): For all $y\in D\in\mathcal{M}$, since $y_t\succ_t D_t\backslash\{y_t\}$ if and only if $B_t\succ_t y_t\succ_t B_t^{C}\backslash\{y_t\}$ for some $B_t\subseteq D_t^{C}$,

$$C(y,D)=\bigcup_{B\leq D^{C}}E(y,B)$$

and this union is disjoint. By Lemma 4.5,

$$\mu(C(y,D))=\sum_{B\leq D^{C}}m(y,B)=p(y,D)$$

Fix any $x_1\in A_1\in\mathcal{M}_1$ and fix $A_{-1}\in\mathcal{M}_{-1}$: since $\{C(x_1,A_1;x_{-1},A_{-1})\}_{x_{-1}\in A_{-1}}$ partitions $C(x_1,A_1)$,

$$\mu(C(x_1,A_1))=\sum_{x_{-1}\in A_{-1}}\mu(C(x_1,A_1;x_{-1},A_{-1}))=\sum_{x_{-1}\in A_{-1}}p(x_1,A_1;x_{-1},A_{-1})=\rho_1(x_1,A_1)$$

In particular, $\mu(C(h_1))=\rho_1(x_1,A_1)>0$ for any $h_1=(A_1,x_1)\in\mathcal{H}_1$, by definition of $\mathcal{H}_1$. Next, fix any $2<t\leq n$, $h_{t-1}\in\mathcal{H}_{t-1}$, and $A_{\geq t}\in\mathcal{M}_{\geq t}$. Since $\{C(h_{t-1};x_{\geq t},A_{\geq t})\}_{x_{\geq t}\in A_{\geq t}}$ partitions $C(h_{t-1})$,

$$\begin{aligned}
\mu(C(h_{t-1}))&=\sum_{x_{\geq t}\in A_{\geq t}}\mu(C(h_{t-1};x_{\geq t},A_{\geq t}))=\sum_{x_{\geq t}\in A_{\geq t}}p(h_{t-1};x_{\geq t},A_{\geq t})\\
&=p_{t-1}(h_{t-1})=\rho_1(x_1,A_1)\prod_{s=2}^{t-1}\rho_s(x_s,A_s|(A,x)^{s-1})>0
\end{aligned}$$

by definition of $\mathcal{H}_{t-1}$. Hence, for all $1<t\leq n$, $h_{t-1}=(A,x)^{t-1}\in\mathcal{H}_{t-1}$, and $x_t\in A_t\in\mathcal{M}_t$,

$$\begin{aligned}
\mu(C(x_t,A_t)|C(h_{t-1}))&=\frac{\mu(C(h_{t-1};x_t,A_t))}{\mu(C(h_{t-1}))}\\
&=\frac{\rho_1(x_1,A_1)\prod_{s=2}^{t}\rho_s(x_s,A_s|(A,x)^{s-1})}{\rho_1(x_1,A_1)\prod_{s=2}^{t-1}\rho_s(x_s,A_s|(A,x)^{s-1})}=\rho_t(x_t,A_t|h_{t-1})
\end{aligned}$$

∎

4.4 Proof of Theorem 3.4

Proof.

($\implies$): As I have shown, $p$ satisfies Marginal Consistency by Proposition 3.1. To see that $p$ satisfies Joint Supermodularity, fix $y_t\neq x_t\in X_t$ for each $1\leq t\leq n$. By Proposition 3.1,

$$\begin{aligned}
&\sum_{B\geq\{x\}^{C}:\ \sum_{t=1}^{n}|B_t|\ \text{even}}p(y,B)-\sum_{B\geq\{x\}^{C}:\ \sum_{t=1}^{n}|B_t|\ \text{odd}}p(y,B)\\
&=\sum_{B\geq\{y,z\}_N}(-1)^{\sum_{t=1}^{n}|B_t|-2n}p(y,B)=m(y,\{x\})=\mu(E(y,\{x\}))\geq 0
\end{aligned}$$

($\impliedby$): Define $\mu(xyz):=m(y,\{x\})\geq 0$, where $xyz$ denotes the preference tuple that ranks $x_t$ first, $y_t$ second, and $z_t$ third in each period $t$. Observe that

$$\sum_{\succ\in P}\mu(\succ)=\sum_{d\in X}\bigg[\sum_{e\in(X\backslash\{d\})}m(e,\{d\})\bigg]=\sum_{d\in X}m(d,\emptyset)=\sum_{d\in X}p(d,X)=1$$

where $(X\backslash\{d\})=(X_t\backslash\{d_t\})_{1\leq t\leq n}$, the first equality follows from counting (since there is a bijection between preference tuples and their first- and second-ranked elements in each component), the second equality follows from Corollary 4.8, and the third equality follows by definition of $m$. Hence, $\mu$ is a probability measure over $P$. Next, fix any $A<<X$ and $x\in A^{C}$. For $i=0,1,2$, let $T_i:=\{t\in\{1,\ldots,n\}:|A_t|=i\}$. Since $A<<X$, $\{T_0,T_1,T_2\}$ partition $\{1,\ldots,n\}$. Since

$$E(x,A)=\bigcup_{y_{T_0}\neq x_{T_0}}\bigcup_{y_{T_2}\in A_{T_2}}E(y_{T_0},\{x\}_{T_0};x_{T_1},A_{T_1};y_{T_2},(A\backslash\{y\})_{T_2})$$

and this union is disjoint,

$$\mu(E(x,A))=\sum_{y_{T_0}\neq x_{T_0}}\sum_{y_{T_2}\in A_{T_2}}m(y_{T_0},\{x\}_{T_0};x_{T_1},A_{T_1};y_{T_2},(A\backslash\{y\})_{T_2})=m(x,A)$$

where the second equality follows from Corollary 4.8. By Proposition 3.1, $\mu$ is an SU representation of $\rho$. Finally, suppose $\mu'$ is an SU representation of $\rho$. Fix any $xyz\in P$: by Proposition 3.1,

$$\mu'(xyz)=\mu'(E(y,\{x\}))=m(y,\{x\})=\mu(xyz)$$

Hence, $\mu$ is unique. ∎

4.5 Proof of Theorem 3.7

Proof.

($\implies$): By Proposition 3.1, Joint BM Nonnegativity and Marginal Consistency are necessary for SU, since $m(x,A)=\mu(E(x,A))\geq 0$ for all $A<<X$ and $x\in A^{C}$, and $\sum_{x_t\in A_t}p(x_t,A_t;x_{-t},A_{-t})=\mu(C(x_{-t},A_{-t}))$ is constant in $A_t$ for all $1\leq t<n$ and $x_{-t}\in A_{-t}\in\mathcal{M}_{-t}$.

($\impliedby$): First, I define the $t$-cylinders.

Definition 4.10.

For $1\leq k\leq|X_t|$, a $\boldsymbol{k}$-sequence is a (distinct) sequence of elements $(x_t^{1},\ldots,x_t^{k})$ in $X_t$, and its $\boldsymbol{t}$-cylinder (this definition is analogous to Chambers and Echenique (2016)'s Chapter 7 definition of cylinders) is

$$I_{x_t^{1},\ldots,x_t^{k}}:=\big\{\succ_t\in P_t:x_t^{1}\succ_t\cdots\succ_t x_t^{k}\succ_t X_t\backslash\{x_t^{1},\ldots,x_t^{k}\}\big\}$$

A $k$-sequence's $t$-cylinder is the set of period-$t$ preferences that rank that $k$-sequence in order above all other elements. Let $\mathcal{I}_t$ be the set of all $t$-cylinders, and let $\mathcal{I}:=\times_{t=1}^{n}\mathcal{I}_t$. Observe that $\mathcal{I}_t$ contains all singletons, since $I_{x_t^{1},\ldots,x_t^{|X_t|-1}}=I_{x_t^{1},\ldots,x_t^{|X_t|}}=\{\succ_t\}$ for the unique $\succ_t$ satisfying $x_t^{1}\succ_t\cdots\succ_t x_t^{|X_t|}$. For $1\leq t<n$, observe that $\mathcal{I}_t$ only contains $t$-cylinders of $k$-sequences for $k=1,2,3$. Given (nonempty) menu $A_t$, let $\pi(A_t)$ be the set of all $|A_t|$-sequences that permute $A_t$. Now, I recursively define $\nu:\mathcal{I}\rightarrow\mathbb{R}_{\geq 0}$. (My definition of $\nu$ is the multiperiod analog of Chambers and Echenique (2016), Equation (7.4).) First, define

$$\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{x_n^{1}}):=m(x_{-n}^{2},\{x^{1}\}_{-n};x_n^{1},\emptyset)$$

Next, for $1<k\leq|X_n|$ and $x_n^{K}:=(x_n^{1},\ldots,x_n^{k})$, let $A_n=\{x_n^{1},\ldots,x_n^{k-1}\}$ and define

$$\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{x_n^{K}}):=\begin{cases}0&\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_n})=0\\[6pt]\dfrac{\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{x_n^{K-1}})\,m(x_{-n}^{2},\{x^{1}\}_{-n};x_n^{k},A_n)}{\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_n})}&\text{else}\end{cases}$$

Finally, for each $I\in\mathcal{I}$, let $T_1=\{1\leq t\leq n-1:I_t=I_{x_t^{1}}\}$ and $T_3=\{1\leq t\leq n-1:I_t=I_{x_t^{1},x_t^{2},x_t^{3}}\}$, and define

$$\nu(I):=\sum_{x_{T_1}^{2}\neq\neq x_{T_1}^{1}}\nu(I_{x_{T_1}^{1},x_{T_1}^{2}}\times I_{x_{T_3}^{1},x_{T_3}^{2}}\times I_{-(T_1\cup T_3)})$$

Note that Joint BM Nonnegativity implies $\nu\geq 0$.

Definition 4.11.

For any $k=(k_1,\ldots,k_n)$ with $0\leq k_t<|X_t|$, the first additive property $\boldsymbol{p_1(k)}$ holds if for all $A$ with $|A_t|=k_t$ and all $x_t\in A_t^{C}$,

$$\sum_{\tau\in\times_{t\in-T}\pi(A_t)}\nu(I_{\tau,x_{-T}}\times I_{x_T})=m(x_{-T},A_{-T};x_T,\emptyset)$$

where $T=\{1\leq t\leq n:k_t=0\}$.

Claim 4.12.

$p_1(k)$ holds for all $k$ with $0\leq k_t<|X_t|$ for each $1\leq t\leq n$.

Proof.

Base case ($0\leq k_t<3$ for all $1\leq t<n$, $k_n=0$): Fix any $A$ with $|A_t|=k_t$ for $1\leq t<n$ and $A_n=\emptyset$, and fix any $x\in A^{C}$. For $0\leq i<3$, let $T_i=\{1\leq t<n:k_t=i\}$. Label $x_{T_0}^{1}:=x_{T_0}$, $A_{T_1}=\{x^{1}\}_{T_1}$, $x_{T_1}^{2}:=x_{T_1}$, $A_{T_2}=\{x^{1},x^{2}\}_{T_2}$, and $x_{T_2}^{3}:=x_{T_2}$. Then

$$\begin{aligned}
&\sum_{\tau_{T_2}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_2}}\nu(I_{x_{T_0}^{1}}\times I_{x_{T_1}^{1},x_{T_1}^{2}}\times I_{\tau_{T_2},x_{T_2}^{3}}\times I_{x_n})\\
&=\sum_{\tau_{T_2}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_2}}\sum_{x_{T_0}^{2}\neq\neq x_{T_0}^{1}}\nu(I_{x_{T_0}^{1},x_{T_0}^{2}}\times I_{x_{T_1}^{1},x_{T_1}^{2}}\times I_{\tau_{T_2}}\times I_{x_n})\\
&=\sum_{e_{T_0}\in\{x^{1}\}_{T_0}^{C}}\sum_{e_{T_2}\in A_{T_2}}m(e_{T_0},\{x^{1}\}_{T_0};x_{T_1}^{2},\{x^{1}\}_{T_1};e_{T_2},(A\backslash\{e\})_{T_2};x_n,\emptyset)\\
&=m(x_{T_0}^{1},\emptyset;x_{T_1}^{2},\{x^{1}\}_{T_1};x_{T_2}^{3},A_{T_2};x_n,\emptyset)
\end{aligned}$$

where the last equality follows from Corollary 4.8.

First inductive step ($k_t=1$ for all $1\leq t<n$, $0<k_n<|X_n|$): Fix any $A$ with $A_t=\{x_t^{1}\}$ for all $1\leq t<n$ and $A_n=\{x_n^{1},\ldots,x_n^{k_n}\}$. Fix any $x\in A^{C}$. First, observe that

$$\begin{aligned}
\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n})&=\sum_{y_n\in A_n}\bigg[\sum_{\tau\in\pi(A_n\backslash\{y_n\})}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau,y_n})\bigg]\\
&=\sum_{y_n\in A_n}m(x_{-n},\{x^{1}\}_{-n};y_n,A_n\backslash\{y_n\})=\sum_{x_n\in A_n^{C}}m(x_{-n},\{x^{1}\}_{-n};x_n,A_n)
\end{aligned}$$

where the first equality holds because permuting $A_n$ is equivalent to choosing the last element and permuting the remaining $k_n-1$ elements, the second equality holds by the inductive hypothesis $p_1(k_{-n},k_n-1)$ (strictly speaking, for $k_n=1$ the inner sum on the first line is not well-defined; however, in this case it still holds that $\nu(I_{x_{-n}^{1},x_{-n}}\times I_{x_n^{1}})=m(x_{-n},\{x^{1}\}_{-n};x_n^{1},\emptyset)=\sum_{x_n\neq x_n^{1}}m(x_{-n},\{x^{1}\}_{-n};x_n,\{x_n^{1}\})$, where the first equality follows by $p_1(k_{-n},0)$, or directly by definition of $\nu$), and the third equality holds by Lemma 4.7. There are two cases.

1. If $\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n})=0$, then for each $\tau_n\in\pi(A_n)$ it follows by definition of $\nu$ that $\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n,x_n})=0$. Furthermore, Joint BM Nonnegativity implies $m(x_{-n},\{x^{1}\}_{-n};x_n,A_n)=0$ for each $x_n\in A_n^{C}$, since these nonnegative joint BM sums add up to zero by the display above. Hence,

   $$\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n,x_n})=0=m(x_{-n},\{x^{1}\}_{-n};x_n,A_n)$$

   as desired.

2. If $\sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n})>0$, then for each $\tau_n\in\pi(A_n)$ it follows by definition of $\nu$ that

   $$\begin{aligned}
   \sum_{\tau_n\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n,x_n})&=\sum_{\tau_n\in\pi(A_n)}\Bigg[\frac{\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n})\,m(x_{-n},\{x^{1}\}_{-n};x_n,A_n)}{\sum_{\tau_n'\in\pi(A_n)}\nu(I_{x_{-n}^{1},x_{-n}}\times I_{\tau_n'})}\Bigg]\\
   &=m(x_{-n},\{x^{1}\}_{-n};x_n,A_n)
   \end{aligned}$$

   as desired.

Second inductive step (0kt<30\leq k_{t}<3 for all 1t<n1\leq t<n, kt1k_{t}\neq 1 for some 1t<n1\leq t<n, 0<kn<|Xn|0<k_{n}<|X_{n}|): Fix any AA with |At|=kt|A_{t}|=k_{t} for 1tn1\leq t\leq n and any xACx\in A^{C}. For 0i<30\leq i<3, let Ti={1t<n:kt=i}T_{i}=\{1\leq t<n:k_{t}=i\}. By assumption, AT0=A_{T_{0}}=\emptyset. Label xT01:=xT0x_{T_{0}}^{1}:=x_{T_{0}}, AT1={x1}T1A_{T_{1}}=\{x^{1}\}_{T_{1}}, xT12:=xT1x_{T_{1}}^{2}:=x_{T_{1}}, AT2={x1,x2}T2A_{T_{2}}=\{x^{1},x^{2}\}_{T_{2}}, xT23:=xT2x_{T_{2}}^{3}:=x_{T_{2}}, and An={xn1,,xnkn}A_{n}=\{x_{n}^{1},\ldots,x_{n}^{k_{n}}\}. First, observe that

τT2{x1x2,x2x1}T2τnπ(An)ν(IxT01×IxT11,xT12×IτT2,xT23×Iτn)\displaystyle\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{\tau_{n}})
=ynAn[τT2{x1x2,x2x1}T2τπ(An\{yn})ν(IxT01×IxT11,xT12×IτT2,xT23×Iτ,yn)]\displaystyle=\sum_{y_{n}\in A_{n}}\bigg{[}\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{\tau\in\pi(A_{n}\backslash\{y_{n}\})}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{\tau,y_{n}})\bigg{]}
=ynAnm(xT01,;xT12,{x1}T1;xT23,AT2;yn,An\{yn})=xnAnCm(xT01,;xT12,{x1}T1;xT23,AT2;xn,An)\displaystyle=\sum_{y_{n}\in A_{n}}m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};y_{n},A_{n}\backslash\{y_{n}\})=\sum_{x_{n}\in A_{n}^{C}}m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n},A_{n})

where the first equality holds because permuting AnA_{n} is equivalent to choosing the last element and permuting the remaining kn1k_{n}-1 elements, the second equality holds by the inductive hypothesis p1(kn,kn1)p_{1}(k_{-n},k_{n}-1),171717Strictly speaking, for kn=1k_{n}=1 the inner sum on the second line is not well-defined. However, in this case it still holds that τT2{x1x2,x2x1}T2ν(IxT01×IxT11,xT12×IτT2,xT23×Ixn1)=m(xT01,;xT12,{x1}T1;xT23,AT2;xn1,)=xnxn1m(xT01,;xT12,{x1}T1;xT23,AT2;xn,{xn1})\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{x_{n}^{1}})=m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n}^{1},\emptyset)=\sum_{x_{n}\neq x_{n}^{1}}m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n},\{x_{n}^{1}\}), where the first equality follows from p(kn,0p(k_{-n},0, as shown in the base case. and the third equality holds because of Lemma 4.7. There are two cases.

  1. 1.

    If

    τT2{x1x2,x2x1}T2τnπ(An)ν(IxT01×IxT11,xT12×IτT2,xT23×Iτn)\displaystyle\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{\tau_{n}})
    =τT2{x1x2,x2x1}T2xT02xT01τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)=0\displaystyle=\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})=0

    where the second equality follows by definition of ν\nu, then ν0\nu\geq 0 implies that for each xT02xT01x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1} and τT2{x1x2,x2x1}T2\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}, τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)=0\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})=0. By definition of ν\nu, it follows that ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)=0\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})=0. Furthermore, Joint BM Nonnegativity implies m(xT01,;xT12,{x1}T1;xT23,AT2;xn,An)=0m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n},A_{n})=0 for each xnAnCx_{n}\in A_{n}^{C}. Hence,

    τT2{x1x2,x2x1}T2τnπ(An)ν(IxT01×IxT11,xT12×IτT2,xT23×Iτn,xn)\displaystyle\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{\tau_{n},x_{n}})
    =τT2{x1x2,x2x1}T2xT02xT01τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)\displaystyle=\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})
    =0=m(xT01,;xT12,{x1}T1;xT23,AT2;xn,An)\displaystyle=0=m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n},A_{n})

    as desired.

  2. 2.

    If

    τT2{x1x2,x2x1}T2xT02xT01τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)>0\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})>0

    Let T>={τT2{x1x2,x2x1}T2,xT02xT01:τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)>0}T_{>}=\{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}},x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}:\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})>0\} and T={τT2{x1x2,x2x1}T2,xT02xT01:τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)=0}T=\{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}},x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}:\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})=0\}. For (τT2,xT02)T(\tau_{T_{2}},x_{T_{0}}^{2})\in T, it follows that ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)=0\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})=0 for each τnπ(An)\tau_{n}\in\pi(A_{n}) and therefore

    m((x2,{x1})T0T1;τT2,2,{τ1}T2;xn,An)=τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)=0\displaystyle m((x^{2},\{x^{1}\})_{T_{0}\cup T_{1}};\tau_{T_{2},2},\{\tau_{1}\}_{T_{2}};x_{n},A_{n})=\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})=0

    where the first equality follows by p(1,,1,kn)p(1,\ldots,1,k_{n}). Hence,

    τT2{x1x2,x2x1}T2τnπ(An)ν(IxT01×IxT11,xT12×IτT2,xT23×Iτn,xn)\displaystyle\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}},x_{T_{2}}^{3}}\times I_{\tau_{n},x_{n}})
    =τT2{x1x2,x2x1}T2xT02xT01τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)\displaystyle=\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})
    =(τT2,xT02)T>τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn,xn)\displaystyle=\sum_{(\tau_{T_{2}},x_{T_{0}}^{2})\in T_{>}}\sum_{\tau_{n}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n},x_{n}})
    =(τT2,xT02)T>τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)m((x2,{x1})T0T1;τT2,2,{τ1}T2;xn,An)τnπ(An)ν(IxT01,xT02×IxT11,xT12×IτT2×Iτn)\displaystyle=\sum_{(\tau_{T_{2}},x_{T_{0}}^{2})\in T_{>}}\sum_{\tau_{n}\in\pi(A_{n})}\frac{\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}})m((x^{2},\{x^{1}\})_{T_{0}\cup T_{1}};\tau_{T_{2},2},\{\tau_{1}\}_{T_{2}};x_{n},A_{n})}{\sum_{\tau_{n}^{\prime}\in\pi(A_{n})}\nu(I_{x_{T_{0}}^{1},x_{T_{0}}^{2}}\times I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{\tau_{T_{2}}}\times I_{\tau_{n}^{\prime}})}
    =(τT2,xT02)T>m((x2,{x1})T0T1;τT2,2,{τ1}T2;xn,An)\displaystyle=\sum_{(\tau_{T_{2}},x_{T_{0}}^{2})\in T_{>}}m((x^{2},\{x^{1}\})_{T_{0}\cup T_{1}};\tau_{T_{2},2},\{\tau_{1}\}_{T_{2}};x_{n},A_{n})
    =τT2{x1x2,x2x1}T2xT02xT01m((x2,{x1})T0T1;τT2,2,{τ1}T2;xn,An)\displaystyle=\sum_{\tau_{T_{2}}\in\{x^{1}x^{2},x^{2}x^{1}\}_{T_{2}}}\sum_{x_{T_{0}}^{2}\neq\neq x_{T_{0}}^{1}}m((x^{2},\{x^{1}\})_{T_{0}\cup T_{1}};\tau_{T_{2},2},\{\tau_{1}\}_{T_{2}};x_{n},A_{n})
    =m(xT01,;xT12,{x1}T1;xT23,AT2;xn,An)\displaystyle=m(x_{T_{0}}^{1},\emptyset;x_{T_{1}}^{2},\{x^{1}\}_{T_{1}};x_{T_{2}}^{3},A_{T_{2}};x_{n},A_{n})

    as desired.

Definition 4.13.

For any 1tn1\leq t\leq n and k=(k1,,kn)k=(k_{1},\ldots,k_{n}) with 0<ks|Xs|0<k_{s}\leq|X_{s}| and kt<|Xt|k_{t}<|X_{t}|, the second additive property 𝐩2(𝐭,𝐤)\boldsymbol{p_{2}(t,k)} holds if for all AA with |As|=ks|A_{s}|=k_{s} and τ×s=1nπ(As)\tau\in\times_{s=1}^{n}\pi(A_{s}),

xtAtCν(Iτt×Iτt,xt)=ν(Iτt×Iτt)\sum_{x_{t}\in A_{t}^{C}}\nu(I_{\tau_{-t}}\times I_{\tau_{t},x_{t}})=\nu(I_{\tau_{-t}}\times I_{\tau_{t}})
Claim 4.14.

p2(t,k)p_{2}(t,k) holds for all 1tn1\leq t\leq n and kk with 0<ks|Xs|0<k_{s}\leq|X_{s}| and kt<|Xt|k_{t}<|X_{t}|.

Proof.

Fix any 1tn1\leq t\leq n and kk with 0<ks|Xs|0<k_{s}\leq|X_{s}| and kt<|Xt|k_{t}<|X_{t}|. The following cases are exhaustive.

  1. 1.

    (t=nt=n and ks=2k_{s}=2 for all 1s<n1\leq s<n): Fix any AA with |As|=2|A_{s}|=2 for 1s<n1\leq s<n and |An|=kn|A_{n}|=k_{n}, and fix τ×s=1nπ(As)\tau\in\times_{s=1}^{n}\pi(A_{s}). Label τs=xs1,xs2\tau_{s}=x_{s}^{1},x_{s}^{2} for 1s<n1\leq s<n. There are two subcases: if

    τnπ(An)ν(Ixn1,xn2×Iτn)=0\sum_{\tau_{n}^{\prime}\in\pi(A_{n})}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}^{\prime}})=0

    then ν(Ixn1,xn2×Iτn,xn)=0\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}})=0 for each xnAnCx_{n}\in A_{n}^{C} by definition of ν\nu and ν(Ixn1,xn2×Iτn)=0\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}^{\prime}})=0 for each τnπ(An)\tau_{n}^{\prime}\in\pi(A_{n}) since ν0\nu\geq 0. Hence,

    xnAnCν(Ixn1,xn2×Iτn,xn)=0=ν(Ixn1,xn2×Iτn)\displaystyle\sum_{x_{n}\in A_{n}^{C}}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}})=0=\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})

    as desired. Else,

    xnAnCν(Ixn1,xn2×Iτn,xn)=xnAnC[ν(Ixn1,xn2×Iτn)m(xn2,{x}n1;xn,An)τnπ(An)ν(Ixn1,xn2×Iτn)]\displaystyle\sum_{x_{n}\in A_{n}^{C}}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}})=\sum_{x_{n}\in A_{n}^{C}}\Bigg{[}\frac{\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})m(x_{-n}^{2},\{x\}_{-n}^{1};x_{n},A_{n})}{\sum_{\tau_{n}^{\prime}\in\pi(A_{n})}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}^{\prime}})}\Bigg{]}
    =ν(Ixn1,xn2×Iτn)[xnAnCm(xn2,{x}n1;xn,An)τnπ(An)ν(Ixn1,xn2×Iτn)]\displaystyle=\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})\Bigg{[}\frac{\sum_{x_{n}\in A_{n}^{C}}m(x_{-n}^{2},\{x\}_{-n}^{1};x_{n},A_{n})}{\sum_{\tau_{n}^{\prime}\in\pi(A_{n})}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}^{\prime}})}\Bigg{]}
    =ν(Ixn1,xn2×Iτn)\displaystyle=\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})

    where the last equality follows from p1(1,,1,kn1)p_{1}(1,\ldots,1,k_{n}-1).181818Note that 0kn1<|Xn|0\leq k_{n}-1<|X_{n}|. See the proof of the first inductive step of Claim 4.12.

  2. 2.

    (t=nt=n and ks2k_{s}\neq 2 for some 1s<n1\leq s<n): Fix any AA with |As|=ks|A_{s}|=k_{s} for all 1sn1\leq s\leq n, and fix τ×s=1nπ(As)\tau\in\times_{s=1}^{n}\pi(A_{s}). For i=1,2,3i=1,2,3, let Si={1s<n:ks=i}S_{i}=\{1\leq s<n:k_{s}=i\} and note that these form a partition of {1,,n1}\{1,\ldots,n-1\}. Label τS1=xS11\tau_{S_{1}}=x_{S_{1}}^{1}, τS2=xS21,xS22\tau_{S_{2}}=x_{S_{2}}^{1},x_{S_{2}}^{2}, and τS3=xS31,xS32,xS33\tau_{S_{3}}=x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}. It follows that

    xnAnCν(Iτn×Iτn,xn)=xnAnCν(IxS11×IxS21,xS22×IxS31,xS32,xS33×Iτn,xn)\displaystyle\sum_{x_{n}\in A_{n}^{C}}\nu(I_{\tau_{-n}}\times I_{\tau_{n},x_{n}})=\sum_{x_{n}\in A_{n}^{C}}\nu(I_{x_{S_{1}}^{1}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}}\times I_{\tau_{n},x_{n}})
    =xS12xS11xnAnCν(IxS11,xS12×IxS21,xS22×IxS31,xS32×Iτn,xn)\displaystyle=\sum_{x_{S_{1}}^{2}\neq\neq x_{S_{1}}^{1}}\sum_{x_{n}\in A_{n}^{C}}\nu(I_{x_{S_{1}}^{1},x_{S_{1}}^{2}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2}}\times I_{\tau_{n},x_{n}})
    =xS12xS11ν(IxS11,xS12×IxS21,xS22×IxS31,xS32×Iτn)\displaystyle=\sum_{x_{S_{1}}^{2}\neq\neq x_{S_{1}}^{1}}\nu(I_{x_{S_{1}}^{1},x_{S_{1}}^{2}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2}}\times I_{\tau_{n}})
    =ν(IxS11×IxS21,xS22×IxS31,xS32,xS33×Iτn)=ν(Iτn×Iτn)\displaystyle=\nu(I_{x_{S_{1}}^{1}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}}\times I_{\tau_{n}})=\nu(I_{\tau_{-n}}\times I_{\tau_{n}})

    where the first and last equalities follow by definition of SiS_{i} and τn\tau_{-n}, the second and fourth follow by definition of ν\nu, and the third follows from my proof of p2(2,,2,kn)p_{2}(2,\ldots,2,k_{n}) from above.

  3. 3.

    (1t<n1\leq t<n): Fix any AA with |As|=ks|A_{s}|=k_{s} and τ×s=1nπ(As)\tau\in\times_{s=1}^{n}\pi(A_{s}). For i=1,2,3i=1,2,3, let Si={1s<n:ks=i}S_{i}=\{1\leq s<n:k_{s}=i\} and note that these form a partition of {1,,n1}\{1,\ldots,n-1\}. Label τS1=xS11\tau_{S_{1}}=x_{S_{1}}^{1}, τS2=xS21,xS22\tau_{S_{2}}=x_{S_{2}}^{1},x_{S_{2}}^{2}, and τS3=xS31,xS32,xS33\tau_{S_{3}}=x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}. Since 0<kt<30<k_{t}<3, there are two subcases. If kt=1k_{t}=1, label τt=xt1\tau_{t}=x_{t}^{1}. Then by definition of ν\nu,

    xtxt1ν(Iτt×Ixt1,xt)=xtxt1ν(IxS1\{t}1×IxS21,xS22×IxS31,xS32,xS33×Iτn×Ixt1,xt)\displaystyle\sum_{x_{t}\neq x_{t}^{1}}\nu(I_{\tau_{-t}}\times I_{x_{t}^{1},x_{t}})=\sum_{x_{t}\neq x_{t}^{1}}\nu(I_{x_{S_{1}\backslash\{t\}}^{1}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}}\times I_{\tau_{n}}\times I_{x_{t}^{1},x_{t}})
    =xtxt1xS1\{t}2xS1\{t}1ν(IxS1\{t}1,xS1S1\{t}2×IxS21,xS22×IxS31,xS32×Iτn×Ixt1,xt)\displaystyle=\sum_{x_{t}\neq x_{t}^{1}}\sum_{x_{S_{1}\backslash\{t\}}^{2}\neq\neq x_{S_{1}\backslash\{t\}}^{1}}\nu(I_{x_{S_{1}\backslash\{t\}}^{1},x_{S_{1}S_{1}\backslash\{t\}}^{2}}\times I_{x_{S_{2}}^{1},x_{S_{2}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2}}\times I_{\tau_{n}}\times I_{x_{t}^{1},x_{t}})
    =ν(Iτt×Ixt1)\displaystyle=\nu(I_{\tau_{-t}}\times I_{x_{t}^{1}})

    Similarly, if kt=2k_{t}=2 label τt=xt1,xt2\tau_{t}=x_{t}^{1},x_{t}^{2}. Then by definition of ν\nu,

    ν(Iτt×Ixt1,xt2,xt3)=ν(IxS11×IxS2\{t}1,xS2\{t}2×IxS31,xS32,xS33×Iτn×Ixt1,xt2,xt3)\displaystyle\nu(I_{\tau_{-t}}\times I_{x_{t}^{1},x_{t}^{2},x_{t}^{3}})=\nu(I_{x_{S_{1}}^{1}}\times I_{x_{S_{2}\backslash\{t\}}^{1},x_{S_{2}\backslash\{t\}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2},x_{S_{3}}^{3}}\times I_{\tau_{n}}\times I_{x_{t}^{1},x_{t}^{2},x_{t}^{3}})
    =xS12xS11ν(IxS11,xS12×IxS2\{t}1,xS2\{t}2×IxS31,xS32×Iτn×Ixt1,xt2)=ν(Iτt×Ixt1,xt2)\displaystyle=\sum_{x_{S_{1}}^{2}\neq\neq x_{S_{1}}^{1}}\nu(I_{x_{S_{1}}^{1},x_{S_{1}}^{2}}\times I_{x_{S_{2}\backslash\{t\}}^{1},x_{S_{2}\backslash\{t\}}^{2}}\times I_{x_{S_{3}}^{1},x_{S_{3}}^{2}}\times I_{\tau_{n}}\times I_{x_{t}^{1},x_{t}^{2}})=\nu(I_{\tau_{-t}}\times I_{x_{t}^{1},x_{t}^{2}})

    as desired.

Next, define μ:2P\mu:2^{P}\rightarrow\mathbb{R} as

μ():=ν(Ixn1,xn2,xn3×Ixn1,,xn|Xn|)\displaystyle\mu(\succ):=\nu(I_{x_{-n}^{1},x_{-n}^{2},x_{-n}^{3}}\times I_{x_{n}^{1},\ldots,x_{n}^{|X_{n}|}})

for the (unique) xn1,xn2,xn3x_{-n}^{1},x_{-n}^{2},x_{-n}^{3} and xn1,,xn|Xn|x_{n}^{1},\ldots,x_{n}^{|X_{n}|} satisfying xt1txt2txt3x_{t}^{1}\succ_{t}x_{t}^{2}\succ_{t}x_{t}^{3} for all 1t<n1\leq t<n and xn1nnxn|Xn|x_{n}^{1}\succ_{n}\cdots\succ_{n}x_{n}^{|X_{n}|}, and μ(C)=Cμ()\mu(C)=\sum_{\succ\in C}\mu(\succ) for all C2PC\in 2^{P}.

Claim 4.15.

μ\mu is a probability measure.

Proof.

Since μ0\mu\geq 0, it suffices to show that Pμ()=1\sum_{\succ\in P}\mu(\succ)=1. Rewriting this sum yields

Pμ()=τ×π(Xt)ν(Iτ)=yXτ×π(Xt\{yt})ν(Iτ,y)=yXm(y,(Xt\{yt}))\displaystyle\sum_{\succ\in P}\mu(\succ)=\sum_{\tau\in\times\pi(X_{t})}\nu(I_{\tau})=\sum_{y\in X}\sum_{\tau^{\prime}\in\times\pi(X_{t}\backslash\{y_{t}\})}\nu(I_{\tau^{\prime},y})=\sum_{y\in X}m(y,(X_{t}\backslash\{y_{t}\}))
=yXB({yt})(1)|Bt|np(y,B)=B:|Bt|1(1)|Bt|nyBp(y,B)=B:|Bt|1(1)|Bt|n\displaystyle=\sum_{y\in X}\sum_{B\geq(\{y_{t}\})}(-1)^{\sum|B_{t}|-n}p(y,B)=\sum_{B:|B_{t}|\geq 1}(-1)^{\sum|B_{t}|-n}\sum_{y\in B}p(y,B)=\sum_{B:|B_{t}|\geq 1}(-1)^{\sum|B_{t}|-n}
=B:|Bt|1t=1n(1)|Bt|1=t=1nBt:|Bt|1(1)|Bt|1=t=1nk=1|Xt|(1)k1(|Xt|k)=1\displaystyle=\sum_{B:|B_{t}|\geq 1}\prod_{t=1}^{n}(-1)^{|B_{t}|-1}=\prod_{t=1}^{n}\sum_{B_{t}:|B_{t}|\geq 1}(-1)^{|B_{t}|-1}=\prod_{t=1}^{n}\sum_{k=1}^{|X_{t}|}(-1)^{k-1}\binom{|X_{t}|}{k}=1

where the second equality holds because permuting XtX_{t} is equivalent to choosing the last element and permuting the remaining |Xt|1|X_{t}|-1 elements, the third equality holds by p1((|Xt|1))p_{1}((|X_{t}|-1)), the fifth equality follows from yXy\in X and B({yt})B\geq(\{y_{t}\}) iff |Bt|1|B_{t}|\geq 1 and yBy\in B, and the ninth equality follows from 1k=1n(1)k1(nk)=k=0n(1)k(nk)=(11)n=01-\sum_{k=1}^{n}(-1)^{k-1}\binom{n}{k}=\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}=(1-1)^{n}=0 for all n1n\geq 1.

Claim 4.16.

μ\mu extends ν\nu.

Proof.

The proof proceeds by induction on the length of tt-cylinders’ kk-sequences. Base case (kt=2k_{t}=2 for all 1t<n1\leq t<n, kn=|Xn|k_{n}=|X_{n}|): for any xn1xn2Xnx_{-n}^{1}\neq\neq x_{-n}^{2}\in X_{-n} and τnπ(Xn)\tau_{n}\in\pi(X_{n}),

μ(Ixn1,xn2×Iτn)=ν(Ixn1,xn2,xn3×Iτn)=ν(Ixn1,xn2×Iτn)\displaystyle\mu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})=\nu(I_{x_{-n}^{1},x_{-n}^{2},x_{-n}^{3}}\times I_{\tau_{n}})=\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})

where the first equality follows because Ixn1,xn2×Iτn={}I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}}=\{\succ\} for the (unique) \succ satisfying xn1nxn2nxn3x_{-n}^{1}\succ_{-n}x_{-n}^{2}\succ_{-n}x_{-n}^{3} and τn1nnτn|Xn|\tau_{n}^{1}\succ_{n}\cdots\succ_{n}\tau_{n}^{|X_{n}|} and the second equality holds by definition of ν\nu.

First inductive step (kt=2k_{t}=2 for all 1t<n1\leq t<n, 1kn<|Xn|1\leq k_{n}<|X_{n}|): for any xn1xn2Xnx_{-n}^{1}\neq\neq x_{-n}^{2}\in X_{-n}, AnA_{n} with |An|=kn|A_{n}|=k_{n}, and τnπ(An)\tau_{n}\in\pi(A_{n}),

μ(Ixn1,xn2×Iτn)=xnAnCμ(Ixn1,xn2×Iτn,xn)=xnAnCν(Ixn1,xn2×Iτn,xn)=ν(Ixn1,xn2×Iτn)\displaystyle\mu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})=\sum_{x_{n}\in A_{n}^{C}}\mu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}})=\sum_{x_{n}\in A_{n}^{C}}\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}})=\nu(I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}})

where the first equality holds because Ixn1,xn2×Iτn=xnAnC(Ixn1,xn2×Iτn,xn)I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n}}=\bigcup_{x_{n}\in A_{n}^{C}}\big{(}I_{x_{-n}^{1},x_{-n}^{2}}\times I_{\tau_{n},x_{n}}\big{)} holds and is a disjoint union, the second equality follows from the inductive hypothesis, and the third equality follows from p2(n,2,,2,kn)p_{2}(n,2,\ldots,2,k_{n}).

Second inductive step (1kt31\leq k_{t}\leq 3 for all 1t<n1\leq t<n with kt2k_{t}\neq 2 for some 1t<n1\leq t<n, 1kn|Xn|1\leq k_{n}\leq|X_{n}|): fix any AA with |At|=kt|A_{t}|=k_{t} and any τ×π(At)\tau\in\times\pi(A_{t}). For i=1,2,3i=1,2,3, let Ti={1t<n:kt=i}T_{i}=\{1\leq t<n:k_{t}=i\} and note that these form a partition of {1,,n1}\{1,\ldots,n-1\}. Label τT1=xT11\tau_{T_{1}}=x_{T_{1}}^{1}, τT2=xT21,xT22\tau_{T_{2}}=x_{T_{2}}^{1},x_{T_{2}}^{2}, and τT3=xT31,xT32,xT33\tau_{T_{3}}=x_{T_{3}}^{1},x_{T_{3}}^{2},x_{T_{3}}^{3}. Hence,

μ(Iτ)=μ(IxT11×IxT21,xT22×IxT31,xT32,xT33×Iτn)=xT12xT11μ(IxT11,xT12×IxT21,xT22×IxT31,xT32×Iτn)\displaystyle\mu(I_{\tau})=\mu(I_{x_{T_{1}}^{1}}\times I_{x_{T_{2}}^{1},x_{T_{2}}^{2}}\times I_{x_{T_{3}}^{1},x_{T_{3}}^{2},x_{T_{3}}^{3}}\times I_{\tau_{n}})=\sum_{x_{T_{1}}^{2}\neq\neq x_{T_{1}}^{1}}\mu(I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{x_{T_{2}}^{1},x_{T_{2}}^{2}}\times I_{x_{T_{3}}^{1},x_{T_{3}}^{2}}\times I_{\tau_{n}})
=xT12xT11ν(IxT11,xT12×IxT21,xT22×IxT31,xT32×Iτn)=ν(Iτ)\displaystyle=\sum_{x_{T_{1}}^{2}\neq\neq x_{T_{1}}^{1}}\nu(I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{x_{T_{2}}^{1},x_{T_{2}}^{2}}\times I_{x_{T_{3}}^{1},x_{T_{3}}^{2}}\times I_{\tau_{n}})=\nu(I_{\tau})

where the second equality holds because IxT11×IxT21,xT22×IxT31,xT32,xT33×Iτn=xT12xT11IxT11,xT12×IxT21,xT22×IxT31,xT32×IτnI_{x_{T_{1}}^{1}}\times I_{x_{T_{2}}^{1},x_{T_{2}}^{2}}\times I_{x_{T_{3}}^{1},x_{T_{3}}^{2},x_{T_{3}}^{3}}\times I_{\tau_{n}}=\bigcup_{x_{T_{1}}^{2}\neq\neq x_{T_{1}}^{1}}I_{x_{T_{1}}^{1},x_{T_{1}}^{2}}\times I_{x_{T_{2}}^{1},x_{T_{2}}^{2}}\times I_{x_{T_{3}}^{1},x_{T_{3}}^{2}}\times I_{\tau_{n}} holds and is a disjoint union, the third equality follows from the first inductive step, and the fourth equality follows by definition of ν\nu. ∎

Claim 4.17.

μ(E(x,A))=m(x,A)\mu(E(x,A))=m(x,A) for all A<<XA<<X and xACx\in A^{C}.

Proof.

Fix any A<<XA<<X and xACx\in A^{C}, and let T={1tn:kt=0}T=\{1\leq t\leq n:k_{t}=0\}. Then

μ(E(x,A))=τ×Tπ(At)μ(Iτ,xT×IxT)=τ×Tπ(At)ν(Iτ,xT×IxT)=m(x,A)\displaystyle\mu(E(x,A))=\sum_{\tau\in\times_{-T}\pi(A_{t})}\mu(I_{\tau,x_{-T}}\times I_{x_{T}})=\sum_{\tau\in\times_{-T}\pi(A_{t})}\nu(I_{\tau,x_{-T}}\times I_{x_{T}})=m(x,A)

where the first equality holds because E(x,A)=τ×Tπ(At)Iτ,xT×IxTE(x,A)=\bigcup_{\tau\in\times_{-T}\pi(A_{t})}I_{\tau,x_{-T}}\times I_{x_{T}} holds and is a disjoint union, the second equality follows from Claim 4.16, and the third equality follows from p1(|A1|,,|An|)p_{1}(|A_{1}|,\ldots,|A_{n}|). ∎

By Proposition 3.1, I conclude μ\mu is an SU representation of ρ\rho. ∎

4.6 Proof of Corollary 3.8

Define ν\nu and μ\mu as in the proof of Theorem 3.7. Since Joint BM Positivity implies Joint BM Nonnegativity, μ\mu is an SU representation. By construction, ν>0\nu>0, and hence μ>0\mu>0.

4.7 Proof of Theorem 3.13

Proof.

(\implies): I have shown that SU implies Joint BM Nonnegativity and Marginal Consistency (Theorem 3.7), Joint BM Nonnegativity implies Partial Marginal and Conditional BM Nonnegativity (directly following Axioms 3.9 and 3.10), and Marginal Consistency implies (n)(-n)-Marginal Consistency (directly following Axiom 3.11). Hence, it remains to show that SU implies (n)(-n)-Conditional Consistency.

Let μ\mu be an SU representation and fix any xnAnnx_{n}\in A_{n}\in\mathcal{M}_{n} and (A,x)nn1(A,x)^{-n}\in\mathcal{H}_{n-1} with partition (yni,{xi}n)iI(y_{-n}^{i},\{x^{i}\}_{-n})_{i\in I}. Since |Xt|=3|X_{t}|=3 for all 1t<n1\leq t<n, I can identify each index ii with the (unique) preference tuple niPn\succ_{-n}^{i}\in P_{-n} satisfying En(yni,{xi}n)={ni}E_{-n}(y_{-n}^{i},\{x^{i}\}_{-n})=\{\succ_{-n}^{i}\}. Let (ni):={ni}×Pn=E(yi,{xi})n(\succ_{-n}^{i}):=\{\succ_{-n}^{i}\}\times P_{n}=E(y^{i},\{x^{i}\})^{-n} and define II^{\prime} as before: by Proposition 3.1, I={iI:μ(ni)>0}I^{\prime}=\{i\in I:\mu(\succ_{-n}^{i})>0\}, since μ(ni)=xnXnμ((ni)E(xn,))=xnXnm(yni,{xi}n;xn,)=m(yni,{xi}n)\mu(\succ_{-n}^{i})=\sum_{x_{n}\in X_{n}}\mu((\succ_{-n}^{i})\cap E(x_{n},\emptyset))=\sum_{x_{n}\in X_{n}}m(y_{-n}^{i},\{x^{i}\}_{-n};x_{n},\emptyset)=m(y_{-n}^{i},\{x^{i}\}_{-n}). Hence,

ρn(xn,An|(A,x)n)=μ(C(xn,An)|C(x,A)n)\displaystyle\rho_{n}(x_{n},A_{n}|(A,x)^{-n})=\mu(C(x_{n},A_{n})|C(x,A)^{-n})
=nC(x,A)n:μ(n)>0μ(C(xn,An)|n)μ(n|C(x,A)n)\displaystyle=\sum_{\succ_{-n}\in C(x,A)^{-n}:\mu(\succ_{-n})>0}\mu(C(x_{n},A_{n})|\succ_{-n})\mu(\succ_{-n}|C(x,A)^{-n})
=iIμ(C(xn,An)|ni)μ(ni|C(x,A)n)\displaystyle=\sum_{i\in I^{\prime}}\mu(C(x_{n},A_{n})|\succ_{-n}^{i})\mu(\succ_{-n}^{i}|C(x,A)^{-n})
=iIDnAnCμ((ni)E(xn,Dn))μ(ni)μ(ni)μ(C(x,A)n)\displaystyle=\sum_{i\in I^{\prime}}\frac{\sum_{D_{n}\subseteq A_{n}^{C}}\mu((\succ_{-n}^{i})\cap E(x_{n},D_{n}))}{\mu(\succ_{-n}^{i})}\frac{\mu(\succ_{-n}^{i})}{\mu(C(x,A)^{-n})}
=iIρn(xn,An|(yi,{xi})n)m(yni,{xi}n)pn(xn,An)\displaystyle=\sum_{i\in I^{\prime}}\rho_{n}(x_{n},A_{n}|(y^{i},\{x^{i}\})^{-n})\frac{m(y_{-n}^{i},\{x^{i}\}_{-n})}{p_{-n}(x_{-n},A_{-n})}

where the second equality follows from the Law of Total Probability, the fourth equality follows because DnAnC((ni)E(xn,Dn))=(ni)DnAnCE(xn,Dn)=(ni)C(xn,An)\bigcup_{D_{n}\subseteq A_{n}^{C}}((\succ_{-n}^{i})\cap E(x_{n},D_{n}))=(\succ_{-n}^{i})\cap\bigcup_{D_{n}\subseteq A_{n}^{C}}E(x_{n},D_{n})=(\succ_{-n}^{i})\cap C(x_{n},A_{n}) is a disjoint union, and the fifth equality follows by definition of ρn(xn,An|(yi,{xi})n)\rho_{n}(x_{n},A_{n}|(y^{i},\{x^{i}\})^{-n}) and by Proposition 3.1, since μ((ni)E(xn,Dn))=μ(E(yni;{xi}n;xn,Dn))=m(yni;{xi}n;xn,Dn)\mu((\succ_{-n}^{i})\cap E(x_{n},D_{n}))=\mu(E(y_{-n}^{i};\{x^{i}\}_{-n};x_{n},D_{n}))=m(y_{-n}^{i};\{x^{i}\}_{-n};x_{n},D_{n}) and μ(C(x,A)n)=μ(C(xn,An;xn,{xn})=p(xn,An;xn,{xn})=pn(xn,An)\mu(C(x,A)^{-n})=\mu(C(x_{-n},A_{-n};x_{n},\{x_{n}\})=p(x_{-n},A_{-n};x_{n},\{x_{n}\})=p_{-n}(x_{-n},A_{-n}).

(\impliedby): Since |Xt|=3|X_{t}|=3 for all 1t<n1\leq t<n,

m(yn,{x}n)=Bn{x}nC(1)t=1n1|Bt|2(n1)pn(yn,Bn)0\displaystyle m(y_{-n},\{x\}_{-n})=\sum_{B_{-n}\geq\{x\}_{-n}^{C}}(-1)^{\sum_{t=1}^{n-1}|B_{t}|-2(n-1)}p_{-n}(y_{-n},B_{-n})\geq 0
Bn{x}nC:t=1n1|Bt|evenpn(yn,Bn)Bn{x}nC:t=1n1|Bt|oddpn(yn,Bn)\displaystyle\iff\sum_{B_{-n}\geq\{x\}_{-n}^{C}:\sum_{t=1}^{n-1}|B_{t}|\text{even}}p_{-n}(y_{-n},B_{-n})\geq\sum_{B_{-n}\geq\{x\}_{-n}^{C}:\sum_{t=1}^{n-1}|B_{t}|\text{odd}}p_{-n}(y_{-n},B_{-n})

and hence Partial Marginal BM Nonnegativity is equivalent to Joint Supermodularity of pnp_{-n}. By definition, (n)(-n)-Marginal Consistency is equivalent to Marginal Consistency of pnp_{-n}. Hence, by Theorem 3.4, ρn\rho_{-n} has a unique SU representation μnΔ(Pn)\mu_{-n}\in\Delta(P_{-n}).

Next, fix any ynxnXny_{-n}\neq\neq x_{-n}\in X_{-n} with m(yn,{x}n)>0m(y_{-n},\{x\}_{-n})>0, and identify (yn,{x}n)(y_{-n},\{x\}_{-n}) with the (unique) n\succ_{-n} such that {n}=En(yn,{x}n)\{\succ_{-n}\}=E_{-n}(y_{-n},\{x\}_{-n}). For any xnAnnx_{n}\in A_{n}\in\mathcal{M}_{n}, Lemma 4.6 and Partial Conditional BM Nonnegativity imply

ρn(xn,An|(y,{x})n)=BnAnCm(xn,Bn|(y,{x})n)0\rho_{n}(x_{n},A_{n}|(y,\{x\})^{-n})=\sum_{B_{n}\subseteq A_{n}^{C}}m(x_{n},B_{n}|(y,\{x\})^{-n})\geq 0

Furthermore, by Lemma 4.9,

xnAnρn(xn,An|(y,{x})n)=1\sum_{x_{n}\in A_{n}}\rho_{n}(x_{n},A_{n}|(y,\{x\})^{-n})=1

Hence, ρn(|(y,{x})n)\rho_{n}(\cdot|(y,\{x\})^{-n}) is an SCF. By Block et al. (1959)’s characterization of static RU, Partial Conditional BM Nonnegativity implies that ρn(|(y,{x})n)\rho_{n}(\cdot|(y,\{x\})^{-n}) has an RU representation μnΔ(Pn)\mu^{\succ_{-n}}\in\Delta(P_{n}). For n\succ_{-n} with μn(n)=0\mu_{-n}(\succ_{-n})=0, define μnΔ(Pn)\mu^{\succ_{-n}}\in\Delta(P_{n}) to be any fixed probability measure over PnP_{n}.

Next, define μ:2P\mu:2^{P}\rightarrow\mathbb{R} as

μ():=μn(n)μn(n)\mu(\succ):=\mu_{-n}(\succ_{-n})\mu^{\succ_{-n}}(\succ_{n})

and μ(C):=Cμ()\mu(C):=\sum_{\succ\in C}\mu(\succ) for all (non-singleton) C2PC\in 2^{P}. By definition of μ\mu, μ0\mu\geq 0 and

Pμ()=nPnμn(n)nPnμn(n)=nPnμn(n)=1\sum_{\succ\in P}\mu(\succ)=\sum_{\succ_{-n}\in P_{-n}}\mu_{-n}(\succ_{-n})\sum_{\succ_{n}\in P_{n}}\mu^{\succ_{-n}}(\succ_{n})=\sum_{\succ_{-n}\in P_{-n}}\mu_{-n}(\succ_{-n})=1

Hence, μ\mu is a probability measure over PP. More generally, for any CnPnC_{-n}\subseteq P_{-n} and CnPnC_{n}\subseteq P_{n},

μ(Cn×Cn)=nCnμn(n)nCnμn(n)=nCnμn(n)μn(Cn)\mu(C_{-n}\times C_{n})=\sum_{\succ_{-n}\in C_{-n}}\mu_{-n}(\succ_{-n})\sum_{\succ_{n}\in C_{n}}\mu^{\succ_{-n}}(\succ_{n})=\sum_{\succ_{-n}\in C_{-n}}\mu_{-n}(\succ_{-n})\mu^{\succ_{-n}}(C_{n})

Hence, for any x1A11x_{1}\in A_{1}\in\mathcal{M}_{1},

μ(C(x1,A1))=μ(Cn(x1,A1)×Pn)=nCn(x1,A1)μn(n)μn(Pn)\displaystyle\mu(C(x_{1},A_{1}))=\mu(C_{-n}(x_{1},A_{1})\times P_{n})=\sum_{\succ_{-n}\in C_{-n}(x_{1},A_{1})}\mu_{-n}(\succ_{-n})\mu^{\succ_{-n}}(P_{n})
=μn(Cn(x1,A1))=ρ1(x1,A1)\displaystyle=\mu_{-n}(C_{-n}(x_{1},A_{1}))=\rho_{1}(x_{1},A_{1})

since μn\mu_{-n} is an SU representation of ρn\rho_{-n}. Similarly, fix any 1<t<n1<t<n and let T={1,,t}T=\{1,\ldots,t\}, T1=T\{t}T-1=T\backslash\{t\}. Fix any (A,x)T1t1(A,x)^{T-1}\in\mathcal{H}_{t-1} and xtAttx_{t}\in A_{t}\in\mathcal{M}_{t}: then

μ(C(xt,At)|C(A,x)T1)=μ(C(x,A)T)μ(C(x,A)T1)=μ(Cn(x,A)T×Pn)μ(Cn(x,A)T1×Pn)\displaystyle\mu(C(x_{t},A_{t})|C(A,x)^{T-1})=\frac{\mu(C(x,A)^{T})}{\mu(C(x,A)^{T-1})}=\frac{\mu(C_{-n}(x,A)^{T}\times P_{n})}{\mu(C_{-n}(x,A)^{T-1}\times P_{n})}
=μn(Cn(x,A)T)μn(Cn(x,A)T1)=pn(x,A)Tpn(x,A)T1=pt(x,A)Tpt1(x,A)T1=ρt(xt,At|(A,x)T1)\displaystyle=\frac{\mu_{-n}(C_{-n}(x,A)^{T})}{\mu_{-n}(C_{-n}(x,A)^{T-1})}=\frac{p_{-n}(x,A)^{T}}{p_{-n}(x,A)^{T-1}}=\frac{p_{t}(x,A)^{T}}{p_{t-1}(x,A)^{T-1}}=\rho_{t}(x_{t},A_{t}|(A,x)^{T-1})

where the fourth equality holds because μn\mu_{-n} being an SU representation of ρn\rho_{-n} and Cn(x,A)T=Cn(xT,AT;xn\T,{x}n\T)C_{-n}(x,A)^{T}=C_{-n}(x_{T},A_{T};x_{-n\backslash T},\{x\}_{-n\backslash T}) imply μn(Cn(x,A)T)=pn(xT,AT)\mu_{-n}(C_{-n}(x,A)^{T})=p_{-n}(x_{T},A_{T}), and the fifth equality holds because ptp_{t} and pt1p_{t-1} are the marginals of pnp_{n} on ATA_{T} and AT1A_{T-1}, respectively.

Finally, for any (A,x)nn1(A,x)^{-n}\in\mathcal{H}_{n-1} and xnAnnx_{n}\in A_{n}\in\mathcal{M}_{n}, let {ni}iI\{\succ_{-n}^{i}\}_{i\in I} be the unique singleton partition of (A,x)n(A,x)^{-n} and let I:={iI:μn(ni)>0}I^{\prime}:=\{i\in I:\mu_{-n}(\succ_{-n}^{i})>0\}. Identify ni\succ_{-n}^{i} with (yni,{xi}n)(y_{-n}^{i},\{x^{i}\}_{-n}) as before, and define (ni)(\succ_{-n}^{i}) as before. Then

μ(C(xn,An)|C(A,x)n)=iIμ(C(xn,An)|ni)μ(ni|C(x,A)n)\displaystyle\mu(C(x_{n},A_{n})|C(A,x)^{-n})=\sum_{i\in I^{\prime}}\mu(C(x_{n},A_{n})|\succ_{-n}^{i})\mu(\succ_{-n}^{i}|C(x,A)^{-n})
=iIμ(C(xn,An)(ni))μ(ni)μ(ni)μ(C(x,A)n)\displaystyle=\sum_{i\in I^{\prime}}\frac{\mu(C(x_{n},A_{n})\cap(\succ_{-n}^{i}))}{\mu(\succ_{-n}^{i})}\frac{\mu(\succ_{-n}^{i})}{\mu(C(x,A)^{-n})}
=iIμ({ni}×Cn(xn,An))μ(ni)μ(ni)μ(Cn(x,A)n×Pn)\displaystyle=\sum_{i\in I^{\prime}}\frac{\mu(\{\succ_{-n}^{i}\}\times C_{n}(x_{n},A_{n}))}{\mu(\succ_{-n}^{i})}\frac{\mu(\succ_{-n}^{i})}{\mu(C_{-n}(x,A)^{-n}\times P_{n})}
=iIμn(ni)μni(Cn(xn,An))μn(ni)μn(ni)μn(Cn(x,A)n)\displaystyle=\sum_{i\in I^{\prime}}\frac{\mu_{-n}(\succ_{-n}^{i})\mu^{\succ_{-n}^{i}}(C_{n}(x_{n},A_{n}))}{\mu_{-n}(\succ_{-n}^{i})}\frac{\mu_{-n}(\succ_{-n}^{i})}{\mu_{-n}(C_{-n}(x,A)^{-n})}
=iIDnAnCμn(ni)μni(En(xn,Dn))μn(ni)μn(ni)μn(Cn(x,A)n)\displaystyle=\sum_{i\in I^{\prime}}\frac{\sum_{D_{n}\subseteq A_{n}^{C}}\mu_{-n}(\succ_{-n}^{i})\mu^{\succ_{-n}^{i}}(E_{n}(x_{n},D_{n}))}{\mu_{-n}(\succ_{-n}^{i})}\frac{\mu_{-n}(\succ_{-n}^{i})}{\mu_{-n}(C_{-n}(x,A)^{-n})}
=iIDnAnCm(yni,{xi}n)m(xn,Dn|yni,{xi}n)m(yni,{xi}n)m(yni,{xi}n)pn(xn,An)\displaystyle=\sum_{i\in I^{\prime}}\frac{\sum_{D_{n}\subseteq A_{n}^{C}}m(y_{-n}^{i},\{x^{i}\}_{-n})m(x_{n},D_{n}|y_{-n}^{i},\{x^{i}\}_{-n})}{m(y_{-n}^{i},\{x^{i}\}_{-n})}\frac{m(y_{-n}^{i},\{x^{i}\}_{-n})}{p_{-n}(x_{-n},A_{-n})}
=iIρn(xn,An|(yi,{xi})n)m(yni,{xi}n)pn(xn,An)=ρn(xn,An|(A,x)n)\displaystyle=\sum_{i\in I^{\prime}}\rho_{n}(x_{n},A_{n}|(y^{i},\{x^{i}\})^{-n})\frac{m(y_{-n}^{i},\{x^{i}\}_{-n})}{p_{-n}(x_{-n},A_{-n})}=\rho_{n}(x_{n},A_{n}|(A,x)^{-n})

where the fifth equality holds because Cn(xn,An)=DnAnCEn(xn,Dn)C_{n}(x_{n},A_{n})=\bigcup_{D_{n}\subseteq A_{n}^{C}}E_{n}(x_{n},D_{n}) is disjoint, the sixth equality holds by applying Proposition 3.1 to (ρ,p,μ)n(\rho,p,\mu)_{-n} and the static analog of Proposition 3.1 to ρn(|yni,{xi}n)\rho_{n}(\cdot|y_{-n}^{i},\{x^{i}\}_{-n}) and μyni,{xi}n\mu^{y_{-n}^{i},\{x^{i}\}_{-n}} for each iIi\in I^{\prime},191919For the static analog of Proposition 3.1, see Proposition 7.3 of Chambers and Echenique (2016). the seventh equality holds by Lemma 4.9 and by definition of ρn(|yni,{xi}n)\rho_{n}(\cdot|y_{-n}^{i},\{x^{i}\}_{-n}), and the eighth equality holds by (n)(-n)-Conditional Consistency. ∎

4.8 Proof of Theorem 3.15

Proof.

(\implies): Let μ\mu be an SU representation, and fix any (xi,Ai)i=1k(x^{i},A^{i})_{i=1}^{k} with xiAix^{i}\in A^{i}\in\mathcal{M} for each 1ik1\leq i\leq k and any (λi)i=1k(\lambda^{i})_{i=1}^{k}\subseteq\mathbb{R} such that i=1kλi𝟙C(xi,Ai)0\sum_{i=1}^{k}\lambda^{i}\mathbbm{1}_{C(x^{i},A^{i})}\geq 0. Hence,

i=1kλip(xi,Ai)=i=1kλiμ(C(xi,Ai))=𝔼μ[i=1kλi𝟙C(xi,Ai)]0\displaystyle\sum_{i=1}^{k}\lambda^{i}p(x^{i},A^{i})=\sum_{i=1}^{k}\lambda^{i}\mu(C(x^{i},A^{i}))=\mathbb{E}_{\mu}\bigg{[}\sum_{i=1}^{k}\lambda^{i}\mathbbm{1}_{C(x^{i},A^{i})}\bigg{]}\geq 0

where the first equality follows from Proposition 3.1, the second equality follows from linearity of expectation and because the expectation of an indicator random variable of an event is the probability of that event, and the inequality follows because a convex combination of nonnegative numbers is nonnegative.

(\impliedby): Using the notation from Clark (1996), let 𝒜:={C(x,A)}xA\mathcal{A}:=\{C(x,A)\}_{x\in A\in\mathcal{M}} and note that P=C(x,{x}N)𝒜P=C(x,\{x\}_{N})\in\mathcal{A}. Let 𝒜^\hat{\mathcal{A}} be the algebra generated by 𝒜\mathcal{A}. First, I show 𝒜^=2P\hat{\mathcal{A}}=2^{P}. To show this, it suffices to show that 𝒜^\hat{\mathcal{A}} contains all singletons, since 𝒜^\hat{\mathcal{A}} is closed under finite unions and every event in 2P2^{P} is a finite union of singletons. Fix any P\succ\in P and identify it with its ranking of all elements in each period: (xt1,,xt|Xt|)t=1n(x_{t}^{1},\ldots,x_{t}^{|X_{t}|})_{t=1}^{n}. Define indices t1,,tnt_{1},\ldots,t_{n} such that |Xt1||Xtn||X_{t_{1}}|\leq\cdots\leq|X_{t_{n}}|. Hence,

{}=k=1|Xtn|C(xk,{xtk,,xt|Xt|}N)𝒜^\displaystyle\{\succ\}=\bigcap_{k=1}^{|X_{t_{n}}|}C(x^{k},\{x_{t}^{k},\ldots,x_{t}^{|X_{t}|}\}_{N})\in\hat{\mathcal{A}}

where xtk:=xt|Xt|x_{t}^{k}:=x_{t}^{|X_{t}|} and {xtk,,xt|Xt|}:={xt|Xt|}\{x_{t}^{k},\ldots,x_{t}^{|X_{t}|}\}:=\{x_{t}^{|X_{t}|}\} for k>|Xt|k>|X_{t}|, and since 𝒜^\hat{\mathcal{A}} is closed under finite intersections.202020For example, let n=2n=2 and =(x11x12x13,x21x22x23x24)\succ=(x_{1}^{1}x_{1}^{2}x_{1}^{3},x_{2}^{1}x_{2}^{2}x_{2}^{3}x_{2}^{4}). Then {}=C(x1,X)C(x12,x13;x22,{x22,x23,x24})C(x13,{x13};x23,x24}C(x13,{x13};x24,{x24})\{\succ\}=C(x^{1},X)\cap C(x_{1}^{2},x_{1}^{3};x_{2}^{2},\{x_{2}^{2},x_{2}^{3},x_{2}^{4}\})\cap C(x_{1}^{3},\{x_{1}^{3}\};x_{2}^{3},x_{2}^{4}\}\cap C(x_{1}^{3},\{x_{1}^{3}\};x_{2}^{4},\{x_{2}^{4}\}). Consider p:𝒜p:\mathcal{A}\rightarrow\mathbb{R} as p(C(x,A)):=p(x,A)p(C(x,A)):=p(x,A) and note that p(x,{x}N)=1p(x,\{x\}_{N})=1. By Theorem 1 of Clark (1996), there exists a finitely additive probability measure μ:2P[0,1]\mu:2^{P}\rightarrow[0,1] such that μ(C(x,A))=p(x,A)\mu(C(x,A))=p(x,A) for all xAx\in A\in\mathcal{M}. By Proposition 3.1, μ\mu is a SU representation of ρ\rho. ∎

References

  • Barberá and Pattanaik (1986) Salvador Barberá and Prasanta K. Pattanaik. Falmagne and the rationalizability of stochastic choices in terms of random orderings. Econometrica: Journal of the Econometric Society, pages 707–715, 1986.
  • Block et al. (1959) Henry David Block, Jacob Marschak, et al. Random orderings and stochastic theories of response. Technical report, Cowles Foundation for Research in Economics, Yale University, 1959.
  • Chambers and Echenique (2016) Christopher P. Chambers and Federico Echenique. Revealed preference theory, volume 56. Cambridge University Press, 2016.
  • Chambers et al. (2021) Christopher P. Chambers, Yusufcan Masatlioglu, and Christopher Turansick. Correlated choice. arXiv preprint arXiv:2103.05084, 2021.
  • Clark (1996) Stephen A. Clark. The random utility model with an infinite choice space. Economic Theory, 7(1):179–189, 1996.
  • De Finetti (1937) Bruno De Finetti. La prévision: ses lois logiques, ses sources subjectives. In Annales de l’institut Henri Poincaré, volume 7, pages 1–68, 1937.
  • Falmagne (1978) Jean-Claude Falmagne. A representation theorem for finite random scale systems. Journal of Mathematical Psychology, 18(1):52–72, 1978.
  • Frick et al. (2019) Mira Frick, Ryota Iijima, and Tomasz Strzalecki. Dynamic random utility. Econometrica, 87(6):1941–2002, 2019.
  • Godsil (2018) Chris Godsil. An introduction to the moebius function. arXiv preprint arXiv:1803.06664, 2018.
  • Kashaev and Aguiar (2022) Nail Kashaev and Victor H. Aguiar. Nonparametric analysis of dynamic random utility models. arXiv preprint arXiv:2204.07220, 2022.
  • Kitamura and Stoye (2018) Yuichi Kitamura and Jörg Stoye. Nonparametric analysis of random utility models. Econometrica, 86(6):1883–1909, 2018.
  • Kreps and Porteus (1978) David M. Kreps and Evan L. Porteus. Temporal resolution of uncertainty and dynamic choice theory. Econometrica: journal of the Econometric Society, pages 185–200, 1978.
  • McFadden and Richter (1990) Daniel McFadden and Marcel K. Richter. Stochastic rationality and revealed stochastic preference. Preferences, Uncertainty, and Optimality, Essays in Honor of Leo Hurwicz, Westview Press: Boulder, CO, pages 161–186, 1990.
  • Schacke (2004) Kathrin Schacke. On the kronecker product. Master’s thesis, University of Waterloo, 2004.
  • Sen (1971) Amartya K. Sen. Choice functions and revealed preference. The Review of Economic Studies, 38(3):307–317, 1971.
  • Strzalecki (2021) Tomasz Strzalecki. Stochastic Choice. 2021.
  • Van Lint and Wilson (2001) Jacobus Hendricus Van Lint and Richard Michael Wilson. A course in combinatorics. Cambridge university press, 2001.