
Informational Puts

Andrew Koh
MIT
MIT Department of Economics; email: [email protected]
   Sivakorn Sanguanmoo
MIT
MIT Department of Economics; email: [email protected]
   Kei Uzui
MIT
MIT Department of Economics; email: [email protected]
 
First version: December 2023. We are especially grateful to Drew Fudenberg and Stephen Morris for guidance, support, and many illuminating conversations. We also thank Daron Acemoglu, Matt Elliott, Nobuhiro Kiyotaki, Daniel Luo, Daisuke Oyama, Satoru Takahashi, Iván Werning, Alex Wolitzky, Muhamet Yildiz, as well as audiences at Cambridge University, the 25th ACM Conference on Economics and Computation (EC’24), the Econometric Society North American and Asian Meetings, Nuffield College Oxford, and MIT Finance, Macro, and Theory Lunches for helpful comments.
Abstract

We fully characterize how dynamic information should be provided to uniquely implement the largest equilibrium in dynamic binary-action supermodular games. The designer offers an informational put: she stays silent in good times, but injects asymmetric and inconclusive public information if players lose faith. There is (i) no multiplicity gap: the largest (partially) implementable equilibrium can be implemented uniquely; and (ii) no intertemporal commitment gap: the policy is sequentially optimal. Our results have sharp implications for the design of policy in coordination environments.

1  Introduction

Many economic environments feature (i) uncertainty about a payoff-relevant fundamental state, (ii) coordination motives, and (iii) stochastic opportunities to revise actions. These elements are present across all aspects of social and economic life, e.g., macroeconomics, finance, industrial organization, and political economy.1 In macroeconomics, firms are uncertain about economic conditions, face complementarities (Nakamura and Steinsson, 2010), and change their prices at ticks of a Poisson clock (Calvo, 1983). In finance, creditors are uncertain about the debtor’s profitability/solvency (Goldstein and Pauzner, 2005), have incentive to run if others run (Diamond and Dybvig, 1983), but might only be able to withdraw their debt at staggered intervals (He and Xiong, 2012). In industrial organization, consumers are uncertain about a product’s quality, have incentive to adopt the same product as others (Farrell and Saloner, 1985; Ellison and Fudenberg, 2000), and face stochastic adoption opportunities (Biglaiser, Crémer, and Veiga, 2022).

Equilibria of such games are sensitive to dynamic information. Consider a player who, at any history of the game, finds herself with the opportunity to re-optimize her action. The fundamental state matters for her flow payoffs, so her decision must depend on her current beliefs. Moreover, since she plays the same action until she can next re-optimize, her decision also depends on her beliefs about what future agents will do. But those beliefs depend, in turn, on what she expects future players to learn, as well as their beliefs about the play of agents yet further out into the future. Thus, the stochastic evolution of future beliefs—even those arbitrarily distant—shapes incentives in the present.

We are interested in dynamic information policies which fully implement the largest time path of aggregate play, i.e., implement it as the unique subgame perfect equilibrium of the induced stochastic game. Our main result (Theorem 1) fully characterizes the form, value, and sequential optimality of designer-optimal policies:

  1.

    Form. The form of optimal dynamic information relies on the delivery of carefully chosen off-path information. If players take the designer’s preferred action, the designer stays silent. If, however, agents deviate from a target path of play specified by the policy, the designer injects an asymmetric and inconclusive public signal—this is the informational put.2 This is analogous to the “Fed put” in which the Fed’s history of intervening to halt market downturns has arguably created the belief that they are insured against downside risk (Miller et al., 2002). This is as if the Fed has offered the market a put option as insurance against downturns. In our setting, the designer steps in to inject information when players start switching to action 0 which, as we will show, with high probability induces aggregate play to correct. This is as if the designer has offered players a put option as insurance against strategic uncertainty about the play of future players.

    The signal is asymmetric such that the probability that agents become a little more confident is far higher than the probability that agents become much more pessimistic. These small but high-probability movements in the direction of the dominance region—at which playing the designer’s preferred action is strictly dominant—are chained together such that the unique equilibrium of the subgame is for future players to play the designer-preferred action.3 This is done via a “contagion argument” which can be viewed as the dynamic analog of the interim deletion of strictly dominated strategies in static games of incomplete information. The signal is inconclusive such that, even if agents turn pessimistic, they do not become excessively so—this will be important for sequential optimality.

  2.

    Value. The sequentially optimal policy uniquely implements the upper-bound on the time path of aggregate play. Thus, there is no multiplicity gap: whatever can be implemented partially (i.e., as an equilibrium) can also be implemented fully (i.e., as the unique equilibrium). This is in sharp contrast to recent work on static implementation via information design in supermodular games which finds there generically exists a gap even with private information and the ability to manipulate higher-order beliefs (Morris, Oyama, and Takahashi, 2024), or with both private information and transfers (Halac, Lipnowski, and Rappoport, 2021).

  3.

    Sequential optimality. Our dynamic information policy is constructed such that at every history, the designer has no incentive to deviate.4 With the caveat that for a small set of histories, deviation incentives can be made arbitrarily small. For histories where this is so, this is simply because optimal information policies continuing from those histories do not exist. Nonetheless, this can be approached via a sequence of policies so that the gap vanishes along this sequence. This openness property is also typical of static full implementation environments as highlighted by Morris, Oyama, and Takahashi (2024). Thus, there is no intertemporal commitment gap: whatever can be implemented with ex-ante commitment to the dynamic information structure can also be implemented when the sender can continually re-optimize her dynamic information.5 We further emphasize that sequential optimality is not given—we offer examples of policies which are optimal but not sequentially optimal. Sequential optimality arises through the delicate interaction between properties of our policy: asymmetry, chaining, and inconclusiveness. Asymmetric off-path information is chained together to obtain full implementation at all states in which the designer-preferred action is not strictly dominated. Then, inconclusive off-path information ensures that, even if agents turn pessimistic, full implementation is still guaranteed.

Conceptually, our contribution highlights how off-path information should be optimally deployed to shape on-path incentives. Of course, it is well-known from implementation theory (Moore and Repullo, 1988; Abreu and Matsushima, 1992) that off-path threats are powerful, albeit not sequentially optimal—if the deviation actually occurs, there is no incentive to follow through with the policy.6 With the caveat that in implementation theory, the designer’s objective function is typically not specified: we have in mind an environment in which the designer is a player in the game, and punishing players is costly. See also work on mechanism design with limited commitment (Laffont and Tirole, 1988; Bester and Strausz, 2001; Skreta, 2015; Liu et al., 2019; Doval and Skreta, 2022) and macroeconomics where time-inconsistency plays a crucial role (Halac and Yared, 2014). Information is different in two substantive ways. It is less powerful: beliefs are martingales, which imposes severe constraints on what payoffs can be delivered off-path. But it is also more flexible: the designer has the freedom to design any distribution of off-path beliefs. What should we make of these differences?

First, we will show that off-path information, though less powerful on its own, can be chained together to close the gap between full and partial implementation. Second, the flexibility of off-path information can be exploited to shape the continuation incentives of the designer. This ensures that the designer’s counterfactual selves at zero probability histories are willing to follow through with the promised information. Together, these insights offer a novel and unified treatment of dynamic information design in supermodular games.

Economically, our results have sharp implications for a range of phenomena where coordination and multiple equilibria feature prominently, e.g., in finance (debt runs, currency crises), macroeconomics (price setting), trade and industrial policy (big pushes), industrial organization (network goods), and political economy (revolutions). We briefly discuss this after stating our main result.

Related Literature

Our results relate most closely to recent work on full implementation in supermodular games via information design (Morris, Oyama, and Takahashi, 2024; Inostroza and Pavan, 2023; Li, Song, and Zhao, 2023). In this literature, information design induces non-degenerate higher-order beliefs, and this is important to obtain uniqueness via a “contagion argument” over the type space. By contrast, our dynamic information is public and higher-order beliefs are degenerate, but we leverage a distinct kind of “intertemporal contagion”. A key takeaway from this literature is that there is typically a gap between the designer’s value under adversarial equilibrium selection, and under designer-favorable selection (what we call a “multiplicity gap”); by contrast, we show that for dynamic binary-action supermodular games there is no such gap.

Also related is the elegant and complementary work of Basak and Zhou (2020) and Basak and Zhou (2022). We highlight several substantive differences. First, we study different dynamic games: in Basak and Zhou (2020, 2022) players make a once-and-for-all decision on whether to play the risky action, and they focus on regime change games—both features play a key role in their analysis; in ours, agents can continually re-optimize at the ticks of their Poisson clocks and play a general binary-action supermodular game where the designer’s payoff is any increasing functional of the path of aggregate play.7 Basak and Zhou (2020) study a regime change game with private information where the designer can choose the frequency at which she discloses whether or not the regime has survived. Basak and Zhou (2022) study an optimal stopping game with a regime change payoff structure in which agents choose when to undertake an irreversible risky action. Importantly, our optimal dynamic information policies—and the reasons they work—are entirely distinct; we discuss this more thoroughly after stating our main result.

Our paper also relates to work on the equilibria of dynamic coordination games. An important paper of Gale (1995) studies a complete information investment game where players can decide when, if ever, to make an irreversible investment and investing is payoff dominant.8 See also Chamley (1999); Dasgupta (2007); Angeletos, Hellwig, and Pavan (2007); Mathevet and Steiner (2013); Koh, Li, and Uzui (2024a), all of which study the equilibria of different dynamic coordination games. The main result is that investment succeeds across all subgame perfect equilibria. Our environment and results differ in several substantive ways. For instance, our policy allows the designer to implement the largest equilibrium—irrespective of whether it is payoff dominant.9 Moreover, actions in our environment are reversible, so sans any information (and assuming beliefs are not in the dominance regions) there will exist subgame perfect equilibria in which players “cycle” between actions; this is ruled out in the environment of Gale (1995) because of irreversibility. More subtly, our dynamic information works with—but does not rely on—atomless players, i.e., we obtain full implementation even if each player believes that they will not change the aggregate state. By contrast, atomic players are an essential feature of Gale (1995).

Our results are also connected to the literature on dynamic implementation. Moore and Repullo (1988) show that arbitrary social choice functions can be achieved with large off-path transfers.10 See also Aghion, Fudenberg, Holden, Kunimoto, and Tercieux (2012) for a discussion of the lack of robustness to small amounts of imperfect information, and Penta (2015) who takes a belief-free approach to dynamic implementation. Glazer and Perry (1996) show that virtual implementation of social choice functions can be achieved by appealing to extensive-form versions of Abreu and Matsushima (1992) mechanisms.11 See work by Chen and Sun (2015) who exploit the freedom to design the extensive-form. Sato (2023) designs both the extensive-form and information structure a la Doval and Ely (2020) and further utilizes the fact that the designer can design information about players’ past moves; by contrast, we fix the dynamic game and past play is observed. Chen et al. (2023) weaken backward induction to initial rationalizability.12 That is, only imposing sequential rationality and common knowledge of sequential rationality at the beginning of the game, but “anything goes” off-path; see Ben-Porath (1997). Different from these papers, our designer is substantially more constrained: (i) there is no freedom to design the extensive-form game which we take as given; (ii) the designer only offers dynamic information; and (iii) our policy is sequentially optimal.

Our game is one where players have stochastic switching opportunities. Variants of these models have been studied in macroeconomics (Diamond, 1982; Calvo, 1983; Diamond and Fudenberg, 1989; Frankel and Pauzner, 2000), industrial policy (Murphy, Shleifer, and Vishny, 1989; Matsuyama, 1991), finance (He and Xiong, 2012), industrial organization (Biglaiser, Crémer, and Veiga, 2022), and game theory (Burdzy, Frankel, and Pauzner, 2001; Matsui and Matsuyama, 1995; Oyama, 2002; Kamada and Kandori, 2020).13 See also more recent work by Guimaraes and Machado (2018); Guimaraes, Machado, and Pereira (2020). Angeletos and Lian (2016) offer an excellent survey. A common insight from this literature is that switching frictions can generate uniqueness, and the risk-dominant profile is selected via a process of backward induction. Our contribution is to show how the largest equilibrium can be uniquely implemented by carefully choosing the dynamic information policy.

Sequential optimality is an important property of our information policy and thus our work relates to recent work studying the role of (intertemporal) commitment in dynamic information design. Koh and Sanguanmoo (2022); Koh, Sanguanmoo, and Zhong (2024b) show by construction that sequential optimality is generally achievable in single-agent stopping problems. It will turn out that sequentially optimal policies also exist in our environment, but for quite distinct reasons; we discuss this more thoroughly in Section 3.

2  Model

Environment

There is a finite set of states $\Theta=\{\theta_{1},\theta_{2},\ldots,\theta_{n}\}$. We use $\Delta(\Theta)$ to denote the set of probability measures on $\Theta$ and endow it with the Euclidean metric. There is an interior common prior $\mu_{0}\in\Delta(\Theta)\setminus\partial\Delta(\Theta)$ and a unit measure of players indexed $i\in I:=[0,1]$. Time is continuous and indexed by $\mathcal{T}:=[0,+\infty)$. The action space is binary: $a_{it}\in A:=\{0,1\}$, where $a_{it}$ is $i$’s action at time $t$. Write $A_{t}:=\int a_{it}\,di$ to denote the proportion of players playing action $1$ at time $t$. Working with a continuum of agents makes our analysis cleaner because randomness from individual switching frictions vanishes in the aggregate.14 By an appropriate continuum law of large numbers (Sun, 2006), where we endow the player space $[0,1]$ with the appropriate Lebesgue extension. Working with a continuum also clarifies that atomic players are not required for the use of off-path information; we discuss this in Section 4. An analog of our result holds for a finite number of players; we develop this in Online Appendix I.

Payoffs

The flow payoff for each player is $u:\{0,1\}\times[0,1]\times\Theta\to\mathbb{R}$. We write $\Delta u(A,\theta):=u(1,A,\theta)-u(0,A,\theta)$ to denote the payoff difference from action $1$ relative to $0$ and assume throughout:

  • (i)

    Supermodularity. $\Delta u(A,\theta)$ is continuously differentiable and strictly increasing in $A$.

  • (ii)

    Dominant state. There exists $\theta^{*}\in\Theta$ such that $\Delta u(0,\theta^{*})>0$.

Condition (i) states that the game is one of strategic complements. Condition (ii) is a standard richness assumption on the space of possible payoff structures: there exists some state $\theta^{*}$ under which playing action $1$ is strictly dominant.15 This assumption is identical to that in Morris, Oyama, and Takahashi (2024).

The payoff of player $i\in I$ is $\int e^{-rt}u(a_{it},A_{t},\theta)\,dt$ where $r>0$ is an arbitrary discount rate. Each player is endowed with a personal Poisson clock which ticks at an independent rate $\lambda>0$. Players can only re-optimize at the ticks of their clocks (Calvo, 1983; Matsui and Matsuyama, 1995; Frankel and Pauzner, 2000; Frankel, Morris, and Pauzner, 2003; Kamada and Kandori, 2020). Our dynamic supermodular game is quite general with the caveat that players are homogeneous.16 A similar assumption has been made in static environments by Inostroza and Pavan (2023); Li et al. (2023) and was weakened by Morris, Oyama, and Takahashi (2024) who characterize optimal private information for full implementation by focusing on potential games with a convexity requirement, which amounts to there not being “too much heterogeneity” across players. We discuss the heterogeneous case in Section 4.
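To fix ideas about the induced dynamics (this is bookkeeping implied by the revision technology, not an extra assumption): over a short interval $[t,t+dt]$ a fraction $\lambda\,dt$ of players receives a revision opportunity, so if all revising players choose action $a\in\{0,1\}$, aggregate play evolves as
$$dA_{t}=\lambda(a-A_{t})\,dt.$$
In particular, if every revising player plays $1$ from time $t$ onward, aggregate play follows $d\bar{A}_{s}=\lambda(1-\bar{A}_{s})\,ds$ with $\bar{A}_{t}=A_{t}$, the continuation path that appears in Definition 1 below.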

Dynamic information policies

A history $H_{t}:=\big((\mu_{s})_{s\leq t},(A_{s})_{s\leq t}\big)$ specifies beliefs and aggregate play up to time $t$. Let $\mathcal{H}_{t}$ be the set of all time-$t$ histories and $\mathcal{H}:=\bigcup_{t\geq 0}\mathcal{H}_{t}$. Write $(\mathcal{F}_{t})_{t}$ for the natural filtration generated by histories. A dynamic information policy is an $(\mathcal{F}_{t})_{t}$-martingale. Let

$$\mathcal{M}:=\Big\{\bm{\mu}^{\prime}:\bm{\mu}^{\prime}\text{ is an $(\mathcal{F}_{t})_{t}$-martingale, }\mu^{\prime}_{0}=\mu_{0}\text{ a.s.}\Big\}$$

be the set of all dynamic information policies, where we emphasize that the law of $\bm{\mu}\in\mathcal{M}$ can depend on past play.

Strategies and Equilibria

A strategy $\sigma_{i}:\mathcal{H}\to\Delta\{0,1\}$ is a map from histories to distributions over actions so that if $i$’s clock ticks at time $t$, her choice of action is given by the history $H_{t-}:=\lim_{t^{\prime}\uparrow t}H_{t^{\prime}}$.17 This is well-defined since $(A_{t})_{t}$ is a.s. continuous and $(\mu_{t})_{t}$ has left-limits. Since the measure of agents who act at time $t$ is almost surely zero, our game is in effect equivalent to one in which play at time $t$ depends on the history $H_{t}$. Given $\bm{\mu}$, this induces a stochastic game;18 Note that information is public so all agents share the same beliefs; in Appendix B we relax this to show that private information often cannot do better. let $\Sigma(\bm{\mu},A_{0})$ denote the set of subgame perfect equilibria of the stochastic game. We focus on subgame perfection because there is no private information so the game continuing from each history corresponds to a proper subgame.19 Hence, subgame perfection in our setting coincides trivially with Perfect Bayesian Equilibrium (Fudenberg and Tirole, 1991); since we are varying the dynamic information structure, this also corresponds to dynamic Bayes Correlated Equilibria (Makris and Renou, 2023)—but only in the trivial sense since higher-order beliefs are degenerate.

Figure 1: Relationship between beliefs, equilibria, and action paths

Figure 1 illustrates the connection between dynamic information policies (top left), equilibria (top right), and the path of aggregate actions (bottom). Each information policy $(\mu_{t})_{t}$ specifies a càdlàg martingale whose law depends on both its past realizations as well as past aggregate play. Given this information policy, this induces a set of equilibria $\Sigma(\bm{\mu},A_{0})$. Both the realizations of beliefs $(\mu_{t})_{t}$ as well as the selected equilibrium $\sigma\in\Sigma(\bm{\mu},A_{0})$ induce a path of aggregate play $(A_{t})_{t}$. The designer’s problem is to choose its dynamic information policy to influence the set of equilibria and thus $(A_{t})_{t}$.

Designer’s problem under adversarial equilibrium selection

The designer’s problem under commitment when nature is choosing the best equilibrium is

$$\sup_{\bm{\mu}\in\mathcal{M},\ \bm{\sigma}\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big[\phi\big(\bm{A}\big)\Big]\qquad\qquad(1)$$

Conversely, when nature is choosing the worst equilibrium, the problem is

$$\sup_{\bm{\mu}\in\mathcal{M}}\ \inf_{\bm{\sigma}\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big[\phi\big(\bm{A}\big)\Big]\qquad\qquad(2)$$

where $\phi:\mathcal{A}\to\mathbb{R}$ is an increasing and bounded functional on the path space of aggregate play $\mathcal{A}$, e.g., the discounted measure of play $\phi(\bm{A})=\int e^{-rt}A_{t}\,dt$ with $r>0$.

Sequential Optimality

If the designer cannot commit to future information, off-path delivery of information might have no bite in the present. To this end, we can define the payoff gap at history $H_{t}$ as the value of the best deviation from the original policy $\bm{\mu}$:

$$\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\Big[\phi(\bm{A})\,\Big|\,\mathcal{F}_{t}\Big]-\inf_{\bm{\sigma}\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big[\phi\big(\bm{A}\big)\,\Big|\,\mathcal{F}_{t}\Big]\geq 0$$

where $\mathcal{F}_{t}$ is the $\sigma$-algebra corresponding to $H_{t}$. $\bm{\mu}$ is sequentially optimal if the gap is zero for all histories $H_{t}\in\mathcal{H}$. Sequential optimality is demanding and states that at every history—including off-path ones—the designer still finds it optimal to follow through with her dynamic information policy.

3  Optimal dynamic information

We begin with an intuitive description of a sequentially optimal dynamic information policy for binary states before constructing it formally. With binary states, we set $\Theta=\{0,1\}$ where $1$ is the dominant state. Beliefs are one-dimensional and we will directly associate $\mu_{t}:=\mathbb{P}(\theta=1\,|\,\mathcal{F}_{t})$. Let $\overline{\mu}$ denote the boundary of the upper-dominance region: $\overline{\mu}(A)$ is the lowest belief such that if the current aggregate play is $A$, playing action $1$ is strictly dominant no matter the future play of others.
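A sketch of this threshold, mirroring the lower-dominance construction formalized in Definition 1 below (this is our reading of “strictly dominant no matter the future play of others,” not an additional assumption): $\overline{\mu}(A_{t})$ is the lowest belief $\mu$ satisfying
$$\mathbb{E}_{\theta\sim\mu}\Big[\int_{t}^{\tau}e^{-rs}\Delta u(\underline{A}_{s},\theta)\,ds\Big]\geq 0,\qquad\text{where }d\underline{A}_{s}=-\lambda\underline{A}_{s}\,ds,\ \underline{A}_{t}=A_{t},$$
and $\tau$ is the player’s next revision time: action $1$ must be a best response even against the most pessimistic continuation in which every future revising player plays $0$.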

I. State is near the upper-dominance region. First suppose that at time $t$, the public belief $\mu_{t}$ and aggregate play $A_{t}$ are close to the upper-dominance region, as illustrated by the blue dot labelled $(\mu_{t},A_{t})$ in Figure 2(a). If players switch to action $1$, the designer stays silent. Thus, aggregate play progressively increases, as illustrated by the upward arrows in Figure 2(a).

Figure 2: Policy near upper-dominance region
(a) Silence on-path
(b) Injection into dominance region

But suppose, instead, that players start playing action 0 as depicted in Figure 2(b), step I. Then, the designer injects asymmetric information: it is very likely that agents become slightly more optimistic, i.e., public beliefs move up a little and into the upper-dominance region, but there is a small chance agents become much more pessimistic (Figure 2(b), step II). Suppose that this deviation happened and so this information is injected and, furthermore, that it has made agents a little more confident. Then, on this event, future beliefs are in the upper-dominance region so it is strictly dominant for future agents to take action 1. Correspondingly, the designer delivers no further information (Figure 2(b), step III) and aggregate play begins to increase thereafter. But, knowing that this sequence of events is likely to take place, and because agents have coordination motives, deviating to action 0 in the first place is strictly dominated.
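For a purely illustrative magnitude (the numbers are ours, not part of the construction): suppose $\mu_{t-}=0.60$ and the upper-dominance threshold at the current aggregate play is $\overline{\mu}(A_{t})=0.64$. An injection that sends beliefs to
$$\mu_{t}=0.65\ \text{with probability }0.9,\qquad \mu_{t}=0.15\ \text{with probability }0.1$$
respects the martingale property ($0.9\times 0.65+0.1\times 0.15=0.60$), is asymmetric (a small, likely up-move into the dominance region against a large, unlikely down-move), and is inconclusive (the pessimistic posterior $0.15$ is interior, assuming it remains outside the lower-dominance region).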

II. State is far from upper-dominance region. Next consider Figure 3(a) where $(\mu_{t},A_{t})$ is further away from the dominance region. Our previous argument now breaks down: there is no way for off-path information—no matter how cleverly designed—to ensure beliefs reach the dominance region with a high enough probability as to deter the initial deviation to action 0. This is the key weakness of off-path information vis-à-vis off-path transfers. What then does the designer do?

Figure 3: Chaining off-path information
(a) Chaining
(b) Contagion to lower-dominance

If players start switching to 0, the designer delivers asymmetric information so that, with high probability, agents become a little more confident—but not confident enough that action 1 is strictly dominant. This is depicted in Figure 3(a), step II. Upon this realization, if future agents continue deviating to 0, the policy injects yet another bout of asymmetric information which, with high probability, pushes beliefs into the upper-dominance region. This is depicted in Figure 3(a), step IIIB. Knowing this, we have already seen that those future agents strictly prefer to switch to 1. But knowing that, agents in the present state $(\mu_{t},A_{t})$, anticipating that upon deviation the injection will, with high probability, induce future agents to play 1, also strictly prefer to play 1 in the present.

What are the limits of this line of reasoning? It turns out that by choosing our dynamic information policy carefully, we can chain together these off-path injections of information in such a way as to obtain full implementation at all belief-aggregate pairs for which action 1 is not strictly dominated. This is depicted in Figure 3(b) where, as before, the blue and pink shaded regions represent the upper- and lower-dominance regions respectively. The logic is related to the “contagion arguments” of Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001); Frankel, Morris, and Pauzner (2003). These papers show that the risk-dominant action is typically selected as the limit of some iterated deletion procedure in which the blue and pink regions expand with each iteration and meet in the middle which pins down the unique equilibrium.20 In Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001) this is also obtained via backward induction, where a symmetric random process governs aggregate incentives. Mapped to our model, this corresponds to public information so that the belief martingale is a time-changed Brownian motion. In Frankel, Morris, and Pauzner (2003), this is obtained via interim deletion of strictly dominated strategies in many-action global games, though the logic is similar. By contrast, we show how dynamic information can be employed to generate asymmetric contagion such that only the upper-dominance region expands to engulf the space of all belief-aggregate play pairs where action 1 is not strictly dominated.

III. Designer-preferred action strictly dominated. Now suppose beliefs are so pessimistic that action $1$ is strictly dominated, i.e., $\mu_{t}\leq\underline{\mu}(A_{t})$ where $\underline{\mu}(A_{t})$ is the highest belief under which, given $A_{t}$, action $1$ is strictly dominated.

Figure 4: Escaping the lower-dominance region
(a) Immediate injection
(b) Delayed injection
(c) ‘Smooth’ injection

Then, the above policy no longer works: even if players expect all future players to switch to 11, they are so pessimistic about the state that switching to 0 is strictly better. Now, the designer has to offer non-trivial information on-path to push beliefs out of the lower-dominance region. How is this optimally done?

Figure 4(a) illustrates the optimal policy, which consists of an immediate and precise injection of information such that beliefs jump to either 0 or (just) out of the lower-dominance region. The optimality of such a policy is built on the observation that if the designer does not intervene early to curtail players from progressively switching to 0, it simply becomes more difficult to escape the lower-dominance region down the line. Consider, for instance, the policy in Figure 4(b) which also injects precise information to maximize the chance of escaping the lower-dominance region, but with a delay. Before this injection, players switch to action 0 and since $\underline{\mu}(A)$ is strictly decreasing, the probability of escaping the dominance region is strictly smaller. For similar reasons, the policy illustrated in Figure 4(c) which induces continuous sample belief paths is also sub-optimal.
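In the binary-state case, the martingale property of beliefs pins down the best the designer can do in Figure 4(a) (a one-line calculation, not an additional assumption): to push beliefs from $\mu_{t}<\underline{\mu}(A_{t})$ to a point $\mu^{+}$ just outside the lower-dominance region with the highest possible probability $p$, the complementary posterior must be $0$, so Bayes plausibility requires
$$\mu_{t}=p\,\mu^{+}+(1-p)\cdot 0\quad\Longrightarrow\quad p=\frac{\mu_{t}}{\mu^{+}}\approx\frac{\mu_{t}}{\underline{\mu}(A_{t})}.$$
This makes the cost of delay transparent: as players switch to $0$, $A_{t}$ falls, $\underline{\mu}(A_{t})$ rises, and the attainable escape probability falls.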

IV. Sequential optimality. Our previous discussion specified off-path injections of information upon deviation away from action 1. Of course, if such deviations actually occur, the designer may not have any incentive to follow through with its policy. For instance, consider Figure 5(a) which employs the strategy of injecting conclusive bad news that the state is 0 so that, with high probability, beliefs increase a little, and with low probability agents learn conclusively that $\theta=0$. Indeed, information of this form maximizes the chance that beliefs increase21 As in Kamenica and Gentzkow (2011) and subsequent work. and, as we have described, these can be chained together to achieve full implementation.

Figure 5: Sequential optimality
(a) Not sequentially optimal
(b) Sequentially optimal

However, this policy is not sequentially optimal: if agents do deviate and play action 0, injecting such information is suboptimal because it poses an extra risk: if conclusive bad news does arrive, beliefs are absorbed at $\mu_{t}=0$ and further information is powerless to influence beliefs—it is then strictly dominant for all agents to play 0 thereafter. How, then, is sequential optimality obtained?

Consider inconclusive off-path information as illustrated in Figure 5(b) where each blue dot represents a potential injection of off-path information upon players’ deviating to action 0. Each injection induces two kinds of beliefs: upon arrival of a ‘good’ signal, agents become a little more optimistic (right arrow); upon arrival of a ‘bad’ signal, agents become relatively more pessimistic, but not so much that action 1 becomes strictly dominated (left arrow). Figure 5(b) illustrates a particular policy in which, upon realization of the bad signal at state $(\mu_{t-},A_{t})$, agents’ beliefs move halfway toward the lower-dominance region, i.e., to $[\mu_{t-}+\underline{\mu}(A_{t})]/2$. Conversely, if the good signal arrives, beliefs move up a little; the martingale property then implies that the probability of the good signal is much higher than that of the bad signal.
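A quick check of this split (the exact magnitudes are pinned down in the formal construction below; $u>0$ here is a generic small up-move): with bad posterior $\mu^{-}=[\mu_{t-}+\underline{\mu}(A_{t})]/2$ and good posterior $\mu^{+}=\mu_{t-}+u$, the martingale property forces
$$\mathbb{P}(\text{good})=\frac{\mu_{t-}-\mu^{-}}{\mu^{+}-\mu^{-}}=\frac{\mu_{t-}-\underline{\mu}(A_{t})}{\big(\mu_{t-}-\underline{\mu}(A_{t})\big)+2u}\;\longrightarrow\;1\quad\text{as }u\downarrow 0,$$
so small up-moves necessarily come with high probability: exactly the asymmetry the chaining argument exploits.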

By choosing this distribution carefully for each belief-aggregate action pair, we can achieve full implementation via the chaining argument outlined above, which requires that (i) the probability of the good signal arriving is sufficiently high as to deter deviations; and (ii) the movement in beliefs generated by the good signal is sufficiently large that, when chained together, we obtain full implementation over the whole region. At the same time, this is sequentially optimal since, whenever the designer is faced with the prospect of injecting off-path information, she is willing to do so: with probability 1 agents’ posterior beliefs are such that full implementation remains possible.22 We emphasize that there is nothing circular about this argument: we iteratively delete switching to action 0 under the worst-case conjecture that, upon the bad signal arriving, all future agents play 0. This is sufficient to obtain full implementation as long as action 1 is not strictly dominated.

Sequential optimality of dynamic information has been recently studied in single-agent optimal stopping problems (Koh and Sanguanmoo, 2022; Koh, Sanguanmoo, and Zhong, 2024b) who show that optimal dynamic information can always be modified to be sequentially optimal.23 See also Ball (2023) who finds in a different single-agent contracting environment that the optimal dynamic information policy happens to be sequentially optimal. In such environments, sequential optimality is obtained via an entirely distinct mechanism: the designer progressively delivers more interim information to raise the agent’s outside option at future histories which, in turn, ties the designer’s hands in the future. By contrast, in the present environment our designer chains off-path information together to raise her own continuation value by guaranteeing that, even on realizations of the asymmetric signal, her future self can always fully implement the largest path of play.

Construction of sequentially-optimal policy.

We now make our previous discussion precise and general.

We will construct a particular martingale $\bm{\mu}^{*}\in\mathcal{M}$ which is ‘Markovian’ in the sense that the ‘instantaneous’ information at time $t$ depends only on the belief-aggregate play pair $(\mu_{t},A_{t})$, as well as an auxiliary $(\mathcal{F}_{t})_{t}$-predictable process $(Z_{t})_{t}$ we will define as part of the policy. We begin with several key definitions:

Definition 1 (Lower dominance region).

Let $\Psi_{LD}:[0,1]\rightrightarrows\Delta(\Theta)$ denote the set of beliefs under which players prefer action 0 even if all future players choose to play action 1:

$$\Psi_{LD}(A_{t})\coloneqq\Big\{\mu\in\Delta(\Theta):\mathbb{E}_{\theta\sim\mu}\Big[\int_{t}^{\tau}e^{-rs}\Delta u(\bar{A}_{s},\theta)\,ds\Big]\leq 0\Big\},$$

where $\bar{A}_{s}$ solves $d\bar{A}_{s}=\lambda(1-\bar{A}_{s})\,ds$ for $s\geq t$ with boundary condition $\bar{A}_{t}=A_{t}$, and $\tau$ is independently distributed according to an exponential distribution with rate $\lambda$, re-normalized to start at $t$.

Observe that supermodularity implies $\Psi_{LD}$ is decreasing in $A_{t}$: $\Psi_{LD}(A_{t})\subset\Psi_{LD}(A^{\prime}_{t})$ if $A_{t}>A^{\prime}_{t}$. $\Psi_{LD}$ is illustrated by the pink region of Figure 6 for the cases where $|\Theta|=2$ (panel (a)) and $|\Theta|=3$ (panel (b)).
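To make Definition 1 concrete, the following is a minimal numerical sketch for binary states under a hypothetical payoff difference $\Delta u(A,\theta)=A+\theta-c$ with $c<1$ (our illustrative example, not a primitive of the paper). It approximates the threshold belief $\underline{\mu}(A_{0})$ below which action $1$ is strictly dominated even when all future revising players play $1$, using the identity $\mathbb{E}_{\tau}\big[\int_{0}^{\tau}e^{-rs}f(s)\,ds\big]=\int_{0}^{\infty}e^{-(r+\lambda)s}f(s)\,ds$ for $\tau\sim\mathrm{Exp}(\lambda)$.

import numpy as np

# Hypothetical payoff difference, for illustration only: Delta u(A, theta) = A + theta - c.
# It is strictly increasing in A (supermodularity) and, with c < 1, action 1 is strictly
# dominant in state theta = 1 since Delta u(0, 1) = 1 - c > 0 (the dominant-state condition).
def delta_u(A, theta, c=0.8):
    return A + theta - c

def lower_dominance_threshold(A0, lam=1.0, r=0.1, c=0.8, T=50.0, dt=1e-3):
    # Continuation path when every future revising player plays 1: dA_bar = lam*(1 - A_bar)*ds.
    s = np.arange(0.0, T, dt)
    A_bar = 1.0 - (1.0 - A0) * np.exp(-lam * s)
    disc = np.exp(-(r + lam) * s)                  # e^{-rs} * P(tau > s) for tau ~ Exp(lam)
    V = {th: float(np.sum(disc * delta_u(A_bar, th, c)) * dt) for th in (0, 1)}
    if V[0] >= 0.0:                                # even the worst state favours action 1,
        return 0.0                                 # so Psi_LD(A0) is empty
    return -V[0] / (V[1] - V[0])                   # belief solving mu*V[1] + (1 - mu)*V[0] = 0

for A0 in (0.0, 0.5, 1.0):
    print(A0, round(lower_dominance_threshold(A0), 3))

The printed thresholds should fall as $A_{0}$ rises, in line with the monotonicity of $\Psi_{LD}$ noted above.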

Figure 6: Illustration of $\Psi_{LD}$, $\text{Bd}_{\theta^{*}}$, and $D$
(a) $|\Theta|=2$
(b) $|\Theta|=3$
Definition 2.

For each $\mu_{t}\notin\Psi_{LD}(A_{t})$ and $A_{t}\in[0,1]$, define

$$D(\mu_{t},A_{t})\coloneqq\inf\bigg\{\alpha\in[0,1]:\mu_{t}-\alpha\cdot\frac{\delta_{\theta^{*}}-\mu_{t}}{1-\mu_{t}(\theta^{*})}\in\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\bigg\}$$

This gives the ‘distance’ from the current belief $\mu_{t}$ as it moves along the ray from $\delta_{\theta^{*}}$ through $\mu_{t}$ to either (i) the lower-dominance region $\Psi_{LD}(A_{t})$; or (ii) the set of beliefs that assign zero probability to state $\theta^{*}$, which we denote by $\text{Bd}_{\theta^{*}}\coloneqq\{\mu\in\Delta(\Theta):\mu(\theta^{*})=0\}$. This is depicted in Figure 6, where each blue dot represents a belief.
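In the binary-state case of Section 3, this distance takes a simple form (a direct specialization of the definition, identifying $\mu_{t}$ with $\mathbb{P}(\theta=1)$): moving against $\hat{d}(\mu_{t})$ lowers the probability placed on the dominant state one-for-one, so
$$D(\mu_{t},A_{t})=\mu_{t}-\underline{\mu}(A_{t})$$
whenever $\Psi_{LD}(A_{t})$ is nonempty, and $D(\mu_{t},A_{t})=\mu_{t}$ otherwise.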

Definition 3 (Tolerance, upward/downward jump sizes, belief direction).

To describe the policy when action 1 is not strictly dominated, we specify the following variables:

  • (i)

    Tolerance. $\mathsf{TOR}(D)$ specifies the magnitude of deviation of off-path play vis-à-vis a target $Z_{t}$. If this is exceeded, the policy begins to inject additional information.

  • (ii)

    Upward jump size. $M\cdot\mathsf{TOR}(D)$ scales the tolerance by a factor of $M>0$, and specifies the upward movement in beliefs if the injected information is positive.

  • (iii)

    Downward jump size. $\mathsf{DOWN}(D)$ specifies the downward movement in beliefs if the injected information is negative.

  • (iv)

    Belief direction. $\hat{\bm{d}}(\mu)\in\mathbb{R}^{n}$ specifies the direction of belief movements. We set it as the directional vector of $\mu$ towards $\delta_{\theta^{*}}$.

Figure 7: Illustration of upward/downward jump sizes and belief directions

Definition 3 specifies the objects required to define our information policy when beliefs lie outside the lower-dominance region ($\mu\notin\Psi_{LD}$). We now develop the objects required to define our information policy when beliefs lie inside the lower-dominance region ($\mu\in\Psi_{LD}$).

Definition 4 (Maximal escape probability and beliefs).

To describe the policy when action 1 is strictly dominated, a few more definitions are in order:

  • (i)

    The set of beliefs which are attainable with probability $p$ from $\mu$ is

    $$F(p,\mu)\coloneqq\{\mu^{\prime}\in\Delta(\Theta):p\mu^{\prime}\leq\mu\}$$

    which follows from the martingale property of beliefs.

  • (ii)

    Maximal escape probability. $p^{*}(\mu,A)\coloneqq\max\{p\in[0,1]:F(p,\mu)\not\subset\Psi_{LD}(A)\}$ is a tight upper bound on the probability that beliefs escape $\Psi_{LD}$.

  • (iii)

    The maximal escape beliefs are

    $$\partial(\eta,\mu):=F\big(p^{*}(\mu,A)-\eta,\mu\big)\cap\Psi_{LD}^{c}(A)$$

    where $\Psi^{c}_{LD}(A)=\Delta(\Theta)\setminus\Psi_{LD}(A)$.

Figure 8: Illustration of $F$, the maximal escape probability, and the maximal escape beliefs

We are (finally!) ready to define our dynamic information policy $\bm{\mu}^{*}\in\mathcal{M}$. Recall that $\bm{\mu}^{*}$ is càdlàg so has left-limits, which we denote by $\mu_{t-}:=\lim_{t^{\prime}\uparrow t}\mu_{t^{\prime}}$. We will simultaneously specify the law of $\bm{\mu}^{*}$ as well as construct the stochastic process $(Z_{t})_{t}$, which is $(\mathcal{F}_{t})_{t}$-predictable24 That is, $Z_{t}$ is measurable with respect to the left filtration $\lim_{s\uparrow t}\mathcal{F}_{s}$. and initialized at $Z_{0}=A_{0}$. $(Z_{t})_{t}$ is interpreted as the targeted aggregate play at each history.

Given the tuple $(\mu^{*}_{t-},Z_{t-},A_{t})$, define the time-$t$ information structure and law of motion of $Z_{t}$ as follows:

  1.

    Silence on-path. If action 1 is not strictly dominated, i.e., $\mu_{t-}\notin\Psi_{LD}(A_{t})$, and play is within the tolerance level, i.e., $|A_{t}-Z_{t-}|<\mathsf{TOR}(D)$, then

    $\mu_{t}=\mu_{t-}$ almost surely,

    i.e., no information, and $dZ_{t}=\lambda(1-Z_{t-})\,dt$.

  2.

    Asymmetric and inconclusive off-path injection. If action 1 is not strictly dominated, i.e., $\mu_{t-}\notin\Psi_{LD}(A_{t})$, and play is outside the tolerance level, i.e., $|A_{t}-Z_{t-}|\geq\mathsf{TOR}(D)$, then

    $$\mu_{t}=\begin{cases}\mu_{t-}+\big(M\cdot\mathsf{TOR}(D)\big)\cdot\hat{d}(\mu_{t-})&\text{w.p. }\dfrac{\mathsf{DOWN}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}\\[6pt] \mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-})&\text{w.p. }\dfrac{M\cdot\mathsf{TOR}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)},\end{cases}$$

    where $\hat{d}(\mu):=\frac{\delta_{\theta^{*}}-\mu}{1-\mu(\theta^{*})}$ is the (normalized) directional vector of $\mu$ toward $\delta_{\theta^{*}}$, and reset $Z_{t}=A_{t}$ (a check that this injection is a mean-preserving split follows after this list).

  3.

    Jump. If action 1 is strictly dominated, i.e., $\mu_{t-}\in\Psi_{LD}(A_{t})$, then beliefs jump to a maximal escape point: pick any $\mu^{+}\in\partial(\eta,\mu_{t-})$ and set

    $$\mu_{t}=\begin{cases}\mu^{+}&\text{w.p. }p^{*}(\mu_{t-},A_{t})-\eta\\ \mu^{-}&\text{w.p. }1-\big(p^{*}(\mu_{t-},A_{t})-\eta\big),\end{cases}$$

    where $\mu^{-}=\dfrac{\mu_{t-}-\big(p^{*}(\mu_{t-},A_{t})-\eta\big)\mu^{+}}{1-\big(p^{*}(\mu_{t-},A_{t})-\eta\big)}.$
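Before turning to parameter choices, a one-line check that the off-path injection in case 2 is a mean-preserving split (and hence consistent with $\bm{\mu}^{*}$ being a martingale):
$$\mathbb{E}\big[\mu_{t}\,\big|\,\mathcal{F}_{t-}\big]=\mu_{t-}+\frac{\mathsf{DOWN}(D)\cdot M\,\mathsf{TOR}(D)-M\,\mathsf{TOR}(D)\cdot\mathsf{DOWN}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}\,\hat{d}(\mu_{t-})=\mu_{t-}.$$
The same algebra applied to the jump in case 3 gives $(p^{*}-\eta)\mu^{+}+(1-(p^{*}-\eta))\mu^{-}=\mu_{t-}$ by the definition of $\mu^{-}$.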

We have defined a family of information structures which depend on $\mathsf{TOR}(D)$ (tolerance), $M\cdot\mathsf{TOR}(D)$ (upward jump size), $\mathsf{DOWN}(D)$ (downward jump size), and $\eta$ (distance outside the lower-dominance region $\Psi_{LD}$). There is some flexibility in choosing them: we will set $\mathsf{DOWN}(D)=\frac{1}{2}D$ and $\mathsf{TOR}(D)=m\cdot D^{2}$, where $m>0$ is a small constant and $M>0$ is a large constant.

We choose $m$ small so that the upward jump size $M\cdot\mathsf{TOR}(D)$ is much smaller than the downward jump size—this ensures that the probability of becoming (a little) more optimistic is much larger. $M$ is the ratio between the upward jump size and the tolerance—it is large to guarantee that off-path information can push future beliefs into the upper-dominance region. The exact choices of $m$ and $M$ will depend on the primitives of the game, but are independent of $\eta$; a detailed construction is in Appendix A. Hence, we parameterize this family of policies by $(\bm{\mu}^{\eta})_{\eta>0}$.
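Before stating the main result, the following is a stylized simulation sketch of the informational put in the binary-state case. The dominance thresholds, the behavioral rule for revising players, and all numerical constants are illustrative assumptions of ours rather than the objects constructed in Appendix A; the sketch only illustrates the three branches of the policy (silence, off-path injection, jump).

import numpy as np

rng = np.random.default_rng(0)

lam, dt = 1.0, 0.01
mu_lo = lambda A: max(0.30 - 0.25 * A, 0.0)   # stylized boundary of the lower-dominance region
TOR_FRAC, UP_FRAC = 0.02, 0.10                # tolerance and upward jump as fractions of D

mu, A = 0.40, 0.50
Z = A                                         # targeted aggregate play
for step in range(4000):
    # Stand-in behavioral rule (not the equilibrium construction): revising players
    # play 1 only if their belief clears a stylized cutoff.
    play_one = (mu > mu_lo(A)) and (mu >= 0.45)
    A += lam * ((1.0 if play_one else 0.0) - A) * dt
    if mu <= mu_lo(A):
        # Jump: split beliefs between 0 and a point just outside the lower-dominance region.
        mu_plus = mu_lo(A) + 0.01
        p_escape = mu / mu_plus                       # Bayes-plausibility (martingale) bound
        mu = mu_plus if rng.random() < p_escape else 0.0
        Z = A
    else:
        D = mu - mu_lo(A)                             # distance to the lower-dominance region
        if abs(A - Z) >= TOR_FRAC * D:
            # Off-path injection: small up-move with high probability,
            # larger but inconclusive down-move with low probability.
            up, down = UP_FRAC * D, 0.5 * D
            p_up = down / (down + up)                 # mean-preserving probabilities
            mu = min(mu + up, 1.0) if rng.random() < p_up else mu - down   # up-move clamped at 1 for the sketch
            Z = A
        else:
            Z += lam * (1.0 - Z) * dt                 # silence on-path; target drifts toward 1
print(f"final belief {mu:.3f}, final aggregate play {A:.3f}")

On typical runs the injections correct incipient switches to action 0 and $(\mu_{t},A_{t})$ drifts toward the upper region; the jump branch fires only if beliefs enter the stylized lower-dominance region.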

Theorem 1.

  • (i)

    Form and value.

    $$\lim_{\eta\downarrow 0}\,\inf_{\sigma\in\Sigma(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big[\phi(\bm{A})\Big]=(2)=(1).$$
  • (ii)

    Sequential optimality.

    $$\lim_{\eta\downarrow 0}\,\sup_{H_{t}\in\mathcal{H}}\Bigg|\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big[\phi\big(\bm{A}\big)\,\Big|\,\mathcal{F}_{t}\Big]-\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\Big[\phi(\bm{A})\,\Big|\,\mathcal{F}_{t}\Big]\Bigg|=0.$$
Proof.

See Appendix A. ∎

4  Robustness and generalizations

Our dynamic game is quite general in some regards but more specific in others. We now discuss which aspects are crucial, and which can be relaxed.

Continuum vs finite players. We worked with a continuum of players so there is no aggregate randomness in the time path of agents who can re-optimize their action.25 By a suitable continuum law of large numbers (Sun, 2006). This delivers a cleaner analysis since the only source of randomness is fluctuations in beliefs driven by policy. In Online Appendix I we show that Theorem 1 holds, mutatis mutandis, in a model with a large but finite number of players.26 See Aumann (1966) and Fudenberg and Levine (1986); Levine and Pesendorfer (1995) for a discussion of the subtleties between continuum and finite players. There, we show that in a finite version of the model, the same policy that was optimal in the continuum case continues to solve problem (2) for a large but finite number of players $N$. In particular, our policy closes the multiplicity gap at rate $O(N^{-1/9})$.27 That is, $|(2)-(1)|=O(N^{-1/9})$; see Online Appendix I. Mathematically, this requires more involved arguments to handle the extra randomness from switching times.

Conceptually, however, finiteness is simpler. Notice that in our continuum model, players are atomless but nonetheless off-path information is still effective. That is, players do not need to believe that they can individually influence the state for full implementation to work. This is in sharp contrast to work on dynamic coordination, durable goods monopolist, or public-good provision games which rely on the fact that each agent’s action makes a small but non-negligible difference.28 For instance, Gale (1995) highlights a gap between a continuum and finite number of players in dynamic coordination games. A similar gap emerges in durable goods monopolist settings (Fudenberg, Levine, and Tirole, 1985; Gul, Sonnenschein, and Wilson, 1986; Bagnoli, Salant, and Swierzbinski, 1989). See also recent work by Battaglini and Palfrey (2024) in a public goods context where the fact that each agent can influence the state (by a little) is important. The chief difference is that information about deviations is lost in the continuum case (Levine and Pesendorfer, 1995), which precludes the designer from detecting and responding to individual deviations.

Our key insight is that this is not required: a dynamic policy with a moving target—such that asymmetric and inconclusive information is injected if aggregate play falls too far from the target—can deliver strict incentives, even if individual players cannot influence aggregate play. That is, our policy credibly insures players against paths of future play by precluding the possibility that ‘too many’ (as prescribed by the tolerance level) future agents might switch to action 0. In this regard, working with atomless agents delivers an arguably stronger result.

Public vs private information. When the initial condition is such that playing 1 is not strictly dominated, our policy fully implements the upper-bound on the time path of aggregate play. Thus, private information cannot do better. If initial beliefs are such that action 1 is strictly dominated, however, this is more subtle.29 It is still an open question as to how to characterize feasible joint paths of higher-order beliefs when players also observe past play. We discuss this case in Appendix B where we construct an upper bound on the payoff difference under public and private information policies.

Homogeneous vs heterogeneous players. Payoffs in our dynamic game are quite general, with the caveat that they were identical across players. It is well-known that introducing heterogeneity typically aids equilibrium uniqueness in coordination games.30 See Morris and Shin (2006) for an articulation and survey of this idea. Thus, we expect that this can only make full implementation easier. Since we have already closed the multiplicity gap under homogeneous payoffs, qualitative features of our main result should continue to hold.31 At least when the belief-aggregate play pair is such that action 1 is not strictly dominated.

Switching frictions. Switching frictions are commonly used to model switching costs, inattention, or settings with some staggered structure. They are important in our environment because the dynamic information policy can then inject information as soon as players begin deviating from the designer’s preferred action. This allows off-path information to be chained together by correcting incipient deviations. By contrast, if players could continually re-optimize their actions, then off-path information is powerless to rule out equilibria of the form “all simultaneously switch to 0”.32 Indeed, prior work which obtained equilibrium uniqueness (of risk-dominant selection) (Frankel and Pauzner, 2000; Burdzy, Frankel, and Pauzner, 2001) does so via switching frictions. Switching frictions are prevalent in macroeconomics but, as Angeletos and Lian (2016) note, “It is then somewhat surprising that this approach [combining aggregate shocks with switching frictions to generate uniqueness] has not attracted more attention in applied research.” We note, however, that it would suffice for some frictions to exist, but the exact form is not particularly important: the switching rate could vary with aggregate play, change over time, and can be taken to be arbitrarily quick or slow.

5  Discussion

We have shown that dynamic information is a powerful tool for full implementation in general binary-action supermodular games. In doing so, we highlighted key properties of off-path information: asymmetric and inconclusive signals are chained together to obtain full implementation while preserving sequential optimality. We conclude by briefly discussing implications.

Implications for implementation via information. A recent literature on information and mechanism design finds that in static environments, there is generically a multiplicity gap—the designer can do strictly better under partial rather than full implementation (Morris, Oyama, and Takahashi, 2024; Halac, Lipnowski, and Rappoport, 2021).33 See also Inostroza and Pavan (2023); Li, Song, and Zhao (2023); Morris, Oyama, and Takahashi (2022); Halac, Lipnowski, and Rappoport (2024). We show that the careful design of dynamic public information can quite generally close this gap in dynamic coordination environments.34 Moreover, information in our environment is public so higher-order beliefs are degenerate; by contrast, optimal static implementation via information typically requires inducing non-degenerate higher-order beliefs.

But do our results demand more of players’ rationality and common knowledge of rationality? Yes and no. On the one hand, it is well-known that in environments like ours, there is a tight connection between the iterated deletion of interim strictly dominated strategies (as in Frankel, Morris, and Pauzner (2003)) and backward induction, which can be viewed as the iterated deletion of intertemporally strictly dominated strategies (as in Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001)). In this regard, we do not think our results require “more sophistication” of agents than in static environments. On the other hand, it is also known that common knowledge of rationality is delicate in dynamic games and must continue to hold at off-path histories.35 See Aumann (1995). Samet (2005) offers an entertaining discussion. This motivates implementation in “initial rationalizability” when the designer has freedom to design the extensive form game (Chen, Holden, Kunimoto, Sun, and Wilkening, 2023). In this regard, our stronger results are obtained at the price of arguably stronger assumptions on common knowledge of rationality.

Implications for coordination policy. Our results have simple and sharp implications for coordination problems. It is often held that to prevent agents from playing undesirable equilibria, policymakers must deliver substantial on-path information in order to uniquely implement the designer’s preferred action.36 See Morris and Yildiz (2019) for a recent articulation of this idea in static games, and Basak and Zhou (2020, 2022) in a dynamic regime change game where the planner uses either frequent warnings (the former), or early warnings (the latter) to implement their preferred equilibrium. Our results offer a more nuanced view.

When public beliefs are so pessimistic that the designer-preferred action is strictly dominated, an early and precise injection of on-path information is indeed required; waiting only makes implementation harder in the future. But as long as beliefs are not so pessimistic that the designer-preferred action is strictly dominated, no additional on-path information is required: silence backed by the credible promise of off-path information suffices.

References

  • Abreu and Matsushima (1992) Abreu, D. and H. Matsushima (1992): “Virtual implementation in iteratively undominated strategies: complete information,” Econometrica: Journal of the Econometric Society, 993–1008.
  • Aghion et al. (2012) Aghion, P., D. Fudenberg, R. Holden, T. Kunimoto, and O. Tercieux (2012): “Subgame-perfect implementation under information perturbations,” The Quarterly Journal of Economics, 127, 1843–1881.
  • Angeletos et al. (2007) Angeletos, G.-M., C. Hellwig, and A. Pavan (2007): “Dynamic global games of regime change: Learning, multiplicity, and the timing of attacks,” Econometrica, 75, 711–756.
  • Angeletos and Lian (2016) Angeletos, G.-M. and C. Lian (2016): “Incomplete information in macroeconomics: Accommodating frictions in coordination,” in Handbook of macroeconomics, Elsevier, vol. 2, 1065–1240.
  • Arieli et al. (2021) Arieli, I., Y. Babichenko, F. Sandomirskiy, and O. Tamuz (2021): “Feasible joint posterior beliefs,” Journal of Political Economy, 129, 2546–2594.
  • Aumann (1966) Aumann, R. (1966): “Existence of Competitive Equilibria in Markets with a Continuum of Traders,” Econometrica, 34, 1–17.
  • Aumann (1995) Aumann, R. J. (1995): “Backward induction and common knowledge of rationality,” Games and Economic Behavior, 8, 6–19.
  • Bagnoli et al. (1989) Bagnoli, M., S. W. Salant, and J. E. Swierzbinski (1989): “Durable-goods monopoly with discrete demand,” Journal of Political Economy, 97, 1459–1478.
  • Ball (2023) Ball, I. (2023): “Dynamic information provision: Rewarding the past and guiding the future,” Econometrica, 91, 1363–1391.
  • Basak and Zhou (2020) Basak, D. and Z. Zhou (2020): “Diffusing coordination risk,” American Economic Review, 110, 271–297.
  • Basak and Zhou (2022) ——— (2022): “Panics and early warnings,” PBCSF-NIFR Research Paper.
  • Battaglini and Palfrey (2024) Battaglini, M. and T. R. Palfrey (2024): “Dynamic Collective Action and the Power of Large Numbers,” Tech. rep., National Bureau of Economic Research.
  • Ben-Porath (1997) Ben-Porath, E. (1997): “Rationality, Nash equilibrium and backwards induction in perfect-information games,” The Review of Economic Studies, 64, 23–46.
  • Bester and Strausz (2001) Bester, H. and R. Strausz (2001): “Contracting with imperfect commitment and the revelation principle: the single agent case,” Econometrica, 69, 1077–1098.
  • Biglaiser et al. (2022) Biglaiser, G., J. Crémer, and A. Veiga (2022): “Should I stay or should I go? Migrating away from an incumbent platform,” The RAND Journal of Economics, 53, 453–483.
  • Burdzy et al. (2001) Burdzy, K., D. M. Frankel, and A. Pauzner (2001): “Fast equilibrium selection by rational players living in a changing world,” Econometrica, 69, 163–189.
  • Calvo (1983) Calvo, G. A. (1983): “Staggered prices in a utility-maximizing framework,” Journal of Monetary Economics, 12, 383–398.
  • Chamley (1999) Chamley, C. (1999): “Coordinating regime switches,” The Quarterly Journal of Economics, 114, 869–905.
  • Chen et al. (2023) Chen, Y.-C., R. Holden, T. Kunimoto, Y. Sun, and T. Wilkening (2023): “Getting dynamic implementation to work,” Journal of Political Economy, 131, 285–387.
  • Chen and Sun (2015) Chen, Y.-C. and Y. Sun (2015): “Full implementation in backward induction,” Journal of Mathematical Economics, 59, 71–76.
  • Dasgupta (2007) Dasgupta, A. (2007): “Coordination and delay in global games,” Journal of Economic Theory, 134, 195–225.
  • Diamond and Dybvig (1983) Diamond, D. W. and P. H. Dybvig (1983): “Bank runs, deposit insurance, and liquidity,” Journal of Political Economy, 91, 401–419.
  • Diamond and Fudenberg (1989) Diamond, P. and D. Fudenberg (1989): “Rational expectations business cycles in search equilibrium,” Journal of Political Economy, 97, 606–619.
  • Diamond (1982) Diamond, P. A. (1982): “Aggregate demand management in search equilibrium,” Journal of Political Economy, 90, 881–894.
  • Doval and Ely (2020) Doval, L. and J. C. Ely (2020): “Sequential information design,” Econometrica, 88, 2575–2608.
  • Doval and Skreta (2022) Doval, L. and V. Skreta (2022): “Mechanism design with limited commitment,” Econometrica, 90, 1463–1500.
  • Ellison and Fudenberg (2000) Ellison, G. and D. Fudenberg (2000): “The neo-Luddite’s lament: Excessive upgrades in the software industry,” The RAND Journal of Economics, 253–272.
  • Farrell and Saloner (1985) Farrell, J. and G. Saloner (1985): “Standardization, compatibility, and innovation,” The RAND Journal of Economics, 70–83.
  • Frankel and Pauzner (2000) Frankel, D. and A. Pauzner (2000): “Resolving indeterminacy in dynamic settings: the role of shocks,” The Quarterly Journal of Economics, 115, 285–304.
  • Frankel et al. (2003) Frankel, D. M., S. Morris, and A. Pauzner (2003): “Equilibrium selection in global games with strategic complementarities,” Journal of Economic Theory, 108, 1–44.
  • Fudenberg and Levine (1986) Fudenberg, D. and D. Levine (1986): “Limit games and limit equilibria,” Journal of Economic Theory, 38, 261–279.
  • Fudenberg et al. (1985) Fudenberg, D., D. Levine, and J. Tirole (1985): “Infinite-horizon models of bargaining with one-sided incomplete information,” Game-theoretic models of bargaining, 73–98.
  • Fudenberg and Tirole (1991) Fudenberg, D. and J. Tirole (1991): “Perfect Bayesian equilibrium and sequential equilibrium,” Journal of Economic Theory, 53, 236–260.
  • Gale (1995) Gale, D. (1995): “Dynamic coordination games,” Economic theory, 5, 1–18.
  • Glazer and Perry (1996) Glazer, J. and M. Perry (1996): “Virtual implementation in backwards induction,” Games and Economic Behavior, 15, 27–32.
  • Goldstein and Pauzner (2005) Goldstein, I. and A. Pauzner (2005): “Demand–deposit contracts and the probability of bank runs,” The Journal of Finance, 60, 1293–1327.
  • Guimaraes and Machado (2018) Guimaraes, B. and C. Machado (2018): “Dynamic coordination and the optimal stimulus policies,” The Economic Journal, 128, 2785–2811.
  • Guimaraes et al. (2020) Guimaraes, B., C. Machado, and A. E. Pereira (2020): “Dynamic coordination with timing frictions: Theory and applications,” Journal of Public Economic Theory, 22, 656–697.
  • Gul et al. (1986) Gul, F., H. Sonnenschein, and R. Wilson (1986): “Foundations of dynamic monopoly and the Coase conjecture,” Journal of Economic Theory, 39, 155–190.
  • Halac et al. (2021) Halac, M., E. Lipnowski, and D. Rappoport (2021): “Rank uncertainty in organizations,” American Economic Review, 111, 757–786.
  • Halac et al. (2024) ——— (2024): “Pricing for Coordination,” .
  • Halac and Yared (2014) Halac, M. and P. Yared (2014): “Fiscal rules and discretion under persistent shocks,” Econometrica, 82, 1557–1614.
  • He and Xiong (2012) He, Z. and W. Xiong (2012): “Dynamic debt runs,” The Review of Financial Studies, 25, 1799–1843.
  • Inostroza and Pavan (2023) Inostroza, N. and A. Pavan (2023): “Adversarial coordination and public information design,” Available at SSRN 4531654.
  • Kamada and Kandori (2020) Kamada, Y. and M. Kandori (2020): “Revision games,” Econometrica, 88, 1599–1630.
  • Kamenica and Gentzkow (2011) Kamenica, E. and M. Gentzkow (2011): “Bayesian persuasion,” American Economic Review, 101, 2590–2615.
  • Koh et al. (2024a) Koh, A., R. Li, and K. Uzui (2024a): “Inertial Coordination Games,” arXiv preprint arXiv:2409.08145.
  • Koh and Sanguanmoo (2022) Koh, A. and S. Sanguanmoo (2022): “Attention Capture,” arXiv preprint arXiv:2209.05570.
  • Koh et al. (2024b) Koh, A., S. Sanguanmoo, and W. Zhong (2024b): “Persuasion and Optimal Stopping,” arXiv preprint arXiv:2406.12278.
  • Laffont and Tirole (1988) Laffont, J.-J. and J. Tirole (1988): “The dynamics of incentive contracts,” Econometrica: Journal of the Econometric Society, 1153–1175.
  • Levine and Pesendorfer (1995) Levine, D. K. and W. Pesendorfer (1995): “When are agents negligible?” The American Economic Review, 1160–1170.
  • Li et al. (2023) Li, F., Y. Song, and M. Zhao (2023): “Global manipulation by local obfuscation,” Journal of Economic Theory, 207, 105575.
  • Liu et al. (2019) Liu, Q., K. Mierendorff, X. Shi, and W. Zhong (2019): “Auctions with limited commitment,” American Economic Review, 109, 876–910.
  • Makris and Renou (2023) Makris, M. and L. Renou (2023): “Information design in multistage games,” Theoretical Economics, 18, 1475–1509.
  • Mathevet and Steiner (2013) Mathevet, L. and J. Steiner (2013): “Tractable dynamic global games and applications,” Journal of Economic Theory, 148, 2583–2619.
  • Matsui and Matsuyama (1995) Matsui, A. and K. Matsuyama (1995): “An approach to equilibrium selection,” Journal of Economic Theory, 65, 415–434.
  • Matsuyama (1991) Matsuyama, K. (1991): “Increasing returns, industrialization, and indeterminacy of equilibrium,” The Quarterly Journal of Economics, 106, 617–650.
  • Miller et al. (2002) Miller, M., P. Weller, and L. Zhang (2002): “Moral Hazard and The US Stock Market: Analysing the ‘Greenspan Put’,” The Economic Journal, 112, C171–C186.
  • Moore and Repullo (1988) Moore, J. and R. Repullo (1988): “Subgame perfect implementation,” Econometrica: Journal of the Econometric Society, 1191–1220.
  • Morris (2020) Morris, S. (2020): “No trade and feasible joint posterior beliefs,” .
  • Morris et al. (2022) Morris, S., D. Oyama, and S. Takahashi (2022): “On the joint design of information and transfers,” Available at SSRN 4156831.
  • Morris et al. (2024) ——— (2024): “Implementation via Information Design in Binary-Action Supermodular Games,” Econometrica, 92, 775–813.
  • Morris and Shin (2006) Morris, S. and H. S. Shin (2006): “Heterogeneity and uniqueness in interaction games,” The Economy as an Evolving Complex System, 3, 207–42.
  • Morris and Yildiz (2019) Morris, S. and M. Yildiz (2019): “Crises: Equilibrium shifts and large shocks,” American Economic Review, 109, 2823–2854.
  • Murphy et al. (1989) Murphy, K. M., A. Shleifer, and R. W. Vishny (1989): “Industrialization and the big push,” Journal of Political Economy, 97, 1003–1026.
  • Nakamura and Steinsson (2010) Nakamura, E. and J. Steinsson (2010): “Monetary non-neutrality in a multisector menu cost model,” The Quarterly Journal of Economics, 125, 961–1013.
  • Oyama (2002) Oyama, D. (2002): “p-Dominance and equilibrium selection under perfect foresight dynamics,” Journal of Economic Theory, 107, 288–310.
  • Penta (2015) Penta, A. (2015): “Robust dynamic implementation,” Journal of Economic Theory, 160, 280–316.
  • Samet (2005) Samet, D. (2005): “Counterfactuals in wonderland,” Games and Economic Behavior, 51, 2005.
  • Sato (2023) Sato, H. (2023): “Robust implementation in sequential information design under supermodular payoffs and objective,” Review of Economic Design, 27, 269–285.
  • Skreta (2015) Skreta, V. (2015): “Optimal auction design under non-commitment,” Journal of Economic Theory, 159, 854–890.
  • Sun (2006) Sun, Y. (2006): “The exact law of large numbers via Fubini extension and characterization of insurable risks,” Journal of Economic Theory, 126, 31–69.

Appendix to Informational Puts

Appendix A proves Theorem 1. Appendix B analyzes the case in which the designer can use private information.

Appendix A Proofs

Preliminaries. We use the following notation for the time-path of aggregate actions following from AtA_{t}: for sts\geq t, A¯s\bar{A}_{s} solves

dA¯s=λ(1A¯s)dswith boundary A¯t=At.d\bar{A}_{s}=\lambda(1-\bar{A}_{s})\cdot ds\quad\text{with boundary $\bar{A}_{t}=A_{t}$.}

Similarly, for sts\geq t, A¯s\underline{A}_{s} solves

dA¯s=λA¯sdswith boundary A¯t=At.d\underline{A}_{s}=-\lambda\underline{A}_{s}\cdot ds\quad\text{with boundary $\underline{A}_{t}=A_{t}$.}

In words, A¯s\bar{A}_{s} and A¯s\underline{A}_{s} denote future paths of aggregate actions when everyone in the future switches to actions 11 and 0 as quickly as possible, respectively.
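For concreteness, both ODEs admit closed-form solutions: $\bar{A}_{s}=1-(1-A_{t})e^{-\lambda(s-t)}$ and $\underline{A}_{s}=A_{t}e^{-\lambda(s-t)}$. The following minimal Python sketch (an illustration added here, with arbitrary parameter values) integrates both ODEs by an Euler scheme and checks them against these closed forms.

import numpy as np

# Illustration only: arbitrary parameter values, not taken from the paper.
lam = 2.0          # Poisson rate of revision opportunities
A_t = 0.3          # current aggregate action
ds = 1e-4
s = np.arange(0.0, 3.0, ds)   # time elapsed since t

A_up, A_dn = [A_t], [A_t]
for _ in s[1:]:
    A_up.append(A_up[-1] + lam * (1 - A_up[-1]) * ds)   # dA_bar   =  lam (1 - A_bar) ds
    A_dn.append(A_dn[-1] - lam * A_dn[-1] * ds)         # dA_under = -lam A_under     ds
A_up, A_dn = np.array(A_up), np.array(A_dn)

A_up_closed = 1 - (1 - A_t) * np.exp(-lam * s)
A_dn_closed = A_t * np.exp(-lam * s)
print(np.max(np.abs(A_up - A_up_closed)))   # small Euler error
print(np.max(np.abs(A_dn - A_dn_closed)))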

Finally, it will be helpful to define the operator S:Δ(Θ)×[0,1]S:\mathcal{H}\to\Delta(\Theta)\times[0,1] mapping histories to the most recent pair of belief and aggregate action, i.e., S((μs,As)st):=(μt,At)S((\mu_{s},A_{s})_{s\leq t}):=(\mu_{t},A_{t}).

Outline of proof. The proof of Theorem 1 consists of the following steps:

Step 1: We first show the result for binary states Θ={0,1}\Theta=\{0,1\} with θ=1\theta^{*}=1. With slight abuse of notation, we associate beliefs directly with the probability that the state is 11: μt=μt(θ)\mu_{t}=\mu_{t}(\theta^{*}). Then, our lower-dominance region is one-dimensional and summarized by a threshold belief for each AA:

μ¯(A):=maxμΨLD(A)μ(θ)\underline{\mu}(A):=\max_{\mu\in\Psi_{LD}(A)}\mu(\theta^{*})

We show that μt>μ¯(At)\mu_{t}>\underline{\mu}(A_{t}) implies switching to 11 is the unique subgame perfect equilibrium under the information policy 𝝁\bm{\mu}^{*}. We show this in several sub-steps.

  • Step 1A: There exists a belief threshold, which is a ‘rightward’ translation of the lower-dominance region μ¯(At)\underline{\mu}(A_{t}) such that agents find it strictly dominant to play action 11 regardless of others’ actions if the current belief is above this threshold (Lemma 2). We call this threshold ψ0(At)\psi_{0}(A_{t}).

  • Step 1B: For nn\in\mathbb{N}, suppose that agents conjecture that all agents will switch to action 11 at all future histories HH such that S(H)=(μs,As)S(H)=(\mu_{s},A_{s}) fulfills μs>ψn(As)\mu_{s}>\psi_{n}(A_{s}). Under this assumption, we can compute a lower bound (LB) on the expected payoff difference for agents between playing actions 11 and 0 for any given current belief μt(μ¯(At),ψn(At)]\mu_{t}\in(\underline{\mu}(A_{t}),\psi_{n}(A_{t})].

    To do so, we will separately consider the future periods before and after the time at which the aggregate action falls behind the target by more than the tolerated distance, at which point new information is provided. Call this time TT^{*}.

    • Before TT^{*}, we construct the lower bound using the fact that aggregate actions cannot be too far from the target even in the worst-case scenario.

    • At TT^{*}, the designer injects information with binary support. We choose the upward jump size M𝖳𝖮𝖱(D)M\cdot\mathsf{TOR}(D) to be sufficiently large so that, whenever the ‘good signal’ realizes, beliefs exceed ψn(AT)\psi_{n}(A_{T^{*}}). Whenever the ‘bad signal’ realizes, we conjecture the worst case in which all agents switch to action 0.

  • Step 1C: We show that by carefully choosing the information policy, the threshold under which switching to 11 is strictly dominant, ψn+1(At)\psi_{n+1}(A_{t}), is strictly smaller than ψn(At)\psi_{n}(A_{t}). The policy has several key features:

    • Large MM: when the aggregate action ATA_{T^{*}} falls below the tolerated distance 𝖳𝖮𝖱(D(μt,AT))\mathsf{TOR}(D(\mu_{t},A_{T^{*}})) from the target, the high belief after the injection exceeds ψn(AT)\psi_{n}(A_{T^{*}}), which ensures the argument in Step 1B. In particular, we choose MM to be large relative to the Lipschitz constant of ψn\psi_{n}.

    • Small 𝖳𝖮𝖱(D(μt,AT))\mathsf{TOR}(D(\mu_{t},A_{T^{*}})): we should maintain a low tolerance level for deviations from the target. If the designer allowed a large deviation, the aggregate action could drop so low by the time information is injected that agents’ incentives to play action 11 would be too weak to recover.

    • Large 𝖣𝖮𝖶𝖭(D(μt,AT))\mathsf{DOWN}(D(\mu_{t},A_{T^{*}})): the downward jump size should be large relative to the upward jump size M𝖳𝖮𝖱(D(μt,AT))M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})), but not so large that beliefs fall into the lower-dominance region. This ensures that the probability of the belief being high after the injection is sufficiently large.

    These three features guarantee that the lower bound (LB) is sufficiently large and remains positive even when the current belief μt\mu_{t} is slightly below ψn(At)\psi_{n}(A_{t}). Hence ψn+1(At)\psi_{n+1}(A_{t}) is strictly smaller than ψn(At)\psi_{n}(A_{t}), allowing us to expand the range of beliefs under which action 11 is uniquely optimal (Lemma 3).

  • Step 1D: By iterating Step 1C for nn\in\mathbb{N}, we show that ψn(At)\psi_{n}(A_{t}) converges to μ¯(At)\underline{\mu}(A_{t}). Then, if μt>μ¯(At)\mu_{t}>\underline{\mu}(A_{t}), agents who can switch in period tt find it uniquely optimal to choose action 11.

Step 2: We extend the arguments in Step 1 from binary states to finite states: if μtΨLD(At)Bdθ,\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}, then playing action 11 is the unique subgame perfect equilibrium under the information policy 𝝁\bm{\mu}^{*}.

As described in the main text, our policy is such that beliefs move either in the direction d^(μ)\hat{d}(\mu) toward δθ\delta_{\theta^{*}}, or in the direction d^(μ)-\hat{d}(\mu) away from δθ\delta_{\theta^{*}}. The key observation is that we can apply a modification of Step 1 to each direction.

Step 3: We establish sequential optimality:

  • Step 3A: for any ϵ>0\epsilon>0, 𝝁\bm{\mu^{*}} is ϵ\epsilon-sequentially optimal when μtΨLD(At)Bdθ\mu^{*}_{t}\in\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}

  • Step 3B: 𝝁\bm{\mu^{*}} is sequentially optimal when μtΨLD(At)Bdθ.\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}.

Proof of Theorem 1.
 
Step 1. Suppose that Θ={0,1}\Theta=\{0,1\} and θ=1\theta^{*}=1. With slight abuse of notation, we associate beliefs μt\mu_{t} with the probability that θ=1\theta=1. As in the main text, we let μ¯(At)\underline{\mu}(A_{t}) denote the boundary of the lower-dominance region. We will show that, as long as action 11 is not strictly dominated, i.e., μt>μ¯(At)\mu^{*}_{t}>\underline{\mu}(A_{t}), action 11 is played under any subgame perfect equilibrium.

Definition 5.

For nn\in\mathbb{N}, we will construct a sequence (ψn)n(\psi_{n})_{n} where ψnΔ(Θ)×[0,1]\psi_{n}\subset\Delta(\Theta)\times[0,1] is a subset of the round-nn dominance region. ψn\psi_{n} will satisfy the following conditions:

  1. (i)

    Contagion. Action 11 is strictly preferred under every history HH where S(H)ψnS(H)\in\psi_{n} under the conjecture that action 11 is played under every history HH^{\prime} such that S(H)ψn1S(H^{\prime})\in\psi_{n-1}.

  2. (ii)

    Translation. There exists a constant cn>0c_{n}>0 such that ψn={(μ,A):D(μ,A)cn},\psi_{n}=\{(\mu,A):D(\mu,A)\geq c_{n}\}, where D(μ,A)=μμ¯(A)D(\mu,A)=\mu-\underline{\mu}(A).

We initialize ψ0\psi_{0} as the upper-dominance region whereby 11 is strictly dominant.

Observe also that since Δu(,θ)\Delta u(\cdot,\theta) is continuously differentiable on a compact domain, it is Lipschitz, and we let the constant be L>0L>0. Lemma 1 below shows that the boundary of the lower-dominance region (as a function of AA) is also Lipschitz continuous, and we denote its constant by Lμ¯L_{\underline{\mu}}.

Lemma 1.

μ¯()\underline{\mu}(\cdot) is Lipschitz continuous.

Proof of Lemma 1.

Fix any tt. The expected payoff difference between playing 11 and 0 when everyone in the future switches to action 11 is given by

ΔU(μt,At)\displaystyle\Delta U(\mu_{t},A_{t}) :=μt𝔼τ[s=tτer(st){Δu(A¯s,1)Δu(A¯s,0)}𝑑s]\displaystyle:=\mu_{t}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Big{\{}\Delta u(\bar{A}_{s},1)-\Delta u(\bar{A}_{s},0)\Big{\}}ds\bigg{]}
+𝔼τ[s=tτer(st)Δu(A¯s,0)𝑑s].\displaystyle\quad\quad\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\bigg{]}.

Note that ΔU\Delta U is continuously differentiable and strictly increasing in both μt\mu_{t} and AtA_{t}. Since the domain of ΔU\Delta U is compact, the following values are well-defined:

L:=maxA,μΔUA>0,l:=minA,μΔUμ>0.\displaystyle L:=\max_{A,\mu}\frac{\partial\Delta U}{\partial A}>0,\quad l:=\min_{A,\mu}\frac{\partial\Delta U}{\partial\mu}>0.

Then, for any At<AtA_{t}<A_{t}^{\prime} and μt>μt\mu_{t}>\mu_{t}^{\prime}, we have

ΔU(μt,At)ΔU(μt,At)L(AtAt)+l(μtμt)\Delta U(\mu_{t},A_{t})-\Delta U(\mu_{t}^{\prime},A_{t}^{\prime})\geq-L(A_{t}^{\prime}-A_{t})+l(\mu_{t}-\mu_{t}^{\prime})

because the mean value theorem implies

ΔU(μt,At)ΔU(μt,At)\displaystyle\Delta U(\mu_{t},A_{t})-\Delta U(\mu_{t}^{\prime},A_{t}) l(μtμt)\displaystyle\geq l(\mu_{t}-\mu_{t}^{\prime})
ΔU(μt,At)ΔU(μt,At)\displaystyle\Delta U(\mu_{t}^{\prime},A_{t}^{\prime})-\Delta U(\mu_{t}^{\prime},A_{t}) L(AtAt).\displaystyle\leq L(A_{t}^{\prime}-A_{t}).

Substituting μt=μ¯(At)\mu_{t}=\underline{\mu}(A_{t}) and μt=μ¯(At)\mu_{t}^{\prime}=\underline{\mu}(A_{t}^{\prime}) into the above inequality yields

0=ΔU(μ¯(At),At)ΔU(μ¯(At),At)L(AtAt)+l(μ¯(At)μ¯(At)),0=\Delta U(\underline{\mu}(A_{t}),A_{t})-\Delta U(\underline{\mu}(A_{t}^{\prime}),A_{t}^{\prime})\geq-L(A_{t}^{\prime}-A_{t})+l(\underline{\mu}(A_{t})-\underline{\mu}(A_{t}^{\prime})),

where the equality follows from the definition of μ¯\underline{\mu}, i.e., ΔU(μ¯(At),At)=0\Delta U(\underline{\mu}(A_{t}),A_{t})=0 for every AtA_{t}. Hence, we have

μ¯(At)μ¯(At)Ll=:Lμ¯(AtAt).\underline{\mu}(A_{t})-\underline{\mu}(A_{t}^{\prime})\leq\underbrace{\frac{L}{l}}_{=:L_{\underline{\mu}}}(A_{t}^{\prime}-A_{t}).

Step 1A. Construct ψ0\psi_{0}.

Define ψ0\psi_{0} as

ψ0={(μ,A)Δ(Θ)×[0,1]:D(μ,A)c0},\psi_{0}=\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:D(\mu,A)\geq c_{0}\Big{\}},

with c0:=maxAμ¯(A)μ¯(A)c_{0}:=\max_{A}\bar{\mu}(A)-\underline{\mu}(A), where μ¯(A)\bar{\mu}(A) is defined as

μ¯(A):=min{μΔ(Θ):𝔼[tu(1,A¯s,θ)𝑑s]𝔼[tu(0,A¯s,θ)𝑑s]}.\bar{\mu}(A):=\min\Big{\{}\mu\in\Delta(\Theta):\mathbb{E}\Big{[}\int_{t}u(1,\underline{A}_{s},\theta)ds\Big{]}\geq\mathbb{E}\Big{[}\int_{t}u(0,\underline{A}_{s},\theta)ds\Big{]}\Big{\}}.

μ¯(A)\bar{\mu}(A) is the minimum belief under which players prefer action 11 even if all future players choose to play action 0.
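To fix ideas, the following Python sketch computes μ¯(A)\underline{\mu}(A) and μ¯(A)\bar{\mu}(A) for a hypothetical parametric specification Δu(A,θ)=A1.2+1.5θ\Delta u(A,\theta)=A-1.2+1.5\theta (our own illustrative choice, not the paper's primitives), assuming the agent's next revision time is exponential with rate λ\lambda and payoffs are discounted at rate rr, so that 𝔼τ[tτer(st)f(s)𝑑s]=te(r+λ)(st)f(s)𝑑s\mathbb{E}_{\tau}[\int_{t}^{\tau}e^{-r(s-t)}f(s)ds]=\int_{t}^{\infty}e^{-(r+\lambda)(s-t)}f(s)ds. It also reports the constant c0=maxA(μ¯(A)μ¯(A))c_{0}=\max_{A}(\bar{\mu}(A)-\underline{\mu}(A)) used in Step 1A.

import numpy as np

# Hypothetical primitives for illustration only (not from the paper):
#   Delta_u(A, theta) = A - 1.2 + 1.5*theta, so action 1 is dominant in state 1
#   and dominated in state 0; revision times are Exponential(lam), discount rate r.
lam, r = 2.0, 0.5
du = lambda A, th: A - 1.2 + 1.5 * th

u = np.linspace(0.0, 20.0, 200001)        # time elapsed since t
h = u[1] - u[0]
w = np.exp(-(r + lam) * u)                # e^{-r u} * P(revision time > u)
integrate = lambda f: np.sum(w * f) * h

def thresholds(A):
    A_up = 1 - (1 - A) * np.exp(-lam * u)     # everyone switches to 1 (path A_bar)
    A_dn = A * np.exp(-lam * u)               # everyone switches to 0 (path A_under)
    # mu_lower(A): indifference when all future players switch to action 1
    I1, I0 = integrate(du(A_up, 1)), integrate(du(A_up, 0))
    mu_lower = float(np.clip(-I0 / (I1 - I0), 0, 1))
    # mu_upper(A): indifference when all future players switch to action 0
    J1, J0 = integrate(du(A_dn, 1)), integrate(du(A_dn, 0))
    mu_upper = float(np.clip(-J0 / (J1 - J0), 0, 1))
    return mu_lower, mu_upper

grid = np.linspace(0, 1, 21)
vals = [thresholds(A) for A in grid]
c0 = max(up - lo for lo, up in vals)
print(vals[0], vals[-1], c0)    # mu_lower(.) decreases in A; c0 = max_A (mu_upper - mu_lower)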

Lemma 2.

Action 11 is strictly preferred under every history HH where S(H)ψ0S(H)\in\psi_{0}.

Proof of Lemma 2.

Fix any history HH such that S(H)ψ0.S(H)\in\psi_{0}. Then, by the definition of ψ0\psi_{0}, the current (μ,A)(\mu,A) satisfies

μμ¯(A)+maxA{μ¯(A)μ¯(A)}μ¯(A).\displaystyle\mu\geq\underline{\mu}(A)+\max_{A^{\prime}}\left\{\bar{\mu}(A^{\prime})-\underline{\mu}(A^{\prime})\right\}\geq\bar{\mu}(A).

Hence, action 11 is strictly preferred regardless of others’ future play. ∎

Step 1B. Construct a lower bound for the expected payoff difference given ψn\psi_{n}.

Suppose that everyone plays action 11 at any history HH^{\prime} such that S(H)=(μ,A)S(H^{\prime})=(\mu,A) is in the round-nn dominance region ψn\psi_{n}. To obtain ψn+1\psi_{n+1} in Step 1C, we derive a lower bound on the expected payoff difference between playing 11 and 0 given ψn\psi_{n}.

To this end, fix any history HH with the current target aggregate action ZtZ_{t} such that S(H)=(μt,At)ψnS(H)=(\mu_{t},A_{t})\notin\psi_{n} but μt>μ¯(At)\mu_{t}>\underline{\mu}(A_{t}). From our construction of Zt,Z_{t}, we must have ZtAt<𝖳𝖮𝖱(D(μt,At)).Z_{t}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t})).373737By construction, if ZtAt𝖳𝖮𝖱(D(μt,At))Z_{t-}-A_{t}\geq\mathsf{TOR}(D(\mu_{t},A_{t})), Zt=AtZ_{t}=A_{t} must hold, which implies ZtAt=0<𝖳𝖮𝖱(D(μt,At))Z_{t}-A_{t}=0<\mathsf{TOR}(D(\mu_{t},A_{t})). If ZtAt<𝖳𝖮𝖱(D(μt,At))Z_{t-}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t})), ZtAt<𝖳𝖮𝖱(D(μt,At))Z_{t}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t})) is immediate because ZtZ_{t} does not jump. For any continuous path (As)st(A_{s})_{s\geq t}, we define the hitting time T((As)st)T^{*}((A_{s})_{s\geq t}) as follows:

T=inf{st:ZsAs𝖳𝖮𝖱(D(μt,As)) or (μt,As)ψn}.T^{*}=\inf\Big{\{}s\geq t:Z_{s}-A_{s}\geq\mathsf{TOR}(D(\mu_{t},A_{s}))\text{ or }(\mu_{t},A_{s})\in\psi_{n}\Big{\}}.

TT^{*} represents the first time at which either the designer injects new information, or the pair (μs,As)(\mu_{s},A_{s}) enters the round-nn dominance region. We will calculate the agent’s expected payoff before time TT^{*} given the continuous path (As)s[t,T](A_{s})_{s\in[t,T^{*}]} and find a lower bound for this payoff by using the lower bound of AsA_{s} for sts\geq t.

Before time T{T}^{*}. First, we calculate the agent’s payoff before time TT^{*}. Given (As)st(A_{s})_{s\geq t}, we have μs=μt\mu_{s}=\mu_{t} and (μt,As)ψn(\mu_{t},A_{s})\notin\psi_{n} for every s[t,T)s\in[t,T^{*}) because no information is injected when ZsAs<𝖳𝖮𝖱(D(μt,As))Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{t},A_{s})). Define ψn(At)=sup{μΔ(Θ):(μ,At)ψn}\psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}. This implies

𝖳𝖮𝖱(D(μt,As))=𝖳𝖮𝖱(μtμ¯(As))\displaystyle\mathsf{TOR}(D(\mu_{t},A_{s}))=\mathsf{TOR}(\mu_{t}-\underline{\mu}(A_{s})) 𝖳𝖮𝖱(ψn(As)μ¯(As))\displaystyle\leq\mathsf{TOR}(\psi_{n}(A_{s})-\underline{\mu}(A_{s}))
=𝖳𝖮𝖱(ψn(At)μ¯(At)),\displaystyle=\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})), (1)

where the inequality follows from 𝖳𝖮𝖱\mathsf{TOR} being increasing, and the last equality follows from the property that ψn(At)\psi_{n}(A_{t}) is a translation of μ¯(At).\underline{\mu}(A_{t}). Let A¯s=A¯(At,st)\bar{A}_{s}=\bar{A}(A_{t},s-t), which is the aggregate play at sts\geq t when everyone will switch to action 11 as fast as possible. By the definition of ZZ, we must have Zs=A¯sZ_{s}=\bar{A}_{s} for every s[t,T)s\in[t,T^{*}) because ZsAs<𝖳𝖮𝖱(D(μs,As))Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{s},A_{s})). Then we can write down the lower bound of AsA_{s} when s[t,T]s\in[t,T^{*}] as follows:

AsA¯s𝖳𝖮𝖱(D(μt,As))A¯s𝖳𝖮𝖱(ψn(At)μ¯(At)),\displaystyle A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

where the second inequality follows from (1). By Lipschitz continuity of Δu(,θ),\Delta u(\cdot,\theta), we must have

Δu(As,θ)Δu(A¯s,θ)𝖳𝖮𝖱(ψn(At)μ¯(At))L\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L (2)

with the Lipschitz constant LL. Thus, the expected payoff difference of taking action 11 and 0 at time (μt,At)(\mu_{t},A_{t}) given a continuous path (As)s[t,T](A_{s})_{s\in[t,T^{*}]} before time TT^{*} is:

𝔼τ[s=tτTer(st)Δu(As,θ)ds|(As)s[t,T]]\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
=𝔼τ[s=tτTer(st)Δu(A¯s,θ)𝑑s]\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
+𝔼τ[s=tτTer(st)(Δu(As,θ)Δu(A¯s,θ))ds|(As)s[t,T]]\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
𝔼τ[s=tτTer(st)Δu(A¯s,θ)𝑑s]\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
𝖳𝖮𝖱(ψn(At)μ¯(At))L𝔼τ[s=tτTer(st)ds|(As)s[t,T]]\displaystyle\quad\quad-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]} (From (2))
𝔼τ[s=tτTer(st)Δu(A¯s,θ)𝑑s]𝖳𝖮𝖱(ψn(At)μ¯(At))Lλ,\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L}{\lambda}, (3)

where the last inequality follows from

𝔼τ[s=tτTer(st)ds|(As)s[t,T]]𝔼τ[s=0τds]=1λ.\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}\leq\mathbb{E}_{\tau}\bigg{[}\int_{s=0}^{\tau}ds\bigg{]}=\frac{1}{\lambda}.

After time T{T^{*}}. We calculate the lower bound of the expected payoff difference after time TT^{*}. We know that μT=μt.\mu_{T^{*}-}=\mu_{t}. From the definition of TT^{*}, we consider the following two cases depending on whether ZTAT<𝖳𝖮𝖱(D(μT,AT))Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}})) holds or not.

Case 1: ZTAT<𝖳𝖮𝖱(D(μT,AT)){Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}}))}. This means μT=μt\mu_{T^{*}}=\mu_{t} because no information has been injected until TT^{*}. Then the definition of TT^{*} implies (μT,AT)ψn,(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}, where ψn\psi_{n} is the round-nn dominance region. This means every agent strictly prefers to take action 1 at TT^{*}. This increases ATA_{T^{*}}, inducing every agent to take action 11 after time TT^{*}.383838If (μ,A)ψn(\mu,A)\in\psi_{n}, then (μ,A)ψn(\mu,A^{\prime})\in\psi_{n} holds for any A^{\prime}\geq A. Thus, for sTs\geq T^{*}, we have

As=A¯(AT,sT)\displaystyle A_{s}=\bar{A}(A_{T^{*}},s-T^{*}) A¯(A¯T𝖳𝖮𝖱(ψn(At)μ¯(At)),sT)\displaystyle\geq\bar{A}(\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),s-T^{*})
A¯s𝖳𝖮𝖱(ψn(At)μ¯(At)),\displaystyle\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})), (4)

where the first inequality follows from

ATA¯T𝖳𝖮𝖱(D(μt,AT))A¯T𝖳𝖮𝖱(ψn(At)μ¯(At)),A_{T^{*}}\geq\bar{A}_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))\geq\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

and the second inequality follows from

A¯(A¯T𝖳𝖮𝖱(ψn(At)μ¯(At)),sT)\displaystyle\bar{A}(\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),s-T^{*})
=1(1A¯T+𝖳𝖮𝖱(ψn(At)μ¯(At)))exp(λ(sT))\displaystyle=1-\left(1-\bar{A}_{T^{*}}+\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\right)\exp(-\lambda(s-T^{*}))
=A¯s𝖳𝖮𝖱(ψn(At)μ¯(At))exp(λ(sT))\displaystyle=\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\exp(-\lambda(s-T^{*}))
A¯s𝖳𝖮𝖱(ψn(At)μ¯(At)).\displaystyle\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})).

Hence, by the Lipschitz continuity of Δu(,θ)\Delta u(\cdot,\theta), if (μT,AT)ψn(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}, then

Δu(As,θ)Δu(A¯s,θ)𝖳𝖮𝖱(ψn(At)μ¯(At))L.\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L. (5)

The expected payoff difference of taking action 11 and 0 at time (μt,At)(\mu_{t},A_{t}) given a path (As)s[t,T](A_{s})_{s\in[t,T^{*}]} after time TT^{*} is

𝔼τ[s=τTτer(st)Δu(As,θ)ds|(As)s[t,T]]\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
=𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
+𝔼τ[s=τTτer(st)(Δu(As,θ)Δu(A¯s,θ))ds|(As)s[t,T]]\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
𝖳𝖮𝖱(ψn(At)μ¯(At))L𝔼τ[s=τTτer(st)ds|(As)s[t,T]]\displaystyle\quad\quad-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]} (From (5))
𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]𝖳𝖮𝖱(ψn(At)μ¯(At))Lλ.\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L}{\lambda}. (6)

Case 2: ZTAT𝖳𝖮𝖱(D(μT,AT)){Z_{T^{*}}-A_{T^{*}}\geq\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}}))}. By the definition of TT^{*}, information is injected at TT^{*}, and thus the belief at TT^{*} must be

μT={μt+M𝖳𝖮𝖱(D(μt,AT))w.p. p+(μt,AT)μt𝖣𝖮𝖶𝖭(D(μt,AT))w.p. p(μt,AT).\displaystyle\mu_{T^{*}}=\begin{cases}\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))&\text{w.p. $p_{+}(\mu_{t},A_{T^{*}})$}\\ \mu_{t}-\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))&\text{w.p. $p_{-}(\mu_{t},A_{T^{*}})$}.\end{cases}

Note that, if (μT,AT)ψn(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}, then everyone strictly prefers to take action 11 at TT^{*}. This increases ATA_{T^{*}} and induces every agent to take action 11 after time TT^{*} because (μs,As)(\mu_{s},A_{s}) stays in ψn\psi_{n} for all sTs\geq T^{*}. Hence, we can write down the lower bound of AsA_{s} when s>Ts>T^{*} as follows:

\displaystyle A_{s}\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\bar{A}(A_{T^{*}},s-T^{*})+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*})
\displaystyle\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\{\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\}
\displaystyle\quad\quad\quad\quad\quad\quad+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*}),

where the first inequality follows from the fact that everyone in the future will switch to action 0 in the worst-case scenario if (\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}, and the second inequality follows from (4). By Lipschitz continuity of Δu(,θ)\Delta u(\cdot,\theta), we must have, if (\mu_{T^{*}},A_{T^{*}})\in\psi_{n}, then

Δu(As,θ)Δu(A¯s,θ)𝖳𝖮𝖱(ψn(At)μ¯(At))L,\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L, (7)

and if (μT,AT)ψn(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}, then

Δu(As,θ)Δu(A¯s,θ)L(A¯sAs)Δu(A¯s,θ)L.\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-L(\bar{A}_{s}-A_{s})\geq\Delta u(\bar{A}_{s},\theta)-L. (8)

Define pn((μT,AT)ψn(As)s[t,T])p_{n}\coloneqq\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid(A_{s})_{s\in[t,T^{*}]}). The expected payoff difference of taking action 11 and 0 at time (μt,At)(\mu_{t},A_{t}) given a path (As)s[t,T](A_{s})_{s\in[t,T^{*}]} after time TT^{*} is

𝔼τ[s=τTτer(st)Δu(As,θ)ds|(As)s[t,T]]\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
=𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
+𝔼τ[s=τTτer(st)(Δu(As,θ)Δu(A¯s,θ))ds|(As)s[t,T]]\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}
𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
(𝖳𝖮𝖱(ψn(At)μ¯(At))Lpn+L(1pn))𝔼τ[s=τTτer(st)ds|(As)s[t,T]]\displaystyle\quad\quad-\big{(}\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))Lp_{n}+L(1-p_{n})\big{)}\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]} (From (7) and (8))
𝔼τ[s=τTτer(st)Δu(A¯s,θ)𝑑s]𝖳𝖮𝖱(ψn(At)μ¯(At))Lpn+L(1pn)λ.\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))Lp_{n}+L(1-p_{n})}{\lambda}. (9)

Combining before and after time T{T^{*}}. We are ready to construct a lower bound of the expected discounted payoff difference. To evaluate sTs\geq T^{*}, it is sufficient to focus on the case in which information is injected (Case 2) since (9) is smaller than (6) because 𝖳𝖮𝖱(ψn(At)μ¯(At))<1\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))<1. By taking the sum of the payoffs before and after time TT^{*}, that is (3) and (9), the expected payoff difference of taking action 1 and 0 at (μt,At)(\mu_{t},A_{t}) given a path (As)s[t,T](A_{s})_{s\in[t,T^{*}]} is lower-bounded as follows:

𝔼[U1(μt,(As)st)U0(μt,(As)st)|(As)s[t,T]]\displaystyle\mathbb{E}\Big{[}U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})\Big{|}(A_{s})_{s\in[t,T^{*}]}\Big{]}
𝔼τ[s=tτer(st)Δu(A¯s,θ)𝑑s]𝖳𝖮𝖱(ψn(At)μ¯(At))L(1+pn)+L(1pn)λ.\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L(1+p_{n})+L(1-p_{n})}{\lambda}. (LB)

Intuitively, the expected payoff cannot be too low compared to the case where everyone switches to action 11 in the future because (i) aggregate actions are close to the target before new information is injected; and (ii) if the belief jumps upward upon injection, everyone will subsequently switch to action 11.

Step 1C. Finally, we characterize ψn+1\psi_{n+1}. The following lemma establishes that under 𝝁,\bm{\mu^{*}}, ψn\psi_{n} is strictly increasing in the set order.

Lemma 3.

For all nn\in\mathbb{N}, ψnψn+1\psi_{n}\subset\psi_{n+1} (strict inclusion).

Proof of Lemma 3.

To characterize ψn+1\psi_{n+1}, we first show that there exist a tolerance function 𝖳𝖮𝖱\mathsf{TOR}, an upward jump magnitude MM, and a downward jump function 𝖣𝖮𝖶𝖭\mathsf{DOWN} such that if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, then U1(μt,(As)st)U0(μt,(As)st)>0U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})>0.

Suppose μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2. First, we evaluate the first term of (LB). We know from the definition of the lower-dominance region μ¯(At)\underline{\mu}(A_{t}) that

μ¯(At)𝔼τ[s=tτer(st)Δu(A¯s,1)𝑑s]0+(1μ¯(At))𝔼τ[s=tτer(st)Δu(A¯s,0)𝑑s]0\underline{\mu}(A_{t})\underbrace{\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}}_{\geq 0}+(1-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\bigg{]}\geq 0

with equality when μ¯(At)>0\underline{\mu}(A_{t})>0. If μ¯(At)>0\underline{\mu}(A_{t})>0, we must have 𝔼τ[s=tτer(st)Δu(A¯s,0)𝑑s]0\mathbb{E}_{\tau}\big{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\big{]}\leq 0, which implies

𝔼τ,θμt[s=tτer(st)Δu(A¯s,θ)𝑑s]\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
=(μtμ¯(At))𝔼τ[s=tτer(st)(Δu(A¯s,1)Δu(A¯s,0))𝑑s]\displaystyle=(\mu_{t}-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}(\Delta u(\bar{A}_{s},1)-\Delta u(\bar{A}_{s},0))ds\bigg{]}
(μtμ¯(At))𝔼τ[s=tτer(st)Δu(A¯s,1)𝑑s]\displaystyle\geq(\mu_{t}-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}
>C(μtμ¯(At))\displaystyle>C(\mu_{t}-\underline{\mu}(A_{t})) (10)

for some C>0C>0. This constant CC exists because

minAt[0,1]𝔼τ[s=tτer(st)Δu(A¯s,1)𝑑s]>0\min_{A_{t}\in[0,1]}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}>0

since Δu(A,1)>0\Delta u(A,1)>0 for any A[0,1]A\in[0,1]. If μ¯(At)=0\underline{\mu}(A_{t})=0, we have 𝔼τ[s=tτer(st)Δu(A¯s,0)𝑑s]0\mathbb{E}_{\tau}\big{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\big{]}\geq 0, which implies

𝔼τ,θμt[s=tτer(st)Δu(A¯s,θ)𝑑s]μt𝔼τ[s=tτer(st)Δu(A¯s,1)𝑑s]>C(μtμ¯(At)).\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}\geq\mu_{t}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}>C(\mu_{t}-\underline{\mu}(A_{t})).

Additionally, note that if 𝖳𝖮𝖱\mathsf{TOR} satisfies M𝖳𝖮𝖱(D(μs,As))/2μsM\cdot\mathsf{TOR}(D(\mu_{s},A_{s}))/2\leq\mu_{s} for every (μs,As)(\mu_{s},A_{s}), then μtμ¯(At)12(ψn(At)μ¯(At))\mu_{t}-\underline{\mu}(A_{t})\geq\frac{1}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})). This follows from

12(ψn(At)μ¯(At))\displaystyle\frac{1}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})) 12(μtμ¯(At))+12(M𝖳𝖮𝖱(D(μt,At))/2μ¯(At))\displaystyle\leq\frac{1}{2}(\mu_{t}-\underline{\mu}(A_{t}))+\frac{1}{2}(M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2-\underline{\mu}(A_{t}))
<μtμ¯(At),\displaystyle<\mu_{t}-\underline{\mu}(A_{t}),

where the first inequality follows from μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, and the second inequality follows from M𝖳𝖮𝖱(D(μt,At))/2μtM\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2\leq\mu_{t}. Thus, if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, then

𝔼τ,θμt[s=tτer(st)Δu(A¯s,θ)𝑑s]\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]} >C2(ψn(At)μ¯(At)).\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})). (11)

Next, we evaluate the second term of (LB). Notice that, if (μt,At)ψn(\mu_{t},A_{t})\notin\psi_{n}, then

pn=((μT,AT)ψn)=p+(μt,AT)1{(μt+M𝖳𝖮𝖱(D(μt,AT)),AT)ψn}.\displaystyle p_{n}=\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n})=p_{+}(\mu_{t},A_{T^{*}})1\{(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}\}.

We will show that (μt+M𝖳𝖮𝖱(D(μt,AT)),AT)ψn(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n} if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2. Observe that when (μt,At)ψn(\mu_{t},A_{t})\notin\psi_{n}, we must have

AT>At𝖳𝖮𝖱(D(μt,At)).\displaystyle A_{T^{*}}>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t})).

To see this, suppose for a contradiction that ATAt𝖳𝖮𝖱(D(μt,At))A_{T^{*}}\leq A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t})), which implies AT<AtA_{T^{*}}<A_{t}. However, since the definition of TT^{*} implies AT=ZT𝖳𝖮𝖱(D(μt,AT))A_{T^{*}}=Z_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}})), we have

AT\displaystyle A_{T^{*}} =ZT𝖳𝖮𝖱(D(μt,AT))\displaystyle=Z_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))
>At𝖳𝖮𝖱(D(μt,At)),\displaystyle>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t})),

where the inequality follows from ZT=A¯T>AtZ_{T^{*}}=\bar{A}_{T^{*}}>A_{t} and the fact that 𝖳𝖮𝖱(D(μt,A))\mathsf{TOR}(D(\mu_{t},A)) is increasing in AA. This is a contradiction.

Lemma 1 shows that μ¯\underline{\mu} is a Lipschitz function. Since ψn(At)\psi_{n}(A_{t}) is a translation of μ¯(At)\underline{\mu}(A_{t}), ψn(At)\psi_{n}(A_{t}) has the same Lipschitz constant Lμ¯L_{\underline{\mu}} as μ¯(At)\underline{\mu}(A_{t}). Hence, if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, we must have

μt+M𝖳𝖮𝖱(D(μt,AT))\displaystyle\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})) ψn(At)+M𝖳𝖮𝖱(D(μt,At))/2\displaystyle\geq\psi_{n}(A_{t})+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2
>(ψn(AT)Lμ¯𝖳𝖮𝖱(D(μt,At)))+M𝖳𝖮𝖱(D(μt,At))/2\displaystyle>\big{(}\psi_{n}(A_{T^{*}})-L_{\underline{\mu}}\mathsf{TOR}(D(\mu_{t},A_{t}))\big{)}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2
=ψn(AT),\displaystyle=\psi_{n}(A_{T^{*}}),

by setting M=2Lμ¯M=2L_{\underline{\mu}}. Thus, (μt+M𝖳𝖮𝖱(D(μt,AT)),AT)ψn(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n} holds, implying

pn=p+(μt,AT)=𝖣𝖮𝖶𝖭(D(μt,AT))𝖣𝖮𝖶𝖭(D(μt,AT))+M𝖳𝖮𝖱(D(μt,AT)).p_{n}=p_{+}(\mu_{t},A_{T^{*}})=\frac{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}.

We set

𝖣𝖮𝖶𝖭(D(μt,At))=μtμ¯(At)2&𝖳𝖮𝖱(D(μt,At))=δ¯λC(μtμ¯(At))4L+4LM(μtμ¯(At))1,\mathsf{DOWN}(D(\mu_{t},A_{t}))=\frac{\mu_{t}-\underline{\mu}(A_{t})}{2}\quad\text{\&}\quad\mathsf{TOR}(D(\mu_{t},A_{t}))=\bar{\delta}\cdot\frac{\lambda C(\mu_{t}-\underline{\mu}(A_{t}))}{4L+4LM(\mu_{t}-\underline{\mu}(A_{t}))^{-1}},

for a fixed small number δ¯<1\bar{\delta}<1 so that 𝖳𝖮𝖱(D(μ,A))<1\mathsf{TOR}(D(\mu,A))<1 and M𝖳𝖮𝖱(D(μs,As))/2μsM\cdot\mathsf{TOR}(D(\mu_{s},A_{s}))/2\leq\mu_{s} for every μ\mu and AA (e.g., δ¯=min{1,4LλC,4LMλC}\bar{\delta}=\min\{1,\frac{4L}{\lambda C},\frac{4LM}{\lambda C}\}). Thus,

1pn\displaystyle 1-p_{n} =M𝖳𝖮𝖱(D(μt,AT))𝖣𝖮𝖶𝖭(D(μt,AT))+M𝖳𝖮𝖱(D(μt,AT))\displaystyle=\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}
M𝖳𝖮𝖱(D(μt,AT))𝖣𝖮𝖶𝖭(D(μt,AT))\displaystyle\leq\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}
=δ¯λMC2L+2LM(μtμ¯(AT))1\displaystyle=\frac{\bar{\delta}\lambda MC}{2L+2LM(\mu_{t}-\underline{\mu}(A_{T^{*}}))^{-1}}
δ¯λMC2L+2LM(ψn(At)μ¯(At))1,\displaystyle\leq\frac{\bar{\delta}\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}}, (12)

where the last inequality follows from the continuity of AsA_{s} and what we argued in (1) that μtμ¯(As)ψn(At)μ¯(At)\mu_{t}-\underline{\mu}(A_{s})\leq\psi_{n}(A_{t})-\underline{\mu}(A_{t}) for every s<Ts<T^{*}.
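As a sanity check on the construction, the following sketch — with arbitrary illustrative values for LL, CC, λ\lambda, and Lμ¯L_{\underline{\mu}}, which are not pinned down here — verifies that the two-point injection with p+=𝖣𝖮𝖶𝖭/(𝖣𝖮𝖶𝖭+M𝖳𝖮𝖱)p_{+}=\mathsf{DOWN}/(\mathsf{DOWN}+M\cdot\mathsf{TOR}) is a mean-preserving (Bayes-plausible) split of the prior, and that 1pn1-p_{n} obeys the bound in (12).

import numpy as np

# Arbitrary illustrative constants (not derived from any particular payoff).
lam, L, C, L_mu = 2.0, 3.0, 0.8, 1.5
M = 2 * L_mu
delta_bar = min(1.0, 4 * L / (lam * C), 4 * L * M / (lam * C))

TOR  = lambda D: delta_bar * lam * C * D / (4 * L + 4 * L * M / D)
DOWN = lambda D: D / 2

for D in (0.05, 0.2, 0.6):               # D = mu_t - mu_lower(A_t)
    up, down = M * TOR(D), DOWN(D)
    p_plus = down / (down + up)          # probability of the upward jump
    drift = p_plus * up - (1 - p_plus) * down          # Bayes plausibility: must be 0
    bound = delta_bar * lam * M * C / (2 * L + 2 * L * M / D)   # RHS of (12)
    print(D, abs(drift) < 1e-12, 1 - p_plus <= bound + 1e-12)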

Thus, if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2,\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, we have

𝔼[U1(μt,(As)st)U0(μt,(As)st)(As)s[t,T]]\displaystyle\mathbb{E}[U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})\mid(A_{s})_{s\in[t,T^{*}]}]
>C2(ψn(At)μ¯(At))1λ(𝖳𝖮𝖱(ψn(At)μ¯(At))L(1+pn)+L(1pn))\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{1}{\lambda}\Big{(}\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L(1+p_{n})+L(1-p_{n})\Big{)} (From (LB) and (11))
>C2(ψn(At)μ¯(At))Lλ(2𝖳𝖮𝖱(ψn(At)μ¯(At))+(1pn))\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{L}{\lambda}\Big{(}2\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))+(1-p_{n})\Big{)}
C2(ψn(At)μ¯(At))δ¯Lλ(λC(ψn(At)μ¯(At))+λMC2L+2LM(ψn(At)μ¯(At))1)\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{\bar{\delta}L}{\lambda}\cdot\bigg{(}\frac{\lambda C(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))+\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}}\bigg{)} (From the definition of 𝖳𝖮𝖱\mathsf{TOR} and (12))
=C(1δ¯)2(ψn(At)μ¯(At))\displaystyle=\frac{C(1-\bar{\delta})}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))
>0,\displaystyle>0,

for every given path (As)s[t,T](A_{s})_{s\in[t,T^{*}]}.

In conclusion, we found 𝖳𝖮𝖱\mathsf{TOR}, MM, and 𝖣𝖮𝖶𝖭\mathsf{DOWN} such that if μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2, then the agent must choose action 11. Note that 𝖳𝖮𝖱(D(μt,At))\mathsf{TOR}(D(\mu_{t},A_{t})) is increasing in μt\mu_{t} and in AtA_{t}. Thus, μt+M𝖳𝖮𝖱(D(μt,At))/2\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2 is increasing in μt\mu_{t} and continuous in μt\mu_{t} when μt>μ¯(At)\mu_{t}>\underline{\mu}(A_{t}). Therefore, for each AtA_{t}, there exists μ(At)<ψn(At)\mu^{\prime}(A_{t})<\psi_{n}(A_{t}) such that

μ(At)+M𝖳𝖮𝖱(D(μ(At),At))2=ψn(At).\displaystyle\mu^{\prime}(A_{t})+\frac{M\cdot\mathsf{TOR}(D(\mu^{\prime}(A_{t}),A_{t}))}{2}=\psi_{n}(A_{t}).

Then we define

ψn+1={(μt,At):μtμ(At)},\psi_{n+1}=\{(\mu_{t},A_{t}):\mu_{t}\geq\mu^{\prime}(A_{t})\},

which also implies ψn+1(At):=sup{μΔ(Θ):(μ,At)ψn+1}=μ(At)\psi_{n+1}(A_{t}):=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n+1}\}=\mu^{\prime}(A_{t}). From the argument above, every agent must choose action 11 whenever (μt,At)ψn+1(\mu_{t},A_{t})\in\psi_{n+1} (Contagion in Definition 5). Moreover, we can rewrite the above equation as follows:

(ψn+1(At)μ¯(At))+M𝖳𝖮𝖱(ψn+1(At)μ¯(At))2=ψn(At)μ¯(At),\displaystyle(\psi_{n+1}(A_{t})-\underline{\mu}(A_{t}))+\frac{M\cdot\mathsf{TOR}(\psi_{n+1}(A_{t})-\underline{\mu}(A_{t}))}{2}=\psi_{n}(A_{t})-\underline{\mu}(A_{t}),

where the RHS is constant in AtA_{t} by the translation property of ψn\psi_{n}. Thus, ψn+1(At)μ¯(At)\psi_{n+1}(A_{t})-\underline{\mu}(A_{t}) must also be constant in AtA_{t} (Translation in Definition 5). We conclude that the round-(n+1)(n+1) dominance region ψn+1\psi_{n+1} satisfies ψnψn+1\psi_{n}\subset\psi_{n+1} because cn=ψn(At)μ¯(At)>ψn+1(At)μ¯(At)=:cn+1c_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})>\psi_{n+1}(A_{t})-\underline{\mu}(A_{t})=:c_{n+1}. ∎

Step 1D. In the limit, the sequence (ψn)n(\psi_{n})_{n} covers the (μ,A)(\mu,A) region where action 11 is not strictly dominated.

Lemma 4.
nψn={(μ,A)Δ(Θ)×[0,1]:μ>μ¯(A)}.\bigcup_{n\in\mathbb{N}}\psi_{n}=\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:\mu>\underline{\mu}(A)\Big{\}}.
Proof of Lemma 4.

Recall ψn(At)=sup{μΔ(Θ):(μ,At)ψn}.\psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}. By Lemma 3, ψn(At)\psi_{n}(A_{t}) is decreasing in nn. Define ψ(At)=limnψn(At)\psi^{*}(A_{t})=\lim_{n\to\infty}\psi_{n}(A_{t}). In the limit, we must have

ψ(At)+M𝖳𝖮𝖱(D(ψ(At),At))/2=ψ(At)\displaystyle\psi^{*}(A_{t})+M\cdot\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))/2=\psi^{*}(A_{t})
𝖳𝖮𝖱(D(ψ(At),At))=0ψ(At)=μ¯(At),\displaystyle\Rightarrow\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))=0\Rightarrow\psi^{*}(A_{t})=\underline{\mu}(A_{t}),

which implies

n0ψn={(μt,At):μt>μ¯(At)}\bigcup_{n\geq 0}\psi_{n}=\Big{\{}(\mu_{t},A_{t}):\mu_{t}>\underline{\mu}(A_{t})\Big{\}}

as required. ∎
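The recursion behind Lemma 4 can be iterated numerically. A minimal sketch (reusing the arbitrary illustrative constants from the previous snippet) solves c_{n+1} + M\cdot\mathsf{TOR}(c_{n+1})/2 = c_{n} by bisection and shows that cnc_{n} decreases monotonically toward 0, i.e., ψn\psi_{n} expands toward the whole region where action 11 is undominated.

# Same arbitrary illustrative constants as before.
lam, L, C, L_mu = 2.0, 3.0, 0.8, 1.5
M = 2 * L_mu
delta_bar = min(1.0, 4 * L / (lam * C), 4 * L * M / (lam * C))
TOR = lambda D: delta_bar * lam * C * D / (4 * L + 4 * L * M / D) if D > 0 else 0.0

def next_c(c_n):
    # The map c -> c + M*TOR(c)/2 is strictly increasing, so the solution of
    # c_{n+1} + M*TOR(c_{n+1})/2 = c_n is unique and lies in (0, c_n): bisect.
    lo, hi = 0.0, c_n
    for _ in range(80):
        mid = (lo + hi) / 2
        if mid + M * TOR(mid) / 2 < c_n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = 0.7                                   # c_0 from Step 1A (illustrative value)
for n in range(1, 201):
    c = next_c(c)
    if n % 50 == 0:
        print(n, c)                       # monotonically decreasing toward 0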

Step 2. We have constructed an information policy which uniquely implements an equilibrium achieving (2) for |Θ|=2|\Theta|=2. We now lift this to the case with finitely many states Θ={θ1,,θn}\Theta=\{\theta_{1},\ldots,\theta_{n}\} as set out in the main text, where we recall that we set θ\theta^{*} as the dominant state.

In particular, we show that if μtΨLD(At)Bdθ,\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}, then playing action 11 is the unique subgame perfect equilibrium under the information policy 𝝁\bm{\mu}^{*}. To apply Step 1, we will construct an auxiliary binary-state environment for each direction from δθ\delta_{\theta^{*}}.

To this end, we call a vector 𝒅^=(d^θ)θΘn\hat{\bm{d}}=(\hat{d}_{\theta})_{\theta\in\Theta}\in\mathbb{R}^{n} a feasible directional vector if θd^θ=0\sum_{\theta}{\hat{d}_{\theta}}=0 and d^θ=1\hat{d}_{\theta^{*}}=1 but d^θ<0\hat{d}_{\theta}<0 if θθ\theta^{*}\neq\theta. For each feasible directional vector 𝒅^\hat{\bm{d}}, define a function α¯𝒅^:[0,1][0,1]\bar{\alpha}_{\hat{\bm{d}}}:[0,1]\to[0,1] such that, for every A[0,1],A\in[0,1],

α¯𝒅^(A)=inf{α[0,1]:δθ(1α)𝒅^ΨLD(A)Bdθ}.\bar{\alpha}_{\hat{\bm{d}}}(A)=\inf\Big{\{}\alpha\in[0,1]:\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}\notin\Psi_{LD}(A)\cup\text{Bd}_{\theta^{*}}\Big{\}}.

Note that δθ(1α)𝒅^Bdθ\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}\in\text{Bd}_{\theta^{*}} if and only if α=0\alpha=0 because d^θ=1\hat{d}_{\theta^{*}}=1. Observe that

(ΨLD(At)Bdθ)c=𝒅^𝒟{δθ(1α)𝒅^:α(α¯𝒅^(At),1]},\displaystyle\big{(}\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\big{)}^{c}=\bigcup_{\hat{\bm{d}}\in\mathcal{D}}\big{\{}\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}:\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]\big{\}},

where 𝒟\mathcal{D} is the set of all feasible directional vectors. This is true because 1) (ΨLD(At))c\big{(}\Psi_{LD}(A_{t})\big{)}^{c} is a polygon since the expectation operator is linear; and 2) ΨLD(At)Bdθ\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}} is closed in Δ(Θ)\Delta(\Theta). Thus, it is equivalent to show that, for every feasible directional vector 𝒅^,\hat{\bm{d}}, if α(α¯𝒅^(At),1]\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1], then playing action 11 is the unique subgame perfect equilibrium under the information policy 𝝁\bm{\mu^{*}}.

Fix a feasible directional vector 𝒅^.\hat{\bm{d}}. Define

Δ(Θ)𝒅^={δθ(1α)𝒅^:α[0,1]}\Delta(\Theta)_{\hat{\bm{d}}}=\big{\{}\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}:\alpha\in[0,1]\big{\}}

as the set of beliefs whose direction from δθ\delta_{\theta^{*}} is 𝒅^\hat{\bm{d}}. Consider an auxiliary environment with binary state Θ~={0,1}.\tilde{\Theta}=\{0,1\}. Construct a bijection ψ𝒅^:Δ(Θ)𝒅^Δ(Θ~)\psi_{\hat{\bm{d}}}:\Delta(\Theta)_{\hat{\bm{d}}}\to\Delta(\tilde{\Theta}) such that ψ𝒅^(μ)=α\psi_{\hat{\bm{d}}}(\mu)=\alpha if μ=δθ(1α)𝒅^.\mu=\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}. Denote μ~ψ𝒅^(μ)Δ(Θ~)\tilde{\mu}\coloneqq\psi_{\hat{\bm{d}}}(\mu)\in\Delta(\tilde{\Theta}) for every μΔ(Θ)𝒅^\mu\in\Delta(\Theta)_{\hat{\bm{d}}}. Note that ψ𝒅^(δθ)=1\psi_{\hat{\bm{d}}}(\delta_{\theta^{*}})=1.

We define a flow payoff for each player under the new environment u~:{0,1}×[0,1]×Θ~\tilde{u}:\{0,1\}\times[0,1]\times\tilde{\Theta}\to\mathbb{R} as follows:

u~(a,A,θ~)=u(a,A,ψ𝒅^1(θ~)).\tilde{u}(a,A,\tilde{\theta})=u\left(a,A,\psi^{-1}_{\hat{\bm{d}}}(\tilde{\theta})\right).

Here, for a belief μ\mu, u(a,A,μ)u(a,A,\mu) denotes 𝔼θμ[u(a,A,θ)]\mathbb{E}_{\theta\sim\mu}[u(a,A,\theta)]. Define Δu~(A,θ~):=u~(1,A,θ~)u~(0,A,θ~)\Delta\tilde{u}(A,\tilde{\theta}):=\tilde{u}(1,A,\tilde{\theta})-\tilde{u}(0,A,\tilde{\theta}). Since ψ𝒅^\psi_{\hat{\bm{d}}} is a linear map, Δu~(A,θ~)\Delta\tilde{u}(A,\tilde{\theta}) is still continuously differentiable and strictly increasing in A.A. Also, given that Δu~(0,1)=Δu(0,θ)>0\Delta\tilde{u}(0,1)=\Delta u(0,\theta^{*})>0, we still have an action-11-dominance region under this new environment.
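A small sketch may help to see the reduction. We take a hypothetical three-state example (our own illustrative payoffs, not the paper's), fix a feasible directional vector 𝒅^\hat{\bm{d}}, and construct the segment of beliefs Δ(Θ)𝒅^\Delta(\Theta)_{\hat{\bm{d}}} together with the auxiliary payoff difference Δu~\Delta\tilde{u}.

import numpy as np

# Hypothetical three-state illustration; theta* is the last state.
# Assumed payoff difference Delta_u(A, theta) = A - 1.2 + 0.8*theta (illustrative only).
states = [0, 1, 2]
theta_star = 2
du = lambda A, th: A - 1.2 + 0.8 * th

d_hat = np.array([-0.4, -0.6, 1.0])        # feasible direction: entries sum to 0, d_{theta*} = 1
delta_star = np.array([0.0, 0.0, 1.0])     # point mass on theta*

def belief(alpha):
    # psi_d^{-1}(alpha) = delta_{theta*} - (1 - alpha) d_hat, a point of Delta(Theta)_d
    return delta_star - (1 - alpha) * d_hat

def du_tilde(A, theta_tilde):
    # Auxiliary payoff difference: Delta_u averaged under the belief psi_d^{-1}(theta_tilde)
    mu = belief(float(theta_tilde))
    return float(sum(mu[th] * du(A, th) for th in states))

for alpha in (0.0, 0.5, 1.0):
    print(alpha, belief(alpha))            # stays in the simplex; alpha is the weight on theta*
print(du_tilde(0.3, 1), du_tilde(0.3, 0))  # du_tilde(., 1) = Delta_u(., theta*) > 0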

Then we can similarly define the maximum belief under which players prefer action 0 even if all future players choose to play action 1:1:

μ¯𝒅^(At):=max{μ~Δ(Θ~):𝔼[tu~(0,A¯s,θ~)𝑑s]𝔼[tu~(1,A¯s,θ~)𝑑s]}.\underline{\mu}_{\hat{\bm{d}}}(A_{t}):=\max\Big{\{}\tilde{\mu}\in\Delta(\tilde{\Theta}):\mathbb{E}\Big{[}\int_{t}\tilde{u}(0,\bar{A}_{s},\tilde{\theta})ds\Big{]}\geq\mathbb{E}\Big{[}\int_{t}\tilde{u}(1,\bar{A}_{s},\tilde{\theta})ds\Big{]}\Big{\}}.

We define D~(μ~,A)=μ~μ¯𝒅^(A)\tilde{D}(\tilde{\mu},A)=\tilde{\mu}-\underline{\mu}_{\hat{\bm{d}}}(A). Then it is easy to see that D~(μ~,A)=D(μ,A)\tilde{D}(\tilde{\mu},A)=D(\mu,A) for every μΔ(Θ)𝒅^\mu\in\Delta(\Theta)_{\hat{\bm{d}}} with μ~=ψ𝒅^(μ)\tilde{\mu}=\psi_{\hat{\bm{d}}}(\mu) and every A[0,1]A\in[0,1].

A key observation is that if μtΨLD(At)\mu_{t-}\notin\Psi_{LD}(A_{t}) and μtΔ(Θ)𝒅^\mu_{t-}\in\Delta(\Theta)_{\hat{\bm{d}}}, then every future belief must stay in Δ(Θ)𝒅^\Delta(\Theta)_{\hat{\bm{d}}} almost surely under any strategy profile. We can rewrite the time-tt information structure corresponding to the new environment as follows:

  1. 1.

    Silence on-path. If μ~t>μ¯𝒅^(At)\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t}) and |AtZt|<𝖳𝖮𝖱(D)|A_{t}-Z_{t-}|<\mathsf{TOR}(D),

    μt=μt\mu_{t}=\mu_{t-} almost surely,

    i.e., no information, and dZt=λ(1Zt)dtdZ_{t}=\lambda(1-Z_{t-})\,dt.

  2. 2.

    Noisy and asymmetric off-path. If μ~t>μ¯𝒅^(At)\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t}) and ZtAt𝖳𝖮𝖱(D),Z_{t-}-A_{t}\geq\mathsf{TOR}(D),

    μ~t={μ~t+M𝖳𝖮𝖱(D)w.p. 𝖣𝖮𝖶𝖭(D)𝖣𝖮𝖶𝖭(D)+M𝖳𝖮𝖱(D)μ~t𝖣𝖮𝖶𝖭(D)w.p. M𝖳𝖮𝖱(D)𝖣𝖮𝖶𝖭(D)+M𝖳𝖮𝖱(D),\displaystyle\tilde{\mu}_{t}=\begin{cases}\tilde{\mu}_{t-}+M\cdot\mathsf{TOR}(D)&\text{w.p. $\frac{\mathsf{DOWN}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$}\\ \tilde{\mu}_{t-}-\mathsf{DOWN}(D)&\text{w.p. $\frac{M\cdot\mathsf{TOR}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$},\end{cases}

    and reset Zt=AtZ_{t}=A_{t}.

By applying Step 1, we conclude that if μ~t>μ¯𝒅^(At)\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t}), then action 11 is played under any subgame perfect equilibrium. The only subtlety is to verify that as in (10), there exists a constant C>0C>0 such that

minAt[0,1]𝔼τ[s=tτer(st)Δu~(A¯s,1)𝑑s]C\displaystyle\min_{A_{t}\in[0,1]}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta\tilde{u}(\bar{A}_{s},1)ds\bigg{]}\geq C

for any feasible directional vector 𝒅^.\hat{\bm{d}}. This is clear because Δu~(A,1)=Δu(A,θ)Δu(0,θ)>0\Delta\tilde{u}(A,1)=\Delta u(A,\theta^{*})\geq\Delta u(0,\theta^{*})>0 for every AA by the definition of θ\theta^{*}.

Since [0,μ¯𝒅^(At)]=ψ𝒅^(Δ(Θ)𝒅^ΨLD(At))[0,\underline{\mu}_{\hat{\bm{d}}}(A_{t})]=\psi_{\hat{\bm{d}}}\big{(}\Delta(\Theta)_{\hat{\bm{d}}}\cap\Psi_{LD}(A_{t})\big{)}, we have (μ¯𝒅^(At),1]=(α¯𝒅^(At),1](\underline{\mu}_{\hat{\bm{d}}}(A_{t}),1]=(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]. Hence, if α(α¯𝒅^(At),1]\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1], then playing action 11 is the unique subgame perfect equilibrium under the information policy 𝝁\bm{\mu^{*}}, as desired.
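To visualize the policy restated in items 1 and 2 above, the following schematic simulation tracks (μt,At,Zt)(\mu_{t},A_{t},Z_{t}) in discrete time. The boundary μ¯\underline{\mu}, the functions 𝖳𝖮𝖱\mathsf{TOR} and 𝖣𝖮𝖶𝖭\mathsf{DOWN}, and the behavior of agents are all placeholders chosen for illustration; in particular, the imposed behavior rule is only a stress test used to show when the injection (the “put”) is triggered, not the equilibrium play that Theorem 1 pins down.

import numpy as np

rng = np.random.default_rng(0)

# Placeholder primitives (illustrative only).
lam, dt, M = 2.0, 0.01, 3.0
mu_lower = lambda A: float(np.clip(0.6 - 0.4 * A, 0.0, 1.0))   # assumed lower-dominance boundary
TOR  = lambda D: 0.05 * D                                      # assumed tolerance function
DOWN = lambda D: D / 2                                         # assumed downward jump size

mu, A, Z = 0.55, 0.30, 0.30
injections = 0
for step in range(5000):
    D = mu - mu_lower(A)
    if D <= 0:
        break                                   # lower-dominance region: the put has failed
    if Z - A < TOR(D):
        Z += lam * (1 - Z) * dt                 # silence on-path: the target keeps growing
    else:
        up, down = M * TOR(D), DOWN(D)
        p_plus = down / (down + up)
        mu += up if rng.random() < p_plus else -down   # asymmetric, inconclusive injection
        Z = A                                          # reset the target
        injections += 1
    # Imposed (non-equilibrium) behavior: 90% of revising agents play 1, 10% play 0.
    good = 0.9
    A += lam * dt * (good * (1 - A) - (1 - good) * A)
print(injections, round(mu, 3), round(A, 3), round(Z, 3))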

Step 3. We now show sequential optimality. Step 3A handles the case when beliefs are such that 11 is strictly dominated, while 3B handles the case when 11 is not strictly dominated.

Step 3A. 𝝁\bm{\mu^{*}} is ϵ\epsilon-sequentially optimal when μtΨLD(At).\mu^{*}_{t}\in\Psi_{LD}(A_{t}).

Fix any μ0ΨLD(A0)\mu_{0}\in\Psi_{LD}(A_{0}). Define τinf{t:μtΨLD(A0)}\tau^{*}\coloneqq\inf\{t:\mu_{t}\notin\Psi_{LD}(A_{0})\} and τ¯inf{t:μtΨLD(At)}\bar{\tau}\coloneqq\inf\{t:\mu_{t}\notin\Psi_{LD}(A_{t})\}, i.e., τ\tau^{*} and τ¯\bar{\tau} are the first times tt at which the belief μt\mu_{t} is not in ΨLD(A0)\Psi_{LD}(A_{0}) and ΨLD(At)\Psi_{LD}(A_{t}), respectively. This means, at s<τ¯s<\bar{\tau}, all agents who can switch choose action 0. This pins down an aggregate action At=A¯(A0,t)A_{t}=\underline{A}(A_{0},t) for every tτ¯t\leq\bar{\tau}. Therefore, AtA0A_{t}\leq A_{0} for every tτ¯t\leq\bar{\tau}, implying ΨLD(A0)ΨLD(At)\Psi_{LD}(A_{0})\subseteq\Psi_{LD}(A_{t}). Thus, τ¯τ\bar{\tau}\geq\tau^{*}, and so At=A¯(A0,t)A_{t}=\underline{A}(A_{0},t) for every tτt\leq\tau^{*}.

Moreover, we know that AsA¯(At,st)A_{s}\leq\bar{A}(A_{t},s-t) for any sts\geq t, so we can find an upper bound of the designer’s payoff as follows:

𝔼σ[ϕ(𝑨)]\displaystyle\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}
=𝔼τ[ϕ(𝑨)𝟙(τ=)+ϕ(𝑨)𝟙(τ<)]\displaystyle=\mathbb{E}_{\tau^{*}}\left[\phi(\bm{A})\mathbb{1}(\tau^{*}=\infty)+\phi(\bm{A})\mathbb{1}(\tau^{*}<\infty)\right]
𝔼τ[ϕ(𝑨¯)𝟙(τ=)+ϕ(𝑨¯)𝟙(τ<)]\displaystyle\leq\mathbb{E}_{\tau^{*}}\left[\phi(\bm{\underline{A}})\mathbb{1}(\tau^{*}=\infty)+\phi(\bm{\bar{A}})\mathbb{1}(\tau^{*}<\infty)\right]
=ϕ(𝑨¯)+{ϕ(𝑨¯)ϕ(𝑨¯)}(τ<),\displaystyle=\phi(\bm{\underline{A}})+\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}\mathbb{P}(\tau^{*}<\infty),

where 𝑨¯\bm{\underline{A}} satisfies A¯t=A¯(A0,t)\underline{A}_{t}=\underline{A}(A_{0},t), and 𝑨¯\bm{\bar{A}} satisfies A¯t=A¯(A0,t)\bar{A}_{t}=\bar{A}(A_{0},t).
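For a concrete sense of this bound, suppose (purely as an illustration; the paper allows general increasing functionals ϕ\phi) that the designer's objective is the normalized discounted aggregate action ϕ(𝑨)=r0ertAt𝑑t\phi(\bm{A})=r\int_{0}^{\infty}e^{-rt}A_{t}dt. Then ϕ(𝑨¯)=A0r/(r+λ)\phi(\bm{\underline{A}})=A_{0}r/(r+\lambda) and ϕ(𝑨¯)=1(1A0)r/(r+λ)\phi(\bm{\bar{A}})=1-(1-A_{0})r/(r+\lambda), and the upper bound above is a convex combination of the two with weight p(μ0,A0)p^{*}(\mu_{0},A_{0}), whose value is defined in the main text and is treated as a given number in the sketch below.

import numpy as np

# Illustration with an assumed objective phi(A) = r * int_0^inf e^{-rt} A_t dt.
lam, r, A0 = 2.0, 0.5, 0.3
t = np.linspace(0.0, 40.0, 400001)
h = t[1] - t[0]
phi = lambda path: r * np.sum(np.exp(-r * t) * path) * h

A_bar   = 1 - (1 - A0) * np.exp(-lam * t)   # everyone switches to 1
A_under = A0 * np.exp(-lam * t)             # everyone switches to 0

p_star = 0.4    # placeholder for p*(mu_0, A_0), which is defined in the main text
upper = (1 - p_star) * phi(A_under) + p_star * phi(A_bar)
print(phi(A_under), A0 * r / (r + lam))           # numerical value vs closed form
print(phi(A_bar), 1 - (1 - A0) * r / (r + lam))
print(upper)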

For every t[0,)t\in[0,\infty), the optional stopping theorem implies

μ0\displaystyle\mu_{0} =𝔼[μτt]\displaystyle=\mathbb{E}\big{[}\mu_{\tau^{*}\wedge t}\big{]}
=𝔼[μττ<t](τ<t)+𝔼[μtτt](τt)\displaystyle=\mathbb{E}[\mu_{\tau^{*}}\mid\tau^{*}<t]\mathbb{P}(\tau^{*}<t)+\mathbb{E}[\mu_{t}\mid\tau^{*}\geq t]\mathbb{P}(\tau^{*}\geq t)
𝔼[μττ<t]μ^t(τ<t).\displaystyle\geq\underbrace{\mathbb{E}[\mu_{\tau^{*}}\mid\tau^{*}<t]}_{\eqqcolon\hat{\mu}_{t}}\mathbb{P}(\tau^{*}<t).

This implies μ^tF((τ<t),μ0)\hat{\mu}_{t}\in F(\mathbb{P}(\tau^{*}<t),\mu_{0}) for every t.t. By the definition of τ\tau^{*} and the right-continuity of μt\mu_{t}, we have μτΨLDc(A0)¯\mu_{\tau^{*}}\in\overline{\Psi^{c}_{LD}(A_{0})} on the event {τ<}\{\tau^{*}<\infty\}. Since ΨLDc(A0)¯\overline{\Psi^{c}_{LD}(A_{0})} is convex, we also have μ^tΨLDc(A0)¯.\hat{\mu}_{t}\in\overline{\Psi^{c}_{LD}(A_{0})}. This means μ^tInt ΨLD(A0)\hat{\mu}_{t}\notin\text{Int }\Psi_{LD}(A_{0}), but μ^tF((τ<t),μ0)\hat{\mu}_{t}\in F(\mathbb{P}(\tau^{*}<t),\mu_{0}). The definition of p(μ0,A0)p^{*}(\mu_{0},A_{0}) implies p(μ0,A0)(τ<t)p^{*}(\mu_{0},A_{0})\geq\mathbb{P}(\tau^{*}<t) for every t.t. Thus,

𝔼σ[ϕ(𝑨)]\displaystyle\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]} ϕ(𝑨¯)+{ϕ(𝑨¯)ϕ(𝑨¯)}p(μ0,A0)\displaystyle\leq\phi(\bm{\underline{A}})+\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}p^{*}(\mu_{0},A_{0})
=(1p(μ0,A0))ϕ(𝑨¯)+p(μ0,A0)ϕ(𝑨¯).\displaystyle=(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}}).

This implies

(2)=sup𝝁σΣ(𝝁,A0)𝔼σ[ϕ(𝑨)](1p(μ0,A0))ϕ(𝑨¯)+p(μ0,A0)ϕ(𝑨¯).\displaystyle\eqref{eqn:opt}=\sup_{\begin{subarray}{c}\bm{\mu}\in\mathcal{M}\\ \sigma\in\Sigma(\bm{\mu},A_{0})\end{subarray}}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\leq(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}}).

Under 𝝁\bm{\mu}^{*}, if μ0+ΨLDc(A0),\mu_{0+}\in\Psi^{c}_{LD}(A_{0}), then everyone takes action 11 under any equilibrium outcome, as we argued earlier. Thus, for any η>0\eta>0,

infσΣ(𝝁,A0)𝔼σ[ϕ(𝑨)](1p(μ0,A0)+η)ϕ(𝑨¯)+(p(μ0,A0)η)ϕ(𝑨¯).\displaystyle\inf_{\sigma\in\Sigma(\bm{\mu}^{*},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\geq(1-p^{*}(\mu_{0},A_{0})+\eta)\phi(\bm{\underline{A}})+(p^{*}(\mu_{0},A_{0})-\eta)\phi(\bm{\bar{A}}).

Taking the limit η0\eta\to 0, we obtain

(2)=sup𝝁infσΣ(𝝁,A0)𝔼σ[ϕ(𝑨)](1p(μ0,A0))ϕ(𝑨¯)+p(μ0,A0)ϕ(𝑨¯)(2).\displaystyle\eqref{eqn:adv}=\sup_{\bm{\mu}\in\mathcal{M}}\inf_{\sigma\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\geq(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}})\geq\eqref{eqn:opt}.

Since (2)(2)\eqref{eqn:opt}\geq\eqref{eqn:adv}, we obtain (2)=(2)\eqref{eqn:opt}=\eqref{eqn:adv}.

Step 3B. We finally show 𝝁\bm{\mu^{*}} is sequentially optimal when μtΨLD(At)Bdθ\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}. We proceed casewise:

  • Case 1: If μtΨLD(At)\mu_{t-}\notin\Psi_{LD}(A_{t}) and |AtZt|<𝖳𝖮𝖱(D(μt,At))|A_{t}-Z_{t-}|<\mathsf{TOR}(D(\mu_{t-},A_{t})). In this case, there is no information arriving, and everyone takes action 1. This will increase AtA_{t}, and every agent always takes action 11 from time tt onwards. This is the best outcome for the designer, implying sequential optimality.

  • Case 2: If μtΨLD(At)\mu_{t-}\notin\Psi_{LD}(A_{t}) and |AtZt|𝖳𝖮𝖱(D(μt,At))|A_{t}-Z_{t-}|\geq\mathsf{TOR}(D(\mu_{t-},A_{t})). In this case, the belief moves to either μt+(M𝖳𝖮𝖱(D))d^(μt)\mu_{t-}+(M\cdot\mathsf{TOR}(D))\cdot\hat{d}(\mu_{t-}) or μt𝖣𝖮𝖶𝖭(D)d^(μt)\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-}). Note that μt𝖣𝖮𝖶𝖭(D)d^(μt)ΨLD(At)\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-})\notin\Psi_{LD}(A_{t}) because ψ𝒅^(μt𝖣𝖮𝖶𝖭(D)d^(μt))=(μ~t+α¯𝒅^(At))/2>α¯𝒅^(At)\psi_{\hat{\bm{d}}}(\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-}))=(\tilde{\mu}_{t-}+\bar{\alpha}_{\hat{\bm{d}}}(A_{t}))/2>\bar{\alpha}_{\hat{\bm{d}}}(A_{t}). So no matter what information arrives, every agent takes action 1. This will increase AtA_{t}, and every agent always takes action 11 after time tt. Again, this is the best outcome for the designer, implying sequential optimality.

Appendix B Designing private information

In this appendix we discuss whether the designer can do better by designing private information.

Relaxed feasibility for joint belief processes.

We consider the relaxed problem under which each agent’s belief can be ‘separately controlled’ i.e., any joint distribution over agents’ beliefs under which each agent’s marginal belief process is a martingale is feasible under the relaxed problem. There is a common prior μ0\mu_{0} and a private belief process 𝝁i:=(μit)t\bm{\mu}_{i}:=(\mu_{it})_{t}, where μit:=(θ=1|it)\mu_{it}:=\mathbb{P}(\theta=1|\mathcal{F}_{it}) with it\mathcal{F}_{it} being agent ii’s time-tt information, i.e., the σ\sigma-algebra generated by (As,μis)st(A_{s},\mu_{is})_{s\leq t}.

The belief process for agent i[0,1]i\in[0,1], 𝝁i:=(μit)t\bm{\mu}_{i}:=(\mu_{it})_{t}, is R-feasible if it is an (it)t(\mathcal{F}_{it})_{t}-martingale. The set of R-feasible joint belief processes is

P:={(μit)t:i[0,1], (μit)t is R-feasible }.\mathcal{M}^{P}:=\Big{\{}(\mu_{it})_{t}:i\in[0,1],\text{ $(\mu_{it})_{t}$ is R-feasible }\Big{\}}.

We emphasize that this is a necessary condition on beliefs, but is not sufficient (see, e.g., Arieli, Babichenko, Sandomirskiy, and Tamuz (2021); Morris (2020) for a discussion of the static case). Let the set of feasible joint belief processes be F\mathcal{M}^{F}. Although how to characterize this set remains an open question, we know FP\mathcal{M}^{F}\subseteq\mathcal{M}^{P}.
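As an example of the gap between the two sets, consider two agents who each observe one conditionally independent binary signal at a fixed date and nothing thereafter. Each marginal posterior process is a martingale, so the joint process is R-feasible; whether an arbitrary joint law of posteriors can actually be generated by some information structure is the harder feasibility question referenced above. A minimal simulation (with an assumed prior and signal precision) checks the marginal martingale property.

import numpy as np

rng = np.random.default_rng(1)

# Two agents, binary state, common prior mu0; each observes one conditionally
# independent binary signal of precision q (assumed values, for illustration).
mu0, q, n = 0.4, 0.8, 200000
theta = rng.random(n) < mu0
post_high = mu0 * q / (mu0 * q + (1 - mu0) * (1 - q))
post_low  = mu0 * (1 - q) / (mu0 * (1 - q) + (1 - mu0) * q)

posteriors = np.empty((2, n))
for i in range(2):
    signal = np.where(theta, rng.random(n) < q, rng.random(n) < 1 - q)
    posteriors[i] = np.where(signal, post_high, post_low)

print(posteriors.mean(axis=1))   # both close to mu0: each marginal is a martingale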

The problem under private information.

sup𝝁Finf𝝈PBE(𝝁,A0)𝔼σ[ϕ(𝑨)].\sup_{\bm{\mu}\in\mathcal{M}^{F}}\inf_{\bm{\sigma}\in PBE(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}.

Here we have moved from subgame perfection to perfect Bayesian equilibria (Fudenberg and Tirole, 1991) since players now hold private information. Observe, however, that \mathcal{M}\subseteq\mathcal{M}^{F} and that PBE coincides with SPE under public information, so \eqref{eqn:private_ADV}\geq\eqref{eqn:adv}.

Theorem 1B.

Suppose that \mu_{0}\notin\Psi_{LD}(A_{0})\cup\text{Bd}_{\theta^{*}}. Then

(B)=(2).\eqref{eqn:private_ADV}=\eqref{eqn:adv}.

If \mu_{0}\in\Psi_{LD}(A_{0}) and, in addition, \phi is a convex functional, then

\eqref{eqn:private_ADV}-\eqref{eqn:adv}\leq\Big{(}p^{*}(\mu_{0},A_{0})-p^{*}(\mu_{0},1)\Big{)}\Big{(}\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}-\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}\Big{)}.
Proof.

The case in which μ0ΨLD(A0)Bdθ\mu_{0}\notin\Psi_{LD}(A_{0})\cup\text{Bd}_{\theta^{*}} follows directly from Theorem 1 since it already attains the upper bound on the time-path of aggregate play. We prove the second part in several steps.

Step 1A. Constructing a relaxed problem.

Some care is required in constructing the relaxed problem: by moving from \mathcal{M}^{F} to \mathcal{M}^{P}, equilibria of the resulting game might not be well-defined. We deal with this in two ways. First, we weaken PBE to what we call non-dominance, which requires that players play action 1 whenever it is not strictly dominated. Notice that this is not an equilibrium concept and is well-defined even with heterogeneous beliefs. Second, we replace the inner \inf with \sup to obtain the relaxed problem

sup𝝁Psup𝝈ND(𝝁,A0)𝔼σ[ϕ(𝑨)].\sup_{\bm{\mu}\in\mathcal{M}^{P}}\sup_{\bm{\sigma}\in ND(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}.

It is easy to see that this is indeed a relaxed problem, i.e., \eqref{eqn:private_ADV_R}\geq\eqref{eqn:private_ADV}, since (i) \mathcal{M}^{P}\supseteq\mathcal{M}^{F} and (ii) for each \bm{\mu}\in\mathcal{M}^{F}, PBE(\bm{\mu},A_{0})\subseteq ND(\bm{\mu},A_{0}).

Step 1B. Solving the relaxed problem.

First observe that for each player i[0,1]i\in[0,1], a necessary condition for action 11 to not be strictly dominated is

μit>μ¯(A=1)\mu_{it}>\underline{\mu}(A=1)

Hence, consider the strategy σ¯\overline{\sigma} in which each player ii plays 11 if μit>μ¯(A=1)\mu_{it}>\underline{\mu}(A=1) and 0 otherwise. Clearly,

sup𝝁P𝔼σ¯[ϕ(𝑨)](B).\sup_{\bm{\mu}\in\mathcal{M}^{P}}\mathbb{E}^{\overline{\sigma}}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}\geq\eqref{eqn:private_ADV_R}.

Let (\mu_{it})_{t} be any càdlàg martingale and let \tau_{i}:=\inf\{t\in\mathcal{T}:\mu_{it}\notin\Psi_{LD}(1)\}. Clearly this martingale is improvable if it continues to deliver information after \tau_{i}, so it is without loss to consider (\mu_{it})_{t} that are constant a.s. after \tau_{i}. But observe that, since (\mu_{it})_{t} is a martingale, the probability of exiting the region \Psi_{LD}(1) is bounded above by the same calculation as before:

(τi<+)p(μ0,1).\mathbb{P}(\tau_{i}<+\infty)\leq p^{*}(\mu_{0},1).

We define the (random) measure of agents whose beliefs eventually cross \underline{\mu}(1) as follows:

F\displaystyle F =iI1{τi<}𝑑i.\displaystyle=\int_{i\in I}1\{\tau_{i}<\infty\}di.

Consider that

𝔼μ[F]\displaystyle\mathbb{E}_{\mu}[F] =𝔼μ[iI1{τi<}𝑑i]=iIμ(τi<)𝑑ip(μ0,1).\displaystyle=\mathbb{E}_{\mu}\bigg{[}\int_{i\in I}1\{\tau_{i}<\infty\}di\bigg{]}=\int_{i\in I}\mathbb{P}_{\mu}(\tau_{i}<\infty)di\leq p^{*}(\mu_{0},1).
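The bound \mathbb{P}(\tau_{i}<+\infty)\leq p^{*}(\mu_{0},1) is simply the maximal inequality for nonnegative martingales. As a purely illustrative Monte Carlo sketch (Python; our own numbers), recall that in the binary-state case of the online appendix p^{*}(\mu_{0},A)=\mu_{0}/\underline{\mu}(A), so the benchmark below is \mu_{0}/\underline{\mu}(1); the belief martingale is replaced by a symmetric random walk absorbed at 0 and 1, a stand-in rather than any policy from the paper.

```python
import random

def escape_probability(mu0, mu_bar, n_paths=20_000, step=0.02, horizon=5_000):
    """Estimate the probability that a belief martingale started at mu0 ever
    crosses mu_bar. Here the martingale is a symmetric +/- step random walk
    on [0, 1], absorbed at the endpoints (illustration only)."""
    hits = 0
    for _ in range(n_paths):
        mu = mu0
        for _ in range(horizon):
            if mu >= mu_bar:
                hits += 1
                break
            if mu <= 0.0 or mu >= 1.0:
                break
            mu += step if random.random() < 0.5 else -step
    return hits / n_paths

mu0, mu_bar = 0.3, 0.6
print(escape_probability(mu0, mu_bar), "<= bound", mu0 / mu_bar)
```

For this extremal two-barrier example the estimate is close to the bound itself; for a general martingale it can be strictly below.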

Now we will derive the upper bound of AtA_{t} for each realization of (μit)i,t.(\mu_{it})_{i,t}. Agent iIi\in I takes action 11 at time tt only if either

  (I) agent i's Poisson clock ticked before t, and his belief eventually crosses \underline{\mu}(1), or

  (II) agent i took action 1 initially, and his Poisson clock has not ticked yet.

The measures of agents in (I) and (II) are F(1exp(λt))F(1-\exp(-\lambda t)) and A0exp(λt)A_{0}\exp(-\lambda t), respectively. Thus,

At\displaystyle A_{t} F(1exp(λt))+A0exp(λt)\displaystyle\leq F(1-\exp(-\lambda t))+A_{0}\exp(-\lambda t)

almost surely. Define 𝑨¯λ:=(A¯tλ)t\overline{\bm{A}}^{\lambda}:=(\overline{A}_{t}^{\lambda})_{t} as the solution to the ODE dA¯tλ=λ(1A¯tλ)dtd\overline{A}_{t}^{\lambda}=\lambda(1-\overline{A}_{t}^{\lambda})dt with boundary A¯0λ=A0\overline{A}_{0}^{\lambda}=A_{0}, and 𝑨¯λ:=(A¯tλ)t\underline{\bm{A}}^{\lambda}:=(\underline{A}_{t}^{\lambda})_{t} as the solution to the ODE dA¯tλ=λA¯tλdtd\underline{A}^{\lambda}_{t}=-\lambda\underline{A}^{\lambda}_{t}dt with boundary A¯0λ=A0\underline{A}_{0}^{\lambda}=A_{0}. We have

A¯tλ=1(1A0)exp(λt),A¯tλ=A0exp(λt),\overline{A}_{t}^{\lambda}=1-(1-A_{0})\exp(-\lambda t),\quad\quad\underline{A}_{t}^{\lambda}=A_{0}\exp(-\lambda t),

so we can rewrite the upper bound of AtA_{t} as follows:

AtFA¯tλ+(1F)A¯tλt𝑨F𝑨¯λ+(1F)𝑨¯λA_{t}\leq F\overline{A}_{t}^{\lambda}+(1-F)\underline{A}_{t}^{\lambda}\quad\forall t\quad\Rightarrow\quad\bm{A}\leq F\bm{\overline{A}}^{\lambda}+(1-F)\bm{\underline{A}}^{\lambda}

almost surely. Since ϕ\phi is a convex and increasing functional, we must have

ϕ(𝑨)Fϕ(𝑨¯λ)+(1F)ϕ(𝑨¯λ)\phi(\bm{A})\leq F\phi(\bm{\overline{A}}^{\lambda})+(1-F)\phi(\bm{\underline{A}}^{\lambda})

almost surely. This implies

𝔼μ[ϕ(𝑨)]\displaystyle\mathbb{E}_{\mu}\Big{[}\phi(\bm{A})\Big{]} 𝔼μ[F]ϕ(𝑨¯λ)+(1𝔼μ[F])ϕ(𝑨¯λ)\displaystyle\leq\mathbb{E}_{\mu}[F]\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+(1-\mathbb{E}_{\mu}[F])\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}
p(μ0,1)ϕ(𝑨¯λ)+(1p(μ0,1))ϕ(𝑨¯λ)\displaystyle\leq p^{*}(\mu_{0},1)\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+\big{(}1-p^{*}(\mu_{0},1)\big{)}\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}

for every 𝝁.\bm{\mu}. Thus,

(B)(B)p(μ0,1)ϕ(𝑨¯λ)+(1p(μ0,1))ϕ(𝑨¯λ).\displaystyle\eqref{eqn:private_ADV}\leq\eqref{eqn:private_ADV_R}\leq p^{*}(\mu_{0},1)\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+\big{(}1-p^{*}(\mu_{0},1)\big{)}\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}.

This implies

(B)(2)(p(μ0,A0)p(μ0,1))(ϕ(𝑨¯λ)ϕ(𝑨¯λ)),\displaystyle\eqref{eqn:private_ADV}-\eqref{eqn:adv}\leq(p^{*}(\mu_{0},A_{0})-p^{*}(\mu_{0},1))\big{(}\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}-\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}\big{)},

as desired. ∎
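As a purely illustrative aside, the closed forms for \bm{\overline{A}}^{\lambda} and \bm{\underline{A}}^{\lambda} used above are easy to verify numerically; the following short Python sketch (hypothetical parameters \lambda=1, A_{0}=0.4) compares a forward Euler solution of the two ODEs against the exponential formulas.

```python
import math

lam, A0, dt, T = 1.0, 0.4, 1e-3, 10.0   # hypothetical parameters

def euler(rhs, x0):
    """Forward Euler solution of dx/dt = rhs(x) on [0, T]."""
    path, x = [], x0
    for _ in range(int(round(T / dt))):
        path.append(x)
        x += rhs(x) * dt
    return path

A_bar = euler(lambda A: lam * (1.0 - A), A0)   # fastest possible adoption of action 1
A_und = euler(lambda A: -lam * A, A0)          # fastest possible abandonment of action 1
ts = [k * dt for k in range(len(A_bar))]

# Compare against the closed forms stated in the proof.
err_bar = max(abs(a - (1 - (1 - A0) * math.exp(-lam * t))) for a, t in zip(A_bar, ts))
err_und = max(abs(a - A0 * math.exp(-lam * t)) for a, t in zip(A_und, ts))
print("max Euler errors:", err_bar, err_und)   # both should be of order dt
```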

ONLINE APPENDIX TO ‘INFORMATIONAL PUTS’
ANDREW KOH  SIVAKORN SANGUANMOO  KEI UZUI

Online Appendix I develops Theorem 1 for finite players.

Appendix I Finite players

I.1  Preliminaries

Let A_{0}=\bar{A}_{0}=\frac{N-n}{N}, where n is the number of agents who initially play action 0. For each i\in\{1,\dots,n\}, let \tau_{i}\sim Exp(\lambda) be iid exponential random variables with rate \lambda, i.e., \tau_{i} is agent i's random waiting time until her first switching opportunity. We define random variables A_{t} and \bar{A}_{t} as follows:

At\displaystyle A_{t} =A0+1Ni=1n1{τit}\displaystyle=A_{0}+\frac{1}{N}\sum_{i=1}^{n}1\{\tau_{i}\leq t\}
A¯t\displaystyle\bar{A}_{t} =1(1A¯0)eλt,\displaystyle=1-(1-\bar{A}_{0})e^{-\lambda t},

where AtA_{t} is the proportion of agents playing action 11 at time tt when everyone switches to action 11 as quickly as possible, while A¯t\bar{A}_{t} is the auxiliary proportion of agents playing action 11 at time tt when 1eλt1-e^{-\lambda t} of the agents initially playing action 0 have had opportunities to switch by time tt.

If the number of agents is finite, the proportion of agents playing action 1 can fall short of the target by more than the tolerated distance even when no one has switched to action 0. The following lemma provides an upper bound on the probability of such “unlucky” events.

Lemma 5.

(t,At+δA¯t)>112δ4N1\mathbb{P}(\forall t,A_{t}+\delta\geq\bar{A}_{t})>1-12\delta^{-4}N^{-1}.

Proof.

Fix \alpha such that \delta=2N^{-\alpha}. We order (\tau_{i})_{i=1,\dots,n} increasingly as \tau_{(1)}<\tau_{(2)}<\cdots<\tau_{(n)}. For each k\in\{0,\dots,\lceil nN^{\alpha-1}\rceil-1\}, define T_{k}\coloneqq[\tau_{(kN^{1-\alpha})},\tau_{((k+1)N^{1-\alpha})}), where \tau_{(i)}=\tau_{(\lfloor i\rfloor)}, \tau_{(0)}=0, and \tau_{(n+1)}=\infty. If t\in T_{k}, we must have A_{t}\in[A_{0}+\frac{\lfloor kN^{1-\alpha}\rfloor}{N},A_{0}+(k+1)N^{-\alpha}]. Therefore,

(tT,AtA¯tδ)\displaystyle\mathbb{P}(\forall t\leq T,A_{t}\geq\bar{A}_{t}-\delta)
=(k=0nNα11{ω:tTk,AtA¯tδ})\displaystyle=\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\{\omega:\forall t\in T_{k},A_{t}\geq\bar{A}_{t}-\delta\}\bigg{)}
(k=0nNα11{ω:tTk,A0+kN1αNA¯tδ})\displaystyle\geq\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\bigg{\{}\omega:\forall t\in T_{k},A_{0}+\frac{\lfloor kN^{1-\alpha}\rfloor}{N}\geq\bar{A}_{t}-\delta\bigg{\}}\bigg{)}
(k=0nNα11{ω:A0+kN1α1NA¯τ((k+1)N1α)δ})\displaystyle\geq\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}\bigg{)}
\displaystyle=\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}\bigg{)},

where the last equality follows because, if k=\lceil nN^{\alpha-1}\rceil-1, then

A0+kN1α1N\displaystyle A_{0}+\frac{kN^{1-\alpha}-1}{N} >NnN+nN1α1N\displaystyle>\frac{N-n}{N}+\frac{n-N^{1-\alpha}-1}{N}
=1N1α+1N\displaystyle=1-\frac{N^{1-\alpha}+1}{N}
>1δ\displaystyle>1-\delta
>A¯τ((k+1)N1α)δ.\displaystyle>\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta.

We define the event Ωrelax\Omega_{relax} as follows:

Ωrelax=k=0nNα12{τ((k+1)N1α)τ(kN1α)λ1(1+δ/3)log(nkN1αn(k+1)N1α)}Ωk.\displaystyle\Omega_{relax}=\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\underbrace{\bigg{\{}\tau_{((k+1)N^{1-\alpha})}-\tau_{(kN^{1-\alpha})}\leq\lambda^{-1}(1+\delta/3)\log\bigg{(}\frac{n-kN^{1-\alpha}}{n-(k+1)N^{1-\alpha}}\bigg{)}\bigg{\}}}_{\Omega_{k}}.

Under the event Ωrelax\Omega_{relax}, for every knNα12k\leq\lceil nN^{\alpha-1}\rceil-2, we have

A¯τ((k+1)N1α)\displaystyle\bar{A}_{\tau_{((k+1)N^{1-\alpha})}} =1nNexp(λτ((k+1)N1α))\displaystyle=1-\frac{n}{N}\exp(-\lambda\tau_{((k+1)N^{1-\alpha})})
=1nNexp(λi=0k(τ((i+1)N1α)τ(iN1α)))\displaystyle=1-\frac{n}{N}\exp\bigg{(}-\lambda\sum_{i=0}^{k}(\tau_{((i+1)N^{1-\alpha})}-\tau_{(iN^{1-\alpha})})\bigg{)}
\displaystyle\leq 1-\frac{n}{N}\exp\Big{(}-(1+\delta/3)\big{(}\log n-\log(n-(k+1)N^{1-\alpha})\big{)}\Big{)}
=1nN(1(k+1)N1αn)1+δ/3\displaystyle=1-\frac{n}{N}\bigg{(}1-\frac{(k+1)N^{1-\alpha}}{n}\bigg{)}^{1+\delta/3}
1nN(1(1+δ/3)(1+k)N1αn)\displaystyle\leq 1-\frac{n}{N}\bigg{(}1-\frac{(1+\delta/3)(1+k)N^{1-\alpha}}{n}\bigg{)}
=A0+(1+δ/3)(1+k)Nα\displaystyle=A_{0}+(1+\delta/3)(1+k)N^{-\alpha}

Note that k<Nαk<N^{\alpha}. Thus,

\displaystyle\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\leq A_{0}+(1+\delta/3)(1+k)N^{-\alpha}-\delta
A0+kNα+(1+δ/3+δk/3)Nαδ\displaystyle\leq A_{0}+kN^{-\alpha}+(1+\delta/3+\delta k/3)N^{-\alpha}-\delta
A0+kNα+(1+δ/3)Nα2δ/3\displaystyle\leq A_{0}+kN^{-\alpha}+(1+\delta/3)N^{-\alpha}-2\delta/3
A0+kNα1/N,\displaystyle\leq A_{0}+kN^{-\alpha}-1/N,

where the last inequality holds if NN is large and δ=2Nα\delta=2N^{-\alpha}. This implies

\displaystyle\Omega_{relax}\subset\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}.

Now we compute (Ωrelax)\mathbb{P}(\Omega_{relax}). Note that τ((k+1)N1α)τ(kN1α)\tau_{((k+1)N^{1-\alpha})}-\tau_{(kN^{1-\alpha})} has

mean =i=kN1α(k+1)N1α11λ(ni)1λlog(nkN1αn(k+1)N1α)\displaystyle=\sum_{i=\lfloor kN^{1-\alpha}\rfloor}^{\lfloor(k+1)N^{1-\alpha}\rfloor-1}\frac{1}{\lambda(n-i)}\leq\frac{1}{\lambda}\log\bigg{(}\frac{\lfloor n-kN^{1-\alpha}\rfloor}{\lfloor n-(k+1)N^{1-\alpha}\rfloor}\bigg{)}
variance \displaystyle=\sum_{i=\lfloor kN^{1-\alpha}\rfloor}^{\lfloor(k+1)N^{1-\alpha}\rfloor}\frac{1}{\lambda^{2}(n-i)^{2}}\leq\frac{1}{\lambda^{2}}\bigg{(}\frac{1}{\lfloor n-(k+1)N^{1-\alpha}\rfloor}-\frac{1}{\lfloor n-kN^{1-\alpha}\rfloor}\bigg{)}.

Let a_{k}=\lfloor n-kN^{1-\alpha}\rfloor, so that a_{k+1}<a_{k}. Thus, by the Chebyshev inequality, the probability of \Omega_{k}^{c} is bounded above by

\displaystyle\frac{1/a_{k+1}-1/a_{k}}{\delta^{2}(\log a_{k}-\log a_{k+1})^{2}} =\bigg{(}\frac{1/a_{k+1}-1/a_{k}}{\log a_{k}-\log a_{k+1}}\bigg{)}^{2}\cdot\frac{1/\delta^{2}}{1/a_{k+1}-1/a_{k}}
\displaystyle\leq\frac{1}{\delta^{2}a_{k+1}^{2}}\cdot\frac{1}{1/a_{k+1}-1/a_{k}}
\displaystyle=\frac{1}{\delta^{2}}\bigg{(}\frac{1}{a_{k+1}}+\frac{1}{a_{k}-a_{k+1}}\bigg{)}
\displaystyle<\frac{1}{\delta^{2}}\cdot 3N^{\alpha-1}

for every k\leq\lceil nN^{\alpha-1}\rceil-2. Thus, by a union bound, the probability of \Omega_{relax}^{c} (and hence of the unlucky event in the lemma) is bounded above by

Nα3Nα1δ2=3N2α1δ2=34δ2N1δ2=12δ4N1\displaystyle N^{\alpha}\cdot\frac{3N^{\alpha-1}}{\delta^{2}}=3N^{2\alpha-1}\delta^{-2}=3\cdot\frac{4}{\delta^{2}}N^{-1}\delta^{-2}=12\delta^{-4}N^{-1}

since δ=2Nα\delta=2N^{-\alpha}, as desired. ∎
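As an illustrative sanity check of Lemma 5 (not part of the argument), the event in the lemma can be simulated directly. In the Python sketch below the parameters N, n, \lambda, \delta are our own choices; we use the fact that the gap \bar{A}_{t}-A_{t} is largest just before a clock tick, so it suffices to examine the left limits of A at the order statistics of the clocks.

```python
import math
import random

def unlucky(N, n, lam, delta):
    """One draw of the n clocks; True if sup_t (A_bar_t - A_t) > delta,
    i.e. the 'unlucky' event of Lemma 5 occurs on this sample path."""
    A0 = (N - n) / N
    taus = sorted(random.expovariate(lam) for _ in range(n))
    for i, t in enumerate(taus, start=1):
        # Just before the i-th tick, A_{t-} = A0 + (i - 1)/N.
        if (1 - (1 - A0) * math.exp(-lam * t)) - (A0 + (i - 1) / N) > delta:
            return True
    return False

N, n, lam, delta, reps = 2000, 1000, 1.0, 0.4, 1000
freq = sum(unlucky(N, n, lam, delta) for _ in range(reps)) / reps
print("empirical unlucky frequency:", freq)
print("Lemma 5 bound 12*delta^-4/N:", 12 * delta ** -4 / N)
```

The bound is far from tight at these parameter values, which is all the lemma needs.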

Another subtlety with a finite number of agents is that agent ii’s action today affects her future decision problem, and thus she needs to account for this effect when choosing her action today. The following lemma shows that when μ>μ¯(A¯0)\mu>\underline{\mu}(\bar{A}_{0}), it is optimal for her to take action 11 regardless of her future actions.

Lemma 6.

For every agent i, let (\tau_{in})_{n} be an increasing sequence of ticks of agent i's Poisson clock, and let a_{in}\in\{0,1\} be the (random) action agent i takes at \tau_{in}. If \mu>\underline{\mu}(\bar{A}_{0}), then

𝔼μ[n=0τinτi,n+1ersu(ain,A¯s,θ)𝑑s]𝔼μ[0ersu(1,A¯s,θ)𝑑s]\displaystyle\mathbb{E}_{\mu}\Big{[}\sum_{n=0}^{\infty}\int_{\tau_{in}}^{\tau_{i,n+1}}e^{-rs}u(a_{in},\bar{A}_{s},\theta)ds\Big{]}\leq\mathbb{E}_{\mu}\Big{[}\int_{0}^{\infty}e^{-rs}u(1,\bar{A}_{s},\theta)ds\Big{]}
Proof.

For every nn\in\mathbb{N}, consider that

𝔼μ[τinτi,n+1ersΔu(A¯s,θ)𝑑s]\displaystyle\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}\Delta u(\bar{A}_{s},\theta)ds\Big{]} =𝔼μ[erτin𝔼μ[0τi,n+1τi,nersΔu(A¯s+τin,θ)ds|τin]]\displaystyle=\mathbb{E}_{\mu}\bigg{[}e^{-r\tau_{in}}\mathbb{E}_{\mu}\Big{[}\int_{0}^{\tau_{i,n+1}-\tau_{i,n}}e^{-rs}\Delta u(\bar{A}_{s+\tau_{in}},\theta)ds\big{\lvert}\tau_{in}\Big{]}\bigg{]}
𝔼μ[erτin𝔼μ[0τi,n+1τi,nersΔu(A¯s,θ)ds|τin]]\displaystyle\geq\mathbb{E}_{\mu}\bigg{[}e^{-r\tau_{in}}\mathbb{E}_{\mu}\Big{[}\int_{0}^{\tau_{i,n+1}-\tau_{i,n}}e^{-rs}\Delta u(\bar{A}_{s},\theta)ds\big{\lvert}\tau_{in}\Big{]}\bigg{]}
0,\displaystyle\geq 0,

where the last inequality follows from μ>μ¯(A¯0)\mu>\underline{\mu}(\bar{A}_{0}). This implies

𝔼μ[τinτi,n+1ersu(ain,A¯s,θ)𝑑s]𝔼μ[τinτi,n+1ersu(1,A¯s,θ)𝑑s]\displaystyle\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}u(a_{in},\bar{A}_{s},\theta)ds\Big{]}\leq\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}u(1,\bar{A}_{s},\theta)ds\Big{]}

for every n\in\mathbb{N}. Summing over n yields the desired inequality. ∎

I.2  Main theorem

For simplicity, we consider binary states \Theta=\{0,1\} with \theta^{*}=1. Suppose that there are N agents in the economy. Let \Sigma^{N}(\bm{\mu},A_{0}) denote the set of subgame perfect equilibria of the stochastic game induced by a belief martingale \bm{\mu} in an economy with N agents, whenever A_{0} can be written as \frac{k}{N} for some k\in\{0,\dots,N\}. We define the designer's problem under adversarial equilibrium selection with a finite number of agents as follows:

sup𝝁inf𝝈ΣN(𝝁,A0)𝔼σ[ϕ(𝑨)]\sup_{\bm{\mu}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma^{N}(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}
Theorem 2.

Suppose that u(a,\cdot,\theta) is Lipschitz continuous for all a\in\{0,1\} and \theta\in\{0,1\}, and that there exists a constant L_{\phi} such that |\phi(\bm{A})-\phi(\bm{A}^{\prime})|\leq L_{\phi}\|A-A^{\prime}\|_{\infty} for every A,A^{\prime}\in[0,1]^{\infty}. Then the following hold.

  1. There exists a constant d such that, under any subgame perfect equilibrium \sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0}) (\bm{\mu}^{\eta} defined in Theorem 1), an agent takes action 1 if \mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}, for every history H_{t}, aggregate action A_{t}, and belief \mu_{t}.

  2. There exists a constant \bar{C} such that, for any (\mu_{0},A_{0}) (we implicitly assume A_{0}\in\mathbb{Q} and that N\cdot A_{0} is an integer), we have

    |(2)(I.2)|C¯N1/9,\left|\eqref{eqn:opt}-\eqref{eqn:adv-n}\right|\leq\bar{C}N^{-1/9},

    for sufficiently large NN (depending on (μ0,A0)(\mu_{0},A_{0})).

  3. Sequential optimality:

    limη0supHtlimN|inf𝝈ΣN(𝝁η,A0)𝔼σ[ϕ(𝑨)|t]sup𝝁inf𝝈NΣ(𝝁,A0)𝔼σ[ϕ(𝑨)|t]|=0.\lim_{\eta\downarrow 0}\sup_{H_{t}\in\mathcal{H}}\lim_{N\to\infty}\Bigg{|}\inf_{\bm{\sigma}\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\big{[}\phi\big{(}\bm{A}\big{)}\big{|}\mathcal{F}_{t}\big{]}-\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}^{N}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\big{[}\phi(\bm{A})\big{|}\mathcal{F}_{t}\big{]}\Bigg{|}=0.

Proof of Part 1. We follow a method similar to the one used in the proof of Theorem 1. We restate Lemma 3 as follows:

Lemma 7.

There exists d>0d>0 such that if μt>μ¯(At)+(dN)1/9\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}, ψnψn+1\psi_{n}\subset\psi_{n+1} holds for all nn\in\mathbb{N}.

Proof of Lemma 7.

We follow a method similar to the one used in Lemma 2. Suppose that everyone plays action 1 at any history H^{\prime} such that S(H^{\prime})=(\mu,A) is in the round-n dominance region \psi_{n}. To obtain \psi_{n+1}, we derive a lower bound on the expected payoff difference between playing 1 and playing 0, given \psi_{n}.

Fix any history H with current target aggregate action Z_{t} such that S(H)=(\mu_{t},A_{t})\notin\psi_{n}. From our construction of Z_{t}, we have Z_{t-}-A_{t-}<\mathsf{TOR}(D(\mu_{t-},A_{t-})). For any path (A_{s})_{s\geq t} whose increments are at most \frac{1}{N}, we define a (deterministic) hitting time T^{*}((A_{s})_{s\geq t}) as follows:

T=inf{st:ZsAs𝖳𝖮𝖱(D(μt,As)) or (μt,As)ψn}.\displaystyle T^{*}=\inf\{s\geq t:Z_{s}-A_{s}\geq\mathsf{TOR}(D(\mu_{t},A_{s}))\text{ or }(\mu_{t},A_{s})\in\psi_{n}\}.

Fix T^{*}; we first determine the behavior of the path (A_{s})_{s\geq t} given T^{*}.

Before time TT^{*}. For any s[t,T)s\in[t,T^{*}), we showed in (A) that 𝖳𝖮𝖱(D(μt,As))𝖳𝖮𝖱(ψn(At)μ¯(At))\mathsf{TOR}(D(\mu_{t},A_{s}))\leq\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})). By the definition of ZZ, we must have Zs=A¯sZ_{s}=\bar{A}_{s} for every s[t,T)s\in[t,T^{*}) because ZsAs<𝖳𝖮𝖱(D(μt,As))Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{t},A_{s})) for every s<Ts<T^{*}. Then we can write down the lower bound of AsA_{s} when s[t,T)s\in[t,T^{*}) as follows:

AsA¯s𝖳𝖮𝖱(D(μt,As))A¯s𝖳𝖮𝖱(ψn(At)μ¯(At)),\displaystyle A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})), (13)

almost surely given TT^{*}.

After time TT^{*}. Fix any s>Ts>T^{*}. We consider the following two cases.

Case 1: Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{t},A_{T^{*}})). This means \mu_{T^{*}}=\mu_{t} because no information has been injected until T^{*}. Then the definition of T^{*} and the right continuity of Z_{s} and A_{s} imply (\mu_{T^{*}},A_{T^{*}})\in\psi_{n}. This means every agent strictly prefers to take action 1 at T^{*}. This increases A_{s}, inducing every agent to take action 1 after time T^{*} until the time s^{\prime} at which information is injected (i.e., Z_{s^{\prime}}-A_{s^{\prime}}>\mathsf{TOR}(D(\mu_{t},A_{s^{\prime}}))).

The event that no information is injected again after time TT^{*} is equivalent to the event that ZsAs𝖳𝖮𝖱(D(μt,As))Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{s})) for every s>Ts>T^{*}. Observe that

(s>T,ZsAs𝖳𝖮𝖱(D(μt,As))|T)\displaystyle\mathbb{P}\Big{(}\forall s>T^{*},Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{s}))\big{\lvert}T^{*}\Big{)}
(s>T,ZsAs𝖳𝖮𝖱(D(μt,At))|T)\displaystyle\geq\mathbb{P}\Big{(}\forall s>T^{*},Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)}
(s>T,A¯sAs𝖳𝖮𝖱(D(μt,At))|T),\displaystyle\geq\mathbb{P}\Big{(}\forall s>T^{*},\bar{A}_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)},

where the first inequality follows from AsAtA_{s}\geq A_{t}, and A¯s\bar{A}_{s} is defined as As¯=1(1AT)eλ(sT)\bar{A_{s}}=1-(1-A_{T^{*}})e^{-\lambda(s-T^{*})} for every s>Ts>T^{*}. By the definition of ZsZ_{s}, A¯sZs\bar{A}_{s}\geq Z_{s} holds, which implies the second inequality. From Lemma 5, we know that

(s>T,A¯sAs𝖳𝖮𝖱(D(μt,At))|T)112N1𝖳𝖮𝖱(D(μt,At))4.\displaystyle\mathbb{P}\Big{(}\forall s>T^{*},\bar{A}_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)}\geq 1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Therefore, we can write down the lower bound of AsA_{s} when s>Ts>T^{*} under this case as follows:

s>T,AsA¯s𝖳𝖮𝖱(D(μt,As))A¯s𝖳𝖮𝖱(ψn(At)μ¯(At))\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least 112N1𝖳𝖮𝖱(D(μt,At))41-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Case 2: Z_{T^{*}}-A_{T^{*}}\geq\mathsf{TOR}(D(\mu_{t},A_{T^{*}})). In this case, information is injected at T^{*}. Note that, if (\mu_{T^{*}},A_{T^{*}})\in\psi_{n}, then everyone prefers to take action 1 at T^{*}. This increases A_{s}, inducing every agent to take action 1 after time T^{*} until the time s^{\prime} at which information is injected again. Thus, the probability that (\mu_{T^{*}},A_{T^{*}})\in\psi_{n} and no information is injected again is at least

((μT,AT)ψnT){112N1𝖳𝖮𝖱(D(μt,At))4}.\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*})\cdot\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}\right\}.

By definition, we have

((μT,AT)ψnT)=p+(μt,AT)1{(μt+M𝖳𝖮𝖱(D(μt,AT)),AT)ψn}\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*})=p_{+}(\mu_{t},A_{T^{*}})1\{(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}\}

Now we claim that

AT>At𝖳𝖮𝖱(D(μt,At))1N.\displaystyle A_{T^{*}}>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N}.

To see this, suppose for a contradiction that ATAt𝖳𝖮𝖱(D(μt,At))1NA_{T^{*}}\leq A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N}, which implies AT<AtA_{T^{*}}<A_{t}. However, since the definition of TT^{*} implies ATZT𝖳𝖮𝖱(D(μt,AT)),A_{T^{*}-}\geq Z_{T^{*}-}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}-})), we have

AT\displaystyle A_{T^{*}} AT1N\displaystyle\geq A_{T^{*}-}-\frac{1}{N}
ZT𝖳𝖮𝖱(D(μt,AT))1N\displaystyle\geq Z_{T^{*}-}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}-}))-\frac{1}{N}
>At𝖳𝖮𝖱(D(μt,At))1N,\displaystyle>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N},

where the first inequality follows from the increment size of AtA_{t} being at most 1/N1/N by assumption. This is a contradiction.

Hence, if \mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}} (we use this condition when constructing \psi_{n+1}), we must have

μt+M𝖳𝖮𝖱(D(μt,At))\displaystyle\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t})) ψn(At)+M𝖳𝖮𝖱(D(μt,At))/2+1NLμ¯\displaystyle\geq\psi_{n}(A_{t})+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}
>(ψn(AT)Lμ¯𝖳𝖮𝖱(D(μt,At)))+M𝖳𝖮𝖱(D(μt,At))/2\displaystyle>(\psi_{n}(A_{T^{*}})-L_{\underline{\mu}}\mathsf{TOR}(D(\mu_{t},A_{t})))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2
=ψn(AT),\displaystyle=\psi_{n}(A_{T^{*}}),

by setting M=2Lμ¯M=2L_{\underline{\mu}}, where the second inequality follows from Lipschitz continuity of ψn\psi_{n}. Thus, (μt+M𝖳𝖮𝖱(D(μt,At)),AT)ψn(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t})),A_{T^{*}})\in\psi_{n} holds, implying

((μT,AT)ψnT)\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*}) =p+(μt,AT)\displaystyle=p_{+}(\mu_{t},A_{T^{*}})
=1M𝖳𝖮𝖱(D(μt,AT))𝖣𝖮𝖶𝖭(D(μt,AT))+M𝖳𝖮𝖱(D(μt,AT))\displaystyle=1-\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}
1M𝖳𝖮𝖱(D(μt,AT))𝖣𝖮𝖶𝖭(D(μt,AT))\displaystyle\geq 1-\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}
=1δ¯λMC2L+2LM(μtμ¯(AT))1\displaystyle=1-\frac{\bar{\delta}\lambda MC}{2L+2LM(\mu_{t}-\underline{\mu}(A_{T^{*}}))^{-1}}
1δ¯λMC2L+2LM(ψn(At)μ¯(At))1\displaystyle\geq 1-\frac{\bar{\delta}\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}}
1δ¯c(ψn(At)μ¯(At))\displaystyle\geq 1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

for absolute constant c:=λC2Lc:=\frac{\lambda C}{2L}. Then we can write down the lower bound of AsA_{s} for every s>Ts>T^{*} under this case as follows:

s>T,AsA¯s𝖳𝖮𝖱(D(μt,As))A¯s𝖳𝖮𝖱(ψn(At)μ¯(At))\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least

(1δ¯c(ψn(At)μ¯(At)))(112N1𝖳𝖮𝖱(D(μt,At))4)\displaystyle(1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t})))\cdot(1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4})
1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.\displaystyle\geq 1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Combining Case 1 and Case 2, we must have

s>T,AsA¯s𝖳𝖮𝖱(ψn(At)μ¯(At))\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})) (14)

with probability at least 1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Obtaining the lower bound. Our next step is to compute a lower bound on agent i's payoff from taking action 1 at time t and an upper bound on her payoff from taking action 0 at time t. One difference from the case with a continuum of agents is that agent i's action affects the entire future path of aggregate actions; we therefore need to account for this effect when computing the bounds. Finally, using these bounds, we show that there exists d such that if \mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}, agent i strictly prefers action 1 when \mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}, given that all agents take action 1 for all (\mu_{s},A_{s})\in\psi_{n}.

Suppose that agent i takes action a\in\{0,1\} at (\mu_{t-},A_{t-}) and takes a (random) action a_{in} after each tick of her Poisson clock (\tau_{n})_{n}. We call this strategy \sigma_{i} and write A_{s}(\sigma_{i}) for the induced aggregate path (with finitely many agents, A_{s} depends on agent i's own strategy \sigma_{i}). Her payoff from strategy \sigma_{i} is given by

Ui(σi)=𝔼μ[n=0τnτn+1er(st)u(ain,As(σi),θ)𝑑s].\displaystyle U_{i}(\sigma_{i})=\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},A_{s}(\sigma_{i}),\theta)ds\bigg{]}.

In (13), we showed that A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})) for every s<T^{*} (the increments of (A_{s}(\sigma_{i}))_{s} are at most 1/N because the probability that the Poisson clocks of more than one agent tick at the same time is zero, so the earlier arguments apply). After time T^{*}, if no information is injected again, everyone (including agent i) takes action 1, implying a_{in}=1 if \tau_{n}>T^{*}. In (14), we showed

s>T,As(σi)A¯s𝖳𝖮𝖱(ψn(At)μ¯(At))\displaystyle\forall s>T^{*},A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least 1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}. Combining before and after TT^{*}, we have

sT,As(σi)A¯s𝖳𝖮𝖱(ψn(At)μ¯(At))\displaystyle\forall s\neq T^{*},A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least 1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}. By Lipschitz continuity of u(a,,θ)u(a,\cdot,\theta) and Δu(,θ),\Delta u(\cdot,\theta), we must have434343Let L0L_{0} and L1L_{1} be Lipschitz constants of u(0,,θ)u(0,\cdot,\theta) and u(1,,θ)u(1,\cdot,\theta), respectively. Then, Δu(,θ)\Delta u(\cdot,\theta) is Lipschitz continuous with constant L:=L0+L1L:=L_{0}+L_{1}.

sT,a{0,1},|u(a,As(σi),θ)u(a,A¯s,θ)|\displaystyle\forall s\neq T^{*},\forall a\in\{0,1\},|u(a,A_{s}(\sigma_{i}),\theta)-u(a,\bar{A}_{s},\theta)| 𝖳𝖮𝖱(ψn(At)μ¯(At))L\displaystyle\leq\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L

with probability at least PN(μt,At):=1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.P_{N}(\mu_{t},A_{t}):=1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}. Thus, for every strategy σi\sigma_{i} under conjecture ψn\psi_{n}, this implies

\displaystyle\bigg{\lvert}U_{i}(\sigma_{i})-\underbrace{\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},\bar{A}_{s},\theta)ds\bigg{]}}_{\eqqcolon U^{*}_{i}(\sigma_{i})}\bigg{\rvert}
\displaystyle=\bigg{\lvert}\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}\left\{u(a_{in},A_{s}(\sigma_{i}),\theta)-u(a_{in},\bar{A}_{s},\theta)\right\}ds\bigg{]}\bigg{\rvert}
\displaystyle\leq\mathbb{E}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}\Big{(}P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L\Big{)}ds\bigg{]}
\displaystyle=\left\{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L\right\}\cdot\int_{t}^{\infty}e^{-r(s-t)}ds
\displaystyle=\frac{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}.

Now define \sigma^{1}_{i} to be the strategy under which agent i always takes action 1, and suppose that, under \sigma_{i}, agent i takes action 0 at time t. Since \mu_{t}>\underline{\mu}(A_{t}), if \mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}, then

Ui(σi)\displaystyle U^{*}_{i}(\sigma_{i}) =𝔼μ[n=0τnτn+1er(st)u(ain,A¯s,θ)𝑑s]\displaystyle=\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},\bar{A}_{s},\theta)ds\bigg{]}
𝔼μ[tτ1er(st)u(0,A¯s,θ)𝑑s]+𝔼μ[n=1τnτn+1er(st)u(1,A¯s,θ)𝑑s]\displaystyle\leq\mathbb{E}_{\mu}\bigg{[}\int_{t}^{\tau_{1}}e^{-r(s-t)}u(0,\bar{A}_{s},\theta)ds\bigg{]}+\mathbb{E}_{\mu}\bigg{[}\sum_{n=1}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(1,\bar{A}_{s},\theta)ds\bigg{]}
=Ui(σi1)𝔼μ[tτ1er(st)Δu(A¯s,θ)𝑑s]\displaystyle=U^{*}_{i}(\sigma^{1}_{i})-\mathbb{E}_{\mu}\bigg{[}\int_{t}^{\tau_{1}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}
Ui(σi1)C2(ψn(At)μ¯(At)),\displaystyle\leq U^{*}_{i}(\sigma^{1}_{i})-\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

where the first inequality follows from Lemma 6, and the second inequality follows from (LB).

Therefore, we have

Ui(σi1)Ui(σi)\displaystyle U_{i}(\sigma^{1}_{i})-U_{i}(\sigma_{i})
=(Ui(σi1)Ui(σi1))+(Ui(σi1)Ui(σi))+(Ui(σi)Ui(σi))\displaystyle=(U_{i}(\sigma^{1}_{i})-U^{*}_{i}(\sigma^{1}_{i}))+(U^{*}_{i}(\sigma^{1}_{i})-U_{i}^{*}(\sigma_{i}))+(U_{i}^{*}(\sigma_{i})-U_{i}(\sigma_{i}))
C2(ψn(At)μ¯(At))2PN(μt,At)𝖳𝖮𝖱(ψn(At)μ¯(At))L+(1PN(μt,At))Lr\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-2\cdot\frac{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}
C2(ψn(At)μ¯(At))2𝖳𝖮𝖱(ψn(At)μ¯(At))L+(1PN(μt,At))Lr\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-2\cdot\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}

Recall that

PN(μt,At)\displaystyle P_{N}(\mu_{t},A_{t}) =1δ¯c(ψn(At)μ¯(At))12N1𝖳𝖮𝖱(D(μt,At))4.\displaystyle=1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Since D(μt,At)=μtμ¯(At)>(dN)1/9D(\mu_{t},A_{t})=\mu_{t}-\underline{\mu}(A_{t})>(dN)^{-1/9} holds by assumption, we must have

PN(μt,At)>1δ¯c(ψn(At)μ¯(At))12(e¯δ¯)4d8/9N1/9\displaystyle P_{N}(\mu_{t},A_{t})>1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}

for some constants \bar{e} and \underline{e} such that \bar{e}\bar{\delta}D^{2}\geq\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2} (by the definition of \delta, any \bar{e}\geq\lambda C/4L and \underline{e}\leq\lambda C/\{4L(1+M)\} work). Let \psi_{n}(A_{t})-\underline{\mu}(A_{t})=\phi_{n}. Since (\mu_{t},A_{t})\notin\psi_{n}, we have \phi_{n}\geq D(\mu_{t},A_{t})>(dN)^{-1/9}. Thus,

Ui(σi1)Ui(σi)\displaystyle U_{i}(\sigma^{1}_{i})-U_{i}(\sigma_{i})
Cϕn22e¯δ¯ϕn2L+(δ¯cϕn+12(e¯δ¯)4d8/9N1/9)r\displaystyle\geq\frac{C\phi_{n}}{2}-2\cdot\frac{\bar{e}\bar{\delta}\phi_{n}^{2}L+(\bar{\delta}c\phi_{n}+12(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9})}{r}
(C22e¯δ¯+δ¯cr)ϕn24(e¯δ¯)4d8/9N1/9r\displaystyle\geq\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}\phi_{n}-\frac{24(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}}{r}
\displaystyle\geq\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}(dN)^{-1/9}-\frac{24(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}}{r}
>0,\displaystyle>0,

where these inequalities are true if

δ¯\displaystyle\bar{\delta} <Cr2(2e¯+c)\displaystyle<\frac{Cr}{2(2\bar{e}+c)}
d\displaystyle d <r(e¯δ¯)424(C22e¯δ¯+δ¯cr).\displaystyle<\frac{r(\underline{e}\bar{\delta})^{4}}{24}\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}.

In conclusion, we have shown that there exists a constant dd such that if μt>μ¯(At)+(dN)1/9\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}, agent ii strictly prefers action 11 when μtψn(At)M𝖳𝖮𝖱(D(μt,At))/2+1NLμ¯\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}, given that all agents take action 11 for all (μs,As)ψn(\mu_{s},A_{s})\in\psi_{n}.

Characterizing \psi_{n+1}. Note that \delta is increasing and continuous in its argument. Thus, \mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2 is increasing and continuous in \mu_{t}. Therefore, for each A_{t}, there exists \mu^{\prime}(A_{t})<\psi_{n}(A_{t}) such that

μ(At)+M𝖳𝖮𝖱(D(μ(At),At))2=ψn(At)+Lμ¯N\displaystyle\mu^{\prime}(A_{t})+\frac{M\cdot\mathsf{TOR}(D(\mu^{\prime}(A_{t}),A_{t}))}{2}=\psi_{n}(A_{t})+\frac{L_{\underline{\mu}}}{N}

if

M𝖳𝖮𝖱(D(ψn(At),At))2>Lμ¯N.\displaystyle\frac{M\cdot\mathsf{TOR}(D(\psi_{n}(A_{t}),A_{t}))}{2}>\frac{L_{\underline{\mu}}}{N}.

A sufficient condition for this is

M2δ¯e¯(dN)2/9>Lμ¯Nd<(Mδ¯e¯2Lμ¯)92N72.\displaystyle\frac{M}{2}\bar{\delta}\underline{e}\left(dN\right)^{-2/9}>\frac{L_{\underline{\mu}}}{N}\Leftrightarrow d<\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{\frac{9}{2}}N^{\frac{7}{2}}.

Hence, taking dd such that

d<min{(Mδ¯e¯2Lμ¯)92,r(e¯δ¯)424(C22e¯δ¯+δ¯cr)}\displaystyle d<\min\left\{\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{\frac{9}{2}},\frac{r(\underline{e}\bar{\delta})^{4}}{24}\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}\right\} (15)

is sufficient. Then we define

ψn+1={(μt,At):μtμ(At)}\psi_{n+1}=\{(\mu_{t},A_{t}):\mu_{t}\geq\mu^{\prime}(A_{t})\}

From the argument above, every agent chooses action 1 whenever (\mu_{t},A_{t})\in\psi_{n+1}. Moreover, we can rewrite the above equation as follows:

(μ(At)μ¯(At))+M𝖳𝖮𝖱(μ(At)μ¯(At))2=ψn(At)μ¯(At)+Lμ¯N,\displaystyle(\mu^{\prime}(A_{t})-\underline{\mu}(A_{t}))+\frac{M\cdot\mathsf{TOR}(\mu^{\prime}(A_{t})-\underline{\mu}(A_{t}))}{2}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})+\frac{L_{\underline{\mu}}}{N},

where the RHS is constant in A_{t} by the property of \psi. Thus, \mu^{\prime}(A_{t})-\underline{\mu}(A_{t}) must also be constant in A_{t}. This shows that the round-(n+1) dominance region \psi_{n+1} satisfies \psi_{n}\subset\psi_{n+1}, because c_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})>\mu^{\prime}(A_{t})-\underline{\mu}(A_{t})=:c_{n+1} when \eqref{ineq: sufficient condition} is satisfied. ∎

To conclude the proof of part 1 of Theorem 2, we show the following lemma.

Lemma 8.
nψn{(μ,A)Δ(Θ)×[0,1]:μ>μ¯(A)+(dN)1/9}.\bigcup_{n\in\mathbb{N}}\psi_{n}\supseteq\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:\mu>\underline{\mu}(A)+(dN)^{-1/9}\Big{\}}.
Proof of Lemma 8.

Recall \psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}. By Lemma 7, \psi_{n}(A_{t}) is decreasing in n. Define \psi^{*}(A_{t})=\lim_{n\to\infty}\psi_{n}(A_{t}). In the limit, we must have

ψ(At)+M𝖳𝖮𝖱(D(ψ(At),At))/2=ψ(At)+Lμ¯/N\displaystyle\psi^{*}(A_{t})+M\cdot\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))/2=\psi^{*}(A_{t})+L_{\underline{\mu}}/N
𝖳𝖮𝖱(D(ψ(At),At))=2Lμ¯/(MN).\displaystyle\Rightarrow\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))=2L_{\underline{\mu}}/(MN).

Since our choice of dd by (15) ensures

2Lμ¯MNδ((dN)1/9),\frac{2L_{\underline{\mu}}}{MN}\leq\delta\left((dN)^{-1/9}\right),

we have

D(ψ(At),At)μtμ¯(At)ψ(At)μtD(\psi^{*}(A_{t}),A_{t})\leq\mu_{t}-\underline{\mu}(A_{t})\Leftrightarrow\psi^{*}(A_{t})\leq\mu_{t}

for any μt>μ¯(At)+(dN)1/9\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}, as desired. ∎
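To visualize the contagion mechanics behind Lemmas 7 and 8, the recursion c_{n+1}+M\cdot\mathsf{TOR}(c_{n+1})/2=c_{n}+L_{\underline{\mu}}/N can be iterated numerically. The Python sketch below uses the quadratic specification \mathsf{TOR}(D)=e\,\bar{\delta}D^{2}, in line with the two-sided bounds \underline{e}\bar{\delta}D^{2}\leq\mathsf{TOR}(D)\leq\bar{e}\bar{\delta}D^{2} used above; all constants are made up for illustration and are not claimed to satisfy \eqref{ineq: sufficient condition}. The thresholds c_{n} decrease monotonically toward the fixed point c^{*} with \mathsf{TOR}(c^{*})=2L_{\underline{\mu}}/(MN), mirroring the limit computation in Lemma 8.

```python
# Illustrative constants only (not the paper's calibrated values).
M, e_low, dbar, L_mu, N = 2.0, 0.5, 0.1, 1.0, 10_000

def tor(D):
    # Quadratic tolerance specification used for this illustration.
    return e_low * dbar * D * D

def next_c(c):
    """Solve x + (M/2) * tor(x) = c + L_mu / N for x by bisection on [0, c]."""
    target, lo, hi = c + L_mu / N, 0.0, c
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if mid + 0.5 * M * tor(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = 0.5                      # c_0: distance of the round-0 threshold from mu_lower
for _ in range(5000):
    c = next_c(c)
print("limit c*:", c)
print("TOR(c*) vs 2*L_mu/(M*N):", tor(c), 2 * L_mu / (M * N))
```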

Proof of Part 2.

Consider the following two cases:

Case 1: μ0>μ¯(A0)\mu_{0}>\underline{\mu}(A_{0}). Consider NN large enough so that μ0>μ¯(A0)+(dN)1/9.\mu_{0}>\underline{\mu}(A_{0})+(dN)^{-1/9}. Under 𝝁η\bm{\mu}^{\eta} and the environment of NN agents, Part 1 implies everyone takes action 1 under any equilibrium outcome until new information is injected.

Without loss of generality, we assume ϕ0.\phi\geq 0. Let 𝝉:=(τi)i=1N\bm{\tau}:=(\tau_{i})_{i=1}^{N}. We have

infσΣN(𝝁η,A0)𝔼σ[ϕ(𝑨)]\displaystyle\inf_{\sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}
𝔼𝝉[1{t,A¯tA¯tN𝖳𝖮𝖱(D(μ0,At))}ϕ(𝑨¯N)]\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{t}))\right\}\phi(\bm{\bar{A}}^{N})\right]
𝔼𝝉[1{t,A¯tA¯tN𝖳𝖮𝖱(D(μ0,A0))}ϕ(𝑨¯N)]\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{0}))\right\}\phi(\bm{\bar{A}}^{N})\right]
𝔼𝝉[1{t,A¯tA¯tN𝖳𝖮𝖱(D(μ0,A0))}ϕ(𝑨¯)]Lϕ𝖳𝖮𝖱(D(μ0,A0))\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{0}))\right\}\phi(\bm{\bar{A}})\right]-L_{\phi}\cdot\mathsf{TOR}(D(\mu_{0},A_{0}))
{112N1𝖳𝖮𝖱(D(μ0,A0))4}ϕ(𝑨¯)Lϕ𝖳𝖮𝖱(D(μ0,A0))\displaystyle\geq\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{0},A_{0}))^{-4}\right\}\phi(\bm{\bar{A}})-L_{\phi}\cdot\mathsf{TOR}(D(\mu_{0},A_{0})) (Lemma 5)
{112N1/9(e¯δ¯)4d8/9}ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9\displaystyle\geq\left\{1-12N^{-1/9}(\underline{e}\bar{\delta})^{-4}d^{8/9}\right\}\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9} (𝖳𝖮𝖱(D)e¯δ¯D2\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2})
ϕ(𝑨¯)C¯N1/9\displaystyle\geq\phi(\bm{\bar{A}})-\bar{C}N^{-1/9}

for some constant C¯\bar{C}. The proof of Theorem 1 implies (2)=ϕ(𝑨¯).\eqref{eqn:opt}=\phi(\bm{\bar{A}}). Thus,

|(2)(I.2)|C¯N1/9\displaystyle|\eqref{eqn:opt}-\eqref{eqn:adv-n}|\leq\bar{C}N^{-1/9}

when NN is large enough, as desired.

Case 2: μ0μ¯(A0)\mu_{0}\leq\underline{\mu}(A_{0}). The proof of Theorem 1 implies

(2)=sup𝝁σΣ(𝝁,A0)𝔼σ[ϕ(𝑨)](1p(μ0))ϕ(𝑨¯)+p(μ0)ϕ(𝑨¯),\displaystyle\eqref{eqn:opt}=\sup_{\begin{subarray}{c}\bm{\mu}\in\mathcal{M}\\ \sigma\in\Sigma(\bm{\mu},A_{0})\end{subarray}}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\leq(1-p^{*}(\mu_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0})\phi(\bm{\bar{A}}),

where p(μ0):=μ0/μ¯(A0)p^{*}(\mu_{0}):=\mu_{0}/\underline{\mu}(A_{0}), 𝑨¯\bm{\underline{A}} satisfies A¯t=A¯(A0,t)=A0eλt\underline{A}_{t}=\underline{A}(A_{0},t)=A_{0}e^{-\lambda t}, and 𝑨¯\bm{\bar{A}} satisfies A¯t=A¯(A0,t)=1(1A0)eλt\bar{A}_{t}=\bar{A}(A_{0},t)=1-(1-A_{0})e^{-\lambda t}.

Let η>2(dN)1/9/(2(dN)1/9+μ¯(A0))\eta>2(dN)^{-1/9}/(2(dN)^{-1/9}+\underline{\mu}(A_{0})). In this case, we have

\mu_{0+}=\frac{\mu_{0}}{p^{*}(\mu_{0})-\eta}>\underline{\mu}(A_{0})+2(dN)^{-1/9},

where \mu_{0+} is the maximal escaping belief defined in the main text. Under \bm{\mu}^{\eta} with N agents, if \mu_{0+}>\underline{\mu}(A_{0})+(dN)^{-1/9}, then by Part 1 everyone takes action 1 under any equilibrium outcome until new information is injected. Thus, we have

infσΣN(𝝁η,A0)𝔼σ[ϕ(𝑨)]\displaystyle\inf_{\sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}
(1p(μ0)+η)𝔼𝝉[1{t,|A¯tA¯tN|𝖳𝖮𝖱(D(μt,At))}ϕ(𝑨¯N)]\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\left|\underline{A}_{t}-\underline{A}_{t}^{N}\right|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\underline{A}}^{N})\right]
+(p(μ0)η)𝔼𝝉[1{t,A¯tA¯tN𝖳𝖮𝖱(D(μt,At))}ϕ(𝑨¯N)]\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\bar{A}}^{N})\right]
(1p(μ0)+η)𝔼𝝉[1{t,|A¯tA¯tN|𝖳𝖮𝖱(D(μt,At))}ϕ(𝑨¯N)]\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\left|\underline{A}_{t}-\underline{A}_{t}^{N}\right|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\underline{A}}^{N})\right]
+(p(μ0)η){112N1𝖳𝖮𝖱(D(μt,At))4}𝔼𝝉[ϕ(𝑨¯N)]\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}\right\}\mathbb{E}_{\bm{\tau}}\left[\phi(\bm{\bar{A}}^{N})\right]
(1p(μ0)+η)(1𝒪(N1/9)){ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9}\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\left(1-\mathcal{O}(N^{-1/9})\right)\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}
+(p(μ0)η){112N1/9(e¯δ¯)4d8/9}{ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9},\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-12N^{-1/9}(\underline{e}\bar{\delta})^{-4}d^{8/9}\right\}\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\},

where 𝑨¯N\bm{\bar{A}}^{N} and 𝑨¯N\bm{\underline{A}}^{N} satisfy

A¯tN\displaystyle\bar{A}_{t}^{N} =A0+1Ni=1n1{τit}\displaystyle=A_{0}+\frac{1}{N}\sum_{i=1}^{n}1\{\tau_{i}\leq t\}
A¯tN\displaystyle\underline{A}_{t}^{N} =A01Ni=1Nn1{τit}\displaystyle=A_{0}-\frac{1}{N}\sum_{i=1}^{N-n}1\{\tau_{i}\leq t\}

with n being the number of agents playing 0 at time 0. Note that the second inequality follows from Lemma 5, and the third inequality follows from \mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2}\geq\underline{e}\bar{\delta}(dN)^{-2/9}. Note also that we can apply an argument similar to that of Lemma 5 to show that

(t,|A¯tA¯tN|𝖳𝖮𝖱(D(μt,At)))>1𝒪(N1/9).\mathbb{P}\left(\forall t,\left|\underline{A}_{t}-\underline{A}_{t}^{N}\right|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right)>1-\mathcal{O}(N^{-1/9}).

This implies

(I.2)\displaystyle\eqref{eqn:adv-n} (1p(μ0)+η){1𝒪(N1/9)}{ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9}\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\left\{1-\mathcal{O}(N^{-1/9})\right\}\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}
+(p(μ0)η){1𝒪(N1/9)}{ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9}\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-\mathcal{O}(N^{-1/9})\right\}\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}

Hence, we have

|(2)(I.2)|\displaystyle|\eqref{eqn:opt}-\eqref{eqn:adv-n}| η{ϕ(𝑨¯)ϕ(𝑨¯)}+Lϕ(e¯δ¯)(dN)2/9\displaystyle\leq\eta\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}+L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}
+𝒪(N1/9)(p(μ0)η){ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9}\displaystyle\quad+\mathcal{O}(N^{-1/9})(p^{*}(\mu_{0})-\eta)\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}
+𝒪(N1/9)(1p(μ0)+η){ϕ(𝑨¯)Lϕ(e¯δ¯)(dN)2/9}\displaystyle\quad+\mathcal{O}(N^{-1/9})(1-p^{*}(\mu_{0})+\eta)\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}
=η{ϕ(𝑨¯)ϕ(𝑨¯)}+𝒪(N1/9).\displaystyle=\eta\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}+\mathcal{O}(N^{-1/9}).

With our choice of \eta, there exists a constant \bar{C} such that |\eqref{eqn:opt}-\eqref{eqn:adv-n}|\leq\bar{C}N^{-1/9}, as desired. ∎

Proof of Part 3.

If \mu_{t-}>\underline{\mu}(A_{t}), consider N large enough such that \mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9}. We consider the following two cases:

  • Case 1: \mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9} and Z_{t-}\leq A_{t}. In this case, no information arrives and everyone takes action 1. This increases A_{t}, and every agent takes action 1 from time t onwards as long as \bar{A}_{s}-\bar{A}_{s}^{N}\leq\mathsf{TOR}(D(\mu_{s},A_{s})) for all s\geq t. Since Lemma 5 implies that this probability converges to 1 as N\to\infty, the designer's payoff converges to the best case, implying sequential optimality as N\to\infty.

  • Case 2: \mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9} and Z_{t-}>A_{t}. In this case, the belief moves to either \mu_{t-}+M\cdot\mathsf{TOR}(D) or \mu_{t-}-\mathsf{DOWN}(D). Note that \mu_{t-}-\mathsf{DOWN}(D)=(\mu_{t-}+\underline{\mu}(A_{t}))/2>\underline{\mu}(A_{t})+(dN)^{-1/9}. So no matter what information arrives, every agent takes action 1. This increases A_{t}, and every agent takes action 1 after time t as long as \bar{A}_{s}-\bar{A}_{s}^{N}\leq\mathsf{TOR}(D(\mu_{s},A_{s})) for all s\geq t. Again, since this probability converges to 1 as N\to\infty, we have sequential optimality as N\to\infty.