Informational Puts

Andrew Koh (MIT Department of Economics)
Sivakorn Sanguanmoo (MIT Department of Economics)
Kei Uzui (MIT Department of Economics)

First version: December 2023. We are especially grateful to Drew Fudenberg and Stephen Morris for guidance, support, and many illuminating conversations. We also thank Daron Acemoglu, Matt Elliott, Nobuhiro Kiyotaki, Daniel Luo, Daisuke Oyama, Satoru Takahashi, Iván Werning, Alex Wolitzky, Muhamet Yildiz, as well as audiences at Cambridge University, the 25th ACM Conference on Economics and Computation (EC’24), the Econometric Society North American and Asian Meetings, Nuffield College Oxford, and MIT Finance, Macro, and Theory Lunches for helpful comments.

Abstract

We fully characterize how dynamic information should be provided to uniquely implement the largest equilibrium in dynamic binary-action supermodular games. The designer offers an informational put: she stays silent in good times, but injects asymmetric and inconclusive public information if players lose faith. There is (i) no multiplicity gap: the largest (partially) implementable equilibrium can be implemented uniquely; and (ii) no intertemporal commitment gap: the policy is sequentially optimal. Our results have sharp implications for the design of policy in coordination environments.

1 Introduction

Many economic environments feature (i) uncertainty about a payoff-relevant fundamental state, (ii) coordination motives, and (iii) stochastic opportunities to revise actions. These elements are present across all aspects of social and economic life, e.g., macroeconomics, finance, industrial organization, and political economy. [Footnote 1: In macroeconomics, firms are uncertain about economic conditions, face complementarities (Nakamura and Steinsson, 2010), and change their prices at ticks of a Poisson clock (Calvo, 1983). In finance, creditors are uncertain about the debtor’s profitability/solvency (Goldstein and Pauzner, 2005), have an incentive to run if others run (Diamond and Dybvig, 1983), but might only be able to withdraw their debt at staggered intervals (He and Xiong, 2012). In industrial organization, consumers are uncertain about a product’s quality, have an incentive to adopt the same product as others (Farrell and Saloner, 1985; Ellison and Fudenberg, 2000), and face stochastic adoption opportunities (Biglaiser, Crémer, and Veiga, 2022).]

Equilibria of such games are sensitive to dynamic information. Consider a player who, at any history of the game, finds herself with the opportunity to re-optimize her action. The fundamental state matters for her flow payoffs, so her decision must depend on her current beliefs. Moreover, since she plays the same action until she can next re-optimize, her decision also depends on her beliefs about what future agents will do. But those beliefs depend, in turn, on what she expects future players to learn, as well as their beliefs about the play of agents yet further out into the future. Thus, the stochastic evolution of future beliefs, even those arbitrarily distant, shapes incentives in the present.

We are interested in dynamic information policies which fully implement the largest time path of aggregate play, i.e., as the unique subgame perfect equilibrium of the induced stochastic game. Our main result (Theorem 1) fully characterizes the form, value, and sequential optimality of designer-optimal policies:

1. Form. The form of optimal dynamic information relies on the delivery of carefully chosen off-path information. If players take the designer’s preferred action, the designer stays silent. If, however, agents deviate from a target path of play specified by the policy, the designer injects an asymmetric and inconclusive public signal: this is the informational put. [Footnote 2: This is analogous to the “Fed put”, in which the Fed’s history of intervening to halt market downturns has arguably created the belief among market participants that they are insured against downside risk (Miller et al., 2002). This is as if the Fed has offered the market a put option as insurance against downturns. In our setting, the designer steps in to inject information when players start switching to action $0$ which, as we will show, with high probability induces aggregate play to correct. This is as if the designer has offered players a put option as insurance against strategic uncertainty about the play of future players.]

The signal is asymmetric in that the probability that agents become a little more confident is far higher than the probability that agents become much more pessimistic. These small but high-probability movements in the direction of the dominance region (in which playing the designer’s preferred action is strictly dominant) are chained together such that the unique equilibrium of the subgame is for future players to play the designer-preferred action. [Footnote 3: This is done via a “contagion argument” which can be viewed as the dynamic analog of the interim deletion of strictly dominated strategies in static games of incomplete information.] The signal is inconclusive in that, even if agents turn pessimistic, they do not become excessively so; this will be important for sequential optimality.

2. Value. The sequentially optimal policy uniquely implements the upper bound on the time path of aggregate play. Thus, there is no multiplicity gap: whatever can be implemented partially (i.e., as an equilibrium) can also be implemented fully (i.e., as the unique equilibrium). This is in sharp contrast to recent work on static implementation via information design in supermodular games which finds there generically exists a gap even with private information and the ability to manipulate higher-order beliefs (Morris, Oyama, and Takahashi, 2024), or with both private information and transfers (Halac, Lipnowski, and Rappoport, 2021).

3. Sequential optimality. Our dynamic information policy is constructed such that at every history, the designer has no incentive to deviate. [Footnote 4: With the caveat that for a small set of histories, deviation incentives can be made arbitrarily small. For such histories, this is simply because optimal information policies continuing from those histories do not exist. Nonetheless, the optimal value can be approached via a sequence of policies so that the gap vanishes along this sequence. This openness property is also typical of static full implementation environments, as highlighted by Morris, Oyama, and Takahashi (2024).] Thus, there is no intertemporal commitment gap: whatever can be implemented with ex-ante commitment to the dynamic information structure can also be implemented when the sender can continually re-optimize her dynamic information. [Footnote 5: We further emphasize that sequential optimality is not a given: we offer examples of policies which are optimal but not sequentially optimal. Sequential optimality arises through the delicate interaction between properties of our policy: asymmetry, chaining, and inconclusiveness. Asymmetric off-path signals are chained together to obtain full implementation at all states in which the designer-preferred action is not strictly dominated. Then, inconclusive off-path information ensures that, even if agents turn pessimistic, full implementation is still guaranteed.]

Conceptually, our contribution highlights how off-path information should be optimally deployed to shape on-path incentives. Of course, it is well-known from implementation theory (Moore and Repullo, 1988; Abreu and Matsushima, 1992) that off-path threats are powerful, albeit not sequentially optimal: if the deviation actually occurs, there is no incentive to follow through with the policy. [Footnote 6: With the caveat that in implementation theory, the designer’s objective function is typically not specified: we have in mind an environment in which the designer is a player in the game, and punishing players is costly. See also work on mechanism design with limited commitment (Laffont and Tirole, 1988; Bester and Strausz, 2001; Skreta, 2015; Liu et al., 2019; Doval and Skreta, 2022) and macroeconomics where time-inconsistency plays a crucial role (Halac and Yared, 2014).] Information is different in two substantive ways. It is less powerful: beliefs are martingales, which imposes severe constraints on what payoffs can be delivered off-path. But it is also more flexible: the designer has the freedom to design any distribution of off-path beliefs. What should we make of these differences?

First, we will show that off-path information, though less powerful on its own, can be chained together to close the gap between full and partial implementation. Second, the flexibility of off-path information can be exploited to shape the continuation incentives of the designer. This ensures that the designer’s counterfactual selves at zero-probability histories are willing to follow through with the promised information. Together, these insights offer a novel and unified treatment of dynamic information design in supermodular games.

Economically, our results have sharp implications for a range of phenomena where coordination and multiple equilibria feature prominently, e.g., in finance (debt runs, currency crises), macroeconomics (price setting), trade and industrial policy (big pushes), industrial organization (network goods), and political economy (revolutions). We briefly discuss this after stating our main result.

Related Literature

Our results relate most closely to recent work on full implementation in supermodular games via information design (Morris, Oyama, and Takahashi, 2024; Inostroza and Pavan, 2023; Li, Song, and Zhao, 2023). In this literature, information design induces non-degenerate higher-order beliefs, and this is important to obtain uniqueness via a “contagion argument” over the type space. By contrast, our dynamic information is public and higher-order beliefs are degenerate, but we leverage a distinct kind of “intertemporal contagion”. A key takeaway from this literature is that there is typically a gap between the designer’s value under adversarial equilibrium selection, and under designer-favorable selection (what we call a “multiplicity gap”); by contrast, we show that for dynamic binary-action supermodular games there is no such gap.

Also related is the elegant and complementary work of Basak and Zhou (2020) and Basak and Zhou (2022). We highlight several substantive differences. First, we study different dynamic games: in Basak and Zhou (2020, 2022) players make a once-and-for-all decision on whether to play the risky action, and they focus on regime change games; both features play a key role in their analysis. [Footnote 7: Basak and Zhou (2020) study a regime change game with private information where the designer can choose the frequency at which she discloses whether or not the regime has survived. Basak and Zhou (2022) study an optimal stopping game with a regime change payoff structure in which agents choose when to undertake an irreversible risky action.] In ours, agents can continually re-optimize at the ticks of their Poisson clocks and play a general binary-action supermodular game where the designer’s payoff is any increasing functional of the path of aggregate play. Importantly, our optimal dynamic information policies, and the reasons they work, are entirely distinct; we discuss this more thoroughly after stating our main result.

Our paper also relates to work on the equilibria of dynamic coordination games. An important paper of Gale (1995) studies a complete information investment game where players can decide when, if ever, to make an irreversible investment and investing is payoff dominant. [Footnote 8: See also Chamley (1999); Dasgupta (2007); Angeletos, Hellwig, and Pavan (2007); Mathevet and Steiner (2013); Koh, Li, and Uzui (2024a), all of which study the equilibria of different dynamic coordination games.] The main result is that investment succeeds across all subgame perfect equilibria. Our environment and results differ in several substantive ways. For instance, our policy allows the designer to implement the largest equilibrium, irrespective of whether it is payoff dominant. [Footnote 9: Moreover, actions in our environment are reversible, so sans any information (and assuming beliefs are not in the dominance regions) there will exist subgame perfect equilibria in which players “cycle” between actions; this is ruled out in the environment of Gale (1995) because of irreversibility.] More subtly, our dynamic information works with, but does not rely on, atomless players, i.e., we obtain full implementation even if each player believes that they will not change the aggregate state. By contrast, atomic players are an essential feature of Gale (1995).

Our results are also connected to the literature on dynamic implementation. Moore and Repullo (1988) show that arbitrary social choice functions can be achieved with large off-path transfers. [Footnote 10: See also Aghion, Fudenberg, Holden, Kunimoto, and Tercieux (2012) for a discussion of the lack of robustness to small amounts of imperfect information, and Penta (2015) who takes a belief-free approach to dynamic implementation.] Glazer and Perry (1996) show that virtual implementation of social choice functions can be achieved by appealing to extensive-form versions of Abreu and Matsushima (1992) mechanisms. [Footnote 11: See work by Chen and Sun (2015) who exploit the freedom to design the extensive-form. Sato (2023) designs both the extensive-form and information structure à la Doval and Ely (2020) and further utilizes the fact that the designer can design information about players’ past moves; by contrast, we fix the dynamic game and past play is observed.] Chen et al. (2023) weaken backward induction to initial rationalizability. [Footnote 12: That is, only imposing sequential rationality and common knowledge of sequential rationality at the beginning of the game, but “anything goes” off-path; see Ben-Porath (1997).] Different from these papers, our designer is substantially more constrained: (i) there is no freedom to design the extensive-form game which we take as given; (ii) the designer only offers dynamic information; and (iii) our policy is sequentially optimal.

Our game is one where players have stochastic switching opportunities. Variants of these models have been studied in macroeconomics (Diamond, 1982; Calvo, 1983; Diamond and Fudenberg, 1989; Frankel and Pauzner, 2000), industrial policy (Murphy, Shleifer, and Vishny, 1989; Matsuyama, 1991), finance (He and Xiong, 2012), industrial organization (Biglaiser, Crémer, and Veiga, 2022), and game theory (Burdzy, Frankel, and Pauzner, 2001; Matsui and Matsuyama, 1995; Oyama, 2002; Kamada and Kandori, 2020). [Footnote 13: See also more recent work by Guimaraes and Machado (2018); Guimaraes, Machado, and Pereira (2020). Angeletos and Lian (2016) offer an excellent survey.] A common insight from this literature is that switching frictions can generate uniqueness, and the risk-dominant profile is selected via a process of backward induction. Our contribution is to show how the largest equilibrium can be uniquely implemented by carefully choosing the dynamic information policy.

Sequential optimality is an important property of our information policy and thus our work relates to recent work studying the role of (intertemporal) commitment in dynamic information design. Koh and Sanguanmoo (2022) and Koh, Sanguanmoo, and Zhong (2024b) show by construction that sequential optimality is generally achievable in single-agent stopping problems. It will turn out that sequentially optimal policies also exist in our environment, but for quite distinct reasons; we discuss this more thoroughly in Section 3.

2 Model

Environment

There is a finite set of states $\Theta=\{\theta_{1},\theta_{2},\ldots,\theta_{n}\}$. We use $\Delta(\Theta)$ to denote the set of probability measures on $\Theta$ and endow it with the Euclidean metric. There is an interior common prior $\mu_{0}\in\Delta(\Theta)\setminus\partial\Delta(\Theta)$ and a unit measure of players indexed $i\in I:=[0,1]$. Time is continuous and indexed by $\mathcal{T}:=[0,+\infty)$. The action space is binary: $a_{it}\in A:=\{0,1\}$ where $a_{it}$ is $i$’s action at time $t$. Write $A_{t}:=\int a_{it}di$ to denote the proportion of players playing action $1$ at time $t$. Working with a continuum of agents makes our analysis cleaner because randomness from individual switching frictions vanishes in the aggregate. [Footnote 14: By an appropriate continuum law of large numbers (Sun, 2006) where we endow the player space $[0,1]$ with the appropriate Lebesgue extension.] Working with a continuum also clarifies that atomic players are not required for the use of off-path information; we discuss this in Section 4. An analog of our result holds for a finite number of players; we develop this in Online Appendix I.

Payoffs

The flow payoff for each player is $u:\{0,1\}\times[0,1]\times\Theta\to\mathbb{R}$. We write $\Delta u(A,\theta):=u(1,A,\theta)-u(0,A,\theta)$ to denote the payoff difference from action $1$ relative to $0$ and assume throughout:

(i) Supermodularity. $\Delta u(A,\theta)$ is continuously differentiable and strictly increasing in $A$.

(ii) Dominant state. There exists $\theta^{*}\in\Theta$ such that $\Delta u(0,\theta^{*})>0$.

Condition (i) states that the game is one of strategic complements. Condition (ii) is a standard richness assumption on the space of possible payoff structures: there exists some state $\theta^{*}$ under which playing action $1$ is strictly dominant. [Footnote 15: This assumption is identical to that in Morris, Oyama, and Takahashi (2024).]

The payoff of player $i\in I$ is $\int e^{-rt}u(a_{it},A_{t},\theta)dt$ where $r>0$ is an arbitrary discount rate. Each player is endowed with a personal Poisson clock which ticks independently at rate $\lambda>0$. Players can only re-optimize at the ticks of their clocks (Calvo, 1983; Matsui and Matsuyama, 1995; Frankel and Pauzner, 2000; Frankel, Morris, and Pauzner, 2003; Kamada and Kandori, 2020). Our dynamic supermodular game is quite general with the caveat that players are homogeneous. [Footnote 16: A similar assumption has been made in static environments by Inostroza and Pavan (2023); Li et al. (2023) and was weakened by Morris, Oyama, and Takahashi (2024) who characterize optimal private information for full implementation by focusing on potential games with a convexity requirement, which amounts to there not being “too much heterogeneity” across players.] We discuss the heterogeneous case in Section 4.
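As a rough illustration of these switching frictions, the sketch below simulates a large but finite population (a stand-in for the continuum) under the assumption that every player whose clock ticks switches to action $1$. It compares the simulated share playing $1$ with the deterministic path $A_{t}=1-(1-A_{0})e^{-\lambda t}$ implied by $dA_{t}=\lambda(1-A_{t})dt$. The population size, parameter values, and function names are illustrative assumptions, not objects from the model.

```python
import math
import random


def simulated_share_playing_one(n_players, a0, lam, t, seed=0):
    """Share of a finite population playing action 1 at time t.

    Assumption: a share a0 starts on action 1, and every player whose
    Poisson clock (rate lam) has ticked by time t has switched to 1.
    """
    rng = random.Random(seed)
    n_start_one = round(a0 * n_players)
    switched = 0
    for _ in range(n_players - n_start_one):
        first_tick = rng.expovariate(lam)  # time of the first clock tick
        if first_tick <= t:
            switched += 1
    return (n_start_one + switched) / n_players


def continuum_share_playing_one(a0, lam, t):
    """Deterministic continuum path solving dA = lam * (1 - A) dt, A(0) = a0."""
    return 1.0 - (1.0 - a0) * math.exp(-lam * t)


if __name__ == "__main__":
    print(simulated_share_playing_one(20_000, 0.3, 1.0, 1.0, seed=1))
    print(continuum_share_playing_one(0.3, 1.0, 1.0))
```

With many players the two numbers should be close, which is the sense in which idiosyncratic clock randomness washes out in the aggregate.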

Dynamic information policies

A history $H_{t}:=\big{(}(\mu_{s})_{s\leq t},(A_{s})_{s\leq t}\big{)}$ specifies beliefs and aggregate play up to time $t$. Let $\mathcal{H}_{t}$ be the set of all histories up to time $t$ and $\mathcal{H}:=\bigcup_{t\geq 0}\mathcal{H}_{t}$. Write $(\mathcal{F}_{t})_{t}$ for the natural filtration generated by histories. A dynamic information policy is an $(\mathcal{F}_{t})_{t}$-martingale. Let

\mathcal{M}:=\Big{\{}\bm{\mu}^{\prime}:\bm{\mu}^{\prime}\text{ is a $(\mathcal{F}_{t})_{t}$-martingale, $\mu_{0}=\mu^{\prime}_{0}$ a.s.}\Big{\}}

be the set of all dynamic information policies, where we emphasize that the law of $\bm{\mu}\in\mathcal{M}$ can depend on past play.

Strategies and Equilibria

A strategy $\sigma_{i}:\mathcal{H}\to\Delta\{0,1\}$ is a map from histories to a distribution over actions so that if $i$’s clock ticks at time $t$, her choice of action is given by history $H_{t-}:=\lim_{t^{\prime}\uparrow t}H_{t^{\prime}}$. [Footnote 17: This is well-defined since $(A_{t})_{t}$ is a.s. continuous and $(\mu_{t})_{t}$ has left-limits.] Since the measure of agents who act at time $t$ is almost surely zero, our game is in effect equivalent to one in which play at time $t$ depends on history $H_{t}$. Given $\bm{\mu}$, this induces a stochastic game; [Footnote 18: Note that information is public so all agents share the same beliefs; in Appendix B we relax this to show that private information often cannot do better.] let $\Sigma(\bm{\mu},A_{0})$ denote the set of subgame perfect equilibria of the stochastic game. We focus on subgame perfection because there is no private information so the game continuing from each history corresponds to a proper subgame. [Footnote 19: Hence, subgame perfection in our setting coincides trivially with Perfect-Bayesian Equilibria (Fudenberg and Tirole, 1991); since we are varying the dynamic information structure, this also corresponds to dynamic Bayes Correlated Equilibria (Makris and Renou, 2023), but only in the trivial sense since higher-order beliefs are degenerate.]

Figure 1: Relationship between beliefs, equilibria, and action paths

Figure 1 illustrates the connection between dynamic information policies (top left), equilibria (top right), and the path of aggregate actions (bottom). Each information policy $(\mu_{t})_{t}$ specifies a càdlàg martingale which depends on both its past realizations and past aggregate play. This information policy induces a set of equilibria $\Sigma(\bm{\mu},A_{0})$. Both the realizations of beliefs $(\mu_{t})_{t}$ and the selected equilibrium $\sigma\in\Sigma(\bm{\mu},A_{0})$ induce a path of aggregate play $(A_{t})_{t}$. The designer’s problem is to choose her dynamic information policy to influence the set of equilibria and thus $(A_{t})_{t}$.

Designer’s problem under adversarial equilibrium selection

The designer’s problem under commitment when nature is choosing the best equilibrium is

\sup_{\begin{subarray}{c}\bm{\mu}\in\mathcal{M}\\ \bm{\sigma}\in\Sigma(\bm{\mu},A_{0})\end{subarray}}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}

Conversely, when nature is choosing the worst equilibrium, the problem is

\sup_{\bm{\mu}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}

where $\phi:\mathcal{A}\to\mathbb{R}$ is an increasing and bounded functional on the path space of aggregate play $\mathcal{A}$, e.g., the discounted measure of play $\phi(\bm{A})=\int e^{-rt}A_{t}dt$ with $r>0$.
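To make the example objective concrete, the following sketch evaluates the discounted measure of play $\phi(\bm{A})=\int e^{-rt}A_{t}dt$ on two benchmark paths: one where every player whose clock ticks switches to $1$, and one where every such player switches to $0$. The closed forms follow from integrating the exponential paths; the function names, parameter values, and the numerical check are illustrative assumptions.

```python
import math


def discounted_play_all_switch_to_one(a0, r, lam):
    """phi(A) for A_t = 1 - (1 - a0) * exp(-lam * t)."""
    return 1.0 / r - (1.0 - a0) / (r + lam)


def discounted_play_all_switch_to_zero(a0, r, lam):
    """phi(A) for A_t = a0 * exp(-lam * t)."""
    return a0 / (r + lam)


def discounted_play_numeric(path, r, horizon=100.0, steps=100_000):
    """Midpoint-rule approximation of the integral of exp(-r t) * path(t).

    The horizon must be long enough that exp(-r * horizon) is negligible.
    """
    dt = horizon / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += math.exp(-r * t) * path(t)
    return total * dt


if __name__ == "__main__":
    a0, r, lam = 0.3, 0.5, 1.0
    print(discounted_play_all_switch_to_one(a0, r, lam))
    print(discounted_play_numeric(lambda t: 1 - (1 - a0) * math.exp(-lam * t), r))
    print(discounted_play_all_switch_to_zero(a0, r, lam))
```

Because $\phi$ is increasing, the first path delivers the larger value; the gap between the two is what is at stake in equilibrium selection for this particular objective.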

Sequential Optimality

If the designer cannot commit to future information, off-path delivery of information might have no bite in the present. To this end, we can define the payoff gap at history $H_{t}$ as the value of the best deviation from the original policy $\bm{\mu}$ :

\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{|}\mathcal{F}_{t}\Big{]}-\inf_{\bm{\sigma}\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{|}\mathcal{F}_{t}\Big{]}\geq 0

where $\mathcal{F}_{t}$ is the filtration corresponding to $H_{t}$. The policy $\bm{\mu}$ is sequentially optimal if the gap is zero for all histories $H_{t}\in\mathcal{H}$. Sequential optimality is demanding: it states that at every history, including off-path ones, the designer still finds it optimal to follow through with her dynamic information policy.

3 Optimal dynamic information

We begin with an intuitive description of a sequentially optimal dynamic information policy for binary states before constructing it formally. With binary states, we set $\Theta=\{0,1\}$ where $1$ is the dominant state. Beliefs are one-dimensional and we will directly associate $\mu_{t}:=\mathbb{P}(\theta=1|\mathcal{F}_{t})$. Let $\overline{\mu}$ describe the upper-dominance region: $\overline{\mu}(A)$ is the lowest belief such that, if the current aggregate play is $A$, playing action $1$ is strictly dominant no matter the future play of others.

I. State is near the upper-dominance region. First suppose that at time $t$, the public belief $\mu_{t}$ and aggregate play $A_{t}$ are close to the upper-dominance region, as illustrated by the blue dot labelled $(\mu_{t},A_{t})$ in Figure 2(a). If players switch to action $1$, the designer stays silent. Thus, aggregate play progressively increases, as illustrated by the upward arrows in Figure 2(a).

But suppose, instead, that players start playing action $0$ as depicted in Figure 2(b) I. Then, the designer injects asymmetric information: it is very likely that agents become slightly more optimistic, i.e., public beliefs move up a little and into the upper-dominance region, but there is a small chance agents become much more pessimistic (Figure 2(b) II). Suppose that this deviation has happened, so that this information is injected and, furthermore, that it has made agents a little more confident. Then, on this event, future beliefs are in the upper-dominance region so it is strictly dominant for future agents to take action $1$. Correspondingly, the designer delivers no further information (Figure 2(b) III) and aggregate play begins to increase thereafter. But, knowing that this sequence of events is likely to take place, and because agents have coordination motives, deviating to action $0$ in the first place is strictly dominated.

II. State is far from upper-dominance region. Next consider Figure 3(a) where $(\mu_{t},A_{t})$ is further away from the dominance region. Our previous argument now breaks down: there is no way for off-path information, no matter how cleverly designed, to ensure beliefs reach the dominance region with a high enough probability to deter the initial deviation to action $0$. This is the key weakness of off-path information vis-a-vis off-path transfers. What then does the designer do?

If players start switching to $0$, the designer delivers asymmetric information so that, with high probability, agents become a little more confident, but not confident enough that action $1$ is strictly dominant. This is depicted in Figure 3(a) II. Upon this realization, if future agents continue deviating to $0$, the policy injects yet another bout of asymmetric information which, with high probability, pushes beliefs into the upper-dominance region. This is depicted in Figure 3(a) IIIB. Knowing this, we have already seen that those future agents strictly prefer to switch to $1$. But knowing that, agents in the present state $(\mu_{t},A_{t})$, anticipating that upon deviation the injection will, with high probability, induce future agents to play $1$, also strictly prefer to play $1$ in the present.

What are the limits of this line of reasoning? It turns out that by choosing our dynamic information policy carefully, we can chain together these pieces of off-path information in such a way as to obtain full implementation at all belief-aggregate pairs for which action $1$ is not strictly dominated. This is depicted by Figure (b) where, as before, the blue and pink shaded regions represent the upper- and lower-dominance regions respectively. The logic is related to the “contagion arguments” of Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001); Frankel, Morris, and Pauzner (2003). These papers show that the risk-dominant action is typically selected as the limit of some iterated deletion procedure in which the blue and pink regions expand with each iteration and meet in the middle, which pins down the unique equilibrium. [Footnote 20: In Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001) this is also obtained via backward induction, where a symmetric random process governs aggregate incentives. Mapped to our model, this corresponds to public information so that the belief martingale is a time-changed Brownian motion. In Frankel, Morris, and Pauzner (2003), this is obtained via interim deletion of strictly dominated strategies in many-action global games, though the logic is similar.] By contrast, we show how dynamic information can be employed to generate asymmetric contagion such that only the upper-dominance region expands to engulf the space of all belief-aggregate play pairs where action $1$ is not strictly dominated.

III. Designer-preferred action strictly dominated. Now suppose beliefs are so pessimistic that $1$ is strictly dominated, i.e., $\mu_{t}\leq\underline{\mu}(A_{t})$ where $\underline{\mu}(A_{t})$ is the highest belief under which, given $A_{t}$, action $1$ is strictly dominated.

Then, the above policy no longer works: even if players expect all future players to switch to $1$ , they are so pessimistic about the state that switching to $0$ is strictly better. Now, the designer has to offer non-trivial information on-path to push beliefs out of the lower-dominance region. How is this optimally done?

Figure 4(a) illustrates the optimal policy which consists of an immediate and precise injection of information such that beliefs jump to either $0$ or (just) out of the lower-dominance region. The optimality of such a policy is built on the observation that if the designer does not intervene early to stop players from progressively switching to $0$, it simply becomes more difficult to escape the lower-dominance region down the line. Consider, for instance, the policy in Figure 4(b) which also injects precise information to maximize the chance of escaping the lower-dominance region, but with a delay. Before this injection, players switch to action $0$ and, since $\underline{\mu}(A)$ is strictly decreasing in $A$, the probability of escaping the dominance region is strictly smaller. For similar reasons, the policy illustrated in Figure 4(c) which induces continuous sample belief paths is also sub-optimal.
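The cost of delay can be sketched numerically. If beliefs jump from $\mu$ to either $0$ or (approximately) the boundary $\underline{\mu}(A)$, the martingale property pins the escape probability at roughly $\mu/\underline{\mu}(A)$. The linear boundary `mu_lower` below, the parameter values, and the function names are hypothetical choices made only for illustration; the sketch also assumes that during the delay no information arrives and every player whose clock ticks switches to $0$.

```python
import math


def mu_lower(a):
    """Hypothetical lower-dominance boundary, strictly decreasing in a."""
    return 0.48 - 0.35 * a


def escape_probability(mu, a):
    """Limit probability of leaving the lower-dominance region when beliefs
    jump from mu to either 0 or (just above) mu_lower(a)."""
    return min(1.0, mu / mu_lower(a))


def aggregate_after_delay(a_now, lam, delay):
    """Aggregate play after `delay` if every ticking player switches to 0."""
    return a_now * math.exp(-lam * delay)


def delayed_escape_probability(mu, a_now, lam, delay):
    """Escape probability if the same injection is postponed by `delay`;
    beliefs stay at mu in the meantime because no information arrives."""
    return escape_probability(mu, aggregate_after_delay(a_now, lam, delay))


if __name__ == "__main__":
    print(escape_probability(0.1, 0.8))
    print(delayed_escape_probability(0.1, 0.8, lam=1.0, delay=1.0))
```

Under these assumptions the delayed injection faces a higher boundary and therefore a lower escape probability, in line with the argument for intervening immediately.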

IV. Sequential optimality. Our previous discussion specified off-path injections of information upon deviation away from action $1$. Of course, if such deviations actually occur, the designer may not have any incentive to follow through with her policy. For instance, consider Figure 5(a), which employs the strategy of injecting conclusive bad news that the state is $0$ so that, with high probability, beliefs increase a little, and with low probability agents learn conclusively that $\theta=0$. Indeed, information of this form maximizes the chance that beliefs increase [Footnote 21: As in Kamenica and Gentzkow (2011) and subsequent work.] and, as we have described, these can be chained together to achieve full implementation.

However, this policy is not sequentially optimal: if agents do deviate and play action $0$ , injecting such information is suboptimal because it poses an extra risk: if conclusive bad news does arrive, beliefs become absorbing at $\mu_{t}=0$ and further information is powerless to influence beliefs—it is then strictly dominant for all agents to play $0$ thereafter. How, then, is sequential optimality obtained?

Consider inconclusive off-path information as illustrated in Figure 5(b), where each blue dot represents a potential injection of off-path information upon players’ deviating to action $0$. Each injection induces two kinds of beliefs: upon arrival of a ‘good’ signal, agents become a little more optimistic (right arrow); upon arrival of a ‘bad’ signal, agents become relatively more pessimistic, but not so much that action $1$ becomes strictly dominated (left arrow). Figure 5 illustrates a particular policy in which, upon realization of the bad signal at state $(\mu_{t-},A_{t})$, agents’ beliefs move halfway toward the lower-dominance region, i.e., to $[\mu_{t-}+\underline{\mu}(A_{t})]/2$. Conversely, if the good signal arrives, beliefs move up a little, so that, by the martingale property, the good signal is much more likely than the bad signal.

By choosing this distribution carefully for each belief-aggregate action pair, we can achieve full implementation via the chaining argument outlined above, which requires that (i) the probability of the good signal arriving is sufficiently high to deter deviations; and (ii) the movement in beliefs generated by the good signal is sufficiently large that, when chained together, we obtain full implementation over the whole region. At the same time, this is sequentially optimal since, whenever the designer is faced with the prospect of injecting off-path information, she is willing to do so: with probability $1$ agents’ posterior beliefs are such that full implementation remains possible. [Footnote 22: We emphasize that there is nothing circular about this argument: we iteratively delete switching to action $0$ under the worst-case conjecture that, upon the bad signal arriving, all future agents play $0$. This is sufficient to obtain full implementation as long as action $1$ is not strictly dominated.]
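The trade-off between conclusive and inconclusive injections can be illustrated with the martingale constraint alone. With binary states, if the good signal moves beliefs from $\mu$ to $\mu+\varepsilon$ and the bad signal moves them to $\mu^{b}$, then the good signal must arrive with probability $(\mu-\mu^{b})/(\mu+\varepsilon-\mu^{b})$. The sketch below compares the halfway rule described above with conclusive bad news; the step size `up_step`, the numerical values, and the function names are hypothetical.

```python
def good_signal_probability(mu, up_step, bad_posterior):
    """Probability p of the good signal such that beliefs are a martingale:
    p * (mu + up_step) + (1 - p) * bad_posterior == mu."""
    good_posterior = mu + up_step
    if not (0.0 <= bad_posterior < mu < good_posterior <= 1.0):
        raise ValueError("need 0 <= bad_posterior < mu < mu + up_step <= 1")
    return (mu - bad_posterior) / (good_posterior - bad_posterior)


def inconclusive_injection(mu, mu_lower, up_step):
    """Halfway rule: the bad signal moves beliefs halfway toward the
    lower-dominance boundary mu_lower, so they stay strictly above it."""
    bad = 0.5 * (mu + mu_lower)
    return good_signal_probability(mu, up_step, bad), mu + up_step, bad


def conclusive_injection(mu, up_step):
    """Conclusive bad news: the bad signal reveals the state is 0."""
    return good_signal_probability(mu, up_step, 0.0), mu + up_step, 0.0


if __name__ == "__main__":
    print(inconclusive_injection(mu=0.5, mu_lower=0.3, up_step=0.01))
    print(conclusive_injection(mu=0.5, up_step=0.01))
```

In this example conclusive bad news makes the good signal more likely, but its bad posterior is absorbing at $0$; the halfway rule gives up a little probability in exchange for a bad posterior that remains outside the lower-dominance region.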

Sequential optimality of dynamic information has recently been studied in single-agent optimal stopping problems by Koh and Sanguanmoo (2022) and Koh, Sanguanmoo, and Zhong (2024b), who show that optimal dynamic information can always be modified to be sequentially optimal. [Footnote 23: See also Ball (2023), who finds in a different single-agent contracting environment that the optimal dynamic information policy happens to be sequentially optimal.] In such environments, sequential optimality is obtained via an entirely distinct mechanism: the designer progressively delivers more interim information to raise the agent’s outside option at future histories which, in turn, ties the designer’s hands in the future. By contrast, in the present environment our designer chains together off-path information to raise her own continuation value by guaranteeing that, even on realizations of the asymmetric signal, her future self can always fully implement the largest path of play.

Construction of sequentially-optimal policy.

We now make our previous discussion precise and general.

We will construct a particular martingale $\bm{\mu^{*}}\in\mathcal{M}$ which is ‘Markovian’ in the sense that the ‘instantaneous’ information at time $t$ depends only on the belief-aggregate play pair $(\mu_{t},A_{t})$ , as well as an auxiliary $(\mathcal{F}_{t})_{t}$ -predictable process $(Z_{t})_{t}$ we will define as part of the policy. We begin with several key definitions:

Definition 1 (Lower dominance region).

Let $\Psi_{LD}:[0,1]\rightrightarrows\Delta(\Theta)$ denote the set of beliefs under which players prefer action $0$ even if all future players choose to play action $1$ :

$$\Psi_{LD}(A_{t})\coloneqq\Big\{\mu\in\Delta(\Theta):\mathbb{E}_{\theta\sim\mu}\Big[\int_{t}^{\tau}e^{-rs}\Delta u(\bar{A}_{s},\theta)\,ds\Big]\leq 0\Big\},$$

where $\bar{A}_{s}$ solves $d\bar{A}_{s}=\lambda(1-\bar{A}_{s})ds$ for $s\geq t$ with boundary $\bar{A}_{t}=A_{t}$ and $\tau$ is independently distributed according to an exponential distribution with rate $\lambda$ re-normalized to start at $t.$

Observe that supermodularity implies $\Psi_{LD}$ is decreasing in $A_{t}$ : $\Psi_{LD}(A_{t})\subset\Psi_{LD}(A^{\prime}_{t})$ if $A_{t}>A^{\prime}_{t}$ . $\Psi_{LD}$ is illustrated by the pink region of Figure 6 for the cases where $|\Theta|=2$ (panel (a)) and $|\Theta|=3$ (panel (b)).
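To make Definition 1 concrete, the following worked example computes the lower-dominance threshold in closed form. It is only an illustrative sketch: the binary state space $\Theta=\{0,1\}$ with $\mu$ denoting the probability of $\theta=1$ , the linear payoff difference $\Delta u(A,\theta)=\theta+A-c$ , and the constant $c$ are assumptions made here for illustration and are not taken from the model; time is normalized so that $t=0$ .

```latex
% Illustrative assumptions (not from the paper): \Theta=\{0,1\}, \mu=\Pr(\theta=1),
% \Delta u(A,\theta)=\theta+A-c, and t=0.
\begin{align*}
\bar{A}_{s} &= 1-(1-A_{0})e^{-\lambda s}
  && \text{(solves } d\bar{A}_{s}=\lambda(1-\bar{A}_{s})\,ds,\ \bar{A}_{0}=A_{0}\text{)}\\
\mathbb{E}\Big[\int_{0}^{\tau}e^{-rs}\Delta u(\bar{A}_{s},\theta)\,ds\Big]
  &= \int_{0}^{\infty}e^{-(r+\lambda)s}\,
     \mathbb{E}_{\theta\sim\mu}\big[\Delta u(\bar{A}_{s},\theta)\big]\,ds
  && \text{(since } \Pr(\tau>s)=e^{-\lambda s}\text{)}\\
  &= \frac{\mu+1-c}{r+\lambda}-\frac{1-A_{0}}{r+2\lambda}.
\end{align*}
% This expression is weakly negative if and only if \mu \le \underline{\mu}(A_{0}), where
\[
\underline{\mu}(A_{0}) = c-1+\frac{r+\lambda}{r+2\lambda}\,(1-A_{0}),
\]
% whenever this value lies in [0,1].
```

In this example the threshold is decreasing in $A_{0}$ , in line with the monotonicity of $\Psi_{LD}$ noted above.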

Definition 2.

For each $\mu_{t}\notin\Psi_{LD}(A_{t})$ and $A_{t}\in[0,1]$ , define

$$D(\mu_{t},A_{t})\coloneqq\inf\bigg\{\alpha\in[0,1]:\mu_{t}-\alpha\cdot\frac{\delta_{\theta^{*}}-\mu_{t}}{1-\mu_{t}(\theta^{*})}\in\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\bigg\}.$$

This gives the ‘distance’ from current beliefs $\mu_{t}$ , moving along the line through $\delta_{\theta^{*}}$ and $\mu_{t}$ in the direction away from $\delta_{\theta^{*}}$ , to either (i) the lower dominance region $\Psi_{LD}(A_{t})$ ; or (ii) the set of beliefs that assign zero probability to state $\theta^{*}$ , which we denote by $\text{Bd}_{\theta^{*}}\coloneqq\{\mu\in\Delta(\Theta):\mu(\theta^{*})=0\}.$ This is depicted in Figure 6, where each blue dot represents a belief.

Definition 3 (Tolerance, upward/downward jump sizes, belief direction).

To describe the policy when action $1$ is not strictly dominated, we specify the following variables:

Definition 3 specifies objects required to define our information policy when beliefs lie outside of the lower-dominance region $\mu\notin\Psi_{LD}$ . We now develop objects to define our information policy when beliefs lie inside the lower-dominance region $\mu\in\Psi_{LD}$ .

Definition 4 (Maximal escape probability and beliefs).

To describe the policy when action $1$ is strictly dominated, a few more definitions are in order:

We are (finally!) ready to define our dynamic information policy $\bm{\mu^{*}}\in\mathcal{M}$ . Recall that $\bm{\mu}^{*}$ is càdlàg and so has left limits, which we denote by $\mu_{t-}:=\lim_{t^{\prime}\uparrow t}\mu_{t^{\prime}}$ . We will simultaneously specify the law of $\bm{\mu^{*}}$ and construct the stochastic process $(Z_{t})_{t}$ , which is $(\mathcal{F}_{t})_{t}$ -predictable and initialized at $Z_{0}=A_{0}$ . [Footnote 24: That is, $Z_{t}$ is measurable with respect to the left filtration $\lim_{s\uparrow t}\mathcal{F}_{s}$ .] $(Z_{t})_{t}$ is interpreted as the targeted aggregate play at each history.

Given the tuple $(\mu^{*}_{t-},Z_{t-},A_{t})$ , define the time- $t$ information structure and law of motion of $Z_{t}$ as follows:

1. Silence on-path. If action $1$ is not strictly dominated, i.e., $\mu_{t-}\notin\Psi_{LD}(A_{t})$ , and play is within the tolerance level, i.e., $|A_{t}-Z_{t-}|<\mathsf{TOR}(D)$ , then

$\mu_{t}=\mu_{t-}$ almost surely,

i.e., no information, and $dZ_{t}=\lambda(1-Z_{t-})dt.$

2. Asymmetric and inconclusive off-path injection. If action $1$ is not strictly dominated, i.e., $\mu_{t-}\notin\Psi_{LD}(A_{t})$ , and play is outside the tolerance level, i.e., $|A_{t}-Z_{t-}|\geq\mathsf{TOR}(D)$ , then

$$\mu_{t}=\begin{cases}\mu_{t-}+(M\cdot\mathsf{TOR}(D))\cdot\hat{d}(\mu_{t-})&\text{w.p. $\frac{\mathsf{DOWN}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$}\\ \mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-})&\text{w.p. $\frac{M\cdot\mathsf{TOR}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$},\end{cases}$$

where $\hat{d}(\mu):=\frac{\delta_{\theta^{*}}-\mu}{1-\mu(\theta^{*})}$ is the (normalized) directional vector of $\mu$ toward $\delta_{\theta^{*}}$ , and reset $Z_{t}=A_{t}$ .

3. Jump. If action $1$ is strictly dominated, i.e., $\mu_{t-}\in\Psi_{LD}(A_{t})$ , then beliefs jump to a maximal escape point: pick any $\mu^{+}\in\partial(\mu,\eta)$ and set

$$\mu_{t}=\begin{cases}\mu^{+}&\text{w.p. $p^{*}(\mu_{t-},A_{t})-\eta$}\\ \mu^{-}&\text{w.p. $1-(p^{*}(\mu_{t-},A_{t})-\eta)$},\end{cases}$$

where $\mu^{-}=\frac{\mu_{t-}-(p^{*}(\mu_{t-},A_{t})-\eta)\mu^{+}}{1-(p^{*}(\mu_{t-},A_{t})-\eta)}$ , so that beliefs remain a martingale.

We have defined a family of information structures which depend on $\mathsf{TOR}(D)$ (tolerance), $M\cdot\mathsf{TOR}(D)$ (upward jump size), $\mathsf{DOWN}(D)$ (downward jump size), and $\eta$ (distance outside the lower-dominance region $\Psi_{LD}$ ). There is some flexibility in choosing them: we will set $\mathsf{DOWN}(D)=\frac{1}{2}D$ and $\mathsf{TOR}(D)=m\cdot D^{2}$ , where $m>0$ is a small constant and $M>0$ is a large constant.

We choose $m$ small so that $M\cdot\mathsf{TOR}(D)$ , the upward jump size, is much smaller than the downward jump size; by the martingale property of beliefs, this ensures that the probability of becoming (a little) more optimistic is much larger than the probability of becoming more pessimistic. $M$ is the ratio between the upward jump size and the tolerance; it is large to guarantee that off-path information can push future beliefs into the upper-dominance region. The exact choice of $m$ and $M$ depends on the primitives of the game but is independent of $\eta$ ; a detailed construction is in Appendix A. Hence, we parameterize this family of policies by $(\bm{\mu}^{\eta})_{\eta>0}$ .
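The off-path injection can be illustrated with a short numerical sketch. It is written for the binary-state case only, where beliefs reduce to the scalar probability of $\theta^{*}$ , the direction $\hat{d}$ equals one, and $D$ is the gap between the belief and the lower-dominance threshold. The function name, the threshold argument `mu_low`, and the parameter values in the usage note are hypothetical and chosen purely for illustration; exact rational arithmetic is used so that the martingale property can be checked without rounding error.

```python
from fractions import Fraction


def off_path_injection(mu, mu_low, m, M):
    """Binary-state sketch of the asymmetric, inconclusive injection.

    mu     : current belief that theta = theta* (scalar in (0, 1))
    mu_low : hypothetical lower-dominance threshold at current aggregate play
    m, M   : the small and large constants from the text
    Returns ((mu_up, p_up), (mu_down, p_down)).
    """
    D = mu - mu_low                  # distance to the lower-dominance region
    if D <= 0:
        raise ValueError("belief must lie strictly above the threshold")
    tor = m * D ** 2                 # tolerance TOR(D) = m * D^2
    up = M * tor                     # upward jump size M * TOR(D)
    down = D / 2                     # downward jump size DOWN(D) = D / 2
    if mu + up > 1:
        raise ValueError("upward jump leaves [0, 1]; choose smaller m or M")
    p_up = down / (down + up)        # probability of the good signal
    p_down = up / (down + up)        # probability of the bad signal
    return (mu + up, p_up), (mu - down, p_down)


if __name__ == "__main__":
    # Hypothetical values: belief 1/2, threshold 3/10, m = 1/100, M = 10.
    good, bad = off_path_injection(Fraction(1, 2), Fraction(3, 10), Fraction(1, 100), 10)
    print("good signal (posterior, probability):", good)
    print("bad signal  (posterior, probability):", bad)
```

Because the downward jump is of order $D$ while the upward jump is of order $D^{2}$ , the good signal becomes overwhelmingly likely as beliefs approach the lower-dominance region, and the bad signal always leaves beliefs strictly above the threshold.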

Theorem 1.

(i) Form and value.

$$\lim_{\eta\downarrow 0}\inf_{\sigma\in\Sigma(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big[\phi(\bm{A})\Big]=\eqref{eqn:adv}=\eqref{eqn:opt}.$$

(ii) Sequential optimality.

$$\lim_{\eta\downarrow 0}\sup_{H_{t}\in\mathcal{H}}\Bigg|\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big[\phi\big(\bm{A}\big)\Big|\mathcal{F}_{t}\Big]-\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\Big[\phi(\bm{A})\Big|\mathcal{F}_{t}\Big]\Bigg|=0.$$

Proof.

See Appendix A. ∎

4 Robustness and generalizations

Our dynamic game is quite general in some regards but more specific in others. We now discuss which aspects are crucial, and which can be relaxed.

Continuum vs finite players. We worked with a continuum of players so there is no aggregate randomness in the time path of agents who can re-optimize their action. [Footnote 25: By a suitable continuum law of large numbers (Sun, 2006).] This delivers a cleaner analysis since the only source of randomness is fluctuations in beliefs driven by policy. In Online Appendix I we show that Theorem 1 holds, mutatis mutandis, in a model with a large but finite number of players. [Footnote 26: See Aumann (1966), Fudenberg and Levine (1986), and Levine and Pesendorfer (1995) for a discussion of the subtleties between continuum and finite players.] There, we show that in a finite version of the model, the same policy that was optimal in the continuum case continues to solve problem (2) for a large but finite number of players $N$ . In particular, our policy closes the multiplicity gap at rate $O(N^{-1/9})$ . [Footnote 27: That is, $|\eqref{eqn:adv}-\eqref{eqn:opt}|=O(N^{-1/9})$ ; see Online Appendix I.] Mathematically, this requires more involved arguments to handle the extra randomness from switching times.

Conceptually, however, finiteness is simpler. Notice that in our continuum model, players are atomless but off-path information is nonetheless still effective. That is, players do not need to believe that they can individually influence the state for full implementation to work. This is in sharp contrast to work on dynamic coordination, durable-goods monopoly, or public-good provision games, which relies on the fact that each agent’s action makes a small but non-negligible difference. [Footnote 28: For instance, Gale (1995) highlights a gap between a continuum and a finite number of players in dynamic coordination games. A similar gap emerges in the durable-goods monopolist problem (Fudenberg, Levine, and Tirole, 1985; Gul, Sonnenschein, and Wilson, 1986; Bagnoli, Salant, and Swierzbinski, 1989). See also recent work by Battaglini and Palfrey (2024) in a public-goods context where the fact that each agent can influence the state (by a little) is important.] The chief difference is that information about deviations is lost in the continuum case (Levine and Pesendorfer, 1995), which precludes the designer from detecting and responding to individual deviations.

Our key insight is that this is not required: a dynamic policy with a moving target, under which asymmetric and inconclusive information is injected if aggregate play falls too far from the target, can deliver strict incentives even if individual players cannot influence aggregate play. That is, our policy credibly insures players against paths of future play by precluding the possibility that ‘too many’ (as prescribed by the tolerance level) future agents might switch to action $0$ . In this regard, working with atomless agents delivers an arguably stronger result.

Public vs private information. When the initial condition is such that playing $1$ is not strictly dominated, our policy fully implements the upper bound on the time path of aggregate play. Thus, private information cannot do better. If initial beliefs are such that action $1$ is strictly dominated, however, this is more subtle. [Footnote 29: It is still an open question as to how to characterize feasible joint paths of higher-order beliefs when players also observe past play.] We discuss this case in Appendix B where we construct an upper bound on the payoff difference under public and private information policies.

Homogeneous vs heterogeneous players. Payoffs in our dynamic game are quite general, with the caveat that they are identical across players. It is well-known that introducing heterogeneity typically aids equilibrium uniqueness in coordination games. [Footnote 30: See Morris and Shin (2006) for an articulation and survey of this idea.] Thus, we expect that this can only make full implementation easier. Since we have already closed the multiplicity gap under homogeneous payoffs, qualitative features of our main result should continue to hold. [Footnote 31: At least when the belief-aggregate play pair is such that action $1$ is not strictly dominated.]

Switching frictions. Switching frictions are commonly used to model switching costs, inattention, or settings with some staggered structure. They are important in our environment because dynamic information policy can then inject information as soon as players begin deviating from the designer’s preferred action. This allows off-path information to be chained together by correcting incipient deviations. By contrast, if players could continually re-optimize their actions, then off-path information would be powerless to rule out equilibria of the form “all simultaneously switch to $0$ ”. [Footnote 32: Indeed, prior work that obtains equilibrium uniqueness (of risk-dominant selection) (Frankel and Pauzner, 2000; Burdzy, Frankel, and Pauzner, 2001) does so via switching frictions. Switching frictions are prevalent in macroeconomics but, as Angeletos and Lian (2016) note, “It is then somewhat surprising that this approach [combining aggregate shocks with switching frictions to generate uniqueness] has not attracted more attention in applied research.”] We note, however, that it suffices for some frictions to exist; their exact form is not particularly important: the switching rate could vary with aggregate play, change over time, and can be taken to be arbitrarily quick or slow.

5 Discussion

We have shown that dynamic information is a powerful tool for full implementation in general binary-action supermodular games. In doing so, we highlighted key properties of off-path information: asymmetric and inconclusive signals are chained together to obtain full implementation while preserving sequential optimality. We conclude by briefly discussing implications.

Implications for implementation via information. A recent literature on information and mechanism design finds that in static environments, there is generically a multiplicity gap: the designer can do strictly better under partial rather than full implementation (Morris, Oyama, and Takahashi, 2024; Halac, Lipnowski, and Rappoport, 2021). [Footnote 33: See also Inostroza and Pavan (2023); Li, Song, and Zhao (2023); Morris, Oyama, and Takahashi (2022); Halac, Lipnowski, and Rappoport (2024).] We show that the careful design of dynamic public information can quite generally close this gap in dynamic coordination environments. [Footnote 34: Moreover, information in our environment is public so higher-order beliefs are degenerate; by contrast, optimal static implementation via information typically requires inducing non-degenerate higher-order beliefs.]

But do our results demand more of players’ rationality and common knowledge of rationality? Yes and no. On the one hand, it is well-known that in environments like ours, there is a tight connection between the iterated deletion of interim strictly dominated strategies (as in Frankel, Morris, and Pauzner (2003)) and backward induction, which can be viewed as the iterated deletion of intertemporally strictly dominated strategies (as in Frankel and Pauzner (2000); Burdzy, Frankel, and Pauzner (2001)). In this regard, we do not think our results require “more sophistication” of agents than in static environments. On the other hand, it is also known that common knowledge of rationality is delicate in dynamic games and must continue to hold at off-path histories. [Footnote 35: See Aumann (1995). Samet (2005) offers an entertaining discussion.] This motivates implementation in “initial rationalizability” when the designer has freedom to design the extensive form game (Chen, Holden, Kunimoto, Sun, and Wilkening, 2023). In this regard, our stronger results are obtained at the price of arguably stronger assumptions on common knowledge of rationality.

Implications for coordination policy. Our results have simple and sharp implications for coordination problems. It is often held that to prevent agents from playing undesirable equilibria, policymakers must deliver substantial on-path information in order to uniquely implement the designer’s preferred action. [Footnote 36: See Morris and Yildiz (2019) for a recent articulation of this idea in static games, and Basak and Zhou (2020, 2022) in a dynamic regime change game where the planner uses either frequent warnings (the former), or early warnings (the latter) to implement their preferred equilibrium.] Our results offer a more nuanced view.

When public beliefs are so pessimistic that the designer-preferred action is strictly dominated, an early and precise injection of on-path information is indeed required; waiting only makes implementation harder in the future. But as long as beliefs are not so pessimistic that the designer-preferred action is strictly dominated, no additional on-path information is required: silence backed by the credible promise of off-path information suffices.

References

Abreu and Matsushima (1992) Abreu, D. and H. Matsushima (1992): “Virtual implementation in iteratively undominated strategies: complete information,” Econometrica: Journal of the Econometric Society, 993–1008.
Aghion et al. (2012) Aghion, P., D. Fudenberg, R. Holden, T. Kunimoto, and O. Tercieux (2012): “Subgame-perfect implementation under information perturbations,” The Quarterly Journal of Economics, 127, 1843–1881.
Angeletos et al. (2007) Angeletos, G.-M., C. Hellwig, and A. Pavan (2007): “Dynamic global games of regime change: Learning, multiplicity, and the timing of attacks,” Econometrica, 75, 711–756.
Angeletos and Lian (2016) Angeletos, G.-M. and C. Lian (2016): “Incomplete information in macroeconomics: Accommodating frictions in coordination,” in Handbook of macroeconomics, Elsevier, vol. 2, 1065–1240.
Arieli et al. (2021) Arieli, I., Y. Babichenko, F. Sandomirskiy, and O. Tamuz (2021): “Feasible joint posterior beliefs,” Journal of Political Economy, 129, 2546–2594.
Aumann (1966) Aumann, R. (1966): “Existence of Competitive Equilibria in Markets with a Continuum of Traders,” Econometrica, 34, 1–17.
Aumann (1995) Aumann, R. J. (1995): “Backward induction and common knowledge of rationality,” Games and economic Behavior, 8, 6–19.
Bagnoli et al. (1989) Bagnoli, M., S. W. Salant, and J. E. Swierzbinski (1989): “Durable-goods monopoly with discrete demand,” Journal of Political Economy, 97, 1459–1478.
Ball (2023) Ball, I. (2023): “Dynamic information provision: Rewarding the past and guiding the future,” Econometrica, 91, 1363–1391.
Basak and Zhou (2020) Basak, D. and Z. Zhou (2020): “Diffusing coordination risk,” American Economic Review, 110, 271–297.
Basak and Zhou (2022) ——— (2022): “Panics and early warnings,” PBCSF-NIFR Research Paper.
Battaglini and Palfrey (2024) Battaglini, M. and T. R. Palfrey (2024): “Dynamic Collective Action and the Power of Large Numbers,” Tech. rep., National Bureau of Economic Research.
Ben-Porath (1997) Ben-Porath, E. (1997): “Rationality, Nash equilibrium and backwards induction in perfect-information games,” The Review of Economic Studies, 64, 23–46.
Bester and Strausz (2001) Bester, H. and R. Strausz (2001): “Contracting with imperfect commitment and the revelation principle: the single agent case,” Econometrica, 69, 1077–1098.
Biglaiser et al. (2022) Biglaiser, G., J. Crémer, and A. Veiga (2022): “Should I stay or should I go? Migrating away from an incumbent platform,” The RAND Journal of Economics, 53, 453–483.
Burdzy et al. (2001) Burdzy, K., D. M. Frankel, and A. Pauzner (2001): “Fast equilibrium selection by rational players living in a changing world,” Econometrica, 69, 163–189.
Calvo (1983) Calvo, G. A. (1983): “Staggered prices in a utility-maximizing framework,” Journal of monetary Economics, 12, 383–398.
Chamley (1999) Chamley, C. (1999): “Coordinating regime switches,” The Quarterly Journal of Economics, 114, 869–905.
Chen et al. (2023) Chen, Y.-C., R. Holden, T. Kunimoto, Y. Sun, and T. Wilkening (2023): “Getting dynamic implementation to work,” Journal of Political Economy, 131, 285–387.
Chen and Sun (2015) Chen, Y.-C. and Y. Sun (2015): “Full implementation in backward induction,” Journal of Mathematical Economics, 59, 71–76.
Dasgupta (2007) Dasgupta, A. (2007): “Coordination and delay in global games,” Journal of Economic Theory, 134, 195–225.
Diamond and Dybvig (1983) Diamond, D. W. and P. H. Dybvig (1983): “Bank runs, deposit insurance, and liquidity,” Journal of political economy, 91, 401–419.
Diamond and Fudenberg (1989) Diamond, P. and D. Fudenberg (1989): “Rational expectations business cycles in search equilibrium,” Journal of political Economy, 97, 606–619.
Diamond (1982) Diamond, P. A. (1982): “Aggregate demand management in search equilibrium,” Journal of political Economy, 90, 881–894.
Doval and Ely (2020) Doval, L. and J. C. Ely (2020): “Sequential information design,” Econometrica, 88, 2575–2608.
Doval and Skreta (2022) Doval, L. and V. Skreta (2022): “Mechanism design with limited commitment,” Econometrica, 90, 1463–1500.
Ellison and Fudenberg (2000) Ellison, G. and D. Fudenberg (2000): “The neo-Luddite’s lament: Excessive upgrades in the software industry,” The RAND Journal of Economics, 253–272.
Farrell and Saloner (1985) Farrell, J. and G. Saloner (1985): “Standardization, compatibility, and innovation,” the RAND Journal of Economics, 70–83.
Frankel and Pauzner (2000) Frankel, D. and A. Pauzner (2000): “Resolving indeterminacy in dynamic settings: the role of shocks,” The Quarterly Journal of Economics, 115, 285–304.
Frankel et al. (2003) Frankel, D. M., S. Morris, and A. Pauzner (2003): “Equilibrium selection in global games with strategic complementarities,” Journal of Economic Theory, 108, 1–44.
Fudenberg and Levine (1986) Fudenberg, D. and D. Levine (1986): “Limit games and limit equilibria,” Journal of economic Theory, 38, 261–279.
Fudenberg et al. (1985) Fudenberg, D., D. Levine, and J. Tirole (1985): “Infinite-horizon models of bargaining with one-sided incomplete information,” Game-theoretic models of bargaining, 73–98.
Fudenberg and Tirole (1991) Fudenberg, D. and J. Tirole (1991): “Perfect Bayesian equilibrium and sequential equilibrium,” journal of Economic Theory, 53, 236–260.
Gale (1995) Gale, D. (1995): “Dynamic coordination games,” Economic theory, 5, 1–18.
Glazer and Perry (1996) Glazer, J. and M. Perry (1996): “Virtual implementation in backwards induction,” Games and Economic Behavior, 15, 27–32.
Goldstein and Pauzner (2005) Goldstein, I. and A. Pauzner (2005): “Demand–deposit contracts and the probability of bank runs,” the Journal of Finance, 60, 1293–1327.
Guimaraes and Machado (2018) Guimaraes, B. and C. Machado (2018): “Dynamic coordination and the optimal stimulus policies,” The Economic Journal, 128, 2785–2811.
Guimaraes et al. (2020) Guimaraes, B., C. Machado, and A. E. Pereira (2020): “Dynamic coordination with timing frictions: Theory and applications,” Journal of Public Economic Theory, 22, 656–697.
Gul et al. (1986) Gul, F., H. Sonnenschein, and R. Wilson (1986): “Foundations of dynamic monopoly and the Coase conjecture,” Journal of economic Theory, 39, 155–190.
Halac et al. (2021) Halac, M., E. Lipnowski, and D. Rappoport (2021): “Rank uncertainty in organizations,” American Economic Review, 111, 757–786.
Halac et al. (2024) ——— (2024): “Pricing for Coordination,” .
Halac and Yared (2014) Halac, M. and P. Yared (2014): “Fiscal rules and discretion under persistent shocks,” Econometrica, 82, 1557–1614.
He and Xiong (2012) He, Z. and W. Xiong (2012): “Dynamic debt runs,” The Review of Financial Studies, 25, 1799–1843.
Inostroza and Pavan (2023) Inostroza, N. and A. Pavan (2023): “Adversarial coordination and public information design,” Available at SSRN 4531654.
Kamada and Kandori (2020) Kamada, Y. and M. Kandori (2020): “Revision games,” Econometrica, 88, 1599–1630.
Kamenica and Gentzkow (2011) Kamenica, E. and M. Gentzkow (2011): “Bayesian persuasion,” American Economic Review, 101, 2590–2615.
Koh et al. (2024a) Koh, A., R. Li, and K. Uzui (2024a): “Inertial Coordination Games,” arXiv preprint arXiv:2409.08145.
Koh and Sanguanmoo (2022) Koh, A. and S. Sanguanmoo (2022): “Attention Capture,” arXiv preprint arXiv:2209.05570.
Koh et al. (2024b) Koh, A., S. Sanguanmoo, and W. Zhong (2024b): “Persuasion and Optimal Stopping,” arXiv preprint arXiv:2406.12278.
Laffont and Tirole (1988) Laffont, J.-J. and J. Tirole (1988): “The dynamics of incentive contracts,” Econometrica: Journal of the Econometric Society, 1153–1175.
Levine and Pesendorfer (1995) Levine, D. K. and W. Pesendorfer (1995): “When are agents negligible?” The American Economic Review, 1160–1170.
Li et al. (2023) Li, F., Y. Song, and M. Zhao (2023): “Global manipulation by local obfuscation,” Journal of Economic Theory, 207, 105575.
Liu et al. (2019) Liu, Q., K. Mierendorff, X. Shi, and W. Zhong (2019): “Auctions with limited commitment,” American Economic Review, 109, 876–910.
Makris and Renou (2023) Makris, M. and L. Renou (2023): “Information design in multistage games,” Theoretical Economics, 18, 1475–1509.
Mathevet and Steiner (2013) Mathevet, L. and J. Steiner (2013): “Tractable dynamic global games and applications,” Journal of Economic Theory, 148, 2583–2619.
Matsui and Matsuyama (1995) Matsui, A. and K. Matsuyama (1995): “An approach to equilibrium selection,” Journal of Economic Theory, 65, 415–434.
Matsuyama (1991) Matsuyama, K. (1991): “Increasing returns, industrialization, and indeterminacy of equilibrium,” The Quarterly Journal of Economics, 106, 617–650.
Miller et al. (2002) Miller, M., P. Weller, and L. Zhang (2002): “Moral Hazard and The US Stock Market: Analysing the ‘Greenspan Put’,” The Economic Journal, 112, C171–C186.
Moore and Repullo (1988) Moore, J. and R. Repullo (1988): “Subgame perfect implementation,” Econometrica: Journal of the Econometric Society, 1191–1220.
Morris (2020) Morris, S. (2020): “No trade and feasible joint posterior beliefs.”
Morris et al. (2022) Morris, S., D. Oyama, and S. Takahashi (2022): “On the joint design of information and transfers,” Available at SSRN 4156831.
Morris et al. (2024) ——— (2024): “Implementation via Information Design in Binary-Action Supermodular Games,” Econometrica, 92, 775–813.
Morris and Shin (2006) Morris, S. and H. S. Shin (2006): “Heterogeneity and uniqueness in interaction games,” The Economy as an Evolving Complex System, 3, 207–42.
Morris and Yildiz (2019) Morris, S. and M. Yildiz (2019): “Crises: Equilibrium shifts and large shocks,” American Economic Review, 109, 2823–2854.
Murphy et al. (1989) Murphy, K. M., A. Shleifer, and R. W. Vishny (1989): “Industrialization and the big push,” Journal of political economy, 97, 1003–1026.
Nakamura and Steinsson (2010) Nakamura, E. and J. Steinsson (2010): “Monetary non-neutrality in a multisector menu cost model,” The Quarterly journal of economics, 125, 961–1013.
Oyama (2002) Oyama, D. (2002): “p-Dominance and equilibrium selection under perfect foresight dynamics,” Journal of Economic Theory, 107, 288–310.
Penta (2015) Penta, A. (2015): “Robust dynamic implementation,” Journal of Economic Theory, 160, 280–316.
Samet (2005) Samet, D. (2005): “Counterfactuals in wonderland,” Games and Economic Behavior, 51, 2005.
Sato (2023) Sato, H. (2023): “Robust implementation in sequential information design under supermodular payoffs and objective,” Review of Economic Design, 27, 269–285.
Skreta (2015) Skreta, V. (2015): “Optimal auction design under non-commitment,” Journal of Economic Theory, 159, 854–890.
Sun (2006) Sun, Y. (2006): “The exact law of large numbers via Fubini extension and characterization of insurable risks,” Journal of Economic Theory, 126, 31–69.

Appendix to Informational Puts

Appendix A proves Theorem 1. Appendix B analyzes the case in which the designer can use private information.

Appendix A Proofs

Preliminaries. We use the following notation for the time-path of aggregate actions following from $A_{t}$ : for $s\geq t$ , $\bar{A}_{s}$ solves

$$d\bar{A}_{s}=\lambda(1-\bar{A}_{s})\cdot ds\quad\text{with boundary $\bar{A}_{t}=A_{t}$.}$$

Similarly, for $s\geq t$ , $\underline{A}_{s}$ solves

$$d\underline{A}_{s}=-\lambda\underline{A}_{s}\cdot ds\quad\text{with boundary $\underline{A}_{t}=A_{t}$.}$$

In words, $\bar{A}_{s}$ and $\underline{A}_{s}$ denote future paths of aggregate actions when everyone in the future switches to actions $1$ and $0$ as quickly as possible, respectively.
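For reference, both paths can be written in closed form; the following is a short worked derivation that uses only the two linear differential equations above, with initial condition $A_{t}$ for each path.

```latex
% Closed-form solutions of the two linear ODEs, for s \ge t,
% both starting from the common value A_t at time t:
\begin{align*}
\bar{A}_{s} &= 1-(1-A_{t})\,e^{-\lambda(s-t)}, \\
\underline{A}_{s} &= A_{t}\,e^{-\lambda(s-t)}.
\end{align*}
% Hence the gap between the two extreme paths does not depend on A_t:
\[
\bar{A}_{s}-\underline{A}_{s} = 1-e^{-\lambda(s-t)}.
\]
```

Differentiating each expression with respect to $s$ recovers the corresponding law of motion, and both equal $A_{t}$ at $s=t$ .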

Finally, it will be helpful to define the operator $S:\mathcal{H}\to\Delta(\Theta)\times[0,1]$ mapping histories to the most recent pair of belief and aggregate action, i.e., $S((\mu_{s},A_{s})_{s\leq t}):=(\mu_{t},A_{t})$ .

Outline of proof. The proof of Theorem 1 consists of the following steps:

Step 1: We first show the result for binary states $\Theta=\{0,1\}$ with $\theta^{*}=1$ . With slight abuse of notation, we associate beliefs directly with the probability that the state is $1$ : $\mu_{t}=\mu_{t}(\theta^{*})$ . Then, our lower-dominance region is one-dimensional and summarized by a threshold belief for each $A$ :

$$\underline{\mu}(A):=\max_{\mu\in\Psi_{LD}(A)}\mu(\theta^{*}).$$

We show that $\mu_{t}>\underline{\mu}(A_{t})$ implies switching to $1$ is the unique subgame perfect equilibrium under the information policy $\bm{\mu}^{*}$ . We show this in several sub-steps.

• Step 1A: There exists a belief threshold, which is a ‘rightward’ translation of the boundary $\underline{\mu}(A_{t})$ of the lower-dominance region, such that agents find it strictly dominant to play action $1$ regardless of others’ actions if the current belief is above this threshold (Lemma 2). We call this threshold $\psi_{0}(A_{t})$ .

• Step 1B: For $n\in\mathbb{N}$ , suppose that agents conjecture that all agents will switch to action $1$ at all future histories $H$ such that $S(H)=(\mu_{s},A_{s})$ fulfills $\mu_{s}>\psi_{n}(A_{s})$ . Under this assumption, we can compute a lower bound (LB) on the expected payoff difference for agents between playing actions $1$ and $0$ for any given current belief $\mu_{t}\in(\underline{\mu}(A_{t}),\psi_{n}(A_{t})]$ . To do so, we will separately consider the future periods before and after the aggregate action deviates from the target by more than the tolerated distance, at which point new information is provided. Call this time $T^{*}$ .
  – Before $T^{*}$ , we construct the lower bound using the fact that aggregate actions cannot be too far from the target even in the worst-case scenario.
  – At $T^{*}$ , the designer injects information with binary support. We choose the upward jump size $M\cdot\mathsf{TOR}(D)$ to be sufficiently large so that, whenever the ‘good signal’ realizes, beliefs exceed $\psi_{n}(A_{T^{*}})$ . Whenever the ‘bad signal’ realizes, we conjecture the worst case that all agents switch to action $0$ .

• Step 1C: We show that by carefully choosing the information policy, the threshold above which switching to $1$ is strictly dominant, $\psi_{n+1}(A_{t})$ , is strictly smaller than $\psi_{n}(A_{t})$ . The policy has several key features:
  – Large $M$ : when the aggregate action $A_{T^{*}}$ falls below the target by the tolerated distance $\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ , the high belief after the injection exceeds $\psi_{n}(A_{T^{*}})$ , which ensures the argument in Step 1B. In particular, we choose $M$ to be large relative to the Lipschitz constant of $\psi_{n}$ .
  – Small $\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ : we should maintain a low tolerance level for deviations from the target. If the designer allowed a large deviation, the aggregate action could drop so low by the time information is injected that agents’ incentives to play action $1$ would be too weak to recover.
  – Large $\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))$ : the downward jump size should be large relative to the upward jump size $M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ , but not so large that beliefs fall into the lower-dominance region. This ensures that the probability of the belief being high after the injection is sufficiently large.
  These three features guarantee that the lower bound (LB) is sufficiently large and remains positive even when the current belief $\mu_{t}$ is slightly below $\psi_{n}(A_{t})$ . Hence $\psi_{n+1}(A_{t})$ is strictly smaller than $\psi_{n}(A_{t})$ , allowing us to expand the range of beliefs under which action $1$ is uniquely optimal (Lemma 3).

• Step 1D: By iterating Step 1C for $n\in\mathbb{N}$ , we show that $\psi_{n}(A_{t})$ converges to $\underline{\mu}(A_{t})$ . Then, if $\mu_{t}>\underline{\mu}(A_{t})$ , agents who can switch in period $t$ find it uniquely optimal to choose action $1$ .
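The role of $T^{*}$ in Step 1B can be illustrated numerically. The sketch below is a stylized illustration under assumptions that are not made in the text: the target has just been reset so that $Z=A$ at time $0$ , every agent who can revise thereafter switches to action $0$ (the worst case), and the tolerance is held fixed at a hypothetical constant `tor`, whereas in the policy it varies with $D$ . All function names and parameter values are hypothetical.

```python
import math


def target_path(z0: float, lam: float, s: float) -> float:
    """Target Z_s solving dZ = lam * (1 - Z) ds with Z_0 = z0."""
    return 1.0 - (1.0 - z0) * math.exp(-lam * s)


def worst_case_path(a0: float, lam: float, s: float) -> float:
    """Aggregate play when every revising agent switches to 0: dA = -lam * A ds."""
    return a0 * math.exp(-lam * s)


def time_to_breach(tor: float, lam: float) -> float:
    """First time the gap Z_s - A_s reaches tor, starting from Z_0 = A_0.

    The gap equals 1 - exp(-lam * s) regardless of the common starting
    point, so the breach time is -log(1 - tor) / lam.
    """
    if not 0.0 < tor < 1.0:
        raise ValueError("tor must lie strictly between 0 and 1")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    return -math.log(1.0 - tor) / lam


if __name__ == "__main__":
    lam, a0, tor = 2.0, 0.4, 0.05  # hypothetical parameter values
    t_star = time_to_breach(tor, lam)
    gap = target_path(a0, lam, t_star) - worst_case_path(a0, lam, t_star)
    print(f"T* = {t_star:.4f}, gap at T* = {gap:.4f}")
```

Under these assumptions a small tolerance implies a short wait before information is injected, so aggregate play cannot drift far from the target before $T^{*}$ , which is the property the lower bound relies on.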

Step 2: We extend the arguments in Step 1 from binary states to finite states: if $\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}},$ then playing action $1$ is the unique subgame perfect equilibrium under the information policy $\bm{\mu}^{*}$ .

As described in the main text, our policy is such that beliefs move either in the direction $\hat{d}(\mu)$ toward $\delta_{\theta^{*}}$ , or in the direction $-\hat{d}(\mu)$ away from $\delta_{\theta^{*}}$ . The key observation is that we can apply a modification of Step 1 to each direction.

Step 3: We establish sequential optimality:

•

Step 3A: for any $\epsilon>0$ , $\bm{\mu^{*}}$ is $\epsilon$ -sequentially optimal when $\mu^{*}_{t}\in\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}.$
•

Step 3B: $\bm{\mu^{*}}$ is sequentially optimal when $\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}.$

Proof of Theorem 1.

Step 1. Suppose that $\Theta=\{0,1\}$ and $\theta^{*}=1$ . With slight abuse of notation, we associate beliefs $\mu_{t}$ with the probability that $\theta=1$ . As in the main text, we let $\underline{\mu}(A_{t})$ denote the boundary of the lower-dominance region. We will show that as long as action $1$ is not strictly dominated i.e., $\mu^{*}_{t}>\underline{\mu}(A_{t})$ , then action $1$ is played under any subgame perfect equilibrium.

Definition 5.

For $n\in\mathbb{N}$ , we will construct a sequence $(\psi_{n})_{n}$ where $\psi_{n}\subset\Delta(\Theta)\times[0,1]$ is a subset of the round- $n$ dominance region. $\psi_{n}$ will satisfy the following conditions:

(i)

Contagion. Action $1$ is strictly preferred under every history $H$ where $S(H)\in\psi_{n}$ under the conjecture that action $1$ is played under every history $H^{\prime}$ such that $S(H^{\prime})\in\psi_{n-1}$ .
(ii)

Translation. There exists a constant $c_{n}>0$ such that $\psi_{n}=\{(\mu,A):D(\mu,A)\geq c_{n}\},$ where $D(\mu,A)=\mu-\underline{\mu}(A)$ .

We initialize $\psi_{0}$ as the upper-dominance region whereby $1$ is strictly dominant.

Observe also that since $\Delta u(\cdot,\theta)$ is continuously differentiable on a compact domain, it is Lipschitz, and we let the constant be $L>0$ . The boundary of the lower-dominance region (as a function of $A$ ) is also Lipschitz continuous, as Lemma 1 shows, and we denote its constant by $L_{\underline{\mu}}$ .

Lemma 1.

$\underline{\mu}(\cdot)$ is Lipschitz continuous.

Proof of Lemma 1.

Fix any $t$ . The expected payoff difference between playing $1$ and $0$ when everyone in the future switches to action $1$ is given by

	$\displaystyle\Delta U(\mu_{t},A_{t})$	$\displaystyle:=\mu_{t}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Big{\{}\Delta u(\bar{A}_{s},1)-\Delta u(\bar{A}_{s},0)\Big{\}}ds\bigg{]}$
		$\displaystyle\quad\quad\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\bigg{]}.$

Note that $\Delta U$ is continuously differentiable and strictly increasing in both $\mu_{t}$ and $A_{t}$ . Since the domain of $\Delta U$ is compact, the following values are well-defined:

\displaystyle L:=\max_{A,\mu}\frac{\partial\Delta U}{\partial A}>0,\quad l:=\min_{A,\mu}\frac{\partial\Delta U}{\partial\mu}>0.

Then, for any $A_{t}<A_{t}^{\prime}$ and $\mu_{t}>\mu_{t}^{\prime}$ , we have

\Delta U(\mu_{t},A_{t})-\Delta U(\mu_{t}^{\prime},A_{t}^{\prime})\geq-L(A_{t}^{\prime}-A_{t})+l(\mu_{t}-\mu_{t}^{\prime})

because the mean value theorem implies

	$\displaystyle\Delta U(\mu_{t},A_{t})-\Delta U(\mu_{t}^{\prime},A_{t})$	$\displaystyle\geq l(\mu_{t}-\mu_{t}^{\prime})$
	$\displaystyle\Delta U(\mu_{t}^{\prime},A_{t}^{\prime})-\Delta U(\mu_{t}^{\prime},A_{t})$	$\displaystyle\leq L(A_{t}^{\prime}-A_{t}).$

Substituting $\mu_{t}=\underline{\mu}(A_{t})$ and $\mu_{t}^{\prime}=\underline{\mu}(A_{t}^{\prime})$ into the above inequality yields

0=\Delta U(\underline{\mu}(A_{t}),A_{t})-\Delta U(\underline{\mu}(A_{t}^{\prime}),A_{t}^{\prime})\geq-L(A_{t}^{\prime}-A_{t})+l(\underline{\mu}(A_{t})-\underline{\mu}(A_{t}^{\prime})),

where the equality follows from the definition of $\underline{\mu}$ , i.e., $\Delta U(\underline{\mu}(A_{t}),A_{t})=0$ for every $A_{t}$ . Hence, we have

\underline{\mu}(A_{t})-\underline{\mu}(A_{t}^{\prime})\leq\underbrace{\frac{L}{l}}_{=:L_{\underline{\mu}}}(A_{t}^{\prime}-A_{t}).

∎
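As a numerical sanity check of Lemma 1, the following minimal sketch computes $\underline{\mu}(A)$ by bisection and compares its steepest finite-difference slope with the bound $L/l$. The payoff difference `delta_U` is a hypothetical functional form chosen only because it is increasing in both arguments; it is not the paper's primitive, and the constants `L_A` and `l_mu` are derived by hand for this assumed form.

```python
# Hypothetical payoff difference, increasing in mu and A (illustrative assumption only).
def delta_U(mu, A):
    return mu * (1.5 + A) + A - 1.2

def mu_lower(A, tol=1e-12):
    """Boundary of the lower-dominance region: root of delta_U(., A) on [0, 1]."""
    lo, hi = 0.0, 1.0
    if delta_U(lo, A) >= 0:      # action 1 is never dominated at this A
        return 0.0
    while hi - lo > tol:         # bisection on the increasing map mu -> delta_U(mu, A)
        mid = 0.5 * (lo + hi)
        if delta_U(mid, A) < 0:
            lo = mid
        else:
            hi = mid
    return hi

# Derivative bounds on [0, 1]^2 for this particular delta_U.
L_A = 2.0     # max of d(delta_U)/dA  = 1 + mu
l_mu = 1.5    # min of d(delta_U)/dmu = 1.5 + A
lipschitz = L_A / l_mu

grid = [i / 200 for i in range(201)]
worst_slope = max(
    abs(mu_lower(a) - mu_lower(b)) / (b - a)
    for a, b in zip(grid[:-1], grid[1:])
)
print(worst_slope, lipschitz)
```

Under this assumed form the steepest slope of the boundary stays below $L/l$, as the lemma requires; other increasing, continuously differentiable choices of `delta_U` can be substituted to repeat the check.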

Step 1A. Construct $\psi_{0}$ .

Define $\psi_{0}$ as

\psi_{0}=\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:D(\mu,A)\geq c_{0}\Big{\}},

with $c_{0}:=\max_{A}\{\bar{\mu}(A)-\underline{\mu}(A)\}$ , where $\bar{\mu}(A)$ is defined as

\bar{\mu}(A):=\min\Big{\{}\mu\in\Delta(\Theta):\mathbb{E}\Big{[}\int_{t}u(1,\underline{A}_{s},\theta)ds\Big{]}\geq\mathbb{E}\Big{[}\int_{t}u(0,\underline{A}_{s},\theta)ds\Big{]}\Big{\}}.

$\bar{\mu}(A)$ is the minimum belief under which players prefer action $1$ even if all future players choose to play action $0$ .

Lemma 2.

Action $1$ is strictly preferred under every history $H$ where $S(H)\in\psi_{0}$ .

Proof of Lemma 2.

Fix any history $H$ such that $S(H)\in\psi_{0}.$ Then, by the definition of $\psi_{0}$ , the current $(\mu,A)$ satisfies

\displaystyle\mu\geq\underline{\mu}(A)+\max_{A^{\prime}}\left\{\bar{\mu}(A^{\prime})-\underline{\mu}(A^{\prime})\right\}\geq\bar{\mu}(A).

Hence, action $1$ is strictly preferred regardless of others’ future play. ∎

Step 1B. Construct a lower bound for the expected payoff difference given $\psi_{n}$ .

Suppose that everyone plays action $1$ at every history $H^{\prime}$ such that $S(H^{\prime})=(\mu,A)$ is in the round- $n$ dominance region $\psi_{n}$ . To obtain $\psi_{n+1}$ in Step 1C, we derive a lower bound on the expected payoff difference between playing $1$ and $0$ given $\psi_{n}$ .

To this end, fix any history $H$ with the current target aggregate action $Z_{t}$ such that $S(H)=(\mu_{t},A_{t})\notin\psi_{n}$ but $\mu_{t}>\underline{\mu}(A_{t})$ . From our construction of $Z_{t},$ we must have $Z_{t}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t})).$ ³⁷

³⁷ By construction, if $Z_{t-}-A_{t}\geq\mathsf{TOR}(D(\mu_{t},A_{t}))$ , $Z_{t}=A_{t}$ must hold, which implies $Z_{t}-A_{t}=0<\mathsf{TOR}(D(\mu_{t},A_{t}))$ . If $Z_{t-}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t}))$ , $Z_{t}-A_{t}<\mathsf{TOR}(D(\mu_{t},A_{t}))$ is immediate because $Z_{t}$ does not jump.

For any continuous path $(A_{s})_{s\geq t}$ , we define the hitting time $T^{*}((A_{s})_{s\geq t})$ as follows:

T^{*}=\inf\Big{\{}s\geq t:Z_{s}-A_{s}\geq\mathsf{TOR}(D(\mu_{t},A_{s}))\text{ or }(\mu_{t},A_{s})\in\psi_{n}\Big{\}}.

$T^{*}$ represents the first time at which either the designer injects new information, or the pair $(\mu_{s},A_{s})$ enters the round- $n$ dominance region. We will calculate the agent’s expected payoff before time $T^{*}$ given the continuous path $(A_{s})_{s\in[t,T^{*}]}$ and find a lower bound for this payoff by using the lower bound of $A_{s}$ for $s\geq t$ .

Before time ${T}^{*}$ . First, we calculate the agent’s payoff before time $T^{*}$ . Given $(A_{s})_{s\geq t}$ , we have $\mu_{s}=\mu_{t}$ and $(\mu_{t},A_{s})\notin\psi_{n}$ for every $s\in[t,T^{*})$ because no information is injected when $Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{t},A_{s}))$ . Define $\psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}$ . This implies

	$\displaystyle\mathsf{TOR}(D(\mu_{t},A_{s}))=\mathsf{TOR}(\mu_{t}-\underline{\mu}(A_{s}))$	$\displaystyle\leq\mathsf{TOR}(\psi_{n}(A_{s})-\underline{\mu}(A_{s}))$
		$\displaystyle=\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),$		(1)

where the inequality follows from $\mathsf{TOR}$ being increasing, and the last equality follows from the property that $\psi_{n}(\cdot)$ is a translation of $\underline{\mu}(\cdot).$ Let $\bar{A}_{s}=\bar{A}(A_{t},s-t)$ , which is the aggregate play at $s\geq t$ when everyone switches to action $1$ as fast as possible. By the definition of $Z$ , we must have $Z_{s}=\bar{A}_{s}$ for every $s\in[t,T^{*})$ because $Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{s},A_{s}))$ . Then we can write down the lower bound of $A_{s}$ when $s\in[t,T^{*}]$ as follows:

\displaystyle A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

where the second inequality follows from (1). By Lipschitz continuity of $\Delta u(\cdot,\theta),$ we must have

\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L

(2)

with the Lipschitz constant $L$ . Thus, the expected payoff difference of taking action $1$ and $0$ at time $(\mu_{t},A_{t})$ given a continuous path $(A_{s})_{s\in[t,T^{*}]}$ before time $T^{*}$ is:

		$\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
		$\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
		$\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
		$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
		$\displaystyle\quad\quad-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$		(From (2))
		$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L}{\lambda},$		(3)

where the last inequality follows from

\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau\wedge T^{*}}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}\leq\mathbb{E}_{\tau}\bigg{[}\int_{s=0}^{\tau}ds\bigg{]}=\frac{1}{\lambda}.
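The bound by $1/\lambda$ can be checked numerically. The sketch below assumes that the switching time $\tau$ is exponentially distributed with rate $\lambda$ (consistent with $\mathbb{E}[\tau]=1/\lambda$ above) and ignores the truncation at $T^{*}$, which only shrinks the integral. The function name, the parameter values, and the quadrature settings are illustrative choices.

```python
import math

def discounted_time_until_tick(r, lam, horizon=200.0, steps=200_000):
    """E_tau[ int_0^tau e^{-r s} ds ] for tau ~ Exp(lam), by the midpoint rule.

    Uses E[int_0^tau f(s) ds] = int_0^inf f(s) P(tau > s) ds with P(tau > s) = e^{-lam s}.
    """
    h = horizon / steps
    return sum(math.exp(-(r + lam) * (k + 0.5) * h) for k in range(steps)) * h

r, lam = 0.05, 1.0           # illustrative discount rate and switching rate
value = discounted_time_until_tick(r, lam)
print(value, 1 / (r + lam), 1 / lam)
```

Under the exponential assumption the discounted integral equals $1/(r+\lambda)$, so replacing it with $1/\lambda$ is a valid, slightly loose, upper bound.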

After time ${T^{*}}$ . We calculate the lower bound of the expected payoff difference after time $T^{*}$ . We know that $\mu_{T^{*}-}=\mu_{t}.$ From the definition of $T^{*}$ , we consider the following two cases depending on whether $Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}}))$ holds or not.

Case 1: ${Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}}))}$ . This means $\mu_{T^{*}}=\mu_{t}$ because no information has been injected until $T^{*}$ . Then the definition of $T^{*}$ implies $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n},$ where $\psi_{n}$ is the round- $n$ dominance region. This means every agent strictly prefers to take action $1$ at $T^{*}$ . This increases $A_{T^{*}}$ , inducing every agent to take action $1$ after time $T^{*}$ .³⁸

³⁸ If $(\mu,A)\in\psi_{n}$ , then $(\mu,A^{\prime})\in\psi_{n}$ holds for any $A^{\prime}\geq A.$

Thus, for $s\geq T^{*}$ , we have

	$\displaystyle A_{s}=\bar{A}(A_{T^{*}},s-T^{*})$	$\displaystyle\geq\bar{A}(\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),s-T^{*})$
		$\displaystyle\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),$		(4)

where the first inequality follows from

A_{T^{*}}\geq\bar{A}_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))\geq\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

and the second inequality follows from

	$\displaystyle\bar{A}(\bar{A}_{T^{*}}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),s-T^{*})$
	$\displaystyle=1-\left(1-\bar{A}_{T^{*}}+\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\right)\exp(-\lambda(s-T^{*}))$
	$\displaystyle=\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\exp(-\lambda(s-T^{*}))$
	$\displaystyle\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})).$
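The chain of equalities above says that a shortfall below the target path at $T^{*}$ decays at rate $\lambda$ once everyone switches to action $1$. A minimal sketch, using the closed form $\bar{A}(a,\Delta)=1-(1-a)e^{-\lambda\Delta}$ that appears in the display, with illustrative numbers (`gap` stands in for the tolerance term and is an assumed value):

```python
import math

def A_bar(a, elapsed, lam):
    """Aggregate action after `elapsed` units of time if every agent who gets a
    switching opportunity (rate lam) plays 1, starting from aggregate action a."""
    return 1.0 - (1.0 - a) * math.exp(-lam * elapsed)

# Illustrative numbers (assumptions, not taken from the paper).
lam, t, T_star, A_t, gap = 1.0, 0.0, 0.7, 0.3, 0.05

A_bar_T = A_bar(A_t, T_star - t, lam)                # target path evaluated at T*
rows = []
for s in [0.7, 1.0, 2.0, 5.0]:
    A_bar_s = A_bar(A_t, s - t, lam)                  # target path evaluated at s
    lagged = A_bar(A_bar_T - gap, s - T_star, lam)    # path restarted `gap` below target at T*
    shrunk_gap = gap * math.exp(-lam * (s - T_star))  # shortfall predicted by the identity
    rows.append((s, A_bar_s, lagged, shrunk_gap))
    print(s, A_bar_s - lagged, shrunk_gap)
```

The two printed columns coincide for each $s$, and the shortfall never exceeds the initial `gap`, which is the inequality used in (4).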

Hence, by the Lipschitz continuity of $\Delta u(\cdot,\theta)$ , if $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}$ , then

\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L.

(5)

The expected payoff difference of taking action $1$ and $0$ at time $(\mu_{t},A_{t})$ given a path $(A_{s})_{s\in[t,T^{*}]}$ after time $T^{*}$ is

	$\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
	$\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
	$\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
	$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
	$\displaystyle\quad\quad-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$		(From (5))
	$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L}{\lambda}.$		(6)

Case 2: ${Z_{T^{*}}-A_{T^{*}}\geq\mathsf{TOR}(D(\mu_{T^{*}},A_{T^{*}}))}$ . By the definition of $T^{*}$ , information is injected at $T^{*}$ , and thus the belief at $T^{*}$ must be

\displaystyle\mu_{T^{*}}=\begin{cases}\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))&\text{w.p. $p_{+}(\mu_{t},A_{T^{*}})$}\\ \mu_{t}-\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))&\text{w.p. $p_{-}(\mu_{t},A_{T^{*}})$}.\end{cases}

Note that, if $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}$ , then everyone strictly prefers to take action $1$ at $T^{*}$ . This increases $A_{T^{*}}$ and induces every agent to take action $1$ after time $T^{*}$ because $(\mu_{s},A_{s})$ stays in $\psi_{n}$ for all $s\geq T^{*}$ . Hence, we can write down the lower bound of $A_{s}$ when $s>T^{*}$ as follows:

	$\displaystyle A_{s}$	$\displaystyle\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\bar{A}(A_{T^{*}},s-T^{*})+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*})$
		$\displaystyle\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\{\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\}$
		$\displaystyle\quad\quad\quad\quad\quad\quad+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*}),$

where the first inequality follows from the fact that everyone in the future will switch to action $0$ in the worst-case scenario if $(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}$ , and the second inequality follows from (1), as in the derivation of (4). By Lipschitz continuity of $\Delta u(\cdot,\theta)$ , if $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}$ , then

\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L,

(7)

and if $(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}$ , then

\displaystyle\Delta u(A_{s},\theta)\geq\Delta u(\bar{A}_{s},\theta)-L(\bar{A}_{s}-A_{s})\geq\Delta u(\bar{A}_{s},\theta)-L.

(8)

Define $p_{n}\coloneqq\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid(A_{s})_{s\in[t,T^{*}]})$ . The expected payoff difference of taking action $1$ and $0$ at time $(\mu_{t},A_{t})$ given a path $(A_{s})_{s\in[t,T^{*}]}$ after time $T^{*}$ is

	$\displaystyle\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(A_{s},\theta)ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
	$\displaystyle=\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
	$\displaystyle\quad\quad+\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\big{(}\Delta u(A_{s},\theta)-\Delta u(\bar{A}_{s},\theta)\big{)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$
	$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
	$\displaystyle\quad\quad-\big{(}\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))Lp_{n}+L(1-p_{n})\big{)}\cdot\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}ds\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\bigg{]}$		(From (7) and (8))
	$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=\tau\wedge T^{*}}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))Lp_{n}+L(1-p_{n})}{\lambda}.$		(9)

Combining before and after time ${T^{*}}$ . We are ready to construct a lower bound of the expected discounted payoff difference. To evaluate $s\geq T^{*}$ , it is sufficient to focus on the case in which information is injected (Case 2) since (9) is smaller than (6) because $\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))<1$ . By taking the sum of the payoffs before and after time $T^{*}$ , that is (3) and (9), the expected payoff difference of taking action 1 and 0 at $(\mu_{t},A_{t})$ given a path $(A_{s})_{s\in[t,T^{*}]}$ is lower-bounded as follows:

	$\displaystyle\mathbb{E}\Big{[}U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})\Big{\lvert}(A_{s})_{s\in[t,T^{*}]}\Big{]}$
	$\displaystyle\geq\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}-\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L(1+p_{n})+L(1-p_{n})}{\lambda}.$		(LB)

Intuitively, the expected payoff cannot be too low compared to the case where everyone switches to action $1$ in the future because (i) aggregate actions are close to the target before new information is injected; and (ii) if the belief jumps upward upon injection, everyone will subsequently switch to action $1$ .

Step 1C. Finally, we characterize $\psi_{n+1}$ . The following lemma establishes that under $\bm{\mu^{*}},$ $\psi_{n}$ is strictly increasing in the set order.

Lemma 3.

For all $n\in\mathbb{N}$ , $\psi_{n}\subset\psi_{n+1}$ (strict inclusion).

Proof of Lemma 3.

To characterize $\psi_{n+1}$ , we first show that there exist a tolerance function $\mathsf{TOR}$ , an upward jump magnitude $M,$ and a downward jump size $\mathsf{DOWN}$ such that if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ , then $U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})>0$ .

Suppose $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ . First, we evaluate the first term of (LB). We know from the definition of the lower-dominance region $\underline{\mu}(A_{t})$ that

\underline{\mu}(A_{t})\underbrace{\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}}_{\geq 0}+(1-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\bigg{]}\geq 0

with equality when $\underline{\mu}(A_{t})>0$ . If $\underline{\mu}(A_{t})>0$ , we must have $\mathbb{E}_{\tau}\big{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\big{]}\leq 0$ , which implies

	$\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
	$\displaystyle=(\mu_{t}-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}(\Delta u(\bar{A}_{s},1)-\Delta u(\bar{A}_{s},0))ds\bigg{]}$
	$\displaystyle\geq(\mu_{t}-\underline{\mu}(A_{t}))\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}$
	$\displaystyle>C(\mu_{t}-\underline{\mu}(A_{t}))$		(10)

for some $C>0$ . This constant $C$ exists because

\min_{A_{t}\in[0,1]}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}>0

since $\Delta u(A,1)>0$ for any $A\in[0,1]$ . If $\underline{\mu}(A_{t})=0$ , we have $\mathbb{E}_{\tau}\big{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},0)ds\big{]}\geq 0$ , which implies

\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}\geq\mu_{t}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},1)ds\bigg{]}>C(\mu_{t}-\underline{\mu}(A_{t})).

Additionally, note that if $\mathsf{TOR}$ satisfies $M\cdot\mathsf{TOR}(D(\mu_{s},A_{s}))/2\leq\mu_{s}-\underline{\mu}(A_{s})$ for every $(\mu_{s},A_{s})$ , then $\mu_{t}-\underline{\mu}(A_{t})\geq\frac{1}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$ . This follows from

	$\displaystyle\frac{1}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$	$\displaystyle\leq\frac{1}{2}(\mu_{t}-\underline{\mu}(A_{t}))+\frac{1}{2}\cdot\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))}{2}$
		$\displaystyle\leq\mu_{t}-\underline{\mu}(A_{t}),$

where the first inequality follows from $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ , and the second inequality follows from $M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2\leq\mu_{t}-\underline{\mu}(A_{t})$ . Thus, if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ , then

\displaystyle\mathbb{E}_{\tau,\theta\sim\mu_{t}}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}

\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})).

(11)

Next, we evaluate the second term of (LB). Notice that, if $(\mu_{t},A_{t})\notin\psi_{n}$ , then

\displaystyle p_{n}=\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n})=p_{+}(\mu_{t},A_{T^{*}})1\{(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}\}.

We will show that $(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}$ if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ . Observe that when $(\mu_{t},A_{t})\notin\psi_{n}$ , we must have

\displaystyle A_{T^{*}}>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t})).

To see this, suppose for a contradiction that $A_{T^{*}}\leq A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))$ , which implies $A_{T^{*}}<A_{t}$ . However, since the definition of $T^{*}$ implies $A_{T^{*}}=Z_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ , we have

	$\displaystyle A_{T^{*}}$	$\displaystyle=Z_{T^{*}}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$
		$\displaystyle>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t})),$

where the inequality follows from $Z_{T^{*}}=\bar{A}_{T^{*}}>A_{t}$ and the fact that $\mathsf{TOR}(D(\mu_{t},A))$ is increasing in $A$ . This is a contradiction.

Lemma 1 shows that $\underline{\mu}$ is a Lipschitz function. Since $\psi_{n}(A_{t})$ is a translation of $\underline{\mu}(A_{t})$ , $\psi_{n}(A_{t})$ has the same Lipschitz constant $L_{\underline{\mu}}$ as $\underline{\mu}(A_{t})$ . Hence, if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ , we must have

	$\displaystyle\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$	$\displaystyle\geq\psi_{n}(A_{t})+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$
		$\displaystyle>\big{(}\psi_{n}(A_{T^{*}})-L_{\underline{\mu}}\mathsf{TOR}(D(\mu_{t},A_{t}))\big{)}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$
		$\displaystyle=\psi_{n}(A_{T^{*}}),$

by setting $M=2L_{\underline{\mu}}$ . Thus, $(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}$ holds, implying

p_{n}=p_{+}(\mu_{t},A_{T^{*}})=\frac{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}.
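The probability $p_{+}$ above is exactly the weight that makes the binary-support injection mean-preserving, so beliefs remain a martingale. A minimal sketch with illustrative numbers (the function name `injection` and the values of `mu`, `M`, `tor`, and `down` are assumptions for illustration):

```python
def injection(mu, up, down):
    """Binary-support belief jump: to mu + up with probability p_plus,
    and to mu - down with probability 1 - p_plus."""
    p_plus = down / (down + up)
    return [(mu + up, p_plus), (mu - down, 1.0 - p_plus)]

# Illustrative numbers (assumptions): the upward jump is M * tor.
mu, M, tor, down = 0.4, 2.0, 0.01, 0.1
lottery = injection(mu, M * tor, down)

posterior_mean = sum(belief * prob for belief, prob in lottery)
print(lottery, posterior_mean)
```

Because the downward jump is large relative to the upward jump, the upward branch carries most of the probability mass, which is the asymmetry the argument exploits.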

We set

\mathsf{DOWN}(D(\mu_{t},A_{t}))=\frac{\mu_{t}-\underline{\mu}(A_{t})}{2}\quad\text{and}\quad\mathsf{TOR}(D(\mu_{t},A_{t}))=\bar{\delta}\cdot\frac{\lambda C(\mu_{t}-\underline{\mu}(A_{t}))}{4L+4LM(\mu_{t}-\underline{\mu}(A_{t}))^{-1}},

for a fixed small number $\bar{\delta}<1$ so that $\mathsf{TOR}(D(\mu_{s},A_{s}))<1$ and $M\cdot\mathsf{TOR}(D(\mu_{s},A_{s}))/2\leq\mu_{s}-\underline{\mu}(A_{s})$ for every $(\mu_{s},A_{s})$ with $\mu_{s}>\underline{\mu}(A_{s})$ (e.g., any $\bar{\delta}<\min\{1,\frac{4L}{\lambda C},\frac{4LM}{\lambda C}\}$ ). Thus,

$\displaystyle 1-p_{n}$	$\displaystyle=\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}$
	$\displaystyle\leq\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}$
	$\displaystyle=\frac{\bar{\delta}\lambda MC}{2L+2LM(\mu_{t}-\underline{\mu}(A_{T^{*}}))^{-1}}$
	$\displaystyle\leq\frac{\bar{\delta}\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}},$	(12)

where the last inequality follows from the continuity of $A_{s}$ and what we argued in (1) that $\mu_{t}-\underline{\mu}(A_{s})\leq\psi_{n}(A_{t})-\underline{\mu}(A_{t})$ for every $s<T^{*}$ .

Thus, if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2,$ we have

	$\displaystyle\mathbb{E}[U_{1}(\mu_{t},(A_{s})_{s\geq t})-U_{0}(\mu_{t},(A_{s})_{s\geq t})\mid(A_{s})_{s\in[t,T^{*}]}]$
	$\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{1}{\lambda}\Big{(}\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L(1+p_{n})+L(1-p_{n})\Big{)}$		(From (LB) and (11))
	$\displaystyle>\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{L}{\lambda}\Big{(}2\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))+(1-p_{n})\Big{)}$
	$\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-\frac{\bar{\delta}L}{\lambda}\cdot\bigg{(}\frac{\lambda C(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))+\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}}\bigg{)}$		(From the definition of $\mathsf{TOR}$ and (12))
	$\displaystyle=\frac{C(1-\bar{\delta})}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$
	$\displaystyle>0,$

for every given path $(A_{s})_{s\in[t,T^{*}]}$ .
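The final equality in the chain above is pure algebra and can be verified numerically. The sketch below plugs the stated form of $\mathsf{TOR}$ and the bound on $1-p_{n}$ from (12) into the second-to-last line and compares the result with $\frac{C(1-\bar{\delta})}{2}c$, where $c$ stands for $\psi_{n}(A_{t})-\underline{\mu}(A_{t})$. All parameter values are arbitrary illustrative choices.

```python
def tor(d, delta_bar, lam, C, L, M):
    """Tolerance function from the text: delta_bar * lam*C*d / (4L + 4LM/d)."""
    return delta_bar * lam * C * d / (4 * L + 4 * L * M / d)

def lower_bound(c, delta_bar, lam, C, L, M):
    """C/2 * c - (L/lam) * (2*TOR(c) + bound on 1 - p_n), with the bound from (12)."""
    one_minus_p = delta_bar * lam * M * C / (2 * L + 2 * L * M / c)
    return 0.5 * C * c - (L / lam) * (2 * tor(c, delta_bar, lam, C, L, M) + one_minus_p)

params = dict(delta_bar=0.5, lam=1.3, C=0.7, L=2.0, M=3.0)  # illustrative numbers
gaps = []
for c in [0.01, 0.1, 0.5]:
    closed_form = 0.5 * params["C"] * (1 - params["delta_bar"]) * c
    gaps.append(lower_bound(c, **params) - closed_form)
print(gaps)
```

The differences are zero up to floating-point error, and the closed form is strictly positive whenever $\bar{\delta}<1$ and $c>0$.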

In conclusion, we found $\mathsf{TOR}$ , $M$ , and $\mathsf{DOWN}$ such that if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ , then the agent must choose action $1$ . Note that $\mathsf{TOR}(D(\mu_{t},A_{t}))$ is increasing in $\mu_{t}$ and increasing in $A_{t}$ . Thus, $\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ is increasing in $\mu_{t}$ and continuous in $\mu_{t}$ when $\mu_{t}>\underline{\mu}(A_{t})$ . Therefore, for each $A_{t}$ , there exists $\mu^{\prime}(A_{t})<\psi_{n}(A_{t})$ such that

\displaystyle\mu^{\prime}(A_{t})+\frac{M\cdot\mathsf{TOR}(D(\mu^{\prime}(A_{t}),A_{t}))}{2}=\psi_{n}(A_{t}).

Then we define

\psi_{n+1}=\{(\mu_{t},A_{t}):\mu_{t}\geq\mu^{\prime}(A_{t})\},

which also implies $\psi_{n+1}(A_{t}):=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n+1}\}=\mu^{\prime}(A_{t})$ . From the argument above, we must have an agent always choosing action $1$ whenever $(\mu_{t},A_{t})\in\psi_{n+1}$ (Contagion in Definition 5). Moreover, we can rewrite the above equation as follows:

\displaystyle(\psi_{n+1}(A_{t})-\underline{\mu}(A_{t}))+\frac{M\cdot\mathsf{TOR}(\psi_{n+1}(A_{t})-\underline{\mu}(A_{t}))}{2}=\psi_{n}(A_{t})-\underline{\mu}(A_{t}),

where the RHS is constant in $A_{t}$ by the translation property of $\psi_{n}$ . Thus, $\psi_{n+1}(A_{t})-\underline{\mu}(A_{t})$ must also be constant in $A_{t}$ (Translation in Definition 5). This shows that the round- $(n+1)$ dominance region $\psi_{n+1}$ satisfies $\psi_{n}\subset\psi_{n+1}$ because $c_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})>\psi_{n+1}(A_{t})-\underline{\mu}(A_{t})=:c_{n+1}$ . ∎

Step 1D. In the limit, the sequence $(\psi_{n})_{n}$ covers the $(\mu,A)$ region where action $1$ is not strictly dominated.

Lemma 4.

\bigcup_{n\in\mathbb{N}}\psi_{n}=\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:\mu>\underline{\mu}(A)\Big{\}}.

Proof of Lemma 4.

Recall $\psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}.$ By Lemma 3, $\psi_{n}(A_{t})$ is decreasing in $n$ . Define $\psi^{*}(A_{t})=\lim_{n\to\infty}\psi_{n}(A_{t})$ . In the limit, we must have

	$\displaystyle\psi^{*}(A_{t})+M\cdot\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))/2=\psi^{*}(A_{t})$
	$\displaystyle\Rightarrow\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))=0\Rightarrow\psi^{*}(A_{t})=\underline{\mu}(A_{t}),$

which implies

\bigcup_{n\geq 0}\psi_{n}=\Big{\{}(\mu_{t},A_{t}):\mu_{t}>\underline{\mu}(A_{t})\Big{\}}

as required. ∎
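The iteration behind Lemmas 3 and 4 can be illustrated numerically. The sketch below tracks the gap $c_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})$ and solves $c_{n+1}+M\cdot\mathsf{TOR}(c_{n+1})/2=c_{n}$ by bisection at each round, using the functional form of $\mathsf{TOR}$ stated above. The parameter values, the starting gap, and the number of rounds are illustrative assumptions.

```python
def tor(d, delta_bar=0.5, lam=1.0, C=1.0, L=1.0, M=2.0):
    """Tolerance function from the text, with illustrative default parameters."""
    return delta_bar * lam * C * d / (4 * L + 4 * L * M / d) if d > 0 else 0.0

def next_threshold(c_n, M=2.0, tol=1e-14):
    """Solve c + M*TOR(c)/2 = c_n for c in (0, c_n) by bisection.
    The left-hand side is increasing in c, so the root is unique."""
    lo, hi = 0.0, c_n
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + M * tor(mid) / 2 < c_n:
            lo = mid
        else:
            hi = mid
    return lo

cs = [0.5]                      # assumed initial gap c_0
for _ in range(2000):
    cs.append(next_threshold(cs[-1]))
print(cs[1], cs[10], cs[100], cs[-1])
```

The gaps shrink strictly at every round and drift toward zero, in line with $\psi_{n}(A_{t})\downarrow\underline{\mu}(A_{t})$; the convergence is slow because the per-round contraction $M\cdot\mathsf{TOR}(c)/2$ is of order $c^{2}$ for small $c$.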

Step 2. We have constructed an information policy which uniquely implements an equilibrium achieving (2) for $|\Theta|=2$ . We now lift this to the case with finite states $\Theta=\{\theta_{1},\ldots,\theta_{n}\}$ as set out in the main text, where, recall, we set $\theta^{*}$ as the dominant state.

In particular, we show that if $\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}},$ then playing action $1$ is the unique subgame perfect equilibrium under the information policy $\bm{\mu}^{*}$ . To apply Step 1, we will construct an auxiliary binary-state environment for each direction from $\delta_{\theta^{*}}$ .

To this end, we call a vector $\hat{\bm{d}}=(\hat{d}_{\theta})_{\theta\in\Theta}\in\mathbb{R}^{n}$ a feasible directional vector if $\sum_{\theta}{\hat{d}_{\theta}}=0$ and $\hat{d}_{\theta^{*}}=1$ but $\hat{d}_{\theta}<0$ if $\theta^{*}\neq\theta$ . For each feasible directional vector $\hat{\bm{d}}$ , define a function $\bar{\alpha}_{\hat{\bm{d}}}:[0,1]\to[0,1]$ such that, for every $A\in[0,1],$

\bar{\alpha}_{\hat{\bm{d}}}(A)=\inf\Big{\{}\alpha\in[0,1]:\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}\notin\Psi_{LD}(A)\cup\text{Bd}_{\theta^{*}}\Big{\}}.

Note that $\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}\in\text{Bd}_{\theta^{*}}$ if and only if $\alpha=0$ because $\hat{d}_{\theta^{*}}=1$ . Observe that

\displaystyle\big{(}\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\big{)}^{c}=\bigcup_{\hat{\bm{d}}\in\mathcal{D}}\big{\{}\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}:\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]\big{\}},

where $\mathcal{D}$ is the set of all feasible directional vectors. This is true because 1) $\big{(}\Psi_{LD}(A_{t})\big{)}^{c}$ is a polygon since the expectation operator is linear; and 2) $\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\subset\Delta(\Theta)$ is closed. Thus, it is equivalent to show that, for every feasible directional vector $\hat{\bm{d}},$ if $\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]$ , then playing action $1$ is the unique subgame perfect equilibrium under the information policy $\bm{\mu^{*}}$ .

Fix a feasible directional vector $\hat{\bm{d}}.$ Define

\Delta(\Theta)_{\hat{\bm{d}}}=\big{\{}\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}:\alpha\in[0,1]\big{\}}

as the set of beliefs whose direction from $\delta_{\theta^{*}}$ is $\hat{\bm{d}}$ . Consider an auxiliary environment with binary state $\tilde{\Theta}=\{0,1\}.$ Construct a bijection $\psi_{\hat{\bm{d}}}:\Delta(\Theta)_{\hat{\bm{d}}}\to\Delta(\tilde{\Theta})$ such that $\psi_{\hat{\bm{d}}}(\mu)=\alpha$ if $\mu=\delta_{\theta^{*}}-(1-\alpha)\hat{\bm{d}}.$ Denote $\tilde{\mu}\coloneqq\psi_{\hat{\bm{d}}}(\mu)\in\Delta(\tilde{\Theta})$ for every $\mu\in\Delta(\Theta)_{\hat{\bm{d}}}$ . Note that $\psi_{\hat{\bm{d}}}(\delta_{\theta^{*}})=1$ .

We define a flow payoff for each player under the new environment $\tilde{u}:\{0,1\}\times[0,1]\times\tilde{\Theta}\to\mathbb{R}$ as follows:

\tilde{u}(a,A,\tilde{\theta})=u\left(a,A,\psi^{-1}_{\hat{\bm{d}}}(\tilde{\theta})\right).

Define $\Delta\tilde{u}(A,\tilde{\theta}):=\tilde{u}(1,A,\tilde{\theta})-\tilde{u}(0,A,\tilde{\theta})$ . Since $\psi_{\hat{\bm{d}}}$ is a linear map, $\Delta\tilde{u}(A,\tilde{\theta})$ is still continuously differentiable and strictly increasing in $A.$ Also, given that $\Delta\tilde{u}(0,1)=\Delta u(0,\theta^{*})>0$ , we still have an action- $1$ -dominance region under this new environment.

Then we can similarly define the maximum belief under which players prefer action $0$ even if all future players choose to play action $1:$

\underline{\mu}_{\hat{\bm{d}}}(A_{t}):=\max\Big{\{}\tilde{\mu}\in\Delta(\tilde{\Theta}):\mathbb{E}\Big{[}\int_{t}\tilde{u}(0,\bar{A}_{s},\tilde{\theta})ds\Big{]}\geq\mathbb{E}\Big{[}\int_{t}\tilde{u}(1,\bar{A}_{s},\tilde{\theta})ds\Big{]}\Big{\}}.

We define $\tilde{D}(\tilde{\mu},A)=\tilde{\mu}-\underline{\mu}_{\hat{\bm{d}}}(A)$ . Then it is easy to see that $\tilde{D}(\mu,A)=D(\mu,A)$ for every $\mu\in\Delta(\tilde{\Theta})$ and $A\in[0,1]$ .

A key observation is that if $\mu_{t-}\notin\Psi_{LD}(A_{t})$ and $\mu_{t-}\in\Delta(\Theta)_{\hat{\bm{d}}}$ , then every future belief must stay in $\Delta(\Theta)_{\hat{\bm{d}}}$ almost surely with respect to any strategy. We can rewrite the time- $t$ information structure corresponding to the new environment as follows:

1.

Silence on-path. If $\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t})$ and $|A_{t}-Z_{t-}|<\mathsf{TOR}(D)$

$\mu_{t}=\mu_{t-}$ almost surely,

i.e., no information, and $dZ_{t}=\lambda(1-Z_{t-}).$

2.

Noisy and asymmetric off-path. If $\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t})$ and $Z_{t-}-A_{t}\geq\mathsf{TOR}(D),$

\displaystyle\tilde{\mu}_{t}=\begin{cases}\tilde{\mu}_{t-}+M\cdot\mathsf{TOR}(D)&\text{w.p. $\frac{\mathsf{DOWN}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$}\\ \tilde{\mu}_{t-}-\mathsf{DOWN}(D)&\text{w.p. $\frac{M\cdot\mathsf{TOR}(D)}{\mathsf{DOWN}(D)+M\cdot\mathsf{TOR}(D)}$},\end{cases}

and reset $Z_{t}=A_{t}$ .
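The off-path injection above is a two-point, mean-preserving spread of the current belief, which is what keeps the belief process a martingale. The following Python sketch checks this with exact rational arithmetic. The function name `inject` and the numerical values of `tor`, `down`, and `M` are illustrative assumptions, not values taken from the paper.

```python
from fractions import Fraction

def inject(mu, tor, down, M):
    """Return the two posterior beliefs and their probabilities after an off-path injection.

    mu   : current (one-dimensional) belief just before the injection
    tor  : tolerance TOR(D); the upward step is M * tor
    down : downward step DOWN(D)
    M    : scaling constant of the upward step
    """
    up_step = M * tor
    p_up = down / (down + up_step)        # probability of the upward jump
    p_down = up_step / (down + up_step)   # probability of the downward jump
    return [(mu + up_step, p_up), (mu - down, p_down)]

# Hypothetical numbers, chosen only so that both posteriors stay in [0, 1].
outcomes = inject(Fraction(3, 5), tor=Fraction(1, 100), down=Fraction(1, 10), M=2)
mean_posterior = sum(belief * prob for belief, prob in outcomes)
total_prob = sum(prob for _, prob in outcomes)
```

With these hypothetical numbers the upward jump is small but likely and the downward jump is large but rare, which is the asymmetry the policy relies on; the posterior mean coincides with the prior belief.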

By applying Step 1, we conclude that if $\tilde{\mu}_{t-}>\underline{\mu}_{\hat{\bm{d}}}(A_{t})$ , then action $1$ is played under any subgame perfect equilibrium. The only subtlety is to verify that as in (10), there exists a constant $C>0$ such that

\displaystyle\min_{A_{t}\in[0,1]}\mathbb{E}_{\tau}\bigg{[}\int_{s=t}^{\tau}e^{-r(s-t)}\Delta\tilde{u}(\bar{A}_{s},1)ds\bigg{]}\geq C

for any feasible directional vector $\hat{\bm{d}}.$ This is clear because $\Delta\tilde{u}(A,1)=\Delta u(A,\theta^{*})>\Delta u(0,\theta^{*})>0$ for every $A$ by the definition of $\theta^{*}$ .

Since $[0,\underline{\mu}_{\hat{\bm{d}}}(A_{t})]=\psi_{\hat{\bm{d}}}\big{(}\Delta(\Theta)_{\hat{\bm{d}}}\cap\Psi_{LD}(A_{t})\big{)}$ , we have $(\underline{\mu}_{\hat{\bm{d}}}(A_{t}),1]=(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]$ . Hence, if $\alpha\in(\bar{\alpha}_{\hat{\bm{d}}}(A_{t}),1]$ , then playing action $1$ is the unique subgame perfect equilibrium under the information policy $\bm{\mu^{*}}$ , as desired.

Step 3. We now show sequential optimality. Step 3A handles the case when beliefs are such that $1$ is strictly dominated, while 3B handles the case when $1$ is not strictly dominated.

Step 3A. $\bm{\mu^{*}}$ is $\epsilon$ -sequentially optimal when $\mu^{*}_{t}\in\Psi_{LD}(A_{t}).$

Fix any $\mu_{0}\in\Psi_{LD}(A_{0})$ . Define $\tau^{*}\coloneqq\inf\{t:\mu_{t}\notin\Psi_{LD}(A_{0})\}$ and $\bar{\tau}\coloneqq\inf\{t:\mu_{t}\notin\Psi_{LD}(A_{t})\}$ , i.e., $\tau^{*}$ and $\bar{\tau}$ are the first times $t$ at which the belief $\mu_{t}$ is not in $\Psi_{LD}(A_{0})$ and $\Psi_{LD}(A_{t})$ , respectively. This means, at $s<\bar{\tau}$ , all agents who can switch choose action $0$ . This pins down an aggregate action $A_{t}=\underline{A}(A_{0},t)$ for every $t\leq\bar{\tau}$ . Therefore, $A_{t}<A_{0}$ for every $t\leq\bar{\tau}$ , implying $\Psi_{LD}(A_{0})\subset\Psi_{LD}(A_{t})$ . Thus, $\tau^{*}\geq\bar{\tau},$ and so $A_{t}=\underline{A}(A_{0},t)$ for every $t\leq\tau^{*}$ .

Moreover, we know that $A_{s}\leq\bar{A}(A_{t},s-t)$ for any $s\geq t$ , so we can find an upper bound of the designer’s payoff as follows:

	$\displaystyle\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}$
	$\displaystyle=\mathbb{E}_{\tau^{*}}\left[\phi(\bm{A})\mathbb{1}(\tau^{*}=\infty)+\phi(\bm{A})\mathbb{1}(\tau^{*}<\infty)\right]$
	$\displaystyle\leq\mathbb{E}_{\tau^{*}}\left[\phi(\bm{\underline{A}})\mathbb{1}(\tau^{*}=\infty)+\phi(\bm{\bar{A}})\mathbb{1}(\tau^{*}<\infty)\right]$
	$\displaystyle=\phi(\bm{\underline{A}})+\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}\mathbb{P}(\tau^{*}<\infty),$

where $\bm{\underline{A}}$ satisfies $\underline{A}_{t}=\underline{A}(A_{0},t)$ , and $\bm{\bar{A}}$ satisfies $\bar{A}_{t}=\bar{A}(A_{0},t)$ .

For every $t\in[0,\infty)$ , the optional stopping theorem implies

	$\displaystyle\mu_{0}$	$\displaystyle=\mathbb{E}\big{[}\mu_{\tau^{*}\wedge t}\big{]}$
		$\displaystyle=\mathbb{E}[\mu_{\tau^{*}}\mid\tau^{*}<t]\mathbb{P}(\tau^{*}<t)+\mathbb{E}[\mu_{t}\mid\tau^{*}\geq t]\mathbb{P}(\tau^{*}\geq t)$
		$\displaystyle\geq\underbrace{\mathbb{E}[\mu_{\tau^{*}}\mid\tau^{*}<t]}_{\eqqcolon\hat{\mu}_{t}}\mathbb{P}(\tau^{*}<t).$

This implies $\hat{\mu}_{t}\in F(\mathbb{P}(\tau^{*}<t),\mu_{0})$ for every $t.$ By the definition of $\tau^{*}$ and the right-continuity of $\mu_{t}$ , we have $\mu_{\tau^{*}}\in\overline{\Psi^{c}_{LD}(A_{0})}$ on the event $\{\tau^{*}<\infty\}$ . Since $\overline{\Psi^{c}_{LD}(A_{0})}$ is convex, we also have $\hat{\mu}_{t}\in\overline{\Psi^{c}_{LD}(A_{0})}.$ This means $\hat{\mu}_{t}\notin\text{Int }\Psi_{LD}(A_{0})$ , but $\hat{\mu}_{t}\in F(\mathbb{P}(\tau^{*}<t),\mu_{0})$ . The definition of $p^{*}(\mu_{0},A_{0})$ then implies $p^{*}(\mu_{0},A_{0})\geq\mathbb{P}(\tau^{*}<t)$ for every $t.$ Thus,

	$\displaystyle\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}$	$\displaystyle\leq\phi(\bm{\underline{A}})+\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}p^{*}(\mu_{0},A_{0})$
		$\displaystyle=(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}}).$

This implies

\displaystyle\eqref{eqn:opt}=\sup_{\begin{subarray}{c}\bm{\mu}\in\mathcal{M}\\ \sigma\in\Sigma(\bm{\mu},A_{0})\end{subarray}}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\leq(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}}).

Under $\bm{\mu}^{*}$ , if $\mu_{0+}\in\Psi^{c}_{LD}(A_{0}),$ then everyone takes action $1$ in any equilibrium, as we argued earlier. Thus,

\displaystyle\inf_{\sigma\in\Sigma(\bm{\mu}^{*},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\geq(1-p^{*}(\mu_{0},A_{0})+\eta)\phi(\bm{\underline{A}})+(p^{*}(\mu_{0},A_{0})-\eta)\phi(\bm{\bar{A}}).

Taking the limit $\eta\to 0$ , we obtain

\displaystyle\eqref{eqn:adv}=\sup_{\bm{\mu}\in\mathcal{M}}\inf_{\sigma\in\Sigma(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\geq(1-p^{*}(\mu_{0},A_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0},A_{0})\phi(\bm{\bar{A}})\geq\eqref{eqn:opt}.

Since $\eqref{eqn:opt}\geq\eqref{eqn:adv}$ , we obtain $\eqref{eqn:opt}=\eqref{eqn:adv}$ .
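The optional-stopping step in Step 3A can be illustrated in a binary-state setting, where beliefs are scalars. The sketch below is a simplified assumption-laden illustration, not the paper's construction: it simulates a symmetric random walk on an integer grid (a bounded martingale) started at a belief below a hypothetical threshold and absorbed at either $0$ or the threshold. For this particular martingale, optional stopping gives an exit probability of exactly $\mu_{0}$ divided by the threshold, which is also the upper bound that the inequality $\mu_{0}\geq\hat{\mu}_{t}\mathbb{P}(\tau^{*}<t)$ delivers for any martingale. The grid, the threshold, and the function name `exit_frequency` are all hypothetical.

```python
import random

def exit_frequency(start, threshold, n_paths, seed=0):
    """Fraction of symmetric random-walk paths on an integer grid that reach
    `threshold` before 0. The stopped walk is a bounded martingale, so
    optional stopping gives an exit probability of start / threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        x = start
        while 0 < x < threshold:
            x += 1 if rng.random() < 0.5 else -1
        hits += (x == threshold)
    return hits / n_paths

# Hypothetical grid: belief = 0.05 * x, so mu_0 = 0.2 and the threshold is 0.5.
freq = exit_frequency(start=4, threshold=10, n_paths=20000)
bound = 4 / 10  # mu_0 / threshold
```

The simulated exit frequency should be close to `bound`; no information policy that keeps beliefs a martingale can push it higher.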

Step 3B. We finally show $\bm{\mu^{*}}$ is sequentially optimal when $\mu^{*}_{t}\notin\Psi_{LD}(A_{t})\cup\text{Bd}_{\theta^{*}}\Delta(\Theta).$ We proceed casewise:

•

Case 1: $\mu_{t-}\notin\Psi_{LD}(A_{t})$ and $|A_{t}-Z_{t-}|<\mathsf{TOR}(D(\mu_{t-},A_{t}))$ . In this case, no information arrives, and everyone takes action 1. This increases $A_{t}$ , and every agent takes action $1$ from time $t$ onwards. This is the best outcome for the designer, implying sequential optimality.
•

Case 2: $\mu_{t-}\notin\Psi_{LD}(A_{t})$ and $|A_{t}-Z_{t-}|\geq\mathsf{TOR}(D(\mu_{t-},A_{t}))$ . In this case, the belief moves to either $\mu_{t-}+(M\cdot\mathsf{TOR}(D))\cdot\hat{d}(\mu_{t-})$ or $\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-})$ . Note that $\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-})\notin\Psi_{LD}(A_{t})$ because $\psi_{\hat{\bm{d}}}(\mu_{t-}-\mathsf{DOWN}(D)\cdot\hat{d}(\mu_{t-}))=(1+\bar{\alpha}_{\hat{\bm{d}}}(A_{t}))/2>\bar{\alpha}_{\hat{\bm{d}}}(A_{t})$ . So no matter what information arrives, every agent takes action 1. This increases $A_{t}$ , and every agent takes action $1$ after time $t$ . Again, this is the best outcome for the designer, implying sequential optimality.

Appendix B Designing private information

In this appendix we discuss whether the designer can do better by designing private information.

Relaxed feasibility for joint belief processes.

We consider the relaxed problem under which each agent’s belief can be ‘separately controlled’ i.e., any joint distribution over agents’ beliefs under which the marginal distribution is a martingale is feasible under the relaxed problem. There is a common prior $\mu_{0}$ and a private belief process $\bm{\mu}_{i}:=(\mu_{it})_{t}$ , where $\mu_{it}:=\mathbb{P}(\theta=1|\mathcal{F}_{it})$ with $\mathcal{F}_{it}$ being agent $i$ ’s time- $t$ filtration generated by $(A_{s},\mu_{is})_{s\leq t}$ .

The belief process for agent $i\in[0,1]$ , $\bm{\mu}_{i}:=(\mu_{it})_{t}$ , is R-feasible if it is an $(\mathcal{F}_{it})_{t}$ -martingale. The set of joint R-feasible belief processes is

\mathcal{M}^{P}:=\Big{\{}(\mu_{it})_{t}:i\in[0,1],\text{ $(\mu_{it})_{t}$ is R-feasible }\Big{\}}.

We emphasize that this is a necessary condition on beliefs, but is not sufficient (see, e.g., Arieli, Babichenko, Sandomirskiy, and Tamuz (2021); Morris (2020) for a discussion of the static case). Let the set of feasible joint belief processes be $\mathcal{M}^{F}$ . Although it is still an open question of how to characterize this set, we know $\mathcal{M}^{F}\subseteq\mathcal{M}^{P}$ .

The problem under private information.

\sup_{\bm{\mu}\in\mathcal{M}^{F}}\inf_{\bm{\sigma}\in PBE(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}.

noting that we have moved from subgame perfection to Perfect Bayesian Equilibria (PBE) (Fudenberg and Tirole, 1991) since players now hold private information. However, observe that $\mathcal{M}\subseteq\mathcal{M}^{F}$ and, furthermore, that PBE coincides with SPE under public information, so $\eqref{eqn:private_ADV}\geq\eqref{eqn:adv}$ .

Theorem 1B.

Suppose that $\mu_{0}\notin\Psi_{LD}(A_{0})\cup\text{Bd}_{\theta^{*}}$ , then

\eqref{eqn:private_ADV}=\eqref{eqn:adv}.

If $\mu_{0}\in\Psi_{LD}(A_{0})$ and further supposing $\phi$ is a convex functional, then

\eqref{eqn:private_ADV}-\eqref{eqn:adv}\leq\Big{(}p^{*}(\mu_{0},A_{0})-p^{*}(\mu_{0},1)\Big{)}\Big{(}\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}-\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}\Big{)}.

Proof.

The case in which $\mu_{0}\notin\Psi_{LD}(A_{0})\cup\text{Bd}_{\theta^{*}}$ follows directly from Theorem 1 since it already attains the upper bound on the time-path of aggregate play. We prove the second part in several steps.

Step 1A. Constructing a relaxed problem.

Some care is required in constructing the relaxed problem: by moving from $\mathcal{M}^{F}$ to $\mathcal{M}^{P}$ , equilibria of the resultant game might not be well-defined. We deal with this in two ways. First, we weaken PBE to what we call non-dominance, which requires that players play action $1$ whenever it is not strictly dominated. Notice that this is not an equilibrium concept and is well-defined even with heterogeneous beliefs. Second, we replace the inner $\inf$ with $\sup$ to obtain the relaxed problem

\sup_{\bm{\mu}\in\mathcal{M}^{P}}\sup_{\bm{\sigma}\in ND(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}.

It is easy to see that this is indeed a relaxed problem, i.e., $\eqref{eqn:private_ADV_R}\geq\eqref{eqn:private_ADV}$ , since (i) $\mathcal{M}^{P}\supseteq\mathcal{M}^{F}$ and (ii) for each $\bm{\mu}\in\mathcal{M}^{F}$ , $PBE(\bm{\mu},A_{0})\subseteq ND(\bm{\mu},A_{0})$ .

Step 1B. Solving the relaxed problem.

First observe that for each player $i\in[0,1]$ , a necessary condition for action $1$ to not be strictly dominated is

\mu_{it}>\underline{\mu}(A=1)

Hence, consider the strategy $\overline{\sigma}$ in which each player $i$ plays $1$ if $\mu_{it}>\underline{\mu}(A=1)$ and $0$ otherwise. Clearly,

\sup_{\bm{\mu}\in\mathcal{M}^{P}}\mathbb{E}^{\overline{\sigma}}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}\geq\eqref{eqn:private_ADV_R}.

Let $(\mu_{it})_{t}$ be any càdlàg martingale and let $\tau_{i}:=\inf\{t\in\mathcal{T}:\mu_{it}\notin\Psi_{LD}(1)\}$ . Clearly this càdlàg martingale is improvable if it continues to deliver information after $\tau_{i}$ , so it is without loss to consider $(\mu_{it})_{t}$ which are constant a.s. after $\tau_{i}$ . But observe that since $(\mu_{it})_{t}$ is a martingale, the probability of exiting the region $\Psi_{LD}(1)$ is upper-bounded by the same optional-stopping calculation as before:

\mathbb{P}(\tau_{i}<+\infty)\leq p^{*}(\mu_{0},1).

We define the (random) number of agents whose beliefs eventually cross $\underline{\mu}(1)$ as follows:

\displaystyle F=\int_{i\in I}1\{\tau_{i}<\infty\}di.

Consider that

\displaystyle\mathbb{E}_{\mu}[F]=\mathbb{E}_{\mu}\bigg{[}\int_{i\in I}1\{\tau_{i}<\infty\}di\bigg{]}=\int_{i\in I}\mathbb{P}_{\mu}(\tau_{i}<\infty)di\leq p^{*}(\mu_{0},1).

Now we will derive the upper bound of $A_{t}$ for each realization of $(\mu_{it})_{i,t}.$ Agent $i\in I$ takes action $1$ at time $t$ only if either

(I)

agent $i$ ’s Poisson clock ticked before $t$ , and his belief eventually crosses $\underline{\mu}(1)$ , or
(II)

agent $i$ took action 1 initially, and his Poisson clock has not ticked yet.

The measures of agents in (I) and (II) are $F(1-\exp(-\lambda t))$ and $A_{0}\exp(-\lambda t)$ , respectively. Thus,

\displaystyle A_{t}\leq F(1-\exp(-\lambda t))+A_{0}\exp(-\lambda t)

almost surely. Define $\overline{\bm{A}}^{\lambda}:=(\overline{A}_{t}^{\lambda})_{t}$ as the solution to the ODE $d\overline{A}_{t}^{\lambda}=\lambda(1-\overline{A}_{t}^{\lambda})dt$ with boundary $\overline{A}_{0}^{\lambda}=A_{0}$ , and $\underline{\bm{A}}^{\lambda}:=(\underline{A}_{t}^{\lambda})_{t}$ as the solution to the ODE $d\underline{A}^{\lambda}_{t}=-\lambda\underline{A}^{\lambda}_{t}dt$ with boundary $\underline{A}_{0}^{\lambda}=A_{0}$ . We have

\overline{A}_{t}^{\lambda}=1-(1-A_{0})\exp(-\lambda t),\quad\quad\underline{A}_{t}^{\lambda}=A_{0}\exp(-\lambda t),

so we can rewrite the upper bound of $A_{t}$ as follows:

A_{t}\leq F\overline{A}_{t}^{\lambda}+(1-F)\underline{A}_{t}^{\lambda}\quad\forall t\quad\Rightarrow\quad\bm{A}\leq F\bm{\overline{A}}^{\lambda}+(1-F)\bm{\underline{A}}^{\lambda}

almost surely. Since $\phi$ is a convex and increasing functional, we must have

\phi(\bm{A})\leq F\phi(\bm{\overline{A}}^{\lambda})+(1-F)\phi(\bm{\underline{A}}^{\lambda})

almost surely. This implies

	$\displaystyle\mathbb{E}_{\mu}\Big{[}\phi(\bm{A})\Big{]}$	$\displaystyle\leq\mathbb{E}_{\mu}[F]\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+(1-\mathbb{E}_{\mu}[F])\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}$
		$\displaystyle\leq p^{*}(\mu_{0},1)\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+\big{(}1-p^{*}(\mu_{0},1)\big{)}\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}$

for every $\bm{\mu}.$ Thus,

\displaystyle\eqref{eqn:private_ADV}\leq\eqref{eqn:private_ADV_R}\leq p^{*}(\mu_{0},1)\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}+\big{(}1-p^{*}(\mu_{0},1)\big{)}\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}.

This implies

\displaystyle\eqref{eqn:private_ADV}-\eqref{eqn:adv}\leq(p^{*}(\mu_{0},A_{0})-p^{*}(\mu_{0},1))\big{(}\phi\big{(}\bm{\overline{A}}^{\lambda}\big{)}-\phi\big{(}\bm{\underline{A}}^{\lambda}\big{)}\big{)},

as desired. ∎
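The path bound used in Step 1B rests on the identity $F(1-e^{-\lambda t})+A_{0}e^{-\lambda t}=F\overline{A}_{t}^{\lambda}+(1-F)\underline{A}_{t}^{\lambda}$ , which follows from the closed forms of the two ODE solutions. A minimal Python sketch that checks it numerically is given below; the parameter values and function names are hypothetical and serve only as an illustration.

```python
import math

def upper_path(A0, lam, t):
    """Solution of dA = lam * (1 - A) dt with A(0) = A0."""
    return 1.0 - (1.0 - A0) * math.exp(-lam * t)

def lower_path(A0, lam, t):
    """Solution of dA = -lam * A dt with A(0) = A0."""
    return A0 * math.exp(-lam * t)

def bound_from_measures(F, A0, lam, t):
    """Measure of agents in group (I) plus measure of agents in group (II)."""
    return F * (1.0 - math.exp(-lam * t)) + A0 * math.exp(-lam * t)

# Hypothetical parameter values, for illustration only.
A0, lam, F = 0.3, 1.5, 0.4
gaps = [
    abs(bound_from_measures(F, A0, lam, t)
        - (F * upper_path(A0, lam, t) + (1.0 - F) * lower_path(A0, lam, t)))
    for t in [0.0, 0.1, 0.5, 1.0, 5.0]
]
```

Every entry of `gaps` should be zero up to floating-point error, confirming that the bound on $A_{t}$ is a convex combination of the two extreme paths with weight $F$ .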

ONLINE APPENDIX TO ‘INFORMATIONAL PUTS’
ANDREW KOH SIVAKORN SANGUANMOO KEI UZUI

Online Appendix I develops Theorem 1 for finite players.

Appendix I Finite players

I.1 Preliminaries

Let $A_{0}=\bar{A}_{0}=\frac{N-n}{N}$ , where $n$ is the number of agents who initially play action $0$ . For each $i\in\{1,\dots,n\}$ , let $\tau_{i}\sim Exp(\lambda)$ be iid exponential random variables with rate $\lambda$ , i.e., $\tau_{i}$ is agent $i$ ’s random waiting time for the first switching opportunity. We define random variables $A_{t}$ and $\bar{A}_{t}$ as follows:

	$\displaystyle A_{t}$	$\displaystyle=A_{0}+\frac{1}{N}\sum_{i=1}^{n}1\{\tau_{i}\leq t\}$
	$\displaystyle\bar{A}_{t}$	$\displaystyle=1-(1-\bar{A}_{0})e^{-\lambda t},$

where $A_{t}$ is the proportion of agents playing action $1$ at time $t$ when everyone switches to action $1$ as quickly as possible, while $\bar{A}_{t}$ is the auxiliary proportion of agents playing action $1$ at time $t$ when $1-e^{-\lambda t}$ of the agents initially playing action $0$ have had opportunities to switch by time $t$ .

If the number of agents is finite, the proportion of agents playing action $1$ can deviate from the tolerated distance from the target even when no one has switched to action $0$ . The following lemma provides an upper bound on the probability of such “unlucky” events.

Lemma 5.

$\mathbb{P}(\forall t,A_{t}+\delta\geq\bar{A}_{t})>1-12\delta^{-4}N^{-1}$ .

Proof.

Fix $\alpha$ such that $\delta=2N^{-\alpha}$ . We rearrange $(\tau_{i})_{1,\dots,n}$ as $\tau_{(1)}<\tau_{(2)}<\cdots<\tau_{(n)}.$ For each $k\in\{0,\dots,\lceil nN^{\alpha-1}\rceil-1\}$ , define $T_{k}\coloneqq[\tau_{(kN^{1-\alpha})},\tau_{((k+1)N^{1-\alpha})})$ , where $\tau_{(i)}=\tau_{(\lfloor i\rfloor)}$ , $\tau_{(0)}=0$ , and $\tau_{(n+1)}=\infty$ . If $t\in T_{k}$ , we must have $A_{t}\in[A_{0}+\frac{\lfloor kN^{1-\alpha}\rfloor}{N},A_{0}+(k+1)N^{-\alpha}].$ Therefore,

	$\displaystyle\mathbb{P}(\forall t,A_{t}\geq\bar{A}_{t}-\delta)$
	$\displaystyle=\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\{\omega:\forall t\in T_{k},A_{t}\geq\bar{A}_{t}-\delta\}\bigg{)}$
	$\displaystyle\geq\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\bigg{\{}\omega:\forall t\in T_{k},A_{0}+\frac{\lfloor kN^{1-\alpha}\rfloor}{N}\geq\bar{A}_{t}-\delta\bigg{\}}\bigg{)}$
	$\displaystyle\geq\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-1}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}\bigg{)}$
	$\displaystyle=\mathbb{P}\bigg{(}\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}\bigg{)},$

where the last equality holds because, if $k=\lceil nN^{\alpha-1}\rceil-1$ , then

	$\displaystyle A_{0}+\frac{kN^{1-\alpha}-1}{N}$	$\displaystyle>\frac{N-n}{N}+\frac{n-N^{1-\alpha}-1}{N}$
		$\displaystyle=1-\frac{N^{1-\alpha}+1}{N}$
		$\displaystyle>1-\delta$
		$\displaystyle>\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta.$

We define the event $\Omega_{relax}$ as follows:

\displaystyle\Omega_{relax}=\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\underbrace{\bigg{\{}\tau_{((k+1)N^{1-\alpha})}-\tau_{(kN^{1-\alpha})}\leq\lambda^{-1}(1+\delta/3)\log\bigg{(}\frac{n-kN^{1-\alpha}}{n-(k+1)N^{1-\alpha}}\bigg{)}\bigg{\}}}_{\Omega_{k}}.

Under the event $\Omega_{relax}$ , for every $k\leq\lceil nN^{\alpha-1}\rceil-2$ , we have

	$\displaystyle\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}$	$\displaystyle=1-\frac{n}{N}\exp(-\lambda\tau_{((k+1)N^{1-\alpha})})$
		$\displaystyle=1-\frac{n}{N}\exp\bigg{(}-\lambda\sum_{i=0}^{k}(\tau_{((i+1)N^{1-\alpha})}-\tau_{(iN^{1-\alpha})})\bigg{)}$
		$\displaystyle\leq 1-\frac{n}{N}\exp\bigg{(}-(1+\delta/3)\Big{(}\log n-\log\big{(}n-(k+1)N^{1-\alpha}\big{)}\Big{)}\bigg{)}$
		$\displaystyle=1-\frac{n}{N}\bigg{(}1-\frac{(k+1)N^{1-\alpha}}{n}\bigg{)}^{1+\delta/3}$
		$\displaystyle\leq 1-\frac{n}{N}\bigg{(}1-\frac{(1+\delta/3)(1+k)N^{1-\alpha}}{n}\bigg{)}$
		$\displaystyle=A_{0}+(1+\delta/3)(1+k)N^{-\alpha}$

Note that $k<N^{\alpha}$ . Thus,

	$\displaystyle\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta$	$\displaystyle\leq A_{0}+(1+\delta/3)(1+k)N^{-\alpha}-\delta$
		$\displaystyle\leq A_{0}+kN^{-\alpha}+(1+\delta/3+\delta k/3)N^{-\alpha}-\delta$
		$\displaystyle\leq A_{0}+kN^{-\alpha}+(1+\delta/3)N^{-\alpha}-2\delta/3$
		$\displaystyle\leq A_{0}+kN^{-\alpha}-1/N,$

where the last inequality holds if $N$ is large and $\delta=2N^{-\alpha}$ . This implies

\displaystyle\Omega_{relax}\subset\bigcap_{k=0}^{\lceil nN^{\alpha-1}\rceil-2}\bigg{\{}\omega:A_{0}+\frac{kN^{1-\alpha}-1}{N}\geq\bar{A}_{\tau_{((k+1)N^{1-\alpha})}}-\delta\bigg{\}}.

Now we compute $\mathbb{P}(\Omega_{relax})$ . Note that $\tau_{((k+1)N^{1-\alpha})}-\tau_{(kN^{1-\alpha})}$ has

	mean	$\displaystyle=\sum_{i=\lfloor kN^{1-\alpha}\rfloor}^{\lfloor(k+1)N^{1-\alpha}\rfloor-1}\frac{1}{\lambda(n-i)}\leq\frac{1}{\lambda}\log\bigg{(}\frac{\lfloor n-kN^{1-\alpha}\rfloor}{\lfloor n-(k+1)N^{1-\alpha}\rfloor}\bigg{)}$
	variance	$\displaystyle=\sum_{i=\lfloor kN^{1-\alpha}\rfloor}^{\lfloor(k+1)N^{1-\alpha}\rfloor}\frac{1}{\lambda^{2}(n-i)^{2}}\leq\frac{1}{\lambda^{2}}\bigg{(}\frac{1}{\lfloor n-kN^{1-\alpha}\rfloor}-\frac{1}{\lfloor n-(k+1)N^{1-\alpha}\rfloor}\bigg{)}.$
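The bound on the mean uses the standard fact that, for $n$ iid $Exp(\lambda)$ waiting times, the spacing between the $i$ -th and $(i+1)$ -th order statistics is exponential with rate $\lambda(n-i)$ , together with the comparison of a harmonic sum with the integral of $1/x$ . The Python sketch below checks the inequality for integer block endpoints; the values of `n`, `lam`, and the block size are hypothetical, and the helper names are introduced here only for illustration.

```python
import math

def mean_gap(n, lo, hi, lam):
    """Expected time between the lo-th and hi-th order statistics of n iid
    Exp(lam) waiting times: successive spacings have rates lam * (n - i)."""
    return sum(1.0 / (lam * (n - i)) for i in range(lo, hi))

def log_bound(n, lo, hi, lam):
    """Integral upper bound (1 / lam) * log((n - lo) / (n - hi)), valid for hi < n."""
    return math.log((n - lo) / (n - hi)) / lam

# Hypothetical values: 1000 agents, blocks of 50 consecutive switches.
n, lam = 1000, 2.0
pairs = [(mean_gap(n, lo, lo + 50, lam), log_bound(n, lo, lo + 50, lam))
         for lo in range(0, 900, 50)]
```

In every block the exact mean in the first coordinate lies below the logarithmic bound in the second, and the gap widens for later blocks, where fewer agents remain to switch.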

Let $a_{k}=\lfloor n-kN^{1-\alpha}\rfloor$ . Thus, by Chebyshev's inequality, the probability of $\Omega_{k}^{c}$ is bounded above by

	$\displaystyle\frac{1/a_{k}-1/(a_{k+1})}{\delta^{2}(\log a_{k+1}-\log a_{k})^{2}}$	$\displaystyle\leq\bigg{(}\frac{1/a_{k}-1/a_{k+1}}{\log a_{k+1}-\log a_{k}}\bigg{)}^{2}\cdot\frac{1/\delta^{2}}{1/a_{k}-1/a_{k+1}}$
		$\displaystyle\leq\frac{1}{\delta^{2}a_{k}^{2}}\cdot\frac{1}{1/a_{k}-1/a_{k+1}}$
		$\displaystyle=\frac{1}{\delta^{2}}\bigg{(}\frac{1}{a_{k}}+\frac{1}{a_{k+1}-a_{k}}\bigg{)}$
		$\displaystyle<\frac{1}{\delta^{2}}\cdot 3N^{\alpha-1}$

for every $k\leq\lceil nN^{\alpha-1}\rceil-2$ . Thus, the probability that information is triggered is bounded above by

\displaystyle N^{\alpha}\cdot\frac{3N^{\alpha-1}}{\delta^{2}}=3N^{2\alpha-1}\delta^{-2}=3\cdot\frac{4}{\delta^{2}}N^{-1}\delta^{-2}=12\delta^{-4}N^{-1}

since $\delta=2N^{-\alpha}$ , as desired. ∎
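Lemma 5 can be illustrated with a small Monte Carlo experiment. The sketch below draws the first switching times, computes the largest shortfall $\bar{A}_{t}-A_{t}$ over all $t$ (attained just before a switch, since $\bar{A}_{t}$ is increasing and $A_{t}$ is a step function), and compares the frequency of shortfalls above $\delta$ with the bound $12\delta^{-4}N^{-1}$ . The parameter values, the seed, and the function name `max_shortfall` are assumptions made for illustration; the simulation is a sanity check, not a proof.

```python
import math
import random

def max_shortfall(N, n, lam, rng):
    """Largest value of A_bar_t - A_t over t when all n agents who start at
    action 0 switch to action 1 at their first Poisson tick."""
    A0 = (N - n) / N
    taus = sorted(rng.expovariate(lam) for _ in range(n))
    worst = 0.0
    for k, tau in enumerate(taus):
        # Just before the (k+1)-th switch, exactly k agents have switched.
        A_before = A0 + k / N
        A_bar = 1.0 - (1.0 - A0) * math.exp(-lam * tau)
        worst = max(worst, A_bar - A_before)
    return worst

# Hypothetical parameters: all N agents start at action 0.
rng = random.Random(0)
N, n, lam, delta, trials = 1000, 1000, 1.0, 0.5, 200
violations = sum(max_shortfall(N, n, lam, rng) > delta for _ in range(trials))
freq = violations / trials
lemma_bound = 12 * delta ** -4 / N
```

For these values the bound equals $0.192$ , while shortfalls are typically of order $N^{-1/2}$ , so the simulated violation frequency should sit well below the bound.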

Another subtlety with a finite number of agents is that agent $i$ ’s action today affects her future decision problem, and thus she needs to account for this effect when choosing her action today. The following lemma shows that when $\mu>\underline{\mu}(\bar{A}_{0})$ , it is optimal for her to take action $1$ regardless of her future actions.

Lemma 6.

For every agent $i$ , let $(\tau_{in})_{n}$ be the increasing sequence of ticks of agent $i$ ’s Poisson clock, and let $a_{in}\in\{0,1\}$ be the (random) action agent $i$ takes at $\tau_{in}$ . If $\mu>\underline{\mu}(\bar{A}_{0})$ , then

\displaystyle\mathbb{E}_{\mu}\Big{[}\sum_{n=0}^{\infty}\int_{\tau_{in}}^{\tau_{i,n+1}}e^{-rs}u(a_{in},\bar{A}_{s},\theta)ds\Big{]}\leq\mathbb{E}_{\mu}\Big{[}\int_{0}^{\infty}e^{-rs}u(1,\bar{A}_{s},\theta)ds\Big{]}

Proof.

For every $n\in\mathbb{N}$ , consider that

	$\displaystyle\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}\Delta u(\bar{A}_{s},\theta)ds\Big{]}$	$\displaystyle=\mathbb{E}_{\mu}\bigg{[}e^{-r\tau_{in}}\mathbb{E}_{\mu}\Big{[}\int_{0}^{\tau_{i,n+1}-\tau_{i,n}}e^{-rs}\Delta u(\bar{A}_{s+\tau_{in}},\theta)ds\big{\lvert}\tau_{in}\Big{]}\bigg{]}$
		$\displaystyle\geq\mathbb{E}_{\mu}\bigg{[}e^{-r\tau_{in}}\mathbb{E}_{\mu}\Big{[}\int_{0}^{\tau_{i,n+1}-\tau_{i,n}}e^{-rs}\Delta u(\bar{A}_{s},\theta)ds\big{\lvert}\tau_{in}\Big{]}\bigg{]}$
		$\displaystyle\geq 0,$

where the last inequality follows from $\mu>\underline{\mu}(\bar{A}_{0})$ . This implies

\displaystyle\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}u(a_{in},\bar{A}_{s},\theta)ds\Big{]}\leq\mathbb{E}_{\mu}\Big{[}\int^{\tau_{i,n+1}}_{\tau_{in}}e^{-rs}u(1,\bar{A}_{s},\theta)ds\Big{]}

for every $n\in\mathbb{N}$ , as desired. ∎

I.2 Main theorem

For simplicity, we consider binary states $\Theta=\{0,1\}$ with $\theta^{*}=1$ . Suppose that there are $N$ agents in the economy. Let $\Sigma^{N}(\bm{\mu},A_{0})$ denote the set of subgame perfect equilibria of the stochastic game induced by a belief martingale $\bm{\mu}$ under the economy consisting of $N$ agents whenever $A_{0}$ can be written as $\frac{k}{N}$ for some $k\in\{0,\dots,N\}$ . We define the designer’s problem under adversarial equilibrium selection with a finite number of agents as follows:

\sup_{\bm{\mu}\in\mathcal{M}}\inf_{\bm{\sigma}\in\Sigma^{N}(\bm{\mu},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi\big{(}\bm{A}\big{)}\Big{]}

Theorem 2.

Suppose that $u(a,\cdot,\theta)$ is Lipschitz continuous for all $a\in\{0,1\}$ and $\theta\in\{0,1\}$ and there exists a constant $L_{\phi}$ such that $|\phi(\bm{A})-\phi(\bm{A}^{\prime})|\leq L_{\phi}\|A-A^{\prime}\|_{\infty}$ for every $A,A^{\prime}\in[0,1]^{\infty}$ . Then the following hold.

1.

There exists a constant $d$ such that, under any subgame perfect equilibrium $\sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})$ ( $\bm{\mu}^{\eta}$ defined in Theorem 1), an agent takes action $1$ if $\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}$ for every history $H_{t}$ , aggregate action $A_{t}$ , and belief $\mu_{t}$ ,
2.

There exists a constant $\bar{C}$ such that, for any $(\mu_{0},A_{0})$ (we implicitly assume $A_{0}\in\mathbb{Q}$ and that $N\cdot A_{0}$ is an integer), we have

$\left|\eqref{eqn:opt}-\eqref{eqn:adv-n}\right|\leq\bar{C}N^{-1/9},$

for sufficiently large $N$ (depending on $(\mu_{0},A_{0})$ ).

Sequential optimality:

\lim_{\eta\downarrow 0}\sup_{H_{t}\in\mathcal{H}}\lim_{N\to\infty}\Bigg{|}\inf_{\bm{\sigma}\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\big{[}\phi\big{(}\bm{A}\big{)}\big{|}\mathcal{F}_{t}\big{]}-\sup_{\bm{\mu}^{\prime}\in\mathcal{M}}\inf_{\bm{\sigma}^{N}\in\Sigma(\bm{\mu}^{\prime},A_{0})}\mathbb{E}^{\sigma}\big{[}\phi(\bm{A})\big{|}\mathcal{F}_{t}\big{]}\Bigg{|}=0.

Proof of Part 1. We follow a similar method as we did in Theorem 1. We restate Lemma 3 as follows:

Lemma 7.

There exists $d>0$ such that if $\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}$ , $\psi_{n}\subset\psi_{n+1}$ holds for all $n\in\mathbb{N}$ .

Proof of Lemma 7.

We follow a similar method as we did in Lemma 2. Suppose that everyone plays action 1 for any histories $H^{\prime}$ such that $S(H^{\prime})=(\mu,A)$ is in the round- $n$ dominance region $\psi_{n}.$ To obtain $\psi_{n+1}$ , we derive the lower bound on the expected payoff difference of playing $0$ and $1$ given $\psi_{n}$ .

Fix any history $H$ with the current target aggregate action $Z_{t}$ such that $S(H)=(\mu_{t},A_{t})\notin\psi_{n}$ . From our construction of $Z_{t}$ , we have $Z_{t-}-A_{t-}<\mathsf{TOR}(D(\mu_{t-},A_{t-}))$ . For any path $(A_{s})_{s\geq t}$ with increments of at most $\frac{1}{N}$ , we define a (deterministic) hitting time $T^{*}((A_{s})_{s\geq t})$ as follows:

\displaystyle T^{*}=\inf\{s\geq t:Z_{s}-A_{s}\geq\mathsf{TOR}(D(\mu_{t},A_{s}))\text{ or }(\mu_{t},A_{s})\in\psi_{n}\}.

Fix $T^{*}$ . We first determine the behavior of the path $(A_{s})_{s\geq t}$ given $T^{*}$ .

Before time $T^{*}$ . For any $s\in[t,T^{*})$ , we showed in (A) that $\mathsf{TOR}(D(\mu_{t},A_{s}))\leq\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$ . By the definition of $Z$ , we must have $Z_{s}=\bar{A}_{s}$ for every $s\in[t,T^{*})$ because $Z_{s}-A_{s}<\mathsf{TOR}(D(\mu_{t},A_{s}))$ for every $s<T^{*}$ . Then we can write down the lower bound of $A_{s}$ when $s\in[t,T^{*})$ as follows:

\displaystyle A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),

(13)

almost surely given $T^{*}$ .

After time $T^{*}$ . Fix any $s>T^{*}$ . We consider the following two cases.

Case 1: $Z_{T^{*}}-A_{T^{*}}<\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ . This means $\mu_{T^{*}}=\mu_{t}$ because no information has been injected until $T^{*}$ . Then the definition of $T^{*}$ and the right-continuity of $Z_{s}$ and $A_{s}$ imply $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}.$ This means every agent strictly prefers to take action 1 at $T^{*}$ . This increases $A_{s}$ , inducing every agent to take action 1 after time $T^{*}$ until the time $s^{\prime}$ at which information is injected (i.e., $Z_{s^{\prime}}-A_{s^{\prime}}>\mathsf{TOR}(D(\mu_{t},A_{s^{\prime}}))$ ).

The event that no information is injected again after time $T^{*}$ is equivalent to the event that $Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{s}))$ for every $s>T^{*}$ . Observe that

	$\displaystyle\mathbb{P}\Big{(}\forall s>T^{*},Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{s}))\big{\lvert}T^{*}\Big{)}$
	$\displaystyle\geq\mathbb{P}\Big{(}\forall s>T^{*},Z_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)}$
	$\displaystyle\geq\mathbb{P}\Big{(}\forall s>T^{*},\bar{A}_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)},$

where the first inequality follows from $A_{s}\geq A_{t}$ , and $\bar{A}_{s}$ is defined as $\bar{A}_{s}=1-(1-A_{T^{*}})e^{-\lambda(s-T^{*})}$ for every $s>T^{*}$ . By the definition of $Z_{s}$ , $\bar{A}_{s}\geq Z_{s}$ holds, which implies the second inequality. From Lemma 5, we know that

\displaystyle\mathbb{P}\Big{(}\forall s>T^{*},\bar{A}_{s}-A_{s}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\big{\lvert}T^{*}\Big{)}\geq 1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Therefore, we can write down the lower bound of $A_{s}$ when $s>T^{*}$ under this case as follows:

\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least $1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}$ .

Case 2: $Z_{T^{*}}-A_{T^{*}}\geq\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))$ . In this case, information is injected at $T^{*}$ . Note that, if $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n},$ then everyone prefers to take action 1 at $T^{*}$ . This increases $A_{s},$ inducing every agent to take action 1 after time $T^{*}$ until the time $s^{\prime}$ at which information is injected again. Thus, the probability that $(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}$ and no information is injected again is at least

\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*})\cdot\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}\right\}.

By definition, we have

\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*})=p_{+}(\mu_{t},A_{T^{*}})1\{(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}})),A_{T^{*}})\in\psi_{n}\}

Now we claim that

\displaystyle A_{T^{*}}>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N}.

To see this, suppose for a contradiction that $A_{T^{*}}\leq A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N}$ , which implies $A_{T^{*}}<A_{t}$ . However, since the definition of $T^{*}$ implies $A_{T^{*}-}\geq Z_{T^{*}-}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}-})),$ we have

	$\displaystyle A_{T^{*}}$	$\displaystyle\geq A_{T^{*}-}-\frac{1}{N}$
		$\displaystyle\geq Z_{T^{*}-}-\mathsf{TOR}(D(\mu_{t},A_{T^{*}-}))-\frac{1}{N}$
		$\displaystyle>A_{t}-\mathsf{TOR}(D(\mu_{t},A_{t}))-\frac{1}{N},$

where the first inequality follows from the increment size of $A_{t}$ being at most $1/N$ by assumption. This is a contradiction.

Hence, if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}$ (a condition we use when we construct $\psi_{n+1}$ ), we must have

	$\displaystyle\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))$	$\displaystyle\geq\psi_{n}(A_{t})+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}$
		$\displaystyle>(\psi_{n}(A_{T^{*}})-L_{\underline{\mu}}\mathsf{TOR}(D(\mu_{t},A_{t})))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$
		$\displaystyle=\psi_{n}(A_{T^{*}}),$

by setting $M=2L_{\underline{\mu}}$ , where the second inequality follows from Lipschitz continuity of $\psi_{n}$ . Thus, $(\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t})),A_{T^{*}})\in\psi_{n}$ holds, implying

	$\displaystyle\mathbb{P}((\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\mid T^{*})$	$\displaystyle=p_{+}(\mu_{t},A_{T^{*}})$
		$\displaystyle=1-\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))+M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}$
		$\displaystyle\geq 1-\frac{M\cdot\mathsf{TOR}(D(\mu_{t},A_{T^{*}}))}{\mathsf{DOWN}(D(\mu_{t},A_{T^{*}}))}$
		$\displaystyle=1-\frac{\bar{\delta}\lambda MC}{2L+2LM(\mu_{t}-\underline{\mu}(A_{T^{*}}))^{-1}}$
		$\displaystyle\geq 1-\frac{\bar{\delta}\lambda MC}{2L+2LM(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))^{-1}}$
		$\displaystyle\geq 1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$

for the absolute constant $c:=\frac{\lambda C}{2L}$ . Then we can write down the lower bound of $A_{s}$ for every $s>T^{*}$ under this case as follows:

\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(D(\mu_{t},A_{s}))\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least

	$\displaystyle(1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t})))\cdot(1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4})$
	$\displaystyle\geq 1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.$

Combining Case 1 and Case 2, we must have

\displaystyle\forall s>T^{*},A_{s}\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

(14)

with probability at least $1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.$

Obtaining the lower bound. Our next step is to compute the lower bound when agent $i$ takes action $1$ at time $t$ and the upper bound when agent $i$ takes action $0$ at time $t$ . One difference from the case with a continuum of agents is that agent $i$ ’s action affects the entire future path of aggregate actions. Therefore, we need to account for these effects when computing the bounds. Finally, using these bounds, we show that there exists $d$ such that if $\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}$ , agent $i$ strictly prefers action $1$ when $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}$ , given that all agents take action $1$ for all $(\mu_{s},A_{s})\in\psi_{n}$ .

Suppose that agent $i$ takes action $a\in\{0,1\}$ at $(\mu_{t-},A_{t-})$ and takes a (random) action $a_{in}$ after each tick of her Poisson clock $(\tau_{n})_{n}$ . We call this strategy $\sigma_{i}$ and assume that it induces $A_{s}(\sigma_{i})$ ; again, $A_{s}$ depends on agent $i$ ’s strategy $\sigma_{i}$ because of finiteness. Her payoff from strategy $\sigma_{i}$ is given by

\displaystyle U_{i}(\sigma_{i})=\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},A_{s}(\sigma_{i}),\theta)ds\bigg{]}.

In (13), we showed that $A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))$ for every $s<T^{*}$.⁴² [Footnote 42: The increment of $(A_{s}(\sigma_{i}))_{s}$ is at most $1/N$ because the probability that the Poisson clocks of more than one agent tick at the same time is zero. Hence we can apply the earlier arguments.] After time $T^{*}$, if no information is injected again, everyone (including agent $i$) takes action $1$, implying $a_{in}=1$ if $\tau_{n}>T^{*}$. In (14), we showed

\displaystyle\forall s>T^{*},A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least $1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.$ Combining before and after $T^{*}$ , we have

\displaystyle\forall s\neq T^{*},A_{s}(\sigma_{i})\geq\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))

with probability at least $1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.$ By Lipschitz continuity of $u(a,\cdot,\theta)$ and $\Delta u(\cdot,\theta)$, we must have⁴³ [Footnote 43: Let $L_{0}$ and $L_{1}$ be Lipschitz constants of $u(0,\cdot,\theta)$ and $u(1,\cdot,\theta)$, respectively. Then, $\Delta u(\cdot,\theta)$ is Lipschitz continuous with constant $L:=L_{0}+L_{1}$.]

\displaystyle\forall s\neq T^{*},\forall a\in\{0,1\},|u(a,A_{s}(\sigma_{i}),\theta)-u(a,\bar{A}_{s},\theta)|

\displaystyle\leq\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L

with probability at least $P_{N}(\mu_{t},A_{t}):=1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.$ Thus, for every strategy $\sigma_{i}$ under conjecture $\psi_{n}$ , this implies

	$\displaystyle\bigg{\lvert}U_{i}(\sigma_{i})-\underbrace{\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},\bar{A}_{s},\theta)ds\bigg{]}}_{\eqqcolon U^{*}_{i}(\sigma_{i})}\bigg{\rvert}$
	$\displaystyle=\bigg{\lvert}\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}\left\{u(a_{in},A_{s}(\sigma_{i}),\theta)-u(a_{in},\bar{A}_{s},\theta)\right\}ds\bigg{]}\bigg{\rvert}$
	$\displaystyle\leq\mathbb{E}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}\Big{(}P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L\Big{)}ds\bigg{]}$
	$\displaystyle=\left\{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L\right\}\cdot\int_{t}^{\infty}e^{-r(s-t)}ds$
	$\displaystyle=\frac{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}.$
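The last two equalities only use the fact that the inter-tick intervals partition $[t,\infty)$. A minimal sketch of that step, assuming $\tau_{0}=t$ and $\tau_{n}\to\infty$ almost surely (our reading of the Poisson-clock notation, not stated explicitly here), with $B$ as our own shorthand for the constant bound inside the integrand:

```latex
% B is shorthand (ours, not the paper's) for the per-instant bound:
%   B := P_N(\mu_t,A_t)\,\mathsf{TOR}(\psi_n(A_t)-\underline{\mu}(A_t))\,L
%        + (1-P_N(\mu_t,A_t))\,L.
% Assumptions (ours): \tau_0 = t, \tau_n \to \infty a.s., and r > 0.
\begin{align*}
\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}} e^{-r(s-t)}\,B\,ds
  &= B\int_{t}^{\infty} e^{-r(s-t)}\,ds
     && \text{(the intervals $[\tau_{n},\tau_{n+1})$ partition $[t,\infty)$)}\\
  &= B\left[-\frac{e^{-r(s-t)}}{r}\right]_{s=t}^{s=\infty}
   = \frac{B}{r}.
\end{align*}
```

Because $B$ does not depend on the realized tick times, the outer expectation drops out.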

Now define $\sigma^{1}_{i}$ to be the strategy under which agent $i$ always takes action 1, and suppose that under $\sigma_{i}$ agent $i$ takes action 0 at time $t$. Since $\mu_{t}>\underline{\mu}(A_{t})$, if $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}$, then

	$\displaystyle U^{*}_{i}(\sigma_{i})$	$\displaystyle=\mathbb{E}_{\mu}\bigg{[}\sum_{n=0}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(a_{in},\bar{A}_{s},\theta)ds\bigg{]}$
		$\displaystyle\leq\mathbb{E}_{\mu}\bigg{[}\int_{t}^{\tau_{1}}e^{-r(s-t)}u(0,\bar{A}_{s},\theta)ds\bigg{]}+\mathbb{E}_{\mu}\bigg{[}\sum_{n=1}^{\infty}\int_{\tau_{n}}^{\tau_{n+1}}e^{-r(s-t)}u(1,\bar{A}_{s},\theta)ds\bigg{]}$
		$\displaystyle=U^{*}_{i}(\sigma^{1}_{i})-\mathbb{E}_{\mu}\bigg{[}\int_{t}^{\tau_{1}}e^{-r(s-t)}\Delta u(\bar{A}_{s},\theta)ds\bigg{]}$
		$\displaystyle\leq U^{*}_{i}(\sigma^{1}_{i})-\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t})),$

where the first inequality follows from Lemma 6, and the second inequality follows from (LB).

Therefore, we have

	$\displaystyle U_{i}(\sigma^{1}_{i})-U_{i}(\sigma_{i})$
	$\displaystyle=(U_{i}(\sigma^{1}_{i})-U^{*}_{i}(\sigma^{1}_{i}))+(U^{*}_{i}(\sigma^{1}_{i})-U_{i}^{*}(\sigma_{i}))+(U_{i}^{*}(\sigma_{i})-U_{i}(\sigma_{i}))$
	$\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-2\cdot\frac{P_{N}(\mu_{t},A_{t})\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}$
	$\displaystyle\geq\frac{C}{2}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-2\cdot\frac{\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))L+(1-P_{N}(\mu_{t},A_{t}))L}{r}$

Recall that

\displaystyle P_{N}(\mu_{t},A_{t})

\displaystyle=1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}.

Since $D(\mu_{t},A_{t})=\mu_{t}-\underline{\mu}(A_{t})>(dN)^{-1/9}$ holds by assumption, we must have

\displaystyle P_{N}(\mu_{t},A_{t})>1-\bar{\delta}c(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))-12(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}

for some constants $\bar{e}$ and $\underline{e}$ such that $\bar{e}\bar{\delta}D^{2}\geq\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2}$.⁴⁴ [Footnote 44: By the definition of $\delta$, any $\bar{e}\geq\lambda C/4L$ and $\underline{e}\leq\lambda C/\{4L(1+M)\}$ work.] Let $\phi_{n}:=\psi_{n}(A_{t})-\underline{\mu}(A_{t})$. Since $(\mu_{t},A_{t})\notin\psi_{n}$, we have $\phi_{n}\geq D(\mu_{t},A_{t})>(dN)^{-1/9}.$ Thus,

	$\displaystyle U_{i}(\sigma^{1}_{i})-U_{i}(\sigma_{i})$
	$\displaystyle\geq\frac{C\phi_{n}}{2}-2\cdot\frac{\bar{e}\bar{\delta}\phi_{n}^{2}L+(\bar{\delta}c\phi_{n}+12(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9})}{r}$
	$\displaystyle\geq\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}\phi_{n}-\frac{24(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}}{r}$
	$\displaystyle\geq\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}(dN)^{-1/9}-\frac{24(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}}{r}$
	$\displaystyle>0,$

where these inequalities hold if

	$\displaystyle\bar{\delta}$	$\displaystyle<\frac{Cr}{2(2\bar{e}+c)}$
	$\displaystyle d$	$\displaystyle<\frac{r(\underline{e}\bar{\delta})^{4}}{24}\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}.$
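The bookkeeping behind the $N^{-1/9}$ rate and the two conditions above can be sketched as follows. This is an illustrative check, not part of the original argument: it uses only $\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2}$, the assumption $D(\mu_{t},A_{t})>(dN)^{-1/9}$, and $d,N>0$. The symbols $D$ and $K$ are our own shorthand.

```latex
% Shorthand (ours): D := D(\mu_t,A_t),
%   K := \frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}.
% Step 1: where the N^{-1/9} term in the bound on P_N comes from.
\begin{align*}
12N^{-1}\mathsf{TOR}(D)^{-4}
  &\leq 12N^{-1}(\underline{e}\bar{\delta}D^{2})^{-4}
   = 12N^{-1}(\underline{e}\bar{\delta})^{-4}D^{-8}\\
  &< 12N^{-1}(\underline{e}\bar{\delta})^{-4}(dN)^{8/9}
   = 12(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}.
\end{align*}
% Step 2: why the two stated conditions deliver the final strict inequality.
% Replacing \phi_n by its lower bound (dN)^{-1/9} needs K >= 0; the first
% condition gives K > 0, and the second makes the bracket below positive.
\begin{align*}
K>0 &\iff \bar{\delta}<\frac{Cr}{2(2\bar{e}+c)},\\
K(dN)^{-1/9}-\frac{24(\underline{e}\bar{\delta})^{-4}d^{8/9}N^{-1/9}}{r}
  &=(dN)^{-1/9}\left(K-\frac{24(\underline{e}\bar{\delta})^{-4}d}{r}\right)>0
  \iff d<\frac{r(\underline{e}\bar{\delta})^{4}}{24}K.
\end{align*}
```

The factorization in Step 2 uses $d^{8/9}N^{-1/9}=(dN)^{-1/9}\,d$.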

In conclusion, we have shown that there exists a constant $d$ such that if $\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}$ , agent $i$ strictly prefers action $1$ when $\mu_{t}\geq\psi_{n}(A_{t})-M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2+\frac{1}{N}L_{\underline{\mu}}$ , given that all agents take action $1$ for all $(\mu_{s},A_{s})\in\psi_{n}$ .

Characterizing $\psi_{n+1}$ . Note that $\delta$ is increasing and increasing in $A_{t}$ . Thus, $\mu_{t}+M\cdot\mathsf{TOR}(D(\mu_{t},A_{t}))/2$ is increasing in $\mu_{t}$ and continuous in $\mu_{t}$ . Therefore, for each $A_{t}$ , there exists $\mu^{\prime}(A_{t})<\psi_{n}(A_{t})$ such that

\displaystyle\mu^{\prime}(A_{t})+\frac{M\cdot\mathsf{TOR}(D(\mu^{\prime}(A_{t}),A_{t}))}{2}=\psi_{n}(A_{t})+\frac{L_{\underline{\mu}}}{N}

\displaystyle\frac{M\cdot\mathsf{TOR}(D(\psi_{n}(A_{t}),A_{t}))}{2}>\frac{L_{\underline{\mu}}}{N}.

A sufficient condition for the latter inequality is

\displaystyle\frac{M}{2}\bar{\delta}\underline{e}\left(dN\right)^{-2/9}>\frac{L_{\underline{\mu}}}{N}\Leftrightarrow d<\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{\frac{9}{2}}N^{\frac{7}{2}}.
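For readers who want the algebra behind this display, here is a short sketch. It assumes $\phi_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})>(dN)^{-1/9}$ as established earlier, uses $D(\mu,A)=\mu-\underline{\mu}(A)$, and takes $N\geq 1$. The comments mark which steps are our own reading.

```latex
% Shorthand (ours): \phi_n := \psi_n(A_t)-\underline{\mu}(A_t) = D(\psi_n(A_t),A_t).
% Why the left-hand side of the display bounds M*TOR/2 from below:
\begin{align*}
\frac{M}{2}\mathsf{TOR}(\phi_{n})
  \geq \frac{M}{2}\underline{e}\bar{\delta}\phi_{n}^{2}
  > \frac{M}{2}\underline{e}\bar{\delta}(dN)^{-2/9}.
\end{align*}
% The stated equivalence, step by step:
\begin{align*}
\frac{M}{2}\bar{\delta}\underline{e}(dN)^{-2/9}>\frac{L_{\underline{\mu}}}{N}
  &\iff (dN)^{2/9}<\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\,N\\
  &\iff dN<\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{9/2}N^{9/2}\\
  &\iff d<\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{9/2}N^{7/2}.
\end{align*}
% Since N >= 1 implies N^{7/2} >= 1, the N-free requirement
%   d < (M\bar{\delta}\underline{e}/(2L_{\underline{\mu}}))^{9/2}
% is enough for every N.
```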

Hence, taking $d$ such that

\displaystyle d<\min\left\{\left(\frac{M\bar{\delta}\underline{e}}{2L_{\underline{\mu}}}\right)^{\frac{9}{2}},\frac{r(\underline{e}\bar{\delta})^{4}}{24}\bigg{(}\frac{C}{2}-\frac{2\bar{e}\bar{\delta}+\bar{\delta}c}{r}\bigg{)}\right\}

(15)

is sufficient. Then we define

\psi_{n+1}=\{(\mu_{t},A_{t}):\mu_{t}\geq\mu^{\prime}(A_{t})\}

By the argument above, an agent chooses action $1$ whenever $(\mu_{t},A_{t})\in\psi_{n+1}.$ Moreover, we can rewrite the equation defining $\mu^{\prime}(A_{t})$ as follows:

\displaystyle(\mu^{\prime}(A_{t})-\underline{\mu}(A_{t}))+\frac{M\cdot\mathsf{TOR}(\mu^{\prime}(A_{t})-\underline{\mu}(A_{t}))}{2}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})+\frac{L_{\underline{\mu}}}{N},

where the RHS is constant in $A_{t}$ by the property of $\psi$. Thus, $\mu^{\prime}(A_{t})-\underline{\mu}(A_{t})$ must also be constant in $A_{t}.$ We conclude that the round-$(n+1)$ dominance region $\psi_{n+1}$ satisfies $\psi_{n}\subset\psi_{n+1}$ because $c_{n}=\psi_{n}(A_{t})-\underline{\mu}(A_{t})>\mu^{\prime}(A_{t})-\underline{\mu}(A_{t})=:c_{n+1}$ when (15) is satisfied. ∎

To conclude the proof of part 1 of Theorem 2, we show the following lemma.

Lemma 8.

\bigcup_{n\in\mathbb{N}}\psi_{n}\supseteq\Big{\{}(\mu,A)\in\Delta(\Theta)\times[0,1]:\mu>\underline{\mu}(A)+(dN)^{-1/9}\Big{\}}.

Proof of Lemma 8.

Recall $\psi_{n}(A_{t})=\sup\{\mu\in\Delta(\Theta):(\mu,A_{t})\notin\psi_{n}\}.$ By Lemma 7, $\psi_{n}(A_{t})$ is decreasing in $n$. Define $\psi^{*}(A_{t})=\lim_{n\to\infty}\psi_{n}(A_{t})$. In the limit, we must have

	$\displaystyle\psi^{*}(A_{t})+M\cdot\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))/2=\psi^{*}(A_{t})+L_{\underline{\mu}}/N$
	$\displaystyle\Rightarrow\mathsf{TOR}(D(\psi^{*}(A_{t}),A_{t}))=2L_{\underline{\mu}}/(MN).$

Since our choice of $d$ by (15) ensures

\frac{2L_{\underline{\mu}}}{MN}\leq\delta\left((dN)^{-1/9}\right),

we have

D(\psi^{*}(A_{t}),A_{t})\leq\mu_{t}-\underline{\mu}(A_{t})\Leftrightarrow\psi^{*}(A_{t})\leq\mu_{t}

for any $\mu_{t}>\underline{\mu}(A_{t})+(dN)^{-1/9}$ , as desired. ∎
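The final step of the proof of Lemma 8 can be unpacked as below. This is a sketch under two assumptions of ours: that $\delta(\cdot)$ in the display denotes the same tolerance function as $\mathsf{TOR}(\cdot)$, and that this function is strictly increasing. The shorthand $\varepsilon_{N}$ is also ours.

```latex
% Assumptions (ours): \delta(\cdot) = \mathsf{TOR}(\cdot), strictly increasing.
% Shorthand (ours): \varepsilon_N := (dN)^{-1/9}.
\begin{align*}
\mathsf{TOR}\big(D(\psi^{*}(A_{t}),A_{t})\big)
  = \frac{2L_{\underline{\mu}}}{MN}
  \leq \mathsf{TOR}(\varepsilon_{N})
  &\implies D(\psi^{*}(A_{t}),A_{t}) \leq \varepsilon_{N},\\
\mu_{t}>\underline{\mu}(A_{t})+\varepsilon_{N}
  &\implies \psi^{*}(A_{t})-\underline{\mu}(A_{t})
     \leq \varepsilon_{N} < \mu_{t}-\underline{\mu}(A_{t})\\
  &\implies \psi^{*}(A_{t}) < \mu_{t}.
\end{align*}
% The middle inequality 2L/(MN) <= TOR(\varepsilon_N) follows from
%   TOR(\varepsilon_N) >= \underline{e}\bar{\delta}(dN)^{-2/9} > 2L_{\underline{\mu}}/(MN),
% which is the sufficient condition imposed on d through (15).
```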

Proof of Part 2.

Consider the following two cases:

Case 1: $\mu_{0}>\underline{\mu}(A_{0})$ . Consider $N$ large enough so that $\mu_{0}>\underline{\mu}(A_{0})+(dN)^{-1/9}.$ Under $\bm{\mu}^{\eta}$ and the environment of $N$ agents, Part 1 implies everyone takes action 1 under any equilibrium outcome until new information is injected.

Without loss of generality, we assume $\phi\geq 0.$ Let $\bm{\tau}:=(\tau_{i})_{i=1}^{N}$ . We have

	$\displaystyle\inf_{\sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}$
	$\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{t}))\right\}\phi(\bm{\bar{A}}^{N})\right]$
	$\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{0}))\right\}\phi(\bm{\bar{A}}^{N})\right]$
	$\displaystyle\geq\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{0},A_{0}))\right\}\phi(\bm{\bar{A}})\right]-L_{\phi}\cdot\mathsf{TOR}(D(\mu_{0},A_{0}))$
	$\displaystyle\geq\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{0},A_{0}))^{-4}\right\}\phi(\bm{\bar{A}})-L_{\phi}\cdot\mathsf{TOR}(D(\mu_{0},A_{0}))$		(Lemma 5)
	$\displaystyle\geq\left\{1-12N^{-1/9}(\underline{e}\bar{\delta})^{-4}d^{8/9}\right\}\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}$		( $\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2}$ )
	$\displaystyle\geq\phi(\bm{\bar{A}})-\bar{C}N^{-1/9}$

for some constant $\bar{C}$ . The proof of Theorem 1 implies $\eqref{eqn:opt}=\phi(\bm{\bar{A}}).$ Thus,

\displaystyle|\eqref{eqn:opt}-\eqref{eqn:adv-n}|\leq\bar{C}N^{-1/9}

when $N$ is large enough, as desired.
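One admissible choice of $\bar{C}$ in Case 1 can be read off the last two lines of the chain. The sketch below assumes $N\geq 1$ and $\phi\geq 0$ (as normalized above); the constants $a$ and $b$ are our own shorthand.

```latex
% Shorthand (ours):
%   a := 12(\underline{e}\bar{\delta})^{-4}d^{8/9},
%   b := L_{\phi}(\underline{e}\bar{\delta})d^{-2/9},
% so that L_\phi(\underline{e}\bar{\delta})(dN)^{-2/9} = b N^{-2/9}.
\begin{align*}
\{1-aN^{-1/9}\}\phi(\bm{\bar{A}})-bN^{-2/9}
  &= \phi(\bm{\bar{A}})-a\,\phi(\bm{\bar{A}})N^{-1/9}-bN^{-2/9}\\
  &\geq \phi(\bm{\bar{A}})-\big(a\,\phi(\bm{\bar{A}})+b\big)N^{-1/9}
  && \text{(since $N^{-2/9}\leq N^{-1/9}$ for $N\geq 1$).}
\end{align*}
% Hence \bar{C} := a\,\phi(\bm{\bar{A}}) + b works; any larger constant also works.
```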

Case 2: $\mu_{0}\leq\underline{\mu}(A_{0})$ . The proof of Theorem 1 implies

\displaystyle\eqref{eqn:opt}=\sup_{\begin{subarray}{c}\bm{\mu}\in\mathcal{M}\\ \sigma\in\Sigma(\bm{\mu},A_{0})\end{subarray}}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}\leq(1-p^{*}(\mu_{0}))\phi(\bm{\underline{A}})+p^{*}(\mu_{0})\phi(\bm{\bar{A}}),

where $p^{*}(\mu_{0}):=\mu_{0}/\underline{\mu}(A_{0})$ , $\bm{\underline{A}}$ satisfies $\underline{A}_{t}=\underline{A}(A_{0},t)=A_{0}e^{-\lambda t}$ , and $\bm{\bar{A}}$ satisfies $\bar{A}_{t}=\bar{A}(A_{0},t)=1-(1-A_{0})e^{-\lambda t}$ .

Let $\eta>2(dN)^{-1/9}/(2(dN)^{-1/9}+\underline{\mu}(A_{0}))$ . In this case, we have

\mu_{0}^{+}=\frac{\mu_{0}}{p^{*}(\mu_{0})-\eta}>\underline{\mu}(A_{0})+2(dN)^{-1/9},

where $\mu_{0}^{+}$ is the maximal escaping belief defined in the main text. Under $\bm{\mu}^{\eta}$ and the environment of $N$ agents, if $\mu_{0+}>\underline{\mu}(A_{0})+(dN)^{-1/9},$ then everyone takes action $1$ until new information is injected under any equilibrium outcome by Part 1. Thus, we have

	$\displaystyle\inf_{\sigma\in\Sigma^{N}(\bm{\mu}^{\eta},A_{0})}\mathbb{E}^{\sigma}\Big{[}\phi(\bm{A})\Big{]}$
	$\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\left\|\underline{A}_{t}-\underline{A}_{t}^{N}\right\|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\underline{A}}^{N})\right]$
	$\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\bar{A}_{t}-\bar{A}_{t}^{N}\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\bar{A}}^{N})\right]$
	$\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\mathbb{E}_{\bm{\tau}}\left[1\left\{\forall t,\left\|\underline{A}_{t}-\underline{A}_{t}^{N}\right\|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right\}\phi(\bm{\underline{A}}^{N})\right]$
	$\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-12N^{-1}\mathsf{TOR}(D(\mu_{t},A_{t}))^{-4}\right\}\mathbb{E}_{\bm{\tau}}\left[\phi(\bm{\bar{A}}^{N})\right]$
	$\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\left(1-\mathcal{O}(N^{-1/9})\right)\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}$
	$\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-12N^{-1/9}(\underline{e}\bar{\delta})^{-4}d^{8/9}\right\}\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\},$

where $\bm{\bar{A}}^{N}$ and $\bm{\underline{A}}^{N}$ satisfy

	$\displaystyle\bar{A}_{t}^{N}$	$\displaystyle=A_{0}+\frac{1}{N}\sum_{i=1}^{n}1\{\tau_{i}\leq t\}$
	$\displaystyle\underline{A}_{t}^{N}$	$\displaystyle=A_{0}-\frac{1}{N}\sum_{i=1}^{N-n}1\{\tau_{i}\leq t\}$

with $n$ being the number of agents playing $0$ at time $0$. Note that the second inequality follows from Lemma 5, and the third inequality follows from $\mathsf{TOR}(D)\geq\underline{e}\bar{\delta}D^{2}\geq\underline{e}\bar{\delta}(dN)^{-2/9}$. Note also that an argument similar to that of Lemma 5 shows that

\mathbb{P}\left(\forall t,\left|\underline{A}_{t}-\underline{A}_{t}^{N}\right|\leq\mathsf{TOR}(D(\mu_{t},A_{t}))\right)>1-\mathcal{O}(N^{-1/9}).

This implies

	$\displaystyle\eqref{eqn:adv-n}$	$\displaystyle\geq(1-p^{*}(\mu_{0})+\eta)\left\{1-\mathcal{O}(N^{-1/9})\right\}\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}$
		$\displaystyle\quad+(p^{*}(\mu_{0})-\eta)\left\{1-\mathcal{O}(N^{-1/9})\right\}\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}$

Hence, we have

	$\displaystyle\|\eqref{eqn:opt}-\eqref{eqn:adv-n}\|$	$\displaystyle\leq\eta\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}+L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}$
		$\displaystyle\quad+\mathcal{O}(N^{-1/9})(p^{*}(\mu_{0})-\eta)\left\{\phi(\bm{\bar{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}$
		$\displaystyle\quad+\mathcal{O}(N^{-1/9})(1-p^{*}(\mu_{0})+\eta)\left\{\phi(\bm{\underline{A}})-L_{\phi}(\underline{e}\bar{\delta})(dN)^{-2/9}\right\}$
		$\displaystyle=\eta\left\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\right\}+\mathcal{O}(N^{-1/9}).$

With our choice of $\eta$, there exists a constant $\bar{C}$ such that $|\eqref{eqn:opt}-\eqref{eqn:adv-n}|\leq\bar{C}N^{-1/9}$, as desired. ∎
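The role of the threshold on $\eta$ in Case 2 can be checked directly. The following sketch assumes $p^{*}(\mu_{0})-\eta>0$, so that $\mu_{0}^{+}$ is well defined, and $\underline{\mu}(A_{0})>0$. The abbreviations $\varepsilon_{N}$, $\underline{\mu}$, and $p^{*}$ are ours.

```latex
% Shorthand (ours): \varepsilon_N := (dN)^{-1/9},
%   \underline{\mu} := \underline{\mu}(A_0),
%   p^{*} := p^{*}(\mu_0) = \mu_0/\underline{\mu}, so \mu_0 = p^{*}\underline{\mu}.
% Assumption (ours): p^{*}-\eta > 0.
\begin{align*}
\frac{\mu_{0}}{p^{*}-\eta}>\underline{\mu}+2\varepsilon_{N}
  &\iff \mu_{0}>(p^{*}-\eta)(\underline{\mu}+2\varepsilon_{N})\\
  &\iff \eta(\underline{\mu}+2\varepsilon_{N})
        >p^{*}(\underline{\mu}+2\varepsilon_{N})-p^{*}\underline{\mu}
        =2\varepsilon_{N}p^{*}\\
  &\iff \eta>\frac{2\varepsilon_{N}\,p^{*}}{\underline{\mu}+2\varepsilon_{N}}.
\end{align*}
% In Case 2, \mu_0 <= \underline{\mu} gives p^{*} <= 1, so
%   \eta > 2\varepsilon_N/(2\varepsilon_N+\underline{\mu})
% is sufficient. This threshold is at most
%   2\varepsilon_N/\underline{\mu} = (2d^{-1/9}/\underline{\mu})\,N^{-1/9},
% so \eta can be taken of order N^{-1/9}, which keeps the term
%   \eta\{\phi(\bm{\bar{A}})-\phi(\bm{\underline{A}})\}
% of order N^{-1/9} as well.
```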

Proof of Part 3.

If $\mu_{t-}>\underline{\mu}(A_{t})$, consider $N$ large enough such that $\mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9}$. We consider the following two cases:

• Case 1: $\mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9}$ and $Z_{t-}\leq A_{t}$. In this case, no information arrives and everyone takes action 1. This increases $A_{t}$, and every agent takes action $1$ from time $t$ onwards as long as $\bar{A}_{s}-\bar{A}_{s}^{N}\leq\mathsf{TOR}(D(\mu_{s},A_{s}))$ for all $s\geq t$. Since Lemma 5 implies that the probability of this event converges to $1$ as $N\to\infty$, the designer’s payoff converges to the best case, implying sequential optimality as $N\to\infty$.
• Case 2: $\mu_{t-}>\underline{\mu}(A_{t})+2(dN)^{-1/9}$ and $Z_{t-}>A_{t}$. In this case, the belief moves to either $\mu_{t-}+M\cdot\mathsf{TOR}(D)$ or $\mu_{t-}-\mathsf{DOWN}(D)$. Note that $\mu_{t-}-\mathsf{DOWN}(D)=(\mu_{t-}+\underline{\mu}(A_{t}))/2>\underline{\mu}(A_{t})+(dN)^{-1/9}$. So no matter what information arrives, every agent takes action 1. This increases $A_{t}$, and every agent takes action $1$ after time $t$ as long as $\bar{A}_{s}-\bar{A}_{s}^{N}\leq\mathsf{TOR}(D(\mu_{s},A_{s}))$ for all $s\geq t$. Again, since the probability of this event converges to $1$ as $N\to\infty$, we have sequential optimality as $N\to\infty$.
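The inequality used in Case 2 is a one-line consequence of the choice of $N$. A minimal check, reading the belief in the identity for $\mathsf{DOWN}(D)$ as the pre-jump belief $\mu_{t-}$ (our interpretation), with $\varepsilon_{N}$ as our shorthand:

```latex
% Shorthand (ours): \varepsilon_N := (dN)^{-1/9}.
\begin{align*}
\mu_{t-}>\underline{\mu}(A_{t})+2\varepsilon_{N}
  \implies
  \frac{\mu_{t-}+\underline{\mu}(A_{t})}{2}
  >\frac{2\underline{\mu}(A_{t})+2\varepsilon_{N}}{2}
  =\underline{\mu}(A_{t})+\varepsilon_{N}.
\end{align*}
% So even after the downward jump the belief stays above
% \underline{\mu}(A_t)+(dN)^{-1/9}, the region where Part 1 applies.
```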

∎

	$\displaystyle A_{s}$	$\displaystyle\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\bar{A}(A_{T^{*}},s-T^{*})+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*})$
		$\displaystyle\geq 1\{(\mu_{T^{*}},A_{T^{*}})\in\psi_{n}\}\{\bar{A}_{s}-\mathsf{TOR}(\psi_{n}(A_{t})-\underline{\mu}(A_{t}))\}$
		$\displaystyle\quad\quad\quad\quad\quad\quad+1\{(\mu_{T^{*}},A_{T^{*}})\notin\psi_{n}\}\underline{A}(A_{T^{*}},s-T^{*}),$