Markovian Persuasion
Abstract.
In the classical Bayesian persuasion model an informed player and an uninformed one engage in a static interaction. The informed player, the sender, knows the state of nature, while the uninformed one, the receiver, does not. The informed player partially shares his private information with the receiver and the latter then, based on her belief about the state, takes an action. This action determines, together with the state of nature, the utility of both players. We consider a dynamic Bayesian persuasion situation where the state of nature evolves according to a Markovian law. In this repeated persuasion model an optimal disclosure strategy of the sender should, at any period, balance between getting a high stage payoff and the future implications on the receiver's beliefs. We discuss optimal strategies under different discount factors and characterize when the asymptotic value achieves the maximal value possible.
Date:
JEL Classification: D72,
D82, D83, K40, M31
Keywords: Markovian persuasion, Dynamic Bayesian persuasion, Markov chain, Asymptotic value, Absorbing set, Homothety.
1. Introduction
The literature devoted to Bayesian persuasion studies optimal policies by which an informed agent, the sender, discloses information to an uninformed agent, the receiver. Kamenica and Gentzkow [16] present a case where a prosecutor, who is fully informed about the state of nature, i.e., whether the suspect is guilty or innocent, wishes to persuade a judge, e.g., to convict the suspect of a crime. This is a static scenario: upon the prosecutor's disclosure, the judge takes a decision and the game is over.
In this paper we study a dynamic model where the interaction between the sender, who has a commitment power, and the receiver evolves over time. At each period nature randomly chooses a state and the sender gets informed about it. The decision of how much information to disclose to the receiver at each stage is the choice of the sender. The latter publicly announces an information provision policy, to which he is committed throughout. The receiver knows the details of this policy and thus, when she receives a signal from the sender she updates her belief about the true state. She then takes an action that is based solely on her posterior belief. This action, together with the realized state, determines not only her own payoff, but also that of the sender. In short, in this dynamic signaling interaction, the sender is in charge of the informational aspects of the interaction (talking), while the receiver is in charge of doing. Neither may affect the evolution of the realized states, which is entirely exogenous.
We assume that the evolution of states follows a Markov chain. The dynamic of the receiver’s belief is governed by both the disclosure policy of the sender and a Markov transition matrix. After observing the signal sent by the sender, the receiver updates her belief according to Bayes’ law. Due to the Markovian transition law, this belief shifts to another belief. Since all terms are known to both players, the nature of the shift from prior belief to a posterior one is also known to both.
When committing to an information provision policy, the sender takes into account the resulting posterior belief of the receiver, which has two effects. The first is on the action taken by the receiver and directly on his own stage payoff. In a dynamic setting, as discussed here, it has also an effect on the belief over future states, and consequently on future payoffs. The posterior resulting from a Bayesian updating in one period is then shifted by the Markov transition matrix and becomes the initial belief in the next one. Balancing between these two, potentially contradicting effects, is the dynamic programming problem faced by the sender.
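The two-step belief dynamics described above, a Bayesian split followed by a Markov shift, can be sketched numerically. The snippet below is illustrative only: the prior `p`, the signaling kernel `S`, and the transition matrix `M` are assumed values, not taken from the paper.

```python
import numpy as np

# Illustrative one-period belief dynamics; p, S, M are assumed values.
def bayes_update(p, S, s):
    """Posterior over states after observing signal s.
    S[k, s] is the probability of sending signal s in state k."""
    unnorm = p * S[:, s]
    return unnorm / unnorm.sum()

def next_prior(posterior, M):
    """The posterior is shifted by the Markov law into next period's prior."""
    return posterior @ M

p = np.array([0.5, 0.5])                # receiver's prior (assumed)
S = np.array([[0.8, 0.2],               # signaling kernel (assumed)
              [0.3, 0.7]])
M = np.array([[0.9, 0.1],               # transition matrix (assumed)
              [0.2, 0.8]])

q = bayes_update(p, S, 0)               # posterior after observing signal 0
p_next = next_prior(q, M)               # initial belief of the next period
```

The posterior produced in one period becomes, after the shift by `M`, the prior of the next period; this is exactly the loop the sender's dynamic programming problem must control.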
In this paper we study the long-run optimal values that the sender may achieve in an irreducible and aperiodic Markov chain. (In a companion paper we examine, based on the current study, the general case.) This value depends on his discount factor. As in Aumann and Maschler [6] and in Kamenica and Gentzkow [16], the optimal signaling strategy follows a splitting scheme of beliefs in accordance with the concavification of a certain function, that is, the lowest concave function above it. While in a static model this function is the payoff function of the sender, here this function takes account of the underlying dynamics and combines present and future payoffs.
The first set of results deals with a patient sender. It turns out that the stationary distribution of the Markov chain plays a central role. The reason is that in case the sender keeps mute, the receiver's beliefs converge very fast (at an exponential rate; see, e.g., Theorem 4.9 in [20]) to the stationary distribution.
We show that as the sender becomes increasingly patient, the optimal values converge to a certain level, called the asymptotic value. (The method of treating patient players is not new to the literature on dynamic interactions. Fudenberg and Maskin [11] introduced a Folk Theorem characterizing the set of equilibria in a repeated game played by patient players. Typically, in these kinds of games, it is impossible to provide a description of the value or of the set of equilibria corresponding to a specific discount factor that is more informative than that given by the Bellman equation; see Abreu et al. [1].) Moreover, this value cannot exceed the optimal level obtained in the static case when the prior belief is the stationary distribution. A natural question is under what conditions this static optimal value can be obtained in the dynamic model. Our main theorem provides a full characterization. In order to describe it we introduce the notion of an absorbing set.
A set of prior distributions is said to be absorbing if any distribution in evolves by the Markov law to another distribution which is within the convex boundaries of . The stationary distribution, for instance, is absorbing as it shifts to itself. The main theorem roughly asserts that the asymptotic value is as high as it can get (i.e., the static value at the stationary distribution) precisely when there is an absorbing set by which the optimal static value is defined (as a weighted average). The intuition behind this result is the following. If any set by means of which the optimal static value is defined is non-absorbing, then regardless of the policy the sender uses, the posterior beliefs of the receiver would exit the boundaries of infinitely often, at times having positive limit frequency. Upon each such exit, the sender's payoff would take a hit which he would not be able to compensate for in subsequent time periods. Therefore, the dream scenario for a patient sender is to find a set by means of which the optimal static value is defined, so that he could confine the posterior beliefs of the receiver within the convex boundaries of throughout the entire game.
The second type of results is non-asymptotic in nature. For a certain region around the stationary distribution we provide a closed-form expression for the values corresponding to any level of patience. In the case where this region includes a neighborhood of the stationary distribution, we give asymptotically effective (that is, bounds whose difference is arbitrarily small for a patient enough sender) two-sided bounds for the values corresponding to any level of patience. Moreover, the effectiveness of those bounds depends on the geometry and size of the described neighborhood.
The closest paper to the present one is that of Renault et al. [22]. It deals with a specific type of Markov chains, homothetic ones, and with a utility function which gives a fixed amount to the sender in a certain region and zero in its complement. Renault et al. [22] show that in this model the optimal information provision policy is a greedy one (from a certain random time period on). Namely, the sender's greedy policy instructs him to maximize his current stage payoff at any time period, ignoring future effects of this policy. Also closely related to our work are the works of Ely [9] and Farhadi and Teneketzis [12], which deal with a case where there are two states, whose evolution is described by a Markov chain with one absorbing state. In these two works, the receiver is interested in detecting the jump to the absorbing state, whereas the sender seeks to prolong the time to detection of such a jump.
In a broader context, our paper can be ascribed to a vast and growing literature on dynamic information design problems, and specifically to the analysis of situations in which the sender is required to convey signals sequentially to a receiver, which he may base on his additional private information. Without dwelling upon the exact details of their models, whose frameworks cover diverse methodological approaches, we refer the reader to Mailath and Samuelson [19], Honryo [14], Phelan [21], Lingenbrink and Iyer [18], Wiseman [26], Athey and Bagwell [4], Arieli and Babichenko [2], Arieli et al. [3], Dziuda and Gradwohl [8], Escobar and Toikka [10], Ganglmair and Tarantino [13], and Augenblick and Bodoh-Creed [5] for an acquaintance with the literature.
The paper is organized as follows. Section 2 presents the model and an example by which we explain the notations, new concepts and the main results. The asymptotic results as well as the main theorem are provided in Section 3. Results related to non-asymptotic values are given in Section 4. Section 5 provides a characterization of homothetic matrices in terms of the asymptotic value. The proofs are given in Section 6.
2. The Model
Let be a finite set of states. Assume that is an irreducible and aperiodic Markov chain over with prior probability and a transition rule given by the stochastic matrix . We assume that are defined on the probability space .
A sender is an agent who is informed at each time period of the realized value of . Upon obtaining this information, the sender sends a signal , from a finite set of signals with cardinality at least . (This assumption is in place to make sure the sender can disclose .)
A receiver is an agent who, at any time period , is instructed to make a decision from a set of possible decisions , assumed to be a compact metric space. This decision may take into account the first signals of the sender.
The payoffs of the sender and the receiver at time period are given by the utilities and , respectively, so that they depend solely on the realized state and the decision . Both the sender and the receiver discount their payoffs by a factor . We denote this game by . As in the models of Renault et al. [22], Ely [9], and Farhadi and Teneketzis [12], the receiver receives information only through the sender.
A signaling strategy of the sender in is described by a sequence of stage strategies , where each is a mapping . Thus, the signal , sent by the sender at time , is distributed by the lottery , which may depend on all past states and past signals together with the current state . Let be the space of all signaling strategies.
A standard assumption in many Bayesian persuasion models is that of commitment by the sender. That is, we assume that the sender commits to a signaling strategy at the start of the game , and makes it known to the receiver. The commitment assumption enables the receiver to update her beliefs on the distribution of states based on the signals she receives from the sender. Formally, by Kolmogorov's Extension Theorem, each signaling strategy together with induce a unique probability measure on the space , determined by the laws
(2.1)
Thus, the posterior probability the receiver assigns to the event , given the signals and the strategy , is given by the formula
(2)
Set . A second key assumption of our model is that the receiver's decision at any time period depends only on . Such an assumption includes the natural situation in which the receiver seeks to maximize her expected payoff based on her current belief (e.g., Renault et al. (2015)). Denote by the decision policy of the receiver, that is, the mapping which depicts the decision of the receiver as a function of her belief. The last assumption of our model is that the function defined by is continuous. To summarize, our assumptions imply that the signaling strategy of the sender determines his payoff at any time period . Moreover, the total expected payoff to the sender in under the signaling strategy can now be written as
(3)
where is the expectation w.r.t. . The value of the game is .
Example 1.
Assume that consists of two states, and . Each probability measure over is determined by a parameter , corresponding to the distribution mass assigned to the state by the respective probability measure. Suppose that the receiver's set of decisions is and that when her belief is , her decision policy chooses .
As for the sender, when the decision made by the receiver is and the state is , his utility is minus times the distance between and . That is, . In words, when is the realized state, the closer the decision of the receiver to , the higher the payoff of the sender. However, when the decision made by the receiver is and the state is , his utility is . That is, . When is the realized state, the closer the decision of the receiver to , the higher the payoff of the sender.
We obtain that . The graph of , which for the sake of convenience is plotted as a function of , can be found on the left panel of Figure 1. Thus, is convex on the interval and concave on the interval . This graph illustrates the short-term strategic incentives of the sender. Indeed, the minimal possible stage payoff, , occurs when the sender reveals that the realized state is (corresponding to ). A revelation of the state (corresponding to ) would result in a payoff of . The maximal payoff for the sender, , is reached for . As explained below (see Example 3), for that the optimal signaling strategy would instruct the sender to not reveal any information regarding the realized state.
In a dynamic model the amount of information revealed by the sender is a result of an interplay between the one-shot payoff, , and the transition law governing the evolution of future states. The tension between these two factors is discussed in the rest of this paper.

3. The Main Theorem
3.1. The Existence of Asymptotic Value.
To state our first result we need to introduce some notation. First, let be the unique stationary distribution of . Second, for any function , define the function by
To showcase the Cav operator in action, we show the graph of (), where is that given in Example 1, on the right-hand panel of Figure 1.
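The Cav operator can be approximated numerically on a grid. The sketch below computes the upper concave envelope of a one-dimensional function by brute force over chords; the grid and the test functions are illustrative assumptions, not the payoff function of the paper.

```python
import numpy as np

def cav_on_grid(xs, ys):
    """Upper concave envelope (Cav) of the points (xs, ys): at each grid
    point, take the best value attainable on a chord between two grid
    points that straddle it.  Brute force, O(n^3); illustrative only."""
    n = len(xs)
    cav = ys.astype(float).copy()
    for i in range(n):
        for a in range(n):
            for b in range(a, n):
                if xs[a] <= xs[i] <= xs[b]:
                    if xs[a] == xs[b]:
                        val = ys[a]
                    else:
                        t = (xs[i] - xs[a]) / (xs[b] - xs[a])
                        val = (1 - t) * ys[a] + t * ys[b]
                    cav[i] = max(cav[i], val)
    return cav
```

For a convex function the envelope is the chord through the endpoints (the sender would fully split), while a concave function is left unchanged (no information is revealed), mirroring the static splitting logic.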
Our first result reveals that the effect of on the value of a sufficiently patient sender, i.e., with close to , is negligible compared to the effects of and . Moreover, as the patience level gets closer to , the sequences converge for any . Formally, this result is stated as follows.
Theorem 1.
There exists a scalar , , such that for every and every there exists such that
(4)
As it turns out, the upper bound on described in Theorem 1 is tight. In the next section we shall give a geometric criterion for the achievement of this upper bound. To do so we begin by introducing and studying the notion of -absorbing sets.
3.2. -absorbing sets.
Definition 1.
A non-empty set is said to be -absorbing if for every (where denotes the convex hull of the set ).
The intuition behind this choice of terminology is the following. Since is a linear operator, if is -absorbing, so is . However, for , -absorption exactly describes the situation in which the image of under lies inside (is absorbed in) . In a dual fashion, if is a closed convex -absorbing set, then, since by the Krein-Milman Theorem (where is the set of extreme points of , i.e., those points in that cannot be expressed as a convex combination of two distinct points in ), we get that is also -absorbing. This implies, in particular, that since is -absorbing, so is the set of its extreme points (i.e., the set of all mass-point distributions). Lastly, note that since , if and are both -absorbing, then so is .
Example 2.
[Example 1, continued] Suppose that the transition rule underlying the interaction in Example 1 is given by
(5)
Note that the stationary distribution of equals (which corresponds to in Figure 1) and the pair is on the straight segment of the graph of (see the right panel of Figure 1).
Moreover, the set containing the points and (presented in Figure 1 as the points and ) is not -absorbing. The reason is that is mapped by to , which is not in the convex hull of and .
In order to enhance the intuition about -absorbing sets, assume that at any point a football is passed in a straight line to the point . The orbit generated by the football at is the union of line segments , where . When the set is -absorbing, the orbit generated by the football whose starting point is never exits .
We proceed with some basic examples of -absorbing sets. The simplest are the singleton and the entire set, . To describe additional examples, consider the -, -, and -norms on , denoted by , and , respectively, for . Denote by the operator norm of w.r.t. the -norm, .
For every we have,
(6)
It is known that equals the maximum -norm of a row of (see, e.g., Example 5.6.5 on p. 345 in [15]), and therefore . Also, (this follows from the fact that equals the maximum singular value of ; see, e.g., Example 5.6.6 on p. 346 in [15]). Thus, in view of Eq. (6), any ball (either open or closed) w.r.t. the or -norms centered at is -absorbing. Moreover, if is doubly stochastic, then (this follows from the fact that equals the maximum -norm of a column of ; see Example 5.6.5 on pp. 344–345 in [15]), and therefore in that case any ball (either open or closed) w.r.t. the -norm, centered at , is also -absorbing. See Figure 2.

In all the examples above the -absorbing sets contain . This is not a coincidence. Indeed, for every -absorbing set , the image of under the linear map is also contained in . Therefore, by Brouwer's fixed-point theorem, possesses a fixed point in (the closure of ). As the only fixed point of is , we deduce that for every -absorbing set .
We end the discussion on -absorbing sets with the following proposition whose content and proof (delegated to Section 6) may enhance the intuition about absorbing sets.
Proposition 1.
Let be an -absorbing set. Then, contains a countable -absorbing set.
3.3. The Main Theorem.
To state our main result we begin with a review of some basic concepts from the theory of concave functions. First, for each let . Since () is a concave function, can be supported at by a hyperplane. We may parametrize each such supporting hyperplane by a point in as follows: first, for every define by . Second, set
As for every , the set corresponds to all supporting hyperplanes of at . For every let
The set can thus be interpreted as the projection to the first coordinates of the intersection of with the supporting hyperplane to at parametrized by . A visualization of when is given in Figure 3.

Proposition 2.
We have the following:
-
(i)
If contains an -absorbing set for some , then .
-
(ii)
If then for every , contains a countable -absorbing set.
Why are -absorbing sets contained in the 's of importance? This has to do with the control the sender has over the receiver's beliefs. Indeed, once a belief is in the convex hull of an -absorbing subset , , its shift under , which describes the evolution of the posterior in one time period, also lies in . At this point in time the sender may send messages that would induce posteriors within . As is an affine function on (see Lemma 4 in Section 6), the weighted average of the values of evaluated at these posteriors equals the value of evaluated at .
In the main theorem we summarize the results of Theorem 1 and Proposition 2. This theorem characterizes when patient senders can obtain a value close to the maximum possible, the upper bound stated in Theorem 1.
Theorem 2.
The following statements are equivalent:
-
(i)
For every and there exists such that
(7)
-
(ii)
There exists such that contains an -absorbing set.
-
(iii)
For every , contains a countable -absorbing set.
Note that this characterization is stated in terms of the primitives of the model: and . Moreover, it unravels the sensitivity of a patient sender to the interrelationships between and , as the sets , , are fully determined by the former, whereas -absorbing sets are clearly determined by the latter. An interesting question that arises naturally in this context is how sensitive a patient sender is to such interrelationships. Assume, for instance, that does not contain an -absorbing set for some . Can one quantify the difference between and in terms of and ?
Example 3.
[Example 2, continued] Consider again the matrix in Eq. (5) and recall that equals . For the function given in Example 1 we have that consists of the two points and in (corresponding to the points and in Figure 1) for every supporting hyperplane . Since this set is not -absorbing, we conclude by Theorem 2 that for a sufficiently patient sender the value is strictly less than . However, if
(8)
then . Therefore as identifies with the set of extreme points of the ball of radius w.r.t. the -norm around , we deduce that is -absorbing for every . Thus, Theorem 2 ensures that the value of a sufficiently patient sender (i.e., when is close to ) is close to .
Under both transition matrices, the maximal payoff possible is obtained at . This point is located in the region where is equal to . We shall prove in Lemma 1 in Section 6 that at this point the sender has no incentive to alter the prior belief of the receiver. He therefore reveals no information to the latter. Such a result holds for any continuous and any point on which agrees with .
We end the section with a mathematical theorem on -absorbing sets, which is a by-product of our main economic result.
4. Additional Results: A Non-Asymptotic Approach and a Strong Law
4.1. The value for every .
As it turns out, the case where encompasses information about the behavior of across all . To showcase this, we begin by recalling that as any union of -absorbing sets is also -absorbing, by Proposition 2, if , one may associate with each a maximal (w.r.t. inclusion) -absorbing set . Set (see Figure 4).

In our next theorem we give an exact formula for for all and all in the case where . Therefore, the case is not of a mere asymptotic nature, but rather unravels the full information regarding for any discount factor on the domain . However, outside of , the exact behavior of , even in the case , remains an open problem.
Theorem 3.
Assume that . Then, if we have
(9)
where is the dimensional identity matrix.
Let us now give some intuition regarding the formula given in Theorem 3. First, recall that the sum of an infinite geometric series with common ratio and scale factor equals . As is stochastic, we will argue in the proof of Theorem 3 that a matrix version of the formula for the sum of an infinite geometric series applies, so that
Moreover, as we shall see in the proof section (e.g., Eq. (13)), for any signaling strategy and prior belief we have that . Roughly speaking, the proof of Theorem 3 is divided into two parts. In the first, we come up with a signaling strategy for which the stage payoffs achieve their maximal possible value, shown to be equal to () for . The second part connects () with the formula given in (9). Such an argument is valid on , because () is an affine function on each , .
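The matrix geometric series underlying the formula in Theorem 3 can be verified numerically; the matrix and discount factor below are assumptions for illustration.

```python
import numpy as np

# Assumed stochastic matrix and discount factor, for illustration.
M = np.array([[0.9, 0.1],
              [0.2, 0.8]])
lam = 0.95

# Matrix geometric series: sum_{n >= 0} (lam*M)^n = (I - lam*M)^{-1}.
partial = np.zeros_like(M)
term = np.eye(2)
for _ in range(2000):
    partial += term
    term = term @ (lam * M)

closed_form = np.linalg.inv(np.eye(2) - lam * M)
```

Since M is stochastic, each row of the closed form sums to 1/(1 - lam), the scalar geometric sum, which is why the inverse matrix plays the role of the discounted-sum operator in the formula.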
As it turns out, the topological structure of the set around the point may be utilized to derive two-sided bounds on for every discount factor . Formally, consider the case where is an interior point of (in the topological space equipped with the -norm). Thus, the Convergence Theorem for Markov Chains (e.g., Theorem 4.9 in [20]) guarantees that for every there exists some finite time so that . By employing bounds on the mixing rates of irreducible and aperiodic Markov chains we obtain the following theorem.
Theorem 4.
Assume that and that has full support (i.e., , where int denotes the interior of ). Then, there exists a finite positive integer such that for every and every we have the following lower and upper bounds:
(4.1)
where
Moreover, can be chosen to be of the form , where is an integer constant (which may depend on ) and , where is the ball of radius w.r.t. the -norm centered at .
To illustrate the patience level required to make the bounds in Theorem 4 effective, we provide a numerical example. Assume that , so that contains a very small neighborhood of . Hence, to attain the value in the left-hand side of Eq. (4.1) the sender can simply avoid sharing any private information along the first time periods. As we shall see in the proof of Theorem 4, after those time periods, the belief of the receiver will enter the region . As soon as this takes place, we will exploit in the proof of Theorem 4 the recursive structure of as well as the result of Theorem 3, to prescribe to the sender a signaling strategy that will guarantee him in the remaining time periods of the game. Finally, if we choose the discount factor so that , for some small , the lower and upper bounds on in Eq. (4.1) will differ by at most .
4.2. A Strong Law.
The next result is concerned with convergence in the strong sense, namely with the -almost-sure behavior of the random variable for a specific prior and a strategy . The weak type of convergence employed so far is concerned with the limit of the expectations. In contrast, the next theorem deals with the almost-sure convergence of the payoff, corresponding to a sender interested in the payoffs he actually obtains. These are the payoffs he truly gets along realized paths, not just their expected values.
As it turns out, finite -absorbing sets can be used to deduce a strong law for the distribution of when .
Theorem 5.
Assume that contains a finite -absorbing set for some . Then, there exist an -absorbing subset and a strategy such that for every it holds that
(11)
Moreover, if , then (11) holds for every .
In words, under the assumptions of Theorem 5, for almost every infinite sequence of realizations of the Markov chain , if the sender is patient enough he can, by following , guarantee himself a payoff close to .
5. When the Main Theorem Holds for every : Homothety.
The previous results shed light on the connection between and . Specifically, the main theorem characterizes when in terms of and . A natural question arises as to when this result holds for a fixed and for every . To answer this question, we need to introduce the notion of a homothety.
Definition 3.
A linear map is said to be a homothety with respect to the pair if it maps each point into the point . The point is called the center and is called the ratio.
It is clear that when is a homothety with respect to , the point is a fixed point of . Moreover, reduces the distance from any point to by a factor of . When , or equivalently (for any measure defined on a finite probability space, we denote by its support, i.e., the set of elements to which assigns positive probability), and , the matrix defined by the homothety is an irreducible aperiodic stochastic matrix. In particular, the stationary distribution of is . Therefore, we shall say that an irreducible aperiodic stochastic matrix is a homothety if the mapping is a homothety of with center and ratio , for some .
When a stochastic matrix is a homothety, the transition from one state to another follows this law: each state stays unchanged with probability and moves according to the distribution to other states with probability .
We proceed by describing an interesting class of -absorbing sets of a homothety . A set is said to be star shaped around if for every . In words, assume that an observer is located at the point . Then is star shaped around if the line of sight, , to any point lies entirely in . Assume now that is star shaped around . If is a homothety, then for every . Hence, when is a homothety, every star shaped set around any is -absorbing.
Let be irreducible and aperiodic. Our next result gives a characterization of when is a homothety in terms of . To make the result transparent, note that, by Theorem 1, is constant on , and as such is simply a function of and . In our new characterization we let vary over the space of all continuous functions defined on . Therefore also varies accordingly.
Theorem 6.
is a homothety if and only if for every continuous function .
6. Proofs
We start this section by reviewing the notion of a split, a cornerstone in the field of Bayesian persuasion. It can be described informally as follows: given a lottery over with law , to what extent can the sender manipulate (split) it using his signals? The answer, given by Blackwell (1951) and Aumann and Maschler (1995), is that for every choice of distributions and convex weights s.t. , the sender can correlate his lottery over signals, , with the lottery , so that on the event that is chosen (having marginal probability ) the posterior belief over states becomes . This lottery will obey the rule
(12)
Thus, if we denote by
the set of all possible splits in , then the signaling strategy can be interpreted as follows: At the first stage of , specifies which element of the sender should pick. Then, for each time period , given that for some , prescribes which element of the sender should pick. Indeed, prior to getting the signal , the receiver updates her belief on the distribution of by multiplying her posterior belief on the distribution of by the transition matrix , resulting in the belief . It thus follows that the sequence satisfies the following important distributional law:
(13)
Consequently, for every . In particular if , then for every .
In light of this fresh view of , standard arguments (e.g., Theorem 2.20 in [25]) lead to the following ‘Bellman type’ recursive formula for :
(14)
Consider the operator defined by . Since , Carathéodory’s Theorem (see, e.g., Corollary 17.1.5 in [24]) implies that the expression on the right-hand side of Eq. (14) equals . Thus, in conjunction with Eq. (14) we obtain the following key relation:
(15)
In particular, this shows that the function is concave for every . As is linear, is also concave. Then, by the definition of Cav we infer from Eq. (15) the following two-sided inequality:
(16)
Since the sender can always decide to not reveal any information at , i.e., to choose the split , where for all , and thereafter play optimally in the game , we also have that . The latter, combined with the right-hand side of Eq. (16), gives the following result:
Lemma 1.
Assume that satisfies . Then, for any , the optimal signaling strategy in , would instruct the sender to not reveal any information at .
Set and . Then, by (16), and for every . As our Markov chain is irreducible and aperiodic, the Convergence Theorem for Markov chains ensures that as . A combination of the above two arguments yields the following proposition.
Proposition 3.
The functions and are constant on .
It follows from Proposition 3 that on our way to proving Theorem 1 we need to show that . This requires classical tools and techniques from the field of repeated games with incomplete information [23]. We begin by introducing, for every and , the -stage game over the strategy space whose payoff is given by the formula
(17)
The value of will be denoted by . Standard continuity and compactness based arguments (see, e.g., Theorem 2.14 on p.15 in [25]) show that
The following proposition establishes a number of fundamental properties of which will play an important role in our future proofs.
Proposition 4.
We have the following:
-
(i)
is concave for every .
-
(ii)
For every , the function is Lipschitz (w.r.t. the -norm) with constant for every .
-
(iii)
The sequence is sub-additive.
-
(iv)
The sequence converges.
-
(v)
The sequence is non-increasing for every .
Proof of Proposition 4.
The proof of (i) uses the following neat classical argument. Let and such that . Prior to the start of , the sender writes a computer program which draws either the digit or according to a random lottery. This lottery, also denoted , depends on , and is independent of . The conditional laws are given as follows:
(18) |
Then, at the first stage the sender sends a messenger (who cannot be the receiver) to collect on his behalf the realized value of , type it into his program , and then tell him the output of the program. If the outcome of is () the sender plays his optimal strategy in (). We now argue that this strategy guarantees him in . Indeed, this follows from Bayes’ law, which implies that the outcome of the program is (resp., ) with probability (resp., ) and that the posterior distribution of is (resp., ). The last ingredient needed to place the sender in the situation of facing (resp., ) based on the message of the messenger is that, after hearing the message ( or ), the sender must ask the messenger to inform him of the value of which was realized.
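Although the conditional laws in Eq. (18) are not displayed here, the classical splitting construction they correspond to can be sketched as follows; the symbols $p$, $q_1$, $q_2$, $\lambda$ below are generic placeholders for the splitting data, not necessarily the paper's notation.

```latex
% Classical splitting: given a prior p = \lambda q_1 + (1-\lambda) q_2
% with p(s) > 0, draw \xi \in \{1,2\} conditionally on the state s via
\mathbb{P}(\xi = 1 \mid s) \;=\; \frac{\lambda\, q_1(s)}{p(s)}, \qquad
\mathbb{P}(\xi = 2 \mid s) \;=\; \frac{(1-\lambda)\, q_2(s)}{p(s)} .
% Summing over s gives \mathbb{P}(\xi = 1) = \lambda, and Bayes' rule yields
\mathbb{P}(s \mid \xi = 1)
  \;=\; \frac{p(s)\,\mathbb{P}(\xi = 1 \mid s)}{\mathbb{P}(\xi = 1)}
  \;=\; q_1(s) ,
% so conditional on \xi = i the posterior is exactly q_i.
```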
To prove (ii), observe that by conditioning on the outcome of one has that
(19) |
for every , where is the Dirac measure supported on . Therefore, by the triangle inequality, for any and any ,
(20) |
As can be viewed as a finite game in extensive form, it admits a normal-form game description, and thus (ii) follows from the following basic inequality: for any two zero-sum matrix games and of equal dimensions it holds that .
We move on to (iii). For every and every , the ’th stage game payoff equals
(21) |
As the belief of the receiver at the start of the ’st time period, prior to obtaining the signal , equals , we may bound the latter from above by
(22) |
By Jensen’s inequality applied to the functions , which are known to be concave by (i), together with the fact that under the initial belief all the steps of the sequence have expectation , we obtain that the latter is at most
(23) |
Since this holds for every , we conclude the desired subadditivity inequality, which completes the proof of (iii).
The proof of (iv) can be deduced directly from (iii) via the following basic result from analysis (Fekete’s lemma): if is a sub-additive sequence, then converges. Lastly, by repeated use of (iii) we obtain that if divides then
(24) |
Therefore, , which is sufficient to prove (v). ∎
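The basic analytic fact invoked in (iv) is Fekete's subadditivity lemma. A small numerical illustration follows; the sequence `a` is a toy subadditive sequence chosen for the sketch, not the value sequence of the game.

```python
import math

# Fekete's lemma: if a_{n+m} <= a_n + a_m for all n, m, then a_n / n
# converges (to inf_n a_n / n). Toy subadditive sequence: a_n = sqrt(n) + 2n.
def a(n):
    return math.sqrt(n) + 2 * n

# Subadditivity check on a small grid (sqrt is subadditive, 2n is additive).
for n in range(1, 40):
    for m in range(1, 40):
        assert a(n + m) <= a(n) + a(m) + 1e-12

# Here a_n / n = 2 + 1/sqrt(n) decreases to its infimum 2.
ratios = [a(n) / n for n in (1, 10, 100, 10_000)]
assert all(r1 >= r2 for r1, r2 in zip(ratios, ratios[1:]))
assert abs(a(10**8) / 10**8 - 2) < 1e-3
```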
We are now in a position to show that . We split the proof into two steps, the first of which is the following lemma.
Lemma 2.
.
Proof of Lemma 2.
Let . Define . A well-known bound on the mixing rates of (see, e.g., Eq. (4.34) in Section 4.5 of [20]) states that there exists a constant (which may depend on ) such that .
Consider now the following signaling strategy . In the first time periods of the game , the sender plays optimally in , where satisfies (i) and (ii) . Then, starting from the ’st stage, instructs the sender to ignore all information obtained along the next time periods and to send the same fixed signal during these times. Starting from the ’st stage, following , the sender again plays an optimal strategy in along the next time periods, and subsequently plays at each of the next time periods, thereby ignoring all information collected at times .
The extension of the definition of to all time periods is done inductively. Let be defined by . For each , the strategy instructs the sender to play an optimal strategy in starting from the ’st time period for periods, followed by sending the fixed signal at times .
By the definition of , we have that for every . Hence, by the definition of , item (ii) of Proposition 4, and the choice of we obtain that
(25) |
As , using Eq. (25) we obtain that
(26) |
Combining this with a Tauberian theorem (see, e.g., Theorem 3.1 in [25]) and Proposition 3 we conclude that
(27) |
Since was arbitrary all along, the proof is complete. ∎
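The mixing-rate bound used in the proof of Lemma 2 can be checked numerically. The following Python sketch uses a hypothetical two-state chain (not from the paper) whose second eigenvalue, 0.6, plays the role of the contraction factor; it illustrates both the geometric decay of the distance to stationarity and the choice of the block length in the proof.

```python
import numpy as np

# Hypothetical irreducible, aperiodic two-state chain; second eigenvalue 0.6.
M = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])  # stationary distribution of M

def tv(mu, nu):
    """Total-variation distance between two distributions."""
    return 0.5 * np.abs(mu - nu).sum()

# TV distance from stationarity decays geometrically: TV(mu M^n, pi) <= C * 0.6^n.
mu = np.array([1.0, 0.0])
dists = []
for _ in range(19):
    mu = mu @ M
    dists.append(tv(mu, pi))

# For a two-state chain the decay factor is exactly the second eigenvalue.
assert all(abs(d2 / d1 - 0.6) < 1e-9 for d1, d2 in zip(dists, dists[1:]))

# Block length as in the proof: smallest m with C * 0.6^m <= eps.
C, eps = 1.0, 1e-3
m = int(np.ceil(np.log(eps / C) / np.log(0.6)))
assert C * 0.6 ** m <= eps
```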
The second step towards showing that is given in the following lemma.
Lemma 3.
.
Proof of Lemma 3.
For each , let be such that (such exists by standard continuity and compactness arguments; see, e.g., Theorem 2.14 on p. 15 in [25]). Since is bounded on , we may apply the following decomposition (see, e.g., Eq. (1) in [17]) of the discounted average into a convex combination of finite averages:
(28) |
Now fix . By item (iv) of Proposition 4, there exists such that
(29) |
Next, as for every and every , we may take for which
(30) |
Thus, by combining Eqs. (28), (29), and (30) we conclude that for every , thus proving the lemma. ∎
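For concreteness, the decomposition of the discounted average invoked from [17] can be written out as follows. This is a sketch in generic notation, with $x_t$ standing for the stage payoffs and $\lambda$ for the discount factor.

```latex
(1-\lambda)\sum_{t \ge 1} \lambda^{t-1} x_t
  \;=\; \sum_{n \ge 1} \underbrace{n\,\lambda^{n-1}(1-\lambda)^2}_{\theta_n}
        \cdot \frac{1}{n}\sum_{t=1}^{n} x_t ,
\qquad \sum_{n \ge 1} \theta_n = 1 .
% Indeed, the coefficient of x_t on the right-hand side is
% \sum_{n \ge t} \lambda^{n-1}(1-\lambda)^2 = (1-\lambda)\lambda^{t-1},
% and \sum_{n \ge 1} n \lambda^{n-1}(1-\lambda)^2
%   = (1-\lambda)^2 / (1-\lambda)^2 = 1,
% so the weights \theta_n indeed form a convex combination.
```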
Combining Lemmas 2 and 3 we obtain the string of inequalities
Hence, . Thus, the proof of Theorem 1 will be complete once we establish the following simple claim.
Claim 1.
For every and every we have that and .
Proof of Claim 1.
The claim follows easily from the fact that for any , we have, by Jensen’s inequality, that the expected payoff at any time period satisfies
(31) |
∎
As the first step toward the proof of Proposition 2 and Theorems 3 and 4 we prove the following basic lemma.
Lemma 4.
For every we have:
-
(i)
for every .
-
(ii)
Let , for every , and such that . If , then for every .
Proof of Lemma 4.
Let . Take and convex weights such that . Since is concave and , we have
(31) |
where the last equality follows from the fact that is affine. Since by the definition of we have for every , we have shown (i). For (ii) assume that there exists . Then , and since and , we have
(32) |
We have reached a contradiction. ∎
Next, we prove the following proposition, which unveils the special advantages of -absorbing subsets of for .
Proposition 5.
Assume that for some , the set admits an -absorbing subset . Then,
(33) |
for every , where is the -dimensional identity matrix.
Proof of Proposition 5.
To show Eq. (33) we will establish a two-sided inequality. First, by applying Jensen’s inequality in the same fashion as in Eq. (31), we get that for every and every it holds that
(6.1) |
Since and the set is -absorbing, for all . Hence, as , item (i) of Lemma 4 implies that
(35) |
where the last equality follows from item (i) of Lemma 4 as well. Moreover, since is a stochastic matrix, we have the following well-known formula (e.g., Theorem 2.29 in [25]):
(36) |
A combination of Eqs. (6.1), (35), and (36) yields that is at most for every and every . Let us now show that the converse inequality holds as well. We start by defining for each the set by
(37) |
Since , Carathéodory’s Theorem shows that whenever . Consider now the strategy defined as follows: at each , if , will choose an element in ; otherwise, if , then will choose some element in . As , and is -absorbing, we have that under the strategy , for every . Indeed, we show this by induction on . For , since , by the definition of . Assume now that for some . Since is -absorbing, , and thus by the definition of we see that as well. The latter, coupled with , implies that the discounted payoff under can be computed as follows:
(38) |
where we note that the third equality holds because is affine, and the last equality is a consequence of item (i) of Lemma 4. Hence, by Eqs. (35) and (36) we have that equals , thus showing the converse inequality. ∎
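The formula cited as Eq. (36) is presumably the geometric-series (resolvent) identity for the discounted occupation measure. In generic notation, with transition matrix $P$, prior $p$, identity matrix $I$, and discount factor $\lambda \in (0,1)$, it reads:

```latex
(1-\lambda)\sum_{t \ge 1} \lambda^{t-1}\, p P^{t-1}
  \;=\; (1-\lambda)\, p \sum_{t \ge 0} (\lambda P)^{t}
  \;=\; (1-\lambda)\, p\, (I - \lambda P)^{-1} ,
% where the Neumann series converges because the spectral radius of
% \lambda P is at most \lambda < 1, so I - \lambda P is invertible.
```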
We proceed with the proof of Proposition 2.
Proof of Proposition 2.
Assume that is an -absorbing subset of where . Let . First, by Proposition 5 we have
Next, since as , we have that as . Thus, by a Tauberian Theorem (see, e.g., Theorem 3.1. in [25]) we obtain that as . Hence, from the identity given in (36) we obtain that as . Moreover, since is concave on it is also continuous at (see, e.g., Theorem 10.1 in [24]). A combination of the above arguments with Theorem 1 yields
(6.2) |
thus proving item (i) of Proposition 2.
Let us continue with the proof of item (ii). Assume that . We have (i) , (ii) for every (by Claim 1) and (iii) is non-increasing for every (by item (v) of Proposition 4). A combination of (i), (ii), and (iii) shows that for every . Let be an optimal strategy in . Denote by the sequence of posteriors induced by and the prior probability . By Jensen’s inequality, for every . Hence, as , we obtain that for every . Fix . We see that
(40) |
for every . Letting we get . Since by Claim 1 the inverse inequality holds as well, we deduce that . Therefore, there exists a strategy such that . Denote the sequence of posteriors induced by and the prior probability by . By Jensen’s inequality, , and we therefore obtain that for every . Moreover, as is finite and for every , item (ii) of Lemma 4 implies that for every and every . Set . We have for every . Moreover, as the set of signals of the receiver is finite, we have that is the countable union of finite sets, and thus is countable.
We claim that is -absorbing. Indeed, if , then there exists some such that with positive probability. Since , we obtain that . To summarize, is a countable -absorbing subset of for every , as desired. ∎
Proofs of Theorems 3 and 4.
By Proposition 2, the sets , , are well defined (i.e., non-empty). As they are -absorbing, we can apply Proposition 5 to any point to get the result of Theorem 3.
As for the proof of Theorem 4, let where satisfies . By the definition of we have (see, in the statement of the theorem). By employing a classical bound on the mixing rates of (e.g., Eq. (4.34) in Section 4.5 in [20]) we may take of the form (where is a positive integer which may depend on ) so that for every . Thus, if the sender does not use his private information along the first time periods, and then plays optimally starting from the ’st time period, he guarantees . As , Theorem 3 ensures that , thus proving the left-hand side of Eq. (4.1). On the other hand, by using Jensen’s inequality for both and , we see that for every and
and thus, since by Theorem 3, we obtain the right-hand side of Eq. (4.1). ∎
We move on with the proof of Theorem 5.
Proof of Theorem 5.
Let . Since is -absorbing, we can assign for each a distribution so that . Moreover, by Carathéodory’s Theorem, we can choose so that for every . Define the matrix by ; is a stochastic matrix. Also, define the dimensional matrix by , where and . We have the following algebraic relation:
(41) |
Let be a communicating class of whose states are recurrent (see, e.g., Lemma 1.26 in [20]). Denote by the restriction of the matrix to the set of states . The matrix is clearly stochastic. Moreover, since is a communicating class, is also irreducible. Next, let us denote by the matrix with entries , where and . It follows from Eq. (41) that
(42) |
As is stochastic, Eq. (42) implies that the set is -absorbing.
Consider the following strategy . If , split into elements of according to some prior . Otherwise, ignore private information and send the fixed signal . Assume that for some . If , the sender ignores his private information and sends the fixed signal . Next, if , instructs the sender to choose a split from (see (37) for the definition of ). Finally, if , instructs the sender to split into , where is the ’th row of . Note that such a split is available to the sender because each row in contains at most non-zero elements, and .
It follows from the definition of that for each , the sequence of posteriors follows a Markov chain over the state space with initial probability and transition rule given by the stochastic matrix . Since the latter is irreducible, we may employ the Ergodic Theorem for Markov chains (e.g., Theorem C.1 in [20]) to obtain that
(43) |
where is the unique stationary distribution of . Furthermore, since , we have that
(44) |
where the last equality follows from item (i) of Lemma 4. Next, by multiplying by from the left both sides of Eq. (42) we obtain
(45) |
which together with the uniqueness of implies that . However, as , we deduce that
(46) |
Therefore, by a Tauberian theorem (see, e.g., Theorem 3.1. in [25]),
(47) |
Finally, assume that . Then, since is irreducible and aperiodic, for some finite time period for every . Hence, by the definition of , we see that (46) holds for every and thus so does (47). ∎
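The Ergodic Theorem invoked in Eq. (43) can likewise be illustrated numerically. The following Python sketch uses a hypothetical two-state chain and an arbitrary per-state payoff vector `f` (both invented for the illustration) and checks that time averages along a simulated path approach the stationary expectation.

```python
import numpy as np

# Hypothetical two-state chain with stationary distribution (0.75, 0.25)
# and an arbitrary per-state payoff vector f.
M = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])
f = np.array([1.0, -2.0])

rng = np.random.default_rng(0)
x, total, T = 0, 0.0, 200_000

# Simulate the chain and accumulate the stage payoffs f(X_t).
for _ in range(T):
    total += f[x]
    x = rng.choice(2, p=M[x])

# Ergodic theorem: the time average approaches the stationary expectation pi . f.
assert abs(total / T - pi @ f) < 0.05
```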
The proof of Proposition 1 may enhance the intuition about absorbing sets.
Proof of Proposition 1.
By Carathéodory’s Theorem (see, e.g., Corollary 17.1.5 in [24]) we may associate with each distributions such that . Define a correspondence by . In particular, . The set is -absorbing for every , where is the composition of with itself times. Indeed, let , and let such that . By the definition of we have that , as desired. ∎
We end the proof section with the proof of Theorem 6.
Proof of Theorem 6.
Suppose that is a homothety and let us fix . As is continuous, Carathéodory’s Theorem (see, e.g., Corollary 17.1.5 in [24]) implies that there exist , , and positive convex weights such that and . Hence, by item (ii) of Lemma 4, for every and every . Therefore, for every . The latter, coupled with the convexity of , implies that is star-shaped around . Hence, since is a homothety we get that is -absorbing for any . By the definition of an -absorbing set, must also be -absorbing for any . By Proposition 2, we deduce that , proving the first direction of Theorem 6.
Suppose now that for every . For each , let be the Dirac measure concentrated on the ’th coordinate of . Fix and consider for each the vector . Clearly, for all . Next, as , there exists such that for every . For each we take an element satisfying for and for . We have that . Moreover, by the definition of , (for corresponding to ), because the hyperplane for all , supports at . As , Proposition 2 shows that contains an -absorbing subset. However, as has a unique stationary distribution we get that neither nor is -absorbing. Therefore, must be -absorbing for every . In particular, for every . Since as , we obtain that . As was arbitrary, we have shown that for each there exists such that . Since is a linear operator, to prove that is a homothety it suffices to show that for all .
The proof now bifurcates according to the dimension of . First, let us assume that . Since is irreducible, there exists a unique so that . We have
(48) |
By plugging into the last expression in Eq. (48) and using simple algebraic manipulations, we get that the convex weight of in the convex decomposition of to and equals
However, as , we must have that . After some further simple algebraic manipulations, one gets that the equality is equivalent to . As , we obtain that , thus proving that is a homothety whenever .
Next, let . Assume that for some . Define . Since , whereas , we have that . Consider for each the vector . Then for every . Moreover, since there exists such that for every we have . As at the beginning of the proof, we take for each an element satisfying for and for . Hence, by arguing as before for and , only this time for and , we obtain that for every . Since as , we see that . On the other hand,
(49) |
As , this implies that lies in the relative interior of the triangle defined by , , and . This of course contradicts the fact that . Hence, for every , thus proving that is a homothety. ∎
References
- [1] Abreu, D., Pearce, D., and Stacchetti, E. (1990), Toward a Theory of Discounted Repeated Games with Imperfect Monitoring. Econometrica, 58(5), 1041–1063.
- [2] Arieli, I. and Babichenko, Y. (2019), Private Bayesian Persuasion. Journal of Economic Theory, 182, 185–217.
- [3] Arieli, I., Babichenko, Y., Smorodinsky, R. and Yamashita, T. (2020), Optimal Persuasion via Bi-Pooling. In: The Twenty-First ACM Conference on Economics and Computation.
- [4] Athey, S. and Bagwell, K. (2008), Collusion with Persistent Cost Shocks. Econometrica, 76(3), 493-540.
- [5] Augenblick, N. and Bodoh-Creed, A. (2018), To Reveal or not to Reveal: Privacy Preferences and Economic Frictions. Games and Economic Behavior, 110, 318–329.
- [6] Aumann, R.J. and Maschler, M. (1995), Repeated games with incomplete information. With the collaboration of R. Stearns, Cambridge, MA: MIT Press.
- [7] Blackwell, D. (1951), Comparison of Experiments. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 1950, pp. 93–102. University of California Press, Berkeley and Los Angeles, Calif.
- [8] Dziuda, W. and Gradwohl, R. (2015), Achieving Cooperation Under Privacy Concerns. Am. Econ. J. Microecon., 7, 142–173.
- [9] Ely, J.C. (2017), Beeps. American Economic Review, 107(1): 31–53.
- [10] Escobar, J.F. and Toikka, J. (2013), Efficiency in Games with Markovian Private Information. Econometrica, 81, 1887–1934.
- [11] Fudenberg, D. and Maskin, E. (1986), The Folk Theorem in Repeated Games with Discounting or with Incomplete Information. Econometrica, 54(3), 533–554.
- [12] Farhadi, F. and Teneketzis, D. (2021), Dynamic Information Design: A Simple Problem on Optimal Sequential Information Disclosure. Dyn Games Appl.
- [13] Ganglmair, B. and Tarantino, E. (2014), Conversation With Secrets. Rand J. Econ., 45, 273–302.
- [14] Honryo, T. (2018), Dynamic Persuasion. J. Econom. Theory, 178, 36–58.
- [15] Horn, R.A. and Johnson, C.R. (2013), Matrix analysis. Second edition. Cambridge University Press, Cambridge.
- [16] Kamenica, E. and Gentzkow, M. (2011), Bayesian Persuasion. American Economic Review, 101 (6): 2590–2615.
- [17] Lehrer, E. and Sorin, S. (1992), A Uniform Tauberian Theorem in Dynamic Programming. Math. Oper. Res. 17, no. 2, 303–307.
- [18] Lingenbrink, D. and Iyer, K. (2019) Optimal Signaling Mechanisms in Unobservable Queues with Strategic Customers. Oper. Res. 67, no. 5, 1397-1416.
- [19] Mailath, G. and Samuelson, L. (2001), Who Wants a Good Reputation?. Rev. Econom. Stud., 68, no. 2, 415–441.
- [20] Levin, D.A. and Peres, Y. (2017), Markov Chains and Mixing Times, second edition. With contributions by Elizabeth L. Wilmer and a chapter on “Coupling from the past” by James G. Propp and David B. Wilson. American Mathematical Society, Providence, RI.
- [21] Phelan, C. (2006), Public Trust and Government Betrayal. J. Econom. Theory, 127(1), 27–43.
- [22] Renault, J., Solan, E., and Vieille, N. (2017), Optimal Dynamic Information Provision. Games Econ. Behav., 104, 329–349.
- [23] Renault, J. (2018), Repeated games with incomplete information. In: Meyers R. (eds) Encyclopedia of Complexity and Systems Science. Springer, Berlin, Heidelberg.
- [24] Rockafellar, R. T. (1970), Convex analysis. Princeton University Press.
- [25] Solan, E. (2021), A Course in Stochastic Game Theory. Cambridge University Press. To appear.
- [26] Wiseman, T. (2008), Reputation and Impermanent Types. Games Econ. Behav., 62, 190–210.