Observational Learning with Fake Agents
Abstract
It is common in online markets for agents to learn from others' actions. Such observational learning can lead to herding or information cascades, in which agents eventually "follow the crowd". Models of such cascades have been well studied for Bayes-rational agents that choose pay-off optimal actions. In this paper, we additionally consider the presence of fake agents that seek to influence other agents into taking one particular action. To that end, these agents take a fixed action in order to steer subsequent agents towards their preferred action. We characterize how the fraction of such fake agents impacts the behavior of the remaining agents and show that in certain scenarios, an increase in the fraction of fake agents in fact reduces the chances of their preferred outcome.
I Introduction
Consider a new item that is up for sale in a recommendation-based market where agents arrive sequentially and decide whether to buy the item, with their choice serving as a recommendation for later agents. Every agent makes a pay-off optimal decision based on his own prior knowledge of the item's quality/utility and on the observed choices of his predecessors. An informational cascade, or herding, occurs when it is optimal for an agent to ignore his own private signal and follow the actions of the past agents. Subsequent agents follow suit, and from the onset of a cascade, the agents' actions no longer reveal any of the information conveyed to them by their private signals. Thus, cascading can prevent the agents from learning the socially optimal (right) choice; i.e., the agents could herd to a wrong cascade.
Cascades were first studied in [1, 2, 3] as a Bayesian learning model in which a homogeneous sequence of rational agents take pay-off optimal actions. In this paper, in addition to rational agents, we introduce randomly arriving fake agents that always take (or fake) a fixed action, regardless of their pay-off, in order to influence the outcome of a cascade. For example, this could model a setting where a poor quality item is posted for sale on a website that records buyers' decisions, and fake agents intentionally buy (or appear to buy) this item to make it seem worth buying to future buyers. The model could also represent a setting where rational actions are manipulated or where fake actions are inserted into a recommendation system. The objective is to study the impact of varying the fraction of these fake agents on the probability of their preferred cascade. Our main result shows a counter-intuitive phenomenon: the probability with which this preferred cascade occurs is not monotonically increasing in the fraction of fake agents. In fact, in some cases the presence of fake agents can reduce the chance of their preferred cascade.
As in [1, 2, 3], we consider a setting in which all agents can observe all prior actions (although in our case, each action could be either rational or fake). Many other variations of this basic model have been studied, such as [4], where observations depend on an underlying network structure, and [5], where agents are allowed to select their observations at a cost.
Closer to our work is the model in [6], which assumes that the recording of actions for subsequent agents is subject to an error that is unbiased towards each of the possible actions. In our setting, an action being either fake or rational depending on the agent-type could equivalently be perceived as an error while recording a rational action (as in [6]), except that in our case, the error is biased only towards a preferred action. (This change requires a different analysis approach from that in [6], as the underlying Markov chain now typically has a countably infinite state-space.)
There is also a body of work that considers agents similar to our fake agents, who only take a single action regardless of the true state. This includes the crazy agents considered in [7], the stubborn agents in [8], and the zealots in [9]. While [7] relaxes the assumption of binary signals, it does not consider how changing the fraction of crazy agents affects the cascade probability, which is the main focus of our work. In [8], a non-Bayesian learning model is considered, in contrast to the Bayesian model here. Other related cascade models are investigated in [10, 11, 12, 13, 14, 15, 16, 17].
The paper is organized as follows. We describe our model in Section II. We analyze this model and identify the resulting cascade properties in Section III. In Section IV, we present our Markov process formulation and in Section V, we consider the limiting scenario where the proportion of fake agents approaches unity. Section VI characterizes the asymptotic welfare of rational agents. We conclude in Section VII.
II Model
We consider a model similar to [1] in which there is a countable sequence of agents, indexed so that the index represents both the time and the order of actions. Each agent takes an action of either buying or not buying a new item whose true value could be either good or bad. For simplicity, both possibilities are assumed to be equally likely. The agents are Bayes-rational utility maximizers, and the pay-off received by each agent depends on his action and on the true value as follows. If the agent chooses not to buy, he receives a fixed pay-off. If instead he chooses to buy, he incurs a cost for the item and gains an amount that depends on whether the item is good or bad; the buyer's net pay-off is then the gain minus the cost of buying the item. Observe that when the two values are equiprobable, the ex ante expected pay-off is the same for either action. Thus, at the start, an agent is indifferent between the two actions.
To incorporate agents' private beliefs about the new item, every agent receives a private signal. This signal, as shown in Figure 1, partially reveals information about the true value of the item through a binary symmetric channel (BSC) whose crossover probability is less than one half. Each agent takes a rational action that depends on his private signal and on the observed actions of his predecessors. Next, we modify the information structure in [1] by assuming that, at each time instant, an agent could either be fake, with a fixed probability, or ordinary, with the complementary probability. An ordinary agent honestly reports his action. On the contrary, a fake agent always reports a buy, reflecting his intention of influencing his successors into buying the new item, regardless of its true value. This implies that at any time, the reported action equals the agent's rational action if he is ordinary, and is a buy if he is fake. Refer to Figure 1.
An equivalent model is one in which a fake agent always takes the buy action himself, while every agent honestly reports his action. This yields the same information structure as the model above; however, we chose the former model mainly to simplify our notation.
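As a quick illustration of this information structure, the following sketch simulates a single reported action. The names `alpha` (probability that an agent is fake), `p` (probability that a private signal matches the true value, with `p > 1/2`) and the labels `'Y'`/`'N'` for buy/no-buy are our own stand-ins for the paper's notation, and the ordinary agent is assumed to follow his private signal, which holds only before a cascade has begun.

```python
import random

def observed_action(v_good: bool, alpha: float, p: float) -> str:
    """Draw one reported action under the fake-agent model (illustrative sketch).

    v_good -- whether the item's true value is good
    alpha  -- probability that the arriving agent is fake
    p      -- probability that a private signal agrees with the true value
    """
    if random.random() < alpha:
        return 'Y'  # a fake agent always reports a buy
    # an ordinary agent (assumed here not to be in a cascade) follows his signal
    signal_good = (random.random() < p) if v_good else (random.random() >= p)
    return 'Y' if signal_good else 'N'

# For a good item, a buy should be reported w.p. alpha + (1 - alpha) * p
random.seed(0)
n = 200_000
freq = sum(observed_action(True, 0.3, 0.75) == 'Y' for _ in range(n)) / n
```

Here `freq` concentrates around `0.3 + 0.7 * 0.75 = 0.825`, matching the biased reporting channel described above.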
III Optimal decision and cascades
For each agent, let the history of past observations be the sequence of reported actions of his predecessors. As the first agent does not have any observation history, he always follows his private signal, i.e., he buys if and only if the signal indicates a good item. From the second agent onwards, the Bayes-optimal action for every agent is chosen according to the hypothesis (good or bad) that has the higher posterior probability given his information set, namely his private signal together with the observed history. In terms of the posterior probability of the item being good, the Bayes-optimal decision rule is:
(1)
Note that when the posterior equals one half, an agent is indifferent between the actions. Similar to [6], our decision rule in this case follows the private signal, unlike [1], which employs random tie-breaking.
Definition 1
An information cascade is said to occur when an agent's decision becomes independent of his private signal.
It follows from (1) that an agent cascades to a buy if and only if his posterior favors the good value for either realization of his private signal. In the other case, the posterior favors the bad value for either realization, yielding a no-buy cascade; otherwise, the agent follows his private signal. A more intuitive way to express this condition is through the information contained in the history observed by the agent, in the form of a likelihood ratio, as follows [7].
Lemma 1
An agent cascades to a buy (respectively, a no-buy) if and only if the likelihood ratio of his observed history crosses the corresponding threshold, and otherwise follows his private signal.
This lemma follows by expressing the posterior in terms of the likelihood ratio using Bayes' law, and then applying the cascade condition to the posterior. If an agent cascades, then his observed action provides his successors with no information about the true value beyond what is already contained in the history. As a result, the likelihood ratio remains unchanged for all subsequent agents, and hence they remain in the cascade. This leads us to the following property, also exhibited by prior models [1, 2, 3, 6].
Property 1
Once a cascade occurs, it lasts forever.
On the other hand, if an agent does not cascade, then Property 1 and Lemma 1 imply that all agents up to and including him follow their own private signals, ignoring the observations of their predecessors. Mutual independence between the private signals then implies that the likelihood ratio depends only on the number of buys and the number of no-buys in the observation history. Specifically, it is determined by the difference between the number of buys, weighted by a factor that discounts for possible fake agents, and the number of no-buys. This weight is defined through the probabilities that a reported observation agrees with the true value, given that the agents follow their own private signals; these probabilities can be read off from Figure 1 for the case at hand, i.e., when every agent so far follows his signal.
Thus, agents that have not yet entered a cascade satisfy the following property.
Property 2
Until a cascade occurs, each agent follows his private signal. Moreover, the weighted difference defined above is a sufficient statistic for the information contained in the past observations.
Note that in the absence of fake agents the weight equals one, in which case the statistic is the same as in [6]. The expression for the weight shows that, due to the presence of fake agents, the dependence of an agent's decision on a buy in his observation history is discounted, whereas the dependence on a no-buy remains unaffected. This is to be expected because, unlike a no-buy, which surely comes from an honest agent, a buy carries the possibility that the agent could be fake. Further, this reduced dependence on buys is exacerbated by an increase in the probability of fake agents, as the weight decreases accordingly.
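These observations can be made concrete with a small numerical check. The sketch below computes the log-likelihood ratio of a history (valid only while no cascade has begun, per Property 2) and tests whether the next agent's decision is independent of his signal. The per-observation ratios are our own derivation from the model, with `alpha` and `p` as assumed stand-in parameters; note in the code that a no-buy's likelihood ratio does not involve `alpha`, mirroring the discussion above.

```python
import math

def log_lr(history: str, alpha: float, p: float) -> float:
    """log P(history | good) / P(history | bad), assuming every agent in the
    history followed his private signal (i.e., no cascade has begun)."""
    llr = 0.0
    for a in history:
        if a == 'Y':  # a buy may come from a fake agent, so it is discounted
            llr += math.log((alpha + (1 - alpha) * p) /
                            (alpha + (1 - alpha) * (1 - p)))
        else:         # a no-buy surely comes from an ordinary agent
            llr += math.log((1 - p) / p)
    return llr

def cascade(history: str, alpha: float, p: float) -> str:
    """Return 'Y' or 'N' if the next agent's action ignores his signal, else ''."""
    s = math.log(p / (1 - p))        # evidence carried by one private signal
    llr = log_lr(history, alpha, p)
    if llr - s > 0:                  # buys even after an unfavorable signal
        return 'Y'
    if llr + s < 0:                  # declines even after a favorable signal
        return 'N'
    return ''                        # the decision still depends on the signal
```

With `alpha = 0` this recovers the dynamics of [1] under the tie-breaking rule above: `cascade('YY', 0.0, 0.75)` reports a buy cascade, while with `alpha = 0.5` two consecutive buys no longer suffice.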
IV Markovian Analysis of Cascades
Let us, for the sake of discussion, condition on the item being good; similar arguments hold when it is bad. It follows from the previous section that the sufficient statistic evolves as a discrete-time Markov process until it is absorbed at the left wall, causing a no-buy cascade, or at the right wall, causing a buy cascade. More specifically, equation (2) shows that, until a cascade occurs, the statistic is a random walk that shifts to the right by the buy-weight whenever a buy is observed, and to the left by one whenever a no-buy is observed. Note that the walk starts at zero since the first agent has no observation history.
The random walk is depicted in Figure 2, where the transition probability denotes the probability of a buy being observed; this probability depends on the item's true value. Note that in the special case where the buy-weight is such that the walls are an integer number of steps away, the process is equivalent to a Markov chain with a finite state-space, whose two absorption states correspond to the no-buy and buy cascades respectively. In this case, the absorption probabilities can be obtained by solving a system of linear equations. In this paper, our main focus is on the more common case of non-integer weights, for which the state-space is countably infinite. (For example, if the fraction of fake agents were chosen uniformly at random, then with probability one it would fall into this case.)
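For the finite-state special case just mentioned, first-step analysis gives the absorption probabilities as the solution of a linear system. The sketch below solves it by fixed-point iteration for a generic walk on states `{0, ..., K}` that jumps up by an integer `u` with probability `q` and down by `1` otherwise (overshoots are clamped at the wall); the framing is ours and is meant only to illustrate the computation.

```python
def right_absorption_prob(K: int, start: int, u: int, q: float,
                          sweeps: int = 20_000) -> float:
    """P(absorbed at K) for a walk on {0, ..., K} with absorbing walls.

    First-step analysis gives h(0) = 0, h(K) = 1 and, for interior i,
    h(i) = q * h(min(i + u, K)) + (1 - q) * h(i - 1); repeated sweeps of
    this update converge to the unique bounded solution.
    """
    h = [0.0] * (K + 1)
    h[K] = 1.0
    for _ in range(sweeps):
        for i in range(1, K):
            h[i] = q * h[min(i + u, K)] + (1 - q) * h[i - 1]
    return h[start]
```

For `u = 1`, `K = 4`, `q = 0.75` and starting state `2`, this reproduces the classical gambler's-ruin value `0.9`.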

IV-A Error thresholds
In the absence of fake agents, as in [1], buys and no-buys carry equal weight, and so a cascade requires at least two consecutive identical observations. However, in the presence of fake agents, even a single no-buy after a buy could trigger a cascade. On the other hand, as the fraction of fake agents increases, a greater number of consecutive buys is required to cause a buy cascade. This is characterized in the following lemma.
Lemma 2
For each positive integer, define an increasing sequence of thresholds on the fraction of fake agents, where the threshold is given by:
(3)
Define the interval between two consecutive thresholds accordingly. Then, for a fraction of fake agents lying in a given interval, at least the corresponding number of consecutive buys is necessary for a buy cascade to begin.
The proof follows by noting that if a cascade has not begun when a no-buy is observed, then the rightmost position the random walk can subsequently occupy is one no-buy step in from the right wall. From there, the number of consecutive buys needed to reach the right wall is dictated by how many buy-weights cover the remaining distance, thereby giving the threshold. It follows that larger fractions of fake agents, having smaller buy-weights, require longer runs.
Remark 1
For a fraction of fake agents within a given interval, starting from the initial state, the corresponding number of consecutive buys starts a buy cascade.
Moreover, given the fraction of fake agents, the index of the interval containing it can be readily obtained from Lemma 2. Here, recall that the buy-weight is a function of the fraction of fake agents.
Figure 3 shows the thresholds varying with the fraction of fake agents, for different values of the signal quality. For a fixed fraction, we see that as the signal quality improves, more consecutive buys are required for a cascade to begin. This is because an improvement in signal quality increases the information contained in a buy, but not by as much as the corresponding increase in the information contained in a no-buy. Further, note that as the fraction of fake agents tends to one, the required run length diverges, implying that infinitely many consecutive buys are required for a cascade to begin; equivalently, the information contained in a buy observation becomes negligible. We further investigate learning in this asymptotic scenario in Section V.
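The qualitative behavior just described can be checked numerically. In the sketch below, `eta(alpha, p)` is our own derived expression for the weight of a buy relative to a no-buy, namely the ratio of their log-likelihood-ratio magnitudes under the model; the paper's thresholds in (3) are stated in terms of this weight, though the exact notation may differ.

```python
import math

def eta(alpha: float, p: float) -> float:
    """Relative evidence of a buy versus a no-buy in the likelihood-ratio walk
    (our derived expression; equals 1 with no fake agents and vanishes as
    the fraction of fake agents tends to one)."""
    num = math.log((alpha + (1 - alpha) * p) / (alpha + (1 - alpha) * (1 - p)))
    den = math.log(p / (1 - p))
    return num / den

# The weight shrinks as the fraction of fake agents grows, so ever longer
# runs of consecutive buys are needed to cover the fixed distance to the wall
weights = [eta(a / 10, 0.75) for a in range(10)]
```

One can verify that `eta(0, p)` equals one for any signal quality, and that `eta` decays towards zero as `alpha` approaches one, which is the regime studied in Section V.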
IV-B Buy cascade probability
In this subsection, we compute the probability that the walk is absorbed at the right wall, i.e., into a buy cascade. The following iterative process, depicted in Figure 4, describes all possible sequences that can lead to a buy cascade. For the rest of the paper, we assume that the fraction of fake agents lies strictly inside one of the threshold intervals. We initialize Stage 1 with the run length given by Remark 1. Starting from the initial state, consider the sequences shown in Stage 1 of Figure 4. The first sequence, consisting of that many consecutive buys, clearly hits the right wall (Remark 1). The remaining sequences are simply permutations of each other that contain exactly one no-buy; two or more no-buys are not possible without hitting the left wall. Each of these distinct sequences results in the same net right shift, ending somewhere short of the right wall, from where it would take one of two possible run lengths of consecutive buys to hit the right wall. The sequences possible from this point onwards can again be enumerated exactly as in the first stage, with the new run length in place of the old; this forms Stage 2. Unless the required consecutive buys occur, the second stage again ends short of the right wall, and the process continues to the next stage, with the required run length defined analogously for each stage. Note that by restricting the fraction of fake agents to the interior of an interval, we avoid integer-valued weights, which could give rise to certain pathological sequences with non-zero probability that are not enumerated in Figure 4.
For each stage, let the continuation probability denote the probability of hitting the right wall given that the process has not terminated before that stage. These probabilities satisfy the recursion:
(4)
and the probability of a buy cascade is then:
(5)
Here, the first-stage run length is given by Remark 1, while the run lengths of successive stages can be obtained from their predecessors using the updates:
(6)
Since (4) is an infinite recursion, to compute the cascade probability in practice we truncate the process at a finite number of stages. To this end, we first set the continuation probability of the final stage to one. Next, we use (4) to successively compute the continuation probabilities while counting down from the final stage to the first. We denote the value so obtained as the truncated estimate. The following theorem states that this estimate is an upper bound on the true cascade probability that becomes tight as the number of iterations grows; moreover, the gap decays to zero at least geometrically in the number of iterations.

Theorem 1
Let the fraction of fake agents lie strictly inside one of the threshold intervals, and consider the truncated computation above. Then, for any number of iterations, the truncated estimate upper-bounds the true buy cascade probability, and the gap between the two is bounded by a geometrically decaying function of the number of iterations.
Proof:
Recall that the truncation sets the final continuation probability to one, which implies that the truncated estimate is the net probability of two events: sequences that terminate in a buy cascade by the final stage, and sequences that do not terminate by the final stage. Thus, the difference between the estimate and the true probability is upper-bounded by the probability of the latter event. Accounting for all possible sequence combinations that persist through all of the stages yields a product bound on this probability. Maximizing this bound over the admissible parameters, and noting that the maximal value decreases in the number of iterations, yields the stated geometric decay. ∎
Figure 5 shows a plot of the buy cascade probability with respect to the fraction of fake agents, for a representative signal quality, with the truncation depth chosen so that the error bound of Theorem 1 is negligible. It can be seen that within each threshold interval, the cascade probability increases with the fraction of fake agents, but there are infinitely many discontinuities (where it decreases). Despite the discontinuities, the minimum and maximum values over an interval are achieved at its two edge points. Further, note that the relatively larger drops observed in Figure 5 occur exactly at the threshold points. Here, counter to expectation, a slight increase in the fraction of fake agents beyond a threshold causes a significant decrease in the probability of a buy cascade.
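The recursive computation can also be sanity-checked against a direct Monte Carlo simulation of the random walk. The sketch below is written in our own stand-in notation (`alpha`, `p`, and a log-likelihood-ratio walk with absorbing walls one signal-unit from the origin, assuming `alpha < 1`); it simulates our reading of the model rather than the paper's recursion.

```python
import math
import random

def buy_cascade_prob(v_good: bool, alpha: float, p: float,
                     trials: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of P(buy cascade | true value) under the
    fake-agent model (illustrative; requires alpha < 1 so that the
    walk is eventually absorbed)."""
    rng = random.Random(seed)
    s = math.log(p / (1 - p))                    # one signal's worth of evidence
    up = math.log((alpha + (1 - alpha) * p) /    # evidence in a reported buy
                  (alpha + (1 - alpha) * (1 - p)))
    q = alpha + (1 - alpha) * (p if v_good else 1 - p)  # P(observe a buy)
    hits = 0
    for _ in range(trials):
        x = 0.0
        while -s <= x <= s:          # strictly crossing a wall starts a cascade
            x = x + up if rng.random() < q else x - s
        hits += x > s
    return hits / trials

# With no fake agents, two consecutive identical actions decide the cascade
est = buy_cascade_prob(True, 0.0, 0.75)
```

With `alpha = 0` and `p = 0.75`, `est` concentrates around the gambler's-ruin value `0.9` for the probability of a (correct) buy cascade on a good item.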
We now quantify this drop as the fraction of fake agents tends to each threshold point. For this, we state the following lemma:
Lemma 3
As the fraction of fake agents approaches a threshold from below, the run lengths of all stages converge to a common value.
Proof:
We consider two cases for the stage index. The proof outline for the first case is as follows: assume the claim holds for all earlier stages, and note the limiting behavior of the buy-weight at the threshold. It then follows from (6) that the run length takes its limiting value only under a condition on the stage's starting position, and this condition is satisfied in the limit. Using this argument inductively establishes the claim for all stages. The second case is proved similarly. ∎
It follows from Lemma 3 that, as the fraction of fake agents approaches a threshold from below, the recursion in (4) involves the same infinite computation at every stage. Thus, all the continuation probabilities share a common value, which satisfies a fixed-point equation. Solving this equation and substituting the solution into (4) for the first stage yields the limiting cascade probability. However, note that the run lengths entering equation (4) differ according to whether the threshold is approached from below or from above. This corresponds to the following two different limiting values of the cascade probability:
(7)
(8)
Hence, the fractional decrease in the cascade probability that occurs abruptly at a threshold, defined as the relative gap between these two one-sided limits, is:
(9)
It can be verified that this fractional decrease converges to a limiting value as the threshold index grows. This is depicted in Figure 5, where the two sequences of one-sided limits, marked on the plot, converge to a limiting value.
Property 3
If the probability of fake agents equals one of the thresholds, then a further increase in fake agents reduces the chances of a buy cascade by the factor in (9), rather than increasing them.
Next, we consider the cascade behaviour for small fractions of fake agents and state the following theorem.
Theorem 2
Given the private signal quality and the item's true value, there exists a strictly positive bound such that, whenever the fraction of fake agents lies below it,
(10)
From the point of view of the fake agents, the above theorem implies that if they occur with sufficiently small probability, then the effect of their presence on the honest buyers is the opposite of what they intended: they reduce the chances of a buy cascade instead of increasing them. For the honest buyers, the consequence depends on the item's true value: if the item is bad, they benefit from the presence of fake agents, whereas if it is good, they are harmed by it.
The proof of Theorem 2 follows by noting that, as the fraction of fake agents tends to zero, the limiting value of the cascade probability can be obtained from (7) with the appropriate limiting run lengths and buy probabilities, which gives
(11)
However, in the absence of fake agents, as in [1], the buy-weight equals one. It then follows from Figure 2 that the walk has a finite state-space, and a buy cascade starts upon hitting the right wall. Here, solving for the absorption probability gives
(12)
By comparing the above expression with the one in (11) for the corresponding values of the buy probability, it follows that the limiting cascade probability in the presence of vanishingly few fake agents is strictly smaller than the cascade probability in their absence. Thus, there exists some strictly positive fraction of fake agents below which (10) holds true.
V Learning in the limit as the fraction of fake agents approaches unity
As the probability of an agent being fake tends to one, the information contained in a buy observation becomes negligible. As a result, an agent would need to observe infinitely many consecutive buys in his history to be convinced to start a buy cascade. Hence, one might expect that learning would never occur when the item is good, whereas it would always occur when the item is bad. However, recall that as the fraction of fake agents tends to one, buy observations become increasingly frequent, for both possible true values. This motivates studying the cascade probability in this limiting scenario. First, recall that in the process of enumerating all sequences leading to a buy cascade, the run length of each stage takes one of two consecutive integer values within a given threshold interval. In the limit, these two values grow together, and the expressions obtained in (7) and (8) yield the same limiting value for the cascade probability. In particular,
where the marked step can be proved by recalling the limiting value of the buy probability as the fraction of fake agents tends to one. Further, the limiting probability of a buy cascade, in terms of the signal quality, is given by (a proof appears after the references):
(13)
where the constant in (13) takes one value when the item is good and another when it is bad.


Figure 6 illustrates (13), together with the corresponding probability in the absence of fake agents, as a function of the signal quality. For both true values of the item, a better signal quality leads to improved learning even when fake agents have overwhelmed the ordinary agents. Also note that when the item is good, for a weak signal quality the incorrect cascade is more likely than the correct one (whereas when the item is bad this is never true).
VI Asymptotic welfare for ordinary agents
In this section, we analyze the expected pay-off, or welfare, of an ordinary (rational) agent in the limit of a long history. Recall from Section II that an agent's pay-off is fixed if he does not buy, whereas if he buys, it depends on whether the item's true value is good or bad. Thus, an agent's welfare is given by:
(14)
Next, by taking the limit of (14) as the agent index grows, the asymptotic welfare is given by:
(15)
In the last step, the key quantity is the unconditional probability of a wrong cascade, i.e., the probability that a buy cascade occurs for a bad item or a no-buy cascade occurs for a good item. The relation presented in equation (15) substantiates the notion that a lower probability of a wrong cascade results in a higher asymptotic welfare. The wrong-cascade probabilities can be computed using the recursive method described in Section IV-B through equations (4) and (5); substituting the obtained values into equation (15) then yields the asymptotic welfare. Figure 7 shows a plot of the asymptotic welfare with respect to the fraction of fake agents, for a representative signal quality, and compares it with the constant level corresponding to the asymptotic welfare in the absence of fake agents. We substitute (12) in (15) to obtain this fake-free level, which gives
(16)
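To illustrate the relation in (15) numerically, the following self-contained sketch estimates the asymptotic welfare by simulating the walk until a cascade begins, under explicitly assumed pay-offs that are our normalization rather than the paper's: not buying pays 0, while buying costs 1/2 and returns 1 for a good item and 0 for a bad one, so a cascaded agent earns +1/2, 0 or -1/2.

```python
import math
import random

def asymptotic_welfare(alpha: float, p: float,
                       trials: int = 100_000, seed: int = 2) -> float:
    """Average long-run pay-off of an ordinary agent under assumed pay-offs
    (0 for not buying; gain 1 and cost 1/2 for buying a good item, gain 0
    and cost 1/2 for a bad one). Requires alpha < 1. Illustrative only."""
    rng = random.Random(seed)
    s = math.log(p / (1 - p))
    up = math.log((alpha + (1 - alpha) * p) /
                  (alpha + (1 - alpha) * (1 - p)))
    total = 0.0
    for _ in range(trials):
        v_good = rng.random() < 0.5              # good and bad equally likely
        q = alpha + (1 - alpha) * (p if v_good else 1 - p)
        x = 0.0
        while -s <= x <= s:                      # run until a cascade begins
            x = x + up if rng.random() < q else x - s
        if x > s:                                # buy cascade
            total += 0.5 if v_good else -0.5
        # a no-buy cascade contributes 0 under these pay-offs
    return total / trials

# Fake-free baseline for p = 0.75 (analytically 0.2 under these pay-offs)
w_no_fakes = asymptotic_welfare(0.0, 0.75)
```

Consistent with Remark 2, introducing fake agents lowers this value; e.g., `asymptotic_welfare(0.3, 0.75)` falls visibly below `w_no_fakes`.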
It can be observed in Figure 7 that the asymptotic welfare in the presence of fake agents lies below the fake-free level for every fraction of fake agents. We find this ordering to be consistent over the whole range of signal qualities.
Remark 2
For any private signal quality, we observe that the presence of fake agents in any amount worsens the asymptotic welfare.
Further, note that the relatively larger jumps in the asymptotic welfare observed in Figure 7 occur exactly at the threshold points. Here, counter to expectation, a slight increase in the fraction of fake agents beyond a threshold causes an abrupt and significant increase in the asymptotic welfare. Therefore, increasing the fraction of fake agents past a threshold is not only counter-productive for the fake agents but also leads to a higher asymptotic welfare for the ordinary agents.

VII Conclusions and future work
We studied the effect of randomly arriving fake agents that seek to influence the outcome of an information cascade. We focused on the impact of varying the fraction of fake agents on the probability of their preferred cascade and identified scenarios where the presence of fake agents reduces the chances of that cascade. In particular, the chances of the preferred cascade decrease abruptly as the fraction of fake agents increases beyond specific thresholds. We also quantified the correct-cascade probability as a function of the private signal quality in the asymptotic regime where the fraction of fake agents approaches unity, and observed that the presence of fake agents in any amount worsens the asymptotic welfare. Studying the effects of multiple types of fake agents and of non-Bayesian rationality are possible directions for future work.
References
- [1] S. Bikhchandani, D. Hirshleifer, and I. Welch, “A theory of fads, fashion, custom, and cultural change as informational cascades,” Journal of Political Economy, vol. 100, no. 5, pp. 992–1026, 1992.
- [2] A. V. Banerjee, “A simple model of herd behavior,” The Quarterly Journal of Economics, vol. 107, no. 3, pp. 797–817, 1992.
- [3] I. Welch, “Sequential sales, learning, and cascades,” The Journal of Finance, vol. 47, no. 2, pp. 695–732, 1992.
- [4] D. Acemoglu, M. A. Dahleh, I. Lobel, and A. Ozdaglar, “Bayesian learning in social networks,” The Review of Economic Studies, vol. 78, no. 4, pp. 1201–1236, 2011.
- [5] Y. Song, “Social learning with endogenous network formation,” arXiv preprint arXiv:1504.05222, 2015.
- [6] T. N. Le, V. G. Subramanian, and R. A. Berry, “Information cascades with noise,” IEEE Transactions on Signal and Information Processing over Networks, vol. 3, no. 2, pp. 239–251, 2017.
- [7] L. Smith and P. Sørensen, “Pathological outcomes of observational learning,” Econometrica, vol. 68, no. 2, pp. 371–398, 2000.
- [8] D. Acemoğlu, G. Como, F. Fagnani, and A. Ozdaglar, “Opinion fluctuations and disagreement in social networks,” Mathematics of Operations Research, vol. 38, no. 1, pp. 1–27, 2013.
- [9] M. Mobilia, “Does a single zealot affect an infinite group of voters?” Physical Review Letters, vol. 91, no. 2, p. 028701, 2003.
- [10] I. H. Lee, “On the convergence of informational cascades,” Journal of Economic Theory, vol. 61, no. 2, pp. 395–411, 1993.
- [11] Y. Wang and P. M. Djurić, “Social learning with Bayesian agents and random decision making,” IEEE Transactions on Signal Processing, vol. 63, no. 12, pp. 3241–3250, 2015.
- [12] T. Le, V. Subramanian, and R. Berry, “Quantifying the utility of noisy reviews in stopping information cascades,” in Proc. IEEE Conference on Decision and Control (CDC), 2016.
- [13] Y. Peres, M. Z. Rácz, A. Sly, and I. Stuhl, “How fragile are information cascades?” arXiv preprint arXiv:1711.04024, 2017.
- [14] T. N. Le, V. G. Subramanian, and R. A. Berry, “Bayesian learning with random arrivals,” in 2018 IEEE International Symposium on Information Theory (ISIT). IEEE, 2018, pp. 926–930.
- [15] I. Bistritz and A. Anastasopoulos, “Characterizing non-myopic information cascades in Bayesian learning,” in 2018 IEEE Conference on Decision and Control (CDC). IEEE, 2018, pp. 2716–2721.
- [16] Y. Cheng, W. Hann-Caruthers, and O. Tamuz, “A deterministic protocol for sequential asymptotic learning,” in 2018 IEEE International Symposium on Information Theory (ISIT). IEEE, 2018, pp. 1735–1738.
- [17] G. Schoenebeck, S. Su, and V. G. Subramanian, “Social learning with questions,” CoRR, vol. abs/1811.00226, 2018. [Online]. Available: http://arxiv.org/abs/1811.00226
Proof of equation (13)
Proof:
We first obtain the limiting value of the relevant term as the fraction of fake agents tends to one. For this, recall the two-sided bounds on this term; since the upper and lower bounds converge to the same limit, it follows that
(17)
Further, the numerator of this term can be bounded using an elementary inequality that holds over the relevant range. As a result, the term can be bounded as:
(18)
This in turn extends to the following bounds on the cascade probability,
(19)
where the constant takes one value when the item is good and another when it is bad. Now, as both bounding quantities tend to the same limit as the fraction of fake agents approaches one, the upper and lower bounds presented in (19) achieve the same limit. Hence,
(20)
where the marked step follows by rewriting the limit so that it matches a standard identity.