
1 Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands, [email protected]
2 Department of CSE, KSVCEM Bijnor, India, [email protected]
3 Department of Mathematics, Naval Postgraduate School, Monterey, CA, USA, [email protected]

Competitive Influence Propagation and Fake News Mitigation in the Presence of Strong User Bias

Akrati Saxena 1, Harsh Saxena 2, Ralucca Gera 3
Abstract

Due to the extensive role of social networks in social media, it is easy for people to share news, and it spreads faster than ever before. These platforms have also been exploited to share rumors and fake information, which is a threat to society. One method to reduce the impact of fake information is making people aware of the correct information based on hard proof. In this work, we first propose a propagation model called the Competitive Independent Cascade Model with users' Bias (CICMB) that considers the presence of strong user bias towards different opinions, beliefs, or political parties. We further propose a method, called k-TruthScore, to identify an optimal set of truth campaigners from a given set of prospective truth campaigners to minimize the influence of rumor spreaders on the network. We compare k-TruthScore with state-of-the-art methods, measuring their performance as the percentage of saved nodes (nodes that would have believed in the fake news in the absence of the truth campaigners). We present these results on a few real-world networks, and the results show that the k-TruthScore method outperforms baseline methods.

Keywords:
Fake News Mitigation · Influence Propagation · Competitive Information Propagation

1 Introduction

Since 1997, Online Social Networks (OSNs) have made it progressively easier for users to share information with each other, and information reaches millions of people in just a few seconds. Over these years, people have shared true as well as fake news or misinformation on OSNs, since no references or proofs are required while posting on an OSN. In 2017, the World Economic Forum announced that fake news and misinformation are among the top three threats to democracy worldwide [10]. Google Trend analysis shows that web searches for the term "Fake News" began to gain relevance around the time of the U.S. presidential election in 2016 [1]; Figure 1 shows the plot we generated using Google Trend data.

Figure 1: Google Trend for “Fake News” web search since 2016.

There are several reasons why people share fake news. Some of the more threatening ones are changing the outcome of an event such as an election, damaging the reputation of a person or company, creating panic or chaos among people, and gaining profit by improving the public image of a product or company. Less malicious reasons for sharing misinformation include the fame users gain as a result of the news' catchiness, or the wish to start a new conversation with no malicious intent [6].

A study on Twitter data shows that false news spreads faster, farther, and deeper [34, 23], and these effects are even more prominent for political news than for financial, disaster, terrorism, or science-related news [34]. A large volume of fake information is shared by a small number of accounts, and Andrews et al. [3] show that this can be combated by propagating the correct information in times of crisis; the accounts propagating true information are referred to as "official" accounts.

In OSNs, users have a high bias or polarity towards news topics, such as a bias for political parties [29, 35]. Lee et al. [17] observe that users who are actively involved in political discussions on OSNs tend to develop more extreme political attitudes over time than people who do not use OSNs. Users tend to share news confirming their beliefs. In this work, we propose a propagation model for the spread of misinformation and its counter (correct) information in the presence of strong user bias; the proposed model is referred to as the Competitive Independent Cascade Model with users' Bias (CICMB). In the proposed model, a user's bias for a belief or opinion keeps getting stronger as they are exposed to more news confirming that opinion, and at the same time, their bias towards the counter-opinion keeps getting weaker.

It is very challenging to mitigate fake news in the presence of strong user bias. Researchers have proposed various techniques to minimize the impact of fake news on a given social network. These methods can be categorized as (i) influence blocking (IB) techniques [2, 24], and (ii) truth-campaigning (TC) techniques [5, 30]. IB techniques aim to identify a set of nodes that can be blocked or immunized to minimize the spread of fake information in the network. In truth-campaigning techniques, the aim is to identify an optimal set of users who will start spreading the correct information in the network so that people are aware of the true news and share it further. Psychological studies have shown that people believe in true news rather than fake news when they receive both, and this also reduces the further sharing of fake information [31, 22].

Most of the existing methods identify truth campaigners in the network who can minimize the impact of fake information; however, they do not consider the fact that a chosen node might not be interested in starting a truth campaign if asked [5, 21, 30]. In this work, we consider a realistic setting where we are given a set of nodes that are willing to start a truth campaign; these nodes are referred to as prospective truth campaigners. We propose a method to identify the k most influential truth campaigners from the given set of prospective truth campaigners to minimize the damage of fake news. We compare the proposed method, k-TruthScore, with state-of-the-art methods, and the results show that k-TruthScore is effective in minimizing the impact of fake news in the presence of strong user bias.

The paper is structured as follows. In Section 2, we discuss the related literature. In Section 3, we discuss the proposed spreading model. Section 4 presents our methodology to choose truth-campaigners. Section 5 compares the methods on real-world networks. We conclude the paper with future directions in Section 6.

2 Related Work

The problem of fake news spreading needs public attention to control further spreading. In a news feed released in April 2017 [20], Facebook outlined two main approaches for countering the spread of fake news: (i) a crowdsourcing approach leveraging the community and third-party fact-checking organizations, and (ii) a machine learning approach to detect fraud and spam accounts. A study by Halimeh et al. supports the claim that Facebook's fake news combating techniques will have a positive impact on information quality [12]. Besides Facebook, there are several other crowdsourced fact-checking websites, including snopes.com, politifact.com, and factcheck.org.

Researchers have proposed various influence blocking and truth-campaigning techniques to mitigate fake news in different contexts. In influence blocking, identifying a set of nodes of size k whose removal minimizes the fake news spread is NP-hard [2], so a brute-force search is infeasible. Therefore, greedy or heuristic solutions are preferred and feasible to apply in real-life applications. Amoruso et al. [2] proposed a two-step heuristic method that first identifies the set of most probable sources of the infection, and then places a few monitors in the network to block the spread of misinformation.

Pham et al. [24] worked on the Targeted Misinformation Blocking (TMB) problem, where the goal is to find the smallest set of nodes whose removal reduces the misinformation influence by at least a given threshold γ. The authors showed that TMB is a #P-hard problem under the linear threshold spreading model, and proposed a greedy algorithm that provides a solution set within a ratio of 1 + ln(γ/ε) of the optimal set, with an expected influence reduction greater than (γ − ε), given that the influence reduction function is submodular and monotone. Yang et al. worked on two versions of the influence minimization problem, called Loss Minimization with Disruption (LMD) and Diffusion Minimization with Guaranteed Target (DMGT), using Integer Linear Programming (ILP) [36]. The authors proposed heuristic solutions for the LMD problem where the k nodes having the minimum degree or PageRank are chosen. They further proposed a greedy solution for the DMGT problem, where at each iteration they choose the node that provides the maximal marginal gain.

In contrast to IB, truth-campaigning techniques combat fake news by making users aware of the true information. Budak et al. [5] showed that selecting a minimal group of users to disseminate "good" information in the network to minimize the influence of the "bad" information is an NP-hard problem. They provided an approximation guarantee for a greedy solution for different variations of this problem by proving them submodular. Nguyen et al. [21] worked on a problem called β_T^I, where the goal is to select the smallest set S of influential nodes that start spreading the good information so that the expected decontamination ratio in the whole network is β after t time steps, given that the misinformation was started from a given set of nodes I. They proposed a greedy solution called Greedy Viral Stopper (GVS) that iteratively selects the node whose decontamination maximizes the total number of decontaminated nodes if the selected node starts spreading the true information.

Farajtabar et al. [9] proposed a point-process-based mitigation technique using a reinforcement learning framework. The proposed method was deployed in real time on Twitter to mitigate a synthetically started fake news campaign. Song et al. [30] proposed a method to identify truth campaigners in temporal influence propagation where the rumor has no impact after its deadline; the method is explained in Section 5.2. In [26], the authors considered users' bias, though the bias remains constant over time. In our work, we consider a realistic spreading model where users' biases keep getting stronger or weaker based on the content they are exposed to and share further. The competitive influence propagation is modeled by extending the Independent Cascade Model (ICM). The ICM has been extended to explain information propagation [27, 16, 4, 11] and competitive influence propagation [18, 30, 32, 13, 28, 19] on OSNs; however, users' bias is not considered in these models. Next, we propose the k-TruthScore method to choose top-k truth campaigners for minimizing the negative impact of fake news in the network.

3 The Proposed Propagation Model: CICMB

The Independent Cascade Model (ICM) [14] has been used to model information propagation in social networks. In ICM, each directed edge has an influence probability with which the source node influences the target node. The propagation is started from a source node or a group of source nodes. At each iteration, a newly influenced node tries to influence each of its neighbors with the given influence probability, and it will not influence any of its neighbors in further iterations. Once there is no newly influenced node in an iteration, the propagation process stops. The total number of influenced nodes shows the influencing or spreading power of the seed nodes.
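For concreteness, the following is a minimal sketch of one ICM run on a directed graph; the dictionary-based graph representation and function names are our own illustration and not part of the original model description.

import random

def icm_spread(succ, prob, seeds):
    """Simulate one run of the Independent Cascade Model.

    succ[u]      : list of out-neighbors of u
    prob[(u, v)] : influence probability of edge (u, v)
    seeds        : initially influenced nodes
    Returns the set of all influenced nodes.
    """
    influenced = set(seeds)
    frontier = list(seeds)                  # newly influenced nodes
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in succ.get(u, []):
                # each edge gets exactly one activation attempt
                if v not in influenced and random.random() < prob[(u, v)]:
                    influenced.add(v)
                    next_frontier.append(v)
        frontier = next_frontier            # stop once no new node is influenced
    return influenced

# Example: spreading power of seed node 0 (averaged over many runs in practice)
succ = {0: [1, 2], 1: [3], 2: [3], 3: []}
prob = {(0, 1): 0.8, (0, 2): 0.5, (1, 3): 0.6, (2, 3): 0.4}
print(len(icm_spread(succ, prob, {0})))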

Kim and Bock [15] observed that people's beliefs construct their positive or negative emotions about a topic, which further affects their attitude and behavior towards misinformation spreading. We believe that in real life, people have biases towards different opinions, and once they believe in one piece of information, they are less willing to switch their opinion.

Competitive Independent Cascade Model with users' Bias (CICMB). In our work, we propose the Competitive Independent Cascade Model with users' Bias (CICMB) by incorporating the previous observation into the ICM when misinformation and its counter true information propagate competitively in the network. In this model, each user u has a bias towards misinformation at timestamp i, namely B_m(u)[i], and a bias towards its counter true information, B_t(u)[i]. The influence probability of an edge (u,v) is denoted as P(u,v). Before the propagation starts, each node is in the neutral state N. If a user believes in the misinformation or the true information, it changes its state to M or T, respectively.

Once the propagation starts, all misinformation starters change their state to M and all truth campaigners to T, and being stubborn users, they never change their state during the entire propagation. At each iteration, truth and misinformation campaigners influence their neighbors as explained next. A misinformation spreader u changes the state of its neighbor v at timestamp i with probability Prob = P(u,v) · B_m(v)[i]. If the node v changes its state from N to M, its bias values are updated as (B_m(v)[i+1], B_t(v)[i+1]) = f(B_m(v)[i], B_t(v)[i]). Similarly, a truth campaigner u influences its neighbor v with probability P(u,v) · B_t(v)[i], and the bias values are updated as (B_m(v)[i+1], B_t(v)[i+1]) = f(B_m(v)[i], B_t(v)[i]).

In our implementation, we consider that when a node u believes in one piece of information, its bias towards accepting the other information is reduced by half, using these functions:
(i) When u changes its state to M, B_m(u)[i+1] = B_m(u)[i] and B_t(u)[i+1] = B_t(u)[i]/2.
(ii) When u changes its state to T, B_m(u)[i+1] = B_m(u)[i]/2 and B_t(u)[i+1] = B_t(u)[i].

The model is explained in Figure 2 for this linear function. As a second case for our research, we also present results for a stronger bias function, where the bias is reduced faster using a quadratic function.
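As an illustration, the sketch below implements a single CICMB activation attempt together with the linear bias-halving functions (i) and (ii). The data structures and function names are ours, chosen only to mirror the notation above; whether a non-seed node that has already adopted an opinion may later switch with its reduced bias follows our reading of the model and Figure 2, while the seed nodes are kept stubborn as stated.

import random

def cicmb_attempt(u_state, v, state, B_m, B_t, p_uv, stubborn):
    """One CICMB activation attempt on neighbor v by a spreader in state M or T.

    state[v] in {'N', 'M', 'T'}; B_m[v], B_t[v] are v's current biases;
    stubborn holds the misinformation starters and chosen truth campaigners,
    which never change state. Returns True if v changed its state.
    """
    if v in stubborn or state[v] == u_state:
        return False
    if u_state == 'M' and random.random() < p_uv * B_m[v]:
        state[v] = 'M'
        B_t[v] = B_t[v] / 2      # function (i): bias towards true info is halved
        return True
    if u_state == 'T' and random.random() < p_uv * B_t[v]:
        state[v] = 'T'
        B_m[v] = B_m[v] / 2      # function (ii): bias towards misinformation is halved
        return True
    return False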

Figure 2: CICMB model used in the experiments; a bias that is not displayed on a link stays constant between states.

4 Methodology

We first introduce the problem formulation and then follow with our proposed solution.

4.1 Problem Formulation

In OSNs, when true information or misinformation is propagated, users change their state between the elements of the set {N, M, T}. The state of a user u at timestamp i is denoted as π_u[i], with the following possible assignments: (i) π_u[i] = T if the user believes in the true information, (ii) π_u[i] = M if the user believes in the misinformation, and (iii) π_u[i] = N if the user is in the neutral state.

Given a set of rumor starters R who spread misinformation, the deadline of misinformation spread α, and a set of prospective truth campaigners P, we aim to identify a set D of chosen truth-campaigners of size k from the set P (D ⊂ P and |D| = k) to start a truth campaign such that the impact of misinformation is minimized.

Let u be a node such that π_u[α] = M if only the misinformation is propagated in the network, and π_u[α] = T when both the misinformation and its counter true information are propagated in the network. The node u is considered a saved node at the deadline α, as it believes in the true information but would have believed in the misinformation in the absence of the truth campaign.

Problem Definition: Given a graph G=(V,E), a rumor deadline α, a set of rumor starters R, and a set of prospective truth-campaigners P, let S be the set of nodes whose state is M at time α when only the nodes in R propagate misinformation using CICMB, and let I be the set of nodes whose state is T at time α when the sets R and D propagate misinformation and true information, respectively, using CICMB. Our aim is to find a set D ⊂ P of given size k such that the number of saved nodes is maximized as follows:

f(D,M) = \sum_{v \in S \cap I} (1 \mid \pi_v(\alpha) = T)
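A minimal sketch of how this objective can be estimated by simulation is given below. It assumes a cicmb_simulate helper (hypothetical, not defined in the paper) that runs CICMB for α steps and returns each node's final state.

def saved_nodes(cicmb_simulate, G, R, D, alpha):
    """Estimate f(D, M): nodes that end in state M under rumor-only propagation
    but in state T when the truth campaign D runs alongside the rumor."""
    state_rumor_only = cicmb_simulate(G, rumor_starters=R, truth_campaigners=set(), deadline=alpha)
    state_both = cicmb_simulate(G, rumor_starters=R, truth_campaigners=D, deadline=alpha)
    S = {v for v, s in state_rumor_only.items() if s == 'M'}
    I = {v for v, s in state_both.items() if s == 'T'}
    return len(S & I)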

4.2 The Proposed Solution

In this section, we introduce our proposed algorithm, k-TruthScore, give intuition for how it works, and summarize it at the end of the section. For a given set of misinformation starters R and prospective truth-campaigners P, our goal is to estimate which truth campaigner node will save the maximum number of nodes by the deadline α, tracked by the TruthScore that we introduce below. We then choose the top-k nodes having the highest TruthScore as truth-campaigners (D) to minimize the impact of misinformation.

To compute the TruthScore, we assign to each node u two arrays, mval and tval, each of length (α + 1), where mval_u[i] and tval_u[i] denote the estimated probability that node u changes its state to M and T at time i, respectively. To estimate these probability values, we first create the Directed Acyclic Graph (DAG) G'(V,E') of the given network G to remove cycles. Otherwise, if there were a cycle in the network, the nodes belonging to the cycle would keep updating each other's probabilities in an infinite loop.

We now compute the probability of an arbitrary node u changing its state to M at some iteration i, namely mval_u[i]. For this to happen, we compute two probabilities:

1. the probability that the node u is not in state M at time i−1, computed as (1 − Σ_{j=1}^{i−1} mval_u[j]), and

2. the probability that the node u receives the misinformation at the i-th step, which considers all parents v of node u that have updated their mval at timestamp i−1, V_1 = {v | (v,u) ∈ E' and mval_v[i−1] > 0}. We then compute the value of mval_u[i] by taking their product, as shown in Equation 1:

mval_u[i] = \sum_{v \in V_1} \left( mval_u[i] + (1 - mval_u[i]) \cdot \left(1 - \sum_{j=1}^{i-1} mval_u[j]\right) \cdot P(v,u) \cdot B_m(u) \cdot mval_v[i-1] \right)   (1)

We use this formula to compute mval_u[i] for all nodes from i = 1 to α. All the nodes whose mval has been updated are added to the set A.

Next, we compute the TruthScore of each prospective truth-campaigner w. We estimate the probability that a node u believes in the true information at the i-th timestamp when the true information is propagated from node w in the network. For this, we set tval_w[0] = 1 and compute tval_u[i] for each node u ∈ (V − R) from i = 1 to α.

The probability that node u changes its state to T at timestamp i is the probability that the node u has not changed its state to T at any previous timestamp, multiplied by the probability of receiving the true information at the i-th timestamp. It is computed using the same approach as defined in Equation 1.

The estimated probability that a node u changes its state to T at timestamp i is computed as follows. Consider all parents v of node u that have updated tval_v[i−1] at timestamp i−1, V_2 = {v | (v,u) ∈ E' and tval_v[i−1] > 0}. Then:

tval_u[i] = \sum_{v \in V_2} \left( tval_u[i] + (1 - tval_u[i]) \cdot \left(1 - \sum_{j=1}^{i-1} tval_u[j]\right) \cdot P(v,u) \cdot B_t(u) \cdot tval_v[i-1] \right)   (2)

The tval values are computed for i = 1 to α. All the nodes whose tval has been updated are added to the set B. The TruthScore of truth-campaigner w is computed as:

TruthScore(w) = \sum_{v \in A \cap B} \sum_{i=1}^{\alpha} tval_v[i]   (3)

For fast computation, a node v updates the mval of its child node u at timestamp i only if mval_v[i−1] > θ, where θ is a small threshold value. The same threshold is used while computing the tval arrays of the nodes.
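Since the per-timestamp updates of Equations 1 and 2 have the same form, a single helper can propagate either array over the DAG. The sketch below is our own illustration under that assumption: children[v] holds the out-neighbors of v in G', P the edge probabilities, and bias the per-node B_m or B_t value; the function names are ours.

def propagate_values(children, P, bias, seeds, alpha, theta=1e-6, skip=frozenset()):
    """Propagate an mval/tval-style array over the DAG (Eq. 1 / Eq. 2).

    Returns val[u] (an array of length alpha+1 per reached node) and the set
    of reached nodes. Nodes in `skip` (e.g. the rumor starters, for tval) are
    never updated.
    """
    val = {}
    for s in seeds:
        val.setdefault(s, [0.0] * (alpha + 1))[0] = 1.0
    reached, frontier = set(), set(seeds)
    for i in range(1, alpha + 1):
        nxt = set()
        for v in frontier:
            if val[v][i - 1] <= theta:        # threshold theta: skip negligible contributions
                continue
            for u in children.get(v, []):
                if u in skip:
                    continue
                arr = val.setdefault(u, [0.0] * (alpha + 1))
                not_yet = 1.0 - sum(arr[:i])  # prob. that u has not switched before step i
                arr[i] += (1.0 - arr[i]) * not_yet * P[(v, u)] * bias[u] * val[v][i - 1]
                nxt.add(u)
                reached.add(u)
        frontier = nxt
    return val, reached

def truth_score(children, P, B_t, w, A, R, alpha, theta=1e-6):
    """TruthScore(w) per Eq. 3: sum of tval over nodes reachable both from
    the rumor starters (set A) and from the candidate truth campaigner w."""
    tval, B = propagate_values(children, P, B_t, [w], alpha, theta, skip=set(R))
    return sum(sum(tval[v][1:alpha + 1]) for v in A & B)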

We now summarize the above described method, which we call k-TruthScore:

1. Create G'(V,E'), the DAG of the given network G.

2. For all nodes u in the set R of rumor starters, set mval_u[0] = 1. Compute mval for the nodes reachable from R by the given deadline α using Equation 1, and add these nodes to the set A.

3. For each prospective truth-campaigner w from the set P:
   (a) Set tval_w[0] = 1.
   (b) Compute the tval arrays for the nodes reachable from w by the given deadline α using Equation 2, and add these nodes to the set B.
   (c) Compute TruthScore(w) by adding the values of tval for the nodes in A ∩ B using Equation 3.

4. Choose the top-k truth-campaigners having the highest TruthScore.

The complete method is explained in Algorithm 1.

Algorithm 1 k-TruthScore(G(V,E), R, P, k, α, θ)
  G'(V,E') = Create DAG of G(V,E)
  Each node u has mval and tval arrays of size α+1, initialized with 0 entries.
  A = { }, B = { }
  for each node u in R do
     mval_u[0] = 1
  end for
  R' = R
  for i in range(1, α+1) do
     X = { }
     while R' is not empty do
        Remove node v from R'
        if mval_v[i-1] > θ then
           for each child u of v do
              mval_u[i] = mval_u[i] + (1 - mval_u[i]) · (1 - Σ_{j=1}^{i-1} mval_u[j]) · P(v,u) · B_m(u) · mval_v[i-1]
              Add u to X
              Add u to A
           end for
        end if
     end while
     R' = X
  end for
  for each node w in P do
     B = { }, P' = { }
     tval_w[0] = 1
     Add w to P'
     for i in range(1, α+1) do
        X = { }
        while P' is not empty do
           Remove node v from P'
           if tval_v[i-1] > θ then
              for each child u of v do
                 if u ∉ R then
                    tval_u[i] = tval_u[i] + (1 - tval_u[i]) · (1 - Σ_{j=1}^{i-1} tval_u[j]) · P(v,u) · B_t(u) · tval_v[i-1]
                    Add u to X
                    Add u to B
                 end if
              end for
           end if
        end while
        P' = X
     end for
     for each node u in A ∩ B do
        for i in range(1, α+1) do
           TruthScore(w) = TruthScore(w) + tval_u[i]
        end for
     end for
     Reset all entries of the tval arrays of all nodes to 0.
  end for
  D = the top-k nodes from set P having the highest TruthScore
  Return D
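Assuming the helpers sketched earlier (propagate_values and truth_score, our own hypothetical names), selecting the set D then reduces to ranking the prospective truth campaigners; a possible driver, where children describes the DAG G', could look as follows:

def k_truth_score(children, P, B_m, B_t, R, prospects, k, alpha, theta=1e-6):
    """Rank prospective truth campaigners by TruthScore and keep the top k."""
    # nodes reachable by the rumor within the deadline (set A in the paper)
    _, A = propagate_values(children, P, B_m, R, alpha, theta)
    scores = {w: truth_score(children, P, B_t, w, A, R, alpha, theta) for w in prospects}
    return sorted(prospects, key=scores.get, reverse=True)[:k]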

5 Performance Study

We carry out experiments to validate the performance of k-TruthScore in identifying the top-k truth-campaigners.

5.1 Datasets

We perform the experiments on three real-world directed social networks, Digg, Facebook, and Twitter, presented in Table 1. For each of them, the diameter is computed on the undirected version of the network.

Network    Nodes    Edges      Diameter   Ref
Digg       29652    85983      12         [8]
Facebook   43953    262631     18         [33]
Twitter    81306    1768135    7          [7]
Table 1: Datasets

We assign the influence probability of each edge (v,u) uniformly at random (u.a.r.) from the interval (0,1]. Each node in the network has two bias values, one for the misinformation and another for the true information. For misinformation starters, the bias for misinformation is assigned u.a.r. from the interval [0.7, 1], as the nodes spreading misinformation will be highly biased towards it. For these nodes, the bias for the true information is assigned as B_t[0] = 1 − B_m[0].

Similarly, the nodes chosen to be prospective truth campaigners have a high bias towards the true information, assigned u.a.r. from the interval [0.7, 1]. For prospective truth-campaigners, the bias for misinformation is assigned as B_m[0] = 1 − B_t[0]. For the rest of the nodes, the bias values for misinformation and its counter true information are assigned u.a.r. from the interval (0,1]. Note that the size of the prospective truth-campaigner set is fixed at |P| = 50, and the set P is chosen u.a.r. from the set (V − R). We fix θ = 0.000001 for all the experiments.
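A sketch of this random initialization (edge probabilities and initial biases) is shown below; the variable and function names are ours, and a tiny positive lower bound approximates the half-open interval (0,1].

import random

def init_parameters(edges, nodes, R, prospects):
    """Assign edge probabilities and initial biases as in the experimental setup."""
    P = {(v, u): random.uniform(1e-9, 1.0) for (v, u) in edges}   # approx. u.a.r. from (0, 1]
    B_m, B_t = {}, {}
    for u in nodes:
        if u in R:                                   # misinformation starters
            B_m[u] = random.uniform(0.7, 1.0)
            B_t[u] = 1.0 - B_m[u]
        elif u in prospects:                         # prospective truth campaigners
            B_t[u] = random.uniform(0.7, 1.0)
            B_m[u] = 1.0 - B_t[u]
        else:                                        # remaining nodes
            B_m[u] = random.uniform(1e-9, 1.0)
            B_t[u] = random.uniform(1e-9, 1.0)
    return P, B_m, B_t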

5.2 Baseline Methods

We have compared our method to the following two state-of-the-art methods.

1. Temporal Influence Blocking (TIB) [30]. The TIB method runs in two phases. In the first phase, it identifies the set of nodes that can be reached by the misinformation spreaders by the given deadline, and then identifies the potential nodes that can influence them. In the second phase, it generates Weighted Reverse Reachable (WRR) trees to compute the influential power of the identified potential mitigators by estimating the number of reachable nodes for each potential mitigator. In our experiments, we select as truth-campaigners the top-k prospective truth-campaigners having the highest influential power.

2. Targeted Misinformation Blocking (TMB) [25]. TMB computes the influential power of a given node as the number of saved nodes if that node is immunized in the network. The influence reduction of a node v is therefore computed as h(v) = N(G) − N(G ∖ v), where N(G) and N(G ∖ v) denote the number of nodes influenced by the misinformation starters in G and (G ∖ v), respectively; a simulation-based sketch of this score follows below. We then select the top-k nodes having the highest influence reduction as truth-campaigners.

After selecting the top-k truth campaigners using the TIB and TMB methods, the CICMB model is used to propagate the misinformation and its counter true information.
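The sketch below illustrates one way to estimate the TMB influence-reduction score h(v) by simulation; the spread_count helper and the dictionary-based graph are our own assumptions, not part of the original TMB description.

def influence_reduction(spread_count, succ, R, v):
    """h(v) = N(G) - N(G \\ v): drop in the number of nodes influenced by the
    misinformation starters R when node v is removed (immunized).

    spread_count(succ, seeds) is a hypothetical helper estimating the expected
    number of influenced nodes, e.g. by averaging many propagation runs.
    """
    succ_minus_v = {u: [w for w in nbrs if w != v]
                    for u, nbrs in succ.items() if u != v}
    return spread_count(succ, R) - spread_count(succ_minus_v, set(R) - {v})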

If the set R starts propagating misinformation, then S is the set of nodes whose state is M at t = α. If the set R propagates misinformation and the set D propagates true information, then I is the set of nodes whose state is T at t = α. The performance of the various methods is evaluated by computing the percentage of saved nodes, i.e., ((|S| − |I|)/|S|) · 100.

We compute the results for five different choices of misinformation starters and truth-campaigners. Each experiment is repeated 100 times, and we report the average percentage of saved nodes.

First, we study the performance of k-TruthScore as a function of the number of chosen truth-campaigners k, varying k from 2 to 10. We set the deadline for the misinformation to be the network diameter, if not specified otherwise. Figure 3 shows that k-TruthScore outperforms the state-of-the-art methods for finding the top-k truth-campaigners. TIB and TMB are designed to choose truth-campaigners globally, and we restrict these methods to choose truth-campaigners from the given set of prospective truth-campaigners. Under this restriction, k-TruthScore significantly outperforms both TIB and TMB.

Figure 3: Effect of varying k for node selection methods. (a) Digg, (b) Facebook, (c) Twitter.

Figure 4: Effect of varying |M| when k = 5. (a) Digg, (b) Facebook, (c) Twitter.

Figure 5: Effect of varying the deadline α on the Digg network.

Figure 6: Square-bias function on the Digg network.

Next, we study the impact of varying the number of rumor starters. We fix k = 5 and vary |M| from 5 to 25. Figure 4 shows that the percentage of saved nodes decreases as the number of rumor starters increases, while k-TruthScore still outperforms the TIB and TMB methods.

We also study the impact of varying the deadline on the percentage of saved nodes when |M| = 10 and k = 5. Figure 5 shows the results for the Digg network with α varying from 4 to 20. The percentage of saved nodes is maximum around α = diameter(G), and k-TruthScore consistently outperforms the other methods at every deadline.

The results discussed so far rely on the linear degradation presented in Section 3. We now study the efficiency of k-TruthScore for a quadratic bias function. We evaluate k-TruthScore with the following bias function:
When u changes its state to M, B_m(u)[i+1] = B_m(u)[i] and B_t(u)[i+1] = (B_t(u)[i])^2.
When u changes its state to T, B_m(u)[i+1] = (B_m(u)[i])^2 and B_t(u)[i+1] = B_t(u)[i].
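Plugging this quadratic function into the bias-update step is a small change; a sketch, reusing the naming of the earlier CICMB snippet (our own illustration), is:

def update_bias_quadratic(new_state, v, B_m, B_t):
    """Quadratic bias degradation applied when node v adopts a state."""
    if new_state == 'M':
        B_t[v] = B_t[v] ** 2       # bias towards true information shrinks faster
    else:                          # new_state == 'T'
        B_m[v] = B_m[v] ** 2       # bias towards misinformation shrinks faster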

The results in Figure 6 show that k-TruthScore saves the maximum number of nodes for the quadratic bias function as well. However, the percentage of saved nodes is smaller than that observed in Figure 3. This is because the square function reduces the biases faster, so users are more stubborn about changing their state once they have believed in one piece of information.

6 Conclusion

The current research presents a solution to the problem of minimizing the impact of misinformation by propagating its counter true information in OSNs. In particular, we identify the top-k candidates as truth-campaigners from a given set of prospective truth-campaigners and given rumor starters. We first propose a propagation model called the Competitive Independent Cascade Model with users' Bias that considers the presence of strong user bias towards different opinions, beliefs, or political parties. For our experiments, we used two different functions to capture the bias dynamics towards true information and misinformation, one linear and one quadratic.

Next, we introduce an algorithm, k-TruthScore, to identify the top-k truth-campaigners, and compare the results against two state-of-the-art algorithms, namely Temporal Influence Blocking and Targeted Misinformation Blocking. To compare the algorithms, we compute the percentage of saved nodes per network G by the deadline α. A node is tagged as saved if, at the deadline α, it believes in the true information but would have believed in the misinformation in the absence of truth campaigners.

We compare the three algorithms under each of the two bias functions on three different networks, namely Digg, Facebook, and Twitter. Moreover, we compare the three algorithms while varying the number of originating rumor spreaders as well as the deadline at which we compute the TruthScore. Our results show that k-TruthScore outperforms the state-of-the-art methods in every case.

In the future, we would like to do an in-depth analysis of the CICMB model for different bias functions, such as constant increase/decrease (where the bias values are increased or decreased by a constant value), other linear functions (for example, where if one bias value of a user increases then the other decreases), different quadratic functions, and so on. The proposed k-TruthScore method outperforms the baselines for both of the considered functions; however, one could also design a method specific to a given bias function.

References

  • [1] Google Trends: Fake news. Accessed on 2020-09-11
  • [2] Amoruso, M., Anello, D., Auletta, V., Ferraioli, D.: Contrasting the spread of misinformation in online social networks. In: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems. pp. 1323–1331. International Foundation for Autonomous Agents and Multiagent Systems (2017)
  • [3] Andrews, C., Fichet, E., Ding, Y., Spiro, E.S., Starbird, K.: Keeping up with the tweet-dashians: The impact of 'official' accounts on online rumoring. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing. pp. 452–465. ACM (2016)
  • [4] Barbieri, N., Bonchi, F., Manco, G.: Topic-aware social influence propagation models. Knowledge and information systems 37(3), 555–584 (2013)
  • [5] Budak, C., Agrawal, D., El Abbadi, A.: Limiting the spread of misinformation in social networks. In: Proceedings of the 20th international conference on World wide web. pp. 665–674. ACM (2011)
  • [6] Chen, X., Sin, S.C.J., Theng, Y.L., Lee, C.S.: Why do social media users share misinformation? In: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries. pp. 111–114. ACM (2015)
  • [7] De Choudhury, M., Lin, Y.R., Sundaram, H., Candan, K.S., Xie, L., Kelliher, A.: How does the data sampling strategy impact the discovery of information diffusion in social media? In: Fourth International AAAI Conference on Weblogs and Social Media (2010)
  • [8] De Choudhury, M., Sundaram, H., John, A., Seligmann, D.D.: Social synchrony: Predicting mimicry of user actions in online social media. In: 2009 international conference on computational science and engineering. vol. 4, pp. 151–158. IEEE (2009)
  • [9] Farajtabar, M., Yang, J., Ye, X., Xu, H., Trivedi, R., Khalil, E., Li, S., Song, L., Zha, H.: Fake news mitigation via point process based intervention. arXiv preprint arXiv:1703.07823 (2017)
  • [10] World Economic Forum: The global risks report 2017 (2017)
  • [11] Gupta, Y., Iyengar, S., Saxena, A., Das, D.: Modeling memetics using edge diversity. Social Network Analysis and Mining 9(1),  2 (2019)
  • [12] Halimeh, A.A., Pourghomi, P., Safieddine, F.: The impact of facebook’s news fact-checking on information quality (iq) shared on social media
  • [13] Hosni, A.I.E., Li, K., Ding, C., Ahmed, S.: Least cost rumor influence minimization in multiplex social networks. In: International Conference on Neural Information Processing. pp. 93–105. Springer (2018)
  • [14] Kempe, D., Kleinberg, J., Tardos, É.: Influential nodes in a diffusion model for social networks. In: International Colloquium on Automata, Languages, and Programming. pp. 1127–1138. Springer (2005)
  • [15] Kim, J.H., Bock, G.W.: A study on the factors affecting the behavior of spreading online rumors: Focusing on the rumor recipient’s emotions. In: PACIS. p. 98 (2011)
  • [16] Kumari, A., Singh, S.N.: Online influence maximization using rapid continuous time independent cascade model. In: 2017 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence. pp. 356–361. IEEE (2017)
  • [17] Lee, C., Shin, J., Hong, A.: Does social media use really make people politically polarized? direct and indirect effects of social media use on political polarization in south korea. Telematics and Informatics 35(1), 245–254 (2018)
  • [18] Lin, K.S., Dai, B.R.: Biog: An effective and efficient algorithm for influence blocking maximization in social networks. In: International Conference on Data Mining and Big Data. pp. 328–337. Springer (2019)
  • [19] Lv, J., Yang, B., Yang, Z., Zhang, W.: A community-based algorithm for influence blocking maximization in social networks. Cluster Computing pp. 1–16 (2017)
  • [20] Mosseri, A.: Working to stop misinformation and false news. https://newsroom.fb.com/news/2017/04/working-to-stop-misinformation-and-false-news/ (2017)
  • [21] Nguyen, N.P., Yan, G., Thai, M.T., Eidenbenz, S.: Containment of misinformation spread in online social networks. In: Proceedings of the 4th Annual ACM Web Science Conference. pp. 213–222. ACM (2012)
  • [22] Ozturk, P., Li, H., Sakamoto, Y.: Combating rumor spread on social media: The effectiveness of refutation and warning. In: System Sciences (HICSS), 2015 48th Hawaii International Conference on. pp. 2406–2414. IEEE (2015)
  • [23] Park, J., Cha, M., Kim, H., Jeong, J.: Managing bad news in social media: A case study on domino’s pizza crisis. ICWSM 12, 282–289 (2012)
  • [24] Pham, C.V., Phu, Q.V., Hoang, H.X.: Targeted misinformation blocking on online social networks. In: Asian Conference on Intelligent Information and Database Systems. pp. 107–116. Springer (2018)
  • [25] Pham, C.V., Phu, Q.V., Hoang, H.X., Pei, J., Thai, M.T.: Minimum budget for misinformation blocking in online social networks. Journal of Combinatorial Optimization 38(4), 1101–1127 (2019)
  • [26] Saxena, A., Hsu, W., Lee, M.L., Leong Chieu, H., Ng, L., Teow, L.N.: Mitigating misinformation in online social network with top-k debunkers and evolving user opinions. In: Companion Proceedings of the Web Conference 2020. pp. 363–370 (2020)
  • [27] Saxena, A., Iyengar, S., Gupta, Y.: Understanding spreading patterns on social networks based on network topology. In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015. pp. 1616–1617 (2015)
  • [28] Simpson, M., Srinivasan, V., Thomo, A.: Scalable misinformation prevention in social networks. arXiv preprint arXiv:1807.01162 (2018)
  • [29] Soares, F.B., Recuero, R., Zago, G.: Influencers in polarized political networks on twitter. In: Proceedings of the 9th international conference on social media and society. pp. 168–177 (2018)
  • [30] Song, C., Hsu, W., Lee, M.: Temporal influence blocking: Minimizing the effect of misinformation in social networks. In: 33rd IEEE International Conference on Data Engineering, ICDE 2017, San Diego, CA, USA, April 19-22, 2017. pp. 847–858 (2017). https://doi.org/10.1109/ICDE.2017.134
  • [31] Tanaka, Y., Sakamoto, Y., Matsuka, T.: Toward a social-technological system that inactivates false rumors through the critical thinking of crowds. In: System Sciences (HICSS), 2013 46th Hawaii International Conference on. pp. 649–658. IEEE (2013)
  • [32] Tong, G., Wu, W., Guo, L., Li, D., Liu, C., Liu, B., Du, D.Z.: An efficient randomized algorithm for rumor blocking in online social networks. IEEE Transactions on Network Science and Engineering (2017)
  • [33] Viswanath, B., Mislove, A., Cha, M., Gummadi, K.P.: On the evolution of user interaction in facebook. In: Proceedings of the 2nd ACM workshop on Online social networks. pp. 37–42. ACM (2009)
  • [34] Vosoughi, S., Roy, D., Aral, S.: The spread of true and false news online. Science 359(6380), 1146–1151 (2018)
  • [35] Wang, Y., Feng, Y., Hong, Z., Berger, R., Luo, J.: How polarized have we become? a multimodal classification of trump followers and clinton followers. In: International Conference on Social Informatics. pp. 440–456. Springer (2017)
  • [36] Yang, L., Li, Z., Giua, A.: Influence minimization in linear threshold networks. Automatica 100, 10–16 (2019)