
Continuous Profit Maximization: A Study of Unconstrained Dr-submodular Maximization

Jianxiong Guo, Weili Wu. J. Guo and W. Wu are with the Department of Computer Science, Erik Jonsson School of Engineering and Computer Science, University of Texas at Dallas, Richardson, TX, 75080 USA. E-mail: [email protected]. Manuscript received April 19, 2005; revised August 26, 2015.
Abstract

Profit maximization (PM) selects a subset of users as seeds for viral marketing in online social networks, balancing the cost of the seeds against the profit from the resulting influence spread. We extend PM to the setting of general marketing strategies and formulate the continuous profit maximization (CPM-MS) problem, whose domain is the integer lattice. The objective function of CPM-MS is dr-submodular but non-monotone. It is a typical instance of the unconstrained dr-submodular maximization (UDSM) problem, and taking it as a starting point, we study UDSM systematically in this paper, which differs substantially from existing research. First, we introduce the lattice-based double greedy algorithm, which obtains a constant approximation guarantee. However, it relies on a strict and unrealistic condition: the objective value must be non-negative on the whole domain, or else no theoretical bound holds. Thus, we propose a technique called lattice-based iterative pruning. It shrinks the search space effectively, thereby greatly increasing the possibility that the objective function is non-negative on this smaller domain, without losing the approximation ratio. Then, to overcome the difficulty of estimating the objective value of CPM-MS, we adopt reverse sampling strategies and combine them with lattice-based double greedy, including pruning, reducing the running time without losing performance. The entire process can be considered as a general framework for solving the UDSM problem, especially in social networks. Finally, we conduct experiments on several real datasets to evaluate the effectiveness and efficiency of our proposed algorithms.

Index Terms:
Continuous profit maximization, Social networks, Integer lattice, dr-submodular maximization, Sampling strategies, Approximation algorithm

I Introduction

Online social networks (OSNs) have become increasingly popular for exchanging ideas and making friends in recent years, accompanied by the rise of a series of social giants such as Twitter, Facebook, WeChat, and LinkedIn. People tend to share what they see and hear and to discuss hot issues on these social platforms instead of through traditional channels. Many companies and advertisers exploit them to spread their products, opinions, or innovations. By offering influential users free or discounted samples, information can spread across the whole network through the word-of-mouth effect [1] [2]. Inspired by this, the influence maximization (IM) problem [3] was formulated, which selects a subset of users (seed set) to maximize the expected follow-up adoptions (influence spread) for a given information cascade. In Kempe et al.'s seminal work [3], IM was defined on two basic discrete diffusion models, the independent cascade model (IC-model) and the linear threshold model (LT-model), both of which can be generalized to the triggering model. They proved that the IM problem is NP-hard and obtained a (1-1/e)-approximation under the IC/LT-model by a simple hill-climbing algorithm within the framework of monotonicity and submodularity.

Since this seminal work, plenty of related problems based on IM have emerged for different scenarios [4] [5]. Among them, profit maximization (PM) [6] [7] [8] [9] [10] is the most representative and widely used one. Consider viral marketing for a given product: the gain is the influence spread generated by the selected seed set in a social network. However, it is not free to activate the users in this seed set. For instance, in a real advertising scenario, discounts and rewards are usually adopted to improve users' desire to purchase and to stimulate consumption. Thus, the net profit equals the influence spread minus the expense of the seed set, where more incentives do not imply more benefit. Tang et al. [8] proved that the objective function of PM is submodular but not monotone, and that the double greedy algorithm achieves a (1/2)-approximation if the objective value is non-negative. Before this, Kempe et al. [11] proposed generalized marketing strategies instead of a seed set. A marketing strategy is denoted by \boldsymbol{x}\in\mathbb{Z}^{d}_{+}, where a user u is activated as a seed with probability h_{u}(\boldsymbol{x}). Thus, the seed set is not deterministic but is activated probabilistically according to a marketing strategy. In this paper, we propose the continuous profit maximization under general marketing strategies (CPM-MS) problem, which aims to choose the optimal marketing vector \boldsymbol{x}^{*}\preceq\boldsymbol{b} such that the net profit is maximized. Each component \boldsymbol{x}(i) of \boldsymbol{x} stands for the investment in marketing action M_{i}. In practice, in order to promote its products, a company often adopts multiple marketing techniques, such as advertisements, discounts, cashback, and propaganda, whose effects differ across customers at different levels. Therefore, CPM-MS is much more general than traditional PM.

In this paper, after formulating our CPM-MS problem, we first discuss its properties. We show that the CPM-MS problem is NP-hard, and that, given a marketing vector \boldsymbol{x}, it is #P-hard to compute the expected profit exactly. Because of this difficulty, we give an equivalent method that runs Monte-Carlo simulations on a constructed graph. Then, we prove that the objective function of the CPM-MS problem is dr-submodular but not monotone. Extended from set functions to vector functions on the integer lattice, dr-submodularity is a diminishing return property. For unconstrained submodular maximization (USM), Buchbinder et al. [12] proposed a randomized double greedy algorithm that achieves a tight (1/2)-approximation ratio. Inspired by USM, we consider our CPM-MS problem as an instance of unconstrained dr-submodular maximization (UDSM). Here, we introduce a lattice-based double greedy algorithm for UDSM, and a (1/2)-approximation can be obtained as well if the objective value is non-negative. The marketing vector \boldsymbol{x} is defined on \boldsymbol{0}\preceq\boldsymbol{x}\preceq\boldsymbol{b}, thus this approximation can be guaranteed only when the sum of the objective values at \boldsymbol{0} and \boldsymbol{b} is not less than zero, which is hard to satisfy in real applications. Imagine giving all marketing actions full investment: is it still profitable? The answer is no. To overcome this defect, we design a lattice-based iterative pruning technique. It shrinks the search space gradually in an iterative manner, and then we initialize our lattice-based double greedy with this smaller search space. With this revised process, the objective values on the smaller space are very likely to be non-negative, thereby greatly increasing the applicability of our algorithm's approximation guarantee. As mentioned earlier, even though we can use Monte-Carlo simulations to estimate the expected profit, their time complexity is too high, which restricts scalability. Here, based on reverse influence sampling (RIS) [13] [14] [15] [16], we design an unbiased estimator for the profit function, which can estimate the objective value of a given marketing vector accurately if the number of samples is large enough. Next, we take this estimator as our new objective function, combine it with lattice-based pruning and double greedy, and finally propose the DG-IP-RIS algorithm. It guarantees a (1/2-\varepsilon)-approximation under a weak condition, and its time complexity is improved significantly. Finally, we conduct several experiments to evaluate the superiority of our proposed DG-IP-RIS algorithm over other heuristic algorithms and compare their running times, especially with the help of reverse sampling, which strongly supports the effectiveness and efficiency of our approaches.

Organization: Sec. II discusses the related work. Sec. III introduces the diffusion model and formulates the problem based on it. The properties and computability of our CPM-MS problem are presented in Sec. IV. Sec. V contains the main contributions, including the lattice-based double greedy and pruning algorithms. Sec. VI analyzes the time complexity and designs speedup algorithms based on sampling strategies. Experiments and discussions are presented in Sec. VII, and Sec. VIII concludes the paper.

II Related Work

Influence Maximization: Kempe et al. [3] formulated IM as a combinatorial optimization problem, generalized the IC-model and LT-model to the triggering model, and proposed a greedy algorithm with a (1-1/e-\varepsilon)-approximation by adopting Monte-Carlo simulations. Given a seed set, Chen et al. proved that computing its exact influence spread under the IC-model [17] and the LT-model [18] is #P-hard, and they designed two heuristic algorithms for IM under the IC-model [17] and LT-model [18], which reduce the computational overhead effectively. Borgs et al. [13] first used RIS to estimate the influence spread; subsequently, many researchers utilized RIS to design efficient algorithms with (1-1/e-\varepsilon)-approximation. Tang et al. [14] proposed the TIM/TIM+ algorithms, which improve upon Borgs et al.'s method in both accuracy and time complexity. Then, they developed a more efficient algorithm, IMM [15], based on martingale analysis. Nguyen et al. [19] designed SSA/D-SSA and claimed that it reduces the running time significantly without losing the approximation ratio, although this claim has been questioned by other researchers. Recently, Tang et al. [16] created an online process for IM that can be terminated at any time and returns a solution with an approximation guarantee.

Profit Maximization: Domingos et al. [1] [2] were the first to study viral marketing systematically; they proposed the notion of customers' value and utilized Markov random fields to model the process of viral marketing. Lu et al. [6] distinguished between influence and actual adoption, and designed a decision-making process to explain how a product is adopted. Zhang et al. [7] studied the problem of distributing a limited budget across multiple products so as to maximize the total profit. Tang et al. [8] analyzed and solved the USM problem thoroughly by the double greedy algorithm with PM as the background, and proposed an iterative pruning technique, which differs from our pruning process because our objective function is defined on the integer lattice. Tong et al. [9] considered coupon allocation in the PM problem and designed efficient randomized algorithms that achieve a (1/2-\varepsilon)-approximation with high probability. Guo et al. [10] proposed a budgeted coupon problem whose domain is constrained, and provided a continuous double greedy algorithm with a valid approximation.

(Dr-)submodular maximization: Nemhauser et al. [20] [21] initiated the study of monotone submodular maximization and laid its theoretical foundation, of which the IM problem is the most relevant application. However, PM is submodular but not monotone, which is a case of USM [22] [12]. Feige et al. [22] pointed out that no approximation algorithm exists for general USM without an additional assumption, and they developed a deterministic local search with (1/3)-approximation and a randomized local search with (2/5)-approximation for maximizing non-negative submodular functions. Assuming non-negativity as well, Buchbinder et al. [12] further improved this to a (1/2)-approximation with much lower computational complexity. Soma et al. [23] were the first to generalize the diminishing return property to the integer lattice, and solved the submodular cover problem with a bicriteria approximation algorithm. Then, they [24] studied the monotone dr-submodular maximization problem exhaustively, designing algorithms with (1-1/e)-approximation under cardinality, polymatroid, and knapsack constraints. Even though the UDSM problem was discussed in [25], their techniques and backgrounds are very different from ours. Therefore, how to solve UDSM, especially when non-negativity cannot be satisfied, is still an open problem, which is the main contribution of this paper.

III Problem Formulation

In this section, we provide some preliminaries for the rest of this paper and formulate our continuous profit maximization problem under general marketing strategies.

III-A Influence model

An OSN can be abstracted as a directed graph G=(V,E) where V=\{v_{1},v_{2},\cdots,v_{n}\} is the set of n nodes (users) and E=\{e_{1},e_{2},\cdots,e_{m}\} is the set of m edges (relationships between users). We default |V|=n and |E|=m when given G=(V,E). For each directed edge (u,v)\in E, we say v is an outgoing neighbor of u, and u is an incoming neighbor of v. For any node u\in V, let N^{-}(u) denote its set of incoming neighbors and N^{+}(u) denote its set of outgoing neighbors. In the process of influence diffusion, we consider a user active if she accepts (is activated by) the information cascade from her neighbors or is successfully selected as a seed. To model influence diffusion, Kempe et al. [3] proposed two classical models, the IC-model and the LT-model.

Let S\subseteq V be a seed set and S_{i}\subseteq V be the set of all active nodes at time step t_{i}. The influence diffusion initiated by S can be represented by a discrete-time stochastic process. At time step t_{0}, all nodes in S are activated, so we have S_{0}:=S. Under the IC-model, there is a diffusion probability p_{uv}\in(0,1] associated with each edge (u,v)\in E. We first set S_{i}:=S_{i-1} at time step t_{i} (i\geq 1); then, each node u\in S_{i-1}\backslash S_{i-2}, activated for the first time at time step t_{i-1}, has one chance to activate each of its inactive outgoing neighbors v with probability p_{uv}. We add v into S_{i} if u activates v successfully at t_{i}. Under the LT-model, each edge (u,v)\in E has a weight b_{uv} with \sum_{u\in N^{-}(v)}b_{uv}\leq 1, and each node v\in V has a threshold \theta_{v} sampled uniformly from [0,1]. We first set S_{i}:=S_{i-1} at time step t_{i} (i\geq 1); then, each inactive node v\in V\backslash S_{i-1} is activated if \sum_{u\in S_{i-1}\cap N^{-}(v)}b_{uv}\geq\theta_{v}. We add v into S_{i} if v is activated successfully at t_{i}. The influence diffusion terminates when no more inactive nodes can be activated. In this paper, we consider the triggering model, of which the IC-model and LT-model are special cases.

Definition 1 (Triggering Model [3]).

Each node v selects a triggering set T_{v} randomly and independently according to a distribution \mathcal{D}_{v} over the subsets of N^{-}(v). We first set S_{i}:=S_{i-1} at time step t_{i} (i\geq 1); then, an inactive node v\in V\backslash S_{i-1} can be activated if at least one node in T_{v} was activated at t_{i-1}. We add v into S_{i} if v is activated successfully at t_{i}. The influence diffusion terminates when no more inactive nodes can be activated.

From the above, a triggering model can be defined as \Omega=(G,\mathcal{D}), where \mathcal{D}=\{\mathcal{D}_{v_{1}},\mathcal{D}_{v_{2}},\cdots,\mathcal{D}_{v_{n}}\} is the set of distributions over the subsets of each N^{-}(v_{i}).
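To make the diffusion process concrete, the following is a minimal Python sketch (not part of the paper) of the triggering model instantiated with the IC-model: each incoming neighbor u of v joins v's triggering set T_v independently with probability p_{uv}. The graph containers (in_neighbors, out_neighbors, p) and function names are illustrative assumptions.

```python
import random

def sample_triggering_sets(in_neighbors, p):
    # IC-model instance of D_v: u joins T_v independently w.p. p[(u, v)]
    T = {}
    for v, preds in in_neighbors.items():
        T[v] = {u for u in preds if random.random() < p[(u, v)]}
    return T

def diffuse(seeds, out_neighbors, T):
    """Run the triggering-model diffusion from `seeds` under sampled sets T."""
    active, frontier = set(seeds), set(seeds)
    while frontier:
        new_frontier = set()
        for u in frontier:
            for v in out_neighbors.get(u, []):
                # v becomes active if one of its active in-neighbors lies in T_v
                if v not in active and u in T.get(v, set()):
                    active.add(v)
                    new_frontier.add(v)
        frontier = new_frontier
    return active  # |active| is I_g(S) for this realization g
```

Averaging len(diffuse(S, ...)) over many sampled triggering sets approximates the expected influence spread \sigma_{\Omega}(S) defined in (2) below.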

III-B Realization

For each node v\in V, under the IC-model, each node u\in N^{-}(v) appears in v's random triggering set T_{v} with probability p_{uv} independently. Under the LT-model, at most one node can appear in T_{v}; thus, for each node u\in N^{-}(v), T_{v}=\{u\} with probability b_{uv} exclusively, and T_{v}=\emptyset with probability 1-\sum_{u\in N^{-}(v)}b_{uv}. Now, we can define the realization (possible world) g of graph G under the triggering model \Omega=(G,\mathcal{D}) as follows:

Definition 2 (Realization).

Given a directed graph G=(V,E) and a triggering model \Omega=(G,\mathcal{D}), a realization g=\{T_{v_{1}},T_{v_{2}},\cdots,T_{v_{n}}\} of G is a set of triggering sets sampled from the distributions \mathcal{D}, denoted by g\sim\Omega. For each node v\in V, we have T_{v}\sim\mathcal{D}_{v}, respectively.

If a node u appears in v's triggering set, i.e., u\in T_{v}, we say edge (u,v) is live; otherwise, edge (u,v) is blocked. Thus, a realization g can be regarded as a subgraph of G, namely the graph remaining after removing the blocked edges. Let \Pr[g|g\sim\Omega] be the probability of realization g of G sampled from distribution \mathcal{D}, that is,

\Pr[g|g\sim\Omega]=\prod_{i=1}^{n}\Pr[T_{v_{i}}|T_{v_{i}}\sim\mathcal{D}_{v_{i}}] (1)

where \Pr[T_{v_{i}}|T_{v_{i}}\sim\mathcal{D}_{v_{i}}] is the probability of T_{v_{i}} being sampled from \mathcal{D}_{v_{i}}. Under the IC-model, \Pr[T_{v}|T_{v}\sim\mathcal{D}_{v}]=\prod_{u\in T_{v}}p_{uv}\prod_{u\in N^{-}(v)\backslash T_{v}}(1-p_{uv}); under the LT-model, \Pr[T_{v}=\{u\}|T_{v}\sim\mathcal{D}_{v}]=b_{uv} for each u\in N^{-}(v) and \Pr[T_{v}=\emptyset|T_{v}\sim\mathcal{D}_{v}]=1-\sum_{u\in N^{-}(v)}b_{uv}.

Given a seed set S\subseteq V, we let I_{\Omega}(S) be a random variable denoting the number of active nodes (influence spread) when the influence diffusion of S terminates under the triggering model \Omega=(G,\mathcal{D}). The number of nodes reachable from at least one node in S under a realization g, g\sim\Omega, is denoted by I_{g}(S). Thus, the expected influence spread \sigma_{\Omega}(S) is defined as

\sigma_{\Omega}(S)=\mathbb{E}_{g\sim\Omega}[I_{g}(S)]=\sum_{g\sim\Omega}\Pr[g]\cdot I_{g}(S) (2)

which is the weighted average of the influence spread over all possible graph realizations. The IM problem aims to find a seed set S with |S|\leq k that maximizes the expected influence spread \sigma_{\Omega}(S).

Theorem 1 ([3]).

Under a triggering model Ω=(G,𝒟)\Omega=(G,\mathcal{D}), the expected influence spread σΩ(S)\sigma_{\Omega}(S) is monotone and submodular with respect to seed set SS.

III-C Problem Definition

Under general marketing strategies, the definition of the IM problem differs from the above [3]. Let \mathbb{Z}^{d}_{+} be the collection of non-negative integer vectors. A marketing strategy can be denoted by a d-dimensional vector \boldsymbol{x}=(x_{1},x_{2},\cdots,x_{d})\in\mathbb{Z}^{d}_{+}, and we call it a "marketing vector". Each component \boldsymbol{x}(i)\in\mathbb{Z}_{+}, i\in[d]=\{1,2,\cdots,d\}, is the number of investment units assigned to marketing action M_{i}. For example, \boldsymbol{x}(i)=b means that marketing strategy \boldsymbol{x} assigns b investment units to marketing action M_{i}. Given a marketing vector \boldsymbol{x}, the probability that node u\in V is activated as a seed is given by the strategy function h_{u}(\boldsymbol{x}), where h_{u}(\boldsymbol{x})\in[0,1]. Thus, unlike the standard IM problem, the selection of the seed set is not deterministic but stochastic. Given a marketing vector \boldsymbol{x}, the probability of a seed set S sampled from \boldsymbol{x} is

\Pr[S|S\sim\boldsymbol{x}]=\prod_{u\in S}h_{u}(\boldsymbol{x})\cdot\prod_{v\in V\backslash S}(1-h_{v}(\boldsymbol{x})) (3)

where \Pr[S|S\sim\boldsymbol{x}] is the probability that exactly the nodes in S are selected as seeds and the nodes not in S are not, under the marketing strategy \boldsymbol{x}, because each node is selected as a seed independently. Thus, the expected influence spread \mu_{\Omega}(\boldsymbol{x}) of marketing vector \boldsymbol{x} under the triggering model \Omega=(G,\mathcal{D}) can be formulated as

\mu_{\Omega}(\boldsymbol{x}) =\sum_{S\subseteq V}\Pr[S|S\sim\boldsymbol{x}]\cdot\sigma_{\Omega}(S) (4)
=\sum_{S\subseteq V}\sigma_{\Omega}(S)\cdot\prod_{u\in S}h_{u}(\boldsymbol{x})\cdot\prod_{v\in V\backslash S}(1-h_{v}(\boldsymbol{x})) (5)

As we know, the benefit is the gain obtained from the influence spread, and the cost is the price paid for the marketing strategy. Here, we assume each unit of marketing action M_{i}, i\in[d], is associated with a cost c_{i}\in\mathbb{R}_{+}. Then, the total cost function c:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}_{+} can be defined as c(\boldsymbol{x})=\sum_{i\in[d]}c_{i}\cdot\boldsymbol{x}(i). For simplicity, we consider the expected influence spread as our benefit. Thus, the expected profit f_{\Omega}(\boldsymbol{x}) obtained from marketing strategy \boldsymbol{x} is the expected influence spread of \boldsymbol{x} minus the cost of \boldsymbol{x}, that is

f_{\Omega}(\boldsymbol{x})=\mu_{\Omega}(\boldsymbol{x})-c(\boldsymbol{x}) (6)

where c(\boldsymbol{x})=\sum_{i\in[d]}c_{i}\cdot\boldsymbol{x}(i). Therefore, the continuous profit maximization under general marketing strategies (CPM-MS) problem is formulated as follows:

Problem 1 (CPM-MS).

Given a triggering model Ω=(G,𝒟)\Omega=(G,\mathcal{D}), a constraint vector 𝐛+d\boldsymbol{b}\in\mathbb{Z}^{d}_{+} and a cost function c:+d+c:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}_{+}, the CPM-MS problem aims to find an optimal marketing vector 𝐱𝐛\boldsymbol{x}^{*}\preceq\boldsymbol{b} that maximizes its expected profit fΩ(𝐱)f_{\Omega}(\boldsymbol{x}). That is, 𝐱=argmax𝐱𝐛fΩ(𝐱)\boldsymbol{x}^{*}=\arg\max_{\boldsymbol{x}\preceq\boldsymbol{b}}f_{\Omega}(\boldsymbol{x}).
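As an illustration of the probabilistic seeding in (3) and of the two sources of randomness discussed later in Sec. IV-B, the following sketch (hypothetical helper names; it reuses sample_triggering_sets and diffuse from the sketch in Sec. III-A) samples a seed set from a marketing vector and estimates the profit f_{\Omega}(\boldsymbol{x})=\mu_{\Omega}(\boldsymbol{x})-c(\boldsymbol{x}) naively.

```python
import random

# Hypothetical sketch, not the authors' implementation: h[u] is the strategy
# function h_u(.), c is the list of unit costs, x is the marketing vector.
def sample_seed_set(nodes, h, x):
    # Eq. (3): each user u is selected as a seed independently w.p. h_u(x)
    return {u for u in nodes if random.random() < h[u](x)}

def naive_profit(nodes, in_neighbors, out_neighbors, p, h, x, c, r=1000):
    total = 0.0
    for _ in range(r):
        T = sample_triggering_sets(in_neighbors, p)   # realization g ~ Omega
        S = sample_seed_set(nodes, h, x)              # seed set S ~ x
        total += len(diffuse(S, out_neighbors, T))    # influence spread I_g(S)
    return total / r - sum(ci * xi for ci, xi in zip(c, x))   # mu(x) - c(x)
```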

IV Properties of CPM-MS

In this section, we introduce submodularity on the integer lattice, and then analyze the submodularity and computability of our CPM-MS problem.

IV-A Submodularity on Integer Lattice

Generally, defined on sets, a set function \alpha:2^{V}\rightarrow\mathbb{R} is monotone if \alpha(S)\leq\alpha(T) for any S\subseteq T\subseteq V, and submodular if \alpha(S)+\alpha(T)\geq\alpha(S\cup T)+\alpha(S\cap T). The submodularity of a set function implies a diminishing return property, thus \alpha(S\cup\{u\})-\alpha(S)\geq\alpha(T\cup\{u\})-\alpha(T) for any S\subseteq T\subseteq V and u\notin T. These two definitions of submodularity for set functions are equivalent. Defined on the integer lattice, a vector function \beta:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R} is monotone if \beta(\boldsymbol{s})\leq\beta(\boldsymbol{t}) for any \boldsymbol{s}\preceq\boldsymbol{t}\in\mathbb{Z}^{d}_{+}, and submodular if \beta(\boldsymbol{s})+\beta(\boldsymbol{t})\geq\beta(\boldsymbol{s}\lor\boldsymbol{t})+\beta(\boldsymbol{s}\land\boldsymbol{t}) for any \boldsymbol{s},\boldsymbol{t}\in\mathbb{Z}^{d}_{+}, where (\boldsymbol{s}\lor\boldsymbol{t})(i)=\max\{\boldsymbol{s}(i),\boldsymbol{t}(i)\} and (\boldsymbol{s}\land\boldsymbol{t})(i)=\min\{\boldsymbol{s}(i),\boldsymbol{t}(i)\}. Here, \boldsymbol{s}\preceq\boldsymbol{t} means \boldsymbol{s}(i)\leq\boldsymbol{t}(i) for each component i\in[d]. Besides, a vector function is diminishing return submodular (dr-submodular) if \beta(\boldsymbol{s}+\boldsymbol{e}_{i})-\beta(\boldsymbol{s})\geq\beta(\boldsymbol{t}+\boldsymbol{e}_{i})-\beta(\boldsymbol{t}) for any \boldsymbol{s}\preceq\boldsymbol{t} and i\in[d], where \boldsymbol{e}_{i}\in\mathbb{Z}^{d}_{+} is the i-th unit vector with the i-th component being 1 and the others being 0. Unlike the set-function case, a submodular vector function \beta is not necessarily dr-submodular, but the opposite is true. Thus, dr-submodularity is stronger than submodularity on the integer lattice.
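As a small illustration of these lattice definitions, the sketch below (toy code assumed for this paper's setting, not taken from it) implements the join and meet operations and checks the dr-submodular inequality by brute force for a hypothetical function on a small box.

```python
import numpy as np

def join(s, t):
    return np.maximum(s, t)   # (s v t)(i) = max{s(i), t(i)}

def meet(s, t):
    return np.minimum(s, t)   # (s ^ t)(i) = min{s(i), t(i)}

def is_dr_submodular(beta, b, tol=1e-9):
    """Brute-force check of beta(s+e_i)-beta(s) >= beta(t+e_i)-beta(t) for all s <= t <= b."""
    d = len(b)
    grid = np.stack(np.meshgrid(*[np.arange(bi + 1) for bi in b], indexing="ij"), -1).reshape(-1, d)
    for s in grid:
        for t in grid:
            if not np.all(s <= t):
                continue
            for i in range(d):
                if t[i] + 1 > b[i]:
                    continue
                e = np.zeros(d, dtype=int); e[i] = 1
                if beta(s + e) - beta(s) < beta(t + e) - beta(t) - tol:
                    return False
    return True

# Example: a concave function of the total investment, which is dr-submodular.
beta = lambda x: np.sqrt(np.sum(x))
print(is_dr_submodular(beta, np.array([2, 2])))   # True
```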

Lemma 1.

Given a set function \alpha:2^{V}\rightarrow\mathbb{R} and a vector function \beta:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, suppose they satisfy

\beta(\boldsymbol{x})=\sum_{S\subseteq V}\alpha(S)\cdot\prod_{u\in S}h_{u}(\boldsymbol{x})\cdot\prod_{v\in V\backslash S}(1-h_{v}(\boldsymbol{x})) (7)

If α()\alpha(\cdot) is monotone and submodular and hu()h_{u}(\cdot) is monotone and dr-submodular for each uVu\in V, then β()\beta(\cdot) is monotone and dr-submodular.

Proof.

It is a corollary implied indirectly by the proof process of Section 7 in [11] and by [26]. ∎

Theorem 2.

Given a triggering model Ω=(G,𝒟)\Omega=(G,\mathcal{D}), the profit function fΩ()f_{\Omega}(\cdot) is dr-submodular, but not monotone.

Proof.

From Lemma 1, Theorem 1, and Equation (5), we know that the expected influence spread \mu_{\Omega}(\cdot) is monotone and dr-submodular because \sigma_{\Omega}(\cdot) is monotone and submodular. Thus, for any \boldsymbol{x}\preceq\boldsymbol{y}\in\mathbb{Z}^{d}_{+}, we have f_{\Omega}(\boldsymbol{x}+\boldsymbol{e}_{i})-f_{\Omega}(\boldsymbol{x})=\mu_{\Omega}(\boldsymbol{x}+\boldsymbol{e}_{i})-\mu_{\Omega}(\boldsymbol{x})-c_{i}\geq\mu_{\Omega}(\boldsymbol{y}+\boldsymbol{e}_{i})-\mu_{\Omega}(\boldsymbol{y})-c_{i}=f_{\Omega}(\boldsymbol{y}+\boldsymbol{e}_{i})-f_{\Omega}(\boldsymbol{y}). Thus, f_{\Omega}(\cdot) is dr-submodular. ∎

IV-B Computability

Given a seed set S\subseteq V, it is #P-hard to compute the expected influence spread \sigma_{\Omega}(S) under the IC-model [17] and the LT-model [18]. Assume that a marketing vector \boldsymbol{x}\in\{0,1\}^{n} and h_{u}(\boldsymbol{x})=\boldsymbol{x}(u) for u\in V, where user u is a seed if and only if \boldsymbol{x}(u)=1. According to Equation (4), the expected influence spread \mu_{\Omega}(\boldsymbol{x}) is equal to \sigma_{\Omega}(S) where S=\{u\in V:\boldsymbol{x}(u)=1\}. Thereby, given a marketing vector \boldsymbol{x}, computing the expected influence spread \mu_{\Omega}(\boldsymbol{x}) is #P-hard as well under the IC-model and LT-model. A natural question, then, is how to estimate the value of \mu_{\Omega}(\boldsymbol{x}) for a given \boldsymbol{x} effectively. To estimate \mu_{\Omega}(\boldsymbol{x}), we usually adopt Monte-Carlo simulations. However, it is inconvenient to use such a method directly here, because the randomness comes from two parts: one from the seed selection and the other from the process of influence diffusion. Therefore, we need to design a simpler and more efficient method.

First, we establish an equivalent relationship between \sigma_{\Omega}(\cdot) and \mu_{\Omega}(\cdot). Given a social network G=(V,E) and a marketing vector \boldsymbol{x}\in\mathbb{Z}^{d}_{+}, we create a constructed graph \widetilde{G}=(\widetilde{V},\widetilde{E}) by adding to G a new node \widetilde{u} and a new directed edge (\widetilde{u},u) for each node u\in V. Taking the IC-model for instance, the diffusion probability of this new edge (\widetilde{u},u) can be set as p_{\widetilde{u}u}=h_{u}(\boldsymbol{x}). Then, we have

\mu(\boldsymbol{x}|G)=\sigma(\widetilde{V}-V|\widetilde{G})-|V| (8)

where \mu(\cdot|G) and \sigma(\cdot|\widetilde{G}) indicate that they are computed on the graph G and the constructed graph \widetilde{G}, respectively.

Theorem 3.

Given a social network G=(V,E)G=(V,E) and a marketing vector 𝐱+d\boldsymbol{x}\in\mathbb{Z}^{d}_{+}, the expected influence spread μΩ(𝐱)\mu_{\Omega}(\boldsymbol{x}) can be estimated with (γ,δ)(\gamma,\delta)-approximation by Monte-Carlo simulations in O((m+3n)n2ln(2/δ)2(γuVhu(𝐱))2)O\left(\frac{(m+3n)n^{2}\ln(2/\delta)}{2(\gamma\sum_{u\in V}h_{u}(\boldsymbol{x}))^{2}}\right) running time.

Proof.

As mentioned above, we can compute \sigma(\widetilde{V}-V|\widetilde{G}) on the constructed graph instead of \mu(\boldsymbol{x}|G) on the original graph. Let S=\widetilde{V}-V; the value of \sigma_{\Omega}(S) can be estimated by Monte-Carlo simulations according to (2). Based on Hoeffding's inequality, we note that

\Pr\left[\left|\hat{\sigma}_{\Omega}(S)-\sigma_{\Omega}(S)\right|\geq\gamma(\sigma_{\Omega}(S)-n)\right]\leq 2e^{-2r\left(\frac{\gamma(\sigma_{\Omega}(S)-n)}{n}\right)^{2}}

where rr is the number of Monte-Carlo simulations and σΩ(S)nn\sigma_{\Omega}(S)-n\leq n. We have σΩ(S)nuVhu(𝒙)\sigma_{\Omega}(S)-n\geq\sum_{u\in V}h_{u}(\boldsymbol{x}), and to achieve a (γ,δ)(\gamma,\delta)-estimation, the number of Monte-Carlo simulations rn2ln(2/δ)2(γuVhu(𝒙))2r\geq\frac{n^{2}\ln(2/\delta)}{2(\gamma\sum_{u\in V}h_{u}(\boldsymbol{x}))^{2}}. For each iteration of simulations in the constructed graph, it takes O(m+3n)O(m+3n) running time. Thus, we can obtain a (γ,δ)(\gamma,\delta)-approximation of μΩ(𝒙)\mu_{\Omega}(\boldsymbol{x}) in O((m+3n)n2ln(2/δ)2(γuVhu(𝒙))2)O\left(\frac{(m+3n)n^{2}\ln(2/\delta)}{2(\gamma\sum_{u\in V}h_{u}(\boldsymbol{x}))^{2}}\right) running time. ∎

Based on Theorem 3, we can obtain an accurate estimation of the objective function f_{\Omega}(\boldsymbol{x}) of the CPM-MS problem, shown in (6), by adjusting the parameters \gamma and \delta.
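The constructed-graph trick of (8) can be combined with plain Monte-Carlo simulation as in the following sketch, which reuses sample_triggering_sets and diffuse from the Sec. III-A sketch; the containers, h, and the virtual-node naming are assumptions of this illustration rather than the paper's code.

```python
# A minimal sketch of the Monte-Carlo estimation behind Theorem 3:
# mu(x|G) = sigma(V~ - V | G~) - |V|, Eq. (8), under the IC-model.
def estimate_mu(nodes, in_neighbors, out_neighbors, p, h, x, r=10000):
    # Build the constructed graph G~: one virtual node per user u with a new
    # edge (virtual(u), u) whose diffusion probability is h_u(x).
    in_nb = {v: list(preds) for v, preds in in_neighbors.items()}
    out_nb = {u: list(succ) for u, succ in out_neighbors.items()}
    prob = dict(p)
    seeds = []
    for u in nodes:
        virt = ("virt", u)
        in_nb.setdefault(u, []).append(virt)
        out_nb[virt] = [u]
        prob[(virt, u)] = h[u](x)
        seeds.append(virt)
    total = 0.0
    for _ in range(r):
        T = sample_triggering_sets(in_nb, prob)    # one realization of G~
        total += len(diffuse(seeds, out_nb, T))    # spread of V~ - V on it
    return total / r - len(nodes)                  # subtract the |V| virtual seeds

# The profit is then estimated as estimate_mu(...) - sum_i c_i * x(i), as in (6).
```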

V Algorithms Design

From the last section, we know that the objective function of CPM-MS is dr-submodular but not monotone. In this section, we develop new methods based on the double greedy algorithm [12] for our CPM-MS and obtain an optimal approximation ratio.

V-A Lattice-based Double Greedy

For non-negative submodular functions, Buchbinder et al. [12] designed a double greedy algorithm that solves the USM problem with a tight theoretical guarantee. Under the deterministic setting, the double greedy algorithm has a (1/3)-approximation, while under the randomized setting it has a (1/2)-approximation. Extending from sets to the integer lattice, we derive a revised double greedy algorithm suitable for dr-submodular functions, namely for the UDSM problem. We adopt the randomized setting, and the lattice-based double greedy algorithm is shown in Algorithm 1. We omit the subscript of f_{\Omega}(\cdot) and denote it by f(\cdot) from now on.

Algorithm 1 Lattice-basedDoubleGreedy
0:  f:+df:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, [𝒔,𝒕][\boldsymbol{s},\boldsymbol{t}] where 𝒔𝒕+d\boldsymbol{s}\preceq\boldsymbol{t}\in\mathbb{Z}_{+}^{d}
0:  𝒙+d\boldsymbol{x}\in\mathbb{Z}^{d}_{+}
1:  Initialize: 𝒙𝒔\boldsymbol{x}\leftarrow\boldsymbol{s}, 𝒚𝒕\boldsymbol{y}\leftarrow\boldsymbol{t}
2:  for i[d]i\in[d] do
3:     while 𝒙(i)<𝒚(i)\boldsymbol{x}(i)<\boldsymbol{y}(i) do
4:        af(𝒆i|𝒙)a\leftarrow f(\boldsymbol{e}_{i}|\boldsymbol{x}) and bf(𝒆i|𝒚)b\leftarrow f(-\boldsymbol{e}_{i}|\boldsymbol{y})
5:        amax{a,0}a^{\prime}\leftarrow\max\{a,0\} and bmax{b,0}b^{\prime}\leftarrow\max\{b,0\}
6:        rUniform(0,1)r\leftarrow\text{Uniform}(0,1)
7:        (Note: we set a/(a+b)=1a^{\prime}/(a^{\prime}+b^{\prime})=1 if a=b=0a^{\prime}=b^{\prime}=0)
8:        if ra/(a+b)r\leq a^{\prime}/(a^{\prime}+b^{\prime}) then
9:           𝒙𝒙+𝒆i\boldsymbol{x}\leftarrow\boldsymbol{x}+\boldsymbol{e}_{i} and 𝒚𝒚\boldsymbol{y}\leftarrow\boldsymbol{y}
10:        else
11:           𝒚𝒚𝒆i\boldsymbol{y}\leftarrow\boldsymbol{y}-\boldsymbol{e}_{i} and 𝒙𝒙\boldsymbol{x}\leftarrow\boldsymbol{x}
12:        end if
13:     end while
14:  end for
15:  return  𝒙(=𝒚)\boldsymbol{x}(=\boldsymbol{y}) 

Here, we denote by f(\boldsymbol{e}_{i}|\boldsymbol{x})=f(\boldsymbol{x}+\boldsymbol{e}_{i})-f(\boldsymbol{x}) the marginal gain of increasing component i\in[d] by 1. Generally, this algorithm is initialized with [\boldsymbol{0},\boldsymbol{b}], and for each component i\in[d], we increase \boldsymbol{x}(i) by 1 or decrease \boldsymbol{y}(i) by 1 in each inner (while) iteration until they are equal. The result returned by Algorithm 1 satisfies \boldsymbol{x}=\boldsymbol{y}. Then, we have the following conclusion, which can be inferred directly from the double greedy algorithm in [12]:
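For concreteness, a minimal Python sketch of Algorithm 1 is given below; f is assumed to be any callable on integer vectors (for CPM-MS, an estimator of the profit), and [s, t] is the initial box, either [\boldsymbol{0}, \boldsymbol{b}] or the pruned box produced by Algorithm 2 in the next subsection.

```python
import random

def lattice_double_greedy(f, s, t, rng=random.random):
    # Randomized lattice-based double greedy (a sketch of Algorithm 1).
    x, y = list(s), list(t)
    for i in range(len(s)):
        while x[i] < y[i]:
            # marginal gains f(e_i | x) and f(-e_i | y)
            a = f(x[:i] + [x[i] + 1] + x[i + 1:]) - f(x)
            b = f(y[:i] + [y[i] - 1] + y[i + 1:]) - f(y)
            a_p, b_p = max(a, 0.0), max(b, 0.0)
            prob = 1.0 if a_p + b_p == 0 else a_p / (a_p + b_p)
            if rng() <= prob:
                x[i] += 1       # raise the lower vector
            else:
                y[i] -= 1       # lower the upper vector
    return x                    # x == y on return
```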

Theorem 4.

For our CPM-MS problem, if we initialize Algorithm 1 with [\boldsymbol{0},\boldsymbol{b}] and f(\boldsymbol{0})+f(\boldsymbol{b})\geq 0 is satisfied, the marketing vector \boldsymbol{x}^{\circ} returned by Algorithm 1 is a (1/2)-approximate solution, such that

\mathbb{E}[f(\boldsymbol{x}^{\circ})]\geq(1/2)\cdot\max_{\boldsymbol{x}\preceq\boldsymbol{b}}f(\boldsymbol{x}) (9)

Here, f(\boldsymbol{x})\geq 0 for any \boldsymbol{x}\preceq\boldsymbol{b} is equivalent to saying f(\boldsymbol{0})+f(\boldsymbol{b})\geq 0, namely f(\boldsymbol{b})\geq 0 because f(\boldsymbol{0})=0, which is a natural inference from dr-submodularity.

Algorithm 2 Lattice-basedPruning
0:  f:+df:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, 𝒃+d\boldsymbol{b}\in\mathbb{Z}^{d}_{+}
0:  πt=[𝒈t,𝒉t]\pi_{t}=[\boldsymbol{g}_{t},\boldsymbol{h}_{t}]
1:  Initialize: 𝒈t𝟎\boldsymbol{g}_{t}\leftarrow\boldsymbol{0}, 𝒉t𝒃\boldsymbol{h}_{t}\leftarrow\boldsymbol{b}
2:  Initialize: t0t\leftarrow 0
3:  while 𝒈t𝒈t1\boldsymbol{g}_{t}\neq\boldsymbol{g}_{t-1} or 𝒉t𝒉t1\boldsymbol{h}_{t}\neq\boldsymbol{h}_{t-1} do
4:     for i[d]i\in[d] do
5:        if 𝒈t(i)=𝒉t(i)\boldsymbol{g}_{t}(i)=\boldsymbol{h}_{t}(i) then
6:           𝒈t+1(i)𝒈t(i)\boldsymbol{g}_{t+1}(i)\leftarrow\boldsymbol{g}_{t}(i)
7:           𝒉t+1(i)𝒉t(i)\boldsymbol{h}_{t+1}(i)\leftarrow\boldsymbol{h}_{t}(i)
8:           Continue
9:        end if
10:        if f(𝒆i|𝒉t𝒉t(i)𝒆i+𝒈t(i)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i})\leq 0 then
11:           𝒈t+1(i)𝒈t(i)\boldsymbol{g}_{t+1}(i)\leftarrow\boldsymbol{g}_{t}(i)
12:        else
13:           𝒈t+1(i)𝒈t(i)+max{k:f(𝒆i|𝒉t𝒉t(i)𝒆i+𝒈t(i)𝒆i+(k1)𝒆i)>0}\boldsymbol{g}_{t+1}(i)\leftarrow\boldsymbol{g}_{t}(i)+\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(k-1)\boldsymbol{e}_{i})>0\}, k{1,,𝒉t(i)𝒈t(i)}k\in\{1,\cdots,\boldsymbol{h}_{t}(i)-\boldsymbol{g}_{t}(i)\}
14:        end if
15:        if f(𝒆i|𝒈t)<0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t})<0 then
16:           𝒉t+1(i)𝒉t(i)\boldsymbol{h}_{t+1}(i)\leftarrow\boldsymbol{h}_{t}(i)
17:        else
18:           𝒉t+1(i)𝒈t(i)+max{k:f(𝒆i|𝒈t+(k1)𝒆i)0}\boldsymbol{h}_{t+1}(i)\leftarrow\boldsymbol{g}_{t}(i)+\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}+(k-1)\boldsymbol{e}_{i})\geq 0\}, k{0,,𝒉t(i)𝒈t(i)}k\in\{0,\cdots,\boldsymbol{h}_{t}(i)-\boldsymbol{g}_{t}(i)\}
19:        end if
20:     end for
21:     tt+1t\leftarrow t+1
22:  end while
23:  return  πt=[𝒈t,𝒉t]\pi_{t}=[\boldsymbol{g}_{t},\boldsymbol{h}_{t}]

V-B Lattice-based Iterative Pruning

According to Theorem 4, the (1/2)-approximation is based on the assumption that f(\boldsymbol{0})+f(\boldsymbol{b})\geq 0, which is almost impossible in many real application scenarios. It would mean that we can still gain profit when giving all marketing actions full investment, which is unrealistic for viral marketing. However, a valid approximation ratio cannot be obtained by Algorithm 1 when f(\boldsymbol{b})<0. To address this problem, Tang et al. [8] proposed a groundbreaking technique, called iterative pruning, to reduce the search space such that the objective is non-negative on this space, without losing the approximation guarantee. However, their technique is designed for submodular functions on set domains and cannot be applied to dr-submodular functions directly. Inspired by [8], we develop an iterative pruning technique suitable for dr-submodular functions in this section, which is a non-trivial transformation from sets to the integer lattice.

Given a dr-submodular function f(𝒙)f(\boldsymbol{x}) defined on 𝒙𝒃\boldsymbol{x}\preceq\boldsymbol{b}, we have two vectors 𝒈1\boldsymbol{g}_{1} and 𝒉1\boldsymbol{h}_{1}, such that: (1) 𝒈1(i)=0\boldsymbol{g}_{1}(i)=0 if f(𝒆i|𝒃𝒃(i)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i})\leq 0, or else 𝒈1(i)=max{k:f(𝒆i|𝒃𝒃(i)𝒆i+(k1)𝒆i)>0}\boldsymbol{g}_{1}(i)=\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i}+(k-1)\boldsymbol{e}_{i})>0\} for k{1,,𝒃(i)}k\in\{1,\cdots,\boldsymbol{b}(i)\}; (2) 𝒉1(i)=0\boldsymbol{h}_{1}(i)=0 if f(𝒆i|𝟎)<0f(\boldsymbol{e}_{i}|\boldsymbol{0})<0, or else 𝒉1(i)=max{k:f(𝒆i|𝟎+(k1)𝒆i)0}\boldsymbol{h}_{1}(i)=\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{0}+(k-1)\boldsymbol{e}_{i})\geq 0\} for k{1,,𝒃(i)}k\in\{1,\cdots,\boldsymbol{b}(i)\}.

Lemma 2.

We have 𝐠1𝐡1\boldsymbol{g}_{1}\preceq\boldsymbol{h}_{1}.

Proof.

For any component i[d]i\in[d], we have f(𝒆i|𝒃𝒃(i)𝒆i+𝒈1(i)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i}+\boldsymbol{g}_{1}(i)\boldsymbol{e}_{i})\leq 0 but f(𝒆i|𝒃𝒃(i)𝒆i+(𝒈1(i)1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i}+(\boldsymbol{g}_{1}(i)-1)\boldsymbol{e}_{i})>0; and f(𝒆i|𝟎+𝒉1(i)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{0}+\boldsymbol{h}_{1}(i)\boldsymbol{e}_{i})<0 but f(𝒆i|𝟎+(𝒉1(i)1)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{0}+(\boldsymbol{h}_{1}(i)-1)\boldsymbol{e}_{i})\geq 0. Because of dr-submodularity, it satisfies f(𝒆i|𝒃𝒃(i)𝒆i+𝒉1(i)𝒆i)f(𝒆i|𝟎+𝒉1(i)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i}+\boldsymbol{h}_{1}(i)\boldsymbol{e}_{i})\leq f(\boldsymbol{e}_{i}|\boldsymbol{0}+\boldsymbol{h}_{1}(i)\boldsymbol{e}_{i})<0. Thus, (𝒈1(i)1)<𝒉1(i)(\boldsymbol{g}_{1}(i)-1)<\boldsymbol{h}_{1}(i) and 𝒈1(i)𝒉1(i)\boldsymbol{g}_{1}(i)\leq\boldsymbol{h}_{1}(i). Subsequently, 𝒈1𝒉1\boldsymbol{g}_{1}\preceq\boldsymbol{h}_{1}. ∎

Then, we define a collection denoted by \pi_{1}=[\boldsymbol{g}_{1},\boldsymbol{h}_{1}] that contains all marketing vectors \boldsymbol{x} satisfying \boldsymbol{g}_{1}\preceq\boldsymbol{x}\preceq\boldsymbol{h}_{1}. Apparently, \pi_{1} is a subcollection of [\boldsymbol{0},\boldsymbol{b}].

Lemma 3.

All optimal solutions 𝐱\boldsymbol{x}^{*} that satisfy f(𝐱)=max𝐱𝐛f(𝐱)f(\boldsymbol{x}^{*})=\max_{\boldsymbol{x}\preceq\boldsymbol{b}}f(\boldsymbol{x}) are contained in the collection π1=[𝐠1,𝐡1]\pi_{1}=[\boldsymbol{g}_{1},\boldsymbol{h}_{1}], i.e., 𝐠1𝐱𝐡1\boldsymbol{g}_{1}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{1} for all 𝐱\boldsymbol{x}^{*}.

Proof.

For any component i\in[d], consider any vector \boldsymbol{x} with \boldsymbol{x}(i)<\boldsymbol{g}_{1}(i); we have f(\boldsymbol{e}_{i}|\boldsymbol{x})\geq f(\boldsymbol{e}_{i}|\boldsymbol{b}-\boldsymbol{b}(i)\boldsymbol{e}_{i}+\boldsymbol{x}(i)\boldsymbol{e}_{i})>0 because of dr-submodularity. Thereby, \boldsymbol{x}+\boldsymbol{e}_{i} has a larger profit than \boldsymbol{x} for sure, so the i-th component of the optimal marketing vector \boldsymbol{x}^{*} is at least \boldsymbol{x}(i)+1, which indicates \boldsymbol{x}^{*}(i)\geq\boldsymbol{g}_{1}(i). On the other hand, consider any vector \boldsymbol{x} with \boldsymbol{x}(i)\geq\boldsymbol{h}_{1}(i); we have f(\boldsymbol{e}_{i}|\boldsymbol{x})\leq f(\boldsymbol{e}_{i}|\boldsymbol{0}+\boldsymbol{x}(i)\boldsymbol{e}_{i})<0. Thereby, \boldsymbol{x}+\boldsymbol{e}_{i} has a smaller profit than \boldsymbol{x} for sure, so the i-th component of the optimal marketing vector \boldsymbol{x}^{*} is at most \boldsymbol{x}(i), which indicates \boldsymbol{x}^{*}(i)\leq\boldsymbol{h}_{1}(i). Thus, \boldsymbol{g}_{1}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{1}. ∎

From above, Lemma 3 determines a range for the optimal vector, thus reducing the search space. Then, the collection π1=[𝒈1,𝒉1]\pi_{1}=[\boldsymbol{g}_{1},\boldsymbol{h}_{1}] can be pruned further in an iterative manner. Now, the upper bound of the optimal vector is 𝒉1\boldsymbol{h}_{1}, i.e., 𝒙𝒉1\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{1}, hereafter, we are able to increase 𝒈1\boldsymbol{g}_{1} to 𝒈2\boldsymbol{g}_{2}, where 𝒈2(i)=𝒈1(i)\boldsymbol{g}_{2}(i)=\boldsymbol{g}_{1}(i) if f(𝒆i|𝒉1𝒉1(i)𝒆i+𝒈1(i)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{1}-\boldsymbol{h}_{1}(i)\boldsymbol{e}_{i}+\boldsymbol{g}_{1}(i)\boldsymbol{e}_{i})\leq 0, or else 𝒈2(i)=𝒈1(i)+max{k:f(𝒆i|𝒉1𝒉1(i)𝒆i+𝒈1(i)𝒆i+(k1)𝒆i)>0}\boldsymbol{g}_{2}(i)=\boldsymbol{g}_{1}(i)+\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{h}_{1}-\boldsymbol{h}_{1}(i)\boldsymbol{e}_{i}+\boldsymbol{g}_{1}(i)\boldsymbol{e}_{i}+(k-1)\boldsymbol{e}_{i})>0\} for k{1,,𝒉1(i)𝒈1(i)}k\in\{1,\cdots,\boldsymbol{h}_{1}(i)-\boldsymbol{g}_{1}(i)\}. The lower bound of the optimal vector is 𝒈1\boldsymbol{g}_{1}, i.e., 𝒙𝒈1\boldsymbol{x}^{*}\succeq\boldsymbol{g}_{1}, and similarly we are able to decrease 𝒉1\boldsymbol{h}_{1} to 𝒉2\boldsymbol{h}_{2}, where 𝒉2(i)=𝒉1(i)\boldsymbol{h}_{2}(i)=\boldsymbol{h}_{1}(i) if f(𝒆i|𝒈1)<0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{1})<0, or else 𝒉2(i)=𝒈1(i)+max{k:f(𝒆i|𝒈1+(k1)𝒆i)0}\boldsymbol{h}_{2}(i)=\boldsymbol{g}_{1}(i)+\max\{k:f(\boldsymbol{e}_{i}|\boldsymbol{g}_{1}+(k-1)\boldsymbol{e}_{i})\geq 0\} for k{1,,𝒉1(i)𝒈1(i)}k\in\{1,\cdots,\boldsymbol{h}_{1}(i)-\boldsymbol{g}_{1}(i)\}. In this process, it generates a more compressed collection π2=[𝒈2,𝒉2]\pi_{2}=[\boldsymbol{g}_{2},\boldsymbol{h}_{2}] than π1\pi_{1}. We repeat this process iteratively until 𝒈t\boldsymbol{g}_{t} and 𝒉t\boldsymbol{h}_{t} cannot be increased and decreased further. The Lattice-basedPruning algorithm is shown in Algorithm 2. The collection returned by Algorithm 2 is denoted by π=[𝒈,𝒉]\pi^{\circ}=[\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}].
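A minimal Python sketch of Algorithm 2 follows; the helper marginal is introduced only for this sketch, and for a dr-submodular f the max over k in the two update rules of Algorithm 2 can be found by scanning marginal gains until they change sign.

```python
def marginal(f, i, x):
    y = list(x); y[i] += 1
    return f(y) - f(x)          # f(e_i | x)

def lattice_pruning(f, b):
    d = len(b)
    g, h = [0] * d, list(b)
    while True:
        g_next, h_next = list(g), list(h)
        for i in range(d):
            if g[i] == h[i]:
                continue
            # raise g(i): scan marginals on top of h with component i reset to g(i)
            base = list(h); base[i] = g[i]
            while base[i] < h[i] and marginal(f, i, base) > 0:
                base[i] += 1
            g_next[i] = base[i]
            # lower h(i): scan marginals on top of g (the h-update rule of Algorithm 2)
            if marginal(f, i, g) >= 0:
                base = list(g)
                while base[i] < h[i] and marginal(f, i, base) >= 0:
                    base[i] += 1
                h_next[i] = base[i]
        if g_next == g and h_next == h:
            return g, h          # the pruned box [g°, h°]
        g, h = g_next, h_next
```

Feeding the returned box into the lattice_double_greedy sketch above corresponds to the pruned initialization analyzed in Theorem 5.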

Lemma 4.

All optimal solutions 𝐱\boldsymbol{x}^{*} that satisfy f(𝐱)=max𝐱𝐛f(𝐱)f(\boldsymbol{x}^{*})=\max_{\boldsymbol{x}\preceq\boldsymbol{b}}f(\boldsymbol{x}) are contained in the collection π=[𝐠,𝐡]\pi^{\circ}=[\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}], and 𝐠t𝐠t+1𝐠𝐱𝐡𝐡t+1𝐡t\boldsymbol{g}_{t}\preceq\boldsymbol{g}_{t+1}\preceq\boldsymbol{g}^{\circ}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}^{\circ}\preceq\boldsymbol{h}_{t+1}\preceq\boldsymbol{h}_{t} holds for all 𝐱\boldsymbol{x}^{*} and any t0t\geq 0.

Proof.

First, we show that the collection generated in current iteration is a subcollection of that generated in previous iteration, namely 𝒈t𝒈t+1𝒉t+1𝒉t\boldsymbol{g}_{t}\preceq\boldsymbol{g}_{t+1}\preceq\boldsymbol{h}_{t+1}\preceq\boldsymbol{h}_{t}. We prove it by induction. In Lemma 2, we have shown that 𝒈0=𝟎𝒈1𝒉1𝒉0=𝒃\boldsymbol{g}_{0}=\boldsymbol{0}\preceq\boldsymbol{g}_{1}\preceq\boldsymbol{h}_{1}\preceq\boldsymbol{h}_{0}=\boldsymbol{b}. For any t>1t>1, we assume that 𝒈t1𝒈t𝒉t𝒉t1\boldsymbol{g}_{t-1}\preceq\boldsymbol{g}_{t}\preceq\boldsymbol{h}_{t}\preceq\boldsymbol{h}_{t-1} is satisfied. Given a component i[d]i\in[d], for any q𝒈t(i)q\leq\boldsymbol{g}_{t}(i), we have f(𝒆i|𝒉t1𝒉t1(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t-1}-\boldsymbol{h}_{t-1}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0. Because of the dr-submodularity, we have f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆i)f(𝒆i|𝒉t1𝒉t1(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t-1}-\boldsymbol{h}_{t-1}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0, which indicates 𝒈t𝒈t+1\boldsymbol{g}_{t}\preceq\boldsymbol{g}_{t+1}. Similarly, for any q𝒉t+1(i)q\leq\boldsymbol{h}_{t+1}(i), we have f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq 0. Because of the dr-submodularity, we have f(𝒆i|𝒈t1𝒈t1(i)𝒆i+(q1)𝒆i)f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t-1}-\boldsymbol{g}_{t-1}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq 0, which indicates 𝒉t+1𝒉t\boldsymbol{h}_{t+1}\preceq\boldsymbol{h}_{t}. Moreover, for any q𝒈t+1(i)q\leq\boldsymbol{g}_{t+1}(i), we have f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0. Due to 𝒈t𝒉t\boldsymbol{g}_{t}\preceq\boldsymbol{h}_{t} and dr-submodularity, we have f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0, which indicates 𝒈t+1𝒉t+1\boldsymbol{g}_{t+1}\preceq\boldsymbol{h}_{t+1}. Thus, we conclude that 𝒈t𝒈t+1𝒉t+1𝒉t\boldsymbol{g}_{t}\preceq\boldsymbol{g}_{t+1}\preceq\boldsymbol{h}_{t+1}\preceq\boldsymbol{h}_{t} holds for any t0t\geq 0.

Then, we show that any optimal solutions 𝒙\boldsymbol{x}^{*} are contained in the collection π=[𝒈,𝒉]\pi^{\circ}=[\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}] returned by Algorithm 2, namely 𝒈𝒙𝒉\boldsymbol{g}^{\circ}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}^{\circ}. We prove it by induction. In Lemma 3, we have shown that 𝒈1𝒙𝒉1\boldsymbol{g}_{1}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{1}. For any t>1t>1, we assume that 𝒈t𝒙𝒉t\boldsymbol{g}_{t}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{t} is satisfied. Given a component i[d]i\in[d], for any q𝒈t+1(i)q\leq\boldsymbol{g}_{t+1}(i), we have f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆𝒊)>0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e_{i}})>0. Because of the dr-submodularity, we have f(𝒆i|𝒙𝒙(i)𝒆i+(q1)𝒆𝒊)f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆𝒊)>0f(\boldsymbol{e}_{i}|\boldsymbol{x}^{*}-\boldsymbol{x}^{*}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e_{i}})\geq f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e_{i}})>0, which implies 𝒙(i)q\boldsymbol{x}^{*}(i)\geq q. Otherwise, if 𝒙(i)<q\boldsymbol{x}^{*}(i)<q, we have f(𝒆i|𝒙)>0f(\boldsymbol{e}_{i}|\boldsymbol{x}^{*})>0, which contradicts the optimality of 𝒙\boldsymbol{x}^{*}, thus 𝒙𝒈t+1\boldsymbol{x}^{*}\succeq\boldsymbol{g}_{t+1}. Similarly, for any q>𝒉t+1(i)q>\boldsymbol{h}_{t+1}(i), we have f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})<0. Because of the dr-submodularity, we have f(𝒆i|𝒙𝒙(i)𝒆i+(q1)𝒆𝒊)<f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{x}^{*}-\boldsymbol{x}^{*}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e_{i}})<f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})<0, which implies 𝒙(i)<q\boldsymbol{x}^{*}(i)<q. Otherwise, if 𝒙(i)q\boldsymbol{x}^{*}(i)\geq q, we have f(𝒆i|𝒙𝒙(i)𝒆i+(q1)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{x}^{*}-\boldsymbol{x}^{*}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})<0, which contradicts the optimality of 𝒙\boldsymbol{x}^{*}, thus 𝒙𝒉t+1\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{t+1}. Thus, we conclude that 𝒈t+1𝒙𝒉t+1\boldsymbol{g}_{t+1}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}_{t+1} holds for any t0t\geq 0, and 𝒈𝒙𝒉\boldsymbol{g}^{\circ}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}^{\circ}. The proof of Lemma is completed. ∎

Lemma 5.

For any two vectors 𝐱,𝐲+d\boldsymbol{x},\boldsymbol{y}\in\mathbb{Z}^{d}_{+} with 𝐱𝐲\boldsymbol{x}\preceq\boldsymbol{y} and a dr-submodular function f:+df:\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, we have

f(\boldsymbol{y})=f(\boldsymbol{x})+\sum_{i=1}^{d}\sum_{j=1}^{\boldsymbol{z}(i)}f\Bigg(\boldsymbol{e}_{i}|\boldsymbol{x}+\sum_{k=1}^{i-1}\sum_{l=1}^{\boldsymbol{z}(k)}\boldsymbol{e}_{k}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i}\Bigg) (10)

where we define 𝐳(i)=𝐲(i)𝐱(i)\boldsymbol{z}(i)=\boldsymbol{y}(i)-\boldsymbol{x}(i).

To understand Lemma 5, we give a simple example. Let \boldsymbol{x}=(1,1) and \boldsymbol{y}=(2,3); then \boldsymbol{x}\preceq\boldsymbol{y} and \boldsymbol{z}=(1,2). From (10), we have f(\boldsymbol{x})+f(\boldsymbol{e}_{1}|\boldsymbol{x})+f(\boldsymbol{e}_{2}|\boldsymbol{x}+\boldsymbol{e}_{1})+f(\boldsymbol{e}_{2}|\boldsymbol{x}+\boldsymbol{e}_{1}+\boldsymbol{e}_{2})=f(\boldsymbol{x}+\boldsymbol{e}_{1}+2\boldsymbol{e}_{2})=f(\boldsymbol{y}), which illustrates the essence and correctness of Lemma 5.

Lemma 6.

f(\boldsymbol{g}_{t}) and f(\boldsymbol{h}_{t}) are monotone non-decreasing as t increases.

Proof.

We prove that f(𝒈t)f(𝒈t+1)f(\boldsymbol{g}_{t})\leq f(\boldsymbol{g}_{t+1}) and f(𝒉t)f(𝒉t+1)f(\boldsymbol{h}_{t})\leq f(\boldsymbol{h}_{t+1}) respectively. Given a component i[d]i\in[d], for any q𝒈t+1(i)q\leq\boldsymbol{g}_{t+1}(i), we have f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0. Because of the dr-submodularity, we have f(𝒆i|𝒈t+1𝒈t+1(i)𝒆i+(q1)𝒆i)f(𝒆i|𝒉t𝒉t(i)𝒆i+(q1)𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t+1}-\boldsymbol{g}_{t+1}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\geq f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t}-\boldsymbol{h}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})>0, where 𝒈t+1𝒉t\boldsymbol{g}_{t+1}\preceq\boldsymbol{h}_{t}. According to the Lemma 5, that is

f(\boldsymbol{g}_{t+1})=f(\boldsymbol{g}_{t})+\sum_{i=1}^{d}\sum_{j=1}^{\boldsymbol{z}_{t}(i)}f\Bigg(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}+\sum_{k=1}^{i-1}\sum_{l=1}^{\boldsymbol{z}_{t}(k)}\boldsymbol{e}_{k}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i}\Bigg)
\geq f(\boldsymbol{g}_{t})+\sum_{i=1}^{d}\sum_{j=1}^{\boldsymbol{z}_{t}(i)}f\Bigg(\boldsymbol{e}_{i}|\boldsymbol{g}_{t+1}-\boldsymbol{z}_{t}(i)\boldsymbol{e}_{i}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i}\Bigg) (11)

where 𝒛t(i)=𝒈t+1(i)𝒈t(i)\boldsymbol{z}_{t}(i)=\boldsymbol{g}_{t+1}(i)-\boldsymbol{g}_{t}(i). The inequality (11) is established since its dr-submodularity, that is 𝒈t+k=1i1l=1𝒛t(k)𝒆k𝒈t+1𝒛t(i)𝒆i\boldsymbol{g}_{t}+\sum_{k=1}^{i-1}\sum_{l=1}^{\boldsymbol{z}_{t}(k)}\boldsymbol{e}_{k}\preceq\boldsymbol{g}_{t+1}-\boldsymbol{z}_{t}(i)\boldsymbol{e}_{i} definitely. Besides, since f(𝒆i|𝒈t+1𝒛t(i)𝒆i+l=1j1𝒆i)>0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t+1}-\boldsymbol{z}_{t}(i)\boldsymbol{e}_{i}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i})>0, we have f(𝒈t+1)f(gt)f(\boldsymbol{g}_{t+1})\geq f(g_{t}). Similarly, for any q>𝒉t+1(i)q>\boldsymbol{h}_{t+1}(i), we have f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})<0. Because of the dr-submodularity, we have f(𝒆i|𝒉t+1𝒉t+1(i)𝒆i+(q1)𝒆i)f(𝒆i|𝒈t𝒈t(i)𝒆i+(q1)𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t+1}-\boldsymbol{h}_{t+1}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})\leq f(\boldsymbol{e}_{i}|\boldsymbol{g}_{t}-\boldsymbol{g}_{t}(i)\boldsymbol{e}_{i}+(q-1)\boldsymbol{e}_{i})<0, where 𝒈t𝒉t+1\boldsymbol{g}_{t}\preceq\boldsymbol{h}_{t+1}. According to the Lemma 5, that is f(𝒉t)=f(\boldsymbol{h}_{t})=

=f(\boldsymbol{h}_{t+1})+\sum_{i=1}^{d}\sum_{j=1}^{\boldsymbol{z}_{t}(i)}f\Bigg(\boldsymbol{e}_{i}|\boldsymbol{h}_{t+1}+\sum_{k=1}^{i-1}\sum_{l=1}^{\boldsymbol{z}_{t}(k)}\boldsymbol{e}_{k}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i}\Bigg)
\leq f(\boldsymbol{h}_{t+1})+\sum_{i=1}^{d}\sum_{j=1}^{\boldsymbol{z}_{t}(i)}f\Bigg(\boldsymbol{e}_{i}|\boldsymbol{h}_{t+1}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i}\Bigg) (12)

where 𝒛t(i)=𝒉t(i)𝒉t+1(i)\boldsymbol{z}_{t}(i)=\boldsymbol{h}_{t}(i)-\boldsymbol{h}_{t+1}(i). The inequality (12) is established since its dr-submodularity, that is 𝒉t+1+k=1i1l=1𝒛t(k)𝒆k𝒉t+1\boldsymbol{h}_{t+1}+\sum_{k=1}^{i-1}\sum_{l=1}^{\boldsymbol{z}_{t}(k)}\boldsymbol{e}_{k}\succeq\boldsymbol{h}_{t+1} definitely. Besides, since f(𝒆i|𝒉t+1+l=1j1𝒆i)<0f(\boldsymbol{e}_{i}|\boldsymbol{h}_{t+1}+\sum_{l=1}^{j-1}\boldsymbol{e}_{i})<0, we have f(𝒉t+1)f(𝒉t)f(\boldsymbol{h}_{t+1})\geq f(\boldsymbol{h}_{t}). ∎

At this time, we can initialize \boldsymbol{x},\boldsymbol{y} with \boldsymbol{x}\leftarrow\boldsymbol{g}^{\circ},\boldsymbol{y}\leftarrow\boldsymbol{h}^{\circ} instead of starting with \boldsymbol{x}\leftarrow\boldsymbol{0},\boldsymbol{y}\leftarrow\boldsymbol{b} in Algorithm 1, so that the search space required to be checked is reduced to [\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}]. Then, we can establish the approximation ratio of this revised lattice-based double greedy algorithm.

Lemma 7.

If we initialize Algorithm 1 by [𝐠,𝐡][\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}], the solution 𝐱\boldsymbol{x}^{\circ} returned by Algorithm 1 satisfies

\mathbb{E}[f(\boldsymbol{x}^{\circ})]\geq\frac{f((\boldsymbol{x}^{*}\lor\boldsymbol{g}^{\circ})\land\boldsymbol{h}^{\circ})+\frac{1}{2}\cdot(f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ}))}{2} (13)
Proof.

It can be extended from the proof of Lemma 3.1 in [12]. The procedure is involved, so we omit it here due to space limitations. ∎

Theorem 5.

For our CPM-MS problem, if we initialize Algorithm 1 by [𝐠,𝐡][\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}], and f(𝐠)+f(𝐡)0f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ})\geq 0 is satisfied, the marketing vector 𝐱\boldsymbol{x}^{\circ} returned by Algorithm 1 is a (1/2)(1/2)-approximate solution.

Proof.

Based on Lemma 4, we have 𝒈𝒙𝒉\boldsymbol{g}^{\circ}\preceq\boldsymbol{x}^{*}\preceq\boldsymbol{h}^{\circ}, hence (𝒙𝒈)𝒉=𝒙(\boldsymbol{x}^{*}\lor\boldsymbol{g}^{\circ})\land\boldsymbol{h}^{\circ}=\boldsymbol{x}^{*}. If f(𝒈)+f(𝒉)0f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ})\geq 0, we can get that 𝔼[f(𝒙)](1/2)(f(𝒙)+(1/2)(f(𝒈)+f(𝒉)))(1/2)f(𝒙)=(1/2)max𝒙𝒃f(𝒙)\mathbb{E}[f(\boldsymbol{x}^{\circ})]\geq(1/2)\cdot(f(\boldsymbol{x}^{*})+(1/2)(f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ})))\geq(1/2)\cdot f(\boldsymbol{x}^{*})=(1/2)\cdot\max_{\boldsymbol{x}\preceq\boldsymbol{b}}f(\boldsymbol{x}). ∎

From Theorem 5, we can obtain the same approximation ratio by applying the lattice-based double greedy algorithm initialized with the result of iterative pruning, provided that f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ})\geq 0. According to Lemma 6, we have f(\boldsymbol{0})+f(\boldsymbol{b})=f(\boldsymbol{g}_{0})+f(\boldsymbol{h}_{0})\leq f(\boldsymbol{g}_{1})+f(\boldsymbol{h}_{1})\leq\cdots\leq f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ}). The condition f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ})\geq 0 is much easier to achieve than f(\boldsymbol{0})+f(\boldsymbol{b})\geq 0. Therefore, the applicability of Algorithm 1 with a theoretical bound is extended greatly by the lattice-based iterative pruning technique.

VI Speedup by Sampling Techniques

In this section, we analyze the time complexity of Algorithm 1 and Algorithm 2, and then discuss how to reduce their running time by sampling techniques.

VI-A Time Complexity

First, we assume there is a value oracle for computing the marginal gain of increasing or decreasing component i\in[d] by 1. If we initialize \boldsymbol{x}\leftarrow\boldsymbol{0} and \boldsymbol{y}\leftarrow\boldsymbol{b} at the beginning of the lattice-based double greedy algorithm (Algorithm 1), we have to query the marginal gain 2\cdot\sum_{i=1}^{d}\boldsymbol{b}(i) times in total to decide whether to increase or decrease each component by 1. Consider shrinking the collection [\boldsymbol{0},\boldsymbol{b}] to [\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}] by applying the lattice-based iterative pruning (Algorithm 2) first, using it to initialize \boldsymbol{x} and \boldsymbol{y} at the beginning of Algorithm 1, and then running Algorithm 1. For each component i\in[d], we check its marginal gain \boldsymbol{g}^{\circ}(i)+(\boldsymbol{b}(i)-\boldsymbol{h}^{\circ}(i)) times in the iterative pruning, thus \sum_{i=1}^{d}(\boldsymbol{g}^{\circ}(i)+(\boldsymbol{b}(i)-\boldsymbol{h}^{\circ}(i))) times in total. Then, we need to check 2\cdot\sum_{i=1}^{d}(\boldsymbol{h}^{\circ}(i)-\boldsymbol{g}^{\circ}(i)) times in the subsequent double greedy initialized by [\boldsymbol{g}^{\circ},\boldsymbol{h}^{\circ}]. Combining these, we have to check \sum_{i=1}^{d}(\boldsymbol{b}(i)+\boldsymbol{h}^{\circ}(i)-\boldsymbol{g}^{\circ}(i))\leq 2\cdot\sum_{i=1}^{d}\boldsymbol{b}(i) times. Hence, the number of oracle calls is O(\left\|\boldsymbol{b}\right\|_{1}). However, computing the marginal gain of the profit is a time-consuming process, whose running time is given by Theorem 3, and this is not acceptable for a large-scale social graph with a large search space.

VI-B Sampling Techniques

To overcome the #P-hardness of computing the objective f(\cdot), we borrow the idea of reverse influence sampling (RIS) [13]. Consider the traditional IM problem first, where we need the concept of a reverse reachable set (RR-set). Given a triggering model \Omega=(G,\mathcal{D}), a random RR-set is generated by selecting a node u\in V uniformly at random, sampling a graph realization g from \Omega, and collecting the nodes that can reach u in g. The RR-set rooted at u thus contains the nodes that are likely to influence u. The larger the expected influence spread of a seed set S, the higher the probability that S intersects a random RR-set. Given a seed set S and a random RR-set R, we have \sigma(S)=n\cdot\Pr[R\cap S\neq\emptyset].
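As an illustration, under the IC-model a random RR-set can be generated by a reverse breadth-first search in which each incoming edge is kept independently with its diffusion probability. The sketch below assumes adjacency and probability structures (nodes, in_neighbors, prob) that are placeholders rather than notation from the paper.

```python
import random
from collections import deque

def random_rr_set(nodes, in_neighbors, prob):
    # Pick a root uniformly at random, then run a reverse BFS in which
    # each incoming edge (v, u) is kept independently with probability
    # prob[(v, u)]; the visited nodes form one random RR-set.
    root = random.choice(nodes)
    rr, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in in_neighbors.get(u, []):
            if v not in rr and random.random() < prob[(v, u)]:
                rr.add(v)
                queue.append(v)
    return rr
```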

Extended to the lattice domain, given a marketing vector \boldsymbol{x}, its expected influence spread under the triggering model \Omega can be written as \mu(\boldsymbol{x})=n\cdot\mathbb{E}_{R}[1-\prod_{u\in R}(1-h_{u}(\boldsymbol{x}))] [27]. Let \mathcal{R}=\{R_{1},R_{2},\cdots,R_{\theta}\} be a collection of independently generated random RR-sets; then we have

f^(,𝒙)=nθR(1uR(1hu(𝒙)))c(𝒙)\hat{f}(\mathcal{R},\boldsymbol{x})=\frac{n}{\theta}\cdot\sum_{R\in\mathcal{R}}\left(1-\prod_{u\in R}(1-h_{u}(\boldsymbol{x}))\right)-c(\boldsymbol{x}) (14)

which is an unbiased estimator of f(\boldsymbol{x}). Intuitively, the vector \boldsymbol{x} that maximizes \hat{f}(\mathcal{R},\boldsymbol{x}) will be close to the optimal solution, and it gets closer as |\mathcal{R}| increases. Like f(\boldsymbol{x}), for a fixed collection \mathcal{R}, \hat{f}(\mathcal{R},\cdot) is dr-submodular but not monotone. By Theorem 5, Algorithm 1 offers a (1/2)-approximation if \hat{f}(\mathcal{R},\boldsymbol{g}^{\circ})+\hat{f}(\mathcal{R},\boldsymbol{h}^{\circ})\geq 0 is satisfied after the pruning process.
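Evaluating the estimator in (14) is straightforward; the sketch below assumes a callable h(u, x) for the strategy function h_{u}(\boldsymbol{x}) and cost(x) for c(\boldsymbol{x}), which are placeholders rather than names from the paper.

```python
def estimated_profit(rr_sets, x, h, cost, n):
    # Unbiased estimator of f(x) from Eq. (14): average over the sampled
    # RR-sets the probability that at least one node in the set is
    # seeded under x, scale by n, then subtract the cost c(x).
    total = 0.0
    for rr in rr_sets:
        miss = 1.0
        for u in rr:
            miss *= 1.0 - h(u, x)  # probability that u is not seeded
        total += 1.0 - miss
    return (n / len(rr_sets)) * total - cost(x)
```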

Now, we design our algorithm based on the idea of reverse sampling. First, we need to sample enough random RR-sets so that the resulting estimate of the objective function is accurate. Let \varepsilon_{1}, \varepsilon_{2}, and \varepsilon_{3} be three adjustable parameters satisfying

ε2+(1/2)ε3=ε\varepsilon_{2}+(1/2)\cdot\varepsilon_{3}=\varepsilon (15)

where ε1,ε2,ε3>0\varepsilon_{1},\varepsilon_{2},\varepsilon_{3}>0. Then, we can set that

θ1=n2ln(3δi=1d(𝒃(i)+1))2ε12\displaystyle\theta_{1}=\sqrt{\frac{n^{2}\cdot\ln(3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1))}{2\varepsilon_{1}^{2}}} (16)
θ2=n(2n+ε22OPT¯)ln(3δi=1d(𝒃(i)+1))ε22OPT¯2\displaystyle\theta_{2}=\frac{n(2n+\varepsilon_{2}^{2}\cdot\underline{OPT})\cdot\ln(3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1))}{\varepsilon_{2}^{2}\cdot\underline{OPT}^{2}} (17)
θ3=2n2ln(3δ)ε32OPT¯2\displaystyle\theta_{3}=\frac{2n^{2}\cdot\ln(3\delta)}{\varepsilon_{3}^{2}\cdot\underline{OPT}^{2}} (18)

where \underline{OPT} is a lower bound of the optimal objective f(\boldsymbol{x}^{*}). The algorithm combining double greedy with reverse sampling and iterative pruning, called the DG-IP-RIS algorithm, is shown in Algorithm 3.

Algorithm 3 DG-IP-RIS
0:  f^:×+d\hat{f}:\mathcal{R}\times\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, 𝒃+d\boldsymbol{b}\in\mathbb{Z}^{d}_{+}, (ε1,ε2,ε3)(\varepsilon_{1},\varepsilon_{2},\varepsilon_{3}), δ\delta
0:  𝒙^+d\hat{\boldsymbol{x}}\in\mathbb{Z}^{d}_{+}
1:  Initialize: \theta_{1} defined in (16)
2:  OPT¯\underline{OPT}\leftarrow OptEstimation(f^,𝒃,θ1)(\hat{f},\boldsymbol{b},\theta_{1})
3:  Initialize: \theta_{2},\theta_{3} defined in (17) and (18)
4:  θmax(θ1,θ2,θ3)\theta\leftarrow\max(\theta_{1},\theta_{2},\theta_{3})
5:  Generate a collection of random RR-sets \mathcal{R} with ||=θ|\mathcal{R}|=\theta
6:  [𝒈^,𝒉^][\hat{\boldsymbol{g}}^{\circ},\hat{\boldsymbol{h}}^{\circ}]\leftarrow Lattice-basedPruning(f^(,),𝒃)(\hat{f}(\mathcal{R},\cdot),\boldsymbol{b})
7:  𝒙^\hat{\boldsymbol{x}}\leftarrow Lattice-basedDoubleGreedy(f^(,),[𝒈^,𝒉^])(\hat{f}(\mathcal{R},\cdot),[\hat{\boldsymbol{g}}^{\circ},\hat{\boldsymbol{h}}^{\circ}])
8:  return  𝒙^\hat{\boldsymbol{x}}

In the DG-IP-RIS algorithm, we determine the number of random RR-sets \theta in line 4, and then generate a collection \mathcal{R} of random RR-sets of size \theta. The objective \hat{f}(\mathcal{R},\cdot) is computed from this \mathcal{R}, from which we obtain a solution by iterative pruning followed by double greedy. As the first step, we need to compute a lower bound of the optimal value f(\boldsymbol{x}^{*}), which is done by Algorithm 4. Here, at each iteration we increase by 1 the component with the largest marginal gain, until no component has a positive marginal gain. After the while loop, we obtain a vector \boldsymbol{x} and set \underline{OPT}\leftarrow\hat{f}(\mathcal{R},\boldsymbol{x})-2\varepsilon_{1}, because \Pr[|\hat{f}(\mathcal{R},\boldsymbol{x})-f(\boldsymbol{x})|\leq\varepsilon_{1}] holds with high probability under the setting of \theta_{1}. In this way, we have \underline{OPT}>0, because the largest marginal gain must be positive due to the dr-submodularity; otherwise the definition of our problem would be vacuous. For convenience, given a random RR-set R, we denote p(R,\boldsymbol{x})=1-\prod_{u\in R}(1-h_{u}(\boldsymbol{x})) in the subsequent proofs.

Algorithm 4 OptEstimation
0:  f^:×+d\hat{f}:\mathcal{R}\times\mathbb{Z}^{d}_{+}\rightarrow\mathbb{R}, 𝒃+d\boldsymbol{b}\in\mathbb{Z}^{d}_{+}, θ1\theta_{1}
0:  OPT¯\underline{OPT}
1:  Initialize: 𝒙0\boldsymbol{x}\leftarrow 0, t0t\leftarrow 0
2:  Generate a collection of random RR-sets \mathcal{R} with ||=θ1|\mathcal{R}|=\theta_{1}
3:  while t<\sum_{i=1}^{d}\boldsymbol{b}(i) do
4:     iargmaxi[d],𝒙(i)<𝒃(i)f^(𝒆i|,𝒙)i^{*}\leftarrow\arg\max_{i\in[d],\boldsymbol{x}(i)<\boldsymbol{b}(i)}\hat{f}(\boldsymbol{e}_{i}|\mathcal{R},\boldsymbol{x})
5:     if f^(𝒆i|,𝒙)0\hat{f}(\boldsymbol{e}_{i^{*}}|\mathcal{R},\boldsymbol{x})\leq 0 then
6:        Break
7:     end if
8:     𝒙𝒙+𝒆i\boldsymbol{x}\leftarrow\boldsymbol{x}+\boldsymbol{e}_{i^{*}}, tt+1t\leftarrow t+1
9:  end while
10:  OPT¯f^(,𝒙)2ε1\underline{OPT}\leftarrow\hat{f}(\mathcal{R},\boldsymbol{x})-2\varepsilon_{1}
11:  return  OPT¯\underline{OPT}
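For concreteness, a compact sketch of OptEstimation is given below; f_hat(rr_sets, x) stands for the estimator \hat{f}(\mathcal{R},\boldsymbol{x}) of Eq. (14) and is an assumed callable, and the loop mirrors lines 3–9 of Algorithm 4.

```python
def opt_estimation(f_hat, rr_sets, b, eps1):
    # A sketch of Algorithm 4: greedily add one unit to the component
    # with the largest positive marginal gain of the estimated profit,
    # then return f_hat minus 2 * eps1 as a lower bound on the optimum.
    d = len(b)
    x = [0] * d
    while True:
        base = f_hat(rr_sets, x)
        best_i, best_gain = None, 0.0
        for i in range(d):
            if x[i] < b[i]:
                gain = f_hat(rr_sets, x[:i] + [x[i] + 1] + x[i + 1:]) - base
                if gain > best_gain:
                    best_i, best_gain = i, gain
        if best_i is None:  # no component has a positive marginal gain
            break
        x[best_i] += 1
    return f_hat(rr_sets, x) - 2 * eps1
```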
Lemma 8.

The OPT¯\underline{OPT} returned by Algorithm 4 satisfies f(𝐱)OPT¯f(\boldsymbol{x}^{*})\geq\underline{OPT} with at least 11/(3δ)1-1/(3\delta) probability.

Proof.

For any marketing vector \boldsymbol{x}, we want to show \Pr[|\hat{f}(\mathcal{R},\boldsymbol{x})-f(\boldsymbol{x})|\geq\varepsilon_{1}]\leq 1/(3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1)). By the additive form of the Chernoff-Hoeffding inequality, it suffices to bound

Pr[|1θ1p(Ri,𝒙)μ(𝒙)n|ε1n]exp(2θ12ε12n2)\Pr\left[\left|\frac{1}{\theta_{1}}\sum p(R_{i},\boldsymbol{x})-\frac{\mu(\boldsymbol{x})}{n}\right|\geq\frac{\varepsilon_{1}}{n}\right]\leq\exp\left(-\frac{2\theta_{1}^{2}\varepsilon_{1}^{2}}{n^{2}}\right)

When \theta_{1} is defined as in (16), we have exactly 1/(3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1))=\exp(-2\theta_{1}^{2}\varepsilon_{1}^{2}/n^{2}). By the union bound over all marketing vectors \boldsymbol{x}\preceq\boldsymbol{b}, the failure probability is at most 1/(3\delta); in particular, the bound holds for the vector \boldsymbol{x} obtained in Algorithm 4, and hence f(\boldsymbol{x}^{*})\geq f(\boldsymbol{x})\geq\hat{f}(\mathcal{R},\boldsymbol{x})-\varepsilon_{1}\geq\underline{OPT}. ∎

Remark 1.

Given a marketing vector \boldsymbol{x}\preceq\boldsymbol{b}, for each component i\in[d] the possible values of \boldsymbol{x}(i) are \{0,1,2,\cdots,\boldsymbol{b}(i)\}, i.e., \boldsymbol{b}(i)+1 choices. Hence the total number of possible vectors \boldsymbol{x} is \prod_{i=1}^{d}(\boldsymbol{b}(i)+1), which is exactly the number of events over which the union bound in the previous lemma is taken. For instance, under the experimental setting of Section VII with d=5 and \boldsymbol{b}(i)=5, this count is 6^{5}=7776.

Lemma 9 (Chernoff bounds [28]).

Given a collection \mathcal{Z}=\{Z_{1},Z_{2},\cdots,Z_{\theta}\} of i.i.d. random variables with each Z_{i}\in[0,1] and \mathbb{E}[Z_{i}]=\nu, we have

Pr[i=1θZi(1+γ)νθ]exp(γ2νθ2+γ)\Pr\left[\sum\nolimits_{i=1}^{\theta}Z_{i}\geq(1+\gamma)\cdot\nu\theta\right]\leq\exp\left(-\frac{\gamma^{2}\cdot\nu\theta}{2+\gamma}\right) (19)
Pr[i=1θZi(1γ)νθ]exp(γ2νθ2)\Pr\left[\sum\nolimits_{i=1}^{\theta}Z_{i}\leq(1-\gamma)\cdot\nu\theta\right]\leq\exp\left(-\frac{\gamma^{2}\cdot\nu\theta}{2}\right) (20)

where we assume that γ>0\gamma>0.

Lemma 10.

Given a collection \mathcal{R} with ||=θ2|\mathcal{R}|=\theta_{2}, for any marketing vector 𝐱𝐛\boldsymbol{x}\preceq\boldsymbol{b}, it satisfies f^(,𝐱)f(𝐱)<ε2f(𝐱)\hat{f}(\mathcal{R},\boldsymbol{x})-f(\boldsymbol{x})<\varepsilon_{2}\cdot f(\boldsymbol{x}^{*}) with at least 11/(3δ)1-1/(3\delta) probability.

Proof.

For any marketing vector \boldsymbol{x}, we want to show \Pr[\hat{f}(\mathcal{R},\boldsymbol{x})-f(\boldsymbol{x})\geq\varepsilon_{2}\cdot f(\boldsymbol{x}^{*})]\leq 1/(3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1)). By the Chernoff bound (19), it suffices to bound

Pr[p(Ri,𝒙)(1+ε2f(𝒙)μ(𝒙))μ(𝒙)θ2n]\displaystyle\Pr\left[\sum p(R_{i},\boldsymbol{x})\geq\left(1+\frac{\varepsilon_{2}f(\boldsymbol{x}^{*})}{\mu(\boldsymbol{x})}\right)\cdot\frac{\mu(\boldsymbol{x})\theta_{2}}{n}\right]
exp((ε2f(𝒙)μ(𝒙))2μ(𝒙)θ2n2+ε2f(𝒙)μ(𝒙))\displaystyle\leq\exp\left(-\frac{\left(\frac{\varepsilon_{2}f(\boldsymbol{x}^{*})}{\mu(\boldsymbol{x})}\right)^{2}\cdot\frac{\mu(\boldsymbol{x})\theta_{2}}{n}}{2+\frac{\varepsilon_{2}f(\boldsymbol{x}^{*})}{\mu(\boldsymbol{x})}}\right) (21)

From Lemma 9 and μ(𝒙)n\mu(\boldsymbol{x})\leq n, we have

(21)exp(θ2ε22OPT¯2n(2n+ε2OPT¯))13δi=1d(𝒃(i)+1)(21)\leq\exp\left(-\frac{\theta_{2}\cdot\varepsilon_{2}^{2}\cdot\underline{OPT}^{2}}{n\cdot(2n+\varepsilon_{2}\cdot\underline{OPT})}\right)\leq\frac{1}{3\delta\cdot\prod_{i=1}^{d}(\boldsymbol{b}(i)+1)}

By the union bound, the failure probability over all \boldsymbol{x}\preceq\boldsymbol{b} is at most 1/(3\delta). ∎

Lemma 11.

Given a collection \mathcal{R} with ||=θ3|\mathcal{R}|=\theta_{3}, for an optimal solution 𝐱\boldsymbol{x}^{*}, it satisfies f^(,𝐱)f(𝐱)>ε3f(𝐱)\hat{f}(\mathcal{R},\boldsymbol{x}^{*})-f(\boldsymbol{x}^{*})>-\varepsilon_{3}\cdot f(\boldsymbol{x}^{*}) with at least 11/(3δ)1-1/(3\delta) probability.

Proof.

For an optimal solution \boldsymbol{x}^{*}, we want to show \Pr[\hat{f}(\mathcal{R},\boldsymbol{x}^{*})-f(\boldsymbol{x}^{*})\leq-\varepsilon_{3}\cdot f(\boldsymbol{x}^{*})]\leq 1/(3\delta). By the Chernoff bound (20), it suffices to bound

\Pr\left[\sum p(R_{i},\boldsymbol{x}^{*})\leq\left(1-\frac{\varepsilon_{3}f(\boldsymbol{x}^{*})}{\mu(\boldsymbol{x}^{*})}\right)\cdot\frac{\mu(\boldsymbol{x}^{*})\theta_{3}}{n}\right]
\leq\exp\left(-\frac{\left(\frac{\varepsilon_{3}f(\boldsymbol{x}^{*})}{\mu(\boldsymbol{x}^{*})}\right)^{2}\cdot\frac{\mu(\boldsymbol{x}^{*})\theta_{3}}{n}}{2}\right) (22)

From Lemma 9 and \mu(\boldsymbol{x}^{*})\leq n, we have

(22)exp(θ3ε32OPT¯22n2)13δ(22)\leq\exp\left(-\frac{\theta_{3}\cdot\varepsilon_{3}^{2}\cdot\underline{OPT}^{2}}{2n^{2}}\right)\leq\frac{1}{3\delta}

The above relationship holds for the optimal solution \boldsymbol{x}^{*} with probability at most 1/(3\delta). ∎

Let \hat{\boldsymbol{x}}^{\circ} be the result returned by Algorithm 3. If \hat{f}(\mathcal{R},\hat{\boldsymbol{x}}^{\circ}) and \hat{f}(\mathcal{R},\boldsymbol{x}^{*}) are accurate estimates of f(\hat{\boldsymbol{x}}^{\circ}) and f(\boldsymbol{x}^{*}), then \hat{\boldsymbol{x}}^{\circ} carries an effective approximation guarantee, as stated in Theorem 6.

Theorem 6.

For our CPM-MS problem, if it satisfies f^(,𝐠^)+f^(,𝐡^)0\hat{f}(\mathcal{R},\hat{\boldsymbol{g}}^{\circ})+\hat{f}(\mathcal{R},\hat{\boldsymbol{h}}^{\circ})\geq 0, for any ε(0,1/2)\varepsilon\in(0,1/2) and δ>0\delta>0, the marketing vector 𝐱^\hat{\boldsymbol{x}}^{\circ} returned by Algorithm 3 is a (1/2ε)(1/2-\varepsilon)-approximation, that is

f(𝒙^)(1/2ε)f(𝒙)f(\hat{\boldsymbol{x}}^{\circ})\geq(1/2-\varepsilon)\cdot f(\boldsymbol{x}^{*}) (23)

holds with at least 11/δ1-1/\delta probability.

Proof.

By Lemma 10, \hat{f}(\mathcal{R},\hat{\boldsymbol{x}}^{\circ})-f(\hat{\boldsymbol{x}}^{\circ})<\varepsilon_{2}\cdot f(\boldsymbol{x}^{*}) holds with probability at least 1-1/(3\delta), and by Theorem 5, \hat{f}(\mathcal{R},\hat{\boldsymbol{x}}^{\circ})\geq(1/2)\cdot\hat{f}(\mathcal{R},\boldsymbol{x}^{*}). Thus, f(\hat{\boldsymbol{x}}^{\circ})\geq\hat{f}(\mathcal{R},\hat{\boldsymbol{x}}^{\circ})-\varepsilon_{2}\cdot f(\boldsymbol{x}^{*})\geq(1/2)\cdot\hat{f}(\mathcal{R},\boldsymbol{x}^{*})-\varepsilon_{2}\cdot f(\boldsymbol{x}^{*}). By Lemma 11, \hat{f}(\mathcal{R},\boldsymbol{x}^{*})-f(\boldsymbol{x}^{*})>-\varepsilon_{3}\cdot f(\boldsymbol{x}^{*}) holds with probability at least 1-1/(3\delta), and therefore f(\hat{\boldsymbol{x}}^{\circ})\geq(1/2-(\varepsilon_{2}+(1/2)\cdot\varepsilon_{3}))\cdot f(\boldsymbol{x}^{*})=(1/2-\varepsilon)\cdot f(\boldsymbol{x}^{*}). Combined with the fact that f(\boldsymbol{x}^{*})\geq\underline{OPT} holds with probability at least 1-1/(3\delta) (Lemma 8), the union bound implies that (23) holds with probability at least 1-1/\delta. ∎

Finally, we consider the running time of Algorithm 3. Given a collection \mathcal{R} with |\mathcal{R}|=\theta=\max\{\theta_{1},\theta_{2},\theta_{3}\}, we have \theta=O(n^{2}). Computing \hat{f}(\mathcal{R},\cdot) takes O(\theta n) time, and generating a random RR-set takes O(m) time. Thus, the time complexity of Algorithm 3 is O(m\theta+\left\|\boldsymbol{b}\right\|_{1}\cdot n\theta)=O((m+n)n^{2}). This running time can be reduced further: comparing the forms of (16), (17), and (18), \theta_{1} is clearly smaller than \theta_{2} and \theta_{3}, so we can choose the remaining two parameters as (\varepsilon_{2},\varepsilon_{3})=\arg\min_{\varepsilon_{2}+(1/2)\cdot\varepsilon_{3}=\varepsilon}\max\{\theta_{2},\theta_{3}\}.
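One way to carry out this parameter choice is a simple one-dimensional grid search over \varepsilon_{2}\in(0,\varepsilon) with \varepsilon_{3}=2(\varepsilon-\varepsilon_{2}), plugging both into (17) and (18). The sketch below assumes opt_lb holds the lower bound \underline{OPT} returned by Algorithm 4; all names are illustrative, and a finer grid or a closed-form minimization could replace the loop.

```python
import math

def choose_epsilons(eps, n, b, delta, opt_lb, grid=1000):
    # Grid search over eps2 in (0, eps) with eps3 = 2 * (eps - eps2),
    # so that eps2 + eps3 / 2 = eps as in Eq. (15), minimizing
    # max{theta_2, theta_3} from Eqs. (17) and (18).
    log_b = math.log(3 * delta * math.prod(bi + 1 for bi in b))
    log_d = math.log(3 * delta)
    best = None
    for k in range(1, grid):
        eps2 = eps * k / grid
        eps3 = 2 * (eps - eps2)
        theta2 = n * (2 * n + eps2 ** 2 * opt_lb) * log_b / (eps2 ** 2 * opt_lb ** 2)
        theta3 = 2 * n ** 2 * log_d / (eps3 ** 2 * opt_lb ** 2)
        if best is None or max(theta2, theta3) < best[0]:
            best = (max(theta2, theta3), eps2, eps3)
    return best[1], best[2]
```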

VII Experiments

In this section, we carry out several experiments on different datasets to validate the performance of our proposed algorithms. The goal is to test the efficiency of the DG-IP-RIS algorithm (Algorithm 3) and its effectiveness compared to other heuristic algorithms. All of our experiments are implemented in Python and run on a Windows machine with a 3.40GHz 4-core Intel CPU and 16GB RAM. Four datasets are used in our experiments: (1) NetScience [29]: a co-authorship network among scientists publishing papers on network science; (2) Wiki [29]: a who-votes-on-whom network collected from Wikipedia voting; (3) HetHEPT [30]: an academic collaboration network in the high-energy physics area; (4) Epinions [30]: a who-trusts-whom online social network from Epinions.com, a general consumer review site. The statistics of these four datasets are summarized in Table I. For undirected graphs, each undirected edge is replaced by two directed edges in opposite directions.

TABLE I: The datasets statistics (K=103)(K=10^{3})
Dataset n m Type Avg.Degree
NetScience 0.4K 1.01K undirected 5.00
Wiki 1.0K 3.15K directed 6.20
HetHEPT 12.0K 118.5K undirected 19.8
Epinions 75.9K 508.8K directed 13.4
Figure 1: The performance and running time comparisons among different algorithms under the IC-model (panels a–h: performance and running time on NetScience, Wiki, HetHEPT, and Epinions).
Figure 2: The performance and running time comparisons among different algorithms under the LT-model (panels a–h: performance and running time on NetScience, Wiki, HetHEPT, and Epinions).

VII-A Experimental Settings

We test the different algorithms under the IC-model and the LT-model. For the IC-model, the diffusion probability p_{uv} for each (u,v)\in E is set to the inverse of v's in-degree, i.e., p_{uv}=1/|N^{-}(v)|, and for the LT-model, the weight is set as b_{uv}=1/|N^{-}(v)| for each (u,v)\in E; both settings are widely adopted in previous IM studies [3] [13] [14] [15] [19]. Then, we need to specify our strategy function, that is

hu(𝒙)=1i[d]j=1𝒙(i)(1ηj1rui)h_{u}(\boldsymbol{x})=1-\prod_{i\in[d]}\prod_{j=1}^{\boldsymbol{x}(i)}\left(1-\eta^{j-1}\cdot r_{ui}\right) (24)

where \eta\in(0,1) is an attenuation coefficient and r_{ui}\in[0,1] for u\in V and i\in[d]: a unit of investment in marketing action M_{i} activates user u as a seed with probability r_{ui}, and each activation is independent. Here, we define the vector \boldsymbol{r}_{u}=(r_{u1},r_{u2},\cdots,r_{ud}).

We assume there are five marketing actions in total, namely \boldsymbol{x}=(x_{1},x_{2},\cdots,x_{5}) with d=5 and \boldsymbol{b}=\{5\}^{d}; thus \boldsymbol{x}(i)\leq 5 for each i\in[5]. Besides, we set \eta=0.8, sample \{r_{u1},r_{u3}\} uniformly from [0,0.1], and sample \{r_{u2},r_{u4},r_{u5}\} uniformly from [0,0.05]. Clearly, h_{u}(\boldsymbol{x}) is monotone and dr-submodular with respect to \boldsymbol{x}. For example, consider a marketing vector \boldsymbol{x}=(1,3,0,0,2) and a node u with \boldsymbol{r}_{u}=(0.1,0.04,0.08,0,0.05); we have h_{u}(\boldsymbol{x})=1-[(1-0.1)][(1-0.04)(1-0.8\times 0.04)(1-0.8^{2}\times 0.04)][(1-0.05)(1-0.8\times 0.05)]\approx 0.257. For the cost function c, we adopt a uniform cost distribution: the cost c_{i} for a unit of marketing action M_{i}, i\in[d], is set to c_{i}=\lambda\cdot n/\left\|\boldsymbol{b}\right\|_{1}, where \lambda\geq 0 is a cost coefficient used to regulate the effect of cost on the objective function. For example, f(\cdot) is monotone dr-submodular if \lambda=0; if \lambda=1, then f(\boldsymbol{b})=0 when all users in the social network can be influenced by the full marketing vector \boldsymbol{b}, and the profit is negative otherwise; if \lambda>1, we have f(\boldsymbol{b})<0.
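To make (24) concrete, the small sketch below evaluates the strategy function and reproduces the worked example above; strategy_prob is an illustrative name, not notation from the paper.

```python
def strategy_prob(x, r, eta=0.8):
    # Strategy function of Eq. (24): the probability that user u becomes
    # a seed, where the j-th unit of action i succeeds independently
    # with probability eta**(j - 1) * r[i].
    miss = 1.0
    for i, xi in enumerate(x):
        for j in range(1, xi + 1):
            miss *= 1.0 - (eta ** (j - 1)) * r[i]
    return 1.0 - miss

# The worked example from the text: x = (1, 3, 0, 0, 2) and
# r_u = (0.1, 0.04, 0.08, 0, 0.05) give h_u(x) close to 0.257.
print(round(strategy_prob((1, 3, 0, 0, 2), (0.1, 0.04, 0.08, 0, 0.05)), 3))
```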

In addition, the number of Monte Carlo simulations used for each estimate of the profit function is 2000. For those algorithms that adopt the speedup by sampling techniques, the parameter settings on the four datasets are shown in Table II. In the following, "XXX" denotes an algorithm evaluated by Monte Carlo simulations, while "XXXS" denotes the same algorithm sped up by sampling techniques. The algorithms compared in this experiment are as follows: (1) DG(S): lattice-based double greedy fed with [\boldsymbol{0},\boldsymbol{b}]; (2) DGIT(S): lattice-based double greedy fed with the interval returned by lattice-based iterative pruning; (3) Greedy(S): select the component with the maximum marginal gain until no component has a positive gain; (4) Random: select components at random until the marginal gain becomes negative.

TABLE II: The parameters setting for algorithms that adopt speedup by sampling techniques
Dataset ε1\varepsilon_{1} ε2\varepsilon_{2} ε3\varepsilon_{3} δ\delta
NetScience 0.10 0.10 0.10 10.00
Wiki 0.10 0.10 0.10 10.00
HetHEPT 0.15 0.10 0.10 10.00
Epinions 1.00 0.10 0.10 10.00

VII-B Experimental Results

Fig. 1 and Fig. 2 show the expected profit and running time produced by the different algorithms under the IC-model and the LT-model. From the left columns of Fig. 1 and Fig. 2, we can see that the expected profits decrease as the cost coefficient increases, which is expected because a larger cost coefficient implies a larger cost per unit of investment; the trend is close to an inverse proportional relationship, namely f\propto(1/\lambda). The expected profits achieved by DG, DGIT, DGS, and DGITS (DG-IP-RIS) have only slight, even negligible, gaps. Comparing DG with DGS (and DGIT with DGITS) shows that the speedup by sampling techniques is fully effective, since it estimates the objective function accurately. Comparing DG with DGIT (and DGS with DGITS) indicates that the optimal solution lies in the shrunken interval returned by iterative pruning, because DGIT does not make the performance of the original DG worse. In other words, initializing double greedy with the interval returned by iterative pruning does not reduce the expected profit, while it does provide a theoretical bound and avoids some extreme situations. In addition, even though Greedy(S) gives satisfactory solutions in our experiments, there are still exceptions, for example, (c) in Fig. 1 and (g) in Fig. 2, which usually happen at larger cost coefficients.

From the right columns of Fig. 1 and Fig. 2, the trend of running time with the cost coefficient is more complex, but three characteristics stand out. First, comparing DG with DGS (DGIT with DGITS, or Greedy with GreedyS), the running times are reduced significantly by our sampling techniques; to keep the comparison fair, we do not use parallel acceleration in our implementations. Generally speaking, the running time of an algorithm implemented with sampling does not exceed 10% of that of the corresponding algorithm implemented with Monte Carlo simulations on average. Second, looking at DGS and DGITS, their running times increase with the cost coefficient. This is because the lower bound of the optimum returned by Algorithm 4 becomes smaller as the cost coefficient grows, resulting in larger \theta_{2} and \theta_{3}; hence the number of random RR-sets that need to be generated and searched increases. Third, comparing DGS with DGITS, their running times are roughly equal, as shown in (f) and (h) of Fig. 1 and Fig. 2, which indicates that initializing with iterative pruning does not increase the time complexity in practice, a very meaningful observation.

Table III and Table IV show the effect of lattice-based iterative pruning on the sum of the initialized objective values under the IC-model and the LT-model, where we denote A=f(\boldsymbol{0})+f(\boldsymbol{b}) and B=f(\boldsymbol{g}^{\circ})+f(\boldsymbol{h}^{\circ}) for convenience. When the cost coefficient \lambda\geq 1, A<0 in all cases, so there is no approximation guarantee if we run the double greedy algorithm fed with [\boldsymbol{0},\boldsymbol{b}] directly. However, with the help of iterative pruning, B\geq 0 holds in most cases. In this way, our DGIT(S) algorithm is able to offer a (1/2-\varepsilon)-approximate solution according to Theorem 5 and Theorem 6.

TABLE III: Sum of initialized objective value under the IC-model
NetScience Wiki HetHEPT
λ\lambda AA BB AA BB AA BB
0.8 -22 219 -127 379 -586 7050
1.0 -97 178 -303 256 -2860 848
1.2 -174 101 -481 213 -5065 751
1.4 -250 -2 -658 147 -7329 -317
1.6 -325 82 -836 -11 -9523 -463
1.8 -401 55 -1015 -269 -11795 500
2.0 -477 -17 -1192 -792 -14031 -137
TABLE IV: Sum of initialized objective value under the LT-model
NetScience Wiki HetHEPT
λ\lambda AA BB AA BB AA BB
0.8 24 294 -64 482 1406 1410
1.0 -51 229 -242 390 -869 1186
1.2 -126 33 -420 177 -3080 887
1.4 -202 159 -599 228 -5332 392
1.6 -278 -6 -776 166 -7603 -948
1.8 -353 11 -954 66 -9807 -106
2.0 -429 93 -1132 -778 -12063 229

VIII Conclusion

In this paper, we first propose the continuous profit maximization problem, and based on it, we further study the unconstrained dr-submodular maximization (UDSM) problem. For the UDSM problem, lattice-based double greedy is an effective algorithm, but it has no approximation guarantee unless the objective value is non-negative over the whole domain. To address this, we propose lattice-based iterative pruning and derive it step by step. With the help of this technique, the possibility of satisfying the non-negativity condition is enhanced greatly, and our approach can serve as a flexible framework for the UDSM problem. Then, returning to the CPM-MS problem, we design a speedup strategy based on sampling techniques, which reduces the running time significantly without losing the approximation guarantee. Finally, we evaluate our proposed algorithms on four real networks, and the results validate their effectiveness and time efficiency.

Acknowledgment

This work is partly supported by National Science Foundation under grant 1747818.

References

  • [1] P. Domingos and M. Richardson, “Mining the network value of customers,” in Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001, pp. 57–66.
  • [2] M. Richardson and P. Domingos, “Mining knowledge-sharing sites for viral marketing,” in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, 2002, pp. 61–70.
  • [3] D. Kempe, J. Kleinberg, and É. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, 2003, pp. 137–146.
  • [4] J. Guo and W. Wu, “A novel scene of viral marketing for complementary products,” IEEE Transactions on Computational Social Systems, vol. 6, no. 4, pp. 797–808, 2019.
  • [5] J. Guo, T. Chen, and W. Wu, “A multi-feature diffusion model: Rumor blocking in social networks,” arXiv preprint arXiv:1912.03481, 2019.
  • [6] W. Lu and L. V. Lakshmanan, “Profit maximization over social networks,” in 2012 IEEE 12th International Conference on Data Mining.   IEEE, 2012, pp. 479–488.
  • [7] H. Zhang, H. Zhang, A. Kuhnle, and M. T. Thai, “Profit maximization for multiple products in online social networks,” in IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications.   IEEE, 2016, pp. 1–9.
  • [8] J. Tang, X. Tang, and J. Yuan, “Profit maximization for viral marketing in online social networks,” in 2016 IEEE 24th International Conference on Network Protocols (ICNP).   IEEE, 2016, pp. 1–10.
  • [9] G. Tong, W. Wu, and D.-Z. Du, “Coupon advertising in online social systems: Algorithms and sampling techniques,” arXiv preprint arXiv:1802.06946, 2018.
  • [10] J. Guo, T. Chen, and W. Wu, “Budgeted coupon advertisement problem: Algorithm and robust analysis,” IEEE Transactions on Network Science and Engineering, pp. 1–1, 2020.
  • [11] D. Kempe, J. Kleinberg, and É. Tardos, “Maximizing the spread of influence through a social network,” Theory OF Computing, vol. 11, no. 4, pp. 105–147, 2015.
  • [12] N. Buchbinder, M. Feldman, J. Seffi, and R. Schwartz, “A tight linear time (1/2)-approximation for unconstrained submodular maximization,” SIAM Journal on Computing, vol. 44, no. 5, pp. 1384–1402, 2015.
  • [13] C. Borgs, M. Brautbar, J. Chayes, and B. Lucier, “Maximizing social influence in nearly optimal time,” in Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms.   SIAM, 2014, pp. 946–957.
  • [14] Y. Tang, X. Xiao, and Y. Shi, “Influence maximization: Near-optimal time complexity meets practical efficiency,” in Proceedings of the 2014 ACM SIGMOD international conference on Management of data, 2014, pp. 75–86.
  • [15] Y. Tang, Y. Shi, and X. Xiao, “Influence maximization in near-linear time: A martingale approach,” in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, 2015, pp. 1539–1554.
  • [16] J. Tang, X. Tang, X. Xiao, and J. Yuan, “Online processing algorithms for influence maximization,” in Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 991–1005.
  • [17] W. Chen, C. Wang, and Y. Wang, “Scalable influence maximization for prevalent viral marketing in large-scale social networks,” in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, 2010, pp. 1029–1038.
  • [18] W. Chen, Y. Yuan, and L. Zhang, “Scalable influence maximization in social networks under the linear threshold model,” in 2010 IEEE international conference on data mining.   IEEE, 2010, pp. 88–97.
  • [19] H. T. Nguyen, M. T. Thai, and T. N. Dinh, “Stop-and-stare: Optimal sampling algorithms for viral marketing in billion-scale networks,” in Proceedings of the 2016 International Conference on Management of Data, 2016, pp. 695–710.
  • [20] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher, “An analysis of approximations for maximizing submodular set functions—i,” Mathematical programming, vol. 14, no. 1, pp. 265–294, 1978.
  • [21] M. L. Fisher, G. L. Nemhauser, and L. A. Wolsey, “An analysis of approximations for maximizing submodular set functions—ii,” in Polyhedral combinatorics.   Springer, 1978, pp. 73–87.
  • [22] U. Feige, V. S. Mirrokni, and J. Vondrák, “Maximizing non-monotone submodular functions,” SIAM Journal on Computing, vol. 40, no. 4, pp. 1133–1153, 2011.
  • [23] T. Soma and Y. Yoshida, “A generalization of submodular cover via the diminishing return property on the integer lattice,” in Advances in Neural Information Processing Systems, 2015, pp. 847–855.
  • [24] ——, “Maximizing monotone submodular functions over the integer lattice,” Mathematical Programming, vol. 172, no. 1-2, pp. 539–563, 2018.
  • [25] T. Soma, “Non-monotone dr-submodular function maximization,” in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.   AAAI, 2017, pp. 898–904.
  • [26] J. Guo, T. Chen, and W. Wu, “Continuous activity maximization in online social networks,” arXiv preprint arXiv:2003.11677, 2020.
  • [27] W. Chen, R. Wu, and Z. Yu, “Scalable lattice influence maximization,” arXiv preprint arXiv:1802.04555, 2018.
  • [28] R. Motwani and P. Raghavan, Randomized algorithms.   Cambridge university press, 1995.
  • [29] R. A. Rossi and N. K. Ahmed, “The network data repository with interactive graph analytics and visualization,” in AAAI, 2015. [Online]. Available: http://networkrepository.com
  • [30] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap.stanford.edu/data, Jun. 2014.