Multi-Agent Maximization of a Monotone Submodular Function via Maximum Consensus
Abstract
Constrained submodular set function maximization problems often appear in multi-agent decision-making problems with a discrete feasible set. A prominent example is the problem of multi-agent mobile sensor placement over a discrete domain. However, submodular set function optimization problems are known to be NP-hard. In this paper, we consider a class of submodular optimization problems that consists of maximization of a monotone and submodular set function subject to a partition matroid constraint over a group of networked agents that communicate over a connected undirected graph. Our objective is to obtain a distributed suboptimal polynomial-time algorithm that enables each agent to obtain its respective policy via local interactions with its neighboring agents. Our solution is a fully distributed gradient-based algorithm that uses the multilinear extension of the submodular set function and a maximum consensus scheme. This algorithm delivers a policy set whose utility value, in the worst case, is within a factor of $1-1/e-O(1/T)$ of the optimal utility value, where $T$ is the number of communication rounds. An example demonstrates our results.
1 Introduction
In recent years, optimal multi-agent sensor placement/dispatch to cover areas for sampling/observation objectives has been of great interest in the robotics community to solve various coverage, exploration, and monitoring problems. Depending on the problem setting, the planning space can be continuous [1, 2, 3, 4] or discrete [5, 6, 7, 8]. In this paper, our interest is in planning in a discrete space. Many optimal discrete policy-making problems for multi-agent sensor placement appear in the form of maximizing a utility for the team [9, 10, 11, 12, 13, 5]. A particular subset of these problems that we are interested in takes the form of a set-valued function maximization problem described by
\[ \max_{\mathcal{P}\in\mathcal{I}}\; f(\mathcal{P}), \tag{1} \]
where $\mathcal{I}\subseteq 2^{\bar{\mathcal{P}}}$ is the admissible set of subsets originating from the ground policy set $\bar{\mathcal{P}}$, which is the union of the local policy sets $\mathcal{P}_i$ of the agents; see Fig. 1 for an example.
Problem (1) is known to be NP-hard [14]. However, when the objective function is a monotone increasing and submodular set function, it is well known that the sequential greedy algorithm [15] delivers a polynomial-time suboptimal solution with a guaranteed optimality bound. For large-scale submodular maximization problems, reducing the size of the problem through approximations [16] or using several processing units leads to a faster sequential greedy algorithm, albeit with some sacrifice in the optimality bound [17, 18, 19, 20].
When the set that the agents can choose their policy from is the same for all agents and the agents are homogeneous, the feasible set of (1) can be spanned by a uniform matroid constraint (see Section 3 for a formal definition). For problems with a uniform matroid constraint, the sequential greedy algorithm delivers a suboptimal solution whose utility value is no worse than $(1-1/e)$ times the optimal utility value [14]. On the other hand, when the local policy sets of the agents are disjoint and/or the agents are heterogeneous, and each agent has to choose one policy from its local set, the feasible set of (1) can be spanned by the so-called partitioned matroid constraint (see Section 3 for a formal definition). For problems with a partitioned matroid constraint, the sequential greedy algorithm delivers a suboptimal solution whose utility value is no worse than $1/2$ times the optimal utility value [15]. For problems with a partitioned matroid constraint, more recently, a suboptimal solution with a better optimality gap was proposed in [21, 22, 23, 24] using the multilinear continuous relaxation of the submodular set function. The multilinear continuous relaxation extends a submodular set function defined on the vertices of the $n$-dimensional hypercube to $[0,1]^n$ and makes better optimality bounds for the maximization problem (1) achievable by allowing the use of a continuous gradient-based optimization algorithm. That is, the relaxation transforms the main discrete problem into a continuous optimization problem with linear constraints. The optimality bound achieved by the algorithm proposed in [21] is $(1-1/e)$ for a partitioned matroid constraint, which outperforms the sequential greedy algorithm, whose optimality bound is $1/2$. However, this solution requires a central authority to find the solution. In sensor placement problems, when the agents are self-organizing autonomous mobile agents with communication and computation capabilities, it is desirable to solve the optimization problem (1) in a distributed way, without involving a central authority. A distributed multilinear-extension-based continuous algorithm that uses an average consensus scheme between the agents to solve (1) subject to a partitioned matroid constraint is proposed in [25]. However, that algorithm assumes that access to the exact multilinear extension of the utility function is available, whose calculation is exponentially hard.
In this paper, motivated by the improved optimality gap of multilinear continuous relaxation based algorithms, we develop a distributed implementation of the algorithm of [21] over a connected undirected graph to obtain a suboptimal solution for (1) subject to a partitioned matroid constraint. In this algorithm, to manage the computational cost of constructing the multilinear extension of the utility function, we use a sample-based evaluation of the multilinear extension and propose a gradient-based algorithm that uses a maximum consensus message-passing scheme over the communication graph. Our algorithm uses a fully distributed rounding technique to compute the final solution of each agent. Through rigorous analysis, we show that our proposed distributed algorithm achieves, with a known probability, a $1-1/e-O(1/T)$ optimality bound, where $T$ is the number of times the agents communicate over the network. A numerical example demonstrates our results.
The organization of this paper is as follows. In Section 2 we introduce the notation used throughout the paper. Section 3 provides the necessary background on submodularity and submodular functions. Section 4 states our problem of interest. We introduce our proposed algorithm in Section 5. A demonstration of our results is provided in Section 6. Appendix A gives the proofs of our main results in Section 5. Appendix B gives the auxiliary results that we use in establishing the proofs of our main results.
2 Notation
We denote vectors with bold small font. The $j$th element of vector $\mathbf{x}$ is denoted by $x_j$. We denote the inner product of two vectors $\mathbf{x}$ and $\mathbf{y}$ with appropriate dimensions by $\mathbf{x}^\top\mathbf{y}$. Furthermore, we use $\mathbf{1}$ as a vector of ones with appropriate dimension. We denote sets with capital curly font. For a set $\mathcal{A}$ and a vector $\mathbf{y}\in[0,1]^{|\mathcal{A}|}$, the set $\mathcal{R}_{\mathbf{y}}\subseteq\mathcal{A}$ is a random set where element $j$ of $\mathcal{A}$ is in $\mathcal{R}_{\mathbf{y}}$ with probability $y_j$. Hence, we call such a vector a membership probability vector. Furthermore, for $\mathcal{S}\subseteq\mathcal{A}$, $\mathbf{1}_{\mathcal{S}}$ is the vector whose $j$th element is $1$ if $j\in\mathcal{S}$ and $0$ otherwise; we call $\mathbf{1}_{\mathcal{S}}$ the membership indicator vector of set $\mathcal{S}$. $|x|$ is the absolute value of a scalar $x$. By overloading the notation, we also use $|\mathcal{A}|$ as the cardinality of set $\mathcal{A}$. We denote a graph by $\mathcal{G}=(\mathcal{A},\mathcal{E})$, where $\mathcal{A}$ is the node set and $\mathcal{E}\subseteq\mathcal{A}\times\mathcal{A}$ is the edge set. $\mathcal{G}$ is undirected if $(i,j)\in\mathcal{E}$ if and only if $(j,i)\in\mathcal{E}$, meaning that agents $i$ and $j$ can exchange information. An undirected graph is connected if there is a path from each node to every other node in the graph. We denote the set of the neighboring nodes of node $i$ by $\mathcal{N}_i$. We also use $\mathsf{d}$ to denote the diameter of the graph.
Given a set $\mathcal{A}$ and an element $a$, we define the addition operator as $\mathcal{A}+a=\mathcal{A}\cup\{a\}$. Given a collection of sets $\mathcal{A}_i$, $i\in\{1,\dots,N\}$, whose elements are (policy, probability) couples, we define the max-operation over this collection as $\overline{\max}\{\mathcal{A}_1,\dots,\mathcal{A}_N\}$, which, for each policy appearing in the collection, keeps the couple with the largest probability.
3 Submodular Functions
A set function $f:2^{\mathcal{X}}\to\mathbb{R}$ is monotone increasing if for sets $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{X}$,
\[ f(\mathcal{A}) \le f(\mathcal{B}) \]
holds. Furthermore, defining the marginal gain $\Delta_f(a|\mathcal{A}) = f(\mathcal{A}\cup\{a\}) - f(\mathcal{A})$, the set function $f$ is submodular if for $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{X}$ and $a\in\mathcal{X}\setminus\mathcal{B}$,
\[ \Delta_f(a|\mathcal{A}) \ge \Delta_f(a|\mathcal{B}) \tag{2} \]
holds, which captures the diminishing returns property of the set function. Furthermore, a set function is normalized if $f(\emptyset)=0$.
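As a quick illustration of these definitions, the short Python check below verifies the diminishing returns inequality (2) on a toy coverage utility; the coverage sets are assumptions made for the example, not taken from the paper.

```python
# A toy coverage utility: f(S) = number of targets covered by the sensors in S.
# Coverage functions of this form are normalized, monotone increasing, and
# submodular; the concrete coverage sets below are illustrative assumptions.
coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    """Number of distinct targets covered by the set of sensors S."""
    return len(set().union(*[coverage[i] for i in S])) if S else 0

def marginal(a, S):
    """Marginal gain of adding element a to set S."""
    return f(S | {a}) - f(S)

# Diminishing returns (2): for A ⊆ B and a ∉ B, the gain at A is at least the gain at B.
A, B = {1}, {1, 2}
for a in set(coverage) - B:
    assert marginal(a, A) >= marginal(a, B)
```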
In what follows, without loss of generality, we assume that the ground set $\mathcal{X}$ is equal to $\{1,\dots,n\}$. Submodular functions are functions assigning values to all subsets of a finite set $\mathcal{X}$. Using the set membership indicator, equivalently, we can regard them as functions on the Boolean hypercube $\{0,1\}^n$. For a submodular function $f$, its multilinear extension $F:[0,1]^n\to\mathbb{R}$ in the continuous space is
\[ F(\mathbf{y}) = \sum_{\mathcal{S}\subseteq\mathcal{X}} f(\mathcal{S}) \prod_{j\in\mathcal{S}} y_j \prod_{j\notin\mathcal{S}} (1-y_j). \tag{3} \]
The multilinear extension of set function $f$ is the unique multilinear function agreeing with $f$ on the vertices of the hypercube. The multilinear extension function has the following properties.
Lemma 3.1 (See [21])
If $f$ is non-decreasing, then $\frac{\partial F}{\partial y_j}\ge 0$ for all $j$, everywhere in $[0,1]^n$ (monotonicity of $F$). If $f$ is submodular, then $\frac{\partial^2 F}{\partial y_j\,\partial y_k}\le 0$ for all $j,k$, everywhere in $[0,1]^n$.
An alternative way to perceive $F$ is to observe that it is indeed equivalent to
\[ F(\mathbf{y}) = \mathbb{E}\big[f(\mathcal{R}_{\mathbf{y}})\big], \tag{4} \]
where $\mathcal{R}_{\mathbf{y}}$ is a random set in which each element $j$ appears independently with probability $y_j$. Then, it can be shown that taking the derivatives of $F$ yields
\[ \frac{\partial F}{\partial y_j}(\mathbf{y}) = \mathbb{E}\big[f(\mathcal{R}_{\mathbf{y}}\cup\{j\}) - f(\mathcal{R}_{\mathbf{y}}\setminus\{j\})\big] \tag{5} \]
and
\[ \frac{\partial^2 F}{\partial y_j\,\partial y_k}(\mathbf{y}) = \mathbb{E}\big[f(\mathcal{R}_{\mathbf{y}}\cup\{j,k\}) - f(\mathcal{R}_{\mathbf{y}}\cup\{j\}\setminus\{k\}) - f(\mathcal{R}_{\mathbf{y}}\cup\{k\}\setminus\{j\}) + f(\mathcal{R}_{\mathbf{y}}\setminus\{j,k\})\big]. \tag{6} \]
Constructing $F$ in (3) requires the knowledge of $f(\mathcal{S})$ for all $\mathcal{S}\subseteq\mathcal{X}$, which becomes computationally intractable as the size of the ground set increases. However, with the stochastic interpretation (4) of the multilinear extension and its derivatives, if enough samples are drawn according to $\mathbf{y}$, we can obtain an estimate of $F(\mathbf{y})$ with a reasonable computational cost. We can use the Chernoff-Hoeffding inequality to quantify the quality of this estimate given the number of samples.
Theorem 3.1 (Chernoff-Hoeffding’s inequality [26])
Consider a set of $K$ independent random variables $X^1,\dots,X^K$ with $a\le X^k\le b$ for all $k$. Letting $Y=\frac{1}{K}\sum_{k=1}^{K}X^k$, then
\[ \mathbb{P}\big(|Y-\mathbb{E}[Y]| \ge \delta\big) \le 2\,e^{-2K\delta^2/(b-a)^2}. \]
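As an illustration, the sketch below estimates $F$ per (4) and a partial derivative per (5), and inverts Theorem 3.1 to size the number of samples; the helper names and the value bound `f_max` are assumptions of the sketch.

```python
import math
import random

def sample_set(y, rng=random):
    """Draw R_y: element j is included independently with probability y[j]."""
    return {j for j, p in enumerate(y) if rng.random() < p}

def estimate_F(f, y, K, rng=random):
    """Sample-based estimate of the multilinear extension (4): F(y) = E[f(R_y)]."""
    return sum(f(sample_set(y, rng)) for _ in range(K)) / K

def estimate_partial(f, y, j, K, rng=random):
    """Estimate of dF/dy_j via (5): E[f(R ∪ {j}) - f(R \\ {j})]."""
    total = 0.0
    for _ in range(K):
        R = sample_set(y, rng)
        total += f(R | {j}) - f(R - {j})
    return total / K

def samples_needed(delta, failure_prob, f_max):
    """Invert Theorem 3.1 for variables in [0, f_max]: the smallest K with
    2*exp(-2*K*delta**2 / f_max**2) <= failure_prob."""
    return math.ceil(f_max**2 * math.log(2 / failure_prob) / (2 * delta**2))
```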
Problems such as sensor placement with a limited number of sensors and ample possible placement locations can be formulated as the set value optimization (1) where the feasible set constraint is in the form of a matroid. A matroid is defined as the pair $\mathcal{M}=(\mathcal{X},\mathcal{I})$ with $\mathcal{I}\subseteq 2^{\mathcal{X}}$ such that
• if $\mathcal{B}\in\mathcal{I}$ and $\mathcal{A}\subseteq\mathcal{B}$, then $\mathcal{A}\in\mathcal{I}$;
• if $\mathcal{A},\mathcal{B}\in\mathcal{I}$ and $|\mathcal{A}|<|\mathcal{B}|$, then there exists $b\in\mathcal{B}\setminus\mathcal{A}$ such that $\mathcal{A}\cup\{b\}\in\mathcal{I}$.
One of the well-known matroids is the uniform matroid $\mathcal{I}=\{\mathcal{S}\subseteq\mathcal{X}:|\mathcal{S}|\le\kappa\}$. A sensor placement scenario where $\kappa$ sensors are to be placed at locations selected out of $n$ possible locations can be described by a uniform matroid. On the other hand, when there is a set of agents $\mathcal{A}=\{1,\dots,N\}$, each with multiple sensors $\kappa_i$, $i\in\mathcal{A}$, that they can place in a specific and non-overlapping part $\mathcal{X}_i$ of a discrete space, this constraint can be described by the partitioned matroid $\mathcal{I}=\{\mathcal{S}\subseteq\mathcal{X}:|\mathcal{S}\cap\mathcal{X}_i|\le\kappa_i,\ i\in\mathcal{A}\}$, where $\mathcal{X}=\bigcup_{i\in\mathcal{A}}\mathcal{X}_i$ with $\mathcal{X}_i\cap\mathcal{X}_j=\emptyset$ for $i\neq j$.
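A minimal Python sketch of the two feasibility tests just described; the six-policy ground set and its split between two agents are assumptions made for illustration.

```python
# Feasibility tests for the two matroids above; the split of six policies
# between two agents is an assumed toy instance.
blocks = {1: {0, 1, 2}, 2: {3, 4, 5}}   # disjoint local policy sets

def uniform_feasible(S, kappa):
    """Uniform matroid: at most kappa elements in total."""
    return len(S) <= kappa

def partition_feasible(S, blocks, kappa_i=1):
    """Partition matroid: at most kappa_i elements from each block."""
    return all(len(S & block) <= kappa_i for block in blocks.values())

print(partition_feasible({0, 3}, blocks))  # True: one policy per agent
print(partition_feasible({0, 1}, blocks))  # False: two policies from agent 1
```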
A matroid polytope is a convex hull defined as
\[ \mathcal{P}(\mathcal{M}) = \Big\{\mathbf{y}\in\mathbb{R}^n_{\ge 0} : \sum_{j\in\mathcal{S}} y_j \le \mathrm{rk}(\mathcal{S}),\ \forall\,\mathcal{S}\subseteq\mathcal{X}\Big\}, \]
with $\mathrm{rk}$ the rank function of the matroid defined as
\[ \mathrm{rk}(\mathcal{S}) = \max\{|\mathcal{A}| : \mathcal{A}\subseteq\mathcal{S},\ \mathcal{A}\in\mathcal{I}\}. \]
For the partition matroid with $\kappa_i=1$, which is the focus of this paper, the convex hull is such that for $\mathbf{y}\in[0,1]^n$ with entries associated with the membership probabilities of the policies in $\mathcal{X}$, we have $\sum_{j\in\mathcal{X}_i} y_j \le 1$, i.e.,
\[ \mathcal{P}(\mathcal{M}) = \Big\{\mathbf{y}\in[0,1]^n : \sum_{j\in\mathcal{X}_i} y_j \le 1,\ i\in\mathcal{A}\Big\}. \tag{7} \]
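For instance, for an assumed toy instance of two agents with two policies each ($\mathcal{X}_1=\{1,2\}$, $\mathcal{X}_2=\{3,4\}$), (7) reads:

```latex
% Partition matroid polytope (7), two agents with two policies each (toy instance)
\mathcal{P}(\mathcal{M}) = \bigl\{ \mathbf{y}\in[0,1]^{4} :
    y_{1}+y_{2}\le 1,\;\; y_{3}+y_{4}\le 1 \bigr\}
```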
[Figure 1: An example multi-agent sensor placement scenario over a discrete set of candidate locations, where each agent selects one placement policy from its local policy set.]
4 Problem Statement
We consider a group of $N$ agents with communication and computation capabilities, interacting over a connected undirected graph $\mathcal{G}$. These agents aim to solve, in a distributed manner, the optimization problem
\[ \max_{\mathcal{P}\subseteq\bar{\mathcal{P}}}\; f(\mathcal{P}) \tag{8a} \]
\[ \text{s.t.}\quad |\mathcal{P}\cap\mathcal{P}_i| = 1, \quad i\in\{1,\dots,N\}, \tag{8b} \]
where the utility function $f:2^{\bar{\mathcal{P}}}\to\mathbb{R}_{\ge 0}$ is a monotone increasing and submodular set function over the discrete policy space $\bar{\mathcal{P}}=\bigcup_{i=1}^{N}\mathcal{P}_i$, with $\mathcal{P}_i$ being the policy space of agent $i$, which is only known to agent $i$. In this paper, we work in the value oracle model, where the only access to the utility function is through a black box returning $f(\mathcal{S})$ for a given set $\mathcal{S}$. Every agent can obtain the value of the utility function at any subset $\mathcal{S}\subseteq\bar{\mathcal{P}}$. The constraint (8b) is the partitioned matroid, which ensures that only one policy per agent is selected from each local policy set $\mathcal{P}_i$, $i\in\{1,\dots,N\}$. An example application scenario of our problem of interest is shown in Fig. 1. Without loss of generality, to simplify the notation, we assume that the policy space is $\bar{\mathcal{P}}=\{1,\dots,n\}$ and is sorted agent-wise, with the policies of agent 1 coming first, those of agent 2 next, and so on.
Finding the optimal solution of (8), even in central form, is NP-hard [28]. The computational time of finding the optimizer set increases exponentially with $n$ [29]. The well-known sequential greedy algorithm finds a suboptimal solution for (8) with the optimality bound of $1/2$, i.e., a $1/2$-approximation in the worst case [28]. More recently, by using the multilinear extension of the submodular utility function, Vondrák [21] developed a randomized centralized continuous greedy algorithm that achieves a $(1-1/e)$-approximation for the set value optimization (8) in the value oracle model; see Algorithm 1. (Pipage rounding is a polynomial-time algorithm that moves a fractional point on a hypercube to an integral point on the same hypercube without decreasing the value of the multilinear extension.) Our objective in this paper is to develop a distributed implementation of Algorithm 1 to solve (8) for when the agents interact over a connected graph $\mathcal{G}$. Recall that in our problem setting, every agent can evaluate the utility function for a given set but has access only to its own policy space $\mathcal{P}_i$.
5 A Polynomial-Time Distributed Multi-Agent Randomized Continuous Greedy Algorithm
Our proposed distributed multi-agent randomized continuous greedy algorithm to solve the set value optimization problem (8) over a connected graph is given in Algorithm 2, whose convergence guarantee and suboptimality gap are given in Theorem 5.1 below. To provide insight into the construction of Algorithm 2, we first review the idea behind the central suboptimal solution via Algorithm 1, following [21]. We also provide some intermediate results that will be useful in establishing our results.
5.1 A short overview of the central continuous greedy process
As we mentioned earlier, the continuous multilinear extension is a relaxation strategy that extends a submodular function $f$, defined on the vertices of the $n$-dimensional hypercube $\{0,1\}^n$, to a continuous multilinear function $F$ defined on $[0,1]^n$. The two functions evaluate identically for any vector that is the membership indicator vector of a set. Then, by way of a process that runs continuously, depending only on local properties of $F$, we can produce a point $\mathbf{y}(1)\in\mathcal{P}(\mathcal{M})$ to approximate the optimum $\max_{\mathbf{y}\in\mathcal{P}(\mathcal{M})}F(\mathbf{y})$ (here recall (7)). The proposal in [21] is to move in the direction of a vector constrained by $\mathcal{P}(\mathcal{M})$ which maximizes the local gain. To understand the logic behind Algorithm 1, let us review the conceptual steps of this continuous greedy process. [21] views the process as a particle starting at $\mathbf{y}(0)=\mathbf{0}$ and following the flow
\[ \frac{d\mathbf{y}}{dt} = \mathbf{v}_{\max}(\mathbf{y}), \qquad \mathbf{v}_{\max}(\mathbf{y}) = \arg\max_{\mathbf{v}\in\mathcal{P}(\mathcal{M})} \mathbf{v}^\top \nabla F(\mathbf{y}), \tag{9} \]
over a unit time interval $t\in[0,1]$. We note that $\mathbf{y}(t)$ for $t\in[0,1]$ is contained in $\mathcal{P}(\mathcal{M})$, since it is a convex combination of vectors in $\mathcal{P}(\mathcal{M})$. Now consider a point $\mathbf{y}(t)$ along the trajectory of our flow (9) and assume that $\mathbf{y}^\star$ is the true optimum, $\mathsf{OPT}=F(\mathbf{y}^\star)$. Now consider the direction $\mathbf{v}^\star=(\mathbf{y}^\star-\mathbf{y}(t))\vee\mathbf{0}$, which is a nonnegative vector. Because $\mathbf{v}^\star\le\mathbf{y}^\star$ element-wise, then $\mathbf{v}^\star\in\mathcal{P}(\mathcal{M})$. By virtue of Lemma 3.1, $F$ is monotone increasing; therefore, using $\mathbf{y}(t)+\mathbf{v}^\star\ge\mathbf{y}^\star$ element-wise, we have $F(\mathbf{y}(t)+\mathbf{v}^\star)\ge\mathsf{OPT}$. However, $\mathbf{y}(t)+\mathbf{v}^\star$ does not necessarily belong to $\mathcal{P}(\mathcal{M})$. Therefore, let us consider $g(\xi)=F(\mathbf{y}(t)+\xi\,\mathbf{v}^\star)$ for $\xi\in[0,1]$. It can be shown that $g$ is concave in $\xi$, i.e., $\frac{dg}{d\xi}$ is non-increasing. Thus, it can be established that $F(\mathbf{y}(t)+\mathbf{v}^\star)-F(\mathbf{y}(t))\le\mathbf{v}^{\star\top}\nabla F(\mathbf{y}(t))$. But, since $\mathbf{v}^\star\in\mathcal{P}(\mathcal{M})$, and $\mathbf{v}_{\max}(\mathbf{y}(t))$, which is used to generate the flow, maximizes $\mathbf{v}^\top\nabla F(\mathbf{y}(t))$, we can write $\mathbf{v}_{\max}(\mathbf{y}(t))^\top\nabla F(\mathbf{y}(t))\ge\mathbf{v}^{\star\top}\nabla F(\mathbf{y}(t))\ge\mathsf{OPT}-F(\mathbf{y}(t))$. Now we note that $\frac{d}{dt}F(\mathbf{y}(t))=\mathbf{v}_{\max}(\mathbf{y}(t))^\top\nabla F(\mathbf{y}(t))$. Therefore, given $\frac{d}{dt}F(\mathbf{y}(t))\ge\mathsf{OPT}-F(\mathbf{y}(t))$, using the Comparison Lemma [30], we can conclude that $F(\mathbf{y}(t))\ge(1-e^{-t})\,\mathsf{OPT}$, and thus $F(\mathbf{y}(1))\ge(1-1/e)\,\mathsf{OPT}$. In the second stage of the algorithm, the fractional solution is rounded to a point in $\{0,1\}^n$ by use of the pipage rounding method; see [31] for more details about pipage rounding. The aforementioned exposition is the conceptual design behind the continuous greedy process. Algorithm 1 is a practical implementation achieved by the use of the numerical iterative process
\[ \mathbf{y}\big(t+\tfrac{1}{T}\big) = \mathbf{y}(t) + \tfrac{1}{T}\,\mathbf{v}_{\max}(\mathbf{y}(t)), \tag{10} \]
and the use of sampling to compute $\nabla F$ and consequently $\mathbf{v}_{\max}$; see [21] for more details. In what follows, we explain a practical distributed implementation of the continuous greedy process, which is realized as Algorithm 2 and is inspired by this central solution.
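To make the preceding process concrete, the following minimal Python sketch implements the discretized continuous greedy under our reading of (10) and (14): sampled gradients, a per-block vertex choice (the linear objective over the partition matroid polytope (7) is maximized at a vertex), and a step of size $1/T$. It is a sketch under these assumptions, not the implementation of [21].

```python
import random

def continuous_greedy(f, blocks, n, T, K, rng=random):
    """Sketch of the centralized continuous greedy: T steps of (10) with the
    sampled gradient (14); `blocks` maps each agent to its disjoint policy set."""
    y = [0.0] * n
    for _ in range(T):
        # Estimate dF/dy_j for every policy j by averaging K sampled marginals (5).
        grad = [0.0] * n
        for _ in range(K):
            R = {j for j in range(n) if rng.random() < y[j]}  # R ~ y, as in (4)
            for j in range(n):
                grad[j] += f(R | {j}) - f(R - {j})
        grad = [g / K for g in grad]
        # The linear objective over the polytope (7) is maximized at a vertex:
        # in each block, put the whole 1/T step on the best policy.
        for P_i in blocks.values():
            best = max(P_i, key=lambda j: grad[j])
            y[best] += 1.0 / T
    return y  # fractional point in P(M); a rounding stage follows in Algorithm 1
```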
5.2 Design and analysis of the distributed continuous greedy process
We start off our design and analysis of the distributed continuous greedy process by introducing our notation and the set of computations that the agents carry out locally using their local information and interaction with their neighbors. The algorithm is performed over a finite time horizon of $T$ steps. Let $\mathcal{H}_i(t)$ be the information set of agent $i$ at time step $t$, initialized at $\mathcal{H}_i(0)=\emptyset$. For any couple $(p,\gamma)\in\mathcal{H}_i(t)$, the first element is a policy and the second element is the corresponding membership probability. We let $\mathbf{y}^i(t)\in[0,1]^n$ (recall $n=|\bar{\mathcal{P}}|$) be the local copy of the membership probability vector of our suboptimal solution of (8) at agent $i$ at time step $t$, defined according to
\[ y^i_j(t) = \begin{cases} \gamma, & \text{if } (j,\gamma)\in\mathcal{H}_i(t),\\ 0, & \text{otherwise}. \end{cases} \tag{11} \]
Recall that $\bar{\mathcal{P}}=\{1,\dots,n\}$ and it is sorted agent-wise. Hence, $\mathbf{y}^i=(\mathbf{y}^i_1;\dots;\mathbf{y}^i_N)$, where $\mathbf{y}^i_i$ is the membership probability vector over agent $i$'s own policies at iteration $t$, while $\mathbf{y}^i_j$, $j\neq i$, is the local estimate of the membership probability vector of agent $j$ by agent $i$. At time step $t$, agent $i$ solves the optimization problem
\[ \mathbf{v}^i(t) = \arg\max_{\mathbf{v}\in\mathcal{P}(\mathcal{M}_i)} \; \mathbf{v}^\top\, \widetilde{\nabla F}(\mathbf{y}^i(t)), \tag{12} \]
with $\mathcal{P}(\mathcal{M}_i)$ the slice of the polytope (7) corresponding to agent $i$'s own policies, and
\[ \widetilde{\nabla F}(\mathbf{y}) = \big(\widetilde{\partial_1 F}(\mathbf{y}), \dots, \widetilde{\partial_n F}(\mathbf{y})\big)^\top. \tag{13} \]
The term $\widetilde{\nabla F}(\mathbf{y}^i(t))$ is the estimate of $\nabla F(\mathbf{y}^i(t))$, which is calculated by taking $K$ samples of sets according to the membership probability vector $\mathbf{y}^i(t)$; recall (5). Hence, $\nabla F$ is estimated by averaging over the $K$ samples. We denote the $j$th element of $\widetilde{\nabla F}(\mathbf{y})$ by $\widetilde{\partial_j F}(\mathbf{y})$, and represent it by
\[ \widetilde{\partial_j F}(\mathbf{y}) = \frac{1}{K}\sum_{k=1}^{K}\Big(f\big(\mathcal{R}_k\cup\{j\}\big)-f\big(\mathcal{R}_k\setminus\{j\}\big)\Big), \qquad \mathcal{R}_k\sim\mathbf{y}. \tag{14} \]
We note that, given the definition of $\mathbf{v}^i(t)$ in (12), to compute (12) we only need $\widetilde{\partial_j F}$ for $j\in\mathcal{P}_i$.
Remark 5.1 (Local computation of $\widetilde{\partial_j F}$, $j\in\mathcal{P}_i$, by agent $i$)
Given the definition (11), we note that $\widetilde{\partial_j F}(\mathbf{y}^i(t))$, an estimate of $\frac{\partial F}{\partial y_j}(\mathbf{y}^i(t))$, can be obtained by drawing $K$ sample policy sets $\mathcal{R}_1,\dots,\mathcal{R}_K$ such that policy $l\in\bar{\mathcal{P}}$ is in $\mathcal{R}_k$ with probability $y^i_l(t)$, and using
\[ \widetilde{\partial_j F}(\mathbf{y}^i(t)) = \frac{1}{K}\sum_{k=1}^{K}\Big(f\big(\mathcal{R}_k\cup\{j\}\big)-f\big(\mathcal{R}_k\setminus\{j\}\big)\Big). \tag{15} \]
Let each agent propagate its local variable according to
\[ \bar{\mathbf{y}}^i(t+1) = \mathbf{y}^i(t) + \frac{1}{T}\,\mathbf{v}^i(t). \tag{16} \]
One can make a conceptual connection between (16) and the practical implementation (10) of the flow (9) discussed earlier. Because the propagation (16) is based only on the local information of agent $i$, next, each agent, by interacting with its neighbors, updates its propagated $\bar{\mathbf{y}}^i$ by element-wise maximum seeking
\[ \mathbf{y}^i(t+1) = \max\big\{\bar{\mathbf{y}}^j(t+1) : j\in\mathcal{N}_i\cup\{i\}\big\} \quad \text{(element-wise)}. \tag{17} \]
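A minimal Python sketch of one round of (16) followed by (17); the three-agent line graph and the numerical values are assumptions for illustration.

```python
def propagate(y_i, v_i, T):
    """Local propagation (16): move 1/T along the chosen vertex direction v^i."""
    return [p + v / T for p, v in zip(y_i, v_i)]

def max_consensus(y_self, y_neighbors):
    """Update (17): element-wise maximum over own and neighbors' copies."""
    return [max(vals) for vals in zip(y_self, *y_neighbors)]

# One round on a 3-agent line graph (toy values, purely illustrative):
T = 10
y1 = propagate([0.1, 0.0], [1, 0], T)   # agent 1 pushes mass onto policy 0
y2, y3 = [0.1, 0.3], [0.0, 0.1]
print(max_consensus(y2, [y1, y3]))      # agent 2 hears 1 and 3 -> [0.2, 0.3]
```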
Lemma 5.1 below shows that, as one expects, $\mathbf{y}^i_i(t+1)=\bar{\mathbf{y}}^i_i(t+1)$, i.e., the corrected component of $\mathbf{y}^i$ corresponding to agent $i$ itself is the propagated value maintained at agent $i$, and not the estimated value of any of its neighbors.
Lemma 5.1
Let the agents follow the propagation and update rules (16) and (17). Then, for all $t$ and all $i,j\in\mathcal{A}$, $y^i_i(t)\ge y^j_i(t)$; in particular, $\mathbf{y}^i_i(t+1)=\bar{\mathbf{y}}^i_i(t+1)$.
In the distributed Algorithm 2, at each time step, each agent has its own belief about the probabilities of the policies, which is not necessarily the same as the beliefs of the other agents. The following result establishes the difference between the beliefs of the agents.
Proposition 5.1
Let the agents follow the distributed Algorithm 2. Then, the vectorized membership probabilities $\mathbf{y}^i(t)$ for each agent $i$ realized from (16) and (17) satisfy
(19a)
(19b)
(19c)
The proof of Proposition 5.1 is given in Appendix A. Because $f$ is normalized and monotone increasing, we have the guarantee that the entries of $\widetilde{\nabla F}$ are nonnegative. Therefore, without loss of generality, we know that one realization of $\mathbf{v}^i(t)$ in (12) corresponds to $\mathbf{1}_{\{p^\star_i(t)\}}$, where
\[ p^\star_i(t) = \arg\max_{p\in\mathcal{P}_i} \widetilde{\partial_p F}(\mathbf{y}^i(t)). \tag{20} \]
Next, let each agent propagate its information set according to
\[ \bar{\mathcal{H}}_i(t+1) = \mathcal{H}_i(t) + \big(p^\star_i(t),\, \bar{y}^i_{p^\star_i(t)}(t+1)\big), \tag{21} \]
and update it using a local interaction with its neighbors according to
\[ \mathcal{H}_i(t+1) = \overline{\max}\big\{\bar{\mathcal{H}}_j(t+1) : j\in\mathcal{N}_i\cup\{i\}\big\}. \tag{22} \]
By the definitions of the addition and max-operators, we have the guarantee that if $(p,\gamma)\in\mathcal{H}_i(t)$, then there exists no $(p,\gamma')\in\mathcal{H}_i(t+1)$ with $\gamma'<\gamma$; that is, the recorded membership probabilities never decrease.
Lemma 5.2
Initialized by $\mathcal{H}_i(0)=\emptyset$ and $\mathbf{y}^i(0)=\mathbf{0}$, the rules (20), (21), and (22), where $\widetilde{\partial_j F}$ is computed via (15), constitute a distributed iterative process, formally stated as Algorithm 2, that runs for $T$ steps. At the end of these steps, as stated in Algorithm 2, each agent $i$ obtains its suboptimal policy by sampling one policy from its local policy set $\mathcal{P}_i$, with the probabilities given by its own block of the final membership probability vector; a sketch of this sampling step follows.
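A minimal Python sketch of this per-agent randomized rounding; the renormalization guard is an assumption made for the sketch (it absorbs Monte Carlo round-off in the final probabilities).

```python
import random

def round_policy(policies, probabilities, rng=random):
    """Each agent samples one policy from its own block with the final
    membership probabilities (a sketch of the last step of Algorithm 2)."""
    total = sum(probabilities)
    weights = [p / total for p in probabilities]  # guard against round-off
    return rng.choices(policies, weights=weights, k=1)[0]

# e.g. an agent with local policies {3, 4, 5} and final block probabilities:
print(round_policy([3, 4, 5], [0.2, 0.5, 0.3]))
```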
The following result, whose proof is available in Appendix A, gives the convergence guarantee and suboptimality gap of Algorithm 2.
Theorem 5.1 (Convergence guarantee and suboptimality gap of Algorithm 2)
By simplifying the probability statement and dropping the higher-order terms, the optimality gap guarantee of Theorem 5.1 holds with a known probability that improves as $T$ and $K$, the number of samples collected by the agents, increase.
Remark 5.2 (Extra communication for improved optimality gap)
Replacing the update step (17) with $\mathsf{d}$ recursive applications of the element-wise maximum update, i.e., starting from the propagated variables and recursively repeating the update step (17) on the output of the previous recursion $\mathsf{d}$ times, each agent arrives at the network-wide maximum at every time step (recall Lemma 5.1). Hence, for this revised implementation, following the proof of Theorem 5.1, we observe that the disagreement bound in (33) tightens accordingly, which consequently leads to
(23)
with the probability stated in Theorem 5.1. This improved optimality gap is achieved at the cost of extra communication per agent at each time step. The optimality bound (23) is the same bound achieved by the centralized algorithm of [21]. To implement this revision, step 11 of Algorithm 2 (equivalent to (22)) should be replaced by the recursive application of the max-operation, i.e.,
(24)
[Figure 2: The sensor placement field with the points of interest and the ring communication graph; panels (a)-(f) show six possible priority sequences for the sequential greedy algorithm, panel (g) shows the deployment obtained by Algorithm 2, and panel (h) shows the sequential greedy deployment for the sequence of Case 1.]
6 Numerical Example
Consider the multi-agent sensor placement problem introduced in Fig. 1 for a group of agents whose local policy sets all coincide with the full set of candidate locations, i.e., each agent is able to move to any of the sensor placement points. This sensor placement problem is cast as optimization problem (8). The field is a 6 unit by 6 unit square and the feasible sensor locations are the 6 by 6 grid in the center square of the field; see Fig. 2. The points of interest are spread around the map (small red dots in Fig. 2). The sensing zones of the agents are circles of different radii. The agents communicate over a ring graph, as shown in Fig. 2. We first solve this problem using our proposed distributed Algorithm 2. The results of the simulation for different iteration numbers $T$ and sampling numbers $K$ are shown in Table 1. The algorithm produces good results at a modest number of iterations and samples. Fig. 2(g) shows the result of the deployment using Algorithm 2. Next, we solve this problem using the sequential greedy algorithm [29] in a decentralized way by first choosing a route that visits all the agents, and then having the agents share their information sequentially along this route. Figure 2(a)-(f) gives 6 possible sequences, denoted by the semi-circular arrow inside the networks. The results of running the sequential greedy algorithm over the sequences in Fig. 2(a)-(f) are shown in Table 2.
| $T$ \ $K$ | 10000 | 500 | 100 | 50 | 10 | 5 | 1 |
|---|---|---|---|---|---|---|---|
| 100 | 768 | 768 | 718 | 710 | 718 | 716 | 696 |
| 20 | 768 | 768 | 718 | 710 | 726 | 716 | 696 |
| 10 | 661 | 640 | 657 | 640 | 634 | 602 | 551 |
| 5 | 630 | 630 | 634 | 626 | 583 | 608 | 540 |
| 1 | 456 | 456 | 456 | 456 | 456 | 456 | 456 |

Table 1: Utility values obtained by Algorithm 2 for different numbers of iterations $T$ (rows) and samples $K$ (columns).
| Case | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Utility | 634 | 704 | 699 | 640 | 767 | 760 |

Table 2: Utility values obtained by the sequential greedy algorithm over the six priority sequences in Fig. 2(a)-(f).
What stands out about the sequential greedy algorithm is that the choice of sequence can affect the outcome of the algorithm significantly. We can attribute this inconsistency to the heterogeneity of the sensors' measurement zones. We can see that when the sensor with the largest sensing zone is given the last priority to make its choice, the sequential greedy algorithm performs poorly. This can be explained by agents with smaller sensing zones picking high-density areas but not being able to cover them fully; see Fig. 2(h), which depicts the outcome of the sequential greedy algorithm using the sequence in Case 1. A simpler example justifying this observation is shown in Fig. 3, with two disjoint clusters of points and two sensors. One may suggest sequencing the agents from the largest to the smallest sensing zone; however, this is not necessarily the best choice, as we can see in Table 2: the utility of Case 6 is less than that of Case 5 (the conjecture of sequencing the agents from strongest to weakest is not valid). Moreover, this ordering may lead to a very long route over the communication graph. Interestingly, this inconsistency does not appear in the solutions of Algorithm 2, where the agents intrinsically overcome the importance of a sequence by deciding the positions of the sensors over a time horizon of communication and exchange of their information sets.
[Figure 3: A simpler example with two disjoint clusters of points of interest and two heterogeneous sensors, illustrating how the choice of sequence affects the sequential greedy outcome.]
7 Conclusion
We proposed a distributed suboptimal algorithm to solve the problem of maximizing a monotone increasing and submodular set function subject to a partitioned matroid constraint. Our problem of interest was motivated by optimal multi-agent sensor placement problems in discrete space. Our algorithm was a practical decentralization of a multilinear-extension-based algorithm that achieves a $(1-1/e-O(1/T))$ optimality gap, which is an improvement over the $1/2$ optimality gap that the well-known sequential greedy algorithm achieves. In our numerical example, we compared the outcome obtained by our proposed algorithm with that of a decentralized sequential greedy algorithm constructed by assigning a priority sequence to the agents. We showed that the outcome of the sequential greedy algorithm is inconsistent and depends on the sequence. In contrast, our algorithm's outcome, due to its iterative nature, intrinsically tended to be consistent, which in some ways also explains its better optimality gap over the sequential greedy algorithm. Our future work is to study the robustness of our proposed algorithm to message dropout.
Acknowledgments
This work is supported by NSF award IIS-SAS-1724331.
References
- [1] S. Martínez and F. Bullo, “Optimal sensor placement and motion coordination for target tracking,” Automatica, vol. 42, no. 4, pp. 661–668, 2006.
- [2] N. Zhou, C. Cassandras, X. Yu, and S. Andersson, “Decentralized event-driven algorithms for multi-agent persistent monitoring tasks,” in IEEE Int. Conf. on Decision and Control, pp. 4064–4069, 2017.
- [3] Y. Khazaeni and C. Cassandras, “Event-driven trajectory optimization for data harvesting in multi-agent systems,” IEEE Tran. on Control of Network Systems, 2017.
- [4] N. Zhou, X. Yu, S. Andersson, and C. Cassandras, “Optimal event-driven multi-agent persistent monitoring of a finite set of data sources,” IEEE Tran. on Automatic Control, 2018.
- [5] N. Rezazadeh and S. S. Kia, “A sub-modular receding horizon approach to persistent monitoring for a group of mobile agents over an urban area,” IFAC-PapersOnLine, vol. 52, no. 20, pp. 217–222, 2019.
- [6] X. Duan, M. George, R. Patel, and F. Bullo, “Robotic surveillance based on the meeting time of random walks,” IEEE Tran. on Robotics and Automation, 2020.
- [7] S. Welikala and C. Cassandras, “Event-driven receding horizon control for distributed estimation in network systems,” arXiv preprint arXiv:2009.11958, 2020.
- [8] S. Alamdari, E. Fata, and L. Smith, “Persistent monitoring in discrete environments: Minimizing the maximum weighted latency between observations,” The International Journal of Robotics Research, vol. 33, no. 1, pp. 138–154, 2014.
- [9] N. Mehr and R. Horowitz, “A submodular approach for optimal sensor placement in traffic networks,” in 2018 Annual American Control Conference (ACC), pp. 6353–6358, IEEE, 2018.
- [10] A. Hashemi, M. Ghasemi, H. Vikalo, and U. Topcu, “Randomized greedy sensor selection: Leveraging weak submodularity,” IEEE Transactions on Automatic Control, 2020.
- [11] V. Tzoumas, A. Jadbabaie, and G. J. Pappas, “Sensor placement for optimal kalman filtering: Fundamental limits, submodularity, and algorithms,” in American Control Conference, pp. 191–196, IEEE, 2016.
- [12] M. Shamaiah, S. Banerjee, and H. Vikalo, “Greedy sensor selection: Leveraging submodularity,” in IEEE conference on decision and control, pp. 2572–2577, 2010.
- [13] S. T. Jawaid and S. Smith, “Submodularity and greedy algorithms in sensor scheduling for linear dynamical systems,” Automatica, vol. 61, pp. 282–288, 2015.
- [14] G. Nemhauser, L. Wolsey, and M. Fisher, “An analysis of approximations for maximizing submodular set functions—i,” Mathematical programming, vol. 14, no. 1, pp. 265–294, 1978.
- [15] M. L. Fisher, G. Nemhauser, and L. Wolsey, “An analysis of approximations for maximizing submodular set functions—ii,” in Polyhedral combinatorics, pp. 73–87, Springer, 1978.
- [16] K. Wei, R. Iyer, and J. Bilmes, “Fast multi-stage submodular maximization,” in International conference on machine learning, pp. 1494–1502, PMLR, 2014.
- [17] B. Mirzasoleiman, A. Karbasi, R. Sarkar, and A. Krause, “Distributed submodular maximization: Identifying representative elements in massive data,” in Advances in Neural Information Processing Systems, pp. 2049–2057, 2013.
- [18] B. Mirzasoleiman, M. Zadimoghaddam, and A. Karbasi, “Fast distributed submodular cover: Public-private data summarization,” in Advances in Neural Information Processing Systems, pp. 3594–3602, 2016.
- [19] R. Kumar, B. Moseley, S. Vassilvitskii, and A. Vattani, “Fast greedy algorithms in mapreduce and streaming,” ACM Transactions on Parallel Computing, vol. 2, no. 3, pp. 1–22, 2015.
- [20] P. S. Raut, O. Sadeghi, and M. Fazel, “Online dr-submodular maximization with stochastic cumulative constraints,” arXiv preprint arXiv:2005.14708, 2020.
- [21] J. Vondrák, “Optimal approximation for the submodular welfare problem in the value oracle model,” in Proceedings of the fortieth annual ACM symposium on Theory of computing, pp. 67–74, 2008.
- [22] A. A. Bian, B. Mirzasoleiman, J. Buhmann, and A. Krause, “Guaranteed non-convex optimization: Submodular maximization over continuous domains,” in Artificial Intelligence and Statistics, pp. 111–120, 2017.
- [23] A. Mokhtari, H. Hassani, and A. Karbasi, “Stochastic conditional gradient methods: From convex minimization to submodular maximization,” Journal of Machine Learning Research, vol. 21, no. 105, pp. 1–49, 2020.
- [24] O. Sadeghi and M. Fazel, “Online continuous dr-submodular maximization with long-term budget constraints,” in International Conference on Artificial Intelligence and Statistics, pp. 4410–4419, 2020.
- [25] A. Robey, A. Adibi, B. Schlotfeldt, G. J. Pappas, and H. Hassani, “Optimal algorithms for submodular maximization with distributed constraints,” arXiv preprint arXiv:1909.13676, 2019.
- [26] W. Hoeffding, “Probability inequalities for sums of bounded random variables,” in The Collected Works of Wassily Hoeffding, pp. 409–426, Springer, 1994.
- [27] C. Chekuri and A. Kumar, “Maximum coverage problem with group budget constraints and applications,” in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pp. 72–83, Springer, 2004.
- [28] M. Conforti and G. Cornuéjols, “Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the rado-edmonds theorem,” Discrete applied mathematics, vol. 7, no. 3, pp. 251–274, 1984.
- [29] N. Rezazadeh and S. S. Kia, “A sub-modular receding horizon solution for mobile multi-agent persistent monitoring,” arXiv preprint arXiv:1908.04425, 2019.
- [30] H. K. Khalil and J. W. Grizzle, Nonlinear systems, vol. 3. Prentice hall Upper Saddle River, NJ, 2002.
- [31] A. A. Ageev and M. I. Sviridenko, “Pipage rounding: A new method of constructing algorithms with proven performance guarantee,” Journal of Combinatorial Optimization, vol. 8, no. 3, pp. 307–328, 2004.
Appendix A: Proof of the Results in Section 5
-
Proof of Lemma 5.1
Since $f$ is a monotone increasing and submodular set function, we have $\frac{\partial F}{\partial y_j}\ge 0$, and hence $\widetilde{\nabla F}$ has nonnegative entries. This results in the optimization (12), subject to its constraint set, outputting a vector $\mathbf{v}^i$ whose entries are greater than or equal to zero. Hence, according to the propagation and update rules (16) and (17), we can conclude that $\mathbf{y}^i_i$ has non-decreasing entries, and only agent $i$ can update it while the other agents only copy this value as $\mathbf{y}^j_i$. Therefore, we can conclude that $y^i_i(t)\ge y^j_i(t)$ for all $j$ and $t$, which concludes the proof.
-
Proof of Proposition 5.1
$f$ is a monotone increasing and submodular set function; therefore $\frac{\partial F}{\partial y_j}\ge 0$, and hence $\widetilde{\nabla F}$ has nonnegative entries. Then, because of the constraint set of (12), it follows that $\mathbf{v}^i(t)$ has non-negative entries which sum to at most one. Therefore, it follows from (16) and Lemma 5.1 that
(25) Using (25), we can also write
(26) Furthermore, it follows from Lemma 5.1 that for all $t$ and any $i,j\in\mathcal{A}$ we can write
(27) Also, since every agent can be reached from any other agent in at most $\mathsf{d}$ hops, it follows from the propagation and update laws (16) and (17), for all $t$ and any $i,j\in\mathcal{A}$, that
(28) Thus, for any $i,j\in\mathcal{A}$ and $t$, (27) and (28) result in
(29) Next, we can use (26) and (29) to write
(30) for all $i\in\mathcal{A}$ and $t$. Using (30), for any $t$ we can write
(31) Then, using Lemma 5.1, from (31) we can write
which ascertains (19a). Next, using (16) and invoking Lemma 5.1, we obtain (19b), which, given (25), also ascertains (19c).
-
Proof of Theorem 5.1
Using the bound from Lemma 7.2 and (19c), it follows from Lemma 7.3 that
which, given (19b), leads to
(32) Next, given (19a), by invoking Lemma 7.3, for any $i\in\mathcal{A}$ we can write
(33) for $i\in\mathcal{A}$. Recall that at each time step $t$, the realization of $\mathbf{v}^i(t)$ in (12) that Algorithm 2 uses is
(34) for every $i\in\mathcal{A}$. Consequently, using (33), we can write
(35) Next, we introduce two auxiliary quantities.
Because $f$ is monotone increasing, by virtue of Lemma 3.1 the gradient of $F$ is element-wise nonnegative, and the corresponding inner products are ordered accordingly. Therefore, using these relations and (33), we can also write
(36a)
(36b)
On the other hand, by virtue of Lemma 7.4, the estimate $\widetilde{\nabla F}$ that each agent uses to solve optimization problem (20) (equivalently (12)) satisfies
(37)
with the probability given in Lemma 7.4. Using (36b) and (37), and also that the samples are drawn independently, we obtain
(38a)
(38b)
with the corresponding probability.
Next, consider the projection of the optimizer's membership indicator vector onto the agents' local policy spaces. Knowing that the local policy sets $\mathcal{P}_i$ are disjoint and cover the whole ground set, we can write
(40)
Then, using (39), (40), and invoking Lemma 7.1, we obtain
(41)
with the stated probability. Hence, using (32) and (41), we conclude that
(42)
with the stated probability.
Next, with a change of variables, we rewrite (42) as
(43)
Then, from inequality (43), we get
(44)
with the stated probability. Solving inequality (44) at time $T$ yields
(45)
with the stated probability. Substituting the original variables back in (45), we then obtain
(46)
with the stated probability. By applying the bound $(1-\frac{1}{T})^{T}\le e^{-1}$, we get
(47)
with the stated probability.
Given (34), from the propagation and update rules (16) and (17) and Lemma 5.1, we can conclude that the final membership probability vectors agree on each agent's own block. Furthermore, define $\mathcal{R}$ to be a random set where each member is sampled according to the final membership probability vector. Since each agent selects exactly one policy, we can also define $\mathcal{R}_i$ to be a random set where only one policy is sampled from $\mathcal{P}_i$ according to the corresponding block of the final membership probability vector. Then, using Lemma 7.5, we can write
which concludes the proof.
Appendix B: Auxiliary Results
In what follows, we let $\mathcal{P}^\star$ represent the optimizer of problem (8).
Lemma 7.1
Consider the set value optimization problem (8). Suppose $f$ is an increasing and submodular set function and consider its multilinear extension $F$. Then the following holds.
-
Proof
see [21] for the proof.
Lemma 7.2
Consider the set value optimization problem (8). Suppose $f$ is an increasing and submodular set function and consider its multilinear extension $F$. Then the following holds.
- Proof
Lemma 7.3
Consider a twice differentiable function $F$ which satisfies $\frac{\partial^2 F}{\partial y_j\,\partial y_k}\le 0$ for any $j,k$. Then, for any $\mathbf{y}$ and $\mathbf{y}'$ satisfying $\mathbf{0}\le\mathbf{y}\le\mathbf{y}'\le\mathbf{1}$, the following holds:
(50a)
(50b)
-
Proof
Let $g(\xi)=F\big(\mathbf{y}+\xi(\mathbf{y}'-\mathbf{y})\big)$. Then, we can write
(51)
Lemma 7.4
Consider the set value optimization problem (8). Suppose $f$ is an increasing and submodular set function and consider its multilinear extension $F$. Let $\widetilde{\partial_j F}(\mathbf{y})$ be the estimate of $\frac{\partial F}{\partial y_j}(\mathbf{y})$ calculated by taking $K$ samples according to the membership probability vector $\mathbf{y}$. Then
(52)
with the probability given by Theorem 3.1, for any $\delta>0$.
-
Proof
Define the random variable
and assume that agent $i$ takes $K$ samples to construct the realization of $\widetilde{\partial_j F}$. Since $f$ is a submodular function, the sampled marginal values are bounded. Consequently, using equation (5), we conclude that the random variable has bounded support. Hence, using Theorem 3.1, the estimation accuracy of $\widetilde{\partial_j F}$ is given by (52)
with the probability stated in the lemma.
Lemma 7.5
Suppose $f$ is an increasing and submodular set function, and consider $\mathbf{y}$ to be a membership probability vector of a set $\mathcal{S}$. We define $\mathcal{R}$ to be the set resulting from sampling independently each member of $\mathcal{S}$ according to the probability vector $\mathbf{y}$, and $\mathcal{R}_1$ to be a single-member set chosen according to $\mathbf{y}$. The following holds for any such random set.
-
Proof
With these definitions in place, we can write
which concludes the proof.